1
Heilemann G, Georg D, Dobiasch M, Widder J, Renner A. Automation of ePROMs in radiation oncology and its impact on patient response and bias. Radiother Oncol 2024:110427. [PMID: 39002570 DOI: 10.1016/j.radonc.2024.110427] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/30/2024] [Revised: 06/10/2024] [Accepted: 07/04/2024] [Indexed: 07/15/2024]
Abstract
PURPOSE This study evaluates the impact of integrating a novel, in-house developed electronic Patient-Reported Outcome Measures (ePROMs) tool with a commercial Oncology Information System (OIS) on patient response rates and potential biases in real-world data science applications. MATERIALS AND METHODS We designed an ePROMs tool using the NodeJS web application framework, automatically sending e-mail questionnaires to patients based on their treatment schedules in the OIS. The tool is used across various treatment sites to collect PROMs data in a real-world setting. This research examined the effects of increasing automation levels on both recruitment and response rates, as well as potential biases across different patient cohorts. Automation was implemented in three escalating levels, from telephone reminders for missing reports to minimal intervention from study nurses. RESULTS From August 2020 to December 2023, 1,944 patients participated in the PROMs study. Our findings indicate that automating the workflows substantially reduced the patient management workload. However, higher levels of automation led to lower response rates, particularly in collecting late-phase symptoms in breast and head-and-neck cancer cohorts. Additionally, email-based PROMs introduced an age bias when recruiting new patients for the ePROMs study. Nevertheless, age was not a significant predictor of early dropout or missing symptom reports among patients participating. Notably, increased automation was significantly correlated with lower response rates in breast (p = 0.026) and head-and-neck cancer patients (p < 0.001). CONCLUSION Integrating ePROMs within the OIS can significantly reduce workload and personnel resources. However, this efficiency may compromise patient responses in certain groups. A balance must be achieved between workload, resource allocation, and the sensitivity needed to detect clinically significant effects. 
This may necessitate customized automation levels tailored to specific cancer groups, highlighting a fundamental trade-off between operational efficiency and data quality.
Affiliation(s)
- G Heilemann
- Department of Radiation Oncology, Comprehensive Cancer Center Vienna, Medical University Vienna, Vienna, Austria; Christian Doppler Laboratory for Image and Knowledge Driven Precision Radiation Oncology, Department of Radiation Oncology, Medical University Vienna, Vienna, Austria.
- D Georg
- Department of Radiation Oncology, Comprehensive Cancer Center Vienna, Medical University Vienna, Vienna, Austria; Christian Doppler Laboratory for Image and Knowledge Driven Precision Radiation Oncology, Department of Radiation Oncology, Medical University Vienna, Vienna, Austria
- M Dobiasch
- Department of Radiation Oncology, Comprehensive Cancer Center Vienna, Medical University Vienna, Vienna, Austria
- J Widder
- Department of Radiation Oncology, Comprehensive Cancer Center Vienna, Medical University Vienna, Vienna, Austria; Christian Doppler Laboratory for Image and Knowledge Driven Precision Radiation Oncology, Department of Radiation Oncology, Medical University Vienna, Vienna, Austria
- A Renner
- Department of Radiation Oncology, Comprehensive Cancer Center Vienna, Medical University Vienna, Vienna, Austria; Christian Doppler Laboratory for Image and Knowledge Driven Precision Radiation Oncology, Department of Radiation Oncology, Medical University Vienna, Vienna, Austria
2
Nottke A, Alan S, Brimble E, Cardillo AB, Henderson L, Littleford HE, Rojahn S, Sage H, Taylor J, West-Odell L, Berk A. Validation and clinical discovery demonstration of breast cancer data from a real-world data extraction platform. JAMIA Open 2024; 7:ooae041. [PMID: 38766645 PMCID: PMC11100995 DOI: 10.1093/jamiaopen/ooae041] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/06/2023] [Revised: 02/29/2024] [Indexed: 05/22/2024] Open
Abstract
Objective To validate and demonstrate the clinical discovery utility of a novel patient-mediated, medical record collection and data extraction platform developed to improve access and utilization of real-world clinical data. Materials and Methods Clinical variables were extracted from the medical records of 1011 consented patients with breast cancer. To validate the extracted data, case report forms completed using the structured data output of the platform were compared to manual chart review for 50 randomly-selected patients with metastatic breast cancer. To demonstrate the platform's clinical discovery utility, we identified 194 patients with early-stage clinical data who went on to develop distant metastases and utilized the platform-extracted data to assess associations between time to distant metastasis (TDM) and early-stage tumor histology, molecular type, and germline BRCA status. Results The platform-extracted data for the validation cohort had 97.6% precision (91.98%-100% by variable type) and 81.48% recall (58.15%-95.00% by variable type) compared to manual chart review. In our discovery cohort, the shortest TDM was significantly associated with metaplastic (739.0 days) and inflammatory histologies (1005.8 days), HR-/HER2- molecular types (1187.4 days), and positive BRCA status (1042.5 days) as compared to other histologies, molecular types, and negative BRCA status, respectively. Multivariable analyses did not produce statistically significant results. Discussion The precision and recall of platform-extracted clinical data are reported, although specificity could not be assessed. The data can generate clinically-relevant insights. Conclusion The structured real-world data produced by a novel patient-mediated, medical record-extraction platform are reliable and can power clinical discovery.
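The precision and recall figures reported in this abstract compare platform-extracted values with manual chart review. The sketch below shows how such figures are conventionally computed; it is not the authors' code. The `(patient_id, variable, value)` record layout, the function name `precision_recall`, and the toy data are all assumptions made for illustration.

```python
from typing import Set, Tuple

# Hypothetical record layout: (patient_id, variable, value).
Fact = Tuple[str, str, str]


def precision_recall(extracted: Set[Fact], reference: Set[Fact]) -> Tuple[float, float]:
    """Compare extracted facts with a chart-review reference set."""
    true_positives = len(extracted & reference)
    # Precision: share of extracted facts that chart review confirms.
    precision = true_positives / len(extracted) if extracted else 0.0
    # Recall: share of chart-review facts that the extraction found.
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


# Toy data, invented for this example only.
reference = {
    ("p1", "histology", "ductal"),
    ("p1", "brca", "negative"),
    ("p2", "histology", "lobular"),
    ("p2", "brca", "positive"),
}
extracted = {
    ("p1", "histology", "ductal"),
    ("p2", "histology", "lobular"),
    ("p2", "brca", "positive"),
}

precision, recall = precision_recall(extracted, reference)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this set-based framing there is no count of true negatives, which is consistent with the abstract's remark that specificity could not be assessed.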
Affiliation(s)
- Sophia Alan
- Ciitizen, San Francisco, CA 94112, United States
- Heather Sage
- Ciitizen, San Francisco, CA 94112, United States
3
Oddy C, Zhang J, Morley J, Ashrafian H. Promising algorithms to perilous applications: a systematic review of risk stratification tools for predicting healthcare utilisation. BMJ Health Care Inform 2024; 31:e101065. [PMID: 38901863 PMCID: PMC11191805 DOI: 10.1136/bmjhci-2024-101065] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/23/2024] [Accepted: 05/14/2024] [Indexed: 06/22/2024] Open
Abstract
OBJECTIVES Risk stratification tools that predict healthcare utilisation are extensively integrated into primary care systems worldwide, forming a key component of anticipatory care pathways, where high-risk individuals are targeted by preventative interventions. Existing work broadly focuses on comparing model performance in retrospective cohorts with little attention paid to efficacy in reducing morbidity when deployed in different global contexts. We review the evidence supporting the use of such tools in real-world settings, from retrospective dataset performance to pathway evaluation. METHODS A systematic search was undertaken to identify studies reporting the development, validation and deployment of models that predict healthcare utilisation in unselected primary care cohorts, comparable to their current real-world application. RESULTS Among 3897 articles screened, 51 studies were identified evaluating 28 risk prediction models. Half underwent external validation yet only two were validated internationally. No association between validation context and model discrimination was observed. The majority of real-world evaluation studies reported no change, or indeed significant increases, in healthcare utilisation within targeted groups, with only one-third of reports demonstrating some benefit. DISCUSSION While model discrimination appears satisfactorily robust to application context there is little evidence to suggest that accurate identification of high-risk individuals can be reliably translated to improvements in service delivery or morbidity. CONCLUSIONS The evidence does not support further integration of care pathways with costly population-level interventions based on risk prediction in unselected primary care cohorts. There is an urgent need to independently appraise the safety, efficacy and cost-effectiveness of risk prediction systems that are already widely deployed within primary care.
Affiliation(s)
- Christopher Oddy
- Department of Anaesthesia, Critical Care and Pain, Kingston Hospital NHS Foundation Trust, London, UK
- Joe Zhang
- Imperial College London Institute of Global Health Innovation, London, UK
- London AI Centre, Guy's and St. Thomas' Hospital, London, UK
- Jessica Morley
- Digital Ethics Center, Yale University, New Haven, Connecticut, USA
- Hutan Ashrafian
- Imperial College London Institute of Global Health Innovation, London, UK
4
Janssen SHM, Vlooswijk C, Bijlsma RM, Kaal SEJ, Kerst JM, Tromp JM, Bos MEMM, van der Hulle T, Lalisang RI, Nuver J, Kouwenhoven MCM, van der Graaf WTA, Husson O. Health-related conditions among long-term cancer survivors diagnosed in adolescence and young adulthood (AYA): results of the SURVAYA study. J Cancer Surviv 2024:10.1007/s11764-024-01597-0. [PMID: 38740702 DOI: 10.1007/s11764-024-01597-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/28/2023] [Accepted: 04/05/2024] [Indexed: 05/16/2024]
Abstract
BACKGROUND With 5-year survival rates > 85%, gaining insight into the long-term and late health-related conditions of cancer survivors diagnosed in adolescence and young adulthood is of utmost importance to improve their quantity and quality of survival. This study examined the prevalence of, and factors associated with, patient-reported health-related conditions and their latency times among long-term adolescent and young adult (AYA) cancer survivors. METHODS AYA cancer survivors (5-20 years after diagnosis) were identified by the population-based Netherlands Cancer Registry (NCR), and invited to participate in the SURVAYA questionnaire study. Participants reported the prevalence and date of diagnosis of health-related conditions. Clinical data were retrieved from the NCR. RESULTS Three thousand seven hundred seventy-six AYA cancer survivors (response rate 33.4%) were included for analyses. More than half of the AYAs (58.5%) experienced health-related conditions after their cancer diagnosis, of whom 51.4% were diagnosed with two or more conditions. Participants reported conditions related to vision (15.0%), digestive system (15.0%), endocrine system (14.1%), cardiovascular system (11.7%), respiratory system (11.3%), urinary tract system (10.9%), depression (8.6%), hearing (7.4%), arthrosis (6.9%), secondary malignancy (6.4%), speech, taste, and smell (4.5%), and rheumatoid arthritis (2.1%). Time since diagnosis, tumor type, age at diagnosis, and educational level were most frequently associated with a health-related condition. CONCLUSIONS A significant proportion of long-term AYA cancer survivors report having one or more health-related conditions. IMPLICATIONS FOR CANCER SURVIVORS Future research should focus on better understanding the underlying mechanisms of, and risk factors for, these health-related conditions to support the development and implementation of risk-stratified survivorship care for AYA cancer survivors to further improve their outcomes.
CLINICAL TRIALS REGISTRATION NCT05379387.
Affiliation(s)
- Silvie H M Janssen
- Department of Psychosocial Research and Epidemiology, Netherlands Cancer Institute, 1066 CX, Amsterdam, the Netherlands
- Department of Medical Oncology, Netherlands Cancer Institute-Antoni van Leeuwenhoek, 1066 CX, Amsterdam, the Netherlands
- Carla Vlooswijk
- Research and Development, Netherlands Comprehensive Cancer Organization, 3511 DT, Utrecht, The Netherlands
- Rhodé M Bijlsma
- Department of Medical Oncology, University Medical Center Utrecht, 3584 CX, Utrecht, The Netherlands
- Suzanne E J Kaal
- Department of Medical Oncology, Radboud University Medical Center, 6525 GA, Nijmegen, The Netherlands
- Jan Martijn Kerst
- Department of Medical Oncology, Netherlands Cancer Institute-Antoni van Leeuwenhoek, 1066 CX, Amsterdam, the Netherlands
- Jacqueline M Tromp
- Department of Medical Oncology, Amsterdam University Medical Centers, 1105 AZ, Amsterdam, The Netherlands
- Monique E M M Bos
- Department of Medical Oncology, Erasmus MC Cancer Institute, Erasmus University Medical Center, 3015 GD, Rotterdam, The Netherlands
- Tom van der Hulle
- Department of Medical Oncology, Leiden University Medical Center, 2333 ZA, Leiden, The Netherlands
- Roy I Lalisang
- Division of Medical Oncology, Department of Internal Medicine, Maastricht UMC+ Comprehensive Cancer Center, GROW-School of Oncology and Reproduction, Maastricht University Medical Center+, 6229 HX, Maastricht, The Netherlands
- Janine Nuver
- Department of Medical Oncology, University Medical Center Groningen, 9713 GZ, Groningen, The Netherlands
- Mathilde C M Kouwenhoven
- Department of Neurology, Cancer Center Amsterdam, Amsterdam UMC, Amsterdam University Medical Centers, Location VUmc, 1081 HV, Amsterdam, The Netherlands
- Winette T A van der Graaf
- Department of Medical Oncology, Netherlands Cancer Institute-Antoni van Leeuwenhoek, 1066 CX, Amsterdam, the Netherlands
- Department of Medical Oncology, Erasmus MC Cancer Institute, Erasmus University Medical Center, 3015 GD, Rotterdam, The Netherlands
- Olga Husson
- Department of Psychosocial Research and Epidemiology, Netherlands Cancer Institute, 1066 CX, Amsterdam, the Netherlands.
- Department of Medical Oncology, Netherlands Cancer Institute-Antoni van Leeuwenhoek, 1066 CX, Amsterdam, the Netherlands.
- Department of Surgical Oncology, Erasmus MC Cancer Institute, Erasmus University Medical Center, 3015 GD, Rotterdam, the Netherlands.
5
Kervezee L, Dashti HS, Pilz LK, Skarke C, Ruben MD. Using routinely collected clinical data for circadian medicine: A review of opportunities and challenges. PLOS Digit Health 2024; 3:e0000511. [PMID: 38781189 PMCID: PMC11115276 DOI: 10.1371/journal.pdig.0000511] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/25/2024]
Abstract
A wealth of data is available from electronic health records (EHR) that are collected as part of routine clinical care in hospitals worldwide. These rich, longitudinal data offer an attractive object of study for the field of circadian medicine, which aims to translate knowledge of circadian rhythms to improve patient health. This narrative review aims to discuss opportunities for EHR in studies of circadian medicine, highlight the methodological challenges, and provide recommendations for using these data to advance the field. In the existing literature, we find that data collected in real-world clinical settings have the potential to shed light on key questions in circadian medicine, including how 24-hour rhythms in clinical features are associated with (or even predictive of) health outcomes, whether the effect of medication or other clinical activities depends on time of day, and how circadian rhythms in physiology may influence clinical reference ranges or sampling protocols. However, optimal use of EHR to advance circadian medicine requires careful consideration of the limitations and sources of bias that are inherent to these data sources. In particular, time of day influences almost every interaction between a patient and the healthcare system, creating operational 24-hour patterns in the data that have little or nothing to do with biology. Addressing these challenges could help to expand the evidence base for the use of EHR in the field of circadian medicine.
Affiliation(s)
- Laura Kervezee
- Group of Circadian Medicine, Department of Cell and Chemical Biology, Leiden University Medical Center, Leiden, the Netherlands
- Hassan S. Dashti
- Department of Anesthesia, Critical Care and Pain Medicine, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts, United States of America
- Luísa K. Pilz
- Department of Anesthesiology and Intensive Care Medicine CCM / CVK, Charité–Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt Universität zu Berlin, Berlin, Germany
- ECRC Experimental and Clinical Research Center, Charité–Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt Universität zu Berlin, Berlin, Germany
- Carsten Skarke
- Institute for Translational Medicine and Therapeutics (ITMAT), University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania, United States of America
- Chronobiology and Sleep Institute (CSI), University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania, United States of America
- Department of Medicine, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania, United States of America
- Marc D. Ruben
- Divisions of Pulmonary and Sleep Medicine and Biomedical Informatics, Cincinnati Children’s Hospital Medical Center, Cincinnati, Ohio, United States of America
6
Gierend K, Waltemath D, Ganslandt T, Siegel F. Traceable Research Data Sharing in a German Medical Data Integration Center With FAIR (Findability, Accessibility, Interoperability, and Reusability)-Geared Provenance Implementation: Proof-of-Concept Study. JMIR Form Res 2023; 7:e50027. [PMID: 38060305 PMCID: PMC10739241 DOI: 10.2196/50027] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/16/2023] [Revised: 10/25/2023] [Accepted: 11/01/2023] [Indexed: 12/08/2023] Open
Abstract
BACKGROUND Secondary investigations into digital health records, including electronic patient data from German medical data integration centers (DICs), pave the way for enhanced future patient care. However, only limited information is captured regarding the integrity, traceability, and quality of the (sensitive) data elements. This lack of detail diminishes trust in the validity of the collected data. From a technical standpoint, adhering to the widely accepted FAIR (Findability, Accessibility, Interoperability, and Reusability) principles for data stewardship necessitates enriching data with provenance-related metadata. Provenance offers insights into the readiness for the reuse of a data element and serves as a supplier of data governance. OBJECTIVE The primary goal of this study is to augment the reusability of clinical routine data within a medical DIC for secondary utilization in clinical research. Our aim is to establish provenance traces that underpin the status of data integrity, reliability, and consequently, trust in electronic health records, thereby enhancing the accountability of the medical DIC. We present the implementation of a proof-of-concept provenance library integrating international standards as an initial step. METHODS We adhered to a customized road map for a provenance framework, and examined the data integration steps across the ETL (extract, transform, and load) phases. Following a maturity model, we derived requirements for a provenance library. Using this research approach, we formulated a provenance model with associated metadata and implemented a proof-of-concept provenance class. Furthermore, we seamlessly incorporated the internationally recognized World Wide Web Consortium (W3C) provenance standard, aligned the resultant provenance records with the interoperable health care standard Fast Healthcare Interoperability Resources, and presented them in various representation formats.
Ultimately, we conducted a thorough assessment of provenance trace measurements. RESULTS This study marks the inaugural implementation of integrated provenance traces at the data element level within a German medical DIC. We devised and executed a practical method that synergizes the robustness of quality- and health standard-guided (meta)data management practices. Our measurements indicate commendable pipeline execution times, attaining notable levels of accuracy and reliability in processing clinical routine data, thereby ensuring accountability in the medical DIC. These findings should inspire the development of additional tools aimed at providing evidence-based and reliable electronic health record services for secondary use. CONCLUSIONS The research method outlined for the proof-of-concept provenance class has been crafted to promote effective and reliable core data management practices. It aims to enhance biomedical data by imbuing it with meaningful provenance, thereby bolstering the benefits for both research and society. Additionally, it facilitates the streamlined reuse of biomedical data. As a result, the system mitigates risks, as data analysis without knowledge of the origin and quality of all data elements is rendered futile. While the approach was initially developed for the medical DIC use case, these principles can be universally applied throughout the scientific domain.
Affiliation(s)
- Kerstin Gierend
- Department of Biomedical Informatics at the Center for Preventive Medicine and Digital Health, Medical Faculty Mannheim, Heidelberg University, Mannheim, Germany
- Dagmar Waltemath
- Core Unit Data Integration Center and Medical Informatics Laboratory, University Medicine Greifswald, Greifswald, Germany
- Thomas Ganslandt
- Chair of Medical Informatics, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
- Fabian Siegel
- Department of Biomedical Informatics at the Center for Preventive Medicine and Digital Health, Medical Faculty Mannheim, Heidelberg University, Mannheim, Germany
7
Heudel P, Crochet H, Durand T, Zrounba P, Blay JY. From data strategy to implementation to advance cancer research and cancer care: A French comprehensive cancer center experience. PLOS Digit Health 2023; 2:e0000415. [PMID: 38113207 PMCID: PMC10729983 DOI: 10.1371/journal.pdig.0000415] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/28/2023] [Accepted: 11/20/2023] [Indexed: 12/21/2023]
Abstract
In a comprehensive cancer center, effective data strategies are essential for evaluating practices and outcomes, understanding the disease and prognostic factors, identifying disparities in cancer care, and, overall, developing better treatments. To achieve these goals, the Centre Léon Bérard (CLB) considers various data collection strategies, including electronic medical records (EMRs), clinical trial data, and research projects. Advanced data analysis techniques like natural language processing (NLP) can be used to extract and categorize information from these sources to provide a more complete description of patient data. Data sharing is also crucial for collaboration across comprehensive cancer centers, but it must be done securely and in compliance with regulations like GDPR. To ensure data is shared appropriately, CLB should develop clear data sharing policies and share data in a controlled, standardized format like OSIRIS RWD, OMOP and FHIR. The UNICANCER initiative has launched the CONSORE project to support the development of a structured and standardized repository of patient data to improve cancer research and patient outcomes. Real-world data (RWD) studies are vital in cancer research as they provide a comprehensive and accurate picture of patient outcomes and treatment patterns. By incorporating RWD into data collection, analysis, and sharing strategies, comprehensive cancer centers can take a more comprehensive and patient-centered approach to cancer research. In conclusion, comprehensive cancer centers must take an integrated approach to data collection, analysis, and sharing to enhance their understanding of cancer and improve patient outcomes. Leveraging advanced data analytics techniques and developing effective data sharing policies can help cancer centers effectively harness the power of data to drive progress in cancer research.
Affiliation(s)
- Pierre Heudel
- Department of Medical Oncology, Centre Léon Bérard, Lyon, France
- Hugo Crochet
- Data and Artificial Intelligence Team, Centre Léon Bérard, Lyon, France
- Thierry Durand
- Data protection officer, Centre Léon Bérard, Lyon, France
- Philippe Zrounba
- Department of Surgical Oncology, Centre Léon Bérard, Lyon, France
- Jean-Yves Blay
- Department of Medical Oncology, Centre Léon Bérard, Lyon, France
- General Director, Centre Léon Bérard, Lyon, France
8
Thakur S. Real-World Evidence Studies in Oncology Therapeutics: Hope or Hype? Indian J Surg Oncol 2023; 14:829-835. [PMID: 38187834 PMCID: PMC10767035 DOI: 10.1007/s13193-023-01784-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/22/2022] [Accepted: 06/12/2023] [Indexed: 01/09/2024] Open
Abstract
The randomized controlled trial (RCT) remains a gold standard in evidence-based medicine for assessing the efficacy and safety of cancer therapies. However, due to some inherent methodological limitations of RCTs, such as stringent inclusion criteria, highly specific treatment, ethical and scientific compromise in rare cancers, and inability to adequately assess safety, real-world evidence (RWE) has been adjudged a suitable option to complement data obtained from RCTs. Moreover, in the context of cancer therapeutics, a few notable merits pertain to developing a novel product for rare cancer subtypes, establishing new indications for already approved drugs, optimizing treatment regimen and sequence, better describing long-term safety, and supporting reimbursement-related decisions. However, the implementation of RWE for the aforementioned purposes will be limited by various challenges, especially in the context of developing economies such as India. Special attention should be given to the availability of data, maintaining the quality standard, and establishing stringent regulations for privacy and security along with active regulatory engagement with relevant stakeholders. Such activities will be key to facilitating the use of RWE in cancer therapeutics.
Affiliation(s)
- Sayanta Thakur
- Department of Pharmacology, MJNMC&H, Vivekananda Street, Pilkhana, Cooch Behar 736101 India
9
Masum H, Bourne PE. Ten simple rules for humane data science. PLoS Comput Biol 2023; 19:e1011698. [PMID: 38127691 PMCID: PMC10734991 DOI: 10.1371/journal.pcbi.1011698] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/23/2023] Open
Affiliation(s)
- Hassan Masum
- Waterloo Institute for Complexity and Innovation, Waterloo, Canada
- Philip E. Bourne
- School of Data Science, University of Virginia, Virginia, United States of America
10
Zhang J, Morley J, Gallifant J, Oddy C, Teo JT, Ashrafian H, Delaney B, Darzi A. Mapping and evaluating national data flows: transparency, privacy, and guiding infrastructural transformation. Lancet Digit Health 2023; 5:e737-e748. [PMID: 37775190 DOI: 10.1016/s2589-7500(23)00157-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 02/10/2023] [Revised: 06/07/2023] [Accepted: 08/02/2023] [Indexed: 10/01/2023]
Abstract
The importance of big health data is recognised worldwide. Most UK National Health Service (NHS) care interactions are recorded in electronic health records, resulting in an unmatched potential for population-level datasets. However, policy reviews have highlighted challenges from a complex data-sharing landscape relating to transparency, privacy, and analysis capabilities. In response, we used public information sources to map all electronic patient data flows across England, from providers to more than 460 subsequent academic, commercial, and public data consumers. Although NHS data support a global research ecosystem, we found that multistage data flow chains limit transparency and risk public trust, most data interactions do not fulfil recommended best practices for safe data access, and existing infrastructure produces aggregation of duplicate data assets, thus limiting diversity of data and added value to end users. We provide recommendations to support data infrastructure transformation and have produced a website (https://DataInsights.uk) to promote transparency and showcase NHS data assets.
Affiliation(s)
- Joe Zhang
- Institute of Global Health Innovation, Imperial College London, London, UK; Department of Critical Care, Guy's and St Thomas' NHS Foundation Trust, London, UK.
- Jess Morley
- Oxford Internet Institute, University of Oxford, Oxford, UK
- Jack Gallifant
- Department of Intensive Care, Imperial College Healthcare NHS Trust, London, UK; Laboratory for Computational Physiology, Massachusetts Institute of Technology, Cambridge, MA, USA
- Chris Oddy
- Department of Anaesthesia, Critical Care and Pain, St George's Healthcare NHS Trust, London, UK
- James T Teo
- London Medical Imaging and AI Centre, Guy's and St Thomas' NHS Foundation Trust, London, UK; Department of Neurology, King's College Hospital NHS Foundation Trust, London, UK
- Hutan Ashrafian
- Institute of Global Health Innovation, Imperial College London, London, UK; Leeds University Business School, Leeds, UK
- Brendan Delaney
- Institute of Global Health Innovation, Imperial College London, London, UK
- Ara Darzi
- Institute of Global Health Innovation, Imperial College London, London, UK
11
Dhingra LS, Shen M, Mangla A, Khera R. Cardiovascular Care Innovation through Data-Driven Discoveries in the Electronic Health Record. Am J Cardiol 2023; 203:136-148. [PMID: 37499593 PMCID: PMC10865722 DOI: 10.1016/j.amjcard.2023.06.104] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 02/28/2023] [Revised: 05/24/2023] [Accepted: 06/29/2023] [Indexed: 07/29/2023]
Abstract
The electronic health record (EHR) represents a rich source of patient information, increasingly being leveraged for cardiovascular research. Although its primary use remains the seamless delivery of health care, the various longitudinally aggregated structured and unstructured data elements for each patient within the EHR can define the computational phenotypes of disease and care signatures and their association with outcomes. Although structured data elements, such as demographic characteristics, laboratory measurements, problem lists, and medications, are easily extracted, unstructured data are underused. The latter include free text in clinical narratives, documentation of procedures, and reports of imaging and pathology. Rapid scaling up of data storage and rapid innovation in natural language processing and computer vision can power insights from unstructured data streams. However, despite an array of opportunities for research using the EHR, specific expertise is necessary to adequately address confidentiality, accuracy, completeness, and heterogeneity challenges in EHR-based research. These often require methodological innovation and best practices to design and conduct successful research studies. Our review discusses these challenges and their proposed solutions. In addition, we highlight the ongoing innovations in federated learning in the EHR through a greater focus on common data models and discuss ongoing work that defines such an approach to large-scale, multicenter, federated studies. Such parallel improvements in technology and research methods enable innovative care and optimization of patient outcomes.
Affiliation(s)
- Miles Shen
  - Section of Cardiovascular Medicine, Department of Internal Medicine; Department of Internal Medicine
- Anjali Mangla
  - Section of Cardiovascular Medicine, Department of Internal Medicine; Department of Neuroscience, Yale School of Medicine, New Haven, Connecticut
- Rohan Khera
  - Section of Cardiovascular Medicine, Department of Internal Medicine; Center for Outcomes Research and Evaluation (CORE), Yale New Haven Hospital, New Haven, Connecticut; Section of Health Informatics, Department of Biostatistics, Yale School of Public Health, New Haven, Connecticut; Section of Biomedical Informatics and Data Science, Yale School of Medicine, New Haven, Connecticut
|
12
|
Zhang J, Ashrafian H, Delaney B, Darzi A. Impact of primary to secondary care data sharing on care quality in NHS England hospitals. NPJ Digit Med 2023; 6:144. [PMID: 37580595 PMCID: PMC10425337 DOI: 10.1038/s41746-023-00891-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 02/09/2023] [Accepted: 08/01/2023] [Indexed: 08/16/2023] [Open Access]
Abstract
Health information exchange (HIE) is seen as a key component of effective care but remains poorly evidenced at a health system level. In the UK National Health Service (NHS), the ability to share primary care data with secondary care clinicians is a focus of continued digital investment. In this study, we report the evolution of interoperable technology across a period of rapid digital transformation in NHS England from 2015 to 2019, and test the association of primary to secondary care data-sharing capabilities with clinical care quality indicators across all acute secondary care providers (n = 135 NHS Trusts). In multivariable analyses, data-sharing capabilities are associated with a reduction in patients breaching an Accident & Emergency (A&E) 4-h decision time threshold, and with better patient-reported experience of acute hospital care quality. Using synthetic control analyses, we estimate a mean 2.271% (SD ±3.371) absolute reduction in A&E 4-h decision time breaches 12 months after the introduction of data-sharing capabilities. Our findings support current digital transformation programmes for developing regional HIE networks but highlight the need to focus on implementation factors in addition to technological procurement.
Affiliation(s)
- Joe Zhang
  - Institute of Global Health Innovation, Imperial College London, London, UK
  - Department of Critical Care Medicine, Guy's and St Thomas' Hospital, London, UK
- Hutan Ashrafian
  - Institute of Global Health Innovation, Imperial College London, London, UK
- Brendan Delaney
  - Institute of Global Health Innovation, Imperial College London, London, UK
- Ara Darzi
  - Institute of Global Health Innovation, Imperial College London, London, UK
|
13
|
Idowu EAA, Teo J, Salih S, Valverde J, Yeung JA. Streams, rivers and data lakes: an introduction to understanding modern electronic healthcare records. Clin Med (Lond) 2023; 23:409-413. [PMID: 38614657 PMCID: PMC10541049 DOI: 10.7861/clinmed.2022-0325] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/02/2023]
Abstract
As foundation doctors, we have often found ourselves informing patients that a certain aspect of their medical information cannot be immediately found, either because it is on an electronic system we cannot access, or because it is in a hospital that is unlinked to our own. Unsurprisingly, this frequently leaves patients flabbergasted and confused. We started to wonder: if patients' data are entered onto an electronic system, where do those data go? If medical data are searched for, where do those data come from? Why are there so many hidden sources of information that clinicians cannot access? In an ever-increasing digital sphere, electronic data will be the future of holistic health and social care planning, impacting every clinician's day-to-day role. From electronic healthcare records to the use of artificial intelligence solutions, this article will serve as an introduction to how data flows in modern healthcare systems.
Affiliation(s)
- James Teo
  - King's College Hospital and Guy's and St Thomas' Hospital NHS Foundation Trust, London, UK
- Joshua Valverde
  - Chesterfield Royal Hospital NHS Foundation Trust, Chesterfield, UK
- Joshua Au Yeung
  - Guy's and St Thomas' Hospital NHS Foundation Trust, London, UK
|
14
|
Carrasco-Ribelles LA, Cabrera-Bean M, Danés-Castells M, Zabaleta-Del-Olmo E, Roso-Llorach A, Violán C. Contribution of Frailty to Multimorbidity Patterns and Trajectories: Longitudinal Dynamic Cohort Study of Aging People. JMIR Public Health Surveill 2023; 9:e45848. [PMID: 37368462 PMCID: PMC10365626 DOI: 10.2196/45848] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/19/2023] [Revised: 05/02/2023] [Accepted: 05/25/2023] [Indexed: 06/28/2023] [Open Access]
Abstract
BACKGROUND Multimorbidity and frailty are characteristics of aging that need individualized evaluation, and there is a 2-way causal relationship between them. Thus, considering frailty in analyses of multimorbidity is important for tailoring social and health care to the specific needs of older people. OBJECTIVE This study aimed to assess how the inclusion of frailty contributes to identifying and characterizing multimorbidity patterns in people aged 65 years or older. METHODS Longitudinal data were drawn from electronic health records through the SIDIAP (Sistema d'Informació pel Desenvolupament de la Investigació a l'Atenció Primària) primary care database for the population aged 65 years or older from 2010 to 2019 in Catalonia, Spain. Frailty and multimorbidity were measured annually using validated tools (eFRAGICAP, a cumulative deficit model; and Swedish National Study of Aging and Care in Kungsholmen [SNAC-K], respectively). Two sets of 11 multimorbidity patterns were obtained using fuzzy c-means. Both considered the chronic conditions of the participants. In addition, one set included age, and the other included frailty. Cox models were used to test their associations with death, nursing home admission, and home care need. Trajectories were defined as the evolution of the patterns over the follow-up period. RESULTS The study included 1,456,052 unique participants (mean follow-up of 7.0 years). Most patterns were similar in both sets in terms of the most prevalent conditions. However, the patterns that considered frailty were better for identifying the population whose main conditions imposed limitations on daily life, with a higher prevalence of frail individuals in patterns like chronic ulcers & peripheral vascular. This set also included a dementia-specific pattern and showed a better fit with the risk of nursing home admission and home care need. On the other hand, the risk of death had a better fit with the set of patterns that did not include frailty.
The change in patterns when considering frailty also led to a change in trajectories. On average, participants were in 1.8 patterns during their follow-up, while 45.1% (656,778/1,456,052) remained in the same pattern. CONCLUSIONS Our results suggest that frailty should be considered in addition to chronic diseases when studying multimorbidity patterns in older adults. Multimorbidity patterns and trajectories can help to identify patients with specific needs. The patterns that considered frailty were better for identifying the risk of certain age-related outcomes, such as nursing home admission or home care need, while those considering age were better for identifying the risk of death. Clinical and social intervention guidelines and resource planning can be tailored based on the prevalence of these patterns and trajectories.
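The abstract above names fuzzy c-means as the clustering method, in which each participant receives a degree of membership in every pattern rather than a single hard label. The following is a minimal sketch of that general technique in pure Python, not the study's implementation: the function name `fuzzy_c_means`, its parameters, the fuzzifier value `m=2.0`, and the toy data are all assumptions made for illustration.

```python
import math
import random


def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Toy fuzzy c-means; returns (centers, memberships).

    points: list of equal-length numeric tuples; m > 1 is the fuzzifier.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships, each row normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])
    centers = []
    for _ in range(n_iter):
        # Step 1: each center is a mean of all points weighted by u ** m.
        centers = []
        for j in range(n_clusters):
            w = [u[i][j] ** m for i in range(n)]
            w_sum = sum(w)
            centers.append(
                [sum(w[i] * points[i][d] for i in range(n)) / w_sum for d in range(dim)]
            )
        # Step 2: memberships from relative distances to every center.
        shift = 0.0
        for i in range(n):
            dist = [
                max(math.sqrt(sum((a - b) ** 2 for a, b in zip(points[i], c))), 1e-12)
                for c in centers
            ]
            new_row = [
                1.0 / sum((dj / dk) ** (2.0 / (m - 1.0)) for dk in dist) for dj in dist
            ]
            shift = max(shift, max(abs(a - b) for a, b in zip(new_row, u[i])))
            u[i] = new_row
        if shift < tol:  # stop once memberships have stabilised
            break
    return centers, u


# Hypothetical toy data: two well-separated groups of 2-D points.
toy = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
centers, memberships = fuzzy_c_means(toy, n_clusters=2)
for point, row in zip(toy, memberships):
    print(point, [round(v, 3) for v in row])
```

Each membership row sums to 1, so a participant can belong partly to several patterns at once. That soft assignment is what distinguishes the approach from hard clustering such as k-means.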
Affiliation(s)
- Lucía A Carrasco-Ribelles
  - Fundació Institut Universitari per a la recerca a l'Atenció Primària de Salut Jordi Gol I Gurina (IDIAPJGol), Barcelona, Spain
  - Signal Processing and Communications Group (SPCOM), Department of Signal Theory and Communications, Universitat Politècnica de Catalunya (UPC), Barcelona, Spain
  - Unitat de Suport a la Recerca Metropolitana Nord, Fundació Institut Universitari per a la recerca a l'Atenció Primària de Salut Jordi Gol I Gurina (IDIAPJGol), Mataró, Spain
  - Grup de REcerca en Impacte de les Malalties Cròniques i les seves Trajectòries (GRIMTRA) (2021 SGR 01537), Fundació Institut Universitari per a la recerca a l'Atenció Primària de Salut Jordi Gol I Gurina (IDIAPJGol), Barcelona, Spain
  - Network for Research on Chronicity, Primary Care, and Health Promotion (RICAPPS) (RD21/0016/0029), Instituto de Salud Carlos III, Madrid, Spain
- Margarita Cabrera-Bean
  - Signal Processing and Communications Group (SPCOM), Department of Signal Theory and Communications, Universitat Politècnica de Catalunya (UPC), Barcelona, Spain
- Marc Danés-Castells
  - Unitat de Suport a la Recerca Metropolitana Nord, Fundació Institut Universitari per a la recerca a l'Atenció Primària de Salut Jordi Gol I Gurina (IDIAPJGol), Mataró, Spain
- Edurne Zabaleta-Del-Olmo
  - Fundació Institut Universitari per a la recerca a l'Atenció Primària de Salut Jordi Gol I Gurina (IDIAPJGol), Barcelona, Spain
  - Grup de REcerca en Impacte de les Malalties Cròniques i les seves Trajectòries (GRIMTRA) (2021 SGR 01537), Fundació Institut Universitari per a la recerca a l'Atenció Primària de Salut Jordi Gol I Gurina (IDIAPJGol), Barcelona, Spain
  - Network for Research on Chronicity, Primary Care, and Health Promotion (RICAPPS) (RD21/0016/0029), Instituto de Salud Carlos III, Madrid, Spain
  - Gerència Territorial de Barcelona, Institut Català de la Salut, Barcelona, Spain
  - Nursing Department, Faculty of Nursing, Universitat de Girona, Girona, Spain
- Albert Roso-Llorach
  - Fundació Institut Universitari per a la recerca a l'Atenció Primària de Salut Jordi Gol I Gurina (IDIAPJGol), Barcelona, Spain
  - Grup de REcerca en Impacte de les Malalties Cròniques i les seves Trajectòries (GRIMTRA) (2021 SGR 01537), Fundació Institut Universitari per a la recerca a l'Atenció Primària de Salut Jordi Gol I Gurina (IDIAPJGol), Barcelona, Spain
  - Network for Research on Chronicity, Primary Care, and Health Promotion (RICAPPS) (RD21/0016/0029), Instituto de Salud Carlos III, Madrid, Spain
  - Universitat Autònoma de Barcelona, Cerdanyola del Vallès, Spain
- Concepción Violán
  - Unitat de Suport a la Recerca Metropolitana Nord, Fundació Institut Universitari per a la recerca a l'Atenció Primària de Salut Jordi Gol I Gurina (IDIAPJGol), Mataró, Spain
  - Grup de REcerca en Impacte de les Malalties Cròniques i les seves Trajectòries (GRIMTRA) (2021 SGR 01537), Fundació Institut Universitari per a la recerca a l'Atenció Primària de Salut Jordi Gol I Gurina (IDIAPJGol), Barcelona, Spain
  - Network for Research on Chronicity, Primary Care, and Health Promotion (RICAPPS) (RD21/0016/0029), Instituto de Salud Carlos III, Madrid, Spain
  - Universitat Autònoma de Barcelona, Cerdanyola del Vallès, Spain
  - Fundació Institut d'Investigació en ciències de la Salut Germans Trias i Pujol (IGTP), Badalona, Spain
|
15
|
Betti M, Maria Salzano C, Massacci A, D'Antonio M, Grassucci I, Marcozzi B, Canfora M, Melucci E, Buglioni S, Casini B, Gallo E, Pescarmona E, Ciliberto G, Pallocca M. Development of a Somatic Variant Registry in a National Cancer Center: towards Molecular Real World Data preparedness. J Biomed Inform 2023; 142:104394. [PMID: 37209976 DOI: 10.1016/j.jbi.2023.104394] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/12/2022] [Revised: 03/21/2023] [Accepted: 05/14/2023] [Indexed: 05/22/2023]
Abstract
The biomedical research field is moving toward clinical trials and translational projects based on real-world evidence. To make this transition feasible, clinical centers need to work toward data accessibility and interoperability. This task is particularly challenging in genomics, which has entered routine screening in recent years, mostly via amplicon-based next-generation sequencing panels. These experiments produce up to hundreds of features per patient, and their summarized results are often stored in static clinical reports, making critical information inaccessible to automated access and federated search consortia. In this study, we present a reanalysis of 4620 solid tumor sequencing samples in five different histology settings. Furthermore, we describe the bioinformatics and data engineering processes put in place to create a Somatic Variant Registry able to deal with the large biotechnological variability of routine genomic profiling.
Affiliation(s)
- Martina Betti
  - Biostatistics, Bioinformatics and Clinical Trial Center, IRCCS Regina Elena National Cancer Institute, Rome, Italy
- Chiara Maria Salzano
  - Biostatistics, Bioinformatics and Clinical Trial Center, IRCCS Regina Elena National Cancer Institute, Rome, Italy
- Alice Massacci
  - Biostatistics, Bioinformatics and Clinical Trial Center, IRCCS Regina Elena National Cancer Institute, Rome, Italy
- Mattia D'Antonio
  - Biostatistics, Bioinformatics and Clinical Trial Center, IRCCS Regina Elena National Cancer Institute, Rome, Italy
- Isabella Grassucci
  - Department of Pathology, IRCCS Regina Elena National Cancer Institute, Rome, Italy
- Benedetta Marcozzi
  - Biostatistics, Bioinformatics and Clinical Trial Center, IRCCS Regina Elena National Cancer Institute, Rome, Italy
- Marco Canfora
  - Biostatistics, Bioinformatics and Clinical Trial Center, IRCCS Regina Elena National Cancer Institute, Rome, Italy
- Elisa Melucci
  - Department of Pathology, IRCCS Regina Elena National Cancer Institute, Rome, Italy
- Simonetta Buglioni
  - Department of Pathology, IRCCS Regina Elena National Cancer Institute, Rome, Italy
- Beatrice Casini
  - Department of Pathology, IRCCS Regina Elena National Cancer Institute, Rome, Italy
- Enzo Gallo
  - Department of Pathology, IRCCS Regina Elena National Cancer Institute, Rome, Italy
- Edoardo Pescarmona
  - Department of Pathology, IRCCS Regina Elena National Cancer Institute, Rome, Italy
- Gennaro Ciliberto
  - Scientific Direction, IRCCS Regina Elena National Cancer Institute, Rome, Italy
- Matteo Pallocca
  - Biostatistics, Bioinformatics and Clinical Trial Center, IRCCS Regina Elena National Cancer Institute, Rome, Italy
|
16
|
Bean DM, Kraljevic Z, Shek A, Teo J, Dobson RJB. Hospital-wide natural language processing summarising the health data of 1 million patients. PLOS DIGITAL HEALTH 2023; 2:e0000218. [PMID: 37159441 PMCID: PMC10168555 DOI: 10.1371/journal.pdig.0000218] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 09/28/2022] [Accepted: 02/16/2023] [Indexed: 05/11/2023]
Abstract
Electronic health records (EHRs) represent a major repository of real-world clinical trajectories, interventions and outcomes. While modern enterprise EHRs try to capture data in structured standardised formats, a significant bulk of the available information captured in the EHR is still recorded only in unstructured text format and can only be transformed into structured codes by manual processes. Recently, Natural Language Processing (NLP) algorithms have reached a level of performance suitable for large-scale and accurate information extraction from clinical text. Here we describe the application of open-source named-entity-recognition and linkage (NER+L) methods (CogStack, MedCAT) to the entire text content of a large UK hospital trust (King's College Hospital, London). The resulting dataset contains 157M SNOMED concepts generated from 9.5M documents for 1.07M patients over a period of 9 years. We present a summary of prevalence and disease onset as well as a patient embedding that captures major comorbidity patterns at scale. NLP has the potential to transform the health data lifecycle, through large-scale automation of a traditionally manual task.
Affiliation(s)
- Daniel M Bean
  - Department of Biostatistics and Health Informatics, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, United Kingdom
  - Health Data Research UK London, University College London, London, United Kingdom
- Zeljko Kraljevic
  - Department of Biostatistics and Health Informatics, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, United Kingdom
  - NIHR Biomedical Research Centre at South London and Maudsley NHS Foundation Trust and King's College London, London, United Kingdom
- Anthony Shek
  - Department of Biostatistics and Health Informatics, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, United Kingdom
  - Department of Clinical Neuroscience, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, United Kingdom
- James Teo
  - Department of Clinical Neuroscience, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, United Kingdom
  - Department of Neuroscience, King's College Hospital NHS Foundation Trust, London, United Kingdom
- Richard J B Dobson
  - Department of Biostatistics and Health Informatics, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, United Kingdom
  - Health Data Research UK London, University College London, London, United Kingdom
  - NIHR Biomedical Research Centre at South London and Maudsley NHS Foundation Trust and King's College London, London, United Kingdom
  - Institute for Health Informatics, University College London, London, United Kingdom
  - NIHR Biomedical Research Centre, University College London Hospitals NHS Foundation Trust, London, United Kingdom
|
17
|
Replication of Real-World Evidence in Oncology Using Electronic Health Record Data Extracted by Machine Learning. Cancers (Basel) 2023; 15:cancers15061853. [PMID: 36980739 PMCID: PMC10046618 DOI: 10.3390/cancers15061853] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/27/2023] [Revised: 03/07/2023] [Accepted: 03/14/2023] [Indexed: 03/22/2023] [Open Access]
Abstract
Meaningful real-world evidence (RWE) generation requires unstructured data found in electronic health records (EHRs) which are often missing from administrative claims; however, obtaining relevant data from unstructured EHR sources is resource-intensive. In response, researchers are using natural language processing (NLP) with machine learning (ML) techniques (i.e., ML extraction) to extract real-world data (RWD) at scale. This study assessed the quality and fitness-for-use of EHR-derived oncology data curated using NLP with ML as compared to the reference standard of expert abstraction. Using a sample of 186,313 patients with lung cancer from a nationwide EHR-derived de-identified database, we performed a series of replication analyses demonstrating some common analyses conducted in retrospective observational research with complex EHR-derived data to generate evidence. Eligible patients were selected into biomarker- and treatment-defined cohorts, first with expert-abstracted then with ML-extracted data. We utilized the biomarker- and treatment-defined cohorts to perform analyses related to biomarker-associated survival and treatment comparative effectiveness, respectively. Across all analyses, the results differed by less than 8% between the data curation methods, and similar conclusions were reached. These results highlight that high-performance ML-extracted variables trained on expert-abstracted data can achieve similar results as when using abstracted data, unlocking the ability to perform oncology research at scale.
|
18
|
A validated artificial intelligence-based pipeline for population-wide primary immunodeficiency screening. J Allergy Clin Immunol 2023; 151:272-279. [PMID: 36243223 DOI: 10.1016/j.jaci.2022.10.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/05/2022] [Revised: 09/21/2022] [Accepted: 10/05/2022] [Indexed: 11/07/2022]
Abstract
BACKGROUND Identification of patients with underlying inborn errors of immunity and inherent susceptibility to infection remains challenging. The ensuing protracted diagnostic odyssey for such patients often results in greater morbidity and suboptimal outcomes, underscoring a need to develop systematic methods for improving diagnostic rates. OBJECTIVE The principal aim of this study is to build and validate a generalizable analytical pipeline for population-wide detection of infection susceptibility and risk of primary immunodeficiency. METHODS This prospective, longitudinal cohort study coupled weighted rules with a machine learning classifier for risk stratification. Claims data were analyzed from a diverse population (n = 427,110) iteratively over 30 months. Cohort outcomes were enumerated for new diagnoses, hospitalizations, and acute care visits. This study followed TRIPOD (Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis) standards. RESULTS Cohort members initially identified as high risk were proportionally more likely to receive a diagnosis of primary immunodeficiency compared to those at low-medium risk or those without claims of interest respectively (9% vs 1.5% vs 0.2%; P < .001, chi-square test). Subsequent machine learning stratification enabled an annualized individual snapshot of complexity for triaging referrals. This study's top-performing machine learning model for visit-level prediction used a single dense layer neural network architecture (area under the receiver-operator characteristic curve = 0.98; F1 score = 0.98). CONCLUSIONS A 2-step analytical pipeline can facilitate identification of individuals with primary immunodeficiency and accurately quantify clinical risk.
|
19
|
Forde SA, Campbell JM, Gill KW, Sobers NP. Real-World Data and Paper-Based Disease Registries in the Small Island Developing State of Barbados During the COVID-19 Pandemic. JOURNAL OF REGISTRY MANAGEMENT 2023; 50:40-42. [PMID: 37577281 PMCID: PMC10414192] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/15/2023]
Abstract
Significant data is being produced on the impact of COVID-19 on aspects of clinical care. However, less is known about the impact on real-world health data. The US Food and Drug Administration defines real-world data as "data relating to patient health status and/or the delivery of health care routinely collected from a variety of sources," including disease registries [1]. The methodology used by the Barbados National Registry (BNR), the active pursuit of first-hand clinical data using paper-based charts from multiple sources, makes it an ideal example of real-world data. Real-world data can overcome the barriers to clinical trials often present in small island developing states. This paper reviews the impact of the COVID-19 pandemic on the data of the BNR within the context of the real-world data cycle. Data collected retrospectively for 2016-2018, undergoing traceback during the pandemic, demonstrated a greater reliance on death certificate registration. A 38% reduction in the collection of new cases was noted in the postpandemic period compared to data collected in previous periods. The lack of access to source data delayed cancer registry reporting. We conclude that, given the challenges highlighted during the COVID-19 pandemic, more effort should be placed on providing timely access to real-world data for public health decision-making, particularly in small island developing states.
Affiliation(s)
- Shelly-Ann Forde
  - George Alleyne Chronic Disease Research Centre, Caribbean Institute for Health Research, The University of the West Indies, Bridgetown, Barbados
- Jacqueline M. Campbell
  - George Alleyne Chronic Disease Research Centre, Caribbean Institute for Health Research, The University of the West Indies, Bridgetown, Barbados
- Kirt W. Gill
  - George Alleyne Chronic Disease Research Centre, Caribbean Institute for Health Research, The University of the West Indies, Bridgetown, Barbados
- Natasha P. Sobers
  - George Alleyne Chronic Disease Research Centre, Caribbean Institute for Health Research, The University of the West Indies, Bridgetown, Barbados
|
20
|
Mavragani A, Setia S, Shinde SP, Santoso H, Furtner D. Contemporary Databases in Real-world Studies Regarding the Diverse Health Care Systems of India, Thailand, and Taiwan: Protocol for a Scoping Review. JMIR Res Protoc 2022; 11:e43741. [PMID: 36512386 PMCID: PMC9795390 DOI: 10.2196/43741] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/22/2022] [Revised: 11/30/2022] [Accepted: 12/01/2022] [Indexed: 12/03/2022] [Open Access]
Abstract
BACKGROUND Real-world data (RWD) related to patient health status or health care delivery can be broadly defined as data collected outside of conventional clinical trials, including those from databases, treatment and disease registries, electronic medical records, insurance claims, and information directly contributed by health care professionals or patients. RWD are used to generate real-world evidence (RWE), which is increasingly relevant to policy makers in Asia, who use RWE to support decision-making in several areas, including public health policy, regulatory health technology assessment, and reimbursement; set priorities; or inform clinical practice. OBJECTIVE To support the achievement of the benefits of RWE in Asian health care strategies and policies, we sought to identify the linked contemporary databases used in real-world studies from three representative countries (India, Thailand, and Taiwan) and explore variations in results based on these countries' economies and health care reimbursement systems by performing a systematic scoping review. Herein, we describe the protocol and preliminary findings of our scoping review. METHODS The PubMed search strategy covered 3 concepts. Concept 1 was designed to identify potential RWE and RWD studies by applying various Medical Subject Headings (MeSH) terms ("Treatment Outcome," "Evidence-Based Medicine," "Retrospective Studies," and "Time Factors") and related keywords (eg, "real-world," "actual life," and "actual practice"). Concept 2 introduced the three countries: India, Taiwan, and Thailand. Concept 3 focused on data types, using a combination of MeSH terms ("Electronic Health Records," "Insurance, Health," "Registries," "Databases, Pharmaceutical," and "Pharmaceutical Services") and related keywords (eg, "electronic medical record," "electronic healthcare record," "EMR," "EHR," "administrative database," and "registry").
These searches were conducted with filters for language (English) and publication date (publications in the last 5 years before the search). The retrieved articles will undergo 2 screening phases (phase 1: review of titles and abstracts; phase 2: review of full texts) to identify relevant and eligible articles for data extraction. The data to be extracted from eligible studies will include the characteristics of databases, the regions covered, and the patient populations. RESULTS The literature search was conducted on September 27, 2022. We retrieved 3,172,434, 1,094,125, and 672,794 articles for concepts 1, 2, and 3, respectively. After applying all 3 concepts and the language and publication date filters, 2277 articles were identified. These will be further screened to identify eligible studies. Based on phase 1 screening and our progress to date, approximately 44% (1003/2277) of articles have undergone phase 2 screening to judge their eligibility. Around 800 studies will be used for data extraction. CONCLUSIONS Our research will be crucial for nurturing advancement in RWD generation within Asia by identifying linked clinical RWD databases and new avenues for public-private partnerships and multiple collaborations for expanding the scope and spectrum of high-quality, robust RWE generation in Asia. INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID) DERR1-10.2196/43741.
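The three-concept search strategy described above can be pictured as one Boolean query: terms within a concept are joined with OR, and the concepts are intersected with AND. The sketch below is illustrative only. The terms are those quoted in the abstract, but the helper `or_block`, the choice of the `[MeSH Terms]` and `[Title/Abstract]` field tags, and the placement of the language filter are assumptions, and the 5-year publication date filter is left out because its exact bounds are not given.

```python
def or_block(mesh_terms, keywords):
    """Combine one concept's MeSH terms and free-text keywords with OR."""
    parts = [f'"{term}"[MeSH Terms]' for term in mesh_terms]
    parts += [f'"{word}"[Title/Abstract]' for word in keywords]
    return "(" + " OR ".join(parts) + ")"


# Terms are the ones quoted in the abstract; the field tags are assumed.
concept_1 = or_block(
    ["Treatment Outcome", "Evidence-Based Medicine", "Retrospective Studies", "Time Factors"],
    ["real-world", "actual life", "actual practice"],
)
concept_2 = or_block([], ["India", "Taiwan", "Thailand"])
concept_3 = or_block(
    ["Electronic Health Records", "Insurance, Health", "Registries",
     "Databases, Pharmaceutical", "Pharmaceutical Services"],
    ["electronic medical record", "electronic healthcare record", "EMR", "EHR",
     "administrative database", "registry"],
)

# Concepts are intersected with AND; the language filter is appended last.
query = " AND ".join([concept_1, concept_2, concept_3]) + " AND English[Language]"
print(query)

# Screening progress reported in the abstract: 1003 of 2277 articles.
screened_pct = 100 * 1003 / 2277
print(f"{screened_pct:.1f}% screened")
```

Intersecting the concepts is what reduces the millions of per-concept hits reported in the abstract to the 2277 articles carried forward to screening.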
Affiliation(s)
- Sajita Setia
  - Executive Office, Transform Medical Communications Limited, Wanganui, New Zealand
- Salil Prakash Shinde
  - Regional Medical Affairs, Pfizer Corporation Hong Kong Limited, Hong Kong, Hong Kong
- Handoko Santoso
  - Regional Medical Affairs, Pfizer Corporation Hong Kong Limited, Hong Kong, Hong Kong
- Daniel Furtner
  - Executive Office, Transform Medical Communications Limited, Wanganui, New Zealand
|
21
|
Challenges and recommendations for wearable devices in digital health: Data quality, interoperability, health equity, fairness. PLOS DIGITAL HEALTH 2022; 1:e0000104. [PMID: 36812619 PMCID: PMC9931360 DOI: 10.1371/journal.pdig.0000104] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Indexed: 11/05/2022]
Abstract
Wearable devices are increasingly present in the health context, as tools for biomedical research and clinical care. In this context, wearables are considered key tools for a more digital, personalised, preventive medicine. At the same time, wearables have also been associated with issues and risks, such as those connected to privacy and data sharing. Yet, discussions in the literature have mostly focused on either technical or ethical considerations, framing these as largely separate areas of discussion, and the contribution of wearables to the collection, development, and application of biomedical knowledge has only partially been discussed. To fill in these gaps, in this article we provide an epistemic (knowledge-related) overview of the main functions of wearable technology for health: monitoring, screening, detection, and prediction. On this basis, we identify 4 areas of concern in the application of wearables for these functions: data quality, balanced estimations, health equity, and fairness. To move the field forward in an effective and beneficial direction, we present recommendations for the 4 areas: local standards of quality, interoperability, access, and representativity.
|
22
|
Moving towards vertically integrated artificial intelligence development. NPJ Digit Med 2022; 5:143. [PMID: 36104535 PMCID: PMC9474277 DOI: 10.1038/s41746-022-00690-x] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 03/10/2022] [Accepted: 08/31/2022] [Indexed: 11/08/2022] [Open Access]
Abstract
Substantial interest and investment in clinical artificial intelligence (AI) research has not resulted in widespread translation to deployed AI solutions. Current attention has focused on bias and explainability in AI algorithm development, external validity and model generalisability, and lack of equity and representation in existing data. While of great importance, these considerations also reflect a model-centric approach seen in published clinical AI research, which focuses on optimising architecture and performance of an AI model on best available datasets. However, even robustly built models using state-of-the-art algorithms may fail once tested in realistic environments due to unpredictability of real-world conditions, out-of-dataset scenarios, characteristics of deployment infrastructure, and lack of added value to clinical workflows relative to cost and potential clinical risks. In this perspective, we define a vertically integrated approach to AI development that incorporates early, cross-disciplinary, consideration of impact evaluation, data lifecycles, and AI production, and explore its implementation in two contrasting AI development pipelines: a scalable “AI factory” (Mayo Clinic, Rochester, United States), and an end-to-end cervical cancer screening platform for resource poor settings (Paps AI, Mbarara, Uganda). We provide practical recommendations for implementers, and discuss future challenges and novel approaches (including a decentralised federated architecture being developed in the NHS (AI4VBH, London, UK)). Growth in global clinical AI research continues unabated, and introduction of vertically integrated teams and development practices can increase the translational potential of future clinical AI projects.
Collapse
|
23
|
Zhang J, Mattie H, Shuaib H, Hensman T, Teo JT, Celi LA. Addressing the "elephant in the room" of AI clinical decision support through organisation-level regulation. PLOS Digit Health 2022; 1:e0000111. [PMID: 36812576 PMCID: PMC9931314 DOI: 10.1371/journal.pdig.0000111] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
Affiliation(s)
- Joe Zhang
- Institute of Global Health Innovation, Imperial College London, London, United Kingdom
| | - Heather Mattie
- Department of Biostatistics, Harvard T H Chan School of Public Health, Harvard University, Cambridge, Massachusetts, United States of America
| | - Haris Shuaib
- Department of Clinical Scientific Computing, Guy’s and St. Thomas’ Hospital NHS Foundation Trust, London, United Kingdom
| | - Tamishta Hensman
- Department of Critical Care, Guy’s and St. Thomas’ Hospital NHS Foundation Trust, London, United Kingdom
- The Australian and New Zealand Intensive Care Society Centre for Outcome and Resource Evaluation, Camberwell, Australia
| | - James T. Teo
- Department of Neurology, King’s College Hospital NHS Foundation Trust, London, United Kingdom
- London Medical Imaging & AI Centre, Guy’s and St. Thomas’ Hospital, London, United Kingdom
| | - Leo Anthony Celi
- Department of Biostatistics, Harvard T H Chan School of Public Health, Harvard University, Cambridge, Massachusetts, United States of America
- Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
- Division of Pulmonary, Critical Care and Sleep Medicine, Beth Israel Deaconess Medical Center, Boston, Massachusetts, United States of America
| |
|
24
|
Magalhães T, Dinis-Oliveira RJ, Taveira-Gomes T. Digital Health and Big Data Analytics: Implications of Real-World Evidence for Clinicians and Policymakers. Int J Environ Res Public Health 2022; 19:8364. [PMID: 35886214 PMCID: PMC9325235 DOI: 10.3390/ijerph19148364] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 06/28/2022] [Accepted: 07/07/2022] [Indexed: 11/24/2022]
Abstract
Real-world data (RWD) and real-world evidence (RWE) play an increasingly important role in clinical research, since scientific knowledge is obtained during routine large-scale clinical practice rather than experimentally, as occurs in highly controlled traditional clinical trials. In particular, electronic health records (EHRs) are a relevant source of data. Nevertheless, there are significant challenges in the correct use and interpretation of EHR data, such as bias, heterogeneity of the population, and missing or non-standardized data formats. The recognized difficulties of RWD and RWE are easily outweighed by their benefits in ensuring efficacy, safety, and cost-effectiveness as a complement to the gold standard of the randomized controlled trial (RCT), namely by providing a complete picture of the factors and variables that can guide robust clinical decisions. Their relevance may become even more evident as healthcare units develop more accurate EHRs while always respecting the privacy of patient data. This editorial gives an overview of the major aspects of the state of the art in RWD and RWE and supports the Special Issue on "Digital Health and Big Data Analytics: Implications of Real-World Evidence for Clinicians and Policymakers", which aims to explore the potential and utility of RWD and RWE in offering insights on a broad spectrum of diseases.
Affiliation(s)
- Teresa Magalhães
- Department of Public Health and Forensic Sciences, and Medical Education, Faculty of Medicine, University of Porto, 4200-319 Porto, Portugal
- Center for Health Technology and Services Research (CINTESIS), 4200-450 Porto, Portugal
- MTG Research and Development Lab, 4200-604 Porto, Portugal
- TOXRUN—Toxicology Research Unit, University Institute of Health Sciences, Advanced Polytechnic and University Cooperative (CESPU), CRL, 4585-116 Gandra, Portugal
| | - Ricardo Jorge Dinis-Oliveira
- Department of Public Health and Forensic Sciences, and Medical Education, Faculty of Medicine, University of Porto, 4200-319 Porto, Portugal
- MTG Research and Development Lab, 4200-604 Porto, Portugal
- TOXRUN—Toxicology Research Unit, University Institute of Health Sciences, Advanced Polytechnic and University Cooperative (CESPU), CRL, 4585-116 Gandra, Portugal
- UCIBIO-REQUIMTE, Laboratory of Toxicology, Department of Biological Sciences, Faculty of Pharmacy, University of Porto, 4050-313 Porto, Portugal
| | - Tiago Taveira-Gomes
- Center for Health Technology and Services Research (CINTESIS), 4200-450 Porto, Portugal
- MTG Research and Development Lab, 4200-604 Porto, Portugal
- Department of Community Medicine, Information and Decision in Health, Faculty of Medicine, University of Porto, 4200-319 Porto, Portugal
- Faculty of Health Sciences, University Fernando Pessoa (FCS-UFP), 4249-004 Porto, Portugal
| |
|
25
|
Sarri G. Can Real-World Evidence Help Restore Decades of Health Inequalities by Informing Health Care Decision-Making? Certainly, and Here is How. Front Pharmacol 2022; 13:905820. [PMID: 35784688 PMCID: PMC9241066 DOI: 10.3389/fphar.2022.905820] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 03/27/2022] [Accepted: 05/06/2022] [Indexed: 11/19/2022] Open
|