1
Abdalla T, Preen DB, Pole JD, Walwyn T, Bulsara M, Ives A, Choong CS, Ohan JL. Psychiatric disorders in childhood cancer survivors: A retrospective matched cohort study of inpatient hospitalisations and community-based mental health services utilisation in Western Australia. Aust N Z J Psychiatry 2024; 58:515-527. [PMID: 38404162 PMCID: PMC11128143 DOI: 10.1177/00048674241233871] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/27/2024]
Abstract
OBJECTIVE We examined the impact of long-term mental health outcomes on healthcare services utilisation among childhood cancer survivors in Western Australia using linked hospitalisations and community-based mental healthcare records from 1987 to 2019. METHOD The study cohort included 2977 childhood cancer survivors diagnosed with cancer at age < 18 years in Western Australia from 1982 to 2014 and a matched non-cancer control group of 24,994 individuals. Adjusted hazard ratios of recurrent events were estimated using the Andersen-Gill model. The cumulative burden of events over time was assessed using the method of mean cumulative count. The annual percentage change in events was estimated using the negative binomial regression model. RESULTS The results showed higher community-based service contacts (rate/100 person-years: 30.2, 95% confidence interval = [29.7-30.7] vs 22.8, 95% confidence interval = [22.6-22.9]) and hospitalisations (rate/1000 person-years: 14.8, 95% confidence interval = [13.6-16.0] vs 12.7, 95% confidence interval = [12.3-13.1]) in childhood cancer survivors compared to the control group. Childhood cancer survivors had a significantly higher risk of any event (adjusted hazard ratio = 1.5, 95% confidence interval = [1.1-2.0]). The cumulative burden of events increased with time since diagnosis and across age groups. The annual percentage change for hospitalisations and service contacts significantly increased over time (p < 0.05). Substance abuse was the leading cause of hospitalisations, while mood/affective and anxiety disorders were common causes of service contacts. Risk factors associated with increased service events included cancer diagnosis at age < 5 years, leukaemia diagnosis, high socioeconomic deprivation, and an attained age of < 18 years. 
CONCLUSIONS The elevated utilisation of healthcare services observed among childhood cancer survivors emphasises the need for periodic assessment of psychiatric disorders, particularly in high-risk survivors, to facilitate early management and optimise healthcare resources.
Affiliation(s)
- Tasnim Abdalla
- Medical School, Faculty of Health and Medical Sciences, The University of Western Australia, Perth, WA, Australia
- David B Preen
- School of Population and Global Health, Faculty of Health and Medical Sciences, The University of Western Australia, Perth, WA, Australia
- Jason D Pole
- Centre for Health Services Research, The University of Queensland, Herston, QLD, Australia
- Thomas Walwyn
- Medical School, Faculty of Health and Medical Sciences, The University of Western Australia, Perth, WA, Australia
- Department of Paediatric and Adolescent Oncology and Haematology, Perth Children’s Hospital, Nedlands, WA, Australia
- Max Bulsara
- School of Population and Global Health, Faculty of Health and Medical Sciences, The University of Western Australia, Perth, WA, Australia
- Institute for Health Research, The University of Notre Dame Australia, Fremantle, WA, Australia
- Angela Ives
- Medical School, Faculty of Health and Medical Sciences, The University of Western Australia, Perth, WA, Australia
- Catherine S Choong
- Medical School, Faculty of Health and Medical Sciences, The University of Western Australia, Perth, WA, Australia
- Department of Endocrinology, Perth Children’s Hospital, Nedlands, WA, Australia
- Jeneva L Ohan
- School of Psychological Science, The University of Western Australia, Perth, WA, Australia
2
Osei-Yeboah R, Ngwenya O, Tiffin N. Kidney function in healthcare clients in Khayelitsha, South Africa: Routine laboratory testing and results reflect distinct healthcare experiences by age for healthcare clients with and without HIV. PLOS GLOBAL PUBLIC HEALTH 2024; 4:e0002526. [PMID: 38753721 PMCID: PMC11098392 DOI: 10.1371/journal.pgph.0002526] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/21/2023] [Accepted: 04/12/2024] [Indexed: 05/18/2024]
Abstract
In South Africa, PLHIV are eligible for free ART and kidney function screening. Serum creatinine (SCr) laboratory test data from the National Health Laboratory Service are collated at the Provincial Health Data Centre and linked with other routine health data. We analysed SCr and estimated glomerular filtration rate (eGFR) results for PLHIV and HIV-negative healthcare clients aged 18-80 years accessing healthcare in Khayelitsha, South Africa and comorbidity profiles at SCr and eGFR testing. 45 640 individuals aged 18-80 years with at least one renal test accessed Khayelitsha public health facilities in 2016/2017. 22 961 (50.3%) were PLHIV. Median age at first SCr and eGFR test for PLHIV was 33yrs (IQR: 27,41) to 36yrs (IQR: 30,43) compared to 49yrs (IQR: 37,57) and 52yrs (IQR: 44,59) for those without HIV. PLHIV first median SCr results were 66 (IQR: 55,78) μmol/l compared to 69 (IQR: 58,82) μmol/l for HIV-negative individuals. Hypertension, diabetes, and CKD at testing were more common in HIV-negative people than PLHIV. HIV, diabetes and tuberculosis (TB) are associated with higher eGFR results; whilst hypertension, being male and older are associated with lower eGFR results. These data reflect testing practices in the Western Cape: younger people without HIV have generally worse kidney function test results; younger PLHIV have generally good test results, and older people with/without HIV have generally similar test results, reflecting regular screening for kidney function in asymptomatic PLHIV whereas young HIV-negative people are tested only when presenting with renal symptoms. Our analysis suggests we cannot infer the future healthcare requirements of younger PLHIV based on the current ageing population, due to changing ART availability for different generations of PLHIV. Instead, routine health data may be used in an agile way to assess ongoing healthcare requirements of ageing PLHIV, and to reflect implementation of treatment guidelines.
Affiliation(s)
- Richard Osei-Yeboah
- Faculty of Health Sciences, Integrative Biomedical Sciences Department, Division of Computational Biology, University of Cape Town, Cape Town, South Africa
- Olina Ngwenya
- Faculty of Biology, Centre for Biostatistics, School of Health Sciences, Medicine and Health, University of Manchester, Manchester, United Kingdom
- Institute of Infectious Diseases and Molecular Medicine, Wellcome Centre for Infectious Disease Research in Africa, University of Cape Town, Cape Town, South Africa
- Nicki Tiffin
- Institute of Infectious Diseases and Molecular Medicine, Wellcome Centre for Infectious Disease Research in Africa, University of Cape Town, Cape Town, South Africa
- South African National Bioinformatics Institute, South African Medical Research Council Bioinformatics Unit, University of the Western Cape, Bellville, South Africa
3
Seoni S, Jahmunah V, Salvi M, Barua PD, Molinari F, Acharya UR. Application of uncertainty quantification to artificial intelligence in healthcare: A review of last decade (2013-2023). Comput Biol Med 2023; 165:107441. [PMID: 37683529 DOI: 10.1016/j.compbiomed.2023.107441] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Received: 08/04/2023] [Revised: 08/27/2023] [Accepted: 08/29/2023] [Indexed: 09/10/2023]
Abstract
Uncertainty estimation in healthcare involves quantifying and understanding the inherent uncertainty or variability associated with medical predictions, diagnoses, and treatment outcomes. In this era of Artificial Intelligence (AI) models, uncertainty estimation becomes vital to ensure safe decision-making in the medical field. Therefore, this review focuses on the application of uncertainty techniques to machine and deep learning models in healthcare. A systematic literature review was conducted using the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. Our analysis revealed that Bayesian methods were the predominant technique for uncertainty quantification in machine learning models, with Fuzzy systems being the second most used approach. Regarding deep learning models, Bayesian methods emerged as the most prevalent approach, finding application in nearly all aspects of medical imaging. Most of the studies reported in this paper focused on medical images, highlighting the prevalent application of uncertainty quantification techniques using deep learning models compared to machine learning models. Interestingly, we observed a scarcity of studies applying uncertainty quantification to physiological signals. Thus, future research on uncertainty quantification should prioritize investigating the application of these techniques to physiological signals. Overall, our review highlights the significance of integrating uncertainty techniques in healthcare applications of machine learning and deep learning models. This can provide valuable insights and practical solutions to manage uncertainty in real-world medical data, ultimately improving the accuracy and reliability of medical diagnoses and treatment recommendations.
Affiliation(s)
- Silvia Seoni
- Biolab, PolitoBIOMedLab, Department of Electronics and Telecommunications, Politecnico di Torino, Turin, Italy
- Massimo Salvi
- Biolab, PolitoBIOMedLab, Department of Electronics and Telecommunications, Politecnico di Torino, Turin, Italy
- Prabal Datta Barua
- School of Business (Information System), University of Southern Queensland, Toowoomba, QLD, 4350, Australia; Faculty of Engineering and Information Technology, University of Technology Sydney, Sydney, NSW, 2007, Australia
- Filippo Molinari
- Biolab, PolitoBIOMedLab, Department of Electronics and Telecommunications, Politecnico di Torino, Turin, Italy.
- U Rajendra Acharya
- School of Mathematics, Physics and Computing, University of Southern Queensland, Springfield, Australia
4
Mews S, Surmann B, Hasemann L, Elkenkamp S. Markov-modulated marked Poisson processes for modeling disease dynamics based on medical claims data. Stat Med 2023; 42:3804-3815. [PMID: 37308135 DOI: 10.1002/sim.9832] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/28/2022] [Revised: 05/26/2023] [Accepted: 06/01/2023] [Indexed: 06/14/2023]
Abstract
We explore Markov-modulated marked Poisson processes (MMMPPs) as a natural framework for modeling patients' disease dynamics over time based on medical claims data. In claims data, observations do not only occur at random points in time but are also informative, that is, driven by unobserved disease levels, as poor health conditions usually lead to more frequent interactions with the health care system. Therefore, we model the observation process as a Markov-modulated Poisson process, where the rate of health care interactions is governed by a continuous-time Markov chain. Its states serve as proxies for the patients' latent disease levels and further determine the distribution of additional data collected at each observation time, the so-called marks. Overall, MMMPPs jointly model observations and their informative time points by comprising two state-dependent processes: the observation process (corresponding to the event times) and the mark process (corresponding to event-specific information), which both depend on the underlying states. The approach is illustrated using claims data from patients diagnosed with chronic obstructive pulmonary disease by modeling their drug use and the interval lengths between consecutive physician consultations. The results indicate that MMMPPs are able to detect distinct patterns of health care utilization related to disease processes and reveal interindividual differences in the state-switching dynamics.
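As a rough illustration of the data-generating mechanism this abstract describes, the following sketch simulates a two-state Markov-modulated marked Poisson process. It is not the authors' implementation: the function name `simulate_mmmpp`, the restriction to two states, the Gaussian mark distribution, and all parameter values are assumptions chosen only to make the idea concrete.

```python
import random


def simulate_mmmpp(exit_rates, event_rates, mark_means, horizon, seed=0):
    """Simulate a two-state Markov-modulated marked Poisson process.

    exit_rates[s]  : rate of leaving latent state s (continuous-time Markov chain)
    event_rates[s] : Poisson rate of observations while in state s
    mark_means[s]  : mean of the (assumed Gaussian, sd 1) mark in state s
    Returns a list of (event_time, latent_state, mark) tuples.
    """
    rng = random.Random(seed)
    t, state = 0.0, 0
    events = []
    while t < horizon:
        # Sojourn time in the current latent state is exponential.
        sojourn_end = min(t + rng.expovariate(exit_rates[state]), horizon)
        # Within the sojourn, events follow a homogeneous Poisson process.
        s = t + rng.expovariate(event_rates[state])
        while s < sojourn_end:
            mark = rng.gauss(mark_means[state], 1.0)  # state-dependent mark
            events.append((s, state, mark))
            s += rng.expovariate(event_rates[state])
        # With two states, leaving one state always means entering the other.
        t, state = sojourn_end, 1 - state
    return events


if __name__ == "__main__":
    # Hypothetical values: state 1 ("poor health") produces more frequent contacts.
    sample = simulate_mmmpp((0.1, 0.2), (0.5, 5.0), (0.0, 3.0), horizon=50.0)
    print(len(sample), "simulated health care contacts")
```

In this sketch both the timing of the events and their marks depend on the latent state, which is the sense in which observation times are informative.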
Affiliation(s)
- Sina Mews
- Department of Business Administration and Economics, Bielefeld University, Bielefeld, Germany
- Bastian Surmann
- Department for Health Economics and Health Care Management, Bielefeld University, Bielefeld, Germany
- Lena Hasemann
- Department for Health Economics and Health Care Management, Bielefeld University, Bielefeld, Germany
- Svenja Elkenkamp
- Department for Health Economics and Health Care Management, Bielefeld University, Bielefeld, Germany
5
Brossard M, Paterson AD, Espin-Garcia O, Craiu RV, Bull SB. Characterization of direct and/or indirect genetic associations for multiple traits in longitudinal studies of disease progression. Genetics 2023; 225:iyad119. [PMID: 37369448 DOI: 10.1093/genetics/iyad119] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/30/2023] [Revised: 06/07/2023] [Accepted: 06/19/2023] [Indexed: 06/29/2023] Open
Abstract
When quantitative longitudinal traits are risk factors for disease progression and subject to random biological variation, joint model analysis of time-to-event and longitudinal traits can effectively identify direct and/or indirect genetic association of single nucleotide polymorphisms (SNPs) with time-to-event. We present a joint model that integrates: (1) a multivariate linear mixed model describing trajectories of multiple longitudinal traits as a function of time, SNP effects, and subject-specific random effects and (2) a frailty Cox survival model that depends on SNPs, longitudinal trajectory effects, and subject-specific frailty accounting for dependence among multiple time-to-event traits. Motivated by complex genetic architecture of type 1 diabetes complications (T1DC) observed in the Diabetes Control and Complications Trial (DCCT), we implement a 2-stage approach to inference with bootstrap joint covariance estimation and develop a hypothesis testing procedure to classify direct and/or indirect SNP association with each time-to-event trait. By realistic simulation study, we show that joint modeling of 2 time-to-T1DC (retinopathy and nephropathy) and 2 longitudinal risk factors (HbA1c and systolic blood pressure) reduces estimation bias in genetic effects and improves classification accuracy of direct and/or indirect SNP associations, compared to methods that ignore within-subject risk factor variability and dependence among longitudinal and time-to-event traits. Through DCCT data analysis, we demonstrate feasibility for candidate SNP modeling and quantify effects of sample size and Winner's curse bias on classification for 2 SNPs identified as having indirect associations with time-to-T1DC traits. Joint analysis of multiple longitudinal and multiple time-to-event traits provides insight into complex traits architecture.
Affiliation(s)
- Myriam Brossard
- Prosserman Centre for Population Health Research, Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto M5T 3L9, Ontario, Canada
- Andrew D Paterson
- Program in Genetics and Genome Biology, Hospital for Sick Children Research Institute, Toronto M5G 1X8, Ontario, Canada
- Division of Biostatistics, Dalla Lana School of Public Health, University of Toronto, Toronto M5T 3M7, Ontario, Canada
- Osvaldo Espin-Garcia
- Division of Biostatistics, Dalla Lana School of Public Health, University of Toronto, Toronto M5T 3M7, Ontario, Canada
- Department of Biostatistics, Princess Margaret Cancer Centre, Toronto M5G 2C1, Ontario, Canada
- Department of Statistical Sciences, University of Toronto, Toronto M5S 3G3, Ontario, Canada
- Department of Epidemiology and Biostatistics, Western University, London N6A 5C1, Ontario, Canada
- Radu V Craiu
- Department of Statistical Sciences, University of Toronto, Toronto M5S 3G3, Ontario, Canada
- Shelley B Bull
- Prosserman Centre for Population Health Research, Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto M5T 3L9, Ontario, Canada
- Division of Biostatistics, Dalla Lana School of Public Health, University of Toronto, Toronto M5T 3M7, Ontario, Canada
6
Illipse M, Czene K, Hall P, Humphreys K. Studying the association between longitudinal mammographic density measurements and breast cancer risk: a joint modelling approach. Breast Cancer Res 2023; 25:64. [PMID: 37296473 PMCID: PMC10257295 DOI: 10.1186/s13058-023-01667-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2021] [Accepted: 05/30/2023] [Indexed: 06/12/2023] Open
Abstract
BACKGROUND Researchers have suggested that longitudinal trajectories of mammographic breast density (MD) can be used to understand changes in breast cancer (BC) risk over a woman's lifetime. Some have suggested, based on biological arguments, that the cumulative trajectory of MD encapsulates the risk of BC across time. Others have tried to connect changes in MD to the risk of BC. METHODS To summarize the MD-BC association, we jointly model longitudinal trajectories of MD and time to diagnosis using data from a large ([Formula: see text]) mammography cohort of Swedish women aged 40-80 years. Five hundred eighteen women were diagnosed with BC during follow-up. We fitted three joint models (JMs) with different association structures: cumulative, current value and slope, and current value. RESULTS All models showed evidence of an association between MD trajectory and BC risk ([Formula: see text] for current value of MD, [Formula: see text] and [Formula: see text] for current value and slope of MD respectively, and [Formula: see text] for cumulative value of MD). Models with cumulative association structure and with current value and slope association structure had better goodness of fit than a model based only on current value. The JM with current value and slope structure suggested that a decrease in MD may be associated with an increased (instantaneous) BC risk. It is possible that this is because of increased screening sensitivity rather than being related to biology. CONCLUSION We argue that a JM with a cumulative association structure may be the most appropriate/biologically relevant model in this context.
Affiliation(s)
- Maya Illipse
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
- Swedish eScience Research Centre (SeRC), Karolinska Institutet, Stockholm, Sweden
- Kamila Czene
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
- Per Hall
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
- Department of Oncology, Södersjukhuset, Stockholm, Sweden
- Keith Humphreys
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
- Swedish eScience Research Centre (SeRC), Karolinska Institutet, Stockholm, Sweden
7
Pullenayegum EM, Birken C, Maguire J. Causal inference with longitudinal data subject to irregular assessment times. Stat Med 2023. [PMID: 37054723 DOI: 10.1002/sim.9727] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/07/2022] [Revised: 02/10/2023] [Accepted: 03/18/2023] [Indexed: 04/15/2023]
Abstract
Data collected in the context of usual care present a rich source of longitudinal data for research, but often require analyses that simultaneously enable causal inferences from observational data while handling irregular and informative assessment times. An inverse-weighting approach to this was recently proposed, and handles the case where the assessment times are at random (ie, conditionally independent of the outcome process given the observed history). In this paper, we extend the inverse-weighting approach to handle a special case of assessment not at random, where assessment and outcome processes are conditionally independent given past observed covariates and random effects. We use multiple outputation to accomplish the same purpose as inverse-weighting, and apply it to the Liang semi-parametric joint model. Moreover, we develop an alternative joint model that does not require covariates for the outcome model to be known at times where there is no assessment of the outcome. We examine the performance of these methods through simulation and illustrate them through a study of the causal effect of wheezing on time spent playing outdoors among children aged 2-9 years enrolled in the TARGet Kids! study.
Affiliation(s)
- Eleanor M Pullenayegum
- Child Health Evaluative Sciences, Hospital for Sick Children, Toronto, Canada
- Dalla Lana School of Public Health, University of Toronto, Toronto, Canada
- Catherine Birken
- Child Health Evaluative Sciences, Hospital for Sick Children, Toronto, Canada
- Department of Paediatrics, University of Toronto, Toronto, Canada
- Jonathon Maguire
- Department of Paediatrics, St Michael's Hospital, Toronto, Canada
- Departments of Paediatrics & Nutritional Sciences, University of Toronto, Toronto, Canada
- Li Ka Shing Knowledge Institute, Unity Health Toronto, Toronto, Canada
8
Estimation of marginal structural models under irregular visits and unmeasured confounder: calibrated inverse probability weights. BMC Med Res Methodol 2023; 23:4. [PMID: 36611135 PMCID: PMC9825036 DOI: 10.1186/s12874-022-01831-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/23/2022] [Accepted: 12/26/2022] [Indexed: 01/09/2023] Open
Abstract
Clinical information collected in electronic health records (EHRs) is becoming an essential source to emulate randomized experiments. Since patients do not interact with the healthcare system at random, the longitudinal information in large observational databases must account for irregular visits. Moreover, we need to also account for subject-specific unmeasured confounders which may act as a common cause for treatment assignment mechanism (e.g. glucose-lowering medications) while also influencing the outcome (e.g. Hemoglobin A1c). We used the calibration of longitudinal weights to improve the finite sample properties and to account for subject-specific unmeasured confounders. A Monte Carlo simulation study is conducted to evaluate the performance of calibrated inverse probability estimators using time-dependent treatment assignment and irregular visits with subject-specific unmeasured confounders. The simulation study showed that the longitudinal weights with calibrated restrictions improved the finite sample bias when compared to the stabilized weights. The application of the calibrated weights is demonstrated using the exposure of glucose lowering medications and the longitudinal outcome of Hemoglobin A1c. Our results support the effectiveness of glucose lowering medications in reducing Hemoglobin A1c among type II diabetes patients with elevated glycemic index ([Formula: see text]) using stabilized and calibrated weights.
9
Lokku A, Birken CS, Maguire JL, Pullenayegum EM. Quantifying the extent of visit irregularity in longitudinal data. Int J Biostat 2022; 18:487-520. [PMID: 34392639 DOI: 10.1515/ijb-2020-0144] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/28/2020] [Accepted: 08/02/2021] [Indexed: 01/10/2023]
Abstract
The timings of visits in observational longitudinal data may depend on the study outcome, and this can result in bias if ignored. Assessing the extent of visit irregularity is important because it can help determine whether visits can be treated as repeated measures or as irregular data. We propose plotting the mean proportions of individuals with 0 visits per bin against the mean proportions of individuals with >1 visit per bin as bin width is varied and using the area under the curve (AUC) to assess the extent of irregularity. The AUC is a single score which can be used to quantify the extent of irregularity and assess how closely visits resemble repeated measures. Simulation results confirm that the AUC increases with increasing irregularity while being invariant to sample size and the number of scheduled measurement occasions. A demonstration of the AUC was performed on the TARGet Kids! study which enrolls healthy children aged 0-5 years with the aim of investigating the relationship between early life exposures and later health problems. The quality of statistical analyses can be improved by using the AUC as a guide to select the appropriate analytic outcome approach and minimize the potential for biased results.
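The curve described in this abstract can be sketched in a few lines. This is an illustrative reading of the abstract only, not the authors' published procedure: the function names, the equal-width binning of a fixed follow-up window, the anchoring of the curve at (0, 1) and (1, 0), and the trapezoidal rule are all assumptions.

```python
def bin_proportions(visits, horizon, n_bins):
    """Return (mean proportion with >1 visit per bin, mean proportion with 0 visits per bin).

    visits  : list of per-individual lists of visit times in [0, horizon)
    horizon : length of follow-up, split into n_bins equal-width bins
    """
    width = horizon / n_bins
    zero_total, multi_total = 0.0, 0.0
    per_person = []
    for person in visits:
        counts = [0] * n_bins
        for t in person:
            counts[min(int(t // width), n_bins - 1)] += 1
        per_person.append(counts)
    for b in range(n_bins):
        zero_total += sum(c[b] == 0 for c in per_person) / len(visits)
        multi_total += sum(c[b] > 1 for c in per_person) / len(visits)
    return multi_total / n_bins, zero_total / n_bins


def irregularity_auc(visits, horizon, bin_counts):
    """Area under the (>1 visit, 0 visits) curve traced out as bin width varies."""
    points = {(0.0, 1.0), (1.0, 0.0)}  # assumed end points of the curve
    points |= {bin_proportions(visits, horizon, k) for k in bin_counts}
    # Order along the curve: x increasing, and y decreasing where x ties.
    path = sorted(points, key=lambda p: (p[0], -p[1]))
    return sum((x1 - x0) * (y0 + y1) / 2 for (x0, y0), (x1, y1) in zip(path, path[1:]))


if __name__ == "__main__":
    # Hypothetical data: four evenly spaced visits versus four clustered visits.
    regular = [[1.5, 4.5, 7.5, 10.5]] * 10
    clustered = [[0.1, 0.2, 0.3, 0.4]] * 10
    widths = [1, 2, 4, 8, 12]
    print(irregularity_auc(regular, 12.0, widths))
    print(irregularity_auc(clustered, 12.0, widths))
```

Under these assumptions, visits that behave like repeated measures pass through a bin width at which nobody has zero or multiple visits, so the curve hugs the axes and the area is small; clustered visits push the curve away from the origin.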
Affiliation(s)
- Armend Lokku
- Child Health Evaluative Sciences, Hospital for Sick Children, Toronto, ON, Canada; Dalla Lana School of Public Health, University of Toronto, Toronto, ON, Canada
- Catherine S Birken
- Division of Pediatric Medicine and the Pediatric Outcomes Research Team (PORT), Hospital for Sick Children, Toronto, ON, Canada; Sick Kids Research Institute, Toronto, ON, Canada; Institute of Health Policy, Management, and Evaluation, Toronto, ON, Canada; Department of Pediatrics, Faculty of Medicine, University of Toronto, Toronto, ON, Canada
- Jonathon L Maguire
- Institute of Health Policy, Management and Evaluation, Dalla Lana School of Public Health, University of Toronto, Toronto, ON, Canada; Division of Pediatric Medicine and the Pediatric Outcomes Research Team (PORT), Faculty of Medicine, University of Toronto, Toronto, ON, Canada; Applied Health Research Centre, Li Ka Shing Knowledge Institute, Toronto, ON, Canada; Department of Pediatrics, Li Ka Shing Knowledge Institute, Toronto, ON, Canada
- Eleanor M Pullenayegum
- Child Health Evaluative Sciences, Hospital for Sick Children, Toronto, ON, Canada; Dalla Lana School of Public Health, University of Toronto, Toronto, ON, Canada
10
Alizadehsani R, Roshanzamir M, Hussain S, Khosravi A, Koohestani A, Zangooei MH, Abdar M, Beykikhoshk A, Shoeibi A, Zare A, Panahiazar M, Nahavandi S, Srinivasan D, Atiya AF, Acharya UR. Handling of uncertainty in medical data using machine learning and probability theory techniques: a review of 30 years (1991-2020). ANNALS OF OPERATIONS RESEARCH 2021; 339:1-42. [PMID: 33776178 PMCID: PMC7982279 DOI: 10.1007/s10479-021-04006-2] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Accepted: 02/23/2021] [Indexed: 05/17/2023]
Abstract
Understanding the data and reaching accurate conclusions are of paramount importance in the present era of big data. Machine learning and probability theory methods have been widely used for this purpose in various fields. One critically important yet less explored aspect is capturing and analyzing uncertainties in the data and model. Proper quantification of uncertainty helps to provide valuable information to obtain accurate diagnosis. This paper reviewed related studies conducted in the last 30 years (from 1991 to 2020) in handling uncertainties in medical data using probability theory and machine learning techniques. Medical data is more prone to uncertainty due to the presence of noise in the data. So, it is very important to have clean medical data without any noise to get accurate diagnosis. The sources of noise in the medical data need to be known to address this issue. Based on the medical data obtained by the physician, diagnosis of disease, and treatment plan are prescribed. Hence, the uncertainty is growing in healthcare and there is limited knowledge to address these problems. Our findings indicate that there are few challenges to be addressed in handling the uncertainty in medical raw data and new models. In this work, we have summarized various methods employed to overcome this problem. Nowadays, various novel deep learning techniques have been proposed to deal with such uncertainties and improve the performance in decision making.
Affiliation(s)
- Roohallah Alizadehsani
- Institute for Intelligent Systems Research and Innovations (IISRI), Deakin University, Geelong, Australia
- Mohamad Roshanzamir
- Department of Computer Engineering, Faculty of Engineering, Fasa University, 74617-81189 Fasa, Iran
- Sadiq Hussain
- System Administrator, Dibrugarh University, Dibrugarh, Assam 786004 India
- Abbas Khosravi
- Institute for Intelligent Systems Research and Innovations (IISRI), Deakin University, Geelong, Australia
- Afsaneh Koohestani
- Institute for Intelligent Systems Research and Innovations (IISRI), Deakin University, Geelong, Australia
- Moloud Abdar
- Institute for Intelligent Systems Research and Innovations (IISRI), Deakin University, Geelong, Australia
- Adham Beykikhoshk
- Applied Artificial Intelligence Institute, Deakin University, Geelong, Australia
- Afshin Shoeibi
- Computer Engineering Department, Ferdowsi University of Mashhad, Mashhad, Iran
- Faculty of Electrical and Computer Engineering, Biomedical Data Acquisition Lab, K. N. Toosi University of Technology, Tehran, Iran
- Assef Zare
- Faculty of Electrical Engineering, Gonabad Branch, Islamic Azad University, Gonabad, Iran
- Maryam Panahiazar
- Institute for Computational Health Sciences, University of California, San Francisco, USA
- Saeid Nahavandi
- Institute for Intelligent Systems Research and Innovations (IISRI), Deakin University, Geelong, Australia
- Dipti Srinivasan
- Dept. of Electrical and Computer Engineering, National University of Singapore, Singapore, 117576 Singapore
- Amir F. Atiya
- Department of Computer Engineering, Faculty of Engineering, Cairo University, Cairo, 12613 Egypt
- U. Rajendra Acharya
- Department of Electronics and Computer Engineering, Ngee Ann Polytechnic, Singapore, Singapore
- Department of Biomedical Engineering, School of Science and Technology, Singapore University of Social Sciences, Singapore, Singapore
- Department of Bioinformatics and Medical Engineering, Asia University, Taichung, Taiwan
11
Goldstein BA, Phelan M, Pagidipati NJ, Peskoe SB. How and when informative visit processes can bias inference when using electronic health records data for clinical research. J Am Med Inform Assoc 2021; 26:1609-1617. [PMID: 31553474 DOI: 10.1093/jamia/ocz148] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 03/13/2019] [Revised: 07/16/2019] [Accepted: 07/23/2019] [Indexed: 01/05/2023] Open
Abstract
OBJECTIVE Electronic health records (EHR) data have become a central data source for clinical research. One concern for using EHR data is that the process through which individuals engage with the health system, and find themselves within EHR data, can be informative. We have termed this process informed presence. In this study we use simulation and real data to assess how informed presence can impact inference. MATERIALS AND METHODS We first simulated a visit process where a series of biomarkers were observed informatively and uninformatively over time. We further compared inference derived from a randomized controlled trial (ie, uninformative visits) and EHR data (ie, potentially informative visits). RESULTS We find that bias arises only when there is a strong association both between the biomarker and the outcome and between the biomarker and the visit process. Moreover, once there are some uninformative visits this bias is mitigated. In the data example we find that when the "true" associations are null, there is no observed bias. DISCUSSION These results suggest that an informative visit process can exaggerate an association but cannot induce one. Furthermore, careful study design can mitigate the potential bias when some noninformative visits are included. CONCLUSIONS While there are legitimate concerns regarding biases that "messy" EHR data may induce, the conditions for such biases are extreme and can be accounted for.
Affiliation(s)
- Benjamin A Goldstein
- Department of Biostatistics and Bioinformatics, Duke University, Durham, North Carolina, USA
- Center for Predictive Medicine, Duke Clinical Research Institute, Durham, North Carolina, USA
- Matthew Phelan
- Center for Predictive Medicine, Duke Clinical Research Institute, Durham, North Carolina, USA
- Neha J Pagidipati
- Center for Predictive Medicine, Duke Clinical Research Institute, Durham, North Carolina, USA
- Department of Medicine, Duke University, Durham, North Carolina, USA
- Sarah B Peskoe
- Department of Biostatistics and Bioinformatics, Duke University, Durham, North Carolina, USA
|
12
|
Sisk R, Lin L, Sperrin M, Barrett JK, Tom B, Diaz-Ordaz K, Peek N, Martin GP. Informative presence and observation in routine health data: A review of methodology for clinical risk prediction. J Am Med Inform Assoc 2021; 28:155-166. [PMID: 33164082 PMCID: PMC7810439 DOI: 10.1093/jamia/ocaa242] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 06/26/2020] [Accepted: 09/17/2020] [Indexed: 12/20/2022]
Abstract
Objective Informative presence (IP) is the phenomenon whereby the presence or absence of patient data is potentially informative with respect to their health condition, with informative observation (IO) being the longitudinal equivalent. These phenomena predominantly exist within routinely collected healthcare data, in which data collection is driven by the clinical requirements of patients and clinicians. The extent to which IP and IO are considered when using such data to develop clinical prediction models (CPMs) is unknown, as is the existing methodology aimed at handling these issues. This review aims to synthesize such existing methodology, thereby helping identify an agenda for future methodological work. Materials and Methods A systematic literature search was conducted by 2 independent reviewers using prespecified keywords. Results Thirty-six articles were included. We categorized the methods presented within as derived predictors (including some representation of the measurement process as a predictor in the model), modeling under IP, and latent structures. Including missing indicators or summary measures as predictors is the most commonly presented approach amongst the included studies (24 of 36 articles). Discussion This is the first review to collate the literature in this area under a prediction framework. A considerable body of relevant literature exists, and we present ways in which the described methods could be developed further. Guidance is required for specifying the conditions under which each method should be used to enable applied prediction modelers to use these methods. Conclusions A growing recognition of IP and IO exists within the literature, and methodology is increasingly becoming available to leverage these phenomena for prediction purposes. IP and IO should be approached differently in a prediction context than when the primary goal is explanation. The work included in this review has demonstrated theoretical and empirical benefits of incorporating IP and IO, and therefore we recommend that applied health researchers consider incorporating these methods in their work.
Affiliation(s)
- Rose Sisk
- Division of Informatics, Imaging and Data Sciences, School of Health Sciences, University of Manchester, Manchester, United Kingdom
- Lijing Lin
- Division of Informatics, Imaging and Data Sciences, School of Health Sciences, University of Manchester, Manchester, United Kingdom
- Matthew Sperrin
- Division of Informatics, Imaging and Data Sciences, School of Health Sciences, University of Manchester, Manchester, United Kingdom
- Jessica K Barrett
- MRC Biostatistics Unit, University of Cambridge, Cambridge, United Kingdom
- Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, United Kingdom
- Brian Tom
- MRC Biostatistics Unit, University of Cambridge, Cambridge, United Kingdom
- Karla Diaz-Ordaz
- Department of Medical Statistics, London School of Hygiene and Tropical Medicine, London, United Kingdom
- Niels Peek
- Division of Informatics, Imaging and Data Sciences, School of Health Sciences, University of Manchester, Manchester, United Kingdom
- NIHR Biomedical Research Centre, Manchester Academic Health Science Centre, University of Manchester, Manchester, United Kingdom
- Alan Turing Institute, University of Manchester, London, United Kingdom
- Glen P Martin
- Division of Informatics, Imaging and Data Sciences, School of Health Sciences, University of Manchester, Manchester, United Kingdom
|
13
|
Zaccardi F, Davies MJ, Khunti K. The present and future scope of real-world evidence research in diabetes: What questions can and cannot be answered and what might be possible in the future? Diabetes Obes Metab 2020; 22 Suppl 3:21-34. [PMID: 32250528 DOI: 10.1111/dom.13929] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 08/30/2019] [Revised: 11/18/2019] [Accepted: 11/18/2019] [Indexed: 12/16/2022]
Abstract
The last decade has witnessed an exponential growth in the opportunities to collect and link health-related data from multiple resources, including primary care, administrative, and device data. The availability of these "real-world," "big data" has also fuelled intense methodological research into ways to handle them and extract actionable information. In medicine, the evidence generated from "real-world data" (RWD), which are not purposely collected to answer biomedical questions, is commonly termed "real-world evidence" (RWE). In this review, we focus on RWD and RWE in the area of diabetes research, highlighting their contributions in the last decade, and give some suggestions for future RWE diabetes research, applying well-established and less-known tools to direct it towards better personalized approaches to diabetes care. We underline the essential aspects to consider when using RWD and the key features limiting the translational potential of RWD in generating high-quality and applicable RWE. Only if viewed in the context of other study designs and statistical methods, with its pros and cons carefully considered, will RWE exploit its full potential as a complementary or even, in some cases, substitutive source of evidence compared to the expensive evidence obtained from randomized controlled trials.
Affiliation(s)
- Francesco Zaccardi
- Diabetes Research Centre, Leicester Diabetes Centre, Leicester, UK
- Leicester Real World Evidence Unit, Leicester Diabetes Centre, Leicester, UK
- Melanie J Davies
- Diabetes Research Centre, Leicester Diabetes Centre, Leicester, UK
- Kamlesh Khunti
- Diabetes Research Centre, Leicester Diabetes Centre, Leicester, UK
- Leicester Real World Evidence Unit, Leicester Diabetes Centre, Leicester, UK
|