1. Zhang Y, Kreif N, Gc VS, Manca A. Machine Learning Methods to Estimate Individualized Treatment Effects for Use in Health Technology Assessment. Med Decis Making 2024:272989X241263356. [PMID: 39056320 DOI: 10.1177/0272989x241263356] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/28/2024]
Abstract
BACKGROUND Recent developments in causal inference and machine learning (ML) allow for the estimation of individualized treatment effects (ITEs), which reveal whether treatment effectiveness varies according to patients' observed covariates. ITEs can be used to stratify health policy decisions according to individual characteristics and potentially achieve greater population health. Little is known about the appropriateness of available ML methods for use in health technology assessment. METHODS In this scoping review, we evaluate ML methods available for estimating ITEs, aiming to help practitioners assess their suitability in health technology assessment. We present a taxonomy of ML approaches, categorized by key challenges in health technology assessment using observational data, including handling time-varying confounding and time-to-event data and quantifying uncertainty. RESULTS We found a wide range of algorithms for simpler settings with baseline confounding and continuous or binary outcomes. Not many ML algorithms can handle time-varying or unobserved confounding, and at the time of writing, no ML algorithm was capable of estimating ITEs for time-to-event outcomes while accounting for time-varying confounding. Many of the ML algorithms that estimate ITEs in longitudinal settings do not formally quantify uncertainty around the point estimates. LIMITATIONS This scoping review may not cover all relevant ML methods and algorithms as they are continuously evolving. CONCLUSIONS Existing ML methods available for ITE estimation are limited in handling important challenges posed by observational data when used for cost-effectiveness analysis, such as time-to-event outcomes, time-varying and hidden confounding, or the need to estimate sampling uncertainty around the estimates. IMPLICATIONS ML methods are promising but need further development before they can be used to estimate ITEs for health technology assessments.
HIGHLIGHTS Estimating individualized treatment effects (ITEs) using observational data and machine learning (ML) can support personalized treatment advice and help deliver more customized information on the effectiveness and cost-effectiveness of health technologies. ML methods for ITE estimation are mostly designed for handling confounding at baseline but not time-varying or unobserved confounding. The few models that account for time-varying confounding are designed for continuous or binary outcomes, not time-to-event outcomes. Not all ML methods for estimating ITEs can quantify the uncertainty of their predictions. Future work on developing ML that addresses the concerns summarized in this review is needed before these methods can be widely used in clinical and health technology assessment-like decision making.
Affiliation(s)
- Noemi Kreif
  - Centre for Health Economics, University of York, UK
  - Department of Pharmacy, University of Washington, Seattle, USA
- Vijay S Gc
  - School of Human and Health Sciences, University of Huddersfield, UK
- Andrea Manca
  - Centre for Health Economics, University of York, UK

2. Rivera AS, Pierce JB, Sinha A, Pawlowski AE, Lloyd-Jones DM, Lee YC, Feinstein MJ, Petito LC. Designing target trials using electronic health records: A case study of second-line disease-modifying anti-rheumatic drugs and cardiovascular disease outcomes in patients with rheumatoid arthritis. PLoS One 2024; 19:e0305467. [PMID: 38875273 PMCID: PMC11178161 DOI: 10.1371/journal.pone.0305467] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/02/2023] [Accepted: 05/30/2024] [Indexed: 06/16/2024] Open
Abstract
BACKGROUND Emulation of the "target trial" (TT), a hypothetical pragmatic randomized controlled trial (RCT), using observational data can be used to mitigate issues commonly encountered in comparative effectiveness research (CER) when randomized trials are not logistically, ethically, or financially feasible. However, cardiovascular (CV) health research has been slow to adopt TT emulation. Here, we demonstrate the design and analysis of a TT emulation using electronic health records to study the comparative effectiveness of the addition of a disease-modifying anti-rheumatic drug (DMARD) to a regimen of methotrexate on CV events among rheumatoid arthritis (RA) patients. METHODS We used data from an electronic medical records-based cohort of RA patients from Northwestern Medicine to emulate the TT. Follow-up began 3 months after initial prescription of MTX (2000-2020) and included all available follow-up through June 30, 2020. Weighted pooled logistic regression was used to estimate differences in CVD risk and survival. Cloning was used to handle immortal time bias and weights to improve baseline and time-varying covariate imbalance. RESULTS We identified 659 eligible people with RA with average follow-up of 46 months and 31 MACE events. The month 24 adjusted risk difference for MACE comparing initiation vs non-initiation of a DMARD was -1.47% (95% confidence interval [CI]: -4.74, 1.95%), and the marginal hazard ratio (HR) was 0.72 (95% CI: 0.71, 1.23). In analyses subject to immortal time bias, the HR was 0.62 (95% CI: 0.29-1.44). CONCLUSION In this sample, we did not observe evidence of differences in risk of MACE, a finding that is compatible with previously published meta-analyses of RCTs. Thoughtful application of the TT framework provides opportunities to conduct CER in observational data. Benchmarking results of observational analyses to previously published RCTs can lend credibility to interpretation.
Affiliation(s)
- Adovich S Rivera
  - Institute for Public Health and Management, Northwestern University Feinberg School of Medicine, Chicago, Illinois, United States of America
  - Department of Research and Evaluation, Kaiser Permanente Southern California, Pasadena, California, United States of America
- Jacob B Pierce
  - Department of Medicine, Duke University School of Medicine, Durham, North Carolina, United States of America
- Arjun Sinha
  - Department of Medicine, Division of Cardiology, Northwestern University Feinberg School of Medicine, Chicago, Illinois, United States of America
- Anna E Pawlowski
  - Northwestern Medicine Enterprise Data Warehouse, Northwestern University, Chicago, Illinois, United States of America
- Donald M Lloyd-Jones
  - Department of Medicine, Division of Cardiology, Northwestern University Feinberg School of Medicine, Chicago, Illinois, United States of America
  - Department of Preventive Medicine, Division of Epidemiology, Northwestern University Feinberg School of Medicine, Chicago, Illinois, United States of America
- Yvonne C Lee
  - Department of Medicine, Division of Rheumatology, Feinberg School of Medicine, Northwestern University, Chicago, Illinois, United States of America
- Matthew J Feinstein
  - Department of Medicine, Division of Cardiology, Northwestern University Feinberg School of Medicine, Chicago, Illinois, United States of America
  - Department of Preventive Medicine, Division of Epidemiology, Northwestern University Feinberg School of Medicine, Chicago, Illinois, United States of America
- Lucia C Petito
  - Department of Preventive Medicine, Division of Biostatistics, Northwestern University Feinberg School of Medicine, Chicago, Illinois, United States of America

3. Petersen GL, Jørgensen TSH, Mathisen J, Osler M, Mortensen EL, Molbo D, Hougaard CØ, Lange T, Lund R. Inverse probability weighting for self-selection bias correction in the investigation of social inequality in mortality. Int J Epidemiol 2024; 53:dyae097. [PMID: 38996447 DOI: 10.1093/ije/dyae097] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2023] [Accepted: 07/04/2024] [Indexed: 07/14/2024] Open
Abstract
BACKGROUND Empirical evaluation of inverse probability weighting (IPW) for self-selection bias correction is inaccessible without the full source population. We aimed to: (i) investigate how self-selection biases frequency and association measures and (ii) assess self-selection bias correction using IPW in a cohort with register linkage. METHODS The source population included 17 936 individuals invited to the Copenhagen Aging and Midlife Biobank during 2009-11 (ages 49-63 years). Participants counted 7185 (40.1%). Register data were obtained for every invited person from 7 years before invitation to the end of 2020. The association between education and mortality was estimated using Cox regression models among participants, IPW participants and the source population. RESULTS Participants had higher socioeconomic position and fewer hospital contacts before baseline than the source population. Frequency measures of participants approached those of the source population after IPW. Compared with primary/lower secondary education, upper secondary, short tertiary, bachelor and master/doctoral were associated with reduced risk of death among participants (adjusted hazard ratio [95% CI]: 0.60 [0.46; 0.77], 0.68 [0.42; 1.11], 0.37 [0.25; 0.54], 0.28 [0.18; 0.46], respectively). IPW changed the estimates marginally (0.59 [0.45; 0.77], 0.57 [0.34; 0.93], 0.34 [0.23; 0.50], 0.24 [0.15; 0.39]) but not only towards those of the source population (0.57 [0.51; 0.64], 0.43 [0.32; 0.60], 0.38 [0.32; 0.47], 0.22 [0.16; 0.29]). CONCLUSIONS Frequency measures of study participants may not reflect the source population in the presence of self-selection, but the impact on association measures can be limited. IPW may be useful for (self-)selection bias correction, but the returned results can still reflect residual or other biases and random errors.
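As a rough illustration of the weighting idea in this abstract, the sketch below computes inverse probability of participation weights within strata and shows how they pull a participant frequency measure back toward the source population. It is a minimal sketch under stated assumptions, not the authors' procedure: the abstract does not say how participation probabilities were estimated, and the function name, the two education strata, and all counts are hypothetical.

```python
from collections import Counter


def participation_weights(strata, participated):
    """Return {stratum: 1 / P(participation | stratum)} from invitation records."""
    invited = Counter(strata)
    took_part = Counter(s for s, p in zip(strata, participated) if p)
    # Strata with no participants cannot be reweighted and are left out.
    return {s: invited[s] / took_part[s] for s in took_part}


# Hypothetical source population: 6 invitees with low education, 4 with high.
strata = ["low"] * 6 + ["high"] * 4
# Hypothetical participation: 2 of 6 "low" and 4 of 4 "high" take part.
participated = [True, True, False, False, False, False] + [True] * 4

weights = participation_weights(strata, participated)

# Share of low education among participants, before and after weighting.
sample = [s for s, p in zip(strata, participated) if p]
sample_weights = [weights[s] for s in sample]
unweighted_low_share = sample.count("low") / len(sample)
weighted_low_share = (
    sum(w for s, w in zip(sample, sample_weights) if s == "low")
    / sum(sample_weights)
)
```

In this toy setup the weighted share of low education equals the share in the invited population, which mirrors the abstract's observation that frequency measures approached those of the source population after IPW. It says nothing about association measures, which the abstract reports were only marginally changed.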
Affiliation(s)
- Gitte Lindved Petersen
  - Section of Social Medicine, Department of Public Health, University of Copenhagen, Copenhagen, Denmark
  - Department of Translational Type 1 Diabetes Research, Steno Diabetes Center Copenhagen, Herlev, Denmark
  - Section of Epidemiology, Department of Public Health, University of Copenhagen, Copenhagen, Denmark
- Terese Sara Høj Jørgensen
  - Section of Social Medicine, Department of Public Health, University of Copenhagen, Copenhagen, Denmark
- Jimmi Mathisen
  - Section of Epidemiology, Department of Public Health, University of Copenhagen, Copenhagen, Denmark
- Merete Osler
  - Section of Epidemiology, Department of Public Health, University of Copenhagen, Copenhagen, Denmark
  - Center for Clinical Research and Prevention, Bispebjerg & Frederiksberg Hospitals, Copenhagen, Denmark
- Erik Lykke Mortensen
  - Center for Healthy Aging, University of Copenhagen, Copenhagen, Denmark
  - Unit of Medical Psychology, Section of Environmental Health, Department of Public Health, University of Copenhagen, Copenhagen, Denmark
- Drude Molbo
  - Section of Social Medicine, Department of Public Health, University of Copenhagen, Copenhagen, Denmark
- Charlotte Ørsted Hougaard
  - Section of Social Medicine, Department of Public Health, University of Copenhagen, Copenhagen, Denmark
- Theis Lange
  - Section of Biostatistics, Department of Public Health, University of Copenhagen, Copenhagen, Denmark
- Rikke Lund
  - Section of Social Medicine, Department of Public Health, University of Copenhagen, Copenhagen, Denmark
  - Center for Healthy Aging, University of Copenhagen, Copenhagen, Denmark

4. Norris T, Mitchell JJ, Blodgett JM, Hamer M, Pinto Pereira SM. Does cardiorespiratory fitness mediate or moderate the association between mid-life physical activity frequency and cognitive function? Findings from the 1958 British birth cohort study. PLoS One 2024; 19:e0295092. [PMID: 38848437 PMCID: PMC11161044 DOI: 10.1371/journal.pone.0295092] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/17/2023] [Accepted: 05/23/2024] [Indexed: 06/09/2024] Open
Abstract
BACKGROUND Physical activity (PA) is associated with a lower risk of cognitive decline and all-cause dementia in later life. Pathways underpinning this association are unclear but may involve either mediation and/or moderation by cardiorespiratory fitness (CRF). METHODS Data on PA frequency (exposure) at 42y, non-exercise testing CRF (NETCRF, mediator/moderator) at 45y and overall cognitive function (outcome) at 50y were obtained from 9,385 participants (50.8% female) in the 1958 British birth cohort study. We used a four-way decomposition approach to examine the relative contributions of mediation and moderation by NETCRF on the association between PA frequency at 42y and overall cognitive function at 50y. RESULTS In males, the estimated overall effect of 42y PA ≥once per week (vs. CONCLUSION We present the first evidence from a four-way decomposition analysis of the potential contribution that CRF plays in the relationship between mid-life PA frequency and subsequent cognitive function. Our lack of evidence in support of CRF mediating or moderating the PA frequency-cognitive function association suggests that other pathways underpin this association.
Affiliation(s)
- Tom Norris
  - Faculty of Medical Sciences, Institute of Sport, Division of Surgery and Interventional Science, Exercise and Health, UCL, London, United Kingdom
- John J. Mitchell
  - Faculty of Medical Sciences, Institute of Sport, Division of Surgery and Interventional Science, Exercise and Health, UCL, London, United Kingdom
- Joanna M. Blodgett
  - Faculty of Medical Sciences, Institute of Sport, Division of Surgery and Interventional Science, Exercise and Health, UCL, London, United Kingdom
- Mark Hamer
  - Faculty of Medical Sciences, Institute of Sport, Division of Surgery and Interventional Science, Exercise and Health, UCL, London, United Kingdom
- Snehal M. Pinto Pereira
  - Faculty of Medical Sciences, Institute of Sport, Division of Surgery and Interventional Science, Exercise and Health, UCL, London, United Kingdom

5. Bai Y, Shi X, Du J. A computable biomedical knowledge system: Toward rapidly building candidate-directed acyclic graphs. J Evid Based Med 2024; 17:307-316. [PMID: 38556728 DOI: 10.1111/jebm.12602] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/10/2024] [Accepted: 03/17/2024] [Indexed: 04/02/2024]
Abstract
AIM It is essential for health researchers to have a systematic understanding of third-party variables that influence both the exposure and outcome under investigation, as shown by a directed acyclic graph (DAG). The traditional construction of DAGs through literature review and expert knowledge is often insufficiently systematic and consistent, leading to potential biases. We introduce an automated approach to building a network linking variables of interest. METHODS Large-scale text mining of the medical literature was used to construct a conceptual network based on the Semantic MEDLINE Database (SemMedDB). SemMedDB is a PubMed-scale repository of triples in the "concept-relation-concept" format. Relations between concepts are categorized as Excitatory, Inhibitory, or General. RESULTS To facilitate the use of large-scale triple sets in SemMedDB, we have developed a computable biomedical knowledge (CBK) system (https://cbk.bjmu.edu.cn/), a website that enables direct retrieval of related publications and their corresponding triples without the necessity of writing SQL statements. Three case studies were elaborated to demonstrate the applications of the CBK system. CONCLUSIONS The CBK system is openly available and user-friendly for rapidly capturing a set of influencing factors for a phenotype and building candidate DAGs between exposure-outcome variables. It could be a valuable tool to reduce the time spent exploring relationships between variables and constructing a DAG. A reliable and standardized DAG could significantly improve the design and interpretation of observational health research.
Affiliation(s)
- Yongmei Bai
  - Institute of Medical Technology, Peking University Health Science Center, Beijing, China
  - National Institute of Health Data Science, Peking University, Beijing, China
- Xuanyu Shi
  - Institute of Medical Technology, Peking University Health Science Center, Beijing, China
  - National Institute of Health Data Science, Peking University, Beijing, China
- Jian Du
  - Institute of Medical Technology, Peking University Health Science Center, Beijing, China
  - National Institute of Health Data Science, Peking University, Beijing, China

6. Yoon J, Kim JH, Chung Y, Park J, Leigh JH, Kim SS. Change in employment status and its causal effect on suicidal ideation and depressive symptoms: A marginal structural model with machine learning algorithms. Scand J Work Environ Health 2024; 50:218-227. [PMID: 38466615 PMCID: PMC11106614 DOI: 10.5271/sjweh.4150] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/04/2023] [Indexed: 03/13/2024] Open
Abstract
OBJECTIVE This study aimed to assess the causal effect of a change in employment status on suicidal ideation and depressive symptoms by applying marginal structural models (MSM) with machine-learning (ML) algorithms. METHODS We analyzed data from the 8th-15th waves (2013-2020) of the Korean Welfare Panel Study, a nationally representative longitudinal dataset. Our analysis included 13 294 observations from 3621 participants who had standard employment at baseline (2013-2019). Based on employment status in the follow-up year (2014-2020), respondents were classified into two groups: (i) maintained standard employment (reference group), (ii) changed to non-standard employment. Suicidal ideation during the past year and depressive symptoms during the past week were assessed through a self-report questionnaire. To apply the ML algorithms to the MSM, we fitted eight ML algorithms to estimate the propensity score for a change in employment status. Then, we applied the MSM to examine the causal effect by using inverse probability weights calculated from the ML-based propensity score. RESULTS The random forest algorithm performed best among all algorithms, with the highest area under the curve (0.702; 95% confidence interval [CI] 0.686-0.718). In the MSM with the random forest algorithm, workers who changed from standard to non-standard employment were 2.07 times more likely to report suicidal ideation compared to those who maintained standard employment (95% CI 1.16-3.70). A similar trend was observed in the analysis of depressive symptoms. CONCLUSIONS This study found that a change in employment status could lead to a higher risk of suicidal ideation and depressive symptoms.
Affiliation(s)
- Seung-Sup Kim
  - Department of Environmental Health Sciences, Seoul National University, Room 718, Bldg 220, Gwanak-ro 1, Seoul 08826, Republic of Korea

7. Kuwornu JP, Maldonado F, Groot G, Cooper EJ, Penz E, Sommer L, Reid A, Marciniuk DD. An economic evaluation of chronic obstructive pulmonary disease clinical pathway in Saskatchewan, Canada: Data-driven techniques to identify cost-effectiveness among patient subgroups. PLoS One 2024; 19:e0301334. [PMID: 38557914 PMCID: PMC10984414 DOI: 10.1371/journal.pone.0301334] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/24/2023] [Accepted: 03/12/2024] [Indexed: 04/04/2024] Open
Abstract
BACKGROUND Saskatchewan has implemented care pathways for several common health conditions. To date, there has not been any cost-effectiveness evaluation of care pathways in the province. The objective of this study was to evaluate the real-world cost-effectiveness of a chronic obstructive pulmonary disease (COPD) care pathway program in Saskatchewan. METHODS Using patient-level administrative health data, we identified adults (35+ years) with COPD diagnosis recruited into the care pathway program in Regina between April 1, 2018, and March 31, 2019 (N = 759). The control group comprised adults (35+ years) with COPD who lived in Saskatoon during the same period (N = 759). The control group was matched to the intervention group using propensity scores. Costs were calculated at the patient level. The outcome measure was the number of days patients remained without experiencing COPD exacerbation within 1-year follow-up. Both manual and data-driven policy learning approaches were used to assess heterogeneity in the cost-effectiveness by patient demographic and disease characteristics. Bootstrapping was used to quantify uncertainty in the results. RESULTS In the overall sample, the estimates indicate that the COPD care pathway was not cost-effective using the willingness to pay (WTP) threshold values in the range of $1,000 and $5,000/exacerbation day averted. The manual subgroup analyses show the COPD care pathway was dominant among patients with comorbidities and among patients aged 65 years or younger at the WTP threshold of $2000/exacerbation day averted. Although similar profiles as those identified in the manual subgroup analyses were confirmed, the data-driven policy learning approach suggests more nuanced demographic and disease profiles that the care pathway would be most appropriate for. 
CONCLUSIONS Both manual subgroup analysis and data-driven policy learning approach showed that the COPD care pathway consistently produced cost savings and better health outcomes among patients with comorbidities or among those relatively younger. The care pathway was not cost-effective in the entire sample.
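The terms "cost-effective at a willingness-to-pay threshold" and "dominant" used in this abstract can be made concrete with the standard incremental net monetary benefit calculation. The sketch below is illustrative only: the function names and every number are hypothetical, none are taken from the study, and the abstract does not state which summary measure the authors computed.

```python
def incremental_net_benefit(delta_effect, delta_cost, wtp):
    """Incremental net monetary benefit: WTP times extra effect, minus extra cost."""
    return wtp * delta_effect - delta_cost


def is_dominant(delta_effect, delta_cost):
    """An intervention is dominant when it is both more effective and cheaper."""
    return delta_effect > 0 and delta_cost < 0


# Hypothetical overall sample: 0.5 extra exacerbation-free days at $3,000 extra cost.
overall_at_1000 = incremental_net_benefit(0.5, 3000.0, 1000.0)
overall_at_5000 = incremental_net_benefit(0.5, 3000.0, 5000.0)

# Hypothetical subgroup: 3 extra exacerbation-free days and $400 saved per patient.
subgroup_dominant = is_dominant(3.0, -400.0)
subgroup_at_2000 = incremental_net_benefit(3.0, -400.0, 2000.0)
```

With these made-up inputs the overall net benefit is negative at both thresholds (not cost-effective), while the subgroup is dominant, so its net benefit is positive at any non-negative threshold.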
Affiliation(s)
- John Paul Kuwornu
  - Australian Centre for Health Services Innovation and Centre for Healthcare Transformation, Faculty of Health, School of Public Health and Social Work, Queensland University of Technology, Brisbane, Queensland, Australia
- Gary Groot
  - Community Health and Epidemiology, University of Saskatchewan, Saskatoon, Saskatchewan, Canada
- Elizabeth J. Cooper
  - Kinesiology and Health Studies, University of Regina, Regina, Saskatchewan, Canada
- Erika Penz
  - Respirology, Critical Care & Sleep Medicine, The Respiratory Research Centre, University of Saskatchewan, Saskatoon, Saskatchewan, Canada
- Leland Sommer
  - Stewardship and Clinical Appropriateness, Saskatchewan Health Authority, Regina, Saskatchewan, Canada
- Amy Reid
  - Clinical Integration Unit, Saskatchewan Health Authority, Regina, Saskatchewan, Canada
- Darcy D. Marciniuk
  - Respirology, Critical Care & Sleep Medicine, The Respiratory Research Centre, University of Saskatchewan, Saskatoon, Saskatchewan, Canada

8. Post RAJ, Petkovic M, van den Heuvel IL, van den Heuvel ER. Flexible Machine Learning Estimation of Conditional Average Treatment Effects: A Blessing and a Curse. Epidemiology 2024; 35:32-40. [PMID: 37889951 DOI: 10.1097/ede.0000000000001684] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/29/2023]
Abstract
Causal inference from observational data requires untestable identification assumptions. If these assumptions apply, machine learning methods can be used to study complex forms of causal effect heterogeneity. Recently, several machine learning methods were developed to estimate the conditional average treatment effect (ATE). If the features at hand cannot explain all heterogeneity, the individual treatment effects can seriously deviate from the conditional ATE. In this work, we demonstrate how the distributions of the individual treatment effect and the conditional ATE can differ when a causal random forest is applied. We extend the causal random forest to estimate the difference in conditional variance between treated and controls. If the distribution of the individual treatment effect equals that of the conditional ATE, this estimated difference in variance should be small. If they differ, an additional causal assumption is necessary to quantify the heterogeneity not captured by the distribution of the conditional ATE. The conditional variance of the individual treatment effect can be identified when the individual effect is independent of the outcome under no treatment given the measured features. Then, in the cases where the individual treatment effect and conditional ATE distributions differ, the extended causal random forest can appropriately estimate the variance of the individual treatment effect distribution, whereas the causal random forest fails to do so.
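The identification argument in this abstract can be written out in a few lines. The notation is an assumption introduced here for illustration and does not come from the abstract: $Y^1$ and $Y^0$ are the potential outcomes, $A$ is the treatment indicator, $X$ the measured features, and $\tau = Y^1 - Y^0$ the individual treatment effect. The last step additionally assumes the usual consistency and conditional exchangeability conditions.

```latex
\begin{align*}
\operatorname{Var}(Y^1 \mid X)
  &= \operatorname{Var}(Y^0 + \tau \mid X) \\
  &= \operatorname{Var}(Y^0 \mid X) + \operatorname{Var}(\tau \mid X)
     + 2\operatorname{Cov}(Y^0, \tau \mid X). \\
\intertext{If $\tau$ is independent of $Y^0$ given $X$, the covariance term vanishes, so}
\operatorname{Var}(\tau \mid X)
  &= \operatorname{Var}(Y^1 \mid X) - \operatorname{Var}(Y^0 \mid X) \\
  &= \operatorname{Var}(Y \mid A = 1, X) - \operatorname{Var}(Y \mid A = 0, X).
\end{align*}
```

Read this way, a difference in conditional variance between treated and controls close to zero is what one would expect when the conditional ATE, $\mathbb{E}[\tau \mid X]$, already captures the effect heterogeneity, which matches the abstract's description.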
Affiliation(s)
- Richard A J Post
  - Department of Mathematics and Computer Science, Eindhoven University of Technology, the Netherlands
- Marko Petkovic
  - Department of Mathematics and Computer Science, Eindhoven University of Technology, the Netherlands
- Isabel L van den Heuvel
  - Department of Mathematics and Computer Science, Eindhoven University of Technology, the Netherlands
- Edwin R van den Heuvel
  - Department of Mathematics and Computer Science, Eindhoven University of Technology, the Netherlands
  - Department of Preventive Medicine and Epidemiology, School of Medicine, Boston University, Boston, MA

9. Kaas-Hansen BS, Granholm A, Sivapalan P, Anthon CT, Schjørring OL, Maagaard M, Kjaer MBN, Mølgaard J, Ellekjaer KL, Fagerberg SK, Lange T, Møller MH, Perner A. Real-world causal evidence for planned predictive enrichment in critical care trials: A scoping review. Acta Anaesthesiol Scand 2024; 68:16-25. [PMID: 37649412 DOI: 10.1111/aas.14321] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/04/2023] [Revised: 08/01/2023] [Accepted: 08/12/2023] [Indexed: 09/01/2023]
Abstract
BACKGROUND Randomised clinical trials in critical care are prone to inconclusiveness due, in part, to undue optimism about effect sizes and suboptimal accounting for heterogeneous treatment effects. Although causal evidence from rich real-world critical care can help overcome these challenges by informing predictive enrichment, no overview exists. METHODS We conducted a scoping review, systematically searching 10 general and speciality journals for reports published on or after 1 January 2018, of randomised clinical trials enrolling adult critically ill patients. We collected trial metadata on 22 variables including recruitment period, intervention type and early stopping (including reasons) as well as data on the use of causal evidence from secondary data for planned predictive enrichment. RESULTS We screened 9020 records and included 316 unique RCTs with a total of 268,563 randomised participants. One hundred seventy-three (55%) trials tested drug interventions, 101 (32%) management strategies and 42 (13%) devices. The median duration of enrolment was 2.2 (IQR: 1.3-3.4) years, and 83% of trials randomised less than 1000 participants. Thirty-six trials (11%) were restricted to COVID-19 patients. Of the 55 (17%) trials that stopped early, 23 (42%) used predefined rules; futility, slow enrolment and safety concerns were the commonest stopping reasons. None of the included RCTs had used causal evidence from secondary data for planned predictive enrichment. CONCLUSION Work is needed to harness the rich multiverse of critical care data and establish its utility in critical care RCTs. Such work will likely need to leverage methodology from interventional and analytical epidemiology as well as data science.
Affiliation(s)
- Benjamin Skov Kaas-Hansen
  - Department of Intensive Care, Copenhagen University Hospital, Rigshospitalet, Copenhagen, Denmark
  - Section of Biostatistics, Department of Public Health, University of Copenhagen, Copenhagen, Denmark
- Anders Granholm
  - Department of Intensive Care, Copenhagen University Hospital, Rigshospitalet, Copenhagen, Denmark
- Praleene Sivapalan
  - Department of Intensive Care, Copenhagen University Hospital, Rigshospitalet, Copenhagen, Denmark
- Carl Thomas Anthon
  - Department of Intensive Care, Copenhagen University Hospital, Rigshospitalet, Copenhagen, Denmark
- Olav Lilleholt Schjørring
  - Department of Anaesthesia and Intensive Care, Aalborg University Hospital, Aalborg, Denmark
  - Department of Clinical Medicine, Aalborg University, Aalborg, Denmark
- Mathias Maagaard
  - Centre for Anaesthesiological Research, Department of Anaesthesiology, Zealand University Hospital, Køge, Denmark
- Jesper Mølgaard
  - Department of Anesthesiology, Centre for Cancer and Organ Dysfunction, Copenhagen University Hospital, Rigshospitalet, Copenhagen, Denmark
- Karen Louise Ellekjaer
  - Department of Intensive Care, Copenhagen University Hospital, Rigshospitalet, Copenhagen, Denmark
- Steen Kåre Fagerberg
  - Department of Anaesthesia and Intensive Care, Aalborg University Hospital, Aalborg, Denmark
- Theis Lange
  - Section of Biostatistics, Department of Public Health, University of Copenhagen, Copenhagen, Denmark
- Morten Hylander Møller
  - Department of Intensive Care, Copenhagen University Hospital, Rigshospitalet, Copenhagen, Denmark
- Anders Perner
  - Department of Intensive Care, Copenhagen University Hospital, Rigshospitalet, Copenhagen, Denmark

10. Newby D, Orgeta V, Marshall CR, Lourida I, Albertyn CP, Tamburin S, Raymont V, Veldsman M, Koychev I, Bauermeister S, Weisman D, Foote IF, Bucholc M, Leist AK, Tang EYH, Tai XY, Llewellyn DJ, Ranson JM. Artificial intelligence for dementia prevention. Alzheimers Dement 2023; 19:5952-5969. [PMID: 37837420 PMCID: PMC10843720 DOI: 10.1002/alz.13463] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/12/2023] [Revised: 08/01/2023] [Accepted: 08/07/2023] [Indexed: 10/16/2023]
Abstract
INTRODUCTION A wide range of modifiable risk factors for dementia have been identified. Considerable debate remains about these risk factors, possible interactions between them or with genetic risk, and causality, and how they can help in clinical trial recruitment and drug development. Artificial intelligence (AI) and machine learning (ML) may refine understanding. METHODS ML approaches are being developed in dementia prevention. We discuss exemplar uses and evaluate the current applications and limitations in the dementia prevention field. RESULTS Risk-profiling tools may help identify high-risk populations for clinical trials; however, their performance needs improvement. New risk-profiling and trial-recruitment tools underpinned by ML models may be effective in reducing costs and improving future trials. ML can inform drug-repurposing efforts and prioritization of disease-modifying therapeutics. DISCUSSION ML is not yet widely used but has considerable potential to enhance precision in dementia prevention. HIGHLIGHTS Artificial intelligence (AI) is not widely used in the dementia prevention field. Risk-profiling tools are not used in clinical practice. Causal insights are needed to understand risk factors over the lifespan. AI will help personalize risk-management tools for dementia prevention. AI could target specific patient groups that will benefit most for clinical trials.
Affiliation(s)
- Danielle Newby
- University of Oxford, Department of Psychiatry, Warneford Hospital, Oxford, OX3 7JX, UK
- Vasiliki Orgeta
- Division of Psychiatry, University College London, London, W1T 7BN, UK
- Charles R Marshall
- Preventive Neurology Unit, Wolfson Institute of Population Health, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, E1 4NS, UK
- Department of Neurology, Royal London Hospital, London, E1 1BB, UK
- Ilianna Lourida
- Population Health Sciences Institute, Newcastle University, Newcastle, NE2 4AX, UK
- University of Exeter Medical School, Exeter, EX1 2HZ, UK
- Christopher P Albertyn
- Department of Old Age Psychiatry, Institute of Psychiatry, Psychology and Neuroscience, King’s College London, London, SE5 8AF, UK
- Stefano Tamburin
- Department of Neurosciences, Biomedicine and Movement Sciences, University of Verona, Verona, 37129, Italy
- Vanessa Raymont
- University of Oxford, Department of Psychiatry, Warneford Hospital, Oxford, OX3 7JX, UK
- Michele Veldsman
- Wellcome Centre for Integrative Neuroimaging, University of Oxford, Oxford, OX3 9DU, UK
- Department of Experimental Psychology, University of Oxford, Oxford, OX2 6GG, UK
- Ivan Koychev
- University of Oxford, Department of Psychiatry, Warneford Hospital, Oxford, OX3 7JX, UK
- Sarah Bauermeister
- University of Oxford, Department of Psychiatry, Warneford Hospital, Oxford, OX3 7JX, UK
- David Weisman
- Abington Neurological Associates, Abington, PA 19001, USA
- Isabelle F Foote
- Preventive Neurology Unit, Wolfson Institute of Population Health, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, E1 4NS, UK
- Institute for Behavioral Genetics, University of Colorado Boulder, Boulder, CO 80309, USA
- Magda Bucholc
- Cognitive Analytics Research Lab, School of Computing, Engineering & Intelligent Systems, Ulster University, Derry, BT48 7JL, UK
- Anja K Leist
- Institute for Research on Socio-Economic Inequality (IRSEI), Department of Social Sciences, University of Luxembourg, L-4365, Luxembourg
- Eugene Y H Tang
- Population Health Sciences Institute, Newcastle University, Newcastle, NE2 4AX, UK
- Xin You Tai
- Nuffield Department of Clinical Neuroscience, University of Oxford, Oxford, OX3 9DU, UK
- Division of Clinical Neurology, John Radcliffe Hospital, Oxford University Hospitals Trust, Oxford, OX3 9DU, UK
- David J. Llewellyn
- University of Exeter Medical School, Exeter, EX1 2HZ, UK
- The Alan Turing Institute, London, NW1 2DB, UK
|
11
|
Hayward S, Parmesar K, Saleem MA. What is circulating factor disease and how is it currently explained? Pediatr Nephrol 2023; 38:3513-3518. [PMID: 36952039 PMCID: PMC10514121 DOI: 10.1007/s00467-023-05928-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/13/2023] [Revised: 02/15/2023] [Accepted: 02/20/2023] [Indexed: 03/24/2023]
Abstract
Nephrotic syndrome (NS) consists of the clinical triad of hypoalbuminaemia, high levels of proteinuria and oedema, and describes a heterogeneous group of disease processes with different underlying drivers. The existence of circulating factor disease (CFD) as a driver of NS has been epitomised by a subset of patients who exhibit disease recurrence after transplantation, alongside laboratory work. Several circulating factors have been proposed and studied, broadly grouped into protease components such as soluble urokinase-type plasminogen activator (suPAR), hemopexin (Hx) and calcium/calmodulin-serine protease kinase (CASK), and other circulating proteases, and immune components such as TNF-α, CD40 and cardiotrophin-like cytokine-1 (CLC-1). While currently there is no definitive way of assessing risk of CFD pre-transplantation, promising work is emerging through the study of 'multi-omic' bioinformatic data from large national cohorts and biobanks.
Affiliation(s)
- Samantha Hayward
- Translational Health Sciences, Bristol Medical School, University of Bristol, Bristol, UK.
- MRC Integrative Epidemiology Unit, Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, UK.
- Kevon Parmesar
- Translational Health Sciences, Bristol Medical School, University of Bristol, Bristol, UK
- Moin A Saleem
- Translational Health Sciences, Bristol Medical School, University of Bristol, Bristol, UK
|
12
|
Affiliation(s)
- David J Hunter
- From the Nuffield Department of Population Health (D.J.H.) and the Department of Statistics and Nuffield Department of Medicine (C.H.), University of Oxford, Oxford, and the Alan Turing Institute, London (C.H.) - both in the United Kingdom
- Christopher Holmes
- From the Nuffield Department of Population Health (D.J.H.) and the Department of Statistics and Nuffield Department of Medicine (C.H.), University of Oxford, Oxford, and the Alan Turing Institute, London (C.H.) - both in the United Kingdom
|
13
|
de Jong VMT, Hoogland J, Moons KGM, Riley RD, Nguyen TL, Debray TPA. Propensity-based standardization to enhance the validation and interpretation of prediction model discrimination for a target population. Stat Med 2023; 42:3508-3528. [PMID: 37311563 DOI: 10.1002/sim.9817] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/06/2021] [Revised: 02/26/2023] [Accepted: 05/19/2023] [Indexed: 06/15/2023]
Abstract
External validation of the discriminative ability of prediction models is of key importance. However, the interpretation of such evaluations is challenging, as the ability to discriminate depends on both the sample characteristics (ie, case-mix) and the generalizability of predictor coefficients, but most discrimination indices do not provide any insight into their respective contributions. To disentangle differences in discriminative ability across external validation samples due to a lack of model generalizability from differences in sample characteristics, we propose propensity-weighted measures of discrimination. These weighted metrics, which are derived from propensity scores for sample membership, are standardized for case-mix differences between the model development and validation samples, allowing for a fair comparison of discriminative ability in terms of model characteristics in a target population of interest. We illustrate our methods with the validation of eight prediction models for deep vein thrombosis in 12 external validation data sets and assess our methods in a simulation study. In the illustrative example, propensity score standardization reduced between-study heterogeneity of discrimination, indicating that between-study variability was partially attributable to case-mix. The simulation study showed that only flexible propensity-score methods (allowing for non-linear effects) produced unbiased estimates of model discrimination in the target population, and only when the positivity assumption was met. Propensity score-based standardization may facilitate the interpretation of (heterogeneity in) discriminative ability of a prediction model as observed across multiple studies, and may guide model updating strategies for a particular target population. Careful propensity score modeling with attention for non-linear relations is recommended.
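The idea of a propensity-weighted discrimination measure can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the choice of odds weights, and the toy numbers are all assumptions made for illustration. Each validation-sample individual is assumed to carry a weight equal to the odds of belonging to the target (development) sample, and every case/non-case pair contributes to the c-statistic in proportion to the product of the two weights.

```python
def membership_odds_weights(p_target):
    """Turn propensity scores for membership of the target sample into odds weights.

    Assumption for this sketch: p_target[i] is the estimated probability that
    individual i belongs to the target sample, with 0 < p < 1 (positivity).
    """
    return [p / (1.0 - p) for p in p_target]


def weighted_auc(scores, outcomes, weights):
    """Weighted c-statistic: each case/non-case pair counts with weight w_i * w_j."""
    concordant = 0.0
    total = 0.0
    for s_i, y_i, w_i in zip(scores, outcomes, weights):
        if y_i != 1:
            continue  # outer loop runs over cases only
        for s_j, y_j, w_j in zip(scores, outcomes, weights):
            if y_j != 0:
                continue  # inner loop runs over non-cases only
            pair_weight = w_i * w_j
            total += pair_weight
            if s_i > s_j:
                concordant += pair_weight
            elif s_i == s_j:
                concordant += 0.5 * pair_weight  # ties count as half
    if total == 0:
        raise ValueError("need at least one case and one non-case with positive weight")
    return concordant / total


# Toy validation sample (hypothetical numbers).
scores = [0.9, 0.8, 0.3, 0.2]     # predicted risks from the model
outcomes = [1, 0, 1, 0]           # observed outcomes
p_target = [0.5, 0.5, 0.2, 0.5]   # hypothetical membership propensities

unweighted = weighted_auc(scores, outcomes, [1.0] * len(scores))
standardized = weighted_auc(scores, outcomes, membership_odds_weights(p_target))
print(round(unweighted, 2), round(standardized, 2))
```

In this toy sample the poorly ranked case is unlikely to occur in the target population, so down-weighting it raises the standardized estimate relative to the unweighted one. In practice the propensity model itself would need the flexible, non-linear specification the abstract recommends.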
Affiliation(s)
- Valentijn M T de Jong
- Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
- Data Analytics and Methods Task Force, European Medicines Agency, Amsterdam, The Netherlands
- Jeroen Hoogland
- Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
- Department of Epidemiology and Data Science, Amsterdam University Medical Centers, Amsterdam, The Netherlands
- Karel G M Moons
- Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
- Richard D Riley
- Institute of Applied Health Research, College of Medical and Dental Sciences, University of Birmingham, Birmingham, UK
- Tri-Long Nguyen
- Section of Epidemiology, Department of Public Health, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Thomas P A Debray
- Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
- Smart Data Analysis and Statistics, Utrecht, The Netherlands
|
14
|
Souza MCO, Cruz JC, Rocha BA, Maria Oliveira Souza J, Devóz PP, Santana A, Campíglia AD, Barbosa F. The influence of the co-exposure to polycyclic aromatic hydrocarbons and toxic metals on DNA damage in Brazilian lactating women and their infants: A cross-sectional study using machine learning approaches. Chemosphere 2023; 334:138975. [PMID: 37224977 DOI: 10.1016/j.chemosphere.2023.138975] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2022] [Revised: 03/29/2023] [Accepted: 05/16/2023] [Indexed: 05/26/2023]
Abstract
Polycyclic aromatic hydrocarbons (PAHs) and toxic metals are widespread pollutants of public health concern. Co-contamination of the environment with these chemicals is frequent, but relatively little is known about their combined toxicities. In this context, this study aimed to evaluate the influence of co-exposure to PAHs and toxic metals on DNA damage in Brazilian lactating women and their infants using machine learning approaches. Data were collected from an observational, cross-sectional study with 96 lactating women and 96 infants living in two cities. Exposure to these pollutants was estimated by determining urinary levels of seven mono-hydroxylated PAH metabolites (OH-PAHs) and the free form of three toxic metals. Urinary 8-hydroxydeoxyguanosine (8-OHdG) levels were used as the oxidative stress biomarker and set as the outcome. Individual sociodemographic factors were also collected using questionnaires. Sixteen machine learning algorithms were trained using 10-fold cross-validation to investigate the associations of urinary OH-PAHs and metals with 8-OHdG levels. This approach was also compared with models obtained by multiple linear regression. The results showed that the urinary concentration of OH-PAHs was highly correlated between the mothers and their infants. Multiple linear regression did not show a statistically significant association between the contaminants and urinary 8-OHdG levels. Machine learning models indicated that none of the investigated variables had predictive value for 8-OHdG concentrations. In conclusion, PAHs and toxic metals were not associated with 8-OHdG levels in Brazilian lactating women and their infants. These novel results were obtained even after applying sophisticated statistical models to capture non-linear relationships. However, these findings should be interpreted cautiously because exposure to the studied contaminants was considerably low, which may not reflect other populations at risk.
Affiliation(s)
- Marília Cristina Oliveira Souza
- ASTox Lab - Analytical and System Toxicology Laboratory, Department of Clinical Analyses, Toxicology and Food Sciences, School of Pharmaceutical Sciences of Ribeirão Preto, University of São Paulo, Avenida Do Café S/n, 14040-903, Ribeirão Preto, São Paulo, Brazil.
- Jonas Carneiro Cruz
- ASTox Lab - Analytical and System Toxicology Laboratory, Department of Clinical Analyses, Toxicology and Food Sciences, School of Pharmaceutical Sciences of Ribeirão Preto, University of São Paulo, Avenida Do Café S/n, 14040-903, Ribeirão Preto, São Paulo, Brazil
- Bruno Alves Rocha
- ASTox Lab - Analytical and System Toxicology Laboratory, Department of Clinical Analyses, Toxicology and Food Sciences, School of Pharmaceutical Sciences of Ribeirão Preto, University of São Paulo, Avenida Do Café S/n, 14040-903, Ribeirão Preto, São Paulo, Brazil
- Juliana Maria Oliveira Souza
- Department of Biochemistry, Biological Sciences Institute, University of Juiz de Fora, Campus Universitário, Rua José Lourenço Kelmer, S/n - São Pedro, Juiz de Fora, MG, 36036-900, Brazil
- Paula Pícoli Devóz
- ASTox Lab - Analytical and System Toxicology Laboratory, Department of Clinical Analyses, Toxicology and Food Sciences, School of Pharmaceutical Sciences of Ribeirão Preto, University of São Paulo, Avenida Do Café S/n, 14040-903, Ribeirão Preto, São Paulo, Brazil
- Anthony Santana
- Department of Chemistry, University of Central Florida, Orlando, FL, 32816, USA
- Fernando Barbosa
- ASTox Lab - Analytical and System Toxicology Laboratory, Department of Clinical Analyses, Toxicology and Food Sciences, School of Pharmaceutical Sciences of Ribeirão Preto, University of São Paulo, Avenida Do Café S/n, 14040-903, Ribeirão Preto, São Paulo, Brazil
|
15
|
Rivera AS, Beach LB. Unaddressed Sources of Bias Lead to Biased Conclusions About Sexual Orientation Change Efforts and Suicidality in Sexual Minority Individuals. Archives of Sexual Behavior 2023; 52:875-879. [PMID: 36472764 DOI: 10.1007/s10508-022-02498-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/06/2022] [Revised: 11/21/2022] [Accepted: 11/22/2022] [Indexed: 05/11/2023]
Affiliation(s)
- Adovich S Rivera
- Kaiser Permanente Southern California Department of Research and Evaluation, 100 S Los Robles, Pasadena, CA, 91101, USA.
- Lauren B Beach
- Institute of Sexual and Gender Minority Health, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
- Department of Medical Social Sciences, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
|
16
|
Roster KO, Martinelli T, Connaughton C, Santillana M, Rodrigues FA. Estimating the impact of the COVID-19 pandemic on dengue in Brazil. Research Square 2023:rs.3.rs-2548491. [PMID: 36798282 PMCID: PMC9934738 DOI: 10.21203/rs.3.rs-2548491/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/11/2023]
Abstract
Atypical dengue prevalence was observed in 2020 in many dengue-endemic countries, including Brazil. Evidence suggests that the pandemic disrupted not only dengue dynamics due to changes in mobility patterns, but also several aspects of dengue surveillance, such as care seeking behavior, care availability, and monitoring systems. However, we lack a clear understanding of the overall impact on dengue in different parts of the country as well as the role of individual causal drivers. In this study, we estimated the gap between expected and observed dengue cases in 2020 using an interrupted time series design with forecasts from a neural network and a structural Bayesian time series model. We also decomposed the gap into the impacts of climate conditions, pandemic-induced changes in reporting, human susceptibility, and human mobility. We find that there is considerable variation across the country in both overall pandemic impact on dengue and the relative importance of individual drivers. Increased understanding of the causal mechanisms driving the 2020 dengue season helps mitigate some of the data gaps caused by the COVID-19 pandemic and is critical to developing effective public health interventions to control dengue in the future.
Affiliation(s)
- K. O. Roster
- Institute of Mathematics and Computer Science, University of São Paulo, São Carlos, SP, Brazil
- T. Martinelli
- Institute of Mathematics and Computer Science, University of São Paulo, São Carlos, SP, Brazil
- C. Connaughton
- Mathematics Institute, University of Warwick, Coventry, United Kingdom
- London Mathematical Laboratory, London, United Kingdom
- M. Santillana
- Machine Intelligence Group for the Betterment of Health and the Environment, Network Science Institute, Northeastern University, Boston, MA, USA
- Center for Communicable Disease Dynamics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- F. A. Rodrigues
- Institute of Mathematics and Computer Science, University of São Paulo, São Carlos, SP, Brazil
|
17
|
Nghiem N, Atkinson J, Nguyen BP, Tran-Duy A, Wilson N. Predicting high health-cost users among people with cardiovascular disease using machine learning and nationwide linked social administrative datasets. Health Economics Review 2023; 13:9. [PMID: 36738348 PMCID: PMC9898915 DOI: 10.1186/s13561-023-00422-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/30/2022] [Accepted: 01/23/2023] [Indexed: 06/18/2023]
Abstract
OBJECTIVES To optimise planning of public health services, the impact of high-cost users needs to be considered. However, most existing statistical models for costs omit many clinical and social variables from administrative data that are associated with elevated health care resource use and are increasingly available. This study aimed to use machine learning approaches and big data to predict high-cost users among people with cardiovascular disease (CVD). METHODS We used nationally representative linked datasets in New Zealand to predict which prevalent CVD cases had the highest costs (those in the top quintiles by cost). We compared the performance of four popular machine learning models (L1-regularised logistic regression, classification trees, k-nearest neighbours (KNN) and random forest) with traditional regression models. RESULTS The machine learning models were far more accurate in predicting high health-cost users than the logistic models. The F1 score (the harmonic mean of sensitivity and positive predictive value) of the machine learning models ranged from 30.6% to 41.2% (compared with 8.6-9.1% for the logistic models). Previous health costs, income, age, chronic health conditions, deprivation, and receiving a social security benefit were among the most important predictors of CVD high-cost users. CONCLUSIONS This study provides additional evidence that machine learning can be used together with big data in health economics to identify new risk factors and predict high-cost users with CVD. As such, machine learning may assist with health services planning and preventive measures to improve population health while potentially saving healthcare costs.
Affiliation(s)
- Nhung Nghiem
- Department of Public Health, University of Otago, Wellington, New Zealand.
- June Atkinson
- Department of Public Health, University of Otago, Wellington, New Zealand
- Binh P Nguyen
- School of Mathematics and Statistics, Victoria University of Wellington, Wellington, New Zealand
- An Tran-Duy
- Centre for Health Policy, Melbourne School of Population and Global Health, University of Melbourne, Melbourne, Australia
- Nick Wilson
- Department of Public Health, University of Otago, Wellington, New Zealand
|
18
|
Cipriano LE. Evaluating the Impact and Potential Impact of Machine Learning on Medical Decision Making. Med Decis Making 2023; 43:147-149. [PMID: 36575951 PMCID: PMC9827491 DOI: 10.1177/0272989x221146506] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/29/2022]
Affiliation(s)
- Lauren E. Cipriano
- Lauren E. Cipriano, Medical Decision Making and MDM Policy & Practice; Ivey Business School and Departments of Medicine and Epidemiology and Biostatistics, Schulich School of Medicine & Dentistry, Western University, London, ON, Canada
|
19
|
MacNell N, Feinstein L, Wilkerson J, Salo PM, Molsberry SA, Fessler MB, Thorne PS, Motsinger-Reif AA, Zeldin DC. Implementing machine learning methods with complex survey data: Lessons learned on the impacts of accounting sampling weights in gradient boosting. PLoS One 2023; 18:e0280387. [PMID: 36638125 PMCID: PMC9838837 DOI: 10.1371/journal.pone.0280387] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 09/28/2022] [Accepted: 12/28/2022] [Indexed: 01/14/2023]
Abstract
Despite the prominent use of complex survey data and the growing popularity of machine learning methods in epidemiologic research, few machine learning software implementations offer options for handling complex samples. A major challenge impeding the broader incorporation of machine learning into epidemiologic research is incomplete guidance for analyzing complex survey data, including the importance of sampling weights for valid prediction in target populations. Using data from 15,820 participants in the 1988-1994 National Health and Nutrition Examination Survey cohort, we determined whether ignoring weights in gradient boosting models of all-cause mortality affected prediction, as measured by the F1 score and corresponding 95% confidence intervals. In simulations, we additionally assessed the impact of sample size, weight variability, predictor strength, and model dimensionality. In the National Health and Nutrition Examination Survey data, unweighted model performance was inflated compared to the weighted model (F1 score 81.9% [95% confidence interval: 81.2%, 82.7%] vs 77.4% [95% confidence interval: 76.1%, 78.6%]). However, the error was mitigated if the F1 score was subsequently recalculated with observed outcomes from the weighted dataset (F1: 77.0%; 95% confidence interval: 75.7%, 78.4%). In simulations, this finding held in the largest sample size (N = 10,000) under all analytic conditions assessed. For sample sizes <5,000, sampling weights had little impact in simulations that more closely resembled a simple random sample (low weight variability) or in models with strong predictors, but findings were inconsistent under other analytic scenarios. Failing to account for sampling weights in gradient boosting models may limit generalizability for data from complex surveys, depending on sample size and other analytic properties. In the absence of software for configuring weighted algorithms, post-hoc recalculation of unweighted model performance using weighted observed outcomes may more accurately reflect model prediction in target populations than ignoring weights entirely.
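The post-hoc recalculation described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' code: the function name and the toy data are assumptions. Predictions come from a model fitted without weights, and the F1 score is then recomputed with each observation's true positive, false positive, or false negative contribution scaled by its sampling weight.

```python
def weighted_f1(y_true, y_pred, weights=None):
    """F1 score in which each observation counts in proportion to its sampling weight.

    With weights=None every observation counts once (the usual unweighted F1).
    """
    if weights is None:
        weights = [1.0] * len(y_true)
    tp = fp = fn = 0.0
    for truth, pred, w in zip(y_true, y_pred, weights):
        if pred == 1 and truth == 1:
            tp += w      # weighted true positives
        elif pred == 1 and truth == 0:
            fp += w      # weighted false positives
        elif pred == 0 and truth == 1:
            fn += w      # weighted false negatives
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy data (hypothetical): predictions from an unweighted model,
# plus survey sampling weights for the same five participants.
y_true = [1, 1, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1]
sampling_weights = [1.0, 4.0, 1.0, 2.0, 1.0]

f1_unweighted = weighted_f1(y_true, y_pred)
f1_weighted = weighted_f1(y_true, y_pred, sampling_weights)
print(round(f1_unweighted, 3), round(f1_weighted, 3))
```

Here the missed case carries a large sampling weight, so the weighted F1 is lower than the unweighted one, which mirrors the direction of the inflation reported in the abstract. Libraries such as scikit-learn expose the same idea through a `sample_weight` argument on their metric functions.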
Affiliation(s)
- Nathaniel MacNell
- Social & Scientific Systems, a DLH Holdings Company, Durham, North Carolina, United States of America
- Lydia Feinstein
- Social & Scientific Systems, a DLH Holdings Company, Durham, North Carolina, United States of America
- Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina, Chapel Hill, North Carolina, United States of America
- Jesse Wilkerson
- Social & Scientific Systems, a DLH Holdings Company, Durham, North Carolina, United States of America
- Päivi M. Salo
- Division of Intramural Research, National Institute of Environmental Health Sciences, National Institutes of Health, Durham, North Carolina, United States of America
- Samantha A. Molsberry
- Social & Scientific Systems, a DLH Holdings Company, Durham, North Carolina, United States of America
- Michael B. Fessler
- Division of Intramural Research, National Institute of Environmental Health Sciences, National Institutes of Health, Durham, North Carolina, United States of America
- Peter S. Thorne
- Department of Occupational and Environmental Health, University of Iowa, College of Public Health, Iowa City, Iowa, United States of America
- Alison A. Motsinger-Reif
- Division of Intramural Research, National Institute of Environmental Health Sciences, National Institutes of Health, Durham, North Carolina, United States of America
- Darryl C. Zeldin
- Division of Intramural Research, National Institute of Environmental Health Sciences, National Institutes of Health, Durham, North Carolina, United States of America
|
20
|
Bowe AK, Lightbody G, Staines A, Murray DM. Big data, machine learning, and population health: predicting cognitive outcomes in childhood. Pediatr Res 2023; 93:300-307. [PMID: 35681091 PMCID: PMC7614199 DOI: 10.1038/s41390-022-02137-1] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 03/02/2022] [Revised: 05/05/2022] [Accepted: 05/17/2022] [Indexed: 11/09/2022]
Abstract
The application of machine learning (ML) to address population health challenges has received much less attention than its application in the clinical setting. One such challenge is addressing disparities in early childhood cognitive development-a complex public health issue rooted in the social determinants of health, exacerbated by inequity, characterised by intergenerational transmission, and which will continue unabated without novel approaches to address it. Early life, the period of optimal neuroplasticity, presents a window of opportunity for early intervention to improve cognitive development. Unfortunately for many, this window will be missed, and intervention may never occur or occur only when overt signs of cognitive delay manifest. In this review, we explore the potential value of ML and big data analysis in the early identification of children at risk for poor cognitive outcome, an area where there is an apparent dearth of research. We compare and contrast traditional statistical methods with ML approaches, provide examples of how ML has been used to date in the field of neurodevelopmental disorders, and present a discussion of the opportunities and risks associated with its use at a population level. The review concludes by highlighting potential directions for future research in this area. IMPACT: To date, the application of machine learning to address population health challenges in paediatrics lags behind other clinical applications. This review provides an overview of the public health challenge we face in addressing disparities in childhood cognitive development and focuses on the cornerstone of early intervention. Recent advances in our ability to collect large volumes of data, and in analytic capabilities, provide a potential opportunity to improve current practices in this field. This review explores the potential role of machine learning and big data analysis in the early identification of children at risk for poor cognitive outcomes.
Affiliation(s)
- Andrea K. Bowe
- INFANT Research Centre, University College Cork, Cork, Ireland
- Gordon Lightbody
- INFANT Research Centre, University College Cork, Cork, Ireland
- Department of Electrical and Electronic Engineering, University College Cork, Cork, Ireland
- Anthony Staines
- School of Nursing, Psychotherapy, and Community Health, Dublin City University, Dublin, Ireland
- Deirdre M. Murray
- INFANT Research Centre, University College Cork, Cork, Ireland
|
21
|
Søegaard SH, Spanggaard M, Rostgaard K, Kamper-Jørgensen M, Stensballe LG, Schmiegelow K, Hjalgrim H. Childcare attendance and risk of infections in childhood and adolescence. Int J Epidemiol 2022; 52:466-475. [PMID: 36413040 DOI: 10.1093/ije/dyac219] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/15/2022] [Accepted: 11/10/2022] [Indexed: 11/23/2022]
Abstract
Background
It has been suggested that the transiently increased infection risk following childcare enrolment is compensated by decreased infection risk later in childhood and adolescence. We investigated how childcare enrolment affected rates of antimicrobial-treated infections during childhood and adolescence.
Methods
In a register-based cohort study of all children born in Denmark 1997–2014 with available exposure information (n = 1 007 448), we assessed the association between childcare enrolment before age 6 years and infection risks up to age 20 years, using antimicrobial exposure as proxy for infections. Nationwide childcare and prescription data were used. We estimated infection rates and the cumulative number of infections using adjusted Poisson regression models.
Results
We observed 4 599 993 independent episodes of infection (antimicrobial exposure) during follow-up. Childcare enrolment transiently increased infection rates; the younger the child, the greater the increase. The resulting increased cumulative number of infections associated with earlier age at childcare enrolment was not compensated by lower infection risk later in childhood or adolescence. Accordingly, children enrolled in childcare before age 12 months had experienced 0.5–0.7 more infections at age 6 years (in total 4.5–5.1 infections) than peers enrolled at age 3 years, differences that persisted throughout adolescence. The type of childcare had little impact on infection risks.
Conclusions
Early age at childcare enrolment is associated with a modest increase in the cumulative number of antimicrobial-treated infections at all ages through adolescence. Emphasis should be given to disrupting infectious disease transmission in childcare facilities through prevention strategies with particular focus on the youngest children.
Affiliation(s)
- Signe Holst Søegaard
- Department of Epidemiology Research, Statens Serum Institut, Copenhagen, Denmark
- Haematology, Danish Cancer Society Research Centre, Danish Cancer Society, Copenhagen, Denmark
- Maria Spanggaard
- Department of Epidemiology Research, Statens Serum Institut, Copenhagen, Denmark
- Klaus Rostgaard
- Department of Epidemiology Research, Statens Serum Institut, Copenhagen, Denmark
- Haematology, Danish Cancer Society Research Centre, Danish Cancer Society, Copenhagen, Denmark
- Mads Kamper-Jørgensen
- Section of Epidemiology, Department of Public Health, University of Copenhagen, Copenhagen, Denmark
- Lone Graff Stensballe
- Department of Paediatrics and Adolescent Medicine, University Hospital Rigshospitalet, Copenhagen, Denmark
- Kjeld Schmiegelow
- Department of Paediatrics and Adolescent Medicine, University Hospital Rigshospitalet, Copenhagen, Denmark
- Department of Clinical Medicine, University of Copenhagen, Copenhagen, Denmark
- Henrik Hjalgrim
- Department of Epidemiology Research, Statens Serum Institut, Copenhagen, Denmark
- Haematology, Danish Cancer Society Research Centre, Danish Cancer Society, Copenhagen, Denmark
- Department of Clinical Medicine, University of Copenhagen, Copenhagen, Denmark
- Department of Haematology, University Hospital Rigshospitalet, Copenhagen, Denmark
|
22
|
Kaas‐Hansen BS, Granholm A, Anthon CT, Kjær MN, Sivapalan P, Maagaard M, Schjørring OL, Fagerberg SK, Ellekjær KL, Mølgaard J, Ekstrøm CT, Møller MH, Perner A. Causal inference for planning randomised critical care trials: Protocol for a scoping review. Acta Anaesthesiol Scand 2022; 66:1274-1278. [PMID: 36054374 PMCID: PMC9826202 DOI: 10.1111/aas.14142] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/10/2022] [Accepted: 08/17/2022] [Indexed: 01/11/2023]
Abstract
BACKGROUND Randomised clinical trials in critical care are prone to inconclusiveness owing, in part, to undue optimism about effect sizes and suboptimal accounting for heterogeneous treatment effects. Planned predictive enrichment based on secondary critical care data (often very rich with respect to both data types and temporal granularity) and causal inference methods may help overcome these challenges, but no overview exists about their use to this end. METHODS We will conduct a scoping review to assess the extent and nature of the use of causal inference from secondary data for planned predictive enrichment of randomised clinical trials in critical care. We will systematically search 10 general and specialty journals for reports published on or after 1 January 2018, of randomised clinical trials enrolling adult critically ill patients. We will collect trial metadata (e.g., recruitment period and phase) and, when available, information pertaining to the focus of the review (predictive enrichment based on causal inference estimates from secondary data): causal inference methods, estimation techniques and software used; types of patient populations; data provenance, types and models; and the availability of the data (public or not). The results will be reported in a descriptive manner. DISCUSSION The outlined scoping review aims to assess the use of causal inference methods and secondary data for planned predictive enrichment in randomised critical care trials. This will help guide methodological improvements to increase the utility, and facilitate the use, of causal inference estimates when planning such trials in the future.
Affiliation(s)
- Benjamin Skov Kaas‐Hansen
- Department of Intensive Care, Copenhagen University Hospital, Copenhagen, Denmark
- Section for Biostatistics, Department of Public Health, University of Copenhagen, Copenhagen, Denmark
| | - Anders Granholm
- Department of Intensive Care, Copenhagen University Hospital, Copenhagen, Denmark
| | - Carl Thomas Anthon
- Department of Intensive Care, Copenhagen University Hospital, Copenhagen, Denmark
| | | | - Praleene Sivapalan
- Department of Intensive Care, Copenhagen University Hospital, Copenhagen, Denmark
| | - Mathias Maagaard
- Department of Anaesthesiology, Centre for Anaesthesiological Research, Zealand University Hospital Køge, Køge, Denmark
| | - Olav Lilleholt Schjørring
- Department of Anaesthesia and Intensive Care, Aalborg University Hospital, Aalborg, Denmark
- Department of Clinical Medicine, Aalborg University, Aalborg, Denmark
| | - Steen Kåre Fagerberg
- Department of Anaesthesia and Intensive Care, Aalborg University Hospital, Aalborg, Denmark
| | | | - Jesper Mølgaard
- Department of Anesthesiology, Centre for Cancer and Organ Dysfunction, Copenhagen University Hospital, Copenhagen, Denmark
| | - Claus Thorn Ekstrøm
- Section for Biostatistics, Department of Public Health, University of Copenhagen, Copenhagen, Denmark
| | | | - Anders Perner
- Department of Intensive Care, Copenhagen University Hospital, Copenhagen, Denmark
|
23
|
Leist AK, Klee M, Kim JH, Rehkopf DH, Bordas SPA, Muniz-Terrera G, Wade S. Mapping of machine learning approaches for description, prediction, and causal inference in the social and health sciences. Sci Adv 2022; 8:eabk1942. [PMID: 36260666 PMCID: PMC9581488 DOI: 10.1126/sciadv.abk1942] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 06/28/2021] [Accepted: 09/01/2022] [Indexed: 05/20/2023]
Abstract
Machine learning (ML) methodology used in the social and health sciences needs to fit the intended research purposes of description, prediction, or causal inference. This paper provides a comprehensive, systematic meta-mapping of research questions in the social and health sciences to appropriate ML approaches by incorporating the necessary requirements to statistical analysis in these disciplines. We map the established classification into description, prediction, counterfactual prediction, and causal structural learning to common research goals, such as estimating prevalence of adverse social or health outcomes, predicting the risk of an event, and identifying risk factors or causes of adverse outcomes, and explain common ML performance metrics. Such mapping may help to fully exploit the benefits of ML while considering domain-specific aspects relevant to the social and health sciences and hopefully contribute to the acceleration of the uptake of ML applications to advance both basic and applied social and health sciences research.
Affiliation(s)
- Anja K. Leist
- Department of Social Sciences, Institute for Research on Socio-Economic Inequality (IRSEI), University of Luxembourg, Esch-sur-Alzette, Luxembourg
- Corresponding author.
| | - Matthias Klee
- Department of Social Sciences, Institute for Research on Socio-Economic Inequality (IRSEI), University of Luxembourg, Esch-sur-Alzette, Luxembourg
| | - Jung Hyun Kim
- Department of Social Sciences, Institute for Research on Socio-Economic Inequality (IRSEI), University of Luxembourg, Esch-sur-Alzette, Luxembourg
| | - David H. Rehkopf
- Department of Epidemiology and Population Health, Stanford University, Palo Alto, CA, USA
| | | | - Graciela Muniz-Terrera
- Centre for Dementia Prevention, University of Edinburgh, Edinburgh, UK
- Ohio University, Athens, OH, USA
| | - Sara Wade
- School of Mathematics, University of Edinburgh, Edinburgh, UK
|
24
|
van Smeden M, Heinze G, Van Calster B, Asselbergs FW, Vardas PE, Bruining N, de Jaegere P, Moore JH, Denaxas S, Boulesteix AL, Moons KGM. Critical appraisal of artificial intelligence-based prediction models for cardiovascular disease. Eur Heart J 2022; 43:2921-2930. [PMID: 35639667 PMCID: PMC9443991 DOI: 10.1093/eurheartj/ehac238] [Citation(s) in RCA: 41] [Impact Index Per Article: 20.5] [Received: 11/09/2021] [Revised: 03/29/2022] [Accepted: 04/26/2022] [Indexed: 11/12/2022]
Abstract
The medical field has seen a rapid increase in the development of artificial intelligence (AI)-based prediction models. With the introduction of such AI-based prediction model tools and software in cardiovascular patient care, the cardiovascular researcher and healthcare professional are challenged to understand the opportunities as well as the limitations of the AI-based predictions. In this article, we present 12 critical questions for cardiovascular health professionals to ask when confronted with an AI-based prediction model. We aim to support medical professionals to distinguish the AI-based prediction models that can add value to patient care from the AI that does not.
Affiliation(s)
- Maarten van Smeden
- Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Universiteitsweg 100, 3584 CG Utrecht, The Netherlands
| | - Georg Heinze
- Section for Clinical Biometrics, Center for Medical Statistics, Informatics and Intelligent Systems, Medical University of Vienna, Vienna, Austria
| | - Ben Van Calster
- Department of Development and Regeneration, KU Leuven, Leuven, Belgium
- EPI Centre, KU Leuven, Leuven, Belgium
- Department of Biomedical Data Sciences, Leiden University Medical Centre, Leiden, The Netherlands
| | - Folkert W Asselbergs
- Department of Cardiology, Division Heart and Lungs, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
- Institute of Cardiovascular Science, Faculty of Population Health Sciences, University College London, London, UK
- Health Data Research UK and Institute of Health Informatics, University College London, London, UK
| | - Panos E Vardas
- Department of Cardiology, Heraklion University Hospital, Heraklion, Greece
- Heart Sector, Hygeia Hospitals Group, Athens, Greece
| | - Nico Bruining
- Department of Cardiology, Erasmus MC, Thorax Center, Rotterdam, The Netherlands
| | - Peter de Jaegere
- Department of Cardiology, Erasmus MC, Thorax Center, Rotterdam, The Netherlands
| | - Jason H Moore
- Department of Computational Biomedicine, Cedars-Sinai Medical Center, Los Angeles, CA, USA
| | - Spiros Denaxas
- Health Data Research UK and Institute of Health Informatics, University College London, London, UK
- The Alan Turing Institute, London, UK
| | - Anne Laure Boulesteix
- Institute for Medical Information Processing, Biometry and Epidemiology, LMU Munich, Germany
| | - Karel G M Moons
- Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Universiteitsweg 100, 3584 CG Utrecht, The Netherlands
|
25
|
Harhay MO, Bell KJL, Huang JY, Arah OA. IJE's Education Corner turns 10! Looking back and looking forward. Int J Epidemiol 2022; 51:1357-1360. [PMID: 35950800 DOI: 10.1093/ije/dyac161] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/08/2022] [Accepted: 08/02/2022] [Indexed: 02/06/2023]
Affiliation(s)
- Michael O Harhay
- Editorial Board, International Journal of Epidemiology, Sydney, Australia; Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA; Clinical Trials Methods and Outcomes Lab, Palliative and Advanced Illness Research Center, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
| | - Katy J L Bell
- Editorial Board, International Journal of Epidemiology, Sydney, Australia; School of Public Health, Faculty of Medicine and Health, University of Sydney, Sydney, NSW, Australia
| | - Jonathan Y Huang
- Editorial Board, International Journal of Epidemiology, Sydney, Australia; Biostatistics and Human Development, Singapore Institute for Clinical Sciences, Agency for Science, Technology and Research, Singapore; Centre for Quantitative Medicine, Duke-NUS Medical School, Singapore
| | - Onyebuchi A Arah
- Editorial Board, International Journal of Epidemiology, Sydney, Australia; Department of Epidemiology, Fielding School of Public Health, University of California Los Angeles, Los Angeles, CA, USA; Department of Statistics, UCLA College, Los Angeles, CA, USA; Research Unit for Epidemiology, Department of Public Health, Aarhus University, Aarhus, Denmark
|
26
|
Simoneau G, Pellegrini F, Debray TPA, Rouette J, Muñoz J, Platt RW, Petkau J, Bohn J, Shen C, de Moor C, Karim ME. Recommendations for the use of propensity score methods in multiple sclerosis research. Mult Scler 2022; 28:1467-1480. [PMID: 35387508 PMCID: PMC9260471 DOI: 10.1177/13524585221085733] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 11/30/2021] [Revised: 02/03/2022] [Accepted: 02/17/2022] [Indexed: 01/24/2023]
Abstract
BACKGROUND With many disease-modifying therapies currently approved for the management of multiple sclerosis, there is a growing need to evaluate the comparative effectiveness and safety of those therapies from real-world data sources. Propensity score methods have recently gained popularity in multiple sclerosis research to generate real-world evidence. Recent evidence suggests, however, that the conduct and reporting of propensity score analyses are often suboptimal in multiple sclerosis studies. OBJECTIVES To provide practical guidance to clinicians and researchers on the use of propensity score methods within the context of multiple sclerosis research. METHODS We summarize recommendations on the use of propensity score matching and weighting based on the current methodological literature, and provide examples of good practice. RESULTS Step-by-step recommendations are presented, starting with covariate selection and propensity score estimation, followed by guidance on the assessment of covariate balance and implementation of propensity score matching and weighting. Finally, we focus on treatment effect estimation and sensitivity analyses. CONCLUSION This comprehensive set of recommendations highlights key elements that require careful attention when using propensity score methods.
Affiliation(s)
| | | | | | - Julie Rouette
- Department of Epidemiology, Biostatistics and Occupational Health, McGill University, Montreal, QC, Canada; Centre for Clinical Epidemiology, Lady Davis Institute, Jewish General Hospital, Montreal, QC, Canada
| | - Johanna Muñoz
- University Medical Center Utrecht, Utrecht, The Netherlands
| | - Robert W. Platt
- Department of Pediatrics, McGill University, Montreal, QC, Canada; Department of Epidemiology, Biostatistics and Occupational Health, McGill University, Montreal, QC, Canada; Centre for Clinical Epidemiology, Lady Davis Institute, Jewish General Hospital, Montreal, QC, Canada
| | - John Petkau
- Department of Statistics, The University of British Columbia, Vancouver, BC, Canada
| | | | | | | | - Mohammad Ehsanul Karim
- School of Population and Public Health, The University of British Columbia, Vancouver, BC, Canada; Centre for Health Evaluation and Outcome Sciences, The University of British Columbia, Vancouver, BC, Canada
|
27
|
Padula WV, Kreif N, Vanness DJ, Adamson B, Rueda JD, Felizzi F, Jonsson P, IJzerman MJ, Butte A, Crown W. Machine Learning Methods in Health Economics and Outcomes Research-The PALISADE Checklist: A Good Practices Report of an ISPOR Task Force. Value Health 2022; 25:1063-1080. [PMID: 35779937 DOI: 10.1016/j.jval.2022.03.022] [Citation(s) in RCA: 23] [Impact Index Per Article: 11.5] [Received: 03/22/2022] [Accepted: 03/25/2022] [Indexed: 06/15/2023]
Abstract
Advances in machine learning (ML) and artificial intelligence offer tremendous potential benefits to patients. Predictive analytics using ML are already widely used in healthcare operations and care delivery, but how can ML be used for health economics and outcomes research (HEOR)? To answer this question, ISPOR established an emerging good practices task force for the application of ML in HEOR. The task force identified 5 methodological areas where ML could enhance HEOR: (1) cohort selection, identifying samples with greater specificity with respect to inclusion criteria; (2) identification of independent predictors and covariates of health outcomes; (3) predictive analytics of health outcomes, including those that are high cost or life threatening; (4) causal inference through methods, such as targeted maximum likelihood estimation or double-debiased estimation-helping to produce reliable evidence more quickly; and (5) application of ML to the development of economic models to reduce structural, parameter, and sampling uncertainty in cost-effectiveness analysis. Overall, ML facilitates HEOR through the meaningful and efficient analysis of big data. Nevertheless, a lack of transparency on how ML methods deliver solutions to feature selection and predictive analytics, especially in unsupervised circumstances, increases risk to providers and other decision makers in using ML results. To examine whether ML offers a useful and transparent solution to healthcare analytics, the task force developed the PALISADE Checklist. It is a guide for balancing the many potential applications of ML with the need for transparency in methods development and findings.
Affiliation(s)
- William V Padula
- Department of Pharmaceutical and Health Economics, School of Pharmacy, University of Southern California, Los Angeles, CA, USA; The Leonard D. Schaeffer Center for Health Policy & Economics, University of Southern California, Los Angeles, CA, USA.
| | - Noemi Kreif
- Centre for Health Economics, University of York, York, England, UK
| | - David J Vanness
- Department of Health Policy and Administration, College of Health and Human Development, Pennsylvania State University, Hershey, PA, USA
| | | | | | | | - Pall Jonsson
- National Institute for Health and Care Excellence, Manchester, England, UK
| | - Maarten J IJzerman
- Centre for Health Policy, School of Population and Global Health, University of Melbourne, Melbourne, Australia
| | - Atul Butte
- School of Medicine, University of California, San Francisco, San Francisco, CA, USA
| | - William Crown
- The Heller School for Social Policy and Management, Brandeis University, Waltham, MA, USA.
|
28
|
Kalia S, Saarela O, Chen T, O'Neill B, Meaney C, Gronsbell J, Sejdic E, Escobar M, Aliarzadeh B, Moineddin R, Pow C, Sullivan F, Greiver M. Marginal structural models using calibrated weights with SuperLearner: application to type II diabetes cohort. IEEE J Biomed Health Inform 2022; 26:4197-4206. [PMID: 35588417 DOI: 10.1109/jbhi.2022.3175862] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/09/2022]
Abstract
As different scientific disciplines begin to converge on machine learning for causal inference, we demonstrate the application of machine learning algorithms in the context of longitudinal causal estimation using electronic health records. Our aim is to formulate a marginal structural model for estimating diabetes care provisions in which we envisioned hypothetical (i.e. counterfactual) dynamic treatment regimes using a combination of drug therapies to manage diabetes: metformin, sulfonylurea and SGLT-2i. The binary outcome of diabetes care provisions was defined using a composite measure of chronic disease prevention and screening elements [27] including (i) primary care visit, (ii) blood pressure, (iii) weight, (iv) hemoglobin A1c, (v) lipid, (vi) ACR, (vii) eGFR and (viii) statin medication. We used several statistical learning algorithms to describe causal relationships between the prescription of three common classes of diabetes medications and quality of diabetes care using the electronic health records contained in National Diabetes Repository. In particular, we generated an ensemble of statistical learning algorithms using the SuperLearner framework based on the following base learners: (i) least absolute shrinkage and selection operator, (ii) ridge regression, (iii) elastic net, (iv) random forest, (v) gradient boosting machines, and (vi) neural network. Each statistical learning algorithm was fitted using the pseudo-population generated from the marginalization of the time-dependent confounding process. Covariate balance was assessed using the longitudinal (i.e. cumulative-time product) stabilized weights with calibrated restrictions. Our results indicated that the treatment drop-in cohorts (with respect to metformin, sulfonylurea and SGLT-2i) may have improved diabetes care provisions in relation to treatment naive (i.e. no treatment) cohort. 
As a clinical utility, we hope that this article will facilitate discussions around the prevention of adverse chronic outcomes associated with type II diabetes through the improvement of diabetes care provisions in primary care.
|
29
|
Remiro-Azócar A, Heath A, Baio G. Effect modification in anchored indirect treatment comparison: Comments on "Matching-adjusted indirect comparisons: Application to time-to-event data". Stat Med 2022; 41:1541-1553. [PMID: 35274754 DOI: 10.1002/sim.9286] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 12/10/2020] [Revised: 11/04/2021] [Accepted: 11/17/2021] [Indexed: 01/17/2023]
Affiliation(s)
- Antonio Remiro-Azócar
- Department of Statistical Science, University College London, London, United Kingdom; Quantitative Research, Statistical Outcomes Research & Analytics (SORA) Ltd., London, United Kingdom
| | - Anna Heath
- Department of Statistical Science, University College London, London, United Kingdom; Child Health Evaluative Sciences, The Hospital for Sick Children, Toronto, ON, Canada; Dalla Lana School of Public Health, University of Toronto, Toronto, ON, Canada
| | - Gianluca Baio
- Department of Statistical Science, University College London, London, United Kingdom
|
30
|
Broadbent A, Grote T. Can Robots Do Epidemiology? Machine Learning, Causal Inference, and Predicting the Outcomes of Public Health Interventions. Philos Technol 2022; 35:14. [PMID: 35251906 PMCID: PMC8881939 DOI: 10.1007/s13347-022-00509-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/10/2021] [Accepted: 11/20/2021] [Indexed: 11/29/2022]
Abstract
This paper argues that machine learning (ML) and epidemiology are on collision course over causation. The discipline of epidemiology lays great emphasis on causation, while ML research does not. Some epidemiologists have proposed imposing what amounts to a causal constraint on ML in epidemiology, requiring it either to engage in causal inference or restrict itself to mere projection. We whittle down the issues to the question of whether causal knowledge is necessary for underwriting predictions about the outcomes of public health interventions. While there is great plausibility to the idea that it is, conviction that something is impossible does not by itself motivate a constraint to forbid trying. We disambiguate the possible motivations for such a constraint into definitional, metaphysical, epistemological, and pragmatic considerations and argue that “Proceed with caution” (rather than “Stop!”) is the outcome of each. We then argue that there are positive reasons to proceed, albeit cautiously. Causal inference enforces existing classification schema prior to the testing of associational claims (causal or otherwise), but associations and classification schema are more plausibly discovered (rather than tested or justified) in a back-and-forth process of gaining reflective equilibrium. ML instantiates this kind of process, we argue, and thus offers the welcome prospect of uncovering meaningful new concepts in epidemiology and public health—provided it is not causally constrained.
Affiliation(s)
- Alex Broadbent
- Department of Philosophy, Durham University, Durham, England
- Department of Philosophy, University of Johannesburg, Johannesburg, South Africa
| | - Thomas Grote
- Cluster of Excellence: Machine Learning for Science, University of Tubingen, Tubingen, Germany
|
31
|
Gebremedhin AT, Hogan AB, Blyth CC, Glass K, Moore HC. Developing a prediction model to estimate the true burden of respiratory syncytial virus (RSV) in hospitalised children in Western Australia. Sci Rep 2022; 12:332. [PMID: 35013434 PMCID: PMC8748465 DOI: 10.1038/s41598-021-04080-3] [Citation(s) in RCA: 58] [Impact Index Per Article: 29.0] [Received: 04/03/2021] [Accepted: 12/14/2021] [Indexed: 12/23/2022]
Abstract
Respiratory syncytial virus (RSV) is a leading cause of childhood morbidity, however there is no systematic testing in children hospitalised with respiratory symptoms. Therefore, current RSV incidence likely underestimates the true burden. We used probabilistically linked perinatal, hospital, and laboratory records of 321,825 children born in Western Australia (WA), 2000-2012. We generated a predictive model for RSV positivity in hospitalised children aged < 5 years. We applied the model to all hospitalisations in our population-based cohort to determine the true RSV incidence, and under-ascertainment fraction. The model's predictive performance was determined using cross-validated area under the receiver operating characteristic (AUROC) curve. From 321,825 hospitalisations, 37,784 were tested for RSV (22.8% positive). Predictors of RSV positivity included younger admission age, male sex, non-Aboriginal ethnicity, a diagnosis of bronchiolitis and longer hospital stay. Our model showed good predictive accuracy (AUROC: 0.87). The respective sensitivity, specificity, positive predictive value and negative predictive values were 58.4%, 92.2%, 68.6% and 88.3%. The predicted incidence rates of hospitalised RSV for children aged < 3 months was 43.7/1000 child-years (95% CI 42.1-45.4) compared with 31.7/1000 child-years (95% CI 30.3-33.1) from laboratory-confirmed RSV admissions. Findings from our study suggest that the true burden of RSV may be 30-57% higher than current estimates.
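As a rough consistency check on the figures quoted in this abstract, the predictive values can be recomputed from sensitivity, specificity, and prevalence via Bayes' rule, and the relative under-ascertainment from the two incidence rates. This is an illustrative sketch, not the authors' code: the function names are hypothetical, and it assumes the 22.8% positivity among tested admissions can stand in for prevalence.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) implied by Bayes' rule for a binary test."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


def relative_excess(predicted_rate, observed_rate):
    """Fraction by which the predicted rate exceeds the observed rate."""
    return (predicted_rate - observed_rate) / observed_rate


# Figures taken from the abstract; prevalence is an assumption (see lead-in).
ppv, npv = predictive_values(sensitivity=0.584, specificity=0.922, prevalence=0.228)
# Incidence per 1000 child-years in children aged < 3 months.
excess = relative_excess(predicted_rate=43.7, observed_rate=31.7)
print(f"PPV={ppv:.3f}  NPV={npv:.3f}  excess={excess:.3f}")
```

Under these assumptions the implied PPV and NPV come out at roughly 0.69 and 0.88, close to the reported 68.6% and 88.3%; small differences are expected because the published values are cross-validated. The excess for the under-3-months group is about 38%, which sits inside the 30-57% range given in the abstract.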
Affiliation(s)
- Amanuel Tesfay Gebremedhin
- Wesfarmers Centre of Vaccines and Infectious Diseases, Telethon Kids Institute, University of Western Australia, Perth, 6872, Australia.
| | - Alexandra B Hogan
- MRC Centre for Global Infectious Disease Analysis, School of Public Health, Imperial College London, London, UK
| | - Christopher C Blyth
- Wesfarmers Centre of Vaccines and Infectious Diseases, Telethon Kids Institute, University of Western Australia, Perth, 6872, Australia
- School of Medicine, The University of Western Australia, Perth, WA, Australia
- Department of Infectious Diseases, Perth Children's Hospital, Perth, WA, Australia
- PathWest Laboratory Medicine, QEII Medical Centre, Nedlands, Perth, WA, Australia
| | - Kathryn Glass
- Research School of Population Health, Australian National University, Canberra, Australia
| | - Hannah C Moore
- Wesfarmers Centre of Vaccines and Infectious Diseases, Telethon Kids Institute, University of Western Australia, Perth, 6872, Australia
|
32
|
Chatton A, Borgne FL, Leyrat C, Foucher Y. G-computation and doubly robust standardisation for continuous-time data: A comparison with inverse probability weighting. Stat Methods Med Res 2021; 31:706-718. [PMID: 34861799 DOI: 10.1177/09622802211047345] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 11/15/2022]
Abstract
In time-to-event settings, g-computation and doubly robust estimators are based on discrete-time data. However, many biological processes are evolving continuously over time. In this paper, we extend the g-computation and the doubly robust standardisation procedures to a continuous-time context. We compare their performance to the well-known inverse-probability-weighting estimator for the estimation of the hazard ratio and restricted mean survival times difference, using a simulation study. Under a correct model specification, all methods are unbiased, but g-computation and the doubly robust standardisation are more efficient than inverse-probability-weighting. We also analyse two real-world datasets to illustrate the practical implementation of these approaches. We have updated the R package RISCA to facilitate the use of these methods and their dissemination.
Affiliation(s)
- Arthur Chatton
- INSERM UMR 1246 - SPHERE, Nantes University, Tours University, France; IDBC-A2COM, Pacé, France
| | - Florent Le Borgne
- INSERM UMR 1246 - SPHERE, Nantes University, Tours University, France; IDBC-A2COM, Pacé, France
| | - Clémence Leyrat
- Department of Medical Statistics, London School of Hygiene and Tropical Medicine, UK; Inequalities in Cancer Outcomes Network (ICON), London School of Hygiene and Tropical Medicine, UK
| | - Yohann Foucher
- INSERM UMR 1246 - SPHERE, Nantes University, Tours University, France; Centre Hospitalier Universitaire de Nantes, France
|
33
|
Zhong Y, Kennedy EH, Bodnar LM, Naimi AI. AIPW: An R Package for Augmented Inverse Probability-Weighted Estimation of Average Causal Effects. Am J Epidemiol 2021; 190:2690-2699. [PMID: 34268567 PMCID: PMC8796813 DOI: 10.1093/aje/kwab207] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 10/28/2020] [Revised: 07/09/2021] [Accepted: 07/13/2021] [Indexed: 12/26/2022]
Abstract
An increasing number of recent studies have suggested that doubly robust estimators with cross-fitting should be used when estimating causal effects with machine learning methods. However, not all existing programs that implement doubly robust estimators support machine learning methods and cross-fitting, or provide estimates on multiplicative scales. To address these needs, we developed AIPW, a software package implementing augmented inverse probability weighting (AIPW) estimation of average causal effects in R (R Foundation for Statistical Computing, Vienna, Austria). Key features of the AIPW package include cross-fitting and flexible covariate adjustment for observational studies and randomized controlled trials (RCTs). In this paper, we use a simulated RCT to illustrate implementation of the AIPW estimator. We also perform a simulation study to evaluate the performance of the AIPW package compared with other doubly robust implementations, including CausalGAM, npcausal, tmle, and tmle3. Our simulation showed that the AIPW package yields performance comparable to that of other programs. Furthermore, we also found that cross-fitting substantively decreases the bias and improves the confidence interval coverage for doubly robust estimators fitted with machine learning algorithms. Our findings suggest that the AIPW package can be a useful tool for estimating average causal effects with machine learning methods in RCTs and observational studies.
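The estimator named in this abstract can be sketched in a few lines. The following is a minimal pure-Python illustration of augmented inverse probability weighting with two-fold cross-fitting; it is not the API of the AIPW R package. All function names are hypothetical, the data are simulated with a known effect of 2.0, and the nuisance models are simple stratum means on a single binary covariate rather than machine learning fits.

```python
import random


def simulate(n, seed=0):
    """Simulated observational data: rows of (x, a, y) with a true effect of 2.0."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = 1 if rng.random() < 0.5 else 0                    # binary confounder
        a = 1 if rng.random() < (0.7 if x else 0.3) else 0    # treatment depends on x
        y = 1.0 + 2.0 * a + 3.0 * x + rng.gauss(0.0, 1.0)     # outcome
        rows.append((x, a, y))
    return rows


def fit_nuisance(rows):
    """Stratum-mean estimates of e(x) = P(A=1 | X=x) and m_a(x) = E[Y | A=a, X=x].

    Assumes every (a, x) cell in `rows` is non-empty.
    """
    e, m = {}, {}
    for x in (0, 1):
        stratum = [r for r in rows if r[0] == x]
        e[x] = sum(a for _, a, _ in stratum) / len(stratum)
        for arm in (0, 1):
            ys = [y for _, a, y in stratum if a == arm]
            m[(arm, x)] = sum(ys) / len(ys)
    return e, m


def aipw_ate(rows, n_folds=2):
    """Cross-fitted AIPW estimate of the average treatment effect."""
    folds = [rows[k::n_folds] for k in range(n_folds)]
    scores = []
    for k, held_out in enumerate(folds):
        # Fit nuisance models on the other folds, evaluate on the held-out fold.
        train = [r for j, fold in enumerate(folds) if j != k for r in fold]
        e, m = fit_nuisance(train)
        for x, a, y in held_out:
            m1, m0 = m[(1, x)], m[(0, x)]
            scores.append(
                m1 - m0
                + a * (y - m1) / e[x]
                - (1 - a) * (y - m0) / (1.0 - e[x])
            )
    return sum(scores) / len(scores)


def naive_difference(rows):
    """Unadjusted difference in mean outcome between treated and untreated."""
    treated = [y for _, a, y in rows if a == 1]
    control = [y for _, a, y in rows if a == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


data = simulate(4000)
print("naive:", round(naive_difference(data), 2), "AIPW:", round(aipw_ate(data), 2))
```

In this simulated setting the unadjusted difference is confounded by `x` and overshoots the true effect, while the cross-fitted estimate should land near 2.0. With flexible learners in place of stratum means, cross-fitting is what keeps overfitting in the nuisance models from leaking into the effect estimate.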
Affiliation(s)
| | | | | | - Ashley I Naimi
- Correspondence to Dr. Ashley I. Naimi, Department of Epidemiology, Rollins School of Public Health, Emory University, 1518 Clifton Road, Atlanta, GA 30322
|
34
|
Gao Q, Zhang Y, Liang J, Sun H, Wang T. High-dimensional generalized propensity score with application to omics data. Brief Bioinform 2021; 22:6354024. [PMID: 34410351 DOI: 10.1093/bib/bbab331] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 04/25/2021] [Revised: 07/26/2021] [Accepted: 07/27/2021] [Indexed: 01/09/2023]
Abstract
Propensity score (PS) methods are popular when estimating causal effects in non-randomized studies. Drawing causal conclusion relies on the unconfoundedness assumption. This assumption is untestable and is considered more plausible if a large number of pre-treatment covariates are included in the analysis. However, previous studies have shown that including unnecessary covariates into PS models can lead to bias and efficiency loss. With the ever-increasing amounts of available data, such as the omics data, there is often little prior knowledge of the exact set of important covariates. Therefore, variable selection for causal inference in high-dimensional settings has received considerable attention in recent years. However, recent studies have focused mainly on binary treatments. In this study, we considered continuous treatments and proposed the generalized outcome-adaptive LASSO (GOAL) to select covariates that can provide an unbiased and statistically efficient estimation. Simulation studies showed that when the outcome model was linear, the GOAL selected almost all true confounders and predictors of outcome and excluded other covariates. The accuracy and precision of the estimates were close to ideal. Furthermore, the GOAL is robust to model misspecification. We applied the GOAL to seven DNA methylation datasets from the Gene Expression Omnibus database, which covered four brain regions, to estimate the causal effects of epigenetic aging acceleration on the incidence of Alzheimer's disease.
Affiliation(s)
- Qian Gao
- Department of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, China
| | - Yu Zhang
- Department of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, China
| | - Jie Liang
- Department of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, China
| | - Hongwei Sun
- Department of Health Statistics, School of Public Health and Management, Binzhou Medical University, Yantai, China
| | - Tong Wang
- Department of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, China
| |
|
35
|
Tennant PWG, Murray EJ, Arnold KF, Berrie L, Fox MP, Gadd SC, Harrison WJ, Keeble C, Ranker LR, Textor J, Tomova GD, Gilthorpe MS, Ellison GTH. Use of directed acyclic graphs (DAGs) to identify confounders in applied health research: review and recommendations. Int J Epidemiol 2021; 50:620-632. [PMID: 33330936 PMCID: PMC8128477 DOI: 10.1093/ije/dyaa213] [Citation(s) in RCA: 341] [Impact Index Per Article: 113.7] [Accepted: 10/12/2020] [Indexed: 12/12/2022] Open
Abstract
Background Directed acyclic graphs (DAGs) are an increasingly popular approach for identifying confounding variables that require conditioning when estimating causal effects. This review examined the use of DAGs in applied health research to inform recommendations for improving their transparency and utility in future research. Methods Original health research articles published during 1999–2017 mentioning ‘directed acyclic graphs’ (or similar) or citing DAGitty were identified from Scopus, Web of Science, Medline and Embase. Data were extracted on the reporting of: estimands, DAGs and adjustment sets, alongside the characteristics of each article’s largest DAG. Results A total of 234 articles were identified that reported using DAGs. A fifth (n = 48, 21%) reported their target estimand(s) and half (n = 115, 48%) reported the adjustment set(s) implied by their DAG(s). Two-thirds of the articles (n = 144, 62%) made at least one DAG available. DAGs varied in size but averaged 12 nodes [interquartile range (IQR): 9–16, range: 3–28] and 29 arcs (IQR: 19–42, range: 3–99). The median saturation (i.e. percentage of total possible arcs) was 46% (IQR: 31–67, range: 12–100). 37% (n = 53) of the DAGs included unobserved variables, 17% (n = 25) included ‘super-nodes’ (i.e. nodes containing more than one variable) and 34% (n = 49) were visually arranged so that the constituent arcs flowed in the same direction (e.g. top-to-bottom). Conclusion There is substantial variation in the use and reporting of DAGs in applied health research. Although this partly reflects their flexibility, it also highlights some potential areas for improvement. This review hence offers several recommendations to improve the reporting and use of DAGs in future research.
Affiliation(s)
- Peter W G Tennant
- Leeds Institute for Data Analytics, University of Leeds, Leeds, UK; Faculty of Medicine and Health, University of Leeds, Leeds, UK; Alan Turing Institute, British Library, London, UK
| | - Eleanor J Murray
- Department of Epidemiology, School of Public Health, Boston University, Boston, MA, USA
| | - Kellyn F Arnold
- Leeds Institute for Data Analytics, University of Leeds, Leeds, UK; Faculty of Medicine and Health, University of Leeds, Leeds, UK
| | - Laurie Berrie
- Leeds Institute for Data Analytics, University of Leeds, Leeds, UK; School of Geography, University of Leeds, Leeds, UK; School of GeoSciences, University of Edinburgh, Edinburgh, UK
| | - Matthew P Fox
- Department of Epidemiology, School of Public Health, Boston University, Boston, MA, USA; Department of Global Health, Boston University, Boston, MA, USA
| | - Sarah C Gadd
- Leeds Institute for Data Analytics, University of Leeds, Leeds, UK; School of Geography, University of Leeds, Leeds, UK
| | - Wendy J Harrison
- Leeds Institute for Data Analytics, University of Leeds, Leeds, UK; Faculty of Medicine and Health, University of Leeds, Leeds, UK
| | - Claire Keeble
- Leeds Institute for Data Analytics, University of Leeds, Leeds, UK
| | - Lynsie R Ranker
- Department of Epidemiology, School of Public Health, Boston University, Boston, MA, USA
| | - Johannes Textor
- Department of Tumour Immunology, Radboud University Medical Center, Nijmegen, The Netherlands
| | - Georgia D Tomova
- Leeds Institute for Data Analytics, University of Leeds, Leeds, UK; Faculty of Medicine and Health, University of Leeds, Leeds, UK; Alan Turing Institute, British Library, London, UK
| | - Mark S Gilthorpe
- Leeds Institute for Data Analytics, University of Leeds, Leeds, UK; Faculty of Medicine and Health, University of Leeds, Leeds, UK; Alan Turing Institute, British Library, London, UK
| | - George T H Ellison
- Leeds Institute for Data Analytics, University of Leeds, Leeds, UK; Faculty of Medicine and Health, University of Leeds, Leeds, UK; Centre for Data Innovation, Faculty of Science and Technology, University of Central Lancashire, Preston, UK
| |
|
36
|
Rogers P, Wang D, Lu Z. Medical Information Mart for Intensive Care: A Foundation for the Fusion of Artificial Intelligence and Real-World Data. Front Artif Intell 2021; 4:691626. [PMID: 34136802 PMCID: PMC8201087 DOI: 10.3389/frai.2021.691626] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 04/06/2021] [Accepted: 05/18/2021] [Indexed: 01/27/2023] Open
Affiliation(s)
- Paul Rogers
- Division of Bioinformatics and Biostatistics, FDA National Center for Toxicological Research, Jefferson, AR, United States
| | - Dong Wang
- Division of Bioinformatics and Biostatistics, FDA National Center for Toxicological Research, Jefferson, AR, United States
| | - Zhiyuan Lu
- Division of Bioinformatics and Biostatistics, FDA National Center for Toxicological Research, Jefferson, AR, United States
| |
|
37
|
Fu EL, van Diepen M, Xu Y, Trevisan M, Dekker FW, Zoccali C, Jager K, Carrero JJ. Pharmacoepidemiology for nephrologists (part 2): potential biases and how to overcome them. Clin Kidney J 2021; 14:1317-1326. [PMID: 33959262 PMCID: PMC8087121 DOI: 10.1093/ckj/sfaa242] [Citation(s) in RCA: 30] [Impact Index Per Article: 10.0] [Received: 09/21/2020] [Accepted: 10/02/2020] [Indexed: 12/21/2022] Open
Abstract
Observational pharmacoepidemiological studies using routinely collected healthcare data are increasingly being used in the field of nephrology to answer questions on the effectiveness and safety of medications. This review discusses a number of biases that may arise in such studies and proposes solutions to minimize them during the design or statistical analysis phase. We first describe designs to handle confounding by indication (e.g. active comparator design) and methods to investigate the influence of unmeasured confounding, such as the E-value, the use of negative control outcomes and control cohorts. We next discuss prevalent user and immortal time biases in pharmacoepidemiology research and how these can be prevented by focussing on incident users and applying either landmarking, using a time-varying exposure, or the cloning, censoring and weighting method. Lastly, we briefly discuss the common issues with missing data and misclassification bias. When these biases are properly accounted for, pharmacoepidemiological observational studies can provide valuable information for clinical practice.
Affiliation(s)
- Edouard L Fu
- Department of Clinical Epidemiology, Leiden University Medical Center, Leiden, The Netherlands
| | - Merel van Diepen
- Department of Clinical Epidemiology, Leiden University Medical Center, Leiden, The Netherlands
| | - Yang Xu
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Solna, Sweden
| | - Marco Trevisan
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Solna, Sweden
| | - Friedo W Dekker
- Department of Clinical Epidemiology, Leiden University Medical Center, Leiden, The Netherlands
| | - Carmine Zoccali
- CNR-IFC, Clinical Epidemiology of Renal Diseases and Hypertension, Reggio Calabria, Italy
| | - Kitty Jager
- Department of Medical Informatics, ERA-EDTA Registry, Amsterdam University Medical Center, Amsterdam Public Health Research Institute, University of Amsterdam, Amsterdam, The Netherlands
| | - Juan Jesus Carrero
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Solna, Sweden
| |
|
38
|
Lin L, Sperrin M, Jenkins DA, Martin GP, Peek N. A scoping review of causal methods enabling predictions under hypothetical interventions. Diagn Progn Res 2021; 5:3. [PMID: 33536082 PMCID: PMC7860039 DOI: 10.1186/s41512-021-00092-9] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 07/21/2020] [Accepted: 01/02/2021] [Indexed: 02/07/2023] Open
Abstract
BACKGROUND The methods with which prediction models are usually developed mean that neither the parameters nor the predictions should be interpreted causally. For many applications, this is perfectly acceptable. However, when prediction models are used to support decision making, there is often a need for predicting outcomes under hypothetical interventions. AIMS We aimed to identify published methods for developing and validating prediction models that enable risk estimation of outcomes under hypothetical interventions, utilizing causal inference. We aimed to identify the main methodological approaches, their underlying assumptions, targeted estimands, and potential pitfalls and challenges with using the method. Finally, we aimed to highlight unresolved methodological challenges. METHODS We systematically reviewed literature published by December 2019, considering papers in the health domain that used causal considerations to enable prediction models to be used for predictions under hypothetical interventions. We included both methodologies proposed in statistical/machine learning literature and methodologies used in applied studies. RESULTS We identified 4919 papers through database searches and a further 115 papers through manual searches. Of these, 87 papers were retained for full-text screening, of which 13 were selected for inclusion. We found papers from both the statistical and the machine learning literature. Most of the identified methods for causal inference from observational data were based on marginal structural models and g-estimation. CONCLUSIONS There exist two broad methodological approaches for incorporating prediction under hypothetical interventions into clinical prediction models: (1) enriching prediction models derived from observational studies with estimated causal effects from clinical trials and meta-analyses and (2) estimating prediction models and causal effects directly from observational data. These methods require extending to dynamic treatment regimes, and consideration of multiple interventions to operationalise a clinical decision support system. Techniques for validating 'causal prediction models' are still in their infancy.
Affiliation(s)
- Lijing Lin
- Division of Informatics, Imaging and Data Science, Faculty of Biology, Medicine and Health, University of Manchester, Manchester Academic Health Science Centre, Manchester, UK.
| | - Matthew Sperrin
- Division of Informatics, Imaging and Data Science, Faculty of Biology, Medicine and Health, University of Manchester, Manchester Academic Health Science Centre, Manchester, UK
| | - David A Jenkins
- Division of Informatics, Imaging and Data Science, Faculty of Biology, Medicine and Health, University of Manchester, Manchester Academic Health Science Centre, Manchester, UK
- NIHR Greater Manchester Patient Safety Translational Research Centre, The University of Manchester, Manchester, UK
| | - Glen P Martin
- Division of Informatics, Imaging and Data Science, Faculty of Biology, Medicine and Health, University of Manchester, Manchester Academic Health Science Centre, Manchester, UK
| | - Niels Peek
- Division of Informatics, Imaging and Data Science, Faculty of Biology, Medicine and Health, University of Manchester, Manchester Academic Health Science Centre, Manchester, UK
- NIHR Greater Manchester Patient Safety Translational Research Centre, The University of Manchester, Manchester, UK
- NIHR Manchester Biomedical Research Centre, The University of Manchester, Manchester Academic Health Science Centre, Manchester, UK
| |
|
39
|
Le Borgne F, Chatton A, Léger M, Lenain R, Foucher Y. G-computation and machine learning for estimating the causal effects of binary exposure statuses on binary outcomes. Sci Rep 2021; 11:1435. [PMID: 33446866 PMCID: PMC7809122 DOI: 10.1038/s41598-021-81110-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 07/06/2020] [Accepted: 12/24/2020] [Indexed: 11/09/2022] Open
Abstract
In clinical research, there is a growing interest in the use of propensity score-based methods to estimate causal effects. G-computation is an alternative because of its high statistical power. Machine learning is also increasingly used because of its possible robustness to model misspecification. In this paper, we aimed to propose an approach that combines machine learning and G-computation when both the outcome and the exposure status are binary and is able to deal with small samples. We evaluated the performances of several methods, including penalized logistic regressions, a neural network, a support vector machine, boosted classification and regression trees, and a super learner through simulations. We proposed six different scenarios characterised by various sample sizes, numbers of covariates and relationships between covariates, exposure statuses, and outcomes. We have also illustrated the application of these methods, in which they were used to estimate the efficacy of barbiturates prescribed during the first 24 h of an episode of intracranial hypertension. In the context of GC, for estimating the individual outcome probabilities in two counterfactual worlds, we reported that the super learner tended to outperform the other approaches in terms of both bias and variance, especially for small sample sizes. The support vector machine performed well, but its mean bias was slightly higher than that of the super learner. In the investigated scenarios, G-computation associated with the super learner was a performant method for drawing causal inferences, even from small sample sizes.
Affiliation(s)
- Florent Le Borgne
- INSERM UMR 1246 - SPHERE, Nantes University, Tours University, 22 Boulevard Bénoni Goullin, 44200, Nantes, France; IDBC-A2COM, Pacé, France
| | - Arthur Chatton
- INSERM UMR 1246 - SPHERE, Nantes University, Tours University, 22 Boulevard Bénoni Goullin, 44200, Nantes, France; IDBC-A2COM, Pacé, France
| | - Maxime Léger
- INSERM UMR 1246 - SPHERE, Nantes University, Tours University, 22 Boulevard Bénoni Goullin, 44200, Nantes, France; Département D'Anesthésie Réanimation, Centre Hospitalier Universitaire D'Angers, Angers, France
| | - Rémi Lenain
- INSERM UMR 1246 - SPHERE, Nantes University, Tours University, 22 Boulevard Bénoni Goullin, 44200, Nantes, France; Lille University Hospital, Lille, France
| | - Yohann Foucher
- INSERM UMR 1246 - SPHERE, Nantes University, Tours University, 22 Boulevard Bénoni Goullin, 44200, Nantes, France; Nantes University Hospital, Nantes, France
| |
|
40
|
Oskar S, Stingone JA. Machine Learning Within Studies of Early-Life Environmental Exposures and Child Health: Review of the Current Literature and Discussion of Next Steps. Curr Environ Health Rep 2021; 7:170-184. [PMID: 32578067 DOI: 10.1007/s40572-020-00282-5] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 12/12/2022]
Abstract
PURPOSE OF REVIEW The goal of this article is to review the use of machine learning (ML) within studies of environmental exposures and children's health, identify common themes across studies, and provide recommendations to advance their use in research and practice. RECENT FINDINGS We identified 42 articles reporting upon the use of ML within studies of environmental exposures and children's health between 2017 and 2019. The common themes among the articles were analysis of mixture data, exposure prediction, disease prediction and forecasting, analysis of complex data, and causal inference. With the increasing complexity of environmental health data, we anticipate greater use of ML to address the challenges that cannot be handled by traditional analytics. In order for these methods to beneficially impact public health, the ML techniques we use need to be appropriate for our study questions, rigorously evaluated and reported in a way that can be critically assessed by the scientific community.
Affiliation(s)
- Sabine Oskar
- Department of Epidemiology, Columbia University Mailman School of Public Health, 722 West 168th St, Room 1608, New York, NY, 10032, USA
| | - Jeanette A Stingone
- Department of Epidemiology, Columbia University Mailman School of Public Health, 722 West 168th St, Room 1608, New York, NY, 10032, USA.
| |
|
41
|
Mitchell EG, Tabak EG, Levine ME, Mamykina L, Albers DJ. Enabling personalized decision support with patient-generated data and attributable components. J Biomed Inform 2020; 113:103639. [PMID: 33316422 DOI: 10.1016/j.jbi.2020.103639] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/22/2019] [Revised: 08/03/2020] [Accepted: 11/26/2020] [Indexed: 10/22/2022]
Abstract
Decision-making related to health is complex. Machine learning (ML) and patient generated data can identify patterns and insights at the individual level, where human cognition falls short, but not all ML-generated information is of equal utility for making health-related decisions. We develop and apply attributable components analysis (ACA), a method inspired by optimal transport theory, to type 2 diabetes self-monitoring data to identify patterns of association between nutrition and blood glucose control. In comparison with linear regression, we found that ACA offers a number of characteristics that make it promising for use in decision support applications. For example, ACA was able to identify non-linear relationships, was more robust to outliers, and offered broader and more expressive uncertainty estimates. In addition, our results highlight a tradeoff between model accuracy and interpretability, and we discuss implications for ML-driven decision support systems.
Affiliation(s)
- Elliot G Mitchell
- Department of Biomedical Informatics, Columbia University, New York, NY, USA.
| | - Esteban G Tabak
- Courant Institute of Mathematical Sciences, New York, NY, USA.
| | | | - Lena Mamykina
- Department of Biomedical Informatics, Columbia University, New York, NY, USA.
| | - David J Albers
- Department of Biomedical Informatics, Columbia University, New York, NY, USA; Department of Pediatrics, Division of Informatics, University of Colorado, Aurora, CO, USA.
| |
|
42
|
Blakely T, Moss R, Collins J, Mizdrak A, Singh A, Carvalho N, Wilson N, Geard N, Flaxman A. Proportional multistate lifetable modelling of preventive interventions: concepts, code and worked examples. Int J Epidemiol 2020; 49:1624-1636. [PMID: 33038892 DOI: 10.1093/ije/dyaa132] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 04/27/2020] [Accepted: 07/14/2020] [Indexed: 11/12/2022] Open
Abstract
Burden of Disease studies, such as the Global Burden of Disease (GBD) Study, quantify health loss in disability-adjusted life-years. However, these studies stop short of quantifying the future impact of interventions that shift risk factor distributions, allowing for trends and time lags. This methodology paper explains how proportional multistate lifetable (PMSLT) modelling quantifies intervention impacts, using comparisons between three tobacco control case studies [eradication of tobacco, tobacco-free generation (i.e. the age at which tobacco can be legally purchased is lifted by 1 year of age for each calendar year) and tobacco tax]. We also illustrate the importance of epidemiological specification of business-as-usual in the comparator arm that the intervention acts on, by demonstrating variations in simulated health gains when incorrectly: (i) assuming no decreasing trend in tobacco prevalence; and (ii) not including time lags from quitting tobacco to changing disease incidence. In conjunction with increasing availability of baseline and forecast demographic and epidemiological data, PMSLT modelling is well suited to future multiple country comparisons to better inform national, regional and global prioritization of preventive interventions. To facilitate use of PMSLT, we introduce a Python-based modelling framework and associated tools that facilitate the construction, calibration and analysis of PMSLT models.
Affiliation(s)
- Tony Blakely
- Melbourne School of Population and Global Health, University of Melbourne, Melbourne, VIC, Australia
| | - Rob Moss
- Melbourne School of Population and Global Health, University of Melbourne, Melbourne, VIC, Australia
| | - James Collins
- Institute of Health Metrics and Evaluation, University of Washington, Seattle, WA, USA
| | - Anja Mizdrak
- Department of Public Health, University of Otago, Wellington, New Zealand
| | - Ankur Singh
- Melbourne School of Population and Global Health, University of Melbourne, Melbourne, VIC, Australia
| | - Natalie Carvalho
- Melbourne School of Population and Global Health, University of Melbourne, Melbourne, VIC, Australia
| | - Nick Wilson
- Department of Public Health, University of Otago, Wellington, New Zealand
| | - Nicholas Geard
- Computing and Information Systems, University of Melbourne, Melbourne, VIC, Australia
| | - Abraham Flaxman
- Institute of Health Metrics and Evaluation, University of Washington, Seattle, WA, USA
| |
|
43
|
Sengupta PP, Shrestha S, Berthon B, Messas E, Donal E, Tison GH, Min JK, D'hooge J, Voigt JU, Dudley J, Verjans JW, Shameer K, Johnson K, Lovstakken L, Tabassian M, Piccirilli M, Pernot M, Yanamala N, Duchateau N, Kagiyama N, Bernard O, Slomka P, Deo R, Arnaout R. Proposed Requirements for Cardiovascular Imaging-Related Machine Learning Evaluation (PRIME): A Checklist: Reviewed by the American College of Cardiology Healthcare Innovation Council. JACC Cardiovasc Imaging 2020; 13:2017-2035. [PMID: 32912474 PMCID: PMC7953597 DOI: 10.1016/j.jcmg.2020.07.015] [Citation(s) in RCA: 121] [Impact Index Per Article: 30.3] [Received: 12/18/2019] [Revised: 07/15/2020] [Accepted: 07/16/2020] [Indexed: 12/20/2022]
Abstract
Machine learning (ML) has been increasingly used within cardiology, particularly in the domain of cardiovascular imaging. Due to the inherent complexity and flexibility of ML algorithms, inconsistencies in the model performance and interpretation may occur. Several review articles have been recently published that introduce the fundamental principles and clinical application of ML for cardiologists. This paper builds on these introductory principles and outlines a more comprehensive list of crucial responsibilities that need to be completed when developing ML models. This paper aims to serve as a scientific foundation to aid investigators, data scientists, authors, editors, and reviewers involved in machine learning research with the intent of uniform reporting of ML investigations. An independent multidisciplinary panel of ML experts, clinicians, and statisticians worked together to review the theoretical rationale underlying 7 sets of requirements that may reduce algorithmic errors and biases. Finally, the paper summarizes a list of reporting items as an itemized checklist that highlights steps for ensuring correct application of ML models and the consistent reporting of model specifications and results. It is expected that the rapid pace of research and development and the increased availability of real-world evidence may require periodic updates to the checklist.
Affiliation(s)
- Partho P Sengupta
- West Virginia University Heart and Vascular Institute, Division of Cardiology, Morgantown, West Virginia.
| | - Sirish Shrestha
- West Virginia University Heart and Vascular Institute, Division of Cardiology, Morgantown, West Virginia
| | - Béatrice Berthon
- Physique pour la Médecine Paris, Inserm U1273, CNRS FRE 2031, ESPCI Paris, PSL Research University, Paris, France
| | - Emmanuel Messas
- Université Paris Descartes, Sorbonne Paris Cité, Paris, France
| | - Erwan Donal
- Département de Cardiologie et Maladies Vasculaires, Service de Cardiologie et maladies vasculaires, CHU Rennes, Rennes, France
| | - Geoffrey H Tison
- Division of Cardiology, Department of Medicine, University of California San Francisco, San Francisco, California
| | | | - Jan D'hooge
- Laboratory on Cardiovascular Imaging and Dynamics, Department of Cardiovascular Sciences, KU Leuven, Leuven, Belgium
| | - Jens-Uwe Voigt
- Department of Cardiovascular Science, KU Leuven, Leuven, Belgium; Department of Cardiovascular Diseases, University Hospitals Leuven, Belgium
| | - Joel Dudley
- Department of Genetics and Genomic Sciences and Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, New York; Institute for Next Generation Healthcare, Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York
| | - Johan W Verjans
- Australian Institute for Machine Learning, University of Adelaide, North Terrace, Adelaide, South Australia, Australia; Department of Circulation and Medical Imaging, Faculty of Medicine and Health Sciences, Norwegian University of Science and Technology, Trondheim, Norway
| | - Khader Shameer
- Department of Genetics and Genomic Sciences and Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, New York; Institute for Next Generation Healthcare, Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York
| | - Kipp Johnson
- Department of Genetics and Genomic Sciences and Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, New York; Institute for Next Generation Healthcare, Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York
| | - Lasse Lovstakken
- Department of Circulation and Medical Imaging, Faculty of Medicine and Health Sciences, Norwegian University of Science and Technology, Trondheim, Norway
| | - Mahdi Tabassian
- Laboratory on Cardiovascular Imaging and Dynamics, Department of Cardiovascular Sciences, KU Leuven, Leuven, Belgium
| | - Marco Piccirilli
- West Virginia University Heart and Vascular Institute, Division of Cardiology, Morgantown, West Virginia
| | - Mathieu Pernot
- Physique pour la Médecine Paris, Inserm U1273, CNRS FRE 2031, ESPCI Paris, PSL Research University, Paris, France
| | - Naveena Yanamala
- West Virginia University Heart and Vascular Institute, Division of Cardiology, Morgantown, West Virginia
| | - Nicolas Duchateau
- CREATIS, CNRS UMR 5220, INSERM U1206, Université Lyon 1, INSA-LYON, France
| | - Nobuyuki Kagiyama
- West Virginia University Heart and Vascular Institute, Division of Cardiology, Morgantown, West Virginia
| | - Olivier Bernard
- CREATIS, CNRS UMR 5220, INSERM U1206, Université Lyon 1, INSA-LYON, France
| | - Piotr Slomka
- Department of Imaging and Medicine, Cedars-Sinai Medical Center, Los Angeles, California
| | - Rahul Deo
- Division of Cardiology, Department of Medicine, University of California San Francisco, San Francisco, California
| | - Rima Arnaout
- Division of Cardiology, Department of Medicine, University of California San Francisco, San Francisco, California
| |
|
44
|
Gokhale M, Stürmer T, Buse JB. Real-world evidence: the devil is in the detail. Diabetologia 2020; 63:1694-1705. [PMID: 32666226 PMCID: PMC7448554 DOI: 10.1007/s00125-020-05217-1] [Citation(s) in RCA: 38] [Impact Index Per Article: 9.5] [Received: 01/07/2020] [Accepted: 05/01/2020] [Indexed: 12/12/2022]
Abstract
Much has been written about real-world evidence (RWE), a concept that offers an understanding of the effects of healthcare interventions using routine clinical data. The reflection of diverse real-world practices is a double-edged sword that makes RWE attractive but also opens doors to several biases that need to be minimised both in the design and analytical phases of non-experimental studies. Additionally, it is critical to ensure that researchers who conduct these studies possess adequate methodological expertise and ability to accurately implement these methods. Critical design elements to be considered should include a clearly defined research question using a causal inference framework, choice of a fit-for-purpose data source, inclusion of new users of a treatment with comparators that are as similar as possible to that group, accurately classifying person-time and deciding censoring approaches. Having taken measures to minimise bias 'by design', the next step is to implement appropriate analytical techniques (for example propensity scores) to minimise the remnant potential biases. A clear protocol should be provided at the beginning of the study and a report of the results after, including caveats to consider. We also point the readers to readings on some novel analytical methods as well as newer areas of application of RWE. While there is no one-size-fits-all solution to evaluating RWE studies, we have focused our discussion on key methods and issues commonly encountered in comparative observational cohort studies with the hope that readers are better equipped to evaluate non-experimental studies that they encounter in the future.
Affiliation(s)
- Mugdha Gokhale
- Pharmacoepidemiology, Center for Observational & Real-World Evidence, Merck, 770 Sumneytown Pike, West Point, PA, 19486, USA.
- Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina, Chapel Hill, NC, USA.
| | - Til Stürmer
- Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina, Chapel Hill, NC, USA
| | - John B Buse
- Department of Medicine, University of North Carolina, Chapel Hill, NC, USA
| |
|
45
|
Goldstein ND, LeVasseur MT, McClure LA. On the Convergence of Epidemiology, Biostatistics, and Data Science. HARVARD DATA SCIENCE REVIEW 2020; 2. [PMID: 35005710 DOI: 10.1162/99608f92.9f0215e6] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 11/04/2022] Open
Abstract
Epidemiology, biostatistics, and data science are broad disciplines that incorporate a variety of substantive areas. Common among them is a focus on quantitative approaches for solving intricate problems. When the substantive area is health and health care, the overlap is further cemented. Researchers in these disciplines are fluent in statistics, data management and analysis, and health and medicine, to name but a few competencies. Yet there are important and perhaps mutually exclusive attributes of these fields that warrant a tighter integration. For example, epidemiologists receive substantial training in the science of study design, measurement, and the art of causal inference. Biostatisticians are well versed in the theory and application of methodological techniques, as well as the design and conduct of public health research. Data scientists receive equivalently rigorous training in computational and visualization approaches for high-dimensional data. Compared to data scientists, epidemiologists and biostatisticians may have less expertise in computer science and informatics, while data scientists may benefit from a working knowledge of study design and causal inference. Collaboration and cross-training offer the opportunity to share and learn of the constructs, frameworks, theories, and methods of these fields with the goal of offering fresh and innovative perspectives for tackling challenging problems in health and health care. In this article, we first describe the evolution of these fields focusing on their convergence in the era of electronic health data, notably electronic medical records (EMRs). Next we present how a collaborative team may design, analyze, and implement an EMR-based study. Finally, we review the curricula at leading epidemiology, biostatistics, and data science training programs, identifying gaps and offering suggestions for the fields moving forward.
Affiliation(s)
- Neal D Goldstein
- Michael T LeVasseur
- Leslie A McClure
- Neal D. Goldstein is an assistant research professor, Michael T. LeVasseur is a visiting assistant teaching professor, and Leslie A. McClure is a professor and chair of the Department of Epidemiology and Biostatistics at the Drexel University Dornsife School of Public Health, Philadelphia, PA, USA
|