Reference Citation Analysis

For: Blakely T, Lynch J, Simons K, Bentley R, Rose S. Reflection on modern methods: when worlds collide-prediction, machine learning and causal inference. Int J Epidemiol 2021;49:2058-2064. [PMID: 31298274 DOI: 10.1093/ije/dyz132] [Citation(s) in RCA: 35] [Impact Index Per Article: 11.7] [Accepted: 06/11/2019] [Indexed: 02/06/2023]

Cited by Other Article(s)

Zhang Y, Kreif N, Gc VS, Manca A. Machine Learning Methods to Estimate Individualized Treatment Effects for Use in Health Technology Assessment. Med Decis Making 2024:272989X241263356. [PMID: 39056320 DOI: 10.1177/0272989x241263356] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/28/2024]

Abstract

BACKGROUND

Recent developments in causal inference and machine learning (ML) allow for the estimation of individualized treatment effects (ITEs), which reveal whether treatment effectiveness varies according to patients' observed covariates. ITEs can be used to stratify health policy decisions according to individual characteristics and potentially achieve greater population health. Little is known about the appropriateness of available ML methods for use in health technology assessment.

METHODS

In this scoping review, we evaluate ML methods available for estimating ITEs, aiming to help practitioners assess their suitability in health technology assessment. We present a taxonomy of ML approaches, categorized by key challenges in health technology assessment using observational data, including handling time-varying confounding and time-to-event data and quantifying uncertainty.

RESULTS

We found a wide range of algorithms for simpler settings with baseline confounding and continuous or binary outcomes. Not many ML algorithms can handle time-varying or unobserved confounding, and at the time of writing, no ML algorithm was capable of estimating ITEs for time-to-event outcomes while accounting for time-varying confounding. Many of the ML algorithms that estimate ITEs in longitudinal settings do not formally quantify uncertainty around the point estimates.

LIMITATIONS

This scoping review may not cover all relevant ML methods and algorithms as they are continuously evolving.

CONCLUSIONS

Existing ML methods available for ITE estimation are limited in handling important challenges posed by observational data when used for cost-effectiveness analysis, such as time-to-event outcomes, time-varying and hidden confounding, or the need to estimate sampling uncertainty around the estimates.

IMPLICATIONS

ML methods are promising but need further development before they can be used to estimate ITEs for health technology assessments.

HIGHLIGHTS

Estimating individualized treatment effects (ITEs) using observational data and machine learning (ML) can support personalized treatment advice and help deliver more customized information on the effectiveness and cost-effectiveness of health technologies.
ML methods for ITE estimation are mostly designed for handling confounding at baseline but not time-varying or unobserved confounding. The few models that account for time-varying confounding are designed for continuous or binary outcomes, not time-to-event outcomes.
Not all ML methods for estimating ITEs can quantify the uncertainty of their predictions.
Future work on developing ML that addresses the concerns summarized in this review is needed before these methods can be widely used in clinical and health technology assessment-like decision making.

Rivera AS, Pierce JB, Sinha A, Pawlowski AE, Lloyd-Jones DM, Lee YC, Feinstein MJ, Petito LC. Designing target trials using electronic health records: A case study of second-line disease-modifying anti-rheumatic drugs and cardiovascular disease outcomes in patients with rheumatoid arthritis. PLoS One 2024;19:e0305467. [PMID: 38875273 PMCID: PMC11178161 DOI: 10.1371/journal.pone.0305467] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/02/2023] [Accepted: 05/30/2024] [Indexed: 06/16/2024]

Abstract

BACKGROUND

Emulation of the "target trial" (TT), a hypothetical pragmatic randomized controlled trial (RCT), using observational data can be used to mitigate issues commonly encountered in comparative effectiveness research (CER) when randomized trials are not logistically, ethically, or financially feasible. However, cardiovascular (CV) health research has been slow to adopt TT emulation. Here, we demonstrate the design and analysis of a TT emulation using electronic health records to study the comparative effectiveness of the addition of a disease-modifying anti-rheumatic drug (DMARD) to a regimen of methotrexate on CV events among rheumatoid arthritis (RA) patients.

METHODS

We used data from an electronic medical records-based cohort of RA patients from Northwestern Medicine to emulate the TT. Follow-up began 3 months after initial prescription of methotrexate (MTX) (2000-2020) and included all available follow-up through June 30, 2020. Weighted pooled logistic regression was used to estimate differences in CVD risk and survival. Cloning was used to handle immortal time bias, and weights were used to improve baseline and time-varying covariate imbalance.

RESULTS

We identified 659 eligible people with RA with average follow-up of 46 months and 31 MACE events. The month 24 adjusted risk difference for MACE comparing initiation vs non-initiation of a DMARD was -1.47% (95% confidence interval [CI]: -4.74, 1.95%), and the marginal hazard ratio (HR) was 0.72 (95% CI: 0.71, 1.23). In analyses subject to immortal time bias, the HR was 0.62 (95% CI: 0.29-1.44).

CONCLUSION

In this sample, we did not observe evidence of differences in risk of MACE, a finding that is compatible with previously published meta-analyses of RCTs. Thoughtful application of the TT framework provides opportunities to conduct CER in observational data. Benchmarking results of observational analyses to previously published RCTs can lend credibility to interpretation.

Petersen GL, Jørgensen TSH, Mathisen J, Osler M, Mortensen EL, Molbo D, Hougaard CØ, Lange T, Lund R. Inverse probability weighting for self-selection bias correction in the investigation of social inequality in mortality. Int J Epidemiol 2024;53:dyae097. [PMID: 38996447 DOI: 10.1093/ije/dyae097] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2023] [Accepted: 07/04/2024] [Indexed: 07/14/2024]

Abstract

BACKGROUND

Empirical evaluation of inverse probability weighting (IPW) for self-selection bias correction is inaccessible without the full source population. We aimed to: (i) investigate how self-selection biases frequency and association measures and (ii) assess self-selection bias correction using IPW in a cohort with register linkage.

METHODS

The source population included 17 936 individuals invited to the Copenhagen Aging and Midlife Biobank during 2009-11 (ages 49-63 years), of whom 7185 (40.1%) participated. Register data were obtained for every invited person from 7 years before invitation to the end of 2020. The association between education and mortality was estimated using Cox regression models among participants, IPW participants and the source population.

RESULTS

Participants had higher socioeconomic position and fewer hospital contacts before baseline than the source population. Frequency measures of participants approached those of the source population after IPW. Compared with primary/lower secondary education, upper secondary, short tertiary, bachelor and master/doctoral were associated with reduced risk of death among participants (adjusted hazard ratio [95% CI]: 0.60 [0.46; 0.77], 0.68 [0.42; 1.11], 0.37 [0.25; 0.54], 0.28 [0.18; 0.46], respectively). IPW changed the estimates marginally (0.59 [0.45; 0.77], 0.57 [0.34; 0.93], 0.34 [0.23; 0.50], 0.24 [0.15; 0.39]) but not only towards those of the source population (0.57 [0.51; 0.64], 0.43 [0.32; 0.60], 0.38 [0.32; 0.47], 0.22 [0.16; 0.29]).

CONCLUSIONS

Frequency measures of study participants may not reflect the source population in the presence of self-selection, but the impact on association measures can be limited. IPW may be useful for (self-)selection bias correction, but the returned results can still reflect residual or other biases and random errors.
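
As a rough illustration of the weighting idea summarized in this abstract, the sketch below reweights participants by the inverse of their estimated probability of participating. It is a minimal sketch, not the authors' analysis: the function names, the outcome values and the participation probabilities are all hypothetical, and in practice the probabilities would come from a model fitted on register data for the whole invited population.

```python
def ipw_weights(participation_probs):
    """Inverse probability weights: 1 / estimated probability of participating."""
    return [1.0 / p for p in participation_probs]


def weighted_mean(values, weights):
    """Weighted mean of a frequency measure (e.g. a 0/1 indicator)."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical participants: a binary characteristic and an assumed
# estimated probability of taking part in the study.
outcomes = [1, 0, 0, 1]
probs = [0.8, 0.8, 0.4, 0.2]

weights = ipw_weights(probs)
naive = sum(outcomes) / len(outcomes)        # unweighted frequency
adjusted = weighted_mean(outcomes, weights)  # IPW-adjusted frequency
```

People who were unlikely to participate receive larger weights, so the adjusted frequency moves toward what would be expected in the source population, provided the participation model is adequate.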

Norris T, Mitchell JJ, Blodgett JM, Hamer M, Pinto Pereira SM. Does cardiorespiratory fitness mediate or moderate the association between mid-life physical activity frequency and cognitive function? findings from the 1958 British birth cohort study. PLoS One 2024;19:e0295092. [PMID: 38848437 PMCID: PMC11161044 DOI: 10.1371/journal.pone.0295092] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/17/2023] [Accepted: 05/23/2024] [Indexed: 06/09/2024]

Abstract

BACKGROUND

Physical activity (PA) is associated with a lower risk of cognitive decline and all-cause dementia in later life. Pathways underpinning this association are unclear but may involve mediation and/or moderation by cardiorespiratory fitness (CRF).

METHODS

Data on PA frequency (exposure) at 42y, non-exercise testing CRF (NETCRF, mediator/moderator) at 45y and overall cognitive function (outcome) at 50y were obtained from 9,385 participants (50.8% female) in the 1958 British birth cohort study. We used a four-way decomposition approach to examine the relative contributions of mediation and moderation by NETCRF on the association between PA frequency at 42y and overall cognitive function at 50y.

RESULTS

In males, the estimated overall effect of 42y PA ≥once per week (vs.

CONCLUSION

We present the first evidence from a four-way decomposition analysis of the potential contribution that CRF plays in the relationship between mid-life PA frequency and subsequent cognitive function. Our lack of evidence in support of CRF mediating or moderating the PA frequency-cognitive function association suggests that other pathways underpin this association.

Bai Y, Shi X, Du J. A computable biomedical knowledge system: Toward rapidly building candidate-directed acyclic graphs. J Evid Based Med 2024;17:307-316. [PMID: 38556728 DOI: 10.1111/jebm.12602] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/10/2024] [Accepted: 03/17/2024] [Indexed: 04/02/2024]

Yoon J, Kim JH, Chung Y, Park J, Leigh JH, Kim SS. Change in employment status and its causal effect on suicidal ideation and depressive symptoms: A marginal structural model with machine learning algorithms. Scand J Work Environ Health 2024;50:218-227. [PMID: 38466615 PMCID: PMC11106614 DOI: 10.5271/sjweh.4150] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/04/2023] [Indexed: 03/13/2024]

Kuwornu JP, Maldonado F, Groot G, Cooper EJ, Penz E, Sommer L, Reid A, Marciniuk DD. An economic evaluation of chronic obstructive pulmonary disease clinical pathway in Saskatchewan, Canada: Data-driven techniques to identify cost-effectiveness among patient subgroups. PLoS One 2024;19:e0301334. [PMID: 38557914 PMCID: PMC10984414 DOI: 10.1371/journal.pone.0301334] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/24/2023] [Accepted: 03/12/2024] [Indexed: 04/04/2024]

Abstract

BACKGROUND

Saskatchewan has implemented care pathways for several common health conditions. To date, there has not been any cost-effectiveness evaluation of care pathways in the province. The objective of this study was to evaluate the real-world cost-effectiveness of a chronic obstructive pulmonary disease (COPD) care pathway program in Saskatchewan.

METHODS

Using patient-level administrative health data, we identified adults (35+ years) with COPD diagnosis recruited into the care pathway program in Regina between April 1, 2018, and March 31, 2019 (N = 759). The control group comprised adults (35+ years) with COPD who lived in Saskatoon during the same period (N = 759). The control group was matched to the intervention group using propensity scores. Costs were calculated at the patient level. The outcome measure was the number of days patients remained without experiencing COPD exacerbation within 1-year follow-up. Both manual and data-driven policy learning approaches were used to assess heterogeneity in the cost-effectiveness by patient demographic and disease characteristics. Bootstrapping was used to quantify uncertainty in the results.

RESULTS

In the overall sample, the estimates indicate that the COPD care pathway was not cost-effective at willingness to pay (WTP) threshold values in the range of $1,000 to $5,000/exacerbation day averted. The manual subgroup analyses show the COPD care pathway was dominant among patients with comorbidities and among patients aged 65 years or younger at the WTP threshold of $2,000/exacerbation day averted. Although profiles similar to those identified in the manual subgroup analyses were confirmed, the data-driven policy learning approach suggests more nuanced demographic and disease profiles for which the care pathway would be most appropriate.

CONCLUSIONS

Both manual subgroup analysis and data-driven policy learning approach showed that the COPD care pathway consistently produced cost savings and better health outcomes among patients with comorbidities or among those relatively younger. The care pathway was not cost-effective in the entire sample.

Post RAJ, Petkovic M, van den Heuvel IL, van den Heuvel ER. Flexible Machine Learning Estimation of Conditional Average Treatment Effects: A Blessing and a Curse. Epidemiology 2024;35:32-40. [PMID: 37889951 DOI: 10.1097/ede.0000000000001684] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/29/2023]

Kaas-Hansen BS, Granholm A, Sivapalan P, Anthon CT, Schjørring OL, Maagaard M, Kjaer MBN, Mølgaard J, Ellekjaer KL, Fagerberg SK, Lange T, Møller MH, Perner A. Real-world causal evidence for planned predictive enrichment in critical care trials: A scoping review. Acta Anaesthesiol Scand 2024;68:16-25. [PMID: 37649412 DOI: 10.1111/aas.14321] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/04/2023] [Revised: 08/01/2023] [Accepted: 08/12/2023] [Indexed: 09/01/2023]

Newby D, Orgeta V, Marshall CR, Lourida I, Albertyn CP, Tamburin S, Raymont V, Veldsman M, Koychev I, Bauermeister S, Weisman D, Foote IF, Bucholc M, Leist AK, Tang EYH, Tai XY, Llewellyn DJ, Ranson JM. Artificial intelligence for dementia prevention. Alzheimers Dement 2023;19:5952-5969. [PMID: 37837420 PMCID: PMC10843720 DOI: 10.1002/alz.13463] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/12/2023] [Revised: 08/01/2023] [Accepted: 08/07/2023] [Indexed: 10/16/2023]

Affiliation(s)

Danielle Newby: University of Oxford, Department of Psychiatry, Warneford Hospital, Oxford, OX3 7JX, UK
Vasiliki Orgeta: Division of Psychiatry, University College London, London, W1T 7BN, UK
Charles R Marshall: Preventive Neurology Unit, Wolfson Institute of Population Health, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, E1 4NS, UK; Department of Neurology, Royal London Hospital, London, E1 1BB, UK
Ilianna Lourida: Population Health Sciences Institute, Newcastle University, Newcastle, NE2 4AX, UK; University of Exeter Medical School, Exeter, EX1 2HZ, UK
Christopher P Albertyn: Department of Old Age Psychiatry, Institute of Psychiatry, Psychology and Neuroscience, King’s College London, London, SE5 8AF, UK
Stefano Tamburin: Department of Neurosciences, Biomedicine and Movement Sciences, University of Verona, Verona, 37129, Italy
Vanessa Raymont: University of Oxford, Department of Psychiatry, Warneford Hospital, Oxford, OX3 7JX, UK
Michele Veldsman: Wellcome Centre for Integrative Neuroimaging, University of Oxford, Oxford, OX3 9DU, UK; Department of Experimental Psychology, University of Oxford, Oxford, OX2 6GG, UK
Ivan Koychev: University of Oxford, Department of Psychiatry, Warneford Hospital, Oxford, OX3 7JX, UK
Sarah Bauermeister: University of Oxford, Department of Psychiatry, Warneford Hospital, Oxford, OX3 7JX, UK
David Weisman: Abington Neurological Associates, Abington, PA 19001, USA
Isabelle F Foote: Preventive Neurology Unit, Wolfson Institute of Population Health, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, E1 4NS, UK; Institute for Behavioral Genetics, University of Colorado Boulder, Boulder, CO 80309, USA
Magda Bucholc: Cognitive Analytics Research Lab, School of Computing, Engineering & Intelligent Systems, Ulster University, Derry, BT48 7JL, UK
Anja K Leist: Institute for Research on Socio-Economic Inequality (IRSEI), Department of Social Sciences, University of Luxembourg, L-4365, Luxembourg
Eugene Y H Tang: Population Health Sciences Institute, Newcastle University, Newcastle, NE2 4AX, UK
Xin You Tai: Nuffield Department of Clinical Neuroscience, University of Oxford, Oxford, OX3 9DU, UK; Division of Clinical Neurology, John Radcliffe Hospital, Oxford University Hospitals Trust, Oxford, OX3 9DU, UK
The Deep Dementia Phenotyping (DEMON) Network
David J. Llewellyn: University of Exeter Medical School, Exeter, EX1 2HZ, UK; The Alan Turing Institute, London, NW1 2DB, UK
Janice M. Ranson: University of Exeter Medical School, Exeter, EX1 2HZ, UK

Hayward S, Parmesar K, Saleem MA. What is circulating factor disease and how is it currently explained? Pediatr Nephrol 2023;38:3513-3518. [PMID: 36952039 PMCID: PMC10514121 DOI: 10.1007/s00467-023-05928-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/13/2023] [Revised: 02/15/2023] [Accepted: 02/20/2023] [Indexed: 03/24/2023]

Hunter DJ, Holmes C. Where Medical Statistics Meets Artificial Intelligence. N Engl J Med 2023;389:1211-1219. [PMID: 37754286 DOI: 10.1056/nejmra2212850] [Citation(s) in RCA: 25] [Impact Index Per Article: 25.0] [Indexed: 09/28/2023]

de Jong VMT, Hoogland J, Moons KGM, Riley RD, Nguyen TL, Debray TPA. Propensity-based standardization to enhance the validation and interpretation of prediction model discrimination for a target population. Stat Med 2023;42:3508-3528. [PMID: 37311563 DOI: 10.1002/sim.9817] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/06/2021] [Revised: 02/26/2023] [Accepted: 05/19/2023] [Indexed: 06/15/2023]

Abstract

External validation of the discriminative ability of prediction models is of key importance. However, the interpretation of such evaluations is challenging, as the ability to discriminate depends on both the sample characteristics (ie, case-mix) and the generalizability of predictor coefficients, but most discrimination indices do not provide any insight into their respective contributions. To disentangle differences in discriminative ability across external validation samples due to a lack of model generalizability from differences in sample characteristics, we propose propensity-weighted measures of discrimination. These weighted metrics, which are derived from propensity scores for sample membership, are standardized for case-mix differences between the model development and validation samples, allowing for a fair comparison of discriminative ability in terms of model characteristics in a target population of interest. We illustrate our methods with the validation of eight prediction models for deep vein thrombosis in 12 external validation data sets and assess our methods in a simulation study. In the illustrative example, propensity score standardization reduced between-study heterogeneity of discrimination, indicating that between-study variability was partially attributable to case-mix. The simulation study showed that only flexible propensity-score methods (allowing for non-linear effects) produced unbiased estimates of model discrimination in the target population, and only when the positivity assumption was met. Propensity score-based standardization may facilitate the interpretation of (heterogeneity in) discriminative ability of a prediction model as observed across multiple studies, and may guide model updating strategies for a particular target population. Careful propensity score modeling with attention for non-linear relations is recommended.
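
To make the idea of a weighted discrimination measure concrete, the sketch below computes a weighted concordance (c) statistic in which every event/non-event pair counts in proportion to the product of the two observations' weights. This is one possible form under stated assumptions, not necessarily the estimator proposed in the paper: the function name and the example scores and weights are hypothetical, and the weights are assumed to have been derived beforehand from a propensity model for sample membership.

```python
def weighted_c_statistic(scores, outcomes, weights):
    """Weighted concordance: share of (event, non-event) pair weight in which
    the event has the higher predicted score; tied scores count as one half."""
    events = [(s, w) for s, y, w in zip(scores, outcomes, weights) if y == 1]
    non_events = [(s, w) for s, y, w in zip(scores, outcomes, weights) if y == 0]
    concordant = 0.0
    total = 0.0
    for s_e, w_e in events:
        for s_n, w_n in non_events:
            pair_weight = w_e * w_n
            total += pair_weight
            if s_e > s_n:
                concordant += pair_weight
            elif s_e == s_n:
                concordant += 0.5 * pair_weight
    return concordant / total


# Hypothetical validation sample: predicted risks, observed outcomes, and
# assumed propensity-based weights toward a target population.
scores = [0.9, 0.4, 0.5, 0.2]
outcomes = [1, 1, 0, 0]
unweighted = weighted_c_statistic(scores, outcomes, [1, 1, 1, 1])
standardized = weighted_c_statistic(scores, outcomes, [1, 2, 1, 1])
```

With equal weights this reduces to the ordinary c-statistic; up-weighting observations that resemble the target population changes the estimate only through case-mix, which is the contrast the abstract describes.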

Souza MCO, Cruz JC, Rocha BA, Maria Oliveira Souza J, Devóz PP, Santana A, Campíglia AD, Barbosa F. The influence of the co-exposure to polycyclic aromatic hydrocarbons and toxic metals on DNA damage in brazilian lactating women and their infants: A cross-sectional study using machine learning approaches. Chemosphere 2023;334:138975. [PMID: 37224977 DOI: 10.1016/j.chemosphere.2023.138975] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2022] [Revised: 03/29/2023] [Accepted: 05/16/2023] [Indexed: 05/26/2023]

Abstract

Polycyclic aromatic hydrocarbons (PAHs) and toxic metals are widespread pollutants of public health concern. The co-contamination of these chemicals in the environment is frequent, but relatively little is known about their combined toxicities. In this context, this study aimed to evaluate the influence of the co-exposure to PAHs and toxic metals on DNA damage in Brazilian lactating women and their infants using machine learning approaches. Data were collected from an observational, cross-sectional study with 96 lactating women and 96 infants living in two cities. The exposure to these pollutants was estimated by determining urinary levels of seven mono-hydroxylated PAH metabolites and the free form of three toxic metals. 8-Hydroxydeoxyguanosine (8-OHdG) levels in the urine were used as the oxidative stress biomarker and set as the outcome. Individual sociodemographic factors were also collected using questionnaires. Sixteen machine learning algorithms were trained using 10-fold cross-validation to investigate the associations of urinary OH-PAHs and metals with 8-OHdG levels. This approach was also compared with models obtained by multiple linear regression. The results showed that the urinary concentration of OH-PAHs was highly correlated between the mothers and their infants. Multiple linear regression did not show a statistically significant association between the contaminants and urinary 8-OHdG levels. Machine learning models indicated that none of the investigated variables showed predictive performance for 8-OHdG concentrations. In conclusion, PAHs and toxic metals were not associated with 8-OHdG levels in Brazilian lactating women and their infants. These novel results were obtained even after applying sophisticated statistical models to capture non-linear relationships. However, these findings should be interpreted cautiously because the exposure to the studied contaminants was considerably low, which may not reflect other populations at risk.

Affiliation(s)

Marília Cristina Oliveira Souza: ASTox Lab - Analytical and System Toxicology Laboratory, Department of Clinical Analyses, Toxicology and Food Sciences, School of Pharmaceutical Sciences of Ribeirão Preto, University of São Paulo, Avenida Do Café S/n, 14040-903, Ribeirão Preto, São Paulo, Brazil
Jonas Carneiro Cruz: ASTox Lab - Analytical and System Toxicology Laboratory, Department of Clinical Analyses, Toxicology and Food Sciences, School of Pharmaceutical Sciences of Ribeirão Preto, University of São Paulo, Avenida Do Café S/n, 14040-903, Ribeirão Preto, São Paulo, Brazil
Bruno Alves Rocha: ASTox Lab - Analytical and System Toxicology Laboratory, Department of Clinical Analyses, Toxicology and Food Sciences, School of Pharmaceutical Sciences of Ribeirão Preto, University of São Paulo, Avenida Do Café S/n, 14040-903, Ribeirão Preto, São Paulo, Brazil
Juliana Maria Oliveira Souza: Department of Biochemistry, Biological Sciences Institute, University of Juiz de Fora, Campus Universitário, Rua José Lourenço Kelmer, S/n - São Pedro, Juiz de Fora, MG, 36036-900, Brazil
Paula Pícoli Devóz: ASTox Lab - Analytical and System Toxicology Laboratory, Department of Clinical Analyses, Toxicology and Food Sciences, School of Pharmaceutical Sciences of Ribeirão Preto, University of São Paulo, Avenida Do Café S/n, 14040-903, Ribeirão Preto, São Paulo, Brazil
Anthony Santana: Department of Chemistry, University of Central Florida, Orlando, FL, 32816, USA
Andres Dobal Campíglia: Department of Chemistry, University of Central Florida, Orlando, FL, 32816, USA
Fernando Barbosa: ASTox Lab - Analytical and System Toxicology Laboratory, Department of Clinical Analyses, Toxicology and Food Sciences, School of Pharmaceutical Sciences of Ribeirão Preto, University of São Paulo, Avenida Do Café S/n, 14040-903, Ribeirão Preto, São Paulo, Brazil

Rivera AS, Beach LB. Unaddressed Sources of Bias Lead to Biased Conclusions About Sexual Orientation Change Efforts and Suicidality in Sexual Minority Individuals. Arch Sex Behav 2023;52:875-879. [PMID: 36472764 DOI: 10.1007/s10508-022-02498-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/06/2022] [Revised: 11/21/2022] [Accepted: 11/22/2022] [Indexed: 05/11/2023]

Roster KO, Martinelli T, Connaughton C, Santillana M, Rodrigues FA. Estimating the impact of the COVID-19 pandemic on dengue in Brazil. Res Sq 2023:rs.3.rs-2548491. [PMID: 36798282 PMCID: PMC9934738 DOI: 10.21203/rs.3.rs-2548491/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/11/2023]

Nghiem N, Atkinson J, Nguyen BP, Tran-Duy A, Wilson N. Predicting high health-cost users among people with cardiovascular disease using machine learning and nationwide linked social administrative datasets. Health Econ Rev 2023;13:9. [PMID: 36738348 PMCID: PMC9898915 DOI: 10.1186/s13561-023-00422-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/30/2022] [Accepted: 01/23/2023] [Indexed: 06/18/2023]

Cipriano LE. Evaluating the Impact and Potential Impact of Machine Learning on Medical Decision Making. Med Decis Making 2023;43:147-149. [PMID: 36575951 PMCID: PMC9827491 DOI: 10.1177/0272989x221146506] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/29/2022]

MacNell N, Feinstein L, Wilkerson J, Salo PM, Molsberry SA, Fessler MB, Thorne PS, Motsinger-Reif AA, Zeldin DC. Implementing machine learning methods with complex survey data: Lessons learned on the impacts of accounting sampling weights in gradient boosting. PLoS One 2023;18:e0280387. [PMID: 36638125 PMCID: PMC9838837 DOI: 10.1371/journal.pone.0280387] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 09/28/2022] [Accepted: 12/28/2022] [Indexed: 01/14/2023]

Abstract

Despite the prominent use of complex survey data and the growing popularity of machine learning methods in epidemiologic research, few machine learning software implementations offer options for handling complex samples. A major challenge impeding the broader incorporation of machine learning into epidemiologic research is incomplete guidance for analyzing complex survey data, including the importance of sampling weights for valid prediction in target populations. Using data from 15,820 participants in the 1988-1994 National Health and Nutrition Examination Survey cohort, we determined whether ignoring weights in gradient boosting models of all-cause mortality affected prediction, as measured by the F1 score and corresponding 95% confidence intervals. In simulations, we additionally assessed the impact of sample size, weight variability, predictor strength, and model dimensionality. In the National Health and Nutrition Examination Survey data, unweighted model performance was inflated compared to the weighted model (F1 score 81.9% [95% confidence interval: 81.2%, 82.7%] vs 77.4% [95% confidence interval: 76.1%, 78.6%]). However, the error was mitigated if the F1 score was subsequently recalculated with observed outcomes from the weighted dataset (F1: 77.0%; 95% confidence interval: 75.7%, 78.4%). In simulations, this finding held in the largest sample size (N = 10,000) under all analytic conditions assessed. For sample sizes <5,000, sampling weights had little impact in simulations that more closely resembled a simple random sample (low weight variability) or in models with strong predictors, but findings were inconsistent under other analytic scenarios. Failing to account for sampling weights in gradient boosting models may limit generalizability for data from complex surveys, dependent on sample size and other analytic properties. In the absence of software for configuring weighted algorithms, post-hoc re-calculations of unweighted model performance using weighted observed outcomes may more accurately reflect model prediction in target populations than ignoring weights entirely.
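
The post-hoc recalculation mentioned at the end of this abstract can be pictured as scoring an already-fitted model's predictions with each observation counted according to its sampling weight. The sketch below is a minimal illustration under that assumption, not the authors' code: the function name and the example labels, predictions and weights are hypothetical.

```python
def weighted_f1(y_true, y_pred, weights):
    """F1 score in which each observation contributes its sampling weight
    to the true-positive, false-positive and false-negative totals."""
    tp = sum(w for t, p, w in zip(y_true, y_pred, weights) if t == 1 and p == 1)
    fp = sum(w for t, p, w in zip(y_true, y_pred, weights) if t == 0 and p == 1)
    fn = sum(w for t, p, w in zip(y_true, y_pred, weights) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical predictions from a model trained without weights.
y_true = [1, 1, 0, 0]
y_pred = [1, 0, 1, 0]
sampling_weights = [1, 3, 1, 1]  # assumed survey weights

f1_unweighted = weighted_f1(y_true, y_pred, [1, 1, 1, 1])
f1_weighted = weighted_f1(y_true, y_pred, sampling_weights)
```

In this toy case the missed positive carries a large sampling weight, so the weighted F1 is lower than the unweighted one, which mirrors the direction of the inflation reported in the abstract.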

Bowe AK, Lightbody G, Staines A, Murray DM. Big data, machine learning, and population health: predicting cognitive outcomes in childhood. Pediatr Res 2023;93:300-307. [PMID: 35681091 PMCID: PMC7614199 DOI: 10.1038/s41390-022-02137-1] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 03/02/2022] [Revised: 05/05/2022] [Accepted: 05/17/2022] [Indexed: 11/09/2022]

Abstract

The application of machine learning (ML) to address population health challenges has received much less attention than its application in the clinical setting. One such challenge is addressing disparities in early childhood cognitive development-a complex public health issue rooted in the social determinants of health, exacerbated by inequity, characterised by intergenerational transmission, and which will continue unabated without novel approaches to address it. Early life, the period of optimal neuroplasticity, presents a window of opportunity for early intervention to improve cognitive development. Unfortunately for many, this window will be missed, and intervention may never occur or occur only when overt signs of cognitive delay manifest. In this review, we explore the potential value of ML and big data analysis in the early identification of children at risk for poor cognitive outcome, an area where there is an apparent dearth of research. We compare and contrast traditional statistical methods with ML approaches, provide examples of how ML has been used to date in the field of neurodevelopmental disorders, and present a discussion of the opportunities and risks associated with its use at a population level. The review concludes by highlighting potential directions for future research in this area. IMPACT: To date, the application of machine learning to address population health challenges in paediatrics lags behind other clinical applications. This review provides an overview of the public health challenge we face in addressing disparities in childhood cognitive development and focuses on the cornerstone of early intervention. Recent advances in our ability to collect large volumes of data, and in analytic capabilities, provide a potential opportunity to improve current practices in this field. This review explores the potential role of machine learning and big data analysis in the early identification of children at risk for poor cognitive outcomes.


Søegaard SH, Spanggaard M, Rostgaard K, Kamper-Jørgensen M, Stensballe LG, Schmiegelow K, Hjalgrim H. Childcare attendance and risk of infections in childhood and adolescence. Int J Epidemiol 2022;52:466-475. [PMID: 36413040 DOI: 10.1093/ije/dyac219] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/15/2022] [Accepted: 11/10/2022] [Indexed: 11/23/2022] Open

Abstract

BACKGROUND

It has been suggested that the transiently increased infection risk following childcare enrolment is compensated by decreased infection risk later in childhood and adolescence. We investigated how childcare enrolment affected rates of antimicrobial-treated infections during childhood and adolescence.

METHODS

In a register-based cohort study of all children born in Denmark 1997–2014 with available exposure information (n = 1 007 448), we assessed the association between childcare enrolment before age 6 years and infection risks up to age 20 years, using antimicrobial exposure as proxy for infections. Nationwide childcare and prescription data were used. We estimated infection rates and the cumulative number of infections using adjusted Poisson regression models.

RESULTS

We observed 4 599 993 independent episodes of infection (antimicrobial exposure) during follow-up. Childcare enrolment transiently increased infection rates; the younger the child, the greater the increase. The resulting increased cumulative number of infections associated with earlier age at childcare enrolment was not compensated by lower infection risk later in childhood or adolescence. Accordingly, children enrolled in childcare before age 12 months had experienced 0.5–0.7 more infections at age 6 years (in total 4.5–5.1 infections) than peers enrolled at age 3 years, differences that persisted throughout adolescence. The type of childcare had little impact on infection risks.

CONCLUSIONS

Early age at childcare enrolment is associated with a modest increase in the cumulative number of antimicrobial-treated infections at all ages through adolescence. Emphasis should be given to disrupting infectious disease transmission in childcare facilities through prevention strategies with particular focus on the youngest children.

Kaas‐Hansen BS, Granholm A, Anthon CT, Kjær MN, Sivapalan P, Maagaard M, Schjørring OL, Fagerberg SK, Ellekjær KL, Mølgaard J, Ekstrøm CT, Møller MH, Perner A. Causal inference for planning randomised critical care trials: Protocol for a scoping review. Acta Anaesthesiol Scand 2022;66:1274-1278. [PMID: 36054374 PMCID: PMC9826202 DOI: 10.1111/aas.14142] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/10/2022] [Accepted: 08/17/2022] [Indexed: 01/11/2023]

Leist AK, Klee M, Kim JH, Rehkopf DH, Bordas SPA, Muniz-Terrera G, Wade S. Mapping of machine learning approaches for description, prediction, and causal inference in the social and health sciences. SCIENCE ADVANCES 2022;8:eabk1942. [PMID: 36260666 PMCID: PMC9581488 DOI: 10.1126/sciadv.abk1942] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/28/2021] [Accepted: 09/01/2022] [Indexed: 05/20/2023]

van Smeden M, Heinze G, Van Calster B, Asselbergs FW, Vardas PE, Bruining N, de Jaegere P, Moore JH, Denaxas S, Boulesteix AL, Moons KGM. Critical appraisal of artificial intelligence-based prediction models for cardiovascular disease. Eur Heart J 2022;43:2921-2930. [PMID: 35639667 PMCID: PMC9443991 DOI: 10.1093/eurheartj/ehac238] [Citation(s) in RCA: 41] [Impact Index Per Article: 20.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 11/09/2021] [Revised: 03/29/2022] [Accepted: 04/26/2022] [Indexed: 11/12/2022] Open

Harhay MO, Bell KJL, Huang JY, Arah OA. IJE's Education Corner turns 10! Looking back and looking forward. Int J Epidemiol 2022;51:1357-1360. [PMID: 35950800 DOI: 10.1093/ije/dyac161] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/08/2022] [Accepted: 08/02/2022] [Indexed: 02/06/2023] Open

Simoneau G, Pellegrini F, Debray TPA, Rouette J, Muñoz J, Platt RW, Petkau J, Bohn J, Shen C, de Moor C, Karim ME. Recommendations for the use of propensity score methods in multiple sclerosis research. Mult Scler 2022;28:1467-1480. [PMID: 35387508 PMCID: PMC9260471 DOI: 10.1177/13524585221085733] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/30/2021] [Revised: 02/03/2022] [Accepted: 02/17/2022] [Indexed: 01/24/2023]

Padula WV, Kreif N, Vanness DJ, Adamson B, Rueda JD, Felizzi F, Jonsson P, IJzerman MJ, Butte A, Crown W. Machine Learning Methods in Health Economics and Outcomes Research-The PALISADE Checklist: A Good Practices Report of an ISPOR Task Force. VALUE IN HEALTH : THE JOURNAL OF THE INTERNATIONAL SOCIETY FOR PHARMACOECONOMICS AND OUTCOMES RESEARCH 2022;25:1063-1080. [PMID: 35779937 DOI: 10.1016/j.jval.2022.03.022] [Citation(s) in RCA: 23] [Impact Index Per Article: 11.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 03/22/2022] [Accepted: 03/25/2022] [Indexed: 06/15/2023]

Kalia S, Saarela O, Chen T, O'Neill B, Meaney C, Gronsbell J, Sejdic E, Escobar M, Aliarzadeh B, Moineddin R, Pow C, Sullivan F, Greiver M. Marginal structural models using calibrated weights with SuperLearner: application to type II diabetes cohort. IEEE J Biomed Health Inform 2022;26:4197-4206. [PMID: 35588417 DOI: 10.1109/jbhi.2022.3175862] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]

Abstract

As different scientific disciplines begin to converge on machine learning for causal inference, we demonstrate the application of machine learning algorithms in the context of longitudinal causal estimation using electronic health records. Our aim is to formulate a marginal structural model for estimating diabetes care provisions in which we envisioned hypothetical (i.e. counterfactual) dynamic treatment regimes using a combination of drug therapies to manage diabetes: metformin, sulfonylurea and SGLT-2i. The binary outcome of diabetes care provisions was defined using a composite measure of chronic disease prevention and screening elements [27] including (i) primary care visit, (ii) blood pressure, (iii) weight, (iv) hemoglobin A1c, (v) lipid, (vi) ACR, (vii) eGFR and (viii) statin medication. We used several statistical learning algorithms to describe causal relationships between the prescription of three common classes of diabetes medications and quality of diabetes care using the electronic health records contained in the National Diabetes Repository. In particular, we generated an ensemble of statistical learning algorithms using the SuperLearner framework based on the following base learners: (i) least absolute shrinkage and selection operator, (ii) ridge regression, (iii) elastic net, (iv) random forest, (v) gradient boosting machines, and (vi) neural network. Each statistical learning algorithm was fitted using the pseudo-population generated from the marginalization of the time-dependent confounding process. Covariate balance was assessed using the longitudinal (i.e. cumulative-time product) stabilized weights with calibrated restrictions. Our results indicated that the treatment drop-in cohorts (with respect to metformin, sulfonylurea and SGLT-2i) may have improved diabetes care provisions in relation to the treatment-naive (i.e. no treatment) cohort. In terms of clinical utility, we hope that this article will facilitate discussions around the prevention of adverse chronic outcomes associated with type II diabetes through the improvement of diabetes care provisions in primary care.
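
The "cumulative-time product" stabilized weights mentioned in this abstract can be sketched in general terms as below. This is a minimal illustration of the standard construction for one patient, assuming the per-visit treatment probabilities have already been estimated by some model. The function name and the numbers are hypothetical, and the calibration restrictions used by the authors are not implemented here.

```python
def cumulative_stabilized_weights(numerators, denominators):
    """Running product of stabilized inverse-probability weights for one patient.

    numerators:   per-visit probability of the treatment actually received,
                  given treatment history and baseline covariates only.
    denominators: per-visit probability of the treatment actually received,
                  given treatment history and time-varying confounders.
    Returns one weight per visit; the weight at visit t is the product of the
    numerator/denominator ratios over visits 0..t.
    """
    weights = []
    running = 1.0
    for num, den in zip(numerators, denominators):
        if den <= 0:
            raise ValueError("denominator probabilities must be positive")
        running *= num / den
        weights.append(running)
    return weights


# Hypothetical probabilities for a single patient over three visits.
sw = cumulative_stabilized_weights([0.5, 0.6, 0.4], [0.25, 0.75, 0.5])
print([round(w, 2) for w in sw])
```

Weights near 1 at every visit suggest little time-dependent confounding for that patient, while very large or very small products are often truncated or calibrated in practice.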


Remiro-Azócar A, Heath A, Baio G. Effect modification in anchored indirect treatment comparison: Comments on "Matching-adjusted indirect comparisons: Application to time-to-event data". Stat Med 2022;41:1541-1553. [PMID: 35274754 DOI: 10.1002/sim.9286] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/10/2020] [Revised: 11/04/2021] [Accepted: 11/17/2021] [Indexed: 01/17/2023]

Broadbent A, Grote T. Can Robots Do Epidemiology? Machine Learning, Causal Inference, and Predicting the Outcomes of Public Health Interventions. PHILOSOPHY & TECHNOLOGY 2022;35:14. [PMID: 35251906 PMCID: PMC8881939 DOI: 10.1007/s13347-022-00509-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Download PDF] [Subscribe] [Scholar Register] [Received: 03/10/2021] [Accepted: 11/20/2021] [Indexed: 11/29/2022]

Gebremedhin AT, Hogan AB, Blyth CC, Glass K, Moore HC. Developing a prediction model to estimate the true burden of respiratory syncytial virus (RSV) in hospitalised children in Western Australia. Sci Rep 2022;12:332. [PMID: 35013434 PMCID: PMC8748465 DOI: 10.1038/s41598-021-04080-3] [Citation(s) in RCA: 58] [Impact Index Per Article: 29.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/03/2021] [Accepted: 12/14/2021] [Indexed: 12/23/2022] Open

Chatton A, Borgne FL, Leyrat C, Foucher Y. G-computation and doubly robust standardisation for continuous-time data: A comparison with inverse probability weighting. Stat Methods Med Res 2021;31:706-718. [PMID: 34861799 DOI: 10.1177/09622802211047345] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/15/2022]

Zhong Y, Kennedy EH, Bodnar LM, Naimi AI. AIPW: An R Package for Augmented Inverse Probability-Weighted Estimation of Average Causal Effects. Am J Epidemiol 2021;190:2690-2699. [PMID: 34268567 PMCID: PMC8796813 DOI: 10.1093/aje/kwab207] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/28/2020] [Revised: 07/09/2021] [Accepted: 07/13/2021] [Indexed: 12/26/2022] Open

Gao Q, Zhang Y, Liang J, Sun H, Wang T. High-dimensional generalized propensity score with application to omics data. Brief Bioinform 2021;22:6354024. [PMID: 34410351 DOI: 10.1093/bib/bbab331] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/25/2021] [Revised: 07/26/2021] [Accepted: 07/27/2021] [Indexed: 01/09/2023] Open

Tennant PWG, Murray EJ, Arnold KF, Berrie L, Fox MP, Gadd SC, Harrison WJ, Keeble C, Ranker LR, Textor J, Tomova GD, Gilthorpe MS, Ellison GTH. Use of directed acyclic graphs (DAGs) to identify confounders in applied health research: review and recommendations. Int J Epidemiol 2021;50:620-632. [PMID: 33330936 PMCID: PMC8128477 DOI: 10.1093/ije/dyaa213] [Citation(s) in RCA: 341] [Impact Index Per Article: 113.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 10/12/2020] [Indexed: 12/12/2022] Open

Affiliation(s)

Peter W G Tennant Leeds Institute for Data Analytics, University of Leeds, Leeds, UK; Faculty of Medicine and Health, University of Leeds, Leeds, UK; Alan Turing Institute, British Library, London, UK
Eleanor J Murray Department of Epidemiology, School of Public Health, Boston University, Boston, MA, USA
Kellyn F Arnold Leeds Institute for Data Analytics, University of Leeds, Leeds, UK; Faculty of Medicine and Health, University of Leeds, Leeds, UK
Laurie Berrie Leeds Institute for Data Analytics, University of Leeds, Leeds, UK; School of Geography, University of Leeds, Leeds, UK; School of GeoSciences, University of Edinburgh, Edinburgh, UK
Matthew P Fox Department of Epidemiology, School of Public Health, Boston University, Boston, MA, USA; Department of Global Health, Boston University, Boston, MA, USA
Sarah C Gadd Leeds Institute for Data Analytics, University of Leeds, Leeds, UK; School of Geography, University of Leeds, Leeds, UK
Wendy J Harrison Leeds Institute for Data Analytics, University of Leeds, Leeds, UK; Faculty of Medicine and Health, University of Leeds, Leeds, UK
Claire Keeble Leeds Institute for Data Analytics, University of Leeds, Leeds, UK
Lynsie R Ranker Department of Epidemiology, School of Public Health, Boston University, Boston, MA, USA
Johannes Textor Department of Tumour Immunology, Radboud University Medical Center, Nijmegen, The Netherlands
Georgia D Tomova Leeds Institute for Data Analytics, University of Leeds, Leeds, UK; Faculty of Medicine and Health, University of Leeds, Leeds, UK; Alan Turing Institute, British Library, London, UK
Mark S Gilthorpe Leeds Institute for Data Analytics, University of Leeds, Leeds, UK; Faculty of Medicine and Health, University of Leeds, Leeds, UK; Alan Turing Institute, British Library, London, UK
George T H Ellison Leeds Institute for Data Analytics, University of Leeds, Leeds, UK; Faculty of Medicine and Health, University of Leeds, Leeds, UK; Centre for Data Innovation, Faculty of Science and Technology, University of Central Lancashire, Preston, UK


Rogers P, Wang D, Lu Z. Medical Information Mart for Intensive Care: A Foundation for the Fusion of Artificial Intelligence and Real-World Data. Front Artif Intell 2021;4:691626. [PMID: 34136802 PMCID: PMC8201087 DOI: 10.3389/frai.2021.691626] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Key Words] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/06/2021] [Accepted: 05/18/2021] [Indexed: 01/27/2023] Open

Fu EL, van Diepen M, Xu Y, Trevisan M, Dekker FW, Zoccali C, Jager K, Carrero JJ. Pharmacoepidemiology for nephrologists (part 2): potential biases and how to overcome them. Clin Kidney J 2021;14:1317-1326. [PMID: 33959262 PMCID: PMC8087121 DOI: 10.1093/ckj/sfaa242] [Citation(s) in RCA: 30] [Impact Index Per Article: 10.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/21/2020] [Accepted: 10/02/2020] [Indexed: 12/21/2022] Open

Lin L, Sperrin M, Jenkins DA, Martin GP, Peek N. A scoping review of causal methods enabling predictions under hypothetical interventions. Diagn Progn Res 2021;5:3. [PMID: 33536082 PMCID: PMC7860039 DOI: 10.1186/s41512-021-00092-9] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 07/21/2020] [Accepted: 01/02/2021] [Indexed: 02/07/2023] Open

Abstract

BACKGROUND

The methods with which prediction models are usually developed mean that neither the parameters nor the predictions should be interpreted causally. For many applications, this is perfectly acceptable. However, when prediction models are used to support decision making, there is often a need for predicting outcomes under hypothetical interventions.

AIMS

We aimed to identify published methods for developing and validating prediction models that enable risk estimation of outcomes under hypothetical interventions, utilizing causal inference. We aimed to identify the main methodological approaches, their underlying assumptions, targeted estimands, and potential pitfalls and challenges with using the method. Finally, we aimed to highlight unresolved methodological challenges.

METHODS

We systematically reviewed literature published by December 2019, considering papers in the health domain that used causal considerations to enable prediction models to be used for predictions under hypothetical interventions. We included both methodologies proposed in statistical/machine learning literature and methodologies used in applied studies.

RESULTS

We identified 4919 papers through database searches and a further 115 papers through manual searches. Of these, 87 papers were retained for full-text screening, of which 13 were selected for inclusion. We found papers from both the statistical and the machine learning literature. Most of the identified methods for causal inference from observational data were based on marginal structural models and g-estimation.

CONCLUSIONS

There exist two broad methodological approaches for allowing prediction under hypothetical intervention into clinical prediction models: (1) enriching prediction models derived from observational studies with estimated causal effects from clinical trials and meta-analyses and (2) estimating prediction models and causal effects directly from observational data. These methods require extending to dynamic treatment regimes, and consideration of multiple interventions to operationalise a clinical decision support system. Techniques for validating 'causal prediction models' are still in their infancy.


Le Borgne F, Chatton A, Léger M, Lenain R, Foucher Y. G-computation and machine learning for estimating the causal effects of binary exposure statuses on binary outcomes. Sci Rep 2021;11:1435. [PMID: 33446866 PMCID: PMC7809122 DOI: 10.1038/s41598-021-81110-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/06/2020] [Accepted: 12/24/2020] [Indexed: 11/09/2022] Open

Oskar S, Stingone JA. Machine Learning Within Studies of Early-Life Environmental Exposures and Child Health: Review of the Current Literature and Discussion of Next Steps. Curr Environ Health Rep 2021;7:170-184. [PMID: 32578067 DOI: 10.1007/s40572-020-00282-5] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/12/2022]

Mitchell EG, Tabak EG, Levine ME, Mamykina L, Albers DJ. Enabling personalized decision support with patient-generated data and attributable components. J Biomed Inform 2020;113:103639. [PMID: 33316422 DOI: 10.1016/j.jbi.2020.103639] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/22/2019] [Revised: 08/03/2020] [Accepted: 11/26/2020] [Indexed: 10/22/2022]

Blakely T, Moss R, Collins J, Mizdrak A, Singh A, Carvalho N, Wilson N, Geard N, Flaxman A. Proportional multistate lifetable modelling of preventive interventions: concepts, code and worked examples. Int J Epidemiol 2020;49:1624-1636. [PMID: 33038892 DOI: 10.1093/ije/dyaa132] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/27/2020] [Accepted: 07/14/2020] [Indexed: 11/12/2022] Open

Sengupta PP, Shrestha S, Berthon B, Messas E, Donal E, Tison GH, Min JK, D'hooge J, Voigt JU, Dudley J, Verjans JW, Shameer K, Johnson K, Lovstakken L, Tabassian M, Piccirilli M, Pernot M, Yanamala N, Duchateau N, Kagiyama N, Bernard O, Slomka P, Deo R, Arnaout R. Proposed Requirements for Cardiovascular Imaging-Related Machine Learning Evaluation (PRIME): A Checklist: Reviewed by the American College of Cardiology Healthcare Innovation Council. JACC Cardiovasc Imaging 2020;13:2017-2035. [PMID: 32912474 PMCID: PMC7953597 DOI: 10.1016/j.jcmg.2020.07.015] [Citation(s) in RCA: 121] [Impact Index Per Article: 30.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 12/18/2019] [Revised: 07/15/2020] [Accepted: 07/16/2020] [Indexed: 12/20/2022]

Affiliation(s)

Partho P Sengupta West Virginia University Heart and Vascular Institute, Division of Cardiology, Morgantown, West Virginia.
Sirish Shrestha West Virginia University Heart and Vascular Institute, Division of Cardiology, Morgantown, West Virginia
Béatrice Berthon Physique pour la Médecine Paris, Inserm U1273, CNRS FRE 2031, ESPCI Paris, PSL Research University, Paris, France
Emmanuel Messas Université Paris Descartes, Sorbonne Paris Cité, Paris, France
Erwan Donal Département de Cardiologie et Maladies Vasculaires, Service de Cardiologie et maladies vasculaires, CHU Rennes, Rennes, France
Geoffrey H Tison Division of Cardiology, Department of Medicine, University of California San Francisco, San Francisco, California
James K Min Cleerly, Inc., New York, New York
Jan D'hooge Laboratory on Cardiovascular Imaging and Dynamics, Department of Cardiovascular Sciences, KU Leuven, Leuven, Belgium
Jens-Uwe Voigt Department of Cardiovascular Science, KU Leuven, Leuven, Belgium; Department of Cardiovascular Diseases, University Hospitals Leuven, Belgium
Joel Dudley Department of Genetics and Genomic Sciences and Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, New York; Institute for Next Generation Healthcare, Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York
Johan W Verjans Australian Institute for Machine Learning, University of Adelaide, North Terrace, Adelaide, South Australia, Australia; Department of Circulation and Medical Imaging, Faculty of Medicine and Health Sciences, Norwegian University of Science and Technology, Trondheim, Norway
Khader Shameer Department of Genetics and Genomic Sciences and Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, New York; Institute for Next Generation Healthcare, Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York
Kipp Johnson Department of Genetics and Genomic Sciences and Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, New York; Institute for Next Generation Healthcare, Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York
Lasse Lovstakken Department of Circulation and Medical Imaging, Faculty of Medicine and Health Sciences, Norwegian University of Science and Technology, Trondheim, Norway
Mahdi Tabassian Laboratory on Cardiovascular Imaging and Dynamics, Department of Cardiovascular Sciences, KU Leuven, Leuven, Belgium
Marco Piccirilli West Virginia University Heart and Vascular Institute, Division of Cardiology, Morgantown, West Virginia
Mathieu Pernot Physique pour la Médecine Paris, Inserm U1273, CNRS FRE 2031, ESPCI Paris, PSL Research University, Paris, France
Naveena Yanamala West Virginia University Heart and Vascular Institute, Division of Cardiology, Morgantown, West Virginia
Nicolas Duchateau CREATIS, CNRS UMR 5220, INSERM U1206, Université Lyon 1, INSA-LYON, France
Nobuyuki Kagiyama West Virginia University Heart and Vascular Institute, Division of Cardiology, Morgantown, West Virginia
Olivier Bernard CREATIS, CNRS UMR 5220, INSERM U1206, Université Lyon 1, INSA-LYON, France
Piotr Slomka Department of Imaging and Medicine, Cedars-Sinai Medical Center, Los Angeles, California
Rahul Deo Division of Cardiology, Department of Medicine, University of California San Francisco, San Francisco, California
Rima Arnaout Division of Cardiology, Department of Medicine, University of California San Francisco, San Francisco, California


Gokhale M, Stürmer T, Buse JB. Real-world evidence: the devil is in the detail. Diabetologia 2020;63:1694-1705. [PMID: 32666226 PMCID: PMC7448554 DOI: 10.1007/s00125-020-05217-1] [Citation(s) in RCA: 38] [Impact Index Per Article: 9.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 01/07/2020] [Accepted: 05/01/2020] [Indexed: 12/12/2022]

Goldstein ND, LeVasseur MT, McClure LA. On the Convergence of Epidemiology, Biostatistics, and Data Science. HARVARD DATA SCIENCE REVIEW 2020;2. [PMID: 35005710 DOI: 10.1162/99608f92.9f0215e6] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/04/2022] Open

Abstract

Epidemiology, biostatistics, and data science are broad disciplines that incorporate a variety of substantive areas. Common among them is a focus on quantitative approaches for solving intricate problems. When the substantive area is health and health care, the overlap is further cemented. Researchers in these disciplines are fluent in statistics, data management and analysis, and health and medicine, to name but a few competencies. Yet there are important and perhaps mutually exclusive attributes of these fields that warrant a tighter integration. For example, epidemiologists receive substantial training in the science of study design, measurement, and the art of causal inference. Biostatisticians are well versed in the theory and application of methodological techniques, as well as the design and conduct of public health research. Data scientists receive equivalently rigorous training in computational and visualization approaches for high-dimensional data. Compared to data scientists, epidemiologists and biostatisticians may have less expertise in computer science and informatics, while data scientists may benefit from a working knowledge of study design and causal inference. Collaboration and cross-training offer the opportunity to share and learn of the constructs, frameworks, theories, and methods of these fields with the goal of offering fresh and innovative perspectives for tackling challenging problems in health and health care. In this article, we first describe the evolution of these fields, focusing on their convergence in the era of electronic health data, notably electronic medical records (EMRs). Next, we present how a collaborative team may design, analyze, and implement an EMR-based study. Finally, we review the curricula at leading epidemiology, biostatistics, and data science training programs, identifying gaps and offering suggestions for the fields moving forward.
