1
Pigoni A, Delvecchio G, Turtulici N, Madonna D, Pietrini P, Cecchetti L, Brambilla P. Machine learning and the prediction of suicide in psychiatric populations: a systematic review. Transl Psychiatry 2024; 14:140. [PMID: 38461283] [PMCID: PMC10925059] [DOI: 10.1038/s41398-024-02852-9]
Abstract
Machine learning (ML) has emerged as a promising tool to enhance suicide prediction. However, because many large-sample studies mixed psychiatric and non-psychiatric populations, a formal psychiatric diagnosis emerged as a strong predictor of suicide risk, overshadowing more subtle risk factors specific to distinct populations. To overcome this limitation, we conducted a systematic review of ML studies evaluating suicidal behaviors exclusively in psychiatric clinical populations. A systematic literature search was performed from inception through November 17, 2022 on PubMed, EMBASE, and Scopus following the PRISMA guidelines. Original research using ML techniques to assess suicide risk or predict suicide attempts in psychiatric populations was included. Risk of bias was assessed using the transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD) guidelines. A total of 1032 studies were retrieved, of which 81 satisfied the inclusion criteria and were included for qualitative synthesis. Clinical and demographic features were the most frequently employed, and random forest, support vector machine, and convolutional neural network models achieved higher accuracy than other algorithms when directly compared. Despite heterogeneity in procedures, most studies reported an accuracy of 70% or greater based on features such as previous attempts, severity of the disorder, and pharmacological treatments. Although the evidence is promising, ML algorithms for suicide prediction still present limitations, including the lack of neurobiological and imaging data and the lack of external validation samples. Overcoming these issues may lead to models suitable for adoption in clinical practice. Further research is warranted in a field that holds the potential to critically impact suicide mortality.
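To make the kind of pipeline surveyed here concrete, the sketch below trains a random forest and a support vector machine on synthetic tabular data and reports held-out accuracy, loosely mirroring the clinical and demographic feature sets the review describes. The data, feature names, and hyperparameters are illustrative assumptions, not taken from any reviewed study.

```python
# Illustrative sketch only: synthetic clinical/demographic features -> classifier -> accuracy.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
n = 2000
X = np.column_stack([
    rng.integers(0, 5, n),   # hypothetical: number of previous attempts
    rng.integers(1, 8, n),   # hypothetical: disorder severity score
    rng.integers(0, 2, n),   # hypothetical: pharmacological treatment (yes/no)
    rng.normal(40, 12, n),   # hypothetical: age
])
# Synthetic outcome loosely tied to previous attempts and severity
logit = 0.8 * X[:, 0] + 0.4 * X[:, 1] - 3.0
y = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

for name, clf in [("random forest", RandomForestClassifier(n_estimators=200, random_state=0)),
                  ("SVM (RBF)", SVC(kernel="rbf", gamma="scale"))]:
    clf.fit(X_tr, y_tr)
    print(f"{name}: held-out accuracy = {accuracy_score(y_te, clf.predict(X_te)):.3f}")
```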
Affiliation(s)
- Alessandro Pigoni: Social and Affective Neuroscience Group, MoMiLab, IMT School for Advanced Studies Lucca, Lucca, Italy; Department of Neurosciences and Mental Health, Fondazione IRCCS Ca' Granda, Ospedale Maggiore Policlinico, Milan, Italy
- Giuseppe Delvecchio: Department of Neurosciences and Mental Health, Fondazione IRCCS Ca' Granda, Ospedale Maggiore Policlinico, Milan, Italy
- Nunzio Turtulici: Department of Pathophysiology and Transplantation, University of Milan, Milan, Italy
- Domenico Madonna: Department of Neurosciences and Mental Health, Fondazione IRCCS Ca' Granda, Ospedale Maggiore Policlinico, Milan, Italy
- Pietro Pietrini: MoMiLab, IMT School for Advanced Studies Lucca, Lucca, Italy
- Luca Cecchetti: Social and Affective Neuroscience Group, MoMiLab, IMT School for Advanced Studies Lucca, Lucca, Italy
- Paolo Brambilla: Department of Neurosciences and Mental Health, Fondazione IRCCS Ca' Granda, Ospedale Maggiore Policlinico, Milan, Italy; Department of Pathophysiology and Transplantation, University of Milan, Milan, Italy
2
Simon GE, Cruz M, Shortreed SM, Sterling SA, Coleman KJ, Ahmedani BK, Yaseen ZS, Mosholder AD. Stability of Suicide Risk Prediction Models During Changes in Health Care Delivery. Psychiatr Serv 2024; 75:139-147. [PMID: 37587793] [DOI: 10.1176/appi.ps.20230172]
Abstract
OBJECTIVE The authors aimed to use health records data to examine how the accuracy of statistical models predicting self-harm or suicide changed between 2015 and 2019, as health systems implemented suicide prevention programs. METHODS Data from four large health systems were used to identify specialty mental health visits by patients ages ≥11 years, assess 311 potential predictors of self-harm (including demographic characteristics, historical risk factors, and index visit characteristics), and ascertain fatal or nonfatal self-harm events over 90 days after each visit. New prediction models were developed with logistic regression with LASSO (least absolute shrinkage and selection operator) in random samples of visits (65%) from each calendar year and were validated in the remaining portion of the sample (35%). RESULTS A model developed for visits from 2009 to mid-2015 showed similar classification performance and calibration accuracy in a new sample of about 13.1 million visits from late 2015 to 2019. Area under the receiver operating characteristic curve (AUC) ranged from 0.840 to 0.849 in the new sample, compared with 0.851 in the original sample. New models developed for each year for 2015-2019 had classification performance (AUC range 0.790-0.853), sensitivity, and positive predictive value similar to those of the previously developed model. Models selected similar predictors from 2015 to 2019, except for more frequent selection of depression questionnaire data in later years, when questionnaires were more frequently recorded. CONCLUSIONS A self-harm prediction model developed with 2009-2015 visit data performed similarly when applied to 2015-2019 visits. New models did not yield superior performance or identify different predictors.
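A minimal sketch of the modeling setup described in this abstract: an L1-penalized (LASSO) logistic regression developed on a 65% random sample of visits and validated on the remaining 35%, evaluated by AUC. The data below are synthetic and the penalty strength is an arbitrary choice; only the 311-predictor count and the 65/35 split come from the abstract.

```python
# Sketch under stated assumptions: synthetic visits, LASSO logistic regression, 65/35 split, AUC.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(1)
n_visits, n_predictors = 50_000, 311          # 311 candidate predictors, per the abstract
X = rng.normal(size=(n_visits, n_predictors))
# Synthetic rare outcome (~1%) driven by a handful of predictors
logit = X[:, :5] @ np.array([1.2, 0.8, 0.6, 0.5, 0.4]) - 5.0
y = (rng.random(n_visits) < 1 / (1 + np.exp(-logit))).astype(int)

# 65% development / 35% validation split, as described in the abstract
X_dev, X_val, y_dev, y_val = train_test_split(
    X, y, train_size=0.65, random_state=0, stratify=y)

# L1 (LASSO) penalty; C is an illustrative value, not the study's tuned penalty
lasso_lr = LogisticRegression(penalty="l1", solver="liblinear", C=0.1, max_iter=1000)
lasso_lr.fit(X_dev, y_dev)

auc = roc_auc_score(y_val, lasso_lr.predict_proba(X_val)[:, 1])
n_selected = int(np.sum(lasso_lr.coef_ != 0))
print(f"validation AUC = {auc:.3f}, predictors retained = {n_selected}")
```

In practice this kind of per-year refit would be repeated on each calendar year's visits and compared against the frozen earlier model on the same validation sample.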
Affiliation(s)
- Washington Health Research Institute, Kaiser Permanente, Seattle (Simon, Cruz, Shortreed); Bernard J. Tyson School of Medicine (Simon, Coleman) and Southern California Department of Research and Evaluation (Coleman), Kaiser Permanente, Pasadena; Department of Biostatistics, University of Washington, Seattle (Cruz, Shortreed); Northern California Division of Research, Kaiser Permanente, Oakland (Sterling); Henry Ford Health Center for Health Services Research, Detroit (Ahmedani); U.S. Food and Drug Administration (FDA), Silver Spring, Maryland (Yaseen, Mosholder)
3
Simon GE, Shortreed SM, Johnson E, Yaseen ZS, Stone M, Mosholder AD, Ahmedani BK, Coleman KJ, Coley RY, Penfold RB, Toh S. Predicting risk of suicidal behavior from insurance claims data vs. linked data from insurance claims and electronic health records. Pharmacoepidemiol Drug Saf 2024; 33:e5734. [PMID: 38112287] [PMCID: PMC10843611] [DOI: 10.1002/pds.5734]
Abstract
PURPOSE Observational studies assessing effects of medical products on suicidal behavior often rely on health record data to account for pre-existing risk. We assessed whether high-dimensional models predicting suicide risk using data derived from insurance claims and electronic health records (EHRs) are superior to models using data from insurance claims alone. METHODS Data from seven large health systems identified outpatient mental health visits by patients aged 11 or older between 1/1/2009 and 9/30/2017. Data for the 5 years prior to each visit identified potential predictors of suicidal behavior typically available from insurance claims (e.g., mental health diagnoses, procedure codes, medication dispensings) and additional potential predictors available from EHRs (self-reported race and ethnicity, responses to the Patient Health Questionnaire [PHQ-9] depression questionnaire). Nonfatal self-harm events following each visit were identified from insurance claims data, and fatal self-harm events were identified by linkage to state mortality records. Random forest models predicting nonfatal or fatal self-harm over the 90 days following each visit were developed in a 70% random sample of visits and validated in a held-out sample of 30%. Performance of models using linked claims and EHR data was compared with that of models using claims data only. RESULTS Among 15,845,047 encounters by 1,574,612 patients, 99,098 (0.6%) were followed by a self-harm event within 90 days. Overall classification performance did not differ between the best-fitting model using all data (area under the receiver operating characteristic curve [AUC] = 0.846, 95% CI 0.839-0.854) and the best-fitting model limited to data available from insurance claims (AUC = 0.846, 95% CI 0.838-0.853). Competing models showed similar classification performance across a range of cut-points and similar calibration performance across a range of risk strata. Results were similar when the sample was limited to health systems and time periods where PHQ-9 depression questionnaires were recorded more frequently. CONCLUSION Investigators using health record data to account for pre-existing risk in observational studies of suicidal behavior need not limit that research to databases including linked EHR data.
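A minimal sketch of the comparison described above, assuming synthetic data: random forest models are developed on a 70% sample and evaluated on a 30% holdout, once with claims-derived predictors only and once with claims plus EHR-derived predictors. The feature groupings, sample size, and model settings are illustrative, not the study's actual specification.

```python
# Sketch under stated assumptions: claims-only vs claims+EHR feature sets, random forest, holdout AUC.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(2)
n = 40_000
claims = rng.normal(size=(n, 30))   # hypothetical claims-derived features (diagnoses, procedures, meds)
ehr = rng.normal(size=(n, 10))      # hypothetical EHR-only features (e.g., PHQ-9 responses)
X_all = np.hstack([claims, ehr])

# Synthetic rare outcome driven mostly by claims-derived signal
logit = 1.5 * claims[:, 0] + 0.8 * claims[:, 1] + 0.3 * ehr[:, 0] - 5.5
y = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

# 70% development / 30% holdout split, as in the abstract
idx_dev, idx_val = train_test_split(np.arange(n), train_size=0.70, random_state=0, stratify=y)

for label, X in [("claims only", claims), ("claims + EHR", X_all)]:
    rf = RandomForestClassifier(n_estimators=300, min_samples_leaf=50, random_state=0, n_jobs=-1)
    rf.fit(X[idx_dev], y[idx_dev])
    auc = roc_auc_score(y[idx_val], rf.predict_proba(X[idx_val])[:, 1])
    print(f"{label}: holdout AUC = {auc:.3f}")
```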
Affiliation(s)
- Gregory E Simon: Kaiser Permanente Washington Health Research Institute, Seattle, Washington, USA; Department of Health Systems Science, Bernard J. Tyson Kaiser Permanente School of Medicine, Pasadena, California, USA
- Susan M Shortreed: Kaiser Permanente Washington Health Research Institute, Seattle, Washington, USA; Department of Biostatistics, University of Washington, Seattle, Washington, USA
- Eric Johnson: Kaiser Permanente Washington Health Research Institute, Seattle, Washington, USA
- Zimri S Yaseen: U.S. Food and Drug Administration, Silver Spring, Maryland, USA
- Marc Stone: U.S. Food and Drug Administration, Silver Spring, Maryland, USA
- Brian K Ahmedani: Center for Health Policy and Health Services Research, Henry Ford Health, Detroit, Michigan, USA
- Karen J Coleman: Department of Health Systems Science, Bernard J. Tyson Kaiser Permanente School of Medicine, Pasadena, California, USA; Department of Research and Evaluation, Kaiser Permanente Southern California, Pasadena, California, USA
- R Yates Coley: Kaiser Permanente Washington Health Research Institute, Seattle, Washington, USA; Department of Biostatistics, University of Washington, Seattle, Washington, USA
- Robert B Penfold: Kaiser Permanente Washington Health Research Institute, Seattle, Washington, USA
- Sengwee Toh: Department of Population Medicine, Harvard Medical School and Harvard Pilgrim Health Care Institute, Boston, Massachusetts, USA
4
Shortreed SM, Walker RL, Johnson E, Wellman R, Cruz M, Ziebell R, Coley RY, Yaseen ZS, Dharmarajan S, Penfold RB, Ahmedani BK, Rossom RC, Beck A, Boggs JM, Simon GE. Complex modeling with detailed temporal predictors does not improve health records-based suicide risk prediction. NPJ Digit Med 2023; 6:47. [PMID: 36959268] [PMCID: PMC10036475] [DOI: 10.1038/s41746-023-00772-4]
Abstract
Suicide risk prediction models can identify individuals for targeted intervention. Discussions of transparency, explainability, and transportability in machine learning presume that complex prediction models with many variables outperform simpler models. We compared random forest, artificial neural network, and ensemble models with 1500 temporally defined predictors to logistic regression models. Data from 25,800,888 mental health visits made by 3,081,420 individuals in 7 health systems were used to train and evaluate suicidal behavior prediction models. Model performance was compared across several measures. All models performed well (area under the receiver operating characteristic curve [AUC]: 0.794-0.858). Ensemble models performed best, but improvements over a regression model with 100 predictors were minimal (AUC improvements: 0.006-0.020). Results are consistent across performance metrics and subgroups defined by race, ethnicity, and sex. Our results suggest simpler parametric models, which are easier to implement as part of routine clinical practice, perform comparably to more complex machine learning methods.
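The sketch below, using synthetic data at a far smaller scale than the 1500-predictor, 25.8 million-visit setting described above, illustrates the comparison reported here: a flexible model over a large predictor set versus a logistic regression restricted to a smaller subset, compared by holdout AUC. Predictor counts, hyperparameters, and data are stand-ins, not the study's configuration.

```python
# Sketch under stated assumptions: complex model on many predictors vs simple logistic regression.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(3)
n, p_full, p_small = 30_000, 300, 50          # stand-ins for 1500 and 100 predictors
X = rng.normal(size=(n, p_full))
logit = X[:, :10] @ rng.normal(0.5, 0.1, 10) - 5.0   # rare synthetic outcome
y = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

X_dev, X_val, y_dev, y_val = train_test_split(X, y, train_size=0.7, random_state=0, stratify=y)

# "Complex" model: random forest over the full predictor set
rf = RandomForestClassifier(n_estimators=300, min_samples_leaf=50, random_state=0, n_jobs=-1)
rf.fit(X_dev, y_dev)
auc_rf = roc_auc_score(y_val, rf.predict_proba(X_val)[:, 1])

# Simpler parametric model: logistic regression on a smaller predictor subset
lr = LogisticRegression(max_iter=2000)
lr.fit(X_dev[:, :p_small], y_dev)
auc_lr = roc_auc_score(y_val, lr.predict_proba(X_val[:, :p_small])[:, 1])

print(f"random forest ({p_full} predictors): AUC = {auc_rf:.3f}")
print(f"logistic regression ({p_small} predictors): AUC = {auc_lr:.3f}")
```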
Affiliation(s)
- Susan M Shortreed: Kaiser Permanente Washington Health Research Institute, 1730 Minor Avenue, Ste 1600, Seattle, WA, 98101, USA; Department of Biostatistics, University of Washington, 1705 NE Pacific St, Seattle, WA, 98195, USA
- Rod L Walker: Kaiser Permanente Washington Health Research Institute, 1730 Minor Avenue, Ste 1600, Seattle, WA, 98101, USA
- Eric Johnson: Kaiser Permanente Washington Health Research Institute, 1730 Minor Avenue, Ste 1600, Seattle, WA, 98101, USA
- Robert Wellman: Kaiser Permanente Washington Health Research Institute, 1730 Minor Avenue, Ste 1600, Seattle, WA, 98101, USA
- Maricela Cruz: Kaiser Permanente Washington Health Research Institute, 1730 Minor Avenue, Ste 1600, Seattle, WA, 98101, USA; Department of Biostatistics, University of Washington, 1705 NE Pacific St, Seattle, WA, 98195, USA
- Rebecca Ziebell: Kaiser Permanente Washington Health Research Institute, 1730 Minor Avenue, Ste 1600, Seattle, WA, 98101, USA
- R Yates Coley: Kaiser Permanente Washington Health Research Institute, 1730 Minor Avenue, Ste 1600, Seattle, WA, 98101, USA; Department of Biostatistics, University of Washington, 1705 NE Pacific St, Seattle, WA, 98195, USA
- Zimri S Yaseen: U.S. Food and Drug Administration, Silver Spring, MD, USA
- Robert B Penfold: Kaiser Permanente Washington Health Research Institute, 1730 Minor Avenue, Ste 1600, Seattle, WA, 98101, USA
- Brian K Ahmedani: Center for Health Policy & Health Services Research, Henry Ford Health System, 1 Ford Place, Detroit, MI, 48202, USA
- Rebecca C Rossom: HealthPartners Institute, Division of Research, 8170 33rd Ave S, Minneapolis, MN, 55425, USA
- Arne Beck: Kaiser Permanente Colorado Institute for Health Research, 2550 S. Parker Road, Suite 200, Aurora, CO, 80014, USA
- Jennifer M Boggs: Kaiser Permanente Colorado Institute for Health Research, 2550 S. Parker Road, Suite 200, Aurora, CO, 80014, USA
- Greg E Simon: Kaiser Permanente Washington Health Research Institute, 1730 Minor Avenue, Ste 1600, Seattle, WA, 98101, USA