1
Benrimoh D, Kleinerman A, Furukawa TA, Reynolds CF III, Lenze EJ, Karp J, Mulsant B, Armstrong C, Mehltretter J, Fratila R, Perlman K, Israel S, Popescu C, Golden G, Qassim S, Anacleto A, Tanguay-Sela M, Kapelner A, Rosenfeld A, Turecki G. Towards Outcome-Driven Patient Subgroups: A Machine Learning Analysis Across Six Depression Treatment Studies. Am J Geriatr Psychiatry 2024; 32:280-292. [PMID: 37839909 DOI: 10.1016/j.jagp.2023.09.009] [Received: 09/07/2023] [Accepted: 09/08/2023] [Indexed: 10/17/2023]
Abstract
BACKGROUND Major depressive disorder (MDD) is a heterogeneous condition; multiple underlying neurobiological and behavioral substrates are associated with treatment response variability. Understanding the sources of this variability and predicting outcomes have proved elusive. Machine learning (ML) shows promise in predicting treatment response in MDD, but its application is limited by challenges to the clinical interpretability of ML models, and clinicians often lack confidence in model results. To improve the interpretability of ML models in clinical practice, our goal was to demonstrate the derivation of treatment-relevant patient profiles composed of clinical and demographic information using a novel ML approach.
METHODS We analyzed data from six clinical trials of pharmacological treatment for depression (total n = 5438) using the Differential Prototypes Neural Network (DPNN), an ML model that learns patient prototypes, which can be used to derive treatment-relevant patient clusters, while learning to generate probabilities for differential treatment response. A model classifying remission and outputting individual remission probabilities for five first-line monotherapies and three combination treatments was trained using clinical and demographic data. Prototypes were evaluated for interpretability by assessing differences in feature distributions (e.g., age, sex, symptom severity) and treatment-specific outcomes.
RESULTS A 3-prototype model achieved an area under the receiver operating characteristic curve of 0.66 and an expected absolute improvement in remission rate of 6.5% (relative improvement of 15.6%) for those receiving the best predicted treatment, compared to the population remission rate. We identified three treatment-relevant patient clusters. Cluster A patients tended to be younger, to have increased levels of fatigue, and to have more severe symptoms. Cluster B patients tended to be older and female, and to have less severe symptoms and the highest remission rates. Cluster C patients had more severe symptoms, lower remission rates, more psychomotor agitation, more intense suicidal ideation, and more somatic genital symptoms.
CONCLUSION It is possible to produce novel treatment-relevant patient profiles using ML models; doing so may improve the interpretability of ML models and the quality of precision medicine treatments for MDD.
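The abstract reports both an absolute (6.5%) and a relative (15.6%) improvement in remission rate but not the population remission rate itself. As a rough sketch, assuming the relative improvement is simply the absolute improvement divided by the population rate (the abstract does not spell out this definition), the implied baseline can be recovered as follows. The function name `implied_baseline` is illustrative and not taken from the paper.

```python
def implied_baseline(absolute_gain_pct: float, relative_gain_pct: float) -> float:
    """Baseline rate (in %) implied by an absolute and a relative gain.

    Assumes relative_gain = absolute_gain / baseline, which the abstract
    does not state explicitly.
    """
    return absolute_gain_pct / relative_gain_pct * 100.0


# Figures quoted in the abstract: 6.5% absolute, 15.6% relative.
baseline = implied_baseline(6.5, 15.6)
improved = baseline + 6.5

print(f"implied population remission rate: {baseline:.1f}%")
print(f"implied rate under best predicted treatment: {improved:.1f}%")
```

Under that assumption the population remission rate works out to roughly 42%, rising to about 48% for patients given the best predicted treatment. These are derived values, not figures reported by the authors.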
Affiliation(s)
- David Benrimoh
  - Department of Psychiatry (DB, KP, GT), McGill University, Montreal, Canada; Department of Psychiatry (DB), Stanford University, Stanford, CA; Aifred Health (DB, CA, JM, RF, KP, SI, CP, GG, SQ, AA, MTS), Montreal, Canada
- Toshi A Furukawa
  - Department of Health Promotion and Human Behavior (TAF), Kyoto University Graduate School of Medicine/School of Public Health, Kyoto, Japan
- Charles F Reynolds III
  - Department of Psychiatry (CFR), University of Pittsburgh School of Medicine, Pittsburgh, PA; Department of Psychiatry (CFR), Tufts University School of Medicine, Medford, MA
- Eric J Lenze
  - Department of Psychiatry (EJL), Washington University School of Medicine, St. Louis, MO
- Jordan Karp
  - Department of Psychiatry (JK), University of Arizona, Tucson, AZ
- Benoit Mulsant
  - Department of Psychiatry (BM), University of Toronto, Toronto, ON, Canada
- Caitrin Armstrong
  - Aifred Health (DB, CA, JM, RF, KP, SI, CP, GG, SQ, AA, MTS), Montreal, Canada
- Joseph Mehltretter
  - Aifred Health (DB, CA, JM, RF, KP, SI, CP, GG, SQ, AA, MTS), Montreal, Canada
- Robert Fratila
  - Aifred Health (DB, CA, JM, RF, KP, SI, CP, GG, SQ, AA, MTS), Montreal, Canada
- Kelly Perlman
  - Department of Psychiatry (DB, KP, GT), McGill University, Montreal, Canada; Aifred Health (DB, CA, JM, RF, KP, SI, CP, GG, SQ, AA, MTS), Montreal, Canada
- Sonia Israel
  - Aifred Health (DB, CA, JM, RF, KP, SI, CP, GG, SQ, AA, MTS), Montreal, Canada
- Christina Popescu
  - Aifred Health (DB, CA, JM, RF, KP, SI, CP, GG, SQ, AA, MTS), Montreal, Canada
- Grace Golden
  - Aifred Health (DB, CA, JM, RF, KP, SI, CP, GG, SQ, AA, MTS), Montreal, Canada
- Sabrina Qassim
  - Aifred Health (DB, CA, JM, RF, KP, SI, CP, GG, SQ, AA, MTS), Montreal, Canada
- Alexandra Anacleto
  - Aifred Health (DB, CA, JM, RF, KP, SI, CP, GG, SQ, AA, MTS), Montreal, Canada
- Myriam Tanguay-Sela
  - Aifred Health (DB, CA, JM, RF, KP, SI, CP, GG, SQ, AA, MTS), Montreal, Canada
- Adam Kapelner
  - Department of Mathematics (AK), Queens College, CUNY, New York, NY
- Gustavo Turecki
  - Department of Psychiatry (DB, KP, GT), McGill University, Montreal, Canada
2
Zrubka Z, Kertész G, Gulácsi L, Czere J, Hölgyesi Á, Nezhad HM, Mosavi A, Kovács L, Butte AJ, Péntek M. The Reporting Quality of Machine Learning Studies on Pediatric Diabetes Mellitus: Systematic Review. J Med Internet Res 2024; 26:e47430. [PMID: 38241075 PMCID: PMC10837761 DOI: 10.2196/47430] [Received: 03/20/2023] [Revised: 04/29/2023] [Accepted: 11/17/2023] [Indexed: 01/23/2024]
Abstract
BACKGROUND Diabetes mellitus (DM) is a major health concern among children, a population in which advanced technologies have been widely adopted. However, concerns are growing about the transparency, replicability, bias, and overall validity of artificial intelligence studies in medicine.
OBJECTIVE We aimed to systematically review the reporting quality of machine learning (ML) studies of pediatric DM using the Minimum Information About Clinical Artificial Intelligence Modelling (MI-CLAIM) checklist, a general reporting guideline for medical artificial intelligence studies.
METHODS We searched the PubMed and Web of Science databases from 2016 to 2020. Studies were included if the use of ML was reported in children with DM aged 2 to 18 years, including studies on complications, screening studies, and in silico samples. In studies following the ML workflow of training, validation, and testing of results, reporting quality was assessed via MI-CLAIM by consensus judgments of independent reviewer pairs. Positive answers to the 17 binary items regarding sufficient reporting were qualitatively summarized and counted as a proxy measure of reporting quality. The synthesis of results included testing the association of reporting quality with publication and data type, participants (human or in silico), research goals, level of code sharing, and the scientific field of publication (medical or engineering), as well as with expert judgments of clinical impact and reproducibility.
RESULTS After screening 1043 records, 28 studies were included. The sample size of the training cohort ranged from 5 to 561. Six studies featured only in silico patients. The reporting quality was low, with great variation among the 21 studies assessed using MI-CLAIM. The number of items with sufficient reporting ranged from 4 to 12 (mean 7.43, SD 2.62). The items on research questions and data characterization were reported adequately most often, whereas items on patient characteristics and model examination were reported adequately least often. The representativeness of the training and test cohorts to real-world settings and the adequacy of model performance evaluation were the most difficult to judge. Reporting quality improved over time (r=0.50; P=.02); it was higher than average in prognostic biomarker and risk factor studies (P=.04) and lower in noninvasive hypoglycemia detection studies (P=.006), higher in studies published in medical versus engineering journals (P=.004), and higher in studies sharing any code of the ML pipeline versus not sharing (P=.003). The association between expert judgments and MI-CLAIM ratings was not significant.
CONCLUSIONS The reporting quality of ML studies in the pediatric population with DM was generally low. Important details for clinicians, such as patient characteristics; comparison with the state-of-the-art solution; and model examination for valid, unbiased, and robust results, were often the weak points of reporting. To assess their clinical utility, the reporting standards of ML studies must evolve, and algorithms for this challenging population must become more transparent and replicable.
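The reported figures can be put on a common scale with a short calculation. The sketch below converts the mean item count into a share of the 17 MI-CLAIM items, and converts the reported correlation into a t statistic. It assumes a Pearson correlation computed over the 21 MI-CLAIM-assessed studies; the abstract states neither the coefficient type nor the sample size used for this test, so both are assumptions. The helper name `t_from_r` is illustrative.

```python
import math


def t_from_r(r: float, n: int) -> float:
    """t statistic for a Pearson correlation r observed in n pairs (df = n - 2)."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


# Share of the 17 binary MI-CLAIM items sufficiently reported, on average.
mean_share = 7.43 / 17

# Assumed inputs: r = 0.50 from the abstract, n = 21 assessed studies (an assumption).
t_stat = t_from_r(0.50, 21)

print(f"mean share of items sufficiently reported: {mean_share:.1%}")
print(f"t statistic (df = 19): {t_stat:.2f}")
```

Under these assumptions the t statistic is about 2.52, which exceeds the two-sided 5% critical value of 2.093 for 19 degrees of freedom and is in line with the reported P=.02. On average, fewer than half of the checklist items were sufficiently reported.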
Affiliation(s)
- Zsombor Zrubka
  - HECON Health Economics Research Center, University Research and Innovation Center, Óbuda University, Budapest, Hungary
- Gábor Kertész
  - John von Neumann Faculty of Informatics, Óbuda University, Budapest, Hungary
- László Gulácsi
  - HECON Health Economics Research Center, University Research and Innovation Center, Óbuda University, Budapest, Hungary
- János Czere
  - Doctoral School of Innovation Management, Óbuda University, Budapest, Hungary
- Áron Hölgyesi
  - HECON Health Economics Research Center, University Research and Innovation Center, Óbuda University, Budapest, Hungary
  - Doctoral School of Molecular Medicine, Semmelweis University, Budapest, Hungary
- Hossein Motahari Nezhad
  - HECON Health Economics Research Center, University Research and Innovation Center, Óbuda University, Budapest, Hungary
  - Doctoral School of Business and Management, Corvinus University of Budapest, Budapest, Hungary
- Amir Mosavi
  - John von Neumann Faculty of Informatics, Óbuda University, Budapest, Hungary
- Levente Kovács
  - Physiological Controls Research Center, University Research and Innovation Center, Óbuda University, Budapest, Hungary
- Atul J Butte
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, CA, United States
- Márta Péntek
  - HECON Health Economics Research Center, University Research and Innovation Center, Óbuda University, Budapest, Hungary