1
Shin D, Kim H, Lee S, Cho Y, Jung W. Using Large Language Models to Detect Depression From User-Generated Diary Text Data as a Novel Approach in Digital Mental Health Screening: Instrument Validation Study. J Med Internet Res 2024;26:e54617. PMID: 39292502; PMCID: PMC11447422; DOI: 10.2196/54617.
Abstract
BACKGROUND Depressive disorders have substantial global implications, leading to various social consequences, including decreased occupational productivity and a high disability burden. Early detection and intervention for clinically significant depression have gained attention; however, existing depression screening tools, such as the Center for Epidemiologic Studies Depression Scale, have limitations in objectivity and accuracy. Researchers are therefore seeking objective indicators of depression, including image analysis, blood biomarkers, and ecological momentary assessments (EMAs). Among EMAs, user-generated text data, particularly from diary writing, have emerged as a clinically significant and analyzable source for detecting or diagnosing depression, leveraging advances in large language models (LLMs) such as ChatGPT. OBJECTIVE We aimed to detect depression from user-generated diary text collected through an emotional diary writing app using an LLM, and to validate the value of semistructured diary text as an EMA data source. METHODS Participants were assessed for depression using the Patient Health Questionnaire, and suicide risk was evaluated using the Beck Scale for Suicide Ideation, before starting and after completing a 2-week diary writing period. The text of the daily diaries was also used in the analysis. The performance of leading LLMs, including ChatGPT with GPT-3.5 and GPT-4, was assessed with and without GPT-3.5 fine-tuning on the training data set. The model comparison used chain-of-thought and zero-shot prompting to analyze the text structure and content. RESULTS We used 428 diaries from 91 participants. GPT-3.5 with fine-tuning demonstrated the best overall performance in depression detection, achieving an accuracy of 0.902 and a specificity of 0.955. However, balanced accuracy was highest (0.844) for GPT-3.5 without fine-tuning or prompting techniques, with a recall of 0.929. CONCLUSIONS Both GPT-3.5 and GPT-4 demonstrated reasonable performance in recognizing depression risk from diaries. Our findings highlight the potential clinical usefulness of user-generated text data for detecting depression. In addition to measurable indicators, such as step count and physical activity, future research should increasingly emphasize qualitative digital expression.
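The study's own prompts and pipeline are not reproduced in the abstract. As a minimal sketch of the zero-shot setup it describes, the prompt wording, label-parsing rules, and model name below are illustrative assumptions, not the authors' protocol:

```python
# Sketch: zero-shot depression screening of a single diary entry via a
# chat-style LLM. Prompt text, labels, and model name are assumptions.

def build_prompt(diary_text: str) -> str:
    """Wrap one diary entry in a zero-shot screening instruction."""
    return (
        "You are a mental-health screening assistant. Read the diary entry "
        "below and answer with exactly one word, DEPRESSED or NOT_DEPRESSED, "
        "indicating whether the writer shows clinically significant "
        "depressive symptoms.\n\nDiary entry:\n" + diary_text
    )

def parse_label(model_reply: str) -> bool:
    """Map the model's free-text reply to a binary depression flag."""
    reply = model_reply.strip().upper()
    # NOT_DEPRESSED contains "DEPRESSED", so check the negative label first.
    return "NOT_DEPRESSED" not in reply and "DEPRESSED" in reply

def screen_diary(diary_text: str, model: str = "gpt-3.5-turbo") -> bool:
    """Send one diary entry to the chat API and parse the answer.
    Requires the `openai` package and an OPENAI_API_KEY; not invoked here."""
    from openai import OpenAI
    client = OpenAI()
    reply = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": build_prompt(diary_text)}],
    )
    return parse_label(reply.choices[0].message.content)
```

A fine-tuned variant, as in the study, would instead train on labeled diary-label pairs and need no instruction text at inference time.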
Affiliation(s)
- Daun Shin
- Department of Psychiatry, Anam Hospital, Korea University, Seoul, Republic of Korea
- Doctorpresso, Seoul, Republic of Korea
- Younhee Cho
- Doctorpresso, Seoul, Republic of Korea
- Department of Design, Seoul National University, Seoul, Republic of Korea
2
Liu Z, Wu Y, Zhang H, Li G, Ding Z, Hu B. Stimulus-Response Patterns: The Key to Giving Generalizability to Text-Based Depression Detection Models. IEEE J Biomed Health Inform 2024;28:4925-4936. PMID: 38656850; DOI: 10.1109/jbhi.2024.3393244.
Abstract
Text content analysis for depression detection using machine learning has become a prominent area of research. However, previous studies have focused mainly on the textual content itself, neglecting the fundamental factors driving text generation. Consequently, existing models generalize poorly to out-of-domain data because they struggle to capture the crucial features of depression. To address this, we propose a novel computational perspective of "stimulus-response patterns" that brings us closer to the essence of the clinical diagnosis of depression. This perspective allows us to conceptually unify diverse datasets and generalizes to common datasets in the field. We introduce the Stimulus-Response Patterns-aware Network (SRP-Net) as an exemplary approach within this perspective. To assess SRP-Net, we constructed a multi-stimulus dataset and conducted experimental evaluations, demonstrating its exceptional cross-stimulus generalizability. Furthermore, we demonstrated the promising performance of SRP-Net in real medical scenarios and conducted an interpretability analysis of the stimulus-response patterns. Our research investigates the critical role of stimulus-response patterns in enhancing the generalizability of text-based depression detection models, which could help data-driven depression detection approach the diagnostic accuracy of psychiatrists.
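The cross-stimulus evaluation the abstract emphasizes can be illustrated with leave-one-stimulus-out cross-validation: train on responses elicited by all stimuli but one, then test on the held-out stimulus. The toy corpus, bag-of-words features, and logistic-regression classifier below are placeholder assumptions; SRP-Net itself is not reproduced here.

```python
# Sketch of cross-stimulus generalizability testing: each fold holds out all
# responses to one stimulus. Data, features, and classifier are toy stand-ins.
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import LeaveOneGroupOut
from sklearn.pipeline import make_pipeline

# Toy corpus: each response was elicited by one of three stimuli (prompts).
texts = np.array([
    "i feel hopeless and empty today", "nothing matters anymore to me",
    "we had a wonderful family dinner", "the hike this morning was great",
    "i cannot sleep and i cry a lot", "work is fine and i enjoy my team",
])
labels = np.array([1, 1, 0, 0, 1, 0])   # 1 = depressed, 0 = control
stimuli = np.array([0, 1, 0, 1, 2, 2])  # which prompt elicited each response

clf = make_pipeline(CountVectorizer(), LogisticRegression())
fold_scores = []
for train_idx, test_idx in LeaveOneGroupOut().split(texts, labels, stimuli):
    clf.fit(texts[train_idx], labels[train_idx])
    fold_scores.append(clf.score(texts[test_idx], labels[test_idx]))
# One score per held-out stimulus: how well the model transfers across stimuli.
print(fold_scores)
```

A model that only memorizes stimulus-specific vocabulary scores well within a stimulus but poorly on held-out ones, which is the failure mode the authors target.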
3
Pigoni A, Delvecchio G, Turtulici N, Madonna D, Pietrini P, Cecchetti L, Brambilla P. Machine learning and the prediction of suicide in psychiatric populations: a systematic review. Transl Psychiatry 2024;14:140. PMID: 38461283; PMCID: PMC10925059; DOI: 10.1038/s41398-024-02852-9.
Abstract
Machine learning (ML) has emerged as a promising tool for suicide prediction. However, because many large-sample studies mixed psychiatric and non-psychiatric populations, a formal psychiatric diagnosis emerged as a strong predictor of suicide risk, overshadowing subtler risk factors specific to distinct populations. To overcome this limitation, we conducted a systematic review of ML studies evaluating suicidal behaviors exclusively in psychiatric clinical populations. A systematic literature search was performed from inception through November 17, 2022 on PubMed, EMBASE, and Scopus following the PRISMA guidelines. Original research using ML techniques to assess the risk of suicide or predict suicide attempts in psychiatric populations was included. Risk of bias was assessed using the transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD) guidelines. A total of 1032 studies were retrieved, of which 81 satisfied the inclusion criteria and were included in the qualitative synthesis. Clinical and demographic features were the most frequently employed, and random forest, support vector machine, and convolutional neural network models performed better in terms of accuracy than other algorithms when directly compared. Despite heterogeneity in procedures, most studies reported an accuracy of 70% or greater based on features such as previous attempts, severity of the disorder, and pharmacological treatments. Although the reported evidence is promising, ML algorithms for suicide prediction still have limitations, including the lack of neurobiological and imaging data and of external validation samples. Overcoming these issues may lead to models that can be adopted in clinical practice. Further research is warranted to boost a field that holds the potential to critically impact suicide mortality.
Affiliation(s)
- Alessandro Pigoni
- Social and Affective Neuroscience Group, MoMiLab, IMT School for Advanced Studies Lucca, Lucca, Italy
- Department of Neurosciences and Mental Health, Fondazione IRCCS Ca' Granda, Ospedale Maggiore Policlinico, Milan, Italy
- Giuseppe Delvecchio
- Department of Neurosciences and Mental Health, Fondazione IRCCS Ca' Granda, Ospedale Maggiore Policlinico, Milan, Italy
- Nunzio Turtulici
- Department of Pathophysiology and Transplantation, University of Milan, Milan, Italy
- Domenico Madonna
- Department of Neurosciences and Mental Health, Fondazione IRCCS Ca' Granda, Ospedale Maggiore Policlinico, Milan, Italy
- Pietro Pietrini
- MoMiLab, IMT School for Advanced Studies Lucca, Lucca, Italy
- Luca Cecchetti
- Social and Affective Neuroscience Group, MoMiLab, IMT School for Advanced Studies Lucca, Lucca, Italy
- Paolo Brambilla
- Department of Neurosciences and Mental Health, Fondazione IRCCS Ca' Granda, Ospedale Maggiore Policlinico, Milan, Italy
- Department of Pathophysiology and Transplantation, University of Milan, Milan, Italy
4
Li TMH, Chen J, Law FOC, Li CT, Chan NY, Chan JWY, Chau SWH, Liu Y, Li SX, Zhang J, Leung KS, Wing YK. Detection of Suicidal Ideation in Clinical Interviews for Depression Using Natural Language Processing and Machine Learning: Cross-Sectional Study. JMIR Med Inform 2023;11:e50221. PMID: 38054498; DOI: 10.2196/50221.
Abstract
Background Assessing patients' suicide risk is challenging, especially among those who deny suicidal ideation. Primary care providers show poor agreement in screening for suicide risk. Patients' speech may provide more objective, language-based clues about underlying suicidal ideation, yet text analysis to detect suicide risk in depression is lacking in the literature. Objective This study aimed to determine whether suicidal ideation can be detected via language features in clinical interviews for depression using natural language processing (NLP) and machine learning (ML). Methods This cross-sectional study recruited 305 participants between October 2020 and May 2022 (mean age 53.0, SD 11.77 years; female: n=176, 57%), of whom 197 had lifetime depression and 108 were healthy. The study was part of ongoing research on characterizing depression with a case-control design. Of the participants, 236 were nonsuicidal, while 56 and 13 had low and high suicide risk, respectively. The structured interview guide for the Hamilton Depression Rating Scale (HAMD) was used to assess suicide risk and depression severity; suicide risk was clinician rated based on a suicide-related question (H11). The interviews were transcribed, and the words in participants' verbal responses were mapped to psychologically meaningful categories using Linguistic Inquiry and Word Count (LIWC). Results Ordinal logistic regression revealed significant suicide-related language features in participants' responses to the HAMD questions. Increased use of anger words when talking about work and activities was associated with the highest suicide risk (odds ratio [OR] 2.91, 95% CI 1.22-8.55; P=.02). Random forest models demonstrated that text analysis of the direct responses to H11 was effective in identifying individuals with high suicide risk (AUC 0.76-0.89; P<.001) and in detecting suicide risk in general, including both low and high risk (AUC 0.83-0.92; P<.001). More importantly, suicide risk could be detected with satisfactory performance even without patients' disclosure of suicidal ideation: based on responses to the question on hypochondriasis, ML models identified individuals with high suicide risk (AUC 0.76; P<.001). Conclusions This study demonstrated the use of NLP and ML to analyze text from clinical interviews for suicidality detection, which has the potential to provide more accurate and specific markers of suicidal ideation. The findings may pave the way for high-performance automated assessment of suicide risk, including online chatbot-based interviews for universal screening.
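The pipeline described, word-category rates feeding a random forest, can be sketched roughly as follows. LIWC is proprietary, so a tiny invented category lexicon stands in for it, and the transcripts, labels, and category word lists are fabricated toy data, not the study's material:

```python
# Sketch: LIWC-style category-rate features feeding a random forest.
# Categories, word lists, transcripts, and labels are invented for illustration.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

CATEGORIES = {
    "anger": {"hate", "furious", "annoyed", "rage"},
    "sadness": {"cry", "hopeless", "lonely", "grief"},
}

def liwc_like_features(transcript: str) -> list:
    """Fraction of words in each category, one feature per category."""
    words = transcript.lower().split()
    n = max(len(words), 1)
    return [sum(w in wordset for w in words) / n
            for wordset in CATEGORIES.values()]

# Fabricated training transcripts with clinician-rated risk (1 = at risk).
transcripts = [
    "i hate my job and i am furious all the time",
    "i cry every night and feel hopeless and lonely",
    "we went hiking and had a lovely picnic",
    "my garden is doing well this spring",
]
y = [1, 1, 0, 0]
X = np.array([liwc_like_features(t) for t in transcripts])

model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)
print(model.predict([liwc_like_features("i feel hopeless and i rage at everyone")]))
```

The real study fit such models per interview question (e.g. H11 or the hypochondriasis item) rather than on whole interviews, and used LIWC's full set of psychological categories.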
Affiliation(s)
- Tim M H Li
- Li Chiu Kong Family Sleep Assessment Unit, Department of Psychiatry, The Chinese University of Hong Kong, Hong Kong, China (Hong Kong)
- Jie Chen
- Li Chiu Kong Family Sleep Assessment Unit, Department of Psychiatry, The Chinese University of Hong Kong, Hong Kong, China (Hong Kong)
- Framenia O C Law
- Li Chiu Kong Family Sleep Assessment Unit, Department of Psychiatry, The Chinese University of Hong Kong, Hong Kong, China (Hong Kong)
- Chun-Tung Li
- Li Chiu Kong Family Sleep Assessment Unit, Department of Psychiatry, The Chinese University of Hong Kong, Hong Kong, China (Hong Kong)
- Ngan Yin Chan
- Li Chiu Kong Family Sleep Assessment Unit, Department of Psychiatry, The Chinese University of Hong Kong, Hong Kong, China (Hong Kong)
- Joey W Y Chan
- Li Chiu Kong Family Sleep Assessment Unit, Department of Psychiatry, The Chinese University of Hong Kong, Hong Kong, China (Hong Kong)
- Steven W H Chau
- Li Chiu Kong Family Sleep Assessment Unit, Department of Psychiatry, The Chinese University of Hong Kong, Hong Kong, China (Hong Kong)
- Yaping Liu
- Li Chiu Kong Family Sleep Assessment Unit, Department of Psychiatry, The Chinese University of Hong Kong, Hong Kong, China (Hong Kong)
- Shirley Xin Li
- Department of Psychology, The University of Hong Kong, Hong Kong, China (Hong Kong)
- The State Key Laboratory of Brain and Cognitive Sciences, The University of Hong Kong, Hong Kong, China (Hong Kong)
- Jihui Zhang
- Li Chiu Kong Family Sleep Assessment Unit, Department of Psychiatry, The Chinese University of Hong Kong, Hong Kong, China (Hong Kong)
- Guangdong Mental Health Center, Guangdong General Hospital and Guangdong Academy of Medical Sciences, Guangdong, China
- Kwong-Sak Leung
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong, China (Hong Kong)
- Department of Applied Data Science, Hong Kong Shue Yan University, Hong Kong, China (Hong Kong)
- Yun-Kwok Wing
- Li Chiu Kong Family Sleep Assessment Unit, Department of Psychiatry, The Chinese University of Hong Kong, Hong Kong, China (Hong Kong)