1
Linardon J, Messer M, Anderson C, Liu C, McClure Z, Jarman HK, Goldberg SB, Torous J. Role of large language models in mental health research: an international survey of researchers' practices and perspectives. BMJ Ment Health 2025; 28:e301787. PMID: 40514050; PMCID: PMC12164621; DOI: 10.1136/bmjment-2025-301787.
Abstract
BACKGROUND Large language models (LLMs) offer significant potential to streamline research workflows and enhance productivity. However, limited data exist on the extent of their adoption within the mental health research community. OBJECTIVE We examined how LLMs are being used in mental health research, the types of tasks they support, barriers to their adoption and broader attitudes towards their integration. METHODS 714 mental health researchers from 42 countries and various career stages (from PhD student to early career researcher to professor) completed a survey assessing LLM-related practices and perspectives. FINDINGS 496 (69.5%) reported using LLMs to assist with research, with 94% indicating use of ChatGPT. The most common applications were proofreading written work (69%) and refining or generating code (49%). LLM use was more prevalent among early career researchers. Common challenges reported by users included inaccurate responses (78%), ethical concerns (48%) and biased outputs (27%). However, many users indicated that LLMs improved efficiency (73%) and output quality (44%). Reasons for non-use were concerns about ethical issues (53%) and the accuracy of outputs (50%). Most agreed that they wanted more training on responsible use (77%), that researchers should be required to disclose use of LLMs in manuscripts (79%) and that they were concerned about LLMs affecting how their work is evaluated (60%). CONCLUSION While LLM use is widespread in mental health research, key barriers and implementation challenges remain. CLINICAL IMPLICATIONS LLMs may streamline mental health research processes, but clear guidelines are needed to support their ethical and transparent use across the research lifecycle.
Affiliation(s)
- Jake Linardon
- SEED Lifespan Strategic Research Centre, School of Psychology, Faculty of Health, Deakin University, Geelong, Victoria, Australia
- Mariel Messer
- SEED Lifespan Strategic Research Centre, School of Psychology, Faculty of Health, Deakin University, Geelong, Victoria, Australia
- Cleo Anderson
- SEED Lifespan Strategic Research Centre, School of Psychology, Faculty of Health, Deakin University, Geelong, Victoria, Australia
- Claudia Liu
- SEED Lifespan Strategic Research Centre, School of Psychology, Faculty of Health, Deakin University, Geelong, Victoria, Australia
- Zoe McClure
- SEED Lifespan Strategic Research Centre, School of Psychology, Faculty of Health, Deakin University, Geelong, Victoria, Australia
- Hannah K Jarman
- SEED Lifespan Strategic Research Centre, School of Psychology, Faculty of Health, Deakin University, Geelong, Victoria, Australia
- Simon B Goldberg
- Department of Counselling Psychology, University of Wisconsin, Madison, Wisconsin, USA
- Center for Healthy Minds, University of Wisconsin, Madison, Wisconsin, USA
- John Torous
- Division of Digital Psychiatry, Department of Psychiatry, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, Massachusetts, USA
2
Ni Y, Jia F. A Scoping Review of AI-Driven Digital Interventions in Mental Health Care: Mapping Applications Across Screening, Support, Monitoring, Prevention, and Clinical Education. Healthcare (Basel) 2025; 13:1205. PMID: 40428041; PMCID: PMC12110772; DOI: 10.3390/healthcare13101205.
Abstract
BACKGROUND/OBJECTIVES Artificial intelligence (AI)-enabled digital interventions are increasingly used to expand access to mental health care. This PRISMA-ScR scoping review maps how AI technologies support mental health care across five phases: pre-treatment (screening), treatment (therapeutic support), post-treatment (monitoring), clinical education, and population-level prevention. METHODS We synthesized findings from 36 empirical studies published through January 2024 that implemented AI-driven digital tools, including large language models (LLMs), machine learning (ML) models, and conversational agents. Use cases include referral triage, remote patient monitoring, empathic communication enhancement, and AI-assisted psychotherapy delivered via chatbots and voice agents. RESULTS Across the 36 included studies, the most common AI modalities included chatbots, natural language processing tools, machine learning and deep learning models, and large language model-based agents. These technologies were predominantly used for support, monitoring, and self-management purposes rather than as standalone treatments. Reported benefits included reduced wait times, increased engagement, and improved symptom tracking. However, recurring challenges such as algorithmic bias, data privacy risks, and workflow integration barriers highlight the need for ethical design and human oversight. CONCLUSION By introducing a four-pillar framework, this review offers a comprehensive overview of current applications and future directions in AI-augmented mental health care. It aims to guide researchers, clinicians, and policymakers in developing safe, effective, and equitable digital mental health interventions.
Affiliation(s)
- Yang Ni
- School of International and Public Affairs, Columbia University, New York, NY 10027, USA
- Fanli Jia
- Department of Psychology, Seton Hall University, South Orange, NJ 07079, USA
3
Albikawi Z, Abuadas M, Rayani AM. Nursing Students' Perceptions of AI-Driven Mental Health Support and Its Relationship with Anxiety, Depression, and Seeking Professional Psychological Help: Transitioning from Traditional Counseling to Digital Support. Healthcare (Basel) 2025; 13:1089. PMID: 40361868; PMCID: PMC12071227; DOI: 10.3390/healthcare13091089.
Abstract
Background: The integration of artificial intelligence (AI) into mental health care is reshaping psychological support systems, particularly for digitally literate populations such as nursing students. Given the high prevalence of anxiety and depression in this group, understanding their perceptions of AI-driven mental health support is critical for effective implementation. Objectives: To evaluate nursing students' perceptions of AI-driven mental health support and examine its relationship with anxiety, depression, and their attitudes toward seeking professional psychological help. Methods: A cross-sectional survey was conducted among 176 undergraduate nursing students in northern Jordan. Results: Students reported moderately positive perceptions of AI-driven mental health support (mean score: 36.70 ± 4.80). Multiple linear regression revealed that prior use of AI tools (β = 0.44, p < 0.0001), positive help-seeking attitudes (β = 0.41, p < 0.0001), and higher levels of psychological distress encompassing both anxiety (β = 0.29, p = 0.005) and depression (β = 0.24, p = 0.007) significantly predicted more positive perceptions. Daily AI usage was not a significant predictor (β = 0.15, p = 0.174). Logistic regression analysis further indicated that psychological distress, reflected by elevated anxiety (OR = 1.42, p = 0.002) and depression scores (OR = 1.32, p = 0.003), along with stronger help-seeking attitudes (OR = 1.35, p = 0.011), significantly increased the likelihood of using AI-based mental health support. Conclusions: AI-driven mental health tools hold promise as adjuncts to traditional counseling, particularly for nursing students experiencing psychological distress. Despite growing acceptance, concerns regarding data privacy, bias, and lack of human empathy remain. Ethical integration and blended care models are essential for effective mental health support.
Affiliation(s)
- Zainab Albikawi
- Faculty of Nursing, Yarmouk University, Irbid P.O. Box 566, Jordan
- Mohammad Abuadas
- Faculty of Nursing, Yarmouk University, Irbid P.O. Box 566, Jordan
- Ahmad M. Rayani
- Community and Psychiatric Mental Health Nursing Department, College of Nursing, King Saud University, Riyadh City P.O. Box 12372, Saudi Arabia
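The modelling strategy described in the study above (multiple linear regression for perception scores plus logistic regression for actual use of AI-based support) can be sketched in a few lines of Python. The sketch below uses synthetic stand-in data and invented variable names; it illustrates the general approach only and is not the authors' dataset or code.

```python
# Illustrative sketch only: predictor/outcome names are stand-ins for the
# survey variables described in the abstract, and the data are synthetic.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 176  # sample size reported in the study
df = pd.DataFrame({
    "prior_ai_use": rng.integers(0, 2, n),          # used AI tools before (0/1)
    "help_seeking_attitude": rng.normal(0, 1, n),   # standardized attitude score
    "anxiety": rng.normal(0, 1, n),                 # standardized anxiety score
    "depression": rng.normal(0, 1, n),              # standardized depression score
})
df["perception"] = (36.7 + 0.44 * df.prior_ai_use + 0.41 * df.help_seeking_attitude
                    + 0.29 * df.anxiety + 0.24 * df.depression + rng.normal(0, 4.8, n))
df["uses_ai_support"] = rng.integers(0, 2, n)       # binary outcome (0/1)

X = sm.add_constant(df[["prior_ai_use", "help_seeking_attitude", "anxiety", "depression"]])

# Multiple linear regression: which variables predict perception scores?
ols = sm.OLS(df["perception"], X).fit()
print(ols.summary())

# Logistic regression: which variables predict actual use of AI-based support?
logit = sm.Logit(df["uses_ai_support"], X).fit()
print(np.exp(logit.params))  # exponentiated coefficients, i.e., odds ratios
```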
4
Lauderdale S, Griffin SA, Lahman KR, Mbaba E, Tomlinson S. Unveiling Public Stigma for Borderline Personality Disorder: A Comparative Study of Artificial Intelligence and Mental Health Care Providers. Personal Ment Health 2025; 19:e70018. PMID: 40272185; DOI: 10.1002/pmh.70018.
Abstract
Generative artificial intelligence (GAI) programs can identify symptoms and make recommendations for treatment for mental disorders, including borderline personality disorder (BPD). Despite GAI's potential as a clinical tool, stereotypes are inherent in their algorithms but not obvious until directly assessed. Given this concern, we assessed and compared GAIs' (ChatGPT-3.5, 4, and Google Gemini) symptom recognition and public stigma for woman and man vignette characters with BPD. The GAIs' responses were also compared to a sample of mental health care practitioners (MHCPs; n = 218). Compared to MHCPs, GAI showed more empathy for the characters. GAI were also less likely to view the characters' mental health symptoms as developmental stage problems and rated these symptoms as more chronic and unchangeable. The GAI also rated the vignette characters as less trustworthy and more likely to have difficulty forming close relationships than the MHCPs. Across GAI, gender biases were found, with Google Gemini showing less empathy, more negative reactions, and greater public stigma, particularly for a woman with BPD, than either ChatGPT-3.5 or ChatGPT-4. A woman with BPD was also rated as having more chronic mental health problems than a man by all GAI. Overall, these results suggest that GAI may express empathy but reflect gender bias and stereotyped beliefs about people with BPD. Greater transparency and incorporation of knowledgeable MHCPs and people with lived experience are needed in GAI training to reduce bias and enhance their accuracy prior to use in mental health applications.
Affiliation(s)
- Eno Mbaba
- University of Houston, Houston, Texas, USA
5
Lauderdale SA, Schmitt R, Wuckovich B, Dalal N, Desai H, Tomlinson S. Effectiveness of generative AI-large language models' recognition of veteran suicide risk: a comparison with human mental health providers using a risk stratification model. Front Psychiatry 2025; 16:1544951. PMID: 40248601; PMCID: PMC12003356; DOI: 10.3389/fpsyt.2025.1544951.
Abstract
Background With over 6,300 United States military veterans dying by suicide annually, the Veterans Health Administration (VHA) is exploring innovative strategies, including artificial intelligence (AI), for suicide risk assessment. Machine learning has been predominantly utilized, but the application of generative AI-large language models (GAI-LLMs) remains unexplored. Objective This study evaluates the effectiveness of GAI-LLMs, specifically ChatGPT-3.5, ChatGPT-4o, and Google Gemini, in using the VHA's Risk Stratification Table for identifying suicide risks and making treatment recommendations in response to standardized veteran vignettes. Methods We compared the GAI-LLMs' assessments and recommendations for both acute and chronic suicide risks to evaluations by mental health care providers (MHCPs). Four vignettes, representing varying levels of suicide risk, were used. Results GAI-LLMs' assessments showed discrepancies with MHCPs', particularly rating the most acute case as less acute and the least acute case as more acute. For chronic risk, GAI-LLMs' evaluations were generally in line with MHCPs', except for one vignette rated with higher chronic risk by the GAI-LLMs. Variation across GAI-LLMs was also observed. Notably, ChatGPT-3.5 showed lower acute risk ratings compared to ChatGPT-4o and Google Gemini, while ChatGPT-4o identified higher chronic risk ratings and recommended hospitalization for all veterans. Treatment planning by GAI-LLMs was predicted by chronic but not acute risk ratings. Conclusion While GAI-LLMs offer suicide risk assessments potentially comparable to those of MHCPs, significant variation exists across different GAI-LLMs in both risk evaluation and treatment recommendations. Continued MHCP oversight is essential to ensure accuracy and appropriate care. Implications These findings highlight the need for further research into optimizing GAI-LLMs for consistent and reliable use in clinical settings, ensuring they complement rather than replace human expertise.
Affiliation(s)
- Sean A. Lauderdale
- Department of Psychological and Behavioral Sciences, University of Houston – Clear Lake, Houston, TX, United States
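Agreement between LLM-assigned and clinician-assigned risk levels, as compared in the study above, is commonly summarized with a weighted kappa statistic. A small illustrative sketch follows; the ratings are invented and the quadratic weighting is an assumption for the example, not the study's reported analysis.

```python
# Hypothetical example: four vignettes rated on an ordinal acute-risk scale
# (1 = low, 2 = intermediate, 3 = high). Values are invented for illustration.
from sklearn.metrics import cohen_kappa_score

clinician_ratings = [3, 2, 2, 1]   # consensus ratings from mental health providers
llm_ratings       = [2, 2, 3, 2]   # ratings produced by a GAI-LLM for the same vignettes

# Quadratic weighting penalizes large disagreements on an ordinal scale more heavily.
kappa = cohen_kappa_score(clinician_ratings, llm_ratings, weights="quadratic")
print(f"Weighted kappa (LLM vs clinicians): {kappa:.2f}")
```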
6
Resnik DB, Hosseini M. The ethics of using artificial intelligence in scientific research: new guidance needed for a new tool. AI Ethics 2025; 5:1499-1521. PMID: 40337745; PMCID: PMC12057767; DOI: 10.1007/s43681-024-00493-8.
Abstract
Using artificial intelligence (AI) in research offers many important benefits for science and society but also creates novel and complex ethical issues. While these ethical issues do not necessitate changing established ethical norms of science, they require the scientific community to develop new guidance for the appropriate use of AI. In this article, we briefly introduce AI and explain how it can be used in research, examine some of the ethical issues raised when using it, and offer nine recommendations for responsible use, including: (1) Researchers are responsible for identifying, describing, reducing, and controlling AI-related biases and random errors; (2) Researchers should disclose, describe, and explain their use of AI in research, including its limitations, in language that can be understood by non-experts; (3) Researchers should engage with impacted communities, populations, and other stakeholders concerning the use of AI in research to obtain their advice and assistance and address their interests and concerns, such as issues related to bias; (4) Researchers who use synthetic data should (a) indicate which parts of the data are synthetic; (b) clearly label the synthetic data; (c) describe how the data were generated; and (d) explain how and why the data were used; (5) AI systems should not be named as authors, inventors, or copyright holders but their contributions to research should be disclosed and described; (6) Education and mentoring in responsible conduct of research should include discussion of ethical use of AI.
Affiliation(s)
- David B. Resnik
- National Institute of Environmental Health Sciences, Durham, USA
- Mohammad Hosseini
- Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL, USA
- Galter Health Sciences Library and Learning Center, Northwestern University Feinberg School of Medicine, Chicago, IL, USA
7
Rotstein NM, Cohen ZD, Welborn A, Zbozinek TD, Akre S, Jones KG, Null KE, Pontanares J, Sanchez KL, Flanagan DC, Halavi SE, Kittle E, McClay MG, Bui AAT, Narr KL, Welsh RC, Craske MG, Kuhn TP. Investigating low intensity focused ultrasound pulsation in anhedonic depression-A randomized controlled trial. Front Hum Neurosci 2025; 19:1478534. PMID: 40196448; PMCID: PMC11973349; DOI: 10.3389/fnhum.2025.1478534.
Abstract
Introduction Anhedonic depression is a subtype of depression characterized by deficits in reward processing. This subtype of depression is associated with higher suicide risk and longer depressive episodes, underscoring the importance of effective treatments. Anhedonia has also been found to correlate with alterations in activity in several subcortical regions, including the caudate head and nucleus accumbens. Low intensity focused ultrasound pulsation (LIFUP) is an emerging technology that enables non-invasive stimulation of these subcortical regions, which were previously only accessible with surgically-implanted electrodes. Methods This double-blinded, sham-controlled study aims to investigate the effects of LIFUP to the left caudate head and right nucleus accumbens in participants with anhedonic depression. Participants in this protocol will undergo three sessions of LIFUP over the span of 5-9 days. To investigate LIFUP-related changes, this 7-week protocol collects continuous digital phenotyping data, an array of self-report measures of depression, anhedonia, and other psychopathology, and magnetic resonance imaging (MRI) before and after the LIFUP intervention. Primary self-report outcome measures include Ecological Momentary Assessment, the Positive Valence Systems Scale, and the Patient Health Questionnaire. Primary imaging measures include magnetic resonance spectroscopy and functional MRI during reward-based tasks and at rest. Digital phenotyping data is collected with an Apple Watch and participants' personal iPhones throughout the study, and includes information about sleep, heart rate, and physical activity. Discussion This study is the first to investigate the effects of LIFUP to the caudate head or nucleus accumbens in depressed subjects. Furthermore, the data collected for this protocol covers a wide array of potentially affected modalities. As a result, this protocol will help to elucidate potential impacts of LIFUP in individuals with anhedonic depression.
Affiliation(s)
- Natalie M. Rotstein
- Department of Psychiatry and Biobehavioral Sciences, University of California, Los Angeles, Los Angeles, CA, United States
- Zachary D. Cohen
- Department of Psychology, University of Arizona, Tucson, AZ, United States
- Amelia Welborn
- Department of Psychiatry and Biobehavioral Sciences, University of California, Los Angeles, Los Angeles, CA, United States
- Tomislav D. Zbozinek
- Department of Psychiatry and Biobehavioral Sciences, University of California, Los Angeles, Los Angeles, CA, United States
- Samir Akre
- Medical & Imaging Informatics Group, University of California, Los Angeles, Los Angeles, CA, United States
- Keith G. Jones
- Department of Psychiatry and Biobehavioral Sciences, University of California, Los Angeles, Los Angeles, CA, United States
- Kaylee E. Null
- Department of Psychology, University of California, Los Angeles, Los Angeles, CA, United States
- Jillian Pontanares
- Department of Psychiatry and Biobehavioral Sciences, University of California, Los Angeles, Los Angeles, CA, United States
- Katy L. Sanchez
- Department of Psychiatry and Biobehavioral Sciences, University of California, Los Angeles, Los Angeles, CA, United States
- Demarko C. Flanagan
- Department of Neurology, University of California, Los Angeles, Los Angeles, CA, United States
- Sabrina E. Halavi
- Department of Psychiatry and Biobehavioral Sciences, University of California, Los Angeles, Los Angeles, CA, United States
- Evan Kittle
- Department of Psychiatry and Biobehavioral Sciences, University of California, Los Angeles, Los Angeles, CA, United States
- Mason G. McClay
- Department of Psychology, University of California, Los Angeles, Los Angeles, CA, United States
- Alex A. T. Bui
- Medical & Imaging Informatics Group, University of California, Los Angeles, Los Angeles, CA, United States
- Katherine L. Narr
- Department of Psychiatry and Biobehavioral Sciences, University of California, Los Angeles, Los Angeles, CA, United States
- Department of Neurology, University of California, Los Angeles, Los Angeles, CA, United States
- Robert C. Welsh
- Department of Psychiatry and Biobehavioral Sciences, University of California, Los Angeles, Los Angeles, CA, United States
- Michelle G. Craske
- Department of Psychiatry and Biobehavioral Sciences, University of California, Los Angeles, Los Angeles, CA, United States
- Department of Psychology, University of California, Los Angeles, Los Angeles, CA, United States
- Taylor P. Kuhn
- Department of Psychiatry and Biobehavioral Sciences, University of California, Los Angeles, Los Angeles, CA, United States
8
Ayik G, Kolac UC, Aksoy T, Yilmaz A, Sili MV, Tokgozoglu M, Huri G. Exploring the role of artificial intelligence in Turkish orthopedic progression exams. Acta Orthop Traumatol Turc 2025; 59:18-26. PMID: 40337975; PMCID: PMC11992947; DOI: 10.5152/j.aott.2025.24090.
Abstract
Objective The aim of this study was to evaluate and compare the performance of the artificial intelligence (AI) models ChatGPT-3.5, ChatGPT-4, and Gemini on the Turkish Specialization Training and Development Examination (UEGS) to determine their utility in medical education and their potential to improve patient care. Methods This retrospective study analyzed responses of ChatGPT-3.5, ChatGPT-4, and Gemini to 1000 true or false questions from the UEGS administered over 5 years (2018-2023). Questions, encompassing 9 orthopedic subspecialties, were categorized by 2 independent residents, with discrepancies resolved by a senior author. Artificial intelligence models were restarted for each query to prevent data retention. Performance was evaluated by calculating net scores and comparing them to orthopedic resident scores obtained from the Turkish Orthopedics and Traumatology Education Council (TOTEK) database. Statistical analyses included chi-squared tests, Bonferroni-adjusted Z tests, Cochran's Q test, and receiver operating characteristic (ROC) analysis to determine the optimal question length for AI accuracy. All AI responses were generated independently without retaining prior information. Results Significant differences in AI tool accuracy were observed across different years and subspecialties (P < .001). ChatGPT-4 consistently outperformed the other models, achieving the highest overall accuracy (95% in specific subspecialties). Notably, ChatGPT-4 demonstrated superior performance in Basic and General Orthopedics and Foot and Ankle Surgery, while Gemini and ChatGPT-3.5 showed variability in accuracy across topics and years. Receiver operating characteristic analysis revealed a significant relationship between shorter letter counts and higher accuracy for ChatGPT-4 (P=.002). ChatGPT-4 showed significant negative correlations between letter count and accuracy across all years (r = -0.099, P=.002) and, unlike the other AI models, outperformed residents in basic and general orthopedics (P=.015) and trauma (P=.012). Conclusion The findings underscore the advancing role of AI in the medical field, with ChatGPT-4 demonstrating significant potential as a tool for medical education and clinical decision-making. Continuous evaluation and refinement of AI technologies are essential to enhance their educational and clinical impact.
Affiliation(s)
- Gokhan Ayik
- Department of Orthopedics and Traumatology, Yuksek Ihtisas University Faculty of Medicine, Ankara, Türkiye
- Ulas Can Kolac
- Department of Orthopedics and Traumatology, Hacettepe University Faculty of Medicine, Ankara, Türkiye
- Taha Aksoy
- Department of Orthopedics and Traumatology, Hacettepe University Faculty of Medicine, Ankara, Türkiye
- Abdurrahman Yilmaz
- Department of Orthopedics and Traumatology, Hacettepe University Faculty of Medicine, Ankara, Türkiye
- Mazlum Veysel Sili
- Department of Orthopedics and Traumatology, Hacettepe University Faculty of Medicine, Ankara, Türkiye
- Mazhar Tokgozoglu
- Department of Orthopedics and Traumatology, Hacettepe University Faculty of Medicine, Ankara, Türkiye
- Gazi Huri
- Department of Orthopedics and Traumatology, Hacettepe University Faculty of Medicine, Ankara, Türkiye
- Aspetar, Orthopedic and Sports Medicine Hospital, FIFA Medical Center of Excellence, Doha, Qatar
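The ROC analysis mentioned in the study above, relating question length to whether a model answers correctly, can be sketched as follows. The data and the length-accuracy relationship below are simulated purely for illustration; only the general recipe (ROC curve, AUC, Youden's J cutoff) reflects the described method.

```python
# Minimal sketch of a ROC analysis relating question length to whether an
# AI model answered correctly. Synthetic data; the AUC and cutoff are illustrative.
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

rng = np.random.default_rng(42)
letter_counts = rng.integers(40, 400, size=1000)            # question length in letters
# Assume (for illustration only) shorter questions are answered correctly more often.
p_correct = 1 / (1 + np.exp((letter_counts - 200) / 60))
correct = rng.random(1000) < p_correct

# Use negative length as the score so that higher scores mean "more likely correct".
fpr, tpr, thresholds = roc_curve(correct, -letter_counts)
auc = roc_auc_score(correct, -letter_counts)

# Youden's J statistic picks the cutoff that best separates correct from incorrect.
best = np.argmax(tpr - fpr)
print(f"AUC = {auc:.2f}; optimal letter-count cutoff: {-thresholds[best]:.0f}")
```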
9
Rahsepar Meadi M, Sillekens T, Metselaar S, van Balkom A, Bernstein J, Batelaan N. Exploring the Ethical Challenges of Conversational AI in Mental Health Care: Scoping Review. JMIR Ment Health 2025; 12:e60432. PMID: 39983102; PMCID: PMC11890142; DOI: 10.2196/60432.
Abstract
BACKGROUND Conversational artificial intelligence (CAI) is emerging as a promising digital technology for mental health care. CAI apps, such as psychotherapeutic chatbots, are available in app stores, but their use raises ethical concerns. OBJECTIVE We aimed to provide a comprehensive overview of ethical considerations surrounding CAI as a therapist for individuals with mental health issues. METHODS We conducted a systematic search across PubMed, Embase, APA PsycINFO, Web of Science, Scopus, the Philosopher's Index, and ACM Digital Library databases. Our search comprised 3 elements: embodied artificial intelligence, ethics, and mental health. We defined CAI as a conversational agent that interacts with a person and uses artificial intelligence to formulate output. We included articles discussing the ethical challenges of CAI functioning in the role of a therapist for individuals with mental health issues. We added additional articles through snowball searching. We included articles in English or Dutch. All types of articles were considered except abstracts of symposia. Screening for eligibility was done by 2 independent researchers (MRM and TS or AvB). An initial charting form was created based on the expected considerations and revised and complemented during the charting process. The ethical challenges were divided into themes. When a concern occurred in more than 2 articles, we identified it as a distinct theme. RESULTS We included 101 articles, of which 95% (n=96) were published in 2018 or later. Most were reviews (n=22, 21.8%) followed by commentaries (n=17, 16.8%). The following 10 themes were distinguished: (1) safety and harm (discussed in 52/101, 51.5% of articles); the most common topics within this theme were suicidality and crisis management, harmful or wrong suggestions, and the risk of dependency on CAI; (2) explicability, transparency, and trust (n=26, 25.7%), including topics such as the effects of "black box" algorithms on trust; (3) responsibility and accountability (n=31, 30.7%); (4) empathy and humanness (n=29, 28.7%); (5) justice (n=41, 40.6%), including themes such as health inequalities due to differences in digital literacy; (6) anthropomorphization and deception (n=24, 23.8%); (7) autonomy (n=12, 11.9%); (8) effectiveness (n=38, 37.6%); (9) privacy and confidentiality (n=62, 61.4%); and (10) concerns for health care workers' jobs (n=16, 15.8%). Other themes were discussed in 9.9% (n=10) of the identified articles. CONCLUSIONS Our scoping review has comprehensively covered ethical aspects of CAI in mental health care. While certain themes remain underexplored and stakeholders' perspectives are insufficiently represented, this study highlights critical areas for further research. These include evaluating the risks and benefits of CAI in comparison to human therapists, determining its appropriate roles in therapeutic contexts and its impact on care access, and addressing accountability. Addressing these gaps can inform normative analysis and guide the development of ethical guidelines for responsible CAI use in mental health care.
Affiliation(s)
- Mehrdad Rahsepar Meadi
- Department of Psychiatry, Amsterdam Public Health, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
- Department of Ethics, Law, & Humanities, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
- Tomas Sillekens
- GGZ Centraal Mental Health Care, Amersfoort, The Netherlands
- Suzanne Metselaar
- Department of Ethics, Law, & Humanities, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
- Anton van Balkom
- Department of Psychiatry, Amsterdam Public Health, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
- Justin Bernstein
- Department of Philosophy, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
- Neeltje Batelaan
- Department of Psychiatry, Amsterdam Public Health, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
10
Holmes G, Tang B, Gupta S, Venkatesh S, Christensen H, Whitton A. Applications of Large Language Models in the Field of Suicide Prevention: Scoping Review. J Med Internet Res 2025; 27:e63126. PMID: 39847414; PMCID: PMC11809463; DOI: 10.2196/63126.
Abstract
BACKGROUND Prevention of suicide is a global health priority. Approximately 800,000 individuals die by suicide yearly, and for every suicide death, there are another 20 estimated suicide attempts. Large language models (LLMs) hold the potential to enhance scalable, accessible, and affordable digital services for suicide prevention and self-harm interventions. However, their use also raises clinical and ethical questions that require careful consideration. OBJECTIVE This scoping review aims to identify emergent trends in LLM applications in the field of suicide prevention and self-harm research. In addition, it summarizes key clinical and ethical considerations relevant to this nascent area of research. METHODS Searches were conducted in 4 databases (PsycINFO, Embase, PubMed, and IEEE Xplore) in February 2024. Eligible studies described the application of LLMs for suicide or self-harm prevention, detection, or management. English-language peer-reviewed articles and conference proceedings were included, without date restrictions. Narrative synthesis was used to synthesize study characteristics, objectives, models, data sources, proposed clinical applications, and ethical considerations. This review adhered to the PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews) standards. RESULTS Of the 533 studies identified, 36 (6.8%) met the inclusion criteria. An additional 7 studies were identified through citation chaining, resulting in 43 studies for review. The studies showed a bifurcation of publication fields, with varying publication norms between computer science and mental health. While most of the studies (33/43, 77%) focused on identifying suicide risk, newer applications leveraging generative functions (eg, support, education, and training) are emerging. Social media was the most common source of LLM training data. Bidirectional Encoder Representations from Transformers (BERT) was the predominant model used, although generative pretrained transformers (GPTs) featured prominently in generative applications. Clinical LLM applications were reported in 60% (26/43) of the studies, often for suicide risk detection or as clinical assistance tools. Ethical considerations were reported in 33% (14/43) of the studies, with privacy, confidentiality, and consent strongly represented. CONCLUSIONS This evolving research area, bridging computer science and mental health, demands a multidisciplinary approach. While open access models and datasets will likely shape the field of suicide prevention, documenting their limitations and potential biases is crucial. High-quality training data are essential for refining these models and mitigating unwanted biases. Policies that address ethical concerns-particularly those related to privacy and security when using social media data-are imperative. Limitations include high variability across disciplines in how LLMs and study methodology are reported. The emergence of generative artificial intelligence signals a shift in approach, particularly in applications related to care, support, and education, such as improved crisis care and gatekeeper training methods, clinician copilot models, and improved educational practices. Ongoing human oversight-through human-in-the-loop testing or expert external validation-is essential for responsible development and use. TRIAL REGISTRATION OSF Registries osf.io/nckq7; https://osf.io/nckq7.
Affiliation(s)
- Glenn Holmes
- Black Dog Institute, University of New South Wales, Sydney, Randwick, Australia
- Biya Tang
- Black Dog Institute, University of New South Wales, Sydney, Randwick, Australia
- Sunil Gupta
- Applied Artificial Intelligence Institute, Deakin University, Melbourne, Australia
- Svetha Venkatesh
- Applied Artificial Intelligence Institute, Deakin University, Melbourne, Australia
- Helen Christensen
- Black Dog Institute, University of New South Wales, Sydney, Randwick, Australia
- Alexis Whitton
- Black Dog Institute, University of New South Wales, Sydney, Randwick, Australia
11
Lange M, Koliousis A, Fayez F, Gogarty E, Twumasi R. Schizophrenia more employable than depression? Language-based artificial intelligence model ratings for employability of psychiatric diagnoses and somatic and healthy controls. PLoS One 2025; 20:e0315768. PMID: 39774560; PMCID: PMC11709238; DOI: 10.1371/journal.pone.0315768.
Abstract
Artificial Intelligence (AI) assists recruiting and job searching. Such systems can be biased against certain characteristics, resulting in potential misrepresentations and consequent inequalities for people with mental health disorders. Hence, occupational and mental health bias in existing Natural Language Processing (NLP) models used in recruiting and job hunting must be assessed. We examined occupational bias against mental health disorders in NLP models through relationships between occupations, employability, and psychiatric diagnoses. We investigated Word2Vec and GloVe embedding algorithms through analogy questions and graphical representation of cosine similarities. Word2Vec embeddings exhibit minor bias against mental health disorders when asked analogies regarding employability attributes and no evidence of bias when asked analogies regarding high-earning jobs. GloVe embeddings view common mental health disorders such as depression as less healthy and less employable than severe mental health disorders and most physical health conditions. Overall, physical and psychiatric disorders are seen as similarly healthy and employable. Both algorithms appear to be safe for use in downstream tasks without major repercussions, although further research is needed to confirm this. This project was funded by the London Interdisciplinary Social Science Doctoral Training Programme (LISS-DTP). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Affiliation(s)
- Maximin Lange
- Institute of Psychiatry, Psychology & Neuroscience, King’s College London, London, United Kingdom
- Feras Fayez
- King’s College Hospital NHS Foundation Trust, London, United Kingdom
- Imperial College Healthcare NHS Trust, London, United Kingdom
- Eoin Gogarty
- Institute of Psychiatry, Psychology & Neuroscience, King’s College London, London, United Kingdom
- King’s College Hospital NHS Foundation Trust, London, United Kingdom
- Ricardo Twumasi
- Institute of Psychiatry, Psychology & Neuroscience, King’s College London, London, United Kingdom
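The embedding probes used in the study above (analogy questions and cosine similarities over Word2Vec/GloVe vectors) can be illustrated with a short sketch. The pretrained model and probe words below are stand-ins chosen because they are readily available through gensim; they are not the authors' exact stimuli or analysis.

```python
# Sketch: probing pretrained word embeddings for associations between
# diagnosis terms and employability terms. Illustrative only.
import gensim.downloader as api

# A small pretrained GloVe model distributed via gensim-data (stand-in choice).
vectors = api.load("glove-wiki-gigaword-100")

diagnoses = ["depression", "schizophrenia", "diabetes"]
attributes = ["employable", "reliable", "healthy"]

# Cosine similarity between each diagnosis term and each attribute term.
# Guard against out-of-vocabulary words before querying.
for d in diagnoses:
    for a in attributes:
        if d in vectors and a in vectors:
            print(f"{d:14s} ~ {a:10s}: {vectors.similarity(d, a):+.3f}")

# Analogy-style query: "depression is to employable as schizophrenia is to ...?"
print(vectors.most_similar(positive=["schizophrenia", "employable"],
                           negative=["depression"], topn=5))
```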
12
Li F, Wang S, Gao Z, Qing M, Pan S, Liu Y, Hu C. Harnessing artificial intelligence in sepsis care: advances in early detection, personalized treatment, and real-time monitoring. Front Med (Lausanne) 2025; 11:1510792. PMID: 39835096; PMCID: PMC11743359; DOI: 10.3389/fmed.2024.1510792.
Abstract
Sepsis remains a leading cause of morbidity and mortality worldwide due to its rapid progression and heterogeneous nature. This review explores the potential of Artificial Intelligence (AI) to transform sepsis management, from early detection to personalized treatment and real-time monitoring. AI, particularly through machine learning (ML) techniques such as random forest models and deep learning algorithms, has shown promise in analyzing electronic health record (EHR) data to identify patterns that enable early sepsis detection. For instance, random forest models have demonstrated high accuracy in predicting sepsis onset in intensive care unit (ICU) patients, while deep learning approaches have been applied to recognize complications such as sepsis-associated acute respiratory distress syndrome (ARDS). Personalized treatment plans developed through AI algorithms predict patient-specific responses to therapies, optimizing therapeutic efficacy and minimizing adverse effects. AI-driven continuous monitoring systems, including wearable devices, provide real-time predictions of sepsis-related complications, enabling timely interventions. Beyond these advancements, AI enhances diagnostic accuracy, predicts long-term outcomes, and supports dynamic risk assessment in clinical settings. However, ethical challenges, including data privacy concerns and algorithmic biases, must be addressed to ensure fair and effective implementation. The significance of this review lies in addressing the current limitations in sepsis management and highlighting how AI can overcome these hurdles. By leveraging AI, healthcare providers can significantly enhance diagnostic accuracy, optimize treatment protocols, and improve overall patient outcomes. Future research should focus on refining AI algorithms with diverse datasets, integrating emerging technologies, and fostering interdisciplinary collaboration to address these challenges and realize AI's transformative potential in sepsis care.
Affiliation(s)
- Fang Li
- Department of General Surgery, Chongqing General Hospital, Chongqing, China
- Shengguo Wang
- Department of Stomatology, The Second Affiliated Hospital of Chongqing Medical University, Chongqing, China
- Zhi Gao
- Department of Stomatology, The Second Affiliated Hospital of Chongqing Medical University, Chongqing, China
- Maofeng Qing
- Department of Stomatology, The Second Affiliated Hospital of Chongqing Medical University, Chongqing, China
- Shan Pan
- Department of Stomatology, The Second Affiliated Hospital of Chongqing Medical University, Chongqing, China
- Yingying Liu
- Department of Stomatology, The Second Affiliated Hospital of Chongqing Medical University, Chongqing, China
- Chengchen Hu
- Department of Stomatology, The Second Affiliated Hospital of Chongqing Medical University, Chongqing, China
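The review above highlights random forest models trained on electronic health record data for early sepsis detection. A minimal sketch of that kind of pipeline is shown below; all features, labels, and effect sizes are synthetic placeholders used only to make the example runnable, not clinical parameters.

```python
# Toy random-forest sketch for early sepsis prediction from EHR-style features.
# All data and feature names are synthetic placeholders.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(7)
n = 5000
X = pd.DataFrame({
    "heart_rate": rng.normal(85, 15, n),
    "resp_rate": rng.normal(18, 4, n),
    "temp_c": rng.normal(37.0, 0.8, n),
    "wbc": rng.normal(9, 3, n),
    "lactate": rng.gamma(2.0, 1.0, n),
})
# Synthetic label loosely tied to lactate and heart rate, for illustration only.
risk = 0.04 * (X["heart_rate"] - 85) + 0.9 * (X["lactate"] - 2)
y = (risk + rng.normal(0, 1, n)) > 1.0

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)
clf = RandomForestClassifier(n_estimators=300, random_state=0).fit(X_train, y_train)

# Evaluate discrimination and inspect which features the model leaned on.
probs = clf.predict_proba(X_test)[:, 1]
print("AUROC:", round(roc_auc_score(y_test, probs), 3))
print(dict(zip(X.columns, clf.feature_importances_.round(3))))
```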
13
Haider SA, Borna S, Gomez-Cabello CA, Pressman SM, Haider CR, Forte AJ. The Algorithmic Divide: A Systematic Review on AI-Driven Racial Disparities in Healthcare. J Racial Ethn Health Disparities 2024. PMID: 39695057; DOI: 10.1007/s40615-024-02237-0.
Abstract
INTRODUCTION As artificial intelligence (AI) continues to permeate various sectors, concerns about disparities arising from its deployment have surfaced. AI's effectiveness correlates not only with the algorithm's quality but also with its training data's integrity. This systematic review investigates the racial disparities perpetuated by AI systems across diverse medical domains and the implications of deploying them, particularly in healthcare. METHODS Six electronic databases (PubMed, Scopus, IEEE, Google Scholar, EMBASE, and Cochrane) were systematically searched on October 3, 2023. Inclusion criteria were peer-reviewed articles in English from 2013 to 2023 that examined instances of racial bias perpetuated by AI in healthcare. Studies conducted outside of healthcare settings or addressing biases other than racial were excluded, as were letters and opinion pieces. The risk of bias was assessed using CASP criteria for reviews and the Modified Newcastle Scale for observational studies. RESULTS Following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines, 1272 articles were initially identified, of which 26 met eligibility criteria. Four articles were identified via snowballing, resulting in 30 articles in the analysis. Studies indicate a significant association between AI utilization and the exacerbation of racial disparities, especially in minority populations, including Blacks and Hispanics. Biased data, algorithm design, unfair deployment of algorithms, and historic/systemic inequities were identified as the causes. Study limitations stem from heterogeneity impeding broad comparisons and the preclusion of meta-analysis. CONCLUSION To address racial disparities in healthcare outcomes, enhanced ethical considerations and regulatory frameworks are needed in AI healthcare applications. Comprehensive bias detection tools and mitigation strategies, coupled with active supervision by physicians, are essential to ensure AI becomes a tool for reducing racial disparities in healthcare outcomes.
Affiliation(s)
- Syed Ali Haider
- Division of Plastic Surgery, Mayo Clinic, 4500 San Pablo Rd, Jacksonville, FL, 32224, USA
- Sahar Borna
- Division of Plastic Surgery, Mayo Clinic, 4500 San Pablo Rd, Jacksonville, FL, 32224, USA
- Cesar A Gomez-Cabello
- Division of Plastic Surgery, Mayo Clinic, 4500 San Pablo Rd, Jacksonville, FL, 32224, USA
- Sophia M Pressman
- Division of Plastic Surgery, Mayo Clinic, 4500 San Pablo Rd, Jacksonville, FL, 32224, USA
- Clifton R Haider
- Department of Physiology and Biomedical Engineering, Mayo Clinic, Rochester, MN, USA
- Antonio Jorge Forte
- Division of Plastic Surgery, Mayo Clinic, 4500 San Pablo Rd, Jacksonville, FL, 32224, USA
- Center for Digital Health, Mayo Clinic, Rochester, MN, USA
14
Teferra BG, Rueda A, Pang H, Valenzano R, Samavi R, Krishnan S, Bhat V. Screening for Depression Using Natural Language Processing: Literature Review. Interact J Med Res 2024; 13:e55067. PMID: 39496145; PMCID: PMC11574504; DOI: 10.2196/55067.
Abstract
BACKGROUND Depression is a prevalent global mental health disorder with substantial individual and societal impact. Natural language processing (NLP), a branch of artificial intelligence, offers the potential for improving depression screening by extracting meaningful information from textual data, but there are challenges and ethical considerations. OBJECTIVE This literature review aims to explore existing NLP methods for detecting depression, discuss successes and limitations, address ethical concerns, and highlight potential biases. METHODS A literature search was conducted using Semantic Scholar, PubMed, and Google Scholar to identify studies on depression screening using NLP. Keywords included "depression screening," "depression detection," and "natural language processing." Studies were included if they discussed the application of NLP techniques for depression screening or detection. Studies were screened and selected for relevance, with data extracted and synthesized to identify common themes and gaps in the literature. RESULTS NLP techniques, including sentiment analysis, linguistic markers, and deep learning models, offer practical tools for depression screening. Supervised and unsupervised machine learning models and large language models like transformers have demonstrated high accuracy in a variety of application domains. However, ethical concerns arise related to privacy, bias, interpretability, and the lack of regulations to protect individuals. Furthermore, cultural and multilingual perspectives highlight the need for culturally sensitive models. CONCLUSIONS NLP presents opportunities to enhance depression detection, but considerable challenges persist. Ethical concerns must be addressed, governance guidance is needed to mitigate risks, and cross-cultural perspectives must be integrated. Future directions include improving interpretability and personalization and increasing collaboration with domain experts, such as data scientists and machine learning engineers. NLP's potential to enhance mental health care remains promising, contingent on overcoming these obstacles and on continued innovation.
Affiliation(s)
- Bazen Gashaw Teferra
- Unity Health Toronto, St. Michael's Hospital, Interventional Psychiatry Program, Toronto, ON, Canada
- Alice Rueda
- Unity Health Toronto, St. Michael's Hospital, Interventional Psychiatry Program, Toronto, ON, Canada
- Hilary Pang
- Unity Health Toronto, St. Michael's Hospital, Interventional Psychiatry Program, Toronto, ON, Canada
- Richard Valenzano
- Toronto Metropolitan University, Department of Computer Science, Toronto, ON, Canada
- Reza Samavi
- Toronto Metropolitan University, Department of Electrical, Computer, and Biomedical Engineering, Toronto, ON, Canada
- Sridhar Krishnan
- Toronto Metropolitan University, Department of Electrical, Computer, and Biomedical Engineering, Toronto, ON, Canada
- Venkat Bhat
- Unity Health Toronto, St. Michael's Hospital, Interventional Psychiatry Program, Toronto, ON, Canada
- University of Toronto, Department of Psychiatry, Toronto, ON, Canada
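Among the NLP techniques surveyed in the review above, transformer-based text classifiers are the most directly reusable building block. The sketch below scores short texts with an off-the-shelf sentiment model as a stand-in; it is not a validated depression screener and is shown only to illustrate the shape of such a pipeline.

```python
# Sketch: scoring short texts with a pretrained transformer classifier.
# The sentiment model below is a generic stand-in, NOT a clinical screening tool.
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)

posts = [
    "I haven't enjoyed anything for weeks and I barely sleep.",
    "Had a great walk with friends today, feeling hopeful.",
]

for post, result in zip(posts, classifier(posts)):
    # A real screening system would map validated linguistic markers to risk,
    # with clinician oversight; here we simply print raw sentiment scores.
    print(f"{result['label']:>8s} ({result['score']:.2f})  {post}")
```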
15
Adu M, Banire B, Dockrill M, Ilie A, Lappin E, McGrath P, Munro S, Myers K, Obuobi-Donkor G, Orji R, Pillai Riddell R, Wozney L, Yisa V. Centering equity, diversity, and inclusion in youth digital mental health: findings from a research, policy, and practice knowledge exchange workshop. Front Digit Health 2024; 6:1449129. PMID: 39544986; PMCID: PMC11560888; DOI: 10.3389/fdgth.2024.1449129.
Abstract
Background Youth mental health service organizations continue to rapidly broaden their use of virtual care and digital mental health interventions as well as leverage artificial intelligence and other technologies to inform care decisions. However, many of these digital services have failed to alleviate persistent mental health disparities among equity-seeking populations and in some instances have exacerbated them. Transdisciplinary and intersectional knowledge exchange is greatly needed to address structural barriers to digital mental health engagement, develop and evaluate interventions with historically underserved communities, and ultimately promote more accessible, useful, and equitable care. Methods To that end, the Digital, Inclusive, Virtual, and Equitable Research Training in Mental Health Platform (DIVERT), the Maritime Strategy for Patient Oriented Research (SPOR) SUPPORT (Support for People and Patient-Oriented Research and Trials) Unit and IWK Mental Health Program invited researchers, policymakers, interprofessional mental health practitioners, trainees, computer scientists, health system administrators, community leaders and youth advocates to participate in a knowledge exchange workshop. The workshop aimed to (a) highlight local research and innovation in youth-focused digital mental health services; (b) learn more about current policy and practice issues in inclusive digital mental health for youth in Canada; (c) participate in generating action recommendations to address challenges to inclusive, diverse and equitable digital mental health services; and (d) synthesize cross-sector feedback to inform future training curricula, policy and strategic planning and to stimulate new lines of patient-oriented research. Results Eleven challenge themes emerged related to white-colonial normativity, lack of cultural humility, inaccessibility and affordability of participating in the digital world, lack of youth and community involvement, risks of too much digital time in youth's lives, and lack of scientific evidence derived from equity-deserving communities. Nine action recommendations focused on diversifying research and development funding, policy and standards, youth- and community-led promotion, long-term trust-building and collaboration, and the need to call out and advocate against unsafe digital services and processes. Conclusion Key policy, training and practice implications are discussed.
Affiliation(s)
- Medard Adu
- Department of Psychiatry, Dalhousie University, Halifax, NS, Canada
- Bilikis Banire
- Department of Psychiatry, Dalhousie University, Halifax, NS, Canada
- Department of Computer Science, Dalhousie University, Halifax, NS, Canada
- Mya Dockrill
- Department of Psychology, Dalhousie University, Halifax, NS, Canada
- Alzena Ilie
- Department of Psychology, Dalhousie University, Halifax, NS, Canada
- Patrick McGrath
- Department of Psychiatry, Dalhousie University, Halifax, NS, Canada
- Centre for Research in Family Health, Halifax, NS, Canada
- Samantha Munro
- Department of Psychology, Acadia University, Wolfville, NS, Canada
- Kady Myers
- Mental Health and Addictions, Nova Scotia Health, Halifax, NS, Canada
- Rita Orji
- Department of Computer Science, Dalhousie University, Halifax, NS, Canada
- Lori Wozney
- Department of Psychiatry, Dalhousie University, Halifax, NS, Canada
- Centre for Research in Family Health, Halifax, NS, Canada
- Mental Health and Addictions, IWK Health, Halifax, NS, Canada
- Victor Yisa
- Department of Computer Science, Dalhousie University, Halifax, NS, Canada
16
Das KP, Gavade P. A review on the efficacy of artificial intelligence for managing anxiety disorders. Front Artif Intell 2024; 7:1435895. PMID: 39479229; PMCID: PMC11523650; DOI: 10.3389/frai.2024.1435895.
Abstract
Anxiety disorders are psychiatric conditions characterized by prolonged and generalized anxiety experienced by individuals in response to various events or situations. At present, anxiety disorders are regarded as the most widespread psychiatric disorders globally. Medication and different types of psychotherapies are employed as the primary therapeutic modalities in clinical practice for the treatment of anxiety disorders. However, combining these two approaches is known to yield more significant benefits than medication alone. Nevertheless, there is a lack of resources and a limited availability of psychotherapy options in underdeveloped areas. Psychotherapy methods encompass relaxation techniques, controlled breathing exercises, visualization exercises, controlled exposure exercises, and cognitive interventions such as challenging negative thoughts. These methods are vital in the treatment of anxiety disorders, but executing them proficiently can be demanding. Moreover, individuals with distinct anxiety disorders are prescribed medications that may cause withdrawal symptoms in some instances. Additionally, there is inadequate availability of face-to-face psychotherapy and a restricted capacity to predict and monitor the health, behavioral, and environmental aspects of individuals with anxiety disorders during the initial phases. In recent years, there has been notable progress in developing and utilizing artificial intelligence (AI) based applications and environments to improve the precision and sensitivity of diagnosing and treating various categories of anxiety disorders. As a result, this study aims to establish the efficacy of AI-enabled environments in addressing the existing challenges in managing anxiety disorders and reducing reliance on medication, and to investigate the potential advantages, issues, and opportunities of integrating AI-assisted healthcare for anxiety disorders and enabling personalized therapy.
Affiliation(s)
- K. P. Das
- Department of Computer Science, Christ University, Bengaluru, India
- P. Gavade
- Independent Practitioner, San Francisco, CA, United States
17
Gargari OK, Fatehi F, Mohammadi I, Firouzabadi SR, Shafiee A, Habibi G. Diagnostic accuracy of large language models in psychiatry. Asian J Psychiatr 2024; 100:104168. PMID: 39111087; DOI: 10.1016/j.ajp.2024.104168.
Abstract
INTRODUCTION Medical decision-making is crucial for effective treatment, especially in psychiatry where diagnosis often relies on subjective patient reports and a lack of high-specificity symptoms. Artificial intelligence (AI), particularly Large Language Models (LLMs) like GPT, has emerged as a promising tool to enhance diagnostic accuracy in psychiatry. This comparative study explores the diagnostic capabilities of several AI models, including Aya, GPT-3.5, GPT-4, GPT-3.5 clinical assistant (CA), Nemotron, and Nemotron CA, using clinical cases from the DSM-5. METHODS We curated 20 clinical cases from the DSM-5 Clinical Cases book, covering a wide range of psychiatric diagnoses. Four advanced AI models (GPT-3.5 Turbo, GPT-4, Aya, Nemotron) were tested using prompts to elicit detailed diagnoses and reasoning. The models' performances were evaluated based on accuracy and quality of reasoning, with additional analysis using the Retrieval Augmented Generation (RAG) methodology for models accessing the DSM-5 text. RESULTS The AI models showed varied diagnostic accuracy, with GPT-3.5 and GPT-4 performing notably better than Aya and Nemotron in terms of both accuracy and reasoning quality. While models struggled with specific disorders such as cyclothymic and disruptive mood dysregulation disorders, others excelled, particularly in diagnosing psychotic and bipolar disorders. Statistical analysis highlighted significant differences in accuracy and reasoning, emphasizing the superiority of the GPT models. DISCUSSION The application of AI in psychiatry offers potential improvements in diagnostic accuracy. The superior performance of the GPT models can be attributed to their advanced natural language processing capabilities and extensive training on diverse text data, enabling more effective interpretation of psychiatric language. However, models like Aya and Nemotron showed limitations in reasoning, indicating a need for further refinement in their training and application. CONCLUSION AI holds significant promise for enhancing psychiatric diagnostics, with certain models demonstrating high potential in interpreting complex clinical descriptions accurately. Future research should focus on expanding the dataset and integrating multimodal data to further enhance the diagnostic capabilities of AI in psychiatry.
Collapse
Affiliation(s)
- Omid Kohandel Gargari
- Farzan Artificial Intelligence Team, Farzan Clinical Research Institute, Tehran, Islamic Republic of Iran
| | - Farhad Fatehi
- Centre for Health Services Research, Faculty of Medicine, The University of Queensland, Brisbane, Australia; School of Psychological Sciences, Monash University, Melbourne, Australia
| | - Ida Mohammadi
- Farzan Artificial Intelligence Team, Farzan Clinical Research Institute, Tehran, Islamic Republic of Iran
| | - Shahryar Rajai Firouzabadi
- Farzan Artificial Intelligence Team, Farzan Clinical Research Institute, Tehran, Islamic Republic of Iran
| | - Arman Shafiee
- Farzan Artificial Intelligence Team, Farzan Clinical Research Institute, Tehran, Islamic Republic of Iran
| | - Gholamreza Habibi
- Farzan Artificial Intelligence Team, Farzan Clinical Research Institute, Tehran, Islamic Republic of Iran.
| |
Collapse
|
18
|
Milasan LH. Unveiling the Transformative Potential of AI-Generated Imagery in Enriching Mental Health Research. QUALITATIVE HEALTH RESEARCH 2024:10497323241274767. [PMID: 39299269 DOI: 10.1177/10497323241274767] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 09/22/2024]
Abstract
Visual methods in mental health research have been extensively explored and utilized following the expansion of art therapy. The existing literature shows visual arts as a valuable research method with multi-fold benefits for both researchers and research participants. However, the way contemporary art is understood, conceptualized, and experienced has been challenged by the current digital advancements in our society. Despite heated debates over whether AI may diminish the value of human creativity, AI-generated art is a complex reality that has started to influence the way visual research is conducted. Within this context, researchers employing visual methods need to develop a deeper understanding of this topic. For this purpose, this article explores the concept of AI-generated images with a focus on benefits and limitations when applied to mental health research and potentially other areas of health and social care. As this is an emerging topic, more research on the effectiveness and therapeutic value of AI-generated images is required beyond the current anecdotal evidence, from the perspectives of both researchers and research participants.
Collapse
|
19
|
Alfeir NM. Dimensions of artificial intelligence on family communication. Front Artif Intell 2024; 7:1398960. [PMID: 39324132 PMCID: PMC11422382 DOI: 10.3389/frai.2024.1398960] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/11/2024] [Accepted: 08/23/2024] [Indexed: 09/27/2024] Open
Abstract
Introduction Artificial intelligence (AI) has created a plethora of prospects for communication. The study aims to examine the impacts of AI dimensions on family communication. By investigating the multifaceted effects of AI on family communication, this research aims to provide valuable insights, uncover potential concerns, and offer recommendations for both families and society at large in this digital era. Method A convenience sampling technique was adopted to recruit 300 participants. Results A linear regression model was fitted to examine the impact of AI dimensions, which showed statistically significant effects for accessibility (p = 0.001), personalization (p = 0.001), and language translation (p = 0.016). Discussion The findings showed differences between males and females for accessibility (p = 0.006) and language translation (p = 0.010), but not for personalization (p = 0.126). However, using multiple AI tools was statistically associated with heightened parental concerns about bias and privacy (p = 0.015) and about safety and dependence (p = 0.049). Conclusion The results showed a lack of knowledge and transparency about the data storage and privacy policies of AI-enabled communication systems. Overall, AI dimensions had a positive impact on family communication.
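For readers unfamiliar with this kind of analysis, the sketch below shows how such a linear regression could be fitted in Python with statsmodels. The CSV file name and column names are assumptions chosen for illustration, not the study's actual data or code.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical survey export: one row per participant, scale scores per AI dimension.
df = pd.read_csv("family_communication_survey.csv")

# Regress perceived family communication on the three AI dimensions named in the abstract.
model = smf.ols(
    "family_communication ~ accessibility + personalization + language_translation",
    data=df,
).fit()
print(model.summary())  # coefficients and p-values for each dimension
```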
Collapse
Affiliation(s)
- Nada Mohammed Alfeir
- Department of Communication Skills, King AbdulAziz University, Jeddah, Saudi Arabia
| |
Collapse
|
20
|
Laricheva M, Liu Y, Shi E, Wu A. Scoping review on natural language processing applications in counselling and psychotherapy. Br J Psychol 2024. [PMID: 39095975 DOI: 10.1111/bjop.12721] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/08/2024] [Accepted: 07/03/2024] [Indexed: 08/04/2024]
Abstract
Recent years have witnessed some rapid and tremendous progress in natural language processing (NLP) techniques that are used to analyse text data. This study endeavours to offer an up-to-date review of NLP applications by examining their use in counselling and psychotherapy from 1990 to 2021. The purpose of this scoping review is to identify trends, advancements, challenges and limitations of these applications. Among the 41 papers included in this review, 4 primary study purposes were identified: (1) developing automated coding; (2) predicting outcomes; (3) monitoring counselling sessions; and (4) investigating language patterns. Our findings showed a growing trend in the number of papers utilizing advanced machine learning methods, particularly neural networks. Unfortunately, only a third of the articles addressed the issues of bias and generalizability. Our findings provided a timely systematic update, shedding light on concerns related to bias, generalizability and validity in the context of NLP applications in counselling and psychotherapy.
Collapse
Affiliation(s)
- Maria Laricheva
- Educational and Counselling Psychology, and Special Education, The University of British Columbia, Vancouver, British Columbia, Canada
| | - Yan Liu
- Psychology, Carleton University, Ottawa, Ontario, Canada
| | - Edward Shi
- Arts, Business and Law, Victoria University Melbourne, Melbourne, Victoria, Australia
| | - Amery Wu
- Educational and Counselling Psychology, and Special Education, The University of British Columbia, Vancouver, British Columbia, Canada
| |
Collapse
|
21
|
Lawrence HR, Schneider RA, Rubin SB, Matarić MJ, McDuff DJ, Jones Bell M. The Opportunities and Risks of Large Language Models in Mental Health. JMIR Ment Health 2024; 11:e59479. [PMID: 39105570 PMCID: PMC11301767 DOI: 10.2196/59479] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 04/13/2024] [Revised: 05/31/2024] [Accepted: 06/01/2024] [Indexed: 08/07/2024] Open
Abstract
Unlabelled Global rates of mental health concerns are rising, and there is increasing realization that existing models of mental health care will not adequately expand to meet the demand. With the emergence of large language models (LLMs) has come great optimism regarding their promise to create novel, large-scale solutions to support mental health. Despite their nascence, LLMs have already been applied to mental health-related tasks. In this paper, we summarize the extant literature on efforts to use LLMs to provide mental health education, assessment, and intervention and highlight key opportunities for positive impact in each area. We then highlight risks associated with LLMs' application to mental health and encourage the adoption of strategies to mitigate these risks. The urgent need for mental health support must be balanced with responsible development, testing, and deployment of mental health LLMs. It is especially critical to ensure that mental health LLMs are fine-tuned for mental health, enhance mental health equity, and adhere to ethical standards and that people, including those with lived experience with mental health concerns, are involved in all stages from development through deployment. Prioritizing these efforts will minimize potential harms to mental health and maximize the likelihood that LLMs will positively impact mental health globally.
Collapse
Affiliation(s)
| | | | | | - Maja J Matarić
- Google LLC, Mountain View, CA, 90291, United States
| | - Daniel J McDuff
- Google LLC, Mountain View, CA, 90291, United States
| | - Megan Jones Bell
- Google LLC, Mountain View, CA, 90291, United States
| |
Collapse
|
22
|
Isleem UN, Zaidat B, Ren R, Geng EA, Burapachaisri A, Tang JE, Kim JS, Cho SK. Can generative artificial intelligence pass the orthopaedic board examination? J Orthop 2024; 53:27-33. [PMID: 38450060 PMCID: PMC10912220 DOI: 10.1016/j.jor.2023.10.026] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 06/18/2023] [Revised: 10/24/2023] [Accepted: 10/26/2023] [Indexed: 03/08/2024] Open
Abstract
Background Resident training programs in the US use the Orthopaedic In-Training Examination (OITE) developed by the American Academy of Orthopaedic Surgeons (AAOS) to assess the current knowledge of their residents and to identify the residents at risk of failing the American Board of Orthopaedic Surgery (ABOS) examination. Optimal strategies for OITE preparation are constantly being explored. There may be a role for Large Language Models (LLMs) in orthopaedic resident education. ChatGPT, an LLM launched in late 2022, has demonstrated the ability to produce accurate, detailed answers, potentially enabling it to aid in medical education and clinical decision-making. The purpose of this study is to evaluate the performance of ChatGPT on Orthopaedic In-Training Examinations using Self-Assessment Exams from the AAOS database and approved literature as a proxy for the Orthopaedic Board Examination. Methods 301 SAE questions from the AAOS database and associated AAOS literature were input into ChatGPT's interface in a question and multiple-choice format, and the answers were then analyzed to determine which answer choice was selected. A new chat was used for every question. All answers were recorded, categorized, and compared to the answer given by the OITE and SAE exams, noting whether the answer was right or wrong. Results Of the 301 questions asked, ChatGPT was able to correctly answer 183 (60.8%) of them. The subjects with the highest percentage of correct questions were basic science (81%), oncology (72.7%), shoulder and elbow (71.9%), and sports (71.4%). The questions were further subdivided into 3 groups: those about management, diagnosis, or knowledge recall. There were 86 management questions and 47 were correct (54.7%), 45 diagnosis questions with 32 correct (71.7%), and 168 knowledge recall questions with 102 correct (60.7%). Conclusions ChatGPT has the potential to provide orthopedic educators and trainees with accurate clinical conclusions for the majority of board-style questions, although its reasoning should be carefully analyzed for accuracy and clinical validity. As such, its usefulness in a clinical educational context is currently limited but rapidly evolving. Clinical relevance ChatGPT can access a multitude of medical data and may help provide accurate answers to clinical questions.
Collapse
Affiliation(s)
- Ula N. Isleem
- Department of Orthopaedic Surgery, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Bashar Zaidat
- Department of Orthopaedic Surgery, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Renee Ren
- Department of Orthopaedic Surgery, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Eric A. Geng
- Department of Orthopaedic Surgery, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Aonnicha Burapachaisri
- Department of Orthopaedic Surgery, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Justin E. Tang
- Department of Orthopaedic Surgery, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Jun S. Kim
- Department of Orthopaedic Surgery, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Samuel K. Cho
- Department of Orthopaedic Surgery, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| |
Collapse
|
23
|
Hornstein S, Scharfenberger J, Lueken U, Wundrack R, Hilbert K. Predicting recurrent chat contact in a psychological intervention for the youth using natural language processing. NPJ Digit Med 2024; 7:132. [PMID: 38762694 PMCID: PMC11102489 DOI: 10.1038/s41746-024-01121-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/03/2023] [Accepted: 04/23/2024] [Indexed: 05/20/2024] Open
Abstract
Chat-based counseling hotlines emerged as a promising low-threshold intervention for youth mental health. However, despite the resulting availability of large text corpora, little work has investigated Natural Language Processing (NLP) applications within this setting. Therefore, this preregistered approach (OSF: XA4PN) utilizes a sample of approximately 19,000 children and young adults who received a chat consultation from a 24/7 crisis service in Germany. Around 800,000 messages were used to predict whether chatters would contact the service again, as this would allow the provision of or redirection to additional treatment. We trained an XGBoost Classifier on the words of the anonymized conversations, using repeated cross-validation and Bayesian optimization for hyperparameter search. The best model was able to achieve an AUROC score of 0.68 (p < 0.01) on the previously unseen 3942 newest consultations. A Shapley-value-based explainability approach revealed that words indicating younger age or female gender and terms related to self-harm and suicidal thoughts were associated with a higher chance of recontacting. We conclude that NLP-based predictions of recurrent contact are a promising path toward personalized care at chat hotlines.
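A minimal sketch of this general recipe (word features, a gradient-boosted classifier, cross-validated AUROC) is shown below, assuming the anonymised conversations and recontact labels are already loaded. Variable names are placeholders, and the study's repeated cross-validation and Bayesian hyperparameter search are not reproduced.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from xgboost import XGBClassifier

# Placeholders: `conversations` holds one concatenated chat transcript per chatter,
# `recontacted` is a 0/1 array marking whether that chatter contacted the service again.
model = make_pipeline(
    TfidfVectorizer(max_features=5000),  # word features from the message text
    XGBClassifier(n_estimators=300, learning_rate=0.05, eval_metric="logloss"),
)

# Putting the vectorizer inside the pipeline keeps feature fitting within each CV fold.
scores = cross_val_score(model, conversations, recontacted, cv=5, scoring="roc_auc")
print(f"Mean AUROC: {np.mean(scores):.2f}")
```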
Collapse
Affiliation(s)
- Silvan Hornstein
- Department of Psychology, Humboldt-Universität zu Berlin, 10099 Berlin, Germany.
| | | | - Ulrike Lueken
- Department of Psychology, Humboldt-Universität zu Berlin, 10099 Berlin, Germany
- German Center for Mental Health (DZPG), partner site Berlin/Potsdam, Potsdam, Germany
| | - Richard Wundrack
- Department of Psychology, Humboldt-Universität zu Berlin, 10099 Berlin, Germany
| | - Kevin Hilbert
- Department of Psychology, Humboldt-Universität zu Berlin, 10099 Berlin, Germany
| |
Collapse
|
24
|
Wimbarti S, Kairupan BHR, Tallei TE. Critical review of self-diagnosis of mental health conditions using artificial intelligence. Int J Ment Health Nurs 2024; 33:344-358. [PMID: 38345132 DOI: 10.1111/inm.13303] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 09/20/2023] [Revised: 01/26/2024] [Accepted: 01/30/2024] [Indexed: 03/10/2024]
Abstract
The advent of artificial intelligence (AI) has revolutionised various aspects of our lives, including mental health nursing. AI-driven tools and applications have provided a convenient and accessible means for individuals to assess their mental well-being within the confines of their homes. Nonetheless, the widespread trend of self-diagnosing mental health conditions through AI poses considerable risks. This review article examines the perils associated with relying on AI for self-diagnosis in mental health, highlighting the constraints and possible adverse outcomes that can arise from such practices. It delves into the ethical, psychological, and social implications, underscoring the vital role of mental health professionals, including psychologists, psychiatrists, and nursing specialists, in providing professional assistance and guidance. This article aims to highlight the importance of seeking professional assistance and guidance in addressing mental health concerns, especially in the era of AI-driven self-diagnosis.
Collapse
Affiliation(s)
- Supra Wimbarti
- Faculty of Psychology, Universitas Gadjah Mada, Yogyakarta, Indonesia
| | - B H Ralph Kairupan
- Department of Psychiatry, Faculty of Medicine, Sam Ratulangi University, Manado, North Sulawesi, Indonesia
| | - Trina Ekawati Tallei
- Department of Biology, Faculty of Mathematics and Natural Sciences, Sam Ratulangi University, Manado, North Sulawesi, Indonesia
- Department of Biology, Faculty of Medicine, Sam Ratulangi University, Manado, North Sulawesi, Indonesia
| |
Collapse
|
25
|
Thakkar A, Gupta A, De Sousa A. Artificial intelligence in positive mental health: a narrative review. Front Digit Health 2024; 6:1280235. [PMID: 38562663 PMCID: PMC10982476 DOI: 10.3389/fdgth.2024.1280235] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/25/2023] [Accepted: 02/26/2024] [Indexed: 04/04/2024] Open
Abstract
The paper reviews the entire spectrum of Artificial Intelligence (AI) in mental health and its positive role in mental health care. AI holds many promises for mental health care, and this paper examines multiple facets of its application. The paper first defines AI and its scope in the area of mental health. It then looks at various facets of AI, such as machine learning (both supervised and unsupervised). The role of AI in various psychiatric disorders like neurodegenerative disorders, intellectual disability and seizures is discussed, along with the role of AI in awareness, diagnosis and intervention in mental health disorders. The role of AI in positive emotional regulation and its impact in schizophrenia, autism spectrum disorders and mood disorders is also highlighted. The article also discusses the limitations of AI-based approaches and the need for AI-based approaches in mental health to be culturally aware, with structured flexible algorithms and an awareness of the biases that can arise in AI. The ethical issues that may arise with the use of AI in mental health are also examined.
Collapse
|
26
|
Zafar F, Fakhare Alam L, Vivas RR, Wang J, Whei SJ, Mehmood S, Sadeghzadegan A, Lakkimsetti M, Nazir Z. The Role of Artificial Intelligence in Identifying Depression and Anxiety: A Comprehensive Literature Review. Cureus 2024; 16:e56472. [PMID: 38638735 PMCID: PMC11025697 DOI: 10.7759/cureus.56472] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 03/18/2024] [Indexed: 04/20/2024] Open
Abstract
This narrative literature review undertakes a comprehensive examination of the burgeoning field, tracing the development of artificial intelligence (AI)-powered tools for depression and anxiety detection from the level of intricate algorithms to practical applications. Delivering essential mental health care services is now a significant public health priority. In recent years, AI has become a game-changer in the early identification of and intervention for these pervasive mental health disorders. AI tools can potentially empower behavioral healthcare services by helping psychiatrists collect objective data on patients' progress and tasks. This study emphasizes the current understanding of AI, the different types of AI, its current use in multiple mental health disorders, advantages, disadvantages, and future potentials. As technology develops and the digitalization of the modern era increases, there will be a rise in the application of artificial intelligence in psychiatry; therefore, a comprehensive understanding will be needed. We searched PubMed, Google Scholar, and ScienceDirect using relevant keywords for this review. In a recent review of studies using electronic health records (EHR) with AI and machine learning techniques for diagnosing all clinical conditions, roughly 99 publications were found. Out of these, 35 studies were identified for mental health disorders in all age groups, and among them, six studies utilized EHR data sources. By critically analyzing prominent scholarly works, we aim to illuminate the current state of this technology, exploring its successes, limitations, and future directions. In doing so, we hope to contribute to a nuanced understanding of AI's potential to revolutionize mental health diagnostics and pave the way for further research and development in this critically important domain.
Collapse
Affiliation(s)
- Fabeha Zafar
- Internal Medicine, Dow University of Health Sciences (DUHS), Karachi, PAK
| | | | - Rafael R Vivas
- Nutrition, Food and Exercise Sciences, Florida State University College of Human Sciences, Tallahassee, USA
| | - Jada Wang
- Medicine, St. George's University, Brooklyn, USA
| | - See Jia Whei
- Internal Medicine, Sriwijaya University, Palembang, IDN
| | | | | | | | - Zahra Nazir
- Internal Medicine, Combined Military Hospital, Quetta, Quetta, PAK
| |
Collapse
|
27
|
Tayebi Arasteh S, Han T, Lotfinia M, Kuhl C, Kather JN, Truhn D, Nebelung S. Large language models streamline automated machine learning for clinical studies. Nat Commun 2024; 15:1603. [PMID: 38383555 PMCID: PMC10881983 DOI: 10.1038/s41467-024-45879-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/10/2023] [Accepted: 02/06/2024] [Indexed: 02/23/2024] Open
Abstract
A knowledge gap persists between machine learning (ML) developers (e.g., data scientists) and practitioners (e.g., clinicians), hampering the full utilization of ML for clinical data analysis. We investigated the potential of the ChatGPT Advanced Data Analysis (ADA), an extension of GPT-4, to bridge this gap and perform ML analyses efficiently. Real-world clinical datasets and study details from large trials across various medical specialties were presented to ChatGPT ADA without specific guidance. ChatGPT ADA autonomously developed state-of-the-art ML models based on the original study's training data to predict clinical outcomes such as cancer development, cancer progression, disease complications, or biomarkers such as pathogenic gene sequences. Following the re-implementation and optimization of the published models, the head-to-head comparison of the ChatGPT ADA-crafted ML models and their respective manually crafted counterparts revealed no significant differences in traditional performance metrics (p ≥ 0.072). Strikingly, the ChatGPT ADA-crafted ML models often outperformed their counterparts. In conclusion, ChatGPT ADA offers a promising avenue to democratize ML in medicine by simplifying complex data analyses, yet should enhance, not replace, specialized training and resources, to promote broader applications in medical research and practice.
Collapse
Affiliation(s)
- Soroosh Tayebi Arasteh
- Department of Diagnostic and Interventional Radiology, University Hospital RWTH Aachen, Aachen, Germany.
| | - Tianyu Han
- Department of Diagnostic and Interventional Radiology, University Hospital RWTH Aachen, Aachen, Germany.
| | - Mahshad Lotfinia
- Department of Diagnostic and Interventional Radiology, University Hospital RWTH Aachen, Aachen, Germany
- Institute of Heat and Mass Transfer, RWTH Aachen University, Aachen, Germany
| | - Christiane Kuhl
- Department of Diagnostic and Interventional Radiology, University Hospital RWTH Aachen, Aachen, Germany
| | - Jakob Nikolas Kather
- Else Kroener Fresenius Center for Digital Health, Medical Faculty Carl Gustav Carus, Technical University Dresden, Dresden, Germany
- Medical Oncology, National Center for Tumor Diseases (NCT), University Hospital Heidelberg, Heidelberg, Germany
| | - Daniel Truhn
- Department of Diagnostic and Interventional Radiology, University Hospital RWTH Aachen, Aachen, Germany
| | - Sven Nebelung
- Department of Diagnostic and Interventional Radiology, University Hospital RWTH Aachen, Aachen, Germany
| |
Collapse
|
28
|
Flores L, Kim S, Young SD. Addressing bias in artificial intelligence for public health surveillance. JOURNAL OF MEDICAL ETHICS 2024; 50:190-194. [PMID: 37130756 DOI: 10.1136/jme-2022-108875] [Citation(s) in RCA: 12] [Impact Index Per Article: 12.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 12/23/2022] [Accepted: 04/20/2023] [Indexed: 05/04/2023]
Abstract
Components of artificial intelligence (AI) for analysing social big data, such as natural language processing (NLP) algorithms, have improved the timeliness and robustness of health data. NLP techniques have been implemented to analyse large volumes of text from social media platforms to gain insights on disease symptoms, understand barriers to care and predict disease outbreaks. However, AI-based decisions may contain biases that could misrepresent populations, skew results or lead to errors. Bias, within the scope of this paper, is described as the difference between the predictive values and true values within the modelling of an algorithm. Bias within algorithms may lead to inaccurate healthcare outcomes and exacerbate health disparities when results derived from these biased algorithms are applied to health interventions. Researchers who implement these algorithms must consider when and how bias may arise. This paper explores algorithmic biases as a result of data collection, labelling and modelling of NLP algorithms. Researchers have a role in ensuring that efforts towards combating bias are enforced, especially when drawing health conclusions derived from social media posts that are linguistically diverse. Through the implementation of open collaboration, auditing processes and the development of guidelines, researchers may be able to reduce bias and improve NLP algorithms that improve health surveillance.
Collapse
Affiliation(s)
- Lidia Flores
- Department of Informatics, University of California Irvine, Irvine, California, USA
| | - Seungjun Kim
- Department of Informatics, University of California Irvine, Irvine, California, USA
| | - Sean D Young
- Department of Informatics, University of California Irvine, Irvine, California, USA
- Department of Emergency Medicine, School of Medicine, University of California, Irvine, Irvine, CA, USA
| |
Collapse
|
29
|
Rashid L, Möckel C, Bohn S. The blessing and curse of "no strings attached": An automated literature analysis of psychological health and non-attachmental work in the digitalization era. PLoS One 2024; 19:e0298040. [PMID: 38329979 PMCID: PMC10852238 DOI: 10.1371/journal.pone.0298040] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/04/2023] [Accepted: 01/16/2024] [Indexed: 02/10/2024] Open
Abstract
Amidst tremendous changes in the worlds of work in light of digitalization, non-attachmental work designs, where individuals gain income without being bound by a fixed administrative attachment to an employer, hold promises of self-actualization along with threats of insecurity. Today's technology boom and the consequent flexibility and uncertainty it brings into workers' lives may translate into inspiring growth opportunities or overloading pressure, contingent upon mental health and wellbeing impacts. This paper first provides a conceptualization of the non-attachmental work designs of the 21st century, before proceeding to an extensive mapping of literature at their intersection with psychological health. This involves a machine-learning-driven review of 1094 scientific articles using topic modeling, combined with in-depth manual content analyses and inductive-deductive cycles of pattern discovery and category building. The resulting scholarly blueprint reveals several tendencies, including a prevalence of positive psychology concepts in research on work designs with high levels of autonomy and control, contrasted with narratives of disempowerment in service- and task-based work. We note that some psychological health issues are researched with respect to specific work designs but not others, for instance neurodiversity and the role of gender in ownership-based work, self-image and digital addiction in content-based work, and ratings-induced anxiety in platform-mediated task-based work. We also find a heavy representation of 'heroic' entrepreneurs, quantitative methods, and western contexts in addition to a surprising dearth of analyses on the roles of policy and technological interventions. The results are positioned to guide academics, decision-makers, technologists, and workers in the pursuit of healthier work designs for a more sustainable future.
Collapse
Affiliation(s)
- Lubna Rashid
- Chair of Entrepreneurship & Innovation Management (H76), Technische Universität Berlin, Berlin, Germany
| | | | - Stephan Bohn
- Humboldt Institute for Internet and Society, Berlin, Germany
- Department of Management, Freie Universität Berlin, Berlin, Germany
| |
Collapse
|
30
|
Moore R, Al-Tamimi AK, Freeman E. Investigating the Potential of a Conversational Agent (Phyllis) to Support Adolescent Health and Overcome Barriers to Physical Activity: Co-Design Study. JMIR Form Res 2024; 8:e51571. [PMID: 38294857 PMCID: PMC10867744 DOI: 10.2196/51571] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/03/2023] [Revised: 11/08/2023] [Accepted: 11/22/2023] [Indexed: 02/01/2024] Open
Abstract
BACKGROUND Conversational agents (CAs) are a promising solution to support people in improving physical activity (PA) behaviors. However, there is a lack of CAs targeted at adolescents that aim to provide support to overcome barriers to PA. This study reports the results of the co-design, development, and evaluation of a prototype CA called "Phyllis" to support adolescents in overcoming barriers to PA with the aim of improving PA behaviors. The study presents one of the first theory-driven CAs that use existing research, a theoretical framework, and a behavior change model. OBJECTIVE The aim of the study is to use a mixed methods approach to investigate the potential of a CA to support adolescents in overcoming barriers to PA and enhance their confidence and motivation to engage in PA. METHODS The methodology involved co-designing with 8 adolescents to create a relational and persuasive CA with a suitable persona and dialogue. The CA was evaluated to determine its acceptability, usability, and effectiveness, with 46 adolescents participating in the study via a web-based survey. RESULTS The co-design participants were students aged 11 to 13 years, with a sex distribution of 56% (5/9) female and 44% (4/9) male, representing diverse ethnic backgrounds. Participants reported 37 specific barriers to PA, and the most common barriers included a "lack of confidence," "fear of failure," and a "lack of motivation." The CA's persona, named "Phyllis," was co-designed with input from the students, reflecting their preferences for a friendly, understanding, and intelligent personality. Users engaged in 61 conversations with Phyllis and reported a positive user experience, and 73% of them expressed a definite intention to use the fully functional CA in the future, with a net promoter score indicating a high likelihood of recommendation. Phyllis also performed well, being able to recognize a range of different barriers to PA. The CA's persuasive capacity was evaluated in modules focusing on confidence and motivation, with a significant increase in students' agreement in feeling confident and motivated to engage in PA after interacting with Phyllis. Adolescents also expect to have a personalized experience and be able to personalize all aspects of the CA. CONCLUSIONS The results showed high acceptability and a positive user experience, indicating the CA's potential. Promising outcomes were observed, with increasing confidence and motivation for PA. Further research and development are needed to create further interventions to address other barriers to PA and assess long-term behavior change. Addressing concerns regarding bias and privacy is crucial for achieving acceptability in the future. The CA's potential extends to health care systems and multimodal support, providing valuable insights for designing digital health interventions including tackling global inactivity issues among adolescents.
Collapse
Affiliation(s)
- Richard Moore
- Sheffield Hallam University, Sport and Physical Activity Research Centre / Advanced Wellbeing Research Centre, Sheffield, United Kingdom
| | | | - Elizabeth Freeman
- Department of Psychology, Sociology & Politics, Sheffield Hallam University, Sheffield, United Kingdom
| |
Collapse
|
31
|
Rao KN, Arora RD, Dange P, Nagarkar NM. NLP AI Models for Optimizing Medical Research: Demystifying the Concerns. Indian J Surg Oncol 2023; 14:854-858. [PMID: 38187847 PMCID: PMC10767031 DOI: 10.1007/s13193-023-01791-z] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/11/2023] [Accepted: 06/24/2023] [Indexed: 01/09/2024] Open
Abstract
Natural language processing (NLP) AI models have gained popularity in research; however, ethical considerations are necessary to avoid potential negative consequences. This paper identifies and explores the key areas of ethical concern for researchers using NLP AI models, such as bias in training data and algorithms, plagiarism, data privacy, accuracy of generated content, prompt and content generation, and training data quality. To mitigate bias, researchers should use diverse training data and regularly evaluate models for potential biases. Proper attribution and privacy protection are essential when using AI-generated content, while accuracy should be regularly tested and evaluated. Specific and appropriate prompts, algorithms, and techniques should be used for content generation, and training data quality should be high, diverse, and updated regularly. Finally, appropriate authorship credit and avoidance of conflicts of interest must be ensured. Adherence to ethical standards, such as those outlined by ICMJE, is crucial. These ethical considerations are vital for ensuring the quality and integrity of NLP AI model research and avoiding negative consequences.
Collapse
Affiliation(s)
- Karthik Nagaraja Rao
- Department of Head and Neck Oncology, All India Institute of Medical Sciences, Raipur, Chhattisgarh 492099 India
| | - Ripu Daman Arora
- Department of Otolaryngology and Head Neck Surgery, All India Institute of Medical Sciences, Raipur, India
| | - Prajwal Dange
- Department of Head and Neck Oncology, All India Institute of Medical Sciences, Raipur, Chhattisgarh 492099 India
| | | |
Collapse
|
32
|
Allareddy V, Oubaidin M, Rampa S, Venugopalan SR, Elnagar MH, Yadav S, Lee MK. Call for algorithmic fairness to mitigate amplification of racial biases in artificial intelligence models used in orthodontics and craniofacial health. Orthod Craniofac Res 2023; 26 Suppl 1:124-130. [PMID: 37846615 DOI: 10.1111/ocr.12721] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 10/09/2023] [Indexed: 10/18/2023]
Abstract
Machine Learning (ML), a subfield of Artificial Intelligence (AI), is being increasingly used in Orthodontics and craniofacial health for predicting clinical outcomes. Current ML/AI models are prone to accentuating racial disparities. The objective of this narrative review is to provide an overview of how AI/ML models perpetuate racial biases and how we can mitigate this situation. A narrative review of articles published in the medical literature on racial biases and the use of AI/ML models was undertaken. Current AI/ML models are built on homogenous clinical datasets that have a gross underrepresentation of historically disadvantaged demographic groups, especially ethno-racial minorities. The consequence of such AI/ML models is that they perform poorly when deployed on ethno-racial minorities, thus further amplifying racial biases. Healthcare providers, policymakers, AI developers and all stakeholders should pay close attention to the various steps in the pipeline of building AI/ML models, and every effort must be made to establish algorithmic fairness to redress inequities.
Collapse
Affiliation(s)
- Veerasathpurush Allareddy
- Department of Orthodontics, University of Illinois Chicago College of Dentistry, Chicago, Illinois, USA
| | - Maysaa Oubaidin
- Department of Orthodontics, University of Illinois Chicago College of Dentistry, Chicago, Illinois, USA
| | - Sankeerth Rampa
- Health Care Administration Program, School of Business, Rhode Island College, Providence, Rhode Island, USA
| | | | - Mohammed H Elnagar
- Department of Orthodontics, University of Illinois Chicago College of Dentistry, Chicago, Illinois, USA
| | - Sumit Yadav
- Department of Orthodontics, University of Nebraska Medical Center, Lincoln, Nebraska, USA
| | - Min Kyeong Lee
- Department of Orthodontics, University of Illinois Chicago College of Dentistry, Chicago, Illinois, USA
| |
Collapse
|
33
|
Peng C, Yang X, Chen A, Smith KE, PourNejatian N, Costa AB, Martin C, Flores MG, Zhang Y, Magoc T, Lipori G, Mitchell DA, Ospina NS, Ahmed MM, Hogan WR, Shenkman EA, Guo Y, Bian J, Wu Y. A study of generative large language model for medical research and healthcare. NPJ Digit Med 2023; 6:210. [PMID: 37973919 PMCID: PMC10654385 DOI: 10.1038/s41746-023-00958-w] [Citation(s) in RCA: 86] [Impact Index Per Article: 43.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/05/2023] [Accepted: 11/01/2023] [Indexed: 11/19/2023] Open
Abstract
There is enormous enthusiasm, as well as concern, about applying large language models (LLMs) to healthcare. Yet current assumptions are based on general-purpose LLMs such as ChatGPT, which are not developed for medical use. This study develops a generative clinical LLM, GatorTronGPT, using 277 billion words of text including (1) 82 billion words of clinical text from 126 clinical departments and approximately 2 million patients at the University of Florida Health and (2) 195 billion words of diverse general English text. We train GatorTronGPT using a GPT-3 architecture with up to 20 billion parameters and evaluate its utility for biomedical natural language processing (NLP) and healthcare text generation. GatorTronGPT improves biomedical natural language processing. We apply GatorTronGPT to generate 20 billion words of synthetic text. Synthetic NLP models trained using synthetic text generated by GatorTronGPT outperform models trained using real-world clinical text. Physicians' Turing test using a 1 (worst) to 9 (best) scale shows that there are no significant differences in linguistic readability (p = 0.22; 6.57 of GatorTronGPT compared with 6.93 of human) and clinical relevance (p = 0.91; 7.0 of GatorTronGPT compared with 6.97 of human) and that physicians cannot differentiate them (p < 0.001). This study provides insights into the opportunities and challenges of LLMs for medical research and healthcare.
Collapse
Affiliation(s)
- Cheng Peng
- Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL, USA
| | - Xi Yang
- Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL, USA
- Cancer Informatics Shared Resource, University of Florida Health Cancer Center, Gainesville, FL, USA
| | - Aokun Chen
- Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL, USA
- Cancer Informatics Shared Resource, University of Florida Health Cancer Center, Gainesville, FL, USA
| | | | | | | | | | | | - Ying Zhang
- Research Computing, University of Florida, Gainesville, FL, USA
| | - Tanja Magoc
- Integrated Data Repository Research Services, University of Florida, Gainesville, FL, USA
| | - Gloria Lipori
- Integrated Data Repository Research Services, University of Florida, Gainesville, FL, USA
- Lillian S. Wells Department of Neurosurgery, Clinical and Translational Science Institute, University of Florida, Gainesville, FL, USA
| | - Duane A Mitchell
- Lillian S. Wells Department of Neurosurgery, Clinical and Translational Science Institute, University of Florida, Gainesville, FL, USA
| | - Naykky S Ospina
- Division of Endocrinology, Department of Medicine, College of Medicine, University of Florida, Gainesville, FL, USA
| | - Mustafa M Ahmed
- Division of Cardiovascular Medicine, Department of Medicine, College of Medicine, University of Florida, Gainesville, FL, USA
| | - William R Hogan
- Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL, USA
| | - Elizabeth A Shenkman
- Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL, USA
| | - Yi Guo
- Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL, USA
- Cancer Informatics Shared Resource, University of Florida Health Cancer Center, Gainesville, FL, USA
| | - Jiang Bian
- Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL, USA
- Cancer Informatics Shared Resource, University of Florida Health Cancer Center, Gainesville, FL, USA
| | - Yonghui Wu
- Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL, USA.
- Cancer Informatics Shared Resource, University of Florida Health Cancer Center, Gainesville, FL, USA.
| |
Collapse
|
34
|
Wang X, Li J. Using artificial intelligence in medical research: Challenge or opportunity? Asian J Surg 2023; 46:4811. [PMID: 37268462 DOI: 10.1016/j.asjsur.2023.05.102] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/10/2023] [Accepted: 05/19/2023] [Indexed: 06/04/2023] Open
Affiliation(s)
- Xingru Wang
- Department of Hepatobiliary Surgery, Qujing Second People's Hospital of Yunnan Province, Qujing, China
| | - Jianwei Li
- Department of Hepatobiliary Surgery, Southwest Hospital, Army Medical University, Chongqing, China.
| |
Collapse
|
35
|
Ho A, Perry J. What We Owe Those Who Chat Woe: A Relational Lens for Mental Health Apps. THE AMERICAN JOURNAL OF BIOETHICS : AJOB 2023; 23:77-80. [PMID: 37812122 DOI: 10.1080/15265161.2023.2250306] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 10/10/2023]
Affiliation(s)
- Anita Ho
- University of British Columbia
- University of California, San Francisco
- CommonSpirit Health
| | | |
Collapse
|
36
|
Timmons AC, Duong JB, Fiallo NS, Lee T, Vo HPQ, Ahle MW, Comer JS, Brewer LC, Frazier SL, Chaspari T. A Call to Action on Assessing and Mitigating Bias in Artificial Intelligence Applications for Mental Health. PERSPECTIVES ON PSYCHOLOGICAL SCIENCE 2023; 18:1062-1096. [PMID: 36490369 PMCID: PMC10250563 DOI: 10.1177/17456916221134490] [Citation(s) in RCA: 36] [Impact Index Per Article: 18.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/14/2022]
Abstract
Advances in computer science and data-analytic methods are driving a new era in mental health research and application. Artificial intelligence (AI) technologies hold the potential to enhance the assessment, diagnosis, and treatment of people experiencing mental health problems and to increase the reach and impact of mental health care. However, AI applications will not mitigate mental health disparities if they are built from historical data that reflect underlying social biases and inequities. AI models biased against sensitive classes could reinforce and even perpetuate existing inequities if these models create legacies that differentially impact who is diagnosed and treated, and how effectively. The current article reviews the health-equity implications of applying AI to mental health problems, outlines state-of-the-art methods for assessing and mitigating algorithmic bias, and presents a call to action to guide the development of fair-aware AI in psychological science.
Collapse
Affiliation(s)
- Adela C. Timmons
- University of Texas at Austin Institute for Mental Health Research
- Colliga Apps Corporation
| | | | | | | | | | | | | | - LaPrincess C. Brewer
- Department of Cardiovascular Medicine, Mayo Clinic College of Medicine, Rochester, Minnesota, United States
- Center for Health Equity and Community Engagement Research, Mayo Clinic, Rochester, Minnesota, United States
| | | | | |
Collapse
|
37
|
Levis M, Levy J, Dufort V, Russ CJ, Shiner B. Dynamic suicide topic modelling: Deriving population-specific, psychosocial and time-sensitive suicide risk variables from Electronic Health Record psychotherapy notes. Clin Psychol Psychother 2023; 30:795-810. [PMID: 36797651 PMCID: PMC11172400 DOI: 10.1002/cpp.2842] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/13/2023] [Accepted: 02/14/2023] [Indexed: 02/18/2023]
Abstract
In the machine learning subfield of natural language processing, a topic model is a type of unsupervised method that is used to uncover abstract topics within a corpus of text. Dynamic topic modelling (DTM) is used for capturing change in these topics over time. The study deploys DTM on a corpus of electronic health record psychotherapy notes. This retrospective study examines whether DTM helps distinguish closely matched patients who did and did not die by suicide. The cohort consists of United States Department of Veterans Affairs (VA) patients diagnosed with Posttraumatic Stress Disorder (PTSD) between 2004 and 2013. Each case (those who died by suicide during the year following diagnosis) was matched with five controls (those who remained alive) that shared psychotherapists and had similar suicide risk based on VA's suicide prediction algorithm. The cohort was restricted to patients who received psychotherapy for 9+ months after initial PTSD diagnoses (cases = 77; controls = 362). For cases, psychotherapy notes from diagnosis until death were examined. For controls, psychotherapy notes from diagnosis until the matched case's death date were examined. A Python-based DTM algorithm was utilized. Derived topics identified population-specific themes, including PTSD, psychotherapy, medication, communication and relationships. Control topics changed significantly more over time than case topics. Topic differences highlighted engagement, expressivity and therapeutic alliance. This study strengthens the groundwork for deriving population-specific, psychosocial and time-sensitive suicide risk variables.
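As a rough sketch of what dynamic topic modelling involves in practice, gensim's LdaSeqModel fits topics that evolve across ordered time slices. This is not the paper's implementation: the token lists, slice sizes, and topic count below are assumptions for illustration.

```python
from gensim.corpora import Dictionary
from gensim.models import LdaSeqModel

# Placeholder: `notes` is a chronologically ordered list of tokenised psychotherapy notes,
# e.g. [["patient", "reports", "nightmares"], ...].
dictionary = Dictionary(notes)
corpus = [dictionary.doc2bow(doc) for doc in notes]

# time_slice gives how many documents fall in each consecutive period; the counts here
# are illustrative and must sum to len(corpus).
dtm = LdaSeqModel(corpus=corpus, id2word=dictionary,
                  time_slice=[120, 150, 130], num_topics=5)

print(dtm.print_topics(time=0))  # topic-word distributions in the earliest period
print(dtm.print_topics(time=2))  # the same topics as estimated in the latest period
```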
Collapse
Affiliation(s)
- Maxwell Levis
- White River Junction VA Medical Center, Hartford, Vermont, USA
- Geisel School of Medicine at Dartmouth, Hanover, New Hampshire, USA
| | - Joshua Levy
- Geisel School of Medicine at Dartmouth, Hanover, New Hampshire, USA
| | - Vincent Dufort
- White River Junction VA Medical Center, Hartford, Vermont, USA
| | - Carey J. Russ
- White River Junction VA Medical Center, Hartford, Vermont, USA
- Geisel School of Medicine at Dartmouth, Hanover, New Hampshire, USA
| | - Brian Shiner
- White River Junction VA Medical Center, Hartford, Vermont, USA
- Geisel School of Medicine at Dartmouth, Hanover, New Hampshire, USA
- National Center for PTSD Executive Division, Hartford, Vermont, USA
| |
Collapse
|
38
|
Wright-Berryman J, Cohen J, Haq A, Black DP, Pease JL. Virtually screening adults for depression, anxiety, and suicide risk using machine learning and language from an open-ended interview. Front Psychiatry 2023; 14:1143175. [PMID: 37377466 PMCID: PMC10291825 DOI: 10.3389/fpsyt.2023.1143175] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 01/12/2023] [Accepted: 05/22/2023] [Indexed: 06/29/2023] Open
Abstract
Background Current depression, anxiety, and suicide screening techniques rely on retrospective patient-reported symptoms to standardized scales. A qualitative approach to screening, combined with the innovation of natural language processing (NLP) and machine learning (ML) methods, has shown promise to enhance person-centeredness while detecting depression, anxiety, and suicide risk from in-the-moment patient language derived from an open-ended brief interview. Objective To evaluate the performance of NLP/ML models to identify depression, anxiety, and suicide risk from a single 5-10-min semi-structured interview with a large, national sample. Method Two thousand four hundred sixteen interviews were conducted with 1,433 participants over a teleconference platform, with 861 (35.6%), 863 (35.7%), and 838 (34.7%) sessions screening positive for depression, anxiety, and suicide risk, respectively. Each interview collected language about the participants' feelings and emotional state. Logistic regression (LR), support vector machine (SVM), and extreme gradient boosting (XGB) models were trained for each condition using term frequency-inverse document frequency features from the participants' language. Models were primarily evaluated with the area under the receiver operating characteristic curve (AUC). Results The best discriminative ability was found when identifying depression with an SVM model (AUC = 0.77; 95% CI = 0.75-0.79), followed by anxiety with an LR model (AUC = 0.74; 95% CI = 0.72-0.76), and an SVM for suicide risk (AUC = 0.70; 95% CI = 0.68-0.72). Model performance was generally best with more severe depression, anxiety, or suicide risk. Performance improved when individuals with lifetime but no suicide risk in the past 3 months were considered controls. Conclusion It is feasible to use a virtual platform to simultaneously screen for depression, anxiety, and suicide risk using a 5-to-10-min interview. The NLP/ML models performed with good discrimination in the identification of depression, anxiety, and suicide risk. Although the utility of suicide risk classification in clinical settings is still undetermined and suicide risk classification had the lowest performance, the results, taken together with the qualitative responses from the interview, can better inform clinical decision-making by providing additional drivers associated with suicide risk.
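As a simplified sketch of this screening setup, each condition can be modelled as a separate binary text classifier evaluated by AUC on a held-out split. The variable names and label dictionary below are placeholders, not the study's code, and the study's SVM/XGB variants and tuning are omitted.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Placeholders: `transcripts` holds the interview texts; `labels` maps each condition
# (e.g. "depression", "anxiety", "suicide_risk") to a 0/1 array of positive screens.
for condition, y in labels.items():
    X_train, X_test, y_train, y_test = train_test_split(
        transcripts, y, test_size=0.2, stratify=y, random_state=0)
    model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
    model.fit(X_train, y_train)
    auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
    print(f"{condition}: AUC = {auc:.2f}")
```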
Collapse
Affiliation(s)
- Jennifer Wright-Berryman
- Department of Social Work, College of Allied Health Sciences, University of Cincinnati, Cincinnati, OH, United States
| | | | - Allie Haq
- Clarigent Health, Mason, OH, United States
| | | | - James L. Pease
- Department of Social Work, College of Allied Health Sciences, University of Cincinnati, Cincinnati, OH, United States
| |
Collapse
|
39
|
Solans Noguero D, Ramírez-Cifuentes D, Ríssola EA, Freire A. Gender Bias When Using Artificial Intelligence to Assess Anorexia Nervosa on Social Media: Data-Driven Study. J Med Internet Res 2023; 25:e45184. [PMID: 37289496 PMCID: PMC10288345 DOI: 10.2196/45184] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/19/2022] [Revised: 04/12/2023] [Accepted: 04/26/2023] [Indexed: 06/09/2023] Open
Abstract
BACKGROUND Social media sites are becoming an increasingly important source of information about mental health disorders. Among them, eating disorders are complex psychological problems that involve unhealthy eating habits. In particular, there is evidence showing that signs and symptoms of anorexia nervosa can be traced in social media platforms. Knowing that input data biases tend to be amplified by artificial intelligence algorithms and, in particular, machine learning, these methods should be revised to mitigate biased discrimination in such important domains. OBJECTIVE The main goal of this study was to detect and analyze the performance disparities across genders in algorithms trained for the detection of anorexia nervosa on social media posts. We used a collection of automated predictors trained on a data set in Spanish containing cases of 177 users that showed signs of anorexia (471,262 tweets) and 326 control cases (910,967 tweets). METHODS We first inspected the predictive performance differences between the algorithms for male and female users. Once biases were detected, we applied a feature-level bias characterization to evaluate the source of such biases and performed a comparative analysis of such features and those that are relevant for clinicians. Finally, we showcased different bias mitigation strategies to develop fairer automated classifiers, particularly for risk assessment in sensitive domains. RESULTS Our results revealed concerning predictive performance differences, with substantially higher false negative rates (FNRs) for female samples (FNR=0.082) compared with male samples (FNR=0.005). The findings show that biological processes and suicide risk factors were relevant for classifying positive male cases, whereas age, emotions, and personal concerns were more relevant for female cases. We also proposed techniques for bias mitigation, and we could see that, even though disparities can be mitigated, they cannot be eliminated. CONCLUSIONS We concluded that more attention should be paid to the assessment of biases in automated methods dedicated to the detection of mental health issues. This is particularly relevant before the deployment of systems that are thought to assist clinicians, especially considering that the outputs of such systems can have an impact on the diagnosis of people at risk.
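A bias audit of the kind described here can start from something as simple as comparing false negative rates across groups. The self-contained sketch below uses toy arrays to illustrate that calculation only; it does not reproduce the paper's models or mitigation strategies.

```python
import numpy as np

def false_negative_rate(y_true, y_pred):
    """Share of true positive cases the classifier misses: FN / (FN + TP)."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    positives = y_true == 1
    return float(np.mean(y_pred[positives] == 0)) if positives.any() else float("nan")

# Toy data: aligned arrays of gold labels, model predictions, and a gender attribute.
y_true = np.array([1, 1, 1, 0, 0, 1, 0, 1])
y_pred = np.array([1, 0, 1, 0, 0, 0, 0, 1])
gender = np.array(["f", "f", "m", "m", "f", "f", "m", "m"])

for group in np.unique(gender):
    mask = gender == group
    print(group, round(false_negative_rate(y_true[mask], y_pred[mask]), 3))
```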
Collapse
Affiliation(s)
- David Solans Noguero
- Telefonica I+D, Telefónica Research, Barcelona, Spain
- Web Science and Social Computing group, Universidad Pompeu Fabra, Barcelona, Spain
| | - Diana Ramírez-Cifuentes
- Web Science and Social Computing group, Universidad Pompeu Fabra, Barcelona, Spain
- Computer Vision Center, Bellaterra (Cerdanyola del Vallès), Spain
| | | | - Ana Freire
- Innovation and Sustainability Data Lab, UPF Barcelona School of Management, Barcelona, Spain
| |
Collapse
|
40
|
Daniali M, Galer PD, Lewis-Smith D, Parthasarathy S, Kim E, Salvucci DD, Miller JM, Haag S, Helbig I. Enriching representation learning using 53 million patient notes through human phenotype ontology embedding. Artif Intell Med 2023; 139:102523. [PMID: 37100502 PMCID: PMC10782859 DOI: 10.1016/j.artmed.2023.102523] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/01/2022] [Revised: 02/17/2023] [Accepted: 02/23/2023] [Indexed: 03/04/2023]
Abstract
The Human Phenotype Ontology (HPO) is a dictionary of >15,000 clinical phenotypic terms with defined semantic relationships, developed to standardize phenotypic analysis. Over the last decade, the HPO has been used to accelerate the implementation of precision medicine into clinical practice. In addition, recent research in representation learning, specifically in graph embedding, has led to notable progress in automated prediction via learned features. Here, we present a novel approach to phenotype representation by incorporating phenotypic frequencies based on 53 million full-text health care notes from >1.5 million individuals. We demonstrate the efficacy of our proposed phenotype embedding technique by comparing our work to existing phenotypic similarity-measuring methods. Using phenotype frequencies in our embedding technique, we are able to identify phenotypic similarities that surpass current computational models. Furthermore, our embedding technique exhibits a high degree of agreement with domain experts' judgment. By transforming complex and multidimensional phenotypes from the HPO format into vectors, our proposed method enables efficient representation of these phenotypes for downstream tasks that require deep phenotyping. This is demonstrated in a patient similarity analysis and can further be applied to disease trajectory and risk prediction.
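To make the idea of comparing learned phenotype vectors concrete, the fragment below computes a cosine similarity between two embeddings. The `embeddings` lookup and the toy vectors are placeholders; the paper's frequency-weighted embedding method itself is not reproduced here.

```python
import numpy as np

def cosine_similarity(u, v):
    """Cosine of the angle between two phenotype vectors; 1.0 means identical direction."""
    u, v = np.asarray(u, dtype=float), np.asarray(v, dtype=float)
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Placeholder: a dict mapping HPO term IDs to learned vectors, e.g. produced by a graph
# embedding weighted by note-derived term frequencies as described in the abstract.
embeddings = {
    "HP:0001250": np.array([0.12, -0.48, 0.33]),  # Seizure (toy 3-dimensional vector)
    "HP:0000001": np.array([0.10, -0.40, 0.29]),  # another HPO term (illustrative ID, toy vector)
}

sim = cosine_similarity(embeddings["HP:0001250"], embeddings["HP:0000001"])
print(f"Similarity: {sim:.2f}")
```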
Collapse
Affiliation(s)
- Maryam Daniali
- Department of Computer Science, Drexel University, Philadelphia, PA, USA; Department of Biomedical and Health Informatics (DBHi), Children's Hospital of Philadelphia, Philadelphia, PA, USA
| | - Peter D Galer
- Department of Biomedical and Health Informatics (DBHi), Children's Hospital of Philadelphia, Philadelphia, PA, USA; Division of Neurology, Children's Hospital of Philadelphia, Philadelphia, PA, USA; The Epilepsy Neuro Genetics Initiative (ENGIN), Children's Hospital of Philadelphia, Philadelphia, PA, USA; Center for Neuroengineering and Therapeutics, University of Pennsylvania, Philadelphia, PA, USA
| | - David Lewis-Smith
- Department of Biomedical and Health Informatics (DBHi), Children's Hospital of Philadelphia, Philadelphia, PA, USA; Division of Neurology, Children's Hospital of Philadelphia, Philadelphia, PA, USA; The Epilepsy Neuro Genetics Initiative (ENGIN), Children's Hospital of Philadelphia, Philadelphia, PA, USA; Translational and Clinical Research Institute, Newcastle University, Newcastle-upon-Tyne, UK; Department of Clinical Neurosciences, Royal Victoria Infirmary, Newcastle-upon-Tyne, UK
| | - Shridhar Parthasarathy
- Department of Biomedical and Health Informatics (DBHi), Children's Hospital of Philadelphia, Philadelphia, PA, USA; Division of Neurology, Children's Hospital of Philadelphia, Philadelphia, PA, USA; The Epilepsy Neuro Genetics Initiative (ENGIN), Children's Hospital of Philadelphia, Philadelphia, PA, USA
| | - Edward Kim
- Department of Computer Science, Drexel University, Philadelphia, PA, USA
| | - Dario D Salvucci
- Department of Computer Science, Drexel University, Philadelphia, PA, USA
| | - Jeffrey M Miller
- Department of Biomedical and Health Informatics (DBHi), Children's Hospital of Philadelphia, Philadelphia, PA, USA
| | - Scott Haag
- Department of Computer Science, Drexel University, Philadelphia, PA, USA; Department of Biomedical and Health Informatics (DBHi), Children's Hospital of Philadelphia, Philadelphia, PA, USA
| | - Ingo Helbig
- Department of Biomedical and Health Informatics (DBHi), Children's Hospital of Philadelphia, Philadelphia, PA, USA; Division of Neurology, Children's Hospital of Philadelphia, Philadelphia, PA, USA; The Epilepsy Neuro Genetics Initiative (ENGIN), Children's Hospital of Philadelphia, Philadelphia, PA, USA; Department of Neurology, University of Pennsylvania, Perelman School of Medicine, Philadelphia, PA, USA.
| |
Collapse
|
41
|
Affiliation(s)
- Anmol Arora
- School of Clinical Medicine, University of Cambridge, Cambridge CB2 0SP, UK.
| | - Ananya Arora
- School of Clinical Medicine, University of Cambridge, Cambridge CB2 0SP, UK
| |
Collapse
|
42
|
Rocheteau E. On the role of artificial intelligence in psychiatry. Br J Psychiatry 2023; 222:54-57. [PMID: 36093950 DOI: 10.1192/bjp.2022.132] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 02/01/2023]
Abstract
Recently, there has been growing interest in using artificial intelligence (AI) to improve the efficiency and personalisation of mental health services. So far, progress has been slow; however, advances in deep learning may change this. This paper discusses the role of AI in psychiatry, in particular (a) diagnostic tools, (b) monitoring of symptoms, and (c) delivering personalised treatment recommendations. Finally, I discuss ethical concerns and technological limitations.
Collapse
Affiliation(s)
- Emma Rocheteau
- School of Clinical Medicine, University of Cambridge, UK; and Department of Computer Science and Technology, University of Cambridge, UK
| |
Collapse
|
43
|
Abstract
The need for objective measurement in psychiatry has stimulated interest in alternative indicators of the presence and severity of illness. Speech may offer a source of information that bridges the subjective and objective in the assessment of mental disorders. We systematically reviewed the literature for articles exploring speech analysis for psychiatric applications. The utility of speech analysis depends on how accurately speech features represent clinical symptoms within and across disorders. We identified four domains of the application of speech analysis in the literature: diagnostic classification, assessment of illness severity, prediction of onset of illness, and prognosis and treatment outcomes. We discuss the findings in each of these domains, with a focus on how types of speech features characterize different aspects of psychopathology. Models that bring together multiple speech features can distinguish speakers with psychiatric disorders from healthy controls with high accuracy. Differentiating between types of mental disorders and between symptom dimensions is a more complex problem that exposes the transdiagnostic nature of speech features. Convergent progress in speech research and computer science opens avenues for implementing speech analysis to enhance the objectivity of assessment in clinical practice. Application of speech analysis will need to address issues of ethics and equity, including the potential to perpetuate discriminatory bias through models that learn from clinical assessment data. Methods that mitigate bias are available and should play a key role in the implementation of speech analysis.
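To make the idea of multi-feature speech models concrete, the sketch below is purely illustrative (the file name and feature choices are assumptions, not the reviewed studies' pipelines): it extracts a small acoustic feature vector of the kind that could feed a downstream classifier.

```python
# Illustrative acoustic feature extraction; "interview.wav" is a hypothetical
# file, and the feature set is an assumption, not any reviewed study's method.
import librosa
import numpy as np

y, sr = librosa.load("interview.wav", sr=16000)        # mono speech recording
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)     # spectral features
f0 = librosa.yin(y, fmin=65, fmax=400, sr=sr)          # frame-wise pitch estimate
features = np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1), [np.nanmean(f0)]])
print(features.shape)  # 13 + 13 + 1 = 27 features per recording
```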
Collapse
Affiliation(s)
- Katerina Dikaios
- From: Dalhousie University, Department of Psychiatry, Halifax, NS (Ms. Dikaios, Dr. Uher); Nova Scotia Health, Halifax, NS (Ms. Rempel); Faculty of Computer Science, Dalhousie University, and Vector Institute for Artificial Intelligence, University of Toronto (Mr. Dumpala, Dr. Oore); School of Communication Sciences and Disorders, Dalhousie University (Dr. Kiefte)
| | | | | | | | | | | |
Collapse
|
44
|
Cotes RO, Boazak M, Griner E, Jiang Z, Kim B, Bremer W, Seyedi S, Bahrami Rad A, Clifford GD. Multimodal Assessment of Schizophrenia and Depression Utilizing Video, Acoustic, Locomotor, Electroencephalographic, and Heart Rate Technology: Protocol for an Observational Study. JMIR Res Protoc 2022; 11:e36417. [PMID: 35830230 PMCID: PMC9330209 DOI: 10.2196/36417] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/13/2022] [Revised: 05/30/2022] [Accepted: 05/31/2022] [Indexed: 11/20/2022] Open
Abstract
Background Current standards of psychiatric assessment and diagnostic evaluation rely primarily on the clinician's subjective interpretation of a patient's outward manifestations of their internal state. While psychometric tools can help to evaluate these behaviors more systematically, the tools still rely on the clinician's interpretation of what are frequently nuanced speech and behavior patterns. With advances in computing power, increased availability of clinical data, and improving resolution of recording and sensor hardware (including acoustic, video, accelerometer, infrared, and other modalities), researchers have begun to demonstrate the feasibility of cutting-edge technologies in aiding the assessment of psychiatric disorders. Objective We present a research protocol that utilizes facial expression, eye gaze, voice and speech, locomotor, heart rate, and electroencephalography monitoring to assess schizophrenia symptoms and to distinguish patients with schizophrenia from those with other psychiatric disorders and control subjects. Methods We plan to recruit three outpatient groups: (1) 50 patients with schizophrenia, (2) 50 patients with unipolar major depressive disorder, and (3) 50 individuals with no psychiatric history. Using an internally developed semistructured interview, psychometrically validated clinical outcome measures, and a multimodal sensing system utilizing video, acoustic, actigraphic, heart rate, and electroencephalographic sensors, we aim to evaluate the system's capacity to classify subjects (schizophrenia, depression, or control), to evaluate its sensitivity to within-group symptom severity, and to determine whether such a system can further classify variations in disorder subtypes. Results Data collection began in July 2020 and is expected to continue through December 2022. Conclusions If successful, this study will help advance current progress in developing state-of-the-art technology to aid clinical psychiatric assessment and treatment. If our findings suggest that these technologies are capable of resolving diagnoses and symptoms to the level of current psychometric testing and clinician judgment, we would be among the first to develop a system that can eventually be used by clinicians to more objectively diagnose and assess schizophrenia and depression, with a lower risk of bias. Such a tool has the potential to improve accessibility to care; to aid clinicians in objectively evaluating diagnoses, severity of symptoms, and treatment efficacy through time; and to reduce treatment-related morbidity. International Registered Report Identifier (IRRID) DERR1-10.2196/36417
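A minimal late-fusion sketch, under the assumption that each modality has first been reduced to a fixed-length feature vector (this is not the study's registered analysis plan, and the feature dimensions are hypothetical), might look as follows.

```python
# Illustrative late fusion: concatenate per-modality feature vectors and fit a
# single classifier over the three planned groups. Synthetic data only.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 150                                      # 3 groups x 50 planned participants
dims = {"facial": 32, "acoustic": 40, "actigraphy": 8, "heart_rate": 4, "eeg": 64}
X = np.hstack([rng.normal(size=(n, d)) for d in dims.values()])  # fused features
y = np.repeat([0, 1, 2], 50)                 # schizophrenia / depression / control
clf = LogisticRegression(max_iter=1000).fit(X, y)
print(clf.predict(X[:3]))
```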
Collapse
Affiliation(s)
- Robert O Cotes
- Department of Psychiatry and Behavioral Sciences, Emory University School of Medicine, Atlanta, GA, United States
| | - Mina Boazak
- Animo Sano Psychiatry, Durham, NC, United States
| | - Emily Griner
- Department of Psychiatry and Behavioral Sciences, Emory University School of Medicine, Atlanta, GA, United States
| | - Zifan Jiang
- Department of Biomedical Informatics, Emory University School of Medicine, Atlanta, GA, United States.,Department of Biomedical Engineering, Georgia Institute of Technology, Atlanta, GA, United States
| | - Bona Kim
- Visual Medical Education, Emory School of Medicine, Atlanta, GA, United States
| | - Whitney Bremer
- Department of Biomedical Informatics, Emory University School of Medicine, Atlanta, GA, United States
| | - Salman Seyedi
- Department of Biomedical Informatics, Emory University School of Medicine, Atlanta, GA, United States
| | - Ali Bahrami Rad
- Department of Biomedical Informatics, Emory University School of Medicine, Atlanta, GA, United States
| | - Gari D Clifford
- Department of Biomedical Informatics, Emory University School of Medicine, Atlanta, GA, United States.,Department of Biomedical Engineering, Georgia Institute of Technology, Atlanta, GA, United States
| |
Collapse
|
45
|
Huang J, Galal G, Etemadi M, Vaidyanathan M. Evaluation and Mitigation of Racial Bias in Clinical Machine Learning Models: Scoping Review. JMIR Med Inform 2022; 10:e36388. [PMID: 35639450 PMCID: PMC9198828 DOI: 10.2196/36388] [Citation(s) in RCA: 70] [Impact Index Per Article: 23.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/12/2022] [Revised: 02/17/2022] [Accepted: 03/27/2022] [Indexed: 01/12/2023] Open
Abstract
BACKGROUND Racial bias is a key concern regarding the development, validation, and implementation of machine learning (ML) models in clinical settings. Despite the potential of bias to propagate health disparities, racial bias in clinical ML has yet to be thoroughly examined and best practices for bias mitigation remain unclear. OBJECTIVE Our objective was to perform a scoping review to characterize the methods by which the racial bias of ML has been assessed and describe strategies that may be used to enhance algorithmic fairness in clinical ML. METHODS A scoping review was conducted in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA) Extension for Scoping Reviews. A literature search using PubMed, Scopus, and Embase databases, as well as Google Scholar, identified 635 records, of which 12 studies were included. RESULTS Applications of ML were varied and involved diagnosis, outcome prediction, and clinical score prediction performed on data sets including images, diagnostic studies, clinical text, and clinical variables. Of the 12 studies, 1 (8%) described a model in routine clinical use, 2 (17%) examined prospectively validated clinical models, and the remaining 9 (75%) described internally validated models. In addition, 8 (67%) studies concluded that racial bias was present, 2 (17%) concluded that it was not, and 2 (17%) assessed the implementation of bias mitigation strategies without comparison to a baseline model. Fairness metrics used to assess algorithmic racial bias were inconsistent. The most commonly observed metrics were equal opportunity difference (5/12, 42%), accuracy (4/12, 33%), and disparate impact (2/12, 17%). All 8 (67%) studies that implemented methods for mitigation of racial bias successfully increased fairness, as measured by the authors' chosen metrics. Preprocessing methods of bias mitigation were most commonly used across all studies that implemented them. CONCLUSIONS The broad scope of medical ML applications and potential patient harms demand an increased emphasis on evaluation and mitigation of racial bias in clinical ML. However, the adoption of algorithmic fairness principles in medicine remains inconsistent and is limited by poor data availability and ML model reporting. We recommend that researchers and journal editors emphasize standardized reporting and data availability in medical ML studies to improve transparency and facilitate evaluation for racial bias.
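For reference, the two most frequently reported fairness metrics can be computed as in the sketch below (illustrative code with toy data; the variable names and the 0/1 group encoding are assumptions, not drawn from the reviewed studies).

```python
# Illustrative implementations of two fairness metrics named in the review.
import numpy as np

def equal_opportunity_difference(y_true, y_pred, group):
    """TPR(group 1) - TPR(group 0); 0 indicates equal opportunity."""
    y_true, y_pred, group = (np.asarray(a) for a in (y_true, y_pred, group))
    def tpr(g):
        return y_pred[(group == g) & (y_true == 1)].mean()
    return tpr(1) - tpr(0)

def disparate_impact(y_pred, group):
    """Ratio of positive-prediction rates between groups; 1 indicates parity."""
    y_pred, group = np.asarray(y_pred), np.asarray(group)
    return y_pred[group == 1].mean() / y_pred[group == 0].mean()

y_true = [1, 1, 0, 1, 0, 1]
y_pred = [1, 0, 0, 1, 1, 1]
group  = [0, 0, 0, 1, 1, 1]
print(equal_opportunity_difference(y_true, y_pred, group))  # 0.5
print(disparate_impact(y_pred, group))                      # 3.0
```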
Collapse
Affiliation(s)
- Jonathan Huang
- Department of Anesthesiology, Northwestern University Feinberg School of Medicine, Chicago, IL, United States
| | - Galal Galal
- Department of Anesthesiology, Northwestern University Feinberg School of Medicine, Chicago, IL, United States
| | - Mozziyar Etemadi
- Department of Anesthesiology, Northwestern University Feinberg School of Medicine, Chicago, IL, United States
- Department of Biomedical Engineering, Northwestern University, Evanston, IL, United States
| | - Mahesh Vaidyanathan
- Department of Anesthesiology, Northwestern University Feinberg School of Medicine, Chicago, IL, United States
- Digital Health & Data Science Curricular Thread, Northwestern University Feinberg School of Medicine, Chicago, IL, United States
| |
Collapse
|
46
|
Straw I, Wu H. Investigating for bias in healthcare algorithms: a sex-stratified analysis of supervised machine learning models in liver disease prediction. BMJ Health Care Inform 2022; 29:e100457. [PMID: 35470133 PMCID: PMC9039354 DOI: 10.1136/bmjhci-2021-100457] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/31/2021] [Accepted: 04/06/2022] [Indexed: 12/25/2022] Open
Abstract
OBJECTIVES The Indian Liver Patient Dataset (ILPD) is used extensively to create algorithms that predict liver disease. Given the existing research describing demographic inequities in liver disease diagnosis and management, these algorithms require scrutiny for potential biases. We address this overlooked issue by investigating ILPD models for sex bias. METHODS Following our literature review of ILPD papers, the models reported in existing studies are recreated and then interrogated for bias. We define four experiments, training on sex-unbalanced/balanced data, with and without feature selection. We build random forest (RF), support vector machine (SVM), Gaussian Naïve Bayes and logistic regression (LR) classifiers, running experiments 100 times and reporting average results with SD. RESULTS We reproduce published models achieving accuracies of >70% (LR 71.31% (2.37 SD) to SVM 79.40% (2.50 SD)) and demonstrate a previously unobserved performance disparity. Across all classifiers, females suffer from a higher false negative rate (FNR). RF and LR classifiers are currently reported as the most effective models, yet in our experiments they demonstrate the greatest FNR disparity (RF: -21.02%; LR: -24.07%). DISCUSSION We demonstrate a sex disparity in published ILPD classifiers. In practice, the higher FNR for females would manifest as increased rates of missed diagnosis for female patients and a consequent lack of appropriate care. Our study demonstrates that evaluating biases in the initial stages of machine learning can provide insights into inequalities in current clinical practice, reveal pathophysiological differences between males and females, and mitigate the digitisation of inequalities into algorithmic systems. CONCLUSION Our findings are important to medical data scientists, clinicians and policy-makers involved in the implementation of medical artificial intelligence systems. An awareness of the potential biases of these systems is essential in preventing the digital exacerbation of healthcare inequalities.
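A minimal sketch of this kind of sex-stratified audit, assuming ILPD-style tabular data with hypothetical "sex" and "disease" columns (not the authors' exact pipeline or hyperparameters), is shown below.

```python
# Minimal sketch of a sex-stratified false-negative-rate audit on ILPD-style
# data; column names "sex" and "disease" are assumptions for this example.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

def fnr(y_true, y_pred):
    positives = y_true == 1
    return float((y_pred[positives] == 0).mean())

def sex_stratified_fnr(df: pd.DataFrame) -> dict:
    y = df["disease"]
    X = pd.get_dummies(df.drop(columns=["disease"]), drop_first=True)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, stratify=y, random_state=0)
    clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
    pred = pd.Series(clf.predict(X_te), index=X_te.index)
    sex = df.loc[X_te.index, "sex"]
    return {s: fnr(y_te[sex == s].to_numpy(), pred[sex == s].to_numpy())
            for s in sorted(sex.unique())}
```

In this setup a higher FNR for one group means more positive cases of that group are missed, which is the disparity the study reports.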
Collapse
Affiliation(s)
- Isabel Straw
- Institute of Health Informatics, University College London, London, UK
| | - Honghan Wu
- Institute of Health Informatics, University College London, London, UK
| |
Collapse
|
47
|
Monteith S, Glenn T, Geddes J, Whybrow PC, Bauer M. Commercial Use of Emotion Artificial Intelligence (AI): Implications for Psychiatry. Curr Psychiatry Rep 2022; 24:203-211. [PMID: 35212918 DOI: 10.1007/s11920-022-01330-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Accepted: 02/07/2022] [Indexed: 11/03/2022]
Abstract
PURPOSE OF REVIEW Emotion artificial intelligence (AI) is technology for emotion detection and recognition. Emotion AI is expanding rapidly in commercial and government settings outside of medicine, and will increasingly become a routine part of daily life. The goal of this narrative review is to increase awareness both of the widespread use of emotion AI and of the concerns with commercial use of emotion AI in relation to people with mental illness. RECENT FINDINGS This paper discusses emotion AI fundamentals, a general overview of commercial emotion AI outside of medicine, and examples of the use of emotion AI in employee hiring and workplace monitoring. Efforts to support the successful re-integration of patients with mental illness into society must recognize the increasing commercial use of emotion AI. There are concerns that commercial use of emotion AI will increase stigma and discrimination, and have negative consequences in daily life for people with mental illness. Commercial emotion AI algorithm predictions about mental illness should not be treated as medical fact.
Collapse
Affiliation(s)
- Scott Monteith
- Michigan State University College of Human Medicine, Traverse City Campus, 1400 Medical Campus Drive, Traverse City, MI, 49684, USA.
| | - Tasha Glenn
- ChronoRecord Association, Fullerton, CA, USA
| | - John Geddes
- Department of Psychiatry, University of Oxford, Warneford Hospital, Oxford, UK
| | - Peter C Whybrow
- Department of Psychiatry and Biobehavioral Sciences, Semel Institute for Neuroscience and Human Behavior, University of California Los Angeles (UCLA), Los Angeles, CA, USA
| | - Michael Bauer
- Department of Psychiatry and Psychotherapy, University Hospital Carl Gustav Carus Medical Faculty, Technische Universität Dresden, Dresden, Germany
| |
Collapse
|
48
|
Abstract
Human-computer interaction (HCI) has contributed to the design and development of some efficient, user-friendly, cost-effective, and adaptable digital mental health solutions. However, HCI has not been well integrated into these technological developments, resulting in quality and safety concerns. Digital platforms and artificial intelligence (AI) have considerable potential to improve prediction, identification, coordination, and treatment within mental health care and suicide prevention services. AI is driving web-based and smartphone apps, most commonly for self-help and guided cognitive behavioral therapy (CBT) for anxiety and depression. Interactive AI may support real-time screening and treatment where mental healthcare systems are outdated, strained, or lacking. Barriers to using AI in mental healthcare include accessibility, efficacy, reliability, usability, safety, security, ethics, suitable education and training, and socio-cultural adaptability. Apps, real-time machine learning algorithms, immersive technologies, and digital phenotyping are notable prospects. Overall, there is a need for better integration of human factors with machine interaction and automation, more rigorous effectiveness evaluation, and the application of blended, hybrid, or stepped care as an adjunct approach. HCI modeling may assist in the design and development of usable applications; help recognize, acknowledge, and address inequities in mental health care and suicide prevention; and support the digital therapeutic alliance.
Collapse
|
49
|
Cohen J, Wright-Berryman J, Rohlfs L, Trocinski D, Daniel L, Klatt TW. Integration and Validation of a Natural Language Processing Machine Learning Suicide Risk Prediction Model Based on Open-Ended Interview Language in the Emergency Department. Front Digit Health 2022; 4:818705. [PMID: 35187527 PMCID: PMC8847784 DOI: 10.3389/fdgth.2022.818705] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/19/2021] [Accepted: 01/10/2022] [Indexed: 12/23/2022] Open
Abstract
BACKGROUND Emergency departments (EDs) are an important intercept point for identifying suicide risk and connecting patients to care; however, more innovative, person-centered screening tools are needed. Natural language processing (NLP)-based machine learning (ML) techniques have shown promise for assessing suicide risk, although whether NLP models perform well in differing geographic regions, at different time periods, or after large-scale events such as the COVID-19 pandemic is unknown. OBJECTIVE To evaluate the performance of an NLP/ML suicide risk prediction model on newly collected language from the Southeastern United States using models previously tested on language collected in the Midwestern US. METHOD 37 suicidal and 33 non-suicidal patients from two EDs were interviewed to test a previously developed suicide risk prediction NLP/ML model. Model performance was evaluated with the area under the receiver operating characteristic curve (AUC) and Brier scores. RESULTS The NLP/ML models performed with an AUC of 0.81 (95% CI: 0.71-0.91) and a Brier score of 0.23. CONCLUSION The language-based suicide risk model showed good discrimination when identifying the language of suicidal patients from a different part of the US and at a later time period than when the model was originally developed and trained.
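The reported discrimination and calibration metrics can be reproduced on any set of risk scores as in the sketch below; the labels and scores here are placeholders, not study data.

```python
# Illustrative evaluation with scikit-learn; toy labels and risk scores only.
from sklearn.metrics import brier_score_loss, roc_auc_score

y_true  = [1, 0, 1, 1, 0, 0, 1, 0]                          # 1 = suicidal language
y_score = [0.90, 0.20, 0.70, 0.60, 0.40, 0.10, 0.80, 0.35]  # model risk scores

print("AUC:  ", round(roc_auc_score(y_true, y_score), 2))    # discrimination
print("Brier:", round(brier_score_loss(y_true, y_score), 2))  # calibration/accuracy of probabilities
```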
Collapse
Affiliation(s)
| | - Jennifer Wright-Berryman
- Department of Social Work, College of Allied Health Sciences, University of Cincinnati, Cincinnati, OH, United States
| | | | | | | | | |
Collapse
|
50
|
Musbahi O, Syed L, Le Feuvre P, Cobb J, Jones G. Public patient views of artificial intelligence in healthcare: A nominal group technique study. Digit Health 2021; 7:20552076211063682. [PMID: 34950499 PMCID: PMC8689636 DOI: 10.1177/20552076211063682] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/24/2023] Open
Abstract
Objectives The beliefs of laypeople and medical professionals often diverge with regard to disease, and technology has had a positive impact on how research is conducted. Surprisingly, given the expanding worldwide funding and research into Artificial Intelligence (AI) applications in healthcare, there is a paucity of research exploring the public patient perspective on this technology. Our study sets out to address this knowledge gap by applying the Nominal Group Technique (NGT) to explore patient public views on AI. Methods A Nominal Group Technique (NGT) was used involving four study groups with seven participants in each group. This started with a silent generation of ideas regarding the benefits and concerns of AI in healthcare. Then a group discussion and round-robin process were conducted until no new ideas were generated. Participants ranked their top five benefits and top five concerns regarding the use of AI in healthcare. A final group consensus was reached. Results Twenty-eight participants were recruited, with a mean age of 47 years. The top five benefits were: faster health services, greater accuracy in management, AI systems available 24/7, reduced workforce burden, and equality in healthcare decision making. The top five concerns were: data cybersecurity, bias and quality of AI data, less human interaction, algorithm errors and responsibility, and limitations in technology. Conclusion This is the first formal qualitative study exploring patient public views on the use of AI in healthcare, and it highlights that there is a clear understanding of the potential benefits delivered by this technology. Greater patient and public involvement and a strong regulatory framework are recommended.
Collapse
Affiliation(s)
- Omar Musbahi
- MSK Lab, Imperial College London, Charing Cross Campus, Hammersmith, London, UK
| | - Labib Syed
- MSK Lab, Imperial College London, Charing Cross Campus, Hammersmith, London, UK
| | - Peter Le Feuvre
- MSK Lab, Imperial College London, Charing Cross Campus, Hammersmith, London, UK
| | - Justin Cobb
- MSK Lab, Imperial College London, Charing Cross Campus, Hammersmith, London, UK
| | - Gareth Jones
- MSK Lab, Imperial College London, Charing Cross Campus, Hammersmith, London, UK
| |
Collapse
|