1. Alsuhaibani M, Dodge HH, Mahoor MH. Mild cognitive impairment detection from facial video interviews by applying spatial-to-temporal attention module. Expert Systems with Applications 2024; 252:124185. [PMID: 38881832 PMCID: PMC11174143 DOI: 10.1016/j.eswa.2024.124185] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/18/2024]
Abstract
Early detection of Mild Cognitive Impairment (MCI) leads to early interventions to slow the progression from MCI into dementia. Deep Learning (DL) algorithms could help achieve early non-invasive and low-cost detection of MCI. This paper presents the detection of MCI in older adults using DL models based only on facial features extracted from video-recorded conversations at home. We used the data collected from the I-CONECT behavioral intervention study (NCT02871921), where several sessions of semi-structured interviews between socially isolated older individuals and interviewers were video recorded. We develop a framework that extracts holistic spatial facial features using a convolutional autoencoder and temporal information using transformers. We proposed the Spatial-to-Temporal Attention Module (STAM) to detect the I-CONECT study participants' cognitive conditions (MCI vs. those with normal cognition (NC)) using facial and interaction features. The interaction features of the facial features improved the prediction performance compared with applying facial features solely. The detection accuracy using this combined method reached 88%, whereas the accuracy without applying the segments and sequences information of the facial features within a video on a certain theme was 84%. Overall, the results show that spatiotemporal facial features modeled using DL algorithms have a discriminating power for the detection of MCI.
Affiliation(s)
- Muath Alsuhaibani
  - Department of Electrical and Computer Engineering, University of Denver, Denver 80208, CO, United States
  - Department of Electrical Engineering, Prince Sattam Bin Abdulaziz University, Al-Kharj 11942, Saudi Arabia
- Hiroko H. Dodge
  - Department of Neurology, Massachusetts General Hospital, Harvard Medical School, Boston 02114, MA, United States
- Mohammad H. Mahoor
  - Department of Electrical and Computer Engineering, University of Denver, Denver 80208, CO, United States

2. Aina J, Akinniyi O, Rahman MM, Odero-Marah V, Khalifa F. A Hybrid Learning-Architecture for Mental Disorder Detection Using Emotion Recognition. IEEE Access: Practical Innovations, Open Solutions 2024; 12:91410-91425. [PMID: 39054996 PMCID: PMC11270886 DOI: 10.1109/access.2024.3421376] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/27/2024]
Abstract
Mental illness has grown to become a prevalent global health concern that affects individuals across various demographics. Timely detection and accurate diagnosis of mental disorders are crucial for effective treatment and support, as late diagnosis could result in suicidal or otherwise harmful behaviors and, ultimately, death. To this end, the present study introduces a novel pipeline for the analysis of facial expressions, leveraging both the AffectNet and 2013 Facial Emotion Recognition (FER) datasets. This research goes beyond traditional diagnostic methods by contributing a system capable of generating a comprehensive mental disorder dataset and concurrently predicting mental disorders based on facial emotional cues. In particular, we introduce a hybrid architecture for mental disorder detection that leverages the state-of-the-art object detection algorithm YOLOv8 to detect and classify visual cues associated with specific mental disorders. To achieve accurate predictions, an integrated learning architecture based on the fusion of Convolutional Neural Network (CNN) and Visual Transformer (ViT) models is developed to form an ensemble classifier that predicts the presence of mental illness (e.g., depression, anxiety, and other mental disorders). The overall accuracy is improved to about 81% using the proposed ensemble technique. To ensure transparency and interpretability, we integrate techniques such as Gradient-weighted Class Activation Mapping (Grad-CAM) and saliency maps to highlight the regions in the input image that contribute significantly to the model's predictions, giving healthcare professionals a clear understanding of the features influencing the system's decisions and thereby enhancing trust and supporting a more informed diagnostic process.
Affiliation(s)
- Joseph Aina
  - Electrical and Computer Engineering Department, School of Engineering, Morgan State University, Baltimore, MD 21251, USA
- Oluwatunmise Akinniyi
  - Electrical and Computer Engineering Department, School of Engineering, Morgan State University, Baltimore, MD 21251, USA
- Md Mahmudur Rahman
  - Department of Computer Science, School of Computer, Mathematical and Natural Sciences, Morgan State University, Baltimore, MD 21251, USA
- Valerie Odero-Marah
  - Center for Urban Health Disparities Research and Innovation, Department of Biology, Morgan State University, Baltimore, MD 21251, USA
- Fahmi Khalifa
  - Electrical and Computer Engineering Department, School of Engineering, Morgan State University, Baltimore, MD 21251, USA
  - Electronics and Communications Engineering Department, Mansoura University, Mansoura 35516, Egypt

3. Perini I, Pabst A, Martinez D, Maurage P, Heilig M. Modeling social cognition in alcohol use disorder: lessons from schizophrenia. Psychopharmacology (Berl) 2024:10.1007/s00213-024-06601-0. [PMID: 38761256 DOI: 10.1007/s00213-024-06601-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/01/2023] [Accepted: 04/25/2024] [Indexed: 05/20/2024]
Abstract
A better understanding of social deficits in alcohol use disorder (AUD) has the potential to improve our understanding of the disorder. Clinical research shows that AUD is associated with interpersonal problems and the loss of a social network which impedes response to treatment. Translational research between animal models and clinical research may benefit from a discussion of the models and methods that currently guide research into social cognition in AUD. We propose that research in AUD should harness recent technological developments to improve ecological validity while maintaining experimental control. Novel methods allow us to parse naturalistic social cognition into tangible components, and to investigate previously neglected aspects of social cognition. Furthermore, to incorporate social cognition as a defining element of AUD, it is critical to clarify the timing of these social disturbances. Currently, there is limited evidence to distinguish factors that influence social cognition as a consequence of AUD, and those that precede the onset of the disorder. Both increasing the focus on operationalization of social cognition into objective components and adopting a perspective that spans the clinical spectrum will improve our understanding in humans, but also possibly increase methodological consistency and translational dialogue across species. This commentary underscores current challenges and perspectives in this area of research.
Affiliation(s)
- Irene Perini
  - Center for Social and Affective Neuroscience, Department of Biomedical and Clinical Sciences, Linköping University, Linköping, Sweden
  - Center for Medical Image Science and Visualization, Linköping, Sweden
- Arthur Pabst
  - Louvain Experimental Psychopathology research group (LEP), Psychological Sciences Research Institute, UCLouvain, Place C. Mercier 10, Louvain-la-Neuve, B-1348, Belgium
- Diana Martinez
  - Columbia University, New York State Psychiatric Institute, New York, NY, 10032, USA
- Pierre Maurage
  - Louvain Experimental Psychopathology research group (LEP), Psychological Sciences Research Institute, UCLouvain, Place C. Mercier 10, Louvain-la-Neuve, B-1348, Belgium
- Markus Heilig
  - Center for Social and Affective Neuroscience, Department of Biomedical and Clinical Sciences, Linköping University, Linköping, Sweden

4. Olah J, Cummins N, Arribas M, Gibbs-Dean T, Molina E, Sethi D, Kempton MJ, Morgan S, Spencer T, Diederen K. Towards a scalable approach to assess speech organization across the psychosis-spectrum -online assessment in conjunction with automated transcription and extraction of speech measures. Transl Psychiatry 2024; 14:156. [PMID: 38509087 PMCID: PMC10954690 DOI: 10.1038/s41398-024-02851-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/04/2023] [Revised: 02/15/2024] [Accepted: 02/22/2024] [Indexed: 03/22/2024] Open
Abstract
Automatically extracted measures of speech constitute a promising marker of psychosis as disorganized speech is associated with psychotic symptoms and predictive of psychosis-onset. The potential of speech markers is, however, hampered by (i) lengthy assessments in laboratory settings and (ii) manual transcriptions. We investigated whether a short, scalable data collection (online) and processing (automated transcription) procedure would provide data of sufficient quality to extract previously validated speech measures. To evaluate the fit of our approach for purpose, we assessed speech in relation to psychotic-like experiences in the general population. Participants completed an 8-minute-long speech task online. Sample 1 included measures of psychometric schizotypy and delusional ideation (N = 446). Sample 2 included a low and high psychometric schizotypy group (N = 144). Recordings were transcribed both automatically and manually, and connectivity, semantic, and syntactic speech measures were extracted for both types of transcripts. 73%/86% of participants in Sample 1/2 completed the experiment. Nineteen out of 25 speech measures were strongly (r > 0.7) and significantly correlated between automated and manual transcripts in both samples. Amongst the 14 connectivity measures, 11 showed a significant relationship with delusional ideation. For the semantic and syntactic measures, On Topic score and the Frequency of personal pronouns were negatively correlated with both schizotypy and delusional ideation. Combined with demographic information, the speech markers could explain 11-14% of the variation of delusional ideation and schizotypy in Sample 1 and could discriminate between high-low schizotypy with high accuracy (0.72-0.70, AUC = 0.78-0.79) in Sample 2. The moderate to high retention rate, strong correlation of speech measures across manual and automated transcripts and sensitivity to psychotic-like experiences provide initial evidence that online collected speech in combination with automatic transcription is a feasible approach to increase accessibility and scalability of speech-based assessment of psychosis.
Affiliation(s)
- Julianna Olah
  - Department of Psychosis Studies, Institute of Psychiatry, Psychology & Neuroscience, King's College London, London, UK
- Nicholas Cummins
  - Institute of Psychiatry, Psychology and Neuroscience, Department of Biostatistics & Health Informatics, King's College London, London, UK
- Maite Arribas
  - Department of Psychosis Studies, Institute of Psychiatry, Psychology & Neuroscience, King's College London, London, UK
- Toni Gibbs-Dean
  - Department of Psychosis Studies, Institute of Psychiatry, Psychology & Neuroscience, King's College London, London, UK
- Elena Molina
  - Department of Psychosis Studies, Institute of Psychiatry, Psychology & Neuroscience, King's College London, London, UK
- Divina Sethi
  - Department of Psychosis Studies, Institute of Psychiatry, Psychology & Neuroscience, King's College London, London, UK
- Matthew J Kempton
  - Department of Psychosis Studies, Institute of Psychiatry, Psychology & Neuroscience, King's College London, London, UK
- Sarah Morgan
  - Behavioural and Clinical Neuroscience Institute, Department of Psychiatry, University of Cambridge, Cambridge, United Kingdom
- Tom Spencer
  - Department of Psychosis Studies, Institute of Psychiatry, Psychology & Neuroscience, King's College London, London, UK
- Kelly Diederen
  - Department of Psychosis Studies, Institute of Psychiatry, Psychology & Neuroscience, King's College London, London, UK

5. Olah J, Spencer T, Cummins N, Diederen K. Automated analysis of speech as a marker of sub-clinical psychotic experiences. Front Psychiatry 2024; 14:1265880. [PMID: 38361830 PMCID: PMC10867252 DOI: 10.3389/fpsyt.2023.1265880] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/24/2023] [Accepted: 12/22/2023] [Indexed: 02/17/2024] Open
Abstract
Automated speech analysis techniques, when combined with artificial intelligence and machine learning, show potential in capturing and predicting a wide range of psychosis symptoms, garnering attention from researchers. These techniques hold promise in predicting the transition to clinical psychosis from at-risk states, as well as relapse or treatment response in individuals with clinical-level psychosis. However, challenges in scientific validation hinder the translation of these techniques into practical applications. Although sub-clinical research could help tackle most of these challenges, only a few studies in speech and psychosis research have been conducted in non-clinical populations. This work aims to facilitate such research by summarizing automated speech analytical concepts and the intersection of this field with psychosis research. First, we review the psychosis continuum and sub-clinical psychotic experiences, and the benefits of researching them. Second, we discuss the connection between speech and psychotic symptoms. Third, we give an overview of current and state-of-the-art approaches to the automated analysis of speech, both in terms of language use (text-based analysis) and vocal features (audio-based analysis). Then, we review techniques applied in subclinical populations and findings in these samples. Finally, we discuss research challenges in the field, recommend future research endeavors and outline how research in subclinical populations can tackle the listed challenges.
Affiliation(s)
- Julianna Olah
  - Department of Psychosis Studies, Institute of Psychiatry, Psychology and Neuroscience, King’s College London, London, United Kingdom
- Thomas Spencer
  - Department of Psychosis Studies, Institute of Psychiatry, Psychology and Neuroscience, King’s College London, London, United Kingdom
- Nicholas Cummins
  - Department of Biostatistics & Health Informatics, Institute of Psychiatry, Psychology and Neuroscience, King’s College London, London, United Kingdom
- Kelly Diederen
  - Department of Psychosis Studies, Institute of Psychiatry, Psychology and Neuroscience, King’s College London, London, United Kingdom

6. Granrud OE, Rodriguez Z, Cowan T, Masucci MD, Cohen AS. Alogia and pressured speech do not fall on a continuum of speech production using objective speech technologies. Schizophr Res 2023; 259:121-126. [PMID: 35864001 DOI: 10.1016/j.schres.2022.07.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/30/2022] [Revised: 07/02/2022] [Accepted: 07/04/2022] [Indexed: 10/17/2022]
Abstract
Speech production is affected in a variety of serious mental illnesses (SMI; e.g., schizophrenia, unipolar depression, bipolar disorders) and at its extremes can be observed in the gross reduction of speech (e.g., alogia) or increase of speech (e.g., pressured speech). The present study evaluated whether clinically-rated alogia and pressured speech represent antithetical constructs when analyzed using objective metrics of speech production. We examined natural speech using acoustic and natural language processing features from two archival studies using several different speaking tasks and a combined 107 patients meeting criteria for SMI. Contrary to expectations, we did not find that alogia and pressured speech presented as opposing ends of a speech production continuum. Objective speech markers were associated with clinically rated alogia but not pressured speech, and these results were consistent across speaking tasks and studies. Implications for our understanding of speech production symptoms in SMI are discussed, as well as implications for Natural Language Processing and digital phenotyping efforts more generally.
Affiliation(s)
- Ole Edvard Granrud
  - Louisiana State University, Department of Psychology, United States of America
- Zachary Rodriguez
  - Louisiana State University, Department of Psychology, United States of America; Louisiana State University, Center for Computation and Technology, United States of America
- Tovah Cowan
  - Louisiana State University, Department of Psychology, United States of America
- Michael D Masucci
  - Louisiana State University, Department of Psychology, United States of America
- Alex S Cohen
  - Louisiana State University, Department of Psychology, United States of America; Louisiana State University, Center for Computation and Technology, United States of America

7. Loch AA, Gondim JM, Argolo FC, Lopes-Rocha AC, Andrade JC, van de Bilt MT, de Jesus LP, Haddad NM, Cecchi GA, Mota NB, Gattaz WF, Corcoran CM, Ara A. Detecting at-risk mental states for psychosis (ARMS) using machine learning ensembles and facial features. Schizophr Res 2023; 258:45-52. [PMID: 37473667 PMCID: PMC10448183 DOI: 10.1016/j.schres.2023.07.011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/06/2022] [Revised: 04/26/2023] [Accepted: 07/10/2023] [Indexed: 07/22/2023]
Abstract
AIMS: Our study aimed to develop a machine learning ensemble to distinguish "at-risk mental states for psychosis" (ARMS) subjects from control individuals from the general population based on facial data extracted from video recordings. METHODS: 58 non-help-seeking, medication-naïve ARMS subjects and 70 healthy subjects were screened from a general population sample. At-risk status was assessed with the Structured Interview for Prodromal Syndromes (SIPS), and the "Subject's Overview" section was filmed (5-10 min). Several features were extracted, e.g., eye and mouth aspect ratio, Euler angles, and coordinates from 51 facial landmarks. This yielded 649 facial features, which were further selected using Gradient Boosting Machines (AdaBoost combined with Random Forests). Data were split 70/30 for training, and Monte Carlo cross-validation was used. RESULTS: The final model reached a mean F1-score of 83% and a balanced accuracy of 85%. Mean area under the curve for the receiver operator curve classifier was 93%. Convergent validity testing showed that two features included in the model were significantly correlated with Avolition (SIPS N2 item) and expression of emotion (SIPS N3 item). CONCLUSION: Our model capitalized on short video recordings from individuals recruited from the general population, effectively distinguishing between ARMS and controls. Results are encouraging for large-scale screening purposes in low-resource settings.
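As a rough illustration of the Monte Carlo cross-validation mentioned in the abstract above, the sketch below draws repeated random 70/30 splits of 128 subject indices (58 ARMS plus 70 controls). It is not the authors' code: the function name `monte_carlo_splits`, the number of repeats, the seed, and the reading of "70/30" as a train/held-out split are all assumptions made for illustration.

```python
import random


def monte_carlo_splits(n_samples, n_repeats=100, train_frac=0.7, seed=0):
    """Yield (train_idx, test_idx) pairs from repeated random splits.

    Hypothetical helper: unlike k-fold, each repeat reshuffles all
    samples independently, so test sets may overlap across repeats.
    """
    rng = random.Random(seed)
    n_train = round(n_samples * train_frac)
    indices = list(range(n_samples))
    for _ in range(n_repeats):
        shuffled = indices[:]          # copy so the base order is untouched
        rng.shuffle(shuffled)
        yield shuffled[:n_train], shuffled[n_train:]


# 58 ARMS + 70 control subjects, as reported in the abstract.
# n_repeats=5 is an arbitrary choice for this sketch.
splits = list(monte_carlo_splits(58 + 70, n_repeats=5))
for train_idx, test_idx in splits:
    print(len(train_idx), len(test_idx))
```

In practice a model would be fitted on each training set and scored on the matching held-out set, with metrics such as F1-score averaged across repeats.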
Affiliation(s)
- Alexandre Andrade Loch
  - Laboratório de Neurociencias (LIM 27), Instituto de Psiquiatria, Hospital das Clínicas HCFMUSP, Faculdade de Medicina, Universidade de Sao Paulo, Sao Paulo, SP, Brazil; Instituto Nacional de Biomarcadores em Neuropsiquiatria (INBION), Conselho Nacional de Desenvolvimento Científico e Tecnológico, Brazil
- João Medrado Gondim
  - Instituto de Computação, Universidade Federal da Bahia, Salvador, BA, Brazil
- Felipe Coelho Argolo
  - Laboratório de Neurociencias (LIM 27), Instituto de Psiquiatria, Hospital das Clínicas HCFMUSP, Faculdade de Medicina, Universidade de Sao Paulo, Sao Paulo, SP, Brazil
- Ana Caroline Lopes-Rocha
  - Laboratório de Neurociencias (LIM 27), Instituto de Psiquiatria, Hospital das Clínicas HCFMUSP, Faculdade de Medicina, Universidade de Sao Paulo, Sao Paulo, SP, Brazil
- Julio Cesar Andrade
  - Laboratório de Neurociencias (LIM 27), Instituto de Psiquiatria, Hospital das Clínicas HCFMUSP, Faculdade de Medicina, Universidade de Sao Paulo, Sao Paulo, SP, Brazil
- Martinus Theodorus van de Bilt
  - Laboratório de Neurociencias (LIM 27), Instituto de Psiquiatria, Hospital das Clínicas HCFMUSP, Faculdade de Medicina, Universidade de Sao Paulo, Sao Paulo, SP, Brazil; Instituto Nacional de Biomarcadores em Neuropsiquiatria (INBION), Conselho Nacional de Desenvolvimento Científico e Tecnológico, Brazil
- Leonardo Peroni de Jesus
  - Laboratório de Neurociencias (LIM 27), Instituto de Psiquiatria, Hospital das Clínicas HCFMUSP, Faculdade de Medicina, Universidade de Sao Paulo, Sao Paulo, SP, Brazil
- Natalia Mansur Haddad
  - Laboratório de Neurociencias (LIM 27), Instituto de Psiquiatria, Hospital das Clínicas HCFMUSP, Faculdade de Medicina, Universidade de Sao Paulo, Sao Paulo, SP, Brazil
- Natalia Bezerra Mota
  - Instituto de Psiquiatria (IPUB), Departamento de Psiquiatria e Medicina Legal, Universidade Federal do Rio de Janeiro (UFRJ), Rio de Janeiro, Brazil; Research Department at Motrix Lab - Motrix, Rio de Janeiro, Brazil
- Wagner Farid Gattaz
  - Laboratório de Neurociencias (LIM 27), Instituto de Psiquiatria, Hospital das Clínicas HCFMUSP, Faculdade de Medicina, Universidade de Sao Paulo, Sao Paulo, SP, Brazil; Instituto Nacional de Biomarcadores em Neuropsiquiatria (INBION), Conselho Nacional de Desenvolvimento Científico e Tecnológico, Brazil
- Cheryl Mary Corcoran
  - Icahn School of Medicine at Mount Sinai, New York, NY, USA; James J. Peters VA Medical Center Bronx, NY, USA
- Anderson Ara
  - Statistics Department, Federal University of Paraná, Curitiba, PR, Brazil

8. Chen ZS, Kulkarni PP, Galatzer-Levy IR, Bigio B, Nasca C, Zhang Y. Modern views of machine learning for precision psychiatry. Patterns (New York, N.Y.) 2022; 3:100602. [PMID: 36419447 PMCID: PMC9676543 DOI: 10.1016/j.patter.2022.100602] [Citation(s) in RCA: 21] [Impact Index Per Article: 10.5] [Indexed: 11/13/2022]
Abstract
In light of the National Institute of Mental Health (NIMH)'s Research Domain Criteria (RDoC), the advent of functional neuroimaging, novel technologies and methods provide new opportunities to develop precise and personalized prognosis and diagnosis of mental disorders. Machine learning (ML) and artificial intelligence (AI) technologies are playing an increasingly critical role in the new era of precision psychiatry. Combining ML/AI with neuromodulation technologies can potentially provide explainable solutions in clinical practice and effective therapeutic treatment. Advanced wearable and mobile technologies also call for the new role of ML/AI for digital phenotyping in mobile mental health. In this review, we provide a comprehensive review of ML methodologies and applications by combining neuroimaging, neuromodulation, and advanced mobile technologies in psychiatry practice. We further review the role of ML in molecular phenotyping and cross-species biomarker identification in precision psychiatry. We also discuss explainable AI (XAI) and neuromodulation in a closed human-in-the-loop manner and highlight the ML potential in multi-media information extraction and multi-modal data fusion. Finally, we discuss conceptual and practical challenges in precision psychiatry and highlight ML opportunities in future research.
Affiliation(s)
- Zhe Sage Chen
  - Department of Psychiatry, New York University Grossman School of Medicine, New York, NY 10016, USA
  - Department of Neuroscience and Physiology, New York University Grossman School of Medicine, New York, NY 10016, USA
  - The Neuroscience Institute, New York University Grossman School of Medicine, New York, NY 10016, USA
  - Department of Biomedical Engineering, New York University Tandon School of Engineering, Brooklyn, NY 11201, USA
- Isaac R. Galatzer-Levy
  - Department of Psychiatry, New York University Grossman School of Medicine, New York, NY 10016, USA
  - Meta Reality Lab, New York, NY, USA
- Benedetta Bigio
  - Department of Psychiatry, New York University Grossman School of Medicine, New York, NY 10016, USA
- Carla Nasca
  - Department of Psychiatry, New York University Grossman School of Medicine, New York, NY 10016, USA
  - The Neuroscience Institute, New York University Grossman School of Medicine, New York, NY 10016, USA
- Yu Zhang
  - Department of Bioengineering, Lehigh University, Bethlehem, PA 18015, USA
  - Department of Electrical and Computer Engineering, Lehigh University, Bethlehem, PA 18015, USA