1
Nwosu OI, Naunheim MR. Artificial Intelligence in Laryngology, Broncho-Esophagology, and Sleep Surgery. Otolaryngol Clin North Am 2024; 57:821-829. [PMID: 38719714 DOI: 10.1016/j.otc.2024.04.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/06/2024]
Abstract
Technological advancements in laryngology, broncho-esophagology, and sleep surgery have enabled the collection of increasing amounts of complex data for diagnosis and treatment of voice, swallowing, and sleep disorders. Clinicians face challenges in efficiently synthesizing these data for personalized patient care. Artificial intelligence (AI), specifically machine learning and deep learning, offers innovative solutions for processing and interpreting these data, revolutionizing diagnosis and management in these fields, and making care more efficient and effective. In this study, we review recent AI-based innovations in the fields of laryngology, broncho-esophagology, and sleep surgery.
Affiliation(s)
- Obinna I Nwosu
- Department of Otolaryngology-Head & Neck Surgery, Massachusetts Eye & Ear, Boston, MA, USA; Department of Otolaryngology-Head & Neck Surgery, Harvard Medical School, Boston, MA, USA
- Matthew R Naunheim
- Department of Otolaryngology-Head & Neck Surgery, Massachusetts Eye & Ear, Boston, MA, USA; Department of Otolaryngology-Head & Neck Surgery, Harvard Medical School, Boston, MA, USA.
2
Siegel JS, Cohen AS, Szabo ST, Tomioka S, Opler M, Kirkpatrick B, Hopkins S. Enrichment using speech latencies improves treatment effect size in a clinical trial of bipolar depression. Psychiatry Res 2024; 340:116105. [PMID: 39151277 DOI: 10.1016/j.psychres.2024.116105] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 04/02/2024] [Revised: 07/23/2024] [Accepted: 07/24/2024] [Indexed: 08/19/2024]
Abstract
Clinical trials in depression lack objective measures. Speech latencies are an objective measure of psychomotor slowing with face validity and empirical support. 'Turn latency' is the response time between speakers. A retrospective analysis was carried out on the utility of turn latencies as an enrichment tool in a clinical trial of bipolar I depression. Speech data were obtained from 274 participants during 1,352 Montgomery-Åsberg Depression Rating Scale (MADRS) recordings in a randomized, placebo-controlled, 6-week clinical trial of SEP-4199 (200 mg or 400 mg). Post-randomization turn latencies were compared between patients with moderate to severe depression and patients whose depression had remitted. A cutoff was determined and applied to turn latencies pre-randomization to classify individuals into two groups: Speech Latencies Slow (SL-Slow) and Speech Latencies Normal (SL-Normal). At week 6, SL-Slow (N = 172) showed significant separation in MADRS scores between placebo and treatment arms. SL-Normal (N = 102) showed larger MADRS improvements and no significant separation between placebo and treatment arms. Excluding SL-Normal increased primary outcome effect size by 52% and 100% for the treatment arms. Turn latencies are an objective measure available from standard clinical assessments and may assess the severity of symptoms more accurately and screen out placebo responders.
Affiliation(s)
- Joshua S Siegel
- Sumitomo Pharmaceuticals Inc; Washington University in St. Louis, Department of Psychiatry
- Alex S Cohen
- Louisiana State University, Department of Psychology, Baton Rouge, LA 70803; Louisiana State University, Center for Computation and Technology; Quantic Innovation, Inc.
- Brian Kirkpatrick
- Quantic Innovation, Inc; Psychiatric Research Institute, University of Arkansas for Medical Sciences
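The enrichment idea in the Siegel et al. abstract above (measure the response time between speakers, then apply a cutoff to split participants into SL-Slow and SL-Normal) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the `(speaker, start, end)` turn layout, the speaker labels, clamping overlapping speech to zero, summarizing a recording by its median latency, and the 1.0-second cutoff.

```python
from statistics import median

# Each turn is (speaker, start_seconds, end_seconds); this layout is an assumption.
def participant_turn_latencies(turns, participant="participant"):
    """Gaps between the end of another speaker's turn and the start of the participant's reply."""
    latencies = []
    for prev, nxt in zip(turns, turns[1:]):
        prev_speaker, _, prev_end = prev
        next_speaker, next_start, _ = nxt
        if next_speaker == participant and prev_speaker != participant:
            # Overlapping speech gives a negative gap; clamping to 0 is an assumption.
            latencies.append(max(0.0, next_start - prev_end))
    return latencies

def classify_by_latency(latencies, cutoff_seconds):
    """Label a recording SL-Slow when its median latency exceeds the cutoff (hypothetical rule)."""
    return "SL-Slow" if median(latencies) > cutoff_seconds else "SL-Normal"

# Toy interview: two interviewer prompts, two participant replies.
turns = [
    ("interviewer", 0.0, 2.0),
    ("participant", 3.5, 6.0),
    ("interviewer", 6.4, 8.0),
    ("participant", 8.6, 10.0),
]
latencies = participant_turn_latencies(turns)
label = classify_by_latency(latencies, cutoff_seconds=1.0)  # cutoff is illustrative only
print(latencies, label)
```

In practice the turn boundaries would come from a speaker diarization step, and the cutoff would be derived from data, as the abstract describes, rather than fixed by hand.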
3
Cohen AS, Rodriguez Z, Opler M, Kirkpatrick B, Milanovic S, Piacentino D, Szabo ST, Tomioka S, Ogirala A, Koblan KS, Siegel JS, Hopkins S. Evaluating speech latencies during structured psychiatric interviews as an automated objective measure of psychomotor slowing. Psychiatry Res 2024; 340:116104. [PMID: 39137558 DOI: 10.1016/j.psychres.2024.116104] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/29/2024] [Revised: 07/23/2024] [Accepted: 07/24/2024] [Indexed: 08/15/2024]
Abstract
We sought to derive an objective measure of psychomotor slowing from speech analytics during a psychiatric interview to avoid the potential burden of dedicated neurophysiological testing. Speech latency, which reflects response time between speakers, shows promise from the literature. Speech data were obtained from 274 subjects with a diagnosis of bipolar I depression enrolled in a randomized, double-blind, 6-week phase 2 clinical trial. Audio recordings of structured Montgomery-Åsberg Depression Rating Scale (MADRS) interviews at 6 time points were examined (k = 1,352). We evaluated speech latencies, and other aspects of speech, for temporal stability, convergent validity, sensitivity/responsivity to clinical change, and generalization across seven socio-linguistically diverse countries. Speech latency was minimally associated with demographic features, and explained nearly a third of the variance in depression (categorically defined). Speech latency significantly decreased as depression symptoms improved over time, explaining nearly 20% of variance in depression remission. Classification for differentiating people with versus without concurrent depression was high (AUCs > 0.85) both cross-sectionally and longitudinally. Results replicated across countries. Other speech features offered modest incremental contribution. Neurophysiological speech parameters with face validity can be derived from psychiatric interviews without the added patient burden of additional testing.
Affiliation(s)
- Alex S Cohen
- Louisiana State University, Department of Psychology, USA; Louisiana State University, Center for Computation and Technology, USA; Quantic Innovation, Inc, USA.
- Zachary Rodriguez
- Louisiana State University, Department of Psychology, USA; Louisiana State University, Center for Computation and Technology, USA
- Mark Opler
- Quantic Innovation, Inc, USA; WCG, Inc, USA
- Brian Kirkpatrick
- Quantic Innovation, Inc, USA; Psychiatric Research Institute, University of Arkansas for Medical Sciences, USA
- Joshua S Siegel
- Sumitomo Pharmaceuticals Inc, USA; Washington University in St. Louis, Department of Psychiatry, USA
4
Olah J, Wong WLE, Chaudhry AURR, Mena O, Tang SX. Detecting schizophrenia, bipolar disorder, psychosis vulnerability and major depressive disorder from 5 minutes of online-collected speech. medRxiv 2024:2024.09.03.24313020. [PMID: 39281747 PMCID: PMC11398428 DOI: 10.1101/2024.09.03.24313020] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/18/2024]
Abstract
Background Psychosis poses substantial social and healthcare burdens. The analysis of speech is a promising approach for the diagnosis and monitoring of psychosis, capturing symptoms like thought disorder and flattened affect. Recent advancements in Natural Language Processing (NLP) methodologies enable the automated extraction of informative speech features, which has been leveraged for early psychosis detection and assessment of symptomology. However, critical gaps persist, including the absence of standardized sample collection protocols, small sample sizes, and a lack of multi-illness classification, limiting clinical applicability. Our study aimed to (1) identify an optimal assessment approach for the online and remote collection of speech in the context of assessing the psychosis spectrum, and (2) evaluate whether a fully automated, speech-based machine learning (ML) pipeline can discriminate among different conditions on the schizophrenia-bipolar spectrum (SSD-BD-SPE), help-seeking comparison subjects (MDD), and healthy controls (HC) at varying layers of analysis and diagnostic complexity. Methods We adopted online data collection methods to collect 20 minutes of speech and demographic information from individuals. Participants were categorized as "healthy" help-seekers (HC), having a schizophrenia-spectrum disorder (SSD), bipolar disorder (BD), major depressive disorder (MDD), or being on the psychosis spectrum with sub-clinical psychotic experiences (SPE). SPE status was determined based on self-reported clinical diagnosis and responses to the PHQ-8 and PQ-16 screening questionnaires, while other diagnoses were determined based on self-report from participants. Linguistic and paralinguistic features were extracted and ensemble learning algorithms (e.g., XGBoost) were used to train models. A 70%-30% train-test split and 30-fold cross-validation were used to validate the model performance.
Results The final analysis sample included 1140 individuals and 22,650 minutes of speech. Using 5 minutes of speech, our model could discriminate between HC and those with a serious mental illness (SSD or BD) with 86% accuracy (AUC = 0.91, Recall = 0.7, Precision = 0.98). Furthermore, our model could discern among HC, SPE, BD and SSD groups with 86% accuracy (F1 macro = 0.855, Recall Macro = 0.86, Precision Macro = 0.86). Finally, in a 5-class discrimination task including individuals with MDD, our model had 76% accuracy (F1 macro = 0.757, Recall Macro = 0.758, Precision Macro = 0.766). Conclusion Our ML pipeline demonstrated disorder-specific learning, achieving excellent or good accuracy across several classification tasks. We demonstrated that the screening of mental disorders is possible via a fully automated, remote speech assessment pipeline. We tested our model on a number of conditions (5 classes) that is relatively high for the literature, and in a stratified sample of the psychosis spectrum, including HC, SPE, SSD and BD (4 classes). We tested our model on a large sample (N = 1150) and demonstrated best-in-class accuracy with remotely collected speech data in the psychosis spectrum; however, further clinical validation is needed to test the reliability of model performance.
Affiliation(s)
- Sunny X Tang
- Psychiatry Research, Feinstein Institutes for Medical Research
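The Olah et al. abstract above reports macro-averaged precision, recall, and F1 for its multi-class tasks. As a reminder of what those figures mean, the sketch below computes them from scratch in Python. It assumes the common convention in which macro F1 is the unweighted mean of the per-class F1 scores; the abstract does not state which convention was used, and the labels and predictions are made-up toy data, not results from the study.

```python
def macro_scores(y_true, y_pred):
    """Return (macro precision, macro recall, macro F1) as unweighted means over classes."""
    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(labels)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n

# Toy 3-class example using the abstract's group abbreviations as labels.
y_true = ["HC", "HC", "SSD", "SSD", "BD", "BD"]
y_pred = ["HC", "SSD", "SSD", "SSD", "BD", "HC"]
macro_p, macro_r, macro_f1 = macro_scores(y_true, y_pred)
print(round(macro_p, 3), round(macro_r, 3), round(macro_f1, 3))
```

Because every class counts equally, macro averages are sensitive to performance on small classes, which is one reason they are often reported alongside plain accuracy in imbalanced multi-class settings.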
5
Luo Q, Di Y, Zhu T. Predictive modeling of neuroticism in depressed and non-depressed cohorts using voice features. J Affect Disord 2024; 352:395-402. [PMID: 38342318 DOI: 10.1016/j.jad.2024.02.021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/30/2023] [Revised: 01/30/2024] [Accepted: 02/07/2024] [Indexed: 02/13/2024]
Abstract
BACKGROUND Neuroticism's impact on psychopathological and physical health issues has significant public health implications. Multiple studies confirm its predictive effect on suicide risk among depressed patients. However, previous research lacks a standardized criterion for assessing neuroticism through speech, often relying on simple features (such as pitch, loudness and MFCCs). This study aims to improve upon this by extracting features using advanced pre-trained speaker embedding models (i-vector and x-vector extractors). Additionally, unlike prior studies utilizing general population data, we explore neuroticism prediction in depressed and non-depressed subgroups. METHODS We collected edited discourse data from clinical interviews of 3580 depressed individuals and 4016 healthy individuals from the CONVERGE study. Instead of solely extracting Low-Level Acoustic Descriptors, we incorporated i-vector and x-vector features. We compared the performance of the three feature types in predicting neuroticism and explored their combination to enhance model accuracy. RESULTS The SVR model, combining the three speech feature types with the feature set downscaled to 300 features, exhibited the highest performance in predicting neuroticism scores. It achieved a coefficient of determination (R-squared) of 0.3 or higher and a correlation of 0.56 between predicted and actual values. The predictive classification accuracy of speech features for neuroticism in specific populations (healthy and depressed) exceeded 60%. LIMITATIONS This study included only women. CONCLUSION Combining diverse speech features enhances the predictive capacity of models using speech features to assess neuroticism, particularly in specific populations. This study lays the foundation for future exploration of speech features in neuroticism prediction.
Affiliation(s)
- Qian Luo
- Institute of Psychology, Chinese Academy of Sciences, Beijing 100101, China; Department of Psychology, University of Chinese Academy of Sciences, Beijing 100049, China
- Yazheng Di
- Institute of Psychology, Chinese Academy of Sciences, Beijing 100101, China; Department of Psychology, University of Chinese Academy of Sciences, Beijing 100049, China
- Tingshao Zhu
- Institute of Psychology, Chinese Academy of Sciences, Beijing 100101, China; Department of Psychology, University of Chinese Academy of Sciences, Beijing 100049, China.
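The Luo et al. abstract above reports both a coefficient of determination (R-squared of 0.3 or higher) and a correlation of 0.56 between predicted and actual scores. The two are related but not identical: 0.56 squared is about 0.31, and R-squared equals the squared Pearson correlation only when the predictions behave like an in-sample least-squares fit; otherwise R-squared can fall below it. The Python sketch below computes both quantities on made-up numbers so the relationship can be checked by hand. The data and function names are illustrative assumptions, not values or code from the study.

```python
from math import sqrt

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Toy scores: predictions are group means of the true values, so R^2 == r^2 here.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 1.5, 3.5, 3.5]
r2 = r_squared(y_true, y_pred)
r = pearson_r(y_true, y_pred)

# Biased predictions keep the same correlation but lower R^2.
y_shifted = [p + 1.0 for p in y_pred]
r2_shifted = r_squared(y_true, y_shifted)
r_shifted = pearson_r(y_true, y_shifted)
print(r2, r, r2_shifted, r_shifted)
```

The shifted example shows why reporting both metrics is informative: correlation ignores a constant offset in the predictions, while R-squared penalizes it.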
6
Olah J, Spencer T, Cummins N, Diederen K. Automated analysis of speech as a marker of sub-clinical psychotic experiences. Front Psychiatry 2024; 14:1265880. [PMID: 38361830 PMCID: PMC10867252 DOI: 10.3389/fpsyt.2023.1265880] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/24/2023] [Accepted: 12/22/2023] [Indexed: 02/17/2024] Open
Abstract
Automated speech analysis techniques, when combined with artificial intelligence and machine learning, show potential in capturing and predicting a wide range of psychosis symptoms, garnering attention from researchers. These techniques hold promise in predicting the transition to clinical psychosis from at-risk states, as well as relapse or treatment response in individuals with clinical-level psychosis. However, challenges in scientific validation hinder the translation of these techniques into practical applications. Although sub-clinical research could help tackle most of these challenges, only a few studies in speech and psychosis research have been conducted in non-clinical populations. This review aims to facilitate such research by summarizing automated speech analytical concepts and the intersection of this field with psychosis research. First, we review the psychosis continuum and sub-clinical psychotic experiences, and the benefits of researching them. Second, we discuss the connection between speech and psychotic symptoms. Third, we give an overview of current and state-of-the-art approaches to the automated analysis of speech both in terms of language use (text-based analysis) and vocal features (audio-based analysis). Then, we review techniques applied in subclinical populations and findings in these samples. Finally, we discuss research challenges in the field, recommend future research endeavors and outline how research in subclinical populations can tackle the listed challenges.
Affiliation(s)
- Julianna Olah
- Department of Psychosis Studies, Institute of Psychiatry, Psychology and Neuroscience, King’s College London, London, United Kingdom
- Thomas Spencer
- Department of Psychosis Studies, Institute of Psychiatry, Psychology and Neuroscience, King’s College London, London, United Kingdom
- Nicholas Cummins
- Department of Biostatistics & Health Informatics, Institute of Psychiatry, Psychology and Neuroscience, King’s College London, London, United Kingdom
- Kelly Diederen
- Department of Psychosis Studies, Institute of Psychiatry, Psychology and Neuroscience, King’s College London, London, United Kingdom
7
Aziz D, Dávid S. Multitask and Transfer Learning Approach for Joint Classification and Severity Estimation of Dysphonia. IEEE Journal of Translational Engineering in Health and Medicine 2023; 12:233-244. [PMID: 38196819 PMCID: PMC10776101 DOI: 10.1109/jtehm.2023.3340345] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2023] [Revised: 11/30/2023] [Accepted: 12/04/2023] [Indexed: 01/11/2024]
Abstract
OBJECTIVE Beyond being the primary communication medium, speech carries valuable information about a speaker's health, emotions, and identity. Various conditions can affect the vocal organs, leading to speech difficulties. Extensive research has been conducted by voice clinicians and academia in speech analysis. Previous approaches primarily focused on one particular task, such as differentiating between normal and dysphonic speech, classifying different voice disorders, or estimating the severity of voice disorders. METHODS AND PROCEDURES This study proposes an approach that combines transfer learning and multitask learning (MTL) to simultaneously perform dysphonia classification and severity estimation. Both tasks use a shared representation; the network is learned from these shared features. We employed five computer vision models and changed their architecture to support multitask learning. Additionally, we conducted binary 'healthy vs. dysphonia' and multiclass 'healthy vs. organic and functional dysphonia' classification using multitask learning, with the speaker's sex as an auxiliary task. RESULTS The proposed method achieved improved performance across all classification metrics compared to single-task learning (STL), which only performs classification or severity estimation. Specifically, the model achieved F1 scores of 93% and 90% in MTL and STL, respectively. Moreover, we observed considerable improvements in both classification tasks by evaluating beta values associated with the weight assigned to the sex-predicting auxiliary task. MTL achieved an accuracy of 77% compared to the STL score of 73.2%. However, the performance of severity estimation in MTL was comparable to STL. CONCLUSION Our goal is to improve how voice pathologists and clinicians understand patients' conditions, make it easier to track their progress, and enhance the monitoring of vocal quality and treatment procedures.
Clinical and Translational Impact Statement: By integrating both classification and severity estimation of dysphonia using multitask learning, we aim to enable clinicians to gain a better understanding of the patient's situation, effectively monitor their progress and voice quality.
Affiliation(s)
- Dosti Aziz
- Department of Telecommunications and Media Informatics, Budapest University of Technology and Economics, 1117 Budapest, Hungary
- Sztahó Dávid
- Department of Telecommunications and Media Informatics, Budapest University of Technology and Economics, 1117 Budapest, Hungary
8
Gomez-Zaragoza L, Marin-Morales J, Vargas EP, Giglioli IAC, Raya MA. An Online Attachment Style Recognition System Based on Voice and Machine Learning. IEEE J Biomed Health Inform 2023; 27:5576-5587. [PMID: 37566508 DOI: 10.1109/jbhi.2023.3304369] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 08/13/2023]
Abstract
Attachment styles are known to have significant associations with mental and physical health. Specifically, insecure attachment leads individuals to higher risk of suffering from mental disorders and chronic diseases. The aim of this study is to develop an attachment recognition model that can distinguish between secure and insecure attachment styles from voice recordings, exploring the importance of acoustic features while also evaluating gender differences. A total of 199 participants recorded their responses to four open questions intended to trigger their attachment system using a web-based interrogation system. The recordings were processed to obtain the standard acoustic feature set eGeMAPS, and recursive feature elimination was applied to select the relevant features. Different supervised machine learning models were trained to recognize attachment styles using both gender-dependent and gender-independent approaches. The gender-independent model achieved a test accuracy of 58.88%, whereas the gender-dependent models obtained 63.88% and 83.63% test accuracy for women and men respectively, indicating a strong influence of gender on attachment style recognition and the need to consider them separately in further studies. These results also demonstrate the potential of acoustic properties for remote assessment of attachment style, enabling fast and objective identification of this health risk factor, and thus supporting the implementation of large-scale mobile screening systems.
9
Sprotte Y. Computerized text and voice analysis of patients with chronic schizophrenia in art therapy. Sci Rep 2023; 13:16062. [PMID: 37749186 PMCID: PMC10520069 DOI: 10.1038/s41598-023-43069-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/08/2022] [Accepted: 09/19/2023] [Indexed: 09/27/2023] Open
Abstract
This explorative study of patients with chronic schizophrenia aimed to clarify whether group art therapy followed by a therapist-guided picture review could influence patients' communication behaviour. Data on voice and speech characteristics were obtained via objective technological instruments, and these characteristics were selected as indicators of communication behaviour. Seven patients were recruited to participate in weekly group art therapy over a period of 6 months. Three days after each group meeting, they talked about their last picture during a standardized interview that was digitally recorded. The audio recordings were evaluated using validated computer-assisted procedures: the transcribed texts were evaluated using the German version of the LIWC2015 program, and the voice recordings were evaluated using the audio analysis software VocEmoApI. The dual methodological approach was intended to form an internal control of the study results. An exploratory factor analysis of the complete sets of output parameters was carried out with the expectation of obtaining typical speech and voice characteristics that map barriers to communication in patients with schizophrenia. The parameters of both methods were thus processed into five factors each, i.e., into a quantitative digitized classification of the texts and voices. The factor scores were subjected to a linear regression analysis to capture possible process-related changes. Most patients continued to participate in the study. This resulted in high-quality datasets for statistical analysis. To answer the study question, two results were summarized: First, a text analysis factor called Presence proved to be a potential surrogate parameter for positive language development. Second, quantitative changes in vocal emotional factors were detected, demonstrating differentiated activation patterns of emotions. These results can be interpreted as an expression of a cathartic healing process.
The methods presented in this study make a potentially significant contribution to quantitative research into the effectiveness and mode of action of art therapy.
Affiliation(s)
- Yvonne Sprotte
- Art Therapy Department, Dresden University of Fine Arts (Hochschule für Bildende Künste Dresden), Dresden, Germany.
10
Granrud OE, Rodriguez Z, Cowan T, Masucci MD, Cohen AS. Alogia and pressured speech do not fall on a continuum of speech production using objective speech technologies. Schizophr Res 2023; 259:121-126. [PMID: 35864001 DOI: 10.1016/j.schres.2022.07.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/30/2022] [Revised: 07/02/2022] [Accepted: 07/04/2022] [Indexed: 10/17/2022]
Abstract
Speech production is affected in a variety of serious mental illnesses (SMI; e.g., schizophrenia, unipolar depression, bipolar disorders) and at its extremes can be observed in the gross reduction of speech (e.g., alogia) or increase of speech (e.g., pressured speech). The present study evaluated whether clinically-rated alogia and pressured speech represent antithetical constructs when analyzed using objective metrics of speech production. We examined natural speech using acoustic and natural language processing features from two archival studies using several different speaking tasks and a combined 107 patients meeting criteria for SMI. Contrary to expectations, we did not find that alogia and pressured speech presented as opposing ends of a speech production continuum. Objective speech markers were associated with clinically rated alogia but not pressured speech, and these results were consistent across speaking tasks and studies. Implications for our understanding of speech production symptoms in SMI are discussed, as well as implications for Natural Language Processing and digital phenotyping efforts more generally.
Affiliation(s)
- Ole Edvard Granrud
- Louisiana State University, Department of Psychology, United States of America
- Zachary Rodriguez
- Louisiana State University, Department of Psychology, United States of America; Louisiana State University, Center for Computation and Technology, United States of America
- Tovah Cowan
- Louisiana State University, Department of Psychology, United States of America
- Michael D Masucci
- Louisiana State University, Department of Psychology, United States of America
- Alex S Cohen
- Louisiana State University, Department of Psychology, United States of America; Louisiana State University, Center for Computation and Technology, United States of America.
11
Olah J, Diederen K, Gibbs-Dean T, Kempton MJ, Dobson R, Spencer T, Cummins N. Online speech assessment of the psychotic spectrum: Exploring the relationship between overlapping acoustic markers of schizotypy, depression and anxiety. Schizophr Res 2023; 259:11-19. [PMID: 37080802 DOI: 10.1016/j.schres.2023.03.044] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 10/04/2022] [Revised: 03/22/2023] [Accepted: 03/23/2023] [Indexed: 04/22/2023]
Abstract
BACKGROUND Remote assessment of acoustic alterations in speech holds promise to increase scalability and validity in research across the psychosis spectrum. A feasible first step in establishing a procedure for online assessments is to assess acoustic alterations in psychometric schizotypy. However, to date, the complex relationship between alterations in speech related to schizotypy and those related to comorbid conditions such as symptoms of depression and anxiety has not been investigated. This study tested (1) whether depression, generalized anxiety and high psychometric schizotypy have similar voice characteristics, (2) which acoustic markers of online collected speech are the strongest predictors of psychometric schizotypy, and (3) whether including generalized anxiety and depression symptoms in the model can improve the prediction of schizotypy. METHODS We collected cross-sectional, online-recorded speech data from 441 participants, assessing demographics, symptoms of depression, generalized anxiety and psychometric schizotypy. RESULTS Speech samples collected online could predict psychometric schizotypy, depression, and anxiety symptoms with weak to moderate predictive power, and with moderate and good predictive power when basic demographic variables were added to the models. The most influential features of these models largely overlapped. The predictive power of speech marker-based models of schizotypy significantly improved after including symptom scores of depression and generalized anxiety in the models (from R2 = 0.296 to R2 = 0.436). CONCLUSIONS Acoustic features of online collected speech are predictive of psychometric schizotypy as well as generalized anxiety and depression symptoms. The acoustic characteristics of schizotypy, depression and anxiety symptoms significantly overlap. Speech models that are designed to predict schizotypy or symptoms of the schizophrenia spectrum might therefore benefit from controlling for symptoms of depression and anxiety.
Affiliation(s)
- Julianna Olah
- Institute of Psychiatry, Psychology and Neuroscience, Department of Psychosis Studies, King's College London, London SE5 8AF, UK.
- Kelly Diederen
- Institute of Psychiatry, Psychology and Neuroscience, Department of Psychosis Studies, King's College London, London SE5 8AF, UK
- Toni Gibbs-Dean
- Institute of Psychiatry, Psychology and Neuroscience, Department of Psychosis Studies, King's College London, London SE5 8AF, UK
- Matthew J Kempton
- Institute of Psychiatry, Psychology and Neuroscience, Department of Psychosis Studies, King's College London, London SE5 8AF, UK
- Richard Dobson
- Institute of Psychiatry, Psychology and Neuroscience, Department of Biostatistics & Health Informatics, King's College London, London SE5 8AF, UK
- Thomas Spencer
- Institute of Psychiatry, Psychology and Neuroscience, Department of Psychosis Studies, King's College London, London SE5 8AF, UK
- Nicholas Cummins
- Institute of Psychiatry, Psychology and Neuroscience, Department of Biostatistics & Health Informatics, King's College London, London SE5 8AF, UK
12
Tan EJ, Neill E, Kleiner JL, Rossell SL. Depressive symptoms are specifically related to speech pauses in schizophrenia spectrum disorders. Psychiatry Res 2023; 321:115079. [PMID: 36716551 DOI: 10.1016/j.psychres.2023.115079] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/17/2021] [Revised: 01/03/2023] [Accepted: 01/25/2023] [Indexed: 01/28/2023]
Abstract
Depression is a common and debilitating mental illness associated with sadness and negativity and is often comorbid with other psychiatric conditions, such as schizophrenia. Depressive symptoms are presently assessed primarily through clinical interviews; however, other behavioural indicators are being investigated as more objective methods of depressive symptom assessment. The present study aimed to evaluate the utility of assessing depression using quantitative speech parameters by comparing speech between 23 schizophrenia/schizoaffective patients with clinically significant depressive symptoms (DP), 19 schizophrenia/schizoaffective patients without depressive symptoms (NDP) and 22 healthy controls with no psychiatric history (HC). Participant audio recordings were transcribed and analyzed to extract five types of speech variables: utterances, words, speaking rate, formulation errors and pauses. The results indicated that DP patients produced significantly more pauses within utterances, and had more utterances with pauses compared to NDP patients and HCs (p < .05), who performed similarly to each other. Word, speaking rate and formulation error variables were not significantly different between the patient groups (p > .05). The findings suggest that depressive symptoms may have a specific relationship to speech pauses, and support the potential future use of speech pause assessments as an alternative and objective depression rating and monitoring tool.
Affiliation(s)
- Eric J Tan
- Centre for Mental Health and Brain Sciences, Swinburne University of Technology, Melbourne, Australia; Department of Psychiatry, St Vincent's Hospital, Melbourne, Australia.
| | - Erica Neill
- Centre for Mental Health and Brain Sciences, Swinburne University of Technology, Melbourne, Australia; Department of Psychiatry, St Vincent's Hospital, Melbourne, Australia
| | - Jacqui L Kleiner
- Centre for Mental Health and Brain Sciences, Swinburne University of Technology, Melbourne, Australia
| | - Susan L Rossell
- Centre for Mental Health and Brain Sciences, Swinburne University of Technology, Melbourne, Australia; Department of Psychiatry, St Vincent's Hospital, Melbourne, Australia
|
13
|
Castro Martínez JC, Santamaría-García H. Understanding mental health through computers: An introduction to computational psychiatry. Front Psychiatry 2023; 14:1092471. [PMID: 36824671 PMCID: PMC9941647 DOI: 10.3389/fpsyt.2023.1092471] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2022] [Accepted: 01/16/2023] [Indexed: 02/10/2023]
Abstract
Computational psychiatry has recently established itself as a new tool in the study of mental disorders and problems. Integrating different levels of analysis, creating computational phenotypes with clinical and research value, and constructing a path toward precision psychiatry are all part of this new branch. It conceptualizes the brain as a computational organ that receives parameters from the environment in order to respond to challenges through calculations and algorithms in continuous feedback and feedforward loops with a permanent degree of uncertainty. Through this conception, one can gain an understanding of cerebral and mental processes in the form of theories or hypotheses based on data. Using these approximations, a better understanding of the disorder and its different determinant factors facilitates diagnosis and treatment through an individual, ecological, and holistic approach. It is a tool that can be used to homologate and integrate multiple sources of information given by several theoretical models. In conclusion, it helps psychiatry achieve precision and reproducibility, which can help the mental health field achieve significant advancement. This article is a narrative review of the basis of the functioning of computational psychiatry with a critical analysis of its concepts.
Affiliation(s)
- Juan Camilo Castro Martínez
- Departamento de Psiquiatría y Salud Mental, Facultad de Medicina, Pontificia Universidad Javeriana, Bogotá, Colombia
| | - Hernando Santamaría-García
- Ph.D. Programa de Neurociencias, Departamento de Psiquiatría y Salud Mental, Pontificia Universidad Javeriana, Bogotá, Colombia
- Centro de Memoria y Cognición Intellectus, Hospital Universitario San Ignacio, Bogotá, Colombia
- Global Brain Health Institute, University of California, San Francisco – Trinity College Dublin, San Francisco, CA, United States
|
14
|
Daniel DG, Cohen AS, Velligan D, Harvey PD, Alphs L, Davidson M, Potter W, Kott A, Schooler N, Brodie CR, Moore RC, Lindenmeyer P, Marder SR. Remote Assessment of Negative Symptoms of Schizophrenia. SCHIZOPHRENIA BULLETIN OPEN 2023; 4:sgad001. [PMID: 39145343 PMCID: PMC11207840 DOI: 10.1093/schizbullopen/sgad001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/16/2024]
Abstract
In contrast to the validated scales for face-to-face assessment of negative symptoms, no widely accepted tools currently exist for remote monitoring of negative symptoms. Remote assessment of negative symptoms can be broadly divided into 3 categories: (1) remote administration of an existing negative-symptom scale by a clinician, in real time, using videoconference technology to communicate with the patient; (2) direct inference of negative symptoms through detection and analysis of the patient's voice, appearance, or activity by way of the patient's smartphone or other device; and (3) ecological momentary assessment, in which the patient self-reports their condition upon receipt of periodic prompts from a smartphone or other device during their daily routine. These modalities vary in cost, technological complexity, and applicability to the different negative-symptom domains. Each modality has unique strengths, weaknesses, and issues with validation. As a result, an optimal solution may be more likely to employ several techniques than to use a single tool. For remote assessment of negative symptoms to be adopted as primary or secondary endpoints in regulated clinical trials, appropriate psychometric standards will need to be met. Standards for substituting 1 set of measures for another, as well as what constitutes a "gold" reference standard, will need to be precisely defined and a process for defining them developed. Despite over 4 decades of progress toward this goal, significant work remains to be done before clinical trials addressing negative symptoms can utilize remotely assessed secondary or primary outcome measures.
Affiliation(s)
| | - Alex S Cohen
- Louisiana State University, Baton Rouge, LA, USA
| | - Dawn Velligan
- University of Texas Health Science Center at San Antonio, San Antonio, TX, USA
| | - Phillip D Harvey
- University of Miami, Miami, FL, USA
- Research Service, Bruce W. Carter VA Medical Center, Miami, FL, USA
| | | | | | | | - Alan Kott
- Signant Health, Prague, Czech Republic
| | | | - Christopher R Brodie
- Otsuka Pharmaceutical Development and Commercialization, Inc, Princeton, NJ, USA
| | | | | | - Stephen R Marder
- Semel Institute for Neuroscience at UCLA and the VA Desert Pacific Mental Illness Research, Education and Clinical Center, Los Angeles, CA, USA
|
15
|
Gumus M, DeSouza DD, Xu M, Fidalgo C, Simpson W, Robin J. Evaluating the utility of daily speech assessments for monitoring depression symptoms. Digit Health 2023; 9:20552076231180523. [PMID: 37426590 PMCID: PMC10328009 DOI: 10.1177/20552076231180523] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 09/06/2022] [Accepted: 05/19/2023] [Indexed: 07/11/2023]
Abstract
Objective: Depression is a common mental health disorder and a major public health concern, significantly interfering with the lives of those affected. The complex clinical presentation of depression complicates symptom assessments. Day-to-day fluctuations of depression symptoms within an individual bring an additional barrier, since infrequent testing may not reveal symptom fluctuation. Digital measures such as speech can facilitate daily objective symptom evaluation. Here, we evaluated the effectiveness of daily speech assessment, which can be completed remotely, at a low cost and with relatively low administrative resources, in characterizing speech fluctuations in the context of depression symptoms. Methods: Community volunteers (N = 16) completed a daily speech assessment, using the Winterlight Speech App, and the Patient Health Questionnaire-9 (PHQ-9) for 30 consecutive business days. We calculated 230 acoustic and 290 linguistic features from each individual's speech and investigated their relationship to depression symptoms at the intra-individual level through repeated measures analyses. Results: We observed that depression symptoms were linked to linguistic features, such as less frequent use of dominant and positive words. Greater depression symptomatology was also significantly correlated with acoustic features: reduced variability in speech intensity and increased jitter. Conclusions: Our findings support the feasibility of using acoustic and linguistic features as a measure of depression symptoms and propose daily speech assessment as a tool for better characterization of symptom fluctuations.
Affiliation(s)
- Melisa Gumus
- Winterlight Labs, Toronto, Ontario, Canada
- Department of Psychology, University of Toronto, Toronto, Ontario, Canada
| | | | - Mengdan Xu
- Winterlight Labs, Toronto, Ontario, Canada
| | | | - William Simpson
- Winterlight Labs, Toronto, Ontario, Canada
- McMaster University, Hamilton, Ontario, Canada
|
16
|
Bambini V, Frau F, Bischetti L, Cuoco F, Bechi M, Buonocore M, Agostoni G, Ferri I, Sapienza J, Martini F, Spangaro M, Bigai G, Cocchi F, Cavallaro R, Bosia M. Deconstructing heterogeneity in schizophrenia through language: a semi-automated linguistic analysis and data-driven clustering approach. SCHIZOPHRENIA (HEIDELBERG, GERMANY) 2022; 8:102. [PMID: 36446789 PMCID: PMC9708845 DOI: 10.1038/s41537-022-00306-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/10/2022] [Accepted: 10/24/2022] [Indexed: 06/16/2023]
Abstract
Previous works highlighted the relevance of automated language analysis for predicting diagnosis in schizophrenia, but a deeper language-based data-driven investigation of the clinical heterogeneity through the illness course has been generally neglected. Here we used a semiautomated multidimensional linguistic analysis innovatively combined with a machine-driven clustering technique to characterize the speech of 67 individuals with schizophrenia. Clusters were then compared for psychopathological, cognitive, and functional characteristics. We identified two subgroups with distinctive linguistic profiles: one with higher fluency, lower lexical variety but greater use of psychological lexicon; the other with reduced fluency, greater lexical variety but reduced psychological lexicon. The former cluster was associated with lower symptoms and better quality of life, pointing to the existence of specific language profiles, which also show clinically meaningful differences. These findings highlight the importance of considering language disturbances in schizophrenia as multifaceted and approaching them in automated and data-driven ways.
Affiliation(s)
- Valentina Bambini
- Department of Humanities and Life Sciences, University School for Advanced Studies IUSS, Pavia, Italy.
| | - Federico Frau
- Department of Humanities and Life Sciences, University School for Advanced Studies IUSS, Pavia, Italy
| | - Luca Bischetti
- Department of Humanities and Life Sciences, University School for Advanced Studies IUSS, Pavia, Italy
| | - Federica Cuoco
- Department of Clinical Neurosciences, IRCCS San Raffaele Scientific Institute, Milan, Italy
| | - Margherita Bechi
- Department of Clinical Neurosciences, IRCCS San Raffaele Scientific Institute, Milan, Italy
| | - Mariachiara Buonocore
- Department of Clinical Neurosciences, IRCCS San Raffaele Scientific Institute, Milan, Italy
| | - Giulia Agostoni
- Department of Clinical Neurosciences, IRCCS San Raffaele Scientific Institute, Milan, Italy
- School of Medicine, Vita-Salute San Raffaele University, Milan, Italy
| | - Ilaria Ferri
- Department of Clinical Neurosciences, IRCCS San Raffaele Scientific Institute, Milan, Italy
| | - Jacopo Sapienza
- Department of Clinical Neurosciences, IRCCS San Raffaele Scientific Institute, Milan, Italy
- School of Medicine, Vita-Salute San Raffaele University, Milan, Italy
| | - Francesca Martini
- Department of Clinical Neurosciences, IRCCS San Raffaele Scientific Institute, Milan, Italy
| | - Marco Spangaro
- Department of Clinical Neurosciences, IRCCS San Raffaele Scientific Institute, Milan, Italy
| | - Giorgia Bigai
- Department of Clinical Neurosciences, IRCCS San Raffaele Scientific Institute, Milan, Italy
- School of Medicine, Vita-Salute San Raffaele University, Milan, Italy
| | - Federica Cocchi
- Department of Clinical Neurosciences, IRCCS San Raffaele Scientific Institute, Milan, Italy
| | - Roberto Cavallaro
- Department of Clinical Neurosciences, IRCCS San Raffaele Scientific Institute, Milan, Italy
- School of Medicine, Vita-Salute San Raffaele University, Milan, Italy
| | - Marta Bosia
- Department of Clinical Neurosciences, IRCCS San Raffaele Scientific Institute, Milan, Italy
- School of Medicine, Vita-Salute San Raffaele University, Milan, Italy
|
17
|
Cohen AS, Rodriguez Z, Warren KK, Cowan T, Masucci MD, Edvard Granrud O, Holmlund TB, Chandler C, Foltz PW, Strauss GP. Natural Language Processing and Psychosis: On the Need for Comprehensive Psychometric Evaluation. Schizophr Bull 2022; 48:939-948. [PMID: 35738008 PMCID: PMC9434462 DOI: 10.1093/schbul/sbac051] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 12/18/2022]
Abstract
BACKGROUND AND HYPOTHESIS Despite decades of "proof of concept" findings supporting the use of Natural Language Processing (NLP) in psychosis research, clinical implementation has been slow. One obstacle reflects the lack of comprehensive psychometric evaluation of these measures. There is overwhelming evidence that criterion and content validity can be achieved for many purposes, particularly using machine learning procedures. However, there has been very little evaluation of test-retest reliability, divergent validity (sufficient to address concerns of a "generalized deficit"), and potential biases from demographics and other individual differences. STUDY DESIGN This article highlights these concerns in the development of an NLP measure for tracking clinically rated paranoia from video "selfies" recorded from smartphone devices. Patients with schizophrenia or bipolar disorder were recruited and tracked over a week-long epoch. A small NLP-based feature set from 499 language samples was modeled on clinically rated paranoia using regularized regression. STUDY RESULTS While test-retest reliability was high, criterion and convergent/divergent validity were only achieved when considering moderating variables, notably whether a patient was away from home, around strangers, or alone at the time of the recording. Moreover, there were systematic racial and sex biases in the model, in part reflecting whether patients submitted videos when they were away from home, around strangers, or alone. CONCLUSIONS Advancing NLP measures for psychosis will require deliberate consideration of test-retest reliability, divergent validity, systematic biases and the potential role of moderators. In our example, a comprehensive psychometric evaluation revealed clear strengths and weaknesses that can be systematically addressed in future research.
Affiliation(s)
- Alex S Cohen
- Louisiana State University, Department of Psychology, Baton Rouge, LA, USA
- Louisiana State University, Center for Computation and Technology, Baton Rouge, LA, USA
| | - Zachary Rodriguez
- Louisiana State University, Department of Psychology, Baton Rouge, LA, USA
- Louisiana State University, Center for Computation and Technology, Baton Rouge, LA, USA
| | - Kiara K Warren
- Louisiana State University, Department of Psychology, Baton Rouge, LA, USA
| | - Tovah Cowan
- Louisiana State University, Department of Psychology, Baton Rouge, LA, USA
| | - Michael D Masucci
- Louisiana State University, Department of Psychology, Baton Rouge, LA, USA
| | - Ole Edvard Granrud
- Louisiana State University, Department of Psychology, Baton Rouge, LA, USA
| | - Terje B Holmlund
- University of Tromsø—The Arctic University of Norway, Tromso, Norway
| | - Chelsea Chandler
- University of Colorado, Institute of Cognitive Science, Boulder, CO, USA
- University of Colorado, Department of Computer Science, Boulder, CO, USA
| | - Peter W Foltz
- University of Colorado, Institute of Cognitive Science, Boulder, CO, USA
- University of Colorado, Department of Computer Science, Boulder, CO, USA
|
18
|
Who does what to whom? graph representations of action-predication in speech relate to psychopathological dimensions of psychosis. SCHIZOPHRENIA (HEIDELBERG, GERMANY) 2022; 8:58. [PMID: 35853912 PMCID: PMC9261087 DOI: 10.1038/s41537-022-00263-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/22/2022] [Accepted: 06/01/2022] [Indexed: 11/09/2022]
Abstract
Graphical representations of speech generate powerful computational measures related to psychosis. Previous studies have mostly relied on structural relations between words as the basis of graph formation, i.e., connecting each word to the next in a sequence of words. Here, we introduced a method of graph formation grounded in semantic relationships by identifying elements that act upon each other (action relation) and the contents of those actions (predication relation). Speech from picture descriptions and open-ended narrative tasks was collected from a cross-diagnostic group of healthy volunteers and people with psychotic or non-psychotic disorders. Recordings were transcribed and underwent automated language processing, including semantic role labeling to identify action and predication relations. Structural and semantic graph features were computed using static and dynamic (moving-window) techniques. Compared to structural graphs, semantic graphs were more strongly correlated with dimensional psychosis symptoms. Dynamic features also outperformed static features, and samples from picture descriptions yielded larger effect sizes than narrative responses for psychosis diagnoses and symptom dimensions. Overall, semantic graphs captured unique and clinically meaningful information about psychosis and related symptom dimensions. These features, particularly when derived from semi-structured tasks using dynamic measurement, are meaningful additions to the repertoire of computational linguistic methods in psychiatry.
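The structural baseline described in this abstract (each word linked to the next, with features computed either once over the whole sample or over a moving window) can be illustrated with a minimal sketch. The function names, the window size of 5, and the choice of node and edge counts as features are assumptions made for illustration only; the abstract does not specify them, and the semantic (action and predication) graphs are not reproduced here.

```python
from statistics import mean

def word_graph(tokens):
    """Structural graph: one node per distinct word, one directed edge
    per distinct (word, next word) pair in the sequence."""
    nodes = set(tokens)
    edges = set(zip(tokens, tokens[1:]))
    return nodes, edges

def graph_features(tokens):
    """Static features: counts taken over the whole token sequence."""
    nodes, edges = word_graph(tokens)
    return {"nodes": len(nodes), "edges": len(edges)}

def moving_window_features(tokens, window=5, step=1):
    """Dynamic features: the static features averaged over sliding windows.
    The window size and step are hypothetical defaults."""
    if len(tokens) < window:
        windows = [tokens]
    else:
        windows = [tokens[i:i + window]
                   for i in range(0, len(tokens) - window + 1, step)]
    feats = [graph_features(w) for w in windows]
    return {key: mean(f[key] for f in feats) for key in feats[0]}

tokens = "the dog chased the cat and the cat ran".split()
static = graph_features(tokens)
dynamic = moving_window_features(tokens, window=5)
```

Averaging over windows reduces the dependence of the features on sample length, which is one plausible reason to prefer a moving-window variant when transcripts differ in word count.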
|
19
|
Hu HX, Lau WYS, Ma EPY, Hung KSY, Chen SY, Cheng KS, Cheung EFC, Lui SSY, Chan RCK. The Important Role of Motivation and Pleasure Deficits on Social Functioning in Patients With Schizophrenia: A Network Analysis. Schizophr Bull 2022; 48:860-870. [PMID: 35524755 PMCID: PMC9212088 DOI: 10.1093/schbul/sbac017] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Indexed: 11/14/2022]
Abstract
Negative symptoms, particularly the motivation and pleasure (MAP) deficits, are associated with impaired social functioning in patients with schizophrenia (SCZ). However, previous studies seldom examined the role of the MAP on social functioning while accounting for the complex interplay between other psychopathology. This network analysis study examined the network structure and interrelationship between negative symptoms (at the "symptom-dimension" and "symptom-item" levels), other psychopathology and social functioning in a sample of 269 patients with SCZ. The psychopathological symptoms were assessed using the Clinical Assessment Interview for Negative Symptoms (CAINS) and the Positive and Negative Syndrome Scale (PANSS). Social functioning was evaluated using the Social and Occupational Functioning Assessment Scale (SOFAS). Centrality indices and relative importance of each node were estimated. The network structures between male and female participants were compared. Our resultant networks at both the "symptom-dimension" and the "symptom-item" levels suggested that the MAP factor and its individual items were closely related to social functioning in SCZ patients, after controlling for the complex interplay between other nodes. Relative importance analysis showed that the MAP factor accounted for the largest proportion of variance of social functioning. This study is among the few that used network analysis and the CAINS to examine the interrelationship between negative symptoms and social functioning. Our findings supported the pivotal role of the MAP factor in determining SCZ patients' social functioning, and its potential as an intervention target for improving functional outcomes of SCZ.
Affiliation(s)
- Hui-Xin Hu
- Neuropsychology and Applied Cognitive Neuroscience Laboratory, CAS Key Laboratory of Mental Health, Institute of Psychology, Chinese Academy of Sciences, Beijing, China; Department of Psychology, University of Chinese Academy of Sciences, Beijing, China
| | - Wilson Y S Lau
- Castle Peak Hospital, Hong Kong Special Administrative Region, China
| | - Eugenia P Y Ma
- Department of Adult Psychiatry, Kwai Chung Hospital, Hong Kong Special Administrative Region, China
| | - Karen S Y Hung
- Castle Peak Hospital, Hong Kong Special Administrative Region, China
| | - Si-Yu Chen
- Neuropsychology and Applied Cognitive Neuroscience Laboratory, CAS Key Laboratory of Mental Health, Institute of Psychology, Chinese Academy of Sciences, Beijing, China; Department of Psychology, University of Chinese Academy of Sciences, Beijing, China
| | - Kin-Shing Cheng
- Department of Adult Psychiatry, Kwai Chung Hospital, Hong Kong Special Administrative Region, China
| | - Eric F C Cheung
- Castle Peak Hospital, Hong Kong Special Administrative Region, China
| | - Simon S Y Lui
- To whom correspondence should be addressed; Department of Psychiatry, The University of Hong Kong, Hong Kong Special Administrative Region, China; tel/fax: (852) 2831 5343, e-mail:
| | - Raymond C K Chan
- Neuropsychology and Applied Cognitive Neuroscience Laboratory, CAS Key Laboratory of Mental Health, Institute of Psychology, Chinese Academy of Sciences, Beijing, China; Department of Psychology, University of Chinese Academy of Sciences, Beijing, China
|
20
|
Cohen AS, Cox CR, Cowan T, Masucci MD, Le TP, Docherty AR, Bedwell JS. High Predictive Accuracy of Negative Schizotypy With Acoustic Measures. Clin Psychol Sci 2022; 10:310-323. [PMID: 38031625 PMCID: PMC10686546 DOI: 10.1177/21677026211017835] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 12/01/2023]
Abstract
Negative schizotypal traits potentially can be digitally phenotyped using objective vocal analysis. Prior attempts have shown mixed success in this regard, potentially because acoustic analysis has relied on small, constrained feature sets. We employed machine learning to (a) optimize and cross-validate predictive models of self-reported negative schizotypy using a large acoustic feature set, (b) evaluate model performance as a function of sex and speaking task, (c) understand potential mechanisms underlying negative schizotypal traits by evaluating the key acoustic features within these models, and (d) examine model performance in its convergence with clinical symptoms and cognitive functioning. Accuracy was good (> 80%) and was improved by considering speaking task and sex. However, the features identified as most predictive of negative schizotypal traits were generally not considered critical to their conceptual definitions. Implications for validating and implementing digital phenotyping to understand and quantify negative schizotypy are discussed.
Affiliation(s)
- Alex S. Cohen
- Department of Psychology, Louisiana State University
- Center for Computation and Technology, Louisiana State University
| | - Christopher R. Cox
- Department of Psychology, Louisiana State University
- Center for Computation and Technology, Louisiana State University
| | - Tovah Cowan
- Department of Psychology, Louisiana State University
- Center for Computation and Technology, Louisiana State University
| | - Michael D. Masucci
- Department of Psychology, Louisiana State University
- Center for Computation and Technology, Louisiana State University
| | - Thanh P. Le
- Department of Psychology, Louisiana State University
- Center for Computation and Technology, Louisiana State University
|
21
|
Nahar JK, Lopez-Jimenez F. Utilizing Conversational Artificial Intelligence, Voice, and Phonocardiography Analytics in Heart Failure Care. Heart Fail Clin 2022; 18:311-323. [DOI: 10.1016/j.hfc.2021.11.006] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/04/2022]
|
22
|
Hitczenko K, Cowan HR, Goldrick M, Mittal VA. Racial and Ethnic Biases in Computational Approaches to Psychopathology. Schizophr Bull 2022; 48:285-288. [PMID: 34729605 PMCID: PMC8886581 DOI: 10.1093/schbul/sbab131] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Indexed: 11/14/2022]
Affiliation(s)
- Kasia Hitczenko
- Department of Linguistics, Northwestern University, Evanston, IL, USA
| | - Henry R Cowan
- Department of Psychology, Northwestern University, Evanston, IL, USA
| | - Matthew Goldrick
- Department of Linguistics, Northwestern University, Evanston, IL, USA
- Department of Psychology, Northwestern University, Evanston, IL, USA
- Institute for Innovations in Developmental Sciences, Northwestern University, Evanston/Chicago, IL, USA
| | - Vijay A Mittal
- Department of Psychology, Northwestern University, Evanston, IL, USA
- Institute for Innovations in Developmental Sciences, Northwestern University, Evanston/Chicago, IL, USA
- Department of Psychiatry, Northwestern University, Chicago, IL, USA
- Institute for Policy Research, Northwestern University, Evanston, IL, USA
- Medical Social Sciences, Northwestern University, Chicago, IL, USA
|
23
|
Birnbaum ML, Abrami A, Heisig S, Ali A, Arenare E, Agurto C, Lu N, Kane JM, Cecchi G. Acoustic and Facial Features From Clinical Interviews for Machine Learning-Based Psychiatric Diagnosis: Algorithm Development. JMIR Ment Health 2022; 9:e24699. [PMID: 35072648 PMCID: PMC8822433 DOI: 10.2196/24699] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 10/01/2020] [Revised: 04/29/2021] [Accepted: 12/01/2021] [Indexed: 01/26/2023]
Abstract
BACKGROUND In contrast to all other areas of medicine, psychiatry is still nearly entirely reliant on subjective assessments such as patient self-report and clinical observation. The lack of objective information on which to base clinical decisions can contribute to reduced quality of care. Behavioral health clinicians need objective and reliable patient data to support effective targeted interventions. OBJECTIVE We aimed to investigate whether reliable inferences (psychiatric signs, symptoms, and diagnoses) can be extracted from audiovisual patterns in recorded evaluation interviews of participants with schizophrenia spectrum disorders and bipolar disorder. METHODS We obtained audiovisual data from 89 participants (mean age 25.3 years; male: 48/89, 53.9%; female: 41/89, 46.1%): individuals with schizophrenia spectrum disorders (n=41), individuals with bipolar disorder (n=21), and healthy volunteers (n=27). We developed machine learning models based on acoustic and facial movement features extracted from participant interviews to predict diagnoses and detect clinician-coded neuropsychiatric symptoms, and we assessed model performance using area under the receiver operating characteristic curve (AUROC) in 5-fold cross-validation. RESULTS The model successfully differentiated between schizophrenia spectrum disorders and bipolar disorder (AUROC 0.73) when aggregating face and voice features. Facial action units including cheek-raising muscle (AUROC 0.64) and chin-raising muscle (AUROC 0.74) provided the strongest signal for men. Vocal features, such as energy in the frequency band 1 to 4 kHz (AUROC 0.80) and spectral harmonicity (AUROC 0.78), provided the strongest signal for women. Lip corner-pulling muscle signal discriminated between diagnoses for both men (AUROC 0.61) and women (AUROC 0.62). Several psychiatric signs and symptoms were successfully inferred: blunted affect (AUROC 0.81), avolition (AUROC 0.72), lack of vocal inflection (AUROC 0.71), asociality (AUROC 0.63), and worthlessness (AUROC 0.61). CONCLUSIONS This study represents advancement in efforts to capitalize on digital data to improve diagnostic assessment and supports the development of a new generation of innovative clinical tools by employing acoustic and facial data analysis.
Affiliation(s)
- Michael L Birnbaum
- Department of Psychiatry, The Zucker Hillside Hospital, Northwell Health, Glen Oaks, NY, United States; The Feinstein Institute for Medical Research, Northwell Health, Manhasset, NY, United States; The Donald and Barbara Zucker School of Medicine at Hofstra/Northwell, Hempstead, NY, United States
| | - Avner Abrami
- Computational Biology Center, IBM Research, Yorktown Heights, NY, United States
| | - Stephen Heisig
- Icahn School of Medicine at Mount Sinai, New York City, NY, United States
| | - Asra Ali
- Department of Psychiatry, The Zucker Hillside Hospital, Northwell Health, Glen Oaks, NY, United States; The Feinstein Institute for Medical Research, Northwell Health, Manhasset, NY, United States
| | - Elizabeth Arenare
- Department of Psychiatry, The Zucker Hillside Hospital, Northwell Health, Glen Oaks, NY, United States; The Feinstein Institute for Medical Research, Northwell Health, Manhasset, NY, United States
| | - Carla Agurto
- Computational Biology Center, IBM Research, Yorktown Heights, NY, United States
| | - Nathaniel Lu
- Department of Psychiatry, The Zucker Hillside Hospital, Northwell Health, Glen Oaks, NY, United States; The Feinstein Institute for Medical Research, Northwell Health, Manhasset, NY, United States
| | - John M Kane
- Department of Psychiatry, The Zucker Hillside Hospital, Northwell Health, Glen Oaks, NY, United States; The Feinstein Institute for Medical Research, Northwell Health, Manhasset, NY, United States; The Donald and Barbara Zucker School of Medicine at Hofstra/Northwell, Hempstead, NY, United States
| | - Guillermo Cecchi
- Computational Biology Center, IBM Research, Yorktown Heights, NY, United States
|
24
|
Ferrer-I-Cancho R, Gómez-Rodríguez C, Esteban JL, Alemany-Puig L. Optimality of syntactic dependency distances. Phys Rev E 2022; 105:014308. [PMID: 35193296 DOI: 10.1103/physreve.105.014308] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 10/25/2020] [Accepted: 11/10/2021] [Indexed: 06/14/2023]
Abstract
It is often stated that human languages, like other biological systems, are shaped by cost-cutting pressures, but to what extent? Attempts to quantify the degree of optimality of languages by means of an optimality score have been scarce and focused mostly on English. Here we recast the problem of the optimality of the word order of a sentence as an optimization problem on a spatial network where the vertices are words, arcs indicate syntactic dependencies, and the space is defined by the linear order of the words in the sentence. We introduce a score to quantify the cognitive pressure to reduce the distance between linked words in a sentence. The analysis of sentences from 93 languages representing 19 linguistic families reveals that half of the languages are optimized to 70% or more. The score indicates that distances are not significantly reduced in a few languages and confirms two theoretical predictions: that longer sentences are more optimized and that distances are more likely to be longer than expected by chance in short sentences. We present a hierarchical ranking of languages by their degree of optimization. The score has implications for various fields of language research (dependency linguistics, typology, historical linguistics, clinical linguistics, and cognitive science). Finally, the principles behind the design of the score have implications for network science.
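The quantity underlying this abstract, the distance between syntactically linked words measured along the linear order of the sentence, can be made concrete with a short sketch. The example below only sums those distances for one sentence and compares the total with the expected total for a random word order, which for a tree of n words is (n^2 - 1)/3. It is not the optimality score the authors introduce, which the abstract does not define. The function names and the example parse are hypothetical.

```python
def dependency_distance_sum(heads):
    """Sum of |position(word) - position(head)| over all dependencies.

    heads[i] is the 1-based position of the head of word i + 1;
    0 marks the root, which has no incoming dependency.
    """
    return sum(abs((i + 1) - h) for i, h in enumerate(heads) if h != 0)

def random_baseline(n):
    """Expected distance sum when the n words of a dependency tree are
    placed in a uniformly random linear order: (n^2 - 1) / 3."""
    return (n * n - 1) / 3

# Hypothetical parse of "She read the long book":
# She -> read, read = root, the -> book, long -> book, book -> read
heads = [2, 0, 5, 5, 2]
observed = dependency_distance_sum(heads)
expected = random_baseline(len(heads))
```

Here the observed total (7) falls below the random-order expectation (8), the direction one would expect if word order is under pressure to keep linked words close together.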
Affiliation(s)
- Ramon Ferrer-i-Cancho
- Complexity and Quantitative Linguistics Lab, LARCA Research Group, Departament de Ciències de la Computació, Universitat Politècnica de Catalunya, Campus Nord, Edifici Omega, Jordi Girona Salgado 1-3 08034 Barcelona, Catalonia, Spain
| | - Carlos Gómez-Rodríguez
- Universidade da Coruña, CITIC, FASTPARSE Lab, LyS Research Group, Departamento de Ciencias de la Computación y Tecnologías de la Información, Facultade de Informática, Elviña, 15071, A Coruña, Spain
| | - Juan Luis Esteban
- Departament de Ciències de la Computació, Universitat Politècnica de Catalunya (UPC), Campus Nord, Edifici Omega, Jordi Girona Salgado 1-3 08034 Barcelona, Catalonia, Spain
| | - Lluís Alemany-Puig
- Complexity and Quantitative Linguistics Lab, LARCA Research Group, Departament de Ciències de la Computació, Universitat Politècnica de Catalunya, Campus Nord, Edifici Omega, Jordi Girona Salgado 1-3 08034 Barcelona, Catalonia, Spain
| |
|
25
|
Tan EJ, Meyer D, Neill E, Rossell SL. Investigating the diagnostic utility of speech patterns in schizophrenia and their symptom associations. Schizophr Res 2021; 238:91-98. [PMID: 34649084 DOI: 10.1016/j.schres.2021.10.003] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 03/03/2021] [Revised: 09/19/2021] [Accepted: 10/03/2021] [Indexed: 12/13/2022]
Abstract
BACKGROUND Speech disturbances are a recognised aspect of schizophrenia that may have potential utility as a diagnostic indicator. Recent advances in quantitative speech assessment methods have led to more reproducible and precise metrics making this possible. The current study sought firstly to characterise the speech profile of schizophrenia patients using quantitative speech measures, then examine the diagnostic utility of these measures and explore their relationship to symptoms. METHODS Speech recordings from 43 schizophrenia/schizoaffective disorder (SZ) patients and 46 healthy controls (HC) were obtained and transcribed. Cognitive and symptom measures were also administered. RESULTS Compared to HCs, SZ patients had higher incidences of aberrance across five types of quantitative speech variables: utterances, single words, time/speaking rate, turns and formulation errors, but not pauses. Based on two machine learning algorithms, 21 speech variables across the same five speech variable types (again not including pauses) were identified as significant classifiers for a schizophrenia diagnosis with 90-100% specificity and 80-90% sensitivity for both models. Selective relationships were also observed between these speech variables and only positive, disorganisation, excitement and formal thought disorder symptoms. CONCLUSIONS The findings support pervasive speech impairments in schizophrenia patients relative to HCs, and the potential diagnostic utility of these speech disturbances. Continued work is needed to build the evidence base for quantitative speech assessment as a future objective diagnostic tool for schizophrenia. It holds the promise of improved diagnostic accuracy leading to increased treatment efficacy and better patient outcomes.
Affiliation(s)
- Eric J Tan
- Centre for Mental Health, Swinburne University of Technology, Melbourne, Australia; Department of Psychiatry, St. Vincent's Hospital, Melbourne, Australia.
| | - Denny Meyer
- Centre for Mental Health, Swinburne University of Technology, Melbourne, Australia
| | - Erica Neill
- Centre for Mental Health, Swinburne University of Technology, Melbourne, Australia; Department of Psychiatry, St. Vincent's Hospital, Melbourne, Australia
| | - Susan L Rossell
- Centre for Mental Health, Swinburne University of Technology, Melbourne, Australia; Department of Psychiatry, St. Vincent's Hospital, Melbourne, Australia
| |
|
26
|
Fagherazzi G, Fischer A, Ismael M, Despotovic V. Voice for Health: The Use of Vocal Biomarkers from Research to Clinical Practice. Digit Biomark 2021; 5:78-88. [PMID: 34056518 PMCID: PMC8138221 DOI: 10.1159/000515346] [Citation(s) in RCA: 50] [Impact Index Per Article: 16.7] [Received: 01/05/2021] [Accepted: 02/18/2021] [Indexed: 12/17/2022] Open
Abstract
Diseases can affect organs such as the heart, lungs, brain, muscles, or vocal folds, which can then alter an individual's voice. Therefore, voice analysis using artificial intelligence opens new opportunities for healthcare. From using vocal biomarkers for diagnosis, risk prediction, and remote monitoring of various clinical outcomes and symptoms, we offer in this review an overview of the various applications of voice for health-related purposes. We discuss the potential of this rapidly evolving environment from a research, patient, and clinical perspective. We also discuss the key challenges to overcome in the near future for a substantial and efficient use of voice in healthcare.
Affiliation(s)
- Guy Fagherazzi
- Deep Digital Phenotyping Research Unit, Department of Population Health, Luxembourg Institute of Health, Strassen, Luxembourg
| | - Aurélie Fischer
- Deep Digital Phenotyping Research Unit, Department of Population Health, Luxembourg Institute of Health, Strassen, Luxembourg
| | - Muhannad Ismael
- IT for Innovation in Services Department (ITIS), Luxembourg Institute of Science and Technology (LIST), Esch-sur-Alzette, Luxembourg
| | - Vladimir Despotovic
- Department of Computer Science, Faculty of Science, Technology and Medicine, University of Luxembourg, Esch-sur-Alzette, Luxembourg
| |
|
27
|
Cohen AS, Cox CR, Tucker RP, Mitchell KR, Schwartz EK, Le TP, Foltz PW, Holmlund TB, Elvevåg B. Validating Biobehavioral Technologies for Use in Clinical Psychiatry. Front Psychiatry 2021; 12:503323. [PMID: 34177631 PMCID: PMC8225932 DOI: 10.3389/fpsyt.2021.503323] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 12/18/2019] [Accepted: 05/11/2021] [Indexed: 11/14/2022] Open
Abstract
The last decade has witnessed the development of sophisticated biobehavioral and genetic, ambulatory, and other measures that promise unprecedented insight into psychiatric disorders. As yet, clinical sciences have struggled with implementing these objective measures and they have yet to move beyond "proof of concept." In part, this struggle reflects a traditional, and conceptually flawed, application of traditional psychometrics (i.e., reliability and validity) for evaluating them. This paper focuses on "resolution," concerning the degree to which changes in a signal can be detected and quantified, which is central to measurement evaluation in informatics, engineering, computational and biomedical sciences. We define and discuss resolution in terms of traditional reliability and validity evaluation for psychiatric measures, then highlight its importance in a study using acoustic features to predict self-injurious thoughts/behaviors (SITB). This study involved tracking natural language and self-reported symptoms in 124 psychiatric patients: (a) over 5-14 recording sessions, collected using a smart phone application, and (b) during a clinical interview. Importantly, the scope of these measures varied as a function of time (minutes, weeks) and spatial setting (i.e., smart phone vs. interview). Regarding reliability, acoustic features were temporally unstable until we specified the level of temporal/spatial resolution. Regarding validity, accuracy based on machine learning of acoustic features predicting SITB varied as a function of resolution. High accuracy was achieved (i.e., ~87%), but only when the acoustic and SITB measures were "temporally-matched" in resolution was the model generalizable to new data. Unlocking the potential of biobehavioral technologies for clinical psychiatry will require careful consideration of resolution.
Affiliation(s)
- Alex S Cohen
- Department of Psychology, Louisiana State University, Baton Rouge, LA, United States; Center for Computation and Technology, Louisiana State University, Baton Rouge, LA, United States
| | - Christopher R Cox
- Department of Psychology, Louisiana State University, Baton Rouge, LA, United States
| | - Raymond P Tucker
- Department of Psychology, Louisiana State University, Baton Rouge, LA, United States
| | - Kyle R Mitchell
- Department of Psychology, Louisiana State University, Baton Rouge, LA, United States
| | - Elana K Schwartz
- Department of Psychology, Louisiana State University, Baton Rouge, LA, United States
| | - Thanh P Le
- Department of Psychology, Louisiana State University, Baton Rouge, LA, United States
| | - Peter W Foltz
- Department of Psychology, University of Colorado, Boulder, CO, United States
| | - Terje B Holmlund
- Department of Clinical Medicine, University of Tromsø-The Arctic University of Norway, Tromsø, Norway
| | - Brita Elvevåg
- Department of Clinical Medicine, University of Tromsø-The Arctic University of Norway, Tromsø, Norway; The Norwegian Center for eHealth Research, University Hospital of North Norway, Tromsø, Norway
| |
|
28
|
Kelly DL, Spaderna M, Hodzic V, Nair S, Kitchen C, Werkheiser AE, Powell MM, Liu F, Coppersmith G, Chen S, Resnik P. Blinded Clinical Ratings of Social Media Data are Correlated with In-Person Clinical Ratings in Participants Diagnosed with Either Depression, Schizophrenia, or Healthy Controls. Psychiatry Res 2020; 294:113496. [PMID: 33065372 DOI: 10.1016/j.psychres.2020.113496] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 05/13/2020] [Accepted: 10/01/2020] [Indexed: 12/16/2022]
Abstract
This study investigates clinically valid signals about psychiatric symptoms in social media data, by rating severity of psychiatric symptoms in donated, de-identified Facebook posts and comparing to in-person clinical assessments. Participants with schizophrenia (N=8), depression (N=7), or who were healthy controls (N=8) also consented to the collection of their Facebook activity from three months before the in-person assessments to six weeks after this evaluation. Depressive symptoms were assessed in-person using the Montgomery-Åsberg Depression Rating Scale (MADRS), psychotic symptoms were assessed using the Brief Psychiatric Rating Scale (BPRS), and global functioning was assessed using the Community Assessment of Psychotic Experiences (CAPE-42). Independent raters (psychiatrists, non-psychiatrist mental health clinicians, and two staff members) rated depression, psychosis, and global functioning symptoms from the social media activity of de-identified participants. The correlations between in-person clinical ratings and blinded ratings based on social media data were evaluated. Significant correlations (and trends for significance in the mixed model controlling for multiple raters) were found for psychotic symptoms, global symptom ratings and depressive symptoms. Results like these, indicating the presence of clinically valid signal in social media, are an important step toward developing computational tools that could assist clinicians by providing additional data outside the context of clinical encounters.
Affiliation(s)
- Deanna L Kelly
- University of Maryland Baltimore, School of Medicine, Baltimore, MD, USA.
| | - Max Spaderna
- University of Maryland Baltimore, School of Medicine, Baltimore, MD, USA
| | - Vedrana Hodzic
- University of Maryland Baltimore, School of Medicine, Baltimore, MD, USA
| | - Suraj Nair
- University of Maryland College Park, Department of Computer Science and Institute for Advanced Computer Studies, College Park, MD, USA
| | - Christopher Kitchen
- Center for Population Health IT, Johns Hopkins School of Public Health, Baltimore, MD, USA
| | - Anne E Werkheiser
- University of Maryland Baltimore, School of Medicine, Baltimore, MD, USA; Department of Psychology, Georgia State University, USA
| | | | - Fang Liu
- University of Maryland Baltimore, School of Medicine, Baltimore, MD, USA
| | | | - Shuo Chen
- University of Maryland Baltimore, School of Medicine, Baltimore, MD, USA
| | - Philip Resnik
- University of Maryland College Park, Department of Linguistics and Institute for Advanced Computer Studies, College Park, MD, USA
| |
|
29
|
Argolo F, Magnavita G, Mota NB, Ziebold C, Mabunda D, Pan PM, Zugman A, Gadelha A, Corcoran C, Bressan RA. Lowering costs for large-scale screening in psychosis: a systematic review and meta-analysis of performance and value of information for speech-based psychiatric evaluation. Braz J Psychiatry 2020; 42:673-686. [PMID: 32321060 PMCID: PMC7678898 DOI: 10.1590/1516-4446-2019-0722] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 09/23/2019] [Accepted: 01/23/2020] [Indexed: 11/22/2022]
Abstract
OBJECTIVE Obstacles for computational tools in psychiatry include gathering robust evidence and keeping implementation costs reasonable. We report a systematic review of automated speech evaluation for the psychosis spectrum and analyze the value of information for a screening program in a healthcare system with a limited number of psychiatrists (Maputo, Mozambique). METHODS Original studies on speech analysis for forecasting of conversion in individuals at clinical high risk (CHR) for psychosis, diagnosis of manifested psychotic disorder, and first-episode psychosis (FEP) were included in this review. Studies addressing non-verbal components of speech (e.g., pitch, tone) were excluded. RESULTS Of 168 works identified, 28 original studies were included. Valuable speech features included direct measures (e.g., relative word counting) and mathematical embeddings (e.g., word-to-vector, graphs). Accuracy estimates reported for schizophrenia diagnosis and CHR conversion ranged from 71 to 100% across studies. Studies used structured interviews, directed tasks, or prompted free speech. Directed-task protocols were faster while seemingly maintaining performance. The expected value of perfect information is USD 9.34 million. Imperfect tests would nevertheless yield high value. CONCLUSION Accuracy for screening and diagnosis was high. Larger studies are needed to enhance precision of classificatory estimates. Automated analysis presents itself as a feasible, low-cost method which should be especially useful for regions in which the physician pool is insufficient to meet demand.
Affiliation(s)
- Felipe Argolo
- Universidade Federal de São Paulo, São Paulo, SP, Brazil
- King’s College London, London, UK
| | | | - Natalia Bezerra Mota
- Brain Institute, Universidade Federal do Rio Grande do Norte, Natal, RN, Brazil
- Departamento de Física, Universidade Federal de Pernambuco (UFPE), Recife, PE, Brazil
| | | | - Dirceu Mabunda
- Faculdade de Medicina, Universidade Eduardo Mondlane, Maputo, Mozambique
| | - Pedro M. Pan
- Universidade Federal de São Paulo, São Paulo, SP, Brazil
| | - André Zugman
- National Institute of Mental Health (NIMH), Bethesda, MD, USA
| | - Ary Gadelha
- Universidade Federal de São Paulo, São Paulo, SP, Brazil
| | - Cheryl Corcoran
- Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Mental Illness Research, Education and Clinical Center (MIRECC VISN2), New York, NY, USA
| | - Rodrigo A. Bressan
- Universidade Federal de São Paulo, São Paulo, SP, Brazil
- King’s College London, London, UK
| |
|
30
|
Robin J, Harrison JE, Kaufman LD, Rudzicz F, Simpson W, Yancheva M. Evaluation of Speech-Based Digital Biomarkers: Review and Recommendations. Digit Biomark 2020; 4:99-108. [PMID: 33251474 DOI: 10.1159/000510820] [Citation(s) in RCA: 57] [Impact Index Per Article: 14.3] [Received: 06/12/2020] [Accepted: 08/11/2020] [Indexed: 12/23/2022] Open
Abstract
Speech represents a promising novel biomarker by providing a window into brain health, as shown by its disruption in various neurological and psychiatric diseases. As with many novel digital biomarkers, however, rigorous evaluation is currently lacking and is required for these measures to be used effectively and safely. This paper outlines and provides examples from the literature of evaluation steps for speech-based digital biomarkers, based on the recent V3 framework (Goldsack et al., 2020). The V3 framework describes 3 components of evaluation for digital biomarkers: verification, analytical validation, and clinical validation. Verification includes assessing the quality of speech recordings and comparing the effects of hardware and recording conditions on the integrity of the recordings. Analytical validation includes checking the accuracy and reliability of data processing and computed measures, including understanding test-retest reliability, demographic variability, and comparing measures to reference standards. Clinical validity involves verifying the correspondence of a measure to clinical outcomes which can include diagnosis, disease progression, or response to treatment. For each of these sections, we provide recommendations for the types of evaluation necessary for speech-based biomarkers and review published examples. The examples in this paper focus on speech-based biomarkers, but they can be used as a template for digital biomarker development more generally.
Affiliation(s)
| | - John E Harrison
- Metis Cognition Ltd., Park House, Kilmington Common, Warminster, United Kingdom; Alzheimer Center, AUmc, Amsterdam, The Netherlands; Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, United Kingdom
| | | | - Frank Rudzicz
- Li Ka Shing Knowledge Institute, St Michael's Hospital, Toronto, Ontario, Canada; Department of Computer Science, University of Toronto, Toronto, Ontario, Canada; Vector Institute for Artificial Intelligence, Toronto, Ontario, Canada
| | - William Simpson
- Winterlight Labs, Toronto, Ontario, Canada; Department of Psychiatry and Behavioural Neuroscience, McMaster University, Hamilton, Ontario, Canada
| | | |
|
31
|
Cohen AS, Cox CR, Le TP, Cowan T, Masucci MD, Strauss GP, Kirkpatrick B. Using machine learning of computerized vocal expression to measure blunted vocal affect and alogia. NPJ Schizophr 2020; 6:26. [PMID: 32978400 PMCID: PMC7519104 DOI: 10.1038/s41537-020-00115-2] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 03/06/2020] [Accepted: 08/06/2020] [Indexed: 11/16/2022]
Abstract
Negative symptoms are a transdiagnostic feature of serious mental illness (SMI) that can be potentially “digitally phenotyped” using objective vocal analysis. In prior studies, vocal measures show low convergence with clinical ratings, potentially because analysis has used small, constrained acoustic feature sets. We sought to evaluate (1) whether clinically rated blunted vocal affect (BvA)/alogia could be accurately modelled using machine learning (ML) with a large feature set from two separate tasks (i.e., a 20-s “picture” and a 60-s “free-recall” task), (2) whether “Predicted” BvA/alogia (computed from the ML model) are associated with demographics, diagnosis, psychiatric symptoms, and cognitive/social functioning, and (3) which key vocal features are central to BvA/alogia ratings. Accuracy was high (>90%) and was improved when computed separately by speaking task. ML scores were associated with poor cognitive performance and social functioning and were higher in patients with schizophrenia versus depression or mania diagnoses. However, the features identified as most predictive of BvA/alogia were generally not considered critical to their operational definitions. Implications for validating and implementing digital phenotyping to reduce SMI burden are discussed.
Affiliation(s)
- Alex S Cohen
- Department of Psychology, Louisiana State University, Baton Rouge, LA, USA; Center for Computation and Technology, Louisiana State University, Baton Rouge, LA, 70803, USA.
| | - Christopher R Cox
- Department of Psychology, Louisiana State University, Baton Rouge, LA, USA
| | - Thanh P Le
- Department of Psychology, Louisiana State University, Baton Rouge, LA, USA; Center for Computation and Technology, Louisiana State University, Baton Rouge, LA, 70803, USA
| | - Tovah Cowan
- Department of Psychology, Louisiana State University, Baton Rouge, LA, USA; Center for Computation and Technology, Louisiana State University, Baton Rouge, LA, 70803, USA
| | - Michael D Masucci
- Department of Psychology, Louisiana State University, Baton Rouge, LA, USA; Center for Computation and Technology, Louisiana State University, Baton Rouge, LA, 70803, USA
| | | | - Brian Kirkpatrick
- Department of Psychiatry and Behavioral Sciences, University of Nevada, Reno, USA
| |
|
32
|
|
33
|
Makowski C, Lewis JD, Lepage C, Malla AK, Joober R, Evans AC, Lepage M. Intersection of verbal memory and expressivity on cortical contrast and thickness in first episode psychosis. Psychol Med 2020; 50:1923-1936. [PMID: 31456533 DOI: 10.1017/s0033291719002071] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 01/13/2023]
Abstract
BACKGROUND Longitudinal studies of first episode of psychosis (FEP) patients are critical to understanding the dynamic clinical factors influencing functional outcomes; negative symptoms and verbal memory (VM) deficits are two such factors that remain a therapeutic challenge. This study uses white-gray matter contrast at the inner edge of the cortex, in addition to cortical thickness, to probe changes in microstructure and their relation with negative symptoms and possible intersections with verbal memory. METHODS T1-weighted images and clinical data were collected longitudinally for patients (N = 88) over a two-year period. Cognitive data were also collected at baseline. Relationships between baseline VM (immediate/delayed recall) and rate of change in two negative symptom dimensions, amotivation and expressivity, were assessed at the behavioral level, as well as at the level of brain structure. RESULTS VM, particularly immediate recall, was significantly and positively associated with a steeper rate of expressivity symptom decline (r = 0.32, q = 0.012). Significant interaction effects between baseline delayed recall and change in expressivity were uncovered in somatomotor regions bilaterally for both white-gray matter contrast and cortical thickness. Furthermore, interaction effects between immediate recall and change in expressivity on cortical thickness rates were uncovered across higher-order regions of the language processing network. CONCLUSIONS This study shows common neural correlates of language-related brain areas underlying expressivity and VM in FEP, suggesting deficits in these domains may be more linked to speech production rather than general cognitive capacity. Together, white-gray matter contrast and cortical thickness may optimally inform clinical investigations aiming to capture peri-cortical microstructural changes.
Affiliation(s)
- Carolina Makowski
- McGill Centre for Integrative Neuroscience, McGill University, Montreal, Canada
- McConnell Brain Imaging Centre, Montreal Neurological Institute, Montreal, Canada
- Ludmer Centre for Neuroinformatics and Mental Health, Montreal, Canada
- Department of Psychiatry, McGill University, Verdun, Canada
| | - John D Lewis
- McGill Centre for Integrative Neuroscience, McGill University, Montreal, Canada
- McConnell Brain Imaging Centre, Montreal Neurological Institute, Montreal, Canada
- Ludmer Centre for Neuroinformatics and Mental Health, Montreal, Canada
| | - Claude Lepage
- McGill Centre for Integrative Neuroscience, McGill University, Montreal, Canada
- McConnell Brain Imaging Centre, Montreal Neurological Institute, Montreal, Canada
- Ludmer Centre for Neuroinformatics and Mental Health, Montreal, Canada
| | - Ashok K Malla
- Department of Psychiatry, McGill University, Verdun, Canada
- Prevention and Early Intervention Program for Psychosis, Douglas Mental Health University Institute, Verdun, Canada
| | - Ridha Joober
- Department of Psychiatry, McGill University, Verdun, Canada
- Prevention and Early Intervention Program for Psychosis, Douglas Mental Health University Institute, Verdun, Canada
| | - Alan C Evans
- McGill Centre for Integrative Neuroscience, McGill University, Montreal, Canada
- McConnell Brain Imaging Centre, Montreal Neurological Institute, Montreal, Canada
- Ludmer Centre for Neuroinformatics and Mental Health, Montreal, Canada
| | - Martin Lepage
- Department of Psychiatry, McGill University, Verdun, Canada
- Prevention and Early Intervention Program for Psychosis, Douglas Mental Health University Institute, Verdun, Canada
| |
|
34
|
Cohen AS, Cowan T, Le TP, Schwartz EK, Kirkpatrick B, Raugh IM, Chapman HC, Strauss GP. Ambulatory digital phenotyping of blunted affect and alogia using objective facial and vocal analysis: Proof of concept. Schizophr Res 2020; 220:141-146. [PMID: 32247747 PMCID: PMC7306442 DOI: 10.1016/j.schres.2020.03.043] [Citation(s) in RCA: 30] [Impact Index Per Article: 7.5] [Received: 05/12/2019] [Revised: 01/10/2020] [Accepted: 03/21/2020] [Indexed: 11/28/2022]
Abstract
Negative symptoms reflect one of the most debilitating aspects of one of the most debilitating diseases known to humankind. As yet, our treatments for negative symptoms are palliative at best and our understanding of their causes is relatively superficial. To address this, we are developing objective ambulatory tools for digitally phenotyping their severity which can be used outside the confines of the traditional clinical and research settings. The present study evaluated the feasibility, reliability and validity of ambulatory vocal acoustic and facial emotion expression analysis. Videos were provided by 25 patients with schizophrenia or schizoaffective disorder and 27 nonpsychiatric controls using inexpensive, non-invasive ambulatory recording methods. Controls provided 411 video recordings, and patients provided 377 video recordings; an average of 15.22 and 14.50 per participant per group respectively. The vast majority (over 80%) of these videos were usable for analysis. An empirically-supported, limited-feature vocal (7 features) and facial (3 features) set was examined. Within participants, these features varied considerably over time, but showed moderate to good test-retest reliability in many cases once contextual factors (e.g., activity involved in at the time of testing) were accounted for. Vocal and facial features showed statistically significant convergence with a "gold standard" negative symptom measure. Ambulatory vocal/facial features were more strongly associated with engagement in social or work activities in patients than negative symptom ratings. These data support the use of ambulatory vocal/facial analytic technologies for digital phenotyping of these negative symptoms.
Affiliation(s)
- Alex S. Cohen
- Louisiana State University, Department of Psychology, 236 Audubon Hall, Louisiana State University, Baton Rouge, LA, USA, 70803
| | - Tovah Cowan
- Louisiana State University, Department of Psychology, 236 Audubon Hall, Louisiana State University, Baton Rouge, LA, USA, 70803
| | - Thanh P. Le
- Louisiana State University, Department of Psychology, 236 Audubon Hall, Louisiana State University, Baton Rouge, LA, USA, 70803
| | - Elana K. Schwartz
- Louisiana State University, Department of Psychology, 236 Audubon Hall, Louisiana State University, Baton Rouge, LA, USA, 70803
| | - Brian Kirkpatrick
- University of Nevada, Reno School of Medicine, Psychiatry & Behavioral Sciences, 5190 Neil Rd #215, Reno, NV, USA, 89502
| | - Ian M. Raugh
- University of Georgia, Department of Psychology, 125 Baldwin St, Athens, GA, USA, 30602
| | - Hannah C. Chapman
- University of Georgia, Department of Psychology, 125 Baldwin St, Athens, GA, USA, 30602
| | - Gregory P. Strauss
- University of Georgia, Department of Psychology, 125 Baldwin St, Athens, GA, USA, 30602
| |
|
35
|
Agurto C, Cecchi GA, Norel R, Ostrand R, Kirkpatrick M, Baggott MJ, Wardle MC, de Wit H, Bedi G. Detection of acute 3,4-methylenedioxymethamphetamine (MDMA) effects across protocols using automated natural language processing. Neuropsychopharmacology 2020; 45:823-832. [PMID: 31978933 PMCID: PMC7075895 DOI: 10.1038/s41386-020-0620-4] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 08/23/2019] [Revised: 11/28/2019] [Accepted: 01/08/2020] [Indexed: 11/17/2022]
Abstract
The detection of changes in mental states such as those caused by psychoactive drugs relies on clinical assessments that are inherently subjective. Automated speech analysis may represent a novel method to detect objective markers, which could help improve the characterization of these mental states. In this study, we employed computer-extracted speech features from multiple domains (acoustic, semantic, and psycholinguistic) to assess mental states after controlled administration of 3,4-methylenedioxymethamphetamine (MDMA) and intranasal oxytocin. The training/validation set comprised within-participants data from 31 healthy adults who, over four sessions, were administered MDMA (0.75, 1.5 mg/kg), oxytocin (20 IU), and placebo in randomized, double-blind fashion. Participants completed two 5-min speech tasks during peak drug effects. Analyses included group-level comparisons of drug conditions and estimation of classification at the individual level within this dataset and on two independent datasets. Promising classification results were obtained to detect drug conditions, achieving cross-validated accuracies of up to 87% in training/validation and 92% in the independent datasets, suggesting that the detected patterns of speech variability are associated with drug consumption. Specifically, we found that oxytocin seems to be mostly driven by changes in emotion and prosody, which are mainly captured by acoustic features. In contrast, mental states driven by MDMA consumption appear to manifest in multiple domains of speech. Furthermore, we find that the experimental task has an effect on the speech response within these mental states, which can be attributed to presence or absence of an interaction with another individual. These results represent a proof-of-concept application of the potential of speech to provide an objective measurement of mental states elicited during intoxication.
Affiliation(s)
- Carla Agurto
- Computational Biology Center - Neuroscience, IBM T.J. Watson Research Center, Yorktown Heights, NY, USA
| | - Guillermo A Cecchi
- Computational Biology Center - Neuroscience, IBM T.J. Watson Research Center, Yorktown Heights, NY, USA.
| | - Raquel Norel
- Computational Biology Center - Neuroscience, IBM T.J. Watson Research Center, Yorktown Heights, NY, USA
| | - Rachel Ostrand
- Computational Biology Center - Neuroscience, IBM T.J. Watson Research Center, Yorktown Heights, NY, USA
| | - Matthew Kirkpatrick
- Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
| | - Matthew J Baggott
- Addiction and Pharmacology Research Laboratory, Friends Research Institute, San Francisco, CA, USA
| | - Margaret C Wardle
- Department of Psychology, University of Illinois at Chicago, Chicago, IL, USA
| | - Harriet de Wit
- Human Behavioral Pharmacology Laboratory, Department of Psychiatry and Behavioral Neuroscience, University of Chicago, Chicago, IL, USA
| | - Gillinder Bedi
- Centre for Youth Mental Health, University of Melbourne, and Orygen National Centre of Excellence in Youth Mental Health, Melbourne, Australia
| |
|
36
|
Cohen AS, Schwartz E, Le T, Cowan T, Cox C, Tucker R, Foltz P, Holmlund TB, Elvevåg B. Validating digital phenotyping technologies for clinical use: the critical importance of "resolution". World Psychiatry 2020; 19:114-115. [PMID: 31922662 PMCID: PMC6953543 DOI: 10.1002/wps.20703] [Citation(s) in RCA: 32] [Impact Index Per Article: 8.0] [Indexed: 11/08/2022] Open
Affiliation(s)
- Alex S. Cohen
- Department of Psychology, Louisiana State University, Baton Rouge, LA, USA
| | - Elana Schwartz
- Department of Psychology, Louisiana State University, Baton Rouge, LA, USA
| | - Thanh Le
- Department of Psychology, Louisiana State University, Baton Rouge, LA, USA
| | - Tovah Cowan
- Department of Psychology, Louisiana State University, Baton Rouge, LA, USA
| | - Christopher Cox
- Department of Psychology, Louisiana State University, Baton Rouge, LA, USA
| | - Raymond Tucker
- Department of Psychology, Louisiana State University, Baton Rouge, LA, USA
| | - Peter Foltz
- Department of Psychology, University of Colorado, Boulder, CO, USA
| | - Terje B. Holmlund
- Department of Clinical Medicine, University of Tromsø ‐ Arctic University of Norway, Tromsø, Norway
| | - Brita Elvevåg
- Department of Clinical Medicine, University of Tromsø ‐ Arctic University of Norway, Tromsø, Norway
| |
|
37
|
Parola A, Simonsen A, Bliksted V, Fusaroli R. Voice patterns in schizophrenia: A systematic review and Bayesian meta-analysis. Schizophr Res 2020; 216:24-40. [PMID: 31839552 DOI: 10.1016/j.schres.2019.11.031] [Citation(s) in RCA: 45] [Impact Index Per Article: 11.3] [Received: 04/18/2019] [Revised: 09/13/2019] [Accepted: 11/19/2019] [Indexed: 12/28/2022]
Abstract
Voice atypicalities have been a characteristic feature of schizophrenia since its first definitions. They are often associated with core negative symptoms such as flat affect and alogia, and with the social impairments seen in the disorder. This suggests that voice atypicalities may represent a marker of clinical features and social functioning in schizophrenia. We systematically reviewed and meta-analyzed the evidence for distinctive acoustic patterns in schizophrenia, as well as their relation to clinical features. We identified 46 articles, including 55 studies with a total of 1254 patients with schizophrenia and 699 healthy controls. Summary effect size estimates (Hedges' g and Pearson's r) were calculated using multilevel Bayesian modeling. We identified weak atypicalities in pitch variability (g = -0.55) related to flat affect, and stronger atypicalities in proportion of spoken time, speech rate, and pauses (g's between -0.75 and -1.89) related to alogia and flat affect. However, the effects were mostly modest (with the important exception of pause duration) compared to perceptual and clinical judgments, and characterized by large heterogeneity between studies. Moderator analyses revealed that tasks with a more demanding cognitive and social component showed larger effects both in contrasting patients and controls and in assessing symptomatology. In conclusion, studies of acoustic patterns are a promising, but as yet unsystematic, avenue for establishing markers of schizophrenia. We outline recommendations towards more cumulative, open, and theory-driven research.
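For readers unfamiliar with the effect size reported in this abstract, the following is a minimal sketch of Hedges' g for two independent groups. The function name and the example numbers are illustrative assumptions, not data from the meta-analysis, and the authors pooled estimates with multilevel Bayesian models rather than with this direct computation.

```python
import math


def hedges_g(mean1, sd1, n1, mean2, sd2, n2):
    """Standardized mean difference with a small-sample correction.

    Hypothetical helper for illustration; a negative value means
    group 1 scores lower than group 2.
    """
    df = n1 + n2 - 2
    # Pooled variance weights each group's variance by its degrees of freedom.
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    cohens_d = (mean1 - mean2) / math.sqrt(pooled_var)
    # Common approximation of the correction factor J.
    correction = 1 - 3 / (4 * df - 1)
    return cohens_d * correction


# Made-up example: patients vs. controls on some acoustic measure.
g = hedges_g(10.0, 2.0, 20, 12.0, 2.0, 20)
print(round(g, 3))
```

The correction factor shrinks the estimate slightly toward zero, which matters most when the groups are small.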
Affiliation(s)
| | - Arndis Simonsen
- Psychosis Research Unit - Department of Clinical Medicine, Aarhus University, Denmark; The Interacting Minds Center - School of Culture and Society, Aarhus University, Denmark
| | - Vibeke Bliksted
- Psychosis Research Unit - Department of Clinical Medicine, Aarhus University, Denmark; The Interacting Minds Center - School of Culture and Society, Aarhus University, Denmark
| | - Riccardo Fusaroli
- The Interacting Minds Center - School of Culture and Society, Aarhus University, Denmark; Department of Linguistics, Semiotics and Cognitive Science - School of Communication and Culture, Aarhus University, Denmark
| |
|
38
|
Low DM, Bentley KH, Ghosh SS. Automated assessment of psychiatric disorders using speech: A systematic review. Laryngoscope Investig Otolaryngol 2020; 5:96-116. [PMID: 32128436 PMCID: PMC7042657 DOI: 10.1002/lio2.354] [Citation(s) in RCA: 156] [Impact Index Per Article: 39.0] [Received: 09/05/2019] [Revised: 12/31/2019] [Accepted: 01/17/2020] [Indexed: 12/31/2022] Open
Abstract
OBJECTIVE There are many barriers to accessing mental health assessments including cost and stigma. Even when individuals receive professional care, assessments are intermittent and may be limited partly due to the episodic nature of psychiatric symptoms. Therefore, machine-learning technology using speech samples obtained in the clinic or remotely could one day be a biomarker to improve diagnosis and treatment. To date, reviews have only focused on using acoustic features from speech to detect depression and schizophrenia. Here, we present the first systematic review of studies using speech for automated assessments across a broader range of psychiatric disorders. METHODS We followed the Preferred Reporting Items for Systematic Reviews and Meta-Analysis (PRISMA) guidelines. We included studies from the last 10 years using speech to identify the presence or severity of disorders within the Diagnostic and Statistical Manual of Mental Disorders (DSM-5). For each study, we describe sample size, clinical evaluation method, speech-eliciting tasks, machine learning methodology, performance, and other relevant findings. RESULTS 1395 studies were screened of which 127 studies met the inclusion criteria. The majority of studies were on depression, schizophrenia, and bipolar disorder, and the remaining on post-traumatic stress disorder, anxiety disorders, and eating disorders. 63% of studies built machine learning predictive models, and the remaining 37% performed null-hypothesis testing only. We provide an online database with our search results and synthesize how acoustic features appear in each disorder. CONCLUSION Speech processing technology could aid mental health assessments, but there are many obstacles to overcome, especially the need for comprehensive transdiagnostic and longitudinal studies. 
Given the diverse types of data sets, feature extraction, computational methodologies, and evaluation criteria, we provide guidelines for both acquiring data and building machine learning models with a focus on testing hypotheses, open science, reproducibility, and generalizability. LEVEL OF EVIDENCE 3a.
Affiliation(s)
- Daniel M. Low
- Program in Speech and Hearing Bioscience and Technology, Harvard Medical School, Boston, Massachusetts
- Department of Brain and Cognitive Sciences, MIT, Cambridge, Massachusetts
| | - Kate H. Bentley
- Department of Psychiatry, Massachusetts General Hospital/Harvard Medical School, Boston, Massachusetts
- McGovern Institute for Brain Research, MIT, Cambridge, Massachusetts
| | - Satrajit S. Ghosh
- Program in Speech and Hearing Bioscience and Technology, Harvard Medical School, Boston, Massachusetts
- McGovern Institute for Brain Research, MIT, Cambridge, Massachusetts
- Department of Otolaryngology, Head and Neck Surgery, Harvard Medical School, Boston, Massachusetts
| |
|
39
|
Arevian AC, Bone D, Malandrakis N, Martinez VR, Wells KB, Miklowitz DJ, Narayanan S. Clinical state tracking in serious mental illness through computational analysis of speech. PLoS One 2020; 15:e0225695. [PMID: 31940347 PMCID: PMC6961853 DOI: 10.1371/journal.pone.0225695] [Citation(s) in RCA: 25] [Impact Index Per Article: 6.3] [Received: 12/19/2017] [Accepted: 11/11/2019] [Indexed: 11/19/2022] Open
Abstract
Individuals with serious mental illness experience changes in their clinical states over time that are difficult to assess and that result in increased disease burden and care utilization. It is not known if features derived from speech can serve as a transdiagnostic marker of these clinical states. This study evaluates the feasibility of collecting speech samples from people with serious mental illness and explores the potential utility for tracking changes in clinical state over time. Patients (n = 47) were recruited from a community-based mental health clinic with diagnoses of bipolar disorder, major depressive disorder, schizophrenia or schizoaffective disorder. Patients used an interactive voice response system for at least 4 months to provide speech samples. Clinic providers (n = 13) reviewed responses and provided global assessment ratings. We computed features of speech and used machine learning to create models of outcome measures trained using either population data or an individual's own data over time. The system was feasible to use, recording 1101 phone calls and 117 hours of speech. Most (92%) of the patients agreed that it was easy to use. The individually-trained models demonstrated the highest correlation with provider ratings (rho = 0.78, p<0.001). Population-level models demonstrated statistically significant correlations with provider global assessment ratings (rho = 0.44, p<0.001), future provider ratings (rho = 0.33, p<0.05), BASIS-24 summary score, depression sub score, and self-harm sub score (rho = 0.25, 0.25, and 0.28, respectively; p<0.05), and the SF-12 mental health sub score (rho = 0.25, p<0.05), but not with other BASIS-24 or SF-12 sub scores. This study brings together longitudinal collection of objective behavioral markers along with a transdiagnostic, personalized approach for tracking of mental health clinical state in a community-based clinical setting.
Affiliation(s)
- Armen C. Arevian
- Jane and Terry Semel Institute for Neuroscience and Human Behavior, University of California Los Angeles, Los Angeles, CA, United States of America
| | - Daniel Bone
- Signal Analysis and Interpretation Lab, University of Southern California, Los Angeles, CA, United States of America
| | - Nikolaos Malandrakis
- Signal Analysis and Interpretation Lab, University of Southern California, Los Angeles, CA, United States of America
| | - Victor R. Martinez
- Signal Analysis and Interpretation Lab, University of Southern California, Los Angeles, CA, United States of America
| | - Kenneth B. Wells
- Jane and Terry Semel Institute for Neuroscience and Human Behavior, University of California Los Angeles, Los Angeles, CA, United States of America
- RAND Corporation, Santa Monica, CA, United States of America
| | - David J. Miklowitz
- Jane and Terry Semel Institute for Neuroscience and Human Behavior, University of California Los Angeles, Los Angeles, CA, United States of America
| | - Shrikanth Narayanan
- Signal Analysis and Interpretation Lab, University of Southern California, Los Angeles, CA, United States of America
| |
|
40
|
Lundin NB, Hochheiser J, Minor KS, Hetrick WP, Lysaker PH. Piecing together fragments: Linguistic cohesion mediates the relationship between executive function and metacognition in schizophrenia. Schizophr Res 2020; 215:54-60. [PMID: 31784337 PMCID: PMC8106973 DOI: 10.1016/j.schres.2019.11.032] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 04/07/2019] [Revised: 08/24/2019] [Accepted: 11/19/2019] [Indexed: 12/28/2022]
Abstract
Speech disturbances are prevalent in psychosis. These may arise in part from executive function impairment, as research suggests that inhibition and monitoring are associated with production of cohesive discourse. However, it is not yet understood how linguistic and executive function impairments in psychosis interact with disrupted metacognition, or deficits in the ability to integrate information to form a complex sense of oneself and others and use that synthesis to respond to psychosocial challenges. Whereas discourse studies have historically employed manual hand-coding techniques, automated computational tools can characterize deep semantic structures that may be closely linked with metacognition. In the present study, we examined whether higher executive functioning promotes metacognition by way of altering linguistic cohesion. Ninety-four individuals with schizophrenia-spectrum disorders provided illness narratives and completed an executive function task battery (Delis-Kaplan Executive Function System). We assessed the narratives for linguistic cohesion (Coh-Metrix 3.0) and metacognitive capacity (Metacognition Assessment Scale - Abbreviated). Selected linguistic indices measured the frequency of connections between causal and intentional content (deep cohesion), word and theme overlap (referential cohesion), and unique word usage (lexical diversity). In path analyses using bootstrapped confidence intervals, we found that deep cohesion and lexical diversity independently mediated the relationship between executive functioning and metacognitive capacity. Findings suggest that executive control abilities support integration of mental experiences by way of increasing causal, goal-driven speech and word expression in individuals with schizophrenia. Metacognitive-based therapeutic interventions for psychosis may promote insight and recovery in part by scaffolding use of language that links ideas together.
Affiliation(s)
- Nancy B Lundin
- Department of Psychological and Brain Sciences and Program in Neuroscience, Indiana University, 1101 E. 10th Street, Bloomington, IN 47405, United States.
| | - Jesse Hochheiser
- Department of Psychiatry, Richard L. Roudebush VA Medical Center, 1481 W. 10th Street, Indianapolis, IN 46202, United States
| | - Kyle S Minor
- Department of Psychology, Indiana University Purdue University Indianapolis, 402 N. Blackford Street, Indianapolis, IN 46202, United States.
| | - William P Hetrick
- Department of Psychological and Brain Sciences and Program in Neuroscience, Indiana University, 1101 E. 10th Street, Bloomington, IN 47405, United States.
| | - Paul H Lysaker
- Department of Psychiatry, Richard L. Roudebush VA Medical Center, 1481 W. 10th Street, Indianapolis, IN 46202, United States; Indiana University School of Medicine, Department of Psychiatry, Indianapolis, IN.
| |
|
41
|
Wang J, Zhang L, Liu T, Pan W, Hu B, Zhu T. Acoustic differences between healthy and depressed people: a cross-situation study. BMC Psychiatry 2019; 19:300. [PMID: 31615470 PMCID: PMC6794822 DOI: 10.1186/s12888-019-2300-7] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Received: 08/06/2018] [Accepted: 09/20/2019] [Indexed: 11/29/2022] Open
Abstract
BACKGROUND Abnormalities in vocal expression during a depressed episode have frequently been reported in people with depression, but less is known about whether these abnormalities only exist in special situations. In addition, the impacts of irrelevant demographic variables on voice were uncontrolled in previous studies. Therefore, this study compares the vocal differences between depressed and healthy people under various situations with irrelevant variables being regarded as covariates. METHODS To examine whether the vocal abnormalities in people with depression only exist in special situations, this study compared the vocal differences between healthy people and patients with unipolar depression in 12 situations (speech scenarios). Positive, negative and neutral voice expressions between depressed and healthy people were compared in four tasks. Multivariate analysis of covariance (MANCOVA) was used for evaluating the main effects of variable group (depressed vs. healthy) on acoustic features. The significance of acoustic features was evaluated by both statistical significance and magnitude of effect size. RESULTS The results of multivariate analysis of covariance showed that significant differences between the two groups were observed in all 12 speech scenarios. Although significant acoustic features were not the same in different scenarios, we found that three acoustic features (loudness, MFCC5 and MFCC7) were consistently different between people with and without depression with large effect magnitude. CONCLUSIONS Vocal differences between depressed and healthy people exist in 12 scenarios. Acoustic features including loudness, MFCC5 and MFCC7 have potential as indicators for identifying depression via voice analysis. These findings support that depressed people's voices include both situation-specific and cross-situational patterns of acoustic features.
Affiliation(s)
- Jingying Wang
- Institute of Psychology, Chinese Academy of Sciences, Beijing, China
| | - Lei Zhang
- Department of Computer Science, Virginia Tech, Blacksburg, VA, USA
| | - Tianli Liu
- Institute of Population Research, Peking University, Beijing, China
| | - Wei Pan
- Institute of Psychology, Chinese Academy of Sciences, Beijing, China
| | - Bin Hu
- School of Information Science and Engineering, Lanzhou University, Lanzhou, Gansu Province, China
| | - Tingshao Zhu
- Institute of Psychology, Chinese Academy of Sciences, Beijing, China
| |
|
42
|
Cohen AS, Fedechko T, Schwartz EK, Le TP, Foltz PW, Bernstein J, Cheng J, Rosenfeld E, Elvevåg B. Psychiatric Risk Assessment from the Clinician's Perspective: Lessons for the Future. Community Ment Health J 2019; 55:1165-1172. [PMID: 31154587 DOI: 10.1007/s10597-019-00411-x] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 04/10/2018] [Accepted: 05/13/2019] [Indexed: 01/30/2023]
Abstract
Accurate prediction of risk-states in Serious Mental Illnesses (SMIs) is critical for reducing their massive societal burden. Risk-state assessments are notably inaccurate. Recent innovations, including widely available and inexpensive mobile technologies for ambulatory "biobehavioral" data, can reshape risk assessment. To help understand and accelerate clinician involvement, we surveyed 90 multi-disciplinary clinicians serving SMI populations in various settings to evaluate how risk assessment is conducted and can improve. Clinicians reported considerable variability in conducting risk assessment, and few clinicians explicated their procedures beyond tying it to broader mental status examinations or interviews. Very few clinicians endorsed using currently-available standardized risk measures, and most reported low confidence in their utility. Clinicians also reported spending approximately half as much time conducting individual risk assessments as optimally needed. When asked about improvement, virtually no clinicians acknowledged biobehavioral, objective technologies, or ambulatory recording. Overall, clinicians seemed unaware of meaningful ways to improve risk assessment.
Affiliation(s)
- Alex S Cohen
- Department of Psychology, Louisiana State University, 236 Audubon Hall, Baton Rouge, LA, 70803, USA.
| | - Taylor Fedechko
- Department of Psychology, Louisiana State University, 236 Audubon Hall, Baton Rouge, LA, 70803, USA
| | - Elana K Schwartz
- Department of Psychology, Louisiana State University, 236 Audubon Hall, Baton Rouge, LA, 70803, USA
| | - Thanh P Le
- Department of Psychology, Louisiana State University, 236 Audubon Hall, Baton Rouge, LA, 70803, USA
| | - Peter W Foltz
- Institute of Cognitive Science, University of Colorado, Boulder, USA
| | | | - Jian Cheng
- Analytic Measures Inc, Palo Alto, CA, USA
| | | | - Brita Elvevåg
- Department of Clinical Medicine, University of Tromsø - The Arctic University of Norway, Tromsø, Norway; The Norwegian Centre for eHealth Research, University Hospital of North Norway, Tromsø, Norway
| |
|
43
|
Minor KS, Willits JA, Marggraf MP, Jones MN, Lysaker PH. Measuring disorganized speech in schizophrenia: automated analysis explains variance in cognitive deficits beyond clinician-rated scales. Psychol Med 2019; 49:440-448. [PMID: 29692287 DOI: 10.1017/s0033291718001046] [Citation(s) in RCA: 34] [Impact Index Per Article: 6.8] [Indexed: 12/31/2022]
Abstract
BACKGROUND Conveying information cohesively is an essential element of communication that is disrupted in schizophrenia. These disruptions are typically expressed through disorganized symptoms, which have been linked to neurocognitive, social cognitive, and metacognitive deficits. Automated analysis can objectively assess disorganization within sentences, between sentences, and across paragraphs by comparing explicit communication to a large text corpus. METHOD Little work in schizophrenia has tested: (1) links between disorganized symptoms measured via automated analysis and neurocognition, social cognition, or metacognition; and (2) if automated analysis explains incremental variance in cognitive processes beyond clinician-rated scales. Disorganization was measured in schizophrenia (n = 81) with Coh-Metrix 3.0, an automated program that calculates basic and complex language indices. Trained staff also assessed neurocognition, social cognition, metacognition, and clinician-rated disorganization. RESULTS Findings showed that all three cognitive processes were significantly associated with at least one automated index of disorganization. When automated analysis was compared with a clinician-rated scale, it accounted for significant variance in neurocognition and metacognition beyond the clinician-rated measure. When combined, these two methods explained 28-31% of the variance in neurocognition, social cognition, and metacognition. CONCLUSIONS This study illustrated how automated analysis can highlight the specific role of disorganization in neurocognition, social cognition, and metacognition. Generally, those with poor cognition also displayed more disorganization in their speech, making it difficult for listeners to process essential information needed to tie the speaker's ideas together. Our findings showcase how implementing a mixed-methods approach in schizophrenia can explain substantial variance in cognitive processes.
Affiliation(s)
- K S Minor
- Department of Psychology, Indiana University-Purdue University Indianapolis, Indianapolis, IN, USA
| | - J A Willits
- Department of Psychology, University of California-Riverside, Riverside, CA, USA
| | - M P Marggraf
- Department of Psychology, Indiana University-Purdue University Indianapolis, Indianapolis, IN, USA
| | - M N Jones
- Department of Psychology, Indiana University, Bloomington, IN, USA
| | - P H Lysaker
- Roudebush VA Medical Center, Indianapolis, IN, USA
| |
|
44
|
Ratana R, Sharifzadeh H, Krishnan J, Pang S. A Comprehensive Review of Computational Methods for Automatic Prediction of Schizophrenia With Insight Into Indigenous Populations. Front Psychiatry 2019; 10:659. [PMID: 31607962 PMCID: PMC6759015 DOI: 10.3389/fpsyt.2019.00659] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 03/14/2019] [Accepted: 08/15/2019] [Indexed: 01/13/2023] Open
Abstract
Psychiatrists rely on language and speech behavior as one of the main clues in psychiatric diagnosis. Descriptive psychopathology and phenomenology form the basis of a common language used by psychiatrists to describe abnormal mental states. This conventional technique of clinical observation informed early studies on disturbances of thought form, speech, and language observed in psychosis and schizophrenia. These findings resulted in language models that were used as tools in psychosis research that concerned itself with the links between formal thought disorder and language disturbances observed in schizophrenia. The end result was the development of clinical rating scales measuring severity of disturbances in speech, language, and thought form. However, these linguistic measures do not fully capture the richness of human discourse and are time-consuming and subjective when measured against psychometric rating scales. These linguistic measures have not considered the influence of culture on psychopathology. With recent advances in computational sciences, we have seen a re-emergence of novel research using computing methods to analyze free speech for improving prediction and diagnosis of psychosis. Current studies on automated speech analysis examining for semantic incoherence are carried out based on natural language processing and acoustic analysis, which, in some studies, have been combined with machine learning approaches for classification and prediction purposes.
Affiliation(s)
- Randall Ratana
- School of Computing, Unitec Institute of Technology, Auckland, New Zealand
| | - Hamid Sharifzadeh
- School of Computing, Unitec Institute of Technology, Auckland, New Zealand
| | | | - Shaoning Pang
- School of Computing, Unitec Institute of Technology, Auckland, New Zealand
| |
|
45
|
de Boer J, Voppel A, Begemann M, Schnack H, Wijnen F, Sommer I. Clinical use of semantic space models in psychiatry and neurology: A systematic review and meta-analysis. Neurosci Biobehav Rev 2018; 93:85-92. [DOI: 10.1016/j.neubiorev.2018.06.008] [Citation(s) in RCA: 26] [Impact Index Per Article: 4.3] [Received: 03/14/2018] [Revised: 06/07/2018] [Accepted: 06/07/2018] [Indexed: 01/17/2023]
|
46
|
Evidence of disturbances of deep levels of semantic cohesion within personal narratives in schizophrenia. Schizophr Res 2018; 197:365-369. [PMID: 29153448 DOI: 10.1016/j.schres.2017.11.014] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.0] [Received: 02/09/2017] [Revised: 10/17/2017] [Accepted: 11/10/2017] [Indexed: 12/24/2022]
Abstract
Since initial conceptualizations, schizophrenia has been thought to involve core disturbances in the ability to form complex, integrated ideas. Although this has been studied in terms of formal thought disorder, the level of involvement of altered latent semantic structure is less clear. To explore this question, we compared the personal narratives of adults with schizophrenia (n=200) to those produced by an HIV+ sample (n=55) using selected indices from Coh-Metrix. Coh-Metrix is a software system designed to compute various language usage statistics from transcribed written and spoken language documents. It differs from many other frequency-based systems in that Coh-Metrix measures a wide range of language processes, ranging from basic descriptors (e.g., total words) to indices assessing more sophisticated processes within sentences, between sentences, and across paragraphs (e.g., deep cohesion). Consistent with predictions, the narratives in schizophrenia exhibited less cohesion even after controlling for age and education. Specifically, the schizophrenia group spoke fewer words, demonstrated less connection between ideas and clauses, provided fewer causal/intentional markers, and displayed lower levels of deep cohesion. A classification model using only Coh-Metrix indices found language markers correctly classified participants in nearly three-fourths of cases. These findings suggest a particular pattern of difficulties cohesively connecting thoughts about oneself and the world results in a perceived lack of coherence in schizophrenia. These results are consistent with Bleuler's model of schizophrenia and offer a novel way to understand and measure alterations in thought and speech over time.
|
47
|
Pauselli L, Halpern B, Cleary SD, Ku BS, Covington MA, Compton MT. Computational linguistic analysis applied to a semantic fluency task to measure derailment and tangentiality in schizophrenia. Psychiatry Res 2018; 263:74-79. [PMID: 29502041 PMCID: PMC6048590 DOI: 10.1016/j.psychres.2018.02.037] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Received: 06/02/2017] [Revised: 12/18/2017] [Accepted: 02/16/2018] [Indexed: 12/31/2022]
Abstract
Although rating scales to assess formal thought disorder exist, there are no objective, high-reliability instruments that can quantify and track it. This proof-of-concept study shows that CoVec, a new automated tool, is able to differentiate between controls and patients with schizophrenia with derailment and tangentiality. According to ratings from the derailment and tangentiality items of the Scale for the Assessment of Positive Symptoms, we divided the sample into three groups: controls, patients without formal thought disorder, and patients with derailment/tangentiality. Their lists of animals produced during a one-minute semantic fluency task were processed using CoVec, a newly developed software that measures the semantic similarity of words based on vector semantic analysis. CoVec outputs were Mean Similarity, Coherence, Coherence-5, and Coherence-10. Patients with schizophrenia produced fewer words than controls. Patients with derailment had a significantly lower mean number of words and lower Coherence-5 than controls and patients without derailment. Patients with tangentiality had significantly lower Coherence-5 and Coherence-10 than controls and patients without tangentiality. Despite the small samples of patients with clinically apparent thought disorder, CoVec was able to detect subtle differences between controls and patients with either or both of the two forms of disorganization.
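The abstract does not say how CoVec computes its scores. As a rough illustration of vector-based coherence over a fluency word list, the sketch below averages the cosine similarity between each word and the words that follow it within a fixed window. The toy vectors, the function names, and the windowing scheme are all assumptions made for illustration and should not be read as CoVec's actual implementation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def windowed_coherence(words, vectors, window):
    """Mean similarity of each word to the next `window` words.

    Hypothetical measure: returns NaN when no word pair is available.
    """
    sims = []
    for i in range(len(words)):
        for j in range(i + 1, min(i + 1 + window, len(words))):
            sims.append(cosine(vectors[words[i]], vectors[words[j]]))
    return sum(sims) / len(sims) if sims else float("nan")


# Toy 2-D embeddings (made up); real word vectors have hundreds of dimensions.
toy_vectors = {"cat": (1.0, 0.0), "dog": (1.0, 0.0), "eel": (0.0, 1.0)}
animals = ["cat", "dog", "eel"]
print(windowed_coherence(animals, toy_vectors, window=1))
print(windowed_coherence(animals, toy_vectors, window=2))
```

A larger window compares words that are further apart in the list, so under this sketch a drift away from the starting semantic neighborhood lowers the score more at wider windows.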
Affiliation(s)
- Luca Pauselli
- Department of Psychiatry, Columbia University College of Physicians and Surgeons, New York, NY, USA
- Brooke Halpern
- Department of Psychiatry, Lenox Hill Hospital, New York, NY, USA
- Sean D Cleary
- Department of Epidemiology and Biostatistics, Milken Institute School of Public Health, The George Washington University, Washington, DC, USA
- Benson S Ku
- Hofstra Northwell School of Medicine, Hempstead, NY, USA
- Michael T Compton
- Department of Psychiatry, Columbia University College of Physicians and Surgeons, New York, NY, USA
48
Discriminant document embeddings with an extreme learning machine for classifying clinical narratives. Neurocomputing 2018. [DOI: 10.1016/j.neucom.2017.01.117] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Indexed: 11/20/2022]
49
Semantic coherence in psychometric schizotypy: An investigation using Latent Semantic Analysis. Psychiatry Res 2018; 259:63-67. [PMID: 29028526 DOI: 10.1016/j.psychres.2017.09.078] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 06/26/2016] [Revised: 05/23/2017] [Accepted: 09/25/2017] [Indexed: 12/30/2022]
Abstract
Technological advancements have led to the development of automated methods for assessing semantic coherence in psychiatric populations. Latent Semantic Analysis (LSA) is an automated method that has been used to quantify semantic coherence in schizophrenia-spectrum disorders. The current study examined whether: 1) Semantic coherence reductions extended to psychometrically-defined schizotypy and 2) Greater cognitive load further reduces semantic coherence. LSA was applied to responses generated during category fluency tasks in baseline and cognitive load conditions. Significant differences between schizotypy and non-schizotypy groups were not observed. Findings suggest that semantic coherence may be relatively preserved at this point on the schizophrenia-spectrum.
50
Cohen AS, Mitchell KR, Strauss GP, Blanchard JJ, Buchanan RW, Kelly DL, Gold J, McMahon RP, Adams HA, Carpenter WT. The effects of oxytocin and galantamine on objectively-defined vocal and facial expression: Data from the CIDAR study. Schizophr Res 2017; 188:141-143. [PMID: 28130004 PMCID: PMC5524598 DOI: 10.1016/j.schres.2017.01.028] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 11/10/2016] [Revised: 01/17/2017] [Accepted: 01/18/2017] [Indexed: 11/29/2022]
Affiliation(s)
- Alex S. Cohen
- Louisiana State University, Department of Psychology
- Robert W. Buchanan
- University of Maryland School of Medicine, Maryland Psychiatric Research Center
- Deanna L. Kelly
- University of Maryland School of Medicine, Maryland Psychiatric Research Center
- James Gold
- University of Maryland School of Medicine, Maryland Psychiatric Research Center
- Robert P. McMahon
- University of Maryland School of Medicine, Maryland Psychiatric Research Center
- Heather A. Adams
- University of Maryland School of Medicine, Maryland Psychiatric Research Center