1
Neumann M, Kothare H, Ramanarayanan V. Multimodal speech biomarkers for remote monitoring of ALS disease progression. Comput Biol Med 2024; 180:108949. PMID: 39126786. DOI: 10.1016/j.compbiomed.2024.108949.
Abstract
Amyotrophic lateral sclerosis (ALS) is a progressive neurodegenerative disease that severely impacts affected persons' speech and motor functions, yet early detection and tracking of disease progression remain challenging. The current gold standard for monitoring ALS progression, the ALS Functional Rating Scale-Revised (ALSFRS-R), is based on subjective ratings of symptom severity and may not capture subtle but clinically meaningful changes due to a lack of granularity. Multimodal speech measures that can be automatically collected from patients in a remote fashion allow us to bridge this gap because they are continuous-valued and therefore potentially more granular at capturing disease progression. Here we investigate the responsiveness and sensitivity of multimodal speech measures in persons with ALS (pALS), collected via a remote patient monitoring platform, in an effort to quantify how long it takes to detect a clinically meaningful change associated with disease progression. We recorded audio and video from 278 participants and automatically extracted multimodal speech biomarkers (acoustic, orofacial, linguistic) from the data. We find that the timing alignment of pALS speech relative to a canonical elicitation of the same prompt, and the number of words used to describe a picture, are the most responsive measures for detecting such change in both pALS with bulbar (n = 36) and non-bulbar onset (n = 107). Interestingly, the responsiveness of these measures is stable even at small sample sizes. We further found that certain speech measures are sensitive enough to track bulbar decline even when there is no patient-reported clinical change, i.e., the ALSFRS-R speech score remains unchanged at 3 out of a total possible score of 4. The findings of this study have the potential to facilitate improved, accelerated, and cost-effective clinical trials and care.
Affiliation(s)
- Vikram Ramanarayanan
- Modality.AI, Inc., San Francisco, CA, USA; University of California, San Francisco, CA, USA.
2
Berisha V, Liss JM. Responsible development of clinical speech AI: Bridging the gap between clinical research and technology. NPJ Digit Med 2024; 7:208. PMID: 39122889. PMCID: PMC11316053. DOI: 10.1038/s41746-024-01199-1.
Abstract
This perspective article explores the challenges and potential of using speech as a biomarker in clinical settings, particularly when constrained by the small clinical datasets typically available in such contexts. We contend that by integrating insights from speech science and clinical research, we can reduce sample complexity in clinical speech AI models, with the potential to decrease timelines to translation. Most existing models are based on high-dimensional feature representations trained with limited sample sizes and often do not leverage insights from speech science and clinical research. This approach can lead to overfitting, where the models perform exceptionally well on training data but fail to generalize to new, unseen data. Additionally, without incorporating theoretical knowledge, these models may lack interpretability and robustness, making them challenging to troubleshoot or improve post-deployment. We propose a framework for organizing health conditions based on their impact on speech and promote the use of speech analytics in diverse clinical contexts beyond cross-sectional classification. For high-stakes clinical use cases, we advocate a focus on explainable and individually validated measures and stress the importance of rigorous validation frameworks and ethical considerations for responsible deployment. Bridging the gap between AI research and clinical speech research presents new opportunities for more efficient translation of speech-based AI tools and for the advancement of scientific discoveries in this interdisciplinary space, particularly when work is limited to small or retrospective datasets.
Affiliation(s)
- Visar Berisha
- School of Electrical, Computer and Energy Engineering and College of Health Solutions, Arizona State University, Tempe, AZ, USA.
- Julie M Liss
- College of Health Solutions, Arizona State University, Tempe, AZ, USA
3
Taşcı B. Multilevel hybrid handcrafted feature extraction based depression recognition method using speech. J Affect Disord 2024:S0165-0327(24)01215-1. PMID: 39127304. DOI: 10.1016/j.jad.2024.08.002.
Abstract
BACKGROUND AND PURPOSE Diagnosis of depression is based on tests performed by psychiatrists and information provided by patients or their relatives. In the field of machine learning (ML), numerous models have been devised to detect depression automatically through the analysis of speech audio signals. While deep learning approaches often achieve superior classification accuracy, they are notably resource-intensive. This research introduces an innovative, multilevel hybrid feature extraction-based classification model, specifically designed for depression detection, which exhibits reduced time complexity. MATERIALS AND METHODS The MODMA dataset, consisting of audio signals from 29 healthy controls and 23 patients with major depressive disorder, was used. The constructed model architecture integrates multilevel hybrid feature extraction, iterative feature selection, and classification processes. During the Hybrid Handcrafted Feature (HHF) generation stage, a combination of textural and statistical methods was employed to extract low-level features from speech audio signals. To enhance this process for high-level feature creation, a Multilevel Discrete Wavelet Transform (MDWT) was applied. This technique produced wavelet subbands, which were then input into the hybrid feature extractor, enabling the extraction of both high- and low-level features. For the selection of the most pertinent features from these extracted vectors, Iterative Neighborhood Component Analysis (INCA) was utilized. Finally, in the classification phase, a one-dimensional nearest neighbor classifier with ten-fold cross-validation was implemented. RESULTS The HHF-based speech audio signal classification model attained excellent performance, with a classification accuracy of 94.63%. CONCLUSIONS The findings validate the remarkable proficiency of the introduced HHF-based model in depression classification, underscoring its computational efficiency.
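The overall shape of the pipeline in this abstract (wavelet subbands, low-level statistical features per subband, nearest-neighbor classification) can be sketched as follows. This is an illustrative reconstruction under stated assumptions only: the helper names (`haar_dwt`, `extract_features`, `knn1_predict`), the choice of Haar filters, and the four simple statistical descriptors are not from the paper, which uses textural feature extractors and INCA feature selection that are not reproduced here.

```python
import numpy as np

def haar_dwt(x):
    """Single-level Haar DWT: return (approximation, detail) subbands."""
    x = np.asarray(x, dtype=float)
    if len(x) % 2:                          # pad to even length
        x = np.append(x, x[-1])
    a = (x[0::2] + x[1::2]) / np.sqrt(2)    # low-pass (approximation)
    d = (x[0::2] - x[1::2]) / np.sqrt(2)    # high-pass (detail)
    return a, d

def multilevel_subbands(x, levels=3):
    """Decompose a signal into `levels` detail subbands plus the final approximation."""
    bands, a = [], np.asarray(x, dtype=float)
    for _ in range(levels):
        a, d = haar_dwt(a)
        bands.append(d)
    bands.append(a)
    return bands

def stat_features(band):
    """Low-level statistical descriptors of one subband (assumed set)."""
    return [band.mean(), band.std(), np.abs(band).max(), (band ** 2).sum()]

def extract_features(signal, levels=3):
    """Concatenate statistics over the raw signal and all wavelet subbands."""
    feats = stat_features(np.asarray(signal, dtype=float))
    for band in multilevel_subbands(signal, levels):
        feats.extend(stat_features(band))
    return np.array(feats)

def knn1_predict(train_X, train_y, test_x):
    """Nearest-neighbor (k=1) classification by Euclidean distance."""
    dists = np.linalg.norm(train_X - test_x, axis=1)
    return train_y[int(np.argmin(dists))]
```

In a full evaluation, `extract_features` would be applied to each recording and `knn1_predict` wrapped in ten-fold cross-validation, as the abstract describes.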
Affiliation(s)
- Burak Taşcı
- Vocational School of Technical Sciences, Firat University, Elazig 23119, Turkey.
4
Kaczmarek-Majer K, Dominiak M, Antosik AZ, Hryniewicz O, Kamińska O, Opara K, Owsiński J, Radziszewska W, Sochacka M, Święcicki Ł. Acoustic features from speech as markers of depressive and manic symptoms in bipolar disorder: A prospective study. Acta Psychiatr Scand 2024. PMID: 39118422. DOI: 10.1111/acps.13735.
Abstract
INTRODUCTION Voice features could be a sensitive marker of affective state in bipolar disorder (BD). Smartphone apps offer an excellent opportunity to collect voice data in natural settings and could become a useful tool for phase prediction in BD. AIMS OF THE STUDY We investigate the relations between the symptoms of BD, evaluated by psychiatrists, and patients' voice characteristics. A smartphone app extracted acoustic parameters from the daily phone calls of n = 51 patients. We show how prosodic, spectral, and voice quality features correlate with clinically assessed affective states and explore their usefulness in predicting the BD phase. METHODS A smartphone app (BDmon) was developed to collect the voice signal and extract its physical features. BD patients used the application for 208 days on average. Psychiatrists assessed the severity of BD symptoms using the Hamilton Depression Rating Scale-17 and the Young Mania Rating Scale. We analyzed the relations between acoustic features of speech and patients' mental states using generalized linear mixed-effects models. RESULTS The prosodic, spectral, and voice quality parameters are valid markers for assessing the severity of manic and depressive symptoms. The accuracy of the predictive generalized mixed-effects model is 70.9%-71.4%. Significant differences in effect sizes and directions are observed between the female and male subgroups. The greater the severity of mania in males, the louder (β = 1.6) and higher the tone of voice (β = 0.71), the more clearly (β = 1.35) and more sharply (β = 0.95) they speak, and the longer their conversations are (β = 1.64). For females, the observations are either exactly the opposite (the greater the severity of mania, the quieter (β = -0.27) and lower the tone of voice (β = -0.21) and the less clearly (β = -0.25) they speak) or no correlation is found (length of speech). Conversely, the greater the severity of bipolar depression in males, the quieter (β = -1.07) and less clearly (β = -1.00) they speak. In females, no distinct correlations between the severity of depressive symptoms and changes in voice parameters are found. CONCLUSIONS Speech analysis provides physiological markers of affective symptoms in BD, and acoustic features extracted from speech are effective in predicting BD phases. This could personalize monitoring and care for BD patients, helping to decide whether a specialist should be consulted.
Affiliation(s)
- Katarzyna Kaczmarek-Majer
- Department of Stochastic Methods, Systems Research Institute Polish Academy of Sciences, Warsaw, Poland
- Monika Dominiak
- Department of Pharmacology and Physiology of the Nervous System, Institute of Psychiatry and Neurology, Warsaw, Poland
- Section of Biological Psychiatry, Polish Psychiatric Association, Warsaw, Poland
- Anna Z Antosik
- Section of Biological Psychiatry, Polish Psychiatric Association, Warsaw, Poland
- Department of Psychiatry, Faculty of Medicine, Collegium Medicum, Cardinal Wyszynski University in Warsaw, Warsaw, Poland
- Olgierd Hryniewicz
- Department of Stochastic Methods, Systems Research Institute Polish Academy of Sciences, Warsaw, Poland
- Olga Kamińska
- Department of Stochastic Methods, Systems Research Institute Polish Academy of Sciences, Warsaw, Poland
- Karol Opara
- Department of Stochastic Methods, Systems Research Institute Polish Academy of Sciences, Warsaw, Poland
- Jan Owsiński
- Department of Stochastic Methods, Systems Research Institute Polish Academy of Sciences, Warsaw, Poland
- Weronika Radziszewska
- Department of Stochastic Methods, Systems Research Institute Polish Academy of Sciences, Warsaw, Poland
- Łukasz Święcicki
- Department of Affective Disorders, II Psychiatric Clinic, Institute of Psychiatry and Neurology, Warsaw, Poland
5
Liu L, Liu L, Wafa HA, Tydeman F, Xie W, Wang Y. Diagnostic accuracy of deep learning using speech samples in depression: a systematic review and meta-analysis. J Am Med Inform Assoc 2024:ocae189. PMID: 39013193. DOI: 10.1093/jamia/ocae189.
Abstract
OBJECTIVE This study aims to conduct a systematic review and meta-analysis of the diagnostic accuracy of deep learning (DL) using speech samples in depression. MATERIALS AND METHODS This review included studies reporting diagnostic results of DL algorithms in depression using speech data, published from inception to January 31, 2024, in the PubMed, Medline, Embase, PsycINFO, Scopus, IEEE, and Web of Science databases. Pooled accuracy, sensitivity, and specificity were obtained using random-effects models. The Quality Assessment of Diagnostic Accuracy Studies tool (QUADAS-2) was used to assess the risk of bias. RESULTS A total of 25 studies met the inclusion criteria and 8 of them were used in the meta-analysis. The pooled estimates of accuracy, specificity, and sensitivity for depression detection models were 0.87 (95% CI, 0.81-0.93), 0.85 (95% CI, 0.78-0.91), and 0.82 (95% CI, 0.71-0.94), respectively. When stratified by model structure, the highest pooled diagnostic accuracy was 0.89 (95% CI, 0.81-0.97), in the handcrafted-feature group. DISCUSSION To our knowledge, this is the first meta-analysis of the diagnostic performance of DL for depression detection from speech samples. All studies included in the meta-analysis used convolutional neural network (CNN) models, making it difficult to assess the performance of other DL algorithms. Handcrafted-feature models performed better than end-to-end models in speech-based depression detection. CONCLUSIONS The application of DL to speech provides a useful tool for depression detection. CNN models with handcrafted acoustic features could help to improve diagnostic performance. PROTOCOL REGISTRATION The study protocol was registered on PROSPERO (CRD42023423603).
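For readers unfamiliar with the pooling step behind estimates like those above, a random-effects meta-analysis can be sketched with the DerSimonian-Laird estimator. This is a generic illustration, not the review's actual code: the function name and the 1.96 multiplier for a 95% CI are assumptions, and proportions such as sensitivity would normally be pooled on a transformed (e.g., logit) scale.

```python
import numpy as np

def dersimonian_laird(estimates, variances):
    """Pool study-level estimates with a DerSimonian-Laird random-effects model.

    Returns (pooled_estimate, lower_95, upper_95).
    """
    y = np.asarray(estimates, dtype=float)
    v = np.asarray(variances, dtype=float)
    w = 1.0 / v                              # fixed-effect (inverse-variance) weights
    y_fe = (w * y).sum() / w.sum()           # fixed-effect pooled mean
    q = (w * (y - y_fe) ** 2).sum()          # Cochran's Q heterogeneity statistic
    df = len(y) - 1
    c = w.sum() - (w ** 2).sum() / w.sum()
    tau2 = max(0.0, (q - df) / c)            # between-study variance estimate
    w_re = 1.0 / (v + tau2)                  # random-effects weights
    pooled = (w_re * y).sum() / w_re.sum()
    se = np.sqrt(1.0 / w_re.sum())
    return pooled, pooled - 1.96 * se, pooled + 1.96 * se
```

When study estimates are heterogeneous, the estimated between-study variance inflates the standard error, so the confidence interval widens relative to a fixed-effect analysis.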
Affiliation(s)
- Lidan Liu
- Department of Population Health Sciences, School of Life Course and Population Sciences, Faculty of Life Sciences & Medicine, King's College London, London, SE1 1UL, United Kingdom
- Lu Liu
- Department of Population Health Sciences, School of Life Course and Population Sciences, Faculty of Life Sciences & Medicine, King's College London, London, SE1 1UL, United Kingdom
- Hatem A Wafa
- Department of Population Health Sciences, School of Life Course and Population Sciences, Faculty of Life Sciences & Medicine, King's College London, London, SE1 1UL, United Kingdom
- Florence Tydeman
- Department of Population Health Sciences, School of Life Course and Population Sciences, Faculty of Life Sciences & Medicine, King's College London, London, SE1 1UL, United Kingdom
- Wanqing Xie
- Department of Intelligent Medical Engineering, School of Biomedical Engineering, Anhui Medical University, Hefei, 230032, China
- Department of Psychology, School of Mental Health and Psychological Sciences, Anhui Medical University, Hefei, 230032, China
- Beth Israel Deaconess Medical Center, Harvard Medical School, Harvard University, Boston, MA, 02115, United States
- Yanzhong Wang
- Department of Population Health Sciences, School of Life Course and Population Sciences, Faculty of Life Sciences & Medicine, King's College London, London, SE1 1UL, United Kingdom
6
Hartnagel LM, Ebner-Priemer UW, Foo JC, Streit F, Witt SH, Frank J, Limberger MF, Horn AB, Gilles M, Rietschel M, Sirignano L. Linguistic style as a digital marker for depression severity: An ambulatory assessment pilot study in patients with depressive disorder undergoing sleep deprivation therapy. Acta Psychiatr Scand 2024. PMID: 38987940. DOI: 10.1111/acps.13726.
Abstract
BACKGROUND Digital phenotyping and monitoring tools are among the most promising approaches to automatically detect upcoming depressive episodes. Linguistic style in particular has been seen as a potential behavioral marker of depression: cross-sectional studies have shown, for example, less frequent use of positive emotion words, intensified use of negative emotion words, and more self-references in patients with depression compared to healthy controls. However, longitudinal studies are sparse, and it therefore remains unclear whether within-person fluctuations in depression severity are associated with individuals' linguistic style. METHODS To capture affective states and concomitant speech samples longitudinally, we used an ambulatory assessment approach, sampling multiple times a day via smartphones in patients diagnosed with depressive disorder undergoing sleep deprivation therapy. This intervention promises a rapid change of affective symptoms within a short period of time, ensuring sufficient variability in depressive symptoms. We extracted word categories from the transcribed speech samples using the Linguistic Inquiry and Word Count (LIWC). RESULTS Our analyses revealed that more pleasant momentary affective states (lower reported depression severity, lower negative affective state, higher positive affective state, (positive) valence, energetic arousal, and calmness) are mirrored in the use of fewer negative emotion words and more positive emotion words. CONCLUSION We conclude that a patient's linguistic style, especially the use of positive and negative emotion words, is associated with self-reported affective states and is thus a promising feature for speech-based automated monitoring and prediction of upcoming episodes, ultimately leading to better patient care.
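The word-category extraction this abstract relies on reduces, at its core, to counting dictionary hits per transcript. A minimal sketch of that idea is below; the tiny `POSITIVE` and `NEGATIVE` lexica and the category names are placeholder assumptions for illustration only, whereas the actual study uses the full, validated LIWC dictionaries.

```python
import re

# Tiny illustrative lexica; the study uses the full LIWC dictionaries.
POSITIVE = {"happy", "good", "love", "nice", "calm", "hope"}
NEGATIVE = {"sad", "bad", "hate", "tired", "alone", "worthless"}
SELF = {"i", "me", "my", "mine", "i'm"}

def word_category_rates(text):
    """Return the share of tokens in each emotion-word category,
    plus the rate of first-person-singular self-references."""
    tokens = re.findall(r"[a-zA-Z']+", text.lower())
    n = max(len(tokens), 1)  # avoid division by zero for empty samples
    return {
        "posemo": sum(t in POSITIVE for t in tokens) / n,
        "negemo": sum(t in NEGATIVE for t in tokens) / n,
        "self": sum(t in SELF for t in tokens) / n,
    }
```

Rates rather than raw counts are used so that transcripts of different lengths, as arise in ambulatory sampling, remain comparable.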
Affiliation(s)
- Lisa-Marie Hartnagel
- Mental mHealth Lab, Institute of Sports and Sports Science, Karlsruhe Institute of Technology, Karlsruhe, Germany
- Ulrich W Ebner-Priemer
- Mental mHealth Lab, Institute of Sports and Sports Science, Karlsruhe Institute of Technology, Karlsruhe, Germany
- Department of Psychiatry and Psychotherapy, Central Institute of Mental Health, Medical Faculty Mannheim, University of Heidelberg, Mannheim, Germany
- Jerome C Foo
- Department of Genetic Epidemiology in Psychiatry, Central Institute of Mental Health, Medical Faculty Mannheim, University of Heidelberg, Mannheim, Germany
- Institute for Psychopharmacology, Central Institute of Mental Health, Medical Faculty Mannheim, University of Heidelberg, Mannheim, Germany
- Neuroscience and Mental Health Institute, University of Alberta, Edmonton, Alberta, Canada
- Department of Psychiatry, College of Health Sciences, University of Alberta, Edmonton, Alberta, Canada
- Fabian Streit
- Department of Psychiatry and Psychotherapy, Central Institute of Mental Health, Medical Faculty Mannheim, University of Heidelberg, Mannheim, Germany
- Department of Genetic Epidemiology in Psychiatry, Central Institute of Mental Health, Medical Faculty Mannheim, University of Heidelberg, Mannheim, Germany
- Hector Institute for Artificial Intelligence in Psychiatry, Central Institute of Mental Health, Medical Faculty Mannheim, University of Heidelberg, Mannheim, Germany
- Stephanie H Witt
- Department of Genetic Epidemiology in Psychiatry, Central Institute of Mental Health, Medical Faculty Mannheim, University of Heidelberg, Mannheim, Germany
- Josef Frank
- Department of Genetic Epidemiology in Psychiatry, Central Institute of Mental Health, Medical Faculty Mannheim, University of Heidelberg, Mannheim, Germany
- Matthias F Limberger
- Mental mHealth Lab, Institute of Sports and Sports Science, Karlsruhe Institute of Technology, Karlsruhe, Germany
- Andrea B Horn
- University Research Priority Program (URPP) Dynamics of Healthy Aging, Healthy Longevity Center, University of Zürich, Zürich, Switzerland
- Maria Gilles
- Department of Psychiatry and Psychotherapy, Central Institute of Mental Health, Medical Faculty Mannheim, University of Heidelberg, Mannheim, Germany
- Marcella Rietschel
- Department of Genetic Epidemiology in Psychiatry, Central Institute of Mental Health, Medical Faculty Mannheim, University of Heidelberg, Mannheim, Germany
- Lea Sirignano
- Department of Genetic Epidemiology in Psychiatry, Central Institute of Mental Health, Medical Faculty Mannheim, University of Heidelberg, Mannheim, Germany
7
Ramanarayanan V. Multimodal Technologies for Remote Assessment of Neurological and Mental Health. J Speech Lang Hear Res 2024:1-13. PMID: 38984943. DOI: 10.1044/2024_jslhr-24-00142.
Abstract
PURPOSE Automated remote assessment and monitoring of patients' neurological and mental health is increasingly becoming an essential component of the digital clinic and telehealth ecosystem, especially after the COVID-19 pandemic. This article reviews various modalities of health information that are useful for developing such remote clinical assessments in the real world at scale. APPROACH We first present an overview of the various modalities of health information (speech acoustics, natural language, conversational dynamics, orofacial or full-body movement, eye gaze, respiration, cardiopulmonary, and neural), each of which can be extracted from various signal sources (audio, video, text, or sensors). We further motivate their clinical utility with examples of how information from each modality can help us characterize how different disorders affect different aspects of patients' spoken communication. We then elucidate the advantages of combining one or more of these modalities toward a more holistic, informative, and robust assessment. FINDINGS We find that combining multiple modalities of health information allows for improved scientific interpretability, improved performance on downstream health applications such as early detection and progress monitoring, improved technological robustness, and improved user experience. We illustrate how these principles can be leveraged for remote clinical assessment at scale using a real-world case study of the Modality assessment platform. CONCLUSION This review motivates the combination of human-centric information from multiple modalities to measure various aspects of patients' health, arguing that remote clinical assessment integrating this complementary information can be more effective and lead to better clinical outcomes than using any one data stream in isolation.
Affiliation(s)
- Vikram Ramanarayanan
- Modality.AI, Inc., San Francisco, CA
- Department of Otolaryngology-Head and Neck Surgery, University of California, San Francisco
8
Cordella C, Di Filippo L, Kolachalama VB, Kiran S. Connected Speech Fluency in Poststroke and Progressive Aphasia: A Scoping Review of Quantitative Approaches and Features. Am J Speech Lang Pathol 2024; 33:2091-2128. PMID: 38652820. PMCID: PMC11253646. DOI: 10.1044/2024_ajslp-23-00208.
Abstract
PURPOSE Speech fluency has important diagnostic implications for individuals with poststroke aphasia (PSA) as well as primary progressive aphasia (PPA), and quantitative assessment of connected speech has emerged as a widely used approach across both etiologies. The purpose of this review was to provide a clearer picture of the range, nature, and utility of individual quantitative speech/language measures and methods used to assess connected speech fluency in PSA and PPA, and to compare approaches across etiologies. METHOD We conducted a scoping review of literature published between 2012 and 2022 following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses Extension for Scoping Reviews guidelines. Forty-five studies were included in the review. Literature was charted and summarized by etiology and by characteristics of the included patient populations and the method(s) used for derivation and analysis of speech/language features. For a subset of included articles, we also charted the individual quantitative speech/language features reported and the level of significance of reported results. RESULTS Results showed that similar methodological approaches have been used to quantify connected speech fluency in both PSA and PPA. Two hundred nine individual speech-language features were analyzed in total, with low levels of convergence across etiologies on specific features but greater agreement on the most salient features. The most useful features for differentiating fluent from nonfluent aphasia in both PSA and PPA were those related to overall speech quantity, speech rate, or grammatical competence. CONCLUSIONS Data from this review demonstrate the feasibility and utility of quantitative approaches for indexing connected speech fluency in PSA and PPA. We identified emergent trends toward automated analysis methods and data-driven approaches, which offer promising avenues for clinical translation. There remains a need for improved consensus on which subset of individual features might be most clinically useful for assessment and monitoring of fluency. SUPPLEMENTAL MATERIAL https://doi.org/10.23641/asha.25537237.
Affiliation(s)
- Claire Cordella
- Department of Speech, Language and Hearing Sciences, Boston University, MA
- Lauren Di Filippo
- Department of Speech, Language and Hearing Sciences, Boston University, MA
- Vijaya B. Kolachalama
- Department of Medicine, Boston University Chobanian & Avedisian School of Medicine, MA
- Department of Computer Science and Faculty of Computing & Data Sciences, Boston University, MA
- Swathi Kiran
- Department of Speech, Language and Hearing Sciences, Boston University, MA
9
Scroggins JK, Topaz M, Song J, Zolnoori M. Does synthetic data augmentation improve the performances of machine learning classifiers for identifying health problems in patient-nurse verbal communications in home healthcare settings? J Nurs Scholarsh 2024. PMID: 38961517. DOI: 10.1111/jnu.13004.
Abstract
BACKGROUND Identifying health problems in audio-recorded patient-nurse communication is important for improving outcomes in home healthcare patients, who often have complex conditions and increased risks of hospital utilization. Training machine learning classifiers to identify problems requires resource-intensive human annotation. OBJECTIVE To generate synthetic patient-nurse communication, automatically annotated for common health problems encountered in home healthcare settings, using GPT-4. We also examined whether augmenting real-world patient-nurse communication with synthetic data can improve the performance of machine learning classifiers in identifying health problems. DESIGN Secondary analysis of patient-nurse verbal communication data in home healthcare settings. METHODS The data were collected from one of the largest home healthcare organizations in the United States. We used 23 audio recordings of patient-nurse communications from 15 patients. The audio recordings were transcribed verbatim and manually annotated for health problems (e.g., circulation, skin, pain) indicated in the Omaha System classification scheme. Synthetic patient-nurse communication data were generated using the in-context learning prompting method, enhanced by chain-of-thought prompting to improve automatic annotation performance. Machine learning classifiers were applied to three training datasets: real-world communication, synthetic communication, and real-world communication augmented by synthetic communication. RESULTS Average F1 scores improved from 0.62 to 0.63 after training data were augmented with synthetic communication. The largest increase was observed with the XGBoost classifier, where F1 scores improved from 0.61 to 0.64 (about a 5% improvement). When trained solely on either real-world or synthetic communication, the classifiers showed comparable F1 scores of 0.62 and 0.61, respectively. CONCLUSION Integrating synthetic data improves machine learning classifiers' ability to identify health problems in home healthcare, with performance comparable to training on real-world data alone, highlighting the potential of synthetic data in healthcare analytics. CLINICAL RELEVANCE This study demonstrates the clinical relevance of leveraging synthetic patient-nurse communication data to enhance machine learning classifier performance in identifying health problems in home healthcare settings, contributing to more accurate and efficient problem identification for home healthcare patients with complex health conditions.
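The F1 scores this abstract compares across training datasets are the harmonic mean of precision and recall on the positive class. A minimal reference implementation is sketched below; it illustrates only the metric, not the study's classifiers or data, and the function name is an assumption.

```python
def f1_score(y_true, y_pred, positive=1):
    """Binary F1: harmonic mean of precision and recall for the positive label."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0  # no true positives: precision/recall undefined or zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

Averaging per-label F1 over the annotated problem categories would yield the kind of average F1 the study reports for each training condition.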
Affiliation(s)
- Maxim Topaz
- Columbia University School of Nursing, New York, New York, USA
- Data Science Institute, Columbia University, New York, New York, USA
- Center for Home Care Policy & Research, VNS Health, New York, New York, USA
- Jiyoun Song
- University of Pennsylvania School of Nursing, Philadelphia, Pennsylvania, USA
- Maryam Zolnoori
- Columbia University School of Nursing, New York, New York, USA
- Center for Home Care Policy & Research, VNS Health, New York, New York, USA
10
Neumann M, Kothare H, Ramanarayanan V. Multimodal Speech Biomarkers for Remote Monitoring of ALS Disease Progression. medRxiv 2024:2024.06.26.24308811. PMID: 38978682. PMCID: PMC11230328. DOI: 10.1101/2024.06.26.24308811.
Abstract
Amyotrophic lateral sclerosis (ALS) is a progressive neurodegenerative disease that severely impacts affected persons' speech and motor functions, yet early detection and tracking of disease progression remain challenging. The current gold standard for monitoring ALS progression, the ALS Functional Rating Scale-Revised (ALSFRS-R), is based on subjective ratings of symptom severity and may not capture subtle but clinically meaningful changes due to a lack of granularity. Multimodal speech measures that can be automatically collected from patients in a remote fashion allow us to bridge this gap because they are continuous-valued and therefore potentially more granular at capturing disease progression. Here we investigate the responsiveness and sensitivity of multimodal speech measures in persons with ALS (pALS), collected via a remote patient monitoring platform, in an effort to quantify how long it takes to detect a clinically meaningful change associated with disease progression. We recorded audio and video from 278 participants and automatically extracted multimodal speech biomarkers (acoustic, orofacial, linguistic) from the data. We find that the timing alignment of pALS speech relative to a canonical elicitation of the same prompt, and the number of words used to describe a picture, are the most responsive measures for detecting such change in both pALS with bulbar (n = 36) and non-bulbar onset (n = 107). Interestingly, the responsiveness of these measures is stable even at small sample sizes. We further found that certain speech measures are sensitive enough to track bulbar decline even when there is no patient-reported clinical change, i.e., the ALSFRS-R speech score remains unchanged at 3 out of a total possible score of 4. The findings of this study have the potential to facilitate improved, accelerated, and cost-effective clinical trials and care.
Affiliation(s)
- Vikram Ramanarayanan
- Modality.AI, Inc., San Francisco, CA, USA
- University of California, San Francisco, CA, USA

11
Ciharova M, Amarti K, van Breda W, Peng X, Lorente-Català R, Funk B, Hoogendoorn M, Koutsouleris N, Fusar-Poli P, Karyotaki E, Cuijpers P, Riper H. Use of Machine Learning Algorithms Based on Text, Audio, and Video Data in the Prediction of Anxiety and Posttraumatic Stress in General and Clinical Populations: A Systematic Review. Biol Psychiatry 2024:S0006-3223(24)01362-3. [PMID: 38866173] [DOI: 10.1016/j.biopsych.2024.06.002]
Abstract
Research in machine learning (ML) algorithms using natural behavior (i.e., text, audio, and video data) suggests that these techniques could contribute to personalization in psychology and psychiatry. However, a systematic review of the current state of the art is missing. Moreover, individual studies often target ML experts, who may overlook potential clinical implications of their findings. In a narrative accessible to mental health professionals, we present a systematic review conducted in 5 psychology and 2 computer science databases. We included 128 studies that assessed the predictive power of ML algorithms using text, audio, and/or video data in the prediction of anxiety and posttraumatic stress disorder. Most studies (n = 87) were aimed at predicting anxiety, while the remainder (n = 41) focused on posttraumatic stress disorder. They were mostly published since 2019 in computer science journals and tested algorithms using text (n = 72) rather than audio or video. Studies focused mainly on general populations (n = 92) and less on laboratory experiments (n = 23) or clinical populations (n = 13). Methodological quality varied, as did the reported metrics of predictive power, hampering comparison across studies. For both disorders, two-thirds of the studies reported acceptable to very good predictive power, a finding that held when only high-quality studies were considered. The results of 33 studies were uninterpretable, mainly due to missing information. Research into ML algorithms using natural behavior is in its infancy but shows potential to contribute to the diagnostics of mental disorders, such as anxiety and posttraumatic stress disorder, in the future if standardization of methods, reporting of results, and research in clinical populations are improved.
Affiliation(s)
- Marketa Ciharova
- Department of Clinical, Neuro, and Developmental Psychology, Amsterdam Public Health Research Institute, Vrije Universiteit Amsterdam, Amsterdam, the Netherlands; Black Dog Institute, University of New South Wales, Sydney, New South Wales, Australia.
- Khadicha Amarti
- Department of Clinical, Neuro, and Developmental Psychology, Amsterdam Public Health Research Institute, Vrije Universiteit Amsterdam, Amsterdam, the Netherlands
- Ward van Breda
- Department of Computer Science, Vrije Universiteit Amsterdam, Amsterdam, the Netherlands
- Xianhua Peng
- Department of Clinical, Neuro, and Developmental Psychology, Amsterdam Public Health Research Institute, Vrije Universiteit Amsterdam, Amsterdam, the Netherlands; Department of Methodology and Statistics, Tilburg School of Social and Behavioral Sciences, Tilburg University, Tilburg, the Netherlands
- Rosa Lorente-Català
- Department of Basic and Clinical Psychology and Psychobiology, Universitat Jaume I, Castellon, Spain
- Burkhardt Funk
- Institute of Information Systems, Leuphana University, Lüneburg, Germany
- Mark Hoogendoorn
- Department of Computer Science, Vrije Universiteit Amsterdam, Amsterdam, the Netherlands
- Nikolaos Koutsouleris
- Artificial Intelligence in Mental Health Group, Department of Psychosis Studies, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, United Kingdom; Precision Psychiatry Group, Max Planck Institute, Munich, Germany; Section for Precision Psychiatry, Department of Psychiatry and Psychotherapy, University Medical Center, Ludwig-Maximilians-University Munich, Munich, Germany
- Paolo Fusar-Poli
- Section for Precision Psychiatry, Department of Psychiatry and Psychotherapy, University Medical Center, Ludwig-Maximilians-University Munich, Munich, Germany; Early Psychosis: Interventions and Clinical-Detection (EPIC) Lab, Department of Psychosis Studies, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, United Kingdom; Department of Brain and Behavioural Sciences, University of Pavia, Pavia, Italy; OASIS Service, South London and the Maudsley National Health Service Foundation Trust, London, United Kingdom
- Eirini Karyotaki
- Department of Clinical, Neuro, and Developmental Psychology, Amsterdam Public Health Research Institute, Vrije Universiteit Amsterdam, Amsterdam, the Netherlands; WHO Collaborating Center for Research and Dissemination of Psychological Interventions, Vrije Universiteit Amsterdam, Amsterdam, the Netherlands
- Pim Cuijpers
- Department of Clinical, Neuro, and Developmental Psychology, Amsterdam Public Health Research Institute, Vrije Universiteit Amsterdam, Amsterdam, the Netherlands; WHO Collaborating Center for Research and Dissemination of Psychological Interventions, Vrije Universiteit Amsterdam, Amsterdam, the Netherlands; Babeș-Bolyai University, International Institute for Psychotherapy, Cluj-Napoca, Romania
- Heleen Riper
- Department of Clinical, Neuro, and Developmental Psychology, Amsterdam Public Health Research Institute, Vrije Universiteit Amsterdam, Amsterdam, the Netherlands; Department of Psychiatry, Amsterdam Public Health Research Institute, Amsterdam University Medical Centre, Vrije Universiteit Amsterdam, Amsterdam, the Netherlands

12
Ben Moshe T, Ziv I, Dershowitz N, Bar K. The contribution of prosody to machine classification of schizophrenia. Schizophrenia (Heidelb) 2024; 10:53. [PMID: 38762536] [PMCID: PMC11102498] [DOI: 10.1038/s41537-024-00463-3]
Abstract
We show how acoustic prosodic features, such as pitch and gaps, can be used computationally to detect symptoms of schizophrenia from a single spoken response. We compare the individual contributions of the acoustic and the previously employed text modalities to the algorithmic determination of whether the speaker has schizophrenia. Our classification results clearly show that the acoustic features we extract are more informative than the textual ones. We find that, when combined with those acoustic features, textual features improve classification only slightly.
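The "gap" features above are silence-based prosodic measures. As a rough illustration of how such a feature can be computed from raw audio, the sketch below estimates the fraction of low-energy frames in a signal with a simple RMS threshold. This is a minimal, NumPy-only sketch under assumed parameters (25 ms frames, 10 ms hop, a relative energy threshold), not the paper's actual feature pipeline.

```python
import numpy as np

def pause_fraction(signal, sr, frame_ms=25, hop_ms=10, rel_threshold=0.1):
    """Fraction of frames whose RMS energy falls below a relative
    threshold -- a crude proxy for pause ('gap') features in speech."""
    frame = int(sr * frame_ms / 1000)
    hop = int(sr * hop_ms / 1000)
    if len(signal) < frame:
        return 0.0
    # Frame the signal and compute per-frame RMS energy.
    n_frames = 1 + (len(signal) - frame) // hop
    rms = np.array([
        np.sqrt(np.mean(signal[i * hop : i * hop + frame] ** 2))
        for i in range(n_frames)
    ])
    # Frames quieter than rel_threshold * max energy count as gaps.
    return float(np.mean(rms < rel_threshold * rms.max()))

# Toy example: 1 s of "speech" (noise) followed by 1 s of silence,
# so roughly half of the frames should register as a gap.
sr = 16000
voiced = np.random.default_rng(0).normal(0, 1, sr)
silent = np.zeros(sr)
frac = pause_fraction(np.concatenate([voiced, silent]), sr)
```

In practice, pause features would be derived from a proper voice-activity detector; the fixed relative threshold here is only for illustration.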
Affiliation(s)
- Tomer Ben Moshe
- Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv, Israel
- Ido Ziv
- Behavioral Sciences, Netanya Academic College, Netanya, Israel.
- Nachum Dershowitz
- Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv, Israel
- Kfir Bar
- Effi Arazi School of Computer Science, Reichman University, Herzliya, Israel

13
13
|
Low DM, Rao V, Randolph G, Song PC, Ghosh SS. Identifying bias in models that detect vocal fold paralysis from audio recordings using explainable machine learning and clinician ratings. PLOS Digit Health 2024; 3:e0000516. [PMID: 38814939] [PMCID: PMC11139298] [DOI: 10.1371/journal.pdig.0000516]
Abstract
Detecting voice disorders from voice recordings could allow for frequent, remote, and low-cost screening before costly clinical visits and a more invasive laryngoscopy examination. Our goals were to detect unilateral vocal fold paralysis (UVFP) from voice recordings using machine learning, to identify which acoustic variables were important for prediction in order to increase trust, and to determine model performance relative to clinician performance. Patients with UVFP confirmed through endoscopic examination (N = 77) and controls with normal voices matched for age and sex (N = 77) were included. Voice samples were elicited by reading the Rainbow Passage and sustaining phonation of the vowel "a". Four machine learning models of differing complexity were used. SHapley Additive exPlanations (SHAP) was used to identify important features. The highest median bootstrapped ROC AUC score was 0.87, exceeding clinicians' performance on the same recordings (range: 0.74-0.81). However, recording durations differed between UVFP recordings and controls because of how the data were originally processed for storage, and we show that duration alone can separate the two groups. Counterintuitively, many UVFP recordings also had higher intensity than controls, even though UVFP patients tend to have weaker voices, revealing a dataset-specific bias that we mitigate in an additional analysis. We thus demonstrate that recording biases in audio duration and intensity created dataset-specific differences between patients and controls, which the models exploited to improve classification. Clinicians' ratings provide further evidence that patients were over-projecting their voices and were recorded at a higher amplitude than controls. Interestingly, after matching audio durations and removing variables associated with intensity to mitigate these biases, the models still achieved similarly high performance. We provide a set of recommendations for avoiding bias when building and evaluating machine learning models for screening in laryngology.
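The "median bootstrapped ROC AUC" reported above summarizes classifier performance over resampled evaluation sets. The sketch below shows the general recipe on synthetic scores, using a rank-based AUC so it needs only NumPy; the class sizes and score distributions are assumptions for illustration, not the study's models or data.

```python
import numpy as np

def roc_auc(y_true, y_score):
    """Rank-based ROC AUC: probability that a random positive case
    scores higher than a random negative case (ties count half)."""
    pos = y_score[y_true == 1]
    neg = y_score[y_true == 0]
    greater = (pos[:, None] > neg[None, :]).mean()
    ties = (pos[:, None] == neg[None, :]).mean()
    return greater + 0.5 * ties

def bootstrap_auc(y_true, y_score, n_boot=500, seed=0):
    """Median and 95% interval of the AUC over bootstrap resamples
    of the evaluation set."""
    rng = np.random.default_rng(seed)
    aucs = []
    while len(aucs) < n_boot:
        idx = rng.integers(0, len(y_true), len(y_true))
        if y_true[idx].min() == y_true[idx].max():  # need both classes
            continue
        aucs.append(roc_auc(y_true[idx], y_score[idx]))
    aucs = np.array(aucs)
    return np.median(aucs), np.percentile(aucs, [2.5, 97.5])

# Synthetic example: 77 "patients" score higher on average than 77 "controls".
rng = np.random.default_rng(1)
y = np.repeat([0, 1], 77)
scores = rng.normal(loc=1.5 * y, scale=1.0)
median_auc, ci = bootstrap_auc(y, scores)
```

Reporting the bootstrap interval alongside the median makes clear how much of an AUC figure is sampling noise on a small evaluation set.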
Affiliation(s)
- Daniel M. Low
- Program in Speech and Hearing Bioscience and Technology, Harvard Medical School, Boston, Massachusetts, United States of America
- McGovern Institute for Brain Research, MIT, Cambridge, Massachusetts, United States of America
- Vishwanatha Rao
- Department of Biomedical Engineering, Columbia University, New York, New York, United States of America
- Department of Otolaryngology–Head and Neck Surgery, Massachusetts Eye and Ear Infirmary, Boston, Massachusetts, United States of America
- Gregory Randolph
- Department of Otolaryngology–Head and Neck Surgery, Massachusetts Eye and Ear Infirmary, Boston, Massachusetts, United States of America
- Department of Otolaryngology–Head and Neck Surgery, Harvard Medical School, Boston, Massachusetts, United States of America
- Phillip C. Song
- Department of Otolaryngology–Head and Neck Surgery, Massachusetts Eye and Ear Infirmary, Boston, Massachusetts, United States of America
- Department of Otolaryngology–Head and Neck Surgery, Harvard Medical School, Boston, Massachusetts, United States of America
- Satrajit S. Ghosh
- Program in Speech and Hearing Bioscience and Technology, Harvard Medical School, Boston, Massachusetts, United States of America
- McGovern Institute for Brain Research, MIT, Cambridge, Massachusetts, United States of America
- Department of Otolaryngology–Head and Neck Surgery, Harvard Medical School, Boston, Massachusetts, United States of America

14
|
Cao S, Rosenzweig I, Bilotta F, Jiang H, Xia M. Automatic detection of obstructive sleep apnea based on speech or snoring sounds: a narrative review. J Thorac Dis 2024; 16:2654-2667. [PMID: 38738242] [PMCID: PMC11087644] [DOI: 10.21037/jtd-24-310]
Abstract
Background and Objective: Obstructive sleep apnea (OSA) is a common chronic disorder characterized by repeated breathing pauses during sleep caused by upper airway narrowing or collapse. The gold standard for OSA diagnosis is the polysomnography test, which is time-consuming, expensive, and invasive. In recent years, more cost-effective approaches to OSA detection based on the predictive value of speech and snoring sounds have emerged. In this paper, we offer a comprehensive summary of current research progress on the application of speech or snoring sounds to the automatic detection of OSA and discuss the key challenges that need to be overcome in future research on this novel approach. Methods: The PubMed, IEEE Xplore, and Web of Science databases were searched with related keywords. Literature published between 1989 and 2022 examining the potential of using speech or snoring sounds for automated OSA detection was reviewed. Key Content and Findings: Speech and snoring sounds contain a large amount of information about OSA, and they have been extensively studied for automatic OSA screening. By importing features extracted from speech and snoring sounds into artificial intelligence models, clinicians can automatically screen for OSA. Features such as formants, linear prediction cepstral coefficients, and mel-frequency cepstral coefficients, together with artificial intelligence algorithms including support vector machines, Gaussian mixture models, and hidden Markov models, have been extensively studied for the detection of OSA. Conclusions: Owing to the significant advantages of noninvasive, low-cost, and contactless data collection, an automatic approach based on speech or snoring sounds seems to be a promising tool for the detection of OSA.
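The linear prediction cepstral coefficients mentioned in the findings start from linear prediction coefficients (LPCs), which model each speech sample as a weighted sum of its predecessors. The textbook Levinson-Durbin recursion below computes LPCs from the autocorrelation sequence; it is a bare-bones sketch of that first step (the AR(1) sanity check and the chosen order are illustrative assumptions), not any specific OSA system from the review.

```python
import numpy as np

def lpc(signal, order):
    """Linear prediction coefficients a[0..order] (a[0] = 1) via the
    Levinson-Durbin recursion on the autocorrelation sequence.
    The predictor is x[n] ~= -sum_k a[k] * x[n-k] for k = 1..order."""
    x = np.asarray(signal, dtype=float)
    # Autocorrelation at lags 0..order.
    r = np.array([np.dot(x[: len(x) - k], x[k:]) for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                  # reflection coefficient
        new_a = a.copy()
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        a = new_a
        err *= 1.0 - k * k              # prediction error shrinks each step
    return a

# Sanity check on a synthetic AR(1) process x[n] = 0.9 x[n-1] + e[n]:
# an order-1 LPC fit should recover a[1] close to -0.9.
rng = np.random.default_rng(0)
e = rng.normal(0, 1, 20000)
x = np.empty_like(e)
x[0] = e[0]
for t in range(1, len(e)):
    x[t] = 0.9 * x[t - 1] + e[t]
coeffs = lpc(x, order=1)
```

A full LPC-cepstral front end would frame and window the signal, fit LPCs per frame, and then convert them to cepstral coefficients.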
Affiliation(s)
- Shuang Cao
- Department of Anesthesiology, The Ninth People’s Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Ivana Rosenzweig
- Sleep and Brain Plasticity Centre, CNS, IoPPN, King’s College London, London, UK
- Sleep Disorders Centre, Guy’s and St Thomas’ Hospital, GSTT NHS, London, UK
- Federico Bilotta
- Department of Anaesthesia and Critical Care Medicine, Policlinico Umberto 1 Hospital, Sapienza University of Rome, Rome, Italy
- Hong Jiang
- Department of Anesthesiology, The Ninth People’s Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Ming Xia
- Department of Anesthesiology, The Ninth People’s Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China

15
|
Weisenburger RL, Mullarkey MC, Labrada J, Labrousse D, Yang MY, MacPherson AH, Hsu KJ, Ugail H, Shumake J, Beevers CG. Conversational assessment using artificial intelligence is as clinically useful as depression scales and preferred by users. J Affect Disord 2024; 351:489-498. [PMID: 38290584] [DOI: 10.1016/j.jad.2024.01.212]
Abstract
BACKGROUND Depression is prevalent, chronic, and burdensome. Due to limited screening access, depression often remains undiagnosed. Artificial intelligence (AI) models based on spoken responses to interview questions may offer an effective, efficient alternative to other screening methods. OBJECTIVE The primary aim was to use a demographically diverse sample to validate an AI model, previously trained on human-administered interviews, on novel bot-administered interviews, and to check for algorithmic biases related to age, sex, race, and ethnicity. METHODS Using the Aiberry app, adults recruited via social media (N = 393) completed a brief bot-administered interview and a depression self-report form. An AI model was used to predict form scores based on interview responses alone. For all meaningful discrepancies between model inference and form score, clinicians performed a masked review to determine which one they preferred. RESULTS There was strong concurrent validity between the model predictions and raw self-report scores (r = 0.73, MAE = 3.3). 90% of AI predictions agreed either with self-report or, where the AI contradicted self-report, with clinical expert opinion. There was no differential model performance across age, sex, race, or ethnicity. LIMITATIONS Limitations include access restrictions (English-speaking ability and access to a smartphone or computer with broadband internet) and potential self-selection of participants more favorably predisposed toward AI technology. CONCLUSION The Aiberry model made accurate predictions of depression severity based on remotely collected spoken responses to a bot-administered interview. This study shows promising results for the use of AI as a mental health screening tool on par with self-report measures.
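The concurrent-validity figures quoted in the results (r = 0.73, MAE = 3.3) are two standard agreement metrics between model predictions and self-report scores. The sketch below computes both on toy data; the 0-27 score range and noise level are illustrative assumptions, not the study's data.

```python
import numpy as np

def concurrent_validity(predicted, observed):
    """Pearson correlation and mean absolute error between model
    predictions and observed (self-reported) scores."""
    p = np.asarray(predicted, dtype=float)
    o = np.asarray(observed, dtype=float)
    r = np.corrcoef(p, o)[0, 1]          # linear agreement in rank/shape
    mae = float(np.mean(np.abs(p - o)))  # average absolute score error
    return r, mae

# Toy data: predictions tracking a 0-27 self-report score with noise.
rng = np.random.default_rng(2)
truth = rng.integers(0, 28, size=200).astype(float)
pred = truth + rng.normal(0, 4, size=200)
r, mae = concurrent_validity(pred, truth)
```

Reporting both is informative: a model can correlate well with self-report (high r) while still being miscalibrated in absolute terms (high MAE), and vice versa.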
Affiliation(s)
- Rachel L Weisenburger
- Department of Psychology and Institute for Mental Health Research, The University of Texas at Austin, United States of America.
- Daniel Labrousse
- Department of Psychiatry, Georgetown University Medical Center, United States of America
- Michelle Y Yang
- Department of Psychiatry, Georgetown University Medical Center, United States of America
- Allison Huff MacPherson
- Department of Family and Community Medicine, College of Medicine, University of Arizona, United States of America
- Kean J Hsu
- Department of Psychiatry, Georgetown University Medical Center, United States of America; Department of Psychology, National University of Singapore, Singapore
- Hassan Ugail
- Centre for Visual Computing, University of Bradford, United Kingdom of Great Britain and Northern Ireland
- Christopher G Beevers
- Department of Psychology and Institute for Mental Health Research, The University of Texas at Austin, United States of America

16
16
|
Stein F, Gruber M, Mauritz M, Brosch K, Pfarr JK, Ringwald KG, Thomas-Odenthal F, Wroblewski A, Evermann U, Steinsträter O, Grumbach P, Thiel K, Winter A, Bonnekoh LM, Flinkenflügel K, Goltermann J, Meinert S, Grotegerd D, Bauer J, Opel N, Hahn T, Leehr EJ, Jansen A, de Lange SC, van den Heuvel MP, Nenadić I, Krug A, Dannlowski U, Repple J, Kircher T. Brain Structural Network Connectivity of Formal Thought Disorder Dimensions in Affective and Psychotic Disorders. Biol Psychiatry 2024; 95:629-638. [PMID: 37207935] [DOI: 10.1016/j.biopsych.2023.05.010]
Abstract
BACKGROUND The psychopathological syndrome of formal thought disorder (FTD) is not only present in schizophrenia (SZ) but also highly prevalent in major depressive disorder and bipolar disorder. It remains unknown how alterations in the structural white matter connectome of the brain correlate with psychopathological FTD dimensions across affective and psychotic disorders. METHODS Using the FTD items of the Scale for the Assessment of Positive Symptoms and the Scale for the Assessment of Negative Symptoms, we performed exploratory and confirmatory factor analyses in 864 patients with major depressive disorder (n = 689), bipolar disorder (n = 108), or SZ (n = 67) to identify psychopathological FTD dimensions. We used T1- and diffusion-weighted magnetic resonance imaging to reconstruct the structural connectome of the brain. To investigate the association of FTD subdimensions and global structural connectome measures, we employed linear regression models. We used the network-based statistic to identify subnetworks of white matter fiber tracts associated with FTD symptomatology. RESULTS Three psychopathological FTD dimensions were delineated, i.e., disorganization, emptiness, and incoherence. Disorganization and incoherence were associated with global dysconnectivity. The network-based statistic identified subnetworks associated with the FTD dimensions disorganization and emptiness but not with the FTD dimension incoherence. Post hoc analyses on subnetworks did not reveal diagnosis × FTD dimension interaction effects. Results remained stable after correcting for medication and disease severity. Confirmatory analyses showed a substantial overlap of nodes from both subnetworks with cortical brain regions previously associated with FTD in SZ. CONCLUSIONS We demonstrated white matter subnetwork dysconnectivity in major depressive disorder, bipolar disorder, and SZ associated with FTD dimensions that predominantly comprise brain regions implicated in speech. Our results open an avenue for transdiagnostic, psychopathology-informed, dimensional studies in pathogenetic research.
Affiliation(s)
- Frederike Stein
- Department of Psychiatry and Psychotherapy, University of Marburg, Marburg, Germany; Center for Mind, Brain and Behavior, University of Marburg, Marburg, Germany.
- Marius Gruber
- Institute for Translational Psychiatry, University of Münster, Münster, Germany; Department of Psychiatry, Psychosomatic Medicine and Psychotherapy, University Hospital Frankfurt, Goethe University, Frankfurt, Germany
- Marco Mauritz
- Institute for Translational Psychiatry, University of Münster, Münster, Germany
- Katharina Brosch
- Department of Psychiatry and Psychotherapy, University of Marburg, Marburg, Germany; Center for Mind, Brain and Behavior, University of Marburg, Marburg, Germany
- Julia-Katharina Pfarr
- Department of Psychiatry and Psychotherapy, University of Marburg, Marburg, Germany; Center for Mind, Brain and Behavior, University of Marburg, Marburg, Germany
- Kai G Ringwald
- Department of Psychiatry and Psychotherapy, University of Marburg, Marburg, Germany; Center for Mind, Brain and Behavior, University of Marburg, Marburg, Germany
- Florian Thomas-Odenthal
- Department of Psychiatry and Psychotherapy, University of Marburg, Marburg, Germany; Center for Mind, Brain and Behavior, University of Marburg, Marburg, Germany
- Adrian Wroblewski
- Department of Psychiatry and Psychotherapy, University of Marburg, Marburg, Germany; Center for Mind, Brain and Behavior, University of Marburg, Marburg, Germany
- Ulrika Evermann
- Department of Psychiatry and Psychotherapy, University of Marburg, Marburg, Germany; Center for Mind, Brain and Behavior, University of Marburg, Marburg, Germany
- Olaf Steinsträter
- Department of Psychiatry and Psychotherapy, University of Marburg, Marburg, Germany; Center for Mind, Brain and Behavior, University of Marburg, Marburg, Germany
- Pascal Grumbach
- Institute for Translational Psychiatry, University of Münster, Münster, Germany
- Katharina Thiel
- Institute for Translational Psychiatry, University of Münster, Münster, Germany
- Alexandra Winter
- Institute for Translational Psychiatry, University of Münster, Münster, Germany
- Linda M Bonnekoh
- Institute for Translational Psychiatry, University of Münster, Münster, Germany
- Kira Flinkenflügel
- Institute for Translational Psychiatry, University of Münster, Münster, Germany
- Janik Goltermann
- Institute for Translational Psychiatry, University of Münster, Münster, Germany
- Susanne Meinert
- Institute for Translational Psychiatry, University of Münster, Münster, Germany; Institute for Translational Neuroscience, University of Münster, Münster, Germany
- Dominik Grotegerd
- Institute for Translational Psychiatry, University of Münster, Münster, Germany
- Jochen Bauer
- Department of Radiology, University of Münster, Münster, Germany
- Nils Opel
- Institute for Translational Psychiatry, University of Münster, Münster, Germany; Department of Psychiatry, Jena University Hospital/Friedrich Schiller University Jena, Jena, Germany
- Tim Hahn
- Institute for Translational Psychiatry, University of Münster, Münster, Germany
- Elisabeth J Leehr
- Institute for Translational Psychiatry, University of Münster, Münster, Germany
- Andreas Jansen
- Department of Psychiatry and Psychotherapy, University of Marburg, Marburg, Germany; Center for Mind, Brain and Behavior, University of Marburg, Marburg, Germany
- Siemon C de Lange
- Connectome Lab, Department of Complex Trait Genetics, Center for Neurogenomics and Cognitive Research, Vrije Universiteit Amsterdam, Amsterdam Neuroscience, Amsterdam, the Netherlands; Department of Sleep and Cognition, Netherlands Institute for Neuroscience, an institute of the Royal Netherlands Academy of Arts and Sciences, Amsterdam, The Netherlands
- Martijn P van den Heuvel
- Connectome Lab, Department of Complex Trait Genetics, Center for Neurogenomics and Cognitive Research, Vrije Universiteit Amsterdam, Amsterdam Neuroscience, Amsterdam, the Netherlands; Department of Child and Adolescent Psychiatry and Psychology, Section Complex Trait Genetics, Amsterdam Neuroscience, Vrije Universiteit Medical Center, Amsterdam UMC, Amsterdam, the Netherlands
- Igor Nenadić
- Department of Psychiatry and Psychotherapy, University of Marburg, Marburg, Germany; Center for Mind, Brain and Behavior, University of Marburg, Marburg, Germany
- Axel Krug
- Department of Psychiatry and Psychotherapy, University of Bonn, Bonn, Germany
- Udo Dannlowski
- Institute for Translational Psychiatry, University of Münster, Münster, Germany
- Jonathan Repple
- Institute for Translational Psychiatry, University of Münster, Münster, Germany; Department of Psychiatry, Psychosomatic Medicine and Psychotherapy, University Hospital Frankfurt, Goethe University, Frankfurt, Germany
- Tilo Kircher
- Department of Psychiatry and Psychotherapy, University of Marburg, Marburg, Germany; Center for Mind, Brain and Behavior, University of Marburg, Marburg, Germany

17
17
|
Zaher F, Diallo M, Achim AM, Joober R, Roy MA, Demers MF, Subramanian P, Lavigne KM, Lepage M, Gonzalez D, Zeljkovic I, Davis K, Mackinley M, Sabesan P, Lal S, Voppel A, Palaniyappan L. Speech markers to predict and prevent recurrent episodes of psychosis: A narrative overview and emerging opportunities. Schizophr Res 2024; 266:205-215. [PMID: 38428118] [DOI: 10.1016/j.schres.2024.02.036]
Abstract
Preventing relapse in schizophrenia improves long-term health outcomes. Repeated episodes of psychotic symptoms shape the trajectory of this illness and can be a detriment to functional recovery. Despite early intervention programs, high relapse rates persist, calling for alternative approaches to relapse prevention. Predicting imminent relapse at an individual level is critical for effective intervention. While clinical profiles are often used to foresee relapse, they lack the specificity and sensitivity needed for timely prediction. Here, we review the use of speech analysis through natural language processing (NLP) to predict a recurrent psychotic episode. Recent advances in NLP of speech have demonstrated the ability to detect linguistic markers related to thought disorder and other language disruptions within the 2-4 weeks preceding a relapse. This approach has been shown to capture individual speech patterns, showing promise as a prediction tool. We outline current developments in remote monitoring for psychotic relapses, discuss the challenges and limitations, and present the speech-NLP-based approach as an alternative that can detect relapses with sufficient accuracy, construct validity, and lead time to support clinical actions toward prevention.
Affiliation(s)
- Farida Zaher
- Douglas Mental Health University Institute, Department of Psychiatry, McGill University, Montreal, QC, Canada
- Mariama Diallo
- Douglas Mental Health University Institute, Department of Psychiatry, McGill University, Montreal, QC, Canada
- Amélie M Achim
- Département de Psychiatrie et Neurosciences, Université Laval, Québec City, QC, Canada; Vitam - Centre de Recherche en Santé Durable, Québec City, QC, Canada; Centre de Recherche CERVO, Québec City, QC, Canada
- Ridha Joober
- Douglas Mental Health University Institute, Department of Psychiatry, McGill University, Montreal, QC, Canada
- Marc-André Roy
- Département de Psychiatrie et Neurosciences, Université Laval, Québec City, QC, Canada; Centre de Recherche CERVO, Québec City, QC, Canada
- Marie-France Demers
- Centre de Recherche CERVO, Québec City, QC, Canada; Faculté de Pharmacie, Université Laval, Québec City, QC, Canada
- Priya Subramanian
- Department of Psychiatry, Schulich School of Medicine, Western University, London, ON, Canada
- Katie M Lavigne
- Douglas Mental Health University Institute, Department of Psychiatry, McGill University, Montreal, QC, Canada
- Martin Lepage
- Douglas Mental Health University Institute, Department of Psychiatry, McGill University, Montreal, QC, Canada
- Daniela Gonzalez
- Prevention and Early Intervention Program for Psychosis, London Health Sciences Center, Lawson Health Research Institute, London, ON, Canada
- Irnes Zeljkovic
- Department of Psychiatry, Schulich School of Medicine, Western University, London, ON, Canada
- Kristin Davis
- Douglas Mental Health University Institute, Department of Psychiatry, McGill University, Montreal, QC, Canada
- Michael Mackinley
- Department of Psychiatry, Schulich School of Medicine, Western University, London, ON, Canada; Prevention and Early Intervention Program for Psychosis, London Health Sciences Center, Lawson Health Research Institute, London, ON, Canada
- Priyadharshini Sabesan
- Lakeshore General Hospital and Department of Psychiatry, McGill University, Montreal, QC, Canada
- Shalini Lal
- Douglas Mental Health University Institute, Department of Psychiatry, McGill University, Montreal, QC, Canada; Centre de Recherche du Centre Hospitalier de l'Université de Montréal (CRCHUM), Montréal, QC, Canada; School of Rehabilitation, Faculty of Medicine, University of Montréal, Montréal, QC, Canada
- Alban Voppel
- Douglas Mental Health University Institute, Department of Psychiatry, McGill University, Montreal, QC, Canada
- Lena Palaniyappan
- Douglas Mental Health University Institute, Department of Psychiatry, McGill University, Montreal, QC, Canada; Department of Psychiatry, Schulich School of Medicine, Western University, London, ON, Canada; Robarts Research Institute, Western University, London, ON, Canada.

18
18
|
Casten LG, Koomar T, Elsadany M, McKone C, Tysseling B, Sasidharan M, Tomblin JB, Michaelson JJ. Lingo: an automated, web-based deep phenotyping platform for language ability. medRxiv [Preprint] 2024:2024.03.29.24305034. [PMID: 38585791] [PMCID: PMC10996758] [DOI: 10.1101/2024.03.29.24305034]
Abstract
Background Language and the ability to communicate effectively are key factors in mental health and well-being. Despite this critical importance, research on language is limited by the lack of a scalable phenotyping toolkit. Methods Here, we describe and showcase Lingo - a flexible online battery of language and nonverbal reasoning skills based on seven widely used tasks (COWAT, picture narration, vocal rhythm entrainment, rapid automatized naming, following directions, sentence repetition, and nonverbal reasoning). The current version of Lingo takes approximately 30 minutes to complete, is entirely open source, and allows for a wide variety of performance metrics to be extracted. We asked > 1,300 individuals from multiple samples to complete Lingo, then investigated the validity and utility of the resulting data. Results We conducted an exploratory factor analysis across 14 features derived from the seven assessments, identifying five factors. Four of the five factors showed acceptable test-retest reliability (Pearson's R > 0.7). Factor 2 showed the highest reliability (Pearson's R = 0.95) and loaded primarily on sentence repetition task performance. We validated Lingo with objective measures of language ability by comparing performance to gold-standard assessments: CELF-5 and the VABS-3. Factor 2 was significantly associated with the CELF-5 "core language ability" scale (Pearson's R = 0.77, p-value < 0.05) and the VABS-3 "communication" scale (Pearson's R = 0.74, p-value < 0.05). Factor 2 was positively associated with phenotypic and genetic measures of socieconomic status. Interestingly, we found the parents of children with language impairments had lower Factor 2 scores (p-value < 0.01). Finally, we found Lingo factor scores were significantly predictive of numerous psychiatric and neurodevelopmental conditions. Conclusions Together, these analyses support Lingo as a powerful platform for scalable deep phenotyping of language and other cognitive abilities. 
Additionally, exploratory analyses provide supporting evidence for the heritability of language ability and the complex relationship between mental health and language.
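The test-retest reliabilities quoted above are Pearson correlations between two administrations of the battery. A minimal sketch of how such a reliability coefficient is computed, using synthetic factor scores rather than the study's data:

```python
import numpy as np

rng = np.random.default_rng(42)

# synthetic factor scores for 100 participants at two sessions;
# session 2 repeats session 1 plus independent noise, so reliability
# is high but imperfect (illustrative data, not the study's)
session1 = rng.normal(size=100)
session2 = session1 + 0.3 * rng.normal(size=100)

# test-retest reliability = Pearson's R between the two sessions
r = np.corrcoef(session1, session2)[0, 1]
```

With a noise standard deviation of 0.3, the population value of R is about 0.96, in the range the paper labels highly reliable.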
Affiliation(s)
- Lucas G. Casten
- Interdisciplinary Graduate Program in Genetics, University of Iowa, Iowa City, IA
- Department of Psychiatry, University of Iowa, Iowa City, IA
- Tanner Koomar
- Department of Psychiatry, University of Iowa, Iowa City, IA
- Muhammad Elsadany
- Interdisciplinary Graduate Program in Genetics, University of Iowa, Iowa City, IA
- Department of Psychiatry, University of Iowa, Iowa City, IA
- Caleb McKone
- Department of Psychiatry, University of Iowa, Iowa City, IA
- Ben Tysseling
- Department of Psychiatry, University of Iowa, Iowa City, IA
- J. Bruce Tomblin
- Department of Communication Sciences and Disorders, University of Iowa, Iowa City, IA
- Jacob J. Michaelson
- Department of Psychiatry, University of Iowa, Iowa City, IA
- Department of Communication Sciences and Disorders, University of Iowa, Iowa City, IA
- Iowa Neuroscience Institute, University of Iowa, Iowa City, IA
- Hawkeye Intellectual and Developmental Disabilities Research Center (Hawk-IDDRC), University of Iowa, Iowa City, IA
19
Maleki Varnosfaderani S, Forouzanfar M. The Role of AI in Hospitals and Clinics: Transforming Healthcare in the 21st Century. Bioengineering (Basel) 2024; 11:337. [PMID: 38671759 PMCID: PMC11047988 DOI: 10.3390/bioengineering11040337] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/28/2024] [Revised: 03/25/2024] [Accepted: 03/26/2024] [Indexed: 04/28/2024] Open
Abstract
As healthcare systems around the world face challenges such as escalating costs, limited access, and growing demand for personalized care, artificial intelligence (AI) is emerging as a key force for transformation. This review is motivated by the urgent need to harness AI's potential to mitigate these issues and aims to critically assess AI's integration in different healthcare domains. We explore how AI empowers clinical decision-making, optimizes hospital operation and management, refines medical image analysis, and revolutionizes patient care and monitoring through AI-powered wearables. Through several case studies, we review how AI has transformed specific healthcare domains and discuss the remaining challenges and possible solutions. Additionally, we discuss methodologies for assessing AI healthcare solutions, ethical challenges of AI deployment, and the importance of data privacy and bias mitigation for responsible technology use. By presenting a critical assessment of AI's transformative potential, this review equips researchers with a deeper understanding of AI's current and future impact on healthcare. It encourages an interdisciplinary dialogue between researchers, clinicians, and technologists to navigate the complexities of AI implementation, fostering the development of AI-driven solutions that prioritize ethical standards, equity, and a patient-centered approach.
Affiliation(s)
- Mohamad Forouzanfar
- Département de Génie des Systèmes, École de Technologie Supérieure (ÉTS), Université du Québec, Montréal, QC H3C 1K3, Canada
- Centre de Recherche de L’institut Universitaire de Gériatrie de Montréal (CRIUGM), Montréal, QC H3W 1W5, Canada
20
Low DM, Rao V, Randolph G, Song PC, Ghosh SS. Identifying bias in models that detect vocal fold paralysis from audio recordings using explainable machine learning and clinician ratings. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2024:2020.11.23.20235945. [PMID: 33501466 PMCID: PMC7836138 DOI: 10.1101/2020.11.23.20235945] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Indexed: 11/25/2022]
Abstract
Introduction Detecting voice disorders from voice recordings could allow for frequent, remote, and low-cost screening before costly clinical visits and more invasive laryngoscopy examinations. Our goals were to detect unilateral vocal fold paralysis (UVFP) from voice recordings using machine learning, to identify which acoustic variables were important for prediction in order to increase trust, and to determine model performance relative to clinician performance. Methods Patients with UVFP confirmed through endoscopic examination (N=77) and controls with normal voices matched for age and sex (N=77) were included. Voice samples were elicited by reading the Rainbow Passage and sustaining phonation of the vowel "a". Four machine learning models of differing complexity were used. SHapley Additive exPlanations (SHAP) was used to identify important features. Results The highest median bootstrapped ROC AUC score was 0.87, exceeding clinicians' performance on the same recordings (range: 0.74-0.81). Recording durations differed between UVFP recordings and controls because of how the data were originally processed during storage, and we show that duration alone can classify the two groups. Counterintuitively, many UVFP recordings also had higher intensity than controls, even though UVFP patients tend to have weaker voices, revealing a dataset-specific bias that we mitigate in an additional analysis. Conclusion We demonstrate that recording biases in audio duration and intensity created dataset-specific differences between patients and controls, which the models exploited to improve classification. Clinicians' ratings provide further evidence that patients were over-projecting their voices and were recorded at a higher amplitude than controls. Interestingly, after matching audio duration and removing variables associated with intensity to mitigate these biases, the models still achieved similarly high performance. We provide a set of recommendations to avoid bias when building and evaluating machine learning models for screening in laryngology.
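SHAP itself requires the `shap` package; a dependency-free stand-in that conveys the same idea of attributing a model's behavior to individual features is permutation importance. The sketch below uses toy data in which one feature (standing in for a biased variable such as recording duration) carries all of the signal; the data, model, and feature layout are hypothetical, not the study's:

```python
import numpy as np

rng = np.random.default_rng(0)

# toy data: feature 0 carries the signal, features 1-2 are noise
X = rng.normal(size=(200, 3))
y = (X[:, 0] > 0).astype(int)

def model(X):
    """Stand-in 'trained model': thresholds feature 0, mimicking a
    classifier that latched onto a single biased variable."""
    return (X[:, 0] > 0).astype(int)

def permutation_importance(model, X, y, n_repeats=20, rng=rng):
    """Accuracy drop when each feature column is shuffled in turn."""
    base = np.mean(model(X) == y)
    drops = []
    for j in range(X.shape[1]):
        accs = []
        for _ in range(n_repeats):
            Xp = X.copy()
            Xp[:, j] = rng.permutation(Xp[:, j])  # break feature j's link to y
            accs.append(np.mean(model(Xp) == y))
        drops.append(base - np.mean(accs))
    return np.array(drops)

imp = permutation_importance(model, X, y)
```

Here the importance of feature 0 dwarfs the others, which is exactly the signature the paper describes: a model whose predictions hinge on one dataset artifact.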
Affiliation(s)
- Daniel M. Low
- Program in Speech and Hearing Bioscience and Technology, Harvard Medical School, Boston, MA, USA
- McGovern Institute for Brain Research, MIT, Cambridge, MA, USA
- Vishwanatha Rao
- Department of Biomedical Engineering, Columbia University, New York, NY, USA
- Department of Otolaryngology–Head and Neck Surgery, Massachusetts Eye and Ear Infirmary, Boston, MA, USA
- Gregory Randolph
- Department of Otolaryngology–Head and Neck Surgery, Massachusetts Eye and Ear Infirmary, Boston, MA, USA
- Department of Otolaryngology–Head and Neck Surgery, Harvard Medical School, Boston, MA, USA
- Phillip C. Song
- Department of Otolaryngology–Head and Neck Surgery, Massachusetts Eye and Ear Infirmary, Boston, MA, USA
- Department of Otolaryngology–Head and Neck Surgery, Harvard Medical School, Boston, MA, USA
- Satrajit S. Ghosh
- Program in Speech and Hearing Bioscience and Technology, Harvard Medical School, Boston, MA, USA
- McGovern Institute for Brain Research, MIT, Cambridge, MA, USA
- Department of Otolaryngology–Head and Neck Surgery, Harvard Medical School, Boston, MA, USA
21
Larsen E, Murton O, Song X, Joachim D, Watts D, Kapczinski F, Venesky L, Hurowitz G. Validating the efficacy and value proposition of mental fitness vocal biomarkers in a psychiatric population: prospective cohort study. Front Psychiatry 2024; 15:1342835. [PMID: 38505797 PMCID: PMC10948552 DOI: 10.3389/fpsyt.2024.1342835] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 11/22/2023] [Accepted: 02/14/2024] [Indexed: 03/21/2024] Open
Abstract
Background The utility of vocal biomarkers for mental health assessment has gained increasing attention. This study aims to further this line of research by introducing a novel vocal scoring system designed to provide mental fitness tracking insights to users in real-world settings. Methods A prospective cohort study with 104 outpatient psychiatric participants was conducted to validate the "Mental Fitness Vocal Biomarker" (MFVB) score. The MFVB score was derived from eight vocal features, selected based on literature review. Participants' mental health symptom severity was assessed using the M3 Checklist, which serves as a transdiagnostic tool for measuring depression, anxiety, post-traumatic stress disorder, and bipolar symptoms. Results The MFVB demonstrated an ability to stratify individuals by their risk of elevated mental health symptom severity. Continuous observation enhanced the MFVB's efficacy, with risk ratios improving from 1.53 (1.09-2.14, p=0.0138) for single 30-second voice samples to 2.00 (1.21-3.30, p=0.0068) for data aggregated over two weeks. A higher risk ratio of 8.50 (2.31-31.25, p=0.0013) was observed in participants who used the MFVB 5-6 times per week, underscoring the utility of frequent and continuous observation. Participant feedback confirmed the user-friendliness of the application and its perceived benefits. Conclusions The MFVB is a promising tool for objective mental health tracking in real-world conditions, with potential to be a cost-effective, scalable, and privacy-preserving adjunct to traditional psychiatric assessments. User feedback suggests that vocal biomarkers can offer personalized insights and support clinical therapy and other beneficial activities that are associated with improved mental health risks and outcomes.
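Risk ratios of the kind reported above compare the rate of an outcome (e.g. elevated symptom severity) between an exposed and an unexposed group, with the confidence interval computed on the log scale. A minimal sketch; the counts are illustrative, not the study's:

```python
import math

def risk_ratio(a, n_exposed, c, n_unexposed, z=1.96):
    """Risk ratio of an outcome in an 'exposed' group (e.g. low vocal
    biomarker score) vs. an unexposed group, with a 95% CI computed on
    the log scale (Katz method). Counts here are hypothetical."""
    r1, r0 = a / n_exposed, c / n_unexposed
    rr = r1 / r0
    # standard error of ln(RR)
    se = math.sqrt(1/a - 1/n_exposed + 1/c - 1/n_unexposed)
    lo = math.exp(math.log(rr) - z * se)
    hi = math.exp(math.log(rr) + z * se)
    return rr, lo, hi

# hypothetical 2x2 counts: 30/50 exposed vs 20/50 unexposed with the outcome
rr, lo, hi = risk_ratio(30, 50, 20, 50)
```

As in the abstract, a CI whose lower bound sits near 1 (here roughly 1.0-2.3 around RR = 1.5) signals a weaker stratification than one bounded well above 1.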
Affiliation(s)
- Devon Watts
- Neuroscience Graduate Program, Department of Health Sciences, McMaster University, Hamilton, ON, Canada
- St. Joseph’s Healthcare Hamilton, Hamilton, ON, Canada
- Flavio Kapczinski
- Neuroscience Graduate Program, Department of Health Sciences, McMaster University, Hamilton, ON, Canada
- Department of Psychiatry, Universidade Federal do Rio Grande do Sul, Porto Alegre, Brazil
22
Wang L, Liu R, Wang Y, Xu X, Zhang R, Wei Y, Zhu R, Zhang X, Wang F. Effectiveness of a Biofeedback Intervention Targeting Mental and Physical Health Among College Students Through Speech and Physiology as Biomarkers Using Machine Learning: A Randomized Controlled Trial. Appl Psychophysiol Biofeedback 2024; 49:71-83. [PMID: 38165498 DOI: 10.1007/s10484-023-09612-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 11/24/2023] [Indexed: 01/03/2024]
Abstract
Biofeedback therapy is mainly based on the analysis of physiological features to improve an individual's affective state, but there are insufficient objective indicators to assess symptom improvement after biofeedback. In addition to psychological and physiological features, speech features can precisely convey information about emotions, and their use can improve the objectivity of psychiatric assessments. Therefore, biofeedback evaluated against subjective symptom scales together with objective speech and physiological features provides a new approach for early screening and treatment of emotional problems in college students. A 4-week, randomized, controlled, parallel biofeedback therapy study was conducted with college students with symptoms of anxiety or depression. Speech samples, physiological samples, and clinical symptoms were collected at baseline and at the end of treatment, and the extracted speech and physiological features were used for between-group comparisons and correlation analyses between the biofeedback and wait-list groups. Based on the speech features that differed between the biofeedback intervention and wait-list groups, an artificial neural network (ANN) was used to predict the therapeutic effect and response after biofeedback therapy. Through biofeedback therapy, improvements in depression (p = 0.001), anxiety (p = 0.001), insomnia (p = 0.013), and stress (p = 0.004) severity were observed in college students (n = 52). The speech and physiological features in the biofeedback group also changed significantly compared to the wait-list group (n = 52) and were related to the change in symptoms. Among the speech features, the energy parameters and Mel-frequency cepstral coefficients (MFCC) can predict whether the biofeedback intervention effectively improves anxiety and insomnia symptoms, as well as treatment response. The accuracy of the ANN classification model for treatment response versus non-response was approximately 60%. The results of this study provide valuable information about biofeedback for improving the mental health of college students. The study identified speech features, such as the energy parameters and MFCC, as more accurate and objective indicators for tracking biofeedback therapy response and predicting efficacy. Trial Registration Chinese Clinical Trial Registry ChiCTR2100045542.
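The "energy parameters" referenced above are short-time features computed over sliding frames of the speech waveform; full MFCC extraction is usually delegated to a library such as librosa. A dependency-light sketch of the framing and RMS-energy step, run on a synthetic tone (the 25 ms frame / 10 ms hop at 16 kHz are typical choices, not necessarily the study's settings):

```python
import numpy as np

def frame_signal(y, frame_len=400, hop=160):
    """Slice a 1-D signal into overlapping frames
    (400 samples / 160-sample hop = 25 ms / 10 ms at 16 kHz)."""
    n = 1 + max(0, (len(y) - frame_len) // hop)
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n)[:, None]
    return y[idx]

def rms_energy(frames):
    """Root-mean-square energy of each frame."""
    return np.sqrt(np.mean(frames ** 2, axis=1))

# toy input: 1 s of a 220 Hz tone at 16 kHz, amplitude 0.5
sr = 16000
t = np.arange(sr) / sr
y = 0.5 * np.sin(2 * np.pi * 220 * t)

frames = frame_signal(y)
energy = rms_energy(frames)   # one energy value per 10 ms hop
```

For a steady 0.5-amplitude sine the per-frame RMS hovers around 0.5/√2 ≈ 0.354; in real speech this trajectory (and its statistics) is what feeds the classifier.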
Affiliation(s)
- Lifei Wang
- Early Intervention Unit, Department of Psychiatry, The Affiliated Brain Hospital of Nanjing Medical University, Nanjing, People's Republic of China
- Functional Brain Imaging Institute of Nanjing Medical University, Nanjing, People's Republic of China
- Rongxun Liu
- Early Intervention Unit, Department of Psychiatry, The Affiliated Brain Hospital of Nanjing Medical University, Nanjing, People's Republic of China
- Functional Brain Imaging Institute of Nanjing Medical University, Nanjing, People's Republic of China
- Henan Key Laboratory of Immunology and Targeted Drugs, School of Laboratory Medicine, Xinxiang Medical University, Xinxiang, People's Republic of China
- Yang Wang
- Early Intervention Unit, Department of Psychiatry, The Affiliated Brain Hospital of Nanjing Medical University, Nanjing, People's Republic of China
- Functional Brain Imaging Institute of Nanjing Medical University, Nanjing, People's Republic of China
- Psychology Institute, Inner Mongolia Normal University, Hohhot, Inner Mongolia, People's Republic of China
- Xiao Xu
- School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing, Jiangsu, China
- Ran Zhang
- Early Intervention Unit, Department of Psychiatry, The Affiliated Brain Hospital of Nanjing Medical University, Nanjing, People's Republic of China
- Functional Brain Imaging Institute of Nanjing Medical University, Nanjing, People's Republic of China
- Yange Wei
- Early Intervention Unit, Department of Psychiatry, The Affiliated Brain Hospital of Nanjing Medical University, Nanjing, People's Republic of China
- Functional Brain Imaging Institute of Nanjing Medical University, Nanjing, People's Republic of China
- Rongxin Zhu
- Early Intervention Unit, Department of Psychiatry, The Affiliated Brain Hospital of Nanjing Medical University, Nanjing, People's Republic of China
- Functional Brain Imaging Institute of Nanjing Medical University, Nanjing, People's Republic of China
- Xizhe Zhang
- School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing, Jiangsu, China
- Fei Wang
- Early Intervention Unit, Department of Psychiatry, The Affiliated Brain Hospital of Nanjing Medical University, Nanjing, People's Republic of China
- Functional Brain Imaging Institute of Nanjing Medical University, Nanjing, People's Republic of China
- Department of Mental Health, School of Public Health, Nanjing Medical University, Nanjing, China
23
Evangelista E, Kale R, McCutcheon D, Rameau A, Gelbard A, Powell M, Johns M, Law A, Song P, Naunheim M, Watts S, Bryson PC, Crowson MG, Pinto J, Bensoussan Y. Current Practices in Voice Data Collection and Limitations to Voice AI Research: A National Survey. Laryngoscope 2024; 134:1333-1339. [PMID: 38087983 DOI: 10.1002/lary.31052] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/01/2023] [Revised: 08/08/2023] [Accepted: 08/29/2023] [Indexed: 02/17/2024]
Abstract
INTRODUCTION The accuracy and validity of voice AI algorithms rely on substantial amounts of quality voice data. Although considerable amounts of voice data are captured daily in voice centers across North America, there is no standardized protocol for acoustic data management, which limits the usability of these datasets for voice artificial intelligence (AI) research. OBJECTIVE The aim was to capture current practices of voice data collection, storage, and analysis, and perceived limitations to collaborative voice research. METHODS A 30-question online survey was developed with expert guidance from members of voicecollab.ai, an international collaborative of voice AI researchers. The survey was disseminated via REDCap to an estimated 200 practitioners at North American voice centers. Survey questions assessed respondents' current practices in terms of acoustic data collection, storage, and retrieval, as well as limitations to collaborative voice research. RESULTS Seventy-two respondents completed the survey, of whom 81.7% were laryngologists and 18.3% were speech-language pathologists (SLPs). Eighteen percent of respondents reported seeing 40-60, and 55% reported seeing more than 60, patients with voice disorders weekly (a conservative estimate of over 4000 patients/week). Only 28% of respondents reported utilizing standardized protocols for collection and storage of acoustic data. Although 87% of respondents conduct voice research, only 38% report doing so on a multi-institutional level. Perceived limitations to conducting collaborative voice research include a lack of standardized methodology for collection (30%) and a lack of human resources to prepare and label voice data adequately (55%). CONCLUSION To conduct large-scale multi-institutional voice research with AI, there is a pertinent need for standardization of acoustic data management, as well as an infrastructure for secure and efficient data sharing. LEVEL OF EVIDENCE 5 Laryngoscope, 134:1333-1339, 2024.
Affiliation(s)
- Emily Evangelista
- University of South Florida Morsani College of Medicine, Tampa, Florida, U.S.A.
- Rohan Kale
- Department of Biology, University of South Florida, Tampa, Florida, U.S.A.
- Anais Rameau
- Department of Otolaryngology-Head and Neck Surgery, Weill Cornell Medical College, Ithaca, New York, U.S.A.
- Alexander Gelbard
- Department of Otolaryngology-Head and Neck Surgery, Vanderbilt University Medical Center, Nashville, Tennessee, U.S.A.
- Maria Powell
- Department of Otolaryngology-Head and Neck Surgery, Vanderbilt University Medical Center, Nashville, Tennessee, U.S.A.
- Michael Johns
- Department of Otolaryngology-Head and Neck Surgery, Keck College of Medicine, University of Southern California, Los Angeles, California, U.S.A.
- Anthony Law
- Department of Otolaryngology, Emory University School of Medicine, Atlanta, Georgia, U.S.A.
- Phillip Song
- Division of Laryngology, Otolaryngology-Head and Neck Surgery, Massachusetts Eye and Ear, Harvard Medical School, Boston, Massachusetts, U.S.A.
- Matthew Naunheim
- Division of Laryngology, Otolaryngology-Head and Neck Surgery, Massachusetts Eye and Ear, Harvard Medical School, Boston, Massachusetts, U.S.A.
- Stephanie Watts
- Department of Otolaryngology-Head and Neck Surgery, University of South Florida Morsani College of Medicine, Tampa, Florida, U.S.A.
- Paul C Bryson
- Department of Otolaryngology-Head and Neck Surgery, Cleveland Clinic, Cleveland, Ohio, U.S.A.
- Matthew G Crowson
- Otolaryngology-Head and Neck Surgery, Massachusetts Eye and Ear, Harvard Medical School, Boston, Massachusetts, U.S.A.
- Jeremy Pinto
- Mila Quebec Artificial Intelligence Institute, Montreal, Quebec, Canada
- Yael Bensoussan
- Division of Laryngology, Department of Otolaryngology-Head and Neck Surgery, University of South Florida Morsani College of Medicine, Tampa, Florida, U.S.A.
24
Treccarichi S, Failla P, Vinci M, Musumeci A, Gloria A, Vasta A, Calabrese G, Papa C, Federico C, Saccone S, Calì F. UNC5C: Novel Gene Associated with Psychiatric Disorders Impacts Dysregulation of Axon Guidance Pathways. Genes (Basel) 2024; 15:306. [PMID: 38540364 PMCID: PMC10970690 DOI: 10.3390/genes15030306] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/05/2024] [Revised: 02/23/2024] [Accepted: 02/25/2024] [Indexed: 06/14/2024] Open
Abstract
The UNC-5 family of netrin receptor genes, predominantly expressed in brain tissues, plays a pivotal role in various neuronal processes. Mutations in genes involved in axon development contribute to a wide spectrum of human diseases, including developmental, neuropsychiatric, and neurodegenerative disorders. The NTN1/DCC signaling pathway, interacting with UNC5C, plays a crucial role in central nervous system axon guidance and has been associated with psychiatric disorders during adolescence in humans. Whole-exome sequencing unveiled two compound heterozygous causative mutations within the UNC5C gene in a patient diagnosed with psychiatric disorders. In silico analysis showed that neither of the observed variants affected the allosteric linkage between UNC5C and NTN1. Instead, these mutations are located within crucial cytoplasmic domains, specifically ZU5 and the region required for netrin-mediated axon repulsion of neuronal growth cones. These domains play a critical role in forming the supramodular protein structure and directly interact with microtubules, thereby ensuring the functionality of the axon repulsion process. We emphasize that these mutations disrupt the aforementioned processes, thereby associating the UNC5C gene with psychiatric disorders for the first time and expanding the set of genes implicated in such disorders. Further research is required to validate this association, but we suggest including UNC5C in the genetic analysis of patients with psychiatric disorders.
Affiliation(s)
- Simone Treccarichi
- Oasi Research Institute-IRCCS, 94018 Troina, Italy
- Pinella Failla
- Oasi Research Institute-IRCCS, 94018 Troina, Italy
- Mirella Vinci
- Oasi Research Institute-IRCCS, 94018 Troina, Italy
- Antonino Musumeci
- Oasi Research Institute-IRCCS, 94018 Troina, Italy
- Angelo Gloria
- Oasi Research Institute-IRCCS, 94018 Troina, Italy
- Anna Vasta
- Oasi Research Institute-IRCCS, 94018 Troina, Italy
- Giuseppe Calabrese
- Oasi Research Institute-IRCCS, 94018 Troina, Italy
- Carla Papa
- Oasi Research Institute-IRCCS, 94018 Troina, Italy
- Concetta Federico
- Department of Biological, Geological and Environmental Sciences, University of Catania, Via Androne 81, 95124 Catania, Italy
- Salvatore Saccone
- Department of Biological, Geological and Environmental Sciences, University of Catania, Via Androne 81, 95124 Catania, Italy
- Francesco Calì
- Oasi Research Institute-IRCCS, 94018 Troina, Italy
25
Albert P, Haider F, Luz S. CUSCO: An Unobtrusive Custom Secure Audio-Visual Recording System for Ambient Assisted Living. SENSORS (BASEL, SWITZERLAND) 2024; 24:1506. [PMID: 38475042 DOI: 10.3390/s24051506] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 12/15/2023] [Revised: 02/21/2024] [Accepted: 02/24/2024] [Indexed: 03/14/2024]
Abstract
The ubiquity of digital technology has facilitated detailed recording of human behaviour. Ambient technology has been used to capture behaviours in a broad range of applications, from healthcare and monitoring to the assessment of cooperative work. However, existing systems often face challenges in terms of autonomy, usability, and privacy. This paper presents a portable, easy-to-use, and privacy-preserving system for capturing behavioural signals unobtrusively in home or office settings. The system focuses on the capture of audio, video, and depth imaging. It is based on a device built on a small form-factor platform that incorporates ambient sensors which can be integrated with the audio and depth video hardware for multimodal behaviour tracking. The system can be accessed remotely and integrated into a network of sensors. Data are encrypted in real time to ensure safety and privacy. We illustrate uses of the device in two different settings: a healthy-ageing IoT application, where the device is used in conjunction with a range of IoT sensors to monitor an older person's mental well-being at home, and a healthcare communication quality assessment application, where the device is used to capture a patient-clinician interaction for consultation quality appraisal. CUSCO can automatically detect active speakers, extract acoustic features, record video and depth streams, and recognise emotions and cognitive impairment with promising accuracy.
Affiliation(s)
- Pierre Albert
- National Institute for Public Health and the Environment, 3721 MA Bilthoven, The Netherlands
- Fasih Haider
- School of Engineering, The University of Edinburgh, Edinburgh EH9 3JW, UK
- Saturnino Luz
- Usher Institute, Edinburgh Medical School, The University of Edinburgh, Edinburgh EH8 9YL, UK
26
Li S, Nair R, Naqvi SM. Acoustic and Text Features Analysis for Adult ADHD Screening: A Data-Driven Approach Utilizing DIVA Interview. IEEE JOURNAL OF TRANSLATIONAL ENGINEERING IN HEALTH AND MEDICINE 2024; 12:359-370. [PMID: 38606391 PMCID: PMC11008805 DOI: 10.1109/jtehm.2024.3369764] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 10/18/2023] [Revised: 01/09/2024] [Accepted: 02/15/2024] [Indexed: 04/13/2024]
Abstract
Attention deficit hyperactivity disorder (ADHD) is a neurodevelopmental disorder commonly seen in childhood that leads to behavioural changes in social development and communication patterns. It often continues undiagnosed into adulthood due to a global shortage of psychiatrists, resulting in delayed diagnoses with lasting consequences for individuals' well-being and for society. Recently, machine learning methodologies have been incorporated into healthcare systems to facilitate diagnosis and enhance the prediction of treatment outcomes for mental health conditions. Previous research on ADHD detection focused on functional magnetic resonance imaging (fMRI) or electroencephalography (EEG) signals, which require costly equipment and trained personnel for data collection. In recent years, speech and text modalities have garnered increasing attention due to their cost-effectiveness and non-wearable sensing in data collection. In this research, conducted in collaboration with the Cumbria, Northumberland, Tyne and Wear NHS Foundation Trust, we gathered audio data from both ADHD patients and normal controls based on the clinically popular Diagnostic Interview for ADHD in Adults (DIVA). We then transcribed the speech data to text using the Google Cloud Speech API. We extracted both acoustic and text features from the data, encompassing traditional acoustic features (e.g., MFCC), specialized feature sets (e.g., eGeMAPS), and deep-learned linguistic and semantic features derived from pre-trained deep learning models. These features were employed in conjunction with a support vector machine for ADHD classification, yielding promising outcomes for effective adult ADHD screening from audio and text data. Clinical impact: This research introduces a transformative approach to ADHD diagnosis, employing speech and text analysis to facilitate earlier and more accessible detection, particularly beneficial in areas with limited psychiatric resources. Clinical and Translational Impact Statement: The successful application of machine learning techniques to audio and text data for ADHD screening represents a significant advancement in mental health diagnostics, paving the way for integration into clinical settings and potentially improving patient outcomes on a broader scale.
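The pipeline described above concatenates acoustic and text feature vectors ("early fusion") before feeding them to a linear classifier. A minimal sketch on synthetic data, using logistic regression trained by gradient descent as a dependency-free stand-in for the paper's support vector machine; the feature dimensions, labels, and injected signal are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

# hypothetical feature dimensions: 13 acoustic (e.g. MFCC means) and
# 8 text features per interview; labels: 1 = ADHD, 0 = control
n = 40
X_acoustic = rng.normal(size=(n, 13))
X_text = rng.normal(size=(n, 8))
y = np.array([1] * 20 + [0] * 20)
X_acoustic[:20, 0] += 3.0            # inject a separable toy signal

X = np.hstack([X_acoustic, X_text])  # early fusion by concatenation
X = (X - X.mean(0)) / X.std(0)       # z-score each fused feature

# logistic regression by full-batch gradient descent
w, b = np.zeros(X.shape[1]), 0.0
for _ in range(500):
    z = np.clip(X @ w + b, -30, 30)
    p = 1.0 / (1.0 + np.exp(-z))     # predicted probabilities
    w -= 0.5 * X.T @ (p - y) / n     # gradient step on weights
    b -= 0.5 * np.mean(p - y)        # gradient step on bias

train_acc = np.mean(((X @ w + b) > 0) == y)
```

The point of the sketch is the fusion step: acoustic and text features live in one vector, so a single linear decision boundary can weigh evidence from both modalities at once.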
Affiliation(s)
- Shuanglin Li
- Intelligent Sensing and Communications Group, School of Engineering, Newcastle University, Newcastle Upon Tyne NE1 7RU, U.K.
- Rajesh Nair
- Adult ADHD Services, Cumbria, Northumberland, Tyne and Wear NHS Foundation Trust, Newcastle Upon Tyne NE3 3XT, U.K.
- Syed Mohsen Naqvi
- Intelligent Sensing and Communications Group, School of Engineering, Newcastle University, Newcastle Upon Tyne NE1 7RU, U.K.
27
Luo J, Wu Y, Liu M, Li Z, Wang Z, Zheng Y, Feng L, Lu J, He F. Differentiation between depression and bipolar disorder in child and adolescents by voice features. Child Adolesc Psychiatry Ment Health 2024; 18:19. [PMID: 38287442 PMCID: PMC10826007 DOI: 10.1186/s13034-024-00708-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 07/30/2023] [Accepted: 01/11/2024] [Indexed: 01/31/2024] Open
Abstract
OBJECTIVE Major depressive disorder (MDD) and bipolar disorder (BD) are serious, chronic, disabling mental and emotional disorders whose symptoms often manifest atypically in children and adolescents, making diagnosis difficult without objective physiological indicators. We therefore aimed to objectively identify MDD and BD in children and adolescents by exploring their voiceprint features. METHODS This study included 150 participants aged 6 to 16 years: 50 MDD patients, 50 BD patients, and 50 healthy controls. After collecting voiceprint data, the chi-square test was used to screen and extract voiceprint features specific to emotional disorders in children and adolescents. The selected voiceprint features were then split into training and testing datasets in a 7:3 ratio. The performance of various machine learning and deep learning algorithms was compared on the training dataset, and the optimal algorithm was selected to classify the testing dataset and calculate sensitivity, specificity, accuracy, and the ROC curve. RESULTS The three groups showed differences in clustering centers for various voice features such as root mean square energy, power spectral slope, low-frequency percentile energy level, high-frequency spectral slope, spectral harmonic gain, and audio signal energy level. The linear SVM model showed the best performance on the training dataset, achieving a total accuracy of 95.6% in classifying the three groups on the testing dataset, with a sensitivity of 93.3% for MDD and 100% for BD, a specificity of 93.3%, an AUC of 1 for BD, and an AUC of 0.967 for MDD. CONCLUSION By exploring voice features in children and adolescents, machine learning can effectively differentiate between MDD and BD, and voice features hold promise as an objective physiological indicator for the auxiliary diagnosis of mood disorders in clinical practice.
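Feature screening with the chi-square test, as described above, compares observed group-by-feature-bin counts against the counts expected under independence. A minimal sketch with hypothetical counts (not the study's data):

```python
import numpy as np

def chi_square(table):
    """Pearson chi-square statistic for a contingency table
    (rows: diagnostic groups, columns: feature bins)."""
    table = np.asarray(table, dtype=float)
    row = table.sum(axis=1, keepdims=True)
    col = table.sum(axis=0, keepdims=True)
    expected = row @ col / table.sum()   # counts expected under independence
    return ((table - expected) ** 2 / expected).sum()

# hypothetical counts of a binarized voice feature (low/high energy)
# across three groups of 50 (e.g. MDD, BD, controls)
table = [[35, 15],
         [20, 30],
         [10, 40]]
stat = chi_square(table)
```

A large statistic (here well above the critical value for 2 degrees of freedom) marks the feature as group-discriminating and worth keeping; in practice the p-value would come from `scipy.stats.chi2_contingency` or an equivalent.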
Affiliation(s)
- Jie Luo
- National Clinical Research Center for Mental Disorders, Beijing Key Laboratory of Mental Disorders, Beijing Anding Hospital, Beijing Institute for Brain Disorders Capital Medical University, De Sheng Men Wai An Kang Hu Tong 5 Hao, Xi Cheng Qu, Beijing, 100088, People's Republic of China
- Yuanzhen Wu
- National Clinical Research Center for Mental Disorders, Beijing Key Laboratory of Mental Disorders, Beijing Anding Hospital, Beijing Institute for Brain Disorders Capital Medical University, De Sheng Men Wai An Kang Hu Tong 5 Hao, Xi Cheng Qu, Beijing, 100088, People's Republic of China
- Mengqi Liu
- National Clinical Research Center for Mental Disorders, Beijing Key Laboratory of Mental Disorders, Beijing Anding Hospital, Beijing Institute for Brain Disorders Capital Medical University, De Sheng Men Wai An Kang Hu Tong 5 Hao, Xi Cheng Qu, Beijing, 100088, People's Republic of China
- Zhaojun Li
- Beijing Institute of Technology, School of Integrated Circuits and Electronics, Zhongguancun South Street 5 Hao, Hai Dian Qu, Beijing, 100081, China
- Zhuo Wang
- Beijing Institute of Technology, School of Integrated Circuits and Electronics, Zhongguancun South Street 5 Hao, Hai Dian Qu, Beijing, 100081, China
- Yi Zheng
- National Clinical Research Center for Mental Disorders, Beijing Key Laboratory of Mental Disorders, Beijing Anding Hospital, Beijing Institute for Brain Disorders Capital Medical University, De Sheng Men Wai An Kang Hu Tong 5 Hao, Xi Cheng Qu, Beijing, 100088, People's Republic of China
- Lihui Feng
- Beijing Institute of Technology, School of Optics and Photonics, Zhongguancun South Street 5 Hao, Hai Dian Qu, Beijing, 100081, China
- Jihua Lu
- Beijing Institute of Technology, School of Integrated Circuits and Electronics, Zhongguancun South Street 5 Hao, Hai Dian Qu, Beijing, 100081, China
- Fan He
- National Clinical Research Center for Mental Disorders, Beijing Key Laboratory of Mental Disorders, Beijing Anding Hospital, Beijing Institute for Brain Disorders Capital Medical University, De Sheng Men Wai An Kang Hu Tong 5 Hao, Xi Cheng Qu, Beijing, 100088, People's Republic of China

28
Zolnoori M, Sridharan S, Zolnour A, Vergez S, McDonald MV, Kostic Z, Bowles KH, Topaz M. Utilizing patient-nurse verbal communication in building risk identification models: the missing critical data stream in home healthcare. J Am Med Inform Assoc 2024; 31:435-444. PMID: 37847651; PMCID: PMC10797261; DOI: 10.1093/jamia/ocad195.
Abstract
BACKGROUND In the United States, over 12 000 home healthcare agencies annually serve more than 6 million patients, mostly aged 65+ years with chronic conditions. One in three of these patients ends up visiting the emergency department (ED) or being hospitalized. Existing risk identification models based on electronic health record (EHR) data have suboptimal performance in detecting these high-risk patients. OBJECTIVES To measure the added value of integrating audio-recorded home healthcare patient-nurse verbal communication into a risk identification model built on home healthcare EHR data and clinical notes. METHODS This pilot study was conducted at one of the largest not-for-profit home healthcare agencies in the United States. We audio-recorded 126 patient-nurse encounters for 47 patients, 8 of whom experienced ED visits or hospitalization. The risk model was developed and tested iteratively using (1) structured data from the Outcome and Assessment Information Set (OASIS), (2) clinical notes, and (3) verbal communication features. We used various natural language processing methods to model the communication between patients and nurses. RESULTS Using a Support Vector Machine classifier trained on the most informative features from OASIS, clinical notes, and verbal communication, we achieved an AUC-ROC of 99.68 and an F1-score of 94.12. Integrating verbal communication into the risk models improved the F1-score by 26%. The analysis revealed that patients at high risk tended to interact more with risk-associated cues, exhibited more "sadness" and "anxiety," and had extended periods of silence during conversation. CONCLUSION This study underscores the value of incorporating patient-nurse verbal communication into risk prediction models for hospitalizations and ED visits, suggesting the need for an evolved clinical workflow that integrates routine recording of patient-nurse verbal communication into the medical record.
Affiliation(s)
- Maryam Zolnoori
- School of Nursing, Columbia University, New York, NY 10032, United States
- Center for Home Care Policy & Research, VNS Health, New York, NY 10017, United States
- Ali Zolnour
- School of Electrical and Computer Engineering, University of Tehran, Tehran 14395-515, Iran
- Sasha Vergez
- Center for Home Care Policy & Research, VNS Health, New York, NY 10017, United States
- Margaret V McDonald
- Center for Home Care Policy & Research, VNS Health, New York, NY 10017, United States
- Zoran Kostic
- Electrical Engineering Department, Columbia University, New York, NY 10027, United States
- Kathryn H Bowles
- Center for Home Care Policy & Research, VNS Health, New York, NY 10017, United States
- School of Nursing, University of Pennsylvania, Philadelphia, PA 19104, United States
- Maxim Topaz
- School of Nursing, Columbia University, New York, NY 10032, United States
- Center for Home Care Policy & Research, VNS Health, New York, NY 10017, United States

29
Wadle LM, Ebner-Priemer UW, Foo JC, Yamamoto Y, Streit F, Witt SH, Frank J, Zillich L, Limberger MF, Ablimit A, Schultz T, Gilles M, Rietschel M, Sirignano L. Speech Features as Predictors of Momentary Depression Severity in Patients With Depressive Disorder Undergoing Sleep Deprivation Therapy: Ambulatory Assessment Pilot Study. JMIR Ment Health 2024; 11:e49222. PMID: 38236637; PMCID: PMC10835582; DOI: 10.2196/49222.
Abstract
BACKGROUND The use of mobile devices to continuously monitor objectively extracted parameters of depressive symptomatology is seen as an important step in the understanding and prevention of upcoming depressive episodes. Speech features such as pitch variability, speech pauses, and speech rate are promising indicators, but empirical evidence is limited given the variability of study designs. OBJECTIVE Previous studies have found different speech patterns when comparing single speech recordings between patients and healthy controls, but only a few have used repeated assessments to compare depressive and nondepressive episodes within the same patient. To our knowledge, no study has used a series of measurements within patients with depression (eg, intensive longitudinal data) to model the dynamic ebb and flow of subjectively reported depression and concomitant speech samples. Such data are, however, indispensable for detecting and ultimately preventing upcoming episodes. METHODS We captured voice samples and momentary affect ratings over the course of 3 weeks in a sample of patients (N=30) with an acute depressive episode receiving inpatient care. Patients underwent sleep deprivation therapy, a chronotherapeutic intervention that can rapidly improve depressive symptomatology. We hypothesized that within-person variability in depressive and affective momentary states would be reflected in 3 speech features: pitch variability, speech pauses, and speech rate. We parametrized them using the extended Geneva Minimalistic Acoustic Parameter Set (eGeMAPS) of the open-source openSMILE toolkit (Speech and Music Interpretation by Large-Space Extraction; audEERING GmbH) and extracted them from a transcript. We analyzed the speech features along with self-reported momentary affect ratings using multilevel linear regression analysis, with an average of 32 (SD 19.83) assessments per patient.
RESULTS Analyses revealed that pitch variability, speech pauses, and speech rate were associated with depression severity, positive affect, valence, and energetic arousal; furthermore, speech pauses and speech rate were associated with negative affect, and speech pauses were additionally associated with calmness. Specifically, pitch variability was negatively associated with improved momentary states (ie, lower pitch variability was linked to lower depression severity as well as to higher positive affect, valence, and energetic arousal). Speech pauses were likewise negatively associated with improved momentary states, whereas speech rate was positively associated with them. CONCLUSIONS Pitch variability, speech pauses, and speech rate are promising features for clinical prediction technologies that could improve patient care as well as the timely diagnosis and monitoring of treatment response. Our research is a step toward an automated depression monitoring system, facilitating individually tailored treatments and increased patient empowerment.
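Two of the feature families in this abstract, pause behavior and pitch variability, can be sketched with plain NumPy on a synthetic "recording". This is a toy illustration only: the study used openSMILE's eGeMAPS feature set, and the thresholds and F0 track below are invented for demonstration.

```python
# Toy sketch: pause ratio from frame energies, and pitch variability as the
# std of a frame-level F0 track. Not the eGeMAPS implementation.
import numpy as np

rng = np.random.default_rng(1)
sr = 16_000
frame = 400                                                   # 25 ms at 16 kHz
voiced = 0.3 * np.sin(2 * np.pi * 180 * np.arange(sr) / sr)   # 1 s of "speech"
silence = 0.001 * rng.normal(size=sr // 2)                    # 0.5 s pause
signal = np.concatenate([voiced, silence, voiced])

n_frames = len(signal) // frame
frames = signal[: n_frames * frame].reshape(n_frames, frame)
rms = np.sqrt((frames ** 2).mean(axis=1))                     # per-frame energy

pause_ratio = float((rms < 0.01).mean())   # fraction of low-energy frames

# Hypothetical F0 track (Hz) for the voiced frames; its std is "pitch variability"
f0_track = rng.normal(loc=180, scale=12, size=int((rms >= 0.01).sum()))
pitch_variability = float(f0_track.std())

print(f"pause ratio: {pause_ratio:.2f}, "
      f"pitch variability: {pitch_variability:.1f} Hz")
```

In a longitudinal design like this study's, such per-recording features would then enter a multilevel regression against momentary affect ratings.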
Affiliation(s)
- Lisa-Marie Wadle
- Mental mHealth Lab, Institute of Sports and Sports Science, Karlsruhe Institute of Technology, Karlsruhe, Germany
- Ulrich W Ebner-Priemer
- Mental mHealth Lab, Institute of Sports and Sports Science, Karlsruhe Institute of Technology, Karlsruhe, Germany
- Department of Psychiatry and Psychotherapy, Central Institute of Mental Health, University of Heidelberg, Mannheim, Germany
- Jerome C Foo
- Department of Genetic Epidemiology in Psychiatry, Central Institute of Mental Health, University of Heidelberg, Mannheim, Germany
- Institute for Psychopharmacology, Central Institute of Mental Health, University of Heidelberg, Mannheim, Germany
- Department of Psychiatry, College of Health Sciences, University of Alberta, Edmonton, AB, Canada
- Yoshiharu Yamamoto
- Educational Physiology Laboratory, Graduate School of Education, University of Tokyo, Tokyo, Japan
- Fabian Streit
- Department of Genetic Epidemiology in Psychiatry, Central Institute of Mental Health, University of Heidelberg, Mannheim, Germany
- Stephanie H Witt
- Department of Genetic Epidemiology in Psychiatry, Central Institute of Mental Health, University of Heidelberg, Mannheim, Germany
- Josef Frank
- Department of Genetic Epidemiology in Psychiatry, Central Institute of Mental Health, University of Heidelberg, Mannheim, Germany
- Lea Zillich
- Department of Genetic Epidemiology in Psychiatry, Central Institute of Mental Health, University of Heidelberg, Mannheim, Germany
- Matthias F Limberger
- Mental mHealth Lab, Institute of Sports and Sports Science, Karlsruhe Institute of Technology, Karlsruhe, Germany
- Tanja Schultz
- Cognitive Systems Lab, University of Bremen, Bremen, Germany
- Maria Gilles
- Department of Psychiatry and Psychotherapy, Central Institute of Mental Health, University of Heidelberg, Mannheim, Germany
- Marcella Rietschel
- Department of Genetic Epidemiology in Psychiatry, Central Institute of Mental Health, University of Heidelberg, Mannheim, Germany
- Lea Sirignano
- Department of Genetic Epidemiology in Psychiatry, Central Institute of Mental Health, University of Heidelberg, Mannheim, Germany

30
Bélisle-Pipon JC, Powell M, English R, Malo MF, Ravitsky V, Bensoussan Y. Stakeholder perspectives on ethical and trustworthy voice AI in health care. Digit Health 2024; 10:20552076241260407. PMID: 39055787; PMCID: PMC11271113; DOI: 10.1177/20552076241260407.
Abstract
Objective Voice as a health biomarker analyzed with artificial intelligence (AI) is gaining momentum in research. The noninvasiveness of voice data collection through accessible technology (such as smartphones, telehealth, and ambient recordings) or within clinical contexts means voice AI may help address health disparities and promote the inclusion of marginalized communities. However, developing AI-ready voice datasets free from bias and discrimination is a complex task. The objective of this study is to better understand the perspectives of engaged and interested stakeholders regarding ethical and trustworthy voice AI, to inform both further ethical inquiry and technology innovation. Methods A questionnaire was administered to voice AI experts, clinicians, scholars, patients, trainees, and policy-makers who participated in the 2023 Voice AI Symposium organized by the Bridge2AI-Voice AI Consortium. The survey used a mix of Likert-scale, ranking, and open-ended questions. A total of 27 stakeholders participated in the study. Results The main results are the identification of priority ethical issues, an initial definition of ethically sourced data for voice AI, insights into the use of synthetic voice data, and proposals for strengthening the trustworthiness of voice AI. The study shows a diversity of perspectives and adds nuance to the planning and development of ethical and trustworthy voice AI. Conclusions This study represents the first published stakeholder survey related to voice as a biomarker of health and sheds light on the critical importance of ethics and trustworthiness in the development of voice AI technologies for health applications.
Affiliation(s)
- Maria Powell
- Vanderbilt University Medical Center, Department of Otolaryngology-Head & Neck Surgery, Nashville, TN, USA
- Renee English
- Faculty of Health Sciences, Simon Fraser University, Burnaby, BC, Canada
- Vardit Ravitsky
- Hastings Center, Garrison, NY, USA
- Department of Global Health and Social Medicine, Harvard University, Cambridge, MA, USA
- Yael Bensoussan
- Department of Otolaryngology-Head & Neck Surgery, University of South Florida, Tampa, FL, USA

31
Silva WJ, Lopes L, Galdino MKC, Almeida AA. Voice Acoustic Parameters as Predictors of Depression. J Voice 2024; 38:77-85. PMID: 34353686; DOI: 10.1016/j.jvoice.2021.06.018.
Abstract
OBJECTIVE To analyze whether voice acoustic parameters can discriminate between and predict depression in patients with and without the condition. METHODS Observational case-control study. The following instruments were administered to participants: the Self-Reporting Questionnaire (SRQ-20), the Beck Depression Inventory-Second Edition (BDI-II), and the Voice Symptom Scale (VoiSS), together with voice recordings for subsequent extraction of the following acoustic parameters: mean, mode, and standard deviation (SD) of the fundamental frequency (F0); jitter; shimmer; glottal-to-noise excitation ratio (GNE); smoothed cepstral peak prominence (CPPS); and spectral tilt. A total of 144 individuals participated: 54 patients diagnosed with depression (case group) and 90 without a diagnosis of depression (control group). RESULTS Mean acoustic parameters differed between the groups: F0 (SD), jitter, and shimmer were higher, while GNE, CPPS, and spectral tilt were lower, in the case group than in the control group. There were significant associations between the BDI-II and jitter, shimmer, CPPS, and spectral tilt, and between CPPS and the class of antidepressants used. A multiple linear regression model showed that jitter and CPPS were predictors of depression as measured by the BDI-II. CONCLUSION Acoustic parameters discriminated between patients with and without depression and were associated with BDI-II scores. The class of antidepressants used was associated with CPPS, and jitter and CPPS predicted the presence of depression as measured by the BDI-II clinical score.
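Two of the perturbation measures named in this abstract, jitter and shimmer, have simple textbook definitions that can be sketched directly. The code below is not the authors' pipeline: the per-cycle periods and peak amplitudes are hypothetical stand-ins for values that would normally come from a pitch-tracking step.

```python
# Minimal sketch of local jitter (cycle-to-cycle period perturbation) and
# local shimmer (cycle-to-cycle amplitude perturbation), each as a percentage.
import numpy as np

rng = np.random.default_rng(2)
n_cycles = 200
periods = 1 / 120 + rng.normal(scale=5e-5, size=n_cycles)   # ~120 Hz F0, in s
amplitudes = 1.0 + rng.normal(scale=0.02, size=n_cycles)    # arbitrary units

def local_jitter(T):
    """Mean absolute difference of consecutive periods / mean period (%)."""
    return 100 * np.mean(np.abs(np.diff(T))) / np.mean(T)

def local_shimmer(A):
    """Mean absolute difference of consecutive amplitudes / mean amplitude (%)."""
    return 100 * np.mean(np.abs(np.diff(A))) / np.mean(A)

jit = local_jitter(periods)
shim = local_shimmer(amplitudes)
print(f"jitter: {jit:.2f}%  shimmer: {shim:.2f}%")
```

Higher jitter and shimmer, as the study reports for the depression group, correspond to larger cycle-to-cycle irregularity in these two series.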
Affiliation(s)
- Wegina Jordana Silva
- Department of Speech Therapy, Federal University of Paraíba (UFPB) and Federal University of Rio Grande do Norte (UFRN), João Pessoa, Paraíba, Brazil.
- Leonardo Lopes
- Department of Speech Therapy, Federal University of Paraíba (UFPB), Graduate Program in Speech Therapy, Federal University of Paraíba (UFPB) and Federal University of Rio Grande do Norte (UFRN - PPgFon), Graduate Program in Decision and Health Models (PPgMDS), and Graduate Program in Linguistic (PROLING) of UFPB, João Pessoa, Paraíba, Brazil.
- Melyssa Kellyane Cavalcanti Galdino
- Department of Psychology, Federal University of Paraíba (UFPB), Graduate Program in Cognitive Neuroscience and Behavior (PPgNeC) of UFPB, João Pessoa, Paraíba, Brazil.
- Anna Alice Almeida
- Department of Speech Therapy, Federal University of Paraíba (UFPB), Graduate Program in Speech Therapy, Federal University of Paraíba (UFPB) and Federal University of Rio Grande do Norte (UFRN - PPgFon), Graduate Program in Decision and Health Models (PPgMDS), and Graduate Program in Cognitive Neuroscience and Behavior (PPgNeC) of UFPB, João Pessoa, Paraíba, Brazil.

32
Chopra H, Annu, Shin DK, Munjal K, Priyanka, Dhama K, Emran TB. Revolutionizing clinical trials: the role of AI in accelerating medical breakthroughs. Int J Surg 2023; 109:4211-4220. PMID: 38259001; PMCID: PMC10720846; DOI: 10.1097/js9.0000000000000705.
Abstract
Clinical trials are the essential assessment for safe, reliable, and effective drug development. Data-related limitations, extensive manual effort, the demands of remote patient monitoring, and the complexity of traditional clinical trials drive the application of Artificial Intelligence (AI) in medical and healthcare organisations. For expeditious and streamlined clinical trials, personalised AI solutions are particularly well suited. AI offers broad utility through structured, standardised, and digitally driven elements of medical research. Clinical trials are a time-consuming process involving patient recruitment, enrolment, frequent monitoring, and medication adherence and retention. AI-powered tools can generate and manage data across the trial lifecycle, keeping the patient's full medical history on record in a patient-centric fashion. AI can intelligently interpret data, feed downstream systems, and automatically complete the required analysis reports. This article explains how AI has enabled innovative ways of collecting data, performing biosimulation, and achieving early disease diagnosis for clinical trials, and how it addresses these challenges through cost and time reduction, improved efficiency, and improved drug development research with less need for rework. The future implications of AI for accelerating clinical trials matter to medical research because of its fast output and overall utility.
Affiliation(s)
- Hitesh Chopra
- Department of Biosciences, Saveetha School of Engineering, Saveetha Institute of Medical and Technical Sciences, Chennai - 602105, Tamil Nadu, India
- Annu
- Thin Film and Materials Laboratory, School of Mechanical Engineering, Yeungnam University, Gyeongsan 38541, Republic of Korea
- Dong K. Shin
- Thin Film and Materials Laboratory, School of Mechanical Engineering, Yeungnam University, Gyeongsan 38541, Republic of Korea
- Kavita Munjal
- Department of Pharmacy, Amity Institute of Pharmacy, Amity University, Noida, Uttar Pradesh 201303, India
- Priyanka
- Department of Veterinary Microbiology, College of Veterinary Science, Guru Angad Dev Veterinary and Animal Sciences University (GADVASU), Rampura Phul, Bathinda, Punjab
- Kuldeep Dhama
- Indian Veterinary Research Institute (IVRI), Izatnagar, Bareilly, Uttar Pradesh
- Talha B. Emran
- Department of Pharmacy, BGC Trust University Bangladesh, Chittagong
- Department of Pharmacy, Faculty of Allied Health Sciences, Daffodil International University, Dhaka, Bangladesh

33
Mao K, Wu Y, Chen J. A systematic review on automated clinical depression diagnosis. NPJ Ment Health Res 2023; 2:20. PMID: 38609509; PMCID: PMC10955993; DOI: 10.1038/s44184-023-00040-z.
Abstract
Assessing mental health disorders and determining treatment can be difficult for a number of reasons, including access to healthcare providers. Assessments and treatments may not be continuous and can be limited by the unpredictable nature of psychiatric symptoms. Machine-learning models using data collected in a clinical setting can improve diagnosis and treatment. Studies have used speech, text, and facial expression analysis to identify depression. Still, more research is needed to address challenges such as the need for multimodal machine-learning models for clinical use. We conducted a review of studies from the past decade that utilized speech, text, and facial expression analysis to detect depression, as defined by the Diagnostic and Statistical Manual of Mental Disorders (DSM-5), following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guideline. We provide information on the number of participants, techniques used to assess clinical outcomes, speech-eliciting tasks, machine-learning algorithms, metrics, and other important discoveries for each study. A total of 544 studies were examined, 264 of which satisfied the inclusion criteria. A database has been created containing the query results and a summary of how different features are used to detect depression. While machine learning shows its potential to enhance mental health disorder evaluations, some obstacles must be overcome, especially the requirement for more transparent machine-learning models for clinical purposes. Considering the variety of datasets, feature extraction techniques, and metrics used in this field, guidelines have been provided to collect data and train machine-learning models to guarantee reproducibility and generalizability across different contexts.
Affiliation(s)
- Kaining Mao
- Department of Electrical and Computer Engineering, University of Alberta, Edmonton, AB, T6G 2R3, Canada
- Yuqi Wu
- Department of Electrical and Computer Engineering, University of Alberta, Edmonton, AB, T6G 2R3, Canada
- Jie Chen
- Department of Electrical and Computer Engineering, University of Alberta, Edmonton, AB, T6G 2R3, Canada

34
Cummins N, Dineley J, Conde P, Matcham F, Siddi S, Lamers F, Carr E, Lavelle G, Leightley D, White KM, Oetzmann C, Campbell EL, Simblett S, Bruce S, Haro JM, Penninx BWJH, Ranjan Y, Rashid Z, Stewart C, Folarin AA, Bailón R, Schuller BW, Wykes T, Vairavan S, Dobson RJB, Narayan VA, Hotopf M. Multilingual markers of depression in remotely collected speech samples: A preliminary analysis. J Affect Disord 2023; 341:128-136. PMID: 37598722; DOI: 10.1016/j.jad.2023.08.097.
Abstract
BACKGROUND Speech contains neuromuscular, physiological, and cognitive components, and so is a potential biomarker of mental disorders. Previous studies indicate that speaking rate and pausing are associated with major depressive disorder (MDD). However, results are inconclusive, as many studies are small and underpowered and do not include clinical samples. These studies have also been unilingual and use speech collected in controlled settings. If speech markers are to help understand the onset and progress of MDD, we need to uncover markers that are robust to language and establish the strength of associations in real-world data. METHODS We collected speech data from 585 participants with a history of MDD in the United Kingdom, Spain, and the Netherlands as part of the RADAR-MDD study. Participants recorded their speech via smartphones every two weeks for 18 months. Linear mixed models were used to estimate the strength of specific markers of depression from a set of 28 speech features. RESULTS Increased depressive symptoms were associated with speech rate, articulation rate, and intensity of speech elicited from a scripted task. These features had consistently stronger effect sizes than pauses. LIMITATIONS Our findings are derived at the cohort level, so they may have limited impact on identifying intra-individual speech changes associated with changes in symptom severity. The analysis of features averaged over the entire recording may have underestimated the importance of some features. CONCLUSIONS Participants with more severe depressive symptoms spoke more slowly and quietly. Our findings come from a real-world, multilingual, clinical dataset, so they represent a step-change in the usefulness of speech as a digital phenotype of MDD.
Affiliation(s)
- Nicholas Cummins
- Department of Biostatistics and Health Informatics, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, UK.
- Judith Dineley
- Department of Biostatistics and Health Informatics, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, UK; Chair of Embedded Intelligence for Health Care and Wellbeing, University of Augsburg, Germany
- Pauline Conde
- Department of Biostatistics and Health Informatics, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, UK
- Faith Matcham
- School of Psychology, University of Sussex, Falmer, UK; Department of Psychological Medicine, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, UK
- Sara Siddi
- Parc Sanitari Sant Joan de Déu, Fundació Sant Joan de Déu, CIBERSAM, Barcelona, Spain
- Femke Lamers
- Department of Psychiatry, Amsterdam Public Health Research Institute and Amsterdam Neuroscience, Amsterdam University Medical Centre, Vrije Universiteit and GGZ InGeest, Amsterdam, the Netherlands
- Ewan Carr
- Department of Biostatistics and Health Informatics, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, UK
- Grace Lavelle
- School of Psychology, University of Sussex, Falmer, UK
- Daniel Leightley
- Department of Psychological Medicine, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, UK
- Katie M White
- Department of Psychological Medicine, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, UK
- Carolin Oetzmann
- Department of Psychological Medicine, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, UK
- Edward L Campbell
- Department of Biostatistics and Health Informatics, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, UK; GTM research group, AtlanTTic Research Center, University of Vigo, Spain
- Sara Simblett
- Department of Psychology, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, UK
- Stuart Bruce
- RADAR-CNS Patient Advisory Board, King's College London, UK
- Josep Maria Haro
- Parc Sanitari Sant Joan de Déu, Fundació Sant Joan de Déu, CIBERSAM, Barcelona, Spain
- Brenda W J H Penninx
- Department of Psychiatry, Amsterdam Public Health Research Institute and Amsterdam Neuroscience, Amsterdam University Medical Centre, Vrije Universiteit and GGZ InGeest, Amsterdam, the Netherlands
- Yatharth Ranjan
- Department of Biostatistics and Health Informatics, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, UK
- Zulqarnain Rashid
- Department of Biostatistics and Health Informatics, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, UK
- Callum Stewart
- Department of Biostatistics and Health Informatics, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, UK
- Amos A Folarin
- Department of Biostatistics and Health Informatics, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, UK; NIHR Biomedical Research Centre at South London, Maudsley NHS Foundation Trust, King's College London, London, UK
- Raquel Bailón
- Biomedical Signal Interpretation and Computational Simulation (BSICoS) group, Aragon Institute for Engineering Research, University of Zaragoza, Zaragoza, Spain; Biomedical Research Networking Center in Bioengineering, Biomaterials and Nanomedicine (CIBER-BBN), Spain
- Björn W Schuller
- Chair of Embedded Intelligence for Health Care and Wellbeing, University of Augsburg, Germany; GLAM - Group on Language, Audio, & Music, Imperial College London, London, UK
- Til Wykes
- Department of Psychology, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, UK; NIHR Biomedical Research Centre at South London, Maudsley NHS Foundation Trust, King's College London, London, UK
- Richard J B Dobson
- Department of Biostatistics and Health Informatics, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, UK; Institute of Health Informatics, University College London, London, UK
- Matthew Hotopf
- Department of Psychological Medicine, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, UK; NIHR Biomedical Research Centre at South London, Maudsley NHS Foundation Trust, King's College London, London, UK

35
Gerczuk M, Triantafyllopoulos A, Amiriparian S, Kathan A, Bauer J, Berking M, Schuller BW. Zero-shot personalization of speech foundation models for depressed mood monitoring. Patterns (N Y) 2023; 4:100873. PMID: 38035199; PMCID: PMC10682756; DOI: 10.1016/j.patter.2023.100873.
Abstract
The monitoring of depressed mood plays an important role as a diagnostic tool in psychotherapy. An automated analysis of speech can provide a non-invasive measurement of a patient's affective state. While speech has been shown to be a useful biomarker for depression, existing approaches mostly build population-level models that aim to predict each individual's diagnosis as a (mostly) static property. Because of inter-individual differences in symptomatology and mood regulation behaviors, these approaches are ill-suited to detect smaller temporal variations in depressed mood. We address this issue by introducing a zero-shot personalization of large speech foundation models. Compared with other personalization strategies, our work does not require labeled speech samples for enrollment. Instead, the approach makes use of adapters conditioned on subject-specific metadata. On a longitudinal dataset, we show that the method improves performance compared with a set of suitable baselines. Finally, applying our personalization strategy improves individual-level fairness.
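The core idea in this abstract, adapters conditioned on subject metadata so that a shared model is personalized without labeled enrollment samples, can be sketched conceptually. The NumPy code below is a heavily simplified illustration, not the paper's architecture: the layer shapes, the FiLM-style scale/shift adapter, and the metadata encoding are all assumptions made for the example.

```python
# Conceptual sketch: a frozen "foundation model" layer followed by an adapter
# whose affine parameters are generated from subject metadata (zero-shot
# personalization: no labeled speech from the subject is needed).
import numpy as np

rng = np.random.default_rng(3)
d_hidden, d_meta = 16, 4

W_frozen = rng.normal(size=(d_hidden, d_hidden)) / np.sqrt(d_hidden)
W_scale = rng.normal(size=(d_meta, d_hidden)) * 0.1   # hypernetwork weights
W_shift = rng.normal(size=(d_meta, d_hidden)) * 0.1

def adapted_forward(h, metadata):
    """Frozen layer, then a metadata-conditioned feature-wise affine adapter."""
    gamma = 1.0 + metadata @ W_scale   # per-feature scale from metadata
    beta = metadata @ W_shift          # per-feature shift from metadata
    return gamma * (h @ W_frozen.T) + beta

h = rng.normal(size=(1, d_hidden))                  # a hidden speech embedding
meta_a = np.array([[1.0, 0.0, 0.35, 1.0]])          # illustrative subject codes
meta_b = np.array([[0.0, 1.0, 0.62, 0.0]])

out_a = adapted_forward(h, meta_a)
out_b = adapted_forward(h, meta_b)
# The same input yields subject-specific representations under different metadata
print(np.abs(out_a - out_b).max())
```

In the paper's setting, the adapter parameters would be trained end-to-end on other subjects and then applied zero-shot to a new subject via that subject's metadata alone.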
Affiliation(s)
- Maurice Gerczuk
- Chair of Embedded Intelligence for Healthcare and Wellbeing, University of Augsburg, Augsburg, Germany
- Shahin Amiriparian
- Chair of Embedded Intelligence for Healthcare and Wellbeing, University of Augsburg, Augsburg, Germany
- Alexander Kathan
- Chair of Embedded Intelligence for Healthcare and Wellbeing, University of Augsburg, Augsburg, Germany
- Jonathan Bauer
- Department of Clinical Psychology and Psychotherapy, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
- Matthias Berking
- Department of Clinical Psychology and Psychotherapy, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
- Björn W. Schuller
- Chair of Embedded Intelligence for Healthcare and Wellbeing, University of Augsburg, Augsburg, Germany
- GLAM, Imperial College, London, UK
36
Zhou Y, Han W, Yao X, Xue J, Li Z, Li Y. Developing a machine learning model for detecting depression, anxiety, and apathy in older adults with mild cognitive impairment using speech and facial expressions: A cross-sectional observational study. Int J Nurs Stud 2023; 146:104562. [PMID: 37531702] [DOI: 10.1016/j.ijnurstu.2023.104562] [Received: 01/04/2023] [Revised: 06/23/2023] [Accepted: 07/01/2023]
Abstract
BACKGROUND Depression, anxiety, and apathy are highly prevalent in older people with preclinical dementia and mild cognitive impairment. These symptoms have also proven valuable in predicting the progression from mild cognitive impairment to dementia, enabling timely diagnosis and treatment. However, objective and reliable indicators to detect and distinguish depression, anxiety, and apathy are relatively scarce. OBJECTIVE This study aimed to develop a machine learning model to detect and distinguish depression, anxiety, and apathy based on speech and facial expressions. DESIGN An observational, cross-sectional study design. SETTING(S) The memory outpatient department of a tertiary hospital. PARTICIPANTS 319 older adults diagnosed with mild cognitive impairment. METHODS Depression, anxiety, and apathy were evaluated with the Patient Health Questionnaire, the Generalized Anxiety Disorder scale, and the Apathy Evaluation Scale, respectively. Speech and facial expressions of older adults with mild cognitive impairment were digitally captured using audio and video recording software. Open-source data analysis toolkits were utilized to extract speech, facial, and text features. Multiclass classification was used to develop classification models, and Shapley additive explanations were used to explain the contribution of each feature within the model. RESULTS The random forest method was used to develop a multiclass emotion classification model, which performed well in classifying emotions with a weighted-average F1 score of 96.6%. The model also demonstrated high accuracy, precision, and recall of 87.4%, 86.6%, and 87.6%, respectively. CONCLUSIONS The machine learning model developed in this study demonstrated strong classification performance in detecting and differentiating depression, anxiety, and apathy. This innovative approach combines text, audio, and video to provide objective methods for precise classification and remote monitoring of these symptoms in nursing practice.
REGISTRATION This study was registered at the Chinese Clinical Trial Registry (registration number: ChiCTR1900023892; registration date: June 19th, 2019).
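To illustrate the weighted-average F1 metric reported above, here is a minimal pure-Python sketch that computes per-class F1 scores and weights them by class support; the emotion labels and predictions are invented for illustration, not data from the study.

```python
from collections import Counter

def weighted_f1(y_true, y_pred):
    """Weighted-average F1: per-class F1 scores weighted by class support."""
    support = Counter(y_true)
    total = 0.0
    for cls, n in support.items():
        tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
        fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
        fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        total += n * f1
    return total / len(y_true)

# invented labels over four classes: depression / anxiety / apathy / none
y_true = ["depression", "anxiety", "apathy", "none", "depression", "anxiety"]
y_pred = ["depression", "anxiety", "apathy", "none", "anxiety", "anxiety"]
print(round(weighted_f1(y_true, y_pred), 3))  # → 0.822
```

Weighting by support makes the score robust to class imbalance, which matters when one symptom is much rarer than the others.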
Affiliation(s)
- Ying Zhou
- School of Nursing, Shanghai Jiao Tong University, Shanghai, China
- Wei Han
- Department of Epidemiology and Biostatistics, Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Xiuyu Yao
- School of Nursing, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- JiaJun Xue
- School of Nursing, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Zheng Li
- School of Nursing, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Yingxin Li
- Institute of Biomedical Engineering, Chinese Academy of Medical Sciences and Peking Union Medical College, Tianjin, China
37
Wang JZ, Zhao S, Wu C, Adams RB, Newman MG, Shafir T, Tsachor R. Unlocking the Emotional World of Visual Media: An Overview of the Science, Research, and Impact of Understanding Emotion: Drawing Insights From Psychology, Engineering, and the Arts, This Article Provides a Comprehensive Overview of the Field of Emotion Analysis in Visual Media and Discusses the Latest Research, Systems, Challenges, Ethical Implications, and Potential Impact of Artificial Emotional Intelligence on Society. Proc IEEE Inst Electr Electron Eng 2023; 111:1236-1286. [PMID: 37859667] [PMCID: PMC10586271] [DOI: 10.1109/jproc.2023.3273517]
Abstract
The emergence of artificial emotional intelligence technology is revolutionizing the fields of computers and robotics, allowing for a new level of communication and understanding of human behavior that was once thought impossible. While recent advancements in deep learning have transformed the field of computer vision, automated understanding of evoked or expressed emotions in visual media remains in its infancy. This foundering stems from the absence of a universally accepted definition of "emotion," coupled with the inherently subjective nature of emotions and their intricate nuances. In this article, we provide a comprehensive, multidisciplinary overview of the field of emotion analysis in visual media, drawing on insights from psychology, engineering, and the arts. We begin by exploring the psychological foundations of emotion and the computational principles that underpin the understanding of emotions from images and videos. We then review the latest research and systems within the field, accentuating the most promising approaches. We also discuss the current technological challenges and limitations of emotion analysis, underscoring the necessity for continued investigation and innovation. We contend that this represents a "Holy Grail" research problem in computing and delineate pivotal directions for future inquiry. Finally, we examine the ethical ramifications of emotion-understanding technologies and contemplate their potential societal impacts. Overall, this article endeavors to equip readers with a deeper understanding of the domain of emotion analysis in visual media and to inspire further research and development in this captivating and rapidly evolving field.
Affiliation(s)
- James Z Wang
- College of Information Sciences and Technology, The Pennsylvania State University, University Park, PA 16802, USA
- Sicheng Zhao
- Beijing National Research Center for Information Science and Technology (BNRist), Tsinghua University, Beijing 100084, China
- Chenyan Wu
- College of Information Sciences and Technology, The Pennsylvania State University, University Park, PA 16802, USA
- Reginald B Adams
- Department of Psychology, The Pennsylvania State University, University Park, PA 16802, USA
- Michelle G Newman
- Department of Psychology, The Pennsylvania State University, University Park, PA 16802, USA
- Tal Shafir
- Emily Sagol Creative Arts Therapies Research Center, University of Haifa, Haifa 3498838, Israel
- Rachelle Tsachor
- School of Theatre and Music, University of Illinois at Chicago, Chicago, IL 60607, USA
38
Chandler C, Diaz-Asper C, Turner RS, Reynolds B, Elvevåg B. An explainable machine learning model of cognitive decline derived from speech. Alzheimers Dement (Amst) 2023; 15:e12516. [PMID: 38155915] [PMCID: PMC10752754] [DOI: 10.1002/dad2.12516] [Received: 05/16/2023] [Revised: 11/26/2023] [Accepted: 11/27/2023]
Abstract
INTRODUCTION Traditional Alzheimer's disease (AD) and mild cognitive impairment (MCI) screening lacks the sensitivity and timeliness required to detect subtle indicators of cognitive decline. Multimodal artificial intelligence technologies using only speech data promise improved detection of neurodegenerative disorders. METHODS Speech collected over the telephone from 91 older participants who were cognitively healthy (n = 29) or had diagnoses of AD (n = 30) or amnestic MCI (aMCI; n = 32) was analyzed with multimodal natural language and speech processing methods. An explainable ensemble decision tree classifier for the multiclass prediction of cognitive decline was created. RESULTS This approach was 75% accurate overall, an improvement over traditional speech-based screening tools and a unimodal language-based model. We include a dashboard for the examination of the results, allowing for novel ways of interpreting such data. DISCUSSION This work provides a foundation for a meaningful change in medicine, as clinical translation, scalability, and user friendliness were core to the methodologies. Highlights: Remote assessments and artificial intelligence (AI) models allow greater access to cognitive decline screening. Speech impairments differ significantly between mild AD, amnestic mild cognitive impairment (aMCI), and healthy controls. AI predictions of cognitive decline are more accurate than experts and standard tools. The AI model was 75% accurate in classifying mild AD, aMCI, and healthy controls.
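As a toy illustration of majority-vote ensembling over simple decision trees (not the authors' actual classifier), the sketch below combines three one-split "stumps" for the same three classes; the feature names and thresholds are invented.

```python
from collections import Counter

def stump(feature_idx, threshold, below, above):
    """A one-split decision tree: predict `above` if the feature exceeds
    the threshold, otherwise `below`."""
    return lambda x: above if x[feature_idx] > threshold else below

def ensemble_predict(trees, x):
    """Majority vote over the individual trees' predictions."""
    votes = Counter(tree(x) for tree in trees)
    return votes.most_common(1)[0][0]

# invented features: [pause_rate, lexical_diversity]; invented thresholds
trees = [stump(0, 0.5, "healthy", "aMCI"),
         stump(0, 0.8, "aMCI", "AD"),
         stump(1, 0.4, "AD", "healthy")]
print(ensemble_predict(trees, [0.9, 0.3]))  # → AD
```

An explainable ensemble additionally records which splits fired, which is what makes per-prediction dashboards like the one described possible.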
Affiliation(s)
- Chelsea Chandler
- Institute of Cognitive Science, University of Colorado Boulder, Boulder, Colorado, USA
- Raymond S. Turner
- Department of Neurology, Georgetown University, Washington, District of Columbia, USA
- Brigid Reynolds
- Department of Neurology, Georgetown University, Washington, District of Columbia, USA
- Brita Elvevåg
- Department of Clinical Medicine, University of Tromsø – The Arctic University of Norway, Tromsø, Norway
39
Zolnoori M, Vergez S, Sridharan S, Zolnour A, Bowles K, Kostic Z, Topaz M. Is the patient speaking or the nurse? Automatic speaker type identification in patient-nurse audio recordings. J Am Med Inform Assoc 2023; 30:1673-1683. [PMID: 37478477] [PMCID: PMC10531109] [DOI: 10.1093/jamia/ocad139] [Received: 03/23/2023] [Revised: 06/06/2023] [Accepted: 07/16/2023]
Abstract
OBJECTIVES Patient-clinician communication provides valuable explicit and implicit information that may indicate adverse medical conditions and outcomes. However, practical and analytical approaches for audio-recording and analyzing this data stream remain underexplored. This study aimed to 1) analyze patients' and nurses' speech in audio-recorded verbal communication, and 2) develop machine learning (ML) classifiers to effectively differentiate between patient and nurse language. MATERIALS AND METHODS Pilot studies were conducted at VNS Health, the largest not-for-profit home healthcare agency in the United States, to optimize audio-recording of patient-nurse interactions. We recorded and transcribed 46 interactions, resulting in 3494 "utterances" that were annotated to identify the speaker. We employed natural language processing techniques to generate linguistic features and built various ML classifiers to distinguish between patient and nurse language at both individual and encounter levels. RESULTS A support vector machine classifier trained on selected linguistic features from term frequency-inverse document frequency, Linguistic Inquiry and Word Count, Word2Vec, and Medical Concepts in the Unified Medical Language System achieved the highest performance, with an AUC-ROC = 99.01 ± 1.97 and an F1-score = 96.82 ± 4.1. The analysis revealed patients' tendency to use informal language and keywords related to "religion," "home," and "money," while nurses utilized more complex sentences focusing on health-related matters and medical issues and were more likely to ask questions. CONCLUSION The methods and analytical approach we developed to differentiate patient and nurse language are an important precursor for downstream tasks that aim to analyze patient speech to identify patients at risk of disease and negative health outcomes.
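One of the feature families named above, term frequency-inverse document frequency (TF-IDF), can be sketched in a few lines of pure Python; the smoothed idf variant and the two example utterances are illustrative assumptions, not the study's actual pipeline.

```python
import math
from collections import Counter

def tfidf(docs):
    """Map each document to {term: tf-idf weight}, using a smoothed idf."""
    n = len(docs)
    # document frequency: in how many documents does each term appear?
    df = Counter(term for doc in docs for term in set(doc.split()))
    vectors = []
    for doc in docs:
        counts = Counter(doc.split())
        total = sum(counts.values())
        vectors.append({term: (c / total) * (math.log((1 + n) / (1 + df[term])) + 1)
                        for term, c in counts.items()})
    return vectors

# hypothetical patient and nurse utterances from one encounter
vecs = tfidf(["my home and money worry me",
              "please describe your medication schedule"])
```

Terms concentrated in one speaker's utterances get high weights, which is exactly why keywords like "home" and "money" surfaced as discriminative in the analysis above.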
Affiliation(s)
- Maryam Zolnoori
- School of Nursing, Columbia University, New York, New York, USA
- Center for Home Care Policy & Research, VNS Health, New York, New York, USA
- Sasha Vergez
- Center for Home Care Policy & Research, VNS Health, New York, New York, USA
- Sridevi Sridharan
- Center for Home Care Policy & Research, VNS Health, New York, New York, USA
- Ali Zolnour
- School of Electrical and Computer Engineering, University of Tehran, Tehran, Iran
- Kathryn Bowles
- Center for Home Care Policy & Research, VNS Health, New York, New York, USA
- Zoran Kostic
- Department of Electrical Engineering, Columbia University, New York, New York, USA
- Maxim Topaz
- School of Nursing, Columbia University, New York, New York, USA
- Center for Home Care Policy & Research, VNS Health, New York, New York, USA
40
Sprotte Y. Computerized text and voice analysis of patients with chronic schizophrenia in art therapy. Sci Rep 2023; 13:16062. [PMID: 37749186] [PMCID: PMC10520069] [DOI: 10.1038/s41598-023-43069-y] [Received: 08/08/2022] [Accepted: 09/19/2023]
Abstract
This explorative study of patients with chronic schizophrenia aimed to clarify whether group art therapy followed by a therapist-guided picture review could influence patients' communication behaviour. Data on voice and speech characteristics were obtained via objective technological instruments, and these characteristics were selected as indicators of communication behaviour. Seven patients were recruited to participate in weekly group art therapy over a period of 6 months. Three days after each group meeting, they talked about their last picture during a standardized interview that was digitally recorded. The audio recordings were evaluated using validated computer-assisted procedures: the transcribed texts were evaluated using the German version of the LIWC2015 program, and the voice recordings were evaluated using the audio analysis software VocEmoApI. The dual methodological approach was intended to form an internal control of the study results. An exploratory factor analysis of the complete sets of output parameters was carried out with the expectation of obtaining typical speech and voice characteristics that map barriers to communication in patients with schizophrenia. The parameters of both methods were thus processed into five factors each, i.e., into a quantitative digitized classification of the texts and voices. The factor scores were subjected to a linear regression analysis to capture possible process-related changes. Most patients continued to participate in the study, which resulted in high-quality datasets for statistical analysis. To answer the study question, two results were summarized: First, a text analysis factor called Presence proved to be a potential surrogate parameter for positive language development. Second, quantitative changes in vocal emotional factors were detected, demonstrating differentiated activation patterns of emotions. These results can be interpreted as an expression of a cathartic healing process.
The methods presented in this study make a potentially significant contribution to quantitative research into the effectiveness and mode of action of art therapy.
Affiliation(s)
- Yvonne Sprotte
- Art Therapy Department, Dresden University of Fine Arts (Hochschule für Bildende Künste Dresden), Dresden, Germany
41
Berardi M, Brosch K, Pfarr JK, Schneider K, Sültmann A, Thomas-Odenthal F, Wroblewski A, Usemann P, Philipsen A, Dannlowski U, Nenadić I, Kircher T, Krug A, Stein F, Dietrich M. Relative importance of speech and voice features in the classification of schizophrenia and depression. Transl Psychiatry 2023; 13:298. [PMID: 37726285] [PMCID: PMC10509176] [DOI: 10.1038/s41398-023-02594-0] [Received: 02/03/2023] [Revised: 08/10/2023] [Accepted: 09/08/2023]
Abstract
Speech is a promising biomarker for schizophrenia spectrum disorder (SSD) and major depressive disorder (MDD). This proof of principle study investigates previously studied speech acoustics in combination with a novel application of voice pathology features as objective and reproducible classifiers for depression, schizophrenia, and healthy controls (HC). Speech and voice features for classification were calculated from recordings of picture descriptions from 240 speech samples (20 participants with SSD, 20 with MDD, and 20 HC each with 4 samples). Binary classification support vector machine (SVM) models classified the disorder groups and HC. For each feature, the permutation feature importance was calculated, and the top 25% most important features were used to compare differences between the disorder groups and HC including correlations between the important features and symptom severity scores. Multiple kernels for SVM were tested and the pairwise models with the best performing kernel (3-degree polynomial) were highly accurate for each classification: 0.947 for HC vs. SSD, 0.920 for HC vs. MDD, and 0.932 for SSD vs. MDD. The relatively most important features were measures of articulation coordination, number of pauses per minute, and speech variability. There were moderate correlations between important features and positive symptoms for SSD. The important features suggest that speech characteristics relating to psychomotor slowing, alogia, and flat affect differ between HC, SSD, and MDD.
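Permutation feature importance, used above to rank the speech and voice features, measures how much a model's performance drops when one feature column is randomly shuffled. A minimal sketch with an invented threshold "model"; the feature interpretation in the comment is hypothetical:

```python
import random

def accuracy(model, X, y):
    """Fraction of rows the model labels correctly."""
    return sum(model(row) == label for row, label in zip(X, y)) / len(y)

def permutation_importance(model, X, y, feature_idx, n_repeats=30, seed=0):
    """Average drop in accuracy after shuffling one feature column."""
    rng = random.Random(seed)
    base = accuracy(model, X, y)
    drops = []
    for _ in range(n_repeats):
        col = [row[feature_idx] for row in X]
        rng.shuffle(col)
        X_perm = [row[:feature_idx] + [v] + row[feature_idx + 1:]
                  for row, v in zip(X, col)]
        drops.append(base - accuracy(model, X_perm, y))
    return sum(drops) / n_repeats

# toy model: thresholds feature 0 (say, pauses per minute); feature 1 is ignored
model = lambda row: int(row[0] > 0.5)
X = [[0.9, 0.1], [0.8, 0.7], [0.2, 0.9], [0.1, 0.3]]
y = [1, 1, 0, 0]
imp0 = permutation_importance(model, X, y, 0)
imp1 = permutation_importance(model, X, y, 1)
```

Because the toy model ignores feature 1, shuffling it costs nothing (importance 0), while shuffling feature 0 degrades accuracy, mirroring how the study separates influential from inert features.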
Affiliation(s)
- Mark Berardi
- Department of Psychiatry and Psychotherapy, University Hospital Bonn, Bonn, Germany
- Katharina Brosch
- Department of Psychiatry and Psychotherapy, University of Marburg, Marburg, Germany
- Center for Mind, Brain and Behavior, University of Marburg, Marburg, Germany
- Julia-Katharina Pfarr
- Department of Psychiatry and Psychotherapy, University of Marburg, Marburg, Germany
- Center for Mind, Brain and Behavior, University of Marburg, Marburg, Germany
- Katharina Schneider
- Institute for Linguistics: General Linguistics, University of Mainz, Mainz, Germany
- Angela Sültmann
- Department of Psychiatry and Psychotherapy, University of Marburg, Marburg, Germany
- Center for Mind, Brain and Behavior, University of Marburg, Marburg, Germany
- Florian Thomas-Odenthal
- Department of Psychiatry and Psychotherapy, University of Marburg, Marburg, Germany
- Center for Mind, Brain and Behavior, University of Marburg, Marburg, Germany
- Adrian Wroblewski
- Department of Psychiatry and Psychotherapy, University of Marburg, Marburg, Germany
- Center for Mind, Brain and Behavior, University of Marburg, Marburg, Germany
- Paula Usemann
- Department of Psychiatry and Psychotherapy, University of Marburg, Marburg, Germany
- Center for Mind, Brain and Behavior, University of Marburg, Marburg, Germany
- Alexandra Philipsen
- Department of Psychiatry and Psychotherapy, University Hospital Bonn, Bonn, Germany
- Udo Dannlowski
- Institute for Translational Psychiatry, University of Münster, Münster, Germany
- Igor Nenadić
- Department of Psychiatry and Psychotherapy, University of Marburg, Marburg, Germany
- Center for Mind, Brain and Behavior, University of Marburg, Marburg, Germany
- Tilo Kircher
- Department of Psychiatry and Psychotherapy, University of Marburg, Marburg, Germany
- Center for Mind, Brain and Behavior, University of Marburg, Marburg, Germany
- Axel Krug
- Department of Psychiatry and Psychotherapy, University Hospital Bonn, Bonn, Germany
- Frederike Stein
- Department of Psychiatry and Psychotherapy, University of Marburg, Marburg, Germany
- Center for Mind, Brain and Behavior, University of Marburg, Marburg, Germany
- Maria Dietrich
- Department of Psychiatry and Psychotherapy, University Hospital Bonn, Bonn, Germany
42
Fusaroli M, Simonsen A, Borrie SA, Low DM, Parola A, Raschi E, Poluzzi E, Fusaroli R. Identifying Medications Underlying Communication Atypicalities in Psychotic and Affective Disorders: A Pharmacovigilance Study Within the FDA Adverse Event Reporting System. J Speech Lang Hear Res 2023; 66:3242-3259. [PMID: 37524118] [DOI: 10.1044/2023_jslhr-22-00739]
Abstract
PURPOSE Communication atypicalities are considered promising markers of a broad range of clinical conditions. However, little is known about the mechanisms and confounders underlying them. Medications might have a crucial, relatively unknown role both as potential confounders and offering an insight on the mechanisms at work. The integration of regulatory documents with disproportionality analyses provides a more comprehensive picture to account for in future investigations of communication-related markers. The aim of this study was to identify a list of drugs potentially associated with communicative atypicalities within psychotic and affective disorders. METHOD We developed a query using the Medical Dictionary for Regulatory Activities to search for communicative atypicalities within the FDA Adverse Event Reporting System (updated June 2021). A Bonferroni-corrected disproportionality analysis (reporting odds ratio) was separately performed on spontaneous reports involving psychotic, affective, and non-neuropsychiatric disorders, to account for the confounding role of different underlying conditions. Drug-adverse event associations not already reported in the Side Effect Resource database of labeled adverse drug reactions (unexpected) were subjected to further robustness analyses to account for expected biases. RESULTS A list of 291 expected and 91 unexpected potential confounding medications was identified, including drugs that may irritate (inhalants) or desiccate (anticholinergics) the larynx, impair speech motor control (antipsychotics), or induce nodules (acitretin) or necrosis (vascular endothelial growth factor receptor inhibitors) on vocal cords; sedatives and stimulants; neurotoxic agents (anti-infectives); and agents acting on neurotransmitter pathways (dopamine agonists). CONCLUSIONS We provide a list of medications to account for in future studies of communication-related markers in affective and psychotic disorders. 
The current test case illustrates rigorous procedures for digital phenotyping, and the methodological tools implemented for large-scale disproportionality analyses can be considered a road map for investigations of communication-related markers in other clinical populations. SUPPLEMENTAL MATERIAL https://doi.org/10.23641/asha.23721345.
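The reporting odds ratio (ROR) behind such a disproportionality analysis compares the odds of an adverse event being reported with a target drug versus all other drugs. A sketch with the standard log-normal 95% confidence interval; the 2x2 counts are hypothetical:

```python
import math

def reporting_odds_ratio(a, b, c, d, z=1.96):
    """ROR from a 2x2 table of spontaneous reports.
    a: target drug & event   b: target drug, no event
    c: other drugs & event   d: other drugs, no event"""
    ror = (a / b) / (c / d)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # SE of log(ROR)
    lower = math.exp(math.log(ror) - z * se)
    upper = math.exp(math.log(ror) + z * se)
    return ror, lower, upper

# hypothetical counts for one drug-dysphonia pair
ror, lower, upper = reporting_odds_ratio(a=40, b=960, c=200, d=98800)
# a signal is conventionally flagged when the lower CI bound exceeds 1
```

In practice the threshold is applied after multiplicity correction (Bonferroni in the study above) and separately within each underlying-condition stratum to limit confounding by indication.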
Affiliation(s)
- Michele Fusaroli
- Pharmacology Unit, Department of Medical and Surgical Sciences, University of Bologna, Italy
- Arndis Simonsen
- Psychosis Research Unit, Department of Clinical Medicine, Aarhus University, Denmark
- Interacting Minds Centre, School of Culture and Society, Aarhus University, Denmark
- Stephanie A Borrie
- Department of Communicative Disorders and Deaf Education, Utah State University, Logan
- Daniel M Low
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge
- Speech and Hearing Bioscience and Technology Program, Harvard Medical School, Boston, MA
- Alberto Parola
- Department of Psychology, University of Turin, Italy
- Department of Linguistics, Cognitive Science and Semiotics, School of Communication and Culture, Aarhus University, Denmark
- Emanuel Raschi
- Pharmacology Unit, Department of Medical and Surgical Sciences, University of Bologna, Italy
- Elisabetta Poluzzi
- Pharmacology Unit, Department of Medical and Surgical Sciences, University of Bologna, Italy
- Riccardo Fusaroli
- Interacting Minds Centre, School of Culture and Society, Aarhus University, Denmark
- Department of Linguistics, Cognitive Science and Semiotics, School of Communication and Culture, Aarhus University, Denmark
- Linguistic Data Consortium, School of Arts & Sciences, University of Pennsylvania, Philadelphia
43
Cohen J, Richter V, Neumann M, Black D, Haq A, Wright-Berryman J, Ramanarayanan V. A multimodal dialog approach to mental state characterization in clinically depressed, anxious, and suicidal populations. Front Psychol 2023; 14:1135469. [PMID: 37767217] [PMCID: PMC10520716] [DOI: 10.3389/fpsyg.2023.1135469] [Received: 12/31/2022] [Accepted: 08/14/2023]
Abstract
Background The rise of depression, anxiety, and suicide rates has led to increased demand for telemedicine-based mental health screening and remote patient monitoring (RPM) solutions to alleviate the burden on, and enhance the efficiency of, mental health practitioners. Multimodal dialog systems (MDS) that conduct on-demand, structured interviews offer a scalable and cost-effective solution to address this need. Objective This study evaluates the feasibility of a cloud-based MDS agent, Tina, for mental state characterization in participants with depression, anxiety, and suicide risk. Method Sixty-eight participants were recruited through an online health registry and completed 73 sessions, with 15 (20.6%), 21 (28.8%), and 26 (35.6%) sessions screening positive for depression, anxiety, and suicide risk, respectively, using conventional screening instruments. Participants then interacted with Tina as they completed a structured interview designed to elicit calibrated, open-ended responses regarding the participants' feelings and emotional state. Simultaneously, the platform streamed their speech and video recordings in real time to a HIPAA-compliant cloud server to compute speech, language, and facial movement-based biomarkers. After their sessions, participants completed user experience surveys. Machine learning models were developed using extracted features and evaluated with the area under the receiver operating characteristic curve (AUC). Results For both depression and suicide risk, affected individuals tended to have a higher percent pause time, while those positive for anxiety showed reduced lip movement relative to healthy controls. In terms of single-modality classification models, speech features performed best for depression (AUC = 0.64; 95% CI = 0.51-0.78), facial features for anxiety (AUC = 0.57; 95% CI = 0.43-0.71), and text features for suicide risk (AUC = 0.65; 95% CI = 0.52-0.78).
Best overall performance was achieved by decision fusion of all models in identifying suicide risk (AUC = 0.76; 95% CI = 0.65-0.87). Participants reported that the experience was comfortable and that they were able to share their feelings. Conclusion MDS is a feasible, useful, effective, and interpretable solution for RPM in real-world clinically depressed, anxious, and suicidal populations. Facial information is more informative for anxiety classification, while speech and language are more discriminative of depression and suicidality markers. In general, combining speech, language, and facial information improved model performance on all classification tasks.
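Decision fusion, which gave the best overall performance above, can be sketched as a weighted average of per-modality scores followed by thresholding; the modality scores below are invented for illustration:

```python
def fuse_decisions(scores, weights=None):
    """Late (decision-level) fusion: weighted average of per-modality scores."""
    if weights is None:
        weights = [1.0] * len(scores)
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)

# hypothetical per-modality suicide-risk scores for one session
speech_p, face_p, text_p = 0.55, 0.40, 0.70
fused = fuse_decisions([speech_p, face_p, text_p])
positive = fused >= 0.5  # threshold the fused score for a screening decision
```

Fusing at the decision level, rather than concatenating raw features, lets each modality keep its own classifier and makes the combined score easy to interpret per modality.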
Affiliation(s)
- Allie Haq
- Clarigent Health, Mason, OH, United States
- Jennifer Wright-Berryman
- Department of Social Work, College of Allied Health Sciences, University of Cincinnati, Cincinnati, OH, United States
- Vikram Ramanarayanan
- Modality.AI, Inc., San Francisco, CA, United States
- Otolaryngology - Head and Neck Surgery (OHNS), University of California, San Francisco, San Francisco, CA, United States
44
Duey AH, Rana A, Siddi F, Hussein H, Onnela JP, Smith TR. Daily Pain Prediction Using Smartphone Speech Recordings of Patients With Spine Disease. Neurosurgery 2023; 93:670-677. [PMID: 36995101] [DOI: 10.1227/neu.0000000000002474] [Received: 09/18/2022] [Accepted: 02/02/2023]
Abstract
BACKGROUND Pain evaluation remains largely subjective in neurosurgical practice, but machine learning provides the potential for objective pain assessment tools. OBJECTIVE To predict daily pain levels using speech recordings from personal smartphones of a cohort of patients with diagnosed neurological spine disease. METHODS Patients with spine disease were enrolled through a general neurosurgical clinic with approval from the institutional ethics committee. At-home pain surveys and speech recordings were administered at regular intervals through the Beiwe smartphone application. Praat audio features were extracted from the speech recordings to be used as input to a K-nearest neighbors (KNN) machine learning model. The pain scores were transformed from a 0 to 10 scale to low and high pain for better discriminative capacity. RESULTS A total of 60 patients were enrolled, and 384 observations were used to train and test the prediction model. Using the KNN prediction model, an accuracy of 71% with a positive predictive value of 0.71 was achieved in classifying pain intensity into high and low. The model showed 0.71 precision for high pain and 0.70 precision for low pain. Recall of high pain was 0.74, and recall of low pain was 0.67. The overall F1 score was 0.73. CONCLUSION Our study uses a KNN to model the relationship between speech features and pain levels collected from personal smartphones of patients with spine disease. The proposed model is a stepping stone for the development of objective pain assessment in neurosurgery clinical practice.
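A k-nearest-neighbors classifier like the one described can be sketched in pure Python; the two Praat-style feature dimensions and the training points below are invented for illustration, not the study's data:

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify a query point by majority vote among its k nearest neighbors."""
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# hypothetical (jitter, shimmer)-style acoustic features -> high/low pain
train = [((0.9, 0.8), "high"), ((0.8, 0.9), "high"), ((0.7, 0.6), "high"),
         ((0.2, 0.1), "low"), ((0.1, 0.3), "low"), ((0.3, 0.2), "low")]
print(knn_predict(train, (0.75, 0.7)))  # → high
```

KNN makes no distributional assumptions, which suits small, noisy at-home recording datasets, though features should be normalized so no single dimension dominates the distance.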
Collapse
Affiliation(s)
- Akiro H Duey
- Department of Neurosurgery, Computational Neuroscience Outcomes Center, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, USA
- Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Aakanksha Rana
- Department of Neurosurgery, Computational Neuroscience Outcomes Center, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, USA
- McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA
- Francesca Siddi
- Department of Neurosurgery, Computational Neuroscience Outcomes Center, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, USA
- Department of Neurosurgery, Leiden University Medical Center, Leiden, The Netherlands
- Helweh Hussein
- Department of Neurosurgery, Computational Neuroscience Outcomes Center, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, USA
- Jukka-Pekka Onnela
- Department of Biostatistics, Harvard T. H. Chan School of Public Health, Boston, Massachusetts, USA
- Timothy R Smith
- Department of Neurosurgery, Computational Neuroscience Outcomes Center, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, USA
- Department of Neurosurgery, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, USA
45
Gao CX, Dwyer D, Zhu Y, Smith CL, Du L, Filia KM, Bayer J, Menssink JM, Wang T, Bergmeir C, Wood S, Cotton SM. An overview of clustering methods with guidelines for application in mental health research. Psychiatry Res 2023; 327:115265. [PMID: 37348404 DOI: 10.1016/j.psychres.2023.115265] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 12/15/2022] [Revised: 05/20/2023] [Accepted: 05/21/2023] [Indexed: 06/24/2023]
Abstract
Cluster analyses have been widely used in mental health research to decompose inter-individual heterogeneity by identifying more homogeneous subgroups of individuals. However, despite advances in new algorithms and increasing popularity, there is little guidance on model choice, analytical framework and reporting requirements. In this paper, we aimed to address this gap by introducing the philosophy, design, advantages/disadvantages and implementation of major algorithms that are particularly relevant in mental health research. Extensions of basic models, such as kernel methods, deep learning, semi-supervised clustering, and clustering ensembles are subsequently introduced. How to choose algorithms to address common issues as well as methods for pre-clustering data processing, clustering evaluation and validation are then discussed. Importantly, we also provide general guidance on clustering workflow and reporting requirements. To facilitate the implementation of different algorithms, we provide information on R functions and libraries.
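As a concrete instance of the workflow this paper recommends (pre-clustering data processing, algorithm choice, internal validation), here is a minimal sketch. It is written in Python rather than the R functions the paper catalogs, and the synthetic data and silhouette-based choice of k are illustrative assumptions, not the paper's procedure:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
# Synthetic data with three well-separated latent subgroups, standing in
# for multivariate symptom scores.
X = np.vstack([rng.normal(loc=c, scale=0.5, size=(50, 4))
               for c in (-3.0, 0.0, 3.0)])
# Pre-clustering processing: scale features so no variable dominates.
X = StandardScaler().fit_transform(X)

# Internal validation: select k by silhouette score, one common criterion.
scores = {}
for k in range(2, 6):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = silhouette_score(X, labels)
best_k = max(scores, key=scores.get)
```

On this toy data the silhouette criterion recovers the three planted subgroups; on real mental-health data the paper's broader advice (external validation, reporting of algorithm settings) still applies.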
Affiliation(s)
- Caroline X Gao
- Centre for Youth Mental Health, The University of Melbourne, Parkville, VIC, Australia; Orygen, Parkville, VIC, Australia; Department of Epidemiology and Preventative Medicine, School of Public Health and Preventive Medicine, Monash University, Melbourne, VIC, Australia.
- Dominic Dwyer
- Centre for Youth Mental Health, The University of Melbourne, Parkville, VIC, Australia; Orygen, Parkville, VIC, Australia
- Ye Zhu
- School of Information Technology, Deakin University, Geelong, VIC, Australia
- Catherine L Smith
- Department of Epidemiology and Preventative Medicine, School of Public Health and Preventive Medicine, Monash University, Melbourne, VIC, Australia
- Lan Du
- Faculty of Information Technology, Monash University, Clayton, VIC, Australia
- Kate M Filia
- Centre for Youth Mental Health, The University of Melbourne, Parkville, VIC, Australia; Orygen, Parkville, VIC, Australia
- Johanna Bayer
- Centre for Youth Mental Health, The University of Melbourne, Parkville, VIC, Australia; Orygen, Parkville, VIC, Australia
- Jana M Menssink
- Centre for Youth Mental Health, The University of Melbourne, Parkville, VIC, Australia; Orygen, Parkville, VIC, Australia
- Teresa Wang
- Faculty of Information Technology, Monash University, Clayton, VIC, Australia
- Christoph Bergmeir
- Faculty of Information Technology, Monash University, Clayton, VIC, Australia; Department of Computer Science and Artificial Intelligence, University of Granada, Granada, Spain
- Stephen Wood
- Centre for Youth Mental Health, The University of Melbourne, Parkville, VIC, Australia; Orygen, Parkville, VIC, Australia
- Sue M Cotton
- Centre for Youth Mental Health, The University of Melbourne, Parkville, VIC, Australia; Orygen, Parkville, VIC, Australia
46
Foltz PW, Chandler C, Diaz-Asper C, Cohen AS, Rodriguez Z, Holmlund TB, Elvevåg B. Reflections on the nature of measurement in language-based automated assessments of patients' mental state and cognitive function. Schizophr Res 2023; 259:127-139. [PMID: 36153250 DOI: 10.1016/j.schres.2022.07.011] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 03/31/2022] [Revised: 07/12/2022] [Accepted: 07/13/2022] [Indexed: 11/23/2022]
Abstract
Modern advances in computational language processing methods have enabled new approaches to the measurement of mental processes. However, the field has primarily focused on model accuracy in predicting performance on a task or a diagnostic category. Instead, the field should focus on determining which computational analyses align best with the targeted neurocognitive/psychological functions that we want to assess. In this paper we reflect on two decades of experience with the application of language-based assessment to patients' mental state and cognitive function by addressing the questions of what we are measuring, how it should be measured and why we are measuring the phenomena. We address the questions by advocating for a principled framework for aligning computational models to the constructs being assessed and the tasks being used, as well as defining how those constructs relate to patient clinical states. We further examine the assumptions that go into the computational models and the effects that model design decisions may have on the accuracy, bias and generalizability of models for assessing clinical states. Finally, we describe how this principled approach can further the goal of transitioning language-based computational assessments into clinical practice while gaining the trust of critical stakeholders.
Affiliation(s)
- Peter W Foltz
- Institute of Cognitive Science, University of Colorado Boulder, United States of America.
- Chelsea Chandler
- Institute of Cognitive Science, University of Colorado Boulder, United States of America; Department of Computer Science, University of Colorado Boulder, United States of America
- Alex S Cohen
- Department of Psychology, Louisiana State University, United States of America; Center for Computation and Technology, Louisiana State University, United States of America
- Zachary Rodriguez
- Department of Psychology, Louisiana State University, United States of America; Center for Computation and Technology, Louisiana State University, United States of America
- Terje B Holmlund
- Department of Clinical Medicine, University of Tromsø - the Arctic University of Norway, Tromsø, Norway
- Brita Elvevåg
- Department of Clinical Medicine, University of Tromsø - the Arctic University of Norway, Tromsø, Norway; Norwegian Centre for eHealth Research, University Hospital of North Norway, Tromsø, Norway.
47
Granrud OE, Rodriguez Z, Cowan T, Masucci MD, Cohen AS. Alogia and pressured speech do not fall on a continuum of speech production using objective speech technologies. Schizophr Res 2023; 259:121-126. [PMID: 35864001 DOI: 10.1016/j.schres.2022.07.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 03/30/2022] [Revised: 07/02/2022] [Accepted: 07/04/2022] [Indexed: 10/17/2022]
Abstract
Speech production is affected in a variety of serious mental illnesses (SMI; e.g., schizophrenia, unipolar depression, bipolar disorders) and at its extremes can be observed in the gross reduction of speech (e.g., alogia) or increase of speech (e.g., pressured speech). The present study evaluated whether clinically-rated alogia and pressured speech represent antithetical constructs when analyzed using objective metrics of speech production. We examined natural speech using acoustic and natural language processing features from two archival studies using several different speaking tasks and a combined 107 patients meeting criteria for SMI. Contrary to expectations, we did not find that alogia and pressured speech presented as opposing ends of a speech production continuum. Objective speech markers were associated with clinically rated alogia but not pressured speech, and these results were consistent across speaking tasks and studies. Implications for our understanding of speech production symptoms in SMI are discussed, as well as implications for Natural Language Processing and digital phenotyping efforts more generally.
Affiliation(s)
- Ole Edvard Granrud
- Louisiana State University, Department of Psychology, United States of America
- Zachary Rodriguez
- Louisiana State University, Department of Psychology, United States of America; Louisiana State University, Center for Computation and Technology, United States of America
- Tovah Cowan
- Louisiana State University, Department of Psychology, United States of America
- Michael D Masucci
- Louisiana State University, Department of Psychology, United States of America
- Alex S Cohen
- Louisiana State University, Department of Psychology, United States of America; Louisiana State University, Center for Computation and Technology, United States of America.
48
Mizuguchi D, Yamamoto T, Omiya Y, Endo K, Tano K, Oya M, Takano S. Novel Screening Tool Using Non-linguistic Voice Features Derived from Simple Phrases to Detect Mild Cognitive Impairment and Dementia. JAR LIFE 2023; 12:72-76. [PMID: 37637273 PMCID: PMC10450207 DOI: 10.14283/jarlife.2023.12] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 06/02/2023] [Accepted: 07/13/2023] [Indexed: 08/29/2023]
Abstract
Appropriate intervention and care in detecting cognitive impairment early are essential to effectively prevent the progression of cognitive deterioration. Diagnostic voice analysis is a noninvasive and inexpensive screening method that could be useful for detecting cognitive deterioration at earlier stages such as mild cognitive impairment. We aimed to distinguish between patients with dementia or mild cognitive impairment and healthy controls by using purely acoustic features (i.e., nonlinguistic features) extracted from two simple phrases. We analyzed voice in 195 recordings from 150 patients (age 45-95 years). We applied a machine learning algorithm (LightGBM; Microsoft, Redmond, WA, USA) to test whether the healthy control, mild cognitive impairment, and dementia groups could be accurately classified, based on acoustic features. Our algorithm performed well: the area under the curve was 0.81 and accuracy was 66.7% for the 3-class classification. Thus, our vocal biomarker is useful for automated assistance in diagnosing early cognitive deterioration.
Affiliation(s)
- K Endo
- PST Inc., Yokohama, Japan
- K Tano
- Takeyama Hospital, Yokohama, Japan
- M Oya
- Takeyama Hospital, Yokohama, Japan
- S Takano
- Honjo Kodama Hospital, Honjo, Japan
49
Neumann M, Kothare H, Ramanarayanan V. Combining Multiple Multimodal Speech Features into an Interpretable Index Score for Capturing Disease Progression in Amyotrophic Lateral Sclerosis. INTERSPEECH 2023; 2023:2353-2357. [PMID: 39006832 PMCID: PMC11246072 DOI: 10.21437/interspeech.2023-2100] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 07/16/2024]
Abstract
Multiple speech biomarkers have been shown to carry useful information regarding Amyotrophic Lateral Sclerosis (ALS) pathology. We propose a two-step framework to compute optimal linear combinations (indexes) of these biomarkers that are more discriminative and noise-robust than the individual markers, which is important for clinical care and pharmaceutical trial applications. First, we use a hierarchical clustering based method to select representative speech metrics from a dataset comprising 143 people with ALS and 135 age- and sex-matched healthy controls. Second, we analyze three methods of index computation that optimize linear discriminability, Youden Index, and sparsity of logistic regression model weights, respectively, and evaluate their performance with 5-fold cross validation. We find that the proposed indexes are generally more discriminative of bulbar vs non-bulbar onset in ALS than their individual component metrics as well as an equally-weighted baseline.
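One standard way to compute an "optimal linear combination" of metrics for group discrimination is Fisher's linear discriminant. The sketch below is an illustration of that general idea on synthetic data, not the authors' two-step framework (which additionally uses hierarchical clustering for metric selection and compares several optimization criteria):

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(3)
# Stand-in speech metrics for bulbar-onset vs. non-bulbar-onset groups,
# with a planted mean shift so the groups are separable.
X_bulbar = rng.normal(loc=0.8, size=(40, 5))
X_nonbulbar = rng.normal(loc=0.0, size=(60, 5))
X = np.vstack([X_bulbar, X_nonbulbar])
y = np.array([1] * 40 + [0] * 60)

# Fisher's linear discriminant yields weights that maximize between-group
# relative to within-group variance, giving one interpretable index score
# per participant instead of many separate metrics.
lda = LinearDiscriminantAnalysis()
lda.fit(X, y)
index = X @ lda.coef_.ravel()
```

The resulting index separates the two groups more cleanly than any single synthetic metric, which is the motivation the abstract gives for combining markers.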
50
Wang J, Ravi V, Alwan A. Non-uniform Speaker Disentanglement For Depression Detection From Raw Speech Signals. INTERSPEECH 2023; 2023:2343-2347. [PMID: 38045821 PMCID: PMC10691447 DOI: 10.21437/interspeech.2023-2101] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/05/2023]
Abstract
While speech-based depression detection methods that use speaker-identity features, such as speaker embeddings, are popular, they often compromise patient privacy. To address this issue, we propose a speaker disentanglement method that utilizes a non-uniform mechanism of adversarial SID loss maximization. This is achieved by varying the adversarial weight between different layers of a model during training. We find that a greater adversarial weight for the initial layers leads to performance improvement. Our approach using the ECAPA-TDNN model achieves an F1-score of 0.7349 (a 3.7% improvement over audio-only SOTA) on the DAIC-WoZ dataset, while simultaneously reducing the speaker-identification accuracy by 50%. Our findings suggest that identifying depression through speech signals can be accomplished without placing undue reliance on a speaker's identity, paving the way for privacy-preserving approaches to depression detection.
Affiliation(s)
- Jinhan Wang
- Dept. of Electrical and Computer Engineering, University of California, Los Angeles, USA
- Vijay Ravi
- Dept. of Electrical and Computer Engineering, University of California, Los Angeles, USA
- Abeer Alwan
- Dept. of Electrical and Computer Engineering, University of California, Los Angeles, USA