101. Lin D, Nazreen T, Rutowski T, Lu Y, Harati A, Shriberg E, Chlebek P, Aratow M. Feasibility of a Machine Learning-Based Smartphone Application in Detecting Depression and Anxiety in a Generally Senior Population. Front Psychol 2022; 13:811517. PMID: 35478769; PMCID: PMC9037748; DOI: 10.3389/fpsyg.2022.811517.
Abstract
Background: Depression and anxiety create a large health burden and increase the risk of premature mortality. Mental health screening is vital, but more sophisticated screening and monitoring methods are needed. The Ellipsis Health App addresses this need by using semantic information from recorded speech to screen for depression and anxiety. Objectives: The primary aim of this study is to determine the feasibility of collecting weekly voice samples for mental health screening. Additionally, we aim to demonstrate the portability and improved performance of Ellipsis' machine learning models for patients of various ages. Methods: Study participants were current patients at Desert Oasis Healthcare, mean age 63 years (SD = 10.3). Two non-randomized cohorts participated: one with a documented history of depression within the 24 months prior to the study (Group Positive) and one without depression (Group Negative). Participants recorded 5-min voice samples weekly for 6 weeks via the Ellipsis Health App. They also completed the PHQ-8 and GAD-7 questionnaires to assess depression and anxiety, respectively. Results: The protocol completion rate was 61% for both groups. Use beyond protocol was 27% for Group Positive and 9% for Group Negative. The Ellipsis Health App showed an AUC of 0.82 for the combined groups when compared with the PHQ-8 and GAD-7 at a threshold score of 10. Performance was high for senior participants as well as for younger age ranges, and many participants spoke longer than the required 5 min. Conclusion: The Ellipsis Health App demonstrated the feasibility of using voice recordings to screen for depression and anxiety across age groups, and its machine learning models using Transformer methodology maintain performance and improve over LSTM methodology when applied to the study population.
102. Rybner A, Jessen ET, Mortensen MD, Larsen SN, Grossman R, Bilenberg N, Cantio C, Jepsen JRM, Weed E, Simonsen A, Fusaroli R. Vocal markers of autism: Assessing the generalizability of machine learning models. Autism Res 2022; 15:1018-1030. PMID: 35385224; DOI: 10.1002/aur.2721.
Abstract
Machine learning (ML) approaches show increasing promise in their ability to identify vocal markers of autism. Nonetheless, it is unclear to what extent such markers generalize to new speech samples collected, for example, using a different speech task or in a different language. In this paper, we systematically assess the generalizability of ML findings across a variety of contexts. We train promising published ML models of vocal markers of autism on novel cross-linguistic datasets following a rigorous pipeline to minimize overfitting, including cross-validated training and ensemble models. We test the generalizability of the models on (i) different participants from the same study, performing the same task; (ii) the same participants, performing a different (but similar) task; and (iii) a different study with participants speaking a different language, performing the same type of task. While model performance is similar to previously published findings when trained and tested on data from the same study (out-of-sample performance), there is considerable variance between studies. Crucially, the models do not generalize well to different, though similar, tasks and not at all to new languages. The ML pipeline is openly shared, and the generalizability of ML models of vocal markers of autism remains an open issue. We outline three recommendations researchers could follow to be more explicit about generalizability and to improve it in future studies. LAY SUMMARY: Machine learning approaches promise to be able to identify autism from voice alone. These models underestimate how diverse the contexts in which we speak are, how diverse the languages used are, and how diverse autistic voices are. Machine learning approaches need to be more careful in defining their limits and generalizability.
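The three generalizability tests described in this abstract can be sketched in a few lines of scikit-learn. This is a minimal illustration, not the authors' openly shared pipeline: the random-forest ensemble, feature matrices, and labels below are synthetic placeholders standing in for real vocal features.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
# Placeholder training data: 100 recordings x 20 hypothetical vocal features.
X_train, y_train = rng.normal(size=(100, 20)), rng.integers(0, 2, 100)
# Three held-out conditions, one per level of generalization tested.
held_out = {
    "same study, same task": (rng.normal(size=(40, 20)), rng.integers(0, 2, 40)),
    "same participants, different task": (rng.normal(size=(40, 20)), rng.integers(0, 2, 40)),
    "different study, different language": (rng.normal(size=(40, 20)), rng.integers(0, 2, 40)),
}

clf = RandomForestClassifier(n_estimators=200, random_state=0)
# Cross-validated training performance, to detect overfitting before testing.
cv_auc = cross_val_score(clf, X_train, y_train, cv=5, scoring="roc_auc").mean()
clf.fit(X_train, y_train)

# Each held-out condition probes a progressively harder transfer.
for name, (X_test, y_test) in held_out.items():
    auc = roc_auc_score(y_test, clf.predict_proba(X_test)[:, 1])
    print(f"{name}: AUC = {auc:.2f}")
```

With real data, the abstract's pattern would show up here as AUCs that degrade from the first condition to the third.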
Affiliation(s)
- Astrid Rybner: Linguistics, Cognitive Science and Semiotics, School of Communication and Culture, Aarhus University, Aarhus, Denmark
- Emil Trenckner Jessen: Linguistics, Cognitive Science and Semiotics, School of Communication and Culture, Aarhus University, Aarhus, Denmark
- Marie Damsgaard Mortensen: Linguistics, Cognitive Science and Semiotics, School of Communication and Culture, Aarhus University, Aarhus, Denmark
- Stine Nyhus Larsen: Linguistics, Cognitive Science and Semiotics, School of Communication and Culture, Aarhus University, Aarhus, Denmark
- Ruth Grossman: Communication Sciences and Disorders, Emerson College, Boston, Massachusetts, USA
- Niels Bilenberg: Child and Youth Psychiatry, University of Southern Denmark, Odense, Denmark
- Cathriona Cantio: Child and Youth Psychiatry, University of Southern Denmark, Odense, Denmark; Psychology, University of Southern Denmark, Odense, Denmark
- Jens Richardt Møllegaard Jepsen: Child and Adolescent Mental Health Centre, Mental Health Services in the Capital Region of Denmark, Copenhagen, Denmark; Center for Neuropsychiatric Schizophrenia Research and Center for Clinical Intervention and Neuropsychiatric Schizophrenia Research, Mental Health Services in the Capital Region of Denmark, Copenhagen, Denmark
- Ethan Weed: Linguistics, Cognitive Science and Semiotics, School of Communication and Culture, Aarhus University, Aarhus, Denmark; Interacting Minds Center, School of Culture and Society, Aarhus University, Aarhus, Denmark
- Arndis Simonsen: Interacting Minds Center, School of Culture and Society, Aarhus University, Aarhus, Denmark; Psychosis Research Unit, Aarhus University Hospital, Aarhus, Denmark
- Riccardo Fusaroli: Linguistics, Cognitive Science and Semiotics, School of Communication and Culture, Aarhus University, Aarhus, Denmark; Interacting Minds Center, School of Culture and Society, Aarhus University, Aarhus, Denmark; Linguistic Data Consortium, University of Pennsylvania, Philadelphia, Pennsylvania, USA
103. Gregory S, Linz N, König A, Langel K, Pullen H, Luz S, Harrison J, Ritchie CW. Remote data collection speech analysis and prediction of the identification of Alzheimer's disease biomarkers in people at risk for Alzheimer's disease dementia: the Speech on the Phone Assessment (SPeAk) prospective observational study protocol. BMJ Open 2022; 12:e052250. PMID: 35292490; PMCID: PMC8928245; DOI: 10.1136/bmjopen-2021-052250.
Abstract
INTRODUCTION Identifying cost-effective, non-invasive biomarkers of Alzheimer's disease (AD) is a clinical and research priority. Speech data are easy to collect, and studies suggest they can identify those with AD. We do not know whether speech features can predict AD biomarkers in a preclinical population. METHODS AND ANALYSIS The Speech on the Phone Assessment (SPeAk) study is a prospective observational study. SPeAk recruits participants aged 50 years and over who have previously completed studies with AD biomarker collection. Participants complete a baseline telephone assessment, including spontaneous speech and cognitive tests. A 3-month visit will repeat the cognitive tests with a conversational artificial intelligence bot. Participants complete acceptability questionnaires after each visit and are randomised to receive their cognitive test results either after each visit or only after they have completed the study. We will combine SPeAk data with AD biomarker data collected in a previous study and analyse for correlations between extracted speech features and AD biomarkers. The outcome of this analysis will inform the development of an algorithm for predicting AD risk from speech features. ETHICS AND DISSEMINATION This study has been approved by the Edinburgh Medical School Research Ethics Committee (REC reference 20-EMREC-007). All participants will provide informed consent before completing any study-related procedures and must have capacity to consent to participate in this study. Participants may find the tests, or receiving their scores, cause anxiety or stress; previous exposure to similar tests may make the experience more familiar and reduce this anxiety, and the study information will include signposting in case of distress. Study results will be disseminated to study participants, presented at conferences, and published in a peer-reviewed journal. No study participants will be identifiable in the study results.
Affiliation(s)
- Sarah Gregory: Edinburgh Dementia Prevention, Centre for Clinical Brain Sciences, The University of Edinburgh, Edinburgh, UK
- Nicklas Linz: ki elements, Saarbrucken, Saarland, Germany
- Alexandra König: Stars Team, National Institute for Research in Computer Science and Automation, Nice, France
- Kai Langel: Janssen Healthcare Innovation, Beerse, Belgium
- Hannah Pullen: Edinburgh Dementia Prevention, Centre for Clinical Brain Sciences, The University of Edinburgh, Edinburgh, UK
- Saturnino Luz: Usher Institute of Population Health Sciences and Informatics, The University of Edinburgh, Edinburgh, UK
- John Harrison: Metis Cognition Ltd, Kilmington Common, UK; Department of Neurology, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
- Craig W Ritchie: Edinburgh Dementia Prevention, Centre for Clinical Brain Sciences, The University of Edinburgh, Edinburgh, UK
104. Bertini F, Allevi D, Lutero G, Calzà L, Montesi D. An automatic Alzheimer's disease classifier based on spontaneous spoken English. Comput Speech Lang 2022. DOI: 10.1016/j.csl.2021.101298.
105. Faurholt-Jepsen M, Rohani DA, Busk J, Tønning ML, Vinberg M, Bardram JE, Kessing LV. Discriminating between patients with unipolar disorder, bipolar disorder, and healthy control individuals based on voice features collected from naturalistic smartphone calls. Acta Psychiatr Scand 2022; 145:255-267. PMID: 34923626; DOI: 10.1111/acps.13391.
Abstract
BACKGROUND It is of crucial importance to be able to discriminate unipolar disorder (UD) from bipolar disorder (BD), as both treatment and course of illness differ between the two disorders. AIMS To investigate whether voice features from naturalistic phone calls could discriminate between (1) UD, BD, and healthy control individuals (HC) and (2) different illness states within UD. METHODS Voice features were collected daily during naturalistic phone calls for up to 972 days. A total of 48 patients with UD, 121 patients with BD, and 38 HC were included, yielding 115,483 voice data entries (UD: n = 16,454; BD: n = 78,733; HC: n = 20,296). Patients evaluated symptoms daily using a smartphone-based system, making it possible to define illness states within UD and BD. Data were analyzed using random forest algorithms. RESULTS Compared with BD, UD was classified with a specificity of 0.84 (SD: 0.07) and an AUC of 0.58 (SD: 0.07), and compared with HC with a sensitivity of 0.74 (SD: 0.10) and an AUC of 0.74 (SD: 0.06). Compared with BD during euthymia, UD during euthymia was classified with a specificity of 0.79 (SD: 0.05) and an AUC of 0.43 (SD: 0.16). Compared with BD during depression, UD during depression was classified with a specificity of 0.81 (SD: 0.09) and an AUC of 0.48 (SD: 0.12). Within UD, depression was classified against euthymia with a specificity of 0.70 (SD: 0.31) and an AUC of 0.65 (SD: 0.11). In all models, the user-dependent models outperformed the user-independent models. CONCLUSIONS The results of the present study are promising but, as reflected by the low AUCs, do not support that voice features collected during naturalistic phone calls can, at the current state of the art, be implemented in clinical practice as a supplementary and assisting tool. Further studies are needed.
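The gap the abstract reports between user-dependent and user-independent models is a common pattern in voice studies. The sketch below, using synthetic placeholder data rather than the study's voice features, shows one standard way to implement the two schemes in scikit-learn: ordinary K-fold lets the same speaker appear in both training and test folds (user-dependent), while GroupKFold holds out whole speakers (user-independent).

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import KFold, GroupKFold, cross_val_score

rng = np.random.default_rng(1)
n = 300
speakers = rng.integers(0, 30, n)   # 30 hypothetical speakers
X = rng.normal(size=(n, 16))        # placeholder per-call voice features
y = rng.integers(0, 2, n)           # placeholder diagnosis labels

clf = RandomForestClassifier(n_estimators=100, random_state=0)

# User-dependent: folds ignore speaker identity, so each speaker can
# appear in both training and test data.
dep = cross_val_score(clf, X, y, cv=KFold(5, shuffle=True, random_state=0),
                      scoring="roc_auc").mean()
# User-independent: GroupKFold keeps every speaker's calls in a single
# fold, so test speakers are never seen during training.
indep = cross_val_score(clf, X, y, cv=GroupKFold(5), groups=speakers,
                        scoring="roc_auc").mean()
print(f"user-dependent AUC:   {dep:.2f}")
print(f"user-independent AUC: {indep:.2f}")
```

On real voice data the user-independent score is typically the lower and the more clinically honest of the two, since a deployed screener must handle speakers it has never heard.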
Affiliation(s)
- Maria Faurholt-Jepsen: Copenhagen Affective Disorder Research Center (CADIC), Psychiatric Center Copenhagen, Rigshospitalet, Copenhagen, Denmark
- Darius Adam Rohani: Department of Health Technology, Technical University of Denmark, Lyngby, Denmark
- Jonas Busk: Department of Energy Conversion and Storage, Technical University of Denmark, Lyngby, Denmark
- Morten Lindberg Tønning: Copenhagen Affective Disorder Research Center (CADIC), Psychiatric Center Copenhagen, Rigshospitalet, Copenhagen, Denmark
- Maj Vinberg: Copenhagen Affective Disorder Research Center (CADIC), Psychiatric Center Copenhagen, Rigshospitalet, Copenhagen, Denmark; Psychiatric Center North Zealand, Copenhagen, Denmark
- Jakob Eyvind Bardram: Department of Health Technology, Technical University of Denmark, Lyngby, Denmark
- Lars Vedel Kessing: Copenhagen Affective Disorder Research Center (CADIC), Psychiatric Center Copenhagen, Rigshospitalet, Copenhagen, Denmark
106. Moon K, Sobolev M, Kane JM. Digital and Mobile Health Technology in Collaborative Behavioral Health Care: Scoping Review. JMIR Ment Health 2022; 9:e30810. PMID: 35171105; PMCID: PMC8892315; DOI: 10.2196/30810.
Abstract
BACKGROUND The collaborative care model (CoCM) is a well-established system of behavioral health care in primary care settings. There is potential for digital and mobile technology to augment the CoCM to improve access, scalability, efficiency, and clinical outcomes. OBJECTIVE This study aims to conduct a scoping review to synthesize the evidence available on digital and mobile health technology in collaborative care settings. METHODS This review included cohort and experimental studies of digital and mobile technologies used to augment the CoCM. Studies examining primary care without collaborative care were excluded. A literature search was conducted using 4 electronic databases (MEDLINE, Embase, Web of Science, and Google Scholar). The search results were screened in 2 stages (title and abstract screening, followed by full-text review) by 2 reviewers. RESULTS A total of 3982 nonduplicate reports were identified, of which 20 (0.5%) were included in the analysis. Most studies used a combination of novel technologies. The range of digital and mobile health technologies used included mobile apps, websites, web-based platforms, telephone-based interactive voice recordings, and mobile sensor data. None of the identified studies used social media or wearable devices. Studies that measured patient and provider satisfaction reported positive results, although some types of interventions increased provider workload, and engagement was variable. In studies where clinical outcomes were measured (7/20, 35%), there were no differences between groups, or the differences were modest. CONCLUSIONS The use of digital and mobile health technologies in CoCM is still limited. This study found that technology was most successful when it was integrated into the existing workflow without relying on patient or provider initiative. However, the effect of digital and mobile health on clinical outcomes in CoCM remains unclear and requires additional clinical trials.
Affiliation(s)
- Khatiya Moon: Zucker Hillside Hospital, Northwell Health, Glen Oaks, NY, United States
- Michael Sobolev: Zucker Hillside Hospital, Northwell Health, Glen Oaks, NY, United States; Cornell Tech, Cornell University, New York City, NY, United States
- John M Kane: Zucker Hillside Hospital, Northwell Health, Glen Oaks, NY, United States
107. Evaluating the Feasibility and Acceptability of an Artificial-Intelligence-Enabled and Speech-Based Distress Screening Mobile App for Adolescents and Young Adults Diagnosed with Cancer: A Study Protocol. Cancers (Basel) 2022; 14:914. PMID: 35205663; PMCID: PMC8870320; DOI: 10.3390/cancers14040914.
Abstract
Simple Summary: Adolescent and young adult (AYA) patients diagnosed with cancer are at a higher risk of psychological distress, which requires regular monitoring throughout their cancer journeys. Paper-and-pencil or digital surveys for psychological distress are often cumbersome to complete during a patient's visit, and many patients find completing the same survey multiple times repetitive and boring. Recent advances in mobile technology and speech science have enabled flexible and engaging ways of monitoring psychological distress. This paper describes the scientific process we will use to evaluate an artificial intelligence (AI)-enabled mobile app to monitor depression and anxiety among AYAs diagnosed with cancer.
Abstract: Adolescents and young adults (AYAs) diagnosed with cancer are an age-defined population, with studies reporting up to 45% of the population experiencing psychological distress. Although it is essential to screen and monitor for psychological distress throughout AYAs' cancer journeys, many cancer centers fail to effectively implement distress screening protocols, largely because of busy clinical workflows and survey fatigue. Recent advances in mobile technology and speech science have enabled flexible and engaging methods to monitor psychological distress. However, patient-centered research focusing on these methods' feasibility and acceptability remains lacking. Therefore, in this project, we aim to evaluate the feasibility and acceptability of an artificial intelligence (AI)-enabled and speech-based mobile application to monitor psychological distress among AYAs diagnosed with cancer. We use a single-arm prospective cohort design with a stratified sampling strategy, aiming to recruit 60 AYAs diagnosed with cancer and to monitor their psychological distress using an AI-enabled, speech-based distress monitoring tool over a 6-month period. The primary feasibility endpoint is defined as the number of participants completing four of the six monthly distress assessments; acceptability is assessed both quantitatively, using the Acceptability of Intervention Measure, and qualitatively, using semi-structured interviews.
108. Kent RD, Kim Y, Chen LM. Oral and Laryngeal Diadochokinesis Across the Life Span: A Scoping Review of Methods, Reference Data, and Clinical Applications. J Speech Lang Hear Res 2022; 65:574-623. PMID: 34958599; DOI: 10.1044/2021_jslhr-21-00396.
Abstract
PURPOSE The aim of this study was to conduct a scoping review of research on oral and laryngeal diadochokinesis (DDK) in children and adults, either typically developing/developed or with a clinical diagnosis. METHOD Searches were conducted with PubMed/MEDLINE, Google Scholar, CINAHL, and legacy sources in retrieved articles. Search terms included the following: DDK, alternating motion rate, maximum repetition rate, sequential motion rate, and syllable repetition rate. RESULTS Three hundred sixty articles were retrieved and included in the review. Data source tables for children and adults list the number and ages of study participants, DDK task, and language(s) spoken. Cross-sectional data for typically developing children and typically developed adults are compiled for the monosyllables /pʌ/, /tʌ/, and /kʌ/; the trisyllable /pʌtʌkʌ/; and laryngeal DDK. In addition, DDK results are summarized for 26 disorders or conditions. DISCUSSION A growing number of multidisciplinary reports on DDK affirm its role in clinical practice and research across the world. Atypical DDK is not a well-defined singular entity but rather a label for a collection of disturbances associated with diverse etiologies, including motoric, structural, sensory, and cognitive. The clinical value of DDK can be optimized by consideration of task parameters, analysis method, and population of interest.
Affiliation(s)
- Ray D Kent: Department of Communication Sciences and Disorders, University of Wisconsin-Madison
- Yunjung Kim: School of Communication Sciences & Disorders, Florida State University, Tallahassee
- Li-Mei Chen: Department of Foreign Languages and Literature, National Cheng Kung University, Tainan, Taiwan
109. Tonn P, Seule L, Degani Y, Herzinger S, Klein A, Schulze N. Evaluation of a Digital Content-free Speech Analysis Tool to Measure Affective Distress in Mental Health. JMIR Form Res 2022; 6:e37061. PMID: 36040767; PMCID: PMC9472064; DOI: 10.2196/37061.
Affiliation(s)
- Peter Tonn: Neuropsychiatric Center of Hamburg, Hamburg, Germany
- Lea Seule: Neuropsychiatric Center of Hamburg, Hamburg, Germany
- Nina Schulze: Neuropsychiatric Center of Hamburg, Hamburg, Germany
110. Hansen L, Zhang YP, Wolf D, Sechidis K, Ladegaard N, Fusaroli R. A generalizable speech emotion recognition model reveals depression and remission. Acta Psychiatr Scand 2022; 145:186-199. PMID: 34850386; DOI: 10.1111/acps.13388.
Abstract
OBJECTIVE Affective disorders are associated with atypical voice patterns; however, automated voice analyses suffer from small sample sizes and untested generalizability on external data. We investigated a generalizable approach to aid clinical evaluation of depression and remission from voice using transfer learning: We train machine learning models on easily accessible non-clinical datasets and test them on novel clinical data in a different language. METHODS A Mixture of Experts machine learning model was trained to infer happy/sad emotional state using three publicly available emotional speech corpora in German and US English. We examined the model's predictive ability to classify the presence of depression on Danish speaking healthy controls (N = 42), patients with first-episode major depressive disorder (MDD) (N = 40), and the subset of the same patients who entered remission (N = 25) based on recorded clinical interviews. The model was evaluated on raw, de-noised, and speaker-diarized data. RESULTS The model showed separation between healthy controls and depressed patients at the first visit, obtaining an AUC of 0.71. Further, speech from patients in remission was indistinguishable from that of the control group. Model predictions were stable throughout the interview, suggesting that 20-30 s of speech might be enough to accurately screen a patient. Background noise (but not speaker diarization) heavily impacted predictions. CONCLUSION A generalizable speech emotion recognition model can effectively reveal changes in speaker depressive states before and after remission in patients with MDD. Data collection settings and data cleaning are crucial when considering automated voice analysis for clinical purposes.
Affiliation(s)
- Lasse Hansen: Department of Clinical Medicine, Aarhus University, Aarhus, Denmark; Department of Affective Disorders, Aarhus University Hospital - Psychiatry, Aarhus, Denmark; Center for Humanities Computing Aarhus, Aarhus University, Aarhus, Denmark; Roche Pharmaceutical Research & Early Development Informatics, Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd., Basel, Switzerland
- Yan-Ping Zhang: Roche Pharmaceutical Research & Early Development Informatics, Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd., Basel, Switzerland
- Detlef Wolf: Roche Pharmaceutical Research & Early Development Informatics, Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd., Basel, Switzerland
- Nicolai Ladegaard: Department of Clinical Medicine, Aarhus University, Aarhus, Denmark; Department of Affective Disorders, Aarhus University Hospital - Psychiatry, Aarhus, Denmark
- Riccardo Fusaroli: Cognitive Science, School of Communication and Culture, Aarhus University, Aarhus, Denmark; The Interacting Minds Centre, Aarhus University, Aarhus, Denmark
111. Sawalha J, Yousefnezhad M, Shah Z, Brown MRG, Greenshaw AJ, Greiner R. Detecting Presence of PTSD Using Sentiment Analysis From Text Data. Front Psychiatry 2022; 12:811392. PMID: 35178000; PMCID: PMC8844448; DOI: 10.3389/fpsyt.2021.811392.
Abstract
Rates of post-traumatic stress disorder (PTSD) have risen significantly during the COVID-19 pandemic, and telehealth has emerged as a means to monitor symptoms of such disorders, partly because the pandemic has left many people isolated from, or without access to, therapeutic intervention. Additional screening tools may be needed to augment identification and diagnosis of PTSD through a virtual medium. Sentiment analysis refers to the use of natural language processing (NLP) to extract emotional content from text. In our study, we train a machine learning (ML) model on text data from the Audio/Visual Emotion Challenge and Workshop (AVEC-19) corpus to identify individuals with PTSD using sentiment analysis of semi-structured interviews. Our sample included 188 individuals without PTSD and 87 with PTSD. The interviews were conducted by an artificial character (Ellie) over a video-conference call. Our model achieved a balanced accuracy of 80.4% on a held-out dataset from the AVEC-19 challenge. Additionally, we implemented various partitioning techniques to assess the model's generalizability. This shows that learned models can use sentiment analysis of speech to identify the presence of PTSD, even through a virtual medium, and can serve as an important, accessible, and inexpensive tool for detecting mental health abnormalities during the COVID-19 pandemic.
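As a toy illustration of the general idea described here, not the AVEC-19 pipeline used in the study, sentiment-derived features can be computed from interview transcripts and fed to a classifier. The lexicon, transcripts, and labels below are invented placeholders.

```python
from sklearn.linear_model import LogisticRegression

# Tiny hand-rolled sentiment lexicon (placeholder, not a validated resource).
NEG = {"afraid", "nightmare", "alone", "worthless", "angry"}
POS = {"calm", "happy", "hopeful", "safe", "relaxed"}

def sentiment_score(text: str) -> float:
    """Crude lexicon score: (#positive - #negative) / #tokens."""
    tokens = text.lower().split()
    if not tokens:
        return 0.0
    return (sum(t in POS for t in tokens) - sum(t in NEG for t in tokens)) / len(tokens)

# Toy transcripts and labels (1 = PTSD) standing in for real interview data.
transcripts = [
    "i feel calm and hopeful most days",
    "i am afraid at night and feel alone",
    "things are relaxed and i feel safe",
    "every nightmare leaves me angry and worthless",
]
labels = [0, 1, 0, 1]

# One sentiment feature per transcript; a real system would add many more.
X = [[sentiment_score(t)] for t in transcripts]
clf = LogisticRegression().fit(X, labels)
print(clf.predict([[sentiment_score("i feel happy and safe")]]))  # prints [0]
```

A production system would replace the lexicon with a trained sentiment model and combine the scores with other linguistic features, but the feature-extraction-then-classify shape is the same.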
Affiliation(s)
- Jeff Sawalha: Department of Psychiatry, University of Alberta, Edmonton, AB, Canada; Department of Computer Science, University of Alberta, Edmonton, AB, Canada; Department of Computing Science, Alberta Machine Intelligence Institute, University of Alberta, Edmonton, AB, Canada
- Muhammad Yousefnezhad: Department of Psychiatry, University of Alberta, Edmonton, AB, Canada; Department of Computer Science, University of Alberta, Edmonton, AB, Canada; Department of Computing Science, Alberta Machine Intelligence Institute, University of Alberta, Edmonton, AB, Canada
- Zehra Shah: Department of Computer Science, University of Alberta, Edmonton, AB, Canada; Department of Computing Science, Alberta Machine Intelligence Institute, University of Alberta, Edmonton, AB, Canada
- Matthew R. G. Brown: Department of Psychiatry, University of Alberta, Edmonton, AB, Canada; Department of Computer Science, University of Alberta, Edmonton, AB, Canada
- Russell Greiner: Department of Computer Science, University of Alberta, Edmonton, AB, Canada; Department of Computing Science, Alberta Machine Intelligence Institute, University of Alberta, Edmonton, AB, Canada
112. Birnbaum ML, Abrami A, Heisig S, Ali A, Arenare E, Agurto C, Lu N, Kane JM, Cecchi G. Acoustic and Facial Features From Clinical Interviews for Machine Learning-Based Psychiatric Diagnosis: Algorithm Development. JMIR Ment Health 2022; 9:e24699. PMID: 35072648; PMCID: PMC8822433; DOI: 10.2196/24699.
Abstract
BACKGROUND In contrast to all other areas of medicine, psychiatry is still nearly entirely reliant on subjective assessments such as patient self-report and clinical observation. The lack of objective information on which to base clinical decisions can contribute to reduced quality of care. Behavioral health clinicians need objective and reliable patient data to support effective targeted interventions. OBJECTIVE We aimed to investigate whether reliable inferences (psychiatric signs, symptoms, and diagnoses) can be extracted from audiovisual patterns in recorded evaluation interviews of participants with schizophrenia spectrum disorders and bipolar disorder. METHODS We obtained audiovisual data from 89 participants (mean age 25.3 years; male: 48/89, 53.9%; female: 41/89, 46.1%): individuals with schizophrenia spectrum disorders (n=41), individuals with bipolar disorder (n=21), and healthy volunteers (n=27). We developed machine learning models based on acoustic and facial movement features extracted from participant interviews to predict diagnoses and detect clinician-coded neuropsychiatric symptoms, and we assessed model performance using the area under the receiver operating characteristic curve (AUROC) in 5-fold cross-validation. RESULTS The model successfully differentiated between schizophrenia spectrum disorders and bipolar disorder (AUROC 0.73) when aggregating face and voice features. Facial action units, including the cheek-raising muscle (AUROC 0.64) and chin-raising muscle (AUROC 0.74), provided the strongest signal for men. Vocal features, such as energy in the 1-4 kHz frequency band (AUROC 0.80) and spectral harmonicity (AUROC 0.78), provided the strongest signal for women. The lip corner-pulling muscle signal discriminated between diagnoses for both men (AUROC 0.61) and women (AUROC 0.62). Several psychiatric signs and symptoms were successfully inferred: blunted affect (AUROC 0.81), avolition (AUROC 0.72), lack of vocal inflection (AUROC 0.71), asociality (AUROC 0.63), and worthlessness (AUROC 0.61). CONCLUSIONS This study represents an advance in efforts to capitalize on digital data to improve diagnostic assessment and supports the development of a new generation of innovative clinical tools employing acoustic and facial data analysis.
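The evaluation protocol named in this abstract, AUROC under 5-fold cross-validation, can be sketched as follows. This is a minimal illustration with synthetic stand-ins for the acoustic and facial features, not the authors' pipeline; scikit-learn is assumed.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold

# Synthetic stand-in for acoustic/facial feature vectors and diagnostic labels.
X, y = make_classification(n_samples=200, n_features=20, random_state=0)

# 5-fold cross-validation: train on 4 folds, score AUROC on the held-out fold.
aucs = []
for train_idx, test_idx in StratifiedKFold(n_splits=5, shuffle=True, random_state=0).split(X, y):
    model = LogisticRegression(max_iter=1000).fit(X[train_idx], y[train_idx])
    scores = model.predict_proba(X[test_idx])[:, 1]
    aucs.append(roc_auc_score(y[test_idx], scores))

mean_auc = float(np.mean(aucs))  # the single summary number reported per contrast
```

Averaging fold-wise AUROCs, as done here, is one common convention; pooling the out-of-fold scores before a single AUROC computation is another.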
Affiliation(s)
- Michael L Birnbaum
- Department of Psychiatry, The Zucker Hillside Hospital, Northwell Health, Glen Oaks, NY, United States; The Feinstein Institute for Medical Research, Northwell Health, Manhasset, NY, United States; The Donald and Barbara Zucker School of Medicine at Hofstra/Northwell, Hempstead, NY, United States
- Avner Abrami
- Computational Biology Center, IBM Research, Yorktown Heights, NY, United States
- Stephen Heisig
- Icahn School of Medicine at Mount Sinai, New York City, NY, United States
- Asra Ali
- Department of Psychiatry, The Zucker Hillside Hospital, Northwell Health, Glen Oaks, NY, United States; The Feinstein Institute for Medical Research, Northwell Health, Manhasset, NY, United States
- Elizabeth Arenare
- Department of Psychiatry, The Zucker Hillside Hospital, Northwell Health, Glen Oaks, NY, United States; The Feinstein Institute for Medical Research, Northwell Health, Manhasset, NY, United States
- Carla Agurto
- Computational Biology Center, IBM Research, Yorktown Heights, NY, United States
- Nathaniel Lu
- Department of Psychiatry, The Zucker Hillside Hospital, Northwell Health, Glen Oaks, NY, United States; The Feinstein Institute for Medical Research, Northwell Health, Manhasset, NY, United States
- John M Kane
- Department of Psychiatry, The Zucker Hillside Hospital, Northwell Health, Glen Oaks, NY, United States; The Feinstein Institute for Medical Research, Northwell Health, Manhasset, NY, United States; The Donald and Barbara Zucker School of Medicine at Hofstra/Northwell, Hempstead, NY, United States
- Guillermo Cecchi
- Computational Biology Center, IBM Research, Yorktown Heights, NY, United States
113
Almaghrabi SA, Thewlis D, Thwaites S, Rogasch NC, Lau S, Clark SR, Baumert M. The reproducibility of bio-acoustic features is associated with sample duration, speech task and gender. IEEE Trans Neural Syst Rehabil Eng 2022; 30:167-175. [PMID: 35038295 DOI: 10.1109/tnsre.2022.3143117]
Abstract
Bio-acoustic properties of speech show evolving value in the analysis of psychiatric illness. Obtaining a sufficient speech sample length to quantify these properties is essential, but the impact of sample duration on the stability of bio-acoustic features has not been systematically explored. We aimed to evaluate the reproducibility of bio-acoustic features against changes in speech duration and task. We extracted source, spectral, formant, and prosodic features in 185 English-speaking adults (98 women, 87 men) for reading-a-story and counting tasks. Using intraclass correlation coefficients, we compared features at 25% of the total sample duration of the reading task to those obtained from non-overlapping, randomly selected sub-samples shortened to 75%, 50%, and 25% of total duration. We also compared the features extracted from entire recordings to those measured at 25% and at 50% of the duration. Further, we compared features extracted from the reading-a-story task to those from the counting task. Our results show that the number of reproducible features (out of 125) decreased stepwise with duration reduction. Spectral shape, pitch, and formants reached excellent reproducibility. Mel-frequency cepstral coefficients (MFCCs), loudness, and zero-crossing rate achieved excellent reproducibility only at longer durations. Reproducibility of source features, MFCC derivatives, and voicing probability (VP) was poor. Significant gender differences existed in jitter, the MFCC first derivative, spectral skewness, pitch, VP, and formants. Around 97% of features in both genders were not reproducible across speech tasks, in part due to the short duration of the counting task. In conclusion, bio-acoustic features are less reproducible in shorter samples and are affected by gender.
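Reproducibility between a full-length recording and a shortened sub-sample is quantified here with intraclass correlation coefficients. A minimal sketch of a one-way random-effects ICC(1,1) on synthetic per-speaker feature values follows; the specific ICC variant and the simulated data are assumptions for illustration, not details taken from the paper.

```python
import numpy as np

def icc_oneway(ratings: np.ndarray) -> float:
    """One-way random-effects ICC(1,1) for an (n_subjects, k_measurements) array."""
    n, k = ratings.shape
    grand = ratings.mean()
    subj_means = ratings.mean(axis=1)
    # Between-subject and within-subject mean squares.
    msb = k * ((subj_means - grand) ** 2).sum() / (n - 1)
    msw = ((ratings - subj_means[:, None]) ** 2).sum() / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)

rng = np.random.default_rng(0)
trait = rng.normal(size=(100, 1))                    # stable per-speaker feature value
full = trait + rng.normal(scale=0.1, size=(100, 1))  # feature from the full recording
short = trait + rng.normal(scale=0.1, size=(100, 1)) # feature from a shortened sub-sample
icc = icc_oneway(np.hstack([full, short]))           # near 1.0 when the feature is stable
```

A feature whose short-sample estimate is dominated by measurement noise rather than the speaker trait would drive the ICC toward zero, which is the failure mode the abstract reports for source features and MFCC derivatives.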
114
Lin Y, Liyanage BN, Sun Y, Lu T, Zhu Z, Liao Y, Wang Q, Shi C, Yue W. A deep learning-based model for detecting depression in senior population. Front Psychiatry 2022; 13:1016676. [PMID: 36419976 PMCID: PMC9677587 DOI: 10.3389/fpsyt.2022.1016676]
Abstract
OBJECTIVES With growing attention to the early diagnosis of depression, this study uses the biological information in speech, combined with deep learning, to build a rapid binary classification model of depression in Mandarin-speaking older adults and tests its effectiveness. METHODS Demographic information and acoustic data were collected from 56 Mandarin-speaking older adults with major depressive disorder (MDD), diagnosed with the Mini-International Neuropsychiatric Interview (MINI) and the fifth edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-5), and 47 controls. Acoustic data were recorded using different smartphones and analyzed by a deep learning model developed and tested on an independent validation set. Model accuracy is shown by the ROC curve. RESULTS The quality of the collected speech affected the accuracy of the model. The initial sensitivity and specificity of the model were 82.14% [95% CI, 70.16-90.00] and 80.85% [95% CI, 67.64-89.58], respectively. CONCLUSION This study provides a new method for the rapid identification and diagnosis of depression using deep learning. Vocal biomarkers extracted from raw speech signals have high potential for the early diagnosis of depression in older adults.
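As a check on the reported operating characteristics, a Wilson score interval for 46 of 56 MDD cases correctly detected reproduces the printed sensitivity of 82.14% [95% CI, 70.16-90.00]. The 46/56 split is inferred from the reported percentage over the 56 MDD participants, and the choice of the Wilson method is an assumption that happens to match the printed bounds.

```python
import math

def wilson_ci(successes: int, n: int, z: float = 1.96) -> tuple:
    """95% Wilson score confidence interval for a binomial proportion."""
    p = successes / n
    denom = 1 + z ** 2 / n
    center = (p + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return center - half, center + half

sensitivity = 46 / 56        # 82.14% of the 56 MDD participants
lo, hi = wilson_ci(46, 56)   # approximately (0.7016, 0.9000)
```

The Wilson interval is preferred over the naive normal approximation for proportions near 0 or 1 and for small n, both of which apply to clinical validation samples of this size.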
Affiliation(s)
- Yunhan Lin
- Institute of Mental Health, Peking University Sixth Hospital, Beijing, China; Research Unit of Diagnosis and Treatment of Mood Cognitive Disorder, Chinese Academy of Medical Sciences (2018RU006), Beijing, China; National Clinical Research Center for Mental Disorders and NHC Key Laboratory of Mental Health (Peking University Sixth Hospital), Beijing, China
- Yutao Sun
- The Fifth Hospital of Tangshan City, Tangshan, China
- Tianlan Lu
- Institute of Mental Health, Peking University Sixth Hospital, Beijing, China; Research Unit of Diagnosis and Treatment of Mood Cognitive Disorder, Chinese Academy of Medical Sciences (2018RU006), Beijing, China; National Clinical Research Center for Mental Disorders and NHC Key Laboratory of Mental Health (Peking University Sixth Hospital), Beijing, China
- Yundan Liao
- Institute of Mental Health, Peking University Sixth Hospital, Beijing, China; Research Unit of Diagnosis and Treatment of Mood Cognitive Disorder, Chinese Academy of Medical Sciences (2018RU006), Beijing, China; National Clinical Research Center for Mental Disorders and NHC Key Laboratory of Mental Health (Peking University Sixth Hospital), Beijing, China
- Chuan Shi
- Institute of Mental Health, Peking University Sixth Hospital, Beijing, China; Research Unit of Diagnosis and Treatment of Mood Cognitive Disorder, Chinese Academy of Medical Sciences (2018RU006), Beijing, China; National Clinical Research Center for Mental Disorders and NHC Key Laboratory of Mental Health (Peking University Sixth Hospital), Beijing, China
- Weihua Yue
- Institute of Mental Health, Peking University Sixth Hospital, Beijing, China; Research Unit of Diagnosis and Treatment of Mood Cognitive Disorder, Chinese Academy of Medical Sciences (2018RU006), Beijing, China; National Clinical Research Center for Mental Disorders and NHC Key Laboratory of Mental Health (Peking University Sixth Hospital), Beijing, China; PKU-IDG/McGovern Institute for Brain Research, Peking University, Beijing, China; Chinese Institute for Brain Research, Beijing, China
115
Hajduska-Dér B, Kiss G, Sztahó D, Vicsi K, Simon L. The applicability of the Beck Depression Inventory and Hamilton Depression Scale in the automatic recognition of depression based on speech signal processing. Front Psychiatry 2022; 13:879896. [PMID: 35990073 PMCID: PMC9385975 DOI: 10.3389/fpsyt.2022.879896]
Abstract
Depression is a growing problem worldwide, affecting an increasing number of patients as well as health systems and the global economy. The most common diagnostic rating scales for depression are self-reported or clinician-administered, and they differ in the symptoms they sample. Speech is a promising biomarker in the diagnostic assessment of depression due to its non-invasiveness and its cost- and time-efficiency. In our study, we aim at a more accurate, sensitive model for detecting depression based on speech processing. Both regression and classification models were developed using machine learning. During the research, we had access to a large speech database that includes samples from depressed and healthy subjects. The database contains the Beck Depression Inventory (BDI) score of each subject and the Hamilton Rating Scale for Depression (HAMD) score of 20% of the subjects. This provided an opportunity to compare the usefulness of the BDI and the HAMD for training models for the automatic recognition of depression based on speech signal processing. We found that the values estimated by the acoustic model trained on BDI scores are closer to the HAMD assessment than to the BDI scores, and that partially substituting HAMD scores for BDI scores in training improves the accuracy of automatic depression recognition.
Affiliation(s)
- Bálint Hajduska-Dér
- Department of Psychiatry and Psychotherapy, Semmelweis University, Budapest, Hungary
- Gábor Kiss
- Department of Telecommunications and Media Informatics, Budapest University of Technology and Economics, Budapest, Hungary
- Dávid Sztahó
- Department of Telecommunications and Media Informatics, Budapest University of Technology and Economics, Budapest, Hungary
- Klára Vicsi
- Department of Telecommunications and Media Informatics, Budapest University of Technology and Economics, Budapest, Hungary
- Lajos Simon
- Department of Psychiatry and Psychotherapy, Semmelweis University, Budapest, Hungary
116
Di Y, Wang J, Liu X, Zhu T. Combining Polygenic Risk Score and Voice Features to Detect Major Depressive Disorders. Front Genet 2021; 12:761141. [PMID: 34987547 PMCID: PMC8721147 DOI: 10.3389/fgene.2021.761141]
Abstract
Background: The application of polygenic risk scores (PRSs) to major depressive disorder (MDD) detection is constrained by their simplicity and uncertainty. One promising way to extend their usability is fusion with other biomarkers. This study constructed an MDD biomarker by combining a PRS with voice features and evaluated its detection ability on large clinical samples. Methods: We collected genome-wide sequences and utterances edited from clinical interview speech recordings from 3,580 women with recurrent MDD and 4,016 healthy controls. We then constructed a PRS as a genetic biomarker by p value-based clumping and thresholding and extracted voice features using the i-vector method. Using logistic regression, we compared the ability of the gene or voice biomarker alone with that of the two in combination for MDD detection. We also tested further machine learning models to improve detection. Results: With a p-value threshold of 0.005, the combined biomarker improved the area under the receiver operating characteristic curve (AUC) by 9.09% compared to genes alone and by 6.73% compared to voice alone. A multilayer perceptron further increased the AUC by 3.6% over logistic regression, while support vector machines and random forests showed no better performance. Conclusion: Adding voice biomarkers to genetic ones effectively improves MDD detection, and combining PRS and voice biomarkers for this purpose is feasible. This study provides a foundation for exploring the clinical application of genetic and voice biomarkers in the diagnosis of MDD.
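The fusion step can be sketched as a logistic regression over a precomputed PRS column concatenated with voice features. The data below are synthetic, and the PRS construction itself (p value-based clumping and thresholding over genome-wide variants) is replaced by a ready-made scalar score, so this only illustrates the combination logic, not the genetics or the i-vector extraction.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
n = 2000
risk = rng.normal(size=n)                       # latent liability to MDD
prs = risk + rng.normal(scale=2.0, size=n)      # noisy precomputed polygenic risk score
voice = np.column_stack(                        # 8 noisy voice features (i-vector stand-in)
    [risk + rng.normal(scale=2.0, size=n) for _ in range(8)])
y = (risk + rng.normal(scale=1.0, size=n) > 0).astype(int)  # case/control labels

def auc_of(features: np.ndarray) -> float:
    """Held-out AUC of a logistic regression on the given feature block."""
    Xtr, Xte, ytr, yte = train_test_split(features, y, test_size=0.3,
                                          random_state=0, stratify=y)
    m = LogisticRegression(max_iter=1000).fit(Xtr, ytr)
    return roc_auc_score(yte, m.predict_proba(Xte)[:, 1])

auc_gene = auc_of(prs.reshape(-1, 1))
auc_voice = auc_of(voice)
auc_combined = auc_of(np.column_stack([prs.reshape(-1, 1), voice]))
```

Because the PRS and the voice features carry independent noise around the same latent liability, the combined model is expected to recover more of the signal than either modality alone, which mirrors the AUC gains the abstract reports.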
Affiliation(s)
- Yazheng Di
- Key Laboratory of Behavioral Science, Institute of Psychology, Chinese Academy of Sciences, Beijing, China
- Department of Psychology, University of Chinese Academy of Sciences, Beijing, China
- Jingying Wang
- School of Optometry, Faculty of Health and Social Sciences, Hong Kong Polytechnic University, Hong Kong, China
- Xiaoqian Liu
- Key Laboratory of Behavioral Science, Institute of Psychology, Chinese Academy of Sciences, Beijing, China
- Department of Psychology, University of Chinese Academy of Sciences, Beijing, China
- Tingshao Zhu
- Key Laboratory of Behavioral Science, Institute of Psychology, Chinese Academy of Sciences, Beijing, China
- Department of Psychology, University of Chinese Academy of Sciences, Beijing, China
117
Kelly JR, Gillan CM, Prenderville J, Kelly C, Harkin A, Clarke G, O'Keane V. Psychedelic Therapy's Transdiagnostic Effects: A Research Domain Criteria (RDoC) Perspective. Front Psychiatry 2021; 12:800072. [PMID: 34975593 PMCID: PMC8718877 DOI: 10.3389/fpsyt.2021.800072]
Abstract
Accumulating clinical evidence shows that psychedelic therapy, by synergistically combining psychopharmacology and psychological support, offers a promising transdiagnostic treatment strategy for a range of disorders with restricted and/or maladaptive habitual patterns of emotion, cognition and behavior, notably, depression (MDD), treatment resistant depression (TRD) and addiction disorders, but perhaps also anxiety disorders, obsessive-compulsive disorder (OCD), Post-Traumatic Stress Disorder (PTSD) and eating disorders. Despite the emergent transdiagnostic evidence, the specific clinical dimensions that psychedelics are efficacious for, and associated underlying neurobiological pathways, remain to be well-characterized. To this end, this review focuses on pre-clinical and clinical evidence of the acute and sustained therapeutic potential of psychedelic therapy in the context of a transdiagnostic dimensional systems framework. Focusing on the Research Domain Criteria (RDoC) as a template, we will describe the multimodal mechanisms underlying the transdiagnostic therapeutic effects of psychedelic therapy, traversing molecular, cellular and network levels. These levels will be mapped to the RDoC constructs of negative and positive valence systems, arousal regulation, social processing, cognitive and sensorimotor systems. In summarizing this literature and framing it transdiagnostically, we hope we can assist the field in moving toward a mechanistic understanding of how psychedelics work for patients and eventually toward a precise-personalized psychedelic therapy paradigm.
Affiliation(s)
- John R. Kelly
- Department of Psychiatry, Trinity College, Dublin, Ireland
- Department of Psychiatry, Tallaght University Hospital, Dublin, Ireland
- Claire M. Gillan
- Trinity College Institute of Neuroscience, Trinity College, Dublin, Ireland
- School of Psychology, Trinity College, Dublin, Ireland
- Global Brain Health Institute, Trinity College, Dublin, Ireland
- Jack Prenderville
- Transpharmation Ireland Ltd, Institute of Neuroscience, Trinity College, Dublin, Ireland
- Discipline of Physiology, School of Medicine, Trinity College, Dublin, Ireland
- Clare Kelly
- Department of Psychiatry, Trinity College, Dublin, Ireland
- Trinity College Institute of Neuroscience, Trinity College, Dublin, Ireland
- School of Psychology, Trinity College, Dublin, Ireland
- Andrew Harkin
- Trinity College Institute of Neuroscience, Trinity College, Dublin, Ireland
- School of Pharmacy and Pharmaceutical Sciences, Trinity College, Dublin, Ireland
- Gerard Clarke
- Department of Psychiatry and Neurobehavioral Science, University College Cork, Cork, Ireland
- APC Microbiome Ireland, University College Cork, Cork, Ireland
- Veronica O'Keane
- Department of Psychiatry, Trinity College, Dublin, Ireland
- Department of Psychiatry, Tallaght University Hospital, Dublin, Ireland
- Trinity College Institute of Neuroscience, Trinity College, Dublin, Ireland
118
Smrke U, Mlakar I, Lin S, Musil B, Plohl N. Language, Speech, and Facial Expression Features for Artificial Intelligence-Based Detection of Cancer Survivors' Depression: Scoping Meta-Review. JMIR Ment Health 2021; 8:e30439. [PMID: 34874883 PMCID: PMC8691410 DOI: 10.2196/30439]
Abstract
BACKGROUND Cancer survivors often experience disorders from the depressive spectrum that remain largely unrecognized and overlooked. Even though screening for depression is recognized as essential, several barriers prevent its successful implementation, and it is possible that better screening options can be developed. New possibilities have been opening up with advances in artificial intelligence and increasing knowledge of the connections between observable cues and psychological states. OBJECTIVE The aim of this scoping meta-review was to identify observable features of depression that can be intercepted using artificial intelligence, in order to provide a stepping stone toward better recognition of depression among cancer survivors. METHODS We followed a methodological framework for scoping reviews. We searched SCOPUS and Web of Science for relevant papers on the topic, and data were extracted from the papers that met the inclusion criteria. We used thematic analysis within 3 predefined categories of depression cues (ie, language, speech, and facial expression cues) to analyze the papers. RESULTS The search yielded 1023 papers, of which 9 met the inclusion criteria. Analysis of their findings resulted in several well-supported cues of depression in the language, speech, and facial expression domains, providing a comprehensive list of observable features that are potentially suited to be intercepted by artificial intelligence for the early detection of depression. CONCLUSIONS This review provides a synthesis of the behavioral features of depression and translates this knowledge into the context of artificial intelligence-supported screening for depression in cancer survivors.
Affiliation(s)
- Urška Smrke
- Faculty of Electrical Engineering and Computer Science, University of Maribor, Maribor, Slovenia
- Izidor Mlakar
- Faculty of Electrical Engineering and Computer Science, University of Maribor, Maribor, Slovenia
- Simon Lin
- Science Department, Symptoma, Vienna, Austria; Department of Internal Medicine, Paracelsus Medical University, Salzburg, Austria
- Bojan Musil
- Department of Psychology, Faculty of Arts, University of Maribor, Maribor, Slovenia
- Nejc Plohl
- Department of Psychology, Faculty of Arts, University of Maribor, Maribor, Slovenia
119
Faurholt-Jepsen M, Rohani DA, Busk J, Vinberg M, Bardram JE, Kessing LV. Voice analyses using smartphone-based data in patients with bipolar disorder, unaffected relatives and healthy control individuals, and during different affective states. Int J Bipolar Disord 2021; 9:38. [PMID: 34850296 PMCID: PMC8632566 DOI: 10.1186/s40345-021-00243-3]
Abstract
BACKGROUND Voice features have been suggested as objective markers of bipolar disorder (BD). AIMS To investigate whether voice features from naturalistic phone calls can discriminate between (1) patients with BD, unaffected first-degree relatives (UR) and healthy control individuals (HC); and (2) affective states within BD. METHODS Voice features were collected daily during naturalistic phone calls for up to 972 days. A total of 121 patients with BD, 21 UR and 38 HC were included, yielding 107,033 voice data entries [BD (n = 78,733), UR (n = 8,004), and HC (n = 20,296)]. Patients evaluated symptoms daily using a smartphone-based system, and affective states were defined according to these evaluations. Data were analyzed using random forest machine learning algorithms. RESULTS Compared to HC, BD was classified with a sensitivity of 0.79 (SD 0.11)/AUC = 0.76 (SD 0.11) and UR with a sensitivity of 0.53 (SD 0.21)/AUC = 0.72 (SD 0.12). Within BD, compared to euthymia, mania was classified with a specificity of 0.75 (SD 0.16)/AUC = 0.66 (SD 0.11), and depression with a specificity of 0.70 (SD 0.16)/AUC = 0.66 (SD 0.12). In all cases, the user-dependent models outperformed the user-independent models. Models discriminating periods of combined increased mood, increased activity and insomnia from periods without these symptoms performed best, with a specificity of 0.78 (SD 0.16)/AUC = 0.67 (SD 0.11). CONCLUSIONS Voice features from naturalistic phone calls may represent a supplementary objective marker discriminating BD from HC and a state marker within BD.
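The gap between user-dependent and user-independent models reported above is a general phenomenon: when the same speaker contributes to both training and test data, speaker-specific voice characteristics aid classification. The sketch below contrasts subject-held-out cross-validation (GroupKFold) with random-split cross-validation on synthetic data; the features and models are illustrative assumptions, not the study's.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GroupKFold, KFold, cross_val_score

rng = np.random.default_rng(0)
n_subjects, per_subj = 30, 40
groups = np.repeat(np.arange(n_subjects), per_subj)           # subject of each call
y = np.repeat(rng.integers(0, 2, size=n_subjects), per_subj)  # one label per subject
offsets = rng.normal(scale=3.0, size=n_subjects)              # speaker-specific baseline
# 1-D "voice feature": diagnosis signal buried under speaker baseline plus call noise.
X = (y + offsets[groups] + rng.normal(scale=0.5, size=n_subjects * per_subj)).reshape(-1, 1)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
# "User-dependent" proxy: random splits let the model see each speaker during training.
dep = cross_val_score(clf, X, y, cv=KFold(n_splits=5, shuffle=True, random_state=0)).mean()
# User-independent: whole subjects held out, so speaker identity cannot leak.
indep = cross_val_score(clf, X, y, cv=GroupKFold(n_splits=5), groups=groups).mean()
```

With the speaker baseline dominating the diagnostic signal, the random-split score benefits from memorizing each speaker, while the subject-held-out score reflects only the shared signal, which is why grouped splits are the honest estimate for new users.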
Affiliation(s)
- Maria Faurholt-Jepsen
- Copenhagen Affective Disorder Research Center (CADIC), Psychiatric Center Copenhagen, Rigshospitalet, Blegdamsvej 9, 2100, Copenhagen, Denmark
- Darius Adam Rohani
- Department of Health Technology, Technical University of Denmark, Lyngby, Denmark
- Jonas Busk
- Department of Energy Conversion and Storage, Technical University of Denmark, Lyngby, Denmark
- Maj Vinberg
- Copenhagen Affective Disorder Research Center (CADIC), Psychiatric Center Copenhagen, Rigshospitalet, Blegdamsvej 9, 2100, Copenhagen, Denmark; Psychiatric Centre North Zealand, Hilleroed, Denmark
- Jakob Eyvind Bardram
- Department of Health Technology, Technical University of Denmark, Lyngby, Denmark
- Lars Vedel Kessing
- Copenhagen Affective Disorder Research Center (CADIC), Psychiatric Center Copenhagen, Rigshospitalet, Blegdamsvej 9, 2100, Copenhagen, Denmark
120
Design of a Data Glove for Assessment of Hand Performance Using Supervised Machine Learning. Sensors 2021; 21:6948. [PMID: 34770255 PMCID: PMC8587288 DOI: 10.3390/s21216948]
Abstract
The large number of poststroke recovery patients poses a burden on rehabilitation centers, hospitals, and physiotherapists. The advent of rehabilitation robotics and automated assessment systems can ease this burden by assisting in the rehabilitation of patients with a high level of recovery. This assistance will enable medical professionals either to better provide for patients with severe injuries or to treat more patients, and it also translates into financial benefits in the long run. This paper demonstrates an automated assessment system for in-home rehabilitation utilizing a data glove, a mobile application, and machine learning algorithms. The system can be used by poststroke patients with a high level of recovery to assess their performance, and this assessment can be sent to a medical professional for supervision. Additionally, two machine learning classifiers were compared on their assessment of physical exercises. The proposed system has an accuracy of 85% (±5.1%) with careful feature and classifier selection.
121
König A, Mallick E, Tröger J, Linz N, Zeghari R, Manera V, Robert P. Measuring neuropsychiatric symptoms in patients with early cognitive decline using speech analysis. Eur Psychiatry 2021; 64:e64. [PMID: 34641989 PMCID: PMC8581700 DOI: 10.1192/j.eurpsy.2021.2236]
Abstract
BACKGROUND Certain neuropsychiatric symptoms (NPS), namely apathy, depression, and anxiety, have demonstrated great value in predicting dementia progression, potentially representing a window of opportunity for timely diagnosis and treatment. However, sensitive and objective markers of these symptoms are still missing. The present study therefore investigates the association between automatically extracted speech features and NPS in patients with mild neurocognitive disorders. METHODS Speech of 141 patients aged 65 or older with a neurocognitive disorder was recorded while they performed two short narrative speech tasks. NPS were assessed with the Neuropsychiatric Inventory. Paralinguistic markers relating to prosodic, formant, source, and temporal qualities of speech were automatically extracted and correlated with NPS. Machine learning experiments were carried out to validate the diagnostic power of the extracted markers. RESULTS Different speech variables are associated with specific NPS: apathy correlates with temporal aspects and anxiety with voice quality, and this was mostly consistent between males and females after correction for cognitive impairment. Machine learning regressors are able to extract information from speech features and perform above baseline in predicting anxiety, apathy, and depression scores. CONCLUSIONS Different NPS seem to be characterized by distinct speech features, which are easily and automatically extractable from short vocal tasks. These findings support the use of speech analysis for detecting subtypes of NPS in patients with cognitive impairment. This could have great implications for the design of future clinical trials, as this cost-effective method could allow more continuous and even remote monitoring of symptoms.
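"Performing above baseline" for a score regressor is usually checked against a dummy predictor of the mean score. A minimal synthetic sketch of that comparison follows; the feature count, the construction of the score, and the random-forest choice are assumptions for illustration, not the paper's setup.

```python
import numpy as np
from sklearn.dummy import DummyRegressor
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 141                                  # matches the study's sample size
X = rng.normal(size=(n, 6))              # stand-ins for prosodic/formant/source/temporal markers
# Synthetic NPI-like apathy score driven by two of the speech markers plus noise.
apathy = 2 * X[:, 0] - X[:, 3] + rng.normal(scale=1.0, size=n)

# Negative MAE (higher is better): the regressor must beat the predict-the-mean baseline.
baseline = cross_val_score(DummyRegressor(strategy="mean"), X, apathy,
                           cv=5, scoring="neg_mean_absolute_error").mean()
model = cross_val_score(RandomForestRegressor(n_estimators=200, random_state=0), X, apathy,
                        cv=5, scoring="neg_mean_absolute_error").mean()
```

If the speech markers carried no information about the score, the regressor's cross-validated error would match the dummy baseline; beating it is the minimal evidence of diagnostic signal that the abstract claims.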
Affiliation(s)
- Alexandra König
- Stars Team, Sophia Antipolis, Institut National de Recherche en Informatique et en Automatique (INRIA), Valbonne, France; Clinical Research, ki:elements, Saarbrücken, Germany; CoBTeK (Cognition-Behaviour-Technology) Lab, FRIS-University Côte d'Azur, Nice, France
- Elisa Mallick
- Stars Team, Sophia Antipolis, Institut National de Recherche en Informatique et en Automatique (INRIA), Valbonne, France; Clinical Research, ki:elements, Saarbrücken, Germany; CoBTeK (Cognition-Behaviour-Technology) Lab, FRIS-University Côte d'Azur, Nice, France
- Johannes Tröger
- Stars Team, Sophia Antipolis, Institut National de Recherche en Informatique et en Automatique (INRIA), Valbonne, France; Clinical Research, ki:elements, Saarbrücken, Germany; CoBTeK (Cognition-Behaviour-Technology) Lab, FRIS-University Côte d'Azur, Nice, France
- Nicklas Linz
- Stars Team, Sophia Antipolis, Institut National de Recherche en Informatique et en Automatique (INRIA), Valbonne, France; Clinical Research, ki:elements, Saarbrücken, Germany; CoBTeK (Cognition-Behaviour-Technology) Lab, FRIS-University Côte d'Azur, Nice, France
- Radia Zeghari
- Stars Team, Sophia Antipolis, Institut National de Recherche en Informatique et en Automatique (INRIA), Valbonne, France; Clinical Research, ki:elements, Saarbrücken, Germany; CoBTeK (Cognition-Behaviour-Technology) Lab, FRIS-University Côte d'Azur, Nice, France
- Valeria Manera
- Stars Team, Sophia Antipolis, Institut National de Recherche en Informatique et en Automatique (INRIA), Valbonne, France; Clinical Research, ki:elements, Saarbrücken, Germany; CoBTeK (Cognition-Behaviour-Technology) Lab, FRIS-University Côte d'Azur, Nice, France
- Philippe Robert
- Stars Team, Sophia Antipolis, Institut National de Recherche en Informatique et en Automatique (INRIA), Valbonne, France; Clinical Research, ki:elements, Saarbrücken, Germany; CoBTeK (Cognition-Behaviour-Technology) Lab, FRIS-University Côte d'Azur, Nice, France
122
Brederoo SG, Nadema FG, Goedhart FG, Voppel AE, De Boer JN, Wouts J, Koops S, Sommer IEC. Implementation of automatic speech analysis for early detection of psychiatric symptoms: What do patients want? J Psychiatr Res 2021; 142:299-301. [PMID: 34416548 DOI: 10.1016/j.jpsychires.2021.08.019]
Abstract
Psychiatry is in dire need of a method to aid early detection of symptoms. Recent developments in automatic speech analysis prove promising in this regard, and open avenues for implementation of speech-based applications to detect psychiatric symptoms. The current survey was conducted to assess positions with regard to speech recordings among a group (n = 675) of individuals who experience psychiatric symptoms. Overall, respondents are open to the idea of speech recordings in light of their mental welfare. Importantly, concerns with regard to privacy were raised. Given that speech recordings are privacy sensitive, this requires special attention upon implementation of automatic speech analysis techniques. Furthermore, respondents indicated a preference for speech recordings in the presence of a clinician, as opposed to a recording made at home without the clinician present. In developing a speech marker for psychiatry, close collaboration with the intended users is essential to arrive at a truly valid and implementable method.
Affiliation(s)
- S G Brederoo
- University of Groningen, Department of Biomedical Sciences of Cells & Systems, University Medical Center Groningen, Groningen, the Netherlands; Center for Psychiatry, University Medical Center Groningen, Groningen, the Netherlands
- F G Nadema
- University of Groningen, Department of Biomedical Sciences of Cells & Systems, University Medical Center Groningen, Groningen, the Netherlands
- F G Goedhart
- MIND Landelijk Platform Psychische Gezondheid, Amersfoort, the Netherlands
- A E Voppel
- University of Groningen, Department of Biomedical Sciences of Cells & Systems, University Medical Center Groningen, Groningen, the Netherlands
- J N De Boer
- Department of Psychiatry, UMCU Brain Center, University Medical Center Utrecht, Utrecht University, Utrecht, the Netherlands
- J Wouts
- University of Groningen, Department of Biomedical Sciences of Cells & Systems, University Medical Center Groningen, Groningen, the Netherlands
- S Koops
- University of Groningen, Department of Biomedical Sciences of Cells & Systems, University Medical Center Groningen, Groningen, the Netherlands
- I E C Sommer
- University of Groningen, Department of Biomedical Sciences of Cells & Systems, University Medical Center Groningen, Groningen, the Netherlands
| |
Collapse
|
123
Bougeard A, Guay Hottin R, Houde V, Jean T, Piront T, Potvin S, Bernard P, Tourjman V, De Benedictis L, Orban P. Le phénotypage digital pour une pratique clinique en santé mentale mieux informée [Digital phenotyping for better-informed clinical practice in mental health]. Santé mentale au Québec 2021. [DOI: 10.7202/1081513ar]
Abstract
Objectives: This review is motivated by the observation that clinical decision-making in mental health is limited by the nature of the measures typically obtained during the clinical interview and by clinicians' difficulty in producing accurate predictions about patients' future mental states. The objective is to present a representative overview of the potential of digital phenotyping coupled with machine learning to address this limitation, while highlighting its current weaknesses.
Method: Through a non-systematic narrative review of the literature, we identify the technological advances that make it possible to quantify the human phenotype, moment by moment and in the natural living environment, by means of the smartphone in various psychiatric populations. Relevant work is also selected to determine the usefulness and limitations of machine learning for guiding predictions and clinical decision-making. Finally, the literature is explored to assess the current barriers to the adoption of such tools.
Results: Although this field of research is recent, a large body of work already underscores the value of measures extracted from smartphone sensors for characterizing the human phenotype in the behavioral, cognitive, emotional, and social spheres, all of which are affected by mental disorders. Machine learning enables useful and accurate clinical predictions based on these measures, but it suffers from a lack of interpretability that will hinder its near-term use in clinical practice. Moreover, several barriers identified on both the patient and the clinician side currently slow the adoption of this type of monitoring and clinical decision-support tool.
Conclusion: Digital phenotyping coupled with machine learning appears highly promising for improving clinical practice in mental health. The youth of these new technological tools nevertheless requires a maturation process that will have to be overseen by the various stakeholders involved for these promises to be fully realized.
Affiliation(s)
- Alan Bougeard, Rose Guay Hottin, Thierry Jean, graduate students, Centre de recherche de l'Institut universitaire en santé mentale de Montréal
- Valérie Houde, M.D., graduate student, Centre de recherche de l'Institut universitaire en santé mentale de Montréal
- Thibault Piront, research professional, Centre de recherche de l'Institut universitaire en santé mentale de Montréal
- Stéphane Potvin, Ph.D., researcher, Centre de recherche de l'Institut universitaire en santé mentale de Montréal; full research professor, Département de psychiatrie et d'addictologie, Université de Montréal
- Paquito Bernard, Ph.D., researcher, Centre de recherche de l'Institut universitaire en santé mentale de Montréal; professor, Département des sciences de l'activité physique, Université du Québec à Montréal
- Valérie Tourjman, M.D., psychiatrist, Institut universitaire en santé mentale de Montréal; clinical associate professor, Département de psychiatrie et d'addictologie, Université de Montréal
- Luigi De Benedictis, M.D., psychiatrist, Institut universitaire en santé mentale de Montréal; clinical assistant professor, Département de psychiatrie et d'addictologie, Université de Montréal
- Pierre Orban, Ph.D., researcher, Centre de recherche de l'Institut universitaire en santé mentale de Montréal; assistant research professor, Département de psychiatrie et d'addictologie, Université de Montréal
124
DeSouza DD, Robin J, Gumus M, Yeung A. Natural Language Processing as an Emerging Tool to Detect Late-Life Depression. Front Psychiatry 2021; 12:719125. [PMID: 34552519 PMCID: PMC8450440 DOI: 10.3389/fpsyt.2021.719125]
Abstract
Late-life depression (LLD) is a major public health concern. Despite the availability of effective treatments for depression, barriers to screening and diagnosis still exist. The use of current standardized depression assessments can lead to underdiagnosis or misdiagnosis due to subjective symptom reporting and the distinct cognitive, psychomotor, and somatic features of LLD. To overcome these limitations, there has been a growing interest in the development of objective measures of depression using artificial intelligence (AI) technologies such as natural language processing (NLP). NLP approaches focus on the analysis of acoustic and linguistic aspects of human language derived from text and speech and can be integrated with machine learning approaches to classify depression and its severity. In this review, we will provide rationale for the use of NLP methods to study depression using speech, summarize previous research using NLP in LLD, compare findings to younger adults with depression and older adults with other clinical conditions, and discuss future directions including the use of complementary AI strategies to fully capture the spectrum of LLD.
Affiliation(s)
- Anthony Yeung, Department of Psychiatry, University of Toronto, Toronto, ON, Canada
125
More than a biomarker: could language be a biosocial marker of psychosis? NPJ Schizophr 2021; 7:42. [PMID: 34465778 PMCID: PMC8408150 DOI: 10.1038/s41537-021-00172-1]
Abstract
Automated extraction of quantitative linguistic features has the potential to predict objectively the onset and progression of psychosis. These linguistic variables are often considered to be biomarkers, with a large emphasis placed on the pathological aberrations in the biological processes that underwrite the faculty of language in psychosis. This perspective offers a reminder that human language is primarily a social device that is biologically implemented. As such, linguistic aberrations in patients with psychosis reflect both social and biological processes affecting an individual. Failure to consider the sociolinguistic aspects of NLP measures will limit their usefulness as digital tools in clinical settings. In the context of psychosis, considering language as a biosocial marker could lead to less biased and more accessible tools for patient-specific predictions in the clinic.
126
Bickman L. Improving Mental Health Services: A 50-Year Journey from Randomized Experiments to Artificial Intelligence and Precision Mental Health. Adm Policy Ment Health 2021; 47:795-843. [PMID: 32715427 PMCID: PMC7382706 DOI: 10.1007/s10488-020-01065-8]
Abstract
This conceptual paper describes the current state of mental health services, identifies critical problems, and suggests how to solve them. I focus on the potential contributions of artificial intelligence and precision mental health to improving mental health services. Toward that end, I draw upon my own research, which has changed over the last half century, to highlight the need to transform the way we conduct mental health services research. I identify exemplars from the emerging literature on artificial intelligence and precision approaches to treatment in which there is an attempt to personalize or fit the treatment to the client in order to produce more effective interventions.
Affiliation(s)
- Leonard Bickman, Center for Children and Families and Department of Psychology, Academic Health Center 1, Florida International University, 11200 Southwest 8th Street, Room 140, Miami, FL 33199, USA
127
Fu J, Yang S, He F, He L, Li Y, Zhang J, Xiong X. Sch-net: a deep learning architecture for automatic detection of schizophrenia. Biomed Eng Online 2021; 20:75. [PMID: 34344372 PMCID: PMC8336375 DOI: 10.1186/s12938-021-00915-2]
Abstract
BACKGROUND Schizophrenia is a chronic and severe mental disease that greatly affects patients' daily life and work. Clinically, schizophrenia with negative symptoms is often misdiagnosed, and diagnosis depends on the experience of clinicians. An objective and effective method to diagnose schizophrenia with negative symptoms is urgently needed. Recent studies have shown that impaired speech can serve as an indicator for diagnosing schizophrenia. The literature on schizophrenic speech detection has mainly relied on feature engineering, in which effective feature extraction is difficult because of the variability of speech signals. METHODS This work designs a novel Sch-net neural network based on a convolutional neural network; it is the first work on end-to-end schizophrenic speech detection using deep learning techniques. Sch-net adds two components, skip connections and a convolutional block attention module (CBAM), to the convolutional backbone architecture. The skip connections enrich the information used for classification by merging low- and high-level features. The CBAM highlights effective features by assigning learnable weights. The proposed Sch-net combines the advantages of the two components and avoids manual feature extraction and selection. RESULTS We validate Sch-net through ablation experiments on a schizophrenic speech data set containing 28 patients with schizophrenia and 28 healthy controls. Comparisons with models based on feature engineering and on deep neural networks are also conducted. The experimental results show that Sch-net performs well on the schizophrenic speech detection task, achieving 97.68% accuracy on the schizophrenic speech data set. To further verify the generalization of the model, Sch-net was tested on the open-access LANNA children's speech database for specific language impairment (SLI) detection, where it achieves 99.52% accuracy in classifying patients with SLI and healthy controls. Our code will be available at https://github.com/Scu-sen/Sch-net. CONCLUSIONS Extensive experiments show that the proposed Sch-net can provide supporting information for the diagnosis of schizophrenia and specific language impairment.
Affiliation(s)
- Jia Fu, Sen Yang, Fei He, Ling He, Jing Zhang, College of Biomedical Engineering, Sichuan University, Chengdu, China
- Yuanyuan Li, Mental Health Center, West China Hospital of Sichuan University, Chengdu, China
- Xi Xiong, School of Cybersecurity, Chengdu University of Information Technology, Chengdu, China
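The CBAM channel-attention step described in the abstract above (average- and max-pooled channel descriptors passed through a shared two-layer MLP, summed, and squashed into per-channel weights) can be sketched in a few lines of NumPy. This is an illustrative stand-in, not the authors' implementation: the MLP weights here are random toy values, and `channel_attention`, `reduction`, and the tensor shapes are assumptions made for the sketch.

```python
import numpy as np

def channel_attention(feature_map: np.ndarray, reduction: int = 2) -> np.ndarray:
    """Channel attention in the spirit of CBAM: average- and max-pooled
    channel descriptors share a two-layer MLP; the summed outputs are
    squashed to per-channel weights in (0, 1) that rescale the input."""
    c, h, w = feature_map.shape
    avg_desc = feature_map.mean(axis=(1, 2))        # (c,) global average pool
    max_desc = feature_map.max(axis=(1, 2))         # (c,) global max pool
    rng = np.random.default_rng(0)                  # toy shared-MLP weights
    w1 = rng.standard_normal((c, c // reduction))
    w2 = rng.standard_normal((c // reduction, c))
    def mlp(x):
        return np.maximum(x @ w1, 0.0) @ w2         # ReLU hidden layer
    weights = 1.0 / (1.0 + np.exp(-(mlp(avg_desc) + mlp(max_desc))))  # sigmoid
    return feature_map * weights[:, None, None]     # reweight each channel

fmap = np.random.default_rng(1).standard_normal((4, 8, 8))
out = channel_attention(fmap)
print(out.shape)  # (4, 8, 8)
```

Because the weights lie in (0, 1), the module can only attenuate channels; in the trained network the MLP learns which channels to keep near 1.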
128
Weiner L, Guidi A, Doignon-Camus N, Giersch A, Bertschy G, Vanello N. Vocal features obtained through automated methods in verbal fluency tasks can aid the identification of mixed episodes in bipolar disorder. Transl Psychiatry 2021; 11:415. [PMID: 34341338 PMCID: PMC8329226 DOI: 10.1038/s41398-021-01535-z]
Abstract
There is a lack of consensus on the diagnostic thresholds that could improve the detection accuracy of bipolar mixed episodes in clinical settings. Some studies have shown that voice features could be reliable biomarkers of manic and depressive episodes compared to euthymic states, but none thus far have investigated whether they could aid the distinction between mixed and non-mixed acute bipolar episodes. Here we investigated whether vocal features acquired via verbal fluency tasks could accurately classify mixed states in bipolar disorder using machine learning methods. Fifty-six patients with bipolar disorder were recruited during an acute episode (19 hypomanic, 8 with mixed hypomania, 17 with mixed depression, 12 with depression). Nine different trials belonging to four verbal fluency conditions (letter, semantic, free word generation, and associational fluency) were administered. Spectral and prosodic features in three conditions were selected for the classification algorithm. Using the leave-one-subject-out (LOSO) strategy to train the classifier, we calculated the accuracy rate, the F1 score, and the Matthews correlation coefficient (MCC). For depression versus mixed depression, the accuracy and F1 scores were high, respectively 0.83 and 0.86, and the MCC was 0.64. For hypomania versus mixed hypomania, accuracy and F1 scores were also high, 0.86 and 0.75 respectively, and the MCC was 0.57. Given the high rates of correctly classified subjects, vocal features quickly acquired via verbal fluency tasks seem to be reliable biomarkers that could easily be implemented in clinical settings to improve diagnostic accuracy.
Affiliation(s)
- Luisa Weiner, INSERM 1114, Strasbourg, France; University Hospital of Strasbourg, Strasbourg, France; Laboratoire de Psychologie des Cognitions, Université de Strasbourg, Strasbourg, France
- Andrea Guidi, Nicola Vanello, Dipartimento di Ingegneria dell'Informazione and Research Center "E. Piaggio", University of Pisa, Pisa, Italy
- Anne Giersch, INSERM 1114, Strasbourg, France
- Gilles Bertschy, INSERM 1114, Strasbourg, France; University Hospital of Strasbourg, Strasbourg, France; Fédération de Médecine Translationnelle de Strasbourg, Université de Strasbourg, Strasbourg, France
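The LOSO strategy reported above keeps every trial of the held-out subject in the test fold, so no speaker contributes to both training and test data. A minimal NumPy sketch under stated assumptions: synthetic features, a nearest-centroid rule standing in for the paper's actual classifier, and hypothetical helper names (`loso_evaluate`, `mcc`).

```python
import numpy as np

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = np.sqrt(float((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

def loso_evaluate(X, y, subjects):
    """Leave-one-subject-out: all trials of the held-out subject form the
    test fold; a nearest-centroid rule stands in for the real classifier."""
    preds = np.empty_like(y)
    for s in np.unique(subjects):
        test = subjects == s
        X_tr, y_tr = X[~test], y[~test]
        c0 = X_tr[y_tr == 0].mean(axis=0)        # class-0 centroid
        c1 = X_tr[y_tr == 1].mean(axis=0)        # class-1 centroid
        d0 = np.linalg.norm(X[test] - c0, axis=1)
        d1 = np.linalg.norm(X[test] - c1, axis=1)
        preds[test] = (d1 < d0).astype(int)
    tp = int(np.sum((preds == 1) & (y == 1)))
    tn = int(np.sum((preds == 0) & (y == 0)))
    fp = int(np.sum((preds == 1) & (y == 0)))
    fn = int(np.sum((preds == 0) & (y == 1)))
    return (tp + tn) / len(y), mcc(tp, tn, fp, fn)

rng = np.random.default_rng(0)
subjects = np.repeat(np.arange(20), 3)           # 20 subjects, 3 trials each
y = np.repeat(rng.integers(0, 2, 20), 3)         # one label per subject
X = rng.standard_normal((60, 5)) + y[:, None]    # class-separated features
acc, m = loso_evaluate(X, y, subjects)
print(round(acc, 2), round(m, 2))
```

Grouping by subject rather than by trial is the point: a per-trial split would leak speaker identity into the training set and inflate accuracy.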
129
Lee S, Suh SW, Kim T, Kim K, Lee KH, Lee JR, Han G, Hong JW, Han JW, Lee K, Kim KW. Screening major depressive disorder using vocal acoustic features in the elderly by sex. J Affect Disord 2021; 291:15-23. [PMID: 34022551 DOI: 10.1016/j.jad.2021.04.098]
Abstract
BACKGROUND Vocal acoustic features are potential biomarkers of elderly depression. Previous automated diagnostic tests for depression have employed unstandardized speech samples, and few studies have considered differences in voice reactivity. We aimed to develop a voice-based screening test for depression that measures the vocal acoustic features of elderly Koreans while they read a series of mood-inducing sentences (MIS). METHODS In this case-control study, we recruited 61 individuals with major depressive disorder and 143 healthy controls (mean age [SD]: 72 [6]; female, 70%) from the community-dwelling elderly population. Participants were asked to read the MIS; the variation patterns of their acoustic features, represented by the correlation distance between pairs of MIS, were analyzed as input features using univariate feature selection and subsequently classified by AdaBoost. RESULTS Acoustic features showing significant discriminatory performance were spectral and energy-related features for males (sensitivity 0.95, specificity 0.88, accuracy 0.86) and prosody-related features for females (sensitivity 0.73, specificity 0.86, accuracy 0.77). The correlation distance between negative and positive MIS was significantly shorter in the depressed group than in the healthy controls (F = 18.574, P < 0.001). LIMITATIONS The small sample size and relatively homogeneous clinical profile of depression could limit generalizability. CONCLUSIONS While reading MIS, spectral and energy-related acoustic features for males and prosody-related features for females are good discriminators of major depressive disorder. These features may be used as biomarkers of depression in the elderly.
Affiliation(s)
- Subin Lee, Music and Audio Research Group, Seoul National University, Seoul, Korea
- Seung Wan Suh, Department of Psychiatry, Kangdong Sacred Heart Hospital, Hallym University College of Medicine, Seoul, Korea
- Taehyun Kim, Kayoung Kim, Kyoung Hwan Lee, Ju Ri Lee, Guehee Han, Jong Woo Hong, Ji Won Han, Department of Neuropsychiatry, Seoul National University Bundang Hospital, Seongnam, Korea
- Kyogu Lee, Music and Audio Research Group and Graduate School of Convergence Science and Technology, Seoul National University, Seoul, Korea
- Ki Woong Kim, Department of Neuropsychiatry, Seoul National University Bundang Hospital, Seongnam, Korea; Department of Psychiatry, Seoul National University College of Medicine, Seoul, Korea; Department of Brain and Cognitive Sciences, Seoul National University College of Natural Sciences, Seoul, Korea
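The correlation distance between two mood-induced readings, used above to quantify blunted voice reactivity, is simply 1 minus the Pearson correlation of the two feature vectors. A minimal NumPy sketch; the example vectors are invented for illustration, not taken from the study.

```python
import numpy as np

def correlation_distance(u: np.ndarray, v: np.ndarray) -> float:
    """Correlation distance 1 - Pearson r between two acoustic feature
    vectors: 0 means identical variation patterns, 2 means opposite ones."""
    u_c = u - u.mean()
    v_c = v - v.mean()
    r = (u_c @ v_c) / (np.linalg.norm(u_c) * np.linalg.norm(v_c))
    return 1.0 - r

pos = np.array([1.0, 2.0, 3.0, 4.0])           # features from a positive MIS
neg_similar = np.array([2.0, 4.0, 6.0, 8.0])   # same pattern, different scale
neg_opposite = np.array([4.0, 3.0, 2.0, 1.0])  # reversed pattern
print(round(correlation_distance(pos, neg_similar), 6))   # 0.0
print(round(correlation_distance(pos, neg_opposite), 6))  # 2.0
```

A short distance between negative and positive sentences means the voice hardly changes with induced mood, which is the blunted reactivity the study found in the depressed group.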
130
Chen X, Pan Z. A Convenient and Low-Cost Model of Depression Screening and Early Warning Based on Voice Data Using for Public Mental Health. Int J Environ Res Public Health 2021; 18:6441. [PMID: 34198659 PMCID: PMC8296267 DOI: 10.3390/ijerph18126441]
Abstract
Depression is a common mental health disorder that imposes a heavy burden on public health. At present, the diagnosis of depression depends mainly on interviews between doctors and patients, which are subjective, slow, and expensive. Voice data are easy to obtain and have the advantage of low cost, and they have been shown to be usable in the diagnosis of depression. The voice data used for modeling in this study came from an authoritative public data set that had passed ethical review. Voice features were extracted with Python and stored as CSV files. Through data processing, a database containing 1,479 voice feature samples was generated for modeling. A decision-tree screening model for depression was then established using 10-fold cross-validation and algorithm selection, achieving 83.4% prediction accuracy on the voice data set. Based on the model's predictions, patients can be given early warning and timely intervention, enabling personal health management of depression.
Affiliation(s)
- Xin Chen, Zhigeng Pan, School of Medicine; Engineering Research Center of Mobile Health Management System, Ministry of Education; and Institute of VR and Intelligent System, Hangzhou Normal University, Hangzhou 311121, China
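The 10-fold cross-validation used above to estimate the screening model's accuracy can be sketched as follows. This assumes NumPy, substitutes a one-feature decision stump for the paper's decision tree, and runs on synthetic data, so the printed accuracy is illustrative only; `ten_fold_indices` and `cross_validate_stump` are invented helper names.

```python
import numpy as np

def ten_fold_indices(n: int, seed: int = 0):
    """Shuffle the sample indices and cut them into 10 near-equal folds."""
    idx = np.random.default_rng(seed).permutation(n)
    return np.array_split(idx, 10)

def cross_validate_stump(X, y):
    """10-fold CV of a one-feature decision stump (a depth-1 stand-in for
    the decision tree): threshold = midpoint of the two class means."""
    accs = []
    for fold in ten_fold_indices(len(y)):
        mask = np.zeros(len(y), dtype=bool)
        mask[fold] = True                     # current test fold
        X_tr, y_tr = X[~mask], y[~mask]       # remaining 9 folds train
        thr = (X_tr[y_tr == 0, 0].mean() + X_tr[y_tr == 1, 0].mean()) / 2
        preds = (X[mask, 0] > thr).astype(int)
        accs.append(float((preds == y[mask]).mean()))
    return float(np.mean(accs))               # mean held-out accuracy

rng = np.random.default_rng(1)
y = rng.integers(0, 2, 200)
X = rng.standard_normal((200, 3)) + 1.5 * y[:, None]  # separable feature
mean_acc = cross_validate_stump(X, y)
print(round(mean_acc, 2))
```

Averaging accuracy over all 10 held-out folds gives a less optimistic estimate than a single train/test split, which is why the paper reports its 83.4% figure this way.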
131
Abstract
Telepsychiatry refers to the use of technology to support the remote provision of psychiatric services. Discussions of this technology have often focussed on the use of video conferencing in place of in-person visits and how such care is found to be non-inferior to traditional care. New developments in the fields of remote-sensing and digital phenotyping have the potential to overcome the limitations inherent in remote visits as well as the limitations of current outpatient care models more generally. Such technologies may enable the collection of more relevant, objective clinical data which could lead to improved care quality and transformed care delivery models. The development and implementation of these new technologies raise important ethical questions.
Affiliation(s)
- John Zulueta, Olusola A Ajilore, Department of Psychiatry, University of Illinois at Chicago, Chicago, IL, USA
132
Di Y, Wang J, Li W, Zhu T. Using i-vectors from voice features to identify major depressive disorder. J Affect Disord 2021; 288:161-166. [PMID: 33895418 DOI: 10.1016/j.jad.2021.04.004]
Abstract
BACKGROUND Machine-learning methods using acoustic features for the diagnosis of major depressive disorder (MDD) lack evidence from large-scale samples and clinical trials. This study aimed to evaluate the effectiveness of the promising i-vector method on a large sample of women with clinically diagnosed recurrent MDD, examine its robustness, and provide an explicit acoustic explanation of the i-vectors. METHODS We collected utterances edited from clinical interview speech records of 785 depressed and 1,023 healthy individuals. We then extracted Mel-frequency cepstral coefficient (MFCC) features and MFCC i-vectors from the utterances. To examine the effectiveness of the i-vectors, we compared the performance of binary logistic regression between MFCC i-vectors and raw MFCC features and tested its robustness across different utterance durations. We also computed the correlation between MFCC features and MFCC i-vectors to analyze the acoustic meaning of the i-vectors. RESULTS The i-vectors improved the area under the curve (AUC) by 7% and 14% over MFCC features on different utterance sets. When the duration exceeds 40 s, the classification results stabilize. The i-vectors are consistently correlated, positively or negatively, with the maximum, minimum, and deviations of the MFCC features. LIMITATIONS This study included only women. CONCLUSIONS The i-vectors can improve the AUC by 14% on a large-scale clinical sample, and the system is robust for utterance durations over 40 s. This study provides a foundation for exploring the clinical application of voice features in the diagnosis of MDD.
Affiliation(s)
- Yazheng Di, Tingshao Zhu, CAS Key Laboratory of Behavioral Science, Institute of Psychology, Beijing 100101, China; Department of Psychology, University of Chinese Academy of Sciences, Beijing 100049, China
- Jingying Wang, School of Optometry, The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong
- Weidong Li, Shanghai Jiao Tong University, Shanghai 200240, China
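The AUC figures reported above can be computed without tracing an ROC curve, via the Mann-Whitney identity: AUC is the probability that a randomly chosen positive case scores higher than a randomly chosen negative one, with ties counted as one half. A small NumPy sketch with invented classifier scores:

```python
import numpy as np

def roc_auc(scores_pos: np.ndarray, scores_neg: np.ndarray) -> float:
    """AUC as the Mann-Whitney probability that a depressed speaker's score
    exceeds a healthy speaker's; tied pairs count 1/2."""
    wins = (scores_pos[:, None] > scores_neg[None, :]).sum()
    ties = (scores_pos[:, None] == scores_neg[None, :]).sum()
    return (wins + 0.5 * ties) / (len(scores_pos) * len(scores_neg))

pos = np.array([0.9, 0.8, 0.6, 0.4])  # classifier scores, depressed group
neg = np.array([0.7, 0.5, 0.3, 0.2])  # classifier scores, healthy group
print(roc_auc(pos, neg))  # 0.8125
```

On this identity, "the i-vectors improved the AUC by 14%" means the pairwise ranking of depressed over healthy speakers became correct that much more often, independent of any decision threshold.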
133
Cavus N, Lawan AA, Ibrahim Z, Dahiru A, Tahir S, Abdulrazak UI, Hussaini A. A Systematic Literature Review on the Application of Machine-Learning Models in Behavioral Assessment of Autism Spectrum Disorder. J Pers Med 2021; 11:299. [PMID: 33919878 PMCID: PMC8070763 DOI: 10.3390/jpm11040299]
Abstract
Autism spectrum disorder (ASD) is associated with significant social, communication, and behavioral challenges. The insufficient number of trained clinicians, coupled with limited access to quick and accurate diagnostic tools, has led to early symptoms of ASD being overlooked in children around the world. Several studies have utilized behavioral data to develop and evaluate machine learning (ML) models for quick and intelligent ASD assessment systems. However, despite the good evaluation metrics achieved by these models, there is not enough evidence of their readiness for clinical use; in particular, none of the existing studies reported a real-life application of an ML-based model. This may be related to numerous challenges associated with the data-centric techniques utilized and their misalignment with the conceptual basis upon which professionals diagnose ASD. The present work systematically reviewed recent articles on the application of ML in the behavioral assessment of ASD, highlighted common challenges across the studies, and proposed vital considerations for the real-life implementation of ML-based ASD screening and diagnostic systems. This review will serve as a guide for researchers, neuropsychiatrists, psychologists, and relevant stakeholders on advances in ASD screening and diagnosis using ML.
Affiliation(s)
- Nadire Cavus, Department of Computer Information Systems and Computer Information Systems Research and Technology Centre, Near East University, Nicosia 99138, Cyprus
- Abdulmalik A. Lawan, Department of Computer Information Systems, Near East University, Nicosia 99138, Cyprus; Department of Computer Science, Kano University of Science and Technology, Wudil 713281, Nigeria
- Zurki Ibrahim, Department of Medical Genetics, Near East University, Nicosia 99138, Cyprus
- Abdullahi Dahiru, College of Nursing and Midwifery, School of Nursing, Kano 700233, Nigeria
- Sadiya Tahir, Department of Pediatrics, Murtala Muhammad Specialist Hospital, Kano 700251, Nigeria
- Usama Ishaq Abdulrazak, Department of Emergency Medicine, Peterborough City Hospital, North West Anglia NHS Foundation Trust, Peterborough PE3 9GZ, UK
- Adamu Hussaini, Department of Computer Science, Kano University of Science and Technology, Wudil 713281, Nigeria; Crestic Laboratory, Université de Reims, 51100 Reims, France
134
Albuquerque L, Valente ARS, Teixeira A, Figueiredo D, Sa-Couto P, Oliveira C. Association between acoustic speech features and non-severe levels of anxiety and depression symptoms across lifespan. PLoS One 2021; 16:e0248842. [PMID: 33831018 PMCID: PMC8031302 DOI: 10.1371/journal.pone.0248842]
Abstract
BACKGROUND Several studies have investigated the acoustic effects of diagnosed anxiety and depression. Anxiety and depression are not characteristics of typical aging, but minimal or mild symptoms can appear and evolve with age. However, knowledge about the association between speech and anxiety or depression is scarce for the minimal/mild symptoms typical of healthy aging. As increased longevity and population aging pose several clinical challenges, it is important to improve our understanding of the impact of non-severe mood symptoms on acoustic features across the lifespan. The purpose of this study was to determine whether variations in acoustic measures of voice are associated with non-severe anxiety or depression symptoms in an adult population across the lifespan. METHODS Two different speech tasks (reading vowels in disyllabic words and describing a picture) were produced by 112 individuals aged 35-97. The Hospital Anxiety and Depression Scale (HADS) was used to assess anxiety and depression symptoms. The associations between segmental and suprasegmental acoustic parameters and HADS scores were analyzed using multiple linear regression. RESULTS The number of participants with anxiety or depression symptoms was low (score >7: 26.8% and 10.7%, respectively) and symptoms were non-severe (HADS-A: 5.4 ± 2.9 and HADS-D: 4.2 ± 2.7, respectively). Higher anxiety symptoms showed no significant relationships with the acoustic parameters studied. Adults with more depressive symptoms presented longer vowel durations, longer total pause duration, and shorter total speech duration. Finally, age had a positive and significant effect only for depressive symptoms: older participants tended to have more depressive symptoms. CONCLUSIONS Non-severe depression symptoms can be related to some acoustic parameters and to age. Depression symptoms can be explained by acoustic parameters even among individuals without severe symptom levels.
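The core analysis this abstract describes, regressing a symptom score on an acoustic parameter, can be sketched in a few lines. This is a minimal illustration with invented numbers and hypothetical variable names, not the study's data or model (which used multiple regression over several segmental and suprasegmental parameters).

```python
# Minimal sketch: simple least-squares regression of a depression score
# (HADS-D-like) on one acoustic feature (total pause duration).
# All data below are synthetic; names are illustrative only.

def simple_ols(x, y):
    """Return (slope, intercept) of the least-squares line y = a*x + b."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept

# Synthetic example: longer total pause duration, higher depression score.
pause_duration_s = [0.5, 1.0, 1.5, 2.0, 2.5]   # total pause duration (s)
depression_score = [2.0, 3.0, 4.0, 5.0, 6.0]   # symptom subscale score

slope, intercept = simple_ols(pause_duration_s, depression_score)
print(slope, intercept)  # -> 2.0 1.0
```

A positive slope here would correspond to the reported pattern of longer pauses accompanying higher depressive-symptom scores.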
Affiliation(s)
- Luciana Albuquerque
  - Institute of Electronics and Informatics Engineering of Aveiro, University of Aveiro, Aveiro, Portugal
  - Center of Health Technology and Services Research, University of Aveiro, Aveiro, Portugal
  - Department of Electronics Telecommunications and Informatics, University of Aveiro, Aveiro, Portugal
  - Department of Education and Psychology, University of Aveiro, Aveiro, Portugal
- Ana Rita S. Valente
  - Institute of Electronics and Informatics Engineering of Aveiro, University of Aveiro, Aveiro, Portugal
  - Department of Electronics Telecommunications and Informatics, University of Aveiro, Aveiro, Portugal
- António Teixeira
  - Institute of Electronics and Informatics Engineering of Aveiro, University of Aveiro, Aveiro, Portugal
  - Department of Electronics Telecommunications and Informatics, University of Aveiro, Aveiro, Portugal
- Daniela Figueiredo
  - Center of Health Technology and Services Research, University of Aveiro, Aveiro, Portugal
  - School of Health Science, University of Aveiro, Aveiro, Portugal
- Pedro Sa-Couto
  - Center for Research and Development in Mathematics and Applications, University of Aveiro, Aveiro, Portugal
  - Department of Mathematics, University of Aveiro, Aveiro, Portugal
- Catarina Oliveira
  - Institute of Electronics and Informatics Engineering of Aveiro, University of Aveiro, Aveiro, Portugal
  - School of Health Science, University of Aveiro, Aveiro, Portugal
135
Patel N, Patel S, Mankad SH. Impact of autoencoder based compact representation on emotion detection from audio. JOURNAL OF AMBIENT INTELLIGENCE AND HUMANIZED COMPUTING 2021; 13:867-885. [PMID: 33686349 PMCID: PMC7927770 DOI: 10.1007/s12652-021-02979-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 06/17/2020] [Accepted: 02/15/2021] [Indexed: 06/12/2023]
Abstract
Emotion recognition from speech has its fair share of applications, and consequently extensive research has been done in this field over the past few years. However, many of the existing solutions are not yet ready for real-time applications. In this work, we propose a compact representation of audio using conventional autoencoders for dimensionality reduction, and test the approach on two publicly available benchmark datasets. Such compact and simple classification systems, where the computing cost is low and memory is managed efficiently, may be more useful for real-time applications. The system is evaluated on the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS) and the Toronto Emotional Speech Set (TESS). Three classifiers, namely support vector machines (SVM), a decision tree classifier, and convolutional neural networks (CNN), were implemented to judge the impact of the approach. The results obtained by attempting classification with AlexNet and ResNet50 are also reported. The observations show that introducing autoencoders can indeed improve the classification accuracy for the emotion in the input audio files. It can be concluded that, in emotion recognition from speech, the choice and application of dimensionality reduction of audio features affects the results achieved; therefore, working on this aspect of the general speech emotion recognition model may enable substantial improvements in the future.
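The autoencoder idea in this abstract, compressing features to a low-dimensional code by minimising reconstruction error, can be illustrated at toy scale. This is a tiny tied-weight *linear* autoencoder on 2-D points, trained by plain gradient descent; the paper's models are larger and operate on audio features, so treat this purely as a sketch of the principle.

```python
# Illustrative sketch (not the paper's model): a tiny tied-weight linear
# autoencoder compressing 2-D feature vectors to a 1-D code by gradient
# descent on reconstruction error. A real system would compress
# high-dimensional audio features the same way before classification.

def reconstruction_loss(w, data):
    loss = 0.0
    for x1, x2 in data:
        z = w[0] * x1 + w[1] * x2          # encode: 2-D -> 1-D code
        r1, r2 = w[0] * z, w[1] * z        # decode: 1-D -> 2-D
        loss += (x1 - r1) ** 2 + (x2 - r2) ** 2
    return loss / len(data)

def train(data, steps=200, lr=0.05):
    w = [0.3, 0.1]                         # small initial weights
    for _ in range(steps):
        g0 = g1 = 0.0
        for x1, x2 in data:
            z = w[0] * x1 + w[1] * x2
            e1, e2 = w[0] * z - x1, w[1] * z - x2   # reconstruction errors
            # gradient of ||x - w*z||^2 with z = w.x and tied weights
            g0 += 2 * (e1 * z + (e1 * w[0] + e2 * w[1]) * x1)
            g1 += 2 * (e2 * z + (e1 * w[0] + e2 * w[1]) * x2)
        w[0] -= lr * g0 / len(data)
        w[1] -= lr * g1 / len(data)
    return w

data = [(1.0, 1.1), (2.0, 1.9), (-1.0, -1.0), (0.5, 0.4)]
before = reconstruction_loss([0.3, 0.1], data)
w = train(data)
after = reconstruction_loss(w, data)
print(before > after)   # training reduces reconstruction error
```

The learned 1-D codes (`z` values) are the "compact representation" that a downstream classifier such as an SVM would consume.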
Affiliation(s)
- Nivedita Patel
  - CSE Department, Institute of Technology, Nirma University, Ahmedabad, India
- Shireen Patel
  - CSE Department, Institute of Technology, Nirma University, Ahmedabad, India
- Sapan H. Mankad
  - CSE Department, Institute of Technology, Nirma University, Ahmedabad, India
136
Belouali A, Gupta S, Sourirajan V, Yu J, Allen N, Alaoui A, Dutton MA, Reinhard MJ. Acoustic and language analysis of speech for suicidal ideation among US veterans. BioData Min 2021; 14:11. [PMID: 33531048 PMCID: PMC7856815 DOI: 10.1186/s13040-021-00245-y] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/18/2020] [Accepted: 01/20/2021] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Screening for suicidal ideation in high-risk groups such as U.S. veterans is crucial for early detection and suicide prevention. Currently, screening is based on clinical interviews or self-report measures. Both approaches rely on subjects to disclose their suicidal thoughts. Innovative approaches are necessary to develop objective and clinically applicable assessments. Speech has been investigated as an objective marker to understand various mental states, including suicidal ideation. In this work, we developed a machine learning and natural language processing classifier based on speech markers to screen for suicidal ideation in U.S. veterans. METHODOLOGY Veterans submitted 588 narrative audio recordings via a mobile app in a real-life setting. In addition, participants completed self-report psychiatric scales and questionnaires. Recordings were analyzed to extract voice characteristics, including prosodic, phonation, and glottal features. The recordings were also transcribed to extract textual features for linguistic analysis. We evaluated the acoustic and linguistic features using both statistical significance and ensemble feature selection. We also examined the performance of different machine learning algorithms on multiple combinations of features to classify suicidal and non-suicidal recordings. RESULTS A combined set of 15 acoustic and linguistic speech features was identified by the ensemble feature selection. A Random Forest classifier, using the selected set of features, correctly identified suicidal ideation in veterans with 86% sensitivity, 70% specificity, and an area under the receiver operating characteristic curve (AUC) of 80%. CONCLUSIONS Speech analysis of recordings collected from veterans in everyday life settings using smartphones offers a promising approach for detecting suicidal ideation. A machine learning classifier may eventually help clinicians identify and monitor high-risk veterans.
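The three metrics this abstract reports (sensitivity, specificity, AUC) can be computed from scratch. The sketch below uses toy labels and scores, not the study's data; the AUC is computed via the rank-based (Mann-Whitney) formulation, which is equivalent to the area under the ROC curve.

```python
# Sketch: sensitivity, specificity, and AUC on toy labels/scores
# (invented numbers, not the study's veterans data).

def sensitivity_specificity(labels, preds):
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 0)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

def auc(labels, scores):
    """AUC = P(score of a random positive > score of a random negative)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]        # classifier scores
preds = [1 if s >= 0.5 else 0 for s in scores]  # hard decisions at 0.5

sens, spec = sensitivity_specificity(labels, preds)
print(sens, spec, auc(labels, scores))
```

Note that sensitivity/specificity depend on the chosen decision threshold, while the AUC summarises ranking quality across all thresholds, which is why studies often report both.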
Affiliation(s)
- Anas Belouali
  - Innovation Center for Biomedical Informatics, Georgetown University Medical Center, Washington, DC, USA
- Samir Gupta
  - Innovation Center for Biomedical Informatics, Georgetown University Medical Center, Washington, DC, USA
- Vaibhav Sourirajan
  - Innovation Center for Biomedical Informatics, Georgetown University Medical Center, Washington, DC, USA
- Jiawei Yu
  - Innovation Center for Biomedical Informatics, Georgetown University Medical Center, Washington, DC, USA
- Nathaniel Allen
  - War Related Illness and Injury Study Center, Veterans Affairs Medical Center, Washington, DC, USA
- Adil Alaoui
  - Innovation Center for Biomedical Informatics, Georgetown University Medical Center, Washington, DC, USA
- Mary Ann Dutton
  - Department of Psychiatry, Georgetown University Medical Center, Washington, DC, USA
- Matthew J Reinhard
  - War Related Illness and Injury Study Center, Veterans Affairs Medical Center, Washington, DC, USA
  - Department of Psychiatry, Georgetown University Medical Center, Washington, DC, USA
137
Chen F, Cao Z, Grais EM, Zhao F. Contributions and limitations of using machine learning to predict noise-induced hearing loss. Int Arch Occup Environ Health 2021; 94:1097-1111. [PMID: 33491101 PMCID: PMC8238747 DOI: 10.1007/s00420-020-01648-w] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/08/2020] [Accepted: 12/29/2020] [Indexed: 12/20/2022]
Abstract
Purpose Noise-induced hearing loss (NIHL) is a global issue that impacts people's life and health. The current review aims to clarify the contributions and limitations of applying machine learning (ML) to predict NIHL by analyzing the performance of different ML techniques and the procedure of model construction. Methods The authors searched PubMed, EMBASE and Scopus on November 26, 2020. Results Eight studies were included in the current review following defined inclusion and exclusion criteria. Sample sizes in the selected studies ranged from 150 to 10,567. The most popular models were artificial neural networks (n = 4), random forests (n = 3) and support vector machines (n = 3). The features most correlated with NIHL and used in the models were: age (n = 6), duration of noise exposure (n = 5) and noise exposure level (n = 4). Five included studies used either split-sample validation (n = 3) or ten-fold cross-validation (n = 2). Reported accuracy ranged from 75.3% to 99%, with a low prediction error/root-mean-square error in 3 studies. Only 2 studies measured discrimination risk using the receiver operating characteristic (ROC) curve and/or the area under the ROC curve. Conclusion In spite of the high accuracy and low prediction error of machine learning models, some improvement can be expected from larger sample sizes, use of multiple algorithms, complete reporting of model construction and sufficient evaluation of calibration and discrimination risk.
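The two validation schemes the review tallies, split-sample hold-out and ten-fold cross-validation, differ only in how subject indices are partitioned. A minimal pure-Python sketch of the index generation (illustrative, not any included study's pipeline):

```python
# Sketch: split-sample hold-out vs. k-fold cross-validation index splits.
# Operates on index lists only; n = 150 mirrors the smallest sample size
# mentioned in the review, purely for illustration.

def split_sample(n, train_frac=0.7):
    """Single hold-out split: first train_frac of indices train, rest test."""
    cut = int(n * train_frac)
    idx = list(range(n))
    return idx[:cut], idx[cut:]

def k_fold(n, k=10):
    """k folds: every k-th index forms a test fold; the rest train."""
    folds = []
    idx = list(range(n))
    for i in range(k):
        test = idx[i::k]
        test_set = set(test)
        train = [j for j in idx if j not in test_set]
        folds.append((train, test))
    return folds

train, test = split_sample(150)
print(len(train), len(test))                   # -> 105 45

folds = k_fold(150, k=10)
print(all(len(t) == 15 for _, t in folds))     # equal-sized test folds
```

Cross-validation uses every subject for testing exactly once, which is why it gives a less optimistic performance estimate than a single split on small samples; real pipelines would also shuffle indices before splitting.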
Affiliation(s)
- Feifan Chen
  - Centre for Speech and Language Therapy and Hearing Science, Cardiff School of Sport and Health Sciences, Cardiff Metropolitan University, Cardiff, UK
- Zuwei Cao
  - Center for Rehabilitative Auditory Research, Guizhou Provincial People's Hospital, Guiyang, China
- Emad M Grais
  - Centre for Speech and Language Therapy and Hearing Science, Cardiff School of Sport and Health Sciences, Cardiff Metropolitan University, Cardiff, UK
- Fei Zhao
  - Centre for Speech and Language Therapy and Hearing Science, Cardiff School of Sport and Health Sciences, Cardiff Metropolitan University, Cardiff, UK
  - Department of Hearing and Speech Science, Xinhua College, Sun Yat-Sen University, Guangzhou, China
138
D'Alfonso S, Lederman R, Bucci S, Berry K. The Digital Therapeutic Alliance and Human-Computer Interaction. JMIR Ment Health 2020; 7:e21895. [PMID: 33372897 PMCID: PMC7803473 DOI: 10.2196/21895] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 06/28/2020] [Revised: 08/16/2020] [Accepted: 10/29/2020] [Indexed: 01/09/2023] Open
Abstract
The therapeutic alliance (TA), the relationship that develops between a therapist and a client/patient, is a critical factor in the outcome of psychological therapy. As mental health care is increasingly adopting digital technologies and offering therapeutic interventions that may not involve human therapists, the notion of a TA in digital mental health care requires exploration. To date, there has been some incipient work on developing measures to assess the conceptualization of a digital TA for mental health apps. However, the few measures that have been proposed have more or less been derivatives of measures from psychology used to assess the TA in traditional face-to-face therapy. This conceptual paper explores one such instrument that has been proposed in the literature, the Mobile Agnew Relationship Measure, and examines it through a human-computer interaction (HCI) lens. Through this process, we show how theories from HCI can play a role in shaping or generating a more suitable, purpose-built measure of the digital therapeutic alliance (DTA), and we contribute suggestions on how HCI methods and knowledge can be used to foster the DTA in mental health apps.
Affiliation(s)
- Simon D'Alfonso
  - School of Computing and Information Systems, University of Melbourne, Parkville, Australia
- Reeva Lederman
  - School of Computing and Information Systems, University of Melbourne, Parkville, Australia
- Sandra Bucci
  - Division of Psychology and Mental Health, School of Health Sciences, University of Manchester, Manchester, United Kingdom
- Katherine Berry
  - Division of Psychology and Mental Health, School of Health Sciences, University of Manchester, Manchester, United Kingdom
139
Kunin A, Sargheini N, Birkenbihl C, Moiseeva N, Fröhlich H, Golubnitschaja O. Voice perturbations under the stress overload in young individuals: phenotyping and suboptimal health as predictors for cascading pathologies. EPMA J 2020; 11:517-527. [PMID: 33200009 PMCID: PMC7658305 DOI: 10.1007/s13167-020-00229-8] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/26/2020] [Accepted: 10/30/2020] [Indexed: 12/12/2022]
Abstract
Verbal communication is one of the most sophisticated human motor skills, reflecting both the mental and the physical health of an individual. Voice parameter and quality changes are usually secondary to functional and/or structural laryngological alterations under specific systemic processes, syndromes, and pathologies. These include, but are not restricted to, dry mouth and Sicca syndromes, body dehydration, hormonal alterations linked to pubertal, menopausal, and andropausal status, respiratory disorders, gastrointestinal reflux, autoimmune diseases, endocrinologic disorders, underweight versus overweight and obesity, and diabetes mellitus. On the other hand, it is well established that stress overload is a significant risk factor for cascading pathologies, including but not restricted to neurodegenerative and psychiatric disorders, diabetes mellitus, cardiovascular disease, stroke, and cancers. Our current study revealed voice perturbations under stress overload as a potentially useful biomarker to identify individuals in suboptimal health conditions who might be strongly predisposed to associated pathologies. Contextually, extended population surveys might be useful to identify, for example, persons at high risk for respiratory complications under pandemic conditions such as COVID-19. Symptoms of dry mouth syndrome, disturbed microcirculation, altered sense regulation, shifted circadian rhythm, and low BMI were positively associated with voice perturbations under stress overload. Their functional interrelationships and relevance for cascading associated pathologies are presented in the article. Automated analysis of voice recordings via artificial intelligence (AI) has the potential to yield digital biomarkers. Further, predictive machine learning models should be developed that can detect a suboptimal health condition from voice recordings, ideally in an automated manner using the derived digital biomarkers. Follow-up stratification and monitoring of individuals in suboptimal health conditions are recommended using disease-specific cell-free nucleic acids (ccfDNA, ctDNA, mtDNA, miRNA) combined with metabolic patterns detected in body fluids. Application of cost-effective targeted prevention within the phase of reversible health damage is recommended, based on individualised patient profiling.
Affiliation(s)
- A. Kunin
  - Departments of Maxillofacial Surgery and Hospital Dentistry, Voronezh N.N. Burdenko State Medical University, Voronezh, Russia
- N. Sargheini
  - Center of Molecular Biotechnology, CEMBIO, Friedrich-Wilhelms-Universität Bonn, Bonn, Germany
- C. Birkenbihl
  - Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Schloss Birlinghoven, 53757 Sankt Augustin, Germany
  - Bonn-Aachen International Center for IT, Rheinische Friedrich-Wilhelms-Universität Bonn, 53115 Bonn, Germany
- N. Moiseeva
  - Departments of Maxillofacial Surgery and Hospital Dentistry, Voronezh N.N. Burdenko State Medical University, Voronezh, Russia
- Holger Fröhlich
  - Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Schloss Birlinghoven, 53757 Sankt Augustin, Germany
  - Bonn-Aachen International Center for IT, Rheinische Friedrich-Wilhelms-Universität Bonn, 53115 Bonn, Germany
- Olga Golubnitschaja
  - Predictive, Preventive and Personalised (3P) Medicine, Department of Radiation Oncology, University Hospital Bonn, Friedrich-Wilhelms-Universität Bonn, Bonn, Germany
140
Fluctuations in Subjective Tinnitus Ratings Over Time: Implications for Clinical Research. Otol Neurotol 2020; 41:e1167-e1173. [PMID: 32925865 DOI: 10.1097/mao.0000000000002759] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/25/2022]
Abstract
OBJECTIVE Patients with chronic, subjective tinnitus are often administered a battery of audiometric tests to characterize their tinnitus percept. Even a comprehensive battery, if applied just once, cannot capture fluctuations in tinnitus strength or quality over time. Moreover, subjects experience a learning curve when reporting the detailed characteristics of their tinnitus percept, such that a single assessment will reflect a lack of familiarity with test requirements. We addressed these challenges by programming an automated software platform for at-home tinnitus characterization over a 2-week period. STUDY DESIGN Prospective case series. SETTING Tertiary referral center, patients' homes. INTERVENTIONS Following an initial clinic visit, 25 subjects with chronic subjective tinnitus returned home with a tablet computer and calibrated headphones to complete questionnaires, hearing tests, and tinnitus psychoacoustic testing. We repeatedly characterized loudness discomfort levels and tinnitus matching over a 2-week period. MAIN OUTCOME MEASURES Primary outcomes included intrasubject variability in loudness discomfort levels, tinnitus intensity, and tinnitus acoustic matching over the course of testing. RESULTS Within-subject variability for all outcome measures could be reduced by approximately 25 to 50% by excluding initial measurements and by focusing only on tinnitus matching attempts where subjects report high confidence in the accuracy of their ratings. CONCLUSIONS Tinnitus self-report is inherently variable but can converge on reliable values with extended testing. Repeated, self-directed tinnitus assessments may have implications for identifying malingerers. Further, these findings suggest that extending the baseline phase of tinnitus characterizations will increase the statistical power for future studies focused on tinnitus interventions.
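The key finding here is statistical: within-subject variability shrinks once early (learning-curve) sessions are discarded. A toy illustration of that comparison, with invented loudness-match values rather than the study's data:

```python
# Toy illustration of the reported analysis idea: within-subject standard
# deviation of repeated tinnitus-loudness matches, before and after
# discarding the first few (learning-curve) sessions. Numbers are invented.

from statistics import stdev

# Repeated loudness matches (dB) for one subject over two weeks: early
# sessions are noisy, later ones settle down.
matches_db = [42.0, 55.0, 38.0, 47.0, 46.0, 48.0, 47.5, 46.5, 47.0]

sd_all = stdev(matches_db)
sd_trimmed = stdev(matches_db[3:])   # drop the first three sessions

print(sd_all > sd_trimmed)           # variability shrinks
```

The same comparison, run per subject and per outcome measure, is what supports the reported 25-50% reduction in within-subject variability.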
141
Vázquez-Romero A, Gallardo-Antolín A. Automatic Detection of Depression in Speech Using Ensemble Convolutional Neural Networks. ENTROPY 2020; 22:e22060688. [PMID: 33286460 PMCID: PMC7517226 DOI: 10.3390/e22060688] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 05/24/2020] [Revised: 06/17/2020] [Accepted: 06/19/2020] [Indexed: 12/29/2022]
Abstract
This paper proposes a speech-based method for automatic depression classification. The system is based on ensemble learning for Convolutional Neural Networks (CNNs) and is evaluated using the data and the experimental protocol provided in the Depression Classification Sub-Challenge (DCC) at the 2016 Audio-Visual Emotion Challenge (AVEC-2016). In the pre-processing phase, speech files are represented as sequences of log-spectrograms and randomly sampled to balance positive and negative samples. For the classification task itself, first, an architecture better suited to this task, based on one-dimensional Convolutional Neural Networks, is built. Second, several of these CNN-based models are trained with different initializations, and the corresponding individual predictions are fused using an ensemble averaging algorithm and combined per speaker to reach a final decision. The proposed ensemble system achieves satisfactory results on the DCC at AVEC-2016 in comparison with a reference system based on Support Vector Machines and hand-crafted features, with a CNN+LSTM-based system called DepAudioNet, and with a single CNN-based classifier.
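The fusion scheme in this abstract has two stages: average the predictions of several independently initialized models, then combine the averaged segment scores per speaker. A toy sketch of that logic (invented scores and speaker IDs, not AVEC-2016 outputs):

```python
# Sketch of ensemble averaging followed by a per-speaker decision:
# several models each score short speech segments; scores are averaged
# across models, then averaged per speaker and thresholded.

# Segment scores from three independently initialised models,
# keyed by (speaker, segment index). Toy numbers.
model_scores = {
    ("spk1", 0): [0.8, 0.7, 0.9],
    ("spk1", 1): [0.6, 0.7, 0.5],
    ("spk2", 0): [0.2, 0.3, 0.1],
    ("spk2", 1): [0.4, 0.3, 0.2],
}

# 1) Ensemble averaging across the models for each segment.
segment_avg = {k: sum(v) / len(v) for k, v in model_scores.items()}

# 2) Combine segment scores per speaker and threshold the mean.
speakers = {spk for spk, _ in segment_avg}
decision = {}
for spk in speakers:
    vals = [s for (sp, _), s in segment_avg.items() if sp == spk]
    decision[spk] = int(sum(vals) / len(vals) >= 0.5)

print(decision["spk1"], decision["spk2"])  # -> 1 0
```

Averaging over models with different initializations reduces the variance of any single run, which is the motivation for the ensemble step.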
142
D'Alfonso S. AI in mental health. Curr Opin Psychol 2020; 36:112-117. [PMID: 32604065 DOI: 10.1016/j.copsyc.2020.04.005] [Citation(s) in RCA: 51] [Impact Index Per Article: 12.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/08/2020] [Revised: 04/14/2020] [Accepted: 04/16/2020] [Indexed: 10/24/2022]
Abstract
With the advent of digital approaches to mental health, modern artificial intelligence (AI), and machine learning in particular, is being used in the development of prediction, detection and treatment solutions for mental health care. In terms of treatment, AI is being incorporated into digital interventions, particularly web and smartphone apps, to enhance user experience and optimise personalised mental health care. In terms of prediction and detection, modern streams of abundant data mean that data-driven AI methods can be employed to develop prediction/detection models for mental health conditions. In particular, an individual's 'digital exhaust', the data gathered from their numerous personal digital device and social media interactions, can be mined for behavioural or mental health insights. Language, long considered a window into the human mind, can now be quantitatively harnessed as data with powerful computer-based natural language processing to also provide a method of inferring mental health. Furthermore, natural language processing can also be used to develop conversational agents used for therapeutic intervention.
Affiliation(s)
- Simon D'Alfonso
  - The University of Melbourne School of Computing and Information Systems, Australia
143
Wang X, Wang Y, Zhou M, Li B, Liu X, Zhu T. Identifying Psychological Symptoms Based on Facial Movements. Front Psychiatry 2020; 11:607890. [PMID: 33384632 PMCID: PMC7769937 DOI: 10.3389/fpsyt.2020.607890] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 09/18/2020] [Accepted: 11/17/2020] [Indexed: 11/13/2022] Open
Abstract
Background: Many methods have been proposed to automatically identify the presence of mental illness, but these have mostly focused on one specific mental illness. In some non-professional scenarios, it would be more helpful to understand an individual's mental health status from all perspectives. Methods: We recruited 100 participants. Their multi-dimensional psychological symptoms of mental health were evaluated using the Symptom Checklist 90 (SCL-90) and their facial movements under neutral stimulation were recorded using Microsoft Kinect. We extracted the time-series characteristics of the key points as the input, and the subscale scores of the SCL-90 as the output to build facial prediction models. Finally, the convergent validity, discriminant validity, criterion validity, and the split-half reliability were respectively assessed using a multitrait-multimethod matrix and correlation coefficients. Results: The correlation coefficients between the predicted values and actual scores were 0.26 and 0.42 (P < 0.01), which indicated good criterion validity. All models except depression had high convergent validity but low discriminant validity. Results also indicated good levels of split-half reliability for each model [from 0.516 (hostility) to 0.817 (interpersonal sensitivity)] (P < 0.001). Conclusion: The validity and reliability of facial prediction models were confirmed for the measurement of mental health based on the SCL-90. Our research demonstrated that fine-grained aspects of mental health can be identified from the face, and provided a feasible evaluation method for multi-dimensional prediction models.
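Of the psychometric checks this abstract lists, split-half reliability is the most mechanical: correlate odd-item and even-item half scores, then apply the Spearman-Brown correction for halving the test. A sketch with invented per-subject item scores (the study's coefficients ranged from 0.516 to 0.817):

```python
# Sketch: split-half reliability with the Spearman-Brown correction.
# Item scores below are invented for illustration only.

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def split_half_reliability(item_scores):
    """item_scores: one list of item scores per subject."""
    odd = [sum(row[0::2]) for row in item_scores]    # odd-numbered items
    even = [sum(row[1::2]) for row in item_scores]   # even-numbered items
    r = pearson_r(odd, even)
    return 2 * r / (1 + r)          # Spearman-Brown step-up for full length

subjects = [
    [3, 2, 3, 2],
    [1, 1, 2, 1],
    [4, 3, 4, 4],
    [2, 2, 1, 2],
]
rel = split_half_reliability(subjects)
print(0 < rel <= 1)
```

The same correlation machinery, applied between predicted and observed subscale scores, underlies the criterion-validity coefficients the study reports.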
Affiliation(s)
- Xiaoyang Wang
  - Institute of Psychology, Chinese Academy of Sciences, Beijing, China
  - Department of Psychology, University of Chinese Academy of Sciences, Beijing, China
- Yilin Wang
  - Institute of Psychology, Chinese Academy of Sciences, Beijing, China
  - Department of Psychology, University of Chinese Academy of Sciences, Beijing, China
- Mingjie Zhou
  - Institute of Psychology, Chinese Academy of Sciences, Beijing, China
  - Department of Psychology, University of Chinese Academy of Sciences, Beijing, China
- Baobin Li
  - Department of Psychology, University of Chinese Academy of Sciences, Beijing, China
  - School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing, China
- Xiaoqian Liu
  - Institute of Psychology, Chinese Academy of Sciences, Beijing, China
  - Department of Psychology, University of Chinese Academy of Sciences, Beijing, China
- Tingshao Zhu
  - Institute of Psychology, Chinese Academy of Sciences, Beijing, China
  - Department of Psychology, University of Chinese Academy of Sciences, Beijing, China