1. Kim H, Hillis AE, Themistocleous C. Machine Learning Classification of Patients with Amnestic Mild Cognitive Impairment and Non-Amnestic Mild Cognitive Impairment from Written Picture Description Tasks. Brain Sci 2024; 14:652. [PMID: 39061392] [PMCID: PMC11274603] [DOI: 10.3390/brainsci14070652]
Abstract
Individuals with Mild Cognitive Impairment (MCI), a transitional stage between cognitively healthy aging and dementia, are characterized by subtle neurocognitive changes. Clinically, they can be grouped into two main variants, namely patients with amnestic MCI (aMCI) and non-amnestic MCI (naMCI). The distinction between the two variants is clinically significant, as they exhibit different progression rates to dementia. However, it has been particularly challenging to classify the two variants robustly. Recent research indicates that linguistic changes may manifest as one of the early indicators of pathology. Therefore, this study focused on discourse-level writing samples from individuals with MCI. We hypothesized that a written picture description task can provide information that can serve as an ecological, cost-effective classification system between the two variants. We included one hundred sixty-nine individuals diagnosed with either aMCI or naMCI who received neurophysiological evaluations in addition to a short, written picture description task. Natural Language Processing (NLP) and a BERT pre-trained language model were utilized to analyze the writing samples. The written picture description task provided 90% overall classification accuracy for the best classification models, which performed better than cognitive measures. Written discourse analyzed by AI models can automatically assess individuals with aMCI and naMCI and facilitate diagnosis, prognosis, therapy planning, and evaluation.
Affiliation(s)
- Hana Kim
- Department of Communication Sciences and Disorders, University of South Florida, Tampa, FL 33620, USA
- Argye E. Hillis
- Department of Neurology, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
- Physical Medicine and Rehabilitation, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
- Department of Cognitive Science, Johns Hopkins University, Baltimore, MD 21287, USA
2. Kleiman MJ, Galvin JE. High frequency post-pause word choices and task-dependent speech behavior characterize connected speech in individuals with mild cognitive impairment. medRxiv [Preprint] 2024:2024.02.25.24303329. [PMID: 38464237] [PMCID: PMC10925339] [DOI: 10.1101/2024.02.25.24303329]
Abstract
Background Alzheimer's disease (AD) is characterized by progressive cognitive decline, including impairments in speech production and fluency. Mild cognitive impairment (MCI), a prodrome of AD, has also been linked with changes in speech behavior but to a more subtle degree. Objective This study aimed to investigate whether speech behavior immediately following both filled and unfilled pauses (post-pause speech behavior) differs between individuals with MCI and healthy controls (HCs), and how these differences are influenced by the cognitive demands of various speech tasks. Methods Transcribed speech samples were analyzed from both groups across different tasks, including immediate and delayed narrative recall, picture descriptions, and free responses. Key metrics including lexical and syntactic complexity, lexical frequency and diversity, and part of speech usage, both overall and post-pause, were examined. Results Significant differences in pause usage were observed between groups, with a higher incidence and longer latencies following these pauses in the MCI group. Lexical frequency following filled pauses was higher among MCI participants in the free response task but not in other tasks, potentially due to the relative cognitive load of the tasks. The immediate recall task was most useful at differentiating between groups. Predictive analyses utilizing random forest classifiers demonstrated high specificity in using speech behavior metrics to differentiate between MCI and HCs. Conclusions Speech behavior following pauses differs between MCI participants and healthy controls, with these differences being influenced by the cognitive demands of the speech tasks. These post-pause speech metrics can be easily integrated into existing speech analysis paradigms.
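The "post-pause word choice" idea above is straightforward to operationalize. The sketch below is a loose illustration rather than the authors' pipeline: it finds the words that immediately follow filled pauses in a transcript and scores them against a caller-supplied word-frequency table. The filler inventory and the toy frequency values are assumptions.

```python
import math
import re

FILLED_PAUSES = {"uh", "um", "er", "ah"}  # assumed filler inventory

def tokenize(text):
    """Lowercase word tokens; punctuation is dropped."""
    return re.findall(r"[a-z']+", text.lower())

def post_pause_words(tokens):
    """Words that directly follow a filled pause."""
    return [tokens[i + 1] for i, tok in enumerate(tokens[:-1])
            if tok in FILLED_PAUSES]

def mean_log_frequency(words, freq_per_million):
    """Mean log10 corpus frequency; unknown words get a floor of 0.1."""
    return sum(math.log10(freq_per_million.get(w, 0.1)) for w in words) / len(words)

# Toy frequency table (occurrences per million), purely illustrative.
FREQ = {"the": 50000.0, "cat": 20.0, "spatula": 0.5}

tokens = tokenize("The boy um the cat took uh spatula from the jar")
following = post_pause_words(tokens)   # words after "um" and "uh"
score = mean_log_frequency(following, FREQ)
```

A real analysis would use a published frequency norm (e.g., per-million counts from a large corpus) in place of the toy table.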
Affiliation(s)
- Michael J. Kleiman
- Comprehensive Center for Brain Health, Department of Neurology, University of Miami Miller School of Medicine, Boca Raton, FL 33433
- James E. Galvin
- Comprehensive Center for Brain Health, Department of Neurology, University of Miami Miller School of Medicine, Boca Raton, FL 33433
3. Themistocleous CK, Andreou M, Peristeri E. Autism Detection in Children: Integrating Machine Learning and Natural Language Processing in Narrative Analysis. Behav Sci (Basel) 2024; 14:459. [PMID: 38920791] [PMCID: PMC11200366] [DOI: 10.3390/bs14060459]
Abstract
Despite the consensus that early identification leads to better outcomes for individuals with autism spectrum disorder (ASD), recent research reveals that the average age of diagnosis in the Greek population is approximately six years. This age of diagnosis is delayed by an additional two years for families from lower-income or minority backgrounds. These disparities adversely affect intervention outcomes and are further compounded by the often time-consuming and labor-intensive language assessments for children with ASD. There is a crucial need for rigorous, objective tools that increase access to early assessment and diagnosis. The current study leverages artificial intelligence to develop a reliable and practical model for distinguishing children with ASD from typically developing peers based on their narrative and vocabulary skills. We applied natural language processing-based extraction techniques to automatically acquire language features (narrative and vocabulary skills) from storytelling in 68 children with ASD and 52 typically developing children, and then trained machine learning models on the children's combined narrative and expressive vocabulary data to generate behavioral targets that effectively differentiate the two groups. The model distinguished children with ASD from typically developing children with an accuracy of 96%. Specifically, of the models used, histogram-based gradient boosting and XGBoost showed slightly superior performance to the decision tree and gradient boosting models, particularly in accuracy and F1 score. These results bode well for the deployment of machine learning technology for children with ASD, especially those with limited access to early identification services.
Affiliation(s)
- Maria Andreou
- Department of Speech and Language Therapy, University of Peloponnese, 24100 Kalamata, Greece
- Eleni Peristeri
- School of English, Aristotle University of Thessaloniki, 54124 Thessaloniki, Greece
4. Angelopoulou G, Kasselimis D, Goutsos D, Potagas C. A Methodological Approach to Quantifying Silent Pauses, Speech Rate, and Articulation Rate across Distinct Narrative Tasks: Introducing the Connected Speech Analysis Protocol (CSAP). Brain Sci 2024; 14:466. [PMID: 38790445] [PMCID: PMC11119743] [DOI: 10.3390/brainsci14050466]
Abstract
The examination of connected speech may serve as a valuable tool for exploring speech output in both healthy speakers and individuals with language disorders. Numerous studies incorporate various fluency and silence measures into their analyses to investigate speech output patterns in different populations, along with the underlying cognitive processes that occur while speaking. However, methodological inconsistencies across existing studies pose challenges in comparing their results. In the current study, we introduce CSAP (Connected Speech Analysis Protocol), which is a specific methodological approach to investigate fluency metrics, such as articulation rate and speech rate, as well as silence measures, including silent pauses' frequency and duration. We emphasize the importance of employing a comprehensive set of measures within a specific methodological framework to better understand speech output patterns. Additionally, we advocate for the use of distinct narrative tasks for a thorough investigation of speech output in different conditions. We provide an example of data on which we implement CSAP to showcase the proposed pipeline. In conclusion, CSAP offers a comprehensive framework for investigating speech output patterns, incorporating fluency metrics and silence measures in distinct narrative tasks, thus allowing a detailed quantification of connected speech in both healthy and clinical populations. We emphasize the significance of adopting a unified methodological approach in connected speech studies, enabling the integration of results for more robust and generalizable conclusions.
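The fluency and silence measures CSAP names can all be computed from word-level timestamps. The sketch below uses common operational definitions as assumptions (a 250 ms minimum for a silent pause, rates in words per second); the protocol itself specifies its own definitions.

```python
def fluency_metrics(words, pause_threshold=0.25):
    """CSAP-style metrics from a list of (word, onset, offset) tuples,
    times in seconds, sorted by onset. pause_threshold (s) is the assumed
    minimum gap counted as a silent pause."""
    total = words[-1][2] - words[0][1]                 # total speaking window
    gaps = [b[1] - a[2] for a, b in zip(words, words[1:])]
    pauses = [g for g in gaps if g >= pause_threshold]
    speaking_time = total - sum(pauses)
    return {
        "speech_rate": len(words) / total,                # words/s, pauses included
        "articulation_rate": len(words) / speaking_time,  # words/s, pauses excluded
        "pause_count": len(pauses),
        "mean_pause_dur": sum(pauses) / len(pauses) if pauses else 0.0,
    }

m = fluency_metrics([("the", 0.0, 0.2), ("boy", 0.25, 0.5),
                     ("runs", 1.0, 1.3)])
```

Because articulation rate excludes silence, it is always at least as high as speech rate; the gap between the two is itself an informative pause measure.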
Affiliation(s)
- Georgia Angelopoulou
- Neuropsychology & Language Disorders Unit, 1st Neurology Department, Eginition Hospital, Faculty of Medicine, National and Kapodistrian University of Athens, 115 28 Athens, Greece
- Dimitrios Kasselimis
- Neuropsychology & Language Disorders Unit, 1st Neurology Department, Eginition Hospital, Faculty of Medicine, National and Kapodistrian University of Athens, 115 28 Athens, Greece
- Department of Psychology, Panteion University of Social and Political Sciences, 176 71 Athens, Greece
- Dionysios Goutsos
- Department of Linguistics, School of Philosophy, National and Kapodistrian University of Athens, 106 79 Athens, Greece
- Constantin Potagas
- Neuropsychology & Language Disorders Unit, 1st Neurology Department, Eginition Hospital, Faculty of Medicine, National and Kapodistrian University of Athens, 115 28 Athens, Greece
5. Burke E, Gunstad J, Pavlenko O, Hamrick P. Distinguishable features of spontaneous speech in Alzheimer's clinical syndrome and healthy controls. Neuropsychol Dev Cogn B Aging Neuropsychol Cogn 2024; 31:575-586. [PMID: 37272884] [PMCID: PMC10696129] [DOI: 10.1080/13825585.2023.2221020]
Abstract
There is growing evidence that subtle changes in spontaneous speech may reflect early pathological changes in cognitive function. Recent work has found that lexical-semantic features of spontaneous speech predict cognitive dysfunction in individuals with mild cognitive impairment (MCI). The current study assessed whether Ostrand and Gunstad's (OG) lexical-semantic features extend to predicting cognitive status in a sample of individuals with Alzheimer's clinical syndrome (ACS) and healthy controls. Four additional (New) speech indices shown to be important in language processing research were also explored in this sample to extend prior work. Speech transcripts of the Cookie Theft task from 81 individuals with ACS (mean age = 72.7 years, SD = 8.80; 70.4% female) and 61 healthy controls (HC) (mean age = 63.9 years, SD = 8.52; 62.3% female) from DementiaBank were analyzed. Random forest and logistic regression machine learning models examined whether subject-level lexical-semantic features could accurately discriminate those with ACS from HC. Logistic models with the New lexical-semantic features obtained good classification accuracy (78.4%), but the OG features had wider success across model types. In terms of sensitivity and specificity, the random forest model trained on the OG features was the most balanced. These findings suggest that features of spontaneous speech used to predict MCI may also distinguish between individuals with ACS and healthy controls. Future work should evaluate these lexical-semantic features in pre-clinical persons to further explore their potential to assist with early detection through speech analysis.
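Subject-level lexical features of the broad kind this literature uses can be extracted with a few lines of code. The sketch below computes generic stand-ins (token count, type-token ratio, mean word length, pronoun ratio); it does not reproduce the paper's exact OG or New feature definitions, and the pronoun list is a small assumed closed class.

```python
import re

PRONOUNS = {"i", "you", "he", "she", "it", "we", "they", "this", "that"}

def lexical_features(transcript):
    """Per-transcript lexical feature vector, suitable as classifier input."""
    tokens = re.findall(r"[a-z']+", transcript.lower())
    types = set(tokens)
    return {
        "n_tokens": len(tokens),
        "type_token_ratio": len(types) / len(tokens),   # lexical diversity
        "mean_word_len": sum(map(len, tokens)) / len(tokens),
        "pronoun_ratio": sum(t in PRONOUNS for t in tokens) / len(tokens),
    }

features = lexical_features("the boy is stealing a cookie while the stool tips")
```

In a study like this one, each participant's feature dictionary would become one row of the matrix fed to the random forest or logistic regression model.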
Affiliation(s)
- Erin Burke
- Department of Psychological Sciences, Kent State University
- John Gunstad
- Department of Psychological Sciences, Kent State University
6. Ambrosini E, Giangregorio C, Lomurno E, Moccia S, Milis M, Loizou C, Azzolino D, Cesari M, Cid Gala M, Galán de Isla C, Gomez-Raja J, Borghese NA, Matteucci M, Ferrante S. Automatic Spontaneous Speech Analysis for the Detection of Cognitive Functional Decline in Older Adults: Multilanguage Cross-Sectional Study. JMIR Aging 2024; 7:e50537. [PMID: 38386279] [DOI: 10.2196/50537]
Abstract
BACKGROUND The rise in life expectancy is associated with an increase in long-term and gradual cognitive decline. Treatment effectiveness is enhanced at the early stage of the disease. Therefore, there is a need for low-cost and ecological solutions for mass screening of community-dwelling older adults. OBJECTIVE This work aims to exploit automatic analysis of free speech to identify signs of cognitive decline. METHODS A sample of 266 participants older than 65 years was recruited in Italy and Spain and divided into 3 groups according to their Mini-Mental State Examination (MMSE) scores. Participants were asked to tell a story and describe a picture, and voice recordings were used to automatically extract high-level features on different time scales. Based on these features, machine learning algorithms were trained to solve binary and multiclass classification problems using both mono- and cross-lingual approaches. The algorithms were enriched using Shapley Additive Explanations (SHAP) for model explainability. RESULTS In the Italian data set, healthy participants (MMSE score ≥27) were automatically discriminated from participants with mildly impaired cognitive function (MMSE score 20-26) and from those with moderate to severe impairment of cognitive function (MMSE score 11-19) with accuracies of 80% and 86%, respectively. Slightly lower performance was achieved on the Spanish and multilanguage data sets. CONCLUSIONS This work proposes a transparent and unobtrusive assessment method that might be included in a mobile app for large-scale monitoring of cognitive function in older adults. Voice is confirmed to be an important biomarker of cognitive decline due to its noninvasive and easily accessible nature.
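The three MMSE bands reported in the abstract map to a simple grouping function. Handling of scores below 11 (labeled `out_of_range` here) is an assumption, since the abstract does not report that band.

```python
def mmse_group(score):
    """Assign the study's MMSE band: >=27 healthy, 20-26 mild impairment,
    11-19 moderate-to-severe impairment (as stated in the abstract)."""
    if score >= 27:
        return "healthy"
    if 20 <= score <= 26:
        return "mild"
    if 11 <= score <= 19:
        return "moderate_severe"
    return "out_of_range"  # assumed label for scores the abstract does not cover
```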
Affiliation(s)
- Emilia Ambrosini
- Department of Electronics, Information and Bioengineering, Politecnico di Milano, Milano, Italy
- Chiara Giangregorio
- Department of Electronics, Information and Bioengineering, Politecnico di Milano, Milano, Italy
- Eugenio Lomurno
- Department of Electronics, Information and Bioengineering, Politecnico di Milano, Milano, Italy
- Sara Moccia
- BioRobotics Institute and Department of Excellence in Robotics and AI, Scuola Superiore Sant'Anna, Pisa, Italy
- Christos Loizou
- Department of Electrical Engineering, Computer Engineering and Informatics, Cyprus University of Technology, Limassol, Cyprus
- Domenico Azzolino
- Geriatric Unit, Fondazione Istituto di Ricovero e Cura a Carattere Scientifico Ca' Granda Ospedale Maggiore Policlinico, Milano, Italy
- Matteo Cesari
- Ageing and Health Unit, Department of Maternal, Newborn, Child, Adolescent Health and Ageing, World Health Organization, Geneva, Switzerland
- Manuel Cid Gala
- Consejería de Sanidad y Servicios Sociales, Junta de Extremadura, Merida, Spain
- Jonathan Gomez-Raja
- Consejería de Sanidad y Servicios Sociales, Junta de Extremadura, Merida, Spain
- Matteo Matteucci
- Department of Electronics, Information and Bioengineering, Politecnico di Milano, Milano, Italy
- Simona Ferrante
- Department of Electronics, Information and Bioengineering, Politecnico di Milano, Milano, Italy
- Laboratory of E-Health Technologies and Artificial Intelligence Research in Neurology, Joint Research Platform, Fondazione Istituto di Ricovero e Cura a Carattere Scientifico Istituto Neurologico Carlo Besta, Milano, Italy
7. B.T B, Chen JM. Performance Assessment of ChatGPT versus Bard in Detecting Alzheimer's Dementia. Diagnostics (Basel) 2024; 14:817. [PMID: 38667463] [PMCID: PMC11048951] [DOI: 10.3390/diagnostics14080817]
Abstract
Large language models (LLMs) find increasing applications in many fields. Here, three LLM chatbots (ChatGPT-3.5, ChatGPT-4, and Bard) are assessed in their current form, as publicly available, for their ability to recognize Alzheimer's dementia (AD) and Cognitively Normal (CN) individuals using textual input derived from spontaneous speech recordings. A zero-shot learning approach is used at two levels of independent queries, with the second query (chain-of-thought prompting) eliciting more detailed information than the first. Each LLM chatbot's performance is evaluated on the prediction generated in terms of accuracy, sensitivity, specificity, precision, and F1 score. LLM chatbots generated a three-class outcome ("AD", "CN", or "Unsure"). When positively identifying AD, Bard produced the highest true-positives (89% recall) and highest F1 score (71%), but tended to misidentify CN as AD, with high confidence (low "Unsure" rates); for positively identifying CN, GPT-4 resulted in the highest true-negatives at 56% and highest F1 score (62%), adopting a diplomatic stance (moderate "Unsure" rates). Overall, the three LLM chatbots can identify AD vs. CN, surpassing chance-levels, but do not currently satisfy the requirements for clinical application.
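With a three-way "AD"/"CN"/"Unsure" outcome, the usual binary metrics need a scoring convention. The sketch below treats "Unsure" as neither a positive nor a negative call; that convention is an assumption about how the paper scored its predictions.

```python
def scores(gold, pred, positive="AD"):
    """Recall, precision, and F1 for the positive class when predictions
    may include an 'Unsure' abstention."""
    pairs = list(zip(gold, pred))
    tp = sum(g == positive and p == positive for g, p in pairs)
    fp = sum(g != positive and p == positive for g, p in pairs)
    fn = sum(g == positive and p != positive for g, p in pairs)  # incl. 'Unsure'
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"recall": recall, "precision": precision, "f1": f1}

s = scores(["AD", "AD", "CN", "CN"], ["AD", "Unsure", "AD", "CN"])
```

Swapping `positive="CN"` gives the complementary view used when the abstract reports true-negative performance.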
Affiliation(s)
- Balamurali B.T
- Science, Mathematics & Technology (SMT), Singapore University of Technology & Design, 8 Somapah Rd, Singapore 487372, Singapore
- Jer-Ming Chen
- Science, Mathematics & Technology (SMT), Singapore University of Technology & Design, 8 Somapah Rd, Singapore 487372, Singapore
8. Lukic S, Fan Z, García AM, Welch AE, Ratnasiri BM, Wilson SM, Henry ML, Vonk J, Deleon J, Miller BL, Miller Z, Mandelli ML, Gorno-Tempini ML. Discriminating nonfluent/agrammatic and logopenic PPA variants with automatically extracted morphosyntactic measures from connected speech. Cortex 2024; 173:34-48. [PMID: 38359511] [PMCID: PMC11246552] [DOI: 10.1016/j.cortex.2023.12.013]
Abstract
Morphosyntactic assessments are important for characterizing individuals with the nonfluent/agrammatic variant of primary progressive aphasia (nfvPPA). Yet standard tests are subject to examiner bias and often fail to differentiate between nfvPPA and the logopenic variant (lvPPA). Moreover, relevant neural signatures remain underexplored. Here, we leverage natural language processing tools to automatically capture morphosyntactic disturbances and their neuroanatomical correlates in 35 individuals with nfvPPA relative to 10 healthy controls (HC) and 26 individuals with lvPPA. Participants described a picture, and the ensuing transcripts were analyzed via part-of-speech tagging to extract sentence-related features (e.g., subordinating and coordinating conjunctions), verb-related features (e.g., tense markers), and nominal-related features (e.g., subjective and possessive pronouns). Gradient boosting machines were used to classify between groups using all features. We identified the most discriminant morphosyntactic marker via a feature importance algorithm and examined its neural correlates via voxel-based morphometry. Individuals with nfvPPA produced fewer morphosyntactic elements than the other two groups. These features robustly discriminated them from individuals with lvPPA and from HCs, with AUCs of .95 and .82, respectively. The most discriminatory feature, the production of subordinating conjunctions, was correlated with cortical atrophy within the left posterior inferior frontal gyrus across groups (pFWE < .05). Automated morphosyntactic analysis can efficiently differentiate nfvPPA from lvPPA, and the most sensitive morphosyntactic markers correlate with a core atrophy region of nfvPPA. Our approach can thus contribute to a key challenge in PPA diagnosis.
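Because subordinating conjunctions are a closed class, the most discriminant feature here can be approximated even without a full POS tagger by a plain word-list count. The list below is a small assumed subset, and a real pipeline (like the paper's) would still use POS tagging to separate, for example, the complementizer "that" from the pronoun.

```python
import re

# Assumed (partial) closed-class inventory of English subordinators.
SUBORDINATORS = {"because", "although", "while", "if", "since",
                 "when", "that", "unless", "before", "after"}

def subordinator_rate(transcript):
    """Subordinating conjunctions per 100 words, a length-normalized
    production rate (normalization choice is an assumption)."""
    tokens = re.findall(r"[a-z']+", transcript.lower())
    if not tokens:
        return 0.0
    hits = sum(t in SUBORDINATORS for t in tokens)
    return 100.0 * hits / len(tokens)
```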
Affiliation(s)
- Sladjana Lukic
- University of California, San Francisco Memory and Aging Center, CA, USA
- Ruth S. Ammon College of Education and Health Sciences, Department of Communication Sciences and Disorders, Adelphi University, Garden City, NY, USA
- Zekai Fan
- Heinz College of Information Systems and Public Policy, Carnegie Mellon University, Pittsburgh, PA, USA
- Adolfo M García
- Global Brain Health Institute (GBHI), University of California, San Francisco, CA, USA
- Cognitive Neuroscience Center, Universidad de San Andrés, Buenos Aires, Argentina
- Departamento de Lingüística y Literatura, Facultad de Humanidades, Universidad de Santiago de Chile, Santiago, Chile
- Ariane E Welch
- Ruth S. Ammon College of Education and Health Sciences, Department of Communication Sciences and Disorders, Adelphi University, Garden City, NY, USA
- Stephen M Wilson
- School of Health and Rehabilitation Sciences, University of Queensland, Brisbane, QLD, Australia
- Maya L Henry
- University of Texas at Austin Moody College of Communication, Austin, TX, USA
- Jet Vonk
- University of California, San Francisco Memory and Aging Center, CA, USA
- Jessica Deleon
- University of California, San Francisco Memory and Aging Center, CA, USA
- Bruce L Miller
- University of California, San Francisco Memory and Aging Center, CA, USA
- Zachary Miller
- University of California, San Francisco Memory and Aging Center, CA, USA
9. Larsen E, Murton O, Song X, Joachim D, Watts D, Kapczinski F, Venesky L, Hurowitz G. Validating the efficacy and value proposition of mental fitness vocal biomarkers in a psychiatric population: prospective cohort study. Front Psychiatry 2024; 15:1342835. [PMID: 38505797] [PMCID: PMC10948552] [DOI: 10.3389/fpsyt.2024.1342835]
Abstract
Background The utility of vocal biomarkers for mental health assessment has gained increasing attention. This study aims to further this line of research by introducing a novel vocal scoring system designed to provide mental fitness tracking insights to users in real-world settings. Methods A prospective cohort study with 104 outpatient psychiatric participants was conducted to validate the "Mental Fitness Vocal Biomarker" (MFVB) score. The MFVB score was derived from eight vocal features, selected based on literature review. Participants' mental health symptom severity was assessed using the M3 Checklist, which serves as a transdiagnostic tool for measuring depression, anxiety, post-traumatic stress disorder, and bipolar symptoms. Results The MFVB demonstrated an ability to stratify individuals by their risk of elevated mental health symptom severity. Continuous observation enhanced the MFVB's efficacy, with risk ratios improving from 1.53 (1.09-2.14, p=0.0138) for single 30-second voice samples to 2.00 (1.21-3.30, p=0.0068) for data aggregated over two weeks. A higher risk ratio of 8.50 (2.31-31.25, p=0.0013) was observed in participants who used the MFVB 5-6 times per week, underscoring the utility of frequent and continuous observation. Participant feedback confirmed the user-friendliness of the application and its perceived benefits. Conclusions The MFVB is a promising tool for objective mental health tracking in real-world conditions, with potential to be a cost-effective, scalable, and privacy-preserving adjunct to traditional psychiatric assessments. User feedback suggests that vocal biomarkers can offer personalized insights and support clinical therapy and other beneficial activities that are associated with improved mental health risks and outcomes.
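The risk ratios and confidence intervals quoted above (e.g., 1.53, 1.09-2.14) can be reproduced from a 2x2 table with the standard log-transform method. The counts in the example below are invented for illustration, not the study's data.

```python
import math

def risk_ratio(a, b, c, d):
    """Risk ratio with 95% CI (log method).
    a/b: events/non-events in the exposed group; c/d: in the unexposed group."""
    rr = (a / (a + b)) / (c / (c + d))
    # Standard error of log(RR)
    se = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    lo = math.exp(math.log(rr) - 1.96 * se)
    hi = math.exp(math.log(rr) + 1.96 * se)
    return rr, lo, hi

# Hypothetical counts: 30/50 high-severity among high-MFVB-risk participants
# vs. 20/50 among low-risk participants.
rr, lo, hi = risk_ratio(30, 20, 20, 30)
```

A CI whose lower bound stays above 1.0, as in the study's aggregated two-week result, is what supports the claim that more observation sharpens the stratification.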
Affiliation(s)
- Devon Watts
- Neuroscience Graduate Program, Department of Health Sciences, McMaster University, Hamilton, ON, Canada
- St. Joseph's Healthcare Hamilton, Hamilton, ON, Canada
- Flavio Kapczinski
- Neuroscience Graduate Program, Department of Health Sciences, McMaster University, Hamilton, ON, Canada
- Department of Psychiatry, Universidade Federal do Rio Grande do Sul, Porto Alegre, Brazil
10. Wang Z, Zhang Q. Ageing of grammatical advance planning in spoken sentence production: an eye movement study. Psychol Res 2024; 88:652-669. [PMID: 37561202] [DOI: 10.1007/s00426-023-01861-5]
Abstract
This study used a picture-description paradigm with concurrent eye-movement recording to investigate differences in grammatical advance planning between young and older speakers in spoken sentence production. Participants were asked to produce sentences with simple or complex initial phrase structures (IPS) in Experiment 1 and individual words in Experiment 2. Young and older speakers showed comparable speaking latencies in the sentence production task, whereas older speakers showed longer latencies than young speakers in the word production task. In simple-IPS sentences, older speakers had a higher fixation percentage on object 1, a lower percentage of gaze shifts from object 1 to object 2, and a lower fixation percentage on object 2 than young speakers; in complex-IPS sentences, they showed a similar fixation percentage on object 1 and a similar percentage of gaze shifts, but a lower fixation percentage on object 2. These patterns indicate a decline in the scope of grammatical encoding reflected in eye movements. Meanwhile, speech analysis showed that older speakers produced longer utterance durations, slower speech rates, and longer and more frequent pauses in articulation, indicating a decline in speech articulation. Our study thus suggests that older speakers experience an ageing effect in sentences with complex initial phrases due to limited cognitive resources.
Affiliation(s)
- Zhiyun Wang
- Department of Psychology, Renmin University of China, 59 Zhongguancun Street, Haidian District, Beijing, 100872, People's Republic of China
- Qingfang Zhang
- Department of Psychology, Renmin University of China, 59 Zhongguancun Street, Haidian District, Beijing, 100872, People's Republic of China
11. Banks R, Higgins C, Greene BR, Jannati A, Gomes-Osman J, Tobyne S, Bates D, Pascual-Leone A. Clinical classification of memory and cognitive impairment with multimodal digital biomarkers. Alzheimers Dement (Amst) 2024; 16:e12557. [PMID: 38406610] [PMCID: PMC10884988] [DOI: 10.1002/dad2.12557]
Abstract
INTRODUCTION Early detection of Alzheimer's disease and cognitive impairment is critical to improving the healthcare trajectories of aging adults, enabling early intervention and potential prevention of decline. METHODS To evaluate multimodal feature sets for assessing memory and cognitive impairment, feature selection and subsequent logistic regressions were used to identify the most salient features in classifying Rey Auditory Verbal Learning Test-determined memory impairment. RESULTS Multimodal models incorporating graphomotor, memory, and speech and voice features provided the strongest classification performance (area under the curve = 0.83; sensitivity = 0.81; specificity = 0.80). Multimodal models were superior to all single-modality and demographics-only models. DISCUSSION The current research adds to the prevailing multimodal profile of cognitive impairment, suggesting that it is associated with slower speech, with a particular effect on the duration, frequency, and percentage of pauses compared to normal healthy speech.
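The reported AUC of 0.83 has a direct rank-based interpretation: it is the probability that a randomly chosen impaired case receives a higher model score than a randomly chosen control. The sketch below computes AUC exactly that way (the Mann-Whitney formulation), with no ML library required.

```python
def auc(pos_scores, neg_scores):
    """AUC as the Mann-Whitney probability P(score_pos > score_neg),
    counting ties as half a win."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))
```

The O(n*m) double loop is fine for cohort-sized data; a rank-sum formulation is the standard optimization for larger sets.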
Affiliation(s)
- Russell Banks
- Department of Communicative Sciences & Disorders, College of Arts & Sciences, Michigan State University, East Lansing, Michigan, USA
- Ali Jannati
- Department of Neurology, Harvard Medical School, Boston, Massachusetts, USA
- Joyce Gomes-Osman
- Department of Neurology, University of Miami Miller School of Medicine, Miami, Florida, USA
- Alvaro Pascual-Leone
- Linus Health, Boston, Massachusetts, USA
- Department of Neurology, Harvard Medical School, Boston, Massachusetts, USA
- Hinda and Arthur Marcus Institute for Aging Research and Deanna and Sidney Wolk Center for Memory Health, Hebrew SeniorLife, Boston, Massachusetts, USA
12. Park CY, Kim M, Shim Y, Ryoo N, Choi H, Jeong HT, Yun G, Lee H, Kim H, Kim S, Youn YC. Harnessing the Power of Voice: A Deep Neural Network Model for Alzheimer's Disease Detection. Dement Neurocogn Disord 2024; 23:1-10. [PMID: 38362055] [PMCID: PMC10864696] [DOI: 10.12779/dnd.2024.23.1.1]
Abstract
Background and Purpose Voice, reflecting cerebral function, holds potential for analyzing and understanding brain function, especially in the context of cognitive impairment (CI) and Alzheimer's disease (AD). This study used voice data to distinguish between normal cognition and CI or Alzheimer's disease dementia (ADD). Methods This study enrolled 3 groups of subjects: 1) 52 subjects with subjective cognitive decline; 2) 110 subjects with mild CI; and 3) 59 subjects with ADD. Voice features were extracted using Mel-frequency cepstral coefficients and Chroma features. Results A deep neural network (DNN) model showed promising performance, with an accuracy of roughly 81% across 10 trials in predicting ADD, which increased to an average of about 82.0%±1.6% when evaluated against an unseen test dataset. Conclusions Although the results do not demonstrate the level of accuracy necessary for a definitive clinical tool, they provide a compelling proof of concept for the potential use of voice data in cognitive status assessment. DNN algorithms using voice offer a promising approach to early detection of AD and could improve the accuracy and accessibility of diagnosis, ultimately leading to better outcomes for patients.
Affiliation(s)
- Chan-Young Park
- Department of Neurology, Chung-Ang University College of Medicine, Seoul, Korea
- Minsoo Kim
- Research and Development, Baikal AI Inc., Seoul, Korea
- YongSoo Shim
- Department of Neurology, Eunpyeong St. Mary's Hospital, The Catholic University of Korea, Seoul, Korea
- Nayoung Ryoo
- Department of Neurology, Eunpyeong St. Mary's Hospital, The Catholic University of Korea, Seoul, Korea
- Hyunjoo Choi
- Department of Communication Disorders, Korea Nazarene University, Cheonan, Korea
- Ho Tae Jeong
- Department of Neurology, Chung-Ang University College of Medicine, Seoul, Korea
- Gihyun Yun
- Research and Development, Baikal AI Inc., Seoul, Korea
- Hunboc Lee
- Research and Development, Baikal AI Inc., Seoul, Korea
- Hyungryul Kim
- Research and Development, Baikal AI Inc., Seoul, Korea
- SangYun Kim
- Department of Neurology, Seoul National University College of Medicine and Seoul National University Bundang Hospital, Seongnam, Korea
- Young Chul Youn
- Department of Neurology, Chung-Ang University College of Medicine, Seoul, Korea
- Department of Medical Informatics, Chung-Ang University College of Medicine, Seoul, Korea
13
Parlak MM, Saylam G, Babademez MA, Munis ÖB, Tokgöz SA. Voice analysis results in individuals with Alzheimer's disease: How do age and cognitive status affect voice parameters? Brain Behav 2023; 13:e3271. [PMID: 37794703 PMCID: PMC10636380 DOI: 10.1002/brb3.3271] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 08/10/2023] [Revised: 09/17/2023] [Accepted: 09/23/2023] [Indexed: 10/06/2023] Open
Abstract
BACKGROUND Reports of acoustic changes in the voice of individuals with Alzheimer's disease (AD), and of the relationship of those changes with age and cognitive status, are still limited. OBJECTIVE This study aims to determine the changes in voice analysis results in AD, as well as the effects of age and cognitive status on voice parameters. METHODS The study included 47 women (AD: 30; healthy: 17) with a mean age of 76.13 years. The acoustic voice parameters mean fundamental frequency (F0), relative average perturbation (RAP), jitter percent (Jitt), shimmer percent (Shim), and noise-to-harmonic ratio were measured. The mini-mental state examination (MMSE) was administered. RESULTS F0, Shim, Jitt, and RAP values were statistically significantly higher in individuals with AD than in healthy individuals. There was a significant negative correlation between MMSE and F0, Jitt, RAP, and Shim, and the MMSE score had a significant negative effect on F0, Jitt, and RAP (p < .05). CONCLUSION Cognitive status was found to significantly impact the voice, with fundamental frequency and frequency and amplitude perturbations increasing as cognitive level decreases. To support the therapy process for voice disorders, cognitive functions can be targeted in addition to voice therapy.
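The perturbation measures named above (local jitter, RAP) can be sketched in a few lines. The glottal-period values below are illustrative; real analyses extract period sequences from the acoustic signal (e.g., with Praat), and exact definitions vary slightly between tools:

```python
def local_jitter(periods):
    """Jitter (local), %: mean absolute difference of consecutive
    glottal periods, divided by the mean period."""
    diffs = [abs(a - b) for a, b in zip(periods, periods[1:])]
    return 100 * (sum(diffs) / len(diffs)) / (sum(periods) / len(periods))

def rap(periods):
    """Relative average perturbation, %: deviation of each period from
    the 3-point moving average of its neighborhood, over the mean period."""
    devs = [abs(periods[i] - (periods[i - 1] + periods[i] + periods[i + 1]) / 3)
            for i in range(1, len(periods) - 1)]
    return 100 * (sum(devs) / len(devs)) / (sum(periods) / len(periods))

# illustrative periods (seconds) for a voice near 200 Hz
periods = [0.0050, 0.0051, 0.0049, 0.0050, 0.0052, 0.0050]
print(f"jitter = {local_jitter(periods):.2f}%, RAP = {rap(periods):.2f}%")
```

Shimmer follows the same pattern with cycle peak amplitudes in place of periods.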
Affiliation(s)
- Mümüne Merve Parlak
- Department of Speech and Language Therapy, Faculty of Health Sciences, Ankara Yıldırım Beyazıt University, Ankara, Turkey
- Güleser Saylam
- Department of Otolaryngology, Etlik City Hospital, Ankara, Turkey
14
Bushnell J, Unverzagt F, Wadley VG, Kennedy R, Del Gaizo J, Clark DG. Post-Processing Automatic Transcriptions with Machine Learning for Verbal Fluency Scoring. SPEECH COMMUNICATION 2023; 155:102990. [PMID: 38881790 PMCID: PMC11171467 DOI: 10.1016/j.specom.2023.102990] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/18/2024]
Abstract
Objective To compare verbal fluency scores derived from manual transcriptions to those obtained using automatic speech recognition enhanced with machine learning classifiers. Methods Using Amazon Web Services, we automatically transcribed verbal fluency recordings from 1400 individuals who performed both animal and letter F verbal fluency tasks. We manually adjusted timings and contents of the automatic transcriptions to obtain "gold standard" transcriptions. To make automatic scoring possible, we trained machine learning classifiers to discern between valid and invalid utterances. We then calculated and compared verbal fluency scores from the manual and automatic transcriptions. Results For both animal and letter fluency tasks, we achieved good separation of valid versus invalid utterances. Verbal fluency scores calculated based on automatic transcriptions showed high correlation with those calculated after manual correction. Conclusion Many techniques for scoring verbal fluency word lists require accurate transcriptions with word timings. We show that machine learning methods can be applied to improve off-the-shelf ASR for this purpose. These automatically derived scores may be satisfactory for some applications. Low correlations among some of the scores indicate the need for improvement in automatic speech recognition before a fully automatic approach can be reliably implemented.
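Once utterances are classified as valid or invalid, the fluency raw score is just a count of unique valid productions. A minimal sketch — the lexicon and transcript below are toy stand-ins for the paper's trained validity classifiers and ASR output:

```python
def fluency_score(utterances, is_valid):
    """Count unique valid productions; repetitions and intrusions score 0."""
    seen, score = set(), 0
    for word in utterances:
        w = word.lower()
        if is_valid(w) and w not in seen:
            seen.add(w)
            score += 1
    return score

ANIMALS = {"dog", "cat", "horse", "lion", "tiger"}  # stand-in lexicon
transcript = ["dog", "cat", "dog", "uh", "lion"]    # "dog" repeated, "uh" invalid
print(fluency_score(transcript, lambda w: w in ANIMALS))
```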
Affiliation(s)
- Justin Bushnell
- Department of Neurology, Indiana University, Indianapolis, IN, USA
- Virginia G Wadley
- Department of Medicine, University of Alabama at Birmingham, Birmingham, AL, USA
- Richard Kennedy
- Department of Medicine, University of Alabama at Birmingham, Birmingham, AL, USA
- John Del Gaizo
- Biomedical Informatics Center, Medical University of South Carolina, Charleston, SC, USA
15
Bushnell J, Hammers DB, Aisen P, Dage JL, Eloyan A, Foroud T, Grinberg LT, Iaccarino L, Jack CR, Kirby K, Kramer J, Koeppe R, Kukull WA, La Joie R, Mundada NS, Murray ME, Nudelman K, Rumbaugh M, Soleimani-Meigooni DN, Toga A, Touroutoglou A, Vemuri P, Atri A, Day GS, Duara R, Graff-Radford NR, Honig LS, Jones DT, Masdeu J, Mendez M, Musiek E, Onyike CU, Riddle M, Rogalski E, Salloway S, Sha S, Turner RS, Wingo TS, Wolk DA, Carrillo MC, Dickerson BC, Rabinovici GD, Apostolova LG, Clark DG. Influence of amyloid and diagnostic syndrome on non-traditional memory scores in early-onset Alzheimer's disease. Alzheimers Dement 2023; 19 Suppl 9:S29-S41. [PMID: 37653686 PMCID: PMC10855009 DOI: 10.1002/alz.13434] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/14/2023] [Revised: 06/27/2023] [Accepted: 06/28/2023] [Indexed: 09/02/2023]
Abstract
INTRODUCTION The Rey Auditory Verbal Learning Test (RAVLT) is a useful neuropsychological test for describing episodic memory impairment in dementia. However, there is limited research on its utility in early-onset Alzheimer's disease (EOAD). We assess the influence of amyloid and diagnostic syndrome on several memory scores in EOAD. METHODS We transcribed RAVLT recordings from 303 subjects in the Longitudinal Early-Onset Alzheimer's Disease Study. Subjects were grouped by amyloid status and syndrome. Primacy, recency, J-curve, duration, stopping time, and speed score were calculated and entered into linear mixed effects models as dependent variables. RESULTS Compared with amyloid-negative subjects, amyloid-positive subjects exhibited effects on raw score, primacy, recency, and stopping time. Inter-syndromic differences were noted in raw score, primacy, recency, J-curve, and stopping time. DISCUSSION RAVLT measures are sensitive to the effects of amyloid and syndrome in EOAD. Future work is needed to quantify the predictive value of these scores. HIGHLIGHTS RAVLT patterns characterize various presentations of EOAD and EOnonAD. Amyloid impacts raw score, primacy, recency, and stopping time. Timing-based scores add value over traditional count-based scores.
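Primacy and recency scores of the kind named above can be sketched as the share of recalled items drawn from the ends of the study list. The cutoff `k` and the placeholder word list are assumptions for illustration, not the paper's exact definitions:

```python
def serial_position_scores(recalled, list_words, k=4):
    """Fraction of recalled list items drawn from the first k positions
    (primacy) and the last k positions (recency) of the study list."""
    pos = {w: i for i, w in enumerate(list_words)}
    hits = [pos[w] for w in recalled if w in pos]  # intrusions are ignored
    if not hits:
        return 0.0, 0.0
    n = len(list_words)
    primacy = sum(1 for i in hits if i < k) / len(hits)
    recency = sum(1 for i in hits if i >= n - k) / len(hits)
    return primacy, recency

# 15-item study list with placeholder words, as in an RAVLT-style trial
words = [f"w{i}" for i in range(15)]
recalled = ["w0", "w1", "w13", "w14", "w6"]
print(serial_position_scores(recalled, words))
```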
Affiliation(s)
- Justin Bushnell
- Department of Neurology, Indiana University School of Medicine, Indianapolis, Indiana, USA
- Dustin B. Hammers
- Department of Neurology, Indiana University School of Medicine, Indianapolis, Indiana, USA
- Paul Aisen
- Alzheimer’s Therapeutic Research Institute, University of Southern California, San Diego, California, USA
- Jeffrey L. Dage
- Department of Neurology, Indiana University School of Medicine, Indianapolis, Indiana, USA
- Ani Eloyan
- Department of Biostatistics, Center for Statistical Sciences, Brown University, Providence, Rhode Island, USA
- Tatiana Foroud
- Department of Medical and Molecular Genetics, Indiana University School of Medicine, Indianapolis, Indiana, USA
- Lea T. Grinberg
- Department of Pathology, University of California – San Francisco, San Francisco, California, USA
- Department of Neurology, University of California – San Francisco, San Francisco, California, USA
- Leonardo Iaccarino
- Department of Neurology, University of California – San Francisco, San Francisco, California, USA
- Kala Kirby
- Department of Neurology, Indiana University School of Medicine, Indianapolis, Indiana, USA
- Joel Kramer
- Department of Neurology, University of California – San Francisco, San Francisco, California, USA
- Robert Koeppe
- Department of Radiology, University of Michigan, Ann Arbor, Michigan, USA
- Walter A. Kukull
- Department of Epidemiology, University of Washington, Seattle, Washington, USA
- Renaud La Joie
- Department of Neurology, University of California – San Francisco, San Francisco, California, USA
- Nidhi S. Mundada
- Department of Neurology, University of California – San Francisco, San Francisco, California, USA
- Kelly Nudelman
- Department of Medical and Molecular Genetics, Indiana University School of Medicine, Indianapolis, Indiana, USA
- Malia Rumbaugh
- Department of Medical and Molecular Genetics, Indiana University School of Medicine, Indianapolis, Indiana, USA
- Arthur Toga
- Laboratory of Neuro Imaging, USC Stevens Neuroimaging and Informatics Institute, Keck School of Medicine of USC, Los Angeles, California, USA
- Alexandra Touroutoglou
- Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts, USA
- Alireza Atri
- Banner Sun Health Research Institute, Sun City, Arizona, USA
- Gregory S. Day
- Department of Neurology, Mayo Clinic, Jacksonville, Florida, USA
- Ranjan Duara
- Wien Center for Alzheimer’s Disease and Memory Disorders, Mount Sinai Medical Center, Miami, Florida, USA
- Lawrence S. Honig
- Taub Institute and Department of Neurology, Columbia University Irving Medical Center, New York, New York, USA
- David T. Jones
- Department of Radiology, Mayo Clinic, Rochester, Minnesota, USA
- Department of Neurology, Mayo Clinic, Rochester, Minnesota, USA
- Joseph Masdeu
- Nantz National Alzheimer Center, Houston Methodist and Weill Cornell Medicine, Houston, Texas, USA
- Mario Mendez
- Department of Neurology, David Geffen School of Medicine at UCLA, Los Angeles, California, USA
- Erik Musiek
- Department of Neurology, Washington University in St. Louis, St. Louis, Missouri, USA
- Chiadi U. Onyike
- Department of Psychiatry and Behavioral Sciences, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
- Meghan Riddle
- Department of Neurology, Alpert Medical School, Brown University, Providence, Rhode Island, USA
- Emily Rogalski
- Department of Psychiatry and Behavioral Sciences, Mesulam Center for Cognitive Neurology and Alzheimer’s Disease, Feinberg School of Medicine, Northwestern University, Chicago, Illinois, USA
- Steven Salloway
- Department of Neurology, Alpert Medical School, Brown University, Providence, Rhode Island, USA
- Sharon Sha
- Department of Neurology & Neurological Sciences, Stanford University, Palo Alto, California, USA
- Raymond S. Turner
- Department of Neurology, Georgetown University, Washington D.C., USA
- Thomas S. Wingo
- Department of Neurology and Human Genetics, Emory University School of Medicine, Atlanta, Georgia, USA
- David A. Wolk
- Department of Neurology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Maria C. Carrillo
- Medical & Scientific Relations Division, Alzheimer’s Association, Chicago, Illinois, USA
- Bradford C. Dickerson
- Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts, USA
- Gil D. Rabinovici
- Department of Neurology, University of California – San Francisco, San Francisco, California, USA
- Liana G. Apostolova
- Department of Neurology, Indiana University School of Medicine, Indianapolis, Indiana, USA
- David G. Clark
- Department of Neurology, Indiana University School of Medicine, Indianapolis, Indiana, USA
16
Gregory S, Harrison J, Herrmann J, Hunter M, Jenkins N, König A, Linz N, Luz S, Mallick E, Pullen H, Welstead M, Ruhmel S, Tröger J, Ritchie CW. Remote data collection speech analysis in people at risk for Alzheimer's disease dementia: usability and acceptability results. FRONTIERS IN DEMENTIA 2023; 2:1271156. [PMID: 39081993 PMCID: PMC11285540 DOI: 10.3389/frdem.2023.1271156] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/01/2023] [Accepted: 09/19/2023] [Indexed: 08/02/2024]
Abstract
Introduction Digital cognitive assessments are gathering importance for the decentralized remote clinical trials of the future. Before including such assessments in clinical trials, they must be tested to confirm feasibility and acceptability with the intended participant group. This study presents usability and acceptability data from the Speech on the Phone Assessment (SPeAk) study. Methods Participants (N = 68, mean age 70.43 years, 52.9% male) provided demographic data and completed baseline and 3-month follow-up phone-based assessments. The baseline visit was administered by a trained researcher and included a spontaneous speech assessment and a brief cognitive battery (immediate and delayed recall, digit span, and verbal fluency). The follow-up visit repeated the cognitive battery, which was administered by an automatic phone bot. Participants were randomized to receive their cognitive test results after the final study visit or after each study visit. Participants completed acceptability questionnaires electronically after each study visit. Results There was excellent retention (98.5%), few technical issues (n = 5), and good interrater reliability. Participants rated the assessment as acceptable, confirming the ease of use of the technology and their comfort in completing cognitive tasks on the phone. Participants generally reported feeling happy to receive the results of their cognitive tests, and this disclosure did not cause participants to feel worried. Discussion The results from this usability and acceptability analysis suggest that completing this brief battery of cognitive tests via a telephone call is both acceptable and feasible in a midlife-to-older adult population in the United Kingdom living at risk for Alzheimer's disease.
Affiliation(s)
- Sarah Gregory
- Edinburgh Dementia Prevention, Centre for Clinical Brain Sciences, University of Edinburgh, Edinburgh, United Kingdom
- John Harrison
- Scottish Brain Sciences, Edinburgh, United Kingdom
- Department of Neurology, Alzheimer Center Amsterdam, Amsterdam University Medical Centers, Vrije Universiteit, Amsterdam, Netherlands
- King's College London, Institute of Psychiatry, Psychology and Neuroscience, London, United Kingdom
- Matthew Hunter
- Edinburgh Dementia Prevention, Centre for Clinical Brain Sciences, University of Edinburgh, Edinburgh, United Kingdom
- Natalie Jenkins
- Edinburgh Dementia Prevention, Centre for Clinical Brain Sciences, University of Edinburgh, Edinburgh, United Kingdom
- Institute of Neuroscience and Psychology, University of Glasgow, Glasgow, United Kingdom
- Alexandra König
- ki:elements GmbH, Saarbrücken, Germany
- CoBTek (Cognition-Behaviour-Technology) Lab, Université Côte d'Azur, Nice, France
- Saturnino Luz
- Usher Institute, University of Edinburgh, Edinburgh, United Kingdom
- Hannah Pullen
- Edinburgh Dementia Prevention, Centre for Clinical Brain Sciences, University of Edinburgh, Edinburgh, United Kingdom
- Scottish Brain Sciences, Edinburgh, United Kingdom
- Miles Welstead
- Edinburgh Dementia Prevention, Centre for Clinical Brain Sciences, University of Edinburgh, Edinburgh, United Kingdom
- Scottish Brain Sciences, Edinburgh, United Kingdom
- Stephen Ruhmel
- Janssen Research & Development, LLC, Raritan, NJ, United States
- Craig W. Ritchie
- Edinburgh Dementia Prevention, Centre for Clinical Brain Sciences, University of Edinburgh, Edinburgh, United Kingdom
- Scottish Brain Sciences, Edinburgh, United Kingdom
17
Parsapoor M. AI-based assessments of speech and language impairments in dementia. Alzheimers Dement 2023; 19:4675-4687. [PMID: 37578167 DOI: 10.1002/alz.13395] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/01/2022] [Revised: 06/03/2023] [Accepted: 06/05/2023] [Indexed: 08/15/2023]
Abstract
Recent advancements in the artificial intelligence (AI) domain have revolutionized the early detection of cognitive impairments associated with dementia. This has motivated clinicians to use AI-powered dementia detection systems, particularly systems developed from individuals' speech and language, for quick and accurate identification of patients with dementia. This paper reviews articles about developing assessment tools using machine learning and deep learning algorithms trained on vocal and textual datasets.
Affiliation(s)
- Mahboobeh Parsapoor
- Centre de Recherche Informatique de Montréal: CRIM, Montreal, Quebec, Canada
18
He R, Chapin K, Al-Tamimi J, Bel N, Marquié M, Rosende-Roca M, Pytel V, Tartari JP, Alegret M, Sanabria A, Ruiz A, Boada M, Valero S, Hinzen W. Automated Classification of Cognitive Decline and Probable Alzheimer's Dementia Across Multiple Speech and Language Domains. AMERICAN JOURNAL OF SPEECH-LANGUAGE PATHOLOGY 2023; 32:2075-2086. [PMID: 37486774 DOI: 10.1044/2023_ajslp-22-00403] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 07/26/2023]
Abstract
BACKGROUND Decline in language has emerged as a new potential biomarker for the early detection of Alzheimer's disease (AD). It remains unclear how sensitive language measures are across different tasks, language domains, and languages, and to what extent changes can be reliably detected in early stages such as subjective cognitive decline (SCD) and mild cognitive impairment (MCI). METHOD Using a scene construction task for speech elicitation in a new Spanish/Catalan-speaking cohort (N = 119), we automatically extracted features across seven domains: three acoustic (spectral, cepstral, and voice quality), one prosodic, and three textual (morpholexical, semantic, and syntactic). They were forwarded to a random forest classifier to evaluate the discriminability of participants with probable AD dementia, amnestic and nonamnestic MCI, SCD, and cognitively healthy controls. Repeated-measures analyses of variance and paired-samples Wilcoxon signed-ranks tests were used to assess whether and how performance differed significantly across groups and linguistic domains. RESULTS The performance scores of the machine learning classifier were generally high, with the best scores over .9. Model performance differed significantly across linguistic domains (p < .001) and between speech and text (p = .043), with speech features outperforming textual features and voice quality performing best. High diagnostic classification accuracies were seen even within the cognitively healthy (controls vs. SCD) and MCI (amnestic vs. nonamnestic) groups. CONCLUSION Speech-based machine learning is powerful in detecting cognitive decline and probable AD dementia across a range of feature domains, though important differences exist between these domains as well. SUPPLEMENTAL MATERIAL https://doi.org/10.23641/asha.23699733.
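The per-domain comparison above — train a classifier on each feature domain and compare accuracies — can be illustrated with a deliberately simple stand-in for the paper's random forest: a leave-one-out nearest-centroid classifier on synthetic two-feature samples. Everything below is a sketch, not the study's pipeline:

```python
def nearest_centroid_loo(samples, labels):
    """Leave-one-out accuracy of a nearest-centroid classifier.
    Each sample is a tuple of features from one domain."""
    correct = 0
    for i, x in enumerate(samples):
        centroids = {}
        for lab in set(labels):
            pts = [s for j, (s, l) in enumerate(zip(samples, labels))
                   if l == lab and j != i]
            centroids[lab] = [sum(c) / len(c) for c in zip(*pts)]
        pred = min(centroids,
                   key=lambda lab: sum((a - b) ** 2
                                       for a, b in zip(x, centroids[lab])))
        correct += pred == labels[i]
    return correct / len(samples)

# synthetic well-separated "voice quality" features for two groups
controls = [(1.0, 0.9), (1.1, 1.0), (0.9, 1.1)]
patients = [(2.0, 2.1), (2.2, 1.9), (2.1, 2.0)]
samples = controls + patients
labels = ["hc"] * 3 + ["ad"] * 3
print(nearest_centroid_loo(samples, labels))
```

Running this per feature domain and comparing the resulting accuracies mirrors the speech-versus-text comparison reported above, at toy scale.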
Affiliation(s)
- Rui He
- Department of Translation and Language Sciences, Universitat Pompeu Fabra, Barcelona, Spain
- Kayla Chapin
- Department of Translation and Language Sciences, Universitat Pompeu Fabra, Barcelona, Spain
- Jalal Al-Tamimi
- Laboratoire de Linguistique Formelle (LLF), CNRS, Université Paris Cité, France
- Núria Bel
- Department of Translation and Language Sciences, Universitat Pompeu Fabra, Barcelona, Spain
- Marta Marquié
- Ace Alzheimer Center Barcelona, Universitat Internacional de Catalunya, Spain
- Networking Research Center on Neurodegenerative Diseases (CIBERNED), Instituto de Salud Carlos III, Madrid, Spain
- Maitee Rosende-Roca
- Ace Alzheimer Center Barcelona, Universitat Internacional de Catalunya, Spain
- Vanesa Pytel
- Ace Alzheimer Center Barcelona, Universitat Internacional de Catalunya, Spain
- Juan Pablo Tartari
- Ace Alzheimer Center Barcelona, Universitat Internacional de Catalunya, Spain
- Montse Alegret
- Ace Alzheimer Center Barcelona, Universitat Internacional de Catalunya, Spain
- Networking Research Center on Neurodegenerative Diseases (CIBERNED), Instituto de Salud Carlos III, Madrid, Spain
- Angela Sanabria
- Ace Alzheimer Center Barcelona, Universitat Internacional de Catalunya, Spain
- Networking Research Center on Neurodegenerative Diseases (CIBERNED), Instituto de Salud Carlos III, Madrid, Spain
- Agustín Ruiz
- Ace Alzheimer Center Barcelona, Universitat Internacional de Catalunya, Spain
- Networking Research Center on Neurodegenerative Diseases (CIBERNED), Instituto de Salud Carlos III, Madrid, Spain
- Mercè Boada
- Ace Alzheimer Center Barcelona, Universitat Internacional de Catalunya, Spain
- Networking Research Center on Neurodegenerative Diseases (CIBERNED), Instituto de Salud Carlos III, Madrid, Spain
- Sergi Valero
- Ace Alzheimer Center Barcelona, Universitat Internacional de Catalunya, Spain
- Networking Research Center on Neurodegenerative Diseases (CIBERNED), Instituto de Salud Carlos III, Madrid, Spain
- Wolfram Hinzen
- Department of Translation and Language Sciences, Universitat Pompeu Fabra, Barcelona, Spain
- Institució Catalana de Recerca i Estudis Avançats, Barcelona, Spain
19
García-Gutiérrez F, Marquié M, Muñoz N, Alegret M, Cano A, de Rojas I, García-González P, Olivé C, Puerta R, Orellana A, Montrreal L, Pytel V, Ricciardi M, Zaldua C, Gabirondo P, Hinzen W, Lleonart N, García-Sánchez A, Tárraga L, Ruiz A, Boada M, Valero S. Harnessing acoustic speech parameters to decipher amyloid status in individuals with mild cognitive impairment. Front Neurosci 2023; 17:1221401. [PMID: 37746151 PMCID: PMC10512723 DOI: 10.3389/fnins.2023.1221401] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/12/2023] [Accepted: 08/08/2023] [Indexed: 09/26/2023] Open
Abstract
Alzheimer's disease (AD) is a neurodegenerative condition characterized by a gradual decline in cognitive functions. Currently, there are no effective treatments for AD, underscoring the importance of identifying individuals in the preclinical stages of mild cognitive impairment (MCI) to enable early interventions. Among the neuropathological events associated with the onset of the disease is the accumulation of amyloid protein in the brain, which correlates with decreased levels of Aβ42 peptide in the cerebrospinal fluid (CSF). Consequently, the development of non-invasive, low-cost, and easy-to-administer proxies for detecting Aβ42 positivity in CSF becomes particularly valuable. A promising approach to achieve this is spontaneous speech analysis, which combined with machine learning (ML) techniques, has proven highly useful in AD. In this study, we examined the relationship between amyloid status in CSF and acoustic features derived from the description of the Cookie Theft picture in MCI patients from a memory clinic. The cohort consisted of fifty-two patients with MCI (mean age 73 years, 65% female, and 57% positive amyloid status). Eighty-eight acoustic parameters were extracted from voice recordings using the extended Geneva Minimalistic Acoustic Parameter Set (eGeMAPS), and several ML models were used to classify the amyloid status. Furthermore, interpretability techniques were employed to examine the influence of input variables on the determination of amyloid-positive status. The best model, based on acoustic variables, achieved an accuracy of 75% with an area under the curve (AUC) of 0.79 in the prediction of amyloid status evaluated by bootstrapping and Leave-One-Out Cross Validation (LOOCV), outperforming conventional neuropsychological tests (AUC = 0.66). Our results showed that the automated analysis of voice recordings derived from spontaneous speech tests offers valuable insights into AD biomarkers during the preclinical stages. 
These findings introduce novel possibilities for the use of digital biomarkers to identify subjects at high risk of developing AD.
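The AUC reported above can be computed nonparametrically as the probability that a randomly chosen positive case outscores a randomly chosen negative one (ties counting half). A minimal sketch with synthetic scores, not the study's data:

```python
def auc(scores, labels):
    """Empirical AUC-ROC: P(score_pos > score_neg), ties counted as 1/2.
    labels are 1 (e.g., amyloid-positive) or 0 (amyloid-negative)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# synthetic classifier scores and amyloid statuses
print(auc([0.9, 0.35, 0.8, 0.4, 0.2], [1, 0, 1, 0, 1]))
```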
Affiliation(s)
- Marta Marquié
- Ace Alzheimer Center Barcelona, Universitat Internacional de Catalunya, Barcelona, Spain
- Networking Research Center on Neurodegenerative Diseases (CIBERNED), Instituto de Salud Carlos III, Madrid, Spain
- Nathalia Muñoz
- Ace Alzheimer Center Barcelona, Universitat Internacional de Catalunya, Barcelona, Spain
- Montserrat Alegret
- Ace Alzheimer Center Barcelona, Universitat Internacional de Catalunya, Barcelona, Spain
- Networking Research Center on Neurodegenerative Diseases (CIBERNED), Instituto de Salud Carlos III, Madrid, Spain
- Amanda Cano
- Ace Alzheimer Center Barcelona, Universitat Internacional de Catalunya, Barcelona, Spain
- Networking Research Center on Neurodegenerative Diseases (CIBERNED), Instituto de Salud Carlos III, Madrid, Spain
- Itziar de Rojas
- Ace Alzheimer Center Barcelona, Universitat Internacional de Catalunya, Barcelona, Spain
- Networking Research Center on Neurodegenerative Diseases (CIBERNED), Instituto de Salud Carlos III, Madrid, Spain
- Pablo García-González
- Ace Alzheimer Center Barcelona, Universitat Internacional de Catalunya, Barcelona, Spain
- Clàudia Olivé
- Ace Alzheimer Center Barcelona, Universitat Internacional de Catalunya, Barcelona, Spain
- Raquel Puerta
- Ace Alzheimer Center Barcelona, Universitat Internacional de Catalunya, Barcelona, Spain
- Adelina Orellana
- Ace Alzheimer Center Barcelona, Universitat Internacional de Catalunya, Barcelona, Spain
- Networking Research Center on Neurodegenerative Diseases (CIBERNED), Instituto de Salud Carlos III, Madrid, Spain
- Laura Montrreal
- Ace Alzheimer Center Barcelona, Universitat Internacional de Catalunya, Barcelona, Spain
- Vanesa Pytel
- Ace Alzheimer Center Barcelona, Universitat Internacional de Catalunya, Barcelona, Spain
- Mario Ricciardi
- Ace Alzheimer Center Barcelona, Universitat Internacional de Catalunya, Barcelona, Spain
- Wolfram Hinzen
- Department of Translation and Language Sciences, Universitat Pompeu Fabra, Barcelona, Spain
- Institut Català de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
- Núria Lleonart
- Ace Alzheimer Center Barcelona, Universitat Internacional de Catalunya, Barcelona, Spain
- Ainhoa García-Sánchez
- Ace Alzheimer Center Barcelona, Universitat Internacional de Catalunya, Barcelona, Spain
- Lluís Tárraga
- Ace Alzheimer Center Barcelona, Universitat Internacional de Catalunya, Barcelona, Spain
- Networking Research Center on Neurodegenerative Diseases (CIBERNED), Instituto de Salud Carlos III, Madrid, Spain
- Agustín Ruiz
- Ace Alzheimer Center Barcelona, Universitat Internacional de Catalunya, Barcelona, Spain
- Networking Research Center on Neurodegenerative Diseases (CIBERNED), Instituto de Salud Carlos III, Madrid, Spain
- Mercè Boada
- Ace Alzheimer Center Barcelona, Universitat Internacional de Catalunya, Barcelona, Spain
- Networking Research Center on Neurodegenerative Diseases (CIBERNED), Instituto de Salud Carlos III, Madrid, Spain
- Sergi Valero
- Ace Alzheimer Center Barcelona, Universitat Internacional de Catalunya, Barcelona, Spain
- Networking Research Center on Neurodegenerative Diseases (CIBERNED), Instituto de Salud Carlos III, Madrid, Spain
20
Zolnoori M, Zolnour A, Topaz M. ADscreen: A speech processing-based screening system for automatic identification of patients with Alzheimer's disease and related dementia. Artif Intell Med 2023; 143:102624. [PMID: 37673583 PMCID: PMC10483114 DOI: 10.1016/j.artmed.2023.102624] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/09/2022] [Revised: 06/22/2023] [Accepted: 07/08/2023] [Indexed: 09/08/2023]
Abstract
Alzheimer's disease and related dementias (ADRD) present a looming public health crisis, affecting roughly 5 million people and 11% of older adults in the United States. Despite nationwide efforts for timely diagnosis of patients with ADRD, >50% of them are not diagnosed and are unaware of their disease. To address this challenge, we developed ADscreen, an innovative speech-processing-based ADRD screening algorithm for the proactive identification of patients with ADRD. ADscreen consists of five major components: (i) noise reduction for reducing background noises from the audio-recorded patient speech, (ii) modeling the patient's ability in phonetic motor planning using acoustic parameters of the patient's voice, (iii) modeling the patient's ability at the semantic and syntactic levels of language organization using linguistic parameters of the patient's speech, (iv) extracting vocal and semantic psycholinguistic cues from the patient's speech, and (v) building and evaluating the screening algorithm. To identify important speech parameters (features) associated with ADRD, we used Joint Mutual Information Maximization (JMIM), an effective feature selection method for high-dimensional, small sample size datasets. Modeling the relationship between speech parameters and the outcome variable (presence/absence of ADRD) was conducted using three different machine learning (ML) architectures with the capability of joining informative acoustic and linguistic features with contextual word embedding vectors obtained from DistilBERT (a distilled Bidirectional Encoder Representations from Transformers model). We evaluated the performance of ADscreen on audio-recorded patients' speech (verbal descriptions) for the Cookie Theft picture description task, which is publicly available in DementiaBank.
The joint fusion of acoustic and linguistic parameters with contextual word embedding vectors of DistilBERT achieved F1-score = 84.64 (standard deviation [std] = ±3.58) and AUC-ROC = 92.53 (std = ±3.34) for training dataset, and F1-score = 89.55 and AUC-ROC = 93.89 for the test dataset. In summary, ADscreen has a strong potential to be integrated with clinical workflow to address the need for an ADRD screening tool so that patients with cognitive impairment can receive appropriate and timely care.
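F1, the headline metric above, is the harmonic mean of precision and recall. A minimal sketch from confusion-matrix counts (the counts below are illustrative, not the study's):

```python
def f1_score(tp, fp, fn):
    """F1 = 2 * precision * recall / (precision + recall),
    from true-positive, false-positive, and false-negative counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# illustrative counts: 60 ADRD cases caught, 10 false alarms, 12 missed
print(f"F1 = {f1_score(60, 10, 12):.4f}")
```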
Affiliation(s)
- Maryam Zolnoori
- Columbia University Medical Center, New York, NY, United States of America; School of Nursing, Columbia University, New York, NY, United States of America.
- Ali Zolnour
- School of Electrical and Computer Engineering, University of Tehran, Tehran, Iran
- Maxim Topaz
- Columbia University Medical Center, New York, NY, United States of America; School of Nursing, Columbia University, New York, NY, United States of America

21
Moya-Galé G, Wisler AA, Walsh SJ, McAuliffe MJ, Levy ES. Acoustic Predictors of Ease of Understanding in Spanish Speakers With Dysarthria Associated With Parkinson's Disease. J Speech Lang Hear Res 2023; 66:2999-3012. [PMID: 36508721] [DOI: 10.1044/2022_jslhr-22-00284]
Abstract
PURPOSE The purpose of this study was to examine selected baseline acoustic features of hypokinetic dysarthria in Spanish speakers with Parkinson's disease (PD) and identify potential acoustic predictors of ease of understanding in Spanish. METHOD Seventeen Spanish-speaking individuals with mild-to-moderate hypokinetic dysarthria secondary to PD and eight healthy controls were recorded reading a translation of the Rainbow Passage. Acoustic measures of vowel space area, as indicated by the formant centralization ratio (FCR), envelope modulation spectra (EMS), and articulation rate were derived from the speech samples. Additionally, 15 healthy adults rated ease of understanding of the recordings on a visual analogue scale. A multiple linear regression model was implemented to investigate the predictive value of the selected acoustic parameters on ease of understanding. RESULTS Listeners' ease of understanding was significantly lower for speakers with dysarthria than for healthy controls. The FCR, EMS from the first 10 s of the reading passage, and the difference in EMS between the end and the beginning sections of the passage differed significantly between the two groups of speakers. Findings indicated that 67.7% of the variability in ease of understanding was explained by the predictive model, suggesting a moderately strong relationship between the acoustic and perceptual domains. CONCLUSIONS Measures of envelope modulation spectra were found to be highly significant model predictors of ease of understanding of Spanish-speaking individuals with hypokinetic dysarthria associated with PD. Articulation rate was also found to be important (albeit to a lesser degree) in the predictive model. The formant centralization ratio should be further examined with a larger sample size and more severe dysarthria to determine its efficacy in predicting ease of understanding.
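The formant centralization ratio used as a predictor in this study combines first and second formants of the corner vowels /i/, /u/, and /a/; vowel centralization (as in hypokinetic dysarthria) pushes the ratio upward. A sketch of the standard formula, FCR = (F2u + F2a + F1i + F1u) / (F2i + F1a), with illustrative formant values in Hz (the values are assumptions for demonstration, not the study's data):

```python
def formant_centralization_ratio(f1_i, f1_u, f1_a, f2_i, f2_u, f2_a):
    """FCR = (F2u + F2a + F1i + F1u) / (F2i + F1a).

    An expanded vowel space keeps the ratio low (near or below ~1);
    centralized vowels raise it, because the numerator collects formants
    that rise with centralization and the denominator collects those
    that fall.
    """
    return (f2_u + f2_a + f1_i + f1_u) / (f2_i + f1_a)


# Illustrative corner-vowel formants (Hz): a roughly typical speaker
# versus a speaker with centralized vowels.
typical = formant_centralization_ratio(
    f1_i=300, f1_u=350, f1_a=750, f2_i=2300, f2_u=800, f2_a=1300)
centralized = formant_centralization_ratio(
    f1_i=400, f1_u=420, f1_a=650, f2_i=1900, f2_u=1100, f2_a=1350)
```

With these numbers the typical speaker scores about 0.90 and the centralized speaker about 1.28, matching the direction reported in the dysarthria literature.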
Affiliation(s)
- Erika S Levy
- Teachers College, Columbia University, New York, NY

22
Oh C, Morris R, Wang X, Raskin MS. Analysis of emotional prosody as a tool for differential diagnosis of cognitive impairments: a pilot research. Front Psychol 2023; 14:1129406. [PMID: 37425151] [PMCID: PMC10327638] [DOI: 10.3389/fpsyg.2023.1129406]
Abstract
Introduction This pilot study was designed to investigate whether prosodic features of running spontaneous speech could differentiate dementia of the Alzheimer's type (DAT), vascular dementia (VaD), mild cognitive impairment (MCI), and healthy cognition. The study included acoustic measurements of prosodic features (Study 1) and listeners' perception of emotional prosody differences (Study 2). Methods For Study 1, prerecorded speech samples describing the Cookie Theft picture from 10 individuals with DAT, 5 with VaD, 9 with MCI, and 10 neurologically healthy controls (NHC) were obtained from DementiaBank. Each participant's descriptive narrative was separated into utterances. These utterances were measured on 22 acoustic features via the Praat software and analyzed statistically using principal component analysis (PCA), regression, and Mahalanobis distance measures. Results The analyses of the acoustic data revealed a set of five factors and four salient features (i.e., pitch, amplitude, rate, and syllable) that discriminate the four groups. For Study 2, a group of 28 listeners served as judges of the emotions expressed by the speakers. After a set of training and practice sessions, they were instructed to indicate the emotions they heard. Regression measures were used to analyze the perceptual data, which indicated that the factor underlying pitch measures had the greatest strength for listeners in separating the groups. Discussion The present pilot work showed that acoustic measures of prosodic features may be a functional method for differentiating among DAT, VaD, MCI, and NHC. Future studies with data collected in a controlled environment using better stimuli are warranted.
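The Mahalanobis-distance step in Study 1 can be sketched as follows: each diagnostic group defines a distribution over acoustic features, and a new utterance's feature vector is assigned to the group whose distribution it lies closest to, with the covariance accounting for correlated features. A minimal NumPy sketch with synthetic 2-D data (the real study used 22 Praat-derived features; group names here are illustrative):

```python
import numpy as np

def mahalanobis(x, group):
    """Mahalanobis distance from feature vector x to the empirical
    distribution of `group` (rows = utterances, columns = features)."""
    mu = group.mean(axis=0)
    cov = np.cov(group, rowvar=False)   # feature covariance of the group
    diff = x - mu
    return float(np.sqrt(diff @ np.linalg.inv(cov) @ diff))

def nearest_group(x, groups):
    """Assign x to the diagnostic group at the smallest Mahalanobis distance."""
    return min(groups, key=lambda name: mahalanobis(x, groups[name]))
```

Unlike plain Euclidean distance, this scales each direction by the group's own variance, so a feature the group varies widely on counts for less.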
Affiliation(s)
- Chorong Oh
- School of Rehabilitation and Communication Sciences, Ohio University, Athens, OH, United States
- Richard Morris
- School of Communication Science and Disorders, Florida State University, Tallahassee, FL, United States
- Xianhui Wang
- School of Medicine, University of California Irvine, Irvine, CA, United States
- Morgan S. Raskin
- School of Communication Science and Disorders, Florida State University, Tallahassee, FL, United States

23
Martínez-Nicolás I, Martínez-Sánchez F, Ivanova O, Meilán JJG. Reading and lexical-semantic retrieval tasks outperforms single task speech analysis in the screening of mild cognitive impairment and Alzheimer's disease. Sci Rep 2023; 13:9728. [PMID: 37322073] [PMCID: PMC10272227] [DOI: 10.1038/s41598-023-36804-y]
Abstract
Age-related cognitive impairments have increased dramatically in recent years, raising interest in the development of screening tools for mild cognitive impairment and Alzheimer's disease. Speech analysis exploits the behavioral consequences of cognitive deficits on the patient's vocal performance, making it possible to identify pathologies that affect speech production, such as dementia. Previous studies have further shown that the speech task used determines how the speech parameters are altered. We aim to combine the impairments observed across several speech production tasks in order to improve the accuracy of screening through speech analysis. The sample consists of 72 participants divided into three equal groups (healthy older adults, people with mild cognitive impairment, and people with Alzheimer's disease), matched by age and education. A complete neuropsychological assessment and two voice recordings were performed. The tasks required the participants to read a text and to complete sentences with semantic information. A stepwise linear discriminant analysis was performed to select speech parameters with discriminative power. The discriminant functions obtained an accuracy of 83.3% in the simultaneous classification of several levels of cognitive impairment, making this a promising screening tool for dementia.
Affiliation(s)
- Olga Ivanova
- Faculty of Philology, University of Salamanca, 37008, Salamanca, Spain
- Juan J G Meilán
- Faculty of Psychology, University of Salamanca, 37008, Salamanca, Spain
- Institute of Neuroscience of Castilla y León, 37007, Salamanca, Spain

24
Yamada Y, Shinkawa K, Nemoto M, Nemoto K, Arai T. A mobile application using automatic speech analysis for classifying Alzheimer's disease and mild cognitive impairment. Comput Speech Lang 2023. [DOI: 10.1016/j.csl.2023.101514]
25
Mefford JA, Zhao Z, Heilier L, Xu M, Zhou G, Mace R, Sloane KL, Sheppard SM, Glenn S. Varied performance of picture description task as a screening tool across MCI subtypes. PLOS Digit Health 2023; 2:e0000197. [PMID: 36913425] [PMCID: PMC10010512] [DOI: 10.1371/journal.pdig.0000197]
Abstract
A picture description task is a component of Miro Health's platform for self-administration of neurobehavioral assessments. Picture description has been used as a screening tool for the identification of individuals with Alzheimer's disease and mild cognitive impairment (MCI), but it currently requires in-person administration and scoring by someone with access to, and familiarity with, a scoring rubric. The Miro Health implementation allows broader use of this assessment through self-administration and automated processing, analysis, and scoring to deliver clinically useful quantifications of the user's speech production, vocal characteristics, and language. Picture description responses were collected from 62 healthy controls (HC) and 33 participants with MCI: 18 with amnestic MCI (aMCI) and 15 with non-amnestic MCI (naMCI). Speech and language features, and contrasts between pairs of features, were evaluated for differences in their distributions across the participant subgroups. Picture description features were selected and combined using penalized logistic regression to form risk scores for classification of HC versus MCI as well as HC versus specific MCI subtypes. A picture-description-based risk score distinguishes MCI and HC with an area under the receiver operating characteristic curve (AUROC) of 0.74. When contrasting specific subtypes of MCI and HC, the classifiers have an AUROC of 0.88 for aMCI versus HC and an AUROC of 0.61 for naMCI versus HC. Tests of association of individual features, or contrasts of pairs of features, with HC versus aMCI identified 20 features with p-values below 5e-3 and false discovery rates (FDRs) at or below 0.113, and 61 contrasts with p-values below 5e-4 and FDRs at or below 0.132. The findings suggest that the performance of picture description as a screening tool for MCI detection will vary greatly by MCI subtype, or by the proportion of the various subtypes in an undifferentiated MCI population.
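The AUROC figures reported here have a direct probabilistic reading: the chance that a randomly chosen case receives a higher risk score than a randomly chosen control (ties counting half). A sketch of that rank-based (Mann-Whitney) computation, with toy scores standing in for the study's risk scores:

```python
def auroc(case_scores, control_scores):
    """AUROC via the Mann-Whitney identity: the fraction of
    (case, control) pairs where the case's risk score is higher,
    with tied pairs counting one half."""
    wins = 0.0
    for c in case_scores:
        for h in control_scores:
            if c > h:
                wins += 1.0
            elif c == h:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))
```

An AUROC of 0.5 means the score carries no ranking information (as the naMCI result of 0.61 approaches), while 1.0 means every case outranks every control.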
Affiliation(s)
- Joel A. Mefford
- Department of Neurology, University of California, Los Angeles, California, United States of America
- Zilong Zhao
- Miro Health, Inc., San Francisco, California, United States of America
- Leah Heilier
- Miro Health, Inc., San Francisco, California, United States of America
- Man Xu
- Miro Health, Inc., San Francisco, California, United States of America
- Guifeng Zhou
- Miro Health, Inc., San Francisco, California, United States of America
- Rachel Mace
- Department of Neurology, Johns Hopkins University School of Medicine, Baltimore, Maryland, United States of America
- Kelly L. Sloane
- Department of Neurology, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Shannon M. Sheppard
- Department of Communication Sciences & Disorders, Chapman University, Orange, California, United States of America
- Shenly Glenn
- Miro Health, Inc., San Francisco, California, United States of America

26
Chen Y, Ma S, Yang X, Liu D, Yang J. Screening Children's Intellectual Disabilities with Phonetic Features, Facial Phenotype and Craniofacial Variability Index. Brain Sci 2023; 13:155. [PMID: 36672135] [PMCID: PMC9857173] [DOI: 10.3390/brainsci13010155]
Abstract
BACKGROUND Intellectual Disability (ID) is a developmental deficiency syndrome caused by congenital disease or postnatal events. With efficient early screening, intervention can begin as early as possible, which may improve patients' condition and enhance their capacity for self-care. Early screening for ID is usually conducted by clinical interview, which requires the in-depth participation of medical professionals and related medical resources. METHODS We propose a new method for screening ID that analyzes the facial phenotype and phonetic characteristics of young subjects. First, geometric features of the subjects' faces and phonetic features of their voices are extracted from interview videos; the craniofacial variability index (CVI) is then calculated from the geometric features, and the risk of ID is assessed from the CVI. Furthermore, machine learning algorithms are used to build a further screening method based on the facial and phonetic features. RESULTS The proposed method was evaluated using three feature sets: geometric features, CVI features, and phonetic features. The best accuracy achieved was close to 80%. CONCLUSIONS The results across the three feature sets suggest that, after continued improvement, the proposed method may be applicable in clinical settings.
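The craniofacial variability index is conventionally computed as the standard deviation of the z-scores of a set of craniofacial measurements against age- and sex-matched normative data: a face whose measurements deviate from norms in inconsistent directions yields a high CVI. A sketch under those assumptions (population standard deviation; the measurement names are illustrative, not the paper's protocol):

```python
from math import sqrt

def craniofacial_variability_index(measurements, norm_means, norm_sds):
    """CVI: standard deviation (population form) of the z-scores of each
    craniofacial measurement relative to normative data.

    All three arguments are dicts keyed by measurement name; `norm_means`
    and `norm_sds` hold the age-/sex-matched normative mean and SD.
    """
    z = [(measurements[m] - norm_means[m]) / norm_sds[m] for m in measurements]
    mean_z = sum(z) / len(z)
    return sqrt(sum((v - mean_z) ** 2 for v in z) / len(z))
```

A subject whose every measurement sits exactly on the norms, or is uniformly shifted, scores a CVI of 0; it is the spread among z-scores, not their size, that the index measures.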
Affiliation(s)
- Yuhe Chen
- School of Foreign Languages, Huazhong University of Science and Technology, Wuhan 430074, China
- Simeng Ma
- Department of Psychiatry, Renmin Hospital of Wuhan University, Wuhan 430060, China
- Xiaoyu Yang
- Department of Pharmacy, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430030, China
- Hubei Province Clinical Research Center for Precision Medicine for Critical Illness, Wuhan 430030, China
- Dujuan Liu
- Department of Psychiatry, Renmin Hospital of Wuhan University, Wuhan 430060, China
- Correspondence: (D.L.); (J.Y.)
- Jun Yang
- School of Computer Science & Technology, Huazhong University of Science and Technology, Wuhan 430074, China
- School of Information Engineering, Wuhan University of Technology, Wuhan 430070, China
- Correspondence: (D.L.); (J.Y.)

27
Hu H, Li J, He S, Zhao Y, Liu P, Liu H. Aging-related decline in the neuromotor control of speech production: current and future. Front Aging Neurosci 2023; 15:1172277. [PMID: 37151845] [PMCID: PMC10156980] [DOI: 10.3389/fnagi.2023.1172277]
Affiliation(s)
- Huijing Hu
- Institute of Medical Research, Northwestern Polytechnical University, Xi'an, China
- Jingting Li
- Department of Rehabilitation Medicine, The First Affiliated Hospital, Sun Yat-sen University, Guangzhou, China
- Sixuan He
- Institute of Medical Research, Northwestern Polytechnical University, Xi'an, China
- Yan Zhao
- Department of Rehabilitation Medicine, The First Affiliated Hospital, Sun Yat-sen University, Guangzhou, China
- Peng Liu
- Department of Rehabilitation Medicine, The First Affiliated Hospital, Sun Yat-sen University, Guangzhou, China
- Hanjun Liu
- Department of Rehabilitation Medicine, The First Affiliated Hospital, Sun Yat-sen University, Guangzhou, China
- Guangdong Provincial Key Laboratory of Brain Function and Disease, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, China
- Correspondence: Hanjun Liu

28
Mahon E, Lachman ME. Voice biomarkers as indicators of cognitive changes in middle and later adulthood. Neurobiol Aging 2022; 119:22-35. [PMID: 35964541] [PMCID: PMC9487188] [DOI: 10.1016/j.neurobiolaging.2022.06.010]
Abstract
Voice prosody measures have been linked with Alzheimer's disease (AD), but it is unclear whether they are associated with normal cognitive aging. We assessed relationships between voice measures and 10-year cognitive changes in the MIDUS national sample of middle-aged and older adults ages 42-92, with a mean age of 64.09 (standard deviation = 11.23) at the second wave. Seven cognitive tests were assessed in 2003-2004 (Wave 2) and 2013-2014 (Wave 3). Voice measures were collected at Wave 3 (N = 2585) from audio recordings of the cognitive interviews. Analyses controlled for age, education, depressive symptoms, and health. As predicted, higher jitter was associated with greater declines in episodic memory, verbal fluency, and attention switching. Lower pulse was related to greater decline in episodic memory, and fewer voice breaks were related to greater declines in episodic memory and verbal fluency, although the direction of these effects was contrary to hypotheses. Findings suggest that voice biomarkers may offer a promising approach for early detection of risk factors for cognitive impairment or AD.
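Jitter, the biomarker behind several of these associations, quantifies cycle-to-cycle instability of vocal-fold vibration. In its common "local" form it is the mean absolute difference between consecutive glottal cycle durations divided by the mean cycle duration, usually reported as a percentage. A sketch of that calculation (Praat and similar tools add voicing checks and period-detection steps omitted here):

```python
def local_jitter(periods_ms):
    """Local jitter (%): mean absolute difference between consecutive
    glottal cycle durations, divided by the mean cycle duration.

    `periods_ms`: durations of successive glottal cycles, e.g. in ms.
    """
    diffs = [abs(b - a) for a, b in zip(periods_ms, periods_ms[1:])]
    mean_abs_diff = sum(diffs) / len(diffs)
    mean_period = sum(periods_ms) / len(periods_ms)
    return 100.0 * mean_abs_diff / mean_period
```

A perfectly periodic voice scores 0%; healthy voices typically stay around or below 1%, so the alternating toy sequence below is pathologically unstable by design.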
Affiliation(s)
- Elizabeth Mahon
- Brandeis University, Department of Psychology, Waltham, MA, USA.

29
Yamada Y, Shinkawa K, Nemoto M, Ota M, Nemoto K, Arai T. Speech and language characteristics differentiate Alzheimer's disease and dementia with Lewy bodies. Alzheimers Dement (Amst) 2022; 14:e12364. [PMID: 36320609] [PMCID: PMC9614050] [DOI: 10.1002/dad2.12364]
Abstract
Introduction Early differential diagnosis of Alzheimer's disease (AD) and dementia with Lewy bodies (DLB) is important, but it remains challenging. Different profiles of speech and language impairments between AD and DLB have been suggested, but direct comparisons have not been investigated. Methods We collected speech responses from 121 older adults comprising AD, DLB, and cognitively normal (CN) groups and investigated their acoustic, prosodic, and linguistic features. Results The AD group showed larger differences from the CN group than the DLB group in linguistic features, while the DLB group showed larger differences in prosodic and acoustic features. Machine-learning classifiers using these speech features achieved 87.0% accuracy for AD versus CN, 93.2% for DLB versus CN, and 87.4% for AD versus DLB. Discussion Our findings indicate the discriminative differences in speech features in AD and DLB and the feasibility of using these features in combination as a screening tool for identifying/differentiating AD and DLB.
Affiliation(s)
- Miyuki Nemoto
- Department of Psychiatry, Division of Clinical Medicine, Faculty of Medicine, University of Tsukuba, Tsukuba, Ibaraki, Japan
- Miho Ota
- Department of Psychiatry, Division of Clinical Medicine, Faculty of Medicine, University of Tsukuba, Tsukuba, Ibaraki, Japan
- Kiyotaka Nemoto
- Department of Psychiatry, Division of Clinical Medicine, Faculty of Medicine, University of Tsukuba, Tsukuba, Ibaraki, Japan
- Tetsuaki Arai
- Department of Psychiatry, Division of Clinical Medicine, Faculty of Medicine, University of Tsukuba, Tsukuba, Ibaraki, Japan

30
Diaz-Asper M, Holmlund TB, Chandler C, Diaz-Asper C, Foltz PW, Cohen AS, Elvevåg B. Using automated syllable counting to detect missing information in speech transcripts from clinical settings. Psychiatry Res 2022; 315:114712. [PMID: 35839638] [PMCID: PMC9378537] [DOI: 10.1016/j.psychres.2022.114712]
Abstract
Speech rate and quantity reflect clinical state; thus automated transcription holds potential clinical applications. We describe two datasets where recording quality and speaker characteristics affected transcription accuracy. Transcripts of low-quality recordings omitted significant portions of speech. An automated syllable counter estimated actual speech output and quantified the amount of missing information. The efficacy of this method differed by audio quality: the correlation between missing syllables and word error rate was only significant when quality was low. Automatically counting syllables could be useful to measure and flag transcription omissions in clinical contexts where speaker characteristics and recording quality are problematic.
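The syllable-counting idea can be illustrated with a rough orthographic heuristic: count contiguous vowel groups per word, sum over the transcript, and compare against the syllable total estimated from the audio to flag likely transcription omissions. This is a toy approximation under stated assumptions (English orthography, a simple silent-'e' rule), not the study's dedicated counter:

```python
import re

VOWEL_GROUP = re.compile(r"[aeiouy]+")

def count_syllables(word):
    """Rough orthographic syllable count: contiguous vowel groups,
    minus a final silent 'e' following a consonant; floor of one."""
    w = word.lower().strip(".,;:!?\"'")
    n = len(VOWEL_GROUP.findall(w))
    if w.endswith("e") and len(w) > 2 and w[-2] not in "aeiouy" and n > 1:
        n -= 1  # e.g. "time" -> 1, but "cookie" keeps both groups
    return max(1, n)

def transcript_syllables(text):
    return sum(count_syllables(w) for w in text.split())

def missing_fraction(audio_syllables, transcript):
    """Share of acoustically estimated syllables absent from the transcript:
    a proxy for how much speech the transcription omitted."""
    got = transcript_syllables(transcript)
    return max(0.0, (audio_syllables - got) / audio_syllables)
```

The clinical use mirrors the paper's logic: when the acoustic estimate far exceeds the transcript total, the transcript likely dropped speech, and the sample can be flagged for review.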
Affiliation(s)
- Terje B Holmlund
- Department of Clinical Medicine, University of Tromsø - The Arctic University of Norway, Tromsø, Norway
- Chelsea Chandler
- Department of Computer Science, University of Colorado Boulder, CO, United States
- Peter W Foltz
- Institute of Cognitive Science, University of Colorado Boulder, CO, United States
- Alex S Cohen
- Department of Psychology, Louisiana State University, LA, United States
- Brita Elvevåg
- Department of Clinical Medicine, University of Tromsø - The Arctic University of Norway, Tromsø, Norway; Norwegian Center for eHealth Research, University Hospital of North Norway, Tromsø, Norway.

31
Ivanova O, Meilán JJG, Martínez-Sánchez F, Martínez-Nicolás I, Llorente TE, González NC. Discriminating speech traits of Alzheimer's disease assessed through a corpus of reading task for Spanish language. Comput Speech Lang 2022. [DOI: 10.1016/j.csl.2021.101341]
32
Cho S, Agmon G, Shellikeri S, Cousins KAQ, Ash S, Irwin DJ, Spindler M, Deik AF, Elman LB, Quinn C, Liberman M, Grossman M, Nevler N. Prosodic characteristics of prepausal words produced by patients with neurodegenerative disease. Speech Prosody 2022; 2022:120-124. [PMID: 36444200] [PMCID: PMC9701527] [DOI: 10.21437/speechprosody.2022-25]
Abstract
Prosody of patients with neurodegenerative disease is often impaired. We investigated changes to two prosodic cues in patients: the pitch contour and the duration of prepausal words. We analyzed recordings of picture descriptions produced by patients with neurodegenerative conditions that included either cognitive (n=223), motor (n=68), or mixed cognitive and motor impairments (n=109), and by healthy controls (n=28; HC). A speech activity detector identified pauses. Words were aligned to the acoustic signal; pitch values were normalized in scale and duration. Analyses of pitch showed that the ending (90th-100th percentile) of prepausal words had a lower pitch in the mixed and motor groups than the cognitive group and HC. The pitch contour from the midpoint of words to the end showed a steep rising slope for HC, but patients showed a gentle rising or flat slope. This suggests that HC signaled the continuation of their description after the pause with rising contour; patients either failed to keep describing the picture due to cognitive impairment or could not raise pitch due to motor impairments. Prepausal words showed longer duration relative to non-prepausal words with no significant differences between the groups. This suggests that prepausal lengthening is preserved in patients.
Affiliation(s)
- Sunghye Cho
- Linguistic Data Consortium, University of Pennsylvania, Philadelphia, PA, USA
- Galit Agmon
- Penn Frontotemporal Degeneration Center, University of Pennsylvania, Philadelphia, PA, USA
- Sanjana Shellikeri
- Penn Frontotemporal Degeneration Center, University of Pennsylvania, Philadelphia, PA, USA
- Katheryn A Q Cousins
- Penn Frontotemporal Degeneration Center, University of Pennsylvania, Philadelphia, PA, USA
- Sharon Ash
- Penn Frontotemporal Degeneration Center, University of Pennsylvania, Philadelphia, PA, USA
- David J Irwin
- Penn Frontotemporal Degeneration Center, University of Pennsylvania, Philadelphia, PA, USA
- Meredith Spindler
- Department of Neurology, University of Pennsylvania, Philadelphia, PA, USA
- Andres F Deik
- Department of Neurology, University of Pennsylvania, Philadelphia, PA, USA
- Lauren B Elman
- Department of Neurology, University of Pennsylvania, Philadelphia, PA, USA
- Colin Quinn
- Department of Neurology, University of Pennsylvania, Philadelphia, PA, USA
- Mark Liberman
- Linguistic Data Consortium, University of Pennsylvania, Philadelphia, PA, USA
- Murray Grossman
- Penn Frontotemporal Degeneration Center, University of Pennsylvania, Philadelphia, PA, USA
- Naomi Nevler
- Penn Frontotemporal Degeneration Center, University of Pennsylvania, Philadelphia, PA, USA

33
Kálmán J, Devanand DP, Gosztolya G, Balogh R, Imre N, Tóth L, Hoffmann I, Kovács I, Vincze V, Pákáski M. Temporal speech parameters detect mild cognitive impairment in different languages: validation and comparison of the Speech-GAP Test® in English and Hungarian. Curr Alzheimer Res 2022; 19:373-386. [PMID: 35440309] [DOI: 10.2174/1567205019666220418155130]
Abstract
BACKGROUND The development of automatic speech recognition (ASR) technology allows the analysis of temporal (time-based) speech parameters characteristic of mild cognitive impairment (MCI). However, no information has been available on whether the analysis of spontaneous speech can be used with the same efficiency in different language environments. OBJECTIVE The main goal of this international pilot study is to address whether the Speech-Gap Test® (S-GAP Test®), previously tested in Hungarian, is appropriate for and applicable to the recognition of MCI in other languages, such as English. METHOD After an initial screening of 88 individuals, English-speaking (n = 33) and Hungarian-speaking (n = 33) participants were classified as having MCI or as healthy controls (HC) based on Petersen's criteria. Each participant's speech was recorded during a spontaneous speech task, and 15 temporal parameters were determined and calculated by means of ASR. RESULTS Seven temporal parameters in the English-speaking sample and five in the Hungarian-speaking sample differed significantly between the MCI and HC groups. Receiver operating characteristic (ROC) analysis clearly distinguished the English-speaking MCI cases from the HC group based on speech tempo and articulation tempo with 100% sensitivity, and on three more temporal parameters with high sensitivity (85.7%). In the Hungarian-speaking sample, the ROC analysis showed similar sensitivity rates (92.3%). CONCLUSION These results in different native-speaking populations suggest that the changes in acoustic parameters detected by the S-GAP Test® may be present across different languages.
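The two most discriminative parameters named above, speech tempo and articulation tempo, differ only in whether pause time is counted in the denominator. A sketch of that distinction (in practice the syllable count and durations come from the ASR output; the toy numbers below are illustrative):

```python
def temporal_parameters(n_syllables, total_s, pause_s):
    """Speech tempo: syllables per second over the whole recording,
    pauses included. Articulation tempo: syllables per second over
    phonation time only, pauses excluded."""
    return {
        "speech_tempo": n_syllables / total_s,
        "articulation_tempo": n_syllables / (total_s - pause_s),
    }
```

The gap between the two values grows with total pause time, which is why pause-heavy MCI speech can show a near-normal articulation tempo but a markedly reduced speech tempo.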
Affiliation(s)
- János Kálmán
- Albert Szent-Györgyi Medical School, University of Szeged, Szeged
- Davangere P Devanand
- Columbia University Medical Center, New York, NY; New York State Psychiatric Institute, New York, NY
- Gábor Gosztolya
- MTA-SZTE Research Group on Artificial Intelligence, Faculty of Science and Informatics, University of Szeged, Szeged
- Réka Balogh
- Albert Szent-Györgyi Medical School, University of Szeged, Szeged
- Nóra Imre
- Albert Szent-Györgyi Medical School, University of Szeged, Szeged
- László Tóth
- Faculty of Science and Informatics, University of Szeged, Szeged
- Ildikó Hoffmann
- Faculty of Humanities and Social Sciences, University of Szeged, Szeged; Hungarian Research Centre for Linguistics, Eötvös Loránd Research Network, Budapest
- Ildikó Kovács
- Albert Szent-Györgyi Medical School, University of Szeged, Szeged
- Veronika Vincze
- MTA-SZTE Research Group on Artificial Intelligence, Faculty of Science and Informatics, University of Szeged, Szeged
- Magdolna Pákáski
- Albert Szent-Györgyi Medical School, University of Szeged, Szeged

34
Gregory S, Linz N, König A, Langel K, Pullen H, Luz S, Harrison J, Ritchie CW. Remote data collection speech analysis and prediction of the identification of Alzheimer's disease biomarkers in people at risk for Alzheimer's disease dementia: the Speech on the Phone Assessment (SPeAk) prospective observational study protocol. BMJ Open 2022; 12:e052250. [PMID: 35292490] [PMCID: PMC8928245] [DOI: 10.1136/bmjopen-2021-052250]
Abstract
INTRODUCTION Identifying cost-effective, non-invasive biomarkers of Alzheimer's disease (AD) is a clinical and research priority. Speech data are easy to collect, and studies suggest they can identify those with AD. We do not know whether speech features can predict AD biomarkers in a preclinical population. METHODS AND ANALYSIS The Speech on the Phone Assessment (SPeAk) study is a prospective observational study. SPeAk recruits participants aged 50 years and over who have previously completed studies with AD biomarker collection. Participants complete a baseline telephone assessment, including spontaneous speech and cognitive tests. A 3-month visit will repeat the cognitive tests with a conversational artificial intelligence bot. Participants complete acceptability questionnaires after each visit and are randomised to receive their cognitive test results either after each visit or only after they have completed the study. We will combine SPeAk data with AD biomarker data collected in a previous study and analyse for correlations between extracted speech features and AD biomarkers. The outcome of this analysis will inform the development of an algorithm for predicting AD risk from speech features. ETHICS AND DISSEMINATION This study has been approved by the Edinburgh Medical School Research Ethics Committee (REC reference 20-EMREC-007). All participants will provide informed consent before completing any study-related procedures and must have the capacity to consent to participate. Participants may find that the tests, or receiving their scores, cause anxiety or stress; previous exposure to similar tests may make the experience more familiar and reduce this anxiety, and the study information will include signposting in case of distress. Study results will be disseminated to participants, presented at conferences, and published in a peer-reviewed journal. No participants will be identifiable in the study results.
Affiliation(s)
- Sarah Gregory
- Edinburgh Dementia Prevention, Centre for Clinical Brain Sciences, The University of Edinburgh, Edinburgh, UK
- Nicklas Linz
- ki elements, Saarbrücken, Saarland, Germany
- Alexandra König
- Stars Team, National Institute for Research in Computer Science and Automation, Nice, France
- Kai Langel
- Janssen Healthcare Innovation, Beerse, Belgium
- Hannah Pullen
- Edinburgh Dementia Prevention, Centre for Clinical Brain Sciences, The University of Edinburgh, Edinburgh, UK
- Saturnino Luz
- Usher Institute of Population Health Sciences and Informatics, The University of Edinburgh, Edinburgh, UK
- John Harrison
- Metis Cognition Ltd, Kilmington Common, UK
- Department of Neurology, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
- Craig W Ritchie
- Edinburgh Dementia Prevention, Centre for Clinical Brain Sciences, The University of Edinburgh, Edinburgh, UK
35
Changes in Speech Range Profile Are Associated with Cognitive Impairment. Dement Neurocogn Disord 2021; 20:89-98. [PMID: 34795772 PMCID: PMC8585535 DOI: 10.12779/dnd.2021.20.4.89] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/13/2021] [Revised: 10/19/2021] [Accepted: 10/20/2021] [Indexed: 12/02/2022] Open
Abstract
Background and Purpose The aim of this study was to describe variations in the speech range profile (SRP) of patients affected by cognitive decline. Methods We collected data from patients managed for suspected voice and speech disorders and suspected cognitive impairment. Patients underwent an Ear, Nose and Throat evaluation and the Mini-Mental State Examination (MMSE). To obtain the SRP, we asked patients to read 18 sentences twice, at their most comfortable pitch and loudness as they would in daily conversation, and recorded their voices onto computer software. Results The study included 61 patients. The relationship between the MMSE score and SRP parameters was established. Increasing severity of impairment on the MMSE resulted in a statistically significant reduction in the average semitone values of the phonetogram and in the medium and maximum sound pressure levels (p<0.001). The maximum predictivity of the MMSE was based on the highly significant values of semitones (p<0.001) and the maximum sound pressure levels (p=0.010). Conclusions The differences in SRP between the various groups were analyzed. Specifically, the SRP value decreased with increasing severity of cognitive decline. SRP was useful in highlighting the relationship between all cognitive declines tested and speech.
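The semitone span reported in speech range profile studies follows from a standard conversion between fundamental-frequency extremes; a minimal sketch of that conversion (the function name and example frequencies are illustrative, not from the paper):

```python
import math

def semitone_range(f0_min_hz: float, f0_max_hz: float) -> float:
    """Span of a speech range profile in semitones.

    Uses the standard musical conversion: 12 semitones per octave,
    i.e. per doubling of fundamental frequency.
    """
    return 12 * math.log2(f0_max_hz / f0_min_hz)

# A speaker whose f0 spans 100-200 Hz covers exactly one octave:
print(semitone_range(100.0, 200.0))  # -> 12.0
```

A narrowing of this span with lower MMSE scores is the kind of reduction the study describes.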
36
Yamada Y, Shinkawa K, Kobayashi M, Nishimura M, Nemoto M, Tsukada E, Ota M, Nemoto K, Arai T. Tablet-Based Automatic Assessment for Early Detection of Alzheimer's Disease Using Speech Responses to Daily Life Questions. Front Digit Health 2021; 3:653904. [PMID: 34713127 PMCID: PMC8521899 DOI: 10.3389/fdgth.2021.653904] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/15/2021] [Accepted: 02/22/2021] [Indexed: 01/09/2023] Open
Abstract
Health-monitoring technologies for automatically detecting the early signs of Alzheimer's disease (AD) have become increasingly important. Speech responses to neuropsychological tasks have been used for quantifying changes resulting from AD and differentiating AD and mild cognitive impairment (MCI) from cognitively normal (CN). However, whether and how other types of speech tasks with less burden on older adults could be used for detecting early signs of AD remains unexplored. In this study, we developed a tablet-based application and compared speech responses to daily life questions with those to neuropsychological tasks in terms of differentiating MCI from CN. We found that for daily life questions, around 80% of the speech features showing significant differences between CN and MCI overlapped with those showing significant differences in our study and in other studies using neuropsychological tasks, although the number of significantly different features and their effect sizes were smaller for life questions than for neuropsychological tasks. On the other hand, the results of classification models for detecting MCI using the speech features showed that daily life questions could achieve high accuracy (86.4%) using eight questions, comparable to that obtained with all five neuropsychological tasks. Our results indicate that, while daily life questions may elicit weaker but statistically discernible differences in speech responses resulting from MCI than neuropsychological tasks, combining them could be useful for detecting MCI with performance comparable to neuropsychological tasks, which could help develop health-monitoring technologies for early detection of AD in a less burdensome manner.
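The comparison of effect sizes between task types above can be made concrete with Cohen's d; a self-contained sketch using illustrative, made-up speech-rate values (the numbers and the feature choice are assumptions, not data from the study):

```python
import math
import statistics

def cohens_d(group_a, group_b):
    """Cohen's d for two independent groups, using the pooled SD."""
    na, nb = len(group_a), len(group_b)
    va, vb = statistics.variance(group_a), statistics.variance(group_b)
    pooled_sd = math.sqrt(((na - 1) * va + (nb - 1) * vb) / (na + nb - 2))
    return (statistics.fmean(group_a) - statistics.fmean(group_b)) / pooled_sd

# Hypothetical speech rates (words/sec) for CN vs. MCI speakers:
cn = [3.9, 4.1, 4.0, 4.2, 4.3]
mci = [3.5, 3.6, 3.4, 3.7, 3.8]
print(round(cohens_d(cn, mci), 2))
```

A smaller d for features extracted from life questions than from neuropsychological tasks would reproduce the pattern the abstract reports.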
Affiliation(s)
- Masafumi Nishimura
- Department of Informatics, Graduate School of Integrated Science and Technology, Shizuoka University, Shizuoka, Japan
- Miyuki Nemoto
- Department of Psychiatry, University of Tsukuba Hospital, Ibaraki, Japan
- Eriko Tsukada
- Department of Psychiatry, University of Tsukuba Hospital, Ibaraki, Japan
- Miho Ota
- Department of Psychiatry, Faculty of Medicine, University of Tsukuba, Ibaraki, Japan
- Kiyotaka Nemoto
- Department of Psychiatry, Faculty of Medicine, University of Tsukuba, Ibaraki, Japan
- Tetsuaki Arai
- Department of Psychiatry, Faculty of Medicine, University of Tsukuba, Ibaraki, Japan
37
Gosztolya G, Balogh R, Imre N, Egas-López JV, Hoffmann I, Vincze V, Tóth L, Devanand DP, Pákáski M, Kálmán J. Cross-lingual detection of mild cognitive impairment based on temporal parameters of spontaneous speech. COMPUT SPEECH LANG 2021. [DOI: 10.1016/j.csl.2021.101215] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/21/2022]
38
Nasreen S, Rohanian M, Hough J, Purver M. Alzheimer’s Dementia Recognition From Spontaneous Speech Using Disfluency and Interactional Features. FRONTIERS IN COMPUTER SCIENCE 2021. [DOI: 10.3389/fcomp.2021.640669] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/04/2023] Open
Abstract
Alzheimer’s disease (AD) is a progressive, neurodegenerative disorder mainly characterized by memory loss with deficits in other cognitive domains, including language, visuospatial abilities, and changes in behavior. Detecting diagnostic biomarkers that are noninvasive and cost-effective is of great value not only for clinical assessments and diagnostics but also for research purposes. Several previous studies have investigated AD diagnosis via the acoustic, lexical, syntactic, and semantic aspects of speech and language. Other studies include approaches from conversation analysis that look at more interactional aspects, showing that disfluencies such as fillers and repairs, and purely nonverbal features such as inter-speaker silence, can be key features of AD conversations. These kinds of features, if useful for diagnosis, may have many advantages: They are simple to extract and relatively language-, topic-, and task-independent. This study aims to quantify the role and contribution of these features of interaction structure in predicting whether a dialogue participant has AD. We used a subset of the Carolinas Conversation Collection dataset of patients with AD at moderate stage within the age range 60–89 and similar-aged non-AD patients with other health conditions. Our feature analysis comprised two sets: disfluency features, including indicators such as self-repairs and fillers, and interactional features, including overlaps, turn-taking behavior, and distributions of different types of silence both within patient speech and between patient and interviewer speech. Statistical analysis showed significant differences between AD and non-AD groups for several disfluency features (edit terms, verbatim repeats, and substitutions) and interactional features (lapses, gaps, attributable silences, turn switches per minute, standardized phonation time, and turn length). For the classification of AD patient conversations vs. non-AD patient conversations, we achieved 83% accuracy with disfluency features, 83% accuracy with interactional features, and an overall accuracy of 90% when combining both feature sets using support vector machine classifiers. The discriminative power of these features, perhaps combined with more conventional linguistic features, therefore shows potential for integration into noninvasive clinical assessments for AD at advanced stages.
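The combining of feature sets with a support vector machine classifier can be sketched as follows, assuming scikit-learn and NumPy are available; the synthetic feature values and their names stand in for the disfluency and interactional measures described above and are purely illustrative:

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
# Hypothetical per-speaker features: [fillers/min, repeats/min] (disfluency)
# and [gaps/min, mean turn length in sec] (interactional), 30 speakers/class.
disfluency = rng.normal(loc=[[2.0, 1.0]] * 30 + [[4.0, 2.5]] * 30, scale=0.8)
interaction = rng.normal(loc=[[1.0, 8.0]] * 30 + [[2.5, 5.0]] * 30, scale=0.8)
y = np.array([0] * 30 + [1] * 30)  # 0 = non-AD, 1 = AD

# Concatenate the two feature sets and evaluate with 5-fold cross-validation.
combined = np.hstack([disfluency, interaction])
acc = cross_val_score(SVC(kernel="rbf"), combined, y, cv=5).mean()
print(f"combined-feature CV accuracy: {acc:.2f}")
```

With well-separated synthetic features the combined model scores highly; on real conversational data, the gain from combining sets is the quantity of interest.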
39
Evans E, Coley SL, Gooding DC, Norris N, Ramsey CM, Green-Harris G, Mueller KD. Preliminary assessment of connected speech and language as marker for cognitive change in late middle-aged Black/African American adults at risk for Alzheimer's disease. APHASIOLOGY 2021; 36:982-1005. [PMID: 36016839 PMCID: PMC9398189 DOI: 10.1080/02687038.2021.1931801] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 11/07/2020] [Accepted: 05/03/2021] [Indexed: 06/15/2023]
Abstract
Background Connected speech-language (CSL) has been a promising measure for assessing cognitive decline in populations at risk for Alzheimer's disease and related dementias (ADRD). A common way to obtain CSL is through picture description tasks such as the frequently used Cookie Theft (CT) image. However, questions have been raised about using the CT for diverse communities. Little is known about the CSL produced in response to this task by Black/African American (BAA) adults aged 48-74. Goals The present study's goals were to characterize CSL by sex and APOE-ε4 status in BAA adults from Milwaukee in the Wisconsin Registry for Alzheimer's Prevention (WRAP) study when presented with the CT picture description task, and to identify differences in CSL output between BAAs and non-Hispanic Whites (NHW). Methods and Procedures We collected CSL samples elicited by the CT picture from 48 BAA and 30 NHW participants in the Milwaukee, WI, WRAP group. CSL was analyzed using chi-square tests, t-tests, and ANCOVA. Linear mixed-effect regression models were used to determine the association between cognitive status and longitudinal CSL in BAA participants with more than one timepoint. Outcomes and Results Declines in the CSL of BAA participants were associated with subtle declines in cognition. Among BAA participants, we found no significant differences in speech measures by sex or APOE-ε4 status. Our results showed no significant differences in speech measures between the BAA and NHW groups. Conclusions CSL analysis provides an inexpensive way to evaluate preclinical changes in cognitive status that may not be as affected by other factors, such as ethnocultural background. Future studies with larger sample sizes and participants from other geographic locations can clarify these findings.
Affiliation(s)
- Elizabeth Evans
- Department of Communication Sciences and Disorders, University of Wisconsin-Madison, Madison, Wisconsin, USA
- Sheryl L Coley
- Wisconsin Alzheimer's Institute, University of Wisconsin School of Medicine and Public Health, Madison, Wisconsin, USA
- Diane C Gooding
- Department of Psychology and Psychiatry, University of Wisconsin-Madison, Madison, Wisconsin, USA
- Nia Norris
- Wisconsin Alzheimer's Institute, University of Wisconsin School of Medicine and Public Health, Madison, Wisconsin, USA
- Celena M Ramsey
- Wisconsin Alzheimer's Institute, University of Wisconsin School of Medicine and Public Health, Madison, Wisconsin, USA
- Gina Green-Harris
- Wisconsin Alzheimer's Institute, University of Wisconsin School of Medicine and Public Health, Madison, Wisconsin, USA
- Kimberly D Mueller
- Department of Communication Sciences and Disorders, University of Wisconsin-Madison, Madison, Wisconsin, USA
40
Hidalgo-De la Guía I, Garayzábal-Heinze E, Gómez-Vilda P, Martínez-Olalla R, Palacios-Alonso D. Acoustic Analysis of Phonation in Children With Smith-Magenis Syndrome. Front Hum Neurosci 2021; 15:661392. [PMID: 34149380 PMCID: PMC8209519 DOI: 10.3389/fnhum.2021.661392] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/30/2021] [Accepted: 04/27/2021] [Indexed: 11/13/2022] Open
Abstract
Complex simultaneous neuropsychophysiological mechanisms are responsible for the processing of the information to be transmitted and for the neuromotor planning of the articulatory organs involved in speech. The nature of this set of mechanisms is closely linked to the clinical state of the subject. Thus, for example, in populations with neurodevelopmental deficits, these underlying neuropsychophysiological procedures are deficient and determine their phonation. Most of these cases with neurodevelopmental deficits are due to a genetic abnormality, as is the case in the population with Smith–Magenis syndrome (SMS). SMS is associated with neurodevelopmental deficits, intellectual disability, and a cohort of characteristic phenotypic features, including voice quality, which does not seem to be in line with the gender, age, and complexion of the diagnosed subject. The phonatory profile and speech features in this syndrome are dysphonia, high f0, excess vocal muscle stiffness, fluency alterations, numerous syllabic simplifications, phoneme omissions, and unintelligibility of speech. This exploratory study investigates whether the neuromotor deficits in children with SMS adversely affect phonation as compared to typically developing children without neuromotor deficits, which has not been previously determined. The authors compare the phonatory performance of a group of children with SMS (N = 12) with a healthy control group of children (N = 12) matched in age and gender, and grouped into two age ranges: the first from 5 to 7 years old, and the second from 8 to 12 years old. Group differences were determined for two forms of acoustic analysis performed on repeated recordings of the sustained vowel /a/: F1 and F2 extraction, and cepstral peak prominence (CPP). It is expected that the results will shed light on the underlying neuromotor aspects of phonation in the SMS population. These findings could provide evidence of the susceptibility of the phonation of speech to neuromotor disturbances, regardless of their origin.
Affiliation(s)
- Pedro Gómez-Vilda
- Center for Biomedical Technology, Universidad Politécnica de Madrid, Madrid, Spain
- Daniel Palacios-Alonso
- Escuela Técnica Superior de Ingeniería Informática, Universidad Rey Juan Carlos, Madrid, Spain
41
Zhu Y, Liang X, Batsis JA, Roth RM. Exploring Deep Transfer Learning Techniques for Alzheimer's Dementia Detection. FRONTIERS IN COMPUTER SCIENCE 2021; 3:624683. [PMID: 34046588 PMCID: PMC8153512 DOI: 10.3389/fcomp.2021.624683] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/25/2022] Open
Abstract
Examination of speech datasets for detecting dementia, collected via various speech tasks, has revealed links between speech and cognitive abilities. However, the speech dataset available for this research is extremely limited because the collection process of speech and baseline data from patients with dementia in clinical settings is expensive. In this paper, we study the spontaneous speech dataset from a recent ADReSS challenge, a Cookie Theft Picture (CTP) dataset with balanced groups of participants in age, gender, and cognitive status. We explore state-of-the-art deep transfer learning techniques from image, audio, speech, and language domains. We envision that one advantage of transfer learning is to eliminate the design of handcrafted features based on the tasks and datasets. Transfer learning further mitigates the limited dementia-relevant speech data problem by inheriting knowledge from similar but much larger datasets. Specifically, we built a variety of transfer learning models using commonly employed MobileNet (image), YAMNet (audio), Mockingjay (speech), and BERT (text) models. Results indicated that the transfer learning models of text data showed significantly better performance than those of audio data. Performance gains of the text models may be due to the high similarity between the pre-training text dataset and the CTP text dataset. Our multi-modal transfer learning introduced a slight improvement in accuracy, demonstrating that audio and text data provide limited complementary information. Multi-task transfer learning resulted in limited improvements in classification and a negative impact in regression. By analyzing the meaning behind the AD/non-AD labels and Mini-Mental State Examination (MMSE) scores, we observed that the inconsistency between labels and scores could limit the performance of the multi-task learning, especially when the outputs of the single-task models are highly consistent with the corresponding labels/scores. In sum, we conducted a large comparative analysis of varying transfer learning models, focusing less on model customization and more on pre-trained models and pre-training datasets. We revealed insightful relations among models, data types, and data labels in this research area.
Affiliation(s)
- Youxiang Zhu
- Computer Science, University of Massachusetts Boston, Boston, MA, USA
- Xiaohui Liang
- Computer Science, University of Massachusetts Boston, Boston, MA, USA
- John A. Batsis
- School of Medicine, University of North Carolina, Chapel Hill, NC, USA
- Robert M. Roth
- Geisel School of Medicine at Dartmouth, Lebanon, NH, USA
42
Jonell P, Moëll B, Håkansson K, Henter GE, Kucherenko T, Mikheeva O, Hagman G, Holleman J, Kivipelto M, Kjellström H, Gustafson J, Beskow J. Multimodal Capture of Patient Behaviour for Improved Detection of Early Dementia: Clinical Feasibility and Preliminary Results. FRONTIERS IN COMPUTER SCIENCE 2021. [DOI: 10.3389/fcomp.2021.642633] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/17/2022] Open
Abstract
Non-invasive automatic screening for Alzheimer’s disease has the potential to improve diagnostic accuracy while lowering healthcare costs. Previous research has shown that patterns in speech, language, gaze, and drawing can help detect early signs of cognitive decline. In this paper, we describe a highly multimodal system for unobtrusively capturing data during real clinical interviews conducted as part of cognitive assessments for Alzheimer’s disease. The system uses nine different sensor devices (smartphones, a tablet, an eye tracker, a microphone array, and a wristband) to record interaction data during a specialist’s first clinical interview with a patient, and is currently in use at Karolinska University Hospital in Stockholm, Sweden. Furthermore, complementary information in the form of brain imaging, psychological tests, speech therapist assessment, and clinical meta-data is also available for each patient. We detail our data-collection and analysis procedure and present preliminary findings that relate measures extracted from the multimodal recordings to clinical assessments and established biomarkers, based on data from 25 patients gathered thus far. Our findings demonstrate feasibility for our proposed methodology and indicate that the collected data can be used to improve clinical assessments of early dementia.
43
Yamada Y, Shinkawa K, Kobayashi M, Takagi H, Nemoto M, Nemoto K, Arai T. Using Speech Data From Interactions With a Voice Assistant to Predict the Risk of Future Accidents for Older Drivers: Prospective Cohort Study. J Med Internet Res 2021; 23:e27667. [PMID: 33830066 PMCID: PMC8063093 DOI: 10.2196/27667] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/02/2021] [Revised: 03/08/2021] [Accepted: 03/15/2021] [Indexed: 01/27/2023] Open
Abstract
Background With the rapid growth of the older adult population worldwide, car accidents involving this population group have become an increasingly serious problem. Cognitive impairment, which is assessed using neuropsychological tests, has been reported as a risk factor for being involved in car accidents; however, it remains unclear whether this risk can be predicted using daily behavior data. Objective The objective of this study was to investigate whether speech data that can be collected in everyday life can be used to predict the risk of an older driver being involved in a car accident. Methods At baseline, we collected (1) speech data during interactions with a voice assistant and (2) cognitive assessment data—neuropsychological tests (Mini-Mental State Examination, revised Wechsler immediate and delayed logical memory, Frontal Assessment Battery, trail making test-parts A and B, and Clock Drawing Test), Geriatric Depression Scale, magnetic resonance imaging, and demographics (age, sex, education)—from older adults. Approximately one-and-a-half years later, we followed up to collect information about their driving experiences (with respect to car accidents) using a questionnaire. We investigated the association between speech data and future accident risk using statistical analysis and machine learning models. Results We found that older drivers (n=60) with accident or near-accident experiences had statistically discernible differences in speech features that suggest cognitive impairment such as reduced speech rate (P=.048) and increased response time (P=.040). Moreover, the model that used speech features could predict future accident or near-accident experiences with 81.7% accuracy, which was 6.7% higher than that using cognitive assessment data, and could achieve up to 88.3% accuracy when the model used both types of data. Conclusions Our study provides the first empirical results that suggest analysis of speech data recorded during interactions with voice assistants could help predict future accident risk for older drivers by capturing subtle impairments in cognitive function.
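The two significant features named above, speech rate and response time, can both be derived from simple timestamps; a minimal sketch (function name, timestamp scheme, and example values are illustrative assumptions, not the study's pipeline):

```python
def timing_features(n_words, prompt_end_s, speech_start_s, speech_end_s):
    """Two features of the kind reported above, from simple timestamps:
    speech rate (words per second of speaking) and response time
    (latency between the prompt ending and speech beginning), in seconds.
    """
    return {
        "speech_rate": n_words / (speech_end_s - speech_start_s),
        "response_time_s": speech_start_s - prompt_end_s,
    }

# Hypothetical response: prompt ends at t=0 s, speaking spans 1.5-5.5 s, 8 words.
features = timing_features(8, 0.0, 1.5, 5.5)
print(features)  # -> {'speech_rate': 2.0, 'response_time_s': 1.5}
```

A lower speech rate and a longer response time are the directions the study associates with accident risk.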
Affiliation(s)
- Miyuki Nemoto
- Department of Psychiatry, University of Tsukuba Hospital, Ibaraki, Japan
- Kiyotaka Nemoto
- Department of Psychiatry, Faculty of Medicine, University of Tsukuba, Ibaraki, Japan
- Tetsuaki Arai
- Department of Psychiatry, Faculty of Medicine, University of Tsukuba, Ibaraki, Japan
44
Robin J, Harrison JE, Kaufman LD, Rudzicz F, Simpson W, Yancheva M. Evaluation of Speech-Based Digital Biomarkers: Review and Recommendations. Digit Biomark 2020; 4:99-108. [PMID: 33251474 DOI: 10.1159/000510820] [Citation(s) in RCA: 49] [Impact Index Per Article: 12.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/12/2020] [Accepted: 08/11/2020] [Indexed: 12/23/2022] Open
Abstract
Speech represents a promising novel biomarker by providing a window into brain health, as shown by its disruption in various neurological and psychiatric diseases. As with many novel digital biomarkers, however, rigorous evaluation is currently lacking and is required for these measures to be used effectively and safely. This paper outlines and provides examples from the literature of evaluation steps for speech-based digital biomarkers, based on the recent V3 framework (Goldsack et al., 2020). The V3 framework describes 3 components of evaluation for digital biomarkers: verification, analytical validation, and clinical validation. Verification includes assessing the quality of speech recordings and comparing the effects of hardware and recording conditions on the integrity of the recordings. Analytical validation includes checking the accuracy and reliability of data processing and computed measures, including understanding test-retest reliability, demographic variability, and comparing measures to reference standards. Clinical validation involves verifying the correspondence of a measure to clinical outcomes, which can include diagnosis, disease progression, or response to treatment. For each of these sections, we provide recommendations for the types of evaluation necessary for speech-based biomarkers and review published examples. The examples in this paper focus on speech-based biomarkers, but they can be used as a template for digital biomarker development more generally.
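A basic form of the test-retest reliability check mentioned under analytical validation is a correlation between repeated measurements of the same speech measure; a self-contained sketch with made-up session values (the numbers are illustrative, and real evaluations typically use the intraclass correlation coefficient rather than plain Pearson r):

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two paired measurement sessions --
    a simple test-retest check for a repeated speech measure."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Illustrative (made-up) speech-rate measurements from two sessions:
session1 = [3.2, 4.1, 2.8, 3.9, 3.5]
session2 = [3.3, 4.0, 2.9, 3.8, 3.6]
print(round(pearson_r(session1, session2), 3))
```

A high correlation across sessions supports the reliability of the measure; low values flag sensitivity to recording conditions or day-to-day variability.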
Affiliation(s)
- John E Harrison
- Metis Cognition Ltd., Park House, Kilmington Common, Warminster, United Kingdom
- Alzheimer Center, AUmc, Amsterdam, The Netherlands
- Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, United Kingdom
- Frank Rudzicz
- Li Ka Shing Knowledge Institute, St Michael's Hospital, Toronto, Ontario, Canada
- Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
- Vector Institute for Artificial Intelligence, Toronto, Ontario, Canada
- William Simpson
- Winterlight Labs, Toronto, Ontario, Canada
- Department of Psychiatry and Behavioural Neuroscience, McMaster University, Hamilton, Ontario, Canada