1
Ding J, Mirman D. Data-driven classification of narrative speech characteristics in stroke aphasia distinguishes neurological and strategic contributions. Cortex 2025; 186:61-73. [PMID: 40186929 DOI: 10.1016/j.cortex.2025.03.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/27/2024] [Revised: 03/19/2025] [Accepted: 03/19/2025] [Indexed: 04/07/2025]
Abstract
Narrative speech deficits are common in post-stroke aphasia, resulting in negative influences on social participation and quality of life. Speech rate, complexity, and informativeness deficits all contribute to narrative speech. Research studies typically (implicitly) assume that these aspects of narrative speech production are a result of cognitive/neurological impairment, but they may also result from strategic choices made as individuals with aphasia attempt to produce narrative speech. Here, we used data-driven methods to classify aphasic narrative speech patterns and evaluated their predictability from lesion patterns. 76 stroke aphasia patients completed 11 narrative speech production tasks. Quantitative Production Analysis (QPA) and Correct Information Unit (CIU) analysis were used to measure their structural and functional properties. Based on prior work, we selected QPA measures of speech rate (words per minute) and complexity (mean sentence length, inflection index, and auxiliary index) and four CIU measures of informativeness (#CIU, CIU/min, %CIU, #nonCIU). These measures produced two orthogonal dimensions with four orthogonal participant clusters. Comprehensive comparison between clusters revealed that speech rate and complexity were strongly associated with general aphasia severity and total lesion volume, and were predicted by frontoparietal grey matter and dorsal pathway white matter damage. In contrast, informativeness was independent of other behavioral and neurological deficits, and was not predictable from lesion patterns, suggesting that it reflects communication strategy rather than specific neurological impairment. These results provide an important step toward distinguishing neurological and strategic aspects of narrative speech deficits in post-stroke aphasia, with potential implications for treatment approaches that target communication strategies.
Affiliation(s)
- Junhua Ding
- State Key Laboratory of Cognitive Science and Mental Health, Institute of Psychology, Chinese Academy of Sciences, Beijing, China
- Daniel Mirman
- Department of Psychology, University of Edinburgh, Edinburgh, UK
2
Bayat S, Sanati M, Mohammad‐Panahi M, Khodadadi A, Ghasimi M, Rezaee S, Besharat S, Mahboubi‐Fooladi Z, Almasi‐Dooghaee M, Sanei‐Taheri M, Dickerson BC, Rezaii N. Language abnormalities in Alzheimer's disease indicate reduced informativeness. Ann Clin Transl Neurol 2024; 11:2946-2957. [PMID: 39291771 PMCID: PMC11572728 DOI: 10.1002/acn3.52205] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/04/2024] [Revised: 08/22/2024] [Accepted: 08/27/2024] [Indexed: 09/19/2024]
Abstract
OBJECTIVE This study aims to elucidate the cognitive underpinnings of language abnormalities in Alzheimer's Disease (AD) using a computational cross-linguistic approach and ultimately enhance the understanding and diagnostic accuracy of the disease. METHODS Computational analyses were conducted on language samples of 156 English and 50 Persian speakers, comprising both AD patients and healthy controls, to extract language indicators of AD. Furthermore, we introduced a machine learning-based metric, Language Informativeness Index (LII), to quantify empty speech. RESULTS Despite considerable disparities in surface structures between the two languages, we observed consistency across language indicators of AD in both English and Persian. Notably, indicators of AD in English resulted in a classification accuracy of 90% in classifying AD in Persian. The substantial degree of transferability suggests that the language abnormalities of AD do not tightly link to the surface structures specific to English. Subsequently, we posited that these abnormalities stem from impairments in a more universal aspect of language production: the ability to generate informative messages independent of the language spoken. Consistent with this hypothesis, we found significant correlations between language indicators of AD and empty speech in both English and Persian. INTERPRETATION The findings of this study suggest that language impairments in AD arise from a deficit in a universal aspect of message formation rather than from the breakdown of language-specific morphosyntactic structures. Beyond enhancing our understanding of the psycholinguistic deficits of AD, our approach fosters the development of diagnostic tools across various languages, enhancing health equity and biocultural diversity.
Affiliation(s)
- Sabereh Bayat
- Azad University Science and Research Branch, Sattari Highway, Tehran, Iran
- Mahya Sanati
- Abrar Institute of Higher Education, Khorasan Square, Tehran, Iran
- Mahdieh Ghasimi
- Shahid Beheshti University of Medical Sciences, Velenjak, Daneshjoo Blvd, Tehran, Iran
- Sahar Rezaee
- Shahid Beheshti University of Medical Sciences, Velenjak, Daneshjoo Blvd, Tehran, Iran
- Sara Besharat
- Shahid Beheshti University of Medical Sciences, Velenjak, Daneshjoo Blvd, Tehran, Iran
- Morteza Sanei‐Taheri
- Shahid Beheshti University of Medical Sciences, Velenjak, Daneshjoo Blvd, Tehran, Iran
- Bradford C. Dickerson
- Massachusetts General Hospital, Harvard Medical School, 55 Fruit Street, Boston, USA
- Athinoula A. Martinos Center for Biomedical Imaging, 149 13th Street, Boston, Massachusetts, USA
- Massachusetts Alzheimer's Disease Research Center, Boston, Massachusetts 02114, USA
- Neguine Rezaii
- Massachusetts General Hospital, Harvard Medical School, 55 Fruit Street, Boston, USA
- Athinoula A. Martinos Center for Biomedical Imaging, 149 13th Street, Boston, Massachusetts, USA
3
Rezaii N, Hochberg D, Quimby M, Wong B, Brickhouse M, Touroutoglou A, Dickerson BC, Wolff P. Artificial intelligence classifies primary progressive aphasia from connected speech. Brain 2024; 147:3070-3082. [PMID: 38912855 PMCID: PMC11370793 DOI: 10.1093/brain/awae196] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/19/2024] [Revised: 05/08/2024] [Accepted: 05/21/2024] [Indexed: 06/25/2024]
Abstract
Neurodegenerative dementia syndromes, such as primary progressive aphasias (PPA), have traditionally been diagnosed based, in part, on verbal and non-verbal cognitive profiles. Debate continues about whether PPA is best divided into three variants and regarding the most distinctive linguistic features for classifying PPA variants. In this cross-sectional study, we initially harnessed the capabilities of artificial intelligence and natural language processing to perform unsupervised classification of short, connected speech samples from 78 patients with PPA. We then used natural language processing to identify linguistic features that best dissociate the three PPA variants. Large language models discerned three distinct PPA clusters, with 88.5% agreement with independent clinical diagnoses. Patterns of cortical atrophy of three data-driven clusters corresponded to the localization in the clinical diagnostic criteria. In the subsequent supervised classification, 17 distinctive features emerged, including the observation that separating verbs into high- and low-frequency types significantly improved classification accuracy. Using these linguistic features derived from the analysis of short, connected speech samples, we developed a classifier that achieved 97.9% accuracy in classifying the four groups (three PPA variants and healthy controls). The data-driven section of this study showcases the ability of large language models to find natural partitioning in the speech of patients with PPA consistent with conventional variants. In addition, the work identifies a robust set of language features indicative of each PPA variant, emphasizing the significance of dividing verbs into high- and low-frequency categories. Beyond improving diagnostic accuracy, these findings enhance our understanding of the neurobiology of language processing.
Affiliation(s)
- Neguine Rezaii
- Frontotemporal Disorders Unit, Department of Neurology, Massachusetts General Hospital, Harvard Medical School, Boston, MA 02114, USA
- Daisy Hochberg
- Frontotemporal Disorders Unit, Department of Neurology, Massachusetts General Hospital, Harvard Medical School, Boston, MA 02114, USA
- Megan Quimby
- Frontotemporal Disorders Unit, Department of Neurology, Massachusetts General Hospital, Harvard Medical School, Boston, MA 02114, USA
- Bonnie Wong
- Frontotemporal Disorders Unit, Department of Neurology, Massachusetts General Hospital, Harvard Medical School, Boston, MA 02114, USA
- Department of Psychiatry, Massachusetts General Hospital, Harvard Medical School, Boston, MA 02114, USA
- Michael Brickhouse
- Frontotemporal Disorders Unit, Department of Neurology, Massachusetts General Hospital, Harvard Medical School, Boston, MA 02114, USA
- Alexandra Touroutoglou
- Frontotemporal Disorders Unit, Department of Neurology, Massachusetts General Hospital, Harvard Medical School, Boston, MA 02114, USA
- Department of Psychology, Emory University, Atlanta, GA 30322, USA
- Bradford C Dickerson
- Frontotemporal Disorders Unit, Department of Neurology, Massachusetts General Hospital, Harvard Medical School, Boston, MA 02114, USA
- Athinoula A. Martinos Center for Biomedical Imaging, Harvard Medical School, Boston, MA 02129, USA
- Massachusetts Alzheimer's Disease Research Center, Harvard Medical School, Boston, MA 02114, USA
- Phillip Wolff
- Department of Psychology, Emory University, Atlanta, GA 30322, USA
4
Cong Y, LaCroix AN, Lee J. Clinical efficacy of pre-trained large language models through the lens of aphasia. Sci Rep 2024; 14:15573. [PMID: 38971898 PMCID: PMC11227580 DOI: 10.1038/s41598-024-66576-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/25/2024] [Accepted: 07/01/2024] [Indexed: 07/08/2024]
Abstract
The rapid development of large language models (LLMs) motivates us to explore how such state-of-the-art natural language processing systems can inform aphasia research. What kind of language indices can we derive from a pre-trained LLM? How do they differ from or relate to the existing language features in aphasia? To what extent can LLMs serve as an interpretable and effective diagnostic and measurement tool in a clinical context? To investigate these questions, we constructed predictive and correlational models, which utilize mean surprisals from LLMs as predictor variables. Using AphasiaBank archived data, we validated our models' efficacy in aphasia diagnosis, measurement, and prediction. Our finding is that LLMs-surprisals can effectively detect the presence of aphasia and different natures of the disorder, LLMs in conjunction with the existing language indices improve models' efficacy in subtyping aphasia, and LLMs-surprisals can capture common agrammatic deficits at both word and sentence level. Overall, LLMs have potential to advance automatic and precise aphasia prediction. A natural language processing pipeline can be greatly benefitted from integrating LLMs, enabling us to refine models of existing language disorders, such as aphasia.
Affiliation(s)
- Yan Cong
- School of Languages and Cultures, Purdue University, West Lafayette, USA
- Arianna N LaCroix
- Department of Speech, Language, and Hearing Sciences, Purdue University, West Lafayette, USA
- Jiyeon Lee
- Department of Speech, Language, and Hearing Sciences, Purdue University, West Lafayette, USA
5
Sanati M, Bayat S, Panahi MM, Khodadadi A, Rezaee S, Ghasimi M, Besharat S, Fooladi ZM, Dooghaee MA, Taheri MS, Dickerson BC, Goldberg A, Rezaii N. Impaired language in Alzheimer's disease: A comparison between English and Persian implicates content-word frequency rather than the noun-verb distinction. medRxiv 2024:2024.04.09.24305534. [PMID: 38645255 PMCID: PMC11030473 DOI: 10.1101/2024.04.09.24305534] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/23/2024]
Abstract
This study challenges the conventional psycholinguistic view that the distinction between nouns and verbs is pivotal in understanding language impairments in neurological disorders. Traditional views link frontal brain region damage with verb processing deficits and posterior temporoparietal damage with noun difficulties. However, this perspective is contested by findings from patients with Alzheimer's disease (pwAD), who show impairments in both word classes despite their typical temporoparietal atrophy. Notably, pwAD tend to use semantically lighter verbs in their speech than healthy individuals. By examining English-speaking pwAD and comparing them with Persian-speaking pwAD, this research aims to demonstrate that language impairments in Alzheimer's disease (AD) stem from the distributional properties of words within a language rather than distinct neural processing networks for nouns and verbs. We propose that the primary deficit in AD language production is an overreliance on high-frequency words. English has a set of particularly high-frequency verbs that surpass most nouns in usage frequency. Since pwAD tend to use high-frequency words, the byproduct of this word distribution in the English language would be an over-usage of high-frequency verbs. In contrast, Persian features complex verbs with an overall distribution lacking extremely high-frequency verbs like those found in English. As a result, we hypothesize that Persian-speaking pwAD would not have a bias toward the overuse of high-frequency verbs. We analyzed language samples from 95 English-speaking pwAD and 91 healthy controls, along with 27 Persian-speaking pwAD and 27 healthy controls. Employing uniform automated natural language processing methods, we measured the usage rates of nouns, verbs, and word frequencies across both cohorts. Our findings showed that English-speaking pwAD use higher-frequency verbs than healthy individuals, a pattern not mirrored by Persian-speaking pwAD. 
Crucially, we found a significant interaction between the frequencies of verbs used by English and Persian speakers with and without AD. Moreover, regression models that treated noun and verb frequencies as separate predictors did not outperform models that considered overall word frequency alone in classifying AD. In conclusion, this study suggests that language abnormalities among English-speaking pwAD reflect the unique distributional properties of words in English rather than a universal noun-verb class distinction. Beyond offering a new understanding of language abnormalities in AD, the study highlights the critical need for further investigation across diverse languages to deepen our insight into the mechanisms of language impairments in neurological disorders.
Affiliation(s)
- Mostafa Almasi Dooghaee
- Abrar Institute of Higher Education
- Azad University Science and Research Branch
- Institute for Cognitive Science Studies
- Mashhad University of Medical Science
- Shahid Beheshti University of Medical Sciences
- Frontotemporal Disorders Unit, Department of Neurology, Massachusetts General Hospital, Harvard Medical School
- Princeton University
- Bradford C Dickerson
- Frontotemporal Disorders Unit, Department of Neurology, Massachusetts General Hospital, Harvard Medical School
- Neguine Rezaii
- Frontotemporal Disorders Unit, Department of Neurology, Massachusetts General Hospital, Harvard Medical School
6
Bayat S, Santai M, Panahi MM, Khodadadi A, Ghassimi M, Rezaei S, Besharat S, Mahboubi Z, Almasi M, Sanei Taheri M, Dickerson BC, Rezaii N. Language Abnormalities in Alzheimer's Disease Arise from Reduced Informativeness: A Cross-Linguistic Study in English and Persian. medRxiv 2024:2024.03.19.24304407. [PMID: 38562858 PMCID: PMC10984049 DOI: 10.1101/2024.03.19.24304407] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/04/2024]
Abstract
INTRODUCTION This research investigates the psycholinguistic origins of language impairments in Alzheimer's Disease (AD), questioning if these impairments result from language-specific structural disruptions or from a universal deficit in generating meaningful content. METHODS Cross-linguistic analysis was conducted on language samples from 184 English and 52 Persian speakers, comprising both AD patients and healthy controls, to extract various language features. Furthermore, we introduced a machine learning-based metric, Language Informativeness Index (LII), to quantify informativeness. RESULTS Indicators of AD in English were found to be highly predictive of AD in Persian, with a 92.3% classification accuracy. Additionally, we found robust correlations between the typical linguistic abnormalities of AD and language emptiness (low LII) across both languages. DISCUSSION Findings suggest that linguistic impairments in AD are attributable to a core universal difficulty in generating informative messages. Our approach underscores the importance of incorporating biocultural diversity into research, fostering the development of inclusive diagnostic tools.
7
Rezaii N, Hochberg D, Quimby M, Wong B, McGinnis S, Dickerson BC, Putcha D. Language uncovers visuospatial dysfunction in posterior cortical atrophy: a natural language processing approach. Front Neurosci 2024; 18:1342909. [PMID: 38379764 PMCID: PMC10876777 DOI: 10.3389/fnins.2024.1342909] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/22/2023] [Accepted: 01/18/2024] [Indexed: 02/22/2024]
Abstract
Introduction Posterior Cortical Atrophy (PCA) is a syndrome characterized by a progressive decline in higher-order visuospatial processing, leading to symptoms such as space perception deficit, simultanagnosia, and object perception impairment. While PCA is primarily known for its impact on visuospatial abilities, recent studies have documented language abnormalities in PCA patients. This study aims to delineate the nature and origin of language impairments in PCA, hypothesizing that language deficits reflect the visuospatial processing impairments of the disease. Methods We compared the language samples of 25 patients with PCA with age-matched cognitively normal (CN) individuals across two distinct tasks: a visually-dependent picture description and a visually-independent job description task. We extracted word frequency, word utterance latency, and spatial relational words for this comparison. We then conducted an in-depth analysis of the language used in the picture description task to identify specific linguistic indicators that reflect the visuospatial processing deficits of PCA. Results Patients with PCA showed significant language deficits in the visually-dependent task, characterized by higher word frequency, prolonged utterance latency, and fewer spatial relational words, but not in the visually-independent task. An in-depth analysis of the picture description task further showed that PCA patients struggled to identify certain visual elements as well as the overall theme of the picture. A predictive model based on these language features distinguished PCA patients from CN individuals with high classification accuracy. Discussion The findings indicate that language is a sensitive behavioral construct to detect visuospatial processing abnormalities of PCA. 
These insights offer theoretical and clinical avenues for understanding and managing PCA, underscoring language as a crucial marker for the visuospatial deficits of this atypical variant of Alzheimer's disease.
Affiliation(s)
- Neguine Rezaii
- Frontotemporal Disorders Unit, Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, United States
- Daisy Hochberg
- Frontotemporal Disorders Unit, Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, United States
- Megan Quimby
- Frontotemporal Disorders Unit, Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, United States
- Bonnie Wong
- Frontotemporal Disorders Unit, Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, United States
- Scott McGinnis
- Center for Brain Mind Medicine, Department of Neurology, Brigham and Women’s Hospital, Boston, MA, United States
- Bradford C. Dickerson
- Frontotemporal Disorders Unit, Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, United States
- Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital, Charlestown, MA, United States
- Alzheimer’s Disease Research Center, Massachusetts General Hospital, Charlestown, MA, United States
- Deepti Putcha
- Frontotemporal Disorders Unit, Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, United States
8
Rezaii N, Quimby M, Wong B, Hochberg D, Brickhouse M, Touroutoglou A, Dickerson BC, Wolff P. Using Generative Artificial Intelligence to Classify Primary Progressive Aphasia from Connected Speech. medRxiv 2023:2023.12.22.23300470. [PMID: 38234853 PMCID: PMC10793520 DOI: 10.1101/2023.12.22.23300470] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 01/19/2024]
Abstract
Neurodegenerative dementia syndromes, such as Primary Progressive Aphasias (PPA), have traditionally been diagnosed based in part on verbal and nonverbal cognitive profiles. Debate continues about whether PPA is best subdivided into three variants and also regarding the most distinctive linguistic features for classifying PPA variants. In this study, we harnessed the capabilities of artificial intelligence (AI) and natural language processing (NLP) to first perform unsupervised classification of concise, connected speech samples from 78 PPA patients. Large Language Models discerned three distinct PPA clusters, with 88.5% agreement with independent clinical diagnoses. Patterns of cortical atrophy of three data-driven clusters corresponded to the localization in the clinical diagnostic criteria. We then used NLP to identify linguistic features that best dissociate the three PPA variants. Seventeen features emerged as most valuable for this purpose, including the observation that separating verbs into high and low-frequency types significantly improves classification accuracy. Using these linguistic features derived from the analysis of brief connected speech samples, we developed a classifier that achieved 97.9% accuracy in predicting PPA subtypes and healthy controls. Our findings provide pivotal insights for refining early-stage dementia diagnosis, deepening our understanding of the characteristics of these neurodegenerative phenotypes and the neurobiology of language processing, and enhancing diagnostic evaluation accuracy.
Affiliation(s)
- Neguine Rezaii
- Frontotemporal Disorders Unit, Massachusetts General Hospital & Harvard Medical School, Boston MA, USA
- Department of Neurology, Massachusetts General Hospital & Harvard Medical School, Boston MA, USA
- Megan Quimby
- Frontotemporal Disorders Unit, Massachusetts General Hospital & Harvard Medical School, Boston MA, USA
- Bonnie Wong
- Frontotemporal Disorders Unit, Massachusetts General Hospital & Harvard Medical School, Boston MA, USA
- Department of Psychiatry, Massachusetts General Hospital & Harvard Medical School, Boston MA, USA
- Daisy Hochberg
- Frontotemporal Disorders Unit, Massachusetts General Hospital & Harvard Medical School, Boston MA, USA
- Michael Brickhouse
- Frontotemporal Disorders Unit, Massachusetts General Hospital & Harvard Medical School, Boston MA, USA
- Alexandra Touroutoglou
- Frontotemporal Disorders Unit, Massachusetts General Hospital & Harvard Medical School, Boston MA, USA
- Department of Neurology, Massachusetts General Hospital & Harvard Medical School, Boston MA, USA
- Massachusetts Alzheimer’s Disease Research Center, Massachusetts General Hospital & Harvard Medical School, Boston MA, USA
- Bradford C. Dickerson
- Frontotemporal Disorders Unit, Massachusetts General Hospital & Harvard Medical School, Boston MA, USA
- Department of Neurology, Massachusetts General Hospital & Harvard Medical School, Boston MA, USA
- Department of Psychiatry, Massachusetts General Hospital & Harvard Medical School, Boston MA, USA
- Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital & Harvard Medical School, Boston MA, USA
- Massachusetts Alzheimer’s Disease Research Center, Massachusetts General Hospital & Harvard Medical School, Boston MA, USA
- Phillip Wolff
- Department of Psychology, Emory University, Atlanta, GA, USA