1. Zhang E, Steel J, Togher L, Fromm D, MacWhinney B, Bogart E. Insights From Important Event Recounts Told by People With Traumatic Brain Injury. Journal of Speech, Language, and Hearing Research: JSLHR 2024; 67:3064-3080. [PMID: 39116308 DOI: 10.1044/2024_jslhr-23-00595] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/10/2024]
Abstract
PURPOSE: Communication can be chronically impacted by severe traumatic brain injury (TBI), yet there is a critical lack of research investigating communication recovery beyond 12 months postinjury with discourse measures. This longitudinal study aimed to investigate quantitative and qualitative changes in important event recounts produced by a group of people with severe TBI up to 2 years postinjury.
METHOD: A prospective observational design with an inception cohort was adopted. Thirty-four participants with severe TBI were asked to produce an important event recount at 6, 12, and 24 months postinjury. A mixed-methods approach comprised a quantitative analysis of microlinguistic and macrostructural measures, using the automated discourse command EVAL in Computerized Language Analysis (CLAN) and the CLAN Collaborative Commentary tool, respectively. Statistical analysis included a repeated-measures analysis of variance and the Friedman test. An independent qualitative content analysis was also conducted.
RESULTS: The measures revealed significant differences between 6 and 24 months, indicating a protracted recovery trajectory. The microlinguistic analysis showed increased use of revision and repetition over time. The macrostructural analysis indicated changes with orientation to recount characters, evaluative comments, and the number of events or complexity of the recount. The content analysis revealed categories of (a) childhood events, (b) family and relationships, (c) career and education, and (d) grief and loss. Topics at 6 months focused on childhood events and holidays, whereas career and education predominated at 24 months.
CONCLUSIONS: This is the first study to explore important event recounts told by people with severe TBI as they recovered. Participants showed discourse recovery beyond 12 months, highlighting the need for equivalent timing of service provision. The important event recount shows good potential as an ecologically valid assessment tool to evaluate communication recovery that can also be integrated with advances in computerized analysis. Analyses additionally provided insights into potential therapy targets and content categories for chronic discourse impairments.
SUPPLEMENTAL MATERIAL: https://doi.org/10.23641/asha.26499271
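The Friedman test named in the method compares repeated measurements by ranking each participant's scores across time points. The following is a minimal sketch of the test statistic only, assuming no tied values within a participant; the function name and the scores are hypothetical and are not data from the study.

```python
def friedman_statistic(scores):
    """Friedman chi-square for rows of repeated measurements.

    scores: one row per participant, one value per time point.
    Assumption: no tied values within a row, so no tie correction is applied.
    """
    n = len(scores)       # number of participants
    k = len(scores[0])    # number of repeated conditions (e.g. 6, 12, 24 months)
    rank_sums = [0.0] * k
    for row in scores:
        # Rank this participant's values across time points (1 = smallest).
        order = sorted(range(k), key=lambda j: row[j])
        for rank, j in enumerate(order, start=1):
            rank_sums[j] += rank
    sum_sq = sum(r * r for r in rank_sums)
    return 12.0 / (n * k * (k + 1)) * sum_sq - 3.0 * n * (k + 1)


# Hypothetical scores: rows are participants, columns are 6, 12, and 24 months.
example_scores = [
    [3, 5, 8],
    [2, 4, 6],
    [4, 5, 9],
    [1, 3, 7],
]
q = friedman_statistic(example_scores)
```

The statistic is then compared against a chi-square distribution with k - 1 degrees of freedom. With real data containing ties, an established statistics package would be the safer choice.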
Affiliation(s)
- Erica Zhang
  - The University of Sydney, New South Wales, Australia
- Joanne Steel
  - The University of Newcastle, New South Wales, Australia
- Leanne Togher
  - The University of Sydney, New South Wales, Australia
- Elise Bogart
  - The University of Sydney, New South Wales, Australia
2. Cong Y, LaCroix AN, Lee J. Clinical efficacy of pre-trained large language models through the lens of aphasia. Sci Rep 2024; 14:15573. [PMID: 38971898 PMCID: PMC11227580 DOI: 10.1038/s41598-024-66576-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/25/2024] [Accepted: 07/01/2024] [Indexed: 07/08/2024]
Abstract
The rapid development of large language models (LLMs) motivates us to explore how such state-of-the-art natural language processing systems can inform aphasia research. What kind of language indices can we derive from a pre-trained LLM? How do they differ from or relate to the existing language features in aphasia? To what extent can LLMs serve as an interpretable and effective diagnostic and measurement tool in a clinical context? To investigate these questions, we constructed predictive and correlational models, which utilize mean surprisals from LLMs as predictor variables. Using AphasiaBank archived data, we validated our models' efficacy in aphasia diagnosis, measurement, and prediction. Our finding is that LLMs-surprisals can effectively detect the presence of aphasia and different natures of the disorder, LLMs in conjunction with the existing language indices improve models' efficacy in subtyping aphasia, and LLMs-surprisals can capture common agrammatic deficits at both word and sentence level. Overall, LLMs have potential to advance automatic and precise aphasia prediction. A natural language processing pipeline can be greatly benefitted from integrating LLMs, enabling us to refine models of existing language disorders, such as aphasia.
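Surprisal is the negative log probability a language model assigns to a token given its context, and the mean over an utterance gives one number per sample. The sketch below shows only that arithmetic, assuming the per-token probabilities have already been obtained from some model; the function name and the probability values are hypothetical and do not come from the study.

```python
import math


def mean_surprisal(token_probs):
    """Average surprisal in bits: the mean of -log2(p) over all tokens."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    return sum(-math.log2(p) for p in token_probs) / len(token_probs)


# Hypothetical per-token probabilities for two utterances.
expected_utterance = [0.5, 0.25, 0.5, 0.25]
unexpected_utterance = [0.125, 0.0625, 0.25, 0.125]

low = mean_surprisal(expected_utterance)
high = mean_surprisal(unexpected_utterance)
```

Lower token probabilities yield higher mean surprisal, which is the property that lets such a value serve as a predictor variable.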
Affiliation(s)
- Yan Cong
  - School of Languages and Cultures, Purdue University, West Lafayette, USA
- Arianna N LaCroix
  - Department of Speech, Language, and Hearing Sciences, Purdue University, West Lafayette, USA
- Jiyeon Lee
  - Department of Speech, Language, and Hearing Sciences, Purdue University, West Lafayette, USA
3. Riccardi N, Nelakuditi S, den Ouden DB, Rorden C, Fridriksson J, Desai RH. Discourse- and lesion-based aphasia quotient estimation using machine learning. Neuroimage Clin 2024; 42:103602. [PMID: 38593534 PMCID: PMC11016805 DOI: 10.1016/j.nicl.2024.103602] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/28/2023] [Revised: 04/01/2024] [Accepted: 04/01/2024] [Indexed: 04/11/2024]
Abstract
Discourse is a fundamentally important aspect of communication, and discourse production provides a wealth of information about linguistic ability. Aphasia commonly affects, in multiple ways, the ability to produce discourse. Comprehensive aphasia assessments such as the Western Aphasia Battery-Revised (WAB-R) are time- and resource-intensive. We examined whether discourse measures can be used to estimate WAB-R Aphasia Quotient (AQ), and whether this can serve as an ecologically valid, less resource-intensive measure. We used features extracted from discourse tasks using three AphasiaBank prompts involving expositional (picture description), story narrative, and procedural discourse. These features were used to train a machine learning model to predict the WAB-R AQ. We also compared and supplemented the model with lesion location information from structural neuroimaging. We found that discourse-based models could estimate AQ well, and that they outperformed models based on lesion features. Addition of lesion features to the discourse features did not improve the performance of the discourse model substantially. Inspection of the most informative discourse features revealed that different prompt types taxed different aspects of language. These findings suggest that discourse can be used to estimate aphasia severity, and provide insight into the linguistic content elicited by different types of discourse prompts.
Affiliation(s)
- Nicholas Riccardi
  - Department of Communication Sciences and Disorders, University of South Carolina, United States
- Dirk B den Ouden
  - Department of Communication Sciences and Disorders, University of South Carolina, United States
- Chris Rorden
  - Department of Psychology, University of South Carolina, United States
- Julius Fridriksson
  - Department of Communication Sciences and Disorders, University of South Carolina, United States
- Rutvik H Desai
  - Department of Psychology, University of South Carolina, United States
4. Fromm D, Dalton SG, Brick A, Olaiya G, Hill S, Greenhouse J, MacWhinney B. The Case of the Cookie Jar: Differences in Typical Language Use in Dementia. J Alzheimers Dis 2024; 100:1417-1434. [PMID: 38995772 PMCID: PMC11380261 DOI: 10.3233/jad-230844] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/14/2024]
Abstract
Background: Findings from language sample analyses can provide efficient and effective indicators of cognitive impairment in older adults.
Objective: This study used newly automated core lexicon analyses of Cookie Theft picture descriptions to assess differences in typical use across three groups.
Methods: Participants included adults without diagnosed cognitive impairments (Control), adults diagnosed with Alzheimer's disease (ProbableAD), and adults diagnosed with mild cognitive impairment (MCI). Cookie Theft picture descriptions were transcribed and analyzed using CLAN.
Results: Results showed that the ProbableAD group used significantly fewer core lexicon words overall than the MCI and Control groups. For core lexicon content words (nouns, verbs), however, both the MCI and ProbableAD groups produced significantly fewer words than the Control group. The groups did not differ in their use of core lexicon function words. The ProbableAD group was also slower to produce most of the core lexicon words than the MCI and Control groups. The MCI group was slower than the Control group for only two of the core lexicon content words. All groups mentioned a core lexicon word in the top left quadrant of the picture early in the description. The ProbableAD group was then significantly slower than the other groups to mention a core lexicon word in the other quadrants.
Conclusions: This standard and simple-to-administer task reveals group differences in overall core lexicon scores and the amount of time until the speaker produces the key items. Clinicians and researchers can use these tools for both early assessment and measurement of change over time.
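A core lexicon score counts how many items from a fixed word list a speaker produces in a description. The sketch below illustrates that idea under simple assumptions (whitespace tokenization, exact word match, no lemmatization). The word list and sample sentence are hypothetical; they are not the study's lexicon, and this is not the CLAN implementation.

```python
def core_lexicon_items(transcript, core_lexicon):
    """Return the core lexicon items that occur in the transcript, sorted."""
    # Lowercase and strip simple trailing or leading punctuation from each word.
    tokens = {word.strip(".,!?").lower() for word in transcript.split()}
    return sorted(tokens & set(core_lexicon))


# Hypothetical core lexicon for a picture description task.
HYPOTHETICAL_CORE = ["boy", "cookie", "jar", "mother", "sink", "stool", "water"]

sample = "The boy is on the stool and the water is running over the sink."
produced = core_lexicon_items(sample, HYPOTHETICAL_CORE)
score = len(produced)
```

A fuller version would match inflected forms to their lemmas and record the time at which each item is first produced.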
Affiliation(s)
- Davida Fromm
  - Department of Psychology, Carnegie Mellon University, Pittsburgh, PA, USA
- Sarah Grace Dalton
  - Department of Speech Pathology and Audiology, Marquette University, Milwaukee, WI, USA
- Alexander Brick
  - Department of Statistics and Data Science, Carnegie Mellon University, Pittsburgh, PA, USA
- Gbenuola Olaiya
  - Department of Statistics and Data Science, Carnegie Mellon University, Pittsburgh, PA, USA
- Sophia Hill
  - Department of Statistics and Data Science, Carnegie Mellon University, Pittsburgh, PA, USA
- Joel Greenhouse
  - Department of Statistics and Data Science, Carnegie Mellon University, Pittsburgh, PA, USA
- Brian MacWhinney
  - Department of Psychology, Carnegie Mellon University, Pittsburgh, PA, USA
5. Liu H, MacWhinney B, Fromm D, Lanzi A. Automation of Language Sample Analysis. Journal of Speech, Language, and Hearing Research: JSLHR 2023; 66:2421-2433. [PMID: 37348510 PMCID: PMC10555460 DOI: 10.1044/2023_jslhr-22-00642] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 11/11/2022] [Revised: 01/14/2023] [Accepted: 04/13/2023] [Indexed: 06/24/2023]
Abstract
PURPOSE: A major barrier to the wider use of language sample analysis (LSA) is the fact that transcription is very time intensive. Methods that can reduce the required time and effort could help in promoting the use of LSA for clinical practice and research.
METHOD: This article describes an automated pipeline, called Batchalign, that takes raw audio and creates full transcripts in Codes for the Human Analysis of Talk (CHAT) transcription format, complete with utterance- and word-level time alignments and morphosyntactic analysis. The pipeline only requires major human intervention for final checking. It combines a series of existing tools with additional novel reformatting processes. The steps in the pipeline are (a) automatic speech recognition, (b) utterance tokenization, (c) automatic corrections, (d) speaker ID assignment, (e) forced alignment, (f) user adjustments, and (g) automatic morphosyntactic and profiling analyses.
RESULTS: For work with recordings from adults with language disorders, six major results were obtained: (a) The word error rate was between 2.4% for controls and 3.4% for patients, (b) utterance tokenization accuracy was at the level reported for speakers without language disorders, (c) word-level diarization accuracy was at 93% for control participants and 83% for participants with language disorders, (d) utterance-level diarization accuracy based on word-level diarization was high, (e) adherence to CHAT format was fully accurate, and (f) human transcriber time was reduced by up to 75%.
CONCLUSION: The pipeline dramatically shortens the time gap between data collection and data analysis and provides an output superior to that typically generated by human transcribers.
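Word error rate is conventionally defined as the word-level edit distance between a reference transcript and the recognizer output, divided by the number of reference words. The following is a minimal sketch of that definition, assuming whitespace tokenization; the function name and example sentences are hypothetical and are not taken from Batchalign.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")
    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing from hypothesis
                curr[j - 1] + 1,     # extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical human reference and recognizer output.
wer = word_error_rate("the boy takes a cookie", "the boy take cookie")
```

Here one substitution and one missing word against five reference words give a rate of 0.4. Published figures usually also normalize case and punctuation before scoring, which this sketch does not do.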
Affiliation(s)
- Brian MacWhinney
  - Department of Psychology, Carnegie Mellon University, Pittsburgh, PA
- Davida Fromm
  - Department of Psychology, Carnegie Mellon University, Pittsburgh, PA
- Alyssa Lanzi
  - Communication Sciences and Disorders Department, University of Delaware, Newark
6. Lanzi AM, Saylor AK, Fromm D, Liu H, MacWhinney B, Cohen ML. DementiaBank: Theoretical Rationale, Protocol, and Illustrative Analyses. American Journal of Speech-Language Pathology 2023; 32:426-438. [PMID: 36791255 PMCID: PMC10171844 DOI: 10.1044/2022_ajslp-22-00281] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 09/05/2022] [Revised: 11/01/2022] [Accepted: 11/25/2022] [Indexed: 05/12/2023]
Abstract
PURPOSE: Dementia from Alzheimer's disease (AD) is characterized primarily by a significant decline in memory abilities; however, language abilities are also commonly affected and may precede the decline of other cognitive abilities. To study the progression of language, there is a need for open-access databases that can be used to build algorithms to produce translational models sensitive enough to detect early declines in language abilities. DementiaBank is an open-access repository of transcribed video/audio data from communicative interactions from people with dementia, mild cognitive impairment (MCI), and controls. The aims of this tutorial are to (a) describe the newly established standardized DementiaBank discourse protocol, (b) describe the Delaware corpus data, and (c) provide examples of automated linguistic analyses that can be conducted with the Delaware corpus data and describe additional DementiaBank resources.
METHOD: The DementiaBank discourse protocol elicits four types of discourse: picture description, story narrative, procedural, and personal narrative. The Delaware corpus currently includes data from 20 neurotypical adults and 33 adults with MCI from possible AD who completed the DementiaBank discourse protocol and a cognitive-linguistic battery. Language samples were video- and audio-recorded, transcribed, coded, and uploaded to DementiaBank. The protocol materials and transcription programs can be accessed for free via the DementiaBank website.
RESULTS: Illustrative analyses show the potential of the Delaware corpus data to help understand discourse metrics at the individual and group levels. In addition, they highlight analyses that could be used across TalkBank's other clinical banks (e.g., AphasiaBank). Information is also included on manual and automatic speech recognition transcription methods.
CONCLUSIONS: DementiaBank is a shared online database that can facilitate research efforts to address the gaps in knowledge about language changes associated with MCI and dementia from AD. Identifying early language markers could lead to improved assessment and treatment approaches for adults at risk for dementia.
Affiliation(s)
- Alyssa M. Lanzi
  - Department of Communication Sciences and Disorders, University of Delaware, Newark
  - Delaware Center for Cognitive Aging Research, University of Delaware, Newark
- Anna K. Saylor
  - Department of Communication Sciences and Disorders, University of Delaware, Newark
- Davida Fromm
  - Department of Psychology, Carnegie Mellon University, Pittsburgh, PA
- Brian MacWhinney
  - Department of Psychology, Carnegie Mellon University, Pittsburgh, PA
- Matthew L. Cohen
  - Department of Communication Sciences and Disorders, University of Delaware, Newark
  - Delaware Center for Cognitive Aging Research, University of Delaware, Newark
  - Center for Health Assessment Research and Translation, University of Delaware, Newark