1. Bosen AK, Doria GM. Identifying Links Between Latent Memory and Speech Recognition Factors. Ear Hear 2024;45:351-369. PMID: 37882100; PMCID: PMC10922378; DOI: 10.1097/aud.0000000000001430.
Abstract
OBJECTIVES The link between memory ability and speech recognition accuracy is often examined by correlating summary measures of performance across various tasks, but interpretation of such correlations critically depends on assumptions about how these measures map onto underlying factors of interest. The present work presents an alternative approach, wherein latent factor models are fit to trial-level data from multiple tasks to directly test hypotheses about the underlying structure of memory and the extent to which latent memory factors are associated with individual differences in speech recognition accuracy. Latent factor models with different numbers of factors were fit to the data and compared to one another to select the structures that best explained vocoded sentence recognition in a two-talker masker across a range of target-to-masker ratios, performance on three memory tasks, and the link between sentence recognition and memory. DESIGN Young adults with normal hearing (N = 52 for the memory tasks, of whom 21 also completed the sentence recognition task) completed three memory tasks and one sentence recognition task: reading span, auditory digit span, visual free recall of words, and recognition of 16-channel vocoded Perceptually Robust English Sentence Test Open-set sentences in the presence of a two-talker masker at target-to-masker ratios between +10 and 0 dB. Correlations between summary measures of memory task performance and sentence recognition accuracy were calculated for comparison to prior work, and latent factor models were fit to trial-level data and compared against one another to identify the number of latent factors that best explained the data. Models with one or two latent factors were fit to the sentence recognition data, and models with one, two, or three latent factors were fit to the memory task data.
Based on findings with these models, full models that linked one speech factor to one, two, or three memory factors were fit to the full data set. Models were compared via Expected Log pointwise Predictive Density and post hoc inspection of model parameters. RESULTS Summary measures were positively correlated across memory tasks and sentence recognition. Latent factor models revealed that sentence recognition accuracy was best explained by a single factor that varied across participants. Memory task performance was best explained by two latent factors, of which one was generally associated with performance on all three tasks and the other was specific to digit span recall accuracy at lists of six digits or more. When these models were combined, the general memory factor was closely related to the sentence recognition factor, whereas the factor specific to digit span had no apparent association with sentence recognition. CONCLUSIONS Comparison of latent factor models enables testing hypotheses about the underlying structure linking cognition and speech recognition. This approach showed that multiple memory tasks assess a common latent factor that is related to individual differences in sentence recognition, although performance on some tasks was associated with multiple factors. Thus, while these tasks provide some convergent assessment of common latent factors, caution is needed when interpreting what they tell us about speech recognition.
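The models in this study were compared via Expected Log pointwise Predictive Density (ELPD). The core quantity behind that comparison can be sketched as follows. This is an illustrative in-sample log pointwise predictive density computed from posterior log-likelihood draws, not the authors' analysis; a real comparison would typically use cross-validated ELPD (e.g., PSIS-LOO via a package such as ArviZ), and the simulated draws below are hypothetical.

```python
import numpy as np

def lpd(loglik):
    """Log pointwise predictive density from an (S, N) matrix of
    log-likelihoods (S posterior draws x N data points):
    lpd = sum_i log( (1/S) * sum_s exp(loglik[s, i]) ),
    computed stably via the log-sum-exp trick."""
    m = loglik.max(axis=0)
    pointwise = m + np.log(np.exp(loglik - m).mean(axis=0))
    return pointwise.sum()

# Hypothetical comparison: a model that assigns each observed trial
# probability 0.8 scores higher than one that assigns 0.6.
draws_model_a = np.full((1000, 50), np.log(0.8))
draws_model_b = np.full((1000, 50), np.log(0.6))
difference = lpd(draws_model_a) - lpd(draws_model_b)  # positive favours model A
```

In practice the pointwise terms also yield a standard error for the difference, which is what makes ELPD useful for deciding whether one latent structure genuinely predicts the data better than another.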
2. Schauwecker N, Tamati TN, Moberly AC. Predicting Early Cochlear Implant Performance: Can Cognitive Testing Help? Otol Neurotol Open 2024;4:e050. PMID: 38533348; PMCID: PMC10962885; DOI: 10.1097/ono.0000000000000050.
Abstract
Introduction There is significant variability in speech recognition outcomes among adults who receive cochlear implants (CIs). Little is known about cognitive influences on very early CI performance, a period during which significant neural plasticity occurs. Methods Prospective study of 15 postlingually deafened adult CI candidates tested preoperatively with a battery of cognitive assessments. The mini-mental state exam (MMSE), forward digit span, Stroop measure of inhibition-concentration, and test of word reading efficiency were used to assess cognition. Consonant-nucleus-consonant words, AzBio sentences in quiet, and AzBio sentences in noise (+10 dB SNR) were used to assess speech recognition at 1 and 3 months of CI use. Results Performance on all speech measures at 1 month was moderately correlated with preoperative MMSE, but these correlations did not remain significant after correcting for multiple comparisons. Forward digit span showed large correlations with 1-month AzBio quiet (P ≤ 0.001, rho = 0.762) and AzBio noise (P ≤ 0.001, rho = 0.860), both of which survived correction. At 3 months, forward digit span was strongly predictive of AzBio noise (P ≤ 0.001, rho = 0.786), and this correlation also survived correction. Changes in speech recognition scores were not correlated with preoperative cognitive test scores. Conclusions Working memory capacity significantly predicted early CI sentence recognition performance in our small cohort, while the other cognitive functions assessed did not. These results differ from prior studies predicting longer-term outcomes. These findings and further studies may lead to better preoperative counseling and help identify patients who require closer evaluation to ensure optimal CI performance.
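The analysis pattern described here, rank correlations checked against a multiple-comparisons correction, can be sketched as follows. The data are simulated and the variable names are illustrative; this is not the study's dataset or code, and the Holm-Bonferroni procedure shown is one common choice of correction, not necessarily the one the authors used.

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
n = 15  # cohort size in the study
digit_span = rng.normal(size=n)

# Simulated speech measures; two are driven by digit span by construction.
speech = {
    "CNC words":   rng.normal(size=n),
    "AzBio quiet": digit_span + rng.normal(scale=0.5, size=n),
    "AzBio noise": digit_span + rng.normal(scale=0.4, size=n),
}

def holm(pvals, alpha=0.05):
    """Holm-Bonferroni step-down correction: reject the k-th smallest
    p-value only if it is <= alpha / (m - k), stopping at the first failure."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    reject = [False] * m
    for k, i in enumerate(order):
        if pvals[i] <= alpha / (m - k):
            reject[i] = True
        else:
            break
    return reject

rhos, pvals = zip(*(spearmanr(digit_span, y) for y in speech.values()))
significant = dict(zip(speech, holm(list(pvals))))
```

With only three comparisons the Holm thresholds are alpha/3, alpha/2, and alpha, which is why a moderate uncorrected correlation (like the MMSE results above) can fail to survive correction while stronger ones do.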
Affiliation(s)
- Natalie Schauwecker: Department of Otolaryngology – Head and Neck Surgery, Vanderbilt University Medical Center, Nashville, Tennessee
- Terrin N. Tamati: Department of Otolaryngology – Head and Neck Surgery, Vanderbilt University Medical Center, Nashville, Tennessee; Department of Otorhinolaryngology/Head and Neck Surgery, University of Groningen, University Medical Center Groningen, Groningen, The Netherlands
- Aaron C. Moberly: Department of Otolaryngology – Head and Neck Surgery, Vanderbilt University Medical Center, Nashville, Tennessee
3. Pearson DV, Shen Y, McAuley JD, Kidd GR. The effect of rhythm on selective listening in multiple-source environments for young and older adults. Hear Res 2023;435:108789. PMID: 37276686; PMCID: PMC10460128; DOI: 10.1016/j.heares.2023.108789.
Abstract
Understanding continuous speech against competing background sounds is challenging, particularly for older adults. One stimulus property that may aid listeners' understanding of to-be-attended (target) material is temporal regularity (rhythm). In the context of speech-in-noise understanding, McAuley and colleagues recently showed a target rhythm effect whereby recognition of target speech was better when the natural speech rhythm of a target talker was intact than when it was temporally altered. The current study replicates the target rhythm effect using a synthetic vowel-sequence paradigm in young adults (Experiment 1) and then uses this paradigm to investigate potential age-related changes in the effect of rhythm on recognition (Experiment 2). Listeners identified the last three vowels of temporally regular (isochronous) and irregular (anisochronous) synthetic vowel sequences, in quiet and with a competing background sequence of vowel-like harmonic tone complexes presented at various tempos. The results replicated the target rhythm effect: temporal regularity in the vowel sequences improved young listeners' identification accuracy relative to irregular vowel sequences. The magnitude of the effect was not influenced by background tempo, although faster background tempos led to greater vowel identification accuracy independent of regularity. Older listeners also demonstrated a target rhythm effect but received less benefit from the temporal regularity of the target sequences than young listeners did. This study highlights the importance of rhythm for understanding age-related differences in selective listening in complex environments and provides a novel paradigm for investigating effects of rhythm on perception.
Affiliation(s)
- Dylan V Pearson: Department of Speech, Language, and Hearing Sciences, Indiana University, United States
- Yi Shen: Department of Speech and Hearing Sciences, University of Washington, United States
- J Devin McAuley: Department of Psychology, Michigan State University, United States
- Gary R Kidd: Department of Speech, Language, and Hearing Sciences, Indiana University, United States
4. Bianco R, Chait M. No Link Between Speech-in-Noise Perception and Auditory Sensory Memory - Evidence From a Large Cohort of Older and Younger Listeners. Trends Hear 2023;27:23312165231190688. PMID: 37828868; PMCID: PMC10576936; DOI: 10.1177/23312165231190688. Open access.
Abstract
A growing literature demonstrates a link between working memory (WM) and speech-in-noise (SiN) perception. However, the nature of this correlation, and which components of WM might underlie it, are debated. We investigated how SiN reception links with auditory sensory memory (aSM), the low-level processes that support short-term maintenance of temporally unfolding sounds. A large sample of older (N = 199, 60-79 years old) and younger (N = 149, 20-35 years old) participants was recruited online and performed a coordinate response measure-based speech-in-babble task that taps listeners' ability to track a speech target in background noise. We used two tasks to investigate implicit and explicit aSM. Both were based on tone patterns that overlap with speech in processing time scales (presentation rate of tones 20 Hz; of patterns 2 Hz). We hypothesised that a link between SiN and aSM may be particularly apparent in older listeners because of age-related reductions in both SiN reception and aSM. We confirmed impaired SiN reception in the older cohort and demonstrated reduced aSM performance in those listeners. However, SiN and aSM did not share variability. Across the two age groups, SiN performance was predicted by a binaural processing test and by age. The results suggest that previously observed links between WM and SiN may relate to the executive components and other cognitive demands of the tasks used. This finding helps to constrain the search for the perceptual and cognitive factors that explain individual variability in SiN performance.
Affiliation(s)
- Roberta Bianco: Ear Institute, University College London, London, UK; Neuroscience of Perception and Action Lab, Italian Institute of Technology (IIT), Rome, Italy
- Maria Chait: Ear Institute, University College London, London, UK
5. An easy way to improve scoring of memory span tasks: The edit distance, beyond "correct recall in the correct serial position". Behav Res Methods 2022. PMID: 35794418; DOI: 10.3758/s13428-022-01908-2.
Abstract
For researchers and psychologists interested in estimating a subject's memory capacity, the current standard for scoring memory span tasks is the partial-credit method: subjects are credited with the number of stimuli that they manage to recall in the correct serial position. A critical issue with this method, however, is that intrusions and omissions can radically change the scores depending on where they occur. For example, when recalling the sequence ABCDE, "ABCD" is worth 4 points but "BCDE" is worth 0 points. This paper presents an improved scoring method based on the edit distance, that is, the number of changes required to edit the recalled sequence into the target. Edit-distance scoring gives results close to partial-credit scoring, but without the corresponding vulnerability to positional shifts. A reanalysis of memory performance in two large datasets (N = 1093 and N = 758) confirms that, in addition to being more logically consistent, edit-distance scoring has psychometric properties similar to or better than those of partial-credit scoring, with comparable validity, a small increase in reliability, and a substantial increase in test information (measurement precision in the context of item response theory). Test information was especially improved for harder items and for subjects with ability in the lower range, whose scores tend to be severely underestimated by partial-credit scoring. Code to compute edit-distance scores with various software is available at https://osf.io/wdb83/.
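The two scoring rules contrasted above are easy to implement; a minimal Python sketch follows (the paper's own implementations, in several languages, are at the OSF link). The edit distance here is the standard Levenshtein distance.

```python
def edit_distance(recalled, target):
    """Levenshtein distance: the minimum number of insertions, deletions,
    and substitutions needed to turn `recalled` into `target`."""
    m, n = len(recalled), len(target)
    # dp[i][j] = distance between recalled[:i] and target[:j]
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i          # delete everything
    for j in range(n + 1):
        dp[0][j] = j          # insert everything
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if recalled[i - 1] == target[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,        # deletion
                           dp[i][j - 1] + 1,        # insertion
                           dp[i - 1][j - 1] + cost) # substitution
    return dp[m][n]

def edit_distance_score(recalled, target):
    """Edit-distance score: list length minus edit distance (higher is better)."""
    return len(target) - edit_distance(recalled, target)

def partial_credit_score(recalled, target):
    """Standard scoring: count of items recalled in the correct serial position."""
    return sum(r == t for r, t in zip(recalled, target))

# The abstract's example: a one-item positional shift is catastrophic
# under partial credit but costs only one point under edit distance.
partial_credit_score("ABCD", "ABCDE")   # 4
partial_credit_score("BCDE", "ABCDE")   # 0
edit_distance_score("ABCD", "ABCDE")    # 4
edit_distance_score("BCDE", "ABCDE")    # 4
```

Both scores share the same maximum (the list length), so edit-distance scoring can be dropped into existing span-task pipelines without rescaling.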
6. Luo X, Azuma T, Kolberg C, Pulling KR. The effects of stimulus modality, task complexity, and cuing on working memory and the relationship with speech recognition in older cochlear implant users. J Commun Disord 2022;95:106170. PMID: 34839068; DOI: 10.1016/j.jcomdis.2021.106170.
Abstract
INTRODUCTION The role of working memory (WM) in speech recognition of older cochlear implant (CI) users remains unclear. This study 1) examined the effects of aging and CI on WM performance across different modalities (auditory vs. visual) and cuing conditions, and 2) assessed how specific WM measures relate to sentence and word recognition in noise. METHOD Fourteen Older CI users, 12 Older acoustic-hearing (AH) listeners with age-appropriate hearing loss, and 15 Young normal-hearing (NH) listeners were tested. Participants completed two simple span tasks (auditory digit and visual letter span), two complex WM tasks (reading span and cued-modality WM with simultaneously presented auditory digits and visual letters), and two speech recognition tasks (sentence and word recognition in speech-babble noise). RESULTS The groups showed similar simple span performance, except that Older CI users had lower auditory digit span than Young NH listeners. Both older groups had similar reading span performance, but scored significantly lower than Young NH listeners, indicating age-related declines in attentional and phonological processing. A similar group effect was observed in the cued-modality WM task. All groups showed higher recall for auditory digits than for visual letters and the advantage was most evident without modality cuing. All groups displayed greater cuing benefits for visual recall than for auditory recall, suggesting that participants consistently allocated more attention to auditory stimuli regardless of cuing. For Older CI users, after controlling for the previously reported spectral resolution, auditory-uncued WM performance was significantly correlated with word recognition but not sentence recognition. CONCLUSIONS Complex WM was significantly affected by aging but not by CI. Neither aging nor CI significantly affected modality cuing benefits in the WM task. 
For Older CI users, complex auditory WM with attentional control may better reflect the cognitive load of speech recognition in noise than simple span or complex visual WM.
Affiliation(s)
- Xin Luo, Tamiko Azuma, Courtney Kolberg, and Kathryn R Pulling: Program of Speech and Hearing Science, College of Health Solutions, Arizona State University, Tempe, AZ, United States of America
7. Lewis JH, Castellanos I, Moberly AC. The Impact of Neurocognitive Skills on Recognition of Spectrally Degraded Sentences. J Am Acad Audiol 2021;32:528-536. PMID: 34965599; DOI: 10.1055/s-0041-1732438.
Abstract
BACKGROUND Recent models theorize that neurocognitive resources are deployed differently during speech recognition depending on task demands, such as the severity of signal degradation or the modality of presentation (auditory vs. audiovisual [AV]). This concept is particularly relevant to the adult cochlear implant (CI) population, given the large variability among CI users in spectro-temporal processing abilities. However, disentangling the effects of individual differences in spectro-temporal processing and neurocognitive skills on speech recognition in clinical populations of adult CI users is challenging. Thus, this study investigated the relationship between neurocognitive functions and recognition of spectrally degraded speech in a group of young adult normal-hearing (NH) listeners. PURPOSE The aim of this study was to manipulate the degree of spectral degradation and the modality of speech presented to young adult NH listeners to determine whether deployment of neurocognitive skills would be affected. RESEARCH DESIGN Correlational study design. STUDY SAMPLE Twenty-one NH college students. DATA COLLECTION AND ANALYSIS Participants listened to sentences in three spectral-degradation conditions: no degradation (clear sentences); moderate degradation (8-channel noise-vocoded); and high degradation (4-channel noise-vocoded). Thirty sentences were presented in an auditory-only (A-only) modality and in an AV modality. Visual assessments from the National Institutes of Health Toolbox Cognitive Battery were completed to evaluate working memory, inhibition-concentration, cognitive flexibility, and processing speed. Analyses of variance compared speech recognition performance across spectral-degradation conditions and modalities. Bivariate correlation analyses related speech recognition performance to the neurocognitive skills in the various test conditions.
RESULTS Main effects on sentence recognition were found for degree of degradation (p < 0.001) and modality (p < 0.001). Inhibition-concentration skills correlated moderately (r = 0.45, p = 0.02) with recognition scores for moderately degraded sentences in the A-only condition. No correlations were found between neurocognitive scores and AV speech recognition scores. CONCLUSIONS Inhibition-concentration skills are deployed differentially during sentence recognition, depending on the level of signal degradation. Additional studies will be required to examine these relations in clinical populations such as adult CI users.
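The noise-vocoding manipulation used in this study (and in several other entries on this page) can be sketched as follows: the signal is split into a small number of frequency bands, each band's amplitude envelope is extracted, and the envelopes modulate band-limited noise. This is a generic channel vocoder with illustrative parameter choices (filter order, frequency range, envelope extraction method), not the authors' exact processing chain.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def noise_vocode(x, fs, n_channels=8, fmin=100.0, fmax=6000.0):
    """Minimal noise vocoder: split `x` into n_channels log-spaced bands,
    extract each band's envelope, and use it to modulate band-limited
    noise. fmax must be below fs / 2."""
    edges = np.geomspace(fmin, fmax, n_channels + 1)
    rng = np.random.default_rng(0)
    out = np.zeros(len(x))
    for lo, hi in zip(edges[:-1], edges[1:]):
        sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
        band = sosfiltfilt(sos, x)               # analysis band
        env = np.abs(hilbert(band))              # amplitude envelope
        carrier = sosfiltfilt(sos, rng.standard_normal(len(x)))
        out += env * carrier                     # envelope-modulated noise
    return out
```

Reducing `n_channels` from 8 to 4 coarsens the spectral detail while leaving the temporal envelopes intact, mirroring the moderate versus high degradation conditions described above.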
Affiliation(s)
- Jessica H Lewis: Department of Otolaryngology - Head and Neck Surgery, The Ohio State University Wexner Medical Center, Columbus, Ohio; Department of Speech and Hearing Science, The Ohio State University, Columbus, Ohio
- Irina Castellanos: Department of Otolaryngology - Head and Neck Surgery, The Ohio State University Wexner Medical Center, Columbus, Ohio
- Aaron C Moberly: Department of Otolaryngology - Head and Neck Surgery, The Ohio State University Wexner Medical Center, Columbus, Ohio
8. Bosen AK, Sevich VA, Cannon SA. Forward Digit Span and Word Familiarity Do Not Correlate With Differences in Speech Recognition in Individuals With Cochlear Implants After Accounting for Auditory Resolution. J Speech Lang Hear Res 2021;64:3330-3342. PMID: 34251908; PMCID: PMC8740688; DOI: 10.1044/2021_jslhr-20-00574.
Abstract
Purpose In individuals with cochlear implants, speech recognition is not associated with tests of working memory that primarily reflect storage, such as forward digit span. In contrast, our previous work found that vocoded speech recognition in individuals with normal hearing was correlated with performance on a forward digit span task. A possible explanation for this difference across groups is that variability in auditory resolution across individuals with cochlear implants could conceal the true relationship between speech and memory tasks. Here, our goal was to determine whether performance on forward digit span and speech recognition tasks is correlated in individuals with cochlear implants after controlling for individual differences in auditory resolution. Method We measured sentence recognition ability in 20 individuals with cochlear implants using Perceptually Robust English Sentence Test Open-set sentences. Spectral and temporal modulation detection tasks were used to assess individual differences in auditory resolution, auditory forward digit span was used to assess working memory storage, and self-reported word familiarity was used to assess vocabulary. Results Individual differences in speech recognition were predicted by spectral and temporal resolution. A correlation was found between forward digit span and speech recognition, but this correlation was not significant after controlling for spectral and temporal resolution. No relationship was found between word familiarity and speech recognition. Forward digit span performance was not associated with individual differences in auditory resolution. Conclusions Our findings support the idea that sentence recognition in individuals with cochlear implants is primarily limited by individual differences in working memory processing, not storage. Studies examining the relationship between speech and memory should control for individual differences in auditory resolution.
Affiliation(s)
- Victoria A. Sevich: Boys Town National Research Hospital, Omaha, NE; The Ohio State University, Columbus
9.
Abstract
Sequences of phonologically similar words are more difficult to remember than phonologically distinct sequences. This study investigated whether this difficulty arises in the acoustic similarity of auditory stimuli or in the corresponding phonological labels in memory. Participants reconstructed sequences of words which were degraded with a vocoder. We manipulated the phonological similarity of response options across two groups. One group was trained to map stimulus words onto phonologically similar response labels which matched the recorded word; the other group was trained to map words onto a set of plausible responses which were mismatched from the original recordings but were selected to have less phonological overlap. Participants trained on the matched responses were able to learn responses with less training and recall sequences more accurately than participants trained on the mismatched responses, even though the mismatched responses were more phonologically distinct from one another and participants were unaware of the mismatch. The relative difficulty of recalling items in the correct position was the same across both sets of response labels. Mismatched responses impaired recall accuracy across all positions except the final item in each list. These results are consistent with the idea that increased difficulty of mapping acoustic stimuli onto phonological forms impairs serial recall. Increased mapping difficulty could impair retention of memoranda and impede consolidation into phonological forms, which would impair recall in adverse listening conditions.
Affiliation(s)
- Adam K Bosen: Hearing and Speech Perception, Boys Town National Research Hospital, Omaha, NE, USA
- Elizabeth Monzingo: Hearing and Speech Perception, Boys Town National Research Hospital, Omaha, NE, USA
- Angela M AuBuchon: Hearing and Speech Perception, Boys Town National Research Hospital, Omaha, NE, USA
10. O'Neill ER, Parke MN, Kreft HA, Oxenham AJ. Role of semantic context and talker variability in speech perception of cochlear-implant users and normal-hearing listeners. J Acoust Soc Am 2021;149:1224. PMID: 33639827; PMCID: PMC7895533; DOI: 10.1121/10.0003532.
Abstract
This study assessed the impact of semantic context and talker variability on speech perception by cochlear-implant (CI) users and compared their overall performance and between-subjects variance with those of normal-hearing (NH) listeners under vocoded conditions. Thirty post-lingually deafened adult CI users were tested, along with 30 age-matched and 30 younger NH listeners, on sentences with and without semantic context, presented in quiet and noise, spoken by four different talkers. Additional measures included working memory, non-verbal intelligence, and spectral-ripple detection and discrimination. Semantic context and between-talker differences influenced speech perception to similar degrees for both CI users and NH listeners. Between-subjects variance for speech perception was greatest in the CI group but remained substantial in both NH groups, despite the uniformly degraded stimuli in these two groups. Spectral-ripple detection and discrimination thresholds in CI users were significantly correlated with speech perception, but a single set of vocoder parameters for NH listeners was not able to capture average CI performance in both speech and spectral-ripple tasks. The lack of difference in the use of semantic context between CI users and NH listeners suggests no overall differences in listening strategy between the groups when the stimuli are similarly degraded.
Affiliation(s)
- Erin R O'Neill, Morgan N Parke, Heather A Kreft, and Andrew J Oxenham: Department of Psychology, University of Minnesota, Elliott Hall, 75 East River Parkway, Minneapolis, Minnesota 55455, USA