1
Zhao X, Li Y, Yang X. Aging affects Mandarin speakers' understanding of focus sentences in quiet and noisy environments. J Commun Disord 2024; 111:106451. [PMID: 39043003 DOI: 10.1016/j.jcomdis.2024.106451]
Abstract
INTRODUCTION Older adults experiencing normal aging make up the majority of patients seeking services at audiology clinics. While research acknowledges that the speech perception abilities of aging adults can be diminished in lower-level speech identification or discrimination, less attention has been paid to how aging affects higher-level speech understanding, particularly in tonal languages. This study aimed to explore the effects of aging on the comprehension of implied intentions conveyed through prosodic features in Mandarin focus sentences, both in quiet and noisy environments. METHODS Twenty-seven younger listeners (aged 17 to 26) and 27 older listeners (aged 58 to 77) participated in a focus comprehension task. Their task was to interpret SAVO (subject-adverbial-verb-object) sentences with five focus conditions (initial subject-focus, medial adverbial-focus, medial verb-focus, final object-focus, and neutral non-focus) across five background conditions: quiet, white noise (at 0 and -10 dB signal-to-noise ratios, SNRs), and competing speech (at 0 and -10 dB SNRs). Comprehension performance was analyzed in terms of accuracy rates, and the underlying processing patterns were evaluated using confusion matrices. RESULTS Younger listeners consistently excelled across focus conditions in quiet settings, but their scores declined in white noise at the -10 dB SNR. Older adults exhibited variability in scores across focus conditions but not across background conditions. They scored lower than their younger counterparts, with their highest scores observed in the comprehension of sentences featuring a medial adverbial-focus. Analysis of the confusion matrices revealed that younger adults seldom mistook one focus condition for another, whereas older adults tended to interpret other focused items as medial adverbials.
CONCLUSIONS Older listeners' performance reflects an over-reliance on top-down language knowledge, while their bottom-up acoustic processing declines when interpreting Mandarin focus sentences. These findings provide evidence of active cognitive processing in prosody comprehension among aging adults and offer insights for the diagnosis of and intervention in speech disorders in clinical settings.
Affiliation(s)
- Xinxian Zhao
- School of Foreign Studies, Tongji University, 1239 Siping Road, Shanghai 200092, China
- Yang Li
- School of Foreign Studies, Tongji University, 1239 Siping Road, Shanghai 200092, China
- Xiaohu Yang
- School of Foreign Studies, Tongji University, 1239 Siping Road, Shanghai 200092, China
2
Alispahic S, Pellicano E, Cutler A, Antoniou M. Multiple talker processing in autistic adult listeners. Sci Rep 2024; 14:14698. [PMID: 38926416 PMCID: PMC11208580 DOI: 10.1038/s41598-024-62429-w]
Abstract
Accommodating talker variability is a complex, multi-layered cognitive process. It involves shifting attention to the vocal characteristics of the talker as well as to the linguistic content of their speech. Because voice and phonological processing are interdependent, multi-talker environments typically incur additional processing costs compared with single-talker environments. A failure or inability to efficiently distribute attention over multiple acoustic cues in the speech signal may have detrimental consequences for language learning. Yet no studies have examined the effects of multi-talker processing in populations with atypical perceptual, social, and language processing for communication, including autistic people. Employing a classic word-monitoring task, we investigated effects of talker variability in Australian English autistic (n = 24) and non-autistic (n = 28) adults. Listeners responded to target words (e.g., apple, duck, corn) in randomised sequences of words. Half of the sequences were spoken by a single talker and the other half by multiple talkers. Results revealed that autistic participants' sensitivity scores for accurately spotted target words did not differ from those of non-autistic participants, regardless of whether the words were spoken by a single talker or by multiple talkers. As expected, the non-autistic group showed the well-established processing cost associated with talker variability (e.g., slower response times). Remarkably, autistic listeners' response times did not differ across single- and multi-talker conditions, indicating that they incurred no perceptual processing cost when accommodating talker variability. The present findings have implications for theories of autistic perception and of speech and language processing.
Affiliation(s)
- Samra Alispahic
- The MARCS Institute for Brain, Behaviour and Development, Western Sydney University, Sydney, NSW, Australia
- Elizabeth Pellicano
- Department of Educational Studies, Macquarie University, Sydney, Australia
- Department of Clinical, Educational and Health Psychology, University College London, London, UK
- Anne Cutler
- The MARCS Institute for Brain, Behaviour and Development, Western Sydney University, Sydney, NSW, Australia
- Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
- ARC Centre of Excellence for the Dynamics of Language, Clayton, Australia
- Mark Antoniou
- The MARCS Institute for Brain, Behaviour and Development, Western Sydney University, Sydney, NSW, Australia
3
Zhao C, Ong JH, Veic A, Patel AD, Jiang C, Fogel AR, Wang L, Hou Q, Das D, Crasto C, Chakrabarti B, Williams TI, Loutrari A, Liu F. Predictive processing of music and language in autism: Evidence from Mandarin and English speakers. Autism Res 2024; 17:1230-1257. [PMID: 38651566 DOI: 10.1002/aur.3133]
Abstract
Atypical predictive processing has been associated with autism across multiple domains, based mainly on artificial antecedents and consequents. As structured sequences in which expectations derive from implicit learning of combinatorial principles, language and music provide naturalistic stimuli for investigating predictive processing. In this study, we matched melodic and sentence stimuli in cloze probabilities and examined musical and linguistic prediction in Mandarin-speaking (Experiment 1) and English-speaking (Experiment 2) autistic and non-autistic individuals using both production and perception tasks. In the production tasks, participants listened to unfinished melodies/sentences and then produced the final notes/words to complete these items. In the perception tasks, participants provided expectedness ratings of the completed melodies/sentences based on the most frequent notes/words in the norms. Experiment 1 showed intact musical prediction but atypical linguistic prediction in autism in a Mandarin-speaking sample in which musical training experience and receptive vocabulary skills were imbalanced between groups; this group difference disappeared in a more closely matched sample of English speakers in Experiment 2. These findings suggest the importance of taking an individual differences approach when investigating predictive processing in music and language in autism, as difficulty with prediction in autism may not reflect a generalized problem with prediction in any type of complex sequence processing.
Affiliation(s)
- Chen Zhao
- School of Psychology and Clinical Language Sciences, University of Reading, Reading, UK
- Jia Hoong Ong
- School of Psychology and Clinical Language Sciences, University of Reading, Reading, UK
- Anamarija Veic
- School of Psychology and Clinical Language Sciences, University of Reading, Reading, UK
- Aniruddh D Patel
- Department of Psychology, Tufts University, Medford, Massachusetts, USA
- Program in Brain, Mind, and Consciousness, Canadian Institute for Advanced Research (CIFAR), Toronto, Canada
- Cunmei Jiang
- Music College, Shanghai Normal University, Shanghai, China
- Allison R Fogel
- Department of Psychology, Tufts University, Medford, Massachusetts, USA
- Li Wang
- School of Psychology and Clinical Language Sciences, University of Reading, Reading, UK
- Qingqi Hou
- Department of Music and Dance, Nanjing Normal University of Special Education, Nanjing, China
- Dipsikha Das
- School of Psychology, Keele University, Staffordshire, UK
- Cara Crasto
- School of Psychology and Clinical Language Sciences, University of Reading, Reading, UK
- Bhismadev Chakrabarti
- School of Psychology and Clinical Language Sciences, University of Reading, Reading, UK
- Tim I Williams
- School of Psychology and Clinical Language Sciences, University of Reading, Reading, UK
- Ariadne Loutrari
- School of Psychology and Clinical Language Sciences, University of Reading, Reading, UK
- Fang Liu
- School of Psychology and Clinical Language Sciences, University of Reading, Reading, UK
4
Khayr R, Khnifes R, Shpak T, Banai K. Task-Specific Rapid Auditory Perceptual Learning in Adult Cochlear Implant Recipients: What Could It Mean for Speech Recognition? Ear Hear 2024:00003446-990000000-00285. [PMID: 38829780 DOI: 10.1097/aud.0000000000001523]
Abstract
OBJECTIVES Speech recognition in cochlear implant (CI) recipients is quite variable, particularly in challenging listening conditions. Demographic, audiological, and cognitive factors explain some, but not all, of this variance. The literature suggests that rapid auditory perceptual learning explains unique variance in speech recognition in listeners with normal hearing and those with hearing loss. The present study focuses on the early adaptation phase of task-specific rapid auditory perceptual learning. It investigates whether adult CI recipients exhibit this learning and, if so, whether it accounts for portions of the variance in their recognition of fast speech and speech in noise. DESIGN Thirty-six adult CI recipients (ages = 35 to 77, M = 55) completed a battery of general speech recognition tests (sentences in speech-shaped noise, four-talker babble noise, and natural-fast speech), cognitive measures (vocabulary, working memory, attention, and verbal processing speed), and a rapid auditory perceptual learning task with time-compressed speech. Accuracy in the general speech recognition tasks was modeled with a series of generalized mixed models that accounted for demographic, audiological, and cognitive factors before accounting for the contribution of task-specific rapid auditory perceptual learning of time-compressed speech. RESULTS Most CI recipients exhibited early task-specific rapid auditory perceptual learning of time-compressed speech within the course of the first 20 sentences. This early learning made a unique contribution to the recognition of natural-fast speech in quiet and of speech in noise, although the contribution to natural-fast speech may reflect the rapid learning that occurred in that task. When demographic and cognitive characteristics were accounted for, an increase of 1 SD in the early task-specific rapid auditory perceptual learning rate was associated with a ~52% increase in the odds of correctly recognizing natural-fast speech in quiet, and a ~19% to 28% increase in the odds of correctly recognizing the different types of speech in noise. Age, vocabulary, attention, and verbal processing speed also made unique contributions to general speech recognition, although these contributions varied between the different general speech recognition tests. CONCLUSIONS Consistent with previous findings in other populations, in CI recipients early task-specific rapid auditory perceptual learning also accounts for some of the individual differences in the recognition of speech in noise and of natural-fast speech in quiet. Thus, across populations, the early rapid adaptation phase of task-specific rapid auditory perceptual learning might serve as a skill that supports speech recognition in various adverse conditions. In CI users, the ability to rapidly adapt to ongoing acoustic challenges may be one of the factors associated with good CI outcomes. Overall, CI recipients with higher cognitive resources and faster rapid learning rates had better speech recognition.
Affiliation(s)
- Ranin Khayr
- Department of Communication Sciences and Disorders, Faculty of Social Welfare and Health Studies, University of Haifa, Haifa, Israel
- Department of Otolaryngology-Head and Neck Surgery, Bnai-Zion Medical Center, Technion-Bruce Rappaport Faculty of Medicine, Haifa, Israel
- Riyad Khnifes
- Department of Communication Sciences and Disorders, Faculty of Social Welfare and Health Studies, University of Haifa, Haifa, Israel
- Department of Otolaryngology-Head and Neck Surgery, Bnai-Zion Medical Center, Technion-Bruce Rappaport Faculty of Medicine, Haifa, Israel
- Talma Shpak
- Department of Otolaryngology-Head and Neck Surgery, Bnai-Zion Medical Center, Technion-Bruce Rappaport Faculty of Medicine, Haifa, Israel
- Karen Banai
- Department of Communication Sciences and Disorders, Faculty of Social Welfare and Health Studies, University of Haifa, Haifa, Israel
5
Zhao X, Yang X. Aging affects auditory contributions to focus perception in Jianghuai Mandarin. J Acoust Soc Am 2024; 155:2990-3004. [PMID: 38717206 DOI: 10.1121/10.0025928]
Abstract
Speakers can place their prosodic prominence on any locations within a sentence, generating focus prosody for listeners to perceive new information. This study aimed to investigate age-related changes in the bottom-up processing of focus perception in Jianghuai Mandarin by clarifying the perceptual cues and the auditory processing abilities involved in the identification of focus locations. Young, middle-aged, and older speakers of Jianghuai Mandarin completed a focus identification task and an auditory perception task. The results showed that increasing age led to a decrease in listeners' accuracy rate in identifying focus locations, with all participants performing the worst when dynamic pitch cues were inaccessible. Auditory processing abilities did not predict focus perception performance in young and middle-aged listeners but accounted significantly for the variance in older adults' performance. These findings suggest that age-related deteriorations in focus perception can be largely attributed to declined auditory processing of perceptual cues. Poor ability to extract frequency modulation cues may be the most important underlying psychoacoustic factor for older adults' difficulties in perceiving focus prosody in Jianghuai Mandarin. The results contribute to our understanding of the bottom-up mechanisms involved in linguistic prosody processing in aging adults, particularly in tonal languages.
Affiliation(s)
- Xinxian Zhao
- School of Foreign Studies, Tongji University, Shanghai 200092, China
- Xiaohu Yang
- School of Foreign Studies, Tongji University, Shanghai 200092, China
6
Symons AE, Holt LL, Tierney AT. Informational masking influences segmental and suprasegmental speech categorization. Psychon Bull Rev 2024; 31:686-696. [PMID: 37658222 PMCID: PMC11061029 DOI: 10.3758/s13423-023-02364-5]
Abstract
Auditory categorization requires listeners to integrate acoustic information from multiple dimensions. Attentional theories suggest that acoustic dimensions that are informative attract attention and therefore receive greater perceptual weight during categorization. However, the acoustic environment is often noisy, with multiple sound sources competing for listeners' attention. Amid these adverse conditions, attentional theories predict that listeners will distribute attention more evenly across multiple dimensions. Here we test this prediction using an informational masking paradigm. In two experiments, listeners completed suprasegmental (focus) and segmental (voicing) speech categorization tasks in quiet or in the presence of competing speech. In both experiments, the target speech consisted of short words or phrases that varied in the extent to which fundamental frequency (F0) and durational information signalled category identity. To isolate effects of informational masking, target and competing speech were presented in opposite ears. Across both experiments, there was substantial individual variability in the relative weighting of the two dimensions. These individual differences were consistent across listening conditions, suggesting that they reflect stable perceptual strategies. Consistent with attentional theories of auditory categorization, listeners who relied on a single primary dimension in quiet shifted towards integrating across multiple dimensions in the presence of competing speech. These findings demonstrate that listeners make greater use of the redundancy present in speech when attentional resources are limited.
Affiliation(s)
- A E Symons
- Department of Psychological Sciences, Birkbeck, University of London, London, UK
- L L Holt
- Department of Psychology and Neuroscience Institute, Carnegie Mellon University, 500 Forbes Avenue, Pittsburgh, PA, USA
- A T Tierney
- Department of Psychological Sciences, Birkbeck, University of London, London, UK
7
Harford EE, Holt LL, Abel TJ. Unveiling the development of human voice perception: Neurobiological mechanisms and pathophysiology. Curr Res Neurobiol 2024; 6:100127. [PMID: 38511174 PMCID: PMC10950757 DOI: 10.1016/j.crneur.2024.100127]
Abstract
The human voice is a critical stimulus for the auditory system that promotes social connection, informs the listener about identity and emotion, and acts as the carrier for spoken language. Research on voice processing in adults has informed our understanding of the unique status of the human voice in the mature auditory cortex and provided potential explanations for mechanisms that underlie voice selectivity and identity processing. There is evidence that voice perception undergoes developmental change starting in infancy and extending through early adolescence. While even young infants recognize the voice of their mother, there is an apparent protracted course of development to reach adult-like selectivity for the human voice over other sound categories and adult-like recognition of other talkers by voice. Gaps in the literature do not allow for an exact mapping of this trajectory or an adequate description of how voice processing abilities and their neural underpinnings evolve. This review provides a comprehensive account of developmental voice processing research published to date and discusses how this evidence fits with and contributes to current theoretical models proposed in the adult literature. We discuss how factors such as cognitive development, neural plasticity, perceptual narrowing, and language acquisition may contribute to the development of voice processing and its investigation in children. We also review evidence of voice processing abilities in premature birth, autism spectrum disorder, and phonagnosia to examine where and how deviations from the typical trajectory of development may manifest.
Affiliation(s)
- Emily E. Harford
- Department of Neurological Surgery, University of Pittsburgh, USA
- Lori L. Holt
- Department of Psychology, The University of Texas at Austin, USA
- Taylor J. Abel
- Department of Neurological Surgery, University of Pittsburgh, USA
- Department of Bioengineering, University of Pittsburgh, USA
8
Bosen AK, Doria GM. Identifying Links Between Latent Memory and Speech Recognition Factors. Ear Hear 2024; 45:351-369. [PMID: 37882100 PMCID: PMC10922378 DOI: 10.1097/aud.0000000000001430]
Abstract
OBJECTIVES The link between memory ability and speech recognition accuracy is often examined by correlating summary measures of performance across various tasks, but interpretation of such correlations critically depends on assumptions about how these measures map onto underlying factors of interest. The present work presents an alternative approach, wherein latent factor models are fit to trial-level data from multiple tasks to directly test hypotheses about the underlying structure of memory and the extent to which latent memory factors are associated with individual differences in speech recognition accuracy. Latent factor models with different numbers of factors were fit to the data and compared to one another to select the structures which best explained vocoded sentence recognition in a two-talker masker across a range of target-to-masker ratios, performance on three memory tasks, and the link between sentence recognition and memory. DESIGN Young adults with normal hearing (N = 52 for the memory tasks, of which 21 participants also completed the sentence recognition task) completed three memory tasks and one sentence recognition task: reading span, auditory digit span, visual free recall of words, and recognition of 16-channel vocoded Perceptually Robust English Sentence Test Open-set sentences in the presence of a two-talker masker at target-to-masker ratios between +10 and 0 dB. Correlations between summary measures of memory task performance and sentence recognition accuracy were calculated for comparison to prior work, and latent factor models were fit to trial-level data and compared against one another to identify the number of latent factors which best explains the data. Models with one or two latent factors were fit to the sentence recognition data and models with one, two, or three latent factors were fit to the memory task data. 
Based on findings with these models, full models that linked one speech factor to one, two, or three memory factors were fit to the full data set. Models were compared via Expected Log pointwise Predictive Density and post hoc inspection of model parameters. RESULTS Summary measures were positively correlated across memory tasks and sentence recognition. Latent factor models revealed that sentence recognition accuracy was best explained by a single factor that varied across participants. Memory task performance was best explained by two latent factors, of which one was generally associated with performance on all three tasks and the other was specific to digit span recall accuracy at lists of six digits or more. When these models were combined, the general memory factor was closely related to the sentence recognition factor, whereas the factor specific to digit span had no apparent association with sentence recognition. CONCLUSIONS Comparison of latent factor models enables testing hypotheses about the underlying structure linking cognition and speech recognition. This approach showed that multiple memory tasks assess a common latent factor that is related to individual differences in sentence recognition, although performance on some tasks was associated with multiple factors. Thus, while these tasks provide some convergent assessment of common latent factors, caution is needed when interpreting what they tell us about speech recognition.
9
Tzeng CY, Russell ML, Nygaard LC. Attention modulates perceptual learning of non-native-accented speech. Atten Percept Psychophys 2024; 86:339-353. [PMID: 37872434 DOI: 10.3758/s13414-023-02790-6]
Abstract
Listeners readily adapt to variation in non-native-accented speech, learning to disambiguate between talker-specific and accent-based variation. We asked (1) which linguistic and indexical features of the spoken utterance are relevant for this learning to occur and (2) whether task-driven attention to these features affects the extent to which learning generalizes to novel utterances and voices. In two experiments, listeners heard English sentences (Experiment 1) or words (Experiment 2) produced by Spanish-accented talkers during an exposure phase. Listeners' attention was directed to lexical content (transcription), indexical cues (talker identification), or both (transcription + talker identification). In Experiment 1, listeners' test transcription of novel English sentences spoken by Spanish-accented talkers showed generalized perceptual learning to previously unheard voices and utterances for all training conditions. In Experiment 2, generalized learning occurred only in the transcription + talker identification condition, suggesting that attention to both linguistic and indexical cues optimizes listeners' ability to distinguish between individual talker- and group-based variation, especially with the reduced availability of sentence-length prosodic information. Collectively, these findings highlight the role of attentional processes in the encoding of speech input and underscore the interdependency of indexical and lexical characteristics in spoken language processing.
Affiliation(s)
- Christina Y Tzeng
- Department of Psychology, San José State University, 1 Washington Sq, San José, CA, 95192, USA
- Marissa L Russell
- Department of Speech, Language, and Hearing Sciences, Boston University, Boston, MA, USA
- Lynne C Nygaard
- Department of Psychology, Emory University, Atlanta, GA, USA
10
Guerra G, Tierney A, Tijms J, Vaessen A, Bonte M, Dick F. Attentional modulation of neural sound tracking in children with and without dyslexia. Dev Sci 2024; 27:e13420. [PMID: 37350014 DOI: 10.1111/desc.13420]
Abstract
Auditory selective attention forms an important foundation of children's learning by enabling the prioritisation and encoding of relevant stimuli. It may also influence reading development, which relies on metalinguistic skills including awareness of the sound structure of spoken language. Reports of attentional impairments and of speech perception difficulties in noisy environments in dyslexic readers are also suggestive of a putative contribution of auditory attention to reading development. To date, it is unclear whether non-speech selective attention and its underlying neural mechanisms are impaired in children with dyslexia, and to what extent any such deficits relate to individual reading and speech perception abilities in suboptimal listening conditions. In this EEG study, we assessed non-speech sustained auditory selective attention in 106 7-to-12-year-old children with and without dyslexia. Children attended to one of two tone streams, detecting occasional sequence repeats in the attended stream, and performed a speech-in-speech perception task. Results show that when children directed their attention to one stream, inter-trial phase coherence at the attended rate increased over fronto-central sites; this, in turn, was associated with better target detection. Behavioural and neural indices of attention did not systematically differ as a function of dyslexia diagnosis. However, behavioural indices of attention did explain individual differences in reading fluency and speech-in-speech perception abilities, both of which were impaired in dyslexic readers. Taken together, our results show that children with dyslexia do not show group-level auditory attention deficits, but poor auditory attention may represent a risk factor for developing reading impairments and problems with speech perception in complex acoustic environments.
RESEARCH HIGHLIGHTS:
- Non-speech sustained auditory selective attention modulates EEG phase coherence in children with/without dyslexia.
- Children with dyslexia show difficulties in speech-in-speech perception.
- Attention relates to dyslexic readers' speech-in-speech perception and reading skills.
- Dyslexia diagnosis is not linked to behavioural/EEG indices of auditory attention.
Affiliation(s)
- Giada Guerra
- Centre for Brain and Cognitive Development, Birkbeck College, University of London, London, UK
- Maastricht Brain Imaging Center and Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, Maastricht, Netherlands
- Adam Tierney
- Centre for Brain and Cognitive Development, Birkbeck College, University of London, London, UK
- Jurgen Tijms
- RID, Amsterdam, Netherlands
- Rudolf Berlin Center, Department of Psychology, University of Amsterdam, Amsterdam, Netherlands
- Milene Bonte
- Maastricht Brain Imaging Center and Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, Maastricht, Netherlands
- Frederic Dick
- Division of Psychology & Language Sciences, UCL, London, UK
11
MacIntyre AD, Carlyon RP, Goehring T. Neural Decoding of the Speech Envelope: Effects of Intelligibility and Spectral Degradation. Trends Hear 2024; 28:23312165241266316. [PMID: 39183533 PMCID: PMC11345737 DOI: 10.1177/23312165241266316]
Abstract
During continuous speech perception, endogenous neural activity becomes time-locked to acoustic stimulus features, such as the speech amplitude envelope. This speech-brain coupling can be decoded using non-invasive brain imaging techniques, including electroencephalography (EEG). Neural decoding may be clinically useful as an objective measure of stimulus encoding by the brain, for example during cochlear implant listening, wherein the speech signal is severely spectrally degraded. Yet, interplay between acoustic and linguistic factors may lead to top-down modulation of perception, thereby complicating audiological applications. To address this ambiguity, we assess neural decoding of the speech envelope under spectral degradation with EEG in acoustically hearing listeners (n = 38; 18-35 years old) using vocoded speech. We dissociate sensory encoding from higher-order processing by employing intelligible (English) and non-intelligible (Dutch) stimuli, with auditory attention sustained using a repeated-phrase detection task. Subject-specific and group decoders were trained to reconstruct the speech envelope from held-out EEG data, with decoder significance determined via random permutation testing. Whereas speech envelope reconstruction did not vary by spectral resolution, intelligible speech was associated with better decoding accuracy in general. Results were similar across subject-specific and group analyses, with less consistent effects of spectral degradation in group decoding. Permutation tests revealed possible differences in decoder statistical significance by experimental condition. In general, while robust neural decoding was observed at the individual and group levels, variability within participants would most likely prevent the clinical use of such a measure to differentiate levels of spectral degradation and intelligibility on an individual basis.
Affiliation(s)
- Robert P. Carlyon, MRC Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, UK
- Tobias Goehring, MRC Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, UK
12
Dambon J, Munder P, Mewes A, Böhnke B, Beyer A, Kolonko J, Brademann G, Hey M. Optimizing the efficiency of ECAP measurements due to interpolation. Acta Otolaryngol 2023; 143:971-978. [PMID: 38189322] [DOI: 10.1080/00016489.2023.2298467]
Abstract
BACKGROUND Thresholds of electrically evoked compound action potentials (TECAP) may serve as starting points for electrophysiologically based fitting of cochlear implants. Absent TECAP data at single electrodes reduce the number of data points available for fitting and can be substituted by interpolating the measured data points. AIM To compare complete TECAP profiles with TECAP profiles interpolated from 5/22 (∼22.7%) and 11/22 (50%) measured electrode contacts. MATERIAL AND METHODS Single-centre, retrospective, observational study of data from 624 ears implanted with a Slim Modiolar (CI532) or Contour Advance (CI512, CI24RE(CA)) electrode array (Cochlear Ltd). The deviation of the complete measured TECAP profile from the same profile with missing and therefore interpolated TECAP values was quantified. RESULTS Interpolated TECAP profiles differ significantly from complete measured profiles, especially at the basal and apical electrodes. Reference mean profiles for Slim Modiolar and Contour Advance electrodes are provided. CONCLUSIONS AND SIGNIFICANCE Reducing the number of measured TECAP electrodes has to be weighed against losses in the accuracy of the interpolated TECAP values. A clinically acceptable compromise may be a reduction from 22 to 11, even non-equidistant, data points. While this reduces ECAP measurement time, it is accompanied by only a minimal loss of accuracy in the TECAP threshold profile.
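The core operation, estimating unmeasured electrode thresholds from a measured subset, can be illustrated with plain linear interpolation over electrode index. The synthetic 22-point profile and the 11-electrode subset below are invented for the sketch and are not the study's clinical data.

```python
import numpy as np

def interpolate_profile(measured_idx, measured_vals, n_total=22):
    """Linearly interpolate missing TECAP thresholds across the electrode array."""
    return np.interp(np.arange(1, n_total + 1), measured_idx, measured_vals)

# Synthetic full 22-electrode threshold profile (arbitrary current-level units).
electrodes = np.arange(1, 23)
full_profile = 180.0 + 10.0 * np.sin(np.linspace(0.0, np.pi, 22))

# Measure every other contact (11 of 22) and interpolate the rest; the
# unmeasured last electrode is extrapolated by holding the nearest measured
# value, so the largest error lands at the array end.
subset = electrodes[::2]                      # electrodes 1, 3, ..., 21
estimate = interpolate_profile(subset, full_profile[subset - 1])
rms_error = np.sqrt(np.mean((estimate - full_profile) ** 2))
end_error = abs(estimate[-1] - full_profile[-1])
```

On this smooth synthetic profile the overall RMS error is small, but the largest single deviation sits at the unmeasured end of the array, echoing the authors' observation that interpolated profiles deviate most at the basal and apical electrodes.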
Affiliation(s)
- Jan Dambon, Department of Otorhinolaryngology, Head and Neck Surgery, Audiology, University Hospital Schleswig-Holstein (UKSH), Kiel, Germany
- Patrick Munder, itap - Institut für technische und angewandte Physik GmbH, Oldenburg, Germany
- Alexander Mewes, Department of Otorhinolaryngology, Head and Neck Surgery, Audiology, University Hospital Schleswig-Holstein (UKSH), Kiel, Germany
- Britta Böhnke, Department of Otorhinolaryngology, Head and Neck Surgery, Audiology, University Hospital Schleswig-Holstein (UKSH), Kiel, Germany
- Annika Beyer, Department of Otorhinolaryngology, Head and Neck Surgery, Audiology, University Hospital Schleswig-Holstein (UKSH), Kiel, Germany
- Johannes Kolonko, Department of Otorhinolaryngology, Head and Neck Surgery, Audiology, University Hospital Schleswig-Holstein (UKSH), Kiel, Germany
- Goetz Brademann, Department of Otorhinolaryngology, Head and Neck Surgery, Audiology, University Hospital Schleswig-Holstein (UKSH), Kiel, Germany
- Matthias Hey, Department of Otorhinolaryngology, Head and Neck Surgery, Audiology, University Hospital Schleswig-Holstein (UKSH), Kiel, Germany
13
Kurteff GL, Lester-Smith RA, Martinez A, Currens N, Holder J, Villarreal C, Mercado VR, Truong C, Huber C, Pokharel P, Hamilton LS. Speaker-induced Suppression in EEG during a Naturalistic Reading and Listening Task. J Cogn Neurosci 2023; 35:1538-1556. [PMID: 37584593] [DOI: 10.1162/jocn_a_02037]
Abstract
Speaking elicits a suppressed neural response when compared with listening to others' speech, a phenomenon known as speaker-induced suppression (SIS). Previous research has focused on investigating SIS at constrained levels of linguistic representation, such as the individual phoneme and word level. Here, we present scalp EEG data from a dual speech perception and production task where participants read sentences aloud then listened to playback of themselves reading those sentences. Playback was separated into immediate repetition of the previous trial and randomized repetition of a former trial to investigate if forward modeling of responses during passive listening suppresses the neural response. Concurrent EMG was recorded to control for movement artifact during speech production. In line with previous research, ERP analyses at the sentence level demonstrated suppression of early auditory components of the EEG for production compared with perception. To evaluate whether linguistic abstractions (in the form of phonological feature tuning) are suppressed during speech production alongside lower-level acoustic information, we fit linear encoding models that predicted scalp EEG based on phonological features, EMG activity, and task condition. We found that phonological features were encoded similarly between production and perception. However, this similarity was only observed when controlling for movement by using the EMG response as an additional regressor. Our results suggest that SIS operates at a sensory representational level and is dissociated from higher order cognitive and linguistic processing that takes place during speech perception and production. We also detail some important considerations when analyzing EEG during continuous speech production.
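A minimal version of the encoding analysis described above, predicting scalp EEG from phonological features while including EMG as a nuisance regressor, might look like this. The instantaneous (single-lag) ridge model and the simulated signals are simplifying assumptions; real analyses use time-lagged feature matrices.

```python
import numpy as np

def fit_encoding_model(X, y, lam=1.0):
    """Ridge regression mapping a regressor matrix (samples x features) to one EEG channel."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

rng = np.random.default_rng(0)
n = 3000
phono = rng.standard_normal((n, 5))   # stand-in for phonological feature time series
emg = rng.standard_normal(n)          # movement signal recorded alongside EEG
true_w = np.array([1.0, -0.5, 0.3, 0.0, 0.2])

# Simulated EEG: phonological encoding plus strong movement contamination.
eeg = phono @ true_w + 2.0 * emg + 0.5 * rng.standard_normal(n)

# Adding EMG as an extra column lets the model absorb movement-related
# variance instead of leaving it as unexplained noise, so the estimated
# phonological weights stay close to their true values.
X = np.column_stack([phono, emg])
w = fit_encoding_model(X, eeg)
phono_w, emg_w = w[:5], w[5]
```

This mirrors the paper's design choice of regressing out EMG: the feature weights of interest are estimated jointly with, rather than confounded by, the movement artifact.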
14
Shorey AE, King CJ, Theodore RM, Stilp CE. Talker adaptation or "talker" adaptation? Musical instrument variability impedes pitch perception. Atten Percept Psychophys 2023; 85:2488-2501. [PMID: 37258892] [DOI: 10.3758/s13414-023-02722-4]
Abstract
Listeners show perceptual benefits (faster and/or more accurate responses) when perceiving speech spoken by a single talker versus multiple talkers, known as talker adaptation. While near-exclusively studied in speech and with talkers, some aspects of talker adaptation might reflect domain-general processes. Music, like speech, is a sound class replete with acoustic variation, such as a multitude of pitch and instrument possibilities. Thus, it was hypothesized that perceptual benefits from structure in the acoustic signal (i.e., hearing the same sound source on every trial) are not specific to speech but rather a general auditory response. Forty nonmusician participants completed a simple musical task that mirrored talker adaptation paradigms. Low- or high-pitched notes were presented in single- and mixed-instrument blocks. Reflecting both music research on pitch and timbre interdependence and mirroring traditional "talker" adaptation paradigms, listeners were faster to make their pitch judgments when presented with a single instrument timbre relative to when the timbre was selected from one of four instruments from trial to trial. A second experiment ruled out the possibility that participants were responding faster to the specific instrument chosen as the single-instrument timbre. Consistent with general theoretical approaches to perception, perceptual benefits from signal structure are not limited to speech.
Affiliation(s)
- Anya E Shorey, Department of Psychological and Brain Sciences, University of Louisville, 317 Life Sciences Building, Louisville, KY 40272, USA
- Caleb J King, Department of Psychological and Brain Sciences, University of Louisville, 317 Life Sciences Building, Louisville, KY 40272, USA
- Rachel M Theodore, Department of Speech, Language, and Hearing Sciences, University of Connecticut, 2 Alethia Drive, Unit 1085, Storrs, CT 06269-1085, USA; Connecticut Institute for the Brain and Cognitive Sciences, University of Connecticut, 337 Mansfield Road, Unit 1272, Storrs, CT 06269-1272, USA
- Christian E Stilp, Department of Psychological and Brain Sciences, University of Louisville, 317 Life Sciences Building, Louisville, KY 40272, USA
15
Khayr R, Karawani H, Banai K. Implicit learning and individual differences in speech recognition: an exploratory study. Front Psychol 2023; 14:1238823. [PMID: 37744578] [PMCID: PMC10513179] [DOI: 10.3389/fpsyg.2023.1238823]
Abstract
Individual differences in speech recognition in challenging listening environments are pronounced. Studies suggest that implicit learning is one variable that may contribute to this variability. Here, we explored the unique contributions of three indices of implicit learning to individual differences in the recognition of challenging speech. To this end, we assessed three indices of implicit learning (perceptual, statistical, and incidental), three types of challenging speech (natural fast, vocoded, and speech in noise), and cognitive factors associated with speech recognition (vocabulary, working memory, and attention) in a group of 51 young adults. Speech recognition was modeled as a function of the cognitive factors and learning, and the unique contribution of each index of learning was statistically isolated. The three indices of learning were uncorrelated. Whereas all indices of learning had unique contributions to the recognition of natural-fast speech, only statistical learning had a unique contribution to the recognition of speech in noise and vocoded speech. These data suggest that although implicit learning may contribute to the recognition of challenging speech, the contribution may depend on the type of speech challenge and on the learning task.
Affiliation(s)
- Ranin Khayr, Department of Communication Sciences and Disorders, Faculty of Social Welfare and Health Sciences, University of Haifa, Haifa, Israel
16
Mora JC, Darcy I. Individual differences in attention control and the processing of phonological contrasts in a second language. Phonetica 2023; 80:153-184. [PMID: 37341707] [DOI: 10.1515/phon-2022-0020]
Abstract
This study investigated attention control in L2 phonological processing from a cognitive individual differences perspective, to determine its role in predicting phonological acquisition in adult L2 learning. Participants were 21 L1-Spanish learners of English, and 19 L1-English learners of Spanish. Attention control was measured through a novel speech-based attention-switching task. Phonological processing was assessed through a speeded ABX categorization task (perception) and a delayed sentence repetition task (production). Correlational analyses indicated that learners with more efficient attention switching skill and faster speed in correctly identifying the target phonetic features in the speech dimension under focus could perceptually discriminate L2 vowels at higher processing speed, but not at higher accuracy rates. Thus, attentional flexibility provided a processing advantage for difficult L2 contrasts but did not predict the extent to which precise representations for the target L2 vowels had been established. However, attention control was related to L2 learners' ability to distinguish the contrasting L2 vowels in production. In addition, L2 learners' accuracy in perceptually distinguishing between two contrasting vowels was significantly related to how much of a quality distinction between them they could make in production.
Affiliation(s)
- Joan C Mora, Department of Modern Languages and Literatures and English Studies, Faculty of Philology and Communication, Universitat de Barcelona, Barcelona, Spain
- Isabelle Darcy, Department of Second Language Studies, Indiana University, Bloomington, IN, USA
17
Li Y, Feng S. Chinese comprehenders' interpretation of underinformativeness in L1 and L2 accented speech narratives. Front Psychol 2023; 14:1040162. [PMID: 36755670] [PMCID: PMC9900116] [DOI: 10.3389/fpsyg.2023.1040162]
Abstract
Second language (L2) speakers with foreign accents are well-known to face disadvantages in terms of language processing; however, recent research has demonstrated possible social benefits for foreign-accented L2 speakers. While previous research has focused on the ways in which first language (L1) speakers of English comprehend L2 speech, the present article contributes to this line of research by exploring the ways in which comprehenders from a different culture and linguistic background perceive L2 speech narratives. This study investigates this issue by exploring how comprehenders with Mandarin Chinese as the first language interpret underinformative utterances containing scalar and ad hoc implicature in L1, accent-free L2, and foreign-accented L2 speech narratives. The sentence judgment task with a guise design used written sentences rather than oral utterances as stimuli in order to isolate the role of intelligibility factors. The results indicate that foreign accent confers social benefits on L2 speakers in that their omission of information in communication is tolerated and they are viewed as more likely to possess positive attributes. More importantly, we find that the bilingual characteristics of Chinese participants, as well as the different linguistic complexity of deriving scalar and ad hoc implicature, affect Chinese participants' explanations of underinformative sentences of L2 speakers. This study contributes to our understanding of L2 language processing.
18
Pinto D, Kaufman M, Brown A, Zion Golumbic E. An ecological investigation of the capacity to follow simultaneous speech and preferential detection of one's own name. Cereb Cortex 2022; 33:5361-5374. [PMID: 36331339] [DOI: 10.1093/cercor/bhac424]
Abstract
Many situations require focusing attention on one speaker, while monitoring the environment for potentially important information. Some have proposed that dividing attention among 2 speakers involves behavioral trade-offs, due to limited cognitive resources. However, the severity of these trade-offs, particularly under ecologically-valid circumstances, is not well understood. We investigated the capacity to process simultaneous speech using a dual-task paradigm simulating task-demands and stimuli encountered in real-life. Participants listened to conversational narratives (Narrative Stream) and monitored a stream of announcements (Barista Stream), to detect when their order was called. We measured participants’ performance, neural activity, and skin conductance as they engaged in this dual-task. Participants achieved extremely high dual-task accuracy, with no apparent behavioral trade-offs. Moreover, robust neural and physiological responses were observed for target-stimuli in the Barista Stream, alongside significant neural speech-tracking of the Narrative Stream. These results suggest that humans have substantial capacity to process simultaneous speech and do not suffer from insufficient processing resources, at least for this highly ecological task-combination and level of perceptual load. Results also confirmed the ecological validity of the advantage for detecting one's own name at the behavioral, neural, and physiological level, highlighting the contribution of personal relevance when processing simultaneous speech.
Affiliation(s)
- Danna Pinto, The Gonda Multidisciplinary Center for Brain Research, Bar Ilan University, Ramat Gan 5290002, Israel
- Maya Kaufman, The Gonda Multidisciplinary Center for Brain Research, Bar Ilan University, Ramat Gan 5290002, Israel
- Adi Brown, The Gonda Multidisciplinary Center for Brain Research, Bar Ilan University, Ramat Gan 5290002, Israel
- Elana Zion Golumbic, The Gonda Multidisciplinary Center for Brain Research, Bar Ilan University, Ramat Gan 5290002, Israel
19
Andreeva IG, Ogorodnikova EA. Auditory Adaptation to Speech Signal Characteristics. J Evol Biochem Physiol 2022. [DOI: 10.1134/s0022093022050027]
20
Francis AL. Adding noise is a confounded nuisance. J Acoust Soc Am 2022; 152:1375. [PMID: 36182286] [DOI: 10.1121/10.0013874]
Abstract
A wide variety of research and clinical assessments involve presenting speech stimuli in the presence of some kind of noise. Here, I selectively review two theoretical perspectives and discuss ways in which these perspectives may help researchers understand the consequences for listeners of adding noise to a speech signal. I argue that adding noise changes more about the listening task than merely making the signal more difficult to perceive. To fully understand the effects of an added noise on speech perception, we must consider not just how much the noise affects task difficulty, but also how it affects all of the systems involved in understanding speech: increasing message uncertainty, modifying attentional demand, altering affective response, and changing motivation to perform the task.
Affiliation(s)
- Alexander L Francis, Department of Speech, Language, and Hearing Sciences, Purdue University, 715 Clinic Drive, West Lafayette, Indiana 47907, USA
21
Listeners are sensitive to the speech breathing time series: Evidence from a gap detection task. Cognition 2022; 225:105171. [DOI: 10.1016/j.cognition.2022.105171]
22
Castellucci GA, Guenther FH, Long MA. A Theoretical Framework for Human and Nonhuman Vocal Interaction. Annu Rev Neurosci 2022; 45:295-316. [PMID: 35316612] [PMCID: PMC9909589] [DOI: 10.1146/annurev-neuro-111020-094807]
Abstract
Vocal communication is a critical feature of social interaction across species; however, the relation between such behavior in humans and nonhumans remains unclear. To enable comparative investigation of this topic, we review the literature pertinent to interactive language use and identify the superset of cognitive operations involved in generating communicative action. We posit these functions comprise three intersecting multistep pathways: (a) the Content Pathway, which selects the movements constituting a response; (b) the Timing Pathway, which temporally structures responses; and (c) the Affect Pathway, which modulates response parameters according to internal state. These processing streams form the basis of the Convergent Pathways for Interaction framework, which provides a conceptual model for investigating the cognitive and neural computations underlying vocal communication across species.
Affiliation(s)
- Gregg A. Castellucci, NYU Neuroscience Institute and Department of Otolaryngology, New York University Langone Medical Center, New York, NY, USA
- Frank H. Guenther, Departments of Speech, Language & Hearing Sciences and Biomedical Engineering, Boston University, Boston, MA, USA
- Michael A. Long, NYU Neuroscience Institute and Department of Otolaryngology, New York University Langone Medical Center, New York, NY, USA
23
Kocabay AP, Aslan F, Yüce D, Turkyilmaz D. Speech in Noise: Implications of Age, Hearing Loss and Cognition. Folia Phoniatr Logop 2022; 74:345-351. [PMID: 35738235] [DOI: 10.1159/000525580]
Abstract
INTRODUCTION Individuals with hearing loss have reduced hearing sensitivity and may not adequately process the temporal cues in acoustic signals. Cognitive skills, which decline with aging and hearing loss, further impair the ability to understand speech. Hence, these individuals may experience communication problems in noisy environments. The aim of the study was to investigate the effect of sloping high-frequency hearing loss on speech perception in noise and to examine the impact of temporal and cognitive processing in young and middle-aged adults. METHODS Speech-in-noise (SIN), temporal processing and cognitive tests were administered to individuals with hearing loss and with normal hearing aged 18-59 years. The measurements included the Matrix Sentence Test, the Binaural Temporal Fine Structure Sensitivity (TFS) Test, the Visual Aural Digit Span (VADS) and the Auditory Verbal Learning Test (AVLT). The control group comprised 20 participants with normal hearing, and the study group comprised 20 participants with high-frequency hearing loss. RESULTS Hierarchical regression analysis for SIN was performed by entering 3 separate blocks of independent variables. Age and hearing loss were entered into the first block, which explained a significant amount of variability in SIN (R2 = 0.72, p < 0.001). Block 2 comprised scores from the TFS sensitivity test; this independent variable characterized temporal processing (R2 change = 0.002, p < 0.001). Block 3 consisted of scores from the VADS test and AVLT; these variables characterized cognitive processing and accounted for a good portion of SIN variance (R2 change = 0.04, p < 0.001). Age, hearing loss, and VADS contributed independently in the presence of all independent variables. CONCLUSION The final model accounted for 76.2% of the variance in SIN. The results suggest that sloping hearing loss, aging and cognitive decline affect auditory performance, and that poor performance starts at an early age. Additionally, the findings indicate that a more comprehensive approach may be recommended to evaluate listening skills and identify communication problems.
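The blockwise logic of the analysis above, entering predictor sets in stages and reading off the R² change at each stage, can be sketched with ordinary least squares on simulated scores. The effect sizes and data below are invented for illustration and do not reproduce the study's values.

```python
import numpy as np

def r_squared(X, y):
    """R^2 of an ordinary-least-squares fit with an intercept."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    return 1.0 - resid.var() / y.var()

rng = np.random.default_rng(42)
n = 40
age, hl = rng.standard_normal(n), rng.standard_normal(n)      # block 1: age, hearing loss
tfs = rng.standard_normal(n)                                  # block 2: TFS sensitivity
vads, avlt = rng.standard_normal(n), rng.standard_normal(n)   # block 3: cognitive scores
sin_score = (1.5 * age + 1.0 * hl + 0.1 * tfs + 0.6 * vads
             + 0.5 * rng.standard_normal(n))                  # simulated SIN outcome

blocks = [np.column_stack([age, hl]),
          np.column_stack([age, hl, tfs]),
          np.column_stack([age, hl, tfs, vads, avlt])]
r2 = [r_squared(B, sin_score) for B in blocks]
r2_change = [r2[0]] + [later - earlier for earlier, later in zip(r2, r2[1:])]
```

Because the blocks are nested, R² can only grow from one block to the next; the R² change attributes that growth to the block just added, which is how the temporal and cognitive contributions are isolated in the abstract.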
Affiliation(s)
- Filiz Aslan, Department of Audiology, Hacettepe University, Ankara, Turkey
- Deniz Yüce, Department of Preventive Oncology, Hacettepe University, Ankara, Turkey
24
Ritz H, Wild CJ, Johnsrude IS. Parametric Cognitive Load Reveals Hidden Costs in the Neural Processing of Perfectly Intelligible Degraded Speech. J Neurosci 2022; 42:4619-4628. [PMID: 35508382] [PMCID: PMC9186799] [DOI: 10.1523/jneurosci.1777-21.2022]
Abstract
Speech is often degraded by environmental noise or hearing impairment. People can compensate for degradation, but this requires cognitive effort. Previous research has identified frontotemporal networks involved in effortful perception, but materials in these works were also less intelligible, and so it is not clear whether activity reflected effort or intelligibility differences. We used functional magnetic resonance imaging to assess the degree to which spoken sentences were processed under distraction and whether this depended on speech quality even when intelligibility of degraded speech was matched to that of clear speech (close to 100%). On each trial, male and female human participants either attended to a sentence or to a concurrent multiple object tracking (MOT) task that imposed parametric cognitive load. Activity in bilateral anterior insula reflected task demands; during the MOT task, activity increased as cognitive load increased, and during speech listening, activity increased as speech became more degraded. In marked contrast, activity in bilateral anterior temporal cortex was speech selective and gated by attention when speech was degraded. In this region, performance of the MOT task with a trivial load blocked processing of degraded speech, whereas processing of clear speech was unaffected. As load increased, responses to clear speech in these areas declined, consistent with reduced capacity to process it. This result dissociates cognitive control from speech processing; substantially less cognitive control is required to process clear speech than is required to understand even very mildly degraded, 100% intelligible speech. Perceptual and control systems clearly interact dynamically during real-world speech comprehension. SIGNIFICANCE STATEMENT Speech is often perfectly intelligible even when degraded, for example, by background sound, phone transmission, or hearing loss. How does degradation alter cognitive demands?
Here, we use fMRI to demonstrate a novel and critical role for cognitive control in the processing of mildly degraded but perfectly intelligible speech. We compare speech that is matched for intelligibility but differs in putative control demands, dissociating cognitive control from speech processing. We also impose a parametric cognitive load during perception, dissociating processes that depend on tasks from those that depend on available capacity. Our findings distinguish between frontal and temporal contributions to speech perception and reveal a hidden cost to processing mildly degraded speech, underscoring the importance of cognitive control for everyday speech comprehension.
Affiliation(s)
- Harrison Ritz, Brain and Mind Institute, University of Western Ontario, London, Ontario N6A 3K7, Canada; Department of Cognitive, Linguistic, and Psychological Sciences, Brown University, Providence, Rhode Island 02912
- Conor J Wild, Brain and Mind Institute, University of Western Ontario, London, Ontario N6A 3K7, Canada
- Ingrid S Johnsrude, Brain and Mind Institute, University of Western Ontario, London, Ontario N6A 3K7, Canada; Departments of Psychology and Communication Sciences and Disorders, University of Western Ontario, London, Ontario N6A 3K7, Canada
25
Gray R, Sarampalis A, Başkent D, Harding EE. Working-Memory, Alpha-Theta Oscillations and Musical Training in Older Age: Research Perspectives for Speech-on-speech Perception. Front Aging Neurosci 2022; 14:806439. [PMID: 35645774] [PMCID: PMC9131017] [DOI: 10.3389/fnagi.2022.806439]
Abstract
During the normal course of aging, perception of speech-on-speech or “cocktail party” speech and use of working memory (WM) abilities change. Musical training, which is a complex activity that integrates multiple sensory modalities and higher-order cognitive functions, reportedly benefits both WM performance and speech-on-speech perception in older adults. This mini-review explores the relationship between musical training, WM and speech-on-speech perception in older age (> 65 years) through the lens of the Ease of Language Understanding (ELU) model. Linking neural-oscillation literature associating speech-on-speech perception and WM with alpha-theta oscillatory activity, we propose that two stages of speech-on-speech processing in the ELU are underpinned by WM-related alpha-theta oscillatory activity, and that effects of musical training on speech-on-speech perception may be reflected in these frequency bands among older adults.
Affiliation(s)
- Ryan Gray, Department of Experimental Psychology, University of Groningen, Groningen, Netherlands; Research School of Behavioural and Cognitive Neurosciences, University of Groningen, Groningen, Netherlands; Department of Psychology, Centre for Applied Behavioural Sciences, School of Social Sciences, Heriot-Watt University, Edinburgh, United Kingdom
- Anastasios Sarampalis, Department of Experimental Psychology, University of Groningen, Groningen, Netherlands; Research School of Behavioural and Cognitive Neurosciences, University of Groningen, Groningen, Netherlands
- Deniz Başkent, Research School of Behavioural and Cognitive Neurosciences, University of Groningen, Groningen, Netherlands; Department of Otorhinolaryngology, University of Groningen, University Medical Center Groningen, Groningen, Netherlands
- Eleanor E. Harding, Research School of Behavioural and Cognitive Neurosciences, University of Groningen, Groningen, Netherlands; Department of Otorhinolaryngology, University of Groningen, University Medical Center Groningen, Groningen, Netherlands
26
Heald SLM, Van Hedger SC, Veillette J, Reis K, Snyder JS, Nusbaum HC. Going Beyond Rote Auditory Learning: Neural Patterns of Generalized Auditory Learning. J Cogn Neurosci 2022; 34:425-444. [PMID: 34942645] [PMCID: PMC8832160] [DOI: 10.1162/jocn_a_01805]
Abstract
The ability to generalize across specific experiences is vital for the recognition of new patterns, especially in speech perception considering acoustic-phonetic pattern variability. Indeed, behavioral research has demonstrated that listeners are able, via a process of generalized learning, to leverage their experiences of past words said by a difficult-to-understand talker to improve their understanding of new words said by that talker. Here, we examine differences in neural responses to generalized versus rote learning in auditory cortical processing by training listeners to understand a novel synthetic talker. Using a pretest-posttest design with EEG, participants were trained using either (1) a large inventory of words where no words were repeated across the experiment (generalized learning) or (2) a small inventory of words where words were repeated (rote learning). Analysis of long-latency auditory evoked potentials at pretest and posttest revealed that rote and generalized learning both produced rapid changes in auditory processing, yet the nature of these changes differed. Generalized learning was marked by an amplitude reduction in the N1-P2 complex and by the presence of a late negativity wave in the auditory evoked potential following training; rote learning was marked only by temporally later scalp topography differences. The early N1-P2 change, found only for generalized learning, is consistent with an active processing account of speech perception, which proposes that the ability to rapidly adjust to the specific vocal characteristics of a new talker (for which rote learning is rare) relies on attentional mechanisms to selectively modify early auditory processing sensitivity.
|
27
|
Abstract
The human brain exhibits the remarkable ability to categorize speech sounds into distinct, meaningful percepts, even in challenging tasks like learning non-native speech categories in adulthood and hearing speech in noisy listening conditions. In these scenarios, there is substantial variability in perception and behavior, both across individual listeners and individual trials. While there has been extensive work characterizing stimulus-related and contextual factors that contribute to variability, recent advances in neuroscience are beginning to shed light on another potential source of variability that has not been explored in speech processing. Specifically, there are task-independent, moment-to-moment variations in neural activity in broadly-distributed cortical and subcortical networks that affect how a stimulus is perceived on a trial-by-trial basis. In this review, we discuss factors that affect speech sound learning and moment-to-moment variability in perception, particularly arousal states—neurotransmitter-dependent modulations of cortical activity. We propose that a more complete model of speech perception and learning should incorporate subcortically-mediated arousal states that alter behavior in ways that are distinct from, yet complementary to, top-down cognitive modulations. Finally, we discuss a novel neuromodulation technique, transcutaneous auricular vagus nerve stimulation (taVNS), which is particularly well-suited to investigating causal relationships between arousal mechanisms and performance in a variety of perceptual tasks. Together, these approaches provide novel testable hypotheses for explaining variability in classically challenging tasks, including non-native speech sound learning.
|
28
|
Wu M, Christiansen S, Fereczkowski M, Neher T. Revisiting Auditory Profiling: Can Cognitive Factors Improve the Prediction of Aided Speech-in-Noise Outcome? Trends Hear 2022; 26:23312165221113889. [PMID: 35942807 PMCID: PMC9373127 DOI: 10.1177/23312165221113889] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022] Open
Abstract
Hearing aids (HA) are the most common type of rehabilitation treatment for age-related hearing loss. However, HA users often obtain limited benefit from their devices, particularly in noisy environments, and thus many HA candidates do not use them at all. A possible reason for this could be that current HA fittings are audiogram-based, that is, they neglect supra-threshold factors. In an earlier study, an auditory-profiling method was proposed as a basis for more personalized HA fittings. This method classifies HA users into four profiles that differ in terms of hearing sensitivity and supra-threshold hearing abilities. Previously, HA users belonging to these profiles showed significant differences in terms of speech recognition in noise but not subjective assessments of speech-in-noise (SIN) outcome. Moreover, large individual differences within some profiles were observed. The current study therefore explored if cognitive factors can help explain these differences and improve aided outcome prediction. Thirty-nine older HA users completed sets of auditory and SIN tests as well as two tablet-based cognitive measures (the Corsi block-tapping and trail-making tests). Principal component analyses were applied to extract the dominant sources of variance both within individual tests producing many variables and within the three types of tests. Multiple linear regression analyses performed on the extracted components showed that auditory factors were related to aided speech recognition in noise but not to subjective SIN outcome. Cognitive factors were unrelated to aided SIN outcome. Overall, these findings provide limited support for adding those two cognitive tests to the profiling of HA users.
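The analysis pipeline described in this abstract, PCA to pool many test variables into a few components, followed by multiple linear regression of an aided outcome on those components, can be sketched as follows. This is a minimal illustration with simulated data: the listener scores, number of components, and outcome variable are all hypothetical, not the study's dataset or code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical scores for 39 listeners on 6 auditory/cognitive test variables.
X = rng.normal(size=(39, 6))
# Hypothetical aided speech-in-noise outcome, partly driven by the first variable.
y = 0.8 * X[:, 0] + rng.normal(scale=0.5, size=39)

# PCA by hand: center the data, take the SVD, keep the first k component scores.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
explained = s**2 / np.sum(s**2)   # proportion of variance per component
k = 2
scores = Xc @ Vt[:k].T            # principal-component scores used as predictors

# Multiple linear regression of the outcome on the extracted components.
A = np.column_stack([np.ones(len(y)), scores])      # add an intercept column
beta, *_ = np.linalg.lstsq(A, y, rcond=None)
y_hat = A @ beta
r2 = 1 - np.sum((y - y_hat)**2) / np.sum((y - y.mean())**2)
```

Regressing on component scores rather than the raw variables avoids collinearity among correlated test measures, which is presumably why the authors extracted components before the regression step.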
Affiliation(s)
- Mengfan Wu
- Institute of Clinical Research, Faculty of Health Sciences, University of Southern Denmark, Odense, Denmark; Research Unit for ORL - Head & Neck Surgery and Audiology, Odense University Hospital & University of Southern Denmark, Odense, Denmark
| | - Stine Christiansen
- Institute of Clinical Research, Faculty of Health Sciences, University of Southern Denmark, Odense, Denmark; Research Unit for ORL - Head & Neck Surgery and Audiology, Odense University Hospital & University of Southern Denmark, Odense, Denmark
| | - Michal Fereczkowski
- Institute of Clinical Research, Faculty of Health Sciences, University of Southern Denmark, Odense, Denmark; Research Unit for ORL - Head & Neck Surgery and Audiology, Odense University Hospital & University of Southern Denmark, Odense, Denmark
| | - Tobias Neher
- Institute of Clinical Research, Faculty of Health Sciences, University of Southern Denmark, Odense, Denmark; Research Unit for ORL - Head & Neck Surgery and Audiology, Odense University Hospital & University of Southern Denmark, Odense, Denmark
| |
|
29
|
Hope M, Lilley J. Gender expansive listeners utilize a non-binary, multidimensional conception of gender to inform voice gender perception. BRAIN AND LANGUAGE 2022; 224:105049. [PMID: 34826679 DOI: 10.1016/j.bandl.2021.105049] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 03/01/2021] [Revised: 11/03/2021] [Accepted: 11/08/2021] [Indexed: 06/13/2023]
Abstract
Few studies on voice perception have attempted to address the complexity of gender perception of ambiguous voices. The current study investigated how perception of gender varies with the complexity of the listener's own gender conception and identity. We explicitly recruited participants of all genders, including those who are gender expansive (i.e. transgender and/or non-binary), and directed them to rate ambiguous synthetic voices on three independent scales of masculine, feminine, and "other" (and to select one or multiple categorical labels for them). Gender expansive listeners were more likely to use the entire expanse of the rating scales and showed systematic categorization of gender-neutral voices as non-binary. We propose this is due to repeated use of reflective processes that challenge pre-existing gender categories and the incorporation of this decision-making process into their reflexive system. Because voice gender influences speech perception, the perceptual experience of gender expansive listeners may influence perceptual flexibility in speech.
Affiliation(s)
- Maxwell Hope
- University of Delaware, Department of Linguistics & Cognitive Science, 125 E Main St, Newark, DE 19716, United States.
| | - Jason Lilley
- Nemours Biomedical Research, Center for Pediatric Auditory and Speech Sciences, 1701 Rockland Road, Room 136B, Wilmington, DE 19803, United States.
| |
|
30
|
van Wieringen A, Magits S, Francart T, Wouters J. Home-Based Speech Perception Monitoring for Clinical Use With Cochlear Implant Users. Front Neurosci 2021; 15:773427. [PMID: 34916902 PMCID: PMC8669965 DOI: 10.3389/fnins.2021.773427] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/09/2021] [Accepted: 10/28/2021] [Indexed: 12/02/2022] Open
Abstract
Speech-perception testing is essential for monitoring outcomes with a hearing aid or cochlear implant (CI). However, clinical care is time-consuming and often challenging as client numbers grow. A potential approach to alleviating some of this clinical load, and possibly making room for other outcome measures, is to employ technologies that assess performance in the home environment. In this study, we investigate three speech perception indices in the same 40 CI users: phoneme identification (vowels and consonants), digits in noise (DiN) and sentence recognition in noise (SiN). The first two tasks were implemented on a tablet and performed multiple times by each client in their home environment, while the sentence task was administered at the clinic. Speech perception outcomes showed that DiN assessed at home can serve as an alternative to SiN assessed at the clinic. DiN scores paralleled SiN scores, offset by a 3–4 dB improvement, and are useful for monitoring performance at regular intervals and detecting changes in auditory performance. Phoneme identification in quiet also explains a significant part of speech perception in noise, and provides additional information on the detectability and discriminability of speech cues. The added benefit of the phoneme identification task, which also proved easy to administer at home, is the information transmission analysis in addition to the summary score. Performance changes for the different indices can be interpreted by comparing against measurement error and help to target personalized rehabilitation. Altogether, home-based speech testing is reliable and a powerful complement to clinic-based care for CI users.
Affiliation(s)
| | - Sara Magits
- Experimental ORL, Department of Neurosciences, KU Leuven, Leuven, Belgium
| | - Tom Francart
- Experimental ORL, Department of Neurosciences, KU Leuven, Leuven, Belgium
| | - Jan Wouters
- Experimental ORL, Department of Neurosciences, KU Leuven, Leuven, Belgium
| |
|
31
|
Symons AE, Dick F, Tierney AT. Dimension-selective attention and dimensional salience modulate cortical tracking of acoustic dimensions. Neuroimage 2021; 244:118544. [PMID: 34492294 DOI: 10.1016/j.neuroimage.2021.118544] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/05/2021] [Revised: 08/19/2021] [Accepted: 08/31/2021] [Indexed: 11/17/2022] Open
Abstract
Some theories of auditory categorization suggest that auditory dimensions that are strongly diagnostic for particular categories - for instance voice onset time or fundamental frequency in the case of some spoken consonants - attract attention. However, prior cognitive neuroscience research on auditory selective attention has largely focused on attention to simple auditory objects or streams, and so little is known about the neural mechanisms that underpin dimension-selective attention, or how the relative salience of variations along these dimensions might modulate neural signatures of attention. Here we investigate whether dimensional salience and dimension-selective attention modulate the cortical tracking of acoustic dimensions. In two experiments, participants listened to tone sequences varying in pitch and spectral peak frequency; these two dimensions changed at different rates. Inter-trial phase coherence (ITPC) and amplitude of the EEG signal at the frequencies tagged to pitch and spectral changes provided a measure of cortical tracking of these dimensions. In Experiment 1, tone sequences varied in the size of the pitch intervals, while the size of spectral peak intervals remained constant. Cortical tracking of pitch changes was greater for sequences with larger compared to smaller pitch intervals, with no difference in cortical tracking of spectral peak changes. In Experiment 2, participants selectively attended to either pitch or spectral peak. Cortical tracking was stronger in response to the attended compared to unattended dimension for both pitch and spectral peak. These findings suggest that attention can enhance the cortical tracking of specific acoustic dimensions rather than simply enhancing tracking of the auditory object as a whole.
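The frequency-tagging logic behind this study's ITPC measure, phase consistency across trials at the rate tagged to a given acoustic dimension, can be sketched in a few lines. This is a schematic with simulated single-channel data; the sampling rate, tag frequency, trial count, and noise level below are hypothetical, not the study's recording parameters.

```python
import numpy as np

fs = 250.0                      # sampling rate in Hz (hypothetical)
t = np.arange(0, 2.0, 1 / fs)   # 2-s trials
f_tag = 4.0                     # rate tagged to, e.g., pitch changes (hypothetical)

rng = np.random.default_rng(1)
# Simulate 40 trials: a phase-locked 4-Hz response buried in noise.
trials = np.sin(2 * np.pi * f_tag * t) + rng.normal(scale=2.0, size=(40, t.size))

# FFT each trial and take the phase at the tagged frequency bin.
spectra = np.fft.rfft(trials, axis=1)
freqs = np.fft.rfftfreq(t.size, 1 / fs)
bin_idx = np.argmin(np.abs(freqs - f_tag))
phases = np.angle(spectra[:, bin_idx])

# ITPC: magnitude of the mean unit phase vector across trials.
# 0 = random phase trial to trial; 1 = perfect phase locking.
itpc = np.abs(np.mean(np.exp(1j * phases)))
```

Because the pitch and spectral-peak dimensions changed at different rates, separate bins of the same spectrum index cortical tracking of each dimension, which is what lets attention or salience effects be read out per dimension.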
Affiliation(s)
- Ashley E Symons
- Department of Psychological Sciences, Birkbeck College, University of London, UK.
| | - Fred Dick
- Department of Psychological Sciences, Birkbeck College, University of London, UK; Division of Psychology & Language Sciences, University College London, UK
| | - Adam T Tierney
- Department of Psychological Sciences, Birkbeck College, University of London, UK
| |
|
32
|
Distinct mechanisms for talker adaptation operate in parallel on different timescales. Psychon Bull Rev 2021; 29:627-634. [PMID: 34731443 DOI: 10.3758/s13423-021-02019-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 09/23/2021] [Indexed: 11/08/2022]
Abstract
The mapping between speech acoustics and phonemic representations is highly variable across talkers, and listeners are slower to recognize words when listening to multiple talkers compared with a single talker. Listeners' speech processing efficiency in mixed-talker settings improves when given time to reorient their attention to each new talker. However, it remains unknown how much time is needed to fully reorient attention to a new talker in mixed-talker settings so that speech processing becomes as efficient as when listening to a single talker. In this study, we examined how speech processing efficiency improves in mixed-talker settings as a function of the duration of continuous speech from a talker. In single-talker and mixed-talker conditions, listeners identified target words either in isolation or preceded by a carrier vowel of parametrically varying durations from 300 to 1,500 ms. Listeners' word identification was significantly slower in every mixed-talker condition compared with the corresponding single-talker condition. The costs associated with processing mixed-talker speech declined significantly as the duration of the speech carrier increased from 0 to 600 ms. However, increasing the carrier duration beyond 600 ms did not achieve further reduction in talker variability-related processing costs. These results suggest that two parallel mechanisms support processing talker variability: A stimulus-driven mechanism that operates on short timescales to reorient attention to new auditory sources, and a top-down mechanism that operates over longer timescales to allocate the cognitive resources needed to accommodate uncertainty in acoustic-phonemic correspondences during contexts where speech may come from multiple talkers.
|
33
|
Feng Y, Meng Y, Li H, Peng G. Effects of Cognitive Load on the Categorical Perception of Mandarin Tones. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2021; 64:3794-3802. [PMID: 34473569 DOI: 10.1044/2021_jslhr-20-00695] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/13/2023]
Abstract
Purpose This study investigated the effect of cognitive load (CL) on the categorical perception (CP) of Mandarin lexical tones to discuss the application of the generalized pulse-skipping hypothesis. This hypothesis assumes that listeners might miss/skip temporal pulses and lose essential speech information due to CL, which consequently affects both the temporal and spectral dimensions of speech perception. Should CL decrease listeners' pitch sensitivity and impair the distinction of tone categories, this study would support the generalized pulse-skipping hypothesis. Method Twenty-four native Mandarin-speaking listeners were recruited to complete a dual-task experiment in which they were required to identify or discriminate tone stimuli while concurrently memorizing six Chinese characters or graphic symbols. A no-load condition without a memory recall task was also included as a baseline condition. The position of the categorical boundary, identification slope, between- and within-category discrimination, and discrimination peakedness were compared across the three conditions to measure the impact of CL on tone perception. The recall accuracy of Chinese characters and graphic symbols was used to assess the difficulty of memory recall. Results Compared with the no-load condition, both load conditions showed a boundary shift toward Tone 3, a shallower identification slope, poorer between-category discrimination, and lower discrimination peakedness. Within-category discrimination was negatively affected by CL in the graphic symbol condition only, not in the Chinese character condition. Conclusions CL degraded listeners' sensitivity to subtle fundamental frequency changes and impaired CP of Mandarin lexical tones. This provides support for the generalized pulse-skipping hypothesis. Moreover, the involvement of lexical information modulated the effect of CL.
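The boundary and slope measures compared across load conditions come from fitting a psychometric function to identification responses along the tone continuum. A minimal sketch follows, using made-up identification proportions (the response rates below are illustrative, not the study's data) and a simple logit-space linear fit rather than the authors' actual fitting procedure:

```python
import numpy as np

# Hypothetical proportions of "high-level tone" responses along an 8-step
# rising-to-level continuum, for a no-load and a cognitive-load condition.
steps = np.arange(1, 9)
p_noload = np.array([0.02, 0.05, 0.10, 0.35, 0.80, 0.93, 0.97, 0.99])
p_load   = np.array([0.05, 0.10, 0.20, 0.40, 0.65, 0.80, 0.90, 0.95])

def boundary_and_slope(p, eps=1e-3):
    """Fit a logistic psychometric function via linear regression in logit space."""
    logit = np.log((p + eps) / (1 - p + eps))   # eps avoids log(0) at the endpoints
    slope, intercept = np.polyfit(steps, logit, 1)
    boundary = -intercept / slope               # continuum step where p = 0.5
    return boundary, slope

b0, s0 = boundary_and_slope(p_noload)
b1, s1 = boundary_and_slope(p_load)
# Under load: a shallower identification slope (s1 < s0) and a shifted boundary,
# the pattern the abstract reports relative to the no-load baseline.
```

A shallower fitted slope corresponds to a less sharply categorical identification function, which is how the degraded CP under load would show up in this kind of analysis.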
Affiliation(s)
- Yan Feng
- Research Centre for Language, Cognition, and Neuroscience, Department of Chinese and Bilingual Studies, The Hong Kong Polytechnic University, Kowloon
| | - Yaru Meng
- Department of Chinese Language and Literature, East China Normal University, Shanghai, China
| | - Hanfei Li
- Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, China
| | - Gang Peng
- Research Centre for Language, Cognition, and Neuroscience, Department of Chinese and Bilingual Studies, The Hong Kong Polytechnic University, Kowloon
- Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, China
| |
|
34
|
Lim SJ, Carter YD, Njoroge JM, Shinn-Cunningham BG, Perrachione TK. Talker discontinuity disrupts attention to speech: Evidence from EEG and pupillometry. BRAIN AND LANGUAGE 2021; 221:104996. [PMID: 34358924 PMCID: PMC8515637 DOI: 10.1016/j.bandl.2021.104996] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/29/2021] [Revised: 07/11/2021] [Accepted: 07/13/2021] [Indexed: 05/13/2023]
Abstract
Speech is processed less efficiently from discontinuous, mixed talkers than from one consistent talker, but little is known about the neural mechanisms for processing talker variability. Here, we measured psychophysiological responses to talker variability using electroencephalography (EEG) and pupillometry while listeners performed a delayed-recall digit span task. Listeners heard and recalled seven-digit sequences with both talker (single- vs. mixed-talker digits) and temporal (0- vs. 500-ms inter-digit intervals) discontinuities. Talker discontinuity reduced serial recall accuracy. Both talker and temporal discontinuities elicited P3a-like neural evoked responses, while rapid processing of mixed-talkers' speech led to increased phasic pupil dilation. Furthermore, mixed-talkers' speech produced less alpha oscillatory power during working memory maintenance, but not during speech encoding. Overall, these results are consistent with an auditory attention and streaming framework in which talker discontinuity leads to involuntary, stimulus-driven attentional reorientation to novel speech sources, resulting in the processing interference classically associated with talker variability.
Affiliation(s)
- Sung-Joo Lim
- Department of Speech, Language, and Hearing Sciences, Boston University, United States.
| | - Yaminah D Carter
- Department of Speech, Language, and Hearing Sciences, Boston University, United States
| | - J Michelle Njoroge
- Department of Speech, Language, and Hearing Sciences, Boston University, United States
| | | | - Tyler K Perrachione
- Department of Speech, Language, and Hearing Sciences, Boston University, United States.
| |
|
35
|
Prince P, Paul BT, Chen J, Le T, Lin V, Dimitrijevic A. Neural correlates of visual stimulus encoding and verbal working memory differ between cochlear implant users and normal-hearing controls. Eur J Neurosci 2021; 54:5016-5037. [PMID: 34146363 PMCID: PMC8457219 DOI: 10.1111/ejn.15365] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/09/2020] [Revised: 06/10/2021] [Accepted: 06/14/2021] [Indexed: 11/29/2022]
Abstract
A common concern for individuals with severe-to-profound hearing loss fitted with cochlear implants (CIs) is difficulty following conversations in noisy environments. Recent work has suggested that these difficulties are related to individual differences in brain function, including verbal working memory and the degree of cross-modal reorganization of auditory areas for visual processing. However, the neural basis for these relationships is not fully understood. Here, we investigated neural correlates of visual verbal working memory and sensory plasticity in 14 CI users and age-matched normal-hearing (NH) controls. While we recorded the high-density electroencephalogram (EEG), participants completed a modified Sternberg visual working memory task where sets of letters and numbers were presented visually and then recalled at a later time. Results suggested that CI users' behavioural working memory performance was comparable with that of NH controls. However, CI users had more pronounced neural activity during visual stimulus encoding, including stronger visual-evoked activity in auditory and visual cortices, larger modulations of neural oscillations and increased frontotemporal connectivity. In contrast, during memory retention of the characters, CI users had descriptively weaker neural oscillations and significantly lower frontotemporal connectivity. We interpret the differences in neural correlates of visual stimulus processing in CI users through the lens of cross-modal and intramodal plasticity.
Affiliation(s)
- Priyanka Prince
- Evaluative Clinical Sciences Platform, Sunnybrook Research Institute, Toronto, Ontario, Canada; Department of Physiology, University of Toronto, Toronto, Ontario, Canada
| | - Brandon T Paul
- Evaluative Clinical Sciences Platform, Sunnybrook Research Institute, Toronto, Ontario, Canada; Otolaryngology-Head and Neck Surgery, Sunnybrook Health Sciences Centre, Toronto, Ontario, Canada; Department of Psychology, Ryerson University, Toronto, Ontario, Canada
| | - Joseph Chen
- Otolaryngology-Head and Neck Surgery, Sunnybrook Health Sciences Centre, Toronto, Ontario, Canada; Faculty of Medicine, Otolaryngology-Head and Neck Surgery, University of Toronto, Toronto, Ontario, Canada
| | - Trung Le
- Otolaryngology-Head and Neck Surgery, Sunnybrook Health Sciences Centre, Toronto, Ontario, Canada; Faculty of Medicine, Otolaryngology-Head and Neck Surgery, University of Toronto, Toronto, Ontario, Canada
| | - Vincent Lin
- Otolaryngology-Head and Neck Surgery, Sunnybrook Health Sciences Centre, Toronto, Ontario, Canada; Faculty of Medicine, Otolaryngology-Head and Neck Surgery, University of Toronto, Toronto, Ontario, Canada
| | - Andrew Dimitrijevic
- Evaluative Clinical Sciences Platform, Sunnybrook Research Institute, Toronto, Ontario, Canada; Department of Physiology, University of Toronto, Toronto, Ontario, Canada; Otolaryngology-Head and Neck Surgery, Sunnybrook Health Sciences Centre, Toronto, Ontario, Canada; Faculty of Medicine, Otolaryngology-Head and Neck Surgery, University of Toronto, Toronto, Ontario, Canada
| |
|
36
|
Mechtenberg H, Xie X, Myers EB. Sentence predictability modulates cortical response to phonetic ambiguity. BRAIN AND LANGUAGE 2021; 218:104959. [PMID: 33930722 PMCID: PMC8513138 DOI: 10.1016/j.bandl.2021.104959] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 07/13/2020] [Revised: 03/02/2021] [Accepted: 04/09/2021] [Indexed: 06/12/2023]
Abstract
Phonetic categories have undefined edges, such that individual tokens that belong to different speech sound categories may occupy the same region in acoustic space. In continuous speech, there are multiple sources of top-down information (e.g., lexical, semantic) that help to resolve the identity of an ambiguous phoneme. Of interest is how these top-down constraints interact with ambiguity at the phonetic level. In the current fMRI study, participants passively listened to sentences that varied in semantic predictability and in the amount of naturally-occurring phonetic competition. The left middle frontal gyrus, angular gyrus, and anterior inferior frontal gyrus were sensitive to both semantic predictability and the degree of phonetic competition. Notably, greater phonetic competition within non-predictive contexts resulted in a negatively-graded neural response. We suggest that uncertainty at the phonetic-acoustic level interacts with uncertainty at the semantic level-perhaps due to a failure of the network to construct a coherent meaning.
Affiliation(s)
- Hannah Mechtenberg
- Department of Speech, Language, and Hearing Sciences, University of Connecticut, Storrs, Mansfield, CT 06269, USA.
| | - Xin Xie
- Department of Brain and Cognitive Sciences, University of Rochester, Rochester, NY 14627, USA.
| | - Emily B Myers
- Department of Speech, Language, and Hearing Sciences, University of Connecticut, Storrs, Mansfield, CT 06269, USA; Department of Psychological Sciences, University of Connecticut, Storrs, Mansfield, CT 06269, USA.
| |
|
37
|
Francis AL, Bent T, Schumaker J, Love J, Silbert N. Listener characteristics differentially affect self-reported and physiological measures of effort associated with two challenging listening conditions. Atten Percept Psychophys 2021; 83:1818-1841. [PMID: 33438149 PMCID: PMC8084824 DOI: 10.3758/s13414-020-02195-9] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 09/16/2020] [Indexed: 12/14/2022]
Abstract
Listeners vary in their ability to understand speech in adverse conditions. Differences in both cognitive and linguistic capacities play a role, but increasing evidence suggests that such factors may contribute differentially depending on the listening challenge. Here, we used multilevel modeling to evaluate contributions of individual differences in age, hearing thresholds, vocabulary, selective attention, working memory capacity, personality traits, and noise sensitivity to variability in measures of comprehension and listening effort in two listening conditions. A total of 35 participants completed a battery of cognitive and linguistic tests as well as a spoken story comprehension task using (1) native-accented English speech masked by speech-shaped noise and (2) nonnative-accented English speech without masking. Masker levels were adjusted individually to ensure each participant would show (close to) equivalent word recognition performance across the two conditions. Dependent measures included comprehension test results, self-rated effort, and electrodermal, cardiovascular, and facial electromyographic measures associated with listening effort. Results showed varied patterns of responsivity across different dependent measures as well as across listening conditions. In particular, results suggested that working memory capacity may play a greater role in the comprehension of nonnative-accented speech than noise-masked speech, while hearing acuity and personality may have a stronger influence on physiological responses affected by demands of understanding speech in noise. Furthermore, electrodermal measures may be more strongly affected by affective response to noise-related interference, while cardiovascular responses may be more strongly affected by demands on working memory and lexical access.
Affiliation(s)
- Alexander L Francis
- Department of Speech, Language and Hearing Sciences, Purdue University, Lyles-Porter Hall, 715 Clinic Dr., West Lafayette, IN, 47907, USA.
| | - Tessa Bent
- Department of Speech, Language and Hearing Sciences, Indiana University, Bloomington, IN, USA
| | - Jennifer Schumaker
- Department of Speech, Language and Hearing Sciences, Purdue University, Lyles-Porter Hall, 715 Clinic Dr., West Lafayette, IN, 47907, USA
| | - Jordan Love
- Department of Speech, Language and Hearing Sciences, Purdue University, Lyles-Porter Hall, 715 Clinic Dr., West Lafayette, IN, 47907, USA
| | - Noah Silbert
- Applied Research Laboratory for Intelligence and Security, University of Maryland, College Park, MD, USA
| |
|
38
|
Liu F, Yin Y, Chan AHD, Yip V, Wong PCM. Individuals with congenital amusia do not show context-dependent perception of tonal categories. BRAIN AND LANGUAGE 2021; 215:104908. [PMID: 33578176 DOI: 10.1016/j.bandl.2021.104908] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 12/26/2019] [Revised: 01/05/2021] [Accepted: 01/05/2021] [Indexed: 06/12/2023]
Abstract
Perceptual adaptation is an active cognitive process whereby listeners re-analyse speech categories based on new contexts, situations, or talkers. It involves top-down influences from higher cortical levels on lower-level auditory processes. Individuals with congenital amusia have impaired pitch processing with reduced connectivity between frontal and temporal regions. This study examined whether deficits in amusia would lead to impaired perceptual adaptation in lexical tone perception. Thirteen Mandarin-speaking amusics and 13 controls identified the category of target tones on an 8-step continuum ranging from rising to high-level, either in isolation or in a high-/low-pitched context. For tones with no context, amusics exhibited less categorical perception than controls. While controls' lexical tone categorization demonstrated a significant context effect due to perceptual adaptation, amusics showed similar categorization patterns across both contexts. These findings suggest that congenital amusia impacts the extraction of context-dependent tonal categories in speech perception, indicating that perceptual adaptation may depend on listeners' perceptual acuity.
Affiliation(s)
- Fang Liu
- School of Psychology and Clinical Language Sciences, University of Reading, Reading, UK
| | - Yanjun Yin
- Department of Linguistics and Modern Languages, The Chinese University of Hong Kong, Hong Kong, China
| | - Alice H D Chan
- Linguistics and Multilingual Studies, School of Humanities, Nanyang Technological University, Singapore.
| | - Virginia Yip
- Department of Linguistics and Modern Languages, The Chinese University of Hong Kong, Hong Kong, China
| | - Patrick C M Wong
- Department of Linguistics and Modern Languages, The Chinese University of Hong Kong, Hong Kong, China; Brain and Mind Institute, The Chinese University of Hong Kong, Hong Kong, China.
| |
|
39
|
Kaplan EC, Wagner AE, Toffanin P, Başkent D. Do Musicians and Non-musicians Differ in Speech-on-Speech Processing? Front Psychol 2021; 12:623787. [PMID: 33679539 PMCID: PMC7931613 DOI: 10.3389/fpsyg.2021.623787] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/30/2020] [Accepted: 01/21/2021] [Indexed: 12/18/2022] Open
Abstract
Earlier studies have shown that musically trained individuals may have a benefit in adverse listening situations compared to non-musicians, especially in speech-on-speech perception. However, the literature provides mostly conflicting results. In the current study, by employing different measures of spoken language processing, we aimed to test whether we could capture potential differences between musicians and non-musicians in speech-on-speech processing. We used an offline measure of speech perception (a sentence recall task), which reveals a post-task response, and online measures of real-time spoken language processing: gaze-tracking and pupillometry. We used stimuli of comparable complexity across both paradigms and tested the same groups of participants. In the sentence recall task, musicians recalled more words correctly than non-musicians. In the eye-tracking experiment, both groups showed reduced fixations to the target and competitor words' images as the level of speech maskers increased. The time course of gaze fixations to the competitor did not differ between groups in the speech-in-quiet condition, while the time course dynamics did differ between groups as the two-talker masker was added to the target signal. As the level of the two-talker masker increased, musicians showed reduced lexical competition, as indicated by gaze fixations to the competitor. The pupil dilation data showed differences mainly at one target-to-masker ratio, which does not allow us to draw conclusions regarding potential differences in the use of cognitive resources between groups. Overall, the eye-tracking measure enabled us to observe that musicians may be using a different strategy than non-musicians to attain spoken word recognition as the noise level increased. However, further investigation with more fine-grained alignment between the processes captured by online and offline measures is necessary to establish whether musicians differ due to better cognitive control or sound processing.
Affiliation(s)
- Elif Canseza Kaplan
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, Netherlands; Research School of Behavioral and Cognitive Neurosciences, Graduate School of Medical Sciences, University of Groningen, Groningen, Netherlands
- Anita E Wagner
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, Netherlands
- Paolo Toffanin
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, Netherlands
- Deniz Başkent
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, Netherlands; Research School of Behavioral and Cognitive Neurosciences, Graduate School of Medical Sciences, University of Groningen, Groningen, Netherlands
|
40
|
Talker familiarity and the accommodation of talker variability. Atten Percept Psychophys 2021; 83:1842-1860. [PMID: 33398658 DOI: 10.3758/s13414-020-02203-y] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 11/05/2020] [Indexed: 11/08/2022]
Abstract
A fundamental problem in speech perception is how (or whether) listeners accommodate variability in the way talkers produce speech. One view of the way listeners cope with this variability is that talker differences are normalized: a mapping between talker-specific characteristics and phonetic categories is computed such that speech is recognized in the context of the talker's vocal characteristics. Consistent with this view, listeners process speech more slowly when the talker changes randomly than when the talker remains constant. An alternative view is that speech perception is based on talker-specific auditory exemplars in memory, clustered around linguistic categories, that allow talker-independent perception. Consistent with this view, listeners become more efficient at talker-specific phonetic processing after voice identification training. We asked whether phonetic efficiency would increase with talker familiarity by testing listeners with extremely familiar talkers (family members), newly familiar talkers (based on laboratory training), and unfamiliar talkers. We also asked whether familiarity would reduce the need for normalization. As predicted, phonetic efficiency (word recognition in noise) increased with familiarity (unfamiliar < trained-on < family). However, we observed a constant processing cost for talker changes even for pairs of family members. We discuss how normalization and exemplar theories might account for these results, and the constraints the results impose on theoretical accounts of phonetic constancy.
|
41
|
Tremblay P, Brisson V, Deschamps I. Brain aging and speech perception: Effects of background noise and talker variability. Neuroimage 2020; 227:117675. [PMID: 33359849 DOI: 10.1016/j.neuroimage.2020.117675] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/24/2020] [Revised: 12/15/2020] [Accepted: 12/17/2020] [Indexed: 10/22/2022] Open
Abstract
Speech perception can be challenging, especially for older adults. Despite the importance of speech perception in social interactions, the mechanisms underlying these difficulties remain unclear and treatment options are scarce. While several studies have suggested that decline within cortical auditory regions may be a hallmark of these difficulties, a growing number of studies have reported decline in regions beyond the auditory processing network, including regions involved in speech processing and executive control. This suggests a potentially diffuse underlying neural disruption, though no consensus exists regarding the underlying dysfunctions. To address this issue, we conducted two experiments in which we investigated age differences in speech perception when background noise and talker variability are manipulated, two factors known to be detrimental to speech perception. In Experiment 1, we examined the relationship between speech perception, hearing, and auditory attention in 88 healthy participants aged 19 to 87 years. In Experiment 2, we examined cortical thickness and BOLD signal using magnetic resonance imaging (MRI) and related these measures to speech perception performance using a simple mediation approach in 32 participants from Experiment 1. Our results show that, even after accounting for hearing thresholds and two measures of auditory attention, speech perception significantly declined with age. Age-related decline in speech perception in noise was associated with thinner cortex in auditory and speech processing regions (including the superior temporal cortex, ventral premotor cortex, and inferior frontal gyrus) as well as in regions involved in executive control (including the dorsal anterior insula, the anterior cingulate cortex, and the medial frontal cortex).
Further, our results show that speech perception performance was associated with a reduced brain response in the right superior temporal cortex in older compared to younger adults, and with an increased response to noise in older adults in the left anterior temporal cortex. Talker variability was not associated with different activation patterns in older compared to younger adults. Together, these results support the notion of a diffuse rather than a focal dysfunction underlying difficulties with speech perception in noise in older adults.
Affiliation(s)
- Pascale Tremblay
- CERVO Brain Research Center, Québec City, QC, Canada; Université Laval, Département de réadaptation, Québec City, QC, Canada
- Valérie Brisson
- CERVO Brain Research Center, Québec City, QC, Canada; Université Laval, Département de réadaptation, Québec City, QC, Canada
|
42
|
Ong JH, Wong PCM, Liu F. Musicians show enhanced perception, but not production, of native lexical tones. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2020; 148:3443. [PMID: 33379922 DOI: 10.1121/10.0002776] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/18/2020] [Accepted: 11/06/2020] [Indexed: 06/12/2023]
Abstract
Many studies have reported a musical advantage in perceiving lexical tones among non-native listeners, but it is unclear whether this advantage also applies to native listeners, who are likely to show ceiling-like performance and thus mask any potential musical advantage. The ongoing tone merging phenomenon in Hong Kong Cantonese provides a unique opportunity to investigate this, as merging tone pairs are reported to be difficult to differentiate even among native listeners. In the present study, native Cantonese musicians and non-musicians were compared on discrimination and identification of merging Cantonese tone pairs to determine whether a musical advantage in perception would be observed and, if so, whether it operates at the phonetic and/or phonological level. The tonal space of the subjects' lexical tone production was also compared. Results indicated that the musicians outperformed the non-musicians on the two perceptual tasks, as indexed by higher accuracy and faster reaction times, particularly on the most difficult tone pair. In the production task, however, there was no group difference in various indices of tonal space. Taken together, musical experience appears to facilitate native listeners' perception, but not production, of lexical tones, which partially supports a music-to-language transfer effect.
Affiliation(s)
- Jia Hoong Ong
- School of Psychology and Clinical Language Sciences, University of Reading, Earley Gate, Reading RG6 6AL, United Kingdom
- Patrick C M Wong
- Department of Linguistics and Modern Languages and Brain and Mind Institute, The Chinese University of Hong Kong, Hong Kong, China
- Fang Liu
- School of Psychology and Clinical Language Sciences, University of Reading, Earley Gate, Reading RG6 6AL, United Kingdom
|
43
|
Responses to Visual Speech in Human Posterior Superior Temporal Gyrus Examined with iEEG Deconvolution. J Neurosci 2020; 40:6938-6948. [PMID: 32727820 PMCID: PMC7470920 DOI: 10.1523/jneurosci.0279-20.2020] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/21/2020] [Revised: 06/01/2020] [Accepted: 06/02/2020] [Indexed: 12/22/2022] Open
Abstract
Experimentalists studying multisensory integration compare neural responses to multisensory stimuli with responses to the component modalities presented in isolation. This procedure is problematic for multisensory speech perception, since audiovisual speech and auditory-only speech are easily intelligible but visual-only speech is not. To overcome this confound, we developed intracranial electroencephalography (iEEG) deconvolution. Individual stimuli always contained both auditory and visual speech, but jittering the onset asynchrony between modalities allowed the time course of the unisensory responses and the interaction between them to be independently estimated. We applied this procedure to electrodes implanted in human epilepsy patients (both male and female) over the posterior superior temporal gyrus (pSTG), a brain area known to be important for speech perception. iEEG deconvolution revealed sustained positive responses to visual-only speech and larger, phasic responses to auditory-only speech. Confirming results from scalp EEG, responses to audiovisual speech were weaker than responses to auditory-only speech, demonstrating a subadditive multisensory neural computation. Leveraging the spatial resolution of iEEG, we extended these results to show that subadditivity is most pronounced in more posterior aspects of the pSTG. Across electrodes, subadditivity correlated with visual responsiveness, supporting a model in which visual speech enhances the efficiency of auditory speech processing in pSTG. The ability to separate neural processes may make iEEG deconvolution useful for studying a variety of complex cognitive and perceptual tasks.
SIGNIFICANCE STATEMENT Understanding speech is one of the most important human abilities. Speech perception uses information from both the auditory and visual modalities. It has been difficult to study neural responses to visual speech because visual-only speech is difficult or impossible to comprehend, unlike auditory-only and audiovisual speech. We used intracranial electroencephalography deconvolution to overcome this obstacle. We found that visual speech evokes a positive response in the human posterior superior temporal gyrus, enhancing the efficiency of auditory speech processing.
|
44
|
Selecting among competing models of talker adaptation: Attention, cognition, and memory in speech processing efficiency. Cognition 2020; 204:104393. [PMID: 32688132 DOI: 10.1016/j.cognition.2020.104393] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/21/2020] [Revised: 06/14/2020] [Accepted: 06/29/2020] [Indexed: 11/24/2022]
Abstract
Phonetic variability across talkers imposes additional processing costs during speech perception, often measured by performance decrements between single- and mixed-talker conditions. However, models differ in their predictions about whether accommodating greater phonetic variability (i.e., more talkers) imposes greater processing costs. We measured speech processing efficiency in a speeded word identification task, in which we manipulated the number of talkers (1, 2, 4, 8, or 16) listeners heard. Word identification was less efficient in every mixed-talker condition compared to the single-talker condition, but the magnitude of this performance decrement was not affected by the number of talkers. Furthermore, in a condition with uniform transition probabilities between two talkers, word identification was more efficient when the talker was the same as on the prior trial than when the talker switched. These results support an auditory streaming model of talker adaptation, where processing costs associated with changing talkers result from attentional reorientation.
|
45
|
Abstract
Listeners exposed to accented speech must adjust how they map between acoustic features and lexical representations such as phonetic categories. A robust form of this adaptive perceptual learning is learning to perceive synthetic speech where the connections between acoustic features and phonetic categories must be updated. Both implicit learning through mere exposure and explicit learning through directed feedback have previously been shown to produce this type of adaptive learning. The present study crosses implicit exposure and explicit feedback with the presence or absence of a written identification task. We show that simple exposure produces some learning, but explicit feedback produces substantially stronger learning, whereas requiring written identification did not measurably affect learning. These results suggest that explicit feedback guides learning of new mappings between acoustic patterns and known phonetic categories. We discuss mechanisms that may support learning via implicit exposure.
|
46
|
Uddin S, Reis KS, Heald SLM, Van Hedger SC, Nusbaum HC. Cortical mechanisms of talker normalization in fluent sentences. BRAIN AND LANGUAGE 2020; 201:104722. [PMID: 31835154 PMCID: PMC8038647 DOI: 10.1016/j.bandl.2019.104722] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/13/2018] [Revised: 11/04/2019] [Accepted: 11/13/2019] [Indexed: 05/27/2023]
Abstract
Adjusting to the vocal characteristics of a new talker is important for speech recognition. Previous research has indicated that adjusting to talker differences is an active cognitive process that depends on attention and working memory (WM). These studies have not examined how talker variability affects perception and neural responses in fluent speech. Here we use source analysis from high-density EEG to show that perceiving fluent speech in which the talker changes recruits early involvement of parietal and temporal cortical areas, suggesting functional involvement of WM and attention in talker normalization. We extend these findings to acoustic source change in general by examining understanding environmental sounds in spoken sentence context. Though there may be differences in cortical recruitment to processing demands for non-speech sounds versus a changing talker, the underlying mechanisms are similar, supporting the view that shared cognitive-general mechanisms assist both talker normalization and speech-to-nonspeech transitions.
Affiliation(s)
- Sophia Uddin
- Department of Psychology, The University of Chicago, 5848 S. University Ave., Chicago, IL 60637, United States
- Katherine S Reis
- Department of Psychology, The University of Chicago, 5848 S. University Ave., Chicago, IL 60637, United States
- Shannon L M Heald
- Department of Psychology, The University of Chicago, 5848 S. University Ave., Chicago, IL 60637, United States
- Stephen C Van Hedger
- Department of Psychology, The University of Chicago, 5848 S. University Ave., Chicago, IL 60637, United States
- Howard C Nusbaum
- Department of Psychology, The University of Chicago, 5848 S. University Ave., Chicago, IL 60637, United States
|
47
|
Jenson D, Bowers AL, Hudock D, Saltuklaroglu T. The Application of EEG Mu Rhythm Measures to Neurophysiological Research in Stuttering. Front Hum Neurosci 2020; 13:458. [PMID: 31998103 PMCID: PMC6965028 DOI: 10.3389/fnhum.2019.00458] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/01/2019] [Accepted: 12/13/2019] [Indexed: 11/29/2022] Open
Abstract
Deficits in basal ganglia-based inhibitory and timing circuits along with sensorimotor internal modeling mechanisms are thought to underlie stuttering. However, much remains to be learned regarding the precise manner in which these deficits contribute to disrupting both speech and cognitive functions in those who stutter. Herein, we examine the suitability of electroencephalographic (EEG) mu rhythms for addressing these deficits. We review some previous findings of mu rhythm activity differentiating stuttering from non-stuttering individuals and present some new preliminary findings capturing stuttering-related deficits in working memory. Mu rhythms are characterized by spectral peaks in alpha (8-13 Hz) and beta (14-25 Hz) frequency bands (mu-alpha and mu-beta). They emanate from premotor/motor regions and are influenced by basal ganglia and sensorimotor function. More specifically, alpha peaks (mu-alpha) are sensitive to basal ganglia-based inhibitory signals and sensory-to-motor feedback. Beta peaks (mu-beta) are sensitive to changes in timing and capture motor-to-sensory (i.e., forward model) projections. Observing simultaneous changes in mu-alpha and mu-beta across the time course of specific events provides a rich window for observing neurophysiological deficits associated with stuttering in both speech and cognitive tasks and can provide a better understanding of the functional relationship among these stuttering symptoms. We review how independent component analysis (ICA) can extract mu rhythms from raw EEG signals in speech production tasks, such that changes in alpha and beta power are mapped to myogenic activity from articulators. We review findings from speech production and auditory discrimination tasks demonstrating that mu-alpha and mu-beta are highly sensitive to capturing sensorimotor and basal ganglia deficits associated with stuttering with high temporal precision. Novel findings from a non-word repetition (working memory) task are also included.
They show reduced mu-alpha suppression in a stuttering group compared to a typically fluent group. Finally, we review current limitations and directions for future research.
Affiliation(s)
- David Jenson
- Department of Speech and Hearing Sciences, Elson S. Floyd College of Medicine, Washington State University, Spokane, WA, United States
- Andrew L. Bowers
- Epley Center for Health Professions, Communication Sciences and Disorders, University of Arkansas, Fayetteville, AR, United States
- Daniel Hudock
- Department of Communication Sciences and Disorders, Idaho State University, Pocatello, ID, United States
- Tim Saltuklaroglu
- College of Health Professions, Department of Audiology and Speech-Pathology, University of Tennessee Health Science Center, Knoxville, TN, United States
|
48
|
Loughrey DG, Mihelj E, Lawlor BA. Age-related hearing loss associated with altered response efficiency and variability on a visual sustained attention task. AGING NEUROPSYCHOLOGY AND COGNITION 2019; 28:1-25. [PMID: 31868123 DOI: 10.1080/13825585.2019.1704393] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/09/2023]
Abstract
This study investigated the association between age-related hearing loss (ARHL) and differences in response efficiency and variability on a sustained attention task. The study population comprised 32 participants in a hearing loss group (HLG) and 34 controls without hearing loss (CG). Mean reaction time (RT) and accuracy were recorded to assess response efficiency. RT variability was decomposed to examine temporal aspects of variability associated with neural arousal and top-down executive control of vigilant attention. The HLG had a significantly longer mean RT, possibly reflecting a strategic approach to maintain accuracy. The HLG also demonstrated altered variability (indicative of greater decline in neural arousal) but maintained executive control that was significantly predictive of poorer response efficiency. Adults with ARHL may rely on higher-order attention networks to compensate for decline in both peripheral sensory function and in subcortical arousal systems which mediate lower-order automatic neurocognitive processes.
Affiliation(s)
- David G Loughrey
- Global Brain Health Institute, Trinity College Dublin, Ireland / University of California, San Francisco, CA, USA
- Ernest Mihelj
- Institute of Human Movement Sciences and Sport, Eidgenössische Technische Hochschule Zürich, Switzerland
- Brian A Lawlor
- Global Brain Health Institute, Trinity College Dublin, Ireland / University of California, San Francisco; Mercer's Institute for Successful Ageing, St James Hospital, Dublin, Ireland
|
49
|
De Keyser K, De Letter M, De Groote E, Santens P, Talsma D, Botteldooren D, Bockstael A. Systematic Audiological Assessment of Auditory Functioning in Patients With Parkinson's Disease. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2019; 62:4564-4577. [PMID: 31770043 DOI: 10.1044/2019_jslhr-h-19-0097] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/10/2023]
Abstract
Purpose Alterations in primary auditory functioning have been reported in patients with Parkinson's disease (PD). Despite the current findings, the pathophysiological mechanisms underlying these alterations remain unclear, and the effect of dopaminergic medication on auditory functioning in PD has been explored insufficiently. Therefore, this study aimed to systematically investigate primary auditory functioning in patients with PD by using both subjective and objective audiological measurements. Method In this case-control study, 25 patients with PD and 25 age-, gender-, and education-matched healthy controls underwent an audiological test battery consisting of tonal audiometry, short increment sensitivity index, otoacoustic emissions (OAEs), and speech audiometry. Patients with PD were tested in the on- and off-medication states. Results Increased OAE amplitudes were found when patients with PD were tested without dopaminergic medication. In addition, speech audiometry in silence and multitalker babble noise demonstrated higher phoneme scores for patients with PD in the off-medication condition. The results showed no differences in auditory functioning between patients with PD in the on-medication condition and healthy controls. No effect of disease stage or motor score was evident. Conclusions This study provides evidence for a top-down involvement in auditory processing in PD at both central and peripheral levels. Most important, the increase in OAE amplitude in the off-medication condition in PD is hypothesized to be linked to a dysfunction of the olivocochlear efferent system, which is known to have an inhibitory effect on outer hair cell functioning. Future studies may clarify whether OAEs may facilitate an early diagnosis of PD.
Affiliation(s)
- Kim De Keyser
- Department of Rehabilitation Sciences, Ghent University, Belgium
- Miet De Letter
- Department of Rehabilitation Sciences, Ghent University, Belgium
- Durk Talsma
- Department of Experimental Psychology, Ghent University, Belgium
- Dick Botteldooren
- Department of Information Technology (INTEC), Acoustics Research Group, Ghent University, Belgium
- Annelies Bockstael
- Ecole d'Orthophonie et d'Audiologie, Université de Montréal, Quebec, Canada
|
50
|
Kowialiewski B, Van Calster L, Attout L, Phillips C, Majerus S. Neural Patterns in Linguistic Cortices Discriminate the Content of Verbal Working Memory. Cereb Cortex 2019; 30:2997-3014. [PMID: 31813984 DOI: 10.1093/cercor/bhz290] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/17/2019] [Revised: 09/16/2019] [Accepted: 06/17/2019] [Indexed: 01/11/2023] Open
Abstract
An influential theoretical account of working memory (WM) considers that WM is based on direct activation of long-term memory knowledge. While there is empirical support for this position in the visual WM domain, direct evidence is scarce in the verbal WM domain. This question is critical for models of verbal WM, as it is still debated whether short-term maintenance of verbal information relies on direct activation within the long-term linguistic knowledge base. In this study, we examined the extent to which short-term maintenance of lexico-semantic knowledge relies on neural activation patterns in linguistic cortices, using a fast-encoding running span task for word and nonword stimuli that minimizes strategic encoding mechanisms. Multivariate analyses showed specific neural patterns for the encoding and maintenance of word versus nonword stimuli. These patterns were no longer detectable when participants were instructed to stop maintaining the memoranda. The patterns involved specific regions within the dorsal and ventral pathways, which are considered to support phonological and semantic processing to various degrees. This study provides novel evidence for a role of linguistic cortices in the representation of long-term linguistic knowledge during WM processing.
Affiliation(s)
- Benjamin Kowialiewski
- University of Liège, Liège, Belgium; Fund for Scientific Research (F.R.S.-FNRS), Brussels, Belgium
- Laurens Van Calster
- University of Liège, Liège, Belgium; University of Geneva, Geneva, Switzerland
- Christophe Phillips
- University of Liège, Liège, Belgium; Fund for Scientific Research (F.R.S.-FNRS), Brussels, Belgium
- Steve Majerus
- University of Liège, Liège, Belgium; Fund for Scientific Research (F.R.S.-FNRS), Brussels, Belgium
|