1
Hansen TA, O’Leary RM, Svirsky MA, Wingfield A. Self-pacing ameliorates recall deficit when listening to vocoded discourse: a cochlear implant simulation. Front Psychol 2023; 14:1225752. [PMID: 38054180 PMCID: PMC10694252 DOI: 10.3389/fpsyg.2023.1225752]
Abstract
Introduction In spite of its apparent ease, comprehension of spoken discourse represents a complex linguistic and cognitive operation. The difficulty of such an operation can increase when the speech is degraded, as is the case with cochlear implant users. However, the additional challenges imposed by degraded speech may be mitigated to some extent by the linguistic context and pace of presentation.
Methods An experiment is reported in which young adults with age-normal hearing recalled discourse passages heard with clear speech or with noise-band vocoding used to simulate the sound of speech produced by a cochlear implant. Passages were varied in inter-word predictability and presented either without interruption or in a self-pacing format that allowed the listener to control the rate at which the information was delivered.
Results Results showed that discourse heard with clear speech was better recalled than discourse heard with vocoded speech, discourse with a higher average inter-word predictability was better recalled than discourse with a lower average inter-word predictability, and self-paced passages were recalled better than those heard without interruption. Of special interest was the semantic hierarchy effect: the tendency for listeners to show better recall for main ideas than mid-level information or detail from a passage as an index of listeners' ability to understand the meaning of a passage. The data revealed a significant effect of inter-word predictability, in that passages with lower predictability had an attenuated semantic hierarchy effect relative to higher-predictability passages.
Discussion Results are discussed in terms of broadening cochlear implant outcome measures beyond current clinical measures that focus on single-word and sentence repetition.
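The noise-band vocoding referred to above is a standard signal-processing manipulation: each analysis band's temporal envelope is reimposed on band-limited noise. Below is a minimal illustrative sketch in Python with SciPy; the channel count, filter design, and envelope cutoff are assumed values, not the parameters used in this study.

```python
# Minimal noise-band vocoder sketch (illustrative only; channel count, filter
# order, and envelope cutoff are assumed values, not this study's settings).
import numpy as np
from scipy.signal import butter, sosfiltfilt

def noise_vocode(signal, fs, n_channels=8, f_lo=100.0, f_hi=7000.0, env_cutoff=50.0):
    """Replace the fine structure in each analysis band with band-limited noise
    modulated by that band's temporal envelope, then sum across bands."""
    edges = np.geomspace(f_lo, f_hi, n_channels + 1)       # log-spaced channel edges
    noise = np.random.default_rng(0).standard_normal(len(signal))
    env_lp = butter(4, env_cutoff, btype="lowpass", fs=fs, output="sos")
    out = np.zeros(len(signal))
    for lo, hi in zip(edges[:-1], edges[1:]):
        band_sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
        band = sosfiltfilt(band_sos, signal)                # analysis band
        env = np.clip(sosfiltfilt(env_lp, np.abs(band)), 0.0, None)  # temporal envelope
        carrier = sosfiltfilt(band_sos, noise)              # noise carrier in the same band
        out += env * carrier                                # envelope-modulated noise band
    return out / np.max(np.abs(out))                        # rough level normalization
```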
Affiliation(s)
- Thomas A. Hansen
- Department of Psychology, Brandeis University, Waltham, MA, United States
- Ryan M. O’Leary
- Department of Psychology, Brandeis University, Waltham, MA, United States
- Mario A. Svirsky
- Department of Otolaryngology, NYU Langone Medical Center, New York, NY, United States
- Arthur Wingfield
- Department of Psychology, Brandeis University, Waltham, MA, United States
2
Mepham A, Bi Y, Mattys SL. The time-course of linguistic interference during native and non-native speech-in-speech listening. J Acoust Soc Am 2022; 152:954. [PMID: 36050191 DOI: 10.1121/10.0013417]
Abstract
Recognizing speech in a noisy background is harder when the background is time-forward than for time-reversed speech, a masker direction effect, and harder when the masker is in a known rather than an unknown language, indicating linguistic interference. We examined the masker direction effect when the masker was a known vs unknown language and calculated performance over 50 trials to assess differential masker adaptation. In experiment 1, native English listeners transcribing English sentences showed a larger masker direction effect with English than Mandarin maskers. In experiment 2, Mandarin non-native speakers of English transcribing Mandarin sentences showed a mirror pattern. Both experiments thus support the target-masker linguistic similarity hypothesis, where interference is maximal when target and masker languages are the same. In experiment 3, Mandarin non-native speakers of English transcribing English sentences showed comparable results for English and Mandarin maskers. Non-native listening is therefore consistent with the known-language interference hypothesis, where interference is maximal when the masker language is known to the listener, whether or not it matches the target language. A trial-by-trial analysis showed that the masker direction effect increased over time during native listening but not during non-native listening. The results indicate different target-to-masker streaming strategies during native and non-native speech-in-speech listening.
Affiliation(s)
- Alex Mepham
- Department of Psychology, University of York, Heslington, United Kingdom
- Yifei Bi
- College of Foreign Languages, University of Shanghai for Science and Technology, Shanghai, China
- Sven L Mattys
- Department of Psychology, University of York, Heslington, United Kingdom
3
Heffner CC, Myers EB, Gracco VL. Impaired perceptual phonetic plasticity in Parkinson's disease. J Acoust Soc Am 2022; 152:511. [PMID: 35931533 PMCID: PMC9299957 DOI: 10.1121/10.0012884]
Abstract
Parkinson's disease (PD) is a neurodegenerative condition primarily associated with its motor consequences. Although much of the work within the speech domain has focused on PD's consequences for production, people with PD have been shown to differ in the perception of emotional prosody, loudness, and speech rate from age-matched controls. The current study targeted the effect of PD on perceptual phonetic plasticity, defined as the ability to learn and adjust to novel phonetic input, both in second language and native language contexts. People with PD were compared to age-matched controls (and, for three of the studies, a younger control population) in tasks of explicit non-native speech learning and adaptation to variation in native speech (compressed rate, accent, and the use of timing information within a sentence to parse ambiguities). The participants with PD showed significantly worse performance on the task of compressed rate and used the duration of an ambiguous fricative to segment speech to a lesser degree than age-matched controls, indicating impaired speech perceptual abilities. Exploratory comparisons also showed people with PD who were on medication performed significantly worse than their peers off medication on those two tasks and the task of explicit non-native learning.
Affiliation(s)
- Christopher C Heffner
- Department of Speech, Language, and Hearing Sciences, University of Connecticut, Storrs, Connecticut 06269, USA
- Emily B Myers
- Department of Speech, Language, and Hearing Sciences, University of Connecticut, Storrs, Connecticut 06269, USA
4
Dillon MT, O'Connell BP, Canfarotta MW, Buss E, Hopfinger J. Effect of Place-Based Versus Default Mapping Procedures on Masked Speech Recognition: Simulations of Cochlear Implant Alone and Electric-Acoustic Stimulation. Am J Audiol 2022; 31:322-337. [PMID: 35394798 DOI: 10.1044/2022_aja-21-00123]
Abstract
PURPOSE Cochlear implant (CI) recipients demonstrate variable speech recognition when listening with a CI-alone or electric-acoustic stimulation (EAS) device, which may be due in part to electric frequency-to-place mismatches created by the default mapping procedures. Performance may be improved if the filter frequencies are aligned with the cochlear place frequencies, known as place-based mapping. Performance with default maps versus an experimental place-based map was compared for participants with normal hearing when listening to CI-alone or EAS simulations to observe potential outcomes prior to initiating an investigation with CI recipients.
METHOD A noise vocoder simulated CI-alone and EAS devices, mapped with default or place-based procedures. The simulations were based on an actual 24-mm electrode array recipient, whose insertion angles for each electrode contact were used to estimate the respective cochlear place frequency. The default maps used the filter frequencies assigned by the clinical software. The filter frequencies for the place-based maps aligned with the cochlear place frequencies for individual contacts in the low- to mid-frequency cochlear region. For the EAS simulations, low-frequency acoustic information was filtered to simulate aided low-frequency audibility. Performance was evaluated for the AzBio sentences presented in a 10-talker masker at +5 dB signal-to-noise ratio (SNR), +10 dB SNR, and asymptote.
RESULTS Performance was better with the place-based maps as compared with the default maps for both CI-alone and EAS simulations. For instance, median performance at +10 dB SNR for the CI-alone simulation was 57% correct for the place-based map and 20% for the default map. For the EAS simulation, those values were 59% and 37% correct. Adding acoustic low-frequency information resulted in a similar benefit for both maps.
CONCLUSIONS Reducing frequency-to-place mismatches, such as with the experimental place-based mapping procedure, produces a greater benefit in speech recognition than maximizing bandwidth for CI-alone and EAS simulations. Ongoing work is evaluating the initial and long-term performance benefits in CI-alone and EAS users.
SUPPLEMENTAL MATERIAL https://doi.org/10.23641/asha.19529053.
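The cochlear place frequencies referred to above are typically estimated with a frequency-place map such as the Greenwood (1990) function. The sketch below illustrates that mapping; the linear conversion from insertion angle to relative cochlear place is a simplifying assumption for illustration, not the procedure used in this study.

```python
# Illustrative frequency-place sketch (not this study's code). The Greenwood
# (1990) constants for the human cochlea are standard; the linear angle-to-place
# conversion below is a simplifying assumption.
import numpy as np

def greenwood_frequency(x_from_apex, A=165.4, a=2.1, k=0.88):
    """Greenwood map: x_from_apex is relative distance along the cochlea
    (0.0 = apex, 1.0 = base); returns characteristic frequency in Hz."""
    return A * (10.0 ** (a * x_from_apex) - k)

def place_frequency_from_angle(angle_deg, full_angle_deg=720.0):
    """Hypothetical linear conversion: deeper insertion (larger angle) sits
    closer to the apex and therefore at a lower place frequency."""
    rel_from_base = np.clip(angle_deg / full_angle_deg, 0.0, 1.0)
    return greenwood_frequency(1.0 - rel_from_base)

print(int(place_frequency_from_angle(45.0)))    # shallow (basal) contact: high place frequency
print(int(place_frequency_from_angle(400.0)))   # deeper contact: lower place frequency
```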
Affiliation(s)
- Margaret T. Dillon
- Department of Otolaryngology/Head & Neck Surgery, University of North Carolina at Chapel Hill
- Division of Speech and Hearing Sciences, Department of Allied Health Sciences, University of North Carolina at Chapel Hill
- Brendan P. O'Connell
- Department of Otolaryngology/Head & Neck Surgery, University of North Carolina at Chapel Hill
- Michael W. Canfarotta
- Department of Otolaryngology/Head & Neck Surgery, University of North Carolina at Chapel Hill
- Emily Buss
- Department of Otolaryngology/Head & Neck Surgery, University of North Carolina at Chapel Hill
- Joseph Hopfinger
- Department of Psychology and Neuroscience, University of North Carolina at Chapel Hill
5
Age-related differences in the neural network interactions underlying the predictability gain. Cortex 2022; 154:269-286. [DOI: 10.1016/j.cortex.2022.05.020]
6
Cooke M, Scharenborg O, Meyer BT. The time course of adaptation to distorted speech. J Acoust Soc Am 2022; 151:2636. [PMID: 35461479 DOI: 10.1121/10.0010235]
Abstract
When confronted with unfamiliar or novel forms of speech, listeners' word recognition performance is known to improve with exposure, but data are lacking on the fine-grained time course of adaptation. The current study aims to fill this gap by investigating the time course of adaptation to several different types of distorted speech. Keyword scores as a function of sentence position in a block of 30 sentences were measured in response to eight forms of distorted speech. Listeners recognised twice as many words in the final sentence compared to the initial sentence with around half of the gain appearing in the first three sentences, followed by gradual gains over the rest of the block. Rapid adaptation was apparent for most of the eight distortion types tested with differences mainly in the gradual phase. Adaptation to sine-wave speech improved if listeners had heard other types of distortion prior to exposure, but no similar facilitation occurred for the other types of distortion. Rapid adaptation is unlikely to be due to procedural learning since listeners had been familiarised with the task and sentence format through exposure to undistorted speech. The mechanisms that underlie rapid adaptation are currently unclear.
Affiliation(s)
- Martin Cooke
- Ikerbasque (Basque Science Foundation), Bilbao, Spain
- Bernd T Meyer
- Communication Acoustics and Cluster of Excellence Hearing4all, Carl von Ossietzky University, Oldenburg, Germany
7
Heffner CC, Fuhrmeister P, Luthra S, Mechtenberg H, Saltzman D, Myers EB. Reliability and validity for perceptual flexibility in speech. Brain Lang 2022; 226:105070. [PMID: 35026449 DOI: 10.1016/j.bandl.2021.105070]
Abstract
The study of perceptual flexibility in speech depends on a variety of tasks that feature a large degree of variability between participants. Of critical interest is whether measures are consistent within an individual or across stimulus contexts. This is particularly key for individual difference designs that are deployed to examine the neural basis or clinical consequences of perceptual flexibility. In the present set of experiments, we assess the split-half reliability and construct validity of five measures of perceptual flexibility: three of learning in a native language context (e.g., understanding someone with a foreign accent) and two of learning in a non-native context (e.g., learning to categorize non-native speech sounds). We find that most of these tasks show an appreciable level of split-half reliability, although construct validity was sometimes weak. This provides good evidence for reliability for these tasks, while highlighting possible upper limits on expected effect sizes involving each measure.
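Split-half reliability of the kind reported here is commonly estimated by correlating per-participant scores from two halves of a task and applying the Spearman-Brown correction. A minimal sketch, assuming an odd/even trial split and simulated data:

```python
# Split-half reliability with the Spearman-Brown correction (illustrative;
# the odd/even split and simulated data are assumptions).
import numpy as np

def split_half_reliability(trial_scores):
    """trial_scores: participants x trials matrix of 0/1 accuracies."""
    odd = trial_scores[:, 1::2].mean(axis=1)     # per-participant score, odd trials
    even = trial_scores[:, 0::2].mean(axis=1)    # per-participant score, even trials
    r = np.corrcoef(odd, even)[0, 1]             # half-test correlation across people
    return 2 * r / (1 + r)                       # Spearman-Brown corrected estimate

rng = np.random.default_rng(1)
ability = rng.normal(0.7, 0.1, size=100)                            # 100 simulated participants
scores = (rng.random((100, 60)) < ability[:, None]).astype(float)   # 60 simulated trials each
print(round(split_half_reliability(scores), 2))
```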
Affiliation(s)
- Christopher C Heffner
- Department of Speech, Language, and Hearing Sciences, University of Connecticut, Storrs, CT 06269, United States; Institute for the Brain and Cognitive Sciences, University of Connecticut, Storrs, CT 06269, United States; Department of Communicative Disorders and Sciences, University at Buffalo, Buffalo, NY 14214, United States
- Pamela Fuhrmeister
- Department of Speech, Language, and Hearing Sciences, University of Connecticut, Storrs, CT 06269, United States; Department of Linguistics, University of Potsdam, 11476 Potsdam, Germany
- Sahil Luthra
- Institute for the Brain and Cognitive Sciences, University of Connecticut, Storrs, CT 06269, United States; Department of Psychological Sciences, University of Connecticut, Storrs, CT 06269, United States
- Hannah Mechtenberg
- Department of Psychological Sciences, University of Connecticut, Storrs, CT 06269, United States
- David Saltzman
- Department of Psychological Sciences, University of Connecticut, Storrs, CT 06269, United States
- Emily B Myers
- Department of Speech, Language, and Hearing Sciences, University of Connecticut, Storrs, CT 06269, United States; Institute for the Brain and Cognitive Sciences, University of Connecticut, Storrs, CT 06269, United States; Department of Psychological Sciences, University of Connecticut, Storrs, CT 06269, United States
8
Lin Y, Tsao Y, Hsieh PJ. Neural correlates of individual differences in predicting ambiguous sounds comprehension level. Neuroimage 2022; 251:119012. [DOI: 10.1016/j.neuroimage.2022.119012]
9
Heffner CC, Myers EB. Individual Differences in Phonetic Plasticity Across Native and Nonnative Contexts. J Speech Lang Hear Res 2021; 64:3720-3733. [PMID: 34525309 DOI: 10.1044/2021_jslhr-21-00004]
Abstract
Purpose Individuals vary in their ability to learn the sound categories of nonnative languages (nonnative phonetic learning) and to adapt to systematic differences, such as accent or talker differences, in the sounds of their native language (native phonetic learning). Difficulties with both native and nonnative learning are well attested in people with speech and language disorders relative to healthy controls, but substantial variability in these skills is also present in the typical population. This study examines whether this individual variability can be organized around a common ability that we label "phonetic plasticity."
Method A group of healthy young adult participants (N = 80), who attested they had no history of speech, language, neurological, or hearing deficits, completed two tasks of nonnative phonetic category learning, two tasks of learning to cope with variation in their native language, and seven tasks of other cognitive functions, distributed across two sessions. Performance on these 11 tasks was compared, and exploratory factor analysis was used to assess the extent to which performance on each task was related to the others.
Results Performance on both tasks of native learning and an explicit task of nonnative learning patterned together, suggesting that native and nonnative phonetic learning tasks rely on a shared underlying capacity, which is termed "phonetic plasticity." Phonetic plasticity was also associated with vocabulary, comprehension of words in background noise, and, more weakly, working memory.
Conclusions Nonnative sound learning and native language speech perception may rely on shared phonetic plasticity. The results suggest that good learners of native language phonetic variation are also good learners of nonnative phonetic contrasts.
Supplemental Material https://doi.org/10.23641/asha.16606778.
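The exploratory factor analysis step can be illustrated with a minimal sketch; the two-factor solution, varimax rotation, and simulated task scores below are assumptions for illustration, not the study's analysis settings.

```python
# Exploratory factor analysis sketch (illustrative; factor count, rotation,
# and the simulated task battery are assumptions, not the study's analysis).
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)
n = 80                                     # participants, as in the abstract
plasticity = rng.normal(size=n)            # hypothetical shared "phonetic plasticity"
general = rng.normal(size=n)               # hypothetical general cognitive factor
tasks = np.column_stack([
    0.8 * plasticity + 0.6 * rng.normal(size=n),   # native learning task 1
    0.7 * plasticity + 0.7 * rng.normal(size=n),   # native learning task 2
    0.6 * plasticity + 0.8 * rng.normal(size=n),   # explicit non-native learning
    0.8 * general + 0.6 * rng.normal(size=n),      # working memory
    0.7 * general + 0.7 * rng.normal(size=n),      # vocabulary
])

fa = FactorAnalysis(n_components=2, rotation="varimax").fit(tasks)
print(np.round(fa.components_, 2))         # loadings: rows = factors, columns = tasks
```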
Affiliation(s)
- Christopher C Heffner
- Department of Speech, Language, and Hearing Sciences, University of Connecticut, Storrs
- Institute for Brain and Cognitive Sciences, University of Connecticut, Storrs
- Department of Communicative Disorders and Sciences, University at Buffalo, NY
- Center for Cognitive Science, University at Buffalo, NY
- Emily B Myers
- Department of Speech, Language, and Hearing Sciences, University of Connecticut, Storrs
- Institute for Brain and Cognitive Sciences, University of Connecticut, Storrs
- Department of Psychological Sciences, University of Connecticut, Storrs
10
Individual Variability in Recalibrating to Spectrally Shifted Speech: Implications for Cochlear Implants. Ear Hear 2021; 42:1412-1427. [PMID: 33795617 DOI: 10.1097/aud.0000000000001043]
Abstract
OBJECTIVES Cochlear implant (CI) recipients are at a severe disadvantage compared with normal-hearing listeners in distinguishing consonants that differ by place of articulation because the key relevant spectral differences are degraded by the implant. One component of that degradation is the upward shifting of spectral energy that occurs with a shallow insertion depth of a CI. The present study aimed to systematically measure the effects of spectral shifting on word recognition and phoneme categorization by specifically controlling the amount of shifting and using stimuli whose identification specifically depends on perceiving frequency cues. We hypothesized that listeners would be biased toward perceiving phonemes that contain higher-frequency components because of the upward frequency shift and that intelligibility would decrease as spectral shifting increased.
DESIGN Normal-hearing listeners (n = 15) heard sine wave-vocoded speech with simulated upward frequency shifts of 0, 2, 4, and 6 mm of cochlear space to simulate shallow CI insertion depth. Stimuli included monosyllabic words and /b/-/d/ and /ʃ/-/s/ continua that varied systematically by formant frequency transitions or frication noise spectral peaks, respectively. Recalibration to spectral shifting was operationally defined as shifting perceptual acoustic-phonetic mapping commensurate with the spectral shift. In other words, adjusting frequency expectations for both phonemes upward so that there is still a perceptual distinction, rather than hearing all upward-shifted phonemes as the higher-frequency member of the pair.
RESULTS For moderate amounts of spectral shifting, group data suggested a general "halfway" recalibration to spectral shifting, but individual data suggested a notably different conclusion: half of the listeners were able to recalibrate fully, while the other half of the listeners were utterly unable to categorize shifted speech with any reliability. There were no participants who demonstrated a pattern intermediate to these two extremes. Intelligibility of words decreased with greater amounts of spectral shifting, also showing loose clusters of better- and poorer-performing listeners. Phonetic analysis of word errors revealed certain cues were more susceptible to being compromised due to a frequency shift (place and manner of articulation), while voicing was robust to spectral shifting.
CONCLUSIONS Shifting the frequency spectrum of speech has systematic effects that are in line with known properties of speech acoustics, but the ensuing difficulties cannot be predicted based on tonotopic mismatch alone. Difficulties are subject to substantial individual differences in the capacity to adjust acoustic-phonetic mapping. These results help to explain why speech recognition in CI listeners cannot be fully predicted by peripheral factors like electrode placement and spectral resolution; even among listeners with functionally equivalent auditory input, there is an additional factor of simply being able or unable to flexibly adjust acoustic-phonetic mapping. This individual variability could motivate precise treatment approaches guided by an individual's relative reliance on wideband frequency representation (even if it is mismatched) or limited frequency coverage whose tonotopy is preserved.
11
Jiang J, Benhamou E, Waters S, Johnson JCS, Volkmer A, Weil RS, Marshall CR, Warren JD, Hardy CJD. Processing of Degraded Speech in Brain Disorders. Brain Sci 2021; 11:394. [PMID: 33804653 PMCID: PMC8003678 DOI: 10.3390/brainsci11030394]
Abstract
The speech we hear every day is typically "degraded" by competing sounds and the idiosyncratic vocal characteristics of individual speakers. While the comprehension of "degraded" speech is normally automatic, it depends on dynamic and adaptive processing across distributed neural networks. This presents the brain with an immense computational challenge, making degraded speech processing vulnerable to a range of brain disorders. Therefore, it is likely to be a sensitive marker of neural circuit dysfunction and an index of retained neural plasticity. Considering experimental methods for studying degraded speech and factors that affect its processing in healthy individuals, we review the evidence for altered degraded speech processing in major neurodegenerative diseases, traumatic brain injury and stroke. We develop a predictive coding framework for understanding deficits of degraded speech processing in these disorders, focussing on the "language-led dementias"-the primary progressive aphasias. We conclude by considering prospects for using degraded speech as a probe of language network pathophysiology, a diagnostic tool and a target for therapeutic intervention.
Affiliation(s)
- Jessica Jiang
- Dementia Research Centre, Department of Neurodegenerative Disease, UCL Queen Square Institute of Neurology, London WC1N 3BG, UK
- Elia Benhamou
- Dementia Research Centre, Department of Neurodegenerative Disease, UCL Queen Square Institute of Neurology, London WC1N 3BG, UK
- Sheena Waters
- Preventive Neurology Unit, Wolfson Institute of Preventive Medicine, Queen Mary University of London, London EC1M 6BQ, UK
- Jeremy C. S. Johnson
- Dementia Research Centre, Department of Neurodegenerative Disease, UCL Queen Square Institute of Neurology, London WC1N 3BG, UK
- Anna Volkmer
- Division of Psychology and Language Sciences, University College London, London WC1H 0AP, UK
- Rimona S. Weil
- Dementia Research Centre, Department of Neurodegenerative Disease, UCL Queen Square Institute of Neurology, London WC1N 3BG, UK
- Charles R. Marshall
- Dementia Research Centre, Department of Neurodegenerative Disease, UCL Queen Square Institute of Neurology, London WC1N 3BG, UK
- Preventive Neurology Unit, Wolfson Institute of Preventive Medicine, Queen Mary University of London, London EC1M 6BQ, UK
- Jason D. Warren
- Dementia Research Centre, Department of Neurodegenerative Disease, UCL Queen Square Institute of Neurology, London WC1N 3BG, UK
- Chris J. D. Hardy
- Dementia Research Centre, Department of Neurodegenerative Disease, UCL Queen Square Institute of Neurology, London WC1N 3BG, UK
12
A multisensory perspective onto primate pulvinar functions. Neurosci Biobehav Rev 2021; 125:231-243. [PMID: 33662442 DOI: 10.1016/j.neubiorev.2021.02.043]
Abstract
Perception in ambiguous environments relies on the combination of sensory information from various sources. Most associative and primary sensory cortical areas are involved in this multisensory active integration process. As a result, the entire cortex appears as heavily multisensory. In this review, we focus on the contribution of the pulvinar to multisensory integration. This subcortical thalamic nucleus plays a central role in visual detection and selection at a fast time scale, as well as in the regulation of visual processes, at a much slower time scale. However, the pulvinar is also densely connected to cortical areas involved in multisensory integration. In spite of this, little is known about its multisensory properties and its contribution to multisensory perception. Here, we review the anatomical and functional organization of multisensory input to the pulvinar. We describe how visual, auditory, somatosensory, pain, proprioceptive and olfactory projections are differentially organized across the main subdivisions of the pulvinar and we show that topography is central to the organization of this complex nucleus. We propose that the pulvinar combines multiple sources of sensory information to enhance fast responses to the environment, while also playing the role of a general regulation hub for adaptive and flexible cognition.
13
Koyama MS, Molfese PJ, Milham MP, Mencl WE, Pugh KR. Thalamus is a common locus of reading, arithmetic, and IQ: Analysis of local intrinsic functional properties. Brain Lang 2020; 209:104835. [PMID: 32738503 PMCID: PMC8087146 DOI: 10.1016/j.bandl.2020.104835]
Abstract
Neuroimaging studies of basic achievement skills - reading and arithmetic - often control for the effect of IQ to identify unique neural correlates of each skill. This may underestimate possible effects of common factors between achievement and IQ measures on neuroimaging results. Here, we simultaneously examined achievement (reading and arithmetic) and IQ measures in young adults, aiming to identify MRI correlates of their common factors. Resting-state fMRI (rs-fMRI) data were analyzed using two metrics assessing local intrinsic functional properties: regional homogeneity (ReHo) and fractional amplitude of low-frequency fluctuation (fALFF), measuring local intrinsic functional connectivity and intrinsic functional activity, respectively. ReHo highlighted the thalamus/pulvinar (a subcortical region implicated in selective attention) as a common locus for both achievement skills and IQ. More specifically, the higher the ReHo values, the lower the achievement and IQ scores. For fALFF, the left superior parietal lobule, part of the dorsal attention network, was positively associated with reading and IQ. Collectively, our results highlight attention-related regions, particularly the thalamus/pulvinar, as a key region related to individual differences in performance on all three measures. ReHo in the thalamus/pulvinar may serve as a tool to examine brain mechanisms underlying a comorbidity of reading and arithmetic difficulties, which could co-occur with weakness in general intellectual abilities.
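For reference, ReHo is Kendall's coefficient of concordance computed over a voxel's neighbourhood, and fALFF is the fraction of a voxel's BOLD amplitude spectrum falling in the low-frequency band. A minimal fALFF sketch, assuming the conventional 0.01-0.1 Hz band and an illustrative TR:

```python
# fALFF sketch: fraction of a BOLD time series' amplitude spectrum that falls in
# the low-frequency band (0.01-0.1 Hz is conventional; the TR is an assumed value).
import numpy as np

def falff(timeseries, tr, low=0.01, high=0.1):
    ts = timeseries - timeseries.mean()
    amp = np.abs(np.fft.rfft(ts))               # single-sided amplitude spectrum
    freqs = np.fft.rfftfreq(len(ts), d=tr)
    band = (freqs >= low) & (freqs <= high)
    return amp[band].sum() / amp[1:].sum()      # exclude the DC bin from the total

rng = np.random.default_rng(0)
t = np.arange(200) * 2.0                         # 200 volumes at an assumed TR of 2 s
bold = np.sin(2 * np.pi * 0.05 * t) + 0.5 * rng.standard_normal(200)
print(round(falff(bold, tr=2.0), 2))             # closer to 1 = more low-frequency power
```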
Affiliation(s)
- Maki S Koyama
- Haskins Laboratories, New Haven, CT, USA; Center for the Developing Brain, Child Mind Institute, New York, NY, USA
- Peter J Molfese
- Haskins Laboratories, New Haven, CT, USA; Section on Functional Imaging Methods, Laboratory of Brain and Cognition, Department of Health and Human Services, National Institute of Mental Health, National Institutes of Health, Bethesda, MD, USA
- Michael P Milham
- Center for the Developing Brain, Child Mind Institute, New York, NY, USA; Center for Biomedical Imaging and Neuromodulation, Nathan Kline Institute, Orangeburg, NY, USA
- Kenneth R Pugh
- Haskins Laboratories, New Haven, CT, USA; Yale University School of Medicine, Department of Diagnostic Radiology, New Haven, CT, USA; University of Connecticut, Department of Psychology, Storrs, CT, USA
14
Rysop AU, Schmitt LM, Obleser J, Hartwigsen G. Neural modelling of the semantic predictability gain under challenging listening conditions. Hum Brain Mapp 2020; 42:110-127. [PMID: 32959939 PMCID: PMC7721236 DOI: 10.1002/hbm.25208]
Abstract
When speech intelligibility is reduced, listeners exploit constraints posed by semantic context to facilitate comprehension. The left angular gyrus (AG) has been argued to drive this semantic predictability gain. Taking a network perspective, we ask how the connectivity within language-specific and domain-general networks flexibly adapts to the predictability and intelligibility of speech. During continuous functional magnetic resonance imaging (fMRI), participants repeated sentences, which varied in semantic predictability of the final word and in acoustic intelligibility. At the neural level, highly predictable sentences led to stronger activation of left-hemispheric semantic regions including subregions of the AG (PGa, PGp) and posterior middle temporal gyrus when speech became more intelligible. The behavioural predictability gain of single participants mapped onto the same regions but was complemented by increased activity in frontal and medial regions. Effective connectivity from PGa to PGp increased for more intelligible sentences. In contrast, inhibitory influence from pre-supplementary motor area to left insula was strongest when predictability and intelligibility of sentences were either lowest or highest. This interactive effect was negatively correlated with the behavioural predictability gain. Together, these results suggest that successful comprehension in noisy listening conditions relies on an interplay of semantic regions and concurrent inhibition of cognitive control regions when semantic cues are available.
Affiliation(s)
- Anna Uta Rysop
- Lise Meitner Research Group Cognition and Plasticity, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- Lea-Maria Schmitt
- Department of Psychology, University of Lübeck, Lübeck, Germany; Center of Brain, Behavior and Metabolism (CBBM), University of Lübeck, Lübeck, Germany
- Jonas Obleser
- Department of Psychology, University of Lübeck, Lübeck, Germany; Center of Brain, Behavior and Metabolism (CBBM), University of Lübeck, Lübeck, Germany
- Gesa Hartwigsen
- Lise Meitner Research Group Cognition and Plasticity, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
|
Wöstmann M, Lui TKY, Friese KH, Kreitewolf J, Naujokat M, Obleser J. The vulnerability of working memory to distraction is rhythmic. Neuropsychologia 2020; 146:107505. [PMID: 32485200 DOI: 10.1016/j.neuropsychologia.2020.107505] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/26/2020] [Revised: 05/08/2020] [Accepted: 05/26/2020] [Indexed: 12/29/2022]
Abstract
Recent research posits that the cognitive system samples target stimuli in a rhythmic fashion, characterized by target detection fluctuating at frequencies of ~3-8 Hz. Besides prioritized encoding of targets, a key cognitive function is the protection of working memory from distractor intrusion. Here, we test to which degree the vulnerability of working memory to distraction is rhythmic. In an Irrelevant-Speech Task, N = 23 human participants had to retain the serial order of nine numbers in working memory while being distracted by task-irrelevant speech with variable temporal onsets. The magnitude of the distractor-evoked N1 component in the event-related potential as well as behavioural recall accuracy, both measures of memory distraction, were periodically modulated by distractor onset time in approximately 2-4 cycles per second (Hz). Critically, an underlying 2.5-Hz rhythm explained variation in both measures of distraction such that stronger phasic distractor encoding mediated lower phasic memory recall accuracy. In a behavioural follow-up experiment, we tested whether these results would replicate in a task design without rhythmic presentation of target items. Participants (N = 6 with on average >2500 trials, each) retained two line-figures in memory while being distracted by acoustic noise of varying onset across trials. In agreement with the main experiment, the temporal onset of the distractor periodically modulated memory performance. These results suggest that during working memory retention, the human cognitive system implements distractor suppression in a temporally dynamic fashion, reflected in ~400-ms long cycles of high versus low distractibility.
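A rhythmic modulation of this kind can be quantified by fitting a fixed-frequency sinusoid to performance as a function of distractor onset time. A minimal least-squares sketch, taking the 2.5-Hz frequency from the abstract and using simulated data:

```python
# Fit a fixed 2.5-Hz sinusoid to recall accuracy as a function of distractor
# onset time (illustrative sketch; the data are simulated).
import numpy as np

f = 2.5                                            # cycles per second, from the abstract
rng = np.random.default_rng(0)
onsets = np.linspace(0.0, 2.0, 40)                 # assumed distractor onset times (s)
acc = (0.75 + 0.05 * np.cos(2 * np.pi * f * onsets + 1.0)
       + 0.02 * rng.standard_normal(onsets.size))  # simulated recall accuracy

# Linear model: acc ~ a + b*cos(2*pi*f*t) + c*sin(2*pi*f*t)
X = np.column_stack([np.ones_like(onsets),
                     np.cos(2 * np.pi * f * onsets),
                     np.sin(2 * np.pi * f * onsets)])
a, b, c = np.linalg.lstsq(X, acc, rcond=None)[0]
print(round(np.hypot(b, c), 3))                    # modulation depth at 2.5 Hz
print(round(np.arctan2(-c, b), 2))                 # phase of peak accuracy (rad)
```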
Affiliation(s)
- Malte Wöstmann
- Department of Psychology, University of Lübeck, Lübeck, Germany
- Jens Kreitewolf
- Department of Psychology, University of Lübeck, Lübeck, Germany
- Malte Naujokat
- Department of Psychology, University of Lübeck, Lübeck, Germany
- Jonas Obleser
- Department of Psychology, University of Lübeck, Lübeck, Germany
16
Paulus M, Hazan V, Adank P. The relationship between talker acoustics, intelligibility, and effort in degraded listening conditions. J Acoust Soc Am 2020; 147:3348. [PMID: 32486777 DOI: 10.1121/10.0001212]
Abstract
Listening to degraded speech is associated with decreased intelligibility and increased effort. However, listeners are generally able to adapt to certain types of degradations. While intelligibility of degraded speech is modulated by talker acoustics, it is unclear whether talker acoustics also affect effort and adaptation. Moreover, it has been demonstrated that talker differences are preserved across spectral degradations, but it is not known whether this effect extends to temporal degradations and which acoustic-phonetic characteristics are responsible. In a listening experiment combined with pupillometry, participants were presented with speech in quiet as well as in masking noise, time-compressed, and noise-vocoded speech by 16 Southern British English speakers. Results showed that intelligibility, but not adaptation, was modulated by talker acoustics. Talkers who were more intelligible under noise-vocoding were also more intelligible under masking and time-compression. This effect was linked to acoustic-phonetic profiles with greater vowel space dispersion (VSD) and energy in mid-range frequencies, as well as slower speaking rate. While pupil dilation indicated increasing effort with decreasing intelligibility, this study also linked reduced effort in quiet to talkers with greater VSD. The results emphasize the relevance of talker acoustics for intelligibility and effort in degraded listening conditions.
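Vowel space dispersion (VSD) is commonly computed as the mean Euclidean distance of a talker's vowel tokens from the centroid of their F1-F2 space. A minimal sketch with made-up formant values:

```python
# Vowel space dispersion (VSD): mean Euclidean distance of a talker's vowel
# tokens from the F1-F2 centroid (formant values below are made up).
import numpy as np

def vowel_space_dispersion(formants):
    """formants: array of shape (n_tokens, 2) holding (F1, F2) in Hz."""
    centroid = formants.mean(axis=0)
    return np.linalg.norm(formants - centroid, axis=1).mean()

talker = np.array([[300.0, 2300.0],   # /i/-like token
                   [700.0, 1200.0],   # /a/-like token
                   [350.0,  800.0],   # /u/-like token
                   [500.0, 1700.0]])  # /e/-like token
print(round(vowel_space_dispersion(talker), 1))   # larger = more dispersed vowel space
```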
Affiliation(s)
- Maximillian Paulus
- Speech, Hearing and Phonetic Sciences, University College London, London, United Kingdom
- Valerie Hazan
- Speech, Hearing and Phonetic Sciences, University College London, London, United Kingdom
- Patti Adank
- Speech, Hearing and Phonetic Sciences, University College London, London, United Kingdom
17
Kennedy-Higgins D, Devlin JT, Adank P. Cognitive mechanisms underpinning successful perception of different speech distortions. J Acoust Soc Am 2020; 147:2728. [PMID: 32359293 DOI: 10.1121/10.0001160]
Abstract
Few studies thus far have investigated whether perception of distorted speech is consistent across different types of distortion. This study investigated whether participants show a consistent perceptual profile across three speech distortions: time-compressed, noise-vocoded, and speech in noise. Additionally, this study investigated whether/how individual differences in performance on a battery of audiological and cognitive tasks links to perception. Eighty-eight participants completed a speeded sentence-verification task with increases in accuracy and reductions in response times used to indicate performance. Audiological and cognitive task measures include pure tone audiometry, speech recognition threshold, working memory, vocabulary knowledge, attention switching, and pattern analysis. Despite previous studies suggesting that temporal and spectral/environmental perception require different lexical or phonological mechanisms, this study shows significant positive correlations in accuracy and response time performance across all distortions. Results of a principal component analysis and multiple linear regressions suggest that a component based on vocabulary knowledge and working memory predicted performance in the speech in quiet, time-compressed and speech in noise conditions. These results suggest that listeners employ a similar cognitive strategy to perceive different temporal and spectral/environmental speech distortions and that this mechanism is supported by vocabulary knowledge and working memory.
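The analysis pipeline described here, principal component analysis over the task battery followed by multiple linear regression onto speech performance, can be sketched as follows; the simulated data and the two-component choice are assumptions for illustration.

```python
# PCA over a cognitive/audiological battery followed by multiple linear
# regression (illustrative sketch; data and component count are assumptions).
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n = 88                                              # participants, as in the abstract
battery = rng.standard_normal((n, 6))               # six simulated task measures
speech = (0.6 * battery[:, 3] + 0.4 * battery[:, 2]
          + 0.5 * rng.standard_normal(n))           # simulated speech performance

components = PCA(n_components=2).fit_transform(battery)    # reduce battery to 2 components
model = LinearRegression().fit(components, speech)
print(round(model.score(components, speech), 2))            # R^2 of the component model
```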
Affiliation(s)
- Dan Kennedy-Higgins
- Department of Speech, Hearing and Phonetic Sciences, University College London, Chandler House, 2 Wakefield Street, London, WC1N 1PF, United Kingdom
- Joseph T Devlin
- Department of Experimental Psychology, University College London, 26 Bedford Way, London, WC1H 0AP, United Kingdom
- Patti Adank
- Department of Speech, Hearing and Phonetic Sciences, University College London, Chandler House, 2 Wakefield Street, London, WC1N 1PF, United Kingdom
18
Casaponsa A, Sohoglu E, Moore DR, Füllgrabe C, Molloy K, Amitay S. Does training with amplitude modulated tones affect tone-vocoded speech perception? PLoS One 2019; 14:e0226288. [PMID: 31881550 PMCID: PMC6934405 DOI: 10.1371/journal.pone.0226288]
Abstract
Temporal-envelope cues are essential for successful speech perception. We asked here whether training on stimuli containing temporal-envelope cues without speech content can improve the perception of spectrally-degraded (vocoded) speech in which the temporal-envelope (but not the temporal fine structure) is mainly preserved. Two groups of listeners were trained on different amplitude-modulation (AM) based tasks, either AM detection or AM-rate discrimination (21 blocks of 60 trials during two days, 1260 trials; frequency range: 4 Hz, 8 Hz, and 16 Hz), while an additional control group did not undertake any training. Consonant identification in vocoded vowel-consonant-vowel stimuli was tested before and after training on the AM tasks (or at an equivalent time interval for the control group). Following training, only the trained groups showed a significant improvement in the perception of vocoded speech, but the improvement did not significantly differ from that observed for controls. Thus, we do not find convincing evidence that this amount of training with temporal-envelope cues without speech content provides significant benefit for vocoded speech intelligibility. Alternative training regimens using vocoded speech along the linguistic hierarchy should be explored.
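The training stimuli described above, temporal-envelope cues without speech content, amount to sinusoidally amplitude-modulated noise. A minimal generation sketch, with assumed duration, modulation depth, and sampling rate:

```python
# Sinusoidally amplitude-modulated noise: a temporal-envelope cue with no speech
# content. Duration, modulation depth, and sampling rate are assumed values.
import numpy as np

def am_noise(rate_hz, depth=1.0, dur_s=1.0, fs=44100, seed=0):
    """Broadband noise whose amplitude is modulated sinusoidally at rate_hz."""
    t = np.arange(int(dur_s * fs)) / fs
    carrier = np.random.default_rng(seed).standard_normal(t.size)
    envelope = 1.0 + depth * np.sin(2 * np.pi * rate_hz * t)   # sinusoidal envelope
    stim = envelope * carrier
    return stim / np.max(np.abs(stim))

for rate in (4, 8, 16):                 # the modulation rates named in the abstract
    print(rate, am_noise(rate).shape)
```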
Affiliation(s)
- Aina Casaponsa
- Medical Research Council Institute of Hearing Research, Nottingham, England, United Kingdom
- Department of Linguistics and English Language, Lancaster University, Lancaster, England, United Kingdom
- Ediz Sohoglu
- Medical Research Council Institute of Hearing Research, Nottingham, England, United Kingdom
- David R. Moore
- Medical Research Council Institute of Hearing Research, Nottingham, England, United Kingdom
- Christian Füllgrabe
- Medical Research Council Institute of Hearing Research, Nottingham, England, United Kingdom
- Katharine Molloy
- Medical Research Council Institute of Hearing Research, Nottingham, England, United Kingdom
- Sygal Amitay
- Medical Research Council Institute of Hearing Research, Nottingham, England, United Kingdom
19
Working-memory disruption by task-irrelevant talkers depends on degree of talker familiarity. Atten Percept Psychophys 2019; 81:1108-1118. [PMID: 30993655 DOI: 10.3758/s13414-019-01727-2]
Abstract
When one is listening, familiarity with an attended talker's voice improves speech comprehension. Here, we instead investigated the effect of familiarity with a distracting talker. In an irrelevant-speech task, we assessed listeners' working memory for the serial order of spoken digits when a task-irrelevant, distracting sentence was produced by either a familiar or an unfamiliar talker (with rare omissions of the task-irrelevant sentence). We tested two groups of listeners using the same experimental procedure. The first group were undergraduate psychology students (N = 66) who had attended an introductory statistics course. Critically, each student had been taught by one of two course instructors, whose voices served as the familiar and unfamiliar task-irrelevant talkers. The second group of listeners were family members and friends (N = 20) who had known either one of the two talkers for more than 10 years. Students, but not family members and friends, made more errors when the task-irrelevant talker was familiar versus unfamiliar. Interestingly, the effect of talker familiarity was not modulated by the presence of task-irrelevant speech: Students experienced stronger working memory disruption by a familiar talker, irrespective of whether they heard a task-irrelevant sentence during memory retention or merely expected it. While previous work has shown that familiarity with an attended talker benefits speech comprehension, our findings indicate that familiarity with an ignored talker disrupts working memory for target speech. The absence of this effect in family members and friends suggests that the degree of familiarity modulates the memory disruption.
20
Temporal Sensitivity Measured Shortly After Cochlear Implantation Predicts 6-Month Speech Recognition Outcome. Ear Hear 2019; 40:27-33. [PMID: 29697465 DOI: 10.1097/aud.0000000000000588]
Abstract
OBJECTIVES Psychoacoustic tests assessed shortly after cochlear implantation are useful predictors of the rehabilitative speech outcome. While largely independent, both spectral and temporal resolution tests are important to provide an accurate prediction of speech recognition. However, rapid tests of temporal sensitivity are currently lacking. Here, we propose a simple amplitude modulation rate discrimination (AMRD) paradigm that is validated by predicting future speech recognition in adult cochlear implant (CI) patients.
DESIGN In 34 newly implanted patients, we used an adaptive AMRD paradigm, where broadband noise was modulated at the speech-relevant rate of ~4 Hz. In a longitudinal study, speech recognition in quiet was assessed using the closed-set Freiburger number test shortly after cochlear implantation (t0) as well as the open-set Freiburger monosyllabic word test 6 months later (t6).
RESULTS Both AMRD thresholds at t0 (r = -0.51) and speech recognition scores at t0 (r = 0.56) predicted speech recognition scores at t6. However, AMRD and speech recognition at t0 were uncorrelated, suggesting that those measures capture partially distinct perceptual abilities. A multiple regression model predicting 6-month speech recognition outcome with deafness duration and speech recognition at t0 improved from adjusted R² = 0.30 to adjusted R² = 0.44 when AMRD threshold was added as a predictor.
CONCLUSIONS These findings identify AMRD thresholds as a reliable, nonredundant predictor above and beyond established speech tests for CI outcome. This AMRD test could potentially be developed into a rapid clinical temporal-resolution test to be integrated into the postoperative test battery to improve the reliability of speech outcome prognosis.
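The reported gain in adjusted R² when the AMRD threshold is added to the regression can be illustrated with a simple model comparison; the simulated data below are assumptions, and only the analysis logic mirrors the abstract.

```python
# Compare the adjusted R^2 of a baseline regression with and without an added
# predictor (illustrative sketch with simulated data).
import numpy as np

def adjusted_r2(X, y):
    """Ordinary least squares; X holds predictors without an intercept column."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    r2 = 1.0 - resid.var() / y.var()
    n, p = X.shape
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)

rng = np.random.default_rng(0)
n = 34                                              # patients, as in the abstract
deaf_dur, speech_t0, amrd = rng.normal(size=(3, n))
outcome = 0.4 * speech_t0 - 0.3 * amrd + 0.2 * deaf_dur + rng.normal(size=n)

print(round(adjusted_r2(np.column_stack([deaf_dur, speech_t0]), outcome), 2))        # baseline model
print(round(adjusted_r2(np.column_stack([deaf_dur, speech_t0, amrd]), outcome), 2))  # with AMRD added
```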
21
Top-down, contextual entrainment of neuronal oscillations in the auditory thalamocortical circuit. Proc Natl Acad Sci U S A 2018; 115:E7605-E7614. [PMID: 30037997 PMCID: PMC6094129 DOI: 10.1073/pnas.1714684115]
Abstract
Our results indicate that nonhuman primates detect complex repeating acoustic sequences in a continuous auditory stream, which is an important precursor for human speech learning and perception. We demonstrate that oscillatory entrainment, known to support the attentive perception of rhythmic stimulus sequences, can occur for rhythms defined solely by stimulus context rather than physical boundaries. As opposed to acoustically driven entrainment by rhythmic tone sequences demonstrated previously, this form of entrainment relies on the brain’s ability to group auditory inputs based on their statistical regularities. The internally initiated, context-driven modulation of excitability in the medial pulvinar prior to A1 supports the notion of top-down entrainment. Prior studies have shown that repetitive presentation of acoustic stimuli results in an alignment of ongoing neuronal oscillations to the sequence rhythm via oscillatory entrainment by external cues. Our study aimed to explore the neural correlates of the perceptual parsing and grouping of complex repeating auditory patterns that occur based solely on statistical regularities, or context. Human psychophysical studies suggest that the recognition of novel auditory patterns amid a continuous auditory stimulus sequence occurs automatically halfway through the first repetition. We hypothesized that once repeating patterns were detected by the brain, internal rhythms would become entrained, demarcating the temporal structure of these repetitions despite lacking external cues defining pattern on- or offsets. To examine the neural correlates of pattern perception, neuroelectric activity of primary auditory cortex (A1) and thalamic nuclei was recorded while nonhuman primates passively listened to streams of rapidly presented pure tones and bandpass noise bursts. At arbitrary intervals, random acoustic patterns composed of 11 stimuli were repeated five times without any perturbance of the constant stimulus flow. We found significant delta entrainment by these patterns in the A1, medial geniculate body, and medial pulvinar. In A1 and pulvinar, we observed a statistically significant, pattern structure-aligned modulation of neuronal firing that occurred earliest in the pulvinar, supporting the idea that grouping and detecting complex auditory patterns is a top-down, context-driven process. Besides electrophysiological measures, a pattern-related modulation of pupil diameter verified that, like humans, nonhuman primates consciously detect complex repetitive patterns that lack physical boundaries.
22
Hakonen M, May PJC, Jääskeläinen IP, Jokinen E, Sams M, Tiitinen H. Predictive processing increases intelligibility of acoustically distorted speech: Behavioral and neural correlates. Brain Behav 2017; 7:e00789. [PMID: 28948083 PMCID: PMC5607552 DOI: 10.1002/brb3.789]
Abstract
INTRODUCTION We examined which brain areas are involved in the comprehension of acoustically distorted speech using an experimental paradigm where the same distorted sentence can be perceived at different levels of intelligibility. This change in intelligibility occurs via a single intervening presentation of the intact version of the sentence, and the effect lasts at least on the order of minutes. Since the acoustic structure of the distorted stimulus is kept fixed and only intelligibility is varied, this allows one to study brain activity related to speech comprehension specifically.
METHODS In a functional magnetic resonance imaging (fMRI) experiment, a stimulus set contained a block of six distorted sentences. This was followed by the intact counterparts of the sentences, after which the sentences were presented in distorted form again. A total of 18 such sets were presented to 20 human subjects.
RESULTS The blood oxygenation level dependent (BOLD)-responses elicited by the distorted sentences which came after the disambiguating, intact sentences were contrasted with the responses to the sentences presented before disambiguation. This revealed increased activity in the bilateral frontal pole, the dorsal anterior cingulate/paracingulate cortex, and the right frontal operculum. Decreased BOLD responses were observed in the posterior insula, Heschl's gyrus, and the posterior superior temporal sulcus.
CONCLUSIONS The brain areas that showed BOLD-enhancement for increased sentence comprehension have been associated with executive functions and with the mapping of incoming sensory information to representations stored in episodic memory. Thus, the comprehension of acoustically distorted speech may be associated with the engagement of memory-related subsystems. Further, activity in the primary auditory cortex was modulated by prior experience, possibly in a predictive coding framework. Our results suggest that memory biases the perception of ambiguous sensory information toward interpretations that have the highest probability to be correct based on previous experience.
Affiliation(s)
- Maria Hakonen
- Brain and Mind Laboratory, Department of Neuroscience and Biomedical Engineering (NBE), School of Science, Aalto University, Aalto, Finland
- Department of Physiology, Faculty of Medicine, University of Helsinki, Helsinki, Finland
- Patrick J. C. May
- Medical Research Council Institute of Hearing Research, School of Medicine, The University of Nottingham, Nottingham, UK
- Special Laboratory Non-Invasive Brain Imaging, Leibniz Institute for Neurobiology, Magdeburg, Germany
- Iiro P. Jääskeläinen
- Brain and Mind Laboratory, Department of Neuroscience and Biomedical Engineering (NBE), School of Science, Aalto University, Aalto, Finland
- Emma Jokinen
- Department of Signal Processing and Acoustics, School of Electrical Engineering, Aalto University, Aalto, Finland
- Mikko Sams
- Brain and Mind Laboratory, Department of Neuroscience and Biomedical Engineering (NBE), School of Science, Aalto University, Aalto, Finland
23
Rosemann S, Gießing C, Özyurt J, Carroll R, Puschmann S, Thiel CM. The Contribution of Cognitive Factors to Individual Differences in Understanding Noise-Vocoded Speech in Young and Older Adults. Front Hum Neurosci 2017. [PMID: 28638329 PMCID: PMC5461255 DOI: 10.3389/fnhum.2017.00294]
Abstract
Noise-vocoded speech is commonly used to simulate the sensation after cochlear implantation as it consists of spectrally degraded speech. High individual variability exists in learning to understand both noise-vocoded speech and speech perceived through a cochlear implant (CI). This variability is partly ascribed to differing cognitive abilities like working memory, verbal skills or attention. Although clinically highly relevant, up to now, no consensus has been achieved about which cognitive factors exactly predict the intelligibility of speech in noise-vocoded situations in healthy subjects or in patients after cochlear implantation. We aimed to establish a test battery that can be used to predict speech understanding in patients prior to receiving a CI. Young and old healthy listeners completed a noise-vocoded speech test in addition to cognitive tests tapping on verbal memory, working memory, lexicon and retrieval skills as well as cognitive flexibility and attention. Partial-least-squares analysis revealed that six variables were important to significantly predict vocoded-speech performance. These were the ability to perceive visually degraded speech tested by the Text Reception Threshold, vocabulary size assessed with the Multiple Choice Word Test, working memory gauged with the Operation Span Test, verbal learning and recall of the Verbal Learning and Retention Test and task switching abilities tested by the Comprehensive Trail-Making Test. Thus, these cognitive abilities explain individual differences in noise-vocoded speech understanding and should be considered when aiming to predict hearing-aid outcome.
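The partial-least-squares step can be sketched with a standard PLS regression; the simulated data and two-component choice below are assumptions, not the study's settings.

```python
# Partial-least-squares sketch: relate a set of cognitive predictors to
# vocoded-speech performance (illustrative; data and settings are assumptions).
import numpy as np
from sklearn.cross_decomposition import PLSRegression

rng = np.random.default_rng(0)
n = 40                                             # simulated listeners
cognitive = rng.standard_normal((n, 6))            # six simulated cognitive measures
speech = (0.5 * cognitive[:, 0] + 0.4 * cognitive[:, 1]
          + 0.5 * rng.standard_normal(n))          # simulated vocoded-speech score

pls = PLSRegression(n_components=2).fit(cognitive, speech)
print(np.round(pls.x_loadings_, 2))                # which predictors load on each component
print(round(pls.score(cognitive, speech), 2))      # R^2 of the PLS model
```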
Collapse
Affiliation(s)
- Stephanie Rosemann
- Biological Psychology, Department of Psychology, European Medical School, Carl von Ossietzky Universität Oldenburg, Oldenburg, Germany
| | - Carsten Gießing
- Biological Psychology, Department of Psychology, European Medical School, Carl von Ossietzky Universität Oldenburg, Oldenburg, Germany
| | - Jale Özyurt
- Biological Psychology, Department of Psychology, European Medical School, Carl von Ossietzky Universität Oldenburg, Oldenburg, Germany
| | - Rebecca Carroll
- Cluster of Excellence "Hearing4all", Carl von Ossietzky Universität Oldenburg, Oldenburg, Germany; Institute of Dutch Studies, Carl von Ossietzky Universität Oldenburg, Oldenburg, Germany
| | - Sebastian Puschmann
- Biological Psychology, Department of Psychology, European Medical School, Carl von Ossietzky Universität Oldenburg, Oldenburg, Germany
| | - Christiane M Thiel
- Biological Psychology, Department of Psychology, European Medical School, Carl von Ossietzky Universität Oldenburg, Oldenburg, Germany; Cluster of Excellence "Hearing4all", Carl von Ossietzky Universität Oldenburg, Oldenburg, Germany
| |
Collapse
|
24
|
Wöstmann M, Lim SJ, Obleser J. The Human Neural Alpha Response to Speech is a Proxy of Attentional Control. Cereb Cortex 2017; 27:3307-3317. [DOI: 10.1093/cercor/bhx074] [Citation(s) in RCA: 82] [Impact Index Per Article: 11.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/07/2016] [Accepted: 03/08/2017] [Indexed: 12/25/2022] Open
|
25
|
Wöstmann M, Obleser J. Acoustic Detail But Not Predictability of Task-Irrelevant Speech Disrupts Working Memory. Front Hum Neurosci 2016; 10:538. [PMID: 27826235 PMCID: PMC5078496 DOI: 10.3389/fnhum.2016.00538] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/03/2016] [Accepted: 10/11/2016] [Indexed: 11/29/2022] Open
Abstract
Attended speech is comprehended better not only if more acoustic detail is available, but also if it is semantically highly predictable. But can more acoustic detail or higher predictability turn into disadvantages and distract a listener if the speech signal is to be ignored? Also, does the degree of distraction increase for older listeners who typically show a decline in attentional control ability? Adopting the irrelevant-speech paradigm, we tested whether younger (age 23–33 years) and older (60–78 years) listeners’ working memory for the serial order of spoken digits would be disrupted by the presentation of task-irrelevant speech varying in its acoustic detail (using noise-vocoding) and its semantic predictability (of sentence endings). More acoustic detail, but not higher predictability, of task-irrelevant speech aggravated memory interference. This pattern of results did not differ between younger and older listeners, despite generally lower performance in older listeners. Our findings suggest that the focus of attention determines how acoustics and predictability affect the processing of speech: first, as more acoustic detail is known to enhance speech comprehension and memory for speech, we here demonstrate that more acoustic detail of ignored speech enhances the degree of distraction. Second, while higher predictability of attended speech is known to also enhance speech comprehension under acoustically adverse conditions, higher predictability of ignored speech is unable to exert any distracting effect upon working memory performance in younger or older listeners. These findings suggest that features that make attended speech easier to comprehend do not necessarily enhance distraction by ignored speech.
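Because several of the studies in this list manipulate acoustic detail with noise vocoding, a simplified sketch of a noise-band vocoder may be useful here: the signal is split into a small number of frequency bands, each band's temporal envelope is extracted and used to modulate band-limited noise, and the modulated bands are summed. This is only an illustration of the general technique, not the exact processing chain used in the study above; band edges, filter orders, and the demo signal are assumptions.

```python
# Sketch of a noise-band vocoder: band-split, envelope extraction, and
# envelope-modulated noise carriers, summed and peak-normalized.
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def noise_vocode(signal, fs, n_bands=4, f_lo=100.0, f_hi=8000.0):
    edges = np.geomspace(f_lo, f_hi, n_bands + 1)     # logarithmically spaced band edges
    rng = np.random.default_rng(0)
    out = np.zeros_like(signal, dtype=float)
    for lo, hi in zip(edges[:-1], edges[1:]):
        sos = butter(4, [lo, hi], btype="band", fs=fs, output="sos")
        band = sosfiltfilt(sos, signal)
        envelope = np.abs(hilbert(band))               # temporal envelope of the band
        noise = sosfiltfilt(sos, rng.standard_normal(len(signal)))
        out += envelope * noise                        # envelope-modulated noise carrier
    return out / np.max(np.abs(out))                   # simple peak normalization

fs = 16000
t = np.arange(fs) / fs
demo = np.sin(2 * np.pi * 440 * t)                     # stand-in for a speech recording
vocoded = noise_vocode(demo, fs, n_bands=4)
```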
Collapse
Affiliation(s)
- Malte Wöstmann
- Department of Psychology, University of Lübeck, Lübeck, Germany
| | - Jonas Obleser
- Department of Psychology, University of Lübeck, Lübeck, Germany
| |
Collapse
|
26
|
Thiel CM, Özyurt J, Nogueira W, Puschmann S. Effects of Age on Long Term Memory for Degraded Speech. Front Hum Neurosci 2016; 10:473. [PMID: 27708570 PMCID: PMC5030220 DOI: 10.3389/fnhum.2016.00473] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/08/2016] [Accepted: 09/07/2016] [Indexed: 12/15/2022] Open
Abstract
Prior research suggests that acoustical degradation impacts encoding of items into memory, especially in elderly subjects. We here aimed to investigate whether acoustically degraded items that are initially encoded into memory are more prone to forgetting as a function of age. Young and old participants were tested with a vocoded and unvocoded serial list learning task involving immediate and delayed free recall. We found that degraded auditory input increased forgetting of previously encoded items, especially in older participants. We further found that working memory capacity predicted forgetting of degraded information in young participants. In old participants, verbal IQ was the most important predictor for forgetting acoustically degraded information. Our data provide evidence that acoustically degraded information, even if encoded, is especially vulnerable to forgetting in old age.
Collapse
Affiliation(s)
- Christiane M Thiel
- Biological Psychology Lab, Cluster of Excellence "Hearing4all", Department of Psychology, European Medical School, Carl von Ossietzky Universität Oldenburg, Oldenburg, Germany; Research Center Neurosensory Science, Carl von Ossietzky Universität Oldenburg, Oldenburg, Germany
| | - Jale Özyurt
- Biological Psychology Lab, Cluster of Excellence "Hearing4all", Department of Psychology, European Medical School, Carl von Ossietzky Universität Oldenburg, Oldenburg, Germany
| | - Waldo Nogueira
- Cluster of Excellence "Hearing4all", Department of Otolaryngology, Medical University Hannover, Hannover, Germany
| | - Sebastian Puschmann
- Biological Psychology Lab, Cluster of Excellence "Hearing4all", Department of Psychology, European Medical School, Carl von Ossietzky Universität Oldenburg, Oldenburg, Germany
| |
Collapse
|
27
|
Banks B, Gowen E, Munro KJ, Adank P. Audiovisual cues benefit recognition of accented speech in noise but not perceptual adaptation. Front Hum Neurosci 2015; 9:422. [PMID: 26283946 PMCID: PMC4522556 DOI: 10.3389/fnhum.2015.00422] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/31/2015] [Accepted: 07/10/2015] [Indexed: 11/25/2022] Open
Abstract
Perceptual adaptation allows humans to recognize different varieties of accented speech. We investigated whether perceptual adaptation to accented speech is facilitated if listeners can see a speaker's facial and mouth movements. In Study 1, participants listened to sentences in a novel accent and underwent a period of training with audiovisual or audio-only speech cues, presented in quiet or in background noise. A control group also underwent training with visual-only (speech-reading) cues. We observed no significant difference in perceptual adaptation between any of the groups. To address a number of remaining questions, we carried out a second study using a different accent, speaker and experimental design, in which participants listened to sentences in a non-native (Japanese) accent with audiovisual or audio-only cues, without separate training. Participants' eye gaze was recorded to verify that they looked at the speaker's face during audiovisual trials. Recognition accuracy was significantly better for audiovisual than for audio-only stimuli; however, no statistical difference in perceptual adaptation was observed between the two modalities. Furthermore, Bayesian analysis suggested that the data supported the null hypothesis. Our results suggest that although the availability of visual speech cues may be immediately beneficial for recognition of unfamiliar accented speech in noise, it does not improve perceptual adaptation.
Collapse
Affiliation(s)
- Briony Banks
- School of Psychological Sciences, University of Manchester, Manchester, UK
| | - Emma Gowen
- Faculty of Life Sciences, University of Manchester, Manchester, UK
| | - Kevin J. Munro
- School of Psychological Sciences, University of Manchester, Manchester, UK
| | - Patti Adank
- Speech, Hearing and Phonetic Sciences, University College London, London, UK
| |
Collapse
|
28
|
Azadpour M, Balaban E. A proposed mechanism for rapid adaptation to spectrally distorted speech. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2015; 138:44-57. [PMID: 26233005 DOI: 10.1121/1.4922226] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/04/2023]
Abstract
The mechanisms underlying perceptual adaptation to severely spectrally-distorted speech were studied by training participants to comprehend spectrally-rotated speech, which is obtained by inverting the speech spectrum. Spectral-rotation produces severe distortion confined to the spectral domain while preserving temporal trajectories. During five 1-hour training sessions, pairs of participants attempted to extract spoken messages from the spectrally-rotated speech of their training partner. Data on training-induced changes in comprehension of spectrally-rotated sentences and identification/discrimination of spectrally-rotated phonemes were used to evaluate the plausibility of three different classes of underlying perceptual mechanisms: (1) phonemic remapping (the formation of new phonemic categories that specifically incorporate spectrally-rotated acoustic information); (2) experience-dependent generation of a perceptual "inverse-transform" that compensates for spectral-rotation; and (3) changes in cue weighting (the identification of sets of acoustic cues least affected by spectral-rotation, followed by a rapid shift in perceptual emphasis to favour those cues, combined with the recruitment of the same type of "perceptual filling-in" mechanisms used to disambiguate speech-in-noise). Results exclusively support the third mechanism, which is the only one predicting that learning would specifically target temporally-dynamic cues that were transmitting phonetic information most stably in spite of spectral-distortion. No support was found for phonemic remapping or for inverse-transform generation.
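The abstract describes spectrally rotated speech as speech whose spectrum has been inverted. A minimal sketch of one way to produce such an inversion is given below: band-limit the signal, multiply by a cosine carrier at the band edge, and low-pass again, which mirrors the spectrum within the band. The cutoff frequency, filter settings, and demo signal are illustrative assumptions rather than the study's exact processing.

```python
# Sketch: inverting (rotating) the spectrum of a band-limited signal.
# A component at frequency f (f < f_max) ends up at f_max - f.
import numpy as np
from scipy.signal import butter, sosfiltfilt

def spectrally_rotate(signal, fs, f_max=4000.0):
    sos = butter(6, f_max, btype="low", fs=fs, output="sos")
    band_limited = sosfiltfilt(sos, signal)
    t = np.arange(len(signal)) / fs
    carrier = np.cos(2 * np.pi * f_max * t)
    # Modulation mirrors the 0..f_max spectrum around f_max / 2 once the
    # upper sideband is removed by the second low-pass filter.
    return sosfiltfilt(sos, band_limited * carrier)

fs = 16000
t = np.arange(fs) / fs
demo = np.sin(2 * np.pi * 1000 * t)    # a 1 kHz tone ends up near 3 kHz after rotation
rotated = spectrally_rotate(demo, fs)
```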
Collapse
Affiliation(s)
- Mahan Azadpour
- Cognitive Neuroscience Sector, SISSA (International School for Advanced Studies), Via Beirut 2-4, Trieste, Italy
| | - Evan Balaban
- Cognitive Neuroscience Sector, SISSA (International School for Advanced Studies), Via Beirut 2-4, Trieste, Italy
| |
Collapse
|
29
|
Lagerberg TB, Johnels JÅ, Hartelius L, Persson C. Effect of the number of presentations on listener transcriptions and reliability in the assessment of speech intelligibility in children. INTERNATIONAL JOURNAL OF LANGUAGE & COMMUNICATION DISORDERS 2015; 50:476-487. [PMID: 25588966 DOI: 10.1111/1460-6984.12149] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 02/06/2014] [Accepted: 11/03/2014] [Indexed: 06/04/2023]
Abstract
BACKGROUND The assessment of intelligibility is an essential part of establishing the severity of a speech disorder. The intelligibility of a speaker is affected by a number of different variables relating, inter alia, to the speech material, the listener and the listener task. AIMS To explore the impact of the number of presentations of the utterances on assessments of intelligibility based on orthographic transcription of spontaneous speech, specifically the impact on intelligibility scores, reliability and intra-listener variability. METHODS & PROCEDURES Speech from 12 children (aged 4:6-8:3 years; mean = 5:10 years) with percentage consonants correct (PCC) scores ranging from 49 to 81 was listened to by 18 students on the speech-language pathology (SLP) programme and by two recent graduates from that programme. Three conditions were examined during the transcription phase: (1) listening to each utterance once; (2) listening to each utterance a second time; and (3) listening to all utterances from a given child a third time after having heard all of its utterances twice. OUTCOMES & RESULTS Statistically significant differences between intelligibility scores were found across the three conditions, i.e. the intelligibility score increased with the number of presentations while inter-judge reliability was unchanged. The results differed markedly across listeners, but each individual listener's results were very consistent across conditions. CONCLUSIONS & IMPLICATIONS Information about the number of times an utterance is presented to the listener is important and should therefore always be included in reports of research involving intelligibility assessment. There is a need for further research and discussion on listener abilities and strategies.
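Transcription-based intelligibility of the kind assessed above is, at its core, the proportion of target words a listener reproduces. A minimal scoring sketch follows; real protocols handle word order, repetitions, and partial credit more carefully, so this is only an illustration, and the example sentences are made up.

```python
# Sketch: word-level intelligibility as the percentage of target words that
# appear in a listener's orthographic transcription (order-insensitive).
def intelligibility(target: str, transcription: str) -> float:
    target_words = target.lower().split()
    heard = transcription.lower().split()
    correct = sum(1 for w in target_words if w in heard)
    return 100.0 * correct / len(target_words)

print(intelligibility("the little dog ran home", "the dog ran home"))  # 80.0
```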
Collapse
Affiliation(s)
- Tove B Lagerberg
- The Sahlgrenska Academy, University of Gothenburg, Institute of Neuroscience and Physiology, Division of Speech and Language Pathology, Gothenburg, Sweden
| | - Jakob Åsberg Johnels
- The Sahlgrenska Academy, University of Gothenburg, Institute of Neuroscience and Physiology, Division of Speech and Language Pathology, Gothenburg, Sweden
| | - Lena Hartelius
- The Sahlgrenska Academy, University of Gothenburg, Institute of Neuroscience and Physiology, Division of Speech and Language Pathology, Gothenburg, Sweden
| | - Christina Persson
- The Sahlgrenska Academy, University of Gothenburg, Institute of Neuroscience and Physiology, Division of Speech and Language Pathology, Gothenburg, Sweden
| |
Collapse
|
30
|
Lima CF, Lavan N, Evans S, Agnew Z, Halpern AR, Shanmugalingam P, Meekings S, Boebinger D, Ostarek M, McGettigan C, Warren JE, Scott SK. Feel the Noise: Relating Individual Differences in Auditory Imagery to the Structure and Function of Sensorimotor Systems. Cereb Cortex 2015; 25:4638-50. [PMID: 26092220 PMCID: PMC4816805 DOI: 10.1093/cercor/bhv134] [Citation(s) in RCA: 39] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/16/2022] Open
Abstract
Humans can generate mental auditory images of voices or songs, sometimes perceiving them almost as vividly as perceptual experiences. The functional networks supporting auditory imagery have been described, but less is known about the systems associated with interindividual differences in auditory imagery. Combining voxel-based morphometry and fMRI, we examined the structural basis of interindividual differences in how auditory images are subjectively perceived, and explored associations between auditory imagery, sensory-based processing, and visual imagery. Vividness of auditory imagery correlated with gray matter volume in the supplementary motor area (SMA), parietal cortex, medial superior frontal gyrus, and middle frontal gyrus. An analysis of functional responses to different types of human vocalizations revealed that the SMA and parietal sites that predict imagery are also modulated by sound type. Using representational similarity analysis, we found that higher representational specificity of heard sounds in SMA predicts vividness of imagery, indicating a mechanistic link between sensory- and imagery-based processing in sensorimotor cortex. Vividness of imagery in the visual domain also correlated with SMA structure, and with auditory imagery scores. Altogether, these findings provide evidence for a signature of imagery in brain structure, and highlight a common role of perceptual–motor interactions for processing heard and internally generated auditory information.
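The representational similarity analysis mentioned above can be illustrated with a short sketch: build a neural representational dissimilarity matrix (RDM) from condition-wise activation patterns, build a model RDM, and rank-correlate the two. The pattern matrices below are random placeholders, not the study's data.

```python
# Sketch: correlating a neural RDM with a model RDM (representational
# similarity analysis). Data are simulated for illustration only.
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

rng = np.random.default_rng(2)
n_conditions, n_voxels = 8, 200
patterns = rng.normal(size=(n_conditions, n_voxels))   # condition x voxel activation patterns

neural_rdm = pdist(patterns, metric="correlation")     # 1 - Pearson r between condition patterns
model_rdm = pdist(rng.normal(size=(n_conditions, 3)), metric="euclidean")  # hypothetical model

rho, p = spearmanr(neural_rdm, model_rdm)              # rank correlation of the two RDMs
print(f"RDM similarity: rho = {rho:.2f}, p = {p:.3f}")
```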
Collapse
Affiliation(s)
- César F Lima
- Institute of Cognitive Neuroscience; Center for Psychology, University of Porto, Porto, Portugal
| | - Nadine Lavan
- Institute of Cognitive Neuroscience; Department of Psychology, Royal Holloway University of London, London, UK
| | | | - Zarinah Agnew
- Institute of Cognitive Neuroscience; Department of Otolaryngology, University of California, San Francisco, USA
| | | | | | | | | | | | - Carolyn McGettigan
- Institute of Cognitive Neuroscience; Department of Psychology, Royal Holloway University of London, London, UK
| | - Jane E Warren
- Faculty of Brain Sciences, University College London, London, UK
| | | |
Collapse
|
31
|
Banks B, Gowen E, Munro KJ, Adank P. Cognitive predictors of perceptual adaptation to accented speech. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2015; 137:2015-2024. [PMID: 25920852 DOI: 10.1121/1.4916265] [Citation(s) in RCA: 51] [Impact Index Per Article: 5.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/04/2023]
Abstract
The present study investigated the effects of inhibition, vocabulary knowledge, and working memory on perceptual adaptation to accented speech. One hundred young, normal-hearing adults listened to sentences spoken in a constructed, unfamiliar accent presented in speech-shaped background noise. Speech Reception Thresholds (SRTs) corresponding to 50% speech recognition accuracy provided a measurement of adaptation to the accented speech. Stroop, vocabulary knowledge, and working memory tests were performed to measure cognitive ability. Participants adapted to the unfamiliar accent as revealed by a decrease in SRTs over time. Better inhibition (lower Stroop scores) predicted greater and faster adaptation to the unfamiliar accent. Vocabulary knowledge predicted better recognition of the unfamiliar accent, while working memory had a smaller, indirect effect on speech recognition mediated by vocabulary score. Results support a top-down model for successful adaptation to, and recognition of, accented speech; they add to recent theories that allocate a prominent role for executive function to effective speech comprehension in adverse listening conditions.
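Speech Reception Thresholds like those used above are typically estimated with an adaptive track that converges on the signal-to-noise ratio giving 50% correct recognition. The sketch below simulates a simple one-up/one-down track with a logistic listener model; the step size, slope, and threshold values are illustrative assumptions, not the study's procedure.

```python
# Sketch: one-up/one-down adaptive track converging on the 50%-correct SRT.
# SNR is lowered after a correct response and raised after an error.
import numpy as np

rng = np.random.default_rng(3)
true_srt, slope, step = -4.0, 1.0, 2.0           # dB SNR at 50% correct (hypothetical)
snr, track = 0.0, []

for trial in range(50):
    p_correct = 1.0 / (1.0 + np.exp(-(snr - true_srt) * slope))
    correct = rng.random() < p_correct
    track.append(snr)
    snr += -step if correct else step            # harder after a hit, easier after a miss

srt_estimate = np.mean(track[10:])               # average SNR after initial convergence
print(f"estimated SRT: {srt_estimate:.1f} dB SNR")
```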
Collapse
Affiliation(s)
- Briony Banks
- School of Psychological Sciences, University of Manchester, Manchester M13 9PL, United Kingdom
| | - Emma Gowen
- Faculty of Life Sciences, University of Manchester, Manchester M13 9PL, United Kingdom
| | - Kevin J Munro
- School of Psychological Sciences, University of Manchester, Manchester M13 9PL, United Kingdom
| | - Patti Adank
- School of Psychological Sciences, University of Manchester, Manchester M13 9PL, United Kingdom
| |
Collapse
|
32
|
Scharinger M, Henry MJ, Obleser J. Acoustic cue selection and discrimination under degradation: differential contributions of the inferior parietal and posterior temporal cortices. Neuroimage 2014; 106:373-81. [PMID: 25481793 DOI: 10.1016/j.neuroimage.2014.11.050] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/02/2014] [Revised: 10/10/2014] [Accepted: 11/23/2014] [Indexed: 11/26/2022] Open
Abstract
Auditory categorization is a vital skill for perceiving the acoustic environment. Categorization depends on the discriminability of the sensory input as well as on the ability of the listener to adaptively make use of the relevant features of the sound. Previous studies on categorization have focused either on speech sounds when studying discriminability or on visual stimuli when assessing optimal cue utilization. Here, by contrast, we examined neural sensitivity to stimulus discriminability and optimal cue utilization when categorizing novel, non-speech auditory stimuli not affected by long-term familiarity. In a functional magnetic resonance imaging (fMRI) experiment, listeners categorized sounds from two category distributions, differing along two acoustic dimensions: spectral shape and duration. By introducing spectral degradation after the first half of the experiment, we manipulated both stimulus discriminability and the relative informativeness of acoustic cues. Degradation caused an overall decrease in discriminability based on spectral shape, and therefore enhanced the informativeness of duration. A relative increase in duration-cue utilization was accompanied by increased activity in left parietal cortex. Further, discriminability modulated right planum temporale activity to a higher degree when stimuli were spectrally degraded than when they were not. These findings provide support for separable contributions of parietal and posterior temporal areas to perceptual categorization. The parietal cortex seems to support the selective utilization of informative stimulus cues, while the posterior superior temporal cortex as a primarily auditory brain area supports discriminability particularly under acoustic degradation.
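Cue utilization of the kind described above is often quantified by regressing a listener's trial-wise category responses on the values of each acoustic cue; the fitted weights index how strongly each cue drives behavior. The sketch below simulates a listener who relies mostly on duration (as expected after spectral degradation); all variable names and data are illustrative, not the authors' analysis.

```python
# Sketch: estimating per-cue decision weights (spectral shape vs. duration)
# with logistic regression on simulated trial-wise responses.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(4)
n_trials = 200
spectral = rng.normal(size=n_trials)             # standardized spectral-shape cue
duration = rng.normal(size=n_trials)             # standardized duration cue
# Simulated listener relying mostly on the duration cue
p_cat = 1 / (1 + np.exp(-(0.3 * spectral + 1.5 * duration)))
responses = rng.random(n_trials) < p_cat

model = LogisticRegression().fit(np.column_stack([spectral, duration]), responses)
w_spectral, w_duration = model.coef_[0]
print(f"cue weights: spectral = {w_spectral:.2f}, duration = {w_duration:.2f}")
```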
Collapse
Affiliation(s)
- Mathias Scharinger
- Max Planck Research Group "Auditory Cognition", Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany.
| | - Molly J Henry
- Max Planck Research Group "Auditory Cognition", Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
| | - Jonas Obleser
- Max Planck Research Group "Auditory Cognition", Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
| |
Collapse
|
33
|
Herrmann B, Henry MJ, Scharinger M, Obleser J. Supplementary motor area activations predict individual differences in temporal-change sensitivity and its illusory distortions. Neuroimage 2014; 101:370-9. [DOI: 10.1016/j.neuroimage.2014.07.026] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/15/2014] [Revised: 07/07/2014] [Accepted: 07/16/2014] [Indexed: 10/25/2022] Open
|
34
|
Hartwigsen G, Golombek T, Obleser J. Repetitive transcranial magnetic stimulation over left angular gyrus modulates the predictability gain in degraded speech comprehension. Cortex 2014; 68:100-10. [PMID: 25444577 DOI: 10.1016/j.cortex.2014.08.027] [Citation(s) in RCA: 47] [Impact Index Per Article: 4.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/06/2014] [Revised: 07/08/2014] [Accepted: 08/26/2014] [Indexed: 10/24/2022]
Abstract
Increased neural activity in left angular gyrus (AG) accompanies successful comprehension of acoustically degraded but highly predictable sentences, as previous functional imaging studies have shown. However, it remains unclear whether the left AG is causally relevant for the comprehension of degraded speech. Here, we applied transient virtual lesions to either the left AG or superior parietal lobe (SPL, as a control area) with repetitive transcranial magnetic stimulation (rTMS) while healthy volunteers listened to and repeated sentences with high- versus low-predictable endings and different noise vocoding levels. We expected that rTMS of AG should selectively modulate the predictability gain (i.e., the comprehension benefit from sentences with high-predictable endings) at a medium degradation level. We found that rTMS of AG indeed reduced the predictability gain at a medium degradation level of 4-band noise vocoding (relative to control rTMS of SPL). In contrast, the behavioral perturbation induced by rTMS changed with increased signal quality. Hence, at 8-band noise vocoding, rTMS over AG versus SPL decreased the number of correctly repeated keywords for sentences with low-predictable endings. Together, these results show that the degree of the rTMS interference depended jointly on signal quality and predictability. Our results provide the first causal evidence that the left AG is a critical node for facilitating speech comprehension in challenging listening conditions.
Collapse
Affiliation(s)
- Gesa Hartwigsen
- Language & Aphasia Laboratory, Department of Neurology, University of Leipzig, Leipzig, Germany; Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany; Department of Psychology, Christian-Albrechts-University, Kiel, Germany.
| | - Thomas Golombek
- Language & Aphasia Laboratory, Department of Neurology, University of Leipzig, Leipzig, Germany; Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
| | - Jonas Obleser
- Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany.
| |
Collapse
|
35
|
Neger TM, Rietveld T, Janse E. Relationship between perceptual learning in speech and statistical learning in younger and older adults. Front Hum Neurosci 2014; 8:628. [PMID: 25225475 PMCID: PMC4150448 DOI: 10.3389/fnhum.2014.00628] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/03/2014] [Accepted: 07/28/2014] [Indexed: 11/30/2022] Open
Abstract
Within a few sentences, listeners learn to understand severely degraded speech such as noise-vocoded speech. However, individuals vary in the amount of such perceptual learning and it is unclear what underlies these differences. The present study investigates whether perceptual learning in speech relates to statistical learning, as sensitivity to probabilistic information may aid identification of relevant cues in novel speech input. If statistical learning and perceptual learning (partly) draw on the same general mechanisms, then statistical learning in a non-auditory modality using non-linguistic sequences should predict adaptation to degraded speech. In the present study, 73 older adults (aged over 60 years) and 60 younger adults (aged between 18 and 30 years) performed a visual artificial grammar learning task and were presented with 60 meaningful noise-vocoded sentences in an auditory recall task. Within age groups, sentence recognition performance over exposure was analyzed as a function of statistical learning performance, and other variables that may predict learning (i.e., hearing, vocabulary, attention switching control, working memory, and processing speed). Younger and older adults showed similar amounts of perceptual learning, but only younger adults showed significant statistical learning. In older adults, improvement in understanding noise-vocoded speech was constrained by age. In younger adults, amount of adaptation was associated with lexical knowledge and with statistical learning ability. Thus, individual differences in general cognitive abilities explain listeners' variability in adapting to noise-vocoded speech. Results suggest that perceptual and statistical learning share mechanisms of implicit regularity detection, but that the ability to detect statistical regularities is impaired in older adults if visual sequences are presented quickly.
Collapse
Affiliation(s)
- Thordis M Neger
- Centre for Language Studies, Radboud University Nijmegen, Nijmegen, Netherlands; International Max Planck Research School for Language Sciences, Nijmegen, Netherlands
| | - Toni Rietveld
- Centre for Language Studies, Radboud University Nijmegen, Nijmegen, Netherlands
| | - Esther Janse
- Centre for Language Studies, Radboud University Nijmegen, Nijmegen, Netherlands; Donders Institute for Brain, Cognition and Behaviour, Radboud University Nijmegen, Nijmegen, Netherlands
| |
Collapse
|
36
|
Scharinger M, Herrmann B, Nierhaus T, Obleser J. Simultaneous EEG-fMRI brain signatures of auditory cue utilization. Front Neurosci 2014; 8:137. [PMID: 24926232 PMCID: PMC4044900 DOI: 10.3389/fnins.2014.00137] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/24/2014] [Accepted: 05/17/2014] [Indexed: 11/13/2022] Open
Abstract
Optimal utilization of acoustic cues during auditory categorization is a vital skill, particularly when informative cues become occluded or degraded. Consequently, the acoustic environment requires flexible choosing and switching amongst available cues. The present study targets the brain functions underlying such changes in cue utilization. Participants performed a categorization task with immediate feedback on acoustic stimuli from two categories that varied in duration and spectral properties, while we simultaneously recorded Blood Oxygenation Level Dependent (BOLD) responses in fMRI and electroencephalograms (EEGs). In the first half of the experiment, categories could be best discriminated by spectral properties. Halfway through the experiment, spectral degradation rendered the stimulus duration the more informative cue. Behaviorally, degradation decreased the likelihood of utilizing spectral cues. Spectrally degrading the acoustic signal led to increased alpha power compared to nondegraded stimuli. The EEG-informed fMRI analyses revealed that alpha power correlated with BOLD changes in inferior parietal cortex and right posterior superior temporal gyrus (including planum temporale). In both areas, spectral degradation led to a weaker coupling of BOLD response to behavioral utilization of the spectral cue. These data provide converging evidence from behavioral modeling, electrophysiology, and hemodynamics that (a) increased alpha power mediates the inhibition of uninformative (here spectral) stimulus features, and that (b) the parietal attention network supports optimal cue utilization in auditory categorization. The results highlight the complex cortical processing of auditory categorization under realistic listening challenges.
Collapse
Affiliation(s)
- Mathias Scharinger
- Max Planck Research Group "Auditory Cognition", Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
| | - Björn Herrmann
- Max Planck Research Group "Auditory Cognition", Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
| | - Till Nierhaus
- Department of Neurology, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
| | - Jonas Obleser
- Max Planck Research Group "Auditory Cognition", Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
| |
Collapse
|
37
|
Becker R, Pefkou M, Michel CM, Hervais-Adelman AG. Left temporal alpha-band activity reflects single word intelligibility. Front Syst Neurosci 2013; 7:121. [PMID: 24416001 PMCID: PMC3873629 DOI: 10.3389/fnsys.2013.00121] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/15/2013] [Accepted: 12/10/2013] [Indexed: 11/13/2022] Open
Abstract
The electroencephalographic (EEG) correlates of degraded speech perception have been explored in a number of recent studies. However, such investigations have often been inconclusive as to whether observed differences in brain responses between conditions result from different acoustic properties of more or less intelligible stimuli or whether they relate to cognitive processes implicated in comprehending challenging stimuli. In this study we used noise vocoding to spectrally degrade monosyllabic words in order to manipulate their intelligibility. We used spectral rotation to generate incomprehensible control conditions matched in terms of spectral detail. We recorded EEG from 14 volunteers who listened to a series of noise vocoded (NV) and noise-vocoded spectrally-rotated (rNV) words, while they carried out a detection task. We specifically sought components of the EEG response that showed an interaction between spectral rotation and spectral degradation. This reflects those aspects of the brain electrical response that are related to the intelligibility of acoustically degraded monosyllabic words, while controlling for spectral detail. An interaction between spectral complexity and rotation was apparent in both evoked and induced activity. Analyses of event-related potentials showed an interaction effect for a P300-like component at several centro-parietal electrodes. Time-frequency analysis of the EEG signal in the alpha-band revealed a monotonic increase in event-related desynchronization (ERD) for the NV but not the rNV stimuli in the alpha band at a left temporo-central electrode cluster from 420-560 ms reflecting a direct relationship between the strength of alpha-band ERD and intelligibility. By matching NV words with their incomprehensible rNV homologues, we reveal the spatiotemporal pattern of evoked and induced processes involved in degraded speech perception, largely uncontaminated by purely acoustic effects.
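The alpha-band event-related desynchronization (ERD) reported above is conventionally expressed as the percentage power change in a post-stimulus window relative to a pre-stimulus baseline. A minimal computation sketch follows; the band limits, window lengths, and EEG segments are illustrative placeholders.

```python
# Sketch: alpha-band ERD as percentage power change relative to baseline.
# Negative values indicate desynchronization (power decrease).
import numpy as np
from scipy.signal import welch

def alpha_power(segment, fs, band=(8.0, 12.0)):
    freqs, psd = welch(segment, fs=fs, nperseg=min(len(segment), fs))
    mask = (freqs >= band[0]) & (freqs <= band[1])
    return psd[mask].mean()

fs = 250
rng = np.random.default_rng(5)
baseline = rng.normal(size=fs)          # 1 s pre-stimulus EEG (placeholder)
post = 0.7 * rng.normal(size=fs)        # 1 s post-stimulus EEG with reduced power

erd = 100 * (alpha_power(post, fs) - alpha_power(baseline, fs)) / alpha_power(baseline, fs)
print(f"alpha ERD: {erd:.1f} %")
```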
Collapse
Affiliation(s)
- Robert Becker
- Functional Brain Mapping Lab, Department of Fundamental Neuroscience, University of Geneva, Geneva, Switzerland
| | - Maria Pefkou
- Brain and Language Lab, Department of Clinical Neuroscience, University of Geneva, Geneva, Switzerland
| | - Christoph M Michel
- Functional Brain Mapping Lab, Department of Fundamental Neuroscience, University of Geneva, Geneva, Switzerland
| | - Alexis G Hervais-Adelman
- Brain and Language Lab, Department of Clinical Neuroscience, University of Geneva, Geneva, Switzerland
| |
Collapse
|
38
|
Erb J, Obleser J. Upregulation of cognitive control networks in older adults' speech comprehension. Front Syst Neurosci 2013; 7:116. [PMID: 24399939 PMCID: PMC3871967 DOI: 10.3389/fnsys.2013.00116] [Citation(s) in RCA: 78] [Impact Index Per Article: 7.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/09/2013] [Accepted: 12/05/2013] [Indexed: 11/20/2022] Open
Abstract
Speech comprehension abilities decline with age and with age-related hearing loss, but it is unclear how this decline expresses in terms of central neural mechanisms. The current study examined neural speech processing in a group of older adults (aged 56–77, n = 16, with varying degrees of sensorineural hearing loss), and compared them to a cohort of young adults (aged 22–31, n = 30, self-reported normal hearing). In a functional MRI experiment, listeners heard and repeated back degraded sentences (4-band vocoded, where the temporal envelope of the acoustic signal is preserved, while the spectral information is substantially degraded). Behaviorally, older adults adapted to degraded speech at the same rate as young listeners, although their overall comprehension of degraded speech was lower. Neurally, both older and young adults relied on the left anterior insula for degraded more than clear speech perception. However, anterior insula engagement in older adults was dependent on hearing acuity. Young adults additionally employed the anterior cingulate cortex (ACC). Interestingly, this age group × degradation interaction was driven by a reduced dynamic range in older adults who displayed elevated levels of ACC activity for both degraded and clear speech, consistent with a persistent upregulation in cognitive control irrespective of task difficulty. For correct speech comprehension, older adults relied on the middle frontal gyrus in addition to a core speech comprehension network recruited by younger adults suggestive of a compensatory mechanism. Taken together, the results indicate that older adults increasingly recruit cognitive control networks, even under optimal listening conditions, at the expense of these systems’ dynamic range.
Collapse
Affiliation(s)
- Julia Erb
- Max Planck Research Group "Auditory Cognition", Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
| | - Jonas Obleser
- Max Planck Research Group "Auditory Cognition", Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
| |
Collapse
|
39
|
Facilitation of inferior frontal cortex by transcranial direct current stimulation induces perceptual learning of severely degraded speech. J Neurosci 2013; 33:15868-78. [PMID: 24089493 DOI: 10.1523/jneurosci.5466-12.2013] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/21/2022] Open
Abstract
Perceptual learning requires the generalization of categorical perceptual sensitivity from trained to untrained items. For degraded speech, perceptual learning modulates activation in a left-lateralized network, including inferior frontal gyrus (IFG) and inferior parietal cortex (IPC). Here we demonstrate that facilitatory anodal transcranial direct current stimulation (tDCS(anodal)) can induce perceptual learning in healthy humans. In a sham-controlled, parallel design study, 36 volunteers were allocated to the three following intervention groups: tDCS(anodal) over left IFG, IPC, or sham. Participants decided on the match between an acoustically degraded and an undegraded written word by forced same-different choice. Acoustic degradation varied in four noise-vocoding levels (2, 3, 4, and 6 bands). Participants were trained to discriminate between minimal (/Tisch/-FISCH) and identical word pairs (/Tisch/-TISCH) over a period of 3 d, and tDCS(anodal) was applied during the first 20 min of training. Perceptual sensitivity (d') for trained word pairs, and an equal number of untrained word pairs, was tested before and after training. Increases in d' indicate perceptual learning for untrained word pairs, and a combination of item-specific and perceptual learning for trained word pairs. Most notably for the lowest intelligibility level, perceptual learning occurred only when tDCS(anodal) was applied over left IFG. For trained pairs, improved d' was seen on all intelligibility levels regardless of tDCS intervention. Over left IPC, tDCS(anodal) did not modulate learning but instead introduced a response bias during training. Volunteers were more likely to respond "same," potentially indicating enhanced perceptual fusion of degraded auditory with undegraded written input. Our results supply first evidence that neural facilitation of higher-order language areas can induce perceptual learning of severely degraded speech.
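Perceptual sensitivity d' in a same-different task of the kind described above is computed from hit and false-alarm rates as the difference of their z-transforms. A minimal sketch with a standard correction for extreme rates is shown below; the trial counts are made-up examples.

```python
# Sketch: d' from hit and false-alarm counts, with a log-linear correction to
# avoid infinite z-scores when a rate is exactly 0 or 1.
from scipy.stats import norm

def d_prime(hits, misses, false_alarms, correct_rejections):
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    return norm.ppf(hit_rate) - norm.ppf(fa_rate)

print(round(d_prime(hits=42, misses=8, false_alarms=12, correct_rejections=38), 2))
```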
Collapse
|
40
|
Scharinger M, Henry MJ, Erb J, Meyer L, Obleser J. Thalamic and parietal brain morphology predicts auditory category learning. Neuropsychologia 2013; 53:75-83. [PMID: 24035788 DOI: 10.1016/j.neuropsychologia.2013.09.012] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/28/2013] [Revised: 09/02/2013] [Accepted: 09/04/2013] [Indexed: 01/13/2023]
Abstract
Auditory categorization is a vital skill involving the attribution of meaning to acoustic events, engaging domain-specific (i.e., auditory) as well as domain-general (e.g., executive) brain networks. A listener's ability to categorize novel acoustic stimuli should therefore depend on both, with the domain-general network being particularly relevant for adaptively changing listening strategies and directing attention to relevant acoustic cues. Here we assessed adaptive listening behavior, using complex acoustic stimuli with an initially salient (but later degraded) spectral cue and a secondary, duration cue that remained nondegraded. We employed voxel-based morphometry (VBM) to identify cortical and subcortical brain structures whose individual neuroanatomy predicted task performance and the ability to optimally switch to making use of temporal cues after spectral degradation. Behavioral listening strategies were assessed by logistic regression and revealed mainly strategy switches in the expected direction, with considerable individual differences. Gray-matter probability in the left inferior parietal lobule (BA 40) and left precentral gyrus was predictive of "optimal" strategy switch, while gray-matter probability in thalamic areas, comprising the medial geniculate body, co-varied with overall performance. Taken together, our findings suggest that successful auditory categorization relies on domain-specific neural circuits in the ascending auditory pathway, while adaptive listening behavior depends more on brain structure in parietal cortex, enabling the (re)direction of attention to salient stimulus properties.
Collapse
Affiliation(s)
- Mathias Scharinger
- Max Planck Research Group "Auditory Cognition", Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany.
| | - Molly J Henry
- Max Planck Research Group "Auditory Cognition", Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
| | - Julia Erb
- Max Planck Research Group "Auditory Cognition", Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
| | - Lars Meyer
- Department of Neuropsychology, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
| | - Jonas Obleser
- Max Planck Research Group "Auditory Cognition", Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
| |
Collapse
|
41
|
Abstract
Listeners show a remarkable ability to quickly adjust to degraded speech input. Here, we aimed to identify the neural mechanisms of such short-term perceptual adaptation. In a sparse-sampling, cardiac-gated functional magnetic resonance imaging (fMRI) acquisition, human listeners heard and repeated back 4-band-vocoded sentences (in which the temporal envelope of the acoustic signal is preserved, while spectral information is highly degraded). Clear-speech trials were included as baseline. An additional fMRI experiment on amplitude modulation rate discrimination quantified the convergence of neural mechanisms that subserve coping with challenging listening conditions for speech and non-speech. First, the degraded speech task revealed an "executive" network (comprising the anterior insula and anterior cingulate cortex), parts of which were also activated in the non-speech discrimination task. Second, trial-by-trial fluctuations in successful comprehension of degraded speech drove hemodynamic signal change in classic "language" areas (bilateral temporal cortices). Third, as listeners perceptually adapted to degraded speech, downregulation in a cortico-striato-thalamo-cortical circuit was observable. The present data highlight differential upregulation and downregulation in auditory-language and executive networks, respectively, with important subcortical contributions when successfully adapting to a challenging listening situation.
Collapse
|
42
|
Henry MJ, Herrmann B, Obleser J. Selective Attention to Temporal Features on Nested Time Scales. Cereb Cortex 2013; 25:450-9. [DOI: 10.1093/cercor/bht240] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022] Open
|
43
|
Smalt CJ, Gonzalez-Castillo J, Talavage TM, Pisoni DB, Svirsky MA. Neural correlates of adaptation in freely-moving normal hearing subjects under cochlear implant acoustic simulations. Neuroimage 2013; 82:500-9. [PMID: 23751864 DOI: 10.1016/j.neuroimage.2013.06.001] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/10/2012] [Revised: 05/28/2013] [Accepted: 06/01/2013] [Indexed: 11/28/2022] Open
Abstract
Neurobiological correlates of adaptation to spectrally degraded speech were investigated with fMRI before and after exposure to a portable real-time speech processor that implements an acoustic simulation model of a cochlear implant (CI). The speech processor, in conjunction with isolating insert earphones and a microphone to capture environmental sounds, was worn by participants over a two-week chronic exposure period. fMRI and behavioral speech comprehension testing were conducted before and after this two-week period. After using the simulator for 2 h each day, participants improved significantly in word and sentence recognition scores. fMRI showed that these improvements were accompanied by changes in patterns of neuronal activation. In particular, we found additional recruitment of visual, motor, and working memory areas after the perceptual training period. These findings suggest that the human brain is able to adapt within a short period of time to a degraded auditory signal in a natural learning environment, and they give insight into how a CI might interact with the central nervous system. This paradigm can be extended to investigate neural correlates of new rehabilitation, training, and signal-processing strategies non-invasively in normal-hearing listeners to improve CI patient outcomes.
Collapse
Affiliation(s)
- Christopher J Smalt
- School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN, USA.
| | | | | | | | | |
Collapse
|