1
Kim SG, De Martino F, Overath T. Linguistic modulation of the neural encoding of phonemes. Cereb Cortex 2024; 34:bhae155. PMID: 38687241; PMCID: PMC11059272; DOI: 10.1093/cercor/bhae155
Abstract
Speech comprehension entails the neural mapping of the acoustic speech signal onto learned linguistic units. This acousto-linguistic transformation is bi-directional, whereby higher-level linguistic processes (e.g. semantics) modulate the acoustic analysis of individual linguistic units. Here, we investigated the cortical topography and linguistic modulation of the most fundamental linguistic unit, the phoneme. We presented natural speech and "phoneme quilts" (pseudo-randomly shuffled phonemes) in either a familiar (English) or unfamiliar (Korean) language to native English speakers while recording functional magnetic resonance imaging (fMRI) data. This allowed us to dissociate the contributions of acoustic vs. linguistic processes toward phoneme analysis. We show (i) that the acoustic analysis of phonemes is modulated by linguistic analysis and (ii) that this modulation requires both acoustic and phonetic information to be incorporated. These results suggest that the linguistic modulation of cortical sensitivity to phoneme classes minimizes prediction error during natural speech perception, thereby aiding speech comprehension in challenging listening situations.
Affiliation(s)
- Seung-Goo Kim
- Department of Psychology and Neuroscience, Duke University, 308 Research Dr, Durham, NC 27708, United States
- Research Group Neurocognition of Music and Language, Max Planck Institute for Empirical Aesthetics, Grüneburgweg 14, Frankfurt am Main 60322, Germany
- Federico De Martino
- Faculty of Psychology and Neuroscience, University of Maastricht, Universiteitssingel 40, 6229 ER Maastricht, Netherlands
- Tobias Overath
- Department of Psychology and Neuroscience, Duke University, 308 Research Dr, Durham, NC 27708, United States
- Duke Institute for Brain Sciences, Duke University, 308 Research Dr, Durham, NC 27708, United States
- Center for Cognitive Neuroscience, Duke University, 308 Research Dr, Durham, NC 27708, United States
2
Zhou X, Burg E, Kan A, Litovsky RY. Investigating effortful speech perception using fNIRS and pupillometry measures. Curr Res Neurobiol 2022; 3:100052. PMID: 36518346; PMCID: PMC9743070; DOI: 10.1016/j.crneur.2022.100052
Abstract
The current study examined the neural mechanisms of mental effort and their correlation with speech perception using functional near-infrared spectroscopy (fNIRS) in listeners with normal hearing (NH). Data were collected while participants listened and responded to unprocessed and degraded sentences, where words were presented in grammatically correct or shuffled order. Effortful listening and task difficulty due to stimulus manipulations were confirmed using a subjective questionnaire and a well-established objective measure of mental effort, pupillometry. fNIRS measures focused on cortical responses in two a priori regions of interest, the left auditory cortex (AC) and lateral frontal cortex (LFC), which are closely related to auditory speech perception and listening effort, respectively. We examined the relations between the two objective measures and behavioral measures of speech perception (task performance) and task difficulty. Results demonstrated that changes in pupil dilation were positively correlated with self-reported task difficulty and negatively correlated with task performance scores. A significant negative correlation between the two behavioral measures was also found: as perceived task demands increased and task performance scores decreased, pupils dilated more. fNIRS measures (cerebral oxygenation) in the left AC and LFC were both negatively correlated with self-reported task difficulty and positively correlated with task performance scores. These results suggest that pupillometry measures can index task demands and listening effort, whereas fNIRS measures using a similar paradigm seem to reflect speech processing, but not effort.
Affiliation(s)
- Xin Zhou
- Waisman Center, University of Wisconsin Madison, WI, USA
- Emily Burg
- Waisman Center, University of Wisconsin Madison, WI, USA
- Department of Communication Science and Disorders, University of Wisconsin Madison, WI, USA
- Alan Kan
- School of Engineering, Macquarie University, Sydney, NSW, Australia
- Ruth Y Litovsky
- Waisman Center, University of Wisconsin Madison, WI, USA
- Department of Communication Science and Disorders, University of Wisconsin Madison, WI, USA
3
Suppanen E, Winkler I, Kujala T, Ylinen S. More efficient formation of longer-term representations for word forms at birth can be linked to better language skills at 2 years. Dev Cogn Neurosci 2022; 55:101113. PMID: 35605476; PMCID: PMC9130088; DOI: 10.1016/j.dcn.2022.101113
Abstract
Infants are able to extract words from speech early in life. Here we show that the quality of forming longer-term representations for word forms at birth predicts expressive language ability at the age of two years. Seventy-five neonates were familiarized with two spoken disyllabic pseudowords. We then tested whether the neonate brain predicts the second syllable from the first one by presenting a familiarized pseudoword frequently, and occasionally violating the learned syllable combination with different rare pseudowords. Distinct brain responses were elicited by predicted and unpredicted word endings, suggesting that the neonates had learned the familiarized pseudowords. The difference between responses to predicted and unpredicted pseudowords, which indexes the quality of word-form learning during familiarization, significantly correlated with expressive language scores (the mean length of utterance) at 24 months in the same infants. These findings suggest that 1) neonates can memorize disyllabic words so that a learned first syllable generates predictions for the word ending, and 2) early individual differences in the quality of word-form learning correlate with later language skills. This relationship may aid early identification of infants at risk for language impairment.
Affiliation(s)
- Emma Suppanen
- Cognitive Brain Research Unit, Department of Psychology and Logopedics, Faculty of Medicine, University of Helsinki, Helsinki, Finland.
- István Winkler
- Institute of Cognitive Neuroscience and Psychology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Hungary
- Teija Kujala
- Cognitive Brain Research Unit, Department of Psychology and Logopedics, Faculty of Medicine, University of Helsinki, Helsinki, Finland
- Sari Ylinen
- Cognitive Brain Research Unit, Department of Psychology and Logopedics, Faculty of Medicine, University of Helsinki, Helsinki, Finland; Logopedics, Welfare Sciences, Faculty of Social Sciences, Tampere University, Finland
4
Irsik VC, Johnsrude IS, Herrmann B. Neural Activity during Story Listening Is Synchronized across Individuals Despite Acoustic Masking. J Cogn Neurosci 2022; 34:933-950. PMID: 35258555; DOI: 10.1162/jocn_a_01842
Abstract
Older people with hearing problems often experience difficulties understanding speech in the presence of background sound. As a result, they may disengage in social situations, which has been associated with negative psychosocial health outcomes. Measuring listening (dis)engagement during challenging listening situations has received little attention thus far. We recruited young, normal-hearing human adults (both sexes) and investigated how speech intelligibility and engagement during naturalistic story listening are affected by the level of acoustic masking (12-talker babble) at different signal-to-noise ratios (SNRs). In Experiment 1, we observed that word-report scores were above 80% for all but the lowest SNR tested (-3 dB SNR), at which performance dropped to 54%. In Experiment 2, we calculated intersubject correlation (ISC) using EEG data to identify dynamic spatial patterns of shared neural activity evoked by the stories. ISC has been used as a neural measure of participants' engagement with naturalistic materials. Our results show that ISC was stable across all but the lowest SNRs, despite reduced speech intelligibility. Comparing ISC and intelligibility demonstrated that word-report performance declined more strongly with decreasing SNR than ISC did. Our measure of neural engagement suggests that individuals remain engaged in story listening despite missing words because of background noise. Our work provides a potentially fruitful approach to investigating listener engagement with naturalistic, spoken stories, which may be used to study (dis)engagement in older adults with hearing impairment.
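The ISC measure in the abstract above can be illustrated with a leave-one-out correlation: each subject's response is correlated with the average response of all other subjects, and the results are averaged. The sketch below is a simplified pure-Python version for single-channel time courses; the study itself derived spatial components from multichannel EEG, and `subjects` and its values here are hypothetical.

```python
from statistics import mean

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den

def leave_one_out_isc(subjects):
    """subjects: list of equal-length time courses, one per subject.
    Returns the mean leave-one-out intersubject correlation: each
    subject's time course is correlated with the average of the rest."""
    iscs = []
    for i, ts in enumerate(subjects):
        others = [s for j, s in enumerate(subjects) if j != i]
        avg = [mean(col) for col in zip(*others)]  # sample-wise group average
        iscs.append(pearson(ts, avg))
    return mean(iscs)
```

With identical time courses across subjects the measure is 1.0; as shared stimulus-driven activity weakens relative to idiosyncratic noise, ISC drops toward 0.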
Affiliation(s)
- Björn Herrmann
- The University of Western Ontario; Rotman Research Institute, Toronto, ON, Canada; University of Toronto
5
Corcoran AW, Perera R, Koroma M, Kouider S, Hohwy J, Andrillon T. Expectations boost the reconstruction of auditory features from electrophysiological responses to noisy speech. Cereb Cortex 2022; 33:691-708. PMID: 35253871; PMCID: PMC9890472; DOI: 10.1093/cercor/bhac094
Abstract
Online speech processing imposes significant computational demands on the listening brain, the underlying mechanisms of which remain poorly understood. Here, we exploit the perceptual "pop-out" phenomenon (i.e. the dramatic improvement of speech intelligibility after receiving information about speech content) to investigate the neurophysiological effects of prior expectations on degraded speech comprehension. We recorded electroencephalography (EEG) and pupillometry from 21 adults while they rated the clarity of noise-vocoded and sine-wave synthesized sentences. Pop-out was reliably elicited following visual presentation of the corresponding written sentence, but not following incongruent or neutral text. Pop-out was associated with improved reconstruction of the acoustic stimulus envelope from low-frequency EEG activity, implying that improvements in perceptual clarity were mediated via top-down signals that enhanced the quality of cortical speech representations. Spectral analysis further revealed that pop-out was accompanied by a reduction in theta-band power, consistent with predictive coding accounts of acoustic filling-in and incremental sentence processing. Moreover, delta-band power, alpha-band power, and pupil diameter were all increased following the provision of any written sentence information, irrespective of content. Together, these findings reveal distinctive profiles of neurophysiological activity that differentiate the content-specific processes associated with degraded speech comprehension from the context-specific processes invoked under adverse listening conditions.
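Envelope reconstruction of the kind described above is commonly implemented as a regularized linear backward model mapping neural channels to the stimulus envelope. The sketch below is not the authors' pipeline: it shows only the closed-form ridge solution on a single time lag, in pure Python, with toy data; real decoders add multiple time lags, band-pass filtering, and cross-validated regularization.

```python
def ridge_decoder(X, y, lam=1.0):
    """Fit weights w minimizing ||Xw - y||^2 + lam * ||w||^2.
    Rows of X are neural samples (one value per channel); y is the
    stimulus envelope at the same time points."""
    n_feat = len(X[0])
    # Normal equations: A = X^T X + lam * I, b = X^T y
    A = [[sum(X[t][i] * X[t][j] for t in range(len(X))) + (lam if i == j else 0.0)
          for j in range(n_feat)] for i in range(n_feat)]
    b = [sum(X[t][i] * y[t] for t in range(len(X))) for i in range(n_feat)]
    # Solve A w = b by Gaussian elimination with partial pivoting
    for col in range(n_feat):
        piv = max(range(col, n_feat), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, n_feat):
            f = A[r][col] / A[col][col]
            for c in range(col, n_feat):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    w = [0.0] * n_feat
    for i in reversed(range(n_feat)):
        w[i] = (b[i] - sum(A[i][j] * w[j] for j in range(i + 1, n_feat))) / A[i][i]
    return w

def predict(X, w):
    """Reconstruct the envelope from neural data with fitted weights."""
    return [sum(xi * wi for xi, wi in zip(row, w)) for row in X]
```

Reconstruction accuracy is then typically scored as the correlation between the predicted and actual envelope on held-out data; larger `lam` shrinks the weights toward zero.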
Affiliation(s)
- Andrew W Corcoran
- Corresponding author: Room E672, 20 Chancellors Walk, Clayton, VIC 3800, Australia.
- Ricardo Perera
- Cognition & Philosophy Laboratory, School of Philosophical, Historical, and International Studies, Monash University, Melbourne, VIC 3800, Australia
- Matthieu Koroma
- Brain and Consciousness Group (ENS, EHESS, CNRS), Département d’Études Cognitives, École Normale Supérieure-PSL Research University, Paris 75005, France
- Sid Kouider
- Brain and Consciousness Group (ENS, EHESS, CNRS), Département d’Études Cognitives, École Normale Supérieure-PSL Research University, Paris 75005, France
- Jakob Hohwy
- Cognition & Philosophy Laboratory, School of Philosophical, Historical, and International Studies, Monash University, Melbourne, VIC 3800, Australia; Monash Centre for Consciousness & Contemplative Studies, Monash University, Melbourne, VIC 3800, Australia
- Thomas Andrillon
- Monash Centre for Consciousness & Contemplative Studies, Monash University, Melbourne, VIC 3800, Australia; Paris Brain Institute, Sorbonne Université, Inserm-CNRS, Paris 75013, France
6
Al-Zubaidi A, Bräuer S, Holdgraf CR, Schepers IM, Rieger JW. OUP accepted manuscript. Cereb Cortex Commun 2022; 3:tgac007. PMID: 35281216; PMCID: PMC8914075; DOI: 10.1093/texcom/tgac007
Affiliation(s)
- Arkan Al-Zubaidi
- Applied Neurocognitive Psychology Lab and Cluster of Excellence Hearing4all, Oldenburg University, Oldenburg, Germany
- Research Center Neurosensory Science, Oldenburg University, 26129 Oldenburg, Germany
- Susann Bräuer
- Applied Neurocognitive Psychology Lab and Cluster of Excellence Hearing4all, Oldenburg University, Oldenburg, Germany
- Chris R Holdgraf
- Department of Statistics, UC Berkeley, Berkeley, CA 94720, USA
- International Interactive Computing Collaboration
- Inga M Schepers
- Applied Neurocognitive Psychology Lab and Cluster of Excellence Hearing4all, Oldenburg University, Oldenburg, Germany
- Jochem W Rieger
- Corresponding author: Department of Psychology, Faculty VI, Oldenburg University, 26129 Oldenburg, Germany.
7
Zhang M, Alamatsaz N, Ihlefeld A. Hemodynamic Responses Link Individual Differences in Informational Masking to the Vicinity of Superior Temporal Gyrus. Front Neurosci 2021; 15:675326. PMID: 34366772; PMCID: PMC8339305; DOI: 10.3389/fnins.2021.675326
Abstract
Suppressing unwanted background sound is crucial for aural communication. A particularly disruptive type of background sound, informational masking (IM), often interferes in social settings. However, IM mechanisms are incompletely understood. At present, IM is identified operationally: a target should be audible, based on suprathreshold target/masker energy ratios, yet cannot be heard because target-like background sound interferes. Here, we confirm that speech identification thresholds differ dramatically between low- vs. high-IM background sound. However, speech detection thresholds are comparable across the two conditions. Moreover, functional near-infrared spectroscopy recordings show that task-evoked blood oxygenation changes near the superior temporal gyrus (STG) covary with behavioral speech detection performance for high-IM but not low-IM background sound, suggesting that the STG is part of an IM-dependent network. Furthermore, listeners who are more vulnerable to IM show increased hemodynamic recruitment near the STG, an effect that cannot be explained by differences in task difficulty across low- vs. high-IM conditions. In contrast, task-evoked responses near another auditory region of cortex, the caudal inferior frontal sulcus (cIFS), do not predict behavioral sensitivity, suggesting that the cIFS belongs to an IM-independent network. Results are consistent with the idea that cortical gating shapes individual vulnerability to IM.
Affiliation(s)
- Min Zhang
- Department of Biomedical Engineering, New Jersey Institute of Technology, Newark, NJ, United States
- Rutgers Biomedical and Health Sciences, Rutgers University, Newark, NJ, United States
- Nima Alamatsaz
- Department of Biomedical Engineering, New Jersey Institute of Technology, Newark, NJ, United States
- Rutgers Biomedical and Health Sciences, Rutgers University, Newark, NJ, United States
- Antje Ihlefeld
- Department of Biomedical Engineering, New Jersey Institute of Technology, Newark, NJ, United States
8
Text Captioning Buffers Against the Effects of Background Noise and Hearing Loss on Memory for Speech. Ear Hear 2021; 43:115-127. PMID: 34260436; DOI: 10.1097/aud.0000000000001079
Abstract
OBJECTIVE: Everyday speech understanding frequently occurs in perceptually demanding environments, for example, due to background noise and normal age-related hearing loss. The resulting degraded speech signals increase listening effort, which gives rise to negative downstream effects on subsequent memory and comprehension, even when speech is intelligible. In two experiments, we explored whether the presentation of realistic assistive text-captioned speech offsets the negative effects of background noise and hearing impairment on multiple measures of speech memory. DESIGN: In Experiment 1, young normal-hearing adults (N = 48) listened to sentences for immediate recall and delayed recognition memory. Speech was presented in quiet or in two levels of background noise. Sentences were presented either as speech only or as text-captioned speech. Thus, the experiment followed a 2 (caption vs. no caption) × 3 (no noise, +7 dB signal-to-noise ratio, +3 dB signal-to-noise ratio) within-subjects design. In Experiment 2, a group of older adults (age range: 61 to 80, N = 31) with varying levels of hearing acuity completed the same experimental task as in Experiment 1. For both experiments, immediate recall, recognition memory accuracy, and recognition memory confidence were analyzed via general(ized) linear mixed-effects models. In addition, we examined individual differences as a function of hearing acuity in Experiment 2. RESULTS: In Experiment 1, we found that the presentation of realistic text-captioned speech to young normal-hearing listeners improved immediate recall and delayed recognition memory accuracy and confidence compared with speech alone. Moreover, text captions attenuated the negative effects of background noise on all speech memory outcomes. In Experiment 2, we replicated the same pattern of results in a sample of older adults with varying levels of hearing acuity. Moreover, we showed that the negative effects of hearing loss on speech memory in older adulthood were attenuated by the presentation of text captions. CONCLUSIONS: Collectively, these findings strongly suggest that the simultaneous presentation of text can offset the negative effects of effortful listening on speech memory. Critically, captioning benefits extended from immediate word recall to long-term sentence recognition memory, a benefit observed not only for older adults with hearing loss but also for young normal-hearing listeners. These findings suggest that the text captioning benefit to memory is robust and has potentially wide applications for supporting speech listening in acoustically challenging environments.
9
Prince P, Paul BT, Chen J, Le T, Lin V, Dimitrijevic A. Neural correlates of visual stimulus encoding and verbal working memory differ between cochlear implant users and normal-hearing controls. Eur J Neurosci 2021; 54:5016-5037. PMID: 34146363; PMCID: PMC8457219; DOI: 10.1111/ejn.15365
Abstract
A common concern for individuals with severe-to-profound hearing loss fitted with cochlear implants (CIs) is difficulty following conversations in noisy environments. Recent work has suggested that these difficulties are related to individual differences in brain function, including verbal working memory and the degree of cross-modal reorganization of auditory areas for visual processing. However, the neural basis for these relationships is not fully understood. Here, we investigated neural correlates of visual verbal working memory and sensory plasticity in 14 CI users and age-matched normal-hearing (NH) controls. While we recorded the high-density electroencephalogram (EEG), participants completed a modified Sternberg visual working memory task in which sets of letters and numbers were presented visually and then recalled at a later time. Results suggested that CI users had behavioural working memory performance comparable with that of NH controls. However, CI users had more pronounced neural activity during visual stimulus encoding, including stronger visual-evoked activity in auditory and visual cortices, larger modulations of neural oscillations and increased frontotemporal connectivity. In contrast, during memory retention of the characters, CI users had descriptively weaker neural oscillations and significantly lower frontotemporal connectivity. We interpret the differences in neural correlates of visual stimulus processing in CI users through the lens of cross-modal and intramodal plasticity.
Affiliation(s)
- Priyanka Prince
- Evaluative Clinical Sciences Platform, Sunnybrook Research Institute, Toronto, Ontario, Canada; Department of Physiology, University of Toronto, Toronto, Ontario, Canada
- Brandon T Paul
- Evaluative Clinical Sciences Platform, Sunnybrook Research Institute, Toronto, Ontario, Canada; Otolaryngology-Head and Neck Surgery, Sunnybrook Health Sciences Centre, Toronto, Ontario, Canada; Department of Psychology, Ryerson University, Toronto, Ontario, Canada
- Joseph Chen
- Otolaryngology-Head and Neck Surgery, Sunnybrook Health Sciences Centre, Toronto, Ontario, Canada; Faculty of Medicine, Otolaryngology-Head and Neck Surgery, University of Toronto, Toronto, Ontario, Canada
- Trung Le
- Otolaryngology-Head and Neck Surgery, Sunnybrook Health Sciences Centre, Toronto, Ontario, Canada; Faculty of Medicine, Otolaryngology-Head and Neck Surgery, University of Toronto, Toronto, Ontario, Canada
- Vincent Lin
- Otolaryngology-Head and Neck Surgery, Sunnybrook Health Sciences Centre, Toronto, Ontario, Canada; Faculty of Medicine, Otolaryngology-Head and Neck Surgery, University of Toronto, Toronto, Ontario, Canada
- Andrew Dimitrijevic
- Evaluative Clinical Sciences Platform, Sunnybrook Research Institute, Toronto, Ontario, Canada; Department of Physiology, University of Toronto, Toronto, Ontario, Canada; Otolaryngology-Head and Neck Surgery, Sunnybrook Health Sciences Centre, Toronto, Ontario, Canada; Faculty of Medicine, Otolaryngology-Head and Neck Surgery, University of Toronto, Toronto, Ontario, Canada
10
Wang J, Chen J, Yang X, Liu L, Wu C, Lu L, Li L, Wu Y. Common Brain Substrates Underlying Auditory Speech Priming and Perceived Spatial Separation. Front Neurosci 2021; 15:664985. PMID: 34220425; PMCID: PMC8247760; DOI: 10.3389/fnins.2021.664985
Abstract
Under a “cocktail party” environment, listeners can utilize prior knowledge of the content and voice of the target speech [i.e., auditory speech priming (ASP)] and perceived spatial separation to improve recognition of the target speech among masking speech. Previous studies suggest that these two unmasking cues are not processed independently. However, it is unclear whether the unmasking effects of these two cues are supported by common neural bases. In the current study, we aimed to first confirm that ASP and perceived spatial separation contribute interactively to the improvement of speech recognition in a multitalker condition, and then to investigate whether overlapping brain substrates underlie both unmasking effects, by introducing the two unmasking cues in a unified paradigm and using functional magnetic resonance imaging. The results showed that neural activations by the unmasking effects of ASP and perceived separation partly overlapped in brain areas: the pars triangularis (TriIFG) and pars orbitalis of the left inferior frontal gyrus, the left inferior parietal lobule, the left supramarginal gyrus, and the bilateral putamen, all of which are involved in sensorimotor integration and speech production. The activations of the left TriIFG were correlated with behavioral improvements caused by ASP and perceived separation. Meanwhile, ASP and perceived separation also enhanced the functional connectivity between the left IFG and brain areas related to the suppression of distracting speech signals: the anterior cingulate cortex and the left middle frontal gyrus, respectively. These findings therefore suggest that the motor representation of speech is important for the unmasking effects of both ASP and perceived separation, and highlight the critical role of the left IFG in these unmasking effects in “cocktail party” environments.
Affiliation(s)
- Junxian Wang
- School of Psychological and Cognitive Sciences and Beijing Key Laboratory of Behavior and Mental Health, Peking University, Beijing, China
- Jing Chen
- Department of Machine Intelligence, Peking University, Beijing, China; Speech and Hearing Research Center, Key Laboratory on Machine Perception (Ministry of Education), Peking University, Beijing, China
- Xiaodong Yang
- School of Psychological and Cognitive Sciences and Beijing Key Laboratory of Behavior and Mental Health, Peking University, Beijing, China
- Lei Liu
- School of Psychological and Cognitive Sciences and Beijing Key Laboratory of Behavior and Mental Health, Peking University, Beijing, China
- Chao Wu
- School of Nursing, Peking University, Beijing, China
- Lingxi Lu
- Center for the Cognitive Science of Language, Beijing Language and Culture University, Beijing, China
- Liang Li
- School of Psychological and Cognitive Sciences and Beijing Key Laboratory of Behavior and Mental Health, Peking University, Beijing, China; Speech and Hearing Research Center, Key Laboratory on Machine Perception (Ministry of Education), Peking University, Beijing, China; Beijing Institute for Brain Disorders, Beijing, China
- Yanhong Wu
- School of Psychological and Cognitive Sciences and Beijing Key Laboratory of Behavior and Mental Health, Peking University, Beijing, China; Speech and Hearing Research Center, Key Laboratory on Machine Perception (Ministry of Education), Peking University, Beijing, China
11
Holmes E, Johnsrude IS. Speech-evoked brain activity is more robust to competing speech when it is spoken by someone familiar. Neuroimage 2021; 237:118107. PMID: 33933598; DOI: 10.1016/j.neuroimage.2021.118107
Abstract
When speech is masked by competing sound, people are better at understanding what is said when the talker is familiar than when the talker is unfamiliar. The benefit is robust, but how does processing of familiar voices facilitate intelligibility? We combined high-resolution fMRI with representational similarity analysis to quantify the difference in distributed activity between clear and masked speech. We demonstrate that brain representations of spoken sentences are less affected by a competing sentence when they are spoken by a friend or partner than by someone unfamiliar, effectively showing a cortical signal-to-noise ratio (SNR) enhancement for familiar voices. This effect correlated with the familiar-voice intelligibility benefit. We functionally parcellated auditory cortex and found that the most prominent familiar-voice advantage was manifest along the posterior superior and middle temporal gyri. Overall, our results demonstrate that experience-driven improvements in intelligibility are associated with enhanced multivariate pattern activity in posterior temporal cortex.
Affiliation(s)
- Emma Holmes
- The Brain and Mind Institute, University of Western Ontario, London, Ontario, N6A 3K7, Canada.
- Ingrid S Johnsrude
- The Brain and Mind Institute, University of Western Ontario, London, Ontario, N6A 3K7, Canada; School of Communication Sciences and Disorders, University of Western Ontario, London, Ontario, N6G 1H1, Canada
12
Murai SA, Riquimaroux H. Neural correlates of subjective comprehension of noise-vocoded speech. Hear Res 2021; 405:108249. PMID: 33894680; DOI: 10.1016/j.heares.2021.108249
Abstract
Under an acoustically degraded condition, the degree of speech comprehension fluctuates within individuals. Understanding the relationship between such fluctuations in comprehension and neural responses might reveal perceptual processing for distorted speech. In this study we investigated the cerebral activity associated with the degree of subjective comprehension of noise-vocoded speech sounds (NVSS) using functional magnetic resonance imaging. Our results indicate that higher comprehension of NVSS sentences was associated with greater activation in the right superior temporal cortex, and that activity in the left inferior frontal gyrus (Broca's area) was increased when a listener recognized words in a sentence they did not fully comprehend. In addition, results of laterality analysis demonstrated that recognition of words in an NVSS sentence led to less lateralized responses in the temporal cortex, though a left-lateralization was observed when no words were recognized. The data suggest that variation in comprehension within individuals can be associated with changes in lateralization in the temporal auditory cortex.
Affiliation(s)
- Shota A Murai
- Faculty of Life and Medical Sciences, Doshisha University, 1-3 Miyakodani, Tatara, Kyotanabe 610-0321, Kyoto, Japan
- Hiroshi Riquimaroux
- Faculty of Life and Medical Sciences, Doshisha University, 1-3 Miyakodani, Tatara, Kyotanabe 610-0321, Kyoto, Japan
13
Urbschat A, Uppenkamp S, Anemüller J. Searchlight Classification Informative Region Mixture Model (SCIM): Identification of Cortical Regions Showing Discriminable BOLD Patterns in Event-Related Auditory fMRI Data. Front Neurosci 2021; 14:616906. PMID: 33597841; PMCID: PMC7882477; DOI: 10.3389/fnins.2020.616906
Abstract
The investigation of abstract cognitive tasks, e.g., semantic processing of speech, requires the simultaneous use of a carefully selected stimulus design and sensitive analysis tools for the corresponding neural activity that are comparable across different studies investigating similar research questions. Multi-voxel pattern analysis (MVPA) methods are commonly used in neuroimaging to investigate BOLD responses corresponding to neural activation associated with specific cognitive tasks. Regions of significant activation are identified by a thresholding operation during multivariate pattern analysis, the results of which are sensitive to the chosen threshold value. Investigating analysis approaches that are largely robust with respect to thresholding is thus an important goal pursued here. The present paper contributes a novel statistical analysis method for fMRI experiments, the searchlight classification informative region mixture model (SCIM), which is based on the assumption that the whole brain volume can be subdivided into two groups of voxels: spatial positions around which recorded BOLD activity conveys information about the present stimulus condition, and those around which it does not. A generative statistical model is proposed that assigns a probability of being informative to each position in the brain, based on a combination of a support vector machine searchlight analysis and Gaussian mixture models. Results from an auditory fMRI study investigating cortical regions engaged in the semantic processing of speech indicate that the SCIM method identifies physiologically plausible brain regions as informative, similar to the two standard reference methods we compare against, with two important differences. SCIM-identified regions are very robust to the choice of the significance threshold, i.e., less “noisy,” in contrast to, e.g., the binomial test, whose results in the present experiment depend strongly on the chosen significance threshold, or random permutation tests, which additionally incur very high computational costs. In group analyses, the SCIM method identifies a physiologically plausible prefrontal region, the anterior cingulate sulcus, as involved in semantic processing, which the other methods identify only in single-subject analyses.
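As a toy illustration of the mixture-model idea (not the authors' full pipeline, which couples an SVM searchlight with a generative spatial model), one can fit a two-component Gaussian mixture to searchlight decoding accuracies and read off each position's posterior probability of belonging to the higher-accuracy, informative component. All numbers below are simulated, and scikit-learn's `GaussianMixture` stands in for the paper's model:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Simulated searchlight accuracies for 5000 voxel positions: 90% of voxels
# are uninformative (accuracy near chance, 0.5), 10% are informative.
acc = np.concatenate([
    rng.normal(0.50, 0.03, 4500),   # non-informative component
    rng.normal(0.65, 0.04, 500),    # informative component
])

# Two-component Gaussian mixture over the accuracy map; the posterior of the
# higher-mean component is each voxel's probability of being informative.
gmm = GaussianMixture(n_components=2, random_state=0).fit(acc.reshape(-1, 1))
informative = int(np.argmax(gmm.means_.ravel()))
p_informative = gmm.predict_proba(acc.reshape(-1, 1))[:, informative]

# Voxels drawn from the informative component should receive high posteriors,
# voxels drawn from the chance component low ones.
mean_p_info = p_informative[4500:].mean()
mean_p_noise = p_informative[:4500].mean()
```

Because the output is a graded posterior rather than a thresholded map, downstream conclusions depend far less on any single cut-off, which is the robustness property the abstract emphasizes.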
Affiliation(s)
- Annika Urbschat, Department of Medical Physics and Acoustics, Carl von Ossietzky Universität Oldenburg, Oldenburg, Germany
- Stefan Uppenkamp, Department of Medical Physics and Acoustics, Carl von Ossietzky Universität Oldenburg, Oldenburg, Germany
- Jörn Anemüller, Department of Medical Physics and Acoustics, Carl von Ossietzky Universität Oldenburg, Oldenburg, Germany

14
Kajiura M, Jeong H, Kawata NYS, Yu S, Kinoshita T, Kawashima R, Sugiura M. Brain activity predicts future learning success in intensive second language listening training. BRAIN AND LANGUAGE 2021; 212:104839. [PMID: 33271393 DOI: 10.1016/j.bandl.2020.104839] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 10/30/2019] [Revised: 06/03/2020] [Accepted: 07/14/2020] [Indexed: 06/12/2023]
Abstract
This study explores the neural mechanisms underlying how prior knowledge gained from pre-listening transcript reading helps listeners comprehend fast-rate speech in a second language (L2), and how this applies to L2 learning. Top-down predictive processing based on prior knowledge may play an important role in L2 speech comprehension and in improving listening skills. By manipulating the pre-listening transcript condition (pre-listening transcript reading [TR] vs. no transcript reading [NTR]) and the language (first language [L1] vs. L2), we measured brain activity in L2 learners who performed fast-rate listening comprehension tasks during functional magnetic resonance imaging. Thereafter, we examined whether TR_L2-specific brain activity can predict individual learning success after intensive listening training. The left angular and superior temporal gyri were key areas responsible for integrating prior knowledge with sensory input. Activity in these areas correlated significantly with gain scores from subsequent training, indicating that brain activity related to the integration of prior knowledge with sensory input predicts future learning success.
Affiliation(s)
- Mayumi Kajiura, Division of Foreign Language Education, Aichi Shukutoku University, Nagoya, Japan
- Hyeonjeong Jeong, Graduate School of International Cultural Studies, Tohoku University, Sendai, Japan; Institute of Development, Aging and Cancer, Tohoku University, Sendai, Japan
- Natasha Y S Kawata, Institute of Development, Aging and Cancer, Tohoku University, Sendai, Japan
- Shaoyun Yu, Graduate School of Humanities, Nagoya University, Nagoya, Japan
- Toru Kinoshita, Graduate School of Humanities, Nagoya University, Nagoya, Japan
- Ryuta Kawashima, Institute of Development, Aging and Cancer, Tohoku University, Sendai, Japan
- Motoaki Sugiura, Institute of Development, Aging and Cancer, Tohoku University, Sendai, Japan; International Research Institute for Disaster Science, Tohoku University, Sendai, Japan

15
Holmes E, Zeidman P, Friston KJ, Griffiths TD. Difficulties with Speech-in-Noise Perception Related to Fundamental Grouping Processes in Auditory Cortex. Cereb Cortex 2020; 31:1582-1596. [PMID: 33136138 PMCID: PMC7869094 DOI: 10.1093/cercor/bhaa311] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/11/2020] [Revised: 08/04/2020] [Accepted: 09/22/2020] [Indexed: 01/05/2023] Open
Abstract
In our everyday lives, we are often required to follow a conversation when background noise is present (“speech-in-noise” [SPIN] perception). SPIN perception varies widely—and people who are worse at SPIN perception are also worse at fundamental auditory grouping, as assessed by figure-ground tasks. Here, we examined the cortical processes that link difficulties with SPIN perception to difficulties with figure-ground perception using functional magnetic resonance imaging. We found strong evidence that the earliest stages of the auditory cortical hierarchy (left core and belt areas) are similarly disinhibited when SPIN and figure-ground tasks are more difficult (i.e., at target-to-masker ratios corresponding to 60% rather than 90% performance)—consistent with increased cortical gain at lower levels of the auditory hierarchy. Overall, our results reveal a common neural substrate for these basic (figure-ground) and naturally relevant (SPIN) tasks—which provides a common computational basis for the link between SPIN perception and fundamental auditory grouping.
Affiliation(s)
- Emma Holmes, Wellcome Centre for Human Neuroimaging, UCL Queen Square Institute of Neurology, UCL, London WC1N 3AR, UK
- Peter Zeidman, Wellcome Centre for Human Neuroimaging, UCL Queen Square Institute of Neurology, UCL, London WC1N 3AR, UK
- Karl J Friston, Wellcome Centre for Human Neuroimaging, UCL Queen Square Institute of Neurology, UCL, London WC1N 3AR, UK
- Timothy D Griffiths, Wellcome Centre for Human Neuroimaging, UCL Queen Square Institute of Neurology, UCL, London WC1N 3AR, UK; Biosciences Institute, Faculty of Medical Sciences, Newcastle University, Newcastle upon Tyne NE2 4HH, UK

16
Signoret C, Andersen LM, Dahlström Ö, Blomberg R, Lundqvist D, Rudner M, Rönnberg J. The Influence of Form- and Meaning-Based Predictions on Cortical Speech Processing Under Challenging Listening Conditions: A MEG Study. Front Neurosci 2020; 14:573254. [PMID: 33100961 PMCID: PMC7546411 DOI: 10.3389/fnins.2020.573254] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/16/2020] [Accepted: 09/01/2020] [Indexed: 01/07/2023] Open
Abstract
Under adverse listening conditions, prior linguistic knowledge about the form (i.e., phonology) and meaning (i.e., semantics) of speech helps us to predict what an interlocutor is about to say. Previous research has shown that accurate predictions of incoming speech increase speech intelligibility, and that semantic predictions enhance the perceptual clarity of degraded speech even when exact phonological predictions are possible. In addition, working memory (WM) is thought to have a specific influence on anticipatory mechanisms by actively maintaining and updating the relevance of predicted vs. unpredicted speech inputs. However, the relative impact on speech processing of deviations from expectations related to form and meaning is incompletely understood. Here, we use MEG to investigate the cortical temporal processing of deviations from the expected form and meaning of final words during sentence processing. Our overall aim was to observe how deviations from the expected form and meaning modulate cortical speech processing under adverse listening conditions and to investigate the degree to which this is associated with WM capacity. Results indicated that different types of deviations are processed differently in the auditory N400 and Mismatch Negativity (MMN) components. In particular, the MMN was sensitive to the type of deviation (form or meaning), whereas the N400 was sensitive to the magnitude of the deviation rather than its type. WM capacity was associated with the ability to process incoming phonological information and with semantic integration.
Affiliation(s)
- Carine Signoret, Linnaeus Centre HEAD, Swedish Institute for Disability Research, Department of Behavioural Sciences and Learning, Linköping University, Linköping, Sweden
- Lau M Andersen, The National Research Facility for Magnetoencephalography, Department of Clinical Neuroscience, Karolinska Institutet, Solna, Sweden; Center of Functionally Integrative Neuroscience, Institute of Clinical Medicine, Aarhus University, Aarhus, Denmark
- Örjan Dahlström, Linnaeus Centre HEAD, Swedish Institute for Disability Research, Department of Behavioural Sciences and Learning, Linköping University, Linköping, Sweden
- Rina Blomberg, Linnaeus Centre HEAD, Swedish Institute for Disability Research, Department of Behavioural Sciences and Learning, Linköping University, Linköping, Sweden
- Daniel Lundqvist, The National Research Facility for Magnetoencephalography, Department of Clinical Neuroscience, Karolinska Institutet, Solna, Sweden
- Mary Rudner, Linnaeus Centre HEAD, Swedish Institute for Disability Research, Department of Behavioural Sciences and Learning, Linköping University, Linköping, Sweden
- Jerker Rönnberg, Linnaeus Centre HEAD, Swedish Institute for Disability Research, Department of Behavioural Sciences and Learning, Linköping University, Linköping, Sweden

17
Banellis L, Sokoliuk R, Wild CJ, Bowman H, Cruse D. Event-related potentials reflect prediction errors and pop-out during comprehension of degraded speech. Neurosci Conscious 2020; 2020:niaa022. [PMID: 33133640 PMCID: PMC7585676 DOI: 10.1093/nc/niaa022] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/24/2020] [Revised: 07/08/2020] [Accepted: 08/06/2020] [Indexed: 11/20/2022] Open
Abstract
Comprehension of degraded speech requires higher-order expectations informed by prior knowledge. Accurate top-down expectations of incoming degraded speech cause a subjective semantic 'pop-out' or conscious breakthrough experience. Indeed, the same stimulus can be perceived as meaningless when no expectations are made in advance. We investigated the event-related potential (ERP) correlates of these top-down expectations, their error signals and the subjective pop-out experience in healthy participants. We manipulated expectations in a word-pair priming degraded (noise-vocoded) speech task and investigated the role of top-down expectation with a between-groups attention manipulation. Consistent with the role of expectations in comprehension, repetition priming significantly enhanced perceptual intelligibility of the noise-vocoded degraded targets for attentive participants. An early ERP was larger for mismatched (i.e. unexpected) targets than matched targets, indicative of an initial error signal not reliant on top-down expectations. Subsequently, a P3a-like ERP was larger to matched targets than mismatched targets only for attending participants (i.e. a pop-out effect), while a later ERP was larger for mismatched targets and did not significantly interact with attention. Rather than relying on complex post hoc interactions between prediction error and precision to explain this apredictive pattern, we consider our data to be consistent with prediction error minimization accounts for early stages of processing followed by Global Neuronal Workspace-like breakthrough and processing in service of task goals.
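Noise vocoding, the degradation used in this and several of the other studies listed here, is a standard signal-processing technique: the signal is split into frequency bands, and each band's amplitude envelope is extracted and used to modulate band-limited noise. A minimal sketch on a synthetic signal, assuming log-spaced bands and a Hilbert envelope (real vocoders typically also low-pass filter the envelope):

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def noise_vocode(signal, fs, n_bands=4, f_lo=100.0, f_hi=4000.0, seed=0):
    """Minimal noise vocoder: split the input into log-spaced frequency
    bands, extract each band's Hilbert envelope, and use it to modulate
    band-limited Gaussian noise."""
    rng = np.random.default_rng(seed)
    edges = np.geomspace(f_lo, f_hi, n_bands + 1)
    out = np.zeros(len(signal), dtype=float)
    for lo, hi in zip(edges[:-1], edges[1:]):
        sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
        band = sosfiltfilt(sos, signal)
        env = np.abs(hilbert(band))                  # amplitude envelope
        carrier = sosfiltfilt(sos, rng.standard_normal(len(signal)))
        out += env * carrier                         # envelope-modulated noise
    return out

fs = 16000
t = np.arange(fs) / fs
# Amplitude-modulated tone as a stand-in for a speech signal.
speechlike = np.sin(2 * np.pi * 440 * t) * (0.5 + 0.5 * np.sin(2 * np.pi * 3 * t))
vocoded = noise_vocode(speechlike, fs)
```

The vocoded output preserves the slow envelope cues that support intelligibility while discarding fine spectral detail, which is why comprehension then depends so heavily on top-down expectations.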
Affiliation(s)
- Leah Banellis, School of Psychology and Centre for Human Brain Health, University of Birmingham, Edgbaston B15 2TT, UK
- Rodika Sokoliuk, School of Psychology and Centre for Human Brain Health, University of Birmingham, Edgbaston B15 2TT, UK
- Conor J Wild, Brain and Mind Institute, University of Western Ontario, London, ON N6A 3K7, Canada
- Howard Bowman, School of Psychology and Centre for Human Brain Health, University of Birmingham, Edgbaston B15 2TT, UK; School of Computing, University of Kent, Canterbury, Kent CT2 7NF, UK
- Damian Cruse, School of Psychology and Centre for Human Brain Health, University of Birmingham, Edgbaston B15 2TT, UK

18
Abstract
OBJECTIVES Slowed speaking rate was examined for its effects on speech intelligibility, its interaction with the benefit of contextual cues, and the impact of these factors on listening effort in adults with cochlear implants. DESIGN Participants (n = 21 cochlear implant users) heard high- and low-context sentences that were played at the original speaking rate, as well as a slowed (1.4× duration) speaking rate, using uniform pitch-synchronous time warping. In addition to intelligibility measures, changes in pupil dilation were measured as a time-varying index of processing load or listening effort. Slope of pupil size recovery to baseline after the sentence was used as an index of resolution of perceptual ambiguity. RESULTS Speech intelligibility was better for high-context compared to low-context sentences and slightly better for slower compared to original-rate speech. Speech rate did not affect magnitude and latency of peak pupil dilation relative to sentence offset. However, baseline pupil size recovered more substantially for slower-rate sentences, suggesting easier processing in the moment after the sentence was over. The effect of slowing speech rate was comparable to changing a sentence from low context to high context. The effect of context on pupil dilation was not observed until after the sentence was over, and one of two analyses suggested that context had greater beneficial effects on listening effort when the speaking rate was slower. These patterns were maintained even at perfect sentence intelligibility, suggesting that correct speech repetition does not guarantee efficient or effortless processing. With slower speaking rates, there was less variability in pupil dilation slopes following the sentence, implying mitigation of some of the difficulties shown by individual listeners who would otherwise demonstrate prolonged effort after a sentence is heard.
CONCLUSIONS Slowed speaking rate provides release from listening effort when hearing an utterance, particularly relieving effort that would have lingered after a sentence is over. Context arguably provides even more release from listening effort when speaking rate is slower. The pattern of prolonged pupil dilation for faster speech is consistent with increased need to mentally correct errors, although that exact interpretation cannot be verified with intelligibility data alone or with pupil data alone. A pattern of needing to dwell on a sentence to disambiguate misperceptions likely contributes to difficulty in running conversation where there are few opportunities to pause and resolve recently heard utterances.
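The slope-of-recovery measure can be sketched as a straight-line fit to the post-sentence pupil trace; the traces and sampling rate below are hypothetical, chosen only to illustrate the contrast the abstract describes:

```python
import numpy as np

def recovery_slope(pupil, fs, window_s=1.0):
    """Slope (arbitrary units per second) of a straight-line fit to a pupil
    trace over the first `window_s` seconds after sentence offset; a more
    negative slope means a faster return of pupil size toward baseline."""
    n = int(window_s * fs)
    t = np.arange(n) / fs
    slope = np.polyfit(t, pupil[:n], 1)[0]
    return slope

fs = 60  # eye-tracker sampling rate in Hz (assumed, not from the study)
t = np.arange(2 * fs) / fs

# Hypothetical baseline-corrected post-sentence traces:
slow_rate = 1.0 - 0.8 * t   # slowed speech: substantial recovery to baseline
orig_rate = 1.0 - 0.2 * t   # original rate: dilation lingers after offset

slope_slow = recovery_slope(slow_rate, fs)
slope_orig = recovery_slope(orig_rate, fs)
```

A steeper (more negative) post-sentence slope corresponds to the "release from lingering effort" interpretation given for the slowed-rate condition.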
19
Wikman P, Sahari E, Salmela V, Leminen A, Leminen M, Laine M, Alho K. Breaking down the cocktail party: Attentional modulation of cerebral audiovisual speech processing. Neuroimage 2020; 224:117365. [PMID: 32941985 DOI: 10.1016/j.neuroimage.2020.117365] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/15/2020] [Revised: 08/19/2020] [Accepted: 09/07/2020] [Indexed: 12/20/2022] Open
Abstract
Recent studies utilizing electrophysiological speech envelope reconstruction have sparked renewed interest in the cocktail party effect by showing that auditory neurons entrain to selectively attended speech. Yet, the neural networks of attention to speech in naturalistic audiovisual settings with multiple sound sources remain poorly understood. We collected functional brain imaging data while participants viewed audiovisual video clips of lifelike dialogues with concurrent distracting speech in the background. Dialogues were presented in a full-factorial design, comprising task (listen to the dialogues vs. ignore them), audiovisual quality and semantic predictability. We used univariate analyses in combination with multivariate pattern analysis (MVPA) to study modulations of brain activity related to attentive processing of audiovisual speech. We found attentive speech processing to cause distinct spatiotemporal modulation profiles in distributed cortical areas including sensory and frontal-control networks. Semantic coherence modulated attention-related activation patterns in the earliest stages of auditory cortical processing, suggesting that the auditory cortex is involved in high-level speech processing. Our results corroborate views that emphasize the dynamic nature of attention, with task-specificity and context as cornerstones of the underlying neuro-cognitive mechanisms.
Affiliation(s)
- Patrik Wikman, Department of Psychology and Logopedics, University of Helsinki, Helsinki, Finland
- Elisa Sahari, Department of Psychology and Logopedics, University of Helsinki, Helsinki, Finland
- Viljami Salmela, Department of Psychology and Logopedics, University of Helsinki, Helsinki, Finland; Advanced Magnetic Imaging Centre, Aalto NeuroImaging, Aalto University, Espoo, Finland
- Alina Leminen, Department of Psychology and Logopedics, University of Helsinki, Helsinki, Finland; Department of Digital Humanities, University of Helsinki, Helsinki, Finland
- Miika Leminen, Department of Psychology and Logopedics, University of Helsinki, Helsinki, Finland; Department of Phoniatrics, Helsinki University Hospital, Helsinki, Finland
- Matti Laine, Department of Psychology, Åbo Akademi University, Turku, Finland
- Kimmo Alho, Department of Psychology and Logopedics, University of Helsinki, Helsinki, Finland; Advanced Magnetic Imaging Centre, Aalto NeuroImaging, Aalto University, Espoo, Finland

20
Lawless MS, Vigeant MC. Sensitivity of the human auditory cortex and reward network to reverberant musical stimuli. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2020; 147:2121. [PMID: 32359334 DOI: 10.1121/10.0000984] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 10/14/2019] [Accepted: 03/12/2020] [Indexed: 06/11/2023]
Abstract
A room's acoustics can alter subjective impressions of music, including preference. However, little research has characterized the brain's response to room conditions. Functional magnetic resonance imaging (fMRI) was used to investigate the auditory and reward responses to concert hall stimuli. Before the fMRI testing, 18 participants rated their preferences for a solo-instrumental passage and an orchestral motif simulated in eight room acoustic conditions outside an MRI scanner to identify their most liked and disliked conditions. In the MRI, the most-liked (reverberation time, RT = 1.0-2.8 s) and most-disliked (RT = 7.2 s) conditions, along with the anechoic and scrambled versions of the musical passages, were presented. The auditory cortex was found to be sensitive to the temporal coherence of the stimuli, as it exhibited stronger activations for simpler stimuli (the solo-instrumental and anechoic conditions) than for stimuli containing temporally incoherent auditory objects (the orchestral and reverberant conditions). In contrasts between liked and disliked reverberant stimuli, a reward response in the basal ganglia was detected in a region-of-interest analysis using a temporal derivative model of the hemodynamic response function. This response may indicate differences in preference between subtle variations in room acoustics applied to the same musical passage.
Affiliation(s)
- Martin S Lawless, Graduate Program in Acoustics, The Pennsylvania State University, 201 Applied Science Building, University Park, Pennsylvania 16802, USA
- Michelle C Vigeant, Graduate Program in Acoustics, The Pennsylvania State University, 201 Applied Science Building, University Park, Pennsylvania 16802, USA

21
Krishnan S, Lima CF, Evans S, Chen S, Guldner S, Yeff H, Manly T, Scott SK. Beatboxers and Guitarists Engage Sensorimotor Regions Selectively When Listening to the Instruments They can Play. Cereb Cortex 2019; 28:4063-4079. [PMID: 30169831 PMCID: PMC6188551 DOI: 10.1093/cercor/bhy208] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/16/2017] [Accepted: 08/04/2018] [Indexed: 12/31/2022] Open
Abstract
Studies of classical musicians have demonstrated that expertise modulates neural responses during auditory perception. However, it remains unclear whether such expertise-dependent plasticity is modulated by the instrument that a musician plays. To examine whether the recruitment of sensorimotor regions during music perception is modulated by instrument-specific experience, we studied nonclassical musicians-beatboxers, who predominantly use their vocal apparatus to produce sound, and guitarists, who use their hands. We contrast fMRI activity in 20 beatboxers, 20 guitarists, and 20 nonmusicians as they listen to novel beatboxing and guitar pieces. All musicians show enhanced activity in sensorimotor regions (IFG, IPC, and SMA), but only when listening to the musical instrument they can play. Using independent component analysis, we find expertise-selective enhancement in sensorimotor networks, which are distinct from changes in attentional networks. These findings suggest that long-term sensorimotor experience facilitates access to the posterodorsal "how" pathway during auditory processing.
Affiliation(s)
- Saloni Krishnan, Institute of Cognitive Neuroscience, University College London, 17 Queen Square, London, UK; Department of Experimental Psychology, University of Oxford, Anna Watts Building, Radcliffe Observatory Quarter, Oxford, UK
- César F Lima, Institute of Cognitive Neuroscience, University College London, 17 Queen Square, London, UK; Instituto Universitário de Lisboa (ISCTE-IUL), Avenida das Forças Armadas, Lisboa, Portugal
- Samuel Evans, Institute of Cognitive Neuroscience, University College London, 17 Queen Square, London, UK; Department of Psychology, University of Westminster, 115 New Cavendish Street, London, UK
- Sinead Chen, Institute of Cognitive Neuroscience, University College London, 17 Queen Square, London, UK
- Stella Guldner, Institute of Cognitive Neuroscience, University College London, 17 Queen Square, London, UK; Graduate School of Economic and Social Sciences (GESS), University of Mannheim, Mannheim, Germany
- Harry Yeff, Get Involved Ltd, 3 Loughborough Street, London, UK
- Tom Manly, MRC Cognition and Brain Sciences Unit, 15 Chaucer Road, Cambridge, UK
- Sophie K Scott, Institute of Cognitive Neuroscience, University College London, 17 Queen Square, London, UK

22
Guediche S, Zhu Y, Minicucci D, Blumstein SE. Written sentence context effects on acoustic-phonetic perception: fMRI reveals cross-modal semantic-perceptual interactions. BRAIN AND LANGUAGE 2019; 199:104698. [PMID: 31586792 DOI: 10.1016/j.bandl.2019.104698] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/13/2018] [Revised: 09/15/2019] [Accepted: 09/18/2019] [Indexed: 06/10/2023]
Abstract
This study examines cross-modality effects of a semantically-biased written sentence context on the perception of an acoustically-ambiguous word target, identifying neural areas sensitive to interactions between sentential bias and phonetic ambiguity. Of interest is whether the locus or nature of the interactions resembles those previously demonstrated for auditory-only effects. fMRI results show significant interaction effects in the right mid-middle temporal gyrus (RmMTG) and bilateral anterior superior temporal gyri (aSTG), regions along the ventral language comprehension stream that map sound onto meaning. These regions are more anterior than those previously identified for auditory-only effects; however, the same cross-over interaction pattern emerged, implying that similar underlying computations are at play. The findings suggest that the mechanisms that integrate information across modalities and across the sentence and phonetic levels of processing recruit amodal areas where reading and spoken lexical and semantic access converge. Taken together, the results support interactive accounts of speech and language processing.
Affiliation(s)
- Sara Guediche, Department of Cognitive, Linguistic & Psychological Sciences, Brown University, United States; BCBL - Basque Center on Cognition, Brain and Language, Donostia-San Sebastian, Spain
- Yuli Zhu, Neuroscience Department, Brown University, United States
- Domenic Minicucci, Department of Cognitive, Linguistic & Psychological Sciences, Brown University, United States
- Sheila E Blumstein, Department of Cognitive, Linguistic & Psychological Sciences, Brown University, United States; Brown Institute for Brain Science, Brown University, United States

23
A Sound-Sensitive Source of Alpha Oscillations in Human Non-Primary Auditory Cortex. J Neurosci 2019; 39:8679-8689. [PMID: 31533976 PMCID: PMC6820204 DOI: 10.1523/jneurosci.0696-19.2019] [Citation(s) in RCA: 32] [Impact Index Per Article: 6.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/27/2019] [Revised: 06/09/2019] [Accepted: 07/02/2019] [Indexed: 02/06/2023] Open
Abstract
The functional organization of human auditory cortex can be probed by characterizing responses to various classes of sound at different anatomical locations. Along with histological studies, this approach has revealed a primary field in posteromedial Heschl's gyrus (HG) with pronounced induced high-frequency (70–150 Hz) activity and short-latency responses that phase-lock to rapid transient sounds. Low-frequency neural oscillations are also relevant to stimulus processing and information flow; however, their distribution within auditory cortex has not been established. Alpha activity (7–14 Hz) in particular has been associated with processes that may differentially engage earlier versus later levels of the cortical hierarchy, including functional inhibition and the communication of sensory predictions. These theories derive largely from the study of occipitoparietal sources readily detectable in scalp electroencephalography. To characterize the anatomical basis and functional significance of the less accessible temporal-lobe alpha activity, we analyzed responses to sentences in seven human adults (4 female) with epilepsy who had been implanted with electrodes in superior temporal cortex. In contrast to primary cortex in posteromedial HG, a non-primary field in anterolateral HG was characterized by high spontaneous alpha activity that was strongly suppressed during auditory stimulation. Alpha-power suppression decreased with distance from anterolateral HG throughout superior temporal cortex and was more pronounced for clear compared to degraded speech. This suppression could not be accounted for solely by a change in the slope of the power spectrum. The differential manifestation and stimulus-sensitivity of alpha oscillations across auditory fields should be accounted for in theories of their generation and function.
SIGNIFICANCE STATEMENT To understand how auditory cortex is organized in support of perception, we recorded from patients implanted with electrodes for clinical reasons. This allowed measurement of activity in brain regions at different levels of sensory processing. Oscillations in the alpha range (7–14 Hz) have been associated with functions including sensory prediction and inhibition of regions handling irrelevant information, but their distribution within auditory cortex is not known. A key finding was that these oscillations dominated in one particular non-primary field, anterolateral Heschl's gyrus, and were suppressed when subjects listened to sentences. These results build on our knowledge of the functional organization of auditory cortex and provide anatomical constraints on theories of the generation and function of alpha oscillations.
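Alpha-band power and its suppression during stimulation can be quantified from a spectral estimate. A minimal sketch using Welch's method on synthetic signals (the 7–14 Hz band follows the abstract; the sampling rate and the signals themselves are assumptions):

```python
import numpy as np
from scipy.signal import welch

def alpha_power(x, fs, band=(7.0, 14.0)):
    """Mean power spectral density in the alpha band via Welch's method."""
    f, pxx = welch(x, fs=fs, nperseg=fs)   # 1 s segments -> 1 Hz resolution
    mask = (f >= band[0]) & (f <= band[1])
    return pxx[mask].mean()

rng = np.random.default_rng(1)
fs = 500                                   # assumed sampling rate in Hz
t = np.arange(10 * fs) / fs
noise = rng.standard_normal(t.size)

# Hypothetical traces: a strong 10 Hz rhythm at rest that is attenuated
# during sentence listening, as reported for anterolateral HG.
rest = noise + 2.0 * np.sin(2 * np.pi * 10 * t)
stim = noise + 0.3 * np.sin(2 * np.pi * 10 * t)

# Fractional alpha-power suppression (0 = none, 1 = complete).
suppression = 1.0 - alpha_power(stim, fs) / alpha_power(rest, fs)
```

In practice one would also model the broadband (1/f) slope separately, since the abstract notes that the observed suppression is not explained by a slope change alone.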
24
Planton S, Chanoine V, Sein J, Anton JL, Nazarian B, Pallier C, Pattamadilok C. Top-down activation of the visuo-orthographic system during spoken sentence processing. Neuroimage 2019; 202:116135. [PMID: 31470125 DOI: 10.1016/j.neuroimage.2019.116135] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/16/2019] [Revised: 08/09/2019] [Accepted: 08/26/2019] [Indexed: 11/28/2022] Open
Abstract
The left ventral occipitotemporal cortex (vOT) is considered the key area of the visuo-orthographic system. However, some studies have reported that the area is also involved in speech processing tasks, especially those that require activation of orthographic knowledge. These findings suggest the existence of a top-down activation mechanism allowing such cross-modal activation. Yet, little is known about the involvement of the vOT in more natural speech processing situations like spoken sentence processing. Here, we addressed this issue in a functional magnetic resonance imaging (fMRI) study while manipulating the impact of two factors, i.e., task demands (semantic vs. low-level perceptual task) and the quality of the speech signal (sentences presented against a clear vs. noisy background). Analyses were performed at the whole-brain level and at the level of regions of interest (ROI), focusing on the vOT voxels individually identified through a reading task. Whole-brain analysis showed that processing spoken sentences induced activity in a large network including the regions typically involved in phonological, articulatory, semantic and orthographic processing. ROI analysis further specified that a significant part of the vOT voxels that responded to written words also responded to spoken sentences, suggesting that the same area within the left occipitotemporal pathway contributes to both reading and speech processing. Interestingly, both analyses provided converging evidence that vOT responses to speech were sensitive to both task demands and the quality of the speech signal: compared to the low-level perceptual task, activity of the area increased when comprehension effort was required. The impact of background noise depended on task demands: it led to a decrease of vOT activity in the semantic task but not in the low-level perceptual task.
Our results provide new insights into the function of this key area of the reading network, notably by showing that its speech-induced top-down activation also generalizes to ecological speech processing situations.
Collapse
Affiliation(s)
- Samuel Planton
- Aix Marseille Univ, CNRS, LPL, Aix-en-Provence, France; INSERM-CEA, Cognitive Neuroimaging Unit, Neurospin Center, Gif-sur-Yvette, France.
| | - Valérie Chanoine
- Aix Marseille Univ, Institute of Language, Communication and the Brain, Brain and Language Research Institute, Aix-en-Provence, France
| | - Julien Sein
- Aix Marseille Univ, CNRS, Centre IRM-INT, INT UMR, 7289, Marseille, France
| | - Jean-Luc Anton
- Aix Marseille Univ, CNRS, Centre IRM-INT, INT UMR, 7289, Marseille, France
| | - Bruno Nazarian
- Aix Marseille Univ, CNRS, Centre IRM-INT, INT UMR, 7289, Marseille, France
| | - Christophe Pallier
- INSERM-CEA, Cognitive Neuroimaging Unit, Neurospin Center, Gif-sur-Yvette, France
| | | |
Collapse
|
25
|
Zekveld AA, Kramer SE, Rönnberg J, Rudner M. In a Concurrent Memory and Auditory Perception Task, the Pupil Dilation Response Is More Sensitive to Memory Load Than to Auditory Stimulus Characteristics. Ear Hear 2019; 40:272-286. [PMID: 29923867 PMCID: PMC6400496 DOI: 10.1097/aud.0000000000000612] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/14/2017] [Accepted: 04/10/2018] [Indexed: 11/30/2022]
Abstract
OBJECTIVES Speech understanding may be cognitively demanding, but it can be enhanced when semantically related text cues precede auditory sentences. The present study aimed to determine whether (a) providing text cues reduces pupil dilation, a measure of cognitive load, during listening to sentences, (b) repeating the sentences aloud affects recall accuracy and pupil dilation during recall of cue words, and (c) semantic relatedness between cues and sentences affects recall accuracy and pupil dilation during recall of cue words. DESIGN Sentence repetition following text cues and recall of the text cues were tested. Twenty-six participants (mean age, 22 years) with normal hearing listened to masked sentences. On each trial, a set of four-word cues was presented visually as text preceding the auditory presentation of a sentence whose meaning was either related or unrelated to the cues. On each trial, participants first read the cue words, then listened to a sentence. Following this they spoke aloud either the cue words or the sentence, according to instruction, and finally on all trials orally recalled the cues. Peak pupil dilation was measured throughout listening and recall on each trial. Additionally, participants completed a test measuring the ability to perceive degraded verbal text information and three working memory tests (a reading span test, a size-comparison span test, and a test of memory updating). RESULTS Cue words that were semantically related to the sentence facilitated sentence repetition but did not reduce pupil dilation. Recall was poorer and there were more intrusion errors when the cue words were related to the sentences. Recall was also poorer when sentences were repeated aloud. Both behavioral effects were associated with greater pupil dilation. Larger reading span capacity and smaller size-comparison span were associated with larger peak pupil dilation during listening. 
Furthermore, larger reading span and greater memory updating ability were both associated with better cue recall overall. CONCLUSIONS Although sentence-related word cues facilitate sentence repetition, our results indicate that they do not reduce cognitive load during listening in noise with a concurrent memory load. As expected, higher working memory capacity was associated with better recall of the cues. Unexpectedly, however, semantic relatedness with the sentence reduced word cue recall accuracy and increased intrusion errors, suggesting an effect of semantic confusion. Further, speaking the sentence aloud also reduced word cue recall accuracy, probably due to articulatory suppression. Importantly, imposing a memory load during listening to sentences resulted in the absence of formerly established strong effects of speech intelligibility on the pupil dilation response. This nullified intelligibility effect demonstrates that the pupil dilation response to a cognitive (memory) task can completely overshadow the effect of perceptual factors on the pupil dilation response. This highlights the importance of taking cognitive task load into account during auditory testing.
Collapse
Affiliation(s)
- Adriana A. Zekveld
- Department of Behavioural Sciences and Learning, Linköping University, Linköping, Sweden
- Linnaeus Centre HEAD, The Swedish Institute for Disability Research, Linköping and Örebro Universities, Linköping, Sweden
- Section Ear & Hearing, Department of Otolaryngology-Head and Neck Surgery and Amsterdam Public Health research institute, VU University Medical Center, Amsterdam, The Netherlands
| | - Sophia E. Kramer
- Section Ear & Hearing, Department of Otolaryngology-Head and Neck Surgery and Amsterdam Public Health research institute, VU University Medical Center, Amsterdam, The Netherlands
| | - Jerker Rönnberg
- Department of Behavioural Sciences and Learning, Linköping University, Linköping, Sweden
- Linnaeus Centre HEAD, The Swedish Institute for Disability Research, Linköping and Örebro Universities, Linköping, Sweden
| | - Mary Rudner
- Department of Behavioural Sciences and Learning, Linköping University, Linköping, Sweden
- Linnaeus Centre HEAD, The Swedish Institute for Disability Research, Linköping and Örebro Universities, Linköping, Sweden
| |
Collapse
|
26
|
Peelle JE. Listening Effort: How the Cognitive Consequences of Acoustic Challenge Are Reflected in Brain and Behavior. Ear Hear 2019; 39:204-214. [PMID: 28938250 PMCID: PMC5821557 DOI: 10.1097/aud.0000000000000494] [Citation(s) in RCA: 309] [Impact Index Per Article: 61.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/28/2017] [Accepted: 07/28/2017] [Indexed: 02/04/2023]
Abstract
Everyday conversation frequently includes challenges to the clarity of the acoustic speech signal, including hearing impairment, background noise, and foreign accents. Although an obvious problem is the increased risk of making word identification errors, extracting meaning from a degraded acoustic signal is also cognitively demanding, which contributes to increased listening effort. The concepts of cognitive demand and listening effort are critical in understanding the challenges listeners face in comprehension, which are not fully predicted by audiometric measures. In this article, the authors review converging behavioral, pupillometric, and neuroimaging evidence that understanding acoustically degraded speech requires additional cognitive support and that this cognitive load can interfere with other operations such as language processing and memory for what has been heard. Behaviorally, acoustic challenge is associated with increased errors in speech understanding, poorer performance on concurrent secondary tasks, more difficulty processing linguistically complex sentences, and reduced memory for verbal material. Measures of pupil dilation support the challenge associated with processing a degraded acoustic signal, indirectly reflecting an increase in neural activity. Finally, functional brain imaging reveals that the neural resources required to understand degraded speech extend beyond traditional perisylvian language networks, most commonly including regions of prefrontal cortex, premotor cortex, and the cingulo-opercular network. Far from being exclusively an auditory problem, acoustic degradation presents listeners with a systems-level challenge that requires the allocation of executive cognitive resources. An important point is that a number of dissociable processes can be engaged to understand degraded speech, including verbal working memory and attention-based performance monitoring. 
The specific resources required likely differ as a function of the acoustic, linguistic, and cognitive demands of the task, as well as individual differences in listeners' abilities. A greater appreciation of cognitive contributions to processing degraded speech is critical in understanding individual differences in comprehension ability, variability in the efficacy of assistive devices, and guiding rehabilitation approaches to reducing listening effort and facilitating communication.
Collapse
Affiliation(s)
- Jonathan E Peelle
- Department of Otolaryngology, Washington University in Saint Louis, Saint Louis, Missouri, USA
| |
Collapse
|
27
|
Rosemann S, Thiel CM. The effect of age-related hearing loss and listening effort on resting state connectivity. Sci Rep 2019; 9:2337. [PMID: 30787339 PMCID: PMC6382886 DOI: 10.1038/s41598-019-38816-z] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/18/2018] [Accepted: 01/10/2019] [Indexed: 12/22/2022] Open
Abstract
Age-related hearing loss is associated with a decrease in hearing abilities for high frequencies. This increases not only the difficulty of understanding speech but also the experienced listening effort. Task-based neuroimaging studies in normal-hearing and hearing-impaired participants show increased frontal activation during effortful speech perception in the hearing-impaired. Whether the increased effort of everyday listening in hearing-impaired individuals even impacts functional brain connectivity at rest is unknown. Nineteen normal-hearing and nineteen hearing-impaired participants with mild to moderate hearing loss participated in the study. Hearing abilities, listening effort and resting state functional connectivity were assessed. Our results indicate no differences in functional connectivity between hearing-impaired and normal-hearing participants. Increased listening effort, however, was related to significantly decreased functional connectivity between the dorsal attention network and the precuneus and superior parietal lobule, as well as between the auditory and the inferior frontal cortex. We conclude that even mild to moderate age-related hearing loss can impact resting state functional connectivity. It is, however, not the hearing loss itself but the individually perceived listening effort that relates to functional connectivity changes.
Collapse
Affiliation(s)
- Stephanie Rosemann
- Biological Psychology, Department of Psychology, School of Medicine and Health Sciences, Carl von Ossietzky Universität Oldenburg, Oldenburg, Germany; Cluster of Excellence "Hearing4all", Carl von Ossietzky Universität Oldenburg, Oldenburg, Germany
| | - Christiane M Thiel
- Biological Psychology, Department of Psychology, School of Medicine and Health Sciences, Carl von Ossietzky Universität Oldenburg, Oldenburg, Germany; Cluster of Excellence "Hearing4all", Carl von Ossietzky Universität Oldenburg, Oldenburg, Germany
| |
Collapse
|
28
|
Luthra S, Guediche S, Blumstein SE, Myers EB. Neural substrates of subphonemic variation and lexical competition in spoken word recognition. LANGUAGE, COGNITION AND NEUROSCIENCE 2019; 34:151-169. [PMID: 31106225 PMCID: PMC6516505 DOI: 10.1080/23273798.2018.1531140] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/09/2023]
Abstract
In spoken word recognition, subphonemic variation influences lexical activation, with sounds near a category boundary increasing phonetic competition as well as lexical competition. The current study investigated the interplay of these factors using a visual world task in which participants were instructed to look at a picture of an auditory target (e.g., peacock). Eyetracking data indicated that participants were slowed when a voiced onset competitor (e.g., beaker) was also displayed, and this effect was amplified when acoustic-phonetic competition was increased. Simultaneously collected fMRI data showed that several brain regions were sensitive to the presence of the onset competitor, including the supramarginal, middle temporal, and inferior frontal gyri, and functional connectivity analyses revealed that the coordinated activity of left frontal regions depends on both acoustic-phonetic and lexical factors. Taken together, the results suggest a role for frontal brain structures in resolving lexical competition, particularly as atypical acoustic-phonetic information maps onto the lexicon.
Collapse
Affiliation(s)
- Sahil Luthra
- Department of Psychological Sciences, University of Connecticut 406 Babbidge Road, Unit 1020, Storrs, CT, USA 06269
| | - Sara Guediche
- BCBL. Basque Center on Cognition, Brain and Language Mikeletegi Pasealekua, 69, 20009 Donostia, Gipuzkoa, Spain
| | - Sheila E Blumstein
- Department of Cognitive, Linguistic & Psychological Sciences, Brown University 190 Thayer Street, Providence, RI, USA 02912
- Brown Institute for Brain Science, Brown University 2 Stimson Ave, Providence, RI, USA 02912
| | - Emily B Myers
- Department of Psychological Sciences, University of Connecticut 406 Babbidge Road, Unit 1020, Storrs, CT, USA 06269
- Department of Speech, Language & Hearing Sciences, University of Connecticut 850 Bolton Road, Unit 1085, Storrs, CT, USA 06269
- Haskins Laboratories 300 George Street, Suite 900, New Haven, CT, USA 06511
| |
Collapse
|
29
|
|
30
|
Chu R, Meltzer JA, Bitan T. Interhemispheric interactions during sentence comprehension in patients with aphasia. Cortex 2018; 109:74-91. [PMID: 30312780 DOI: 10.1016/j.cortex.2018.08.022] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/05/2017] [Revised: 05/03/2018] [Accepted: 08/28/2018] [Indexed: 02/06/2023]
Abstract
Right-hemisphere involvement in language processing following left-hemisphere damage may reflect either compensatory processes, or a release from homotopic transcallosal inhibition, resulting in excessive right-to-left suppression that is maladaptive for language performance. Using fMRI, we assessed inter-hemispheric effective connectivity in fifteen patients with post-stroke aphasia, along with age-matched and younger controls, during a sentence comprehension task. Dynamic Causal Modeling was used with four bilateral regions including the inferior frontal gyri (IFG) and primary auditory cortices (A1). Despite the presence of lesions, satisfactory model fit was obtained in 9/15 patients. In young controls, the only significant homotopic connection (RA1-LA1) was excitatory, while inhibitory connections emanated from LIFG to both left and right A1's. Interestingly, these connections were also correlated with language comprehension scores in patients. The results for homotopic connections show that excitatory connectivity from RA1 to LA1 and inhibitory connectivity from LA1 to RA1 are associated with general auditory verbal comprehension. Moreover, negative correlations were found between sentence comprehension and top-down coupling for both heterotopic (LIFG-to-RA1) and intra-hemispheric (LIFG-to-LA1) connections. These results do not show an emergence of a new compensatory right-to-left excitation in patients, nor do they support the existence of left-to-right transcallosal suppression in controls. Nevertheless, the correlations with performance in patients are consistent with some aspects of both the compensation model and the transcallosal suppression account of the role of the RH. Altogether, our results suggest that changes to both excitatory and inhibitory homotopic and heterotopic connections due to LH damage may be maladaptive, as they disrupt normal inter-hemispheric coordination and communication.
Collapse
Affiliation(s)
- Ronald Chu
- Baycrest Health Sciences, Rotman Research Institute, Toronto, ON, Canada; University of Toronto, Department of Psychology, Toronto, ON, Canada.
| | - Jed A Meltzer
- Baycrest Health Sciences, Rotman Research Institute, Toronto, ON, Canada; University of Toronto, Department of Psychology, Toronto, ON, Canada; University of Toronto, Department of Speech-Language Pathology, Toronto, ON, Canada; Canadian Partnership for Stroke Recovery, Ottawa, ON, Canada
| | - Tali Bitan
- University of Toronto, Department of Speech-Language Pathology, Toronto, ON, Canada; University of Haifa, Department of Psychology and IIPDM, Haifa, Israel
| |
Collapse
|
31
|
Wilson SM, Yen M, Eriksson DK. An adaptive semantic matching paradigm for reliable and valid language mapping in individuals with aphasia. Hum Brain Mapp 2018; 39:3285-3307. [PMID: 29665223 PMCID: PMC6045968 DOI: 10.1002/hbm.24077] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/01/2018] [Revised: 03/26/2018] [Accepted: 03/30/2018] [Indexed: 11/08/2022] Open
Abstract
Research on neuroplasticity in recovery from aphasia depends on the ability to identify language areas of the brain in individuals with aphasia. However, tasks commonly used to engage language processing in people with aphasia, such as narrative comprehension and picture naming, are limited in terms of reliability (test-retest reproducibility) and validity (identification of language regions, and not other regions). On the other hand, paradigms such as semantic decision that are effective in identifying language regions in people without aphasia can be prohibitively challenging for people with aphasia. This paper describes a new semantic matching paradigm that uses an adaptive staircase procedure to present individuals with stimuli that are challenging yet within their competence, so that language processing can be fully engaged in people with and without language impairments. The feasibility, reliability and validity of the adaptive semantic matching paradigm were investigated in sixteen individuals with chronic post-stroke aphasia and fourteen neurologically normal participants, in comparison to narrative comprehension and picture naming paradigms. All participants succeeded in learning and performing the semantic paradigm. Test-retest reproducibility of the semantic paradigm in people with aphasia was good (Dice coefficient = 0.66), and was superior to the other two paradigms. The semantic paradigm revealed known features of typical language organization (lateralization; frontal and temporal regions) more consistently in neurologically normal individuals than the other two paradigms, constituting evidence for validity. In sum, the adaptive semantic matching paradigm is a feasible, reliable and valid method for mapping language regions in people with aphasia.
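The test-retest reproducibility reported above is quantified with the Dice coefficient, the overlap between the sets of voxels activated in two scanning sessions. A minimal sketch of the computation follows; the voxel coordinates are hypothetical and purely illustrative, not data from the study:

```python
def dice(a: set, b: set) -> float:
    """Dice coefficient: 2|A ∩ B| / (|A| + |B|), from 0 (no overlap) to 1."""
    if not a and not b:
        return 1.0  # convention: two empty maps overlap perfectly
    return 2 * len(a & b) / (len(a) + len(b))

# Hypothetical suprathreshold voxel sets from two sessions of one participant
session1 = {(10, 4, 7), (10, 5, 7), (11, 4, 7), (11, 5, 7)}
session2 = {(10, 4, 7), (10, 5, 7), (11, 5, 7), (12, 5, 7)}
print(dice(session1, session2))  # 2*3 / (4+4) = 0.75
```

A Dice coefficient of 0.66, as reported for the adaptive semantic paradigm, thus means roughly two-thirds overlap between the activation maps of the two sessions.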
Collapse
Affiliation(s)
- Stephen M. Wilson
- Department of Hearing and Speech Sciences, Vanderbilt University Medical Center, Nashville, Tennessee
| | - Melodie Yen
- Department of Hearing and Speech Sciences, Vanderbilt University Medical Center, Nashville, Tennessee
| | - Dana K. Eriksson
- Department of Speech, Language, and Hearing Sciences, University of Arizona, Tucson, Arizona
| |
Collapse
|
32
|
Abstract
OBJECTIVES It is well known from previous research that when listeners are told what they are about to hear before a degraded or partially masked auditory signal is presented, the speech signal "pops out" of the background and becomes considerably more intelligible. The goal of this research was to explore whether this priming effect is as strong in older adults as in younger adults. DESIGN Fifty-six adults (28 older and 28 younger) listened to "nonsense" sentences spoken by a female talker in the presence of a two-talker speech masker (also female) or a fluctuating speech-like noise masker at 5 signal-to-noise ratios. Just before, or just after, the auditory signal was presented, a typed caption was displayed on a computer screen. The caption sentence was either identical to the auditory sentence or differed by one key word. The subjects' task was to decide whether the caption and auditory messages were the same or different. Discrimination performance was reported in d'. The strength of the pop-out perception was inferred from the improvement in performance that was expected from the caption-before order of presentation. A subset of 12 subjects from each group made confidence judgments as they gave their responses, and also completed several cognitive tests. RESULTS Data showed a clear order effect for both subject groups and both maskers, with better same-different discrimination performance for the caption-before condition than the caption-after condition. However, for the two-talker masker, the younger adults obtained a larger and more consistent benefit from the caption-before order than the older adults across signal-to-noise ratios. Especially at the poorer signal-to-noise ratios, older subjects showed little evidence that they experienced the pop-out effect that is presumed to make the discrimination task easier.
On average, older subjects also appeared to approach the task differently, being more reluctant than younger subjects to report that the captions and auditory sentences were the same. Correlation analyses indicated a significant negative association between age and priming benefit in the two-talker masker and nonsignificant associations between priming benefit in this masker and either high-frequency hearing loss or performance on the cognitive tasks. CONCLUSIONS Previous studies have shown that older adults are at least as good, if not better, at exploiting context in speech recognition, as compared with younger adults. The current results are not in disagreement with those findings but suggest that, under some conditions, the automatic priming process that may contribute to benefits from context is not as strong in older as in younger adults.
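The d' reported in this same-different paradigm is the signal-detection sensitivity index: the difference between the z-transformed hit rate and false-alarm rate. A minimal sketch using only Python's standard library follows; the trial counts are hypothetical, and the log-linear correction shown is one common convention for avoiding infinite z-scores, not necessarily the one used in the study:

```python
from statistics import NormalDist

def d_prime(hits: int, misses: int,
            false_alarms: int, correct_rejections: int) -> float:
    """Sensitivity index d' = z(hit rate) - z(false-alarm rate).

    Adding 0.5 to each cell (log-linear correction) keeps the
    z-transform finite when a rate would otherwise be 0 or 1.
    """
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)

# Hypothetical example: 40 hits / 10 misses, 15 false alarms / 35 correct rejections
print(round(d_prime(40, 10, 15, 35), 2))
```

A d' of 0 means chance-level discrimination of caption-sentence mismatches; larger values mean the key-word change was detected more reliably.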
Collapse
|
33
|
Helfer KS, Freyman RL, Merchant GR. How repetition influences speech understanding by younger, middle-aged and older adults. Int J Audiol 2018; 57:695-702. [PMID: 29801416 DOI: 10.1080/14992027.2018.1475756] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/16/2022]
Abstract
OBJECTIVE To examine benefit from immediate repetition of a masked speech message in younger, middle-aged and older adults. DESIGN Participants listened to sentences in conditions where only the target message was repeated, and when both the target message and its accompanying masker (noise or speech) were repeated. In a follow-up experiment, the effect of repetition was evaluated using a square-wave modulated noise masker to compare benefit when listeners were exposed to the same glimpses of the target message during first and second presentation versus when the glimpses differed. STUDY SAMPLE Younger, middle-aged and older adults (n = 16/group) for the main experiment; 15 younger adults for the follow-up experiment. RESULTS Repetition benefit was larger when the target but not the masker was repeated for all groups. This was especially true for older adults, suggesting that these individuals may be more negatively affected when a background message is repeated. Data obtained using noise maskers suggest that it is slightly more beneficial when listeners hear different (versus identical) portions of speech between initial presentation and repetition. CONCLUSIONS Although subtle age-related differences were found in some conditions, results confirm that repetition is an effective repair strategy for listeners spanning the adult age range.
Collapse
Affiliation(s)
- Karen S Helfer
- Department of Communication Disorders, University of Massachusetts Amherst, Amherst, MA, USA
| | - Richard L Freyman
- Department of Communication Disorders, University of Massachusetts Amherst, Amherst, MA, USA
| | - Gabrielle R Merchant
- Department of Communication Disorders, University of Massachusetts Amherst, Amherst, MA, USA
| |
Collapse
|
34
|
Zheng Y, Wu C, Li J, Li R, Peng H, She S, Ning Y, Li L. Schizophrenia alters intra-network functional connectivity in the caudate for detecting speech under informational speech masking conditions. BMC Psychiatry 2018; 18:90. [PMID: 29618332 PMCID: PMC5885301 DOI: 10.1186/s12888-018-1675-1] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 08/10/2017] [Accepted: 03/26/2018] [Indexed: 01/17/2023] Open
Abstract
BACKGROUND Speech recognition under noisy "cocktail-party" conditions involves multiple perceptual/cognitive processes, including target detection, selective attention, irrelevant signal inhibition, sensory/working memory, and speech production. Compared to healthy listeners, people with schizophrenia are more vulnerable to masking stimuli and perform worse in speech recognition under speech-on-speech masking conditions. Although the schizophrenia-related speech-recognition impairment under "cocktail-party" conditions is associated with deficits of various perceptual/cognitive processes, it is crucial to know whether the brain substrates critically underlying speech detection against informational speech masking are impaired in people with schizophrenia. METHODS Using functional magnetic resonance imaging (fMRI), this study investigated differences between people with schizophrenia (n = 19, mean age = 33 ± 10 years) and their matched healthy controls (n = 15, mean age = 30 ± 9 years) in intra-network functional connectivity (FC) specifically associated with target-speech detection under speech-on-speech-masking conditions. RESULTS The target-speech detection performance under the speech-on-speech-masking condition in participants with schizophrenia was significantly worse than that in matched healthy participants (healthy controls). Moreover, in healthy controls, but not participants with schizophrenia, the strength of intra-network FC within the bilateral caudate was positively correlated with speech-detection performance under the speech-masking conditions. Compared to controls, patients showed an altered spatial activity pattern and decreased intra-network FC in the caudate.
CONCLUSIONS In people with schizophrenia, the declined speech-detection performance under speech-on-speech masking conditions is associated with reduced intra-caudate functional connectivity, which normally contributes to detecting target speech against speech masking via its functions of suppressing masking-speech signals.
Collapse
Affiliation(s)
- Yingjun Zheng
- The Affiliated Brain Hospital of Guangzhou Medical University (Guangzhou Huiai Hospital), Guangzhou, 510370, China
| | - Chao Wu
- Faculty of Psychology, Beijing Normal University, Beijing, 100875, China
| | - Juanhua Li
- The Affiliated Brain Hospital of Guangzhou Medical University (Guangzhou Huiai Hospital), Guangzhou, 510370, China
| | - Ruikeng Li
- The Affiliated Brain Hospital of Guangzhou Medical University (Guangzhou Huiai Hospital), Guangzhou, 510370, China
| | - Hongjun Peng
- The Affiliated Brain Hospital of Guangzhou Medical University (Guangzhou Huiai Hospital), Guangzhou, 510370, China
| | - Shenglin She
- The Affiliated Brain Hospital of Guangzhou Medical University (Guangzhou Huiai Hospital), Guangzhou, 510370, China
| | - Yuping Ning
- The Affiliated Brain Hospital of Guangzhou Medical University (Guangzhou Huiai Hospital), Guangzhou, 510370, China
| | - Liang Li
- School of Psychological and Cognitive Sciences, Beijing Key Laboratory of Behavior and Mental Health, Key Laboratory on Machine Perception (Ministry of Education), Peking University, 5 Yiheyuan Road, Beijing, 100080, People's Republic of China; Beijing Institute for Brain Disorder, Capital Medical University, Beijing, China
| |
Collapse
|
35
|
Alain C, Du Y, Bernstein LJ, Barten T, Banai K. Listening under difficult conditions: An activation likelihood estimation meta-analysis. Hum Brain Mapp 2018. [PMID: 29536592 DOI: 10.1002/hbm.24031] [Citation(s) in RCA: 73] [Impact Index Per Article: 12.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/19/2023] Open
Abstract
The brain networks supporting speech identification and comprehension under difficult listening conditions are not well specified. The networks hypothesized to underlie effortful listening include regions responsible for executive control. We conducted meta-analyses of auditory neuroimaging studies to determine whether a common activation pattern of the frontal lobe supports effortful listening under different speech manipulations. Fifty-three functional neuroimaging studies investigating speech perception were divided into three independent Activation Likelihood Estimation (ALE) analyses based on the type of speech manipulation paradigm used: speech-in-noise (SIN; 16 studies, involving 224 participants); spectrally degraded speech using filtering techniques (15 studies, involving 270 participants); and linguistic complexity (i.e., levels of syntactic, lexical and semantic intricacy/density; 22 studies, involving 348 participants). Meta-analysis of the SIN studies revealed that higher effort was associated with activation in the left inferior frontal gyrus (IFG), left inferior parietal lobule, and right insula. Studies using spectrally degraded speech demonstrated increased activation of the insula bilaterally and the left superior temporal gyrus (STG). Studies manipulating linguistic complexity showed activation in the left IFG, right middle frontal gyrus, left middle temporal gyrus and bilateral STG. Planned contrasts revealed left IFG activation in linguistic complexity studies, which differed from the activation patterns observed in SIN or spectral degradation studies. Although there was no significant overlap in prefrontal activation across these three speech manipulation paradigms, SIN and spectral degradation studies showed overlapping regions in the left and right insula. These findings provide evidence that there is regional specialization within the left IFG and that differential executive networks underlie effortful listening.
Collapse
Affiliation(s)
- Claude Alain
- Rotman Research Institute, Baycrest Health Centre, Toronto, Ontario, Canada; Department of Psychology, University of Toronto, Toronto, Ontario, Canada
| | - Yi Du
- CAS Key Laboratory of Behavioral Science, Institute of Psychology, Chinese Academy of Sciences, Beijing, China
| | - Lori J Bernstein
- Department of Supportive Care, Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada; Department of Psychiatry, University of Toronto, Toronto, Ontario, Canada
| | - Thijs Barten
- Rotman Research Institute, Baycrest Health Centre, Toronto, Ontario, Canada
| | - Karen Banai
- Department of Communication Sciences and Disorders, University of Haifa, Haifa, Israel
| |
Collapse
|
36
|
Di Liberto GM, Lalor EC, Millman RE. Causal cortical dynamics of a predictive enhancement of speech intelligibility. Neuroimage 2018; 166:247-258. [DOI: 10.1016/j.neuroimage.2017.10.066] [Citation(s) in RCA: 60] [Impact Index Per Article: 10.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/06/2017] [Revised: 10/04/2017] [Accepted: 10/30/2017] [Indexed: 11/28/2022] Open
|
37
|
Rowland SC, Hartley DEH, Wiggins IM. Listening in Naturalistic Scenes: What Can Functional Near-Infrared Spectroscopy and Intersubject Correlation Analysis Tell Us About the Underlying Brain Activity? Trends Hear 2018; 22:2331216518804116. [PMID: 30345888 PMCID: PMC6198387 DOI: 10.1177/2331216518804116] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/07/2018] [Revised: 08/17/2018] [Accepted: 09/06/2018] [Indexed: 12/24/2022] Open
Abstract
Listening to speech in the noisy conditions of everyday life can be effortful, reflecting the increased cognitive workload involved in extracting meaning from a degraded acoustic signal. Studying the underlying neural processes has the potential to provide mechanistic insight into why listening is effortful under certain conditions. In a move toward studying listening effort under ecologically relevant conditions, we used the silent and flexible neuroimaging technique functional near-infrared spectroscopy (fNIRS) to examine brain activity during attentive listening to speech in naturalistic scenes. Thirty normally hearing participants listened to a series of narratives continuously varying in acoustic difficulty while undergoing fNIRS imaging. Participants then listened to another set of closely matched narratives and rated perceived effort and intelligibility for each scene. As expected, self-reported effort generally increased with worsening signal-to-noise ratio. After controlling for better-ear signal-to-noise ratio, perceived effort was greater in scenes that contained competing speech than in those that did not, potentially reflecting an additional cognitive cost of overcoming informational masking. We analyzed the fNIRS data using intersubject correlation, a data-driven approach suitable for analyzing data collected under naturalistic conditions. Significant intersubject correlation was seen in the bilateral auditory cortices and in a range of channels across the prefrontal cortex. The involvement of prefrontal regions is consistent with the notion that higher order cognitive processes are engaged during attentive listening to speech in complex real-world conditions. However, further research is needed to elucidate the relationship between perceived listening effort and activity in these extended cortical networks.
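The intersubject correlation analysis described in this abstract can be sketched in a few lines. This is an illustrative leave-one-out variant for a single measurement channel (correlating each subject against the average of the others), not the authors' actual fNIRS pipeline:

```python
import numpy as np

def leave_one_out_isc(data):
    """Leave-one-out intersubject correlation for one channel.

    data: array of shape (n_subjects, n_timepoints). For each subject,
    returns the Pearson correlation between that subject's time series
    and the average time series of all remaining subjects.
    """
    isc = np.empty(data.shape[0])
    for s in range(data.shape[0]):
        others = np.delete(data, s, axis=0).mean(axis=0)
        isc[s] = np.corrcoef(data[s], others)[0, 1]
    return isc

# Toy check: subjects driven by a shared stimulus signal plus
# individual noise should show high ISC values.
rng = np.random.default_rng(0)
shared = rng.standard_normal(500)
coupled = shared + 0.3 * rng.standard_normal((10, 500))
mean_isc = leave_one_out_isc(coupled).mean()  # high: shared signal dominates
```

Because ISC only asks whether responses are consistent across listeners hearing the same stimulus, it needs no explicit model of the stimulus, which is what makes it suitable for naturalistic scenes.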
Affiliation(s)
- Stephen C. Rowland
- National Institute for Health Research Nottingham Biomedical Research Centre, UK
- Hearing Sciences, Division of Clinical Neuroscience, School of Medicine, University of Nottingham, UK
| | - Douglas E. H. Hartley
- National Institute for Health Research Nottingham Biomedical Research Centre, UK
- Hearing Sciences, Division of Clinical Neuroscience, School of Medicine, University of Nottingham, UK
- Medical Research Council Institute of Hearing Research, School of Medicine, University of Nottingham, UK
- Nottingham University Hospitals NHS Trust, Queens Medical Centre, UK
| | - Ian M. Wiggins
- National Institute for Health Research Nottingham Biomedical Research Centre, UK
- Hearing Sciences, Division of Clinical Neuroscience, School of Medicine, University of Nottingham, UK
- Medical Research Council Institute of Hearing Research, School of Medicine, University of Nottingham, UK
| |
|
38
|
Xie X, Myers E. Left Inferior Frontal Gyrus Sensitivity to Phonetic Competition in Receptive Language Processing: A Comparison of Clear and Conversational Speech. J Cogn Neurosci 2017; 30:267-280. [PMID: 29160743 DOI: 10.1162/jocn_a_01208] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/20/2022]
Abstract
The speech signal is rife with variations in phonetic ambiguity. For instance, when talkers speak in a conversational register, they demonstrate less articulatory precision, leading to greater potential for confusability at the phonetic level compared with a clear speech register. Current psycholinguistic models assume that ambiguous speech sounds activate more than one phonological category and that competition at prelexical levels cascades to lexical levels of processing. Imaging studies have shown that the left inferior frontal gyrus (LIFG) is modulated by phonetic competition between simultaneously activated categories, with increases in activation for more ambiguous tokens. Yet, these studies have often used artificially manipulated speech and/or metalinguistic tasks, which arguably may recruit neural regions that are not critical for natural speech recognition. Indeed, a prominent model of speech processing, the dual-stream model, posits that the LIFG is not involved in prelexical processing in receptive language processing. In the current study, we exploited natural variation in phonetic competition in the speech signal to investigate the neural systems sensitive to phonetic competition as listeners engage in a receptive language task. Participants heard nonsense sentences spoken in either a clear or conversational register as neural activity was monitored using fMRI. Conversational sentences contained greater phonetic competition, as estimated by measures of vowel confusability, and these sentences also elicited greater activation in a region in the LIFG. Sentence-level phonetic competition metrics uniquely correlated with LIFG activity as well. This finding is consistent with the hypothesis that the LIFG responds to competition at multiple levels of language processing and that recruitment of this region does not require an explicit phonological judgment.
|
39
|
Hakonen M, May PJC, Jääskeläinen IP, Jokinen E, Sams M, Tiitinen H. Predictive processing increases intelligibility of acoustically distorted speech: Behavioral and neural correlates. Brain Behav 2017; 7:e00789. [PMID: 28948083 PMCID: PMC5607552 DOI: 10.1002/brb3.789] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 03/16/2017] [Revised: 06/10/2017] [Accepted: 06/26/2017] [Indexed: 11/08/2022] Open
Abstract
INTRODUCTION We examined which brain areas are involved in the comprehension of acoustically distorted speech using an experimental paradigm in which the same distorted sentence can be perceived at different levels of intelligibility. This change in intelligibility occurs via a single intervening presentation of the intact version of the sentence, and the effect lasts at least on the order of minutes. Since the acoustic structure of the distorted stimulus is kept fixed and only intelligibility is varied, this allows one to study brain activity related to speech comprehension specifically. METHODS In a functional magnetic resonance imaging (fMRI) experiment, a stimulus set contained a block of six distorted sentences. This was followed by the intact counterparts of the sentences, after which the sentences were presented in distorted form again. A total of 18 such sets were presented to 20 human subjects. RESULTS The blood oxygenation level-dependent (BOLD) responses elicited by the distorted sentences that came after the disambiguating, intact sentences were contrasted with the responses to the sentences presented before disambiguation. This revealed increased activity in the bilateral frontal pole, the dorsal anterior cingulate/paracingulate cortex, and the right frontal operculum. Decreased BOLD responses were observed in the posterior insula, Heschl's gyrus, and the posterior superior temporal sulcus. CONCLUSIONS The brain areas that showed BOLD enhancement for increased sentence comprehension have been associated with executive functions and with the mapping of incoming sensory information to representations stored in episodic memory. Thus, the comprehension of acoustically distorted speech may be associated with the engagement of memory-related subsystems. Further, activity in the primary auditory cortex was modulated by prior experience, possibly in a predictive coding framework.
Our results suggest that memory biases the perception of ambiguous sensory information toward interpretations that have the highest probability to be correct based on previous experience.
Affiliation(s)
- Maria Hakonen
- Brain and Mind Laboratory, Department of Neuroscience and Biomedical Engineering (NBE), School of Science, Aalto University, Aalto, Finland
- Department of Physiology, Faculty of Medicine, University of Helsinki, Helsinki, Finland
| | - Patrick J. C. May
- Medical Research Council Institute of Hearing Research, School of Medicine, The University of Nottingham, Nottingham, UK
- Special Laboratory Non-Invasive Brain Imaging, Leibniz Institute for Neurobiology, Magdeburg, Germany
| | - Iiro P. Jääskeläinen
- Brain and Mind Laboratory, Department of Neuroscience and Biomedical Engineering (NBE), School of Science, Aalto University, Aalto, Finland
| | - Emma Jokinen
- Department of Signal Processing and Acoustics, School of Electrical Engineering, Aalto University, Aalto, Finland
| | - Mikko Sams
- Brain and Mind Laboratory, Department of Neuroscience and Biomedical Engineering (NBE), School of Science, Aalto University, Aalto, Finland
| | | |
|
40
|
Li J, Wu C, Zheng Y, Li R, Li X, She S, Wu H, Peng H, Ning Y, Li L. Schizophrenia affects speech-induced functional connectivity of the superior temporal gyrus under cocktail-party listening conditions. Neuroscience 2017; 359:248-257. [PMID: 28673720 DOI: 10.1016/j.neuroscience.2017.06.043] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/14/2017] [Revised: 06/02/2017] [Accepted: 06/22/2017] [Indexed: 12/31/2022]
Abstract
The superior temporal gyrus (STG) is involved in speech recognition against informational masking under cocktail-party listening conditions. Compared to healthy listeners, people with schizophrenia perform worse in speech recognition under informational speech-on-speech masking conditions. It is not clear whether this schizophrenia-related vulnerability to informational masking is associated with changes in the functional connectivity (FC) of the STG with certain critical brain regions. Using a sparse-sampling fMRI design, this study investigated the differences between people with schizophrenia and healthy controls in FC of the STG during target-speech listening against informational speech-on-speech masking, under listening conditions with either perceived spatial separation (PSS, with a spatial release of informational masking) or perceived spatial co-location (PSC, without the spatial release) between target speech and masking speech. The results showed that in healthy participants, but not participants with schizophrenia, the contrast of either the PSS or PSC condition against the masker-only condition induced an enhancement of FC of the STG with the left superior parietal lobule (SPL) and the right precuneus. Compared to healthy participants, participants with schizophrenia showed reduced FC of the STG with the bilateral precuneus, right SPL, and right supplementary motor area. Thus, FC of the STG with the parietal areas is normally involved in speech listening against informational masking under either the PSS or PSC condition, and the reduced FC of the STG with the parietal areas in people with schizophrenia may be associated with their increased vulnerability to informational masking.
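Several entries in this list, including this one, rest on seed-based functional connectivity: correlating the time series of a seed region (here, the STG) with the time series of other voxels. A minimal, hypothetical sketch of that core step (plain Pearson correlation, ignoring the condition-dependent contrasts these studies actually model):

```python
import numpy as np

def seed_connectivity(seed_ts, voxel_ts):
    """Pearson correlation between a seed time series and each voxel.

    seed_ts: shape (n_timepoints,); voxel_ts: shape (n_voxels, n_timepoints).
    Returns one correlation coefficient per voxel.
    """
    s = (seed_ts - seed_ts.mean()) / seed_ts.std()
    v = (voxel_ts - voxel_ts.mean(axis=1, keepdims=True)) \
        / voxel_ts.std(axis=1, keepdims=True)
    # Mean of products of z-scored series = Pearson r
    return (v @ s) / seed_ts.size

# Toy data: voxel 0 copies the seed, voxel 1 is unrelated noise.
rng = np.random.default_rng(0)
seed = rng.standard_normal(300)
voxels = np.vstack([seed, rng.standard_normal(300)])
r = seed_connectivity(seed, voxels)
```

Group differences in FC, as reported above, would then be tested on such (Fisher-transformed) correlation maps across participants.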
Affiliation(s)
- Juanhua Li
- The Affiliated Brain Hospital of Guangzhou Medical University (Guangzhou Huiai Hospital), Guangzhou 510370, China
| | - Chao Wu
- School of Psychological and Cognitive Sciences, Beijing Key Laboratory of Behavior and Mental Health, Key Laboratory on Machine Perception (Ministry of Education), Peking University, Beijing 100080, China; School of Psychology, Beijing Normal University, Beijing 100875, China
| | - Yingjun Zheng
- The Affiliated Brain Hospital of Guangzhou Medical University (Guangzhou Huiai Hospital), Guangzhou 510370, China
| | - Ruikeng Li
- The Affiliated Brain Hospital of Guangzhou Medical University (Guangzhou Huiai Hospital), Guangzhou 510370, China
| | - Xuanzi Li
- The Affiliated Brain Hospital of Guangzhou Medical University (Guangzhou Huiai Hospital), Guangzhou 510370, China
| | - Shenglin She
- The Affiliated Brain Hospital of Guangzhou Medical University (Guangzhou Huiai Hospital), Guangzhou 510370, China
| | - Haibo Wu
- The Affiliated Brain Hospital of Guangzhou Medical University (Guangzhou Huiai Hospital), Guangzhou 510370, China
| | - Hongjun Peng
- The Affiliated Brain Hospital of Guangzhou Medical University (Guangzhou Huiai Hospital), Guangzhou 510370, China
| | - Yuping Ning
- The Affiliated Brain Hospital of Guangzhou Medical University (Guangzhou Huiai Hospital), Guangzhou 510370, China
| | - Liang Li
- The Affiliated Brain Hospital of Guangzhou Medical University (Guangzhou Huiai Hospital), Guangzhou 510370, China; School of Psychological and Cognitive Sciences, Beijing Key Laboratory of Behavior and Mental Health, Key Laboratory on Machine Perception (Ministry of Education), Peking University, Beijing 100080, China; Beijing Institute for Brain Disorder, Capital Medical University, Beijing, China.
| |
|
41
|
Wild CJ, Linke AC, Zubiaurre-Elorza L, Herzmann C, Duffy H, Han VK, Lee DSC, Cusack R. Adult-like processing of naturalistic sounds in auditory cortex by 3- and 9-month old infants. Neuroimage 2017. [PMID: 28648887 DOI: 10.1016/j.neuroimage.2017.06.038] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/22/2022] Open
Abstract
Functional neuroimaging has been used to show that the developing auditory cortex of very young human infants responds, in some way, to sound. However, impoverished stimuli and uncontrolled designs have made it difficult to attribute brain responses to specific auditory features, and thus to assess the maturity of feature tuning in auditory cortex. To address this, we used functional magnetic resonance imaging (fMRI) to measure the brain activity evoked by naturalistic sounds (a series of sung lullabies) in two groups of infants (3 and 9 months) and adults. We developed a novel analysis method - inter-subject regression (ISR) - to quantify the similarity of cortical responses between infants and adults, and to decompose components of the response due to different auditory features. We found that the temporal pattern of activity in infant auditory cortex shared similarity with that of adults. Some of this shared response could be attributed to simple acoustic features, such as frequency, pitch, and envelope, but other parts could not, suggesting that even more complex, adult-like features are represented in auditory cortex in early infancy.
Affiliation(s)
- Conor J Wild
- Brain and Mind Institute, Western University, London, Canada.
| | - Annika C Linke
- Brain and Mind Institute, Western University, London, Canada
| | | | | | - Hester Duffy
- Brain and Mind Institute, Western University, London, Canada
| | - Victor K Han
- Children's Health Research Institute, London, Canada
| | - David S C Lee
- Children's Health Research Institute, London, Canada
| | - Rhodri Cusack
- Brain and Mind Institute, Western University, London, Canada; Children's Health Research Institute, London, Canada; School of Psychology, Trinity College Dublin, Dublin, Ireland
| |
|
42
|
The Hierarchical Cortical Organization of Human Speech Processing. J Neurosci 2017; 37:6539-6557. [PMID: 28588065 DOI: 10.1523/jneurosci.3267-16.2017] [Citation(s) in RCA: 123] [Impact Index Per Article: 17.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/21/2016] [Revised: 05/22/2017] [Accepted: 05/25/2017] [Indexed: 12/13/2022] Open
Abstract
Speech comprehension requires that the brain extract semantic meaning from the spectral features represented at the cochlea. To investigate this process, we performed an fMRI experiment in which five men and two women passively listened to several hours of natural narrative speech. We then used voxelwise modeling to predict BOLD responses based on three different feature spaces that represent the spectral, articulatory, and semantic properties of speech. The amount of variance explained by each feature space was then assessed using a separate validation dataset. Because some responses might be explained equally well by more than one feature space, we used a variance partitioning analysis to determine the fraction of the variance that was uniquely explained by each feature space. Consistent with previous studies, we found that speech comprehension involves hierarchical representations starting in primary auditory areas and moving laterally on the temporal lobe: spectral features are found in the core of A1, mixtures of spectral and articulatory features in the STG, mixtures of articulatory and semantic features in the STS, and semantic features in the STS and beyond. Our data also show that both hemispheres are equally and actively involved in speech perception and interpretation. Further, responses as early in the auditory hierarchy as the STS are more correlated with semantic than spectral representations. These results illustrate the importance of using natural speech in neurolinguistic research.
Our methodology also provides an efficient way to simultaneously test multiple specific hypotheses about the representations of speech without using block designs and segmented or synthetic speech. SIGNIFICANCE STATEMENT: To investigate the processing steps performed by the human brain to transform natural speech sound into meaningful language, we used models based on a hierarchical set of speech features to predict BOLD responses of individual voxels recorded in an fMRI experiment while subjects listened to natural speech. Both cerebral hemispheres were actively and equally involved in speech processing. Also, the transformation from spectral features to semantic elements occurs early in the cortical speech-processing stream. Our experimental and analytical approaches are important alternatives and complements to standard approaches that use segmented speech and block designs, which report more lateralized speech processing and attribute semantic processing to higher levels of cortex than reported here.
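The variance partitioning step this abstract describes can be illustrated with a toy example: the variance uniquely explained by a feature space is the R² of the full model minus the R² of a model omitting that space. This is a simplified sketch using ordinary least squares on synthetic data (feature spaces named A, B, C are hypothetical), not the authors' regularized voxelwise modeling pipeline with a held-out validation set:

```python
import numpy as np

def r_squared(X, y):
    """R^2 of an ordinary least-squares fit of y on X (with intercept)."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    return 1 - resid.var() / y.var()

def unique_variance(spaces, y):
    """Variance uniquely explained by each named feature space:
    R^2 of the full model minus R^2 of the model omitting that space."""
    names = list(spaces)
    full = r_squared(np.column_stack([spaces[n] for n in names]), y)
    return {n: full - r_squared(
                np.column_stack([spaces[m] for m in names if m != n]), y)
            for n in names}

# Toy voxel response that depends on feature spaces A and B but not C.
rng = np.random.default_rng(1)
A, B, C = (rng.standard_normal((200, 3)) for _ in range(3))
y = A @ [1.0, -2.0, 0.5] + B @ [0.8, 0.0, 1.2] + 0.1 * rng.standard_normal(200)
uv = unique_variance({"A": A, "B": B, "C": C}, y)
```

Here `uv["A"]` and `uv["B"]` come out large while `uv["C"]` is near zero, mirroring how the analysis distinguishes feature spaces whose predictions overlap.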
|
43
|
Wu C, Zheng Y, Li J, Wu H, She S, Liu S, Ning Y, Li L. Brain substrates underlying auditory speech priming in healthy listeners and listeners with schizophrenia. Psychol Med 2017; 47:837-852. [PMID: 27894376 DOI: 10.1017/s0033291716002816] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 12/19/2022]
Abstract
BACKGROUND Under 'cocktail party' listening conditions, healthy listeners and listeners with schizophrenia can use temporally pre-presented auditory speech-priming (ASP) stimuli to improve target-speech recognition, even though listeners with schizophrenia are more vulnerable to informational speech masking. METHOD Using functional magnetic resonance imaging, this study searched for the brain substrates underlying the unmasking effect of ASP in 16 healthy controls and 22 patients with schizophrenia, as well as the brain substrates underlying schizophrenia-related speech-recognition deficits under speech-masking conditions. RESULTS In both controls and patients, introducing the ASP condition (against the auditory non-speech-priming condition) not only activated the left superior temporal gyrus (STG) and left posterior middle temporal gyrus (pMTG), but also enhanced functional connectivity of the left STG/pMTG with the left caudate. It also enhanced functional connectivity of the left STG/pMTG with the left pars triangularis of the inferior frontal gyrus (TriIFG) in controls and with the left Rolandic operculum in patients. The strength of functional connectivity between the left STG and left TriIFG was correlated with target-speech recognition under the speech-masking condition in both controls and patients, but was reduced in patients. CONCLUSIONS The left STG/pMTG and their ASP-related functional connectivity with both the left caudate and some frontal regions (the left TriIFG in healthy listeners and the left Rolandic operculum in listeners with schizophrenia) are involved in the unmasking effect of ASP, possibly by facilitating the following processes: masker-signal inhibition, target-speech encoding, and speech production. The schizophrenia-related reduction of functional connectivity between the left STG and left TriIFG augments the vulnerability of speech recognition to speech masking.
Affiliation(s)
- C Wu
- School of Psychological and Cognitive Sciences, and Beijing Key Laboratory of Behavior and Mental Health, Key Laboratory on Machine Perception (Ministry of Education), Peking University, Beijing, People's Republic of China
| | - Y Zheng
- The Affiliated Brain Hospital of Guangzhou Medical University,Guangzhou,People's Republic of China
| | - J Li
- The Affiliated Brain Hospital of Guangzhou Medical University,Guangzhou,People's Republic of China
| | - H Wu
- The Affiliated Brain Hospital of Guangzhou Medical University,Guangzhou,People's Republic of China
| | - S She
- The Affiliated Brain Hospital of Guangzhou Medical University,Guangzhou,People's Republic of China
| | - S Liu
- The Affiliated Brain Hospital of Guangzhou Medical University,Guangzhou,People's Republic of China
| | - Y Ning
- The Affiliated Brain Hospital of Guangzhou Medical University,Guangzhou,People's Republic of China
| | - L Li
- School of Psychological and Cognitive Sciences, and Beijing Key Laboratory of Behavior and Mental Health, Key Laboratory on Machine Perception (Ministry of Education), Peking University, Beijing, People's Republic of China
| |
|
44
|
Wu C, Zheng Y, Li J, Zhang B, Li R, Wu H, She S, Liu S, Peng H, Ning Y, Li L. Activation and Functional Connectivity of the Left Inferior Temporal Gyrus during Visual Speech Priming in Healthy Listeners and Listeners with Schizophrenia. Front Neurosci 2017; 11:107. [PMID: 28360829 PMCID: PMC5350153 DOI: 10.3389/fnins.2017.00107] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/25/2016] [Accepted: 02/20/2017] [Indexed: 11/13/2022] Open
Abstract
Under a "cocktail-party" listening condition with multiple people talking, people with schizophrenia benefit less than healthy people from the use of visual-speech (lipreading) priming (VSP) cues to improve speech recognition. The neural mechanisms underlying the unmasking effect of VSP remain unknown. This study investigated the brain substrates underlying the unmasking effect of VSP in healthy listeners and the schizophrenia-induced changes in those brain substrates. Using functional magnetic resonance imaging, brain activation and functional connectivity for the contrasts of the VSP listening condition vs. the visual non-speech priming (VNSP) condition were examined in 16 healthy listeners (27.4 ± 8.6 years old, 9 females and 7 males) and 22 listeners with schizophrenia (29.0 ± 8.1 years old, 8 females and 14 males). The results showed that in healthy listeners, but not listeners with schizophrenia, the VSP-induced activation (against the VNSP condition) of the left posterior inferior temporal gyrus (pITG) was significantly correlated with the VSP-induced improvement in target-speech recognition against speech masking. Compared to healthy listeners, listeners with schizophrenia showed significantly lower VSP-induced activation of the left pITG and reduced functional connectivity of the left pITG with the bilateral Rolandic operculum, bilateral superior temporal gyrus (STG), and left insula. Thus, the left pITG and its functional connectivity may be the brain substrates related to the unmasking effect of VSP, presumably through enhancing both the processing of target visual-speech signals and the inhibition of masking-speech signals. In people with schizophrenia, the reduced unmasking effect of VSP on speech recognition may be associated with a schizophrenia-related reduction of VSP-induced activation and functional connectivity of the left pITG.
Affiliation(s)
- Chao Wu
- Beijing Key Laboratory of Behavior and Mental Health, Key Laboratory on Machine Perception, Ministry of Education, School of Psychological and Cognitive Sciences, Peking University, Beijing, China; School of Life Sciences, Peking University, Beijing, China; School of Psychology, Beijing Normal University, Beijing, China
| | - Yingjun Zheng
- The Affiliated Brain Hospital of Guangzhou Medical University (Guangzhou Huiai Hospital) Guangzhou, China
| | - Juanhua Li
- The Affiliated Brain Hospital of Guangzhou Medical University (Guangzhou Huiai Hospital) Guangzhou, China
| | - Bei Zhang
- The Affiliated Brain Hospital of Guangzhou Medical University (Guangzhou Huiai Hospital) Guangzhou, China
| | - Ruikeng Li
- The Affiliated Brain Hospital of Guangzhou Medical University (Guangzhou Huiai Hospital) Guangzhou, China
| | - Haibo Wu
- The Affiliated Brain Hospital of Guangzhou Medical University (Guangzhou Huiai Hospital) Guangzhou, China
| | - Shenglin She
- The Affiliated Brain Hospital of Guangzhou Medical University (Guangzhou Huiai Hospital) Guangzhou, China
| | - Sha Liu
- The Affiliated Brain Hospital of Guangzhou Medical University (Guangzhou Huiai Hospital) Guangzhou, China
| | - Hongjun Peng
- The Affiliated Brain Hospital of Guangzhou Medical University (Guangzhou Huiai Hospital) Guangzhou, China
| | - Yuping Ning
- The Affiliated Brain Hospital of Guangzhou Medical University (Guangzhou Huiai Hospital) Guangzhou, China
| | - Liang Li
- Beijing Key Laboratory of Behavior and Mental Health, Key Laboratory on Machine Perception, Ministry of Education, School of Psychological and Cognitive Sciences, Peking University, Beijing, China; The Affiliated Brain Hospital of Guangzhou Medical University (Guangzhou Huiai Hospital), Guangzhou, China; Beijing Institute for Brain Disorder, Capital Medical University, Beijing, China
| |
|
45
|
Ozker M, Schepers IM, Magnotti JF, Yoshor D, Beauchamp MS. A Double Dissociation between Anterior and Posterior Superior Temporal Gyrus for Processing Audiovisual Speech Demonstrated by Electrocorticography. J Cogn Neurosci 2017; 29:1044-1060. [PMID: 28253074 DOI: 10.1162/jocn_a_01110] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/04/2022]
Abstract
Human speech can be comprehended using only auditory information from the talker's voice. However, comprehension is improved if the talker's face is visible, especially if the auditory information is degraded as occurs in noisy environments or with hearing loss. We explored the neural substrates of audiovisual speech perception using electrocorticography, direct recording of neural activity using electrodes implanted on the cortical surface. We observed a double dissociation in the responses to audiovisual speech with clear and noisy auditory component within the superior temporal gyrus (STG), a region long known to be important for speech perception. Anterior STG showed greater neural activity to audiovisual speech with clear auditory component, whereas posterior STG showed similar or greater neural activity to audiovisual speech in which the speech was replaced with speech-like noise. A distinct border between the two response patterns was observed, demarcated by a landmark corresponding to the posterior margin of Heschl's gyrus. To further investigate the computational roles of both regions, we considered Bayesian models of multisensory integration, which predict that combining the independent sources of information available from different modalities should reduce variability in the neural responses. We tested this prediction by measuring the variability of the neural responses to single audiovisual words. Posterior STG showed smaller variability than anterior STG during presentation of audiovisual speech with noisy auditory component. Taken together, these results suggest that posterior STG but not anterior STG is important for multisensory integration of noisy auditory and visual speech.
Affiliation(s)
- Muge Ozker
- University of Texas Graduate School of Biomedical Sciences at Houston
- Baylor College of Medicine
| | | | | | | | | |
|
46
|
Leonard MK, Baud MO, Sjerps MJ, Chang EF. Perceptual restoration of masked speech in human cortex. Nat Commun 2016; 7:13619. [PMID: 27996973 PMCID: PMC5187421 DOI: 10.1038/ncomms13619] [Citation(s) in RCA: 80] [Impact Index Per Article: 10.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/18/2016] [Accepted: 10/19/2016] [Indexed: 02/02/2023] Open
Abstract
Humans are adept at understanding speech despite the fact that our natural listening environment is often filled with interference. An example of this capacity is phoneme restoration, in which part of a word is completely replaced by noise, yet listeners report hearing the whole word. The neurological basis for this unconscious fill-in phenomenon is unknown, despite being a fundamental characteristic of human hearing. Here, using direct cortical recordings in humans, we demonstrate that missing speech is restored at the acoustic-phonetic level in bilateral auditory cortex, in real time. This restoration is preceded by specific neural activity patterns in a separate language area, left frontal cortex, which predicts the word that participants later report hearing. These results demonstrate that during speech perception, missing acoustic content is synthesized online from the integration of incoming sensory cues and the internal neural dynamics that bias word-level expectation and prediction. We can often 'fill in' missing or occluded sounds from a speech signal, an effect known as phoneme restoration. Leonard et al. found a real-time restoration of the missing sounds in the superior temporal auditory cortex in humans. Interestingly, neural activity in frontal regions prior to the stimulus can predict the word that the participant would later hear.
Affiliation(s)
- Matthew K Leonard
- Department of Neurological Surgery, University of California, San Francisco, 675 Nelson Rising Lane, Room 535, San Francisco, California 94158, USA
- Center for Integrative Neuroscience, University of California, San Francisco, 675 Nelson Rising Lane, Room 535, San Francisco, California 94158, USA
| | - Maxime O Baud
- Department of Neurology, University of California, San Francisco, 675 Nelson Rising Lane, Room 535, San Francisco, California 94158, USA
| | - Matthias J Sjerps
- Department of Linguistics, University of California, Berkeley, 1203 Dwinelle Hall #2650, Berkeley, California 94720-2650, USA
- Neurobiology of Language Department, Donders Institute for Brain, Cognition and Behavior, Centre for Cognitive Neuroimaging, Radboud University, Kapittelweg 29, Nijmegen 6525 EN, The Netherlands
| | - Edward F Chang
- Department of Neurological Surgery, University of California, San Francisco, 675 Nelson Rising Lane, Room 535, San Francisco, California 94158, USA
- Center for Integrative Neuroscience, University of California, San Francisco, 675 Nelson Rising Lane, Room 535, San Francisco, California 94158, USA
- Department of Physiology, University of California, San Francisco, 675 Nelson Rising Lane, Room 535, San Francisco, California 94158, USA
| |
Collapse
|
47
|
Blank H, Davis MH. Prediction Errors but Not Sharpened Signals Simulate Multivoxel fMRI Patterns during Speech Perception. PLoS Biol 2016; 14:e1002577. [PMID: 27846209 PMCID: PMC5112801 DOI: 10.1371/journal.pbio.1002577] [Citation(s) in RCA: 78] [Impact Index Per Article: 9.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/10/2016] [Accepted: 10/19/2016] [Indexed: 11/19/2022] Open
Abstract
Successful perception depends on combining sensory input with prior knowledge. However, the underlying mechanism by which these two sources of information are combined is unknown. In speech perception, as in other domains, two functionally distinct coding schemes have been proposed for how expectations influence representation of sensory evidence. Traditional models suggest that expected features of the speech input are enhanced or sharpened via interactive activation (Sharpened Signals). Conversely, Predictive Coding suggests that expected features are suppressed so that unexpected features of the speech input (Prediction Errors) are processed further. The present work is aimed at distinguishing between these two accounts of how prior knowledge influences speech perception. By combining behavioural, univariate, and multivariate fMRI measures of how sensory detail and prior expectations influence speech perception with computational modelling, we provide evidence in favour of Prediction Error computations. Increased sensory detail and informative expectations have additive behavioural and univariate neural effects because they both improve the accuracy of word report and reduce the BOLD signal in lateral temporal lobe regions. However, sensory detail and informative expectations have interacting effects on speech representations shown by multivariate fMRI in the posterior superior temporal sulcus. When prior knowledge was absent, increased sensory detail enhanced the amount of speech information measured in superior temporal multivoxel patterns, but with informative expectations, increased sensory detail reduced the amount of measured information. Computational simulations of Sharpened Signals and Prediction Errors during speech perception could both explain these behavioural and univariate fMRI observations. However, the multivariate fMRI observations were uniquely simulated by a Prediction Error and not a Sharpened Signal model. 
The interaction between prior expectation and sensory detail provides evidence for a Predictive Coding account of speech perception. Our work establishes methods that can be used to distinguish representations of Prediction Error and Sharpened Signals in other perceptual domains.
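The two coding schemes contrasted above can be caricatured in a toy simulation (an illustrative sketch only, not the authors' model; the feature vectors, `gain` parameter, and function names are assumptions): under Sharpening, an informative expectation amplifies the expected features of the input, whereas under Predictive Coding the expectation is subtracted, so a fully expected input yields almost no residual signal.

```python
import numpy as np

def sharpened_signal(sensory, expectation, gain=1.0):
    """Interactive-activation account: expected features are amplified."""
    boosted = sensory * (1.0 + gain * expectation)
    return boosted / boosted.sum()  # renormalize to a response pattern

def prediction_error(sensory, expectation):
    """Predictive-coding account: expected features are subtracted out."""
    return sensory - expectation

# Toy "speech" pattern over four acoustic features.
sensory = np.array([0.7, 0.1, 0.1, 0.1])
matching = sensory.copy()      # informative, accurate prior expectation
flat = np.full(4, 0.25)        # uninformative prior

pe_informed = prediction_error(sensory, matching)      # near-zero pattern
pe_naive = prediction_error(sensory, flat)             # still informative
sharp_informed = sharpened_signal(sensory, matching)   # still peaked
```

On this sketch, the information carried by the prediction-error representation shrinks when expectations are informative, while the sharpened representation stays peaked; this is the direction of the multivariate interaction that favored the Prediction Error model in the study.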
Collapse
Affiliation(s)
- Helen Blank
- MRC Cognition and Brain Sciences Unit, Cambridge, United Kingdom
| | - Matthew H. Davis
- MRC Cognition and Brain Sciences Unit, Cambridge, United Kingdom
| |
Collapse
|
48
|
Moberget T, Hilland E, Andersson S, Lundar T, Due-Tønnessen BJ, Heldal A, Ivry RB, Endestad T. Patients with focal cerebellar lesions show reduced auditory cortex activation during silent reading. Brain Lang 2016; 161:18-27. [PMID: 26341544 PMCID: PMC4775464 DOI: 10.1016/j.bandl.2015.08.004] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 11/30/2014] [Revised: 07/28/2015] [Accepted: 08/06/2015] [Indexed: 06/05/2023]
Abstract
Functional neuroimaging studies consistently report language-related cerebellar activations, but evidence from the clinical literature is less conclusive. Here, we attempt to bridge this gap by testing the effect of focal cerebellar lesions on cerebral activations in a reading task previously shown to involve distinct cerebellar regions. Patients (N=10) had lesions primarily affecting medial cerebellum, overlapping cerebellar regions activated during the presentation of random word sequences, but distinct from activations related to semantic prediction generation and prediction error processing. In line with this pattern of activation-lesion overlap, patients did not differ from matched healthy controls (N=10) in predictability-related activations. However, whereas controls showed increased activation in bilateral auditory cortex and parietal operculum when silently reading familiar words relative to viewing letter strings, this effect was absent in the patients. Our results highlight the need for careful lesion mapping and suggest possible roles for the cerebellum in visual-to-auditory mapping and/or inner speech.
Collapse
Affiliation(s)
| | - Eva Hilland
- Department of Psychology, University of Oslo, Oslo, Norway
| | - Stein Andersson
- Department of Psychology, University of Oslo, Oslo, Norway; Department of Psychosomatic Medicine, Oslo University Hospital, Oslo, Norway
| | - Tryggve Lundar
- Department of Neurosurgery, Oslo University Hospital, Oslo, Norway
| | | | - Aasta Heldal
- Department of Psychosomatic Medicine, Oslo University Hospital, Oslo, Norway
| | - Richard B Ivry
- Psychology Department, University of California, Berkeley, Berkeley, CA, USA
| | - Tor Endestad
- Department of Psychology, University of Oslo, Oslo, Norway; Department of Psychosomatic Medicine, Oslo University Hospital, Oslo, Norway
| |
Collapse
|
49
|
An fMRI study investigating effects of conceptually related sentences on the perception of degraded speech. Cortex 2016; 79:57-74. [PMID: 27100909 DOI: 10.1016/j.cortex.2016.03.014] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/04/2015] [Revised: 01/06/2016] [Accepted: 03/15/2016] [Indexed: 11/20/2022]
Abstract
Prior research has shown that the perception of degraded speech is influenced by within-sentence meaning and recruits one or more components of a frontal-temporal-parietal network. The goal of the current study is to examine whether the overall conceptual meaning of a sentence, made up of one set of words, influences the perception of a second acoustically degraded sentence, made up of a different set of words. Using functional magnetic resonance imaging (fMRI), we presented an acoustically clear sentence followed by an acoustically degraded sentence and manipulated the semantic relationship between them: Related in meaning (but consisting of different content words), Unrelated in meaning, or Same. Results showed that listeners' word recognition accuracy for the acoustically degraded sentences was significantly higher when the target sentence was preceded by a conceptually related compared to a conceptually unrelated sentence. Sensitivity to conceptual relationships was associated with enhanced activity in middle and inferior frontal, temporal, and parietal areas. In addition, the left middle frontal gyrus (LMFG), left inferior frontal gyrus (LIFG), and left middle temporal gyrus (LMTG) showed activity that correlated with individual performance on the Related condition. The superior temporal gyrus (STG) showed increased activation in the Same condition, suggesting that it is sensitive to perceptual similarity rather than the integration of meaning between the sentence pairs. A fronto-temporo-parietal network appears to consolidate information sources across multiple levels of language (acoustic, lexical, syntactic, semantic) to build and ultimately integrate conceptual information across sentences, thereby facilitating the perception of a degraded speech signal. However, the available sources of information differentially recruit specific regions and modulate their activity within this network.
Implications of these findings for the functional architecture of the network are considered.
Collapse
|
50
|
Sohoglu E, Davis MH. Perceptual learning of degraded speech by minimizing prediction error. Proc Natl Acad Sci U S A 2016; 113:E1747-56. [PMID: 26957596 PMCID: PMC4812728 DOI: 10.1073/pnas.1523266113] [Citation(s) in RCA: 72] [Impact Index Per Article: 9.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open
Abstract
Human perception is shaped by past experience on multiple timescales. Sudden and dramatic changes in perception occur when prior knowledge or expectations match stimulus content. These immediate effects contrast with the longer-term, more gradual improvements that are characteristic of perceptual learning. Despite extensive investigation of these two experience-dependent phenomena, there is considerable debate about whether they result from common or dissociable neural mechanisms. Here we test single- and dual-mechanism accounts of experience-dependent changes in perception using concurrent magnetoencephalographic and EEG recordings of neural responses evoked by degraded speech. When speech clarity was enhanced by prior knowledge obtained from matching text, we observed reduced neural activity in a peri-auditory region of the superior temporal gyrus (STG). Critically, longer-term improvements in the accuracy of speech recognition following perceptual learning resulted in reduced activity in a nearly identical STG region. Moreover, short-term neural changes caused by prior knowledge and longer-term neural changes arising from perceptual learning were correlated across subjects with the magnitude of learning-induced changes in recognition accuracy. These experience-dependent effects on neural processing could be dissociated from the neural effect of hearing physically clearer speech, which similarly enhanced perception but increased rather than decreased STG responses. Hence, the observed neural effects of prior knowledge and perceptual learning cannot be attributed to epiphenomenal changes in listening effort that accompany enhanced perception. Instead, our results support a predictive coding account of speech perception; computational simulations show how a single mechanism, minimization of prediction error, can drive immediate perceptual effects of prior knowledge and longer-term perceptual learning of degraded speech.
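The single-mechanism claim, that one update rule can drive both immediate and gradual perceptual change, can be sketched in a few lines (a minimal illustration under assumed values, not the authors' simulation; the feature pattern, noise level, and learning rate are hypothetical): repeatedly nudging an internal template by a fraction of the prediction error makes the error, a rough proxy for the evoked STG response in the paper's account, shrink across presentations of the degraded stimulus.

```python
import numpy as np

rng = np.random.default_rng(0)

# A fixed "true" speech pattern, heard repeatedly in degraded (noisy) form;
# the listener's internal template starts out uninformative.
true_pattern = np.array([0.6, 0.2, 0.1, 0.1])
template = np.full(4, 0.25)
learning_rate = 0.3

error_magnitudes = []
for trial in range(20):
    degraded = true_pattern + rng.normal(0.0, 0.05, size=4)  # noisy input
    error = degraded - template          # prediction error on this trial
    error_magnitudes.append(np.abs(error).sum())
    template = template + learning_rate * error  # error-driven update
```

As the template converges on the true pattern, the residual prediction error approaches the noise floor, mirroring the reduced neural responses after perceptual learning that the study reports.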
Collapse
Affiliation(s)
- Ediz Sohoglu
- Medical Research Council Cognition and Brain Sciences Unit, Cambridge CB2 7EF, United Kingdom
| | - Matthew H Davis
- Medical Research Council Cognition and Brain Sciences Unit, Cambridge CB2 7EF, United Kingdom
| |
Collapse
|