1. Drouin JR, Davis CP. Individual differences in visual pattern completion predict adaptation to degraded speech. Brain Lang 2024;255:105449. [PMID: 39083999] [DOI: 10.1016/j.bandl.2024.105449]
Abstract
Recognizing acoustically degraded speech relies on predictive processing, whereby incomplete auditory cues are mapped onto stored linguistic representations via pattern recognition. While listeners vary in their ability to recognize degraded speech, performance improves when a written transcription is presented, allowing the partial sensory pattern to be completed against preexisting representations. Building on work characterizing predictive processing as pattern completion, we examined the relationship between domain-general pattern recognition and individual variation in degraded speech learning. Participants completed a visual pattern recognition task measuring individual tendency toward pattern completion. Participants were also trained to recognize noise-vocoded speech with written transcriptions and were tested on speech recognition pre- and post-training using a retrieval-based transcription task. Listeners improved significantly in recognizing speech after training, and pattern completion on the visual task predicted improvement on novel items. The results implicate pattern completion as a domain-general learning mechanism that can facilitate speech adaptation in challenging listening contexts.
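Noise-vocoded speech recurs throughout this digest. As a point of reference, below is a minimal Python sketch of a generic noise vocoder; the band count, frequency range, and filter order are illustrative assumptions, not parameters taken from any study listed here.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def noise_vocode(speech, fs, n_bands=6, lo=50.0, hi=8000.0):
    """Replace the fine structure of `speech` with band-limited noise,
    preserving only the slow amplitude envelope within each band."""
    edges = np.geomspace(lo, hi, n_bands + 1)      # log-spaced band edges
    noise = np.random.randn(len(speech))           # broadband noise carrier
    out = np.zeros(len(speech))
    for f_lo, f_hi in zip(edges[:-1], edges[1:]):
        sos = butter(4, [f_lo, f_hi], btype="bandpass", fs=fs, output="sos")
        band = sosfiltfilt(sos, speech)            # band-limited speech
        envelope = np.abs(hilbert(band))           # per-band amplitude envelope
        carrier = sosfiltfilt(sos, noise)          # noise restricted to the same band
        out += envelope * carrier                  # envelope-modulated noise
    return out / (np.max(np.abs(out)) + 1e-12)     # peak-normalize
```

Fewer bands preserve less spectral detail; the 1-4-band stimuli in entry 23 and the 1-7-channel stimuli in entry 8 span the range from unintelligible to largely intelligible.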
Affiliation(s)
- Julia R Drouin
- Division of Speech and Hearing Sciences, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA; Department of Communication Sciences and Disorders, California State University Fullerton, Fullerton, CA 92831, USA
- Charles P Davis
- Department of Psychology & Neuroscience, Duke University, Durham, NC 27708, USA
2. Wang J, Wang X, Zou J, Duan J, Shen Z, Xu N, Chen Y, Zhang J, He H, Bi Y, Ding N. Neural substrate underlying the learning of a passage with unfamiliar vocabulary and syntax. Cereb Cortex 2023;33:10036-10046. [PMID: 37491998] [DOI: 10.1093/cercor/bhad263]
Abstract
Speech comprehension is a complex process involving multiple stages, such as decoding phonetic units, recognizing words, and understanding sentences and passages. In this study, we identify cortical networks beyond basic phonetic processing using a novel passage-learning paradigm. Participants learn to comprehend a story composed of syllables of their native language but containing unfamiliar vocabulary and syntax. Three learning methods are employed, each resulting in some degree of learning within a 12-min session. Functional magnetic resonance imaging results reveal that, when participants listen to the same story, the classic temporal-frontal language network is significantly enhanced by learning. Critically, activation of the left anterior and posterior temporal lobe correlates with the learning outcome, assessed behaviorally through, e.g., word recognition and passage comprehension tests. This study demonstrates that a brief learning session is sufficient to induce neural plasticity in the left temporal lobe, which underlies the transformation from phonetic units to units of meaning, such as words and sentences.
Affiliation(s)
- Jing Wang
- Key Laboratory for Biomedical Engineering of Ministry of Education, Center for Brain Imaging Science and Technology, College of Biomedical Engineering and Instrument Sciences, Zhejiang University, Hangzhou 310027, China
- Xiaosha Wang
- State Key Laboratory of Cognitive Neuroscience and Learning & IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing 100875, China
- Jiajie Zou
- Key Laboratory for Biomedical Engineering of Ministry of Education, Center for Brain Imaging Science and Technology, College of Biomedical Engineering and Instrument Sciences, Zhejiang University, Hangzhou 310027, China
- Jipeng Duan
- Key Laboratory for Biomedical Engineering of Ministry of Education, Center for Brain Imaging Science and Technology, College of Biomedical Engineering and Instrument Sciences, Zhejiang University, Hangzhou 310027, China
- Zhuowen Shen
- Key Laboratory for Biomedical Engineering of Ministry of Education, Center for Brain Imaging Science and Technology, College of Biomedical Engineering and Instrument Sciences, Zhejiang University, Hangzhou 310027, China
- Nannan Xu
- School of Linguistic Sciences and Arts, Jiangsu Normal University, Xuzhou 221009, China
- Yan Chen
- Key Laboratory for Biomedical Engineering of Ministry of Education, Center for Brain Imaging Science and Technology, College of Biomedical Engineering and Instrument Sciences, Zhejiang University, Hangzhou 310027, China
- Jianfeng Zhang
- Key Laboratory for Biomedical Engineering of Ministry of Education, Center for Brain Imaging Science and Technology, College of Biomedical Engineering and Instrument Sciences, Zhejiang University, Hangzhou 310027, China
- Hongjian He
- Key Laboratory for Biomedical Engineering of Ministry of Education, Center for Brain Imaging Science and Technology, College of Biomedical Engineering and Instrument Sciences, Zhejiang University, Hangzhou 310027, China
- Yanchao Bi
- State Key Laboratory of Cognitive Neuroscience and Learning & IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing 100875, China
- Nai Ding
- Key Laboratory for Biomedical Engineering of Ministry of Education, Center for Brain Imaging Science and Technology, College of Biomedical Engineering and Instrument Sciences, Zhejiang University, Hangzhou 310027, China
- MOE Frontier Science Center for Brain Science & Brain-machine Integration, Zhejiang University, Hangzhou 310027, China
3. Banks MI, Krause BM, Berger DG, Campbell DI, Boes AD, Bruss JE, Kovach CK, Kawasaki H, Steinschneider M, Nourski KV. Functional geometry of auditory cortical resting state networks derived from intracranial electrophysiology. PLoS Biol 2023;21:e3002239. [PMID: 37651504] [PMCID: PMC10499207] [DOI: 10.1371/journal.pbio.3002239]
Abstract
Understanding central auditory processing critically depends on defining the underlying auditory cortical networks and their relationship to the rest of the brain. We addressed these questions using resting-state functional connectivity derived from human intracranial electroencephalography. Mapping recording sites into a low-dimensional space, where proximity represents functional similarity, revealed a hierarchical organization. At a fine scale, a group of auditory cortical regions excluded several higher-order auditory areas and segregated maximally from the prefrontal cortex. At a mesoscale, the proximity of limbic structures to the auditory cortex suggested a limbic stream that parallels the classically described ventral and dorsal auditory processing streams. The identities of global hubs in the anterior temporal and cingulate cortex depended on frequency band, consistent with diverse roles in semantic and cognitive processing. At a macroscale, the observed hemispheric asymmetries were not specific to speech and language networks. This approach can be applied to multivariate brain data with respect to development, behavior, and disorders.
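For readers who want a concrete handle on "mapping recording sites into a low-dimensional space where proximity represents functional similarity", here is a sketch using classical multidimensional scaling; the study's actual embedding method, connectivity measure, and preprocessing differ and are not reproduced here.

```python
import numpy as np
from sklearn.manifold import MDS

def embed_sites(connectivity, n_dims=2):
    """connectivity: (n_sites, n_sites) symmetric matrix of functional
    connectivity values in [0, 1]. Returns one coordinate per site such
    that nearby sites are functionally similar."""
    dissimilarity = 1.0 - connectivity             # similar -> close
    np.fill_diagonal(dissimilarity, 0.0)           # zero self-distance
    mds = MDS(n_components=n_dims, dissimilarity="precomputed", random_state=0)
    return mds.fit_transform(dissimilarity)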
Affiliation(s)
- Matthew I. Banks
- Department of Anesthesiology, University of Wisconsin, Madison, Wisconsin, United States of America
- Department of Neuroscience, University of Wisconsin, Madison, Wisconsin, United States of America
- Bryan M. Krause
- Department of Anesthesiology, University of Wisconsin, Madison, Wisconsin, United States of America
- D. Graham Berger
- Department of Anesthesiology, University of Wisconsin, Madison, Wisconsin, United States of America
- Declan I. Campbell
- Department of Anesthesiology, University of Wisconsin, Madison, Wisconsin, United States of America
- Aaron D. Boes
- Department of Neurology, The University of Iowa, Iowa City, Iowa, United States of America
- Joel E. Bruss
- Department of Neurology, The University of Iowa, Iowa City, Iowa, United States of America
- Christopher K. Kovach
- Department of Neurosurgery, The University of Iowa, Iowa City, Iowa, United States of America
- Hiroto Kawasaki
- Department of Neurosurgery, The University of Iowa, Iowa City, Iowa, United States of America
- Mitchell Steinschneider
- Department of Neurology, Albert Einstein College of Medicine, New York, New York, United States of America
- Department of Neuroscience, Albert Einstein College of Medicine, New York, New York, United States of America
- Kirill V. Nourski
- Department of Neurosurgery, The University of Iowa, Iowa City, Iowa, United States of America
- Iowa Neuroscience Institute, The University of Iowa, Iowa City, Iowa, United States of America
4. Murai SA, Riquimaroux H. Long-term changes in cortical representation through perceptual learning of spectrally degraded speech. J Comp Physiol A Neuroethol Sens Neural Behav Physiol 2023;209:163-172. [PMID: 36464716] [DOI: 10.1007/s00359-022-01593-8]
Abstract
Listeners can adapt to acoustically degraded speech through perceptual training, and such learning over long periods underlies the rehabilitation of patients with hearing aids or cochlear implants. Perceptual learning of acoustically degraded speech has been associated with the frontotemporal cortices; however, the neural processes during and after long-term perceptual learning remain unclear. Here we conducted perceptual training with noise-vocoded speech sounds (NVSS), which are spectrally degraded signals, and used functional magnetic resonance imaging to measure cortical activity across seven days of training and at follow-up testing approximately one year later. We demonstrated that young adult participants (n = 5) improved their performance across the seven experimental days and that the gains were maintained after 10 months or more. Representational similarity analysis showed that neural activation patterns for NVSS relative to clear speech in the left posterior superior temporal sulcus (pSTS) differed significantly across the seven training days, accompanied by neural changes in the frontal cortices. In addition, the distinct activation patterns for NVSS in the frontotemporal cortices were still observed 10-13 months after training. We therefore propose that perceptual training can induce plastic changes and long-term effects on the neural representations of the trained degraded speech in the frontotemporal cortices. These behavioral improvements and neural changes provide insight into the cortical mechanisms underlying adaptive processing in difficult listening situations and the long-term rehabilitation of auditory disorders.
Affiliation(s)
- Shota A Murai
- Faculty of Life and Medical Sciences, Doshisha University, 1-3 Miyakodani, Tatara, Kyotanabe, Kyoto 610-0321, Japan; International Research Center for Neurointelligence (WPI-IRCN), The University of Tokyo Institutes for Advanced Study, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-0033, Japan
- Hiroshi Riquimaroux
- Faculty of Life and Medical Sciences, Doshisha University, 1-3 Miyakodani, Tatara, Kyotanabe, Kyoto 610-0321, Japan
5. Lanzilotti C, Andéol G, Micheyl C, Scannella S. Cocktail party training induces increased speech intelligibility and decreased cortical activity in bilateral inferior frontal gyri: a functional near-infrared study. PLoS One 2022;17:e0277801. [PMID: 36454948] [PMCID: PMC9714910] [DOI: 10.1371/journal.pone.0277801]
Abstract
The human brain networks responsible for selectively listening to a voice amid other talkers remain to be clarified. The present study investigated relationships between cortical activity and performance in a speech-in-speech task, before (Experiment I) and after (Experiment II) training-induced improvements. In Experiment I, 74 participants performed a speech-in-speech task while their cortical activity was measured using a functional near-infrared spectroscopy (fNIRS) device. One target talker and one masker talker were presented simultaneously at three target-to-masker ratios (TMRs): adverse, intermediate, and favorable. Behavioral results show that performance increased monotonically with TMR in some participants, while in others it failed to decrease, or even improved, in the adverse-TMR condition. At the neural level, an extensive brain network including frontal (left prefrontal cortex, right dorsolateral prefrontal cortex, and bilateral inferior frontal gyri) and temporal (bilateral auditory cortex) regions was recruited more strongly in the intermediate condition than in the other two. Additionally, bilateral frontal gyri and left auditory cortex activities were positively correlated with behavioral performance in the adverse-TMR condition. In Experiment II, 27 participants whose performance was poorest in the adverse-TMR condition of Experiment I were trained to improve performance in that condition. Results show significant performance improvements along with decreased activity in the bilateral inferior frontal gyri, right dorsolateral prefrontal cortex, left inferior parietal cortex, and right auditory cortex in the adverse-TMR condition after training. Arguably, lower neural activity reflects more efficient masker inhibition after speech-in-speech training. As speech-in-noise tasks also engage frontal and temporal regions, we suggest that, regardless of the type of masking (speech or noise), the complexity of the task prompts the involvement of a similar brain network, and that the initial cognitive recruitment is reduced following training, leading to an economy of cognitive resources.
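The target-to-masker ratio manipulation is simple to make concrete: scale the masker so the target/masker RMS ratio matches the desired value in dB. A minimal sketch (the function name and interface are ours, not the authors'):

```python
import numpy as np

def mix_at_tmr(target, masker, tmr_db):
    """Mix two equal-length signals so that
    20*log10(RMS(target)/RMS(masker)) equals `tmr_db` after scaling."""
    def rms(x):
        return np.sqrt(np.mean(x ** 2))
    gain = rms(target) / (rms(masker) * 10 ** (tmr_db / 20.0))
    return target + gain * masker
```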
Affiliation(s)
- Cosima Lanzilotti
- Département Neuroscience et Sciences Cognitives, Institut de Recherche Biomédicale des Armées, Brétigny-sur-Orge, France
- ISAE-SUPAERO, Université de Toulouse, Toulouse, France
- Thales SIX GTS France, Gennevilliers, France
- Guillaume Andéol
- Département Neuroscience et Sciences Cognitives, Institut de Recherche Biomédicale des Armées, Brétigny-sur-Orge, France
6. Liu Y, Luo C, Zheng J, Liang J, Ding N. Working memory asymmetrically modulates auditory and linguistic processing of speech. Neuroimage 2022;264:119698. [PMID: 36270622] [DOI: 10.1016/j.neuroimage.2022.119698]
Abstract
Working memory load can modulate speech perception. However, since speech perception and working memory are both complex functions, it remains elusive how each component of the working memory (WM) system interacts with each stage of speech processing. To investigate this issue, we concurrently measure how working memory load modulates neural activity tracking three levels of linguistic units, i.e., syllables, phrases, and sentences, using a multiscale frequency-tagging approach. Participants engage in a sentence comprehension task while working memory load is manipulated by asking them to memorize either auditory verbal sequences or visual patterns. Verbal and visual working memory load are found to modulate speech processing in similar ways: higher load attenuates neural tracking of phrases and sentences but enhances neural tracking of syllables. Since verbal and visual WM load similarly influence the neural responses to speech, these influences may derive from the domain-general component of the WM system. More importantly, working memory load asymmetrically modulates lower-level auditory encoding and higher-level linguistic processing of speech, possibly reflecting a reallocation of attention induced by mnemonic load.
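In a frequency-tagging design, linguistic units presented at fixed rates should produce spectral peaks at those rates in the neural response. A minimal readout sketch follows; the 1/2/4 Hz rates are illustrative of typical isochronous designs rather than quoted from the paper.

```python
import numpy as np

def tagged_power(epochs, fs, tag_freqs=(1.0, 2.0, 4.0)):
    """epochs: (n_trials, n_samples) single-channel responses time-locked
    to sentence onset. Returns evoked power at each tagged frequency."""
    evoked = epochs.mean(axis=0)       # averaging suppresses non-phase-locked activity
    power = np.abs(np.fft.rfft(evoked)) ** 2
    freqs = np.fft.rfftfreq(evoked.size, d=1.0 / fs)
    return {f: power[np.argmin(np.abs(freqs - f))] for f in tag_freqs}
```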
Affiliation(s)
- Yiguang Liu
- Research Center for Applied Mathematics and Machine Intelligence, Research Institute of Basic Theories, Zhejiang Lab, Hangzhou 311121, China
- Cheng Luo
- Research Center for Applied Mathematics and Machine Intelligence, Research Institute of Basic Theories, Zhejiang Lab, Hangzhou 311121, China
- Jing Zheng
- Key Laboratory for Biomedical Engineering of Ministry of Education, College of Biomedical Engineering and Instrument Sciences, Zhejiang University, Hangzhou 310027, China
- Junying Liang
- Department of Linguistics, School of International Studies, Zhejiang University, Hangzhou 310058, China
- Nai Ding
- Research Center for Applied Mathematics and Machine Intelligence, Research Institute of Basic Theories, Zhejiang Lab, Hangzhou 311121, China; Key Laboratory for Biomedical Engineering of Ministry of Education, College of Biomedical Engineering and Instrument Sciences, Zhejiang University, Hangzhou 310027, China; The MOE Frontier Science Center for Brain Science & Brain-machine Integration, Zhejiang University, Hangzhou 310012, China
7. Shader MJ, Luke R, McKay CM. Contralateral dominance to speech in the adult auditory cortex immediately after cochlear implantation. iScience 2022;25:104737. [PMID: 35938045] [PMCID: PMC9352526] [DOI: 10.1016/j.isci.2022.104737]
Abstract
Sensory deprivation causes structural and functional changes in the human brain. Cochlear implantation delivers immediate reintroduction of auditory sensory information. Previous reports have indicated that over a year is required for the brain to reestablish canonical cortical processing patterns after the reintroduction of auditory stimulation. We utilized functional near-infrared spectroscopy (fNIRS) to investigate brain activity to natural speech stimuli directly after cochlear implantation. We presented 12 cochlear implant recipients, who each had a minimum of 12 months of auditory deprivation, with unilateral auditory- and visual-speech stimuli. Regardless of the side of implantation, canonical responses were elicited primarily on the contralateral side of stimulation as early as 1 h after device activation. These data indicate that auditory pathway connections are sustained during periods of sensory deprivation in adults, and that typical cortical lateralization is observed immediately following the reintroduction of auditory sensory input.
Affiliation(s)
- Maureen J. Shader
- Purdue University, Department of Speech, Language, and Hearing Sciences, 715 Clinic Drive, West Lafayette, IN 47907, USA
- The University of Melbourne, Department of Medical Bionics, Parkville, VIC 3010, Australia
- Robert Luke
- Bionics Institute, 384-388 Albert St, East Melbourne, VIC 3002, Australia
- Macquarie University, Department of Linguistics, Faculty of Medicine, Health and Human Sciences, Macquarie Hearing, NSW 2109, Australia
- Colette M. McKay
- Bionics Institute, 384-388 Albert St, East Melbourne, VIC 3002, Australia
- The University of Melbourne, Department of Medical Bionics, Parkville, VIC 3010, Australia
8. Hauswald A, Keitel A, Chen Y, Rösch S, Weisz N. Degradation levels of continuous speech affect neural speech tracking and alpha power differently. Eur J Neurosci 2022;55:3288-3302. [PMID: 32687616] [PMCID: PMC9540197] [DOI: 10.1111/ejn.14912]
Abstract
Making sense of a poor auditory signal can pose a challenge. Previous attempts to quantify speech intelligibility in neural terms have usually focused on one of two measures: low-frequency speech-brain synchronization or alpha power modulations. However, reports concerning the modulation of these measures have been mixed, an issue aggravated by the fact that they have normally been studied separately. We present two MEG studies analyzing both measures. In Study 1, participants listened to unimodal auditory speech at three levels of degradation (original, 7-channel, and 3-channel vocoding). Intelligibility declined with declining clarity, but speech remained intelligible to some extent even at the lowest clarity level (3-channel vocoding). Low-frequency (1-7 Hz) speech tracking suggested a U-shaped relationship, with the strongest effects for the medium-degraded speech (7-channel) in bilateral auditory and left frontal regions. To follow up on this finding, we implemented three additional vocoding levels (5-channel, 2-channel, and 1-channel) in a second MEG study. Across this wider range of degradation, speech-brain synchronization showed a similar pattern as in Study 1, but additionally revealed that synchronization declines again once speech becomes unintelligible. The relationship differed for alpha power, which decreased continuously across vocoding levels, reaching a floor at 5-channel vocoding. Models predicting subjective intelligibility from both measures combined outperformed models based on either measure alone. Our findings underline that speech tracking and alpha power are modulated differently by the degree of degradation of continuous speech, yet together contribute to subjective speech understanding.
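Low-frequency speech tracking of the kind reported above can be approximated as spectral coherence between the speech envelope and a neural time series. A minimal sketch, assuming scipy's Welch-based estimator (the study's estimator and source-space analysis are more involved):

```python
import numpy as np
from scipy.signal import coherence

def speech_brain_tracking(envelope, neural, fs, band=(1.0, 7.0)):
    """envelope and neural: equal-length signals at sampling rate fs (Hz).
    Returns mean envelope-brain coherence within `band` (Hz)."""
    f, coh = coherence(envelope, neural, fs=fs, nperseg=int(4 * fs))
    in_band = (f >= band[0]) & (f <= band[1])
    return coh[in_band].mean()
```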
Affiliation(s)
- Anne Hauswald
- Center of Cognitive Neuroscience, University of Salzburg, Salzburg, Austria
- Department of Psychology, University of Salzburg, Salzburg, Austria
- Anne Keitel
- Psychology, School of Social Sciences, University of Dundee, Dundee, UK
- Centre for Cognitive Neuroimaging, University of Glasgow, Glasgow, UK
- Ya-Ping Chen
- Center of Cognitive Neuroscience, University of Salzburg, Salzburg, Austria
- Department of Psychology, University of Salzburg, Salzburg, Austria
- Sebastian Rösch
- Department of Otorhinolaryngology, Paracelsus Medical University, Salzburg, Austria
- Nathan Weisz
- Center of Cognitive Neuroscience, University of Salzburg, Salzburg, Austria
- Department of Psychology, University of Salzburg, Salzburg, Austria
9. Zhou X, Sobczak GS, McKay CM, Litovsky RY. Effects of degraded speech processing and binaural unmasking investigated using functional near-infrared spectroscopy (fNIRS). PLoS One 2022;17:e0267588. [PMID: 35468160] [PMCID: PMC9037936] [DOI: 10.1371/journal.pone.0267588]
Abstract
The present study investigated the effects of degraded speech perception and binaural unmasking using functional near-infrared spectroscopy (fNIRS). Normal-hearing listeners were tested while attending to unprocessed or vocoded speech, presented to the left ear at two speech-to-noise ratios (SNRs). Additionally, by comparing monaural versus diotic masker noise, we measured binaural unmasking. Our primary research question was whether the prefrontal cortex and temporal cortex responded differently to varying listening configurations. Our a priori regions of interest (ROIs) were located in the left dorsolateral prefrontal cortex (DLPFC) and auditory cortex (AC). The left DLPFC has been reported to be involved in attentional processes when listening to degraded speech and in spatial hearing processing, while the AC has been reported to be sensitive to speech intelligibility. Comparisons of cortical activity between these two ROIs revealed significantly different fNIRS response patterns. Further, we found a significant positive correlation between self-reported task difficulty and fNIRS responses in the DLPFC, with a negative but non-significant correlation for the left AC, suggesting that the two ROIs played different roles in effortful speech perception. Our secondary question was whether activity within three sub-regions of the lateral PFC (LPFC), including the DLPFC, was differentially affected by varying speech-noise configurations. We found significant effects of spectral degradation and SNR, and significant differences in fNIRS response amplitudes between the three regions, but no significant interaction between ROI and speech type or between ROI and SNR. When attending to speech with monaural and diotic noise, participants reported the latter conditions to be easier; however, no significant main effect of masker condition on cortical activity was observed. For cortical responses in the LPFC, a significant interaction between SNR and masker condition was observed. These findings suggest that binaural unmasking affects cortical activity by improving the speech reception threshold in noise, rather than by reducing the effort exerted.
Affiliation(s)
- Xin Zhou
- Waisman Center, University of Wisconsin-Madison, Madison, WI, United States of America
- Gabriel S. Sobczak
- School of Medicine and Public Health, University of Wisconsin-Madison, Madison, WI, United States of America
- Colette M. McKay
- The Bionics Institute of Australia, Melbourne, VIC, Australia
- Department of Medical Bionics, University of Melbourne, Melbourne, VIC, Australia
- Ruth Y. Litovsky
- Waisman Center, University of Wisconsin-Madison, Madison, WI, United States of America
- Department of Communication Science and Disorders, University of Wisconsin-Madison, Madison, WI, United States of America
- Division of Otolaryngology, Department of Surgery, University of Wisconsin-Madison, Madison, WI, United States of America
10. Corcoran AW, Perera R, Koroma M, Kouider S, Hohwy J, Andrillon T. Expectations boost the reconstruction of auditory features from electrophysiological responses to noisy speech. Cereb Cortex 2022;33:691-708. [PMID: 35253871] [PMCID: PMC9890472] [DOI: 10.1093/cercor/bhac094]
Abstract
Online speech processing imposes significant computational demands on the listening brain, the underlying mechanisms of which remain poorly understood. Here, we exploit the perceptual "pop-out" phenomenon (i.e. the dramatic improvement of speech intelligibility after receiving information about speech content) to investigate the neurophysiological effects of prior expectations on degraded speech comprehension. We recorded electroencephalography (EEG) and pupillometry from 21 adults while they rated the clarity of noise-vocoded and sine-wave synthesized sentences. Pop-out was reliably elicited following visual presentation of the corresponding written sentence, but not following incongruent or neutral text. Pop-out was associated with improved reconstruction of the acoustic stimulus envelope from low-frequency EEG activity, implying that improvements in perceptual clarity were mediated via top-down signals that enhanced the quality of cortical speech representations. Spectral analysis further revealed that pop-out was accompanied by a reduction in theta-band power, consistent with predictive coding accounts of acoustic filling-in and incremental sentence processing. Moreover, delta-band power, alpha-band power, and pupil diameter were all increased following the provision of any written sentence information, irrespective of content. Together, these findings reveal distinctive profiles of neurophysiological activity that differentiate the content-specific processes associated with degraded speech comprehension from the context-specific processes invoked under adverse listening conditions.
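Stimulus reconstruction here refers to a backward model: time-lagged neural data are regressed onto the acoustic envelope, and reconstruction accuracy (e.g., the correlation between reconstructed and actual envelopes) indexes the quality of cortical speech representations. A minimal ridge-regression sketch, with the lag range and regularization chosen ad hoc rather than taken from the paper:

```python
import numpy as np
from sklearn.linear_model import Ridge

def fit_backward_model(eeg, envelope, fs, max_lag_s=0.25, alpha=1.0):
    """eeg: (n_samples, n_channels); envelope: (n_samples,).
    EEG at times t..t+max_lag predicts the envelope at time t."""
    lags = np.arange(int(max_lag_s * fs))
    n = eeg.shape[0] - lags[-1]
    X = np.hstack([eeg[lag:lag + n] for lag in lags])  # lagged design matrix
    y = envelope[:n]
    model = Ridge(alpha=alpha).fit(X, y)
    return model, np.corrcoef(model.predict(X), y)[0, 1]
```

In practice the correlation is computed on held-out data; the in-sample value returned above would be optimistic.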
Affiliation(s)
- Andrew W Corcoran
- Corresponding author: Room E672, 20 Chancellors Walk, Clayton, VIC 3800, Australia
- Ricardo Perera
- Cognition & Philosophy Laboratory, School of Philosophical, Historical, and International Studies, Monash University, Melbourne, VIC 3800, Australia
- Matthieu Koroma
- Brain and Consciousness Group (ENS, EHESS, CNRS), Département d’Études Cognitives, École Normale Supérieure-PSL Research University, Paris 75005, France
- Sid Kouider
- Brain and Consciousness Group (ENS, EHESS, CNRS), Département d’Études Cognitives, École Normale Supérieure-PSL Research University, Paris 75005, France
- Jakob Hohwy
- Cognition & Philosophy Laboratory, School of Philosophical, Historical, and International Studies, Monash University, Melbourne, VIC 3800, Australia; Monash Centre for Consciousness & Contemplative Studies, Monash University, Melbourne, VIC 3800, Australia
- Thomas Andrillon
- Monash Centre for Consciousness & Contemplative Studies, Monash University, Melbourne, VIC 3800, Australia; Paris Brain Institute, Sorbonne Université, Inserm-CNRS, Paris 75013, France
11. Lin Y, Tsao Y, Hsieh PJ. Neural correlates of individual differences in predicting ambiguous sounds comprehension level. Neuroimage 2022;251:119012. [DOI: 10.1016/j.neuroimage.2022.119012]
12. Cheng FY, Xu C, Gold L, Smith S. Rapid Enhancement of Subcortical Neural Responses to Sine-Wave Speech. Front Neurosci 2022;15:747303. [PMID: 34987356] [PMCID: PMC8721138] [DOI: 10.3389/fnins.2021.747303]
Abstract
The efferent auditory nervous system may be a potent force in shaping how the brain responds to behaviorally significant sounds. Previous human experiments using the frequency following response (FFR) have shown efferent-induced modulation of subcortical auditory function online and over short- and long-term time scales; however, a contemporary understanding of FFR generation raises new questions about whether previous effects were constrained solely to the auditory subcortex. The present experiment used sine-wave speech (SWS), an acoustically sparse stimulus in which dynamic pure tones represent speech formant contours, to evoke the FFR (hereafter FFRSWS). Because of the higher stimulus frequencies used in SWS, this approach biased neural responses toward brainstem generators and allowed three stimuli (/bɔ/, /bu/, and /bo/) to be used to evoke FFRSWS before and after listeners in a training group were made aware that they were hearing degraded speech stimuli. All SWS stimuli were rapidly perceived as speech when presented with an SWS carrier phrase, and average token identification reached ceiling performance during a perceptual training phase. Compared to a control group that remained naïve throughout the experiment, training-group FFRSWS amplitudes were enhanced post-training for each stimulus. Further, linear support vector machine classification of training-group FFRSWS improved significantly post-training compared to the control group, indicating that training-induced neural enhancements were sufficient to bolster machine-learning classification accuracy. These results suggest that the efferent auditory system may rapidly modulate auditory brainstem representations of sounds depending on their context and their perception as speech or non-speech.
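Sine-wave speech replaces each formant with a single frequency- and amplitude-modulated tone. Given formant tracks (e.g., extracted externally with Praat; that step is not shown), synthesis reduces to integrating each frequency track into a phase and summing the sinusoids. A minimal sketch under those assumptions:

```python
import numpy as np

def synthesize_sws(freq_tracks, amp_tracks, fs):
    """freq_tracks, amp_tracks: lists of (n_samples,) arrays, one pair per
    formant (Hz and linear amplitude). Returns the sine-wave replica."""
    out = np.zeros(len(freq_tracks[0]))
    for freq, amp in zip(freq_tracks, amp_tracks):
        phase = 2.0 * np.pi * np.cumsum(freq) / fs   # integrate frequency into phase
        out += amp * np.sin(phase)
    return out / (np.max(np.abs(out)) + 1e-12)       # peak-normalize
```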
Affiliation(s)
- Fan-Yin Cheng
- Department of Speech, Language, and Hearing Sciences, University of Texas at Austin, Austin, TX, United States
- Can Xu
- Department of Speech, Language, and Hearing Sciences, University of Texas at Austin, Austin, TX, United States
- Lisa Gold
- Department of Speech, Language, and Hearing Sciences, University of Texas at Austin, Austin, TX, United States
- Spencer Smith
- Department of Speech, Language, and Hearing Sciences, University of Texas at Austin, Austin, TX, United States
13. Defenderfer J, Forbes S, Wijeakumar S, Hedrick M, Plyler P, Buss AT. Frontotemporal activation differs between perception of simulated cochlear implant speech and speech in background noise: An image-based fNIRS study. Neuroimage 2021;240:118385. [PMID: 34256138] [PMCID: PMC8503862] [DOI: 10.1016/j.neuroimage.2021.118385]
Abstract
In this study we used functional near-infrared spectroscopy (fNIRS) to investigate neural responses in normal-hearing adults as a function of speech recognition accuracy, intelligibility of the speech stimulus, and the manner in which speech is distorted. Participants listened to sentences and reported aloud what they heard. Speech quality was distorted artificially by vocoding (simulated cochlear implant speech) or naturally by adding background noise. Each type of distortion included high and low-intelligibility conditions. Sentences in quiet were used as baseline comparison. fNIRS data were analyzed using a newly developed image reconstruction approach. First, elevated cortical responses in the middle temporal gyrus (MTG) and middle frontal gyrus (MFG) were associated with speech recognition during the low-intelligibility conditions. Second, activation in the MTG was associated with recognition of vocoded speech with low intelligibility, whereas MFG activity was largely driven by recognition of speech in background noise, suggesting that the cortical response varies as a function of distortion type. Lastly, an accuracy effect in the MFG demonstrated significantly higher activation during correct perception relative to incorrect perception of speech. These results suggest that normal-hearing adults (i.e., untrained listeners of vocoded stimuli) do not exploit the same attentional mechanisms of the frontal cortex used to resolve naturally degraded speech and may instead rely on segmental and phonetic analyses in the temporal lobe to discriminate vocoded speech.
Affiliation(s)
- Jessica Defenderfer
- Speech and Hearing Science, University of Tennessee Health Science Center, Knoxville, TN, United States
- Samuel Forbes
- Psychology, University of East Anglia, Norwich, England
- Mark Hedrick
- Speech and Hearing Science, University of Tennessee Health Science Center, Knoxville, TN, United States
- Patrick Plyler
- Speech and Hearing Science, University of Tennessee Health Science Center, Knoxville, TN, United States
- Aaron T Buss
- Psychology, University of Tennessee, Knoxville, TN, United States
14. Jiang J, Benhamou E, Waters S, Johnson JCS, Volkmer A, Weil RS, Marshall CR, Warren JD, Hardy CJD. Processing of Degraded Speech in Brain Disorders. Brain Sci 2021;11:394. [PMID: 33804653] [PMCID: PMC8003678] [DOI: 10.3390/brainsci11030394]
Abstract
The speech we hear every day is typically "degraded" by competing sounds and the idiosyncratic vocal characteristics of individual speakers. While the comprehension of "degraded" speech is normally automatic, it depends on dynamic and adaptive processing across distributed neural networks. This presents the brain with an immense computational challenge, making degraded speech processing vulnerable to a range of brain disorders. Therefore, it is likely to be a sensitive marker of neural circuit dysfunction and an index of retained neural plasticity. Considering experimental methods for studying degraded speech and factors that affect its processing in healthy individuals, we review the evidence for altered degraded speech processing in major neurodegenerative diseases, traumatic brain injury and stroke. We develop a predictive coding framework for understanding deficits of degraded speech processing in these disorders, focussing on the "language-led dementias"-the primary progressive aphasias. We conclude by considering prospects for using degraded speech as a probe of language network pathophysiology, a diagnostic tool and a target for therapeutic intervention.
Affiliation(s)
- Jessica Jiang
- Dementia Research Centre, Department of Neurodegenerative Disease, UCL Queen Square Institute of Neurology, London WC1N 3BG, UK
- Elia Benhamou
- Dementia Research Centre, Department of Neurodegenerative Disease, UCL Queen Square Institute of Neurology, London WC1N 3BG, UK
- Sheena Waters
- Preventive Neurology Unit, Wolfson Institute of Preventive Medicine, Queen Mary University of London, London EC1M 6BQ, UK
- Jeremy C. S. Johnson
- Dementia Research Centre, Department of Neurodegenerative Disease, UCL Queen Square Institute of Neurology, London WC1N 3BG, UK
- Anna Volkmer
- Division of Psychology and Language Sciences, University College London, London WC1H 0AP, UK
- Rimona S. Weil
- Dementia Research Centre, Department of Neurodegenerative Disease, UCL Queen Square Institute of Neurology, London WC1N 3BG, UK
- Charles R. Marshall
- Dementia Research Centre, Department of Neurodegenerative Disease, UCL Queen Square Institute of Neurology, London WC1N 3BG, UK
- Preventive Neurology Unit, Wolfson Institute of Preventive Medicine, Queen Mary University of London, London EC1M 6BQ, UK
- Jason D. Warren
- Dementia Research Centre, Department of Neurodegenerative Disease, UCL Queen Square Institute of Neurology, London WC1N 3BG, UK
- Chris J. D. Hardy
- Dementia Research Centre, Department of Neurodegenerative Disease, UCL Queen Square Institute of Neurology, London WC1N 3BG, UK
15. Adaptation to mis-pronounced speech: evidence for a prefrontal-cortex repair mechanism. Sci Rep 2021;11:97. [PMID: 33420193] [PMCID: PMC7794353] [DOI: 10.1038/s41598-020-79640-0]
Abstract
Speech is a complex and ambiguous acoustic signal that varies significantly within and across speakers. Despite the processing challenge that such variability poses, humans adapt to systematic variations in pronunciation rapidly. The goal of this study is to uncover the neurobiological bases of the attunement process that enables such fluent comprehension. Twenty-four native English participants listened to words spoken by a “canonical” American speaker and two non-canonical speakers, and performed a word-picture matching task, while magnetoencephalography was recorded. Non-canonical speech was created by including systematic phonological substitutions within the word (e.g. [s] → [sh]). Activity in the auditory cortex (superior temporal gyrus) was greater in response to substituted phonemes, and, critically, this was not attenuated by exposure. By contrast, prefrontal regions showed an interaction between the presence of a substitution and the amount of exposure: activity decreased for canonical speech over time, whereas responses to non-canonical speech remained consistently elevated. Granger causality analyses further revealed that prefrontal responses serve to modulate activity in auditory regions, suggesting the recruitment of top-down processing to decode non-canonical pronunciations. In sum, our results suggest that the behavioural deficit in processing mispronounced phonemes may be due to a disruption to the typical exchange of information between the prefrontal and auditory cortices as observed for canonical speech.
16. Lin IF, Itahashi T, Kashino M, Kato N, Hashimoto RI. Brain activations while processing degraded speech in adults with autism spectrum disorder. Neuropsychologia 2021;152:107750. [PMID: 33417913] [DOI: 10.1016/j.neuropsychologia.2021.107750]
Abstract
Individuals with autism spectrum disorder (ASD) have been found to have difficulty understanding speech in adverse conditions. In this study, we used noise-vocoded speech (VS) to investigate the neural processing of degraded speech in individuals with ASD. We ran fMRI experiments in an ASD group and a typically developed control (TDC) group while they listened to clear speech (CS), VS, and spectrally rotated VS (SRVS); they were asked to attend to each heard sentence and report whether it was intelligible. The VS used in this experiment was spectrally degraded but still intelligible, whereas the SRVS was unintelligible. We recruited 21 right-handed adult males with ASD and 24 age-matched, right-handed male TDC participants. Compared with the TDC group, the ASD group showed reduced functional connectivity (FC) between the left dorsal premotor cortex and the left temporoparietal junction for the effect of task difficulty in speech processing, computed as VS − (CS + SRVS)/2. Furthermore, this reduced FC was negatively correlated with Autism-Spectrum Quotient scores. This observation supports our hypothesis that a disrupted dorsal stream for the attentive processing of degraded speech in individuals with ASD might be related to their difficulty understanding speech in adverse conditions.
Affiliation(s)
- I-Fan Lin
- Communication Science Laboratories, NTT Corporation, Atsugi, Kanagawa 243-0124, Japan; Department of Medicine, Taipei Medical University, Taipei 11031, Taiwan; Department of Occupational Medicine, Shuang Ho Hospital, New Taipei City 23561, Taiwan
- Takashi Itahashi
- Medical Institute of Developmental Disabilities Research, Showa University Karasuyama Hospital, Tokyo 157-8577, Japan
- Makio Kashino
- Communication Science Laboratories, NTT Corporation, Atsugi, Kanagawa 243-0124, Japan; School of Engineering, Tokyo Institute of Technology, Yokohama 226-8503, Japan; Graduate School of Education, University of Tokyo, Tokyo 113-0033, Japan
- Nobumasa Kato
- Medical Institute of Developmental Disabilities Research, Showa University Karasuyama Hospital, Tokyo 157-8577, Japan
- Ryu-Ichiro Hashimoto
- Medical Institute of Developmental Disabilities Research, Showa University Karasuyama Hospital, Tokyo 157-8577, Japan; Department of Language Sciences, Tokyo Metropolitan University, Tokyo 192-0364, Japan
17. Liu H, Miyakoshi M, Nakai T, Annabel Chen SH. Aging patterns of Japanese auditory semantic processing: an fMRI study. Aging, Neuropsychology, and Cognition 2020;29:213-236. [DOI: 10.1080/13825585.2020.1861202]
Affiliation(s)
- Hengshuang Liu
- National Key Research Centre for Linguistics and Applied Linguistics, Bilingual Cognition and Development Lab, Guangdong University of Foreign Studies, Guangzhou, China
- Makoto Miyakoshi
- Swartz Center for Computational Neuroscience, Institute for Neural Computation, University of California San Diego, San Diego, CA, USA
- Toshiharu Nakai
- Department of Radiology, Graduate School of Dentistry, Osaka University, Suita, Osaka, Japan
- Shen-Hsing Annabel Chen
- Psychology, School of Social Sciences, Nanyang Technological University, Singapore
- Centre for Research and Development in Learning, Nanyang Technological University, Singapore
- Lee Kong Chian School of Medicine (LKCMedicine), Nanyang Technological University, Singapore
- National Institute of Education, Nanyang Technological University, Singapore
18. Herrmann B, Johnsrude IS. Absorption and Enjoyment During Listening to Acoustically Masked Stories. Trends Hear 2020;24:2331216520967850. [PMID: 33143565] [DOI: 10.1177/2331216520967850]
Abstract
Comprehension of speech masked by background sound requires increased cognitive processing, which makes listening effortful. Research in hearing has focused on such challenging listening experiences, in part because they are thought to contribute to social withdrawal in people with hearing impairment. Research has focused less on positive listening experiences, such as enjoyment, despite their potential importance in motivating effortful listening. Moreover, the artificial speech materials-such as disconnected, brief sentences-commonly used to investigate speech intelligibility and listening effort may be ill-suited to capture positive experiences when listening is challenging. Here, we investigate how listening to naturalistic spoken stories under acoustic challenges influences the quality of listening experiences. We assess absorption (the feeling of being immersed/engaged in a story), enjoyment, and listening effort and show that (a) story absorption and enjoyment are only minimally affected by moderate speech masking although listening effort increases, (b) thematic knowledge increases absorption and enjoyment and reduces listening effort when listening to a story presented in multitalker babble, and (c) absorption and enjoyment increase and effort decreases over time as individuals listen to several stories successively in multitalker babble. Our research indicates that naturalistic, spoken stories can reveal several concurrent listening experiences and that expertise in a topic can increase engagement and reduce effort. Our work also demonstrates that, although listening effort may increase with speech masking, listeners may still find the experience both absorbing and enjoyable.
Affiliation(s)
- Björn Herrmann
- Rotman Research Institute, Baycrest, Toronto, Ontario, Canada; Department of Psychology, University of Toronto, Toronto, Ontario, Canada; Department of Psychology, University of Western Ontario, London, Canada
- Ingrid S Johnsrude
- Department of Psychology, University of Western Ontario, London, Canada; School of Communication Sciences & Disorders, University of Western Ontario, London, Canada
19. Campbell J, Sharma A. Frontal Cortical Modulation of Temporal Visual Cross-Modal Re-organization in Adults with Hearing Loss. Brain Sci 2020;10:498. [PMID: 32751543] [PMCID: PMC7465622] [DOI: 10.3390/brainsci10080498]
Abstract
Recent research has demonstrated that frontal cortical involvement co-occurs with visual re-organization, suggestive of top-down modulation of cross-modal mechanisms. However, it is unclear whether top-down modulation of visual re-organization takes place in mild hearing loss or depends on greater degrees of hearing loss severity. Thus, the purpose of this study was to determine whether frontal top-down modulation of visual cross-modal re-organization increases with hearing loss severity. We recorded visual evoked potentials (VEPs) in response to apparent-motion stimuli in 17 adults with mild-to-moderate hearing loss using 128-channel high-density electroencephalography (EEG). Current density reconstructions (CDRs) were generated using sLORETA to visualize VEP generators in both groups. VEP latency and amplitude in frontal regions of interest (ROIs) were compared between groups and correlated with auditory behavioral measures. Activation of frontal networks in response to visual stimulation increased from mild to moderate hearing loss, with simultaneous activation of the temporal cortex. In addition, group differences in VEP latency and amplitude correlated with auditory behavioral measures. Overall, these findings support the hypothesis that frontal top-down modulation of visual cross-modal re-organization depends on hearing loss severity.
Affiliation(s)
- Julia Campbell
- Central Sensory Processes Laboratory, Department of Communication Sciences and Disorders, University of Texas at Austin, 2504 Whitis Ave A1100, Austin, TX 78712, USA
- Anu Sharma
- Brain and Behavior Laboratory, Institute of Cognitive Science, Department of Speech, Language and Hearing Science, University of Colorado at Boulder, 409 UCB, 2501 Kittredge Loop Drive, Boulder, CO 80309, USA
20. Cardin V, Rosen S, Konieczny L, Coulson K, Lametti D, Edwards M, Woll B. The effect of dopamine on the comprehension of spectrally-shifted noise-vocoded speech: a pilot study. Int J Audiol 2020;59:674-681. [PMID: 32186216] [DOI: 10.1080/14992027.2020.1734675]
Abstract
Objectives: Cochlear implantation has proven beneficial in restoring hearing. However, success is variable, and there is a need for a simple post-implantation therapy that could significantly increase implantation success. Dopamine has a general role in learning and in assigning value to environmental stimuli. We tested the effect of dopamine on the comprehension of spectrally-shifted noise-vocoded (SSNV) speech, which simulates, in hearing individuals, the signal delivered by a cochlear implant (CI).
Design and study sample: Thirty-five participants (age = 38.0 ± 10.1 SD) recruited from the general population were divided into three groups. We tested SSNV speech comprehension in two experimental sessions. In one session, a metabolic precursor of dopamine (L-DOPA) was administered to participants in two of the groups; a placebo was administered in the other session.
Results: A single dose of L-DOPA interacted with training to improve perception of SSNV speech, but did not significantly accelerate learning.
Conclusions: These findings are a first step in exploring the use of dopamine to enhance speech understanding in CI patients. Replications of these results using SSNV speech in individuals with normal hearing, and also in CI users, are needed to determine whether these effects can translate into benefits in everyday language comprehension.
Affiliation(s)
- Velia Cardin
- Deafness, Cognition and Language Research Centre, University College London, London, United Kingdom; School of Psychology, University of East Anglia, Norwich, Norfolk, United Kingdom
- Stuart Rosen
- Speech, Hearing and Phonetics Sciences, UCL, London, United Kingdom
- Linda Konieczny
- Deafness, Cognition and Language Research Centre, University College London, London, United Kingdom
- Kim Coulson
- Deafness, Cognition and Language Research Centre, University College London, London, United Kingdom
- Daniel Lametti
- Department of Psychology, Acadia University, Wolfville, Nova Scotia, Canada
- Mark Edwards
- Neuroscience Research Centre, Institute of Molecular and Clinical Sciences, St George's University of London, London, United Kingdom
- Bencie Woll
- Deafness, Cognition and Language Research Centre, University College London, London, United Kingdom
21. Dimitrijevic A, Smith ML, Kadis DS, Moore DR. Neural indices of listening effort in noisy environments. Sci Rep 2019;9:11278. [PMID: 31375712] [PMCID: PMC6677804] [DOI: 10.1038/s41598-019-47643-1]
22. Khoshkhoo S, Leonard MK, Mesgarani N, Chang EF. Neural correlates of sine-wave speech intelligibility in human frontal and temporal cortex. Brain Lang 2018;187:83-91. [PMID: 29397190] [PMCID: PMC6067983] [DOI: 10.1016/j.bandl.2018.01.007]
Abstract
Auditory speech comprehension is the result of neural computations that occur in a broad network that includes the temporal lobe auditory cortex and the left inferior frontal cortex. It remains unclear how representations in this network differentially contribute to speech comprehension. Here, we recorded high-density direct cortical activity during a sine-wave speech (SWS) listening task to examine detailed neural speech representations when the exact same acoustic input is comprehended versus not comprehended. Listeners heard SWS sentences (pre-exposure), followed by clear versions of the same sentences, which revealed the content of the sounds (exposure), and then the same SWS sentences again (post-exposure). Across all three task phases, high-gamma neural activity in the superior temporal gyrus was similar, distinguishing different words based on bottom-up acoustic features. In contrast, frontal regions showed a more pronounced and sudden increase in activity only when the input was comprehended, which corresponded with stronger representational separability among spatiotemporal activity patterns evoked by different words. We observed this effect only in participants who were not able to comprehend the stimuli during the pre-exposure phase, indicating a relationship between frontal high-gamma activity and speech understanding. Together, these results demonstrate that both frontal and temporal cortical networks are involved in spoken language understanding, and that under certain listening conditions, frontal regions are involved in discriminating speech sounds.
Collapse
Affiliation(s)
- Sattar Khoshkhoo
- School of Medicine, University of California, San Francisco, 505 Parnassus Ave., San Francisco, CA 94143, United States
| | - Matthew K Leonard
- Department of Neurological Surgery, University of California, San Francisco, 505 Parnassus Ave., San Francisco, CA 94143, United States; Center for Integrative Neuroscience, University of California, San Francisco, 675 Nelson Rising Ln., Room 535, San Francisco, CA 94158, United States; Weill Institute for Neurosciences, University of California, San Francisco, 675 Nelson Rising Ln., Room 535, San Francisco, CA 94158, United States
| | - Nima Mesgarani
- Department of Electrical Engineering, Columbia University, Mudd Building, Room 1339, 500 W 120th St., New York, NY 10027, United States
| | - Edward F Chang
- Department of Neurological Surgery, University of California, San Francisco, 505 Parnassus Ave., San Francisco, CA 94143, United States; Center for Integrative Neuroscience, University of California, San Francisco, 675 Nelson Rising Ln., Room 535, San Francisco, CA 94158, United States; Weill Institute for Neurosciences, University of California, San Francisco, 675 Nelson Rising Ln., Room 535, San Francisco, CA 94158, United States.
| |
Collapse
|
23
|
Nourski KV, Steinschneider M, Rhone AE, Kovach CK, Kawasaki H, Howard MA. Differential responses to spectrally degraded speech within human auditory cortex: An intracranial electrophysiology study. Hear Res 2018; 371:53-65. [PMID: 30500619 DOI: 10.1016/j.heares.2018.11.009] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 07/13/2018] [Revised: 11/15/2018] [Accepted: 11/19/2018] [Indexed: 12/28/2022]
Abstract
Understanding cortical processing of spectrally degraded speech in normal-hearing subjects may provide insights into how sound information is processed by cochlear implant (CI) users. This study investigated electrocorticographic (ECoG) responses to noise-vocoded speech and related these responses to behavioral performance in a phonemic identification task. Subjects were neurosurgical patients undergoing chronic invasive monitoring for medically refractory epilepsy. Stimuli were utterances /aba/ and /ada/, spectrally degraded using a noise vocoder (1-4 bands). ECoG responses were obtained from Heschl's gyrus (HG) and superior temporal gyrus (STG), and were examined within the high gamma frequency range (70-150 Hz). All subjects performed at chance accuracy with speech degraded to 1 and 2 spectral bands, and at or near ceiling for clear speech. Inter-subject variability was observed in the 3- and 4-band conditions. High gamma responses in posteromedial HG (auditory core cortex) were similar for all vocoded conditions and clear speech. A progressive preference for clear speech emerged in anterolateral segments of HG, regardless of behavioral performance. On the lateral STG, responses to all vocoded stimuli were larger in subjects with better task performance. In contrast, both behavioral and neural responses to clear speech were comparable across subjects regardless of their ability to identify degraded stimuli. Findings highlight differences in representation of spectrally degraded speech across cortical areas and their relationship to perception. The results are in agreement with prior non-invasive results. The data provide insight into the neural mechanisms associated with variability in perception of degraded speech and potentially into sources of such variability in CI users.
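The high-gamma measure referred to above is commonly computed by band-passing each channel at 70-150 Hz and taking the analytic amplitude, normalized to a pre-stimulus baseline. A minimal sketch follows; the parameters are illustrative, not the study's exact pipeline.

```python
# High-gamma (70-150 Hz) envelope extraction, a common ECoG recipe:
# band-pass, Hilbert analytic amplitude, baseline z-score.
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def high_gamma_envelope(x, fs, band=(70.0, 150.0)):
    sos = butter(4, band, btype="band", fs=fs, output="sos")
    return np.abs(hilbert(sosfiltfilt(sos, x)))    # analytic amplitude

def baseline_zscore(env, fs, t0=0.2):
    base = env[: int(t0 * fs)]                     # first 200 ms as baseline
    return (env - base.mean()) / base.std()

fs = 1000
x = np.random.default_rng(1).standard_normal(2 * fs)   # stand-in for one channel
hg = baseline_zscore(high_gamma_envelope(x, fs), fs)
```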
Collapse
Affiliation(s)
- Kirill V Nourski
- Department of Neurosurgery, The University of Iowa, Iowa City, IA, USA; Iowa Neuroscience Institute, The University of Iowa, Iowa City, IA, USA.
| | - Mitchell Steinschneider
- Departments of Neurology and Neuroscience, Albert Einstein College of Medicine, Bronx, NY, USA
| | - Ariane E Rhone
- Department of Neurosurgery, The University of Iowa, Iowa City, IA, USA
| | | | - Hiroto Kawasaki
- Department of Neurosurgery, The University of Iowa, Iowa City, IA, USA
| | - Matthew A Howard
- Department of Neurosurgery, The University of Iowa, Iowa City, IA, USA; Iowa Neuroscience Institute, The University of Iowa, Iowa City, IA, USA; Pappajohn Biomedical Institute, The University of Iowa, Iowa City, IA, USA
| |
Collapse
|
24
|
Altvater-Mackensen N, Grossmann T. Modality-independent recruitment of inferior frontal cortex during speech processing in human infants. Dev Cogn Neurosci 2018; 34:130-138. [PMID: 30391756 PMCID: PMC6969291 DOI: 10.1016/j.dcn.2018.10.002] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/27/2017] [Revised: 08/25/2018] [Accepted: 10/25/2018] [Indexed: 11/22/2022] Open
Abstract
Despite increasing interest in the development of audiovisual speech perception in infancy, the underlying mechanisms and neural processes are still only poorly understood. In addition to regions in temporal cortex associated with speech processing and multimodal integration, such as superior temporal sulcus, left inferior frontal cortex (IFC) has been suggested to be critically involved in mapping information from different modalities during speech perception. To further illuminate the role of IFC during infant language learning and speech perception, the current study examined the processing of auditory, visual and audiovisual speech in 6-month-old infants using functional near-infrared spectroscopy (fNIRS). Our results revealed that infants recruit speech-sensitive regions in frontal cortex including IFC regardless of whether they processed unimodal or multimodal speech. We argue that IFC may play an important role in associating multimodal speech information during the early steps of language learning.
Collapse
Affiliation(s)
- Nicole Altvater-Mackensen
- Department of Psychology, Johannes-Gutenberg-University Mainz, Germany; Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany.
| | - Tobias Grossmann
- Department of Psychology, University of Virginia, USA; Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
| |
Collapse
|
25
|
Yellamsetty A, Bidelman GM. Low- and high-frequency cortical brain oscillations reflect dissociable mechanisms of concurrent speech segregation in noise. Hear Res 2018; 361:92-102. [PMID: 29398142 DOI: 10.1016/j.heares.2018.01.006] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 05/24/2017] [Revised: 12/09/2017] [Accepted: 01/12/2018] [Indexed: 10/18/2022]
Abstract
Parsing simultaneous speech requires that listeners use pitch-guided segregation, which can be affected by the signal-to-noise ratio (SNR) in the auditory scene. The interaction of these two cues may occur at multiple levels within the cortex. The aims of the current study were to assess the correspondence between oscillatory brain rhythms and behavior as listeners exploit pitch and SNR cues to successfully segregate concurrent speech. We recorded electrical brain activity while participants heard double-vowel stimuli whose fundamental frequencies (F0s) differed by zero or four semitones (STs) presented in either clean or noise-degraded (+5 dB SNR) conditions. We found that behavioral identification was more accurate for vowel mixtures with larger pitch separations, but F0 benefit interacted with noise. Time-frequency analysis decomposed the EEG into different spectrotemporal frequency bands. Low-frequency (θ, β) responses were elevated when speech did not contain pitch cues (0 ST > 4 ST) or was noisy, suggesting a correlate of increased listening effort and/or memory demands. Contrastively, γ power increments were observed for changes in both pitch (0 ST > 4 ST) and SNR (clean > noise), suggesting that high-frequency bands carry information related to acoustic features and the quality of speech representations. Brain-behavior associations corroborated these effects; modulations in low-frequency rhythms predicted the speed of listeners' perceptual decisions, with higher bands predicting identification accuracy. Results are consistent with the notion that neural oscillations reflect both automatic (pre-perceptual) and controlled (post-perceptual) mechanisms of speech processing that are largely divisible into high- and low-frequency bands of human brain rhythms.
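Time-frequency decompositions like the one described here are often implemented with complex Morlet wavelets, with power then averaged within theta, beta, and gamma ranges. Below is a minimal single-channel sketch on synthetic data; the band definitions and cycle count are illustrative choices.

```python
# Minimal Morlet-wavelet time-frequency decomposition of one EEG trace.
import numpy as np

def morlet_power(x, fs, freqs, n_cycles=7):
    power = np.empty((len(freqs), len(x)))
    for i, f in enumerate(freqs):
        dur = n_cycles / f                                 # wavelet length (s)
        tw = np.arange(-dur / 2, dur / 2, 1 / fs)
        wavelet = (np.exp(2j * np.pi * f * tw)             # complex carrier
                   * np.exp(-(tw ** 2) / (2 * (dur / 6) ** 2)))  # Gaussian taper
        wavelet /= np.abs(wavelet).sum()
        power[i] = np.abs(np.convolve(x, wavelet, mode="same")) ** 2
    return power

fs = 500
x = np.random.default_rng(2).standard_normal(3 * fs)       # stand-in EEG trace
bands = {"theta": (4, 8), "beta": (13, 30), "gamma": (30, 80)}
freqs = np.arange(4, 81, 2.0)
tfr = morlet_power(x, fs, freqs)
band_power = {name: tfr[(freqs >= lo) & (freqs < hi)].mean()
              for name, (lo, hi) in bands.items()}
```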
Collapse
Affiliation(s)
- Anusha Yellamsetty
- School of Communication Sciences & Disorders, University of Memphis, Memphis, TN, USA
| | - Gavin M Bidelman
- School of Communication Sciences & Disorders, University of Memphis, Memphis, TN, USA; Institute for Intelligent Systems, University of Memphis, Memphis, TN, USA; University of Tennessee Health Sciences Center, Department of Anatomy and Neurobiology, Memphis, TN, USA.
| |
Collapse
|
26
|
fMRI as a Preimplant Objective Tool to Predict Postimplant Oral Language Outcomes in Children with Cochlear Implants. Ear Hear 2018; 37:e263-72. [PMID: 26689275 DOI: 10.1097/aud.0000000000000259] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/25/2022]
Abstract
OBJECTIVES: Despite the positive effects of cochlear implantation, postimplant variability in speech perception and oral language outcomes is still difficult to predict. The aim of this study was to identify neuroimaging biomarkers of postimplant speech perception and oral language performance in children with hearing loss who receive a cochlear implant. The authors hypothesized positive correlations between blood oxygen level-dependent functional magnetic resonance imaging (fMRI) activation in brain regions related to auditory language processing and attention and scores on the Clinical Evaluation of Language Fundamentals-Preschool, Second Edition (CELF-P2) and the Early Speech Perception Test for Profoundly Hearing-Impaired Children (ESP), in children with congenital hearing loss. DESIGN: Eleven children with congenital hearing loss were recruited for the present study based on referral for clinical MRI and other inclusion criteria. All participants were <24 months at fMRI scanning and <36 months at first implantation. A silent background fMRI acquisition method was performed to acquire fMRI during auditory stimulation. A voxel-based analysis technique was utilized to generate z maps showing significant contrast in brain activation between auditory stimulation conditions (spoken narratives and narrow band noise). CELF-P2 and ESP were administered 2 years after implantation. Because most participants reached a ceiling on ESP, a voxel-wise regression analysis was performed between preimplant fMRI activation and postimplant CELF-P2 scores alone. Age at implantation and preimplant hearing thresholds were controlled in this regression analysis. RESULTS: Four brain regions were found to be significantly correlated with CELF-P2 scores. These clusters of positive correlation encompassed the temporo-parieto-occipital junction, areas in the prefrontal cortex and the cingulate gyrus. For the story versus silence contrast, CELF-P2 core language score demonstrated significant positive correlation with activation in the right angular gyrus (r = 0.95), left medial frontal gyrus (r = 0.94), and left cingulate gyrus (r = 0.96). For the narrow band noise versus silence contrast, the CELF-P2 core language score exhibited significant positive correlation with activation in the left angular gyrus (r = 0.89; for all clusters, corrected p < 0.05). CONCLUSIONS: Four brain regions related to language function and attention were identified that correlated with CELF-P2. Children with better oral language performance postimplant displayed greater activation in these regions preimplant. The results suggest that despite auditory deprivation, these regions are more receptive to gains in oral language development performance of children with hearing loss who receive early intervention via cochlear implantation. The present study suggests that oral language outcome following cochlear implant may be predicted by preimplant fMRI with auditory stimulation using natural speech.
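The voxel-wise regression step, controlling age at implantation and preimplant thresholds, amounts to residualizing both the outcome and each voxel's activation against the covariates and then correlating the residuals. A toy sketch with invented data shapes follows; this is not the authors' pipeline, just the underlying arithmetic.

```python
# Toy voxel-wise partial correlation between pre-implant activation and a
# post-implant language score, controlling two covariates. Data are synthetic.
import numpy as np

rng = np.random.default_rng(3)
n_subj, n_vox = 11, 2000
activation = rng.standard_normal((n_subj, n_vox))   # pre-implant fMRI betas
score = rng.standard_normal(n_subj)                 # CELF-P2-like outcome
age = rng.uniform(10, 24, n_subj)                   # months at implantation
threshold = rng.uniform(70, 110, n_subj)            # dB HL (illustrative)

def residualize(y, covars):
    X = np.column_stack([np.ones(len(y))] + covars)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta

score_r = residualize(score, [age, threshold])
act_r = np.column_stack([residualize(activation[:, v], [age, threshold])
                         for v in range(n_vox)])
# Partial correlation per voxel between activation and outcome.
r = (act_r * score_r[:, None]).sum(0) / (
    np.linalg.norm(act_r, axis=0) * np.linalg.norm(score_r) + 1e-12)
```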
Collapse
|
27
|
Xie X, Myers E. Left Inferior Frontal Gyrus Sensitivity to Phonetic Competition in Receptive Language Processing: A Comparison of Clear and Conversational Speech. J Cogn Neurosci 2017; 30:267-280. [PMID: 29160743 DOI: 10.1162/jocn_a_01208] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/20/2022]
Abstract
The speech signal is rife with variations in phonetic ambiguity. For instance, when talkers speak in a conversational register, they demonstrate less articulatory precision, leading to greater potential for confusability at the phonetic level compared with a clear speech register. Current psycholinguistic models assume that ambiguous speech sounds activate more than one phonological category and that competition at prelexical levels cascades to lexical levels of processing. Imaging studies have shown that the left inferior frontal gyrus (LIFG) is modulated by phonetic competition between simultaneously activated categories, with increases in activation for more ambiguous tokens. Yet, these studies have often used artificially manipulated speech and/or metalinguistic tasks, which arguably may recruit neural regions that are not critical for natural speech recognition. Indeed, a prominent model of speech processing, the dual-stream model, posits that the LIFG is not involved in prelexical processing in receptive language processing. In the current study, we exploited natural variation in phonetic competition in the speech signal to investigate the neural systems sensitive to phonetic competition as listeners engage in a receptive language task. Participants heard nonsense sentences spoken in either a clear or conversational register as neural activity was monitored using fMRI. Conversational sentences contained greater phonetic competition, as estimated by measures of vowel confusability, and these sentences also elicited greater activation in a region in the LIFG. Sentence-level phonetic competition metrics uniquely correlated with LIFG activity as well. This finding is consistent with the hypothesis that the LIFG responds to competition at multiple levels of language processing and that recruitment of this region does not require an explicit phonological judgment.
Collapse
|
28
|
Alderson-Day B, Lima CF, Evans S, Krishnan S, Shanmugalingam P, Fernyhough C, Scott SK. Distinct processing of ambiguous speech in people with non-clinical auditory verbal hallucinations. Brain 2017; 140:2475-2489. [PMID: 29050393 DOI: 10.1093/brain/awx206] [Citation(s) in RCA: 60] [Impact Index Per Article: 8.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/18/2017] [Accepted: 06/29/2017] [Indexed: 01/17/2023] Open
Abstract
Auditory verbal hallucinations (hearing voices) are typically associated with psychosis, but a minority of the general population also experience them frequently and without distress. Such 'non-clinical' experiences offer a rare and unique opportunity to study hallucinations apart from confounding clinical factors, thus allowing for the identification of symptom-specific mechanisms. Recent theories propose that hallucinations result from an imbalance of prior expectation and sensory information, but whether such an imbalance also influences auditory-perceptual processes remains unknown. We examine for the first time the cortical processing of ambiguous speech in people without psychosis who regularly hear voices. Twelve non-clinical voice-hearers and 17 matched controls completed a functional magnetic resonance imaging scan while passively listening to degraded speech ('sine-wave' speech) that was either potentially intelligible or unintelligible. Voice-hearers reported recognizing the presence of speech in the stimuli before controls, and before being explicitly informed of its intelligibility. Across both groups, intelligible sine-wave speech engaged a typical left-lateralized speech processing network. Notably, however, voice-hearers showed stronger intelligibility responses than controls in the dorsal anterior cingulate cortex and in the superior frontal gyrus. This suggests an enhanced involvement of attention and sensorimotor processes, selectively when speech was potentially intelligible. Altogether, these behavioural and neural findings indicate that people with hallucinatory experiences show distinct responses to meaningful auditory stimuli. A greater weighting towards prior knowledge and expectation might cause non-veridical auditory sensations in these individuals, but it might also spontaneously facilitate perceptual processing where such knowledge is required. This has implications for the understanding of hallucinations in clinical and non-clinical populations, and is consistent with current 'predictive processing' theories of psychosis.
Collapse
Affiliation(s)
- Ben Alderson-Day
- Department of Psychology, Durham University, Science Laboratories, South Road, Durham, DH1 3LE, UK
| | - César F Lima
- Institute of Cognitive Neuroscience, University College London, 17-19 Queen Square, London, WC1N 3AR, UK; Faculty of Psychology and Education Sciences, University of Porto, Rua Alfredo Allen, 4200-135 Porto, Portugal
| | - Samuel Evans
- Institute of Cognitive Neuroscience, University College London, 17-19 Queen Square, London, WC1N 3AR, UK; Department of Psychology, University of Westminster, 115 New Cavendish Street, London, W1W 6UW, UK
| | - Saloni Krishnan
- Institute of Cognitive Neuroscience, University College London, 17-19 Queen Square, London, WC1N 3AR, UK; Department of Experimental Psychology, University of Oxford, S Parks Rd, Oxford OX1 3UD, UK
| | - Pradheep Shanmugalingam
- Institute of Cognitive Neuroscience, University College London, 17-19 Queen Square, London, WC1N 3AR, UK
| | - Charles Fernyhough
- Department of Psychology, Durham University, Science Laboratories, South Road, Durham, DH1 3LE, UK
| | - Sophie K Scott
- Institute of Cognitive Neuroscience, University College London, 17-19 Queen Square, London, WC1N 3AR, UK
| |
Collapse
|
29
|
Investigating the role of temporal lobe activation in speech perception accuracy with normal hearing adults: An event-related fNIRS study. Neuropsychologia 2017; 106:31-41. [PMID: 28888891 DOI: 10.1016/j.neuropsychologia.2017.09.004] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/20/2017] [Revised: 08/29/2017] [Accepted: 09/04/2017] [Indexed: 12/14/2022]
Abstract
Functional near infrared spectroscopy (fNIRS) is a safe, non-invasive, relatively quiet imaging technique that is tolerant of movement artifact, making it uniquely ideal for the assessment of hearing mechanisms. Previous research demonstrates the capacity for fNIRS to detect cortical changes to varying speech intelligibility, revealing a positive relationship between cortical activation amplitude and speech perception score. In the present study, we use an event-related design to investigate the hemodynamic response in the temporal lobe across different listening conditions. We presented participants with a speech recognition task using sentences in quiet, sentences in noise, and vocoded sentences. Hemodynamic responses were examined across conditions and then compared when speech perception was accurate compared to when speech perception was inaccurate in the context of noisy speech. Repeated-measures two-way ANOVAs revealed that the speech in noise condition (-2.8 dB signal-to-noise ratio; SNR) demonstrated significantly greater activation than the easier listening conditions on multiple channels bilaterally. Further analyses comparing correct recognition trials to incorrect recognition trials (during the presentation phase of the trial) revealed that activation was significantly greater during correct trials. Lastly, during the repetition phase of the trial, where participants correctly repeated the sentence, the hemodynamic response demonstrated significantly higher deoxyhemoglobin than oxyhemoglobin, indicating a difference between the effects of perception and production on the cortical response. Using fNIRS, the present study adds meaningful evidence to the body of knowledge that describes the brain/behavior relationship related to speech perception.
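fNIRS analyses like this one convert optical-density changes at two wavelengths into oxy- and deoxyhemoglobin concentration changes via the modified Beer-Lambert law, which reduces to a 2x2 linear solve per time point. The sketch below illustrates that step; the extinction coefficients, source-detector distance, and differential pathlength factor are illustrative values only, not calibrated constants.

```python
# Modified Beer-Lambert step: optical-density changes at two wavelengths
# (e.g. 760 nm and 850 nm) -> concentration changes [dHbO, dHbR].
import numpy as np

# Rows: wavelengths; columns: [HbO, HbR] extinction coefficients
# (illustrative values and units).
E = np.array([[0.59, 1.55],
              [1.06, 0.78]])
d, dpf = 3.0, 6.0                       # source-detector distance (cm), DPF

def mbll(delta_od):
    # delta_od: array of shape (2, n_samples), OD change per wavelength.
    return np.linalg.solve(E * d * dpf, delta_od)   # rows: [dHbO, dHbR]

delta_od = np.vstack([0.010 * np.sin(np.linspace(0, 6, 200)),
                      0.005 * np.sin(np.linspace(0, 6, 200))])
d_hbo, d_hbr = mbll(delta_od)
```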
Collapse
|
30
|
Rosemann S, Gießing C, Özyurt J, Carroll R, Puschmann S, Thiel CM. The Contribution of Cognitive Factors to Individual Differences in Understanding Noise-Vocoded Speech in Young and Older Adults. Front Hum Neurosci 2017. [PMID: 28638329 PMCID: PMC5461255 DOI: 10.3389/fnhum.2017.00294] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/26/2022] Open
Abstract
Noise-vocoded speech is commonly used to simulate the sensation after cochlear implantation as it consists of spectrally degraded speech. High individual variability exists in learning to understand both noise-vocoded speech and speech perceived through a cochlear implant (CI). This variability is partly ascribed to differing cognitive abilities like working memory, verbal skills or attention. Although clinically highly relevant, up to now, no consensus has been achieved about which cognitive factors exactly predict the intelligibility of speech in noise-vocoded situations in healthy subjects or in patients after cochlear implantation. We aimed to establish a test battery that can be used to predict speech understanding in patients prior to receiving a CI. Young and old healthy listeners completed a noise-vocoded speech test in addition to cognitive tests tapping on verbal memory, working memory, lexicon and retrieval skills as well as cognitive flexibility and attention. Partial-least-squares analysis revealed that six variables were important to significantly predict vocoded-speech performance. These were the ability to perceive visually degraded speech tested by the Text Reception Threshold, vocabulary size assessed with the Multiple Choice Word Test, working memory gauged with the Operation Span Test, verbal learning and recall of the Verbal Learning and Retention Test and task switching abilities tested by the Comprehensive Trail-Making Test. Thus, these cognitive abilities explain individual differences in noise-vocoded speech understanding and should be considered when aiming to predict hearing-aid outcome.
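The partial-least-squares prediction described above can be prototyped with scikit-learn's PLSRegression. Here is a hedged sketch on synthetic data; the six predictor columns merely stand in for the cognitive measures named in the abstract, and the weights are invented.

```python
# PLS regression predicting a vocoded-speech score from cognitive predictors.
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(4)
n = 60
# Columns stand in for: TRT, vocabulary, operation span, verbal learning,
# recall, task switching (synthetic data, invented effect sizes).
X = rng.standard_normal((n, 6))
y = X @ np.array([0.5, 0.4, 0.3, 0.2, 0.2, 0.1]) + 0.5 * rng.standard_normal(n)

pls = PLSRegression(n_components=2)
print(cross_val_score(pls, X, y, cv=5, scoring="r2").mean())
```

Cross-validated R² gives a less optimistic estimate of predictive value than in-sample fit, which matters when the goal is prospective prediction of CI outcomes.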
Collapse
Affiliation(s)
- Stephanie Rosemann
- Biological Psychology, Department of Psychology, European Medical School, Carl von Ossietzky Universität Oldenburg, Oldenburg, Germany
| | - Carsten Gießing
- Biological Psychology, Department of Psychology, European Medical School, Carl von Ossietzky Universität Oldenburg, Oldenburg, Germany
| | - Jale Özyurt
- Biological Psychology, Department of Psychology, European Medical School, Carl von Ossietzky Universität Oldenburg, Oldenburg, Germany
| | - Rebecca Carroll
- Cluster of Excellence "Hearing4all", Carl von Ossietzky Universität Oldenburg, Oldenburg, Germany; Institute of Dutch Studies, Carl von Ossietzky Universität Oldenburg, Oldenburg, Germany
| | - Sebastian Puschmann
- Biological Psychology, Department of Psychology, European Medical School, Carl von Ossietzky Universität Oldenburg, Oldenburg, Germany
| | - Christiane M Thiel
- Biological Psychology, Department of Psychology, European Medical School, Carl von Ossietzky Universität Oldenburg, Oldenburg, Germany; Cluster of Excellence "Hearing4all", Carl von Ossietzky Universität Oldenburg, Oldenburg, Germany
| |
Collapse
|
31
|
Bidelman GM, Yellamsetty A. Noise and pitch interact during the cortical segregation of concurrent speech. Hear Res 2017; 351:34-44. [PMID: 28578876 DOI: 10.1016/j.heares.2017.05.008] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 03/24/2017] [Revised: 05/09/2017] [Accepted: 05/23/2017] [Indexed: 10/19/2022]
Abstract
Behavioral studies reveal listeners exploit intrinsic differences in voice fundamental frequency (F0) to segregate concurrent speech sounds-the so-called "F0-benefit." More favorable signal-to-noise ratio (SNR) in the environment, an extrinsic acoustic factor, similarly benefits the parsing of simultaneous speech. Here, we examined the neurobiological substrates of these two cues in the perceptual segregation of concurrent speech mixtures. We recorded event-related brain potentials (ERPs) while listeners performed a speeded double-vowel identification task. Listeners heard two concurrent vowels whose F0 differed by zero or four semitones presented in either clean (no noise) or noise-degraded (+5 dB SNR) conditions. Behaviorally, listeners were more accurate in correctly identifying both vowels for larger F0 separations but F0-benefit was more pronounced at more favorable SNRs (i.e., pitch × SNR interaction). Analysis of the ERPs revealed that only the P2 wave (∼200 ms) showed a similar F0 × SNR interaction as behavior and was correlated with listeners' perceptual F0-benefit. Neural classifiers applied to the ERPs further suggested that speech sounds are segregated neurally within 200 ms based on SNR whereas segregation based on pitch occurs later in time (400-700 ms). The earlier timing of extrinsic SNR compared to intrinsic F0-based segregation implies that the cortical extraction of speech from noise is more efficient than differentiating speech based on pitch cues alone, which may recruit additional cortical processes. Findings indicate that noise and pitch differences interact relatively early in cerebral cortex and that the brain arrives at the identities of concurrent speech mixtures as early as ∼200 ms.
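The stimulus logic of the double-vowel paradigm is easy to make concrete: two harmonic complexes whose F0s differ by four semitones (f2 = f1 · 2^(4/12)), summed and embedded in noise scaled to a +5 dB SNR. The sketch below uses bare harmonic complexes rather than full formant-synthesized vowels, so it illustrates the F0 and SNR arithmetic only.

```python
# Two "vowels" with a 4-semitone F0 separation, mixed with noise at +5 dB SNR.
import numpy as np

fs, dur = 16000, 0.4
t = np.arange(int(fs * dur)) / fs

def harmonic_complex(f0, n_harm=20):
    # Sum of harmonics with 1/k amplitude roll-off (a crude vowel stand-in).
    return sum(np.sin(2 * np.pi * f0 * k * t) / k for k in range(1, n_harm + 1))

f0_a = 100.0
f0_b = f0_a * 2 ** (4 / 12)                 # four semitones higher (~126 Hz)
mix = harmonic_complex(f0_a) + harmonic_complex(f0_b)

snr_db = 5.0
noise = np.random.default_rng(5).standard_normal(t.size)
p_sig, p_noise = np.mean(mix ** 2), np.mean(noise ** 2)
noise *= np.sqrt(p_sig / (p_noise * 10 ** (snr_db / 10)))   # set 10*log10(Ps/Pn)=5
stimulus = mix + noise
```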
Collapse
Affiliation(s)
- Gavin M Bidelman
- School of Communication Sciences & Disorders, University of Memphis, Memphis, TN, 38152, USA; Institute for Intelligent Systems, University of Memphis, Memphis, TN, 38152, USA; University of Tennessee Health Sciences Center, Department of Anatomy and Neurobiology, Memphis, TN, 38163, USA.
| | - Anusha Yellamsetty
- School of Communication Sciences & Disorders, University of Memphis, Memphis, TN, 38152, USA
| |
Collapse
|
32
|
McGettigan C, Jasmin K, Eisner F, Agnew ZK, Josephs OJ, Calder AJ, Jessop R, Lawson RP, Spielmann M, Scott SK. You talkin' to me? Communicative talker gaze activates left-lateralized superior temporal cortex during perception of degraded speech. Neuropsychologia 2017; 100:51-63. [PMID: 28400328 PMCID: PMC5446325 DOI: 10.1016/j.neuropsychologia.2017.04.013] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/20/2016] [Revised: 04/05/2017] [Accepted: 04/07/2017] [Indexed: 11/13/2022]
Abstract
Neuroimaging studies of speech perception have consistently indicated a left-hemisphere dominance in the temporal lobes' responses to intelligible auditory speech signals (McGettigan and Scott, 2012). However, there are important communicative cues that cannot be extracted from auditory signals alone, including the direction of the talker's gaze. Previous work has implicated the superior temporal cortices in processing gaze direction, with evidence for predominantly right-lateralized responses (Carlin & Calder, 2013). The aim of the current study was to investigate whether the lateralization of responses to talker gaze differs in an auditory communicative context. Participants in a functional MRI experiment watched and listened to videos of spoken sentences in which the auditory intelligibility and talker gaze direction were manipulated factorially. We observed a left-dominant temporal lobe sensitivity to the talker's gaze direction, in which the left anterior superior temporal sulcus/gyrus and temporal pole showed an enhanced response to direct gaze – further investigation revealed that this pattern of lateralization was modulated by auditory intelligibility. Our results suggest flexibility in the distribution of neural responses to social cues in the face within the context of a challenging speech perception task.
Collapse
Affiliation(s)
- Carolyn McGettigan
- Department of Psychology, Royal Holloway University of London, Egham Hill, Egham TW20 0EX, UK; Institute of Cognitive Neuroscience, University College London, 17 Queen Square, London WC1N 3AR, UK.
| | - Kyle Jasmin
- Institute of Cognitive Neuroscience, University College London, 17 Queen Square, London WC1N 3AR, UK
| | - Frank Eisner
- Institute of Cognitive Neuroscience, University College London, 17 Queen Square, London WC1N 3AR, UK; Donders Institute, Radboud University, Montessorilaan 3, 6525 HR Nijmegen, Netherlands
| | - Zarinah K Agnew
- Institute of Cognitive Neuroscience, University College London, 17 Queen Square, London WC1N 3AR, UK; Department of Otolaryngology, University of California, San Francisco, 513 Parnassus Avenue, San Francisco, CA, USA
| | - Oliver J Josephs
- Institute of Cognitive Neuroscience, University College London, 17 Queen Square, London WC1N 3AR, UK; Wellcome Trust Centre for Neuroimaging, Institute of Neurology, University College London, 12 Queen Square, London WC1N 3BG, UK
| | - Andrew J Calder
- MRC Cognition and Brain Sciences Unit, 15 Chaucer Road, Cambridge CB2 7EF, UK
| | - Rosemary Jessop
- Institute of Cognitive Neuroscience, University College London, 17 Queen Square, London WC1N 3AR, UK
| | - Rebecca P Lawson
- Institute of Cognitive Neuroscience, University College London, 17 Queen Square, London WC1N 3AR, UK; Wellcome Trust Centre for Neuroimaging, Institute of Neurology, University College London, 12 Queen Square, London WC1N 3BG, UK
| | - Mona Spielmann
- Institute of Cognitive Neuroscience, University College London, 17 Queen Square, London WC1N 3AR, UK
| | - Sophie K Scott
- Institute of Cognitive Neuroscience, University College London, 17 Queen Square, London WC1N 3AR, UK
| |
Collapse
|
33
|
Leonard MK, Baud MO, Sjerps MJ, Chang EF. Perceptual restoration of masked speech in human cortex. Nat Commun 2016; 7:13619. [PMID: 27996973 PMCID: PMC5187421 DOI: 10.1038/ncomms13619] [Citation(s) in RCA: 85] [Impact Index Per Article: 10.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/18/2016] [Accepted: 10/19/2016] [Indexed: 02/02/2023] Open
Abstract
Humans are adept at understanding speech despite the fact that our natural listening environment is often filled with interference. An example of this capacity is phoneme restoration, in which part of a word is completely replaced by noise, yet listeners report hearing the whole word. The neurological basis for this unconscious fill-in phenomenon is unknown, despite being a fundamental characteristic of human hearing. Here, using direct cortical recordings in humans, we demonstrate that missing speech is restored at the acoustic-phonetic level in bilateral auditory cortex, in real-time. This restoration is preceded by specific neural activity patterns in a separate language area, left frontal cortex, which predicts the word that participants later report hearing. These results demonstrate that during speech perception, missing acoustic content is synthesized online from the integration of incoming sensory cues and the internal neural dynamics that bias word-level expectation and prediction.
Collapse
Affiliation(s)
- Matthew K Leonard
- Department of Neurological Surgery, University of California, San Francisco, 675 Nelson Rising Lane, Room 535, San Francisco, California 94158, USA; Center for Integrative Neuroscience, University of California, San Francisco, 675 Nelson Rising Lane, Room 535, San Francisco, California 94158, USA
| | - Maxime O Baud
- Department of Neurology, University of California, San Francisco, 675 Nelson Rising Lane, Room 535, San Francisco, California 94158, USA
| | - Matthias J Sjerps
- Department of Linguistics, University of California, Berkeley, 1203 Dwinelle Hall #2650, Berkeley, California 94720-2650, USA; Neurobiology of Language Department, Donders Institute for Brain, Cognition and Behavior, Centre for Cognitive Neuroimaging, Radboud University, Kapittelweg 29, Nijmegen 6525 EN, The Netherlands
| | - Edward F Chang
- Department of Neurological Surgery, University of California, San Francisco, 675 Nelson Rising Lane, Room 535, San Francisco, California 94158, USA; Center for Integrative Neuroscience, University of California, San Francisco, 675 Nelson Rising Lane, Room 535, San Francisco, California 94158, USA; Department of Physiology, University of California, San Francisco, 675 Nelson Rising Lane, Room 535, San Francisco, California 94158, USA
| |
Collapse
|
34
|
Sohoglu E, Davis MH. Perceptual learning of degraded speech by minimizing prediction error. Proc Natl Acad Sci U S A 2016; 113:E1747-56. [PMID: 26957596 PMCID: PMC4812728 DOI: 10.1073/pnas.1523266113] [Citation(s) in RCA: 65] [Impact Index Per Article: 8.1] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open
Abstract
Human perception is shaped by past experience on multiple timescales. Sudden and dramatic changes in perception occur when prior knowledge or expectations match stimulus content. These immediate effects contrast with the longer-term, more gradual improvements that are characteristic of perceptual learning. Despite extensive investigation of these two experience-dependent phenomena, there is considerable debate about whether they result from common or dissociable neural mechanisms. Here we test single- and dual-mechanism accounts of experience-dependent changes in perception using concurrent magnetoencephalographic and EEG recordings of neural responses evoked by degraded speech. When speech clarity was enhanced by prior knowledge obtained from matching text, we observed reduced neural activity in a peri-auditory region of the superior temporal gyrus (STG). Critically, longer-term improvements in the accuracy of speech recognition following perceptual learning resulted in reduced activity in a nearly identical STG region. Moreover, short-term neural changes caused by prior knowledge and longer-term neural changes arising from perceptual learning were correlated across subjects with the magnitude of learning-induced changes in recognition accuracy. These experience-dependent effects on neural processing could be dissociated from the neural effect of hearing physically clearer speech, which similarly enhanced perception but increased rather than decreased STG responses. Hence, the observed neural effects of prior knowledge and perceptual learning cannot be attributed to epiphenomenal changes in listening effort that accompany enhanced perception. Instead, our results support a predictive coding account of speech perception; computational simulations show how a single mechanism, minimization of prediction error, can drive immediate perceptual effects of prior knowledge and longer-term perceptual learning of degraded speech.
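The paper's single-mechanism account, in which both immediate and longer-term perceptual changes arise from minimizing prediction error, can be illustrated with a toy linear generative model: a belief about the underlying causes is updated iteratively to reduce the mismatch with a degraded input. This is a didactic sketch, not the authors' computational simulation.

```python
# Toy prediction-error minimization: infer latent causes r of a degraded
# signal x under a linear generative model x ~ W r, by gradient descent
# on the squared prediction error.
import numpy as np

rng = np.random.default_rng(6)
W = rng.standard_normal((20, 5))                 # generative model: causes -> signal
true_r = rng.standard_normal(5)
x = W @ true_r + 0.5 * rng.standard_normal(20)   # degraded sensory input

r = np.zeros(5)                                  # initial belief about the causes
for step in range(200):
    err = x - W @ r                              # prediction error
    r += 0.01 * W.T @ err                        # update belief to reduce the error
print(float(np.mean((x - W @ r) ** 2)))          # residual error after settling
```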
Collapse
Affiliation(s)
- Ediz Sohoglu
- Medical Research Council Cognition and Brain Sciences Unit, Cambridge CB2 7EF, United Kingdom
| | - Matthew H Davis
- Medical Research Council Cognition and Brain Sciences Unit, Cambridge CB2 7EF, United Kingdom
| |
Collapse
|
35
|
Bonte M, Ley A, Scharke W, Formisano E. Developmental refinement of cortical systems for speech and voice processing. Neuroimage 2016; 128:373-384. [PMID: 26777479 DOI: 10.1016/j.neuroimage.2016.01.015] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/15/2015] [Revised: 12/15/2015] [Accepted: 01/06/2016] [Indexed: 01/31/2023] Open
Affiliation(s)
- Milene Bonte
- Department of Cognitive Neuroscience and Maastricht Brain Imaging Center, Faculty of Psychology and Neuroscience, Maastricht University, P.O. Box 616, 6200 MD, Maastricht, The Netherlands.
| | - Anke Ley
- Department of Cognitive Neuroscience and Maastricht Brain Imaging Center, Faculty of Psychology and Neuroscience, Maastricht University, P.O. Box 616, 6200 MD, Maastricht, The Netherlands
| | - Wolfgang Scharke
- Department of Cognitive Neuroscience and Maastricht Brain Imaging Center, Faculty of Psychology and Neuroscience, Maastricht University, P.O. Box 616, 6200 MD, Maastricht, The Netherlands
| | - Elia Formisano
- Department of Cognitive Neuroscience and Maastricht Brain Imaging Center, Faculty of Psychology and Neuroscience, Maastricht University, P.O. Box 616, 6200 MD, Maastricht, The Netherlands
| |
Collapse
|
36
|
Evans S, McGettigan C, Agnew ZK, Rosen S, Scott SK. Getting the Cocktail Party Started: Masking Effects in Speech Perception. J Cogn Neurosci 2015; 28:483-500. [PMID: 26696297 DOI: 10.1162/jocn_a_00913] [Citation(s) in RCA: 42] [Impact Index Per Article: 4.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/21/2022]
Abstract
Spoken conversations typically take place in noisy environments, and different kinds of masking sounds place differing demands on cognitive resources. Previous studies, examining the modulation of neural activity associated with the properties of competing sounds, have shown that additional speech streams engage the superior temporal gyrus. However, the absence of a condition in which target speech was heard without additional masking made it difficult to identify brain networks specific to masking and to ascertain the extent to which competing speech was processed equivalently to target speech. In this study, we scanned young healthy adults with continuous fMRI, while they listened to stories masked by sounds that differed in their similarity to speech. We show that auditory attention and control networks are activated during attentive listening to masked speech in the absence of an overt behavioral task. We demonstrate that competing speech is processed predominantly in the left hemisphere within the same pathway as target speech but is not treated equivalently within that stream, and that individuals who perform better on speech-in-noise tasks activate the left mid-posterior superior temporal gyrus more strongly. Finally, we identify neural responses associated with the onset of sounds in the auditory environment; activity was found within right lateralized frontal regions consistent with a phasic alerting response. Taken together, these results provide a comprehensive account of the neural processes involved in listening in noise.
Collapse
Affiliation(s)
| | | | - Zarinah K Agnew
- University College London; University of California, San Francisco
| | | | | |
Collapse
|
37
|
Gallese V, Gernsbacher MA, Heyes C, Hickok G, Iacoboni M. Mirror Neuron Forum. PERSPECTIVES ON PSYCHOLOGICAL SCIENCE 2015; 6:369-407. [PMID: 25520744 DOI: 10.1177/1745691611413392] [Citation(s) in RCA: 106] [Impact Index Per Article: 11.8] [Reference Citation Analysis] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/15/2022]
Affiliation(s)
- Vittorio Gallese
- Department of Neuroscience, University of Parma, and Italian Institute of Technology Brain Center for Social and Motor Cognition, Parma, Italy
| | | | - Cecilia Heyes
- All Souls College and Department of Experimental Psychology, University of Oxford, United Kingdom
| | - Gregory Hickok
- Center for Cognitive Neuroscience, Department of Cognitive Sciences, University of California, Irvine
| | - Marco Iacoboni
- Ahmanson-Lovelace Brain Mapping Center, Department of Psychiatry and Biobehavioral Sciences, Semel Institute for Neuroscience and Social Behavior, Brain Research Institute, David Geffen School of Medicine, University of California, Los Angeles
| |
Collapse
|
38
|
Evans S, Davis MH. Hierarchical Organization of Auditory and Motor Representations in Speech Perception: Evidence from Searchlight Similarity Analysis. Cereb Cortex 2015; 25:4772-88. [PMID: 26157026 PMCID: PMC4635918 DOI: 10.1093/cercor/bhv136] [Citation(s) in RCA: 82] [Impact Index Per Article: 9.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open
Abstract
How humans extract the identity of speech sounds from highly variable acoustic signals remains unclear. Here, we use searchlight representational similarity analysis (RSA) to localize and characterize neural representations of syllables at different levels of the hierarchically organized temporo-frontal pathways for speech perception. We asked participants to listen to spoken syllables that differed considerably in their surface acoustic form by changing speaker and degrading surface acoustics using noise-vocoding and sine wave synthesis while we recorded neural responses with functional magnetic resonance imaging. We found evidence for a graded hierarchy of abstraction across the brain. At the peak of the hierarchy, neural representations in somatomotor cortex encoded syllable identity but not surface acoustic form; at the base of the hierarchy, primary auditory cortex showed the reverse. In contrast, bilateral temporal cortex exhibited an intermediate response, encoding both syllable identity and the surface acoustic form of speech. Regions of somatomotor cortex associated with encoding syllable identity in perception were also engaged when producing the same syllables in a separate session. These findings are consistent with a hierarchical account of how variable acoustic signals are transformed into abstract representations of the identity of speech sounds.
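At each searchlight location, RSA reduces to building a representational dissimilarity matrix (RDM) from the local response patterns and rank-correlating it with a model RDM. A minimal sketch with synthetic patterns and a syllable-identity model follows; the condition counts are invented for illustration.

```python
# Minimal RSA step: brain RDM from patterns, compared to a model RDM.
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

rng = np.random.default_rng(7)
n_cond, n_voxels = 12, 50
patterns = rng.standard_normal((n_cond, n_voxels))   # one pattern per condition

brain_rdm = pdist(patterns, metric="correlation")    # 1 - Pearson r, condensed form
labels = np.repeat(np.arange(4), 3)                  # e.g. 4 syllables x 3 speakers
model_rdm = (pdist(labels[:, None]) > 0).astype(float)  # 1 where syllables differ
rho, p = spearmanr(brain_rdm, model_rdm)             # rank correlation of RDMs
```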
Collapse
Affiliation(s)
- Samuel Evans
- MRC Cognition and Brain Sciences Unit, Cambridge CB2 7EF, UK; Institute of Cognitive Neuroscience, University College London, WC1N 3AR, UK
| | - Matthew H Davis
- MRC Cognition and Brain Sciences Unit, Cambridge CB2 7EF, UK
| |
Collapse
|
39
|
Guediche S, Holt LL, Laurent P, Lim SJ, Fiez JA. Evidence for Cerebellar Contributions to Adaptive Plasticity in Speech Perception. Cereb Cortex 2015; 25:1867-77. [PMID: 24451660 PMCID: PMC4481605 DOI: 10.1093/cercor/bht428] [Citation(s) in RCA: 36] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/24/2023] Open
Abstract
Human speech perception rapidly adapts to maintain comprehension under adverse listening conditions. For example, with exposure listeners can adapt to heavily accented speech produced by a non-native speaker. Outside the domain of speech perception, adaptive changes in sensory and motor processing have been attributed to cerebellar functions. The present functional magnetic resonance imaging study investigates whether adaptation in speech perception also involves the cerebellum. Acoustic stimuli were distorted using a vocoding plus spectral-shift manipulation and presented in a word recognition task. Regions in the cerebellum that showed differences before versus after adaptation were identified, and the relationship between activity during adaptation and subsequent behavioral improvements was examined. These analyses implicated the right Crus I region of the cerebellum in adaptive changes in speech perception. A functional correlation analysis with the right Crus I as a seed region probed for cerebral cortical regions with covarying hemodynamic responses during the adaptation period. The results provided evidence of a functional network between the cerebellum and language-related regions in the temporal and parietal lobes of the cerebral cortex. Consistent with known cerebellar contributions to sensorimotor adaptation, cerebro-cerebellar interactions may support supervised learning mechanisms that rely on sensory prediction error signals in speech perception.
Collapse
Affiliation(s)
- Sara Guediche
- Center for Neuroscience at the University of Pittsburgh, Pittsburgh, PA, USA
- Center for the Neural Basis of Cognition, Pittsburgh, PA, USA
- Current address: Department of Cognitive, Linguistic, and Psychological Sciences, Brown University, Providence, RI, USA
| | - Lori L. Holt
- Center for Neuroscience at the University of Pittsburgh, Pittsburgh, PA, USA
- Center for the Neural Basis of Cognition, Pittsburgh, PA, USA
- Department of Psychology, Carnegie Mellon University, Pittsburgh, PA, USA
| | - Patryk Laurent
- Center for Neuroscience at the University of Pittsburgh, Pittsburgh, PA, USA
- Center for the Neural Basis of Cognition, Pittsburgh, PA, USA
- Current address: Brain Corporation, San Diego, CA, USA
| | - Sung-Joo Lim
- Center for the Neural Basis of Cognition, Pittsburgh, PA, USA
- Department of Psychology, Carnegie Mellon University, Pittsburgh, PA, USA
| | - Julie A. Fiez
- Center for Neuroscience at the University of Pittsburgh, Pittsburgh, PA, USA
- Center for the Neural Basis of Cognition, Pittsburgh, PA, USA
- Department of Psychology, University of Pittsburgh, Pittsburgh, PA, USA
| |
Collapse
|
40
|
Bidelman GM, Dexter L. Bilinguals at the "cocktail party": dissociable neural activity in auditory-linguistic brain regions reveals neurobiological basis for nonnative listeners' speech-in-noise recognition deficits. BRAIN AND LANGUAGE 2015; 143:32-41. [PMID: 25747886 DOI: 10.1016/j.bandl.2015.02.002] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/20/2014] [Revised: 12/22/2014] [Accepted: 02/08/2015] [Indexed: 06/04/2023]
Abstract
We examined a consistent deficit observed in bilinguals: poorer speech-in-noise (SIN) comprehension for their nonnative language. We recorded neuroelectric mismatch potentials in mono- and bi-lingual listeners in response to contrastive speech sounds in noise. Behaviorally, late bilinguals required ∼10dB more favorable signal-to-noise ratios to match monolinguals' SIN abilities. Source analysis of cortical activity demonstrated monotonic increase in response latency with noise in superior temporal gyrus (STG) for both groups, suggesting parallel degradation of speech representations in auditory cortex. Contrastively, we found differential speech encoding between groups within inferior frontal gyrus (IFG)-adjacent to Broca's area-where noise delays observed in nonnative listeners were offset in monolinguals. Notably, brain-behavior correspondences double dissociated between language groups: STG activation predicted bilinguals' SIN, whereas IFG activation predicted monolinguals' performance. We infer higher-order brain areas act compensatorily to enhance impoverished sensory representations but only when degraded speech recruits linguistic brain mechanisms downstream from initial auditory-sensory inputs.
Collapse
Affiliation(s)
- Gavin M Bidelman
- Institute for Intelligent Systems, University of Memphis, Memphis, TN, USA; School of Communication Sciences & Disorders, University of Memphis, Memphis, TN, USA.
| | - Lauren Dexter
- School of Communication Sciences & Disorders, University of Memphis, Memphis, TN, USA
| |
Collapse
|
41
|
Neger TM, Rietveld T, Janse E. Relationship between perceptual learning in speech and statistical learning in younger and older adults. Front Hum Neurosci 2014; 8:628. [PMID: 25225475 PMCID: PMC4150448 DOI: 10.3389/fnhum.2014.00628] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/03/2014] [Accepted: 07/28/2014] [Indexed: 11/30/2022] Open
Abstract
Within a few sentences, listeners learn to understand severely degraded speech such as noise-vocoded speech. However, individuals vary in the amount of such perceptual learning and it is unclear what underlies these differences. The present study investigates whether perceptual learning in speech relates to statistical learning, as sensitivity to probabilistic information may aid identification of relevant cues in novel speech input. If statistical learning and perceptual learning (partly) draw on the same general mechanisms, then statistical learning in a non-auditory modality using non-linguistic sequences should predict adaptation to degraded speech. In the present study, 73 older adults (aged over 60 years) and 60 younger adults (aged between 18 and 30 years) performed a visual artificial grammar learning task and were presented with 60 meaningful noise-vocoded sentences in an auditory recall task. Within age groups, sentence recognition performance over exposure was analyzed as a function of statistical learning performance, and other variables that may predict learning (i.e., hearing, vocabulary, attention switching control, working memory, and processing speed). Younger and older adults showed similar amounts of perceptual learning, but only younger adults showed significant statistical learning. In older adults, improvement in understanding noise-vocoded speech was constrained by age. In younger adults, amount of adaptation was associated with lexical knowledge and with statistical learning ability. Thus, individual differences in general cognitive abilities explain listeners' variability in adapting to noise-vocoded speech. Results suggest that perceptual and statistical learning share mechanisms of implicit regularity detection, but that the ability to detect statistical regularities is impaired in older adults if visual sequences are presented quickly.
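Artificial-grammar tasks of the kind used in this study typically generate training strings by random walks through a finite-state grammar, with test foils produced by violating a transition. A small sketch with an invented grammar (not the study's materials) follows.

```python
# Generating "grammatical" strings from a small finite-state grammar.
import random

# state -> list of (emitted symbol, next state); (None, None) marks a legal exit.
GRAMMAR = {
    0: [("M", 1), ("V", 2)],
    1: [("T", 1), ("V", 3)],
    2: [("X", 2), ("R", 3)],
    3: [("X", 4), ("R", 1)],
    4: [(None, None)],
}

def generate(rng=random):
    state, out = 0, []
    while True:
        sym, nxt = rng.choice(GRAMMAR[state])
        if sym is None:                 # reached the exit state
            return "".join(out)
        out.append(sym)
        state = nxt

training_items = {generate() for _ in range(50)}   # grammatical strings
```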
Collapse
Affiliation(s)
- Thordis M Neger
- Centre for Language Studies, Radboud University Nijmegen, Nijmegen, Netherlands; International Max Planck Research School for Language Sciences, Nijmegen, Netherlands
| | - Toni Rietveld
- Centre for Language Studies, Radboud University Nijmegen, Nijmegen, Netherlands
| | - Esther Janse
- Centre for Language Studies, Radboud University Nijmegen, Nijmegen, Netherlands; Donders Institute for Brain, Cognition and Behaviour, Radboud University Nijmegen, Nijmegen, Netherlands
| |
Collapse
|
42
|
Task-dependent decoding of speaker and vowel identity from auditory cortical response patterns. J Neurosci 2014; 34:4548-57. [PMID: 24672000 DOI: 10.1523/jneurosci.4339-13.2014] [Citation(s) in RCA: 65] [Impact Index Per Article: 6.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/21/2022] Open
Abstract
Selective attention to relevant sound properties is essential for everyday listening situations. It enables the formation of different perceptual representations of the same acoustic input and is at the basis of flexible and goal-dependent behavior. Here, we investigated the role of the human auditory cortex in forming behavior-dependent representations of sounds. We used single-trial fMRI and analyzed cortical responses collected while subjects listened to the same speech sounds (vowels /a/, /i/, and /u/) spoken by different speakers (boy, girl, male) and performed a delayed-match-to-sample task on either speech sound or speaker identity. Univariate analyses showed a task-specific activation increase in the right superior temporal gyrus/sulcus (STG/STS) during speaker categorization and in the right posterior temporal cortex during vowel categorization. Beyond regional differences in activation levels, multivariate classification of single trial responses demonstrated that the success with which single speakers and vowels can be decoded from auditory cortical activation patterns depends on task demands and subjects' behavioral performance. Speaker/vowel classification relied on distinct but overlapping regions across the (right) mid-anterior STG/STS (speakers) and bilateral mid-posterior STG/STS (vowels), as well as the superior temporal plane including Heschl's gyrus/sulcus. The task dependency of speaker/vowel classification demonstrates that the informative fMRI response patterns reflect the top-down enhancement of behaviorally relevant sound representations. Furthermore, our findings suggest that successful selection, processing, and retention of task-relevant sound properties relies on the joint encoding of information across early and higher-order regions of the auditory cortex.
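The multivariate classification of single-trial responses described above follows a standard recipe: cross-validated classification of condition labels from trial-by-voxel patterns, with chance level set by the number of classes. A hedged sketch on synthetic data:

```python
# Cross-validated decoding of vowel identity from single-trial patterns.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(8)
n_trials, n_voxels = 90, 200
y = np.repeat([0, 1, 2], n_trials // 3)            # labels for /a/, /i/, /u/
# Noise plus a small class-specific pattern, so decoding is above chance.
X = (rng.standard_normal((n_trials, n_voxels))
     + 0.3 * np.eye(3)[y] @ rng.standard_normal((3, n_voxels)))

clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
acc = cross_val_score(clf, X, y, cv=5).mean()      # chance is 1/3
```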
Collapse
|
43
|
Kyong JS, Scott SK, Rosen S, Howe TB, Agnew ZK, McGettigan C. Exploring the roles of spectral detail and intonation contour in speech intelligibility: an FMRI study. J Cogn Neurosci 2014; 26:1748-63. [PMID: 24568205 DOI: 10.1162/jocn_a_00583] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/04/2022]
Abstract
The melodic contour of speech forms an important perceptual aspect of tonal and nontonal languages and an important limiting factor on the intelligibility of speech heard through a cochlear implant. Previous work exploring the neural correlates of speech comprehension identified a left-dominant pathway in the temporal lobes supporting the extraction of an intelligible linguistic message, whereas the right anterior temporal lobe showed an overall preference for signals clearly conveying dynamic pitch information [Johnsrude, I. S., Penhune, V. B., & Zatorre, R. J. Functional specificity in the right human auditory cortex for perceiving pitch direction. Brain, 123, 155-163, 2000; Scott, S. K., Blank, C. C., Rosen, S., & Wise, R. J. Identification of a pathway for intelligible speech in the left temporal lobe. Brain, 123, 2400-2406, 2000]. The current study combined modulations of overall intelligibility (through vocoding and spectral inversion) with a manipulation of pitch contour (normal vs. falling) to investigate the processing of spoken sentences in functional MRI. Our overall findings replicate and extend those of Scott et al. [Scott, S. K., Blank, C. C., Rosen, S., & Wise, R. J. Identification of a pathway for intelligible speech in the left temporal lobe. Brain, 123, 2400-2406, 2000], where greater sentence intelligibility was predominantly associated with increased activity in the left STS, and the greatest response to normal sentence melody was found in right superior temporal gyrus. These data suggest a spatial distinction between brain areas associated with intelligibility and those involved in the processing of dynamic pitch information in speech. By including a set of complexity-matched unintelligible conditions created by spectral inversion, this is additionally the first study reporting a fully factorial exploration of spectrotemporal complexity and spectral inversion as they relate to the neural processing of speech intelligibility. Perhaps surprisingly, there was little evidence for an interaction between the two factors; we discuss the implications for the processing of sound and speech in the dorsolateral temporal lobes.
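Spectral inversion of the kind used to create complexity-matched unintelligible speech can be approximated by ring modulation followed by low-pass filtering, which mirrors the spectrum about half the modulation frequency. The sketch below assumes the input is already band-limited below the rotation frequency; the 4 kHz value (mirror point at 2 kHz) is an illustrative choice.

```python
# Spectral inversion via ring modulation plus low-pass filtering.
import numpy as np
from scipy.signal import butter, sosfiltfilt

def spectrally_invert(x, fs, rotate_hz=4000.0):
    # Assumes x contains energy only below rotate_hz (low-pass first if needed).
    t = np.arange(len(x)) / fs
    shifted = x * np.cos(2 * np.pi * rotate_hz * t)     # ring modulation
    sos = butter(8, rotate_hz, btype="low", fs=fs, output="sos")
    return sosfiltfilt(sos, shifted)                    # keep the mirrored band
```

A component at frequency f maps to rotate_hz - f after filtering, so the spectrum is flipped about rotate_hz / 2 while the temporal envelope and spectrotemporal complexity are broadly preserved.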
44
Guediche S, Blumstein SE, Fiez JA, Holt LL. Speech perception under adverse conditions: insights from behavioral, computational, and neuroscience research. Front Syst Neurosci 2014; 7:126. [PMID: 24427119 PMCID: PMC3879477 DOI: 10.3389/fnsys.2013.00126] [Citation(s) in RCA: 47] [Impact Index Per Article: 4.7] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/16/2013] [Accepted: 12/16/2013] [Indexed: 01/06/2023] Open
Abstract
Adult speech perception reflects the long-term regularities of the native language, but it is also flexible such that it accommodates and adapts to adverse listening conditions and short-term deviations from native-language norms. The purpose of this article is to examine how the broader neuroscience literature can inform and advance research efforts in understanding the neural basis of flexibility and adaptive plasticity in speech perception. Specifically, we highlight the potential role of learning algorithms that rely on prediction error signals and discuss specific neural structures that are likely to contribute to such learning. To this end, we review behavioral studies, computational accounts, and neuroimaging findings related to adaptive plasticity in speech perception. A few studies have already alluded to a potential role for such mechanisms in this adaptive plasticity. Furthermore, we consider research topics in neuroscience that offer insight into how perception can be adaptively tuned to short-term deviations while balancing the need to maintain stability in the perception of learned long-term regularities. Consideration of the application and limitations of these algorithms in characterizing flexible speech perception under adverse conditions promises to inform theoretical models of speech.
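The prediction-error-driven learning the review highlights can be illustrated with a delta rule: a mapping from acoustic cues to lexical targets is adjusted in proportion to the mismatch between predicted and supervisory signals. A toy sketch, with arbitrary dimensions and learning rate:

```python
# Toy delta-rule sketch of error-driven adaptation: a linear cue-to-word
# mapping W is nudged toward a supervisory target in proportion to the
# prediction error. All quantities are simulated illustrations.
import numpy as np

rng = np.random.default_rng(1)
n_cues, n_words, n_trials = 20, 5, 500
W = np.zeros((n_words, n_cues))              # mapping being adapted
true_W = rng.normal(size=(n_words, n_cues))  # stand-in for the "correct" mapping
lr = 0.05

for _ in range(n_trials):
    cues = rng.normal(size=n_cues)           # degraded acoustic input
    target = true_W @ cues                   # supervisory signal (e.g., lexical feedback)
    error = target - W @ cues                # prediction error drives learning
    W += lr * np.outer(error, cues)          # delta-rule update

print("remaining mapping error:", np.linalg.norm(true_W - W))
```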
Affiliation(s)
- Sara Guediche
- Department of Cognitive, Linguistic, and Psychological Sciences, Brown University, Providence, RI, USA
- Sheila E. Blumstein
- Department of Cognitive, Linguistic, and Psychological Sciences, Brown University, Providence, RI, USA
- Department of Cognitive, Linguistic, and Psychological Sciences, Brain Institute, Brown University, Providence, RI, USA
- Julie A. Fiez
- Department of Neuroscience, Center for Neuroscience at the University of Pittsburgh, University of Pittsburgh, Pittsburgh, PA, USA
- Department of Psychology, University of Pittsburgh, Pittsburgh, PA, USA
- Department of Psychology at Carnegie Mellon University and Department of Neuroscience at the University of Pittsburgh, Center for the Neural Basis of Cognition, Pittsburgh, PA, USA
- Lori L. Holt
- Department of Neuroscience, Center for Neuroscience at the University of Pittsburgh, University of Pittsburgh, Pittsburgh, PA, USA
- Department of Psychology at Carnegie Mellon University and Department of Neuroscience at the University of Pittsburgh, Center for the Neural Basis of Cognition, Pittsburgh, PA, USA
- Department of Psychology, Carnegie Mellon University, Pittsburgh, PA, USA
45
Becker R, Pefkou M, Michel CM, Hervais-Adelman AG. Left temporal alpha-band activity reflects single word intelligibility. Front Syst Neurosci 2013; 7:121. [PMID: 24416001 PMCID: PMC3873629 DOI: 10.3389/fnsys.2013.00121] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/15/2013] [Accepted: 12/10/2013] [Indexed: 11/13/2022] Open
Abstract
The electroencephalographic (EEG) correlates of degraded speech perception have been explored in a number of recent studies. However, such investigations have often been inconclusive as to whether observed differences in brain responses between conditions result from the different acoustic properties of more or less intelligible stimuli, or whether they relate to cognitive processes implicated in comprehending challenging stimuli. In this study we used noise vocoding to spectrally degrade monosyllabic words in order to manipulate their intelligibility, and spectral rotation to generate incomprehensible control conditions matched in terms of spectral detail. We recorded EEG from 14 volunteers who listened to a series of noise-vocoded (NV) and noise-vocoded spectrally rotated (rNV) words while they carried out a detection task. We specifically sought components of the EEG response that showed an interaction between spectral rotation and spectral degradation, reflecting those aspects of the brain electrical response that are related to the intelligibility of acoustically degraded monosyllabic words while controlling for spectral detail. An interaction between spectral complexity and rotation was apparent in both evoked and induced activity. Analyses of event-related potentials showed an interaction effect for a P300-like component at several centro-parietal electrodes. Time-frequency analysis of the EEG signal revealed a monotonic increase in event-related desynchronization (ERD) in the alpha band for the NV but not the rNV stimuli at a left temporo-central electrode cluster from 420-560 ms, reflecting a direct relationship between the strength of alpha-band ERD and intelligibility. By matching NV words with their incomprehensible rNV homologues, we reveal the spatiotemporal pattern of evoked and induced processes involved in degraded speech perception, largely uncontaminated by purely acoustic effects.
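Alpha-band ERD of the kind reported here is conventionally computed by band-pass filtering the signal, taking its power envelope, and expressing post-stimulus power as a percentage change from a pre-stimulus baseline. A minimal sketch, assuming a standard 8-12 Hz band and simulated data rather than the study's recordings:

```python
# Sketch of alpha-band event-related desynchronization (ERD) for one epoch.
# The 8-12 Hz band and baseline window are conventional choices; the data
# are random stand-ins for a single electrode's epoch.
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

fs = 250
t = np.arange(-0.5, 1.0, 1 / fs)                  # epoch, stimulus onset at t = 0
eeg = np.random.default_rng(2).normal(size=t.size)

sos = butter(4, [8, 12], btype="bandpass", fs=fs, output="sos")
alpha = sosfiltfilt(sos, eeg)
power = np.abs(hilbert(alpha)) ** 2               # instantaneous alpha power

baseline = power[(t >= -0.4) & (t <= -0.1)].mean()
window = power[(t >= 0.42) & (t <= 0.56)].mean()  # 420-560 ms window from the abstract
erd_percent = 100 * (window - baseline) / baseline  # negative = desynchronization
print(f"alpha ERD: {erd_percent:.1f}%")
```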
Affiliation(s)
- Robert Becker
- Functional Brain Mapping Lab, Department of Fundamental Neuroscience, University of Geneva, Geneva, Switzerland
- Maria Pefkou
- Brain and Language Lab, Department of Clinical Neuroscience, University of Geneva, Geneva, Switzerland
- Christoph M Michel
- Functional Brain Mapping Lab, Department of Fundamental Neuroscience, University of Geneva, Geneva, Switzerland
- Alexis G Hervais-Adelman
- Brain and Language Lab, Department of Clinical Neuroscience, University of Geneva, Geneva, Switzerland
46
Erb J, Obleser J. Upregulation of cognitive control networks in older adults' speech comprehension. Front Syst Neurosci 2013; 7:116. [PMID: 24399939 PMCID: PMC3871967 DOI: 10.3389/fnsys.2013.00116] [Citation(s) in RCA: 78] [Impact Index Per Article: 7.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/09/2013] [Accepted: 12/05/2013] [Indexed: 11/20/2022] Open
Abstract
Speech comprehension abilities decline with age and with age-related hearing loss, but it is unclear how this decline is expressed in central neural mechanisms. The current study examined neural speech processing in a group of older adults (aged 56–77, n = 16, with varying degrees of sensorineural hearing loss), and compared them to a cohort of young adults (aged 22–31, n = 30, self-reported normal hearing). In a functional MRI experiment, listeners heard and repeated back degraded sentences (4-band vocoded, where the temporal envelope of the acoustic signal is preserved, while the spectral information is substantially degraded). Behaviorally, older adults adapted to degraded speech at the same rate as young listeners, although their overall comprehension of degraded speech was lower. Neurally, both older and young adults relied on the left anterior insula for degraded more than clear speech perception. However, anterior insula engagement in older adults was dependent on hearing acuity. Young adults additionally employed the anterior cingulate cortex (ACC). Interestingly, this age group × degradation interaction was driven by a reduced dynamic range in older adults who displayed elevated levels of ACC activity for both degraded and clear speech, consistent with a persistent upregulation in cognitive control irrespective of task difficulty. For correct speech comprehension, older adults relied on the middle frontal gyrus in addition to a core speech comprehension network recruited by younger adults, suggestive of a compensatory mechanism. Taken together, the results indicate that older adults increasingly recruit cognitive control networks, even under optimal listening conditions, at the expense of these systems’ dynamic range.
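The age group × degradation interaction described above amounts to a difference of differences. A toy sketch with simulated per-subject activation estimates (group sizes borrowed from the abstract; a real analysis would use a voxelwise mixed-effects model rather than this single t-test):

```python
# Toy difference-of-differences test of an age group x degradation interaction.
# Means are chosen so the older group shows elevated activity for clear speech
# and thus a reduced degraded-minus-clear range; all values are simulated.
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(4)
young_clear, young_degr = rng.normal(0.2, 1, 30), rng.normal(1.2, 1, 30)
older_clear, older_degr = rng.normal(1.0, 1, 16), rng.normal(1.3, 1, 16)

young_diff = young_degr - young_clear   # dynamic range, young group
older_diff = older_degr - older_clear   # dynamic range, older group
t, p = ttest_ind(young_diff, older_diff)
print(f"interaction: t = {t:.2f}, p = {p:.3f}")
```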
Affiliation(s)
- Julia Erb
- Max Planck Research Group "Auditory Cognition", Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- Jonas Obleser
- Max Planck Research Group "Auditory Cognition", Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
47
Facilitation of inferior frontal cortex by transcranial direct current stimulation induces perceptual learning of severely degraded speech. J Neurosci 2013; 33:15868-78. [PMID: 24089493 DOI: 10.1523/jneurosci.5466-12.2013] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/21/2022] Open
Abstract
Perceptual learning requires the generalization of categorical perceptual sensitivity from trained to untrained items. For degraded speech, perceptual learning modulates activation in a left-lateralized network, including inferior frontal gyrus (IFG) and inferior parietal cortex (IPC). Here we demonstrate that facilitatory anodal transcranial direct current stimulation (tDCS(anodal)) can induce perceptual learning in healthy humans. In a sham-controlled, parallel-design study, 36 volunteers were allocated to one of three intervention groups: tDCS(anodal) over left IFG, over left IPC, or sham. Participants decided on the match between an acoustically degraded and an undegraded written word by forced same-different choice. Acoustic degradation varied over four noise-vocoding levels (2, 3, 4, and 6 bands). Participants were trained to discriminate between minimal (/Tisch/-FISCH) and identical word pairs (/Tisch/-TISCH) over a period of 3 days, and tDCS(anodal) was applied during the first 20 min of training. Perceptual sensitivity (d') for trained word pairs, and an equal number of untrained word pairs, was tested before and after training. Increases in d' indicate perceptual learning for untrained word pairs, and a combination of item-specific and perceptual learning for trained word pairs. Most notably, for the lowest intelligibility level, perceptual learning occurred only when tDCS(anodal) was applied over left IFG. For trained pairs, improved d' was seen at all intelligibility levels regardless of tDCS intervention. Over left IPC, tDCS(anodal) did not modulate learning but instead introduced a response bias during training: volunteers were more likely to respond "same," potentially indicating enhanced perceptual fusion of degraded auditory with undegraded written input. Our results supply the first evidence that neural facilitation of higher-order language areas can induce perceptual learning of severely degraded speech.
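Perceptual sensitivity (d') of the kind reported here can be approximated as z(hit rate) minus z(false-alarm rate), with a log-linear correction to avoid infinite z-scores at extreme rates. Treating the same-different task as yes/no is a simplification of a full same-different model, and the counts below are invented for illustration:

```python
# Sketch of a d' computation with the log-linear correction. Trial counts
# are hypothetical; the yes/no formula is an approximation for a
# same-different design.
from scipy.stats import norm

def d_prime(hits, misses, false_alarms, correct_rejections):
    """d' = z(hit rate) - z(false-alarm rate), log-linear corrected."""
    hr = (hits + 0.5) / (hits + misses + 1)
    far = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    return norm.ppf(hr) - norm.ppf(far)

# Hypothetical pre/post-training counts for untrained word pairs:
pre = d_prime(hits=30, misses=20, false_alarms=18, correct_rejections=32)
post = d_prime(hits=41, misses=9, false_alarms=10, correct_rejections=40)
print(f"d' pre = {pre:.2f}, post = {post:.2f}, change = {post - pre:.2f}")
```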
48
Strelnikov K, Rouger J, Demonet JF, Lagleyre S, Fraysse B, Deguine O, Barone P. Visual activity predicts auditory recovery from deafness after adult cochlear implantation. Brain 2013; 136:3682-95. [PMID: 24136826 DOI: 10.1093/brain/awt274] [Citation(s) in RCA: 95] [Impact Index Per Article: 8.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022]
Abstract
Modern cochlear implantation technologies allow deaf patients to understand auditory speech; however, the implants deliver only a coarse auditory input, and patients must use long-term adaptive processes to achieve coherent percepts. In adults with post-lingual deafness, most of the progress in speech recovery occurs during the first year after cochlear implantation, but there is a large range of variability in the level of cochlear implant outcomes and in the temporal evolution of recovery. It has been proposed that when profoundly deaf subjects receive a cochlear implant, visual cross-modal reorganization of the brain is deleterious for auditory speech recovery. We tested this hypothesis in post-lingually deaf adults by analysing whether brain activity shortly after implantation correlated with the level of auditory recovery 6 months later. Based on brain activity induced by a speech-processing task, we found strong positive correlations in areas outside the auditory cortex. The highest positive correlations were found in the occipital cortex involved in visual processing, as well as in the posterior-temporal cortex known for audio-visual integration. The other area that positively correlated with auditory speech recovery was localized in the left inferior frontal area known for speech processing. Our results demonstrate that the functional level of the visual modality is related to the proficiency of auditory recovery. Based on the positive correlation of visual activity with auditory speech recovery, we suggest that the visual modality may facilitate the perception of a word's auditory counterpart in communicative situations. The link demonstrated between visual activity and auditory speech perception indicates that visuoauditory synergy is crucial for cross-modal plasticity and for fostering speech-comprehension recovery in adult cochlear-implanted deaf patients.
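The brain-behavior analysis described above reduces, per voxel, to correlating early post-implantation activity with the 6-month recovery score across patients. A minimal sketch with simulated data; the shapes and the simple ranking step are illustrative, and the study's actual statistics are not reproduced here:

```python
# Sketch: voxelwise Pearson correlation between early task activity and a
# later behavioral recovery score across patients. All data are simulated.
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(3)
n_patients, n_voxels = 10, 1000
activity = rng.normal(size=(n_patients, n_voxels))  # per-voxel task activity
recovery = rng.normal(size=n_patients)              # 6-month recovery score

r = np.array([pearsonr(activity[:, v], recovery)[0] for v in range(n_voxels)])
best = np.argsort(-r)[:5]                           # most positively correlated voxels
print("top voxels:", best, "r =", r[best].round(2))
```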
Affiliation(s)
- Kuzma Strelnikov
- Université de Toulouse, Cerveau et Cognition, Université Paul Sabatier, Toulouse, France
49
Adank P, Rueschemeyer SA, Bekkering H. The role of accent imitation in sensorimotor integration during processing of intelligible speech. Front Hum Neurosci 2013; 7:634. [PMID: 24109447 PMCID: PMC3789941 DOI: 10.3389/fnhum.2013.00634] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/20/2013] [Accepted: 09/12/2013] [Indexed: 11/13/2022] Open
Abstract
Recent theories on how listeners maintain perceptual invariance despite variation in the speech signal allocate a prominent role to imitation mechanisms. Notably, these simulation accounts propose that motor mechanisms support perception of ambiguous or noisy signals. Indeed, imitation of ambiguous signals, e.g., accented speech, has been found to aid effective speech comprehension. Here, we explored the possibility that imitation in speech benefits perception by increasing activation in speech perception and production areas. Participants rated the intelligibility of sentences spoken in an unfamiliar accent of Dutch in a functional magnetic resonance imaging (fMRI) experiment. Next, participants in one group repeated the sentences in their own accent, while a second group vocally imitated the accent. Finally, both groups rated the intelligibility of accented sentences in a post-test. The neuroimaging results showed an interaction between type of training and pre- and post-test sessions in left Inferior Frontal Gyrus, Supplementary Motor Area, and left Superior Temporal Sulcus. Although alternative explanations such as task engagement and fatigue need to be considered as well, the results suggest that imitation may aid effective speech comprehension by supporting sensorimotor integration.
Affiliation(s)
- Patti Adank
- Department of Speech, Hearing and Phonetic Sciences, Division of Psychology and Language Sciences, University College London, London, UK; Donders Institute for Brain, Cognition and Behaviour, Radboud University Nijmegen, Nijmegen, Netherlands
50
Scott SK, McGettigan C. Do temporal processes underlie left hemisphere dominance in speech perception? BRAIN AND LANGUAGE 2013; 127:36-45. [PMID: 24125574 PMCID: PMC4083253 DOI: 10.1016/j.bandl.2013.07.006] [Citation(s) in RCA: 41] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/17/2012] [Revised: 07/18/2013] [Accepted: 07/22/2013] [Indexed: 05/27/2023]
Abstract
It is not unusual to find it stated as fact that the left hemisphere is specialized for the processing of rapid, or temporal, aspects of sound, and that the dominance of the left hemisphere in the perception of speech may be a consequence of this specialization. In this review we explore the history of this claim and assess the weight of evidence for it. We demonstrate that, rather than the left temporal lobe showing a special sensitivity to the acoustic properties of speech, it is the right temporal lobe that shows a marked preference for certain properties of sounds, for example longer durations or variations in pitch. We finish by outlining some alternative factors that contribute to the left lateralization of speech perception.
Affiliation(s)
- Sophie K Scott
- Institute for Cognitive Neuroscience, 17 Queen Square, London WC1N 3AR, UK.