1
Kim SG, De Martino F, Overath T. Linguistic modulation of the neural encoding of phonemes. Cereb Cortex 2024; 34:bhae155. PMID: 38687241; PMCID: PMC11059272; DOI: 10.1093/cercor/bhae155
Abstract
Speech comprehension entails the neural mapping of the acoustic speech signal onto learned linguistic units. This acousto-linguistic transformation is bi-directional, whereby higher-level linguistic processes (e.g. semantics) modulate the acoustic analysis of individual linguistic units. Here, we investigated the cortical topography and linguistic modulation of the most fundamental linguistic unit, the phoneme. We presented natural speech and "phoneme quilts" (pseudo-randomly shuffled phonemes) in either a familiar (English) or unfamiliar (Korean) language to native English speakers while recording functional magnetic resonance imaging (fMRI) data. This allowed us to dissociate the contributions of acoustic vs. linguistic processes toward phoneme analysis. We show (i) that the acoustic analysis of phonemes is modulated by linguistic analysis and (ii) that this modulation requires both acoustic and phonetic information to be incorporated. These results suggest that the linguistic modulation of cortical sensitivity to phoneme classes minimizes prediction error during natural speech perception, thereby aiding speech comprehension in challenging listening situations.
Affiliation(s)
- Seung-Goo Kim
- Department of Psychology and Neuroscience, Duke University, 308 Research Dr, Durham, NC 27708, United States
- Research Group Neurocognition of Music and Language, Max Planck Institute for Empirical Aesthetics, Grüneburgweg 14, Frankfurt am Main 60322, Germany
- Federico De Martino
- Faculty of Psychology and Neuroscience, University of Maastricht, Universiteitssingel 40, 6229 ER Maastricht, Netherlands
- Tobias Overath
- Department of Psychology and Neuroscience, Duke University, 308 Research Dr, Durham, NC 27708, United States
- Duke Institute for Brain Sciences, Duke University, 308 Research Dr, Durham, NC 27708, United States
- Center for Cognitive Neuroscience, Duke University, 308 Research Dr, Durham, NC 27708, United States
2
Zhou X, Burg E, Kan A, Litovsky RY. Investigating effortful speech perception using fNIRS and pupillometry measures. Curr Res Neurobiol 2022; 3:100052. PMID: 36518346; PMCID: PMC9743070; DOI: 10.1016/j.crneur.2022.100052
Abstract
The current study examined the neural mechanisms of mental effort and their correlation with speech perception using functional near-infrared spectroscopy (fNIRS) in listeners with normal hearing (NH). Data were collected while participants listened and responded to unprocessed and degraded sentences, where words were presented in grammatically correct or shuffled order. Effortful listening and task difficulty due to stimulus manipulations were confirmed using a subjective questionnaire and a well-established objective measure of mental effort, pupillometry. fNIRS measures focused on cortical responses in two a priori regions of interest, the left auditory cortex (AC) and lateral frontal cortex (LFC), which are closely related to auditory speech perception and listening effort, respectively. We examined the relations between the two objective measures and behavioral measures of speech perception (task performance) and task difficulty. Results demonstrated that changes in pupil dilation were positively correlated with self-reported task difficulty and negatively correlated with task performance scores. A significant negative correlation between the two behavioral measures was also found: as perceived task demands increased and task performance scores decreased, pupils dilated more. fNIRS measures (cerebral oxygenation) in the left AC and LFC were both negatively correlated with self-reported task difficulty and positively correlated with task performance scores. These results suggest that pupillometry measures can index task demands and listening effort, whereas fNIRS measures using a similar paradigm seem to reflect speech processing, but not effort.
Affiliation(s)
- Xin Zhou
- Waisman Center, University of Wisconsin Madison, WI, USA
- Emily Burg
- Waisman Center, University of Wisconsin Madison, WI, USA
- Department of Communication Science and Disorders, University of Wisconsin Madison, WI, USA
- Alan Kan
- School of Engineering, Macquarie University, Sydney, NSW, Australia
- Ruth Y Litovsky
- Waisman Center, University of Wisconsin Madison, WI, USA
- Department of Communication Science and Disorders, University of Wisconsin Madison, WI, USA
3
Suppanen E, Winkler I, Kujala T, Ylinen S. More efficient formation of longer-term representations for word forms at birth can be linked to better language skills at 2 years. Dev Cogn Neurosci 2022; 55:101113. PMID: 35605476; PMCID: PMC9130088; DOI: 10.1016/j.dcn.2022.101113
Abstract
Infants are able to extract words from speech early in life. Here we show that the quality of forming longer-term representations for word forms at birth predicts expressive language ability at the age of two years. Seventy-five neonates were familiarized with two spoken disyllabic pseudowords. We then tested whether the neonate brain predicts the second syllable from the first one by presenting a familiarized pseudoword frequently, and occasionally violating the learned syllable combination with different rare pseudowords. Distinct brain responses were elicited by predicted and unpredicted word endings, suggesting that the neonates had learned the familiarized pseudowords. The difference between responses to predicted and unpredicted pseudowords, which indexes the quality of word-form learning during familiarization, significantly correlated with expressive language scores (the mean length of utterance) at 24 months in the same infants. These findings suggest that 1) neonates can memorize disyllabic words so that a learned first syllable generates predictions for the word ending, and 2) early individual differences in the quality of word-form learning correlate with later language skills. This relationship may aid early identification of infants at risk for language impairment.
Affiliation(s)
- Emma Suppanen
- Cognitive Brain Research Unit, Department of Psychology and Logopedics, Faculty of Medicine, University of Helsinki, Helsinki, Finland.
- István Winkler
- Institute of Cognitive Neuroscience and Psychology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Hungary
- Teija Kujala
- Cognitive Brain Research Unit, Department of Psychology and Logopedics, Faculty of Medicine, University of Helsinki, Helsinki, Finland
- Sari Ylinen
- Cognitive Brain Research Unit, Department of Psychology and Logopedics, Faculty of Medicine, University of Helsinki, Helsinki, Finland; Logopedics, Welfare Sciences, Faculty of Social Sciences, Tampere University, Finland
4
Irsik VC, Johnsrude IS, Herrmann B. Neural Activity during Story Listening Is Synchronized across Individuals Despite Acoustic Masking. J Cogn Neurosci 2022; 34:933-950. PMID: 35258555; DOI: 10.1162/jocn_a_01842
Abstract
Older people with hearing problems often experience difficulties understanding speech in the presence of background sound. As a result, they may disengage in social situations, which has been associated with negative psychosocial health outcomes. Measuring listening (dis)engagement during challenging listening situations has received little attention thus far. We recruited young, normal-hearing human adults (both sexes) and investigated how speech intelligibility and engagement during naturalistic story listening are affected by the level of acoustic masking (12-talker babble) at different signal-to-noise ratios (SNRs). In Experiment 1, we observed that word-report scores were above 80% for all but the lowest SNR tested (-3 dB SNR), at which performance dropped to 54%. In Experiment 2, we calculated intersubject correlation (ISC) using EEG data to identify dynamic spatial patterns of shared neural activity evoked by the stories. ISC has been used as a neural measure of participants' engagement with naturalistic materials. Our results show that ISC was stable across all but the lowest SNRs, despite reduced speech intelligibility. Comparing ISC and intelligibility demonstrated that word-report performance declined more strongly with decreasing SNR than ISC did. Our measure of neural engagement suggests that individuals remain engaged in story listening despite missing words because of background noise. Our work provides a potentially fruitful approach to investigating listener engagement with naturalistic, spoken stories, which may be used to study (dis)engagement in older adults with hearing impairment.
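The ISC measure in the abstract above can be illustrated with a leave-one-out correlation: each subject's response is correlated with the average response of all other subjects, and the results are averaged. The sketch below is a simplified pure-Python version for single-channel time courses; the study itself derived spatial components from multichannel EEG, and `subjects` and its values here are hypothetical.

```python
from statistics import mean

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den

def leave_one_out_isc(subjects):
    """subjects: list of equal-length time courses, one per subject.
    Returns the mean leave-one-out intersubject correlation: each
    subject's time course is correlated with the average of the rest."""
    iscs = []
    for i, ts in enumerate(subjects):
        others = [s for j, s in enumerate(subjects) if j != i]
        avg = [mean(col) for col in zip(*others)]  # sample-wise group average
        iscs.append(pearson(ts, avg))
    return mean(iscs)
```

With identical time courses across subjects the measure is 1.0; as shared stimulus-driven activity weakens relative to idiosyncratic noise, ISC drops toward 0.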
Affiliation(s)
- Björn Herrmann
- The University of Western Ontario; Rotman Research Institute, Toronto, ON, Canada; University of Toronto
5
Corcoran AW, Perera R, Koroma M, Kouider S, Hohwy J, Andrillon T. Expectations boost the reconstruction of auditory features from electrophysiological responses to noisy speech. Cereb Cortex 2022; 33:691-708. PMID: 35253871; PMCID: PMC9890472; DOI: 10.1093/cercor/bhac094
Abstract
Online speech processing imposes significant computational demands on the listening brain, the underlying mechanisms of which remain poorly understood. Here, we exploit the perceptual "pop-out" phenomenon (i.e. the dramatic improvement of speech intelligibility after receiving information about speech content) to investigate the neurophysiological effects of prior expectations on degraded speech comprehension. We recorded electroencephalography (EEG) and pupillometry from 21 adults while they rated the clarity of noise-vocoded and sine-wave synthesized sentences. Pop-out was reliably elicited following visual presentation of the corresponding written sentence, but not following incongruent or neutral text. Pop-out was associated with improved reconstruction of the acoustic stimulus envelope from low-frequency EEG activity, implying that improvements in perceptual clarity were mediated via top-down signals that enhanced the quality of cortical speech representations. Spectral analysis further revealed that pop-out was accompanied by a reduction in theta-band power, consistent with predictive coding accounts of acoustic filling-in and incremental sentence processing. Moreover, delta-band power, alpha-band power, and pupil diameter were all increased following the provision of any written sentence information, irrespective of content. Together, these findings reveal distinctive profiles of neurophysiological activity that differentiate the content-specific processes associated with degraded speech comprehension from the context-specific processes invoked under adverse listening conditions.
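Envelope reconstruction of the kind described above is commonly implemented as a regularized linear backward model mapping neural channels to the stimulus envelope. The sketch below is not the authors' pipeline: it shows only the closed-form ridge solution on a single time lag, in pure Python, with toy data; real decoders add multiple time lags, band-pass filtering, and cross-validated regularization.

```python
def ridge_decoder(X, y, lam=1.0):
    """Fit weights w minimizing ||Xw - y||^2 + lam * ||w||^2.
    Rows of X are neural samples (one value per channel); y is the
    stimulus envelope at the same time points."""
    n_feat = len(X[0])
    # Normal equations: A = X^T X + lam * I, b = X^T y
    A = [[sum(X[t][i] * X[t][j] for t in range(len(X))) + (lam if i == j else 0.0)
          for j in range(n_feat)] for i in range(n_feat)]
    b = [sum(X[t][i] * y[t] for t in range(len(X))) for i in range(n_feat)]
    # Solve A w = b by Gaussian elimination with partial pivoting
    for col in range(n_feat):
        piv = max(range(col, n_feat), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, n_feat):
            f = A[r][col] / A[col][col]
            for c in range(col, n_feat):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    w = [0.0] * n_feat
    for i in reversed(range(n_feat)):
        w[i] = (b[i] - sum(A[i][j] * w[j] for j in range(i + 1, n_feat))) / A[i][i]
    return w

def predict(X, w):
    """Reconstruct the envelope from neural data with fitted weights."""
    return [sum(xi * wi for xi, wi in zip(row, w)) for row in X]
```

Reconstruction accuracy is then typically scored as the correlation between the predicted and actual envelope on held-out data; larger `lam` shrinks the weights toward zero.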
Affiliation(s)
- Andrew W Corcoran
- Corresponding author: Room E672, 20 Chancellors Walk, Clayton, VIC 3800, Australia.
- Ricardo Perera
- Cognition & Philosophy Laboratory, School of Philosophical, Historical, and International Studies, Monash University, Melbourne, VIC 3800, Australia
- Matthieu Koroma
- Brain and Consciousness Group (ENS, EHESS, CNRS), Département d’Études Cognitives, École Normale Supérieure-PSL Research University, Paris 75005, France
- Sid Kouider
- Brain and Consciousness Group (ENS, EHESS, CNRS), Département d’Études Cognitives, École Normale Supérieure-PSL Research University, Paris 75005, France
- Jakob Hohwy
- Cognition & Philosophy Laboratory, School of Philosophical, Historical, and International Studies, Monash University, Melbourne, VIC 3800, Australia; Monash Centre for Consciousness & Contemplative Studies, Monash University, Melbourne, VIC 3800, Australia
- Thomas Andrillon
- Monash Centre for Consciousness & Contemplative Studies, Monash University, Melbourne, VIC 3800, Australia; Paris Brain Institute, Sorbonne Université, Inserm-CNRS, Paris 75013, France
6
Al-Zubaidi A, Bräuer S, Holdgraf CR, Schepers IM, Rieger JW. OUP accepted manuscript. Cereb Cortex Commun 2022; 3:tgac007. PMID: 35281216; PMCID: PMC8914075; DOI: 10.1093/texcom/tgac007
Affiliation(s)
- Arkan Al-Zubaidi
- Applied Neurocognitive Psychology Lab and Cluster of Excellence Hearing4all, Oldenburg University, Oldenburg, Germany
- Research Center Neurosensory Science, Oldenburg University, 26129 Oldenburg, Germany
- Susann Bräuer
- Applied Neurocognitive Psychology Lab and Cluster of Excellence Hearing4all, Oldenburg University, Oldenburg, Germany
- Chris R Holdgraf
- Department of Statistics, UC Berkeley, Berkeley, CA 94720, USA
- International Interactive Computing Collaboration
- Inga M Schepers
- Applied Neurocognitive Psychology Lab and Cluster of Excellence Hearing4all, Oldenburg University, Oldenburg, Germany
- Jochem W Rieger
- Corresponding author: Department of Psychology, Faculty VI, Oldenburg University, 26129 Oldenburg, Germany.
7
Zhang M, Alamatsaz N, Ihlefeld A. Hemodynamic Responses Link Individual Differences in Informational Masking to the Vicinity of Superior Temporal Gyrus. Front Neurosci 2021; 15:675326. PMID: 34366772; PMCID: PMC8339305; DOI: 10.3389/fnins.2021.675326
Abstract
Suppressing unwanted background sound is crucial for aural communication. A particularly disruptive type of background sound, informational masking (IM), often interferes in social settings. However, IM mechanisms are incompletely understood. At present, IM is identified operationally: a target should be audible, based on suprathreshold target/masker energy ratios, yet cannot be heard because target-like background sound interferes. Here, we confirm that speech identification thresholds differ dramatically between low- vs. high-IM background sound. However, speech detection thresholds are comparable across the two conditions. Moreover, functional near-infrared spectroscopy recordings show that task-evoked blood oxygenation changes near the superior temporal gyrus (STG) covary with behavioral speech detection performance for high-IM but not low-IM background sound, suggesting that the STG is part of an IM-dependent network. Furthermore, listeners who are more vulnerable to IM show increased hemodynamic recruitment near the STG, an effect that cannot be explained by differences in task difficulty across low- vs. high-IM conditions. In contrast, task-evoked responses near another auditory region of cortex, the caudal inferior frontal sulcus (cIFS), do not predict behavioral sensitivity, suggesting that the cIFS belongs to an IM-independent network. Results are consistent with the idea that cortical gating shapes individual vulnerability to IM.
Affiliation(s)
- Min Zhang
- Department of Biomedical Engineering, New Jersey Institute of Technology, Newark, NJ, United States
- Rutgers Biomedical and Health Sciences, Rutgers University, Newark, NJ, United States
- Nima Alamatsaz
- Department of Biomedical Engineering, New Jersey Institute of Technology, Newark, NJ, United States
- Rutgers Biomedical and Health Sciences, Rutgers University, Newark, NJ, United States
- Antje Ihlefeld
- Department of Biomedical Engineering, New Jersey Institute of Technology, Newark, NJ, United States
8
Text Captioning Buffers Against the Effects of Background Noise and Hearing Loss on Memory for Speech. Ear Hear 2021; 43:115-127. PMID: 34260436; DOI: 10.1097/aud.0000000000001079
Abstract
OBJECTIVE: Everyday speech understanding frequently occurs in perceptually demanding environments, for example, due to background noise and normal age-related hearing loss. The resulting degraded speech signals increase listening effort, which gives rise to negative downstream effects on subsequent memory and comprehension, even when speech is intelligible. In two experiments, we explored whether the presentation of realistic assistive text-captioned speech offsets the negative effects of background noise and hearing impairment on multiple measures of speech memory. DESIGN: In Experiment 1, young normal-hearing adults (N = 48) listened to sentences for immediate recall and delayed recognition memory. Speech was presented in quiet or in two levels of background noise. Sentences were presented either as speech only or as text-captioned speech. Thus, the experiment followed a 2 (caption vs. no caption) × 3 (no noise, +7 dB signal-to-noise ratio, +3 dB signal-to-noise ratio) within-subjects design. In Experiment 2, a group of older adults (age range: 61 to 80, N = 31) with varying levels of hearing acuity completed the same experimental task as in Experiment 1. For both experiments, immediate recall, recognition memory accuracy, and recognition memory confidence were analyzed via general(ized) linear mixed-effects models. In addition, we examined individual differences as a function of hearing acuity in Experiment 2. RESULTS: In Experiment 1, we found that the presentation of realistic text-captioned speech to young normal-hearing listeners improved immediate recall and delayed recognition memory accuracy and confidence compared with speech alone. Moreover, text captions attenuated the negative effects of background noise on all speech memory outcomes. In Experiment 2, we replicated the same pattern of results in a sample of older adults with varying levels of hearing acuity. Moreover, we showed that the negative effects of hearing loss on speech memory in older adulthood were attenuated by the presentation of text captions. CONCLUSIONS: Collectively, these findings strongly suggest that the simultaneous presentation of text can offset the negative effects of effortful listening on speech memory. Critically, captioning benefits extended from immediate word recall to long-term sentence recognition memory, a benefit observed not only for older adults with hearing loss but also for young normal-hearing listeners. These findings suggest that the text captioning benefit to memory is robust and has potentially wide applications for supporting speech listening in acoustically challenging environments.
9
Prince P, Paul BT, Chen J, Le T, Lin V, Dimitrijevic A. Neural correlates of visual stimulus encoding and verbal working memory differ between cochlear implant users and normal-hearing controls. Eur J Neurosci 2021; 54:5016-5037. PMID: 34146363; PMCID: PMC8457219; DOI: 10.1111/ejn.15365
Abstract
A common concern for individuals with severe-to-profound hearing loss fitted with cochlear implants (CIs) is difficulty following conversations in noisy environments. Recent work has suggested that these difficulties are related to individual differences in brain function, including verbal working memory and the degree of cross-modal reorganization of auditory areas for visual processing. However, the neural basis for these relationships is not fully understood. Here, we investigated neural correlates of visual verbal working memory and sensory plasticity in 14 CI users and age-matched normal-hearing (NH) controls. While we recorded the high-density electroencephalogram (EEG), participants completed a modified Sternberg visual working memory task in which sets of letters and numbers were presented visually and then recalled at a later time. Results suggested that CI users had behavioural working memory performance comparable with that of NH controls. However, CI users had more pronounced neural activity during visual stimulus encoding, including stronger visual-evoked activity in auditory and visual cortices, larger modulations of neural oscillations and increased frontotemporal connectivity. In contrast, during memory retention of the characters, CI users had descriptively weaker neural oscillations and significantly lower frontotemporal connectivity. We interpret the differences in neural correlates of visual stimulus processing in CI users through the lens of cross-modal and intramodal plasticity.
Affiliation(s)
- Priyanka Prince
- Evaluative Clinical Sciences Platform, Sunnybrook Research Institute, Toronto, Ontario, Canada; Department of Physiology, University of Toronto, Toronto, Ontario, Canada
- Brandon T Paul
- Evaluative Clinical Sciences Platform, Sunnybrook Research Institute, Toronto, Ontario, Canada; Otolaryngology-Head and Neck Surgery, Sunnybrook Health Sciences Centre, Toronto, Ontario, Canada; Department of Psychology, Ryerson University, Toronto, Ontario, Canada
- Joseph Chen
- Otolaryngology-Head and Neck Surgery, Sunnybrook Health Sciences Centre, Toronto, Ontario, Canada; Faculty of Medicine, Otolaryngology-Head and Neck Surgery, University of Toronto, Toronto, Ontario, Canada
- Trung Le
- Otolaryngology-Head and Neck Surgery, Sunnybrook Health Sciences Centre, Toronto, Ontario, Canada; Faculty of Medicine, Otolaryngology-Head and Neck Surgery, University of Toronto, Toronto, Ontario, Canada
- Vincent Lin
- Otolaryngology-Head and Neck Surgery, Sunnybrook Health Sciences Centre, Toronto, Ontario, Canada; Faculty of Medicine, Otolaryngology-Head and Neck Surgery, University of Toronto, Toronto, Ontario, Canada
- Andrew Dimitrijevic
- Evaluative Clinical Sciences Platform, Sunnybrook Research Institute, Toronto, Ontario, Canada; Department of Physiology, University of Toronto, Toronto, Ontario, Canada; Otolaryngology-Head and Neck Surgery, Sunnybrook Health Sciences Centre, Toronto, Ontario, Canada; Faculty of Medicine, Otolaryngology-Head and Neck Surgery, University of Toronto, Toronto, Ontario, Canada
10
Wang J, Chen J, Yang X, Liu L, Wu C, Lu L, Li L, Wu Y. Common Brain Substrates Underlying Auditory Speech Priming and Perceived Spatial Separation. Front Neurosci 2021; 15:664985. PMID: 34220425; PMCID: PMC8247760; DOI: 10.3389/fnins.2021.664985
Abstract
Under a “cocktail party” environment, listeners can utilize prior knowledge of the content and voice of the target speech [i.e., auditory speech priming (ASP)] and perceived spatial separation to improve recognition of the target speech among masking speech. Previous studies suggest that these two unmasking cues are not processed independently. However, it is unclear whether the unmasking effects of these two cues are supported by common neural bases. In the current study, we aimed to first confirm that ASP and perceived spatial separation contribute interactively to the improvement of speech recognition in a multitalker condition, and then to investigate whether overlapping brain substrates underlie both unmasking effects, by introducing the two unmasking cues in a unified paradigm and using functional magnetic resonance imaging. The results showed that neural activations by the unmasking effects of ASP and perceived separation partly overlapped in brain areas: the pars triangularis (TriIFG) and pars orbitalis of the left inferior frontal gyrus, the left inferior parietal lobule, the left supramarginal gyrus, and the bilateral putamen, all of which are involved in sensorimotor integration and speech production. The activations of the left TriIFG were correlated with behavioral improvements caused by ASP and perceived separation. Meanwhile, ASP and perceived separation also enhanced the functional connectivity between the left IFG and brain areas related to the suppression of distracting speech signals: the anterior cingulate cortex and the left middle frontal gyrus, respectively. These findings therefore suggest that the motor representation of speech is important for the unmasking effects of both ASP and perceived separation, and highlight the critical role of the left IFG in these unmasking effects in “cocktail party” environments.
Affiliation(s)
- Junxian Wang
- School of Psychological and Cognitive Sciences and Beijing Key Laboratory of Behavior and Mental Health, Peking University, Beijing, China
- Jing Chen
- Department of Machine Intelligence, Peking University, Beijing, China; Speech and Hearing Research Center, Key Laboratory on Machine Perception (Ministry of Education), Peking University, Beijing, China
- Xiaodong Yang
- School of Psychological and Cognitive Sciences and Beijing Key Laboratory of Behavior and Mental Health, Peking University, Beijing, China
- Lei Liu
- School of Psychological and Cognitive Sciences and Beijing Key Laboratory of Behavior and Mental Health, Peking University, Beijing, China
- Chao Wu
- School of Nursing, Peking University, Beijing, China
- Lingxi Lu
- Center for the Cognitive Science of Language, Beijing Language and Culture University, Beijing, China
- Liang Li
- School of Psychological and Cognitive Sciences and Beijing Key Laboratory of Behavior and Mental Health, Peking University, Beijing, China; Speech and Hearing Research Center, Key Laboratory on Machine Perception (Ministry of Education), Peking University, Beijing, China; Beijing Institute for Brain Disorders, Beijing, China
- Yanhong Wu
- School of Psychological and Cognitive Sciences and Beijing Key Laboratory of Behavior and Mental Health, Peking University, Beijing, China; Speech and Hearing Research Center, Key Laboratory on Machine Perception (Ministry of Education), Peking University, Beijing, China
11
Holmes E, Johnsrude IS. Speech-evoked brain activity is more robust to competing speech when it is spoken by someone familiar. Neuroimage 2021; 237:118107. PMID: 33933598; DOI: 10.1016/j.neuroimage.2021.118107
Abstract
When speech is masked by competing sound, people are better at understanding what is said when the talker is familiar than when the talker is unfamiliar. The benefit is robust, but how does processing of familiar voices facilitate intelligibility? We combined high-resolution fMRI with representational similarity analysis to quantify the difference in distributed activity between clear and masked speech. We demonstrate that brain representations of spoken sentences are less affected by a competing sentence when they are spoken by a friend or partner than by someone unfamiliar, effectively showing a cortical signal-to-noise ratio (SNR) enhancement for familiar voices. This effect correlated with the familiar-voice intelligibility benefit. We functionally parcellated auditory cortex and found that the most prominent familiar-voice advantage was manifest along the posterior superior and middle temporal gyri. Overall, our results demonstrate that experience-driven improvements in intelligibility are associated with enhanced multivariate pattern activity in posterior temporal cortex.
Affiliation(s)
- Emma Holmes
- The Brain and Mind Institute, University of Western Ontario, London, Ontario, N6A 3K7, Canada.
- Ingrid S Johnsrude
- The Brain and Mind Institute, University of Western Ontario, London, Ontario, N6A 3K7, Canada; School of Communication Sciences and Disorders, University of Western Ontario, London, Ontario, N6G 1H1, Canada
12
Murai SA, Riquimaroux H. Neural correlates of subjective comprehension of noise-vocoded speech. Hear Res 2021; 405:108249. PMID: 33894680; DOI: 10.1016/j.heares.2021.108249
Abstract
Under an acoustically degraded condition, the degree of speech comprehension fluctuates within individuals. Understanding the relationship between such fluctuations in comprehension and neural responses might reveal perceptual processing for distorted speech. In this study we investigated the cerebral activity associated with the degree of subjective comprehension of noise-vocoded speech sounds (NVSS) using functional magnetic resonance imaging. Our results indicate that higher comprehension of NVSS sentences was associated with greater activation in the right superior temporal cortex, and that activity in the left inferior frontal gyrus (Broca's area) was increased when a listener recognized words in a sentence they did not fully comprehend. In addition, results of laterality analysis demonstrated that recognition of words in an NVSS sentence led to less lateralized responses in the temporal cortex, though a left-lateralization was observed when no words were recognized. The data suggest that variation in comprehension within individuals can be associated with changes in lateralization in the temporal auditory cortex.
Affiliation(s)
- Shota A Murai
- Faculty of Life and Medical Sciences, Doshisha University, 1-3 Miyakodani, Tatara, Kyotanabe 610-0321, Kyoto, Japan
- Hiroshi Riquimaroux
- Faculty of Life and Medical Sciences, Doshisha University, 1-3 Miyakodani, Tatara, Kyotanabe 610-0321, Kyoto, Japan
13
Urbschat A, Uppenkamp S, Anemüller J. Searchlight Classification Informative Region Mixture Model (SCIM): Identification of Cortical Regions Showing Discriminable BOLD Patterns in Event-Related Auditory fMRI Data. Front Neurosci 2021; 14:616906. PMID: 33597841; PMCID: PMC7882477; DOI: 10.3389/fnins.2020.616906
Abstract
The investigation of abstract cognitive tasks, e.g., semantic processing of speech, requires the simultaneous use of a carefully selected stimulus design and sensitive analysis tools for the corresponding neural activity that are comparable across different studies investigating similar research questions. Multi-voxel pattern analysis (MVPA) methods are commonly used in neuroimaging to investigate BOLD responses corresponding to neural activation associated with specific cognitive tasks. Regions of significant activation are identified by a thresholding operation during multivariate pattern analysis, the results of which are sensitive to the chosen threshold value. Investigating analysis approaches that are largely robust with respect to thresholding is thus an important goal pursued here. The present paper contributes a novel statistical analysis method for fMRI experiments, the searchlight classification informative region mixture model (SCIM), which is based on the assumption that the whole brain volume can be subdivided into two groups of voxels: spatial positions around which recorded BOLD activity conveys information about the present stimulus condition, and those around which it does not. A generative statistical model is proposed that assigns a probability of being informative to each position in the brain, based on a combination of a support vector machine searchlight analysis and Gaussian mixture models. Results from an auditory fMRI study investigating cortical regions engaged in the semantic processing of speech indicate that the SCIM method identifies physiologically plausible brain regions as informative, similar to the two standard reference methods we compare against, with two important differences. SCIM-identified regions are very robust to the choice of the significance threshold, i.e., less “noisy,” in contrast to, e.g., the binomial test, whose results in the present experiment depend strongly on the chosen significance threshold, or random permutation tests, which additionally incur very high computational costs. In group analyses, the SCIM method identifies a physiologically plausible prefrontal region, the anterior cingulate sulcus, as involved in semantic processing, which the other methods identify only in single-subject analyses.
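As a toy illustration of the mixture-model idea (not the authors' full pipeline, which couples an SVM searchlight with a generative spatial model), one can fit a two-component Gaussian mixture to searchlight decoding accuracies and read off each position's posterior probability of belonging to the higher-accuracy, informative component. All numbers below are simulated, and scikit-learn's `GaussianMixture` stands in for the paper's model:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Simulated searchlight accuracies for 5000 voxel positions: 90% of voxels
# are uninformative (accuracy near chance, 0.5), 10% are informative.
acc = np.concatenate([
    rng.normal(0.50, 0.03, 4500),   # non-informative component
    rng.normal(0.65, 0.04, 500),    # informative component
])

# Two-component Gaussian mixture over the accuracy map; the posterior of the
# higher-mean component is each voxel's probability of being informative.
gmm = GaussianMixture(n_components=2, random_state=0).fit(acc.reshape(-1, 1))
informative = int(np.argmax(gmm.means_.ravel()))
p_informative = gmm.predict_proba(acc.reshape(-1, 1))[:, informative]

# Voxels drawn from the informative component should receive high posteriors,
# voxels drawn from the chance component low ones.
mean_p_info = p_informative[4500:].mean()
mean_p_noise = p_informative[:4500].mean()
```

Because the output is a graded posterior rather than a thresholded map, downstream conclusions depend far less on any single cut-off, which is the robustness property the abstract emphasizes.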
Affiliation(s)
- Annika Urbschat, Department of Medical Physics and Acoustics, Carl von Ossietzky Universität Oldenburg, Oldenburg, Germany
- Stefan Uppenkamp, Department of Medical Physics and Acoustics, Carl von Ossietzky Universität Oldenburg, Oldenburg, Germany
- Jörn Anemüller, Department of Medical Physics and Acoustics, Carl von Ossietzky Universität Oldenburg, Oldenburg, Germany

14
Kajiura M, Jeong H, Kawata NYS, Yu S, Kinoshita T, Kawashima R, Sugiura M. Brain activity predicts future learning success in intensive second language listening training. BRAIN AND LANGUAGE 2021; 212:104839. [PMID: 33271393 DOI: 10.1016/j.bandl.2020.104839] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 10/30/2019] [Revised: 06/03/2020] [Accepted: 07/14/2020] [Indexed: 06/12/2023]
Abstract
This study explores the neural mechanisms underlying how prior knowledge gained from pre-listening transcript reading helps listeners comprehend fast-rate speech in a second language (L2), and how this applies to L2 learning. Top-down predictive processing based on prior knowledge may play an important role in L2 speech comprehension and in improving listening skills. By manipulating the pre-listening transcript condition (pre-listening transcript reading [TR] vs. no transcript reading [NTR]) and the language (first language [L1] vs. L2), we measured brain activity in L2 learners who performed fast-rate listening comprehension tasks during functional magnetic resonance imaging. Thereafter, we examined whether TR_L2-specific brain activity can predict individual learning success after intensive listening training. The left angular and superior temporal gyri were key areas responsible for integrating prior knowledge with sensory input. Activity in these areas correlated significantly with gain scores from subsequent training, indicating that brain activity related to the integration of prior knowledge with sensory input predicts future learning success.
Affiliation(s)
- Mayumi Kajiura, Division of Foreign Language Education, Aichi Shukutoku University, Nagoya, Japan
- Hyeonjeong Jeong, Graduate School of International Cultural Studies, Tohoku University, Sendai, Japan; Institute of Development, Aging and Cancer, Tohoku University, Sendai, Japan
- Natasha Y S Kawata, Institute of Development, Aging and Cancer, Tohoku University, Sendai, Japan
- Shaoyun Yu, Graduate School of Humanities, Nagoya University, Nagoya, Japan
- Toru Kinoshita, Graduate School of Humanities, Nagoya University, Nagoya, Japan
- Ryuta Kawashima, Institute of Development, Aging and Cancer, Tohoku University, Sendai, Japan
- Motoaki Sugiura, Institute of Development, Aging and Cancer, Tohoku University, Sendai, Japan; International Research Institute for Disaster Science, Tohoku University, Sendai, Japan

15
Holmes E, Zeidman P, Friston KJ, Griffiths TD. Difficulties with Speech-in-Noise Perception Related to Fundamental Grouping Processes in Auditory Cortex. Cereb Cortex 2020; 31:1582-1596. [PMID: 33136138 PMCID: PMC7869094 DOI: 10.1093/cercor/bhaa311] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/11/2020] [Revised: 08/04/2020] [Accepted: 09/22/2020] [Indexed: 01/05/2023] Open
Abstract
In our everyday lives, we are often required to follow a conversation when background noise is present (“speech-in-noise” [SPIN] perception). SPIN perception varies widely—and people who are worse at SPIN perception are also worse at fundamental auditory grouping, as assessed by figure-ground tasks. Here, we examined the cortical processes that link difficulties with SPIN perception to difficulties with figure-ground perception using functional magnetic resonance imaging. We found strong evidence that the earliest stages of the auditory cortical hierarchy (left core and belt areas) are similarly disinhibited when SPIN and figure-ground tasks are more difficult (i.e., at target-to-masker ratios corresponding to 60% rather than 90% performance)—consistent with increased cortical gain at lower levels of the auditory hierarchy. Overall, our results reveal a common neural substrate for these basic (figure-ground) and naturally relevant (SPIN) tasks—which provides a common computational basis for the link between SPIN perception and fundamental auditory grouping.
Affiliation(s)
- Emma Holmes, Wellcome Centre for Human Neuroimaging, UCL Queen Square Institute of Neurology, UCL, London WC1N 3AR, UK
- Peter Zeidman, Wellcome Centre for Human Neuroimaging, UCL Queen Square Institute of Neurology, UCL, London WC1N 3AR, UK
- Karl J Friston, Wellcome Centre for Human Neuroimaging, UCL Queen Square Institute of Neurology, UCL, London WC1N 3AR, UK
- Timothy D Griffiths, Wellcome Centre for Human Neuroimaging, UCL Queen Square Institute of Neurology, UCL, London WC1N 3AR, UK; Biosciences Institute, Faculty of Medical Sciences, Newcastle University, Newcastle upon Tyne NE2 4HH, UK

16
Signoret C, Andersen LM, Dahlström Ö, Blomberg R, Lundqvist D, Rudner M, Rönnberg J. The Influence of Form- and Meaning-Based Predictions on Cortical Speech Processing Under Challenging Listening Conditions: A MEG Study. Front Neurosci 2020; 14:573254. [PMID: 33100961 PMCID: PMC7546411 DOI: 10.3389/fnins.2020.573254] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/16/2020] [Accepted: 09/01/2020] [Indexed: 01/07/2023] Open
Abstract
Under adverse listening conditions, prior linguistic knowledge about the form (i.e., phonology) and meaning (i.e., semantics) of speech helps us to predict what an interlocutor is about to say. Previous research has shown that accurate predictions of incoming speech increase speech intelligibility, and that semantic predictions enhance the perceptual clarity of degraded speech even when exact phonological predictions are possible. In addition, working memory (WM) is thought to have a specific influence on anticipatory mechanisms by actively maintaining and updating the relevance of predicted vs. unpredicted speech inputs. However, the relative impact on speech processing of deviations from expectations related to form and meaning is incompletely understood. Here, we use MEG to investigate the cortical temporal processing of deviations from the expected form and meaning of final words during sentence processing. Our overall aim was to observe how deviations from the expected form and meaning modulate cortical speech processing under adverse listening conditions and to investigate the degree to which this is associated with WM capacity. Results indicated that different types of deviations are processed differently in the auditory N400 and Mismatch Negativity (MMN) components. In particular, the MMN was sensitive to the type of deviation (form or meaning), whereas the N400 was sensitive to the magnitude of the deviation rather than its type. WM capacity was associated with the ability to process incoming phonological information and with semantic integration.
Affiliation(s)
- Carine Signoret, Linnaeus Centre HEAD, Swedish Institute for Disability Research, Department of Behavioural Sciences and Learning, Linköping University, Linköping, Sweden
- Lau M Andersen, The National Research Facility for Magnetoencephalography, Department of Clinical Neuroscience, Karolinska Institutet, Solna, Sweden; Center of Functionally Integrative Neuroscience, Institute of Clinical Medicine, Aarhus University, Aarhus, Denmark
- Örjan Dahlström, Linnaeus Centre HEAD, Swedish Institute for Disability Research, Department of Behavioural Sciences and Learning, Linköping University, Linköping, Sweden
- Rina Blomberg, Linnaeus Centre HEAD, Swedish Institute for Disability Research, Department of Behavioural Sciences and Learning, Linköping University, Linköping, Sweden
- Daniel Lundqvist, The National Research Facility for Magnetoencephalography, Department of Clinical Neuroscience, Karolinska Institutet, Solna, Sweden
- Mary Rudner, Linnaeus Centre HEAD, Swedish Institute for Disability Research, Department of Behavioural Sciences and Learning, Linköping University, Linköping, Sweden
- Jerker Rönnberg, Linnaeus Centre HEAD, Swedish Institute for Disability Research, Department of Behavioural Sciences and Learning, Linköping University, Linköping, Sweden

17
Banellis L, Sokoliuk R, Wild CJ, Bowman H, Cruse D. Event-related potentials reflect prediction errors and pop-out during comprehension of degraded speech. Neurosci Conscious 2020; 2020:niaa022. [PMID: 33133640 PMCID: PMC7585676 DOI: 10.1093/nc/niaa022] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/24/2020] [Revised: 07/08/2020] [Accepted: 08/06/2020] [Indexed: 11/20/2022] Open
Abstract
Comprehension of degraded speech requires higher-order expectations informed by prior knowledge. Accurate top-down expectations of incoming degraded speech cause a subjective semantic 'pop-out' or conscious breakthrough experience. Indeed, the same stimulus can be perceived as meaningless when no expectations are made in advance. We investigated the event-related potential (ERP) correlates of these top-down expectations, their error signals and the subjective pop-out experience in healthy participants. We manipulated expectations in a word-pair priming degraded (noise-vocoded) speech task and investigated the role of top-down expectation with a between-groups attention manipulation. Consistent with the role of expectations in comprehension, repetition priming significantly enhanced perceptual intelligibility of the noise-vocoded degraded targets for attentive participants. An early ERP was larger for mismatched (i.e. unexpected) targets than matched targets, indicative of an initial error signal not reliant on top-down expectations. Subsequently, a P3a-like ERP was larger to matched targets than mismatched targets only for attending participants (i.e. a pop-out effect), while a later ERP was larger for mismatched targets and did not significantly interact with attention. Rather than relying on complex post hoc interactions between prediction error and precision to explain this apredictive pattern, we consider our data to be consistent with prediction error minimization accounts for early stages of processing followed by Global Neuronal Workspace-like breakthrough and processing in service of task goals.
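Noise vocoding, the degradation used in this and several of the other studies listed here, is a standard signal-processing technique: the signal is split into frequency bands, and each band's amplitude envelope is extracted and used to modulate band-limited noise. A minimal sketch on a synthetic signal, assuming log-spaced bands and a Hilbert envelope (real vocoders typically also low-pass filter the envelope):

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def noise_vocode(signal, fs, n_bands=4, f_lo=100.0, f_hi=4000.0, seed=0):
    """Minimal noise vocoder: split the input into log-spaced frequency
    bands, extract each band's Hilbert envelope, and use it to modulate
    band-limited Gaussian noise."""
    rng = np.random.default_rng(seed)
    edges = np.geomspace(f_lo, f_hi, n_bands + 1)
    out = np.zeros(len(signal), dtype=float)
    for lo, hi in zip(edges[:-1], edges[1:]):
        sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
        band = sosfiltfilt(sos, signal)
        env = np.abs(hilbert(band))                  # amplitude envelope
        carrier = sosfiltfilt(sos, rng.standard_normal(len(signal)))
        out += env * carrier                         # envelope-modulated noise
    return out

fs = 16000
t = np.arange(fs) / fs
# Amplitude-modulated tone as a stand-in for a speech signal.
speechlike = np.sin(2 * np.pi * 440 * t) * (0.5 + 0.5 * np.sin(2 * np.pi * 3 * t))
vocoded = noise_vocode(speechlike, fs)
```

The vocoded output preserves the slow envelope cues that support intelligibility while discarding fine spectral detail, which is why comprehension then depends so heavily on top-down expectations.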
Affiliation(s)
- Leah Banellis, School of Psychology and Centre for Human Brain Health, University of Birmingham, Edgbaston B15 2TT, UK
- Rodika Sokoliuk, School of Psychology and Centre for Human Brain Health, University of Birmingham, Edgbaston B15 2TT, UK
- Conor J Wild, Brain and Mind Institute, University of Western Ontario, London, ON N6A 3K7, Canada
- Howard Bowman, School of Psychology and Centre for Human Brain Health, University of Birmingham, Edgbaston B15 2TT, UK; School of Computing, University of Kent, Canterbury, Kent CT2 7NF, UK
- Damian Cruse, School of Psychology and Centre for Human Brain Health, University of Birmingham, Edgbaston B15 2TT, UK

18
Abstract
OBJECTIVES Slowed speaking rate was examined for its effects on speech intelligibility, its interaction with the benefit of contextual cues, and the impact of these factors on listening effort in adults with cochlear implants. DESIGN Participants (n = 21 cochlear implant users) heard high- and low-context sentences that were played at the original speaking rate, as well as a slowed (1.4× duration) speaking rate, using uniform pitch-synchronous time warping. In addition to intelligibility measures, changes in pupil dilation were measured as a time-varying index of processing load or listening effort. Slope of pupil size recovery to baseline after the sentence was used as an index of resolution of perceptual ambiguity. RESULTS Speech intelligibility was better for high-context compared to low-context sentences and slightly better for slower compared to original-rate speech. Speech rate did not affect magnitude and latency of peak pupil dilation relative to sentence offset. However, baseline pupil size recovered more substantially for slower-rate sentences, suggesting easier processing in the moment after the sentence was over. The effect of slowing speech rate was comparable to changing a sentence from low context to high context. The effect of context on pupil dilation was not observed until after the sentence was over, and one of two analyses suggested that context had greater beneficial effects on listening effort when the speaking rate was slower. These patterns were maintained even at perfect sentence intelligibility, suggesting that correct speech repetition does not guarantee efficient or effortless processing. With slower speaking rates, there was less variability in pupil dilation slopes following the sentence, implying mitigation of some of the difficulties shown by individual listeners who would otherwise demonstrate prolonged effort after a sentence is heard.
CONCLUSIONS Slowed speaking rate provides release from listening effort when hearing an utterance, particularly relieving effort that would have lingered after a sentence is over. Context arguably provides even more release from listening effort when speaking rate is slower. The pattern of prolonged pupil dilation for faster speech is consistent with increased need to mentally correct errors, although that exact interpretation cannot be verified with intelligibility data alone or with pupil data alone. A pattern of needing to dwell on a sentence to disambiguate misperceptions likely contributes to difficulty in running conversation where there are few opportunities to pause and resolve recently heard utterances.
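The slope-of-recovery measure can be sketched as a straight-line fit to the post-sentence pupil trace; the traces and sampling rate below are hypothetical, chosen only to illustrate the contrast the abstract describes:

```python
import numpy as np

def recovery_slope(pupil, fs, window_s=1.0):
    """Slope (arbitrary units per second) of a straight-line fit to a pupil
    trace over the first `window_s` seconds after sentence offset; a more
    negative slope means a faster return of pupil size toward baseline."""
    n = int(window_s * fs)
    t = np.arange(n) / fs
    slope = np.polyfit(t, pupil[:n], 1)[0]
    return slope

fs = 60  # eye-tracker sampling rate in Hz (assumed, not from the study)
t = np.arange(2 * fs) / fs

# Hypothetical baseline-corrected post-sentence traces:
slow_rate = 1.0 - 0.8 * t   # slowed speech: substantial recovery to baseline
orig_rate = 1.0 - 0.2 * t   # original rate: dilation lingers after offset

slope_slow = recovery_slope(slow_rate, fs)
slope_orig = recovery_slope(orig_rate, fs)
```

A steeper (more negative) post-sentence slope corresponds to the "release from lingering effort" interpretation given for the slowed-rate condition.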
19
Wikman P, Sahari E, Salmela V, Leminen A, Leminen M, Laine M, Alho K. Breaking down the cocktail party: Attentional modulation of cerebral audiovisual speech processing. Neuroimage 2020; 224:117365. [PMID: 32941985 DOI: 10.1016/j.neuroimage.2020.117365] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/15/2020] [Revised: 08/19/2020] [Accepted: 09/07/2020] [Indexed: 12/20/2022] Open
Abstract
Recent studies utilizing electrophysiological speech envelope reconstruction have sparked renewed interest in the cocktail party effect by showing that auditory neurons entrain to selectively attended speech. Yet, the neural networks of attention to speech in naturalistic audiovisual settings with multiple sound sources remain poorly understood. We collected functional brain imaging data while participants viewed audiovisual video clips of lifelike dialogues with concurrent distracting speech in the background. Dialogues were presented in a full-factorial design, comprising task (listen to the dialogues vs. ignore them), audiovisual quality and semantic predictability. We used univariate analyses in combination with multivariate pattern analysis (MVPA) to study modulations of brain activity related to attentive processing of audiovisual speech. We found attentive speech processing to cause distinct spatiotemporal modulation profiles in distributed cortical areas including sensory and frontal-control networks. Semantic coherence modulated attention-related activation patterns in the earliest stages of auditory cortical processing, suggesting that the auditory cortex is involved in high-level speech processing. Our results corroborate views that emphasize the dynamic nature of attention, with task-specificity and context as cornerstones of the underlying neuro-cognitive mechanisms.
Affiliation(s)
- Patrik Wikman, Department of Psychology and Logopedics, University of Helsinki, Helsinki, Finland
- Elisa Sahari, Department of Psychology and Logopedics, University of Helsinki, Helsinki, Finland
- Viljami Salmela, Department of Psychology and Logopedics, University of Helsinki, Helsinki, Finland; Advanced Magnetic Imaging Centre, Aalto NeuroImaging, Aalto University, Espoo, Finland
- Alina Leminen, Department of Psychology and Logopedics, University of Helsinki, Helsinki, Finland; Department of Digital Humanities, University of Helsinki, Helsinki, Finland
- Miika Leminen, Department of Psychology and Logopedics, University of Helsinki, Helsinki, Finland; Department of Phoniatrics, Helsinki University Hospital, Helsinki, Finland
- Matti Laine, Department of Psychology, Åbo Akademi University, Turku, Finland
- Kimmo Alho, Department of Psychology and Logopedics, University of Helsinki, Helsinki, Finland; Advanced Magnetic Imaging Centre, Aalto NeuroImaging, Aalto University, Espoo, Finland

20
Lawless MS, Vigeant MC. Sensitivity of the human auditory cortex and reward network to reverberant musical stimuli. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2020; 147:2121. [PMID: 32359334 DOI: 10.1121/10.0000984] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 10/14/2019] [Accepted: 03/12/2020] [Indexed: 06/11/2023]
Abstract
A room's acoustics can alter subjective impressions of music, including preference. However, little research has characterized the brain's response to room conditions. Functional magnetic resonance imaging (fMRI) was used to investigate the auditory and reward responses to concert hall stimuli. Before the fMRI testing, 18 participants rated their preferences for a solo-instrumental passage and an orchestral motif simulated in eight room acoustic conditions outside an MRI scanner to identify their most liked and disliked conditions. In the MRI, the most-liked (reverberation time, RT = 1.0-2.8 s) and most-disliked (RT = 7.2 s) conditions, along with the anechoic and scrambled versions of the musical passages, were presented. The auditory cortex was found to be sensitive to the temporal coherence of the stimuli, as it exhibited stronger activations for simpler stimuli (the solo-instrumental and anechoic conditions) than for stimuli containing temporally incoherent auditory objects (the orchestral and reverberant conditions). In contrasts between liked and disliked reverberant stimuli, a reward response in the basal ganglia was detected in a region-of-interest analysis using a temporal derivative model of the hemodynamic response function. This response may indicate differences in preference between subtle variations in room acoustics applied to the same musical passage.
Affiliation(s)
- Martin S Lawless, Graduate Program in Acoustics, The Pennsylvania State University, 201 Applied Science Building, University Park, Pennsylvania 16802, USA
- Michelle C Vigeant, Graduate Program in Acoustics, The Pennsylvania State University, 201 Applied Science Building, University Park, Pennsylvania 16802, USA

21
Krishnan S, Lima CF, Evans S, Chen S, Guldner S, Yeff H, Manly T, Scott SK. Beatboxers and Guitarists Engage Sensorimotor Regions Selectively When Listening to the Instruments They can Play. Cereb Cortex 2019; 28:4063-4079. [PMID: 30169831 PMCID: PMC6188551 DOI: 10.1093/cercor/bhy208] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/16/2017] [Accepted: 08/04/2018] [Indexed: 12/31/2022] Open
Abstract
Studies of classical musicians have demonstrated that expertise modulates neural responses during auditory perception. However, it remains unclear whether such expertise-dependent plasticity is modulated by the instrument that a musician plays. To examine whether the recruitment of sensorimotor regions during music perception is modulated by instrument-specific experience, we studied nonclassical musicians-beatboxers, who predominantly use their vocal apparatus to produce sound, and guitarists, who use their hands. We contrast fMRI activity in 20 beatboxers, 20 guitarists, and 20 nonmusicians as they listen to novel beatboxing and guitar pieces. All musicians show enhanced activity in sensorimotor regions (IFG, IPC, and SMA), but only when listening to the musical instrument they can play. Using independent component analysis, we find expertise-selective enhancement in sensorimotor networks, which are distinct from changes in attentional networks. These findings suggest that long-term sensorimotor experience facilitates access to the posterodorsal "how" pathway during auditory processing.
Affiliation(s)
- Saloni Krishnan, Institute of Cognitive Neuroscience, University College London, 17 Queen Square, London, UK; Department of Experimental Psychology, University of Oxford, Anna Watts Building, Radcliffe Observatory Quarter, Oxford, UK
- César F Lima, Institute of Cognitive Neuroscience, University College London, 17 Queen Square, London, UK; Instituto Universitário de Lisboa (ISCTE-IUL), Avenida das Forças Armadas, Lisboa, Portugal
- Samuel Evans, Institute of Cognitive Neuroscience, University College London, 17 Queen Square, London, UK; Department of Psychology, University of Westminster, 115 New Cavendish Street, London, UK
- Sinead Chen, Institute of Cognitive Neuroscience, University College London, 17 Queen Square, London, UK
- Stella Guldner, Institute of Cognitive Neuroscience, University College London, 17 Queen Square, London, UK; Graduate School of Economic and Social Sciences (GESS), University of Mannheim, Mannheim, Germany
- Harry Yeff, Get Involved Ltd, 3 Loughborough Street, London, UK
- Tom Manly, MRC Cognition and Brain Sciences Unit, 15 Chaucer Road, Cambridge, UK
- Sophie K Scott, Institute of Cognitive Neuroscience, University College London, 17 Queen Square, London, UK

22
Guediche S, Zhu Y, Minicucci D, Blumstein SE. Written sentence context effects on acoustic-phonetic perception: fMRI reveals cross-modal semantic-perceptual interactions. BRAIN AND LANGUAGE 2019; 199:104698. [PMID: 31586792 DOI: 10.1016/j.bandl.2019.104698] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/13/2018] [Revised: 09/15/2019] [Accepted: 09/18/2019] [Indexed: 06/10/2023]
Abstract
This study examines cross-modality effects of a semantically-biased written sentence context on the perception of an acoustically-ambiguous word target, identifying neural areas sensitive to interactions between sentential bias and phonetic ambiguity. Of interest is whether the locus or nature of the interactions resembles those previously demonstrated for auditory-only effects. fMRI results show significant interaction effects in the right mid-middle temporal gyrus (RmMTG) and bilateral anterior superior temporal gyri (aSTG), regions along the ventral language comprehension stream that map sound onto meaning. These regions are more anterior than those previously identified for auditory-only effects; however, the same cross-over interaction pattern emerged, implying that similar underlying computations are at play. The findings suggest that the mechanisms that integrate information across modalities and across the sentence and phonetic levels of processing recruit amodal areas where reading and spoken lexical and semantic access converge. Taken together, the results support interactive accounts of speech and language processing.
Affiliation(s)
- Sara Guediche, Department of Cognitive, Linguistic & Psychological Sciences, Brown University, United States; BCBL - Basque Center on Cognition, Brain and Language, Donostia-San Sebastian, Spain
- Yuli Zhu, Neuroscience Department, Brown University, United States
- Domenic Minicucci, Department of Cognitive, Linguistic & Psychological Sciences, Brown University, United States
- Sheila E Blumstein, Department of Cognitive, Linguistic & Psychological Sciences, Brown University, United States; Brown Institute for Brain Science, Brown University, United States

23
A Sound-Sensitive Source of Alpha Oscillations in Human Non-Primary Auditory Cortex. J Neurosci 2019; 39:8679-8689. [PMID: 31533976 PMCID: PMC6820204 DOI: 10.1523/jneurosci.0696-19.2019] [Citation(s) in RCA: 32] [Impact Index Per Article: 6.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/27/2019] [Revised: 06/09/2019] [Accepted: 07/02/2019] [Indexed: 02/06/2023] Open
Abstract
The functional organization of human auditory cortex can be probed by characterizing responses to various classes of sound at different anatomical locations. Along with histological studies, this approach has revealed a primary field in posteromedial Heschl's gyrus (HG) with pronounced induced high-frequency (70–150 Hz) activity and short-latency responses that phase-lock to rapid transient sounds. Low-frequency neural oscillations are also relevant to stimulus processing and information flow; however, their distribution within auditory cortex has not been established. Alpha activity (7–14 Hz) in particular has been associated with processes that may differentially engage earlier versus later levels of the cortical hierarchy, including functional inhibition and the communication of sensory predictions. These theories derive largely from the study of occipitoparietal sources readily detectable in scalp electroencephalography. To characterize the anatomical basis and functional significance of the less accessible temporal-lobe alpha activity, we analyzed responses to sentences in seven human adults (4 female) with epilepsy who had been implanted with electrodes in superior temporal cortex. In contrast to primary cortex in posteromedial HG, a non-primary field in anterolateral HG was characterized by high spontaneous alpha activity that was strongly suppressed during auditory stimulation. Alpha-power suppression decreased with distance from anterolateral HG throughout superior temporal cortex and was more pronounced for clear compared to degraded speech. This suppression could not be accounted for solely by a change in the slope of the power spectrum. The differential manifestation and stimulus-sensitivity of alpha oscillations across auditory fields should be accounted for in theories of their generation and function.
SIGNIFICANCE STATEMENT To understand how auditory cortex is organized in support of perception, we recorded from patients implanted with electrodes for clinical reasons. This allowed measurement of activity in brain regions at different levels of sensory processing. Oscillations in the alpha range (7–14 Hz) have been associated with functions including sensory prediction and inhibition of regions handling irrelevant information, but their distribution within auditory cortex is not known. A key finding was that these oscillations dominated in one particular non-primary field, anterolateral Heschl's gyrus, and were suppressed when subjects listened to sentences. These results build on our knowledge of the functional organization of auditory cortex and provide anatomical constraints on theories of the generation and function of alpha oscillations.
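Alpha-band power and its suppression during stimulation can be quantified from a spectral estimate. A minimal sketch using Welch's method on synthetic signals (the 7–14 Hz band follows the abstract; the sampling rate and the signals themselves are assumptions):

```python
import numpy as np
from scipy.signal import welch

def alpha_power(x, fs, band=(7.0, 14.0)):
    """Mean power spectral density in the alpha band via Welch's method."""
    f, pxx = welch(x, fs=fs, nperseg=fs)   # 1 s segments -> 1 Hz resolution
    mask = (f >= band[0]) & (f <= band[1])
    return pxx[mask].mean()

rng = np.random.default_rng(1)
fs = 500                                   # assumed sampling rate in Hz
t = np.arange(10 * fs) / fs
noise = rng.standard_normal(t.size)

# Hypothetical traces: a strong 10 Hz rhythm at rest that is attenuated
# during sentence listening, as reported for anterolateral HG.
rest = noise + 2.0 * np.sin(2 * np.pi * 10 * t)
stim = noise + 0.3 * np.sin(2 * np.pi * 10 * t)

# Fractional alpha-power suppression (0 = none, 1 = complete).
suppression = 1.0 - alpha_power(stim, fs) / alpha_power(rest, fs)
```

In practice one would also model the broadband (1/f) slope separately, since the abstract notes that the observed suppression is not explained by a slope change alone.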
24
Planton S, Chanoine V, Sein J, Anton JL, Nazarian B, Pallier C, Pattamadilok C. Top-down activation of the visuo-orthographic system during spoken sentence processing. Neuroimage 2019; 202:116135. [PMID: 31470125 DOI: 10.1016/j.neuroimage.2019.116135] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/16/2019] [Revised: 08/09/2019] [Accepted: 08/26/2019] [Indexed: 11/28/2022] Open
Abstract
The left ventral occipitotemporal cortex (vOT) is considered the key area of the visuo-orthographic system. However, some studies have reported that the area is also involved in speech processing tasks, especially those that require activation of orthographic knowledge. These findings suggest the existence of a top-down activation mechanism allowing such cross-modal activation. Yet, little is known about the involvement of the vOT in more natural speech processing situations like spoken sentence processing. Here, we addressed this issue in a functional magnetic resonance imaging (fMRI) study while manipulating the impact of two factors, i.e., task demands (semantic vs. low-level perceptual task) and the quality of the speech signal (sentences presented against a clear vs. noisy background). Analyses were performed at the whole-brain level and at the level of regions of interest (ROI), focusing on the vOT voxels individually identified through a reading task. Whole-brain analysis showed that processing spoken sentences induced activity in a large network including the regions typically involved in phonological, articulatory, semantic and orthographic processing. ROI analysis further specified that a significant part of the vOT voxels that responded to written words also responded to spoken sentences, suggesting that the same area within the left occipitotemporal pathway contributes to both reading and speech processing. Interestingly, both analyses provided converging evidence that vOT responses to speech were sensitive to both task demands and the quality of the speech signal: compared to the low-level perceptual task, activity of the area increased when comprehension effort was required. The impact of background noise depended on task demands: it led to a decrease of vOT activity in the semantic task but not in the low-level perceptual task.
Our results provide new insights into the function of this key area of the reading network, notably by showing that its speech-induced top-down activation also generalizes to ecological speech processing situations.
Collapse
Affiliation(s)
- Samuel Planton
- Aix Marseille Univ, CNRS, LPL, Aix-en-Provence, France; INSERM-CEA, Cognitive Neuroimaging Unit, Neurospin Center, Gif-sur-Yvette, France.
| | - Valérie Chanoine
- Aix Marseille Univ, Institute of Language, Communication and the Brain, Brain and Language Research Institute, Aix-en-Provence, France
| | - Julien Sein
- Aix Marseille Univ, CNRS, Centre IRM-INT, INT UMR, 7289, Marseille, France
| | - Jean-Luc Anton
- Aix Marseille Univ, CNRS, Centre IRM-INT, INT UMR, 7289, Marseille, France
| | - Bruno Nazarian
- Aix Marseille Univ, CNRS, Centre IRM-INT, INT UMR, 7289, Marseille, France
| | - Christophe Pallier
- INSERM-CEA, Cognitive Neuroimaging Unit, Neurospin Center, Gif-sur-Yvette, France
| | | |
Collapse
|
25
|
Zekveld AA, Kramer SE, Rönnberg J, Rudner M. In a Concurrent Memory and Auditory Perception Task, the Pupil Dilation Response Is More Sensitive to Memory Load Than to Auditory Stimulus Characteristics. Ear Hear 2019; 40:272-286. [PMID: 29923867 PMCID: PMC6400496 DOI: 10.1097/aud.0000000000000612] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/14/2017] [Accepted: 04/10/2018] [Indexed: 11/30/2022]
Abstract
OBJECTIVES Speech understanding may be cognitively demanding, but it can be enhanced when semantically related text cues precede auditory sentences. The present study aimed to determine whether (a) providing text cues reduces pupil dilation, a measure of cognitive load, during listening to sentences, (b) repeating the sentences aloud affects recall accuracy and pupil dilation during recall of cue words, and (c) semantic relatedness between cues and sentences affects recall accuracy and pupil dilation during recall of cue words. DESIGN Sentence repetition following text cues and recall of the text cues were tested. Twenty-six participants (mean age, 22 years) with normal hearing listened to masked sentences. On each trial, a set of four-word cues was presented visually as text preceding the auditory presentation of a sentence whose meaning was either related or unrelated to the cues. On each trial, participants first read the cue words, then listened to a sentence. Following this they spoke aloud either the cue words or the sentence, according to instruction, and finally on all trials orally recalled the cues. Peak pupil dilation was measured throughout listening and recall on each trial. Additionally, participants completed a test measuring the ability to perceive degraded verbal text information and three working memory tests (a reading span test, a size-comparison span test, and a test of memory updating). RESULTS Cue words that were semantically related to the sentence facilitated sentence repetition but did not reduce pupil dilation. Recall was poorer and there were more intrusion errors when the cue words were related to the sentences. Recall was also poorer when sentences were repeated aloud. Both behavioral effects were associated with greater pupil dilation. Larger reading span capacity and smaller size-comparison span were associated with larger peak pupil dilation during listening. 
Furthermore, larger reading span and greater memory updating ability were both associated with better cue recall overall. CONCLUSIONS Although sentence-related word cues facilitate sentence repetition, our results indicate that they do not reduce cognitive load during listening in noise with a concurrent memory load. As expected, higher working memory capacity was associated with better recall of the cues. Unexpectedly, however, semantic relatedness with the sentence reduced word cue recall accuracy and increased intrusion errors, suggesting an effect of semantic confusion. Further, speaking the sentence aloud also reduced word cue recall accuracy, probably due to articulatory suppression. Importantly, imposing a memory load during listening to sentences resulted in the absence of formerly established strong effects of speech intelligibility on the pupil dilation response. This nullified intelligibility effect demonstrates that the pupil dilation response to a cognitive (memory) task can completely overshadow the effect of perceptual factors on the pupil dilation response. This highlights the importance of taking cognitive task load into account during auditory testing.
Collapse
Affiliation(s)
- Adriana A. Zekveld
- Department of Behavioural Sciences and Learning, Linköping University, Linköping, Sweden
- Linnaeus Centre HEAD, The Swedish Institute for Disability Research, Linköping and Örebro Universities, Linköping, Sweden
- Section Ear & Hearing, Department of Otolaryngology-Head and Neck Surgery and Amsterdam Public Health research institute, VU University Medical Center, Amsterdam, The Netherlands
| | - Sophia E. Kramer
- Section Ear & Hearing, Department of Otolaryngology-Head and Neck Surgery and Amsterdam Public Health research institute, VU University Medical Center, Amsterdam, The Netherlands
| | - Jerker Rönnberg
- Department of Behavioural Sciences and Learning, Linköping University, Linköping, Sweden
- Linnaeus Centre HEAD, The Swedish Institute for Disability Research, Linköping and Örebro Universities, Linköping, Sweden
| | - Mary Rudner
- Department of Behavioural Sciences and Learning, Linköping University, Linköping, Sweden
- Linnaeus Centre HEAD, The Swedish Institute for Disability Research, Linköping and Örebro Universities, Linköping, Sweden
| |
Collapse
|
26
|
Peelle JE. Listening Effort: How the Cognitive Consequences of Acoustic Challenge Are Reflected in Brain and Behavior. Ear Hear 2019; 39:204-214. [PMID: 28938250 PMCID: PMC5821557 DOI: 10.1097/aud.0000000000000494] [Citation(s) in RCA: 309] [Impact Index Per Article: 61.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/28/2017] [Accepted: 07/28/2017] [Indexed: 02/04/2023]
Abstract
Everyday conversation frequently includes challenges to the clarity of the acoustic speech signal, including hearing impairment, background noise, and foreign accents. Although an obvious problem is the increased risk of making word identification errors, extracting meaning from a degraded acoustic signal is also cognitively demanding, which contributes to increased listening effort. The concepts of cognitive demand and listening effort are critical in understanding the challenges listeners face in comprehension, which are not fully predicted by audiometric measures. In this article, the authors review converging behavioral, pupillometric, and neuroimaging evidence that understanding acoustically degraded speech requires additional cognitive support and that this cognitive load can interfere with other operations such as language processing and memory for what has been heard. Behaviorally, acoustic challenge is associated with increased errors in speech understanding, poorer performance on concurrent secondary tasks, more difficulty processing linguistically complex sentences, and reduced memory for verbal material. Measures of pupil dilation support the challenge associated with processing a degraded acoustic signal, indirectly reflecting an increase in neural activity. Finally, functional brain imaging reveals that the neural resources required to understand degraded speech extend beyond traditional perisylvian language networks, most commonly including regions of prefrontal cortex, premotor cortex, and the cingulo-opercular network. Far from being exclusively an auditory problem, acoustic degradation presents listeners with a systems-level challenge that requires the allocation of executive cognitive resources. An important point is that a number of dissociable processes can be engaged to understand degraded speech, including verbal working memory and attention-based performance monitoring. 
The specific resources required likely differ as a function of the acoustic, linguistic, and cognitive demands of the task, as well as individual differences in listeners' abilities. A greater appreciation of cognitive contributions to processing degraded speech is critical in understanding individual differences in comprehension ability, variability in the efficacy of assistive devices, and guiding rehabilitation approaches to reducing listening effort and facilitating communication.
Collapse
Affiliation(s)
- Jonathan E Peelle
- Department of Otolaryngology, Washington University in Saint Louis, Saint Louis, Missouri, USA
| |
Collapse
|
27
|
Rosemann S, Thiel CM. The effect of age-related hearing loss and listening effort on resting state connectivity. Sci Rep 2019; 9:2337. [PMID: 30787339 PMCID: PMC6382886 DOI: 10.1038/s41598-019-38816-z] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/18/2018] [Accepted: 01/10/2019] [Indexed: 12/22/2022] Open
Abstract
Age-related hearing loss is associated with a decrease in hearing abilities for high frequencies. This increases not only the difficulty of understanding speech but also the experienced listening effort. Task-based neuroimaging studies in normal-hearing and hearing-impaired participants show increased frontal activation during effortful speech perception in the hearing-impaired. Whether the increased effort of everyday listening in hearing-impaired individuals even impacts functional brain connectivity at rest is unknown. Nineteen normal-hearing and nineteen hearing-impaired participants with mild to moderate hearing loss participated in the study. Hearing abilities, listening effort and resting state functional connectivity were assessed. Our results indicate no differences in functional connectivity between hearing-impaired and normal-hearing participants. Increased listening effort, however, was related to significantly decreased functional connectivity between the dorsal attention network and the precuneus and superior parietal lobule, as well as between the auditory and the inferior frontal cortex. We conclude that even mild to moderate age-related hearing loss can impact resting state functional connectivity. It is, however, not the hearing loss itself but the individually perceived listening effort that relates to functional connectivity changes.
Collapse
Affiliation(s)
- Stephanie Rosemann
- Biological Psychology, Department of Psychology, School of Medicine and Health Sciences, Carl von Ossietzky Universität Oldenburg, Oldenburg, Germany; Cluster of Excellence "Hearing4all", Carl von Ossietzky Universität Oldenburg, Oldenburg, Germany
| | - Christiane M Thiel
- Biological Psychology, Department of Psychology, School of Medicine and Health Sciences, Carl von Ossietzky Universität Oldenburg, Oldenburg, Germany; Cluster of Excellence "Hearing4all", Carl von Ossietzky Universität Oldenburg, Oldenburg, Germany
| |
Collapse
|
28
|
Luthra S, Guediche S, Blumstein SE, Myers EB. Neural substrates of subphonemic variation and lexical competition in spoken word recognition. LANGUAGE, COGNITION AND NEUROSCIENCE 2019; 34:151-169. [PMID: 31106225 PMCID: PMC6516505 DOI: 10.1080/23273798.2018.1531140] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/09/2023]
Abstract
In spoken word recognition, subphonemic variation influences lexical activation, with sounds near a category boundary increasing phonetic competition as well as lexical competition. The current study investigated the interplay of these factors using a visual world task in which participants were instructed to look at a picture of an auditory target (e.g., peacock). Eyetracking data indicated that participants were slowed when a voiced onset competitor (e.g., beaker) was also displayed, and this effect was amplified when acoustic-phonetic competition was increased. Simultaneously collected fMRI data showed that several brain regions were sensitive to the presence of the onset competitor, including the supramarginal, middle temporal, and inferior frontal gyri, and functional connectivity analyses revealed that the coordinated activity of left frontal regions depends on both acoustic-phonetic and lexical factors. Taken together, the results suggest a role for frontal brain structures in resolving lexical competition, particularly as atypical acoustic-phonetic information maps onto the lexicon.
Collapse
Affiliation(s)
- Sahil Luthra
- Department of Psychological Sciences, University of Connecticut 406 Babbidge Road, Unit 1020, Storrs, CT, USA 06269
| | - Sara Guediche
- BCBL. Basque Center on Cognition, Brain and Language Mikeletegi Pasealekua, 69, 20009 Donostia, Gipuzkoa, Spain
| | - Sheila E Blumstein
- Department of Cognitive, Linguistic & Psychological Sciences, Brown University 190 Thayer Street, Providence, RI, USA 02912
- Brown Institute for Brain Science, Brown University 2 Stimson Ave, Providence, RI, USA 02912
| | - Emily B Myers
- Department of Psychological Sciences, University of Connecticut 406 Babbidge Road, Unit 1020, Storrs, CT, USA 06269
- Department of Speech, Language & Hearing Sciences, University of Connecticut 850 Bolton Road, Unit 1085, Storrs, CT, USA 06269
- Haskins Laboratories 300 George Street, Suite 900, New Haven, CT, USA 06511
| |
Collapse
|
29
|
|
30
|
Chu R, Meltzer JA, Bitan T. Interhemispheric interactions during sentence comprehension in patients with aphasia. Cortex 2018; 109:74-91. [PMID: 30312780 DOI: 10.1016/j.cortex.2018.08.022] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/05/2017] [Revised: 05/03/2018] [Accepted: 08/28/2018] [Indexed: 02/06/2023]
Abstract
Right-hemisphere involvement in language processing following left-hemisphere damage may reflect either compensatory processes, or a release from homotopic transcallosal inhibition, resulting in excessive right-to-left suppression that is maladaptive for language performance. Using fMRI, we assessed inter-hemispheric effective connectivity in fifteen patients with post-stroke aphasia, along with age-matched and younger controls, during a sentence comprehension task. Dynamic Causal Modeling was used with four bilateral regions including the inferior frontal gyri (IFG) and primary auditory cortices (A1). Despite the presence of lesions, satisfactory model fit was obtained in 9/15 patients. In young controls, the only significant homotopic connection (RA1-LA1) was excitatory, while inhibitory connections emanated from LIFG to both left and right A1's. Interestingly, these connections were also correlated with language comprehension scores in patients. The results for homotopic connections show that excitatory connectivity from RA1 to LA1 and inhibitory connectivity from LA1 to RA1 are associated with general auditory verbal comprehension. Moreover, negative correlations were found between sentence comprehension and top-down coupling for both heterotopic (LIFG-to-RA1) and intra-hemispheric (LIFG-to-LA1) connections. These results do not show an emergence of a new compensatory right-to-left excitation in patients, nor do they support the existence of left-to-right transcallosal suppression in controls. Nevertheless, the correlations with performance in patients are consistent with some aspects of both the compensation model and the transcallosal suppression account of the role of the RH. Altogether, our results suggest that changes to both excitatory and inhibitory homotopic and heterotopic connections due to LH damage may be maladaptive, as they disrupt normal inter-hemispheric coordination and communication.
Collapse
Affiliation(s)
- Ronald Chu
- Baycrest Health Sciences, Rotman Research Institute, Toronto, ON, Canada; University of Toronto, Department of Psychology, Toronto, ON, Canada.
| | - Jed A Meltzer
- Baycrest Health Sciences, Rotman Research Institute, Toronto, ON, Canada; University of Toronto, Department of Psychology, Toronto, ON, Canada; University of Toronto, Department of Speech-Language Pathology, Toronto, ON, Canada; Canadian Partnership for Stroke Recovery, Ottawa, ON, Canada
| | - Tali Bitan
- University of Toronto, Department of Speech-Language Pathology, Toronto, ON, Canada; University of Haifa, Department of Psychology and IIPDM, Haifa, Israel
| |
Collapse
|
31
|
Wilson SM, Yen M, Eriksson DK. An adaptive semantic matching paradigm for reliable and valid language mapping in individuals with aphasia. Hum Brain Mapp 2018; 39:3285-3307. [PMID: 29665223 PMCID: PMC6045968 DOI: 10.1002/hbm.24077] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/01/2018] [Revised: 03/26/2018] [Accepted: 03/30/2018] [Indexed: 11/08/2022] Open
Abstract
Research on neuroplasticity in recovery from aphasia depends on the ability to identify language areas of the brain in individuals with aphasia. However, tasks commonly used to engage language processing in people with aphasia, such as narrative comprehension and picture naming, are limited in terms of reliability (test-retest reproducibility) and validity (identification of language regions, and not other regions). On the other hand, paradigms such as semantic decision that are effective in identifying language regions in people without aphasia can be prohibitively challenging for people with aphasia. This paper describes a new semantic matching paradigm that uses an adaptive staircase procedure to present individuals with stimuli that are challenging yet within their competence, so that language processing can be fully engaged in people with and without language impairments. The feasibility, reliability and validity of the adaptive semantic matching paradigm were investigated in sixteen individuals with chronic post-stroke aphasia and fourteen neurologically normal participants, in comparison to narrative comprehension and picture naming paradigms. All participants succeeded in learning and performing the semantic paradigm. Test-retest reproducibility of the semantic paradigm in people with aphasia was good (Dice coefficient = 0.66), and was superior to the other two paradigms. The semantic paradigm revealed known features of typical language organization (lateralization; frontal and temporal regions) more consistently in neurologically normal individuals than the other two paradigms, constituting evidence for validity. In sum, the adaptive semantic matching paradigm is a feasible, reliable and valid method for mapping language regions in people with aphasia.
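The test-retest reproducibility reported above is quantified with the Dice coefficient, the overlap between the sets of voxels activated in two scanning sessions. A minimal sketch of the computation follows; the voxel coordinates are hypothetical and purely illustrative, not data from the study:

```python
def dice(a: set, b: set) -> float:
    """Dice coefficient: 2|A ∩ B| / (|A| + |B|), from 0 (no overlap) to 1."""
    if not a and not b:
        return 1.0  # convention: two empty maps overlap perfectly
    return 2 * len(a & b) / (len(a) + len(b))

# Hypothetical suprathreshold voxel sets from two sessions of one participant
session1 = {(10, 4, 7), (10, 5, 7), (11, 4, 7), (11, 5, 7)}
session2 = {(10, 4, 7), (10, 5, 7), (11, 5, 7), (12, 5, 7)}
print(dice(session1, session2))  # 2*3 / (4+4) = 0.75
```

A Dice coefficient of 0.66, as reported for the adaptive semantic paradigm, thus means roughly two-thirds overlap between the activation maps of the two sessions.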
Collapse
Affiliation(s)
- Stephen M. Wilson
- Department of Hearing and Speech Sciences, Vanderbilt University Medical Center, Nashville, Tennessee
| | - Melodie Yen
- Department of Hearing and Speech Sciences, Vanderbilt University Medical Center, Nashville, Tennessee
| | - Dana K. Eriksson
- Department of Speech, Language, and Hearing Sciences, University of Arizona, Tucson, Arizona
| |
Collapse
|
32
|
Abstract
OBJECTIVES It is well known from previous research that when listeners are told what they are about to hear before a degraded or partially masked auditory signal is presented, the speech signal "pops out" of the background and becomes considerably more intelligible. The goal of this research was to explore whether this priming effect is as strong in older adults as in younger adults. DESIGN Fifty-six adults (28 older and 28 younger) listened to "nonsense" sentences spoken by a female talker in the presence of a two-talker speech masker (also female) or a fluctuating speech-like noise masker at 5 signal-to-noise ratios. Just before, or just after, the auditory signal was presented, a typed caption was displayed on a computer screen. The caption sentence was either identical to the auditory sentence or differed by one key word. The subjects' task was to decide whether the caption and auditory messages were the same or different. Discrimination performance was reported in d'. The strength of the pop-out perception was inferred from the improvement in performance that was expected from the caption-before order of presentation. A subset of 12 subjects from each group made confidence judgments as they gave their responses, and also completed several cognitive tests. RESULTS Data showed a clear order effect for both subject groups and both maskers, with better same-different discrimination performance for the caption-before condition than the caption-after condition. However, for the two-talker masker, the younger adults obtained a larger and more consistent benefit from the caption-before order than the older adults across signal-to-noise ratios. Especially at the poorer signal-to-noise ratios, older subjects showed little evidence that they experienced the pop-out effect that is presumed to make the discrimination task easier.
On average, older subjects also appeared to approach the task differently, being more reluctant than younger subjects to report that the captions and auditory sentences were the same. Correlation analyses indicated a significant negative association between age and priming benefit in the two-talker masker and nonsignificant associations between priming benefit in this masker and either high-frequency hearing loss or performance on the cognitive tasks. CONCLUSIONS Previous studies have shown that older adults are at least as good, if not better, at exploiting context in speech recognition, as compared with younger adults. The current results are not in disagreement with those findings but suggest that, under some conditions, the automatic priming process that may contribute to benefits from context is not as strong in older as in younger adults.
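The d' reported in this same-different paradigm is the signal-detection sensitivity index: the difference between the z-transformed hit rate and false-alarm rate. A minimal sketch using only Python's standard library follows; the trial counts are hypothetical, and the log-linear correction shown is one common convention for avoiding infinite z-scores, not necessarily the one used in the study:

```python
from statistics import NormalDist

def d_prime(hits: int, misses: int,
            false_alarms: int, correct_rejections: int) -> float:
    """Sensitivity index d' = z(hit rate) - z(false-alarm rate).

    Adding 0.5 to each cell (log-linear correction) keeps the
    z-transform finite when a rate would otherwise be 0 or 1.
    """
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)

# Hypothetical example: 40 hits / 10 misses, 15 false alarms / 35 correct rejections
print(round(d_prime(40, 10, 15, 35), 2))
```

A d' of 0 means chance-level discrimination of caption-sentence mismatches; larger values mean the key-word change was detected more reliably.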
Collapse
|
33
|
Helfer KS, Freyman RL, Merchant GR. How repetition influences speech understanding by younger, middle-aged and older adults. Int J Audiol 2018; 57:695-702. [PMID: 29801416 DOI: 10.1080/14992027.2018.1475756] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/16/2022]
Abstract
OBJECTIVE To examine benefit from immediate repetition of a masked speech message in younger, middle-aged and older adults. DESIGN Participants listened to sentences in conditions where only the target message was repeated, and when both the target message and its accompanying masker (noise or speech) were repeated. In a follow-up experiment, the effect of repetition was evaluated using a square-wave modulated noise masker to compare benefit when listeners were exposed to the same glimpses of the target message during first and second presentation versus when the glimpses differed. STUDY SAMPLE Younger, middle-aged and older adults (n = 16/group) for the main experiment; 15 younger adults for the follow-up experiment. RESULTS Repetition benefit was larger when the target but not the masker was repeated for all groups. This was especially true for older adults, suggesting that these individuals may be more negatively affected when a background message is repeated. Data obtained using noise maskers suggest that it is slightly more beneficial when listeners hear different (versus identical) portions of speech between initial presentation and repetition. CONCLUSIONS Although subtle age-related differences were found in some conditions, results confirm that repetition is an effective repair strategy for listeners spanning the adult age range.
Collapse
Affiliation(s)
- Karen S Helfer
- Department of Communication Disorders, University of Massachusetts Amherst, Amherst, MA, USA
| | - Richard L Freyman
- Department of Communication Disorders, University of Massachusetts Amherst, Amherst, MA, USA
| | - Gabrielle R Merchant
- Department of Communication Disorders, University of Massachusetts Amherst, Amherst, MA, USA
| |
Collapse
|
34
|
Zheng Y, Wu C, Li J, Li R, Peng H, She S, Ning Y, Li L. Schizophrenia alters intra-network functional connectivity in the caudate for detecting speech under informational speech masking conditions. BMC Psychiatry 2018; 18:90. [PMID: 29618332 PMCID: PMC5885301 DOI: 10.1186/s12888-018-1675-1] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 08/10/2017] [Accepted: 03/26/2018] [Indexed: 01/17/2023] Open
Abstract
BACKGROUND Speech recognition under noisy "cocktail-party" conditions involves multiple perceptual/cognitive processes, including target detection, selective attention, irrelevant signal inhibition, sensory/working memory, and speech production. Compared to healthy listeners, people with schizophrenia are more vulnerable to masking stimuli and perform worse in speech recognition under speech-on-speech masking conditions. Although the schizophrenia-related speech-recognition impairment under "cocktail-party" conditions is associated with deficits of various perceptual/cognitive processes, it is crucial to know whether the brain substrates critically underlying speech detection against informational speech masking are impaired in people with schizophrenia. METHODS Using functional magnetic resonance imaging (fMRI), this study investigated differences between people with schizophrenia (n = 19, mean age = 33 ± 10 years) and their matched healthy controls (n = 15, mean age = 30 ± 9 years) in intra-network functional connectivity (FC) specifically associated with target-speech detection under speech-on-speech-masking conditions. RESULTS The target-speech detection performance under the speech-on-speech-masking condition in participants with schizophrenia was significantly worse than that in matched healthy participants (healthy controls). Moreover, in healthy controls, but not participants with schizophrenia, the strength of intra-network FC within the bilateral caudate was positively correlated with speech-detection performance under the speech-masking conditions. Compared to controls, patients showed an altered spatial activity pattern and decreased intra-network FC in the caudate.
CONCLUSIONS In people with schizophrenia, the declined speech-detection performance under speech-on-speech masking conditions is associated with reduced intra-caudate functional connectivity, which normally contributes to detecting target speech against speech masking via its functions of suppressing masking-speech signals.
Collapse
Affiliation(s)
- Yingjun Zheng
- The Affiliated Brain Hospital of Guangzhou Medical University (Guangzhou Huiai Hospital), Guangzhou, 510370, China
| | - Chao Wu
- Faculty of Psychology, Beijing Normal University, Beijing, 100875, China
| | - Juanhua Li
- The Affiliated Brain Hospital of Guangzhou Medical University (Guangzhou Huiai Hospital), Guangzhou, 510370, China
| | - Ruikeng Li
- The Affiliated Brain Hospital of Guangzhou Medical University (Guangzhou Huiai Hospital), Guangzhou, 510370, China
| | - Hongjun Peng
- The Affiliated Brain Hospital of Guangzhou Medical University (Guangzhou Huiai Hospital), Guangzhou, 510370, China
| | - Shenglin She
- The Affiliated Brain Hospital of Guangzhou Medical University (Guangzhou Huiai Hospital), Guangzhou, 510370, China
| | - Yuping Ning
- The Affiliated Brain Hospital of Guangzhou Medical University (Guangzhou Huiai Hospital), Guangzhou, 510370, China
| | - Liang Li
- School of Psychological and Cognitive Sciences, Beijing Key Laboratory of Behavior and Mental Health, Key Laboratory on Machine Perception (Ministry of Education), Peking University, 5 Yiheyuan Road, Beijing, 100080, People's Republic of China; Beijing Institute for Brain Disorder, Capital Medical University, Beijing, China
| |
Collapse
|
35
|
Alain C, Du Y, Bernstein LJ, Barten T, Banai K. Listening under difficult conditions: An activation likelihood estimation meta-analysis. Hum Brain Mapp 2018. [PMID: 29536592 DOI: 10.1002/hbm.24031] [Citation(s) in RCA: 73] [Impact Index Per Article: 12.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/19/2023] Open
Abstract
The brain networks supporting speech identification and comprehension under difficult listening conditions are not well specified. The networks hypothesized to underlie effortful listening include regions responsible for executive control. We conducted meta-analyses of auditory neuroimaging studies to determine whether a common activation pattern of the frontal lobe supports effortful listening under different speech manipulations. Fifty-three functional neuroimaging studies investigating speech perception were divided into three independent Activation Likelihood Estimation (ALE) analyses based on the type of speech manipulation paradigm used: speech-in-noise (SIN; 16 studies, involving 224 participants); spectrally degraded speech using filtering techniques (15 studies, involving 270 participants); and linguistic complexity (i.e., levels of syntactic, lexical and semantic intricacy/density; 22 studies, involving 348 participants). Meta-analysis of the SIN studies revealed that higher effort was associated with activation in the left inferior frontal gyrus (IFG), left inferior parietal lobule, and right insula. Studies using spectrally degraded speech demonstrated increased activation of the insula bilaterally and the left superior temporal gyrus (STG). Studies manipulating linguistic complexity showed activation in the left IFG, right middle frontal gyrus, left middle temporal gyrus and bilateral STG. Planned contrasts revealed left IFG activation in linguistic complexity studies, which differed from the activation patterns observed in SIN or spectral degradation studies. Although there was no significant overlap in prefrontal activation across these three speech manipulation paradigms, SIN and spectral degradation studies showed overlapping regions in the left and right insula. These findings provide evidence that there is regional specialization within the left IFG and that differential executive networks underlie effortful listening.
Collapse
Affiliation(s)
- Claude Alain
- Rotman Research Institute, Baycrest Health Centre, Toronto, Ontario, Canada; Department of Psychology, University of Toronto, Toronto, Ontario, Canada
| | - Yi Du
- CAS Key Laboratory of Behavioral Science, Institute of Psychology, Chinese Academy of Sciences, Beijing, China
| | - Lori J Bernstein
- Department of Supportive Care, Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada; Department of Psychiatry, University of Toronto, Toronto, Ontario, Canada
| | - Thijs Barten
- Rotman Research Institute, Baycrest Health Centre, Toronto, Ontario, Canada
| | - Karen Banai
- Department of Communication Sciences and Disorders, University of Haifa, Haifa, Israel
| |
Collapse
|
36
|
Di Liberto GM, Lalor EC, Millman RE. Causal cortical dynamics of a predictive enhancement of speech intelligibility. Neuroimage 2018; 166:247-258. [DOI: 10.1016/j.neuroimage.2017.10.066] [Citation(s) in RCA: 60] [Impact Index Per Article: 10.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/06/2017] [Revised: 10/04/2017] [Accepted: 10/30/2017] [Indexed: 11/28/2022] Open
|
37
|
Rowland SC, Hartley DEH, Wiggins IM. Listening in Naturalistic Scenes: What Can Functional Near-Infrared Spectroscopy and Intersubject Correlation Analysis Tell Us About the Underlying Brain Activity? Trends Hear 2018; 22:2331216518804116. [PMID: 30345888 PMCID: PMC6198387 DOI: 10.1177/2331216518804116] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/07/2018] [Revised: 08/17/2018] [Accepted: 09/06/2018] [Indexed: 12/24/2022] Open
Abstract
Listening to speech in the noisy conditions of everyday life can be effortful, reflecting the increased cognitive workload involved in extracting meaning from a degraded acoustic signal. Studying the underlying neural processes has the potential to provide mechanistic insight into why listening is effortful under certain conditions. In a move toward studying listening effort under ecologically relevant conditions, we used the silent and flexible neuroimaging technique functional near-infrared spectroscopy (fNIRS) to examine brain activity during attentive listening to speech in naturalistic scenes. Thirty normally hearing participants listened to a series of narratives continuously varying in acoustic difficulty while undergoing fNIRS imaging. Participants then listened to another set of closely matched narratives and rated perceived effort and intelligibility for each scene. As expected, self-reported effort generally increased with worsening signal-to-noise ratio. After controlling for better-ear signal-to-noise ratio, perceived effort was greater in scenes that contained competing speech than in those that did not, potentially reflecting an additional cognitive cost of overcoming informational masking. We analyzed the fNIRS data using intersubject correlation, a data-driven approach suitable for analyzing data collected under naturalistic conditions. Significant intersubject correlation was seen in the bilateral auditory cortices and in a range of channels across the prefrontal cortex. The involvement of prefrontal regions is consistent with the notion that higher order cognitive processes are engaged during attentive listening to speech in complex real-world conditions. However, further research is needed to elucidate the relationship between perceived listening effort and activity in these extended cortical networks.
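The intersubject correlation analysis described in this abstract can be sketched in a few lines. This is an illustrative leave-one-out variant for a single measurement channel (correlating each subject against the average of the others), not the authors' actual fNIRS pipeline:

```python
import numpy as np

def leave_one_out_isc(data):
    """Leave-one-out intersubject correlation for one channel.

    data: array of shape (n_subjects, n_timepoints). For each subject,
    returns the Pearson correlation between that subject's time series
    and the average time series of all remaining subjects.
    """
    isc = np.empty(data.shape[0])
    for s in range(data.shape[0]):
        others = np.delete(data, s, axis=0).mean(axis=0)
        isc[s] = np.corrcoef(data[s], others)[0, 1]
    return isc

# Toy check: subjects driven by a shared stimulus signal plus
# individual noise should show high ISC values.
rng = np.random.default_rng(0)
shared = rng.standard_normal(500)
coupled = shared + 0.3 * rng.standard_normal((10, 500))
mean_isc = leave_one_out_isc(coupled).mean()  # high: shared signal dominates
```

Because ISC only asks whether responses are consistent across listeners hearing the same stimulus, it needs no explicit model of the stimulus, which is what makes it suitable for naturalistic scenes.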
Affiliation(s)
- Stephen C. Rowland
- National Institute for Health Research Nottingham Biomedical Research Centre, UK
- Hearing Sciences, Division of Clinical Neuroscience, School of Medicine, University of Nottingham, UK
| | - Douglas E. H. Hartley
- National Institute for Health Research Nottingham Biomedical Research Centre, UK
- Hearing Sciences, Division of Clinical Neuroscience, School of Medicine, University of Nottingham, UK
- Medical Research Council Institute of Hearing Research, School of Medicine, University of Nottingham, UK
- Nottingham University Hospitals NHS Trust, Queens Medical Centre, UK
| | - Ian M. Wiggins
- National Institute for Health Research Nottingham Biomedical Research Centre, UK
- Hearing Sciences, Division of Clinical Neuroscience, School of Medicine, University of Nottingham, UK
- Medical Research Council Institute of Hearing Research, School of Medicine, University of Nottingham, UK
| |
|
38
|
Xie X, Myers E. Left Inferior Frontal Gyrus Sensitivity to Phonetic Competition in Receptive Language Processing: A Comparison of Clear and Conversational Speech. J Cogn Neurosci 2017; 30:267-280. [PMID: 29160743 DOI: 10.1162/jocn_a_01208] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/20/2022]
Abstract
The speech signal is rife with variations in phonetic ambiguity. For instance, when talkers speak in a conversational register, they demonstrate less articulatory precision, leading to greater potential for confusability at the phonetic level compared with a clear speech register. Current psycholinguistic models assume that ambiguous speech sounds activate more than one phonological category and that competition at prelexical levels cascades to lexical levels of processing. Imaging studies have shown that the left inferior frontal gyrus (LIFG) is modulated by phonetic competition between simultaneously activated categories, with increases in activation for more ambiguous tokens. Yet, these studies have often used artificially manipulated speech and/or metalinguistic tasks, which arguably may recruit neural regions that are not critical for natural speech recognition. Indeed, a prominent model of speech processing, the dual-stream model, posits that the LIFG is not involved in prelexical processing in receptive language processing. In the current study, we exploited natural variation in phonetic competition in the speech signal to investigate the neural systems sensitive to phonetic competition as listeners engage in a receptive language task. Participants heard nonsense sentences spoken in either a clear or conversational register as neural activity was monitored using fMRI. Conversational sentences contained greater phonetic competition, as estimated by measures of vowel confusability, and these sentences also elicited greater activation in a region in the LIFG. Sentence-level phonetic competition metrics uniquely correlated with LIFG activity as well. This finding is consistent with the hypothesis that the LIFG responds to competition at multiple levels of language processing and that recruitment of this region does not require an explicit phonological judgment.
|
39
|
Hakonen M, May PJC, Jääskeläinen IP, Jokinen E, Sams M, Tiitinen H. Predictive processing increases intelligibility of acoustically distorted speech: Behavioral and neural correlates. Brain Behav 2017; 7:e00789. [PMID: 28948083 PMCID: PMC5607552 DOI: 10.1002/brb3.789] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 03/16/2017] [Revised: 06/10/2017] [Accepted: 06/26/2017] [Indexed: 11/08/2022] Open
Abstract
INTRODUCTION We examined which brain areas are involved in the comprehension of acoustically distorted speech using an experimental paradigm in which the same distorted sentence can be perceived at different levels of intelligibility. This change in intelligibility occurs via a single intervening presentation of the intact version of the sentence, and the effect lasts at least on the order of minutes. Since the acoustic structure of the distorted stimulus is kept fixed and only intelligibility is varied, this allows one to study brain activity related to speech comprehension specifically. METHODS In a functional magnetic resonance imaging (fMRI) experiment, a stimulus set contained a block of six distorted sentences. This was followed by the intact counterparts of the sentences, after which the sentences were presented in distorted form again. A total of 18 such sets were presented to 20 human subjects. RESULTS The blood oxygenation level-dependent (BOLD) responses elicited by the distorted sentences that came after the disambiguating, intact sentences were contrasted with the responses to the sentences presented before disambiguation. This revealed increased activity in the bilateral frontal pole, the dorsal anterior cingulate/paracingulate cortex, and the right frontal operculum. Decreased BOLD responses were observed in the posterior insula, Heschl's gyrus, and the posterior superior temporal sulcus. CONCLUSIONS The brain areas that showed BOLD enhancement for increased sentence comprehension have been associated with executive functions and with the mapping of incoming sensory information to representations stored in episodic memory. Thus, the comprehension of acoustically distorted speech may be associated with the engagement of memory-related subsystems. Further, activity in the primary auditory cortex was modulated by prior experience, possibly in a predictive coding framework.
Our results suggest that memory biases the perception of ambiguous sensory information toward interpretations that have the highest probability to be correct based on previous experience.
Affiliation(s)
- Maria Hakonen
- Brain and Mind Laboratory, Department of Neuroscience and Biomedical Engineering (NBE), School of Science, Aalto University, Aalto, Finland
- Department of Physiology, Faculty of Medicine, University of Helsinki, Helsinki, Finland
| | - Patrick J. C. May
- Medical Research Council Institute of Hearing Research, School of Medicine, The University of Nottingham, Nottingham, UK
- Special Laboratory Non-Invasive Brain Imaging, Leibniz Institute for Neurobiology, Magdeburg, Germany
| | - Iiro P. Jääskeläinen
- Brain and Mind Laboratory, Department of Neuroscience and Biomedical Engineering (NBE), School of Science, Aalto University, Aalto, Finland
| | - Emma Jokinen
- Department of Signal Processing and Acoustics, School of Electrical Engineering, Aalto University, Aalto, Finland
| | - Mikko Sams
- Brain and Mind Laboratory, Department of Neuroscience and Biomedical Engineering (NBE), School of Science, Aalto University, Aalto, Finland
| | | |
|
40
|
Li J, Wu C, Zheng Y, Li R, Li X, She S, Wu H, Peng H, Ning Y, Li L. Schizophrenia affects speech-induced functional connectivity of the superior temporal gyrus under cocktail-party listening conditions. Neuroscience 2017; 359:248-257. [PMID: 28673720 DOI: 10.1016/j.neuroscience.2017.06.043] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/14/2017] [Revised: 06/02/2017] [Accepted: 06/22/2017] [Indexed: 12/31/2022]
Abstract
The superior temporal gyrus (STG) is involved in speech recognition against informational masking under cocktail-party listening conditions. Compared to healthy listeners, people with schizophrenia perform worse in speech recognition under informational speech-on-speech masking conditions. It is not clear whether this schizophrenia-related vulnerability to informational masking is associated with changes in the functional connectivity (FC) of the STG with certain critical brain regions. Using a sparse-sampling fMRI design, this study investigated the differences between people with schizophrenia and healthy controls in FC of the STG during target-speech listening against informational speech-on-speech masking, under listening conditions with either perceived spatial separation (PSS, with a spatial release of informational masking) or perceived spatial co-location (PSC, without the spatial release) between target speech and masking speech. The results showed that in healthy participants, but not participants with schizophrenia, the contrast of either the PSS or PSC condition against the masker-only condition induced an enhancement of FC of the STG with the left superior parietal lobule (SPL) and the right precuneus. Compared to healthy participants, participants with schizophrenia showed reduced FC of the STG with the bilateral precuneus, right SPL, and right supplementary motor area. Thus, FC of the STG with the parietal areas is normally involved in speech listening against informational masking under either the PSS or PSC condition, and the reduced FC of the STG with the parietal areas in people with schizophrenia may be associated with their increased vulnerability to informational masking.
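Several entries in this list, including this one, rest on seed-based functional connectivity: correlating the time series of a seed region (here, the STG) with the time series of other voxels. A minimal, hypothetical sketch of that core step (plain Pearson correlation, ignoring the condition-dependent contrasts these studies actually model):

```python
import numpy as np

def seed_connectivity(seed_ts, voxel_ts):
    """Pearson correlation between a seed time series and each voxel.

    seed_ts: shape (n_timepoints,); voxel_ts: shape (n_voxels, n_timepoints).
    Returns one correlation coefficient per voxel.
    """
    s = (seed_ts - seed_ts.mean()) / seed_ts.std()
    v = (voxel_ts - voxel_ts.mean(axis=1, keepdims=True)) \
        / voxel_ts.std(axis=1, keepdims=True)
    # Mean of products of z-scored series = Pearson r
    return (v @ s) / seed_ts.size

# Toy data: voxel 0 copies the seed, voxel 1 is unrelated noise.
rng = np.random.default_rng(0)
seed = rng.standard_normal(300)
voxels = np.vstack([seed, rng.standard_normal(300)])
r = seed_connectivity(seed, voxels)
```

Group differences in FC, as reported above, would then be tested on such (Fisher-transformed) correlation maps across participants.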
Affiliation(s)
- Juanhua Li
- The Affiliated Brain Hospital of Guangzhou Medical University (Guangzhou Huiai Hospital), Guangzhou 510370, China
| | - Chao Wu
- School of Psychological and Cognitive Sciences, Beijing Key Laboratory of Behavior and Mental Health, Key Laboratory on Machine Perception (Ministry of Education), Peking University, Beijing 100080, China; School of Psychology, Beijing Normal University, Beijing 100875, China
| | - Yingjun Zheng
- The Affiliated Brain Hospital of Guangzhou Medical University (Guangzhou Huiai Hospital), Guangzhou 510370, China
| | - Ruikeng Li
- The Affiliated Brain Hospital of Guangzhou Medical University (Guangzhou Huiai Hospital), Guangzhou 510370, China
| | - Xuanzi Li
- The Affiliated Brain Hospital of Guangzhou Medical University (Guangzhou Huiai Hospital), Guangzhou 510370, China
| | - Shenglin She
- The Affiliated Brain Hospital of Guangzhou Medical University (Guangzhou Huiai Hospital), Guangzhou 510370, China
| | - Haibo Wu
- The Affiliated Brain Hospital of Guangzhou Medical University (Guangzhou Huiai Hospital), Guangzhou 510370, China
| | - Hongjun Peng
- The Affiliated Brain Hospital of Guangzhou Medical University (Guangzhou Huiai Hospital), Guangzhou 510370, China
| | - Yuping Ning
- The Affiliated Brain Hospital of Guangzhou Medical University (Guangzhou Huiai Hospital), Guangzhou 510370, China
| | - Liang Li
- The Affiliated Brain Hospital of Guangzhou Medical University (Guangzhou Huiai Hospital), Guangzhou 510370, China; School of Psychological and Cognitive Sciences, Beijing Key Laboratory of Behavior and Mental Health, Key Laboratory on Machine Perception (Ministry of Education), Peking University, Beijing 100080, China; Beijing Institute for Brain Disorder, Capital Medical University, Beijing, China.
| |
|
41
|
Wild CJ, Linke AC, Zubiaurre-Elorza L, Herzmann C, Duffy H, Han VK, Lee DSC, Cusack R. Adult-like processing of naturalistic sounds in auditory cortex by 3- and 9-month old infants. Neuroimage 2017. [PMID: 28648887 DOI: 10.1016/j.neuroimage.2017.06.038] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/22/2022] Open
Abstract
Functional neuroimaging has been used to show that the developing auditory cortex of very young human infants responds, in some way, to sound. However, impoverished stimuli and uncontrolled designs have made it difficult to attribute brain responses to specific auditory features, and thus to assess the maturity of feature tuning in auditory cortex. To address this, we used functional magnetic resonance imaging (fMRI) to measure the brain activity evoked by naturalistic sounds (a series of sung lullabies) in two groups of infants (3 and 9 months) and adults. We developed a novel analysis method - inter-subject regression (ISR) - to quantify the similarity of cortical responses between infants and adults, and to decompose components of the response due to different auditory features. We found that the temporal pattern of activity in infant auditory cortex shared similarity with that of adults. Some of this shared response could be attributed to simple acoustic features, such as frequency, pitch, and envelope, but other parts could not, suggesting that even more complex, adult-like features are represented in auditory cortex in early infancy.
Affiliation(s)
- Conor J Wild
- Brain and Mind Institute, Western University, London, Canada.
| | - Annika C Linke
- Brain and Mind Institute, Western University, London, Canada
| | | | | | - Hester Duffy
- Brain and Mind Institute, Western University, London, Canada
| | - Victor K Han
- Children's Health Research Institute, London, Canada
| | - David S C Lee
- Children's Health Research Institute, London, Canada
| | - Rhodri Cusack
- Brain and Mind Institute, Western University, London, Canada; Children's Health Research Institute, London, Canada; School of Psychology, Trinity College Dublin, Dublin, Ireland
| |
|
42
|
The Hierarchical Cortical Organization of Human Speech Processing. J Neurosci 2017; 37:6539-6557. [PMID: 28588065 DOI: 10.1523/jneurosci.3267-16.2017] [Citation(s) in RCA: 123] [Impact Index Per Article: 17.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/21/2016] [Revised: 05/22/2017] [Accepted: 05/25/2017] [Indexed: 12/13/2022] Open
Abstract
Speech comprehension requires that the brain extract semantic meaning from the spectral features represented at the cochlea. To investigate this process, we performed an fMRI experiment in which five men and two women passively listened to several hours of natural narrative speech. We then used voxelwise modeling to predict BOLD responses based on three different feature spaces that represent the spectral, articulatory, and semantic properties of speech. The amount of variance explained by each feature space was then assessed using a separate validation dataset. Because some responses might be explained equally well by more than one feature space, we used a variance partitioning analysis to determine the fraction of the variance that was uniquely explained by each feature space. Consistent with previous studies, we found that speech comprehension involves hierarchical representations starting in primary auditory areas and moving laterally on the temporal lobe: spectral features are found in the core of A1, mixtures of spectral and articulatory features in the STG, mixtures of articulatory and semantic features in the STS, and semantic features in the STS and beyond. Our data also show that both hemispheres are equally and actively involved in speech perception and interpretation. Further, responses as early in the auditory hierarchy as the STS are more correlated with semantic than spectral representations. These results illustrate the importance of using natural speech in neurolinguistic research.
Our methodology also provides an efficient way to simultaneously test multiple specific hypotheses about the representations of speech without using block designs and segmented or synthetic speech. SIGNIFICANCE STATEMENT: To investigate the processing steps performed by the human brain to transform natural speech sound into meaningful language, we used models based on a hierarchical set of speech features to predict BOLD responses of individual voxels recorded in an fMRI experiment while subjects listened to natural speech. Both cerebral hemispheres were actively and equally involved in speech processing. Also, the transformation from spectral features to semantic elements occurs early in the cortical speech-processing stream. Our experimental and analytical approaches are important alternatives and complements to standard approaches that use segmented speech and block designs, which report more lateralized speech processing and attribute semantic processing to higher levels of cortex than reported here.
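The variance partitioning step this abstract describes can be illustrated with a toy example: the variance uniquely explained by a feature space is the R² of the full model minus the R² of a model omitting that space. This is a simplified sketch using ordinary least squares on synthetic data (feature spaces named A, B, C are hypothetical), not the authors' regularized voxelwise modeling pipeline with a held-out validation set:

```python
import numpy as np

def r_squared(X, y):
    """R^2 of an ordinary least-squares fit of y on X (with intercept)."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    return 1 - resid.var() / y.var()

def unique_variance(spaces, y):
    """Variance uniquely explained by each named feature space:
    R^2 of the full model minus R^2 of the model omitting that space."""
    names = list(spaces)
    full = r_squared(np.column_stack([spaces[n] for n in names]), y)
    return {n: full - r_squared(
                np.column_stack([spaces[m] for m in names if m != n]), y)
            for n in names}

# Toy voxel response that depends on feature spaces A and B but not C.
rng = np.random.default_rng(1)
A, B, C = (rng.standard_normal((200, 3)) for _ in range(3))
y = A @ [1.0, -2.0, 0.5] + B @ [0.8, 0.0, 1.2] + 0.1 * rng.standard_normal(200)
uv = unique_variance({"A": A, "B": B, "C": C}, y)
```

Here `uv["A"]` and `uv["B"]` come out large while `uv["C"]` is near zero, mirroring how the analysis distinguishes feature spaces whose predictions overlap.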
|
43
|
Wu C, Zheng Y, Li J, Wu H, She S, Liu S, Ning Y, Li L. Brain substrates underlying auditory speech priming in healthy listeners and listeners with schizophrenia. Psychol Med 2017; 47:837-852. [PMID: 27894376 DOI: 10.1017/s0033291716002816] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 12/19/2022]
Abstract
BACKGROUND Under 'cocktail party' listening conditions, healthy listeners and listeners with schizophrenia can use temporally pre-presented auditory speech-priming (ASP) stimuli to improve target-speech recognition, even though listeners with schizophrenia are more vulnerable to informational speech masking. METHOD Using functional magnetic resonance imaging, this study searched for the brain substrates underlying the unmasking effect of ASP in 16 healthy controls and 22 patients with schizophrenia, as well as the brain substrates underlying schizophrenia-related speech-recognition deficits under speech-masking conditions. RESULTS In both controls and patients, introducing the ASP condition (against the auditory non-speech-priming condition) not only activated the left superior temporal gyrus (STG) and left posterior middle temporal gyrus (pMTG), but also enhanced functional connectivity of the left STG/pMTG with the left caudate. It also enhanced functional connectivity of the left STG/pMTG with the left pars triangularis of the inferior frontal gyrus (TriIFG) in controls and with the left Rolandic operculum in patients. The strength of functional connectivity between the left STG and left TriIFG was correlated with target-speech recognition under the speech-masking condition in both controls and patients, but was reduced in patients. CONCLUSIONS The left STG/pMTG and their ASP-related functional connectivity with both the left caudate and some frontal regions (the left TriIFG in healthy listeners and the left Rolandic operculum in listeners with schizophrenia) are involved in the unmasking effect of ASP, possibly by facilitating the following processes: masker-signal inhibition, target-speech encoding, and speech production. The schizophrenia-related reduction of functional connectivity between the left STG and left TriIFG augments the vulnerability of speech recognition to speech masking.
Affiliation(s)
- C Wu
- School of Psychological and Cognitive Sciences, and Beijing Key Laboratory of Behavior and Mental Health, Key Laboratory on Machine Perception (Ministry of Education), Peking University, Beijing, People's Republic of China
| | - Y Zheng
- The Affiliated Brain Hospital of Guangzhou Medical University,Guangzhou,People's Republic of China
| | - J Li
- The Affiliated Brain Hospital of Guangzhou Medical University,Guangzhou,People's Republic of China
| | - H Wu
- The Affiliated Brain Hospital of Guangzhou Medical University,Guangzhou,People's Republic of China
| | - S She
- The Affiliated Brain Hospital of Guangzhou Medical University,Guangzhou,People's Republic of China
| | - S Liu
- The Affiliated Brain Hospital of Guangzhou Medical University,Guangzhou,People's Republic of China
| | - Y Ning
- The Affiliated Brain Hospital of Guangzhou Medical University,Guangzhou,People's Republic of China
| | - L Li
- School of Psychological and Cognitive Sciences, and Beijing Key Laboratory of Behavior and Mental Health, Key Laboratory on Machine Perception (Ministry of Education), Peking University, Beijing, People's Republic of China
| |
|
44
|
Wu C, Zheng Y, Li J, Zhang B, Li R, Wu H, She S, Liu S, Peng H, Ning Y, Li L. Activation and Functional Connectivity of the Left Inferior Temporal Gyrus during Visual Speech Priming in Healthy Listeners and Listeners with Schizophrenia. Front Neurosci 2017; 11:107. [PMID: 28360829 PMCID: PMC5350153 DOI: 10.3389/fnins.2017.00107] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/25/2016] [Accepted: 02/20/2017] [Indexed: 11/13/2022] Open
Abstract
Under a "cocktail-party" listening condition with multiple people talking, people with schizophrenia benefit less than healthy people from the use of visual-speech (lipreading) priming (VSP) cues to improve speech recognition. The neural mechanisms underlying the unmasking effect of VSP remain unknown. This study investigated the brain substrates underlying the unmasking effect of VSP in healthy listeners and the schizophrenia-induced changes in those brain substrates. Using functional magnetic resonance imaging, brain activation and functional connectivity for the contrasts of the VSP listening condition vs. the visual non-speech priming (VNSP) condition were examined in 16 healthy listeners (27.4 ± 8.6 years old, 9 females and 7 males) and 22 listeners with schizophrenia (29.0 ± 8.1 years old, 8 females and 14 males). The results showed that in healthy listeners, but not listeners with schizophrenia, the VSP-induced activation (against the VNSP condition) of the left posterior inferior temporal gyrus (pITG) was significantly correlated with the VSP-induced improvement in target-speech recognition against speech masking. Compared to healthy listeners, listeners with schizophrenia showed significantly lower VSP-induced activation of the left pITG and reduced functional connectivity of the left pITG with the bilateral Rolandic operculum, bilateral superior temporal gyrus (STG), and left insula. Thus, the left pITG and its functional connectivity may be the brain substrates related to the unmasking effect of VSP, presumably through enhancing both the processing of target visual-speech signals and the inhibition of masking-speech signals. In people with schizophrenia, the reduced unmasking effect of VSP on speech recognition may be associated with a schizophrenia-related reduction of VSP-induced activation and functional connectivity of the left pITG.
Affiliation(s)
- Chao Wu
- Beijing Key Laboratory of Behavior and Mental Health, Key Laboratory on Machine Perception, Ministry of Education, School of Psychological and Cognitive Sciences, Peking University, Beijing, China; School of Life Sciences, Peking University, Beijing, China; School of Psychology, Beijing Normal University, Beijing, China
| | - Yingjun Zheng
- The Affiliated Brain Hospital of Guangzhou Medical University (Guangzhou Huiai Hospital) Guangzhou, China
| | - Juanhua Li
- The Affiliated Brain Hospital of Guangzhou Medical University (Guangzhou Huiai Hospital) Guangzhou, China
| | - Bei Zhang
- The Affiliated Brain Hospital of Guangzhou Medical University (Guangzhou Huiai Hospital) Guangzhou, China
| | - Ruikeng Li
- The Affiliated Brain Hospital of Guangzhou Medical University (Guangzhou Huiai Hospital) Guangzhou, China
| | - Haibo Wu
- The Affiliated Brain Hospital of Guangzhou Medical University (Guangzhou Huiai Hospital) Guangzhou, China
| | - Shenglin She
- The Affiliated Brain Hospital of Guangzhou Medical University (Guangzhou Huiai Hospital) Guangzhou, China
| | - Sha Liu
- The Affiliated Brain Hospital of Guangzhou Medical University (Guangzhou Huiai Hospital) Guangzhou, China
| | - Hongjun Peng
- The Affiliated Brain Hospital of Guangzhou Medical University (Guangzhou Huiai Hospital) Guangzhou, China
| | - Yuping Ning
- The Affiliated Brain Hospital of Guangzhou Medical University (Guangzhou Huiai Hospital) Guangzhou, China
| | - Liang Li
- Beijing Key Laboratory of Behavior and Mental Health, Key Laboratory on Machine Perception, Ministry of Education, School of Psychological and Cognitive Sciences, Peking University, Beijing, China; The Affiliated Brain Hospital of Guangzhou Medical University (Guangzhou Huiai Hospital), Guangzhou, China; Beijing Institute for Brain Disorder, Capital Medical University, Beijing, China
| |
|
45
|
Ozker M, Schepers IM, Magnotti JF, Yoshor D, Beauchamp MS. A Double Dissociation between Anterior and Posterior Superior Temporal Gyrus for Processing Audiovisual Speech Demonstrated by Electrocorticography. J Cogn Neurosci 2017; 29:1044-1060. [PMID: 28253074 DOI: 10.1162/jocn_a_01110] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/04/2022]
Abstract
Human speech can be comprehended using only auditory information from the talker's voice. However, comprehension is improved if the talker's face is visible, especially if the auditory information is degraded as occurs in noisy environments or with hearing loss. We explored the neural substrates of audiovisual speech perception using electrocorticography, direct recording of neural activity using electrodes implanted on the cortical surface. We observed a double dissociation in the responses to audiovisual speech with clear and noisy auditory component within the superior temporal gyrus (STG), a region long known to be important for speech perception. Anterior STG showed greater neural activity to audiovisual speech with clear auditory component, whereas posterior STG showed similar or greater neural activity to audiovisual speech in which the speech was replaced with speech-like noise. A distinct border between the two response patterns was observed, demarcated by a landmark corresponding to the posterior margin of Heschl's gyrus. To further investigate the computational roles of both regions, we considered Bayesian models of multisensory integration, which predict that combining the independent sources of information available from different modalities should reduce variability in the neural responses. We tested this prediction by measuring the variability of the neural responses to single audiovisual words. Posterior STG showed smaller variability than anterior STG during presentation of audiovisual speech with noisy auditory component. Taken together, these results suggest that posterior STG but not anterior STG is important for multisensory integration of noisy auditory and visual speech.
Affiliation(s)
- Muge Ozker
- University of Texas Graduate School of Biomedical Sciences at Houston
- Baylor College of Medicine
| | | | | | | | | |
|
46
|
Leonard MK, Baud MO, Sjerps MJ, Chang EF. Perceptual restoration of masked speech in human cortex. Nat Commun 2016; 7:13619. [PMID: 27996973 PMCID: PMC5187421 DOI: 10.1038/ncomms13619] [Citation(s) in RCA: 80] [Impact Index Per Article: 10.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/18/2016] [Accepted: 10/19/2016] [Indexed: 02/02/2023] Open
Abstract
Humans are adept at understanding speech despite the fact that our natural listening environment is often filled with interference. An example of this capacity is phoneme restoration, in which part of a word is completely replaced by noise, yet listeners report hearing the whole word. The neurological basis for this unconscious fill-in phenomenon is unknown, despite being a fundamental characteristic of human hearing. Here, using direct cortical recordings in humans, we demonstrate that missing speech is restored at the acoustic-phonetic level in bilateral auditory cortex, in real time. This restoration is preceded by specific neural activity patterns in a separate language area, left frontal cortex, which predicts the word that participants later report hearing. These results demonstrate that during speech perception, missing acoustic content is synthesized online from the integration of incoming sensory cues and the internal neural dynamics that bias word-level expectation and prediction. We can often 'fill in' missing or occluded sounds from a speech signal, an effect known as phoneme restoration. Leonard et al. found a real-time restoration of the missing sounds in the superior temporal auditory cortex in humans. Interestingly, neural activity in frontal regions prior to the stimulus can predict the word that the participant would later hear.
Affiliation(s)
- Matthew K Leonard
- Department of Neurological Surgery, University of California, San Francisco, 675 Nelson Rising Lane, Room 535, San Francisco, California 94158, USA
- Center for Integrative Neuroscience, University of California, San Francisco, 675 Nelson Rising Lane, Room 535, San Francisco, California 94158, USA
| | - Maxime O Baud
- Department of Neurology, University of California, San Francisco, 675 Nelson Rising Lane, Room 535, San Francisco, California 94158, USA
| | - Matthias J Sjerps
- Department of Linguistics, University of California, Berkeley, 1203 Dwinelle Hall #2650, Berkeley, California 94720-2650, USA
- Neurobiology of Language Department, Donders Institute for Brain, Cognition and Behavior, Centre for Cognitive Neuroimaging, Radboud University, Kapittelweg 29, Nijmegen 6525 EN, The Netherlands
| | - Edward F Chang
- Department of Neurological Surgery, University of California, San Francisco, 675 Nelson Rising Lane, Room 535, San Francisco, California 94158, USA
- Center for Integrative Neuroscience, University of California, San Francisco, 675 Nelson Rising Lane, Room 535, San Francisco, California 94158, USA
- Department of Physiology, University of California, San Francisco, 675 Nelson Rising Lane, Room 535, San Francisco, California 94158, USA
| |
Collapse
|
47
|
Blank H, Davis MH. Prediction Errors but Not Sharpened Signals Simulate Multivoxel fMRI Patterns during Speech Perception. PLoS Biol 2016; 14:e1002577. [PMID: 27846209 PMCID: PMC5112801 DOI: 10.1371/journal.pbio.1002577] [Citation(s) in RCA: 78] [Impact Index Per Article: 9.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/10/2016] [Accepted: 10/19/2016] [Indexed: 11/19/2022] Open
Abstract
Successful perception depends on combining sensory input with prior knowledge. However, the underlying mechanism by which these two sources of information are combined is unknown. In speech perception, as in other domains, two functionally distinct coding schemes have been proposed for how expectations influence representation of sensory evidence. Traditional models suggest that expected features of the speech input are enhanced or sharpened via interactive activation (Sharpened Signals). Conversely, Predictive Coding suggests that expected features are suppressed so that unexpected features of the speech input (Prediction Errors) are processed further. The present work is aimed at distinguishing between these two accounts of how prior knowledge influences speech perception. By combining behavioural, univariate, and multivariate fMRI measures of how sensory detail and prior expectations influence speech perception with computational modelling, we provide evidence in favour of Prediction Error computations. Increased sensory detail and informative expectations have additive behavioural and univariate neural effects because they both improve the accuracy of word report and reduce the BOLD signal in lateral temporal lobe regions. However, sensory detail and informative expectations have interacting effects on speech representations shown by multivariate fMRI in the posterior superior temporal sulcus. When prior knowledge was absent, increased sensory detail enhanced the amount of speech information measured in superior temporal multivoxel patterns, but with informative expectations, increased sensory detail reduced the amount of measured information. Computational simulations of Sharpened Signals and Prediction Errors during speech perception could both explain these behavioural and univariate fMRI observations. However, the multivariate fMRI observations were uniquely simulated by a Prediction Error and not a Sharpened Signal model. 
The interaction between prior expectation and sensory detail provides evidence for a Predictive Coding account of speech perception. Our work establishes methods that can be used to distinguish representations of Prediction Error and Sharpened Signals in other perceptual domains.
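The two coding schemes contrasted above can be caricatured in a toy simulation (an illustrative sketch only, not the authors' model; the feature vectors, `gain` parameter, and function names are assumptions): under Sharpening, an informative expectation amplifies the expected features of the input, whereas under Predictive Coding the expectation is subtracted, so a fully expected input yields almost no residual signal.

```python
import numpy as np

def sharpened_signal(sensory, expectation, gain=1.0):
    """Interactive-activation account: expected features are amplified."""
    boosted = sensory * (1.0 + gain * expectation)
    return boosted / boosted.sum()  # renormalize to a response pattern

def prediction_error(sensory, expectation):
    """Predictive-coding account: expected features are subtracted out."""
    return sensory - expectation

# Toy "speech" pattern over four acoustic features.
sensory = np.array([0.7, 0.1, 0.1, 0.1])
matching = sensory.copy()      # informative, accurate prior expectation
flat = np.full(4, 0.25)        # uninformative prior

pe_informed = prediction_error(sensory, matching)      # near-zero pattern
pe_naive = prediction_error(sensory, flat)             # still informative
sharp_informed = sharpened_signal(sensory, matching)   # still peaked
```

On this sketch, the information carried by the prediction-error representation shrinks when expectations are informative, while the sharpened representation stays peaked; this is the direction of the multivariate interaction that favored the Prediction Error model in the study.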
Collapse
Affiliation(s)
- Helen Blank
- MRC Cognition and Brain Sciences Unit, Cambridge, United Kingdom
| | - Matthew H. Davis
- MRC Cognition and Brain Sciences Unit, Cambridge, United Kingdom
| |
Collapse
|
48
|
Moberget T, Hilland E, Andersson S, Lundar T, Due-Tønnessen BJ, Heldal A, Ivry RB, Endestad T. Patients with focal cerebellar lesions show reduced auditory cortex activation during silent reading. Brain Lang 2016; 161:18-27. [PMID: 26341544 PMCID: PMC4775464 DOI: 10.1016/j.bandl.2015.08.004] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 11/30/2014] [Revised: 07/28/2015] [Accepted: 08/06/2015] [Indexed: 06/05/2023]
Abstract
Functional neuroimaging studies consistently report language-related cerebellar activations, but evidence from the clinical literature is less conclusive. Here, we attempt to bridge this gap by testing the effect of focal cerebellar lesions on cerebral activations in a reading task previously shown to involve distinct cerebellar regions. Patients (N=10) had lesions primarily affecting medial cerebellum, overlapping cerebellar regions activated during the presentation of random word sequences, but distinct from activations related to semantic prediction generation and prediction error processing. In line with this pattern of activation-lesion overlap, patients did not differ from matched healthy controls (N=10) in predictability-related activations. However, whereas controls showed increased activation in bilateral auditory cortex and parietal operculum when silently reading familiar words relative to viewing letter strings, this effect was absent in the patients. Our results highlight the need for careful lesion mapping and suggest possible roles for the cerebellum in visual-to-auditory mapping and/or inner speech.
Collapse
Affiliation(s)
| | - Eva Hilland
- Department of Psychology, University of Oslo, Oslo, Norway
| | - Stein Andersson
- Department of Psychology, University of Oslo, Oslo, Norway; Department of Psychosomatic Medicine, Oslo University Hospital, Oslo, Norway
| | - Tryggve Lundar
- Department of Neurosurgery, Oslo University Hospital, Oslo, Norway
| | | | - Aasta Heldal
- Department of Psychosomatic Medicine, Oslo University Hospital, Oslo, Norway
| | - Richard B Ivry
- Psychology Department, University of California, Berkeley, Berkeley, CA, USA
| | - Tor Endestad
- Department of Psychology, University of Oslo, Oslo, Norway; Department of Psychosomatic Medicine, Oslo University Hospital, Oslo, Norway
| |
Collapse
|
49
|
An fMRI study investigating effects of conceptually related sentences on the perception of degraded speech. Cortex 2016; 79:57-74. [PMID: 27100909 DOI: 10.1016/j.cortex.2016.03.014] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/04/2015] [Revised: 01/06/2016] [Accepted: 03/15/2016] [Indexed: 11/20/2022]
Abstract
Prior research has shown that the perception of degraded speech is influenced by within-sentence meaning and recruits one or more components of a frontal-temporal-parietal network. The goal of the current study is to examine whether the overall conceptual meaning of a sentence, made up of one set of words, influences the perception of a second acoustically degraded sentence, made up of a different set of words. Using functional magnetic resonance imaging (fMRI), we presented an acoustically clear sentence followed by an acoustically degraded sentence and manipulated the semantic relationship between them: Related in meaning (but consisting of different content words), Unrelated in meaning, or Same. Results showed that listeners' word recognition accuracy for the acoustically degraded sentences was significantly higher when the target sentence was preceded by a conceptually related compared to a conceptually unrelated sentence. Sensitivity to conceptual relationships was associated with enhanced activity in middle and inferior frontal, temporal, and parietal areas. In addition, the left middle frontal gyrus (LMFG), left inferior frontal gyrus (LIFG), and left middle temporal gyrus (LMTG) showed activity that correlated with individual performance on the Related condition. The superior temporal gyrus (STG) showed increased activation in the Same condition, suggesting that it is sensitive to perceptual similarity rather than the integration of meaning between the sentence pairs. A fronto-temporo-parietal network appears to consolidate information sources across multiple levels of language (acoustic, lexical, syntactic, semantic) to build and ultimately integrate conceptual information across sentences, thereby facilitating the perception of a degraded speech signal. However, the available sources of information differentially recruit specific regions and modulate their activity within this network.
Implications of these findings for the functional architecture of the network are considered.
Collapse
|
50
|
Sohoglu E, Davis MH. Perceptual learning of degraded speech by minimizing prediction error. Proc Natl Acad Sci U S A 2016; 113:E1747-56. [PMID: 26957596 PMCID: PMC4812728 DOI: 10.1073/pnas.1523266113] [Citation(s) in RCA: 72] [Impact Index Per Article: 9.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open
Abstract
Human perception is shaped by past experience on multiple timescales. Sudden and dramatic changes in perception occur when prior knowledge or expectations match stimulus content. These immediate effects contrast with the longer-term, more gradual improvements that are characteristic of perceptual learning. Despite extensive investigation of these two experience-dependent phenomena, there is considerable debate about whether they result from common or dissociable neural mechanisms. Here we test single- and dual-mechanism accounts of experience-dependent changes in perception using concurrent magnetoencephalographic and EEG recordings of neural responses evoked by degraded speech. When speech clarity was enhanced by prior knowledge obtained from matching text, we observed reduced neural activity in a peri-auditory region of the superior temporal gyrus (STG). Critically, longer-term improvements in the accuracy of speech recognition following perceptual learning resulted in reduced activity in a nearly identical STG region. Moreover, short-term neural changes caused by prior knowledge and longer-term neural changes arising from perceptual learning were correlated across subjects with the magnitude of learning-induced changes in recognition accuracy. These experience-dependent effects on neural processing could be dissociated from the neural effect of hearing physically clearer speech, which similarly enhanced perception but increased rather than decreased STG responses. Hence, the observed neural effects of prior knowledge and perceptual learning cannot be attributed to epiphenomenal changes in listening effort that accompany enhanced perception. Instead, our results support a predictive coding account of speech perception; computational simulations show how a single mechanism, minimization of prediction error, can drive immediate perceptual effects of prior knowledge and longer-term perceptual learning of degraded speech.
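The single-mechanism claim, that one update rule can drive both immediate and gradual perceptual change, can be sketched in a few lines (a minimal illustration under assumed values, not the authors' simulation; the feature pattern, noise level, and learning rate are hypothetical): repeatedly nudging an internal template by a fraction of the prediction error makes the error, a rough proxy for the evoked STG response in the paper's account, shrink across presentations of the degraded stimulus.

```python
import numpy as np

rng = np.random.default_rng(0)

# A fixed "true" speech pattern, heard repeatedly in degraded (noisy) form;
# the listener's internal template starts out uninformative.
true_pattern = np.array([0.6, 0.2, 0.1, 0.1])
template = np.full(4, 0.25)
learning_rate = 0.3

error_magnitudes = []
for trial in range(20):
    degraded = true_pattern + rng.normal(0.0, 0.05, size=4)  # noisy input
    error = degraded - template          # prediction error on this trial
    error_magnitudes.append(np.abs(error).sum())
    template = template + learning_rate * error  # error-driven update
```

As the template converges on the true pattern, the residual prediction error approaches the noise floor, mirroring the reduced neural responses after perceptual learning that the study reports.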
Collapse
Affiliation(s)
- Ediz Sohoglu
- Medical Research Council Cognition and Brain Sciences Unit, Cambridge CB2 7EF, United Kingdom
| | - Matthew H Davis
- Medical Research Council Cognition and Brain Sciences Unit, Cambridge CB2 7EF, United Kingdom
| |
Collapse
|