1. Liu YL, Zhang YX, Wang Y, Yang Y. Evidence for early encoding of speech in blind people. Brain and Language 2024; 259:105504. [PMID: 39631270] [DOI: 10.1016/j.bandl.2024.105504]
Abstract
Blind listeners rely more on their auditory skills than sighted listeners do, compensating for the visual information they cannot access. However, it is still unclear whether blind people show stronger noise-related modulation than sighted people when speech is presented under adverse listening conditions. This study addressed that gap by using noisy conditions and syllable contrasts to obtain auditory middle-latency responses (MLR) and long-latency responses (LLR) in blind and sighted adults. Blind participants showed higher MLR (Na, Nb, and Pa) and N1 amplitudes than sighted participants, whereas no comparable group difference was observed for the mismatch negativity (MMN) during auditory discrimination in either quiet or noisy backgrounds. This enhanced early encoding might ultimately support stream segregation and the understanding of speech in complex environments, contributing to blind people's more sensitive speech detection. These results have important implications for interpreting noise-induced changes in the early encoding of speech in blind people.
Affiliation(s)
- Yu-Lu Liu: Department of Hearing and Speech Rehabilitation, Binzhou Medical University, Yantai 264003, China.
- Yu-Xin Zhang: Department of Hearing and Speech Rehabilitation, Binzhou Medical University, Yantai 264003, China.
- Yao Wang: Department of Hearing and Speech Rehabilitation, Binzhou Medical University, Yantai 264003, China.
- Ying Yang: Department of Hearing and Speech Rehabilitation, Binzhou Medical University, Yantai 264003, China.
2. Dial HR, Tessmer R, Henry ML. Speech perception and language comprehension in primary progressive aphasia. Cortex 2024; 181:272-289. [PMID: 39577248] [DOI: 10.1016/j.cortex.2024.10.010]
Abstract
Primary progressive aphasia (PPA) is a neurodegenerative disorder characterized by progressive loss of speech and language. Although speech perception and language comprehension deficits are observed in individuals with PPA, these deficits have been understudied relative to production deficits. Recent work has examined receptive language processing at sublexical, lexical, and semantic levels in PPA; however, systematic investigation of these levels of processing within a single PPA cohort is lacking. The current study sought to fill this gap. Individuals with logopenic, nonfluent, and semantic variants of PPA and healthy, age-matched controls completed minimal pairs syllable discrimination, auditory lexical decision, and picture-word verification tasks to assess sublexical, lexical, and semantic processing. Distinct profiles were observed across PPA variants. Individuals with logopenic variant PPA had impaired performance on auditory lexical decision and picture-word verification tasks, with a trend toward impaired performance on the syllable discrimination task. Individuals with nonfluent and semantic variant PPA had impaired performance only on auditory lexical decision and picture-word verification. Evaluation of the types of errors made on the picture-word verification task (phonological and semantic) provided further insight into levels of deficits across the variants. Overall, the results indicate deficits in receptive processing at the lexical-phonological, lexical-semantic, and semantic levels in logopenic variant PPA, with a trend toward impaired sublexical processing. Deficits were observed at the lexical-semantic and semantic levels in semantic variant PPA, and lexical-phonological deficits were observed in nonfluent PPA, likely reflecting changes both in lexical-phonological processing as well as changes in predictive coding during perception. This study provides a more precise characterization of the linguistic profile of each PPA subtype for speech perception and language comprehension. The constellation of deficits observed in each PPA subtype holds promise for differential diagnosis and for informing models of intervention.
Affiliation(s)
- Heather R Dial: Department of Communication Sciences and Disorders, University of Houston, Houston, TX, USA; Department of Speech, Language, and Hearing Sciences, The University of Texas at Austin, Austin, TX, USA.
- Rachel Tessmer: Department of Speech, Language, and Hearing Sciences, The University of Texas at Austin, Austin, TX, USA; Geriatric Research, Education, and Clinical Center, VA Pittsburgh Healthcare System, Pittsburgh, PA, USA.
- Maya L Henry: Department of Speech, Language, and Hearing Sciences, The University of Texas at Austin, Austin, TX, USA; Department of Neurology, The University of Texas at Austin Dell Medical School, Austin, TX, USA.
3. Farrar R, Ashjaei S, Arjmandi MK. Speech-evoked cortical activities and speech recognition in adult cochlear implant listeners: a review of functional near-infrared spectroscopy studies. Exp Brain Res 2024; 242:2509-2530. [PMID: 39305309] [PMCID: PMC11527908] [DOI: 10.1007/s00221-024-06921-9]
Abstract
Cochlear implants (CIs) are the most successful neural prostheses, enabling individuals with severe to profound hearing loss to access sounds and understand speech. While CIs have demonstrated success, speech perception outcomes vary widely among CI listeners, with significantly reduced performance in noise. This review paper summarizes prior findings on speech-evoked cortical activities in adult CI listeners using functional near-infrared spectroscopy (fNIRS) to understand (a) speech-evoked cortical processing in CI listeners compared to normal-hearing (NH) individuals, (b) the relationship between these activities and behavioral speech recognition scores, (c) the extent to which current fNIRS-measured speech-evoked cortical activities in CI listeners account for their differences in speech perception, and (d) challenges in using fNIRS for CI research. Compared to NH listeners, CI listeners had diminished speech-evoked activation in the middle temporal gyrus (MTG) and in the superior temporal gyrus (STG), except for one study that reported the opposite pattern for the STG. NH listeners exhibited higher inferior frontal gyrus (IFG) activity when listening to CI-simulated speech compared to natural speech. Among CI listeners, higher speech recognition scores correlated with lower speech-evoked activation in the STG and higher activation in the left IFG and left fusiform gyrus, with mixed findings in the MTG. fNIRS shows promise for enhancing our understanding of cortical processing of speech in CI listeners, though findings are mixed. Challenges include test-retest reliability, managing noise, replicating natural conditions, optimizing montage design, and standardizing methods to establish a strong predictive relationship between fNIRS-based cortical activities and speech perception in CI listeners.
Affiliation(s)
- Reed Farrar: Department of Psychology, University of South Carolina, 1512 Pendleton Street, Columbia, SC 29208, USA.
- Samin Ashjaei: Department of Communication Sciences and Disorders, University of South Carolina, 1705 College Street, Columbia, SC 29208, USA.
- Meisam K Arjmandi: Department of Communication Sciences and Disorders, University of South Carolina, 1705 College Street, Columbia, SC 29208, USA; Institute for Mind and Brain, University of South Carolina, Barnwell Street, Columbia, SC 29208, USA.
4. Luthra S, Magnuson JS, Myers EB. Right Posterior Temporal Cortex Supports Integration of Phonetic and Talker Information. Neurobiology of Language 2023; 4:145-177. [PMID: 37229142] [PMCID: PMC10205075] [DOI: 10.1162/nol_a_00091]
Abstract
Though the right hemisphere has been implicated in talker processing, it is thought to play a minimal role in phonetic processing, at least relative to the left hemisphere. Recent evidence suggests that the right posterior temporal cortex may support learning of phonetic variation associated with a specific talker. In the current study, listeners heard a male talker and a female talker, one of whom produced an ambiguous fricative in /s/-biased lexical contexts (e.g., epi?ode) and one who produced it in /∫/-biased contexts (e.g., friend?ip). Listeners in a behavioral experiment (Experiment 1) showed evidence of lexically guided perceptual learning, categorizing ambiguous fricatives in line with their previous experience. Listeners in an fMRI experiment (Experiment 2) showed differential phonetic categorization as a function of talker, allowing for an investigation of the neural basis of talker-specific phonetic processing, though they did not exhibit perceptual learning (likely due to characteristics of our in-scanner headphones). Searchlight analyses revealed that the patterns of activation in the right superior temporal sulcus (STS) contained information about who was talking and what phoneme they produced. We take this as evidence that talker information and phonetic information are integrated in the right STS. Functional connectivity analyses suggested that the process of conditioning phonetic identity on talker information depends on the coordinated activity of a left-lateralized phonetic processing system and a right-lateralized talker processing system. Overall, these results clarify the mechanisms through which the right hemisphere supports talker-specific phonetic processing.
Affiliation(s)
- Sahil Luthra: Department of Psychological Sciences, University of Connecticut, Storrs, CT, USA.
- James S. Magnuson: Department of Psychological Sciences, University of Connecticut, Storrs, CT, USA; Basque Center on Cognition, Brain and Language (BCBL), Donostia-San Sebastián, Spain; Ikerbasque, Basque Foundation for Science, Bilbao, Spain.
- Emily B. Myers: Department of Psychological Sciences, University of Connecticut, Storrs, CT, USA; Speech, Language, and Hearing Sciences, University of Connecticut, Storrs, CT, USA.
5. Chen J, Zhao Y, Zou T, Wen X, Zhou X, Yu Y, Liu Z, Li M. Sensorineural Hearing Loss Affects Functional Connectivity of the Auditory Cortex, Parahippocampal Gyrus and Inferior Prefrontal Gyrus in Tinnitus Patients. Front Neurosci 2022; 16:816712. [PMID: 35431781] [PMCID: PMC9011051] [DOI: 10.3389/fnins.2022.816712]
Abstract
Background: Tinnitus can interfere with a patient's speech discrimination, but whether tinnitus itself or the accompanying sensorineural hearing loss (SNHL) causes this interference is still unclear. We analyzed event-related electroencephalograms (EEGs) to observe auditory-related brain function and explore the possible effects of SNHL on auditory processing in tinnitus patients.
Methods: Speech discrimination scores (SDSs) were recorded in 21 healthy control subjects, 24 tinnitus patients, 24 SNHL patients, and 27 patients with both SNHL and tinnitus. EEGs were collected under an oddball paradigm. The mismatch negativity (MMN) amplitude and latency, along with the clustering coefficient and average path length of the whole-brain network, were then compared between the tinnitus and SNHL groups and the control group. Additionally, we analyzed intergroup differences in functional connectivity among the primary auditory cortex (AC), parahippocampal gyrus (PHG), and inferior frontal gyrus (IFG).
Results: SNHL patients with or without tinnitus had lower SDSs than the control subjects. Compared with control subjects, tinnitus patients with or without SNHL had decreased MMN amplitudes, and SNHL patients had longer MMN latencies. Tinnitus patients without SNHL had a smaller clustering coefficient and a longer whole-brain average path length than the control subjects; SNHL patients with or without tinnitus had a smaller clustering coefficient and a longer average path length than patients with tinnitus alone. Connectivity from the AC to the PHG and IFG, and from the PHG to the IFG, was weaker on the affected side in tinnitus patients than in control subjects, whereas connectivity from the IFG to the AC was stronger. In SNHL patients with or without tinnitus, these changes were magnified.
Conclusion: Changes in auditory processing in tinnitus patients do not influence SDSs. Instead, SNHL might alter the activity of the AC, PHG, and IFG, resulting in impaired speech recognition in tinnitus patients with SNHL.
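The whole-network measures compared above (clustering coefficient and average path length) are standard graph-theoretic summaries of a functional connectivity matrix. The Python sketch below is illustrative only and not the authors' pipeline: it shows how such metrics could be computed from a thresholded connectivity matrix with networkx, using an invented channel count, matrix, and threshold.

```python
import numpy as np
import networkx as nx

rng = np.random.default_rng(0)

# Hypothetical functional-connectivity matrix for 32 EEG channels (placeholder values).
n_channels = 32
conn = np.abs(rng.normal(size=(n_channels, n_channels)))
conn = (conn + conn.T) / 2          # symmetrise
np.fill_diagonal(conn, 0)

# Keep only the strongest 20% of connections and build an undirected graph.
adjacency = (conn > np.percentile(conn, 80)).astype(int)
graph = nx.from_numpy_array(adjacency)

# Whole-network metrics of the kind compared across groups in the study.
clustering = nx.average_clustering(graph)
path_length = nx.average_shortest_path_length(graph)   # assumes the graph is connected
print(f"clustering coefficient = {clustering:.3f}, average path length = {path_length:.3f}")
```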
6. Wang YC, Sohoglu E, Gilbert RA, Henson RN, Davis MH. Predictive Neural Computations Support Spoken Word Recognition: Evidence from MEG and Competitor Priming. J Neurosci 2021; 41:6919-6932. [PMID: 34210777] [PMCID: PMC8360690] [DOI: 10.1523/jneurosci.1685-20.2021]
Abstract
Human listeners achieve quick and effortless speech comprehension through computations of conditional probability using Bayes' rule. However, the neural implementation of Bayesian perceptual inference remains unclear. Competitive-selection accounts (e.g., TRACE) propose that word recognition is achieved through direct inhibitory connections between units representing candidate words that share segments (e.g., hygiene and hijack share /haidʒ/). Manipulations that increase lexical uncertainty should increase neural responses associated with word recognition when words cannot be uniquely identified. In contrast, predictive-selection accounts (e.g., Predictive Coding) propose that spoken word recognition involves comparing heard and predicted speech sounds and using prediction error to update lexical representations. Increased lexical uncertainty in words such as hygiene and hijack will increase prediction error, and hence neural activity, only at later time points when different segments are predicted. We collected MEG data from male and female listeners to test these two Bayesian mechanisms and used a competitor priming manipulation to change the prior probability of specific words. Lexical decision responses showed delayed recognition of target words (hygiene) following presentation of a neighboring prime word (hijack) several minutes earlier. However, this effect was not observed with pseudoword primes (higent) or targets (hijure). Crucially, MEG responses in the STG showed greater neural responses for word-primed words after the point at which they were uniquely identified (after /haidʒ/ in hygiene) but not before, while similar changes were again absent for pseudowords. These findings are consistent with accounts of spoken word recognition in which neural computations of prediction error play a central role.
SIGNIFICANCE STATEMENT: Effective speech perception is critical to daily life and involves computations that combine speech signals with prior knowledge of spoken words (i.e., Bayesian perceptual inference). This study specifies the neural mechanisms that support spoken word recognition by testing two distinct implementations of Bayesian perceptual inference. Most established theories propose direct competition between lexical units such that inhibition of irrelevant candidates leads to selection of critical words. Our results instead support predictive-selection theories (e.g., Predictive Coding): by comparing heard and predicted speech sounds, neural computations of prediction error can help listeners continuously update lexical probabilities, allowing for more rapid word identification.
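As a purely illustrative aid (not the authors' model), the toy Python sketch below implements the kind of segment-by-segment Bayesian updating that the predictive-selection account describes: prediction error stays near zero while competitors such as hygiene/hijack share their onset and spikes only once they diverge, and competitor priming is mimicked by raising one word's prior. The lexicon, segment labels, prior values, and noise parameter are all hypothetical.

```python
# Toy lexicon: two competitors that share their first three segments (hypothetical labels).
lexicon = {"hygiene": ["h", "ai", "dZ", "i:", "n"],
           "hijack":  ["h", "ai", "dZ", "ae", "k"]}

def recognise(prior, heard_segments, noise=0.05):
    """Update word probabilities with Bayes' rule as each segment arrives,
    printing the prediction error generated at every position."""
    posterior = dict(prior)
    for pos, seg in enumerate(heard_segments):
        # Prediction for this position: probability-weighted vote over candidates.
        predicted = {}
        for word, p in posterior.items():
            expected = lexicon[word][pos]
            predicted[expected] = predicted.get(expected, 0.0) + p
        # Error is near zero while candidates agree (shared onset), large after they diverge.
        error = 1.0 - predicted.get(seg, 0.0)
        print(f"position {pos} ({seg}): prediction error = {error:.2f}")
        # Likelihood of the heard segment under each candidate, then renormalise.
        for word in posterior:
            posterior[word] *= (1.0 - noise) if lexicon[word][pos] == seg else noise
        total = sum(posterior.values())
        posterior = {w: p / total for w, p in posterior.items()}
    return posterior

# Competitor priming is modelled as a raised prior on "hijack"; hearing "hygiene"
# then produces its large prediction error only at the point of divergence.
print(recognise({"hygiene": 0.3, "hijack": 0.7}, lexicon["hygiene"]))
```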
Affiliation(s)
- Yingcan Carol Wang: MRC Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, CB2 7EF, United Kingdom.
- Ediz Sohoglu: School of Psychology, University of Sussex, Brighton, BN1 9RH, United Kingdom.
- Rebecca A Gilbert: MRC Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, CB2 7EF, United Kingdom.
- Richard N Henson: MRC Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, CB2 7EF, United Kingdom.
- Matthew H Davis: MRC Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, CB2 7EF, United Kingdom.
7. Mechtenberg H, Xie X, Myers EB. Sentence predictability modulates cortical response to phonetic ambiguity. Brain and Language 2021; 218:104959. [PMID: 33930722] [PMCID: PMC8513138] [DOI: 10.1016/j.bandl.2021.104959]
Abstract
Phonetic categories have undefined edges, such that individual tokens that belong to different speech sound categories may occupy the same region in acoustic space. In continuous speech, there are multiple sources of top-down information (e.g., lexical, semantic) that help to resolve the identity of an ambiguous phoneme. Of interest is how these top-down constraints interact with ambiguity at the phonetic level. In the current fMRI study, participants passively listened to sentences that varied in semantic predictability and in the amount of naturally-occurring phonetic competition. The left middle frontal gyrus, angular gyrus, and anterior inferior frontal gyrus were sensitive to both semantic predictability and the degree of phonetic competition. Notably, greater phonetic competition within non-predictive contexts resulted in a negatively-graded neural response. We suggest that uncertainty at the phonetic-acoustic level interacts with uncertainty at the semantic level, perhaps due to a failure of the network to construct a coherent meaning.
Affiliation(s)
- Hannah Mechtenberg: Department of Speech, Language, and Hearing Sciences, University of Connecticut, Storrs, Mansfield, CT 06269, USA.
- Xin Xie: Department of Brain and Cognitive Sciences, University of Rochester, Rochester, NY 14627, USA.
- Emily B Myers: Department of Speech, Language, and Hearing Sciences, University of Connecticut, Storrs, Mansfield, CT 06269, USA; Department of Psychological Sciences, University of Connecticut, Storrs, Mansfield, CT 06269, USA.
8. Guediche S, de Bruin A, Caballero-Gaudes C, Baart M, Samuel AG. Second-language word recognition in noise: Interdependent neuromodulatory effects of semantic context and crosslinguistic interactions driven by word form similarity. Neuroimage 2021; 237:118168. [PMID: 34000398] [DOI: 10.1016/j.neuroimage.2021.118168]
Abstract
Spoken language comprehension is a fundamental component of our cognitive skills. We are quite proficient at deciphering words from the auditory input despite the fact that the speech we hear is often masked by noise such as background babble originating from talkers other than the one we are attending to. To perceive spoken language as intended, we rely on prior linguistic knowledge and context. Prior knowledge includes all sounds and words that are familiar to a listener and depends on linguistic experience. For bilinguals, the phonetic and lexical repertoire encompasses two languages, and the degree of overlap between word forms across languages affects the degree to which they influence one another during auditory word recognition. To support spoken word recognition, listeners often rely on semantic information (i.e., the words we hear are usually related in a meaningful way). Although the number of multilinguals across the globe is increasing, little is known about how crosslinguistic effects (i.e., word overlap) interact with semantic context and affect the flexible neural systems that support accurate word recognition. The current multi-echo functional magnetic resonance imaging (fMRI) study addresses this question by examining how prime-target word pair semantic relationships interact with the target word's form similarity (cognate status) to the translation equivalent in the dominant language (L1) during accurate word recognition of a non-dominant (L2) language. We tested 26 early-proficient Spanish-Basque (L1-L2) bilinguals. When L2 targets matching L1 translation-equivalent phonological word forms were preceded by unrelated semantic contexts that drive lexical competition, a flexible language control (fronto-parietal-subcortical) network was upregulated, whereas when they were preceded by related semantic contexts that reduce lexical competition, it was downregulated. We conclude that an interplay between semantic and crosslinguistic effects regulates flexible control mechanisms of speech processing to facilitate L2 word recognition in noise.
Affiliation(s)
- Sara Guediche: Basque Center on Cognition, Brain and Language, Donostia-San Sebastian 20009, Spain.
- Martijn Baart: Basque Center on Cognition, Brain and Language, Donostia-San Sebastian 20009, Spain; Department of Cognitive Neuropsychology, Tilburg University, P.O. Box 90153, 5000 LE Tilburg, the Netherlands.
- Arthur G Samuel: Basque Center on Cognition, Brain and Language, Donostia-San Sebastian 20009, Spain; Stony Brook University, NY 11794-2500, United States; Ikerbasque Foundation, Spain.
9. Adaptation to mis-pronounced speech: evidence for a prefrontal-cortex repair mechanism. Sci Rep 2021; 11:97. [PMID: 33420193] [PMCID: PMC7794353] [DOI: 10.1038/s41598-020-79640-0]
Abstract
Speech is a complex and ambiguous acoustic signal that varies significantly within and across speakers. Despite the processing challenge that such variability poses, humans adapt to systematic variations in pronunciation rapidly. The goal of this study is to uncover the neurobiological bases of the attunement process that enables such fluent comprehension. Twenty-four native English participants listened to words spoken by a “canonical” American speaker and two non-canonical speakers, and performed a word-picture matching task, while magnetoencephalography was recorded. Non-canonical speech was created by including systematic phonological substitutions within the word (e.g. [s] → [sh]). Activity in the auditory cortex (superior temporal gyrus) was greater in response to substituted phonemes, and, critically, this was not attenuated by exposure. By contrast, prefrontal regions showed an interaction between the presence of a substitution and the amount of exposure: activity decreased for canonical speech over time, whereas responses to non-canonical speech remained consistently elevated. Granger causality analyses further revealed that prefrontal responses serve to modulate activity in auditory regions, suggesting the recruitment of top-down processing to decode non-canonical pronunciations. In sum, our results suggest that the behavioural deficit in processing mispronounced phonemes may be due to a disruption to the typical exchange of information between the prefrontal and auditory cortices as observed for canonical speech.
10. Lin IF, Itahashi T, Kashino M, Kato N, Hashimoto RI. Brain activations while processing degraded speech in adults with autism spectrum disorder. Neuropsychologia 2021; 152:107750. [PMID: 33417913] [DOI: 10.1016/j.neuropsychologia.2021.107750]
Abstract
Individuals with autism spectrum disorder (ASD) are found to have difficulties understanding speech in adverse conditions. In this study, we used noise-vocoded speech (VS) to investigate the neural processing of degraded speech in individuals with ASD. We ran fMRI experiments in an ASD group and a typically developed control (TDC) group while they listened to clear speech (CS), VS, and spectrally rotated VS (SRVS); participants were asked to attend to each sentence and report whether it was intelligible. The VS was spectrally degraded but still intelligible, whereas the SRVS was unintelligible. We recruited 21 right-handed adult males with ASD and 24 age-matched, right-handed male TDC participants. Compared with the TDC group, the ASD group showed reduced functional connectivity (FC) between the left dorsal premotor cortex and the left temporoparietal junction for the effect of task difficulty in speech processing, computed as VS-(CS + SRVS)/2. Furthermore, this reduced FC was negatively correlated with Autism-Spectrum Quotient scores. This observation supports our hypothesis that a disrupted dorsal stream for the attentive processing of degraded speech in individuals with ASD might be related to their difficulty understanding speech in adverse conditions.
Affiliation(s)
- I-Fan Lin: Communication Science Laboratories, NTT Corporation, Atsugi, Kanagawa, 243-0124, Japan; Department of Medicine, Taipei Medical University, Taipei, Taiwan, 11031; Department of Occupational Medicine, Shuang Ho Hospital, New Taipei City, Taiwan, 23561.
- Takashi Itahashi: Medical Institute of Developmental Disabilities Research, Showa University Karasuyama Hospital, Tokyo, 157-8577, Japan.
- Makio Kashino: Communication Science Laboratories, NTT Corporation, Atsugi, Kanagawa, 243-0124, Japan; School of Engineering, Tokyo Institute of Technology, Yokohama, 226-8503, Japan; Graduate School of Education, University of Tokyo, Tokyo, 113-0033, Japan.
- Nobumasa Kato: Medical Institute of Developmental Disabilities Research, Showa University Karasuyama Hospital, Tokyo, 157-8577, Japan.
- Ryu-Ichiro Hashimoto: Medical Institute of Developmental Disabilities Research, Showa University Karasuyama Hospital, Tokyo, 157-8577, Japan; Department of Language Sciences, Tokyo Metropolitan University, Tokyo, 192-0364, Japan.
11. Luthra S. The Role of the Right Hemisphere in Processing Phonetic Variability Between Talkers. Neurobiology of Language 2021; 2:138-151. [PMID: 37213418] [PMCID: PMC10174361] [DOI: 10.1162/nol_a_00028]
Abstract
Neurobiological models of speech perception posit that both left and right posterior temporal brain regions are involved in the early auditory analysis of speech sounds. However, frank deficits in speech perception are not readily observed in individuals with right hemisphere damage. Instead, damage to the right hemisphere is often associated with impairments in vocal identity processing. Herein lies an apparent paradox: The mapping between acoustics and speech sound categories can vary substantially across talkers, so why might right hemisphere damage selectively impair vocal identity processing without obvious effects on speech perception? In this review, I attempt to clarify the role of the right hemisphere in speech perception through a careful consideration of its role in processing vocal identity. I review evidence showing that right posterior superior temporal, right anterior superior temporal, and right inferior / middle frontal regions all play distinct roles in vocal identity processing. In considering the implications of these findings for neurobiological accounts of speech perception, I argue that the recruitment of right posterior superior temporal cortex during speech perception may specifically reflect the process of conditioning phonetic identity on talker information. I suggest that the relative lack of involvement of other right hemisphere regions in speech perception may be because speech perception does not necessarily place a high burden on talker processing systems, and I argue that the extant literature hints at potential subclinical impairments in the speech perception abilities of individuals with right hemisphere damage.
12. Sohoglu E, Davis MH. Rapid computations of spectrotemporal prediction error support perception of degraded speech. eLife 2020; 9:e58077. [PMID: 33147138] [PMCID: PMC7641582] [DOI: 10.7554/elife.58077]
Abstract
Human speech perception can be described as Bayesian perceptual inference but how are these Bayesian computations instantiated neurally? We used magnetoencephalographic recordings of brain responses to degraded spoken words and experimentally manipulated signal quality and prior knowledge. We first demonstrate that spectrotemporal modulations in speech are more strongly represented in neural responses than alternative speech representations (e.g. spectrogram or articulatory features). Critically, we found an interaction between speech signal quality and expectations from prior written text on the quality of neural representations; increased signal quality enhanced neural representations of speech that mismatched with prior expectations, but led to greater suppression of speech that matched prior expectations. This interaction is a unique neural signature of prediction error computations and is apparent in neural responses within 100 ms of speech input. Our findings contribute to the detailed specification of a computational model of speech perception based on predictive coding frameworks.
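A minimal numerical sketch of the prediction-error signature described above, illustrative only and not the authors' analysis: sensory features are scaled by signal quality, a predicted word's features are subtracted, and the amount of the heard word remaining in the error signal falls with quality when the prediction matches but rises when it mismatches. The feature vectors, quality values, and noise level are arbitrary toy choices.

```python
import numpy as np

rng = np.random.default_rng(1)
# Toy spectrotemporal feature patterns for two words (made orthogonal for clarity).
heard = np.array([1.0] * 6 + [0.0] * 6)   # the word actually spoken
other = np.array([0.0] * 6 + [1.0] * 6)   # a different word supplied as the prior

def residual_encoding(quality, prediction, noise_sd=0.05):
    """Prediction error = degraded sensory input minus predicted features;
    return how strongly the heard word survives in that error signal."""
    sensory = quality * heard + rng.normal(scale=noise_sd, size=heard.size)
    error = sensory - prediction
    return abs(np.dot(error, heard)) / np.linalg.norm(heard)

for q in (0.25, 0.5, 1.0):   # stand-in for increasing signal quality (e.g. vocoder channels)
    match = residual_encoding(q, prediction=heard)      # prior written text matched the word
    mismatch = residual_encoding(q, prediction=other)   # prior written text mismatched
    print(f"quality={q:.2f}: matching prior -> {match:.2f}, mismatching prior -> {mismatch:.2f}")
# Encoding falls with quality under a matching prior (suppression) but rises under a
# mismatching prior: the crossover interaction described as the prediction-error signature.
```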
Affiliation(s)
- Ediz Sohoglu: School of Psychology, University of Sussex, Brighton, United Kingdom.
- Matthew H Davis: MRC Cognition and Brain Sciences Unit, Cambridge, United Kingdom.
13. Getz LM, Toscano JC. The time-course of speech perception revealed by temporally-sensitive neural measures. Wiley Interdisciplinary Reviews: Cognitive Science 2020; 12:e1541. [PMID: 32767836] [DOI: 10.1002/wcs.1541]
Abstract
Recent advances in cognitive neuroscience have provided a detailed picture of the early time-course of speech perception. In this review, we highlight this work, placing it within the broader context of research on the neurobiology of speech processing, and discuss how these data point us toward new models of speech perception and spoken language comprehension. We focus, in particular, on temporally-sensitive measures that allow us to directly measure early perceptual processes. Overall, the data provide support for two key principles: (a) speech perception is based on gradient representations of speech sounds and (b) speech perception is interactive and receives input from higher-level linguistic context at the earliest stages of cortical processing. Implications for models of speech processing and the neurobiology of language more broadly are discussed. This article is categorized under: Psychology > Language; Psychology > Perception and Psychophysics; Neuroscience > Cognition.
Affiliation(s)
- Laura M Getz: Department of Psychological Sciences, University of San Diego, San Diego, California, USA.
- Joseph C Toscano: Department of Psychological and Brain Sciences, Villanova University, Villanova, Pennsylvania, USA.
14. Nelson MJ, Moeller S, Basu A, Christopher L, Rogalski EJ, Greicius M, Weintraub S, Bonakdarpour B, Hurley RS, Mesulam MM. Taxonomic Interference Associated with Phonemic Paraphasias in Agrammatic Primary Progressive Aphasia. Cereb Cortex 2020; 30:2529-2541. [PMID: 31800048] [PMCID: PMC7174997] [DOI: 10.1093/cercor/bhz258]
Abstract
Phonemic paraphasias are thought to reflect phonological (post-semantic) deficits in language production. Here we present evidence that phonemic paraphasias in non-semantic primary progressive aphasia (PPA) may be associated with taxonomic interference. Agrammatic and logopenic PPA patients and control participants performed a word-to-picture visual search task where they matched a stimulus noun to 1 of 16 object pictures as their eye movements were recorded. Participants were subsequently asked to name the same items. We measured taxonomic interference (ratio of time spent viewing related vs. unrelated foils) during the search task for each item. Target items that elicited a phonemic paraphasia during object naming elicited increased taxonomic interference during the search task in agrammatic but not logopenic PPA patients. These results could reflect either very subtle sub-clinical semantic distortions of word representations or partial degradation of specific phonological word forms in agrammatic PPA during both word-to-picture matching (input stage) and picture naming (output stage). The mechanism for phonemic paraphasias in logopenic patients seems to be different and to be operative at the pre-articulatory stage of phonological retrieval. Glucose metabolic imaging suggests that degeneration in the left posterior frontal lobe and left temporo-parietal junction, respectively, might underlie these different patterns of phonemic paraphasia.
Affiliation(s)
- M J Nelson: Mesulam Center for Cognitive Neurology and Alzheimer’s Disease, Feinberg School of Medicine, Northwestern University, Chicago, IL 60611, USA; Department of Neurological Surgery, Feinberg School of Medicine, Northwestern University, Chicago, IL 60611, USA; Department of Neurosurgery, School of Medicine, University of Alabama at Birmingham, Birmingham, AL 35233, USA.
- S Moeller: Mesulam Center for Cognitive Neurology and Alzheimer’s Disease, Feinberg School of Medicine, Northwestern University, Chicago, IL 60611, USA.
- A Basu: Mesulam Center for Cognitive Neurology and Alzheimer’s Disease, Feinberg School of Medicine, Northwestern University, Chicago, IL 60611, USA.
- L Christopher: Department of Neurology and Neurological Sciences, FIND Lab, Stanford University, Stanford, CA 94304, USA.
- E J Rogalski: Mesulam Center for Cognitive Neurology and Alzheimer’s Disease, Feinberg School of Medicine, Northwestern University, Chicago, IL 60611, USA; Department of Psychiatry and Behavioral Sciences, Feinberg School of Medicine, Northwestern University, Chicago, IL 60611, USA.
- M Greicius: Department of Neurology and Neurological Sciences, FIND Lab, Stanford University, Stanford, CA 94304, USA.
- S Weintraub: Mesulam Center for Cognitive Neurology and Alzheimer’s Disease, Feinberg School of Medicine, Northwestern University, Chicago, IL 60611, USA; Department of Neurology, Feinberg School of Medicine, Northwestern University, Chicago, IL 60611, USA.
- B Bonakdarpour: Mesulam Center for Cognitive Neurology and Alzheimer’s Disease, Feinberg School of Medicine, Northwestern University, Chicago, IL 60611, USA.
- R S Hurley: Mesulam Center for Cognitive Neurology and Alzheimer’s Disease, Feinberg School of Medicine, Northwestern University, Chicago, IL 60611, USA; Department of Psychology, Cleveland State University, Cleveland, OH 44115, USA.
- M-M Mesulam: Mesulam Center for Cognitive Neurology and Alzheimer’s Disease, Feinberg School of Medicine, Northwestern University, Chicago, IL 60611, USA; Department of Neurology, Feinberg School of Medicine, Northwestern University, Chicago, IL 60611, USA.
15. Bidelman GM, Bush LC, Boudreaux AM. Effects of Noise on the Behavioral and Neural Categorization of Speech. Front Neurosci 2020; 14:153. [PMID: 32180700] [PMCID: PMC7057933] [DOI: 10.3389/fnins.2020.00153]
Abstract
We investigated whether the categorical perception (CP) of speech might also provide a mechanism that aids its perception in noise. We varied signal-to-noise ratio (SNR) [clear, 0 dB, -5 dB] while listeners classified an acoustic-phonetic continuum (/u/ to /a/). Noise-related changes in behavioral categorization were only observed at the lowest SNR. Event-related brain potentials (ERPs) differentiated category vs. category-ambiguous speech by the P2 wave (~180-320 ms). Paralleling behavior, neural responses to speech with clear phonetic status (i.e., continuum endpoints) were robust to noise down to -5 dB SNR, whereas responses to ambiguous tokens declined with decreasing SNR. Results demonstrate that phonetic speech representations are more resistant to degradation than corresponding acoustic representations. Findings suggest the mere process of binning speech sounds into categories provides a robust mechanism to aid figure-ground speech perception by fortifying abstract categories from the acoustic signal and making the speech code more resistant to external interferences.
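Constructing stimuli at a fixed SNR, as in the clear/0 dB/-5 dB conditions above, amounts to scaling the masker so that the speech-to-noise power ratio hits the target value. The sketch below shows one conventional way to do this; the sine-wave "token", sampling rate, and white-noise masker are placeholders, not the study's actual stimuli.

```python
import numpy as np

def mix_at_snr(speech, snr_db, rng=np.random.default_rng(0)):
    """Add white noise scaled so that 10*log10(P_speech / P_noise) equals snr_db."""
    noise = rng.normal(size=speech.shape)
    target_noise_power = np.mean(speech ** 2) / (10 ** (snr_db / 10))
    noise *= np.sqrt(target_noise_power / np.mean(noise ** 2))
    return speech + noise

fs = 16000
t = np.arange(fs) / fs
token = np.sin(2 * np.pi * 220 * t)     # placeholder for a vowel token
for snr_db in (0, -5):                  # the two noise conditions reported above
    mixed = mix_at_snr(token, snr_db)
    achieved = 10 * np.log10(np.mean(token ** 2) / np.mean((mixed - token) ** 2))
    print(f"target {snr_db:+d} dB SNR -> achieved {achieved:.1f} dB")
```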
Affiliation(s)
- Gavin M Bidelman: Institute for Intelligent Systems, University of Memphis, Memphis, TN, United States; School of Communication Sciences and Disorders, University of Memphis, Memphis, TN, United States; Department of Anatomy and Neurobiology, University of Tennessee Health Sciences Center, Memphis, TN, United States.
- Lauren C Bush: School of Communication Sciences and Disorders, University of Memphis, Memphis, TN, United States.
- Alex M Boudreaux: School of Communication Sciences and Disorders, University of Memphis, Memphis, TN, United States.
16. Guediche S, Zhu Y, Minicucci D, Blumstein SE. Written sentence context effects on acoustic-phonetic perception: fMRI reveals cross-modal semantic-perceptual interactions. Brain and Language 2019; 199:104698. [PMID: 31586792] [DOI: 10.1016/j.bandl.2019.104698]
Abstract
This study examines cross-modality effects of a semantically-biased written sentence context on the perception of an acoustically-ambiguous word target, identifying neural areas sensitive to interactions between sentential bias and phonetic ambiguity. Of interest is whether the locus or nature of the interactions resembles those previously demonstrated for auditory-only effects. fMRI results show significant interaction effects in right mid-middle temporal gyrus (RmMTG) and bilateral anterior superior temporal gyri (aSTG), regions along the ventral language comprehension stream that map sound onto meaning. These regions are more anterior than those previously identified for auditory-only effects; however, the same cross-over interaction pattern emerged, implying similar underlying computations at play. The findings suggest that the mechanisms that integrate information across modality and across sentence and phonetic levels of processing recruit amodal areas where reading and spoken lexical and semantic access converge. Taken together, results support interactive accounts of speech and language processing.
Affiliation(s)
- Sara Guediche: Department of Cognitive, Linguistic & Psychological Sciences, Brown University, United States; BCBL - Basque Center on Cognition, Brain and Language, Donostia-San Sebastian, Spain.
- Yuli Zhu: Neuroscience Department, Brown University, United States.
- Domenic Minicucci: Department of Cognitive, Linguistic & Psychological Sciences, Brown University, United States.
- Sheila E Blumstein: Department of Cognitive, Linguistic & Psychological Sciences, Brown University, United States; Brown Institute for Brain Science, Brown University, United States.
17. Luthra S, Fuhrmeister P, Molfese PJ, Guediche S, Blumstein SE, Myers EB. Brain-behavior relationships in incidental learning of non-native phonetic categories. Brain and Language 2019; 198:104692. [PMID: 31522094] [PMCID: PMC6773471] [DOI: 10.1016/j.bandl.2019.104692]
Abstract
Research has implicated the left inferior frontal gyrus (LIFG) in mapping acoustic-phonetic input to sound category representations, both in native speech perception and non-native phonetic category learning. At issue is whether this sensitivity reflects access to phonetic category information per se or to explicit category labels, the latter often being required by experimental procedures. The current study employed an incidental learning paradigm designed to increase sensitivity to a difficult non-native phonetic contrast without inducing explicit awareness of the categorical nature of the stimuli. Functional MRI scans revealed frontal sensitivity to phonetic category structure both before and after learning. Additionally, individuals who succeeded most on the learning task showed the largest increases in frontal recruitment after learning. Overall, results suggest that processing novel phonetic category information entails a reliance on frontal brain regions, even in the absence of explicit category labels.
Affiliation(s)
- Sahil Luthra: University of Connecticut, Department of Psychological Sciences, United States.
- Pamela Fuhrmeister: University of Connecticut, Department of Speech, Language and Hearing Sciences, United States.
- Sara Guediche: Basque Center on Cognition, Brain and Language, Spain.
- Sheila E Blumstein: Brown University, Department of Cognitive, Linguistic and Psychological Sciences, United States.
- Emily B Myers: University of Connecticut, Department of Psychological Sciences, United States; University of Connecticut, Department of Speech, Language and Hearing Sciences, United States; Haskins Laboratories, United States.
18. Luthra S, Guediche S, Blumstein SE, Myers EB. Neural substrates of subphonemic variation and lexical competition in spoken word recognition. Language, Cognition and Neuroscience 2019; 34:151-169. [PMID: 31106225] [PMCID: PMC6516505] [DOI: 10.1080/23273798.2018.1531140]
Abstract
In spoken word recognition, subphonemic variation influences lexical activation, with sounds near a category boundary increasing phonetic competition as well as lexical competition. The current study investigated the interplay of these factors using a visual world task in which participants were instructed to look at a picture of an auditory target (e.g., peacock). Eyetracking data indicated that participants were slowed when a voiced onset competitor (e.g., beaker) was also displayed, and this effect was amplified when acoustic-phonetic competition was increased. Simultaneously-collected fMRI data showed that several brain regions were sensitive to the presence of the onset competitor, including the supramarginal, middle temporal, and inferior frontal gyri, and functional connectivity analyses revealed that the coordinated activity of left frontal regions depends on both acoustic-phonetic and lexical factors. Taken together, results suggest a role for frontal brain structures in resolving lexical competition, particularly as atypical acoustic-phonetic information maps on to the lexicon.
Affiliation(s)
- Sahil Luthra: Department of Psychological Sciences, University of Connecticut, 406 Babbidge Road, Unit 1020, Storrs, CT 06269, USA.
- Sara Guediche: BCBL, Basque Center on Cognition, Brain and Language, Mikeletegi Pasealekua 69, 20009 Donostia, Gipuzkoa, Spain.
- Sheila E Blumstein: Department of Cognitive, Linguistic & Psychological Sciences, Brown University, 190 Thayer Street, Providence, RI 02912, USA; Brown Institute for Brain Science, Brown University, 2 Stimson Ave, Providence, RI 02912, USA.
- Emily B Myers: Department of Psychological Sciences, University of Connecticut, 406 Babbidge Road, Unit 1020, Storrs, CT 06269, USA; Department of Speech, Language & Hearing Sciences, University of Connecticut, 850 Bolton Road, Unit 1085, Storrs, CT 06269, USA; Haskins Laboratories, 300 George Street, Suite 900, New Haven, CT 06511, USA.
19. Toscano JC, Anderson ND, Fabiani M, Gratton G, Garnsey SM. The time-course of cortical responses to speech revealed by fast optical imaging. Brain and Language 2018; 184:32-42. [PMID: 29960165] [PMCID: PMC6102048] [DOI: 10.1016/j.bandl.2018.06.006]
Abstract
Recent work has sought to describe the time-course of spoken word recognition, from initial acoustic cue encoding through lexical activation, and identify cortical areas involved in each stage of analysis. However, existing methods are limited in either temporal or spatial resolution, and as a result, have only provided partial answers to the question of how listeners encode acoustic information in speech. We present data from an experiment using a novel neuroimaging method, fast optical imaging, to directly assess the time-course of speech perception, providing non-invasive measurement of speech sound representations, localized to specific cortical areas. We find that listeners encode speech in terms of continuous acoustic cues at early stages of processing (ca. 96 ms post-stimulus onset), and begin activating phonological category representations rapidly (ca. 144 ms post-stimulus). Moreover, cue-based representations are widespread in the brain and overlap in time with graded category-based representations, suggesting that spoken word recognition involves simultaneous activation of both continuous acoustic cues and phonological categories.
Affiliation(s)
- Joseph C Toscano: Department of Psychological & Brain Sciences, Villanova University, United States; Beckman Institute for Advanced Science & Technology, University of Illinois at Urbana-Champaign, United States.
- Nathaniel D Anderson: Beckman Institute for Advanced Science & Technology, University of Illinois at Urbana-Champaign, United States; Department of Psychology, University of Illinois at Urbana-Champaign, United States.
- Monica Fabiani: Beckman Institute for Advanced Science & Technology, University of Illinois at Urbana-Champaign, United States; Department of Psychology, University of Illinois at Urbana-Champaign, United States.
- Gabriele Gratton: Beckman Institute for Advanced Science & Technology, University of Illinois at Urbana-Champaign, United States; Department of Psychology, University of Illinois at Urbana-Champaign, United States.
- Susan M Garnsey: Beckman Institute for Advanced Science & Technology, University of Illinois at Urbana-Champaign, United States; Department of Psychology, University of Illinois at Urbana-Champaign, United States.
20. Liang B, Du Y. The Functional Neuroanatomy of Lexical Tone Perception: An Activation Likelihood Estimation Meta-Analysis. Front Neurosci 2018; 12:495. [PMID: 30087589] [PMCID: PMC6066585] [DOI: 10.3389/fnins.2018.00495]
Abstract
In tonal languages such as Chinese, lexical tone serves as a phonemic feature in determining word meaning. Meanwhile, it is close to prosody in terms of suprasegmental pitch variations and larynx-based articulation. The important yet mixed nature of lexical tone has evoked considerable studies, but no consensus has been reached on its functional neuroanatomy. This meta-analysis aimed at uncovering the neural network of lexical tone perception in comparison with that of phoneme and prosody in a unified framework. Independent Activation Likelihood Estimation meta-analyses were conducted for different linguistic elements: lexical tone by native tonal language speakers, lexical tone by non-tonal language speakers, phoneme, word-level prosody, and sentence-level prosody. Results showed that lexical tone and prosody studies demonstrated more extensive activations in the right than the left auditory cortex, whereas the opposite pattern was found for phoneme studies. Only tonal language speakers consistently recruited the left anterior superior temporal gyrus (STG) for processing lexical tone, an area implicated in phoneme processing and word-form recognition. Moreover, an anterior-lateral to posterior-medial gradient of activation as a function of element timescale was revealed in the right STG, in which the activation for lexical tone lay between that for phoneme and that for prosody. Another topological pattern was shown in the left precentral gyrus (preCG), with the activation for lexical tone overlapped with that for prosody but ventral to that for phoneme. These findings provide evidence that the neural network for lexical tone perception is hybrid with those for phoneme and prosody. That is, resembling prosody, lexical tone perception, regardless of language experience, involved right auditory cortex, with activation localized between sites engaged by phonemic and prosodic processing, suggesting a hierarchical organization of representations in the right auditory cortex. For tonal language speakers, lexical tone additionally engaged the left STG lexical mapping network, consistent with the phonemic representation. Similarly, when processing lexical tone, only tonal language speakers engaged the left preCG site implicated in prosody perception, consistent with tonal language speakers having stronger articulatory representations for lexical tone in the laryngeal sensorimotor network. A dynamic dual-stream model for lexical tone perception was proposed and discussed.
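For readers unfamiliar with Activation Likelihood Estimation, the core computation treats each reported activation focus as a Gaussian probability blob and asks where these probabilities converge across experiments. The one-dimensional toy below illustrates only that union-of-Gaussians idea; real ALE operates on 3-D brain coordinates with sample-size-dependent kernel widths and a permutation-based null distribution, and the foci listed here are invented.

```python
import numpy as np

def modeled_activation(foci_mm, grid_mm, fwhm_mm=10.0):
    """Per-experiment map: probabilistic union of Gaussians centred on each reported focus."""
    sigma = fwhm_mm / 2.3548                      # FWHM -> standard deviation
    ma = np.zeros_like(grid_mm)
    for focus in foci_mm:
        g = np.exp(-0.5 * ((grid_mm - focus) / sigma) ** 2)
        ma = 1.0 - (1.0 - ma) * (1.0 - g)         # union, so nearby foci do not double-count
    return ma

grid = np.linspace(-60.0, 60.0, 241)                       # 1-D stand-in for voxel coordinates (mm)
experiments = [[-50.0, -48.0], [-52.0], [-10.0, -49.0]]    # invented foci from three studies
ma_maps = [modeled_activation(foci, grid) for foci in experiments]
ale = 1.0 - np.prod([1.0 - ma for ma in ma_maps], axis=0)  # convergence across experiments
print(f"peak ALE value {ale.max():.2f} at {grid[np.argmax(ale)]:.1f} mm")
```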
Affiliation(s)
- Baishen Liang: CAS Key Laboratory of Behavioral Science, CAS Center for Excellence in Brain Science and Intelligence Technology, Institute of Psychology, Chinese Academy of Sciences, Beijing, China; Department of Psychology, University of Chinese Academy of Sciences, Beijing, China.
- Yi Du: CAS Key Laboratory of Behavioral Science, CAS Center for Excellence in Brain Science and Intelligence Technology, Institute of Psychology, Chinese Academy of Sciences, Beijing, China; Department of Psychology, University of Chinese Academy of Sciences, Beijing, China.
21. In Spoken Word Recognition, the Future Predicts the Past. J Neurosci 2018; 38:7585-7599. [PMID: 30012695] [DOI: 10.1523/jneurosci.0065-18.2018]
Abstract
Speech is an inherently noisy and ambiguous signal. To fluently derive meaning, a listener must integrate contextual information to guide interpretations of the sensory input. Although many studies have demonstrated the influence of prior context on speech perception, the neural mechanisms supporting the integration of subsequent context remain unknown. Using MEG to record from human auditory cortex, we analyzed responses to spoken words with a varyingly ambiguous onset phoneme, the identity of which is later disambiguated at the lexical uniqueness point. Fifty participants (both male and female) were recruited across two MEG experiments. Our findings suggest that primary auditory cortex is sensitive to phonological ambiguity very early during processing at just 50 ms after onset. Subphonemic detail is preserved in auditory cortex over long timescales and re-evoked at subsequent phoneme positions. Commitments to phonological categories occur in parallel, resolving on the shorter timescale of ∼450 ms. These findings provide evidence that future input determines the perception of earlier speech sounds by maintaining sensory features until they can be integrated with top-down lexical information.
SIGNIFICANCE STATEMENT: The perception of a speech sound is determined by its surrounding context in the form of words, sentences, and other speech sounds. Often, such contextual information becomes available later than the sensory input. The present study is the first to unveil how the brain uses this subsequent information to aid speech comprehension. Concretely, we found that the auditory system actively maintains the acoustic signal in auditory cortex while concurrently making guesses about the identity of the words being said. Such a processing strategy allows the content of the message to be accessed quickly while also permitting reanalysis of the acoustic signal to minimize parsing mistakes.
22. Focal versus distributed temporal cortex activity for speech sound category assignment. Proc Natl Acad Sci U S A 2018; 115:E1299-E1308. [PMID: 29363598] [PMCID: PMC5819402] [DOI: 10.1073/pnas.1714279115]
Abstract
When listening to speech, phonemes are represented in a distributed fashion in our temporal and prefrontal cortices. How these representations are selected in a phonemic decision context, and in particular whether distributed or focal neural information is required for explicit phoneme recognition, is unclear. We hypothesized that focal and early neural encoding of acoustic signals is sufficiently informative to access speech sound representations and permit phoneme recognition. We tested this hypothesis by combining a simple speech-phoneme categorization task with univariate and multivariate analyses of fMRI, magnetoencephalography, intracortical, and clinical data. We show that neural information available focally in the temporal cortex prior to decision-related neural activity is specific enough to account for human phonemic identification. Percepts and words can be decoded from distributed neural activity measures. However, the existence of widespread representations might conflict with the more classical notions of hierarchical processing and efficient coding, which are especially relevant in speech processing. Using fMRI and magnetoencephalography during syllable identification, we show that sensory and decisional activity colocalize to a restricted part of the posterior superior temporal gyrus (pSTG). Next, using intracortical recordings, we demonstrate that early and focal neural activity in this region distinguishes correct from incorrect decisions and can be machine-decoded to classify syllables. Crucially, significant machine decoding was possible from neuronal activity sampled across different regions of the temporal and frontal lobes, despite weak or absent sensory or decision-related responses. These findings show that speech-sound categorization relies on an efficient readout of focal pSTG neural activity, while more distributed activity patterns, although classifiable by machine learning, instead reflect collateral processes of sensory perception and decision.
23. Xie X, Myers E. Left Inferior Frontal Gyrus Sensitivity to Phonetic Competition in Receptive Language Processing: A Comparison of Clear and Conversational Speech. J Cogn Neurosci 2017; 30:267-280. [PMID: 29160743] [DOI: 10.1162/jocn_a_01208]
Abstract
The speech signal is rife with variations in phonetic ambiguity. For instance, when talkers speak in a conversational register, they demonstrate less articulatory precision, leading to greater potential for confusability at the phonetic level compared with a clear speech register. Current psycholinguistic models assume that ambiguous speech sounds activate more than one phonological category and that competition at prelexical levels cascades to lexical levels of processing. Imaging studies have shown that the left inferior frontal gyrus (LIFG) is modulated by phonetic competition between simultaneously activated categories, with increases in activation for more ambiguous tokens. Yet, these studies have often used artificially manipulated speech and/or metalinguistic tasks, which arguably may recruit neural regions that are not critical for natural speech recognition. Indeed, a prominent model of speech processing, the dual-stream model, posits that the LIFG is not involved in prelexical processing in receptive language processing. In the current study, we exploited natural variation in phonetic competition in the speech signal to investigate the neural systems sensitive to phonetic competition as listeners engage in a receptive language task. Participants heard nonsense sentences spoken in either a clear or conversational register as neural activity was monitored using fMRI. Conversational sentences contained greater phonetic competition, as estimated by measures of vowel confusability, and these sentences also elicited greater activation in a region in the LIFG. Sentence-level phonetic competition metrics uniquely correlated with LIFG activity as well. This finding is consistent with the hypothesis that the LIFG responds to competition at multiple levels of language processing and that recruitment of this region does not require an explicit phonological judgment.
24. Cai ZG, Gilbert RA, Davis MH, Gaskell MG, Farrar L, Adler S, Rodd JM. Accent modulates access to word meaning: Evidence for a speaker-model account of spoken word recognition. Cogn Psychol 2017; 98:73-101. [PMID: 28881224] [PMCID: PMC6597358] [DOI: 10.1016/j.cogpsych.2017.08.003]
Abstract
Speech carries accent information relevant to determining the speaker's linguistic and social background. A series of web-based experiments demonstrate that accent cues can modulate access to word meaning. In Experiments 1-3, British participants were more likely to retrieve the American dominant meaning (e.g., hat meaning of "bonnet") in a word association task if they heard the words in an American than a British accent. In addition, results from a speeded semantic decision task (Experiment 4) and sentence comprehension task (Experiment 5) confirm that accent modulates on-line meaning retrieval such that comprehension of ambiguous words is easier when the relevant word meaning is dominant in the speaker's dialect. Critically, neutral-accent speech items, created by morphing British- and American-accented recordings, were interpreted in a similar way to accented words when embedded in a context of accented words (Experiment 2). This finding indicates that listeners do not use accent to guide meaning retrieval on a word-by-word basis; instead they use accent information to determine the dialectic identity of a speaker and then use their experience of that dialect to guide meaning access for all words spoken by that person. These results motivate a speaker-model account of spoken word recognition in which comprehenders determine key characteristics of their interlocutor and use this knowledge to guide word meaning access.
Affiliation(s)
- Zhenguang G Cai: University College London, United Kingdom; University of East Anglia, United Kingdom.
- Rebecca A Gilbert: University College London, United Kingdom; MRC Cognition & Brain Sciences Unit, Cambridge, United Kingdom.
- Matthew H Davis: MRC Cognition & Brain Sciences Unit, Cambridge, United Kingdom.