1
Lulaci T, Söderström P, Tronnier M, Roll M. Temporal dynamics of coarticulatory cues to prediction. Front Psychol 2024; 15:1446240. PMID: 39315043; PMCID: PMC11416931; DOI: 10.3389/fpsyg.2024.1446240.
Abstract
The temporal dynamics of the perception of within-word coarticulatory cues remain a subject of ongoing debate in speech perception research. This behavioral gating study sheds light on the unfolding predictive use of anticipatory coarticulation in onset fricatives. Word onset fricatives (/f/ and /s/) were split into four gates (15, 35, 75 and 135 milliseconds). Listeners made a forced choice about the word they were listening to, based on the stimulus gates. The results showed fast predictive use of coarticulatory lip rounding during /s/ word onsets, as early as 15 ms from word onset. For /f/ onsets, coarticulatory backness and height began to be used predictively after 75 ms. These findings indicate that coarticulatory cues can occur and be used predictively very early in the word, with a time course that differs depending on fricative type.
Affiliation(s)
- Tugba Lulaci
- Centre for Languages and Literature, Lund University, Lund, Sweden
- Pelle Söderström
- Centre for Languages and Literature, Lund University, Lund, Sweden
- The MARCS Institute for Brain, Behaviour and Development, Western Sydney University, Sydney, NSW, Australia
- Mikael Roll
- Centre for Languages and Literature, Lund University, Lund, Sweden
2
Sarrett ME, Toscano JC. Decoding speech sounds from neurophysiological data: Practical considerations and theoretical implications. Psychophysiology 2024; 61:e14475. PMID: 37947235; DOI: 10.1111/psyp.14475.
Abstract
Machine learning techniques have proven to be a useful tool in cognitive neuroscience. However, their implementation in scalp-recorded electroencephalography (EEG) is relatively limited. To address this, we present three analyses using data from a previous study that examined event-related potential (ERP) responses to a wide range of naturally produced speech sounds. First, we explore which features of the EEG signal best maximize machine learning accuracy for a voicing distinction, using a support vector machine (SVM). We manipulate three dimensions of the EEG signal as input to the SVM: number of trials averaged, number of time points averaged, and polynomial fit. We discuss the trade-offs in using different feature sets and offer some recommendations for researchers using machine learning. Next, we use SVMs to classify specific pairs of phonemes, finding that we can detect differences in the EEG signal that are not otherwise detectable using conventional ERP analyses. Finally, we characterize the time course of phonetic feature decoding across three phonological dimensions (voicing, manner of articulation, and place of articulation), and find that voicing and manner are decodable from neural activity, whereas place of articulation is not. This set of analyses addresses practical considerations in the application of machine learning to EEG, particularly for speech studies, and sheds light on current issues regarding the nature of perceptual representations of speech.
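The feature-engineering trade-offs described in this abstract are easy to picture in code. The following is a minimal sketch, not the authors' pipeline: it builds synthetic single-trial "ERPs" for a two-way voicing distinction, averages blocks of trials, uses polynomial-fit coefficients over the analysis window as features, and cross-validates a linear SVM. All dimensions (trial counts, sampling, polynomial degree) are illustrative assumptions.

```python
# A minimal sketch (synthetic data, assumed dimensions) of SVM decoding
# with trial averaging and polynomial-fit features, as described above.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_trials, n_times = 200, 50                  # hypothetical trial and sample counts
t = np.linspace(0, 1, n_times)

# Synthetic single-trial "ERPs": the two voicing categories differ in amplitude.
voiced = 1.0 * np.sin(2 * np.pi * t) + rng.normal(0, 2.0, (n_trials, n_times))
voiceless = 0.6 * np.sin(2 * np.pi * t) + rng.normal(0, 2.0, (n_trials, n_times))

def featurize(trials, n_avg=10, poly_deg=3):
    """Average blocks of n_avg trials, then use polynomial-fit coefficients
    over the window as features (two of the manipulated dimensions)."""
    blocks = trials[: (len(trials) // n_avg) * n_avg].reshape(-1, n_avg, trials.shape[1])
    means = blocks.mean(axis=1)                                   # trial averaging
    return np.array([np.polyfit(t, m, poly_deg) for m in means])  # polynomial fit

X = np.vstack([featurize(voiced), featurize(voiceless)])
y = np.repeat([0, 1], len(X) // 2)

print(cross_val_score(SVC(kernel="linear"), X, y, cv=5).mean())   # decoding accuracy
```

Varying n_avg and poly_deg in this sketch reproduces the kind of trade-off the paper quantifies: heavier averaging cleans up the features but shrinks the training set.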
Affiliation(s)
- McCall E Sarrett
- Department of Psychological and Brain Sciences, Villanova University, Villanova, Pennsylvania, USA
- Psychology Department, Gonzaga University, Spokane, Washington, USA
- Joseph C Toscano
- Department of Psychological and Brain Sciences, Villanova University, Villanova, Pennsylvania, USA
3
Davies B, Holt R, Demuth K. Children with hearing loss can use subject-verb agreement to predict during spoken language processing. J Exp Child Psychol 2023; 226:105545. PMID: 36126586; DOI: 10.1016/j.jecp.2022.105545.
Abstract
Rapid processing of spoken language is aided by the ability to predict upcoming words using both semantic and syntactic cues. However, although children with hearing loss (HL) can predict upcoming words using semantic associations, little is known about their ability to predict using syntactic dependencies such as subject-verb (SV) agreement. This study examined whether school-aged children with hearing aids and/or cochlear implants can use SV agreement to predict upcoming nouns when processing spoken language. Although these children did demonstrate prediction with plural SV agreement, they did so more slowly than their normal-hearing (NH) peers. This may be due to weaker grammatical representations, given that function words and grammatical inflections typically have lower perceptual salience. Thus, a better understanding of morphosyntactic representations in children with HL, and of their ability to use these for prediction, sheds much-needed light on the online language processing challenges and abilities of this population.
Affiliation(s)
- Benjamin Davies
- Department of Linguistics, Level 3 Australian Hearing Hub, Macquarie University, Sydney, New South Wales 2109, Australia
- Rebecca Holt
- Department of Linguistics, Level 3 Australian Hearing Hub, Macquarie University, Sydney, New South Wales 2109, Australia
- Katherine Demuth
- Department of Linguistics, Level 3 Australian Hearing Hub, Macquarie University, Sydney, New South Wales 2109, Australia
4
McMurray B. I'm not sure that curve means what you think it means: Toward a [more] realistic understanding of the role of eye-movement generation in the Visual World Paradigm. Psychon Bull Rev 2023; 30:102-146. PMID: 35962241; PMCID: PMC10964151; DOI: 10.3758/s13423-022-02143-8.
Abstract
The Visual World Paradigm (VWP) is a powerful experimental paradigm for language research. Listeners respond to speech in a "visual world" containing potential referents of the speech. Fixations to these referents provide insight into the preliminary states of language processing as decisions unfold. The VWP has become the dominant paradigm in psycholinguistics and has been extended to every level of language, development, and disorders. Part of its impact comes from the impressive data visualizations that reveal the millisecond-by-millisecond time course of processing, and advances have been made in developing new analyses that precisely characterize this time course. All theoretical and statistical approaches make the tacit assumption that the time course of fixations is closely related to the underlying activation in the system. However, given the serial nature of fixations and their long refractory period, it is unclear how closely the observed dynamics of the fixation curves are actually coupled to the underlying dynamics of activation. I investigated this assumption with a series of simulations. Each simulation starts with a set of true underlying activation functions and generates simulated fixations using a simple stochastic sampling procedure that respects the sequential nature of fixations. I then analyzed the results to determine the conditions under which the observed fixation curves match the underlying functions, the reliability of the observed data, and the implications for Type I error and power. These simulations demonstrate that even under the simplest fixation-based models, observed fixation curves are systematically biased relative to the underlying activation functions, and they are substantially noisier, with important implications for reliability and power. I then present a potential generative model that may ultimately overcome many of these issues.
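The core simulation logic is compact enough to sketch. The code below is an illustrative reconstruction under stated assumptions, not McMurray's actual generative model: fixations are launched serially, each new fixation target is sampled in proportion to the underlying activations at launch time, and fixation durations impose a refractory period. Comparing the resulting fixation-proportion curve with the true activation function shows the bias and added noise the paper documents.

```python
# A minimal sketch (assumed curves and parameters, not the paper's model)
# of sampling serial fixations from underlying activation functions.
import numpy as np

rng = np.random.default_rng(1)
t = np.arange(0, 1500, 10)                     # time in ms, 10-ms bins

def logistic(x, mid, slope):
    return 1 / (1 + np.exp(-(x - mid) / slope))

act_target = logistic(t, 600, 120)             # underlying activations
act_comp = 0.4 * (1 - logistic(t, 700, 120))
act_other = np.maximum(1 - act_target - act_comp, 0.01)

def simulate_trial(mean_dur_ms=250):
    """One trial: each fixation's object is sampled in proportion to the
    activations at launch time; its duration blocks new samples (refractoriness)."""
    fix = np.empty(len(t), dtype=int)
    i = 0
    while i < len(t):
        p = np.array([act_target[i], act_comp[i], act_other[i]])
        obj = rng.choice(3, p=p / p.sum())
        dur = max(1, int(rng.exponential(mean_dur_ms) / 10))
        fix[i : i + dur] = obj                 # numpy clips the slice at the end
        i += dur
    return fix

trials = np.array([simulate_trial() for _ in range(300)])
observed_target = (trials == 0).mean(axis=0)   # the usual fixation-proportion curve

# The observed curve lags and flattens relative to the true activation:
print(np.round(observed_target[::30], 2))
print(np.round(act_target[::30], 2))
```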
Affiliation(s)
- Bob McMurray
- Department of Psychological and Brain Sciences, 278 PBSB, University of Iowa, Iowa City, IA, 52242, USA.
- Department of Communication Sciences and Disorders, University of Iowa, Iowa City, IA, USA.
- Department of Linguistics, University of Iowa, Iowa City, IA, USA.
- Department of Otolaryngology, University of Iowa, Iowa City, IA, USA.
5
Apfelbaum KS, Goodwin C, Blomquist C, McMurray B. The development of lexical competition in written- and spoken-word recognition. Q J Exp Psychol (Hove) 2023; 76:196-219. PMID: 35296190; PMCID: PMC10962864; DOI: 10.1177/17470218221090483.
Abstract
Efficient word recognition depends on the ability to overcome competition from overlapping words. The nature of the overlap depends on the input modality: spoken words have temporal overlap from other words that share phonemes in the same positions, whereas written words have spatial overlap from other words with letters in the same places. It is unclear how these differences in input format affect the ability to recognise a word and the types of competitors that become active while doing so. This study investigates word recognition in both modalities in children between 7 and 15 years of age. Children complete a visual-world paradigm eye-tracking task that measures competition from words with several types of overlap, using identical word lists across modalities. Results showed correlated developmental changes in the speed of target recognition in both modalities. In addition, developmental changes were seen in the efficiency of competitor suppression for some competitor types in the spoken modality. These data reveal some developmental continuity in the process of word recognition independent of modality, but also some instances of independence in how competitors are activated. Stimuli, data, and analyses from this project are available at: https://osf.io/eav72.
Affiliation(s)
- Keith S Apfelbaum
- Department of Psychological and Brain Sciences, University of Iowa, Iowa City, IA, USA
- Claire Goodwin
- Department of Psychological and Brain Sciences, University of Iowa, Iowa City, IA, USA
- Christina Blomquist
- Department of Communication Sciences and Disorders, University of Maryland, College Park, MD, USA
- Bob McMurray
- Department of Psychological and Brain Sciences, University of Iowa, Iowa City, IA, USA
- Department of Communication Sciences and Disorders, Department of Linguistics, Department of Otolaryngology, University of Iowa, Iowa City, IA, USA
6
Avcu E, Newman O, Ahlfors SP, Gow DW. Neural evidence suggests phonological acceptability judgments reflect similarity, not constraint evaluation. Cognition 2023; 230:105322. PMID: 36370613; PMCID: PMC9712273; DOI: 10.1016/j.cognition.2022.105322.
Abstract
Acceptability judgments are a primary source of evidence in formal linguistic research. Within the generative linguistic tradition, these judgments are attributed to evaluation of novel forms based on implicit knowledge of rules or constraints governing well-formedness. In the domain of phonological acceptability judgments, other factors including ease of articulation and similarity to known forms have been hypothesized to influence evaluation. We used data-driven neural techniques to identify the relative contributions of these factors. Granger causality analysis of magnetic resonance imaging (MRI)-constrained magnetoencephalography (MEG) and electroencephalography (EEG) data revealed patterns of interaction between brain regions that support explicit judgments of the phonological acceptability of spoken nonwords. Comparisons of data obtained with nonwords that varied in terms of onset consonant cluster attestation and acceptability revealed different cortical regions and effective connectivity patterns associated with phonological acceptability judgments. Attested forms produced stronger influences of brain regions implicated in lexical representation and sensorimotor simulation on acoustic-phonetic regions, whereas unattested forms produced stronger influence of phonological control mechanisms on acoustic-phonetic processing. Unacceptable forms produced widespread patterns of interaction consistent with attempted search or repair. Together, these results suggest that speakers' phonological acceptability judgments reflect lexical and sensorimotor factors.
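For readers unfamiliar with the method, the directional test underlying this effective-connectivity analysis can be demonstrated on toy data. The sketch below uses synthetic time series, nothing like the MRI-constrained MEG/EEG source estimates in the study, and simply tests whether one simulated regional signal Granger-causes another with statsmodels.

```python
# A minimal sketch of a Granger causality test on two synthetic "regional"
# time series; the paper applies this logic to MEG/EEG source estimates.
import numpy as np
from statsmodels.tsa.stattools import grangercausalitytests

rng = np.random.default_rng(4)
n = 500
x = rng.normal(size=n)                      # "source" region activity
y = np.zeros(n)                             # "target" region activity
for i in range(2, n):
    # y depends on its own past and on x's past, so x Granger-causes y
    y[i] = 0.5 * y[i - 1] + 0.4 * x[i - 2] + rng.normal(scale=0.5)

# Does the second column (x) improve prediction of the first (y)?
results = grangercausalitytests(np.column_stack([y, x]), maxlag=3)
print(results[2][0]["ssr_ftest"])           # (F, p, df_denom, df_num) at lag 2
```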
Affiliation(s)
- Enes Avcu
- Department of Neurology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, United States of America
- Olivia Newman
- Department of Neurology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, United States of America
- Seppo P Ahlfors
- Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital, Charlestown, MA, United States of America; Department of Radiology, Harvard Medical School, Boston, MA, United States of America
- David W Gow
- Department of Neurology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, United States of America; Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital, Charlestown, MA, United States of America; Department of Psychology, Salem State University, Salem, MA, United States of America; Harvard-MIT Division of Health Sciences and Technology, Cambridge, MA 02139, United States of America
7
McMurray B, Sarrett ME, Chiu S, Black AK, Wang A, Canale R, Aslin RN. Decoding the temporal dynamics of spoken word and nonword processing from EEG. Neuroimage 2022; 260:119457. PMID: 35842096; PMCID: PMC10875705; DOI: 10.1016/j.neuroimage.2022.119457.
Abstract
The efficiency of spoken word recognition is essential for real-time communication. There is consensus that this efficiency relies on an implicit process of activating multiple word candidates that compete for recognition as the acoustic signal unfolds in real time. However, few methods capture the neural basis of this dynamic competition on a millisecond-by-millisecond basis. This is crucial for understanding the neuroscience of language, and for understanding hearing, language, and cognitive disorders in people for whom current behavioral methods are not suitable. We applied machine-learning techniques to standard EEG signals to decode which word was heard on each trial and analyzed the patterns of confusion over time. Results mirrored psycholinguistic findings: early on, the decoder was equally likely to report the target (e.g., baggage) or a similar-sounding competitor (badger), but by around 500 ms, competitors were suppressed. Follow-up analyses show that this is robust across EEG systems (gel and saline), with fewer channels, and with fewer trials. Results are robust within individuals and show high reliability. This suggests a powerful and simple paradigm that can assess the neural dynamics of speech decoding, with potential applications for understanding lexical development in a variety of clinical disorders.
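The decode-and-track-confusions logic described here can be miniaturized. The sketch below is a toy reconstruction on synthetic data, not the paper's pipeline: four "words" evoke channel patterns that grow over the epoch, a cohort pair (words 0 and 1) shares its pattern early and separates late, and a classifier trained at each time window recovers the early-confusion, late-suppression profile.

```python
# A minimal sketch (synthetic EEG, assumed dimensions) of time-resolved
# word decoding with early target/cohort confusion, as described above.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(2)
n_trials, n_chan, n_times = 120, 32, 60
words = rng.integers(0, 4, n_trials)        # four words; 0 and 1 form a cohort pair

patterns = rng.normal(0, 1, (4, n_chan))    # word-specific scalp patterns
shared = patterns[:2].mean(axis=0)          # pattern shared by the cohort pair
ramp = np.linspace(0, 1, n_times)           # signal grows as the word unfolds

X = rng.normal(0, 3, (n_trials, n_chan, n_times))
for i, w in enumerate(words):
    for j in range(n_times):
        p = shared if (w < 2 and j < n_times // 2) else patterns[w]
        X[i, :, j] += ramp[j] * p

for j in range(0, n_times, 10):             # decode at successive time windows
    pred = cross_val_predict(LogisticRegression(max_iter=1000),
                             X[:, :, j], words, cv=5)
    acc = np.mean(pred == words)
    conf = np.mean(pred[words == 0] == 1)   # target "0" decoded as its cohort "1"
    print(f"window {j}: accuracy={acc:.2f}, cohort confusion={conf:.2f}")
```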
Affiliation(s)
- Bob McMurray
- Department of Psychological and Brain Sciences, Department of Communication Sciences and Disorders, Department of Linguistics, and Department of Otolaryngology, University of Iowa
- McCall E Sarrett
- Interdisciplinary Graduate Program in Neuroscience, University of Iowa
- Samantha Chiu
- Dept. of Psychological and Brain Sciences, University of Iowa
- Alexis K Black
- School of Audiology and Speech Sciences, University of British Columbia, Haskins Laboratories
- Alice Wang
- Dept. of Psychology, University of Oregon, Haskins Laboratories
- Rebecca Canale
- Dept. of Psychological Sciences, University of Connecticut, Haskins Laboratories
- Richard N Aslin
- Haskins Laboratories, Department of Psychology and Child Study Center, Yale University, Department of Psychology, University of Connecticut
8
Holt R, Bruggeman L, Demuth K. Children with hearing loss can predict during sentence processing. Cognition 2021; 212:104684. PMID: 33901882; DOI: 10.1016/j.cognition.2021.104684.
Abstract
Listeners readily anticipate upcoming sentence constituents; however, little is known about prediction when the input is suboptimal, such as for children with hearing loss (HL). Here we examined whether children with hearing aids and/or cochlear implants use semantic context to predict upcoming spoken sentence completions. We expected reduced prediction among children with HL, but found they were able to predict similarly to children with normal hearing. This suggests prediction is robust even when input quality is chronically suboptimal, and is compatible with the idea that recent advances in the management of pre-lingual HL may have minimised some of the language processing differences between children with and without HL.
Affiliation(s)
- Rebecca Holt
- Department of Linguistics, Macquarie University, Level 3 Australian Hearing Hub, 16 University Ave, NSW 2109, Australia.
- Laurence Bruggeman
- Department of Linguistics, Macquarie University, Level 3 Australian Hearing Hub, 16 University Ave, NSW 2109, Australia; The MARCS Institute for Brain, Behaviour & Development, ARC Centre of Excellence for the Dynamics of Language, Western Sydney University; Bullecourt Ave, Milperra, NSW 2214, Australia.
- Katherine Demuth
- Department of Linguistics, Macquarie University, Level 3 Australian Hearing Hub, 16 University Ave, NSW 2109, Australia.
9
Brown M, Tanenhaus MK, Dilley L. Syllable Inference as a Mechanism for Spoken Language Understanding. Top Cogn Sci 2021; 13:351-398. PMID: 33780156; DOI: 10.1111/tops.12529.
Abstract
A classic problem in spoken language comprehension is how listeners perceive speech as being composed of discrete words, given the variable time-course of information in continuous signals. We propose a syllable inference account of spoken word recognition and segmentation, according to which alternative hierarchical models of syllables, words, and phonemes are dynamically posited, which are expected to maximally predict incoming sensory input. Generative models are combined with current estimates of context speech rate drawn from neural oscillatory dynamics, which are sensitive to amplitude rises. Over time, models which result in local minima in error between predicted and recently experienced signals give rise to perceptions of hearing words. Three experiments using the visual world eye-tracking paradigm with a picture-selection task tested hypotheses motivated by this framework. Materials were sentences that were acoustically ambiguous in the numbers of syllables, words, and phonemes they contained (cf. English plural constructions, such as "saw (a) raccoon(s) swimming," which have two loci of grammatical information). Time-compressing or expanding the speech materials permitted determination of how temporal information at, or in the context of, each locus affected looks to, and selection of, pictures with a singular or plural referent (e.g., one or more than one raccoon). Supporting our account, listeners probabilistically interpreted identical chunks of speech as consistent with a singular or plural referent to a degree that was based on the chunk's gradient rate in relation to its context. We interpret these results as evidence that arriving temporal information, judged in relation to language model predictions generated from context speech rate evaluated on a continuous scale, informs inferences about syllables, thereby giving rise to perceptual experiences of understanding spoken language as words separated in time.
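The rate-normalization intuition at the center of this account can be reduced to a toy calculation. The sketch below uses invented numbers and a bare logistic link, not the authors' generative model: the same ambiguous chunk of speech supports a one-syllable (singular) or two-syllable (plural) parse depending on how many syllable-lengths it spans at the context speech rate.

```python
# A minimal sketch (toy parameters, not the authors' model) of
# rate-normalized syllable inference for an ambiguous chunk.
import numpy as np

def p_plural(chunk_ms, context_syll_per_s):
    """Probability of the richer (plural) parse, given how many
    syllable-lengths the chunk spans at the context speech rate."""
    norm_sylls = chunk_ms * context_syll_per_s / 1000.0
    # Between a 1-syllable and a 2-syllable parse, 1.5 is the tipping point.
    return 1 / (1 + np.exp(-4.0 * (norm_sylls - 1.5)))

chunk = 300.0                                # ms of ambiguous speech, "...raccoon(s)..."
for rate in (3.0, 5.0, 7.0):                 # slow, medium, fast context (syllables/s)
    print(f"context rate {rate}/s -> P(plural) = {p_plural(chunk, rate):.2f}")
```

A fast context makes the identical chunk span more syllable-lengths, pushing the inference toward the plural parse, which is the direction of the effect reported above.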
Affiliation(s)
- Meredith Brown
- Department of Brain and Cognitive Sciences, University of Rochester, Rochester, New York, USA; Department of Psychiatry and Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital, Charlestown, Massachusetts, USA; Department of Psychology, Tufts University, Medford, Massachusetts, USA
- Michael K Tanenhaus
- Department of Brain and Cognitive Sciences, University of Rochester, Rochester, New York, USA; School of Psychology, Nanjing Normal University, Nanjing, China
- Laura Dilley
- Department of Communicative Sciences and Disorders, Michigan State University, East Lansing, Michigan, USA
10
Ou J, Yu ACL, Xiang M. Individual Differences in Categorization Gradience As Predicted by Online Processing of Phonetic Cues During Spoken Word Recognition: Evidence From Eye Movements. Cogn Sci 2021; 45:e12948. PMID: 33682211; DOI: 10.1111/cogs.12948.
Abstract
Recent studies have documented substantial variability among typical listeners in how gradiently they categorize speech sounds, and this variability in categorization gradience may link to how listeners weight different cues in the incoming signal. The present study tested the relationship between categorization gradience and cue weighting across two sets of English contrasts, each varying orthogonally in two acoustic dimensions. Participants performed a four-alternative forced-choice identification task in a visual world paradigm while their eye movements were monitored. We found that (a) greater categorization gradience derived from behavioral identification responses corresponds to larger secondary cue weights derived from eye movements; (b) the relationship between categorization gradience and secondary cue weighting is observed across cues and contrasts, suggesting that categorization gradience may be a consistent within-individual property in speech perception; and (c) listeners who showed greater categorization gradience tended to adopt a buffered processing strategy, especially when cues arrived asynchronously in time.
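Both constructs in this study have simple operationalizations that can be sketched. The code below uses simulated responses and assumed cue values (VOT and f0 are stand-ins; the paper's contrasts differ): gradience is read off the slope of a fitted identification function, and cue weights off standardized logistic-regression coefficients.

```python
# A minimal sketch (simulated listener, assumed cues) of measuring
# categorization gradience and relative cue weights.
import numpy as np
from scipy.optimize import curve_fit
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)

def logistic(x, mid, slope):
    return 1 / (1 + np.exp(-slope * (x - mid)))

# Gradience: slope of the identification function along the primary cue.
vot = np.linspace(0, 60, 7)                      # continuum steps (ms)
p_id = logistic(vot, 30, 0.15)                   # a relatively gradient listener
(mid, slope), _ = curve_fit(logistic, vot, p_id, p0=[30, 0.2])
print(f"identification slope (shallower = more gradient): {slope:.3f}")

# Cue weighting: standardized coefficients from a two-cue logistic regression.
n = 2000
v = rng.uniform(0, 60, n)                        # primary cue (VOT)
f0 = rng.uniform(90, 130, n)                     # secondary cue (f0)
p = logistic(0.12 * (v - 30) + 0.04 * (f0 - 110), 0, 1.0)
resp = rng.random(n) < p                         # simulated identification responses
Xz = np.column_stack([(v - v.mean()) / v.std(), (f0 - f0.mean()) / f0.std()])
w = LogisticRegression().fit(Xz, resp).coef_[0]
print(f"VOT weight = {w[0]:.2f}, f0 weight = {w[1]:.2f}")
```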
Affiliation(s)
- Jinghua Ou
- Department of Linguistics, University of Chicago
- Alan C L Yu
- Department of Linguistics, University of Chicago
- Ming Xiang
- Department of Linguistics, University of Chicago
11
Caplan S, Hafri A, Trueswell JC. Now You Hear Me, Later You Don't: The Immediacy of Linguistic Computation and the Representation of Speech. Psychol Sci 2021; 32:410-423. PMID: 33617735; DOI: 10.1177/0956797620968787.
Abstract
What happens to an acoustic signal after it enters the mind of a listener? Previous work has demonstrated that listeners maintain intermediate representations over time. However, the internal structure of such representations, be it the acoustic-phonetic signal or more general information about the probability of possible categories, remains underspecified. We present two experiments using a novel speaker-adaptation paradigm aimed at uncovering the format of speech representations. We exposed adult listeners (N = 297) to a speaker whose utterances contained acoustically ambiguous information concerning phones (and thus words), and we manipulated the temporal availability of disambiguating cues via visually presented text (presented before or after each utterance). Results from a traditional phoneme-categorization task showed that listeners adapted to a modified acoustic distribution when disambiguating text was provided before but not after the audio. These results support the position that speech representations consist of activation over categories and are inconsistent with direct maintenance of the acoustic-phonetic signal.
Affiliation(s)
- Alon Hafri
- Department of Cognitive Science, Johns Hopkins University; Department of Psychological and Brain Sciences, Johns Hopkins University
12
Getz LM, Toscano JC. The time-course of speech perception revealed by temporally-sensitive neural measures. Wiley Interdiscip Rev Cogn Sci 2020; 12:e1541. PMID: 32767836; DOI: 10.1002/wcs.1541.
Abstract
Recent advances in cognitive neuroscience have provided a detailed picture of the early time-course of speech perception. In this review, we highlight this work, placing it within the broader context of research on the neurobiology of speech processing, and discuss how these data point us toward new models of speech perception and spoken language comprehension. We focus, in particular, on temporally-sensitive measures that allow us to directly measure early perceptual processes. Overall, the data provide support for two key principles: (a) speech perception is based on gradient representations of speech sounds and (b) speech perception is interactive and receives input from higher-level linguistic context at the earliest stages of cortical processing. Implications for models of speech processing and the neurobiology of language more broadly are discussed. This article is categorized under: Psychology > Language; Psychology > Perception and Psychophysics; Neuroscience > Cognition.
Affiliation(s)
- Laura M Getz
- Department of Psychological Sciences, University of San Diego, San Diego, California, USA
- Joseph C Toscano
- Department of Psychological and Brain Sciences, Villanova University, Villanova, Pennsylvania, USA
13
Schreiber KE, McMurray B. Listeners can anticipate future segments before they identify the current one. Atten Percept Psychophys 2019; 81:1147-1166. PMID: 31087271; PMCID: PMC6688751; DOI: 10.3758/s13414-019-01712-9.
Abstract
Speech unfolds rapidly over time, and the information necessary to recognize even a single phoneme may not be available simultaneously. Consequently, listeners must both integrate prior acoustic cues and anticipate future segments. Prior work on stop consonants and vowels suggests that listeners integrate asynchronous cues by partially activating lexical entries as soon as any information is available, and then updating this when later cues arrive. However, a recent study suggests that for the voiceless sibilant fricatives (/s/ and /ʃ/), listeners wait to initiate lexical access until all cues have arrived at the onset of the vowel. Sibilants also contain coarticulatory cues that could be used to anticipate the upcoming vowel. However, given these results, it is unclear whether listeners could use them fast enough to speed vowel recognition. The current study examines anticipation by asking when listeners use coarticulatory information in the frication to predict the upcoming vowel. A visual world paradigm experiment found that listeners do not wait: they anticipate the vowel immediately from the onset of the frication, even as they wait several hundred milliseconds to identify the fricative. This finding suggests listeners do not strictly process phonemes in the order that they appear; rather, the dynamics of language processing may be largely internal and only loosely coupled to the dynamics of the input.
Affiliation(s)
- Kayleen E Schreiber
- Interdisciplinary Graduate Program in Neuroscience, University of Iowa, Iowa City, IA, USA
- Bob McMurray
- Department of Psychological and Brain Sciences, Department of Communication Sciences and Disorders, Department of Linguistics, University of Iowa, W311 SSH, Iowa City, IA, 52242, USA.