1. Cutler A, Burchfield LA, Antoniou M. The Language-Specificity of Phonetic Adaptation to Talkers. Language and Speech 2024;67:373-400. PMID: 38054422; PMCID: PMC11141103; DOI: 10.1177/00238309231214244.
Abstract
Listeners adapt efficiently to new talkers by using lexical knowledge to resolve perceptual uncertainty. This adaptation has been widely observed, both in first (L1) and in second languages (L2). Here, adaptation was tested in both the L1 and L2 of speakers of Mandarin and English, two very dissimilar languages. A sound midway between /f/ and /s/ replacing either /f/ or /s/ in Mandarin words presented for lexical decision (e.g., bu4fa3 "illegal"; kuan1song1 "loose") prompted the expected adaptation; it induced an expanded /f/ category in phoneme categorization when it had replaced /f/, but an expanded /s/ category when it had replaced /s/. Both L1 listeners and English-native listeners with L2 Mandarin showed this effect. In English, however (with e.g., traffic; insane), we observed adaptation in L1 but not in L2; Mandarin-native listeners, despite scoring highly in the English lexical decision training, did not adapt their category boundaries for /f/ and /s/. Whether the ambiguous sound appeared syllable-initially (as in Mandarin phonology) versus word-finally (providing more word identity information) made no difference. Perceptual learning for talker adaptation is language-specific in that successful lexically guided adaptation in one language does not guarantee adaptation in other known languages; the enabling conditions for adaptation may be multiple and diverse.
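In studies like this one, adaptation is typically quantified as a shift in the category boundary of a psychometric function fitted to phoneme-categorization responses. Below is a minimal sketch of how such a boundary shift might be estimated by fitting logistic curves to each exposure group's responses; the toy response proportions and function names are our illustrative assumptions, not the authors' data or analysis code.

```python
# Sketch: quantify lexically guided adaptation as a shift in the /f/-/s/
# category boundary. Toy data; illustrative only.
import numpy as np
from scipy.optimize import curve_fit

def logistic(x, boundary, slope):
    """Psychometric function: P(/s/ response) along an /f/-/s/ continuum."""
    return 1.0 / (1.0 + np.exp(-slope * (x - boundary)))

steps = np.arange(1, 8)  # 7-step /f/-/s/ continuum
# Proportion /s/ responses after /f/-biased vs. /s/-biased exposure (toy values)
p_s_after_f_bias = np.array([0.02, 0.05, 0.10, 0.30, 0.70, 0.93, 0.98])
p_s_after_s_bias = np.array([0.05, 0.15, 0.45, 0.80, 0.95, 0.98, 0.99])

(b_f, _), _ = curve_fit(logistic, steps, p_s_after_f_bias, p0=[4, 1])
(b_s, _), _ = curve_fit(logistic, steps, p_s_after_s_bias, p0=[4, 1])

# An expanded /f/ category shows up as a boundary shifted toward the /s/ end.
print(f"boundary after /f/-biased exposure: {b_f:.2f}")
print(f"boundary after /s/-biased exposure: {b_s:.2f}")
print(f"adaptation effect (boundary shift): {b_f - b_s:.2f} steps")
```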
Affiliation(s)
- Anne Cutler
- The MARCS Institute for Brain, Behaviour and Development, Western Sydney University, Australia
- L Ann Burchfield
- The MARCS Institute for Brain, Behaviour and Development, Western Sydney University, Australia
- Mark Antoniou
- The MARCS Institute for Brain, Behaviour and Development, Western Sydney University, Australia
2. Drouin JR, Flores S. Effects of training length on adaptation to noise-vocoded speech. The Journal of the Acoustical Society of America 2024;155:2114-2127. PMID: 38488452; DOI: 10.1121/10.0025273.
Abstract
Listeners show rapid perceptual learning of acoustically degraded speech, though the amount of exposure required to maximize speech adaptation is unspecified. The current work used a single-session design to examine how the length of auditory training affects perceptual learning in normal-hearing listeners exposed to eight-channel noise-vocoded speech. Participants completed short, medium, or long training using a two-alternative forced-choice sentence identification task with feedback. To assess learning and generalization, a 40-trial pre-test and post-test transcription task was administered using trained and novel sentences. Training results showed that all groups performed near ceiling, with no reliable differences. For the test data, we evaluated changes in transcription accuracy using separate linear mixed models for trained and novel sentences. In both models, we observed a significant improvement in transcription at post-test relative to pre-test. Critically, the three training groups did not differ in the magnitude of improvement following training. A subsequent Bayes factor analysis evaluating the test-by-group interaction provided strong evidence in support of the null hypothesis. For these stimuli and this procedure, the results suggest that increased training does not necessarily maximize learning outcomes; both passive and trained experience likely supported adaptation. These findings may contribute to rehabilitation recommendations for listeners adapting to degraded speech signals.
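For readers unfamiliar with the manipulation, noise vocoding divides speech into frequency bands, extracts each band's amplitude envelope, and uses the envelopes to modulate band-limited noise, preserving temporal cues while degrading spectral detail. Below is a minimal eight-channel vocoder sketch; the band edges, filter settings, and normalization are illustrative assumptions, not the study's actual signal chain.

```python
# Sketch of an eight-channel noise vocoder: filter speech into bands, extract
# each band's amplitude envelope, and impose it on band-limited noise.
# Band edges and filter orders are illustrative choices.
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def noise_vocode(speech, fs, n_channels=8, lo=100.0, hi=8000.0):
    rng = np.random.default_rng(0)
    noise = rng.standard_normal(len(speech))
    # Logarithmically spaced band edges between lo and hi Hz
    edges = np.geomspace(lo, hi, n_channels + 1)
    out = np.zeros_like(speech, dtype=float)
    for low, high in zip(edges[:-1], edges[1:]):
        sos = butter(4, [low, high], btype="bandpass", fs=fs, output="sos")
        band = sosfiltfilt(sos, speech)
        envelope = np.abs(hilbert(band))       # amplitude envelope of the band
        carrier = sosfiltfilt(sos, noise)      # noise limited to the same band
        out += envelope * carrier              # envelope-modulated noise band
    return out / (np.max(np.abs(out)) + 1e-9)  # normalize to avoid clipping

# Example: vocode one second of a 440 Hz tone sampled at 16 kHz
fs = 16000
t = np.arange(fs) / fs
vocoded = noise_vocode(np.sin(2 * np.pi * 440 * t), fs)
```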
Affiliation(s)
- Julia R Drouin
- Division of Speech and Hearing Sciences, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, USA
- Stephany Flores
- Department of Communication Sciences and Disorders, California State University Fullerton, Fullerton, California 92831, USA
3. Drouin JR, Rojas JA. Influence of face masks on recalibration of phonetic categories. Atten Percept Psychophys 2023;85:2700-2717. PMID: 37188863; PMCID: PMC10185375; DOI: 10.3758/s13414-023-02715-3.
Abstract
Previous research demonstrates that listeners dynamically adjust phonetic categories in line with lexical context. While listeners show flexibility in adapting speech categories, recalibration may be constrained when variability can be attributed to an external source. It has been hypothesized that when listeners attribute atypical speech input to a causal factor, phonetic recalibration is attenuated. The current study investigated this theory directly by examining the influence of face masks, an external factor that affects both visual and articulatory cues, on the magnitude of phonetic recalibration. Across four experiments, listeners completed a lexical decision exposure phase in which they heard an ambiguous sound in either /s/-biasing or /ʃ/-biasing lexical contexts while simultaneously viewing a speaker with a mask off, a mask on the chin, or a mask over the mouth. Following exposure, all listeners completed an auditory phonetic categorization test along an /ʃ/-/s/ continuum. In Experiment 1 (no face mask during exposure trials), Experiment 2 (mask on the chin), Experiment 3 (mask over the mouth during ambiguous items), and Experiment 4 (mask over the mouth during the entire exposure phase), listeners showed a robust and equivalent phonetic recalibration effect. Recalibration manifested as a greater proportion of /s/ responses for listeners in the /s/-biased exposure group relative to listeners in the /ʃ/-biased exposure group. Results support the notion that listeners do not causally attribute speech idiosyncrasies to face masks, which may reflect a general adjustment in speech learning during the COVID-19 pandemic.
Affiliation(s)
- Julia R Drouin
- Division of Speech and Hearing Sciences, University of North Carolina School of Medicine, Chapel Hill, NC, USA
- Department of Communication Sciences and Disorders, California State University Fullerton, Fullerton, CA, USA
- Jose A Rojas
- Department of Communication Sciences and Disorders, California State University Fullerton, Fullerton, CA, USA
4. Luthra S, Magnuson JS, Myers EB. Right Posterior Temporal Cortex Supports Integration of Phonetic and Talker Information. Neurobiology of Language 2023;4:145-177. PMID: 37229142; PMCID: PMC10205075; DOI: 10.1162/nol_a_00091.
Abstract
Though the right hemisphere has been implicated in talker processing, it is thought to play a minimal role in phonetic processing, at least relative to the left hemisphere. Recent evidence suggests that the right posterior temporal cortex may support learning of phonetic variation associated with a specific talker. In the current study, listeners heard a male talker and a female talker, one of whom produced an ambiguous fricative in /s/-biased lexical contexts (e.g., epi?ode) and one of whom produced it in /∫/-biased contexts (e.g., friend?ip). Listeners in a behavioral experiment (Experiment 1) showed evidence of lexically guided perceptual learning, categorizing ambiguous fricatives in line with their previous experience. Listeners in an fMRI experiment (Experiment 2) showed differential phonetic categorization as a function of talker, allowing for an investigation of the neural basis of talker-specific phonetic processing, though they did not exhibit perceptual learning (likely due to characteristics of our in-scanner headphones). Searchlight analyses revealed that the patterns of activation in the right superior temporal sulcus (STS) contained information about who was talking and what phoneme they produced. We take this as evidence that talker information and phonetic information are integrated in the right STS. Functional connectivity analyses suggested that the process of conditioning phonetic identity on talker information depends on the coordinated activity of a left-lateralized phonetic processing system and a right-lateralized talker processing system. Overall, these results clarify the mechanisms through which the right hemisphere supports talker-specific phonetic processing.
Affiliation(s)
- Sahil Luthra
- Department of Psychological Sciences, University of Connecticut, Storrs, CT, USA
- James S. Magnuson
- Department of Psychological Sciences, University of Connecticut, Storrs, CT, USA
- Basque Center on Cognition, Brain and Language (BCBL), Donostia-San Sebastián, Spain
- Ikerbasque, Basque Foundation for Science, Bilbao, Spain
- Emily B. Myers
- Department of Psychological Sciences, University of Connecticut, Storrs, CT, USA
- Speech, Language, and Hearing Sciences, University of Connecticut, Storrs, CT, USA
5. Hearing is believing: Lexically guided perceptual learning is graded to reflect the quantity of evidence in speech input. Cognition 2023;235:105404. PMID: 36812836; DOI: 10.1016/j.cognition.2023.105404.
Abstract
There is wide variability in the acoustic patterns that are produced for a given linguistic message, including variability that is conditioned on who is speaking. Listeners solve this lack of invariance problem, at least in part, by dynamically modifying the mapping to speech sounds in response to structured variation in the input. Here we test a primary tenet of the ideal adapter framework of speech adaptation, which posits that perceptual learning reflects the incremental updating of cue-sound mappings to incorporate observed evidence with prior beliefs. Our investigation draws on the influential lexically guided perceptual learning paradigm. During an exposure phase, listeners heard a talker who produced fricative energy ambiguous between /ʃ/ and /s/. Lexical context differentially biased interpretation of the ambiguity as either /s/ or /ʃ/, and, across two behavioral experiments (n = 500), we manipulated the quantity of evidence and the consistency of evidence that was provided during exposure. Following exposure, listeners categorized tokens from an ashi-asi continuum to assess learning. The ideal adapter framework was formalized through computational simulations, which predicted that learning would be graded to reflect the quantity, but not the consistency, of the exposure input. These predictions were upheld in human listeners; the magnitude of the learning effect monotonically increased given exposure to four, 10, or 20 critical productions, and there was no evidence that learning differed given consistent versus inconsistent exposure. These results (1) provide support for a primary tenet of the ideal adapter framework, (2) establish quantity of evidence as a key determinant of adaptation in human listeners, and (3) provide critical evidence that lexically guided perceptual learning is not a binary outcome. In doing so, the current work provides foundational knowledge to support theoretical advances that consider perceptual learning as a graded outcome that is tightly linked to input statistics in the speech stream.
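The core computation of the ideal adapter can be illustrated with a conjugate normal-normal update: the listener's prior belief about the talker's /s/ cue distribution is combined with lexically labeled exposure tokens, so the posterior shifts monotonically with the number of critical productions. The simulation below is a minimal sketch with made-up parameter values, not a reproduction of the paper's computational simulations.

```python
# Minimal ideal-adapter-style simulation: the listener's belief about a
# talker's /s/ cue mean is updated by conjugate normal-normal inference.
# Parameter values are illustrative.
import numpy as np

prior_mean, prior_var = 6.0, 1.0   # prior belief: typical /s/ spectral cue
noise_var = 4.0                    # assumed trial-to-trial cue variability
ambiguous_cue = 3.0                # this talker's atypical /s/ productions

def posterior_mean(n_tokens):
    """Posterior mean after n_tokens lexically labeled /s/ productions."""
    precision = 1 / prior_var + n_tokens / noise_var
    weighted = prior_mean / prior_var + n_tokens * ambiguous_cue / noise_var
    return weighted / precision

for n in (4, 10, 20):
    shift = prior_mean - posterior_mean(n)
    print(f"n = {n:2d} critical productions -> category shift {shift:.2f}")
# The shift grows monotonically with quantity of evidence, mirroring the
# graded learning the paper reports in human listeners.
```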
6. Perceptual learning of multiple talkers: Determinants, characteristics, and limitations. Atten Percept Psychophys 2022;84:2335-2359. PMID: 36076119; DOI: 10.3758/s13414-022-02556-6.
Abstract
Research suggests that listeners simultaneously update talker-specific generative models to reflect structured phonetic variation. Because past investigations exposed listeners to talkers of different genders, it is unknown whether adaptation is talker specific or rather linked to a broader sociophonetic class. Here, we test determinants of listeners' ability to update and apply talker-specific models for speech perception. In six experiments (n = 480), listeners were first exposed to the speech of two talkers who produced ambiguous fricative energy. The talkers' speech was interleaved during exposure, and lexical context differentially biased interpretation of the ambiguity as either /s/ or /ʃ/ for each talker. At test, listeners categorized tokens from ashi-asi continua, one for each talker. Across conditions and experiments, we manipulated exposure quantity, talker gender, blocked versus interleaved talker structure at test, and the degree to which fricative acoustics differed between talkers. When test was blocked by talker, learning was observed for different- but not same-gender talkers. When talkers were interleaved at test, learning was observed for both different- and same-gender talkers, though it was attenuated when fricative acoustics were constant across talkers. There was no strong evidence to suggest that adaptation to multiple talkers required more exposure than adaptation to a single talker. These results suggest that perceptual learning for speech is achieved via a mechanism that represents a context-dependent, cumulative integration of experience with speech input, and they identify critical constraints on listeners' ability to dynamically apply multiple generative models in mixed-talker listening environments.
7. Drouin JR, Theodore RM. Many tasks, same outcome: Role of training task on learning and maintenance of noise-vocoded speech. The Journal of the Acoustical Society of America 2022;152:981. PMID: 36050170; PMCID: PMC9553285; DOI: 10.1121/10.0013507.
Abstract
Listeners who use cochlear implants show variability in speech recognition. Research suggests that structured auditory training can improve speech recognition outcomes in cochlear implant users, and a central goal in the rehabilitation literature is to identify factors that maximize training benefits. Here, we examined factors that may influence perceptual learning for noise-vocoded speech in normal-hearing listeners as a foundational step towards clinical recommendations. Three groups of listeners were exposed to anomalous noise-vocoded sentences and completed one of three training tasks: transcription with feedback, transcription without feedback, or talker identification. Listeners completed a word transcription test at three time points: immediately before training, immediately after training, and one week following training. Accuracy at test was indexed by keyword accuracy at the sentence-initial and sentence-final positions for high- and low-predictability noise-vocoded sentences. Following training, listeners showed improved transcription for both sentence-initial and sentence-final items, and for both low- and high-predictability sentences. The training groups showed robust and equivalent learning of noise-vocoded sentences immediately after training. Critically, these gains were largely maintained, and maintained equivalently across training groups, one week later. These results converge with evidence pointing towards the utility of non-traditional training tasks for maximizing perceptual learning of noise-vocoded speech.
Affiliation(s)
- Julia R Drouin
- Department of Communication Sciences and Disorders, California State University Fullerton, Fullerton, California 92831, USA
- Rachel M Theodore
- Department of Speech, Language, and Hearing Sciences, University of Connecticut, Storrs, Connecticut 06269, USA
8. Luthra S, Mechtenberg H, Myers EB. Perceptual learning of multiple talkers requires additional exposure. Atten Percept Psychophys 2021;83:2217-2228. PMID: 33754298; PMCID: PMC8217155; DOI: 10.3758/s13414-021-02261-w.
Abstract
Because different talkers produce their speech sounds differently, listeners benefit from maintaining distinct generative models (sets of beliefs) about the correspondence between acoustic information and phonetic categories for different talkers. A robust literature on phonetic recalibration indicates that when listeners encounter a talker who produces their speech sounds idiosyncratically (e.g., a talker who produces their /s/ sound atypically), they can update their generative model for that talker. Such recalibration has been shown to occur in a relatively talker-specific way. Because listeners in ecological situations often meet several new talkers at once, the present study considered how the process of simultaneously updating two distinct generative models compares to updating one model at a time. Listeners were exposed to two talkers, one who produced /s/ atypically and one who produced /∫/ atypically. Critically, these talkers only produced these sounds in contexts where lexical information disambiguated the phoneme's identity (e.g., epi_ode, flouri_ing). When initial exposure to the two talkers was blocked by voice (Experiment 1), listeners recalibrated to these talkers after relatively little exposure to each talker (32 instances per talker, of which 16 contained ambiguous fricatives). However, when the talkers were intermixed during learning (Experiment 2), listeners required more exposure trials before they were able to adapt to the idiosyncratic productions of these talkers (64 instances per talker, of which 32 contained ambiguous fricatives). Results suggest that there is a perceptual cost to simultaneously updating multiple distinct generative models, potentially because listeners must first select which generative model to update.
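One way to picture "maintaining distinct generative models" is as a set of per-talker category beliefs that are selected by talker identity before being updated. The sketch below uses a toy incremental update rule (our simplification; the literature typically frames this as Bayesian belief updating) to show how interleaved exposure can push two talkers' models in opposite directions.

```python
# Sketch: talker-specific generative models as per-talker category beliefs.
# A listener keyed on talker identity updates only that talker's model, so
# one talker's atypical /s/ need not contaminate another talker's model.
# Illustrative parameterization only.
from dataclasses import dataclass, field

@dataclass
class TalkerModel:
    # Mean spectral cue believed for each fricative category
    category_means: dict = field(default_factory=lambda: {"s": 6.0, "sh": 2.0})

    def categorize(self, cue):
        # Choose the category whose believed mean is closest to the cue
        return min(self.category_means,
                   key=lambda c: abs(self.category_means[c] - cue))

    def update(self, category, cue, rate=0.2):
        # Nudge the believed mean toward the lexically disambiguated token
        m = self.category_means[category]
        self.category_means[category] = m + rate * (cue - m)

models = {"talker_A": TalkerModel(), "talker_B": TalkerModel()}

# Lexical context says talker A's ambiguous cue (4.0) was /s/ and
# talker B's was /sh/; only the relevant talker's model is updated.
for _ in range(16):
    models["talker_A"].update("s", 4.0)
    models["talker_B"].update("sh", 4.0)

print(models["talker_A"].categorize(4.0))  # -> 's'  (A's /s/ expanded)
print(models["talker_B"].categorize(4.0))  # -> 'sh' (B's /sh/ expanded)
```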
Affiliation(s)
- Sahil Luthra
- Department of Psychological Sciences, University of Connecticut, Storrs, CT, USA
- The Connecticut Institute for the Brain and Cognitive Sciences, Storrs, CT, USA
- Hannah Mechtenberg
- Department of Psychological Sciences, University of Connecticut, Storrs, CT, USA
- Emily B Myers
- Department of Psychological Sciences, University of Connecticut, Storrs, CT, USA
- The Connecticut Institute for the Brain and Cognitive Sciences, Storrs, CT, USA
- Department of Speech, Language and Hearing Sciences, University of Connecticut, Storrs, CT, USA
9. Luthra S, Magnuson JS, Myers EB. Boosting lexical support does not enhance lexically guided perceptual learning. J Exp Psychol Learn Mem Cogn 2021;47:685-704. PMID: 33983786; PMCID: PMC8287971; DOI: 10.1037/xlm0000945.
Abstract
A challenge for listeners is to learn the appropriate mapping between acoustics and phonetic categories for an individual talker. Lexically guided perceptual learning (LGPL) studies have shown that listeners can leverage lexical knowledge to guide this process. For instance, listeners learn to interpret ambiguous /s/-/∫/ blends as /s/ if they have previously encountered them in /s/-biased contexts like epi?ode. Here, we examined whether the degree of preceding lexical support might modulate the extent of perceptual learning. In Experiment 1, we first demonstrated that perceptual learning could be obtained in a modified LGPL paradigm where listeners were first biased to interpret ambiguous tokens as one phoneme (e.g., /s/) and then later as another (e.g., /∫/). In subsequent experiments, we tested whether the extent of learning differed depending on whether targets encountered predictive contexts or neutral contexts prior to the auditory target (e.g., epi?ode). Experiment 2 used auditory sentence contexts (e.g., "I love The Walking Dead and eagerly await every new . . ."), whereas Experiment 3 used written sentence contexts. In Experiment 4, participants did not receive sentence contexts but rather saw the written form of the target word (episode) or filler text (########) prior to hearing the critical auditory token. While we consistently observed effects of context on in-the-moment processing of critical words, the size of the learning effect was not modulated by the type of context. We hypothesize that boosting lexical support through preceding context may not strongly influence perceptual learning when ambiguous speech sounds can be identified solely from lexical information.
10. Giovannone N, Theodore RM. Individual Differences in Lexical Contributions to Speech Perception. Journal of Speech, Language, and Hearing Research 2021;64:707-724. PMID: 33606960; PMCID: PMC8608212; DOI: 10.1044/2020_jslhr-20-00283.
Abstract
Purpose: The extant literature suggests that individual differences in speech perception can be linked to broad receptive language phenotype. For example, a recent study found that individuals with a smaller receptive vocabulary showed diminished lexically guided perceptual learning compared to individuals with a larger receptive vocabulary. Here, we examined (a) whether such individual differences stem from variation in reliance on lexical information or variation in perceptual learning itself and (b) whether a relationship exists between lexical recruitment and lexically guided perceptual learning more broadly, as predicted by current models of lexically guided perceptual learning.
Method: In Experiment 1, adult participants (n = 70) completed measures of receptive and expressive language ability, lexical recruitment, and lexically guided perceptual learning. In Experiment 2, adult participants (n = 120) completed the same lexical recruitment and lexically guided perceptual learning tasks to provide a high-powered replication of the primary findings from Experiment 1.
Results: In Experiment 1, individuals with weaker receptive language ability showed increased lexical recruitment relative to individuals with higher receptive language ability; however, receptive language ability did not predict the magnitude of lexically guided perceptual learning. Moreover, the results of both experiments converged to show no evidence indicating a relationship between lexical recruitment and lexically guided perceptual learning.
Conclusion: The current findings suggest that (a) individuals with weaker language ability demonstrate increased reliance on lexical information for speech perception compared to those with stronger receptive language ability; (b) individuals with weaker language ability maintain an intact perceptual learning mechanism; and (c) to the degree that the measures used here accurately capture individual differences in lexical recruitment and lexically guided perceptual learning, there is no graded relationship between these two constructs.
Affiliation(s)
- Nikole Giovannone
- Department of Speech, Language and Hearing Sciences, University of Connecticut, Storrs
- Connecticut Institute for the Brain and Cognitive Sciences, University of Connecticut, Storrs
- Rachel M. Theodore
- Department of Speech, Language and Hearing Sciences, University of Connecticut, Storrs
- Connecticut Institute for the Brain and Cognitive Sciences, University of Connecticut, Storrs
11. Caplan S, Hafri A, Trueswell JC. Now You Hear Me, Later You Don't: The Immediacy of Linguistic Computation and the Representation of Speech. Psychol Sci 2021;32:410-423. PMID: 33617735; DOI: 10.1177/0956797620968787.
Abstract
What happens to an acoustic signal after it enters the mind of a listener? Previous work has demonstrated that listeners maintain intermediate representations over time. However, the internal structure of such representations (be they the acoustic-phonetic signal or more general information about the probability of possible categories) remains underspecified. We present two experiments using a novel speaker-adaptation paradigm aimed at uncovering the format of speech representations. We exposed adult listeners (N = 297) to a speaker whose utterances contained acoustically ambiguous information concerning phones (and thus words), and we manipulated the temporal availability of disambiguating cues via visually presented text (presented before or after each utterance). Results from a traditional phoneme-categorization task showed that listeners adapted to a modified acoustic distribution when disambiguating text was provided before but not after the audio. These results support the position that speech representations consist of activation over categories and are inconsistent with direct maintenance of the acoustic-phonetic signal.
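The claim that listeners maintain "activation over categories" rather than the raw signal can be made concrete with a toy Bayesian listener: if disambiguating text arrives before the audio, it can act as a prior when the ambiguous token is encoded; if the token has already been reduced to a committed category posterior, later text has no acoustic detail left to reinterpret. The numbers and category parameters below are illustrative assumptions, not the authors' model.

```python
# Toy illustration: a listener who keeps only a posterior over categories
# (discarding acoustic detail) benefits from a disambiguating cue only when
# it arrives before the percept is committed. Illustrative values only.
import numpy as np

def category_posterior(cue, prior_s=0.5, mean_s=6.0, mean_sh=2.0, sd=1.5):
    """P(/s/ | cue) under two Gaussian category likelihoods."""
    def gauss(x, mu):
        return np.exp(-0.5 * ((x - mu) / sd) ** 2)
    like_s, like_sh = gauss(cue, mean_s), gauss(cue, mean_sh)
    return (prior_s * like_s) / (prior_s * like_s + (1 - prior_s) * like_sh)

ambiguous_cue = 4.0  # midway between the category means

# Text BEFORE audio: the label acts as a strong prior at encoding, so the
# ambiguous token is committed as a confident /s/ percept -> adaptation.
before = category_posterior(ambiguous_cue, prior_s=0.95)

# Text AFTER audio: encoding used a neutral prior; only the committed 50/50
# posterior survives, and the acoustic detail needed to re-encode the token
# under the label is gone -> no adaptation.
after = category_posterior(ambiguous_cue, prior_s=0.5)

print(f"P(/s/) committed with text first: {before:.2f}")
print(f"P(/s/) committed with text after: {after:.2f}")
```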
Affiliation(s)
- Alon Hafri
- Department of Cognitive Science, Johns Hopkins University
- Department of Psychological and Brain Sciences, Johns Hopkins University
12. Luthra S, Correia JM, Kleinschmidt DF, Mesite L, Myers EB. Lexical Information Guides Retuning of Neural Patterns in Perceptual Learning for Speech. J Cogn Neurosci 2020;32:2001-2012. PMID: 32662731; PMCID: PMC8048099; DOI: 10.1162/jocn_a_01612.
Abstract
A listener's interpretation of a given speech sound can vary probabilistically from moment to moment. Previous experience (i.e., the contexts in which one has encountered an ambiguous sound) can further influence the interpretation of speech, a phenomenon known as perceptual learning for speech. This study used multivoxel pattern analysis to query how neural patterns reflect perceptual learning, leveraging archival fMRI data from a lexically guided perceptual learning study conducted by Myers and Mesite [Myers, E. B., & Mesite, L. M. Neural systems underlying perceptual adjustment to non-standard speech tokens. Journal of Memory and Language, 76, 80-93, 2014]. In that study, participants first heard ambiguous /s/-/∫/ blends in either /s/-biased lexical contexts (epi_ode) or /∫/-biased contexts (refre_ing); subsequently, they performed a phonetic categorization task on tokens from an /asi/-/a∫i/ continuum. In the current work, a classifier was trained to distinguish between phonetic categorization trials in which participants heard unambiguous productions of /s/ and those in which they heard unambiguous productions of /∫/. The classifier was able to generalize this training to ambiguous tokens from the middle of the continuum on the basis of individual participants' trial-by-trial perception. We take these findings as evidence that perceptual learning for speech involves neural recalibration, such that the pattern of activation approximates the perceived category. Exploratory analyses showed that left parietal regions (supramarginal and angular gyri) and right temporal regions (superior, middle, and transverse temporal gyri) were most informative for categorization. Overall, our results inform an understanding of how moment-to-moment variability in speech perception is encoded in the brain.
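The decoding logic of the study can be sketched as follows: train a classifier on multivoxel patterns from unambiguous /s/ and /∫/ trials, then test whether its predictions for ambiguous mid-continuum trials track each participant's reported percept. The code below runs this pipeline on synthetic "voxel" data with scikit-learn; the data generation and parameters are our stand-ins, not the archival fMRI analysis.

```python
# Sketch of the MVPA logic: train on neural patterns from unambiguous
# /s/ and /sh/ trials, then ask whether the classifier generalizes to
# ambiguous tokens in line with each trial's reported percept.
# Synthetic data stand in for voxel patterns.
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(1)
n_voxels = 50
pattern_s = rng.standard_normal(n_voxels)   # idealized /s/ activation pattern
pattern_sh = rng.standard_normal(n_voxels)  # idealized /sh/ activation pattern

def trials(pattern, n=40, noise=2.0):
    # Noisy trial-level realizations of an underlying activation pattern
    return pattern + noise * rng.standard_normal((n, n_voxels))

# Training set: unambiguous endpoint trials with known labels
X_train = np.vstack([trials(pattern_s), trials(pattern_sh)])
y_train = np.array(["s"] * 40 + ["sh"] * 40)
clf = LinearSVC().fit(X_train, y_train)

# Ambiguous trials: simulate perception-dependent patterns from the middle
# of the continuum (a mixture leaning toward the perceived category).
perceived = rng.choice(["s", "sh"], size=40)
X_ambig = np.vstack([
    trials(0.65 * (pattern_s if p == "s" else pattern_sh)
           + 0.35 * (pattern_sh if p == "s" else pattern_s), n=1)
    for p in perceived
])

# If neural patterns track the percept, decoding should beat chance (0.5).
accuracy = np.mean(clf.predict(X_ambig) == perceived)
print(f"decoding accuracy on ambiguous trials: {accuracy:.2f}")
```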
Affiliation(s)
- João M Correia
- University of Algarve
- Basque Center on Cognition, Brain and Language
- Laura Mesite
- MGH Institute of Health Professions
- Harvard Graduate School of Education
13. Liu L, Jaeger TF. Talker-specific pronunciation or speech error? Discounting (or not) atypical pronunciations during speech perception. J Exp Psychol Hum Percept Perform 2019;45:1562-1588. PMID: 31750716; DOI: 10.1037/xhp0000693.
Abstract
Perceptual recalibration allows listeners to adapt to talker-specific pronunciations, such as atypical realizations of specific sounds. Such recalibration can facilitate robust speech recognition. However, indiscriminate recalibration following any atypically pronounced word also risks treating pronunciations that are in reality due to incidental, short-lived factors (such as a speech error) as characteristic of a talker. We investigate whether the mechanisms underlying perceptual recalibration involve inferences about the causes of unexpected pronunciations. In five experiments, we ask whether perceptual recalibration is blocked if the atypical pronunciations of an unfamiliar talker can also be attributed to other incidental causes. We investigated three types of incidental causes for atypical pronunciations: the talker is intoxicated, the talker speaks unusually fast, or the atypical pronunciations occur only in the context of tongue twisters. In all five experiments, we find robust evidence for perceptual recalibration, but little evidence that the presence of incidental causes blocks it. We discuss these results in light of other recent findings that incidental causes can block perceptual recalibration.
Affiliation(s)
- Linda Liu
- Department of Brain and Cognitive Sciences, University of Rochester
- T Florian Jaeger
- Department of Brain and Cognitive Sciences, University of Rochester