1. Drouin JR, Rojas JA. Influence of face masks on recalibration of phonetic categories. Atten Percept Psychophys 2023;85:2700-2717. PMID: 37188863; PMCID: PMC10185375; DOI: 10.3758/s13414-023-02715-3.
Abstract
Previous research demonstrates that listeners dynamically adjust phonetic categories in line with lexical context. While listeners show flexibility in adapting speech categories, recalibration may be constrained when variability can be attributed externally. It has been hypothesized that when listeners attribute atypical speech input to a causal factor, phonetic recalibration is attenuated. The current study investigated this theory directly by examining the influence of face masks, an external factor that affects both visual and articulatory cues, on the magnitude of phonetic recalibration. Across four experiments, listeners completed a lexical decision exposure phase in which they heard an ambiguous sound in either /s/-biasing or /ʃ/-biasing lexical contexts, while simultaneously viewing a speaker with a mask off, mask on the chin, or mask over the mouth. Following exposure, all listeners completed an auditory phonetic categorization test along an /ʃ/-/s/ continuum. In Experiment 1 (when no face mask was present during exposure trials), Experiment 2 (when the face mask was on the chin), Experiment 3 (when the face mask was on the mouth during ambiguous items), and Experiment 4 (when the face mask was on the mouth during the entire exposure phase), listeners showed a robust and equivalent phonetic recalibration effect. Recalibration manifested as a greater proportion of /s/ responses for listeners in the /s/-biased exposure group, relative to listeners in the /ʃ/-biased exposure group. Results support the notion that listeners do not causally attribute speech idiosyncrasies to face masks, which may reflect a general speech learning adjustment during the COVID-19 pandemic.
Affiliation(s)
- Julia R Drouin
- Division of Speech and Hearing Sciences, University of North Carolina School of Medicine, Chapel Hill, NC, USA.
- Department of Communication Sciences and Disorders, California State University Fullerton, Fullerton, CA, USA.
- Jose A Rojas
- Department of Communication Sciences and Disorders, California State University Fullerton, Fullerton, CA, USA.
2. Xie X, Jaeger TF, Kurumada C. What we do (not) know about the mechanisms underlying adaptive speech perception: A computational framework and review. Cortex 2023;166:377-424. PMID: 37506665; DOI: 10.1016/j.cortex.2023.05.003.
Abstract
Speech from unfamiliar talkers can be difficult to comprehend initially. These difficulties tend to dissipate with exposure, sometimes within minutes or less. Adaptivity in response to unfamiliar input is now considered a fundamental property of speech perception, and research over the past two decades has made substantial progress in identifying its characteristics. The mechanisms underlying adaptive speech perception, however, remain unknown. Past work has attributed facilitatory effects of exposure to any one of three qualitatively different hypothesized mechanisms: (1) low-level, pre-linguistic signal normalization, (2) changes in/selection of linguistic representations, or (3) changes in post-perceptual decision-making. Direct comparisons of these hypotheses, or combinations thereof, have been lacking. We describe a general computational framework for adaptive speech perception (ASP) that, for the first time, implements all three mechanisms. We demonstrate how the framework can be used to derive predictions for experiments on perception from the acoustic properties of the stimuli. Using this approach, we find that, at the level of data analysis presently employed by most studies in the field, the signature results of influential experimental paradigms do not distinguish between the three mechanisms. This highlights the need for a change in research practices, so that future experiments provide more informative results. We recommend specific changes to experimental paradigms and data analysis. All data and code for this study are shared via OSF, including the R markdown document that this article is generated from, and an R library that implements the models we present.
Affiliation(s)
- Xin Xie
- Language Science, University of California, Irvine, USA.
- T Florian Jaeger
- Brain and Cognitive Sciences, University of Rochester, Rochester, NY, USA.
- Computer Science, University of Rochester, Rochester, NY, USA.
- Chigusa Kurumada
- Brain and Cognitive Sciences, University of Rochester, Rochester, NY, USA.
3. Attentional resources contribute to the perceptual learning of talker idiosyncrasies in audiovisual speech. Atten Percept Psychophys 2019;81:1006-1019. PMID: 30684204; DOI: 10.3758/s13414-018-01651-x.
Abstract
To recognize audiovisual speech, listeners evaluate and combine information obtained from the auditory and visual modalities. Listeners also use information from one modality to adjust their phonetic categories to a talker's idiosyncrasy encountered in the other modality. In this study, we examined whether the outcome of this cross-modal recalibration relies on attentional resources. In a standard recalibration experiment in Experiment 1, participants heard an ambiguous sound, disambiguated by the accompanying visual speech as either /p/ or /t/. Participants' primary task was to attend to the audiovisual speech while either monitoring a tone sequence for a target tone or ignoring the tones. Listeners subsequently categorized the steps of an auditory /p/-/t/ continuum more often in line with their exposure. The aftereffect of phonetic recalibration was reduced, but not eliminated, by attentional load during exposure. In Experiment 2, participants saw an ambiguous visual speech gesture that was disambiguated auditorily as either /p/ or /t/. At test, listeners categorized the steps of a visual /p/-/t/ continuum more often in line with the prior exposure. Imposing load in the auditory modality during exposure did not reduce the aftereffect of this type of cross-modal phonetic recalibration. Together, these results suggest that auditory attentional resources are needed for the processing of auditory speech and/or for the shifting of auditory phonetic category boundaries. Listeners thus need to dedicate attentional resources in order to accommodate talker idiosyncrasies in audiovisual speech.
4. Modelska M, Pourquié M, Baart M. No "Self" Advantage for Audiovisual Speech Aftereffects. Front Psychol 2019;10:658. PMID: 30967827; PMCID: PMC6440388; DOI: 10.3389/fpsyg.2019.00658.
Abstract
Although the default state of the world is that we see and hear other people talking, there is evidence that seeing and hearing ourselves rather than someone else may lead to visual (i.e., lip-read) or auditory "self" advantages. We assessed whether there is a "self" advantage for phonetic recalibration (a lip-read driven cross-modal learning effect) and selective adaptation (a contrastive effect in the opposite direction of recalibration). We observed both aftereffects as well as an on-line effect of lip-read information on auditory perception (i.e., immediate capture), but there was no evidence for a "self" advantage in any of the tasks (as additionally supported by Bayesian statistics). These findings strengthen the emerging notion that recalibration reflects a general learning mechanism, and bolster the argument that adaptation depends on rather low-level auditory/acoustic features of the speech signal.
Affiliation(s)
- Maria Modelska
- BCBL – Basque Center on Cognition, Brain and Language, Donostia, Spain.
- Marie Pourquié
- BCBL – Basque Center on Cognition, Brain and Language, Donostia, Spain.
- UPPA, IKER (UMR5478), Bayonne, France.
- Martijn Baart
- BCBL – Basque Center on Cognition, Brain and Language, Donostia, Spain.
- Department of Cognitive Neuropsychology, Tilburg University, Tilburg, Netherlands.
5. Romanovska L, Janssen R, Bonte M. Reading-Induced Shifts in Speech Perception in Dyslexic and Typically Reading Children. Front Psychol 2019;10:221. PMID: 30792685; PMCID: PMC6374624; DOI: 10.3389/fpsyg.2019.00221.
Abstract
One of the proposed mechanisms underlying reading difficulties observed in developmental dyslexia is impaired mapping of visual to auditory speech representations. We investigate these mappings in 20 typically reading children and 20 children with dyslexia, aged 8–10 years, using text-based recalibration. In this paradigm, the pairing of visual text and ambiguous speech sounds shifts (recalibrates) the participant’s perception of the ambiguous speech in subsequent auditory-only post-test trials. Recent research in adults demonstrated this text-induced perceptual shift in typical, but not in dyslexic readers. Our current results instead show significant text-induced recalibration in both typically reading children and children with dyslexia. The strength of this effect was significantly linked to the strength of perceptual adaptation effects in children with dyslexia but not typically reading children. Furthermore, additional analyses in a sample of typically reading children of various reading levels revealed a significant link between recalibration and phoneme categorization. Taken together, our study highlights the importance of considering dynamic developmental changes in reading, letter-speech sound coupling and speech perception when investigating group differences between typical and dyslexic readers.
Affiliation(s)
- Linda Romanovska
- Maastricht Brain Imaging Center, Department Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, Maastricht, Netherlands.
- Roef Janssen
- Maastricht Brain Imaging Center, Department Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, Maastricht, Netherlands.
- Milene Bonte
- Maastricht Brain Imaging Center, Department Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, Maastricht, Netherlands.
6. Baart M, Vroomen J. Recalibration of vocal affect by a dynamic face. Exp Brain Res 2018;236:1911-1918. PMID: 29696314; PMCID: PMC6010487; DOI: 10.1007/s00221-018-5270-y.
Abstract
Perception of vocal affect is influenced by the concurrent sight of an emotional face. We demonstrate that the sight of an emotional face also can induce recalibration of vocal affect. Participants were exposed to videos of a ‘happy’ or ‘fearful’ face in combination with a slightly incongruous sentence with ambiguous prosody. After this exposure, ambiguous test sentences were rated as more ‘happy’ when the exposure phase contained ‘happy’ instead of ‘fearful’ faces. This auditory shift likely reflects recalibration that is induced by error minimization of the inter-sensory discrepancy. In line with this view, when the prosody of the exposure sentence was non-ambiguous and congruent with the face (without audiovisual discrepancy), aftereffects went in the opposite direction, likely reflecting adaptation. Our results demonstrate, for the first time, that perception of vocal affect is flexible and can be recalibrated by slightly discrepant visual information.
Affiliation(s)
- Martijn Baart
- Department of Cognitive Neuropsychology, Tilburg University, P.O. Box 90153, 5000 LE, Tilburg, The Netherlands.
- BCBL, Basque Center on Cognition, Brain and Language, Donostia, Spain.
- Jean Vroomen
- Department of Cognitive Neuropsychology, Tilburg University, P.O. Box 90153, 5000 LE, Tilburg, The Netherlands.
7.
Abstract
Adaptation to female voices causes subsequent voices to be perceived as more male, and vice versa. This contrastive aftereffect disappears under spatial inattention to adaptors, suggesting that voices are not encoded automatically. According to Lavie, Hirst, de Fockert, and Viding (2004), the processing of task-irrelevant stimuli during selective attention depends on perceptual resources and working memory. Possibly due to their social significance, faces may be an exceptional domain: That is, task-irrelevant faces can escape perceptual load effects. Here we tested voice processing, to study whether voice gender aftereffects (VGAEs) depend on low or high perceptual (Exp. 1) or working memory (Exp. 2) load in a relevant visual task. Participants adapted to irrelevant voices while either searching digit displays for a target (Exp. 1) or recognizing studied digits (Exp. 2). We found that the VGAE was unaffected by perceptual load, indicating that task-irrelevant voices, like faces, can also escape perceptual-load effects. Intriguingly, the VGAE was increased under high memory load. Therefore, visual working memory load, but not general perceptual load, determines the processing of task-irrelevant voices.
8. Visual speech influences speech perception immediately but not automatically. Atten Percept Psychophys 2016;79:660-678. DOI: 10.3758/s13414-016-1249-6.
9. Samuel AG. Lexical representations are malleable for about one second: Evidence for the non-automaticity of perceptual recalibration. Cogn Psychol 2016;88:88-114. PMID: 27423485; DOI: 10.1016/j.cogpsych.2016.06.007.
Abstract
In listening to speech, people have been shown to apply several types of adjustment to their phonemic categories that take into account variations in the prevailing linguistic environment. These adjustments include selective adaptation, lexically driven recalibration, and audiovisually determined recalibration. Prior studies have used dual task procedures to test whether these adjustments are automatic or if they require attention, and all of these tests have supported automaticity. The current study instead uses a method of targeted distraction to demonstrate that lexical recalibration does in fact require attention. Building on this finding, the targeted distraction method is used to measure the period of time during which the lexical percept remains malleable. The results support a processing window of approximately one second, consistent with the results of a small number of prior studies that bear on this question. The results also demonstrate that recalibration is closely linked to the completion of lexical access.
Affiliation(s)
- Arthur G Samuel
- Basque Center on Cognition, Brain and Language, Donostia, Spain.
- IKERBASQUE, Basque Foundation for Science, Spain.
- Stony Brook University, Dept. of Psychology, Stony Brook, NY, United States.
10. Cross-modal perceptual load: the impact of modality and individual differences. Exp Brain Res 2015;234:1279-1291. DOI: 10.1007/s00221-015-4517-0.
11. Keetels M, Pecoraro M, Vroomen J. Recalibration of auditory phonemes by lipread speech is ear-specific. Cognition 2015;141:121-126. PMID: 25981732; DOI: 10.1016/j.cognition.2015.04.019.
Abstract
Listeners quickly learn to label an ambiguous speech sound if there is lipread information that tells what the sound should be (i.e., phonetic recalibration; Bertelson, Vroomen, & de Gelder, 2003). We report the counter-intuitive result that the same ambiguous sound can be simultaneously adapted to two opposing phonemic interpretations if presented in the left and right ear. This is strong evidence against the notion that phonetic recalibration involves an adjustment of abstract phoneme boundaries. It rather supports the idea that phonetic recalibration is closely tied to the sensory specifics of the learning context.
Affiliation(s)
- Mirjam Keetels
- Department of Cognitive Neuropsychology, Tilburg University, The Netherlands.
- Mauro Pecoraro
- Department of Cognitive Neuropsychology, Tilburg University, The Netherlands.
- Jean Vroomen
- Department of Cognitive Neuropsychology, Tilburg University, The Netherlands.
12. Processing load impairs coordinate integration for the localization of touch. Atten Percept Psychophys 2015;76:1136-1150. PMID: 24550040; DOI: 10.3758/s13414-013-0590-2.
Abstract
To perform an action toward a touch, the tactile spatial representation must be transformed from a skin-based, anatomical reference frame into an external reference frame. Evidence suggests that, after transformation, both anatomical and external coordinates are integrated for the location estimate. The present study investigated whether the calculation and integration of external coordinates are automatic processes. Participants made temporal order judgments (TOJs) of two tactile stimuli, one applied to each hand, in crossed and uncrossed postures. The influence of the external coordinates of touch was indicated by the performance difference between crossed and uncrossed postures, referred to as the crossing effect. To assess automaticity, the TOJ task was combined with a working memory task that varied in difficulty (size of the working memory set) and quality (verbal vs. spatial). In two studies, the crossing effect was consistently reduced under processing load. When the load level was adaptively adjusted to individual performance (Study 2), the crossing effect additionally varied as a function of the difficulty of the secondary task. These modulatory effects of processing load on the crossing effect were independent of the type of working memory. The sensitivity of the crossing effect to processing load suggests that coordinate integration for touch localization is not fully automatic. To reconcile the present results with previous findings, we suggest that the genuine remapping process (that is, the transformation of anatomical into external coordinates) proceeds automatically, whereas their integration in service of a combined location estimate is subject to top-down control.
13. Baart M, de Boer-Schellekens L, Vroomen J. Lipread-induced phonetic recalibration in dyslexia. Acta Psychol (Amst) 2012;140:91-95. PMID: 22484551; DOI: 10.1016/j.actpsy.2012.03.003.
Abstract
Auditory phoneme categories are less well-defined in developmental dyslexic readers than in fluent readers. Here, we examined whether poor recalibration of phonetic boundaries might be associated with this deficit. Twenty-two adult dyslexic readers were compared with 22 fluent readers on a phoneme identification task and a task that measured phonetic recalibration by lipread speech (Bertelson, Vroomen, & De Gelder, 2003). In line with previous reports, we found that dyslexics were less categorical in the labeling of the speech sounds. The size of their phonetic recalibration effect, though, was comparable to that of normal readers. This result indicates that phonetic recalibration is unaffected in dyslexic readers, and that it is unlikely to lie at the foundation of their auditory phoneme categorization impairments. For normal readers, however, it appeared that a well-calibrated system is related to auditory precision, as the steepness of the auditory identification curve positively correlated with recalibration.
14.

15.
16. Kilian-Hütten N, Vroomen J, Formisano E. Brain activation during audiovisual exposure anticipates future perception of ambiguous speech. Neuroimage 2011;57:1601-1607. PMID: 21664279; DOI: 10.1016/j.neuroimage.2011.05.043.
Affiliation(s)
- Niclas Kilian-Hütten
- Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, Maastricht, The Netherlands.