1
LaTourrette A, Blanco C, Atik ND, Waxman SR. Navigating accent variability: 24-month-olds recognize known words spoken in an unfamiliar accent but require additional support to learn new words. Infant Behav Dev 2024;76:101962. PMID: 38820860; DOI: 10.1016/j.infbeh.2024.101962.
Abstract
As infants learn their native languages, they must also learn to contend with variability across speakers of those languages. Here, we examine 24-month-olds' ability to process speech in an unfamiliar accent. We demonstrate that 24-month-olds successfully identify the referents of known words in unfamiliar-accented speech but cannot use known words alone to infer new word meanings. However, when the novel word occurs in a supportive referential context, with the target referent visually available, 24-month-olds successfully learn new word-referent mappings. Thus, 24-month-olds recognize and learn words in unfamiliar accents, but unfamiliar-accented speech may pose challenges for more sophisticated language processing strategies.
Affiliation(s)
- Sandra R Waxman
- Department of Psychology, Northwestern University, USA; Institute for Policy Research, Northwestern University, USA
2
Kato M, Baese-Berk MM. The Effects of Acoustic and Semantic Enhancements on Perception of Native and Non-Native Speech. Lang Speech 2024;67:40-71. PMID: 36967604; DOI: 10.1177/00238309231156615.
Abstract
Previous research has shown that native listeners benefit from clearly produced speech, as well as from predictable semantic context, when these enhancements are delivered in native speech. However, it is unclear whether native listeners benefit from acoustic and semantic enhancements differently when listening to other varieties of speech, including non-native speech. The current study examines to what extent native English listeners benefit from acoustic and semantic cues present in native and non-native English speech. Native English listeners transcribed sentence-final words that varied in semantic predictability, produced in plain- or clear-speaking styles by native English talkers and by native Mandarin talkers of higher and lower proficiency in English. The perception results demonstrated that listeners benefited from semantic cues in higher- and lower-proficiency talkers' speech (i.e., they transcribed it more accurately), but not from acoustic cues, even though higher-proficiency talkers did make substantial acoustic enhancements from plain to clear speech. These results suggest that native listeners benefit more robustly from semantic cues than from acoustic cues when those cues are embedded in non-native speech.
Affiliation(s)
- Misaki Kato
- Department of Linguistics, University of Oregon, USA
3
Mechtenberg H, Giorio C, Myers EB. Pupil Dilation Reflects Perceptual Priorities During a Receptive Speech Task. Ear Hear 2024;45:425-440. PMID: 37882091; PMCID: PMC10868674; DOI: 10.1097/aud.0000000000001438.
Abstract
OBJECTIVES: The listening demand incurred by speech perception fluctuates in normal conversation. At the acoustic-phonetic level, natural variation in pronunciation acts as a speed bump on the way to accurate lexical selection. Any given utterance may be more or less phonetically ambiguous, a problem the listener must resolve to choose the correct word. This becomes especially apparent when considering two common speech registers, clear and casual, that have characteristically different levels of phonetic ambiguity. Clear speech prioritizes intelligibility through hyperarticulation, which results in less ambiguity at the phonetic level, while casual speech tends to have a more collapsed acoustic space. We hypothesized that listeners would invest greater cognitive resources while listening to casual speech than to clear speech, in order to resolve its greater phonetic ambiguity. To this end, we used pupillometry as an online measure of listening effort during perception of clear and casual continuous speech in two background conditions: quiet and noise. DESIGN: Forty-eight participants performed a probe detection task while listening to spoken, nonsensical sentences (masked and unmasked) while pupil size was recorded. Pupil size was modeled using growth curve analysis to capture the dynamics of the pupil response as the sentence unfolded. RESULTS: Pupil size during listening was sensitive to the presence of noise and to speech register (clear/casual). Unsurprisingly, listeners had larger overall pupil dilations during speech perception in noise, replicating earlier work. The pupil dilation pattern for clear and casual sentences was considerably more complex. Pupil dilation during clear speech trials was slightly larger than during casual speech trials, across quiet and noisy backgrounds. CONCLUSIONS: We suggest that listener motivation could explain the larger pupil dilations to clearly spoken speech. We propose that, within the context of this task, listeners devoted more resources to perceiving the speech signal with the greatest acoustic/phonetic fidelity. Further, we unexpectedly found systematic differences in pupil dilation preceding the onset of the spoken sentences. Together, these data demonstrate that the pupillary system is not merely reactive but also adaptive, sensitive to both task structure and listener motivation to maximize accurate perception in a limited-resource system.
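The growth curve analysis named in the Design section can be sketched concretely: model the pupil time course with orthogonal polynomial time terms and by-participant random effects. Below is a minimal sketch; the CSV file and column names (subject, register, time, pupil) are hypothetical placeholders, not the authors' materials.

```python
# A minimal sketch of growth curve analysis (GCA) for pupil time-course data:
# fixed effects of condition on orthogonal polynomial time terms, with
# by-participant random intercepts and linear time slopes.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("pupil_trials.csv")  # one row per subject x trial x time sample

# Orthogonal (Legendre) polynomial time terms on time rescaled to [-1, 1]
t = df["time"].to_numpy()
t_scaled = 2 * (t - t.min()) / (t.max() - t.min()) - 1
basis = np.polynomial.legendre.legvander(t_scaled, 2)  # intercept, linear, quadratic
df["ot1"], df["ot2"] = basis[:, 1], basis[:, 2]

# Pupil size as a function of speech register (clear vs. casual) over time
model = smf.mixedlm(
    "pupil ~ register * (ot1 + ot2)",
    data=df,
    groups=df["subject"],
    re_formula="~ot1",
)
print(model.fit().summary())
```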
Affiliation(s)
- Hannah Mechtenberg
- Department of Psychological Sciences, University of Connecticut, Storrs, Connecticut, USA
- Cristal Giorio
- Department of Psychology, Pennsylvania State University, State College, Pennsylvania, USA
- Emily B. Myers
- Department of Psychological Sciences, University of Connecticut, Storrs, Connecticut, USA
- Department of Speech, Language and Hearing Sciences, University of Connecticut, Storrs, Connecticut, USA
4
McLaughlin DJ, Colvett JS, Bugg JM, Van Engen KJ. Sequence effects and speech processing: cognitive load for speaker-switching within and across accents. Psychon Bull Rev 2024;31:176-186. PMID: 37442872; PMCID: PMC10867039; DOI: 10.3758/s13423-023-02322-1.
Abstract
Prior work in speech processing indicates that listening tasks with multiple speakers (as opposed to a single speaker) result in slower and less accurate processing. Notably, the trial-to-trial cognitive demands of switching between speakers or switching between accents have yet to be examined. We used pupillometry, a physiological index of cognitive load, to examine the demands of processing first (L1) and second (L2) language-accented speech when listening to sentences produced by the same speaker consecutively (no switch), a novel speaker of the same accent (within-accent switch), and a novel speaker with a different accent (across-accent switch). Inspired by research on sequential adjustments in cognitive control, we aimed to identify the cognitive demands of accommodating a novel speaker and accent by examining the trial-to-trial changes in pupil dilation during speech processing. Our results indicate that switching between speakers was more cognitively demanding than listening to the same speaker consecutively. Additionally, switching to a novel speaker with a different accent was more cognitively demanding than switching between speakers of the same accent. However, there was an asymmetry for across-accent switches, such that switching from an L1 to an L2 accent was more demanding than vice versa. Findings from the present study align with work examining multi-talker processing costs, and provide novel evidence that listeners dynamically adjust cognitive processing to accommodate speaker and accent variability. We discuss these novel findings in the context of an active control model and auditory streaming framework of speech processing.
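The trial-to-trial conditions examined here (no switch, within-accent switch, across-accent switch) can be derived mechanically from the stimulus order. A small sketch follows; the Trial records are illustrative, not the authors' stimulus lists.

```python
# A minimal sketch of coding trial-to-trial switch types from a stimulus
# sequence, following the no-switch / within-accent / across-accent contrast.
from dataclasses import dataclass

@dataclass
class Trial:
    speaker: str
    accent: str  # e.g., "L1" or "L2"

def code_switches(trials: list[Trial]) -> list[str]:
    """Label each trial (after the first) relative to the preceding trial."""
    labels = ["first"]  # the first trial has no preceding context
    for prev, curr in zip(trials, trials[1:]):
        if curr.speaker == prev.speaker:
            labels.append("no-switch")
        elif curr.accent == prev.accent:
            labels.append("within-accent switch")
        else:
            labels.append("across-accent switch")
    return labels

trials = [Trial("A", "L1"), Trial("A", "L1"), Trial("B", "L1"), Trial("C", "L2")]
print(code_switches(trials))
# ['first', 'no-switch', 'within-accent switch', 'across-accent switch']
```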
Affiliation(s)
- Drew J McLaughlin
- Department of Psychological and Brain Sciences, Washington University in St. Louis, St. Louis, MO, USA
- Basque Center on Cognition, Brain and Language, Paseo Mikeletegi 69, 20009 Donostia-San Sebastián, Gipuzkoa, Spain
- Jackson S Colvett
- Department of Psychological and Brain Sciences, Washington University in St. Louis, St. Louis, MO, USA
- Julie M Bugg
- Department of Psychological and Brain Sciences, Washington University in St. Louis, St. Louis, MO, USA
- Kristin J Van Engen
- Department of Psychological and Brain Sciences, Washington University in St. Louis, St. Louis, MO, USA
5
McLaughlin DJ, Van Engen KJ. Exploring effects of social information on talker-independent accent adaptation. JASA Express Lett 2023;3:125201. PMID: 38059794; DOI: 10.1121/10.0022536.
Abstract
The present study examined whether race information about speakers can promote rapid and generalizable perceptual adaptation to second-language accent. First-language English listeners were presented with Cantonese-accented English sentences in speech-shaped noise during a training session with three intermixed talkers, followed by a test session with a novel (i.e., fourth) talker. Participants were assigned to view either three East Asian or three White faces during training, corresponding to each speaker. Results indicated no effect of the social priming manipulation on the training or test sessions, although both groups performed better at test than a control group.
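Speech-shaped noise of the kind used as a masker here is typically made by filtering white noise to match the long-term average spectrum of speech. A minimal sketch under stated assumptions (mono recording; the filename is hypothetical):

```python
# A minimal sketch of generating speech-shaped noise: estimate the long-term
# average speech spectrum, then shape white noise with a matching FIR filter.
import numpy as np
from scipy.io import wavfile
from scipy.signal import welch, firwin2, lfilter

fs, speech = wavfile.read("accented_sentences.wav")  # hypothetical mono file
speech = speech.astype(float)

# Long-term average speech spectrum (LTASS) estimate
freqs, psd = welch(speech, fs=fs, nperseg=2048)

# FIR filter whose magnitude response follows the speech spectrum
gain = np.sqrt(psd / psd.max())  # amplitude response from power spectrum
gain[0], gain[-1] = 0.0, 0.0     # zero DC and Nyquist for a well-behaved filter
fir = firwin2(513, freqs / (fs / 2), gain)

# Shape white noise and match the speech RMS level
noise = lfilter(fir, 1.0, np.random.randn(len(speech)))
noise *= np.sqrt(np.mean(speech**2) / np.mean(noise**2))
```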
Affiliation(s)
- Drew J McLaughlin
- Basque Center on Cognition, Brain and Language, Donostia-San Sebastián, Gipuzkoa 20018, Spain
- Department of Psychological & Brain Sciences, Washington University in St. Louis, St. Louis, Missouri 63130, USA
- Kristin J Van Engen
- Department of Psychological & Brain Sciences, Washington University in St. Louis, St. Louis, Missouri 63130, USA
6
Kuchinsky SE, Razeghi N, Pandža NB. Auditory, Lexical, and Multitasking Demands Interactively Impact Listening Effort. J Speech Lang Hear Res 2023;66:4066-4082. PMID: 37672797; PMCID: PMC10713022; DOI: 10.1044/2023_jslhr-22-00548.
Abstract
PURPOSE: This study examined the extent to which acoustic, linguistic, and cognitive task demands interactively impact listening effort. METHOD: Using a dual-task paradigm, on each trial, participants were instructed to perform either a single task or two tasks. In the primary word recognition task, participants repeated Northwestern University Auditory Test No. 6 words presented in speech-shaped noise at either an easier or a harder signal-to-noise ratio (SNR). The words varied in how commonly they occur in the English language (lexical frequency). In the secondary visual task, participants were instructed to press a specific key as soon as a number appeared on screen (simpler task) or one of two keys to indicate whether the visualized number was even or odd (more complex task). RESULTS: Manipulation checks revealed that key assumptions of the dual-task design were met. A significant three-way interaction was observed, such that the expected effect of SNR on effort was only observable for words with lower lexical frequency and only when multitasking demands were relatively simpler. CONCLUSIONS: This work reveals that variability across speech stimuli can influence the sensitivity of the dual-task paradigm for detecting changes in listening effort. In line with previous work, the results of this study also suggest that higher cognitive demands may limit the ability to detect expected effects of SNR on measures of effort. With implications for real-world listening, these findings highlight that even relatively minor changes in lexical and multitasking demands can alter the effort devoted to listening in noise.
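The dual-task logic described in the Method section can be summarized in code. The sketch below builds crossed single- and dual-task trials and scores the secondary visual task; condition labels and data structures are illustrative assumptions, not the authors' experiment scripts.

```python
# A minimal sketch of the dual-task trial structure: a primary word-recognition
# task at an easier or harder SNR, optionally paired with a simpler (detect)
# or more complex (even/odd) secondary visual task.
import itertools
import random
from dataclasses import dataclass

@dataclass
class Trial:
    snr: str        # "easier" or "harder"
    secondary: str  # "none", "detect", or "even-odd"
    word: str       # NU-6 word presented in speech-shaped noise

def build_block(words: list[str]) -> list[Trial]:
    conditions = list(itertools.product(["easier", "harder"],
                                        ["none", "detect", "even-odd"]))
    trials = [Trial(snr, task, w)
              for (snr, task), w in zip(itertools.cycle(conditions), words)]
    random.shuffle(trials)
    return trials

def score_secondary(trial: Trial, digit: int, response: str) -> bool:
    if trial.secondary == "detect":
        return response == "space"  # any-key detection
    if trial.secondary == "even-odd":
        return response == ("even" if digit % 2 == 0 else "odd")
    return True  # single-task trials have no secondary response
```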
Affiliation(s)
- Stefanie E. Kuchinsky
- Audiology and Speech Pathology Center, Walter Reed National Military Medical Center, Bethesda, MD
- Applied Research Laboratory for Intelligence and Security, University of Maryland, College Park
- Department of Hearing and Speech Sciences, University of Maryland, College Park
- Niki Razeghi
- Department of Hearing and Speech Sciences, University of Maryland, College Park
- Nick B. Pandža
- Applied Research Laboratory for Intelligence and Security, University of Maryland, College Park
- Program in Second Language Acquisition, University of Maryland, College Park
- Maryland Language Science Center, University of Maryland, College Park
7
McLaughlin DJ, Van Engen KJ. Social Priming: Exploring the Effects of Speaker Race and Ethnicity on Perception of Second Language Accents. Lang Speech 2023:238309231199245. PMID: 37772514; DOI: 10.1177/00238309231199245.
Abstract
Listeners use more than just acoustic information when processing speech. Social information, such as a speaker's perceived race or ethnicity, can also affect the processing of the speech signal, in some cases facilitating perception ("social priming"). We aimed to replicate and extend this line of inquiry, examining effects of multiple social primes (i.e., a Middle Eastern, White, or East Asian face, or a control silhouette image) on the perception of Mandarin Chinese-accented English and Arabic-accented English. By including uncommon priming combinations (e.g., a Middle Eastern prime for a Mandarin accent), we aimed to test the specificity of social primes: For example, can a Middle Eastern face facilitate perception of both Arabic-accented English and Mandarin-accented English? Contrary to our predictions, our results indicated no facilitative social priming effects for either of the second language (L2) accents. Results for our examination of specificity were mixed. Trends in the data indicated that the combination of an East Asian prime with Arabic accent resulted in lower accuracy as compared with a White prime, but the combination of a Middle Eastern prime with a Mandarin accent did not (and may have actually benefited listeners to some degree). We conclude that the specificity of priming effects may depend on listeners' level of familiarity with a given accent and/or racial/ethnic group and that the mixed outcomes in the current work motivate further inquiries to determine whether social priming effects for L2-accented speech may be smaller than previously hypothesized and/or highly dependent on listener experience.
Affiliation(s)
- Drew J McLaughlin
- Department of Psychological & Brain Sciences, Washington University in St. Louis, USA; Basque Center on Cognition, Brain and Language, Spain
- Kristin J Van Engen
- Department of Psychological & Brain Sciences, Washington University in St. Louis, USA
8
McLaughlin DJ, Zink ME, Gaunt L, Reilly J, Sommers MS, Van Engen KJ, Peelle JE. Give me a break! Unavoidable fatigue effects in cognitive pupillometry. Psychophysiology 2023;60:e14256. PMID: 36734299; PMCID: PMC11161670; DOI: 10.1111/psyp.14256.
Abstract
Pupillometry has a rich history in the study of perception and cognition. One perennial challenge is that the magnitude of the task-evoked pupil response diminishes over the course of an experiment, a phenomenon we refer to as a fatigue effect. Reducing fatigue effects may improve sensitivity to task effects and reduce the likelihood of confounds due to systematic physiological changes over time. In this paper, we investigated the degree to which fatigue effects could be ameliorated by experimenter intervention. In Experiment 1, we assigned participants to one of three groups: no breaks, kinetic breaks (playing with toys, but no social interaction), or chatting with a research assistant. We then compared the pupil response across conditions. In Experiment 2, we additionally tested the effect of researcher observation. Only breaks including social interaction significantly reduced the fatigue of the pupil response across trials. However, in all conditions we found robust evidence for fatigue effects: that is, regardless of protocol, the task-evoked pupil response was substantially diminished (by at least 60%) over the duration of the experiment. We account for the variance of fatigue effects in our pupillometry data using multiple common statistical modeling approaches (e.g., linear mixed-effects models of peak, mean, and baseline pupil diameters, as well as growth curve models of time-course data). We conclude that pupil attenuation is a predictable phenomenon that should be accommodated in our experimental designs and statistical models.
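One of the modeling approaches named above, a linear mixed-effects model of baseline-corrected peak pupil diameter across trials, can be sketched as follows; the CSV file and column names are hypothetical.

```python
# A minimal sketch of quantifying pupil fatigue: fit a linear mixed-effects
# model of per-trial baseline-corrected peak pupil diameter on trial number,
# moderated by break-type group, with by-participant random slopes.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("pupil_peaks.csv")  # one row per participant x trial

# Baseline-correct: subtract the pre-stimulus baseline from the peak
df["peak_corrected"] = df["peak_pupil"] - df["baseline_pupil"]

# Fatigue appears as a negative fixed effect of (standardized) trial number
df["trial_z"] = (df["trial"] - df["trial"].mean()) / df["trial"].std()
model = smf.mixedlm(
    "peak_corrected ~ trial_z * break_group",
    data=df,
    groups=df["participant"],
    re_formula="~trial_z",
)
print(model.fit().summary())
```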
Affiliation(s)
- Drew J. McLaughlin
- Department of Psychological and Brain Sciences, Washington University in Saint Louis, St. Louis, Missouri, USA
- Maggie E. Zink
- Department of Otolaryngology, Washington University in Saint Louis, St. Louis, Missouri, USA
- Lauren Gaunt
- Department of Psychological and Brain Sciences, Washington University in Saint Louis, St. Louis, Missouri, USA
- Jamie Reilly
- Department of Communication Sciences and Disorders, Temple University, Philadelphia, Pennsylvania, USA
- Mitchell S. Sommers
- Department of Psychological and Brain Sciences, Washington University in Saint Louis, St. Louis, Missouri, USA
- Kristin J. Van Engen
- Department of Psychological and Brain Sciences, Washington University in Saint Louis, St. Louis, Missouri, USA
- Jonathan E. Peelle
- Department of Otolaryngology, Washington University in Saint Louis, St. Louis, Missouri, USA
9
Rovetti J, Sumantry D, Russo FA. Exposure to nonnative-accented speech reduces listening effort and improves social judgments of the speaker. Sci Rep 2023;13:2808. PMID: 36797318; PMCID: PMC9935874; DOI: 10.1038/s41598-023-29082-1.
Abstract
Prior research has revealed a native-accent advantage, whereby nonnative-accented speech is more difficult to process than native-accented speech. Nonnative-accented speakers also experience more negative social judgments. In the current study, we asked three questions. First, does exposure to nonnative-accented speech increase speech intelligibility or decrease listening effort, thereby narrowing the native-accent advantage? Second, does lower intelligibility or higher listening effort contribute to listeners' negative social judgments of speakers? Third and finally, does increased intelligibility or decreased listening effort with exposure to speech bring about more positive social judgments of speakers? To address these questions, normal-hearing adults listened to a block of English sentences spoken with a native accent and a block spoken with a nonnative accent. We found that once participants were accustomed to the task, intelligibility was greater for nonnative-accented speech and increased similarly with exposure for both accents. However, listening effort decreased only for nonnative-accented speech, soon reaching the level of native-accented speech. In addition, lower intelligibility and higher listening effort were associated with lower ratings of speaker warmth, speaker competence, and willingness to interact with the speaker. Finally, competence ratings increased over time to a similar extent for both accents, with this relationship fully mediated by intelligibility and listening effort. These results offer insight into how listeners process and judge unfamiliar speakers.
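The mediation result reported here (exposure effects on competence ratings carried by intelligibility and effort) can be illustrated with a simple indirect-effect bootstrap. The sketch below uses a single mediator and ignores the repeated-measures structure, so it is a simplification; column names and the data file are hypothetical.

```python
# A minimal sketch of a mediation test: does intelligibility mediate the
# effect of exposure (trial number) on competence ratings? Indirect effect
# a*b with a percentile bootstrap confidence interval.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("accent_ratings.csv")  # trial-level data

def indirect_effect(data: pd.DataFrame) -> float:
    a = smf.ols("intelligibility ~ trial", data).fit().params["trial"]
    b = smf.ols("competence ~ intelligibility + trial", data).fit().params["intelligibility"]
    return a * b

boot = [indirect_effect(df.sample(len(df), replace=True)) for _ in range(2000)]
lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"indirect effect = {indirect_effect(df):.3f}, 95% CI [{lo:.3f}, {hi:.3f}]")
```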
Affiliation(s)
- Joseph Rovetti
- Department of Psychology, Western University, London, ON N6A 3K7, Canada
- Department of Psychology, Toronto Metropolitan University, Toronto, ON M5B 2K3, Canada
- David Sumantry
- Department of Psychology, Toronto Metropolitan University, Toronto, ON M5B 2K3, Canada
- Frank A. Russo
- Department of Psychology, Toronto Metropolitan University, Toronto, ON M5B 2K3, Canada
10
Bsharat-Maalouf D, Degani T, Karawani H. The Involvement of Listening Effort in Explaining Bilingual Listening Under Adverse Listening Conditions. Trends Hear 2023;27:23312165231205107. PMID: 37941413; PMCID: PMC10637154; DOI: 10.1177/23312165231205107.
Abstract
The current review examines listening effort to uncover how it is implicated in bilingual performance under adverse listening conditions. Various measures of listening effort, including physiological, behavioral, and subjective measures, have been employed to examine listening effort in bilingual children and adults. Adverse listening conditions, stemming from environmental factors, as well as factors related to the speaker or listener, have been examined. The existing literature, although relatively limited to date, points to increased listening effort among bilinguals in their nondominant second language (L2) compared to their dominant first language (L1) and relative to monolinguals. Interestingly, increased effort is often observed even when speech intelligibility remains unaffected. These findings emphasize the importance of considering listening effort alongside speech intelligibility. Building upon the insights gained from the current review, we propose that various factors may modulate the observed effects. These include the particular measure selected to examine listening effort, the characteristics of the adverse condition, as well as factors related to the particular linguistic background of the bilingual speaker. Critically, further research is needed to better understand the impact of these factors on listening effort. The review outlines avenues for future research that would promote a comprehensive understanding of listening effort in bilingual individuals.
Affiliation(s)
- Dana Bsharat-Maalouf
- Department of Communication Sciences and Disorders, University of Haifa, Haifa, Israel
- Tamar Degani
- Department of Communication Sciences and Disorders, University of Haifa, Haifa, Israel
- Hanin Karawani
- Department of Communication Sciences and Disorders, University of Haifa, Haifa, Israel
11
Baese-Berk MM, Levi SV, Van Engen KJ. Intelligibility as a measure of speech perception: Current approaches, challenges, and recommendations. J Acoust Soc Am 2023;153:68. PMID: 36732227; DOI: 10.1121/10.0016806.
Abstract
Intelligibility measures, which assess the number of words or phonemes a listener correctly transcribes or repeats, are commonly used metrics for speech perception research. While these measures have many benefits for researchers, they also come with a number of limitations. By pointing out the strengths and limitations of this approach, including how it fails to capture aspects of perception such as listening effort, this article argues that the role of intelligibility measures must be reconsidered in fields such as linguistics, communication disorders, and psychology. Recommendations for future work in this area are presented.
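As a concrete reference point, the intelligibility metric discussed here is typically computed as the proportion of target keywords recovered from a listener's transcript. A minimal scorer is sketched below; normalization choices (case, punctuation, morphological endings) vary across labs, and this illustrative version applies only basic cleanup.

```python
# A minimal sketch of keyword-based intelligibility scoring: the proportion
# of target keywords present in a listener's transcript after lowercasing
# and punctuation stripping.
import string

def score_intelligibility(keywords: list[str], transcript: str) -> float:
    """Proportion of target keywords present in the listener's transcript."""
    table = str.maketrans("", "", string.punctuation)
    produced = set(transcript.lower().translate(table).split())
    hits = sum(word.lower() in produced for word in keywords)
    return hits / len(keywords)

print(score_intelligibility(["dog", "chased", "ball"], "The dog chased a ball."))
# 1.0
```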
Affiliation(s)
- Susannah V Levi
- Department of Communicative Sciences and Disorders, New York University, New York, New York 10012, USA
- Kristin J Van Engen
- Department of Psychological and Brain Sciences, Washington University in St. Louis, St. Louis, Missouri 63130, USA
12
Kutlu E, Tiv M, Wulff S, Titone D. Does race impact speech perception? An account of accented speech in two different multilingual locales. Cogn Res Princ Implic 2022;7:7. PMID: 35089448; PMCID: PMC8799814; DOI: 10.1186/s41235-022-00354-0.
Abstract
Upon hearing someone's speech, a listener can access information such as the speaker's age, gender identity, socioeconomic status, and linguistic background. However, an open question is whether living in different locales modulates how listeners use these factors to assess speakers' speech. Here, an audio-visual test was used to measure whether listeners' accentedness judgments and intelligibility (i.e., speech perception) are modulated by the racial information in the faces they see. American, British, and Indian English served as three different varieties of English speech. These speech samples were presented with either a white female face or a South Asian female face. Two experiments were completed in two locales: Gainesville, Florida (USA) and Montreal, Quebec (Canada). Overall, Montreal listeners were more accurate in their transcription of sentences (i.e., intelligibility) than Gainesville listeners. Moreover, Gainesville listeners' ability to transcribe the same spoken sentences decreased for all varieties when the speech was paired with South Asian faces. However, seeing a white or a South Asian face did not impact speech intelligibility for the same spoken sentences for Montreal listeners. Finally, listeners' accentedness judgments increased for American English and Indian English when the visual information changed from a white face to a South Asian face in Gainesville, but not in Montreal. These findings suggest that the degree to which visual cues for race impact speech perception varies with the ecological diversity of the locale.
Affiliation(s)
- Ethan Kutlu
- Department of Psychological and Brain Sciences, University of Iowa, Iowa City, USA
- Department of Linguistics, University of Iowa, Iowa City, USA
- Mehrgol Tiv
- Department of Psychology, McGill University, Montreal, Canada
- Stefanie Wulff
- Department of Linguistics, University of Florida, Gainesville, USA
- Department of Language and Culture, UiT The Arctic University of Norway, Tromsø, Norway
- Debra Titone
- Department of Psychology, McGill University, Montreal, Canada
13
Revisiting the relationship between implicit racial bias and audiovisual benefit for nonnative-accented speech. Atten Percept Psychophys 2022;84:2074-2086. PMID: 34988904; DOI: 10.3758/s13414-021-02423-w.
Abstract
Speech intelligibility is improved when the listener can see the talker in addition to hearing their voice. Notably, though, previous work has suggested that this "audiovisual benefit" for nonnative (i.e., foreign-accented) speech is smaller than the benefit for native speech, an effect that may be partially accounted for by listeners' implicit racial biases (Yi et al., 2013, The Journal of the Acoustical Society of America, 134[5], EL387-EL393). In the present study, we sought to replicate these findings in a significantly larger sample of online participants. In a direct replication of Yi et al. (Experiment 1), we found that audiovisual benefit was indeed smaller for nonnative-accented relative to native-accented speech. However, our results did not support the conclusion that implicit racial biases, as measured with two types of implicit association tasks, were related to these differences in audiovisual benefit for native and nonnative speech. In a second experiment, we addressed a potential confound in the experimental design; to ensure that the difference in audiovisual benefit was caused by a difference in accent rather than a difference in overall intelligibility, we reversed the overall difficulty of each accent condition by presenting them at different signal-to-noise ratios. Even when native speech was presented at a much more difficult intelligibility level than nonnative speech, audiovisual benefit for nonnative speech remained poorer. In light of these findings, we discuss alternative explanations of reduced audiovisual benefit for nonnative speech, as well as methodological considerations for future work examining the intersection of social, cognitive, and linguistic processes.
14
Bieber RE, Gordon-Salant S. Semantic context and stimulus variability independently affect rapid adaptation to non-native English speech in young adults. J Acoust Soc Am 2022;151:242. PMID: 35104999; PMCID: PMC8769767; DOI: 10.1121/10.0009170.
Abstract
When speech is degraded or challenging to recognize, young adult listeners with normal hearing are able to quickly adapt, improving their recognition of the speech over a short period of time. This rapid adaptation is robust, but the factors influencing rate, magnitude, and generalization of improvement have not been fully described. Two factors of interest are lexico-semantic information and talker and accent variability; lexico-semantic information promotes perceptual learning for acoustically ambiguous speech, while talker and accent variability are beneficial for generalization of learning. In the present study, rate and magnitude of adaptation were measured for speech varying in level of semantic context, and in the type and number of talkers. Generalization of learning to an unfamiliar talker was also assessed. Results indicate that rate of rapid adaptation was slowed for semantically anomalous sentences, as compared to semantically intact or topic-grouped sentences; however, generalization was seen in the anomalous conditions. Magnitude of adaptation was greater for non-native as compared to native talker conditions, with no difference between single and multiple non-native talker conditions. These findings indicate that the previously documented benefit of lexical information in supporting rapid adaptation is not enhanced by the addition of supra-sentence context.
Affiliation(s)
- Rebecca E Bieber
- Department of Hearing and Speech Sciences, University of Maryland College Park, College Park, Maryland 20742, USA
- Sandra Gordon-Salant
- Department of Hearing and Speech Sciences, University of Maryland College Park, College Park, Maryland 20742, USA
15
Colby S, McMurray B. Cognitive and Physiological Measures of Listening Effort During Degraded Speech Perception: Relating Dual-Task and Pupillometry Paradigms. J Speech Lang Hear Res 2021;64:3627-3652. PMID: 34491779; PMCID: PMC8642090; DOI: 10.1044/2021_jslhr-20-00583.
Abstract
Purpose: Listening effort is quickly becoming an important metric for assessing speech perception in less-than-ideal situations. However, the relationship between the construct of listening effort and the measures used to assess it remains unclear. We compared two measures of listening effort: a cognitive dual task and a physiological pupillometry task. We sought to investigate the relationship between these measures of effort and whether engaging effort impacts speech accuracy. Method: In Experiment 1, 30 participants completed a dual task and a pupillometry task that were carefully matched in stimuli and design. The dual task consisted of a spoken word recognition task and a visual match-to-sample task. In the pupillometry task, pupil size was monitored while participants completed a spoken word recognition task. Both tasks presented words at three levels of listening difficulty (unmodified, eight-channel vocoding, and four-channel vocoding) and provided response feedback on every trial. We refined the pupillometry task in Experiment 2 (n = 31); crucially, participants no longer received response feedback. Finally, we ran a new group of subjects on both tasks in Experiment 3 (n = 30). Results: In Experiment 1, accuracy in the visual task decreased with increased signal degradation in the dual task, but pupil size was sensitive to accuracy and not vocoding condition. After removing feedback in Experiment 2, changes in pupil size were predicted by listening condition, suggesting the task was now sensitive to engaged effort. Both tasks were sensitive to listening difficulty in Experiment 3, but there was no relationship between the tasks and neither task predicted speech accuracy. Conclusions: Consistent with previous work, we found little evidence for a relationship between different measures of listening effort. We also found no evidence that effort predicts speech accuracy, suggesting that engaging more effort does not lead to improved speech recognition. Cognitive and physiological measures of listening effort are likely sensitive to different aspects of the construct of listening effort. Supplemental Material: https://doi.org/10.23641/asha.16455900.
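The noise vocoding used to degrade the stimuli here follows a standard recipe: split the signal into frequency bands, extract each band's amplitude envelope, and use it to modulate band-limited noise, with fewer channels producing more degradation. A simplified sketch, not the authors' stimulus-processing code:

```python
# A minimal sketch of noise vocoding with log-spaced analysis bands.
import numpy as np
from scipy.signal import butter, sosfilt, hilbert

def noise_vocode(x: np.ndarray, fs: int, n_channels: int) -> np.ndarray:
    edges = np.geomspace(80, 7000, n_channels + 1)  # band edges in Hz
    out = np.zeros_like(x, dtype=float)
    for lo, hi in zip(edges[:-1], edges[1:]):
        sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
        band = sosfilt(sos, x)
        envelope = np.abs(hilbert(band))                  # amplitude envelope
        carrier = sosfilt(sos, np.random.randn(len(x)))   # band-limited noise
        out += envelope * carrier
    return out / np.max(np.abs(out))  # normalize

fs = 16000
t = np.arange(fs) / fs
speech = np.sin(2 * np.pi * 220 * t)  # stand-in for a loaded speech waveform
degraded = noise_vocode(speech, fs, n_channels=4)
```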
Affiliation(s)
- Sarah Colby
- Department of Psychological and Brain Sciences, The University of Iowa, Iowa City
- Bob McMurray
- Department of Psychological and Brain Sciences, The University of Iowa, Iowa City
16
Banks B, Gowen E, Munro KJ, Adank P. Eye Gaze and Perceptual Adaptation to Audiovisual Degraded Speech. J Speech Lang Hear Res 2021;64:3432-3445. PMID: 34463528; DOI: 10.1044/2021_jslhr-21-00106.
Abstract
Purpose: Visual cues from a speaker's face may benefit perceptual adaptation to degraded speech, but current evidence is limited. We aimed to replicate results from previous studies to establish the extent to which visual speech cues can lead to greater adaptation over time, extending existing results to a real-time adaptation paradigm (i.e., without a separate training period). A second aim was to investigate whether eye gaze patterns toward the speaker's mouth were related to better perception, hypothesizing that listeners who looked more at the speaker's mouth would show greater adaptation. Method: A group of listeners (n = 30) was presented with 90 noise-vocoded sentences in audiovisual format, whereas a control group (n = 29) was presented with the audio signal only. Recognition accuracy was measured throughout and eye tracking was used to measure fixations toward the speaker's eyes and mouth in the audiovisual group. Results: Previous studies were partially replicated: The audiovisual group had better recognition throughout and adapted slightly more rapidly, but both groups showed an equal amount of improvement overall. Longer fixations on the speaker's mouth in the audiovisual group were related to better overall accuracy. An exploratory analysis further demonstrated that the duration of fixations to the speaker's mouth decreased over time. Conclusions: The results suggest that visual cues may not benefit adaptation to degraded speech as much as previously thought. Longer fixations on a speaker's mouth may play a role in successfully decoding visual speech cues; however, this will need to be confirmed in future research to fully understand how patterns of eye gaze are related to audiovisual speech recognition. All materials, data, and code are available at https://osf.io/2wqkf/.
Affiliation(s)
- Briony Banks
- Division of Neuroscience and Experimental Psychology, Faculty of Biology, Medicine and Health, The University of Manchester, United Kingdom
- Emma Gowen
- Division of Neuroscience and Experimental Psychology, Faculty of Biology, Medicine and Health, The University of Manchester, United Kingdom
- Kevin J Munro
- Manchester Centre for Audiology and Deafness, Faculty of Biology, Medicine and Health, The University of Manchester, United Kingdom
- Manchester University NHS Foundation Trust, Manchester Academic Health Science Centre, United Kingdom
- Patti Adank
- Speech, Hearing and Phonetic Sciences, University College London, United Kingdom
17
Pupillometry reveals cognitive demands of lexical competition during spoken word recognition in young and older adults. Psychon Bull Rev 2021;29:268-280. PMID: 34405386; DOI: 10.3758/s13423-021-01991-0.
Abstract
In most contemporary activation-competition frameworks for spoken word recognition, candidate words compete against phonological "neighbors" with similar acoustic properties (e.g., "cap" vs. "cat"). Thus, recognizing words with more competitors should come at a greater cognitive cost relative to recognizing words with fewer competitors, due to increased demands for selecting the correct item and inhibiting incorrect candidates. Importantly, these processes should operate even in the absence of differences in accuracy. In the present study, we tested this proposal by examining differences in processing costs associated with neighborhood density for highly intelligible items presented in quiet. A second goal was to examine whether the cognitive demands associated with increased neighborhood density were greater for older adults compared with young adults. Using pupillometry as an index of cognitive processing load, we compared the cognitive demands associated with spoken word recognition for words with many or fewer neighbors, presented in quiet, for young (n = 67) and older (n = 69) adult listeners. Growth curve analysis of the pupil data indicated that older adults showed a greater evoked pupil response for spoken words than did young adults, consistent with increased cognitive load during spoken word recognition. Words from dense neighborhoods were marginally more demanding to process than words from sparse neighborhoods. There was also an interaction between age and neighborhood density, indicating larger effects of density in young adult listeners. These results highlight the importance of assessing both cognitive demands and accuracy when investigating the mechanisms underlying spoken word recognition.
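The neighborhood construct behind the "cap" vs. "cat" example is usually operationalized as the number of words one phoneme away by substitution, addition, or deletion. A minimal sketch over a toy orthographic lexicon follows; real studies compute this over phonemic transcriptions.

```python
# A minimal sketch of computing neighborhood density: count lexicon entries
# exactly one edit (substitution, addition, or deletion) away from each word.
def is_neighbor(a: str, b: str) -> bool:
    """True if b differs from a by exactly one substitution, addition, or deletion."""
    if a == b:
        return False
    if len(a) == len(b):  # substitution
        return sum(x != y for x, y in zip(a, b)) == 1
    if abs(len(a) - len(b)) == 1:  # addition/deletion
        shorter, longer = sorted((a, b), key=len)
        for i in range(len(longer)):
            if longer[:i] + longer[i + 1:] == shorter:
                return True
    return False

lexicon = ["cat", "cap", "cut", "at", "cast", "dog"]
density = {w: sum(is_neighbor(w, o) for o in lexicon) for w in lexicon}
print(density)
# {'cat': 4, 'cap': 1, 'cut': 1, 'at': 1, 'cast': 1, 'dog': 0}
```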
18
Brown VA, Van Engen KJ, Peelle JE. Face mask type affects audiovisual speech intelligibility and subjective listening effort in young and older adults. Cogn Res Princ Implic 2021;6:49. PMID: 34275022; PMCID: PMC8286438; DOI: 10.1186/s41235-021-00314-0.
Abstract
Identifying speech requires that listeners make rapid use of fine-grained acoustic cues, a process that is facilitated by being able to see the talker's face. Face masks present a challenge to this process because they can both alter acoustic information and conceal the talker's mouth. Here, we investigated the degree to which different types of face masks and noise levels affect speech intelligibility and subjective listening effort for young (N = 180) and older (N = 180) adult listeners. We found that in quiet, mask type had little influence on speech intelligibility relative to speech produced without a mask for both young and older adults. However, with the addition of moderate (−5 dB SNR) and high (−9 dB SNR) levels of background noise, intelligibility dropped substantially for all types of face masks in both age groups. Across noise levels, transparent face masks and cloth face masks with filters impaired performance the most, and surgical face masks had the smallest influence on intelligibility. Participants also rated speech produced with a face mask as more effortful than unmasked speech, particularly in background noise. Although young and older adults were similarly affected by face masks and noise in terms of intelligibility and subjective listening effort, older adults showed poorer intelligibility overall and rated the speech as more effortful to process relative to young adults. This research will help individuals make more informed decisions about which types of masks to wear in various communicative settings.
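The noise manipulation reported here (−5 and −9 dB SNR) reduces to scaling a masker relative to the speech level before mixing. A minimal sketch, with random arrays standing in for loaded waveforms:

```python
# A minimal sketch of mixing speech and noise at a target signal-to-noise
# ratio: scale the noise so that 20*log10(speech_rms / noise_rms) = snr_db.
import numpy as np

def mix_at_snr(speech: np.ndarray, noise: np.ndarray, snr_db: float) -> np.ndarray:
    """Scale `noise` so the mixture has the requested speech-to-noise ratio."""
    speech_rms = np.sqrt(np.mean(speech**2))
    noise_rms = np.sqrt(np.mean(noise**2))
    target_noise_rms = speech_rms / (10 ** (snr_db / 20))
    return speech + noise * (target_noise_rms / noise_rms)

rng = np.random.default_rng(0)
speech = rng.standard_normal(16000)  # stand-in for a 1 s speech clip at 16 kHz
noise = rng.standard_normal(16000)
mixture = mix_at_snr(speech, noise, snr_db=-5.0)
```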
Affiliation(s)
- Violet A Brown
- Department of Psychological & Brain Sciences, Washington University in Saint Louis, St. Louis, USA
- Kristin J Van Engen
- Department of Psychological & Brain Sciences, Washington University in Saint Louis, St. Louis, USA
- Jonathan E Peelle
- Department of Otolaryngology, Washington University in Saint Louis, St. Louis, USA
19
Trotter AS, Banks B, Adank P. The Relevance of the Availability of Visual Speech Cues During Adaptation to Noise-Vocoded Speech. J Speech Lang Hear Res 2021;64:2513-2528. PMID: 34161748; DOI: 10.1044/2021_jslhr-20-00575.
Abstract
Purpose: This study first aimed to establish whether viewing specific parts of the speaker's face (eyes or mouth), compared to viewing the whole face, affected adaptation to distorted noise-vocoded sentences. Second, this study also aimed to replicate results on processing of distorted speech from lab-based experiments in an online setup. Method: We monitored recognition accuracy online while participants were listening to noise-vocoded sentences. We first established whether participants were able to perceive and adapt to audiovisual four-band noise-vocoded sentences when the entire moving face was visible (AV Full). Four further groups were then tested: a group in which participants viewed the moving lower part of the speaker's face (AV Mouth), a group in which participants saw only the moving upper part of the face (AV Eyes), a group in which participants could not see the moving lower or upper face (AV Blocked), and a group in which participants saw a still image of the face (AV Still). Results: Participants repeated around 40% of the key words correctly and adapted during the experiment, but only when the moving mouth was visible. In contrast, performance was at floor level, and no adaptation took place, in conditions where the moving mouth was occluded. Conclusions: The results show the importance of being able to observe relevant visual speech information from the speaker's mouth region, but not the eyes/upper face region, when listening and adapting to distorted sentences online. Second, the results also demonstrate that it is feasible to run speech perception and adaptation studies online, but that not all findings reported for lab studies replicate. Supplemental Material: https://doi.org/10.23641/asha.14810523.
Affiliation(s)
- Antony S Trotter
- Speech, Hearing and Phonetic Sciences, University College London, United Kingdom
- Briony Banks
- Department of Psychology, Lancaster University, United Kingdom
- Patti Adank
- Speech, Hearing and Phonetic Sciences, University College London, United Kingdom
20
Yu ME, Schertz J, Johnson EK. The Other Accent Effect in Talker Recognition: Now You See It, Now You Don't. Cogn Sci 2021;45:e12986. PMID: 34170043; DOI: 10.1111/cogs.12986.
Abstract
The existence of the Language Familiarity Effect (LFE), where talkers of a familiar language are easier to identify than talkers of an unfamiliar language, is well-documented and uncontroversial. However, a closely related phenomenon known as the Other Accent Effect (OAE), where accented talkers are more difficult to recognize, is less well understood. There are several possible explanations for why the OAE exists, but to date, little data exist to adjudicate differences between them. Here, we begin to address this issue by directly comparing listeners' recognition of talkers who speak in different types of accents, and by examining both the LFE and OAE in the same set of listeners. Specifically, Canadian English listeners were tested on their ability to recognize talkers within four types of voice line-ups: Canadian English talkers, Australian English talkers, Mandarin-accented English talkers, and Mandarin talkers. We predicted that the OAE would be present for talkers of Mandarin-accented English but not for talkers of Australian English, which is precisely what we observed. We also observed a disconnect between listeners' confidence and performance across different types of accents; that is, listeners performed equally poorly with Mandarin and Mandarin-accented talkers, but they were more confident in their performance with the latter group of talkers. The present findings set the stage for further investigation into the nature of the OAE by exploring a range of potential explanations for the effect, and introducing important implications for forensic scientists' evaluation of ear witness testimony.
Affiliation(s)
- Jessamyn Schertz
- Department of Language Studies, University of Toronto Mississauga
21
Smiljanic R, Keerstock S, Meemann K, Ransom SM. Face masks and speaking style affect audio-visual word recognition and memory of native and non-native speech. J Acoust Soc Am 2021;149:4013. PMID: 34241444; PMCID: PMC8269755; DOI: 10.1121/10.0005191.
Abstract
Though necessary, protective mask wearing in response to the COVID-19 pandemic presents communication challenges. The present study examines how signal degradation and loss of visual information due to masks affects intelligibility and memory for native and non-native speech. We also test whether clear speech can alleviate perceptual difficulty for masked speech. One native and one non-native speaker of English recorded video clips in conversational speech without a mask and conversational and clear speech with a mask. Native English listeners watched video clips presented in quiet or mixed with competing speech. The results showed that word recognition and recall of speech produced with a mask can be as accurate as without a mask in optimal listening conditions. Masks affected non-native speech processing at easier noise levels than native speech. Clear speech with a mask significantly improved accuracy in all listening conditions. Speaking clearly, reducing noise, and using surgical masks as well as good signal amplification can help compensate for the loss of intelligibility due to background noise, lack of visual cues, physical distancing, or non-native speech. The findings have implications for communication in classrooms and hospitals where listeners interact with teachers and healthcare providers, oftentimes non-native speakers, through their protective barriers.
Affiliation(s)
- Rajka Smiljanic
- Department of Linguistics, University of Texas at Austin, 305 East 23rd Street STOP B5100, Austin, Texas 78712, USA
- Sandie Keerstock
- Department of Psychological Sciences, University of Missouri, 124 Psychology Building, Columbia, Missouri 65211, USA
- Kirsten Meemann
- Department of Linguistics, University of Texas at Austin, 305 East 23rd Street STOP B5100, Austin, Texas 78712, USA
- Sarah M Ransom
- Department of Linguistics, University of Texas at Austin, 305 East 23rd Street STOP B5100, Austin, Texas 78712, USA
22
Bieber RE, Tinnemore AR, Yeni-Komshian G, Gordon-Salant S. Younger and older adults show non-linear, stimulus-dependent performance during early stages of auditory training for non-native English. J Acoust Soc Am 2021;149:4348. PMID: 34241442; PMCID: PMC8214469; DOI: 10.1121/10.0005279.
Abstract
Older adults often report difficulty understanding speech produced by non-native talkers. These listeners can achieve rapid adaptation to non-native speech, but few studies have assessed auditory training protocols to improve non-native speech recognition in older adults. In this study, a word-level training paradigm was employed, targeting improved recognition of Spanish-accented English. Younger and older adults were trained on Spanish-accented monosyllabic word pairs containing four phonemic contrasts (initial s/z, initial f/v, final b/p, final d/t) produced in English by multiple male native Spanish speakers. Listeners completed pre-testing, training, and post-testing over two sessions. Statistical methods, such as growth curve modeling and generalized additive mixed models, were employed to describe the patterns of rapid adaptation and how they varied between listener groups and phonemic contrasts. While the training protocol failed to elicit post-test improvements for recognition of Spanish-accented speech, examination of listeners' performance during the pre-testing period showed patterns of rapid adaptation that differed, depending on the nature of the phonemes to be learned and the listener group. Normal-hearing younger and older adults showed a faster rate of adaptation for non-native stimuli that were more nativelike in their productions, while older adults with hearing impairment did not realize this benefit.
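The generalized additive modeling named here can be approximated with a logistic GAM: accuracy as a smooth, possibly non-linear function of trial number, plus a factor term for phonemic contrast. The sketch below (pygam) omits the random effects a full GAMM would include; the data file and column names are hypothetical.

```python
# A minimal sketch of a logistic GAM for rapid-adaptation data: a smooth term
# over trial number captures non-linear learning curves, and a factor term
# distinguishes the phonemic contrasts being trained.
import pandas as pd
from pygam import LogisticGAM, s, f

df = pd.read_csv("adaptation_trials.csv")  # one row per trial
df["contrast_id"] = df["contrast"].astype("category").cat.codes

X = df[["trial", "contrast_id"]].to_numpy()
y = df["correct"].to_numpy()  # 1 = word pair identified correctly

gam = LogisticGAM(s(0) + f(1)).fit(X, y)  # s(0): trial smooth; f(1): contrast factor
gam.summary()
```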
Affiliation(s)
- Rebecca E Bieber
- Department of Hearing and Speech Sciences, University of Maryland College Park, College Park, Maryland 20742, USA
- Anna R Tinnemore
- Department of Hearing and Speech Sciences, University of Maryland College Park, College Park, Maryland 20742, USA
- Grace Yeni-Komshian
- Department of Hearing and Speech Sciences, University of Maryland College Park, College Park, Maryland 20742, USA
- Sandra Gordon-Salant
- Department of Hearing and Speech Sciences, University of Maryland College Park, College Park, Maryland 20742, USA
23
Strand JF, Ray L, Dillman-Hasso NH, Villanueva J, Brown VA. Understanding Speech Amid the Jingle and Jangle: Recommendations for Improving Measurement Practices in Listening Effort Research. Audit Percept Cogn 2020;3:169-188. PMID: 34240011; DOI: 10.1080/25742442.2021.1903293.
Abstract
The latent constructs psychologists study are typically not directly accessible, so researchers must design measurement instruments that are intended to provide insights about those constructs. Construct validation, assessing whether instruments measure what they intend to, is therefore critical for ensuring that the conclusions we draw actually reflect the intended phenomena. Insufficient construct validation can lead to the jingle fallacy, falsely assuming two instruments measure the same construct because the instruments share a name (Thorndike, 1904), and the jangle fallacy, falsely assuming two instruments measure different constructs because the instruments have different names (Kelley, 1927). In this paper, we examine construct validation practices in research on listening effort and identify patterns that strongly suggest the presence of jingle and jangle in the literature. We argue that the lack of construct validation for listening effort measures has led to inconsistent findings and hindered our understanding of the construct. We also provide specific recommendations for improving construct validation of listening effort instruments, drawing on the framework laid out in a recent paper on improving measurement practices (Flake & Fried, 2020). Although this paper addresses listening effort, the issues raised and recommendations presented are widely applicable to tasks used in research on auditory perception and cognitive psychology.
Affiliation(s)
- Lucia Ray
- Carleton College, Department of Psychology
- Violet A Brown
- Washington University in St. Louis, Department of Psychological & Brain Sciences