1
Cai Y, Strauch C, Van der Stigchel S, Naber M. Open-DPSM: An open-source toolkit for modeling pupil size changes to dynamic visual inputs. Behav Res Methods 2024; 56:5605-5621. [PMID: 38082113 PMCID: PMC11335788 DOI: 10.3758/s13428-023-02292-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 11/07/2023] [Indexed: 08/21/2024]
Abstract
Pupil size change is a widely adopted, sensitive indicator for sensory and cognitive processes. However, the interpretation of these changes is complicated by the influence of multiple low-level effects, such as brightness or contrast changes, posing challenges to applying pupillometry outside of extremely controlled settings. Building on and extending previous models, we here introduce Open Dynamic Pupil Size Modeling (Open-DPSM), an open-source toolkit to model pupil size changes to dynamically changing visual inputs using a convolution approach. Open-DPSM incorporates three key steps: (1) Modeling pupillary responses to both luminance and contrast changes; (2) Weighing of the distinct contributions of visual events across the visual field on pupil size change; and (3) Incorporating gaze-contingent visual event extraction and modeling. These steps improve the prediction of pupil size changes beyond the here-evaluated benchmarks. Open-DPSM provides Python functions, as well as a graphical user interface (GUI), enabling the extension of its applications to versatile scenarios and adaptations to individualized needs. By obtaining a predicted pupil trace using video and eye-tracking data, users can mitigate the effects of low-level features by subtracting the predicted trace or assess the efficacy of the low-level feature manipulations a priori by comparing estimated traces across conditions.
Affiliation(s)
- Yuqing Cai
- Experimental Psychology, Helmholtz Institute, Faculty of Social Sciences, Utrecht University, Heidelberglaan 1, 3584 CS Utrecht, The Netherlands
- Christoph Strauch
- Experimental Psychology, Helmholtz Institute, Faculty of Social Sciences, Utrecht University, Heidelberglaan 1, 3584 CS Utrecht, The Netherlands
- Stefan Van der Stigchel
- Experimental Psychology, Helmholtz Institute, Faculty of Social Sciences, Utrecht University, Heidelberglaan 1, 3584 CS Utrecht, The Netherlands
- Marnix Naber
- Experimental Psychology, Helmholtz Institute, Faculty of Social Sciences, Utrecht University, Heidelberglaan 1, 3584 CS Utrecht, The Netherlands
2
McLaughlin DJ, Van Engen KJ. Social Priming: Exploring the Effects of Speaker Race and Ethnicity on Perception of Second Language Accents. Language and Speech 2024; 67:821-845. [PMID: 37772514 DOI: 10.1177/00238309231199245] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/30/2023]
Abstract
Listeners use more than just acoustic information when processing speech. Social information, such as a speaker's perceived race or ethnicity, can also affect the processing of the speech signal, in some cases facilitating perception ("social priming"). We aimed to replicate and extend this line of inquiry, examining effects of multiple social primes (i.e., a Middle Eastern, White, or East Asian face, or a control silhouette image) on the perception of Mandarin Chinese-accented English and Arabic-accented English. By including uncommon priming combinations (e.g., a Middle Eastern prime for a Mandarin accent), we aimed to test the specificity of social primes: For example, can a Middle Eastern face facilitate perception of both Arabic-accented English and Mandarin-accented English? Contrary to our predictions, our results indicated no facilitative social priming effects for either of the second language (L2) accents. Results for our examination of specificity were mixed. Trends in the data indicated that the combination of an East Asian prime with Arabic accent resulted in lower accuracy as compared with a White prime, but the combination of a Middle Eastern prime with a Mandarin accent did not (and may have actually benefited listeners to some degree). We conclude that the specificity of priming effects may depend on listeners' level of familiarity with a given accent and/or racial/ethnic group and that the mixed outcomes in the current work motivate further inquiries to determine whether social priming effects for L2-accented speech may be smaller than previously hypothesized and/or highly dependent on listener experience.
Affiliation(s)
- Drew J McLaughlin
- Department of Psychological & Brain Sciences, Washington University in St. Louis, USA; Basque Center on Cognition, Brain and Language, Spain
- Kristin J Van Engen
- Department of Psychological & Brain Sciences, Washington University in St. Louis, USA
3
Bieber RE, Makashay MJ, Sheffield BM, Brungart DS. Intelligibility of Natively and Nonnatively Produced English Speech Presented in Noise to a Large Cohort of United States Service Members. Journal of Speech, Language, and Hearing Research: JSLHR 2024; 67:2454-2472. [PMID: 38950169 DOI: 10.1044/2024_jslhr-23-00312] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/03/2024]
Abstract
PURPOSE: A corpus of English matrix sentences produced by 60 native and nonnative speakers of English was developed as part of a multinational coalition task group. This corpus was tested on a large cohort of U.S. Service members in order to examine the effects of talker nativeness, listener nativeness, masker type, and hearing sensitivity on speech recognition performance in this population. METHOD: A total of 1,939 U.S. Service members (ages 18-68 years) completed this closed-set listening task, including 430 women and 110 nonnative English speakers. Stimuli were produced by native and nonnative speakers of English and were presented in speech-shaped noise and multitalker babble. Keyword recognition accuracy and response times were analyzed. RESULTS: General(ized) linear mixed-effects regression models found that, on the whole, speech recognition performance was lower for listeners who identified as nonnative speakers of English and when listening to speech produced by nonnative speakers of English. Talker and listener effects were more pronounced when listening in a babble masker than in a speech-shaped noise masker. Response times varied as a function of recognition score, with longest response times found for intermediate levels of performance. CONCLUSIONS: This study found additive effects of talker and listener nonnativeness when listening to speech in background noise. These effects were present in both accuracy and response time measures. No multiplicative effects of talker and listener language background were found. There was little evidence of a negative interaction between talker nonnativeness and hearing impairment, suggesting that these factors may have redundant effects on speech recognition. SUPPLEMENTAL MATERIAL: https://doi.org/10.23641/asha.26060191.
Affiliation(s)
- Rebecca E Bieber
- National Military Audiology and Speech Pathology Center, Walter Reed National Military Medical Center, Bethesda, MD
- Henry M. Jackson Foundation for the Advancement of Military Medicine, Inc., Bethesda, MD
- Matthew J Makashay
- National Military Audiology and Speech Pathology Center, Walter Reed National Military Medical Center, Bethesda, MD
- Hearing Conservation and Readiness Branch, Defense Centers for Public Health - Aberdeen, Aberdeen Proving Ground, MD
- Benjamin M Sheffield
- National Military Audiology and Speech Pathology Center, Walter Reed National Military Medical Center, Bethesda, MD
- Hearing Conservation and Readiness Branch, Defense Centers for Public Health - Aberdeen, Aberdeen Proving Ground, MD
- Douglas S Brungart
- National Military Audiology and Speech Pathology Center, Walter Reed National Military Medical Center, Bethesda, MD
4
Chernyak BR, Bradlow AR, Keshet J, Goldrick M. A perceptual similarity space for speech based on self-supervised speech representations. The Journal of the Acoustical Society of America 2024; 155:3915-3929. [PMID: 38904539 DOI: 10.1121/10.0026358] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/19/2023] [Accepted: 05/29/2024] [Indexed: 06/22/2024]
Abstract
Speech recognition by both humans and machines frequently fails in non-optimal yet common situations. For example, word recognition error rates for second-language (L2) speech can be high, especially under conditions involving background noise. At the same time, both human and machine speech recognition sometimes shows remarkable robustness against signal- and noise-related degradation. Which acoustic features of speech explain this substantial variation in intelligibility? Current approaches align speech to text to extract a small set of pre-defined spectro-temporal properties from specific sounds in particular words. However, variation in these properties leaves much cross-talker variation in intelligibility unexplained. We examine an alternative approach utilizing a perceptual similarity space acquired using self-supervised learning. This approach encodes distinctions between speech samples without requiring pre-defined acoustic features or speech-to-text alignment. We show that L2 English speech samples are less tightly clustered in the space than L1 samples reflecting variability in English proficiency among L2 talkers. Critically, distances in this similarity space are perceptually meaningful: L1 English listeners have lower recognition accuracy for L2 speakers whose speech is more distant in the space from L1 speech. These results indicate that perceptual similarity may form the basis for an entirely new speech and language analysis approach.
Affiliation(s)
- Bronya R Chernyak
- Faculty of Electrical & Computer Engineering, Technion-Israel Institute of Technology, Haifa 3200003, Israel
- Ann R Bradlow
- Department of Linguistics, Northwestern University, Evanston, Illinois 60208, USA
- Joseph Keshet
- Faculty of Electrical & Computer Engineering, Technion-Israel Institute of Technology, Haifa 3200003, Israel
- Matthew Goldrick
- Department of Linguistics, Northwestern University, Evanston, Illinois 60208, USA
5
Gangopadhyay I, Fulford D, Corriveau K, Mow J, Li PH, Arunachalam S. Pupils Dilate More to Harder Vocabulary Words than Easier Ones. Cogn Sci 2024; 48:e13446. [PMID: 38655881 DOI: 10.1111/cogs.13446] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/29/2023] [Revised: 03/27/2024] [Accepted: 04/08/2024] [Indexed: 04/26/2024]
Abstract
Understanding cognitive effort expended during assessments is essential to improving efficiency, accuracy, and accessibility within these assessments. Pupil dilation is commonly used as a psychophysiological measure of cognitive effort, yet research on its relationship with effort expended specifically during language processing is limited. The present study adds to and expands on this literature by investigating the relationships among pupil dilation, trial difficulty, and accuracy during a vocabulary test. Participants (n = 63, mean age = 19.25) completed a subset of trials from the Peabody Picture Vocabulary Test while seated at an eye-tracker monitor. During each trial, four colored images were presented on the monitor while a word was presented via audio recording. Participants verbally indicated which image they thought represented the target word. Words were categorized into Easy, Medium, and Hard difficulty. Pupil dilation during the Medium and Hard trials was significantly greater than during the Easy trials, though the Medium and Hard trials did not significantly differ from each other. Pupil dilation in comparison to trial accuracy presented a more complex pattern, with comparisons between accurate and inaccurate trials differing depending on the timing of the stimulus presentation. These results present further evidence that pupil dilation increases with cognitive effort associated with vocabulary tests, providing insights that could help refine vocabulary assessments and other related tests of language processing.
Affiliation(s)
- Daniel Fulford
- Department of Occupational Therapy, Boston University
- Department of Psychological and Brain Sciences, Boston University
- Kathleen Corriveau
- Department of Counseling Psychology and Applied Human Development, Boston University
- Jessica Mow
- Department of Occupational Therapy, Boston University
- Pearl Han Li
- Department of Psychology and Neuroscience, Duke University
- Sudha Arunachalam
- Department of Communicative Sciences and Disorders, New York University
6
Kato M, Baese-Berk MM. The Effects of Acoustic and Semantic Enhancements on Perception of Native and Non-Native Speech. Language and Speech 2024; 67:40-71. [PMID: 36967604 DOI: 10.1177/00238309231156615] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/18/2023]
Abstract
Previous research has shown that native listeners benefit from clearly produced speech, as well as from predictable semantic context when these enhancements are delivered in native speech. However, it is unclear whether native listeners benefit from acoustic and semantic enhancements differently when listening to other varieties of speech, including non-native speech. The current study examines to what extent native English listeners benefit from acoustic and semantic cues present in native and non-native English speech. Native English listeners transcribed sentence final words that were of different levels of semantic predictability, produced in plain- or clear-speaking styles by Native English talkers and by native Mandarin talkers of higher- and lower-proficiency in English. The perception results demonstrated that listeners benefited from semantic cues in higher- and lower-proficiency talkers' speech (i.e., transcribed speech more accurately), but not from acoustic cues, even though higher-proficiency talkers did make substantial acoustic enhancements from plain to clear speech. The current results suggest that native listeners benefit more robustly from semantic cues than from acoustic cues when those cues are embedded in non-native speech.
Affiliation(s)
- Misaki Kato
- Department of Linguistics, University of Oregon, USA
7
Mechtenberg H, Giorio C, Myers EB. Pupil Dilation Reflects Perceptual Priorities During a Receptive Speech Task. Ear Hear 2024; 45:425-440. [PMID: 37882091 PMCID: PMC10868674 DOI: 10.1097/aud.0000000000001438] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 11/10/2022] [Accepted: 09/01/2023] [Indexed: 10/27/2023]
Abstract
OBJECTIVES: The listening demand incurred by speech perception fluctuates in normal conversation. At the acoustic-phonetic level, natural variation in pronunciation acts as speedbumps to accurate lexical selection. Any given utterance may be more or less phonetically ambiguous, a problem that must be resolved by the listener to choose the correct word. This becomes especially apparent when considering two common speech registers (clear and casual) that have characteristically different levels of phonetic ambiguity. Clear speech prioritizes intelligibility through hyperarticulation which results in less ambiguity at the phonetic level, while casual speech tends to have a more collapsed acoustic space. We hypothesized that listeners would invest greater cognitive resources while listening to casual speech to resolve the increased amount of phonetic ambiguity, as compared with clear speech. To this end, we used pupillometry as an online measure of listening effort during perception of clear and casual continuous speech in two background conditions: quiet and noise. DESIGN: Forty-eight participants performed a probe detection task while listening to spoken, nonsensical sentences (masked and unmasked) while recording pupil size. Pupil size was modeled using growth curve analysis to capture the dynamics of the pupil response as the sentence unfolded. RESULTS: Pupil size during listening was sensitive to the presence of noise and speech register (clear/casual). Unsurprisingly, listeners had overall larger pupil dilations during speech perception in noise, replicating earlier work. The pupil dilation pattern for clear and casual sentences was considerably more complex. Pupil dilation during clear speech trials was slightly larger than for casual speech, across quiet and noisy backgrounds. CONCLUSIONS: We suggest that listener motivation could explain the larger pupil dilations to clearly spoken speech. We propose that, bounded by the context of this task, listeners devoted more resources to perceiving the speech signal with the greatest acoustic/phonetic fidelity. Further, we unexpectedly found systematic differences in pupil dilation preceding the onset of the spoken sentences. Together, these data demonstrate that the pupillary system is not merely reactive but also adaptive: it is sensitive to both task structure and listener motivation to maximize accurate perception in a limited resource system.
Affiliation(s)
- Hannah Mechtenberg
- Department of Psychological Sciences, University of Connecticut, Storrs, Connecticut, USA
- Cristal Giorio
- Department of Psychology, Pennsylvania State University, State College, Pennsylvania, USA
- Emily B. Myers
- Department of Psychological Sciences, University of Connecticut, Storrs, Connecticut, USA
- Department of Speech, Language and Hearing Sciences, University of Connecticut, Storrs, Connecticut, USA
8
Ershaid H, Lizarazu M, McLaughlin D, Cooke M, Simantiraki O, Koutsogiannaki M, Lallier M. Contributions of listening effort and intelligibility to cortical tracking of speech in adverse listening conditions. Cortex 2024; 172:54-71. [PMID: 38215511 DOI: 10.1016/j.cortex.2023.11.018] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/10/2023] [Revised: 09/05/2023] [Accepted: 11/14/2023] [Indexed: 01/14/2024]
Abstract
Cortical tracking of speech is vital for speech segmentation and is linked to speech intelligibility. However, there is no clear consensus as to whether reduced intelligibility leads to a decrease or an increase in cortical speech tracking, warranting further investigation of the factors influencing this relationship. One such factor is listening effort, defined as the cognitive resources necessary for speech comprehension, and reported to have a strong negative correlation with speech intelligibility. Yet, no studies have examined the relationship between speech intelligibility, listening effort, and cortical tracking of speech. The aim of the present study was thus to examine these factors in quiet and distinct adverse listening conditions. Forty-nine normal hearing adults listened to sentences produced casually, presented in quiet and two adverse listening conditions: cafeteria noise and reverberant speech. Electrophysiological responses were registered with electroencephalogram, and listening effort was estimated subjectively using self-reported scores and objectively using pupillometry. Results indicated varying impacts of adverse conditions on intelligibility, listening effort, and cortical tracking of speech, depending on the preservation of the speech temporal envelope. The more distorted envelope in the reverberant condition led to higher listening effort, as reflected in higher subjective scores, increased pupil diameter, and stronger cortical tracking of speech in the delta band. These findings suggest that using measures of listening effort in addition to those of intelligibility is useful for interpreting cortical tracking of speech results. Moreover, reading and phonological skills of participants were positively correlated with listening effort in the cafeteria condition, suggesting a special role of expert language skills in processing speech in this noisy condition. 
Implications for future research and theories linking atypical cortical tracking of speech and reading disorders are further discussed.
Affiliation(s)
- Hadeel Ershaid
- Basque Center on Cognition, Brain and Language, San Sebastian, Spain.
- Mikel Lizarazu
- Basque Center on Cognition, Brain and Language, San Sebastian, Spain.
- Drew McLaughlin
- Basque Center on Cognition, Brain and Language, San Sebastian, Spain.
- Martin Cooke
- Ikerbasque, Basque Science Foundation, Bilbao, Spain.
- Marie Lallier
- Basque Center on Cognition, Brain and Language, San Sebastian, Spain; Ikerbasque, Basque Science Foundation, Bilbao, Spain.
9
Giuliani NP, Venkitakrishnan S, Wu YH. Input-related demands: vocoded sentences evoke different pupillometrics and subjective listening effort than sentences in speech-shaped noise. Int J Audiol 2024; 63:199-206. [PMID: 36519812 PMCID: PMC10947987 DOI: 10.1080/14992027.2022.2150901] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/25/2022] [Revised: 11/17/2022] [Accepted: 11/18/2022] [Indexed: 12/23/2022]
Abstract
OBJECTIVES: The Framework for Effortful Listening (FUEL) suggests five input-related demands can alter listening effort: source, transmission, listener, message and context factors. We hypothesised that vocoded sentences represented a source factor degradation and sentences in speech-shaped noise represented a transmission factor degradation. We used pupillometry and a subjective scale to examine our hypothesis. DESIGN: Participants listened to vocoded sentences and sentences in speech-shaped noise at several difficulty levels designed to produce similar word recognition abilities; they also listened to unprocessed sentences. Within-participant pupillometrics and subjective listening effort were analysed. Post-hoc analyses were performed to examine if word recognition accuracy differentially influenced pupil responses. STUDY SAMPLES: Twenty young adults with normal hearing. RESULTS: Baseline pupil diameter was significantly smaller, peak pupil dilation was significantly larger, peak pupil dilation latency was significantly shorter, and subjective listening effort was significantly greater for the vocoded sentences than the sentences-in-noise. Word recognition ability also affected pupillometrics, but only for the vocoded sentences. CONCLUSIONS: Our findings suggest that source factor degradations result in greater listening effort than transmission factor degradations. Future research should address how clinical interventions tailored towards different input-related demands may lead to reduced listening effort and improve patient outcomes.
Affiliation(s)
- Nicholas P. Giuliani
- Department of Otolaryngology, University of Iowa Hospitals and Clinics, Iowa City, IA, USA
- Soumya Venkitakrishnan
- Department of Communication Sciences and Disorders, University of Iowa, Iowa City, IA, USA
- Yu-Hsiang Wu
- Department of Otolaryngology, University of Iowa Hospitals and Clinics, Iowa City, IA, USA
- Department of Communication Sciences and Disorders, University of Iowa, Iowa City, IA, USA
10
McLaughlin DJ, Colvett JS, Bugg JM, Van Engen KJ. Sequence effects and speech processing: cognitive load for speaker-switching within and across accents. Psychon Bull Rev 2024; 31:176-186. [PMID: 37442872 PMCID: PMC10867039 DOI: 10.3758/s13423-023-02322-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Accepted: 06/08/2023] [Indexed: 07/15/2023]
Abstract
Prior work in speech processing indicates that listening tasks with multiple speakers (as opposed to a single speaker) result in slower and less accurate processing. Notably, the trial-to-trial cognitive demands of switching between speakers or switching between accents have yet to be examined. We used pupillometry, a physiological index of cognitive load, to examine the demands of processing first (L1) and second (L2) language-accented speech when listening to sentences produced by the same speaker consecutively (no switch), a novel speaker of the same accent (within-accent switch), and a novel speaker with a different accent (across-accent switch). Inspired by research on sequential adjustments in cognitive control, we aimed to identify the cognitive demands of accommodating a novel speaker and accent by examining the trial-to-trial changes in pupil dilation during speech processing. Our results indicate that switching between speakers was more cognitively demanding than listening to the same speaker consecutively. Additionally, switching to a novel speaker with a different accent was more cognitively demanding than switching between speakers of the same accent. However, there was an asymmetry for across-accent switches, such that switching from an L1 to an L2 accent was more demanding than vice versa. Findings from the present study align with work examining multi-talker processing costs, and provide novel evidence that listeners dynamically adjust cognitive processing to accommodate speaker and accent variability. We discuss these novel findings in the context of an active control model and auditory streaming framework of speech processing.
Affiliation(s)
- Drew J McLaughlin
- Department of Psychological and Brain Sciences, Washington University in St. Louis, St Louis, MO, USA.
- Basque Center on Cognition, Brain and Language, Paseo Mikeletegi, 69, 20009, Donostia-San Sebastián, Gipuzkoa, Spain.
- Jackson S Colvett
- Department of Psychological and Brain Sciences, Washington University in St. Louis, St Louis, MO, USA
- Julie M Bugg
- Department of Psychological and Brain Sciences, Washington University in St. Louis, St Louis, MO, USA
- Kristin J Van Engen
- Department of Psychological and Brain Sciences, Washington University in St. Louis, St Louis, MO, USA
11
Suite L, Freiwirth G, Babel M. Receptive vocabulary predicts multilinguals' recognition skills in adverse listening conditions. The Journal of the Acoustical Society of America 2023; 154:3916-3930. [PMID: 38126803 DOI: 10.1121/10.0023960] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/20/2023] [Accepted: 11/29/2023] [Indexed: 12/23/2023]
Abstract
Adverse listening conditions are known to affect bilingual listeners' intelligibility scores more than those of monolingual listeners. To advance theoretical understanding of the mechanisms underpinning bilinguals' challenges in adverse listening conditions, vocabulary size and language entropy are compared as predictors in a sentence transcription task with a heterogeneous multilingual population representative of a speech community. Adverse listening was induced through noise type, bandwidth manipulations, and sentences varying in their semantic predictability. Overall, the results generally confirm anticipated patterns with respect to sentence type, noise masking, and bandwidth. Listeners show better comprehension of semantically coherent utterances without masking and with a full spectrum. Crucially, listeners with larger receptive vocabularies and lower language entropy, a measure of the predictability of one's language use, showed improved performance in adverse listening conditions. Vocabulary size had a substantially larger effect size, indicating that vocabulary size has more impact on performance in adverse listening conditions than bilingual language use. These results suggest that the mechanism behind the bilingual disadvantage in adverse listening conditions may be rooted in bilinguals' smaller language-specific receptive vocabularies, offering a harmonious explanation for challenges in adverse listening conditions experienced by monolinguals and multilinguals.
Affiliation(s)
- Lexia Suite
- Department of Linguistics, University of British Columbia, Vancouver, British Columbia V6T 1Z4, Canada
- Galia Freiwirth
- Department of Linguistics, University of British Columbia, Vancouver, British Columbia V6T 1Z4, Canada
- Molly Babel
- Department of Linguistics, University of British Columbia, Vancouver, British Columbia V6T 1Z4, Canada
12
Carraturo S, McLaughlin DJ, Peelle JE, Van Engen KJ. Pupillometry reveals differences in cognitive demands of listening to face mask-attenuated speech. The Journal of the Acoustical Society of America 2023; 154:3973-3985. [PMID: 38149818 DOI: 10.1121/10.0023953] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/20/2023] [Accepted: 11/29/2023] [Indexed: 12/28/2023]
Abstract
Face masks offer essential protection but also interfere with speech communication. Here, audio-only sentences spoken through four types of masks were presented in noise to young adult listeners. Pupil dilation (an index of cognitive demand), intelligibility, and subjective effort and performance ratings were collected. Dilation increased in response to each mask relative to the no-mask condition and differed significantly where acoustic attenuation was most prominent. These results suggest that the acoustic impact of the mask drives not only the intelligibility of speech, but also the cognitive demands of listening. Subjective effort ratings reflected the same trends as the pupil data.
Affiliation(s)
- Sita Carraturo
- Department of Psychological & Brain Sciences, Washington University in St. Louis, Saint Louis, Missouri 63130, USA
- Drew J McLaughlin
- Basque Center on Cognition, Brain and Language, San Sebastian, Basque Country 20009, Spain
- Jonathan E Peelle
- Department of Communication Sciences and Disorders, Northeastern University, Boston, Massachusetts 02115, USA
- Kristin J Van Engen
- Department of Psychological & Brain Sciences, Washington University in St. Louis, Saint Louis, Missouri 63130, USA
13
McLaughlin DJ, Van Engen KJ. Exploring effects of social information on talker-independent accent adaptation. JASA Express Letters 2023; 3:125201. [PMID: 38059794 DOI: 10.1121/10.0022536] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/14/2023] [Accepted: 11/01/2023] [Indexed: 12/08/2023]
Abstract
The present study examined whether race information about speakers can promote rapid and generalizable perceptual adaptation to second-language accent. First-language English listeners were presented with Cantonese-accented English sentences in speech-shaped noise during a training session with three intermixed talkers, followed by a test session with a novel (i.e., fourth) talker. Participants were assigned to view either three East Asian or three White faces during training, corresponding to each speaker. Results indicated no effect of the social priming manipulation on the training or test sessions, although both groups performed better at test than a control group.
Affiliation(s)
- Drew J McLaughlin
- Basque Center on Cognition, Brain and Language, Donostia-San Sebastián, Gipuzkoa 20018, Spain
- Department of Psychological & Brain Sciences, Washington University in St. Louis, St. Louis, Missouri 63130
- Kristin J Van Engen
- Department of Psychological & Brain Sciences, Washington University in St. Louis, St. Louis, Missouri 63130
14
McLaughlin DJ, Zink ME, Gaunt L, Reilly J, Sommers MS, Van Engen KJ, Peelle JE. Give me a break! Unavoidable fatigue effects in cognitive pupillometry. Psychophysiology 2023; 60:e14256. [PMID: 36734299 PMCID: PMC11161670 DOI: 10.1111/psyp.14256] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 05/18/2022] [Revised: 11/15/2022] [Accepted: 12/17/2022] [Indexed: 02/04/2023]
Abstract
Pupillometry has a rich history in the study of perception and cognition. One perennial challenge is that the magnitude of the task-evoked pupil response diminishes over the course of an experiment, a phenomenon we refer to as a fatigue effect. Reducing fatigue effects may improve sensitivity to task effects and reduce the likelihood of confounds due to systematic physiological changes over time. In this paper, we investigated the degree to which fatigue effects could be ameliorated by experimenter intervention. In Experiment 1, we assigned participants to one of three groups (no breaks; kinetic breaks, i.e., playing with toys but no social interaction; or chatting with a research assistant) and compared the pupil response across conditions. In Experiment 2, we additionally tested the effect of researcher observation. Only breaks including social interaction significantly reduced the fatigue of the pupil response across trials. However, in all conditions we found robust evidence for fatigue effects: that is, regardless of protocol, the task-evoked pupil response was substantially diminished (by at least 60%) over the duration of the experiment. We account for the variance of fatigue effects in our pupillometry data using multiple common statistical modeling approaches (e.g., linear mixed-effects models of peak, mean, and baseline pupil diameters, as well as growth curve models of time-course data). We conclude that pupil attenuation is a predictable phenomenon that should be accommodated in our experimental designs and statistical models.
Affiliation(s)
- Drew J. McLaughlin
- Department of Psychological and Brain Sciences, Washington University in Saint Louis, St. Louis, Missouri, USA
- Maggie E. Zink
- Department of Otolaryngology, Washington University in Saint Louis, St. Louis, Missouri, USA
- Lauren Gaunt
- Department of Psychological and Brain Sciences, Washington University in Saint Louis, St. Louis, Missouri, USA
- Jamie Reilly
- Department of Communication Sciences and Disorders, Temple University, Philadelphia, Pennsylvania, USA
- Mitchell S. Sommers
- Department of Psychological and Brain Sciences, Washington University in Saint Louis, St. Louis, Missouri, USA
- Kristin J. Van Engen
- Department of Psychological and Brain Sciences, Washington University in Saint Louis, St. Louis, Missouri, USA
- Jonathan E. Peelle
- Department of Otolaryngology, Washington University in Saint Louis, St. Louis, Missouri, USA
15
Rovetti J, Sumantry D, Russo FA. Exposure to nonnative-accented speech reduces listening effort and improves social judgments of the speaker. Sci Rep 2023; 13:2808. [PMID: 36797318 PMCID: PMC9935874 DOI: 10.1038/s41598-023-29082-1] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 08/25/2022] [Accepted: 01/30/2023] [Indexed: 02/18/2023]
Abstract
Prior research has revealed a native-accent advantage, whereby nonnative-accented speech is more difficult to process than native-accented speech. Nonnative-accented speakers also experience more negative social judgments. In the current study, we asked three questions. First, does exposure to nonnative-accented speech increase speech intelligibility or decrease listening effort, thereby narrowing the native-accent advantage? Second, does lower intelligibility or higher listening effort contribute to listeners' negative social judgments of speakers? Third and finally, does increased intelligibility or decreased listening effort with exposure to speech bring about more positive social judgments of speakers? To address these questions, normal-hearing adults listened to a block of English sentences with a native accent and a block with a nonnative accent. We found that once participants were accustomed to the task, intelligibility was greater for nonnative-accented speech and increased similarly with exposure for both accents. However, listening effort decreased only for nonnative-accented speech, soon reaching the level of native-accented speech. In addition, lower intelligibility and higher listening effort were associated with lower ratings of speaker warmth, speaker competence, and willingness to interact with the speaker. Finally, competence ratings increased over time to a similar extent for both accents, with this relationship fully mediated by intelligibility and listening effort. These results offer insight into how listeners process and judge unfamiliar speakers.
Affiliation(s)
- Joseph Rovetti
- Department of Psychology, Western University, London, ON N6A 3K7, Canada; Department of Psychology, Toronto Metropolitan University, Toronto, ON M5B 2K3, Canada
- David Sumantry
- Department of Psychology, Toronto Metropolitan University, Toronto, ON M5B 2K3, Canada
- Frank A. Russo
- Department of Psychology, Toronto Metropolitan University, Toronto, ON M5B 2K3, Canada
16
Baese-Berk MM, Levi SV, Van Engen KJ. Intelligibility as a measure of speech perception: Current approaches, challenges, and recommendations. The Journal of the Acoustical Society of America 2023; 153:68. [PMID: 36732227 DOI: 10.1121/10.0016806] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 03/30/2022] [Accepted: 12/18/2022] [Indexed: 06/18/2023]
Abstract
Intelligibility measures, which assess the number of words or phonemes a listener correctly transcribes or repeats, are commonly used metrics for speech perception research. While these measures have many benefits for researchers, they also come with a number of limitations. By pointing out the strengths and limitations of this approach, including how it fails to capture aspects of perception such as listening effort, this article argues that the role of intelligibility measures must be reconsidered in fields such as linguistics, communication disorders, and psychology. Recommendations for future work in this area are presented.
Affiliation(s)
- Susannah V Levi
- Department of Communicative Sciences and Disorders, New York University, New York, New York 10012, USA
- Kristin J Van Engen
- Department of Psychological and Brain Sciences, Washington University in St. Louis, St. Louis, Missouri 63130, USA
17
Relaño-Iborra H, Wendt D, Neagu MB, Kressner AA, Dau T, Bækgaard P. Baseline pupil size encodes task-related information and modulates the task-evoked response in a speech-in-noise task. Trends Hear 2022; 26:23312165221134003. [PMID: 36426573 PMCID: PMC9703509 DOI: 10.1177/23312165221134003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/26/2022]
Abstract
Pupillometry data are commonly reported relative to a baseline value recorded in a controlled pre-task condition. In this study, the influence of the experimental design and the preparatory processing related to task difficulty on the baseline pupil size was investigated during a speech intelligibility in noise paradigm. Furthermore, the relationship between the baseline pupil size and the temporal dynamics of the pupil response was assessed. The analysis revealed strong effects of block presentation order, within-block sentence order and task difficulty on the baseline values. An interaction between signal-to-noise ratio and block order was found, indicating that baseline values reflect listener expectations arising from the order in which the different blocks were presented. Furthermore, the baseline pupil size was found to affect the slope, delay and curvature of the pupillary response as well as the peak pupil dilation. This suggests that baseline correction might be sufficient when reporting pupillometry results in terms of mean pupil dilation only, but not when a more complex characterization of the temporal dynamics of the response is considered. By clarifying which factors affect baseline pupil size and how baseline values interact with the task-evoked response, the results from the present study can contribute to a better interpretation of the pupillary response as a marker of cognitive processing.
Affiliation(s)
- Helia Relaño-Iborra
- Cognitive Systems Section, Department of Applied Mathematics and Computer Science, Technical University of Denmark, 2800 Kgs. Lyngby, Denmark; Hearing Systems Section, Department of Health Technology, Technical University of Denmark, 2800 Kgs. Lyngby, Denmark
- Dorothea Wendt
- Eriksholm Research Center, Oticon, 3070 Snekkersten, Denmark
- Mihaela Beatrice Neagu
- Hearing Systems Section, Department of Health Technology, Technical University of Denmark, 2800 Kgs. Lyngby, Denmark
- Abigail Anne Kressner
- Hearing Systems Section, Department of Health Technology, Technical University of Denmark, 2800 Kgs. Lyngby, Denmark; Copenhagen Hearing and Balance Center, Rigshospitalet, 2100 Copenhagen, Denmark
- Torsten Dau
- Hearing Systems Section, Department of Health Technology, Technical University of Denmark, 2800 Kgs. Lyngby, Denmark
- Per Bækgaard
- Cognitive Systems Section, Department of Applied Mathematics and Computer Science, Technical University of Denmark, 2800 Kgs. Lyngby, Denmark
18
Winn MB, Teece KH. Effortful Listening Despite Correct Responses: The Cost of Mental Repair in Sentence Recognition by Listeners With Cochlear Implants. Journal of Speech, Language, and Hearing Research 2022; 65:3966-3980. [PMID: 36112516 PMCID: PMC9927629 DOI: 10.1044/2022_jslhr-21-00631] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 11/28/2021] [Revised: 04/20/2022] [Accepted: 06/24/2022] [Indexed: 06/15/2023]
Abstract
PURPOSE: Speech recognition percent correct scores fail to capture the effort of mentally repairing the perception of speech that was initially misheard. This study measured the effort of listening to stimuli specifically designed to elicit mental repair in adults who use cochlear implants (CIs). METHOD: CI listeners heard and repeated sentences in which specific words were distorted or masked by noise but recovered based on later context: a signature of mental repair. Changes in pupil dilation were tracked as an index of effort and time-locked with specific landmarks during perception. RESULTS: Effort significantly increases when a listener needs to repair a misperceived word, even if the verbal response is ultimately correct. Mental repair of words in a sentence was accompanied by greater prevalence of errors elsewhere in the same sentence, suggesting that effort spreads to consume resources across time. The cost of mental repair in CI listeners was essentially the same as that observed in listeners with normal hearing in previous work. CONCLUSIONS: Listening effort as tracked by pupil dilation is better explained by the mental repair and reconstruction of words rather than the appearance of correct or incorrect perception. Linguistic coherence drives effort more heavily than the mere presence of mistakes, highlighting the importance of testing materials that do not constrain coherence by design.
Affiliation(s)
- Matthew B. Winn
- Department of Speech-Language-Hearing Sciences, University of Minnesota, Twin Cities, Minneapolis
- Katherine H. Teece
- Department of Speech-Language-Hearing Sciences, University of Minnesota, Twin Cities, Minneapolis
19
Revisiting the relationship between implicit racial bias and audiovisual benefit for nonnative-accented speech. Atten Percept Psychophys 2022; 84:2074-2086. [PMID: 34988904 DOI: 10.3758/s13414-021-02423-w] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Accepted: 11/30/2021] [Indexed: 01/25/2023]
Abstract
Speech intelligibility is improved when the listener can see the talker in addition to hearing their voice. Notably, though, previous work has suggested that this "audiovisual benefit" for nonnative (i.e., foreign-accented) speech is smaller than the benefit for native speech, an effect that may be partially accounted for by listeners' implicit racial biases (Yi et al., 2013, The Journal of the Acoustical Society of America, 134[5], EL387-EL393). In the present study, we sought to replicate these findings in a significantly larger sample of online participants. In a direct replication of Yi et al. (Experiment 1), we found that audiovisual benefit was indeed smaller for nonnative-accented relative to native-accented speech. However, our results did not support the conclusion that implicit racial biases, as measured with two types of implicit association tasks, were related to these differences in audiovisual benefit for native and nonnative speech. In a second experiment, we addressed a potential confound in the experimental design; to ensure that the difference in audiovisual benefit was caused by a difference in accent rather than a difference in overall intelligibility, we reversed the overall difficulty of each accent condition by presenting them at different signal-to-noise ratios. Even when native speech was presented at a much more difficult intelligibility level than nonnative speech, audiovisual benefit for nonnative speech remained poorer. In light of these findings, we discuss alternative explanations of reduced audiovisual benefit for nonnative speech, as well as methodological considerations for future work examining the intersection of social, cognitive, and linguistic processes.
20
Pupillometry reveals cognitive demands of lexical competition during spoken word recognition in young and older adults. Psychon Bull Rev 2021; 29:268-280. [PMID: 34405386 DOI: 10.3758/s13423-021-01991-0] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Accepted: 07/27/2021] [Indexed: 12/27/2022]
Abstract
In most contemporary activation-competition frameworks for spoken word recognition, candidate words compete against phonological "neighbors" with similar acoustic properties (e.g., "cap" vs. "cat"). Thus, recognizing words with more competitors should come at a greater cognitive cost relative to recognizing words with fewer competitors, due to increased demands for selecting the correct item and inhibiting incorrect candidates. Importantly, these processes should operate even in the absence of differences in accuracy. In the present study, we tested this proposal by examining differences in processing costs associated with neighborhood density for highly intelligible items presented in quiet. A second goal was to examine whether the cognitive demands associated with increased neighborhood density were greater for older adults compared with young adults. Using pupillometry as an index of cognitive processing load, we compared the cognitive demands associated with spoken word recognition for words with many or fewer neighbors, presented in quiet, for young (n = 67) and older (n = 69) adult listeners. Growth curve analysis of the pupil data indicated that older adults showed a greater evoked pupil response for spoken words than did young adults, consistent with increased cognitive load during spoken word recognition. Words from dense neighborhoods were marginally more demanding to process than words from sparse neighborhoods. There was also an interaction between age and neighborhood density, indicating larger effects of density in young adult listeners. These results highlight the importance of assessing both cognitive demands and accuracy when investigating the mechanisms underlying spoken word recognition.
21
Abstract
Listening effort is a valuable and important notion to measure because it is among the primary complaints of people with hearing loss. It is tempting and intuitive to accept speech intelligibility scores as a proxy for listening effort, but this link is likely oversimplified and lacks actionable explanatory power. This study was conducted to explain the mechanisms of listening effort that are not captured by intelligibility scores, using sentence-repetition tasks where specific kinds of mistakes were prospectively planned or analyzed retrospectively. Effort was measured as changes in pupil size among 20 listeners with normal hearing and 19 listeners with cochlear implants. Experiment 1 demonstrates that mental correction of misperceived words increases effort even when responses are correct. Experiment 2 shows that for incorrect responses, listening effort is not a function of the proportion of words correct but is rather driven by the types of errors, position of errors within a sentence, and the need to resolve ambiguity, reflecting how easily the listener can make sense of a perception. A simple taxonomy of error types is provided that is both intuitive and consistent with data from these two experiments. The diversity of errors in these experiments implies that speech perception tasks can be designed prospectively to elicit the mistakes that are more closely linked with effort. Although mental corrective action and number of mistakes can scale together in many experiments, it is possible to dissociate them to advance toward a more explanatory (rather than correlational) account of listening effort.
Affiliation(s)
- Matthew B. Winn
- University of Minnesota, Twin Cities, 164 Pillsbury Dr SE, Minneapolis, MN 55455, United States
22
Silcox JW, Payne BR. The costs (and benefits) of effortful listening on context processing: A simultaneous electrophysiology, pupillometry, and behavioral study. Cortex 2021; 142:296-316. [PMID: 34332197 DOI: 10.1016/j.cortex.2021.06.007] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 11/06/2020] [Revised: 04/02/2021] [Accepted: 06/10/2021] [Indexed: 11/24/2022]
Abstract
There is an apparent disparity between the fields of cognitive audiology and cognitive electrophysiology as to how linguistic context is used when listening to perceptually challenging speech. To gain a clearer picture of how listening effort impacts context use, we conducted a pre-registered study to simultaneously examine electrophysiological, pupillometric, and behavioral responses when listening to sentences varying in contextual constraint and acoustic challenge in the same sample. Participants (N = 44) listened to sentences that were highly constraining and completed with expected or unexpected sentence-final words ("The prisoners were planning their escape/party") or were low-constraint sentences with unexpected sentence-final words ("All day she thought about the party"). Sentences were presented either in quiet or with +3 dB SNR background noise. Pupillometry and EEG were simultaneously recorded and subsequent sentence recognition and word recall were measured. While the N400 expectancy effect was diminished by noise, suggesting impaired real-time context use, we simultaneously observed a beneficial effect of constraint on subsequent recognition memory for degraded speech. Importantly, analyses of trial-to-trial coupling between pupil dilation and N400 amplitude showed that when participants showed increased listening effort (i.e., greater pupil dilation), there was a subsequent recovery of the N400 effect, but at the same time, higher effort was related to poorer subsequent sentence recognition and word recall. Collectively, these findings suggest divergent effects of acoustic challenge and listening effort on context use: while noise impairs the rapid use of context to facilitate lexical semantic processing in general, this negative effect is attenuated when listeners show increased effort in response to noise. However, this effort-induced reliance on context for online word processing comes at the cost of poorer subsequent memory.
Affiliation(s)
- Brennan R Payne
- Department of Psychology, University of Utah, USA; Interdepartmental Neuroscience Program, University of Utah, USA
23
Smiljanic R, Keerstock S, Meemann K, Ransom SM. Face masks and speaking style affect audio-visual word recognition and memory of native and non-native speech. The Journal of the Acoustical Society of America 2021; 149:4013. [PMID: 34241444 PMCID: PMC8269755 DOI: 10.1121/10.0005191] [Citation(s) in RCA: 30] [Impact Index Per Article: 10.0] [Indexed: 05/09/2023]
Abstract
Though necessary, protective mask wearing in response to the COVID-19 pandemic presents communication challenges. The present study examines how signal degradation and loss of visual information due to masks affect intelligibility and memory for native and non-native speech. We also test whether clear speech can alleviate perceptual difficulty for masked speech. One native and one non-native speaker of English recorded video clips in conversational speech without a mask and conversational and clear speech with a mask. Native English listeners watched video clips presented in quiet or mixed with competing speech. The results showed that word recognition and recall of speech produced with a mask can be as accurate as without a mask in optimal listening conditions. Masks affected non-native speech processing at easier noise levels than native speech. Clear speech with a mask significantly improved accuracy in all listening conditions. Speaking clearly, reducing noise, and using surgical masks as well as good signal amplification can help compensate for the loss of intelligibility due to background noise, lack of visual cues, physical distancing, or non-native speech. The findings have implications for communication in classrooms and hospitals where listeners interact with teachers and healthcare providers, oftentimes non-native speakers, through their protective barriers.
Affiliation(s)
- Rajka Smiljanic
- Department of Linguistics, University of Texas at Austin, 305 East 23rd Street STOP B5100, Austin, Texas 78712, USA
- Sandie Keerstock
- Department of Psychological Sciences, University of Missouri, 124 Psychology Building, Columbia, Missouri 65211, USA
- Kirsten Meemann
- Department of Linguistics, University of Texas at Austin, 305 East 23rd Street STOP B5100, Austin, Texas 78712, USA
- Sarah M Ransom
- Department of Linguistics, University of Texas at Austin, 305 East 23rd Street STOP B5100, Austin, Texas 78712, USA
24
McGarrigle R, Rakusen L, Mattys S. Effortful listening under the microscope: Examining relations between pupillometric and subjective markers of effort and tiredness from listening. Psychophysiology 2020; 58:e13703. [PMID: 33031584 DOI: 10.1111/psyp.13703] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 03/18/2020] [Revised: 09/14/2020] [Accepted: 09/16/2020] [Indexed: 12/22/2022]
Abstract
Effort during listening is commonly measured using the task-evoked pupil response (TEPR), a pupillometric marker of physiological arousal. However, studies to date report no association between TEPR and perceived effort. One possible reason for this is the way in which self-report effort measures are typically administered, namely as a single data point collected at the end of a testing session. Another possible reason is that TEPR might relate more closely to the experience of tiredness from listening than to effort per se. To examine these possibilities, we conducted two preregistered experiments that recorded subjective ratings of effort and tiredness from listening at multiple time points and examined their covariance with TEPR over the course of listening tasks varying in levels of acoustic and attentional demand. In both experiments, we showed a within-subject association between TEPR and tiredness from listening, but no association between TEPR and effort. The data also suggest that the effect of task difficulty on the experience of tiredness from listening may go undetected using the traditional approach of collecting a single data point at the end of a listening block. Finally, this study demonstrates the utility of a novel correlation analysis technique ("rmcorr"), which can be used to overcome statistical power constraints commonly found in the literature. Teasing apart the subjective and physiological mechanisms that underpin effortful listening is a crucial step toward addressing these difficulties in older and/or hearing-impaired individuals.
Affiliation(s)
- Sven Mattys
- Department of Psychology, University of York, York, UK
25
Paulus M, Hazan V, Adank P. The relationship between talker acoustics, intelligibility, and effort in degraded listening conditions. The Journal of the Acoustical Society of America 2020; 147:3348. [PMID: 32486777 DOI: 10.1121/10.0001212] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/10/2019] [Accepted: 04/20/2020] [Indexed: 06/11/2023]
Abstract
Listening to degraded speech is associated with decreased intelligibility and increased effort. However, listeners are generally able to adapt to certain types of degradations. While intelligibility of degraded speech is modulated by talker acoustics, it is unclear whether talker acoustics also affect effort and adaptation. Moreover, it has been demonstrated that talker differences are preserved across spectral degradations, but it is not known whether this effect extends to temporal degradations and which acoustic-phonetic characteristics are responsible. In a listening experiment combined with pupillometry, participants were presented with speech in quiet as well as in masking noise, time-compressed, and noise-vocoded speech by 16 Southern British English speakers. Results showed that intelligibility, but not adaptation, was modulated by talker acoustics. Talkers who were more intelligible under noise-vocoding were also more intelligible under masking and time-compression. This effect was linked to acoustic-phonetic profiles with greater vowel space dispersion (VSD) and energy in mid-range frequencies, as well as slower speaking rate. While pupil dilation indicated increasing effort with decreasing intelligibility, this study also linked reduced effort in quiet to talkers with greater VSD. The results emphasize the relevance of talker acoustics for intelligibility and effort in degraded listening conditions.
Affiliation(s)
- Maximillian Paulus
- Speech, Hearing and Phonetic Sciences, University College London, London, United Kingdom
- Valerie Hazan
- Speech, Hearing and Phonetic Sciences, University College London, London, United Kingdom
- Patti Adank
- Speech, Hearing and Phonetic Sciences, University College London, London, United Kingdom