1
Yoshihara M, Nakayama M, Junyi X, Hino Y. Does rotation eliminate masked priming effects for Japanese kanji words? Cognition 2024; 246:105759. PMID: 38430752. DOI: 10.1016/j.cognition.2024.105759.
Abstract
A key issue in the recent visual word recognition literature is whether text rotation disrupts the early stages of orthographic processing. Previous research found no masked repetition priming effect when primes were rotated ≥90° in alphabetic languages. The present study investigated the impact of text rotation using logographic (two-character Japanese kanji) words. In Experiment 1, we conducted a masked repetition priming lexical decision experiment with upright and 180°-rotated primes. The rotated primes produced a significant priming effect, although the effect was smaller than that produced by upright primes. In Experiment 2, we further examined the effectiveness of 180°-rotated primes in two conditions: the whole word was rotated vs. each constituent character was rotated in its own position. Both prime types produced significant priming effects of similar magnitudes. These findings suggest that orthographic processing is more robust against text rotation in logographic languages than in alphabetic languages.
Affiliation(s)
- Masahiro Yoshihara
- Graduate School of International Cultural Studies, Tohoku University, 41 Kawauchi, Aoba-ku, Sendai, Miyagi 980-8570, Japan; Japan Society for the Promotion of Science, Tokyo 102-0083, Japan
- Mariko Nakayama
- Graduate School of International Cultural Studies, Tohoku University, 41 Kawauchi, Aoba-ku, Sendai, Miyagi 980-8570, Japan
- Xue Junyi
- Faculty of Letters, Arts, and Sciences, Waseda University, 1-24-1 Toyama, Shinjuku-ku, Tokyo, 162-8644, Japan
- Yasushi Hino
- Faculty of Letters, Arts, and Sciences, Waseda University, 1-24-1 Toyama, Shinjuku-ku, Tokyo, 162-8644, Japan
2
Thaler L, Castillo-Serrano JG, Kish D, Norman LJ. Effects of type of emission and masking sound, and their spatial correspondence, on blind and sighted people's ability to echolocate. Neuropsychologia 2024; 196:108822. PMID: 38342179. DOI: 10.1016/j.neuropsychologia.2024.108822.
Abstract
Ambient sound can mask acoustic signals. The current study addressed how echolocation in people is affected by masking sound, and the role played by type of sound and spatial (i.e. binaural) similarity. We also investigated the role played by blindness and long-term experience with echolocation, by testing echolocation experts, as well as blind and sighted people new to echolocation. Results were obtained in two echolocation tasks where participants listened to binaural recordings of echolocation and masking sounds, and either localized echoes in azimuth or discriminated echo audibility. Echolocation and masking sounds could be either clicks or broadband noise. An adaptive staircase method was used to adjust signal-to-noise ratios (SNRs) based on participants' responses. When target and masker had the same binaural cues (i.e. both were monaural sounds), people performed better (i.e. had lower SNRs) when target and masker used different types of sound (e.g. clicks in a noise masker or noise in a click masker), as compared to when target and masker used the same type of sound (e.g. clicks in a click masker, or noise in a noise masker). A very different pattern of results was observed when masker and target differed in their binaural cues, in which case people always performed better when clicks were the masker, regardless of the type of emission used. Further, direct comparison between conditions with and without a binaural difference revealed binaural release from masking only when clicks were used as emissions and masker, but not otherwise (i.e. when noise was used as masker or emission). This suggests that echolocation with clicks or noise may differ in sensitivity to binaural cues. We observed the same pattern of results for echolocation experts, and for blind and sighted people new to echolocation, suggesting a limited role played by long-term experience or blindness. In addition to generating novel predictions for future work, the findings also inform instruction in echolocation for people who are blind or sighted.
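The adaptive staircase procedure mentioned in the abstract can be sketched in a few lines. The paper does not specify the exact rule, so the transformed 2-down/1-up rule below (which converges near the 70.7%-correct point) and the toy psychometric function are illustrative assumptions, not the authors' implementation:

```python
import math
import random

def logistic_2afc(snr_db, midpoint=-3.0, slope=2.0):
    """Toy 2AFC psychometric function: 50% guessing floor, rising with SNR."""
    return 0.5 + 0.5 / (1.0 + math.exp(-(snr_db - midpoint) / slope))

def run_staircase(p_correct, start_snr=0.0, step_db=2.0, n_reversals=8, rng=random):
    """Transformed 2-down/1-up staircase over SNR (dB).

    `p_correct` maps SNR -> P(correct) for a simulated listener.
    The threshold estimate is the mean SNR at the reversal points.
    """
    snr, n_correct, last_dir, reversals = start_snr, 0, None, []
    while len(reversals) < n_reversals:
        if rng.random() < p_correct(snr):  # simulated trial response
            n_correct += 1
            if n_correct < 2:
                continue                   # need 2 correct in a row to step down
            n_correct, direction = 0, -1
            snr -= step_db                 # harder: lower the SNR
        else:
            n_correct, direction = 0, +1
            snr += step_db                 # easier: raise the SNR
        if last_dir is not None and direction != last_dir:
            reversals.append(snr)          # record direction reversals
        last_dir = direction
    return sum(reversals) / len(reversals)
```

Running `run_staircase(logistic_2afc)` yields a threshold estimate that hovers near the 70.7%-correct point of the simulated listener's psychometric function.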
Affiliation(s)
- L Thaler
- Department of Psychology, Durham University, South Road, Durham, DH1 5AY, UK
- D Kish
- World Access for the Blind, 1007 Marino Drive, Placentia, CA, 92870, USA
- L J Norman
- Department of Psychology, Durham University, South Road, Durham, DH1 5AY, UK
3
Chen SCY, Chen Y, Geisler WS, Seidemann E. Neural correlates of perceptual similarity masking in primate V1. eLife 2024; 12:RP89570. PMID: 38592269. PMCID: PMC11003749. DOI: 10.7554/elife.89570.
Abstract
Visual detection is a fundamental natural task. Detection becomes more challenging as the similarity between the target and the background in which it is embedded increases, a phenomenon termed 'similarity masking'. To test the hypothesis that V1 contributes to similarity masking, we used voltage-sensitive dye imaging (VSDI) to measure V1 population responses while macaque monkeys performed a detection task under varying levels of target-background similarity. Paradoxically, we find that during an initial transient phase, V1 responses to the target are enhanced, rather than suppressed, by target-background similarity. This effect reverses in the second phase of the response, so that in this phase V1 signals are positively correlated with the behavioral effect of similarity. Finally, we show that a simple model with delayed divisive normalization can qualitatively account for our findings. Overall, our results support the hypothesis that a nonlinear gain control mechanism in V1 contributes to perceptual similarity masking.
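The delayed-divisive-normalization idea can be illustrated with a toy model (not the authors' fitted model; all numbers below are invented): dividing a stimulus drive by a normalization pool that arrives a few time steps late produces an early response enhancement and a late suppression for high-similarity backgrounds.

```python
def normalized_response(drive, pool, delay=3, sigma=1.0):
    """Divide drive by a delayed normalization pool: r(t) = d(t) / (sigma + p(t - delay))."""
    return [d / (sigma + (pool[t - delay] if t >= delay else 0.0))
            for t, d in enumerate(drive)]

# Toy inputs: a similar (high-energy) background adds to both the initial
# drive and, later, the normalization pool; a dissimilar one adds little.
similar = normalized_response(drive=[15.0] * 10, pool=[10.0] * 10)
dissimilar = normalized_response(drive=[10.0] * 10, pool=[2.0] * 10)
```

Early samples show a larger response for the similar background (the paradoxical transient enhancement), while late samples show the usual suppression, matching the two-phase pattern described in the abstract.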
Affiliation(s)
- Spencer Chin-Yu Chen
- Center for Perceptual Systems, University of Texas at Austin, Austin, United States
- Department of Psychology, University of Texas at Austin, Austin, United States
- Center for Theoretical and Computational Neuroscience, Austin, United States
- Department of Neuroscience, University of Texas at Austin, Austin, United States
- Department of Neurosurgery, Rutgers University, New Brunswick, United States
- Yuzhi Chen
- Center for Perceptual Systems, University of Texas at Austin, Austin, United States
- Department of Psychology, University of Texas at Austin, Austin, United States
- Center for Theoretical and Computational Neuroscience, Austin, United States
- Department of Neuroscience, University of Texas at Austin, Austin, United States
- Wilson S Geisler
- Center for Perceptual Systems, University of Texas at Austin, Austin, United States
- Department of Psychology, University of Texas at Austin, Austin, United States
- Center for Theoretical and Computational Neuroscience, Austin, United States
- Eyal Seidemann
- Center for Perceptual Systems, University of Texas at Austin, Austin, United States
- Department of Psychology, University of Texas at Austin, Austin, United States
- Center for Theoretical and Computational Neuroscience, Austin, United States
- Department of Neuroscience, University of Texas at Austin, Austin, United States
4
Regev J, Relaño-Iborra H, Zaar J, Dau T. Disentangling the effects of hearing loss and age on amplitude modulation frequency selectivity. J Acoust Soc Am 2024; 155:2589-2602. PMID: 38607268. DOI: 10.1121/10.0025541.
Abstract
The processing and perception of amplitude modulation (AM) in the auditory system reflect a frequency-selective process, often described as a modulation filterbank. Previous studies on perceptual AM masking reported similar results for older listeners with hearing impairment (HI listeners) and young listeners with normal hearing (NH listeners), suggesting no effects of age or hearing loss on AM frequency selectivity. However, recent evidence has shown that age, independently of hearing loss, adversely affects AM frequency selectivity. Hence, this study aimed to disentangle the effects of hearing loss and age. A simultaneous AM masking paradigm was employed, using a sinusoidal carrier at 2.8 kHz, narrowband noise modulation maskers, and target modulation frequencies of 4, 16, 64, and 128 Hz. The results obtained from young (n = 3, 24-30 years of age) and older (n = 10, 63-77 years of age) HI listeners were compared to previously obtained data from young and older NH listeners. Notably, the HI listeners generally exhibited lower (unmasked) AM detection thresholds and greater AM frequency selectivity than their NH counterparts in both age groups. Overall, the results suggest that age negatively affects AM frequency selectivity for both NH and HI listeners, whereas hearing loss improves AM detection and AM selectivity, likely due to the loss of peripheral compression.
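The target stimuli in such an AM masking paradigm are sinusoidally amplitude-modulated tones. A minimal sketch using the carrier (2.8 kHz) and one target modulation frequency (16 Hz) from the study; the modulation depth, duration, and sampling rate are arbitrary choices for illustration:

```python
import math

def am_tone(carrier_hz=2800.0, mod_hz=16.0, mod_depth=0.5, dur_s=0.5, fs=16000):
    """Sinusoidally amplitude-modulated tone:
    s(t) = (1 + m * sin(2*pi*fm*t)) * sin(2*pi*fc*t), returned as a sample list."""
    n = int(dur_s * fs)
    return [(1.0 + mod_depth * math.sin(2 * math.pi * mod_hz * t / fs))
            * math.sin(2 * math.pi * carrier_hz * t / fs)
            for t in range(n)]
```

The noise modulation maskers in the study would then be imposed on the carrier's envelope in the same way, with a narrowband noise replacing the sinusoidal modulator.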
Affiliation(s)
- Jonathan Regev
- Hearing Systems Section, Department of Health Technology, Technical University of Denmark, Kongens Lyngby, 2800, Denmark
- Helia Relaño-Iborra
- Hearing Systems Section, Department of Health Technology, Technical University of Denmark, Kongens Lyngby, 2800, Denmark
- Johannes Zaar
- Hearing Systems Section, Department of Health Technology, Technical University of Denmark, Kongens Lyngby, 2800, Denmark
- Eriksholm Research Centre, Snekkersten, 3070, Denmark
- Torsten Dau
- Hearing Systems Section, Department of Health Technology, Technical University of Denmark, Kongens Lyngby, 2800, Denmark
- Copenhagen Hearing and Balance Center, Copenhagen University Hospital, Rigshospitalet, Copenhagen, 2100, Denmark
5
Guérit F, Middlebrooks JC, Gransier R, Richardson ML, Wouters J, Carlyon RP. Exploring the Use of Interleaved Stimuli to Measure Cochlear-Implant Excitation Patterns. J Assoc Res Otolaryngol 2024; 25:201-213. PMID: 38459245. PMCID: PMC11018570. DOI: 10.1007/s10162-024-00937-2.
Abstract
PURPOSE: Attempts to use current-focussing strategies with cochlear implants (CIs) to reduce neural spread of excitation have met with only mixed success in human studies, in contrast to promising results in animal studies. Although this discrepancy could stem from between-species anatomical and aetiological differences, the masking experiments used in human studies may be insufficiently sensitive to differences in excitation-pattern width.
METHODS: We used an interleaved-masking method to measure psychophysical excitation patterns in seven participants with four masker stimulation configurations: monopolar (MP), partial tripolar (pTP), a wider partial tripolar (pTP + 2), and, importantly, a condition (RP + 2) designed to produce a broader excitation pattern than MP. The probe was always in partial-tripolar configuration.
RESULTS: We found a significant effect of stimulation configuration on both the amount of on-site masking (masker and probe on the same electrode; an indirect indicator of sharpness) and the difference between off-site and on-site masking. Differences were driven solely by RP + 2 producing a broader excitation pattern than the other configurations, whereas monopolar and the two current-focussing configurations did not differ statistically from each other.
CONCLUSION: A method that is sensitive enough to reveal a modest broadening with RP + 2 showed no evidence for sharpening with focussed stimulation. We also showed that although voltage recordings from the implant accurately predicted a broadening of the psychophysical excitation patterns with RP + 2, they wrongly predicted a strong sharpening with pTP + 2. We additionally argue, based on our recent research, that the interleaved-masking method can usefully be applied to non-human species and to objective measures of CI excitation patterns.
Affiliation(s)
- François Guérit
- Cambridge Hearing Group, MRC Cognition & Brain Sciences Unit, University of Cambridge, Cambridge, England
- John C Middlebrooks
- Department of Otolaryngology, University of California at Irvine, Irvine, CA, USA
- Department of Neurobiology and Behavior, University of California at Irvine, Irvine, CA, USA
- Department of Biomedical Engineering, University of California at Irvine, Irvine, CA, USA
- Robin Gransier
- Department of Neurosciences, ExpORL, KU Leuven, Leuven, Belgium
- Leuven Brain Institute, KU Leuven, Leuven, Belgium
- Matthew L Richardson
- Department of Otolaryngology, University of California at Irvine, Irvine, CA, USA
- Jan Wouters
- Department of Neurosciences, ExpORL, KU Leuven, Leuven, Belgium
- Leuven Brain Institute, KU Leuven, Leuven, Belgium
- Robert P Carlyon
- Cambridge Hearing Group, MRC Cognition & Brain Sciences Unit, University of Cambridge, Cambridge, England
6
Bureš Z, Profant O, Sommerhalder N, Skarnitzl R, Fuksa J, Meyer M. Speech intelligibility and its relation to auditory temporal processing in Czech and Swiss German subjects with and without tinnitus. Eur Arch Otorhinolaryngol 2024; 281:1589-1595. PMID: 38175264. DOI: 10.1007/s00405-023-08398-8.
Abstract
PURPOSE: Previous studies have shown that levels for 50% speech intelligibility in quiet and in noise differ across languages. Here, we aimed to find out whether these differences relate to different auditory processing of temporal sound features in different languages, and to determine the influence of tinnitus on speech comprehension in different languages.
METHODS: We measured speech intelligibility under various conditions (words in quiet, sentences in babble noise, interrupted sentences) along with tone detection thresholds in quiet [PTA] and in noise [PTAnoise], gap detection thresholds [GDT], and detection thresholds for frequency modulation [FMT], and compared them between Czech and Swiss subjects matched in mean age and PTA.
RESULTS: The Swiss subjects exhibited higher speech reception thresholds in quiet, a higher threshold speech-to-noise ratio, and a shallower slope of the performance-intensity function for words in quiet. Importantly, the intelligibility of temporally gated speech was similar in the Czech and Swiss subjects. PTAnoise, GDT, and FMT were similar in the two groups. The Czech subjects exhibited correlations of the speech tests with GDT and FMT, which was not the case in the Swiss group. Qualitatively, the comparisons between the Swiss and Czech populations were not influenced by the presence of subjective tinnitus.
CONCLUSION: The results support the notion of language-specific differences in speech comprehension, which persist also in subjects with tinnitus and show different associations with elementary measures of auditory temporal processing.
Affiliation(s)
- Zbyněk Bureš
- Department of Otorhinolaryngology, Third Faculty of Medicine, University Hospital Královské Vinohrady, Charles University, Prague, Czech Republic
- Department of Cognitive Systems and Neurosciences, Czech Institute of Informatics, Robotics and Cybernetics, Czech Technical University in Prague, Jugoslávských partyzánů 1580/3, 160 00, Prague 6, Czech Republic
- Oliver Profant
- Department of Otorhinolaryngology, Third Faculty of Medicine, University Hospital Královské Vinohrady, Charles University, Prague, Czech Republic
- Department of Auditory Neuroscience, Institute of Experimental Medicine, Czech Academy of Sciences, Prague, Czech Republic
- Nick Sommerhalder
- Evolutionary Neuroscience of Language, Department of Comparative Language Science, University of Zurich, Zurich, Switzerland
- Radek Skarnitzl
- Institute of Phonetics, Faculty of Arts, Charles University, Prague, Czech Republic
- Jakub Fuksa
- Department of Otorhinolaryngology, Third Faculty of Medicine, University Hospital Královské Vinohrady, Charles University, Prague, Czech Republic
- Department of Auditory Neuroscience, Institute of Experimental Medicine, Czech Academy of Sciences, Prague, Czech Republic
- Martin Meyer
- Evolutionary Neuroscience of Language, Department of Comparative Language Science, University of Zurich, Zurich, Switzerland
- Center for the Interdisciplinary Study of Language Evolution, University of Zurich, Zurich, Switzerland
7
Stein T, van Gaal S, Fahrenfort JJ. How (not) to demonstrate unconscious priming: Overcoming issues with post-hoc data selection, low power, and frequentist statistics. Conscious Cogn 2024; 119:103669. PMID: 38395013. DOI: 10.1016/j.concog.2024.103669.
Abstract
One widely used scientific approach to studying consciousness involves contrasting conscious operations with unconscious ones. However, challenges in establishing the absence of conscious awareness have led to debates about the extent and existence of unconscious processes. We collected experimental data on unconscious semantic priming, manipulating prime presentation duration to highlight the critical role of the analysis approach in attributing priming effects to unconscious processing. We demonstrate that common practices like post-hoc data selection, low statistical power, and frequentist statistical testing can erroneously support claims of unconscious priming. Conversely, adopting best practices like direct performance-awareness contrasts, Bayesian tests, and increased statistical power can prevent such erroneous conclusions. Many past experiments, including our own, fail to meet these standards, casting doubt on previous claims about unconscious processing. Implementing these robust practices will enhance our understanding of unconscious processing and shed light on the functions and neural mechanisms of consciousness.
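The post-hoc data selection problem the abstract warns about can be demonstrated with a small simulation (an illustration of regression to the mean, not the authors' analysis; all numbers are invented): even when priming exists only in proportion to true awareness, selecting the participants whose noisy awareness measure falls at or below chance still yields apparent "unconscious" priming, because their true awareness remains above zero on average.

```python
import random

def posthoc_selection_bias(n=20000, coupling=1.0, noise_sd=1.0, seed=0):
    """Simulate the post-hoc selection trap.

    True priming is proportional to *true* awareness, so no unconscious
    priming exists by construction. Awareness is measured with noise;
    the subgroup whose *measured* awareness is <= 0 still has positive
    mean true awareness, so its mean priming effect looks > 0.
    """
    rng = random.Random(seed)
    primings = []
    for _ in range(n):
        true_awareness = abs(rng.gauss(0.5, 0.5))       # nonnegative sensitivity
        measured = true_awareness + rng.gauss(0.0, noise_sd)
        if measured <= 0:                                # "unaware" subgroup
            primings.append(coupling * true_awareness)   # priming tracks TRUE awareness
    return sum(primings) / len(primings)
```

Running the simulation returns a clearly positive mean priming effect in the "unaware" subgroup, which is exactly the spurious result the recommended direct performance-awareness contrasts and Bayesian analyses are meant to guard against.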
Affiliation(s)
- Timo Stein
- Department of Psychology, University of Amsterdam, Amsterdam, the Netherlands; Amsterdam Brain and Cognition, University of Amsterdam, Amsterdam, the Netherlands
- Simon van Gaal
- Department of Psychology, University of Amsterdam, Amsterdam, the Netherlands; Amsterdam Brain and Cognition, University of Amsterdam, Amsterdam, the Netherlands
- Johannes J Fahrenfort
- Department of Psychology, University of Amsterdam, Amsterdam, the Netherlands; Amsterdam Brain and Cognition, University of Amsterdam, Amsterdam, the Netherlands; Department of Applied and Experimental Psychology, Vrije Universiteit Amsterdam, Amsterdam, the Netherlands
8
Fan J, Williamson DS. From the perspective of perceptual speech quality: The robustness of frequency bands to noise. J Acoust Soc Am 2024; 155:1916-1927. PMID: 38456734. DOI: 10.1121/10.0025272.
Abstract
Speech quality is one of the main foci of speech-related research, where it is frequently studied alongside speech intelligibility, another essential measurement. Band-level perceptual speech intelligibility has been studied frequently, whereas band-level speech quality has not been thoroughly analyzed. In this paper, a Multiple Stimuli With Hidden Reference and Anchor (MUSHRA) inspired approach was proposed to study the robustness of individual frequency bands to noise, with perceptual speech quality as the measure. Speech signals were filtered into thirty-two frequency bands, with real-world noise employed at different signal-to-noise ratios. Robustness-to-noise indices of individual frequency bands were calculated based on the human-rated perceptual quality scores assigned to the reconstructed noisy speech signals. Trends in the results suggest that the mid-frequency region is less robust to noise in terms of perceptual speech quality. These findings suggest future research aiming at improving speech quality should pay more attention to the mid-frequency region of the speech signals.
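The core stimulus manipulation described here is mixing speech and noise at controlled SNRs. A minimal sketch of SNR-controlled mixing (the 32-band filtering itself is omitted; signals are plain sample lists, and the test signals below are invented toy waveforms):

```python
import math

def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so that speech power / scaled-noise power equals the
    requested SNR (in dB), then return the sample-wise mixture."""
    p_s = sum(x * x for x in speech) / len(speech)  # mean speech power
    p_n = sum(x * x for x in noise) / len(noise)    # mean noise power
    gain = math.sqrt(p_s / (10 ** (snr_db / 10.0) * p_n))
    return [s + gain * v for s, v in zip(speech, noise)]
```

In a band-level study, this mixing would be applied after filtering both signals into the same frequency band, so that the SNR is controlled per band rather than broadband.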
Affiliation(s)
- Junyi Fan
- Department of Computer Science and Engineering, The Ohio State University, Columbus, Ohio 43210, USA
- Donald S Williamson
- Department of Computer Science and Engineering, The Ohio State University, Columbus, Ohio 43210, USA
9
Alamatsaz N, Rosen MJ, Ihlefeld A. Increased reliance on temporal coding when target sound is softer than the background. Sci Rep 2024; 14:4457. PMID: 38396044. PMCID: PMC10891139. DOI: 10.1038/s41598-024-54865-5.
Abstract
Everyday environments often contain multiple concurrent sound sources that fluctuate over time. Normally hearing listeners can benefit from high signal-to-noise ratios (SNRs) in energetic dips of temporally fluctuating background sound, a phenomenon called dip-listening. Specialized mechanisms of dip-listening exist across the entire auditory pathway. Both the instantaneous fluctuating and the long-term overall SNR shape dip-listening. An unresolved issue regarding cortical mechanisms of dip-listening is how target perception remains invariant to overall SNR, specifically, across different tone levels with an ongoing fluctuating masker. Equivalent target detection over both positive and negative overall SNRs (SNR invariance) is reliably achieved in highly trained listeners. Dip-listening is correlated with the ability to resolve temporal fine structure, which involves temporally varying spike patterns. Thus, the current work tests the hypothesis that at negative SNRs, neuronal readout mechanisms need to rely increasingly on decoding strategies based on temporal spike patterns, as opposed to spike count. Recordings from chronically implanted electrode arrays in core auditory cortex of trained and awake Mongolian gerbils engaged in a tone detection task in 10 Hz amplitude-modulated background sound reveal that rate-based decoding is not SNR-invariant, whereas temporal coding is informative at both negative and positive SNRs.
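The contrast between rate-based and temporal decoding can be made concrete with a toy readout (illustrative only, not the study's decoder; the spike-count vectors are invented): two trials with identical total spike counts are indistinguishable to a rate decoder, while a decoder restricted to target-locked time bins still separates them.

```python
def rate_score(spike_bins):
    """Rate code: total spike count, blind to spike timing."""
    return sum(spike_bins)

def temporal_score(spike_bins, target_bins):
    """Temporal code: count only spikes in bins locked to the target's phase."""
    return sum(spike_bins[i] for i in target_bins)

# Toy trials: same total count, different timing.
with_target = [2, 0, 2, 0, 2, 0]     # spikes clustered in target-locked (even) bins
without_target = [1, 1, 1, 1, 1, 1]  # spikes spread evenly across bins
```

At negative SNRs, the situation in the abstract resembles this toy case: overall firing rates no longer separate target from background, but the timing of spikes relative to the target still carries the detection signal.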
Affiliation(s)
- Nima Alamatsaz
- Graduate School of Biomedical Sciences, Rutgers University, Newark, NJ, USA
- Department of Biomedical Engineering, New Jersey Institute of Technology, Newark, NJ, USA
- Merri J Rosen
- Northeast Ohio Medical University (NEOMED), Rootstown, OH, USA
- University Hospitals Hearing Research Center at NEOMED, Rootstown, OH, USA
- Brain Health Research Institute, Kent State University, Kent, OH, USA
10
Lalonde K, Peng ZE, Halverson DM, Dwyer GA. Children's use of spatial and visual cues for release from perceptual masking. J Acoust Soc Am 2024; 155:1559-1569. PMID: 38393738. PMCID: PMC10890829. DOI: 10.1121/10.0024766.
Abstract
This study examined the role of visual speech in providing release from perceptual masking in children by comparing visual speech benefit across conditions with and without a spatial separation cue. Auditory-only and audiovisual speech recognition thresholds in a two-talker speech masker were obtained from 21 children with typical hearing (7-9 years of age) using a color-number identification task. The target was presented from a loudspeaker at 0° azimuth. Masker source location varied across conditions. In the spatially collocated condition, the masker was also presented from the loudspeaker at 0° azimuth. In the spatially separated condition, the masker was presented from the loudspeaker at 0° azimuth and a loudspeaker at -90° azimuth, with the signal from the -90° loudspeaker leading the signal from the 0° loudspeaker by 4 ms. The visual stimulus (static image or video of the target talker) was presented at 0° azimuth. Children achieved better thresholds when the spatial cue was provided and when the visual cue was provided. Visual and spatial cue benefit did not differ significantly depending on the presence of the other cue. Additional studies are needed to characterize how children's preferential use of visual and spatial cues varies depending on the strength of each cue.
Affiliation(s)
- Kaylah Lalonde
- Center for Hearing Research, Boys Town National Research Hospital, Omaha, Nebraska 68131, USA
- Z Ellen Peng
- Center for Hearing Research, Boys Town National Research Hospital, Omaha, Nebraska 68131, USA
- Destinee M Halverson
- Center for Hearing Research, Boys Town National Research Hospital, Omaha, Nebraska 68131, USA
- Grace A Dwyer
- Center for Hearing Research, Boys Town National Research Hospital, Omaha, Nebraska 68131, USA
11
Lelic D, Nielsen LLA, Pedersen AK, Neher T. Focusing on Positive Listening Experiences Improves Speech Intelligibility in Experienced Hearing Aid Users. Trends Hear 2024; 28:23312165241246616. PMID: 38656770. PMCID: PMC11044800. DOI: 10.1177/23312165241246616.
Abstract
Negativity bias is a cognitive bias that results in negative events being perceptually more salient than positive ones. For hearing care, this means that hearing aid benefits can potentially be overshadowed by adverse experiences. Research has shown that sustaining focus on positive experiences has the potential to mitigate negativity bias. The purpose of the current study was to investigate whether a positive focus (PF) intervention can improve speech-in-noise abilities for experienced hearing aid users. Thirty participants were randomly allocated to a control or PF group (N = 2 × 15). Prior to hearing aid fitting, all participants filled out the short form of the Speech, Spatial and Qualities of Hearing scale (SSQ12) based on their own hearing aids. At the first visit, they were fitted with study hearing aids, and speech-in-noise testing was performed. Both groups then wore the study hearing aids for two weeks and sent daily text messages reporting hours of hearing aid use to an experimenter. In addition, the PF group was instructed to focus on positive listening experiences and to also report them in the daily text messages. After the 2-week trial, all participants filled out the SSQ12 questionnaire based on the study hearing aids and completed the speech-in-noise testing again. Speech-in-noise performance and SSQ12 Qualities score were improved for the PF group but not for the control group. This finding indicates that the PF intervention can improve subjective and objective hearing aid benefits.
Affiliation(s)
- Tobias Neher
- Department of Clinical Research, University of Southern Denmark, Odense, Denmark
- Research Unit for ORL – Head & Neck Surgery and Audiology, Odense University Hospital & University of Southern Denmark, Odense, Denmark
12
Lalonde K, Walker EA, Leibold LJ, McCreery RW. Predictors of Susceptibility to Noise and Speech Masking Among School-Age Children With Hearing Loss or Typical Hearing. Ear Hear 2024; 45:81-93. PMID: 37415268. PMCID: PMC10771540. DOI: 10.1097/aud.0000000000001403.
Abstract
OBJECTIVES The purpose of this study was to evaluate effects of masker type and hearing group on the relationship between school-age children's speech recognition and age, vocabulary, working memory, and selective attention. This study also explored effects of masker type and hearing group on the time course of maturation of masked speech recognition. DESIGN Participants included 31 children with normal hearing (CNH) and 41 children with mild to severe bilateral sensorineural hearing loss (CHL), between 6.7 and 13 years of age. Children with hearing aids used their personal hearing aids throughout testing. Audiometric thresholds and standardized measures of vocabulary, working memory, and selective attention were obtained from each child, along with masked sentence recognition thresholds in a steady-state, speech-spectrum noise (SSN) and in a two-talker speech masker (TTS). Aided audibility through children's hearing aids was calculated based on the Speech Intelligibility Index (SII) for all children wearing hearing aids. Linear mixed effects models were used to examine the contribution of group, age, vocabulary, working memory, and attention to individual differences in speech recognition thresholds in each masker. Additional models were constructed to examine the role of aided audibility on masked speech recognition in CHL. Finally, to explore the time course of maturation of masked speech perception, linear mixed effects models were used to examine interactions between age, masker type, and hearing group as predictors of masked speech recognition. RESULTS Children's thresholds were higher in TTS than in SSN. There was no interaction of hearing group and masker type. CHL had higher thresholds than CNH in both maskers. In both hearing groups and masker types, children with better vocabularies had lower thresholds. An interaction of hearing group and attention was observed only in the TTS. Among CNH, attention predicted thresholds in TTS. Among CHL, vocabulary and aided audibility predicted thresholds in TTS. In both maskers, thresholds decreased as a function of age at a similar rate in CNH and CHL. CONCLUSIONS The factors contributing to individual differences in speech recognition differed as a function of masker type. In TTS, the factors contributing to individual differences in speech recognition further differed as a function of hearing group. Whereas attention predicted variance for CNH in TTS, vocabulary and aided audibility predicted variance in CHL. CHL required a more favorable signal to noise ratio (SNR) to recognize speech in TTS than in SSN (mean = +1 dB in TTS, -3 dB in SSN). We posit that failures in auditory stream segregation limit the extent to which CHL can recognize speech in a speech masker. Larger sample sizes or longitudinal data are needed to characterize the time course of maturation of masked speech perception in CHL.
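Editor's note: the predictor analysis described above used linear mixed effects models. The sketch below is a much-simplified, fixed-effects-only stand-in (pure Python, invented coefficient values, hypothetical function names): it fits ordinary least squares via the normal equations to simulated thresholds and recovers the direction of age and vocabulary effects. It is an illustration of the analysis family, not the authors' model.

```python
import random

def ols(X, y):
    """Solve the normal equations (X'X) b = X'y by Gaussian elimination
    with partial pivoting. A fixed-effects-only stand-in for the linear
    mixed effects models used in the study (random effects omitted)."""
    k = len(X[0])
    A = [[sum(X[t][i] * X[t][j] for t in range(len(X))) for j in range(k)]
         for i in range(k)]
    b = [sum(X[t][i] * y[t] for t in range(len(X))) for i in range(k)]
    for col in range(k):                       # forward elimination
        piv = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, k):
            f = A[r][col] / A[col][col]
            for c in range(col, k):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    beta = [0.0] * k
    for r in range(k - 1, -1, -1):             # back substitution
        s = sum(A[r][c] * beta[c] for c in range(r + 1, k))
        beta[r] = (b[r] - s) / A[r][r]
    return beta

def simulate_thresholds(n=400, seed=1):
    """Hypothetical data: masked SRTs improve (fall) ~0.8 dB per year of
    age and ~0.05 dB per vocabulary point. These values are invented for
    illustration only; they are not estimates from the paper."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        age = rng.uniform(6.7, 13.0)           # age range from the study
        vocab = rng.uniform(70, 130)           # standardized vocabulary score
        srt = 2.0 - 0.8 * age - 0.05 * vocab + rng.gauss(0, 1.0)
        X.append([1.0, age, vocab])            # intercept, age, vocabulary
        y.append(srt)
    return ols(X, y)
```

With enough simulated children, the recovered slopes match the generating values: lower (better) thresholds with older age and larger vocabulary, mirroring the direction of effects reported above.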
Collapse
Affiliation(s)
- Kaylah Lalonde
- Center for Hearing Research, Boys Town National Research Hospital, Omaha, NE
| | - Elizabeth A. Walker
- Department of Communication Sciences and Disorders, The University of Iowa, Iowa City, IA
| | - Lori J. Leibold
- Center for Hearing Research, Boys Town National Research Hospital, Omaha, NE
| | - Ryan W. McCreery
- Center for Hearing Research, Boys Town National Research Hospital, Omaha, NE
| |
Collapse
|
13
|
Nakashima Y, Kanazawa S, Yamaguchi MK. Metacontrast masking is ineffective in the first 6 months of life. Cognition 2024; 242:105666. [PMID: 37984131 DOI: 10.1016/j.cognition.2023.105666] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/26/2023] [Revised: 11/09/2023] [Accepted: 11/11/2023] [Indexed: 11/22/2023]
Abstract
Metacontrast masking is one of the most widely studied types of visual masking, in which a visual stimulus is rendered invisible by a subsequent mask that does not spatially overlap with the target. Metacontrast has been used for many decades as a tool to study visual processing and conscious perception in adults. However, there have so far been no infant studies of metacontrast, and it remains unknown whether it even occurs in infants. The present study examined metacontrast masking in 3- to 8-month-old infants (N = 168) using a habituation paradigm. We found that metacontrast is ineffective for infants under 7 months and that younger infants can perceive a masked stimulus that older infants cannot. Our results suggest that metacontrast is distinct from other simple types of masking that occur in early infancy, a pattern consistent with the idea that metacontrast results from the disruption of recurrent processing.
Collapse
Affiliation(s)
- Yusuke Nakashima
- Research and Development Initiative, Chuo University, 742-1 Higashinakano, Hachioji-shi, Tokyo 192-0393, Japan.
| | - So Kanazawa
- Department of Psychology, Japan Women's University, 2-8-1 Mejirodai, Bunkyo-ku, Tokyo 112-8681, Japan
| | - Masami K Yamaguchi
- Department of Psychology, Chuo University, 742-1 Higashinakano, Hachioji-shi, Tokyo 192-0393, Japan
| |
Collapse
|
14
|
Oh Y, Lerud KD, Hoglund E, Klyn N, Large EW, Feth LL. Testing a computational model for aural detection of aircraft in ambient noise. J Acoust Soc Am 2023; 154:3799-3809. [PMID: 38109404 DOI: 10.1121/10.0023933] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/01/2023] [Accepted: 11/16/2023] [Indexed: 12/20/2023]
Abstract
Computational models are used to predict the performance of human listeners for carefully specified signal and noise conditions. However, there may be substantial discrepancies between the conditions under which listeners are tested and those used for model predictions. Thus, models may predict better performance than exhibited by the listeners, or they may "fail" to capture the ability of the listener to respond to subtle stimulus conditions. This study tested a computational model devised to predict a listener's ability to detect an aircraft in various soundscapes. The model and listeners processed the same sound recordings under carefully specified testing conditions. Details of signal and masker calibration were carefully matched, and the model was tested using the same adaptive tracking paradigm. Perhaps most importantly, the behavioral results were not available to the modeler before the model predictions were presented. Recordings from three different aircraft were used as the target signals. Maskers were derived from recordings obtained at nine locations ranging from very quiet rural environments to suburban and urban settings. Overall, with a few exceptions, model predictions matched the performance of the listeners very well. Discussion focuses on those differences and possible reasons for their occurrence.
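Editor's note: the adaptive tracking paradigm mentioned above is commonly implemented as a transformed up-down staircase. The sketch below assumes a 2-down/1-up rule and a logistic simulated listener (details the abstract does not specify), estimating the ~70.7%-correct SNR from reversal points:

```python
import math
import random

def two_down_one_up(p_correct, start=0.0, step=2.0, n_reversals=20, seed=0):
    """Levitt-style 2-down/1-up track: the level drops after two
    consecutive correct responses and rises after each error, converging
    on the level giving ~70.7% correct. Returns the mean of the late
    reversal levels as the threshold estimate."""
    rng = random.Random(seed)
    level, streak, direction, reversals = start, 0, 0, []
    while len(reversals) < n_reversals:
        if rng.random() < p_correct(level):    # simulated listener responds
            streak += 1
            if streak == 2:                    # two correct -> make it harder
                streak = 0
                if direction == +1:            # movement flipped: a reversal
                    reversals.append(level)
                direction = -1
                level -= step
        else:                                  # any error -> make it easier
            streak = 0
            if direction == -1:
                reversals.append(level)
            direction = +1
            level += step
    return sum(reversals[4:]) / len(reversals[4:])  # discard early reversals
```

For a hypothetical listener whose detection probability is logistic with midpoint -6 dB SNR and slope 2 dB, the track settles near the 70.7% point at roughly -4.2 dB SNR. Running the same recorded stimuli and tracking rule through both model and listeners, as the study did, removes this procedural detail as a source of model-data mismatch.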
Collapse
Affiliation(s)
- Yonghee Oh
- Department of Otolaryngology-Head and Neck Surgery and Communicative Disorders, University of Louisville, Louisville, Kentucky 40202, USA
| | - Karl D Lerud
- Department of Psychological Sciences, University of Connecticut, Storrs, Connecticut 06269, USA
| | - Evelyn Hoglund
- Department of Speech and Hearing Science, Ohio State University, Columbus, Ohio 43210, USA
| | - Niall Klyn
- Department of Speech and Hearing Science, Ohio State University, Columbus, Ohio 43210, USA
| | - Edward W Large
- Department of Psychological Sciences, University of Connecticut, Storrs, Connecticut 06269, USA
- Department of Physics, University of Connecticut, Storrs, Connecticut 06269, USA
- Oscilloscape, LLC, 400 Farmington Avenue, Farmington, Connecticut 06032, USA
| | - Lawrence L Feth
- Department of Speech and Hearing Science, Ohio State University, Columbus, Ohio 43210, USA
| |
Collapse
|
15
|
Sewell K, Brown VA, Farwell G, Rogers M, Zhang X, Strand JF. The effects of temporal cues, point-light displays, and faces on speech identification and listening effort. PLoS One 2023; 18:e0290826. [PMID: 38019831 PMCID: PMC10686424 DOI: 10.1371/journal.pone.0290826] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/24/2023] [Accepted: 08/16/2023] [Indexed: 12/01/2023] Open
Abstract
Among the most robust findings in speech research is that the presence of a talking face improves the intelligibility of spoken language. Talking faces supplement the auditory signal by providing fine phonetic cues based on the placement of the articulators, as well as temporal cues to when speech is occurring. In this study, we varied the amount of information contained in the visual signal, ranging from temporal information alone to a natural talking face. Participants were presented with spoken sentences in energetic or informational masking in four different visual conditions: audio-only, a modulating circle providing temporal cues to salient features of the speech, a digitally rendered point-light display showing lip movement, and a natural talking face. We assessed both sentence identification accuracy and self-reported listening effort. Audiovisual benefit for intelligibility was observed for the natural face in both informational and energetic masking, but the digitally rendered point-light display only provided benefit in energetic masking. Intelligibility for speech accompanied by the modulating circle did not differ from the audio-only conditions in either masker type. Thus, the temporal cues used here were insufficient to improve speech intelligibility in noise, but some types of digital point-light displays may contain enough phonetic detail to produce modest improvements in speech identification in noise.
Collapse
Affiliation(s)
- Katrina Sewell
- Department of Psychology, Carleton College, Northfield, MN, United States of America
| | - Violet A. Brown
- Department of Psychological & Brain Sciences, Washington University in St. Louis, St. Louis, MO, United States of America
| | - Grace Farwell
- Department of Psychology, Carleton College, Northfield, MN, United States of America
| | - Maya Rogers
- Department of Psychology, Carleton College, Northfield, MN, United States of America
| | - Xingyi Zhang
- Department of Psychology, Carleton College, Northfield, MN, United States of America
| | - Julia F. Strand
- Department of Psychology, Carleton College, Northfield, MN, United States of America
| |
Collapse
|
16
|
Cychosz M, Xu K, Fu QJ. Effects of spectral smearing on speech understanding and masking release in simulated bilateral cochlear implants. PLoS One 2023; 18:e0287728. [PMID: 37917727 PMCID: PMC10621938 DOI: 10.1371/journal.pone.0287728] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/10/2023] [Accepted: 06/11/2023] [Indexed: 11/04/2023] Open
Abstract
Differences in spectro-temporal degradation may explain some variability in cochlear implant users' speech outcomes. The present study employs vocoder simulations on listeners with typical hearing to evaluate how differences in degree of channel interaction across ears affect spatial speech recognition. Speech recognition thresholds and spatial release from masking were measured in 16 normal-hearing subjects listening to simulated bilateral cochlear implants. 16-channel sine-vocoded speech simulated limited, broad, or mixed channel interaction, in dichotic and diotic target-masker conditions, across ears. Thresholds were highest with broad channel interaction in both ears but improved when interaction decreased in one ear and again in both ears. Masking release was apparent across conditions. Results from this simulation study on listeners with typical hearing show that channel interaction may impact speech recognition more than masking release, and may have implications for the effects of channel interaction on cochlear implant users' speech recognition outcomes.
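Editor's note: channel interaction in vocoder simulations is often modeled as cross-channel smearing of the per-channel envelopes. The toy sketch below (pure Python; the exponential-spread slopes are illustrative assumptions, not the authors' vocoder parameters) shows how a broad spread flattens envelope contrast across a 16-channel analysis, the kind of degradation manipulated in the study:

```python
def spread_row(n, ch, db_per_channel):
    """Exponential spread: channel `ch` leaks into its neighbours at
    -db_per_channel dB per channel of separation (amplitude weights)."""
    return [10 ** (-db_per_channel * abs(ch - j) / 20.0) for j in range(n)]

def smear(envelopes, db_per_channel):
    """Mix each channel's envelope with its neighbours' using the
    normalised spread weights, mimicking channel interaction."""
    n = len(envelopes)
    out = []
    for ch in range(n):
        w = spread_row(n, ch, db_per_channel)
        total = sum(w)
        out.append(sum(wj * e for wj, e in zip(w, envelopes)) / total)
    return out

def contrast(env):
    """Peak-to-valley ratio: a crude proxy for spectral resolution."""
    return max(env) / min(env)
```

With a spectrally peaky input, a shallow slope (broad interaction, e.g. 3 dB/channel) leaves far less channel contrast than a steep one (limited interaction, e.g. 24 dB/channel), consistent with the threshold ordering reported above.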
Collapse
Affiliation(s)
- Margaret Cychosz
- Department of Linguistics, University of California, Los Angeles, Los Angeles, CA, United States of America
| | - Kevin Xu
- Department of Head and Neck Surgery, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, United States of America
| | - Qian-Jie Fu
- Department of Head and Neck Surgery, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, United States of America
| |
Collapse
|
17
|
Flaherty MM, Arzuaga B, Bottalico P. The effects of face masks on speech-in-speech recognition for children and adults. Int J Audiol 2023; 62:1014-1021. [PMID: 36688609 DOI: 10.1080/14992027.2023.2168218] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/20/2022] [Accepted: 01/05/2023] [Indexed: 01/24/2023]
Abstract
OBJECTIVES This study explored the effects of different face masks on school-age children's and young adults' word recognition. DESIGN Speech recognition thresholds were measured adaptively in a two-talker speech masker using a closed-set picture pointing task. Target words were recorded by a female talker in five conditions: no mask, transparent mask, face shield, N95 mask and surgical mask. STUDY SAMPLES Thirty children (8-12 years) and 25 adults (18-25 years) with normal hearing. RESULTS Both children's and adults' word recognition was most negatively impacted by the face shield. Children's recognition was also impaired by the transparent mask. No negative effects were observed for the N95 or surgical mask for either age group. CONCLUSION School-age children, like young adults, are negatively affected by face masks when recognising speech in a two-talker speech masker, but the effects depend on the type of face mask being worn. Acoustic analyses suggest that the reflective materials used for masks impact speech signal quality and impair word recognition.
Collapse
Affiliation(s)
- Mary M Flaherty
- Department of Speech and Hearing Science, University of Illinois at Urbana-Champaign, Urbana, IL, USA
| | - Briana Arzuaga
- Department of Speech and Hearing Science, University of Illinois at Urbana-Champaign, Urbana, IL, USA
| | - Pasquale Bottalico
- Department of Speech and Hearing Science, University of Illinois at Urbana-Champaign, Urbana, IL, USA
| |
Collapse
|
18
|
Lupker SJ, Spinelli G. An examination of models of reading multi-morphemic and pseudo multi-morphemic words using sandwich priming. J Exp Psychol Learn Mem Cogn 2023; 49:1861-1880. [PMID: 37668567 DOI: 10.1037/xlm0001289] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 09/06/2023]
Abstract
Rastle et al. (2004) reported that true (e.g., walker) and pseudo (e.g., corner) multi-morphemic words prime their stem words more than form controls do (e.g., brothel priming BROTH) in a masked priming lexical decision task. This data pattern has led a number of models to propose that both of the former word types are "decomposed" into their stem (e.g., walk, corn) and affix (e.g., -er) early in the reading process. The present experiments were designed to examine the models proposed to explain Rastle et al.'s effect, including models not assuming a decomposition process, using a more sensitive priming technique, sandwich priming (Lupker & Davis, 2009). Experiment 1, using the conventional masked priming procedure, replicated Rastle et al.'s results. Experiments 2 and 3, involving sandwich priming procedures, showed a clear dissociation between priming effects for true versus pseudo multi-morphemic words, results that are not easily explained by any of the current models. Nonetheless, the overall data pattern does appear to be most consistent with there being a decomposition process when reading real and pseudo multi-morphemic words, a process that involves activating (and inhibiting) lexical-level representations including a representation for the affix (e.g., -er), with the ultimate lexical decision being based on the process of resolving the pattern created by the activated representational units. (PsycInfo Database Record (c) 2023 APA, all rights reserved).
Collapse
Affiliation(s)
| | - Giacomo Spinelli
- Dipartimento di Psicologia, Università degli Studi di Milano-Bicocca
| |
Collapse
|
19
|
Lugli M. Toward a general model for the evolution of the auditory sensitivity under variable ambient noise conditions. J Acoust Soc Am 2023; 154:2236-2255. [PMID: 37819375 DOI: 10.1121/10.0021306] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 02/27/2023] [Accepted: 09/21/2023] [Indexed: 10/13/2023]
Abstract
Ambient noise constrains the evolution of acoustic signals and hearing. An earlier fitness model showed that the trade-off between sound detection and recognition helps predict the best level of auditory sensitivity for acoustic communication in noise. Here, the early model is improved to investigate the effects of different noise masking conditions and signal-to-noise ratios (SNRs). It is revealed that low sensitivity is expected for acoustic communication over short distances in complex noisy environments provided missed sound recognition is costly. By contrast, high sensitivity is expected for acoustic communication over long distances in quieter habitats or when sounds are received with good SNRs under unfavorable noise conditions. High sensitivity is also expected in noisy environments characterized by one dominant source of noise with a fairly constant spectrum (running-water noise) or when sounds are processed using anti-masking strategies favoring the detection and recognition of sound embedded in noise. These predictions help explain unexpected findings that do not fit with the current view on the effects of environmental selection on signal and sensitivity. Model predictions are compared with those of models of signal detection in noisy conditions and results of empirical studies.
Collapse
Affiliation(s)
- Marco Lugli
- Department of Chemistry, Life Sciences and Environmental Sustainability-Unit of Behavioral Biology, University of Parma, Parma, Italy
| |
Collapse
|
20
|
Kinoshita S, Liong G. Mirror letter priming is rightward-biased but not inhibitory: Little evidence for a mirror suppression mechanism in the recognition of mirror letters. J Exp Psychol Learn Mem Cogn 2023; 49:1523-1538. [PMID: 37053425 DOI: 10.1037/xlm0001239] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 04/15/2023]
Abstract
Unlike other visual objects, which are invariant to left-right orientation, mirror letters (e.g., b and d) represent different object identities. Previous masked priming lexical decision studies have suggested that the identification of a mirror letter involves suppression of its mirror image counterpart, reporting as evidence that a pseudoword prime containing the mirror letter counterpart slowed down the recognition of the target word relative to a control prime containing an unrelated letter (e.g., ibea-idea > ilea-idea). Furthermore, it has been reported recently that this inhibitory mirror priming effect is sensitive to the distributional bias of left/right orientation in the Latin alphabet, such that only the more dominant (frequent) right-facing mirror letter prime (e.g., b) produced interference. In the present study, we examined mirror letter priming with single letters and nonlexical letter strings with adult readers. In all experiments, relative to a visually dissimilar control letter prime, both the right-facing and left-facing mirror letter primes consistently facilitated, rather than slowed down, the recognition of a target letter (e.g., b-d < w-d). Assessed against an identity prime, mirror primes showed a rightward bias, although it was small in magnitude and not always significant within an individual experiment. These results provide no support for a mirror suppression mechanism in the identification of mirror letters, and an alternative interpretation in terms of noisy perception is suggested. (PsycInfo Database Record (c) 2023 APA, all rights reserved).
Collapse
|
21
|
Byrne AJ, Conroy C, Kidd G. Individual differences in speech-on-speech masking are correlated with cognitive and visual task performance. J Acoust Soc Am 2023; 154:2137-2153. [PMID: 37800988 PMCID: PMC10631817 DOI: 10.1121/10.0021301] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/22/2023] [Revised: 07/19/2023] [Accepted: 09/17/2023] [Indexed: 10/07/2023]
Abstract
Individual differences in spatial tuning for masked target speech identification were determined using maskers that varied in type and proximity to the target source. The maskers were chosen to produce three strengths of informational masking (IM): high [same-gender, speech-on-speech (SOS) masking], intermediate (the same masker speech time-reversed), and low (speech-shaped, speech-envelope-modulated noise). Typical for this task, individual differences increased as IM increased, while overall performance decreased. To determine the extent to which auditory performance might generalize to another sensory modality, a comparison visual task was also implemented. Visual search time was measured for identifying a cued object among "clouds" of distractors that were varied symmetrically in proximity to the target. The visual maskers also were chosen to produce three strengths of an analog of IM based on feature similarities between the target and maskers. Significant correlations were found for overall auditory and visual task performance, and both of these measures were correlated with an index of general cognitive reasoning. Overall, the findings provide qualified support for the proposition that the ability of an individual to solve IM-dominated tasks depends on cognitive mechanisms that operate in common across sensory modalities.
Collapse
Affiliation(s)
- Andrew J Byrne
- Department of Speech, Language and Hearing Sciences and Hearing Research Center, Boston University, Boston, Massachusetts 02215, USA
| | - Christopher Conroy
- Department of Biological and Vision Sciences, State University of New York College of Optometry, New York, New York 10036, USA
| | - Gerald Kidd
- Department of Speech, Language and Hearing Sciences and Hearing Research Center, Boston University, Boston, Massachusetts 02215, USA
| |
Collapse
|
22
|
Lutfi RA, Zandona M, Lee J. Simultaneous relative cue reliance in speech-on-speech masking. J Acoust Soc Am 2023; 154:2530-2538. [PMID: 37870932 DOI: 10.1121/10.0021874] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 07/20/2023] [Accepted: 09/27/2023] [Indexed: 10/25/2023]
Abstract
Modern hearing research has identified the ability of listeners to segregate simultaneous speech streams with a reliance on three major voice cues: fundamental frequency, level, and location. Few of these studies evaluated reliance on these cues presented simultaneously, as occurs in nature, and fewer still considered the listeners' relative reliance on these cues owing to the cues' different units of measure. In the present study, trial-by-trial analyses were used to isolate the listener's simultaneous reliance on the three voice cues, with the behavior of an ideal observer [Green and Swets (1966). (Wiley, New York), pp. 151-178] serving as a comparison standard for evaluating relative reliance. Listeners heard on each trial a pair of randomly selected, simultaneous recordings of naturally spoken sentences. One of the recordings was always from the same talker, a distracter, and the other, with equal probability, was from one of two target talkers differing in the three voice cues. The listener's task was to identify the target talker. Among 33 clinically normal-hearing adults, only one relied predominantly on voice level; the remaining were split between voice fundamental frequency and/or location. The results are discussed regarding their implications for the common practice in studies of using target-distracter level as a dependent measure of speech-on-speech masking.
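Editor's note: trial-by-trial reliance on simultaneously varying cues is typically quantified with decision weights, i.e., correlations between each cue's trial-to-trial value and the listener's binary response. The simulation below is a sketch in the spirit of such analyses (pure Python; the observer's weights and internal noise are invented for illustration):

```python
import math
import random

def corr(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return sxy / (sx * sy)

def decision_weights(n_trials=4000, true_w=(0.6, 0.3, 0.1), seed=7):
    """Simulate an observer who combines three standardized voice-cue
    differences (e.g. F0, level, location) with fixed weights plus
    internal noise, then recover relative reliance on each cue from
    cue-response correlations, normalised to sum to 1."""
    rng = random.Random(seed)
    cues = [[rng.gauss(0, 1) for _ in range(len(true_w))]
            for _ in range(n_trials)]
    resp = [1 if sum(w * c for w, c in zip(true_w, trial))
                 + rng.gauss(0, 0.5) > 0 else 0
            for trial in cues]
    raw = [corr([t[i] for t in cues], resp) for i in range(len(true_w))]
    total = sum(abs(r) for r in raw)
    return [abs(r) / total for r in raw]
```

Because each cue is expressed as a correlation with the response, the recovered weights are unitless, which is one way around the different-units-of-measure problem the abstract raises.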
Collapse
Affiliation(s)
- R A Lutfi
- Auditory Behavioral Research Lab, Department of Communication Sciences and Disorders, University of South Florida, Tampa, Florida 33620, USA
| | - M Zandona
- Auditory Behavioral Research Lab, Department of Communication Sciences and Disorders, University of South Florida, Tampa, Florida 33620, USA
| | - J Lee
- Auditory Behavioral Research Lab, Department of Communication Sciences and Disorders, University of South Florida, Tampa, Florida 33620, USA
| |
Collapse
|
23
|
van Schoonhoven J, Rhebergen KS, Dreschler WA. A context-based approach to predict intelligibility of meaningful and nonsense words in interrupted noise: Model evaluation. J Acoust Soc Am 2023; 154:2476-2488. [PMID: 37862572 DOI: 10.1121/10.0021302] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 12/31/2022] [Accepted: 09/17/2023] [Indexed: 10/22/2023]
Abstract
The context-based Extended Speech Transmission Index (cESTI) by Van Schoonhoven et al. (2022) was successfully used to predict the intelligibility of meaningful, monosyllabic words in interrupted noise. However, it is not clear how the model behaves when using different degrees of context. In the current paper, the intelligibility of meaningful and nonsense CVC words in stationary and interrupted noise was measured in fourteen normally hearing adults. Intelligibility of nonsense words in interrupted noise at -18 dB SNR was relatively poor, possibly because listeners did not profit from coarticulatory cues as they did in stationary noise. With 75% of the total variance explained, the cESTI model performed better than the original ESTI model (R2 = 27%), especially due to better predictions at low interruption rates. However, predictions for meaningful word scores were relatively poor (R2 = 38%), mainly due to remaining inaccuracies at interruption rates below 4 Hz and a large effect of forward masking. Adjusting parameters of the forward masking function improved the accuracy of the model to a total explained variance of 83%, while the predictive power for previously published cESTI data remained similar.
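Editor's note: at the core of STI-family models such as the ESTI and cESTI is a fixed mapping from a modulation transfer ratio to a transmission index. The sketch below reproduces only that textbook mapping from the classic STI (IEC 60268-16); the cESTI's context and forward-masking machinery is not reproduced here:

```python
import math

def modulation_to_ti(m):
    """Map a modulation transfer ratio m in (0, 1) to a transmission
    index: convert m to an apparent SNR in dB, clip to +/-15 dB, and
    rescale linearly to [0, 1], as in the classic STI."""
    snr = 10.0 * math.log10(m / (1.0 - m))     # apparent SNR in dB
    snr = max(-15.0, min(15.0, snr))           # clip to the +/-15 dB range
    return (snr + 15.0) / 30.0                 # rescale to [0, 1]
```

For example, a half-preserved modulation (m = 0.5) corresponds to 0 dB apparent SNR and a transmission index of 0.5; fully preserved or fully destroyed modulations saturate at 1 and 0 after clipping.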
Collapse
Affiliation(s)
- Jelmer van Schoonhoven
- Department of Clinical and Experimental Audiology, Amsterdam University Medical Center, Meibergdreef 9, 1105 AZ, Amsterdam, The Netherlands
| | - Koenraad S Rhebergen
- Department of Otorhinolaryngology and Head & Neck Surgery, Rudolf Magnus Institute of Neuroscience, University Medical Center Utrecht, The Netherlands
| | - Wouter A Dreschler
- Department of Clinical and Experimental Audiology, Amsterdam University Medical Center, Meibergdreef 9, 1105 AZ, Amsterdam, The Netherlands
| |
Collapse
|
24
|
Jimenez M, Prieto A, Gómez P, Hinojosa JA, Montoro PR. Masked priming under the Bayesian microscope: Exploring the integration of local elements into global shape through Bayesian model comparison. Conscious Cogn 2023; 115:103568. [PMID: 37708623 DOI: 10.1016/j.concog.2023.103568] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/25/2022] [Revised: 08/24/2023] [Accepted: 08/24/2023] [Indexed: 09/16/2023]
Abstract
To investigate whether local elements are grouped into global shapes in the absence of awareness, we introduced two different masked priming designs (i.e., the classic dissociation paradigm and a trial-wise probe and prime discrimination task) and collected both objective (i.e., performance based) and subjective (using the perceptual awareness scale [PAS]) awareness measures. Prime visibility was manipulated using three different prime-mask stimulus onset asynchronies (SOAs) and an unmasked condition. Our results showed that assessing prime visibility trial-wise heavily interfered with masked priming, preventing any prime facilitation effect. The implementation of Bayesian regression models, which predict priming effects for participants whose awareness levels are at chance level, provided strong evidence in favor of the hypothesis that local elements group into global shape in the absence of awareness for SOAs longer than 50 ms, suggesting that prime-mask SOA is a crucial factor in the processing of global shape without awareness.
Collapse
Affiliation(s)
- Mikel Jimenez
- Department of Psychology, University of Durham, Durham, United Kingdom.
| | | | - Pablo Gómez
- California State University San Bernardino, Palm Desert Campus, USA
| | - José Antonio Hinojosa
- Facultad de Lenguas y Educación, Universidad de Nebrija, Madrid, Spain; Instituto Pluridisciplinar, Universidad Complutense de Madrid, Spain; Departamento de Psicología Experimental, Procesos Psicológicos y Logopedia, Universidad Complutense de Madrid, Spain
| | | |
Collapse
|
25
|
Silva AE, Lehmann R, Perikleous N, Thompson B. The temporal dynamics of visual crowding in letter recognition: Modulating crowding with alternating flicker presentations. J Vis 2023; 23:18. [PMID: 37768277 PMCID: PMC10540873 DOI: 10.1167/jov.23.10.18] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/17/2023] [Accepted: 08/31/2023] [Indexed: 09/29/2023] Open
Abstract
Visual crowding reduces the visibility of a peripherally presented group of stimuli. This is especially challenging for peripheral reading because adjacent letters or characters perceptually crowd one another. We investigated the temporal course of spatial visual crowding by sequentially alternating the visibility of the target and flanking letters within a trigram letter stimulus presented 9° below fixation. We found that alternation rates of roughly 3 Hz released half of the total effect of crowding, whereas 10 Hz alternation rates elicited near-crowded performance. Furthermore, we found a robust performance asymmetry whereby presenting the target first elicited better performance than presenting the flankers first, an effect resembling forward masking. These results held for conditions of high, medium, and low spatial crowding. Future work will determine whether the alternation rates found in the current study can improve peripheral reading.
Collapse
Affiliation(s)
- Andrew E Silva
- School of Optometry and Vision Science, University of Waterloo, Waterloo, Ontario, Canada
| | - Rebecca Lehmann
- Aalen University, Optics and Mechatronics, Aalen, Baden-Wuerttemberg, Germany
| | - Niki Perikleous
- Aalen University, Optics and Mechatronics, Aalen, Baden-Wuerttemberg, Germany
| | - Benjamin Thompson
- School of Optometry and Vision Science, University of Waterloo, Waterloo, Ontario, Canada
- Centre for Eye and Vision Research, Hong Kong, SAR China
- Liggins Institute, University of Auckland, Auckland, New Zealand
| |
Collapse
|
26
|
Komiyama T, Takedomi H, Aoyama C, Goya R, Shimegi S. Acute exercise has specific effects on the formation process and pathway of visual perception in healthy young men. Eur J Neurosci 2023; 58:3239-3252. [PMID: 37424403 DOI: 10.1111/ejn.16082] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/16/2022] [Revised: 06/19/2023] [Accepted: 06/20/2023] [Indexed: 07/11/2023]
Abstract
Visual perception is formed over time through the formation process and visual pathway. Exercise improves visual perception, but it is unclear whether exercise modulates nonspecifically or specifically the formation process and pathway of visual perception. Healthy young men performed the visual detection task in a backward masking paradigm before and during cycling exercise at a mild intensity or rest (control). The task presented gratings of a circular patch (target) and annulus (mask) arranged concentrically as a visual stimulus and asked if the presence and striped pattern (feature) of the target were detected. The relationship between the orientations of the gratings of the target and the mask included iso-orientation and orthogonal orientation to investigate the orientation selectivity of the masking effect. The masking effect was evaluated by perceptual suppressive index (PSI). Exercise improved feature detection (∆PSI; Exercise: -20.6%, Control: 1.7%) but not presence detection (∆PSI; Exercise: 8.9%, Control: 29.6%) compared to the control condition, and the improving effect resulted from the attenuation of the non-orientation-selective (∆PSI; Exercise: -29.0%, Control: 16.8%) but not orientation-selective masking effect (∆PSI; Exercise: -3.1%, Control: 11.7%). These results suggest that exercise affects the formation process of the perceptual feature of the target stimulus by suppressively modulating the neural networks responsible for the non-orientation-selective surround interaction in the subcortical visual pathways, whose effects are inherited by the cortical visual pathways necessary for perceptual image formation. In conclusion, our findings suggest that acute exercise improves visual perception transiently through the modulation of a specific formation process of visual processing.
Affiliation(s)
- Takaaki Komiyama: Laboratory of Brain Information Science in Sports, Center for Education in Liberal Arts and Science, Osaka University, Toyonaka, Japan
- Hiromasa Takedomi: Graduate School of Frontier of Biosciences, Osaka University, Toyonaka, Japan
- Chisa Aoyama: Graduate School of Medicine, Osaka University, Toyonaka, Japan
- Ryoma Goya: Graduate School of Frontier of Biosciences, Osaka University, Toyonaka, Japan; Faculty of Sports Science, Fukuoka University, Fukuoka, Japan
- Satoshi Shimegi: Laboratory of Brain Information Science in Sports, Center for Education in Liberal Arts and Science, Osaka University, Toyonaka, Japan; Graduate School of Frontier of Biosciences, Osaka University, Toyonaka, Japan; Graduate School of Medicine, Osaka University, Toyonaka, Japan

27
Pittrich K, Schroeder S. Priming effects in reading words with vertically and horizontally mirrored letters. Q J Exp Psychol (Hove) 2023; 76:2183-2196. [PMID: 36384348 PMCID: PMC10466978 DOI: 10.1177/17470218221141076] [Received: 01/27/2022] [Revised: 10/09/2022] [Accepted: 10/19/2022] [Indexed: 11/18/2022]
Abstract
We conducted two masked priming experiments to examine how the orthographic system processes words with mirrored letters. In both experiments, four different primes were used: an identity prime, an unrelated control prime, and two mirror primes in which letters were mirrored along either their vertical or their horizontal axis. The task varied between experiments: in Experiment 1, we used a lexical decision task, and in Experiment 2, a cross-case same-different match task. We expected priming effects in both mirror conditions, with stronger effects for vertically than for horizontally mirrored letters. In the lexical decision task, we observed only vertical priming effects for words, whereas in the same-different task, priming effects were present in both mirror conditions and for both words and non-words. We discuss the implications of our findings for extant models of orthographic processing.
Affiliation(s)
- Katharina Pittrich: Department of Educational Psychology, University of Göttingen, Göttingen, Germany
- Sascha Schroeder: Department of Educational Psychology, University of Göttingen, Göttingen, Germany

28
Benyhe A, Labusch M, Perea M. Just a mark: Diacritic function does not play a role in the early stages of visual word recognition. Psychon Bull Rev 2023; 30:1530-1538. [PMID: 36635587 DOI: 10.3758/s13423-022-02244-4] [Accepted: 12/29/2022] [Indexed: 01/13/2023]
Abstract
A very common feature in most writing systems is the presence of diacritics: distinguishing marks that are added for various linguistic reasons. Most models of reading, however, have not yet captured the nature of these marks. Recent priming experiments in several languages have attempted to resolve how diacritical letters are represented in the visual word recognition system. Since the function and appearance of diacritics change from one language to another, it is hard to interpret the accumulated evidence. With this in mind, we conducted two masked priming lexical decision experiments in Hungarian, a transparent orthography with a wide use of diacritic vowels that allows for clear-cut manipulations. In the two experiments, we manipulated the presence or absence of the same diacritic (i.e., the acute accent) on two specific sets of letters that behave differently. In Experiment 1, the manipulation changed only the length of vowels, whereas in Experiment 2, it also changed their quality (e.g., a↝/ɒ/ vs. á↝/aː/). In both experiments, we found that primes with an omitted diacritic were just as effective as identity primes (nema→NÉMA = néma→NÉMA [mute]), whereas the addition of a diacritic came with a cost (mése→MESE > mese→MESE [tale]). This asymmetry favors a purely perceptual account of the very early stages of word recognition, leaving these stages blind to the function of diacritics. We suggest that the linguistic functions of diacritics originate at later processing stages.
Affiliation(s)
- András Benyhe: Department of Physiology, Albert Szent-Györgyi Medical School, University of Szeged, Dóm tér 10, Szeged, 6720, Hungary
- Melanie Labusch: Universidad Antonio de Nebrija, Madrid, Spain; Universitat de València, Valencia, Spain
- Manuel Perea: Universidad Antonio de Nebrija, Madrid, Spain; Universitat de València, Valencia, Spain

29
Hiraumi H, Oikawa SI, Shiga K, Sato H. Systemic cisplatin increases the number of patients showing positive off-frequency masking audiometry. PLoS One 2023; 18:e0287400. [PMID: 37410731 PMCID: PMC10325046 DOI: 10.1371/journal.pone.0287400] [Received: 01/27/2023] [Accepted: 06/05/2023] [Indexed: 07/08/2023]
Abstract
OBJECTIVE The study aimed to evaluate the effect of systemic cisplatin administration on off-frequency masking audiometry. METHODS Among 26 patients receiving systemic cisplatin, 48 ears were included in the analysis. All patients underwent pure-tone audiometry with ipsilateral narrow-band masking noise (off-frequency masking audiometry), in which 70 dBHL band-pass noise (center frequency 1000 Hz, 1/3 octave bandwidth) was administered to the tested ear. The acquired thresholds were compared to those of standard pure-tone audiometry, and threshold elevations greater than 10 dB were regarded as significant. The number of patients showing abnormal threshold elevation was compared before and after cisplatin administration. RESULTS Before cisplatin administration, 91.7, 93.8, 97.9, and 93.8% of ears showed normal off-frequency masking audiometry outcomes at 125, 250, 6000, and 8000 Hz, respectively. After cisplatin administration, a higher number of patients showed abnormal outcomes, and this change was more prominent with increasing doses of cisplatin. After cisplatin administration of 100-200 mg/m2, the prevalence of patients with normal outcomes was 77.3, 70.5, 90.9, and 88.6% at 125, 250, 6000, and 8000 Hz, respectively. At 250 Hz, the change was statistically significant (p = 0.01, chi-squared test).
Affiliation(s)
- Harukazu Hiraumi: Department of Otolaryngology—Head and Neck Surgery, Iwate Medical University, Yahaba, Shiwa, Iwate, Japan
- Shin-ichi Oikawa: Department of Otolaryngology—Head and Neck Surgery, Iwate Medical University, Yahaba, Shiwa, Iwate, Japan
- Kiyoto Shiga: Department of Otolaryngology—Head and Neck Surgery, Iwate Medical University, Yahaba, Shiwa, Iwate, Japan
- Hiroaki Sato: Department of Otolaryngology—Head and Neck Surgery, Iwate Medical University, Yahaba, Shiwa, Iwate, Japan

30
de Cheveigné A. In-channel cancellation: A model of early auditory processing. J Acoust Soc Am 2023; 153:3350. [PMID: 37328948 DOI: 10.1121/10.0019752] [Received: 02/13/2023] [Accepted: 06/02/2023] [Indexed: 06/18/2023]
Abstract
A model of early auditory processing is proposed in which each peripheral channel is processed by a delay-and-subtract cancellation filter, tuned independently for each channel with a criterion of minimum power. For a channel dominated by a pure tone or a resolved partial of a complex tone, the optimal delay is its period. For a channel responding to harmonically related partials, the optimal delay is their common fundamental period. Each peripheral channel is thus split into two subchannels: one that is cancellation-filtered and one that is not. Perception can involve either or both, depending on the task. The model is illustrated by applying it to the masking asymmetry between pure tones and narrowband noise: a noise target masked by a tone is more easily detectable than a tone target masked by noise. The model is one of a wider class of models, monaural or binaural, that cancel irrelevant stimulus dimensions to attain invariance to competing sources. Similar to occlusion in the visual domain, cancellation yields sensory evidence that is incomplete, thus requiring Bayesian inference of an internal model of the world along the lines of Helmholtz's doctrine of unconscious inference.
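The delay-and-subtract cancellation stage described in this abstract is straightforward to sketch. The snippet below is a minimal illustration under assumed parameters, not the author's implementation: for each candidate delay, the filter's output power is computed, and the delay minimizing that power is kept. For a channel dominated by a pure tone, the winning delay is the tone's period, and the cancellation-filtered subchannel goes silent.

```python
import numpy as np

def cancellation_filter(x, max_delay):
    """Delay-and-subtract filter y[n] = x[n] - x[n - d], with the delay d
    chosen by the model's criterion of minimum output power."""
    best_d, best_power = 1, np.inf
    for d in range(1, max_delay + 1):
        y = x[d:] - x[:-d]
        power = np.mean(y ** 2)
        if power < best_power:
            best_d, best_power = d, power
    return best_d, x[best_d:] - x[:-best_d]

fs = 8000                                   # sample rate (Hz), illustrative
t = np.arange(2048) / fs
tone = np.sin(2 * np.pi * 500 * t)          # 500 Hz tone: period = 16 samples
d_opt, residual = cancellation_filter(tone, max_delay=20)
# d_opt equals the tone's period (16 samples) and the residual power is
# near zero: this channel is fully cancelled.
```

For a channel carrying harmonically related partials, the same search would settle on the common fundamental period instead.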
Affiliation(s)
- Alain de Cheveigné: Laboratoire des Systèmes Perceptifs, Unité Mixte de Recherche 8248, Centre National de la Recherche Scientifique, Paris, France

31
Borrie SA, Yoho SE, Healy EW, Barrett TS. The Application of Time-Frequency Masking To Improve Intelligibility of Dysarthric Speech in Background Noise. J Speech Lang Hear Res 2023; 66:1853-1866. [PMID: 36944186 PMCID: PMC10457087 DOI: 10.1044/2023_jslhr-22-00558] [Received: 09/24/2022] [Revised: 12/13/2022] [Accepted: 01/10/2023] [Indexed: 05/11/2023]
Abstract
PURPOSE Background noise reduces speech intelligibility. Time-frequency (T-F) masking is an established signal processing technique that improves intelligibility of neurotypical speech in background noise. Here, we investigated a novel application of T-F masking, assessing its potential to improve intelligibility of neurologically degraded speech in background noise. METHOD Listener participants (N = 422) completed an intelligibility task either in the laboratory or online, listening to and transcribing audio recordings of neurotypical (control) and neurologically degraded (dysarthria) speech under three different processing types: speech in quiet (quiet), speech mixed with cafeteria noise (noise), and speech mixed with cafeteria noise and then subsequently processed by an ideal quantized mask (IQM) to remove the noise. RESULTS We observed significant reductions in intelligibility of dysarthric speech, even at highly favorable signal-to-noise ratios (+11 to +23 dB) that did not impact neurotypical speech. We also observed significant intelligibility improvements from speech in noise to IQM-processed speech for both control and dysarthric speech across a wide range of noise levels. Furthermore, the overall benefit of IQM processing for dysarthric speech was comparable with that of the control speech in background noise, as were the intelligibility data collected in the laboratory and online. CONCLUSIONS This study demonstrates proof of concept, validating the application of T-F masks to a neurologically degraded speech signal. Given that intelligibility challenges greatly impact communication, and thus the lives of people with dysarthria and their communication partners, the development of clinical tools to enhance intelligibility in this clinical population is critical.
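An ideal quantized mask assigns each time-frequency unit one of a small set of gains based on the local signal-to-noise ratio; the ideal binary mask sketched below is its simplest, two-gain case. This is a hedged illustration with invented stand-in signals and an assumed local criterion, not the processing pipeline used in the study:

```python
import numpy as np

def stft_power(x, win=256, hop=128):
    """Framewise power spectrum with a Hann window (analysis only)."""
    w = np.hanning(win)
    frames = [x[i:i + win] * w for i in range(0, len(x) - win + 1, hop)]
    return np.abs(np.fft.rfft(np.asarray(frames), axis=1)) ** 2

def ideal_binary_mask(speech, noise, criterion_db=-5.0):
    """Keep a time-frequency unit when its local SNR exceeds the criterion.
    An ideal *quantized* mask generalizes this to several gain steps."""
    S = stft_power(speech)
    N = stft_power(noise)
    snr_db = 10 * np.log10(S / np.maximum(N, 1e-12))
    return (snr_db > criterion_db).astype(float)

rng = np.random.default_rng(0)
speech = np.sin(2 * np.pi * 440 * np.arange(4096) / 16000)  # stand-in "speech"
noise = 0.1 * rng.standard_normal(4096)                     # stand-in noise
mask = ideal_binary_mask(speech, noise)                     # one gain per T-F unit
```

The mask is "ideal" because the clean speech and noise are known separately before mixing, which is exactly why it serves as a proof-of-concept tool rather than a deployable enhancer.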
Affiliation(s)
- Stephanie A. Borrie: Department of Communicative Disorders and Deaf Education, Utah State University, Logan
- Sarah E. Yoho: Department of Communicative Disorders and Deaf Education, Utah State University, Logan; Department of Speech and Hearing Science, The Ohio State University, Columbus
- Eric W. Healy: Department of Speech and Hearing Science, The Ohio State University, Columbus

32
Davidson A, Eitel M, Lange RT, French LM, Lippa S, Brickell TA, Brungart D. Efficient Estimation of the Binaural Masking Level Difference Using a Technique Based on Manual Audiometry. J Speech Lang Hear Res 2023; 66:1378-1393. [PMID: 36898137 DOI: 10.1044/2022_jslhr-22-00519] [Indexed: 06/14/2023]
Abstract
PURPOSE The Masking Level Difference (MLD) has been used for decades to evaluate the binaural listening advantage. Although originally measured using Bekesy audiometry, the most common clinical use of the MLD is the CD-based Wilson 500-Hz technique with interleaved N0S0 and N0Sπ components. Here, we propose an alternative technique based on manual audiometry as a faster way of measuring the MLD. The article describes the advantages of this administration technique and evaluates whether it is a viable alternative to the Wilson technique. METHOD Data were retrospectively analyzed on 264 service members (SMs). All SMs completed both the Wilson and Manual MLDs. Descriptive and correlational statistics were applied to evaluate the comparisons between the two techniques and highlight the differences. Equivalence measures were also completed to compare the tests using a standardized cutoff score. Analyses were also made to compare both techniques to subjective and objective measures of hearing performance. RESULTS Moderate to high positive correlations were determined between Wilson and Manual measures of each threshold (N0Sπ and N0S0). Although the Manual and Wilson MLD techniques produced significantly different thresholds, simple linear transformations can be used to obtain approximately equivalent scores on the two tests, and agreement was high for using these transformed scores to identify individuals with substantial MLD deficits. Both techniques had moderate test-retest reliability. The Manual MLD and its components had stronger correlations to the subjective and objective hearing measures than the Wilson. CONCLUSIONS The Manual technique is a faster method for obtaining MLD scores that is just as reliable as the CD-based Wilson test. With the significant reduction in assessment time and comparable results, the Manual MLD is a viable alternative for direct use in the clinic.
Affiliation(s)
- Megan Eitel: Walter Reed National Military Medical Center, Bethesda, MD
- Rael T Lange: Walter Reed National Military Medical Center, Bethesda, MD; Traumatic Brain Injury Center of Excellence, Silver Spring, MD; National Intrepid Center of Excellence, Bethesda, MD; University of British Columbia, Vancouver
- Louis M French: Walter Reed National Military Medical Center, Bethesda, MD; Traumatic Brain Injury Center of Excellence, Silver Spring, MD; National Intrepid Center of Excellence, Bethesda, MD; Uniformed Services University of the Health Sciences, Bethesda, MD
- Sara Lippa: Walter Reed National Military Medical Center, Bethesda, MD; National Intrepid Center of Excellence, Bethesda, MD
- Tracey A Brickell: Walter Reed National Military Medical Center, Bethesda, MD; Traumatic Brain Injury Center of Excellence, Silver Spring, MD; National Intrepid Center of Excellence, Bethesda, MD; Uniformed Services University of the Health Sciences, Bethesda, MD

33
Regev J, Zaar J, Relaño-Iborra H, Dau T. Age-related reduction of amplitude modulation frequency selectivity. J Acoust Soc Am 2023; 153:2298. [PMID: 37092934 DOI: 10.1121/10.0017835] [Received: 12/16/2022] [Accepted: 03/27/2023] [Indexed: 05/03/2023]
Abstract
The perception of amplitude modulations (AMs) has been characterized by a frequency-selective process in the temporal envelope domain and simulated in computational auditory processing and perception models using a modulation filterbank. Such AM frequency-selective processing has been argued to be critical for the perception of complex sounds, including speech. This study aimed at investigating the effects of age on behavioral AM frequency selectivity in young (n = 11, 22-29 years) versus older (n = 10, 57-77 years) listeners with normal hearing, using a simultaneous AM masking paradigm with a sinusoidal carrier (2.8 kHz), target modulation frequencies of 4, 16, 64, and 128 Hz, and narrowband-noise modulation maskers. A reduction of AM frequency selectivity by a factor of up to 2 was found in the older listeners. While the observed AM selectivity co-varied with the unmasked AM detection sensitivity, the age-related broadening of the masked threshold patterns remained stable even when AM sensitivity was similar across groups for an extended stimulus duration. The results from the present study might provide a valuable basis for further investigations exploring the effects of age and reduced AM frequency selectivity on complex sound perception as well as the interaction of age and hearing impairment on AM processing and perception.
Affiliation(s)
- Jonathan Regev: Hearing Systems Section, Department of Health Technology, Technical University of Denmark, Kongens Lyngby, 2800, Denmark
- Johannes Zaar: Eriksholm Research Centre, Snekkersten, 3070, Denmark
- Helia Relaño-Iborra: Hearing Systems Section, Department of Health Technology, Technical University of Denmark, Kongens Lyngby, 2800, Denmark
- Torsten Dau: Hearing Systems Section, Department of Health Technology, Technical University of Denmark, Kongens Lyngby, 2800, Denmark

34
Bologna WJ, Carrillo AA, Clamage DS, Coco L, He YJ, de Larrea-Mancera ESL, Stecker GC, Gallun FJ, Seitz AR. Effects of Gamification on Assessment of Spatial Release From Masking. Am J Audiol 2023; 32:210-219. [PMID: 36763846 PMCID: PMC10171850 DOI: 10.1044/2022_aja-22-00133] [Received: 07/21/2022] [Revised: 10/23/2022] [Accepted: 10/25/2022] [Indexed: 02/12/2023]
Abstract
PURPOSE Difficulty understanding speech in noise is a common communication problem. Clinical tests of speech in noise differ considerably from real-world listening and offer patients limited intrinsic motivation to perform well. In order to design a test that captures motivational aspects of real-world communication, this study investigated effects of gamification, or the inclusion of game elements, on a laboratory spatial release from masking test. METHOD Fifty-four younger adults with normal hearing completed a traditional laboratory and a gamified test of spatial release from masking in counterbalanced order. Masker level adapted based on performance, with the traditional test ending after 10 reversals and the gamified test ending when participants solved a visual puzzle. Target-to-masker ratio thresholds (TMRs) with colocated maskers, separated maskers, and estimates of spatial release were calculated after the 10th reversal for both tests and from the last six reversals of the adaptive track from the gamified test. RESULTS Thresholds calculated from the 10th reversal indicated no significant differences between the traditional and gamified tests. A learning effect was observed with spatially separated maskers, such that TMRs were better for the second test than the first, regardless of test order. Thresholds calculated from the last six reversals of the gamified test indicated better TMRs in the separated condition compared to the traditional test. CONCLUSIONS Adding gamified elements to a traditional test of spatial release from masking did not negatively affect test validity or estimates of spatial release. Participants were willing to continue playing the gamified test for an average of 30.2 reversals of the adaptive track. For some listeners, performance in the separated condition continued to improve after the 10th reversal, leading to better TMRs and greater spatial release from masking at the end of the gamified test compared to the traditional test. 
SUPPLEMENTAL MATERIAL https://doi.org/10.23641/asha.22028789.
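Both versions of the test estimate a threshold from reversals of an adaptive track. The sketch below is illustrative only (the level sequence is invented) and shows the rule of averaging the tracked level at the last six direction changes, as was done for the gamified test:

```python
import numpy as np

def reversal_threshold(track, n_reversals=6):
    """Mean target-to-masker ratio (TMR, dB) at the last `n_reversals`
    direction changes of an adaptive track (assumes no repeated levels)."""
    d = np.sign(np.diff(track))
    rev = np.flatnonzero(d[:-1] != d[1:]) + 1   # indices where direction flips
    levels = np.asarray(track, dtype=float)[rev]
    return levels[-n_reversals:].mean(), len(rev)

# Hypothetical TMR track (dB): level drops after correct trials, rises after errors.
track = [0, -2, -4, -2, -4, -6, -4, -6, -4, -2, -4, -6, -4]
tmr, n_found = reversal_threshold(track)
```

Running the gamified test past the 10th reversal simply extends such a track, which is why the last-six-reversals estimate can differ from one computed at the 10th reversal.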
Affiliation(s)
- William J. Bologna: Department of Speech-Language Pathology and Audiology, Towson University, MD
- Laura Coco: Oregon Hearing Research Center, Oregon Health and Science University, Portland; VA Health Services Research & Development (HSR&D) Service Center of Innovation, Center to Improve Veteran Involvement in Care (CIVIC), VA Portland Health Care System, OR; School of Speech, Language, and Hearing Sciences, San Diego State University, CA
- Yue J. He: Brain Game Center, University of California, Riverside
- Frederick J. Gallun: Oregon Hearing Research Center, Oregon Health and Science University, Portland
- Aaron R. Seitz: Brain Game Center, University of California, Riverside; Department of Psychology, University of California, Riverside; Department of Psychology, Northeastern University, Boston, MA

35
Drouin JR, Zysk VA, Myers EB, Theodore RM. Sleep-Based Memory Consolidation Stabilizes Perceptual Learning of Noise-Vocoded Speech. J Speech Lang Hear Res 2023; 66:720-734. [PMID: 36668820 PMCID: PMC10023171 DOI: 10.1044/2022_jslhr-22-00139] [Received: 03/03/2022] [Revised: 07/08/2022] [Accepted: 10/03/2022] [Indexed: 06/17/2023]
Abstract
PURPOSE Sleep-based memory consolidation has been shown to facilitate perceptual learning of atypical speech input including nonnative speech sounds, accented speech, and synthetic speech. The current research examined the role of sleep-based memory consolidation on perceptual learning for noise-vocoded speech, including maintenance of learning over a 1-week time interval. Because comprehending noise-vocoded speech requires extensive restructuring of the mapping between the acoustic signal and prelexical representations, sleep consolidation may be critical for this type of adaptation. Thus, the purpose of this study was to investigate the role of sleep-based memory consolidation on adaptation to noise-vocoded speech in listeners without hearing loss as a foundational step toward identifying parameters that can be useful to consider for auditory training with clinical populations. METHOD Two groups of normal-hearing listeners completed a transcription training task with feedback for noise-vocoded sentences in either the morning or the evening. Learning was assessed through transcription accuracy before training, immediately after training, 12 hr after training, and 1 week after training for both trained and novel sentences. RESULTS Both the morning and evening groups showed improved comprehension of noise-vocoded sentences immediately following training. Twelve hours later, the evening group showed stable gains (following a period of sleep), whereas the morning group demonstrated a decline in gains (following a period of wakefulness). One week after training, the morning and evening groups showed equivalent performance for both trained and novel sentences. CONCLUSION Sleep-consolidated learning helps stabilize training gains for degraded speech input, which may hold clinical utility for optimizing rehabilitation recommendations.
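Noise vocoding degrades speech by discarding the fine structure in each frequency band while preserving the band's temporal envelope. The numpy-only sketch below shows the standard channel-vocoder recipe; the band edges, envelope smoothing window, and input signal are illustrative choices, not the study's stimulus parameters:

```python
import numpy as np

def band_filter(x, fs, lo, hi):
    """Ideal band-pass filter implemented in the frequency domain."""
    X = np.fft.rfft(x)
    f = np.fft.rfftfreq(len(x), 1 / fs)
    X[(f < lo) | (f >= hi)] = 0
    return np.fft.irfft(X, n=len(x))

def noise_vocode(x, fs, edges, env_win=200):
    """Analyze x into bands, extract each band's envelope (rectify + smooth),
    modulate band-limited noise with it, and sum the bands."""
    rng = np.random.default_rng(0)
    out = np.zeros(len(x))
    smoother = np.ones(env_win) / env_win
    for lo, hi in zip(edges[:-1], edges[1:]):
        env = np.convolve(np.abs(band_filter(x, fs, lo, hi)), smoother, mode='same')
        carrier = band_filter(rng.standard_normal(len(x)), fs, lo, hi)
        out += env * carrier
    return out

fs = 16000
t = np.arange(fs) / fs                      # 1 s of signal
x = np.sin(2 * np.pi * 300 * t) * (1 + 0.5 * np.sin(2 * np.pi * 4 * t))
y = noise_vocode(x, fs, edges=[100, 500, 1000, 2000, 4000])
```

Because only the coarse envelope survives, listeners must remap the acoustic signal onto prelexical representations, which is the restructuring the abstract argues may depend on sleep consolidation.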
Affiliation(s)
- Julia R. Drouin: Department of Speech, Language and Hearing Sciences, University of Connecticut, Storrs; Department of Communication Sciences and Disorders, California State University, Fullerton
- Victoria A. Zysk: Department of Speech, Language and Hearing Sciences, University of Connecticut, Storrs
- Emily B. Myers: Department of Speech, Language and Hearing Sciences, University of Connecticut, Storrs
- Rachel M. Theodore: Department of Speech, Language and Hearing Sciences, University of Connecticut, Storrs

36
Flaherty MM, Buss E, Libert K. Effects of Target and Masker Fundamental Frequency Contour Depth on School-Age Children's Speech Recognition in a Two-Talker Masker. J Speech Lang Hear Res 2023; 66:400-414. [PMID: 36580582 DOI: 10.1044/2022_jslhr-22-00207] [Indexed: 06/17/2023]
Abstract
PURPOSE Maturation of the ability to recognize target speech in the presence of a two-talker speech masker extends into early adolescence. This study evaluated whether children benefit from differences in fundamental frequency (f0) contour depth between the target and masker speech, a cue that has been shown to improve recognition in adults. METHOD Speech stimuli were recorded from talkers using three speaking styles, with f0 contour depths that were Flat, Normal, or Exaggerated. Targets were open-set, declarative sentences produced by a female talker, and maskers were two streams of concatenated sentences produced by a second female talker. Listeners were children (ages 5-17 years) and adults (ages 18-24 years) with normal hearing. Each listener was tested in one of the three masker styles paired with all three target styles. Speech recognition thresholds (SRTs) corresponding to 50% correct were estimated by fitting psychometric functions to adaptive track data. RESULTS For adults, performance did not differ significantly across conditions with matched speaking styles. A mismatch benefit was observed when combining Flat targets with the Exaggerated masker and Exaggerated targets with the Flat masker, and for both Flat and Exaggerated targets paired with the Normal masker. For children, there was a significant effect of age in all conditions. Flat targets in the Flat masker were associated with lower SRTs than the other two matched conditions, and a mismatch benefit was observed for young children only when the target f0 contour was less variable than the masker f0 contour. CONCLUSIONS Whereas child-directed speech often has exaggerated pitch contours, young children were better able to recognize speech with less variable f0. Age effects were observed in the benefit of mismatched speaking styles for some conditions, which could be related to differences in baseline SRTs rather than differences in segregation abilities.
Affiliation(s)
- Mary M Flaherty: Department of Speech and Hearing Science, University of Illinois at Urbana-Champaign, Champaign
- Emily Buss: Department of Otolaryngology/Head and Neck Surgery, The University of North Carolina at Chapel Hill
- Kelsey Libert: Department of Speech and Hearing Science, University of Illinois at Urbana-Champaign, Champaign

37
Irino T, Yokota K, Patterson RD. Improving Auditory Filter Estimation by Incorporating Absolute Threshold and a Level-dependent Internal Noise. Trends Hear 2023; 27:23312165231209750. [PMID: 37905400 PMCID: PMC10619342 DOI: 10.1177/23312165231209750] [Received: 12/24/2022] [Accepted: 10/07/2023] [Indexed: 11/02/2023]
Abstract
Auditory filter (AF) shape has traditionally been estimated with a combination of a notched-noise (NN) masking experiment and a power spectrum model (PSM) of masking. However, there are several challenges that remain in both the simultaneous and forward masking paradigms. We hypothesized that AF shape estimation would be improved if absolute threshold (AT) and a level-dependent internal noise were explicitly represented in the PSM. To document the interaction between NN threshold and AT in normal hearing (NH) listeners, a large set of NN thresholds was measured at four center frequencies (500, 1000, 2000, and 4000 Hz) with the emphasis on low-level maskers. The proposed PSM, consisting of the compressive gammachirp (cGC) filter and three nonfilter parameters, allowed AF estimation over a wide range of frequencies and levels with fewer coefficients and less error than previous models. The results also provided new insights into the nonfilter parameters. The detector signal-to-noise ratio (K) was found to be constant across signal frequencies, suggesting that no frequency dependence hypothesis is required in the postfiltering process. The ANSI standard "Hearing Level-0dB" function, i.e., AT of NH listeners, could be applied to the frequency distribution of the noise floor for the best AF estimation. The introduction of a level-dependent internal noise could mitigate the nonlinear effects that occur in the simultaneous NN masking paradigm. The new PSM improves the applicability of the model, particularly when the sound pressure level of the NN threshold is close to AT.
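A power spectrum model predicts the signal power at threshold as a constant K times the masker power passed by the auditory filter; the abstract's point is that an absolute-threshold/internal-noise term should appear explicitly in that sum. The sketch below uses the classic roex(p) filter as a simpler stand-in for the paper's compressive gammachirp, with invented parameter values throughout:

```python
import numpy as np

def roex(g, p=25.0):
    """Rounded-exponential filter weight; g = |f - fc| / fc."""
    return (1 + p * g) * np.exp(-p * g)

def predicted_threshold_db(fc, notch, N0, K=1.0, P0=1e-6, p=25.0):
    """Signal power at threshold = K * (masker power through the filter) + P0,
    where P0 is an internal-noise floor tied to absolute threshold."""
    f = np.linspace(1.0, 2 * fc, 4000)
    W = roex(np.abs(f - fc) / fc, p)
    in_notch = np.abs(f - fc) <= notch * fc          # spectral notch around fc
    density = np.where(in_notch, 0.0, N0)            # masker spectrum level
    passed = np.sum(W * density) * (f[1] - f[0])     # filter-weighted power
    return 10 * np.log10(K * passed + P0)

no_notch = predicted_threshold_db(fc=1000, notch=0.0, N0=1e-4)
wide_notch = predicted_threshold_db(fc=1000, notch=0.4, N0=1e-4)
# Threshold falls as the notch widens, but never below the internal-noise
# floor 10*log10(P0): that is where absolute threshold takes over.
```

Fitting p (and the nonfilter parameters K and P0) to thresholds measured across notch widths is what recovers the filter shape; the study's contribution is constraining the P0-like term with measured absolute thresholds.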
Affiliation(s)
- Toshio Irino: Faculty of Systems Engineering, Wakayama University, Japan
- Kenji Yokota: Faculty of Systems Engineering, Wakayama University, Japan
- Roy D. Patterson: Department of Physiology, Development and Neuroscience, University of Cambridge, UK

38
Conroy C, Buss E, Kidd G. Cues to reduce modulation informational masking. J Acoust Soc Am 2023; 153:274. [PMID: 36732267 PMCID: PMC9848649 DOI: 10.1121/10.0016867] [Received: 06/28/2022] [Revised: 12/23/2022] [Accepted: 12/28/2022] [Indexed: 06/18/2023]
Abstract
The detectability of target amplitude modulation (AM) can be reduced by masker AM in the same carrier-frequency region. It can be reduced even further, however, if the masker-AM rate is uncertain [Conroy and Kidd, J. Acoust. Soc. Am. 149, 3665-3673 (2021)]. This study examined the effectiveness of contextual cues in reducing this latter, uncertainty-related effect (modulation informational masking). Observers were tasked with detecting fixed-rate target sinusoidal amplitude modulation (SAM) in the presence of masker SAM applied simultaneously to the same broadband-noise carrier. A single-interval, two-alternative forced-choice detection procedure was used to measure sensitivity for the target SAM; masker-AM-rate uncertainty was created by randomly selecting the AM rate of the masker SAM on each trial. Relative to an uncued condition, a pretrial cue to the masker SAM significantly improved sensitivity for the target SAM; a cue to the target SAM, however, did not. The delay between the cue-interval offset and trial-interval onset did not affect the size of the masker-cue benefit, suggesting that adaptation of the masker SAM was not responsible. A simple model of within-AM-channel masking captured important trends in the psychophysical data, suggesting that reduced masker-AM-rate uncertainty may have played a relatively minor role in the masker-cue benefit.
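The stimuli in this paradigm are target and masker sinusoidal AM imposed on the same broadband-noise carrier, with the masker's AM rate drawn at random to create uncertainty. A minimal stimulus sketch follows; the rates, depths, and duration are invented for illustration and are not the study's values:

```python
import numpy as np

def sam_masking_stimulus(fs, dur, target_rate, masker_rate, m_t=0.5, m_m=0.5, seed=0):
    """Target and masker sinusoidal AM applied simultaneously to one
    broadband-noise carrier:
    y(t) = [1 + m_t*sin(2*pi*f_t*t) + m_m*sin(2*pi*f_m*t)] * noise(t)."""
    rng = np.random.default_rng(seed)
    t = np.arange(int(fs * dur)) / fs
    envelope = (1 + m_t * np.sin(2 * np.pi * target_rate * t)
                  + m_m * np.sin(2 * np.pi * masker_rate * t))
    return envelope * rng.standard_normal(len(t))

fs = 16000
masker_rate = np.random.default_rng(1).uniform(4, 64)   # rate uncertain per trial
trial = sam_masking_stimulus(fs, dur=0.5, target_rate=16, masker_rate=masker_rate)
```

A pretrial masker cue would simply present the masker-rate component alone before the trial, which is the manipulation the study found effective.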
Affiliation(s)
- Christopher Conroy: Department of Speech, Language and Hearing Sciences and Hearing Research Center, Boston University, Boston, Massachusetts 02215, USA
- Emily Buss: Department of Otolaryngology/Head and Neck Surgery, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, USA
- Gerald Kidd: Department of Speech, Language and Hearing Sciences and Hearing Research Center, Boston University, Boston, Massachusetts 02215, USA

39
Williams BT, Viswanathan N, Brouwer S. The effect of visual speech information on linguistic release from masking. J Acoust Soc Am 2023; 153:602. [PMID: 36732222 PMCID: PMC10162837 DOI: 10.1121/10.0016865] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/29/2022] [Revised: 11/30/2022] [Accepted: 12/23/2022] [Indexed: 05/07/2023]
Abstract
Listeners often experience challenges understanding a person (target) in the presence of competing talkers (maskers). This difficulty reduces with the availability of visual speech information (VSI; lip movements, degree of mouth opening) and during linguistic release from masking (LRM; masking decreases with dissimilar language maskers). We investigate whether and how LRM occurs with VSI. We presented English targets with either Dutch or English maskers in audio-only and audiovisual conditions to 62 American English participants. The signal-to-noise ratio (SNR) was easy at 0 audio-only and -8 dB audiovisual in Experiment 1 and hard at -8 and -16 dB in Experiment 2 to assess the effects of modality on LRM across the same and different SNRs. We found LRM in the audiovisual condition for all SNRs and in audio-only for -8 dB, demonstrating reliable LRM for audiovisual conditions. Results also revealed that LRM is modulated by modality with larger LRM in audio-only indicating that introducing VSI weakens LRM. Furthermore, participants showed higher performance for Dutch maskers compared to English maskers with and without VSI. This establishes that listeners use both VSI and dissimilar language maskers to overcome masking. Our study shows that LRM persists in the audiovisual modality and its strength depends on the modality.
Collapse
Affiliation(s)
- Brittany T Williams
- Department of Communication Sciences and Disorders, The Pennsylvania State University, State College, Pennsylvania 16801, USA
| | - Navin Viswanathan
- Department of Communication Sciences and Disorders, The Pennsylvania State University, State College, Pennsylvania 16801, USA
| | - Susanne Brouwer
- Department of Modern Languages and Cultures, Radboud University, Nijmegen, The Netherlands
| |
Collapse
|
40
|
Veyrié A, Noreña A, Sarrazin JC, Pezard L. Investigating the influence of masker and target properties on the dynamics of perceptual awareness under informational masking. PLoS One 2023; 18:e0282885. [PMID: 36928693 PMCID: PMC10019711 DOI: 10.1371/journal.pone.0282885] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/21/2022] [Accepted: 02/27/2023] [Indexed: 03/18/2023] Open
Abstract
Informational masking has been investigated using the detection of an auditory target embedded in a random multi-tone masker. The build-up of the target percept is influenced by the masker and target properties. Most studies dealing with discrimination performance neglect the dynamics of perceptual awareness. This study aims at investigating the dynamics of perceptual awareness using multi-level survival models in an informational masking paradigm by manipulating masker uncertainty, masker-target similarity and target repetition rate. Consistent with previous studies, it shows that high target repetition rates, low masker-target similarity and low masker uncertainty facilitate target detection. In the context of evidence accumulation models, these results can be interpreted by changes in the accumulation parameters. The probabilistic description of perceptual awareness provides a benchmark for the choice of target and masker parameters in order to examine the underlying cognitive and neural dynamics of perceptual awareness.
Collapse
Affiliation(s)
- Alexandre Veyrié
- Aix-Marseille Université, LNC, CNRS UMR 7291, Marseille, France
- ONERA, The French Aerospace Lab, Salon de Provence, France
| | - Arnaud Noreña
- Aix-Marseille Université, LNC, CNRS UMR 7291, Marseille, France
| | | | - Laurent Pezard
- Aix-Marseille Université, LNC, CNRS UMR 7291, Marseille, France
- * E-mail:
| |
Collapse
|
41
|
Sheffield SW, Wheeler HJ, Brungart DS, Bernstein JGW. The Effect of Sound Localization on Auditory-Only and Audiovisual Speech Recognition in a Simulated Multitalker Environment. Trends Hear 2023; 27:23312165231186040. [PMID: 37415497 PMCID: PMC10331332 DOI: 10.1177/23312165231186040] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/27/2022] [Revised: 06/13/2023] [Accepted: 06/17/2023] [Indexed: 07/08/2023] Open
Abstract
Information regarding sound-source spatial location provides several speech-perception benefits, including auditory spatial cues for perceptual talker separation and localization cues to face the talker to obtain visual speech information. These benefits have typically been examined separately. A real-time processing algorithm for sound-localization degradation (LocDeg) was used to investigate how spatial-hearing benefits interact in a multitalker environment. Normal-hearing adults performed auditory-only and auditory-visual sentence recognition with target speech and maskers presented from loudspeakers at -90°, -36°, 36°, or 90° azimuths. For auditory-visual conditions, one target and three masking talker videos (always spatially separated) were rendered virtually in rectangular windows at these locations on a head-mounted display. Auditory-only conditions presented blank windows at these locations. Auditory target speech (always spatially aligned with the target video) was presented in co-located speech-shaped noise (experiment 1) or with three co-located or spatially separated auditory interfering talkers corresponding to the masker videos (experiment 2). In the co-located conditions, the LocDeg algorithm did not affect auditory-only performance but reduced target orientation accuracy, reducing auditory-visual benefit. In the multitalker environment, two spatial-hearing benefits were observed: perceptually separating competing speech based on auditory spatial differences and orienting to the target talker to obtain visual speech cues. These two benefits were additive, and both were diminished by the LocDeg algorithm. Although visual cues always improved performance when the target was accurately localized, there was no strong evidence that they provided additional assistance in perceptually separating co-located competing speech. These results highlight the importance of sound localization in everyday communication.
Collapse
Affiliation(s)
- Sterling W. Sheffield
- Department of Speech, Language, and Hearing Sciences, University of Florida, Gainesville, FL, USA
| | - Harley J. Wheeler
- Department of Speech-Language-Hearing Sciences, University of Minnesota, Minneapolis, MN, USA
| | - Douglas S. Brungart
- National Military Audiology and Speech Pathology Center, Walter Reed National Military Medical Center, Bethesda, MD, USA
| | - Joshua G. W. Bernstein
- National Military Audiology and Speech Pathology Center, Walter Reed National Military Medical Center, Bethesda, MD, USA
| |
Collapse
|
42
|
Svec A, Wojtczak M, Nelson PB. Amplitude-modulation forward masking for listeners with and without hearing loss. JASA Express Lett 2022; 2:124401. [PMID: 36586961 DOI: 10.1121/10.0015315] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/17/2023]
Abstract
Amplitude-modulation (AM) forward masking was measured for listeners with normal hearing and sensorineural hearing loss at 4000 and 1000 Hz, using continuous and noncontinuous masker and signal carriers, respectively. A low-fluctuation noise (LFN) carrier was used for the "continuous carrier" conditions. An unmodulated low-fluctuation noise (U-LFN), an unmodulated Gaussian noise (U-GN), and an amplitude-modulation low-fluctuation noise (AM-LFN) were maskers for the "noncontinuous carrier" conditions. As predicted, U-GN yielded more masking than U-LFN and similar masking to AM-LFN, suggesting that U-GN resulted in AM forward masking. Contrary to predictions, no differences in masked thresholds were observed between listener groups.
Collapse
Affiliation(s)
- Adam Svec
- Department of Audiology, San José State University, San José, California 95112, USA
| | - Magdalena Wojtczak
- Department of Psychology, University of Minnesota-Twin Cities, Minneapolis, Minnesota 55455, USA
| | - Peggy B Nelson
- Department of Speech-Language-Hearing Sciences, Center for Applied and Translational Sensory Science, University of Minnesota-Twin Cities, Minneapolis, Minnesota 55455, USA , ,
| |
Collapse
|
43
|
Zenke K, Rosen S. Spatial release of masking in children and adults in non-individualized virtual environments. J Acoust Soc Am 2022; 152:3384. [PMID: 36586845 DOI: 10.1121/10.0016360] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 03/18/2022] [Accepted: 11/14/2022] [Indexed: 06/17/2023]
Abstract
The spatial release of masking (SRM) is often measured in virtual auditory environments created from head-related transfer functions (HRTFs) of a standardized adult head. Adults and children, however, differ in head dimensions and mismatched HRTFs are known to affect some aspects of binaural hearing. So far, there has been little research on HRTFs in children and it is unclear whether a large mismatch of spatial cues can degrade speech perception in complex environments. In two studies, the effect of non-individualized virtual environments on SRM accuracy in adults and children was examined. The SRMs were measured in virtual environments created from individual and non-individualized HRTFs and the equivalent real anechoic environment. Speech reception thresholds (SRTs) were measured for frontal target sentences and symmetrical speech maskers at 0° or ±90° azimuth. No significant difference between environments was observed for adults. In 7 to 12-year-old children, SRTs and SRMs improved with age, with SRMs approaching adult levels. SRTs differed slightly between environments and were significantly worse in a virtual environment based on HRTFs from a spherical head. Adult HRTFs seem sufficient to accurately measure SRTs in children even in complex listening conditions.
Collapse
Affiliation(s)
- Katharina Zenke
- Speech, Hearing and Phonetic Sciences, University College London, 2 Wakefield Street, London, WC1N 1PF, United Kingdom
| | - Stuart Rosen
- Speech, Hearing and Phonetic Sciences, University College London, 2 Wakefield Street, London, WC1N 1PF, United Kingdom
| |
Collapse
|
44
|
Graetzer S, Hopkins C. Comparison of ideal mask-based speech enhancement algorithms for speech mixed with white noise at low mixture signal-to-noise ratios. J Acoust Soc Am 2022; 152:3458. [PMID: 36586840 DOI: 10.1121/10.0016494] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/16/2022] [Accepted: 11/21/2022] [Indexed: 06/17/2023]
Abstract
The literature shows that the intelligibility of noisy speech can be improved by applying an ideal binary or soft gain mask in the time-frequency domain for signal-to-noise ratios (SNRs) between -10 and +10 dB. In this study, two mask-based algorithms are compared when applied to speech mixed with white Gaussian noise (WGN) at lower SNRs, that is, SNRs from -29 to -5 dB. These comprise an Ideal Binary Mask (IBM) with a Local Criterion (LC) set to 0 dB and an Ideal Ratio Mask (IRM). The performance of three intrusive Short-Time Objective Intelligibility (STOI) variants-STOI, STOI+, and Extended Short-Time Objective Intelligibility (ESTOI)-is compared with that of other monaural intelligibility metrics that can be used before and after mask-based processing. The results show that IRMs can be used to obtain near maximal speech intelligibility (>90% for sentence material) even at very low mixture SNRs, while IBMs with LC = 0 provide limited intelligibility gains for SNR < -14 dB. It is also shown that, unlike STOI, STOI+ and ESTOI are suitable metrics for speech mixed with WGN at low SNRs and processed by IBMs with LC = 0 even when speech is high-pass filtered to flatten the spectral tilt before masking.
Collapse
Affiliation(s)
- Simone Graetzer
- Acoustics Research Unit, School of Architecture, University of Liverpool, Liverpool, L69 7ZN, United Kingdom
| | - Carl Hopkins
- Acoustics Research Unit, School of Architecture, University of Liverpool, Liverpool, L69 7ZN, United Kingdom
| |
Collapse
|
45
|
Edraki A, Chan WY, Jensen J, Fogerty D. Spectro-temporal modulation glimpsing for speech intelligibility prediction. Hear Res 2022; 426:108620. [PMID: 36175300 PMCID: PMC10125146 DOI: 10.1016/j.heares.2022.108620] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 11/23/2021] [Revised: 09/14/2022] [Accepted: 09/20/2022] [Indexed: 11/22/2022]
Abstract
We compare two alternative speech intelligibility prediction algorithms: time-frequency glimpse proportion (GP) and spectro-temporal glimpsing index (STGI). Both algorithms hypothesize that listeners understand speech in challenging acoustic environments by "glimpsing" partially available information from degraded speech. GP defines glimpses as those time-frequency regions whose local signal-to-noise ratio is above a certain threshold and estimates intelligibility as the proportion of the time-frequency regions glimpsed. STGI, on the other hand, applies glimpsing to the spectro-temporal modulation (STM) domain and uses a similarity measure based on the normalized cross-correlation between the STM envelopes of the clean and degraded speech signals to estimate intelligibility as the proportion of the STM channels glimpsed. Our experimental results demonstrate that STGI extends the notion of glimpsing proportion to a wider range of distortions, including non-linear signal processing, and outperforms GP for the additive uncorrelated noise datasets we tested. Furthermore, the results show that spectro-temporal modulation analysis enables STGI to account for the effects of masker type on speech intelligibility, leading to superior performance over GP in modulated noise datasets.
Collapse
Affiliation(s)
- Amin Edraki
- Department of Electrical and Computer Engineering, Queen's University, Kingston, ON K7L 3N6, Canada.
| | - Wai-Yip Chan
- Department of Electrical and Computer Engineering, Queen's University, Kingston, ON K7L 3N6, Canada
| | - Jesper Jensen
- Department of Electronic Systems, Aalborg University, Aalborg 9220, Denmark; Demant A/S, Smørum 2765, Denmark
| | - Daniel Fogerty
- Department of Speech and Hearing Science, University of Illinois Urbana-Champaign, Champaign, IL 61820, USA
| |
Collapse
|
46
|
Zaar J, Carney LH. Predicting speech intelligibility in hearing-impaired listeners using a physiologically inspired auditory model. Hear Res 2022; 426:108553. [PMID: 35750575 PMCID: PMC10560534 DOI: 10.1016/j.heares.2022.108553] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 01/14/2022] [Revised: 05/17/2022] [Accepted: 06/08/2022] [Indexed: 11/18/2022]
Abstract
This study presents a major update and full evaluation of a speech intelligibility (SI) prediction model previously introduced by Scheidiger, Carney, Dau, and Zaar [(2018), Acta Acust. United Ac. 104, 914-917]. The model predicts SI in speech-in-noise conditions via comparison of the noisy speech and the noise-alone reference. The two signals are processed through a physiologically inspired nonlinear model of the auditory periphery, for a range of characteristic frequencies (CFs), followed by a modulation analysis in the range of the fundamental frequency of speech. The decision metric of the model is the mean of a series of short-term, across-CF correlations between population responses to noisy speech and noise alone, with a sensitivity-limitation process imposed. The decision metric is assumed to be inversely related to SI and is converted to a percent-correct score using a single data-based fitting function. The model performance was evaluated in conditions of stationary, fluctuating, and speech-like interferers using sentence-based speech-reception thresholds (SRTs) previously obtained in 5 normal-hearing (NH) and 13 hearing-impaired (HI) listeners. For the NH listener group, the model accurately predicted SRTs across the different acoustic conditions (apart from a slight overestimation of the masking release observed for fluctuating maskers), as well as plausible effects in response to changes in presentation level. For HI listeners, the model was adjusted to account for the individual audiograms using standard assumptions concerning the amount of HI attributed to inner-hair-cell (IHC) and outer-hair-cell (OHC) impairment. HI model results accounted remarkably well for elevated individual SRTs and reduced masking release. Furthermore, plausible predictions of worsened SI were obtained when the relative contribution of IHC impairment to HI was increased. 
Overall, the present model provides a useful tool to accurately predict speech-in-noise outcomes in NH and HI listeners, and may yield important insights into auditory processes that are crucial for speech understanding.
Collapse
Affiliation(s)
- Johannes Zaar
- Eriksholm Research Centre, DK-3070 Snekkersten, Denmark; Hearing Systems Section, Department of Health Technology, Technical University of Denmark, DK-2800 Kgs. Lyngby, Denmark.
| | - Laurel H Carney
- Departments of Biomedical Engineering and Neuroscience, University of Rochester, Rochester, NY, 14642, USA
| |
Collapse
|
47
|
Abstract
This study investigated the role of harmonic cancellation in the intelligibility of speech in "cocktail party" situations. While there is evidence that harmonic cancellation plays a role in the segregation of simple harmonic sounds based on fundamental frequency (F0), its utility for mixtures of speech containing non-stationary F0s and unvoiced segments is unclear. Here we focused on the energetic masking of speech targets caused by competing speech maskers. Speech reception thresholds were measured using seven maskers: speech-shaped noise, monotonized and intonated harmonic complexes, monotonized speech, noise-vocoded speech, reversed speech and natural speech. These maskers enabled an estimate of how the masking potential of speech is influenced by harmonic structure, amplitude modulation and variations in F0 over time. Measured speech reception thresholds were compared to the predictions of two computational models, with and without a harmonic cancellation component. Overall, the results suggest a minor role of harmonic cancellation in reducing energetic masking in speech mixtures.
Collapse
Affiliation(s)
- Luna Prud'homme
- Univ Lyon, ENTPE, Ecole Centrale de Lyon, CNRS, LTDS, UMR5513, 69518 Vaulx-en-Velin, France
| | - Mathieu Lavandier
- Univ Lyon, ENTPE, Ecole Centrale de Lyon, CNRS, LTDS, UMR5513, 69518 Vaulx-en-Velin, France.
| | - Virginia Best
- Department of Speech, Language and Hearing Sciences, Boston University, 635 Commonwealth Ave, Boston, MA, 02215, USA
| |
Collapse
|
48
|
Ozmeral EJ, Higgins NC. Defining functional spatial boundaries using a spatial release from masking task. JASA Express Lett 2022; 2:124402. [PMID: 36586966 PMCID: PMC9720634 DOI: 10.1121/10.0015356] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 06/28/2022] [Accepted: 11/11/2022] [Indexed: 06/17/2023]
Abstract
The classic spatial release from masking (SRM) task measures speech recognition thresholds for discrete separation angles between a target and masker. Alternatively, this study used a modified SRM task that adaptively measured the spatial-separation angle needed between a continuous male target stream (speech with digits) and two female masker streams to achieve a specific SRM. On average, 20 young normal-hearing listeners needed less spatial separation for 6 dB release than 9 dB release, and the presence of background babble reduced across-listener variability on the paradigm. Future work is needed to better understand the psychometric properties of this adaptive procedure.
Collapse
Affiliation(s)
- Erol J Ozmeral
- Department of Communication Sciences and Disorders, University of South Florida, Tampa, Florida 33620, USA ,
| | - Nathan C Higgins
- Department of Communication Sciences and Disorders, University of South Florida, Tampa, Florida 33620, USA ,
| |
Collapse
|
49
|
Prud'homme L, Lavandier M, Best V. A dynamic binaural harmonic-cancellation model to predict speech intelligibility against a harmonic masker varying in intonation, temporal envelope, and location. Hear Res 2022; 426:108535. [PMID: 35654633 PMCID: PMC9684346 DOI: 10.1016/j.heares.2022.108535] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 12/10/2021] [Revised: 04/26/2022] [Accepted: 05/23/2022] [Indexed: 11/28/2022]
Abstract
The aim of this study was to extend the harmonic-cancellation model proposed by Prud'homme et al. [J. Acoust. Soc. Am. 148 (2020) 3246--3254] to predict speech intelligibility against a harmonic masker, so that it takes into account binaural hearing, amplitude modulations in the masker and variations in masker fundamental frequency (F0) over time. This was done by segmenting the masker signal into time frames and combining the previous long-term harmonic-cancellation model with the binaural model proposed by Vicente and Lavandier [Hear. Res. 390 (2020) 107937]. The new model was tested on the data from two experiments involving harmonic complex maskers that varied in spatial location, temporal envelope and F0 contour. The interactions between the associated effects were accounted for in the model by varying the time frame duration and excluding the binaural unmasking computation when harmonic cancellation is active. Across both experiments, the correlation between data and model predictions was over 0.96, and the mean and largest absolute prediction errors were lower than 0.6 and 1.5 dB, respectively.
Collapse
Affiliation(s)
- Luna Prud'homme
- ENTPE, Ecole Centrale de Lyon, CNRS, LTDS, UMR5513, University Lyon, Vaulx-en-Velin 69518, France
| | - Mathieu Lavandier
- ENTPE, Ecole Centrale de Lyon, CNRS, LTDS, UMR5513, University Lyon, Vaulx-en-Velin 69518, France.
| | - Virginia Best
- Department of Speech, Language and Hearing Sciences, Boston University, 635 Commonwealth Ave, Boston, MA 02215, USA
| |
Collapse
|
50
|
Stenbäck V, Marsja E, Hällgren M, Lyxell B, Larsby B. Informational Masking and Listening Effort in Speech Recognition in Noise: The Role of Working Memory Capacity and Inhibitory Control in Older Adults With and Without Hearing Impairment. J Speech Lang Hear Res 2022; 65:4417-4428. [PMID: 36283680 DOI: 10.1044/2022_jslhr-21-00674] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/16/2023]
Abstract
PURPOSE The study aimed to assess the relationship between (a) speech recognition in noise, mask type, working memory capacity (WMC), and inhibitory control and (b) self-rated listening effort, speech material, and mask type, in older adults with and without hearing impairment. It was of special interest to assess the relationship between WMC, inhibitory control, and speech recognition in noise when informational maskers masked target speech. METHOD A mixed design was used. A group (N = 24) of older (Mage = 69.7 years) individuals with hearing impairment and a group of age normal-hearing adults (Mage = 59.3 years, SD = 6.5) participated in the study. The participants were presented with auditory tests in a sound-attenuated room and with cognitive tests in a quiet office. The participants were asked to rate listening effort after being presented with energetic and informational background maskers in two different speech materials used in this study (i.e., Hearing In Noise Test and Hagerman test). Linear mixed-effects models were set up to assess the effect of the two different speech materials, energetic and informational maskers, hearing ability, WMC, inhibitory control, and self-rated listening effort. RESULTS Results showed that WMC and inhibitory control were of importance for speech recognition in noise, even when controlling for pure-tone average 4 hearing thresholds and age, when the maskers were informational. Concerning listening effort, on the other hand, the results suggest that hearing ability, but not cognitive abilities, is important for self-rated listening effort in speech recognition in noise. CONCLUSIONS Speech-in-noise recognition is more dependent on WMC for older adults in informational maskers than in energetic maskers. Hearing ability is a stronger predictor than cognition for self-rated listening effort. SUPPLEMENTAL MATERIAL https://doi.org/10.23641/asha.21357648.
Collapse
Affiliation(s)
- Victoria Stenbäck
- Disability Research Division, Department of Behavioural Sciences and Learning, Linköping University, Sweden
- Division of Education, Teaching and Learning, Department of Behavioural Sciences and Learning, Linköping University, Sweden
| | - Erik Marsja
- Disability Research Division, Department of Behavioural Sciences and Learning, Linköping University, Sweden
| | - Mathias Hällgren
- Department of Otorhinolaryngology in Östergötland and Department of Biomedical and Clinical Sciences, Linköping University, Sweden
| | - Björn Lyxell
- Disability Research Division, Department of Behavioural Sciences and Learning, Linköping University, Sweden
- Department of Special Needs Education, University of Oslo, Norway
| | - Birgitta Larsby
- Department of Otorhinolaryngology in Östergötland and Department of Biomedical and Clinical Sciences, Linköping University, Sweden
| |
Collapse
|