1. CompHEAR: A Customizable and Scalable Web-Enabled Auditory Performance Evaluation Platform for Cochlear Implant Sound Processing Research. bioRxiv 2023:2023.12.22.573126. PMID: 38187767; PMCID: PMC10769353; DOI: 10.1101/2023.12.22.573126.
Abstract
Objective: Cochlear implants (CIs) are auditory prostheses for individuals with severe to profound hearing loss, offering substantial but incomplete restoration of hearing by stimulating the auditory nerve with electrodes. Progress in CI performance and innovation has been constrained by the inability to test multiple sound processing strategies rapidly. The research interfaces provided by major CI manufacturers offer limited support for a wide range of auditory experiments because of poor portability, programming difficulties, and the lack of direct comparison between sound processing algorithms. To address these limitations, we present the CompHEAR research platform, designed specifically for the Cochlear Implant Hackathon, which enables researchers to conduct diverse auditory experiments on a large scale. Study Design: Quasi-experimental. Setting: Virtual. Methods: CompHEAR is an open-source, user-friendly platform that offers flexibility and ease of customization, allowing researchers to set up a broad range of auditory experiments. CompHEAR employs a vocoder to simulate novel sound coding strategies for CIs. It distributes listening tasks evenly among participants and delivers real-time metrics for evaluation. The software architecture underpins the platform's flexibility in experimental design and its wide range of applications in sound processing research. Results: Performance testing ensured that the CompHEAR platform could support at least 10,000 concurrent users. The platform was successfully deployed during the COVID-19 pandemic and enabled global collaboration for the CI Hackathon (www.cihackathon.com). Conclusion: The CompHEAR platform is a useful research tool that permits comparison of diverse signal processing strategies across a variety of auditory tasks with crowdsourced judging. Its versatility, scalability, and ease of use can support further research aimed at improving cochlear implant performance and patient outcomes.
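As a rough illustration of the kind of vocoder-based CI simulation described above, the sketch below implements a generic noise-excited envelope vocoder in Python. It is not the CompHEAR implementation; the channel count, frequency range, and filter settings are placeholder assumptions.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def noise_vocoder(x, fs, n_channels=8, f_lo=100.0, f_hi=8000.0):
    """Minimal noise-excited envelope vocoder (illustrative sketch only)."""
    x = np.asarray(x, float)
    edges = np.geomspace(f_lo, f_hi, n_channels + 1)   # log-spaced band edges
    rng = np.random.default_rng(0)
    out = np.zeros_like(x)
    for lo, hi in zip(edges[:-1], edges[1:]):
        sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
        band = sosfiltfilt(sos, x)                      # analysis band
        env = np.abs(hilbert(band))                     # temporal envelope
        carrier = sosfiltfilt(sos, rng.standard_normal(len(x)))
        carrier /= np.sqrt(np.mean(carrier ** 2)) + 1e-12
        out += env * carrier                            # envelope-modulated noise band
    return out / (np.max(np.abs(out)) + 1e-12)
```

For example, noise_vocoder(signal, 44100, n_channels=16) would produce a 16-channel simulation of the signal for presentation to normal-hearing listeners.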
2. Benefits of Harmonicity for Hearing in Noise Are Limited to Detection and Pitch-Related Discrimination Tasks. Biology 2023; 12:1522. PMID: 38132348; PMCID: PMC10740545; DOI: 10.3390/biology12121522.
Abstract
Harmonic complex tones are easier to detect in noise than inharmonic complex tones, providing a potential perceptual advantage in complex auditory environments. Here, we explored whether the harmonic advantage extends to other auditory tasks that are important for navigating a noisy auditory environment, such as amplitude- and frequency-modulation detection. Sixty young normal-hearing listeners were tested, divided into two equal groups with and without musical training. Consistent with earlier studies, harmonic tones were easier to detect in noise than inharmonic tones, with a signal-to-noise ratio (SNR) advantage of about 2.5 dB, and the pitch discrimination of the harmonic tones was more accurate than that of inharmonic tones, even after differences in audibility were accounted for. In contrast, neither amplitude- nor frequency-modulation detection was superior with harmonic tones once differences in audibility were accounted for. Musical training was associated with better performance only in pitch-discrimination and frequency-modulation-detection tasks. The results confirm a detection and pitch-perception advantage for harmonic tones but reveal that the harmonic benefits do not extend to suprathreshold tasks that do not rely on extracting the fundamental frequency. A general theory is proposed that may account for the effects of both noise and memory on pitch-discrimination differences between harmonic and inharmonic tones.
3. Auditory enhancement in younger and older listeners with normal and impaired hearing. J Acoust Soc Am 2023; 154:3821-3832. PMID: 38109406; PMCID: PMC10730236; DOI: 10.1121/10.0023937.
Abstract
Auditory enhancement is a spectral contrast aftereffect that can facilitate the detection of novel events in an ongoing background. A single-interval paradigm combined with roved frequency content between trials can yield as much as 20 dB enhancement in young normal-hearing listeners. This study compared such enhancement in 15 listeners with sensorineural hearing loss with that in 15 age-matched adults and 15 young adults with normal audiograms. All groups were presented with stimulus levels of 70 dB sound pressure level (SPL) per component. The two groups with normal hearing were also tested at 45 dB SPL per component. The hearing-impaired listeners showed very little enhancement overall. However, when tested at the same high (70-dB) level, both young and age-matched normal-hearing listeners also showed substantially reduced enhancement, relative to that found at 45 dB SPL. Some differences in enhancement emerged between young and older normal-hearing listeners at the lower sound level. The results suggest that enhancement is highly level-dependent and may also decrease somewhat with age or slight hearing loss. Implications for hearing-impaired listeners may include a poorer ability to adapt to real-world acoustic variability, due in part to the higher levels at which sound must be presented to be audible.
4. Dissociating sensitivity from bias in the Mini Profile of Music Perception Skills. JASA Express Lett 2023; 3:094401. PMID: 37747320; PMCID: PMC10523237; DOI: 10.1121/10.0021096.
Abstract
The Mini Profile of Music Perception Skills (Mini-PROMS) is a rapid performance-based measure of musical perceptual competence. The present study was designed to determine the optimal way to evaluate and score the Mini-PROMS results. Two traditional methods for scoring the Mini-PROMS, the weighted composite score and the parametric sensitivity index (d'), were compared with nonparametric alternatives, also derived from signal detection theory. Performance estimates using the traditional methods were found to depend on response bias (e.g., confidence), making them suboptimal. The simple nonparametric alternatives provided unbiased and reliable performance estimates from the Mini-PROMS and are therefore recommended instead.
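For readers unfamiliar with the scoring issue above, the snippet below contrasts the parametric index d' with one common nonparametric index (A', following Pollack and Norman). It is a generic signal-detection-theory illustration, not necessarily the specific nonparametric measure recommended by the study.

```python
from scipy.stats import norm

def d_prime(hit_rate, fa_rate):
    """Parametric sensitivity index: z(hits) - z(false alarms)."""
    return norm.ppf(hit_rate) - norm.ppf(fa_rate)

def a_prime(hit_rate, fa_rate):
    """A': a common nonparametric sensitivity (area) index."""
    h, f = hit_rate, fa_rate
    if h == f:
        return 0.5
    if h > f:
        return 0.5 + ((h - f) * (1 + h - f)) / (4 * h * (1 - f))
    return 0.5 - ((f - h) * (1 + f - h)) / (4 * f * (1 - h))

# Example: same hit rate, different false-alarm rates (different bias)
print(d_prime(0.80, 0.30), a_prime(0.80, 0.30))
```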
5. Sensitivity to Frequency Modulation is Limited Centrally. J Neurosci 2023; 43:3687-3695. PMID: 37028932; PMCID: PMC10198444; DOI: 10.1523/jneurosci.0995-22.2023.
Abstract
Modulations in both amplitude and frequency are prevalent in natural sounds and are critical in defining their properties. Humans are exquisitely sensitive to frequency modulation (FM) at the slow modulation rates and low carrier frequencies that are common in speech and music. This enhanced sensitivity to slow-rate and low-frequency FM has been widely believed to reflect precise, stimulus-driven phase locking to temporal fine structure in the auditory nerve. At faster modulation rates and/or higher carrier frequencies, FM is instead thought to be coded by coarser frequency-to-place mapping, where FM is converted to amplitude modulation (AM) via cochlear filtering. Here, we show that patterns of human FM perception that have classically been explained by limits in peripheral temporal coding are instead better accounted for by constraints in the central processing of fundamental frequency (F0) or pitch. We measured FM detection in male and female humans using harmonic complex tones with an F0 within the range of musical pitch but with resolved harmonic components that were all above the putative limits of temporal phase locking (>8 kHz). Listeners were more sensitive to slow than fast FM rates, even though all components were beyond the limits of phase locking. In contrast, AM sensitivity remained better at faster than slower rates, regardless of carrier frequency. These findings demonstrate that classic trends in human FM sensitivity, previously attributed to auditory nerve phase locking, may instead reflect the constraints of a unitary code that operates at a more central level of processing. SIGNIFICANCE STATEMENT: Natural sounds involve dynamic frequency and amplitude fluctuations. Humans are particularly sensitive to frequency modulation (FM) at slow rates and low carrier frequencies, which are prevalent in speech and music. This sensitivity has been ascribed to encoding of stimulus temporal fine structure (TFS) via phase-locked auditory nerve activity. To test this long-standing theory, we measured FM sensitivity using complex tones with a low F0 but only high-frequency harmonics beyond the limits of phase locking. Dissociating the F0 from TFS showed that FM sensitivity is limited not by peripheral encoding of TFS but rather by central processing of F0, or pitch. The results suggest a unitary code for FM detection limited by more central constraints.
6. Consonance Perception in Congenital Amusia: Behavioral and Brain Responses to Harmonicity and Beating Cues. J Cogn Neurosci 2023; 35:765-780. PMID: 36802367; PMCID: PMC10117172; DOI: 10.1162/jocn_a_01973.
Abstract
Congenital amusia is a neurodevelopmental disorder characterized by difficulties in the perception and production of music, including the perception of consonance and dissonance, or the judgment of certain combinations of pitches as more pleasant than others. Two perceptual cues for dissonance are inharmonicity (the lack of a common fundamental frequency between components) and beating (amplitude fluctuations produced by close, interacting frequency components). Amusic individuals have previously been reported to be insensitive to inharmonicity, but to exhibit normal sensitivity to beats. In the present study, we measured adaptive discrimination thresholds in amusic participants and found elevated thresholds for both cues. We recorded EEG and measured the mismatch negativity (MMN) in evoked potentials to consonance and dissonance deviants in an oddball paradigm. The amplitude of the MMN response was similar overall for amusic and control participants; however, in controls, there was a tendency toward larger MMNs for inharmonicity than for beating cues, whereas the opposite tendency was observed for the amusic participants. These findings suggest that initial encoding of consonance cues may be intact in amusia despite impaired behavioral performance, but that the relative weight of nonspectral (beating) cues may be increased for amusic individuals.
7. Questions and controversies surrounding the perception and neural coding of pitch. Front Neurosci 2023; 16:1074752. PMID: 36699531; PMCID: PMC9868815; DOI: 10.3389/fnins.2022.1074752.
Abstract
Pitch is a fundamental aspect of auditory perception that plays an important role in our ability to understand speech, appreciate music, and attend to one sound while ignoring others. The questions surrounding how pitch is represented in the auditory system, and how our percept relates to the underlying acoustic waveform, have been a topic of inquiry and debate for well over a century. New findings and technological innovations have challenged some long-standing assumptions and raised new questions. This article reviews some recent developments in the study of pitch coding and perception and focuses on how pitch information is extracted from peripheral representations based on frequency-to-place mapping (tonotopy), stimulus-driven auditory-nerve spike timing (phase locking), or a combination of both. Although a definitive resolution has proved elusive, the answers to these questions have potentially important implications for mitigating the effects of hearing loss via devices such as cochlear implants.
8. Methodological considerations when measuring and analyzing auditory steady-state responses with multi-channel EEG. Curr Res Neurobiol 2022; 3:100061. PMID: 36386860; PMCID: PMC9647176; DOI: 10.1016/j.crneur.2022.100061.
Abstract
The auditory steady-state response (ASSR) has been traditionally recorded with few electrodes and is often measured as the voltage difference between mastoid and vertex electrodes (vertical montage). As high-density EEG recording systems have gained popularity, multi-channel analysis methods have been developed to integrate the ASSR signal across channels. The phases of ASSR across electrodes can be affected by factors including the stimulus modulation rate and re-referencing strategy, which will in turn affect the estimated ASSR strength. To explore the relationship between the classical vertical-montage ASSR and whole-scalp ASSR, we applied these two techniques to the same data to estimate the strength of ASSRs evoked by tones with sinusoidal amplitude modulation rates of around 40, 100, and 200 Hz. The whole-scalp methods evaluated in our study, with either linked-mastoid or common-average reference, included ones that assume equal phase across all channels, as well as ones that allow for different phase relationships. The performance of simple averaging was compared to that of more complex methods involving principal component analysis. Overall, the root-mean-square of the phase locking values (PLVs) across all channels provided the most efficient method to detect ASSR across the range of modulation rates tested here.
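A minimal sketch of the channel-combination step described above (phase-locking value per channel, combined as the root-mean-square across channels) is given below. The epoch layout and variable names are assumptions for illustration, not the published analysis code.

```python
import numpy as np

def plv_rms(epochs, fs, f_mod):
    """epochs: array of shape (n_trials, n_channels, n_samples) of EEG data."""
    n_trials, n_channels, n_samples = epochs.shape
    spec = np.fft.rfft(epochs, axis=-1)
    freqs = np.fft.rfftfreq(n_samples, d=1.0 / fs)
    k = np.argmin(np.abs(freqs - f_mod))                # FFT bin nearest the ASSR rate
    phase = np.angle(spec[:, :, k])                     # per-trial phase in each channel
    plv = np.abs(np.exp(1j * phase).mean(axis=0))       # phase-locking value per channel
    return np.sqrt(np.mean(plv ** 2))                   # RMS across channels
```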
9. Role of perceptual integration in pitch discrimination at high frequencies. JASA Express Lett 2022; 2:084402. PMID: 37311192; PMCID: PMC10264831; DOI: 10.1121/10.0013429.
Abstract
At very high frequencies, fundamental-frequency difference limens (F0DLs) for five-component harmonic complex tones can be better than predicted by optimal integration of information, assuming performance is limited by noise at the peripheral level, but are in line with predictions based on more central sources of noise. This study investigates whether there is a minimum number of harmonic components needed for such super-optimal integration effects and if harmonic range or inharmonicity affects this super-optimal integration. Results show super-optimal integration, even with two harmonic components and for most combinations of consecutive harmonic, but not inharmonic, components.
10. Auditory filter shapes derived from forward and simultaneous masking at low frequencies: Implications for human cochlear tuning. Hear Res 2022; 420:108500. PMID: 35405591; PMCID: PMC9167757; DOI: 10.1016/j.heares.2022.108500.
Abstract
Behavioral forward-masking thresholds with a spectrally notched-noise masker and a fixed low-level probe tone have been shown to provide accurate estimates of cochlear tuning. Estimates using simultaneous masking are similar but generally broader, presumably due to nonlinear cochlear suppression effects. So far, estimates with forward masking have been limited to frequencies of 1 kHz and above. This study used spectrally notched noise under forward and simultaneous masking to estimate frequency selectivity between 200 and 1000 Hz for young adult listeners with normal hearing. Estimates of filter tuning at 1000 Hz were in agreement with previous studies. Estimated tuning broadened below 1000 Hz, with the filter quality factor based on the equivalent rectangular bandwidth (QERB) decreasing more rapidly with decreasing frequency than predicted by previous equations, in line with earlier predictions based on otoacoustic-emission latencies. Estimates from simultaneous masking remained broader than those from forward masking by approximately the same ratio. The new data provide a way to compare human cochlear tuning estimates with auditory-nerve tuning curves from other species across most of the auditory frequency range.
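For reference, the quality factor mentioned above is simply the filter's characteristic (center) frequency divided by its equivalent rectangular bandwidth, so broader filters (larger ERB relative to center frequency) give lower values:

```latex
Q_{\mathrm{ERB}}(f_c) = \frac{f_c}{\mathrm{ERB}(f_c)}
```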
11. What makes human hearing special? Front Young Minds 2022; 10:708921. PMID: 37465203; PMCID: PMC10353771; DOI: 10.3389/frym.2022.708921.
Abstract
Humans and many other animals can hear a wide range of sounds. We can hear low and high notes and both quiet and loud sounds. We are also very good at telling the difference between sounds that are similar, like the speech sounds "argh" and "ah," and picking apart sounds that are mixed together, like when an orchestra is playing. But how do human hearing abilities compare to those of other animals? In this article, we discover how the inner ear determines hearing abilities. Many other mammals can hear very high notes that we cannot, and some can hear quiet sounds that we cannot. However, humans may be better than any other species at distinguishing similar sounds. We know this because, milliseconds after the sounds around us go into our ears, other sounds come out: sounds that are actually produced by those same ears!
12. No Benefit of Deriving Cochlear-Implant Maps From Binaural Temporal-Envelope Sensitivity for Speech Perception or Spatial Hearing Under Single-Sided Deafness. Ear Hear 2022; 43:310-322. PMID: 34291758; PMCID: PMC8770730; DOI: 10.1097/aud.0000000000001094.
Abstract
OBJECTIVES: This study tested whether speech perception and spatial acuity improved in people with single-sided deafness and a cochlear implant (SSD+CI) when the frequency allocation table (FAT) of the CI was adjusted to optimize frequency-dependent sensitivity to binaural disparities. DESIGN: Nine SSD+CI listeners with at least 6 months of CI listening experience participated. Individual experimental FATs were created to best match the frequency-to-place mapping across ears using either sensitivity to binaural temporal-envelope disparities or estimated insertion depth. Spatial localization ability was measured, along with speech perception in spatially collocated or separated noise, first with the clinical FATs and then with the experimental FATs acutely and at 2-month intervals for 6 months. Listeners then returned to the clinical FATs and were retested acutely and after 1 month to control for long-term learning effects. RESULTS: The experimental FAT varied between listeners, differing by an average of 0.15 octaves from the clinical FAT. No significant differences in performance were observed in any of the measures between the experimental FAT after 6 months and the clinical FAT one month later, and no clear relationship was found between the size of the frequency-allocation shift and perceptual changes. CONCLUSION: Adjusting the FAT to optimize sensitivity to interaural temporal-envelope disparities did not improve localization or speech perception. The clinical frequency-to-place alignment may already be sufficient, given the inherently poor spectral resolution of CIs. Alternatively, other factors, such as temporal misalignment between the two ears, may need to be addressed before any benefits of spectral alignment can be observed.
13. Voice disadvantage effects in absolute and relative pitch judgments. J Acoust Soc Am 2022; 151:2414. PMID: 35461511; PMCID: PMC8993423; DOI: 10.1121/10.0010123.
Abstract
Absolute pitch (AP) possessors can identify musical notes without an external reference. Most AP studies have used musical instruments and pure tones for testing, rather than the human voice. However, the voice is crucial for human communication in both speech and music, and evidence for voice-specific neural processing mechanisms and brain regions suggests that AP processing of voice may be different. Here, musicians with AP or relative pitch (RP) completed online AP or RP note-naming tasks, respectively. Four synthetic sound categories were tested: voice, viola, simplified voice, and simplified viola. Simplified sounds had the same long-term spectral information but no temporal fluctuations (such as vibrato). The AP group was less accurate in judging the note names for voice than for viola in both the original and simplified conditions. A smaller, marginally significant effect was observed in the RP group. A voice disadvantage effect was also observed in a simple pitch discrimination task, even with simplified stimuli. To reconcile these results with voice-advantage effects in other domains, it is proposed that voices are processed in a way that voice- or speech-relevant features are facilitated at the expense of features that are less relevant to voice processing, such as fine-grained pitch information.
14. Human discrimination and modeling of high-frequency complex tones shed light on the neural codes for pitch. PLoS Comput Biol 2022; 18:e1009889. PMID: 35239639; PMCID: PMC8923464; DOI: 10.1371/journal.pcbi.1009889.
Abstract
Accurate pitch perception of harmonic complex tones is widely believed to rely on temporal fine structure information conveyed by the precise phase-locked responses of auditory-nerve fibers. However, accurate pitch perception remains possible even when spectrally resolved harmonics are presented at frequencies beyond the putative limits of neural phase locking, and it is unclear whether residual temporal information, or a coarser rate-place code, underlies this ability. We addressed this question by measuring human pitch discrimination at low and high frequencies for harmonic complex tones, presented either in isolation or in the presence of concurrent complex-tone maskers. We found that concurrent complex-tone maskers impaired performance at both low and high frequencies, although the impairment introduced by adding maskers at high frequencies relative to low frequencies differed between the tested masker types. We then combined simulated auditory-nerve responses to our stimuli with ideal-observer analysis to quantify the extent to which performance was limited by peripheral factors. We found that the worsening of both frequency discrimination and F0 discrimination at high frequencies could be well accounted for (in relative terms) by optimal decoding of all available information at the level of the auditory nerve. A Python package is provided to reproduce these results, and to simulate responses to acoustic stimuli from the three previously published models of the human auditory nerve used in our analyses.
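The ideal-observer analysis mentioned above can be illustrated with a highly simplified rate-place version: per-fiber sensitivities are combined optimally under an independent-Poisson assumption. This sketch is not the published Python package, ignores spike timing entirely, and uses assumed inputs (mean discharge rates per fiber for each stimulus).

```python
import numpy as np

def rate_place_dprime(rates_ref, rates_test, dur=0.2):
    """Approximate d' for discriminating two stimuli from mean discharge rates
    (spikes/s) of a population of simulated auditory-nerve fibers, assuming
    independent Poisson spike counts over a window of `dur` seconds."""
    counts_ref = np.asarray(rates_ref, float) * dur
    counts_test = np.asarray(rates_test, float) * dur
    var = 0.5 * (counts_ref + counts_test)            # Poisson variance ~ mean count
    d2 = (counts_test - counts_ref) ** 2 / np.maximum(var, 1e-9)
    return np.sqrt(d2.sum())                          # optimal combination across fibers
```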
15. Infant Pitch and Timbre Discrimination in the Presence of Variation in the Other Dimension. J Assoc Res Otolaryngol 2021; 22:693-702. PMID: 34519951; DOI: 10.1007/s10162-021-00807-1.
Abstract
Adult listeners perceive pitch with fine precision, with many adults capable of discriminating less than a 1 % change in fundamental frequency (F0). Although there is variability across individuals, this precise pitch perception is an ability ascribed to cortical functions that are also important for speech and music perception. Infants display neural immaturity in the auditory cortex, suggesting that pitch discrimination may improve throughout infancy. In two experiments, we tested the limits of F0 (pitch) and spectral centroid (timbre) perception in 66 infants and 31 adults. Contrary to expectations, we found that infants at both 3 and 7 months were able to reliably detect small changes in F0 in the presence of random variations in spectral content, and vice versa, to the extent that their performance matched that of adults with musical training and exceeded that of adults without musical training. The results indicate high fidelity of F0 and spectral-envelope coding in infants, implying that fully mature cortical processing is not necessary for accurate discrimination of these features. The surprising difference in performance between infants and musically untrained adults may reflect a developmental trajectory for learning natural statistical covariations between pitch and timbre that improves coding efficiency but results in degraded performance in adults without musical training when expectations for such covariations are violated.
16.
Abstract
OBJECTIVES: The identity of a speech sound can be affected by the spectrum of a preceding stimulus in a contrastive manner. Although such aftereffects are often reduced in people with hearing loss and cochlear implants (CIs), one recent study demonstrated larger spectral contrast effects in CI users than in normal-hearing (NH) listeners. The present study aimed to shed light on this puzzling finding. We hypothesized that poorer spectral resolution leads CI users to rely on different acoustic cues not only to identify speech sounds but also to adapt to the context. DESIGN: Thirteen postlingually deafened adult CI users and 33 NH participants (listening to either vocoded or unprocessed speech) participated in this study. Psychometric functions were estimated in a vowel categorization task along the /I/ to /ε/ (as in "bit" and "bet") continuum following a context sentence, the long-term average spectrum of which was manipulated at the level of either fine-grained local spectral cues or coarser global spectral cues. RESULTS: In NH listeners with unprocessed speech, the aftereffect was determined solely by the fine-grained local spectral cues, resulting in a surprising insensitivity to the larger, global spectral cues utilized by CI users. Restricting the spectral resolution available to NH listeners via vocoding resulted in patterns of responses more similar to those found in CI users. However, the size of the contrast aftereffect remained smaller in NH listeners than in CI users. CONCLUSIONS: Only the spectral contrasts used by listeners contributed to the spectral contrast effects in vowel identification. These results explain why CI users can experience larger-than-normal context effects under specific conditions. The results also suggest that adaptation to new spectral cues can be very rapid for vowel discrimination, but may follow a longer time course to influence spectral contrast effects.
17. Investigating age, hearing loss, and background noise effects on speaker-targeted head and eye movements in three-way conversations. J Acoust Soc Am 2021; 149:1889. PMID: 33765809; DOI: 10.1121/10.0003707.
Abstract
Although beamforming algorithms for hearing aids can enhance performance, the wearer's head may not always face the target talker, potentially limiting real-world benefits. This study aimed to determine the extent to which eye tracking improves the accuracy of locating the current talker in three-way conversations and to test the hypothesis that eye movements become more likely to track the target talker with increasing background noise levels, particularly in older and/or hearing-impaired listeners. Conversations between a participant and two confederates were held around a small table in quiet and with background noise levels of 50, 60, and 70 dB sound pressure level, while the participant's eye and head movements were recorded. Ten young normal-hearing listeners were tested, along with ten older normal-hearing listeners and eight hearing-impaired listeners. Head movements generally undershot the talker's position by 10°-15°, but head and eye movements together predicted the talker's position well. Contrary to our original hypothesis, no major differences in listening behavior were observed between the groups or between noise levels, although the hearing-impaired listeners tended to spend less time looking at the current talker than the other groups, especially at the highest noise level.
18. Role of semantic context and talker variability in speech perception of cochlear-implant users and normal-hearing listeners. J Acoust Soc Am 2021; 149:1224. PMID: 33639827; PMCID: PMC7895533; DOI: 10.1121/10.0003532.
Abstract
This study assessed the impact of semantic context and talker variability on speech perception by cochlear-implant (CI) users and compared their overall performance and between-subjects variance with that of normal-hearing (NH) listeners under vocoded conditions. Thirty post-lingually deafened adult CI users were tested, along with 30 age-matched and 30 younger NH listeners, on sentences with and without semantic context, presented in quiet and noise, spoken by four different talkers. Additional measures included working memory, non-verbal intelligence, and spectral-ripple detection and discrimination. Semantic context and between-talker differences influenced speech perception to similar degrees for both CI users and NH listeners. Between-subjects variance for speech perception was greatest in the CI group but remained substantial in both NH groups, despite the uniformly degraded stimuli in these two groups. Spectral-ripple detection and discrimination thresholds in CI users were significantly correlated with speech perception, but a single set of vocoder parameters for NH listeners was not able to capture average CI performance in both speech and spectral-ripple tasks. The lack of difference in the use of semantic context between CI users and NH listeners suggests no overall differences in listening strategy between the groups, when the stimuli are similarly degraded.
19. Development and Validation of Sentences Without Semantic Context to Complement the Basic English Lexicon Sentences. J Speech Lang Hear Res 2020; 63:3847-3854. PMID: 33049146; PMCID: PMC8582750; DOI: 10.1044/2020_jslhr-20-00174.
Abstract
Purpose: The goal of this study was to develop and validate a new corpus of sentences without semantic context to facilitate research aimed at isolating the effects of semantic context in speech perception. Method: The newly developed corpus contains nonsensical sentences but is matched in vocabulary and syntactic structure to the existing Basic English Lexicon (BEL) corpus. It consists of 20 lists, with each list containing 25 sentences and each sentence having four keywords. Each new list contains the same keywords as the respective list in the original BEL corpus, but the keywords within each list are scrambled across sentences to eliminate semantic context within each sentence, while maintaining the original syntactic structure. All sentences in the original and nonsense BEL corpora were recorded by the same two male and two female talkers. Results: Mean intelligibility scores for each list were estimated by calculating the mean proportion of correct keywords achieved by 40 normal-hearing listeners for one male and one female talker. Although small but significant differences were found between some pairs of lists, mean performance for all 20 lists fell within the 95% confidence intervals of the mean. Conclusions: Lists in the newly developed nonsense corpus are reasonably well equated for difficulty and can be used interchangeably in a randomized experimental design. Both the original and nonsense BEL sentences, all recorded by the same four talkers, are publicly available. Supplemental Material: https://doi.org/10.23641/asha.13022900.
20. The role of cochlear place coding in the perception of frequency modulation. eLife 2020; 9:58468. PMID: 32996463; PMCID: PMC7556860; DOI: 10.7554/elife.58468.
Abstract
Natural sounds convey information via frequency and amplitude modulations (FM and AM). Humans are acutely sensitive to the slow rates of FM that are crucial for speech and music. This sensitivity has long been thought to rely on precise stimulus-driven auditory-nerve spike timing (time code), whereas a coarser code, based on variations in the cochlear place of stimulation (place code), represents faster FM rates. We tested this theory in listeners with normal and impaired hearing, spanning a wide range of place-coding fidelity. Contrary to predictions, sensitivity to both slow and fast FM correlated with place-coding fidelity. We also used incoherent AM on two carriers to simulate place coding of FM and observed poorer sensitivity at high carrier frequencies and fast rates, two properties of FM detection previously ascribed to the limits of time coding. The results suggest a unitary place-based neural code for FM across all rates and carrier frequencies.
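The "incoherent AM on two carriers" manipulation can be pictured with a short stimulus sketch: two pure-tone carriers whose amplitudes are modulated in antiphase, mimicking the out-of-phase level changes that an FM tone would produce at two cochlear places. The carrier frequencies, modulation rate, and depth below are arbitrary illustrative values, not the stimulus parameters of the study.

```python
import numpy as np

fs = 48000
t = np.arange(int(fs * 1.0)) / fs               # 1-s stimulus
fc1, fc2 = 1000.0, 1200.0                       # hypothetical carrier pair (Hz)
fm, depth = 2.0, 0.5                            # slow modulation rate (Hz), AM depth

env1 = 1 + depth * np.sin(2 * np.pi * fm * t)   # the two envelopes are in antiphase
env2 = 1 - depth * np.sin(2 * np.pi * fm * t)
x = env1 * np.sin(2 * np.pi * fc1 * t) + env2 * np.sin(2 * np.pi * fc2 * t)
x /= np.max(np.abs(x))                          # normalize for playback
```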
21. Comment on 'Rapid acquisition of auditory subcortical steady state responses using multichannel recordings'. Clin Neurophysiol 2020; 131:1833-1834. PMID: 32559638; PMCID: PMC7860925; DOI: 10.1016/j.clinph.2020.05.018.
22. Sensitivity to binaural temporal-envelope beats with single-sided deafness and a cochlear implant as a measure of tonotopic match (L). J Acoust Soc Am 2020; 147:3626. PMID: 32486770; PMCID: PMC7253218; DOI: 10.1121/10.0001305.
Abstract
For cochlear-implant users with near-normal contralateral hearing, a mismatch between the frequency-to-place mapping in the two ears could produce a suboptimal performance. This study assesses tonotopic matches via binaural interactions. Dynamic interaural time-difference sensitivity was measured using bandpass-filtered pulse trains at different rates in the acoustic and implanted ear, creating binaural envelope beats. Sensitivity to beats should peak when the same tonotopic region is stimulated in both ears. All nine participants detected dynamic interaural timing differences and demonstrated some frequency selectivity. This method provides a guide to frequency-to-place mapping without compensation for inherent latency differences between the acoustic and implanted ears.
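The stimulus idea above (pulse trains with slightly different rates in the two ears, producing a slow binaural envelope beat) can be sketched as follows. The pulse rates and filter band are illustrative assumptions; in the actual study the implanted ear received electrical stimulation rather than an acoustic pulse train.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

fs, dur = 44100, 1.0

def pulse_train(rate, fs, dur):
    """One-sample clicks at the given rate (pulses per second)."""
    x = np.zeros(int(fs * dur))
    idx = (np.arange(0.0, dur, 1.0 / rate) * fs).astype(int)
    x[idx[idx < len(x)]] = 1.0
    return x

left = pulse_train(300, fs, dur)    # 300 pps in one ear
right = pulse_train(302, fs, dur)   # 302 pps in the other: 2-Hz envelope beat
# Band-pass filter the acoustic pulse train into a restricted frequency region
# (band edges chosen here only for illustration).
sos = butter(4, [3000, 5000], btype="bandpass", fs=fs, output="sos")
left_f, right_f = sosfiltfilt(sos, left), sosfiltfilt(sos, right)
```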
23. Effect of lowest harmonic rank on fundamental-frequency difference limens varies with fundamental frequency. J Acoust Soc Am 2020; 147:2314. PMID: 32359332; PMCID: PMC7166120; DOI: 10.1121/10.0001092.
Abstract
This study investigated the relationship between fundamental frequency difference limens (F0DLs) and the lowest harmonic number present over a wide range of F0s (30-2000 Hz) for 12-component harmonic complex tones that were presented in either sine or random phase. For fundamental frequencies (F0s) between 100 and 400 Hz, a transition from low (∼1%) to high (∼5%) F0DLs occurred as the lowest harmonic number increased from about seven to ten, in line with earlier studies. At lower and higher F0s, the transition between low and high F0DLs occurred at lower harmonic numbers. The worsening performance at low F0s was reasonably well predicted by the expected decrease in spectral resolution below about 500 Hz. At higher F0s, the degradation in performance at lower harmonic numbers could not be predicted by changes in spectral resolution but remained relatively good (<2%-3%) in some conditions, even when all harmonics were above 8 kHz, confirming that F0 can be extracted from harmonics even when temporal envelope or fine-structure cues are weak or absent.
24. The Perception of Multiple Simultaneous Pitches as a Function of Number of Spectral Channels and Spectral Spread in a Noise-Excited Envelope Vocoder. J Assoc Res Otolaryngol 2020; 21:61-72. PMID: 32048077; DOI: 10.1007/s10162-019-00738-y.
Abstract
Cochlear implant (CI) listeners typically perform poorly on tasks involving the pitch of complex tones. This limitation in performance is thought to be mainly due to the restricted number of active channels and the broad current spread that leads to channel interactions and subsequent loss of precise spectral information, with temporal information limited primarily to temporal-envelope cues. Little is known about the degree of spectral resolution required to perceive combinations of multiple pitches, or a single pitch in the presence of other interfering tones in the same spectral region. This study used noise-excited envelope vocoders that simulate the limited resolution of CIs to explore the perception of multiple pitches presented simultaneously. The results show that the resolution required for perceiving multiple complex pitches is comparable to that found in a previous study using single complex tones. Although relatively high performance can be achieved with 48 channels, performance remained near chance when even limited spectral spread (with filter slopes as steep as 144 dB/octave) was introduced to the simulations. Overall, these tight constraints suggest that current CI technology will not be able to convey the pitches of combinations of spectrally overlapping complex tones.
25. Spectral contrast effects and auditory enhancement under normal and impaired hearing. Acoust Sci Technol 2020; 41:108-112. PMID: 32362758; PMCID: PMC7194197; DOI: 10.1250/ast.41.108.
Abstract
We are generally able to identify sounds and understand speech with ease, despite the large variations in the acoustics of each sound, which occur due to factors such as different talkers, background noise, and room acoustics. This form of perceptual constancy is likely to be mediated in part by the auditory system's ability to adapt to the ongoing environment or context in which sounds are presented. Auditory context effects have been studied under different names, such as spectral contrast effects in speech and auditory enhancement effects in psychoacoustics, but they share some important properties and may be mediated by similar underlying neural mechanisms. This review provides a survey of recent studies from our laboratory that investigate the mechanisms of speech spectral contrast effects and auditory enhancement in people with normal hearing, hearing loss, and cochlear implants. We argue that a better understanding of such context effects in people with normal hearing may allow us to restore some of these important effects for people with hearing loss via signal processing in hearing aids and cochlear implants, thereby potentially improving auditory and speech perception in the complex and variable everyday acoustic backgrounds that surround us.
26. Comparing Rapid and Traditional Forward-Masked Spatial Tuning Curves in Cochlear-Implant Users. Trends Hear 2019; 23:2331216519851306. PMID: 31134842; PMCID: PMC6540501; DOI: 10.1177/2331216519851306.
Abstract
A rapid forward-masked spatial tuning curve measurement procedure, based on Bekesy tracking, was adapted and evaluated for use with cochlear implants. Twelve postlingually-deafened adult cochlear-implant users participated. Spatial tuning curves using the new procedure and using a traditional forced-choice adaptive procedure resulted in similar estimates of parameters. The Bekesy-tracking method was almost 3 times faster than the forced-choice procedure, but its test-retest reliability was significantly poorer. Although too time-consuming for general clinical use, the new method may have some benefits in individual cases, where identifying electrodes with poor spatial selectivity as candidates for deactivation is deemed necessary.
27. Short- and long-term memory for pitch and non-pitch contours: Insights from congenital amusia. Brain Cogn 2019; 136:103614. PMID: 31546175; PMCID: PMC6953621; DOI: 10.1016/j.bandc.2019.103614.
Abstract
Congenital amusia is a neurodevelopmental disorder characterized by deficits in music perception, including discriminating and remembering melodies and melodic contours. As non-amusic listeners can perceive contours in dimensions other than pitch, such as loudness and brightness, our present study investigated whether amusics' pitch contour deficits also extend to these other auditory dimensions. Amusic and control participants performed an identification task for ten familiar melodies and a short-term memory task requiring the discrimination of changes in the contour of novel four-tone melodies. For both tasks, melodic contour was defined by pitch, brightness, or loudness. Amusic participants showed some ability to extract contours in all three dimensions. For familiar melodies, amusic participants showed impairment in all conditions, perhaps reflecting the fact that the long-term memory representations of the familiar melodies were defined in pitch. In the contour discrimination task with novel melodies, amusic participants exhibited less impairment for loudness-based melodies than for pitch- or brightness-based melodies, suggesting some specificity of the deficit for spectral changes, if not for pitch alone. The results suggest pitch and brightness may not be processed by the same mechanisms as loudness, and that short-term memory for loudness contours may be spared to some degree in congenital amusia.
28. Auditory enhancement under forward masking in normal-hearing and hearing-impaired listeners. J Acoust Soc Am 2019; 146:3448. PMID: 31795651; PMCID: PMC6872462; DOI: 10.1121/1.5133629.
Abstract
A target within a spectrally notched masker can be enhanced by a preceding copy of the masker. Enhancement can also increase the effectiveness of the target as a forward masker. Enhancement has been reported in hearing-impaired listeners under simultaneous but not forward masking. However, previous studies of enhancement under forward masking did not fully assess the potential effect of differences in sensation level or spectral resolution between the normal-hearing and hearing-impaired listeners. This study measured enhancement via forward masking in hearing-impaired and age-matched normal-hearing listeners. Different spectral notch widths in the masker were used to account for potential differences in frequency selectivity, and levels were equated either by adding a background masking noise (equating both sensation level and sound pressure level) or by reducing the sound pressure level of the stimuli (equating sensation level only). Hearing-impaired listeners showed no significant enhancement, regardless of spectral notch width. Normal-hearing listeners showed enhancement at high levels, but showed less enhancement when sensation levels were reduced to match those of the hearing-impaired group, either by reducing sound levels or by adding a masking noise. The results confirm a lack of forward-masked enhancement in hearing-impaired listeners but suggest this may be partly due to reduced sensation level.
29. No effects of attention or visual perceptual load on cochlear function, as measured with stimulus-frequency otoacoustic emissions. J Acoust Soc Am 2019; 146:1475. PMID: 31472524; PMCID: PMC6715442; DOI: 10.1121/1.5123391.
Abstract
The effects of selectively attending to a target stimulus in a background containing distractors can be observed in cortical representations of sound as an attenuation of the representation of distractor stimuli. The locus in the auditory system at which attentional modulations first arise is unknown, but anatomical evidence suggests that cortically driven modulation of neural activity could extend as peripherally as the cochlea itself. Previous studies of selective attention have used otoacoustic emissions to probe cochlear function under varying conditions of attention with mixed results. In the current study, two experiments combined visual and auditory tasks to maximize sustained attention, perceptual load, and cochlear dynamic range in an attempt to improve the likelihood of observing selective attention effects on cochlear responses. Across a total of 45 listeners in the two experiments, no systematic effects of attention or perceptual load were observed on stimulus-frequency otoacoustic emissions. The results revealed significant between-subject variability in the otoacoustic-emission measure of cochlear function that does not depend on listener performance in the behavioral tasks and is not related to movement-generated noise. The findings suggest that attentional modulation of auditory information in humans arises at stages of processing beyond the cochlea.
30. Speech perception is similar for musicians and non-musicians across a wide range of conditions. Sci Rep 2019; 9:10404. PMID: 31320656; PMCID: PMC6639310; DOI: 10.1038/s41598-019-46728-1.
Abstract
It remains unclear whether musical training is associated with improved speech understanding in a noisy environment, with different studies reaching differing conclusions. Even in those studies that have reported an advantage for highly trained musicians, it is not known whether the benefits measured in laboratory tests extend to more ecologically valid situations. This study aimed to establish whether musicians are better than non-musicians at understanding speech in a background of competing speakers or speech-shaped noise under more realistic conditions, involving sounds presented in space via a spherical array of 64 loudspeakers, rather than over headphones, with and without simulated room reverberation. The study also included experiments testing fundamental frequency difference limens (F0DLs), interaural time difference limens (ITDLs), and attentive tracking. Sixty-four participants (32 non-musicians and 32 musicians) were tested, with the two groups matched in age, sex, and IQ as assessed with Raven's Advanced Progressive Matrices. There was a significant benefit of musicianship for F0DLs, ITDLs, and attentive tracking. However, speech scores were not significantly different between the two groups. The results suggest no musician advantage for understanding speech in backgrounds of noise or competing talkers under a variety of conditions.
31. Cognitive factors contribute to speech perception in cochlear-implant users and age-matched normal-hearing listeners under vocoded conditions. J Acoust Soc Am 2019; 146:195. PMID: 31370651; PMCID: PMC6637026; DOI: 10.1121/1.5116009.
Abstract
This study examined the contribution of perceptual and cognitive factors to speech-perception abilities in cochlear-implant (CI) users. Thirty CI users were tested on word intelligibility in sentences with and without semantic context, presented in quiet and in noise. Performance was compared with measures of spectral-ripple detection and discrimination, thought to reflect peripheral processing, as well as with cognitive measures of working memory and non-verbal intelligence. Thirty age-matched and thirty younger normal-hearing (NH) adults also participated, listening via tone-excited vocoders, adjusted to produce mean performance for speech in noise comparable to that of the CI group. Results suggest that CI users may rely more heavily on semantic context than younger or older NH listeners, and that non-auditory working memory explains significant variance in the CI and age-matched NH groups. Between-subject variability in spectral-ripple detection thresholds was similar across groups, despite the spectral resolution for all NH listeners being limited by the same vocoder, whereas speech perception scores were more variable between CI users than between NH listeners. The results highlight the potential importance of central factors in explaining individual differences in CI users and question the extent to which standard measures of spectral resolution in CIs reflect purely peripheral processing.
32. Corrigendum to “Learning for pitch and melody discrimination in congenital amusia” [Cortex 103 (2018) 167–178]. Cortex 2019; 115:371. DOI: 10.1016/j.cortex.2019.02.001.
33. The role of pitch and harmonic cancellation when listening to speech in harmonic background sounds. J Acoust Soc Am 2019; 145:3011. PMID: 31153349; PMCID: PMC6529328; DOI: 10.1121/1.5102169.
Abstract
Fundamental frequency differences (ΔF0) between competing talkers aid in the perceptual segregation of the talkers (ΔF0 benefit), but the underlying mechanisms remain incompletely understood. A model of ΔF0 benefit based on harmonic cancellation proposes that a masker's periodicity can be used to cancel (i.e., filter out) its neural representation. Earlier work suggested that an octave ΔF0 provided little benefit, an effect predicted by harmonic cancellation due to the shared periodicity of masker and target. Alternatively, this effect can be explained by spectral overlap between the harmonic components of the target and masker. To assess these competing explanations, speech intelligibility of a monotonized target talker, masked by a speech-shaped harmonic complex tone, was measured as a function of ΔF0, masker spectrum (all harmonics or odd harmonics only), and masker temporal envelope (amplitude modulated or unmodulated). Removal of the masker's even harmonics when the target was one octave above the masker improved speech reception thresholds by about 5 dB. Because this manipulation eliminated spectral overlap between target and masker components but preserved shared periodicity, the finding is consistent with the explanation for the lack of ΔF0 benefit at the octave based on spectral overlap, but not with the explanation based on harmonic cancellation.
34. Pitch discrimination with mixtures of three concurrent harmonic complexes. J Acoust Soc Am 2019; 145:2072. PMID: 31046318; PMCID: PMC6469983; DOI: 10.1121/1.5096639.
Abstract
In natural listening contexts, especially in music, it is common to hear three or more simultaneous pitches, but few empirical or theoretical studies have addressed how this is achieved. Place and pattern-recognition theories of pitch require at least some harmonics to be spectrally resolved for pitch to be extracted, but it is unclear how often such conditions exist when multiple complex tones are presented together. In three behavioral experiments, mixtures of three concurrent complexes were filtered into a single bandpass spectral region, and the relationship between the fundamental frequencies and spectral region was varied in order to manipulate the extent to which harmonics were resolved either before or after mixing. In experiment 1, listeners discriminated major from minor triads (a difference of 1 semitone in one note of the triad). In experiments 2 and 3, listeners compared the pitch of a probe tone with that of a subsequent target, embedded within two other tones. All three experiments demonstrated above-chance performance, even in conditions where the combinations of harmonic components were unlikely to be resolved after mixing, suggesting that fully resolved harmonics may not be necessary to extract the pitch from multiple simultaneous complexes.
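The stimulus construction can be illustrated with a short sketch, assuming equal-amplitude harmonics and an arbitrary bandpass region; the triad frequencies and filter edges below are assumptions rather than the study's exact values.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def complex_tone(f0, fs, dur, n_harmonics=30):
    """Equal-amplitude harmonic complex tone (illustrative parameters)."""
    t = np.arange(int(dur * fs)) / fs
    return sum(np.sin(2 * np.pi * k * f0 * t) for k in range(1, n_harmonics + 1))

fs, dur = 44100, 0.5
# Root-position triads on C4 (assumed); the minor triad lowers the middle
# note by one semitone, as in experiment 1.
major = [261.63, 329.63, 392.00]
minor = [261.63, 311.13, 392.00]
mix_major = sum(complex_tone(f, fs, dur) for f in major)
mix_minor = sum(complex_tone(f, fs, dur) for f in minor)
# Filter each mixture into a single high spectral region so that few, if
# any, harmonics remain resolved after mixing (band edges are assumptions).
sos = butter(6, [2000, 4000], btype='bandpass', fs=fs, output='sos')
filtered_major = sosfiltfilt(sos, mix_major)
filtered_minor = sosfiltfilt(sos, mix_minor)
```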
Collapse
|
35
|
Examining replicability of an otoacoustic measure of cochlear function during selective attention. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2018; 144:2882. [PMID: 30522315 PMCID: PMC6246073 DOI: 10.1121/1.5079311] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 07/23/2018] [Revised: 10/12/2018] [Accepted: 10/27/2018] [Indexed: 06/09/2023]
Abstract
Attention to a target stimulus within a complex scene often results in enhanced cortical representations of the target relative to the background. It remains unclear where along the auditory pathways attentional effects can first be measured. Anatomy suggests that attentional modulation could occur through corticofugal connections extending as far as the cochlea itself. Earlier attempts to investigate the effects of attention on human cochlear processing have revealed small and inconsistent effects. In this study, stimulus-frequency otoacoustic emissions were recorded from a total of 30 human participants as they performed tasks that required sustained selective attention to auditory or visual stimuli. In the first sample of 15 participants, emission magnitudes were significantly weaker when participants attended to the visual stimuli than when they attended to the auditory stimuli, by an average of 5.4 dB. However, no such effect was found in the second sample of 15 participants. When the data were pooled across samples, the average attentional effect was significant, but small (2.48 dB), with 12 of 30 listeners showing a significant effect, based on bootstrap analysis of the individual data. The results highlight the need for considering sources of individual differences and using large sample sizes in future investigations.
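A bootstrap test of an individual listener's attentional effect, of the kind mentioned above, might look like the following sketch; the resampling scheme, trial counts, and data are hypothetical and not the study's exact analysis.

```python
import numpy as np

rng = np.random.default_rng(0)

def bootstrap_effect(attend_aud_db, attend_vis_db, n_boot=10000):
    """Bootstrap the attend-auditory minus attend-visual difference in
    emission magnitude (dB) for one listener; returns the observed
    difference and a 95% percentile confidence interval."""
    observed = np.mean(attend_aud_db) - np.mean(attend_vis_db)
    diffs = np.empty(n_boot)
    for i in range(n_boot):
        a = rng.choice(attend_aud_db, size=len(attend_aud_db), replace=True)
        v = rng.choice(attend_vis_db, size=len(attend_vis_db), replace=True)
        diffs[i] = np.mean(a) - np.mean(v)
    lo, hi = np.percentile(diffs, [2.5, 97.5])
    return observed, (lo, hi)   # effect is "significant" if the CI excludes 0

# Hypothetical per-trial emission magnitudes (dB) for one listener:
aud = rng.normal(10.0, 2.0, size=40)
vis = rng.normal(8.0, 2.0, size=40)
print(bootstrap_effect(aud, vis))
```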
Collapse
|
36
|
Fundamental-frequency discrimination based on temporal-envelope cues: Effects of bandwidth and interference. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2018; 144:EL423. [PMID: 30522318 PMCID: PMC6249132 DOI: 10.1121/1.5079569] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/14/2018] [Revised: 10/24/2018] [Accepted: 10/29/2018] [Indexed: 06/09/2023]
Abstract
Both music and speech perception rely on hearing out one pitch in the presence of others. Pitch discrimination of narrowband sounds based only on temporal-envelope cues is rendered nearly impossible for both normal-hearing listeners and cochlear-implant (CI) users when interferers are introduced. This study tested whether performance improves in normal-hearing listeners if the target is presented over a broad spectral region. The results indicate that performance is still strongly affected by spectrally remote interferers, despite increases in bandwidth, suggesting that envelope-based pitch is unlikely to allow CI users to perceive pitch when multiple harmonic sounds are presented at once.
Collapse
|
37
|
Abstract
Frequency analysis of sound by the cochlea is the most fundamental property of the auditory system. Despite its importance, the resolution of this frequency analysis in humans remains controversial. The controversy persists because the methods used to estimate tuning in humans are indirect and have not all been independently validated in other species. Some data suggest that human cochlear tuning is considerably sharper than that of laboratory animals, while others suggest little or no difference between species. We show here in a single species (ferret) that behavioral estimates of tuning bandwidths obtained using perceptual masking methods, and objective estimates obtained using otoacoustic emissions, both also employed in humans, agree closely with direct physiological measurements from single auditory-nerve fibers. Combined with human behavioral data, this outcome indicates that the frequency analysis performed by the human cochlea is of significantly higher resolution than found in common laboratory animals. This finding raises important questions about the evolutionary origins of human cochlear tuning, its role in the emergence of speech communication, and the mechanisms underlying our ability to separate and process natural sounds in complex acoustic environments.
Collapse
|
38
|
Cortical markers of auditory stream segregation revealed for streaming based on tonotopy but not pitch. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2018; 144:2424. [PMID: 30404514 PMCID: PMC6909992 DOI: 10.1121/1.5065392] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 07/05/2018] [Revised: 10/05/2018] [Accepted: 10/08/2018] [Indexed: 06/08/2023]
Abstract
The brain decomposes mixtures of sounds, such as competing talkers, into perceptual streams that can be attended to individually. Attention can enhance the cortical representation of streams, but it is unknown what acoustic features the enhancement reflects, or where in the auditory pathways attentional enhancement is first observed. Here, behavioral measures of streaming were combined with simultaneous low- and high-frequency envelope-following responses (EFR) that are thought to originate primarily from cortical and subcortical regions, respectively. Repeating triplets of harmonic complex tones were presented with alternating fundamental frequencies. The tones were filtered to contain either low-numbered spectrally resolved harmonics, or only high-numbered unresolved harmonics. The behavioral results confirmed that segregation can be based on either tonotopic or pitch cues. The EFR results revealed no effects of streaming or attention on subcortical responses. Cortical responses revealed attentional enhancement under conditions of streaming, but only when tonotopic cues were available, not when streaming was based only on pitch cues. The results suggest that the attentional modulation of phase-locked responses is dominated by tonotopically tuned cortical neurons that are insensitive to pitch or periodicity cues.
Collapse
|
39
|
Abstract
The long-term spectrum of a preceding sentence can alter the perception of a following speech sound in a contrastive manner. This speech context effect contributes to our ability to extract reliable spectral characteristics of the surrounding acoustic environment and to compensate for the voice characteristics of different speakers or spectral colorations in different listening environments to maintain perceptual constancy. The extent to which such effects are mediated by low-level "automatic" processes, or require directed attention, remains unknown. This study investigated spectral context effects by measuring the effects of two competing sentences on the phoneme category boundary between /i/ and /ε/ in a following target word, while directing listeners' attention to one or the other context sentence. Spatial separation of the context sentences was achieved either by presenting them to different ears, or by presenting them to both ears but imposing an interaural time difference (ITD) between the ears. The results confirmed large context effects based on ear of presentation. Smaller effects were observed based on either ITD or attention. The results, combined with predictions from a two-stage model, suggest that ear-specific factors dominate speech context effects but that the effects can be modulated by higher-level features, such as perceived location, and by attention.
Collapse
|
40
|
Loudness Context Effects and Auditory Enhancement in Normal, Impaired, and Electric Hearing. ACTA ACUST UNITED AC 2018. [DOI: 10.3813/aaa.919254] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/06/2022]
|
41
|
Auditory enhancement and the role of spectral resolution in normal-hearing listeners and cochlear-implant users. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2018; 144:552. [PMID: 30180692 PMCID: PMC6072550 DOI: 10.1121/1.5048414] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 03/23/2018] [Revised: 06/25/2018] [Accepted: 07/11/2018] [Indexed: 05/17/2023]
Abstract
Detection of a target tone in a simultaneous multi-tone masker can be improved by preceding the stimulus with the masker alone. The mechanisms underlying this auditory enhancement effect may enable the efficient detection of new acoustic events and may help to produce perceptual constancy under varying acoustic conditions. Previous work in cochlear-implant (CI) users has suggested reduced or absent enhancement, due perhaps to poor spatial resolution in the cochlea. This study used a supra-threshold enhancement paradigm that in normal-hearing listeners results in large enhancement effects, exceeding 20 dB. Results from vocoder simulations using normal-hearing listeners showed that near-normal enhancement was observed if the simulated spread of excitation was limited to spectral slopes no shallower than 24 dB/oct. No significant enhancement was observed on average in CI users with their clinical monopolar stimulation strategy. The variability in enhancement between CI users, and between electrodes in a single CI user, could not be explained by the spread of excitation, as estimated from auditory nerve evoked potentials. Enhancement remained small, but did reach statistical significance, under the narrower partial-tripolar stimulation strategy. The results suggest that enhancement may be at least partially restored by improvements in the spatial resolution of current CIs.
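One simple way to limit simulated spread of excitation to a fixed spectral slope (e.g., 24 dB/oct, the shallowest slope at which near-normal enhancement was observed) is to impose sloped channel filters in the frequency domain, as in the sketch below; this is an illustrative approximation, not the vocoder used in the study.

```python
import numpy as np

def sloped_channel_gain(freqs, f_lo, f_hi, slope_db_per_oct):
    """Channel gain (linear) that is flat between f_lo and f_hi and rolls
    off at slope_db_per_oct outside the band, crudely simulating a given
    spread of excitation (e.g., 24 dB/oct)."""
    gain_db = np.zeros_like(freqs)
    below = (freqs > 0) & (freqs < f_lo)
    above = freqs > f_hi
    gain_db[below] = -slope_db_per_oct * np.log2(f_lo / freqs[below])
    gain_db[above] = -slope_db_per_oct * np.log2(freqs[above] / f_hi)
    gain_db[freqs <= 0] = -120.0          # avoid log(0) at DC
    return 10 ** (gain_db / 20)

def apply_channel(x, fs, f_lo, f_hi, slope_db_per_oct=24.0):
    """Apply one sloped analysis filter to signal x in the frequency domain."""
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), 1 / fs)
    gains = sloped_channel_gain(freqs, f_lo, f_hi, slope_db_per_oct)
    return np.fft.irfft(spectrum * gains, n=len(x))
```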
Collapse
|
42
|
Effects of spectral resolution on spectral contrast effects in cochlear-implant users. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2018; 143:EL468. [PMID: 29960500 PMCID: PMC6002271 DOI: 10.1121/1.5042082] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 03/29/2018] [Revised: 05/02/2018] [Accepted: 05/27/2018] [Indexed: 06/08/2023]
Abstract
The identity of a speech sound can be affected by the long-term spectrum of a preceding stimulus. Poor spectral resolution of cochlear implants (CIs) may affect such context effects. Here, spectral contrast effects on a phoneme category boundary were investigated in CI users and normal-hearing (NH) listeners. Surprisingly, larger contrast effects were observed in CI users than in NH listeners, even when spectral resolution in NH listeners was limited via vocoder processing. The results may reflect a different weighting of spectral cues by CI users, based on poorer spectral resolution, which in turn may enhance some spectral contrast effects.
Collapse
|
43
|
Learning for pitch and melody discrimination in congenital amusia. Cortex 2018; 103:164-178. [PMID: 29655041 PMCID: PMC5988957 DOI: 10.1016/j.cortex.2018.03.012] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/30/2017] [Revised: 12/12/2017] [Accepted: 03/08/2018] [Indexed: 11/30/2022]
Abstract
Congenital amusia is currently thought to be a life-long neurogenetic disorder in music perception, impervious to training in pitch or melody discrimination. This study provides an explicit test of whether amusic deficits can be reduced with training. Twenty amusics and 20 matched controls participated in four sessions of psychophysical training involving either pure-tone (500 Hz) pitch discrimination or a control task of lateralization (interaural level differences for bandpass white noise). Pure-tone pitch discrimination at low, medium, and high frequencies (500, 2000, and 8000 Hz) was measured before and after training (pretest and posttest) to determine the specificity of learning. Melody discrimination was also assessed before and after training using the full Montreal Battery of Evaluation of Amusia, the most widely used standardized test to diagnose amusia. Amusics performed more poorly than controls in pitch but not localization discrimination, but both groups improved with practice on the trained stimuli. Learning was broad, occurring across all three frequencies and melody discrimination for all groups, including those who trained on the non-pitch control task. Following training, 11 of 20 amusics no longer met the global diagnostic criteria for amusia. A separate group of untrained controls (n = 20), who also completed melody discrimination and pretest, improved by an equal amount as trained controls on all measures, suggesting that the bulk of learning for the control group occurred very rapidly from the pretest. Thirty-one trained participants (13 amusics) returned one year later to assess long-term maintenance of pitch and melody discrimination. On average, there was no change in performance between posttest and one-year follow-up, demonstrating that improvements on pitch- and melody-related tasks in amusics and controls can be maintained. The findings indicate that amusia is not always a life-long deficit when using the current standard diagnostic criteria.
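Psychophysical training of this kind is commonly run with an adaptive staircase; the sketch below shows a generic 2-down, 1-up track (converging near 70.7% correct) with a simulated listener. Step sizes, reversal counts, and the toy psychometric function are assumptions, not the study's exact procedure.

```python
import numpy as np

rng = np.random.default_rng(1)

def simulated_listener(delta_semitones, threshold=0.5):
    """Hypothetical listener: probability correct grows with the pitch
    difference relative to an internal threshold (toy psychometric function)."""
    p_correct = 1 - 0.5 * np.exp(-delta_semitones / threshold)
    return rng.random() < p_correct

def two_down_one_up(start=2.0, factor=1.5, n_reversals=12):
    """2-down, 1-up staircase tracking ~70.7% correct; the threshold is the
    geometric mean of the last eight reversal points."""
    delta, correct_in_row, direction = start, 0, 0
    reversals = []
    while len(reversals) < n_reversals:
        if simulated_listener(delta):
            correct_in_row += 1
            if correct_in_row == 2:          # two correct in a row: make it harder
                correct_in_row = 0
                if direction == +1:
                    reversals.append(delta)  # track changed from up to down
                direction = -1
                delta /= factor
        else:                                # one error: make it easier
            correct_in_row = 0
            if direction == -1:
                reversals.append(delta)      # track changed from down to up
            direction = +1
            delta *= factor
    return float(np.exp(np.mean(np.log(reversals[-8:]))))

print(two_down_one_up())
```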
Collapse
|
44
|
Effect of age and hearing loss on auditory stream segregation of speech sounds. Hear Res 2018; 364:118-128. [PMID: 29602593 DOI: 10.1016/j.heares.2018.03.017] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 11/21/2017] [Revised: 02/09/2018] [Accepted: 03/15/2018] [Indexed: 10/17/2022]
Abstract
Segregating and understanding speech in complex environments is a major challenge for hearing-impaired (HI) listeners. It remains unclear to what extent these difficulties are dominated by direct interference, such as simultaneous masking, or by a failure of the mechanisms of stream segregation. This study compared older HI listeners' performance with that of young and older normal-hearing (NH) listeners in stream segregation tasks involving speech sounds. Listeners were presented with sequences of speech tokens, each consisting of a fricative consonant and a voiced vowel (CV). The CV tokens were concatenated into interleaved sequences that alternated in fundamental frequency (F0) and/or simulated vocal tract length (VTL). Each pair of interleaved sequences was preceded by a "word" consisting of two random tokens. The listeners were asked to indicate whether the word was present in the following interleaved sequences. The word, if present, occurred within one of the interleaved sequences, so that performance improved if the listeners were able to perceptually segregate the two sequences. Although HI listeners' identification of the speech tokens in isolation was poorer than that of the NH listeners, HI listeners were generally able to use both F0 and VTL cues to segregate the interleaved sequences. The results suggest that the difficulties experienced by HI listeners in complex acoustic environments cannot be explained by a loss of basic stream segregation abilities.
Collapse
|
45
|
Auditory enhancement under simultaneous masking in normal-hearing and hearing-impaired listeners. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2018; 143:901. [PMID: 29495696 PMCID: PMC5811308 DOI: 10.1121/1.5023687] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/29/2017] [Revised: 01/09/2018] [Accepted: 01/24/2018] [Indexed: 06/08/2023]
Abstract
Auditory enhancement, where a target sound within a masker is rendered more audible by the prior presentation of the masker alone, may play an important role in auditory perception under variable everyday acoustic conditions. Cochlear hearing loss may reduce enhancement effects, potentially contributing to the difficulties experienced by hearing-impaired (HI) individuals in noisy and reverberant environments. However, it remains unknown whether, and by how much, enhancement under simultaneous masking is reduced in HI listeners. Enhancement of a pure tone under simultaneous masking with a multi-tone masker was measured in HI listeners and age-matched normal-hearing (NH) listeners as a function of the spectral notch width of the masker, using stimuli at equal sensation levels as well as at equal sound pressure levels, but with the stimuli presented in noise to the NH listeners to maintain the equal sensation level between listener groups. The results showed that HI listeners exhibited some enhancement in all conditions. However, even when conditions were made as comparable as possible, in terms of effective spectral notch width and presentation level, the enhancement effect in HI listeners under simultaneous masking was reduced relative to that observed in NH listeners.
Collapse
|
46
|
Hearing, Emotion, Amplification, Research, and Training Workshop: Current Understanding of Hearing Loss and Emotion Perception and Priorities for Future Research. Trends Hear 2018; 22:2331216518803215. [PMID: 30270810 PMCID: PMC6168729 DOI: 10.1177/2331216518803215] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/13/2018] [Revised: 08/18/2018] [Accepted: 09/03/2018] [Indexed: 12/19/2022] Open
Abstract
The question of how hearing loss and hearing rehabilitation affect patients' momentary emotional experiences is one that has received little attention but has considerable potential to affect patients' psychosocial function. This article is a product from the Hearing, Emotion, Amplification, Research, and Training workshop, which was convened to develop a consensus document describing research on emotion perception relevant for hearing research. This article outlines conceptual frameworks for the investigation of emotion in hearing research; available subjective, objective, neurophysiologic, and peripheral physiologic data acquisition research methods; the effects of age and hearing loss on emotion perception; potential rehabilitation strategies; priorities for future research; and implications for clinical audiologic rehabilitation. More broadly, this article aims to increase awareness about emotion perception research in audiology and to stimulate additional research on the topic.
Collapse
|
47
|
Abstract
Auditory perception is our main gateway to communication with others via speech and music, and it also plays an important role in alerting and orienting us to new events. This review provides an overview of selected topics pertaining to the perception and neural coding of sound, starting with the first stage of filtering in the cochlea and its profound impact on perception. The next topic, pitch, has been debated for millennia, but recent technical and theoretical developments continue to provide us with new insights. Cochlear filtering and pitch both play key roles in our ability to parse the auditory scene, enabling us to attend to one auditory object or stream while ignoring others. An improved understanding of the basic mechanisms of auditory perception will aid us in the quest to tackle the increasingly important problem of hearing loss in our aging population.
Collapse
|
48
|
Familiar Tonal Context Improves Accuracy of Pitch Interval Perception. Front Psychol 2017; 8:1753. [PMID: 29062295 PMCID: PMC5640898 DOI: 10.3389/fpsyg.2017.01753] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/14/2017] [Accepted: 09/22/2017] [Indexed: 12/03/2022] Open
Abstract
A fundamental feature of everyday music perception is sensitivity to familiar tonal structures such as musical keys. Many studies have suggested that a tonal context can enhance the perception and representation of pitch. Most of these studies have measured response time, which may reflect expectancy as opposed to perceptual accuracy. We instead used a performance-based measure, comparing participants’ ability to discriminate between a “small, in-tune” interval and a “large, mistuned” interval in conditions that involved familiar tonal relations (diatonic, or major, scale notes), unfamiliar tonal relations (whole-tone or mistuned-diatonic scale notes), repetition of a single pitch, or no tonal context. The context was established with a brief sequence of tones in Experiment 1 (melodic context), and a cadence-like two-chord progression in Experiment 2 (harmonic context). In both experiments, performance significantly differed across the context conditions, with a diatonic context providing a significant advantage over no context; however, no correlation with years of musical training was observed. The diatonic tonal context also provided an advantage over the whole-tone scale context condition in Experiment 1 (melodic context), and over the mistuned scale or repetition context conditions in Experiment 2 (harmonic context). However, the relatively small benefit to performance suggests that the main advantage of tonal context may be priming of expected stimuli, rather than enhanced accuracy of pitch interval representation.
Collapse
|
49
|
Musicians do not benefit from differences in fundamental frequency when listening to speech in competing speech backgrounds. Sci Rep 2017; 7:12624. [PMID: 28974705 PMCID: PMC5626707 DOI: 10.1038/s41598-017-12937-9] [Citation(s) in RCA: 43] [Impact Index Per Article: 6.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/20/2017] [Accepted: 09/11/2017] [Indexed: 11/09/2022] Open
Abstract
Recent studies disagree on whether musicians have an advantage over non-musicians in understanding speech in noise. However, it has been suggested that musicians may be able to use differences in fundamental frequency (F0) to better understand target speech in the presence of interfering talkers. Here we studied a relatively large (N = 60) cohort of young adults, equally divided between non-musicians and highly trained musicians, to test whether the musicians were better able to understand speech either in noise or in a two-talker competing speech masker. The target speech and competing speech were presented with either their natural F0 contours or on a monotone F0, and the F0 difference between the target and masker was systematically varied. As expected, speech intelligibility improved with increasing F0 difference between the target and the two-talker masker for both natural and monotone speech. However, no significant intelligibility advantage was observed for musicians over non-musicians in any condition. Although F0 discrimination was significantly better for musicians than for non-musicians, it was not correlated with speech scores. Overall, the results do not support the hypothesis that musical training leads to improved speech intelligibility in complex speech or noise backgrounds.
Collapse
|
50
|
Superoptimal Perceptual Integration Suggests a Place-Based Representation of Pitch at High Frequencies. J Neurosci 2017; 37:9013-9021. [PMID: 28821642 PMCID: PMC5597982 DOI: 10.1523/jneurosci.1507-17.2017] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/31/2017] [Revised: 07/29/2017] [Accepted: 08/05/2017] [Indexed: 11/21/2022] Open
Abstract
Pitch, the perceptual correlate of sound repetition rate or frequency, plays an important role in speech perception, music perception, and listening in complex acoustic environments. Despite the perceptual importance of pitch, the neural mechanisms that underlie it remain poorly understood. Although cortical regions responsive to pitch have been identified, little is known about how pitch information is extracted from the inner ear itself. The two primary theories of peripheral pitch coding involve stimulus-driven spike timing, or phase locking, in the auditory nerve (time code), and the spatial distribution of responses along the length of the cochlear partition (place code). To rule out the use of timing information, we tested pitch discrimination of very high-frequency tones (>8 kHz), well beyond the putative limit of phase locking. We found that high-frequency pure-tone discrimination was poor, but when the tones were combined into a harmonic complex, a dramatic improvement in discrimination ability was observed that exceeded performance predicted by the optimal integration of peripheral information from each of the component frequencies. The results are consistent with the existence of pitch-sensitive neurons that rely only on place-based information from multiple harmonically related components. The results also provide evidence against the common assumption that poor high-frequency pure-tone pitch perception is the result of peripheral neural-coding constraints. The finding that place-based spectral coding is sufficient to elicit complex pitch at high frequencies has important implications for the design of future neural prostheses to restore hearing to deaf individuals. SIGNIFICANCE STATEMENT: The question of how pitch is represented in the ear has been debated for over a century. Two competing theories involve timing information from neural spikes in the auditory nerve (time code) and the spatial distribution of neural activity along the length of the cochlear partition (place code). By using very high-frequency tones unlikely to be coded via time information, we discovered that information from the individual harmonics is combined so efficiently that performance exceeds theoretical predictions based on the optimal integration of information from each harmonic. The findings have important implications for the design of auditory prostheses because they suggest that enhanced spatial resolution alone may be sufficient to restore pitch via such implants.
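The "optimal integration" benchmark referred to above is conventionally the signal-detection prediction that component sensitivities combine as the root sum of squares; a minimal sketch with hypothetical component d′ values follows.

```python
import numpy as np

# Conventional optimal-integration benchmark: if each component alone yields
# sensitivity d'_i, an ideal combiner achieves d'_pred = sqrt(sum_i d'_i ** 2).
def predicted_dprime(component_dprimes):
    return float(np.sqrt(np.sum(np.square(component_dprimes))))

# Hypothetical d' values for discrimination of individual high-frequency tones:
components = [0.4, 0.5, 0.3, 0.45]
print(predicted_dprime(components))   # about 0.84
# "Superoptimal" integration means the d' measured with the full harmonic
# complex exceeds this predicted value.
```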
Collapse
|