1
Borjigin A, Bharadwaj HM. Individual Differences Elucidate the Perceptual Benefits Associated with Robust Temporal Fine-Structure Processing. bioRxiv 2024:2023.09.20.558670. PMID: 37790457; PMCID: PMC10542537; DOI: 10.1101/2023.09.20.558670.
Abstract
The auditory system is unique among sensory systems in its ability to phase lock to and precisely follow very fast cycle-by-cycle fluctuations in the phase of sound-driven cochlear vibrations. Yet, the perceptual role of this temporal fine structure (TFS) code is debated. This fundamental gap is attributable to our inability to experimentally manipulate TFS cues without altering other perceptually relevant cues. Here, we circumvented this limitation by leveraging individual differences across 200 participants to systematically compare variations in TFS sensitivity to performance in a range of speech perception tasks. TFS sensitivity was assessed through detection of interaural time/phase differences, while speech perception was evaluated by word identification under noise interference. Results suggest that greater TFS sensitivity is not associated with greater masking release from fundamental-frequency or spatial cues, but appears to contribute to resilience against the effects of reverberation. We also found that greater TFS sensitivity is associated with faster response times, indicating reduced listening effort. These findings highlight the perceptual significance of TFS coding for everyday hearing.
Affiliation(s)
- Agudemu Borjigin
- Weldon School of Biomedical Engineering, Purdue University, West Lafayette, IN 47907, USA
- Waisman Center, University of Wisconsin - Madison, Madison, WI 53705, USA
- Hari M. Bharadwaj
- Weldon School of Biomedical Engineering, Purdue University, West Lafayette, IN 47907, USA
- Department of Speech, Language, and Hearing Sciences, Purdue University, West Lafayette, IN 47907, USA
- Department of Communication Science and Disorders, University of Pittsburgh, Pittsburgh, PA 15213, USA
2
Saddler MR, McDermott JH. Models optimized for real-world tasks reveal the necessity of precise temporal coding in hearing. bioRxiv 2024:2024.04.21.590435. PMID: 38712054; PMCID: PMC11071365; DOI: 10.1101/2024.04.21.590435.
Abstract
Neurons encode information in the timing of their spikes in addition to their firing rates. Spike timing is particularly precise in the auditory nerve, where action potentials phase lock to sound with sub-millisecond precision, but its behavioral relevance is uncertain. To investigate the role of this temporal coding, we optimized machine learning models to perform real-world hearing tasks with simulated cochlear input. We asked how precise auditory nerve spike timing needed to be to reproduce human behavior. Models with high-fidelity phase locking exhibited more human-like sound localization and speech perception than models without, consistent with an essential role in human hearing. Degrading phase locking produced task-dependent effects, revealing how the use of fine-grained temporal information reflects both ecological task demands and neural implementation constraints. The results link neural coding to perception and clarify conditions in which prostheses that fail to restore high-fidelity temporal coding could in principle restore near-normal hearing.
Affiliation(s)
- Mark R Saddler
- Department of Brain and Cognitive Sciences, MIT, Cambridge, MA, USA
- McGovern Institute for Brain Research, MIT, Cambridge, MA, USA
- Center for Brains, Minds, and Machines, MIT, Cambridge, MA, USA
- Josh H McDermott
- Department of Brain and Cognitive Sciences, MIT, Cambridge, MA, USA
- McGovern Institute for Brain Research, MIT, Cambridge, MA, USA
- Center for Brains, Minds, and Machines, MIT, Cambridge, MA, USA
- Program in Speech and Hearing Biosciences and Technology, Harvard, Cambridge, MA, USA
3
Moore BCJ, Vinay. Assessing mechanisms of frequency discrimination by comparison of different measures over a wide frequency range. Sci Rep 2023; 13:11379. PMID: 37452119; PMCID: PMC10349105; DOI: 10.1038/s41598-023-38600-0.
Abstract
It has been hypothesized that auditory detection of frequency modulation (FM) for low FM rates depends on the use of both temporal (phase locking) and place cues, depending on the carrier frequency, while detection of FM at high rates depends primarily on the use of place cues. To test this, FM detection for 2 and 20 Hz rates was measured over a wide frequency range, 1-10 kHz, including high frequencies for which temporal cues are assumed to be very weak. Performance was measured over the same frequency range for a task involving detection of changes in the temporal fine structure (TFS) of bandpass filtered complex tones, for which performance is assumed to depend primarily on the use of temporal cues. FM thresholds were better for the 2- than for the 20-Hz rate for center frequencies up to 4 kHz, while the reverse was true for higher center frequencies. For both FM rates, the thresholds, expressed as a proportion of the center frequency, were roughly constant for center frequencies from 6 to 10 kHz, consistent with the use of place cues. For the TFS task, thresholds worsened progressively with increasing frequency above 4 kHz, consistent with the weakening of temporal cues.
Affiliation(s)
- Brian C J Moore
- Cambridge Hearing Group, Department of Psychology, University of Cambridge, Cambridge, UK.
- Vinay
- Audiology Group, Department of Neuromedicine and Movement Science, Faculty of Medicine and Health Sciences, Norwegian University of Science and Technology (NTNU), Tungasletta 2, 7491, Trondheim, Norway
4
Thoret E, Ystad S, Kronland-Martinet R. Hearing as adaptive cascaded envelope interpolation. Commun Biol 2023; 6:671. PMID: 37355702; PMCID: PMC10290642; DOI: 10.1038/s42003-023-05040-5.
Abstract
The human auditory system is designed to capture and encode sounds from our surroundings and conspecifics. However, the precise mechanisms by which it adaptively extracts the most important spectro-temporal information from sounds are still not fully understood. Previous auditory models have explained sound encoding at the cochlear level using static filter banks, but this vision is incompatible with the nonlinear and adaptive properties of the auditory system. Here we propose an approach that considers the cochlear processes as envelope interpolations inspired by cochlear physiology. It unifies linear and nonlinear adaptive behaviors into a single comprehensive framework that provides a data-driven understanding of auditory coding. It allows simulating a broad range of psychophysical phenomena from virtual pitches and combination tones to consonance and dissonance of harmonic sounds. It further predicts the properties of the cochlear filters such as frequency selectivity. Here we propose a possible link between the parameters of the model and the density of hair cells on the basilar membrane. Cascaded Envelope Interpolation may lead to improvements in sound processing for hearing aids by providing a non-linear, data-driven, way to preprocessing of acoustic signals consistent with peripheral processes.
Affiliation(s)
- Etienne Thoret
- Aix Marseille Univ, CNRS, UMR7061 PRISM, UMR7020 LIS, Marseille, France.
- Institute of Language, Communication, and the Brain (ILCB), Marseille, France.
- Sølvi Ystad
- CNRS, Aix Marseille Univ, UMR 7061 PRISM, Marseille, France
5
de Cheveigné A. In-channel cancellation: A model of early auditory processing. J Acoust Soc Am 2023; 153:3350. PMID: 37328948; DOI: 10.1121/10.0019752.
Abstract
A model of early auditory processing is proposed in which each peripheral channel is processed by a delay-and-subtract cancellation filter, tuned independently for each channel with a criterion of minimum power. For a channel dominated by a pure tone or a resolved partial of a complex tone, the optimal delay is its period. For a channel responding to harmonically related partials, the optimal delay is their common fundamental period. Each peripheral channel is thus split into two subchannels-one that is cancellation-filtered and the other that is not. Perception can involve either or both, depending on the task. The model is illustrated by applying it to the masking asymmetry between pure tones and narrowband noise: a noise target masked by a tone is more easily detectable than a tone target masked by noise. The model is one of a wider class of models, monaural or binaural, that cancel irrelevant stimulus dimensions to attain invariance to competing sources. Similar to occlusion in the visual domain, cancellation yields sensory evidence that is incomplete, thus requiring Bayesian inference of an internal model of the world along the lines of Helmholtz's doctrine of unconscious inference.
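The delay-and-subtract scheme is concrete enough to sketch. Below is a minimal single-channel illustration in Python (NumPy), not the paper's implementation; the signal, sampling rate, and search range are invented for the example. The filter y[n] = x[n] - x[n - d] is tuned by brute-force search for the delay that minimizes output power; for a channel dominated by a pure tone, that delay converges on the tone's period.

```python
import numpy as np

def cancellation_filter(x, max_delay):
    """Delay-and-subtract filter y[n] = x[n] - x[n - d], with the delay d
    chosen by exhaustive search to minimize output power (the paper's
    minimum-power tuning criterion)."""
    best_d, best_power = 1, np.inf
    for d in range(1, max_delay + 1):
        y = x[d:] - x[:-d]
        power = np.mean(y ** 2)
        if power < best_power:
            best_d, best_power = d, power
    return best_d, x[best_d:] - x[:-best_d]

# A channel dominated by a 500 Hz pure tone at fs = 16 kHz: the optimal
# delay should be the tone's period, 16000 / 500 = 32 samples, and the
# filtered output should be (near) zero.
fs = 16000
t = np.arange(0, 0.1, 1 / fs)
tone = np.sin(2 * np.pi * 500 * t)
d, residual = cancellation_filter(tone, max_delay=40)
```

Splitting each peripheral channel into its two subchannels then amounts to carrying forward both `residual` (cancellation-filtered) and the original `x` (unfiltered).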
Affiliation(s)
- Alain de Cheveigné
- Laboratoire des Systèmes Perceptifs, Unité Mixte de Recherche 8248, Centre National de la Recherche Scientifique, Paris, France
6
Oxenham AJ. Questions and controversies surrounding the perception and neural coding of pitch. Front Neurosci 2023; 16:1074752. PMID: 36699531; PMCID: PMC9868815; DOI: 10.3389/fnins.2022.1074752.
Abstract
Pitch is a fundamental aspect of auditory perception that plays an important role in our ability to understand speech, appreciate music, and attend to one sound while ignoring others. The questions surrounding how pitch is represented in the auditory system, and how our percept relates to the underlying acoustic waveform, have been a topic of inquiry and debate for well over a century. New findings and technological innovations have led to challenges of some long-standing assumptions and have raised new questions. This article reviews some recent developments in the study of pitch coding and perception and focuses on the topic of how pitch information is extracted from peripheral representations based on frequency-to-place mapping (tonotopy), stimulus-driven auditory-nerve spike timing (phase locking), or a combination of both. Although a definitive resolution has proved elusive, the answers to these questions have potentially important implications for mitigating the effects of hearing loss via devices such as cochlear implants.
Affiliation(s)
- Andrew J. Oxenham
- Center for Applied and Translational Sensory Science, University of Minnesota Twin Cities, Minneapolis, MN, United States
- Department of Psychology, University of Minnesota Twin Cities, Minneapolis, MN, United States
7
Mammalian octopus cells are direction selective to frequency sweeps by excitatory synaptic sequence detection. Proc Natl Acad Sci U S A 2022; 119:e2203748119. PMID: 36279465; PMCID: PMC9636937; DOI: 10.1073/pnas.2203748119.
Abstract
Octopus cells are remarkable projection neurons of the mammalian cochlear nucleus, with extremely fast membranes and wide-frequency tuning. They are considered prime examples of coincidence detectors but are poorly characterized in vivo. We discover that octopus cells are selective to frequency sweep direction, a feature that is absent in their auditory nerve inputs. In vivo intracellular recordings reveal that direction selectivity does not derive from across-frequency coincidence detection but hinges on the amplitudes and activation sequence of auditory nerve inputs tuned to clusters of hot spot frequencies. A simple biophysical octopus cell model excited with real nerve spike trains recreates direction selectivity through interaction of intrinsic membrane conductances with the activation sequence of clustered excitatory inputs. We conclude that octopus cells are sequence detectors, sensitive to temporal patterns across cochlear frequency channels. The detection of sequences rather than coincidences is a much simpler but powerful operation to extract temporal information.
8
Guest DR, Oxenham AJ. Human discrimination and modeling of high-frequency complex tones shed light on the neural codes for pitch. PLoS Comput Biol 2022; 18:e1009889. PMID: 35239639; PMCID: PMC8923464; DOI: 10.1371/journal.pcbi.1009889.
Abstract
Accurate pitch perception of harmonic complex tones is widely believed to rely on temporal fine structure information conveyed by the precise phase-locked responses of auditory-nerve fibers. However, accurate pitch perception remains possible even when spectrally resolved harmonics are presented at frequencies beyond the putative limits of neural phase locking, and it is unclear whether residual temporal information, or a coarser rate-place code, underlies this ability. We addressed this question by measuring human pitch discrimination at low and high frequencies for harmonic complex tones, presented either in isolation or in the presence of concurrent complex-tone maskers. We found that concurrent complex-tone maskers impaired performance at both low and high frequencies, although the impairment introduced by adding maskers at high frequencies relative to low frequencies differed between the tested masker types. We then combined simulated auditory-nerve responses to our stimuli with ideal-observer analysis to quantify the extent to which performance was limited by peripheral factors. We found that the worsening of both frequency discrimination and F0 discrimination at high frequencies could be well accounted for (in relative terms) by optimal decoding of all available information at the level of the auditory nerve. A Python package is provided to reproduce these results, and to simulate responses to acoustic stimuli from the three previously published models of the human auditory nerve used in our analyses.
Affiliation(s)
- Daniel R. Guest
- Department of Psychology, University of Minnesota, Minneapolis, Minnesota, United States of America
- Andrew J. Oxenham
- Department of Psychology, University of Minnesota, Minneapolis, Minnesota, United States of America
9
Saddler MR, Gonzalez R, McDermott JH. Deep neural network models reveal interplay of peripheral coding and stimulus statistics in pitch perception. Nat Commun 2021; 12:7278. PMID: 34907158; PMCID: PMC8671597; DOI: 10.1038/s41467-021-27366-6.
Abstract
Perception is thought to be shaped by the environments for which organisms are optimized. These influences are difficult to test in biological organisms but may be revealed by machine perceptual systems optimized under different conditions. We investigated environmental and physiological influences on pitch perception, whose properties are commonly linked to peripheral neural coding limits. We first trained artificial neural networks to estimate fundamental frequency from biologically faithful cochlear representations of natural sounds. The best-performing networks replicated many characteristics of human pitch judgments. To probe the origins of these characteristics, we then optimized networks given altered cochleae or sound statistics. Human-like behavior emerged only when cochleae had high temporal fidelity and when models were optimized for naturalistic sounds. The results suggest pitch perception is critically shaped by the constraints of natural environments in addition to those of the cochlea, illustrating the use of artificial neural networks to reveal underpinnings of behavior.
Affiliation(s)
- Mark R Saddler
- Department of Brain and Cognitive Sciences, MIT, Cambridge, MA, USA.
- McGovern Institute for Brain Research, MIT, Cambridge, MA, USA.
- Center for Brains, Minds and Machines, MIT, Cambridge, MA, USA.
- Ray Gonzalez
- Department of Brain and Cognitive Sciences, MIT, Cambridge, MA, USA
- McGovern Institute for Brain Research, MIT, Cambridge, MA, USA
- Center for Brains, Minds and Machines, MIT, Cambridge, MA, USA
- Josh H McDermott
- Department of Brain and Cognitive Sciences, MIT, Cambridge, MA, USA.
- McGovern Institute for Brain Research, MIT, Cambridge, MA, USA.
- Center for Brains, Minds and Machines, MIT, Cambridge, MA, USA.
- Program in Speech and Hearing Biosciences and Technology, Harvard University, Cambridge, MA, USA.
10
Demany L, Monteiro G, Semal C, Shamma S, Carlyon RP. The perception of octave pitch affinity and harmonic fusion have a common origin. Hear Res 2021; 404:108213. PMID: 33662686; PMCID: PMC7614450; DOI: 10.1016/j.heares.2021.108213.
Abstract
Musicians say that the pitches of tones with a frequency ratio of 2:1 (one octave) have a distinctive affinity, even if the tones do not have common spectral components. It has been suggested, however, that this affinity judgment has no biological basis and originates instead from an acculturation process ‒ the learning of musical rules unrelated to auditory physiology. We measured, in young amateur musicians, the perceptual detectability of octave mistunings for tones presented alternately (melodic condition) or simultaneously (harmonic condition). In the melodic condition, mistuning was detectable only by means of explicit pitch comparisons. In the harmonic condition, listeners could use a different and more efficient perceptual cue: in the absence of mistuning, the tones fused into a single sound percept; mistunings decreased fusion. Performance was globally better in the harmonic condition, in line with the hypothesis that listeners used a fusion cue in this condition; this hypothesis was also supported by results showing that an illusory simultaneity of the tones was much less advantageous than a real simultaneity. In the two conditions, mistuning detection was generally better for octave compressions than for octave stretchings. This asymmetry varied across listeners, but crucially the listener-specific asymmetries observed in the two conditions were highly correlated. Thus, the perception of the melodic octave appeared to be closely linked to the phenomenon of harmonic fusion. As harmonic fusion is thought to be determined by biological factors rather than factors related to musical culture or training, we argue that octave pitch affinity also has, at least in part, a biological basis.
Affiliation(s)
- Laurent Demany
- Institut de Neurosciences Cognitives et Intégratives d'Aquitaine, CNRS, EPHE, and Université de Bordeaux, Bordeaux, France.
- Guilherme Monteiro
- Institut de Neurosciences Cognitives et Intégratives d'Aquitaine, CNRS, EPHE, and Université de Bordeaux, Bordeaux, France
- Catherine Semal
- Institut de Neurosciences Cognitives et Intégratives d'Aquitaine, CNRS, EPHE, and Université de Bordeaux, Bordeaux, France; Bordeaux INP, Bordeaux, France.
- Shihab Shamma
- Institute for Systems Research, University of Maryland, College Park, MD, United States; Département d'Etudes Cognitives, Ecole Normale Supérieure, Paris, France.
- Robert P Carlyon
- Cambridge Hearing Group, MRC Cognition and Brain Sciences Unit, Cambridge, United Kingdom.
11
de Cheveigné A. Harmonic Cancellation-A Fundamental of Auditory Scene Analysis. Trends Hear 2021; 25:23312165211041422. PMID: 34698574; PMCID: PMC8552394; DOI: 10.1177/23312165211041422.
Abstract
This paper reviews the hypothesis of harmonic cancellation according to which an interfering sound is suppressed or canceled on the basis of its harmonicity (or periodicity in the time domain) for the purpose of Auditory Scene Analysis. It defines the concept, discusses theoretical arguments in its favor, and reviews experimental results that support it, or not. If correct, the hypothesis may draw on time-domain processing of temporally accurate neural representations within the brainstem, as required also by the classic equalization-cancellation model of binaural unmasking. The hypothesis predicts that a target sound corrupted by interference will be easier to hear if the interference is harmonic than inharmonic, all else being equal. This prediction is borne out in a number of behavioral studies, but not all. The paper reviews those results, with the aim to understand the inconsistencies and come up with a reliable conclusion for, or against, the hypothesis of harmonic cancellation within the auditory system.
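The core prediction is easy to demonstrate numerically. The sketch below is my illustration, not code from the paper; the F0s, target level, and sampling rate are arbitrary choices. Canceling a harmonic interferer with a comb filter delayed by one fundamental period nulls every harmonic at once, while an inharmonic target passes through.

```python
import numpy as np

fs = 16000
t = np.arange(0, 0.2, 1 / fs)

# Harmonic interferer (F0 = 200 Hz) and a weak inharmonic target at 530 Hz
interferer = sum(np.sin(2 * np.pi * 200 * k * t) for k in (1, 2, 3, 4))
target = 0.1 * np.sin(2 * np.pi * 530 * t)
mixture = target + interferer

# Delay-and-subtract by the interferer's fundamental period (80 samples)
# cancels all of its harmonics simultaneously
d = fs // 200
canceled = mixture[d:] - mixture[:-d]

# The interferer component is wiped out; the target survives (with a
# frequency-dependent gain imposed by the comb filter)
residual_interferer = interferer[d:] - interferer[:-d]
residual_target = target[d:] - target[:-d]
```

The same operation leaves an inharmonic interferer largely intact, which is why the hypothesis predicts better unmasking for harmonic than inharmonic maskers, all else being equal.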
Affiliation(s)
- Alain de Cheveigné
- Laboratoire des systèmes perceptifs, CNRS, Paris, France
- Département d'études cognitives, École normale supérieure, PSL University, Paris, France
- UCL Ear Institute, London, UK
12
Mehta AH, Oxenham AJ. Effect of lowest harmonic rank on fundamental-frequency difference limens varies with fundamental frequency. J Acoust Soc Am 2020; 147:2314. PMID: 32359332; PMCID: PMC7166120; DOI: 10.1121/10.0001092.
Abstract
This study investigated the relationship between fundamental frequency difference limens (F0DLs) and the lowest harmonic number present over a wide range of F0s (30-2000 Hz) for 12-component harmonic complex tones that were presented in either sine or random phase. For fundamental frequencies (F0s) between 100 and 400 Hz, a transition from low (∼1%) to high (∼5%) F0DLs occurred as the lowest harmonic number increased from about seven to ten, in line with earlier studies. At lower and higher F0s, the transition between low and high F0DLs occurred at lower harmonic numbers. The worsening performance at low F0s was reasonably well predicted by the expected decrease in spectral resolution below about 500 Hz. At higher F0s, the degradation in performance at lower harmonic numbers could not be predicted by changes in spectral resolution but remained relatively good (<2%-3%) in some conditions, even when all harmonics were above 8 kHz, confirming that F0 can be extracted from harmonics even when temporal envelope or fine-structure cues are weak or absent.
Affiliation(s)
- Anahita H Mehta
- Department of Psychology, University of Minnesota, 75 East River Parkway, Minneapolis, Minnesota 55455, USA
- Andrew J Oxenham
- Department of Psychology, University of Minnesota, 75 East River Parkway, Minneapolis, Minnesota 55455, USA
13
Dahlbom DA, Braasch J. How to pick a peak: Pitch and peak shifting in temporal models of pitch perception. J Acoust Soc Am 2020; 147:2713. PMID: 32359285; DOI: 10.1121/10.0001134.
Abstract
The standard autocorrelation model of pitch perception posits that the pitch of a stimulus can be predicted from the first major peak of a summary autocorrelation function (SACF) after the zero-delay peak. Models based on this theory are capable of predicting a wide range of pitch phenomena. There are, however, a number of cases where the approach fails. Two examples are noise edge pitch (NEP) and the pitch induced by the mistuning of a single component of an otherwise harmonic stimulus. Hartmann, Cariani, and Colburn [(2019). J. Acoust. Soc. Am. 145, 1993-2008] recently proposed the use of multiple SACF peaks in the estimation process. This enables prediction of the NEP but suppresses the shift associated with a mistuned harmonic. A functional model is developed that can predict both of these pitch phenomena. The multiple-peak framework is extended with a non-standard peak-selection method that associates a delay time to a given peak in a manner that takes into account the entire shape of the bump surrounding the peak. This effectively shifts the peak location slightly for non-harmonic stimuli. A possible physiological mechanism that could induce such peak shifting is discussed, and the model is tested against existing psychophysical data.
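The baseline strategy the authors extend, picking the first major autocorrelation peak after zero delay, can be sketched in a few lines. This is a simplified single-channel stand-in for an SACF (no cochlear filter bank, no peak-shape weighting, and the largest peak in a plausible lag range rather than explicit first-major-peak logic); the stimulus and search limits are chosen for illustration.

```python
import numpy as np

def acf_pitch(x, fs, fmin=50.0, fmax=1000.0):
    """Estimate pitch from the dominant autocorrelation peak after zero lag.
    Simplified stand-in for a summary autocorrelation function: a single
    channel, searched over lags corresponding to fmin..fmax."""
    r = np.correlate(x, x, mode="full")[len(x) - 1:]  # lags >= 0
    lo, hi = int(fs / fmax), int(fs / fmin)
    lag = lo + np.argmax(r[lo:hi])
    return fs / lag

fs = 16000
t = np.arange(0, 0.1, 1 / fs)
# Missing-fundamental complex: harmonics 3-5 of 200 Hz, no 200 Hz component
x = sum(np.sin(2 * np.pi * 200 * k * t) for k in (3, 4, 5))
f0 = acf_pitch(x, fs)
```

A peak picker like this one reads off only the location of the maximum; the paper's point is that for non-harmonic stimuli the whole shape of the bump around the peak matters, which this sketch deliberately ignores.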
Affiliation(s)
- David A Dahlbom
- School of Architecture, Rensselaer Polytechnic Institute, 110 8th Street, Troy, New York 12180, USA
- Jonas Braasch
- School of Architecture, Rensselaer Polytechnic Institute, 110 8th Street, Troy, New York 12180, USA
14
Hartmann WM, Cariani PA, Colburn HS. Noise edge pitch and models of pitch perception. J Acoust Soc Am 2019; 145:1993. PMID: 31046377; PMCID: PMC7112715; DOI: 10.1121/1.5093546.
Abstract
Monaural noise edge pitch (NEP) is evoked by a broadband noise with a sharp falling edge in the power spectrum. The pitch is heard near the spectral edge frequency but shifted slightly into the frequency region of the noise. Thus, the pitch of a lowpass (LP) noise is matched by a pure tone typically 2%-5% below the edge, whereas the pitch of highpass (HP) noise is matched a comparable amount above the edge. Musically trained listeners can recognize musical intervals between NEPs. The pitches can be understood from a temporal pattern-matching model of pitch perception based on the peaks of a simplified autocorrelation function. The pitch shifts arise from limits on the autocorrelation window duration. An alternative place-theory approach explains the pitch shifts as the result of lateral inhibition. Psychophysical experiments using edge frequencies of 100 Hz and below find that LP-noise pitches exist but HP-noise pitches do not. The result is consistent with a temporal analysis in tonotopic regions outside the noise band. LP and HP experiments with high-frequency edges find that pitch tends to disappear as the edge frequency approaches 5000 Hz, as expected from a timing theory, though exceptional listeners can go an octave higher.
Affiliation(s)
- William M Hartmann
- Department of Physics and Astronomy, Michigan State University, 567 Wilson Road, East Lansing, Michigan 48824, USA
- Peter A Cariani
- Hearing Research Center, Department of Biomedical Engineering, Boston University, 44 Cummington Street, Boston, Massachusetts 02115, USA
- H Steven Colburn
- Hearing Research Center, Department of Biomedical Engineering, Boston University, 44 Cummington Street, Boston, Massachusetts 02115, USA
15
Graves JE, Oxenham AJ. Pitch discrimination with mixtures of three concurrent harmonic complexes. J Acoust Soc Am 2019; 145:2072. PMID: 31046318; PMCID: PMC6469983; DOI: 10.1121/1.5096639.
Abstract
In natural listening contexts, especially in music, it is common to hear three or more simultaneous pitches, but few empirical or theoretical studies have addressed how this is achieved. Place and pattern-recognition theories of pitch require at least some harmonics to be spectrally resolved for pitch to be extracted, but it is unclear how often such conditions exist when multiple complex tones are presented together. In three behavioral experiments, mixtures of three concurrent complexes were filtered into a single bandpass spectral region, and the relationship between the fundamental frequencies and spectral region was varied in order to manipulate the extent to which harmonics were resolved either before or after mixing. In experiment 1, listeners discriminated major from minor triads (a difference of 1 semitone in one note of the triad). In experiments 2 and 3, listeners compared the pitch of a probe tone with that of a subsequent target, embedded within two other tones. All three experiments demonstrated above-chance performance, even in conditions where the combinations of harmonic components were unlikely to be resolved after mixing, suggesting that fully resolved harmonics may not be necessary to extract the pitch from multiple simultaneous complexes.
Affiliation(s)
- Jackson E Graves
- Department of Psychology, University of Minnesota, 75 East River Parkway, Minneapolis, Minnesota 55455, USA
- Andrew J Oxenham
- Department of Psychology, University of Minnesota, 75 East River Parkway, Minneapolis, Minnesota 55455, USA
16
Barzelay O, Furst M, Barak O. A New Approach to Model Pitch Perception Using Sparse Coding. PLoS Comput Biol 2017; 13:e1005338. PMID: 28099436; PMCID: PMC5308863; DOI: 10.1371/journal.pcbi.1005338.
Abstract
Our acoustical environment abounds with repetitive sounds, some of which are related to pitch perception. It is still unknown how the auditory system, in processing these sounds, relates a physical stimulus to its percept. Since, in mammals, all auditory stimuli are conveyed into the nervous system through the auditory nerve (AN) fibers, a model should explain the perception of pitch as a function of this particular input. However, pitch perception is invariant to certain features of the physical stimulus. For example, a missing-fundamental stimulus with resolved or unresolved harmonics, or low- and high-level stimuli with the same spectral content, all give rise to the same percept of pitch. In contrast, the AN representations of these different stimuli are not invariant: due to saturation and nonlinearity of both cochlear and inner hair cell responses, the differences are enhanced by the AN fibers. It is therefore difficult to explain how the pitch percept arises from the activity of the AN fibers. We introduce a novel approach for extracting pitch cues from the AN population activity for an arbitrary stimulus. The method is based on a technique known as sparse coding (SC): pitch cues are represented by a few spatiotemporal atoms (templates) chosen from among a large set of possible ones (a dictionary). The amount of activity of each atom is represented by a non-zero coefficient, analogous to an active neuron. This technique has been successfully applied to other modalities, particularly vision. The model is composed of a cochlear model, an SC processing unit, and a harmonic sieve. We show that the model copes with a range of pitch phenomena: resolved and unresolved harmonics, missing-fundamental pitches, stimuli at both high and low levels, iterated rippled noises, and recorded musical instruments.
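The core of the sparse-coding step can be illustrated with a toy matching-pursuit sketch: a dictionary of harmonic-comb atoms, a greedy search for the few atoms that best explain the input, and a pitch read off the winning atom. This is a minimal sketch of the general technique, not the authors' cochlear-model implementation; the spectral grid, F0 range, and harmonic counts are invented for the demo.

```python
import numpy as np

def matching_pursuit(x, D, n_iter=3):
    """Greedy sparse coding: represent x with a few dictionary atoms.

    D: (n_features, n_atoms) matrix of unit-norm atoms.
    Returns a sparse coefficient vector (mostly zeros).
    """
    residual = x.astype(float).copy()
    coeffs = np.zeros(D.shape[1])
    for _ in range(n_iter):
        corr = D.T @ residual            # correlation with every atom
        k = np.argmax(np.abs(corr))      # best-matching atom
        coeffs[k] += corr[k]
        residual -= corr[k] * D[:, k]    # subtract its contribution
    return coeffs

# Toy "harmonic sieve" dictionary: each atom is the comb of harmonics of
# one candidate F0, sampled on a 0-2000 Hz spectral grid.
freqs = np.arange(0, 2000, 10.0)
f0s = np.arange(80, 400, 5.0)
D = np.zeros((freqs.size, f0s.size))
for j, f0 in enumerate(f0s):
    for h in range(1, 9):
        D[np.argmin(np.abs(freqs - h * f0)), j] = 1.0
D /= np.linalg.norm(D, axis=0)

# Missing-fundamental input: harmonics 3-6 of 200 Hz, no energy at 200 Hz.
x = np.zeros(freqs.size)
for h in (3, 4, 5, 6):
    x[np.argmin(np.abs(freqs - h * 200.0))] = 1.0

coeffs = matching_pursuit(x, D)
print("estimated F0:", f0s[np.argmax(np.abs(coeffs))])  # → 200.0
```

The sparse representation recovers the missing fundamental because the 200 Hz comb atom explains all four components at once, while competing F0 candidates match only a subset.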
Affiliation(s)
- Oded Barzelay
- School of Electrical Engineering, Faculty of Engineering, Tel-Aviv University, Tel Aviv, Israel
- Rappaport Faculty of Medicine, Network Biology Research Laboratories, Technion, Haifa, Israel
- Miriam Furst
- School of Electrical Engineering, Faculty of Engineering, Tel-Aviv University, Tel Aviv, Israel
- Omri Barak
- Rappaport Faculty of Medicine, Network Biology Research Laboratories, Technion, Haifa, Israel
17
Huang C, Rinzel J. A Neuronal Network Model for Pitch Selectivity and Representation. Front Comput Neurosci 2016; 10:57. [PMID: 27378900 PMCID: PMC4910526 DOI: 10.3389/fncom.2016.00057] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1]
Abstract
Pitch is a perceptual correlate of periodicity: sounds with distinct spectra can elicit the same pitch. Despite the importance of pitch perception, understanding its cellular mechanism remains a major challenge, and a mechanistic model of pitch is lacking. A multi-stage neuronal network model is developed for pitch frequency estimation using biophysically based, high-resolution coincidence-detector neurons. The neuronal units respond only to highly coincident input among convergent auditory nerve fibers across frequency channels. Their selectivity for only very fast rising slopes of convergent input enables these slope detectors to distinguish the most prominent coincidences in multi-peaked input time courses. Pitch can then be estimated from the first-order interspike intervals of the slope detectors. The regular firing patterns of the slope-detector neurons are similar for sounds sharing the same pitch, despite their distinct timbres. The decoded pitch strengths also correlate well with the salience of pitch perception reported by human listeners. Therefore, our model can serve as a neural representation for pitch. The model performs successfully in estimating the pitch of missing-fundamental complexes and reproduces the variation of pitch with the frequency shift of inharmonic complexes. It also accounts for the phase sensitivity of pitch perception for Schroeder-phase, alternating-phase, and random-phase relationships. Moreover, the model can be applied to stochastic stimuli such as iterated rippled noise and accounts for their multiple pitch percepts.
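The final readout stage described here — estimating pitch from first-order interspike intervals — can be sketched in a few lines: collect the intervals between successive spikes, histogram them, and take the reciprocal of the most common interval. This is a generic stand-in for reading out a slope-detector spike train, not the published model; the pitch range and bin count are illustrative choices.

```python
import numpy as np

def pitch_from_spikes(spike_times, fmin=80.0, fmax=500.0):
    """Estimate pitch (Hz) from first-order interspike intervals (ISIs)."""
    isis = np.diff(np.sort(spike_times))
    # keep only ISIs whose reciprocal lies in a plausible pitch range
    isis = isis[(isis > 1.0 / fmax) & (isis < 1.0 / fmin)]
    hist, edges = np.histogram(isis, bins=200, range=(1.0 / fmax, 1.0 / fmin))
    k = np.argmax(hist)
    mode = 0.5 * (edges[k] + edges[k + 1])   # center of the modal ISI bin
    return 1.0 / mode

# Toy spike train locked to a 200 Hz periodicity, with slight jitter.
rng = np.random.default_rng(0)
spikes = np.arange(0, 0.5, 1.0 / 200.0) + rng.normal(0, 1e-4, 100)
print(pitch_from_spikes(spikes))  # close to 200 Hz
```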
Affiliation(s)
- Chengcheng Huang
- Department of Mathematics, Courant Institute of Mathematical Sciences, New York University, New York, NY, USA
- Department of Mathematics, University of Pittsburgh, Pittsburgh, PA, USA
- John Rinzel
- Department of Mathematics, Courant Institute of Mathematical Sciences, New York University, New York, NY, USA
- Center for Neural Science, New York University, New York, NY, USA
18
Abstract
Robust representations of sounds with a complex spectrotemporal structure are thought to emerge in hierarchically organized auditory cortex, but the computational advantage of this hierarchy remains unknown. Here, we used computational models to study how such hierarchical structures affect temporal binding in neural networks. We equipped individual units in different types of feedforward networks with local memory mechanisms storing recent inputs and observed how this affected the ability of the networks to process stimuli context dependently. Our findings illustrate that these local memories stack up in hierarchical structures and hence allow network units to exhibit selectivity to spectral sequences longer than the time spans of the local memories. We also illustrate that short-term synaptic plasticity is a potential local memory mechanism within the auditory cortex, and we show that it can bring robustness to context dependence against variation in the temporal rate of stimuli, while introducing nonlinearities to response profiles that are not well captured by standard linear spectrotemporal receptive field models. The results therefore indicate that short-term synaptic plasticity might provide hierarchically structured auditory cortex with computational capabilities important for robust representations of spectrotemporal patterns.
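The local memory mechanism highlighted here — short-term synaptic plasticity — is commonly modeled with Tsodyks-Markram-style resource dynamics: each spike consumes a fraction of the available synaptic resources, which then recover over time, so the amount released carries a memory of recent input history. The sketch below uses the standard depression model with illustrative parameters; it is not the paper's exact implementation.

```python
import numpy as np

def stp_response(spike_times, U=0.4, tau_rec=0.5):
    """Released synaptic efficacy per spike under short-term depression.

    Each spike uses a fraction U of the available resources x; resources
    recover toward 1 with time constant tau_rec (seconds). The released
    amount U*x thus acts as a local memory of recent inputs.
    """
    x, last_t, released = 1.0, None, []
    for t in spike_times:
        if last_t is not None:
            # resources recover exponentially between spikes
            x = 1.0 - (1.0 - x) * np.exp(-(t - last_t) / tau_rec)
        r = U * x
        released.append(r)
        x -= r
        last_t = t
    return released

# A fast spike train depresses the synapse; a slow train barely does.
fast = stp_response(np.arange(0, 0.5, 0.02))   # 50 Hz input
slow = stp_response(np.arange(0, 5.0, 0.5))    # 2 Hz input
print(fast[0], fast[-1] < fast[0], slow[-1] > fast[-1])  # → 0.4 True True
```

Because the steady-state release depends on the input rate, a unit driven through such a synapse responds context-dependently, which is the property the study exploits in hierarchical networks.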
Affiliation(s)
- Johan Westö
- Department of Neuroscience and Biomedical Engineering, Aalto University, FI-00076 Espoo, Finland
- Patrick J. C. May
- Special Laboratory Non-Invasive Brain Imaging, Leibniz Institute for Neurobiology, D-39118 Magdeburg, Germany
- Hannu Tiitinen
- Department of Neuroscience and Biomedical Engineering, Aalto University, FI-00076 Espoo, Finland
19
Abstract
Sensorineural hearing loss is the most common type of hearing impairment worldwide. It arises as a consequence of damage to the cochlea or auditory nerve, and several structures are often affected simultaneously. There are many causes, including genetic mutations affecting the structures of the inner ear, and environmental insults such as noise, ototoxic substances, and hypoxia. The prevalence increases dramatically with age. Clinical diagnosis is most commonly accomplished by measuring detection thresholds and comparing these to normative values to determine the degree of hearing loss. In addition to causing insensitivity to weak sounds, sensorineural hearing loss has a number of adverse perceptual consequences, including loudness recruitment, poor perception of pitch and auditory space, and difficulty understanding speech, particularly in the presence of background noise. The condition is usually incurable; treatment focuses on restoring the audibility of sounds made inaudible by hearing loss using either hearing aids or cochlear implants.
Affiliation(s)
- Kathryn Hopkins
- School of Psychological Sciences, University of Manchester, Manchester, UK.
20
Kan A, Litovsky RY. Binaural hearing with electrical stimulation. Hear Res 2014; 322:127-37. [PMID: 25193553 DOI: 10.1016/j.heares.2014.08.005] [Citation(s) in RCA: 88] [Impact Index Per Article: 8.8]
Abstract
Bilateral cochlear implantation is becoming a standard of care in many clinics. While much benefit has been shown through bilateral implantation, patients who have bilateral cochlear implants (CIs) still do not perform as well as normal hearing listeners in sound localization and understanding speech in noisy environments. This difference in performance can arise from a number of different factors, including hardware and engineering, surgical precision, and pathology of the auditory system in deaf persons. While surgical precision and individual pathology are factors that are beyond careful control, improvements can be made in the areas of clinical practice and the engineering of binaural speech processors. These improvements should be grounded in a good understanding of the sensitivities of bilateral CI patients to the acoustic binaural cues that are important to normal hearing listeners for sound localization and understanding speech in noise. To this end, we review the current state-of-the-art in the understanding of the sensitivities of bilateral CI patients to binaural cues in electric hearing, and highlight the important issues and challenges as they relate to clinical practice and the development of new binaural processing strategies. This article is part of a Special Issue entitled "Lasker Award".
Affiliation(s)
- Alan Kan
- University of Wisconsin-Madison Waisman Center, 1500 Highland Ave, Madison WI 53705, USA.
- Ruth Y Litovsky
- University of Wisconsin-Madison Waisman Center, 1500 Highland Ave, Madison WI 53705, USA.
21
Etchemendy PE, Eguia MC, Mesz B. Principal pitch of frequency-modulated tones with asymmetrical modulation waveform: a comparison of models. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2014; 135:1344-1355. [PMID: 24606273 DOI: 10.1121/1.4863649] [Citation(s) in RCA: 0] [Impact Index Per Article: 0]
Abstract
In this work, the overall perceived pitch (principal pitch) of pure tones modulated in frequency with an asymmetric waveform is studied. The dependence of the principal pitch on the degree of asymmetric modulation was obtained from a psychophysical experiment. The modulation waveform consisted of a flat portion of constant frequency and two linear segments forming a peak. Consistent with previous results, significant pitch shifts with respect to the time-averaged geometric mean were observed. The direction of the shifts was always toward the flat portion of the modulation. The results from the psychophysical experiment, along with those obtained from previously reported studies, were compared with the predictions of six models of pitch perception proposed in the literature. Even though no single model was able to predict the perceived pitch accurately for all experiments, two models gave robust predictions that were within the range of acceptable tuning of modulated tones for almost all cases. Both models point to the existence of an underlying "stability sensitive" mechanism for the computation of pitch that gives more weight to the portions of the stimulus where the frequency is changing more slowly.
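One crude way to realize such a "stability sensitive" mechanism is a geometric mean of the instantaneous frequency weighted inversely by |df/dt|, so slowly changing portions dominate. This is an illustrative sketch of the idea, not one of the six models compared in the study; the waveform and all parameters are invented for the demo.

```python
import numpy as np

def principal_pitch(f_inst, dt, eps=1e-9):
    """Weighted geometric mean of instantaneous frequency, weighting each
    sample by 1/|df/dt| so that stable portions dominate the estimate."""
    dfdt = np.gradient(f_inst, dt)
    w = 1.0 / (np.abs(dfdt) + eps)
    return np.exp(np.sum(w * np.log(f_inst)) / np.sum(w))

# Asymmetric FM: flat at 400 Hz for most of the cycle, brief peak to 500 Hz.
dt = 1e-4
t = np.arange(0, 0.1, dt)
f = np.where(t < 0.08, 400.0,
             400.0 + 100.0 * np.sin(np.pi * (t - 0.08) / 0.02))
gm = np.exp(np.mean(np.log(f)))          # time-averaged geometric mean
pp = principal_pitch(f, dt)
print(gm, pp)  # the weighted estimate is shifted toward the flat 400 Hz portion
```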
Affiliation(s)
- Pablo E Etchemendy
- Laboratorio de Acústica y Percepción Sonora, Universidad Nacional de Quilmes, R. S. Pena 352 Bernal, B1876BXD Buenos Aires, Argentina
- Manuel C Eguia
- Laboratorio de Acústica y Percepción Sonora, Universidad Nacional de Quilmes, R. S. Pena 352 Bernal, B1876BXD Buenos Aires, Argentina
- Bruno Mesz
- Laboratorio de Acústica y Percepción Sonora, Universidad Nacional de Quilmes, R. S. Pena 352 Bernal, B1876BXD Buenos Aires, Argentina
22
Jackson HM, Moore BCJ. The role of excitation-pattern, temporal-fine-structure, and envelope cues in the discrimination of complex tones. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2014; 135:1356-1370. [PMID: 24606274 DOI: 10.1121/1.4864306] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.6]
Abstract
The discrimination of bandpass-filtered harmonic (H) from inharmonic (I) tones (produced by shifting all components of the H tones upwards by a fixed amount in Hz) could be based on shifts in the pattern of ripples in the excitation pattern (EP) or on changes in the temporal fine structure evoked by the tones. The predictions of two computational EP models were compared with measured performance. One model used auditory filters with bandwidth values specified by Glasberg and Moore [(1990). Hear. Res. 47, 103-138] and one used filters that were twice as sharp. Stimulus variables were passband width, fundamental frequency, harmonic rank (N) of the lowest component within the passband, component phase (cosine or random), signal-to-noise ratio (SNR), and random perturbation in level of each component in the tones. While the EP models correctly predicted the lack of an effect of phase and some of the trends in the data as a function of fundamental frequency and N, neither model predicted the worsening in performance with increasing passband width or the lack of effect of SNR and level perturbation. It is concluded that discrimination of the H and I tones is not based solely on the use of EP cues.
Affiliation(s)
- Helen M Jackson
- Department of Experimental Psychology, University of Cambridge, Downing Street, Cambridge CB2 3EB, England
- Brian C J Moore
- Department of Experimental Psychology, University of Cambridge, Downing Street, Cambridge CB2 3EB, England
23
Lerud KD, Almonte FV, Kim JC, Large EW. Mode-locking neurodynamics predict human auditory brainstem responses to musical intervals. Hear Res 2013; 308:41-9. [PMID: 24091182 DOI: 10.1016/j.heares.2013.09.010] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.9]
Abstract
The auditory nervous system is highly nonlinear. Some nonlinear responses arise through active processes in the cochlea, while others may arise in neural populations of the cochlear nucleus, inferior colliculus and higher auditory areas. In humans, auditory brainstem recordings reveal nonlinear population responses to combinations of pure tones, and to musical intervals composed of complex tones. Yet the biophysical origin of central auditory nonlinearities, their signal processing properties, and their relationship to auditory perception remain largely unknown. Both stimulus components and nonlinear resonances are well represented in auditory brainstem nuclei due to neural phase-locking. Recently mode-locking, a generalization of phase-locking that implies an intrinsically nonlinear processing of sound, has been observed in mammalian auditory brainstem nuclei. Here we show that a canonical model of mode-locked neural oscillation predicts the complex nonlinear population responses to musical intervals that have been observed in the human brainstem. The model makes predictions about auditory signal processing and perception that are different from traditional delay-based models, and may provide insight into the nature of auditory population responses. We anticipate that the application of dynamical systems analysis will provide the starting point for generic models of auditory population dynamics, and lead to a deeper understanding of nonlinear auditory signal processing possibly arising in excitatory-inhibitory networks of the central auditory nervous system. This approach has the potential to link neural dynamics with the perception of pitch, music, and speech, and lead to dynamical models of auditory system development.
Affiliation(s)
- Karl D Lerud
- University of Connecticut, Department of Psychology, 406 Babbidge Road, Storrs, CT 06269-1020, USA
- Felix V Almonte
- University of Connecticut, Department of Psychology, 406 Babbidge Road, Storrs, CT 06269-1020, USA
- Ji Chul Kim
- University of Connecticut, Department of Psychology, 406 Babbidge Road, Storrs, CT 06269-1020, USA
- Edward W Large
- University of Connecticut, Department of Psychology, 406 Babbidge Road, Storrs, CT 06269-1020, USA.
24
Jackson HM, Moore BCJ. The dominant region for the pitch of complex tones with low fundamental frequencies. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2013; 134:1193-1204. [PMID: 23927118 DOI: 10.1121/1.4812754] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3]
Abstract
The dominant region for pitch for complex tones with low fundamental frequency (F0) was investigated. Thresholds for detection of a change in F0 (F0DLs) were measured for a group of harmonics (group B) embedded in a group of fixed non-overlapping harmonics (group A) with the same mean F0. It was assumed that F0DLs would be smallest when the harmonics in group B fell in the dominant region. The rank of the lowest harmonic in group B, N, was varied from 1 to 15. When all components had the same level, F0DLs increased with increasing N, but the increase started at a lower value of N for F0 = 200 Hz than for F0 = 50 or 100 Hz, the opposite of what would be expected if the dominant region corresponds to resolved harmonics. When the component levels followed an equal-loudness contour, F0DLs for F0 = 50 Hz were lowest for N = 1, but overall performance was much worse than for equal-level components, suggesting that the lowest harmonics were masking information from the higher harmonics.
Affiliation(s)
- Helen M Jackson
- Department of Experimental Psychology, University of Cambridge, Downing Street, Cambridge CB2 3EB, England
25
Deeks JM, Gockel HE, Carlyon RP. Further examination of complex pitch perception in the absence of a place-rate match. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2013; 133:377-388. [PMID: 23297910 DOI: 10.1121/1.4770254] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7]
Abstract
Oxenham et al. [Proc. Nat. Acad. Sci. 101, 1421-1425 (2004)] reported that listeners cannot derive a "missing fundamental" from three transposed tones having high carrier frequencies and harmonically related low-frequency modulators. This finding was attributed to complex pitch perception requiring correct tonotopic representation but could have been due to the very high modulator rate difference limens (DLs) observed for individual transposed tones. Experiments 1 and 2 showed that much lower DLs could be obtained for bandpass-filtered pulse trains than for transposed tones with repetition rates of 100 or 300 pps; however, DLs were still larger than for low-frequency pure tones. Experiment 3 presented three pulse trains filtered between 1375 and 1875, 3900 and 5400, and 7800 and 10 800 Hz simultaneously with a pink-noise background. Listeners could not compare the "missing fundamental" of a stimulus in which the pulse rates were, respectively, 150, 225, and 300 pps, to one where all pulse trains had a rate of 75 pps, even though they could compare a 150 + 225 + 300 Hz complex tone to a 75-Hz pure tone. Hence although filtered pulse trains can produce fairly good pitch perception of simple stimuli having low repetition rates and high-frequency spectral content, no evidence that such stimuli enable complex pitch perception in the absence of a place-rate match was found.
Affiliation(s)
- John M Deeks
- MRC Cognition and Brain Sciences Unit, 15 Chaucer Road, Cambridge CB2 7EF, United Kingdom.
26
Abstract
Tonal relationships are foundational in music, providing the basis upon which musical structures, such as melodies, are constructed and perceived. A recent dynamic theory of musical tonality predicts that networks of auditory neurons resonate nonlinearly to musical stimuli. Nonlinear resonance leads to stability and attraction relationships among neural frequencies, and these neural dynamics give rise to the perception of relationships among tones that we collectively refer to as tonal cognition. Because this model describes the dynamics of neural populations, it makes specific predictions about human auditory neurophysiology. Here, we show how predictions about the auditory brainstem response (ABR) are derived from the model. To illustrate, we derive a prediction about population responses to musical intervals that has been observed in the human brainstem. Our modeled ABR shows qualitative agreement with important features of the human ABR. This provides a source of evidence that fundamental principles of auditory neurodynamics might underlie the perception of tonal relationships, and forces reevaluation of the role of learning and enculturation in tonal cognition.
Affiliation(s)
- Edward W Large
- Center for Complex Systems and Brain Sciences, Florida Atlantic University, Boca Raton, Florida 33431, USA.
27
Moore BCJ, Vickers DA, Mehta A. The effects of age on temporal fine structure sensitivity in monaural and binaural conditions. Int J Audiol 2012; 51:715-21. [DOI: 10.3109/14992027.2012.690079] [Citation(s) in RCA: 53] [Impact Index Per Article: 4.4]
28
Wang J, Baer T, Glasberg BR, Stone MA, Ye D, Moore BCJ. Pitch perception of concurrent harmonic tones with overlapping spectra. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2012; 132:339-356. [PMID: 22779482 DOI: 10.1121/1.4728165] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3]
Abstract
Fundamental frequency difference limens (F0DLs) were measured for a target harmonic complex tone with nominal fundamental frequency (F0) of 200 Hz, in the presence and absence of a harmonic masker with overlapping spectrum. The F0 of the masker was 0, ± 3, or ± 6 semitones relative to 200 Hz. The stimuli were bandpass filtered into three regions: 0-1000 Hz (low, L), 1600-2400 Hz (medium, M), and 2800-3600 Hz (high, H), and a background noise was used to mask combination tones and to limit the audibility of components falling on the filter skirts. The components of the target or masker started either in cosine or random phase. Generally, the effect of F0 difference between target and masker was small. For the target alone, F0DLs were larger for random than cosine phase for region H. For the target plus masker, F0DLs were larger when the target had random phase than cosine phase for regions M and H. F0DLs increased with increasing center frequency of the bandpass filter. Modeling using excitation patterns and "summary autocorrelation" and "stabilized auditory image" models suggested that use of temporal fine structure information can account for the small F0DLs obtained when harmonics are barely, if at all, resolved.
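The "summary autocorrelation" stage mentioned in this modeling work can be sketched simply: autocorrelate each frequency channel's waveform, sum the autocorrelation functions across channels, and take the lag of the largest peak within a plausible F0 range as the period. This is a bare-bones stand-in; real summary-autocorrelation models operate on simulated auditory-nerve activity from a cochlear filter bank, and all numbers below are illustrative.

```python
import numpy as np

def summary_autocorrelation_f0(channels, fs, fmin=80.0, fmax=400.0):
    """Sum per-channel autocorrelations and return F0 = fs / best lag."""
    n = len(channels[0])
    sacf = np.zeros(n)
    for ch in channels:
        # non-negative lags of the full autocorrelation
        sacf += np.correlate(ch, ch, mode="full")[n - 1:]
    lo, hi = int(fs / fmax), int(fs / fmin)   # candidate period range
    lag = lo + np.argmax(sacf[lo:hi])
    return fs / lag

fs = 16000
t = np.arange(0, 0.05, 1 / fs)
# Two bands of harmonics of 200 Hz, standing in for filter-bank outputs:
# low-numbered (resolved-like) and high-numbered (unresolved-like) harmonics.
low = sum(np.cos(2 * np.pi * h * 200 * t) for h in (1, 2, 3))
high = sum(np.cos(2 * np.pi * h * 200 * t) for h in (8, 9, 10))
print(summary_autocorrelation_f0([low, high], fs))  # → 200.0
```

Both bands contribute a peak at the 5 ms lag, which is why such models can extract a common F0 even when the high-frequency band contains only unresolved harmonics.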
Affiliation(s)
- Jian Wang
- Department of Biomedical Engineering, Tsinghua University, Beijing 100084, China
29
Across-channel timing differences as a potential code for the frequency of pure tones. J Assoc Res Otolaryngol 2011; 13:159-171. [PMID: 22160791 PMCID: PMC3298616 DOI: 10.1007/s10162-011-0305-0] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8]
Abstract
When a pure tone or low-numbered harmonic is presented to a listener, the resulting travelling wave in the cochlea slows down at the portion of the basilar membrane (BM) tuned to the input frequency due to the filtering properties of the BM. This slowing is reflected in the phase of the response of neurons across the auditory nerve (AN) array. It has been suggested that the auditory system exploits these across-channel timing differences to encode the pitch of both pure tones and resolved harmonics in complex tones. Here, we report a quantitative analysis of previously published data on the response of guinea pig AN fibres, of a range of characteristic frequencies, to pure tones of different frequencies and levels. We conclude that although the use of across-channel timing cues provides an a priori attractive and plausible means of encoding pitch, many of the most obvious metrics for using that cue produce pitch estimates that are strongly influenced by the overall level and therefore are unlikely to provide a straightforward means for encoding the pitch of pure tones.
30
Resolvability of components in complex tones and implications for theories of pitch perception. Hear Res 2011; 276:88-97. [PMID: 21236327 DOI: 10.1016/j.heares.2011.01.003] [Citation(s) in RCA: 53] [Impact Index Per Article: 4.1]
Abstract
This paper reviews methods that have been used to estimate the resolvability of individual partials in harmonic and inharmonic complex tones and considers the implications of the results for theories of pitch perception. The methods include: requiring comparisons of the pitch of an isolated pure tone and a partial within a complex tone as a measure of the ability to "hear out" that partial; considering the magnitude of ripples in the calculated excitation pattern of a complex tone; using a complex tone as a forward masker and using ripples in the masking pattern to estimate resolvability; measuring sensitivity to the relative phase of the components within complex tones. The measures are broadly consistent in indicating that harmonics with numbers up to about five are well resolved, but that resolution decreases for higher harmonics. Most measures suggest that harmonics with numbers above eight are poorly, if at all, resolved. However, there are uncertainties associated with each method that make the exact upper limit of resolvability uncertain. Evidence is presented suggesting a partial dissociation between resolution in the excitation pattern and the ability to hear out a partial. It is proposed that the latter requires information from temporal fine structure (phase locking).
31
What breaks a melody: perceiving F0 and intensity sequences with a cochlear implant. Hear Res 2010; 269:34-41. [PMID: 20674733 DOI: 10.1016/j.heares.2010.07.007] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.1]
Abstract
Pitch perception has been extensively studied using discrimination tasks on pairs of single sounds. When comparing pitch discrimination performance for normal-hearing (NH) and cochlear implant (CI) listeners, it usually appears that CI users have relatively poor pitch discrimination. Tasks involving pitch sequences, such as melody perception or auditory scene analysis, are also usually difficult for CI users. However, it is unclear whether the issue with pitch sequences is a consequence of sound discriminability, or if an impairment exists for sequence processing per se. Here, we compared sequence processing abilities across stimulus dimensions (fundamental frequency and intensity) and listener groups (NH, CI, and NH listeners presented with noise-vocoded sequences). The sequence elements were first matched in discriminability, for each listener and dimension. Participants were then presented with pairs of sequences, consisting of up to four elements varying along a single dimension, and performed a same/different task. In agreement with a previous study (Cousineau et al., 2009), fundamental frequency sequences were processed more accurately than intensity sequences by NH listeners. However, this was not the case for CI listeners, nor for NH listeners presented with noise-vocoded sequences. Intensity sequence processing was, nonetheless, equally accurate in the three groups. These results show that the reduced pitch cues received by CI listeners do not only elevate thresholds, as previously documented, but also affect pitch sequence processing above threshold. We suggest that efficient sequence processing for pitch requires the resolution of individual harmonics in the auditory periphery, which is not achieved with the current generation of implants.
32
Moore BCJ, Glasberg BR. The role of temporal fine structure in harmonic segregation through mistuning. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2010; 127:5-8. [PMID: 20058944 DOI: 10.1121/1.3268509] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3]
Abstract
Bernstein and Oxenham [(2008). J. Acoust. Soc. Am. 124, 1653-1667] measured thresholds for discriminating the fundamental frequency, F0, of a complex tone that was passed through a fixed bandpass filter. They found that performance worsened when the F0 was decreased so that only harmonics above the tenth were audible. However, performance in this case was improved by mistuning the odd harmonics by 3%. Bernstein and Oxenham considered whether the results could be explained in terms of temporal fine structure information available at the output of a single auditory filter and concluded that their results did not appear to be consistent with such an explanation. Here, it is argued that such cues could have led to the improvement in performance produced by mistuning the odd harmonics.
Affiliation(s)
- Brian C J Moore
- Department of Experimental Psychology, University of Cambridge, Downing Street, Cambridge CB2 3EB, England.
33
Cousineau M, Demany L, Pressnitzer D. What makes a melody: The perceptual singularity of pitch sequences. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2009; 126:3179-3187. [PMID: 20000931 DOI: 10.1121/1.3257206] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.1]
Abstract
This study investigated the ability of normal-hearing listeners to process random sequences of tones varying in either pitch or loudness. Same/different judgments were collected for pairs of sequences with a variable length (up to eight elements) and built from only two different elements, which were 200-ms harmonic complex tones. The two possible elements of all sequences had a fixed level of discriminability, corresponding to a d′ value of about 2, irrespective of the auditory dimension (pitch or loudness) along which they differed. This made it possible to assess sequence processing per se, independent of the accuracy of sound encoding. Pitch sequences were found to be processed more effectively than loudness sequences. However, that was the case only when the sequence elements included low-rank harmonics, which could be at least partially resolved in the auditory periphery. The effect of roving and transposition was also investigated. These manipulations reduced overall performance, especially transposition, but an advantage for pitch sequences was still observed. These results suggest that automatic frequency-shift detectors, available for pitch sequences but not loudness sequences, participate in the effective encoding of melodies.
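The discriminability matching mentioned here uses the standard signal-detection sensitivity index. Treating the judgments as a yes/no task for simplicity (the study's actual same/different design needs a slightly different correction), d′ is the difference of the z-transformed hit and false-alarm rates; the rates below are invented to illustrate a value near 2.

```python
from statistics import NormalDist

def dprime(hit_rate, fa_rate):
    """Sensitivity index: d' = z(hit rate) - z(false-alarm rate)."""
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fa_rate)

# e.g. 84% hits with 16% false alarms gives d' close to 2,
# the per-listener matching target described in the abstract.
print(round(dprime(0.84, 0.16), 2))  # → 1.99
```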
Affiliation(s)
- Marion Cousineau
- Laboratoire de Psychologie de la Perception (UMR CNRS 8158), Universite Paris-Descartes, F-75230 Paris Cedex 05, France.
34
Moore BCJ, Sek A. Development of a fast method for determining sensitivity to temporal fine structure. Int J Audiol 2009; 48:161-71. [PMID: 19085395] [DOI: 10.1080/14992020802475235]
Abstract
Recent evidence suggests that sensitivity to the temporal fine structure (TFS) of sounds is adversely affected by cochlear hearing loss. This may partly explain the difficulties experienced by people with cochlear hearing loss in understanding speech when background sounds, especially fluctuating backgrounds, are present. We describe a test for assessing sensitivity to TFS. The test can be run using any PC with a sound card. The test involves discrimination of a harmonic complex tone (H), with a fundamental frequency F0, from a tone in which all harmonics are shifted upwards by the same amount in Hertz, resulting in an inharmonic tone (I). The phases of the components are selected randomly for every stimulus. Both tones have an envelope repetition rate equal to F0, but the tones differ in their TFS. To prevent discrimination based on spectral cues, all tones are passed through a fixed bandpass filter, usually centred at 11F0. A background noise is used to mask combination tones. The results show that, for normal-hearing subjects, learning effects are small, and the effect of the level of testing is also small. The test provides a simple, quick, and robust way to measure sensitivity to TFS.
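The construction of the H and I tones described above can be sketched as follows; the parameter values (F0, shift size, duration, number of components) are illustrative assumptions, not the exact settings of the published test:

```python
import numpy as np

def complex_tone(f0, shift_hz, n_harmonics=20, fs=48000, dur=0.5, rng=None):
    """Sum of harmonics of f0, each shifted up by shift_hz, with
    random starting phases. shift_hz=0 gives the harmonic tone H;
    a nonzero shift gives the inharmonic tone I, which keeps the
    same envelope repetition rate (f0) but a different temporal
    fine structure."""
    rng = rng or np.random.default_rng()
    t = np.arange(int(fs * dur)) / fs
    x = np.zeros_like(t)
    for k in range(1, n_harmonics + 1):
        phase = rng.uniform(0, 2 * np.pi)
        x += np.sin(2 * np.pi * (k * f0 + shift_hz) * t + phase)
    return x

f0 = 100.0
h = complex_tone(f0, shift_hz=0.0)       # harmonic tone H
i = complex_tone(f0, shift_hz=0.3 * f0)  # inharmonic tone I
# In the test proper, both tones would then be bandpass filtered
# near 11*f0 to remove spectral cues, with a background noise
# masking combination tones.
```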
Affiliation(s)
- Brian C J Moore
- Department of Experimental Psychology, University of Cambridge, Downing Street, Cambridge, UK.
35
Moore BCJ, Hopkins K, Cuthbertson S. Discrimination of complex tones with unresolved components using temporal fine structure information. J Acoust Soc Am 2009; 125:3214-3222. [PMID: 19425664] [DOI: 10.1121/1.3106135]
Abstract
The information used to discriminate complex tones with (largely) unresolved components was assessed. In experiment 1, subjects discriminated a harmonic complex tone, H, with fundamental frequency F0, from an inharmonic tone, I, in which all components were shifted upwards by the same amount in hertz. Tones H and I had the same envelope repetition rate but different temporal fine structure (TFS). The tones were passed through a fixed bandpass filter centered on harmonic N, to reduce excitation pattern cues. For all F0s (35-400 Hz), performance decreased as N was increased from 11 to 15, but, except for F0=35 Hz, remained above chance for N=15, where all harmonics should be unresolved. This suggests that discrimination can be based on TFS rather than on partially resolved components. In experiment 2, subjects discriminated the F0 of complex tones filtered as in experiment 1. Here, both envelope rate and TFS cues were available. Except for F0=35 Hz, discrimination thresholds, expressed as the Weber fraction for a change in time interval, were similar to those measured in experiment 1, suggesting that performance in experiment 2 was dominated by the use of TFS rather than envelope cues.
Affiliation(s)
- Brian C J Moore
- Department of Experimental Psychology, University of Cambridge, Downing Street, Cambridge CB2 3EB, England.
36
Balaguer-Ballester E, Clark NR, Coath M, Krumbholz K, Denham SL. Understanding pitch perception as a hierarchical process with top-down modulation. PLoS Comput Biol 2009; 5:e1000301. [PMID: 19266015] [PMCID: PMC2639722] [DOI: 10.1371/journal.pcbi.1000301]
Abstract
Pitch is one of the most important features of natural sounds, underlying the perception of melody in music and prosody in speech. However, the temporal dynamics of pitch processing are still poorly understood. Previous studies suggest that the auditory system uses a wide range of time scales to integrate pitch-related information and that the effective integration time is both task- and stimulus-dependent. None of the existing models of pitch processing can account for such task- and stimulus-dependent variations in processing time scales. This study presents an idealized neurocomputational model, which provides a unified account of the multiple time scales observed in pitch perception. The model is evaluated using a range of perceptual studies, which have not previously been accounted for by a single model, and new results from a neurophysiological experiment. In contrast to other approaches, the current model contains a hierarchy of integration stages and uses feedback to adapt the effective time scales of processing at each stage in response to changes in the input stimulus. The model has features in common with a hierarchical generative process and suggests a key role for efferent connections from central to sub-cortical areas in controlling the temporal dynamics of pitch processing.
Affiliation(s)
- Emili Balaguer-Ballester
- Centre for Theoretical and Computational Neuroscience, University of Plymouth, Plymouth, United Kingdom.
37
Nelken I. Processing of complex sounds in the auditory system. Curr Opin Neurobiol 2008; 18:413-7. [PMID: 18805485] [DOI: 10.1016/j.conb.2008.08.014]
Abstract
The coding of complex sounds in the early auditory system has a 'standard model' based on the known physiology of the cochlea and main brainstem pathways. This model accounts for a wide range of perceptual capabilities. It is generally accepted that high cortical areas encode abstract qualities such as spatial location or speech sound identity. Between the early and late auditory system, the role of primary auditory cortex (A1) is still debated. A1 is clearly much more than a 'whiteboard' of acoustic information: neurons in A1 have complex response properties, showing sensitivity to both low-level and high-level features of sounds.
Affiliation(s)
- Israel Nelken
- Department of Neurobiology, Silberman Institute of Life Sciences, and the Interdisciplinary Center for Neural Computation (ICNC), Hebrew University, Safra Campus, Jerusalem 91904, Israel.
38
Larsen E, Cedolin L, Delgutte B. Pitch representations in the auditory nerve: two concurrent complex tones. J Neurophysiol 2008; 100:1301-19. [PMID: 18632887] [PMCID: PMC2544468] [DOI: 10.1152/jn.01361.2007]
Abstract
Pitch differences between concurrent sounds are important cues used in auditory scene analysis and also play a major role in music perception. To investigate the neural codes underlying these perceptual abilities, we recorded from single fibers in the cat auditory nerve in response to two concurrent harmonic complex tones with missing fundamentals and equal-amplitude harmonics. We investigated the efficacy of rate-place and interspike-interval codes to represent both pitches of the two tones, which had fundamental frequency (F0) ratios of 15/14 or 11/9. We relied on the principle of scaling invariance in cochlear mechanics to infer the spatiotemporal response patterns to a given stimulus from a series of measurements made in a single fiber as a function of F0. Templates created by a peripheral auditory model were used to estimate the F0s of double complex tones from the inferred distribution of firing rate along the tonotopic axis. This rate-place representation was accurate for F0s above approximately 900 Hz. Surprisingly, rate-based F0 estimates were accurate even when the two-tone mixture contained no resolved harmonics, so long as some harmonics were resolved prior to mixing. We also extended methods used previously for single complex tones to estimate the F0s of concurrent complex tones from interspike-interval distributions pooled over the tonotopic axis. The interval-based representation was accurate for F0s below approximately 900 Hz, where the two-tone mixture contained no resolved harmonics. Together, the rate-place and interval-based representations allow accurate pitch perception for concurrent sounds over the entire range of human voice and cat vocalizations.
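The interspike-interval representation described above can be illustrated with a toy estimator: pool intervals across simulated fibers, histogram them, and read F0 from the dominant periodicity. This sketch uses first-order (consecutive) intervals and synthetic, jittered spike trains for simplicity; the study pooled all-order intervals from recorded fibers, and all parameter values here are assumptions:

```python
import numpy as np

def pooled_isi_f0(spike_trains, max_interval=0.02, bin_width=2.5e-4):
    """Estimate F0 from interspike intervals pooled over fibers.
    Consecutive-spike intervals are histogrammed; the most populated
    bin gives the dominant period, whose reciprocal is the F0
    estimate (seconds -> Hz)."""
    intervals = np.concatenate([np.diff(s) for s in spike_trains])
    bins = np.arange(bin_width, max_interval, bin_width)
    hist, edges = np.histogram(intervals, bins=bins)
    peak = edges[np.argmax(hist)] + bin_width / 2  # bin centre
    return 1.0 / peak

# Synthetic fibers phase-locked to a 200-Hz fundamental:
# spikes every 5 ms with a little temporal jitter.
rng = np.random.default_rng(0)
period = 1 / 200.0
trains = [np.cumsum(np.full(60, period)) + rng.normal(0, 2e-4, 60)
          for _ in range(20)]
print(round(pooled_isi_f0(trains)))  # close to 200
```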
Affiliation(s)
- Erik Larsen
- Eaton-Peabody Laboratory, Massachusetts Eye and Ear Infirmary, Boston, MA, USA
39
Bernstein JGW, Oxenham AJ. Harmonic segregation through mistuning can improve fundamental frequency discrimination. J Acoust Soc Am 2008; 124:1653-1667. [PMID: 19045656] [PMCID: PMC2736713] [DOI: 10.1121/1.2956484]
Abstract
This study investigated the relationship between harmonic frequency resolution and fundamental frequency (f0) discrimination. Consistent with earlier studies, f0 discrimination of a diotic bandpass-filtered harmonic complex deteriorated sharply as the f0 decreased to the point where only harmonics above the tenth were presented. However, when the odd harmonics were mistuned by 3%, performance improved dramatically, such that performance nearly equaled that found with only even harmonics present. Mistuning also improved performance when alternating harmonics were presented to opposite ears (dichotic condition). In a task involving frequency discrimination of individual harmonics within the complexes, mistuning the odd harmonics yielded no significant improvement in the resolution of individual harmonics. Pitch matches to the mistuned complexes suggested that the even harmonics dominated the pitch for f0's at which a benefit of mistuning was observed. The results suggest that f0 discrimination performance can benefit from perceptual segregation based on inharmonicity, and that poor performance when only high-numbered harmonics are present is not due to limited peripheral harmonic resolvability. Taken together with earlier results, the findings suggest that f0 discrimination may depend on auditory filter bandwidths, but that spectral resolution of individual harmonics is neither necessary nor sufficient for accurate f0 discrimination.
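The mistuned-odd-harmonics manipulation is straightforward to express in code; a sketch under assumed parameter values (f0, harmonic range, and duration are illustrative, not the study's settings):

```python
import numpy as np

def mistuned_complex(f0, harmonics, mistune=0.03, fs=48000, dur=0.4):
    """Complex tone in which odd-numbered harmonics are shifted up
    by a fixed proportion (3% in the study), while even harmonics
    remain at exact multiples of f0."""
    t = np.arange(int(fs * dur)) / fs
    x = np.zeros_like(t)
    for k in harmonics:
        f = k * f0 * (1 + mistune) if k % 2 else k * f0
        x += np.sin(2 * np.pi * f * t)
    return x

# Harmonics 12-20 of a 100-Hz fundamental: only high-numbered,
# largely unresolved components, as in the critical conditions.
x = mistuned_complex(100.0, range(12, 21))
```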
Affiliation(s)
- Joshua G W Bernstein
- Army Audiology and Speech Center, Walter Reed Army Medical Center, 6900 Georgia Avenue N.W., Washington, DC 20307, USA.
40
Balaguer-Ballester E, Coath M, Denham SL. A model of perceptual segregation based on clustering the time series of the simulated auditory nerve firing probability. Biol Cybern 2007; 97:479-491. [PMID: 17994247] [DOI: 10.1007/s00422-007-0187-8]
Abstract
This paper introduces a model that accounts quantitatively for a phenomenon of perceptual segregation, the simultaneous perception of more than one pitch in a single complex sound. The method is based on a characterization of the time-varying spike probability generated by a model of cochlear responses to sounds. It demonstrates how the autocorrelation theories of pitch perception contain the necessary elements to define a specific measure in the phase space of the simulated auditory nerve probability of firing time series. This measure was motivated in the first instance by the correlation dimension of the attractor; however, it has been modified in several ways in order to increase the neurobiological plausibility. This quantity characterizes each of the cochlear frequency channels and gives rise to a channel clustering criterion. The model computes the clusters and the pitch estimates simultaneously using the same processing mechanisms of delay lines; therefore, it respects the biological constraints in a similar way to temporal theories of pitch. The model successfully explains a wide range of perceptual experiments.
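The autocorrelation machinery the model builds on can be illustrated with a minimal autocorrelation pitch estimator. This is a toy sketch applied to a raw waveform rather than the model's simulated nerve firing probabilities, and all parameter values are assumptions:

```python
import numpy as np

def acf_pitch(x, fs, fmin=50.0, fmax=500.0):
    """Autocorrelation pitch estimate: the lag of the largest
    autocorrelation peak inside the candidate period range is
    taken as the perceived period."""
    r = np.correlate(x, x, mode='full')[len(x) - 1:]  # lags 0..N-1
    lo, hi = int(fs / fmax), int(fs / fmin)           # period range in samples
    lag = lo + np.argmax(r[lo:hi])
    return fs / lag

fs = 16000
t = np.arange(int(0.1 * fs)) / fs
# Missing-fundamental complex: harmonics 3-6 of 200 Hz only.
x = sum(np.sin(2 * np.pi * k * 200 * t) for k in range(3, 7))
print(round(acf_pitch(x, fs)))  # pitch at the absent 200-Hz fundamental
```

The missing-fundamental example shows why such temporal measures matter: the estimator recovers a 200-Hz pitch even though no energy is present at 200 Hz.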
Affiliation(s)
- Emili Balaguer-Ballester
- Centre for Theoretical and Computational Neuroscience, University of Plymouth, Portland Square, Drake Circus, Plymouth, Devon, UK.
41
Bernstein JGW, Oxenham AJ. The relationship between frequency selectivity and pitch discrimination: sensorineural hearing loss. J Acoust Soc Am 2006; 120:3929-45. [PMID: 17225420] [DOI: 10.1121/1.2372452]
Abstract
This study tested the relationship between frequency selectivity and the minimum spacing between harmonics necessary for accurate f0 discrimination. Fundamental frequency difference limens (f0 DLs) were measured for ten listeners with moderate sensorineural hearing loss (SNHL) and three normal-hearing listeners for sine- and random-phase harmonic complexes, bandpass filtered between 1500 and 3500 Hz, with f0's ranging from 75 to 500 Hz (or higher). All listeners showed a transition between small (good) f0 DLs at high f0's and large (poor) f0 DLs at low f0's, although the f0 at which this transition occurred (f0,tr) varied across listeners. Three measures thought to reflect frequency selectivity were significantly correlated to both the f0,tr and the minimum f0 DL achieved at high f0's: (1) the maximum f0 for which f0 DLs were phase dependent, (2) the maximum modulation frequency for which amplitude modulation and quasi-frequency modulation were discriminable, and (3) the equivalent rectangular bandwidth of the auditory filter, estimated using the notched-noise method. These results provide evidence of a relationship between f0 discrimination performance and frequency selectivity in listeners with SNHL, supporting "spectral" and "spectro-temporal" theories of pitch perception that rely on sharp tuning in the auditory periphery to accurately extract f0 information.
Affiliation(s)
- Joshua G W Bernstein
- Research Laboratory of Electronics, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA.
42
Bernstein JGW, Oxenham AJ. The relationship between frequency selectivity and pitch discrimination: effects of stimulus level. J Acoust Soc Am 2006; 120:3916-28. [PMID: 17225419] [DOI: 10.1121/1.2372451]
Abstract
Three experiments tested the hypothesis that fundamental frequency (f0) discrimination depends on the resolvability of harmonics within a tone complex. Fundamental frequency difference limens (f0 DLs) were measured for random-phase harmonic complexes with eight f0's between 75 and 400 Hz, bandpass filtered between 1.5 and 3.5 kHz, and presented at 12.5-dB/component average sensation level in threshold equalizing noise with levels of 10, 40, and 65 dB SPL per equivalent rectangular auditory filter bandwidth. With increasing level, the transition from large (poor) to small (good) f0 DLs shifted to a higher f0. This shift corresponded to a decrease in harmonic resolvability, as estimated in the same listeners with excitation patterns derived from measures of auditory filter shape and with a more direct measure that involved hearing out individual harmonics. The results are consistent with the idea that resolved harmonics are necessary for good f0 discrimination. Additionally, f0 DLs for high f0's increased with stimulus level in the same way as pure-tone frequency DLs, suggesting that for this frequency range, the frequencies of harmonics are more poorly encoded at higher levels, even when harmonics are well resolved.
Affiliation(s)
- Joshua G W Bernstein
- Research Laboratory of Electronics, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA.