1. Dahlbom DA, Braasch J. How to pick a peak: Pitch and peak shifting in temporal models of pitch perception. J Acoust Soc Am 2020;147:2713. [PMID: 32359285] [DOI: 10.1121/10.0001134]
Abstract
The standard autocorrelation model of pitch perception posits that the pitch of a stimulus can be predicted from the first major peak of a summary autocorrelation function (SACF) after the zero-delay peak. Models based on this theory are capable of predicting a wide range of pitch phenomena. There are, however, a number of cases where the approach fails. Two examples are noise edge pitch (NEP) and the pitch induced by the mistuning of a single component of an otherwise harmonic stimulus. Hartmann, Cariani, and Colburn [(2019). J. Acoust. Soc. Am. 145, 1993-2008] recently proposed the use of multiple SACF peaks in the estimation process. This enables prediction of the NEP but suppresses the shift associated with a mistuned harmonic. A functional model is developed that can predict both of these pitch phenomena. The multiple-peak framework is extended with a non-standard peak-selection method that associates a delay time with a given peak in a manner that takes into account the entire shape of the bump surrounding the peak. This effectively shifts the peak location slightly for non-harmonic stimuli. A possible physiological mechanism that could induce such peak shifting is discussed, and the model is tested against existing psychophysical data.
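A minimal sketch of the bump-reading idea described above, in Python. A single broadband autocorrelation stands in for the channel-summed SACF, and the centroid-of-bump weighting is an illustrative assumption, not the paper's exact peak-selection rule:

```python
import numpy as np

def acf_pitch(signal, fs, fmin=50.0, fmax=500.0):
    """Pitch from an autocorrelation function, read off as the centroid
    of the bump around the main peak rather than the peak sample itself.
    """
    acf = np.correlate(signal, signal, mode="full")
    acf = acf[len(acf) // 2:]            # non-negative lags only
    lags = np.arange(len(acf)) / fs      # lag in seconds

    lo, hi = int(fs / fmax), int(fs / fmin)
    k = lo + np.argmax(acf[lo:hi])       # dominant peak in the pitch range

    # Walk outward to the edges of the bump (where the ACF drops to zero),
    # then take the amplitude-weighted centroid of the whole bump.
    left, right = k, k
    while left > lo and acf[left - 1] > 0:
        left -= 1
    while right < hi - 1 and acf[right + 1] > 0:
        right += 1
    w = acf[left:right + 1]
    centroid = np.sum(lags[left:right + 1] * w) / np.sum(w)
    return 1.0 / centroid                # pitch estimate in Hz
```

For a harmonic tone the bump is symmetric and the centroid coincides with the peak; for a mistuned or edge-pitch stimulus the bump skews, shifting the estimate in the direction the abstract describes.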
Affiliation(s)
- David A Dahlbom
- School of Architecture, Rensselaer Polytechnic Institute, 110 8th Street, Troy, New York 12180, USA
- Jonas Braasch
- School of Architecture, Rensselaer Polytechnic Institute, 110 8th Street, Troy, New York 12180, USA
2. Zulfiqar I, Moerel M, Formisano E. Spectro-Temporal Processing in a Two-Stream Computational Model of Auditory Cortex. Front Comput Neurosci 2020;13:95. [PMID: 32038212] [PMCID: PMC6987265] [DOI: 10.3389/fncom.2019.00095]
Abstract
Neural processing of sounds in the dorsal and ventral streams of the (human) auditory cortex is optimized for analyzing fine-grained temporal and spectral information, respectively. Here we use a Wilson and Cowan firing-rate modeling framework to simulate spectro-temporal processing of sounds in these auditory streams and to investigate the link between neural population activity and behavioral results of psychoacoustic experiments. The proposed model consisted of two core (A1 and R, representing primary areas) and two belt (Slow and Fast, representing rostral and caudal processing, respectively) areas, differing in terms of their spectral and temporal response properties. First, we simulated the responses to amplitude-modulated (AM) noise and tones. In agreement with electrophysiological results, we observed an area-dependent transition from a temporal (synchronization) code to a rate code when moving from low to high modulation rates. Simulated neural responses in an amplitude modulation detection task suggested that thresholds derived from population responses in core areas closely resembled those of psychoacoustic experiments in human listeners. For tones, simulated modulation threshold functions were found to depend on the carrier frequency. Second, we simulated the responses to missing-fundamental complex tones and found that synchronization of responses in the Fast area accurately encoded pitch, with the strength of synchronization depending on the number and order of harmonic components. Finally, using speech stimuli, we showed that the spectral and temporal structure of the speech was reflected in parallel by the modeled areas. The analyses highlighted that the Slow stream encoded, with high spectral precision, the aspects of the speech signal characterized by slow temporal changes (e.g., prosody), while the Fast stream encoded primarily the faster changes (e.g., phonemes, consonants, temporal pitch). Interestingly, the pitch of a speaker was encoded both spatially (i.e., tonotopically) in the Slow area and temporally in the Fast area. Overall, the simulations showed that the model is valuable for generating hypotheses on how the different cortical areas/streams may contribute toward behaviorally relevant aspects of auditory processing. The model can be used in combination with physiological models of neurovascular coupling to generate predictions for human functional MRI experiments.
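The Wilson-Cowan framework mentioned above reduces each cortical population to coupled excitatory and inhibitory firing-rate equations. A single-unit sketch with generic textbook parameters, not the paper's four-area (A1, R, Slow, Fast) parameterization:

```python
import numpy as np

def sigmoid(x, a=1.0, theta=4.0):
    """Saturating firing-rate nonlinearity used in Wilson-Cowan models."""
    return 1.0 / (1.0 + np.exp(-a * (x - theta)))

def wilson_cowan(stim, dt=1e-3, tau_e=0.010, tau_i=0.020,
                 w_ee=16.0, w_ei=12.0, w_ie=15.0, w_ii=3.0):
    """Euler-integrate one excitatory-inhibitory Wilson-Cowan unit
    driven by an external input `stim` (one sample per time step).
    Returns the excitatory and inhibitory firing-rate traces.
    """
    E = np.zeros(len(stim))
    I = np.zeros(len(stim))
    for t in range(1, len(stim)):
        dE = (-E[t-1] + sigmoid(w_ee * E[t-1] - w_ei * I[t-1] + stim[t-1])) / tau_e
        dI = (-I[t-1] + sigmoid(w_ie * E[t-1] - w_ii * I[t-1])) / tau_i
        E[t] = E[t-1] + dt * dE
        I[t] = I[t-1] + dt * dI
    return E, I
```

Area-specific time constants (faster for the caudal/Fast stream, slower for the rostral/Slow stream) are what give the two streams their different temporal versus spectral sensitivities in the model.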
Affiliation(s)
- Isma Zulfiqar
- Maastricht Centre for Systems Biology, Maastricht University, Maastricht, Netherlands
- Michelle Moerel
- Maastricht Centre for Systems Biology, Maastricht University, Maastricht, Netherlands; Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, Maastricht, Netherlands; Maastricht Brain Imaging Center, Maastricht, Netherlands
- Elia Formisano
- Maastricht Centre for Systems Biology, Maastricht University, Maastricht, Netherlands; Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, Maastricht, Netherlands; Maastricht Brain Imaging Center, Maastricht, Netherlands
3. Su Y, Delgutte B. Pitch of harmonic complex tones: rate and temporal coding of envelope repetition rate in inferior colliculus of unanesthetized rabbits. J Neurophysiol 2019;122:2468-2485. [PMID: 31664871] [DOI: 10.1152/jn.00512.2019]
Abstract
Harmonic complex tones (HCTs) found in speech, music, and animal vocalizations evoke strong pitch percepts at their fundamental frequencies. The strongest pitches are produced by HCTs that contain harmonics resolved by cochlear frequency analysis, but HCTs containing solely unresolved harmonics also evoke a weaker pitch at their envelope repetition rate (ERR). In the auditory periphery, neurons phase lock to the stimulus envelope, but this temporal representation of ERR degrades and gives way to rate codes along the ascending auditory pathway. To assess the role of the inferior colliculus (IC) in such transformations, we recorded IC neuron responses to HCTs and sinusoidally amplitude-modulated broadband noise (SAMN) with varying ERR from unanesthetized rabbits. Different interharmonic phase relationships of HCTs were used to manipulate the temporal envelope without changing the power spectrum. Many IC neurons demonstrated band-pass rate tuning to ERR between 60 and 1,600 Hz for HCTs and between 40 and 500 Hz for SAMN. The tuning was not related to the pure-tone best frequency of neurons but was dependent on the shape of the stimulus envelope, indicating a temporal rather than spectral origin. A phenomenological model suggests that the tuning may arise from peripheral temporal response patterns via synaptic inhibition. We also characterized temporal coding of ERR. Some IC neurons could phase lock to the stimulus envelope up to 900 Hz for either HCTs or SAMN, but phase locking was weaker with SAMN. Together, the rate code and the temporal code represent a wide range of ERRs, providing strong cues for the pitch of unresolved harmonics.

NEW & NOTEWORTHY: Envelope repetition rate (ERR) provides crucial cues for pitch perception of frequency components that are not individually resolved by the cochlea, but the neural representation of ERR for stimuli containing many harmonics is poorly characterized. Here we show that the pitch of stimuli with unresolved harmonics is represented by both a rate code and a temporal code for ERR in auditory midbrain neurons, and we propose possible underlying neural mechanisms with a computational model.
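Phase locking to the stimulus envelope, as reported above, is conventionally quantified by vector strength. A short sketch of that standard measure (not code from the study):

```python
import numpy as np

def vector_strength(spike_times, period):
    """Vector strength (Goldberg & Brown, 1969): 1.0 means perfect
    phase locking to the modulation period, 0.0 means none.
    """
    phases = 2.0 * np.pi * (np.asarray(spike_times) % period) / period
    return np.abs(np.mean(np.exp(1j * phases)))

# Spikes locked to a 100 Hz envelope (10 ms period) with small jitter
spikes = np.arange(0.0, 1.0, 0.01) + np.random.normal(0.0, 5e-4, 100)
print(vector_strength(spikes, period=0.01))  # close to 1
```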
Affiliation(s)
- Yaqing Su
- Eaton-Peabody Labs, Massachusetts Eye and Ear, Boston, Massachusetts; Department of Biomedical Engineering, Boston University, Boston, Massachusetts
- Bertrand Delgutte
- Eaton-Peabody Labs, Massachusetts Eye and Ear, Boston, Massachusetts; Department of Otolaryngology, Harvard Medical School, Boston, Massachusetts
4. Tabas A, Andermann M, Schuberth V, Riedel H, Balaguer-Ballester E, Rupp A. Modeling and MEG evidence of early consonance processing in auditory cortex. PLoS Comput Biol 2019;15:e1006820. [PMID: 30818358] [PMCID: PMC6413961] [DOI: 10.1371/journal.pcbi.1006820]
Abstract
Pitch is a fundamental attribute of auditory perception. The interaction of concurrent pitches gives rise to a sensation that can be characterized by its degree of consonance or dissonance. In this work, we propose that human auditory cortex (AC) processes pitch and consonance through a common neural network mechanism operating at early cortical levels. First, we developed a new model of neural ensembles incorporating realistic neuronal and synaptic parameters to assess pitch processing mechanisms at early stages of AC. Next, we designed a magnetoencephalography (MEG) experiment to measure the neuromagnetic activity evoked by dyads with varying degrees of consonance or dissonance. MEG results show that dissonant dyads evoke a pitch onset response (POR) with a latency up to 36 ms longer than that evoked by consonant dyads. Additionally, we used the model to predict the processing time of concurrent pitches; here, consonant pitch combinations were decoded faster than dissonant combinations, in line with the experimental observations. Specifically, we found a striking match between the predicted and the observed latency of the POR as elicited by the dyads. These novel results suggest that consonance processing starts early in human auditory cortex and may share the network mechanisms that are responsible for (single) pitch processing.
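The degree of consonance of a dyad is set by the frequency ratio of its two pitches. A toy illustration of that manipulation (pure tones only; the study's dyads and exact ratios are not reproduced here):

```python
import numpy as np

def dyad(f_root, ratio, fs=44100, dur=0.5):
    """Two-tone dyad: a root tone plus a second tone at `ratio` times
    its frequency. Illustrative stimulus sketch, not the study's stimuli.
    """
    t = np.arange(int(fs * dur)) / fs
    return np.sin(2 * np.pi * f_root * t) + np.sin(2 * np.pi * f_root * ratio * t)

consonant = dyad(220.0, 3 / 2)    # perfect fifth (simple ratio)
dissonant = dyad(220.0, 16 / 15)  # minor second (complex ratio)
```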
Affiliation(s)
- Alejandro Tabas
- Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- Faculty of Science and Technology, Bournemouth University, Poole, United Kingdom
- Martin Andermann
- Section of Biomagnetism, Department of Neurology, Heidelberg University Hospital, Heidelberg, Germany
- Valeria Schuberth
- Section of Biomagnetism, Department of Neurology, Heidelberg University Hospital, Heidelberg, Germany
- Helmut Riedel
- Section of Biomagnetism, Department of Neurology, Heidelberg University Hospital, Heidelberg, Germany
- Emili Balaguer-Ballester
- Faculty of Science and Technology, Bournemouth University, Poole, United Kingdom
- Bernstein Center for Computational Neuroscience, Heidelberg/Mannheim, Mannheim, Germany
- André Rupp
- Section of Biomagnetism, Department of Neurology, Heidelberg University Hospital, Heidelberg, Germany
5. Harczos T, Klefenz FM. Modeling Pitch Perception With an Active Auditory Model Extended by Octopus Cells. Front Neurosci 2018;12:660. [PMID: 30319340] [PMCID: PMC6167605] [DOI: 10.3389/fnins.2018.00660]
Abstract
Pitch is an essential attribute of musical sensation, and models of pitch perception remain actively debated. Most rely on mathematical methods defined in the spectral or temporal domain. The pitch perception model proposed here consists of an active auditory model extended by octopus cells. The active auditory model is the same as that used in Stimulation based on Auditory Modeling (SAM), a successful cochlear implant sound-processing strategy; it is extended here by modeling the functional behavior of the octopus cells in the ventral cochlear nucleus and their connections to the auditory nerve fibers (ANFs). The neurophysiological parameterization of the extended model is fully described in the time domain. The model is based on latency-phase encoding and decoding, as octopus cells act as latency-phase rectifiers in their local receptive fields. Pitch is represented throughout by cascaded firing sweeps of octopus cells. From the firing patterns of the octopus cells, inter-spike interval histograms can be aggregated, in which the location of the global maximum is assumed to encode the pitch.
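The final stage described above, reading pitch from the global maximum of an aggregated inter-spike interval histogram, can be sketched as follows (single spike train and illustrative bin width; the model aggregates over many octopus cells):

```python
import numpy as np

def isi_pitch(spike_times, bin_width=1e-4, max_period=0.02):
    """Pitch from the global maximum of a first-order inter-spike
    interval (ISI) histogram.
    """
    isis = np.diff(np.sort(np.asarray(spike_times)))
    edges = np.arange(0.0, max_period, bin_width)
    hist, edges = np.histogram(isis, bins=edges)
    best_period = edges[np.argmax(hist)] + 0.5 * bin_width  # bin center
    return 1.0 / best_period  # pitch estimate in Hz

# Spikes roughly every 5 ms -> estimate near 200 Hz
spikes = np.cumsum(np.full(200, 5e-3) + np.random.normal(0.0, 1e-4, 200))
print(isi_pitch(spikes))
```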
Affiliation(s)
- Tamas Harczos
- Fraunhofer Institute for Digital Media Technology, Ilmenau, Germany
- Auditory Neuroscience and Optogenetics Laboratory, German Primate Center, Goettingen, Germany
- Institut für Mikroelektronik- und Mechatronik-Systeme gGmbH, Ilmenau, Germany
6. Orcioni S, Paffi A, Camera F, Apollonio F, Liberti M. Automatic decoding of input sinusoidal signal in a neuron model: High pass homomorphic filtering. Neurocomputing 2018. [DOI: 10.1016/j.neucom.2018.03.007]
7. Orcioni S, Paffi A, Camera F, Apollonio F, Liberti M. Automatic decoding of input sinusoidal signal in a neuron model: Improved SNR spectrum by low-pass homomorphic filtering. Neurocomputing 2017. [DOI: 10.1016/j.neucom.2017.06.029]
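Entries 6 and 7 apply homomorphic filtering to recover the frequency of a sinusoidal input from a model neuron's output. A generic sketch of homomorphic (cepstral-domain) filtering of a log-magnitude spectrum, under the assumption that this is the flavor of filtering used; the papers' exact pipeline, cutoffs, and windowing are not reproduced here:

```python
import numpy as np

def homomorphic_lowpass(x, fs, cutoff_quefrency=0.002):
    """Low-pass homomorphic filtering of a signal's spectrum:
    log-magnitude spectrum -> cepstrum -> keep low-quefrency bins
    (slow spectral trends) -> back to a smoothed log spectrum.
    Cutoff value here is an illustrative assumption.
    """
    log_mag = np.log(np.abs(np.fft.rfft(x)) + 1e-12)
    cepstrum = np.fft.irfft(log_mag)
    n_keep = int(cutoff_quefrency * fs)         # lifter cutoff in bins
    lifter = np.zeros_like(cepstrum)
    lifter[:n_keep] = 1.0
    lifter[-n_keep:] = 1.0                      # keep symmetric counterpart
    return np.fft.rfft(cepstrum * lifter).real  # smoothed log spectrum

# Toy example: spike train of a model neuron driven at 50 Hz
fs = 10000
t = np.arange(fs) / fs
rate = 0.02 * (1.0 + np.sin(2 * np.pi * 50.0 * t))
spikes = (np.random.rand(fs) < rate).astype(float)
smoothed = homomorphic_lowpass(spikes, fs)
```

A high-pass variant (entry 6) keeps the high-quefrency bins instead, retaining fine spectral structure rather than the envelope.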
8. Neural Entrainment to the Beat: The "Missing-Pulse" Phenomenon. J Neurosci 2017;37:6331-6341. [PMID: 28559379] [DOI: 10.1523/jneurosci.2500-16.2017]
Abstract
Most humans have a near-automatic inclination to tap, clap, or move to the beat of music. The capacity to extract a periodic beat from a complex musical segment is remarkable, as it requires abstraction from the temporal structure of the stimulus. It has been suggested that nonlinear interactions in neural networks result in cortical oscillations at the beat frequency, and that such entrained oscillations give rise to the percept of a beat or a pulse. Here we tested this neural resonance theory using MEG recordings as female and male individuals listened to 30 s sequences of complex syncopated drumbeats designed so that they contain no net energy at the pulse frequency when measured using linear analysis. We analyzed the spectrum of the neural activity while listening and compared it to the modulation spectrum of the stimuli. We found an enhanced neural response in the auditory cortex at the pulse frequency. We also showed phase locking at the times of the missing pulse, even though the pulse was absent from the stimulus itself. Moreover, the strength of this pulse response correlated with individuals' speed in finding the pulse of these stimuli, as tested in a follow-up session. These findings demonstrate that neural activity at the pulse frequency in the auditory cortex is internally generated rather than stimulus-driven. The current results are consistent both with neural resonance theory and with models based on nonlinear responses of the brain to rhythmic stimuli. The results thus help narrow the search for valid models of beat perception.

SIGNIFICANCE STATEMENT: Humans perceive music as having a regular pulse marking equally spaced points in time, within which musical notes are temporally organized. Neural resonance theory (NRT) provides a theoretical model explaining how an internal periodic representation of a pulse may emerge through nonlinear coupling between oscillating neural systems. After testing key falsifiable predictions of NRT using MEG recordings, we demonstrate the emergence of neural oscillations at the pulse frequency, which can be related to pulse perception. These findings rule out alternative explanations for neural entrainment and provide evidence linking neural synchronization to the perception of pulse, a widely debated topic in recent years.
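The stimulus constraint described above, no net energy at the pulse frequency under linear analysis, can be checked directly from the onset envelope's spectrum. A sketch of that check (the onset pattern in the example is made up, not one of the study's drumbeats):

```python
import numpy as np

def energy_at_pulse(onsets, pulse_hz, fs=1000, dur=30.0):
    """Build a binary onset envelope from onset times (seconds), FFT it,
    and read the normalized magnitude at the pulse frequency.
    """
    env = np.zeros(int(fs * dur))
    env[(np.asarray(onsets) * fs).astype(int)] = 1.0
    spec = np.abs(np.fft.rfft(env)) / len(env)
    freqs = np.fft.rfftfreq(len(env), 1.0 / fs)
    return spec[np.argmin(np.abs(freqs - pulse_hz))]

# An isochronous 2 Hz pattern has strong energy at 2 Hz; a syncopated
# pattern can be tuned until this value approaches zero.
print(energy_at_pulse(np.arange(0.0, 30.0, 0.5), pulse_hz=2.0))
```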