1. Mammalian octopus cells are direction selective to frequency sweeps by excitatory synaptic sequence detection. Proc Natl Acad Sci U S A 2022; 119:e2203748119. PMID: 36279465; PMCID: PMC9636937; DOI: 10.1073/pnas.2203748119.
Abstract
Octopus cells are remarkable projection neurons of the mammalian cochlear nucleus, with extremely fast membranes and wide-frequency tuning. They are considered prime examples of coincidence detectors but are poorly characterized in vivo. We discover that octopus cells are selective to frequency sweep direction, a feature that is absent in their auditory nerve inputs. In vivo intracellular recordings reveal that direction selectivity does not derive from across-frequency coincidence detection but hinges on the amplitudes and activation sequence of auditory nerve inputs tuned to clusters of hot spot frequencies. A simple biophysical octopus cell model excited with real nerve spike trains recreates direction selectivity through interaction of intrinsic membrane conductances with the activation sequence of clustered excitatory inputs. We conclude that octopus cells are sequence detectors, sensitive to temporal patterns across cochlear frequency channels. The detection of sequences rather than coincidences is a much simpler but powerful operation for extracting temporal information.
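The paper's central idea, that a fast passive membrane combined with a fixed set of input delays can prefer one activation order over the other, can be sketched with a toy leaky integrator. All parameters (time constant, delays, EPSP amplitudes) are illustrative assumptions, not values from the study:

```python
import numpy as np

def peak_depolarization(arrival_times, delays, tau=0.2e-3, dt=1e-5, amp=1.0):
    """Peak voltage of a leaky integrator (fast time constant tau, as in
    octopus cells) driven by instantaneous-rise, exponentially decaying
    EPSPs. Input i fires at arrival_times[i] and its EPSP reaches the
    soma after an additional fixed delay delays[i]."""
    onsets = [a + d for a, d in zip(arrival_times, delays)]
    t = np.arange(0.0, max(onsets) + 2e-3, dt)
    v = np.zeros_like(t)
    for onset in onsets:
        v += amp * np.exp(-(t - onset) / tau) * (t >= onset)
    return v.max()

# Three inputs whose fixed delays decrease from input 0 to input 2
# (illustrative values, e.g. longer travel for lower-frequency channels).
delays = [0.6e-3, 0.3e-3, 0.0]

# A sweep in the preferred direction activates input 0 first, so the
# staggered delays bring all EPSPs into near-coincidence at the soma.
preferred = [0.0, 0.3e-3, 0.6e-3]
# The reversed sweep activates input 2 first; now the delays add to the
# activation stagger and the EPSPs arrive spread out in time.
reversed_ = [0.6e-3, 0.3e-3, 0.0]

v_pref = peak_depolarization(preferred, delays)
v_rev = peak_depolarization(reversed_, delays)
```

With the fast membrane, only the order-matched sweep drives a large peak depolarization; the same inputs in the opposite order barely summate.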
2. Gockel HE, Carlyon RP. On mistuning detection and beat perception for harmonic complex tones at low and very high frequencies. J Acoust Soc Am 2022; 152:226. PMID: 35931513; DOI: 10.1121/10.0012351.
Abstract
This study assessed the detection of mistuning of a single harmonic in complex tones (CTs) containing either low-frequency harmonics or very high-frequency harmonics, for which phase locking to the temporal fine structure is weak or absent. CTs had F0s of either 280 or 1400 Hz and contained harmonics 6-10, the 8th of which could be mistuned. Harmonics were presented either diotically or dichotically (odd and even harmonics to different ears). In the diotic condition, mistuning-detection thresholds were very low for both F0s and consistent with detection of temporal interactions (beats) produced by peripheral interactions of components. In the dichotic condition, for which the components in each ear were more widely spaced and beats were not reported, the mistuned component was perceptually segregated from the complex for the low F0, but subjects reported no "popping out" for the high F0 and performance was close to chance. This is consistent with the idea that phase locking is required for perceptual segregation to occur. For diotic presentation, the perceived beat rate corresponded to the amount of mistuning (in Hz). It is argued that the beat percept cannot be explained solely by interactions between the mistuned component and its two closest harmonic neighbours.
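The reported link between beat rate and mistuning can be checked acoustically. In the sketch below the parameters follow the low-F0 condition (F0 = 280 Hz, harmonics 6-10, 8th mistuned by 4 Hz), and the squaring nonlinearity is a crude stand-in for peripheral envelope extraction, not a model from the paper:

```python
import numpy as np

fs, dur = 16000, 1.0            # 1 s at 16 kHz -> 1 Hz FFT resolution
t = np.arange(int(fs * dur)) / fs
f0, mistune = 280.0, 4.0        # F0 and mistuning of the 8th harmonic, in Hz

# Harmonics 6-10, with the 8th shifted by `mistune`.
freqs = [6 * f0, 7 * f0, 8 * f0 + mistune, 9 * f0, 10 * f0]
x = sum(np.cos(2 * np.pi * f * t) for f in freqs)

# Squaring exposes difference tones |fi - fj| between every pair of
# components, a rough analogue of envelope extraction in the periphery.
spec = np.abs(np.fft.rfft(x ** 2)) / len(t)
# In-tune neighbours differ by F0 = 280 Hz; pairs involving the mistuned
# component differ by F0 - 4 = 276 Hz and F0 + 4 = 284 Hz instead.
```

The resulting {276, 280, 284} Hz cluster in the envelope domain itself beats at 4 Hz, i.e. at the amount of mistuning, consistent with the percept described above.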
Affiliation(s)
- Hedwig E Gockel
- Cambridge Hearing Group, MRC Cognition and Brain Sciences Unit, University of Cambridge, 15 Chaucer Road, Cambridge CB2 7EF, United Kingdom
- Robert P Carlyon
- Cambridge Hearing Group, MRC Cognition and Brain Sciences Unit, University of Cambridge, 15 Chaucer Road, Cambridge CB2 7EF, United Kingdom
3. Yellamsetty A, Bidelman GM. Brainstem correlates of concurrent speech identification in adverse listening conditions. Brain Res 2019; 1714:182-192. PMID: 30796895; DOI: 10.1016/j.brainres.2019.02.025.
Abstract
When two voices compete, listeners can segregate and identify concurrent speech sounds using pitch (fundamental frequency, F0) and timbre (harmonic) cues. Speech perception is also hindered by the signal-to-noise ratio (SNR). How clear and degraded concurrent speech sounds are represented at early, pre-attentive stages of the auditory system is not well understood. To this end, we measured scalp-recorded frequency-following responses (FFRs) from the EEG while human listeners heard two concurrently presented, steady-state (time-invariant) vowels whose F0 differed by zero or four semitones (ST), presented diotically in either clean (no noise) or noise-degraded (+5 dB SNR) conditions. Listeners also performed a speeded double-vowel identification task in which they were required to identify both vowels correctly. Behavioral results showed that speech identification accuracy increased with the F0 difference between vowels, and this perceptual F0 benefit was larger for clean than for noise-degraded (+5 dB SNR) stimuli. Neurophysiological data demonstrated more robust FFR F0 amplitudes for single than for double vowels and considerably weaker responses in noise. F0 amplitudes showed speech-on-speech masking effects, along with non-linear constructive interference at 0 ST and suppression effects at 4 ST. Correlations showed that FFR F0 amplitudes failed to predict listeners' identification accuracy. In contrast, FFR F1 amplitudes were associated with faster reaction times, although this correlation was limited to noise conditions. The limited number of brain-behavior associations suggests that subcortical activity mainly reflects exogenous processing rather than perceptual correlates of concurrent speech perception. Collectively, our results demonstrate that FFRs reflect pre-attentive coding of concurrent auditory stimuli that only weakly predicts the success of identifying concurrent speech.
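As a minimal sketch of the FFR metric itself, the spectral magnitude at each voice's fundamental can be read out from a two-voice mixture with a DFT probe. The synthetic "vowels" (equal-amplitude harmonics) and the F0s are assumptions for illustration; a linear toy like this does not, of course, reproduce the neural masking effects reported above:

```python
import numpy as np

fs, dur = 8000, 1.0
t = np.arange(int(fs * dur)) / fs

def vowel(f0, n_harm=5):
    # Crude stand-in for a steady-state vowel: equal-amplitude harmonics.
    return sum(np.cos(2 * np.pi * k * f0 * t) for k in range(1, n_harm + 1))

def mag_at(x, f):
    # DFT magnitude at an arbitrary frequency f, normalised so a unit
    # cosine at f returns ~1.0.
    return 2 * abs(np.dot(x, np.exp(-2j * np.pi * f * t))) / len(t)

f0_a = 100.0                     # voice A (hypothetical F0)
f0_b = 100.0 * 2 ** (4 / 12)     # voice B, 4 semitones higher (~126 Hz)

single = vowel(f0_a)
double = vowel(f0_a) + vowel(f0_b)

# FFR-style "F0 amplitudes": energy at each voice's fundamental.
m_single = mag_at(single, f0_a)
m_a = mag_at(double, f0_a)
m_b = mag_at(double, f0_b)
```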
Affiliation(s)
- Anusha Yellamsetty
- School of Communication Sciences & Disorders, University of Memphis, Memphis, TN, USA; Department of Communication Sciences & Disorders, University of South Florida, USA.
- Gavin M Bidelman
- School of Communication Sciences & Disorders, University of Memphis, Memphis, TN, USA; Institute for Intelligent Systems, University of Memphis, Memphis, TN, USA; University of Tennessee Health Sciences Center, Department of Anatomy and Neurobiology, Memphis, TN, USA.
4. Carney LH. Supra-Threshold Hearing and Fluctuation Profiles: Implications for Sensorineural and Hidden Hearing Loss. J Assoc Res Otolaryngol 2018; 19:331-352. PMID: 29744729; PMCID: PMC6081887; DOI: 10.1007/s10162-018-0669-5.
Abstract
An important topic in contemporary auditory science is supra-threshold hearing. Difficulty hearing at conversational speech levels in background noise has long been recognized as a problem of sensorineural hearing loss, including that associated with aging (presbyacusis). Such difficulty in listeners with normal thresholds has received more attention recently, especially associated with descriptions of synaptopathy, the loss of auditory nerve (AN) fibers as a result of noise exposure or aging. Synaptopathy has been reported to cause a disproportionate loss of low- and medium-spontaneous rate (L/MSR) AN fibers. Several studies of synaptopathy have assumed that the wide dynamic ranges of L/MSR AN fiber rates are critical for coding supra-threshold sounds. First, this review will present data from the literature that argues against a direct role for average discharge rates of L/MSR AN fibers in coding sounds at moderate to high sound levels. Second, the encoding of sounds at supra-threshold levels is examined. A key assumption in many studies is that saturation of AN fiber discharge rates limits neural encoding, even though the majority of AN fibers, high-spontaneous rate (HSR) fibers, have saturated average rates at conversational sound levels. It is argued here that the cross-frequency profile of low-frequency neural fluctuation amplitudes, not average rates, encodes complex sounds. As described below, this fluctuation-profile coding mechanism benefits from both saturation of inner hair cell (IHC) transduction and average rate saturation associated with the IHC-AN synapse. Third, the role of the auditory efferent system, which receives inputs from L/MSR fibers, is revisited in the context of fluctuation-profile coding. The auditory efferent system is hypothesized to maintain and enhance neural fluctuation profiles. Lastly, central mechanisms sensitive to neural fluctuations are reviewed. Low-frequency fluctuations in AN responses are accentuated by cochlear nucleus neurons, which, either directly or via other brainstem nuclei, relay fluctuation profiles to the inferior colliculus (IC). IC neurons are sensitive to the frequency and amplitude of low-frequency fluctuations and convert fluctuation profiles from the periphery into a phase-locked rate profile that is robust across a wide range of sound levels and in background noise. The descending projection from the midbrain (IC) to the efferent system completes a functional loop that, combined with inputs from the L/MSR pathway, is hypothesized to maintain "sharp" supra-threshold hearing, reminiscent of visual mechanisms that regulate optical accommodation. Examples from speech coding and detection in noise are reviewed. Implications for the effects of synaptopathy on control mechanisms hypothesized to influence supra-threshold hearing are discussed. This framework for understanding neural coding and control mechanisms for supra-threshold hearing suggests strategies for the design of novel hearing aid signal-processing and electrical stimulation patterns for cochlear implants.
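The fluctuation-profile idea can be illustrated with a two-channel toy: a channel dominated by one strong harmonic (synchrony capture near a formant peak) carries little F0-rate fluctuation after saturation, whereas a channel driven by two comparable harmonics beats at F0. The tanh front end and all parameters are illustrative assumptions, not the auditory-nerve model used in this literature:

```python
import numpy as np

fs, dur, f0 = 10000, 1.0, 100.0
t = np.arange(int(fs * dur)) / fs

# "On-formant" channel: one dominant harmonic captures the channel.
on_formant = 5.0 * np.cos(2 * np.pi * 10 * f0 * t)
# "Off-formant" channel: two comparable harmonics beat at F0.
off_formant = 2.5 * (np.cos(2 * np.pi * 14 * f0 * t) +
                     np.cos(2 * np.pi * 15 * f0 * t))

def f0_fluctuation(x):
    """Rough IHC/AN front end: saturate, half-wave rectify, then read out
    the modulation magnitude at F0 from the resulting 'rate' waveform."""
    drive = np.maximum(np.tanh(x), 0.0)      # saturation + rectification
    spec = np.abs(np.fft.rfft(drive)) / len(drive)
    return spec[int(f0 * dur)]

fluct_on = f0_fluctuation(on_formant)
fluct_off = f0_fluctuation(off_formant)
```

Both channels are driven well into saturation, so their average "rates" are similar, yet only the off-formant channel carries a strong F0 fluctuation; the cross-frequency pattern of such fluctuations is the profile the review describes.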
Affiliation(s)
- Laurel H Carney
- Departments of Biomedical Engineering, Neuroscience, and Electrical & Computer Engineering, Del Monte Institute for Neuroscience, University of Rochester, 601 Elmwood Ave., Box 603, Rochester, NY, 14642, USA.
5. Peng F, Innes-Brown H, McKay CM, Fallon JB, Zhou Y, Wang X, Hu N, Hou W. Temporal Coding of Voice Pitch Contours in Mandarin Tones. Front Neural Circuits 2018; 12:55. PMID: 30087597; PMCID: PMC6066958; DOI: 10.3389/fncir.2018.00055.
Abstract
Accurate perception of time-variant pitch is important for speech recognition, particularly for tonal languages such as Mandarin, in which different lexical tones convey different semantic information. Previous studies reported that the auditory nerve and cochlear nucleus can encode different pitches through phase-locked neural activity. However, little is known about how the inferior colliculus (IC) encodes the time-variant periodicity pitch of natural speech. In this study, the Mandarin syllable /ba/ pronounced with four lexical tones (flat, rising, falling-then-rising, and falling) was used as the stimulus set. Local field potentials (LFPs) and single-neuron activity were simultaneously recorded from 90 sites within the contralateral IC of six urethane-anesthetized and decerebrate guinea pigs in response to the four stimuli. Analysis of the temporal information in the LFPs showed that 93% of the LFPs exhibited robust encoding of periodicity pitch. Pitch strength of LFPs derived from the autocorrelogram was significantly (p < 0.001) stronger for rising tones than for flat and falling tones, and also increased significantly (p < 0.05) with characteristic frequency (CF). In contrast, only 47% (42 of 90) of single neurons were significantly synchronized to the fundamental frequency of the stimulus, suggesting that the temporal spiking patterns of individual IC neurons encode the time-variant periodicity pitch of speech less robustly. The difference between the number of LFPs and the number of single neurons that encode the time-variant F0 voice pitch supports the notion of a transition at the level of the IC from direct temporal coding in the spike trains of individual neurons to other forms of neural representation.
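A minimal version of an autocorrelogram-based pitch-strength measure is the normalised autocorrelation at the lag of one F0 period (1 for a perfectly periodic response, near 0 for noise). The signals and noise level below are assumptions for illustration, not the paper's analysis pipeline:

```python
import numpy as np

fs = 2000
t = np.arange(int(fs * 0.5)) / fs
rng = np.random.default_rng(0)

def pitch_strength(x, f0, fs):
    """Pitch strength as the normalised autocorrelation of x at a lag of
    one F0 period (1.0 = perfectly periodic, ~0 = no periodicity)."""
    x = x - x.mean()
    lag = int(round(fs / f0))
    r = np.dot(x[:-lag], x[lag:])
    return r / np.dot(x, x)

f0 = 200.0                              # illustrative voice F0
periodic = np.cos(2 * np.pi * f0 * t)   # strongly phase-locked response
noisy = periodic + 3.0 * rng.standard_normal(t.size)

s_clean = pitch_strength(periodic, f0, fs)
s_noisy = pitch_strength(noisy, f0, fs)
```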
Affiliation(s)
- Fei Peng
- Key Laboratory of Biorheological Science and Technology of Ministry of Education, Bioengineering College of Chongqing University, Chongqing, China
- Collaborative Innovation Center for Brain Science, Chongqing University, Chongqing, China
- Hamish Innes-Brown
- Bionics Institute, East Melbourne, VIC, Australia
- Department of Medical Bionics, University of Melbourne, Melbourne, VIC, Australia
- Colette M. McKay
- Bionics Institute, East Melbourne, VIC, Australia
- Department of Medical Bionics, University of Melbourne, Melbourne, VIC, Australia
- James B. Fallon
- Bionics Institute, East Melbourne, VIC, Australia
- Department of Medical Bionics, University of Melbourne, Melbourne, VIC, Australia
- Department of Otolaryngology, University of Melbourne, Melbourne, VIC, Australia
- Yi Zhou
- Chongqing Key Laboratory of Neurobiology, Department of Neurobiology, Third Military Medical University, Chongqing, China
- Xing Wang
- Key Laboratory of Biorheological Science and Technology of Ministry of Education, Bioengineering College of Chongqing University, Chongqing, China
- Chongqing Medical Electronics Engineering Technology Research Center, Chongqing University, Chongqing, China
- Ning Hu
- Key Laboratory of Biorheological Science and Technology of Ministry of Education, Bioengineering College of Chongqing University, Chongqing, China
- Collaborative Innovation Center for Brain Science, Chongqing University, Chongqing, China
- Wensheng Hou
- Key Laboratory of Biorheological Science and Technology of Ministry of Education, Bioengineering College of Chongqing University, Chongqing, China
- Collaborative Innovation Center for Brain Science, Chongqing University, Chongqing, China
- Chongqing Medical Electronics Engineering Technology Research Center, Chongqing University, Chongqing, China
6. Felix RA, Gourévitch B, Portfors CV. Subcortical pathways: Towards a better understanding of auditory disorders. Hear Res 2018; 362:48-60. PMID: 29395615; PMCID: PMC5911198; DOI: 10.1016/j.heares.2018.01.008.
Abstract
Hearing loss is a significant problem that affects at least 15% of the population. This percentage, however, is likely much higher because of a variety of auditory disorders that are not identifiable through traditional tests of peripheral hearing ability. In these disorders, individuals have difficulty understanding speech, particularly in noisy environments, even though the sounds are loud enough to hear. The underlying mechanisms leading to such deficits are not well understood. To enable the development of suitable treatments to alleviate or prevent such disorders, the affected processing pathways must be identified. Historically, mechanisms underlying speech processing have been thought to be a property of the auditory cortex, and thus the study of auditory disorders has largely focused on cortical impairments and/or cognitive processes. As we review here, however, there is strong evidence that deficits in subcortical pathways play a significant role in auditory disorders. In this review, we highlight the role of the auditory brainstem and midbrain in processing complex sounds and discuss how deficits in these regions may contribute to auditory dysfunction. We discuss current research with animal models of human hearing and then consider human studies that implicate impairments in subcortical processing that may contribute to auditory disorders.
Affiliation(s)
- Richard A Felix
- School of Biological Sciences and Integrative Physiology and Neuroscience, Washington State University, Vancouver, WA, USA
- Boris Gourévitch
- Unité de Génétique et Physiologie de l'Audition, UMRS 1120 INSERM, Institut Pasteur, Université Pierre et Marie Curie, F-75015, Paris, France; CNRS, France
- Christine V Portfors
- School of Biological Sciences and Integrative Physiology and Neuroscience, Washington State University, Vancouver, WA, USA.
7. A Role for Auditory Corticothalamic Feedback in the Perception of Complex Sounds. J Neurosci 2017; 37:6149-6161. PMID: 28559384; PMCID: PMC5481946; DOI: 10.1523/jneurosci.0397-17.2017.
Abstract
Feedback signals from the primary auditory cortex (A1) can shape the receptive field properties of neurons in the ventral division of the medial geniculate body (MGBv). However, the behavioral significance of corticothalamic modulation is unknown. The aim of this study was to elucidate the role of this descending pathway in the perception of complex sounds. We tested the ability of adult female ferrets to detect the presence of a mistuned harmonic in a complex tone using a positive conditioned go/no-go behavioral paradigm before and after the input from layer VI in A1 to MGBv was bilaterally and selectively eliminated using chromophore-targeted laser photolysis. MGBv neurons were identified by their short latencies and sharp tuning curves. They responded robustly to harmonic complex tones and exhibited an increase in firing rate and temporal pattern changes when one frequency component in the complex tone was mistuned. Injections of fluorescent microbeads conjugated with a light-sensitive chromophore were made in MGBv, and, following retrograde transport to the cortical cell bodies, apoptosis was induced by infrared laser illumination of A1. This resulted in a selective loss of ∼60% of layer VI A1-MGBv neurons. After the lesion, mistuning detection was impaired, as indicated by decreased d' values, a shift of the psychometric curves toward higher mistuning values, and increased thresholds, whereas discrimination performance was unaffected when level cues were also available. Our results suggest that A1-MGBv corticothalamic feedback contributes to the detection of harmonicity, one of the most important grouping cues in the perception of complex sounds.
SIGNIFICANCE STATEMENT: Perception of a complex auditory scene is based on the ability of the brain to group those sound components that belong to the same source and to segregate them from those belonging to different sources. Because two people talking simultaneously may differ in their voice pitch, perceiving the harmonic structure of sounds is very important for auditory scene analysis. Here we demonstrate mistuning sensitivity in the thalamus and that feedback from the primary auditory cortex is required for the normal ability of ferrets to detect a mistuned harmonic within a complex sound. These results provide novel insight into the function of descending sensory pathways in the brain and suggest that this corticothalamic circuit plays an important role in scene analysis.
8. Snyder JS, Elhilali M. Recent advances in exploring the neural underpinnings of auditory scene perception. Ann N Y Acad Sci 2017; 1396:39-55. PMID: 28199022; PMCID: PMC5446279; DOI: 10.1111/nyas.13317.
Abstract
Studies of auditory scene analysis have traditionally relied on paradigms using artificial sounds, and on conventional behavioral techniques, to elucidate how we perceptually segregate auditory objects or streams from each other. In the past few decades, however, there has been growing interest in uncovering the neural underpinnings of auditory segregation using human and animal neuroscience techniques, as well as computational modeling. This largely reflects the growth in the fields of cognitive neuroscience and computational neuroscience and has led to new theories of how the auditory system segregates sounds in complex arrays. The current review focuses on neural and computational studies of auditory scene perception published in the last few years. Following the progress that has been made in these studies, we describe (1) theoretical advances in our understanding of the most well-studied aspects of auditory scene perception, namely segregation of sequential patterns of sounds and concurrently presented sounds; (2) the diversification of topics and paradigms that have been investigated; and (3) how new neuroscience techniques (including invasive neurophysiology in awake humans, genotyping, and brain stimulation) have been used in this field.
Affiliation(s)
- Joel S. Snyder
- Department of Psychology, University of Nevada, Las Vegas, Las Vegas, Nevada
- Mounya Elhilali
- Department of Electrical and Computer Engineering, The Johns Hopkins University, Baltimore, Maryland
9. Homma NY, Bajo VM, Happel MFK, Nodal FR, King AJ. Mistuning detection performance of ferrets in a go/no-go task. J Acoust Soc Am 2016; 139:EL246. PMID: 27369180; PMCID: PMC7116551; DOI: 10.1121/1.4954378.
Abstract
The harmonic structure of sounds is an important grouping cue in auditory scene analysis. The ability of ferrets to detect mistuned harmonics was measured using a go/no-go task paradigm. Psychometric functions plotting sensitivity as a function of degree of mistuning were used to evaluate behavioral performance using signal detection theory. The mean (± standard error of the mean) threshold for mistuning detection was 0.8 ± 0.1 Hz, with sensitivity indices and reaction times depending on the degree of mistuning. These data provide a basis for investigation of the neural basis for the perception of complex sounds in ferrets, an increasingly used animal model in auditory research.
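The sensitivity index reported here is standard signal detection theory: d' = z(hit rate) - z(false-alarm rate). A stdlib sketch, with illustrative response rates (not the paper's data):

```python
from statistics import NormalDist

def d_prime(hit_rate, fa_rate, n=None):
    """Sensitivity index d' = z(hit) - z(false alarm). Rates of 0 or 1
    are nudged by the common 1/(2n) correction when n trials are given."""
    z = NormalDist().inv_cdf
    if n is not None:
        clamp = lambda p: min(max(p, 1 / (2 * n)), 1 - 1 / (2 * n))
        hit_rate, fa_rate = clamp(hit_rate), clamp(fa_rate)
    return z(hit_rate) - z(fa_rate)

# Hypothetical example: responding to 85% of mistuned targets with a 20%
# false-alarm rate on in-tune catch trials.
d = d_prime(0.85, 0.20)
```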
Affiliation(s)
- Natsumi Y Homma
- Department of Physiology, Anatomy and Genetics, University of Oxford, Parks Road, Oxford, OX1 3PT, United Kingdom
- Victoria M Bajo
- Department of Physiology, Anatomy and Genetics, University of Oxford, Parks Road, Oxford, OX1 3PT, United Kingdom
- Max F K Happel
- Department of Physiology, Anatomy and Genetics, University of Oxford, Parks Road, Oxford, OX1 3PT, United Kingdom
- Fernando R Nodal
- Department of Physiology, Anatomy and Genetics, University of Oxford, Parks Road, Oxford, OX1 3PT, United Kingdom
- Andrew J King
- Department of Physiology, Anatomy and Genetics, University of Oxford, Parks Road, Oxford, OX1 3PT, United Kingdom
10. Bidelman GM, Alain C. Hierarchical neurocomputations underlying concurrent sound segregation: Connecting periphery to percept. Neuropsychologia 2015; 68:38-50. DOI: 10.1016/j.neuropsychologia.2014.12.020.
11. Sayles M, Stasiak A, Winter IM. Reverberation impairs brainstem temporal representations of voiced vowel sounds: challenging "periodicity-tagged" segregation of competing speech in rooms. Front Syst Neurosci 2015; 8:248. PMID: 25628545; PMCID: PMC4290552; DOI: 10.3389/fnsys.2014.00248.
Abstract
The auditory system typically processes information from concurrently active sound sources (e.g., two voices speaking at once), in the presence of multiple delayed, attenuated and distorted sound-wave reflections (reverberation). Brainstem circuits help segregate these complex acoustic mixtures into "auditory objects." Psychophysical studies demonstrate a strong interaction between reverberation and fundamental-frequency (F0) modulation, leading to impaired segregation of competing vowels when segregation is on the basis of F0 differences. Neurophysiological studies of complex-sound segregation have concentrated on sounds with steady F0s, in anechoic environments. However, F0 modulation and reverberation are quasi-ubiquitous. We examine the ability of 129 single units in the ventral cochlear nucleus (VCN) of the anesthetized guinea pig to segregate the concurrent synthetic vowel sounds /a/ and /i/, based on temporal discharge patterns under closed-field conditions. We address the effects of added real-room reverberation, F0 modulation, and the interaction of these two factors, on brainstem neural segregation of voiced speech sounds. A firing-rate representation of single-vowels' spectral envelopes is robust to the combination of F0 modulation and reverberation: local firing-rate maxima and minima across the tonotopic array code vowel-formant structure. However, single-vowel F0-related periodicity information in shuffled inter-spike interval distributions is significantly degraded in the combined presence of reverberation and F0 modulation. Hence, segregation of double-vowels' spectral energy into two streams (corresponding to the two vowels), on the basis of temporal discharge patterns, is impaired by reverberation, specifically when F0 is modulated. All unit types (primary-like, chopper, onset) are similarly affected. These results offer neurophysiological insights into the perceptual organization of complex acoustic scenes under realistically challenging listening conditions.
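The shuffled inter-spike-interval idea can be sketched by pooling intervals only across trials (never within a trial), so that peaks at multiples of 1/F0 reflect stimulus-locked periodicity rather than refractoriness. The spike statistics and the jitter used to mimic degraded phase locking are illustrative assumptions, not this study's data:

```python
import numpy as np

rng = np.random.default_rng(1)
f0, dur, n_trials = 125.0, 0.5, 10      # 8 ms period (illustrative)

def spike_trains(jitter_sd):
    # One spike per F0 cycle, phase-locked with Gaussian timing jitter.
    base = np.arange(0, dur, 1 / f0)
    return [np.sort(base + rng.normal(0, jitter_sd, base.size))
            for _ in range(n_trials)]

def shuffled_isi_peak(trains, period, bin_w=5e-4):
    """Fraction of all across-trial spike intervals falling within half a
    bin of one stimulus period -- a crude periodicity index."""
    diffs = []
    for i, a in enumerate(trains):
        for j, b in enumerate(trains):
            if i != j:                   # shuffled: across trials only
                diffs.append(np.abs(a[:, None] - b[None, :]).ravel())
    d = np.concatenate(diffs)
    near = np.abs(d - period) < bin_w / 2
    window = np.abs(d - period) < 10 * bin_w   # local comparison window
    return near.sum() / max(window.sum(), 1)

clean = shuffled_isi_peak(spike_trains(0.2e-3), 1 / f0)
degraded = shuffled_isi_peak(spike_trains(2.0e-3), 1 / f0)
```

Jitter on the scale of the period (here, a stand-in for the temporal smearing produced by reverberation plus F0 modulation) flattens the interval peak at 1/F0, which is the degradation the study reports.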
Affiliation(s)
- Mark Sayles
- Centre for the Neural Basis of Hearing, The Physiological Laboratory, Department of Physiology, Development and Neuroscience, University of Cambridge, Cambridge, UK
- Arkadiusz Stasiak
- Centre for the Neural Basis of Hearing, The Physiological Laboratory, Department of Physiology, Development and Neuroscience, University of Cambridge, Cambridge, UK
- Ian M Winter
- Centre for the Neural Basis of Hearing, The Physiological Laboratory, Department of Physiology, Development and Neuroscience, University of Cambridge, Cambridge, UK
12. Bandyopadhyay S, Young ED. Nonlinear temporal receptive fields of neurons in the dorsal cochlear nucleus. J Neurophysiol 2013; 110:2414-25. PMID: 23986561; DOI: 10.1152/jn.00278.2013.
Abstract
Studies of the dorsal cochlear nucleus (DCN) have focused on spectral processing because of the complex spectral receptive fields of the DCN. However, temporal fluctuations in natural signals convey important information, including information about moving sound sources or movements of the external ear in animals like cats. Here, we investigate the temporal filtering properties of DCN principal neurons through the use of temporal weighting functions that allow flexible analysis of nonlinearities and time variation in temporal response properties. First-order temporal receptive fields derived from the neurons are sufficient to characterize their response properties to low-contrast (3-dB standard deviation) stimuli. Larger contrasts require the second-order terms. Allowing temporal variation of the parameters of the first-order model or adding a component representing refractoriness improves predictions by the model by relatively small amounts. The importance of second-order components of the model is shown through simulations of nonlinear envelope synchronization behavior across sound level. The temporal model can be combined with a spectral model to predict tuning to the speed and direction of moving sounds.
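A first-order temporal weighting function of the kind described here can be estimated by regressing the response on the recent stimulus history (least-squares reverse correlation). The kernel shape, noise level, and white-contrast stimulus below are illustrative assumptions; second-order terms would add lagged stimulus products as extra regressors:

```python
import numpy as np

rng = np.random.default_rng(2)
n_t, n_lags = 20000, 30

# Random envelope "contrast" stimulus (zero-mean, in dB-like units).
stim = rng.normal(0, 3.0, n_t)

# Ground-truth first-order temporal weighting function: an excitatory
# peak followed by a delayed suppressive dip (shape is illustrative).
lags = np.arange(n_lags)
w_true = np.exp(-lags / 4.0) - 0.5 * np.exp(-(lags - 8) ** 2 / 8.0)

# Design matrix of lagged stimulus values: X[t, k] = stim[t - k].
X = np.stack([np.roll(stim, k) for k in range(n_lags)], axis=1)
X[:n_lags] = 0.0                 # discard wrapped-around samples

# Simulated response: linear filter + noise (a first-order model).
rate = X @ w_true + rng.normal(0, 1.0, n_t)

# Recover the weighting function by least squares.
w_est, *_ = np.linalg.lstsq(X, rate, rcond=None)
```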
13. Zendel BR, Alain C. The influence of lifelong musicianship on neurophysiological measures of concurrent sound segregation. J Cogn Neurosci 2012; 25:503-16. PMID: 23163409; DOI: 10.1162/jocn_a_00329.
Abstract
The ability to separate concurrent sounds based on periodicity cues is critical for parsing complex auditory scenes. This ability is enhanced in young adult musicians and reduced in older adults. Here, we investigated the impact of lifelong musicianship on concurrent sound segregation and perception using scalp-recorded ERPs. Older and younger musicians and nonmusicians were presented with periodic harmonic complexes where the second harmonic could be tuned or mistuned by 1-16% of its original value. The likelihood of perceiving two simultaneous sounds increased with mistuning, and musicians, both older and younger, were more likely to detect and report hearing two sounds when the second harmonic was mistuned at or above 2%. The perception of a mistuned harmonic as a separate sound was paralleled by an object-related negativity that was larger and earlier in younger musicians compared with the other three groups. When listeners made a judgment about the harmonic stimuli, the perception of the mistuned harmonic as a separate sound was paralleled by a positive wave at about 400 msec poststimulus (P400), which was enhanced in both older and younger musicians. These findings suggest that attention-dependent processing of a mistuned harmonic is enhanced in older musicians and provide further evidence that age-related decline in hearing abilities is mitigated by musical training.
14. Du Y, Kong L, Wang Q, Wu X, Li L. Auditory frequency-following response: a neurophysiological measure for studying the "cocktail-party problem". Neurosci Biobehav Rev 2011; 35:2046-57. PMID: 21645541; DOI: 10.1016/j.neubiorev.2011.05.008.
Abstract
How do we recognize what one person is saying when others are speaking at the same time? The "cocktail-party problem" proposed by Cherry (1953) has puzzled scientists for half a century. This puzzle will not be solved without appropriate neurophysiological investigation, which should satisfy four essential requirements: (1) certain critical speech characteristics related to speech intelligibility are recorded; (2) neural responses to different speech sources are differentiated; (3) neural correlates of bottom-up binaural unmasking of responses to target speech are measurable; (4) neural correlates of attentional top-down unmasking of target speech are measurable. Before speech signals reach the cerebral cortex, some critical acoustic features are represented in subcortical structures by frequency-following responses (FFRs), which are sustained evoked potentials based on precisely phase-locked responses of neuron populations to low-to-middle-frequency periodic acoustic stimuli. This review summarizes previous studies on FFRs associated with each of the four requirements and suggests that FFRs are useful for studying the "cocktail-party problem".
Affiliation(s)
- Yi Du
- Department of Psychology, Speech and Hearing Research Center, Key Laboratory on Machine Perception (Ministry of Education), Peking University, Beijing, China
|
15
|
Pitch, harmonicity and concurrent sound segregation: psychoacoustical and neurophysiological findings. Hear Res 2009; 266:36-51. [PMID: 19788920 DOI: 10.1016/j.heares.2009.09.012] [Citation(s) in RCA: 91] [Impact Index Per Article: 6.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 05/08/2009] [Revised: 09/23/2009] [Accepted: 09/24/2009] [Indexed: 11/18/2022]
Abstract
Harmonic complex tones are a particularly important class of sounds found in both speech and music. Although these sounds contain multiple frequency components, they are usually perceived as a coherent whole, with a pitch corresponding to the fundamental frequency (F0). However, when two or more harmonic sounds occur concurrently, e.g., at a cocktail party or in a symphony, the auditory system must separate harmonics and assign them to their respective F0s so that a coherent and veridical representation of the different sound sources is formed. Here we review both psychophysical and neurophysiological (single-unit and evoked-potential) findings, which provide some insight into how, and how well, the auditory system accomplishes this task. A survey of computational models designed to estimate multiple F0s and segregate concurrent sources is followed by a review of the empirical literature on the perception and neural coding of concurrent harmonic sounds, including vowels, as well as findings obtained using single complex tones with mistuned harmonics.
|
16
|
Shackleton TM, Liu LF, Palmer AR. Responses to diotic, dichotic, and alternating phase harmonic stimuli in the inferior colliculus of guinea pigs. J Assoc Res Otolaryngol 2009; 10:76-90. [PMID: 19089495 PMCID: PMC2644390 DOI: 10.1007/s10162-008-0149-4] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/02/2008] [Accepted: 11/13/2008] [Indexed: 11/25/2022] Open
Abstract
Humans perceive a harmonic series as a single auditory object with a pitch equivalent to the fundamental frequency (F0) of the series. When harmonics are presented to alternate ears, the repetition rate of the waveform at each ear doubles. If the harmonics are resolved, then the pitch perceived is still equivalent to F0, suggesting the stimulus is binaurally integrated before pitch is processed. However, unresolved harmonics give rise to a doubling of pitch, as would be expected from monaural processing (Bernstein and Oxenham, J. Acoust. Soc. Am., 113:3323-3334, 2003). We used similar stimuli to record responses of multi-unit clusters in the central nucleus of the inferior colliculus (IC) of anesthetized guinea pigs (urethane supplemented by fentanyl/fluanisone) to determine the nature of the representation of harmonic stimuli and to what extent there was binaural integration. We examined both the temporal and rate-tuning of IC clusters and found no evidence for binaural integration. Stimuli comprised all harmonics below 10 kHz with fundamental frequencies (F0) from 50 to 400 Hz in half-octave steps. In diotic conditions, all the harmonics were presented to both ears. In dichotic conditions, odd harmonics were presented to one ear and even harmonics to the other. Neural characteristic frequencies (CF, n = 85) ranged from 0.2 to 14.7 kHz; 29 had CFs below 1 kHz. The majority of clusters responded predominantly to the contralateral ear, with the dominance of the contralateral ear increasing with CF. With diotic stimuli, over half of the clusters (58%) had peaked firing rate vs. F0 functions. The most common peak F0 was 141 Hz. Almost all (98%) clusters phase locked diotically to an F0 of 50 Hz, and approximately 40% of clusters still phase locked significantly (Rayleigh coefficient >13.8) at the highest F0 tested (400 Hz). These results are consistent with previous reports of responses to amplitude-modulated stimuli. Clusters phase locked significantly at a frequency equal to F0 for contralateral and diotic stimuli but at 2F0 for dichotic stimuli. We interpret these data as responses following the envelope periodicity in monaural channels rather than as a binaurally integrated representation.
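The phase-locking criterion quoted above (Rayleigh coefficient >13.8) is the standard Rayleigh statistic 2nR², where R is the vector strength of the spikes relative to the stimulus cycle. A minimal sketch of that computation, with hypothetical spike times rather than recorded data:

```python
import math

def rayleigh_statistic(spike_times, f0):
    # Map each spike time to a phase within the F0 cycle, compute the
    # vector strength R (length of the mean phase vector), and return
    # the Rayleigh statistic 2*n*R^2; values above 13.8 were taken as
    # significant phase locking in the study above.
    phases = [2 * math.pi * ((t * f0) % 1.0) for t in spike_times]
    n = len(phases)
    c = sum(math.cos(p) for p in phases) / n
    s = sum(math.sin(p) for p in phases) / n
    r = math.hypot(c, s)
    return 2 * n * r * r

# Perfectly phase-locked train: one spike per 10-ms cycle at F0 = 100 Hz
spikes = [k * 0.01 for k in range(50)]
stat = rayleigh_statistic(spikes, 100.0)
```

For perfectly locked spikes R = 1 and the statistic equals 2n; for random phases R approaches 0 and the statistic stays small, which is why a fixed threshold can serve as a significance test.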
Affiliation(s)
- Trevor M Shackleton
- MRC Institute of Hearing Research, University Park, Nottingham, NG7 2RD, UK.
|
17
|
Snyder RL, Bonham BH, Sinex DG. Acute changes in frequency responses of inferior colliculus central nucleus (ICC) neurons following progressively enlarged restricted spiral ganglion lesions. Hear Res 2008; 246:59-78. [PMID: 18938235 PMCID: PMC2630712 DOI: 10.1016/j.heares.2008.09.010] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 06/23/2008] [Revised: 09/24/2008] [Accepted: 09/24/2008] [Indexed: 11/30/2022]
Abstract
Immediate effects of sequential and progressively enlarged spiral ganglion (SG) lesions were recorded from cochleas and inferior colliculi. Small SG-lesions produced modest elevations in cochlear tone-evoked compound action potential (CAP) thresholds across narrow frequency ranges; progressively enlarged lesions produced progressively higher CAP-threshold elevations across progressively wider frequency ranges. No comparable changes in distortion product otoacoustic emission (DPOAE) amplitudes were observed, consistent with silencing of auditory nerve sectors without affecting organ of Corti function. Frequency response areas (FRAs) of inferior colliculus (IC) neurons were recorded before and immediately after SG-lesions using multi-site silicon arrays fixed in place, with recording sites distributed along the IC frequency gradient. Individual post-lesion FRAs exhibited progressively elevated response thresholds and diminished response amplitudes at lesion frequencies, whereas responses at non-lesion frequencies were either unchanged or enhanced. Characteristic frequencies were shifted and silent areas were introduced within these FRAs. Sequentially larger lesions produced sequentially larger shifts in CF and/or enlarged silent areas within affected FRAs, producing immediate changes in IC frequency organization. These results contrast with those from the auditory nerve, extend previous reports of experience-induced plasticity in the auditory CNS, and support results indicating afferent convergence onto ICC neurons across broad frequency bands.
Affiliation(s)
- Russell L Snyder
- Department of Otolaryngology, University of California, San Francisco, CA 94143-0526, United States.
|
18
|
From sounds to meaning: the role of attention during auditory scene analysis. Curr Opin Otolaryngol Head Neck Surg 2008. [DOI: 10.1097/moo.0b013e32830e2096] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2022]
|
19
|
Larsen E, Cedolin L, Delgutte B. Pitch representations in the auditory nerve: two concurrent complex tones. J Neurophysiol 2008; 100:1301-19. [PMID: 18632887 PMCID: PMC2544468 DOI: 10.1152/jn.01361.2007] [Citation(s) in RCA: 45] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022] Open
Abstract
Pitch differences between concurrent sounds are important cues used in auditory scene analysis and also play a major role in music perception. To investigate the neural codes underlying these perceptual abilities, we recorded from single fibers in the cat auditory nerve in response to two concurrent harmonic complex tones with missing fundamentals and equal-amplitude harmonics. We investigated the efficacy of rate-place and interspike-interval codes to represent both pitches of the two tones, which had fundamental frequency (F0) ratios of 15/14 or 11/9. We relied on the principle of scaling invariance in cochlear mechanics to infer the spatiotemporal response patterns to a given stimulus from a series of measurements made in a single fiber as a function of F0. Templates created by a peripheral auditory model were used to estimate the F0s of double complex tones from the inferred distribution of firing rate along the tonotopic axis. This rate-place representation was accurate for F0s ≳ 900 Hz. Surprisingly, rate-based F0 estimates were accurate even when the two-tone mixture contained no resolved harmonics, so long as some harmonics were resolved prior to mixing. We also extended methods used previously for single complex tones to estimate the F0s of concurrent complex tones from interspike-interval distributions pooled over the tonotopic axis. The interval-based representation was accurate for F0s ≲ 900 Hz, where the two-tone mixture contained no resolved harmonics. Together, the rate-place and interval-based representations allow accurate pitch perception for concurrent sounds over the entire range of human voice and cat vocalizations.
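The interval-based representation mentioned above pools interspike-interval distributions across the tonotopic axis. A simplified sketch of the idea, assuming synthetic spike trains: it uses first-order intervals and a simple histogram peak, whereas the study pooled all-order interval distributions and matched them against templates.

```python
def pooled_interval_f0(spike_trains, max_interval=0.02, bin_width=0.0005):
    # Pool first-order interspike intervals across fibers into a
    # histogram and take the most common interval as the period
    # estimate; F0 is the reciprocal of that period.
    n_bins = int(max_interval / bin_width)
    counts = [0] * n_bins
    for train in spike_trains:
        for a, b in zip(train, train[1:]):
            idx = round((b - a) / bin_width)
            if 0 < idx < n_bins:  # skip zero/out-of-range intervals
                counts[idx] += 1
    best = max(range(n_bins), key=counts.__getitem__)
    return 1.0 / (best * bin_width)

# Three hypothetical fibers phase-locked to a 200-Hz fundamental
# (5-ms interspike intervals, different absolute spike times)
trains = [[k * 0.005 + offset for k in range(40)]
          for offset in (0.0, 0.001, 0.002)]
f0_est = pooled_interval_f0(trains)
```

For a concurrent-tone mixture the pooled histogram shows peaks at both periods, which is why interval codes can represent two F0s simultaneously when phase locking is available.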
Affiliation(s)
- Erik Larsen
- Eaton-Peabody Laboratory, Massachusetts Eye and Ear Infirmary, Boston, MA, USA
|