1
van der Heijden K, Patel P, Bickel S, Herrero JL, Mehta AD, Mesgarani N. Joint population coding and temporal coherence link an attended talker's voice and location features in naturalistic multi-talker scenes. bioRxiv 2024:2024.05.13.593814. PMID: 38798551; PMCID: PMC11118436; DOI: 10.1101/2024.05.13.593814.
Abstract
Listeners readily extract multi-dimensional auditory objects, such as a 'localized talker', from complex acoustic scenes with multiple talkers. Yet, the neural mechanisms underlying simultaneous encoding and linking of different sound features - for example, a talker's voice and location - are poorly understood. We analyzed invasive intracranial recordings in neurosurgical patients attending to a localized talker in real-life cocktail party scenarios. We found that sensitivity to an individual talker's voice and location features was distributed throughout auditory cortex and that neural sites exhibited a gradient from sensitivity to a single feature to joint sensitivity to both features. On a population level, cortical response patterns of dual-feature sensitive sites as well as single-feature sensitive sites revealed simultaneous encoding of an attended talker's voice and location features. However, for single-feature sensitive sites, the representation of the primary feature was more precise. Further, sites that selectively tracked an attended speech stream concurrently encoded the attended talker's voice and location features, indicating that such sites combine selective tracking of an attended auditory object with encoding of the object's features. Finally, we found that attending to a localized talker selectively enhanced temporal coherence between single-feature voice-sensitive sites and single-feature location-sensitive sites, providing an additional mechanism for linking voice and location in multi-talker scenes. These results demonstrate that a talker's voice and location features are linked during multi-dimensional object formation in naturalistic multi-talker scenes by joint population coding as well as by temporal coherence between neural sites.
SIGNIFICANCE STATEMENT Listeners effortlessly extract auditory objects from complex acoustic scenes consisting of multiple sound sources in naturalistic, spatial sound scenes. Yet, how the brain links different sound features to form a multi-dimensional auditory object is poorly understood. We investigated how neural responses encode and integrate an attended talker's voice and location features in spatial multi-talker sound scenes to elucidate which neural mechanisms underlie simultaneous encoding and linking of different auditory features. Our results show that joint population coding as well as temporal coherence mechanisms contribute to distributed multi-dimensional auditory object encoding. These findings shed new light on cortical functional specialization and multi-dimensional auditory object formation in complex, naturalistic listening scenes.
HIGHLIGHTS
- Cortical responses to a single talker exhibit a distributed gradient, ranging from sites that are sensitive to both a talker's voice and location (dual-feature sensitive sites) to sites that are sensitive to either voice or location (single-feature sensitive sites).
- Population response patterns of dual-feature sensitive sites encode voice and location features of the attended talker in multi-talker scenes jointly and with equal precision.
- Despite their sensitivity to a single feature at the level of individual cortical sites, population response patterns of single-feature sensitive sites also encode location and voice features of a talker jointly, but with higher precision for the feature they are primarily sensitive to.
- Neural sites that selectively track an attended speech stream concurrently encode the attended talker's voice and location features.
- Attention selectively enhances temporal coherence between voice- and location-selective sites over time.
- Joint population coding as well as temporal coherence mechanisms underlie distributed multi-dimensional auditory object encoding in auditory cortex.
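The temporal-coherence idea can be illustrated with a toy computation: correlating two simulated response time series over sliding windows. Everything below (the synthetic signals, window length, noise level) is invented for illustration and is not the study's actual analysis pipeline:

```python
import numpy as np

def sliding_coherence(x, y, win, step):
    """Mean sliding-window Pearson correlation between two response time series."""
    rs = []
    for start in range(0, len(x) - win + 1, step):
        rs.append(np.corrcoef(x[start:start + win], y[start:start + win])[0, 1])
    return float(np.mean(rs))

rng = np.random.default_rng(0)
shared = rng.standard_normal(1000)                        # common drive from the attended talker
voice_site = shared + 0.5 * rng.standard_normal(1000)     # voice-sensitive site
location_site = shared + 0.5 * rng.standard_normal(1000)  # location-sensitive site
unrelated = rng.standard_normal(1000)                     # site tracking another source

attended = sliding_coherence(voice_site, location_site, win=100, step=50)
baseline = sliding_coherence(voice_site, unrelated, win=100, step=50)
```

Sites driven by the same attended source show elevated windowed correlation relative to an unrelated site, which is the signature the coherence analysis looks for.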
2
Banno T, Shirley H, Fishman YI, Cohen YE. Changes in neural readout of response magnitude during auditory streaming do not correlate with behavioral choice in the auditory cortex. Cell Rep 2023; 42:113493. PMID: 38039133; PMCID: PMC10784988; DOI: 10.1016/j.celrep.2023.113493.
Abstract
A fundamental goal of the auditory system is to group stimuli from the auditory environment into a perceptual unit (i.e., "stream") or segregate the stimuli into multiple different streams. Although previous studies have clarified the psychophysical and neural mechanisms that may underlie this ability, the relationship between these mechanisms remains elusive. Here, we recorded multiunit activity (MUA) from the auditory cortex of monkeys while they participated in an auditory-streaming task consisting of interleaved low- and high-frequency tone bursts. As the streaming stimulus unfolded over time, MUA amplitude habituated; the magnitude of this habituation was correlated with the frequency difference between the tone bursts. An ideal-observer model could classify these time- and frequency-dependent changes into reports of "one stream" or "two streams" in a manner consistent with the behavioral literature. However, because classification was not modulated by the monkeys' behavioral choices, this MUA habituation may not directly reflect perceptual reports.
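The ideal-observer readout can be sketched as a toy model in which response habituation deepens with the frequency separation between the tones and a fixed criterion converts the habituated amplitude into a perceptual report. All parameter values here are invented for illustration, not fitted to the recorded MUA:

```python
import numpy as np

def mua_amplitude(delta_f_semitones, tone_index, tau=5.0, k=0.08):
    """Toy habituation model: the response to successive tones decays over the
    sequence, more deeply for larger frequency separations (illustrative
    parameters, not the paper's fitted values)."""
    depth = 1.0 - np.exp(-k * delta_f_semitones)        # habituation depth grows with delta-f
    return 1.0 - depth * (1.0 - np.exp(-tone_index / tau))

def ideal_observer(delta_f_semitones, tone_index, criterion=0.6):
    """Report 'two streams' once the habituated response falls below a fixed
    criterion, mimicking an ideal-observer classification of MUA amplitude."""
    amp = mua_amplitude(delta_f_semitones, tone_index)
    return "two streams" if amp < criterion else "one stream"
```

Consistent with the streaming literature, this toy readout reports "one stream" early in the sequence or for small frequency separations, and "two streams" late in the sequence for large separations.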
Affiliation(s)
- Taku Banno
- Department of Otorhinolaryngology - Head and Neck Surgery, University of Pennsylvania School of Medicine, Philadelphia, PA 19104, USA
- Harry Shirley
- Department of Otorhinolaryngology - Head and Neck Surgery, University of Pennsylvania School of Medicine, Philadelphia, PA 19104, USA
- Yonatan I Fishman
- Departments of Neurology and Neuroscience, Albert Einstein College of Medicine, Bronx, NY 10461, USA
- Yale E Cohen
- Department of Otorhinolaryngology - Head and Neck Surgery, University of Pennsylvania School of Medicine, Philadelphia, PA 19104, USA; Department of Neuroscience, University of Pennsylvania, Philadelphia, PA 19104, USA; Department of Bioengineering, University of Pennsylvania, Philadelphia, PA 19104, USA.
3
Melland P, Curtu R. Attractor-Like Dynamics Extracted from Human Electrocorticographic Recordings Underlie Computational Principles of Auditory Bistable Perception. J Neurosci 2023; 43:3294-3311. PMID: 36977581; PMCID: PMC10162465; DOI: 10.1523/jneurosci.1531-22.2023.
Abstract
In bistable perception, observers experience alternations between two interpretations of an unchanging stimulus. Neurophysiological studies of bistable perception typically partition neural measurements into stimulus-based epochs and assess neuronal differences between epochs based on subjects' perceptual reports. Computational studies replicate statistical properties of percept durations with modeling principles like competitive attractors or Bayesian inference. However, bridging neuro-behavioral findings with modeling theory requires the analysis of single-trial dynamic data. Here, we propose an algorithm for extracting nonstationary timeseries features from single-trial electrocorticography (ECoG) data. We applied the proposed algorithm to 5-min ECoG recordings from human primary auditory cortex obtained during perceptual alternations in an auditory triplet streaming task (six subjects: four male, two female). We report two ensembles of emergent neuronal features in all trial blocks. One ensemble consists of periodic functions that encode a stereotypical response to the stimulus. The other comprises more transient features and encodes dynamics associated with bistable perception at multiple time scales: minutes (within-trial alternations), seconds (duration of individual percepts), and milliseconds (switches between percepts). Within the second ensemble, we identified a slowly drifting rhythm that correlates with the perceptual states and several oscillators with phase shifts near perceptual switches. Projections of single-trial ECoG data onto these features establish low-dimensional attractor-like geometric structures invariant across subjects and stimulus types. These findings provide supporting neural evidence for computational models with oscillatory-driven attractor-based principles. 
The feature extraction techniques described here generalize across recording modalities and are appropriate when hypothesized low-dimensional dynamics characterize an underlying neural system.
SIGNIFICANCE STATEMENT Irrespective of the sensory modality, neurophysiological studies of multistable perception have typically investigated events time-locked to the perceptual switching rather than the time course of the perceptual states per se. Here, we propose an algorithm that extracts neuronal features of bistable auditory perception from large-scale single-trial data while remaining agnostic to the subject's perceptual reports. The algorithm captures the dynamics of perception at multiple timescales: minutes (within-trial alternations), seconds (durations of individual percepts), and milliseconds (timing of switches), and it distinguishes attributes of neural encoding of the stimulus from those encoding the perceptual states. Finally, our analysis identifies a set of latent variables that exhibit alternating dynamics along a low-dimensional manifold, similar to trajectories in attractor-based models for perceptual bistability.
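As a loose illustration of projecting single-trial multichannel data onto a few dominant components, the sketch below applies a plain SVD to synthetic data containing a slow drift and a stimulus-locked oscillation. Note this is generic dimensionality reduction on invented data, not the nonstationary feature-extraction algorithm the paper proposes:

```python
import numpy as np

# Synthetic 32-channel "recording": a slow percept-like drift plus a fast
# stimulus-locked oscillation, mixed into channels with random weights.
rng = np.random.default_rng(1)
t = np.linspace(0.0, 10.0, 2000)
slow_drift = np.sin(2 * np.pi * 0.05 * t)        # slow rhythm (percept-like)
fast_osc = 0.3 * np.sin(2 * np.pi * 4.0 * t)     # stimulus-locked oscillation
channels = (np.outer(rng.standard_normal(32), slow_drift)
            + np.outer(rng.standard_normal(32), fast_osc)
            + 0.1 * rng.standard_normal((32, 2000)))   # sensor noise

# Center each channel, then extract dominant components via SVD.
channels -= channels.mean(axis=1, keepdims=True)
U, S, Vt = np.linalg.svd(channels, full_matrices=False)
explained = S**2 / np.sum(S**2)   # fraction of variance per component
trajectory = Vt[:2]               # 2-D latent trajectory of the trial
```

With two planted components, the top two singular vectors capture nearly all the variance, and `trajectory` is the kind of low-dimensional single-trial path on which attractor-like geometry can be inspected.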
Affiliation(s)
- Pake Melland
- Department of Mathematics, Southern Methodist University, Dallas, Texas 75275
- Applied Mathematical & Computational Sciences, The University of Iowa, Iowa City, Iowa 52242
- Rodica Curtu
- Department of Mathematics, The University of Iowa, Iowa City, Iowa 52242
- The Iowa Neuroscience Institute, The University of Iowa, Iowa City, Iowa 52242
4
Weise A, Grimm S, Rimmele JM, Schröger E. Auditory representations for long lasting sounds: Insights from event-related brain potentials and neural oscillations. Brain Lang 2023; 237:105221. PMID: 36623340; DOI: 10.1016/j.bandl.2022.105221.
Abstract
The basic features of short sounds, such as frequency and intensity, including their temporal dynamics, are integrated into a unitary representation. Knowledge of how our brain processes long-lasting sounds is scarce. We review research utilizing the Mismatch Negativity event-related potential and neural oscillatory activity for studying representations of long-lasting simple versus complex sounds, such as sinusoidal tones versus speech. There is evidence for a temporal constraint in the formation of auditory representations: auditory edges like sound onsets within long-lasting sounds open a temporal window of about 350 ms in which the sound's dynamics are integrated into a representation, while information beyond that window contributes less to that representation. This integration window segments the auditory input into short chunks. We argue that the representations established in adjacent integration windows can be concatenated into an auditory representation of a long sound, thus overcoming the temporal constraint.
Affiliation(s)
- Annekathrin Weise
- Department of Psychology, Ludwig-Maximilians-University Munich, Germany; Wilhelm Wundt Institute for Psychology, Leipzig University, Germany.
- Sabine Grimm
- Wilhelm Wundt Institute for Psychology, Leipzig University, Germany.
- Johanna Maria Rimmele
- Department of Neuroscience, Max-Planck-Institute for Empirical Aesthetics, Germany; Center for Language, Music and Emotion, New York University, Max Planck Institute, Department of Psychology, 6 Washington Place, New York, NY 10003, United States.
- Erich Schröger
- Wilhelm Wundt Institute for Psychology, Leipzig University, Germany.
5
Gilday OD, Praegel B, Maor I, Cohen T, Nelken I, Mizrahi A. Surround suppression in mouse auditory cortex underlies auditory edge detection. PLoS Comput Biol 2023; 19:e1010861. PMID: 36656876; PMCID: PMC9888713; DOI: 10.1371/journal.pcbi.1010861.
Abstract
Surround suppression (SS) is a fundamental property of sensory processing throughout the brain. In the auditory system, the early processing stream encodes sounds using a one-dimensional physical space: frequency. Previous studies in the auditory system have shown SS to manifest as bandwidth tuning around the preferred frequency. We asked whether bandwidth tuning can be found around frequencies away from the preferred frequency. We exploited the simplicity of the spectral representation of sounds to study SS by manipulating both sound frequency and bandwidth. We recorded single-unit spiking activity from the auditory cortex (ACx) of awake mice in response to an array of broadband stimuli with varying central frequencies and bandwidths. Our recordings revealed that a significant portion of neuronal response profiles had a preferred bandwidth that varied in a regular way with the sound's central frequency. To gain insight into the possible mechanism underlying these responses, we modelled neuronal activity using a variation of the "Mexican hat" function often used to model SS. The model accounted for the response properties of single neurons with high accuracy. Our data and model show that these responses in ACx obey simple rules resulting from the presence of lateral inhibitory sidebands, mostly above the excitatory band of the neuron, that result in sensitivity to the location of top frequency edges, invariant to other spectral attributes. Our work offers a simple explanation for auditory edge detection and possibly other computations of spectral content in sounds.
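The "Mexican hat" idea can be sketched as an excitatory Gaussian flanked by an inhibitory sideband placed above it in frequency; the model neuron's response to a noise band is then the integral of that profile over the band. All parameter values below are invented for illustration, not the paper's fitted values:

```python
import numpy as np

def band_response(low_edge, high_edge, exc_center=1.0, exc_sigma=0.3,
                  inh_center=1.4, inh_sigma=0.5, inh_gain=0.8):
    """Response to a noise band [low_edge, high_edge] (in octaves re: an
    arbitrary reference): excitatory Gaussian minus an inhibitory sideband
    above it, integrated over the band."""
    f = np.linspace(low_edge, high_edge, 1000)
    exc = np.exp(-0.5 * ((f - exc_center) / exc_sigma) ** 2)
    inh = inh_gain * np.exp(-0.5 * ((f - inh_center) / inh_sigma) ** 2)
    return float(np.sum(exc - inh) * (f[1] - f[0]))

below_sideband = band_response(0.5, 1.2)   # top edge stops below the inhibitory sideband
into_sideband = band_response(0.5, 2.5)    # widening the band into the sideband suppresses
```

Widening the band so its top edge extends into the inhibitory sideband reduces the response, which is the sense in which such a profile makes the neuron sensitive to the location of top frequency edges.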
Affiliation(s)
- Omri David Gilday
- The Edmond and Lily Safra Center for Brain Sciences, The Hebrew University of Jerusalem, Jerusalem, Israel
- Benedikt Praegel
- The Edmond and Lily Safra Center for Brain Sciences, The Hebrew University of Jerusalem, Jerusalem, Israel
- Department of Neurobiology, The Hebrew University of Jerusalem, Jerusalem, Israel
- Ido Maor
- The Edmond and Lily Safra Center for Brain Sciences, The Hebrew University of Jerusalem, Jerusalem, Israel
- Department of Neurobiology, The Hebrew University of Jerusalem, Jerusalem, Israel
- Tav Cohen
- Department of Neurobiology, The Hebrew University of Jerusalem, Jerusalem, Israel
- Israel Nelken
- The Edmond and Lily Safra Center for Brain Sciences, The Hebrew University of Jerusalem, Jerusalem, Israel
- Department of Neurobiology, The Hebrew University of Jerusalem, Jerusalem, Israel
- Adi Mizrahi
- The Edmond and Lily Safra Center for Brain Sciences, The Hebrew University of Jerusalem, Jerusalem, Israel
- Department of Neurobiology, The Hebrew University of Jerusalem, Jerusalem, Israel
6
Di Stefano N, Vuust P, Brattico E. Consonance and dissonance perception. A critical review of the historical sources, multidisciplinary findings, and main hypotheses. Phys Life Rev 2022; 43:273-304. PMID: 36372030; DOI: 10.1016/j.plrev.2022.10.004.
Abstract
Revealed more than two millennia ago by Pythagoras, consonance and dissonance (C/D) are foundational concepts in music theory, perception, and aesthetics. The search for the biological, acoustical, and cultural factors that affect C/D perception has resulted in descriptive accounts inspired by arithmetic, musicological, psychoacoustical or neurobiological frameworks without reaching a consensus. Here, we review the key historical sources and modern multidisciplinary findings on C/D and integrate them into three main hypotheses: the vocal similarity hypothesis (VSH), the psychocultural hypothesis (PH), and the sensorimotor hypothesis (SH). By illustrating the hypotheses-related findings, we highlight their major conceptual, methodological, and terminological shortcomings. In an attempt to provide a unitary framework for understanding C/D, we bring together multidisciplinary research on human and animal vocalizations, which converges to suggest that auditory roughness is associated with distress/danger and, therefore, elicits defensive behavioral reactions and neural responses that indicate aversion. We therefore stress the primacy of vocality and roughness as key factors in the explanation of the C/D phenomenon, and we explore the (neuro)biological underpinnings of the attraction-aversion mechanisms that are triggered by C/D stimuli. Based on the reviewed evidence, while the aversive nature of dissonance appears solidly rooted in the multidisciplinary findings, the attractive nature of consonance remains a somewhat speculative claim that needs further investigation. Finally, we outline future directions for empirical research in C/D, especially regarding cross-modal and cross-cultural approaches.
Affiliation(s)
- Nicola Di Stefano
- Institute for Cognitive Sciences and Technologies (ISTC), National Research Council of Italy (CNR), Via San Martino della Battaglia 44, 00185 Rome, Italy.
- Peter Vuust
- Center for Music in the Brain, Department of Clinical Medicine, Aarhus University Royal Academy of Music Aarhus/Aalborg (RAMA), 8000 Aarhus, Denmark.
- Elvira Brattico
- Center for Music in the Brain, Department of Clinical Medicine, Aarhus University Royal Academy of Music Aarhus/Aalborg (RAMA), 8000 Aarhus, Denmark; Department of Education, Psychology, Communication, University of Bari Aldo Moro, 70122 Bari, Italy.
7
Harley HE, Fellner W, Frances C, Thomas A, Losch B, Newton K, Feuerbach D. Information-seeking across auditory scenes by an echolocating dolphin. Anim Cogn 2022; 25:1109-1131. PMID: 36018473; DOI: 10.1007/s10071-022-01679-5.
Abstract
Dolphins gain information through echolocation, a publicly accessible sensory system in which dolphins produce clicks and process returning echoes, thereby both investigating and contributing to auditory scenes. How their knowledge of these scenes contributes to their echoic information-seeking is unclear. Here, we investigate their top-down cognitive processes in an echoic matching-to-sample task in which targets and auditory scenes vary in their decipherability and shift from being completely unfamiliar to familiar. A blindfolded adult male dolphin investigated a target sample positioned in front of a hydrophone to allow recording of clicks, a measure of information-seeking and effort; the dolphin received fish for choosing an object identical to the sample from three alternatives. We presented 20 three-object sets; each set was unfamiliar during its first five 18-trial sessions. Performance accuracy and click counts varied widely across sets. Click counts of the four lowest-performance-accuracy/low-discriminability sets (mean = 41%) and the four highest-performance-accuracy/high-discriminability sets (mean = 91%) were similar at the start of the first sessions and then decreased for both kinds of scenes, although the decrease was substantially greater for low-discriminability sets. In four challenging-but-doable sets, the number of clicks remained relatively steady across the five sessions. Reduced echoic effort with low-discriminability sets was not due to overall motivation: the differential relationship between click number and object-set discriminability was maintained when difficult and easy trials were interleaved and when objects from originally difficult scenes were grouped with more discriminable objects. These data suggest that dolphins calibrate their echoic information-seeking effort based on their knowledge and expectations of auditory scenes.
Affiliation(s)
- Heidi E Harley
- Division of Social Sciences, New College of Florida, 5800 Bay Shore Road, Sarasota, FL, 34243, USA.
- The Seas, Epcot®, Walt Disney World® Resorts, Lake Buena Vista, FL, USA.
- Wendi Fellner
- The Seas, Epcot®, Walt Disney World® Resorts, Lake Buena Vista, FL, USA
- Candice Frances
- Division of Social Sciences, New College of Florida, 5800 Bay Shore Road, Sarasota, FL, 34243, USA
- Basque Center on Cognition, Brain and Language, Donostia, Spain
- Amber Thomas
- Division of Social Sciences, New College of Florida, 5800 Bay Shore Road, Sarasota, FL, 34243, USA
- The Seas, Epcot®, Walt Disney World® Resorts, Lake Buena Vista, FL, USA
- Barbara Losch
- The Seas, Epcot®, Walt Disney World® Resorts, Lake Buena Vista, FL, USA
- Katherine Newton
- Division of Social Sciences, New College of Florida, 5800 Bay Shore Road, Sarasota, FL, 34243, USA
- Department of Fisheries and Wildlife, Oregon State University, Corvallis, USA
- David Feuerbach
- The Seas, Epcot®, Walt Disney World® Resorts, Lake Buena Vista, FL, USA
8
Attentional control via synaptic gain mechanisms in auditory streaming. Brain Res 2021; 1778:147720. PMID: 34785256; DOI: 10.1016/j.brainres.2021.147720.
Abstract
Attention is a crucial component in sound source segregation, allowing auditory objects of interest to be both singled out and held in focus. Our study utilizes a fundamental paradigm for sound source segregation: a sequence of interleaved tones, A and B, of different frequencies that can be heard as a single integrated stream or segregated into two streams (auditory streaming paradigm). We focus on the irregular alternations between integrated and segregated that occur for long presentations, so-called auditory bistability. Psychoacoustic experiments demonstrate how attentional control, a listener's intention to experience integrated or segregated, biases perception in favour of different perceptual interpretations. Our data show that this is achieved by prolonging the dominance times of the attended percept and, to a lesser extent, by curtailing the dominance times of the unattended percept, an effect that remains consistent across a range of values for the difference in frequency between A and B. An existing neuromechanistic model describes the neural dynamics of perceptual competition downstream of primary auditory cortex (A1). The model allows us to propose plausible neural mechanisms for attentional control, as linked to different attentional strategies, in a direct comparison with behavioural data. A mechanism based on a percept-specific input gain best accounts for the effects of attentional control.
9
Jain S, Cherian R, Nataraja NP, Narne VK. The Relationship Between Tinnitus Pitch, Audiogram Edge Frequency, and Auditory Stream Segregation Abilities in Individuals With Tinnitus. Am J Audiol 2021; 30:524-534. PMID: 34139145; DOI: 10.1044/2021_aja-20-00087.
Abstract
Purpose Around 80%-93% of individuals with tinnitus have hearing loss. Researchers have found that tinnitus pitch is related to the frequencies of hearing loss, but the relationship between tinnitus pitch and audiogram edge frequency remains unclear. The comorbidity of tinnitus and speech-perception-in-noise problems has also been reported, but the relationship between tinnitus pitch and speech perception in noise has seldom been investigated. This study was designed to estimate the relationship between tinnitus pitch, audiogram edge frequency, and speech perception in noise, with speech perception in noise measured using an auditory stream segregation paradigm. Method Thirteen individuals with bilateral mild-to-severe tonal tinnitus and minimal-to-mild cochlear hearing loss were selected, along with thirteen individuals with hearing loss but without tinnitus. The audiogram of each participant with tinnitus was matched with that of a participant without tinnitus. The tinnitus pitch of the participants with tinnitus was measured and compared with the audiogram edge frequency. Stream segregation thresholds were calculated at each participant's admitted tinnitus pitch and at one octave below the tinnitus pitch, and were estimated at the fission and fusion boundaries using pure-tone stimuli in an ABA paradigm. Results A high correlation between tinnitus pitch and audiogram edge frequency was noted. Overall stream segregation thresholds were higher for individuals with tinnitus; higher thresholds indicate poorer stream segregation abilities. Within the tinnitus group, thresholds were significantly lower at the frequency corresponding to the admitted tinnitus pitch than at one octave below it. Conclusions The information from this study may be helpful in educating patients about the relationship between hearing loss and tinnitus. The findings may also account for the speech-perception-in-noise difficulties often reported by individuals with tinnitus.
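The ABA paradigm used in the Method can be sketched as follows: each triplet consists of tone A, tone B, tone A, followed by a silent gap (the "ABA-" pattern of the van Noorden streaming paradigm). The frequencies, durations, and sample rate below are illustrative choices, not the study's stimulus parameters:

```python
import numpy as np

def aba_sequence(freq_a, freq_b, n_triplets=5, tone_ms=50, gap_ms=50, sr=16000):
    """Generate an ABA- triplet stream: tone A, tone B, tone A, silent gap,
    repeated n_triplets times. Returns a mono waveform in [-1, 1]."""
    def tone(freq):
        t = np.arange(int(sr * tone_ms / 1000)) / sr
        return np.sin(2 * np.pi * freq * t)
    silence = np.zeros(int(sr * gap_ms / 1000))
    triplet = np.concatenate([tone(freq_a), tone(freq_b), tone(freq_a), silence])
    return np.tile(triplet, n_triplets)

stream = aba_sequence(1000.0, 1260.0)   # roughly a 4-semitone A-B separation
```

Small A-B separations favor hearing one galloping stream; large separations favor segregation into two streams, which is what the fission and fusion boundaries quantify.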
Affiliation(s)
- Saransh Jain
- Department of Speech and Hearing, Jagadguru Sri Shivarathreeshwara Institute of Speech and Hearing, Mysuru, India
- Riya Cherian
- Department of ENT, Sree Gokulam Medical College & Research Foundation, Venjaranmood, India
- Nuggehalli P. Nataraja
- Department of Speech and Hearing, Jagadguru Sri Shivarathreeshwara Institute of Speech and Hearing, Mysuru, India
- Vijaya Kumar Narne
- Department of Mechanical Engineering, Indian Institute of Technology Kanpur, India
10
Kline AM, Aponte DA, Tsukano H, Giovannucci A, Kato HK. Inhibitory gating of coincidence-dependent sensory binding in secondary auditory cortex. Nat Commun 2021; 12:4610. PMID: 34326331; PMCID: PMC8322099; DOI: 10.1038/s41467-021-24758-6.
Abstract
Integration of multi-frequency sounds into a unified perceptual object is critical for recognizing syllables in speech. This "feature binding" relies on the precise synchrony of each component's onset timing, but little is known regarding its neural correlates. We find that multi-frequency sounds prevalent in vocalizations, specifically harmonics, preferentially activate the mouse secondary auditory cortex (A2), whose response deteriorates with shifts in component onset timings. The temporal window for harmonics integration in A2 was broadened by inactivation of somatostatin-expressing interneurons (SOM cells), but not parvalbumin-expressing interneurons (PV cells). Importantly, A2 has functionally connected subnetworks of neurons preferentially encoding harmonic over inharmonic sounds. These subnetworks are stable across days and exist prior to experimental harmonics exposure, suggesting their formation during development. Furthermore, A2 inactivation impairs performance in a discrimination task for coincident harmonics. Together, we propose A2 as a locus for multi-frequency integration, which may form the circuit basis for vocal processing.
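The onset-coincidence manipulation can be sketched by generating a harmonic complex in which one component's onset is delayed relative to the others. The fundamental, number of harmonics, durations, and which component is shifted are all illustrative choices, not the study's stimulus code:

```python
import numpy as np

def harmonic_stack(f0=440.0, n_harmonics=4, dur_ms=200, onset_shift_ms=0.0,
                   shifted_component=2, sr=16000):
    """Harmonic complex in which one component's onset can be delayed,
    probing the coincidence dependence of feature binding."""
    n = int(sr * dur_ms / 1000)
    shift = int(sr * onset_shift_ms / 1000)
    out = np.zeros(n + shift)
    t = np.arange(n) / sr
    for h in range(1, n_harmonics + 1):
        component = np.sin(2 * np.pi * h * f0 * t)
        start = shift if h == shifted_component else 0
        out[start:start + n] += component      # delay only the shifted harmonic
    return out / n_harmonics                    # normalize to [-1, 1]

synchronous = harmonic_stack(onset_shift_ms=0.0)
asynchronous = harmonic_stack(onset_shift_ms=50.0)   # 50 ms onset shift
```

In the study's framing, A2 responses deteriorate as the onset shift grows, so stimuli like `asynchronous` would drive A2 less effectively than `synchronous`.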
Affiliation(s)
- Amber M Kline
- Department of Psychiatry, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA; Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Destinee A Aponte
- Department of Psychiatry, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA; Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Hiroaki Tsukano
- Department of Psychiatry, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA; Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Andrea Giovannucci
- Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA; Joint Department of Biomedical Engineering, University of North Carolina at Chapel Hill and North Carolina State University, Chapel Hill, NC, USA
- Hiroyuki K Kato
- Department of Psychiatry, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA; Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA; Carolina Institute for Developmental Disabilities, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA.
11
Neuronal figure-ground responses in primate primary auditory cortex. Cell Rep 2021; 35:109242. PMID: 34133935; PMCID: PMC8220257; DOI: 10.1016/j.celrep.2021.109242.
Abstract
Figure-ground segregation, the brain's ability to group related features into stable perceptual entities, is crucial for auditory perception in noisy environments. The neuronal mechanisms for this process are poorly understood in the auditory system. Here, we report figure-ground modulation of multi-unit activity (MUA) in the primary and non-primary auditory cortex of rhesus macaques. Across both regions, MUA increases upon presentation of auditory figures, which consist of coherent chord sequences. We show increased activity even in the absence of any perceptual decision, suggesting that neural mechanisms for perceptual grouping are, to some extent, independent of behavioral demands. Furthermore, we demonstrate differences in figure encoding between more anterior and more posterior regions; perceptual saliency is represented in anterior cortical fields only. Our results suggest an encoding of auditory figures from the earliest cortical stages by a rate code.
Highlights:
- Neuronal figure-ground modulation in primary auditory cortex
- A rate code is used to signal the presence of auditory figures
- Anteriorly located recording sites encode perceptual saliency
- Figure-ground modulation is present without perceptual detection
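Figure stimuli of this kind (coherent chord sequences) can be sketched as a simplified stochastic figure-ground generator, in which a fixed set of "figure" components repeats across otherwise random chords. This is an illustrative reconstruction with invented parameters, not the study's stimulus code:

```python
import numpy as np

def sfg_chords(n_chords=10, n_background=8, n_figure=4, seed=0):
    """Simplified stochastic figure-ground stimulus: each chord draws random
    background frequencies; a fixed set of coherent 'figure' components is
    added to every chord. Returns the chord list and the figure frequencies."""
    rng = np.random.default_rng(seed)
    pool = np.logspace(np.log10(200), np.log10(7000), 60)   # candidate frequencies (Hz)
    figure = rng.choice(pool, size=n_figure, replace=False)
    chords = []
    for _ in range(n_chords):
        background = rng.choice(pool, size=n_background, replace=False)
        chords.append(np.concatenate([background, figure]))
    return chords, figure

chords, figure = sfg_chords()
```

Only the figure components are temporally coherent (they recur in every chord); the background components change from chord to chord, which is what makes the figure "pop out" perceptually.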
12
Ferrario A, Rankin J. Auditory streaming emerges from fast excitation and slow delayed inhibition. J Math Neurosci 2021; 11:8. [PMID: 33939042] [PMCID: PMC8093365] [DOI: 10.1186/s13408-021-00106-2]
Abstract
In the auditory streaming paradigm, alternating sequences of pure tones can be perceived as a single galloping rhythm (integration) or as two sequences with separated low and high tones (segregation). Although studied for decades, the neural mechanisms underlying this perceptual grouping of sound remain a mystery. With the aim of identifying a plausible minimal neural circuit that captures this phenomenon, we propose a firing rate model with two periodically forced neural populations coupled by fast direct excitation and slow delayed inhibition. By analyzing the model in a non-smooth, slow-fast regime, we analytically prove the existence of a rich repertoire of dynamical states and of their parameter-dependent transitions. We impose plausible parameter restrictions and link all states with perceptual interpretations. Regions of stimulus parameters occupied by states linked with each percept match those found in behavioural experiments. Our model suggests that slow inhibition masks the perception of subsequent tones during segregation (forward masking), whereas fast excitation enables integration for large pitch differences between the two tones.
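The circuit motif this abstract describes can be illustrated with a minimal firing-rate sketch: two tone-driven units coupled by fast cross-excitation and slow, delayed cross-inhibition. This is not the authors' actual model or fitted parameters; every parameter value below is hypothetical and chosen only for illustration.

```python
import numpy as np

def simulate(T=6.0, dt=1e-3, tau_e=0.01, tau_i=0.5, delay=0.05,
             w_ee=0.4, w_ei=1.2, period=0.25, drive=1.0):
    """Euler simulation of two rate units, A and B, driven by anti-phase
    'tone' input and coupled by fast cross-excitation (w_ee) and slow,
    delayed cross-inhibition (w_ei, tau_i, delay). Illustrative only."""
    n = int(T / dt)
    d = int(delay / dt)
    r = np.zeros((n, 2))   # firing rates of populations A and B
    s = np.zeros((n, 2))   # slow synaptic variables tracking delayed rates
    f = lambda x: np.maximum(x, 0.0)  # threshold-linear transfer function
    for t in range(1, n):
        on_a = (t * dt) % period < period / 2
        inp = np.array([drive, 0.0]) if on_a else np.array([0.0, drive])
        r_delayed = r[max(t - 1 - d, 0)]
        # each unit receives fast excitation from the other's current rate
        # and slow inhibition from the other's delayed activity
        net = inp + w_ee * r[t - 1, ::-1] - w_ei * s[t - 1, ::-1]
        r[t] = r[t - 1] + dt / tau_e * (-r[t - 1] + f(net))
        s[t] = s[t - 1] + dt / tau_i * (-s[t - 1] + r_delayed)
    return r
```

In such a sketch, the slow delayed inhibitory variable suppresses the response to the subsequent tone (a forward-masking-like effect), while the fast cross-excitation term tends to co-activate both units, loosely mirroring the integration/segregation mechanisms the abstract attributes to the two coupling types.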
Affiliation(s)
- Andrea Ferrario
- Department of Mathematics, College of Engineering, Mathematics & Physical Sciences, University of Exeter, Exeter, UK.
- James Rankin
- Department of Mathematics, College of Engineering, Mathematics & Physical Sciences, University of Exeter, Exeter, UK
13
Holmes E, Zeidman P, Friston KJ, Griffiths TD. Difficulties with Speech-in-Noise Perception Related to Fundamental Grouping Processes in Auditory Cortex. Cereb Cortex 2020; 31:1582-1596. [PMID: 33136138] [PMCID: PMC7869094] [DOI: 10.1093/cercor/bhaa311]
Abstract
In our everyday lives, we are often required to follow a conversation when background noise is present (“speech-in-noise” [SPIN] perception). SPIN perception varies widely—and people who are worse at SPIN perception are also worse at fundamental auditory grouping, as assessed by figure-ground tasks. Here, we examined the cortical processes that link difficulties with SPIN perception to difficulties with figure-ground perception using functional magnetic resonance imaging. We found strong evidence that the earliest stages of the auditory cortical hierarchy (left core and belt areas) are similarly disinhibited when SPIN and figure-ground tasks are more difficult (i.e., at target-to-masker ratios corresponding to 60% rather than 90% performance)—consistent with increased cortical gain at lower levels of the auditory hierarchy. Overall, our results reveal a common neural substrate for these basic (figure-ground) and naturally relevant (SPIN) tasks—which provides a common computational basis for the link between SPIN perception and fundamental auditory grouping.
Affiliation(s)
- Emma Holmes
- Wellcome Centre for Human Neuroimaging, UCL Queen Square Institute of Neurology, UCL, London WC1N 3AR, UK
- Peter Zeidman
- Wellcome Centre for Human Neuroimaging, UCL Queen Square Institute of Neurology, UCL, London WC1N 3AR, UK
- Karl J Friston
- Wellcome Centre for Human Neuroimaging, UCL Queen Square Institute of Neurology, UCL, London WC1N 3AR, UK
- Timothy D Griffiths
- Wellcome Centre for Human Neuroimaging, UCL Queen Square Institute of Neurology, UCL, London WC1N 3AR, UK; Biosciences Institute, Faculty of Medical Sciences, Newcastle University, Newcastle upon Tyne NE2 4HH, UK
14
Nguyen QA, Rinzel J, Curtu R. Buildup and bistability in auditory streaming as an evidence accumulation process with saturation. PLoS Comput Biol 2020; 16:e1008152. [PMID: 32853256] [PMCID: PMC7480857] [DOI: 10.1371/journal.pcbi.1008152]
Abstract
A repeating triplet sequence ABA- of non-overlapping brief tones, A and B, is a valued paradigm for studying auditory stream formation and the cocktail party problem. The stimulus is "heard" either as a galloping pattern (integration) or as two interleaved streams (segregation); the initial percept is typically integration, followed by spontaneous alternations between segregation and integration, each being dominant for a few seconds. The probability of segregation grows over seconds, from near zero to a steady value, defining the buildup function (BUF). As the difference in tone frequencies, DF, increases, the BUF's stationary level increases and the BUF rises faster. Percept durations have DF-dependent means and follow gamma-like distributions. Behavioral and computational studies usually characterize triplet streaming either during alternations or during buildup. Here, our experimental design and modeling encompass both. We propose a pseudo-neuromechanistic model that incorporates spiking activity in primary auditory cortex, A1, as input and resolves perception along two network layers downstream of A1. Our model is straightforward and intuitive. It describes the noisy accumulation of evidence against the current percept, which generates switches upon reaching a threshold. Accumulation can saturate either above or below threshold; if below, the switching dynamics resemble noise-induced transitions from an attractor state. Our model accounts quantitatively for three key features of the data: the BUFs, mean durations, and normalized dominance-duration distributions, at various DF values. It describes perceptual alternations without competition per se, and underscores that treating triplets in the sequence independently and averaging across trials, as implemented in earlier widely cited studies, is inadequate.
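The accumulate-to-threshold idea summarized in this abstract can be sketched in a few lines. This is a generic illustration, not the authors' model: the function name, the saturating drift term, and all parameter values are assumptions chosen only to make the mechanism concrete.

```python
import numpy as np

def switch_times(T=60.0, dt=1e-3, tau=1.0, drift=0.6, sat=1.2,
                 theta=1.0, sigma=0.35, seed=1):
    """Noisy accumulation of evidence against the current percept.
    The accumulator relaxes toward a saturation level `sat`; crossing the
    threshold `theta` triggers a perceptual switch and a reset. If
    sat < theta, the accumulator saturates below threshold and switches
    become purely noise-driven (attractor-like regime). Illustrative only."""
    rng = np.random.default_rng(seed)
    a = 0.0
    switches = []
    for i in range(int(T / dt)):
        a += dt / tau * drift * (sat - a) \
             + sigma * np.sqrt(dt) * rng.standard_normal()
        a = max(a, 0.0)                 # evidence cannot go negative
        if a >= theta:
            switches.append(i * dt)     # record the switch time
            a = 0.0                     # restart accumulation
    return switches
```

With `sat` above `theta`, inter-switch intervals in such a sketch are roughly gamma-like, qualitatively matching the dominance-duration statistics the abstract describes; lowering `sat` below `theta` gives the noise-induced regime.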
Affiliation(s)
- Quynh-Anh Nguyen
- Department of Mathematics, The University of Iowa, Iowa City, Iowa, United States of America
- John Rinzel
- Center for Neural Science, New York University, New York, New York, United States of America
- Courant Institute of Mathematical Sciences, New York University, New York, New York, United States of America
- Rodica Curtu
- Department of Mathematics, The University of Iowa, Iowa City, Iowa, United States of America
- Iowa Neuroscience Institute, Human Brain Research Laboratory, Iowa City, Iowa, United States of America
15
Cai H, Dent ML. Attention capture in birds performing an auditory streaming task. PLoS One 2020; 15:e0235420. [PMID: 32589692] [PMCID: PMC7319309] [DOI: 10.1371/journal.pone.0235420]
Abstract
Numerous animal models have been used to investigate the neural mechanisms of auditory processing in complex acoustic environments, but it is unclear whether an animal's auditory attention functions similarly to a human's when processing competing auditory scenes. Here we investigated the effects of attention capture in birds performing an objective auditory streaming paradigm. The classical ABAB… patterned pure-tone sequences were modified and used for the task. We trained the birds to selectively attend to a target stream and respond only to the deviant appearing in the target stream, even though their attention might be captured by a deviant in the background stream. When no deviant appeared in the background stream, the birds experienced the build-up of the streaming process in a qualitatively similar way as they did in a subjective paradigm. Although the birds were trained to selectively attend to the target stream, they failed to avoid the involuntary attention switch caused by the background deviant, especially when the background deviant was sequentially unpredictable. Their global performance deteriorated more with increasingly salient background deviants, where the build-up process was reset by the background distractor. Moreover, sequential predictability of the background deviant facilitated the recovery of the build-up process after attention capture. This is the first study that addresses the perceptual consequences of the joint effects of top-down and bottom-up attention in behaving animals.
Affiliation(s)
- Huaizhen Cai
- Department of Psychology, University at Buffalo, The State University of New York, Buffalo, New York, United States of America
- Micheal L. Dent
- Department of Psychology, University at Buffalo, The State University of New York, Buffalo, New York, United States of America
16
Streaming of Repeated Noise in Primary and Secondary Fields of Auditory Cortex. J Neurosci 2020; 40:3783-3798. [PMID: 32273487] [DOI: 10.1523/jneurosci.2105-19.2020]
Abstract
Statistical regularities in natural sounds facilitate the perceptual segregation of auditory sources, or streams. Repetition is one cue that drives stream segregation in humans, but the neural basis of this perceptual phenomenon remains unknown. We demonstrated a similar perceptual ability in animals by training ferrets of both sexes to detect a stream of repeating noise samples (foreground) embedded in a stream of random samples (background). During passive listening, we recorded neural activity in primary auditory cortex (A1) and secondary auditory cortex (posterior ectosylvian gyrus, PEG). We used two context-dependent encoding models to test for evidence of streaming of the repeating stimulus. The first was based on average evoked activity per noise sample and the second on the spectro-temporal receptive field. Both approaches tested whether differences in neural responses to repeating versus random stimuli were better modeled by scaling the response to both streams equally (global gain) or by separately scaling the response to the foreground versus background stream (stream-specific gain). Consistent with previous observations of adaptation, we found an overall reduction in global gain when the stimulus began to repeat. However, when we measured stream-specific changes in gain, responses to the foreground were enhanced relative to the background. This enhancement was stronger in PEG than A1. In A1, enhancement was strongest in units with low sparseness (i.e., broad sensory tuning) and with tuning selective for the repeated sample. Enhancement of responses to the foreground relative to the background provides evidence for stream segregation that emerges in A1 and is refined in PEG.SIGNIFICANCE STATEMENT To interact with the world successfully, the brain must parse behaviorally important information from a complex sensory environment. 
Complex mixtures of sounds often arrive at the ears simultaneously or in close succession, yet they are effortlessly segregated into distinct perceptual sources. This process breaks down in hearing-impaired individuals and speech recognition devices. By identifying the underlying neural mechanisms that facilitate perceptual segregation, we can develop strategies for ameliorating hearing loss and improving speech recognition technology in the presence of background noise. Here, we present evidence to support a hierarchical process, present in primary auditory cortex and refined in secondary auditory cortex, in which sound repetition facilitates segregation.
17
Auditory streaming and bistability paradigm extended to a dynamic environment. Hear Res 2019; 383:107807. [PMID: 31622836] [DOI: 10.1016/j.heares.2019.107807]
Abstract
We explore stream segregation with temporally modulated acoustic features using behavioral experiments and modelling. The auditory streaming paradigm, in which alternating high-frequency (A) and low-frequency (B) tones appear in a repeating ABA- pattern, has been shown to be perceptually bistable for extended presentations (on the order of minutes). For a fixed, repeating stimulus, perception spontaneously changes (switches) at random times, every 2-15 s, between an integrated interpretation with a galloping rhythm and segregated streams. Streaming in a natural auditory environment requires segregation of auditory objects with features that evolve over time. With the relatively idealized ABA-triplet paradigm, we explore perceptual switching in a non-static environment by considering slowly and periodically varying stimulus features. Our previously published model captures the dynamics of auditory bistability and predicts here how perceptual switches are entrained, tightly locked to the rising and falling phases of modulation. In psychoacoustic experiments we find that entrainment depends on both the period of modulation and the intrinsic switch characteristics of individual listeners. The extended auditory streaming paradigm with slowly modulated stimulus features presented here will be of significant interest for future imaging and neurophysiology experiments, as it reduces the need for subjective perceptual reports of ongoing perception.
18
Abstract
Humans and other animals use spatial hearing to rapidly localize events in the environment. However, neural encoding of sound location is a complex process involving the computation and integration of multiple spatial cues that are not represented directly in the sensory organ (the cochlea). Our understanding of these mechanisms has increased enormously in the past few years. Current research is focused on the contribution of animal models for understanding human spatial audition, the effects of behavioural demands on neural sound location encoding, the emergence of a cue-independent location representation in the auditory cortex, and the relationship between single-source and concurrent location encoding in complex auditory scenes. Furthermore, computational modelling seeks to unravel how neural representations of sound source locations are derived from the complex binaural waveforms of real-life sounds. In this article, we review and integrate the latest insights from neurophysiological, neuroimaging and computational modelling studies of mammalian spatial hearing. We propose that the cortical representation of sound location emerges from recurrent processing taking place in a dynamic, adaptive network of early (primary) and higher-order (posterior-dorsal and dorsolateral prefrontal) auditory regions. This cortical network accommodates changing behavioural requirements and is especially relevant for processing the location of real-life, complex sounds and complex auditory scenes.
19
Neural Signatures of Auditory Perceptual Bistability Revealed by Large-Scale Human Intracranial Recordings. J Neurosci 2019; 39:6482-6497. [PMID: 31189576] [PMCID: PMC6697394] [DOI: 10.1523/jneurosci.0655-18.2019]
Abstract
A key challenge in neuroscience is understanding how sensory stimuli give rise to perception, especially when the process is supported by neural activity from an extended network of brain areas. Perception is inherently subjective, so interrogating its neural signatures requires, ideally, a combination of three factors: (1) behavioral tasks that separate stimulus-driven activity from perception per se; (2) human subjects who self-report their percepts while performing those tasks; and (3) concurrent neural recordings acquired at high spatial and temporal resolution. In this study, we analyzed human electrocorticographic recordings obtained during an auditory task which supported mutually exclusive perceptual interpretations. Eight neurosurgical patients (5 male; 3 female) listened to sequences of repeated triplets where tones were separated in frequency by several semitones. Subjects reported spontaneous alternations between two auditory perceptual states, 1-stream and 2-stream, by pressing a button. We compared averaged auditory evoked potentials (AEPs) associated with 1-stream and 2-stream percepts and identified significant differences between them in primary and nonprimary auditory cortex, surrounding auditory-related temporoparietal cortex, and frontal areas. We developed classifiers to identify spatial maps of percept-related differences in the AEP, corroborating findings from statistical analysis. We used one-dimensional embedding spaces to perform the group-level analysis. Our data illustrate exemplar high temporal resolution AEP waveforms in auditory core region; explain inconsistencies in perceptual effects within auditory cortex, reported across noninvasive studies of streaming of triplets; show percept-related changes in frontoparietal areas previously highlighted by studies that focused on perceptual transitions; and demonstrate that auditory cortex encodes maintenance of percepts and switches between them. 
SIGNIFICANCE STATEMENT The human brain has the remarkable ability to discern complex and ambiguous stimuli from the external world by parsing mixed inputs into interpretable segments. However, one's perception can deviate from objective reality. But how do perceptual discrepancies occur? What are their anatomical substrates? To address these questions, we performed intracranial recordings in neurosurgical patients as they reported their perception of sounds associated with two mutually exclusive interpretations. We identified signatures of subjective percepts as distinct from sound-driven brain activity in core and non-core auditory cortex and frontoparietal cortex. These findings were compared with previous studies of auditory bistable perception and suggested that perceptual transitions and maintenance of perceptual states were supported by common neural substrates.
20
Rankin J, Rinzel J. Computational models of auditory perception from feature extraction to stream segregation and behavior. Curr Opin Neurobiol 2019; 58:46-53. [PMID: 31326723] [DOI: 10.1016/j.conb.2019.06.009]
Abstract
Audition is by nature dynamic, from brainstem processing on sub-millisecond time scales, to segregating and tracking sound sources with changing features, to the pleasure of listening to music and the satisfaction of getting the beat. We review recent advances from computational models of sound localization, of auditory stream segregation and of beat perception/generation. A wealth of behavioral, electrophysiological and imaging studies shed light on these processes, typically with synthesized sounds having regular temporal structure. Computational models integrate knowledge from different experimental fields and at different levels of description. We advocate a neuromechanistic modeling approach that incorporates knowledge of the auditory system from various fields, that utilizes plausible neural mechanisms, and that bridges our understanding across disciplines.
Affiliation(s)
- James Rankin
- College of Engineering, Mathematics and Physical Sciences, University of Exeter, Harrison Building, North Park Rd, Exeter EX4 4QF, UK.
- John Rinzel
- Center for Neural Science, New York University, 4 Washington Place, 10003 New York, NY, United States; Courant Institute of Mathematical Sciences, New York University, 251 Mercer St, 10012 New York, NY, United States
21
Paredes-Gallardo A, Dau T, Marozeau J. Auditory Stream Segregation Can Be Modeled by Neural Competition in Cochlear Implant Listeners. Front Comput Neurosci 2019; 13:42. [PMID: 31333438] [PMCID: PMC6616076] [DOI: 10.3389/fncom.2019.00042]
Abstract
Auditory stream segregation is a perceptual process by which the human auditory system groups sounds from different sources into perceptually meaningful elements (e.g., a voice or a melody). The perceptual segregation of sounds is important, for example, for the understanding of speech in noisy scenarios, a particularly challenging task for listeners with a cochlear implant (CI). It has been suggested that some aspects of stream segregation may be explained by relatively basic neural mechanisms at a cortical level. During the past decades, a variety of models have been proposed to account for the data from stream segregation experiments in normal-hearing (NH) listeners. However, little attention has been given to corresponding findings in CI listeners. The present study investigated whether a neural model of sequential stream segregation, proposed to describe the behavioral effects observed in NH listeners, can account for behavioral data from CI listeners. The model operates on the stimulus features at the cortical level and includes a competition stage between the neuronal units encoding the different percepts. The competition arises from a combination of mutual inhibition, adaptation, and additive noise. The model was found to capture the main trends in the behavioral data from CI listeners, such as the larger probability of a segregated percept with increasing feature difference between the sounds, as well as the build-up effect. Importantly, this was achieved without any modification to the model's competition stage, suggesting that stream segregation could be mediated by a similar mechanism in both groups of listeners.
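The competition stage described here (mutual inhibition, adaptation, and additive noise) can be sketched with a generic two-unit rate model of the kind common in bistability modelling. This is a textbook-style illustration, not the authors' implementation; the function name and all parameter values are hypothetical.

```python
import numpy as np

def dominance_durations(T=30.0, dt=1e-3, tau=0.01, tau_a=1.0,
                        beta=2.0, g=1.5, inp=1.0, sigma=0.03, seed=2):
    """Two units code for the competing percepts. Mutual inhibition (beta),
    slow adaptation (g, tau_a) and additive noise (sigma) together produce
    alternating dominance. Hypothetical parameters, for illustration only."""
    rng = np.random.default_rng(seed)
    n = int(T / dt)
    r = np.array([1.0, 0.0])   # start with unit 0 dominant
    a = np.zeros(2)            # adaptation variables
    relu = lambda x: np.maximum(x, 0.0)
    winner = np.empty(n, dtype=int)
    for i in range(n):
        noise = sigma * np.sqrt(dt) * rng.standard_normal(2)
        # each unit is excited by the input and inhibited by its rival
        # and by its own adaptation; noise perturbs the rates directly
        r = np.maximum(
            r + dt / tau * (-r + relu(inp - beta * r[::-1] - a)) + noise, 0.0)
        a += dt / tau_a * (-a + g * r)   # adaptation tracks own activity
        winner[i] = int(r[1] > r[0])
    edges = np.flatnonzero(np.diff(winner))
    return np.diff(np.r_[0, edges, n]) * dt  # dominance episode durations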
Affiliation(s)
- Andreu Paredes-Gallardo
- Hearing Systems Section, Department of Health Technology, Technical University of Denmark, Lyngby, Denmark
- Torsten Dau
- Hearing Systems Section, Department of Health Technology, Technical University of Denmark, Lyngby, Denmark
- Jeremy Marozeau
- Hearing Systems Section, Department of Health Technology, Technical University of Denmark, Lyngby, Denmark
22
Rajasingam SL, Summers RJ, Roberts B. Stream biasing by different induction sequences: Evaluating stream capture as an account of the segregation-promoting effects of constant-frequency inducers. J Acoust Soc Am 2018; 144:3409. [PMID: 30599694] [DOI: 10.1121/1.5082300]
Abstract
Stream segregation for a test sequence comprising high-frequency (H) and low-frequency (L) pure tones, presented in a galloping rhythm, is much greater when preceded by a constant-frequency induction sequence matching one subset than by an inducer configured like the test sequence; this difference persists for several seconds. It has been proposed that constant-frequency inducers promote stream segregation by capturing the matching subset of test-sequence tones into an on-going, pre-established stream. This explanation was evaluated using 2-s induction sequences followed by longer test sequences (12-20 s). Listeners reported the number of streams heard throughout the test sequence. Experiment 1 used LHL- sequences and one or other subset of inducer tones was attenuated (0-24 dB in 6-dB steps, and ∞). Greater attenuation usually caused a progressive increase in segregation, towards that following the constant-frequency inducer. Experiment 2 used HLH- sequences and the L inducer tones were raised or lowered in frequency relative to their test-sequence counterparts (ΔfI = 0, 0.5, 1.0, or 1.5 × ΔfT ). Either change greatly increased segregation. These results are concordant with the notion of attention switching to new sounds but contradict the stream-capture hypothesis, unless a "proto-object" corresponding to the continuing subset is assumed to form during the induction sequence.
Affiliation(s)
- Saima L Rajasingam
- Psychology, School of Life and Health Sciences, Aston University, Birmingham B4 7ET, United Kingdom
- Robert J Summers
- Psychology, School of Life and Health Sciences, Aston University, Birmingham B4 7ET, United Kingdom
- Brian Roberts
- Psychology, School of Life and Health Sciences, Aston University, Birmingham B4 7ET, United Kingdom
23
Ruggles DR, Tausend AN, Shamma SA, Oxenham AJ. Cortical markers of auditory stream segregation revealed for streaming based on tonotopy but not pitch. J Acoust Soc Am 2018; 144:2424. [PMID: 30404514] [PMCID: PMC6909992] [DOI: 10.1121/1.5065392]
Abstract
The brain decomposes mixtures of sounds, such as competing talkers, into perceptual streams that can be attended to individually. Attention can enhance the cortical representation of streams, but it is unknown what acoustic features the enhancement reflects, or where in the auditory pathways attentional enhancement is first observed. Here, behavioral measures of streaming were combined with simultaneous low- and high-frequency envelope-following responses (EFR) that are thought to originate primarily from cortical and subcortical regions, respectively. Repeating triplets of harmonic complex tones were presented with alternating fundamental frequencies. The tones were filtered to contain either low-numbered spectrally resolved harmonics, or only high-numbered unresolved harmonics. The behavioral results confirmed that segregation can be based on either tonotopic or pitch cues. The EFR results revealed no effects of streaming or attention on subcortical responses. Cortical responses revealed attentional enhancement under conditions of streaming, but only when tonotopic cues were available, not when streaming was based only on pitch cues. The results suggest that the attentional modulation of phase-locked responses is dominated by tonotopically tuned cortical neurons that are insensitive to pitch or periodicity cues.
Affiliation(s)
- Dorea R Ruggles
- Department of Psychology, University of Minnesota, 75 East River Parkway, Minneapolis, Minnesota 55455, USA
- Alexis N Tausend
- Department of Psychology, University of Minnesota, 75 East River Parkway, Minneapolis, Minnesota 55455, USA
- Shihab A Shamma
- Electrical and Computer Engineering Department & Institute for Systems, University of Maryland, College Park, Maryland 20740, USA
- Andrew J Oxenham
- Department of Psychology, University of Minnesota, 75 East River Parkway, Minneapolis, Minnesota 55455, USA
24
Kondo HM, Pressnitzer D, Shimada Y, Kochiyama T, Kashino M. Inhibition-excitation balance in the parietal cortex modulates volitional control for auditory and visual multistability. Sci Rep 2018; 8:14548. [PMID: 30267021] [PMCID: PMC6162284] [DOI: 10.1038/s41598-018-32892-3]
Abstract
Perceptual organisation must select one interpretation from several alternatives to guide behaviour. Computational models suggest that this could be achieved through an interplay between inhibition and excitation across competing neural populations coding for each interpretation. Here, to test such models, we used magnetic resonance spectroscopy to measure non-invasively the concentrations of inhibitory γ-aminobutyric acid (GABA) and excitatory glutamate-glutamine (Glx) in several brain regions. Human participants first performed auditory and visual multistability tasks that produced spontaneous switching between percepts. Then, we observed that longer percept durations during behaviour were associated with higher GABA/Glx ratios in the sensory area coding for each modality. When participants were asked to voluntarily modulate their perception, a common factor across modalities emerged: the GABA/Glx ratio in the posterior parietal cortex tended to be positively correlated with the amount of effective volitional control. Our results provide direct evidence that the balance between neural inhibition and excitation within sensory regions resolves perceptual competition. This powerful computational principle appears to be leveraged by both audition and vision, implemented independently across modalities, but modulated by an integrated control process.
Affiliation(s)
- Hirohito M Kondo
- School of Psychology, Chukyo University, Nagoya, Aichi, Japan.
- Human Information Science Laboratory, NTT Communication Science Laboratories, NTT Corporation, Atsugi, Kanagawa, Japan.
- Daniel Pressnitzer
- Laboratoire des Systèmes Perceptifs, CNRS UMR 8248, Paris, France
- Département d'Études Cognitives, École Normale Supérieure, Paris, France
- Yasuhiro Shimada
- Brain Activity Imaging Center, ATR-Promotions, Seika-cho, Kyoto, Japan
- Takanori Kochiyama
- Brain Activity Imaging Center, ATR-Promotions, Seika-cho, Kyoto, Japan
- Department of Cognitive Neuroscience, Advanced Telecommunications Research Institute International, Seika-cho, Kyoto, Japan
- Makio Kashino
- Sports Brain Science Project, NTT Communication Science Laboratories, NTT Corporation, Atsugi, Kanagawa, Japan
- School of Engineering, Tokyo Institute of Technology, Yokohama, Kanagawa, Japan
25
Cai H, Screven LA, Dent ML. Behavioral measurements of auditory streaming and build-up by budgerigars (Melopsittacus undulatus). J Acoust Soc Am 2018; 144:1508. [PMID: 30424658] [DOI: 10.1121/1.5054297]
Abstract
The perception of the build-up of auditory streaming has been widely investigated in humans, while it is unknown whether animals experience a similar perception when hearing high (H) and low (L) tonal pattern sequences. The paradigm previously used in European starlings (Sturnus vulgaris) was adopted in two experiments to address the build-up of auditory streaming in budgerigars (Melopsittacus undulatus). In experiment 1, different numbers of repetitions of low-high-low triplets were used in five conditions to study the build-up process. In experiment 2, 5 and 15 repetitions of high-low-high triplets were used to investigate the effects of repetition rate, frequency separation, and frequency range of the two tones on the birds' streaming perception. Similar to humans, budgerigars subjectively experienced the build-up process in auditory streaming; faster repetition rates and larger frequency separations enhanced the streaming perception, and these results were consistent across the two frequency ranges. Response latency analysis indicated that the budgerigars needed a longer amount of time to respond to stimuli that elicited a salient streaming perception. These results indicate, for the first time using a behavioral paradigm, that budgerigars experience a build-up of auditory streaming in a manner similar to humans.
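The low-high-low (ABA) triplet paradigm used here, and in several of the surrounding entries, can be sketched in code. The following is a minimal, illustrative synthesis in Python; the base frequency, tone and gap durations, and semitone separation are arbitrary assumptions for demonstration, not the stimulus parameters used in the study:

```python
import numpy as np

def aba_triplet_sequence(f_a=1000.0, delta_semitones=6.0, tone_dur=0.1,
                         gap_dur=0.1, n_triplets=5, fs=44100):
    """Synthesize a repeating ABA- triplet sequence (A, B, A, silent gap).

    All parameter defaults are hypothetical illustration values:
    f_a is the A-tone frequency in Hz, and delta_semitones is the
    A-to-B frequency separation that drives stream segregation.
    """
    f_b = f_a * 2 ** (delta_semitones / 12.0)  # B tone above A by delta
    t = np.arange(int(tone_dur * fs)) / fs

    def tone(f):
        y = np.sin(2 * np.pi * f * t)
        # 10-ms raised-cosine ramps to avoid onset/offset clicks
        ramp = int(0.01 * fs)
        env = np.ones_like(y)
        env[:ramp] = 0.5 * (1 - np.cos(np.pi * np.arange(ramp) / ramp))
        env[-ramp:] = env[:ramp][::-1]
        return y * env

    gap = np.zeros(int(gap_dur * fs))
    triplet = np.concatenate([tone(f_a), tone(f_b), tone(f_a), gap])
    return np.tile(triplet, n_triplets), fs
```

Varying `delta_semitones` and the tone/gap durations corresponds to the frequency-separation and repetition-rate manipulations that modulate the streaming percept in these studies.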
Affiliation(s)
- Huaizhen Cai
- Department of Psychology, University at Buffalo, The State University of New York, Buffalo, New York 14260, USA
- Laurel A Screven
- Department of Psychology, University at Buffalo, The State University of New York, Buffalo, New York 14260, USA
- Micheal L Dent
- Department of Psychology, University at Buffalo, The State University of New York, Buffalo, New York 14260, USA
26
Selezneva E, Gorkin A, Budinger E, Brosch M. Neuronal correlates of auditory streaming in the auditory cortex of behaving monkeys. Eur J Neurosci 2018; 48:3234-3245. [PMID: 30070745 DOI: 10.1111/ejn.14098] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/15/2017] [Revised: 06/27/2018] [Accepted: 07/20/2018] [Indexed: 11/29/2022]
Abstract
This study tested the hypothesis that spiking activity in the primary auditory cortex of monkeys is related to auditory stream formation. Evidence for this hypothesis was previously obtained in animals that were passively exposed to stimuli and in which differences in the streaming percept were confounded with differences between the stimuli. In this study, monkeys performed an operant task on sequences that were composed of light flashes and tones. The tones alternated between a high and a low frequency and could be perceived either as one auditory stream or two auditory streams. The flashes promoted either a one-stream percept or a two-stream percept. Comparison of different types of sequences revealed that the neuronal responses to the alternating tones were more similar when the flashes promoted auditory stream integration, and were more dissimilar when the flashes promoted auditory stream segregation. Thus our findings show that the spiking activity in the monkey primary auditory cortex is related to auditory stream formation.
Affiliation(s)
- Eike Budinger
- Leibniz Institut für Neurobiologie, Magdeburg, Germany
27
Christison-Lagay KL, Cohen YE. The Contribution of Primary Auditory Cortex to Auditory Categorization in Behaving Monkeys. Front Neurosci 2018; 12:601. [PMID: 30210282 PMCID: PMC6123543 DOI: 10.3389/fnins.2018.00601] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/01/2018] [Accepted: 08/09/2018] [Indexed: 11/13/2022] Open
Abstract
The specific contribution of core auditory cortex to auditory perception –such as categorization– remains controversial. To identify a contribution of the primary auditory cortex (A1) to perception, we recorded A1 activity while monkeys reported whether a temporal sequence of tone bursts was heard as having a “small” or “large” frequency difference. We found that A1 had frequency-tuned responses that habituated, independent of frequency content, as this auditory sequence unfolded over time. We also found that A1 firing rate was modulated by the monkeys’ reports of “small” and “large” frequency differences; this modulation correlated with their behavioral performance. These findings are consistent with the hypothesis that A1 contributes to the processes underlying auditory categorization.
Affiliation(s)
- Kate L Christison-Lagay
- Neuroscience Graduate Group, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
- Yale E Cohen
- Departments of Otorhinolaryngology, Neuroscience, and Bioengineering, University of Pennsylvania, Philadelphia, PA, United States
28
Angeloni C, Geffen MN. Contextual modulation of sound processing in the auditory cortex. Curr Opin Neurobiol 2018; 49:8-15. [PMID: 29125987 PMCID: PMC6037899 DOI: 10.1016/j.conb.2017.10.012] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/01/2017] [Revised: 10/11/2017] [Accepted: 10/13/2017] [Indexed: 12/26/2022]
Abstract
In everyday acoustic environments, we navigate through a maze of sounds that possess a complex spectrotemporal structure, spanning many frequencies and exhibiting temporal modulations that differ within frequency bands. Our auditory system needs to efficiently encode the same sounds in a variety of different contexts, while preserving the ability to separate complex sounds within an acoustic scene. Recent work in auditory neuroscience has made substantial progress in studying how sounds are represented in the auditory system under different contexts, demonstrating that auditory processing of seemingly simple acoustic features, such as frequency and time, is highly dependent on co-occurring acoustic and behavioral stimuli. Through a combination of electrophysiological recordings, computational analysis and behavioral techniques, recent research has identified interactions between the external spectral and temporal context of stimuli and the internal behavioral state.
Affiliation(s)
- C Angeloni
- Department of Otorhinolaryngology: HNS, Department of Neuroscience, Psychology Graduate Group, Computational Neuroscience Initiative, University of Pennsylvania, Philadelphia, PA, United States
- M N Geffen
- Department of Otorhinolaryngology: HNS, Department of Neuroscience, Psychology Graduate Group, Computational Neuroscience Initiative, University of Pennsylvania, Philadelphia, PA, United States.
29
Noda T, Takahashi H. Behavioral evaluation of auditory stream segregation in rats. Neurosci Res 2018; 141:52-62. [PMID: 29580889 DOI: 10.1016/j.neures.2018.03.007] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/11/2017] [Revised: 03/08/2018] [Accepted: 03/22/2018] [Indexed: 10/17/2022]
Abstract
Perceptual organization of sound sequences into separate sound sources or streams is called auditory stream segregation. Neural substrates for this process in both the spectral and temporal domains remain to be elucidated. Despite abundant knowledge about their auditory physiology, behavioral evidence for auditory streaming in rodents is still limited. We provide behavioral evidence for auditory streaming in rats using a go/no-go discrimination task, but not a two-alternative choice task. In the go/no-go discrimination phase, rats were able to discriminate different rhythms corresponding to segregated or integrated tone sequences in both short inter-tone interval (ITI) and long ITI conditions, although performance was poorer in the long ITI group. In probe testing, which assessed the ability to discriminate one of the segregated tone sequences from ABA- tone sequences, the detection rate increased with the difference in frequency (ΔF) for short (100 ms), but not long (200 ms), ITIs. Our results indicate that auditory streaming in rats, with respect to both the spectral and temporal features of the ABA- tone paradigm, is qualitatively analogous to that observed in human psychophysics studies. This suggests that rodents are a valuable model for investigating the neural substrates of auditory streaming.
Affiliation(s)
- Takahiro Noda
- Research Center for Advanced Science and Technology, The University of Tokyo, Tokyo, Japan
- Hirokazu Takahashi
- Research Center for Advanced Science and Technology, The University of Tokyo, Tokyo, Japan.
30
Knyazeva S, Selezneva E, Gorkin A, Aggelopoulos NC, Brosch M. Neuronal Correlates of Auditory Streaming in Monkey Auditory Cortex for Tone Sequences without Spectral Differences. Front Integr Neurosci 2018; 12:4. [PMID: 29440999 PMCID: PMC5797536 DOI: 10.3389/fnint.2018.00004] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/04/2017] [Accepted: 01/16/2018] [Indexed: 11/13/2022] Open
Abstract
This study finds a neuronal correlate of auditory perceptual streaming in the primary auditory cortex for sequences of tone complexes that have the same amplitude spectrum but a different phase spectrum. Our finding is based on microelectrode recordings of multiunit activity from 270 cortical sites in three awake macaque monkeys. The monkeys were presented with repeated sequences of a tone triplet that consisted of an A tone, a B tone, another A tone and then a pause. The A and B tones were composed of unresolved harmonics formed by adding the harmonics in cosine phase, in alternating phase, or in random phase. A previous psychophysical study on humans revealed that when the A and B tones are similar, humans integrate them into a single auditory stream; when the A and B tones are dissimilar, humans segregate them into separate auditory streams. We found that the similarity of neuronal rate responses to the triplets was highest when all A and B tones had cosine phase. Similarity was intermediate when the A tones had cosine phase and the B tones had alternating phase. Similarity was lowest when the A tones had cosine phase and the B tones had random phase. The present study corroborates and extends previous reports, showing similar correspondences between neuronal activity in the primary auditory cortex and auditory streaming of sound sequences. It also is consistent with Fishman’s population separation model of auditory streaming.
Affiliation(s)
- Stanislava Knyazeva
- Speziallabor Primatenneurobiologie, Leibniz-Institute für Neurobiologie, Magdeburg, Germany
- Elena Selezneva
- Speziallabor Primatenneurobiologie, Leibniz-Institute für Neurobiologie, Magdeburg, Germany
- Alexander Gorkin
- Speziallabor Primatenneurobiologie, Leibniz-Institute für Neurobiologie, Magdeburg, Germany; Laboratory of Psychophysiology, Institute of Psychology, Moscow, Russia
- Michael Brosch
- Speziallabor Primatenneurobiologie, Leibniz-Institute für Neurobiologie, Magdeburg, Germany; Center for Behavioral Brain Sciences, Otto-von-Guericke-University, Magdeburg, Germany
31
Neural Decoding of Bistable Sounds Reveals an Effect of Intention on Perceptual Organization. J Neurosci 2018; 38:2844-2853. [PMID: 29440556 PMCID: PMC5852662 DOI: 10.1523/jneurosci.3022-17.2018] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/19/2017] [Revised: 01/21/2018] [Accepted: 02/06/2018] [Indexed: 12/05/2022] Open
Abstract
Auditory signals arrive at the ear as a mixture that the brain must decompose into distinct sources based to a large extent on acoustic properties of the sounds. An important question concerns whether listeners have voluntary control over how many sources they perceive. This has been studied using pure high (H) and low (L) tones presented in the repeating pattern HLH-HLH-, which can form a bistable percept heard either as an integrated whole (HLH-) or as segregated into high (H-H-) and low (-L-) sequences. Although instructing listeners to try to integrate or segregate sounds affects reports of what they hear, this could reflect a response bias rather than a perceptual effect. We had human listeners (15 males, 12 females) continuously report their perception of such sequences and recorded neural activity using MEG. During neutral listening, a classifier trained on patterns of neural activity distinguished between periods of integrated and segregated perception. In other conditions, participants tried to influence their perception by allocating attention either to the whole sequence or to a subset of the sounds. They reported hearing the desired percept for a greater proportion of time than when listening neutrally. Critically, neural activity supported these reports; stimulus-locked brain responses in auditory cortex were more likely to resemble the signature of segregation when participants tried to hear segregation than when attempting to perceive integration. These results indicate that listeners can influence how many sound sources they perceive, as reflected in neural responses that track both the input and its perceptual organization. SIGNIFICANCE STATEMENT Can we consciously influence our perception of the external world? We address this question using sound sequences that can be heard either as coming from a single source or as two distinct auditory streams. 
Listeners reported spontaneous changes in their perception between these two interpretations while we recorded neural activity to identify signatures of such integration and segregation. They also indicated that they could, to some extent, choose between these alternatives. This claim was supported by corresponding changes in responses in auditory cortex. By linking neural and behavioral correlates of perception, we demonstrate that the number of objects that we perceive can depend not only on the physical attributes of our environment, but also on how we intend to experience it.
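The decoding logic described in this entry (classifying perceptual state from patterns of neural activity) can be illustrated with a toy sketch. Everything below is synthetic: the "channels" are random vectors with an assumed class offset, and the nearest-centroid rule is a stand-in for whatever classifier the study actually trained on MEG data:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "neural activity": n_trials x n_channels patterns for two
# hypothetical perceptual states (integrated vs. segregated), separated
# by an assumed per-channel offset.
n_trials, n_channels = 100, 20
integrated = rng.normal(0.0, 1.0, (n_trials, n_channels)) + 0.8
segregated = rng.normal(0.0, 1.0, (n_trials, n_channels)) - 0.8

X = np.vstack([integrated, segregated])
y = np.array([0] * n_trials + [1] * n_trials)  # 0 = integrated, 1 = segregated

# Split trials into train/test halves.
idx = rng.permutation(len(y))
train, test = idx[:100], idx[100:]

# Nearest-centroid classifier: label each test pattern by the closer
# class mean learned from the training trials.
centroids = np.stack([X[train][y[train] == c].mean(axis=0) for c in (0, 1)])
dists = np.linalg.norm(X[test][:, None, :] - centroids[None, :, :], axis=2)
pred = dists.argmin(axis=1)
accuracy = (pred == y[test]).mean()
```

With well-separated synthetic classes the decoder is near-perfect; the experimental question in the entry is whether real stimulus-locked cortical responses carry enough percept-related signal for such decoding to beat chance.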
32
A Crucial Test of the Population Separation Model of Auditory Stream Segregation in Macaque Primary Auditory Cortex. J Neurosci 2017; 37:10645-10655. [PMID: 28954867 DOI: 10.1523/jneurosci.0792-17.2017] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/23/2017] [Revised: 08/29/2017] [Accepted: 09/05/2017] [Indexed: 11/21/2022] Open
Abstract
An important aspect of auditory scene analysis is auditory stream segregation: the organization of sound sequences into perceptual streams reflecting different sound sources in the environment. Several models have been proposed to account for stream segregation. According to the "population separation" (PS) model, alternating ABAB tone sequences are perceived as a single stream or as two separate streams when "A" and "B" tones activate the same or distinct frequency-tuned neuronal populations in primary auditory cortex (A1), respectively. A crucial test of the PS model is whether it can account for the observation that A and B tones are generally perceived as a single stream when presented synchronously, rather than in an alternating pattern, even if they are widely separated in frequency. Here, we tested the PS model by recording neural responses to alternating (ALT) and synchronous (SYNC) tone sequences in A1 of male macaques. Consistent with predictions of the PS model, a greater effective tonotopic separation of A and B tone responses was observed under ALT than under SYNC conditions, thus paralleling the perceptual organization of the sequences. While other models of stream segregation, such as temporal coherence, are not excluded by the present findings, we conclude that PS is sufficient to account for the perceptual organization of ALT and SYNC sequences and thus remains a viable model of auditory stream segregation. SIGNIFICANCE STATEMENT According to the population separation (PS) model of auditory stream segregation, sounds that activate the same or separate neural populations in primary auditory cortex (A1) are perceived as one or two streams, respectively. It is unclear, however, whether the PS model can account for the perception of sounds as a single stream when they are presented synchronously. Here, we tested the PS model by recording neural responses to alternating (ALT) and synchronous (SYNC) tone sequences in macaque A1.
A greater effective separation of tonotopic activity patterns was observed under ALT than under SYNC conditions, thus paralleling the perceptual organization of the sequences. Based on these findings, we conclude that PS remains a plausible neurophysiological model of auditory stream segregation.
33
Itatani N, Klump GM. Interaction of spatial and non-spatial cues in auditory stream segregation in the European starling. Eur J Neurosci 2017; 51:1191-1200. [PMID: 28922512 DOI: 10.1111/ejn.13716] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/04/2017] [Revised: 09/14/2017] [Accepted: 09/14/2017] [Indexed: 11/29/2022]
Abstract
Integrating sounds from the same source and segregating sounds from different sources in an acoustic scene are essential functions of the auditory system. Naturally, the auditory system makes use of multiple cues simultaneously. Here, we investigate the interaction between spatial cues and frequency cues in stream segregation in European starlings (Sturnus vulgaris) using an objective measure of perception. Neural responses to streaming sounds were recorded while the birds performed a behavioural task that yields higher sensitivity during a one-stream than a two-stream percept. Birds were trained to detect an onset time shift of the B tone in an ABA- triplet sequence in which A and B could differ in frequency and/or spatial location. If the frequency difference, the spatial separation between the signal sources, or both were increased, behavioural time-shift detection performance deteriorated. Spatial separation had a smaller effect on performance than the frequency difference, and the two cues affected performance additively. Neural responses in the primary auditory forebrain were affected by both frequency and spatial cues. However, frequency and spatial cue differences large enough to elicit behavioural effects were not accompanied by correlated differences in neural responses. The discrepancy between the neuronal response pattern and the behavioural response is discussed in relation to the task given to the bird. The perceptual effects of combining different cues in auditory scene analysis indicate that these cues are analysed independently and given different weights, suggesting that the streaming percept arises subsequent to initial cue analysis.
Affiliation(s)
- Naoya Itatani
- Animal Physiology and Behavior Group, Department for Neuroscience, School for Medicine and Health Sciences, Carl-von-Ossietzky University Oldenburg, 26111, Oldenburg, Germany; Cluster of Excellence Hearing4all, Carl-von-Ossietzky University Oldenburg, Oldenburg, Germany
- Georg M Klump
- Animal Physiology and Behavior Group, Department for Neuroscience, School for Medicine and Health Sciences, Carl-von-Ossietzky University Oldenburg, 26111, Oldenburg, Germany; Cluster of Excellence Hearing4all, Carl-von-Ossietzky University Oldenburg, Oldenburg, Germany
34
Prilop L, Gutschalk A. Auditory-cortex lesions impair contralateral tone-pattern detection under informational masking. Cortex 2017; 95:1-14. [PMID: 28806706 DOI: 10.1016/j.cortex.2017.07.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/09/2016] [Revised: 06/22/2017] [Accepted: 07/11/2017] [Indexed: 10/19/2022]
Abstract
Impaired hearing contralateral to unilateral auditory-cortex lesions is typically only observed under conditions of perceptual competition, such as dichotic presentation or speech in noise. It remains unclear, however, whether the source of this effect is direct competition in frequency-specific neurons, or whether enhanced processing load at more distant frequencies can also impair auditory detection. To evaluate this question, we studied a group of patients with unilateral auditory-cortex lesions (N = 14, six left-hemispheric, eight right-hemispheric; four females; age range 26-72 years) and a control group (N = 25; 15 females; age range 18-76 years) with a target-detection task in the presence of a multi-tone masker, which can produce informational masking. The results revealed reduced sensitivity for monaural target streams presented contralateral to auditory-cortex lesions, with an approximately 10% higher error rate in the contra-lesional ear. A general, bilateral reduction of target detection was only observed in a subgroup of patients, who were classified as additionally suffering from auditory neglect. These results demonstrate that auditory-cortex lesions impair monaural, contra-lesional target detection under informational masking. The finding supports the hypothesis that neural mechanisms beyond direct competition in frequency-specific neurons can be a source of impaired hearing under perceptual competition in patients with unilateral auditory-cortex lesions.
Affiliation(s)
- Lisa Prilop
- Department of Neurology, Ruprecht-Karls-Universität Heidelberg, Heidelberg, Germany
- Alexander Gutschalk
- Department of Neurology, Ruprecht-Karls-Universität Heidelberg, Heidelberg, Germany.
35
Phillips EAK, Schreiner CE, Hasenstaub AR. Diverse effects of stimulus history in waking mouse auditory cortex. J Neurophysiol 2017; 118:1376-1393. [PMID: 28566458 DOI: 10.1152/jn.00094.2017] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/13/2017] [Revised: 05/10/2017] [Accepted: 05/29/2017] [Indexed: 11/22/2022] Open
Abstract
Responses to auditory stimuli are often strongly influenced by recent stimulus history. For example, in a paradigm called forward suppression, brief sounds can suppress the perception of, and the neural responses to, a subsequent sound, with the magnitude of this suppression depending on both the spectral and temporal distances between the sounds. As a step towards understanding the mechanisms that generate these adaptive representations in awake animals, we quantitatively characterize responses to two-tone sequences in the auditory cortex of waking mice. We find that cortical responses in a forward suppression paradigm are more diverse in waking mice than previously appreciated, that these responses vary between cells with different firing characteristics and waveform shapes, but that the variability in these responses is not substantially related to cortical depth or columnar location. Moreover, responses to the first tone in the sequence are often not linearly related to the suppression of the second tone response, suggesting that spike-frequency adaptation of cortical cells is not a large contributor to forward suppression or its variability. Instead, we use a simple multilayered model to show that cell-to-cell differences in the balance of intracortical inhibition and excitation will naturally produce such a diversity of forward interactions. We propose that diverse inhibitory connectivity allows the cortex to encode spectro-temporally fluctuating stimuli in multiple parallel ways. NEW & NOTEWORTHY Behavioral and neural responses to auditory stimuli are profoundly influenced by recent sounds, yet how this occurs is not known. Here, the authors show in the auditory cortex of awake mice that the quality of history-dependent effects is diverse and related to cell type, response latency, firing rates, and receptive field bandwidth.
In a cortical model, differences in excitatory-inhibitory balance can produce this diversity, providing the cortex with multiple ways of representing temporally complex information.
Affiliation(s)
- Elizabeth A K Phillips
- Coleman Memorial Laboratory, University of California, San Francisco, California; Neuroscience Graduate Program, University of California, San Francisco, California; Department of Otolaryngology-Head and Neck Surgery, University of California, San Francisco, California; Center for Integrative Neuroscience, University of California, San Francisco, California; and Kavli Institute for Fundamental Neuroscience, University of California, San Francisco, California
- Christoph E Schreiner
- Coleman Memorial Laboratory, University of California, San Francisco, California; Neuroscience Graduate Program, University of California, San Francisco, California; Department of Otolaryngology-Head and Neck Surgery, University of California, San Francisco, California; Center for Integrative Neuroscience, University of California, San Francisco, California; and Kavli Institute for Fundamental Neuroscience, University of California, San Francisco, California
- Andrea R Hasenstaub
- Coleman Memorial Laboratory, University of California, San Francisco, California; Neuroscience Graduate Program, University of California, San Francisco, California; Department of Otolaryngology-Head and Neck Surgery, University of California, San Francisco, California; Center for Integrative Neuroscience, University of California, San Francisco, California; and Kavli Institute for Fundamental Neuroscience, University of California, San Francisco, California
36
Snyder JS, Elhilali M. Recent advances in exploring the neural underpinnings of auditory scene perception. Ann N Y Acad Sci 2017; 1396:39-55. [PMID: 28199022 PMCID: PMC5446279 DOI: 10.1111/nyas.13317] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.1] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/30/2016] [Revised: 12/21/2016] [Accepted: 01/08/2017] [Indexed: 11/29/2022]
Abstract
Studies of auditory scene analysis have traditionally relied on paradigms using artificial sounds, and conventional behavioral techniques, to elucidate how we perceptually segregate auditory objects or streams from each other. In the past few decades, however, there has been growing interest in uncovering the neural underpinnings of auditory segregation using human and animal neuroscience techniques, as well as computational modeling. This largely reflects the growth in the fields of cognitive neuroscience and computational neuroscience and has led to new theories of how the auditory system segregates sounds in complex arrays. The current review focuses on neural and computational studies of auditory scene perception published in the last few years. Following the progress that has been made in these studies, we describe (1) theoretical advances in our understanding of the most well-studied aspects of auditory scene perception, namely segregation of sequential patterns of sounds and concurrently presented sounds; (2) the diversification of topics and paradigms that have been investigated; and (3) how new neuroscience techniques (including invasive neurophysiology in awake humans, genotyping, and brain stimulation) have been used in this field.
Affiliation(s)
- Joel S. Snyder
- Department of Psychology, University of Nevada, Las Vegas, Las Vegas, Nevada
- Mounya Elhilali
- Department of Electrical and Computer Engineering, The Johns Hopkins University, Baltimore, Maryland
37
Rankin J, Osborn Popp PJ, Rinzel J. Stimulus Pauses and Perturbations Differentially Delay or Promote the Segregation of Auditory Objects: Psychoacoustics and Modeling. Front Neurosci 2017; 11:198. [PMID: 28473747 PMCID: PMC5397483 DOI: 10.3389/fnins.2017.00198] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/14/2016] [Accepted: 03/23/2017] [Indexed: 11/21/2022] Open
Abstract
Segregating distinct sound sources is fundamental for auditory perception, as in the cocktail party problem. In a process called the build-up of stream segregation, distinct sound sources that are perceptually integrated initially can be segregated into separate streams after several seconds. Previous research concluded that abrupt changes in the incoming sounds during build-up—for example, a step change in location, loudness or timing—reset the percept to integrated. Following this reset, the multisecond build-up process begins again. Neurophysiological recordings in auditory cortex (A1) show fast (subsecond) adaptation, but unified mechanistic explanations for the bias toward integration, multisecond build-up and resets remain elusive. Combining psychoacoustics and modeling, we show that initial unadapted A1 responses bias integration, that the slowness of build-up arises naturally from competition downstream, and that recovery of adaptation can explain resets. An early bias toward integrated perceptual interpretations arising from primary cortical stages that encode low-level features and feed into competition downstream could also explain similar phenomena in vision. Further, we report a previously overlooked class of perturbations that promote segregation rather than integration. Our results challenge current understanding for perturbation effects on the emergence of sound source segregation, leading to a new hypothesis for differential processing downstream of A1. Transient perturbations can momentarily redirect A1 responses as input to downstream competition units that favor segregation.
Affiliation(s)
- James Rankin
- Department of Mathematics, University of Exeter, Exeter, UK; Center for Neural Science, New York University, New York, NY, USA
- John Rinzel
- Center for Neural Science, New York University, New York, NY, USA; Courant Institute of Mathematical Sciences, New York, NY, USA
38
Comparison of perceptual properties of auditory streaming between spectral and amplitude modulation domains. Hear Res 2017; 350:244-250. [PMID: 28323019 DOI: 10.1016/j.heares.2017.03.006] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 10/04/2016] [Revised: 02/20/2017] [Accepted: 03/15/2017] [Indexed: 11/21/2022]
Abstract
The two-tone sequence (ABA_), which comprises two different sounds (A and B) and a silent gap, has been used to investigate how the auditory system organizes sequential sounds depending on various stimulus conditions or brain states. Auditory streaming can be evoked by differences not only in the tone frequency ("spectral cue": ΔF_TONE; TONE condition) but also in the amplitude modulation rate ("AM cue": ΔF_AM; AM condition). The aim of the present study was to explore the relationship between the perceptual properties of auditory streaming for the TONE and AM conditions. A sequence with a long duration (400 repetitions of ABA_) was used to examine the bistability of streaming. The ratio of feature differences that evoked an equivalent probability of the segregated percept was close to the ratio of the Q-values of the auditory and modulation filters, consistent with a "channeling theory" of auditory streaming. On the other hand, for values of ΔF_AM and ΔF_TONE evoking equal probabilities of the segregated percept, the number of perceptual switches was larger for the TONE condition than for the AM condition, indicating that the mechanism(s) that determine the bistability of auditory streaming are different between, or differently sensitive to, the two domains. Nevertheless, the number of switches for individual listeners was positively correlated between the spectral and AM domains. The results suggest the possibility that the neural substrates for spectral and AM processing share a common switching mechanism but differ in location and/or in the properties of neural activity or the strength of internal noise at each level.
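The "channeling" account in this entry compares feature differences against the sharpness (Q) of the underlying filters. A minimal sketch of the auditory-filter side, using the standard Glasberg and Moore (1990) equivalent-rectangular-bandwidth approximation; the comparison to modulation-filter Q in the comment is an illustrative assumption, not a value taken from this study:

```python
def erb_hz(f_hz: float) -> float:
    """Equivalent rectangular bandwidth of the auditory filter at f_hz,
    using the Glasberg & Moore (1990) approximation."""
    return 24.7 * (4.37 * f_hz / 1000.0 + 1.0)

def auditory_q(f_hz: float) -> float:
    """Filter sharpness Q = center frequency / bandwidth."""
    return f_hz / erb_hz(f_hz)

# Example: at 1 kHz the ERB is ~132.6 Hz, so Q is ~7.5. Modulation filters
# are commonly modeled as much broader (Q on the order of 1-2), so a much
# larger relative AM-rate difference would be needed for an equivalent
# probability of segregation, as the abstract's Q-ratio argument suggests.
```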
39
Greiter W, Firzlaff U. Echo-acoustic flow shapes object representation in spatially complex acoustic scenes. J Neurophysiol 2017; 117:2113-2124. [PMID: 28275060] [DOI: 10.1152/jn.00860.2016] [Received: 11/03/2016] [Revised: 03/06/2017] [Accepted: 03/06/2017]
Abstract
Echolocating bats use echoes of their sonar emissions to determine the position and distance of objects or prey. Target distance is represented as a map of echo delay in the auditory cortex (AC) of bats. During a bat's flight through a natural complex environment, echo streams are reflected from multiple objects along its flight path. Separating such complex streams of echoes or other sounds is a challenge for the auditory system of bats as well as other animals. We investigated the representation of multiple echo streams in the AC of anesthetized bats (Phyllostomus discolor) and tested the hypothesis that neurons can lock on echoes from specific objects in a complex echo-acoustic pattern while the representation of surrounding objects is suppressed. We combined naturalistic pulse/echo sequences simulating a bat's flight through a virtual acoustic space with extracellular recordings. Neurons could selectively lock on echoes from one object in complex echo streams originating from two different objects along a virtual flight path. The objects were processed sequentially in the order in which they were approached. Object selection depended on sequential changes of echo delay and amplitude, but not on their absolute values. Furthermore, the detailed representation of object echo delays in the cortical target range map was not fixed but could be dynamically adapted depending on the temporal pattern of sonar emission during target approach within a simulated flight sequence.
NEW & NOTEWORTHY Complex signal analysis is a challenging task in sensory processing for all animals, particularly for bats because they use echolocation for navigation in darkness. Recent studies proposed that the bat's perceptual system might organize complex echo-acoustic information into auditory streams, allowing it to track specific auditory objects during flight. We show that in the auditory cortex of bats, neurons can selectively respond to echo streams from specific objects.
Affiliation(s)
- Wolfgang Greiter
- Chair of Zoology, Technical University of Munich, Freising, Germany
- Uwe Firzlaff
- Chair of Zoology, Technical University of Munich, Freising, Germany

40
Neural Correlates of Speech Segregation Based on Formant Frequencies of Adjacent Vowels. Sci Rep 2017; 7:40790. [PMID: 28102300] [PMCID: PMC5244401] [DOI: 10.1038/srep40790] [Received: 07/25/2016] [Accepted: 12/09/2016]
Abstract
The neural substrates by which speech sounds are perceptually segregated into distinct streams are poorly understood. Here, we recorded high-density scalp event-related potentials (ERPs) while participants were presented with a cyclic pattern of three vowel sounds (/ee/-/ae/-/ee/). Each trial consisted of an adaptation sequence, which could have either a small, intermediate, or large difference in first formant (Δf1), followed by a test sequence, in which Δf1 was always intermediate. For the adaptation sequence, participants tended to hear two streams (“streaming”) when Δf1 was intermediate or large compared to when it was small. For the test sequence, in which Δf1 was always intermediate, the pattern was usually reversed: participants more often heard a single stream as the Δf1 of the preceding adaptation sequence increased. During the adaptation sequence, Δf1-related brain activity was found between 100–250 ms after the /ae/ vowel over fronto-central and left temporal areas, consistent with generation in auditory cortex. For the test sequence, the prior stimulus modulated ERP amplitude between 20–150 ms over the left fronto-central scalp region. Our results demonstrate that the proximity of formants between adjacent vowels is an important factor in the perceptual organization of speech, and reveal a widely distributed neural network supporting perceptual grouping of speech sounds.
41
Itatani N, Klump GM. Animal models for auditory streaming. Philos Trans R Soc Lond B Biol Sci 2017; 372:20160112. [PMID: 28044022] [DOI: 10.1098/rstb.2016.0112] [Accepted: 07/31/2016]
Abstract
Sounds in the natural environment need to be assigned to acoustic sources to evaluate complex auditory scenes. Separating sources will affect the analysis of auditory features of sounds. As the benefits of assigning sounds to specific sources accrue to all species communicating acoustically, the ability for auditory scene analysis is widespread among different animals. Animal studies allow for a deeper insight into the neuronal mechanisms underlying auditory scene analysis. Here, we will review the paradigms applied in the study of auditory scene analysis and streaming of sequential sounds in animal models. We will compare the psychophysical results from the animal studies to the evidence obtained in human psychophysics of auditory streaming, i.e. in a task commonly used for measuring the capability for auditory scene analysis. Furthermore, the neuronal correlates of auditory streaming will be reviewed in different animal models and the observations of the neurons' response measures will be related to perception. The across-species comparison will reveal whether similar demands in the analysis of acoustic scenes have resulted in similar perceptual and neuronal processing mechanisms in the wide range of species being capable of auditory scene analysis. This article is part of the themed issue 'Auditory and visual scene analysis'.
Affiliation(s)
- Naoya Itatani
- Cluster of Excellence Hearing4all, Animal Physiology and Behaviour Group, Department of Neuroscience, School of Medicine and Health Sciences, Carl von Ossietzky University Oldenburg, 26111 Oldenburg, Germany
- Georg M Klump
- Cluster of Excellence Hearing4all, Animal Physiology and Behaviour Group, Department of Neuroscience, School of Medicine and Health Sciences, Carl von Ossietzky University Oldenburg, 26111 Oldenburg, Germany

42
Ni R, Bender DA, Shanechi AM, Gamble JR, Barbour DL. Contextual effects of noise on vocalization encoding in primary auditory cortex. J Neurophysiol 2016; 117:713-727. [PMID: 27881720] [DOI: 10.1152/jn.00476.2016] [Received: 06/13/2016] [Accepted: 11/17/2016]
Abstract
Robust auditory perception plays a pivotal role in processing behaviorally relevant sounds, particularly amid distractions from the environment. The neuronal coding enabling this ability, however, is still not well understood. In this study, we recorded single-unit activity from the primary auditory cortex (A1) of awake marmoset monkeys (Callithrix jacchus) while delivering conspecific vocalizations degraded by two different background noises: broadband white noise and vocalization babble. Noise effects on the neural representation of target vocalizations were quantified by measuring the responses' similarity to those elicited by natural vocalizations as a function of signal-to-noise ratio. A clustering approach was used to describe the range of response profiles by reducing the population responses to a summary of four response classes (robust, balanced, insensitive, and brittle) under both noise conditions. This clustering approach revealed that, on average, approximately two-thirds of the neurons changed their response class when encountering different noises. Therefore, the distortion induced by one particular masking background in single-unit responses is not necessarily predictable from that induced by another, suggesting that a unique group of noise-invariant neurons across different background conditions is unlikely to exist in A1. Regarding the influence of noise on neural activity, the brittle response group showed added spiking activity both within and between phrases of vocalizations relative to clean vocalizations, whereas the other groups generally showed suppression of spiking activity within phrases, with noise-dependent alterations between phrases. Overall, the variable single-unit responses, yet consistent response types, imply that primate A1 performs scene analysis through the collective activity of multiple neurons.
NEW & NOTEWORTHY The understanding of where and how auditory scene analysis is accomplished is of broad interest to neuroscientists. In this paper, we systematically investigated neuronal coding of multiple vocalizations degraded by two distinct noises at various signal-to-noise ratios in nonhuman primates. In the process, we uncovered heterogeneity of single-unit representations for different auditory scenes yet homogeneity of responses across the population.
Affiliation(s)
- Ruiye Ni
- Laboratory of Sensory Neuroscience and Neuroengineering, Department of Biomedical Engineering, Washington University in St. Louis, St. Louis, Missouri
- David A Bender
- Laboratory of Sensory Neuroscience and Neuroengineering, Department of Biomedical Engineering, Washington University in St. Louis, St. Louis, Missouri
- Amirali M Shanechi
- Laboratory of Sensory Neuroscience and Neuroengineering, Department of Biomedical Engineering, Washington University in St. Louis, St. Louis, Missouri
- Jeffrey R Gamble
- Laboratory of Sensory Neuroscience and Neuroengineering, Department of Biomedical Engineering, Washington University in St. Louis, St. Louis, Missouri
- Dennis L Barbour
- Laboratory of Sensory Neuroscience and Neuroengineering, Department of Biomedical Engineering, Washington University in St. Louis, St. Louis, Missouri

43
Szabó BT, Denham SL, Winkler I. Computational Models of Auditory Scene Analysis: A Review. Front Neurosci 2016; 10:524. [PMID: 27895552] [PMCID: PMC5108797] [DOI: 10.3389/fnins.2016.00524] [Received: 07/29/2016] [Accepted: 10/28/2016]
Abstract
Auditory scene analysis (ASA) refers to the process(es) of parsing the complex acoustic input into auditory perceptual objects representing either physical sources or temporal sound patterns, such as melodies, which contributed to the sound waves reaching the ears. A number of new computational models accounting for some of the perceptual phenomena of ASA have been published recently. Here we provide a theoretically motivated review of these computational models, aiming to relate their guiding principles to the central issues of the theoretical framework of ASA. Specifically, we ask how they achieve the grouping and separation of sound elements and whether they implement some form of competition between alternative interpretations of the sound input. We consider the extent to which they include predictive processes, as important current theories suggest that perception is inherently predictive, and also how they have been evaluated. We conclude that current computational models of ASA are fragmentary in the sense that, rather than providing general competing interpretations of ASA, they focus on assessing the utility of specific processes (or algorithms) for finding the causes of the complex acoustic signal. This leaves open the possibility of integrating complementary aspects of the models into a more comprehensive theory of ASA.
Affiliation(s)
- Beáta T Szabó
- Faculty of Information Technology and Bionics, Pázmány Péter Catholic University, Budapest, Hungary; Institute of Cognitive Neuroscience and Psychology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Budapest, Hungary
- Susan L Denham
- School of Psychology, University of Plymouth, Plymouth, UK
- István Winkler
- Institute of Cognitive Neuroscience and Psychology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Budapest, Hungary

44
Deike S, Deliano M, Brechmann A. Probing neural mechanisms underlying auditory stream segregation in humans by transcranial direct current stimulation (tDCS). Neuropsychologia 2016; 91:262-267. [PMID: 27546076] [DOI: 10.1016/j.neuropsychologia.2016.08.017] [Received: 05/20/2016] [Revised: 08/15/2016] [Accepted: 08/17/2016]
Abstract
One hypothesis concerning the neural underpinnings of auditory streaming states that frequency tuning of tonotopically organized neurons in primary auditory fields, in combination with physiological forward suppression, is necessary for the separation of the representations of high-frequency A and low-frequency B tones. The extent of spatial overlap between the tonotopic activations of A and B tones is thought to underlie the perceptual organization of streaming sequences into one coherent or two separate streams. The present study attempts to interfere with these mechanisms by transcranial direct current stimulation (tDCS) and to probe behavioral outcomes reflecting the perception of ABAB streaming sequences. We hypothesized that tDCS, by modulating cortical excitability, causes a change in the separateness of the representations of A and B tones, which leads to a change in the proportions of one-stream and two-stream percepts. To test this, 22 subjects were presented with ambiguous ABAB sequences at three different frequency separations (∆F) and had to decide on their current percept after receiving sham, anodal, or cathodal tDCS over the left auditory cortex. We could confirm our hypothesis at the most ambiguous ∆F condition of 6 semitones. For anodal compared with sham and cathodal stimulation, we found a significant decrease in the proportion of two-stream perception and an increase in the proportion of one-stream perception. The results demonstrate the feasibility of using tDCS to probe mechanisms underlying auditory streaming through the use of various behavioral measures. Moreover, this approach allows one to probe the functions of auditory regions and their interactions with other processing stages.
Affiliation(s)
- Susann Deike
- Special Lab Non-invasive Brain Imaging, Leibniz Institute for Neurobiology, Brenneckestr. 6, 39118 Magdeburg, Germany
- Matthias Deliano
- Department of Systems Physiology of Learning, Leibniz Institute for Neurobiology, Brenneckestr. 6, 39118 Magdeburg, Germany
- André Brechmann
- Special Lab Non-invasive Brain Imaging, Leibniz Institute for Neurobiology, Brenneckestr. 6, 39118 Magdeburg, Germany

45
Teki S, Barascud N, Picard S, Payne C, Griffiths TD, Chait M. Neural Correlates of Auditory Figure-Ground Segregation Based on Temporal Coherence. Cereb Cortex 2016; 26:3669-80. [PMID: 27325682] [PMCID: PMC5004755] [DOI: 10.1093/cercor/bhw173]
Abstract
To make sense of natural acoustic environments, listeners must parse complex mixtures of sounds that vary in frequency, space, and time. Emerging work suggests that, in addition to the well-studied spectral cues for segregation, sensitivity to temporal coherence (the coincidence of sound elements in and across time) is also critical for the perceptual organization of acoustic scenes. Here, we examine pre-attentive, stimulus-driven neural processes underlying auditory figure-ground segregation using stimuli that capture the challenges of listening in complex scenes where segregation cannot be achieved based on spectral cues alone. Signals ("stochastic figure-ground": SFG) comprised a sequence of brief broadband chords containing random pure tone components that varied from one chord to another. Occasional tone repetitions across chords are perceived as "figures" popping out of a stochastic "ground." Magnetoencephalography (MEG) measurements in naïve, distracted human subjects revealed robust evoked responses, commencing about 150 ms after figure onset, that reflect the emergence of the "figure" from the randomly varying "ground." Neural sources underlying this bottom-up-driven figure-ground segregation were localized to the planum temporale and the intraparietal sulcus, demonstrating that the latter area, outside the "classic" auditory system, is also involved in the early stages of auditory scene analysis.
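The stochastic figure-ground (SFG) stimulus described above can be sketched programmatically: each chord draws random pure-tone components from a fixed frequency pool, and from a chosen chord onward a small set of "figure" components repeats across chords. The frequency pool, chord counts, and durations below are illustrative assumptions, not the parameters used in the study.

```python
import numpy as np

rng = np.random.default_rng(0)

def sfg(n_chords=40, ground_size=10, figure_size=4, figure_onset=20,
        chord_dur=0.05, fs=16000):
    """Stochastic figure-ground: random chords, with a repeating 'figure'
    set of tones added to every chord from figure_onset onward."""
    freq_pool = np.geomspace(180.0, 7000.0, 120)  # illustrative pool
    figure = rng.choice(freq_pool, figure_size, replace=False)
    t = np.arange(int(chord_dur * fs)) / fs
    chords = []
    for i in range(n_chords):
        ground = rng.choice(freq_pool, ground_size, replace=False)
        comps = np.concatenate([ground, figure]) if i >= figure_onset else ground
        chords.append(sum(np.sin(2 * np.pi * f * t) for f in comps))
    return np.concatenate(chords)

sig = sfg()
```

Because only the figure components are coherent across chords, the stimulus cannot be segregated from instantaneous spectral cues alone, which is the point of the paradigm.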
Affiliation(s)
- Sundeep Teki
- Wellcome Trust Centre for Neuroimaging, University College London, London WC1N 3BG, UK
- Auditory Cognition Group, Institute of Neuroscience, Newcastle University, Newcastle upon Tyne NE2 4HH, UK
- Current address: Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford OX1 3QX, UK
- Nicolas Barascud
- Wellcome Trust Centre for Neuroimaging, University College London, London WC1N 3BG, UK
- Ear Institute, University College London, London WC1X 8EE, UK
- Samuel Picard
- Ear Institute, University College London, London WC1X 8EE, UK
- Timothy D. Griffiths
- Wellcome Trust Centre for Neuroimaging, University College London, London WC1N 3BG, UK
- Auditory Cognition Group, Institute of Neuroscience, Newcastle University, Newcastle upon Tyne NE2 4HH, UK
- Maria Chait
- Ear Institute, University College London, London WC1X 8EE, UK

46
Yamagishi S, Otsuka S, Furukawa S, Kashino M. Subcortical correlates of auditory perceptual organization in humans. Hear Res 2016; 339:104-11. [PMID: 27371867] [DOI: 10.1016/j.heares.2016.06.016] [Received: 03/16/2016] [Revised: 06/22/2016] [Accepted: 06/27/2016]
Abstract
To make sense of complex auditory scenes, the auditory system sequentially organizes auditory components into perceptual objects or streams. In the conventional view of this process, the cortex plays a major role in perceptual organization, and subcortical mechanisms merely provide the cortex with acoustic features. Here, we show that neural activity in the brainstem is linked to perceptual organization, which alternates spontaneously for human listeners without any stimulus change. The stimulus used in the experiment was an unchanging sequence of repeated triplet tones, which can be interpreted as either one or two streams. Listeners were instructed to report their perceptual state whenever they experienced perceptual switching between one and two streams throughout the stimulus presentation. Simultaneously, we recorded event-related potentials with scalp electrodes. We measured the frequency-following response (FFR), which is considered to originate from the brainstem, and assessed thalamo-cortical activity through the middle-latency response (MLR). The results demonstrate that the FFR and MLR varied with the state of auditory stream perception. In addition, we found that the MLR change precedes the FFR change upon perceptual switching from a one-stream to a two-stream percept. This suggests top-down influences on brainstem activity from the thalamo-cortical pathway. These findings are consistent with the idea of a distributed, hierarchical neural network for perceptual organization and suggest that the network extends to the brainstem level.
Affiliation(s)
- Shimpei Yamagishi
- Interdisciplinary Graduate School of Science and Engineering, Tokyo Institute of Technology, Yokohama, Kanagawa, 226-8503, Japan
- Sho Otsuka
- NTT Communication Science Laboratories, NTT Corporation, 3-1 Morinosato Wakamiya, Atsugi, Kanagawa, 243-0198, Japan
- Shigeto Furukawa
- NTT Communication Science Laboratories, NTT Corporation, 3-1 Morinosato Wakamiya, Atsugi, Kanagawa, 243-0198, Japan
- Makio Kashino
- Interdisciplinary Graduate School of Science and Engineering, Tokyo Institute of Technology, Yokohama, Kanagawa, 226-8503, Japan; NTT Communication Science Laboratories, NTT Corporation, 3-1 Morinosato Wakamiya, Atsugi, Kanagawa, 243-0198, Japan

47
Abstract
UNLABELLED Stream segregation enables a listener to disentangle multiple competing sequences of sounds. A recent study from our laboratory demonstrated that cortical neurons in anesthetized cats exhibit spatial stream segregation (SSS) by synchronizing preferentially to one of two sequences of noise bursts that alternate between two source locations. Here, we examine the emergence of SSS along the ascending auditory pathway. Extracellular recordings were made in anesthetized rats from the inferior colliculus (IC), the nucleus of the brachium of the IC (BIN), the medial geniculate body (MGB), and the primary auditory cortex (A1). Stimuli consisted of interleaved sequences of broadband noise bursts that alternated between two source locations. At stimulus presentation rates of 5 and 10 bursts per second, at which human listeners report robust SSS, neural SSS is weak in the central nucleus of the IC (ICC), appears in the BIN and in approximately two-thirds of neurons in the ventral MGB (MGBv), and is prominent throughout A1. The enhancement of SSS at the cortical level reflects both increased spatial sensitivity and increased forward suppression. We demonstrate that forward suppression in A1 does not result from synaptic inhibition at the cortical level. Instead, forward suppression might reflect synaptic depression in the thalamocortical projection. Together, our findings indicate that auditory streams are increasingly segregated along the ascending auditory pathway as distinct, mutually synchronized neural populations. SIGNIFICANCE STATEMENT Listeners are capable of disentangling multiple competing sequences of sounds that originate from distinct sources. This stream segregation is aided by differences in spatial location between the sources. A possible substrate of spatial stream segregation (SSS) has been described in the auditory cortex, but the mechanisms leading to those cortical responses are unknown.
Here, we investigated SSS in three levels of the ascending auditory pathway with extracellular unit recordings in anesthetized rats. We found that neural SSS emerges within the ascending auditory pathway as a consequence of sharpening of spatial sensitivity and increasing forward suppression. Our results highlight brainstem mechanisms that culminate in SSS at the level of the auditory cortex.
48
Teki S, Kumar S, Griffiths TD. Large-Scale Analysis of Auditory Segregation Behavior Crowdsourced via a Smartphone App. PLoS One 2016; 11:e0153916. [PMID: 27096165] [PMCID: PMC4838209] [DOI: 10.1371/journal.pone.0153916] [Received: 07/29/2015] [Accepted: 04/06/2016]
Abstract
The human auditory system is adept at detecting sound sources of interest within a complex mixture of several other simultaneous sounds. The ability to selectively attend to the speech of one speaker whilst ignoring other speakers and background noise is of vital biological significance: the capacity to make sense of complex ‘auditory scenes’ is significantly impaired in aging populations as well as in those with hearing loss. We investigated this problem by designing a synthetic signal, termed the ‘stochastic figure-ground’ stimulus, that captures essential aspects of complex sounds in the natural environment. Previously, we showed that under controlled laboratory conditions, young listeners sampled from the university subject pool (n = 10) performed very well in detecting targets embedded in the stochastic figure-ground signal. Here, we presented a modified version of this cocktail party paradigm as a ‘game’ featured in a smartphone app (The Great Brain Experiment) and obtained data from a large population with diverse demographics (n = 5148). Despite differences in paradigms and experimental settings, the observed target-detection performance by users of the app was robust and consistent with our previous results from the psychophysical study. Our results highlight the potential of smartphone apps for capturing robust large-scale auditory behavioral data from normal healthy volunteers, which can also be extended to study auditory deficits in clinical populations with hearing impairments and central auditory disorders.
Affiliation(s)
- Sundeep Teki
- Wellcome Trust Centre for Neuroimaging, University College London, London, United Kingdom
- Institute of Neuroscience, Newcastle University, Newcastle upon Tyne, United Kingdom
- Sukhbinder Kumar
- Wellcome Trust Centre for Neuroimaging, University College London, London, United Kingdom
- Institute of Neuroscience, Newcastle University, Newcastle upon Tyne, United Kingdom
- Timothy D. Griffiths
- Wellcome Trust Centre for Neuroimaging, University College London, London, United Kingdom
- Institute of Neuroscience, Newcastle University, Newcastle upon Tyne, United Kingdom

49
Functional magnetic resonance imaging confirms forward suppression for rapidly alternating sounds in human auditory cortex but not in the inferior colliculus. Hear Res 2016; 335:25-32. [PMID: 26899342] [DOI: 10.1016/j.heares.2016.02.010] [Received: 11/19/2015] [Revised: 02/08/2016] [Accepted: 02/15/2016]
Abstract
Forward suppression at the level of the auditory cortex has been suggested to subserve auditory stream segregation. Recent results in non-streaming stimulation contexts have indicated that forward suppression can also be observed in the inferior colliculus; whether this holds in streaming-related contexts remains unclear. Here, we used cardiac-gated fMRI to examine forward suppression in the inferior colliculus (and the rest of the human auditory pathway) in response to canonical streaming stimuli (rapid tone sequences comprised of either one repetitive tone or two alternating tones). The first stimulus is typically perceived as a single stream, the second as two interleaved streams. In different experiments using either pure tones differing in frequency or bandpass-filtered noise differing in interaural time differences, we observed stronger auditory cortex activation in response to alternating vs. repetitive stimulation, consistent with the presence of forward suppression. In contrast, activity in the inferior colliculus and other subcortical nuclei did not significantly differ between alternating and repetitive stimuli. This finding could be explained by active amplification of forward suppression in auditory cortex, by a low rate (or absence) of cells showing forward suppression in the inferior colliculus, or both.
50
Mate Searching Animals as Model Systems for Understanding Perceptual Grouping. Psychological Mechanisms in Animal Communication 2016. [DOI: 10.1007/978-3-319-48690-1_4]