1. Kalra L, Altman S, Bee MA. Perceptually salient differences in a species recognition cue do not promote auditory streaming in eastern grey treefrogs (Hyla versicolor). J Comp Physiol A Neuroethol Sens Neural Behav Physiol 2024. PMID: 38733407. DOI: 10.1007/s00359-024-01702-9.
Abstract
Auditory streaming underlies a receiver's ability to organize complex mixtures of auditory input into distinct perceptual "streams" that represent different sound sources in the environment. During auditory streaming, sounds produced by the same source are integrated through time into a single, coherent auditory stream that is perceptually segregated from other concurrent sounds. Based on human psychoacoustic studies, one hypothesis regarding auditory streaming is that any sufficiently salient perceptual difference may lead to stream segregation. Here, we used the eastern grey treefrog, Hyla versicolor, to test this hypothesis in the context of vocal communication in a non-human animal. In this system, females choose their mate based on perceiving species-specific features of a male's pulsatile advertisement calls in social environments (choruses) characterized by mixtures of overlapping vocalizations. We employed an experimental paradigm from human psychoacoustics to design interleaved pulsatile sequences (ABAB…) that mimicked key features of the species' advertisement call, and in which alternating pulses differed in pulse rise time, which is a robust species recognition cue in eastern grey treefrogs. Using phonotaxis assays, we found no evidence that perceptually salient differences in pulse rise time promoted the segregation of interleaved pulse sequences into distinct auditory streams. These results do not support the hypothesis that any perceptually salient acoustic difference can be exploited as a cue for stream segregation in all species. We discuss these findings in the context of cues used for species recognition and auditory streaming.
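The interleaved-sequence stimulus described in this abstract can be sketched in a few lines of Python. This is a toy illustration, not the authors' synthesis code: the carrier frequency, pulse and gap durations, and envelope shapes below are hypothetical placeholders; only the design idea (ABAB pulses differing solely in rise time) comes from the abstract.

```python
import math

def pulse(n, rise_frac, fall_frac=0.5, fs=44100.0, freq=1100.0):
    """One tone pulse; rise_frac sets the linear attack ramp, i.e. the
    rise-time cue that differs between interleaved A and B pulses."""
    rise_n = max(1, int(rise_frac * n))
    fall_n = max(1, int(fall_frac * n))
    return [min(1.0, i / rise_n, (n - 1 - i) / fall_n)
            * math.sin(2 * math.pi * freq * i / fs) for i in range(n)]

def abab_sequence(n_pairs, rise_a, rise_b, pulse_ms=25, gap_ms=25, fs=44100):
    """Interleave A and B pulses (ABAB...) that differ only in rise time."""
    n = int(fs * pulse_ms / 1000)
    gap = [0.0] * int(fs * gap_ms / 1000)
    seq = []
    for _ in range(n_pairs):
        seq += pulse(n, rise_a, fs=fs) + gap + pulse(n, rise_b, fs=fs) + gap
    return seq
```

If listeners segregate such a sequence, the A pulses (e.g., `rise_a=0.1`) and B pulses (e.g., `rise_b=0.9`) should be heard as two streams, each at half the overall pulse rate; the study found no evidence for this.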
Affiliation(s)
- Lata Kalra
- Department of Ecology, Evolution, and Behavior, University of Minnesota, Saint Paul, MN 55108, USA
- Shoshana Altman
- Department of Ecology, Evolution, and Behavior, University of Minnesota, Saint Paul, MN 55108, USA
- Mark A Bee
- Department of Ecology, Evolution, and Behavior, University of Minnesota, Saint Paul, MN 55108, USA
2. Kondo HM, Hasegawa R, Ezaki T, Sakata H, Ho HT. Functional coupling between auditory memory and verbal transformations. Sci Rep 2024; 14:3480. PMID: 38347058. PMCID: PMC10861569. DOI: 10.1038/s41598-024-54013-z.
Abstract
The ability to parse sound mixtures into coherent auditory objects is fundamental to cognitive functions, such as speech comprehension and language acquisition. Yet, we still lack a clear understanding of how auditory objects are formed. To address this question, we studied a speech-specific case of perceptual multistability, called verbal transformations (VTs), in which a variety of verbal forms is induced by continuous repetition of a physically unchanging word. Here, we investigated the degree to which auditory memory through sensory adaptation influences VTs. Specifically, we hypothesized that when memory persistence is longer, participants are able to retain the current verbal form longer, resulting in sensory adaptation, which in turn, affects auditory perception. Participants performed VT and auditory memory tasks on different days. In the VT task, Japanese participants continuously reported their perception while listening to a Japanese word (2- or 3-mora in length) played repeatedly for 5 min. In the auditory memory task, a different sequence of three morae, e.g., /ka/, /hi/, and /su/, was presented to each ear simultaneously. After some period (0-4 s), participants were visually cued to recall one of the sequences, i.e., in the left or right ear. We found that delayed recall accuracy was negatively correlated with the number of VTs, particularly under 2-mora conditions. This suggests that memory persistence is important for formation and selection of perceptual objects.
Affiliation(s)
- Hirohito M Kondo
- School of Psychology, Chukyo University, 101-2 Yagoto Honmachi, Showa, Nagoya, Aichi 466-8666, Japan
- Ryuju Hasegawa
- School of Psychology, Chukyo University, 101-2 Yagoto Honmachi, Showa, Nagoya, Aichi 466-8666, Japan
- Takahiro Ezaki
- Research Center for Advanced Science and Technology, The University of Tokyo, Tokyo, Japan
- Honami Sakata
- School of Psychology, Chukyo University, 101-2 Yagoto Honmachi, Showa, Nagoya, Aichi 466-8666, Japan
- Hao Tam Ho
- School of Psychology, Chukyo University, 101-2 Yagoto Honmachi, Showa, Nagoya, Aichi 466-8666, Japan
- Département d'études Cognitives, École Normale Supérieure, Paris, France
3. Banno T, Shirley H, Fishman YI, Cohen YE. Changes in neural readout of response magnitude during auditory streaming do not correlate with behavioral choice in the auditory cortex. Cell Rep 2023; 42:113493. PMID: 38039133. PMCID: PMC10784988. DOI: 10.1016/j.celrep.2023.113493.
Abstract
A fundamental goal of the auditory system is to group stimuli from the auditory environment into a perceptual unit (i.e., "stream") or segregate the stimuli into multiple different streams. Although previous studies have clarified the psychophysical and neural mechanisms that may underlie this ability, the relationship between these mechanisms remains elusive. Here, we recorded multiunit activity (MUA) from the auditory cortex of monkeys while they participated in an auditory-streaming task consisting of interleaved low- and high-frequency tone bursts. As the streaming stimulus unfolded over time, MUA amplitude habituated; the magnitude of this habituation was correlated with the frequency difference between the tone bursts. An ideal-observer model could classify these time- and frequency-dependent changes into reports of "one stream" or "two streams" in a manner consistent with the behavioral literature. However, because classification was not modulated by the monkeys' behavioral choices, this MUA habituation may not directly reflect perceptual reports.
Affiliation(s)
- Taku Banno
- Department of Otorhinolaryngology - Head and Neck Surgery, University of Pennsylvania School of Medicine, Philadelphia, PA 19104, USA
- Harry Shirley
- Department of Otorhinolaryngology - Head and Neck Surgery, University of Pennsylvania School of Medicine, Philadelphia, PA 19104, USA
- Yonatan I Fishman
- Departments of Neurology and Neuroscience, Albert Einstein College of Medicine, Bronx, NY 10461, USA
- Yale E Cohen
- Department of Otorhinolaryngology - Head and Neck Surgery, University of Pennsylvania School of Medicine, Philadelphia, PA 19104, USA; Department of Neuroscience, University of Pennsylvania, Philadelphia, PA 19104, USA; Department of Bioengineering, University of Pennsylvania, Philadelphia, PA 19104, USA
4. L'Hermite S, Zoefel B. Rhythmic Entrainment Echoes in Auditory Perception. J Neurosci 2023; 43:6667-6678. PMID: 37604689. PMCID: PMC10538584. DOI: 10.1523/jneurosci.0051-23.2023.
Abstract
Rhythmic entrainment echoes (rhythmic brain responses that outlast rhythmic stimulation) can demonstrate endogenous neural oscillations entrained by the stimulus rhythm. Here, we tested for such echoes in auditory perception. Participants detected a pure tone target, presented at a variable delay after another pure tone that was rhythmically modulated in amplitude. In four experiments involving 154 human (female and male) participants, we tested (1) which stimulus rate produces the strongest entrainment echo and, inspired by the tonotopical organization of the auditory system and findings in nonhuman primates, (2) whether these echoes are organized according to sound frequency. We found the strongest entrainment echoes after 6 and 8 Hz stimulation. The best moments for target detection (in phase or antiphase with the preceding rhythm) depended on whether the sound frequencies of the entraining and target stimuli matched, which is in line with a tonotopical organization. However, for the same experimental condition, best moments were not always consistent across experiments. We provide a speculative explanation for these differences that relies on the notion that neural entrainment and repetition-related adaptation might exercise competing, opposite influences on perception. Together, we find rhythmic echoes in auditory perception that seem more complex than those predicted from initial theories of neural entrainment. SIGNIFICANCE STATEMENT: Rhythmic entrainment echoes are rhythmic brain responses that are produced by a rhythmic stimulus and persist after its offset. These echoes play an important role in the identification of endogenous brain oscillations entrained by rhythmic stimulation, and give us insights into whether and how participants predict the timing of events. In four independent experiments involving >150 participants, we examined entrainment echoes in auditory perception. We found that entrainment echoes have a preferred rate (between 6 and 8 Hz) and seem to follow the tonotopic organization of the auditory system. Although speculative, we also found evidence that several, potentially competing processes might interact to produce such echoes, a notion that might need to be considered for future experimental design.
Affiliation(s)
- Benedikt Zoefel
- Université de Toulouse III-Paul Sabatier, 31062 Toulouse, France
- Centre National de la Recherche Scientifique, Centre de Recherche Cerveau et Cognition, Centre Hospitalier Universitaire Purpan, 31052 Toulouse, France
5. Veyrié A, Noreña A, Sarrazin JC, Pezard L. Information-Theoretic Approaches in EEG Correlates of Auditory Perceptual Awareness under Informational Masking. Biology 2023; 12:967. PMID: 37508397. PMCID: PMC10376775. DOI: 10.3390/biology12070967.
Abstract
In informational masking paradigms, the successful segregation between the target and masker creates auditory perceptual awareness. The dynamics of the build-up of auditory perception are based on a set of interactions between bottom-up and top-down processes that generate neuronal modifications within the brain network activity. These neural changes are studied here using event-related potentials (ERPs), entropy, and integrated information, leading to several measures applied to electroencephalogram signals. The main findings show that auditory perceptual awareness stimulated functional activation in the fronto-temporo-parietal brain network through (i) negative temporal and positive centro-parietal ERP components; (ii) an enhanced processing of multi-information in the temporal cortex; and (iii) an increase in informational content in the fronto-central cortex. These different results provide information-based experimental evidence about the functional activation of the fronto-temporo-parietal brain network during auditory perceptual awareness.
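As a concrete instance of the entropy measures this abstract mentions, a histogram-based Shannon entropy of a single signal can be computed as below. This is a generic sketch, not the paper's analysis pipeline; the bin count and equal-width binning are arbitrary illustrative choices.

```python
import math
from collections import Counter

def shannon_entropy(signal, n_bins=16):
    """Histogram-based Shannon entropy (in bits) of a 1-D signal:
    discretize amplitudes into equal-width bins, then apply
    H = -sum(p * log2(p)) over the bin probabilities p."""
    lo, hi = min(signal), max(signal)
    width = (hi - lo) / n_bins or 1.0          # guard against constant signals
    counts = Counter(min(int((x - lo) / width), n_bins - 1) for x in signal)
    n = len(signal)
    return -sum(c / n * math.log2(c / n) for c in counts.values())
```

A signal spread evenly over 16 bins yields the maximum of log2(16) = 4 bits; a constant signal yields 0 bits. EEG studies typically apply such measures per channel and epoch before statistical comparison.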
Affiliation(s)
- Alexandre Veyrié
- Centre National de la Recherche Scientifique (UMR 7291), Laboratoire de Neurosciences Cognitives, Aix-Marseille Université, 13331 Marseille, France
- ONERA, The French Aerospace Lab, 13300 Salon de Provence, France
- Arnaud Noreña
- Centre National de la Recherche Scientifique (UMR 7291), Laboratoire de Neurosciences Cognitives, Aix-Marseille Université, 13331 Marseille, France
- Laurent Pezard
- Centre National de la Recherche Scientifique (UMR 7291), Laboratoire de Neurosciences Cognitives, Aix-Marseille Université, 13331 Marseille, France
6. Roberts B, Haywood NR. Asymmetric effects of sudden changes in timbre on auditory stream segregation. J Acoust Soc Am 2023; 154:363-378. PMID: 37462404. DOI: 10.1121/10.0020172.
Abstract
Two experiments explored the effects of abrupt transitions in timbral properties [amplitude modulation (AM), pure tones vs narrow-band noises, and attack/decay envelope] on streaming. Listeners reported continuously the number of streams heard during 18-s-long alternating low- and high-frequency (LHL-) sequences (frequency separation: 2-6 semitones) that underwent a coherent transition at 6 s or remained unchanged. In experiment 1, triplets comprised unmodulated pure tones or tones with 100%-depth AM created using narrowly spaced tone pairs (dyads: 30- or 50-Hz modulation). In experiment 2, triplets comprised narrow-band noises, dyads, or pure tones with quasi-trapezoidal envelopes (10/80/10 ms), fast attacks and slow decays (10/90 ms), or vice versa (90/10 ms). Abrupt transitions led to direction-dependent changes in stream segregation. Transitions from modulated to unmodulated (or slower-modulated) tones, from noise bands to pure tones, or from slow- to fast-attack tones typically caused substantial loss of segregation (resetting), whereas transitions in the opposite direction mostly caused less or no resetting. Furthermore, for the smallest frequency separation, transitions in the latter direction usually led to increased segregation (overshoot). Overall, the results are reminiscent of the perceptual asymmetries found in auditory search for targets with or without a salient additional feature (or greater activation of that feature).
Affiliation(s)
- Brian Roberts
- School of Psychology, Aston University, Birmingham B4 7ET, United Kingdom
- Nicholas R Haywood
- Department of Clinical Neurosciences, University of Cambridge, Cambridge CB2 0SZ, United Kingdom
7. Melland P, Curtu R. Attractor-Like Dynamics Extracted from Human Electrocorticographic Recordings Underlie Computational Principles of Auditory Bistable Perception. J Neurosci 2023; 43:3294-3311. PMID: 36977581. PMCID: PMC10162465. DOI: 10.1523/jneurosci.1531-22.2023.
Abstract
In bistable perception, observers experience alternations between two interpretations of an unchanging stimulus. Neurophysiological studies of bistable perception typically partition neural measurements into stimulus-based epochs and assess neuronal differences between epochs based on subjects' perceptual reports. Computational studies replicate statistical properties of percept durations with modeling principles like competitive attractors or Bayesian inference. However, bridging neuro-behavioral findings with modeling theory requires the analysis of single-trial dynamic data. Here, we propose an algorithm for extracting nonstationary timeseries features from single-trial electrocorticography (ECoG) data. We applied the proposed algorithm to 5-min ECoG recordings from human primary auditory cortex obtained during perceptual alternations in an auditory triplet streaming task (six subjects: four male, two female). We report two ensembles of emergent neuronal features in all trial blocks. One ensemble consists of periodic functions that encode a stereotypical response to the stimulus. The other comprises more transient features and encodes dynamics associated with bistable perception at multiple time scales: minutes (within-trial alternations), seconds (duration of individual percepts), and milliseconds (switches between percepts). Within the second ensemble, we identified a slowly drifting rhythm that correlates with the perceptual states and several oscillators with phase shifts near perceptual switches. Projections of single-trial ECoG data onto these features establish low-dimensional attractor-like geometric structures invariant across subjects and stimulus types. These findings provide supporting neural evidence for computational models with oscillatory-driven attractor-based principles. The feature extraction techniques described here generalize across recording modality and are appropriate when hypothesized low-dimensional dynamics characterize an underlying neural system. SIGNIFICANCE STATEMENT: Irrespective of the sensory modality, neurophysiological studies of multistable perception have typically investigated events time-locked to the perceptual switching rather than the time course of the perceptual states per se. Here, we propose an algorithm that extracts neuronal features of bistable auditory perception from large-scale single-trial data while remaining agnostic to the subject's perceptual reports. The algorithm captures the dynamics of perception at multiple timescales, minutes (within-trial alternations), seconds (durations of individual percepts), and milliseconds (timing of switches), and distinguishes attributes of neural encoding of the stimulus from those encoding the perceptual states. Finally, our analysis identifies a set of latent variables that exhibit alternating dynamics along a low-dimensional manifold, similar to trajectories in attractor-based models for perceptual bistability.
Affiliation(s)
- Pake Melland
- Department of Mathematics, Southern Methodist University, Dallas, Texas 75275
- Applied Mathematical & Computational Sciences, The University of Iowa, Iowa City, Iowa 52242
- Rodica Curtu
- Department of Mathematics, The University of Iowa, Iowa City, Iowa 52242
- The Iowa Neuroscience Institute, The University of Iowa, Iowa City, Iowa 52242
8. Heynckes M, Lage-Castellanos A, De Weerd P, Formisano E, De Martino F. Layer-specific correlates of detected and undetected auditory targets during attention. Curr Res Neurobiol 2023; 4:100075. PMID: 36755988. PMCID: PMC9900365. DOI: 10.1016/j.crneur.2023.100075.
Abstract
In everyday life, the processing of acoustic information allows us to react to subtle changes in the auditory scene. Yet even when closely attending to sounds in the context of a task, we occasionally miss task-relevant features. The neural computations that underlie our ability to detect behaviorally relevant sound changes are thought to be grounded in both feedforward and feedback processes within the auditory hierarchy. Here, we assessed the role of feedforward and feedback contributions in primary and non-primary auditory areas during behavioral detection of target sounds using submillimeter spatial resolution functional magnetic resonance imaging (fMRI) at high field strength (7 T) in humans. We demonstrate that the successful detection of subtle temporal shifts in target sounds leads to a selective increase of activation in superficial layers of primary auditory cortex (PAC). These results indicate that feedback signals reaching as far back as PAC may be relevant to the detection of targets in the auditory scene.
Affiliation(s)
- Miriam Heynckes
- Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, 6229 ER, Maastricht, the Netherlands
- Agustin Lage-Castellanos
- Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, 6229 ER, Maastricht, the Netherlands
- Peter De Weerd
- Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, 6229 ER, Maastricht, the Netherlands
- Elia Formisano
- Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, 6229 ER, Maastricht, the Netherlands; Maastricht Centre for Systems Biology, Maastricht University, Universiteitssingel 60, 6229 ER, Maastricht, the Netherlands
- Federico De Martino
- Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, 6229 ER, Maastricht, the Netherlands. Corresponding author: Oxfordlaan 55, 6229 EV, Maastricht, the Netherlands
9. Thomassen S, Hartung K, Einhäuser W, Bendixen A. Low-high-low or high-low-high? Pattern effects on sequential auditory scene analysis. J Acoust Soc Am 2022; 152:2758. PMID: 36456271. DOI: 10.1121/10.0015054.
Abstract
Sequential auditory scene analysis (ASA) is often studied using sequences of two alternating tones, such as ABAB or ABA_, with "_" denoting a silent gap, and "A" and "B" sine tones differing in frequency (nominally low and high). Many studies implicitly assume that the specific arrangement (ABAB vs ABA_, as well as low-high-low vs high-low-high within ABA_) plays a negligible role, such that decisions about the tone pattern can be governed by other considerations. To explicitly test this assumption, a systematic comparison of different tone patterns for two-tone sequences was performed in three different experiments. Participants were asked to report whether they perceived the sequences as originating from a single sound source (integrated) or from two interleaved sources (segregated). Results indicate that core findings of sequential ASA, such as an effect of frequency separation on the proportion of integrated and segregated percepts, are similar across the different patterns during prolonged listening. However, at sequence onset, the integrated percept was more likely to be reported by the participants in ABA_low-high-low than in ABA_high-low-high sequences. This asymmetry is important for models of sequential ASA, since the formation of percepts at onset is an integral part of understanding how auditory interpretations build up.
Affiliation(s)
- Sabine Thomassen
- Cognitive Systems Lab, Faculty of Natural Sciences, Chemnitz University of Technology, 09107 Chemnitz, Germany
- Kevin Hartung
- Cognitive Systems Lab, Faculty of Natural Sciences, Chemnitz University of Technology, 09107 Chemnitz, Germany
- Wolfgang Einhäuser
- Physics of Cognition Group, Faculty of Natural Sciences, Chemnitz University of Technology, 09107 Chemnitz, Germany
- Alexandra Bendixen
- Cognitive Systems Lab, Faculty of Natural Sciences, Chemnitz University of Technology, 09107 Chemnitz, Germany
10. Attentional control via synaptic gain mechanisms in auditory streaming. Brain Res 2021; 1778:147720. PMID: 34785256. DOI: 10.1016/j.brainres.2021.147720.
Abstract
Attention is a crucial component in sound source segregation allowing auditory objects of interest to be both singled out and held in focus. Our study utilizes a fundamental paradigm for sound source segregation: a sequence of interleaved tones, A and B, of different frequencies that can be heard as a single integrated stream or segregated into two streams (auditory streaming paradigm). We focus on the irregular alternations between integrated and segregated that occur for long presentations, so-called auditory bistability. Psychoacoustic experiments demonstrate how attentional control, a listener's intention to experience integrated or segregated, biases perception in favour of different perceptual interpretations. Our data show that this is achieved by prolonging the dominance times of the attended percept and, to a lesser extent, by curtailing the dominance times of the unattended percept, an effect that remains consistent across a range of values for the difference in frequency between A and B. An existing neuromechanistic model describes the neural dynamics of perceptual competition downstream of primary auditory cortex (A1). The model allows us to propose plausible neural mechanisms for attentional control, as linked to different attentional strategies, in a direct comparison with behavioural data. A mechanism based on a percept-specific input gain best accounts for the effects of attentional control.
11. Rajasingam SL, Summers RJ, Roberts B. The dynamics of auditory stream segregation: Effects of sudden changes in frequency, level, or modulation. J Acoust Soc Am 2021; 149:3769. PMID: 34241493. DOI: 10.1121/10.0005049.
Abstract
Three experiments explored the effects of abrupt changes in stimulus properties on streaming dynamics. Listeners monitored 20-s-long low- and high-frequency (LHL-) tone sequences and reported the number of streams heard throughout. Experiments 1 and 2 used pure tones and examined the effects of changing triplet base frequency and level, respectively. Abrupt changes in base frequency (±3-12 semitones) caused significant magnitude-related falls in segregation (resetting), regardless of transition direction, but an asymmetry occurred for changes in level (±12 dB). Rising-level transitions usually decreased segregation significantly, whereas falling-level transitions had little or no effect. Experiment 3 used pure tones (unmodulated) and narrowly spaced (±25 Hz) tone pairs (dyads); the two evoke similar excitation patterns, but dyads are strongly modulated with a distinctive timbre. Dyad-only sequences induced a strongly segregated percept, limiting scope for further build-up. Alternation between groups of pure tones and dyads produced large, asymmetric changes in streaming. Dyad-to-pure transitions caused substantial resetting, but pure-to-dyad transitions sometimes elicited even greater segregation than for the corresponding interval in dyad-only sequences (overshoot). The results indicate that abrupt changes in timbre can strongly affect the likelihood of stream segregation without introducing significant peripheral-channeling cues. These asymmetric effects of transition direction are reminiscent of subtractive adaptation in vision.
Affiliation(s)
- Saima L Rajasingam
- Department of Vision and Hearing Sciences, Anglia Ruskin University, Cambridge CB1 1PT, United Kingdom
- Robert J Summers
- School of Psychology, Aston University, Birmingham B4 7ET, United Kingdom
- Brian Roberts
- School of Psychology, Aston University, Birmingham B4 7ET, United Kingdom
12. Ferrario A, Rankin J. Auditory streaming emerges from fast excitation and slow delayed inhibition. J Math Neurosci 2021; 11:8. PMID: 33939042. PMCID: PMC8093365. DOI: 10.1186/s13408-021-00106-2.
Abstract
In the auditory streaming paradigm, alternating sequences of pure tones can be perceived as a single galloping rhythm (integration) or as two sequences with separated low and high tones (segregation). Although studied for decades, the neural mechanisms underlying this perceptual grouping of sound remain a mystery. With the aim of identifying a plausible minimal neural circuit that captures this phenomenon, we propose a firing rate model with two periodically forced neural populations coupled by fast direct excitation and slow delayed inhibition. By analyzing the model in a non-smooth, slow-fast regime, we analytically prove the existence of a rich repertoire of dynamical states and of their parameter dependent transitions. We impose plausible parameter restrictions and link all states with perceptual interpretations. Regions of stimulus parameters occupied by states linked with each percept match those found in behavioural experiments. Our model suggests that slow inhibition masks the perception of subsequent tones during segregation (forward masking), whereas fast excitation enables integration for large pitch differences between the two tones.
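The two-population architecture described here can be caricatured with a forward-Euler firing rate simulation. This is a minimal sketch, not the authors' analyzed equations: the weights, delay, time constant, forcing rate, and the use of the tone-frequency difference as a simple scaling of the B drive are all hypothetical choices made for illustration.

```python
import math

def simulate(T=2.0, dt=0.001, df=0.5, w_exc=0.4, w_inh=0.8,
             delay=0.06, tau=0.01):
    """Two rate units rA, rB receive antiphase periodic tone drive; each
    sends the other fast direct excitation (no delay) and slow delayed
    inhibition (delay in seconds), integrated with the Euler method."""
    steps = int(T / dt)
    d = int(delay / dt)                       # inhibition delay in steps

    def f(x):                                 # rectified sigmoidal gain
        return max(0.0, math.tanh(x))

    rA, rB = [0.0] * steps, [0.0] * steps
    for t in range(1, steps):
        phase = 2 * math.pi * 4.0 * t * dt    # 4 Hz alternating tone forcing
        iA = max(0.0, math.sin(phase))
        iB = max(0.0, -math.sin(phase)) * (1.0 - df)  # df weakens cross-drive
        inhA = w_inh * (rB[t - 1 - d] if t > d else 0.0)
        inhB = w_inh * (rA[t - 1 - d] if t > d else 0.0)
        rA[t] = rA[t-1] + dt / tau * (-rA[t-1] + f(iA + w_exc * rB[t-1] - inhA))
        rB[t] = rB[t-1] + dt / tau * (-rB[t-1] + f(iB + w_exc * rA[t-1] - inhB))
    return rA, rB
```

In this caricature, strong delayed inhibition suppresses the response to the following tone (forward masking, linked to segregation), while the fast excitatory coupling lets each unit ride along with the other's tone (linked to integration).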
Affiliation(s)
- Andrea Ferrario
- Department of Mathematics, College of Engineering, Mathematics & Physical Sciences, University of Exeter, Exeter, UK
- James Rankin
- Department of Mathematics, College of Engineering, Mathematics & Physical Sciences, University of Exeter, Exeter, UK
13. Holmes E, Zeidman P, Friston KJ, Griffiths TD. Difficulties with Speech-in-Noise Perception Related to Fundamental Grouping Processes in Auditory Cortex. Cereb Cortex 2020; 31:1582-1596. PMID: 33136138. PMCID: PMC7869094. DOI: 10.1093/cercor/bhaa311.
Abstract
In our everyday lives, we are often required to follow a conversation when background noise is present (“speech-in-noise” [SPIN] perception). SPIN perception varies widely—and people who are worse at SPIN perception are also worse at fundamental auditory grouping, as assessed by figure-ground tasks. Here, we examined the cortical processes that link difficulties with SPIN perception to difficulties with figure-ground perception using functional magnetic resonance imaging. We found strong evidence that the earliest stages of the auditory cortical hierarchy (left core and belt areas) are similarly disinhibited when SPIN and figure-ground tasks are more difficult (i.e., at target-to-masker ratios corresponding to 60% rather than 90% performance)—consistent with increased cortical gain at lower levels of the auditory hierarchy. Overall, our results reveal a common neural substrate for these basic (figure-ground) and naturally relevant (SPIN) tasks—which provides a common computational basis for the link between SPIN perception and fundamental auditory grouping.
Affiliation(s)
- Emma Holmes
- Wellcome Centre for Human Neuroimaging, UCL Queen Square Institute of Neurology, UCL, London WC1N 3AR, UK
- Peter Zeidman
- Wellcome Centre for Human Neuroimaging, UCL Queen Square Institute of Neurology, UCL, London WC1N 3AR, UK
- Karl J Friston
- Wellcome Centre for Human Neuroimaging, UCL Queen Square Institute of Neurology, UCL, London WC1N 3AR, UK
- Timothy D Griffiths
- Wellcome Centre for Human Neuroimaging, UCL Queen Square Institute of Neurology, UCL, London WC1N 3AR, UK
- Biosciences Institute, Faculty of Medical Sciences, Newcastle University, Newcastle upon Tyne NE2 4HH, UK
14
Nguyen QA, Rinzel J, Curtu R. Buildup and bistability in auditory streaming as an evidence accumulation process with saturation. PLoS Comput Biol 2020; 16:e1008152. [PMID: 32853256 PMCID: PMC7480857 DOI: 10.1371/journal.pcbi.1008152] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/03/2020] [Revised: 09/09/2020] [Accepted: 07/15/2020] [Indexed: 12/23/2022] Open
Abstract
A repeating triplet-sequence ABA- of non-overlapping brief tones, A and B, is a valued paradigm for studying auditory stream formation and the cocktail party problem. The stimulus is "heard" either as a galloping pattern (integration) or as two interleaved streams (segregation); the initial percept is typically integration, followed by spontaneous alternations between segregation and integration, each being dominant for a few seconds. The probability of segregation grows over seconds, from near-zero to a steady value, defining the buildup function, BUF. As the difference in tone frequencies, DF, increases, the BUF's stationary level increases and the BUF rises faster. Percept durations have DF-dependent means and are gamma-like distributed. Behavioral and computational studies usually characterize triplet streaming either during alternations or during buildup. Here, our experimental design and modeling encompass both. We propose a pseudo-neuromechanistic model that incorporates spiking activity in primary auditory cortex, A1, as input and resolves perception along two network-layers downstream of A1. Our model is straightforward and intuitive. It describes the noisy accumulation of evidence against the current percept which generates switches when reaching a threshold. Accumulation can saturate either above or below threshold; if below, the switching dynamics resemble noise-induced transitions from an attractor state. Our model accounts quantitatively for three key features of data: the BUFs, mean durations, and normalized dominance duration distributions, at various DF values. It describes perceptual alternations without competition per se, and underscores that treating triplets in the sequence independently and averaging across trials, as implemented in earlier widely cited studies, is inadequate.
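The accumulation-to-threshold idea can be illustrated with a minimal sketch (not the authors' fitted model): evidence against the current percept drifts noisily toward a saturation level, and a switch is triggered at threshold.

```python
import numpy as np

def accumulate_switches(theta=1.0, sat=1.4, gain=2.0, noise=0.3,
                        dt=0.01, t_max=60.0, seed=1):
    """Toy evidence accumulator: evidence e drifts toward saturation `sat`
    with additive noise; crossing threshold `theta` triggers a perceptual
    switch and resets e. If sat < theta, switches become noise-induced,
    attractor-like transitions. Illustrative parameters only."""
    rng = np.random.default_rng(seed)
    e, percept, switch_times = 0.0, 0, []
    for k in range(round(t_max / dt)):
        e += dt * gain * (sat - e) + np.sqrt(dt) * noise * rng.standard_normal()
        e = max(e, 0.0)
        if e >= theta:
            percept = 1 - percept        # alternate integration/segregation
            switch_times.append(k * dt)
            e = 0.0                      # reset the accumulator
    return switch_times

switches = accumulate_switches()
```

Dominance durations are the differences between successive switch times; with `sat` above threshold they cluster around the deterministic crossing time, and the noise gives them a right-skewed, gamma-like spread.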
Affiliation(s)
- Quynh-Anh Nguyen
- Department of Mathematics, The University of Iowa, Iowa City, Iowa, United States of America
- John Rinzel
- Center for Neural Science, New York University, New York, New York, United States of America
- Courant Institute of Mathematical Sciences, New York University, New York, New York, United States of America
- Rodica Curtu
- Department of Mathematics, The University of Iowa, Iowa City, Iowa, United States of America
- Iowa Neuroscience Institute, Human Brain Research Laboratory, Iowa City, Iowa, United States of America
15
Gustafson SJ, Grose J, Buss E. Perceptual organization and stability of auditory streaming for pure tones and /ba/ stimuli. J Acoust Soc Am 2020; 148:EL159. [PMID: 32873027 PMCID: PMC7438158 DOI: 10.1121/10.0001744] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/26/2020] [Revised: 07/23/2020] [Accepted: 07/27/2020] [Indexed: 06/11/2023]
Abstract
The dynamics of auditory stream segregation were evaluated using repeating triplets composed of pure tones or the syllable /ba/. Stimuli differed in frequency (tones) or fundamental frequency (speech) by 4, 6, 8, or 10 semitones, and the standard frequency was either 250 Hz (tones and speech) or 400 Hz (tones). Twenty normal-hearing adults participated. For both tones and speech, a two-stream percept became more likely as frequency separation increased. Perceptual organization for speech tended to be more integrated and less stable compared to tones. Results suggest that prior data patterns observed with tones in this paradigm may generalize to speech stimuli.
Affiliation(s)
- Samantha J Gustafson
- Department of Communication Sciences and Disorders, University of Utah, 390 South 1530 East, Salt Lake City, Utah 84112, USA
- John Grose
- Department of Otolaryngology-Head and Neck Surgery, University of North Carolina, 170 Manning Drive, Chapel Hill, North Carolina 27599, USA
- Emily Buss
- Department of Otolaryngology-Head and Neck Surgery, University of North Carolina, 170 Manning Drive, Chapel Hill, North Carolina 27599, USA
16
Cai H, Dent ML. Attention capture in birds performing an auditory streaming task. PLoS One 2020; 15:e0235420. [PMID: 32589692 PMCID: PMC7319309 DOI: 10.1371/journal.pone.0235420] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/04/2020] [Accepted: 06/15/2020] [Indexed: 11/19/2022] Open
Abstract
Numerous animal models have been used to investigate the neural mechanisms of auditory processing in complex acoustic environments, but it is unclear whether an animal’s auditory attention is functionally similar to a human’s in processing competing auditory scenes. Here we investigated the effects of attention capture in birds performing an objective auditory streaming paradigm. The classical ABAB… patterned pure tone sequences were modified and used for the task. We trained the birds to selectively attend to a target stream and only respond to the deviant appearing in the target stream, even though their attention may be captured by a deviant in the background stream. When no deviant appeared in the background stream, the birds experienced the buildup of streaming in a qualitatively similar way to that observed in a subjective paradigm. Although the birds were trained to selectively attend to the target stream, they failed to avoid the involuntary attention switch caused by the background deviant, especially when the background deviant was sequentially unpredictable. Their global performance deteriorated more with increasingly salient background deviants, where the buildup process was reset by the background distractor. Moreover, sequential predictability of the background deviant facilitated the recovery of the buildup process after attention capture. This is the first study that addresses the perceptual consequences of the joint effects of top-down and bottom-up attention in behaving animals.
Affiliation(s)
- Huaizhen Cai
- Department of Psychology, University at Buffalo, The State University of New York, Buffalo, New York, United States of America
- Micheal L. Dent
- Department of Psychology, University at Buffalo, The State University of New York, Buffalo, New York, United States of America
17
Kondo HM, Lin IF. Excitation-inhibition balance and auditory multistable perception are correlated with autistic traits and schizotypy in a non-clinical population. Sci Rep 2020; 10:8171. [PMID: 32424307 PMCID: PMC7234986 DOI: 10.1038/s41598-020-65126-6] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/16/2019] [Accepted: 04/27/2020] [Indexed: 12/20/2022] Open
Abstract
Individuals with autism spectrum disorder and individuals with schizophrenia have impaired social and communication skills. They also have altered auditory perception. This study investigated autistic traits and schizotypy in a non-clinical population, the excitation-inhibition (EI) balance in different brain regions, and auditory multistable perception. Thirty-four healthy participants were assessed by the Autism-Spectrum Quotient (AQ) and Schizotypal Personality Questionnaire (SPQ). The EI balance was evaluated by measuring the resting-state concentrations of glutamate-glutamine (Glx) and γ-aminobutyric acid (GABA) in vivo by using magnetic resonance spectroscopy. To observe the correlation between their traits and perception, we conducted an auditory streaming task and a verbal transformation task, in which participants reported spontaneous perceptual switching while listening to a sound sequence. Their AQ and SPQ scores were positively correlated with the Glx/GABA ratio in the auditory cortex but not in the frontal areas. These scores were negatively correlated with the number of perceptual switches in the verbal transformation task but not in the auditory streaming task. Our results suggest that the EI balance in the auditory cortex and the perceptual formation of speech are involved in autistic traits and schizotypy.
Affiliation(s)
- Hirohito M Kondo
- School of Psychology, Chukyo University, Nagoya, Aichi, 466-8666, Japan
- Human Information Science Laboratory, NTT Communication Science Laboratories, NTT Corporation, Atsugi, Kanagawa, 243-0198, Japan
- I-Fan Lin
- Department of Occupational Medicine, Shuang Ho Hospital, New Taipei City, 235, Taiwan
- Department of Medicine, Taipei Medical University, Taipei, 110, Taiwan
18
Szalárdy O, Tóth B, Farkas D, Orosz G, Honbolygó F, Winkler I. Linguistic predictability influences auditory stimulus classification within two concurrent speech streams. Psychophysiology 2020; 57:e13547. [DOI: 10.1111/psyp.13547] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/01/2019] [Revised: 01/20/2020] [Accepted: 01/22/2020] [Indexed: 11/30/2022]
Affiliation(s)
- Orsolya Szalárdy
- Faculty of Medicine, Institute of Behavioural Sciences, Semmelweis University, Budapest, Hungary
- Institute of Cognitive Neuroscience and Psychology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Budapest, Hungary
- Brigitta Tóth
- Institute of Cognitive Neuroscience and Psychology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Budapest, Hungary
- Dávid Farkas
- Analytics Development, Performance Management and Analytics, Business Development, Integrated Supply Chain Management, Nokia Business Services, Nokia Operations, Nokia, Budapest, Hungary
- Gábor Orosz
- Department of Psychology, Stanford University, Stanford, CA, USA
- Ferenc Honbolygó
- Brain Imaging Centre, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Budapest, Hungary
- Institute of Psychology, ELTE Eötvös Loránd University, Budapest, Hungary
- István Winkler
- Institute of Cognitive Neuroscience and Psychology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Budapest, Hungary
19
Streaming of Repeated Noise in Primary and Secondary Fields of Auditory Cortex. J Neurosci 2020; 40:3783-3798. [PMID: 32273487 DOI: 10.1523/jneurosci.2105-19.2020] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/28/2019] [Revised: 02/06/2020] [Accepted: 02/11/2020] [Indexed: 11/21/2022] Open
Abstract
Statistical regularities in natural sounds facilitate the perceptual segregation of auditory sources, or streams. Repetition is one cue that drives stream segregation in humans, but the neural basis of this perceptual phenomenon remains unknown. We demonstrated a similar perceptual ability in animals by training ferrets of both sexes to detect a stream of repeating noise samples (foreground) embedded in a stream of random samples (background). During passive listening, we recorded neural activity in primary auditory cortex (A1) and secondary auditory cortex (posterior ectosylvian gyrus, PEG). We used two context-dependent encoding models to test for evidence of streaming of the repeating stimulus. The first was based on average evoked activity per noise sample and the second on the spectro-temporal receptive field. Both approaches tested whether differences in neural responses to repeating versus random stimuli were better modeled by scaling the response to both streams equally (global gain) or by separately scaling the response to the foreground versus background stream (stream-specific gain). Consistent with previous observations of adaptation, we found an overall reduction in global gain when the stimulus began to repeat. However, when we measured stream-specific changes in gain, responses to the foreground were enhanced relative to the background. This enhancement was stronger in PEG than A1. In A1, enhancement was strongest in units with low sparseness (i.e., broad sensory tuning) and with tuning selective for the repeated sample. Enhancement of responses to the foreground relative to the background provides evidence for stream segregation that emerges in A1 and is refined in PEG. SIGNIFICANCE STATEMENT To interact with the world successfully, the brain must parse behaviorally important information from a complex sensory environment.
Complex mixtures of sounds often arrive at the ears simultaneously or in close succession, yet they are effortlessly segregated into distinct perceptual sources. This process breaks down in hearing-impaired individuals and speech recognition devices. By identifying the underlying neural mechanisms that facilitate perceptual segregation, we can develop strategies for ameliorating hearing loss and improving speech recognition technology in the presence of background noise. Here, we present evidence to support a hierarchical process, present in primary auditory cortex and refined in secondary auditory cortex, in which sound repetition facilitates segregation.
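The two encoding comparisons described in the abstract can be illustrated with a least-squares sketch on synthetic data (hypothetical variable names; not the authors' analysis code): a global-gain model scales the summed foreground and background drive by one factor, while the stream-specific model fits separate gains.

```python
import numpy as np

rng = np.random.default_rng(0)
fg = rng.random(200)   # drive from the repeating (foreground) stream
bg = rng.random(200)   # drive from the random (background) stream
# synthetic neural response with a larger foreground than background gain
resp = 1.5 * fg + 0.7 * bg + 0.05 * rng.standard_normal(200)

# Global gain: one shared scale factor applied to both streams.
X_global = (fg + bg)[:, None]
g, *_ = np.linalg.lstsq(X_global, resp, rcond=None)
err_global = np.sum((resp - X_global @ g) ** 2)

# Stream-specific gain: separate scale factors for each stream.
X_stream = np.c_[fg, bg]
gs, *_ = np.linalg.lstsq(X_stream, resp, rcond=None)
err_stream = np.sum((resp - X_stream @ gs) ** 2)
```

Because the synthetic response was generated with unequal gains, the stream-specific fit recovers a larger foreground gain and a smaller residual error, mirroring the kind of foreground enhancement the study reports.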
20
Little DF, Snyder JS, Elhilali M. Ensemble modeling of auditory streaming reveals potential sources of bistability across the perceptual hierarchy. PLoS Comput Biol 2020; 16:e1007746. [PMID: 32275706 PMCID: PMC7185718 DOI: 10.1371/journal.pcbi.1007746] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/10/2019] [Revised: 04/27/2020] [Accepted: 02/25/2020] [Indexed: 11/19/2022] Open
Abstract
Perceptual bistability, the spontaneous, irregular fluctuation of perception between two interpretations of a stimulus, occurs when observing a large variety of ambiguous stimulus configurations. This phenomenon has the potential to serve as a tool for, among other things, understanding how function varies across individuals due to the large individual differences that manifest during perceptual bistability. Yet it remains difficult to interpret the functional processes at work without knowing where bistability arises during perception. In this study we explore the hypothesis that bistability originates from multiple sources distributed across the perceptual hierarchy. We develop a hierarchical model of auditory processing comprising three distinct levels: a Peripheral, tonotopic analysis, a Central analysis computing features found more centrally in the auditory system, and an Object analysis, where sounds are segmented into different streams. We model bistable perception within this system by applying adaptation, inhibition and noise into one or all of the three levels of the hierarchy. We evaluate a large ensemble of variations of this hierarchical model, where each model has a different configuration of adaptation, inhibition and noise. This approach avoids the assumption that a single configuration must be invoked to explain the data. Each model is evaluated based on its ability to replicate two hallmarks of bistability during auditory streaming: the selectivity of bistability to specific stimulus configurations, and the characteristic log-normal pattern of perceptual switches. Consistent with a distributed origin, a broad range of model parameters across this hierarchy lead to a plausible form of perceptual bistability.
Affiliation(s)
- David F. Little
- Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, Maryland, United States of America
- Joel S. Snyder
- Department of Psychology, University of Nevada, Las Vegas, Las Vegas, Nevada, United States of America
- Mounya Elhilali
- Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, Maryland, United States of America
21
Bidelman GM, Bush LC, Boudreaux AM. Effects of Noise on the Behavioral and Neural Categorization of Speech. Front Neurosci 2020; 14:153. [PMID: 32180700 PMCID: PMC7057933 DOI: 10.3389/fnins.2020.00153] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/17/2019] [Accepted: 02/10/2020] [Indexed: 02/02/2023] Open
Abstract
We investigated whether the categorical perception (CP) of speech might also provide a mechanism that aids its perception in noise. We varied signal-to-noise ratio (SNR) [clear, 0 dB, -5 dB] while listeners classified an acoustic-phonetic continuum (/u/ to /a/). Noise-related changes in behavioral categorization were only observed at the lowest SNR. Event-related brain potentials (ERPs) differentiated category vs. category-ambiguous speech by the P2 wave (~180-320 ms). Paralleling behavior, neural responses to speech with clear phonetic status (i.e., continuum endpoints) were robust to noise down to -5 dB SNR, whereas responses to ambiguous tokens declined with decreasing SNR. Results demonstrate that phonetic speech representations are more resistant to degradation than corresponding acoustic representations. Findings suggest the mere process of binning speech sounds into categories provides a robust mechanism to aid figure-ground speech perception by fortifying abstract categories from the acoustic signal and making the speech code more resistant to external interferences.
Affiliation(s)
- Gavin M Bidelman
- Institute for Intelligent Systems, University of Memphis, Memphis, TN, United States
- School of Communication Sciences and Disorders, University of Memphis, Memphis, TN, United States
- Department of Anatomy and Neurobiology, University of Tennessee Health Sciences Center, Memphis, TN, United States
- Lauren C Bush
- School of Communication Sciences and Disorders, University of Memphis, Memphis, TN, United States
- Alex M Boudreaux
- School of Communication Sciences and Disorders, University of Memphis, Memphis, TN, United States
22
Neural correlates of perceptual switching while listening to bistable auditory streaming stimuli. Neuroimage 2020; 204:116220. [DOI: 10.1016/j.neuroimage.2019.116220] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/12/2019] [Revised: 08/19/2019] [Accepted: 09/19/2019] [Indexed: 11/15/2022] Open
23
Auditory streaming and bistability paradigm extended to a dynamic environment. Hear Res 2019; 383:107807. [PMID: 31622836 DOI: 10.1016/j.heares.2019.107807] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 07/25/2019] [Revised: 09/19/2019] [Accepted: 10/01/2019] [Indexed: 11/23/2022]
Abstract
We explore stream segregation with temporally modulated acoustic features using behavioral experiments and modelling. The auditory streaming paradigm, in which alternating high-frequency A tones and low-frequency B tones appear in a repeating ABA- pattern, has been shown to be perceptually bistable for extended presentations (order of minutes). For a fixed, repeating stimulus, perception spontaneously changes (switches) at random times, every 2-15 s, between an integrated interpretation with a galloping rhythm and segregated streams. Streaming in a natural auditory environment requires segregation of auditory objects with features that evolve over time. With the relatively idealized ABA-triplet paradigm, we explore perceptual switching in a non-static environment by considering slowly and periodically varying stimulus features. Our previously published model captures the dynamics of auditory bistability and predicts here how perceptual switches are entrained, tightly locked to the rising and falling phase of modulation. In psychoacoustic experiments we find that entrainment depends on both the period of modulation and the intrinsic switch characteristics of individual listeners. The extended auditory streaming paradigm with slowly modulated stimulus features presented here will be of significant interest for future imaging and neurophysiology experiments by reducing the need for subjective perceptual reports of ongoing perception.
24
Symonds RM, Zhou JW, Cole SL, Brace KM, Sussman ES. Cognitive resources are distributed among the entire auditory landscape in auditory scene analysis. Psychophysiology 2019; 57:e13487. [PMID: 31578762 DOI: 10.1111/psyp.13487] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/23/2018] [Revised: 08/21/2019] [Accepted: 09/04/2019] [Indexed: 01/30/2023]
Abstract
Although attention has been shown to enhance neural representations of selected inputs, the fate of unselected background sounds is still debated. The goal of the current study was to understand how processing resources are distributed among attended and unattended sounds during auditory scene analysis. We used a three-stream paradigm with four acoustic features uniquely defining each sound stream (frequency, envelope shape, spatial location, tone quality). We manipulated task load by having participants perform a difficult auditory task and an easy movie-viewing task with the same set of sounds in separate conditions. The mismatch negativity (MMN) component of event-related brain potentials (ERPs) was measured to evaluate sound processing in both conditions. We found no effect of task demands on unattended sound processing: MMNs were elicited by unattended deviants during both low- and high-load task conditions. A key factor of this result was the use of unique tone feature combinations to distinguish each of the three sound streams, strengthening the segregation of streams. In the auditory task, the P3b component demonstrates a two-stage process of target evaluation. Thus, these results, in conjunction with results of previous studies, suggest that stimulus-driven factors that strengthen stream segregation can free up processing capacity for higher-level analyses. The results illustrate the interactive nature of top-down and stimulus-driven processes in stream formation, supporting a distributive theory of attention that balances the strength of the bottom-up input with perceptual goals in analyzing the auditory scene.
Affiliation(s)
- Renee M Symonds
- Department of Neuroscience, Albert Einstein College of Medicine, Bronx, New York, USA
- Juin W Zhou
- Department of Neuroscience, Albert Einstein College of Medicine, Bronx, New York, USA
- Department of Biomedical Engineering, Stony Brook University, Stony Brook, New York, USA
- Sally L Cole
- Department of Counseling and Clinical Psychology, Teachers College, Columbia University, New York, New York, USA
- Kelin M Brace
- Department of Neuroscience, Albert Einstein College of Medicine, Bronx, New York, USA
- Elyse S Sussman
- Department of Neuroscience, Albert Einstein College of Medicine, Bronx, New York, USA
25
Abstract
Humans and other animals use spatial hearing to rapidly localize events in the environment. However, neural encoding of sound location is a complex process involving the computation and integration of multiple spatial cues that are not represented directly in the sensory organ (the cochlea). Our understanding of these mechanisms has increased enormously in the past few years. Current research is focused on the contribution of animal models for understanding human spatial audition, the effects of behavioural demands on neural sound location encoding, the emergence of a cue-independent location representation in the auditory cortex, and the relationship between single-source and concurrent location encoding in complex auditory scenes. Furthermore, computational modelling seeks to unravel how neural representations of sound source locations are derived from the complex binaural waveforms of real-life sounds. In this article, we review and integrate the latest insights from neurophysiological, neuroimaging and computational modelling studies of mammalian spatial hearing. We propose that the cortical representation of sound location emerges from recurrent processing taking place in a dynamic, adaptive network of early (primary) and higher-order (posterior-dorsal and dorsolateral prefrontal) auditory regions. This cortical network accommodates changing behavioural requirements and is especially relevant for processing the location of real-life, complex sounds and complex auditory scenes.
26
Neural Signatures of Auditory Perceptual Bistability Revealed by Large-Scale Human Intracranial Recordings. J Neurosci 2019; 39:6482-6497. [PMID: 31189576 PMCID: PMC6697394 DOI: 10.1523/jneurosci.0655-18.2019] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/09/2018] [Revised: 05/26/2019] [Accepted: 05/28/2019] [Indexed: 11/25/2022] Open
Abstract
A key challenge in neuroscience is understanding how sensory stimuli give rise to perception, especially when the process is supported by neural activity from an extended network of brain areas. Perception is inherently subjective, so interrogating its neural signatures requires, ideally, a combination of three factors: (1) behavioral tasks that separate stimulus-driven activity from perception per se; (2) human subjects who self-report their percepts while performing those tasks; and (3) concurrent neural recordings acquired at high spatial and temporal resolution. In this study, we analyzed human electrocorticographic recordings obtained during an auditory task which supported mutually exclusive perceptual interpretations. Eight neurosurgical patients (5 male; 3 female) listened to sequences of repeated triplets where tones were separated in frequency by several semitones. Subjects reported spontaneous alternations between two auditory perceptual states, 1-stream and 2-stream, by pressing a button. We compared averaged auditory evoked potentials (AEPs) associated with 1-stream and 2-stream percepts and identified significant differences between them in primary and nonprimary auditory cortex, surrounding auditory-related temporoparietal cortex, and frontal areas. We developed classifiers to identify spatial maps of percept-related differences in the AEP, corroborating findings from statistical analysis. We used one-dimensional embedding spaces to perform the group-level analysis. Our data illustrate exemplar high temporal resolution AEP waveforms in auditory core region; explain inconsistencies in perceptual effects within auditory cortex, reported across noninvasive studies of streaming of triplets; show percept-related changes in frontoparietal areas previously highlighted by studies that focused on perceptual transitions; and demonstrate that auditory cortex encodes maintenance of percepts and switches between them. 
SIGNIFICANCE STATEMENT The human brain has the remarkable ability to discern complex and ambiguous stimuli from the external world by parsing mixed inputs into interpretable segments. However, one's perception can deviate from objective reality. But how do perceptual discrepancies occur? What are their anatomical substrates? To address these questions, we performed intracranial recordings in neurosurgical patients as they reported their perception of sounds associated with two mutually exclusive interpretations. We identified signatures of subjective percepts as distinct from sound-driven brain activity in core and non-core auditory cortex and frontoparietal cortex. These findings were compared with previous studies of auditory bistable perception and suggested that perceptual transitions and maintenance of perceptual states were supported by common neural substrates.
27
Rankin J, Rinzel J. Computational models of auditory perception from feature extraction to stream segregation and behavior. Curr Opin Neurobiol 2019; 58:46-53. [PMID: 31326723 DOI: 10.1016/j.conb.2019.06.009] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/12/2019] [Accepted: 06/22/2019] [Indexed: 10/26/2022]
Abstract
Audition is by nature dynamic, from brainstem processing on sub-millisecond time scales, to segregating and tracking sound sources with changing features, to the pleasure of listening to music and the satisfaction of getting the beat. We review recent advances from computational models of sound localization, of auditory stream segregation and of beat perception/generation. A wealth of behavioral, electrophysiological and imaging studies shed light on these processes, typically with synthesized sounds having regular temporal structure. Computational models integrate knowledge from different experimental fields and at different levels of description. We advocate a neuromechanistic modeling approach that incorporates knowledge of the auditory system from various fields, that utilizes plausible neural mechanisms, and that bridges our understanding across disciplines.
Affiliation(s)
- James Rankin
- College of Engineering, Mathematics and Physical Sciences, University of Exeter, Harrison Building, North Park Rd, Exeter EX4 4QF, UK
- John Rinzel
- Center for Neural Science, New York University, 4 Washington Place, 10003 New York, NY, United States
- Courant Institute of Mathematical Sciences, New York University, 251 Mercer St, 10012 New York, NY, United States
28
Paredes-Gallardo A, Dau T, Marozeau J. Auditory Stream Segregation Can Be Modeled by Neural Competition in Cochlear Implant Listeners. Front Comput Neurosci 2019; 13:42. [PMID: 31333438 PMCID: PMC6616076 DOI: 10.3389/fncom.2019.00042] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/26/2019] [Accepted: 06/17/2019] [Indexed: 11/13/2022] Open
Abstract
Auditory stream segregation is a perceptual process by which the human auditory system groups sounds from different sources into perceptually meaningful elements (e.g., a voice or a melody). The perceptual segregation of sounds is important, for example, for the understanding of speech in noisy scenarios, a particularly challenging task for listeners with a cochlear implant (CI). It has been suggested that some aspects of stream segregation may be explained by relatively basic neural mechanisms at a cortical level. During the past decades, a variety of models have been proposed to account for the data from stream segregation experiments in normal-hearing (NH) listeners. However, little attention has been given to corresponding findings in CI listeners. The present study investigated whether a neural model of sequential stream segregation, proposed to describe the behavioral effects observed in NH listeners, can account for behavioral data from CI listeners. The model operates on the stimulus features at the cortical level and includes a competition stage between the neuronal units encoding the different percepts. The competition arises from a combination of mutual inhibition, adaptation, and additive noise. The model was found to capture the main trends in the behavioral data from CI listeners, such as the larger probability of a segregated percept with increasing feature difference between the sounds, as well as the build-up effect. Importantly, this was achieved without any modification to the model's competition stage, suggesting that stream segregation could be mediated by a similar mechanism in both groups of listeners.
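A hedged sketch of the competition stage's behavioral signature (illustrative parameters and input mapping, not the study's fitted model): if the input favoring the segregated percept grows with the feature difference between sounds, the fraction of time the segregated unit dominates increases accordingly.

```python
import numpy as np

def p_segregated(delta, t_max=60.0, dt=1e-3, seed=3):
    """Fraction of time a 'segregated' unit dominates an 'integrated' unit
    in a two-unit competition (mutual inhibition + adaptation + noise).
    The segregated unit's input grows with feature difference `delta`;
    all numbers are illustrative assumptions."""
    rng = np.random.default_rng(seed)
    f = lambda x: 1.0 / (1.0 + np.exp(-(x - 0.2) / 0.05))  # sigmoid gain
    inputs = np.array([0.6 + 0.15 * delta, 0.8])   # [segregated, integrated]
    r = np.array([0.5, 0.5])    # unit activities
    a = np.zeros(2)             # adaptation variables
    seg_steps = 0
    n = round(t_max / dt)
    for k in range(n):
        noise = 0.08 * np.sqrt(dt / 0.01) * rng.standard_normal(2)
        drive = inputs - 1.1 * r[::-1] - 0.6 * a + noise
        r += dt / 0.01 * (-r + f(drive))
        a += dt / 1.0 * (-a + r)
        seg_steps += int(r[0] > r[1])
    return seg_steps / n

p_small, p_large = p_segregated(0.0), p_segregated(4.0)
```

The same mechanism can also produce a build-up-like effect: early in a run the adaptation variables are low, so the initially dominant interpretation holds for a while before alternations set in.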
Affiliation(s)
- Andreu Paredes-Gallardo: Hearing Systems Section, Department of Health Technology, Technical University of Denmark, Lyngby, Denmark
- Torsten Dau: Hearing Systems Section, Department of Health Technology, Technical University of Denmark, Lyngby, Denmark
- Jeremy Marozeau: Hearing Systems Section, Department of Health Technology, Technical University of Denmark, Lyngby, Denmark
29.
Zou J, Feng J, Xu T, Jin P, Luo C, Zhang J, Pan X, Chen F, Zheng J, Ding N. Auditory and language contributions to neural encoding of speech features in noisy environments. Neuroimage 2019; 192:66-75. [DOI: 10.1016/j.neuroimage.2019.02.047]
30.
Chakrabarty D, Elhilali M. A Gestalt inference model for auditory scene segregation. PLoS Comput Biol 2019; 15:e1006711. [PMID: 30668568] [PMCID: PMC6358108] [DOI: 10.1371/journal.pcbi.1006711]
Abstract
Our current understanding of how the brain segregates auditory scenes into meaningful objects is in line with a Gestalt framework. These Gestalt principles suggest a theory of how different attributes of the soundscape are extracted and then bound together into separate groups that reflect different objects or streams present in the scene. These cues are thought to reflect the underlying statistical structure of natural sounds, much as the statistics of natural images are closely linked to the principles that guide figure-ground segregation and object segmentation in vision. In the present study, we leverage inference in stochastic neural networks to learn emergent grouping cues directly from natural soundscapes, including speech, music, and sounds in nature. The model learns a hierarchy of local and global spectro-temporal attributes reminiscent of the simultaneous and sequential Gestalt cues that underlie the organization of auditory scenes. These mappings operate at multiple time scales to analyze an incoming complex scene and are then fused using a Hebbian network that binds coherent features together into perceptually segregated auditory objects. The proposed architecture successfully emulates a wide range of well-established auditory scene segregation phenomena and quantifies the complementary roles of segregation and binding cues in driving auditory scene segregation.
Affiliation(s)
- Debmalya Chakrabarty: Laboratory for Computational Audio Processing, Center for Speech and Language Processing, Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, MD, USA
- Mounya Elhilali: Laboratory for Computational Audio Processing, Center for Speech and Language Processing, Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, MD, USA
31.
Rajasingam SL, Summers RJ, Roberts B. Stream biasing by different induction sequences: Evaluating stream capture as an account of the segregation-promoting effects of constant-frequency inducers. J Acoust Soc Am 2018; 144:3409. [PMID: 30599694] [DOI: 10.1121/1.5082300]
Abstract
Stream segregation for a test sequence comprising high-frequency (H) and low-frequency (L) pure tones, presented in a galloping rhythm, is much greater when preceded by a constant-frequency induction sequence matching one subset than by an inducer configured like the test sequence; this difference persists for several seconds. It has been proposed that constant-frequency inducers promote stream segregation by capturing the matching subset of test-sequence tones into an on-going, pre-established stream. This explanation was evaluated using 2-s induction sequences followed by longer test sequences (12-20 s). Listeners reported the number of streams heard throughout the test sequence. Experiment 1 used LHL- sequences and one or other subset of inducer tones was attenuated (0-24 dB in 6-dB steps, and ∞). Greater attenuation usually caused a progressive increase in segregation, towards that following the constant-frequency inducer. Experiment 2 used HLH- sequences and the L inducer tones were raised or lowered in frequency relative to their test-sequence counterparts (ΔfI = 0, 0.5, 1.0, or 1.5 × ΔfT ). Either change greatly increased segregation. These results are concordant with the notion of attention switching to new sounds but contradict the stream-capture hypothesis, unless a "proto-object" corresponding to the continuing subset is assumed to form during the induction sequence.
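The LHL- test sequences described above are straightforward to synthesize. The sketch below builds a galloping low-high-low triplet sequence with a silent gap; tone durations, levels, and ramp times are illustrative assumptions, not the study's exact stimulus parameters.

```python
import numpy as np

def pure_tone(freq, dur, sr=44100, amp=0.5):
    """Pure tone with 10-ms raised-cosine onset/offset ramps."""
    t = np.arange(int(dur * sr)) / sr
    y = amp * np.sin(2 * np.pi * freq * t)
    n_ramp = int(0.01 * sr)
    ramp = 0.5 * (1 - np.cos(np.pi * np.arange(n_ramp) / n_ramp))
    y[:n_ramp] *= ramp
    y[-n_ramp:] *= ramp[::-1]
    return y

def lhl_sequence(f_low, semitones, n_triplets, tone_dur=0.1, sr=44100):
    """LHL- 'galloping' triplets: low, high, low tone, then a silent
    gap of one tone duration, repeated n_triplets times."""
    f_high = f_low * 2 ** (semitones / 12)   # H tone a fixed interval above L
    gap = np.zeros(int(tone_dur * sr))
    triplet = np.concatenate([pure_tone(f_low, tone_dur, sr),
                              pure_tone(f_high, tone_dur, sr),
                              pure_tone(f_low, tone_dur, sr),
                              gap])
    return np.tile(triplet, n_triplets)
```

Raising `semitones` widens the L-H frequency separation, the manipulation that classically shifts the percept from one integrated gallop toward two segregated streams.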
Affiliation(s)
- Saima L Rajasingam: Psychology, School of Life and Health Sciences, Aston University, Birmingham B4 7ET, United Kingdom
- Robert J Summers: Psychology, School of Life and Health Sciences, Aston University, Birmingham B4 7ET, United Kingdom
- Brian Roberts: Psychology, School of Life and Health Sciences, Aston University, Birmingham B4 7ET, United Kingdom
32.
Gifford AM, Sperling MR, Sharan A, Gorniak RJ, Williams RB, Davis K, Kahana MJ, Cohen YE. Neuronal phase consistency tracks dynamic changes in acoustic spectral regularity. Eur J Neurosci 2018; 49:1268-1287. [PMID: 30402926] [DOI: 10.1111/ejn.14263]
Abstract
The brain parses the auditory environment into distinct sounds by identifying those acoustic features in the environment that have common relationships (e.g., spectral regularities) with one another and then grouping together the neuronal representations of these features. Although there is a large literature that tests how the brain tracks spectral regularities that are predictable, it is not known how the auditory system tracks spectral regularities that are not predictable and that change dynamically over time. Furthermore, the contribution of brain regions downstream of the auditory cortex to the coding of spectral regularity is unknown. Here, we addressed these two issues by recording electrocorticographic activity while human patients listened to tone-burst sequences with dynamically varying spectral regularities, and we identified potential neuronal mechanisms of the analysis of spectral regularities throughout the brain. We found that the degree of oscillatory stimulus phase consistency (PC) in multiple neuronal-frequency bands tracked spectral regularity. In particular, PC in the delta-frequency band seemed to be the best indicator of spectral regularity. We also found that these regularity representations existed in multiple regions throughout cortex. This widespread, reliable modulation in PC, both in neuronal-frequency space and in cortical space, suggests that phase-based modulations may be a general mechanism for tracking regularity in the auditory system specifically and in other sensory systems more generally. Our findings also support a general role for the delta-frequency band in processing the regularity of auditory stimuli.
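A common way to quantify the oscillatory phase consistency (PC) discussed above is the length of the mean resultant vector of per-trial phases at a frequency of interest. The sketch below is a generic version of such a measure, not the authors' exact analysis pipeline.

```python
import numpy as np

def phase_consistency(trials, sr, freq):
    """Inter-trial phase consistency at one frequency: the length of
    the mean resultant vector of per-trial Fourier phases.
    0 = phases random across trials, 1 = perfectly consistent.
    `trials` is a (n_trials, n_samples) array sampled at `sr` Hz."""
    n_samples = trials.shape[1]
    k = int(round(freq * n_samples / sr))   # FFT bin nearest the target frequency
    phases = np.angle(np.fft.rfft(trials, axis=1)[:, k])
    return float(np.abs(np.mean(np.exp(1j * phases))))
```

Responses that are phase-locked to the stimulus across trials yield values near 1; responses with random phase across trials yield values near 0.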
Affiliation(s)
- Adam M Gifford: Neuroscience Graduate Group, University of Pennsylvania, Philadelphia, Pennsylvania
- Michael R Sperling: Jefferson Comprehensive Epilepsy Center, Department of Neurology, Thomas Jefferson University, Philadelphia, Pennsylvania
- Ashwini Sharan: Jefferson Comprehensive Epilepsy Center, Department of Neurology, Thomas Jefferson University, Philadelphia, Pennsylvania
- Richard J Gorniak: Department of Radiology, Sidney Kimmel Medical College, Thomas Jefferson University, Philadelphia, Pennsylvania
- Ryan B Williams: Department of Psychology, University of Pennsylvania, Philadelphia, Pennsylvania
- Kathryn Davis: Department of Neurology, University of Pennsylvania, Philadelphia, Pennsylvania
- Michael J Kahana: Neuroscience Graduate Group and Department of Psychology, University of Pennsylvania, Philadelphia, Pennsylvania
- Yale E Cohen: Neuroscience Graduate Group and Departments of Otorhinolaryngology, Neuroscience, and Bioengineering, University of Pennsylvania, Philadelphia, Pennsylvania
33.
Cortical tracking of multiple streams outside the focus of attention in naturalistic auditory scenes. Neuroimage 2018; 181:617-626. [DOI: 10.1016/j.neuroimage.2018.07.052]
34.
Ruggles DR, Tausend AN, Shamma SA, Oxenham AJ. Cortical markers of auditory stream segregation revealed for streaming based on tonotopy but not pitch. J Acoust Soc Am 2018; 144:2424. [PMID: 30404514] [PMCID: PMC6909992] [DOI: 10.1121/1.5065392]
Abstract
The brain decomposes mixtures of sounds, such as competing talkers, into perceptual streams that can be attended to individually. Attention can enhance the cortical representation of streams, but it is unknown what acoustic features the enhancement reflects, or where in the auditory pathways attentional enhancement is first observed. Here, behavioral measures of streaming were combined with simultaneous low- and high-frequency envelope-following responses (EFR) that are thought to originate primarily from cortical and subcortical regions, respectively. Repeating triplets of harmonic complex tones were presented with alternating fundamental frequencies. The tones were filtered to contain either low-numbered spectrally resolved harmonics, or only high-numbered unresolved harmonics. The behavioral results confirmed that segregation can be based on either tonotopic or pitch cues. The EFR results revealed no effects of streaming or attention on subcortical responses. Cortical responses revealed attentional enhancement under conditions of streaming, but only when tonotopic cues were available, not when streaming was based only on pitch cues. The results suggest that the attentional modulation of phase-locked responses is dominated by tonotopically tuned cortical neurons that are insensitive to pitch or periodicity cues.
Affiliation(s)
- Dorea R Ruggles: Department of Psychology, University of Minnesota, 75 East River Parkway, Minneapolis, Minnesota 55455, USA
- Alexis N Tausend: Department of Psychology, University of Minnesota, 75 East River Parkway, Minneapolis, Minnesota 55455, USA
- Shihab A Shamma: Electrical and Computer Engineering Department & Institute for Systems, University of Maryland, College Park, Maryland 20740, USA
- Andrew J Oxenham: Department of Psychology, University of Minnesota, 75 East River Parkway, Minneapolis, Minnesota 55455, USA
35.
Kondo HM, Pressnitzer D, Shimada Y, Kochiyama T, Kashino M. Inhibition-excitation balance in the parietal cortex modulates volitional control for auditory and visual multistability. Sci Rep 2018; 8:14548. [PMID: 30267021] [PMCID: PMC6162284] [DOI: 10.1038/s41598-018-32892-3]
Abstract
Perceptual organisation must select one interpretation from several alternatives to guide behaviour. Computational models suggest that this could be achieved through an interplay of inhibition and excitation across competing neural populations coding for each interpretation. Here, to test such models, we used magnetic resonance spectroscopy to non-invasively measure the concentrations of inhibitory γ-aminobutyric acid (GABA) and excitatory glutamate-glutamine (Glx) in several brain regions. Human participants first performed auditory and visual multistability tasks that produced spontaneous switching between percepts. We observed that longer percept durations during behaviour were associated with higher GABA/Glx ratios in the sensory area coding for each modality. When participants were asked to voluntarily modulate their perception, a common factor across modalities emerged: the GABA/Glx ratio in the posterior parietal cortex tended to be positively correlated with the amount of effective volitional control. Our results provide direct evidence that the balance between neural inhibition and excitation within sensory regions resolves perceptual competition. This powerful computational principle appears to be leveraged by both audition and vision, implemented independently across modalities, but modulated by an integrated control process.
Affiliation(s)
- Hirohito M Kondo: School of Psychology, Chukyo University, Nagoya, Aichi, Japan; Human Information Science Laboratory, NTT Communication Science Laboratories, NTT Corporation, Atsugi, Kanagawa, Japan
- Daniel Pressnitzer: Laboratoire des Systèmes Perceptifs, CNRS UMR 8248, Paris, France; Département d'Études Cognitives, École Normale Supérieure, Paris, France
- Yasuhiro Shimada: Brain Activity Imaging Center, ATR-Promotions, Seika-cho, Kyoto, Japan
- Takanori Kochiyama: Brain Activity Imaging Center, ATR-Promotions, Seika-cho, Kyoto, Japan; Department of Cognitive Neuroscience, Advanced Telecommunications Research Institute International, Seika-cho, Kyoto, Japan
- Makio Kashino: Sports Brain Science Project, NTT Communication Science Laboratories, NTT Corporation, Atsugi, Kanagawa, Japan; School of Engineering, Tokyo Institute of Technology, Yokohama, Kanagawa, Japan
36.
Cai H, Screven LA, Dent ML. Behavioral measurements of auditory streaming and build-up by budgerigars (Melopsittacus undulatus). J Acoust Soc Am 2018; 144:1508. [PMID: 30424658] [DOI: 10.1121/1.5054297]
Abstract
The build-up of auditory streaming has been widely investigated in humans, but it is unknown whether animals experience a similar percept when hearing high (H) and low (L) tonal pattern sequences. The paradigm previously used in European starlings (Sturnus vulgaris) was adopted in two experiments to address the build-up of auditory streaming in budgerigars (Melopsittacus undulatus). In experiment 1, different numbers of repetitions of low-high-low triplets were used in five conditions to study the build-up process. In experiment 2, 5 and 15 repetitions of high-low-high triplets were used to investigate the effects of repetition rate, frequency separation, and frequency range of the two tones on the birds' streaming perception. Similar to humans, budgerigars subjectively experienced the build-up process in auditory streaming; faster repetition rates and larger frequency separations enhanced the streaming perception, and these results were consistent across the two frequency ranges. Response latency analysis indicated that the budgerigars needed more time to respond to stimuli that elicited a salient streaming percept. These results indicate, for the first time using a behavioral paradigm, that budgerigars experience a build-up of auditory streaming in a manner similar to humans.
Affiliation(s)
- Huaizhen Cai: Department of Psychology, University at Buffalo, The State University of New York, Buffalo, New York 14260, USA
- Laurel A Screven: Department of Psychology, University at Buffalo, The State University of New York, Buffalo, New York 14260, USA
- Micheal L Dent: Department of Psychology, University at Buffalo, The State University of New York, Buffalo, New York 14260, USA
37.
Selezneva E, Gorkin A, Budinger E, Brosch M. Neuronal correlates of auditory streaming in the auditory cortex of behaving monkeys. Eur J Neurosci 2018; 48:3234-3245. [PMID: 30070745] [DOI: 10.1111/ejn.14098]
Abstract
This study tested the hypothesis that spiking activity in the primary auditory cortex of monkeys is related to auditory stream formation. Evidence for this hypothesis was previously obtained in animals that were passively exposed to stimuli and in which differences in the streaming percept were confounded with differences between the stimuli. In this study, monkeys performed an operant task on sequences that were composed of light flashes and tones. The tones alternated between a high and a low frequency and could be perceived either as one auditory stream or two auditory streams. The flashes promoted either a one-stream percept or a two-stream percept. Comparison of different types of sequences revealed that the neuronal responses to the alternating tones were more similar when the flashes promoted auditory stream integration, and were more dissimilar when the flashes promoted auditory stream segregation. Thus our findings show that the spiking activity in the monkey primary auditory cortex is related to auditory stream formation.
Affiliation(s)
- Eike Budinger: Leibniz-Institut für Neurobiologie, Magdeburg, Germany
38.
Christison-Lagay KL, Cohen YE. The Contribution of Primary Auditory Cortex to Auditory Categorization in Behaving Monkeys. Front Neurosci 2018; 12:601. [PMID: 30210282] [PMCID: PMC6123543] [DOI: 10.3389/fnins.2018.00601]
Abstract
The specific contribution of core auditory cortex to auditory perception, such as categorization, remains controversial. To identify a contribution of the primary auditory cortex (A1) to perception, we recorded A1 activity while monkeys reported whether a temporal sequence of tone bursts was heard as having a "small" or "large" frequency difference. We found that A1 had frequency-tuned responses that habituated, independent of frequency content, as this auditory sequence unfolded over time. We also found that A1 firing rate was modulated by the monkeys' reports of "small" and "large" frequency differences; this modulation correlated with their behavioral performance. These findings are consistent with the hypothesis that A1 contributes to the processes underlying auditory categorization.
Affiliation(s)
- Kate L Christison-Lagay: Neuroscience Graduate Group, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
- Yale E Cohen: Departments of Otorhinolaryngology, Neuroscience, and Bioengineering, University of Pennsylvania, Philadelphia, PA, United States
39.
Noda T, Takahashi H. Behavioral evaluation of auditory stream segregation in rats. Neurosci Res 2018; 141:52-62. [PMID: 29580889] [DOI: 10.1016/j.neures.2018.03.007]
Abstract
Perceptual organization of sound sequences into separate sound sources or streams is called auditory stream segregation. Neural substrates for this process in both the spectral and temporal domains remain to be elucidated. Despite abundant knowledge about their auditory physiology, behavioral evidence for auditory streaming in rodents is still limited. We provide behavioral evidence for auditory streaming in the go/no-go discrimination task, but not in the two-alternative choice task. In the go/no-go discrimination phase, rats were able to discriminate different rhythms corresponding to segregated or integrated tone sequences in both short inter-tone interval (ITI) and long ITI conditions. Nevertheless, performance was poorer in the long ITI group. In probe testing, which assessed the ability to discriminate one of the segregated tone sequences from ABA- tone sequences, the detection rate increased with the difference in frequency (ΔF) for short (100 ms), but not long (200 ms), ITIs. Our results indicate that auditory streaming in rats, with respect to both the spectral and temporal features of the ABA- tone paradigm, is qualitatively analogous to that observed in human psychophysics studies. This suggests that rodents are a valuable model for investigating the neural substrates of auditory streaming.
Affiliation(s)
- Takahiro Noda: Research Center for Advanced Science and Technology, The University of Tokyo, Tokyo, Japan
- Hirokazu Takahashi: Research Center for Advanced Science and Technology, The University of Tokyo, Tokyo, Japan
40.
Knyazeva S, Selezneva E, Gorkin A, Aggelopoulos NC, Brosch M. Neuronal Correlates of Auditory Streaming in Monkey Auditory Cortex for Tone Sequences without Spectral Differences. Front Integr Neurosci 2018; 12:4. [PMID: 29440999] [PMCID: PMC5797536] [DOI: 10.3389/fnint.2018.00004]
Abstract
This study finds a neuronal correlate of auditory perceptual streaming in the primary auditory cortex for sequences of tone complexes that have the same amplitude spectrum but different phase spectra. Our finding is based on microelectrode recordings of multiunit activity from 270 cortical sites in three awake macaque monkeys. The monkeys were presented with repeated sequences of a tone triplet that consisted of an A tone, a B tone, another A tone, and then a pause. The A and B tones were composed of unresolved harmonics formed by adding the harmonics in cosine phase, in alternating phase, or in random phase. A previous psychophysical study on humans revealed that when the A and B tones are similar, humans integrate them into a single auditory stream; when the A and B tones are dissimilar, humans segregate them into separate auditory streams. We found that the similarity of neuronal rate responses to the triplets was highest when all A and B tones had cosine phase. Similarity was intermediate when the A tones had cosine phase and the B tones had alternating phase. Similarity was lowest when the A tones had cosine phase and the B tones had random phase. The present study corroborates and extends previous reports, showing similar correspondences between neuronal activity in the primary auditory cortex and auditory streaming of sound sequences. It is also consistent with Fishman's population separation model of auditory streaming.
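The phase manipulations described above leave the amplitude spectrum untouched while changing only the phase spectrum. The sketch below constructs such harmonic complexes (cosine, alternating, or random phase) with illustrative parameters; it is not the study's exact stimulus code.

```python
import numpy as np

def harmonic_complex(f0, harmonics, dur, phase_mode, sr=44100, seed=0):
    """Sum a set of harmonics of f0 in cosine, alternating
    (cosine/sine), or random phase. The amplitude spectrum is the
    same in every mode; only the phase spectrum differs."""
    harmonics = list(harmonics)
    rng = np.random.default_rng(seed)
    t = np.arange(int(dur * sr)) / sr
    y = np.zeros_like(t)
    for i, h in enumerate(harmonics):
        if phase_mode == "cosine":
            phi = 0.0
        elif phase_mode == "alternating":
            phi = 0.0 if i % 2 == 0 else np.pi / 2   # alternate cosine and sine
        else:  # "random"
            phi = rng.uniform(0, 2 * np.pi)
        y += np.cos(2 * np.pi * h * f0 * t + phi)
    return y / len(harmonics)
```

Because only the phases change, the long-term magnitude spectra of the three variants are identical even though their waveforms (and temporal envelopes) differ, which is exactly what makes these stimuli useful for isolating phase-based streaming cues.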
Affiliation(s)
- Stanislava Knyazeva: Speziallabor Primatenneurobiologie, Leibniz-Institut für Neurobiologie, Magdeburg, Germany
- Elena Selezneva: Speziallabor Primatenneurobiologie, Leibniz-Institut für Neurobiologie, Magdeburg, Germany
- Alexander Gorkin: Speziallabor Primatenneurobiologie, Leibniz-Institut für Neurobiologie, Magdeburg, Germany; Laboratory of Psychophysiology, Institute of Psychology, Moscow, Russia
- Michael Brosch: Speziallabor Primatenneurobiologie, Leibniz-Institut für Neurobiologie, Magdeburg, Germany; Center for Behavioral Brain Sciences, Otto-von-Guericke-University, Magdeburg, Germany
41.
Sanders RD, Winston JS, Barnes GR, Rees G. Magnetoencephalographic Correlates of Perceptual State During Auditory Bistability. Sci Rep 2018; 8:976. [PMID: 29343771] [PMCID: PMC5772671] [DOI: 10.1038/s41598-018-19287-0]
Abstract
Bistability occurs when two alternative percepts can be derived from the same physical stimulus. To identify the neural correlates of specific subjective experiences, we used a bistable auditory stimulus and determined whether the two perceptual states could be distinguished electrophysiologically. Fourteen participants underwent magnetoencephalography while reporting their perceptual experience of a continuous bistable stream of auditory tones. Participants reported bistability with a similar overall proportion of the two alternative percepts (52% vs 48%). At the individual level, sensor-space electrophysiological discrimination between the percepts was possible in 9/14 participants with canonical variate analysis (CVA) or linear support vector machine (SVM) analysis over the space and time dimensions. Classification was possible in 14/14 participants with non-linear SVM. Similar effects were noted in an unconstrained source-space CVA analysis (classifying 10/14 participants), linear SVM (classifying 9/14 participants), and non-linear SVM (classifying 13/14 participants). Source-space analysis restricted to a priori ROIs showed that discrimination was possible in the right and left auditory cortex with each classification approach, but in the right intraparietal sulcus it was apparent only with non-linear SVM and only in a minority of participants. Magnetoencephalography can thus be used to objectively classify auditory experiences from individual subjects.
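As a generic illustration of the cross-validated linear decoding approach described above, the sketch below fits a ridge-regularized least-squares classifier to trial features. It stands in for, and is not, the CVA/SVM pipeline used in the study; all names and parameters are illustrative.

```python
import numpy as np

def decode_accuracy(X, y, n_folds=5, seed=0):
    """Cross-validated linear decoding of a binary perceptual report
    from per-trial sensor features. X: (n_trials, n_features) array;
    y: labels in {0, 1}. Returns mean held-out accuracy."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(y))
    folds = np.array_split(order, n_folds)
    accuracies = []
    for k in range(n_folds):
        test = folds[k]
        train = np.concatenate([folds[j] for j in range(n_folds) if j != k])
        Xtr = np.c_[X[train], np.ones(len(train))]   # append a bias column
        targets = 2.0 * y[train] - 1.0               # map {0,1} -> {-1,+1}
        # Ridge-regularized least-squares weights (normal equations).
        w = np.linalg.solve(Xtr.T @ Xtr + 1e-3 * np.eye(Xtr.shape[1]),
                            Xtr.T @ targets)
        predictions = (np.c_[X[test], np.ones(len(test))] @ w) > 0
        accuracies.append(np.mean(predictions == (y[test] == 1)))
    return float(np.mean(accuracies))
```

Held-out accuracy well above chance (50% for balanced labels) is the criterion by which such per-participant classification is usually judged.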
Affiliation(s)
- Robert D Sanders: Institute of Cognitive Neuroscience, University College London, Alexandra House, 17-19 Queen Square, London WC1N 3AR, United Kingdom; Department of Anesthesiology, University of Wisconsin, Madison, USA
- Joel S Winston: Institute of Cognitive Neuroscience, University College London, Alexandra House, 17-19 Queen Square, London WC1N 3AR, United Kingdom; Wellcome Trust Centre for Neuroimaging, University College London, London WC1N 3BG, United Kingdom
- Gareth R Barnes: Wellcome Trust Centre for Neuroimaging, University College London, London WC1N 3BG, United Kingdom
- Geraint Rees: Institute of Cognitive Neuroscience, University College London, Alexandra House, 17-19 Queen Square, London WC1N 3AR, United Kingdom; Wellcome Trust Centre for Neuroimaging, University College London, London WC1N 3BG, United Kingdom
42.
Rauschecker JP. Where, When, and How: Are they all sensorimotor? Towards a unified view of the dorsal pathway in vision and audition. Cortex 2018; 98:262-268. [PMID: 29183630] [PMCID: PMC5771843] [DOI: 10.1016/j.cortex.2017.10.020]
Abstract
Dual processing streams in sensory systems have been postulated for a long time. Much experimental evidence has accumulated from behavioral, neuropsychological, electrophysiological, neuroanatomical, and neuroimaging work supporting the existence of largely segregated cortical pathways in both vision and audition. More recently, debate has returned to the question of overlap between these pathways and whether there are in fact more than two processing streams. The present piece defends the dual-system view. Focusing on the functions of the dorsal stream in the auditory and language system, I try to reconcile the various models of Where, How, and When into one coherent concept of sensorimotor integration. This framework incorporates principles of internal models in feedback control systems and is applicable to the visual system as well.
Affiliation(s)
- Josef P Rauschecker: Laboratory of Integrative Neuroscience and Cognition, Department of Neuroscience, Georgetown University Medical Center, Washington, DC, USA; Institute for Advanced Study, Technische Universität München, Garching bei München, Germany
43.
Attention Is Required for Knowledge-Based Sequential Grouping: Insights from the Integration of Syllables into Words. J Neurosci 2017; 38:1178-1188. [PMID: 29255005] [DOI: 10.1523/jneurosci.2606-17.2017]
Abstract
How the brain groups sequential sensory events into chunks is a fundamental question in cognitive neuroscience. This study investigates whether top-down attention or specific tasks are required for the brain to apply lexical knowledge to group syllables into words. Neural responses tracking the syllabic and word rhythms of a rhythmic speech sequence were concurrently monitored using electroencephalography (EEG). The participants performed different tasks, attending to either the rhythmic speech sequence or a distractor, which was another speech stream or a nonlinguistic auditory/visual stimulus. Attention to speech, but not a lexical-meaning-related task, was required for reliable neural tracking of words, even when the distractor was a nonlinguistic stimulus presented cross-modally. Neural tracking of syllables, however, was reliably observed in all tested conditions. These results strongly suggest that neural encoding of individual auditory events (i.e., syllables) is automatic, while knowledge-based construction of temporal chunks (i.e., words) crucially relies on top-down attention.
SIGNIFICANCE STATEMENT: Why we cannot understand speech when not paying attention is an old question in psychology and cognitive neuroscience. Speech processing is a complex process that involves multiple stages, e.g., hearing and analyzing the speech sound, recognizing words, and combining words into phrases and sentences. The current study investigates which speech-processing stage is blocked when we do not listen carefully. We show that the brain can reliably encode syllables, basic units of speech sounds, even when we do not pay attention. Nevertheless, when distracted, the brain cannot group syllables into multisyllabic words, which are basic units for speech meaning. Therefore, the process of converting speech sound into meaning crucially relies on attention.
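Frequency-tagged designs like the one above are typically analyzed by reading out spectral power at the stimulus rates (e.g., a syllable rate and a slower word rate). The following is a generic sketch of that readout with illustrative rates; it makes no claim to match the study's EEG pipeline.

```python
import numpy as np

def power_at_rates(response, sr, rates):
    """Spectral power of a neural response time series at candidate
    stimulus rates (e.g., syllable vs. word rate in a frequency-tagged
    design). Returns {rate_hz: power} at the nearest FFT bin."""
    n = len(response)
    freqs = np.fft.rfftfreq(n, 1.0 / sr)
    power = np.abs(np.fft.rfft(response)) ** 2 / n
    return {rate: power[np.argmin(np.abs(freqs - rate))] for rate in rates}
```

Reliable tracking of a rhythm shows up as a power peak at that rhythm's rate; the abstract's key contrast is that the syllable-rate peak survives inattention while the word-rate peak does not.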
|
44
|
Shinn-Cunningham B. Cortical and Sensory Causes of Individual Differences in Selective Attention Ability Among Listeners With Normal Hearing Thresholds. J Speech Lang Hear Res 2017; 60:2976-2988. [PMID: 29049598 PMCID: PMC5945067 DOI: 10.1044/2017_jslhr-h-17-0080] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 02/27/2017] [Revised: 06/23/2017] [Accepted: 07/05/2017] [Indexed: 05/28/2023]
Abstract
PURPOSE: This review provides clinicians with an overview of recent findings relevant to understanding why listeners with normal hearing thresholds (NHTs) sometimes suffer from communication difficulties in noisy settings.
METHOD: Results from neuroscience and psychoacoustics are reviewed.
RESULTS: In noisy settings, listeners focus their attention by engaging cortical brain networks to suppress unimportant sounds; they can then analyze and understand an important sound, such as speech, amidst competing sounds. Differences in the efficacy of top-down control of attention can affect communication abilities. In addition, subclinical deficits in sensory fidelity can disrupt the ability to perceptually segregate sound sources, interfering with selective attention even in listeners with NHTs. Studies of variability in control of attention and in sensory coding fidelity may help to isolate and identify some of the causes of communication disorders in individuals presenting at the clinic with "normal hearing."
CONCLUSIONS: How well an individual with NHTs can understand speech amidst competing sounds depends not only on the sound being audible but also on the integrity of cortical control networks and the fidelity of the representation of suprathreshold sound. Understanding the root cause of difficulties experienced by listeners with NHTs can ultimately lead to new, targeted interventions that address specific deficits affecting communication in noise.
PRESENTATION VIDEO: http://cred.pubs.asha.org/article.aspx?articleid=2601617
Affiliation(s)
- Barbara Shinn-Cunningham
- Center for Research in Sensory Communication and Emerging Neural Technology, Boston University, MA
|
45
|
A Crucial Test of the Population Separation Model of Auditory Stream Segregation in Macaque Primary Auditory Cortex. J Neurosci 2017; 37:10645-10655. [PMID: 28954867 DOI: 10.1523/jneurosci.0792-17.2017] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/23/2017] [Revised: 08/29/2017] [Accepted: 09/05/2017] [Indexed: 11/21/2022] Open
Abstract
An important aspect of auditory scene analysis is auditory stream segregation: the organization of sound sequences into perceptual streams reflecting different sound sources in the environment. Several models have been proposed to account for stream segregation. According to the "population separation" (PS) model, alternating ABAB tone sequences are perceived as a single stream or as two separate streams when "A" and "B" tones activate the same or distinct frequency-tuned neuronal populations in primary auditory cortex (A1), respectively. A crucial test of the PS model is whether it can account for the observation that A and B tones are generally perceived as a single stream when presented synchronously, rather than in an alternating pattern, even if they are widely separated in frequency. Here, we tested the PS model by recording neural responses to alternating (ALT) and synchronous (SYNC) tone sequences in A1 of male macaques. Consistent with predictions of the PS model, a greater effective tonotopic separation of A and B tone responses was observed under ALT than under SYNC conditions, thus paralleling the perceptual organization of the sequences. While other models of stream segregation, such as temporal coherence, are not excluded by the present findings, we conclude that PS is sufficient to account for the perceptual organization of ALT and SYNC sequences and thus remains a viable model of auditory stream segregation.
SIGNIFICANCE STATEMENT: According to the population separation (PS) model of auditory stream segregation, sounds that activate the same or separate neural populations in primary auditory cortex (A1) are perceived as one or two streams, respectively. It is unclear, however, whether the PS model can account for the perception of sounds as a single stream when they are presented synchronously. Here, we tested the PS model by recording neural responses to alternating (ALT) and synchronous (SYNC) tone sequences in macaque A1. A greater effective separation of tonotopic activity patterns was observed under ALT than under SYNC conditions, thus paralleling the perceptual organization of the sequences. Based on these findings, we conclude that PS remains a plausible neurophysiological model of auditory stream segregation.
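The ALT and SYNC paradigms described above can be sketched in a few lines of Python. This is an illustrative reconstruction only; the frequencies, tone durations, gaps, and sampling rate below are placeholder values, not the stimulus parameters used in the study:

```python
import numpy as np

SR = 44100  # sampling rate (Hz); placeholder value


def tone(freq_hz, dur_s, ramp_s=0.005):
    """Pure tone with raised-cosine on/off ramps to avoid onset clicks."""
    t = np.arange(int(dur_s * SR)) / SR
    y = np.sin(2 * np.pi * freq_hz * t)
    n = int(ramp_s * SR)
    ramp = 0.5 * (1 - np.cos(np.pi * np.arange(n) / n))
    y[:n] *= ramp            # fade in
    y[-n:] *= ramp[::-1]     # fade out
    return y


def ab_sequence(f_a, f_b, n_pairs, tone_dur=0.1, gap=0.05, sync=False):
    """ALT: A and B tones alternate in time; SYNC: A and B onset together."""
    a, b = tone(f_a, tone_dur), tone(f_b, tone_dur)
    silence = np.zeros(int(gap * SR))
    if sync:
        chunk = [0.5 * (a + b), silence]      # scaled so the sum stays in [-1, 1]
    else:
        chunk = [a, silence, b, silence]      # A ... B ... A ... B ...
    return np.concatenate(chunk * n_pairs)
```

The returned array can be written to a WAV file or plotted to inspect the two temporal arrangements; the SYNC sequence is half the length of the ALT sequence for the same number of A-B pairs.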
|
46
|
Itatani N, Klump GM. Interaction of spatial and non-spatial cues in auditory stream segregation in the European starling. Eur J Neurosci 2017; 51:1191-1200. [PMID: 28922512 DOI: 10.1111/ejn.13716] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/04/2017] [Revised: 09/14/2017] [Accepted: 09/14/2017] [Indexed: 11/29/2022]
Abstract
Integrating sounds from the same source and segregating sounds from different sources in an acoustic scene are essential functions of the auditory system. Naturally, the auditory system makes use of multiple cues simultaneously. Here, we investigate the interaction between spatial cues and frequency cues in stream segregation in European starlings (Sturnus vulgaris) using an objective measure of perception. Neural responses to streaming sounds were recorded while the bird performed a behavioural task that yields higher sensitivity during a one-stream than a two-stream percept. Birds were trained to detect an onset time shift of the B tone in an ABA- triplet sequence in which A and B could differ in frequency and/or spatial location. If the frequency difference, the spatial separation between the signal sources, or both were increased, behavioural time-shift detection performance deteriorated. Spatial separation had a smaller effect on performance than the frequency difference, and the two cues affected performance additively. Neural responses in the primary auditory forebrain were affected by both frequency and spatial cues. However, frequency and spatial cue differences large enough to elicit behavioural effects did not produce correlated differences in the neural responses. The discrepancy between the neuronal response patterns and the behavioural responses is discussed in relation to the task given to the bird. The perceptual effects of combining different cues in auditory scene analysis indicate that these cues are analysed independently and given different weights, suggesting that the streaming percept arises subsequent to initial cue analysis.
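The ABA- triplet paradigm with a B-tone onset shift can be sketched as follows. This is an illustrative reconstruction, not the study's stimulus code; all frequencies, durations, and the sampling rate are placeholder values:

```python
import numpy as np

SR = 44100  # sampling rate (Hz); placeholder value


def aba_triplets(f_a, f_b, n_triplets, tone_dur=0.06, gap=0.02, b_shift=0.0):
    """ABA- triplet sequence (A, B, A, silent slot).

    b_shift (seconds) delays the B-tone onset; detecting this shift is the
    kind of behavioural target described in the abstract above.
    """
    t = np.arange(int(tone_dur * SR)) / SR
    a = np.sin(2 * np.pi * f_a * t) * np.hanning(len(t))  # windowed A tone
    b = np.sin(2 * np.pi * f_b * t) * np.hanning(len(t))  # windowed B tone
    slot = int((tone_dur + gap) * SR)   # nominal onset-to-onset interval
    trip = np.zeros(4 * slot)           # fourth slot stays silent ("-")
    trip[:len(a)] += a                           # first A tone
    s = slot + int(b_shift * SR)                 # (possibly shifted) B onset
    trip[s:s + len(b)] += b
    trip[2 * slot:2 * slot + len(a)] += a        # second A tone
    return np.tile(trip, n_triplets)
```

With b_shift = 0 the B tone is exactly centred between the A tones; a positive shift delays its onset within the triplet while the overall sequence length is unchanged.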
Affiliation(s)
- Naoya Itatani
- Animal Physiology and Behavior Group, Department for Neuroscience, School for Medicine and Health Sciences, Carl-von-Ossietzky University Oldenburg, 26111 Oldenburg, Germany
- Cluster of Excellence Hearing4all, Carl-von-Ossietzky University Oldenburg, Oldenburg, Germany
- Georg M Klump
- Animal Physiology and Behavior Group, Department for Neuroscience, School for Medicine and Health Sciences, Carl-von-Ossietzky University Oldenburg, 26111 Oldenburg, Germany
- Cluster of Excellence Hearing4all, Carl-von-Ossietzky University Oldenburg, Oldenburg, Germany
|
47
|
Medathati NVK, Rankin J, Meso AI, Kornprobst P, Masson GS. Recurrent network dynamics reconciles visual motion segmentation and integration. Sci Rep 2017; 7:11270. [PMID: 28900120 PMCID: PMC5595847 DOI: 10.1038/s41598-017-11373-z] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/10/2017] [Accepted: 08/18/2017] [Indexed: 11/09/2022] Open
Abstract
In sensory systems, a range of computational rules are presumed to be implemented by neuronal subpopulations with different tuning functions. For instance, in primate cortical area MT, different classes of direction-selective cells have been identified and related either to motion integration, segmentation or transparency. Still, how such different tuning properties are constructed is unclear. The dominant theoretical viewpoint based on a linear-nonlinear feed-forward cascade does not account for their complex temporal dynamics and their versatility when facing different input statistics. Here, we demonstrate that a recurrent network model of visual motion processing can reconcile these different properties. Using a ring network, we show how excitatory and inhibitory interactions can implement different computational rules such as vector averaging, winner-take-all or superposition. The model also captures ordered temporal transitions between these behaviors. In particular, depending on the inhibition regime the network can switch from motion integration to segmentation, thus being able to compute either a single pattern motion or to superpose multiple inputs as in motion transparency. We thus demonstrate that recurrent architectures can adaptively give rise to different cortical computational regimes depending upon the input statistics, from sensory flow integration to segmentation.
Affiliation(s)
- James Rankin
- College of Engineering, Mathematics and Physical Sciences, University of Exeter, Exeter, UK
- Center for Neural Science, New York University, New York, USA
- Andrew I Meso
- Institut de Neurosciences de la Timone, CNRS and Aix-Marseille Université, Marseille, France
- Psychology, Faculty of Science and Technology, Bournemouth University, Bournemouth, UK
- Pierre Kornprobst
- Université Côte d'Azur, Inria, Biovision team, Sophia Antipolis, France
- Guillaume S Masson
- Institut de Neurosciences de la Timone, CNRS and Aix-Marseille Université, Marseille, France
|
48
|
Prilop L, Gutschalk A. Auditory-cortex lesions impair contralateral tone-pattern detection under informational masking. Cortex 2017; 95:1-14. [PMID: 28806706 DOI: 10.1016/j.cortex.2017.07.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/09/2016] [Revised: 06/22/2017] [Accepted: 07/11/2017] [Indexed: 10/19/2022]
Abstract
Impaired hearing contralateral to unilateral auditory-cortex lesions is typically only observed under conditions of perceptual competition, such as dichotic presentation or speech in noise. It remains unclear, however, whether the source of this effect is direct competition in frequency-specific neurons, or whether enhanced processing load at more distant frequencies can also impair auditory detection. To evaluate this question, we studied a group of patients with unilateral auditory-cortex lesions (N = 14; six left-hemispheric, eight right-hemispheric; four females; age range 26-72 years) and a control group (N = 25; 15 females; age range 18-76 years) with a target-detection task in the presence of a multi-tone masker, which can produce informational masking. The results revealed reduced sensitivity for monaural target streams presented contralateral to auditory-cortex lesions, with an approximately 10% higher error rate in the contra-lesional ear. A general, bilateral reduction of target detection was observed only in a subgroup of patients, who were classified as additionally suffering from auditory neglect. These results demonstrate that auditory-cortex lesions impair monaural, contra-lesional target detection under informational masking. This finding supports the hypothesis that neural mechanisms beyond direct competition in frequency-specific neurons can be a source of impaired hearing under perceptual competition in patients with unilateral auditory-cortex lesions.
Affiliation(s)
- Lisa Prilop
- Department of Neurology, Ruprecht-Karls-Universität Heidelberg, Heidelberg, Germany
- Alexander Gutschalk
- Department of Neurology, Ruprecht-Karls-Universität Heidelberg, Heidelberg, Germany.
|
49
|
Snyder JS, Elhilali M. Recent advances in exploring the neural underpinnings of auditory scene perception. Ann N Y Acad Sci 2017; 1396:39-55. [PMID: 28199022 PMCID: PMC5446279 DOI: 10.1111/nyas.13317] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.1] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/30/2016] [Revised: 12/21/2016] [Accepted: 01/08/2017] [Indexed: 11/29/2022]
Abstract
Studies of auditory scene analysis have traditionally relied on paradigms using artificial sounds, combined with conventional behavioral techniques, to elucidate how we perceptually segregate auditory objects or streams from each other. In the past few decades, however, there has been growing interest in uncovering the neural underpinnings of auditory segregation using human and animal neuroscience techniques, as well as computational modeling. This largely reflects the growth in the fields of cognitive neuroscience and computational neuroscience and has led to new theories of how the auditory system segregates sounds in complex arrays. The current review focuses on neural and computational studies of auditory scene perception published in the last few years. Following the progress that has been made in these studies, we describe (1) theoretical advances in our understanding of the most well-studied aspects of auditory scene perception, namely segregation of sequential patterns of sounds and concurrently presented sounds; (2) the diversification of topics and paradigms that have been investigated; and (3) how new neuroscience techniques (including invasive neurophysiology in awake humans, genotyping, and brain stimulation) have been used in this field.
Affiliation(s)
- Joel S. Snyder
- Department of Psychology, University of Nevada, Las Vegas, Las Vegas, Nevada
- Mounya Elhilali
- Department of Electrical and Computer Engineering, The Johns Hopkins University, Baltimore, Maryland
|
50
|
Rankin J, Osborn Popp PJ, Rinzel J. Stimulus Pauses and Perturbations Differentially Delay or Promote the Segregation of Auditory Objects: Psychoacoustics and Modeling. Front Neurosci 2017; 11:198. [PMID: 28473747 PMCID: PMC5397483 DOI: 10.3389/fnins.2017.00198] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/14/2016] [Accepted: 03/23/2017] [Indexed: 11/21/2022] Open
Abstract
Segregating distinct sound sources is fundamental for auditory perception, as in the cocktail party problem. In a process called the build-up of stream segregation, distinct sound sources that are perceptually integrated initially can be segregated into separate streams after several seconds. Previous research concluded that abrupt changes in the incoming sounds during build-up (for example, a step change in location, loudness, or timing) reset the percept to integrated. Following this reset, the multisecond build-up process begins again. Neurophysiological recordings in auditory cortex (A1) show fast (subsecond) adaptation, but unified mechanistic explanations for the bias toward integration, multisecond build-up, and resets remain elusive. Combining psychoacoustics and modeling, we show that initial unadapted A1 responses bias integration, that the slowness of build-up arises naturally from competition downstream, and that recovery of adaptation can explain resets. An early bias toward integrated perceptual interpretations arising from primary cortical stages that encode low-level features and feed into competition downstream could also explain similar phenomena in vision. Further, we report a previously overlooked class of perturbations that promote segregation rather than integration. Our results challenge current understanding of perturbation effects on the emergence of sound source segregation, leading to a new hypothesis for differential processing downstream of A1. Transient perturbations can momentarily redirect A1 responses as input to downstream competition units that favor segregation.
Affiliation(s)
- James Rankin
- Department of Mathematics, University of Exeter, Exeter, UK
- Center for Neural Science, New York University, New York, NY, USA
- John Rinzel
- Center for Neural Science, New York University, New York, NY, USA
- Courant Institute of Mathematical Sciences, New York, NY, USA
|