1. Englitz B, Akram S, Elhilali M, Shamma S. Decoding contextual influences on auditory perception from primary auditory cortex. bioRxiv 2024:2023.12.24.573229. PMID: 38187523; PMCID: PMC10769425; DOI: 10.1101/2023.12.24.573229.
Abstract
Perception can be highly dependent on stimulus context, but whether and how sensory areas encode the context remains uncertain. We used an ambiguous auditory stimulus - a tritone pair - to investigate the neural activity associated with a preceding contextual stimulus that strongly influenced the tritone pair's perception: either as an ascending or a descending step in pitch. We recorded single-unit responses from a population of auditory cortical cells in awake ferrets listening to the tritone pairs preceded by the contextual stimulus. We found that the responses adapt locally to the contextual stimulus, consistent with human MEG recordings from the auditory cortex under the same conditions. Decoding the population responses demonstrates that cells responding to pitch-class changes predict the context-sensitive percept of the tritone pairs well. Conversely, decoding the individual pitch-class representations and taking their distance in the circular Shepard tone space predicts the opposite of the percept. The various percepts can be readily captured and explained by a neural model of cortical activity based on populations of adapting, pitch-class and pitch-class-direction cells, aligned with the neurophysiological responses. Together, these decoding and model results suggest that contextual influences on perception may already be encoded at the level of the primary sensory cortices, reflecting basic neural response properties commonly found in these areas.
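The "distance in the circular Shepard tone space" that this decoding analysis refers to can be sketched in a few lines. The function name and the 12-class convention below are our illustration, not code from the paper:

```python
def circular_step(pc_from, pc_to, n_classes=12):
    """Signed step between pitch classes on the Shepard circle:
    positive = ascending, negative = descending (in semitones)."""
    d = (pc_to - pc_from) % n_classes      # 0 .. n_classes - 1
    return d - n_classes if d > n_classes // 2 else d

# C -> E is heard as +4 (up); C -> A as -3 (down).
# A tritone (6 semitones) is exactly ambiguous: both readings are
# equidistant on the circle, which is why a preceding context can
# tip the percept either way.
```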
2. Laback B, Tabuchi H, Kohlrausch A. Evidence for proactive and retroactive temporal pattern analysis in simultaneous masking. J Acoust Soc Am 2024; 155:3742-3759. PMID: 38856312; DOI: 10.1121/10.0026240.
Abstract
Amplitude modulation (AM) of a masker reduces its masking of a simultaneously presented unmodulated pure-tone target, an effect that likely involves dip listening. This study tested the idea that dip-listening efficiency depends on stimulus context, i.e., on the match in AM peakedness (AMP) between the masker and a precursor or postcursor stimulus, reflecting a form of temporal pattern analysis. Masked thresholds were measured in normal-hearing listeners using Schroeder-phase harmonic complexes as maskers, precursors, and postcursors. Experiment 1 showed threshold elevation (i.e., interference) when a flat cursor preceded or followed a peaked masker, suggesting both proactive and retroactive temporal pattern analysis. Threshold decline (facilitation) was observed when the masker AMP matched the precursor AMP, irrespective of the stimulus AMP, suggesting only proactive processing. Subsequent experiments showed that both interference and facilitation (1) remained robust when a temporal gap was inserted between masker and cursor, (2) disappeared when an F0 difference was introduced between masker and precursor, and (3) decreased when the presentation level was reduced. These results suggest an important role of envelope regularity in dip listening, especially when masker and cursor are F0-matched and therefore form one perceptual stream. The reported effects appear to represent a time-domain variant of comodulation masking release.
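Schroeder-phase complexes of the kind used here can be generated from a single phase formula. The sketch below (parameter names and values are ours, not the study's) shows how a scalar phase factor trades envelope peakedness against flatness:

```python
import math

def schroeder_complex(f0, n_harm, c, dur, fs=16000):
    """Equal-amplitude harmonic complex with phases
    phi_n = c * pi * n * (n + 1) / n_harm.
    c = 0 gives sine phase (a peaked envelope); c = +/-1 gives a
    maximally flat ("Schroeder") envelope."""
    samples = []
    for i in range(int(dur * fs)):
        t = i / fs
        s = sum(math.sin(2 * math.pi * n * f0 * t
                         + c * math.pi * n * (n + 1) / n_harm)
                for n in range(1, n_harm + 1))
        samples.append(s / n_harm)
    return samples

def crest_factor(x):
    """Peak-to-RMS ratio: a crude proxy for envelope peakedness (AMP)."""
    peak = max(abs(v) for v in x)
    rms = math.sqrt(sum(v * v for v in x) / len(x))
    return peak / rms

peaked = schroeder_complex(100, 20, 0.0, 0.05)   # sine phase: peaked
flat = schroeder_complex(100, 20, 1.0, 0.05)     # Schroeder phase: flat
```

The crest factor of the sine-phase complex is substantially higher than that of the Schroeder-phase complex, which is the peaked-versus-flat contrast the cursors and maskers manipulate.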
Affiliation(s)
- Bernhard Laback
- Austrian Academy of Sciences, Acoustics Research Institute, Wohllebengasse 12-14, 1040 Vienna, Austria
- Hisaaki Tabuchi
- Department of Psychology, University of Innsbruck, Universitätsstraße 15, 6020 Innsbruck, Austria
- Armin Kohlrausch
- Industrial Engineering & Innovation Sciences, Technische Universiteit Eindhoven, P.O. Box 513, 5600 MB Eindhoven, Netherlands
3. Lu K, Dutta K, Mohammed A, Elhilali M, Shamma S. Temporal-Coherence Induces Binding of Responses to Sound Sequences in Ferret Auditory Cortex. bioRxiv 2024:2024.05.21.595170. PMID: 38854125; PMCID: PMC11160575; DOI: 10.1101/2024.05.21.595170.
Abstract
Binding the attributes of a sensory source is necessary to perceive it as a unified entity, one that can be attended to and extracted from its surrounding scene. In auditory perception, this is the essence of the cocktail party problem, in which a listener segregates one speaker from a mixture of voices, or one musical stream from simultaneous others. It is postulated that coherence of the temporal modulations of a source's features is necessary to bind them. The focus of this study is the role of temporal coherence in binding and segregation, specifically as evidenced by the neural correlates of rapid plasticity that enhances cortical responses among synchronized neurons while suppressing them among asynchronized ones. In a first experiment, we find that attention to a sound sequence rapidly binds it to other coherent sequences while suppressing nearby incoherent sequences, thus enhancing the contrast between the two groups. In a second experiment, a sequence of synchronized multi-tone complexes, embedded in a background cloud of randomly dispersed, desynchronized tones, perceptually and neurally pops out after a fraction of a second, highlighting the binding among its coherent tones against the incoherent background. These findings demonstrate the role of temporal coherence in binding and segregation.
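The temporal-coherence principle at work here (elements whose envelopes co-vary get bound; the rest segregate) can be illustrated with a toy correlation measure. This is our sketch, not the analysis from the paper:

```python
import math, random

def pearson(x, y):
    """Correlation between two amplitude envelopes."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

# Envelope of an attended tone, a coherent partner, and a
# desynchronized background tone (hypothetical 4-Hz modulation):
rng = random.Random(0)
attended = [0.5 + 0.5 * math.sin(2 * math.pi * 4 * t / 100) for t in range(400)]
coherent = [0.8 * v for v in attended]           # synchronized: binds
background = [rng.random() for _ in range(400)]  # desynchronized: segregates

# A coherence threshold on the pairwise correlations would group the
# first pair and exclude the background tone.
```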
Affiliation(s)
- Kai Lu
- Emory University Medical School
- Kelsey Dutta
- Electrical and Computer Engineering Department & Institute for Systems Research, University of Maryland College Park
- Ali Mohammed
- Electrical and Computer Engineering Department & Institute for Systems Research, University of Maryland College Park
- Mounya Elhilali
- Electrical and Computer Engineering, The Johns Hopkins University
- Shihab Shamma
- Electrical and Computer Engineering Department & Institute for Systems Research, University of Maryland College Park
- Département d'études cognitives, École normale supérieure, PSL
4. Lieder I, Sulem A, Ahissar M. Frequency-specific contributions to auditory perceptual priors: Testing the predictive-coding hypothesis. iScience 2024; 27:108946. PMID: 38333707; PMCID: PMC10850758; DOI: 10.1016/j.isci.2024.108946.
Abstract
Perceptual priors formed by recent stimuli bias our immediate percept. These priors, which express our implicit expectations, affect both high- and low-level processing stages, yet the nature of the inter-level interaction is unknown. Do priors operate top-down, biasing low-level features toward recently experienced objects (the predictive-coding hypothesis), or are low-level biases bottom-up driven, formed by local memory circuits? To adjudicate between these options in auditory perception, we used the "missing fundamental" illusion, which enables the dissociation of low-level frequency components from the high-level pitch. Surprisingly, and in contrast to predictive coding, when the fundamental frequency was missing, pitch was contracted across timbre categories not toward the previously perceived high-level pitch, but toward the physically present frequency. This bottom-up contribution of low-level memory components to perceptual priors, operating independently of recent high-level percepts, may stabilize perceptual organization and underlie continuity between similar low-level features belonging to different object categories in the auditory modality.
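The dissociation that the missing-fundamental illusion affords (low-level spectrum versus high-level pitch) can be seen directly: a complex of upper harmonics implies an f0 at which there is no physical energy. The frequencies below are our illustration:

```python
from functools import reduce
from math import gcd

# Upper harmonics of a 100-Hz fundamental, with the fundamental removed:
components_hz = [400, 500, 600]

# The implied (perceived) pitch is the greatest common divisor of the
# component frequencies -- a frequency absent from the spectrum itself.
implied_f0 = reduce(gcd, components_hz)
```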
Affiliation(s)
- Itay Lieder
- The Edmond and Lily Safra Center for Brain Sciences, The Hebrew University of Jerusalem, Jerusalem 9190401, Israel
- Aviel Sulem
- The Edmond and Lily Safra Center for Brain Sciences, The Hebrew University of Jerusalem, Jerusalem 9190401, Israel
- Merav Ahissar
- The Edmond and Lily Safra Center for Brain Sciences, The Hebrew University of Jerusalem, Jerusalem 9190401, Israel
- Department of Psychology, The Hebrew University of Jerusalem, Jerusalem 9190401, Israel
5. Farhadi A, Jennings SG, Strickland EA, Carney LH. Subcortical auditory model including efferent dynamic gain control with inputs from cochlear nucleus and inferior colliculus. J Acoust Soc Am 2023; 154:3644-3659. PMID: 38051523; PMCID: PMC10836963; DOI: 10.1121/10.0022578.
Abstract
An auditory model has been developed with a time-varying gain-control signal based on the physiology of the efferent system and subcortical neural pathways. The medial olivocochlear (MOC) efferent stage of the model receives excitatory projections from fluctuation-sensitive model neurons of the inferior colliculus (IC) and wide-dynamic-range model neurons of the cochlear nucleus. The response of the model MOC stage dynamically controls cochlear gain via simulated outer hair cells. In response to amplitude-modulated (AM) noise, firing rates of most IC neurons with band-enhanced modulation transfer functions in awake rabbits increase over a time course consistent with the dynamics of MOC efferent feedback. These changes in the rates of IC neurons in awake rabbits were used to adjust the parameters of the efferent stage of the proposed model. In response to AM noise, the proposed model reproduced the increasing IC rate over time, whereas the model without the efferent system did not show this trend. The proposed model with efferent gain control provides a powerful tool for testing hypotheses and shedding light on mechanisms in hearing, specifically those involving the efferent system.
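The dynamic part of such an efferent loop can be caricatured as first-order feedback that pulls cochlear gain down in proportion to its input drive. The constants and names below are illustrative assumptions, not the model's published parameters:

```python
def moc_gain_track(drive, tau=0.2, dt=0.001, g_max=1.0, strength=0.5):
    """First-order MOC-like dynamic gain control: gain relaxes toward
    g_max / (1 + strength * drive) with time constant tau (seconds)."""
    g = g_max
    gains = []
    for d in drive:
        target = g_max / (1.0 + strength * d)
        g += (dt / tau) * (target - g)   # Euler step of the relaxation
        gains.append(g)
    return gains

# One second of constant drive: gain declines from 1.0 toward ~0.67.
# It is this slow gain reduction that lets downstream (IC) rates
# build up over the time course described above.
gains = moc_gain_track([1.0] * 1000)
```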
Affiliation(s)
- Afagh Farhadi
- Department of Electrical and Computer Engineering, University of Rochester, Rochester, New York 14642, USA
- Skyler G Jennings
- Department of Communication Sciences and Disorders, University of Utah, Salt Lake City, Utah 84112, USA
- Elizabeth A Strickland
- Department of Speech, Language, and Hearing Sciences, Purdue University, West Lafayette, Indiana 47907, USA
- Laurel H Carney
- Department of Biomedical Engineering, University of Rochester, Rochester, New York 14642, USA
6. Maes A, Barahona M, Clopath C. Long- and short-term history effects in a spiking network model of statistical learning. Sci Rep 2023; 13:12939. PMID: 37558704; PMCID: PMC10412617; DOI: 10.1038/s41598-023-39108-3.
Abstract
The statistical structure of the environment is often important when making decisions. There are multiple theories of how the brain represents statistical structure. One such theory states that neural activity spontaneously samples from probability distributions; in other words, the network spends more time in states that encode high-probability stimuli. Starting from the neural assembly, increasingly thought to be the building block of computation in the brain, we focus on how arbitrary prior knowledge about the external world can be both learned and spontaneously recollected. We present a model based on learning the inverse of the cumulative distribution function. Learning is entirely unsupervised, using biophysical neurons and biologically plausible learning rules. We show how this prior knowledge can then be accessed to compute expectations and signal surprise in downstream networks. Sensory history effects emerge from the model as a consequence of ongoing learning.
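The core computation (learning the inverse cumulative distribution function and sampling through it) can be sketched without any neurons at all. The lookup-table implementation below is our stand-in for the spiking network:

```python
import random

def learn_inverse_cdf(samples):
    """'Learn' the inverse CDF of observed stimuli as a sorted lookup
    table: icdf(u) returns (approximately) the u-th quantile."""
    table = sorted(samples)
    def icdf(u):                         # u in [0, 1)
        return table[int(u * len(table))]
    return icdf

rng = random.Random(1)
observed = [rng.gauss(0.0, 1.0) for _ in range(10000)]
icdf = learn_inverse_cdf(observed)

# Spontaneous recollection: feeding uniform noise through the learned
# inverse CDF makes the system spend more time in high-probability
# states, i.e. it samples from the learned distribution.
recalled = [icdf(rng.random()) for _ in range(10000)]
```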
Affiliation(s)
- Amadeus Maes
- Department of Neuroscience, Feinberg School of Medicine, Northwestern University, Chicago, USA.
- Department of Bioengineering, Imperial College London, London, UK.
- Claudia Clopath
- Department of Bioengineering, Imperial College London, London, UK
7. McPherson MJ, McDermott JH. Relative pitch representations and invariance to timbre. Cognition 2023; 232:105327. PMID: 36495710; PMCID: PMC10016107; DOI: 10.1016/j.cognition.2022.105327.
Abstract
Information in speech and music is often conveyed through changes in fundamental frequency (f0), perceived by humans as "relative pitch". Relative pitch judgments are complicated by two facts. First, sounds can simultaneously vary in timbre due to filtering imposed by a vocal tract or instrument body. Second, relative pitch can be extracted in two ways: by measuring changes in constituent frequency components from one sound to another, or by estimating the f0 of each sound and comparing the estimates. We examined the effects of timbral differences on relative pitch judgments, and whether any invariance to timbre depends on whether judgments are based on constituent frequencies or their f0. Listeners performed up/down and interval discrimination tasks with pairs of spoken vowels, instrument notes, or synthetic tones, synthesized to be either harmonic or inharmonic. Inharmonic sounds lack a well-defined f0, such that relative pitch must be extracted from changes in individual frequencies. Pitch judgments were less accurate when vowels/instruments were different compared to when they were the same, and were biased by the associated timbre differences. However, this bias was similar for harmonic and inharmonic sounds, and was observed even in conditions where judgments of harmonic sounds were based on f0 representations. Relative pitch judgments are thus not invariant to timbre, even when timbral variation is naturalistic, and when such judgments are based on representations of f0.
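The two extraction routes contrasted here can be written down directly. For a harmonic sound shifted as a whole the two routes agree exactly, while for inharmonic sounds only the component-based route remains defined. The numbers are our illustration:

```python
import math

def shift_from_components(freqs_a, freqs_b):
    """Relative pitch as the mean log-frequency change of matched
    spectral components (defined for inharmonic sounds too)."""
    return sum(math.log2(b / a) for a, b in zip(freqs_a, freqs_b)) / len(freqs_a)

def shift_from_f0(f0_a, f0_b):
    """Relative pitch as the change between f0 estimates
    (requires a well-defined f0, i.e. a harmonic sound)."""
    return math.log2(f0_b / f0_a)

tone_a = [200, 400, 600]   # harmonic, f0 = 200 Hz
tone_b = [220, 440, 660]   # the same sound shifted up by a factor 1.1
```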
Affiliation(s)
- Malinda J McPherson
- Department of Brain and Cognitive Sciences, MIT, Cambridge, MA 02139, United States of America; Program in Speech and Hearing Biosciences and Technology, Harvard University, Boston, MA 02115, United States of America; McGovern Institute for Brain Research, MIT, Cambridge, MA 02139, United States of America.
- Josh H McDermott
- Department of Brain and Cognitive Sciences, MIT, Cambridge, MA 02139, United States of America; Program in Speech and Hearing Biosciences and Technology, Harvard University, Boston, MA 02115, United States of America; McGovern Institute for Brain Research, MIT, Cambridge, MA 02139, United States of America; Center for Brains Minds and Machines, MIT, Cambridge, MA 02139, United States of America
8. Generalizing across tonal context, timbre, and octave in rapid absolute pitch training. Atten Percept Psychophys 2023; 85:525-542. PMID: 36690914; DOI: 10.3758/s13414-023-02653-0.
Abstract
Absolute pitch (AP) is the rare ability to name any musical note without the use of a reference note. Given that genuine AP representations are based on the identification of isolated notes by their tone chroma, they are considered to be invariant to (1) the surrounding tonal context, (2) changes in instrumental timbre, and (3) changes in octave register. However, there is considerable variability in the literature in how AP is trained and tested along these dimensions, making recent claims about AP learning difficult to assess. Here, we examined the effect of tonal context on participant success in a single-note identification training paradigm, including how learning generalized to an untrained instrument and octave. We found that participants were able to rapidly learn to distinguish C from other notes, with and without feedback and regardless of the tonal context in which C was presented. Participants were also able to partly generalize this skill to an untrained instrument. However, participants displayed the weakest generalization in recognizing C in a higher octave. The results indicate that participants were likely attending to pitch height in addition to pitch chroma, a conjecture supported by analyzing the pattern of response errors. These findings highlight the complex nature of note representation in AP, which requires note identification across contexts, going beyond the simple storage of a note's fundamental. The importance of standardized testing that spans both timbre and octave in assessing AP, along with implications for past literature and future work, is discussed.
9. Siedenburg K, Graves J, Pressnitzer D. A unitary model of auditory frequency change perception. PLoS Comput Biol 2023; 19:e1010307. PMID: 36634121; PMCID: PMC9876382; DOI: 10.1371/journal.pcbi.1010307.
Abstract
Changes in the frequency content of sounds over time are arguably the most basic form of information about the behavior of sound-emitting objects. In perceptual studies, such changes have mostly been investigated separately, as aspects of either pitch or timbre. Here, we propose a unitary account of "up" and "down" subjective judgments of frequency change, based on a model that combines auditory correlates of acoustic cues in a sound-specific and listener-specific manner. To do so, we introduce a generalized version of so-called Shepard tones, allowing symmetric manipulations of spectral information on a fine scale, usually associated with pitch (spectral fine structure, SFS), and on a coarse scale, usually associated with timbre (spectral envelope, SE). In a series of behavioral experiments, listeners reported "up" or "down" shifts across pairs of generalized Shepard tones that differed in SFS, in SE, or in both. We observed the classic properties of Shepard tones for either SFS or SE shifts: subjective judgments followed the smallest log-frequency change direction, with cases of ambiguity and circularity. Interestingly, when both SFS and SE changes were applied concurrently (synergistically or antagonistically), we observed a trade-off between cues. Listeners were encouraged to report when they perceived "both" directions of change concurrently, but this rarely happened, suggesting a unitary percept. A computational model could accurately fit the behavioral data by combining different cues reflecting frequency changes after auditory filtering. The model revealed that cue weighting depended on the nature of the sound: when presented with harmonic sounds, listeners put more weight on SFS-related cues, whereas inharmonic sounds led to more weight on SE-related cues. Moreover, these stimulus-based factors were modulated by inter-individual differences, revealing variability across listeners in the detailed recipe for "up" and "down" judgments. We argue that frequency changes are tracked perceptually via the adaptive combination of a diverse set of cues, in a manner similar to the derivation of other basic auditory dimensions such as spatial location.
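A generalized Shepard tone of the kind introduced here can be specified by octave-spaced components (the SFS) under a movable spectral envelope (the SE). The parameterization below is our sketch of that idea, not the authors' stimulus code; it shows how the two manipulations dissociate:

```python
import math

def shepard_components(sfs_shift, se_center, n_oct=8, base_hz=27.5, sigma=1.0):
    """Octave-spaced components. sfs_shift moves the component grid
    (in octaves, log2 units); se_center is the peak of a Gaussian
    spectral envelope in log2(Hz). Returns (freq_hz, amplitude) pairs."""
    comps = []
    for k in range(n_oct):
        logf = math.log2(base_hz) + k + sfs_shift                 # fine structure
        amp = math.exp(-0.5 * ((logf - se_center) / sigma) ** 2)  # envelope
        comps.append((2.0 ** logf, amp))
    return comps

center = math.log2(440.0)
ref = shepard_components(0.0, center)
sfs_up = shepard_components(0.1, center)        # SFS shift: frequencies move
se_up = shepard_components(0.0, center + 0.1)   # SE shift: only amplitudes move
```

An SFS shift moves every component frequency by the same log-frequency step while an SE shift leaves the frequencies untouched and only redistributes their amplitudes, which is what allows the two cues to be pitted against each other.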
Affiliation(s)
- Kai Siedenburg
- Carl von Ossietzky University of Oldenburg, Dept. of Medical Physics and Acoustics, Oldenburg, Germany
- Jackson Graves
- Laboratoire des systèmes perceptifs, Dépt. d'études cognitives, École normale supérieure, PSL University, CNRS, Paris, France
- Daniel Pressnitzer
- Laboratoire des systèmes perceptifs, Dépt. d'études cognitives, École normale supérieure, PSL University, CNRS, Paris, France
10. Stein J, von Kriegstein K, Tabas A. Predictive encoding of pure tones and FM-sweeps in the human auditory cortex. Cereb Cortex Commun 2022; 3:tgac047. PMID: 36545253; PMCID: PMC9764222; DOI: 10.1093/texcom/tgac047.
Abstract
Expectations substantially influence perception, but the neural mechanisms underlying this influence are not fully understood. A prominent view is that sensory neurons encode prediction error with respect to expectations about upcoming sensory input. Although the encoding of prediction error has previously been demonstrated in the human auditory cortex (AC), earlier studies often induced expectations using stimulus repetition, potentially confounding prediction error with neural habituation. These studies also measured AC as a single population, failing to consider possible predictive specializations of different AC fields. Moreover, the few studies that considered prediction error to stimuli other than pure tones yielded conflicting results. Here, we used functional magnetic resonance imaging (fMRI) to systematically investigate prediction error to subjective expectations in auditory cortical fields Te1.0, Te1.1, Te1.2, and Te3, and for two types of stimuli: pure tones and frequency-modulated (FM) sweeps. Our results show that prediction error is elicited with respect to the participants' expectations independently of stimulus repetition and is expressed similarly across auditory fields. Moreover, despite the radically different strategies underlying the decoding of pure tones and FM sweeps, both stimulus types were encoded as prediction error in most fields of AC. Altogether, our results provide unequivocal evidence that predictive coding is the general encoding mechanism in AC.
Affiliation(s)
- Katharina von Kriegstein
- Chair of Cognitive and Clinical Neuroscience, Faculty of Psychology, Technical University Dresden, Bamberger Str. 7, Dresden 01187, Germany
- Alejandro Tabas
- Chair of Cognitive and Clinical Neuroscience, Faculty of Psychology, Technical University Dresden, Bamberger Str. 7, Dresden 01187, Germany
11. Tardiff N, Suriya-Arunroj L, Cohen YE, Gold JI. Rule-based and stimulus-based cues bias auditory decisions via different computational and physiological mechanisms. PLoS Comput Biol 2022; 18:e1010601. PMID: 36206302; PMCID: PMC9581427; DOI: 10.1371/journal.pcbi.1010601.
Abstract
Expectations, such as those arising from learned rules or recent stimulus regularities, can bias subsequent auditory perception in diverse ways. However, it is not well understood whether and how these diverse effects depend on the source of the expectations. Further, it is unknown whether different sources of bias use the same or different computational and physiological mechanisms. We examined how rule-based and stimulus-based expectations influenced the behavior and pupil-linked arousal (a marker of certain forms of expectation-based processing) of human subjects performing an auditory frequency-discrimination task. Rule-based cues consistently biased choices and response times (RTs) toward the more probable stimulus. In contrast, stimulus-based cues had a complex combination of effects, including choice and RT biases both toward and away from the frequency of recently presented stimuli. These different behavioral patterns also had: (1) distinct computational signatures, including different modulations of key components of a novel form of drift-diffusion decision model; and (2) distinct physiological signatures, including substantial bias-dependent modulations of pupil size in response to rule-based but not stimulus-based cues. These results imply that different sources of expectations can modulate auditory processing via distinct mechanisms: one that uses arousal-linked, rule-based information and another that uses arousal-independent, stimulus-based information to bias the speed and accuracy of auditory perceptual decisions.

Prior information about upcoming stimuli can bias our perception of those stimuli, but whether different sources of prior information bias perception in similar or distinct ways is not well understood. We compared the influence of two kinds of prior information on tone-frequency discrimination: rule-based cues, in the form of explicit information about the most likely identity of the upcoming tone; and stimulus-based cues, in the form of sequences of tones presented before the to-be-discriminated tone. Although both types of prior information biased auditory decision-making, they showed distinct behavioral, computational, and physiological signatures. Our results suggest that the brain processes prior information in a form-specific manner rather than using a general-purpose prior. Such form-specific processing has implications for understanding decision biases in real-world contexts, in which prior information comes from many different sources and modalities.
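The flavor of the drift-diffusion framework used here can be conveyed with a generic simulation (a standard DDM sketch, not the authors' novel variant): a rule-based cue is modeled as a starting-point bias toward the cued choice, which shifts choices even with zero evidence.

```python
import math, random

def ddm_trial(drift, start_bias=0.0, bound=1.0, dt=0.01, noise=1.0, rng=None):
    """Single diffusion-to-bound trial. start_bias > 0 shifts the
    starting point toward the upper (cued) bound.
    Returns (choice, response_time)."""
    rng = rng or random.Random()
    x, t = start_bias, 0.0
    while abs(x) < bound:
        x += drift * dt + noise * rng.gauss(0.0, math.sqrt(dt))
        t += dt
    return (1 if x > 0 else 0), t

rng = random.Random(42)
unbiased = [ddm_trial(0.0, 0.0, rng=rng)[0] for _ in range(500)]
cued = [ddm_trial(0.0, 0.5, rng=rng)[0] for _ in range(500)]
# With zero drift, the biased starting point alone pushes choices
# toward the cued alternative (expected upper-choice rate ~0.75 vs ~0.5).
```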
Affiliation(s)
- Nathan Tardiff
- Department of Otorhinolaryngology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Lalitta Suriya-Arunroj
- Department of Otorhinolaryngology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Yale E. Cohen
- Department of Otorhinolaryngology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Joshua I. Gold
- Department of Neuroscience, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
12.
Abstract
Perception adapts to the properties of prior stimulation, as illustrated by phenomena such as visual color constancy or speech context effects. In the auditory domain, little is known about adaptive processes for the attribute of auditory brightness. Here, we report an experiment that tests whether listeners adapt to spectral colorations imposed on naturalistic music and speech excerpts. Our results indicate consistent contrastive adaptation of auditory brightness judgments on a trial-by-trial basis. The pattern of results suggests that these effects tend to grow with the duration of the adaptor context but level off after around 8 trials of 2 s duration. A simple model of the response criterion yields a correlation of r = .97 with the measured data and corroborates the notion that brightness perception adapts on timescales that fall within the range of auditory short-term memory. Effects turn out to be similar for spectral filtering based on linear spectral filter slopes and for filtering based on a measured transfer function from a commercially available hearing device. Overall, our findings demonstrate the adaptivity of auditory brightness perception under realistic acoustical conditions.
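A response-criterion model in the spirit of the one described (an exponentially weighted running average of recent adaptor brightness, with judgments made contrastively against it) can be sketched as follows. The weighting constant and names are our assumptions, not the paper's fitted values:

```python
def adapted_criterion(history, decay=0.8):
    """Criterion = exponentially weighted mean of recent adaptor
    levels, with the most recent trial weighted highest."""
    num, den = 0.0, 0.0
    for lag, level in enumerate(reversed(history)):
        w = decay ** lag
        num += w * level
        den += w
    return num / den if den else 0.0

def judged_bright(stimulus, history):
    """Contrastive judgment: 'bright' only relative to the criterion."""
    return stimulus > adapted_criterion(history)

# The same mid-level test stimulus flips category with the context:
dull_context = [0.2] * 8     # eight trials of dull (dark) adaptors
bright_context = [0.8] * 8   # eight trials of bright adaptors
```

With a decay of 0.8, trials more than about eight steps back contribute little weight, which is one way a criterion model of this form could level off after roughly eight adaptor trials.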
Affiliation(s)
- Kai Siedenburg
- Department of Medical Physics and Acoustics, Carl von Ossietzky University of Oldenburg, Oldenburg, Germany.
- Feline Malin Barg
- Department of Medical Physics and Acoustics, Carl von Ossietzky University of Oldenburg, Oldenburg, Germany
- Henning Schepker
- Department of Medical Physics and Acoustics, Carl von Ossietzky University of Oldenburg, Oldenburg, Germany
- Starkey Hearing, Eden Prairie, MN, USA
13. Kornmeier J, Bhatia K, Joos E. Top-down resolution of visual ambiguity - knowledge from the future or footprints from the past? PLoS One 2021; 16:e0258667. PMID: 34673791; PMCID: PMC8530352; DOI: 10.1371/journal.pone.0258667.
Abstract
Current theories of visual perception assume that our perceptual system weights the a priori incomplete, noisy, and ambiguous sensory information with previous, memorized perceptual experiences in order to construct stable and reliable percepts. These theories are supported by numerous experimental findings. Theories about precognition take the opposite point of view: they assume that information from the future can influence perception, thoughts, and behavior. Several experimental studies provide evidence for precognition effects; other studies found no such effects. One problem may be that the vast majority of precognition paradigms did not systematically control for potential effects of perceptual history. In the present study, we presented ambiguous Necker cube stimuli and disambiguated cube variants and systematically tested, in two separate experiments, whether perception of a currently observed ambiguous Necker cube stimulus can be influenced by a disambiguated cube variant presented in the immediate perceptual past (perceptual history effects) and/or in the immediate perceptual future (precognition effects). We found perceptual history effects, which partly depended on the length of the perceptual history trace but were independent of the perceptual future. Results from some individual participants suggest at first glance a precognition pattern, but results from our second experiment make a perceptual-history explanation more probable. At the group level, no precognition effects were statistically indicated. The perceptual history effects found in the present study are in line with related studies in the literature. The precognition analysis revealed some interesting individual patterns, which, however, did not allow general conclusions. Overall, the present study demonstrates that any future experiment on sensory or extrasensory perception urgently needs to control for potential perceptual history effects, and that temporal aspects of stimulus presentation are of high relevance.
Affiliation(s)
- Jürgen Kornmeier
- Institute for Frontier Areas of Psychology and Mental Health, Freiburg, Germany
- Department of Psychiatry and Psychotherapy, Medical Center, University of Freiburg, Freiburg, Germany
- Faculty of Medicine, Freiburg, Germany
- Kriti Bhatia
- Experimental Cognitive Science, Eberhard Karls University Tübingen, Tübingen, Germany
- Ellen Joos
- INSERM U1114, Cognitive Neuropsychology and Pathophysiology of Schizophrenia, University of Strasbourg, Strasbourg, France
|
14
|
Tavoni G, Kersen DEC, Balasubramanian V. Cortical feedback and gating in odor discrimination and generalization. PLoS Comput Biol 2021; 17:e1009479. [PMID: 34634035 PMCID: PMC8530364 DOI: 10.1371/journal.pcbi.1009479] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/03/2021] [Revised: 10/21/2021] [Accepted: 09/24/2021] [Indexed: 11/30/2022] Open
Abstract
A central question in neuroscience is how context changes perception. In the olfactory system, for example, experiments show that task demands can drive divergence and convergence of cortical odor responses, likely underpinning olfactory discrimination and generalization. Here, we propose a simple statistical mechanism for this effect based on unstructured feedback from the central brain to the olfactory bulb, which represents the context associated with an odor, and sufficiently selective cortical gating of sensory inputs. Strikingly, the model predicts that both convergence and divergence of cortical odor patterns should increase when odors are initially more similar, an effect reported in recent experiments. The theory in turn predicts reversals of these trends following experimental manipulations and in neurological conditions that increase cortical excitability.
Affiliation(s)
- Gaia Tavoni
- Computational Neuroscience Initiative, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Department of Physics and Astronomy, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Department of Neuroscience, Washington University in St. Louis, St. Louis, Missouri, United States of America
- David E. Chen Kersen
- Computational Neuroscience Initiative, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Department of Bioengineering, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Vijay Balasubramanian
- Computational Neuroscience Initiative, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Department of Physics and Astronomy, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Department of Bioengineering, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Department of Neuroscience, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
|
15
|
Mugruza-Vassallo CA, Potter DD, Tsiora S, Macfarlane JA, Maxwell A. Prior context influences motor brain areas in an auditory oddball task and prefrontal cortex multitasking modelling. Brain Inform 2021; 8:5. [PMID: 33745089 PMCID: PMC7982371 DOI: 10.1186/s40708-021-00124-6] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/31/2020] [Accepted: 12/21/2020] [Indexed: 11/19/2022] Open
Abstract
In this study, the relationship of orienting of attention, motor control and the Stimulus-Driven (SDN) and Goal-Driven Networks (GDN) was explored through an innovative method for fMRI analysis considering all voxels in four experimental conditions: standard target (Goal; G), novel (N), neutral (Z) and noisy target (NG). First, average reaction times (RTs) for each condition were calculated. In the second-level analysis, 'distracted' participants, as indicated by slower RTs, evoked brain activations and differences in both hemispheres' neural networks for selective attention, while the participants as a whole demonstrated mainly left cortical and subcortical activations. A context analysis was run in the behaviourally distracted participant group, contrasting the trials immediately prior to the G trials, namely one of the Z, N or NG conditions (i.e. Z.G, N.G, NG.G). Results showed different prefrontal activations dependent on prior context in the auditory modality, recruiting between 1 and 10 prefrontal areas. The stronger the motor response and the influence of the previous novel stimulus, the more prefrontal areas were engaged, which extends the findings of hierarchical studies of prefrontal control of attention and better explains how auditory processing interferes with movement. The current study also addressed how subcortical loops and models of the previous motor response affected signal processing of the novel stimulus when it was presented laterally or simultaneously with the target. This multitasking model could enhance our understanding of how an auditory stimulus affects motor responses in a self-induced way, by taking prior context into account, as demonstrated in the standard condition and supported by Pulvinar activations complementing visual findings. Moreover, current brain-computer interface (BCI) work addresses some multimodal stimulus-driven systems.
Affiliation(s)
- Carlos A Mugruza-Vassallo
- Grupo de Investigación de Computación Y Neurociencia Cognitiva, Facultad de Ingeniería Y Gestión, Universidad Nacional Tecnológica de Lima Sur - UNTELS, Lima, Perú.
- Douglas D Potter
- Neuroscience and Development Group, Arts and Science, University of Dundee, Dundee, UK
- Stamatina Tsiora
- School of Psychology, University of Lincoln, Lincoln, United Kingdom
- Adele Maxwell
- Neuroscience and Development Group, Arts and Science, University of Dundee, Dundee, UK
|
16
|
Richards VM, Tisby MK, Suzuki-Gill EN, Shen Y. Sub-optimal construction of an auditory profile from temporally distributed spectral information. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2021; 149:1567. [PMID: 33765831 PMCID: PMC7943247 DOI: 10.1121/10.0003646] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 11/13/2020] [Revised: 02/12/2021] [Accepted: 02/16/2021] [Indexed: 06/12/2023]
Abstract
When spectral components of a complex sound are presented not simultaneously but distributed over time, human listeners can still, to a degree, perceptually recover the spectral profile of the sound. This capability of integrating spectral information over time was investigated using a cued informational masking paradigm. Listeners detected a 1-kHz pure tone in a simultaneous masker composed of six tones whose frequencies were drawn at random on every trial. The spectral profile of the masker was cued using a precursor sound that consisted of a sequence of 50-ms bursts separated by inter-burst intervals of 100 ms. Each burst in the precursor consisted of pure tones at the masker frequencies, with each masker frequency appearing with a condition-specific presentation probability. As the presentation probability increased across conditions, the detectability of the target improved, indicating reliable precursor cuing regarding the spectral content of the masker. For many listeners, performance did not significantly improve as the number of precursor bursts increased from 2 to 16, indicating inefficient integration of information beyond 2 bursts. Additional analyses suggest that when the intensity of the bursts is relatively constant, the contribution of the precursor is dominated by information in the initial burst.
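The precursor construction described above (50-ms bursts, 100-ms gaps, tones included per burst with some presentation probability) can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the sample rate, unit amplitudes, and the function name `make_precursor` are our own assumptions.

```python
import math
import random

def make_precursor(masker_freqs, p, n_bursts, sr=16000,
                   burst_ms=50, gap_ms=100, rng=None):
    """Sketch of the precursor cue: a sequence of 50-ms tone bursts,
    each containing a pure tone at each masker frequency with
    probability p, separated by 100-ms silent gaps."""
    rng = rng or random.Random(0)
    burst_n = int(sr * burst_ms / 1000)          # samples per burst
    gap = [0.0] * int(sr * gap_ms / 1000)        # silent inter-burst gap
    out = []
    for _ in range(n_bursts):
        # independently include each masker frequency with probability p
        present = [f for f in masker_freqs if rng.random() < p]
        burst = [sum(math.sin(2 * math.pi * f * t / sr) for f in present)
                 for t in range(burst_n)]
        out.extend(burst + gap)
    return out
```

With p = 1 every burst cues the full masker spectrum; with p near 0 the precursor is nearly uninformative, mirroring the probability manipulation in the study.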
Affiliation(s)
- Virginia M Richards
- Department of Cognitive Sciences, University of California, Irvine, California 92687, USA
- Mariel Kazuko Tisby
- Department of Cognitive Sciences, University of California, Irvine, California 92687, USA
- Eli N Suzuki-Gill
- Department of Cognitive Sciences, University of California, Irvine, California 92687, USA
- Yi Shen
- Department of Speech and Hearing Sciences, University of Washington, Seattle, Washington 98105, USA
|
17
|
Tabas A, von Kriegstein K. Neural modelling of the encoding of fast frequency modulation. PLoS Comput Biol 2021; 17:e1008787. [PMID: 33657098 PMCID: PMC7959405 DOI: 10.1371/journal.pcbi.1008787] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/12/2020] [Revised: 03/15/2021] [Accepted: 02/12/2021] [Indexed: 11/19/2022] Open
Abstract
Frequency modulation (FM) is a basic constituent of vocalisation in many animals as well as in humans. In human speech, short rising and falling FM-sweeps of around 50 ms duration, called formant transitions, characterise individual speech sounds. There are two representations of FM in the ascending auditory pathway: a spectral representation, holding the instantaneous frequency of the stimuli; and a sweep representation, consisting of neurons that respond selectively to FM direction. To date, computational models have used feedforward mechanisms to explain FM encoding. However, from neuroanatomy we know that there are massive feedback projections in the auditory pathway. Here, we found that a classical FM-sweep perceptual effect, the sweep pitch shift, cannot be explained by standard feedforward processing models. We hypothesised that the sweep pitch shift is caused by a predictive feedback mechanism. To test this hypothesis, we developed a novel model of FM encoding incorporating a predictive interaction between the sweep and the spectral representation. The model was designed to encode sweeps of the duration, modulation rate, and modulation shape of formant transitions. It fully accounted for experimental data that we acquired in a perceptual experiment with human participants, as well as for previously published experimental results. We also designed a new class of stimuli for a second perceptual experiment to further validate the model. Combined, our results indicate that predictive interaction between the frequency-encoding and direction-encoding neural representations plays an important role in the neural processing of FM. In the brain, this mechanism is likely to occur at early stages of the processing hierarchy.
Affiliation(s)
- Alejandro Tabas
- Chair of Cognitive and Clinical Neuroscience, Faculty of Psychology, Technische Universität Dresden, Dresden, Saxony, Germany
- Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Saxony, Germany
- Katharina von Kriegstein
- Chair of Cognitive and Clinical Neuroscience, Faculty of Psychology, Technische Universität Dresden, Dresden, Saxony, Germany
- Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Saxony, Germany
|
18
|
Ho HT, Burr DC, Alais D, Morrone MC. Propagation and update of auditory perceptual priors through alpha and theta rhythms. Eur J Neurosci 2021; 55:3083-3099. [PMID: 33559266 PMCID: PMC9543013 DOI: 10.1111/ejn.15141] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/11/2020] [Revised: 01/05/2021] [Accepted: 01/28/2021] [Indexed: 12/15/2022]
Abstract
To maintain a continuous and coherent percept over time, the brain makes use of past sensory information to anticipate forthcoming stimuli. We recently showed that auditory experience of the immediate past is propagated through ear-specific reverberations, manifested as rhythmic fluctuations of decision bias at alpha frequencies. Here, we apply the same time-resolved behavioural method to investigate how perceptual performance changes over time under conditions of stimulus expectation and to examine the effect of unexpected events on behaviour. As in our previous study, participants were required to discriminate the ear-of-origin of a brief monaural pure tone embedded in uncorrelated dichotic white noise. We manipulated stimulus expectation by increasing the target probability in one ear to 80%. Consistent with our earlier findings, performance did not remain constant across trials, but varied rhythmically with delay from noise onset. Specifically, decision bias showed a similar oscillation at ~9 Hz, which depended on ear congruency between successive targets. This suggests that rhythmic communication of auditory perceptual history occurs early and is not readily influenced by top-down expectations. In addition, we report a novel observation specific to infrequent, unexpected stimuli: oscillations in accuracy at ~7.6 Hz emerged one trial after a target occurred in the non-anticipated ear. This new behavioural oscillation may reflect a mechanism for updating the sensory representation once a prediction error has been detected.
Affiliation(s)
- Hao Tam Ho
- School of Psychology, University of Sydney, Camperdown, NSW, Australia
- Department of Neuroscience, Psychology, Pharmacology, and Child Health, University of Florence, Florence, Italy
- David C Burr
- School of Psychology, University of Sydney, Camperdown, NSW, Australia
- Department of Neuroscience, Psychology, Pharmacology, and Child Health, University of Florence, Florence, Italy
- Institute of Neuroscience, Pisa, Italy
- David Alais
- School of Psychology, University of Sydney, Camperdown, NSW, Australia
- Maria Concetta Morrone
- Department of Translational Research on New Technologies in Medicine and Surgery, University of Pisa, Pisa, Italy
|
19
|
Using the perceptual past to predict the perceptual future influences the perceived present - A novel ERP paradigm. PLoS One 2020; 15:e0237663. [PMID: 32870908 PMCID: PMC7462302 DOI: 10.1371/journal.pone.0237663] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/06/2020] [Accepted: 07/30/2020] [Indexed: 11/19/2022] Open
Abstract
The information available through our senses is noisy, incomplete, and to varying degrees ambiguous. The perceptual system must create stable and reliable percepts out of this restricted information. It solves this perceptual inference problem by integrating memories of previous percepts and making predictions about the perceptual future. Using ambiguous figures and a new experimental approach, we studied whether generating predictions based on regularities in the past affects processing of the present and how this is done. Event-related potentials (ERPs) were measured to investigate whether a highly regular temporal context of either ambiguous or unambiguous stimulus variants differently affects processing of a current stimulus and/or task execution. Further, we tested whether symbolic announcements about the immediate perceptual future can replace the past experience of regularities as a source for making predictions. Both ERP and reaction time varied as a function of stimulus ambiguity in the temporal context of a present stimulus. No such effects were found with symbolic announcements. Our results indicate that predictions about the future automatically alter processing of the present, even if the predictions are irrelevant for the present percept and task. However, direct experiences of past regularities are necessary for predicting the future whereas symbolic information about the future is not sufficient.
|
20
|
Lenc T, Keller PE, Varlet M, Nozaradan S. Neural and Behavioral Evidence for Frequency-Selective Context Effects in Rhythm Processing in Humans. Cereb Cortex Commun 2020; 1:tgaa037. [PMID: 34296106 PMCID: PMC8152888 DOI: 10.1093/texcom/tgaa037] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/30/2020] [Revised: 06/30/2020] [Accepted: 07/16/2020] [Indexed: 01/17/2023] Open
Abstract
When listening to music, people often perceive and move along with a periodic meter. However, the dynamics of mapping between meter perception and the acoustic cues to meter periodicities in the sensory input remain largely unknown. To capture these dynamics, we recorded electroencephalography (EEG) while nonmusician and musician participants listened to nonrepeating rhythmic sequences, where acoustic cues to meter frequencies either gradually decreased (from regular to degraded) or increased (from degraded to regular). The results revealed greater neural activity selectively elicited at meter frequencies when the sequence gradually changed from regular to degraded compared with the opposite. Importantly, this effect was unlikely to arise from overall gain or low-level auditory processing, as revealed by physiological modeling. Moreover, the context effect was more pronounced in nonmusicians, who also demonstrated facilitated sensory-motor synchronization with the meter for sequences that started as regular. In contrast, musicians showed weaker effects of recent context in their neural responses and a robust ability to move along with the meter irrespective of stimulus degradation. Together, our results demonstrate that brain activity elicited by rhythm reflects not only passive tracking of stimulus features but also continuous integration of sensory input with recent context.
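Neural activity "elicited at meter frequencies" is typically quantified as the magnitude of the EEG spectrum at those specific frequencies (frequency tagging). A minimal single-bin sketch of that measure, assuming the frequency of interest falls exactly on a DFT bin; the function name and parameters are illustrative, not taken from the paper:

```python
import math

def amplitude_at(signal, freq_hz, sr):
    """Amplitude of `signal` at `freq_hz`, computed as the magnitude of
    a single DFT coefficient scaled so a pure sinusoid of amplitude A
    returns A. Assumes freq_hz * len(signal) / sr is an integer
    (i.e., the frequency falls on a DFT bin)."""
    n = len(signal)
    # real and imaginary parts of the DFT coefficient at freq_hz
    re = sum(x * math.cos(2 * math.pi * freq_hz * t / sr)
             for t, x in enumerate(signal))
    im = -sum(x * math.sin(2 * math.pi * freq_hz * t / sr)
              for t, x in enumerate(signal))
    return 2.0 * math.sqrt(re * re + im * im) / n
```

Comparing this amplitude at meter-related versus meter-unrelated frequencies, across conditions, is the schematic logic behind the selectivity analysis.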
Affiliation(s)
- Tomas Lenc
- MARCS Institute for Brain, Behaviour, and Development, Western Sydney University, Penrith, Sydney, NSW 2751, Australia
- Peter E Keller
- MARCS Institute for Brain, Behaviour, and Development, Western Sydney University, Penrith, Sydney, NSW 2751, Australia
- Manuel Varlet
- MARCS Institute for Brain, Behaviour, and Development, Western Sydney University, Penrith, Sydney, NSW 2751, Australia
- School of Psychology, Western Sydney University, Penrith, Sydney, NSW 2751, Australia
- Sylvie Nozaradan
- MARCS Institute for Brain, Behaviour, and Development, Western Sydney University, Penrith, Sydney, NSW 2751, Australia
- Institute of Neuroscience (IONS), Université Catholique de Louvain (UCL), Brussels 1200, Belgium
- International Laboratory for Brain, Music and Sound Research (BRAMS), Montreal QC H3C 3J7, Canada
|
21
|
Wang Y, Zhang J, Zou J, Luo H, Ding N. Prior Knowledge Guides Speech Segregation in Human Auditory Cortex. Cereb Cortex 2020; 29:1561-1571. [PMID: 29788144 DOI: 10.1093/cercor/bhy052] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/02/2017] [Revised: 01/22/2018] [Accepted: 02/15/2018] [Indexed: 11/12/2022] Open
Abstract
Segregating concurrent sound streams is a computationally challenging task that requires integrating bottom-up acoustic cues (e.g. pitch) and top-down prior knowledge about sound streams. In a multi-talker environment, the brain can segregate different speakers in about 100 ms in auditory cortex. Here, we used magnetoencephalographic (MEG) recordings to investigate the temporal and spatial signature of how the brain utilizes prior knowledge to segregate 2 speech streams from the same speaker, which can hardly be separated based on bottom-up acoustic cues. In a primed condition, the participants know the target speech stream in advance, while in an unprimed condition no such prior knowledge is available. Neural encoding of each speech stream is characterized by the MEG responses tracking the speech envelope. We demonstrate that this envelope-tracking effect in bilateral superior temporal gyrus and superior temporal sulcus is much stronger in the primed condition than in the unprimed condition. Priming effects are observed at about 100 ms latency and last more than 600 ms. Interestingly, prior knowledge about the target stream facilitates speech segregation mainly by suppressing the neural tracking of the non-target speech stream. In sum, prior knowledge leads to reliable speech segregation in auditory cortex, even in the absence of reliable bottom-up speech segregation cues.
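Envelope tracking of the kind used here is commonly quantified by correlating the neural response with the stimulus envelope across a range of latencies. A schematic, pure-Python sketch of that cross-correlation measure (our own simplification for illustration, not the authors' analysis pipeline):

```python
import math

def xcorr_at_lags(env, resp, lags):
    """Pearson correlation between a stimulus envelope `env` and a
    neural response `resp` at several positive lags (in samples):
    the response at time t + lag is compared with the envelope at t."""
    def pearson(a, b):
        n = min(len(a), len(b))
        a, b = a[:n], b[:n]
        ma, mb = sum(a) / n, sum(b) / n
        cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
        va = math.sqrt(sum((x - ma) ** 2 for x in a))
        vb = math.sqrt(sum((y - mb) ** 2 for y in b))
        return cov / (va * vb)
    # correlate envelope against the lag-shifted response
    return {lag: pearson(env[:len(env) - lag], resp[lag:]) for lag in lags}
```

The lag with the strongest correlation is a crude stand-in for the response latency (about 100 ms in the study); comparing the correlation for target versus non-target envelopes mirrors the suppression analysis in spirit.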
Affiliation(s)
- Yuanye Wang
- School of Psychological and Cognitive Sciences, Peking University, Beijing, China
- McGovern Institute for Brain Research, Peking University, Beijing, China
- Beijing Key Laboratory of Behavior and Mental Health, Peking University, Beijing, China
- Jianfeng Zhang
- College of Biomedical Engineering and Instrument Sciences, Zhejiang University, Hangzhou, Zhejiang, China
- Jiajie Zou
- College of Biomedical Engineering and Instrument Sciences, Zhejiang University, Hangzhou, Zhejiang, China
- Huan Luo
- School of Psychological and Cognitive Sciences, Peking University, Beijing, China
- McGovern Institute for Brain Research, Peking University, Beijing, China
- Beijing Key Laboratory of Behavior and Mental Health, Peking University, Beijing, China
- Nai Ding
- College of Biomedical Engineering and Instrument Sciences, Zhejiang University, Hangzhou, Zhejiang, China
- Key Laboratory for Biomedical Engineering of Ministry of Education, Zhejiang University, Hangzhou, Zhejiang, China
- State Key Laboratory of Industrial Control Technology, Zhejiang University, Hangzhou, Zhejiang, China
- Interdisciplinary Center for Social Sciences, Zhejiang University, Hangzhou, Zhejiang, China
|
22
|
Little DF, Snyder JS, Elhilali M. Ensemble modeling of auditory streaming reveals potential sources of bistability across the perceptual hierarchy. PLoS Comput Biol 2020; 16:e1007746. [PMID: 32275706 PMCID: PMC7185718 DOI: 10.1371/journal.pcbi.1007746] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/10/2019] [Revised: 04/27/2020] [Accepted: 02/25/2020] [Indexed: 11/19/2022] Open
Abstract
Perceptual bistability, the spontaneous, irregular fluctuation of perception between two interpretations of a stimulus, occurs when observing a large variety of ambiguous stimulus configurations. This phenomenon has the potential to serve as a tool for, among other things, understanding how function varies across individuals, given the large individual differences that manifest during perceptual bistability. Yet it remains difficult to interpret the functional processes at work without knowing where bistability arises during perception. In this study we explore the hypothesis that bistability originates from multiple sources distributed across the perceptual hierarchy. We develop a hierarchical model of auditory processing comprising three distinct levels: a Peripheral, tonotopic analysis; a Central analysis computing features found more centrally in the auditory system; and an Object analysis, where sounds are segmented into different streams. We model bistable perception within this system by applying adaptation, inhibition and noise to one or all of the three levels of the hierarchy. We evaluate a large ensemble of variations of this hierarchical model, where each model has a different configuration of adaptation, inhibition and noise. This approach avoids the assumption that a single configuration must be invoked to explain the data. Each model is evaluated on its ability to replicate two hallmarks of bistability during auditory streaming: the selectivity of bistability to specific stimulus configurations, and the characteristic log-normal pattern of perceptual switches. Consistent with a distributed origin, a broad range of model parameters across this hierarchy lead to a plausible form of perceptual bistability.
Affiliation(s)
- David F. Little
- Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, Maryland, United States of America
- Joel S. Snyder
- Department of Psychology, University of Nevada, Las Vegas; Las Vegas, Nevada, United States of America
- Mounya Elhilali
- Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, Maryland, United States of America
|
23
|
Hao Y, Yao L, Sun Q, Gupta D. Interaction of Self-Regulation and Contextual Effects on Pre-attentive Auditory Processing: A Combined EEG/ECG Study. Front Neurosci 2019; 13:638. [PMID: 31275111 PMCID: PMC6593616 DOI: 10.3389/fnins.2019.00638] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/26/2019] [Accepted: 06/03/2019] [Indexed: 11/13/2022] Open
Abstract
Environmental changes are not always within the focus of our attention, and sensitive reactions (i.e., quicker and stronger responses) can be essential for an organism's survival and adaptation. Here we report that neurophysiological responses to sound changes that are not in the focus of attention are related both to ambient acoustic contexts and to regulation ability. We assessed electroencephalographic (EEG) mismatch negativity (MMN) latency and amplitude in response to sound changes in two contexts, ascending and descending pitch sequences, while participants were instructed to attend to muted videos. Prolonged latency and increased amplitude of the MMN at the fronto-central region occurred in ascending pitch sequences relative to descending sequences. We also assessed how regulation related to the contextual effects on the MMN. Reactions to changes in the ascending sequence were associated with attention control (frontal EEG theta/beta ratio), indexing speed of reaction, and with autonomic regulation (heart-rate variability), indexing intensity of reaction. Moreover, sound changes in the ascending context were associated with more activation of anterior cingulate cortex and insula, suggesting arousal effects and regulation processes. These findings suggest that the relation between speed and intensity is not fixed and may be modified by contexts and self-regulation ability. Specifically, cortical and cardiovascular indicators of self-regulation may capture different aspects of response sensitivity in terms of speed and intensity.
Affiliation(s)
- Yu Hao
- Department of Design and Environmental Analysis, Cornell University, Ithaca, NY, United States
- Lin Yao
- School of Electrical and Computer Engineering, Cornell University, Ithaca, NY, United States
- Qiuyan Sun
- Department of Nutritional Science, Cornell University, Ithaca, NY, United States
- Disha Gupta
- School of Medicine, New York University, New York, NY, United States
|
24
|
Wide sensory filters underlie performance in memory-based discrimination and generalization. PLoS One 2019; 14:e0214817. [PMID: 30998708 PMCID: PMC6472767 DOI: 10.1371/journal.pone.0214817] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/29/2018] [Accepted: 03/20/2019] [Indexed: 11/30/2022] Open
Abstract
The way animals respond to a stimulus depends largely on an internal comparison between the current sensation and the memory of previous stimuli and outcomes. We know little about the accuracy with which the physical properties of the stimuli influence this type of memory-based discriminative decision. Research has focused largely on discriminations between stimuli presented in quick succession, where animals can make relative inferences (same or different; higher or lower) from trial to trial. In the current study we used a memory-based task to explore how a stimulus's physical properties, in this case tone frequency, affect auditory discrimination and generalization in mice. Mice performed ad libitum while living in groups in their home quarters. We found that the frequency distance between safe and conditioned sounds had a constraining effect on discrimination. As the safe-to-conditioned distance decreased across groups, performance deteriorated rapidly, even for frequency differences significantly larger than reported discrimination thresholds. Generalization width was influenced both by the physical distance and by the previous experience of the mice, and was not accompanied by a decrease in sensory acuity. In conclusion, memory-based discriminations along a single stimulus dimension are inherently hard, reflecting a high overlap between the memory traces of the relevant stimuli. Memory-based discriminations therefore rely on wide sensory filters.
|
25
|
Malek S. Pitch Class and Envelope Effects in the Tritone Paradox Are Mediated by Differently Pronounced Frequency Preference Regions. Front Psychol 2018; 9:1590. [PMID: 30323778 PMCID: PMC6173142 DOI: 10.3389/fpsyg.2018.01590] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/17/2017] [Accepted: 08/09/2018] [Indexed: 12/04/2022] Open
Abstract
Shepard tones (octave complex tones) are well defined in pitch chroma but are ambiguous in pitch height. Pitch direction judgments of Shepard tones depend on the clockwise distance of the pitch classes on the pitch class circle, indicating the proximity principle in auditory perception. The tritone paradox emerges when two Shepard tones that form a tritone interval are presented successively. In this case, no proximity cue is available and judgments depend on the first tone and vary from person to person. A common explanation for the tritone paradox is the assumption of a specific pitch class comparison mechanism based on a pitch class template that is orientated differently from person to person. In contrast, psychoacoustic approaches (e.g., the Terhardt virtual pitch theory) explain it with common pitch-processing mechanisms. The present paper proposes a probabilistic threshold model, which estimates Shepard tone pitch height by probabilistic fundamental frequency extraction. In the first processing stage, only those frequency components whose amplitudes are above randomly distributed threshold values, whose expected values are in turn determined by a threshold function, are selected for further processing. The lowest of these non-filtered components determines the pitch height. The model is designed for tone pairs and provides occurrence probabilities for descending judgments. In a pitch-matching pretest, 12 Shepard tones (generated under a cosine envelope centered at 261 Hz) were compared to pure tones whose frequencies were adjusted by an up-down staircase method. Matched frequencies corresponded to frequency components but were ambiguous in octave position. In order to test the model, Shepard tones were generated under six cosine envelopes centered over a wide frequency range (65.41, 261, 370, 440, 523.25, 1244.51 Hz). The model predicted both pitch class effects and envelope effects: steep threshold functions caused pronounced pitch class effects, whereas flat threshold functions caused pronounced envelope effects. The model provides an alternative explanation to the pitch class template theory and serves as a psychoacoustic framework for the perception of Shepard tones.
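A Shepard tone of the general kind described above can be sketched as octave-spaced sinusoids whose amplitudes follow a raised-cosine envelope over log-frequency. The sampling rate, duration, and six-component octave span below are illustrative choices of ours, not the paper's exact stimulus parameters:

```python
import math

def shepard_tone(pitch_class_hz, center_hz=261.0, sr=16000, dur=0.25):
    """Illustrative Shepard (octave complex) tone: sinusoids at octave
    spacings of the pitch class, amplitude-weighted by a raised-cosine
    envelope over log2-frequency centered at `center_hz`."""
    # octave components of the pitch class (3 octaves below, 2 above)
    freqs = [pitch_class_hz * 2.0 ** k for k in range(-3, 3)]
    span = math.log2(freqs[-1]) - math.log2(freqs[0])  # 5 octaves

    def weight(f):
        # raised cosine over log2-frequency, peaking at center_hz
        x = (math.log2(f) - math.log2(center_hz)) / span
        return 0.5 * (1.0 + math.cos(2.0 * math.pi * x)) if abs(x) <= 0.5 else 0.0

    n = int(sr * dur)
    samples = [sum(weight(f) * math.sin(2.0 * math.pi * f * t / sr)
                   for f in freqs) for t in range(n)]
    return freqs, [weight(f) for f in freqs], samples
```

Because the envelope fixes amplitude by log-frequency position rather than by component index, pitch chroma is well defined while pitch height stays ambiguous, which is the property the tritone paradox exploits.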
|
26
|
Ward RM, Kelty-Stephen DG. Bringing the Nonlinearity of the Movement System to Gestural Theories of Language Use: Multifractal Structure of Spoken English Supports the Compensation for Coarticulation in Human Speech Perception. Front Physiol 2018; 9:1152. [PMID: 30233386 PMCID: PMC6129613 DOI: 10.3389/fphys.2018.01152] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/20/2018] [Accepted: 07/31/2018] [Indexed: 01/13/2023] Open
Abstract
Coarticulation is the tendency for speech vocalization and articulation, even at the phonemic level, to change with context, and compensation for coarticulation (CfC) reflects the striking human ability to perceive phonemic stability despite this variability. A current controversy centers on whether CfC depends on contrast between formants of a speech-signal spectrogram - specifically, contrast between offset formants concluding context stimuli and onset formants opening the target sound - or on speech-sound variability specific to the coordinative movement of speech articulators (e.g., vocal folds, postural muscles, lips, tongues). This manuscript aims to encode that coordinative-movement context in terms of speech-signal multifractal structure and to determine whether speech's multifractal structure might explain the crucial gestural support for any proposed spectral contrast. We asked human participants to categorize individual target stimuli drawn from an 11-step [ga]-to-[da] continuum as either the phoneme "GA" or "DA." Three groups each heard a specific type of context stimulus preceding the target stimuli: real-speech [al] or [aɹ], sine-wave tones at the third-formant offset frequency of either [al] or [aɹ], or simulated-speech contexts [al] or [aɹ]. Here, simulating speech contexts involved randomizing the sequence of relatively homogeneous pitch periods within the vowel sound [a] of each [al] and [aɹ]. Crucially, simulated-speech contexts had the same offset formants and extremely similar vowel formants as real-speech contexts and, to additional naïve participants, sounded identical to them. However, randomization distorted the original speech-context multifractality, and effects of spectral contrast following speech appeared only after regression modeling of trial-by-trial "GA" judgments controlled for context-stimulus multifractality. Furthermore, simulated-speech contexts elicited faster responses (as tone contexts do) and weakened known biases in CfC, suggesting that spectral contrast depends on the nonlinear interactions across multiple scales that articulatory gestures express through the speech signal. Traditional mouse-tracking behaviors, measured as participants moved their computer-mouse cursor to register their "GA"-or-"DA" decisions with mouse-clicks, suggest that listening to speech leads the movement system to resonate with the multifractality of context stimuli. We interpret these results as shedding light on a new multifractal terrain upon which to found a better understanding of the important role movement systems play in shaping how speech perception makes use of acoustic information.
|
27
|
Holt LL, Tierney AT, Guerra G, Laffere A, Dick F. Dimension-selective attention as a possible driver of dynamic, context-dependent re-weighting in speech processing. Hear Res 2018; 366:50-64. [PMID: 30131109 PMCID: PMC6107307 DOI: 10.1016/j.heares.2018.06.014] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 01/18/2018] [Revised: 06/10/2018] [Accepted: 06/19/2018] [Indexed: 12/24/2022]
Abstract
The contribution of acoustic dimensions to an auditory percept is dynamically adjusted and reweighted based on prior experience about how informative these dimensions are across the long-term and short-term environment. This is especially evident in speech perception, where listeners differentially weight information across multiple acoustic dimensions, and use this information selectively to update expectations about future sounds. The dynamic and selective adjustment of how acoustic input dimensions contribute to perception has made it tempting to conceive of this as a form of non-spatial auditory selective attention. Here, we review several human speech perception phenomena that might be consistent with auditory selective attention although, as yet, the literature does not definitively support a mechanistic tie. We relate these human perceptual phenomena to illustrative nonhuman animal neurobiological findings that offer informative guideposts in how to test mechanistic connections. We next present a novel empirical approach that can serve as a methodological bridge from human research to animal neurobiological studies. Finally, we describe four preliminary results that demonstrate its utility in advancing understanding of human non-spatial dimension-based auditory selective attention.
Affiliation(s)
- Lori L Holt
- Department of Psychology, Carnegie Mellon University, Pittsburgh, PA, 15213, USA; Center for the Neural Basis of Cognition, Carnegie Mellon University, Pittsburgh, PA, 15213, USA.
- Adam T Tierney
- Department of Psychological Sciences, Birkbeck College, University of London, London, WC1E 7HX, UK; Centre for Brain and Cognitive Development, Birkbeck College, London, WC1E 7HX, UK
- Giada Guerra
- Department of Psychological Sciences, Birkbeck College, University of London, London, WC1E 7HX, UK; Centre for Brain and Cognitive Development, Birkbeck College, London, WC1E 7HX, UK
- Aeron Laffere
- Department of Psychological Sciences, Birkbeck College, University of London, London, WC1E 7HX, UK
- Frederic Dick
- Department of Psychological Sciences, Birkbeck College, University of London, London, WC1E 7HX, UK; Centre for Brain and Cognitive Development, Birkbeck College, London, WC1E 7HX, UK; Department of Experimental Psychology, University College London, London, WC1H 0AP, UK
|
28
|
Pressnitzer D, Graves J, Chambers C, de Gardelle V, Egré P. Auditory Perception: Laurel and Yanny Together at Last. Curr Biol 2018; 28:R739-R741. [PMID: 29990455 DOI: 10.1016/j.cub.2018.06.002] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022]
Abstract
An auditory illusion caught the world's attention recently. For the same noisy speech utterance, different people reported hearing either 'Laurel' or 'Yanny'. The dichotomy highlights how perceptions are inferences from inherently ambiguous sensory information, even though ambiguity is often unnoticed.
Affiliation(s)
- D Pressnitzer
- Laboratoire des Systèmes Perceptifs, Département d'études cognitives, ENS, PSL University, CNRS, Paris, France.
- J Graves
- Laboratoire des Systèmes Perceptifs, Département d'études cognitives, ENS, PSL University, CNRS, Paris, France
- C Chambers
- Laboratoire des Systèmes Perceptifs, Département d'études cognitives, ENS, PSL University, CNRS, Paris, France
- V de Gardelle
- Laboratoire des Systèmes Perceptifs, Département d'études cognitives, ENS, PSL University, CNRS, Paris, France
- P Egré
- Laboratoire des Systèmes Perceptifs, Département d'études cognitives, ENS, PSL University, CNRS, Paris, France
|
29
|
Malek S, Sperschneider K. Aftereffects of Spectrally Similar and Dissimilar Spectral Motion Adaptors in the Tritone Paradox. Front Psychol 2018; 9:677. [PMID: 29867653 PMCID: PMC5953344 DOI: 10.3389/fpsyg.2018.00677] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/22/2017] [Accepted: 04/19/2018] [Indexed: 11/13/2022] Open
Affiliation(s)
- Stephanie Malek
- Psychology Department, Martin Luther University Halle-Wittenberg, Halle, Germany
- *Correspondence: Stephanie Malek
|
30
|
Adaptive and Selective Time Averaging of Auditory Scenes. Curr Biol 2018; 28:1405-1418.e10. [PMID: 29681472 DOI: 10.1016/j.cub.2018.03.049] [Citation(s) in RCA: 34] [Impact Index Per Article: 5.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/07/2017] [Revised: 01/24/2018] [Accepted: 03/21/2018] [Indexed: 11/23/2022]
Abstract
To overcome variability, estimate scene characteristics, and compress sensory input, perceptual systems pool data into statistical summaries. Despite growing evidence for statistical representations in perception, the underlying mechanisms remain poorly understood. One example of such representations occurs in auditory scenes, where background texture appears to be represented with time-averaged sound statistics. We probed the averaging mechanism using "texture steps"-textures containing subtle shifts in stimulus statistics. Although generally imperceptible, steps occurring in the previous several seconds biased texture judgments, indicative of a multi-second averaging window. Listeners seemed unable to willfully extend or restrict this window but showed signatures of longer integration times for temporally variable textures. In all cases the measured timescales were substantially longer than previously reported integration times in the auditory system. Integration also showed signs of being restricted to sound elements attributed to a common source. The results suggest an integration process that depends on stimulus characteristics, integrating over longer extents when it benefits statistical estimation of variable signals and selectively integrating stimulus components likely to have a common cause in the world. Our methodology could be naturally extended to examine statistical representations of other types of sensory signals.
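The multi-second averaging window described above can be caricatured with a trailing-window statistic: a subtle "texture step" falling inside the window biases the estimate, while a window confined to the post-step samples does not. A minimal sketch, where the window lengths and statistic values are illustrative rather than the paper's stimuli:

```python
def windowed_stat(values, window_len):
    """Average a running sound statistic over the trailing window_len
    samples - a toy stand-in for the multi-second averaging window
    inferred from texture-step biases (window length is illustrative)."""
    recent = values[-window_len:]
    return sum(recent) / len(recent)

# A 'texture step': the running statistic shifts subtly partway through.
trace = [0.50] * 60 + [0.56] * 40

biased = windowed_stat(trace, 100)      # step falls inside the window
unbiased = windowed_stat(trace, 40)     # sees only post-step samples
```

Here `biased` lands between the pre- and post-step values, mirroring how a step occurring in the previous several seconds pulled listeners' texture judgments.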
|
31
|
Siedenburg K. Timbral Shepard-illusion reveals ambiguity and context sensitivity of brightness perception. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2018; 143:EL93. [PMID: 29495721 DOI: 10.1121/1.5022983] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
Recent research has described strong effects of prior context on the perception of ambiguous pitch shifts of Shepard tones [Chambers, Akram, Adam, Pelofi, Sahani, Shamma, and Pressnitzer (2017). Nat. Commun. 8, 15027]. Here, similar effects are demonstrated for brightness shift judgments of harmonic complexes with cyclic spectral envelope components and fixed fundamental frequency. It is shown that frequency shifts of the envelopes are perceived as systematic shifts of brightness. Analogous to the work of Chambers et al., the perceptual ambiguity of half-octave shifts resolves with the presentation of prior context tones. These results constitute a context effect for the perceptual processing of spectral envelope shifts and indicate previously unknown commonalities between pitch and timbre perception.
Affiliation(s)
- Kai Siedenburg
- Department of Medical Physics and Acoustics and Cluster of Excellence Hearing4all, Carl von Ossietzky University, Oldenburg, Germany
|
32
|
McPherson MJ, McDermott JH. Diversity in pitch perception revealed by task dependence. Nat Hum Behav 2018; 2:52-66. [PMID: 30221202 PMCID: PMC6136452 DOI: 10.1038/s41562-017-0261-8] [Citation(s) in RCA: 33] [Impact Index Per Article: 5.5] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/21/2017] [Accepted: 11/08/2017] [Indexed: 01/12/2023]
Abstract
Pitch conveys critical information in speech, music, and other natural sounds, and is conventionally defined as the perceptual correlate of a sound's fundamental frequency (F0). Although pitch is widely assumed to be subserved by a single F0 estimation process, real-world pitch tasks vary enormously, raising the possibility of underlying mechanistic diversity. To probe pitch mechanisms we conducted a battery of pitch-related music and speech tasks using conventional harmonic sounds and inharmonic sounds whose frequencies lack a common F0. Some pitch-related abilities - those relying on musical interval or voice recognition - were strongly impaired by inharmonicity, suggesting a reliance on F0. However, other tasks, including those dependent on pitch contours in speech and music, were unaffected by inharmonicity, suggesting a mechanism that tracks the frequency spectrum rather than the F0. The results suggest that pitch perception is mediated by several different mechanisms, only some of which conform to traditional notions of pitch.
Affiliation(s)
- Malinda J McPherson
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA.
- Program in Speech and Hearing Bioscience and Technology, Harvard University, Cambridge, MA, USA.
- Josh H McDermott
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA
- Program in Speech and Hearing Bioscience and Technology, Harvard University, Cambridge, MA, USA
|
33
|
Arzounian D, de Kerangal M, de Cheveigné A. Sequential dependencies in pitch judgments. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2017; 142:3047. [PMID: 29195443 DOI: 10.1121/1.5009938] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/07/2023]
Abstract
Studies that measure pitch discrimination relate a subject's response on each trial to the stimuli presented on that trial, but there is evidence that behavior depends also on earlier stimulation. Here, listeners heard a sequence of tones and reported after each tone whether it was higher or lower in pitch than the previous tone. Frequencies were determined by an adaptive staircase targeting 75% correct, with interleaved tracks to ensure independence between consecutive frequency changes. Responses for this specific task were predicted by a model that took into account the frequency interval on the current trial, as well as the interval and response on the previous trial. This model was superior to simpler models. The dependence on the previous interval was positive (assimilative) for all subjects, consistent with persistence of the sensory trace. The dependence on the previous response was either positive or negative, depending on the subject, consistent with a subject-specific suboptimal response strategy. It is argued that a full stimulus + response model is necessary to account for effects of stimulus history and obtain an accurate estimate of sensory noise.
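The "full stimulus + response" model above can be sketched as a logistic function of the current frequency interval, the previous interval, and the previous response. The weights below are illustrative placeholders, not the paper's fitted values; note the positive previous-interval weight (assimilative) and a previous-response weight whose sign varied across subjects:

```python
import math

def p_judge_higher(df_now, df_prev, resp_prev, w=(0.0, 2.0, 0.5, 0.3)):
    """Probability of reporting the current tone as 'higher', given the
    current interval df_now (octaves), the previous interval df_prev,
    and the previous response resp_prev coded as +1 ('higher') or -1
    ('lower'). Weights w = (bias, current, previous, response) are
    illustrative, not fitted values from the paper."""
    bias, w_now, w_prev, w_resp = w
    z = bias + w_now * df_now + w_prev * df_prev + w_resp * resp_prev
    return 1.0 / (1.0 + math.exp(-z))
```

With these placeholder weights, an upward current interval pushes the judgment above chance, and an upward previous interval shifts the psychometric function in the same direction, as the sensory-trace persistence account predicts.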
Affiliation(s)
- Dorothée Arzounian
- Laboratoire des Systèmes Perceptifs, Département d'Etudes Cognitives, Ecole normale supérieure, PSL Research University, CNRS, 29 rue d'Ulm, Paris, 75005, France
- Mathilde de Kerangal
- The Ear Institute, University College London, 332 Grays Inn Road, Kings Cross, London, WC1X 8EE, United Kingdom
- Alain de Cheveigné
- Laboratoire des Systèmes Perceptifs, Département d'Etudes Cognitives, Ecole normale supérieure, PSL Research University, CNRS, 29 rue d'Ulm, Paris, 75005, France
|