1
Siedenburg K, Graves J, Pressnitzer D. A unitary model of auditory frequency change perception. PLoS Comput Biol 2023; 19:e1010307. PMID: 36634121; PMCID: PMC9876382; DOI: 10.1371/journal.pcbi.1010307.
Abstract
Changes in the frequency content of sounds over time are arguably the most basic form of information about the behavior of sound-emitting objects. In perceptual studies, such changes have mostly been investigated separately, as aspects of either pitch or timbre. Here, we propose a unitary account of "up" and "down" subjective judgments of frequency change, based on a model combining auditory correlates of acoustic cues in a sound-specific and listener-specific manner. To do so, we introduce a generalized version of so-called Shepard tones, allowing symmetric manipulations of spectral information on a fine scale, usually associated with pitch (spectral fine structure, SFS), and on a coarse scale, usually associated with timbre (spectral envelope, SE). In a series of behavioral experiments, listeners reported "up" or "down" shifts across pairs of generalized Shepard tones that differed in SFS, in SE, or in both. We observed the classic properties of Shepard tones for either SFS or SE shifts: subjective judgments followed the smallest log-frequency change direction, with cases of ambiguity and circularity. Interestingly, when both SFS and SE changes were applied concurrently (synergistically or antagonistically), we observed a trade-off between cues. Listeners were encouraged to report when they perceived "both" directions of change concurrently, but this rarely happened, suggesting a unitary percept. A computational model could accurately fit the behavioral data by combining different cues reflecting frequency changes after auditory filtering. The model revealed that cue weighting depended on the nature of the sound: when presented with harmonic sounds, listeners put more weight on SFS-related cues, whereas inharmonic sounds led to more weight on SE-related cues. Moreover, these stimulus-based factors were modulated by inter-individual differences, revealing variability across listeners in the detailed recipe for "up" and "down" judgments. We argue that frequency changes are tracked perceptually via the adaptive combination of a diverse set of cues, in a manner that is in fact similar to the derivation of other basic auditory dimensions such as spatial location.
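Shepard tones of the kind described here are straightforward to synthesize: octave-spaced partials weighted by a fixed spectral envelope, so that the partial frequencies (SFS) and the envelope (SE) can be shifted independently. A minimal sketch in Python follows; function and parameter names and all values are illustrative, not the authors' stimulus code.

```python
import numpy as np

def shepard_tone(base_freq=440.0, env_center=960.0, env_width_oct=1.5,
                 dur=0.5, sr=44100):
    """Shepard-like tone: octave-spaced partials (the SFS) weighted by a
    Gaussian spectral envelope on a log-frequency axis (the SE).
    Parameter names and values are illustrative."""
    t = np.arange(int(dur * sr)) / sr
    tone = np.zeros_like(t)
    # Start from the lowest octave transposition of base_freq above 20 Hz...
    f = base_freq
    while f / 2.0 > 20.0:
        f /= 2.0
    # ...and add octave-spaced partials up to the Nyquist frequency.
    while f < sr / 2.0:
        # Gaussian weight as a function of log2 distance from the envelope center.
        w = np.exp(-0.5 * (np.log2(f / env_center) / env_width_oct) ** 2)
        tone += w * np.sin(2.0 * np.pi * f * t)
        f *= 2.0
    return tone / np.max(np.abs(tone))
```

Shifting `base_freq` moves the SFS while leaving the SE fixed; shifting `env_center` does the reverse, which is the kind of independent manipulation the experiments rely on.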
Affiliation(s)
- Kai Siedenburg
- Carl von Ossietzky University of Oldenburg, Dept. of Medical Physics and Acoustics, Oldenburg, Germany
- Jackson Graves
- Laboratoire des systèmes perceptifs, Dépt. d’études cognitives, École normale supérieure, PSL University, CNRS, Paris, France
- Daniel Pressnitzer
- Laboratoire des systèmes perceptifs, Dépt. d’études cognitives, École normale supérieure, PSL University, CNRS, Paris, France

2
Thomassen S, Hartung K, Einhäuser W, Bendixen A. Low-high-low or high-low-high? Pattern effects on sequential auditory scene analysis. J Acoust Soc Am 2022; 152:2758. PMID: 36456271; DOI: 10.1121/10.0015054.
Abstract
Sequential auditory scene analysis (ASA) is often studied using sequences of two alternating tones, such as ABAB or ABA_, with "_" denoting a silent gap, and "A" and "B" sine tones differing in frequency (nominally low and high). Many studies implicitly assume that the specific arrangement (ABAB vs ABA_, as well as low-high-low vs high-low-high within ABA_) plays a negligible role, such that decisions about the tone pattern can be governed by other considerations. To explicitly test this assumption, a systematic comparison of different tone patterns for two-tone sequences was performed in three different experiments. Participants were asked to report whether they perceived the sequences as originating from a single sound source (integrated) or from two interleaved sources (segregated). Results indicate that core findings of sequential ASA, such as an effect of frequency separation on the proportion of integrated and segregated percepts, are similar across the different patterns during prolonged listening. However, at sequence onset, the integrated percept was more likely to be reported by the participants in ABA_low-high-low than in ABA_high-low-high sequences. This asymmetry is important for models of sequential ASA, since the formation of percepts at onset is an integral part of understanding how auditory interpretations build up.
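The alternating-tone stimuli compared here are easy to reproduce. A minimal sketch follows; parameter values and names are illustrative and do not reproduce the original experiments' exact frequencies, levels, or durations.

```python
import numpy as np

def two_tone_sequence(pattern="ABA_", f_low=400.0, df_semitones=6.0,
                      tone_dur=0.1, n_reps=10, sr=44100):
    """Two-tone streaming sequence such as ABA_ or ABAB: 'A' and 'B' are
    sine tones df_semitones apart, '_' is a silent gap. Here 'A' is taken
    as the low tone; swapping the mapping gives the high-low-high variant.
    All parameter values are illustrative."""
    f_high = f_low * 2.0 ** (df_semitones / 12.0)
    n = int(tone_dur * sr)
    t = np.arange(n) / sr
    # 5-ms linear onset/offset ramps to avoid spectral splatter at tone edges.
    ramp = np.minimum(1.0, np.minimum(t, t[::-1]) / 0.005)
    segment = {
        "A": np.sin(2.0 * np.pi * f_low * t) * ramp,
        "B": np.sin(2.0 * np.pi * f_high * t) * ramp,
        "_": np.zeros(n),
    }
    return np.concatenate([segment[c] for c in pattern * n_reps])
```

Varying `df_semitones` implements the frequency-separation manipulation that drives the integrated/segregated trade-off; changing `pattern` gives the ABAB vs ABA_ comparison.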
Affiliation(s)
- Sabine Thomassen
- Cognitive Systems Lab, Faculty of Natural Sciences, Chemnitz University of Technology, 09107 Chemnitz, Germany
- Kevin Hartung
- Cognitive Systems Lab, Faculty of Natural Sciences, Chemnitz University of Technology, 09107 Chemnitz, Germany
- Wolfgang Einhäuser
- Physics of Cognition Group, Faculty of Natural Sciences, Chemnitz University of Technology, 09107 Chemnitz, Germany
- Alexandra Bendixen
- Cognitive Systems Lab, Faculty of Natural Sciences, Chemnitz University of Technology, 09107 Chemnitz, Germany

3
Rimmele JM, Kern P, Lubinus C, Frieler K, Poeppel D, Assaneo MF. Musical Sophistication and Speech Auditory-Motor Coupling: Easy Tests for Quick Answers. Front Neurosci 2022; 15:764342. PMID: 35058741; PMCID: PMC8763673; DOI: 10.3389/fnins.2021.764342.
Abstract
Musical training enhances auditory-motor cortex coupling, which in turn facilitates music and speech perception. How tightly the temporal processing of music and speech is intertwined is a topic of current research. We investigated the relationship between musical sophistication (Goldsmiths Musical Sophistication index, Gold-MSI) and spontaneous speech-to-speech synchronization behavior as an indirect measure of speech auditory-motor cortex coupling strength. In a group of participants (n = 196), we tested whether the outcome of the spontaneous speech-to-speech synchronization test (SSS-test) can be inferred from self-reported musical sophistication. Participants were classified as high (HIGHs) or low (LOWs) synchronizers according to the SSS-test. HIGHs scored higher than LOWs on all Gold-MSI subscales (General Score, Active Engagement, Musical Perception, Musical Training, Singing Skills) except the Emotional Attachment scale. More specifically, compared to a previously reported German-speaking sample, HIGHs scored higher overall and LOWs lower. Compared to an estimated distribution of the English-speaking general population, our sample scored lower overall, with the scores of LOWs differing significantly from the normal distribution, falling around the 30th percentile. While HIGHs reported musical training more often than LOWs, the distribution of training instruments did not vary across groups. Importantly, even after the highly correlated Gold-MSI subscores were decorrelated, the Musical Perception and Musical Training subscales in particular allowed speech-to-speech synchronization behavior to be inferred. Differential effects of musical perception and training were observed: training predicted audio-motor synchronization in both groups, whereas perception did so only in the HIGHs. Our findings suggest that speech auditory-motor cortex coupling strength can be inferred from training and perceptual aspects of musical sophistication, pointing to shared mechanisms involved in speech and music perception.
Affiliation(s)
- Johanna M. Rimmele
- Department of Neuroscience, Max-Planck-Institute for Empirical Aesthetics, Frankfurt, Germany
- Max Planck NYU Center for Language, Music and Emotion, New York, NY, United States
- Pius Kern
- Department of Neuroscience, Max-Planck-Institute for Empirical Aesthetics, Frankfurt, Germany
- Christina Lubinus
- Department of Neuroscience, Max-Planck-Institute for Empirical Aesthetics, Frankfurt, Germany
- Klaus Frieler
- Department of Neuroscience, Max-Planck-Institute for Empirical Aesthetics, Frankfurt, Germany
- David Poeppel
- Department of Neuroscience, Max-Planck-Institute for Empirical Aesthetics, Frankfurt, Germany
- Max Planck NYU Center for Language, Music and Emotion, New York, NY, United States
- Department of Psychology, New York University, New York, NY, United States
- Ernst Strüngmann Institute for Neuroscience, Frankfurt, Germany
- M. Florencia Assaneo
- Instituto de Neurobiología, Universidad Nacional Autónoma de México, Querétaro, México

4
An implicit representation of stimulus ambiguity in pupil size. Proc Natl Acad Sci U S A 2021; 118:e2107997118. PMID: 34819369; DOI: 10.1073/pnas.2107997118.
Abstract
To guide behavior, perceptual systems must operate on intrinsically ambiguous sensory input. Observers are usually able to acknowledge the uncertainty of their perception, but in some cases, they critically fail to do so. Here, we show that a physiological correlate of ambiguity can be found in pupil dilation even when the observer is not aware of such ambiguity. We used a well-known auditory ambiguous stimulus, known as the tritone paradox, which can induce the perception of an upward or downward pitch shift within the same individual. In two experiments, behavioral responses showed that listeners could not explicitly access the ambiguity in this stimulus, even though their responses varied from trial to trial. However, pupil dilation was larger for the more ambiguous cases. The ambiguity of the stimulus for each listener was indexed by the entropy of behavioral responses, and this entropy was also a significant predictor of pupil size. In particular, entropy explained additional variation in pupil size independent of the explicit judgment of confidence in the specific situation that we investigated, in which the two measures were decoupled. Our data thus suggest that stimulus ambiguity is implicitly represented in the brain even without explicit awareness of this ambiguity.
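The ambiguity index described here, the entropy of a listener's responses across repeated presentations of the same stimulus, is simple to compute. A sketch (illustrative, not the authors' analysis code):

```python
import math
from collections import Counter

def response_entropy(responses):
    """Shannon entropy (bits) of a listener's judgments across repeated
    presentations of the same stimulus: 1 bit = maximally ambiguous
    (50/50 'up'/'down'), 0 bits = fully consistent responses.
    Illustrative, not the authors' code."""
    n = len(responses)
    return -sum((c / n) * math.log2(c / n)
                for c in Counter(responses).values())
```

A maximally ambiguous tritone pair yields an entropy near 1 bit; an unambiguous shift yields an entropy near 0, and per the abstract it is this per-stimulus entropy that predicts pupil size.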
5
Weisser A, Buchholz JM, Keidser G. Complex Acoustic Environments: Review, Framework, and Subjective Model. Trends Hear 2019; 23:2331216519881346. PMID: 31808369; PMCID: PMC6900675; DOI: 10.1177/2331216519881346.
Abstract
The concept of complex acoustic environments has appeared in several unrelated research areas within acoustics in different variations. Based on a review of the usage and evolution of this concept in the literature, a relevant framework was developed, which includes nine broad characteristics that are thought to drive the complexity of acoustic scenes. The framework was then used to study the most relevant characteristics for stimuli of realistic, everyday, acoustic scenes: multiple sources, source diversity, reverberation, and the listener’s task. The effect of these characteristics on perceived scene complexity was then evaluated in an exploratory study that reproduced the same stimuli with a three-dimensional loudspeaker array inside an anechoic chamber. Sixty-five subjects listened to the scenes and for each one had to rate 29 attributes, including complexity, both with and without target speech in the scenes. The data were analyzed using three-way principal component analysis with a (2 × 3 × 2) Tucker3 model in the dimensions of scales (or ratings), scenes, and subjects, explaining 42% of variation in the data. “Comfort” and “variability” were the dominant scale components, which span the perceived complexity. Interaction effects were observed, including the additional task of attending to target speech that shifted the complexity rating closer to the comfort scale. Also, speech contained in the background scenes introduced a second subject component, which suggests that some subjects are more distracted than others by background speech when listening to target speech. The results are interpreted in light of the proposed framework.
Affiliation(s)
- Adam Weisser
- Department of Linguistics, Faculty of Human Sciences, Macquarie University, Sydney, Australia; The HEARing Cooperative Research Centre, Carlton, Victoria, Australia
- Jörg M Buchholz
- Department of Linguistics, Faculty of Human Sciences, Macquarie University, Sydney, Australia; The HEARing Cooperative Research Centre, Carlton, Victoria, Australia
- Gitte Keidser
- The HEARing Cooperative Research Centre, Carlton, Victoria, Australia; National Acoustic Laboratory, The Hearing Hub, Macquarie University, Sydney, New South Wales, Australia

6
Siedenburg K, Röttges S, Wagener KC, Hohmann V. Can You Hear Out the Melody? Testing Musical Scene Perception in Young Normal-Hearing and Older Hearing-Impaired Listeners. Trends Hear 2020; 24:2331216520945826. PMID: 32895034; PMCID: PMC7502688; DOI: 10.1177/2331216520945826.
Abstract
It is well known that hearing loss compromises auditory scene analysis abilities, as is usually manifested in difficulties understanding speech in noise. Remarkably little is known about auditory scene analysis of hearing-impaired (HI) listeners when it comes to musical sounds. Specifically, it is unclear to what extent HI listeners are able to hear out a melody or an instrument from a musical mixture. Here, we tested a group of younger normal-hearing (yNH) and older HI (oHI) listeners with moderate hearing loss in their ability to match short melodies and instruments presented as part of mixtures. Four-tone sequences were used in conjunction with a simple musical accompaniment that acted as a masker (cello/piano dyads or spectrally matched noise). In each trial, a signal-masker mixture was presented, followed by two different versions of the signal alone. Listeners indicated which signal version was part of the mixture. Signal versions differed either in terms of the sequential order of the pitch sequence or in terms of timbre (flute vs. trumpet). Signal-to-masker thresholds were measured by varying the signal presentation level in an adaptive two-down/one-up procedure. We observed that thresholds of oHI listeners were elevated by 10 dB on average compared with those of yNH listeners. In contrast to yNH listeners, oHI listeners did not show evidence of listening in the dips of the masker. Musical training of participants was associated with lower thresholds. These results may indicate detrimental effects of hearing loss on central aspects of musical scene perception.
Affiliation(s)
- Kai Siedenburg
- Department of Medical Physics and Acoustics and Cluster of Excellence Hearing4all, Carl von Ossietzky University of Oldenburg
- Saskia Röttges
- Department of Medical Physics and Acoustics and Cluster of Excellence Hearing4all, Carl von Ossietzky University of Oldenburg
- Volker Hohmann
- Department of Medical Physics and Acoustics and Cluster of Excellence Hearing4all, Carl von Ossietzky University of Oldenburg; Hörzentrum Oldenburg GmbH & Hörtech gGmbH, Oldenburg, Germany

7
Letailleur A, Bisesi E, Legrain P. Strategies Used by Musicians to Identify Notes' Pitch: Cognitive Bricks and Mental Representations. Front Psychol 2020; 11:1480. PMID: 32733333; PMCID: PMC7358308; DOI: 10.3389/fpsyg.2020.01480.
Abstract
To this day, the study of the substratum of thought and its implied mechanisms is rarely addressed directly. Systemic approaches based on introspective methodologies are no longer fashionable and are often overlooked or ignored; most frequently, reductionist approaches are followed to decipher the neuronal circuits functionally associated with cognitive processes. However, we argue that systemic studies of individual thought may still contribute a useful and complementary description of the multimodal nature of perception, because they can take into account individual diversity while still identifying the common features of perceptual processes. We propose to address this question by looking at one possible task for recognition of a "signifying sound", as an example of conceptual grasping of a perceptual response. By adopting a mixed approach combining qualitative analyses of interviews based on introspection with quantitative statistical analyses carried out on the resulting categorization, this study describes a variety of mental strategies used by musicians to identify notes' pitch. Sixty-seven musicians (music students and professionals) were interviewed, revealing that musicians utilize intermediate steps during note identification by selecting or activating cognitive bricks that help construct and reach the correct decision. We named these elements "mental anchorpoints" (MAs). Although the anchorpoints are not universal and differ between individuals, they can be grouped into categories related to three main sensory modalities: auditory, visual, and kinesthetic. Such categorization enabled us to characterize the mental representations (MRs) that allow musicians to name notes in relation to eleven basic typologies of anchorpoints. We propose a conceptual framework that summarizes the process of note identification in five steps, starting from sensory detection and ending with the verbalization of the note pitch, passing through the pivotal role of MAs and MRs. We found that musicians use multiple strategies and select individual combinations of MAs belonging to these three sensory modalities, both in isolation and in combination.
Affiliation(s)
- Alain Letailleur
- CNRS UMR 8131, Centre Georg Simmel Recherches Franco-Allemandes en Sciences Sociales, École des Hautes Études en Sciences Sociales (EHESS), Paris, France
- Erica Bisesi
- CNRS UMR 3571, Paris, France; Unité Perception et Mémoire, Institut Pasteur, Paris, France
- Pierre Legrain
- CNRS UMR 3571, Paris, France; Unité Perception et Mémoire, Institut Pasteur, Paris, France

8
Walker SC, Williams K, Moore DJ. Superior Identification of Component Odors in a Mixture Is Linked to Autistic Traits in Children and Adults. Chem Senses 2020; 45:391-399. PMID: 32249289; DOI: 10.1093/chemse/bjaa026.
Abstract
Most familiar odors are complex mixtures of volatile molecules, which the olfactory system automatically synthesizes into a perceptual whole. However, odors are rarely encountered in isolation; thus, the brain must also separate distinct odor objects from complex and variable backgrounds. In vision, autistic traits are associated with superior performance in tasks that require focus on the local features of a perceptual scene. The aim of the present study was to determine whether the same advantage was observed in the analysis of olfactory scenes. To do this, we compared the ability of 1) 40 young adults (aged 16-35) with high (n = 20) and low levels of autistic traits and 2) 20 children (aged 7-11), with (n = 10) and without an autism spectrum disorder diagnosis, to identify individual odor objects presented within odor mixtures. First, we used a 4-alternative forced choice task to confirm that both adults and children were able to reliably identify 8 blended fragrances, representing food-related odors, when presented individually. We then used the same forced choice format to test participants' ability to identify the odors when they were combined in either binary or ternary mixtures. Adults with high levels of autistic traits showed superior performance on binary but not ternary mixture trials, whereas children with an autism spectrum disorder diagnosis outperformed age-matched neurotypical peers, irrespective of mixture complexity. These findings indicate that the local processing advantages associated with high levels of autistic traits in visual tasks are also apparent in a task requiring analytical processing of odor mixtures.
Affiliation(s)
- Susannah C Walker
- Research Centre for Brain and Behaviour, School of Natural Sciences and Psychology, Liverpool John Moores University, Liverpool, UK
- David J Moore
- Research Centre for Brain and Behaviour, School of Natural Sciences and Psychology, Liverpool John Moores University, Liverpool, UK

9
Little DF, Snyder JS, Elhilali M. Ensemble modeling of auditory streaming reveals potential sources of bistability across the perceptual hierarchy. PLoS Comput Biol 2020; 16:e1007746. PMID: 32275706; PMCID: PMC7185718; DOI: 10.1371/journal.pcbi.1007746.
Abstract
Perceptual bistability, the spontaneous, irregular fluctuation of perception between two interpretations of a stimulus, occurs when observing a large variety of ambiguous stimulus configurations. This phenomenon has the potential to serve as a tool for, among other things, understanding how function varies across individuals, owing to the large individual differences that manifest during perceptual bistability. Yet it remains difficult to interpret the functional processes at work without knowing where bistability arises during perception. In this study we explore the hypothesis that bistability originates from multiple sources distributed across the perceptual hierarchy. We develop a hierarchical model of auditory processing comprising three distinct levels: a Peripheral, tonotopic analysis; a Central analysis computing features found more centrally in the auditory system; and an Object analysis, where sounds are segmented into different streams. We model bistable perception within this system by applying adaptation, inhibition and noise to one or all of the three levels of the hierarchy. We evaluate a large ensemble of variations of this hierarchical model, where each model has a different configuration of adaptation, inhibition and noise. This approach avoids the assumption that a single configuration must be invoked to explain the data. Each model is evaluated on its ability to replicate two hallmarks of bistability during auditory streaming: the selectivity of bistability to specific stimulus configurations, and the characteristic log-normal pattern of perceptual switches. Consistent with a distributed origin, a broad range of model parameters across this hierarchy leads to a plausible form of perceptual bistability.
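The "characteristic log-normal pattern of perceptual switches" used as a benchmark here can be illustrated by generating toy percept durations and checking their log-scale statistics. The generator parameters below are purely illustrative, not fitted to any data.

```python
import math
import random

def simulate_percept_durations(n=1000, mu=1.0, sigma=0.5, seed=0):
    """Draw toy percept durations (in seconds) from a log-normal
    distribution, the shape classically reported for the dominance
    durations between bistable perceptual switches. mu and sigma are
    on the log scale and are illustrative values."""
    rng = random.Random(seed)
    return [math.exp(rng.gauss(mu, sigma)) for _ in range(n)]

durations = simulate_percept_durations()
log_d = [math.log(d) for d in durations]
mean_log = sum(log_d) / len(log_d)
# On a log axis the durations are simply Gaussian, so the sample mean of
# the log-durations should recover the generating mu.
```

Fitting a Gaussian to log-durations (rather than a normal distribution to raw durations) is the standard way such switch-time distributions are checked against the log-normal hallmark.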
Affiliation(s)
- David F. Little
- Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, Maryland, United States of America
- Joel S. Snyder
- Department of Psychology, University of Nevada, Las Vegas; Las Vegas, Nevada, United States of America
- Mounya Elhilali
- Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, Maryland, United States of America

10
Decrypting the electrophysiological individuality of the human brain: Identification of individuals based on resting-state EEG activity. Neuroimage 2019; 197:470-481. PMID: 30978497; DOI: 10.1016/j.neuroimage.2019.04.005.
Abstract
Biometric identification (BI) of individuals is a fast-growing field of research that is producing increasingly sophisticated applications in several spheres of everyday life. Previous magnetic resonance imaging (MRI) studies have demonstrated that, based on the high inter-individual variability of brain structure and function, it is possible to identify individuals with high accuracy. In contrast, it is commonly believed that electroencephalographic (EEG) data recorded at the surface of the scalp are too noisy for identification purposes with a comparably high hit rate. In the present work, we compared BI quality (F1-scores, accuracy, sensitivity, and specificity) between different types of functional (instantaneous, lagged, and total coherence, phase synchronization, correlation, and mutual information) and effective (Granger causality, phase synchronization, and coherence) connectivity measures. Results revealed that across functional connectivity metrics, identification accuracy was in the range of 0.98-1, whereas sensitivity and F1-scores were between 0.00 and 1 and specificity was between 0.99 and 1. BI was higher for the connectivity metrics that are contaminated by volume conduction (instantaneous connectivity) than for those that are unaffected by this variable (lagged connectivity). Support vector machine and neural network algorithms yielded the highest BI, followed by random forest and weighted k-nearest neighborhood, whereas linear discriminant analysis was less accurate. These results provide cross-validated counterevidence to the belief that EEG data are too noisy for identification purposes and demonstrate that functional and effective connectivity metrics are particularly suited for BI applications, with accuracy comparable to MRI. Our results have important implications for fast, low-cost, and mobile BI applications.
11
Pressnitzer D, Graves J, Chambers C, de Gardelle V, Egré P. Auditory Perception: Laurel and Yanny Together at Last. Curr Biol 2018; 28:R739-R741. PMID: 29990455; DOI: 10.1016/j.cub.2018.06.002.
Abstract
An auditory illusion caught the world's attention recently. For the same noisy speech utterance, different people reported hearing either 'Laurel' or 'Yanny'. The dichotomy highlights how perceptions are inferences from inherently ambiguous sensory information, even though ambiguity is often unnoticed.
Affiliation(s)
- D Pressnitzer
- Laboratoire des Systèmes Perceptifs, Département d'études cognitives, ENS, PSL University, CNRS, Paris, France
- J Graves
- Laboratoire des Systèmes Perceptifs, Département d'études cognitives, ENS, PSL University, CNRS, Paris, France
- C Chambers
- Laboratoire des Systèmes Perceptifs, Département d'études cognitives, ENS, PSL University, CNRS, Paris, France
- V de Gardelle
- Laboratoire des Systèmes Perceptifs, Département d'études cognitives, ENS, PSL University, CNRS, Paris, France
- P Egré
- Laboratoire des Systèmes Perceptifs, Département d'études cognitives, ENS, PSL University, CNRS, Paris, France

12
Siedenburg K. Timbral Shepard-illusion reveals ambiguity and context sensitivity of brightness perception. J Acoust Soc Am 2018; 143:EL93. PMID: 29495721; DOI: 10.1121/1.5022983.
Abstract
Recent research has described strong effects of prior context on the perception of ambiguous pitch shifts of Shepard tones [Chambers, Akram, Adam, Pelofi, Sahani, Shamma, and Pressnitzer (2017). Nat. Commun. 8, 15027]. Here, similar effects are demonstrated for brightness shift judgments of harmonic complexes with cyclic spectral envelope components and fixed fundamental frequency. It is shown that frequency shifts of the envelopes are perceived as systematic shifts of brightness. Analogous to the work of Chambers et al., the perceptual ambiguity of half-octave shifts resolves with the presentation of prior context tones. These results constitute a context effect for the perceptual processing of spectral envelope shifts and indicate so-far unknown commonalities between pitch and timbre perception.
Affiliation(s)
- Kai Siedenburg
- Department of Medical Physics and Acoustics and Cluster of Excellence Hearing4all, Carl von Ossietzky University, Oldenburg, Germany

13
Dykstra AR, Cariani PA, Gutschalk A. A roadmap for the study of conscious audition and its neural basis. Philos Trans R Soc Lond B Biol Sci 2017; 372:20160103. PMID: 28044014; PMCID: PMC5206271; DOI: 10.1098/rstb.2016.0103.
Abstract
How and which aspects of neural activity give rise to subjective perceptual experience, i.e. conscious perception, is a fundamental question of neuroscience. To date, the vast majority of work concerning this question has come from vision, raising the issue of generalizability of prominent resulting theories. However, recent work has begun to shed light on the neural processes subserving conscious perception in other modalities, particularly audition. Here, we outline a roadmap for the future study of conscious auditory perception and its neural basis, paying particular attention to how conscious perception emerges (and of which elements or groups of elements) in complex auditory scenes. We begin by discussing the functional role of the auditory system, particularly as it pertains to conscious perception. Next, we ask: what are the phenomena that need to be explained by a theory of conscious auditory perception? After surveying the available literature for candidate neural correlates, we end by considering the implications that such results have for a general theory of conscious perception as well as prominent outstanding questions and what approaches/techniques can best be used to address them. This article is part of the themed issue 'Auditory and visual scene analysis'.
Affiliation(s)
- Andrew R Dykstra
- Department of Neurology, Ruprecht-Karls-Universität Heidelberg, Heidelberg, Germany
- Alexander Gutschalk
- Department of Neurology, Ruprecht-Karls-Universität Heidelberg, Heidelberg, Germany

14
Kondo HM, van Loon AM, Kawahara JI, Moore BCJ. Auditory and visual scene analysis: an overview. Philos Trans R Soc Lond B Biol Sci 2017; 372:20160099. PMID: 28044011; DOI: 10.1098/rstb.2016.0099.
Abstract
We perceive the world as stable and composed of discrete objects even though auditory and visual inputs are often ambiguous owing to spatial and temporal occluders and changes in the conditions of observation. This raises important questions regarding where and how 'scene analysis' is performed in the brain. Recent advances from both auditory and visual research suggest that the brain does not simply process the incoming scene properties. Rather, top-down processes such as attention, expectations and prior knowledge facilitate scene perception. Thus, scene analysis is linked not only with the extraction of stimulus features and formation and selection of perceptual objects, but also with selective attention, perceptual binding and awareness. This special issue covers novel advances in scene-analysis research obtained using a combination of psychophysics, computational modelling, neuroimaging and neurophysiology, and presents new empirical and theoretical approaches. For integrative understanding of scene analysis beyond and across sensory modalities, we provide a collection of 15 articles that enable comparison and integration of recent findings in auditory and visual scene analysis. This article is part of the themed issue 'Auditory and visual scene analysis'.
Affiliation(s)
- Hirohito M Kondo
- Human Information Science Laboratory, NTT Communication Science Laboratories, NTT Corporation, Atsugi, Kanagawa 243-0198, Japan
- Anouk M van Loon
- Department of Experimental and Applied Psychology, Vrije Universiteit Amsterdam, Amsterdam 1081 BT, The Netherlands; Institute of Brain and Behavior Amsterdam, Vrije Universiteit Amsterdam, Amsterdam 1081 BT, The Netherlands
- Jun-Ichiro Kawahara
- Department of Psychology, Graduate School of Letters, Hokkaido University, Sapporo 060-0810, Japan
- Brian C J Moore
- Department of Experimental Psychology, University of Cambridge, Downing Street, Cambridge CB2 3EB, UK
15
Cichy RM, Teng S. Resolving the neural dynamics of visual and auditory scene processing in the human brain: a methodological approach. Philos Trans R Soc Lond B Biol Sci 2017; 372:rstb.2016.0108. [PMID: 28044019 PMCID: PMC5206276 DOI: 10.1098/rstb.2016.0108] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 09/22/2016] [Indexed: 01/06/2023] Open
Abstract
In natural environments, visual and auditory stimulation elicit responses across a large set of brain regions in a fraction of a second, yielding representations of the multimodal scene and its properties. The rapid and complex neural dynamics underlying visual and auditory information processing pose major challenges to human cognitive neuroscience. Brain signals measured non-invasively are inherently noisy, the format of neural representations is unknown, and transformations between representations are complex and often nonlinear. Further, no single non-invasive brain measurement technique provides a spatio-temporally integrated view. In this opinion piece, we argue that progress can be made by a concerted effort based on three pillars of recent methodological development: (i) sensitive analysis techniques such as decoding and cross-classification, (ii) complex computational modelling using models such as deep neural networks, and (iii) integration across imaging methods (magnetoencephalography/electroencephalography, functional magnetic resonance imaging) and models, e.g. using representational similarity analysis. We showcase two recent efforts that have been undertaken in this spirit and provide novel results about visual and auditory scene analysis. Finally, we discuss the limits of this perspective and sketch a concrete roadmap for future research. This article is part of the themed issue ‘Auditory and visual scene analysis’.
Affiliation(s)
- Santani Teng
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA
16
Kondo HM, Farkas D, Denham SL, Asai T, Winkler I. Auditory multistability and neurotransmitter concentrations in the human brain. Philos Trans R Soc Lond B Biol Sci 2017; 372:rstb.2016.0110. [PMID: 28044020 DOI: 10.1098/rstb.2016.0110] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 08/08/2016] [Indexed: 11/12/2022] Open
Abstract
Multistability in perception is a powerful tool for investigating sensory-perceptual transformations, because it produces dissociations between sensory inputs and subjective experience. Spontaneous switching between different perceptual objects occurs during prolonged listening to a sound sequence of tone triplets or repeated words (termed auditory streaming and verbal transformations, respectively). We used these examples of auditory multistability to examine to what extent neurochemical and cognitive factors influence the observed idiosyncratic patterns of switching between perceptual objects. The concentrations of glutamate-glutamine (Glx) and γ-aminobutyric acid (GABA) in brain regions were measured by magnetic resonance spectroscopy, while personality traits and executive functions were assessed using questionnaires and response inhibition tasks. Idiosyncratic patterns of perceptual switching in the two multistable stimulus configurations were identified using a multidimensional scaling (MDS) analysis. Intriguingly, although switching patterns within each individual differed between auditory streaming and verbal transformations, similar MDS dimensions were extracted separately from the two datasets. Individual switching patterns were significantly correlated with Glx and GABA concentrations in auditory cortex and inferior frontal cortex but not with the personality traits and executive functions. Our results suggest that auditory perceptual organization depends on the balance between neural excitation and inhibition in different brain regions. This article is part of the themed issue 'Auditory and visual scene analysis'.
Affiliation(s)
- Hirohito M Kondo
- Human Information Science Laboratory, NTT Communication Science Laboratories, NTT Corporation, Atsugi, Kanagawa 243-0198, Japan
- Dávid Farkas
- Institute of Cognitive Neuroscience and Psychology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Magyar Tudósok körútja 2, 1117 Budapest, Hungary; Department of Cognitive Science, Faculty of Natural Sciences, Budapest University of Technology and Economics, Egry József utca 1, 1111 Budapest, Hungary
- Susan L Denham
- Cognition Institute and School of Psychology, University of Plymouth, Plymouth, Devon PL4 8AA, UK
- Tomohisa Asai
- Human Information Science Laboratory, NTT Communication Science Laboratories, NTT Corporation, Atsugi, Kanagawa 243-0198, Japan
- István Winkler
- Institute of Cognitive Neuroscience and Psychology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Magyar Tudósok körútja 2, 1117 Budapest, Hungary