1. Pearson DV, Shen Y, McAuley JD, Kidd GR. The effect of rhythm on selective listening in multiple-source environments for young and older adults. Hear Res 2023; 435:108789. PMID: 37276686; PMCID: PMC10460128; DOI: 10.1016/j.heares.2023.108789.
Abstract
Understanding continuous speech with competing background sounds is challenging, particularly for older adults. One stimulus property that may aid listeners' understanding of to-be-attended (target) material is temporal regularity (rhythm). In the context of speech-in-noise understanding, McAuley and colleagues recently showed a target rhythm effect whereby recognition of target speech was better when the natural speech rhythm of a target talker was intact than when it was temporally altered. The current study replicates the target rhythm effect using a synthetic vowel sequence paradigm in young adults (Experiment 1) and then uses this paradigm to investigate potential age-related changes in the effect of rhythm on recognition (Experiment 2). Listeners identified the last three vowels of temporally regular (isochronous) and irregular (anisochronous) synthetic vowel sequences, in quiet and with a competing background sequence of vowel-like harmonic tone complexes presented at various tempos. The results replicated the target rhythm effect: temporal regularity in the vowel sequences improved young listeners' identification accuracy relative to irregular vowel sequences. The magnitude of the effect was not influenced by background tempo, although faster background tempos led to greater vowel identification accuracy regardless of regularity. Older listeners also demonstrated a target rhythm effect but received less benefit from the temporal regularity of the target sequences than did young listeners. This study highlights the importance of rhythm for understanding age-related differences in selective listening in complex environments and provides a novel paradigm for investigating effects of rhythm on perception.
Affiliation(s)
- Dylan V Pearson: Department of Speech, Language, and Hearing Sciences, Indiana University, United States
- Yi Shen: Department of Speech and Hearing Sciences, University of Washington, United States
- J Devin McAuley: Department of Psychology, Michigan State University, United States
- Gary R Kidd: Department of Speech, Language, and Hearing Sciences, Indiana University, United States
2. Thomassen S, Hartung K, Einhäuser W, Bendixen A. Low-high-low or high-low-high? Pattern effects on sequential auditory scene analysis. J Acoust Soc Am 2022; 152:2758. PMID: 36456271; DOI: 10.1121/10.0015054.
Abstract
Sequential auditory scene analysis (ASA) is often studied using sequences of two alternating tones, such as ABAB or ABA_, with "_" denoting a silent gap and "A" and "B" denoting sine tones differing in frequency (nominally low and high). Many studies implicitly assume that the specific arrangement (ABAB vs ABA_, as well as low-high-low vs high-low-high within ABA_) plays a negligible role, such that decisions about the tone pattern can be governed by other considerations. To explicitly test this assumption, a systematic comparison of different tone patterns for two-tone sequences was performed in three experiments. Participants were asked to report whether they perceived the sequences as originating from a single sound source (integrated) or from two interleaved sources (segregated). Results indicate that core findings of sequential ASA, such as an effect of frequency separation on the proportion of integrated and segregated percepts, are similar across the different patterns during prolonged listening. However, at sequence onset, the integrated percept was more likely to be reported in ABA_ low-high-low than in ABA_ high-low-high sequences. This asymmetry is important for models of sequential ASA, since the formation of percepts at onset is an integral part of understanding how auditory interpretations build up.
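The two-tone arrangements compared above can be sketched as event lists. The following minimal sketch (not from the paper) lays out repeating patterns such as ABAB or ABA_ as (onset, frequency) events; the frequencies, onset interval, and repeat count are illustrative assumptions, not the study's stimulus parameters.

```python
def build_sequence(pattern, f_low=400.0, f_high=480.0, soa=0.125, n_repeats=4):
    """Return (onset_seconds, frequency_hz) events for a repeating tone pattern.

    pattern: string over {'A', 'B', '_'}; '_' is a silent slot.
    soa: stimulus onset asynchrony between successive slots, in seconds.
    Swapping f_low and f_high turns low-high-low into high-low-high.
    """
    freq = {"A": f_low, "B": f_high}
    events = []
    for rep in range(n_repeats):
        for slot, symbol in enumerate(pattern):
            if symbol == "_":
                continue  # silent gap: no tone in this slot
            onset = (rep * len(pattern) + slot) * soa
            events.append((onset, freq[symbol]))
    return events

# ABA_ with a low A tone: the "low-high-low" arrangement discussed above.
aba_gap = build_sequence("ABA_")
```

The same function covers the ABAB arrangement (`build_sequence("ABAB")`), which differs only in that the fourth slot carries a B tone instead of a gap.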
Affiliation(s)
- Sabine Thomassen: Cognitive Systems Lab, Faculty of Natural Sciences, Chemnitz University of Technology, 09107 Chemnitz, Germany
- Kevin Hartung: Cognitive Systems Lab, Faculty of Natural Sciences, Chemnitz University of Technology, 09107 Chemnitz, Germany
- Wolfgang Einhäuser: Physics of Cognition Group, Faculty of Natural Sciences, Chemnitz University of Technology, 09107 Chemnitz, Germany
- Alexandra Bendixen: Cognitive Systems Lab, Faculty of Natural Sciences, Chemnitz University of Technology, 09107 Chemnitz, Germany
3. Foley L, Schlesinger J, Schutz M. More detectable, less annoying: Temporal variation in amplitude envelope and spectral content improves auditory interface efficacy. J Acoust Soc Am 2022; 151:3189. PMID: 35649914; DOI: 10.1121/10.0010447.
Abstract
Auditory interfaces, such as auditory alarms, are useful tools for human-computer interaction. Unfortunately, poor detectability and annoyance limit the efficacy of many interface sounds. Here, it is shown in two ways how moving beyond the traditionally simplistic temporal structures of normative interface sounds can significantly improve auditory interface efficacy. First, participants rated tones with percussive amplitude envelopes as significantly less annoying than tones with flat amplitude envelopes. Crucially, this annoyance reduction did not come with a detection cost, as percussive tones were detected more often than flat tones, particularly at relatively low listening levels. Second, reductions in the duration of a tone's harmonics significantly lowered its annoyance without a commensurate reduction in detection. Together, these findings help inform our theoretical understanding of the detection and annoyance of sound. In addition, they offer promising original design considerations for auditory interfaces.
Affiliation(s)
- Liam Foley: Department of Psychology, Neuroscience and Behaviour, McMaster University, Hamilton, Canada
- Joseph Schlesinger: Anesthesiology Critical Care Medicine, Biomedical Engineering, Vanderbilt University Medical Center, Nashville, Tennessee 37212, USA
- Michael Schutz: School of the Arts, McMaster University, Hamilton, Canada
4. Yang D, Cao X, Meng Q. Effects of a human sound-based index on the soundscapes of urban open spaces. Sci Total Environ 2022; 802:149869. PMID: 34461470; DOI: 10.1016/j.scitotenv.2021.149869.
Abstract
The ratio of the perceived extent of natural sounds to the perceived extent of traffic noise has been shown to be important for soundscapes, whereas research on the influence of human sounds has been limited. To examine this influence, this study proposes a human sound-based index, the red soundscape index (RSI), defined as the ratio of the perceived extent of human sounds to the perceived extent of other sounds. Sound pressure levels and crowd density were collected at 41 sites in nine urban parks and pedestrian streets in Harbin, China, and the perceived extent of various sounds was investigated by a questionnaire survey. The results confirmed a significant positive correlation between crowd density and RSI; the A-weighted sound pressure level increased linearly with increasing RSIn (the ratio of human sounds to natural sounds) and decreased with increasing RSIt (the ratio of human sounds to traffic noise). Interestingly, the overall soundscape assessment decreases linearly as RSIn increases over the range 0.8-1.5. The relationship with RSIt is parabolic, first increasing and then decreasing, with an axis of symmetry at RSIt = 2. Correspondingly, urban open spaces can be divided into three categories based on these trends, and the categories differ significantly in overall soundscape assessment, pleasantness, and calmness. Pleasantness is highest at sites where natural sounds are perceived to predominate, lowest where human sounds predominate, and intermediate at sites with balanced perception. Consequently, RSI is expected to be useful for soundscape prediction in urban open spaces.
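The three ratio indices described above can be written down directly from their definitions. The sketch below is an illustrative reading of the abstract, not the authors' code: the function name and the idea of feeding it questionnaire-style perceived-extent ratings are assumptions.

```python
def soundscape_indices(human, natural, traffic, other=0.0):
    """Compute (RSI, RSIn, RSIt) from perceived-extent ratings.

    RSI  = human sounds vs all other sounds (natural + traffic + other)
    RSIn = human sounds vs natural sounds only
    RSIt = human sounds vs traffic noise only
    """
    rsi = human / (natural + traffic + other)
    rsi_n = human / natural
    rsi_t = human / traffic
    return rsi, rsi_n, rsi_t

# Hypothetical ratings on an arbitrary scale, for illustration only.
rsi, rsi_n, rsi_t = soundscape_indices(human=4.0, natural=2.0, traffic=2.0)
```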
Affiliation(s)
- Da Yang: Key Laboratory of Cold Region Urban and Rural Human Settlement Environment Science and Technology, Ministry of Industry and Information Technology, School of Architecture, Harbin Institute of Technology, China
- Xinhao Cao: Key Laboratory of Cold Region Urban and Rural Human Settlement Environment Science and Technology, Ministry of Industry and Information Technology, School of Architecture, Harbin Institute of Technology, China
- Qi Meng: Key Laboratory of Cold Region Urban and Rural Human Settlement Environment Science and Technology, Ministry of Industry and Information Technology, School of Architecture, Harbin Institute of Technology, China
5. Soltanparast S, Toufan R, Talebian S, Pourbakht A. Regularity of background auditory scene and selective attention: a brain oscillatory study. Neurosci Lett 2022; 772:136465. DOI: 10.1016/j.neulet.2022.136465.
6. Neubert CR, Förstel AP, Debener S, Bendixen A. Predictability-Based Source Segregation and Sensory Deviance Detection in Auditory Aging. Front Hum Neurosci 2021; 15:734231. PMID: 34776906; PMCID: PMC8586071; DOI: 10.3389/fnhum.2021.734231.
Abstract
When multiple sound sources are present at the same time, auditory perception faces the challenge of disentangling the resulting mixture and focusing attention on the target source. It has been repeatedly demonstrated that background (distractor) sound sources are easier to ignore when their spectrotemporal signature is predictable. Prior evidence suggests that this ability to exploit predictability for foreground-background segregation degrades with age. On a theoretical level, this has been related to an impairment in elderly adults' capacity to detect certain types of sensory deviance in unattended sound sequences. Yet the link between these two capacities, deviance detection and predictability-based sound source segregation, has not been empirically demonstrated. Here we report a combined behavioral-EEG study investigating the ability of elderly listeners (60–75 years of age) to use predictability as a cue for sound source segregation, as well as their sensory deviance detection capacities. Listeners performed a detection task on a target stream that can only be solved when a concurrent distractor stream is successfully ignored. We contrast two conditions whose distractor streams differ in their predictability. The ability to benefit from predictability was operationalized as the performance difference between the two conditions. Results show that elderly listeners can use predictability for sound source segregation at the group level, yet with a high degree of inter-individual variation in this ability. In a further, passive-listening control condition, we measured correlates of deviance detection in the event-related brain potential (ERP) elicited by occasional deviations from the same spectrotemporal pattern as used for the predictable distractor sequence during the behavioral task. ERP results confirmed neural signatures of deviance detection in terms of mismatch negativity (MMN) at the group level. Correlation analyses at the single-subject level provide no evidence for the hypothesis that deviance detection ability (measured by MMN amplitude) is related to the ability to benefit from predictability for sound source segregation. These results are discussed in the frameworks of sensory deviance detection and predictive coding.
Affiliation(s)
- Christiane R Neubert: Cognitive Systems Lab, Faculty of Natural Sciences, Institute of Physics, Chemnitz University of Technology, Chemnitz, Germany
- Alexander P Förstel: Neuropsychology Lab, Department of Psychology, Carl von Ossietzky University of Oldenburg, Oldenburg, Germany
- Stefan Debener: Neuropsychology Lab, Department of Psychology, Carl von Ossietzky University of Oldenburg, Oldenburg, Germany
- Alexandra Bendixen: Cognitive Systems Lab, Faculty of Natural Sciences, Institute of Physics, Chemnitz University of Technology, Chemnitz, Germany
7. Attentional control via synaptic gain mechanisms in auditory streaming. Brain Res 2021; 1778:147720. PMID: 34785256; DOI: 10.1016/j.brainres.2021.147720.
Abstract
Attention is a crucial component in sound source segregation, allowing auditory objects of interest to be both singled out and held in focus. Our study utilizes a fundamental paradigm for sound source segregation: a sequence of interleaved tones, A and B, of different frequencies that can be heard as a single integrated stream or segregated into two streams (the auditory streaming paradigm). We focus on the irregular alternations between integrated and segregated that occur for long presentations, so-called auditory bistability. Psychoacoustic experiments demonstrate how attentional control, a listener's intention to experience the integrated or segregated percept, biases perception in favour of different perceptual interpretations. Our data show that this is achieved by prolonging the dominance times of the attended percept and, to a lesser extent, by curtailing the dominance times of the unattended percept, an effect that remains consistent across a range of values for the difference in frequency between A and B. An existing neuromechanistic model describes the neural dynamics of perceptual competition downstream of primary auditory cortex (A1). The model allows us to propose plausible neural mechanisms for attentional control, linked to different attentional strategies, in a direct comparison with behavioural data. A mechanism based on a percept-specific input gain best accounts for the effects of attentional control.
8. Hodapp A, Grimm S. Neural signatures of temporal regularity and recurring patterns in random tonal sound sequences. Eur J Neurosci 2021; 53:2740-2754. PMID: 33481296; DOI: 10.1111/ejn.15123.
Abstract
The auditory system is highly sensitive to recurring patterns in the acoustic input, even in otherwise unstructured material such as white noise or random tonal sequences. Electroencephalography (EEG) research has revealed a characteristic negative potential to periodically recurring auditory patterns, a response that has been interpreted as memory-trace-related and pattern-specific rather than as a sign of periodicity-driven entrainment. Here, we aim to disentangle these two possible contributions by investigating the influence of a periodic sound sequence's inherent temporal regularity on event-related potentials. Participants were presented with continuous sequences of short tones of random pitch, some containing a recurring pattern, and asked to indicate whether they heard a repetition. Patterns were spaced either equally across the random sequence (isochronous condition) or with a temporal jitter (jittered condition), which enabled us to differentiate between event-related potentials (and thus processing operations associated with a memory trace for a repeated pattern) and the periodic nature of the repetitions. A negative recurrence-related component was observed independently of temporal regularity, was pattern-specific, and was modulated by repetition of the pattern across trials. Critically, isochronous pattern repetition induced an additional early periodicity-related positive component, which began to build up before pattern onset and which was elicited undiminished even when the repeated pattern was occasionally omitted. This positive component likely reflects a sensory-driven entrainment process that could underlie a behavioural benefit in detecting temporally regular repetitions.
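The contrast between the two timing conditions above reduces to where the pattern repetitions fall in time. The following sketch (not the authors' stimulus code) generates pattern onset times that are either equally spaced or randomly jittered; the period and jitter magnitudes are arbitrary assumptions.

```python
import random

def pattern_onsets(n_repetitions, period=1.2, jitter=0.0, seed=0):
    """Onset times (s) of a recurring pattern; jitter=0.0 gives isochrony.

    Each onset is displaced from the regular grid by a uniform random
    offset in [-jitter, +jitter] seconds.
    """
    rng = random.Random(seed)
    return [i * period + rng.uniform(-jitter, jitter)
            for i in range(n_repetitions)]

iso = pattern_onsets(5)              # isochronous condition: fixed spacing
jit = pattern_onsets(5, jitter=0.2)  # jittered condition: perturbed spacing
```

Only in the isochronous case do the onsets carry a predictable periodicity that an entrainment process could lock onto; the jittered onsets preserve the average rate but not the phase.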
Affiliation(s)
- Alice Hodapp: Institute of Psychology, University of Leipzig, Leipzig, Germany; Department of Psychology, University of Potsdam, Potsdam, Germany
- Sabine Grimm: Institute of Psychology, University of Leipzig, Leipzig, Germany
9. Timmers R, Arthurs Y, Crook H. Stream segregation revisited: Dynamic listening and influences of emotional context on stream perception and attention. Conscious Cogn 2020; 85:103027. PMID: 33059197; DOI: 10.1016/j.concog.2020.103027.
Abstract
A classical experiment on auditory stream segregation is revisited, reconceptualising perceptual ambiguity in terms of affordances and musical engagement. Specifically, three experiments are reported that investigate how listeners' perception of auditory sequences changes dynamically depending on emotional context. The experiments show that listeners adapt their attention to higher or lower pitched streams (Experiments 1 and 2) and the degree of auditory stream integration or segregation (Experiment 3) in accordance with the presented emotional context. Participants with and without formal musical training show this influence, although to differing degrees (Experiment 2). Contributing evidence to the literature on interactions between emotion and cognition, these experiments demonstrate how emotion is an intrinsic part of music perception and not merely a product of the listening experience.
Affiliation(s)
- Renee Timmers: Department of Music, The University of Sheffield, UK
- Yuko Arthurs: Department of Music, The University of Sheffield, UK
- Harriet Crook: Department of Music, The University of Sheffield, UK; Department of Audiology, Royal Hallamshire Hospital, UK
10. de Kerangal M, Vickers D, Chait M. The effect of healthy aging on change detection and sensitivity to predictable structure in crowded acoustic scenes. Hear Res 2020; 399:108074. PMID: 33041093; DOI: 10.1016/j.heares.2020.108074.
Abstract
The auditory system plays a critical role in supporting our ability to detect abrupt changes in our surroundings. Here we study how this capacity is affected in the course of healthy aging. Artificial acoustic 'scenes', populated by multiple concurrent streams of pure tones ('sources'), were used to capture the challenges of listening in complex acoustic environments. Two scene conditions were included: REG scenes consisted of sources characterized by a regular temporal structure; matched RAND scenes contained sources which were temporally random. Changes, manifested as the abrupt disappearance of one of the sources, were introduced on a subset of the trials, and participants ('young' group, N = 41, age 20-38 years; 'older' group, N = 41, age 60-82 years) were instructed to monitor the scenes for these events. Previous work demonstrated that young listeners exhibit better change detection performance in REG scenes, reflecting sensitivity to temporal structure. Here we sought to determine: (1) whether 'baseline' change detection ability (i.e., in RAND scenes) is affected by age; (2) whether aging affects listeners' sensitivity to temporal regularity; and (3) how change detection capacity relates to listeners' hearing and cognitive profile (a battery of tests that capture hearing and cognitive abilities hypothesized to be affected by aging). The results demonstrated that healthy aging is associated with reduced sensitivity to abrupt scene changes in RAND scenes, but that performance does not correlate with age or standard audiological measures such as pure tone audiometry or speech-in-noise performance. Remarkably, older listeners' change detection performance improved substantially (up to the level exhibited by young listeners) in REG relative to RAND scenes. This suggests that the ability to extract and track the regularity associated with scene sources, even in crowded acoustic environments, is relatively preserved in older listeners.
Affiliation(s)
- Mathilde de Kerangal: Ear Institute, University College London, 332 Gray's Inn Road, London WC1X 8EE, UK
- Deborah Vickers: Ear Institute, University College London, 332 Gray's Inn Road, London WC1X 8EE, UK; Cambridge Hearing Group, Clinical Neurosciences Department, University of Cambridge, UK
- Maria Chait: Ear Institute, University College London, 332 Gray's Inn Road, London WC1X 8EE, UK
11. Szalárdy O, Tóth B, Farkas D, Orosz G, Honbolygó F, Winkler I. Linguistic predictability influences auditory stimulus classification within two concurrent speech streams. Psychophysiology 2020; 57:e13547. DOI: 10.1111/psyp.13547.
Affiliation(s)
- Orsolya Szalárdy: Institute of Behavioural Sciences, Faculty of Medicine, Semmelweis University, Budapest, Hungary; Institute of Cognitive Neuroscience and Psychology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Budapest, Hungary
- Brigitta Tóth: Institute of Cognitive Neuroscience and Psychology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Budapest, Hungary
- Dávid Farkas: Analytics Development, Performance Management and Analytics, Business Development, Integrated Supply Chain Management, Nokia Business Services, Nokia Operations, Nokia, Budapest, Hungary
- Gábor Orosz: Department of Psychology, Stanford University, Stanford, CA, USA
- Ferenc Honbolygó: Brain Imaging Centre, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Budapest, Hungary; Institute of Psychology, ELTE Eötvös Loránd University, Budapest, Hungary
- István Winkler: Institute of Cognitive Neuroscience and Psychology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Budapest, Hungary
12. Streaming of Repeated Noise in Primary and Secondary Fields of Auditory Cortex. J Neurosci 2020; 40:3783-3798. PMID: 32273487; DOI: 10.1523/jneurosci.2105-19.2020.
Abstract
Statistical regularities in natural sounds facilitate the perceptual segregation of auditory sources, or streams. Repetition is one cue that drives stream segregation in humans, but the neural basis of this perceptual phenomenon remains unknown. We demonstrated a similar perceptual ability in animals by training ferrets of both sexes to detect a stream of repeating noise samples (foreground) embedded in a stream of random samples (background). During passive listening, we recorded neural activity in primary auditory cortex (A1) and secondary auditory cortex (posterior ectosylvian gyrus, PEG). We used two context-dependent encoding models to test for evidence of streaming of the repeating stimulus. The first was based on average evoked activity per noise sample and the second on the spectro-temporal receptive field. Both approaches tested whether differences in neural responses to repeating versus random stimuli were better modeled by scaling the response to both streams equally (global gain) or by separately scaling the response to the foreground versus background stream (stream-specific gain). Consistent with previous observations of adaptation, we found an overall reduction in global gain when the stimulus began to repeat. However, when we measured stream-specific changes in gain, responses to the foreground were enhanced relative to the background. This enhancement was stronger in PEG than A1. In A1, enhancement was strongest in units with low sparseness (i.e., broad sensory tuning) and with tuning selective for the repeated sample. Enhancement of responses to the foreground relative to the background provides evidence for stream segregation that emerges in A1 and is refined in PEG.
SIGNIFICANCE STATEMENT: To interact with the world successfully, the brain must parse behaviorally important information from a complex sensory environment. Complex mixtures of sounds often arrive at the ears simultaneously or in close succession, yet they are effortlessly segregated into distinct perceptual sources. This process breaks down in hearing-impaired individuals and speech recognition devices. By identifying the underlying neural mechanisms that facilitate perceptual segregation, we can develop strategies for ameliorating hearing loss and improving speech recognition technology in the presence of background noise. Here, we present evidence supporting a hierarchical process, present in primary auditory cortex and refined in secondary auditory cortex, in which sound repetition facilitates segregation.
13. Schröger E, Roeber U. Encoding of deterministic and stochastic auditory rules in the human brain: The mismatch negativity mechanism does not reflect basic probability. Hear Res 2020; 399:107907. PMID: 32143958; DOI: 10.1016/j.heares.2020.107907.
Abstract
Regularities in a sequence of sounds can be automatically encoded in a predictive model by the auditory system. When a sound deviates from the one predicted by the model, a mismatch negativity (MMN) is elicited, which is taken to reflect a prediction error at a particular level of the model hierarchy. Although there are many studies on deterministic regularities, only a few have investigated the brain's ability to encode non-deterministic regularities. We studied a simple stochastic regularity: two tone pitches (standards, each occurring on 45% of trials); this regularity was occasionally violated by another tone pitch (deviant, occurring on 10% of trials). We found MMN when the deviant's pitch was outside those of the standards, but not when it was between them. Importantly, when we alternated the occurrence of the same two standards, making them deterministic, the deviant elicited MMN, even when its pitch was between those of the standards. Thus, although the MMN system is extremely powerful in establishing even quite complex deterministic regularities, it fails with a simple stochastic regularity. We argue that the MMN system does not know basic probability.
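The stochastic regularity described above (two standards at 45% each, a deviant at 10%) can be sketched as a sequence generator. This is an illustrative reconstruction, not the authors' code: the specific pitches, trial count, and sampling scheme are assumptions.

```python
import random

def stochastic_sequence(n_trials, f_std1=440.0, f_std2=494.0, f_dev=554.0, seed=1):
    """Draw each trial's pitch independently with a 45/45/10 probability mix.

    f_std1, f_std2: the two standard pitches (45% of trials each).
    f_dev: the deviant pitch (10% of trials).
    """
    rng = random.Random(seed)
    # A 100-element pool encoding the 45/45/10 probabilities.
    pool = [f_std1] * 45 + [f_std2] * 45 + [f_dev] * 10
    return [rng.choice(pool) for _ in range(n_trials)]

seq = stochastic_sequence(1000)
```

A deterministic variant, as in the abstract's alternating control condition, would instead cycle the two standards strictly (f_std1, f_std2, f_std1, ...) with occasional deviant substitutions, making each upcoming standard fully predictable.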
Affiliation(s)
- Erich Schröger: Institute for Psychology, Leipzig University, Neumarkt 9-19, D-04109 Leipzig, Germany
- Urte Roeber: Institute for Psychology, Leipzig University, Neumarkt 9-19, D-04109 Leipzig, Germany
14. Auditory streaming and bistability paradigm extended to a dynamic environment. Hear Res 2019; 383:107807. PMID: 31622836; DOI: 10.1016/j.heares.2019.107807.
Abstract
We explore stream segregation with temporally modulated acoustic features using behavioral experiments and modelling. The auditory streaming paradigm, in which alternating high-frequency (A) and low-frequency (B) tones appear in a repeating ABA-pattern, has been shown to be perceptually bistable for extended presentations (on the order of minutes). For a fixed, repeating stimulus, perception spontaneously changes (switches) at random times, every 2-15 s, between an integrated interpretation with a galloping rhythm and segregated streams. Streaming in a natural auditory environment requires segregation of auditory objects with features that evolve over time. With the relatively idealized ABA-triplet paradigm, we explore perceptual switching in a non-static environment by considering slowly and periodically varying stimulus features. Our previously published model captures the dynamics of auditory bistability and here predicts how perceptual switches are entrained, tightly locked to the rising and falling phases of modulation. In psychoacoustic experiments we find that entrainment depends on both the period of modulation and the intrinsic switch characteristics of individual listeners. The extended auditory streaming paradigm with slowly modulated stimulus features presented here will be of significant interest for future imaging and neurophysiology experiments, as it reduces the need for subjective reports of ongoing perception.
15. Rajendran VG, Teki S, Schnupp JWH. Temporal Processing in Audition: Insights from Music. Neuroscience 2018; 389:4-18. PMID: 29108832; PMCID: PMC6371985; DOI: 10.1016/j.neuroscience.2017.10.041.
Abstract
Music is a curious example of a temporally patterned acoustic stimulus, and a compelling pan-cultural phenomenon. This review strives to bring some insights from decades of music psychology and sensorimotor synchronization (SMS) literature into the mainstream auditory domain, arguing that musical rhythm perception is shaped in important ways by temporal processing mechanisms in the brain. The feature that unites these disparate disciplines is an appreciation of the central importance of timing, sequencing, and anticipation. Perception of musical rhythms relies on an ability to form temporal predictions, a general feature of temporal processing that is equally relevant to auditory scene analysis, pattern detection, and speech perception. By bringing together findings from the music and auditory literature, we hope to inspire researchers to look beyond the conventions of their respective fields and consider the cross-disciplinary implications of studying auditory temporal sequence processing. We begin by highlighting music as an interesting sound stimulus that may provide clues to how temporal patterning in sound drives perception. Next, we review the SMS literature and discuss possible neural substrates for the perception of, and synchronization to, musical beat. We then move away from music to explore the perceptual effects of rhythmic timing in pattern detection, auditory scene analysis, and speech perception. Finally, we review the neurophysiology of general timing processes that may underlie aspects of the perception of rhythmic patterns. We conclude with a brief summary and outlook for future research.
Affiliation(s)
- Vani G Rajendran
- Auditory Neuroscience Group, University of Oxford, Department of Physiology, Anatomy, and Genetics, Oxford, UK
- Sundeep Teki
- Auditory Neuroscience Group, University of Oxford, Department of Physiology, Anatomy, and Genetics, Oxford, UK
- Jan W H Schnupp
- City University of Hong Kong, Department of Biomedical Sciences, 31 To Yuen Street, Kowloon Tong, Hong Kong.
|
16
|
Wen W, Brann E, Di Costa S, Haggard P. Enhanced perceptual processing of self-generated motion: Evidence from steady-state visual evoked potentials. Neuroimage 2018; 175:438-448. [PMID: 29654877 PMCID: PMC5971214 DOI: 10.1016/j.neuroimage.2018.04.019] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/23/2017] [Revised: 03/15/2018] [Accepted: 04/09/2018] [Indexed: 01/23/2023] Open
Abstract
The sense of agency emerges when our voluntary actions produce anticipated or predictable outcomes in the external world. It remains unclear how the sense of control also influences our perception of the external world. The present study examined perceptual processing of self-generated motion versus non-self-generated motion using steady-state visual evoked potentials (SSVEPs). Participants continuously moved their finger on a touchpad to trigger the movements of two shapes (Experiment 1) or two groups of dots (Experiment 2) on a monitor. Degree of control was manipulated by varying the spatial relation between finger movement and stimulus trajectory across conditions. However, the velocity, onset time, and offset time of visual stimuli always corresponded to participants' finger movement. Stimuli flickered at a frequency of either 7.5 Hz or 10 Hz, thus SSVEPs of these frequencies and their harmonics provided a frequency-tagged measurement of perceptual processing. Participants triggered the motion of all stimuli simultaneously, but had greater levels of control over some stimuli than over others. Their task was to detect a brief colour change on the border(s) of one shape (Experiment 1) or of one group of dots (Experiment 2). Although control over shapes/dots was irrelevant to the visual detection task, we found stronger SSVEPs for stimuli that were under a high level of control, compared with the stimuli that were under a low level of control. Our results suggest that the spatial regularity between self-generated movements and visual input boosted the neural responses underlying perceptual processing. Our results support the preactivation account of sensory attenuation, suggesting that perceptual processing of self-generated events is enhanced rather than inhibited.
Affiliation(s)
- Wen Wen
- Institute of Cognitive Neuroscience, University College London, Alexandra House, 17 Queen Square, London, WC1N 3AR, UK; Department of Precision Engineering, University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-8656, Japan; Japan Society for the Promotion of Science, 5-3-1 Kojimachi, Chiyoda-ku, Tokyo, 102-0083, Japan.
- Elisa Brann
- Institute of Cognitive Neuroscience, University College London, Alexandra House, 17 Queen Square, London, WC1N 3AR, UK.
- Steven Di Costa
- Institute of Cognitive Neuroscience, University College London, Alexandra House, 17 Queen Square, London, WC1N 3AR, UK.
- Patrick Haggard
- Institute of Cognitive Neuroscience, University College London, Alexandra House, 17 Queen Square, London, WC1N 3AR, UK.
|
17
|
Lumaca M, Ravignani A, Baggio G. Music Evolution in the Laboratory: Cultural Transmission Meets Neurophysiology. Front Neurosci 2018; 12:246. [PMID: 29713263 PMCID: PMC5911491 DOI: 10.3389/fnins.2018.00246] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/21/2017] [Accepted: 03/29/2018] [Indexed: 11/16/2022] Open
Abstract
In recent years, there has been renewed interest in the biological and cultural evolution of music, and specifically in the role played by perceptual and cognitive factors in shaping core features of musical systems, such as melody, harmony, and rhythm. One proposal originates in the language sciences. It holds that aspects of musical systems evolve by adapting gradually, in the course of successive generations, to the structural and functional characteristics of the sensory and memory systems of learners and “users” of music. This hypothesis has found initial support in laboratory experiments on music transmission. In this article, we first review some of the most important theoretical and empirical contributions to the field of music evolution. Next, we identify a major current limitation of these studies, i.e., the lack of direct neural support for the hypothesis of cognitive adaptation. Finally, we discuss a recent experiment in which this issue was addressed by using event-related potentials (ERPs). We suggest that the introduction of neurophysiology in cultural transmission research may provide novel insights on the micro-evolutionary origins of forms of variation observed in cultural systems.
Affiliation(s)
- Massimo Lumaca
- Center for Music in the Brain, Department of Clinical Medicine, Aarhus University and The Royal Academy of Music Aarhus/Aalborg, Aarhus, Denmark
- Andrea Ravignani
- Artificial Intelligence Lab, Vrije Universiteit Brussel, Brussels, Belgium; Research Department, Sealcentre Pieterburen, Pieterburen, Netherlands; Language and Cognition Department, Max Planck Institute for Psycholinguistics, Nijmegen, Netherlands
- Giosuè Baggio
- Language Acquisition and Language Processing Lab, Department of Language and Literature, Norwegian University of Science and Technology, Trondheim, Norway
|
18
|
Abstract
The cocktail party problem requires listeners to infer individual sound sources from mixtures of sound. The problem can be solved only by leveraging regularities in natural sound sources, but little is known about how such regularities are internalized. We explored whether listeners learn source "schemas" (the abstract structure shared by different occurrences of the same type of sound source) and use them to infer sources from mixtures. We measured the ability of listeners to segregate mixtures of time-varying sources. In each experiment a subset of trials contained schema-based sources generated from a common template by transformations (transposition and time dilation) that introduced acoustic variation but preserved abstract structure. Across several tasks and classes of sound sources, schema-based sources consistently aided source separation, in some cases producing rapid improvements in performance over the first few exposures to a schema. Learning persisted across blocks that did not contain the learned schema, and listeners were able to learn and use multiple schemas simultaneously. No learning was evident when schemas were presented in the task-irrelevant (i.e., distractor) source. However, learning from task-relevant stimuli showed signs of being implicit, in that listeners were no more likely to report that sources recurred in experiments containing schema-based sources than in control experiments containing no schema-based sources. The results implicate a mechanism for rapidly internalizing abstract sound structure, facilitating accurate perceptual organization of sound sources that recur in the environment.
|
19
|
Denham SL, Winkler I. Predictive coding in auditory perception: challenges and unresolved questions. Eur J Neurosci 2018; 51:1151-1160. [PMID: 29250827 DOI: 10.1111/ejn.13802] [Citation(s) in RCA: 26] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/31/2017] [Revised: 09/03/2017] [Accepted: 11/20/2017] [Indexed: 11/30/2022]
Abstract
Predictive coding is arguably the currently dominant theoretical framework for the study of perception. It has been employed to explain important auditory perceptual phenomena, and it has inspired theoretical, experimental and computational modelling efforts aimed at describing how the auditory system parses the complex sound input into meaningful units (auditory scene analysis). These efforts have uncovered some vital questions, addressing which could help to further specify predictive coding and clarify some of its basic assumptions. The goal of the current review is to motivate these questions and show how unresolved issues in explaining some auditory phenomena lead to general questions of the theoretical framework. We focus on experimental and computational modelling issues related to sequential grouping in auditory scene analysis (auditory pattern detection and bistable perception), as we believe that this is the research topic where predictive coding has the highest potential for advancing our understanding. In addition to specific questions, our analysis led us to identify three more general questions that require further clarification: (1) What exactly is meant by prediction in predictive coding? (2) What governs which generative models make the predictions? and (3) What (if it exists) is the correlate of perceptual experience within the predictive coding framework?
Affiliation(s)
- Susan L Denham
- School of Psychology, University of Plymouth, Drake Circus, Plymouth, PL4 8AA, UK
- István Winkler
- Institute of Cognitive Neuroscience and Psychology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Budapest, Hungary
|
20
|
Aggelopoulos NC, Deike S, Selezneva E, Scheich H, Brechmann A, Brosch M. Predictive cues for auditory stream formation in humans and monkeys. Eur J Neurosci 2017; 51:1254-1264. [PMID: 29250854 DOI: 10.1111/ejn.13808] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/16/2017] [Revised: 12/12/2017] [Accepted: 12/12/2017] [Indexed: 11/27/2022]
Abstract
Auditory perception is improved when stimuli are predictable, and this effect is evident in a modulation of the activity of neurons in the auditory cortex, as shown previously. Human listeners can better predict the presence of duration deviants embedded in stimulus streams with fixed interonset interval (isochrony) and repeated duration pattern (regularity), and neurons in the auditory cortex of macaque monkeys have stronger sustained responses in the 60-140 ms post-stimulus time window under these conditions. Subsequently, the question has arisen whether isochrony or regularity in the sensory input contributed to the enhancement of the neuronal and behavioural responses. We therefore varied the two factors, isochrony and regularity, independently and measured both the ability of human subjects to detect deviants embedded in these sequences and the responses of neurons in the primary auditory cortex of macaque monkeys during presentation of the sequences. The performance of humans in detecting deviants was significantly increased by regularity. Isochrony enhanced detection only in the presence of the regularity cue. In monkeys, regularity increased the sustained component of neuronal tone responses in auditory cortex, while isochrony had no consistent effect. Although both regularity and isochrony can be considered parameters that make a sequence of sounds more predictable, our results from the human and monkey experiments converge in that regularity has a greater influence on behavioural performance and neuronal responses.
Affiliation(s)
- Nikolaos C Aggelopoulos
- Special Lab of Primate Neurobiology, Leibniz Institute for Neurobiology, Brenneckestr. 6, 39118, Magdeburg, Germany
- Susann Deike
- Special Lab Non-invasive Brain Imaging, Leibniz Institute for Neurobiology, Magdeburg, Germany
- Elena Selezneva
- Special Lab of Primate Neurobiology, Leibniz Institute for Neurobiology, Brenneckestr. 6, 39118, Magdeburg, Germany
- Henning Scheich
- Emeritus Group Lifelong Learning, Leibniz Institute for Neurobiology, Magdeburg, Germany; Center for Behavioral Brain Sciences, Otto-von-Guericke-University, Magdeburg, Germany
- André Brechmann
- Special Lab Non-invasive Brain Imaging, Leibniz Institute for Neurobiology, Magdeburg, Germany; Center for Behavioral Brain Sciences, Otto-von-Guericke-University, Magdeburg, Germany
- Michael Brosch
- Special Lab of Primate Neurobiology, Leibniz Institute for Neurobiology, Brenneckestr. 6, 39118, Magdeburg, Germany; Center for Behavioral Brain Sciences, Otto-von-Guericke-University, Magdeburg, Germany
|
21
|
Farkas D, Denham SL, Winkler I. Functional brain networks underlying idiosyncratic switching patterns in multi-stable auditory perception. Neuropsychologia 2017; 108:82-91. [PMID: 29197502 DOI: 10.1016/j.neuropsychologia.2017.11.032] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/02/2017] [Revised: 11/15/2017] [Accepted: 11/27/2017] [Indexed: 11/28/2022]
Abstract
In perceptual multi-stability, perception stochastically switches between alternative interpretations of the stimulus allowing examination of perceptual experience independent of stimulus parameters. Previous studies found that listeners show temporally stable idiosyncratic switching patterns when listening to a multi-stable auditory stimulus, such as in the auditory streaming paradigm. This inter-individual variability can be described along two dimensions, Exploration and Segregation. In the current study, we explored the functional brain networks associated with these dimensions and their constituents using electroencephalography. Results showed that Segregation and its constituents are related to brain networks operating in the theta EEG band, whereas Exploration and its constituents are related to networks in the lower and upper alpha and beta bands. Thus, the dimensions on which individuals' perception differ from each other in the auditory streaming paradigm probably reflect separate perceptual processes in the human brain. Further, the results suggest that networks mainly located in left auditory areas underlie the perception of integration, whereas perceiving the alternative patterns is accompanied by stronger interhemispheric connections.
Affiliation(s)
- Dávid Farkas
- Institute of Cognitive Neuroscience and Psychology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Magyar tudósok körútja 2, H-1117 Budapest, Hungary; Department of Cognitive Science, Faculty of Natural Sciences, Budapest University of Technology and Economics, Egry József utca 1, H-1111 Budapest, Hungary.
- Susan L Denham
- Cognition Institute and School of Psychology, University of Plymouth, Drake Circus, PL4 8AA Plymouth, United Kingdom
- István Winkler
- Institute of Cognitive Neuroscience and Psychology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Magyar tudósok körútja 2, H-1117 Budapest, Hungary
|
22
|
Ono K, Yamasaki D, Altmann CF, Mima T. The effect of illusionary perception on mismatch negativity (MMN): An electroencephalography study. Hear Res 2017; 356:87-92. [PMID: 29074265 DOI: 10.1016/j.heares.2017.10.006] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 07/07/2017] [Revised: 10/10/2017] [Accepted: 10/15/2017] [Indexed: 10/18/2022]
Abstract
Mismatch negativity (MMN) is a unique brain response elicited by any discernible change of features in a tone sequence. Although the occurrence of MMN depends upon a difference in a stimulus parameter, such as frequency or intensity, recent studies have suggested that MMN occurs as a result of a comparison between an internal representation created by perception and an incoming tone. The present study aimed to investigate whether MMN occurs based upon the physical properties of stimuli or as a result of the perception of the scale illusion. A scale illusion occurs during presentation of ascending and descending musical scales between C4 and C5. The tones of these scales are presented to the right and left ear alternately using a dichotic listening paradigm. Although the ascending/descending sequences are alternated between ears after each tone, we perceive the illusion of progressively ascending/descending tones as being separated by ear. The experiment was designed as an oddball task using the illusionary sequence and three different types of tone sequences as control conditions. Brain responses to these sequences and to infrequently presented deviants were measured using electroencephalography (EEG). All of the control sequences showed MMN in response to the deviant. However, the illusionary sequence did not result in a significant MMN. These results suggest that, in the case of the scale illusion, the occurrence of MMN is based upon the representation of tones created by perception, not upon the physical properties of a tone sequence.
Affiliation(s)
- Kentaro Ono
- Center of KANSEI Innovation, Hiroshima University, Japan; Graduate School of Core Ethics and Frontier Sciences, Ritsumeikan University, Japan.
- Daiki Yamasaki
- Department of Psychology, Graduate School of Letters, Kyoto University, Japan
- Christian F Altmann
- Center of Medical Education and Human Brain Research Center, Graduate School of Medicine, Kyoto University, Japan
- Tatsuya Mima
- Graduate School of Core Ethics and Frontier Sciences, Ritsumeikan University, Japan
|
23
|
McWalter R, Dau T. Cascaded Amplitude Modulations in Sound Texture Perception. Front Neurosci 2017; 11:485. [PMID: 28955191 PMCID: PMC5601004 DOI: 10.3389/fnins.2017.00485] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/29/2017] [Accepted: 08/15/2017] [Indexed: 11/25/2022] Open
Abstract
Sound textures, such as crackling fire or chirping crickets, represent a broad class of sounds defined by their homogeneous temporal structure. It has been suggested that the perception of texture is mediated by time-averaged summary statistics measured from early auditory representations. In this study, we investigated the perception of sound textures that contain rhythmic structure, specifically second-order amplitude modulations that arise from the interaction of different modulation rates, previously described as “beating” in the envelope-frequency domain. We developed an auditory texture model that utilizes a cascade of modulation filterbanks that capture the structure of simple rhythmic patterns. The model was examined in a series of psychophysical listening experiments using synthetic sound textures—stimuli generated using time-averaged statistics measured from real-world textures. In a texture identification task, our results indicated that second-order amplitude modulation sensitivity enhanced recognition. Next, we examined the contribution of the second-order modulation analysis in a preference task, where the proposed auditory texture model was preferred over a range of model deviants that lacked second-order modulation rate sensitivity. Lastly, the discriminability of textures that included second-order amplitude modulations appeared to be perceived using a time-averaging process. Overall, our results demonstrate that the inclusion of second-order modulation analysis generates improvements in the perceived quality of synthetic textures compared to the first-order modulation analysis considered in previous approaches.
Affiliation(s)
- Richard McWalter
- Hearing Systems Group, Technical University of Denmark, Kongens Lyngby, Denmark
- Torsten Dau
- Hearing Systems Group, Technical University of Denmark, Kongens Lyngby, Denmark
|
24
|
Stachurski M, Summers RJ, Roberts B. Stream segregation of concurrent speech and the verbal transformation effect: Influence of fundamental frequency and lateralization cues. Hear Res 2017; 354:16-27. [PMID: 28843209 DOI: 10.1016/j.heares.2017.07.016] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 05/04/2017] [Revised: 07/25/2017] [Accepted: 07/31/2017] [Indexed: 10/19/2022]
Abstract
Repeating a recorded word produces verbal transformations (VTs); perceptual regrouping of acoustic-phonetic elements may contribute to this effect. The influence of fundamental frequency (F0) and lateralization grouping cues was explored by presenting two concurrent sequences of the same word resynthesized on different F0s (100 and 178 Hz). In experiment 1, listeners monitored both sequences simultaneously, reporting for each any change in stimulus identity. Three lateralization conditions were used - diotic, ±680-μs interaural time difference, and dichotic. Results were similar for the first two conditions, but fewer forms and later initial transformations were reported in the dichotic condition. This suggests that large lateralization differences per se have little effect - rather, there are more possibilities for regrouping when each ear receives both sequences. In the dichotic condition, VTs reported for one sequence were also more independent of those reported for the other. Experiment 2 used diotic stimuli and explored the effect of the number of sequences presented and monitored. The most forms and earliest transformations were reported when two sequences were presented but only one was monitored, indicating that high task demands decreased reporting of VTs for concurrent sequences. Overall, these findings support the idea that perceptual regrouping contributes to the VT effect.
Affiliation(s)
- Marcin Stachurski
- Psychology, School of Life and Health Sciences, Aston University, Birmingham, B4 7ET, UK
- Robert J Summers
- Psychology, School of Life and Health Sciences, Aston University, Birmingham, B4 7ET, UK
- Brian Roberts
- Psychology, School of Life and Health Sciences, Aston University, Birmingham, B4 7ET, UK.
|
25
|
Normal Aging Slows Spontaneous Switching in Auditory and Visual Bistability. Neuroscience 2017; 389:152-160. [PMID: 28479403 DOI: 10.1016/j.neuroscience.2017.04.040] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/24/2017] [Accepted: 04/26/2017] [Indexed: 11/20/2022]
Abstract
Age-related changes in auditory and visual perception have an impact on the quality of life. It has been debated how perceptual organization is influenced by advancing age. From the neurochemical perspective, we investigated age effects on auditory and visual bistability. In perceptual bistability, a sequence of sensory inputs induces spontaneous switching between different perceptual objects. We used tasks in different modalities: auditory streaming and visual plaids. Young and middle-aged participants (20-60 years) were instructed to indicate by a button press whenever their perception changed from one stable state to the other. The number of perceptual switches decreased with participants' ages. We employed magnetic resonance spectroscopy to non-invasively measure concentrations of the inhibitory neurotransmitter γ-aminobutyric acid (GABA) in the brain regions of interest. When participants were asked to voluntarily modulate their perception, the amount of effective volitional control was positively correlated with the GABA concentration in the auditory and motion-sensitive areas corresponding to each sensory modality. However, no correlation was found in the prefrontal cortex and anterior cingulate cortex. In addition, effective volitional control was reduced with advancing age. Our results suggest that sequential scene analysis in auditory and visual domains is influenced by both age-related and neurochemical factors.
|
26
|
Shen Y, Pearson DV. Recognition of synthesized vowel sequences in steady-state and sinusoidally amplitude-modulated noises. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2017; 141:1835. [PMID: 28372131 PMCID: PMC5871221 DOI: 10.1121/1.4978060] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 07/28/2016] [Revised: 02/21/2017] [Accepted: 02/22/2017] [Indexed: 05/28/2023]
Abstract
Modulation masking is known to impact speech intelligibility, but it is not clear whether the mechanism underlying this phenomenon is an invariant, bottom-up process, or whether it is subject to factors such as perceptual segregation and stimulus uncertainty, thereby showing a top-down component. In the main experiment of the current study (Exp. II), listeners' ability to recognize sequences of synthesized vowels (i.e., the target) in sinusoidally amplitude-modulated noises (i.e., the masker) was evaluated. The target and masker were designed to be perceptually distinct to limit the top-down component of modulation masking. The duration of each vowel was either 25 or 100 ms, the rate at which the vowels were presented was either 1 or 6 Hz, and the masker modulation rate was varied between 0.5 and 16 Hz. The selective performance degradation when the target and masker modulation spectra overlap, as would be expected from modulation masking, was not observed. In addition, these results could be adequately captured using a model of energetic masking without any modulation processing stages, fitted only using the vowel-recognition performance in steady-state maskers, as obtained from Exp. I. Results suggest that speech modulation masking might not be mediated through an early-sensory mechanism.
Affiliation(s)
- Yi Shen
- Department of Speech and Hearing Sciences, Indiana University Bloomington, Bloomington, Indiana 47405, USA
- Dylan V Pearson
- Department of Speech and Hearing Sciences, Indiana University Bloomington, Bloomington, Indiana 47405, USA
|
27
|
Kondo HM, Farkas D, Denham SL, Asai T, Winkler I. Auditory multistability and neurotransmitter concentrations in the human brain. Philos Trans R Soc Lond B Biol Sci 2017; 372:rstb.2016.0110. [PMID: 28044020 DOI: 10.1098/rstb.2016.0110] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 08/08/2016] [Indexed: 11/12/2022] Open
Abstract
Multistability in perception is a powerful tool for investigating sensory-perceptual transformations, because it produces dissociations between sensory inputs and subjective experience. Spontaneous switching between different perceptual objects occurs during prolonged listening to a sound sequence of tone triplets or repeated words (termed auditory streaming and verbal transformations, respectively). We used these examples of auditory multistability to examine to what extent neurochemical and cognitive factors influence the observed idiosyncratic patterns of switching between perceptual objects. The concentrations of glutamate-glutamine (Glx) and γ-aminobutyric acid (GABA) in brain regions were measured by magnetic resonance spectroscopy, while personality traits and executive functions were assessed using questionnaires and response inhibition tasks. Idiosyncratic patterns of perceptual switching in the two multistable stimulus configurations were identified using a multidimensional scaling (MDS) analysis. Intriguingly, although switching patterns within each individual differed between auditory streaming and verbal transformations, similar MDS dimensions were extracted separately from the two datasets. Individual switching patterns were significantly correlated with Glx and GABA concentrations in auditory cortex and inferior frontal cortex, but not with the personality traits and executive functions. Our results suggest that auditory perceptual organization depends on the balance between neural excitation and inhibition in different brain regions. This article is part of the themed issue 'Auditory and visual scene analysis'.
Affiliation(s)
- Hirohito M Kondo
- Human Information Science Laboratory, NTT Communication Science Laboratories, NTT Corporation, Atsugi, Kanagawa 243-0198, Japan
- Dávid Farkas
- Institute of Cognitive Neuroscience and Psychology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Magyar Tudósok körútja 2, 1117 Budapest, Hungary; Department of Cognitive Science, Faculty of Natural Sciences, Budapest University of Technology and Economics, Egry József utca 1, 1111 Budapest, Hungary
- Susan L Denham
- Cognition Institute and School of Psychology, University of Plymouth, Plymouth, Devon PL4 8AA, UK
- Tomohisa Asai
- Human Information Science Laboratory, NTT Communication Science Laboratories, NTT Corporation, Atsugi, Kanagawa 243-0198, Japan
- István Winkler
- Institute of Cognitive Neuroscience and Psychology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Magyar Tudósok körútja 2, 1117 Budapest, Hungary
|
28
|
Southwell R, Baumann A, Gal C, Barascud N, Friston K, Chait M. Is predictability salient? A study of attentional capture by auditory patterns. Philos Trans R Soc Lond B Biol Sci 2017; 372:rstb.2016.0105. [PMID: 28044016 PMCID: PMC5206273 DOI: 10.1098/rstb.2016.0105] [Citation(s) in RCA: 66] [Impact Index Per Article: 9.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 07/28/2016] [Indexed: 01/08/2023] Open
Abstract
In this series of behavioural and electroencephalography (EEG) experiments, we investigate the extent to which repeating patterns of sounds capture attention. Work in the visual domain has revealed attentional capture by statistically predictable stimuli, consistent with predictive coding accounts which suggest that attention is drawn to sensory regularities. Here, stimuli comprised rapid sequences of tone pips, arranged in regular (REG) or random (RAND) patterns. EEG data demonstrate that the brain rapidly recognizes predictable patterns manifested as a rapid increase in responses to REG relative to RAND sequences. This increase is reminiscent of the increase in gain on neural responses to attended stimuli often seen in the neuroimaging literature, and thus consistent with the hypothesis that predictable sequences draw attention. To study potential attentional capture by auditory regularities, we used REG and RAND sequences in two different behavioural tasks designed to reveal effects of attentional capture by regularity. Overall, the pattern of results suggests that regularity does not capture attention. This article is part of the themed issue ‘Auditory and visual scene analysis’.
Affiliation(s)
- Rosy Southwell: Ear Institute, University College London, London WC1X 8EE, UK
- Anna Baumann: Ear Institute, University College London, London WC1X 8EE, UK
- Cécile Gal: Ear Institute, University College London, London WC1X 8EE, UK
- Karl Friston: Wellcome Trust Centre for Neuroimaging, University College London, London WC1N 3BG, UK
- Maria Chait: Ear Institute, University College London, London WC1X 8EE, UK
29
Thomassen S, Bendixen A. Subjective perceptual organization of a complex auditory scene. J Acoust Soc Am 2017; 141:265. [PMID: 28147594 DOI: 10.1121/1.4973806]
Abstract
Empirical research on the sequential decomposition of an auditory scene primarily relies on interleaved sound mixtures of only two tone sequences (e.g., ABAB…). This oversimplifies the sound decomposition problem by limiting the number of putative perceptual organizations. The current study used a sound mixture composed of three different tones (ABCABC…) that could be perceptually organized in many different ways. Participants listened to these sequences and reported their subjective perception by continuously choosing one out of 12 visually presented perceptual organization alternatives. Different levels of frequency and spatial separation were implemented to check whether participants' perceptual reports would be systematic and plausible. As hypothesized, while perception switched back and forth in each condition between various perceptual alternatives (multistability), spatial as well as frequency separation generally raised the proportion of segregated and reduced the proportion of integrated alternatives. During segregated percepts, in contrast to the hypothesis, many participants had a tendency to perceive two streams in the foreground, rather than reporting alternatives with a clear foreground-background differentiation. Finally, participants perceived the organization with intermediate feature values (e.g., middle tones of the pattern) segregated in the foreground slightly less often than similar alternatives with outer feature values (e.g., higher tones).
Affiliation(s)
- Sabine Thomassen: Auditory Psychophysiology Lab, Department of Psychology, Carl von Ossietzky University of Oldenburg, Ammerländer Heerstrasse 114-118, D-26129 Oldenburg, Germany
- Alexandra Bendixen: Auditory Psychophysiology Lab, Department of Psychology, Carl von Ossietzky University of Oldenburg, Ammerländer Heerstrasse 114-118, D-26129 Oldenburg, Germany
30
Szabó BT, Denham SL, Winkler I. Computational Models of Auditory Scene Analysis: A Review. Front Neurosci 2016; 10:524. [PMID: 27895552 PMCID: PMC5108797 DOI: 10.3389/fnins.2016.00524]
Abstract
Auditory scene analysis (ASA) refers to the process(es) of parsing the complex acoustic input into auditory perceptual objects representing either physical sources or temporal sound patterns, such as melodies, which contributed to the sound waves reaching the ears. A number of new computational models accounting for some of the perceptual phenomena of ASA have been published recently. Here we provide a theoretically motivated review of these computational models, aiming to relate their guiding principles to the central issues of the theoretical framework of ASA. Specifically, we ask how they achieve the grouping and separation of sound elements and whether they implement some form of competition between alternative interpretations of the sound input. We consider the extent to which they include predictive processes, as important current theories suggest that perception is inherently predictive, and also how they have been evaluated. We conclude that current computational models of ASA are fragmentary in the sense that rather than providing general competing interpretations of ASA, they focus on assessing the utility of specific processes (or algorithms) for finding the causes of the complex acoustic signal. This leaves open the possibility for integrating complementary aspects of the models into a more comprehensive theory of ASA.
Affiliation(s)
- Beáta T Szabó: Faculty of Information Technology and Bionics, Pázmány Péter Catholic University, Budapest, Hungary; Institute of Cognitive Neuroscience and Psychology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Budapest, Hungary
- Susan L Denham: School of Psychology, University of Plymouth, Plymouth, UK
- István Winkler: Institute of Cognitive Neuroscience and Psychology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Budapest, Hungary
31
Tóth B, Kocsis Z, Háden GP, Szerafin Á, Shinn-Cunningham BG, Winkler I. EEG signatures accompanying auditory figure-ground segregation. Neuroimage 2016; 141:108-119. [PMID: 27421185 PMCID: PMC5656226 DOI: 10.1016/j.neuroimage.2016.07.028]
Abstract
In everyday acoustic scenes, figure-ground segregation typically requires one to group together sound elements over both time and frequency. Electroencephalogram was recorded while listeners detected repeating tonal complexes composed of a random set of pure tones within stimuli consisting of randomly varying tonal elements. The repeating pattern was perceived as a figure over the randomly changing background. It was found that detection performance improved both as the number of pure tones making up each repeated complex (figure coherence) increased, and as the number of repeated complexes (duration) increased - i.e., detection was easier when either the spectral or temporal structure of the figure was enhanced. Figure detection was accompanied by the elicitation of the object related negativity (ORN) and the P400 event-related potentials (ERPs), which have been previously shown to be evoked by the presence of two concurrent sounds. Both ERP components had generators within and outside of auditory cortex. The amplitudes of the ORN and the P400 increased with both figure coherence and figure duration. However, only the P400 amplitude correlated with detection performance. These results suggest that 1) the ORN and P400 reflect processes involved in detecting the emergence of a new auditory object in the presence of other concurrent auditory objects; 2) the ORN corresponds to the likelihood of the presence of two or more concurrent sound objects, whereas the P400 reflects the perceptual recognition of the presence of multiple auditory objects and/or preparation for reporting the detection of a target object.
Affiliation(s)
- Brigitta Tóth: Institute of Cognitive Neuroscience and Psychology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Budapest, Hungary; Center for Computational Neuroscience and Neural Technology, Boston University, Boston, USA
- Zsuzsanna Kocsis: Institute of Cognitive Neuroscience and Psychology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Budapest, Hungary; Department of Cognitive Science, Faculty of Natural Sciences, Budapest University of Technology and Economics, Budapest, Hungary
- Gábor P Háden: Institute of Cognitive Neuroscience and Psychology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Budapest, Hungary
- Ágnes Szerafin: Institute of Cognitive Neuroscience and Psychology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Budapest, Hungary; Department of Cognitive Science, Faculty of Natural Sciences, Budapest University of Technology and Economics, Budapest, Hungary
- István Winkler: Institute of Cognitive Neuroscience and Psychology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Budapest, Hungary; Department of Cognitive and Neuropsychology, Institute of Psychology, University of Szeged, Szeged, Hungary
32
Chang AC, Lutfi R, Lee J, Heo I. A Detection-Theoretic Analysis of Auditory Streaming and Its Relation to Auditory Masking. Trends Hear 2016; 20:2331216516664343. [PMID: 27641681 PMCID: PMC5029798 DOI: 10.1177/2331216516664343]
Abstract
Research on hearing has long been challenged with understanding our exceptional ability to hear out individual sounds in a mixture (the so-called cocktail party problem). Two general approaches to the problem have been taken using sequences of tones as stimuli. The first has focused on our tendency to hear sequences, sufficiently separated in frequency, split into separate cohesive streams (auditory streaming). The second has focused on our ability to detect a change in one sequence, ignoring all others (auditory masking). The two phenomena are clearly related, but that relation has never been evaluated analytically. This article offers a detection-theoretic analysis of the relation between multitone streaming and masking that underscores the expected similarities and differences between these phenomena and the predicted outcome of experiments in each case. The key to establishing this relation is the function linking performance to the information divergence of the tone sequences, DKL (a measure of the statistical separation of their parameters). A strong prediction is that streaming and masking of tones will be a common function of DKL provided that the statistical properties of sequences are symmetric. Results of experiments are reported supporting this prediction.
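The linking quantity in this analysis is the information divergence D_KL between the parameter distributions of the two tone sequences. As a hedged illustration (the function name and parameter values below are assumptions for the sketch, not taken from the article), the closed-form divergence between two Gaussian frequency distributions can be computed like this:

```python
import math

def gaussian_kl(mu_a, sigma_a, mu_b, sigma_b):
    """Closed-form D_KL(N(mu_a, sigma_a^2) || N(mu_b, sigma_b^2))."""
    return (math.log(sigma_b / sigma_a)
            + (sigma_a ** 2 + (mu_a - mu_b) ** 2) / (2 * sigma_b ** 2)
            - 0.5)

# Tone frequencies (Hz) drawn from Gaussians: greater statistical separation
# of the A and B sequences predicts easier streaming and less masking.
print(gaussian_kl(1000.0, 50.0, 1200.0, 50.0))  # -> 8.0 (well separated)
print(gaussian_kl(1000.0, 50.0, 1050.0, 50.0))  # -> 0.5 (overlapping)
```

The article's prediction is that streaming and masking performance are a common function of such a separation measure; the sketch only shows how the measure behaves as the distributions move apart.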
Affiliation(s)
- An-Chieh Chang: Department of Communication Sciences and Disorders, University of Wisconsin-Madison, WI, USA
- Robert Lutfi: Department of Communication Sciences and Disorders, University of Wisconsin-Madison, WI, USA
- Jungmee Lee: Department of Communication Sciences and Disorders, University of Wisconsin-Madison, WI, USA
- Inseok Heo: Department of Electrical and Computer Engineering, University of Wisconsin-Madison, WI, USA
33
Farkas D, Denham SL, Bendixen A, Tóth D, Kondo HM, Winkler I. Auditory Multi-Stability: Idiosyncratic Perceptual Switching Patterns, Executive Functions and Personality Traits. PLoS One 2016; 11:e0154810. [PMID: 27135945 PMCID: PMC4852918 DOI: 10.1371/journal.pone.0154810]
Abstract
Multi-stability refers to the phenomenon of perception stochastically switching between possible interpretations of an unchanging stimulus. Despite considerable variability, individuals show stable idiosyncratic patterns of switching between alternative perceptions in the auditory streaming paradigm. We explored correlates of the individual switching patterns with executive functions, personality traits, and creativity. The main dimensions on which individual switching patterns differed from each other were identified using multidimensional scaling. Individuals with high scores on the dimension explaining the largest portion of the inter-individual variance switched more often between the alternative perceptions than those with low scores. They also perceived the most unusual interpretation more often, and experienced all perceptual alternatives with a shorter delay from stimulus onset. The ego-resiliency personality trait, which reflects a tendency for adaptive flexibility and experience seeking, was significantly positively related to this dimension. Taking these results together we suggest that this dimension may reflect the individual's tendency for exploring the auditory environment. Executive functions were significantly related to some of the variables describing global properties of the switching patterns, such as the average number of switches. Thus individual patterns of perceptual switching in the auditory streaming paradigm are related to some personality traits and executive functions.
Affiliation(s)
- Dávid Farkas: Institute of Cognitive Neuroscience and Psychology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Budapest, Hungary; Department of Cognitive Science, Faculty of Natural Sciences, Budapest University of Technology and Economics, Budapest, Hungary
- Susan L. Denham: Cognition Institute and School of Psychology, University of Plymouth, Plymouth, United Kingdom
- Alexandra Bendixen: School of Natural Sciences, Chemnitz University of Technology, Chemnitz, Germany
- Dénes Tóth: Institute of Cognitive Neuroscience and Psychology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Budapest, Hungary
- Hirohito M. Kondo: Human Information Science Laboratory, NTT Communication Science Laboratories, NTT Corporation, Atsugi, Japan
- István Winkler: Institute of Cognitive Neuroscience and Psychology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Budapest, Hungary
34
Farkas D, Denham SL, Bendixen A, Winkler I. Assessing the validity of subjective reports in the auditory streaming paradigm. J Acoust Soc Am 2016; 139:1762. [PMID: 27106324 DOI: 10.1121/1.4945720]
Abstract
While subjective reports provide a direct measure of perception, their validity is not self-evident. Here, the authors tested three possible biasing effects on perceptual reports in the auditory streaming paradigm: errors due to imperfect understanding of the instructions, voluntary perceptual biasing, and susceptibility to implicit expectations. (1) Analysis of the responses to catch trials separately promoting each of the possible percepts allowed the authors to exclude participants who likely have not fully understood the instructions. (2) Explicit biasing instructions led to markedly different behavior than the conventional neutral-instruction condition, suggesting that listeners did not voluntarily bias their perception in a systematic way under the neutral instructions. Comparison with a random response condition further supported this conclusion. (3) No significant relationship was found between social desirability, a scale-based measure of susceptibility to implicit social expectations, and any of the perceptual measures extracted from the subjective reports. This suggests that listeners did not significantly bias their perceptual reports due to possible implicit expectations present in the experimental context. In sum, these results suggest that valid perceptual data can be obtained from subjective reports in the auditory streaming paradigm.
Affiliation(s)
- Dávid Farkas: Institute of Cognitive Neuroscience and Psychology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Budapest, Hungary
- Susan L Denham: Cognition Institute and School of Psychology, University of Plymouth, Plymouth, United Kingdom
- Alexandra Bendixen: School of Natural Sciences, Chemnitz University of Technology, Chemnitz, Germany
- István Winkler: Institute of Cognitive Neuroscience and Psychology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Budapest, Hungary
35
Chang AC, Lutfi RA, Lee J. Auditory streaming of tones of uncertain frequency, level, and duration. J Acoust Soc Am 2015; 138:EL504-EL508. [PMID: 26723358 PMCID: PMC4676779 DOI: 10.1121/1.4936981]
Abstract
Stimulus uncertainty is known to critically affect auditory masking, but its influence on auditory streaming has been largely ignored. Standard ABA-ABA tone sequences were made increasingly uncertain by increasing the sigma of normal distributions from which the frequency, level, or duration of tones were randomly drawn. Consistent with predictions based on a model of masking by Lutfi, Gilbertson, Chang, and Stamas [J. Acoust. Soc. Am. 134, 2160-2170 (2013)], the frequency difference for which A and B tones formed separate streams increased as a linear function of sigma in tone frequency but was much less affected by sigma in tone level or duration.
Affiliation(s)
- An-Chieh Chang: Auditory Behavioral Research Lab, Department of Communicative Disorders, University of Wisconsin, Madison, Wisconsin 53706, USA
- Robert A Lutfi: Auditory Behavioral Research Lab, Department of Communicative Disorders, University of Wisconsin, Madison, Wisconsin 53706, USA
- Jungmee Lee: Auditory Behavioral Research Lab, Department of Communicative Disorders, University of Wisconsin, Madison, Wisconsin 53706, USA
36
Rankin J, Sussman E, Rinzel J. Neuromechanistic Model of Auditory Bistability. PLoS Comput Biol 2015; 11:e1004555. [PMID: 26562507 PMCID: PMC4642990 DOI: 10.1371/journal.pcbi.1004555]
Abstract
Sequences of higher frequency A and lower frequency B tones repeating in an ABA- triplet pattern are widely used to study auditory streaming. One may experience either an integrated percept, a single ABA-ABA- stream, or a segregated percept, separate but simultaneous streams A-A-A-A- and -B---B--. During minutes-long presentations, subjects may report irregular alternations between these interpretations. We combine neuromechanistic modeling and psychoacoustic experiments to study these persistent alternations and to characterize the effects of manipulating stimulus parameters. Unlike many phenomenological models with abstract, percept-specific competition and fixed inputs, our network model comprises neuronal units with sensory feature dependent inputs that mimic the pulsatile-like A1 responses to tones in the ABA- triplets. It embodies a neuronal computation for percept competition thought to occur beyond primary auditory cortex (A1). Mutual inhibition, adaptation and noise are implemented. We include slow NMDA recurrent excitation for local temporal memory that enables linkage across sound gaps from one triplet to the next. Percepts in our model are identified in the firing patterns of the neuronal units. We predict with the model that manipulations of the frequency difference between tones A and B should affect the dominance durations of the stronger percept, the one dominant a larger fraction of time, more than those of the weaker percept, a property that has been previously established and generalized across several visual bistable paradigms. We confirm the qualitative prediction with our psychoacoustic experiments and use the behavioral data to further constrain and improve the model, achieving quantitative agreement between experimental and modeling results.
Humans have an astonishing ability to separate out different sound sources in a busy room: think of how we can hear individual voices in a bustling coffee shop. Rather than voices, we use sound stimuli in the lab: repeating patterns of high and low tones. The tone sequences are ambiguous and can be interpreted in different ways: either grouped into a single stream, or separated out into different streams. When listening for a long time, one's perception switches every few seconds, a phenomenon called auditory bistability. Based on knowledge of the organization of brain areas involved in separating out different sound sources and how neurons in these areas respond to the ambiguous sequences, we developed a computational model of auditory bistability. Our model is less abstract than existing models and shows how groups of neurons may compete in order to dictate what you perceive. We predict how the difference between the two tone sequences affects what you hear over time, and we performed an experiment with human listeners to confirm our prediction. The model provides groundwork to further explore the way the brain deals with the busy and often ambiguous world of sound.
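The competition ingredients named in the abstract (mutual inhibition, slow adaptation, and noise) can be sketched as a toy two-population network. This is an illustrative sketch under assumed parameter values, not the authors' published implementation (which, among other things, also includes NMDA-like recurrent excitation and feature-dependent inputs):

```python
import math
import random

def simulate_bistable(T=200.0, dt=0.01, seed=1):
    """Two percept populations with mutual inhibition, slow adaptation,
    and noise; returns the number of dominance switches over T seconds."""
    rng = random.Random(seed)
    # sigmoidal firing-rate function (threshold 0.2, slope 0.05: assumed values)
    f = lambda x: 1.0 / (1.0 + math.exp(-(x - 0.2) / 0.05))
    r = [0.6, 0.4]              # activities of the two percept populations
    a = [0.0, 0.0]              # slow adaptation variables
    tau_r, tau_a = 0.1, 2.0     # fast activity, slow adaptation (seconds)
    inhibition, adapt_gain, drive = 1.1, 0.6, 0.5
    noise = 0.05                # per-step noise (a sketch, not a formal SDE)
    dominant, switches = 0, 0
    for _ in range(int(T / dt)):
        for i in (0, 1):
            j = 1 - i
            inp = (drive - inhibition * r[j] - adapt_gain * a[i]
                   + noise * rng.gauss(0.0, 1.0))
            r[i] += dt / tau_r * (-r[i] + f(inp))
            a[i] += dt / tau_a * (-a[i] + r[i])
        current = 0 if r[0] > r[1] else 1
        if current != dominant:
            switches += 1
            dominant = current
    return switches
```

With these assumed parameters, adaptation slowly undermines whichever population is dominant, so dominance alternates every few seconds; the sketch reproduces the qualitative alternation only and makes no claim to the paper's quantitative fits.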
Affiliation(s)
- James Rankin: Center for Neural Science, New York University, New York, New York, United States of America
- Elyse Sussman: Dominick P. Purpura Department of Neuroscience, Albert Einstein College of Medicine, Bronx, New York, United States of America; Department of Otorhinolaryngology-HNS, Albert Einstein College of Medicine, Bronx, New York, United States of America
- John Rinzel: Center for Neural Science, New York University, New York, New York, United States of America; Courant Institute of Mathematical Sciences, New York University, New York, New York, United States of America
37
Winkler I, Schröger E. Auditory perceptual objects as generative models: Setting the stage for communication by sound. Brain Lang 2015; 148:1-22. [PMID: 26184883 DOI: 10.1016/j.bandl.2015.05.003]
Abstract
Communication by sounds requires that the communication channels (i.e., speech/speakers and other sound sources) have been established. This allows listeners to separate concurrently active sound sources, to track their identity, to assess the type of message arriving from them, and to decide whether and when to react (e.g., reply to the message). We propose that these functions rely on a common generative model of the auditory environment. This model predicts upcoming sounds on the basis of representations describing temporal/sequential regularities. Predictions help to identify the continuation of previously discovered sound sources, to detect the emergence of new sources, and to detect changes in the behavior of known ones. The model produces auditory event representations which provide a full sensory description of the sounds, including their relation to the auditory context and the current goals of the organism. Event representations can be consciously perceived and serve as objects in various cognitive operations.
Affiliation(s)
- István Winkler: Institute of Cognitive Neuroscience and Psychology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Hungary; Institute of Psychology, University of Szeged, Hungary
- Erich Schröger: Institute for Psychology, University of Leipzig, Germany
38
Deike S, Heil P, Böckmann-Barthel M, Brechmann A. Decision making and ambiguity in auditory stream segregation. Front Neurosci 2015; 9:266. [PMID: 26321899 PMCID: PMC4531241 DOI: 10.3389/fnins.2015.00266]
Abstract
Researchers of auditory stream segregation have largely taken a bottom-up view on the link between physical stimulus parameters and the perceptual organization of sequences of ABAB sounds. However, in the majority of studies, researchers have relied on the reported decisions of the subjects regarding which of the predefined percepts (e.g., one stream or two streams) predominated when subjects listened to more or less ambiguous streaming sequences. When searching for neural mechanisms of stream segregation, it should be kept in mind that such decision processes may contribute to brain activation, as also suggested by recent human imaging data. The present study proposes that the uncertainty of a subject in making a decision about the perceptual organization of ambiguous streaming sequences may be reflected in the time required to make an initial decision. To this end, subjects had to decide on their current percept while listening to ABAB auditory streaming sequences. Each sequence had a duration of 30 s and was composed of A and B harmonic tone complexes differing in fundamental frequency (ΔF). Sequences with seven different ΔF were tested. We found that the initial decision time varied non-monotonically with ΔF and that it was significantly correlated with the degree of perceptual ambiguity defined from the proportions of time the subjects reported a one-stream or a two-stream percept subsequent to the first decision. This strong relation of the proposed measures of decision uncertainty and perceptual ambiguity should be taken into account when searching for neural correlates of auditory stream segregation.
Affiliation(s)
- Susann Deike: Special Lab Non-invasive Brain Imaging, Leibniz Institute for Neurobiology, Magdeburg, Germany
- Peter Heil: Department of Systems Physiology of Learning, Leibniz Institute for Neurobiology, Magdeburg, Germany
- Martin Böckmann-Barthel: Department of Experimental Audiology, Otto-von-Guericke-University Magdeburg, Magdeburg, Germany
- André Brechmann: Special Lab Non-invasive Brain Imaging, Leibniz Institute for Neurobiology, Magdeburg, Germany
39
Rimmele JM, Sussman E, Poeppel D. The role of temporal structure in the investigation of sensory memory, auditory scene analysis, and speech perception: a healthy-aging perspective. Int J Psychophysiol 2015; 95:175-83. [PMID: 24956028 PMCID: PMC4272684 DOI: 10.1016/j.ijpsycho.2014.06.010]
Abstract
Listening situations with multiple talkers or background noise are common in everyday communication and are particularly demanding for older adults. Here we review current research on auditory perception in aging individuals in order to gain insights into the challenges of listening under noisy conditions. Informationally rich temporal structure in auditory signals, over a range of time scales from milliseconds to seconds, renders temporal processing central to perception in the auditory domain. We discuss the role of temporal structure in auditory processing, in particular from a perspective relevant for hearing in background noise, focusing on sensory memory, auditory scene analysis, and speech perception. Interestingly, these auditory processes, usually studied in an independent manner, show considerable overlap of processing time scales, even though each has its own 'privileged' temporal regimes. By integrating perspectives on temporal structure processing in these three areas of investigation, we aim to highlight similarities typically not recognized.
Affiliation(s)
- Johanna Maria Rimmele: Department of Neurophysiology and Pathophysiology, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
- Elyse Sussman: Dominick P. Purpura Department of Neuroscience, Albert Einstein College of Medicine, Bronx, NY, United States
- David Poeppel: Department of Psychology and Center for Neural Science, New York University, New York, NY, United States; Max-Planck Institute for Empirical Aesthetics, Frankfurt, Germany
40
Rimmele JM, Zion Golumbic E, Schröger E, Poeppel D. The effects of selective attention and speech acoustics on neural speech-tracking in a multi-talker scene. Cortex 2015; 68:144-54. [PMID: 25650107 DOI: 10.1016/j.cortex.2014.12.014]
Abstract
Attending to one speaker in multi-speaker situations is challenging. One neural mechanism proposed to underlie the ability to attend to a particular speaker is phase-locking of low-frequency activity in auditory cortex to speech's temporal envelope ("speech-tracking"), which is more precise for attended speech. However, it is not known what brings about this attentional effect, and specifically whether it reflects enhanced processing of the fine structure of attended speech. To investigate this question, we compared attentional effects on speech-tracking of natural versus vocoded speech, which preserves the temporal envelope but removes the fine structure of speech. Pairs of natural and vocoded speech stimuli were presented concurrently; participants attended to one stimulus and performed a detection task while ignoring the other stimulus. We recorded magnetoencephalography (MEG) and compared attentional effects on the speech-tracking response in auditory cortex. Speech-tracking of natural, but not vocoded, speech was enhanced by attention, whereas neural tracking of ignored speech was similar for natural and vocoded speech. These findings suggest that the more precise speech-tracking of attended natural speech is related to processing its fine structure, possibly reflecting the application of higher-order linguistic processes. In contrast, when speech is unattended its fine structure is not processed to the same degree and thus elicits less precise speech-tracking, more similar to vocoded speech.
Affiliation(s)
- Johanna M Rimmele
- Department of Psychology and Center for Neural Science, New York University, New York, NY, USA; Department of Neurophysiology and Pathophysiology, University Medical Center Hamburg-Eppendorf, Hamburg, Germany.
- Elana Zion Golumbic
- Gonda Center for Brain Research, Bar Ilan University, Israel; Department of Psychiatry, Columbia University, New York, NY, USA.
- Erich Schröger
- Institute of Psychology, University of Leipzig, Leipzig, Germany.
- David Poeppel
- Department of Psychology and Center for Neural Science, New York University, New York, NY, USA; Max-Planck Institute for Empirical Aesthetics, Frankfurt, Germany.

41
Kaya EM, Elhilali M. Investigating bottom-up auditory attention. Front Hum Neurosci 2014; 8:327. [PMID: 24904367 PMCID: PMC4034154 DOI: 10.3389/fnhum.2014.00327] [Citation(s) in RCA: 44] [Impact Index Per Article: 4.4] [Received: 02/06/2014] [Accepted: 05/01/2014] [Indexed: 11/22/2022]
Abstract
Bottom-up attention is a sensory-driven selection mechanism that directs perception toward a subset of the stimulus that is considered salient, or attention-grabbing. Most studies of bottom-up auditory attention have adapted frameworks similar to visual attention models, whereby local or global “contrast” is a central concept in defining salient elements in a scene. In the current study, we take a more fundamental approach to modeling auditory attention: we provide the first examination of the space of auditory saliency spanning pitch, intensity, and timbre, and shed light on complex interactions among these features. Informed by psychoacoustic results, we develop a computational model of auditory saliency implementing a novel attentional framework, guided by processes hypothesized to take place in the auditory pathway. In particular, the model tests the hypothesis that perception tracks the evolution of sound events in a multidimensional feature space and flags any deviation from background statistics as salient. Predictions from the model corroborate the relationship between bottom-up auditory attention and statistical inference, and argue for a potential role of predictive coding as a mechanism for saliency detection in acoustic scenes.
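The central idea (flag deviations from running background statistics as salient) can be sketched in a few lines. This is a minimal illustration of deviance-based saliency, not the authors' model; the feature vectors, half-life, and z-threshold are assumed values.

```python
import numpy as np

def saliency(features, halflife=20, z_thresh=3.0):
    """Flag frames whose feature vector deviates from an exponentially
    weighted estimate of the background mean and variance."""
    alpha = 1 - 0.5 ** (1.0 / halflife)  # per-frame forgetting factor
    mean = features[0].astype(float)
    var = np.ones_like(mean)
    flags = []
    for x in features:
        z = np.abs(x - mean) / np.sqrt(var + 1e-9)
        flags.append(bool(np.max(z) > z_thresh))  # any dimension may trigger
        mean += alpha * (x - mean)                # update background statistics
        var += alpha * ((x - mean) ** 2 - var)
    return flags
```

A steady feature stream builds a tight background estimate, so a single outlying frame is flagged while the frames that follow, once the statistics have absorbed it, are not.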
Affiliation(s)
- Emine Merve Kaya
- Department of Electrical and Computer Engineering, The Johns Hopkins University, Baltimore, MD, USA
- Mounya Elhilali
- Department of Electrical and Computer Engineering, The Johns Hopkins University, Baltimore, MD, USA

42
Shestopalova L, Bőhm TM, Bendixen A, Andreou AG, Georgiou J, Garreau G, Hajdu B, Denham SL, Winkler I. Do audio-visual motion cues promote segregation of auditory streams? Front Neurosci 2014; 8:64. [PMID: 24778604 PMCID: PMC3985028 DOI: 10.3389/fnins.2014.00064] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 01/28/2014] [Accepted: 03/19/2014] [Indexed: 11/19/2022]
Abstract
An audio-visual experiment using moving sound sources was designed to investigate whether the analysis of auditory scenes is modulated by synchronous presentation of visual information. Listeners were presented with an alternating sequence of two pure tones delivered by two separate sound sources. In different conditions, the two sound sources were either stationary or moving on random trajectories around the listener. Both the sounds and the movement trajectories were derived from recordings in which two humans were moving with loudspeakers attached to their heads. Visualized movement trajectories modeled by a computer animation were presented together with the sounds. In the main experiment, behavioral reports on sound organization were collected from young healthy volunteers. The proportion and stability of the different sound organizations were compared between conditions in which the visualized trajectories matched the movement of the sound sources and conditions in which the two were independent of each other. The results corroborate earlier findings that separation of sound sources in space promotes segregation. However, no additional effect of auditory movement per se on the perceptual organization of sounds was obtained. Surprisingly, the presentation of movement-congruent visual cues did not strengthen the effects of spatial separation on segregating auditory streams. Our findings are consistent with the view that bistability in the auditory modality can occur independently from other modalities.
Affiliation(s)
- Lidia Shestopalova
- Pavlov Institute of Physiology, Russian Academy of Sciences, St. Petersburg, Russia
- Tamás M Bőhm
- Research Centre for Natural Sciences, Institute of Cognitive Neuroscience and Psychology, Hungarian Academy of Sciences, Budapest, Hungary; Department of Telecommunications and Media Informatics, Budapest University of Technology and Economics, Budapest, Hungary
- Alexandra Bendixen
- Auditory Psychophysiology Lab, Department of Psychology, Cluster of Excellence "Hearing4all", European Medical School, Carl von Ossietzky University of Oldenburg, Oldenburg, Germany
- Andreas G Andreou
- Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, MD, USA; Department of Electrical and Computer Engineering, University of Cyprus, Nicosia, Cyprus
- Julius Georgiou
- Department of Electrical and Computer Engineering, University of Cyprus, Nicosia, Cyprus
- Guillaume Garreau
- Department of Electrical and Computer Engineering, University of Cyprus, Nicosia, Cyprus
- Botond Hajdu
- Research Centre for Natural Sciences, Institute of Cognitive Neuroscience and Psychology, Hungarian Academy of Sciences, Budapest, Hungary
- Susan L Denham
- School of Psychology, Cognition Institute, University of Plymouth, Plymouth, UK
- István Winkler
- Research Centre for Natural Sciences, Institute of Cognitive Neuroscience and Psychology, Hungarian Academy of Sciences, Budapest, Hungary; Department of Cognitive and Neuropsychology, Institute of Psychology, University of Szeged, Szeged, Hungary

43
Bendixen A. Predictability effects in auditory scene analysis: a review. Front Neurosci 2014; 8:60. [PMID: 24744695 PMCID: PMC3978260 DOI: 10.3389/fnins.2014.00060] [Citation(s) in RCA: 73] [Impact Index Per Article: 7.3] [Received: 02/15/2014] [Accepted: 03/14/2014] [Indexed: 12/02/2022]
Abstract
Many sound sources emit signals in a predictable manner. The idea that predictability can be exploited to support the segregation of one source's signal emissions from the overlapping signals of other sources has been expressed for a long time. Yet experimental evidence for a strong role of predictability within auditory scene analysis (ASA) has been scarce. Recently, there has been an upsurge in experimental and theoretical work on this topic resulting from fundamental changes in our perspective on how the brain extracts predictability from series of sensory events. Based on effortless predictive processing in the auditory system, it becomes more plausible that predictability would be available as a cue for sound source decomposition. In the present contribution, empirical evidence for such a role of predictability in ASA will be reviewed. It will be shown that predictability affects ASA both when it is present in the sound source of interest (perceptual foreground) and when it is present in other sound sources that the listener wishes to ignore (perceptual background). First evidence pointing toward age-related impairments in the latter capacity will be addressed. Moreover, it will be illustrated how effects of predictability can be shown by means of objective listening tests as well as by subjective report procedures, with the latter approach typically exploiting the multi-stable nature of auditory perception. Critical aspects of study design will be delineated to ensure that predictability effects can be unambiguously interpreted. Possible mechanisms for a functional role of predictability within ASA will be discussed, and an analogy with the old-plus-new heuristic for grouping simultaneous acoustic signals will be suggested.
Affiliation(s)
- Alexandra Bendixen
- Auditory Psychophysiology Lab, Department of Psychology, Cluster of Excellence "Hearing4all," European Medical School, Carl von Ossietzky University of Oldenburg, Oldenburg, Germany

44
Szalárdy O, Bendixen A, Böhm TM, Davies LA, Denham SL, Winkler I. The effects of rhythm and melody on auditory stream segregation. J Acoust Soc Am 2014; 135:1392-1405. [PMID: 24606277 DOI: 10.1121/1.4865196] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Indexed: 06/03/2023]
Abstract
While many studies have assessed the efficacy of similarity-based cues for auditory stream segregation, much less is known about whether and how the larger-scale structure of sound sequences supports stream formation and the choice of sound organization. Two experiments investigated the effects of musical melody and rhythm on the segregation of two interleaved tone sequences. The two sets of tones fully overlapped in pitch range but differed from each other in interaural time and intensity. Unbeknownst to the listener, each of the interleaved sequences was created from the notes of a different song. In different experimental conditions, the notes and/or their timing either followed those of the songs or were scrambled or, in the case of timing, set to be isochronous. Listeners were asked to continuously report whether they heard a single coherent sequence (integrated) or two concurrent streams (segregated). Although temporal overlap between tones from the two streams proved to be the strongest cue for stream segregation, significant effects of tonality and familiarity with the songs were also observed. These results suggest that regular temporal patterns are utilized as cues in auditory stream segregation and that long-term memory is involved in this process.
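The timing conditions described here (original, scrambled, isochronous) amount to different ways of assigning onsets to a fixed note list before interleaving two streams. A minimal sketch of that construction, with hypothetical note values and an assumed stream offset, not the authors' stimulus code:

```python
import random

def build_stream(notes, timing="original", seed=0):
    """notes: list of (pitch, inter-onset interval) pairs from one song.
    Returns (pitch, onset) pairs with 'original', 'scrambled' (shuffled
    IOIs), or 'isochronous' (mean IOI everywhere) timing."""
    pitches = [p for p, _ in notes]
    iois = [d for _, d in notes]
    if timing == "scrambled":
        iois = iois[:]
        random.Random(seed).shuffle(iois)
    elif timing == "isochronous":
        iois = [sum(iois) / len(iois)] * len(iois)
    onsets, t = [], 0.0
    for d in iois:
        onsets.append(t)
        t += d
    return list(zip(pitches, onsets))

def interleave(stream_a, stream_b, offset=0.125):
    """Merge two streams, delaying stream B by a fixed offset, sorted by onset."""
    merged = stream_a + [(p, t + offset) for p, t in stream_b]
    return sorted(merged, key=lambda x: x[1])
```

Note that scrambling and isochrony both preserve the total duration of the stream, so the conditions differ only in local temporal structure.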
Affiliation(s)
- Orsolya Szalárdy
- Institute of Cognitive Neuroscience and Psychology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, P.O. Box 286, H-1519 Budapest, Hungary
- Alexandra Bendixen
- Auditory Psychophysiology Lab, Department of Psychology, Cluster of Excellence "Hearing4all," European Medical School, Carl von Ossietzky University of Oldenburg, Ammerländer Heerstrasse 114-118, D-26129 Oldenburg, Germany
- Tamás M Böhm
- Institute of Cognitive Neuroscience and Psychology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, P.O. Box 286, H-1519 Budapest, Hungary
- Lucy A Davies
- Cognition Institute and School of Psychology, University of Plymouth, Drake Circus, Plymouth PL4 8AA, United Kingdom
- Susan L Denham
- Cognition Institute and School of Psychology, University of Plymouth, Drake Circus, Plymouth PL4 8AA, United Kingdom
- István Winkler
- Institute of Cognitive Neuroscience and Psychology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, P.O. Box 286, H-1519 Budapest, Hungary

45
Denham S, Bõhm TM, Bendixen A, Szalárdy O, Kocsis Z, Mill R, Winkler I. Stable individual characteristics in the perception of multiple embedded patterns in multistable auditory stimuli. Front Neurosci 2014; 8:25. [PMID: 24616656 PMCID: PMC3937586 DOI: 10.3389/fnins.2014.00025] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.0] [Received: 11/30/2013] [Accepted: 01/27/2014] [Indexed: 11/25/2022]
Abstract
The ability of the auditory system to parse complex scenes into component objects in order to extract information from the environment is very robust, yet the processing principles underlying this ability are still not well understood. This study was designed to investigate the proposal that the auditory system constructs multiple interpretations of the acoustic scene in parallel, based on the finding that, when listening to a long repetitive sequence, listeners report switching between different perceptual organizations. Using the “ABA-” auditory streaming paradigm, we trained listeners until they could reliably recognize all possible embedded patterns of length four which could in principle be extracted from the sequence, and in a series of test sessions investigated their spontaneous reports of those patterns. With the training allowing them to identify and mark a wider variety of possible patterns, participants spontaneously reported many more patterns than the two traditionally assumed (integrated vs. segregated). Despite the consistent training and the apparent randomness of perceptual switching, individual switching patterns were idiosyncratic; that is, the perceptual switching patterns of each participant were more similar to their own switching patterns in different sessions than to those of other participants. These individual differences were preserved even between test sessions held a year after the initial experiment. Our results support the idea that the auditory system attempts to extract an exhaustive set of embedded patterns, which can be used to generate expectations of future events and which, by competing for dominance, give rise to (changing) perceptual awareness; both pattern discovery and perceptual competition have a strong idiosyncratic component. Perceptual multistability thus provides a means for characterizing both general mechanisms and individual differences in human perception.
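The space of candidate patterns in this paradigm can be made concrete with a small enumeration. The sketch below treats a length-four pattern as any nonempty subset of the tones in one "ABA-" cycle (unselected slots become silences, "-") heard at any phase; this is one plausible reading of "all possible embedded patterns of length four," not the authors' definition or code.

```python
from itertools import combinations

def embedded_patterns(cycle="ABA-"):
    """Enumerate distinct length-len(cycle) patterns formed by keeping any
    nonempty subset of the cycle's tones, at any cyclic phase shift."""
    tone_slots = [i for i, c in enumerate(cycle) if c != "-"]
    patterns = set()
    for r in range(1, len(tone_slots) + 1):
        for keep in combinations(tone_slots, r):
            base = ["-"] * len(cycle)
            for i in keep:
                base[i] = cycle[i]
            for shift in range(len(cycle)):  # same pattern at every phase
                patterns.add("".join(base[shift:] + base[:shift]))
    return sorted(patterns)
```

Under this reading the classic interpretations appear alongside many others: "ABA-" (integrated) and "A-A-" / "-B--" (segregated) are all members of the enumerated set.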
Affiliation(s)
- Susan Denham
- Cognition Institute, University of Plymouth, Plymouth, UK; School of Psychology, University of Plymouth, Plymouth, UK
- Tamás M Bõhm
- Research Centre for Natural Sciences, Institute of Cognitive Neuroscience and Psychology, Hungarian Academy of Sciences, Budapest, Hungary; Department of Telecommunications and Media Informatics, Budapest University of Technology and Economics, Budapest, Hungary
- Alexandra Bendixen
- Auditory Psychophysiology Lab, Department of Psychology, Cluster of Excellence "Hearing4all", European Medical School, Carl von Ossietzky University of Oldenburg, Oldenburg, Germany
- Orsolya Szalárdy
- Research Centre for Natural Sciences, Institute of Cognitive Neuroscience and Psychology, Hungarian Academy of Sciences, Budapest, Hungary; Department of Cognitive Science, Budapest University of Technology and Economics, Budapest, Hungary
- Zsuzsanna Kocsis
- Research Centre for Natural Sciences, Institute of Cognitive Neuroscience and Psychology, Hungarian Academy of Sciences, Budapest, Hungary; Department of Cognitive Science, Budapest University of Technology and Economics, Budapest, Hungary
- Robert Mill
- Cognition Institute, University of Plymouth, Plymouth, UK
- István Winkler
- Research Centre for Natural Sciences, Institute of Cognitive Neuroscience and Psychology, Hungarian Academy of Sciences, Budapest, Hungary; Institute of Psychology, University of Szeged, Szeged, Hungary

46
Vallet GT, Shore DI, Schutz M. Exploring the Role of the Amplitude Envelope in Duration Estimation. Perception 2014; 43:616-30. [DOI: 10.1068/p7656] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Indexed: 10/25/2022]
Abstract
A sound's duration provides important information about the event producing it. Although many of the sounds we hear every day are ‘percussive’ in nature (ie resulting from two objects impacting) and therefore exhibit decaying/damped amplitude envelopes, perceptual experiments frequently use tones synthesized with ‘flat’ or abruptly ending envelopes. Such sounds afford an estimation strategy involving calculating the elapsed time between tone onset and offset—a strategy that would be problematic for ecologically pervasive decaying sounds. Here we compare duration judgments for tones with percussive (ie gradually decaying) and flat (ie abruptly ending) amplitude envelopes, finding evidence for the use of different strategies. This result is discussed in terms of its implications for dominant theories and models of sensory perception that are often assessed using artificial sounds (ie ‘flat tones’) affording strategies that may not be optimal or even available for everyday listening.
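The contrast between the two stimulus classes (abrupt-offset "flat" tones versus exponentially decaying "percussive" tones) is easy to make concrete. A minimal sketch, with an assumed decay rate and sample rate rather than the study's actual stimulus parameters:

```python
import numpy as np

def tone(freq, dur, fs=44100, envelope="flat", decay_rate=8.0):
    """Sine tone with a 'flat' (constant amplitude, abrupt offset) or
    'percussive' (exponentially decaying) amplitude envelope."""
    t = np.arange(int(dur * fs)) / fs
    carrier = np.sin(2 * np.pi * freq * t)
    if envelope == "percussive":
        env = np.exp(-decay_rate * t)  # damped, impact-like decay
    else:
        env = np.ones_like(t)          # 'flat' tone: full amplitude until offset
    return carrier * env
```

The flat tone offers a clear offset marker for an onset-to-offset timing strategy; the percussive tone fades gradually, so no such marker exists.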
Affiliation(s)
- Guillaume T Vallet
- Multisensory Perception Laboratory, Department of Psychology, Neuroscience and Behaviour, McMaster University, Canada; also Centre de Recherche de l'Institut Universitaire de Gériatrie de Montréal (CRIUGM), University of Montreal, Canada
- David I Shore
- Multisensory Perception Laboratory, Department of Psychology, Neuroscience and Behaviour, McMaster University, Canada
- Michael Schutz
- Music, Acoustics, Perception and LEarning (MAPLE) Laboratory, School of the Arts, McMaster University, 280 Main Street West, Hamilton, ON L8S 4K1, Canada

47
Schröger E, Bendixen A, Denham SL, Mill RW, Bőhm TM, Winkler I. Predictive Regularity Representations in Violation Detection and Auditory Stream Segregation: From Conceptual to Computational Models. Brain Topogr 2013; 27:565-77. [DOI: 10.1007/s10548-013-0334-6] [Citation(s) in RCA: 64] [Impact Index Per Article: 5.8] [Received: 07/08/2013] [Accepted: 11/13/2013] [Indexed: 11/24/2022]
48
Szalárdy O, Winkler I, Schröger E, Widmann A, Bendixen A. Foreground-background discrimination indicated by event-related brain potentials in a new auditory multistability paradigm. Psychophysiology 2013; 50:1239-50. [DOI: 10.1111/psyp.12139] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Received: 11/14/2012] [Accepted: 07/15/2013] [Indexed: 11/26/2022]
Affiliation(s)
- Orsolya Szalárdy
- Institute of Cognitive Neuroscience and Psychology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Budapest, Hungary
- Department of Cognitive Science, Faculty of Natural Sciences, Budapest University of Technology and Economics, Budapest, Hungary
- István Winkler
- Institute of Cognitive Neuroscience and Psychology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Budapest, Hungary
- Institute of Psychology, University of Szeged, Szeged, Hungary
- Erich Schröger
- Institute of Psychology, University of Leipzig, Leipzig, Germany
- Andreas Widmann
- Institute of Psychology, University of Leipzig, Leipzig, Germany
- Alexandra Bendixen
- Institute of Psychology, University of Leipzig, Leipzig, Germany
- Department of Psychology, Cluster of Excellence “Hearing4all,” European Medical School, Carl von Ossietzky University of Oldenburg, Oldenburg, Germany

49
Selezneva E, Deike S, Knyazeva S, Scheich H, Brechmann A, Brosch M. Rhythm sensitivity in macaque monkeys. Front Syst Neurosci 2013; 7:49. [PMID: 24046732 PMCID: PMC3764333 DOI: 10.3389/fnsys.2013.00049] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Received: 02/14/2013] [Accepted: 08/19/2013] [Indexed: 11/13/2022]
Abstract
This study provides evidence that monkeys are rhythm sensitive. We composed isochronous tone sequences consisting of repeating triplets of two short tones and one long tone, which humans perceive as repeating triplets of two weak beats and one strong beat. This regular sequence was compared to an irregular sequence with the same number of randomly arranged short and long tones and thus no such beat structure. To search for indications of rhythm sensitivity, we employed an oddball paradigm in which occasional duration deviants were introduced into the sequences. In a pilot study on humans, we showed that subjects more easily detected these deviants when they occurred in a regular sequence. In the monkeys, we looked for spontaneous behaviors the animals executed concomitantly with the deviants. We found that monkeys more frequently exhibited changes of gaze and facial expression to deviants occurring in the regular sequence than in the irregular sequence. In addition, we recorded neuronal firing and local field potentials from 175 sites in the primary auditory cortex during sequence presentation. Both types of neuronal signals differentiated regular from irregular sequences: both were stronger in regular sequences and occurred after the onset of the long tones, i.e., at the position of the strong beat. Local field potential responses were also significantly larger for the duration deviants in regular sequences, albeit in a later time window. We speculate that these temporal-pattern-selective mechanisms, with a focus on strong beats and deviations from them, underlie the perception of rhythm in the chosen sequences.
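The regular condition (isochronous onsets, short-short-long triplets, with occasional duration deviants) can be sketched as a small generator. Tone durations, inter-onset interval, and deviant probability below are illustrative assumptions, not the study's parameters.

```python
import random

def triplet_sequence(n_triplets, short=0.1, long=0.3, ioi=0.4, p_deviant=0.1, seed=0):
    """Isochronous sequence of short-short-long triplets; with probability
    p_deviant a tone's duration category is swapped (a duration deviant)."""
    rng = random.Random(seed)
    seq, t = [], 0.0
    for i in range(n_triplets * 3):
        dur = long if i % 3 == 2 else short  # every third tone is long (strong beat)
        deviant = rng.random() < p_deviant
        if deviant:
            dur = short if dur == long else long  # swap duration category
        seq.append({"onset": round(t, 10), "dur": dur, "deviant": deviant})
        t += ioi
    return seq
```

The irregular control would use the same multiset of durations in shuffled order; only the regular version carries the weak-weak-strong beat structure.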
Affiliation(s)
- Elena Selezneva
- Special Lab of Primate Neurobiology, Leibniz Institute for Neurobiology, Magdeburg, Germany

50
Rajendran VG, Harper NS, Willmore BD, Hartmann WM, Schnupp JWH. Temporal predictability as a grouping cue in the perception of auditory streams. J Acoust Soc Am 2013; 134:EL98-104. [PMID: 23862914 PMCID: PMC4491984 DOI: 10.1121/1.4811161] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Indexed: 05/30/2023]
Abstract
This study reports a role of temporal regularity in the perception of auditory streams. Listeners were presented with two-tone sequences in an A-B-A-B rhythm that was either regular or had a controlled amount of temporal jitter added independently to each of the B tones. Subjects were asked to report whether they perceived one or two streams. The percentage of trials on which two streams were reported increased substantially and significantly with increasing amounts of temporal jitter. This suggests that temporal predictability may serve as a binding cue during auditory scene analysis.
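The jitter manipulation (regular A-B-A-B onsets, with uniform jitter added independently to each B tone) reduces to a few lines. The inter-onset interval and jitter range below are assumed values for illustration, not the study's.

```python
import random

def abab_onsets(n_pairs, ioi=0.15, jitter=0.0, seed=0):
    """Onset times for an A-B-A-B sequence. A tones stay strictly regular;
    each B tone is displaced by an independent uniform offset in [-jitter, jitter]."""
    rng = random.Random(seed)
    a = [2 * i * ioi for i in range(n_pairs)]
    b = [(2 * i + 1) * ioi + rng.uniform(-jitter, jitter) for i in range(n_pairs)]
    return a, b
```

With jitter = 0 the B tones fall exactly midway between successive A tones; increasing the jitter degrades the temporal predictability of the B stream while leaving the A stream untouched.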
Affiliation(s)
- Vani G Rajendran
- Department of Physiology, Anatomy and Genetics, University of Oxford, Sherrington Building, Parks Road, Oxford OX1 3PT, United Kingdom.