1
Ringer H, Schröger E, Grimm S. Neural signatures of automatic repetition detection in temporally regular and jittered acoustic sequences. PLoS One 2023; 18:e0284836. PMID: 37948467; PMCID: PMC10637696; DOI: 10.1371/journal.pone.0284836.
Abstract
Detection of repeating patterns within continuous sound streams is crucial for efficient auditory perception. Previous studies demonstrated a remarkable sensitivity of the human auditory system to periodic repetitions in unfamiliar, meaningless sounds. Automatic repetition detection was reflected in different EEG markers, including sustained activity, neural synchronisation, and event-related responses to pattern occurrences. The current study investigated how listeners' attention and the temporal regularity of a sound modulate repetition perception, and how this influence is reflected in different EEG markers that were previously suggested to subserve dissociable functions. We reanalysed data of a previous study in which listeners were presented with sequences of unfamiliar artificial sounds that either contained repetitions of a certain sound segment or not. Repeating patterns occurred either regularly or with a temporal jitter within the sequences, and participants' attention was directed either towards the pattern repetitions or away from the auditory stimulation. Across both regular and jittered sequences, under both attention and inattention, pattern repetitions led to increased sustained activity throughout the sequence, evoked a characteristic positivity-negativity complex in the event-related potential, and enhanced inter-trial phase coherence of low-frequency oscillatory activity time-locked to repeating pattern onsets. While regularity had only a minor (if any) influence, attention significantly strengthened pattern repetition perception, which was consistently reflected in all three EEG markers. These findings suggest that the detection of pattern repetitions within continuous sounds relies on a flexible mechanism that is robust against inattention and temporal irregularity, both of which typically occur in naturalistic listening situations. Yet attention to the auditory input can enhance the processing of repeating patterns and improve repetition detection.
Affiliation(s)
- Hanna Ringer
- International Max Planck Research School on Neuroscience of Communication (IMPRS NeuroCom), Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- Cognitive and Biological Psychology, Wilhelm Wundt Institute for Psychology, Leipzig University, Leipzig, Germany
- Research Group Neurocognition of Music and Language, Max Planck Institute for Empirical Aesthetics, Frankfurt am Main, Germany
- Erich Schröger
- Cognitive and Biological Psychology, Wilhelm Wundt Institute for Psychology, Leipzig University, Leipzig, Germany
- Sabine Grimm
- Physics of Cognition Lab, Institute of Physics, Chemnitz University of Technology, Chemnitz, Germany
- Cognitive Systems Lab, Institute of Physics, Chemnitz University of Technology, Chemnitz, Germany
2
Weise A, Grimm S, Maria Rimmele J, Schröger E. Auditory representations for long lasting sounds: Insights from event-related brain potentials and neural oscillations. Brain Lang 2023; 237:105221. PMID: 36623340; DOI: 10.1016/j.bandl.2022.105221.
Abstract
The basic features of short sounds, such as frequency and intensity, including their temporal dynamics, are integrated into a unitary representation. Knowledge of how our brain processes long-lasting sounds is scarce. We review research utilizing the Mismatch Negativity event-related potential and neural oscillatory activity for studying representations of long-lasting simple versus complex sounds, such as sinusoidal tones versus speech. There is evidence for a temporal constraint in the formation of auditory representations: auditory edges like sound onsets within long-lasting sounds open a temporal window of about 350 ms in which the sound's dynamics are integrated into a representation, while information beyond that window contributes less to that representation. This integration window segments the auditory input into short chunks. We argue that the representations established in adjacent integration windows can be concatenated into an auditory representation of a long sound, thus overcoming the temporal constraint.
Affiliation(s)
- Annekathrin Weise
- Department of Psychology, Ludwig-Maximilians-University Munich, Germany
- Wilhelm Wundt Institute for Psychology, Leipzig University, Germany
- Sabine Grimm
- Wilhelm Wundt Institute for Psychology, Leipzig University, Germany.
- Johanna Maria Rimmele
- Department of Neuroscience, Max-Planck-Institute for Empirical Aesthetics, Germany
- Center for Language, Music and Emotion, New York University, Max Planck Institute, Department of Psychology, 6 Washington Place, New York, NY 10003, United States
- Erich Schröger
- Wilhelm Wundt Institute for Psychology, Leipzig University, Germany.
3
Bianco R, Chait M. No Link Between Speech-in-Noise Perception and Auditory Sensory Memory - Evidence From a Large Cohort of Older and Younger Listeners. Trends Hear 2023; 27:23312165231190688. PMID: 37828868; PMCID: PMC10576936; DOI: 10.1177/23312165231190688.
Abstract
A growing literature is demonstrating a link between working memory (WM) and speech-in-noise (SiN) perception. However, the nature of this correlation, and which components of WM might underlie it, is debated. We investigated how SiN reception links with auditory sensory memory (aSM), the low-level processes that support the short-term maintenance of temporally unfolding sounds. A large sample of older (N = 199, 60-79 yo) and younger (N = 149, 20-35 yo) participants was recruited online and performed a coordinate response measure-based speech-in-babble task that taps listeners' ability to track a speech target in background noise. We used two tasks to investigate implicit and explicit aSM. Both were based on tone patterns overlapping in processing time scales with speech (presentation rate of tones 20 Hz; of patterns 2 Hz). We hypothesised that a link between SiN and aSM may be particularly apparent in older listeners due to age-related reduction in both SiN reception and aSM. We confirmed impaired SiN reception in the older cohort and demonstrated reduced aSM performance in those listeners. However, SiN and aSM did not share variability. Across the two age groups, SiN performance was predicted by a binaural processing test and age. The results suggest that previously observed links between WM and SiN may relate to the executive components and other cognitive demands of the tasks used. This finding helps to constrain the search for the perceptual and cognitive factors that explain individual variability in SiN performance.
Affiliation(s)
- Roberta Bianco
- Ear Institute, University College London, London, UK
- Neuroscience of Perception and Action Lab, Italian Institute of Technology (IIT), Rome, Italy
- Maria Chait
- Ear Institute, University College London, London, UK
4
Agus TR, Pressnitzer D. Repetition detection and rapid auditory learning for stochastic tone clouds. J Acoust Soc Am 2021; 150:1735. PMID: 34598638; DOI: 10.1121/10.0005935.
Abstract
Stochastic sounds are useful to probe auditory memory, as they require listeners to learn unpredictable and novel patterns under controlled experimental conditions. Previous studies using white noise or random click trains have demonstrated rapid auditory learning. Here, we explored perceptual learning with a more parametrically variable stimulus. These "tone clouds" were defined as broadband combinations of tone pips at randomized frequencies and onset times. Varying the number of tones covered a perceptual range from individually audible pips to noise-like stimuli. Results showed that listeners could detect and learn repeating patterns in tone clouds. Task difficulty varied depending on the density of tone pips, with sparse tone clouds being the easiest. Rapid learning of individual tone clouds was observed for all densities, with a roughly constant benefit of learning irrespective of baseline performance. Variations in task difficulty were correlated with amplitude modulations in an auditory model. Tone clouds thus provide a tool to probe auditory learning in a variety of task-difficulty settings, which could be useful for clinical or neurophysiological studies. They also show that rapid auditory learning operates over a wide range of spectrotemporal complexity, essentially from melodies to noise.
Affiliation(s)
- Trevor R Agus
- SARC, School of Arts, English and Languages, Queen's University Belfast, 1 Malone Road, Belfast, BT7 1NN, United Kingdom
- Daniel Pressnitzer
- Laboratoire des Systèmes Perceptifs, Département d'études cognitives, École Normale Supérieure, PSL University, CNRS, 29 rue d'Ulm, 75005, Paris, France
5
Krauss P, Tziridis K. Simulated transient hearing loss improves auditory sensitivity. Sci Rep 2021; 11:14791. PMID: 34285327; PMCID: PMC8292442; DOI: 10.1038/s41598-021-94429-5.
Abstract
Recently, it was proposed that a processing principle called adaptive stochastic resonance plays a major role in the auditory system, and serves to maintain optimal sensitivity even to highly variable sound pressure levels. As a side effect, in the case of reduced auditory input, such as permanent hearing loss or frequency-specific deprivation, this mechanism may eventually lead to the perception of phantom sounds like tinnitus or the Zwicker tone illusion. Using computational modeling, the biological plausibility of this processing principle was already demonstrated. Here, we provide experimental results that further support the stochastic resonance model of auditory perception. In particular, Mongolian gerbils were exposed to moderate-intensity, non-damaging, long-term notched noise, which mimics hearing loss for frequencies within the notch. Remarkably, the animals developed significantly increased sensitivity, i.e. improved hearing thresholds, for the frequency centered within the notch, but not for frequencies outside the notch. In addition, most animals treated with the new paradigm showed the same behavioral signs of phantom sound perception (tinnitus) as animals with acoustic-trauma-induced tinnitus. In contrast, animals treated with broadband noise as a control condition did not show any significant threshold change, nor behavioral signs of phantom sound perception.
Affiliation(s)
- Patrick Krauss
- Neuroscience Lab, Experimental Otolaryngology, University Hospital Erlangen, Erlangen, Germany.
- Cognitive Computational Neuroscience Group, University Erlangen-Nürnberg (FAU), Erlangen, Germany.
- Pattern Recognition Lab, University Erlangen-Nürnberg (FAU), Erlangen, Germany.
- Department of Otolaryngology, University Medical Center Groningen, Groningen, The Netherlands.
- Konstantin Tziridis
- Neuroscience Lab, Experimental Otolaryngology, University Hospital Erlangen, Erlangen, Germany
6
The time course of auditory recognition measured with rapid sequences of short natural sounds. Sci Rep 2019; 9:8005. PMID: 31142750; PMCID: PMC6541711; DOI: 10.1038/s41598-019-43126-5.
Abstract
Human listeners are able to accurately recognize an impressive range of complex sounds, such as musical instruments or voices. The underlying mechanisms are still poorly understood. Here, we aimed to characterize the processing time needed to recognize a natural sound. To do so, by analogy with the "rapid visual sequential presentation paradigm", we embedded short target sounds within rapid sequences of distractor sounds. The core hypothesis is that any correct report of the target implies that sufficient processing for recognition had been completed before the time of occurrence of the subsequent distractor sound. We conducted four behavioral experiments using short natural sounds (voices and instruments) as targets or distractors. We report how performance, measured as the fastest presentation rate at which recognition remained possible, was affected by sound duration, the number of sounds in a sequence, the relative pitch between target and distractors, and target position in the sequence. Results showed a very rapid auditory recognition of natural sounds in all cases. Targets could be recognized at rates up to 30 sounds per second. In addition, the best performance was observed for voices in sequences of instruments. These results give new insights into the remarkable efficiency of timbre processing in humans, using an original behavioral paradigm to provide strong constraints on future neural models of sound recognition.
7
Weaver AJ, DiGiovanni JJ, Ries DT. Pspan: A New Tool for Assessing Pitch Temporal Processing and Patterning Capacity. Am J Audiol 2019; 28:322-332. PMID: 31084578; DOI: 10.1044/2019_aja-18-0117.
Abstract
Purpose: The purpose of this study was to evaluate whether merging the clinical pitch pattern test procedure with psychoacoustic adaptive methods would create a new tool feasible to capture individual differences in pitch temporal processing and patterning capacity of children and adults.
Method: Sixty-six individuals, young children (ages 10-12 years, n = 22), older children (ages 13-15 years, n = 23), and adults (ages 18-33 years, n = 21), were recruited and assigned to subgroups based on reported duration (years) of instrumental music instruction. Additional background information was collected in order to assess if the pitch temporal processing and patterning span developed, the Pspan, was sensitive to individual differences across participants.
Results: The evaluation of the Pspan task as a scale indicated good parallel reliability across runs assessed by Cronbach's alpha, and scores were normally distributed. Between-subjects analysis of variance indicated main effects for both age groups and music groups recruited for the study. A multiple regression analysis with the Pspan scores as the dependent variable found that 3 measures of music instruction, age in years, and paternal education were predictive of enhanced temporal processing and patterning capacity for pitch input.
Conclusions: The outcomes suggest that the Pspan task is a time-efficient data collection tool that is sensitive to the duration of instrumental music instruction, maturation, and paternal education. In addition, results indicate that the task is sensitive to age-related auditory temporal processing and patterning performance changes during adolescence when children are 10-15 years old.
Affiliation(s)
- Aurora J. Weaver
- Auditory Psychophysics and Signal Processing Lab, Division of Communication Sciences and Disorders, Ohio University, Athens
- Auditory and Music Perception Lab, Department of Communication Disorders, Auburn University, AL
- Jeffrey J. DiGiovanni
- Auditory Psychophysics and Signal Processing Lab, Division of Communication Sciences and Disorders, Ohio University, Athens
- Department of Communication Sciences and Disorders, University of Cincinnati, OH
- Dennis T. Ries
- Department of Physical Medicine and Rehabilitation, University of Colorado–Anschutz Medical Campus, Aurora
8
Teichert T, Gurnsey K. Formation and decay of auditory short-term memory in the macaque monkey. J Neurophysiol 2019; 121:2401-2415. PMID: 31017849; DOI: 10.1152/jn.00821.2018.
Abstract
Echoic memory (EM) is a short-lived, precategorical, and passive form of auditory short-term memory (STM). A key hallmark of EM is its rapid exponential decay, with a time constant between 1 and 2 s. It is not clear whether auditory STM in the rhesus monkey, an important model system, shares this rapid exponential decay. To address this question, two rhesus macaques were trained to perform a delayed frequency discrimination task. Discriminability of delayed tones was measured as a function of retention duration and the number of times the standard had been repeated before the target. As in humans, our results show a rapid decline of discriminability with retention duration. In addition, the results suggest a gradual strengthening of discriminability with repetition number. Model-based analyses suggest the presence of two components of auditory STM: a short-lived component with a time constant on the order of 550 ms that most likely corresponds to EM, and a more stable memory trace with a time constant on the order of 10 s that strengthens with repetition and most likely corresponds to auditory recognition memory.
New & Noteworthy: This is the first detailed quantification of the rapid temporal dynamics of auditory short-term memory in the rhesus monkey. Much of the auditory information in short-term memory is lost within the first couple of seconds. Repeated presentations of a tone strengthen its encoding into short-term memory. Model-based analyses suggest two distinct components: an echoic memory homolog that mediates the rapid decay, and a more stable but less detail-rich component that mediates strengthening of the trace with repetition.
Affiliation(s)
- Tobias Teichert
- Department of Psychiatry, University of Pittsburgh, Pittsburgh, Pennsylvania
- Department of Bioengineering, University of Pittsburgh, Pittsburgh, Pennsylvania
- Kate Gurnsey
- Department of Psychiatry, University of Pittsburgh, Pittsburgh, Pennsylvania
9
Thunell E, Thorpe SJ. Memory for Repeated Images in Rapid-Serial-Visual-Presentation Streams of Thousands of Images. Psychol Sci 2019; 30:989-1000. PMID: 31017834; DOI: 10.1177/0956797619842251.
Abstract
Human observers readily detect targets in stimuli presented briefly and in rapid succession. Here, we show that even without predefined targets, humans can spot repetitions in streams of thousands of images. We presented sequences of natural images reoccurring a number of times interleaved with either one or two distractors, and we asked participants to detect the repetitions and to identify the repeated images after a delay that could last for minutes. Performance improved with the number of repeated-image presentations up to a ceiling around seven repetitions and was above chance even after only two to three presentations. The task was easiest for slow streams; performance dropped with increasing image-presentation rate but stabilized above 15 Hz and remained well above chance even at 120 Hz. To summarize, we reveal that the human brain has an impressive capacity to detect repetitions in rapid-serial-visual-presentation streams and to remember repeated images over a time course of minutes.
Affiliation(s)
- Evelina Thunell
- Centre de Recherche Cerveau et Cognition, Centre National de la Recherche Scientifique (CNRS), Université Toulouse III-Paul Sabatier
- Simon J Thorpe
- Centre de Recherche Cerveau et Cognition, Centre National de la Recherche Scientifique (CNRS), Université Toulouse III-Paul Sabatier
10
Rajendran VG, Teki S, Schnupp JWH. Temporal Processing in Audition: Insights from Music. Neuroscience 2018; 389:4-18. PMID: 29108832; PMCID: PMC6371985; DOI: 10.1016/j.neuroscience.2017.10.041.
Abstract
Music is a curious example of a temporally patterned acoustic stimulus, and a compelling pan-cultural phenomenon. This review strives to bring some insights from decades of music psychology and sensorimotor synchronization (SMS) literature into the mainstream auditory domain, arguing that musical rhythm perception is shaped in important ways by temporal processing mechanisms in the brain. The feature that unites these disparate disciplines is an appreciation of the central importance of timing, sequencing, and anticipation. Perception of musical rhythms relies on an ability to form temporal predictions, a general feature of temporal processing that is equally relevant to auditory scene analysis, pattern detection, and speech perception. By bringing together findings from the music and auditory literature, we hope to inspire researchers to look beyond the conventions of their respective fields and consider the cross-disciplinary implications of studying auditory temporal sequence processing. We begin by highlighting music as an interesting sound stimulus that may provide clues to how temporal patterning in sound drives perception. Next, we review the SMS literature and discuss possible neural substrates for the perception of, and synchronization to, musical beat. We then move away from music to explore the perceptual effects of rhythmic timing in pattern detection, auditory scene analysis, and speech perception. Finally, we review the neurophysiology of general timing processes that may underlie aspects of the perception of rhythmic patterns. We conclude with a brief summary and outlook for future research.
Affiliation(s)
- Vani G Rajendran
- Auditory Neuroscience Group, University of Oxford, Department of Physiology, Anatomy, and Genetics, Oxford, UK
- Sundeep Teki
- Auditory Neuroscience Group, University of Oxford, Department of Physiology, Anatomy, and Genetics, Oxford, UK
- Jan W H Schnupp
- City University of Hong Kong, Department of Biomedical Sciences, 31 To Yuen Street, Kowloon Tong, Hong Kong.
11
Kang H, Lancelin D, Pressnitzer D. Memory for Random Time Patterns in Audition, Touch, and Vision. Neuroscience 2018; 389:118-132. PMID: 29577997; DOI: 10.1016/j.neuroscience.2018.03.017.
Abstract
Perception deals with temporal sequences of events, like series of phonemes for audition, dynamic changes in pressure for touch textures, or moving objects for vision. Memory processes are thus needed to make sense of the temporal patterning of sensory information. Recently, we have shown that auditory temporal patterns could be learned rapidly and incidentally with repeated exposure [Kang et al., 2017]. Here, we tested whether rapid incidental learning of temporal patterns was specific to audition, or if it was a more general property of sensory systems. We used the same behavioral task in three modalities: audition, touch, and vision, for stimuli having identical temporal statistics. Participants were presented with sequences of acoustic pulses for audition, motion pulses to the fingertips for touch, or light pulses for vision. Pulses were randomly and irregularly spaced, with all inter-pulse intervals in the sub-second range and all constrained to be longer than the temporal acuity in any modality. This led to pulse sequences with an average inter-pulse interval of 166 ms, a minimum inter-pulse interval of 60 ms, and a total duration of 1.2 s. Results showed that, if a random temporal pattern re-occurred at random times during an experimental block, it was rapidly learned, whatever the sensory modality. Moreover, patterns first learned in the auditory modality displayed transfer of learning to either touch or vision. This suggests that sensory systems may be exquisitely tuned to incidentally learn re-occurring temporal patterns, with possible cross-talk between the senses.
Affiliation(s)
- HiJee Kang
- Laboratoire des Systèmes Perceptifs, Département d'études cognitives, École Normale Supérieure, PSL Research University, CNRS, 29 rue d'Ulm, 75005 Paris, France.
- Denis Lancelin
- Laboratoire des Systèmes Perceptifs, Département d'études cognitives, École Normale Supérieure, PSL Research University, CNRS, 29 rue d'Ulm, 75005 Paris, France
- Daniel Pressnitzer
- Laboratoire des Systèmes Perceptifs, Département d'études cognitives, École Normale Supérieure, PSL Research University, CNRS, 29 rue d'Ulm, 75005 Paris, France.
12
Abstract
The cocktail party problem requires listeners to infer individual sound sources from mixtures of sound. The problem can be solved only by leveraging regularities in natural sound sources, but little is known about how such regularities are internalized. We explored whether listeners learn source "schemas" (the abstract structure shared by different occurrences of the same type of sound source) and use them to infer sources from mixtures. We measured the ability of listeners to segregate mixtures of time-varying sources. In each experiment a subset of trials contained schema-based sources generated from a common template by transformations (transposition and time dilation) that introduced acoustic variation but preserved abstract structure. Across several tasks and classes of sound sources, schema-based sources consistently aided source separation, in some cases producing rapid improvements in performance over the first few exposures to a schema. Learning persisted across blocks that did not contain the learned schema, and listeners were able to learn and use multiple schemas simultaneously. No learning was evident when schemas were presented in the task-irrelevant (i.e., distractor) source. However, learning from task-relevant stimuli showed signs of being implicit, in that listeners were no more likely to report that sources recurred in experiments containing schema-based sources than in control experiments containing no schema-based sources. The results implicate a mechanism for rapidly internalizing abstract sound structure, facilitating accurate perceptual organization of sound sources that recur in the environment.
13
Denham SL, Winkler I. Predictive coding in auditory perception: challenges and unresolved questions. Eur J Neurosci 2018; 51:1151-1160. PMID: 29250827; DOI: 10.1111/ejn.13802.
Abstract
Predictive coding is arguably the currently dominant theoretical framework for the study of perception. It has been employed to explain important auditory perceptual phenomena, and it has inspired theoretical, experimental and computational modelling efforts aimed at describing how the auditory system parses the complex sound input into meaningful units (auditory scene analysis). These efforts have uncovered some vital questions, addressing which could help to further specify predictive coding and clarify some of its basic assumptions. The goal of the current review is to motivate these questions and show how unresolved issues in explaining some auditory phenomena lead to general questions of the theoretical framework. We focus on experimental and computational modelling issues related to sequential grouping in auditory scene analysis (auditory pattern detection and bistable perception), as we believe that this is the research topic where predictive coding has the highest potential for advancing our understanding. In addition to specific questions, our analysis led us to identify three more general questions that require further clarification: (1) What exactly is meant by prediction in predictive coding? (2) What governs which generative models make the predictions? and (3) What (if it exists) is the correlate of perceptual experience within the predictive coding framework?
Affiliation(s)
- Susan L Denham
- School of Psychology, University of Plymouth, Drake Circus, Plymouth, PL4 8AA, UK
- István Winkler
- Institute of Cognitive Neuroscience and Psychology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Budapest, Hungary
14
Kang H, Agus TR, Pressnitzer D. Auditory memory for random time patterns. J Acoust Soc Am 2017; 142:2219. PMID: 29092589; DOI: 10.1121/1.5007730.
Abstract
The acquisition of auditory memory for temporal patterns was investigated. The temporal patterns were random sequences of irregularly spaced clicks. Participants performed a task previously used to study auditory memory for noise [Agus, Thorpe, and Pressnitzer (2010). Neuron 66, 610-618]. The memory for temporal patterns displayed strong similarities with the memory for noise: temporal patterns were learnt rapidly, in an unsupervised manner, and could be distinguished from statistically matched patterns after learning. There was, however, a qualitative difference from the memory for noise. For temporal patterns, no memory transfer was observed after time reversals, showing that both the time intervals and their order were represented in memory. Remarkably, learning was observed over a broad range of time scales, which encompassed rhythm-like and buzz-like temporal patterns. Temporal patterns present specific challenges to the neural mechanisms of plasticity, because the information to be learnt is distributed over time. Nevertheless, the present data show that the acquisition of novel auditory memories can be as efficient for temporal patterns as for sounds containing additional spectral and spectro-temporal cues, such as noise. This suggests that the rapid formation of memory traces may be a general by-product of repeated auditory exposure.
Affiliation(s)
- HiJee Kang
- Laboratoire des Systèmes Perceptifs, Département d'études cognitives, École Normale Supérieure, PSL Research University, Centre National de la Recherche Scientifique, 29 Rue d'Ulm, 75005 Paris, France
- Trevor R Agus
- Laboratoire des Systèmes Perceptifs, Département d'études cognitives, École Normale Supérieure, PSL Research University, Centre National de la Recherche Scientifique, 29 Rue d'Ulm, 75005 Paris, France
- Daniel Pressnitzer
- Laboratoire des Systèmes Perceptifs, Département d'études cognitives, École Normale Supérieure, PSL Research University, Centre National de la Recherche Scientifique, 29 Rue d'Ulm, 75005 Paris, France
15
Song K, Luo H. Temporal Organization of Sound Information in Auditory Memory. Front Psychol 2017; 8:999. [PMID: 28674512 PMCID: PMC5475238 DOI: 10.3389/fpsyg.2017.00999] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/10/2017] [Accepted: 05/30/2017] [Indexed: 11/13/2022] Open
Abstract
Memory is a constructive and organizational process. Instead of being stored with all the fine details, external information is reorganized and structured at certain spatiotemporal scales. It is well acknowledged that time plays a central role in audition by segmenting sound inputs into temporal chunks of appropriate length. However, it remains largely unknown whether critical temporal structures exist to mediate sound representation in auditory memory. To address the issue, here we designed an auditory memory transferring study, by combining a previously developed unsupervised white noise memory paradigm with a reversed sound manipulation method. Specifically, we systematically measured the memory transferring from a random white noise sound to its locally temporal reversed version on various temporal scales in seven experiments. We demonstrate a U-shape memory-transferring pattern with the minimum value around temporal scale of 200 ms. Furthermore, neither auditory perceptual similarity nor physical similarity as a function of the manipulating temporal scale can account for the memory-transferring results. Our results suggest that sounds are not stored with all the fine spectrotemporal details but are organized and structured at discrete temporal chunks in long-term auditory memory representation.
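The local temporal-reversal manipulation described above can be sketched as follows (a minimal plain-Python stand-in, not the authors' code; the function name is illustrative, real stimuli would be sampled audio, and the chunk length would be set by the temporal scale under test, e.g. around 200 ms worth of samples):

```python
import random

def local_reverse(samples, chunk_len):
    """Split the waveform into consecutive chunks of chunk_len samples
    and reverse each chunk in place, keeping the chunk order intact."""
    out = []
    for start in range(0, len(samples), chunk_len):
        out.extend(reversed(samples[start:start + chunk_len]))
    return out

# Stand-in "white noise" token of 16 samples.
noise = [random.uniform(-1.0, 1.0) for _ in range(16)]

fine = local_reverse(noise, 2)     # small scale: near-local flips
coarse = local_reverse(noise, 16)  # chunk = whole token: full time reversal
assert coarse == noise[::-1]
assert sorted(fine) == sorted(noise)  # same samples, locally reordered
```

Sweeping the chunk length from a few samples up to the whole token reproduces the range of manipulation scales over which memory transfer was measured.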
Affiliation(s)
- Kun Song
- Department of Connectomics, Max Planck Institute for Brain Research, Frankfurt, Germany
- Huan Luo
- School of Psychological and Cognitive Sciences, Peking University, Beijing, China; IDG/McGovern Institute for Brain Research, Peking University, Beijing, China; Beijing Key Laboratory of Behavior and Mental Health, Peking University, Beijing, China
16
Viswanathan J, Rémy F, Bacon-Macé N, Thorpe SJ. Long Term Memory for Noise: Evidence of Robust Encoding of Very Short Temporal Acoustic Patterns. Front Neurosci 2016; 10:490. [PMID: 27932941 PMCID: PMC5121232 DOI: 10.3389/fnins.2016.00490] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/20/2016] [Accepted: 10/13/2016] [Indexed: 11/13/2022] Open
Abstract
Recent research has demonstrated that humans are able to implicitly encode and retain repeating patterns in meaningless auditory noise. Our study aimed at testing the robustness of long-term implicit recognition memory for these learned patterns. Participants performed a cyclic/non-cyclic discrimination task, during which they were presented with either 1-s cyclic noises (CNs) (the two halves of the noise were identical) or 1-s plain random noises (Ns). Among CNs and Ns presented once, target CNs were implicitly presented multiple times within a block, and implicit recognition of these target CNs was tested 4 weeks later using a similar cyclic/non-cyclic discrimination task. Furthermore, robustness of implicit recognition memory was tested by presenting participants with looped (shifting the origin) and scrambled (chopping sounds into 10- and 20-ms bits before shuffling) versions of the target CNs. We found that participants had robust implicit recognition memory for learned noise patterns after 4 weeks, right from the first presentation. Additionally, this memory was remarkably resistant to acoustic transformations, such as looping and scrambling of the sounds. Finally, implicit recognition of sounds was dependent on participants' discrimination performance during learning. Our findings suggest that meaningless temporal features as short as 10 ms can be implicitly stored in long-term auditory memory. Moreover, successful encoding and storage of such fine features may vary between participants, possibly depending on individual attention and auditory discrimination abilities.
Significance Statement: Meaningless auditory patterns could be implicitly encoded and stored in long-term memory. Acoustic transformations of learned meaningless patterns could be implicitly recognized after 4 weeks. Implicit long-term memories can be formed for meaningless auditory features as short as 10 ms. Successful encoding and long-term implicit recognition of meaningless patterns may strongly depend on individual attention and auditory discrimination abilities.
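The three stimulus constructions above (cyclic noise, looped, and scrambled versions) can be illustrated with a short sketch (plain-Python stand-ins, not the authors' code; real CNs are 1-s sampled waveforms, and the bit length would correspond to 10 or 20 ms of samples):

```python
import random

def cyclic_noise(n):
    """CN analogue: the second half is an exact copy of the first half."""
    half = [random.uniform(-1.0, 1.0) for _ in range(n // 2)]
    return half + half

def loop(samples, shift):
    """Looped version: shift the origin circularly; the cycle survives."""
    return samples[shift:] + samples[:shift]

def scramble(samples, bit_len, rng=random):
    """Scrambled version: chop into bit_len-sample bits and shuffle them."""
    bits = [samples[i:i + bit_len] for i in range(0, len(samples), bit_len)]
    rng.shuffle(bits)
    return [s for bit in bits for s in bit]

cn = cyclic_noise(1000)
assert cn[:500] == cn[500:]                 # the defining repetition
assert sorted(loop(cn, 123)) == sorted(cn)  # same samples, new origin
assert len(scramble(cn, 10)) == len(cn)     # local bits preserved
```

Both transformations preserve local acoustic detail while destroying global order, which is what makes the reported transfer of recognition memory to them informative.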
Affiliation(s)
- Jayalakshmi Viswanathan
- Centre de Recherche Cerveau et Cognition, Centre National de la Recherche Scientifique UMR 5549, Toulouse, France; Faculty of Medicine, Purpan, University of Toulouse III Paul Sabatier, Toulouse, France
- Florence Rémy
- Centre de Recherche Cerveau et Cognition, Centre National de la Recherche Scientifique UMR 5549, Toulouse, France; Faculty of Medicine, Purpan, University of Toulouse III Paul Sabatier, Toulouse, France
- Nadège Bacon-Macé
- Centre de Recherche Cerveau et Cognition, Centre National de la Recherche Scientifique UMR 5549, Toulouse, France
- Simon J Thorpe
- Centre de Recherche Cerveau et Cognition, Centre National de la Recherche Scientifique UMR 5549, Toulouse, France; Faculty of Medicine, Purpan, University of Toulouse III Paul Sabatier, Toulouse, France
17
Rajendran VG, Harper NS, Abdel-Latif KHA, Schnupp JWH. Rhythm Facilitates the Detection of Repeating Sound Patterns. Front Neurosci 2016; 10:9. [PMID: 26858589 PMCID: PMC4731741 DOI: 10.3389/fnins.2016.00009] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/04/2015] [Accepted: 01/11/2016] [Indexed: 11/13/2022] Open
Abstract
This study investigates the influence of temporal regularity on human listeners' ability to detect a repeating noise pattern embedded in statistically identical non-repeating noise. Human listeners were presented with white noise stimuli that either contained a frozen segment of noise that repeated in a temporally regular or irregular manner, or did not contain any repetition at all. Subjects were instructed to respond as soon as they detected any repetition in the stimulus. Pattern detection performance was best when repeated targets occurred in a temporally regular manner, suggesting that temporal regularity plays a facilitative role in pattern detection. A modulation filterbank model could account for these results.
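The regular vs. irregular embedding of a frozen target in running noise can be sketched as follows (an illustrative plain-Python stand-in, not the authors' stimulus code; function names, the clamping of onsets, and the sample counts are assumptions):

```python
import random

def target_onsets(n_targets, period, jitter, rng=random):
    """Onset positions (in samples) for the frozen target segment:
    strictly periodic when jitter == 0, temporally irregular otherwise."""
    return [i * period + (rng.uniform(-jitter, jitter) if jitter else 0.0)
            for i in range(n_targets)]

def embed(target, onsets, total_len, rng=random):
    """Background of fresh running noise with the same frozen target
    pasted in at each onset position."""
    stream = [rng.uniform(-1.0, 1.0) for _ in range(total_len)]
    for t in onsets:
        # Clamp so a jittered onset cannot push the target off the stream.
        start = max(0, min(int(round(t)), total_len - len(target)))
        stream[start:start + len(target)] = target
    return stream

frozen = [random.uniform(-1.0, 1.0) for _ in range(50)]
regular = embed(frozen, target_onsets(4, 200, 0), 1000)
jittered = embed(frozen, target_onsets(4, 200, 30), 1000)
assert regular[200:250] == frozen  # periodic placement is exact
```

The two streams contain identical target material; only the temporal predictability of the repetitions differs, which is the contrast the study exploits.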
Affiliation(s)
- Vani G Rajendran
- Auditory Neuroscience Group, Department of Physiology, Anatomy, and Genetics, University of Oxford, Oxford, UK
- Nicol S Harper
- Auditory Neuroscience Group, Department of Physiology, Anatomy, and Genetics, University of Oxford, Oxford, UK
- Khaled H A Abdel-Latif
- Auditory Neuroscience Group, Department of Physiology, Anatomy, and Genetics, University of Oxford, Oxford, UK
- Jan W H Schnupp
- Auditory Neuroscience Group, Department of Physiology, Anatomy, and Genetics, University of Oxford, Oxford, UK
18
Andrillon T, Kouider S, Agus T, Pressnitzer D. Perceptual Learning of Acoustic Noise Generates Memory-Evoked Potentials. Curr Biol 2015; 25:2823-2829. [DOI: 10.1016/j.cub.2015.09.027] [Citation(s) in RCA: 38] [Impact Index Per Article: 4.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/19/2015] [Revised: 07/22/2015] [Accepted: 09/09/2015] [Indexed: 11/16/2022]
19
Winkler I, Schröger E. Auditory perceptual objects as generative models: Setting the stage for communication by sound. Brain Lang 2015; 148:1-22. [PMID: 26184883 DOI: 10.1016/j.bandl.2015.05.003] [Citation(s) in RCA: 47] [Impact Index Per Article: 5.2] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 10/20/2014] [Revised: 03/03/2015] [Accepted: 05/03/2015] [Indexed: 06/04/2023]
Abstract
Communication by sounds requires that the communication channels (i.e. speech/speakers and other sound sources) have been established. This allows listeners to separate concurrently active sound sources, to track their identity, to assess the type of message arriving from them, and to decide whether and when to react (e.g., reply to the message). We propose that these functions rely on a common generative model of the auditory environment. This model predicts upcoming sounds on the basis of representations describing temporal/sequential regularities. Predictions help to identify the continuation of previously discovered sound sources and to detect the emergence of new sources, as well as changes in the behavior of the known ones. The model produces auditory event representations which provide a full sensory description of the sounds, including their relation to the auditory context and the current goals of the organism. Event representations can be consciously perceived and serve as objects in various cognitive operations.
Affiliation(s)
- István Winkler
- Institute of Cognitive Neuroscience and Psychology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Hungary; Institute of Psychology, University of Szeged, Hungary.
- Erich Schröger
- Institute for Psychology, University of Leipzig, Germany.
20
Bartha-Doering L, Deuster D, Giordano V, am Zehnhoff-Dinnesen A, Dobel C. A systematic review of the mismatch negativity as an index for auditory sensory memory: From basic research to clinical and developmental perspectives. Psychophysiology 2015; 52:1115-30. [DOI: 10.1111/psyp.12459] [Citation(s) in RCA: 54] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/03/2014] [Accepted: 05/05/2015] [Indexed: 11/28/2022]
Affiliation(s)
- Lisa Bartha-Doering
- Department of Pediatrics and Adolescent Medicine, Medical University Vienna, Vienna, Austria
- Dirk Deuster
- Department of Phoniatry and Pedaudiology, University Hospital of Muenster, Muenster, Germany
- Vito Giordano
- Department of Pediatrics and Adolescent Medicine, Medical University Vienna, Vienna, Austria
- Christian Dobel
- Institute for Biomagnetism and Biosignalanalysis, University of Muenster, Muenster, Germany
- Department of Otolaryngology and the Institute of Phoniatry and Pedaudiology, Friedrich-Schiller University Jena, Jena, Germany
21
Agus TR, Carrión-Castillo A, Pressnitzer D, Ramus F. Perceptual learning of acoustic noise by individuals with dyslexia. J Speech Lang Hear Res 2014; 57:1069-1077. [PMID: 24167235 DOI: 10.1044/1092-4388(2013/13-0020)] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/02/2023]
Abstract
Purpose: A phonological deficit is thought to affect most individuals with developmental dyslexia. The present study addresses whether the phonological deficit is caused by difficulties with perceptual learning of fine acoustic details. Method: A demanding test of nonverbal auditory memory, "noise learning," was administered to both adults with dyslexia and control adult participants. On each trial, listeners had to decide whether a stimulus was a 1-s noise token or 2 abutting presentations of the same 0.5-s noise token (repeated noise). Without the listeners' knowledge, the exact same noise tokens were presented over many trials. An improved ability to perform the task for such "reference" noises reflects learning of their acoustic details. Results: Listeners with dyslexia did not differ from controls in any aspect of the task, qualitatively or quantitatively. They required the same amount of training to achieve discrimination of repeated from nonrepeated noises, and they learned the reference noises as often and as rapidly as the control group. However, they did show all the hallmarks of dyslexia, including a well-characterized phonological deficit. Conclusion: The data did not support the hypothesis that deficits in basic auditory processing or nonverbal learning and memory are the cause of the phonological deficit in dyslexia.
22
Gold JM, Aizenman A, Bond SM, Sekuler R. Memory and incidental learning for visual frozen noise sequences. Vision Res 2013; 99:19-36. [PMID: 24075900 DOI: 10.1016/j.visres.2013.09.005] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/25/2013] [Revised: 09/07/2013] [Accepted: 09/10/2013] [Indexed: 12/01/2022]
Abstract
Five experiments explored short-term memory and incidental learning for random visual spatio-temporal sequences. In each experiment, human observers saw samples of 8 Hz temporally-modulated 1D or 2D contrast noise sequences whose members were either uncorrelated across an entire 1-s long stimulus sequence, or comprised two frozen noise sequences that repeated identically between a stimulus' first and second 500 ms halves ("Repeated" noise). Presented with randomly intermixed stimuli of both types, observers judged whether each sequence repeated or not. Additionally, a particular exemplar of Repeated noise (a frozen or "Fixed Repeated" noise) was interspersed multiple times within a block of trials. As previously shown with auditory frozen noise stimuli (Agus, Thorpe, & Pressnitzer, 2010) recognition performance (d') increased with successive presentations of a Fixed Repeated stimulus, and exceeded performance with regular Repeated noise. However, unlike the case with auditory stimuli, learning of random visual stimuli was slow and gradual, rather than fast and abrupt. Reverse correlation revealed that contrasts occupying particular temporal positions within a sequence had disproportionately heavy weight in observers' judgments. A subsequent experiment suggested that this result arose from observers' uncertainty about the temporal mid-point of the noise sequences. Additionally, discrimination performance fell dramatically when a sequence of contrast values was repeated, but in reverse ("mirror image") order. This poor performance with temporal mirror images is strikingly different from vision's exquisite sensitivity to spatial mirror images.
Affiliation(s)
- Jason M Gold
- Department of Psychological and Brain Sciences, Indiana University, United States.
- Avi Aizenman
- Volen Center for Complex Systems, Brandeis University, United States
- Stephanie M Bond
- Volen Center for Complex Systems, Brandeis University, United States
- Robert Sekuler
- Volen Center for Complex Systems, Brandeis University, United States
23
Agus TR, Pressnitzer D. The detection of repetitions in noise before and after perceptual learning. J Acoust Soc Am 2013; 134:464-473. [PMID: 23862821 DOI: 10.1121/1.4807641] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/02/2023]
Abstract
In noise repetition-detection tasks, listeners have to distinguish trials of continuously running noise from trials in which noise tokens are repeated in a cyclic manner. Recently, it has been shown that using the exact same noise token across several trials ("reference noise") facilitates the detection of repetitions for this token [Agus et al. (2010). Neuron 66, 610-618]. This was attributed to perceptual learning. Here, the nature of the learning was investigated. In experiment 1, reference noise tokens were embedded in trials with or without cyclic presentation. Naïve listeners reported repetitions in both cases, thus responding to the reference noise even in the absence of an actual repetition. Experiment 2, with the same listeners, showed a similar pattern of results even after the design of the experiment was made explicit, ruling out a misunderstanding of the task. Finally, in experiment 3, listeners reported repetitions in trials containing the reference noise, even before ever hearing it presented cyclically. The results show that listeners were able to learn and recognize noise tokens in the absence of an immediate repetition. Moreover, the learning mandatorily interfered with listeners' ability to detect repetitions. It is concluded that salient perceptual changes accompany the learning of noise.
Affiliation(s)
- Trevor R Agus
- Laboratoire Psychologie de la Perception, École normale supérieure, Centre National de la Recherche Scientifique, 29 rue d'Ulm, 75230 Paris Cedex 5, France.
24
Abstract
Aiming to further our understanding of fundamental mechanisms of auditory working memory (WM), the present study compared performance for three auditory materials (words, tones, timbres). In a forward recognition task (Experiment 1) participants indicated whether the order of the items in the second sequence was the same as in the first sequence. In a backward recognition task (Experiment 2) participants indicated whether the items of the second sequence were played in the correct backward order. In Experiment 3 participants performed an articulatory suppression task during the retention delay of the backward task. To investigate potential length effects the number of items per sequence was manipulated. Overall findings underline the benefit of a cross-material experimental approach and suggest that human auditory WM is not a unitary system. Whereas WM processes for timbres differed from those for tones and words, similarities and differences were observed for words and tones: Both types of stimuli appear to rely on rehearsal mechanisms, but might differ in the involved sensorimotor codes.
Affiliation(s)
- Katrin Schulze
- Lyon Neuroscience Research Center, Auditory Cognition and Psychoacoustics Team, CNRS UMR5292, INSERM U1028, Université de Lyon, Lyon, France.
25
McDermott JH, Schemitsch M, Simoncelli EP. Summary statistics in auditory perception. Nat Neurosci 2013; 16:493-8. [PMID: 23434915 PMCID: PMC4143328 DOI: 10.1038/nn.3347] [Citation(s) in RCA: 123] [Impact Index Per Article: 11.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/24/2012] [Accepted: 01/31/2013] [Indexed: 11/08/2022]
Abstract
Sensory signals are transduced at high resolution, but their structure must be stored in a more compact format. Here we provide evidence that the auditory system summarizes the temporal details of sounds using time-averaged statistics. We measured discrimination of 'sound textures' that were characterized by particular statistical properties, as normally result from the superposition of many acoustic features in auditory scenes. When listeners discriminated examples of different textures, performance improved with excerpt duration. In contrast, when listeners discriminated different examples of the same texture, performance declined with duration, a paradoxical result given that the information available for discrimination grows with duration. These results indicate that once these sounds are of moderate length, the brain's representation is limited to time-averaged statistics, which, for different examples of the same texture, converge to the same values with increasing duration. Such statistical representations produce good categorical discrimination, but limit the ability to discern temporal detail.
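The core claim, that long excerpts are represented only by time-averaged statistics, can be illustrated with a toy sketch (plain Python; using the marginal mean and standard deviation as the "summary statistics" is a deliberate simplification of the richer texture statistics measured in the study):

```python
import random
import statistics

def summary_stats(samples):
    """Collapse a sound excerpt over time, keeping only marginal moments
    (here mean and standard deviation) and discarding temporal detail."""
    return statistics.fmean(samples), statistics.pstdev(samples)

# Two different excerpts of the "same texture": distinct waveforms drawn
# from one generative process, whose summaries converge with duration.
rng = random.Random(0)
a = [rng.gauss(0.0, 1.0) for _ in range(50_000)]
b = [rng.gauss(0.0, 1.0) for _ in range(50_000)]
assert a != b                                       # temporal detail differs
ma, sa = summary_stats(a)
mb, sb = summary_stats(b)
assert abs(ma - mb) < 0.05 and abs(sa - sb) < 0.05  # summaries converge
```

Under such a representation, different excerpts of one texture become indistinguishable at long durations, consistent with the paradoxical decline in exemplar discrimination reported above.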
Affiliation(s)
- Josh H McDermott
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA.
26
Mill RW, Bőhm TM, Bendixen A, Winkler I, Denham SL. Modelling the emergence and dynamics of perceptual organisation in auditory streaming. PLoS Comput Biol 2013; 9:e1002925. [PMID: 23516340 PMCID: PMC3597549 DOI: 10.1371/journal.pcbi.1002925] [Citation(s) in RCA: 60] [Impact Index Per Article: 5.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/19/2012] [Accepted: 12/31/2012] [Indexed: 11/29/2022] Open
Abstract
Many sound sources can only be recognised from the pattern of sounds they emit, and not from the individual sound events that make up their emission sequences. Auditory scene analysis addresses the difficult task of interpreting the sound world in terms of an unknown number of discrete sound sources (causes) with possibly overlapping signals, and therefore of associating each event with the appropriate source. There are potentially many different ways in which incoming events can be assigned to different causes, which means that the auditory system has to choose between them. This problem has been studied for many years using the auditory streaming paradigm, and recently it has become apparent that instead of making one fixed perceptual decision, given sufficient time, auditory perception switches back and forth between the alternatives—a phenomenon known as perceptual bi- or multi-stability. We propose a new model of auditory scene analysis at the core of which is a process that seeks to discover predictable patterns in the ongoing sound sequence. Representations of predictable fragments are created on the fly, and are maintained, strengthened or weakened on the basis of their predictive success, and conflict with other representations. Auditory perceptual organisation emerges spontaneously from the nature of the competition between these representations. We present detailed comparisons between the model simulations and data from an auditory streaming experiment, and show that the model accounts for many important findings, including: the emergence of, and switching between, alternative organisations; the influence of stimulus parameters on perceptual dominance, switching rate and perceptual phase durations; and the build-up of auditory streaming. 
The principal contribution of the model is to show that a two-stage process of pattern discovery and competition between incompatible patterns can account for both the contents (perceptual organisations) and the dynamics of human perception in auditory streaming.

The sound waves produced by objects in the environment mix together before reaching the ears. Before we can make sense of an auditory scene, our brains must solve the puzzle of how to disassemble the sound waveform into groupings that correspond to the original source signals. How is this feat accomplished? We propose that the auditory system continually scans the structure of incoming signals in search of clues to indicate which pieces belong together. For instance, sound events may belong together if they have similar features, or form part of a clear temporal pattern. However, this process is complicated by lack of knowledge of future events and the many possible ways in which even a simple sound sequence can be decomposed. The biological solution is multistability: one possible interpretation of a sound is perceived initially, which then gives way to another interpretation, and so on. We propose a model of auditory multistability, in which fragmental descriptions of the signal compete and cooperate to explain the sound scene. We demonstrate, using simplified experimental stimuli, that the model can account for both the contents (perceptual organisations) and the dynamics of human perception in auditory streaming.
Affiliation(s)
- Robert W. Mill
- MRC Institute of Hearing Research, Nottingham, United Kingdom
- Tamás M. Bőhm
- Institute of Cognitive Neuroscience and Psychology, Research Centre for Natural Sciences, MTA, Budapest, Hungary
- Department of Telecommunications and Media Informatics, Budapest University of Technology and Economics, Budapest, Hungary
- István Winkler
- Institute of Cognitive Neuroscience and Psychology, Research Centre for Natural Sciences, MTA, Budapest, Hungary
- Institute for Psychology, University of Szeged, Szeged, Hungary
- Susan L. Denham
- Cognition Institute and School of Psychology, University of Plymouth, Plymouth, United Kingdom
27
Winkler I, Denham S, Mill R, Bohm TM, Bendixen A. Multistability in auditory stream segregation: a predictive coding view. Philos Trans R Soc Lond B Biol Sci 2012; 367:1001-12. [PMID: 22371621 DOI: 10.1098/rstb.2011.0359] [Citation(s) in RCA: 82] [Impact Index Per Article: 6.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022] Open
Abstract
Auditory stream segregation involves linking temporally separate acoustic events into one or more coherent sequences. For any non-trivial sequence of sounds, many alternative descriptions can be formed, only one or very few of which emerge in awareness at any time. Evidence from studies showing bi-/multistability in auditory streaming suggests that some, perhaps many, of the alternative descriptions are represented in the brain in parallel and that they continuously vie for conscious perception. Here, based on a predictive coding view, we consider the nature of these sound representations and how they compete with each other. Predictive processing helps to maintain perceptual stability by signalling the continuation of previously established patterns as well as the emergence of new sound sources. It also provides a measure of how well each of the competing representations describes the current acoustic scene. This account of auditory stream segregation has been tested on perceptual data obtained in the auditory streaming paradigm.
Affiliation(s)
- István Winkler
- Institute of Cognitive Neuroscience and Psychology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, PO Box 398, 1394 Budapest, Hungary.
28
Abstract
Background: Auditory sustained responses have recently been suggested to reflect neural processing of speech sounds in the auditory cortex. As periodic fluctuations below the pitch range are important for speech perception, it is necessary to investigate how low-frequency periodic sounds are processed in the human auditory cortex. Auditory sustained responses have been shown to be sensitive to temporal regularity, but the relationship between the amplitudes of auditory evoked sustained responses and the repetition rates of auditory inputs remains elusive. As the temporal and spectral features of sounds enhance different components of sustained responses, previous studies with click trains and vowel stimuli presented diverging results. In order to investigate the effect of repetition rate on cortical responses, we analyzed the auditory sustained fields evoked by periodic and aperiodic noises using magnetoencephalography. Results: Sustained fields were elicited by white noise and repeating frozen noise stimuli with repetition rates of 5, 10, 50, 200, and 500 Hz. The sustained field amplitudes were significantly larger for all the periodic stimuli than for white noise. Although the sustained field amplitudes showed a rising and falling pattern within the repetition rate range, the response amplitudes to the 5 Hz repetition rate were significantly larger than to 500 Hz. Conclusions: The enhanced sustained field responses to periodic noises show that cortical sensitivity to periodic sounds is maintained for a wide range of repetition rates. Persistence of periodicity sensitivity below the pitch range suggests that, in addition to processing the fundamental frequency of voice, sustained field generators can also resolve low-frequency temporal modulations in the speech envelope.
29
Abstract
Cocktail parties and other natural auditory environments present organisms with mixtures of sounds. Segregating individual sound sources is thought to require prior knowledge of source properties, yet these presumably cannot be learned unless the sources are segregated first. Here we show that the auditory system can bootstrap its way around this problem by identifying sound sources as repeating patterns embedded in the acoustic input. Due to the presence of competing sounds, source repetition is not explicit in the input to the ear, but it produces temporal regularities that listeners detect and use for segregation. We used a simple generative model to synthesize novel sounds with naturalistic properties. We found that such sounds could be segregated and identified if they occurred more than once across different mixtures, even when the same sounds were impossible to segregate in single mixtures. Sensitivity to the repetition of sound sources can permit their recovery in the absence of other segregation cues or prior knowledge of sounds, and could help solve the cocktail party problem.
30
McKeown D, Mills R, Mercer T. Comparisons of Complex Sounds across Extended Retention Intervals Survives Reading Aloud. Perception 2011; 40:1193-205. [DOI: 10.1068/p6988] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/15/2022]
Abstract
A simple experimental arrangement is designed to foil verbal rehearsal during an extended (from 5 to 30 s) retention interval across which participants attempt to discriminate two periodic complex sounds. Sounds have an abstract timbre that does not lend itself to verbal labeling, they differ across trials so that no ‘standard’ comparison stimulus is built up by the participants, and the spectral change to be discriminated is very slight and therefore does not shift the stimulus into a new verbal category. And, crucially, in one experimental condition, participants read aloud during most of the retention interval. Despite these precautions, performance is robust across the extended retention interval. The inference is that one form of auditory memory does not require verbal rehearsal. Nevertheless, modest forgetting occurred. Whatever form memory takes in this situation, it is not totally secure from disruption.
Affiliation(s)
- Tom Mercer
- Division of Psychology, School of Applied Sciences, University of Wolverhampton, Wolverhampton WV1 1LY, UK
31
Updating and feature overwriting in short-term memory for timbre. Atten Percept Psychophys 2010; 72:2289-303. [PMID: 21097870 DOI: 10.3758/bf03196702] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]
32
Agus TR, Thorpe SJ, Pressnitzer D. Rapid formation of robust auditory memories: insights from noise. Neuron 2010; 66:610-8. [PMID: 20510864 DOI: 10.1016/j.neuron.2010.04.014] [Citation(s) in RCA: 114] [Impact Index Per Article: 8.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 04/02/2010] [Indexed: 10/19/2022]
Abstract
Before a natural sound can be recognized, an auditory signature of its source must be learned through experience. Here we used random waveforms to probe the formation of new memories for arbitrary complex sounds. A behavioral measure was designed, based on the detection of repetitions embedded in noises up to 4 s long. Unbeknownst to listeners, some noise samples reoccurred randomly throughout an experimental block. Results showed that repeated exposure induced learning for otherwise totally unpredictable and meaningless sounds. The learning was unsupervised and resilient to interference from other task-relevant noises. When memories were formed, they emerged rapidly, performance became abruptly near-perfect, and multiple noises were remembered for several weeks. The acoustic transformations to which recall was tolerant suggest that the learned features were local in time. We propose that rapid sensory plasticity could explain how the auditory brain creates useful memories from the ever-changing, but sometimes repeating, acoustical world.
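The paradigm described above rests on stimuli whose waveform secretly contains an exact repetition. A minimal sketch of how such a "repeated noise" can be constructed, assuming nothing beyond the abstract's description (a noise segment cyclically repeated); the function name and parameters are illustrative, not from the paper:

```python
import random

def make_repeated_noise(segment_len, n_repeats, seed=None):
    """Build a 'repeated noise' by cyclically repeating one random segment.

    Illustrative sketch of the kind of stimulus described in the abstract:
    a noise whose waveform contains embedded exact repetitions. Sample
    values are uniform white noise in [-1, 1]; all names are hypothetical.
    """
    rng = random.Random(seed)
    segment = [rng.uniform(-1.0, 1.0) for _ in range(segment_len)]
    return segment * n_repeats

# A 2-cycle noise: the two halves are sample-for-sample identical,
# although each half on its own is unpredictable white noise.
noise = make_repeated_noise(segment_len=8000, n_repeats=2, seed=1)
```

To a listener such a stimulus sounds like plain noise unless the embedded repetition is detected, which is what makes it a useful probe of memory formation.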
Affiliation(s)
- Trevor R Agus
- Laboratoire psychologie de la perception, CNRS & Université Paris Descartes, Paris, France.
33
Furness D. Abstracts of the British Society of Audiology Short Papers Meeting on Experimental Studies of Hearing and Deafness. Int J Audiol 2010. [DOI: 10.3109/14992020903426264] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022]
34
Kretzschmar C, Kalenscher T, Güntürkün O, Kaernbach C. Echoic memory in pigeons. Behav Processes 2008; 79:105-10. [PMID: 18606214 DOI: 10.1016/j.beproc.2008.06.001] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/07/2007] [Revised: 06/02/2008] [Accepted: 06/06/2008] [Indexed: 11/17/2022]
35
Kaernbach C, Schlemmer K. The decay of pitch memory during rehearsal. J Acoust Soc Am 2008; 123:1846-1849. [PMID: 18396991 DOI: 10.1121/1.2875365] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/26/2023]
Abstract
The present study investigates the decay of pitch memory over time. In a delayed pitch comparison paradigm, participants had to memorize the pitch of a Shepard tone with silent rehearsal, overt rehearsal, or no rehearsal at all. During overt rehearsal, the participants' voices were recorded. Performance was best for silent rehearsal and worst for overt rehearsal. The differences, although partially significant, were not marked. The voice pitch during overt rehearsal was compatible with a random walk model, providing a possible explanation of why rehearsal does not improve the retention of the pitch trace.
Affiliation(s)
- Christian Kaernbach
- Institut für Psychologie, Christian-Albrechts-Universität zu Kiel, Olshausenstrasse 62, 24098 Kiel, Germany. www.kaernbach.de
36
Auditory memory for temporal characteristics of sound. J Comp Physiol A Neuroethol Sens Neural Behav Physiol 2008; 194:457-67. [PMID: 18299849 DOI: 10.1007/s00359-008-0318-2] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/15/2007] [Revised: 01/30/2008] [Accepted: 02/02/2008] [Indexed: 10/22/2022]
Abstract
This study evaluates auditory memory for variations in the rate of sinusoidal amplitude modulation (SAM) of noise bursts in the European starling (Sturnus vulgaris). To estimate the extent of the starling's auditory short-term memory store, a delayed non-matching-to-sample paradigm was applied. The birds were trained to discriminate between a series of identical "sample stimuli" and a single "test stimulus". The birds classified SAM rates of sample and test stimuli as being either the same or different. Memory performance of the birds was measured as the percentage of correct classifications. Auditory memory persistence time was estimated as a function of the delay between sample and test stimuli. Memory performance was significantly affected by the delay between sample and test and by the number of sample stimuli presented before the test stimulus, but was not affected by the difference in SAM rate between sample and test stimuli. The individuals' auditory memory persistence times varied between 2 and 13 s. The starlings' auditory memory persistence in the present study for signals varying in the temporal domain was significantly shorter compared to that of a previous study (Zokoll et al. in J Acoust Soc Am 121:2842, 2007) applying tonal stimuli varying in the spectral domain.
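The sinusoidal amplitude modulation (SAM) used as the discriminandum above is straightforward to generate: multiply white noise by a sinusoidal envelope at the modulation rate. A hedged sketch, with illustrative parameter names not taken from the paper:

```python
import math
import random

def sam_noise(n_samples, fs, mod_rate, depth=1.0, seed=None):
    """Sinusoidally amplitude-modulated (SAM) white noise.

    Illustrative sketch of the stimulus class described in the abstract:
    a noise carrier whose amplitude envelope follows a sinusoid at
    mod_rate Hz. fs is the sampling rate in Hz; depth in [0, 1] scales
    the modulation (depth=0 gives unmodulated noise). Names are
    hypothetical, not from the paper.
    """
    rng = random.Random(seed)
    out = []
    for n in range(n_samples):
        envelope = 1.0 + depth * math.sin(2 * math.pi * mod_rate * n / fs)
        out.append(envelope * rng.uniform(-1.0, 1.0))
    return out
```

In a delayed non-matching-to-sample design, sample and test stimuli would then differ only in `mod_rate`, isolating memory for temporal structure.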
37
Horváth J, Czigler I, Winkler I, Teder-Sälejärvi WA. The temporal window of integration in elderly and young adults. Neurobiol Aging 2007; 28:964-75. [PMID: 16793177 DOI: 10.1016/j.neurobiolaging.2006.05.002] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/13/2005] [Revised: 04/14/2006] [Accepted: 05/02/2006] [Indexed: 10/24/2022]
38
Abstract
Since its discovery by Näätänen and colleagues in 1978, the mismatch negativity (MMN) has been used as an index of auditory sensory memory. The present paper explicates various possibilities of how MMN can assess memory functions, it reveals possible traps when interpreting MMN as an index of auditory memory, and it reviews recent developments of paradigms showing that memory on a short time-scale, consolidation of memory traces, and even implicit memory can be probed with MMN.
Affiliation(s)
- Erich Schröger
- Institute for Psychology I, University of Leipzig, Germany
39
Dajani HR, Picton TW. Human auditory steady-state responses to changes in interaural correlation. Hear Res 2006; 219:85-100. [PMID: 16870369 DOI: 10.1016/j.heares.2006.06.003] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 12/14/2005] [Revised: 05/17/2006] [Accepted: 06/14/2006] [Indexed: 10/24/2022]
Abstract
Steady-state responses were evoked by noise stimuli that alternated between two levels of interaural correlation ρ at a frequency fm. With ρ alternating between +1 and 0, responses at fm dropped steeply above 4 Hz, but persisted up to 64 Hz. Two time constants of 47 and 4.4 ms with delays of 198 and 36 ms, respectively, were obtained by fitting responses to a transfer function based on symmetric exponential windows. The longer time constant, possibly reflecting cortical integration, is consistent with perceptual binaural "sluggishness". The shorter time constant may reflect running cross-correlation in the high brainstem or primary auditory cortex. Responses at 2fm peaked with an amplitude of 848 ± 479 nV (fm = 4 Hz). Investigation of this robust response revealed that: (1) changes in ρ and lateralization evoked similar responses, suggesting a common neural origin; (2) the response was most dependent on stimulus frequencies below 1000 Hz, but frequencies up to 4000 Hz also contributed; and (3) when ρ alternated between [0.2-1] and 0, response amplitude varied linearly with ρ, and the physiological response threshold was close to the average behavioral threshold (ρ = 0.31). This steady-state response may prove useful in the objective investigation of binaural hearing.
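A binaural noise pair with a prescribed interaural correlation ρ can be built by mixing a shared and an independent noise source, a standard construction (not necessarily the exact one used in the paper); all names below are illustrative:

```python
import math
import random

def correlated_noises(n_samples, rho, seed=None):
    """Two Gaussian noise channels with interaural correlation rho.

    Standard mixing construction: right = rho*shared + sqrt(1-rho^2)*indep,
    which yields corr(left, right) = rho in expectation. A sketch of how
    stimuli like those in the abstract could be made; names hypothetical.
    """
    rng = random.Random(seed)
    shared = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    indep = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    left = shared
    right = [rho * s + math.sqrt(1.0 - rho * rho) * i
             for s, i in zip(shared, indep)]
    return left, right
```

Alternating such segments between two ρ values at a rate fm would then approximate the stimulus manipulation the abstract describes.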
Affiliation(s)
- Hilmi R Dajani
- Rotman Research Institute at Baycrest and University of Toronto, 3560 Bathurst Street, Toronto, Ont., Canada M6A 2E1.
40