51
Vannier M, Misdariis N, Susini P, Grimault N. How does the perceptual organization of a multi-tone mixture interact with partial and global loudness judgments? J Acoust Soc Am 2018; 143:575. [PMID: 29390738] [DOI: 10.1121/1.5021551]
Abstract
Two experiments were conducted to investigate how the perceptual organization of a multi-tone mixture interacts with global and partial loudness judgments. Grouping (single-object) and segregating (two-object) conditions were created using frequency modulation, by applying the same or different modulation frequencies to the odd- and even-rank harmonics. While in Experiment 1 (Exp. 1) the two objects had the same loudness, in Experiment 2 (Exp. 2) loudness level differences (LLD) were introduced (LLD = 6, 12, 18, or 24 phons). In the two-object condition, the loudness of each object was not affected by the mixture when LLD = 0 (Exp. 1); otherwise (Exp. 2), the loudness of the softer object was modulated by the LLD, whereas the loudness of the louder object was the same regardless of whether it was presented within or outside the mixture. In both the single- and the two-object conditions, the global loudness of the mixture was close to the loudness of the loudest object. Taken together, these results suggest that while partial loudness judgments depend on the perceptual organization of the scene, global loudness does not. Yet, both partial and global loudness computations are governed by relative "saliences" between different auditory objects (in the segregating condition) or within a single object (in the grouping condition).
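As an illustration of the grouping manipulation described above, here is a minimal sketch (with hypothetical parameter values, not the authors' exact stimulus code) of applying the same or different frequency-modulation rates to odd- and even-rank harmonics:

```python
import math

def fm_harmonic_complex(f0, n_harmonics, fm_odd, fm_even,
                        depth=0.02, dur=0.5, sr=16000):
    """Harmonic complex whose odd- and even-rank harmonics are
    frequency-modulated at (possibly different) rates.

    Equal fm_odd/fm_even promotes grouping into a single auditory object;
    different rates promote segregation into two objects."""
    n = int(dur * sr)
    signal = [0.0] * n
    for k in range(1, n_harmonics + 1):
        rate = fm_odd if k % 2 == 1 else fm_even
        for i in range(n):
            t = i / sr
            # sinusoidal FM around the harmonic frequency k*f0
            phase = (2 * math.pi * k * f0 * t
                     + (depth * k * f0 / rate) * math.sin(2 * math.pi * rate * t))
            signal[i] += math.sin(phase)
    return signal

# grouping condition: one common modulation rate for all harmonics
grouped = fm_harmonic_complex(200, 6, fm_odd=5.0, fm_even=5.0)
# segregating condition: odd and even harmonics modulated at different rates
segregated = fm_harmonic_complex(200, 6, fm_odd=5.0, fm_even=7.0)
```

Here coherent modulation across all components tends to fuse them perceptually, while incoherent modulation splits the complex into two objects whose partial loudness can then be judged separately.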
Affiliation(s)
- Michaël Vannier
- STMS (UMR9912), Ircam, CNRS, Sorbonne Université, Ministère de la Culture, 1 place Igor Stravinsky, 75004 Paris, France
- Nicolas Misdariis
- STMS (UMR9912), Ircam, CNRS, Sorbonne Université, Ministère de la Culture, 1 place Igor Stravinsky, 75004 Paris, France
- Patrick Susini
- STMS (UMR9912), Ircam, CNRS, Sorbonne Université, Ministère de la Culture, 1 place Igor Stravinsky, 75004 Paris, France
- Nicolas Grimault
- UMR CNRS 5292, Centre de Recherche en Neurosciences de Lyon, Université Lyon 1, 50 Avenue Tony Garnier, 69366 Lyon Cedex 07, France
52
Dolležal LV, Tolnai S, Beutelmann R, Klump GM. Release from informational masking by auditory stream segregation: perception and its neural correlate. Eur J Neurosci 2017; 51:1242-1253. [PMID: 29247467] [DOI: 10.1111/ejn.13794]
Abstract
In the analysis of acoustic scenes, we easily miss sounds, or are insensitive to sound features, that would be salient if presented in isolation. This insensitivity, which is not due to interference in the inner ear, is termed informational masking (IM). So far, the cellular mechanisms underlying IM have remained elusive. Here, we applied a sequential IM paradigm to humans and gerbils, using a sound-level increment detection task to determine sensitivity to target tones in a background of standard tones (same frequency) and distracting tones (varying in level and frequency). The amount of IM, indicated by the level-increment thresholds, depended on the frequency separation between the distracting tones and the standard and target tones. Humans and gerbils showed similar perceptual thresholds, and both species exhibited a release from IM of more than 20 dB when the distracting tones were well segregated in frequency from the other tones. Neuronal rate responses elicited by similar sequences were recorded in gerbil inferior colliculus and auditory cortex. At both levels of the auditory pathway, neuronal thresholds obtained with a signal-detection-theoretic approach, which deduces sensitivity from the neurons' receiver operating characteristics, matched the psychophysical thresholds, revealing that IM already emerges at the midbrain level. By applying objective response measures in physiology and psychophysics, we demonstrate that the neuronal population is sufficiently sensitive to explain the perceptual level-increment thresholds indicating IM, and that the neuronal and perceptual release from IM related to auditory stream segregation correspond closely.
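The neuronal thresholds in this study were derived from receiver operating characteristics. A toy sketch of the underlying ROC computation (illustrative only, not the authors' analysis pipeline): the area under the ROC curve equals the probability that a spike count from a target trial exceeds one from a standard trial.

```python
def roc_auc(target_counts, standard_counts):
    """Area under the ROC curve: the probability that a spike count drawn
    from a target trial exceeds one drawn from a standard trial (ties 0.5).
    0.5 = chance, 1.0 = perfect discrimination."""
    wins = 0.0
    for t in target_counts:
        for s in standard_counts:
            if t > s:
                wins += 1.0
            elif t == s:
                wins += 0.5
    return wins / (len(target_counts) * len(standard_counts))

# toy single-trial spike counts: standards vs. level-increment targets
standard = [4, 5, 5, 6, 4]
target = [7, 8, 6, 9, 7]
auc = roc_auc(target, standard)  # 0.98: nearly perfect neural discrimination
```

Sweeping the increment size until the AUC crosses a criterion (e.g., 0.76) yields a neuronal threshold that can be compared directly with the psychophysical one.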
Affiliation(s)
- Lena-Vanessa Dolležal
- Animal Physiology and Behavior Group, Department for Neuroscience, School for Medicine and Health Sciences, Carl von Ossietzky University Oldenburg, 26111 Oldenburg, Germany; Cluster of Excellence Hearing4all, Carl von Ossietzky University Oldenburg, Oldenburg, Germany
- Sandra Tolnai
- Animal Physiology and Behavior Group, Department for Neuroscience, School for Medicine and Health Sciences, Carl von Ossietzky University Oldenburg, 26111 Oldenburg, Germany; Cluster of Excellence Hearing4all, Carl von Ossietzky University Oldenburg, Oldenburg, Germany
- Rainer Beutelmann
- Animal Physiology and Behavior Group, Department for Neuroscience, School for Medicine and Health Sciences, Carl von Ossietzky University Oldenburg, 26111 Oldenburg, Germany; Cluster of Excellence Hearing4all, Carl von Ossietzky University Oldenburg, Oldenburg, Germany
- Georg M Klump
- Animal Physiology and Behavior Group, Department for Neuroscience, School for Medicine and Health Sciences, Carl von Ossietzky University Oldenburg, 26111 Oldenburg, Germany; Cluster of Excellence Hearing4all, Carl von Ossietzky University Oldenburg, Oldenburg, Germany
53
Bauer AKR, Bleichner MG, Jaeger M, Thorne JD, Debener S. Dynamic phase alignment of ongoing auditory cortex oscillations. Neuroimage 2017; 167:396-407. [PMID: 29170070] [DOI: 10.1016/j.neuroimage.2017.11.037]
Abstract
Neural oscillations can synchronize to external rhythmic stimuli, for example in speech and music. While previous studies have mainly focused on elucidating the fundamental concept of neural entrainment, less is known about its time course. In this human electroencephalography (EEG) study, we unravel the temporal evolution of neural entrainment by contrasting short and long periods of rhythmic stimulation. Listeners had to detect short silent gaps that were systematically distributed with respect to the phase of a 3 Hz frequency-modulated tone. We found that gap detection performance was modulated by the stimulus stream, with a consistent stimulus phase across participants for both short and long stimulation. Electrophysiological analysis confirmed neural entrainment effects at 3 Hz and at the 6 Hz harmonic for both stimulation lengths. Source-level analysis at 3 Hz revealed that longer stimulation resulted in a shift of each participant's neural phase relative to the stimulus phase. Phase coupling increased over the first second of stimulation, with no further changes in coupling strength observed thereafter. The dynamic evolution of phase alignment suggests that the brain attunes to external rhythmic stimulation by adapting its internal representation of incoming environmental stimuli.
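Phase coupling between a stimulus and ongoing oscillations is commonly quantified with a phase-locking value across trials; a minimal illustration (toy phase values, not the authors' exact analysis) is:

```python
import cmath
import math

def phase_locking_value(phases):
    """Inter-trial phase coherence: magnitude of the mean unit phasor.
    1.0 = phases perfectly aligned across trials, ~0 = random phases."""
    mean_vector = sum(cmath.exp(1j * p) for p in phases) / len(phases)
    return abs(mean_vector)

# phases (radians) of the 3 Hz EEG component on individual trials
aligned = [0.10, 0.12, 0.08, 0.11]                   # entrained
scattered = [0.0, math.pi / 2, math.pi, 3 * math.pi / 2]  # unentrained
print(phase_locking_value(aligned))    # close to 1
print(phase_locking_value(scattered))  # close to 0
```

Tracking this quantity in sliding windows is one standard way to follow the build-up of entrainment over the first seconds of stimulation.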
Affiliation(s)
- Anna-Katharina R Bauer
- Neuropsychology Lab, Department of Psychology, European Medical School, University of Oldenburg, Ammerlaender Heerstraße 114-118, 26129 Oldenburg, Germany
- Martin G Bleichner
- Neuropsychology Lab, Department of Psychology, European Medical School, University of Oldenburg, Ammerlaender Heerstraße 114-118, 26129 Oldenburg, Germany; Cluster of Excellence Hearing4all, University of Oldenburg, 26129 Oldenburg, Germany
- Manuela Jaeger
- Neuropsychology Lab, Department of Psychology, European Medical School, University of Oldenburg, Ammerlaender Heerstraße 114-118, 26129 Oldenburg, Germany; Research Centre Neurosensory Science, University of Oldenburg, 26129 Oldenburg, Germany
- Jeremy D Thorne
- Neuropsychology Lab, Department of Psychology, European Medical School, University of Oldenburg, Ammerlaender Heerstraße 114-118, 26129 Oldenburg, Germany
- Stefan Debener
- Neuropsychology Lab, Department of Psychology, European Medical School, University of Oldenburg, Ammerlaender Heerstraße 114-118, 26129 Oldenburg, Germany; Cluster of Excellence Hearing4all, University of Oldenburg, 26129 Oldenburg, Germany; Research Centre Neurosensory Science, University of Oldenburg, 26129 Oldenburg, Germany
54
Shinn-Cunningham B. Cortical and Sensory Causes of Individual Differences in Selective Attention Ability Among Listeners With Normal Hearing Thresholds. J Speech Lang Hear Res 2017; 60:2976-2988. [PMID: 29049598] [PMCID: PMC5945067] [DOI: 10.1044/2017_jslhr-h-17-0080]
Abstract
PURPOSE: This review provides clinicians with an overview of recent findings relevant to understanding why listeners with normal hearing thresholds (NHTs) sometimes suffer from communication difficulties in noisy settings.
METHOD: The results from neuroscience and psychoacoustics are reviewed.
RESULTS: In noisy settings, listeners focus their attention by engaging cortical brain networks to suppress unimportant sounds; they then can analyze and understand an important sound, such as speech, amidst competing sounds. Differences in the efficacy of top-down control of attention can affect communication abilities. In addition, subclinical deficits in sensory fidelity can disrupt the ability to perceptually segregate sound sources, interfering with selective attention, even in listeners with NHTs. Studies of variability in control of attention and in sensory coding fidelity may help to isolate and identify some of the causes of communication disorders in individuals presenting at the clinic with "normal hearing."
CONCLUSIONS: How well an individual with NHTs can understand speech amidst competing sounds depends not only on the sound being audible but also on the integrity of cortical control networks and the fidelity of the representation of suprathreshold sound. Understanding the root cause of difficulties experienced by listeners with NHTs ultimately can lead to new, targeted interventions that address specific deficits affecting communication in noise.
PRESENTATION VIDEO: http://cred.pubs.asha.org/article.aspx?articleid=2601617
Affiliation(s)
- Barbara Shinn-Cunningham
- Center for Research in Sensory Communication and Emerging Neural Technology, Boston University, MA
55
Auditory sequential accumulation of spectral information. Hear Res 2017; 356:118-126. [PMID: 29042121] [DOI: 10.1016/j.heares.2017.10.001]
Abstract
In many listening situations, information about the spectral content of a target sound may be distributed over time, and estimating the target spectrum requires efficient sequential processing. Listeners' ability to estimate the spectrum of a random-frequency, six-tone complex was investigated; the spectral content of the complex was revealed using a sequence of bursts. Whether each of the six tones was presented within a given burst was determined at random according to a presentation probability. In separate conditions, the presentation probabilities (p) ranged from 0.2 to 1, the total number of bursts varied from 1 to 16, and the inter-burst interval was either 0 or 200 ms. To evaluate the information acquired by the listener, the burst sequence was followed, after a 500-ms silent interval, by the six-tone complex acting as an informational masker, and the listener was required to detect a pure-tone target presented simultaneously with the masker. Better performance in this task indicates more accurate estimation of the spectrum of the complex by the listener. Evidence for integration of information across bursts was observed, and the integration process did not significantly depend on the inter-burst interval.
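Under the design described, the chance that a given component tone is revealed at least once grows as 1 - (1 - p)^N with the number of bursts N. A small sketch of the stochastic presentation scheme (hypothetical code, not from the paper):

```python
import random

def reveal_probability(p, n_bursts):
    """Probability that a given component tone appears in at least one
    burst, when each burst includes it independently with probability p."""
    return 1.0 - (1.0 - p) ** n_bursts

def simulate_sequence(n_tones=6, p=0.5, n_bursts=8, seed=0):
    """Draw which of the n_tones are presented in each burst
    (True = tone present in that burst)."""
    rng = random.Random(seed)
    return [[rng.random() < p for _ in range(n_tones)]
            for _ in range(n_bursts)]

# with p = 0.5 and 8 bursts, each tone is revealed with probability ~0.996
print(reveal_probability(0.5, 8))
```

This makes explicit why performance should improve with both p and the number of bursts: each additional burst gives another independent chance to sample every component of the spectrum.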
56
Evidence for cue-independent spatial representation in the human auditory cortex during active listening. Proc Natl Acad Sci U S A 2017; 114:E7602-E7611. [PMID: 28827357] [DOI: 10.1073/pnas.1707522114]
Abstract
Few auditory functions are as important or as universal as the capacity for auditory spatial awareness (e.g., sound localization). That ability relies on sensitivity to acoustical cues, particularly interaural time and level differences (ITD and ILD), that correlate with sound-source locations. Under nonspatial listening conditions, cortical sensitivity to ITD and ILD takes the form of broad contralaterally dominated response functions. It is unknown, however, whether that sensitivity reflects representations of the specific physical cues or a higher-order representation of auditory space (i.e., integrated cue processing), nor is it known whether responses to spatial cues are modulated by active spatial listening. To investigate, sensitivity to parametrically varied ITD or ILD cues was measured using fMRI during spatial and nonspatial listening tasks. Task type varied across blocks where targets were presented in one of three dimensions: auditory location, pitch, or visual brightness. Task effects were localized primarily to lateral posterior superior temporal gyrus (pSTG) and modulated binaural-cue response functions differently in the two hemispheres. Active spatial listening (location tasks) enhanced both contralateral and ipsilateral responses in the right hemisphere but maintained or enhanced contralateral dominance in the left hemisphere. Two observations suggest integrated processing of ITD and ILD. First, overlapping regions in medial pSTG exhibited significant sensitivity to both cues. Second, successful classification of multivoxel patterns was observed for both cue types and, critically, for cross-cue classification. Together, these results suggest a higher-order representation of auditory space in the human auditory cortex that at least partly integrates the specific underlying cues.
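Cross-cue classification of the kind reported here can be illustrated with a toy nearest-centroid decoder trained on patterns evoked by one cue and tested on patterns evoked by the other (hypothetical data and decoder, not the authors' MVPA pipeline):

```python
import math

def centroid(rows):
    """Mean pattern (per-voxel average) of a list of patterns."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def nearest_label(x, centroids):
    """Return the label whose centroid is closest (Euclidean) to x."""
    def dist(a, b):
        return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))
    return min(centroids, key=lambda lab: dist(x, centroids[lab]))

# toy multivoxel patterns (3 voxels) evoked by left/right ITD cues
itd = {"left":  [[1.0, 0.2, 0.1], [0.9, 0.3, 0.2]],
       "right": [[0.1, 0.3, 1.0], [0.2, 0.2, 0.9]]}
# patterns evoked by ILD cues at the same perceived locations
ild_left, ild_right = [0.8, 0.25, 0.15], [0.15, 0.25, 0.85]

cents = {lab: centroid(rows) for lab, rows in itd.items()}
# cross-cue test: an ITD-trained decoder generalizes to ILD patterns
print(nearest_label(ild_left, cents))   # "left"
print(nearest_label(ild_right, cents))  # "right"
```

Successful generalization across cues in this sense is what licenses the inference of a cue-independent (integrated) spatial representation.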
57
Cui Z, Wang Q, Gao Y, Wang J, Wang M, Teng P, Guan Y, Zhou J, Li T, Luan G, Li L. Dynamic Correlations between Intrinsic Connectivity and Extrinsic Connectivity of the Auditory Cortex in Humans. Front Hum Neurosci 2017; 11:407. [PMID: 28848415] [PMCID: PMC5554526] [DOI: 10.3389/fnhum.2017.00407]
Abstract
The arrival of sound signals in the auditory cortex (AC) triggers both local and inter-regional signal propagations over time, up to hundreds of milliseconds, and builds up both intrinsic functional connectivity (iFC) and extrinsic functional connectivity (eFC) of the AC. However, interactions between iFC and eFC are largely unknown. Using intracranial stereo-electroencephalographic recordings in people with drug-refractory epilepsy, this study investigated the temporal dynamics of the relationships between iFC and eFC of the AC. The results showed that a Gaussian wideband-noise burst elicited marked potentials in both the AC and numerous higher-order cortical regions outside the AC (non-auditory cortices). Granger causality analyses revealed that, in the earlier time window, iFC of the AC was positively correlated with both eFC from the AC to the inferior temporal gyrus and eFC from the AC to the inferior parietal lobule. In later periods, iFC of the AC was positively correlated with eFC from the precentral gyrus to the AC and with eFC from the insula to the AC. In conclusion, dual-directional interactions occur between iFC and eFC of the AC in different time windows following sound stimulation, and may form the foundation underlying various central auditory processes, including auditory sensory memory, object formation, and integration between sensory, perceptual, attentional, motor, emotional, and executive processes.
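Granger causality asks whether the past of one signal improves prediction of another beyond that signal's own past. A toy lag-1 version (illustrative only; the authors' analysis of intracranial recordings is far more elaborate):

```python
def granger_reduction(x, y):
    """Fraction of lag-1 prediction error in y removed by adding past x.
    0 = no Granger-causal influence of x on y, 1 = past x fully predicts y.
    Full model:       y[t] = a*y[t-1] + b*x[t-1]
    Restricted model: y[t] = a*y[t-1]
    Both fit by least squares (no intercept, for brevity)."""
    ts = range(1, len(y))
    syy = sum(y[t - 1] * y[t - 1] for t in ts)
    sxx = sum(x[t - 1] * x[t - 1] for t in ts)
    sxy = sum(x[t - 1] * y[t - 1] for t in ts)
    sy1 = sum(y[t] * y[t - 1] for t in ts)
    sx1 = sum(y[t] * x[t - 1] for t in ts)
    det = syy * sxx - sxy * sxy          # normal equations, Cramer's rule
    a_full = (sy1 * sxx - sxy * sx1) / det
    b_full = (syy * sx1 - sxy * sy1) / det
    a_res = sy1 / syy
    ss_full = sum((y[t] - a_full * y[t - 1] - b_full * x[t - 1]) ** 2 for t in ts)
    ss_res = sum((y[t] - a_res * y[t - 1]) ** 2 for t in ts)
    return (ss_res - ss_full) / ss_res

# x drives y with a one-sample delay, so the reduction is (near) complete
x = [1, 0, 1, 0, 1, 0, 1, 0, 1, 0]
y = [0] + x[:-1]
print(granger_reduction(x, y))  # 1.0
```

Because the full model nests the restricted one, the reduction always lies in [0, 1]; values near 1 indicate a strong directed influence in the Granger sense.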
Affiliation(s)
- Zhuang Cui
- Beijing Key Laboratory of Epilepsy, Epilepsy Center, Department of Functional Neurosurgery, Sanbo Brain Hospital, Capital Medical University, Beijing, China; Beijing Hospital, Beijing, China
- Qian Wang
- Beijing Key Laboratory of Epilepsy, Epilepsy Center, Department of Functional Neurosurgery, Sanbo Brain Hospital, Capital Medical University, Beijing, China; School of Psychological and Cognitive Sciences and Beijing Key Laboratory of Behavior and Mental Health, Key Laboratory of Machine Perception (Ministry of Education), Peking University, Beijing, China
- Yayue Gao
- School of Psychological and Cognitive Sciences and Beijing Key Laboratory of Behavior and Mental Health, Key Laboratory of Machine Perception (Ministry of Education), Peking University, Beijing, China
- Jing Wang
- Beijing Key Laboratory of Epilepsy, Epilepsy Center, Department of Functional Neurosurgery, Sanbo Brain Hospital, Capital Medical University, Beijing, China
- Mengyang Wang
- Beijing Key Laboratory of Epilepsy, Epilepsy Center, Department of Functional Neurosurgery, Sanbo Brain Hospital, Capital Medical University, Beijing, China
- Pengfei Teng
- Beijing Key Laboratory of Epilepsy, Epilepsy Center, Department of Functional Neurosurgery, Sanbo Brain Hospital, Capital Medical University, Beijing, China
- Yuguang Guan
- Beijing Key Laboratory of Epilepsy, Epilepsy Center, Department of Functional Neurosurgery, Sanbo Brain Hospital, Capital Medical University, Beijing, China
- Jian Zhou
- Beijing Key Laboratory of Epilepsy, Epilepsy Center, Department of Functional Neurosurgery, Sanbo Brain Hospital, Capital Medical University, Beijing, China
- Tianfu Li
- Beijing Key Laboratory of Epilepsy, Epilepsy Center, Department of Functional Neurosurgery, Sanbo Brain Hospital, Capital Medical University, Beijing, China; Beijing Institute for Brain Disorders, Beijing, China
- Guoming Luan
- Beijing Key Laboratory of Epilepsy, Epilepsy Center, Department of Functional Neurosurgery, Sanbo Brain Hospital, Capital Medical University, Beijing, China; Beijing Institute for Brain Disorders, Beijing, China
- Liang Li
- School of Psychological and Cognitive Sciences and Beijing Key Laboratory of Behavior and Mental Health, Key Laboratory of Machine Perception (Ministry of Education), Peking University, Beijing, China; Beijing Institute for Brain Disorders, Beijing, China
58
Costa-Faidella J, Sussman ES, Escera C. Selective entrainment of brain oscillations drives auditory perceptual organization. Neuroimage 2017; 159:195-206. [PMID: 28757195] [DOI: 10.1016/j.neuroimage.2017.07.056]
Abstract
Perceptual sound organization supports our ability to make sense of the complex acoustic environment, to understand speech and to enjoy music. However, the neuronal mechanisms underlying the subjective experience of perceiving univocal auditory patterns that can be listened to, despite hearing all sounds in a scene, are poorly understood. We hereby investigated the manner in which competing sound organizations are simultaneously represented by specific brain activity patterns and the way attention and task demands prime the internal model generating the current percept. Using a selective attention task on ambiguous auditory stimulation coupled with EEG recordings, we found that the phase of low-frequency oscillatory activity dynamically tracks multiple sound organizations concurrently. However, whereas the representation of ignored sound patterns is circumscribed to auditory regions, large-scale oscillatory entrainment in auditory, sensory-motor and executive-control network areas reflects the active perceptual organization, thereby giving rise to the subjective experience of a unitary percept.
Affiliation(s)
- Jordi Costa-Faidella
- Brainlab - Cognitive Neuroscience Research Group, Department of Clinical Psychology and Psychobiology, University of Barcelona, 08035 Barcelona, Catalonia, Spain; Institute of Neurosciences, University of Barcelona, 08035 Barcelona, Catalonia, Spain
- Elyse S Sussman
- Departments of Neuroscience and Otorhinolaryngology-HNS, Albert Einstein College of Medicine, Bronx, NY 10461, USA
- Carles Escera
- Brainlab - Cognitive Neuroscience Research Group, Department of Clinical Psychology and Psychobiology, University of Barcelona, 08035 Barcelona, Catalonia, Spain; Institute of Neurosciences, University of Barcelona, 08035 Barcelona, Catalonia, Spain; Institut de Recerca Sant Joan de Déu, 08950 Esplugues de Llobregat, Catalonia, Spain
59
Hausfeld L, Gutschalk A, Formisano E, Riecke L. Effects of Cross-modal Asynchrony on Informational Masking in Human Cortex. J Cogn Neurosci 2017; 29:980-990. [DOI: 10.1162/jocn_a_01097]
Abstract
In many everyday listening situations, an otherwise audible sound may go unnoticed amid multiple other sounds. This auditory phenomenon, called informational masking (IM), is sensitive to visual input and involves early (50–250 msec) activity in the auditory cortex (the so-called awareness-related negativity). It is still unclear whether and how the timing of visual input influences the neural correlates of IM in auditory cortex. To address this question, we obtained simultaneous behavioral and neural measures of IM from human listeners in the presence of a visual input stream and varied the asynchrony between the visual stream and the rhythmic auditory target stream (in-phase, antiphase, or random). Results show effects of cross-modal asynchrony on both target detectability (RT and sensitivity) and the awareness-related negativity measured with EEG, which were driven primarily by antiphasic audiovisual stimuli. The neural effect was limited to the interval shortly before listeners' behavioral report of the target. Our results indicate that the relative timing of visual input can influence the IM of a target sound in the human auditory cortex. They further show that this audiovisual influence occurs early during the perceptual buildup of the target sound. In summary, these findings provide novel insights into the interaction of IM and multisensory interaction in the human brain.
60
Snyder JS, Elhilali M. Recent advances in exploring the neural underpinnings of auditory scene perception. Ann N Y Acad Sci 2017; 1396:39-55. [PMID: 28199022] [PMCID: PMC5446279] [DOI: 10.1111/nyas.13317]
Abstract
Studies of auditory scene analysis have traditionally relied on paradigms using artificial sounds-and conventional behavioral techniques-to elucidate how we perceptually segregate auditory objects or streams from each other. In the past few decades, however, there has been growing interest in uncovering the neural underpinnings of auditory segregation using human and animal neuroscience techniques, as well as computational modeling. This largely reflects the growth in the fields of cognitive neuroscience and computational neuroscience and has led to new theories of how the auditory system segregates sounds in complex arrays. The current review focuses on neural and computational studies of auditory scene perception published in the last few years. Following the progress that has been made in these studies, we describe (1) theoretical advances in our understanding of the most well-studied aspects of auditory scene perception, namely segregation of sequential patterns of sounds and concurrently presented sounds; (2) the diversification of topics and paradigms that have been investigated; and (3) how new neuroscience techniques (including invasive neurophysiology in awake humans, genotyping, and brain stimulation) have been used in this field.
Affiliation(s)
- Joel S. Snyder
- Department of Psychology, University of Nevada, Las Vegas, Las Vegas, Nevada
- Mounya Elhilali
- Department of Electrical and Computer Engineering, The Johns Hopkins University, Baltimore, Maryland
61
Rankin J, Osborn Popp PJ, Rinzel J. Stimulus Pauses and Perturbations Differentially Delay or Promote the Segregation of Auditory Objects: Psychoacoustics and Modeling. Front Neurosci 2017; 11:198. [PMID: 28473747] [PMCID: PMC5397483] [DOI: 10.3389/fnins.2017.00198]
Abstract
Segregating distinct sound sources is fundamental for auditory perception, as in the cocktail party problem. In a process called the build-up of stream segregation, distinct sound sources that are perceptually integrated initially can be segregated into separate streams after several seconds. Previous research concluded that abrupt changes in the incoming sounds during build-up—for example, a step change in location, loudness or timing—reset the percept to integrated. Following this reset, the multisecond build-up process begins again. Neurophysiological recordings in auditory cortex (A1) show fast (subsecond) adaptation, but unified mechanistic explanations for the bias toward integration, multisecond build-up and resets remain elusive. Combining psychoacoustics and modeling, we show that initial unadapted A1 responses bias integration, that the slowness of build-up arises naturally from competition downstream, and that recovery of adaptation can explain resets. An early bias toward integrated perceptual interpretations arising from primary cortical stages that encode low-level features and feed into competition downstream could also explain similar phenomena in vision. Further, we report a previously overlooked class of perturbations that promote segregation rather than integration. Our results challenge current understanding for perturbation effects on the emergence of sound source segregation, leading to a new hypothesis for differential processing downstream of A1. Transient perturbations can momentarily redirect A1 responses as input to downstream competition units that favor segregation.
Affiliation(s)
- James Rankin
- Department of Mathematics, University of Exeter, Exeter, UK; Center for Neural Science, New York University, New York, NY, USA
- John Rinzel
- Center for Neural Science, New York University, New York, NY, USA; Courant Institute of Mathematical Sciences, New York, NY, USA
62
Temporal coherence structure rapidly shapes neuronal interactions. Nat Commun 2017; 8:13900. [PMID: 28054545] [PMCID: PMC5228385] [DOI: 10.1038/ncomms13900]
Abstract
Perception of segregated sources is essential in navigating cluttered acoustic environments. A basic mechanism to implement this process is the temporal coherence principle. It postulates that a signal is perceived as emitted from a single source only when all of its features are temporally modulated coherently, causing them to bind perceptually. Here we report on neural correlates of this process as rapidly reshaped interactions in primary auditory cortex, measured in three different ways: as changes in response rates, as adaptations of spectrotemporal receptive fields following stimulation by temporally coherent and incoherent tone sequences, and as changes in spiking correlations during the tone sequences. Responses, sensitivity, and presumed connectivity were rapidly enhanced by synchronous stimuli, and suppressed by alternating (asynchronous) sounds, but only when the animals engaged in task performance and were attentive to the stimuli. Temporal coherence and attention are therefore both important factors in auditory scene analysis.

[Editor's summary] One can easily identify whether multiple sounds originate from a single source, yet the neural mechanisms underlying this process are unknown. Here the authors show that temporally coherent sounds elicit changes in the receptive-field dynamics of auditory cortical neurons in ferrets, but only when the animals are paying attention.
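The temporal coherence principle can be illustrated by correlating the amplitude envelopes of two tone streams: synchronous envelopes are strongly positively correlated (and should bind), alternating ones negatively correlated (and should segregate). A minimal sketch (hypothetical toy envelopes, not the paper's analysis):

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# on/off amplitude envelopes of two tones, sampled per time bin
tone_a      = [1, 0, 1, 0, 1, 0, 1, 0]
synchronous = [1, 0, 1, 0, 1, 0, 1, 0]   # coherent: bind into one stream
alternating = [0, 1, 0, 1, 0, 1, 0, 1]   # incoherent: segregate

print(pearson(tone_a, synchronous))   # close to 1
print(pearson(tone_a, alternating))   # close to -1
```

In coherence-based models, such pairwise envelope correlations across feature channels are the quantity that determines which channels group into a common auditory object.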
63
Shinn-Cunningham B, Best V, Lee AKC. Auditory Object Formation and Selection. Springer Handbook of Auditory Research 2017. [DOI: 10.1007/978-3-319-51662-2_2]
64
Szabó BT, Denham SL, Winkler I. Computational Models of Auditory Scene Analysis: A Review. Front Neurosci 2016; 10:524. [PMID: 27895552] [PMCID: PMC5108797] [DOI: 10.3389/fnins.2016.00524]
Abstract
Auditory scene analysis (ASA) refers to the process(es) of parsing the complex acoustic input into auditory perceptual objects representing either physical sources or temporal sound patterns, such as melodies, which contributed to the sound waves reaching the ears. A number of new computational models accounting for some of the perceptual phenomena of ASA have been published recently. Here we provide a theoretically motivated review of these computational models, aiming to relate their guiding principles to the central issues of the theoretical framework of ASA. Specifically, we ask how they achieve the grouping and separation of sound elements and whether they implement some form of competition between alternative interpretations of the sound input. We consider the extent to which they include predictive processes, as important current theories suggest that perception is inherently predictive, and also how they have been evaluated. We conclude that current computational models of ASA are fragmentary in the sense that, rather than providing general competing interpretations of ASA, they focus on assessing the utility of specific processes (or algorithms) for finding the causes of the complex acoustic signal. This leaves open the possibility of integrating complementary aspects of the models into a more comprehensive theory of ASA.
Affiliation(s)
- Beáta T Szabó
- Faculty of Information Technology and Bionics, Pázmány Péter Catholic University, Budapest, Hungary; Institute of Cognitive Neuroscience and Psychology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Budapest, Hungary
- Susan L Denham
- School of Psychology, University of Plymouth, Plymouth, UK
- István Winkler
- Institute of Cognitive Neuroscience and Psychology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Budapest, Hungary
65
Tóth B, Kocsis Z, Háden GP, Szerafin Á, Shinn-Cunningham BG, Winkler I. EEG signatures accompanying auditory figure-ground segregation. Neuroimage 2016; 141:108-119. [PMID: 27421185 PMCID: PMC5656226 DOI: 10.1016/j.neuroimage.2016.07.028] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/02/2016] [Revised: 07/06/2016] [Accepted: 07/11/2016] [Indexed: 11/16/2022] Open
Abstract
In everyday acoustic scenes, figure-ground segregation typically requires one to group together sound elements over both time and frequency. The electroencephalogram (EEG) was recorded while listeners detected repeating tonal complexes composed of a random set of pure tones within stimuli consisting of randomly varying tonal elements. The repeating pattern was perceived as a figure over the randomly changing background. It was found that detection performance improved both as the number of pure tones making up each repeated complex (figure coherence) increased, and as the number of repeated complexes (duration) increased - i.e., detection was easier when either the spectral or temporal structure of the figure was enhanced. Figure detection was accompanied by the elicitation of the object-related negativity (ORN) and the P400 event-related potentials (ERPs), which have been previously shown to be evoked by the presence of two concurrent sounds. Both ERP components had generators within and outside of auditory cortex. The amplitudes of the ORN and the P400 increased with both figure coherence and figure duration. However, only the P400 amplitude correlated with detection performance. These results suggest that (1) the ORN and P400 reflect processes involved in detecting the emergence of a new auditory object in the presence of other concurrent auditory objects; (2) the ORN corresponds to the likelihood of the presence of two or more concurrent sound objects, whereas the P400 reflects the perceptual recognition of the presence of multiple auditory objects and/or preparation for reporting the detection of a target object.
Affiliation(s)
- Brigitta Tóth
- Institute of Cognitive Neuroscience and Psychology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Budapest, Hungary; Center for Computational Neuroscience and Neural Technology, Boston University, Boston, USA.
- Zsuzsanna Kocsis
- Institute of Cognitive Neuroscience and Psychology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Budapest, Hungary; Department of Cognitive Science, Faculty of Natural Sciences, Budapest University of Technology and Economics, Budapest, Hungary
- Gábor P Háden
- Institute of Cognitive Neuroscience and Psychology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Budapest, Hungary
- Ágnes Szerafin
- Institute of Cognitive Neuroscience and Psychology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Budapest, Hungary; Department of Cognitive Science, Faculty of Natural Sciences, Budapest University of Technology and Economics, Budapest, Hungary
- István Winkler
- Institute of Cognitive Neuroscience and Psychology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Budapest, Hungary; Department of Cognitive and Neuropsychology, Institute of Psychology, University of Szeged, Szeged, Hungary
66
Dykstra AR, Halgren E, Gutschalk A, Eskandar EN, Cash SS. Neural Correlates of Auditory Perceptual Awareness and Release from Informational Masking Recorded Directly from Human Cortex: A Case Study. Front Neurosci 2016; 10:472. [PMID: 27812318 PMCID: PMC5071374 DOI: 10.3389/fnins.2016.00472] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/09/2016] [Accepted: 10/03/2016] [Indexed: 11/13/2022] Open
Abstract
In complex acoustic environments, even salient supra-threshold sounds sometimes go unperceived, a phenomenon known as informational masking. The neural basis of informational masking (and its release) has not been well-characterized, particularly outside auditory cortex. We combined electrocorticography in a neurosurgical patient undergoing invasive epilepsy monitoring with trial-by-trial perceptual reports of isochronous target-tone streams embedded in random multi-tone maskers. Awareness of such masker-embedded target streams was associated with a focal negativity between 100 and 200 ms and high-gamma activity (HGA) between 50 and 250 ms (both in auditory cortex on the posterolateral superior temporal gyrus) as well as a broad P3b-like potential (between ~300 and 600 ms) with generators in ventrolateral frontal and lateral temporal cortex. Unperceived target tones elicited drastically reduced versions of such responses, if at all. While it remains unclear whether these responses reflect conscious perception itself, as opposed to pre- or post-perceptual processing, the results suggest that conscious perception of target sounds in complex listening environments may engage diverse neural mechanisms in distributed brain areas.
Affiliation(s)
- Andrew R Dykstra
- Program in Speech and Hearing Bioscience and Technology, Harvard-MIT Division of Health Sciences and Technology, Cambridge, MA, USA; Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Eric Halgren
- Departments of Radiology and Neurosciences, University of California San Diego, La Jolla, CA, USA
- Alexander Gutschalk
- Department of Neurology, Ruprecht-Karls-Universität Heidelberg, Heidelberg, Germany
- Emad N Eskandar
- Department of Neurosurgery, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Sydney S Cash
- Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
67
Teki S, Barascud N, Picard S, Payne C, Griffiths TD, Chait M. Neural Correlates of Auditory Figure-Ground Segregation Based on Temporal Coherence. Cereb Cortex 2016; 26:3669-80. [PMID: 27325682 PMCID: PMC5004755 DOI: 10.1093/cercor/bhw173] [Citation(s) in RCA: 50] [Impact Index Per Article: 6.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022] Open
Abstract
To make sense of natural acoustic environments, listeners must parse complex mixtures of sounds that vary in frequency, space, and time. Emerging work suggests that, in addition to the well-studied spectral cues for segregation, sensitivity to temporal coherence (the coincidence of sound elements in and across time) is also critical for the perceptual organization of acoustic scenes. Here, we examine pre-attentive, stimulus-driven neural processes underlying auditory figure-ground segregation using stimuli that capture the challenges of listening in complex scenes where segregation cannot be achieved based on spectral cues alone. Signals ("stochastic figure-ground": SFG) comprised a sequence of brief broadband chords containing random pure tone components that vary from one chord to another. Occasional tone repetitions across chords are perceived as "figures" popping out of a stochastic "ground." Magnetoencephalography (MEG) measurement in naïve, distracted, human subjects revealed robust evoked responses, commencing about 150 ms after figure onset, that reflect the emergence of the "figure" from the randomly varying "ground." Neural sources underlying this bottom-up driven figure-ground segregation were localized to planum temporale and the intraparietal sulcus, demonstrating that this area, outside the "classic" auditory system, is also involved in the early stages of auditory scene analysis.
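The SFG stimulus construction described in the abstract can be sketched in a few lines; the chord duration, tone counts, frequency range, and onset of the repeated components below are illustrative assumptions for a minimal sketch, not the study's exact parameters:

```python
import numpy as np

def make_sfg(n_chords=40, chord_dur=0.05, fs=16000,
             n_bg=10, n_fig=4, fig_onset=20, rng=None):
    """Stochastic figure-ground signal: a sequence of random multi-tone
    chords ('ground'), with a fixed set of frequencies repeated across
    chords ('figure') from chord index fig_onset onward."""
    rng = np.random.default_rng(rng)
    freq_pool = np.geomspace(180.0, 7000.0, 120)   # log-spaced candidate tones
    fig_freqs = rng.choice(freq_pool, n_fig, replace=False)
    t = np.arange(int(chord_dur * fs)) / fs
    chords = []
    for i in range(n_chords):
        freqs = list(rng.choice(freq_pool, n_bg, replace=False))
        if i >= fig_onset:                          # add the repeated components
            freqs += list(fig_freqs)
        chord = sum(np.sin(2 * np.pi * f * t) for f in freqs)
        chords.append(chord / len(freqs))           # rough level normalization
    return np.concatenate(chords), fig_freqs
```

Because the figure components overlap the ground in both frequency and time, only their repetition across chords (temporal coherence) distinguishes them, which is the property the stimulus is designed to isolate.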
Affiliation(s)
- Sundeep Teki
- Wellcome Trust Centre for Neuroimaging, University College London, London WC1N 3BG, UK
- Auditory Cognition Group, Institute of Neuroscience, Newcastle University, Newcastle upon Tyne NE2 4HH, UK
- Current address: Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford OX1 3QX, UK
- Nicolas Barascud
- Wellcome Trust Centre for Neuroimaging, University College London, London WC1N 3BG, UK
- Ear Institute, University College London, London WC1X 8EE, UK
- Samuel Picard
- Ear Institute, University College London, London WC1X 8EE, UK
- Timothy D. Griffiths
- Wellcome Trust Centre for Neuroimaging, University College London, London WC1N 3BG, UK
- Auditory Cognition Group, Institute of Neuroscience, Newcastle University, Newcastle upon Tyne NE2 4HH, UK
- Maria Chait
- Ear Institute, University College London, London WC1X 8EE, UK
68
Peelle JE, Wingfield A. The Neural Consequences of Age-Related Hearing Loss. Trends Neurosci 2016; 39:486-497. [PMID: 27262177 DOI: 10.1016/j.tins.2016.05.001] [Citation(s) in RCA: 152] [Impact Index Per Article: 19.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/14/2016] [Revised: 05/04/2016] [Accepted: 05/09/2016] [Indexed: 01/02/2023]
Abstract
During hearing, acoustic signals travel up the ascending auditory pathway from the cochlea to auditory cortex; efferent connections provide descending feedback. In human listeners, although auditory and cognitive processing have sometimes been viewed as separate domains, a growing body of work suggests they are intimately coupled. Here, we review the effects of hearing loss on neural systems supporting spoken language comprehension, beginning with age-related physiological decline. We suggest that listeners recruit domain general executive systems to maintain successful communication when the auditory signal is degraded, but that this compensatory processing has behavioral consequences: even relatively mild levels of hearing loss can lead to cascading cognitive effects that impact perception, comprehension, and memory, leading to increased listening effort during speech comprehension.
Affiliation(s)
- Jonathan E Peelle
- Department of Otolaryngology, Washington University in St Louis, St Louis, MO, USA.
- Arthur Wingfield
- Volen National Center for Complex Systems, Brandeis University, Waltham, MA, USA.
69
Riecke L, Sack AT, Schroeder CE. Endogenous Delta/Theta Sound-Brain Phase Entrainment Accelerates the Buildup of Auditory Streaming. Curr Biol 2015; 25:3196-201. [PMID: 26628008 DOI: 10.1016/j.cub.2015.10.045] [Citation(s) in RCA: 60] [Impact Index Per Article: 6.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/02/2015] [Revised: 10/01/2015] [Accepted: 10/19/2015] [Indexed: 11/30/2022]
Abstract
In many natural listening situations, meaningful sounds (e.g., speech) fluctuate in slow rhythms among other sounds. When a slow rhythmic auditory stream is selectively attended, endogenous delta (1‒4 Hz) oscillations in auditory cortex may shift their timing so that higher-excitability neuronal phases become aligned with salient events in that stream [1, 2]. As a consequence of this stream-brain phase entrainment [3], these events are processed and perceived more readily than temporally non-overlapping events [4-11], essentially enhancing the neural segregation between the attended stream and temporally noncoherent streams [12]. Stream-brain phase entrainment is robust to acoustic interference [13-20] provided that target stream-evoked rhythmic activity can be segregated from noncoherent activity evoked by other sounds [21], a process that usually builds up over time [22-27]. However, it has remained unclear whether stream-brain phase entrainment functionally contributes to this buildup of rhythmic streams or whether it is merely an epiphenomenon of it. Here, we addressed this issue directly by experimentally manipulating endogenous stream-brain phase entrainment in human auditory cortex with non-invasive transcranial alternating current stimulation (TACS) [28-30]. We assessed the consequences of these manipulations on the perceptual buildup of the target stream (the time required to recognize its presence in a noisy background), using behavioral measures in 20 healthy listeners performing a naturalistic listening task. Experimentally induced cyclic 4-Hz variations in stream-brain phase entrainment reliably caused a cyclic 4-Hz pattern in perceptual buildup time. Our findings demonstrate that strong endogenous delta/theta stream-brain phase entrainment accelerates the perceptual emergence of task-relevant rhythmic streams in noisy environments.
Affiliation(s)
- Lars Riecke
- Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, 6229 Maastricht, the Netherlands.
- Alexander T Sack
- Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, 6229 Maastricht, the Netherlands
- Charles E Schroeder
- Cognitive Neuroscience and Schizophrenia Program, Nathan Kline Institute for Psychiatric Research, Orangeburg, NY 10962, USA; Departments of Neurosurgery and Psychiatry, Columbia University College of Physicians and Surgeons, New York, NY 10032-2695, USA
70
Dykstra AR, Gutschalk A. Does the mismatch negativity operate on a consciously accessible memory trace? SCIENCE ADVANCES 2015; 1:e1500677. [PMID: 26702432 PMCID: PMC4681331 DOI: 10.1126/sciadv.1500677] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 05/26/2015] [Accepted: 10/13/2015] [Indexed: 06/05/2023]
Abstract
The extent to which the contents of short-term memory are consciously accessible is a fundamental question of cognitive science. In audition, short-term memory is often studied via the mismatch negativity (MMN), a change-related component of the auditory evoked response that is elicited by violations of otherwise regular stimulus sequences. The prevailing functional view of the MMN is that it operates on preattentive and even preconscious stimulus representations. We directly examined the preconscious notion of the MMN using informational masking and magnetoencephalography. Spectrally isolated and otherwise suprathreshold auditory oddball sequences were occasionally rendered inaudible by embedding them in random multitone masker "clouds." Despite identical stimulation/task contexts and a clear representation of all stimuli in auditory cortex, MMN was only observed when the preceding regularity (that is, the standard stream) was consciously perceived. The results call into question the preconscious interpretation of MMN and raise the possibility that it might index partial awareness in the absence of overt behavior.
71
Detecting tones in complex auditory scenes. Neuroimage 2015; 122:203-13. [DOI: 10.1016/j.neuroimage.2015.07.001] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/18/2014] [Revised: 03/12/2015] [Accepted: 07/01/2015] [Indexed: 11/18/2022] Open
72
Masutomi K, Barascud N, Kashino M, McDermott JH, Chait M. Sound segregation via embedded repetition is robust to inattention. J Exp Psychol Hum Percept Perform 2015; 42:386-400. [PMID: 26480248 PMCID: PMC4763252 DOI: 10.1037/xhp0000147] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]
Abstract
The segregation of sound sources from the mixture of sounds that enters the ear is a core capacity of human hearing, but the extent to which this process is dependent on attention remains unclear. This study investigated the effect of attention on the ability to segregate sounds via repetition. We utilized a dual-task design in which stimuli to be segregated were presented along with stimuli for a "decoy" task that required continuous monitoring. The task to assess segregation presented a target sound 10 times in a row, each time concurrent with a different distractor sound. McDermott, Wrobleski, and Oxenham (2011) demonstrated that repetition causes the target sound to be segregated from the distractors. Segregation was queried by asking listeners whether a subsequent probe sound was identical to the target. A control task presented similar stimuli but probed discrimination without engaging segregation processes. We present results from 3 different decoy tasks: a visual multiple object tracking task, a rapid serial visual presentation (RSVP) digit encoding task, and a demanding auditory monitoring task. Load was manipulated by using high- and low-demand versions of each decoy task. The data provide converging evidence of a small effect of attention that is nonspecific, in that it affected the segregation and control tasks to a similar extent. In all cases, segregation performance remained high despite the presence of a concurrent, objectively demanding decoy task. The results suggest that repetition-based segregation is robust to inattention.
Affiliation(s)
- Keiko Masutomi
- Interdisciplinary Graduate School of Science and Engineering, Tokyo Institute of Technology
- Makio Kashino
- Interdisciplinary Graduate School of Science and Engineering, Tokyo Institute of Technology
- Josh H McDermott
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology
73
Billig AJ, Carlyon RP. Automaticity and primacy of auditory streaming: Concurrent subjective and objective measures. J Exp Psychol Hum Percept Perform 2015; 42:339-353. [PMID: 26414168 PMCID: PMC4763253 DOI: 10.1037/xhp0000146] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/25/2022]
Abstract
Two experiments used subjective and objective measures to study the automaticity and primacy of auditory streaming. Listeners heard sequences of “ABA–” triplets, where “A” and “B” were tones of different frequencies and “–” was a silent gap. Segregation was more frequently reported, and rhythmically deviant triplets less well detected, for a greater between-tone frequency separation and later in the sequence. In Experiment 1, performing a competing auditory task for the first part of the sequence led to a reduction in subsequent streaming compared to when the tones were attended throughout. This is consistent with focused attention promoting streaming, and/or with attention switches resetting it. However, the proportion of segregated reports increased more rapidly following a switch than at the start of a sequence, indicating that some streaming occurred automatically. Modeling ruled out a simple “covert attention” account of this finding. Experiment 2 required listeners to perform subjective and objective tasks concurrently. It revealed superior performance during integrated compared to segregated reports, beyond that explained by the codependence of the two measures on stimulus parameters. We argue that listeners have limited access to low-level stimulus representations once perceptual organization has occurred, and that subjective and objective streaming measures partly index the same processes.
74
Thakur CS, Wang RM, Afshar S, Hamilton TJ, Tapson JC, Shamma SA, van Schaik A. Sound stream segregation: a neuromorphic approach to solve the "cocktail party problem" in real-time. Front Neurosci 2015; 9:309. [PMID: 26388721 PMCID: PMC4557082 DOI: 10.3389/fnins.2015.00309] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/19/2015] [Accepted: 08/18/2015] [Indexed: 11/13/2022] Open
Abstract
The human auditory system has the ability to segregate complex auditory scenes into a foreground component and a background, allowing us to listen to specific speech sounds from a mixture of sounds. Selective attention plays a crucial role in this process, colloquially known as the "cocktail party effect." It has not been possible to build a machine that can emulate this human ability in real-time. Here, we have developed a framework for the implementation of a neuromorphic sound segregation algorithm in a Field Programmable Gate Array (FPGA). This algorithm is based on the principles of temporal coherence and uses an attention signal to separate a target sound stream from background noise. Temporal coherence implies that auditory features belonging to the same sound source are coherently modulated and evoke highly correlated neural response patterns. The basis for this form of sound segregation is that responses from pairs of channels that are strongly positively correlated belong to the same stream, while channels that are uncorrelated or anti-correlated belong to different streams. In our framework, we have used a neuromorphic cochlea as a frontend sound analyser to extract spatial information of the sound input, which then passes through band-pass filters that extract the sound envelope at various modulation rates. Further stages include feature extraction and mask generation, which is finally used to reconstruct the targeted sound. Using sample tonal and speech mixtures, we show that our FPGA architecture is able to segregate sound sources in real-time. The accuracy of segregation is indicated by the high signal-to-noise ratio (SNR) of the segregated stream (90, 77, and 55 dB for simple tone, complex tone, and speech, respectively) as compared to the SNR of the mixture waveform (0 dB). This system may be easily extended for the segregation of complex speech signals, and may thus find various applications in electronic devices for sound segregation and speech recognition.
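The grouping rule the abstract describes (channels whose envelopes are strongly positively correlated are assigned to the same stream; uncorrelated or anti-correlated channels to different streams) can be sketched as follows; the correlation threshold and the greedy grouping scheme are illustrative assumptions for a minimal sketch, not the paper's FPGA implementation:

```python
import numpy as np

def segregate_by_temporal_coherence(envelopes, threshold=0.5):
    """Group frequency channels by temporal coherence.

    envelopes: array of shape (n_channels, n_samples) holding the band
    envelope of each cochlear channel. Returns a list of channel-index
    groups, one per putative stream.
    """
    n = envelopes.shape[0]
    corr = np.corrcoef(envelopes)          # pairwise envelope correlations
    unassigned = set(range(n))
    streams = []
    while unassigned:
        seed = unassigned.pop()
        group = [seed]
        # channels strongly positively correlated with the seed join its stream
        for ch in list(unassigned):
            if corr[seed, ch] > threshold:
                group.append(ch)
                unassigned.remove(ch)
        streams.append(sorted(group))
    return streams
```

Two channels driven by the same source share modulation and correlate near 1, while channels driven by independent sources correlate near 0, so the thresholded correlation matrix partitions the channels into streams.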
Affiliation(s)
- Chetan Singh Thakur
- Biomedical Engineering and Neuroscience, The MARCS Institute, University of Western Sydney, Sydney, NSW, Australia
- Runchun M. Wang
- Biomedical Engineering and Neuroscience, The MARCS Institute, University of Western Sydney, Sydney, NSW, Australia
- Saeed Afshar
- Biomedical Engineering and Neuroscience, The MARCS Institute, University of Western Sydney, Sydney, NSW, Australia
- Tara J. Hamilton
- Biomedical Engineering and Neuroscience, The MARCS Institute, University of Western Sydney, Sydney, NSW, Australia
- Jonathan C. Tapson
- Biomedical Engineering and Neuroscience, The MARCS Institute, University of Western Sydney, Sydney, NSW, Australia
- Shihab A. Shamma
- Department of Electrical and Computer Engineering and Institute for Systems Research, University of Maryland, College Park, MD, USA
- André van Schaik
- Biomedical Engineering and Neuroscience, The MARCS Institute, University of Western Sydney, Sydney, NSW, Australia
75
Abstract
The neural resonance theory of musical meter explains musical beat tracking as the result of entrainment of neural oscillations to the beat frequency and its higher harmonics. This theory has gained empirical support from experiments using simple, abstract stimuli. However, to date there has been no empirical evidence for a role of neural entrainment in the perception of the beat of ecologically valid music. Here we presented participants with a single pop song with a superimposed bassoon sound. This stimulus was either lined up with the beat of the music or shifted away from the beat by 25% of the average interbeat interval. Both conditions elicited a neural response at the beat frequency. However, although the on-the-beat condition elicited a clear response at the first harmonic of the beat, this frequency was absent in the neural response to the off-the-beat condition. These results support a role for neural entrainment in tracking the metrical structure of real music and show that neural meter tracking can be disrupted by the presentation of contradictory rhythmic cues.
76
O'Sullivan JA, Shamma SA, Lalor EC. Evidence for Neural Computations of Temporal Coherence in an Auditory Scene and Their Enhancement during Active Listening. J Neurosci 2015; 35:7256-63. [PMID: 25948273 PMCID: PMC6605258 DOI: 10.1523/jneurosci.4973-14.2015] [Citation(s) in RCA: 40] [Impact Index Per Article: 4.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/07/2014] [Revised: 03/10/2015] [Accepted: 03/31/2015] [Indexed: 11/21/2022] Open
Abstract
The human brain has evolved to operate effectively in highly complex acoustic environments, segregating multiple sound sources into perceptually distinct auditory objects. A recent theory seeks to explain this ability by arguing that stream segregation occurs primarily due to the temporal coherence of the neural populations that encode the various features of an individual acoustic source. This theory has received support from both psychoacoustic and functional magnetic resonance imaging (fMRI) studies that use stimuli which model complex acoustic environments. Termed stochastic figure-ground (SFG) stimuli, they are composed of a "figure" and background that overlap in spectrotemporal space, such that the only way to segregate the figure is by computing the coherence of its frequency components over time. Here, we extend these psychoacoustic and fMRI findings by using the greater temporal resolution of electroencephalography to investigate the neural computation of temporal coherence. We present subjects with modified SFG stimuli wherein the temporal coherence of the figure is modulated stochastically over time, which allows us to use linear regression methods to extract a signature of the neural processing of this temporal coherence. We do this under both active and passive listening conditions. Our findings show an early effect of coherence during passive listening, lasting from ∼115 to 185 ms post-stimulus. When subjects are actively listening to the stimuli, these responses are larger and last longer, up to ∼265 ms. These findings provide evidence for early and preattentive neural computations of temporal coherence that are enhanced by active analysis of an auditory scene.
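The regression approach the abstract describes, extracting a signature of neural coherence processing by linearly mapping a stochastic stimulus feature onto the recorded response, can be sketched with a simple lagged ridge estimator; the lag range, regularization strength, and simulated data below are assumptions for illustration, not the study's analysis pipeline:

```python
import numpy as np

def lagged_design(stim, max_lag):
    """Design matrix of time-lagged copies of a stimulus regressor."""
    n = len(stim)
    X = np.zeros((n, max_lag + 1))
    for k in range(max_lag + 1):
        X[k:, k] = stim[:n - k]
    return X

def fit_trf(stim, resp, max_lag, alpha=1.0):
    """Ridge-regress the neural response onto lagged stimulus values.

    The returned weights over lags approximate the impulse response of
    the neural system to the stimulus feature (e.g., figure coherence).
    """
    X = lagged_design(stim, max_lag)
    # closed-form ridge solution: w = (X'X + alpha*I)^(-1) X'y
    return np.linalg.solve(X.T @ X + alpha * np.eye(X.shape[1]), X.T @ resp)
```

On simulated data where the response is a known kernel convolved with the stimulus, the estimator recovers that kernel, which is the logic that lets the lag profile be read as a time course of neural processing.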
Affiliation(s)
- James A O'Sullivan
- School of Engineering, Trinity Centre for Bioengineering and Trinity College Institute of Neuroscience, Trinity College Dublin, Dublin 2, Ireland, and
- Shihab A Shamma
- Institute for Systems Research, University of Maryland, College Park, Maryland 20742
- Edmund C Lalor
- School of Engineering, Trinity Centre for Bioengineering and Trinity College Institute of Neuroscience, Trinity College Dublin, Dublin 2, Ireland, and
77
Pannese A, Herrmann CS, Sussman E. Analyzing the auditory scene: neurophysiologic evidence of a dissociation between detection of regularity and detection of change. Brain Topogr 2015; 28:411-22. [PMID: 24771006 PMCID: PMC4210364 DOI: 10.1007/s10548-014-0368-4] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/08/2014] [Accepted: 04/07/2014] [Indexed: 10/25/2022]
Abstract
Detecting regularity and change in the environment is crucial for survival, as it enables making predictions about the world and informing goal-directed behavior. In the auditory modality, the detection of regularity involves segregating incoming sounds into distinct perceptual objects (stream segregation). The detection of change from this within-stream regularity is associated with the mismatch negativity, a component of auditory event-related brain potentials (ERPs). A central unanswered question is how the detection of regularity and the detection of change are interrelated, and whether attention affects the former, the latter, or both. Here we show that the detection of regularity and the detection of change can be empirically dissociated, and that attention modulates the detection of change without precluding the detection of regularity, and the perceptual organization of the auditory background into distinct streams. By applying frequency-spectrum analysis to the EEG of subjects engaged in a selective listening task, we found distinct peaks of ERP synchronization, corresponding to the rhythm of the frequency streams, independently of whether the stream was attended or ignored. Our results provide direct neurophysiological evidence of regularity detection in the auditory background, and show that it can occur independently of change detection and in the absence of attention.
Affiliation(s)
- Alessia Pannese
- The Italian Academy for Advanced Studies, Columbia University, 1161 Amsterdam Avenue, New York, NY 10027, USA; Department of Biological Sciences, Columbia University, New York, NY 10027, USA
- Christoph S. Herrmann
- Experimental Psychology Lab, Center for Excellence “Hearing4all”, European Medical School, Carl von Ossietzky Universität Oldenburg, 26111 Oldenburg, Germany
- Elyse Sussman
- Dominick P. Purpura Department of Neuroscience, Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, NY 10461, USA; Department of Otorhinolaryngology-HNS, Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, NY 10461, USA
78
Eggermont JJ. Animal models of auditory temporal processing. Int J Psychophysiol 2015; 95:202-15. [DOI: 10.1016/j.ijpsycho.2014.03.011] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/09/2013] [Revised: 03/27/2014] [Accepted: 03/27/2014] [Indexed: 10/25/2022]
79
Rimmele JM, Sussman E, Poeppel D. The role of temporal structure in the investigation of sensory memory, auditory scene analysis, and speech perception: a healthy-aging perspective. Int J Psychophysiol 2015; 95:175-83. [PMID: 24956028 PMCID: PMC4272684 DOI: 10.1016/j.ijpsycho.2014.06.010] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/16/2013] [Revised: 06/13/2014] [Accepted: 06/15/2014] [Indexed: 01/08/2023]
Abstract
Listening situations with multiple talkers or background noise are common in everyday communication and are particularly demanding for older adults. Here we review current research on auditory perception in aging individuals in order to gain insights into the challenges of listening under noisy conditions. Informationally rich temporal structure in auditory signals, over a range of time scales from milliseconds to seconds, renders temporal processing central to perception in the auditory domain. We discuss the role of temporal structure in auditory processing, in particular from a perspective relevant for hearing in background noise, and focusing on sensory memory, auditory scene analysis, and speech perception. Interestingly, these auditory processes, usually studied in an independent manner, show considerable overlap of processing time scales, even though each has its own 'privileged' temporal regimes. By integrating perspectives on temporal structure processing in these three areas of investigation, we aim to highlight similarities typically not recognized.
Affiliations:
- Johanna Maria Rimmele (Department of Neurophysiology and Pathophysiology, University Medical Center Hamburg-Eppendorf, Hamburg, Germany)
- Elyse Sussman (Dominick P. Purpura Department of Neuroscience, Albert Einstein College of Medicine, Bronx, NY, USA)
- David Poeppel (Department of Psychology and Center for Neural Science, New York University, New York, NY, USA; Max Planck Institute for Empirical Aesthetics, Frankfurt, Germany)
80
81
Akram S, Englitz B, Elhilali M, Simon JZ, Shamma SA. Investigating the neural correlates of a streaming percept in an informational-masking paradigm. PLoS One 2014; 9:e114427. [PMID: 25490720 PMCID: PMC4260833 DOI: 10.1371/journal.pone.0114427]
Abstract
Humans routinely segregate a complex acoustic scene into different auditory streams, through the extraction of bottom-up perceptual cues and the use of top-down selective attention. To determine the neural mechanisms underlying this process, neural responses obtained through magnetoencephalography (MEG) were correlated with behavioral performance in the context of an informational-masking paradigm. In half the trials, subjects were asked to detect frequency deviants in a target stream, consisting of a rhythmic tone sequence, embedded in a separate masker stream composed of a random cloud of tones. In the other half of the trials, subjects were exposed to identical stimuli but asked to perform a different task: to detect tone-length changes in the random cloud of tones. In order to verify that the normalized neural response to the target sequence served as an indicator of streaming, we correlated neural responses with behavioral performance under a variety of stimulus parameters (target tone rate, target tone frequency, and the "protection zone", that is, the spectral area with no tones around the target frequency) and attentional states (changing task objective while maintaining the same stimuli). In all conditions that facilitated target/masker streaming behaviorally, normalized MEG responses also changed in a manner consistent with the behavior. Thus, attending to the target stream caused a significant increase in the power and phase coherence of the responses in recording channels that correlated with an increase in the listeners' behavioral performance. Normalized neural target responses also increased as the protection zone widened and as the frequency of the target tones increased. Finally, when the target sequence rate increased, the buildup of the normalized neural responses was significantly faster, mirroring the accelerated buildup of the streaming percepts. Our data thus support close links between the perceptual and neural consequences of auditory stream segregation.
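The stimulus design described above (a regular rhythmic target embedded in a random tone cloud, with a "protection zone" of no masker energy around the target frequency) can be sketched as follows. All parameters (sampling rate, target frequency, rates, tone counts) are illustrative stand-ins, not the values used in the study:

```python
import numpy as np

rng = np.random.default_rng(0)

FS = 16000            # sampling rate (Hz)
TONE_DUR = 0.05       # tone duration (s)
TARGET_F = 1000.0     # target-stream frequency (Hz), illustrative
PROT_OCT = 0.5        # protection zone: +/- half an octave around the target
TRIAL_DUR = 2.0       # trial length (s)

def tone(freq):
    """Pure tone with 5-ms raised-cosine onset/offset ramps."""
    t = np.arange(int(TONE_DUR * FS)) / FS
    ramp = 0.5 * (1 - np.cos(np.pi * np.minimum(t / 0.005, 1.0)))
    return ramp * ramp[::-1] * np.sin(2 * np.pi * freq * t)

def masker_freq():
    """Random masker frequency (200-5000 Hz, log-uniform), outside the protection zone."""
    while True:
        f = 200.0 * 2.0 ** rng.uniform(0.0, np.log2(5000.0 / 200.0))
        if abs(np.log2(f / TARGET_F)) > PROT_OCT:
            return f

n = int(TONE_DUR * FS)
trial = np.zeros(int(TRIAL_DUR * FS) + n)

for t0 in np.arange(8) * 0.25:                    # regular 4-Hz target rhythm
    i = int(t0 * FS)
    trial[i:i + n] += tone(TARGET_F)

for t0 in rng.uniform(0.0, TRIAL_DUR, size=40):   # random masker 'cloud'
    i = int(t0 * FS)
    trial[i:i + n] += 0.5 * tone(masker_freq())
```

Widening `PROT_OCT` corresponds to the manipulation that facilitated streaming behaviorally in the study.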
Affiliations:
- Sahar Akram (Institute for Systems Research and Department of Electrical and Computer Engineering, University of Maryland, College Park, MD, USA)
- Bernhard Englitz (Institute for Systems Research and Department of Electrical and Computer Engineering, University of Maryland, College Park, MD, USA; Département d'Etudes Cognitives, Ecole normale supérieure, Paris, France; Department of Neurophysiology, Donders Institute for Brain, Cognition and Behaviour, Nijmegen, The Netherlands)
- Mounya Elhilali (Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, MD, USA)
- Jonathan Z. Simon (Department of Electrical and Computer Engineering and Department of Biology, University of Maryland, College Park, MD, USA)
- Shihab A. Shamma (Institute for Systems Research and Department of Electrical and Computer Engineering, University of Maryland, College Park, MD, USA; Département d'Etudes Cognitives, Ecole normale supérieure, Paris, France)
82
Riecke L, Scharke W, Valente G, Gutschalk A. Sustained selective attention to competing amplitude-modulations in human auditory cortex. PLoS One 2014; 9:e108045. [PMID: 25259525 PMCID: PMC4178064 DOI: 10.1371/journal.pone.0108045]
Abstract
Auditory selective attention plays an essential role for identifying sounds of interest in a scene, but the neural underpinnings are still incompletely understood. Recent findings demonstrate that neural activity that is time-locked to a particular amplitude-modulation (AM) is enhanced in the auditory cortex when the modulated stream of sounds is selectively attended to under sensory competition with other streams. However, the target sounds used in the previous studies differed not only in their AM, but also in other sound features, such as carrier frequency or location. Thus, it remains uncertain whether the observed enhancements reflect AM-selective attention. The present study aims at dissociating the effect of AM frequency on response enhancement in auditory cortex by using an ongoing auditory stimulus that contains two competing targets differing exclusively in their AM frequency. Electroencephalography results showed a sustained response enhancement for auditory attention compared to visual attention, but not for AM-selective attention (attended AM frequency vs. ignored AM frequency). In contrast, the response to the ignored AM frequency was enhanced, although a brief trend toward response enhancement occurred during the initial 15 s. Together with the previous findings, these observations indicate that selective enhancement of attended AMs in auditory cortex is adaptive under sustained AM-selective attention. This finding has implications for our understanding of cortical mechanisms for feature-based attentional gain control.
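The key stimulus idea above, two competing targets that differ exclusively in their AM frequency within one ongoing sound, is easy to make concrete. The carrier and modulation rates below are hypothetical, not the study's values; the point is that each AM appears as its own spectral line in the envelope, which is what frequency-tagged EEG analyses lock onto:

```python
import numpy as np

FS = 8000                    # sampling rate (Hz)
DUR = 2.0                    # duration (s)
CARRIER_F = 1000.0           # shared carrier frequency (Hz), illustrative
AM_A, AM_B = 4.0, 7.0        # the two competing AM frequencies (Hz), illustrative

t = np.arange(int(DUR * FS)) / FS

# One ongoing sound whose envelope carries both modulation rates; the two
# 'targets' differ only in AM frequency.
env = 1.0 + 0.5 * np.sin(2 * np.pi * AM_A * t) + 0.5 * np.sin(2 * np.pi * AM_B * t)
stim = env * np.sin(2 * np.pi * CARRIER_F * t)

# Each AM is a separate spectral line in the envelope spectrum.
env_spec = np.abs(np.fft.rfft(env)) / len(env)
freqs = np.fft.rfftfreq(len(env), 1.0 / FS)
```

An AM-selective attention effect would show up as a gain difference between the neural responses at these two tagging frequencies.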
Affiliations:
- Lars Riecke (Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, Maastricht, The Netherlands)
- Wolfgang Scharke (Department of Child and Adolescent Psychiatry, Psychotherapy and Psychosomatics, University Hospital, RWTH Aachen University, Aachen, Germany)
- Giancarlo Valente (Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, Maastricht, The Netherlands)
- Alexander Gutschalk (Department of Neurology, Ruprecht-Karls-Universität Heidelberg, Heidelberg, Germany)
83
Ponnath A, Farris HE. Sound-by-sound thalamic stimulation modulates midbrain auditory excitability and relative binaural sensitivity in frogs. Front Neural Circuits 2014; 8:85. [PMID: 25120437 PMCID: PMC4111082 DOI: 10.3389/fncir.2014.00085]
Abstract
Descending circuitry can modulate auditory processing, biasing sensitivity to particular stimulus parameters and locations. Using awake in vivo single unit recordings, this study tested whether electrical stimulation of the thalamus modulates auditory excitability and relative binaural sensitivity in neurons of the amphibian midbrain. In addition, by using electrical stimuli that were either longer than the acoustic stimuli (i.e., seconds) or presented on a sound-by-sound basis (ms), experiments addressed whether the form of modulation depended on the temporal structure of the electrical stimulus. Following long duration electrical stimulation (3-10 s of 20 Hz square pulses), excitability (spikes/acoustic stimulus) to free-field noise stimuli decreased by 32%, but returned over 600 s. In contrast, sound-by-sound electrical stimulation using a single 2 ms duration electrical pulse 25 ms before each noise stimulus caused faster and varied forms of modulation: modulation lasted <2 s and, in different cells, excitability either decreased, increased or shifted in latency. Within cells, the modulatory effect of sound-by-sound electrical stimulation varied between different acoustic stimuli, including for different male calls, suggesting modulation is specific to certain stimulus attributes. For binaural units, modulation depended on the ear of input, as sound-by-sound electrical stimulation preceding dichotic acoustic stimulation caused asymmetric modulatory effects: sensitivity shifted for sounds at only one ear, or by different relative amounts for both ears. This caused a change in the relative difference in binaural sensitivity. Thus, sound-by-sound electrical stimulation revealed fast and ear-specific (i.e., lateralized) auditory modulation that is potentially suited to shifts in auditory attention during sound segregation in the auditory scene.
Affiliations:
- Abhilash Ponnath (Neuroscience Center and Department of Otolaryngology and Biocommunication, Louisiana State University Health Sciences Center, New Orleans, LA, USA)
- Hamilton E Farris (Neuroscience Center, Department of Otolaryngology and Biocommunication, and Department of Cell Biology and Anatomy, Louisiana State University Health Sciences Center, New Orleans, LA, USA)
84
Shuai L, Elhilali M. Task-dependent neural representations of salient events in dynamic auditory scenes. Front Neurosci 2014; 8:203. [PMID: 25100934 PMCID: PMC4104552 DOI: 10.3389/fnins.2014.00203]
Abstract
Selecting pertinent events in the cacophony of sounds that impinge on our ears every day is regulated by the acoustic salience of sounds in the scene as well as their behavioral relevance as dictated by top-down task-dependent demands. The current study aims to explore the neural signature of both facets of attention, as well as their possible interactions, in the context of auditory scenes. Using a paradigm with dynamic auditory streams and occasional salient events, we recorded neurophysiological responses of human listeners using EEG while manipulating the subjects' attentional state as well as the presence or absence of a competing auditory stream. Our results showed that salient events caused an increase in the auditory steady-state response (ASSR) irrespective of attentional state or complexity of the scene. Such increase supplemented ASSR increases due to task-driven attention. Salient events also evoked a strong N1 peak in the ERP response when listeners were attending to the target sound stream, accompanied by an MMN-like component in some cases and changes in the P1 and P300 components under all listening conditions. Overall, bottom-up attention induced by a salient change in the auditory stream appears to mostly modulate the amplitude of the steady-state response and certain event-related potentials to salient sound events, though this modulation is also shaped by top-down attentional processes and by the prominence of these events in the auditory scene.
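The ASSR measure at the heart of this result can be illustrated with simulated data: the response amplitude at the stimulus tagging frequency is read off the EEG spectrum, and an attentional or salience-driven gain appears as a larger spectral line. The signals, tagging frequency, and gain values below are fabricated for illustration only:

```python
import numpy as np

FS = 250                     # EEG sampling rate (Hz), illustrative
DUR = 10.0                   # epoch length (s)
ASSR_F = 40.0                # tagging (modulation) frequency (Hz), illustrative
t = np.arange(int(DUR * FS)) / FS
rng = np.random.default_rng(3)

def assr_amplitude(eeg, f=ASSR_F, fs=FS):
    """Amplitude of the spectral line at the tagging frequency."""
    spec = 2.0 * np.abs(np.fft.rfft(eeg)) / len(eeg)
    return spec[int(round(f * len(eeg) / fs))]

# Simulated 'attended' vs. 'unattended' EEG: same background noise,
# larger 40-Hz steady-state component when the stream is attended.
noise = rng.standard_normal(len(t))
unattended = 0.5 * np.sin(2 * np.pi * ASSR_F * t) + noise
attended   = 0.8 * np.sin(2 * np.pi * ASSR_F * t) + noise
```

In practice the 40-Hz line would be averaged over trials and channels, but the read-out is the same spectral measurement.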
Affiliations:
- Mounya Elhilali (Laboratory of Computational Audio Perception, Department of Electrical and Computer Engineering, Center for Speech and Language Processing, Johns Hopkins University, Baltimore, MD, USA)
85
Chabot-Leclerc A, Jørgensen S, Dau T. The role of auditory spectro-temporal modulation filtering and the decision metric for speech intelligibility prediction. J Acoust Soc Am 2014; 135:3502-12. [PMID: 24907813 DOI: 10.1121/1.4873517]
Abstract
Speech intelligibility models typically consist of a preprocessing part that transforms stimuli into some internal (auditory) representation and a decision metric that relates the internal representation to speech intelligibility. The present study analyzed the role of modulation filtering in the preprocessing of different speech intelligibility models by comparing predictions from models that either assume a spectro-temporal (i.e., two-dimensional) or a temporal-only (i.e., one-dimensional) modulation filterbank. Furthermore, the role of the decision metric for speech intelligibility was investigated by comparing predictions from models based on the signal-to-noise envelope power ratio, SNRenv, and the modulation transfer function, MTF. The models were evaluated in conditions of noisy speech (1) subjected to reverberation, (2) distorted by phase jitter, or (3) processed by noise reduction via spectral subtraction. The results suggested that a decision metric based on the SNRenv may provide a more general basis for predicting speech intelligibility than a metric based on the MTF. Moreover, the one-dimensional modulation filtering process was found to be sufficient to account for the data when combined with a measure of across (audio) frequency variability at the output of the auditory preprocessing. A complex spectro-temporal modulation filterbank might therefore not be required for speech intelligibility prediction.
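A toy version of the SNRenv decision metric (envelope power of the noisy mixture in excess of that of the noise alone, relative to the noise envelope power) can be sketched with numpy. This deliberately omits the model's auditory filterbank and modulation filterbank, and the signals are simplified stand-ins, so it only illustrates the shape of the metric:

```python
import numpy as np

FS = 8000
DUR = 2.0
t = np.arange(int(DUR * FS)) / FS
rng = np.random.default_rng(1)

def envelope(x):
    """Hilbert envelope via FFT (numpy only; assumes even length)."""
    X = np.fft.fft(x)
    h = np.zeros(len(x))
    h[0] = 1.0
    h[1:len(x) // 2] = 2.0
    h[len(x) // 2] = 1.0
    return np.abs(np.fft.ifft(X * h))

def env_power(x):
    """AC envelope power normalized by the squared envelope mean."""
    e = envelope(x)
    return np.var(e) / np.mean(e) ** 2

def snr_env(mix, noise_alone):
    """Toy SNRenv: excess envelope power of the mixture over the noise."""
    p_mix, p_noise = env_power(mix), env_power(noise_alone)
    return max(p_mix - p_noise, 0.0) / p_noise

# 'Speech' stands in as a 4-Hz amplitude-modulated 500-Hz carrier.
speech = (1.0 + 0.8 * np.sin(2 * np.pi * 4 * t)) * np.sin(2 * np.pi * 500 * t)
noise = rng.standard_normal(len(t))

low_noise = snr_env(speech + 0.1 * noise, 0.1 * noise)
high_noise = snr_env(speech + 2.0 * noise, 2.0 * noise)
# More noise -> lower SNRenv -> lower predicted intelligibility.
```

The full sEPSM-style model computes this quantity per modulation band after auditory preprocessing; the one-dimensional version discussed in the paper applies it to temporal modulation bands only.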
Affiliations:
- Alexandre Chabot-Leclerc, Søren Jørgensen, and Torsten Dau (Department of Electrical Engineering, Centre for Applied Hearing Research, Technical University of Denmark, DK-2800 Kgs. Lyngby, Denmark)
86
Ding N, Simon JZ. Cortical entrainment to continuous speech: functional roles and interpretations. Front Hum Neurosci 2014; 8:311. [PMID: 24904354 PMCID: PMC4036061 DOI: 10.3389/fnhum.2014.00311]
Abstract
Auditory cortical activity is entrained to the temporal envelope of speech, which corresponds to the syllabic rhythm of speech. Such entrained cortical activity can be measured from subjects naturally listening to sentences or spoken passages, providing a reliable neural marker of online speech processing. A central question still remains to be answered about whether cortical entrained activity is more closely related to speech perception or non-speech-specific auditory encoding. Here, we review a few hypotheses about the functional roles of cortical entrainment to speech, e.g., encoding acoustic features, parsing syllabic boundaries, and selecting sensory information in complex listening environments. It is likely that speech entrainment is not a homogeneous response and these hypotheses apply separately for speech entrainment generated from different neural sources. The relationship between entrained activity and speech intelligibility is also discussed. A tentative conclusion is that theta-band entrainment (4–8 Hz) encodes speech features critical for intelligibility while delta-band entrainment (1–4 Hz) is related to the perceived, non-speech-specific acoustic rhythm. To further understand the functional properties of speech entrainment, a splitter’s approach will be needed to investigate (1) not just the temporal envelope but what specific acoustic features are encoded and (2) not just speech intelligibility but what specific psycholinguistic processes are encoded by entrained cortical activity. Similarly, the anatomical and spectro-temporal details of entrained activity need to be taken into account when investigating its functional properties.
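The delta/theta distinction discussed above can be made concrete by band-limiting a speech envelope into 1-4 Hz and 4-8 Hz components, which is essentially what entrainment analyses correlate with cortical activity. The stand-in envelope and the brick-wall filter below are illustrative simplifications:

```python
import numpy as np

FS = 1000                       # envelope sampling rate (Hz)
DUR = 5.0
t = np.arange(int(DUR * FS)) / FS

# Stand-in 'speech envelope': a 5-Hz (theta-range, syllabic) rhythm riding on
# a 2-Hz (delta-range, slower prosodic) rhythm.
env = 1.0 + 0.5 * np.sin(2 * np.pi * 5 * t) + 0.5 * np.sin(2 * np.pi * 2 * t)

def bandpass_fft(x, lo, hi, fs):
    """Zero-phase brick-wall band-pass via FFT (adequate for a sketch)."""
    X = np.fft.rfft(x)
    f = np.fft.rfftfreq(len(x), 1.0 / fs)
    X[(f < lo) | (f > hi)] = 0.0
    return np.fft.irfft(X, n=len(x))

delta_band = bandpass_fft(env, 1.0, 4.0, FS)   # 1-4 Hz component
theta_band = bandpass_fft(env, 4.0, 8.0, FS)   # 4-8 Hz component
```

Per the review's tentative conclusion, cortical tracking of `theta_band` would relate to intelligibility-critical features, while tracking of `delta_band` would relate to the perceived acoustic rhythm.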
Affiliations:
- Nai Ding (Department of Psychology, New York University, New York, NY, USA)
- Jonathan Z Simon (Department of Electrical and Computer Engineering, Department of Biology, and Institute for Systems Research, University of Maryland, College Park, MD, USA)
87
Simon JZ. The encoding of auditory objects in auditory cortex: insights from magnetoencephalography. Int J Psychophysiol 2014; 95:184-90. [PMID: 24841996 DOI: 10.1016/j.ijpsycho.2014.05.005]
Abstract
Auditory objects, like their visual counterparts, are perceptually defined constructs, but nevertheless must arise from underlying neural circuitry. Using magnetoencephalography (MEG) recordings of the neural responses of human subjects listening to complex auditory scenes, we review studies that demonstrate that auditory objects are indeed neurally represented in auditory cortex. The studies use neural responses obtained from different experiments in which subjects selectively listen to one of two competing auditory streams embedded in a variety of auditory scenes. The auditory streams overlap spatially and often spectrally. In particular, the studies demonstrate that selective attentional gain does not act globally on the entire auditory scene, but rather acts differentially on the separate auditory streams. This stream-based attentional gain is then used as a tool to individually analyze the different neural representations of the competing auditory streams. The neural representation of the attended stream, located in posterior auditory cortex, dominates the neural responses. Critically, when the intensities of the attended and background streams are separately varied over a wide intensity range, the neural representation of the attended speech adapts only to the intensity of that speaker, irrespective of the intensity of the background speaker. This demonstrates object-level intensity gain control in addition to the above object-level selective attentional gain. Overall, these results indicate that concurrently streaming auditory objects, even if spectrally overlapping and not resolvable at the auditory periphery, are individually neurally encoded in auditory cortex, as separate objects.
Affiliations:
- Jonathan Z Simon (Department of Electrical and Computer Engineering, Department of Biology, and Institute for Systems Research, University of Maryland, College Park, MD 20742, USA)
88
Choi I, Wang L, Bharadwaj H, Shinn-Cunningham B. Individual differences in attentional modulation of cortical responses correlate with selective attention performance. Hear Res 2014; 314:10-9. [PMID: 24821552 DOI: 10.1016/j.heares.2014.04.008]
Abstract
Many studies have shown that attention modulates the cortical representation of an auditory scene, emphasizing an attended source while suppressing competing sources. Yet, individual differences in the strength of this attentional modulation and their relationship with selective attention ability are poorly understood. Here, we ask whether differences in how strongly attention modulates cortical responses reflect differences in normal-hearing listeners' selective auditory attention ability. We asked listeners to attend to one of three competing melodies and identify its pitch contour while we measured cortical electroencephalographic responses. The three melodies were either from widely separated pitch ranges ("easy trials") or from a narrow, overlapping pitch range ("hard trials"). The melodies started at slightly different times; listeners attended either the leading or the lagging melody. Because of the timing of the onsets, the leading melody drew attention exogenously. In contrast, attending to the lagging melody required listeners to direct top-down attention volitionally. We quantified how attention amplified the auditory N1 response to the attended melody and found large individual differences in the N1 amplification, even though only correctly answered trials were used to quantify the ERP gain. Importantly, listeners with the strongest amplification of the N1 response to the lagging melody in the easy trials were the best performers across other types of trials. Our results raise the possibility that individual differences in the strength of top-down gain control reflect inherent differences in the ability to control top-down attention.
Affiliations:
- Inyong Choi (Center for Computational Neuroscience and Neural Technology, Boston University, Boston, MA 02215, USA)
- Le Wang (Center for Computational Neuroscience and Neural Technology, Boston University, Boston, MA 02215, USA)
- Hari Bharadwaj (Center for Computational Neuroscience and Neural Technology and Department of Biomedical Engineering, Boston University, Boston, MA 02215, USA)
- Barbara Shinn-Cunningham (Center for Computational Neuroscience and Neural Technology and Department of Biomedical Engineering, Boston University, Boston, MA 02215, USA)
89
Nourski KV, Steinschneider M, Oya H, Kawasaki H, Howard MA. Modulation of response patterns in human auditory cortex during a target detection task: an intracranial electrophysiology study. Int J Psychophysiol 2014; 95:191-201. [PMID: 24681353 DOI: 10.1016/j.ijpsycho.2014.03.006]
Abstract
Selective attention enhances cortical activity representing an attended sound stream in human posterolateral superior temporal gyrus (PLST). It is unclear, however, what mechanisms are associated with a target detection task that necessitates sustained attention (vigilance) to a sound stream. We compared responses elicited by target and non-target sounds, and to sounds presented in a passive-listening paradigm. Subjects were neurosurgical patients undergoing invasive monitoring for medically refractory epilepsy. Stimuli were complex tones, band-limited noise bursts and speech syllables. High gamma cortical activity (70-150 Hz) was examined in all subjects using subdural grid electrodes implanted over PLST. Additionally, responses were measured from depth electrodes implanted within Heschl's gyrus (HG) in one subject. Responses to target sounds recorded from PLST were increased compared to responses elicited by the same sounds when they were non-targets, and when they were presented during passive listening. Increases in high gamma activity to target sounds occurred during later portions (after 250 ms) of the response. These increases were related to the task and not to detailed stimulus characteristics. In contrast, earlier activity that did not vary across conditions did represent stimulus acoustic characteristics. Effects observed on PLST were not noted in HG. No consistent effects were noted in the averaged evoked potentials in either cortical region. We conclude that task dependence modulates later activity in PLST during vigilance. Later activity may represent feedback from higher cortical areas. Study of concurrently recorded activity from frontoparietal areas is necessary to further clarify task-related modulation of activity on PLST.
Affiliations:
- Kirill V Nourski, Hiroyuki Oya, Hiroto Kawasaki, and Matthew A Howard (Department of Neurosurgery, The University of Iowa, Iowa City, IA 52242, USA)
90
Chakalov I, Draganova R, Wollbrink A, Preissl H, Pantev C. Perceptual organization of auditory streaming-task relies on neural entrainment of the stimulus-presentation rate: MEG evidence. BMC Neurosci 2013; 14:120. [PMID: 24119225 PMCID: PMC3853018 DOI: 10.1186/1471-2202-14-120]
Abstract
Background: Humans are able to extract regularities from complex auditory scenes in order to form perceptually meaningful elements. It has been shown previously that this process depends critically on both the temporal integration of the sensory input over time and the degree of frequency separation between concurrent sound sources. Our goal was to examine the relationship between these two aspects by means of magnetoencephalography (MEG). To achieve this aim, we combined time-frequency analysis on a sensor-space level with source analysis. Our paradigm consisted of asymmetric ABA tone triplets wherein the B-tones were presented temporally closer to the first A-tones, providing different tempi within the same sequence. Participants attended to the slowest B-rhythm whilst the frequency separation between tones was manipulated (0, 2, 4 and 10 semitones).
Results: The asymmetric ABA triplets spontaneously elicited periodic sustained responses corresponding to the temporal distribution of the A-B and B-A tone intervals in all conditions. Moreover, when attending to the B-tones, the neural representations of the A- and B-streams were both detectable in the scenarios that allow perceptual streaming (2, 4 and 10 semitones). Alongside this, the steady-state responses tuned to the presentation of the B-tones increased significantly with increasing frequency separation between the tones. However, the strength of the B-tone-related steady-state responses dominated that of the A-tone responses in the 10-semitone condition. Conversely, the representation of the A-tones dominated that of the B-tones in the 2- and 4-semitone conditions, in which greater effort was required to complete the task. Additionally, the P1 evoked-field component following the B-tones increased in magnitude with increasing inter-tonal frequency difference.
Conclusions: The enhancement of the evoked fields in source space, along with the B-tone-related activity in the time-frequency results, likely reflects the selective enhancement of the attended B-stream. The results also suggest a dissimilar efficiency of the temporal integration of separate streams depending on the degree of frequency separation between the sounds. Overall, the present findings suggest that the neural effects of auditory streaming can be captured directly in the time-frequency spectrum at the sensor-space level.
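The asymmetric ABA paradigm described above can be sketched as follows. The timing values and base frequency are hypothetical, but the structure (B-tone temporally closer to the leading A-tone, frequency separation set in semitones) follows the description in the abstract:

```python
import numpy as np

FS = 16000          # sampling rate (Hz)
TONE_DUR = 0.05     # tone duration (s)
A_FREQ = 500.0      # A-tone frequency (Hz), illustrative

def tone(freq):
    """Pure tone with 5-ms linear on/off ramps."""
    t = np.arange(int(TONE_DUR * FS)) / FS
    ramp = np.minimum(np.minimum(t, t[::-1]) / 0.005, 1.0)
    return ramp * np.sin(2 * np.pi * freq * t)

def aba_sequence(semitones, n_triplets=5, ab_soa=0.10, ba_soa=0.15, gap=0.15):
    """Asymmetric A-B--A triplets: B follows the leading A after the shorter SOA."""
    b_freq = A_FREQ * 2.0 ** (semitones / 12.0)
    period = ab_soa + ba_soa + gap                 # onset-to-onset triplet period
    n = int(TONE_DUR * FS)
    out = np.zeros(int(n_triplets * period * FS) + n)
    for k in range(n_triplets):
        onsets = (k * period, k * period + ab_soa, k * period + ab_soa + ba_soa)
        for onset, f in zip(onsets, (A_FREQ, b_freq, A_FREQ)):
            i = int(onset * FS)
            out[i:i + n] += tone(f)
    return out, b_freq

seq, b_freq = aba_sequence(semitones=10)   # widest separation condition
```

Varying `semitones` over 0, 2, 4, and 10 reproduces the frequency-separation manipulation used in the study.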
Affiliations:
- Ivan Chakalov (Institute for Biomagnetism and Biosignalanalysis, University of Münster, Malmedyweg 15, 48149 Münster, Germany)
91
Szalárdy O, Winkler I, Schröger E, Widmann A, Bendixen A. Foreground-background discrimination indicated by event-related brain potentials in a new auditory multistability paradigm. Psychophysiology 2013; 50:1239-50. [DOI: 10.1111/psyp.12139]
Affiliations:
- Orsolya Szalárdy (Institute of Cognitive Neuroscience and Psychology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Budapest, Hungary; Department of Cognitive Science, Faculty of Natural Sciences, Budapest University of Technology and Economics, Budapest, Hungary)
- István Winkler (Institute of Cognitive Neuroscience and Psychology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Budapest, Hungary; Institute of Psychology, University of Szeged, Szeged, Hungary)
- Erich Schröger (Institute of Psychology, University of Leipzig, Leipzig, Germany)
- Andreas Widmann (Institute of Psychology, University of Leipzig, Leipzig, Germany)
- Alexandra Bendixen (Institute of Psychology, University of Leipzig, Leipzig, Germany; Department of Psychology, Cluster of Excellence "Hearing4all", European Medical School, Carl von Ossietzky University of Oldenburg, Oldenburg, Germany)
92
Gutschalk A, Dykstra AR. Functional imaging of auditory scene analysis. Hear Res 2013; 307:98-110. [PMID: 23968821 DOI: 10.1016/j.heares.2013.08.003]
Abstract
Our auditory system is constantly faced with the task of decomposing the complex mixture of sound arriving at the ears into perceptually independent streams constituting accurate representations of individual sound sources. This decomposition, termed auditory scene analysis, is critical for both survival and communication, and is thought to underlie both speech and music perception. The neural underpinnings of auditory scene analysis have been studied utilizing invasive experiments with animal models as well as non-invasive (MEG, EEG, and fMRI) and invasive (intracranial EEG) studies conducted with human listeners. The present article reviews human neurophysiological research investigating the neural basis of auditory scene analysis, with emphasis on two classical paradigms termed streaming and informational masking. Other paradigms - such as the continuity illusion, mistuned harmonics, and multi-speaker environments - are briefly addressed thereafter. We conclude by discussing the emerging evidence for the role of auditory cortex in remapping incoming acoustic signals into a perceptual representation of auditory streams, which are then available for selective attention and further conscious processing. This article is part of a Special Issue entitled Human Auditory Neuroimaging.
Affiliation(s)
- Alexander Gutschalk
- Department of Neurology, Ruprecht-Karls-University Heidelberg, Heidelberg, Germany.
93
Teki S, Chait M, Kumar S, Shamma S, Griffiths TD. Segregation of complex acoustic scenes based on temporal coherence. eLife 2013; 2:e00699. [PMID: 23898398] [PMCID: PMC3721234] [DOI: 10.7554/elife.00699]
Abstract
In contrast to the complex acoustic environments we encounter every day, most studies of auditory segregation have used relatively simple signals. Here, we synthesized a new stimulus to examine the detection of coherent patterns (‘figures’) from overlapping ‘background’ signals. In a series of experiments, we demonstrate that human listeners are remarkably sensitive to the emergence of such figures and can tolerate a variety of spectral and temporal perturbations. This robust behavior is consistent with the existence of automatic auditory segregation mechanisms that are highly sensitive to correlations across frequency and time. The observed behavior cannot be explained purely on the basis of adaptation-based models used to explain the segregation of deterministic narrowband signals. We show that the present results are consistent with the predictions of a model of auditory perceptual organization based on temporal coherence. Our data thus support a role for temporal coherence as an organizational principle underlying auditory segregation. DOI:http://dx.doi.org/10.7554/eLife.00699.001

Even when seated in the middle of a crowded restaurant, we are still able to distinguish the speech of the person sitting opposite us from the conversations of fellow diners and a host of other background noise. While we generally perform this task almost effortlessly, it is unclear how the brain solves what is in reality a complex information processing problem. In the 1970s, researchers began to address this question using stimuli consisting of simple tones. When subjects are played a sequence of alternating high and low frequency tones, they perceive them as two independent streams of sound. Similar experiments in macaque monkeys reveal that each stream activates a different area of auditory cortex, suggesting that the brain may distinguish acoustic stimuli on the basis of their frequency.
However, the simple tones that are used in laboratory experiments bear little resemblance to the complex sounds we encounter in everyday life. These are often made up of multiple frequencies, and overlap—both in frequency and in time—with other sounds in the environment. Moreover, recent experiments have shown that if a subject hears two tones simultaneously, he or she perceives them as belonging to a single stream of sound even if they have different frequencies: models that assume that we distinguish stimuli from noise on the basis of frequency alone struggle to explain this observation. Now, Teki, Chait, et al. have used more complex sounds, in which frequency components of the target stimuli overlap with those of background signals, to obtain new insights into how the brain solves this problem. Subjects were extremely good at discriminating these complex target stimuli from background noise, and computational modelling confirmed that they did so via integration of both frequency and temporal information. The work of Teki, Chait, et al. thus offers the first explanation for our ability to home in on speech and other pertinent sounds, even amidst a sea of background noise. DOI:http://dx.doi.org/10.7554/eLife.00699.002
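The temporal-coherence principle described above can be illustrated with a toy computation. This is a minimal sketch, not the authors' model: the six channels, their simulated envelopes, and the 0.2 coherence threshold are all arbitrary assumptions made here for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated envelopes of 6 frequency channels (10 s at a 100 Hz frame rate).
# Channels 0-2 share a common "figure" envelope (temporally coherent);
# channels 3-5 carry independent "background" fluctuations.
n_frames = 1000
figure = np.abs(rng.standard_normal(n_frames))
envelopes = np.vstack(
    [figure + 0.3 * np.abs(rng.standard_normal(n_frames)) for _ in range(3)]
    + [np.abs(rng.standard_normal(n_frames)) for _ in range(3)]
)

# Temporal coherence: correlation of channel envelopes across time.
coherence = np.corrcoef(envelopes)

# Group channels whose mean coherence with all others exceeds a threshold;
# coherent channels are bound into a single perceptual "figure".
mean_coh = (coherence.sum(axis=1) - 1) / (len(coherence) - 1)
figure_channels = np.where(mean_coh > 0.2)[0]
print(figure_channels)  # recovers the coherent channels 0, 1, 2
```

Frequency alone cannot separate these channels (all envelopes occupy the same modulation range); only their correlation over time distinguishes figure from background, which is the point of the temporal-coherence account.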
Affiliation(s)
- Sundeep Teki
- Wellcome Trust Centre for Neuroimaging, University College London, London, United Kingdom
94
Oldoni D, De Coensel B, Boes M, Rademaker M, De Baets B, Van Renterghem T, Botteldooren D. A computational model of auditory attention for use in soundscape research. J Acoust Soc Am 2013; 134:852-861. [PMID: 23862891] [DOI: 10.1121/1.4807798]
Abstract
Urban soundscape design involves creating outdoor spaces that are pleasing to the ear. One way to achieve this goal is to add or accentuate sounds that are considered to be desired by most users of the space, such that the desired sounds mask undesired sounds, or at least distract attention away from undesired sounds. To remove the need for a listening panel to assess the effectiveness of such soundscape measures, interest in new models and techniques is growing. In this paper, a model of auditory attention to environmental sound is presented, which balances computational complexity and biological plausibility. Once the model is trained for a particular location, it classifies the sounds that are present in the soundscape and simulates how a typical listener would switch attention over time between different sounds. The model provides an acoustic summary, giving the soundscape designer a quick overview of the typical sounds at a particular location, and allows assessment of the perceptual effect of introducing additional sounds.
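The attention-switching behavior such a model simulates can be caricatured in a few lines. This sketch is purely illustrative: the three sound classes, the saliency values, and the 0.9/0.5 inhibition constants are invented here and are not taken from the authors' model.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical instantaneous saliency of three labelled sound classes
# (say "birds", "traffic", "voices") over 60 time steps.
saliency = np.abs(rng.standard_normal((3, 60))) * np.array([[1.0], [1.5], [0.8]])

# Winner-take-all attention with inhibition of return: the most salient
# class is attended, but attending builds up inhibition on that class,
# so attention is eventually released to competing sounds.
inhibition = np.zeros(3)
attended = []
for t in range(saliency.shape[1]):
    effective = saliency[:, t] - inhibition
    winner = int(np.argmax(effective))
    attended.append(winner)
    inhibition *= 0.9          # inhibition decays over time...
    inhibition[winner] += 0.5  # ...but accumulates on the attended class

switches = sum(a != b for a, b in zip(attended, attended[1:]))
print(switches)  # attention switches repeatedly rather than locking on
```

The resulting sequence of attended classes is exactly the kind of "acoustic summary" the abstract describes: which sounds a typical listener would notice at a location, and how often attention would move between them.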
Affiliation(s)
- Damiano Oldoni
- Acoustics Research Group, Department of Information Technology, Ghent University, St.-Pietersnieuwstraat 41, B-9000 Ghent, Belgium
95
Spatiotemporal coordination of slow-wave ongoing activity across auditory cortical areas. J Neurosci 2013; 33:3299-310. [PMID: 23426658] [DOI: 10.1523/jneurosci.5079-12.2013]
Abstract
Natural acoustic stimuli contain slow temporal fluctuations, and the modulation of ongoing slow-wave activity by bottom-up and top-down factors plays essential roles in auditory cortical processing. However, the spatiotemporal pattern of intrinsic slow-wave activity across the auditory cortical modality is unknown. Using in vivo voltage-sensitive dye imaging in anesthetized guinea pigs, we measured spectral tuning to acoustic stimuli across several core and belt auditory cortical areas, and then recorded spontaneous activity across this defined network. We found that phase coherence in spontaneous slow-wave (delta-theta band) activity was highest between regions of core and belt areas that had similar frequency tuning, even if they were distant. Further, core and belt regions with high phase coherence were phase shifted. Interestingly, phase shifts observed during spontaneous activity paralleled latency differences for evoked activity. Our findings suggest that the circuits underlying this intrinsic source of slow-wave activity support coordinated changes in excitability between functionally matched but distributed regions of the auditory cortical network.
96
Zion Golumbic EM, Ding N, Bickel S, Lakatos P, Schevon CA, McKhann GM, Goodman RR, Emerson R, Mehta AD, Simon JZ, Poeppel D, Schroeder CE. Mechanisms underlying selective neuronal tracking of attended speech at a "cocktail party". Neuron 2013; 77:980-91. [PMID: 23473326] [DOI: 10.1016/j.neuron.2012.12.037]
Abstract
The ability to focus on and understand one talker in a noisy social environment is a critical social-cognitive capacity, whose underlying neuronal mechanisms are unclear. We investigated the manner in which speech streams are represented in brain activity and the way that selective attention governs the brain's representation of speech using a "Cocktail Party" paradigm, coupled with direct recordings from the cortical surface in surgical epilepsy patients. We find that brain activity dynamically tracks speech streams using both low-frequency phase and high-frequency amplitude fluctuations and that optimal encoding likely combines the two. In and near low-level auditory cortices, attention "modulates" the representation by enhancing cortical tracking of attended speech streams, but ignored speech remains represented. In higher-order regions, the representation appears to become more "selective," in that there is no detectable tracking of ignored speech. This selectivity itself seems to sharpen as a sentence unfolds.
Affiliation(s)
- Elana M Zion Golumbic
- Department of Psychiatry, Columbia University College of Physicians and Surgeons, New York, NY, USA
97
Lakatos P, Musacchia G, O'Connel MN, Falchier AY, Javitt DC, Schroeder CE. The spectrotemporal filter mechanism of auditory selective attention. Neuron 2013; 77:750-61. [PMID: 23439126] [DOI: 10.1016/j.neuron.2012.11.034]
Abstract
Although we have convincing evidence that attention to auditory stimuli modulates neuronal responses at or before the level of primary auditory cortex (A1), the underlying physiological mechanisms are unknown. We found that attending to rhythmic auditory streams resulted in the entrainment of ongoing oscillatory activity reflecting rhythmic excitability fluctuations in A1. Strikingly, although the rhythm of the entrained oscillations in A1 neuronal ensembles reflected the temporal structure of the attended stream, the phase depended on the attended frequency content. Counter-phase entrainment across differently tuned A1 regions resulted in both the amplification and sharpening of responses at attended time points, in essence acting as a spectrotemporal filter mechanism. Our data suggest that selective attention generates a dynamically evolving model of attended auditory stimulus streams in the form of modulatory subthreshold oscillations across tonotopically organized neuronal ensembles in A1 that enhances the representation of attended stimuli.
Affiliation(s)
- Peter Lakatos
- Cognitive Neuroscience and Schizophrenia Program, Nathan Kline Institute, Orangeburg, NY 10962, USA.
98
Tian X, Poeppel D. The effect of imagination on stimulation: the functional specificity of efference copies in speech processing. J Cogn Neurosci 2013; 25:1020-36. [PMID: 23469885] [DOI: 10.1162/jocn_a_00381]
Abstract
The computational role of efference copies is widely appreciated in action and perception research, but their properties for speech processing remain murky. We tested the functional specificity of auditory efference copies using magnetoencephalography recordings in an unconventional pairing: we used a classical cognitive manipulation (mental imagery, to elicit internal simulation and estimation) with a well-established experimental paradigm (one-shot repetition, to assess neuronal specificity). Participants performed tasks that differentially implicated internal prediction of sensory consequences (overt speaking, imagined speaking, and imagined hearing), and the modulatory effects of these tasks on the perception of an auditory (syllable) probe were assessed. Remarkably, the neural responses to overt syllable probes vary systematically, both in terms of directionality (suppression, enhancement) and temporal dynamics (early, late), as a function of the preceding covert mental imagery adaptor. We show, in the context of a dual-pathway model, that internal simulation shapes perception in a context-dependent manner.
Affiliation(s)
- Xing Tian
- New York University, New York, NY, USA.
99
Xiang J, Poeppel D, Simon JZ. Physiological evidence for auditory modulation filterbanks: cortical responses to concurrent modulations. J Acoust Soc Am 2013; 133:EL7-EL12. [PMID: 23298020] [PMCID: PMC3555506] [DOI: 10.1121/1.4769400]
Abstract
Modern psychophysical models of auditory modulation processing suggest that concurrent auditory features with syllabic (~5 Hz) and phonemic rates (~20 Hz) are processed by different modulation filterbank elements, whereas features at similar modulation rates are processed together by a single element. The neurophysiology of concurrent modulation processing at speech-relevant rates is here investigated using magnetoencephalography. Results demonstrate expected neural responses to stimulus modulation frequencies; nonlinear interaction frequencies are also present, but, critically, only for nearby rates, analogous to "beating" in a cochlear filter. This provides direct physiological evidence for modulation filterbanks, allowing separate processing of concurrent syllabic and phonemic modulations.
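The within-filter "beating" logic can be sketched as a toy simulation. Everything here is a stand-in under stated assumptions, not the physiological model: the FFT-mask bandpass plays the role of one modulation filter, the 3-6 Hz passband and the squaring nonlinearity are arbitrary choices.

```python
import numpy as np

fs, dur = 200, 50                       # sample rate (Hz) and duration (s)
t = np.arange(0, dur, 1 / fs)
freqs = np.fft.rfftfreq(len(t), 1 / fs)

def bandpass(x, lo, hi):
    """Crude FFT-mask bandpass, standing in for one modulation filter."""
    spec = np.fft.rfft(x)
    spec[(freqs < lo) | (freqs > hi)] = 0
    return np.fft.irfft(spec, n=len(x))

def difference_component(f1, f2):
    """Strength at f2 - f1 after a 3-6 Hz modulation filter followed by a
    squaring nonlinearity, for concurrent modulations at f1 and f2 Hz."""
    env = np.sin(2 * np.pi * f1 * t) + np.sin(2 * np.pi * f2 * t)
    out = bandpass(env, 3.0, 6.0) ** 2  # nonlinearity after the filter
    spec = np.abs(np.fft.rfft(out - out.mean()))
    return spec[np.argmin(np.abs(freqs - (f2 - f1)))] / len(t)

near = difference_component(4, 5)   # both rates fall in the same filter
far = difference_component(4, 20)   # rates handled by different filters
print(near > 10 * far)  # interaction ("beating") only for nearby rates
```

When both rates pass the same filter, the nonlinearity generates a strong component at the 1 Hz difference frequency; when one rate is filtered out first, no interaction term can arise, which is the filterbank signature the MEG data show.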
Affiliation(s)
- Juanjuan Xiang
- Department of Electrical and Computer Engineering, University of Maryland, College Park, Maryland 20815, USA.
100
A precluding but not ensuring role of entrained low-frequency oscillations for auditory perception. J Neurosci 2012; 32:12268-76. [PMID: 22933808] [DOI: 10.1523/jneurosci.1877-12.2012]
Abstract
Oscillatory activity in sensory cortices reflects changes in local excitation-inhibition balance, and recent work suggests that phase signatures of ongoing oscillations predict the perceptual detection of subsequent stimuli. Low-frequency oscillations are also entrained by dynamic natural scenes, suggesting that the chance of detecting a brief target depends on the timing of the target relative to the entrained rhythm. We tested this hypothesis in humans by implementing a cocktail-party-like scenario requiring subjects to detect a target embedded in a cacophony of background sounds. Using EEG to measure auditory cortical oscillations, we find that the chance of target detection systematically depends on both power and phase of theta-band (2-6 Hz) but not alpha-band (8-12 Hz) oscillations before the target. Detection rates were higher and responses faster when oscillatory power was low, and both detection rate and response speed were modulated by phase. Intriguingly, the phase dependency was stronger for miss than for hit trials, suggesting that phase has a precluding but not ensuring role for detection. Entrainment of theta range oscillations prominently occurs during the processing of attended complex stimuli, such as vocalizations and speech. Our results demonstrate that this entrainment to attended sensory environments may have negative effects on the detection of individual tokens within the environment, and they support the notion that specific phase ranges of cortical oscillations act as gatekeepers for perception.
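The phase-sorting analysis behind such findings can be sketched on simulated trials. This is illustrative only: the cosine-shaped hit probability and the eight phase bins are assumptions made for the sketch, not the study's data or method.

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulated single trials: pre-target theta phase and detection outcome.
# Hypothetical generative model: hit probability depends on the cosine
# of the pre-target phase.
n_trials = 2000
phase = rng.uniform(-np.pi, np.pi, n_trials)
p_hit = 0.5 + 0.3 * np.cos(phase)
hit = rng.random(n_trials) < p_hit

# Bin trials by pre-target phase and compute the detection rate per bin,
# as in phase-sorting analyses of ongoing-oscillation studies.
edges = np.linspace(-np.pi, np.pi, 9)
rates = [hit[(phase >= lo) & (phase < hi)].mean()
         for lo, hi in zip(edges[:-1], edges[1:])]

# Phase modulation depth: spread of detection rate across phase bins.
depth = max(rates) - min(rates)
print(round(depth, 2))  # clearly above zero: detection depends on phase
```

A flat profile of `rates` across bins would indicate no phase dependence; the large spread here recovers the phase modulation built into the simulation.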