1. Doll L, Dykstra AR, Gutschalk A. Perceptual awareness of near-threshold tones scales gradually with auditory cortex activity and pupil dilation. iScience 2024;27:110530. PMID: 39175766; PMCID: PMC11338958; DOI: 10.1016/j.isci.2024.110530.
Abstract
Negative-going responses in sensory cortex co-vary with perceptual awareness of sensory stimuli. Given that this awareness negativity has also been observed for undetected stimuli, some have challenged its role for perception. To address this question, we combined magnetoencephalography, electroencephalography, and pupillometry to study how sustained attention and response criterion affect the auditory awareness negativity. Participants first detected distractor sounds and denied hearing task-irrelevant near-threshold tones, which evoked neither awareness negativity nor pupil dilation. These same tones evoked both responses when task-relevant, stronger for hit but also present for miss trials. Participants then rated their perception on a six-point scale to test whether response criterion explains the presence of these responses for miss trials. Decreasing perception ratings were associated with gradually reduced evoked responses, consistent with signal detection theory. These results support the concept of an awareness negativity that is modulated by attention but does not require a non-linear threshold mechanism.
Affiliation(s)
- Laura Doll: Department of Neurology, Ruprecht-Karls-Universität Heidelberg, 69120 Heidelberg, Germany
- Andrew R. Dykstra: School of Communication Sciences and Disorders, University of Central Florida, Orlando, FL, USA
- Alexander Gutschalk: Department of Neurology, Ruprecht-Karls-Universität Heidelberg, 69120 Heidelberg, Germany
2. Englitz B, Akram S, Elhilali M, Shamma S. Decoding contextual influences on auditory perception from primary auditory cortex. bioRxiv [Preprint] 2024:2023.12.24.573229. PMID: 38187523; PMCID: PMC10769425; DOI: 10.1101/2023.12.24.573229.
Abstract
Perception can be highly dependent on stimulus context, but whether and how sensory areas encode the context remains uncertain. We used an ambiguous auditory stimulus - a tritone pair - to investigate the neural activity associated with a preceding contextual stimulus that strongly influenced the tritone pair's perception: either as an ascending or a descending step in pitch. We recorded single-unit responses from a population of auditory cortical cells in awake ferrets listening to the tritone pairs preceded by the contextual stimulus. We find that the responses adapt locally to the contextual stimulus, consistent with human MEG recordings from the auditory cortex under the same conditions. Decoding the population responses demonstrates that cells responding to pitch-class changes predict the context-sensitive percept of the tritone pairs well. Conversely, decoding the individual pitch-class representations and taking their distance in the circular Shepard tone space predicts the opposite of the percept. The various percepts can be readily captured and explained by a neural model of cortical activity based on populations of adapting, pitch-class and pitch-class-direction cells, aligned with the neurophysiological responses. Together, these decoding and model results suggest that contextual influences on perception may already be encoded at the level of the primary sensory cortices, reflecting basic neural response properties commonly found in these areas.
3. Petley L, Blankenship C, Hunter LL, Stewart HJ, Lin L, Moore DR. Amplitude modulation perception and cortical evoked potentials in children with listening difficulties and their typically developing peers. J Speech Lang Hear Res 2024;67:633-656. PMID: 38241680; PMCID: PMC11000788; DOI: 10.1044/2023_jslhr-23-00317.
Abstract
Purpose: Amplitude modulations (AMs) are important for speech intelligibility, and deficits in speech intelligibility are a leading source of impairment in childhood listening difficulties (LiD). The present study aimed to explore the relationships between AM perception and speech-in-noise (SiN) comprehension in children and to determine whether deficits in AM processing contribute to childhood LiD. Evoked responses were used to parse the neural origins of AM processing.
Method: Forty-one children with LiD and 44 typically developing children, ages 8-16 years, participated in the study. Behavioral AM depth thresholds were measured at 4 and 40 Hz. SiN tasks included the Listening in Spatialized Noise-Sentences Test (LiSN-S) and a coordinate response measure (CRM)-based task. Evoked responses were obtained during an AM change detection task using alternations between 4 and 40 Hz, including the N1 of the acoustic change complex, auditory steady-state response (ASSR), P300, and a late positive response (late potential [LP]). Maturational effects were explored via age correlations.
Results: Age correlated with 4-Hz AM thresholds, CRM separated talker scores, and N1 amplitude. Age-normed LiSN-S scores obtained without spatial or talker cues correlated with age-corrected 4-Hz AM thresholds and area under the LP curve. CRM separated talker scores correlated with AM thresholds and area under the LP curve. Most behavioral measures of AM perception correlated with the signal-to-noise ratio and phase coherence of the 40-Hz ASSR. AM change response time also correlated with area under the LP curve. Children with LiD exhibited deficits with respect to 4-Hz thresholds, AM change accuracy, and area under the LP curve.
Conclusions: The observed relationships between AM perception and SiN performance extend the evidence that modulation perception is important for understanding SiN in childhood. In line with this finding, children with LiD demonstrated poorer performance on some measures of AM perception, but their evoked responses implicated a primarily cognitive deficit.
Supplemental material: https://doi.org/10.23641/asha.25009103
Affiliation(s)
- Lauren Petley: Communication Sciences Research Center, Cincinnati Children's Hospital Medical Center, OH; Patient Services Research, Cincinnati Children's Hospital Medical Center, OH; Department of Psychology, Clarkson University, Potsdam, NY
- Chelsea Blankenship: Communication Sciences Research Center, Cincinnati Children's Hospital Medical Center, OH; Patient Services Research, Cincinnati Children's Hospital Medical Center, OH
- Lisa L. Hunter: Communication Sciences Research Center, Cincinnati Children's Hospital Medical Center, OH; Patient Services Research, Cincinnati Children's Hospital Medical Center, OH; Department of Otolaryngology, College of Medicine, University of Cincinnati, OH; Department of Communication Sciences and Disorders, College of Allied Health Sciences, University of Cincinnati, OH
- Li Lin: Communication Sciences Research Center, Cincinnati Children's Hospital Medical Center, OH; Patient Services Research, Cincinnati Children's Hospital Medical Center, OH
- David R. Moore: Communication Sciences Research Center, Cincinnati Children's Hospital Medical Center, OH; Patient Services Research, Cincinnati Children's Hospital Medical Center, OH; Department of Otolaryngology, College of Medicine, University of Cincinnati, OH; Manchester Centre for Audiology and Deafness, The University of Manchester, United Kingdom
4. Magnuson JS, Crinnion AM, Luthra S, Gaston P, Grubb S. Contra assertions, feedback improves word recognition: How feedback and lateral inhibition sharpen signals over noise. Cognition 2024;242:105661. PMID: 37944313; PMCID: PMC11238470; DOI: 10.1016/j.cognition.2023.105661.
Abstract
Whether top-down feedback modulates perception has deep implications for cognitive theories. Debate has been vigorous in the domain of spoken word recognition, where competing computational models and agreement on at least one diagnostic experimental paradigm suggest that the debate may eventually be resolvable. Norris and Cutler (2021) revisit arguments against lexical feedback in spoken word recognition models. They incorrectly claim that recent computational demonstrations that feedback promotes accuracy and speed under noise (Magnuson et al., 2018) were due to the use of the Luce choice rule rather than adding noise to inputs (noise was in fact added directly to inputs). They further claim that feedback cannot improve word recognition because feedback cannot distinguish signal from noise. We have two goals in this paper. First, we correct the record about the simulations of Magnuson et al. (2018). Second, we explain how interactive activation models selectively sharpen signals via joint effects of feedback and lateral inhibition that boost lexically-coherent sublexical patterns over noise. We also review a growing body of behavioral and neural results consistent with feedback and inconsistent with autonomous (non-feedback) architectures, and conclude that parsimony supports feedback. We close by discussing the potential for synergy between autonomous and interactive approaches.
Affiliation(s)
- James S Magnuson: University of Connecticut, Storrs, CT, USA; BCBL, Basque Center on Cognition Brain and Language, Donostia-San Sebastián, Spain; Ikerbasque, Basque Foundation for Science, Bilbao, Spain
5. Commuri V, Kulasingham JP, Simon JZ. Cortical responses time-locked to continuous speech in the high-gamma band depend on selective attention. Front Neurosci 2023;17:1264453. PMID: 38156264; PMCID: PMC10752935; DOI: 10.3389/fnins.2023.1264453.
Abstract
Auditory cortical responses to speech obtained by magnetoencephalography (MEG) show robust speech tracking of the speaker's fundamental frequency in the high-gamma band (70-200 Hz), but little is currently known about whether such responses depend on the focus of selective attention. In this study, 22 human subjects listened to concurrent, fixed-rate speech from male and female speakers, and were asked to selectively attend to one speaker at a time, while their neural responses were recorded with MEG. The male speaker's pitch range coincided with the lower range of the high-gamma band, whereas the female speaker's higher pitch range had much less overlap, and only at the upper end of the high-gamma band. Neural responses were analyzed using the temporal response function (TRF) framework. As expected, the responses demonstrate robust speech tracking of the fundamental frequency in the high-gamma band, but only to the male's speech, with a peak latency of ~40 ms. Critically, the response magnitude depends on selective attention: the response to the male speech is significantly greater when male speech is attended than when it is not attended, under acoustically identical conditions. This is a clear demonstration that even very early cortical auditory responses are influenced by top-down, cognitive, neural processing mechanisms.
Affiliation(s)
- Vrishab Commuri: Department of Electrical and Computer Engineering, University of Maryland, College Park, MD, United States
- Jonathan Z. Simon: Department of Electrical and Computer Engineering, University of Maryland, College Park, MD, United States; Department of Biology, University of Maryland, College Park, MD, United States; Institute for Systems Research, University of Maryland, College Park, MD, United States
6. Kothinti SR, Elhilali M. Are acoustics enough? Semantic effects on auditory salience in natural scenes. Front Psychol 2023;14:1276237. PMID: 38098516; PMCID: PMC10720592; DOI: 10.3389/fpsyg.2023.1276237.
Abstract
Auditory salience is a fundamental property of a sound that allows it to grab a listener's attention regardless of their attentional state or behavioral goals. While previous research has shed light on acoustic factors influencing auditory salience, the semantic dimensions of this phenomenon have remained relatively unexplored, owing both to the complexity of measuring salience in audition and to the limited focus on complex natural scenes. In this study, we examine the relationship between acoustic, contextual, and semantic attributes and their impact on the auditory salience of natural audio scenes using a dichotic listening paradigm. The experiments present acoustic scenes in forward and backward directions; the latter diminishes semantic effects, providing a counterpoint to the effects observed in forward scenes. The behavioral data collected from a crowd-sourced platform reveal a striking convergence in temporal salience maps for certain sound events, while marked disparities emerge in others. Our main hypothesis posits that differences in the perceptual salience of events are predominantly driven by semantic and contextual cues, particularly evident in those cases displaying substantial disparities between forward and backward presentations. Conversely, events exhibiting a high degree of alignment can largely be attributed to low-level acoustic attributes. To evaluate this hypothesis, we employ analytical techniques that combine rich low-level mappings from acoustic profiles with high-level embeddings extracted from a deep neural network. This integrated approach captures both acoustic and semantic attributes of acoustic scenes along with their temporal trajectories. The results demonstrate that perceptual salience reflects a careful interplay between low-level and high-level attributes that shapes which moments stand out in a natural soundscape. Furthermore, our findings underscore the important role of longer-term context as a critical component of auditory salience, enabling listeners to discern and adapt to temporal regularities within an acoustic scene. The experimental and model-based validation of semantic factors of salience paves the way for a more complete understanding of auditory salience. Ultimately, the empirical and computational analyses have implications for developing large-scale models of auditory salience and audio analytics.
Affiliation(s)
- Mounya Elhilali: Department of Electrical and Computer Engineering, Center for Language and Speech Processing, The Johns Hopkins University, Baltimore, MD, United States
7. Petley L, Blankenship C, Hunter LL, Stewart HJ, Lin L, Moore DR. Amplitude modulation perception and cortical evoked potentials in children with listening difficulties and their typically-developing peers. medRxiv [Preprint] 2023:2023.10.26.23297523. PMID: 37961469; PMCID: PMC10635202; DOI: 10.1101/2023.10.26.23297523.
Abstract
Purpose: Amplitude modulations (AM) are important for speech intelligibility, and deficits in speech intelligibility are a leading source of impairment in childhood listening difficulties (LiD). The present study aimed to explore the relationships between AM perception and speech-in-noise (SiN) comprehension in children and to determine whether deficits in AM processing contribute to childhood LiD. Evoked responses were used to parse the neural origin of AM processing.
Method: Forty-one children with LiD and forty-four typically-developing children, ages 8-16 years old, participated in the study. Behavioral AM depth thresholds were measured at 4 and 40 Hz. SiN tasks included the LiSN-S and a Coordinate Response Measure (CRM)-based task. Evoked responses were obtained during an AM Change detection task using alternations between 4 and 40 Hz, including the N1 of the acoustic change complex, auditory steady-state response (ASSR), P300, and a late positive response (LP). Maturational effects were explored via age correlations.
Results: Age correlated with 4 Hz AM thresholds, CRM Separated Talker scores, and N1 amplitude. Age-normed LiSN-S scores obtained without spatial or talker cues correlated with age-corrected 4 Hz AM thresholds and area under the LP curve. CRM Separated Talker scores correlated with AM thresholds and area under the LP curve. Most behavioral measures of AM perception correlated with the SNR and phase coherence of the 40 Hz ASSR. AM Change RT also correlated with area under the LP curve. Children with LiD exhibited deficits with respect to 4 Hz thresholds, AM Change accuracy, and area under the LP curve.
Conclusions: The observed relationships between AM perception and SiN performance extend the evidence that modulation perception is important for understanding SiN in childhood. In line with this finding, children with LiD demonstrated poorer performance on some measures of AM perception, but their evoked responses implicated a primarily cognitive deficit.
8. Brown JA, Bidelman GM. Attention, musicality, and familiarity shape cortical speech tracking at the musical cocktail party. bioRxiv [Preprint] 2023:2023.10.28.562773. PMID: 37961204; PMCID: PMC10634879; DOI: 10.1101/2023.10.28.562773.
Abstract
The "cocktail party problem" challenges our ability to understand speech in noisy environments, which often include background music. Here, we explored the role of background music in speech-in-noise listening. Participants listened to an audiobook presented with familiar or unfamiliar background music while tracking keywords in either the speech or the song lyrics. We used EEG to measure neural tracking of the audiobook. When speech was masked by music, the modeled peak latency at 50 ms (P1TRF) was prolonged compared to unmasked speech. Additionally, P1TRF amplitude was larger with unfamiliar background music, suggesting improved speech tracking. We observed prolonged latencies at 100 ms (N1TRF) when speech was not the attended stimulus, though only in less musical listeners. Our results suggest that early neural representations of speech are enhanced with both attention and concurrent unfamiliar music, indicating that familiar music is more distracting. One's ability to perceptually filter "musical noise" at the cocktail party depends on objective musical abilities.
Affiliation(s)
- Jane A. Brown: School of Communication Sciences & Disorders, University of Memphis, Memphis, TN, USA; Institute for Intelligent Systems, University of Memphis, Memphis, TN 38152, USA
- Gavin M. Bidelman: Department of Speech, Language and Hearing Sciences, Indiana University, Bloomington, IN, USA; Program in Neuroscience, Indiana University, Bloomington, IN, USA; Cognitive Science Program, Indiana University, Bloomington, IN, USA
9. Commuri V, Kulasingham JP, Simon JZ. Cortical responses time-locked to continuous speech in the high-gamma band depend on selective attention. bioRxiv [Preprint] 2023:2023.07.20.549567. PMID: 37546895; PMCID: PMC10401961; DOI: 10.1101/2023.07.20.549567.
Abstract
Auditory cortical responses to speech obtained by magnetoencephalography (MEG) show robust speech tracking of the speaker's fundamental frequency in the high-gamma band (70-200 Hz), but little is currently known about whether such responses depend on the focus of selective attention. In this study, 22 human subjects listened to concurrent, fixed-rate speech from male and female speakers, and were asked to selectively attend to one speaker at a time, while their neural responses were recorded with MEG. The male speaker's pitch range coincided with the lower range of the high-gamma band, whereas the female speaker's higher pitch range had much less overlap, and only at the upper end of the high-gamma band. Neural responses were analyzed using the temporal response function (TRF) framework. As expected, the responses demonstrate robust speech tracking of the fundamental frequency in the high-gamma band, but only to the male's speech, with a peak latency of approximately 40 ms. Critically, the response magnitude depends on selective attention: the response to the male speech is significantly greater when male speech is attended than when it is not attended, under acoustically identical conditions. This is a clear demonstration that even very early cortical auditory responses are influenced by top-down, cognitive, neural processing mechanisms.
Affiliation(s)
- Vrishab Commuri: Department of Electrical and Computer Engineering, University of Maryland, College Park, MD, United States
- Jonathan Z. Simon: Department of Electrical and Computer Engineering, University of Maryland, College Park, MD, United States; Department of Biology, University of Maryland, College Park, MD, United States; Institute for Systems Research, University of Maryland, College Park, MD, United States
10. Fischer M, Moscovitch M, Fukuda K, Alain C. Ready for action! When the brain learns, yet memory-biased action does not follow. Neuropsychologia 2023;189:108660. PMID: 37604333; DOI: 10.1016/j.neuropsychologia.2023.108660.
Abstract
Does memory prepare us to act? Long-term memory can facilitate signal detection, though the degree of benefit varies and can even be absent. To dissociate between learning and behavioral expression of learning, we used high-density electroencephalography (EEG) to assess memory retrieval and response processing. At learning, participants heard everyday sounds. Half of these sound clips were paired with an above-threshold lateralized tone, such that it was possible to form incidental associations between the sound clip and the location of the tone. Importantly, attention was directed to either the sound clip (Experiment 1) or the tone (Experiment 2). Participants then completed a novel detection task that separated cued retrieval from response processing. At retrieval, we observed a striking brain-behavior dissociation. Learning was observed neurally in both experiments. Behaviorally, however, signal detection was only facilitated in Experiment 2, for which there was an accompanying explicit memory for tone presence. Further, implicit neural memory for tone location correlated with the degree of response preparation, but not response execution. Together, the findings suggest 1) that attention at learning affects memory-biased action and 2) that memory prepares action via both explicit and implicit associative memory, with the latter triggering response preparation.
Affiliation(s)
- Manda Fischer: Department of Psychology, University of Toronto, Toronto, Canada; Rotman Research Institute at Baycrest Hospital, Toronto, Canada
- Morris Moscovitch: Department of Psychology, University of Toronto, Toronto, Canada; Rotman Research Institute at Baycrest Hospital, Toronto, Canada
- Keisuke Fukuda: Department of Psychology, University of Toronto, Toronto, Canada
- Claude Alain: Department of Psychology, University of Toronto, Toronto, Canada; Rotman Research Institute at Baycrest Hospital, Toronto, Canada
11. Pomper U, Curetti LZ, Chait M. Neural dynamics underlying successful auditory short-term memory performance. Eur J Neurosci 2023;58:3859-3878. PMID: 37691137; PMCID: PMC10946728; DOI: 10.1111/ejn.16140.
Abstract
Listeners often operate in complex acoustic environments, consisting of many concurrent sounds. Accurately encoding and maintaining such auditory objects in short-term memory is crucial for communication and scene analysis. Yet, the neural underpinnings of successful auditory short-term memory (ASTM) performance are currently not well understood. To elucidate this issue, we presented a novel, challenging auditory delayed match-to-sample task while recording MEG. Human participants listened to 'scenes' comprising three concurrent tone pip streams. The task was to indicate, after a delay, whether a probe stream was present in the just-heard scene. We present three key findings: First, behavioural performance revealed faster responses in correct versus incorrect trials as well as in 'probe present' versus 'probe absent' trials, consistent with ASTM search. Second, successful compared with unsuccessful ASTM performance was associated with a significant enhancement of event-related fields and oscillatory activity in the theta, alpha and beta frequency ranges. This extends previous findings of an overall increase of persistent activity during short-term memory performance. Third, using distributed source modelling, we found these effects to be confined mostly to sensory areas during encoding, presumably related to ASTM contents per se. Parietal and frontal sources then became relevant during the maintenance stage, indicating that effective STM operation also relies on ongoing inhibitory processes suppressing task-irrelevant information. In summary, our results deliver a detailed account of the neural patterns that differentiate successful from unsuccessful ASTM performance in the context of a complex, multi-object auditory scene.
Affiliation(s)
- Ulrich Pomper: Ear Institute, University College London, London, UK; Faculty of Psychology, University of Vienna, Vienna, Austria
- Maria Chait: Ear Institute, University College London, London, UK
12. Veyrié A, Noreña A, Sarrazin JC, Pezard L. Information-theoretic approaches in EEG correlates of auditory perceptual awareness under informational masking. Biology (Basel) 2023;12:967. PMID: 37508397; PMCID: PMC10376775; DOI: 10.3390/biology12070967.
Abstract
In informational masking paradigms, successful segregation of the target from the masker creates auditory perceptual awareness. The dynamics of the build-up of auditory perception are based on a set of interactions between bottom-up and top-down processes that modify activity within brain networks. These neural changes are studied here using event-related potentials (ERPs), entropy, and integrated information, leading to several measures applied to electroencephalogram signals. The main findings show that auditory perceptual awareness engaged the fronto-temporo-parietal brain network through (i) negative temporal and positive centro-parietal ERP components; (ii) enhanced processing of multi-information in the temporal cortex; and (iii) an increase in informational content in the fronto-central cortex. These results provide information-based experimental evidence of functional activation of the fronto-temporo-parietal brain network during auditory perceptual awareness.
Affiliation(s)
- Alexandre Veyrié: Centre National de la Recherche Scientifique (UMR 7291), Laboratoire de Neurosciences Cognitives, Aix-Marseille Université, 13331 Marseille, France; ONERA, The French Aerospace Lab, 13300 Salon de Provence, France
- Arnaud Noreña: Centre National de la Recherche Scientifique (UMR 7291), Laboratoire de Neurosciences Cognitives, Aix-Marseille Université, 13331 Marseille, France
- Laurent Pezard: Centre National de la Recherche Scientifique (UMR 7291), Laboratoire de Neurosciences Cognitives, Aix-Marseille Université, 13331 Marseille, France
13. Makov S, Pinto D, Har-Shai Yahav P, Miller LM, Zion Golumbic E. "Unattended, distracting or irrelevant": Theoretical implications of terminological choices in auditory selective attention research. Cognition 2023;231:105313. PMID: 36344304; DOI: 10.1016/j.cognition.2022.105313.
Abstract
For seventy years, auditory selective attention research has focused on studying the cognitive mechanisms of prioritizing the processing of a 'main' task-relevant stimulus in the presence of 'other' stimuli. However, a closer look at this body of literature reveals deep empirical inconsistencies and theoretical confusion regarding the extent to which this 'other' stimulus is processed. We argue that many key debates regarding attention arise, at least in part, from inappropriate terminological choices for experimental variables that may not accurately map onto the cognitive constructs they are meant to describe. Here we critically review the more common or disruptive terminological ambiguities, differentiate between methodology-based and theory-derived terms, and unpack the theoretical assumptions underlying different terminological choices. Particularly, we offer an in-depth analysis of the terms 'unattended' and 'distractor' and demonstrate how their use can lead to conflicting theoretical inferences. We also offer a framework for thinking about terminology in a more productive and precise way, in the hope of fostering more constructive debates and promoting more nuanced and accurate cognitive models of selective attention.
Affiliation(s)
- Shiri Makov: The Gonda Multidisciplinary Center for Brain Research, Bar Ilan University, Israel
- Danna Pinto: The Gonda Multidisciplinary Center for Brain Research, Bar Ilan University, Israel
- Paz Har-Shai Yahav: The Gonda Multidisciplinary Center for Brain Research, Bar Ilan University, Israel
- Lee M Miller: The Center for Mind and Brain, University of California, Davis, CA, United States of America; Department of Neurobiology, Physiology, & Behavior, University of California, Davis, CA, United States of America; Department of Otolaryngology / Head and Neck Surgery, University of California, Davis, CA, United States of America
- Elana Zion Golumbic: The Gonda Multidisciplinary Center for Brain Research, Bar Ilan University, Israel
14
Du X, Hare S, Summerfelt A, Adhikari BM, Garcia L, Marshall W, Zan P, Kvarta M, Goldwaser E, Bruce H, Gao S, Sampath H, Kochunov P, Simon JZ, Hong LE. Cortical connectomic mediations on gamma band synchronization in schizophrenia. Transl Psychiatry 2023; 13:13. PMID: 36653335; PMCID: PMC9849210; DOI: 10.1038/s41398-022-02300-6.
Abstract
Aberrant gamma frequency neural oscillations in schizophrenia have been well demonstrated using auditory steady-state responses (ASSR). However, the neural circuits underlying 40 Hz ASSR deficits in schizophrenia remain poorly understood. Sixty-six patients with schizophrenia spectrum disorders and 85 age- and gender-matched healthy controls completed one electroencephalography session measuring 40 Hz ASSR and one imaging session for resting-state functional connectivity (rsFC) assessments. The associations between the normalized power of 40 Hz ASSR and rsFC were assessed via linear regression and mediation models. We found that rsFC among auditory, precentral, postcentral, and prefrontal cortices was positively associated with 40 Hz ASSR in patients and controls separately and in the combined sample. The mediation analysis further confirmed that the deficit of gamma band ASSR in schizophrenia was nearly fully mediated by three of the rsFC circuits: right superior temporal gyrus-left medial prefrontal cortex (MPFC), left MPFC-left postcentral gyrus (PoG), and left precentral gyrus-right PoG. Gamma-band ASSR deficits in schizophrenia may be associated with deficient circuitry-level connectivity to support gamma frequency synchronization. Correcting gamma band deficits in schizophrenia may therefore require interventions that normalize these aberrant networks.
Affiliation(s)
- Xiaoming Du: Maryland Psychiatric Research Center, Department of Psychiatry, University of Maryland School of Medicine, Baltimore, MD, USA
- Stephanie Hare: Maryland Psychiatric Research Center, Department of Psychiatry, University of Maryland School of Medicine, Baltimore, MD, USA
- Ann Summerfelt: Maryland Psychiatric Research Center, Department of Psychiatry, University of Maryland School of Medicine, Baltimore, MD, USA
- Bhim M Adhikari: Maryland Psychiatric Research Center, Department of Psychiatry, University of Maryland School of Medicine, Baltimore, MD, USA
- Laura Garcia: Maryland Psychiatric Research Center, Department of Psychiatry, University of Maryland School of Medicine, Baltimore, MD, USA
- Wyatt Marshall: Maryland Psychiatric Research Center, Department of Psychiatry, University of Maryland School of Medicine, Baltimore, MD, USA
- Peng Zan: Department of Electrical & Computer Engineering, University of Maryland, College Park, MD, USA
- Mark Kvarta: Maryland Psychiatric Research Center, Department of Psychiatry, University of Maryland School of Medicine, Baltimore, MD, USA
- Eric Goldwaser: Maryland Psychiatric Research Center, Department of Psychiatry, University of Maryland School of Medicine, Baltimore, MD, USA
- Heather Bruce: Maryland Psychiatric Research Center, Department of Psychiatry, University of Maryland School of Medicine, Baltimore, MD, USA
- Si Gao: Maryland Psychiatric Research Center, Department of Psychiatry, University of Maryland School of Medicine, Baltimore, MD, USA
- Hemalatha Sampath: Maryland Psychiatric Research Center, Department of Psychiatry, University of Maryland School of Medicine, Baltimore, MD, USA
- Peter Kochunov: Maryland Psychiatric Research Center, Department of Psychiatry, University of Maryland School of Medicine, Baltimore, MD, USA
- Jonathan Z Simon: Department of Electrical & Computer Engineering, University of Maryland, College Park, MD, USA; Department of Biology, University of Maryland, College Park, MD, USA; Institute for Systems Research, University of Maryland, College Park, MD, USA
- L Elliot Hong: Maryland Psychiatric Research Center, Department of Psychiatry, University of Maryland School of Medicine, Baltimore, MD, USA
15
Kwasa JA, Noyce AL, Torres LM, Richardson BN, Shinn-Cunningham BG. Top-down auditory attention modulates neural responses more strongly in neurotypical than ADHD young adults. Brain Res 2023; 1798:148144. PMID: 36328068; PMCID: PMC9749882; DOI: 10.1016/j.brainres.2022.148144.
Abstract
Human cognitive abilities naturally vary along a spectrum, even among those we call "neurotypical". Individuals differ in their ability to selectively attend to goal-relevant auditory stimuli. We sought to characterize this variability in a cohort of people with diverse attentional functioning. We recruited both neurotypical (N = 20) and ADHD (N = 25) young adults, all with normal hearing. Participants listened to one of three concurrent, spatially separated speech streams and reported the order of the syllables in that stream while we recorded electroencephalography (EEG). We tested both the ability to sustain attentional focus on a single "Target" stream and the ability to monitor the Target while flexibly either ignoring or switching attention to an unpredictable "Interrupter" stream that sometimes appeared from another direction. Although differences in both stimulus structure and task demands affected behavioral performance, ADHD status did not. In both groups, the Interrupter evoked larger neural responses when it was to be attended compared to when it was irrelevant, including for the P3a "reorienting" response previously described as involuntary. This attentional modulation was weaker in ADHD listeners, even though their behavioral performance was the same. Across the entire cohort, individual performance correlated with the degree of top-down modulation of neural responses. These results demonstrate that listeners differ in their ability to modulate neural representations of sound based on task goals, while suggesting that adults with ADHD may have weaker volitional control of attentional processes than their neurotypical counterparts.
Affiliation(s)
- Jasmine A. Kwasa (corresponding author): Neuroscience Institute, Carnegie Mellon University, 5000 Forbes Ave., Pittsburgh, PA 15213, United States; Department of Biomedical Engineering, Boston University, 1 Silber Way, Boston, MA 02215, United States
- Abigail L. Noyce: Neuroscience Institute, Carnegie Mellon University, 5000 Forbes Ave., Pittsburgh, PA 15213, United States
- Laura M. Torres: Department of Biomedical Engineering, Boston University, 1 Silber Way, Boston, MA 02215, United States
- Benjamin N. Richardson: Neuroscience Institute, Carnegie Mellon University, 5000 Forbes Ave., Pittsburgh, PA 15213, United States
16
Johns MA, Calloway RC, Phillips I, Karuzis VP, Dutta K, Smith E, Shamma SA, Goupell MJ, Kuchinsky SE. Performance on stochastic figure-ground perception varies with individual differences in speech-in-noise recognition and working memory capacity. J Acoust Soc Am 2023; 153:286. PMID: 36732241; PMCID: PMC9851714; DOI: 10.1121/10.0016756.
Abstract
Speech recognition in noisy environments can be challenging and requires listeners to accurately segregate a target speaker from irrelevant background noise. Stochastic figure-ground (SFG) tasks, in which temporally coherent inharmonic pure tones must be identified from a background, have been used to probe the non-linguistic auditory stream segregation processes important for speech-in-noise processing. However, little is known about the relationship between performance on SFG tasks and speech-in-noise tasks, or about the individual differences that may modulate such relationships. In this study, 37 younger normal-hearing adults performed an SFG task with target figure chords consisting of four, six, eight, or ten temporally coherent tones amongst a background of randomly varying tones. Stimuli were designed to be spectrally and temporally flat. An increased number of temporally coherent tones resulted in higher accuracy and faster reaction times (RTs). For ten target tones, faster RTs were associated with better scores on the Quick Speech-in-Noise task. Individual differences in working memory capacity and self-reported musicianship further modulated these relationships. Overall, results demonstrate that the SFG task could serve as an assessment of auditory stream segregation accuracy and RT that is sensitive to individual differences in cognitive and auditory abilities, even among younger normal-hearing adults.
Affiliation(s)
- Michael A Johns: Institute for Systems Research, University of Maryland, College Park, Maryland 20742, USA
- Regina C Calloway: Institute for Systems Research, University of Maryland, College Park, Maryland 20742, USA
- Ian Phillips: Audiology and Speech Pathology Center, Walter Reed National Military Medical Center, Bethesda, Maryland 20889, USA
- Valerie P Karuzis: Applied Research Laboratory of Intelligence and Security, University of Maryland, College Park, Maryland 20742, USA
- Kelsey Dutta: Institute for Systems Research, University of Maryland, College Park, Maryland 20742, USA
- Ed Smith: Department of Hearing and Speech Sciences, University of Maryland, College Park, Maryland 20742, USA
- Shihab A Shamma: Department of Electrical and Computer Engineering, University of Maryland, College Park, Maryland 20742, USA
- Matthew J Goupell: Department of Hearing and Speech Sciences, University of Maryland, College Park, Maryland 20742, USA
- Stefanie E Kuchinsky: Audiology and Speech Pathology Center, Walter Reed National Military Medical Center, Bethesda, Maryland 20889, USA
17
Veyrié A, Noreña A, Sarrazin JC, Pezard L. Investigating the influence of masker and target properties on the dynamics of perceptual awareness under informational masking. PLoS One 2023; 18:e0282885. PMID: 36928693; PMCID: PMC10019711; DOI: 10.1371/journal.pone.0282885.
Abstract
Informational masking has been investigated using the detection of an auditory target embedded in a random multi-tone masker. The build-up of the target percept is influenced by the masker and target properties. Most studies dealing with discrimination performance neglect the dynamics of perceptual awareness. This study aims to investigate the dynamics of perceptual awareness using multi-level survival models in an informational masking paradigm, manipulating masker uncertainty, masker-target similarity, and target repetition rate. Consistent with previous studies, it shows that high target repetition rates, low masker-target similarity, and low masker uncertainty facilitate target detection. In the context of evidence accumulation models, these results can be interpreted as changes in the accumulation parameters. The probabilistic description of perceptual awareness provides a benchmark for the choice of target and masker parameters in order to examine the underlying cognitive and neural dynamics of perceptual awareness.
Affiliation(s)
- Alexandre Veyrié: Aix-Marseille Université, LNC, CNRS UMR 7291, Marseille, France; ONERA, The French Aerospace Lab, Salon de Provence, France
- Arnaud Noreña: Aix-Marseille Université, LNC, CNRS UMR 7291, Marseille, France
- Laurent Pezard: Aix-Marseille Université, LNC, CNRS UMR 7291, Marseille, France
18
van Ackooij M, Paul JM, van der Zwaag W, van der Stoep N, Harvey BM. Auditory timing-tuned neural responses in the human auditory cortices. Neuroimage 2022; 258:119366. PMID: 35690255; DOI: 10.1016/j.neuroimage.2022.119366.
Abstract
Perception of sub-second auditory event timing supports multisensory integration, and speech and music perception and production. Neural populations tuned for the timing (duration and rate) of visual events were recently described in several human extrastriate visual areas. Here we ask whether the brain also contains neural populations tuned for auditory event timing, and whether these are shared with visual timing. Using 7T fMRI, we measured responses to white noise bursts of changing duration and rate. We analyzed these responses using neural response models describing different parametric relationships between event timing and neural response amplitude. This revealed auditory timing-tuned responses in the primary auditory cortex, and auditory association areas of the belt, parabelt and premotor cortex. While these areas also showed tonotopic tuning for auditory pitch, pitch and timing preferences were not consistently correlated. Auditory timing-tuned response functions differed between these areas, though without clear hierarchical integration of responses. The similarity of auditory and visual timing tuned responses, together with the lack of overlap between the areas showing these responses for each modality, suggests modality-specific responses to event timing are computed similarly but from different sensory inputs, and then transformed differently to suit the needs of each modality.
Affiliation(s)
- Martijn van Ackooij: Experimental Psychology, Helmholtz Institute, Utrecht University, Heidelberglaan 1, Utrecht 3584 CS, the Netherlands
- Jacob M Paul: Experimental Psychology, Helmholtz Institute, Utrecht University, Heidelberglaan 1, Utrecht 3584 CS, the Netherlands; Melbourne School of Psychological Sciences, University of Melbourne, Redmond Barry Building, Parkville 3010, Victoria, Australia
- Nathan van der Stoep: Experimental Psychology, Helmholtz Institute, Utrecht University, Heidelberglaan 1, Utrecht 3584 CS, the Netherlands
- Ben M Harvey: Experimental Psychology, Helmholtz Institute, Utrecht University, Heidelberglaan 1, Utrecht 3584 CS, the Netherlands
19
Wang L, Hu X, Liu H, Zhao S, Guo L, Han J, Liu T. Functional Brain Networks Underlying Auditory Saliency During Naturalistic Listening Experience. IEEE Trans Cogn Dev Syst 2022. DOI: 10.1109/tcds.2020.3025947.
20
Luberadzka J, Kayser H, Hohmann V. Making sense of periodicity glimpses in a prediction-update-loop: A computational model of attentive voice tracking. J Acoust Soc Am 2022; 151:712. PMID: 35232067; PMCID: PMC9088677; DOI: 10.1121/10.0009337.
Abstract
Humans are able to follow a speaker even in challenging acoustic conditions. The perceptual mechanisms underlying this ability remain unclear. A computational model of attentive voice tracking is presented, consisting of four computational blocks: (1) sparse periodicity-based auditory feature (sPAF) extraction, (2) foreground-background segregation, (3) state estimation, and (4) top-down knowledge. The model connects the theories about auditory glimpses, foreground-background segregation, and Bayesian inference. It is implemented with the sPAF, sequential Monte Carlo sampling, and probabilistic voice models. The model is evaluated by comparing it with the human data obtained in the study by Woods and McDermott [Curr. Biol. 25(17), 2238-2246 (2015)], which measured the ability to track one of two competing voices with time-varying parameters [fundamental frequency (F0) and formants (F1, F2)]. Three model versions were tested, which differ in the type of information used for the segregation: version (a) uses the oracle F0, version (b) uses the estimated F0, and version (c) uses the spectral shape derived from the estimated F0 and oracle F1 and F2. Version (a) simulates the optimal human performance in conditions with the largest separation between the voices, version (b) simulates the conditions in which the separation is not sufficient to follow the voices, and version (c) is closest to the human performance for moderate voice separation.
Affiliation(s)
- Joanna Luberadzka: Auditory Signal Processing, Department of Medical Physics and Acoustics, University of Oldenburg, Germany
- Hendrik Kayser: Auditory Signal Processing, Department of Medical Physics and Acoustics, University of Oldenburg, Germany
- Volker Hohmann: Auditory Signal Processing, Department of Medical Physics and Acoustics, University of Oldenburg, Germany
21
Cortical Processing of Binaural Cues as Shown by EEG Responses to Random-Chord Stereograms. J Assoc Res Otolaryngol 2021; 23:75-94. PMID: 34904205; PMCID: PMC8783002; DOI: 10.1007/s10162-021-00820-4.
Abstract
Spatial hearing facilitates the perceptual organization of complex soundscapes into accurate mental representations of sound sources in the environment. Yet, the role of binaural cues in auditory scene analysis (ASA) has received relatively little attention in recent neuroscientific studies employing novel, spectro-temporally complex stimuli. This may be because a stimulation paradigm that provides binaurally derived grouping cues of sufficient spectro-temporal complexity has not yet been established for neuroscientific ASA experiments. Random-chord stereograms (RCS) are a class of auditory stimuli that exploit spectro-temporal variations in the interaural envelope correlation of noise-like sounds with interaurally coherent fine structure; they evoke salient auditory percepts that emerge only under binaural listening. Here, our aim was to assess the usability of the RCS paradigm for indexing binaural processing in the human brain. To this end, we recorded EEG responses to RCS stimuli from 12 normal-hearing subjects. The stimuli consisted of an initial 3-s noise segment with interaurally uncorrelated envelopes, followed by another 3-s segment, where envelope correlation was modulated periodically according to the RCS paradigm. Modulations were applied either across the entire stimulus bandwidth (wideband stimuli) or in temporally shifting frequency bands (ripple stimulus). Event-related potentials and inter-trial phase coherence analyses of the EEG responses showed that the introduction of the 3- or 5-Hz wideband modulations produced a prominent change-onset complex and ongoing synchronized responses to the RCS modulations. In contrast, the ripple stimulus elicited a change-onset response but no response to ongoing RCS modulation. Frequency-domain analyses revealed increased spectral power at the fundamental frequency and the first harmonic of wideband RCS modulations. 
RCS stimulation yields robust EEG measures of binaurally driven auditory reorganization and has the potential to provide a flexible stimulation paradigm suitable for isolating binaural effects in ASA experiments.
22
Symons AE, Dick F, Tierney AT. Dimension-selective attention and dimensional salience modulate cortical tracking of acoustic dimensions. Neuroimage 2021; 244:118544. PMID: 34492294; DOI: 10.1016/j.neuroimage.2021.118544.
Abstract
Some theories of auditory categorization suggest that auditory dimensions that are strongly diagnostic for particular categories - for instance voice onset time or fundamental frequency in the case of some spoken consonants - attract attention. However, prior cognitive neuroscience research on auditory selective attention has largely focused on attention to simple auditory objects or streams, and so little is known about the neural mechanisms that underpin dimension-selective attention, or how the relative salience of variations along these dimensions might modulate neural signatures of attention. Here we investigate whether dimensional salience and dimension-selective attention modulate the cortical tracking of acoustic dimensions. In two experiments, participants listened to tone sequences varying in pitch and spectral peak frequency; these two dimensions changed at different rates. Inter-trial phase coherence (ITPC) and amplitude of the EEG signal at the frequencies tagged to pitch and spectral changes provided a measure of cortical tracking of these dimensions. In Experiment 1, tone sequences varied in the size of the pitch intervals, while the size of spectral peak intervals remained constant. Cortical tracking of pitch changes was greater for sequences with larger compared to smaller pitch intervals, with no difference in cortical tracking of spectral peak changes. In Experiment 2, participants selectively attended to either pitch or spectral peak. Cortical tracking was stronger in response to the attended compared to unattended dimension for both pitch and spectral peak. These findings suggest that attention can enhance the cortical tracking of specific acoustic dimensions rather than simply enhancing tracking of the auditory object as a whole.
Affiliation(s)
- Ashley E Symons: Department of Psychological Sciences, Birkbeck College, University of London, UK
- Fred Dick: Department of Psychological Sciences, Birkbeck College, University of London, UK; Division of Psychology & Language Sciences, University College London, UK
- Adam T Tierney: Department of Psychological Sciences, Birkbeck College, University of London, UK
23
Brace KM, Sussman ES. The Brain Tracks Multiple Predictions About the Auditory Scene. Front Hum Neurosci 2021; 15:747769. PMID: 34803633; PMCID: PMC8595267; DOI: 10.3389/fnhum.2021.747769.
Abstract
A predictable rhythmic structure is important to most ecologically relevant sounds for humans, such as the rhythm of speech or music. This study addressed the question of how rhythmic predictions are maintained in the auditory system when there are multiple perceptual interpretations occurring simultaneously and emanating from the same sound source. We recorded the electroencephalogram (EEG) while presenting participants with a tone sequence that had two different tone feature patterns, one based on sequential rhythmic variation in tone duration and the other on sequential rhythmic variation in tone intensity. Participants were presented with the same sound sequences and were instructed either to listen for the intensity pattern (ignoring fluctuations in duration) and press a response key to detected pattern deviants (attend intensity pattern task); to listen for the duration pattern (ignoring fluctuations in intensity) and press a response key to detected duration pattern deviants (attend duration pattern task); or to watch a movie and ignore the sounds presented to their ears (attend visual task). Both intensity and duration patterns occurred predictably 85% of the time, thus the key question involved evaluating how the brain treated the irrelevant feature patterns (standards and deviants) while performing an auditory or visual task. We expected that the task-relevant feature patterns would evoke a more robust brain response to attended standards and deviants than the unattended feature patterns. Instead, we found that the neural entrainment to the rhythm of the attended standard patterns had similar power to that of the unattended feature patterns. In addition, the infrequent pattern deviants elicited the event-related brain potential called the mismatch negativity (MMN). The MMN elicited by task-relevant pattern deviants had a similar amplitude to MMNs elicited by pattern deviants that were unattended, either because they were not the target pattern or because the participant ignored the sounds and watched a movie. Thus, these results demonstrate that the brain tracks multiple predictions about the complexities in sound streams and can automatically track and detect deviations with respect to these predictions. This capability would be useful for switching attention rapidly among multiple objects in a busy auditory scene.
Affiliation(s)
- Kelin M Brace: Department of Neuroscience, Albert Einstein College of Medicine, Bronx, NY, United States
- Elyse S Sussman: Department of Neuroscience, Albert Einstein College of Medicine, Bronx, NY, United States
24
Renvall H, Seol J, Tuominen R, Sorger B, Riecke L, Salmelin R. Selective auditory attention within naturalistic scenes modulates reactivity to speech sounds. Eur J Neurosci 2021; 54:7626-7641. PMID: 34697833; PMCID: PMC9298413; DOI: 10.1111/ejn.15504.
Abstract
Rapid recognition and categorization of sounds are essential for humans and animals alike, both for understanding and reacting to our surroundings and for daily communication and social interaction. For humans, perception of speech sounds is of crucial importance. In real life, this task is complicated by the presence of a multitude of meaningful non-speech sounds. The present behavioural, magnetoencephalography (MEG) and functional magnetic resonance imaging (fMRI) study set out to address how attention to speech versus attention to natural non-speech sounds within complex auditory scenes influences cortical processing. The stimuli were superimpositions of spoken words and environmental sounds, with parametric variation of the speech-to-environmental sound intensity ratio. The participants' task was to detect a repetition in either the speech or the environmental sound. We found that, specifically when participants attended to speech within the superimposed stimuli, higher speech-to-environmental sound ratios resulted in shorter sustained MEG responses, stronger BOLD fMRI signals (especially in the left supratemporal auditory cortex), and improved behavioural performance. No such effects of speech-to-environmental sound ratio were observed when participants attended to the environmental sound part within the exact same stimuli. These findings suggest stronger saliency of speech compared with other meaningful sounds during processing of natural auditory scenes, likely linked to speech-specific top-down and bottom-up mechanisms activated during speech perception that are needed for tracking speech in real-life-like auditory environments.
Affiliation(s)
- Hanna Renvall: Department of Neuroscience and Biomedical Engineering, Aalto University, Espoo, Finland; Aalto NeuroImaging, Aalto University, Espoo, Finland; BioMag Laboratory, HUS Diagnostic Center, Helsinki University Hospital, University of Helsinki and Aalto University School of Science, Helsinki, Finland
- Jaeho Seol: Department of Neuroscience and Biomedical Engineering, Aalto University, Espoo, Finland; Aalto NeuroImaging, Aalto University, Espoo, Finland
- Riku Tuominen: Department of Neuroscience and Biomedical Engineering, Aalto University, Espoo, Finland; Aalto NeuroImaging, Aalto University, Espoo, Finland
- Bettina Sorger: Department of Cognitive Neuroscience, Maastricht University, Maastricht, The Netherlands
- Lars Riecke: Department of Cognitive Neuroscience, Maastricht University, Maastricht, The Netherlands
- Riitta Salmelin: Department of Neuroscience and Biomedical Engineering, Aalto University, Espoo, Finland; Aalto NeuroImaging, Aalto University, Espoo, Finland
25
Kiremitçi I, Yilmaz Ö, Çelik E, Shahdloo M, Huth AG, Çukur T. Attentional Modulation of Hierarchical Speech Representations in a Multitalker Environment. Cereb Cortex 2021; 31:4986-5005. PMID: 34115102; PMCID: PMC8491717; DOI: 10.1093/cercor/bhab136.
Abstract
Humans are remarkably adept at listening to a desired speaker in a crowded environment, while filtering out nontarget speakers in the background. Attention is key to solving this difficult cocktail-party task, yet a detailed characterization of attentional effects on speech representations is lacking. It remains unclear across what levels of speech features, and how much, attentional modulation occurs in each brain area during the cocktail-party task. To address these questions, we recorded whole-brain blood-oxygen-level-dependent (BOLD) responses while subjects either passively listened to single-speaker stories or selectively attended to a male or a female speaker in temporally overlaid stories in separate experiments. Spectral, articulatory, and semantic models of the natural stories were constructed. Intrinsic selectivity profiles were identified via voxelwise models fit to passive listening responses. Attentional modulations were then quantified based on model predictions for attended and unattended stories in the cocktail-party task. We find that attention causes broad modulations at multiple levels of speech representations while growing stronger toward later stages of processing, and that unattended speech is represented up to the semantic level in parabelt auditory cortex. These results provide insights on attentional mechanisms that underlie the ability to selectively listen to a desired speaker in noisy multispeaker environments.
Collapse
Affiliation(s)
- Ibrahim Kiremitçi
- Neuroscience Program, Sabuncu Brain Research Center, Bilkent University, Ankara TR-06800, Turkey
- National Magnetic Resonance Research Center (UMRAM), Bilkent University, Ankara TR-06800, Turkey
| | - Özgür Yilmaz
- National Magnetic Resonance Research Center (UMRAM), Bilkent University, Ankara TR-06800, Turkey
- Department of Electrical and Electronics Engineering, Bilkent University, Ankara TR-06800, Turkey
| | - Emin Çelik
- Neuroscience Program, Sabuncu Brain Research Center, Bilkent University, Ankara TR-06800, Turkey
- National Magnetic Resonance Research Center (UMRAM), Bilkent University, Ankara TR-06800, Turkey
| | - Mo Shahdloo
- National Magnetic Resonance Research Center (UMRAM), Bilkent University, Ankara TR-06800, Turkey
- Department of Experimental Psychology, Wellcome Centre for Integrative Neuroimaging, University of Oxford, Oxford OX3 9DU, UK
| | - Alexander G Huth
- Department of Neuroscience, The University of Texas at Austin, Austin, TX 78712, USA
- Department of Computer Science, The University of Texas at Austin, Austin, TX 78712, USA
- Helen Wills Neuroscience Institute, University of California, Berkeley, CA 94702, USA
| | - Tolga Çukur
- Neuroscience Program, Sabuncu Brain Research Center, Bilkent University, Ankara TR-06800, Turkey
- National Magnetic Resonance Research Center (UMRAM), Bilkent University, Ankara TR-06800, Turkey
- Department of Electrical and Electronics Engineering, Bilkent University, Ankara TR-06800, Turkey
- Helen Wills Neuroscience Institute, University of California, Berkeley, CA 94702, USA
| |
Collapse
|
26
|
Gärtner K, Gutschalk A. Auditory cortex activity related to perceptual awareness versus masking of tone sequences. Neuroimage 2021; 228:117681. [PMID: 33359346 DOI: 10.1016/j.neuroimage.2020.117681] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/23/2020] [Revised: 11/24/2020] [Accepted: 12/15/2020] [Indexed: 10/22/2022] Open
Abstract
Sequences of repeating tones can be masked by other tones of different frequency. When such tone sequences are nevertheless perceived, each tone of the sequence evokes a prominent neural response in auditory cortex. When targets are detected based on their isochrony, participants know they are listening to the target once they have detected it. To explore whether this neural activity is more closely related to the detection task or to perceptual awareness, this magnetoencephalography (MEG) study used targets that could only be identified with cues provided before or after the masked target. In experiment 1, multiple mono-tone streams with jittered inter-stimulus intervals were used, and the tone frequency of the target was indicated by a cue. Results showed no differential auditory cortex activity between hit and miss trials with post-stimulus cues; a late negative response for hit trials was observed only with pre-stimulus cues, suggesting a task-related component. Because experiment 1 provided no evidence linking a difference response to tone awareness, experiment 2 probed whether detection of tone streams is linked to a difference response in auditory cortex. Random-tone sequences were presented in the presence of a multi-tone masker, and each sequence was repeated thereafter without the masker. In experiment 2, targets presented in the masker evoked a prominent difference wave for hit compared to miss trials. These results suggest that perceptual awareness of tone streams is linked to neural activity in auditory cortex.
Collapse
Affiliation(s)
- Kai Gärtner
- Department of Neurology, Heidelberg University, Im Neuenheimer Feld 400, 69120 Heidelberg, Germany
| | - Alexander Gutschalk
- Department of Neurology, Heidelberg University, Im Neuenheimer Feld 400, 69120 Heidelberg, Germany.
| |
Collapse
|
27
|
Fonseca R, Madeira N, Simoes C. Resilience to fear: The role of individual factors in amygdala response to stressors. Mol Cell Neurosci 2020; 110:103582. [PMID: 33346000 DOI: 10.1016/j.mcn.2020.103582] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/01/2020] [Revised: 11/13/2020] [Accepted: 12/02/2020] [Indexed: 10/22/2022] Open
Abstract
Resilience to stress is an adaptive process that varies individually. Resilience refers to adaptation, or the ability to maintain or regain mental health despite exposure to adverse situations. It is a dynamic concept that reflects a combination of internal individual factors, including age and gender, interacting with external factors such as social, cultural and environmental factors. In the last decade, we have witnessed an increase in the prevalence of anxiety disorders, including post-traumatic stress disorder. Given that stress is unavoidable, it is of great interest to understand the neurophysiological mechanisms of resilience and the individual factors that may contribute to susceptibility, and to promote efficacious approaches to improving resilience. Here, we address this complex question, attempting to define clear and operational definitions that may allow us to improve our analysis of behavior by incorporating individuality. We examine how individual perception of the stressor can alter the outcome of an adverse situation, using the fear-conditioning paradigm as an example, and discuss how individual differences in the reward system can contribute to resilience. Given the central role of the endocannabinoid system in regulating fear responses and anxiety, we discuss the evidence that polymorphisms in several molecules of this signaling system contribute to different anxiety phenotypes. The endocannabinoid system is highly interconnected with the serotoninergic and dopaminergic modulatory systems, contributing to individual differences in stress perception and coping mechanisms. We review how individual variability in these modulatory systems can be used toward a multivariable assessment of stress risk. Incorporating individuality in our research will allow us to define biomarkers of anxiety disorders and assess prognosis, toward a personalized clinical approach to mental health.
Collapse
Affiliation(s)
- Rosalina Fonseca
- Cellular and Systems Neurobiology, Chronic Diseases Research Center (CEDOC), NOVA Medical School, Universidade Nova de Lisboa, Campo dos Mártires da Pátria, 130 1169-056 Lisboa, Portugal.
| | - Natália Madeira
- Cellular and Systems Neurobiology, Chronic Diseases Research Center (CEDOC), NOVA Medical School, Universidade Nova de Lisboa, Campo dos Mártires da Pátria, 130 1169-056 Lisboa, Portugal
| | - Carla Simoes
- Cellular and Systems Neurobiology, Chronic Diseases Research Center (CEDOC), NOVA Medical School, Universidade Nova de Lisboa, Campo dos Mártires da Pátria, 130 1169-056 Lisboa, Portugal
| |
Collapse
|
28
|
Holmes E, Zeidman P, Friston KJ, Griffiths TD. Difficulties with Speech-in-Noise Perception Related to Fundamental Grouping Processes in Auditory Cortex. Cereb Cortex 2020; 31:1582-1596. [PMID: 33136138 PMCID: PMC7869094 DOI: 10.1093/cercor/bhaa311] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/11/2020] [Revised: 08/04/2020] [Accepted: 09/22/2020] [Indexed: 01/05/2023] Open
Abstract
In our everyday lives, we are often required to follow a conversation when background noise is present (“speech-in-noise” [SPIN] perception). SPIN perception varies widely—and people who are worse at SPIN perception are also worse at fundamental auditory grouping, as assessed by figure-ground tasks. Here, we examined the cortical processes that link difficulties with SPIN perception to difficulties with figure-ground perception using functional magnetic resonance imaging. We found strong evidence that the earliest stages of the auditory cortical hierarchy (left core and belt areas) are similarly disinhibited when SPIN and figure-ground tasks are more difficult (i.e., at target-to-masker ratios corresponding to 60% rather than 90% performance)—consistent with increased cortical gain at lower levels of the auditory hierarchy. Overall, our results reveal a common neural substrate for these basic (figure-ground) and naturally relevant (SPIN) tasks—which provides a common computational basis for the link between SPIN perception and fundamental auditory grouping.
Collapse
Affiliation(s)
- Emma Holmes
- Wellcome Centre for Human Neuroimaging, UCL Queen Square Institute of Neurology, UCL, London WC1N 3AR, UK
| | - Peter Zeidman
- Wellcome Centre for Human Neuroimaging, UCL Queen Square Institute of Neurology, UCL, London WC1N 3AR, UK
| | - Karl J Friston
- Wellcome Centre for Human Neuroimaging, UCL Queen Square Institute of Neurology, UCL, London WC1N 3AR, UK
| | - Timothy D Griffiths
- Wellcome Centre for Human Neuroimaging, UCL Queen Square Institute of Neurology, UCL, London WC1N 3AR, UK
- Biosciences Institute, Faculty of Medical Sciences, Newcastle University, Newcastle upon Tyne NE2 4HH, UK
| |
Collapse
|
29
|
Brodbeck C, Jiao A, Hong LE, Simon JZ. Neural speech restoration at the cocktail party: Auditory cortex recovers masked speech of both attended and ignored speakers. PLoS Biol 2020; 18:e3000883. [PMID: 33091003 PMCID: PMC7644085 DOI: 10.1371/journal.pbio.3000883] [Citation(s) in RCA: 38] [Impact Index Per Article: 9.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/22/2020] [Revised: 11/05/2020] [Accepted: 09/14/2020] [Indexed: 01/09/2023] Open
Abstract
Humans are remarkably skilled at listening to one speaker out of an acoustic mixture of several speech sources. Two speakers are easily segregated, even without binaural cues, but the neural mechanisms underlying this ability are not well understood. One possibility is that early cortical processing performs a spectrotemporal decomposition of the acoustic mixture, allowing the attended speech to be reconstructed via optimally weighted recombinations that discount spectrotemporal regions where sources heavily overlap. Using human magnetoencephalography (MEG) responses to a 2-talker mixture, we show evidence for an alternative possibility, in which early, active segregation occurs even for strongly spectrotemporally overlapping regions. Early (approximately 70-millisecond) responses to nonoverlapping spectrotemporal features are seen for both talkers. When competing talkers’ spectrotemporal features mask each other, the individual representations persist, but they occur with an approximately 20-millisecond delay. This suggests that the auditory cortex recovers acoustic features that are masked in the mixture, even if they occurred in the ignored speech. The existence of such noise-robust cortical representations, of features present in attended as well as ignored speech, suggests an active cortical stream segregation process, which could explain a range of behavioral effects of ignored background speech. How do humans focus on one speaker when several are talking? MEG responses to a continuous two-talker mixture suggest that, even though listeners attend only to one of the talkers, their auditory cortex tracks acoustic features from both speakers. This occurs even when those features are locally masked by the other speaker.
Collapse
Affiliation(s)
- Christian Brodbeck
- Institute for Systems Research, University of Maryland, College Park, Maryland, United States of America
| | - Alex Jiao
- Department of Electrical and Computer Engineering, University of Maryland, College Park, Maryland, United States of America
| | - L. Elliot Hong
- Maryland Psychiatric Research Center, Department of Psychiatry, University of Maryland School of Medicine, Baltimore, Maryland, United States of America
| | - Jonathan Z. Simon
- Institute for Systems Research, University of Maryland, College Park, Maryland, United States of America
- Department of Electrical and Computer Engineering, University of Maryland, College Park, Maryland, United States of America
- Department of Biology, University of Maryland, College Park, Maryland, United States of America
| |
Collapse
|
30
|
Kaya EM, Huang N, Elhilali M. Pitch, Timbre and Intensity Interdependently Modulate Neural Responses to Salient Sounds. Neuroscience 2020; 440:1-14. [PMID: 32445938 DOI: 10.1016/j.neuroscience.2020.05.018] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/22/2019] [Revised: 04/28/2020] [Accepted: 05/10/2020] [Indexed: 01/31/2023]
Abstract
As we listen to everyday sounds, auditory perception is heavily shaped by interactions between acoustic attributes such as pitch, timbre and intensity; yet it is not clear how such interactions affect judgments of acoustic salience in dynamic soundscapes. Salience perception is believed to rely on an internal brain model that tracks the evolution of acoustic characteristics of a scene and flags events that do not fit this model as salient. The current study explores how the interdependency between attributes of dynamic scenes affects the neural representation of this internal model and shapes the encoding of salient events. Specifically, the study examines how deviations along combinations of acoustic attributes interact to modulate brain responses and subsequently guide perception of certain sound events as salient given their context. Human volunteers focused their attention on a visual task and ignored acoustic melodies playing in the background while their brain activity was recorded using electroencephalography. Ambient sounds consisted of musical melodies with probabilistically varying acoustic attributes; salient notes embedded in these scenes deviated from the melody's statistical distribution along pitch, timbre and/or intensity. Recordings of brain responses to salient notes reveal that neural power in response to the melodic rhythm, as well as cross-trial phase alignment in the theta band, is modulated by the degree of salience of the notes, estimated across all acoustic attributes given their probabilistic context. These nonlinear neural effects across attributes strongly parallel the nonlinear behavioral interactions observed in perceptual judgments of auditory salience using similar dynamic melodies, suggesting a neural underpinning of the nonlinear interactions that underlie salience perception.
Collapse
Affiliation(s)
- Emine Merve Kaya
- Laboratory for Computational Audio Perception, Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, MD, USA
| | - Nicolas Huang
- Laboratory for Computational Audio Perception, Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, MD, USA
| | - Mounya Elhilali
- Laboratory for Computational Audio Perception, Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, MD, USA
| |
Collapse
|
31
|
Wang Y, Zhang J, Zou J, Luo H, Ding N. Prior Knowledge Guides Speech Segregation in Human Auditory Cortex. Cereb Cortex 2020; 29:1561-1571. [PMID: 29788144 DOI: 10.1093/cercor/bhy052] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/02/2017] [Revised: 01/22/2018] [Accepted: 02/15/2018] [Indexed: 11/12/2022] Open
Abstract
Segregating concurrent sound streams is a computationally challenging task that requires integrating bottom-up acoustic cues (e.g., pitch) and top-down prior knowledge about sound streams. In a multi-talker environment, the brain can segregate different speakers within about 100 ms in auditory cortex. Here, we used magnetoencephalographic (MEG) recordings to investigate the temporal and spatial signature of how the brain utilizes prior knowledge to segregate 2 speech streams from the same speaker, which can hardly be separated based on bottom-up acoustic cues. In a primed condition, participants knew the target speech stream in advance, while in an unprimed condition no such prior knowledge was available. Neural encoding of each speech stream was characterized by the MEG responses tracking the speech envelope. We demonstrate that neural tracking in bilateral superior temporal gyrus and superior temporal sulcus is much stronger in the primed condition than in the unprimed condition. Priming effects were observed at about 100 ms latency and lasted more than 600 ms. Interestingly, prior knowledge about the target stream facilitated speech segregation mainly by suppressing the neural tracking of the non-target speech stream. In sum, prior knowledge leads to reliable speech segregation in auditory cortex, even in the absence of reliable bottom-up speech segregation cues.
Collapse
Affiliation(s)
- Yuanye Wang
- School of Psychological and Cognitive Sciences, Peking University, Beijing, China
- McGovern Institute for Brain Research, Peking University, Beijing, China
- Beijing Key Laboratory of Behavior and Mental Health, Peking University, Beijing, China
| | - Jianfeng Zhang
- College of Biomedical Engineering and Instrument Sciences, Zhejiang University, Hangzhou, Zhejiang, China
| | - Jiajie Zou
- College of Biomedical Engineering and Instrument Sciences, Zhejiang University, Hangzhou, Zhejiang, China
| | - Huan Luo
- School of Psychological and Cognitive Sciences, Peking University, Beijing, China
- McGovern Institute for Brain Research, Peking University, Beijing, China
- Beijing Key Laboratory of Behavior and Mental Health, Peking University, Beijing, China
| | - Nai Ding
- College of Biomedical Engineering and Instrument Sciences, Zhejiang University, Hangzhou, Zhejiang, China
- Key Laboratory for Biomedical Engineering of Ministry of Education, Zhejiang University, Hangzhou, Zhejiang, China
- State Key Laboratory of Industrial Control Technology, Zhejiang University, Hangzhou, Zhejiang, China
- Interdisciplinary Center for Social Sciences, Zhejiang University, Hangzhou, Zhejiang, China
| |
Collapse
|
32
|
Huang N, Elhilali M. Push-pull competition between bottom-up and top-down auditory attention to natural soundscapes. eLife 2020; 9:52984. [PMID: 32196457 PMCID: PMC7083598 DOI: 10.7554/elife.52984] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/23/2019] [Accepted: 02/13/2020] [Indexed: 12/17/2022] Open
Abstract
In everyday social environments, demands on attentional resources dynamically shift to balance our attention to targets of interest while alerting us to important objects in our surroundings. The current study uses electroencephalography to explore how the push-pull interaction between top-down and bottom-up attention manifests itself in dynamic auditory scenes. Using natural soundscapes as distractors while subjects attend to a controlled rhythmic sound sequence, we find that salient events in background scenes significantly suppress phase-locking and gamma responses to the attended sequence, countering the enhancement effects observed for attended targets. In line with a hypothesis of limited attentional resources, the modulation of neural activity by bottom-up attention is graded by the degree of salience of ambient events. The study also provides insights into the interplay between endogenous and exogenous attention during natural soundscapes, with both forms of attention engaging a common fronto-parietal network at different time lags.
Collapse
Affiliation(s)
- Nicholas Huang
- Laboratory for Computational Audio Perception, Department of Electrical Engineering, Johns Hopkins University, Baltimore, United States
| | - Mounya Elhilali
- Laboratory for Computational Audio Perception, Department of Electrical Engineering, Johns Hopkins University, Baltimore, United States
| |
Collapse
|
33
|
Effects of Sensorineural Hearing Loss on Cortical Synchronization to Competing Speech during Selective Attention. J Neurosci 2020; 40:2562-2572. [PMID: 32094201 PMCID: PMC7083526 DOI: 10.1523/jneurosci.1936-19.2020] [Citation(s) in RCA: 47] [Impact Index Per Article: 11.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/05/2019] [Revised: 01/17/2020] [Accepted: 01/30/2020] [Indexed: 11/21/2022] Open
Abstract
When selectively attending to a speech stream in multi-talker scenarios, low-frequency cortical activity is known to synchronize selectively to fluctuations in the attended speech signal. Older listeners with age-related sensorineural hearing loss (presbycusis) often struggle to understand speech in such situations, even when wearing a hearing aid. Yet, it is unclear whether a peripheral hearing loss degrades the attentional modulation of cortical speech tracking. Here, we used psychoacoustics and electroencephalography (EEG) in male and female human listeners to examine potential effects of hearing loss on EEG correlates of speech envelope synchronization in cortex. Behaviorally, older hearing-impaired (HI) listeners showed degraded speech-in-noise recognition and reduced temporal acuity compared with age-matched normal-hearing (NH) controls. During EEG recordings, we used a selective attention task with two spatially separated simultaneous speech streams in which both NH and HI listeners showed high speech recognition performance. Low-frequency (<10 Hz) envelope-entrained EEG responses were enhanced in the HI listeners, both for the attended speech and for tone sequences modulated at slow rates (4 Hz) during passive listening. Compared with the attended speech, responses to the ignored stream were reduced in both HI and NH listeners, allowing the attended target to be classified from single-trial EEG data with similarly high accuracy in the two groups. However, despite robust attention-modulated speech entrainment, the HI listeners rated the competing speech task as more difficult. These results suggest that the speech-in-noise problems experienced by older HI listeners are not necessarily associated with degraded attentional selection. SIGNIFICANCE STATEMENT People with age-related sensorineural hearing loss often struggle to follow speech in the presence of competing talkers. It is currently unclear whether hearing impairment impairs the ability to use selective attention to suppress distracting speech when the distractor is well segregated from the target. Here, we report amplified envelope-entrained cortical EEG responses to attended speech and to simple tones modulated at speech rates (4 Hz) in listeners with age-related hearing loss. Critically, despite increased self-reported listening difficulties, cortical synchronization to speech mixtures was robustly modulated by selective attention in listeners with hearing loss. This allowed the attended talker to be classified from single-trial EEG responses with high accuracy in both older hearing-impaired listeners and age-matched normal-hearing controls.
Collapse
|
34
|
Deng Y, Choi I, Shinn-Cunningham B. Topographic specificity of alpha power during auditory spatial attention. Neuroimage 2020; 207:116360. [PMID: 31760150 PMCID: PMC9883080 DOI: 10.1016/j.neuroimage.2019.116360] [Citation(s) in RCA: 35] [Impact Index Per Article: 8.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/13/2019] [Revised: 10/06/2019] [Accepted: 11/13/2019] [Indexed: 01/31/2023] Open
Abstract
Visual and somatosensory spatial attention both induce parietal alpha (8-14 Hz) oscillations whose topographical distribution depends on the direction of spatial attentional focus. In the auditory domain, contrasts of parietal alpha power for leftward and rightward attention reveal qualitatively similar lateralization; however, it is not clear whether alpha lateralization changes monotonically with the direction of auditory attention as it does for visual spatial attention. In addition, most previous studies of alpha oscillation did not consider individual differences in alpha frequency, but simply analyzed power in a fixed spectral band. Here, we recorded electroencephalography in human subjects when they directed attention to one of five azimuthal locations. After a cue indicating the direction of an upcoming target sequence of spoken syllables (yet before the target began), alpha power changed in a task-specific manner. Individual peak alpha frequencies differed consistently between central electrodes and parieto-occipital electrodes, suggesting multiple neural generators of task-related alpha. Parieto-occipital alpha increased over the hemisphere ipsilateral to attentional focus compared to the contralateral hemisphere, and changed systematically as the direction of attention shifted from far left to far right. These results showing that parietal alpha lateralization changes smoothly with the direction of auditory attention as in visual spatial attention provide further support to the growing evidence that the frontoparietal attention network is supramodal.
Collapse
Affiliation(s)
- Yuqi Deng
- Department of Biomedical Engineering, Boston University, Boston, MA, 02215, USA
| | - Inyong Choi
- Department of Communication Sciences and Disorders, University of Iowa, Iowa City, IA, 52242, USA
| | - Barbara Shinn-Cunningham
- Department of Biomedical Engineering, Boston University, Boston, MA, 02215, USA
- Carnegie Mellon Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA, 15213, USA
| |
Collapse
|
35
|
Li Y, Wang F, Chen Y, Cichocki A, Sejnowski T. The Effects of Audiovisual Inputs on Solving the Cocktail Party Problem in the Human Brain: An fMRI Study. Cereb Cortex 2019; 28:3623-3637. [PMID: 29029039 DOI: 10.1093/cercor/bhx235] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/04/2017] [Indexed: 11/13/2022] Open
Abstract
At cocktail parties, our brains often simultaneously receive visual and auditory information. Although the cocktail party problem has been widely investigated under auditory-only settings, the effects of audiovisual inputs have not. This study explored the effects of audiovisual inputs in a simulated cocktail party. In our fMRI experiment, each congruent audiovisual stimulus was a synthesis of 2 facial movie clips, each of which could be classified into 1 of 2 emotion categories (crying and laughing). Visual-only (faces) and auditory-only stimuli (voices) were created by extracting the visual and auditory contents from the synthesized audiovisual stimuli. Subjects were instructed to selectively attend to 1 of the 2 objects contained in each stimulus and to judge its emotion category in the visual-only, auditory-only, and audiovisual conditions. The neural representations of the emotion features were assessed by calculating decoding accuracy and brain pattern-related reproducibility index based on the fMRI data. We compared the audiovisual condition with the visual-only and auditory-only conditions and found that audiovisual inputs enhanced the neural representations of emotion features of the attended objects instead of the unattended objects. This enhancement might partially explain the benefits of audiovisual inputs for the brain to solve the cocktail party problem.
Collapse
Affiliation(s)
- Yuanqing Li
- Center for Brain Computer Interfaces and Brain Information Processing, South China University of Technology, Guangzhou, China
- Guangzhou Key Laboratory of Brain Computer Interaction and Applications, Guangzhou, China
| | - Fangyi Wang
- Center for Brain Computer Interfaces and Brain Information Processing, South China University of Technology, Guangzhou, China
- Guangzhou Key Laboratory of Brain Computer Interaction and Applications, Guangzhou, China
| | - Yongbin Chen
- Center for Brain Computer Interfaces and Brain Information Processing, South China University of Technology, Guangzhou, China
- Guangzhou Key Laboratory of Brain Computer Interaction and Applications, Guangzhou, China
| | - Andrzej Cichocki
- Riken Brain Science Institute, Wako-shi, Japan
- Skolkovo Institute of Science and Technology (Skoltech), Moscow, Russia
| | - Terrence Sejnowski
- Neurobiology Laboratory, The Salk Institute for Biological Studies, La Jolla, CA, USA
| |
Collapse
|
36
|
O'Sullivan J, Herrero J, Smith E, Schevon C, McKhann GM, Sheth SA, Mehta AD, Mesgarani N. Hierarchical Encoding of Attended Auditory Objects in Multi-talker Speech Perception. Neuron 2019; 104:1195-1209.e3. [PMID: 31648900 DOI: 10.1016/j.neuron.2019.09.007] [Citation(s) in RCA: 62] [Impact Index Per Article: 12.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/28/2019] [Revised: 07/11/2019] [Accepted: 09/06/2019] [Indexed: 11/15/2022]
Abstract
Humans can easily focus on one speaker in a multi-talker acoustic environment, but how different areas of the human auditory cortex (AC) represent the acoustic components of mixed speech is unknown. We obtained invasive recordings from the primary and nonprimary AC in neurosurgical patients as they listened to multi-talker speech. We found that neural sites in the primary AC responded to individual speakers in the mixture and were relatively unchanged by attention. In contrast, neural sites in the nonprimary AC were less discerning of individual speakers but selectively represented the attended speaker. Moreover, the encoding of the attended speaker in the nonprimary AC was invariant to the degree of acoustic overlap with the unattended speaker. Finally, this emergent representation of attended speech in the nonprimary AC was linearly predictable from the primary AC responses. Our results reveal the neural computations underlying the hierarchical formation of auditory objects in human AC during multi-talker speech perception.
Collapse
Affiliation(s)
- James O'Sullivan
- Department of Electrical Engineering, Columbia University, New York, NY, USA
| | - Jose Herrero
- Department of Neurosurgery, Hofstra-Northwell School of Medicine and Feinstein Institute for Medical Research, Manhasset, New York, NY, USA
| | - Elliot Smith
- Department of Neurological Surgery, The Neurological Institute, New York, NY, USA
- Department of Neurosurgery, University of Utah, Salt Lake City, UT, USA
| | - Catherine Schevon
- Department of Neurological Surgery, The Neurological Institute, New York, NY, USA
| | - Guy M McKhann
- Department of Neurological Surgery, The Neurological Institute, New York, NY, USA
| | - Sameer A Sheth
- Department of Neurological Surgery, The Neurological Institute, New York, NY, USA
- Department of Neurosurgery, Baylor College of Medicine, Houston, TX, USA
| | - Ashesh D Mehta
- Department of Neurosurgery, Hofstra-Northwell School of Medicine and Feinstein Institute for Medical Research, Manhasset, New York, NY, USA
| | - Nima Mesgarani
- Department of Electrical Engineering, Columbia University, New York, NY, USA.
| |
Collapse
|
37
|
Gao Y, Zhang J, Wang Q. Robust neural tracking of linguistic units relates to distractor suppression. Eur J Neurosci 2019; 51:641-650. [PMID: 31430411 DOI: 10.1111/ejn.14552] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/04/2019] [Revised: 08/01/2019] [Accepted: 08/08/2019] [Indexed: 11/30/2022]
Abstract
In a complex auditory scene, speech comprehension involves several stages: for example, segregating the target from the background, recognizing syllables, and integrating syllables into linguistic units (e.g., words). Although speech segregation is robust, as shown by invariant neural tracking of the target speech envelope, whether neural tracking of linguistic units is also robust, and how this robustness is achieved, remain unknown. To investigate these questions, we used electroencephalography to concurrently record neural responses tracking a rhythmic speech stream at its syllabic and word rates. Human participants listened to the target speech under a speech or noise distractor at varying signal-to-noise ratios. Neural tracking at the word rate was not as robust as neural tracking at the syllabic rate. Robust neural tracking of the target's words was observed only under the speech distractor, not under the noise distractor. Moreover, this robust word tracking correlated with successful suppression of distractor tracking. Critically, both word tracking and distractor suppression correlated with behavioural comprehension accuracy. In sum, our results suggest that robust neural tracking of higher-level linguistic units relates not only to target tracking but also to distractor suppression.
Affiliation(s)
- Yayue Gao: Key Laboratory for Biomedical Engineering of Ministry of Education, College of Biomedical Engineering and Instrument Sciences, Zhejiang University, Hangzhou, China; Department of Psychology, Beihang University, Beijing, China
- Jianfeng Zhang: Key Laboratory for Biomedical Engineering of Ministry of Education, College of Biomedical Engineering and Instrument Sciences, Zhejiang University, Hangzhou, China
- Qian Wang: Department of Clinical Psychology, Epilepsy Center, Sanbo Brain Hospital, Capital Medical University, Beijing, China

38
Object-based attention in complex, naturalistic auditory streams. Sci Rep 2019; 9:2854. [PMID: 30814547 PMCID: PMC6393668 DOI: 10.1038/s41598-019-39166-6] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/27/2018] [Accepted: 01/14/2019] [Indexed: 11/08/2022] Open
Abstract
In vision, objects have been described as the 'units' on which non-spatial attention operates in many natural settings. Here, we test the idea of object-based attention in the auditory domain within ecologically valid auditory scenes, composed of two spatially and temporally overlapping sound streams (speech signal vs. environmental soundscapes in Experiment 1 and two speech signals in Experiment 2). Top-down attention was directed to one or the other auditory stream by a non-spatial cue. To test for high-level, object-based attention effects we introduce an auditory repetition detection task in which participants have to detect brief repetitions of auditory objects, ruling out any possible confounds with spatial or feature-based attention. The participants' responses were significantly faster and more accurate in the valid cue condition compared to the invalid cue condition, indicating a robust cue-validity effect of high-level, object-based auditory attention.
39
Kirszenblat L, Ertekin D, Goodsell J, Zhou Y, Shaw PJ, van Swinderen B. Sleep regulates visual selective attention in Drosophila. J Exp Biol 2018; 221:jeb.191429. [PMID: 30355611 DOI: 10.1242/jeb.191429] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/29/2018] [Accepted: 10/17/2018] [Indexed: 01/22/2023]
Abstract
Although sleep deprivation is known to impair attention in humans and other mammals, the underlying reasons are not well understood, and whether similar effects are present in non-mammalian species is not known. We therefore sought to investigate whether sleep is important for optimizing attention in an invertebrate species, the genetic model Drosophila melanogaster. We developed a high-throughput paradigm to measure visual attention in freely walking Drosophila, using competing foreground/background visual stimuli. We found that whereas sleep-deprived flies could respond normally to either stimulus alone, they were more distracted by background cues in a visual competition task. Other stressful manipulations such as starvation, heat exposure and mechanical stress had no effects on visual attention in this paradigm. In contrast to sleep deprivation, providing additional sleep using the GABA-A agonist 4,5,6,7-tetrahydroisoxazolo-[5,4-c]pyridine-3-ol (THIP) did not affect attention in wild-type flies, but specifically improved attention in the learning mutant dunce. Our results reveal a key function of sleep in optimizing attention processes in Drosophila, and establish a behavioral paradigm that can be used to explore the molecular mechanisms involved.
Affiliation(s)
- Leonie Kirszenblat: Queensland Brain Institute, The University of Queensland, Brisbane, QLD, 4072, Australia
- Deniz Ertekin: Queensland Brain Institute, The University of Queensland, Brisbane, QLD, 4072, Australia
- Joseph Goodsell: Queensland Brain Institute, The University of Queensland, Brisbane, QLD, 4072, Australia
- Yanqiong Zhou: Queensland Brain Institute, The University of Queensland, Brisbane, QLD, 4072, Australia
- Paul J Shaw: Department of Anatomy and Neurobiology, Washington University in St. Louis, 660 South Euclid Avenue, St Louis, MO 63110, USA
- Bruno van Swinderen: Queensland Brain Institute, The University of Queensland, Brisbane, QLD, 4072, Australia

40
Auditory Figure-Ground Segregation Is Impaired by High Visual Load. J Neurosci 2018; 39:1699-1708. [PMID: 30541915 PMCID: PMC6391559 DOI: 10.1523/jneurosci.2518-18.2018] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/24/2018] [Revised: 11/19/2018] [Accepted: 11/19/2018] [Indexed: 11/21/2022] Open
Abstract
Figure-ground segregation is fundamental to listening in complex acoustic environments. An ongoing debate pertains to whether segregation requires attention or is "automatic" and preattentive. In this magnetoencephalography study, we tested a prediction derived from load theory of attention (e.g., Lavie, 1995) that segregation requires attention but can benefit from the automatic allocation of any "leftover" capacity under low load. Complex auditory scenes were modeled with stochastic figure-ground stimuli (Teki et al., 2013), which occasionally contained repeated frequency component "figures." Naive human participants (both sexes) passively listened to these signals while performing a visual attention task of either low or high load. While clear figure-related neural responses were observed under conditions of low load, high visual load substantially reduced the neural response to the figure in auditory cortex (planum temporale, Heschl's gyrus). We conclude that fundamental figure-ground segregation in hearing is not automatic but draws on resources that are shared across vision and audition.
SIGNIFICANCE STATEMENT: This work resolves a long-standing question of whether figure-ground segregation, a fundamental process of auditory scene analysis, requires attention or is underpinned by automatic, encapsulated computations. Task-irrelevant sounds were presented during performance of a visual search task. We revealed a clear magnetoencephalography neural signature of figure-ground segregation in conditions of low visual load, which was substantially reduced in conditions of high visual load. This demonstrates that, although attention does not need to be actively allocated to sound for auditory segregation to occur, segregation depends on shared computational resources across vision and hearing. The findings further highlight that visual load can impair the computational capacity of the auditory system, even when it does not simply dampen auditory responses as a whole.
41
Gifford AM, Sperling MR, Sharan A, Gorniak RJ, Williams RB, Davis K, Kahana MJ, Cohen YE. Neuronal phase consistency tracks dynamic changes in acoustic spectral regularity. Eur J Neurosci 2018; 49:1268-1287. [PMID: 30402926 DOI: 10.1111/ejn.14263] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/03/2018] [Revised: 10/15/2018] [Accepted: 10/23/2018] [Indexed: 11/28/2022]
Abstract
The brain parses the auditory environment into distinct sounds by identifying those acoustic features in the environment that have common relationships (e.g., spectral regularities) with one another and then grouping together the neuronal representations of these features. Although there is a large literature that tests how the brain tracks spectral regularities that are predictable, it is not known how the auditory system tracks spectral regularities that are not predictable and that change dynamically over time. Furthermore, the contribution of brain regions downstream of the auditory cortex to the coding of spectral regularity is unknown. Here, we addressed these two issues by recording electrocorticographic activity, while human patients listened to tone-burst sequences with dynamically varying spectral regularities, and identified potential neuronal mechanisms of the analysis of spectral regularities throughout the brain. We found that the degree of oscillatory stimulus phase consistency (PC) in multiple neuronal-frequency bands tracked spectral regularity. In particular, PC in the delta-frequency band seemed to be the best indicator of spectral regularity. We also found that these regularity representations existed in multiple regions throughout cortex. This widespread reliable modulation in PC - both in neuronal-frequency space and in cortical space - suggests that phase-based modulations may be a general mechanism for tracking regularity in the auditory system specifically and other sensory systems more generally. Our findings also support a general role for the delta-frequency band in processing the regularity of auditory stimuli.
Affiliation(s)
- Adam M Gifford: Neuroscience Graduate Group, University of Pennsylvania, Philadelphia, Pennsylvania
- Michael R Sperling: Jefferson Comprehensive Epilepsy Center, Department of Neurology, Thomas Jefferson University, Philadelphia, Pennsylvania
- Ashwini Sharan: Jefferson Comprehensive Epilepsy Center, Department of Neurology, Thomas Jefferson University, Philadelphia, Pennsylvania
- Richard J Gorniak: Department of Radiology, Sidney Kimmel Medical College, Thomas Jefferson University, Philadelphia, Pennsylvania
- Ryan B Williams: Department of Psychology, University of Pennsylvania, Philadelphia, Pennsylvania
- Kathryn Davis: Department of Neurology, University of Pennsylvania, Philadelphia, Pennsylvania
- Michael J Kahana: Neuroscience Graduate Group, University of Pennsylvania, Philadelphia, Pennsylvania; Department of Psychology, University of Pennsylvania, Philadelphia, Pennsylvania
- Yale E Cohen: Neuroscience Graduate Group, University of Pennsylvania, Philadelphia, Pennsylvania; Departments of Otorhinolaryngology, Neuroscience, and Bioengineering, University of Pennsylvania, Philadelphia, Pennsylvania

42
Cortical tracking of multiple streams outside the focus of attention in naturalistic auditory scenes. Neuroimage 2018; 181:617-626. [DOI: 10.1016/j.neuroimage.2018.07.052] [Citation(s) in RCA: 32] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/21/2018] [Revised: 07/19/2018] [Accepted: 07/22/2018] [Indexed: 11/30/2022] Open
43
Ruggles DR, Tausend AN, Shamma SA, Oxenham AJ. Cortical markers of auditory stream segregation revealed for streaming based on tonotopy but not pitch. J Acoust Soc Am 2018; 144:2424. [PMID: 30404514 PMCID: PMC6909992 DOI: 10.1121/1.5065392] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 07/05/2018] [Revised: 10/05/2018] [Accepted: 10/08/2018] [Indexed: 06/08/2023]
Abstract
The brain decomposes mixtures of sounds, such as competing talkers, into perceptual streams that can be attended to individually. Attention can enhance the cortical representation of streams, but it is unknown what acoustic features the enhancement reflects, or where in the auditory pathways attentional enhancement is first observed. Here, behavioral measures of streaming were combined with simultaneous low- and high-frequency envelope-following responses (EFR) that are thought to originate primarily from cortical and subcortical regions, respectively. Repeating triplets of harmonic complex tones were presented with alternating fundamental frequencies. The tones were filtered to contain either low-numbered spectrally resolved harmonics, or only high-numbered unresolved harmonics. The behavioral results confirmed that segregation can be based on either tonotopic or pitch cues. The EFR results revealed no effects of streaming or attention on subcortical responses. Cortical responses revealed attentional enhancement under conditions of streaming, but only when tonotopic cues were available, not when streaming was based only on pitch cues. The results suggest that the attentional modulation of phase-locked responses is dominated by tonotopically tuned cortical neurons that are insensitive to pitch or periodicity cues.
Affiliation(s)
- Dorea R Ruggles: Department of Psychology, University of Minnesota, 75 East River Parkway, Minneapolis, Minnesota 55455, USA
- Alexis N Tausend: Department of Psychology, University of Minnesota, 75 East River Parkway, Minneapolis, Minnesota 55455, USA
- Shihab A Shamma: Electrical and Computer Engineering Department & Institute for Systems, University of Maryland, College Park, Maryland 20740, USA
- Andrew J Oxenham: Department of Psychology, University of Minnesota, 75 East River Parkway, Minneapolis, Minnesota 55455, USA

44
Feng L, Oxenham AJ. Spectral contrast effects produced by competing speech contexts. J Exp Psychol Hum Percept Perform 2018; 44:1447-1457. [PMID: 29847973 PMCID: PMC6110988 DOI: 10.1037/xhp0000546] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]
Abstract
The long-term spectrum of a preceding sentence can alter the perception of a following speech sound in a contrastive manner. This speech context effect contributes to our ability to extract reliable spectral characteristics of the surrounding acoustic environment and to compensate for the voice characteristics of different speakers or spectral colorations in different listening environments to maintain perceptual constancy. The extent to which such effects are mediated by low-level "automatic" processes, or require directed attention, remains unknown. This study investigated spectral context effects by measuring the effects of two competing sentences on the phoneme category boundary between /i/ and /ε/ in a following target word, while directing listeners' attention to one or the other context sentence. Spatial separation of the context sentences was achieved either by presenting them to different ears, or by presenting them to both ears but imposing an interaural time difference (ITD) between the ears. The results confirmed large context effects based on ear of presentation. Smaller effects were observed based on either ITD or attention. The results, combined with predictions from a two-stage model, suggest that ear-specific factors dominate speech context effects but that the effects can be modulated by higher-level features, such as perceived location, and by attention.
Affiliation(s)
- Lei Feng: Department of Psychology, University of Minnesota

45
Huang N, Slaney M, Elhilali M. Connecting Deep Neural Networks to Physical, Perceptual, and Electrophysiological Auditory Signals. Front Neurosci 2018; 12:532. [PMID: 30154688 PMCID: PMC6102345 DOI: 10.3389/fnins.2018.00532] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/01/2018] [Accepted: 07/16/2018] [Indexed: 11/13/2022] Open
Abstract
Deep neural networks have been recently shown to capture intricate information transformation of signals from the sensory profiles to semantic representations that facilitate recognition or discrimination of complex stimuli. In this vein, convolutional neural networks (CNNs) have been used very successfully in image and audio classification. Designed to imitate the hierarchical structure of the nervous system, CNNs reflect activation with increasing degrees of complexity that transform the incoming signal onto object-level representations. In this work, we employ a CNN trained for large-scale audio object classification to gain insights about the contribution of various audio representations that guide sound perception. The analysis contrasts activation of different layers of a CNN with acoustic features extracted directly from the scenes, perceptual salience obtained from behavioral responses of human listeners, as well as neural oscillations recorded by electroencephalography (EEG) in response to the same natural scenes. All three measures are tightly linked quantities believed to guide percepts of salience and object formation when listening to complex scenes. The results paint a picture of the intricate interplay between low-level and object-level representations in guiding auditory salience that is very much dependent on context and sound category.
Affiliation(s)
- Nicholas Huang: Laboratory for Computational Audio Perception, Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, MD, United States
- Malcolm Slaney: Machine Hearing, Google AI, Mountain View, CA, United States
- Mounya Elhilali: Laboratory for Computational Audio Perception, Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, MD, United States

46
Neural Signatures of the Processing of Temporal Patterns in Sound. J Neurosci 2018; 38:5466-5477. [PMID: 29773757 DOI: 10.1523/jneurosci.0346-18.2018] [Citation(s) in RCA: 25] [Impact Index Per Article: 4.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/06/2018] [Revised: 04/13/2018] [Accepted: 05/06/2018] [Indexed: 11/21/2022] Open
Abstract
The ability to detect regularities in sound (i.e., recurring structure) is critical for effective perception, enabling, for example, change detection and prediction. Two seemingly unconnected lines of research concern the neural operations involved in processing regularities: one investigates how neural activity synchronizes with temporal regularities (e.g., frequency modulation; FM) in sounds, whereas the other focuses on increases in sustained activity during stimulation with repeating tone-frequency patterns. In three electroencephalography studies with male and female human participants, we investigated whether neural synchronization and sustained neural activity are dissociable, or whether they are functionally interdependent. Experiment I demonstrated that neural activity synchronizes with temporal regularity (FM) in sounds, and that sustained activity increases concomitantly. In Experiment II, phase coherence of FM in sounds was parametrically varied. Although neural synchronization was more sensitive to changes in FM coherence, such changes led to a systematic modulation of both neural synchronization and sustained activity, with magnitude increasing as coherence increased. In Experiment III, participants either performed a duration categorization task on the sounds, or a visual object tracking task to distract attention. Neural synchronization was observed regardless of task, whereas the sustained response was observed only when attention was on the auditory task, not under (visual) distraction. The results suggest that neural synchronization and sustained activity levels are functionally linked: both are sensitive to regularities in sounds. However, neural synchronization might reflect a more sensory-driven response to regularity, compared with sustained activity which may be influenced by attentional, contextual, or other experiential factors.
SIGNIFICANCE STATEMENT: Optimal perception requires that the auditory system detects regularities in sounds. Synchronized neural activity and increases in sustained neural activity both appear to index the detection of a regularity, but the functional interrelation of these two neural signatures is unknown. In three electroencephalography experiments, we measured both signatures concomitantly while listeners were presented with sounds containing frequency modulations that differed in their regularity. We observed that both neural signatures are sensitive to temporal regularity in sounds, although they functionally decouple when a listener is distracted by a demanding visual task. Our data suggest that neural synchronization reflects a more automatic response to regularity compared with sustained activity, which may be influenced by attentional, contextual, or other experiential factors.
47
Jaeger M, Bleichner MG, Bauer AKR, Mirkovic B, Debener S. Did You Listen to the Beat? Auditory Steady-State Responses in the Human Electroencephalogram at 4 and 7 Hz Modulation Rates Reflect Selective Attention. Brain Topogr 2018; 31:811-826. [DOI: 10.1007/s10548-018-0637-8] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/07/2017] [Accepted: 02/23/2018] [Indexed: 01/23/2023]
48
Riecke L, Peters JC, Valente G, Kemper VG, Formisano E, Sorger B. Frequency-Selective Attention in Auditory Scenes Recruits Frequency Representations Throughout Human Superior Temporal Cortex. Cereb Cortex 2018; 27:3002-3014. [PMID: 27230215 DOI: 10.1093/cercor/bhw160] [Citation(s) in RCA: 25] [Impact Index Per Article: 4.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/22/2023] Open
Abstract
A sound of interest may be tracked amid other salient sounds by focusing attention on its characteristic features including its frequency. Functional magnetic resonance imaging findings have indicated that frequency representations in human primary auditory cortex (AC) contribute to this feat. However, attentional modulations were examined at relatively low spatial and spectral resolutions, and frequency-selective contributions outside the primary AC could not be established. To address these issues, we compared blood oxygenation level-dependent (BOLD) responses in the superior temporal cortex of human listeners while they identified single frequencies versus listened selectively for various frequencies within a multifrequency scene. Using best-frequency mapping, we observed that the detailed spatial layout of attention-induced BOLD response enhancements in primary AC follows the tonotopy of stimulus-driven frequency representations-analogous to the "spotlight" of attention enhancing visuospatial representations in retinotopic visual cortex. Moreover, using an algorithm trained to discriminate stimulus-driven frequency representations, we could successfully decode the focus of frequency-selective attention from listeners' BOLD response patterns in nonprimary AC. Our results indicate that the human brain facilitates selective listening to a frequency of interest in a scene by reinforcing the fine-grained activity pattern throughout the entire superior temporal cortex that would be evoked if that frequency was present alone.
Affiliation(s)
- Lars Riecke: Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, 6229 EV Maastricht, The Netherlands
- Judith C Peters: Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, 6229 EV Maastricht, The Netherlands; Netherlands Institute for Neuroscience, Institute of the Royal Netherlands Academy of Arts and Sciences (KNAW), 1105 BA Amsterdam, The Netherlands
- Giancarlo Valente: Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, 6229 EV Maastricht, The Netherlands
- Valentin G Kemper: Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, 6229 EV Maastricht, The Netherlands
- Elia Formisano: Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, 6229 EV Maastricht, The Netherlands
- Bettina Sorger: Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, 6229 EV Maastricht, The Netherlands

49
Puvvada KC, Summerfelt A, Du X, Krishna N, Kochunov P, Rowland LM, Simon JZ, Hong LE. Delta Vs Gamma Auditory Steady State Synchrony in Schizophrenia. Schizophr Bull 2018; 44:378-387. [PMID: 29036430 PMCID: PMC5814801 DOI: 10.1093/schbul/sbx078] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 11/14/2022]
Abstract
Background: Delta band (1-4 Hz) neuronal responses support the precision and stability of auditory processing, and a deficit in delta band synchrony may be relevant to auditory domain symptoms in schizophrenia patients.
Methods: Delta band synchronization elicited by a 2.5 Hz auditory steady state response (ASSR) paradigm, along with those from theta (5 Hz), alpha (10 Hz), beta (20 Hz), gamma (40 Hz), and high gamma (80 Hz) frequency ASSR, were compared in 128 patients with schizophrenia, 108 healthy controls, and 55 first-degree relatives (FDR) of patients.
Results: Delta band synchronization was significantly impaired in patients compared with controls (F = 18.3, P < .001). There was a significant 2.5 Hz by 40 Hz ASSR interaction (P = .023), arising from a greater reduction of 2.5 Hz ASSR than of 40 Hz ASSR, in patients compared with controls. Greater deficit in delta ASSR was associated with auditory perceptual abnormality (P = .007) and reduced verbal working memory (P < .001). Gamma frequency ASSR impairment was also significant but more modest (F = 8.7, P = .004), and this deficit was also present in FDR (P = .022).
Conclusions: The ability to sustain delta band oscillation entrainment in the auditory pathway is significantly reduced in schizophrenia patients and appears to be clinically relevant.
Affiliation(s)
- Krishna C Puvvada: Department of Electrical & Computer Engineering, University of Maryland, College Park, MD
- Ann Summerfelt: Department of Psychiatry, Maryland Psychiatric Research Center, University of Maryland School of Medicine, Baltimore, MD
- Xiaoming Du: Department of Psychiatry, Maryland Psychiatric Research Center, University of Maryland School of Medicine, Baltimore, MD
- Nithin Krishna: Department of Psychiatry, Maryland Psychiatric Research Center, University of Maryland School of Medicine, Baltimore, MD
- Peter Kochunov: Department of Psychiatry, Maryland Psychiatric Research Center, University of Maryland School of Medicine, Baltimore, MD
- Laura M Rowland: Department of Psychiatry, Maryland Psychiatric Research Center, University of Maryland School of Medicine, Baltimore, MD
- Jonathan Z Simon: Department of Electrical & Computer Engineering, University of Maryland, College Park, MD; Department of Biology, University of Maryland, College Park, MD; Institute for Systems Research, University of Maryland, College Park, MD
- L Elliot Hong: Department of Psychiatry, Maryland Psychiatric Research Center, University of Maryland School of Medicine, Baltimore, MD

50
Wiegand K, Heiland S, Uhlig CH, Dykstra AR, Gutschalk A. Cortical networks for auditory detection with and without informational masking: Task effects and implications for conscious perception. Neuroimage 2018; 167:178-190. [DOI: 10.1016/j.neuroimage.2017.11.036] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/24/2017] [Revised: 10/06/2017] [Accepted: 11/18/2017] [Indexed: 01/08/2023] Open