1
Pedapati EV, Ethridge LE, Liu Y, Liu R, Sweeney JA, DeStefano LA, Miyakoshi M, Razak K, Schmitt LM, Moore DR, Gilbert DL, Wu SW, Smith E, Shaffer RC, Dominick KC, Horn PS, Binder D, Erickson CA. Frontal Cortex Hyperactivation and Gamma Desynchrony in Fragile X Syndrome: Correlates of Auditory Hypersensitivity. bioRxiv 2024:2024.06.13.598957. PMID: 38915683; PMCID: PMC11195233; DOI: 10.1101/2024.06.13.598957.
Abstract
Fragile X syndrome (FXS) is an X-linked disorder that often leads to intellectual disability, anxiety, and sensory hypersensitivity. While sound sensitivity (hyperacusis) is a distressing symptom in FXS, its neural basis is not well understood. It is postulated that hyperacusis may stem from temporal lobe hyperexcitability or dysregulation in top-down modulation. Studying the neural mechanisms underlying sound sensitivity in FXS using scalp electroencephalography (EEG) is challenging because the temporal and frontal regions have overlapping neural projections that are difficult to differentiate. To overcome this challenge, we conducted EEG source analysis on a group of 36 individuals with FXS and 39 matched healthy controls. Our goal was to characterize the spatial and temporal properties of the response to an auditory chirp stimulus. Our results showed that males with FXS exhibit excessive activation in the frontal cortex in response to the stimulus onset, which may reflect changes in top-down modulation of auditory processing. Additionally, during the chirp stimulus, individuals with FXS demonstrated a reduction in typical gamma phase synchrony, along with an increase in asynchronous gamma power, across multiple regions, most strongly in temporal cortex. Consistent with these findings, we observed a decrease in the signal-to-noise ratio, estimated by the ratio of synchronous to asynchronous gamma activity, in individuals with FXS. Furthermore, this ratio was highly correlated with performance in an auditory attention task. Compared to controls, males with FXS demonstrated elevated bidirectional frontotemporal information flow at chirp onset. The evidence indicates that both temporal lobe hyperexcitability and disruptions in top-down regulation play a role in auditory sensitivity disturbances in FXS. These findings have the potential to guide the development of therapeutic targets and back-translation strategies.
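The synchronous-to-asynchronous gamma ratio used above as a signal-to-noise estimate can be made concrete with a short sketch. This is a minimal illustration assuming single-channel epoched EEG; the function name, band limits, and the evoked/induced decomposition are placeholders, not the authors' actual pipeline.

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def gamma_snr(epochs, fs, band=(30.0, 60.0)):
    """Ratio of phase-locked (synchronous) to non-phase-locked
    (asynchronous) gamma power across trials.

    epochs: (n_trials, n_samples) single-channel EEG epochs.
    Illustrative definitions only.
    """
    b, a = butter(4, [band[0] / (fs / 2), band[1] / (fs / 2)], btype="band")
    filtered = filtfilt(b, a, epochs, axis=1)
    evoked = filtered.mean(axis=0)                 # part that survives trial averaging
    sync_power = np.mean(np.abs(hilbert(evoked)) ** 2)
    induced = filtered - evoked                    # asynchronous residual
    async_power = np.mean(np.abs(hilbert(induced, axis=1)) ** 2)
    return sync_power / async_power
```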
Affiliation(s)
- Ernest V Pedapati
- Division of Child and Adolescent Psychiatry, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, United States
- Division of Neurology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, United States
- Department of Psychiatry, University of Cincinnati College of Medicine, Cincinnati, OH, United States
- Lauren E Ethridge
- Department of Pediatrics, Section on Developmental and Behavioral Pediatrics, University of Oklahoma Health Sciences Center, Oklahoma City, OK, United States
- Department of Psychology, University of Oklahoma, Norman, OK, United States
- Yanchen Liu
- Division of Child and Adolescent Psychiatry, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, United States
- Rui Liu
- Division of Child and Adolescent Psychiatry, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, United States
- John A Sweeney
- Department of Psychiatry, University of Cincinnati College of Medicine, Cincinnati, OH, United States
- Lisa A DeStefano
- Division of Developmental and Behavioral Pediatrics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, United States
- Makoto Miyakoshi
- Division of Child and Adolescent Psychiatry, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, United States
- Khaleel Razak
- Department of Psychology, University of California, Riverside, Riverside, CA, United States
- Lauren M Schmitt
- Division of Developmental and Behavioral Pediatrics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, United States
- Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH, United States
- David R Moore
- Communication Sciences Research Center, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, United States
- Manchester Centre for Audiology and Deafness, University of Manchester, Manchester, UK
- Donald L Gilbert
- Division of Neurology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, United States
- Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH, United States
- Steve W Wu
- Division of Neurology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, United States
- Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH, United States
- Elizabeth Smith
- Division of Developmental and Behavioral Pediatrics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, United States
- Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH, United States
- Rebecca C Shaffer
- Division of Developmental and Behavioral Pediatrics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, United States
- Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH, United States
- Kelli C Dominick
- Division of Child and Adolescent Psychiatry, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, United States
- Department of Psychiatry, University of Cincinnati College of Medicine, Cincinnati, OH, United States
- Paul S Horn
- Division of Neurology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, United States
- Devin Binder
- Division of Biomedical Sciences, School of Medicine, University of California, Riverside, United States
- Craig A Erickson
- Division of Child and Adolescent Psychiatry, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, United States
- Department of Psychiatry, University of Cincinnati College of Medicine, Cincinnati, OH, United States
2
Albouy P, Mehr SA, Hoyer RS, Ginzburg J, Du Y, Zatorre RJ. Spectro-temporal acoustical markers differentiate speech from song across cultures. Nat Commun 2024;15:4835. PMID: 38844457; PMCID: PMC11156671; DOI: 10.1038/s41467-024-49040-3.
Abstract
Humans produce two forms of cognitively complex vocalizations: speech and song. It is debated whether these differ based primarily on culturally specific, learned features, or if acoustical features can reliably distinguish them. We study the spectro-temporal modulation patterns of vocalizations produced by 369 people living in 21 urban, rural, and small-scale societies across six continents. Specific ranges of spectral and temporal modulations, overlapping within categories and across societies, significantly differentiate speech from song. Machine-learning classification shows that this effect is cross-culturally robust, vocalizations being reliably classified solely from their spectro-temporal features across all 21 societies. Listeners unfamiliar with the cultures classify these vocalizations using similar spectro-temporal cues as the machine learning algorithm. Finally, spectro-temporal features are better able to discriminate song from speech than a broad range of other acoustical variables, suggesting that spectro-temporal modulation-a key feature of auditory neuronal tuning-accounts for a fundamental difference between these categories.
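The spectro-temporal modulation features and classification described above can be approximated as follows. A minimal sketch assuming equal-length audio clips; the window sizes and the logistic-regression classifier are illustrative stand-ins for the study's actual feature extraction and machine-learning pipeline.

```python
import numpy as np
from scipy.signal import spectrogram
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def modulation_power_spectrum(waveform, fs):
    """2D Fourier transform of the log spectrogram: one axis indexes
    temporal modulation (Hz), the other spectral modulation."""
    f, t, sxx = spectrogram(waveform, fs=fs, nperseg=512, noverlap=384)
    log_s = np.log(sxx + 1e-10)
    mps = np.abs(np.fft.fftshift(np.fft.fft2(log_s - log_s.mean())))
    return mps.ravel()

# X = np.stack([modulation_power_spectrum(w, fs) for w in recordings])
# y = labels  # 0 = speech, 1 = song
# acc = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5).mean()
```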
Affiliation(s)
- Philippe Albouy
- CERVO Brain Research Centre, School of Psychology, Laval University, Québec City, QC, Canada.
- International Laboratory for Brain, Music and Sound Research (BRAMS), Montreal, QC, Canada.
- Centre for Research in Brain, Language and Music and Centre for Interdisciplinary Research in Music, Media, and Technology, Montréal, QC, Canada.
- Samuel A Mehr
- International Laboratory for Brain, Music and Sound Research (BRAMS), Montreal, QC, Canada
- School of Psychology, University of Auckland, Auckland, 1010, New Zealand
- Child Study Center, Yale University, New Haven, CT, 06511, USA
- Roxane S Hoyer
- CERVO Brain Research Centre, School of Psychology, Laval University, Québec City, QC, Canada
- Jérémie Ginzburg
- CERVO Brain Research Centre, School of Psychology, Laval University, Québec City, QC, Canada
- Lyon Neuroscience Research Center, CNRS, UMR5292, INSERM, U1028 - Université Claude Bernard Lyon 1, F-69000, Lyon, France
- Cognitive Neuroscience Unit, Montreal Neurological Institute, McGill University, Montreal, QC, Canada
- Yi Du
- Institute of Psychology, Chinese Academy of Sciences, Beijing, China
- Robert J Zatorre
- International Laboratory for Brain, Music and Sound Research (BRAMS), Montreal, QC, Canada.
- Centre for Research in Brain, Language and Music and Centre for Interdisciplinary Research in Music, Media, and Technology, Montréal, QC, Canada.
- Cognitive Neuroscience Unit, Montreal Neurological Institute, McGill University, Montreal, QC, Canada.
3
Rupp KM, Hect JL, Harford EE, Holt LL, Ghuman AS, Abel TJ. A hierarchy of processing complexity and timescales for natural sounds in human auditory cortex. bioRxiv 2024:2024.05.24.595822. PMID: 38826304; PMCID: PMC11142240; DOI: 10.1101/2024.05.24.595822.
Abstract
Efficient behavior is supported by humans' ability to rapidly recognize acoustically distinct sounds as members of a common category. Within auditory cortex, there are critical unanswered questions regarding the organization and dynamics of sound categorization. Here, we performed intracerebral recordings in the context of epilepsy surgery as 20 patient-participants listened to natural sounds. We built encoding models to predict neural responses using features of these sounds extracted from different layers within a sound-categorization deep neural network (DNN). This approach yielded highly accurate models of neural responses throughout auditory cortex. The complexity of a cortical site's representation (measured by the depth of the DNN layer that produced the best model) was closely related to its anatomical location, with shallow, middle, and deep layers of the DNN associated with core (primary auditory cortex), lateral belt, and parabelt regions, respectively. Smoothly varying gradients of representational complexity also existed within these regions, with complexity increasing along a posteromedial-to-anterolateral direction in core and lateral belt, and along posterior-to-anterior and dorsal-to-ventral dimensions in parabelt. When we estimated the time window over which each recording site integrates information, we found shorter integration windows in core relative to lateral belt and parabelt. Lastly, we found a relationship between the length of the integration window and the complexity of information processing within core (but not lateral belt or parabelt). These findings suggest hierarchies of timescales and processing complexity, and their interrelationship, represent a functional organizational principle of the auditory stream that underlies our perception of complex, abstract auditory information.
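The layer-depth analysis (assigning each recording site the DNN layer whose features best predict its response) can be sketched as below. The function name, array shapes, and ridge/correlation scoring are assumptions for illustration, not the paper's exact encoding-model code.

```python
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import cross_val_predict

def representational_complexity(layer_features, response):
    """layer_features: list of (n_timepoints, n_units) arrays, ordered
    shallow to deep; response: (n_timepoints,) activity at one site.
    Returns the index of the best-predicting layer (cross-validated
    correlation), used here as that site's 'complexity'."""
    scores = []
    for feats in layer_features:
        pred = cross_val_predict(RidgeCV(alphas=(1.0, 10.0, 100.0)),
                                 feats, response, cv=5)
        scores.append(np.corrcoef(pred, response)[0, 1])
    return int(np.argmax(scores)), scores
```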
Affiliation(s)
- Kyle M. Rupp
- Department of Neurological Surgery, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
- Jasmine L. Hect
- Department of Neurological Surgery, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
- Emily E. Harford
- Department of Neurological Surgery, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
- Lori L. Holt
- Department of Psychology, The University of Texas at Austin, Austin, Texas, United States of America
- Avniel Singh Ghuman
- Department of Neurological Surgery, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
- Taylor J. Abel
- Department of Neurological Surgery, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
- Department of Bioengineering, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
4
Reecher HM, Bearden DJ, Koop JI, Berl MM, Patrick KE, Ailion AS. The changing landscape of electrical stimulation language mapping with subdural electrodes and stereoelectroencephalography for pediatric epilepsy: A literature review and commentary. Epilepsia 2024. PMID: 38787551; DOI: 10.1111/epi.18009.
Abstract
Electrical stimulation mapping (ESM) is used to locate the brain areas supporting language directly within the human cortex to minimize the risk of functional decline following epilepsy surgery. ESM is completed by utilizing subdural grid or depth electrodes (stereo-electroencephalography [sEEG]) in combination with behavioral evaluation of language. Despite technological advances, there is no standardized method of assessing language during pediatric ESM. To identify current clinical practices for pediatric ESM of language, we surveyed neuropsychologists in the Pediatric Epilepsy Research Consortium. Results indicated that sEEG is used for functional mapping at >80% of participating epilepsy surgery centers (n = 13/16) in the United States. However, >65% of sites did not report a standardized protocol to map language. Survey results indicated a clear need for practice recommendations regarding ESM of language. We then searched PubMed/MEDLINE and PsycINFO and identified 42 articles reporting on ESM of language; 18 met the inclusion criteria: use of ESM or signal recording to localize language regions in children (<21 years), a detailed account of the procedure and language measures used, and region-specific language localization outcomes. Articles were grouped based on the language domain assessed, the language measures used, and the brain regions involved. Our review revealed the need for evidence-based clinical guidelines for pediatric language paradigms during ESM, a standardized language mapping protocol, and standardized reporting of brain regions in research. Relevant limitations and future directions are discussed with a focus on considerations for pediatric language mapping.
Affiliation(s)
- Hope M Reecher
- Department of Neurology, Medical College of Wisconsin, Milwaukee, Wisconsin, USA
- Donald J Bearden
- Department of Neurology, Emory University School of Medicine, Atlanta, Georgia, USA
- Department of Neuropsychology, Children's Healthcare of Atlanta, Atlanta, Georgia, USA
- Jennifer I Koop
- Department of Neurology, Medical College of Wisconsin, Milwaukee, Wisconsin, USA
- Department of Neurology, Department of Neuropsychology, Children's Wisconsin, Medical College of Wisconsin, Milwaukee, Wisconsin, USA
- Madison M Berl
- Department of Neuropsychology, Children's National Hospital, Washington, DC, USA
- Department of Psychiatry and Behavioral Sciences, George Washington University, Washington, DC, USA
- Kristina E Patrick
- Department of Neurology, University of Washington School of Medicine, Seattle, Washington, USA
- Department of Neuroscience, Seattle Children's Hospital, Seattle, Washington, USA
- Alyssa S Ailion
- Department of Neurology, Boston Children's Hospital, Boston, Massachusetts, USA
- Department of Psychiatry, Harvard Medical School, Boston, Massachusetts, USA
5
Kurteff GL, Field AM, Asghar S, Tyler-Kabara EC, Clarke D, Weiner HL, Anderson AE, Watrous AJ, Buchanan RJ, Modur PN, Hamilton LS. Processing of auditory feedback in perisylvian and insular cortex. bioRxiv 2024:2024.05.14.593257. PMID: 38798574; PMCID: PMC11118286; DOI: 10.1101/2024.05.14.593257.
Abstract
When we speak, we not only make movements with our mouth, lips, and tongue, but we also hear the sound of our own voice. Thus, speech production in the brain involves not only controlling the movements we make, but also auditory and sensory feedback. Auditory responses are typically suppressed during speech production compared to perception, but how this manifests across space and time is unclear. Here we recorded intracranial EEG in seventeen pediatric, adolescent, and adult patients with medication-resistant epilepsy who performed a reading/listening task to investigate how other auditory responses are modulated during speech production. We identified onset and sustained responses to speech in bilateral auditory cortex, with a selective suppression of onset responses during speech production. Onset responses provide a temporal landmark during speech perception that is redundant with forward prediction during speech production. Phonological feature tuning in these "onset suppression" electrodes remained stable between perception and production. Notably, the posterior insula responded at sentence onset for both perception and production, suggesting a role in multisensory integration during feedback control.
Affiliation(s)
- Garret Lynn Kurteff
- Department of Speech, Language, and Hearing Sciences, Moody College of Communication, The University of Texas at Austin, Austin, TX, USA
- Alyssa M. Field
- Department of Speech, Language, and Hearing Sciences, Moody College of Communication, The University of Texas at Austin, Austin, TX, USA
- Saman Asghar
- Department of Speech, Language, and Hearing Sciences, Moody College of Communication, The University of Texas at Austin, Austin, TX, USA
- Department of Neurosurgery, Baylor College of Medicine, Houston, TX, USA
- Elizabeth C. Tyler-Kabara
- Department of Neurosurgery, Dell Medical School, The University of Texas at Austin, Austin, TX, USA
- Department of Pediatrics, Dell Medical School, The University of Texas at Austin, Austin, TX, USA
- Dave Clarke
- Department of Neurosurgery, Dell Medical School, The University of Texas at Austin, Austin, TX, USA
- Department of Pediatrics, Dell Medical School, The University of Texas at Austin, Austin, TX, USA
- Department of Neurology, Dell Medical School, The University of Texas at Austin, Austin, TX, USA
- Howard L. Weiner
- Department of Neurosurgery, Baylor College of Medicine, Houston, TX, USA
- Anne E. Anderson
- Department of Pediatrics, Baylor College of Medicine, Houston, TX, USA
- Andrew J. Watrous
- Department of Neurosurgery, Baylor College of Medicine, Houston, TX, USA
- Robert J. Buchanan
- Department of Neurosurgery, Dell Medical School, The University of Texas at Austin, Austin, TX, USA
- Pradeep N. Modur
- Department of Neurology, Dell Medical School, The University of Texas at Austin, Austin, TX, USA
- Liberty S. Hamilton
- Department of Speech, Language, and Hearing Sciences, Moody College of Communication, The University of Texas at Austin, Austin, TX, USA
- Department of Neurology, Dell Medical School, The University of Texas at Austin, Austin, TX, USA
- Lead contact
6
Hullett PW, Leonard MK, Gorno-Tempini ML, Mandelli ML, Chang EF. Parallel Encoding of Speech in Human Frontal and Temporal Lobes. bioRxiv 2024:2024.03.19.585648. PMID: 38562883; PMCID: PMC10983886; DOI: 10.1101/2024.03.19.585648.
Abstract
Models of speech perception are centered around a hierarchy in which auditory representations in the thalamus propagate to primary auditory cortex, then to the lateral temporal cortex, and finally through dorsal and ventral pathways to sites in the frontal lobe. However, evidence for short latency speech responses and low-level spectrotemporal representations in frontal cortex raises the question of whether speech-evoked activity in frontal cortex strictly reflects downstream processing from lateral temporal cortex or whether there are direct parallel pathways from the thalamus or primary auditory cortex to the frontal lobe that supplement the traditional hierarchical architecture. Here, we used high-density direct cortical recordings, high-resolution diffusion tractography, and hemodynamic functional connectivity to evaluate for evidence of direct parallel inputs to frontal cortex from low-level areas. We found that neural populations in the frontal lobe show speech-evoked responses that are synchronous or occur earlier than responses in the lateral temporal cortex. These short latency frontal lobe neural populations encode spectrotemporal speech content indistinguishable from spectrotemporal encoding patterns observed in the lateral temporal lobe, suggesting parallel auditory speech representations reaching temporal and frontal cortex simultaneously. This is further supported by white matter tractography and functional connectivity patterns that connect the auditory nucleus of the thalamus (medial geniculate body) and the primary auditory cortex to the frontal lobe. Together, these results support the existence of a robust pathway of parallel inputs from low-level auditory areas to frontal lobe targets and illustrate long-range parallel architecture that works alongside the classical hierarchical speech network model.
7
Sankaran N, Leonard MK, Theunissen F, Chang EF. Encoding of melody in the human auditory cortex. Sci Adv 2024;10:eadk0010. PMID: 38363839; PMCID: PMC10871532; DOI: 10.1126/sciadv.adk0010.
Abstract
Melody is a core component of music in which discrete pitches are serially arranged to convey emotion and meaning. Perception varies along several pitch-based dimensions: (i) the absolute pitch of notes, (ii) the difference in pitch between successive notes, and (iii) the statistical expectation of each note given prior context. How the brain represents these dimensions and whether their encoding is specialized for music remains unknown. We recorded high-density neurophysiological activity directly from the human auditory cortex while participants listened to Western musical phrases. Pitch, pitch-change, and expectation were selectively encoded at different cortical sites, indicating a spatial map for representing distinct melodic dimensions. The same participants listened to spoken English, and we compared responses to music and speech. Cortical sites selective for music encoded expectation, while sites that encoded pitch and pitch-change in music used the same neural code to represent equivalent properties of speech. Findings reveal how the perception of melody recruits both music-specific and general-purpose sound representations.
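A sketch of how feature selectivity of this kind is often assessed: fit a lagged (temporal receptive field-style) regression from each melodic dimension to an electrode's high-gamma activity and compare fits. The shapes, lag count, and in-sample scoring are illustrative assumptions, not the authors' model.

```python
import numpy as np
from sklearn.linear_model import Ridge

def feature_selectivity(features, hga, lags=32):
    """features: dict of (n_timepoints,) melodic regressors, e.g.
    {'pitch': ..., 'pitch_change': ..., 'expectation': ...};
    hga: (n_timepoints,) high-gamma at one electrode.
    Returns R^2 per feature; a site is labeled by its best feature."""
    def lagged(x):
        X = np.stack([np.roll(x, k) for k in range(lags)], axis=1)
        return X[lags:]                     # drop wrapped-around rows
    scores = {}
    for name, feat in features.items():
        model = Ridge(alpha=10.0).fit(lagged(feat), hga[lags:])
        # in-sample R^2 for brevity; the study cross-validates
        scores[name] = model.score(lagged(feat), hga[lags:])
    return scores
```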
Affiliation(s)
- Narayan Sankaran
- Department of Neurological Surgery, University of California, San Francisco, 675 Nelson Rising Lane, San Francisco, CA 94158, USA
- Matthew K. Leonard
- Department of Neurological Surgery, University of California, San Francisco, 675 Nelson Rising Lane, San Francisco, CA 94158, USA
- Frederic Theunissen
- Department of Psychology, University of California, Berkeley, 2121 Berkeley Way, Berkeley, CA 94720, USA
- Edward F. Chang
- Department of Neurological Surgery, University of California, San Francisco, 675 Nelson Rising Lane, San Francisco, CA 94158, USA
8
Leonard MK, Gwilliams L, Sellers KK, Chung JE, Xu D, Mischler G, Mesgarani N, Welkenhuysen M, Dutta B, Chang EF. Large-scale single-neuron speech sound encoding across the depth of human cortex. Nature 2024;626:593-602. PMID: 38093008; PMCID: PMC10866713; DOI: 10.1038/s41586-023-06839-2.
Abstract
Understanding the neural basis of speech perception requires that we study the human brain both at the scale of the fundamental computational unit of neurons and in their organization across the depth of cortex. Here we used high-density Neuropixels arrays to record from 685 neurons across cortical layers at nine sites in a high-level auditory region that is critical for speech, the superior temporal gyrus, while participants listened to spoken sentences. Single neurons encoded a wide range of speech sound cues, including features of consonants and vowels, relative vocal pitch, onsets, amplitude envelope and sequence statistics. Each cross-laminar recording exhibited dominant tuning to a primary speech feature while also containing a substantial proportion of neurons that encoded other features, contributing to heterogeneous selectivity. Spatially, neurons at similar cortical depths tended to encode similar speech features. Activity across all cortical layers was predictive of high-frequency field potentials (electrocorticography), providing a neuronal origin for macroelectrode recordings from the cortical surface. Together, these results establish single-neuron tuning across the cortical laminae as an important dimension of speech encoding in human superior temporal gyrus.
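The laminar-to-surface relationship (population spiking across depths predicting high-frequency field potentials) can be illustrated with a cross-validated regression. The function name and array shapes are hypothetical placeholders, not the paper's code.

```python
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import cross_val_predict

def laminar_spiking_predicts_ecog(spike_counts, high_gamma):
    """spike_counts: (n_timebins, n_neurons) binned firing across depths;
    high_gamma: (n_timebins,) surface high-frequency amplitude.
    Returns the cross-validated correlation between predicted and
    measured high-gamma."""
    pred = cross_val_predict(RidgeCV(alphas=(1.0, 10.0, 100.0)),
                             spike_counts, high_gamma, cv=5)
    return float(np.corrcoef(pred, high_gamma)[0, 1])
```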
Affiliation(s)
- Matthew K Leonard
- Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, USA
- Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, USA
- Laura Gwilliams
- Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, USA
- Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, USA
- Kristin K Sellers
- Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, USA
- Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, USA
- Jason E Chung
- Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, USA
- Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, USA
- Duo Xu
- Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, USA
- Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, USA
- Gavin Mischler
- Mortimer B. Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY, USA
- Department of Electrical Engineering, Columbia University, New York, NY, USA
- Nima Mesgarani
- Mortimer B. Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY, USA
- Department of Electrical Engineering, Columbia University, New York, NY, USA
- Edward F Chang
- Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, USA.
- Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, USA.
9
Mai A, Riès S, Ben-Haim S, Shih JJ, Gentner TQ. Acoustic and language-specific sources for phonemic abstraction from speech. Nat Commun 2024;15:677. PMID: 38263364; PMCID: PMC10805762; DOI: 10.1038/s41467-024-44844-9.
Abstract
Spoken language comprehension requires abstraction of linguistic information from speech, but the interaction between auditory and linguistic processing of speech remains poorly understood. Here, we investigate the nature of this abstraction using neural responses recorded intracranially while participants listened to conversational English speech. Capitalizing on multiple, language-specific patterns where phonological and acoustic information diverge, we demonstrate the causal efficacy of the phoneme as a unit of analysis and dissociate the unique contributions of phonemic and spectrographic information to neural responses. Quantitative higher-order response models also reveal that unique contributions of phonological information are carried in the covariance structure of the stimulus-response relationship. This suggests that linguistic abstraction is shaped by neurobiological mechanisms that involve integration across multiple spectro-temporal features and prior phonological information. These results link speech acoustics to phonology and morphosyntax, substantiating predictions about abstractness in linguistic theory and providing evidence for the acoustic features that support that abstraction.
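Dissociating unique phonemic versus spectrographic contributions is commonly done by variance partitioning over nested encoding models; a hedged sketch follows, with hypothetical feature matrices and a ridge estimator standing in for the paper's higher-order response models.

```python
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import cross_val_predict

def unique_phoneme_contribution(spec_feats, phon_feats, response):
    """Unique phonemic contribution = cross-validated R^2 of the joint
    model minus that of the spectrogram-only model.
    spec_feats, phon_feats: (n_timepoints, n_features) matrices;
    response: (n_timepoints,) neural activity at one site."""
    def cv_r2(X):
        pred = cross_val_predict(RidgeCV(alphas=(1.0, 10.0, 100.0)),
                                 X, response, cv=5)
        resid = np.sum((response - pred) ** 2)
        total = np.sum((response - response.mean()) ** 2)
        return 1.0 - resid / total
    return cv_r2(np.hstack([spec_feats, phon_feats])) - cv_r2(spec_feats)
```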
Affiliation(s)
- Anna Mai
- University of California, San Diego, Linguistics, 9500 Gilman Dr., La Jolla, CA, 92093, USA.
- Stephanie Riès
- San Diego State University, School of Speech, Language, and Hearing Sciences, 5500 Campanile Drive, San Diego, CA, 92182, USA
- San Diego State University, Center for Clinical and Cognitive Sciences, 5500 Campanile Drive, San Diego, CA, 92182, USA
- Sharona Ben-Haim
- University of California, San Diego, Neurological Surgery, 9500 Gilman Dr., La Jolla, CA, 92093, USA
- Jerry J Shih
- University of California, San Diego, Neurosciences, 9500 Gilman Dr., La Jolla, CA, 92093, USA
- Timothy Q Gentner
- University of California, San Diego, Psychology, 9500 Gilman Dr., La Jolla, CA, 92093, USA
- University of California, San Diego, Neurobiology, 9500 Gilman Dr., La Jolla, CA, 92093, USA
- University of California, San Diego, Kavli Institute for Brain and Mind, 9500 Gilman Dr., La Jolla, CA, 92093, USA
10
Sankaran N, Leonard MK, Theunissen F, Chang EF. Encoding of melody in the human auditory cortex. bioRxiv 2023:2023.10.17.562771. PMID: 37905047; PMCID: PMC10614915; DOI: 10.1101/2023.10.17.562771.
Abstract
Melody is a core component of music in which discrete pitches are serially arranged to convey emotion and meaning. Perception of melody varies along several pitch-based dimensions: (1) the absolute pitch of notes, (2) the difference in pitch between successive notes, and (3) the higher-order statistical expectation of each note conditioned on its prior context. While humans readily perceive melody, how these dimensions are collectively represented in the brain and whether their encoding is specialized for music remains unknown. Here, we recorded high-density neurophysiological activity directly from the surface of human auditory cortex while Western participants listened to Western musical phrases. Pitch, pitch-change, and expectation were selectively encoded at different cortical sites, indicating a spatial code for representing distinct dimensions of melody. The same participants listened to spoken English, and we compared evoked responses to music and speech. Cortical sites selective for music were systematically driven by the encoding of expectation. In contrast, sites that encoded pitch and pitch-change used the same neural code to represent equivalent properties of speech. These findings reveal the multidimensional nature of melody encoding, consisting of both music-specific and domain-general sound representations in auditory cortex.
Teaser: The human brain contains both general-purpose and music-specific neural populations for processing distinct attributes of melody.
11
Papanicolaou AC. Non-Invasive Mapping of the Neuronal Networks of Language. Brain Sci 2023;13:1457. PMID: 37891824; PMCID: PMC10605023; DOI: 10.3390/brainsci13101457.
Abstract
This review consists of three main sections. In the first, the Introduction, the main theories of the neuronal mediation of linguistic operations, derived mostly from studies of the effects of focal lesions on linguistic performance, are summarized. These models furnish the conceptual framework on which the design of subsequent functional neuroimaging investigations is based. In the second section, the methods of functional neuroimaging, especially those of functional Magnetic Resonance Imaging (fMRI) and of Magnetoencephalography (MEG), are detailed along with the specific activation tasks employed in presurgical functional mapping. The reliability of these non-invasive methods and their validity, judged against the results of the invasive methods, namely the "Wada" procedure and Cortical Stimulation Mapping (CSM), is assessed and their use in presurgical mapping is justified. In the third and final section, the applications of fMRI and MEG in basic research are surveyed in six sub-sections, each dealing with the assessment of the neuronal networks for (1) acoustic and phonological, (2) semantic, (3) syntactic, and (4) prosodic operations, (5) sign language, and (6) the operations of reading and the mechanisms of dyslexia.
Affiliation(s)
- Andrew C Papanicolaou
- Department of Pediatrics, Division of Pediatric Neurology, College of Medicine, University of Tennessee Health Science Center, Memphis, TN 38013, USA
12
Bellier L, Llorens A, Marciano D, Gunduz A, Schalk G, Brunner P, Knight RT. Music can be reconstructed from human auditory cortex activity using nonlinear decoding models. PLoS Biol 2023;21:e3002176. PMID: 37582062; PMCID: PMC10427021; DOI: 10.1371/journal.pbio.3002176.
Abstract
Music is core to human experience, yet the precise neural dynamics underlying music perception remain unknown. We analyzed a unique intracranial electroencephalography (iEEG) dataset of 29 patients who listened to a Pink Floyd song and applied a stimulus reconstruction approach previously used in the speech domain. We successfully reconstructed a recognizable song from direct neural recordings and quantified the impact of different factors on decoding accuracy. Combining encoding and decoding analyses, we found a right-hemisphere dominance for music perception with a primary role of the superior temporal gyrus (STG), evidenced a new STG subregion tuned to musical rhythm, and defined an anterior-posterior STG organization exhibiting sustained and onset responses to musical elements. Our findings show the feasibility of applying predictive modeling on short datasets acquired in single patients, paving the way for adding musical elements to brain-computer interface (BCI) applications.
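A minimal sketch of the stimulus-reconstruction approach: regress the song's spectrogram on lagged high-gamma activity with a nonlinear model. The MLP, layer size, and array shapes are assumptions; the study's own nonlinear decoding models are not reproduced here.

```python
from sklearn.neural_network import MLPRegressor

def fit_song_decoder(lagged_hga, mel_spectrogram):
    """lagged_hga: (n_timepoints, n_electrodes * n_lags) high-gamma
    features; mel_spectrogram: (n_timepoints, n_freq_bins) target.
    Reconstruction quality is typically scored by correlating
    predicted and actual spectrogram bins on held-out data."""
    model = MLPRegressor(hidden_layer_sizes=(64,), max_iter=500)
    model.fit(lagged_hga, mel_spectrogram)
    return model  # model.predict(held_out_hga) -> reconstructed spectrogram
```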
Affiliation(s)
- Ludovic Bellier
- Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, California, United States of America
- Anaïs Llorens
- Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, California, United States of America
- Déborah Marciano
- Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, California, United States of America
- Aysegul Gunduz
- J. Crayton Pruitt Family Department of Biomedical Engineering, University of Florida, Gainesville, Florida, United States of America
- Gerwin Schalk
- Department of Neurology, Albany Medical College, Albany, New York, United States of America
- Peter Brunner
- Department of Neurology, Albany Medical College, Albany, New York, United States of America
- Department of Neurosurgery, Washington University School of Medicine, St. Louis, Missouri, United States of America
- National Center for Adaptive Neurotechnologies, Albany, New York, United States of America
- Robert T. Knight
- Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, California, United States of America
- Department of Psychology, University of California, Berkeley, Berkeley, California, United States of America
| |
Collapse
|
13
|
Damera SR, Chang L, Nikolov PP, Mattei JA, Banerjee S, Glezer LS, Cox PH, Jiang X, Rauschecker JP, Riesenhuber M. Evidence for a Spoken Word Lexicon in the Auditory Ventral Stream. NEUROBIOLOGY OF LANGUAGE (CAMBRIDGE, MASS.) 2023; 4:420-434. [PMID: 37588129 PMCID: PMC10426387 DOI: 10.1162/nol_a_00108] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 12/12/2022] [Accepted: 04/27/2023] [Indexed: 08/18/2023]
Abstract
The existence of a neural representation for whole words (i.e., a lexicon) is a common feature of many models of speech processing. Prior studies have provided evidence for a visual lexicon containing representations of whole written words in an area of the ventral visual stream known as the visual word form area. Similar experimental support for an auditory lexicon containing representations of spoken words has yet to be shown. Using functional magnetic resonance imaging rapid adaptation techniques, we provide evidence for an auditory lexicon in the auditory word form area in the human left anterior superior temporal gyrus that contains representations highly selective for individual spoken words. Furthermore, we show that familiarization with novel auditory words sharpens the selectivity of their representations in the auditory word form area. These findings reveal strong parallels in how the brain represents written and spoken words, showing convergent processing strategies across modalities in the visual and auditory ventral streams.
Collapse
Affiliation(s)
- Srikanth R. Damera
- Department of Neuroscience, Georgetown University Medical Center, Washington, DC, USA
- Lillian Chang
- Department of Neuroscience, Georgetown University Medical Center, Washington, DC, USA
- Plamen P. Nikolov
- Department of Neuroscience, Georgetown University Medical Center, Washington, DC, USA
- James A. Mattei
- Department of Neuroscience, Georgetown University Medical Center, Washington, DC, USA
- Suneel Banerjee
- Department of Neuroscience, Georgetown University Medical Center, Washington, DC, USA
- Laurie S. Glezer
- Department of Speech, Language, and Hearing Sciences, San Diego State University, San Diego, CA, USA
- Patrick H. Cox
- Department of Neuroscience, Georgetown University Medical Center, Washington, DC, USA
- Xiong Jiang
- Department of Neuroscience, Georgetown University Medical Center, Washington, DC, USA
- Josef P. Rauschecker
- Department of Neuroscience, Georgetown University Medical Center, Washington, DC, USA
14
Damera SR, Malone PS, Stevens BW, Klein R, Eberhardt SP, Auer ET, Bernstein LE, Riesenhuber M. Metamodal Coupling of Vibrotactile and Auditory Speech Processing Systems through Matched Stimulus Representations. J Neurosci 2023;43:4984-4996. PMID: 37197979; PMCID: PMC10324991; DOI: 10.1523/jneurosci.1710-22.2023.
Abstract
It has been postulated that the brain is organized by "metamodal," sensory-independent cortical modules capable of performing tasks (e.g., word recognition) in both "standard" and novel sensory modalities. Still, this theory has primarily been tested in sensory-deprived individuals, with mixed evidence in neurotypical subjects, thereby limiting its support as a general principle of brain organization. Critically, current theories of metamodal processing do not specify requirements for successful metamodal processing at the level of neural representations. Specification at this level may be particularly important in neurotypical individuals, where novel sensory modalities must interface with existing representations for the standard sense. Here we hypothesized that effective metamodal engagement of a cortical area requires congruence between stimulus representations in the standard and novel sensory modalities in that region. To test this, we first used fMRI to identify bilateral auditory speech representations. We then trained 20 human participants (12 female) to recognize vibrotactile versions of auditory words using one of two auditory-to-vibrotactile algorithms. The vocoded algorithm attempted to match the encoding scheme of auditory speech while the token-based algorithm did not. Crucially, using fMRI, we found that only in the vocoded group did trained-vibrotactile stimuli recruit speech representations in the superior temporal gyrus and lead to increased coupling between them and somatosensory areas. Our results advance our understanding of brain organization by providing new insight into unlocking the metamodal potential of the brain, thereby benefitting the design of novel sensory substitution devices that aim to tap into existing processing streams in the brain.
SIGNIFICANCE STATEMENT
It has been proposed that the brain is organized by "metamodal," sensory-independent modules specialized for performing certain tasks. This idea has inspired therapeutic applications, such as sensory substitution devices, for example, enabling blind individuals "to see" by transforming visual input into soundscapes. Yet, other studies have failed to demonstrate metamodal engagement. Here, we tested the hypothesis that metamodal engagement in neurotypical individuals requires matching the encoding schemes between stimuli from the novel and standard sensory modalities. We trained two groups of subjects to recognize words generated by one of two auditory-to-vibrotactile transformations. Critically, only vibrotactile stimuli that were matched to the neural encoding of auditory speech engaged auditory speech areas after training. This suggests that matching encoding schemes is critical to unlocking the brain's metamodal potential.
Affiliation(s)
- Srikanth R Damera
- Department of Neuroscience, Georgetown University Medical Center, Washington, DC 20007
- Patrick S Malone
- Department of Neuroscience, Georgetown University Medical Center, Washington, DC 20007
- Benson W Stevens
- Department of Neuroscience, Georgetown University Medical Center, Washington, DC 20007
- Richard Klein
- Department of Neuroscience, Georgetown University Medical Center, Washington, DC 20007
- Silvio P Eberhardt
- Department of Speech Language & Hearing Sciences, George Washington University, Washington, DC 20052
- Edward T Auer
- Department of Speech Language & Hearing Sciences, George Washington University, Washington, DC 20052
- Lynne E Bernstein
- Department of Speech Language & Hearing Sciences, George Washington University, Washington, DC 20052
15
Cusinato R, Alnes SL, van Maren E, Boccalaro I, Ledergerber D, Adamantidis A, Imbach LL, Schindler K, Baud MO, Tzovara A. Intrinsic Neural Timescales in the Temporal Lobe Support an Auditory Processing Hierarchy. J Neurosci 2023;43:3696-3707. PMID: 37045604; PMCID: PMC10198454; DOI: 10.1523/jneurosci.1941-22.2023.
Abstract
During rest, intrinsic neural dynamics manifest at multiple timescales, which progressively increase along visual and somatosensory hierarchies. Theoretically, intrinsic timescales are thought to facilitate processing of external stimuli at multiple stages. However, direct links between timescales at rest and sensory processing, as well as translation to the auditory system, are lacking. Here, we measured intracranial EEG in 11 human patients with epilepsy (4 women) while they listened to pure tones. We show that, in the auditory network, intrinsic neural timescales progressively increase, while the spectral exponent flattens, from temporal to entorhinal cortex, hippocampus, and amygdala. Within the neocortex, intrinsic timescales exhibit spatial gradients that follow the temporal lobe anatomy. Crucially, intrinsic timescales at baseline can explain the latency of auditory responses: as intrinsic timescales increase, so do the single-electrode response onset and peak latencies. Our results suggest that the human auditory network exhibits a repertoire of intrinsic neural dynamics, which manifest in cortical gradients with millimeter resolution and may provide a variety of temporal windows to support auditory processing.
SIGNIFICANCE STATEMENT
Endogenous neural dynamics are often characterized by their intrinsic timescales. These are thought to facilitate processing of external stimuli. However, a direct link between intrinsic timing at rest and sensory processing is missing. Here, with intracranial EEG, we show that intrinsic timescales progressively increase from temporal to entorhinal cortex, hippocampus, and amygdala. Intrinsic timescales at baseline can explain the variability in the timing of intracranial EEG responses to sounds: cortical electrodes with fast timescales also show fast and short-lasting responses to auditory stimuli, which progressively increase in the hippocampus and amygdala. Our results suggest that a hierarchy of neural dynamics in the temporal lobe manifests across cortical and limbic structures and can explain the temporal richness of auditory responses.
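One common way to operationalize an intrinsic neural timescale, plausibly similar in spirit to the analysis above, is the decay constant of an exponential fitted to a baseline signal's autocorrelation. The function name, lag range, and initial guess are illustrative, not the authors' exact procedure.

```python
import numpy as np
from scipy.optimize import curve_fit

def intrinsic_timescale(signal, fs, max_lag_s=0.5):
    """Fit exp(-t / tau) to the autocorrelation of a baseline recording
    and return the time constant tau in seconds."""
    x = signal - signal.mean()
    n_lags = int(max_lag_s * fs)
    lags = np.arange(1, n_lags)
    acf = np.array([np.corrcoef(x[:-k], x[k:])[0, 1] for k in lags])
    popt, _ = curve_fit(lambda t, tau: np.exp(-t / tau),
                        lags / fs, acf, p0=[0.05])
    return float(popt[0])
```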
Affiliation(s)
- Riccardo Cusinato
- Institute of Computer Science, University of Bern, Bern 3012, Switzerland
- Center for Experimental Neurology, Sleep Wake Epilepsy Center, NeuroTec, Department of Neurology, Inselspital, Bern University Hospital, University of Bern, Bern 3010, Switzerland
- Sigurd L Alnes
- Institute of Computer Science, University of Bern, Bern 3012, Switzerland
- Center for Experimental Neurology, Sleep Wake Epilepsy Center, NeuroTec, Department of Neurology, Inselspital, Bern University Hospital, University of Bern, Bern 3010, Switzerland
- Ellen van Maren
- Center for Experimental Neurology, Sleep Wake Epilepsy Center, NeuroTec, Department of Neurology, Inselspital, Bern University Hospital, University of Bern, Bern 3010, Switzerland
- Ida Boccalaro
- Center for Experimental Neurology, Sleep Wake Epilepsy Center, NeuroTec, Department of Neurology, Inselspital, Bern University Hospital, University of Bern, Bern 3010, Switzerland
- Antoine Adamantidis
- Center for Experimental Neurology, Sleep Wake Epilepsy Center, NeuroTec, Department of Neurology, Inselspital, Bern University Hospital, University of Bern, Bern 3010, Switzerland
- Lukas L Imbach
- Swiss Epilepsy Center, Klinik Lengg, Zurich 8008, Switzerland
- Kaspar Schindler
- Center for Experimental Neurology, Sleep Wake Epilepsy Center, NeuroTec, Department of Neurology, Inselspital, Bern University Hospital, University of Bern, Bern 3010, Switzerland
- Maxime O Baud
- Center for Experimental Neurology, Sleep Wake Epilepsy Center, NeuroTec, Department of Neurology, Inselspital, Bern University Hospital, University of Bern, Bern 3010, Switzerland
- Athina Tzovara
- Institute of Computer Science, University of Bern, Bern 3012, Switzerland
- Center for Experimental Neurology, Sleep Wake Epilepsy Center, NeuroTec, Department of Neurology, Inselspital, Bern University Hospital, University of Bern, Bern 3010, Switzerland
- Helen Wills Neuroscience Institute, University of California-Berkeley, Berkeley 94720, California
16
Setti F, Handjaras G, Bottari D, Leo A, Diano M, Bruno V, Tinti C, Cecchetti L, Garbarini F, Pietrini P, Ricciardi E. A modality-independent proto-organization of human multisensory areas. Nat Hum Behav 2023;7:397-410. PMID: 36646839; PMCID: PMC10038796; DOI: 10.1038/s41562-022-01507-3.
Abstract
The processing of multisensory information is based upon the capacity of brain regions, such as the superior temporal cortex, to combine information across modalities. However, it is still unclear whether the representation of coherent auditory and visual events requires any prior audiovisual experience to develop and function. Here we measured brain synchronization during the presentation of an audiovisual, audio-only or video-only version of the same narrative in distinct groups of sensory-deprived (congenitally blind and deaf) and typically developed individuals. Intersubject correlation analysis revealed that the superior temporal cortex was synchronized across auditory and visual conditions, even in sensory-deprived individuals who lack any audiovisual experience. This synchronization was primarily mediated by low-level perceptual features, and relied on a similar modality-independent topographical organization of slow temporal dynamics. The human superior temporal cortex is naturally endowed with a functional scaffolding to yield a common representation across multisensory events.
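Intersubject correlation of the kind described is often computed leave-one-out: correlate each subject's regional time course with the average of all others. A minimal sketch with assumed array shapes.

```python
import numpy as np

def leave_one_out_isc(timecourses):
    """timecourses: (n_subjects, n_timepoints) regional responses to
    the same narrative. Returns the mean leave-one-out intersubject
    correlation for that region."""
    n = timecourses.shape[0]
    iscs = []
    for i in range(n):
        others = np.delete(timecourses, i, axis=0).mean(axis=0)
        iscs.append(np.corrcoef(timecourses[i], others)[0, 1])
    return float(np.mean(iscs))
```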
Affiliation(s)
- Francesca Setti
- MoMiLab, IMT School for Advanced Studies Lucca, Lucca, Italy
- Davide Bottari
- MoMiLab, IMT School for Advanced Studies Lucca, Lucca, Italy
- Andrea Leo
- Department of Translational Research and Advanced Technologies in Medicine and Surgery, University of Pisa, Pisa, Italy
- Matteo Diano
- Department of Psychology, University of Turin, Turin, Italy
- Valentina Bruno
- Manibus Lab, Department of Psychology, University of Turin, Turin, Italy
- Carla Tinti
- Department of Psychology, University of Turin, Turin, Italy
- Luca Cecchetti
- MoMiLab, IMT School for Advanced Studies Lucca, Lucca, Italy
- Pietro Pietrini
- MoMiLab, IMT School for Advanced Studies Lucca, Lucca, Italy
17
Pierce ZP, Johnson ER, Kim IA, Lear BE, Mast AM, Black JM. Therapeutic interventions impact brain function and promote post-traumatic growth in adults living with post-traumatic stress disorder: A systematic review and meta-analysis of functional magnetic resonance imaging studies. Front Psychol 2023;14:1074972. PMID: 36844333; PMCID: PMC9948410; DOI: 10.3389/fpsyg.2023.1074972.
Abstract
Introduction
The present systematic review and meta-analysis explores the impacts of cognitive processing therapy (CPT), eye movement desensitization and reprocessing (EMDR), and prolonged exposure (PE) therapy on neural activity underlying the phenomenon of post-traumatic growth for adult trauma survivors.
Methods
We utilized the following databases to conduct our systematic search: Boston College Libraries, PubMed, MEDLINE, and PsycINFO. Our initial search yielded 834 studies for initial screening. We implemented seven eligibility criteria to vet articles for full-text review. Twenty-nine studies remained for full-text review after our systematic review process was completed. Studies were subjected to several levels of analysis. First, pre- and post-test post-traumatic growth inventory (PTGI) scores were collected from all studies and analyzed through a forest plot using Hedges' g. Next, Montreal Neurological Institute (MNI) coordinates and t-scores were collected and analyzed using an Activation Likelihood Estimation (ALE) to measure brain function. T-scores and Hedges' g values were then analyzed using Pearson correlations to determine if there were any relationships between brain function and post-traumatic growth for each modality. Lastly, all studies were subjected to a bubble plot and Egger's test to assess risk of publication bias across the review sample.
Results
Forest plot results indicated that all three interventions had a robust effect on PTGI scores. ALE meta-analysis results indicated that EMDR exhibited the largest effect on brain function, with the R thalamus (t = 4.23, p < 0.001) showing robust activation, followed closely by the R precuneus (t = 4.19, p < 0.001). Pearson correlation results showed that EMDR demonstrated the strongest correlation between increased brain function and PTGI scores (r = 0.910, p < 0.001). Qualitative review of the bubble plot indicated no obvious traces of publication bias, which was corroborated by the results of the Egger's test (p = 0.127).
Discussion
Our systematic review and meta-analysis showed that CPT, EMDR, and PE each exhibited a robust effect on PTG impacts across the course of treatment. However, when looking closer at comparative analyses of neural activity (ALE) and PTGI scores (Pearson correlation), EMDR exhibited a more robust effect on PTG impacts and brain function than CPT and PE.
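The effect size behind the forest plot, Hedges' g, can be written directly. This sketch uses the independent-groups pooled-SD formula with the small-sample correction, which may differ from the authors' exact handling of paired pre/post data.

```python
import numpy as np

def hedges_g(pre, post):
    """Hedges' g between pre- and post-treatment PTGI scores.
    (A fully paired analysis would also account for the pre-post
    correlation, not modeled here.)"""
    n1, n2 = len(pre), len(post)
    pooled_sd = np.sqrt(((n1 - 1) * np.var(pre, ddof=1) +
                         (n2 - 1) * np.var(post, ddof=1)) / (n1 + n2 - 2))
    d = (np.mean(post) - np.mean(pre)) / pooled_sd
    return d * (1 - 3 / (4 * (n1 + n2) - 9))  # small-sample bias correction
```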
Affiliation(s)
- Zachary P. Pierce
- School of Social Work, Boston College, Chestnut Hill, MA, United States
- The Cell to Society Laboratory, Chestnut Hill, MA, United States
- Emily R. Johnson
- School of Social Work, Boston College, Chestnut Hill, MA, United States
- The Cell to Society Laboratory, Chestnut Hill, MA, United States
- Isabelle A. Kim
- The Cell to Society Laboratory, Chestnut Hill, MA, United States
- Department of Neurology, Boston Children's Hospital, Harvard Medical School, Boston, MA, United States
- Brianna E. Lear
- The Cell to Society Laboratory, Chestnut Hill, MA, United States
- A. Michaela Mast
- School of Social Work, Boston College, Chestnut Hill, MA, United States
- The Cell to Society Laboratory, Chestnut Hill, MA, United States
- Jessica M. Black
- School of Social Work, Boston College, Chestnut Hill, MA, United States
- The Cell to Society Laboratory, Chestnut Hill, MA, United States
18
Zatorre RJ. Hemispheric asymmetries for music and speech: Spectrotemporal modulations and top-down influences. Front Neurosci 2022;16:1075511. PMID: 36605556; PMCID: PMC9809288; DOI: 10.3389/fnins.2022.1075511.
Abstract
Hemispheric asymmetries in auditory cognition have been recognized for a long time, but their neural basis is still debated. Here I focus on specialization for processing of speech and music, the two most important auditory communication systems that humans possess. A great deal of evidence from lesion studies and functional imaging suggests that aspects of music linked to the processing of pitch patterns depend more on right than left auditory networks. A complementary specialization for temporal resolution has been suggested for left auditory networks. These diverse findings can be integrated within the context of the spectrotemporal modulation framework, which has been developed as a way to characterize efficient neuronal encoding of complex sounds. Recent studies show that degradation of spectral modulation impairs melody perception but not speech content, whereas degradation of temporal modulation has the opposite effect. Neural responses in the right and left auditory cortex in those studies are linked to processing of spectral and temporal modulations, respectively. These findings provide a unifying model to understand asymmetries in terms of sensitivity to acoustical features of communication sounds in humans. However, this explanation does not account for evidence that asymmetries can shift as a function of learning, attention, or other top-down factors. Therefore, it seems likely that asymmetries arise both from bottom-up specialization for acoustical modulations and top-down influences coming from hierarchically higher components of the system. Such interactions can be understood in terms of predictive coding mechanisms for perception.
Collapse
|
19
|
Hullett PW, Kandahari N, Shih TT, Kleen JK, Knowlton RC, Rao VR, Chang EF. Intact speech perception after resection of dominant hemisphere primary auditory cortex for the treatment of medically refractory epilepsy: illustrative case. JOURNAL OF NEUROSURGERY. CASE LESSONS 2022; 4:CASE22417. [PMID: 36443954 PMCID: PMC9705521 DOI: 10.3171/case22417] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 09/26/2022] [Accepted: 10/27/2022] [Indexed: 11/29/2022]
Abstract
BACKGROUND: In classic speech network models, the primary auditory cortex is the source of auditory input to Wernicke's area in the posterior superior temporal gyrus (pSTG). Because resection of the primary auditory cortex in the dominant hemisphere removes inputs to the pSTG, there is a risk of speech impairment. However, recent research has shown the existence of other, nonprimary auditory cortex inputs to the pSTG, potentially reducing the risk of primary auditory cortex resection in the dominant hemisphere.
OBSERVATIONS: Here, the authors present a clinical case of a woman with severe medically refractory epilepsy with a lesional epileptic focus in the left (dominant) Heschl's gyrus. Analysis of neural responses to speech stimuli was consistent with primary auditory cortex localization to Heschl's gyrus. Although the primary auditory cortex was within the proposed resection margins, she underwent lesionectomy with total resection of Heschl's gyrus. Postoperatively, she had no speech deficits and her seizures were fully controlled.
LESSONS: While resection of the dominant hemisphere Heschl's gyrus/primary auditory cortex warrants caution, this case illustrates the ability to resect the primary auditory cortex without speech impairment and supports recent models of multiple parallel inputs to the pSTG.
Collapse
Affiliation(s)
- Patrick W. Hullett
- Department of Neurology, Weill Institute for Neurosciences, University of California San Francisco, San Francisco, California
| | - Nazineen Kandahari
- Department of Neurosurgery, University of California San Francisco, San Francisco, California; and Department of Neurology, Weill Institute for Neurosciences, University of California San Francisco, San Francisco, California
| | - Tina T. Shih
- Department of Neurology, Weill Institute for Neurosciences, University of California San Francisco, San Francisco, California
| | - Jonathan K. Kleen
- Department of Neurology, Weill Institute for Neurosciences, University of California San Francisco, San Francisco, California
| | - Robert C. Knowlton
- Department of Neurology, Weill Institute for Neurosciences, University of California San Francisco, San Francisco, California
| | - Vikram R. Rao
- Department of Neurology, Weill Institute for Neurosciences, University of California San Francisco, San Francisco, California
| | - Edward F. Chang
- Department of Neurosurgery, University of California San Francisco, San Francisco, California
| |
Collapse
|
20
|
A Combined Image- and Coordinate-Based Meta-Analysis of Whole-Brain Voxel-Based Morphometry Studies Investigating Subjective Tinnitus. Brain Sci 2022; 12:brainsci12091192. [PMID: 36138928 PMCID: PMC9496862 DOI: 10.3390/brainsci12091192] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/04/2022] [Revised: 08/28/2022] [Accepted: 08/29/2022] [Indexed: 11/17/2022] Open
Abstract
Previous voxel-based morphometry (VBM) studies investigating tinnitus have reported structural differences in a variety of spatially distinct gray matter regions. However, the results have been highly inconsistent and sometimes contradictory. In the current study, we conducted a combined image- and coordinate-based meta-analysis of VBM studies investigating tinnitus to identify robust gray matter differences associated with tinnitus, as well as examine the possible effects of hearing loss on the outcome of the meta-analysis. The PubMed and Web of Science databases were searched for studies published up to August 2021. Additional manual searches were conducted for studies published up to December 2021. A whole-brain meta-analysis was performed using Seed-Based d Mapping with Permutation of Subject Images (SDM-PSI). Fifteen studies comprising 423 individuals with tinnitus and either normal hearing or hearing loss (mean age 50.94 years; 173 females) and 508 individuals without tinnitus and either normal hearing or hearing loss (mean age 51.59 years; 234 females) met the inclusion criteria. We found a small but significant reduction in gray matter in the left inferior temporal gyrus for groups of normal hearing individuals with tinnitus compared to groups of hearing-matched individuals without tinnitus. In sharp contrast, in groups with hearing loss, tinnitus was associated with increased gray matter levels in the bilateral lingual gyrus and the bilateral precuneus. Those results were dependent upon matching the hearing levels between the groups with or without tinnitus. The current investigation suggests that hearing loss is the driving force of changes in cortical gray matter across individuals with and without tinnitus. Future studies should carefully account for confounders, including hearing loss, hyperacusis, anxiety, and depression, to identify gray matter changes specifically related to tinnitus. Ultimately, the aggregation of standardized individual datasets with both anatomical and useful phenotypical information will permit a better understanding of tinnitus-related gray matter differences, the effects of potential comorbidities, and their interactions with tinnitus.
Collapse
|
21
|
Azaiez N, Loberg O, Hämäläinen JA, Leppänen PHT. Brain Source Correlates of Speech Perception and Reading Processes in Children With and Without Reading Difficulties. Front Neurosci 2022; 16:921977. [PMID: 35928008 PMCID: PMC9344064 DOI: 10.3389/fnins.2022.921977] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/17/2022] [Accepted: 06/20/2022] [Indexed: 11/13/2022] Open
Abstract
Neural correlates in reading and speech processing have been addressed extensively in the literature. While reading skills and speech perception have been shown to be associated with each other, their relationship remains debatable. In this study, we investigated reading skills, speech perception, reading, and their correlates with brain source activity in auditory and visual modalities. We used high-density event-related potentials (ERPs), fixation-related potentials (FRPs), and the source reconstruction method. The analysis was conducted on 12–13-year-old schoolchildren who had different reading levels. Brain ERP source indices were computed from frequently repeated Finnish speech stimuli presented in an auditory oddball paradigm. Brain FRP source indices were also computed for words within sentences presented in a reading task. The results showed significant correlations between speech ERP sources and reading scores at the P100 (P1) time range in the left hemisphere and the N250 time range in both hemispheres, and a weaker correlation for visual word processing N170 FRP source(s) in the posterior occipital areas, in the vicinity of the visual word form areas (VWFA). Furthermore, significant brain-to-brain correlations were found between the two modalities, where the speech brain sources of the P1 and N250 responses correlated with the reading N170 response. The results suggest that speech processes are linked to reading fluency and that brain activations to speech are linked to visual brain processes of reading. These results indicate that a relationship between language and reading systems is present even after several years of exposure to print.
Collapse
Affiliation(s)
- Najla Azaiez
- Department of Psychology, Faculty of Education and Psychology, University of Jyväskylä, Jyväskylä, Finland
- Correspondence: Najla Azaiez; orcid.org/0000-0002-7525-3745
| | - Otto Loberg
- Department of Psychology, Faculty of Science and Technology, Bournemouth University, Bournemouth, United Kingdom
| | - Jarmo A. Hämäläinen
- Department of Psychology, Faculty of Education and Psychology, University of Jyväskylä, Jyväskylä, Finland
- Department of Psychology, Jyväskylä Center for Interdisciplinary Brain Research, University of Jyväskylä, Jyväskylä, Finland
| | - Paavo H. T. Leppänen
- Department of Psychology, Faculty of Education and Psychology, University of Jyväskylä, Jyväskylä, Finland
- Department of Psychology, Jyväskylä Center for Interdisciplinary Brain Research, University of Jyväskylä, Jyväskylä, Finland
| |
Collapse
|
22
|
Chalas N, Daube C, Kluger DS, Abbasi O, Nitsch R, Gross J. Multivariate analysis of speech envelope tracking reveals coupling beyond auditory cortex. Neuroimage 2022; 258:119395. [PMID: 35718023 DOI: 10.1016/j.neuroimage.2022.119395] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/08/2022] [Revised: 05/16/2022] [Accepted: 06/14/2022] [Indexed: 11/19/2022] Open
Abstract
The systematic alignment of low-frequency brain oscillations with the acoustic speech envelope signal is well established and has been proposed to be crucial for actively perceiving speech. Previous studies investigating speech-brain coupling in source space are restricted to univariate pairwise approaches between brain and speech signals, and therefore speech tracking information in frequency-specific communication channels might be lacking. To address this, we propose a novel multivariate framework for estimating speech-brain coupling where neural variability from source-derived activity is taken into account along with the rate of envelope's amplitude change (derivative). We applied it in magnetoencephalographic (MEG) recordings while human participants (male and female) listened to one hour of continuous naturalistic speech, showing that a multivariate approach outperforms the corresponding univariate method in low- and high frequencies across frontal, motor, and temporal areas. Systematic comparisons revealed that the gain in low frequencies (0.6 - 0.8 Hz) was related to the envelope's rate of change whereas in higher frequencies (from 0.8 to 10 Hz) it was mostly related to the increased neural variability from source-derived cortical areas. Furthermore, following a non-negative matrix factorization approach we found distinct speech-brain components across time and cortical space related to speech processing. We confirm that speech envelope tracking operates mainly in two timescales (δ and θ frequency bands) and we extend those findings showing shorter coupling delays in auditory-related components and longer delays in higher-association frontal and motor components, indicating temporal differences of speech tracking and providing implications for hierarchical stimulus-driven speech processing.
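The multivariate idea can be illustrated in miniature: relate several source-level signals jointly to both the speech envelope and its rate of change. The sketch below uses canonical correlation analysis as a generic stand-in for the paper's exact estimator; all signals are synthetic placeholders.

```python
# A hedged sketch of multivariate speech-brain coupling via CCA.
import numpy as np
from scipy.signal import hilbert
from sklearn.cross_decomposition import CCA

rng = np.random.default_rng(0)
fs, dur = 100, 60                           # 100 Hz, 60 s of "data"
n = fs * dur

speech = rng.standard_normal(n)             # placeholder "speech"
envelope = np.abs(hilbert(speech))
derivative = np.gradient(envelope) * fs     # rate of envelope change

# Fake multivariate source activity: noisy mixtures of both features.
sources = (np.outer(envelope - envelope.mean(), [1.0, 0.2, 0.0])
           + np.outer(derivative / derivative.std(), [0.0, 0.5, 1.0])
           + 0.5 * rng.standard_normal((n, 3)))

X = np.column_stack([envelope, derivative])  # joint stimulus feature space
cca = CCA(n_components=1)
cca.fit(X, sources)
x_c, y_c = cca.transform(X, sources)
print("First canonical correlation:",
      round(float(np.corrcoef(x_c[:, 0], y_c[:, 0])[0, 1]), 2))
```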
Collapse
Affiliation(s)
- Nikos Chalas
- Institute for Biomagnetism and Biosignal Analysis, University of Münster, Münster, Germany; Otto-Creutzfeldt-Center for Cognitive and Behavioral Neuroscience, University of Münster, Münster, Germany.
| | - Christoph Daube
- Centre for Cognitive Neuroimaging, University of Glasgow, Glasgow, UK
| | - Daniel S Kluger
- Institute for Biomagnetism and Biosignal Analysis, University of Münster, Münster, Germany; Otto-Creutzfeldt-Center for Cognitive and Behavioral Neuroscience, University of Münster, Münster, Germany
| | - Omid Abbasi
- Institute for Biomagnetism and Biosignal Analysis, University of Münster, Münster, Germany
| | - Robert Nitsch
- Institute for Translational Neuroscience, University of Münster, Münster, Germany
| | - Joachim Gross
- Institute for Biomagnetism and Biosignal Analysis, University of Münster, Münster, Germany; Otto-Creutzfeldt-Center for Cognitive and Behavioral Neuroscience, University of Münster, Münster, Germany
| |
Collapse
|
23
|
DIANA, a Process-Oriented Model of Human Auditory Word Recognition. Brain Sci 2022; 12:brainsci12050681. [PMID: 35625067 PMCID: PMC9140177 DOI: 10.3390/brainsci12050681] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/17/2022] [Revised: 05/05/2022] [Accepted: 05/10/2022] [Indexed: 02/04/2023] Open
Abstract
This article presents DIANA, a new, process-oriented model of human auditory word recognition, which takes as its input the acoustic signal and can produce as its output word identifications and lexicality decisions, as well as reaction times. This makes it possible to compare its output with human listeners’ behavior in psycholinguistic experiments. DIANA differs from existing models in that it takes more available neuro-physiological evidence on speech processing into account. For instance, DIANA accounts for the effect of ambiguity in the acoustic signal on reaction times following the Hick–Hyman law and it interprets the acoustic signal in the form of spectro-temporal receptive fields, which are attested in the human superior temporal gyrus, instead of in the form of abstract phonological units. The model consists of three components: activation, decision and execution. The activation and decision components are described in detail, both at the conceptual level (in the running text) and at the computational level (in the Appendices). While the activation component is independent of the listener’s task, the functioning of the decision component depends on this task. The article also describes how DIANA could be improved in the future in order to even better resemble the behavior of human listeners.
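One ingredient of the model is easy to make explicit: the Hick–Hyman law predicts that reaction time grows linearly with the entropy of the competing response alternatives. The toy function below shows that relation; the intercept and slope constants are arbitrary illustration values, not DIANA's fitted parameters.

```python
# A toy illustration of the Hick-Hyman law: RT = a + b * H(candidates).
import numpy as np

def hick_hyman_rt(probs, a=0.20, b=0.15):
    """RT in seconds, with H the Shannon entropy (bits) of the candidates."""
    p = np.asarray(probs, dtype=float)
    p = p / p.sum()
    p = p[p > 0]
    h = -np.sum(p * np.log2(p))
    return a + b * h

print(hick_hyman_rt([1.0]))          # unambiguous input: fastest RT (0.20 s)
print(hick_hyman_rt([0.5, 0.5]))     # two equal candidates: 0.35 s
print(hick_hyman_rt([0.25] * 4))     # four candidates: 0.50 s
```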
Collapse
|
24
|
Long-term priors constrain category learning in the context of short-term statistical regularities. Psychon Bull Rev 2022; 29:1925-1937. [PMID: 35524011 DOI: 10.3758/s13423-022-02114-z] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 04/27/2022] [Indexed: 11/08/2022]
Abstract
Cognitive systems face a constant tension of maintaining existing representations that have been fine-tuned to long-term input regularities and adapting representations to meet the needs of short-term input that may deviate from long-term norms. Systems must balance the stability of long-term representations with plasticity to accommodate novel contexts. We investigated the interaction between perceptual biases or priors acquired across the long-term and sensitivity to statistical regularities introduced in the short-term. Participants were first passively exposed to short-term acoustic regularities and then learned categories in a supervised training task that either conflicted or aligned with long-term perceptual priors. We found that the long-term priors had robust and pervasive impact on categorization behavior. In contrast, behavior was not influenced by the nature of the short-term passive exposure. These results demonstrate that perceptual priors place strong constraints on the course of learning and that short-term passive exposure to acoustic regularities has limited impact on directing subsequent category learning.
Collapse
|
25
|
Norman-Haignere SV, Long LK, Devinsky O, Doyle W, Irobunda I, Merricks EM, Feldstein NA, McKhann GM, Schevon CA, Flinker A, Mesgarani N. Multiscale temporal integration organizes hierarchical computation in human auditory cortex. Nat Hum Behav 2022; 6:455-469. [PMID: 35145280 PMCID: PMC8957490 DOI: 10.1038/s41562-021-01261-y] [Citation(s) in RCA: 24] [Impact Index Per Article: 12.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/30/2020] [Accepted: 11/18/2021] [Indexed: 01/11/2023]
Abstract
To derive meaning from sound, the brain must integrate information across many timescales. What computations underlie multiscale integration in human auditory cortex? Evidence suggests that auditory cortex analyses sound using both generic acoustic representations (for example, spectrotemporal modulation tuning) and category-specific computations, but the timescales over which these putatively distinct computations integrate remain unclear. To answer this question, we developed a general method to estimate sensory integration windows-the time window when stimuli alter the neural response-and applied our method to intracranial recordings from neurosurgical patients. We show that human auditory cortex integrates hierarchically across diverse timescales spanning from ~50 to 400 ms. Moreover, we find that neural populations with short and long integration windows exhibit distinct functional properties: short-integration electrodes (less than ~200 ms) show prominent spectrotemporal modulation selectivity, while long-integration electrodes (greater than ~200 ms) show prominent category selectivity. These findings reveal how multiscale integration organizes auditory computation in the human brain.
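The logic behind integration-window estimation can be shown with a toy simulation: a model neuron that integrates over a fixed window gives context-invariant responses to stimulus segments longer than that window, and context-dependent responses to shorter ones. The sketch below is our own simplified rendering of that idea, not the paper's estimator.

```python
# A hedged sketch of cross-context testing for an integration window.
import numpy as np

rng = np.random.default_rng(1)
fs = 1000                                      # 1 kHz sample rate
kernel = np.ones(int(0.2 * fs)) / (0.2 * fs)   # 200 ms integration window

def response(stim):
    return np.convolve(stim, kernel, mode="same")

def cross_context_corr(seg_ms, n_seg=200):
    """Correlate responses to identical segments in two random orders."""
    L = int(seg_ms / 1000 * fs)
    segments = rng.standard_normal((n_seg, L))
    order_a, order_b = rng.permutation(n_seg), rng.permutation(n_seg)
    resp_a = response(segments[order_a].ravel())
    resp_b = response(segments[order_b].ravel())
    # Re-align responses segment-by-segment before correlating.
    ra = resp_a.reshape(n_seg, L)[np.argsort(order_a)]
    rb = resp_b.reshape(n_seg, L)[np.argsort(order_b)]
    return np.corrcoef(ra.ravel(), rb.ravel())[0, 1]

for seg in (50, 100, 200, 400, 800):           # segment durations in ms
    print(f"{seg:4d} ms segments -> cross-context correlation "
          f"{cross_context_corr(seg):.2f}")
```

Segments much longer than the 200 ms window yield near-identical responses across contexts, which is the signature the method exploits.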
Collapse
Affiliation(s)
- Sam V Norman-Haignere
- Zuckerman Mind, Brain, Behavior Institute, Columbia University; HHMI Postdoctoral Fellow of the Life Sciences Research Foundation
| | - Laura K. Long
- Zuckerman Mind, Brain, Behavior Institute, Columbia University; Doctoral Program in Neurobiology and Behavior, Columbia University
| | - Orrin Devinsky
- Department of Neurology, NYU Langone Medical Center; Comprehensive Epilepsy Center, NYU Langone Medical Center
| | - Werner Doyle
- Comprehensive Epilepsy Center, NYU Langone Medical Center; Department of Neurosurgery, NYU Langone Medical Center
| | - Ifeoma Irobunda
- Department of Neurology, Columbia University Irving Medical Center
| | | | - Neil A. Feldstein
- Department of Neurological Surgery, Columbia University Irving Medical Center
| | - Guy M. McKhann
- Department of Neurological Surgery, Columbia University Irving Medical Center
| | | | - Adeen Flinker
- Department of Neurology, NYU Langone Medical Center; Comprehensive Epilepsy Center, NYU Langone Medical Center; Department of Biomedical Engineering, NYU Tandon School of Engineering
| | - Nima Mesgarani
- Zuckerman Mind, Brain, Behavior Institute, Columbia University; Doctoral Program in Neurobiology and Behavior, Columbia University; Department of Electrical Engineering, Columbia University
| |
Collapse
|
26
|
Conroy C, Byrne AJ, Kidd G. Forward masking of spectrotemporal modulation detection. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2022; 151:1181. [PMID: 35232084 PMCID: PMC8865928 DOI: 10.1121/10.0009404] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/16/2021] [Revised: 01/14/2022] [Accepted: 01/15/2022] [Indexed: 06/14/2023]
Abstract
Recent work has suggested that there may be specialized mechanisms in the auditory system for coding spectrotemporal modulations (STMs), tuned to different combinations of spectral modulation frequency, temporal modulation frequency, and STM sweep direction. The current study sought evidence of such mechanisms using a psychophysical forward masking paradigm. The detectability of a target comprising upward sweeping STMs was measured following the presentation of modulated maskers applied to the same carrier. Four maskers were tested, which had either (1) the same spectral modulation frequency as the target but a flat temporal envelope, (2) the same temporal modulation frequency as the target but a flat spectral envelope, (3) the same spectral and temporal modulation frequencies as the target but the opposite sweep direction (downward sweeping STMs), or (4) the same spectral and temporal modulation frequencies as the target and the same sweep direction (upward sweeping STMs). Forward masking was greatest for the masker fully matched to the target (4), intermediate for the masker with the opposite sweep direction (3), and negligible for the other two (1, 2). These findings are consistent with the suggestion that the detectability of the target was mediated by an STM-specific coding mechanism with sweep-direction selectivity.
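For concreteness, the sketch below synthesizes the kind of stimulus these experiments use: a "moving ripple" whose spectral envelope drifts across log-frequency at a chosen temporal rate (Hz) and spectral density (cycles/octave). The parameter values and the sign convention for sweep direction are illustrative choices, not the study's exact settings.

```python
# A minimal sketch of moving-ripple (STM) stimulus synthesis.
import numpy as np

fs = 44100
dur = 1.0
t = np.arange(int(fs * dur)) / fs

n_tones = 60
f_lo, f_hi = 250.0, 4000.0
freqs = f_lo * (f_hi / f_lo) ** (np.arange(n_tones) / (n_tones - 1))
x_oct = np.log2(freqs / f_lo)        # tone positions in octaves

w = 4.0          # temporal modulation rate (Hz)
omega = 1.0      # spectral modulation rate (cycles/octave)
m = 0.9          # modulation depth
direction = +1   # +1 / -1 flip the sweep direction (illustrative convention)

rng = np.random.default_rng(0)
phases = 2 * np.pi * rng.random(n_tones)   # random carrier phases
ripple = np.zeros_like(t)
for f, x, ph in zip(freqs, x_oct, phases):
    env = 1.0 + m * np.sin(2 * np.pi * (w * t - direction * omega * x))
    ripple += env * np.sin(2 * np.pi * f * t + ph)
ripple /= np.max(np.abs(ripple))
print(f"Synthesized {dur} s ripple at {w} Hz, {omega} cyc/oct")
```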
Collapse
Affiliation(s)
- Christopher Conroy
- Department of Speech, Language & Hearing Sciences and Hearing Research Center, Boston University, 635 Commonwealth Avenue, Boston, Massachusetts 02215, USA
| | - Andrew J Byrne
- Department of Speech, Language & Hearing Sciences and Hearing Research Center, Boston University, 635 Commonwealth Avenue, Boston, Massachusetts 02215, USA
| | - Gerald Kidd
- Department of Speech, Language & Hearing Sciences and Hearing Research Center, Boston University, 635 Commonwealth Avenue, Boston, Massachusetts 02215, USA
| |
Collapse
|
27
|
Monahan PJ, Schertz J, Fu Z, Pérez A. Unified Coding of Spectral and Temporal Phonetic Cues: Electrophysiological Evidence for Abstract Phonological Features. J Cogn Neurosci 2022; 34:618-638. [DOI: 10.1162/jocn_a_01817] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/04/2022]
Abstract
Spoken word recognition models and phonological theory propose that abstract features play a central role in speech processing. It remains unknown, however, whether auditory cortex encodes linguistic features in a manner beyond the phonetic properties of the speech sounds themselves. We took advantage of the fact that English phonology functionally codes stops and fricatives as voiced or voiceless with two distinct phonetic cues: Fricatives use a spectral cue, whereas stops use a temporal cue. Evidence that these cues can be grouped together would indicate the disjunctive coding of distinct phonetic cues into a functionally defined abstract phonological feature. In English, the voicing feature, which distinguishes the consonants [s] and [t] from [z] and [d], respectively, is hypothesized to be specified only for voiceless consonants (e.g., [s t]). Here, participants listened to syllables in a many-to-one oddball design, while their EEG was recorded. In one block, both voiceless stops and fricatives were the standards. In the other block, both voiced stops and fricatives were the standards. A critical design element was the presence of intercategory variation within the standards. Therefore, a many-to-one relationship, which is necessary to elicit an MMN, existed only if the stop and fricative standards were grouped together. In addition to the ERPs, event-related spectral power was also analyzed. Results showed an MMN effect in the voiceless standards block—an asymmetric MMN—in a time window consistent with processing in auditory cortex, as well as increased prestimulus beta-band oscillatory power to voiceless standards. These findings suggest that (i) there is an auditory memory trace of the standards based on the shared (voiceless) feature, which is only functionally defined; (ii) voiced consonants are underspecified; and (iii) features can serve as a basis for predictive processing. Taken together, these results point toward auditory cortex's ability to functionally code distinct phonetic cues together and suggest that abstract features can be used to parse the continuous acoustic signal.
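The core MMN computation described above reduces to a difference wave: average the standard and deviant epochs, subtract, and summarize a post-stimulus latency window. The sketch below shows that step on random placeholder epochs; the window edges and sampling rate are conventional choices, not the study's exact parameters.

```python
# A hedged sketch of the deviant-minus-standard MMN difference wave.
import numpy as np

rng = np.random.default_rng(2)
fs = 500                                     # EEG sample rate (Hz)
times = np.arange(-0.1, 0.5, 1 / fs)         # epoch from -100 to 500 ms

standards = rng.standard_normal((400, times.size))   # trials x samples
deviants = rng.standard_normal((80, times.size))

erp_std = standards.mean(axis=0)
erp_dev = deviants.mean(axis=0)
mmn = erp_dev - erp_std                      # difference wave

win = (times >= 0.15) & (times <= 0.25)      # a classic MMN latency window
print(f"Mean MMN amplitude 150-250 ms: {mmn[win].mean():.3f} (a.u.)")
```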
Collapse
Affiliation(s)
| | | | - Zhanao Fu
- Cambridge University, United Kingdom
| | - Alejandro Pérez
- University of Toronto Scarborough, Ontario, Canada
- Cambridge University, United Kingdom
| |
Collapse
|
28
|
Al-Zubaidi A, Bräuer S, Holdgraf CR, Schepers IM, Rieger JW. OUP accepted manuscript. Cereb Cortex Commun 2022; 3:tgac007. [PMID: 35281216 PMCID: PMC8914075 DOI: 10.1093/texcom/tgac007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/02/2020] [Revised: 01/26/2022] [Accepted: 01/29/2022] [Indexed: 11/14/2022] Open
Affiliation(s)
- Arkan Al-Zubaidi
- Applied Neurocognitive Psychology Lab and Cluster of Excellence Hearing4all, Oldenburg University, Oldenburg, Germany
- Research Center Neurosensory Science, Oldenburg University, 26129 Oldenburg, Germany
| | - Susann Bräuer
- Applied Neurocognitive Psychology Lab and Cluster of Excellence Hearing4all, Oldenburg University, Oldenburg, Germany
| | - Chris R Holdgraf
- Department of Statistics, UC Berkeley, Berkeley, CA 94720, USA
- International Interactive Computing Collaboration
| | - Inga M Schepers
- Applied Neurocognitive Psychology Lab and Cluster of Excellence Hearing4all, Oldenburg University, Oldenburg, Germany
| | - Jochem W Rieger
- Department of Psychology, Faculty VI, Oldenburg University, 26129 Oldenburg, Germany (corresponding author)
| |
Collapse
|
29
|
Miceli G, Caccia A. Cortical disorders of speech processing: Pure word deafness and auditory agnosia. HANDBOOK OF CLINICAL NEUROLOGY 2022; 187:69-87. [PMID: 35964993 DOI: 10.1016/b978-0-12-823493-8.00005-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/15/2023]
Abstract
Selective disorders of auditory speech processing due to brain lesions are reviewed. Over 120 years after the first anatomic report (Dejerine and Sérieux, 1898), fewer than 80 cumulative cases of generalized auditory agnosia and pure word deafness with documented brain lesions are on record. Most patients (approximately 70%) had vascular lesions. Damage is very frequently bilateral in generalized auditory agnosia, and more frequently unilateral in pure word deafness. In unilateral cases, anatomical disconnection is not a prerequisite, and disorders may be due to functional disconnection. Regardless of whether lesions are unilateral or bilateral, speech processing difficulties emerge in the presence of damage to the superior temporal regions of the language-dominant hemisphere, suggesting that speech input is processed asymmetrically at early stages already. Extant evidence does not allow establishing whether processing asymmetry originates in the primary auditory cortex or in higher associative cortices, nor whether auditory processing in the brainstem is entirely symmetric. Results are consistent with the view that the difficulty in processing auditory input characterized by quick spectral and/or temporal changes is one of the critical dimensions of the disorder. Forthcoming studies should focus on detailed audiologic, neurolinguistic, and neuroanatomic descriptions of each case.
Collapse
Affiliation(s)
- Gabriele Miceli
- Center for Mind/Brain Sciences - CIMeC, University of Trento, Rovereto, Italy; Centro Interdisciplinare Linceo 'Beniamino Segre'-Accademia dei Lincei, Rome, Italy.
| | - Antea Caccia
- Center for Mind/Brain Sciences - CIMeC, University of Trento, Rovereto, Italy; Department of Psychology, University of Milano-Bicocca, Milan, Italy
| |
Collapse
|
30
|
Fuglsang SA, Madsen KH, Puonti O, Hjortkjær J, Siebner HR. Mapping cortico-subcortical sensitivity to 4 Hz amplitude modulation depth in human auditory system with functional MRI. Neuroimage 2021; 246:118745. [PMID: 34808364 DOI: 10.1016/j.neuroimage.2021.118745] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/14/2021] [Revised: 11/17/2021] [Accepted: 11/18/2021] [Indexed: 10/19/2022] Open
Abstract
Temporal modulations in the envelope of acoustic waveforms at rates around 4 Hz constitute a strong acoustic cue in speech and other natural sounds. It is often assumed that the ascending auditory pathway is increasingly sensitive to slow amplitude modulation (AM), but sensitivity to AM is typically considered separately for individual stages of the auditory system. Here, we used blood oxygen level dependent (BOLD) fMRI in twenty human subjects (10 male) to measure sensitivity of regional neural activity in the auditory system to 4 Hz temporal modulations. Participants were exposed to AM noise stimuli varying parametrically in modulation depth to characterize modulation-depth effects on BOLD responses. A Bayesian hierarchical modeling approach was used to model potentially nonlinear relations between AM depth and group-level BOLD responses in auditory regions of interest (ROIs). Sound stimulation activated the auditory brainstem and cortex structures in single subjects. BOLD responses to noise exposure in core and belt auditory cortices scaled positively with modulation depth. This finding was corroborated by whole-brain cluster-level inference. Sensitivity to AM depth variations was particularly pronounced in Heschl's gyrus but was also found in higher-order auditory cortical regions. None of the sound-responsive subcortical auditory structures showed a BOLD response profile that reflected the parametric variation in AM depth. The results are compatible with the notion that early auditory cortical regions play a key role in processing low-rate modulation content of sounds in the human auditory system.
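The stimulus manipulation is straightforward to sketch: broadband noise multiplied by a 4 Hz sinusoidal envelope whose depth varies parametrically, normalized so that overall level stays constant across depths. The values below are illustrative, not the study's exact parameters.

```python
# A minimal sketch of AM-noise stimuli with parametric modulation depth.
import numpy as np

fs = 44100
dur = 2.0
t = np.arange(int(fs * dur)) / fs
fm = 4.0                                   # modulation rate (Hz)

rng = np.random.default_rng(3)
carrier = rng.standard_normal(t.size)      # broadband noise carrier

def am_noise(depth):
    env = 1.0 + depth * np.sin(2 * np.pi * fm * t)
    x = env * carrier
    return x / np.sqrt(np.mean(x ** 2))    # RMS-normalize across depths

stimuli = {m: am_noise(m) for m in (0.0, 0.25, 0.5, 0.75, 1.0)}
print("Generated AM-noise stimuli at depths:", sorted(stimuli))
```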
Collapse
Affiliation(s)
- Søren A Fuglsang
- Danish Research Centre for Magnetic Resonance, Centre for Functional and Diagnostic Imaging and Research, Copenhagen University Hospital Amager and Hvidovre, Hvidovre, Denmark.
| | - Kristoffer H Madsen
- Danish Research Centre for Magnetic Resonance, Centre for Functional and Diagnostic Imaging and Research, Copenhagen University Hospital Amager and Hvidovre, Hvidovre, Denmark; Department of Applied Mathematics and Computer Science, Technical University of Denmark, Kgs. Lyngby, Denmark
| | - Oula Puonti
- Danish Research Centre for Magnetic Resonance, Centre for Functional and Diagnostic Imaging and Research, Copenhagen University Hospital Amager and Hvidovre, Hvidovre, Denmark; Department of Health Technology, Technical University of Denmark, Kgs. Lyngby, Denmark
| | - Jens Hjortkjær
- Danish Research Centre for Magnetic Resonance, Centre for Functional and Diagnostic Imaging and Research, Copenhagen University Hospital Amager and Hvidovre, Hvidovre, Denmark; Department of Health Technology, Technical University of Denmark, Kgs. Lyngby, Denmark
| | - Hartwig R Siebner
- Danish Research Centre for Magnetic Resonance, Centre for Functional and Diagnostic Imaging and Research, Copenhagen University Hospital Amager and Hvidovre, Hvidovre, Denmark; Department of Neurology, Copenhagen University Hospital Bispebjerg and Frederiksberg, Copenhagen, Denmark; Department of Clinical Medicine, Faculty of Medical and Health Sciences, University of Copenhagen, Copenhagen, Denmark
| |
Collapse
|
31
|
Bhaya-Grossman I, Chang EF. Speech Computations of the Human Superior Temporal Gyrus. Annu Rev Psychol 2022;73:79-102.
Abstract
Human speech perception results from neural computations that transform external acoustic speech signals into internal representations of words. The superior temporal gyrus (STG) contains the nonprimary auditory cortex and is a critical locus for phonological processing. Here, we describe how speech sound representation in the STG relies on fundamentally nonlinear and dynamical processes, such as categorization, normalization, contextual restoration, and the extraction of temporal structure. A spatial mosaic of local cortical sites on the STG exhibits complex auditory encoding for distinct acoustic-phonetic and prosodic features. We propose that as a population ensemble, these distributed patterns of neural activity give rise to abstract, higher-order phonemic and syllabic representations that support speech perception. This review presents a multi-scale, recurrent model of phonological processing in the STG, highlighting the critical interface between auditory and language systems.
Collapse
Affiliation(s)
- Ilina Bhaya-Grossman
- Department of Neurological Surgery, University of California, San Francisco, California 94143, USA; .,Joint Graduate Program in Bioengineering, University of California, Berkeley and San Francisco, California 94720, USA
| | - Edward F Chang
- Department of Neurological Surgery, University of California, San Francisco, California 94143, USA;
| |
Collapse
|
32
|
Karthik G, Plass J, Beltz AM, Liu Z, Grabowecky M, Suzuki S, Stacey WC, Wasade VS, Towle VL, Tao JX, Wu S, Issa NP, Brang D. Visual speech differentially modulates beta, theta, and high gamma bands in auditory cortex. Eur J Neurosci 2021; 54:7301-7317. [PMID: 34587350 DOI: 10.1111/ejn.15482] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/01/2021] [Revised: 08/20/2021] [Accepted: 08/28/2021] [Indexed: 12/13/2022]
Abstract
Speech perception is a central component of social communication. Although principally an auditory process, accurate speech perception in everyday settings is supported by meaningful information extracted from visual cues. Visual speech modulates activity in cortical areas subserving auditory speech perception including the superior temporal gyrus (STG). However, it is unknown whether visual modulation of auditory processing is a unitary phenomenon or, rather, consists of multiple functionally distinct processes. To explore this question, we examined neural responses to audiovisual speech measured from intracranially implanted electrodes in 21 patients with epilepsy. We found that visual speech modulated auditory processes in the STG in multiple ways, eliciting temporally and spatially distinct patterns of activity that differed across frequency bands. In the theta band, visual speech suppressed the auditory response from before auditory speech onset to after auditory speech onset (-93 to 500 ms) most strongly in the posterior STG. In the beta band, suppression was seen in the anterior STG from -311 to -195 ms before auditory speech onset and in the middle STG from -195 to 235 ms after speech onset. In high gamma, visual speech enhanced the auditory response from -45 to 24 ms only in the posterior STG. We interpret the visual-induced changes prior to speech onset as reflecting crossmodal prediction of speech signals. In contrast, modulations after sound onset may reflect a decrease in sustained feedforward auditory activity. These results are consistent with models that posit multiple distinct mechanisms supporting audiovisual speech perception.
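The band-specific analysis described above rests on a standard operation: band-pass filtering followed by a Hilbert envelope. The sketch below applies it to a synthetic trace; the band edges are common conventions and may differ from the study's exact definitions.

```python
# A hedged sketch of band-limited power extraction via the Hilbert transform.
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

fs = 1000
t = np.arange(0, 2.0, 1 / fs)
rng = np.random.default_rng(4)
ieeg = rng.standard_normal(t.size)          # placeholder STG recording

bands = {"theta": (4, 8), "beta": (13, 30), "high gamma": (70, 150)}

def band_envelope(x, lo, hi):
    b, a = butter(4, [lo, hi], btype="bandpass", fs=fs)
    return np.abs(hilbert(filtfilt(b, a, x)))

for name, (lo, hi) in bands.items():
    env = band_envelope(ieeg, lo, hi)
    print(f"{name:10s} ({lo:3d}-{hi:3d} Hz): mean envelope {env.mean():.3f}")
```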
Collapse
Affiliation(s)
- G Karthik
- Department of Psychology, University of Michigan, Ann Arbor, Michigan, USA
| | - John Plass
- Department of Psychology, University of Michigan, Ann Arbor, Michigan, USA
| | - Adriene M Beltz
- Department of Psychology, University of Michigan, Ann Arbor, Michigan, USA
| | - Zhongming Liu
- Department of Biomedical Engineering and Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, Michigan, USA
| | - Marcia Grabowecky
- Department of Psychology, Northwestern University, Evanston, Illinois, USA
| | - Satoru Suzuki
- Department of Psychology, Northwestern University, Evanston, Illinois, USA
| | - William C Stacey
- Department of Neurology and Department of Biomedical Engineering, University of Michigan, Ann Arbor, Michigan, USA
| | - Vibhangini S Wasade
- Department of Neurology, Henry Ford Hospital, Detroit, Michigan, USA; Department of Neurology, Wayne State University School of Medicine, Detroit, Michigan, USA
| | - Vernon L Towle
- Department of Neurology, The University of Chicago, Chicago, Illinois, USA
| | - James X Tao
- Department of Neurology, The University of Chicago, Chicago, Illinois, USA
| | - Shasha Wu
- Department of Neurology, The University of Chicago, Chicago, Illinois, USA
| | - Naoum P Issa
- Department of Neurology, The University of Chicago, Chicago, Illinois, USA
| | - David Brang
- Department of Psychology, University of Michigan, Ann Arbor, Michigan, USA
| |
Collapse
|
33
|
Hamilton LS, Oganian Y, Hall J, Chang EF. Parallel and distributed encoding of speech across human auditory cortex. Cell 2021; 184:4626-4639.e13. [PMID: 34411517 DOI: 10.1016/j.cell.2021.07.019] [Citation(s) in RCA: 77] [Impact Index Per Article: 25.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/15/2020] [Revised: 02/11/2021] [Accepted: 07/19/2021] [Indexed: 12/27/2022]
Abstract
Speech perception is thought to rely on a cortical feedforward serial transformation of acoustic into linguistic representations. Using intracranial recordings across the entire human auditory cortex, electrocortical stimulation, and surgical ablation, we show that cortical processing across areas is not consistent with a serial hierarchical organization. Instead, response latency and receptive field analyses demonstrate parallel and distinct information processing in the primary and nonprimary auditory cortices. This functional dissociation was also observed where stimulation of the primary auditory cortex evokes auditory hallucination but does not distort or interfere with speech perception. Opposite effects were observed during stimulation of nonprimary cortex in superior temporal gyrus. Ablation of the primary auditory cortex does not affect speech perception. These results establish a distributed functional organization of parallel information processing throughout the human auditory cortex and demonstrate an essential independent role for nonprimary auditory cortex in speech processing.
Collapse
Affiliation(s)
- Liberty S Hamilton
- Department of Neurological Surgery, University of California, San Francisco, 675 Nelson Rising Lane, San Francisco, CA 94158, USA
| | - Yulia Oganian
- Department of Neurological Surgery, University of California, San Francisco, 675 Nelson Rising Lane, San Francisco, CA 94158, USA
| | - Jeffery Hall
- Department of Neurology and Neurosurgery, McGill University Montreal Neurological Institute, Montreal, QC, H3A 2B4, Canada
| | - Edward F Chang
- Department of Neurological Surgery, University of California, San Francisco, 675 Nelson Rising Lane, San Francisco, CA 94158, USA.
| |
Collapse
|
34
|
Homma NY, Bajo VM. Lemniscal Corticothalamic Feedback in Auditory Scene Analysis. Front Neurosci 2021; 15:723893. [PMID: 34489635 PMCID: PMC8417129 DOI: 10.3389/fnins.2021.723893] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/11/2021] [Accepted: 07/30/2021] [Indexed: 12/15/2022] Open
Abstract
Sound information is transmitted from the ear to central auditory stations of the brain via several nuclei. In addition to these ascending pathways there exist descending projections that can influence the information processing at each of these nuclei. A major descending pathway in the auditory system is the feedback projection from layer VI of the primary auditory cortex (A1) to the ventral division of medial geniculate body (MGBv) in the thalamus. The corticothalamic axons have small glutamatergic terminals that can modulate thalamic processing and thalamocortical information transmission. Corticothalamic neurons also provide input to GABAergic neurons of the thalamic reticular nucleus (TRN) that receives collaterals from the ascending thalamic axons. The balance of corticothalamic and TRN inputs has been shown to refine frequency tuning, firing patterns, and gating of MGBv neurons. Therefore, the thalamus is not merely a relay stage in the chain of auditory nuclei but does participate in complex aspects of sound processing that include top-down modulations. In this review, we aim (i) to examine how lemniscal corticothalamic feedback modulates responses in MGBv neurons, and (ii) to explore how the feedback contributes to auditory scene analysis, particularly on frequency and harmonic perception. Finally, we will discuss potential implications of the role of corticothalamic feedback in music and speech perception, where precise spectral and temporal processing is essential.
Collapse
Affiliation(s)
- Natsumi Y. Homma
- Center for Integrative Neuroscience, University of California, San Francisco, San Francisco, CA, United States
- Coleman Memorial Laboratory, Department of Otolaryngology – Head and Neck Surgery, University of California, San Francisco, San Francisco, CA, United States
| | - Victoria M. Bajo
- Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford, United Kingdom
| |
Collapse
|
35
|
Khalighinejad B, Patel P, Herrero JL, Bickel S, Mehta AD, Mesgarani N. Functional characterization of human Heschl's gyrus in response to natural speech. Neuroimage 2021; 235:118003. [PMID: 33789135 PMCID: PMC8608271 DOI: 10.1016/j.neuroimage.2021.118003] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/24/2020] [Revised: 03/23/2021] [Accepted: 03/25/2021] [Indexed: 01/11/2023] Open
Abstract
Heschl's gyrus (HG) is a brain area that includes the primary auditory cortex in humans. Due to the limitations in obtaining direct neural measurements from this region during naturalistic speech listening, the functional organization and the role of HG in speech perception remain uncertain. Here, we used intracranial EEG to directly record neural activity in HG in eight neurosurgical patients as they listened to continuous speech stories. We studied the spatial distribution of acoustic tuning and the organization of linguistic feature encoding. We found a main gradient of change from posteromedial to anterolateral parts of HG. We also observed a decrease in frequency and temporal modulation tuning and an increase in phonemic representation, speaker normalization, speech sensitivity, and response latency. We did not observe a difference between the two brain hemispheres. These findings reveal a functional role for HG in processing and transforming simple to complex acoustic features and inform neurophysiological models of speech processing in the human auditory cortex.
Collapse
Affiliation(s)
- Bahar Khalighinejad
- Mortimer B. Zuckerman Brain Behavior Institute, Columbia University, New York, NY, United States; Department of Electrical Engineering, Columbia University, New York, NY, United States
| | - Prachi Patel
- Mortimer B. Zuckerman Brain Behavior Institute, Columbia University, New York, NY, United States; Department of Electrical Engineering, Columbia University, New York, NY, United States
| | - Jose L. Herrero
- Hofstra Northwell School of Medicine, Manhasset, NY, United States; The Feinstein Institutes for Medical Research, Manhasset, NY, United States
| | - Stephan Bickel
- Hofstra Northwell School of Medicine, Manhasset, NY, United States; The Feinstein Institutes for Medical Research, Manhasset, NY, United States
| | - Ashesh D. Mehta
- Hofstra Northwell School of Medicine, Manhasset, NY, United States; The Feinstein Institutes for Medical Research, Manhasset, NY, United States
| | - Nima Mesgarani
- Mortimer B. Zuckerman Brain Behavior Institute, Columbia University, New York, NY, United States; Department of Electrical Engineering, Columbia University, New York, NY, United States (corresponding author)
| |
Collapse
|
36
|
Kraus F, Tune S, Ruhe A, Obleser J, Wöstmann M. Unilateral Acoustic Degradation Delays Attentional Separation of Competing Speech. Trends Hear 2021; 25:23312165211013242. [PMID: 34184964 PMCID: PMC8246482 DOI: 10.1177/23312165211013242] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022] Open
Abstract
Hearing loss is often asymmetric such that hearing thresholds differ substantially between the two ears. The extreme case of such asymmetric hearing is single-sided deafness. A unilateral cochlear implant (CI) on the more severely impaired ear is an effective treatment to restore hearing. The interactive effects of unilateral acoustic degradation and spatial attention to one sound source in multitalker situations are at present unclear. Here, we simulated some features of listening with a unilateral CI in young, normal-hearing listeners (N = 22) who were presented with 8-band noise-vocoded speech to one ear and intact speech to the other ear. Neural responses were recorded in the electroencephalogram to obtain the spectrotemporal response function to speech. Listeners made more mistakes when answering questions about vocoded (vs. intact) attended speech. At the neural level, we asked how unilateral acoustic degradation would impact the attention-induced amplification of tracking target versus distracting speech. Interestingly, unilateral degradation did not per se reduce the attention-induced amplification but instead delayed it in time: Speech encoding accuracy, modelled on the basis of the spectrotemporal response function, was significantly enhanced for attended versus ignored intact speech at earlier neural response latencies (<∼250 ms). This attentional enhancement was not absent but delayed for vocoded speech. These findings suggest that attentional selection of unilateral, degraded speech is feasible but induces delayed neural separation of competing speech, which might explain listening challenges experienced by unilateral CI users.
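The response-function modelling referred to above is typically estimated by ridge regression of time-lagged stimulus features onto the neural response (the mTRF approach). The sketch below shows a single-feature temporal version on synthetic data; the lag range, ridge parameter, and toy response kernel are our illustrative choices, not the study's settings.

```python
# A hedged sketch of temporal response function (TRF) estimation via ridge.
import numpy as np

rng = np.random.default_rng(5)
fs = 64
n = fs * 120                                  # two minutes of "data"
envelope = rng.standard_normal(n)             # placeholder speech envelope

true_trf = np.exp(-np.arange(16) / 4.0)       # fake ~250 ms response kernel
eeg = np.convolve(envelope, true_trf, mode="full")[:n]
eeg += rng.standard_normal(n)                 # add measurement noise

lags = np.arange(16)                          # 0-250 ms at 64 Hz
X = np.column_stack([np.roll(envelope, L) for L in lags])
X[:lags.max()] = 0.0                          # discard wrap-around samples

lam = 10.0                                    # ridge parameter
trf = np.linalg.solve(X.T @ X + lam * np.eye(lags.size), X.T @ eeg)

pred = X @ trf
r = np.corrcoef(pred, eeg)[0, 1]
print(f"Encoding accuracy (prediction r): {r:.2f}")
```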
Collapse
Affiliation(s)
- Frauke Kraus
- Department of Psychology, University of Lübeck, Lübeck, Germany
| | - Sarah Tune
- Department of Psychology, University of Lübeck, Lübeck, Germany
| | - Anna Ruhe
- Department of Psychology, University of Lübeck, Lübeck, Germany
| | - Jonas Obleser
- Department of Psychology, University of Lübeck, Lübeck, Germany
| | - Malte Wöstmann
- Department of Psychology, University of Lübeck, Lübeck, Germany
| |
Collapse
|
37
|
Riad R, Karadayi J, Bachoud-Lévi AC, Dupoux E. Learning spectro-temporal representations of complex sounds with parameterized neural networks. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2021; 150:353. [PMID: 34340514 DOI: 10.1121/10.0005482] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 02/18/2021] [Accepted: 06/08/2021] [Indexed: 06/13/2023]
Abstract
Deep learning models have become potential candidates for auditory neuroscience research, thanks to their recent successes in a variety of auditory tasks, yet these models often lack interpretability to fully understand the exact computations that have been performed. Here, we proposed a parametrized neural network layer, which computes specific spectro-temporal modulations based on Gabor filters [learnable spectro-temporal filters (STRFs)] and is fully interpretable. We evaluated this layer on speech activity detection, speaker verification, urban sound classification, and zebra finch call type classification. We found that models based on learnable STRFs are on par for all tasks with state-of-the-art and obtain the best performance for speech activity detection. As this layer remains a Gabor filter, it is fully interpretable. Thus, we used quantitative measures to describe distribution of the learned spectro-temporal modulations. Filters adapted to each task and focused mostly on low temporal and spectral modulations. The analyses show that the filters learned on human speech have similar spectro-temporal parameters as the ones measured directly in the human auditory cortex. Finally, we observed that the tasks organized in a meaningful way: the human vocalization tasks closer to each other and bird vocalizations far away from human vocalizations and urban sounds tasks.
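The central object of the paper, a spectro-temporal Gabor kernel defined by a few interpretable parameters, is easy to write down. The sketch below builds one directly in NumPy rather than as the paper's learnable network layer; the parameter names and default values are ours, for illustration.

```python
# A minimal sketch of a parameterized spectro-temporal Gabor (STRF) kernel.
import numpy as np

def gabor_strf(n_freq=64, n_time=32, spec_mod=0.1, temp_mod=0.05,
               sigma_f=8.0, sigma_t=6.0):
    """2-D Gabor: Gaussian envelope times a spectro-temporal sinusoid.
    spec_mod is in cycles/channel, temp_mod in cycles/frame."""
    f = np.arange(n_freq) - n_freq / 2
    t = np.arange(n_time) - n_time / 2
    F, T = np.meshgrid(f, t, indexing="ij")
    envelope = np.exp(-(F ** 2) / (2 * sigma_f ** 2)
                      - (T ** 2) / (2 * sigma_t ** 2))
    carrier = np.cos(2 * np.pi * (spec_mod * F + temp_mod * T))
    return envelope * carrier

strf = gabor_strf()
print("Gabor STRF kernel:", strf.shape)   # (freq channels, time frames)
# Convolving a spectrogram with banks of such kernels yields STM-tuned
# feature maps; making the parameters trainable recovers the paper's idea.
```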
Collapse
Affiliation(s)
- Rachid Riad
- Ecole des Hautes Etudes en Sciences Sociales, CNRS, Institut National de Recherche informatique et Automatique, Département d'Études Cognitives, Ecole Normale Supérieure-Paris Sciences et Lettres University, 29 Rue d'Ulm, 75005 Paris, France
| | - Julien Karadayi
- Ecole des Hautes Etudes en Sciences Sociales, CNRS, Institut National de Recherche informatique et Automatique, Département d'Études Cognitives, Ecole Normale Supérieure-Paris Sciences et Lettres University, 29 Rue d'Ulm, 75005 Paris, France
| | - Anne-Catherine Bachoud-Lévi
- NeuroPsychologie Interventionnelle, Département d'Études Cognitives, Ecole Normale Supérieure, Institut National de la Santé et de la Recherche Médicale, Institut Mondor de Recherche Biomédicale, Neuratris, Université Paris-Est Créteil, Paris Sciences et Lettres University, 29 Rue d'Ulm, 75005 Paris, France
| | - Emmanuel Dupoux
- Ecole des Hautes Etudes en Sciences Sociales, CNRS, Institut National de Recherche informatique et Automatique, Département d'Études Cognitives, Ecole Normale Supérieure-Paris Sciences et Lettres University, 29 Rue d'Ulm, 75005 Paris, France
| |
Collapse
|
38
|
Boebinger D, Norman-Haignere SV, McDermott JH, Kanwisher N. Music-selective neural populations arise without musical training. J Neurophysiol 2021; 125:2237-2263. [PMID: 33596723 PMCID: PMC8285655 DOI: 10.1152/jn.00588.2020] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/12/2020] [Revised: 02/12/2021] [Accepted: 02/12/2021] [Indexed: 11/22/2022] Open
Abstract
Recent work has shown that human auditory cortex contains neural populations anterior and posterior to primary auditory cortex that respond selectively to music. However, it is unknown how this selectivity for music arises. To test whether musical training is necessary, we measured fMRI responses to 192 natural sounds in 10 people with almost no musical training. When voxel responses were decomposed into underlying components, this group exhibited a music-selective component that was very similar in response profile and anatomical distribution to that previously seen in individuals with moderate musical training. We also found that musical genres that were less familiar to our participants (e.g., Balinese gamelan) produced strong responses within the music component, as did drum clips with rhythm but little melody, suggesting that these neural populations are broadly responsive to music as a whole. Our findings demonstrate that the signature properties of neural music selectivity do not require musical training to develop, showing that the music-selective neural populations are a fundamental and widespread property of the human brain.NEW & NOTEWORTHY We show that music-selective neural populations are clearly present in people without musical training, demonstrating that they are a fundamental and widespread property of the human brain. Additionally, we show music-selective neural populations respond strongly to music from unfamiliar genres as well as music with rhythm but little pitch information, suggesting that they are broadly responsive to music as a whole.
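The decomposition step referred to above factors a sounds-by-voxels response matrix into a small number of components, each pairing a response profile over sounds with a weight map over voxels. The sketch below uses non-negative matrix factorization as a generic stand-in for the original decomposition method; the data are random placeholders.

```python
# A hedged sketch of component decomposition of voxel responses.
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(6)
responses = rng.random((192, 5000))          # 192 sounds x 5000 voxels

model = NMF(n_components=6, init="nndsvda", max_iter=200, random_state=0)
profiles = model.fit_transform(responses)    # (sounds x components)
weights = model.components_                  # (components x voxels)

print("Component response profiles:", profiles.shape)
print("Voxel weight maps:", weights.shape)
# A music-selective component would show high profile values for music
# sounds and spatially clustered voxel weights.
```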
Collapse
Affiliation(s)
- Dana Boebinger
- Speech and Hearing Bioscience and Technology, Harvard University, Cambridge, Massachusetts
- Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, Massachusetts
- McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, Massachusetts
| | - Sam V Norman-Haignere
- Laboratoire des Systèmes Perceptifs, Département d'Études Cognitives, École Normale Supérieure, PSL Research University, CNRS, Paris, France
- Zuckerman Institute for Brain Research, Columbia University, New York, New York
| | - Josh H McDermott
- Speech and Hearing Bioscience and Technology, Harvard University, Cambridge, Massachusetts
- Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, Massachusetts
- McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, Massachusetts
- Center for Brains, Minds, and Machines, Massachusetts Institute of Technology, Cambridge, Massachusetts
| | - Nancy Kanwisher
- Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, Massachusetts
- McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, Massachusetts
- Center for Brains, Minds, and Machines, Massachusetts Institute of Technology, Cambridge, Massachusetts
| |
Collapse
|
39
|
Boos M, Lücke J, Rieger JW. Generalizable dimensions of human cortical auditory processing of speech in natural soundscapes: A data-driven ultra high field fMRI approach. Neuroimage 2021; 237:118106. [PMID: 33991696 DOI: 10.1016/j.neuroimage.2021.118106] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/27/2021] [Accepted: 04/25/2021] [Indexed: 11/27/2022] Open
Abstract
Speech comprehension in natural soundscapes rests on the ability of the auditory system to extract speech information from a complex acoustic signal with overlapping contributions from many sound sources. Here we reveal the canonical processing of speech in natural soundscapes on multiple scales by using data-driven modeling approaches to characterize sounds to analyze ultra high field fMRI recorded while participants listened to the audio soundtrack of a movie. We show that at the functional level the neuronal processing of speech in natural soundscapes can be surprisingly low dimensional in the human cortex, highlighting the functional efficiency of the auditory system for a seemingly complex task. Particularly, we find that a model comprising three functional dimensions of auditory processing in the temporal lobes is shared across participants' fMRI activity. We further demonstrate that the three functional dimensions are implemented in anatomically overlapping networks that process different aspects of speech in natural soundscapes. One is most sensitive to complex auditory features present in speech, another to complex auditory features and fast temporal modulations, that are not specific to speech, and one codes mainly sound level. These results were derived with few a-priori assumptions and provide a detailed and computationally reproducible account of the cortical activity in the temporal lobe elicited by the processing of speech in natural soundscapes.
Collapse
Affiliation(s)
- Moritz Boos
- Applied Neurocognitive Psychology Lab, University of Oldenburg, Oldenburg, Germany; Cluster of Excellence "Hearing4all", University of Oldenburg, Oldenburg, Germany.
| | - Jörg Lücke
- Machine Learning Division, University of Oldenburg, Oldenburg, Germany; Cluster of Excellence "Hearing4all", University of Oldenburg, Oldenburg, Germany
| | - Jochem W Rieger
- Applied Neurocognitive Psychology Lab, University of Oldenburg, Oldenburg, Germany; Cluster of Excellence "Hearing4all", University of Oldenburg, Oldenburg, Germany
| |
Collapse
|
40
|
Homma NY, Atencio CA, Schreiner CE. Plasticity of Multidimensional Receptive Fields in Core Rat Auditory Cortex Directed by Sound Statistics. Neuroscience 2021; 467:150-170. [PMID: 33951506 DOI: 10.1016/j.neuroscience.2021.04.028] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/31/2020] [Revised: 04/09/2021] [Accepted: 04/24/2021] [Indexed: 11/17/2022]
Abstract
Sensory cortical neurons can nonlinearly integrate a wide range of inputs. The outcome of this nonlinear process can be approximated by more than one receptive field component or filter to characterize the ensuing stimulus preference. The functional properties of multidimensional filters are, however, not well understood. Here we estimated two spectrotemporal receptive fields (STRFs) per neuron using maximally informative dimension analysis. We compared their temporal and spectral modulation properties and determined the stimulus information captured by the two STRFs in core rat auditory cortical fields, primary auditory cortex (A1) and ventral auditory field (VAF). The first STRF is the dominant filter and acts as a sound feature detector in both fields. The second STRF is less feature specific, preferred lower modulations, and had less spike information compared to the first STRF. The information jointly captured by the two STRFs was larger than that captured by the sum of the individual STRFs, reflecting nonlinear interactions of two filters. This information gain was larger in A1. We next determined how the acoustic environment affects the structure and relationship of these two STRFs. Rats were exposed to moderate levels of spectrotemporally modulated noise during development. Noise exposure strongly altered the spectrotemporal preference of the first STRF in both cortical fields. The interaction between the two STRFs was reduced by noise exposure in A1 but not in VAF. The results reveal new functional distinctions between A1 and VAF indicating that (i) A1 has stronger interactions of the two STRFs than VAF, (ii) noise exposure diminishes modulation parameter representation contained in the noise more strongly for the first STRF in both fields, and (iii) plasticity induced by noise exposure can affect the strength of filter interactions in A1. Taken together, ascertaining two STRFs per neuron enhances the understanding of cortical information processing and plasticity effects in core auditory cortex.
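The filter-information comparison described above can be made concrete with a toy neuron whose firing depends on a nonlinear interaction of two filter projections: each projection alone then carries little information, while the pair jointly carries much more, so joint information exceeds the sum. The binning scheme and toy nonlinearity below are our illustrative choices, not the paper's estimator.

```python
# A hedged sketch: spike information from one filter vs. two filters jointly.
import numpy as np

rng = np.random.default_rng(7)
n = 200_000
s1, s2 = rng.standard_normal(n), rng.standard_normal(n)  # projections
rate = 1 / (1 + np.exp(-(s1 * s2)))        # nonlinear filter interaction
spikes = rng.random(n) < 0.2 * rate

def mi_bits(labels, spikes):
    """I(spike; discrete label) in bits."""
    mi, p_s = 0.0, spikes.mean()
    for lab in np.unique(labels):
        sel = labels == lab
        p_l = sel.mean()
        for joint, marg in (((spikes & sel).mean(), p_s),
                            ((~spikes & sel).mean(), 1 - p_s)):
            if joint > 0:
                mi += joint * np.log2(joint / (p_l * marg))
    return mi

def binned(x, bins=8):
    edges = np.quantile(x, np.linspace(0, 1, bins + 1))[1:-1]
    return np.digitize(x, edges)

b1, b2 = binned(s1), binned(s2)
i1, i2 = mi_bits(b1, spikes), mi_bits(b2, spikes)
i_joint = mi_bits(b1 * 8 + b2, spikes)     # pair of bins as one label
print(f"I(filter1)={i1:.3f}  I(filter2)={i2:.3f}  "
      f"joint={i_joint:.3f}  sum={i1 + i2:.3f} bits")
```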
Collapse
Affiliation(s)
- Natsumi Y Homma
- Coleman Memorial Laboratory, Department of Otolaryngology - Head and Neck Surgery, University of California San Francisco, San Francisco, USA; Center for Integrative Neuroscience, University of California San Francisco, San Francisco, USA.
| | - Craig A Atencio
- Coleman Memorial Laboratory, Department of Otolaryngology - Head and Neck Surgery, University of California San Francisco, San Francisco, USA
| | - Christoph E Schreiner
- Coleman Memorial Laboratory, Department of Otolaryngology - Head and Neck Surgery, University of California San Francisco, San Francisco, USA; Center for Integrative Neuroscience, University of California San Francisco, San Francisco, USA
| |
Collapse
|
41
|
Homma NY, Hullett PW, Atencio CA, Schreiner CE. Auditory Cortical Plasticity Dependent on Environmental Noise Statistics. Cell Rep 2021; 30:4445-4458.e5. [PMID: 32234479 PMCID: PMC7326484 DOI: 10.1016/j.celrep.2020.03.014] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/03/2019] [Revised: 08/07/2019] [Accepted: 03/05/2020] [Indexed: 01/14/2023] Open
Abstract
During critical periods, neural circuits develop to form receptive fields that adapt to the sensory environment and enable optimal performance of relevant tasks. We hypothesized that early exposure to background noise can improve signal-in-noise processing, and that the resulting receptive field plasticity in the primary auditory cortex can reveal functional principles guiding that important task. We raised rat pups in noise with differing spectro-temporal statistics during their auditory critical period. As adults, they showed enhanced behavioral performance in detecting vocalizations in noise. Concomitantly, encoding of vocalizations in noise in the primary auditory cortex improved with noise-rearing. Significantly, spectro-temporal modulation plasticity shifted cortical preferences away from the exposed noise statistics, thus reducing noise interference with the foreground sound representation. During noise-rearing, auditory cortical plasticity thus shapes receptive field preferences to optimally extract foreground information in noisy environments, and early noise exposure induces cortical circuits to implement efficient coding in the joint spectral and temporal modulation domain. After rearing rats in moderately loud spectro-temporally modulated background noise, Homma et al. investigated signal-in-noise processing in the primary auditory cortex. Noise-rearing improved vocalization-in-noise performance in both behavioral testing and neural decoding, and cortical plasticity shifted neuronal spectro-temporal modulation preferences away from the exposed noise statistics.
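The quantity manipulated during rearing, the joint spectro-temporal modulation content of the noise, can be estimated as the 2D Fourier transform of a log spectrogram. The Python sketch below does this for a synthetic amplitude-modulated noise; the stimulus and all parameters are placeholders rather than the authors' materials.

```python
# A sketch, with a synthetic stimulus, of the statistic that defines the
# rearing environments: the joint spectro-temporal modulation spectrum,
# computed as the 2D FFT of a log spectrogram.
import numpy as np
from scipy.signal import spectrogram

fs = 24_000
t = np.arange(0, 2.0, 1 / fs)
rng = np.random.default_rng(2)
# Noise carrier with an 8 Hz amplitude modulation.
sound = rng.standard_normal(t.size) * (1 + 0.8 * np.sin(2 * np.pi * 8 * t))

f, frames, S = spectrogram(sound, fs=fs, nperseg=512, noverlap=384)
logS = np.log(S + 1e-12)
logS -= logS.mean(axis=1, keepdims=True)      # remove per-frequency DC

# Axis 0 of the 2D FFT -> spectral modulations, axis 1 -> temporal (Hz).
mps = np.abs(np.fft.fftshift(np.fft.fft2(logS)))
tmod = np.fft.fftshift(np.fft.fftfreq(frames.size, d=frames[1] - frames[0]))

keep = np.abs(tmod) > 1.0                     # ignore the residual DC column
peak = np.argmax(mps.max(axis=0)[keep])
print("peak temporal modulation (Hz):", np.abs(tmod[keep][peak]).round(1))
```

For the 8 Hz-modulated toy stimulus, the peak should land near 8 Hz, the temporal-modulation statistic the exposure paradigm varies.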
Collapse
Affiliation(s)
- Natsumi Y Homma
- Coleman Memorial Laboratory, Department of Otolaryngology - Head and Neck Surgery, University of California, San Francisco, San Francisco, CA 94143, USA; Center for Integrative Neuroscience, University of California, San Francisco, San Francisco, CA 94143, USA
| | - Patrick W Hullett
- Coleman Memorial Laboratory, Department of Otolaryngology - Head and Neck Surgery, University of California, San Francisco, San Francisco, CA 94143, USA; Center for Integrative Neuroscience, University of California, San Francisco, San Francisco, CA 94143, USA
| | - Craig A Atencio
- Coleman Memorial Laboratory, Department of Otolaryngology - Head and Neck Surgery, University of California, San Francisco, San Francisco, CA 94143, USA; Center for Integrative Neuroscience, University of California, San Francisco, San Francisco, CA 94143, USA
| | - Christoph E Schreiner
- Coleman Memorial Laboratory, Department of Otolaryngology - Head and Neck Surgery, University of California, San Francisco, San Francisco, CA 94143, USA; Center for Integrative Neuroscience, University of California, San Francisco, San Francisco, CA 94143, USA.
| |
Collapse
|
42
|
Nakai T, Koide-Majima N, Nishimoto S. Correspondence of categorical and feature-based representations of music in the human brain. Brain Behav 2021; 11:e01936. [PMID: 33164348 PMCID: PMC7821620 DOI: 10.1002/brb3.1936] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 04/17/2020] [Revised: 09/24/2020] [Accepted: 10/21/2020] [Indexed: 01/11/2023] Open
Abstract
INTRODUCTION: Humans tend to categorize auditory stimuli into discrete classes, such as animal species, language, musical instrument, and music genre. Of these, music genre is a frequently used dimension of human music preference and is determined based on the categorization of complex auditory stimuli. Neuroimaging studies have reported that the superior temporal gyrus (STG) is involved in responses to general music-related features. However, there is considerable uncertainty over how discrete music categories are represented in the brain and which acoustic features are best suited for explaining such representations.
METHODS: We used a total of 540 music clips to examine comprehensive cortical representations and the functional organization of music genre categories. For this purpose, we applied a voxel-wise modeling approach to music-evoked brain activity measured using functional magnetic resonance imaging. In addition, we introduced a novel technique for feature-brain similarity analysis and assessed how discrete music categories are represented based on the cortical response pattern to acoustic features.
RESULTS: Our findings indicated distinct cortical organizations for different music genres in the bilateral STG, and they revealed representational relationships between different music genres. On comparing different acoustic feature models, we found that these representations of music genres could be explained largely by a biologically plausible spectro-temporal modulation-transfer function model.
CONCLUSION: Our findings have elucidated the quantitative representation of music genres in the human cortex, indicating the possibility of modeling this categorization of complex auditory stimuli based on brain activity.
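The voxel-wise modeling approach referred to in METHODS is, at its core, a regularized regression from stimulus features to each voxel's response, evaluated on held-out data. Below is a minimal sketch with synthetic features and responses; the feature set, ridge penalty, and data sizes are illustrative assumptions.

```python
# A minimal voxel-wise encoding-model sketch: ridge regression from
# stimulus features to each voxel, scored by held-out correlation.
# Features, responses, and the ridge penalty are synthetic placeholders.
import numpy as np

rng = np.random.default_rng(3)
n_train, n_test, n_feat, n_vox = 400, 100, 20, 50
X = rng.standard_normal((n_train + n_test, n_feat))
true_w = rng.standard_normal((n_feat, n_vox))
Y = X @ true_w + rng.standard_normal((n_train + n_test, n_vox))

Xtr, Xte, Ytr, Yte = X[:n_train], X[n_train:], Y[:n_train], Y[n_train:]
alpha = 10.0  # in practice tuned per voxel by cross-validation
W = np.linalg.solve(Xtr.T @ Xtr + alpha * np.eye(n_feat), Xtr.T @ Ytr)

pred = Xte @ W
r = [np.corrcoef(pred[:, v], Yte[:, v])[0, 1] for v in range(n_vox)]
print("median held-out prediction r:", np.round(np.median(r), 2))
```

Swapping in different feature matrices X (e.g., genre labels vs. modulation-transfer-function energies) and comparing held-out r is the model-comparison logic the RESULTS section describes.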
Collapse
Affiliation(s)
- Tomoya Nakai
- Center for Information and Neural Networks, National Institute of Information and Communications Technology, Suita, Japan.,Graduate School of Frontier Biosciences, Osaka University, Suita, Japan
| | - Naoko Koide-Majima
- Graduate School of Frontier Biosciences, Osaka University, Suita, Japan.,AI Science Research and Development Promotion Center, National Institute of Information and Communications Technology, Suita, Japan
| | - Shinji Nishimoto
- Center for Information and Neural Networks, National Institute of Information and Communications Technology, Suita, Japan.,Graduate School of Frontier Biosciences, Osaka University, Suita, Japan.,Graduate School of Medicine, Osaka University, Suita, Japan
| |
Collapse
|
43
|
Ponsot E, Varnet L, Wallaert N, Daoud E, Shamma SA, Lorenzi C, Neri P. Mechanisms of Spectrotemporal Modulation Detection for Normal- and Hearing-Impaired Listeners. Trends Hear 2021; 25:2331216520978029. [PMID: 33620023 PMCID: PMC7905488 DOI: 10.1177/2331216520978029] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/09/2019] [Revised: 10/26/2020] [Accepted: 11/06/2020] [Indexed: 11/20/2022] Open
Abstract
Spectrotemporal modulations (STM) are essential features of speech signals that make them intelligible. While their encoding has been widely investigated in neurophysiology, we still lack a full understanding of how STMs are processed at the behavioral level and how cochlear hearing loss impacts this processing. Here, we introduce a novel methodological framework based on psychophysical reverse correlation deployed in the modulation space to characterize the mechanisms underlying STM detection in noise. We derive perceptual filters for young normal-hearing and older hearing-impaired individuals performing a detection task of an elementary target STM (a given product of temporal and spectral modulations) embedded in other masking STMs. Analyzed with computational tools, our data show that both groups rely on a comparable linear (band-pass)-nonlinear processing cascade, which can be well accounted for by a temporal modulation filter bank model combined with cross-correlation against the target representation. Our results also suggest that the modulation mistuning observed for the hearing-impaired group results primarily from broader cochlear filters. Yet, we find idiosyncratic behaviors that cannot be captured by cochlear tuning alone, highlighting the need to consider variability originating from additional mechanisms. Overall, this integrated experimental-computational approach offers a principled way to assess suprathreshold processing distortions in each individual and could thus be used to further investigate interindividual differences in speech intelligibility.
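The reverse-correlation logic of the paper can be illustrated with a classification-image simulation: a model observer filters target-plus-masker stimuli through a hidden perceptual filter, and that filter is then recovered from the trial-by-trial maskers. Everything below (grid size, filter shape, decision rule) is an assumed toy setup, not the authors' protocol.

```python
# A classification-image toy: a simulated observer detects a target STM
# through a hidden perceptual filter; the filter is recovered from the
# trial-by-trial maskers. Grid size, filter, and decision rule are assumed.
import numpy as np

rng = np.random.default_rng(4)
n_trials, n_bins = 5000, 32        # bins of a flattened modulation grid
target = np.zeros(n_bins); target[12] = 1.0     # elementary target STM

true_filter = np.exp(-0.5 * ((np.arange(n_bins) - 12) / 3.0) ** 2)
maskers = rng.standard_normal((n_trials, n_bins))
present = rng.integers(0, 2, n_trials)          # target on half the trials

evidence = (maskers + np.outer(present, target)) @ true_filter
responses = evidence + rng.standard_normal(n_trials) > 0.5  # "yes"/"no"

# Classification image: mean masker on "yes" minus "no" trials.
kernel = maskers[responses].mean(axis=0) - maskers[~responses].mean(axis=0)
print("filter recovery r =", np.corrcoef(kernel, true_filter)[0, 1].round(2))
```

Broadening true_filter in this toy mimics the broader-filter account of modulation mistuning proposed for the hearing-impaired group.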
Collapse
Affiliation(s)
- Emmanuel Ponsot
- Laboratoire des systèmes perceptifs, Département d'études cognitives, École normale supérieure, Université PSL, CNRS, Paris, France
- Hearing Technology @ WAVES, Department of Information Technology, Ghent University, Ghent, Belgium
| | - Léo Varnet
- Laboratoire des systèmes perceptifs, Département d'études cognitives, École normale supérieure, Université PSL, CNRS, Paris, France
| | - Nicolas Wallaert
- Laboratoire des systèmes perceptifs, Département d'études cognitives, École normale supérieure, Université PSL, CNRS, Paris, France
| | - Elza Daoud
- Aix-Marseille Université, UMR CNRS 7260, Laboratoire Neurosciences Intégratives et Adaptatives, Centre Saint-Charles, Marseille, France
| | - Shihab A. Shamma
- Laboratoire des systèmes perceptifs, Département d'études cognitives, École normale supérieure, Université PSL, CNRS, Paris, France
| | - Christian Lorenzi
- Laboratoire des systèmes perceptifs, Département d'études cognitives, École normale supérieure, Université PSL, CNRS, Paris, France
| | - Peter Neri
- Laboratoire des systèmes perceptifs, Département d'études cognitives, École normale supérieure, Université PSL, CNRS, Paris, France
| |
Collapse
|
44
|
Cortical voice processing is grounded in elementary sound analyses for vocalization relevant sound patterns. Prog Neurobiol 2020; 200:101982. [PMID: 33338555 DOI: 10.1016/j.pneurobio.2020.101982] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/01/2020] [Revised: 12/05/2020] [Accepted: 12/11/2020] [Indexed: 01/31/2023]
Abstract
A subregion of the auditory cortex (AC) has been proposed to selectively process voices. The selectivity of this temporal voice area (TVA) and its role in processing non-voice sounds have, however, remained elusive. For a better functional description of the TVA, we investigated its neural responses to voice and non-voice sounds, and critically also to textural sound patterns (TSPs) that share basic features with natural sounds but are perceptually very distant from voices. First, listening to these TSPs elicited activity in large subregions of the TVA, which was mainly driven by perceptual ratings of the TSPs along a voice-similarity scale. This similarity-driven TVA activity in response to TSPs might partially explain the activation patterns typically observed during voice processing. Second, we reconstructed the TVA activity that is usually observed during voice processing with a linear combination of activation patterns from TSPs. An analysis of the reconstruction model weights demonstrated that the TVA processes natural voice and non-voice sounds, as well as TSPs, similarly along their acoustic and perceptual features. The predominant factor in reconstructing the TVA pattern from TSPs was the perceptual voice-similarity ratings. Third, a multi-voxel pattern analysis confirmed that the TSPs contain sufficient sound information to explain TVA activity during voice processing. Altogether, rather than being restricted to higher-order voice processing, the human "voice area" uses mechanisms that evaluate the perceptual and acoustic quality of non-voice sounds, and it responds to the latter with a "voice-like" processing pattern when it detects some rudimentary perceptual similarity with voices.
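The reconstruction analysis in the second step amounts to expressing a voice-evoked activation map as a least-squares combination of TSP-evoked maps and inspecting the weights. A minimal sketch with random surrogate patterns follows; the pattern sizes and weight structure are invented for illustration.

```python
# A sketch of the reconstruction step: express a voice-evoked pattern as a
# least-squares combination of TSP-evoked patterns. All maps are random
# surrogates; the weight structure is invented for illustration.
import numpy as np

rng = np.random.default_rng(5)
n_voxels, n_tsps = 2000, 24
tsp_patterns = rng.standard_normal((n_voxels, n_tsps))   # one map per TSP

true_w = np.zeros(n_tsps); true_w[:4] = [1.5, 1.0, 0.8, 0.5]
voice_pattern = tsp_patterns @ true_w + 0.5 * rng.standard_normal(n_voxels)

# In the study, these weights were then related to the TSPs' perceptual
# voice-similarity ratings.
w, *_ = np.linalg.lstsq(tsp_patterns, voice_pattern, rcond=None)
recon = tsp_patterns @ w
print("reconstruction r =", np.corrcoef(recon, voice_pattern)[0, 1].round(2))
print("TSPs with the largest weights:", np.argsort(w)[::-1][:4])
```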
Collapse
|
45
|
Speech frequency-following response in human auditory cortex is more than a simple tracking. Neuroimage 2020; 226:117545. [PMID: 33186711 DOI: 10.1016/j.neuroimage.2020.117545] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/27/2020] [Revised: 10/29/2020] [Accepted: 11/02/2020] [Indexed: 11/20/2022] Open
Abstract
The human auditory cortex has recently been found to contribute to the frequency-following response (FFR), and this cortical component has been shown to be particularly relevant to speech perception. However, it is not clear how the cortical FFR may contribute to the processing of the speech fundamental frequency (F0) and its dynamic pitch. Using intracranial EEG recordings, we observed a significant FFR at F0 for both speech and speech-like harmonic complex stimuli in the human auditory cortex, even in the missing-fundamental condition. Both the spectral amplitude and the phase coherence of the cortical FFR showed a significant harmonic preference and attenuated from the primary auditory cortex to the surrounding associative auditory cortex. The phase coherence of the speech FFR was significantly higher than that for the harmonic complex stimuli, especially in the left hemisphere, showing a high timing fidelity of the cortical FFR in tracking the dynamic F0 of speech. Spectrally, the frequency band of the cortical FFR largely overlapped with the range of the human vocal pitch. Taken together, our study parses the intrinsic properties of the cortical FFR and reveals a preference for speech-like sounds, supporting its potential role in processing speech intonation and lexical tones.
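The two FFR measures used here, spectral amplitude and inter-trial phase coherence at F0, are straightforward to compute from epoched recordings. The sketch below does so for simulated epochs containing a phase-locked 120 Hz component; the sampling rate, F0, and noise level are assumptions.

```python
# Computing the two FFR measures from the abstract on simulated epochs:
# spectral amplitude and inter-trial phase coherence (ITPC) at F0.
# Sampling rate, F0, trial count, and noise level are assumptions.
import numpy as np

fs, f0, n_trials, dur = 2000, 120.0, 200, 0.5
t = np.arange(0, dur, 1 / fs)
rng = np.random.default_rng(6)

# Each trial: a phase-locked F0 component buried in independent noise.
epochs = np.sin(2 * np.pi * f0 * t) + 2.0 * rng.standard_normal((n_trials, t.size))

spec = np.fft.rfft(epochs, axis=1)
freqs = np.fft.rfftfreq(t.size, 1 / fs)
k = np.argmin(np.abs(freqs - f0))

amplitude = np.abs(spec[:, k]).mean()
itpc = np.abs(np.mean(spec[:, k] / np.abs(spec[:, k])))  # in [0, 1]
print(f"amplitude at {freqs[k]:.0f} Hz: {amplitude:.1f}, ITPC: {itpc:.2f}")
```

ITPC near 1 indicates tight phase locking across trials; without a phase-locked component it falls toward 0, which is the contrast the harmonic-preference analyses rest on.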
Collapse
|
46
|
Monk AM, Barnes GR, Maguire EA. The Effect of Object Type on Building Scene Imagery-an MEG Study. Front Hum Neurosci 2020; 14:592175. [PMID: 33240069 PMCID: PMC7683518 DOI: 10.3389/fnhum.2020.592175] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/06/2020] [Accepted: 10/09/2020] [Indexed: 12/28/2022] Open
Abstract
Previous studies have reported that some objects evoke a sense of local three-dimensional space (space-defining; SD), while others do not (space-ambiguous; SA), despite being imagined or viewed in isolation devoid of a background context. Moreover, people show a strong preference for SD objects when given a choice of objects with which to mentally construct scene imagery. When deconstructing scenes, people retain significantly more SD objects than SA objects. It, therefore, seems that SD objects might enjoy a privileged role in scene construction. In the current study, we leveraged the high temporal resolution of magnetoencephalography (MEG) to compare the neural responses to SD and SA objects while they were being used to build imagined scene representations, as this has not been examined before using neuroimaging. On each trial, participants gradually built a scene image from three successive auditorily-presented object descriptions and an imagined 3D space. We then examined the neural dynamics associated with the points during scene construction when either SD or SA objects were being imagined. We found that SD objects elicited theta changes relative to SA objects in two brain regions, the right ventromedial prefrontal cortex (vmPFC) and the right superior temporal gyrus (STG). Furthermore, using dynamic causal modeling, we observed that the vmPFC drove STG activity. These findings may indicate that SD objects serve to activate schematic and conceptual knowledge in vmPFC and STG upon which scene representations are then built.
Collapse
Affiliation(s)
- Anna M Monk
- Wellcome Centre for Human Neuroimaging, UCL Queen Square Institute of Neurology, University College London, London, United Kingdom
| | - Gareth R Barnes
- Wellcome Centre for Human Neuroimaging, UCL Queen Square Institute of Neurology, University College London, London, United Kingdom
| | - Eleanor A Maguire
- Wellcome Centre for Human Neuroimaging, UCL Queen Square Institute of Neurology, University College London, London, United Kingdom
| |
Collapse
|
47
|
Sohoglu E, Davis MH. Rapid computations of spectrotemporal prediction error support perception of degraded speech. eLife 2020; 9:e58077. [PMID: 33147138 PMCID: PMC7641582 DOI: 10.7554/elife.58077] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/20/2020] [Accepted: 10/19/2020] [Indexed: 12/15/2022] Open
Abstract
Human speech perception can be described as Bayesian perceptual inference, but how are these Bayesian computations instantiated neurally? We used magnetoencephalographic recordings of brain responses to degraded spoken words and experimentally manipulated signal quality and prior knowledge. We first demonstrate that spectrotemporal modulations in speech are more strongly represented in neural responses than alternative speech representations (e.g., spectrogram or articulatory features). Critically, we found an interaction between speech signal quality and expectations from prior written text on the quality of neural representations; increased signal quality enhanced neural representations of speech that mismatched prior expectations but led to greater suppression of speech that matched prior expectations. This interaction is a unique neural signature of prediction error computations and is apparent in neural responses within 100 ms of speech input. Our findings contribute to the detailed specification of a computational model of speech perception based on predictive coding frameworks.
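The first result, comparing how well different stimulus feature spaces explain neural responses, can be sketched as cross-validated ridge regression under competing feature sets. The example below uses synthetic data in which one feature space is generative and the other a noisy correlate; it illustrates the model-comparison logic, not the authors' MEG pipeline.

```python
# Sketch of the feature-space comparison: cross-validated ridge regression
# under two candidate feature sets, keeping whichever predicts held-out
# responses better. Data are synthetic; one space is generative by design.
import numpy as np

rng = np.random.default_rng(7)
n, d = 600, 30
features_stm = rng.standard_normal((n, d))           # "true" feature space
mix = rng.standard_normal((d, d))
features_spec = 0.5 * features_stm @ mix + rng.standard_normal((n, d))
meg = features_stm @ rng.standard_normal(d) + rng.standard_normal(n)

def cv_score(X, y, alpha=1.0, k=5):
    """k-fold cross-validated prediction correlation for ridge regression."""
    idx = np.arange(len(y))
    scores = []
    for fold in range(k):
        test = idx % k == fold
        w = np.linalg.solve(X[~test].T @ X[~test] + alpha * np.eye(X.shape[1]),
                            X[~test].T @ y[~test])
        scores.append(np.corrcoef(X[test] @ w, y[test])[0, 1])
    return float(np.mean(scores))

print("modulation-feature model r:", round(cv_score(features_stm, meg), 2))
print("spectrogram-like model r:  ", round(cv_score(features_spec, meg), 2))
```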
Collapse
Affiliation(s)
- Ediz Sohoglu
- School of Psychology, University of Sussex, Brighton, United Kingdom
| | - Matthew H Davis
- MRC Cognition and Brain Sciences Unit, Cambridge, United Kingdom
| |
Collapse
|
48
|
Brodbeck C, Jiao A, Hong LE, Simon JZ. Neural speech restoration at the cocktail party: Auditory cortex recovers masked speech of both attended and ignored speakers. PLoS Biol 2020; 18:e3000883. [PMID: 33091003 PMCID: PMC7644085 DOI: 10.1371/journal.pbio.3000883] [Citation(s) in RCA: 38] [Impact Index Per Article: 9.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/22/2020] [Revised: 11/05/2020] [Accepted: 09/14/2020] [Indexed: 01/09/2023] Open
Abstract
Humans are remarkably skilled at listening to one speaker out of an acoustic mixture of several speech sources. Two speakers are easily segregated, even without binaural cues, but the neural mechanisms underlying this ability are not well understood. One possibility is that early cortical processing performs a spectrotemporal decomposition of the acoustic mixture, allowing the attended speech to be reconstructed via optimally weighted recombinations that discount spectrotemporal regions where sources heavily overlap. Using human magnetoencephalography (MEG) responses to a 2-talker mixture, we show evidence for an alternative possibility, in which early, active segregation occurs even for strongly spectrotemporally overlapping regions. Early (approximately 70-millisecond) responses to nonoverlapping spectrotemporal features are seen for both talkers. When competing talkers’ spectrotemporal features mask each other, the individual representations persist, but they occur with an approximately 20-millisecond delay. This suggests that the auditory cortex recovers acoustic features that are masked in the mixture, even if they occurred in the ignored speech. The existence of such noise-robust cortical representations, of features present in attended as well as ignored speech, suggests an active cortical stream segregation process, which could explain a range of behavioral effects of ignored background speech. How do humans focus on one speaker when several are talking? MEG responses to a continuous two-talker mixture suggest that, even though listeners attend only to one of the talkers, their auditory cortex tracks acoustic features from both speakers. This occurs even when those features are locally masked by the other speaker.
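Analyses like this one typically rest on temporal response functions (TRFs): lagged regressions from an acoustic feature to the neural signal, whose weights reveal response latency. Here is a minimal TRF sketch on simulated data; the envelope, sampling rate, and ridge penalty are assumptions, and the authors' actual estimator over source-localized MEG is more elaborate.

```python
# A minimal temporal response function (TRF) sketch: lagged ridge
# regression from a simulated envelope to a simulated MEG channel, then
# reading the response latency off the weight peak. All values assumed.
import numpy as np

rng = np.random.default_rng(8)
fs, n = 200, 20_000                        # 200 Hz sampling, 100 s
env = np.convolve(rng.standard_normal(n), np.ones(5) / 5, mode="same")

lags = np.arange(40)                       # 0-195 ms in 5 ms steps
true_trf = np.exp(-((lags - 14) ** 2) / 20.0)   # peak at 70 ms
meg = np.convolve(env, true_trf, mode="full")[:n] + rng.standard_normal(n)

# Design matrix of lagged copies of the envelope.
X = np.stack([np.roll(env, L) for L in lags], axis=1)
X[:lags.max()] = 0                         # zero the wrap-around samples
w = np.linalg.solve(X.T @ X + 100.0 * np.eye(lags.size), X.T @ meg)

print("estimated peak latency:", int(lags[np.argmax(w)] * 1000 / fs), "ms")
```

In this framework, the paper's masked-feature finding would appear as a weight peak shifted roughly 20 ms later for overlapping than for non-overlapping spectrotemporal features.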
Collapse
Affiliation(s)
- Christian Brodbeck
- Institute for Systems Research, University of Maryland, College Park, Maryland, United States of America
| | - Alex Jiao
- Department of Electrical and Computer Engineering, University of Maryland, College Park, Maryland, United States of America
| | - L. Elliot Hong
- Maryland Psychiatric Research Center, Department of Psychiatry, University of Maryland School of Medicine, Baltimore, Maryland, United States of America
| | - Jonathan Z. Simon
- Institute for Systems Research, University of Maryland, College Park, Maryland, United States of America
- Department of Electrical and Computer Engineering, University of Maryland, College Park, Maryland, United States of America
- Department of Biology, University of Maryland, College Park, Maryland, United States of America
| |
Collapse
|
49
|
Ramos Nuñez AI, Yue Q, Pasalar S, Martin RC. The role of left vs. right superior temporal gyrus in speech perception: An fMRI-guided TMS study. BRAIN AND LANGUAGE 2020; 209:104838. [PMID: 32801090 DOI: 10.1016/j.bandl.2020.104838] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/19/2019] [Revised: 05/27/2020] [Accepted: 07/13/2020] [Indexed: 05/15/2023]
Abstract
Debate continues regarding the necessary role of right superior temporal gyrus (STG) regions in sublexical speech perception given the bilateral STG activation often observed in fMRI studies. To evaluate the causal roles, TMS pulses were delivered to inhibit and disrupt neuronal activity at the left and right STG regions during a nonword discrimination task based on peak activations from a blocked fMRI paradigm assessing speech vs. nonspeech perception (N = 20). Relative to a control region located in the posterior occipital lobe, TMS to the left anterior STG (laSTG) led to significantly worse accuracy, whereas TMS to the left posterior STG (lpSTG) and right anterior STG (raSTG) did not. Although the disruption from TMS was significantly greater for the laSTG than for raSTG, the difference in accuracy between the laSTG and lpSTG did not reach significance. The results argue for a causal role of the laSTG but not raSTG in speech perception. Further research is needed to establish the source of the differences between the laSTG and lpSTG.
Collapse
Affiliation(s)
- Aurora I Ramos Nuñez
- Department of Social Sciences, College of Coastal Georgia, Brunswick, GA 31520, USA.
| | - Qiuhai Yue
- Department of Psychological Sciences, Rice University, Houston, TX 77005, USA; Department of Psychology, Vanderbilt University, Nashville, TN 37212, USA
| | - Siavash Pasalar
- Department of Psychological Sciences, Rice University, Houston, TX 77005, USA
| | - Randi C Martin
- Department of Psychological Sciences, Rice University, Houston, TX 77005, USA
| |
Collapse
|
50
|
Plass J, Brang D, Suzuki S, Grabowecky M. Vision perceptually restores auditory spectral dynamics in speech. Proc Natl Acad Sci U S A 2020; 117:16920-16927. [PMID: 32632010 PMCID: PMC7382243 DOI: 10.1073/pnas.2002887117] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open
Abstract
Visual speech facilitates auditory speech perception, but the visual cues responsible for these benefits and the information they provide remain unclear. Low-level models emphasize basic temporal cues provided by mouth movements, but these impoverished signals may not fully account for the richness of auditory information provided by visual speech. High-level models posit interactions among abstract categorical (i.e., phonemes/visemes) or amodal (e.g., articulatory) speech representations, but require lossy remapping of speech signals onto abstracted representations. Because visible articulators shape the spectral content of speech, we hypothesized that the perceptual system might exploit natural correlations between midlevel visual (oral deformations) and auditory speech features (frequency modulations) to extract detailed spectrotemporal information from visual speech without employing high-level abstractions. Consistent with this hypothesis, we found that the time-frequency dynamics of oral resonances (formants) could be predicted with unexpectedly high precision from the changing shape of the mouth during speech. When isolated from other speech cues, speech-based shape deformations improved perceptual sensitivity for corresponding frequency modulations, suggesting that listeners could exploit this cross-modal correspondence to facilitate perception. To test whether this type of correspondence could improve speech comprehension, we selectively degraded the spectral or temporal dimensions of auditory sentence spectrograms to assess how well visual speech facilitated comprehension under each degradation condition. Visual speech produced drastically larger enhancements during spectral degradation, suggesting a condition-specific facilitation effect driven by cross-modal recovery of auditory speech spectra. The perceptual system may therefore use audiovisual correlations rooted in oral acoustics to extract detailed spectrotemporal information from visual speech.
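The degradation manipulation in the final experiment, selectively smearing the spectral or temporal dimension of a spectrogram, can be sketched with one-dimensional Gaussian filtering along each axis. The chirp stimulus and smoothing widths below are illustrative assumptions.

```python
# Sketch of the degradation manipulation: Gaussian smearing of a
# spectrogram along the frequency axis (spectral degradation) or the time
# axis (temporal degradation). The chirp stimulus and widths are assumed.
import numpy as np
from scipy.ndimage import gaussian_filter1d
from scipy.signal import chirp, spectrogram

fs = 16_000
t = np.arange(0, 1.0, 1 / fs)
sound = chirp(t, f0=300, f1=3000, t1=1.0)  # a rising synthetic "formant"

f, frames, S = spectrogram(sound, fs=fs, nperseg=256, noverlap=192)

spectral_degraded = gaussian_filter1d(S, sigma=8, axis=0)  # smear frequency
temporal_degraded = gaussian_filter1d(S, sigma=8, axis=1)  # smear time

# Smearing flattens structure along the degraded axis.
print("frequency-profile variance, original:", float(S.var(axis=0).mean()))
print("after spectral smearing:             ",
      float(spectral_degraded.var(axis=0).mean()))
```

Resynthesizing audio from the two degraded spectrograms yields the spectral- and temporal-degradation conditions under which visual speech benefits were compared.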
Collapse
Affiliation(s)
- John Plass
- Department of Psychology, University of Michigan, Ann Arbor, MI 48109;
- Department of Psychology, Northwestern University, Evanston, IL 60208
| | - David Brang
- Department of Psychology, University of Michigan, Ann Arbor, MI 48109
| | - Satoru Suzuki
- Department of Psychology, Northwestern University, Evanston, IL 60208
- Interdepartmental Neuroscience Program, Northwestern University, Chicago, IL 60611
| | - Marcia Grabowecky
- Department of Psychology, Northwestern University, Evanston, IL 60208
- Interdepartmental Neuroscience Program, Northwestern University, Chicago, IL 60611
| |
Collapse
|