1
Nentwich M, Leszczynski M, Schroeder CE, Bickel S, Parra LC. Intrinsic dynamic shapes responses to external stimulation in the human brain. bioRxiv 2024:2024.08.05.606665. [PMID: 39463938] [PMCID: PMC11507726] [DOI: 10.1101/2024.08.05.606665]
Abstract
Sensory stimulation of the brain reverberates in its recurrent neuronal networks. However, current computational models of brain activity do not separate immediate sensory responses from intrinsic recurrent dynamics. We apply a vector-autoregressive model with external input (VARX), combining the concepts of "functional connectivity" and "encoding models", to intracranial recordings in humans. We find that the recurrent connectivity observed during rest is largely unaltered during movie watching. This intrinsic recurrent dynamic enhances and prolongs the neural responses to scene cuts, eye movements, and sounds. Failing to account for these exogenous inputs leads to spurious connections in the intrinsic "connectivity". The model shows that an external stimulus can reduce intrinsic noise. It also shows that sensory areas have mostly outgoing connections, whereas higher-order brain areas have mostly incoming connections. We conclude that the response to an external audiovisual stimulus can largely be attributed to the intrinsic dynamic of the brain, already observed during rest.
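The VARX model named here has a compact form: each channel's next sample is a linear function of lagged activity in all channels (the recurrent, "connectivity" part) plus lagged external inputs such as scene cuts or the sound envelope (the "encoding model" part). Below is a minimal least-squares sketch in numpy; the array names, lag orders, and the plain lstsq estimator are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def fit_varx(Y, X, p=2, q=2):
    """Least-squares VARX: Y[t] ~ sum_k A[k] @ Y[t-1-k] + sum_k B[k] @ X[t-1-k].

    Y: (T, n) intracranial channels; X: (T, m) external inputs.
    Returns recurrent coefficients A (p, n, n) and input filters B (q, n, m).
    """
    T, n = Y.shape
    m = X.shape[1]
    lag = max(p, q)
    rows = []
    for t in range(lag, T):
        past_y = Y[t - p:t][::-1].ravel()   # Y[t-1], ..., Y[t-p]
        past_x = X[t - q:t][::-1].ravel()   # X[t-1], ..., X[t-q]
        rows.append(np.concatenate([past_y, past_x]))
    Z = np.asarray(rows)                    # (T - lag, p*n + q*m)
    W, *_ = np.linalg.lstsq(Z, Y[lag:], rcond=None)
    A = W[:p * n].T.reshape(n, p, n).transpose(1, 0, 2)
    B = W[p * n:].T.reshape(n, q, m).transpose(1, 0, 2)
    return A, B
```

Comparing the recurrent coefficients A fitted during rest against those fitted during movie watching, with the exogenous inputs X included, is the kind of contrast the abstract describes.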
Affiliation(s)
- Maximilian Nentwich
- The Feinstein Institutes for Medical Research, Northwell Health, Manhasset, NY, USA
- Marcin Leszczynski
- Departments of Psychiatry and Neurology, Columbia University College of Physicians and Surgeons, New York, NY, USA
- Translational Neuroscience Lab Division, Center for Biomedical Imaging and Neuromodulation, Nathan Kline Institute, Orangeburg, NY, USA
- Cognitive Science Department, Institute of Philosophy, Jagiellonian University, Kraków, Poland
- Charles E Schroeder
- Departments of Psychiatry and Neurology, Columbia University College of Physicians and Surgeons, New York, NY, USA
- Translational Neuroscience Lab Division, Center for Biomedical Imaging and Neuromodulation, Nathan Kline Institute, Orangeburg, NY, USA
- Stephan Bickel
- The Feinstein Institutes for Medical Research, Northwell Health, Manhasset, NY, USA
- Departments of Neurology and Neurosurgery, Zucker School of Medicine at Hofstra/Northwell, Hempstead, NY, USA
- Center for Biomedical Imaging and Neuromodulation, Nathan Kline Institute, Orangeburg, NY, USA
- Lucas C Parra
- Department of Biomedical Engineering, The City College of New York, New York, NY, USA
2
Clonan AC, Zhai X, Stevenson IH, Escabí MA. Interference of mid-level sound statistics underlie human speech recognition sensitivity in natural noise. bioRxiv 2024:2024.02.13.579526. [PMID: 38405870] [PMCID: PMC10888804] [DOI: 10.1101/2024.02.13.579526]
Abstract
Recognizing speech in noise, such as in a busy restaurant, is an essential cognitive skill whose difficulty varies across environments and noise levels. Although there is growing evidence that the auditory system relies on statistical representations for perceiving [1-5] and coding [4,6-9] natural sounds, it is less clear how statistical cues and neural representations contribute to segregating speech in natural auditory scenes. We demonstrate that human listeners rely on mid-level statistics to segregate and recognize speech in environmental noise. Using natural backgrounds and variants with perturbed spectro-temporal statistics, we show that speech recognition accuracy at a fixed noise level varies extensively across natural backgrounds (0% to 100%). Furthermore, for each background the unique interference created by its summary statistics can mask or unmask speech, thus hindering or improving speech recognition. To identify the neural coding strategy and statistical cues that influence accuracy, we developed generalized perceptual regression, a framework that links summary statistics from a neural model to word recognition accuracy. Whereas a peripheral cochlear model accounts for only 60% of the perceptual variance, summary statistics from a mid-level auditory midbrain model accurately predict single-trial sensory judgments, accounting for more than 90% of the perceptual variance. Furthermore, perceptual weights from the regression framework identify which statistics and tuned neural filters are influential and how they impact recognition. Thus, perception of speech in natural backgrounds relies on a mid-level auditory representation in which multiple summary statistics interfere with recognition, beneficially or detrimentally, across natural background sounds.
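The "generalized perceptual regression" step, linking summary statistics of the background sound to single-trial word recognition, has the flavor of a regularized classification problem. A hedged sklearn sketch follows; the feature matrix, trial counts, and the plain logistic link are placeholders, not the authors' exact framework.

```python
import numpy as np
from sklearn.linear_model import LogisticRegressionCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
stats = rng.normal(size=(500, 40))      # per-trial mid-level summary statistics (placeholder)
correct = rng.integers(0, 2, size=500)  # 1 = word recognized on that trial (placeholder)

model = make_pipeline(StandardScaler(),
                      LogisticRegressionCV(Cs=10, cv=5, max_iter=2000))
model.fit(stats, correct)

# "Perceptual weights": signed influence of each statistic on recognition,
# identifying which cues mask or unmask speech.
weights = model[-1].coef_.ravel()
```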
Affiliation(s)
- Alex C Clonan
- Electrical and Computer Engineering, University of Connecticut, Storrs, CT 06269
- Biomedical Engineering, University of Connecticut, Storrs, CT 06269
- Institute of Brain and Cognitive Sciences, University of Connecticut, Storrs, CT 06269
- Xiu Zhai
- Biomedical Engineering, Wentworth Institute of Technology, Boston, MA 02115
- Ian H Stevenson
- Biomedical Engineering, University of Connecticut, Storrs, CT 06269
- Psychological Sciences, University of Connecticut, Storrs, CT 06269
- Institute of Brain and Cognitive Sciences, University of Connecticut, Storrs, CT 06269
- Monty A Escabí
- Electrical and Computer Engineering, University of Connecticut, Storrs, CT 06269
- Psychological Sciences, University of Connecticut, Storrs, CT 06269
- Institute of Brain and Cognitive Sciences, University of Connecticut, Storrs, CT 06269
3
Zada Z, Goldstein A, Michelmann S, Simony E, Price A, Hasenfratz L, Barham E, Zadbood A, Doyle W, Friedman D, Dugan P, Melloni L, Devore S, Flinker A, Devinsky O, Nastase SA, Hasson U. A shared model-based linguistic space for transmitting our thoughts from brain to brain in natural conversations. Neuron 2024; 112:3211-3222.e5. [PMID: 39096896] [PMCID: PMC11427153] [DOI: 10.1016/j.neuron.2024.06.025]
Abstract
Effective communication hinges on a mutual understanding of word meaning in different contexts. We recorded brain activity using electrocorticography during spontaneous, face-to-face conversations in five pairs of epilepsy patients. We developed a model-based coupling framework that aligns brain activity in both speaker and listener to a shared embedding space from a large language model (LLM). The context-sensitive LLM embeddings allow us to track the exchange of linguistic information, word by word, from one brain to another in natural conversations. Linguistic content emerges in the speaker's brain before word articulation and rapidly re-emerges in the listener's brain after word articulation. The contextual embeddings better capture word-by-word neural alignment between speaker and listener than syntactic and articulatory models. Our findings indicate that the contextual embeddings learned by LLMs can serve as an explicit numerical model of the shared, context-rich meaning space humans use to communicate their thoughts to one another.
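One way to picture the coupling framework: take a contextual embedding for every word in the conversation, then fit linear maps from those embeddings to each brain's word-aligned activity at different lags relative to articulation. The sketch below assumes precomputed embeddings and epoched neural arrays with hypothetical shapes; it illustrates the logic rather than reproducing the paper's pipeline.

```python
import numpy as np
from sklearn.linear_model import RidgeCV

def lagged_encoding_score(embeddings, neural, lags):
    """Predict word-aligned neural activity from LLM embeddings at given lags.

    embeddings: (n_words, d) contextual embedding per word.
    neural:     (n_words, n_lags_total, n_elecs) activity around each word onset.
    lags:       lag indices to test (pre-articulation for the speaker,
                post-articulation for the listener).
    """
    scores = []
    for lag in lags:
        y = neural[:, lag, :]
        model = RidgeCV(alphas=np.logspace(-2, 4, 13))
        model.fit(embeddings[::2], y[::2])          # simple odd/even word split
        pred = model.predict(embeddings[1::2])
        r = [np.corrcoef(pred[:, e], y[1::2][:, e])[0, 1]
             for e in range(y.shape[1])]
        scores.append(np.nanmean(r))
    return np.array(scores)
```

Running this with pre-articulation lags on the speaker's electrodes and post-articulation lags on the listener's electrodes mirrors the temporal asymmetry the abstract reports.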
Affiliation(s)
- Zaid Zada
- Princeton Neuroscience Institute and Department of Psychology, Princeton University, Princeton, NJ 08544, USA.
- Ariel Goldstein
- Princeton Neuroscience Institute and Department of Psychology, Princeton University, Princeton, NJ 08544, USA; Department of Cognitive and Brain Sciences and Business School, Hebrew University, Jerusalem 9190501, Israel
- Sebastian Michelmann
- Princeton Neuroscience Institute and Department of Psychology, Princeton University, Princeton, NJ 08544, USA
- Erez Simony
- Princeton Neuroscience Institute and Department of Psychology, Princeton University, Princeton, NJ 08544, USA; Faculty of Engineering, Holon Institute of Technology, Holon 5810201, Israel
- Amy Price
- Princeton Neuroscience Institute and Department of Psychology, Princeton University, Princeton, NJ 08544, USA
- Liat Hasenfratz
- Princeton Neuroscience Institute and Department of Psychology, Princeton University, Princeton, NJ 08544, USA
- Emily Barham
- Princeton Neuroscience Institute and Department of Psychology, Princeton University, Princeton, NJ 08544, USA
- Asieh Zadbood
- Princeton Neuroscience Institute and Department of Psychology, Princeton University, Princeton, NJ 08544, USA; Department of Psychology, Columbia University, New York, NY 10027, USA
- Werner Doyle
- Grossman School of Medicine, New York University, New York, NY 10016, USA
- Daniel Friedman
- Grossman School of Medicine, New York University, New York, NY 10016, USA
- Patricia Dugan
- Grossman School of Medicine, New York University, New York, NY 10016, USA
- Lucia Melloni
- Grossman School of Medicine, New York University, New York, NY 10016, USA
- Sasha Devore
- Grossman School of Medicine, New York University, New York, NY 10016, USA
- Adeen Flinker
- Grossman School of Medicine, New York University, New York, NY 10016, USA; Tandon School of Engineering, New York University, New York, NY 10016, USA
- Orrin Devinsky
- Grossman School of Medicine, New York University, New York, NY 10016, USA
- Samuel A Nastase
- Princeton Neuroscience Institute and Department of Psychology, Princeton University, Princeton, NJ 08544, USA
- Uri Hasson
- Princeton Neuroscience Institute and Department of Psychology, Princeton University, Princeton, NJ 08544, USA
4
Marsicano G, Bertini C, Ronconi L. Decoding cognition in neurodevelopmental, psychiatric and neurological conditions with multivariate pattern analysis of EEG data. Neurosci Biobehav Rev 2024; 164:105795. [PMID: 38977116] [DOI: 10.1016/j.neubiorev.2024.105795]
Abstract
Multivariate pattern analysis (MVPA) of electroencephalographic (EEG) data represents a revolutionary approach to investigate how the brain encodes information. By considering complex interactions among spatio-temporal features at the individual level, MVPA overcomes the limitations of univariate techniques, which often fail to account for the significant inter- and intra-individual neural variability. This is particularly relevant when studying clinical populations, and therefore MVPA of EEG data has recently started to be employed as a tool to study cognition in brain disorders. Here, we review the insights offered by this methodology in the study of anomalous patterns of neural activity in conditions such as autism, ADHD, schizophrenia, dyslexia, neurological and neurodegenerative disorders, within different cognitive domains (perception, attention, memory, consciousness). Despite potential drawbacks that should be attentively addressed, these studies reveal a peculiar sensitivity of MVPA in unveiling dysfunctional and compensatory neurocognitive dynamics of information processing, which often remain blind to traditional univariate approaches. Such higher sensitivity in characterizing individual neurocognitive profiles can provide unique opportunities to optimise assessment and promote personalised interventions.
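For readers new to the approach, the core MVPA move surveyed in this review is to train a classifier on the multichannel pattern at each timepoint and track cross-validated decoding accuracy over time. A minimal sketch with assumed array shapes, not tied to any specific study:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

def time_resolved_decoding(epochs, labels, cv=5):
    """Decode a condition label from the EEG pattern at every timepoint.

    epochs: (n_trials, n_channels, n_times); labels: (n_trials,).
    Returns mean cross-validated accuracy per timepoint.
    """
    clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    return np.array([cross_val_score(clf, epochs[:, :, t], labels, cv=cv).mean()
                     for t in range(epochs.shape[2])])
```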
Affiliation(s)
- Gianluca Marsicano
- Department of Psychology, University of Bologna, Viale Berti Pichat 5, Bologna 40121, Italy; Centre for Studies and Research in Cognitive Neuroscience, University of Bologna, Via Rasi e Spinelli 176, Cesena 47023, Italy.
- Caterina Bertini
- Department of Psychology, University of Bologna, Viale Berti Pichat 5, Bologna 40121, Italy; Centre for Studies and Research in Cognitive Neuroscience, University of Bologna, Via Rasi e Spinelli 176, Cesena 47023, Italy.
- Luca Ronconi
- School of Psychology, Vita-Salute San Raffaele University, Milan, Italy; Division of Neuroscience, IRCCS San Raffaele Scientific Institute, Milan, Italy.
5
Desai M, Field AM, Hamilton LS. A comparison of EEG encoding models using audiovisual stimuli and their unimodal counterparts. PLoS Comput Biol 2024; 20:e1012433. [PMID: 39250485] [PMCID: PMC11412666] [DOI: 10.1371/journal.pcbi.1012433]
Abstract
Communication in the real world is inherently multimodal. When having a conversation, typically sighted and hearing people use both auditory and visual cues to understand one another. For example, objects may make sounds as they move in space, or we may use the movement of a person's mouth to better understand what they are saying in a noisy environment. Still, many neuroscience experiments rely on unimodal stimuli to understand encoding of sensory features in the brain. The extent to which visual information may influence the encoding of auditory information, and vice versa, in natural environments is thus unclear. Here, we addressed this question by recording scalp electroencephalography (EEG) in 11 subjects as they listened to and watched movie trailers in audiovisual (AV), visual-only (V), and audio-only (A) conditions. We then fit linear encoding models that described the relationship between the brain responses and the acoustic, phonetic, and visual information in the stimuli. We also compared whether auditory and visual feature tuning was the same when stimuli were presented in the original AV format versus when the visual or auditory information was removed. In these stimuli, visual and auditory information was relatively uncorrelated, and the trailers included spoken narration over a scene as well as animated or live-action characters talking with and without their face visible. We found that auditory feature tuning was similar in the AV and A-only conditions, and, similarly, tuning for visual information was similar when stimuli were presented with the audio present (AV) and when the audio was removed (V-only). In a cross-prediction analysis, we investigated whether models trained on AV data predicted responses to A-only or V-only test data as well as models trained on unimodal data. Overall, prediction performance using AV training and V-only test sets was similar to using V-only training and test sets, suggesting that the auditory information had a relatively small effect on the EEG. In contrast, prediction performance using AV training and A-only test sets was slightly worse than using matching A-only training and test sets. This suggests that the visual information has a stronger influence on the EEG, though it made no qualitative difference in the derived feature tuning. In effect, our results show that researchers may benefit from the richness of multimodal datasets, which can then be used to answer more than one research question.
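The encoding models and the cross-prediction analysis can be sketched as time-lagged ridge regression (a common TRF-style estimator). The helpers below are a simplified stand-in; lag counts, regularization, and variable names are assumptions rather than the study's settings.

```python
import numpy as np
from sklearn.linear_model import Ridge

def lag_matrix(stim, n_lags):
    """Stack time-lagged copies of stimulus features: (n_times, n_feats * n_lags)."""
    T, F = stim.shape
    out = np.zeros((T, F * n_lags))
    for k in range(n_lags):
        out[k:, k * F:(k + 1) * F] = stim[:T - k]
    return out

def fit_trf(stim, eeg, n_lags=32, alpha=1e3):
    """Forward model: lagged acoustic/phonetic/visual features -> EEG channels."""
    return Ridge(alpha=alpha).fit(lag_matrix(stim, n_lags), eeg)

def score_trf(model, stim, eeg, n_lags=32):
    pred = model.predict(lag_matrix(stim, n_lags))
    return np.array([np.corrcoef(pred[:, c], eeg[:, c])[0, 1]
                     for c in range(eeg.shape[1])])

# Cross-prediction, in the spirit of the abstract: train on the audiovisual
# condition, then test on audio-only data (hypothetical arrays).
# trf_av = fit_trf(features_av, eeg_av)
# r_cross = score_trf(trf_av, features_a, eeg_a)
```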
Affiliation(s)
- Maansi Desai
- Department of Speech, Language, and Hearing Sciences, Moody College of Communication, The University of Texas at Austin, Austin, Texas, United States of America
- Alyssa M Field
- Department of Speech, Language, and Hearing Sciences, Moody College of Communication, The University of Texas at Austin, Austin, Texas, United States of America
- Liberty S Hamilton
- Department of Speech, Language, and Hearing Sciences, Moody College of Communication, The University of Texas at Austin, Austin, Texas, United States of America
- Department of Neurology, Dell Medical School, The University of Texas at Austin, Austin, Texas, United States of America
6
Fantoni M, Federici A, Camponogara I, Handjaras G, Martinelli A, Bednaya E, Ricciardi E, Pavani F, Bottari D. The impact of face masks on face-to-face neural tracking of speech: Auditory and visual obstacles. Heliyon 2024; 10:e34860. [PMID: 39157360] [PMCID: PMC11328033] [DOI: 10.1016/j.heliyon.2024.e34860]
Abstract
Face masks provide fundamental protection against the transmission of respiratory viruses but hamper communication. We estimated the auditory and visual obstacles that face masks impose on communication by measuring the neural tracking of speech. To this end, we recorded the EEG while participants were exposed to naturalistic audio-visual speech, embedded in 5-talker noise, in three contexts: (i) no mask (audio-visual information fully available), (ii) virtual mask (occluded lips, but intact audio), and (iii) real mask (occluded lips and degraded audio). Neural tracking of lip movements and of the sound envelope of speech was measured through backward modeling, that is, by reconstructing stimulus properties from neural activity. Behaviorally, face masks increased perceived listening difficulty and phonological errors in speech content retrieval. At the neural level, we observed that the occlusion of the mouth abolished lip tracking and dampened neural tracking of the speech envelope at the earliest processing stages. By contrast, the degraded acoustic information produced by face mask filtering altered neural tracking of the speech envelope at later processing stages. Finally, a consistent link emerged between the increase in perceived listening difficulty and the drop in reconstruction performance of the speech envelope when attending to a speaker wearing a face mask. These results clearly dissociated the visual and auditory impacts of face masks on the neural tracking of speech. While the visual obstacle hampered the ability to predict and integrate audio-visual speech, the auditory filter impacted neural processing stages typically associated with auditory selective attention. The link between perceived difficulty and the drop in neural tracking also provides evidence of the impact of face masks on the metacognitive levels subtending face-to-face communication.
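Backward modeling, reconstructing stimulus properties from neural activity, reverses the usual mapping: lagged EEG channels predict the speech envelope (or the lip trajectory), and the correlation between reconstructed and actual signals is the tracking measure. A compact sketch under assumed shapes, with a simple half-split instead of the study's validation scheme:

```python
import numpy as np
from sklearn.linear_model import RidgeCV

def reconstruct_envelope(eeg, envelope, n_lags=16):
    """Backward model: map lagged multichannel EEG to the speech envelope.

    eeg: (T, n_channels); envelope: (T,). The brain lags the stimulus, so the
    design matrix holds EEG samples *following* each stimulus sample.
    """
    T, C = eeg.shape
    X = np.zeros((T, C * n_lags))
    for k in range(n_lags):
        X[:T - k, k * C:(k + 1) * C] = eeg[k:]
    half = T // 2
    dec = RidgeCV(alphas=np.logspace(0, 6, 7)).fit(X[:half], envelope[:half])
    r = np.corrcoef(dec.predict(X[half:]), envelope[half:])[0, 1]
    return dec, r
```

Fitting this per condition (no mask, virtual mask, real mask) and comparing the reconstruction correlations is the shape of the analysis described above.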
Affiliation(s)
- M. Fantoni
- MoMiLab, IMT School for Advanced Studies Lucca, Lucca, Italy
- A. Federici
- MoMiLab, IMT School for Advanced Studies Lucca, Lucca, Italy
- G. Handjaras
- MoMiLab, IMT School for Advanced Studies Lucca, Lucca, Italy
- E. Bednaya
- MoMiLab, IMT School for Advanced Studies Lucca, Lucca, Italy
- E. Ricciardi
- MoMiLab, IMT School for Advanced Studies Lucca, Lucca, Italy
- F. Pavani
- Centro Interdipartimentale Mente/Cervello–CIMEC, University of Trento, Italy
- Centro Interuniversitario di Ricerca “Cognizione Linguaggio e Sordità”–CIRCLeS, University of Trento, Italy
- D. Bottari
- MoMiLab, IMT School for Advanced Studies Lucca, Lucca, Italy
7
Yu L, Dugan P, Doyle W, Devinsky O, Friedman D, Flinker A. A left-lateralized dorsolateral prefrontal network for naming. bioRxiv 2024:2024.05.15.594403. [PMID: 38798614] [PMCID: PMC11118423] [DOI: 10.1101/2024.05.15.594403]
Abstract
The ability to connect the form and meaning of a concept, known as word retrieval, is fundamental to human communication. While various input modalities can lead to identical word retrieval, the exact neural dynamics supporting this convergence during daily auditory discourse remain poorly understood. Here, we leveraged neurosurgical electrocorticographic (ECoG) recordings from 48 patients and dissociated two key language networks integral to word retrieval that highly overlap in time and space. Using unsupervised temporal clustering techniques, we found a semantic processing network located in the middle and inferior frontal gyri. This network was distinct from an articulatory planning network in the inferior frontal and precentral gyri, which was agnostic to input modality. Functionally, we confirmed that the semantic processing network encodes word surprisal during sentence perception. Our findings characterize how humans integrate ongoing auditory semantic information over time, a critical linguistic function from passive comprehension to daily discourse.
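"Unsupervised temporal clustering" can be approximated by grouping electrodes by the shape of their trial-averaged response time courses. The k-means sketch below is a generic stand-in for whatever clustering variant the authors used; the input array is hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import normalize

def cluster_temporal_profiles(profiles, n_clusters=2, seed=0):
    """Group electrodes with similar response dynamics.

    profiles: (n_electrodes, n_times) trial-averaged high-gamma time courses
    aligned to stimulus or speech onset (placeholder input).
    """
    X = normalize(profiles)                 # compare shapes, not amplitudes
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    labels = km.fit_predict(X)
    return labels, km.cluster_centers_
```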
Affiliation(s)
- Leyao Yu
- Department of Biomedical Engineering, New York University, New York, 10016, New York, the United States
- Department of Neurology, School of Medicine, New York University, New York, 10016, New York, the United States
- Patricia Dugan
- Department of Neurology, School of Medicine, New York University, New York, 10016, New York, the United States
- Werner Doyle
- Department of Neurosurgery, School of Medicine, New York University, New York, 10016, New York, the United States
- Orrin Devinsky
- Department of Neurology, School of Medicine, New York University, New York, 10016, New York, the United States
- Daniel Friedman
- Department of Neurology, School of Medicine, New York University, New York, 10016, New York, the United States
- Adeen Flinker
- Department of Biomedical Engineering, New York University, New York, 10016, New York, the United States
- Department of Neurology, School of Medicine, New York University, New York, 10016, New York, the United States
8
Müller R. Bioinspiration from bats and new paradigms for autonomy in natural environments. Bioinspir Biomim 2024; 19:033001. [PMID: 38452384] [DOI: 10.1088/1748-3190/ad311e]
Abstract
Achieving autonomous operation in complex natural environments remains an unsolved challenge. Conventional engineering approaches to this problem have focused on collecting large amounts of sensory data that are used to create detailed digital models of the environment. However, this only postpones the challenge of identifying the relevant sensory information and linking it to action control, shifting it into the domain of the digital world model. Furthermore, it imposes high demands in terms of computing power and introduces large processing latencies that hamper autonomous real-time performance. Certain species of bats that are able to navigate and hunt their prey in dense vegetation could be a biological model system for an alternative approach to addressing the fundamental issues associated with autonomy in complex natural environments. Bats navigating in dense vegetation rely on clutter echoes, i.e. signals that consist of unresolved contributions from many scatterers. Yet the animals are able to extract the relevant information from these input signals with brains that are often less than 1 g in mass. Pilot results indicate that information relevant to location identification and passageway finding can be obtained directly from clutter echoes, opening up the possibility that the bats' skills can be replicated in man-made autonomous systems.
Affiliation(s)
- Rolf Müller
- Department of Mechanical Engineering, Virginia Tech, Blacksburg, VA 24061, United States of America
9
Simon A, Bech S, Loquet G, Østergaard J. Cortical linear encoding and decoding of sounds: Similarities and differences between naturalistic speech and music listening. Eur J Neurosci 2024; 59:2059-2074. [PMID: 38303522] [DOI: 10.1111/ejn.16265]
Abstract
Linear models are becoming increasingly popular for investigating brain activity in response to continuous and naturalistic stimuli. In the context of auditory perception, these predictive models can be 'encoding' models, in which stimulus features are used to predict brain activity, or 'decoding' models, in which neural features are used to reconstruct the audio stimuli. These linear models are a central component of some brain-computer interfaces that can be integrated into hearing assistive devices (e.g., hearing aids). Such advanced neurotechnologies have been widely investigated for speech stimuli but rarely for music. Recent attempts at neural tracking of music show that reconstruction performance is reduced compared with speech decoding. The present study investigates the performance of stimulus reconstruction and electroencephalogram prediction (decoding and encoding models) based on cortical entrainment to the temporal variations of the audio stimuli for both music and speech listening. Three hypotheses that may explain the differences between speech and music reconstruction were tested to assess the importance of speech-specific acoustic and linguistic factors. While the results obtained with encoding models suggest different underlying cortical processing for speech and music listening, no differences were found in terms of reconstruction of the stimuli or prediction of the cortical data. The results suggest that envelope-based linear modelling can be used to study both speech and music listening, despite the differences in the underlying cortical mechanisms.
Affiliation(s)
- Adèle Simon
- Artificial Intelligence and Sound, Department of Electronic Systems, Aalborg University, Aalborg, Denmark
- Research Department, Bang & Olufsen A/S, Struer, Denmark
- Søren Bech
- Artificial Intelligence and Sound, Department of Electronic Systems, Aalborg University, Aalborg, Denmark
- Research Department, Bang & Olufsen A/S, Struer, Denmark
- Gérard Loquet
- Department of Audiology and Speech Pathology, University of Melbourne, Melbourne, Victoria, Australia
- Jan Østergaard
- Artificial Intelligence and Sound, Department of Electronic Systems, Aalborg University, Aalborg, Denmark
10
Gonzalez JE, Nieto N, Brusco P, Gravano A, Kamienkowski JE. Speech-induced suppression during natural dialogues. Commun Biol 2024; 7:291. [PMID: 38459110] [PMCID: PMC10923813] [DOI: 10.1038/s42003-024-05945-9]
Abstract
When engaged in a conversation, one receives auditory information both from the other's speech and from one's own speech. However, these two signals are processed differently, an effect known as speech-induced suppression (SIS). Here, we studied the brain's representation of the acoustic properties of speech in natural unscripted dialogues, using electroencephalography (EEG) and high-quality speech recordings from both participants. Using encoding techniques, we reproduced a broad range of previous findings on listening to another's speech, achieving even better performance when predicting the EEG signal in this complex scenario. Furthermore, we found no response when participants listened to their own speech, across different acoustic features (spectrogram, envelope, etc.) and frequency bands, evidencing a strong SIS effect. The present work shows that this mechanism is present, and even stronger, during natural dialogues. Moreover, the methodology presented here opens the possibility of a deeper understanding of the related mechanisms in a wider range of contexts.
Affiliation(s)
- Joaquin E Gonzalez
- Laboratorio de Inteligencia Artificial Aplicada, Instituto de Ciencias de la Computación (Universidad de Buenos Aires - Consejo Nacional de Investigaciones Cientificas y Tecnicas), Buenos Aires, Argentina.
- Nicolás Nieto
- Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional, sinc(i) (Universidad Nacional del Litoral - Consejo Nacional de Investigaciones Cientificas y Tecnicas), Santa Fe, Argentina
- Instituto de Matemática Aplicada del Litoral, IMAL-UNL/CONICET, Santa Fe, Argentina
- Pablo Brusco
- Departamento de Computación, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires, Buenos Aires, Argentina
- Agustín Gravano
- Laboratorio de Inteligencia Artificial, Universidad Torcuato Di Tella, Buenos Aires, Argentina
- Escuela de Negocios, Universidad Torcuato Di Tella, Buenos Aires, Argentina
- Consejo Nacional de Investigaciones Científicas y Técnicas, Buenos Aires, Argentina
- Juan E Kamienkowski
- Laboratorio de Inteligencia Artificial Aplicada, Instituto de Ciencias de la Computación (Universidad de Buenos Aires - Consejo Nacional de Investigaciones Cientificas y Tecnicas), Buenos Aires, Argentina
- Departamento de Computación, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires, Buenos Aires, Argentina
- Maestria de Explotación de Datos y Descubrimiento del Conocimiento, Facultad de Ciencias Exactas y Naturales - Facultad de Ingenieria, Universidad de Buenos Aires, Buenos Aires, Argentina
11
Sankaran N, Leonard MK, Theunissen F, Chang EF. Encoding of melody in the human auditory cortex. Sci Adv 2024; 10:eadk0010. [PMID: 38363839] [PMCID: PMC10871532] [DOI: 10.1126/sciadv.adk0010]
Abstract
Melody is a core component of music in which discrete pitches are serially arranged to convey emotion and meaning. Perception varies along several pitch-based dimensions: (i) the absolute pitch of notes, (ii) the difference in pitch between successive notes, and (iii) the statistical expectation of each note given prior context. How the brain represents these dimensions and whether their encoding is specialized for music remains unknown. We recorded high-density neurophysiological activity directly from the human auditory cortex while participants listened to Western musical phrases. Pitch, pitch-change, and expectation were selectively encoded at different cortical sites, indicating a spatial map for representing distinct melodic dimensions. The same participants listened to spoken English, and we compared responses to music and speech. Cortical sites selective for music encoded expectation, while sites that encoded pitch and pitch-change in music used the same neural code to represent equivalent properties of speech. Findings reveal how the perception of melody recruits both music-specific and general-purpose sound representations.
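Claims that a site "selectively encodes" pitch, pitch-change, or expectation are naturally cast as variance partitioning: how much does held-out prediction accuracy drop when one feature set is removed from an encoding model? A hedged sketch; the feature matrices and the half-split are illustrative, not the paper's procedure.

```python
import numpy as np
from sklearn.linear_model import RidgeCV

def unique_variance(features, neural, target_set):
    """Unique contribution of one melodic feature set at a single electrode.

    features: dict mapping names ('pitch', 'pitch_change', 'expectation')
              to (n_times, k) regressor matrices; neural: (n_times,).
    Returns full-model R^2 minus R^2 of a model lacking `target_set`.
    """
    def held_out_r2(X):
        half = len(neural) // 2
        m = RidgeCV(alphas=np.logspace(-2, 5, 8)).fit(X[:half], neural[:half])
        return m.score(X[half:], neural[half:])
    full = np.hstack(list(features.values()))
    reduced = np.hstack([v for k, v in features.items() if k != target_set])
    return held_out_r2(full) - held_out_r2(reduced)
```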
Affiliation(s)
- Narayan Sankaran
- Department of Neurological Surgery, University of California, San Francisco, 675 Nelson Rising Lane, San Francisco, CA 94158, USA
- Matthew K. Leonard
- Department of Neurological Surgery, University of California, San Francisco, 675 Nelson Rising Lane, San Francisco, CA 94158, USA
- Frederic Theunissen
- Department of Psychology, University of California, Berkeley, 2121 Berkeley Way, Berkeley, CA 94720, USA
- Edward F. Chang
- Department of Neurological Surgery, University of California, San Francisco, 675 Nelson Rising Lane, San Francisco, CA 94158, USA
12
MacIntyre AD, Carlyon RP, Goehring T. Neural Decoding of the Speech Envelope: Effects of Intelligibility and Spectral Degradation. Trends Hear 2024; 28:23312165241266316. [PMID: 39183533] [PMCID: PMC11345737] [DOI: 10.1177/23312165241266316]
Abstract
During continuous speech perception, endogenous neural activity becomes time-locked to acoustic stimulus features, such as the speech amplitude envelope. This speech-brain coupling can be decoded using non-invasive brain imaging techniques, including electroencephalography (EEG). Neural decoding may have clinical use as an objective measure of stimulus encoding by the brain, for example during cochlear implant listening, wherein the speech signal is severely spectrally degraded. Yet, interplay between acoustic and linguistic factors may lead to top-down modulation of perception, thereby complicating audiological applications. To address this ambiguity, we assessed neural decoding of the speech envelope under spectral degradation with EEG in acoustically hearing listeners (n = 38; 18-35 years old) using vocoded speech. We dissociated sensory encoding from higher-order processing by employing intelligible (English) and non-intelligible (Dutch) stimuli, with auditory attention sustained using a repeated-phrase detection task. Subject-specific and group decoders were trained to reconstruct the speech envelope from held-out EEG data, with decoder significance determined via random permutation testing. Whereas speech envelope reconstruction did not vary with spectral resolution, intelligible speech was associated with better decoding accuracy in general. Results were similar across subject-specific and group analyses, with less consistent effects of spectral degradation in group decoding. Permutation tests revealed possible differences in decoder statistical significance by experimental condition. In general, while robust neural decoding was observed at the individual and group levels, variability within participants would most likely prevent the clinical use of such a measure to differentiate levels of spectral degradation and intelligibility on an individual basis.
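Decoder significance by random permutation, as used here, is often implemented with circular shifts of the target signal, which preserve its autocorrelation while destroying its alignment with the EEG. A generic sketch; the study's exact permutation scheme may differ.

```python
import numpy as np

def permutation_pvalue(observed_r, eeg, envelope, score_fn, n_perm=1000, seed=0):
    """P-value for a reconstruction score against a circular-shift null.

    score_fn(eeg, envelope) -> reconstruction correlation for one pairing.
    """
    rng = np.random.default_rng(seed)
    T = len(envelope)
    null = np.empty(n_perm)
    for i in range(n_perm):
        shift = rng.integers(T // 4, 3 * T // 4)     # avoid trivial small shifts
        null[i] = score_fn(eeg, np.roll(envelope, shift))
    return (np.sum(null >= observed_r) + 1) / (n_perm + 1)
```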
Affiliation(s)
- Robert P. Carlyon
- MRC Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, UK
- Tobias Goehring
- MRC Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, UK
13
Winters DE, Dugré JR, Sakai JT, Carter RM. Executive function and underlying brain network distinctions for callous-unemotional traits and conduct problems in adolescents. bioRxiv 2023:2023.10.31.565009. [PMID: 37961691] [PMCID: PMC10635075] [DOI: 10.1101/2023.10.31.565009]
Abstract
The complexity of executive function (EF) impairments in the youth antisocial phenotypes of callous-unemotional (CU) traits and conduct problems (CP) makes it challenging to identify phenotype-specific EF deficits. These challenges can be addressed by (1) accounting for EF measurement error and (2) testing distinct functional brain properties that account for differences in EF. We therefore employed a latent modeling approach for EFs (inhibition, shifting, fluency, common EF) and extracted connection density matching contemporary EF brain models in a sample of 112 adolescents (ages 13-17, 42% female). Path analysis indicated that CU traits were associated with lower inhibition. Inhibition-network density was positively associated with inhibition, but this association was strengthened by CU traits and attenuated by CP. Common EF was associated with three-way density-by-CP-by-CU interactions for the inhibition and shifting networks. This suggests that those higher in CU traits require their brain to work harder to achieve their (lower) inhibition, whereas those higher in CP have difficulty engaging inhibitory brain responses. Additionally, those with CP interacting with CU show distinct brain patterns for a more general EF capacity. Importantly, modeling cross-network connection density in contemporary EF models to test EF involvement in the core impairments of CU and CP may accelerate our understanding of EF in these phenotypes.
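The reported interactions (density by CP by CU) correspond to moderated regression with product terms. A hypothetical statsmodels sketch, with invented column names standing in for the latent EF scores, network densities, and trait measures:

```python
import statsmodels.formula.api as smf

def fit_moderation(df):
    """Three-way moderation: common EF ~ inhibition-network density,
    moderated by conduct problems (cp) and callous-unemotional traits (cu).

    df: pandas DataFrame with columns common_ef, inhib_density, cp, cu
    (hypothetical names). The `*` operator expands all main effects and
    interaction terms, including the density x cp x cu product.
    """
    model = smf.ols("common_ef ~ inhib_density * cp * cu", data=df).fit()
    return model
```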
Affiliation(s)
- Drew E. Winters
- Department of Psychiatry, University of Colorado School of Medicine, Anschutz Medical Campus
- Jules R Dugré
- School of Psychology and Centre for Human Brain Health, University of Birmingham, Birmingham, UK
- Joseph T. Sakai
- Department of Psychiatry, University of Colorado School of Medicine, Anschutz Medical Campus
- R. McKell Carter
- Department of Psychology & Neuroscience, University of Colorado Boulder, Boulder, CO, USA
- Institute of Cognitive Science, University of Colorado Boulder, Boulder, CO, USA; Department of Electrical, Computer and Energy Engineering, University of Colorado Boulder, Boulder, CO, USA
14
Sankaran N, Leonard MK, Theunissen F, Chang EF. Encoding of melody in the human auditory cortex. bioRxiv 2023:2023.10.17.562771. [PMID: 37905047] [PMCID: PMC10614915] [DOI: 10.1101/2023.10.17.562771]
Abstract
Melody is a core component of music in which discrete pitches are serially arranged to convey emotion and meaning. Perception of melody varies along several pitch-based dimensions: (1) the absolute pitch of notes, (2) the difference in pitch between successive notes, and (3) the higher-order statistical expectation of each note conditioned on its prior context. While humans readily perceive melody, how these dimensions are collectively represented in the brain and whether their encoding is specialized for music remains unknown. Here, we recorded high-density neurophysiological activity directly from the surface of human auditory cortex while Western participants listened to Western musical phrases. Pitch, pitch-change, and expectation were selectively encoded at different cortical sites, indicating a spatial code for representing distinct dimensions of melody. The same participants listened to spoken English, and we compared evoked responses to music and speech. Cortical sites selective for music were systematically driven by the encoding of expectation. In contrast, sites that encoded pitch and pitch-change used the same neural code to represent equivalent properties of speech. These findings reveal the multidimensional nature of melody encoding, consisting of both music-specific and domain-general sound representations in auditory cortex. Teaser: The human brain contains both general-purpose and music-specific neural populations for processing distinct attributes of melody.
15
Stephen EP, Li Y, Metzger S, Oganian Y, Chang EF. Latent neural dynamics encode temporal context in speech. Hear Res 2023; 437:108838. [PMID: 37441880] [PMCID: PMC11182421] [DOI: 10.1016/j.heares.2023.108838]
Abstract
Direct neural recordings from human auditory cortex have demonstrated encoding for acoustic-phonetic features of consonants and vowels. Neural responses also encode distinct acoustic amplitude cues related to timing, such as those that occur at the onset of a sentence after a silent period or the onset of the vowel in each syllable. Here, we used a group reduced rank regression model to show that distributed cortical responses support a low-dimensional latent state representation of temporal context in speech. The timing cues each capture more unique variance than all other phonetic features and exhibit rotational or cyclical dynamics in latent space from activity that is widespread over the superior temporal gyrus. We propose that these spatially distributed timing signals could serve to provide temporal context for, and possibly bind across time, the concurrent processing of individual phonetic features, to compose higher-order phonological (e.g. word-level) representations.
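Reduced-rank regression is what licenses the talk of low-dimensional latent states: the stimulus-to-brain map is constrained to pass through a small number of latent dimensions, whose time courses can then be inspected for rotational or cyclical structure. A minimal two-step estimator (full least squares, then SVD truncation); this is the textbook version, not necessarily the paper's group variant.

```python
import numpy as np

def reduced_rank_regression(X, Y, rank):
    """Rank-constrained linear map from stimulus features X (T, f) to
    neural data Y (T, n)."""
    B_ols, *_ = np.linalg.lstsq(X, Y, rcond=None)   # (f, n) unconstrained fit
    Yhat = X @ B_ols
    _, _, Vt = np.linalg.svd(Yhat, full_matrices=False)
    V = Vt[:rank].T                                 # dominant output directions
    B_rrr = B_ols @ V @ V.T                         # rank-constrained weights
    latents = X @ B_ols @ V                         # (T, rank) latent dynamics
    return B_rrr, latents
```

The columns of `latents` are the low-dimensional trajectories one would inspect for the cyclical timing dynamics described above.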
Affiliation(s)
- Emily P Stephen
- Department of Neurological Surgery, University of California San Francisco, San Francisco, CA 94143, United States; Department of Mathematics and Statistics, Boston University, Boston, MA 02215, United States
- Yuanning Li
- Department of Neurological Surgery, University of California San Francisco, San Francisco, CA 94143, United States; School of Biomedical Engineering, ShanghaiTech University, Shanghai, China
- Sean Metzger
- Department of Neurological Surgery, University of California San Francisco, San Francisco, CA 94143, United States
- Yulia Oganian
- Department of Neurological Surgery, University of California San Francisco, San Francisco, CA 94143, United States; Center for Integrative Neuroscience, University of Tübingen, Tübingen, Germany
- Edward F Chang
- Department of Neurological Surgery, University of California San Francisco, San Francisco, CA 94143, United States.
16
LeBel A, Wagner L, Jain S, Adhikari-Desai A, Gupta B, Morgenthal A, Tang J, Xu L, Huth AG. A natural language fMRI dataset for voxelwise encoding models. Sci Data 2023; 10:555. [PMID: 37612332] [PMCID: PMC10447563] [DOI: 10.1038/s41597-023-02437-z]
Abstract
Speech comprehension is a complex process that draws on humans' abilities to extract lexical information, parse syntax, and form semantic understanding. These sub-processes have traditionally been studied using separate neuroimaging experiments that attempt to isolate specific effects of interest. More recently it has become possible to study all stages of language comprehension in a single neuroimaging experiment using narrative natural language stimuli. The resulting data are richly varied at every level, enabling analyses that can probe everything from spectral representations to high-level representations of semantic meaning. We provide a dataset containing BOLD fMRI responses recorded while 8 participants each listened to 27 complete, natural, narrative stories (~6 hours). This dataset includes pre-processed and raw MRIs, as well as hand-constructed 3D cortical surfaces for each participant. To address the challenges of analyzing naturalistic data, this dataset is accompanied by a python library containing basic code for creating voxelwise encoding models. Altogether, this dataset provides a large and novel resource for understanding speech and language processing in the human brain.
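The essential recipe behind voxelwise encoding models, which the accompanying python library packages up, is a stimulus feature matrix duplicated at several TR delays (to absorb the hemodynamic lag) plus ridge regression to every voxel. A self-contained sketch with placeholder arrays, not the dataset's actual loaders or the library's API:

```python
import numpy as np
from sklearn.linear_model import Ridge

def delayed_design(features, delays=(1, 2, 3, 4)):
    """FIR-style design matrix: feature copies shifted by several TRs."""
    T, F = features.shape
    X = np.zeros((T, F * len(delays)))
    for i, d in enumerate(delays):
        X[d:, i * F:(i + 1) * F] = features[:T - d]
    return X

rng = np.random.default_rng(0)
feats = rng.normal(size=(1200, 300))    # e.g., word-embedding features per TR
bold = rng.normal(size=(1200, 5000))    # BOLD responses per voxel (placeholder)

half = 600
X = delayed_design(feats)
enc = Ridge(alpha=100.0).fit(X[:half], bold[:half])
pred = enc.predict(X[half:])
voxel_r = np.array([np.corrcoef(pred[:, v], bold[half:, v])[0, 1]
                    for v in range(bold.shape[1])])   # prediction per voxel
```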
Affiliation(s)
- Amanda LeBel
- Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, CA, 94704, USA
- Lauren Wagner
- Department of Psychiatry and Biobehavioral Sciences, University of California, Los Angeles, CA, 90095, USA
- Shailee Jain
- Department of Computer Science, The University of Texas at Austin, Austin, TX, 78712, USA
- Aneesh Adhikari-Desai
- Department of Computer Science, The University of Texas at Austin, Austin, TX, 78712, USA
- Department of Neuroscience, The University of Texas at Austin, Austin, TX, 78712, USA
- Bhavin Gupta
- Department of Computer Science, The University of Texas at Austin, Austin, TX, 78712, USA
- Allyson Morgenthal
- Department of Neuroscience, The University of Texas at Austin, Austin, TX, 78712, USA
- Jerry Tang
- Department of Computer Science, The University of Texas at Austin, Austin, TX, 78712, USA
- Lixiang Xu
- Department of Physics, The University of Texas at Austin, Austin, TX, 78712, USA
- Alexander G Huth
- Department of Computer Science, The University of Texas at Austin, Austin, TX, 78712, USA.
- Department of Neuroscience, The University of Texas at Austin, Austin, TX, 78712, USA.
17
Bellier L, Llorens A, Marciano D, Gunduz A, Schalk G, Brunner P, Knight RT. Music can be reconstructed from human auditory cortex activity using nonlinear decoding models. PLoS Biol 2023; 21:e3002176. [PMID: 37582062] [PMCID: PMC10427021] [DOI: 10.1371/journal.pbio.3002176]
Abstract
Music is core to human experience, yet the precise neural dynamics underlying music perception remain unknown. We analyzed a unique intracranial electroencephalography (iEEG) dataset of 29 patients who listened to a Pink Floyd song and applied a stimulus reconstruction approach previously used in the speech domain. We successfully reconstructed a recognizable song from direct neural recordings and quantified the impact of different factors on decoding accuracy. Combining encoding and decoding analyses, we found a right-hemisphere dominance for music perception with a primary role of the superior temporal gyrus (STG), evidenced a new STG subregion tuned to musical rhythm, and defined an anterior-posterior STG organization exhibiting sustained and onset responses to musical elements. Our findings show the feasibility of applying predictive modeling on short datasets acquired in single patients, paving the way for adding musical elements to brain-computer interface (BCI) applications.
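The phrase "nonlinear decoding models" can be illustrated by swapping the usual linear reconstruction for a small multilayer perceptron mapping lagged high-gamma activity to auditory spectrogram bands. The sketch below uses sklearn's MLPRegressor as a stand-in; shapes and hyperparameters are assumptions, not the paper's models.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

def fit_nonlinear_decoder(hga, mel, n_lags=20):
    """Reconstruct spectrogram bands from lagged neural activity.

    hga: (T, n_electrodes) high-gamma power; mel: (T, n_bands) target
    auditory spectrogram (both placeholder arrays).
    """
    T, C = hga.shape
    X = np.zeros((T, C * n_lags))
    for k in range(n_lags):
        X[:T - k, k * C:(k + 1) * C] = hga[k:]      # neural samples after t
    dec = make_pipeline(StandardScaler(),
                        MLPRegressor(hidden_layer_sizes=(128,), max_iter=500))
    half = T // 2
    dec.fit(X[:half], mel[:half])
    pred = dec.predict(X[half:])
    r = np.array([np.corrcoef(pred[:, b], mel[half:, b])[0, 1]
                  for b in range(mel.shape[1])])
    return dec, r                                    # decoder + per-band accuracy
```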
Affiliation(s)
- Ludovic Bellier
- Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, California, United States of America
- Anaïs Llorens
- Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, California, United States of America
- Déborah Marciano
- Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, California, United States of America
- Aysegul Gunduz
- J. Crayton Pruitt Family Department of Biomedical Engineering, University of Florida, Gainesville, Florida, United States of America
- Gerwin Schalk
- Department of Neurology, Albany Medical College, Albany, New York, United States of America
- Peter Brunner
- Department of Neurology, Albany Medical College, Albany, New York, United States of America
- Department of Neurosurgery, Washington University School of Medicine, St. Louis, Missouri, United States of America
- National Center for Adaptive Neurotechnologies, Albany, New York, United States of America
- Robert T. Knight
- Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, California, United States of America
- Department of Psychology, University of California, Berkeley, Berkeley, California, United States of America
18
Meng K, Goodarzy F, Kim E, Park YJ, Kim JS, Cook MJ, Chung CK, Grayden DB. Continuous synthesis of artificial speech sounds from human cortical surface recordings during silent speech production. J Neural Eng 2023; 20:046019. [PMID: 37459853] [DOI: 10.1088/1741-2552/ace7f6]
Abstract
Objective. Brain-computer interfaces can restore various forms of communication in paralyzed patients who have lost their ability to articulate intelligible speech. This study aimed to demonstrate the feasibility of closed-loop synthesis of artificial speech sounds from human cortical surface recordings during silent speech production. Approach. Ten participants with intractable epilepsy were temporarily implanted with intracranial electrode arrays over cortical surfaces. A decoding model that predicted audible outputs directly from patient-specific neural feature inputs was trained during overt word reading and immediately tested with overt, mimed and imagined word reading. Predicted outputs were later assessed objectively against corresponding voice recordings and subjectively through human perceptual judgments. Main results. Artificial speech sounds were successfully synthesized during overt and mimed utterances by two participants with some coverage of the precentral gyrus. About a third of these sounds were correctly identified by naïve listeners in two-alternative forced-choice tasks. A similar outcome could not be achieved during imagined utterances by any of the participants. However, neural feature contribution analyses suggested the presence of exploitable activation patterns during imagined speech in the postcentral gyrus and the superior temporal gyrus. In future work, a more comprehensive coverage of cortical surfaces, including posterior parts of the middle frontal gyrus and the inferior frontal gyrus, could improve synthesis performance during imagined speech. Significance. As the field of speech neuroprostheses is rapidly moving toward clinical trials, this study addressed important considerations about task instructions and brain coverage when conducting research on silent speech with non-target participants.
Affiliation(s)
- Kevin Meng
- Department of Biomedical Engineering, The University of Melbourne, Melbourne, Australia
- Graeme Clark Institute for Biomedical Engineering, The University of Melbourne, Melbourne, Australia
- Farhad Goodarzy
- Department of Medicine, St Vincent's Hospital, The University of Melbourne, Melbourne, Australia
- EuiYoung Kim
- Interdisciplinary Program in Neuroscience, Seoul National University, Seoul, Republic of Korea
- Ye Jin Park
- Department of Brain and Cognitive Sciences, Seoul National University, Seoul, Republic of Korea
- June Sic Kim
- Research Institute of Basic Sciences, Seoul National University, Seoul, Republic of Korea
- Mark J Cook
- Department of Biomedical Engineering, The University of Melbourne, Melbourne, Australia
- Graeme Clark Institute for Biomedical Engineering, The University of Melbourne, Melbourne, Australia
- Department of Medicine, St Vincent's Hospital, The University of Melbourne, Melbourne, Australia
- Chun Kee Chung
- Department of Brain and Cognitive Sciences, Seoul National University, Seoul, Republic of Korea
- Department of Neurosurgery, Seoul National University Hospital, Seoul, Republic of Korea
- David B Grayden
- Department of Biomedical Engineering, The University of Melbourne, Melbourne, Australia
- Graeme Clark Institute for Biomedical Engineering, The University of Melbourne, Melbourne, Australia
- Department of Medicine, St Vincent's Hospital, The University of Melbourne, Melbourne, Australia
19
Zada Z, Goldstein A, Michelmann S, Simony E, Price A, Hasenfratz L, Barham E, Zadbood A, Doyle W, Friedman D, Dugan P, Melloni L, Devore S, Flinker A, Devinsky O, Nastase SA, Hasson U. A shared linguistic space for transmitting our thoughts from brain to brain in natural conversations. bioRxiv 2023:2023.06.27.546708. [PMID: 37425747] [PMCID: PMC10327051] [DOI: 10.1101/2023.06.27.546708]
Abstract
Effective communication hinges on a mutual understanding of word meaning in different contexts. The embedding space learned by large language models can serve as an explicit model of the shared, context-rich meaning space humans use to communicate their thoughts. We recorded brain activity using electrocorticography during spontaneous, face-to-face conversations in five pairs of epilepsy patients. We demonstrate that the linguistic embedding space can capture the linguistic content of word-by-word neural alignment between speaker and listener. Linguistic content emerged in the speaker's brain before word articulation, and the same linguistic content rapidly reemerged in the listener's brain after word articulation. These findings establish a computational framework to study how human brains transmit their thoughts to one another in real-world contexts.
Affiliation(s)
- Zaid Zada
- Princeton Neuroscience Institute and Department of Psychology, Princeton University; New Jersey, 08544, USA
- Ariel Goldstein
- Princeton Neuroscience Institute and Department of Psychology, Princeton University; New Jersey, 08544, USA
- Department of Cognitive and Brain Sciences and Business School, Hebrew University; Jerusalem, 9190501, Israel
- Sebastian Michelmann
- Princeton Neuroscience Institute and Department of Psychology, Princeton University; New Jersey, 08544, USA
- Erez Simony
- Princeton Neuroscience Institute and Department of Psychology, Princeton University; New Jersey, 08544, USA
- Faculty of Engineering, Holon Institute of Technology, Holon, 5810201, Israel
- Amy Price
- Princeton Neuroscience Institute and Department of Psychology, Princeton University; New Jersey, 08544, USA
- Liat Hasenfratz
- Princeton Neuroscience Institute and Department of Psychology, Princeton University; New Jersey, 08544, USA
- Emily Barham
- Princeton Neuroscience Institute and Department of Psychology, Princeton University; New Jersey, 08544, USA
- Asieh Zadbood
- Princeton Neuroscience Institute and Department of Psychology, Princeton University; New Jersey, 08544, USA
- Department of Psychology, Columbia University; New York, 10027, USA
- Werner Doyle
- Grossman School of Medicine, New York University; New York, 10016, USA
- Daniel Friedman
- Grossman School of Medicine, New York University; New York, 10016, USA
- Patricia Dugan
- Grossman School of Medicine, New York University; New York, 10016, USA
- Lucia Melloni
- Grossman School of Medicine, New York University; New York, 10016, USA
- Sasha Devore
- Grossman School of Medicine, New York University; New York, 10016, USA
- Adeen Flinker
- Grossman School of Medicine, New York University; New York, 10016, USA
- Tandon School of Engineering, New York University; New York, 10016, USA
- Orrin Devinsky
- Grossman School of Medicine, New York University; New York, 10016, USA
- Samuel A. Nastase
- Princeton Neuroscience Institute and Department of Psychology, Princeton University; New Jersey, 08544, USA
- Uri Hasson
- Princeton Neuroscience Institute and Department of Psychology, Princeton University; New Jersey, 08544, USA
20
Raghavan VS, O’Sullivan J, Bickel S, Mehta AD, Mesgarani N. Distinct neural encoding of glimpsed and masked speech in multitalker situations. PLoS Biol 2023; 21:e3002128. [PMID: 37279203] [PMCID: PMC10243639] [DOI: 10.1371/journal.pbio.3002128]
Abstract
Humans can easily tune in to one talker in a multitalker environment while still picking up bits of background speech; however, it remains unclear how we perceive speech that is masked and to what degree non-target speech is processed. Some models suggest that perception can be achieved through glimpses, which are spectrotemporal regions where a talker has more energy than the background. Other models, however, require the recovery of the masked regions. To clarify this issue, we directly recorded from primary and non-primary auditory cortex (AC) in neurosurgical patients as they attended to one talker in multitalker speech and trained temporal response function models to predict high-gamma neural activity from glimpsed and masked stimulus features. We found that glimpsed speech is encoded at the level of phonetic features for target and non-target talkers, with enhanced encoding of target speech in non-primary AC. In contrast, encoding of masked phonetic features was found only for the target, with a greater response latency and distinct anatomical organization compared to glimpsed phonetic features. These findings suggest separate mechanisms for encoding glimpsed and masked speech and provide neural evidence for the glimpsing model of speech perception.
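The glimpsing model referenced here has a simple operational core: a spectrotemporal bin counts as "glimpsed" when the target talker carries more energy than the background in that bin, and as "masked" otherwise. A sketch of the mask computation (the decibel margin and spectrogram details are assumptions):

```python
import numpy as np

def glimpse_mask(target_db, background_db, margin_db=0.0):
    """Boolean mask of spectrotemporal bins dominated by the target talker.

    target_db, background_db: (n_freqs, n_times) log-power spectrograms of
    the separately recorded target and background (placeholder inputs).
    """
    return target_db > background_db + margin_db

# Usage sketch: split stimulus feature streams into glimpsed vs masked
# components with this mask before fitting temporal response function models,
# mirroring the contrast described in the abstract.
```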
Collapse
Affiliation(s)
- Vinay S Raghavan
- Department of Electrical Engineering, Columbia University, New York, New York, United States of America
- Zuckerman Mind Brain Behavior Institute, Columbia University, New York, New York, United States of America
| | - James O’Sullivan
- Department of Electrical Engineering, Columbia University, New York, New York, United States of America
- Zuckerman Mind Brain Behavior Institute, Columbia University, New York, New York, United States of America
| | - Stephan Bickel
- The Feinstein Institutes for Medical Research, Northwell Health, Manhasset, New York, United States of America
- Department of Neurosurgery, Zucker School of Medicine at Hofstra/Northwell, Hempstead, New York, United States of America
- Department of Neurology, Zucker School of Medicine at Hofstra/Northwell, Hempstead, New York, United States of America
| | - Ashesh D. Mehta
- The Feinstein Institutes for Medical Research, Northwell Health, Manhasset, New York, United States of America
- Department of Neurosurgery, Zucker School of Medicine at Hofstra/Northwell, Hempstead, New York, United States of America
| | - Nima Mesgarani
- Department of Electrical Engineering, Columbia University, New York, New York, United States of America
- Zuckerman Mind Brain Behavior Institute, Columbia University, New York, New York, United States of America
| |
Collapse
|
21
|
Dreyer AM, Michalke L, Perry A, Chang EF, Lin JJ, Knight RT, Rieger JW. Grasp-specific high-frequency broadband mirror neuron activity during reach-and-grasp movements in humans. Cereb Cortex 2023; 33:6291-6298. [PMID: 36562997 PMCID: PMC10183732 DOI: 10.1093/cercor/bhac504] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/27/2021] [Revised: 11/30/2022] [Accepted: 12/01/2022] [Indexed: 12/24/2022] Open
Abstract
Broadly congruent mirror neurons, responding to any grasp movement, and strictly congruent mirror neurons, responding only to specific grasp movements, have been reported in single-cell studies with primates. Delineating grasp properties in humans is essential to understand the human mirror neuron system with implications for behavior and social cognition. We analyzed electrocorticography data from a natural reach-and-grasp movement observation and delayed imitation task with 3 different natural grasp types of everyday objects. We focused on the classification of grasp types from high-frequency broadband mirror activation patterns found in classic mirror system areas, including sensorimotor, supplementary motor, inferior frontal, and parietal cortices. Classification of grasp types was successful during movement observation and execution intervals but not during movement retention. Our grasp type classification from combined and single mirror electrodes provides evidence for grasp-congruent activity in the human mirror neuron system potentially arising from strictly congruent mirror neurons.
Collapse
Affiliation(s)
- Alexander M Dreyer
- Department of Psychology, Carl von Ossietzky University Oldenburg, Oldenburg 26129, Germany
| | - Leo Michalke
- Department of Psychology, Carl von Ossietzky University Oldenburg, Oldenburg 26129, Germany
| | - Anat Perry
- Department of Psychology, Hebrew University of Jerusalem, Jerusalem 91905, Israel
| | - Edward F Chang
- Department of Neurological Surgery, University of California San Francisco, San Francisco, CA 94143, United States
| | - Jack J Lin
- Department of Biomedical Engineering and the Comprehensive Epilepsy Program, Department of Neurology, University of California, Irvine, CA 92868, United States
| | - Robert T Knight
- Department of Psychology and the Helen Wills Neuroscience Institute, University of California, Berkeley, CA 94720, United States
| | - Jochem W Rieger
- Department of Psychology, Carl von Ossietzky University Oldenburg, Oldenburg 26129, Germany
| |
Collapse
|
22
|
Semantic surprise predicts the N400 brain potential. NEUROIMAGE: REPORTS 2023. [DOI: 10.1016/j.ynirp.2023.100161] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 03/07/2023]
|
23
|
Desai M, Field AM, Hamilton LS. Dataset size considerations for robust acoustic and phonetic speech encoding models in EEG. Front Hum Neurosci 2023; 16:1001171. [PMID: 36741776 PMCID: PMC9895838 DOI: 10.3389/fnhum.2022.1001171] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/22/2022] [Accepted: 12/22/2022] [Indexed: 01/21/2023] Open
Abstract
In many experiments that investigate auditory and speech processing in the brain using electroencephalography (EEG), the experimental paradigm is often lengthy and tedious. Typically, the experimenter errs on the side of including more data, more trials, and therefore conducting a longer task to ensure that the data are robust and effects are measurable. Recent studies used naturalistic stimuli to investigate the brain's response to individual speech features, or combinations of multiple features, using system identification techniques, such as multivariate temporal receptive field (mTRF) analyses. The neural data collected from such experiments must be divided into a training set and a test set to fit and validate the mTRF weights. While a good strategy is clearly to collect as much data as is feasible, it is unclear how much data are needed to achieve stable results. Furthermore, it is unclear whether the specific stimulus used for mTRF fitting and the choice of feature representation affect how much data would be required for robust and generalizable results. Here, we used previously collected EEG data from our lab using sentence stimuli and movie stimuli as well as EEG data from an open-source dataset using audiobook stimuli to better understand how much data needs to be collected for naturalistic speech experiments measuring acoustic and phonetic tuning. We found that the EEG receptive field structure tested here stabilizes after collecting a training dataset of approximately 200 s of TIMIT sentences, around 600 s of movie-trailer training data, and approximately 460 s of audiobook training data. Thus, we provide suggestions on the minimum amount of data that would be necessary for fitting mTRFs from naturalistic listening data. Our findings are motivated by highly practical concerns when working with children, patient populations, or others who may not tolerate long study sessions. These findings will aid future researchers who wish to study naturalistic speech processing in healthy and clinical populations while minimizing participant fatigue and retaining signal quality.
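A minimal sketch of the dataset-size question the abstract raises (synthetic data, not the authors' pipeline): fit an encoding model on progressively longer training segments and watch held-out prediction accuracy stabilize.

```python
import numpy as np

rng = np.random.default_rng(1)
T, n_lags = 20000, 20
stim = rng.standard_normal(T)
# Circular lags for brevity; a real mTRF would use a proper lag matrix
X = np.column_stack([np.roll(stim, k) for k in range(n_lags)])
y = X @ rng.standard_normal(n_lags) + rng.standard_normal(T)

X_test, y_test = X[-5000:], y[-5000:]
for n_train in (500, 1000, 2000, 5000, 10000, 15000):
    Xt, yt = X[:n_train], y[:n_train]
    w = np.linalg.solve(Xt.T @ Xt + np.eye(n_lags), Xt.T @ yt)
    r = np.corrcoef(X_test @ w, y_test)[0, 1]
    print(f"{n_train:6d} training samples: test r = {r:.3f}")
```

The training duration at which the test correlation plateaus plays the role of the "minimum amount of data" the paper estimates per stimulus type.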
Collapse
Affiliation(s)
- Maansi Desai
- Department of Speech, Language, and Hearing Sciences, Moody College of Communication, The University of Texas at Austin, Austin, TX, United States
| | - Alyssa M. Field
- Department of Speech, Language, and Hearing Sciences, Moody College of Communication, The University of Texas at Austin, Austin, TX, United States
| | - Liberty S. Hamilton
- Department of Speech, Language, and Hearing Sciences, Moody College of Communication, The University of Texas at Austin, Austin, TX, United States
- Department of Neurology, Dell Medical School, The University of Texas at Austin, Austin, TX, United States
| |
Collapse
|
24
|
Feature-space selection with banded ridge regression. Neuroimage 2022; 264:119728. [PMID: 36334814 PMCID: PMC9807218 DOI: 10.1016/j.neuroimage.2022.119728] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/05/2022] [Revised: 10/05/2022] [Accepted: 10/31/2022] [Indexed: 11/09/2022] Open
Abstract
Encoding models provide a powerful framework to identify the information represented in brain recordings. In this framework, a stimulus representation is expressed within a feature space and is used in a regularized linear regression to predict brain activity. To account for a potential complementarity of different feature spaces, a joint model is fit on multiple feature spaces simultaneously. To adapt regularization strength to each feature space, ridge regression is extended to banded ridge regression, which optimizes a different regularization hyperparameter per feature space. The present paper proposes a method to decompose over feature spaces the variance explained by a banded ridge regression model. It also describes how banded ridge regression performs a feature-space selection, effectively ignoring non-predictive and redundant feature spaces. This feature-space selection leads to better prediction accuracy and to better interpretability. Banded ridge regression is then mathematically linked to a number of other regression methods with similar feature-space selection mechanisms. Finally, several methods are proposed to address the computational challenge of fitting banded ridge regressions on large numbers of voxels and feature spaces. All implementations are released in an open-source Python package called Himalaya.
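The core idea is concrete enough to sketch: banded ridge regression gives each feature space its own penalty, so the closed-form solve uses a block-diagonal regularizer rather than a single lam*I. A toy example follows (this is not the Himalaya implementation, and in practice the per-band hyperparameters are optimized by cross-validation rather than fixed by hand):

```python
import numpy as np

def banded_ridge(X_bands, y, lambdas):
    """Closed-form banded ridge: one regularization strength per feature space."""
    X = np.hstack(X_bands)
    # Block-diagonal penalty: lambda_i repeated over the columns of band i
    diag = np.concatenate([np.full(Xb.shape[1], lam)
                           for Xb, lam in zip(X_bands, lambdas)])
    return np.linalg.solve(X.T @ X + np.diag(diag), X.T @ y)

rng = np.random.default_rng(2)
T = 2000
X1 = rng.standard_normal((T, 10))   # e.g. a predictive acoustic feature space
X2 = rng.standard_normal((T, 50))   # e.g. a non-predictive feature space
y = X1 @ rng.standard_normal(10) + rng.standard_normal(T)

w = banded_ridge([X1, X2], y, lambdas=[1.0, 1e4])  # heavy penalty shrinks band 2
print("mean |w| band 1:", np.abs(w[:10]).mean(), " band 2:", np.abs(w[10:]).mean())
```

Driving the penalty of the uninformative band up shrinks its weights toward zero, which is the feature-space selection behavior the abstract describes.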
Collapse
|
25
|
Higgins C, Vidaurre D, Kolling N, Liu Y, Behrens T, Woolrich M. Spatiotemporally resolved multivariate pattern analysis for M/EEG. Hum Brain Mapp 2022; 43:3062-3085. [PMID: 35302683 PMCID: PMC9188977 DOI: 10.1002/hbm.25835] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/28/2021] [Revised: 02/14/2022] [Accepted: 02/27/2022] [Indexed: 11/15/2022] Open
Abstract
An emerging goal in neuroscience is tracking what information is represented in brain activity over time as a participant completes some task. While electroencephalography (EEG) and magnetoencephalography (MEG) offer millisecond temporal resolution of how activity patterns emerge and evolve, standard decoding methods present significant barriers to interpretability as they obscure the underlying spatial and temporal activity patterns. We instead propose the use of a generative encoding model framework that simultaneously infers the multivariate spatial patterns of activity and the variable timing at which these patterns emerge on individual trials. An encoding model inversion maps from these parameters to the equivalent decoding model, allowing predictions to be made about unseen test data in the same way as in standard decoding methodology. These SpatioTemporally Resolved MVPA (STRM) models can be flexibly applied to a wide variety of experimental paradigms, including classification and regression tasks. We show that these models provide insightful maps of the activity driving predictive accuracy metrics; demonstrate behaviourally meaningful variation in the timing of pattern emergence on individual trials; and achieve predictive accuracies that either match or surpass those achieved by more widely used methods. This provides a new avenue for investigating the brain's representational dynamics and could ultimately support more flexible experimental designs in the future.
Collapse
Affiliation(s)
- Cameron Higgins
- Wellcome Centre for Integrative Neuroimaging, University of Oxford, Oxford, UK
- Department of Psychiatry, University of Oxford, Oxford, UK
| | - Diego Vidaurre
- Department of Psychiatry, University of Oxford, Oxford, UK
- Center of Functionally Integrative Neuroscience, Department of Clinical Medicine, Aarhus University, Aarhus, Denmark
| | - Nils Kolling
- Wellcome Centre for Integrative Neuroimaging, University of Oxford, Oxford, UK
| | - Yunzhe Liu
- State Key Laboratory of Cognitive Neuroscience and Learning, IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing, China
- Chinese Institute for Brain Research, Beijing, China
- Max Planck University College London Centre for Computational Psychiatry and Ageing Research, University College London, London, UK
| | - Tim Behrens
- Wellcome Centre for Integrative Neuroimaging, University of Oxford, Oxford, UK
- Wellcome Trust Centre for Neuroimaging, University College London, London, UK
| | - Mark Woolrich
- Wellcome Centre for Integrative Neuroimaging, University of Oxford, Oxford, UK
- Department of Psychiatry, University of Oxford, Oxford, UK
| |
Collapse
|
26
|
Armeni K, Güçlü U, van Gerven M, Schoffelen JM. A 10-hour within-participant magnetoencephalography narrative dataset to test models of language comprehension. Sci Data 2022; 9:278. [PMID: 35676293 PMCID: PMC9177538 DOI: 10.1038/s41597-022-01382-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/12/2021] [Accepted: 05/10/2022] [Indexed: 11/13/2022] Open
Abstract
Recently, cognitive neuroscientists have increasingly studied the brain responses to narratives. At the same time, we are witnessing exciting developments in natural language processing where large-scale neural network models can be used to instantiate cognitive hypotheses in narrative processing. Yet these models learn from text alone, and we lack ways of incorporating biological constraints during training. To mitigate this gap, we provide a narrative comprehension magnetoencephalography (MEG) data resource that can be used to train neural network models directly on brain data. We recorded from 3 participants, each completing 10 separate hour-long recording sessions, while they listened to audiobooks in English. After story listening, participants answered short questions about their experience. To minimize head movement, the participants wore MEG-compatible head casts, which immobilized their head position during recording. We report a basic evoked-response analysis showing that the responses accurately localize to primary auditory areas. The responses are robust and conserved across 10 sessions for every participant. We also provide usage notes and briefly outline possible future uses of the resource.
Collapse
Affiliation(s)
- Kristijan Armeni
- Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
| | - Umut Güçlü
- Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
| | - Marcel van Gerven
- Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
| | - Jan-Mathijs Schoffelen
- Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands.
| |
Collapse
|
27
|
Spatiotemporal dynamics of odor representations in the human brain revealed by EEG decoding. Proc Natl Acad Sci U S A 2022; 119:e2114966119. [PMID: 35584113 PMCID: PMC9173780 DOI: 10.1073/pnas.2114966119] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/05/2022] Open
Abstract
To elucidate when and where in the brain different aspects of odor perception emerge, we decoded odors from an electroencephalogram and associated the results with perception and source activities. The odor information was decoded 100 ms after odor onset at the earliest, with its signal sources estimated in and around the olfactory areas. The neural representation of odor unpleasantness emerged 300 ms after odor onset, followed by pleasantness and perceived quality at 500 ms. During this time, brain regions representing odor information spread rapidly from the olfactory areas to regions associated with emotional, semantic, and memory processing. The results suggested that odor perception emerges through computations in these areas, with different perceptual aspects having different spatiotemporal dynamics. How the human brain translates olfactory inputs into diverse perceptions, from pleasurable floral smells to sickening smells of decay, is one of the fundamental questions in olfaction. To examine how different aspects of olfactory perception emerge in space and time in the human brain, we performed time-resolved multivariate pattern analysis of scalp-recorded electroencephalogram responses to 10 perceptually diverse odors and associated the resulting decoding accuracies with perception and source activities. Mean decoding accuracies of odors exceeded the chance level 100 ms after odor onset and reached maxima at 350 ms. The result suggests that the neural representations of individual odors were maximally separated at 350 ms. Perceptual representations emerged following the decoding peak: unipolar unpleasantness (neutral to unpleasant) from 300 ms, and pleasantness (neutral to pleasant) and perceptual quality (applicability to verbal descriptors such as “fruity” or “flowery”) from 500 ms after odor onset, with all these perceptual representations reaching their maxima after 600 ms. A source estimation showed that the areas representing the odor information, estimated based on the decoding accuracies, were localized in and around the primary and secondary olfactory areas at 100 to 350 ms after odor onset. Odor representations then expanded into larger areas associated with emotional, semantic, and memory processing, with the activities of these later areas being significantly associated with perception. These results suggest that initial odor information coded in the olfactory areas (<350 ms) evolves into their perceptual realizations (300 to >600 ms) through computations in widely distributed cortical regions, with different perceptual aspects having different spatiotemporal dynamics.
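The time-resolved multivariate pattern analysis at the center of this study has a simple skeleton: train and cross-validate a classifier independently at each time point and track when accuracy exceeds chance. A hedged sketch with simulated trials (the labels, channel counts, and injected effect are invented for illustration):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(3)
n_trials, n_channels, n_times = 120, 64, 50
X = rng.standard_normal((n_trials, n_channels, n_times))
y = rng.integers(0, 2, n_trials)              # two hypothetical odor labels
X[y == 1, :, 20:] += 0.5                      # inject class information from t = 20 on

for t in range(0, n_times, 10):
    clf = LogisticRegression(max_iter=1000)
    acc = cross_val_score(clf, X[:, :, t], y, cv=5).mean()
    print(f"time point {t:2d}: decoding accuracy = {acc:.2f}")
```

Accuracy rising above chance only after the injected latency mirrors how the study infers when odor information first becomes decodable.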
Collapse
|
28
|
Holtze B, Rosenkranz M, Jaeger M, Debener S, Mirkovic B. Ear-EEG Measures of Auditory Attention to Continuous Speech. Front Neurosci 2022; 16:869426. [PMID: 35592265 PMCID: PMC9111016 DOI: 10.3389/fnins.2022.869426] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/04/2022] [Accepted: 03/25/2022] [Indexed: 11/13/2022] Open
Abstract
Auditory attention is an important cognitive function used to separate relevant from irrelevant auditory information. However, most findings on attentional selection have been obtained in highly controlled laboratory settings using bulky recording setups and unnaturalistic stimuli. Recent advances in electroencephalography (EEG) facilitate the measurement of brain activity outside the laboratory, and around-the-ear sensors such as the cEEGrid promise unobtrusive acquisition. In parallel, methods such as speech envelope tracking, intersubject correlations, and spectral entropy measures have emerged that allow us to study attentional effects in the neural processing of natural, continuous auditory scenes. In the current study, we investigated whether these three attentional measures can be reliably obtained when using around-the-ear EEG. To this end, we analyzed the cEEGrid data of 36 participants who attended to one of two simultaneously presented speech streams. Speech envelope tracking results confirmed a reliable identification of the attended speaker from cEEGrid data. The accuracies in identifying the attended speaker increased when fitting the classification model to the individual. Artifact correction of the cEEGrid data with artifact subspace reconstruction did not increase the classification accuracy. Intersubject correlations were higher for those participants attending to the same speech stream than for those attending to different speech streams, replicating previously obtained results with high-density cap-EEG. We also found that spectral entropy decreased over time, possibly reflecting the decrease in the listener's level of attention. Overall, these results support the idea of using ear-EEG measurements to unobtrusively monitor auditory attention to continuous speech. This knowledge may help to develop assistive devices that support listeners in separating relevant from irrelevant information in complex auditory environments.
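Speech envelope tracking for attended-speaker identification can be sketched as a backward (stimulus-reconstruction) model: reconstruct the envelope from EEG and compare it against the two candidate speech streams. A toy zero-lag version on simulated signals (not the cEEGrid pipeline; real decoders use multiple time lags and cross-validation):

```python
import numpy as np

rng = np.random.default_rng(4)
T, n_ch = 6000, 16
env_a = np.abs(rng.standard_normal(T))        # attended-speech envelope
env_b = np.abs(rng.standard_normal(T))        # ignored-speech envelope
eeg = rng.standard_normal((T, n_ch))
eeg[:, :4] += 0.8 * env_a[:, None]            # EEG carries mostly the attended envelope

half = T // 2
# Ridge backward model trained on the first half to reconstruct the envelope
W = np.linalg.solve(eeg[:half].T @ eeg[:half] + np.eye(n_ch),
                    eeg[:half].T @ env_a[:half])
recon = eeg[half:] @ W                        # reconstruction on held-out data
r_a = np.corrcoef(recon, env_a[half:])[0, 1]
r_b = np.corrcoef(recon, env_b[half:])[0, 1]
print("attended" if r_a > r_b else "ignored", f"(r_a={r_a:.2f}, r_b={r_b:.2f})")
```

Whichever stream correlates better with the reconstruction is labeled as attended, which is the decision rule underlying the reported identification accuracies.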
Collapse
Affiliation(s)
- Björn Holtze
- Neuropsychology Lab, Department of Psychology, University of Oldenburg, Oldenburg, Germany
| | - Marc Rosenkranz
- Neurophysiology of Everyday Life Group, Department of Psychology, University of Oldenburg, Oldenburg, Germany
| | - Manuela Jaeger
- Neuropsychology Lab, Department of Psychology, University of Oldenburg, Oldenburg, Germany
- Division Hearing, Speech and Audio Technology, Fraunhofer Institute for Digital Media Technology IDMT, Oldenburg, Germany
| | - Stefan Debener
- Neuropsychology Lab, Department of Psychology, University of Oldenburg, Oldenburg, Germany
- Research Center for Neurosensory Science, University of Oldenburg, Oldenburg, Germany
- Cluster of Excellence Hearing4all, University of Oldenburg, Oldenburg, Germany
| | - Bojana Mirkovic
- Neuropsychology Lab, Department of Psychology, University of Oldenburg, Oldenburg, Germany
| |
Collapse
|
29
|
Arefnezhad S, Hamet J, Eichberger A, Frühwirth M, Ischebeck A, Koglbauer IV, Moser M, Yousefi A. Driver drowsiness estimation using EEG signals with a dynamical encoder-decoder modeling framework. Sci Rep 2022; 12:2650. [PMID: 35173189 PMCID: PMC8850607 DOI: 10.1038/s41598-022-05810-x] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/03/2021] [Accepted: 01/14/2022] [Indexed: 01/22/2023] Open
Abstract
Drowsiness is a leading cause of accidents on the road as it negatively affects the driver's ability to safely operate a vehicle. Neural activity recorded by EEG electrodes is a widely used physiological correlate of driver drowsiness. This paper presents a novel dynamical modeling solution to estimate the instantaneous level of driver drowsiness using EEG signals, where the PERcentage of eyelid CLOSure (PERCLOS) is employed as the ground truth of driver drowsiness. Applying our proposed modeling framework, we find neural features present in EEG data that encode PERCLOS. In the decoding phase, we use a Bayesian filtering solution to estimate the PERCLOS level over time. A data set that comprises 18 driving tests, conducted by 13 drivers, has been used to investigate the performance of the proposed framework. The modeling performance in estimating PERCLOS provides robust and repeatable results in tests with manual and automated driving modes, with an average RMSE of 0.117 (on a PERCLOS range of 0 to 1) and an average High Probability Density percentage of 62.5%. We further hypothesized that there are biomarkers that encode PERCLOS across different driving tests and participants. Using this solution, we identified possible biomarkers such as Theta and Delta powers. Results show that about 73% of the Theta-power and 66% of the Delta-power biomarkers increase as PERCLOS grows during the driving test. We argue that the proposed method is a robust and reliable solution for estimating drowsiness in real time, which opens the door to utilizing EEG-based measures in driver drowsiness detection systems.
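The paper's encoder-decoder framework is richer than this, but a toy one-dimensional Bayesian (Kalman) filter conveys the decoding step: a slowly drifting state (PERCLOS) is tracked from a noisy linear EEG observation. All signals and noise parameters below are simulated assumptions, not values from the study.

```python
import numpy as np

rng = np.random.default_rng(5)
T = 500
perclos = np.clip(np.cumsum(rng.normal(0, 0.01, T)) + 0.2, 0, 1)  # slow drift
theta_power = 2.0 * perclos + rng.normal(0, 0.3, T)  # hypothetical EEG feature

# Scalar Kalman filter: random-walk state, linear observation z = H*x + noise
x, P, H, Q, R = 0.2, 1.0, 2.0, 1e-4, 0.09
estimates = []
for z in theta_power:
    P += Q                              # predict: state uncertainty grows
    K = P * H / (H * P * H + R)         # Kalman gain
    x += K * (z - H * x)                # update with the new observation
    P *= (1 - K * H)
    estimates.append(x)

rmse = np.sqrt(np.mean((np.array(estimates) - perclos) ** 2))
print(f"RMSE = {rmse:.3f}")
```

The filter smooths the noisy feature into a continuous drowsiness estimate, analogous in spirit to the RMSE figures the abstract reports.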
Collapse
Affiliation(s)
- Sadegh Arefnezhad
- Institute of Automotive Engineering, Graz University of Technology, 8010, Graz, Austria.
| | - James Hamet
- Neurable Company, Boston, MA, 02108, USA
- Vistim Labs Company, Salt Lake City, UT, 84103, USA
| | - Arno Eichberger
- Institute of Automotive Engineering, Graz University of Technology, 8010, Graz, Austria
| | | | - Anja Ischebeck
- Institute of Psychology, University of Graz, 8010, Graz, Austria
| | - Ioana Victoria Koglbauer
- Institute of Engineering and Business Informatics, Graz University of Technology, Graz, 8010, Austria
| | - Maximilian Moser
- Human Research Institute, Weiz, 8160, Austria
- Chair of Department of Physiology, Medical University of Graz, 8036, Graz, Austria
| | - Ali Yousefi
- Neurable Company, Boston, MA, 02108, USA
- Department of Computer Science, Worcester Polytechnic Institute, 100 Institute Road, Worcester, MA, 01609, USA
| |
Collapse
|
30
|
Teoh ES, Ahmed F, Lalor EC. Attention Differentially Affects Acoustic and Phonetic Feature Encoding in a Multispeaker Environment. J Neurosci 2022; 42:682-691. [PMID: 34893546 PMCID: PMC8805628 DOI: 10.1523/jneurosci.1455-20.2021] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/08/2020] [Revised: 09/28/2021] [Accepted: 09/29/2021] [Indexed: 11/21/2022] Open
Abstract
Humans have the remarkable ability to selectively focus on a single talker in the midst of other competing talkers. The neural mechanisms that underlie this phenomenon remain incompletely understood. In particular, there has been longstanding debate over whether attention operates at an early or late stage in the speech processing hierarchy. One way to better understand this is to examine how attention might differentially affect neurophysiological indices of hierarchical acoustic and linguistic speech representations. In this study, we do this by using encoding models to identify neural correlates of speech processing at various levels of representation. Specifically, we recorded EEG from fourteen human subjects (nine female and five male) during a "cocktail party" attention experiment. Model comparisons based on these data revealed phonetic feature processing for attended, but not unattended speech. Furthermore, we show that attention specifically enhances isolated indices of phonetic feature processing, but that such attention effects are not apparent for isolated measures of acoustic processing. These results provide new insights into the effects of attention on different prelexical representations of speech, insights that complement recent anatomic accounts of the hierarchical encoding of attended speech. Furthermore, our findings support the notion that, for attended speech, phonetic features are processed as a distinct stage, separate from the processing of the speech acoustics.SIGNIFICANCE STATEMENT Humans are very good at paying attention to one speaker in an environment with multiple speakers. However, the details of how attended and unattended speech are processed differently by the brain is not completely clear. Here, we explore how attention affects the processing of the acoustic sounds of speech as well as the mapping of those sounds onto categorical phonetic features. We find evidence of categorical phonetic feature processing for attended, but not unattended speech. Furthermore, we find evidence that categorical phonetic feature processing is enhanced by attention, but acoustic processing is not. These findings add an important new layer in our understanding of how the human brain solves the cocktail party problem.
Collapse
Affiliation(s)
- Emily S Teoh
- School of Engineering, Trinity Centre for Biomedical Engineering, and Trinity College Institute of Neuroscience, Trinity College, University of Dublin, Dublin 2, Ireland
| | - Farhin Ahmed
- Department of Neuroscience, Department of Biomedical Engineering, and Del Monte Neuroscience Institute, University of Rochester, Rochester, New York 14627
| | - Edmund C Lalor
- School of Engineering, Trinity Centre for Biomedical Engineering, and Trinity College Institute of Neuroscience, Trinity College, University of Dublin, Dublin 2, Ireland
- Department of Neuroscience, Department of Biomedical Engineering, and Del Monte Neuroscience Institute, University of Rochester, Rochester, New York 14627
| |
Collapse
|
31
|
Cheng FY, Xu C, Gold L, Smith S. Rapid Enhancement of Subcortical Neural Responses to Sine-Wave Speech. Front Neurosci 2022; 15:747303. [PMID: 34987356 PMCID: PMC8721138 DOI: 10.3389/fnins.2021.747303] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/26/2021] [Accepted: 12/02/2021] [Indexed: 01/15/2023] Open
Abstract
The efferent auditory nervous system may be a potent force in shaping how the brain responds to behaviorally significant sounds. Previous human experiments using the frequency following response (FFR) have shown efferent-induced modulation of subcortical auditory function online and over short- and long-term time scales; however, a contemporary understanding of FFR generation presents new questions about whether previous effects were constrained solely to the auditory subcortex. The present experiment used sine-wave speech (SWS), an acoustically-sparse stimulus in which dynamic pure tones represent speech formant contours, to evoke FFRSWS. Due to the higher stimulus frequencies used in SWS, this approach biased neural responses toward brainstem generators and allowed for three stimuli (/bɔ/, /bu/, and /bo/) to be used to evoke FFRSWS before and after listeners in a training group were made aware that they were hearing a degraded speech stimulus. All SWS stimuli were rapidly perceived as speech when presented with a SWS carrier phrase, and average token identification reached ceiling performance during a perceptual training phase. Compared to a control group which remained naïve throughout the experiment, training group FFRSWS amplitudes were enhanced post-training for each stimulus. Further, linear support vector machine classification of training group FFRSWS significantly improved post-training compared to the control group, indicating that training-induced neural enhancements were sufficient to bolster machine learning classification accuracy. These results suggest that the efferent auditory system may rapidly modulate auditory brainstem representation of sounds depending on their context and perception as non-speech or speech.
Collapse
Affiliation(s)
- Fan-Yin Cheng
- Department of Speech, Language, and Hearing Sciences, University of Texas at Austin, Austin, TX, United States
| | - Can Xu
- Department of Speech, Language, and Hearing Sciences, University of Texas at Austin, Austin, TX, United States
| | - Lisa Gold
- Department of Speech, Language, and Hearing Sciences, University of Texas at Austin, Austin, TX, United States
| | - Spencer Smith
- Department of Speech, Language, and Hearing Sciences, University of Texas at Austin, Austin, TX, United States
| |
Collapse
|
32
|
Al-Zubaidi A, Bräuer S, Holdgraf CR, Schepers IM, Rieger JW. OUP accepted manuscript. Cereb Cortex Commun 2022; 3:tgac007. [PMID: 35281216 PMCID: PMC8914075 DOI: 10.1093/texcom/tgac007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/02/2020] [Revised: 01/26/2022] [Accepted: 01/29/2022] [Indexed: 11/14/2022] Open
Affiliation(s)
- Arkan Al-Zubaidi
- Applied Neurocognitive Psychology Lab and Cluster of Excellence Hearing4all, Oldenburg University, Oldenburg, Germany
- Research Center Neurosensory Science, Oldenburg University, 26129 Oldenburg, Germany
| | - Susann Bräuer
- Applied Neurocognitive Psychology Lab and Cluster of Excellence Hearing4all, Oldenburg University, Oldenburg, Germany
| | - Chris R Holdgraf
- Department of Statistics, UC Berkeley, Berkeley, CA 94720, USA
- International Interactive Computing Collaboration
| | - Inga M Schepers
- Applied Neurocognitive Psychology Lab and Cluster of Excellence Hearing4all, Oldenburg University, Oldenburg, Germany
| | - Jochem W Rieger
- Department of Psychology, Faculty VI, Oldenburg University, 26129 Oldenburg, Germany
| |
Collapse
|
33
|
Kupers ER, Edadan A, Benson NC, Zuiderbaan W, de Jong MC, Dumoulin SO, Winawer J. A population receptive field model of the magnetoencephalography response. Neuroimage 2021; 244:118554. [PMID: 34509622 PMCID: PMC8631249 DOI: 10.1016/j.neuroimage.2021.118554] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/16/2020] [Revised: 07/16/2021] [Accepted: 09/02/2021] [Indexed: 12/23/2022] Open
Abstract
Computational models which predict the neurophysiological response from experimental stimuli have played an important role in human neuroimaging. One type of computational model, the population receptive field (pRF), has been used to describe cortical responses at the millimeter scale using functional magnetic resonance imaging (fMRI) and electrocorticography (ECoG). However, pRF models are not widely used for non-invasive electromagnetic field measurements (EEG/MEG), because individual sensors pool responses originating from several centimeters of cortex, containing neural populations with widely varying spatial tuning. Here, we introduce a forward-modeling approach in which pRFs estimated from fMRI data are used to predict MEG sensor responses. Subjects viewed contrast-reversing bar stimuli sweeping across the visual field in separate fMRI and MEG sessions. Individual subjects' pRFs were modeled on the cortical surface at the millimeter scale using the fMRI data. We then predicted cortical time series and projected these predictions to MEG sensors using a biophysical MEG forward model, accounting for the pooling across cortex. We compared the predicted MEG responses to observed visually evoked steady-state responses measured in the MEG session. We found that pRF parameters estimated by fMRI could explain a substantial fraction of the variance in steady-state MEG sensor responses (up to 60% in individual sensors). Control analyses in which we artificially perturbed either pRF size or pRF position reduced MEG prediction accuracy, indicating that MEG data are sensitive to pRF properties derived from fMRI. Our model provides a quantitative approach to link fMRI and MEG measurements, thereby enabling advances in our understanding of spatiotemporal dynamics in human visual field maps.
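The linear-algebra core of this forward-modeling approach — project predicted cortical time series through a gain (leadfield) matrix to sensors, then score variance explained per sensor — can be sketched with synthetic arrays. Dimensions and noise levels here are invented; a real analysis would use a biophysical leadfield from a head model.

```python
import numpy as np

rng = np.random.default_rng(6)
n_sensors, n_sources, n_times = 157, 2000, 200

# Hypothetical gain (leadfield) matrix: each sensor pools over many sources
G = rng.standard_normal((n_sensors, n_sources)) / np.sqrt(n_sources)
prf_pred = rng.standard_normal((n_sources, n_times))   # pRF-predicted cortical series

sensor_pred = G @ prf_pred                              # forward projection to sensors
meg = sensor_pred + 0.5 * rng.standard_normal((n_sensors, n_times))  # observed MEG

# Variance explained per sensor by the pRF-based prediction
ve = 1 - np.var(meg - sensor_pred, axis=1) / np.var(meg, axis=1)
print("median variance explained:", round(float(np.median(ve)), 3))
```

Perturbing the source-level predictions (e.g., shuffling pRF positions) before projection and re-scoring is the shape of the paper's control analyses.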
Collapse
Affiliation(s)
- Eline R Kupers
- Department of Psychology, New York University, New York, NY 10003, United States; Center for Neural Science, New York University, New York, NY 10003, United States; Department of Psychology, Stanford University, Stanford, CA 94305, United States.
| | - Akhil Edadan
- Spinoza Center for Neuroimaging, Amsterdam 1105 BK, the Netherlands; Department of Experimental Psychology, Utrecht University, Utrecht 3584 CS, the Netherlands
| | - Noah C Benson
- Department of Psychology, New York University, New York, NY 10003, United States; Center for Neural Science, New York University, New York, NY 10003, United States; Sciences Institute, University of Washington, Seattle, WA 98195, United States
| | | | - Maartje C de Jong
- Spinoza Center for Neuroimaging, Amsterdam 1105 BK, the Netherlands; Department of Psychology, University of Amsterdam, Amsterdam 1001 NK, the Netherlands; Amsterdam Brain and Cognition (ABC), University of Amsterdam, Amsterdam 1001 NK, the Netherlands
| | - Serge O Dumoulin
- Spinoza Center for Neuroimaging, Amsterdam 1105 BK, the Netherlands; Department of Experimental Psychology, Utrecht University, Utrecht 3584 CS, the Netherlands; Department of Experimental and Applied Psychology, VU University, Amsterdam 1081 BT, the Netherlands
| | - Jonathan Winawer
- Department of Psychology, New York University, New York, NY 10003, United States; Center for Neural Science, New York University, New York, NY 10003, United States
| |
Collapse
|
34
|
Jessen S, Obleser J, Tune S. Neural tracking in infants - An analytical tool for multisensory social processing in development. Dev Cogn Neurosci 2021; 52:101034. [PMID: 34781250 PMCID: PMC8593584 DOI: 10.1016/j.dcn.2021.101034] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/30/2021] [Revised: 10/09/2021] [Accepted: 11/07/2021] [Indexed: 11/18/2022] Open
Abstract
Humans are born into a social environment and from early on possess a range of abilities to detect and respond to social cues. In the past decade, there has been a rapidly increasing interest in investigating the neural responses underlying such early social processes under naturalistic conditions. However, the investigation of neural responses to continuous dynamic input poses the challenge of how to link neural responses back to continuous sensory input. In the present tutorial, we provide a step-by-step introduction to one approach to tackle this issue, namely the use of linear models to investigate neural tracking responses in electroencephalographic (EEG) data. While neural tracking has gained increasing popularity in adult cognitive neuroscience over the past decade, its application to infant EEG is still rare and comes with its own challenges. After introducing the concept of neural tracking, we discuss and compare the use of forward vs. backward models and individual vs. generic models using an example data set of infant EEG data. Each section comprises a theoretical introduction as well as a concrete example using MATLAB code. We argue that neural tracking provides a promising way to investigate early (social) processing in an ecologically valid setting.
Collapse
Affiliation(s)
- Sarah Jessen
- Department of Neurology, University of Lübeck, Ratzeburger Allee 160, 23562 Lübeck, Germany; Center of Brain, Behavior, and Metabolism, University of Lübeck, Germany.
| | - Jonas Obleser
- Department of Psychology, University of Lübeck, Ratzeburger Allee 160, 23562 Lübeck, Germany; Center of Brain, Behavior, and Metabolism, University of Lübeck, Germany
| | - Sarah Tune
- Department of Psychology, University of Lübeck, Ratzeburger Allee 160, 23562 Lübeck, Germany; Center of Brain, Behavior, and Metabolism, University of Lübeck, Germany.
| |
Collapse
|
35
|
Crosse MJ, Zuk NJ, Di Liberto GM, Nidiffer AR, Molholm S, Lalor EC. Linear Modeling of Neurophysiological Responses to Speech and Other Continuous Stimuli: Methodological Considerations for Applied Research. Front Neurosci 2021; 15:705621. [PMID: 34880719 PMCID: PMC8648261 DOI: 10.3389/fnins.2021.705621] [Citation(s) in RCA: 35] [Impact Index Per Article: 11.7] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/05/2021] [Accepted: 09/21/2021] [Indexed: 01/01/2023] Open
Abstract
Cognitive neuroscience, in particular research on speech and language, has seen an increase in the use of linear modeling techniques for studying the processing of natural, environmental stimuli. The availability of such computational tools has prompted similar investigations in many clinical domains, facilitating the study of cognitive and sensory deficits under more naturalistic conditions. However, studying clinical (and often highly heterogeneous) cohorts introduces an added layer of complexity to such modeling procedures, potentially leading to instability of such techniques and, as a result, inconsistent findings. Here, we outline some key methodological considerations for applied research, referring to a hypothetical clinical experiment involving speech processing and worked examples of simulated electrophysiological (EEG) data. In particular, we focus on experimental design, data preprocessing, stimulus feature extraction, model design, model training and evaluation, and interpretation of model weights. Throughout the paper, we demonstrate the implementation of each step in MATLAB using the mTRF-Toolbox and discuss how to address issues that could arise in applied research. In doing so, we hope to provide better intuition on these more technical points and provide a resource for applied and clinical researchers investigating sensory and cognitive processing using ecologically rich stimuli.
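One methodological point the paper stresses, model training and evaluation, can be sketched as regularization tuning with contiguous cross-validation folds (appropriate for autocorrelated EEG). The mTRF-Toolbox itself is MATLAB; this is a hedged Python analogue on synthetic data.

```python
import numpy as np

def fit_ridge(X, y, lam):
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

rng = np.random.default_rng(7)
T, d = 6000, 40
X = rng.standard_normal((T, d))
y = X @ rng.standard_normal(d) + 2.0 * rng.standard_normal(T)

folds = np.array_split(np.arange(T), 5)   # contiguous folds respect autocorrelation
for lam in (1e-2, 1e0, 1e2, 1e4):
    rs = []
    for test in folds:
        train = np.setdiff1d(np.arange(T), test)
        w = fit_ridge(X[train], y[train], lam)
        rs.append(np.corrcoef(X[test] @ w, y[test])[0, 1])
    print(f"lambda = {lam:8.2f}  mean test r = {np.mean(rs):.3f}")
```

Selecting the lambda with the best held-out correlation, then refitting on all training data, is the standard workflow the paper walks through for applied cohorts.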
Collapse
Affiliation(s)
- Michael J. Crosse
- Department of Mechanical, Manufacturing and Biomedical Engineering, Trinity Centre for Biomedical Engineering, Trinity College Dublin, Dublin, Ireland
- X, The Moonshot Factory, Mountain View, CA, United States
- Department of Pediatrics, Albert Einstein College of Medicine, New York, NY, United States
- Department of Neuroscience, Albert Einstein College of Medicine, New York, NY, United States
| | - Nathaniel J. Zuk
- Department of Mechanical, Manufacturing and Biomedical Engineering, Trinity Centre for Biomedical Engineering, Trinity College Dublin, Dublin, Ireland
- Department of Biomedical Engineering, University of Rochester, Rochester, NY, United States
- Department of Neuroscience, University of Rochester, Rochester, NY, United States
| | - Giovanni M. Di Liberto
- Department of Mechanical, Manufacturing and Biomedical Engineering, Trinity Centre for Biomedical Engineering, Trinity College Dublin, Dublin, Ireland
- Centre for Biomedical Engineering, School of Electrical and Electronic Engineering, University College Dublin, Dublin, Ireland
- School of Computer Science and Statistics, Trinity College Dublin, Dublin, Ireland
| | - Aaron R. Nidiffer
- Department of Biomedical Engineering, University of Rochester, Rochester, NY, United States
- Department of Neuroscience, University of Rochester, Rochester, NY, United States
| | - Sophie Molholm
- Department of Pediatrics, Albert Einstein College of Medicine, New York, NY, United States
- Department of Neuroscience, Albert Einstein College of Medicine, New York, NY, United States
| | - Edmund C. Lalor
- Department of Mechanical, Manufacturing and Biomedical Engineering, Trinity Centre for Biomedical Engineering, Trinity College Dublin, Dublin, Ireland
- Department of Biomedical Engineering, University of Rochester, Rochester, NY, United States
- Department of Neuroscience, University of Rochester, Rochester, NY, United States
| |
Collapse
|
36
|
Ravishankar S, Toneva M, Wehbe L. Single-Trial MEG Data Can Be Denoised Through Cross-Subject Predictive Modeling. Front Comput Neurosci 2021; 15:737324. [PMID: 34858157 PMCID: PMC8632362 DOI: 10.3389/fncom.2021.737324] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/06/2021] [Accepted: 08/23/2021] [Indexed: 11/13/2022] Open
Abstract
A pervasive challenge in brain imaging is the presence of noise that hinders investigation of underlying neural processes, with Magnetoencephalography (MEG) in particular having very low Signal-to-Noise Ratio (SNR). The established strategy to increase MEG's SNR involves averaging multiple repetitions of data corresponding to the same stimulus. However, repetition of stimulus can be undesirable, because underlying neural activity has been shown to change across trials, and repeating stimuli limits the breadth of the stimulus space experienced by subjects. In particular, the rising popularity of naturalistic studies with a single viewing of a movie or story necessitates the discovery of new approaches to increase SNR. We introduce a simple framework to reduce noise in single-trial MEG data by leveraging correlations in neural responses across subjects as they experience the same stimulus. We demonstrate its use in a naturalistic reading comprehension task with 8 subjects, with MEG data collected while they read the same story a single time. We find that our procedure results in data with reduced noise and allows for better discovery of neural phenomena. As proof-of-concept, we show that the N400m's correlation with word surprisal, an established finding in literature, is far more clearly observed in the denoised data than the original data. The denoised data also shows higher decoding and encoding accuracy than the original data, indicating that the neural signals associated with reading are either preserved or enhanced after the denoising procedure.
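The denoising idea — predict one subject's sensor data from all other subjects' data recorded during the same stimulus — reduces to a large regression. A hedged sketch with simulated subjects sharing a stimulus-driven signal (the real procedure fits and applies the mapping on separate data segments):

```python
import numpy as np

rng = np.random.default_rng(8)
n_subj, n_ch, T = 8, 20, 3000
shared = rng.standard_normal((n_ch, T))            # stimulus-driven signal
data = [shared + 2.0 * rng.standard_normal((n_ch, T)) for _ in range(n_subj)]

target = data[0]
others = np.vstack(data[1:])                       # (7 * n_ch, T) predictors

# Ridge-regress the target subject's sensors on all other subjects' sensors
A = others @ others.T + 10.0 * np.eye(others.shape[0])
W = np.linalg.solve(A, others @ target.T)          # (7 * n_ch, n_ch) mapping
denoised = W.T @ others                            # cross-subject prediction

r_raw = np.corrcoef(target.ravel(), shared.ravel())[0, 1]
r_den = np.corrcoef(denoised.ravel(), shared.ravel())[0, 1]
print(f"correlation with shared signal: raw {r_raw:.2f} -> denoised {r_den:.2f}")
```

Because subject-specific noise is uncorrelated across subjects, only the shared stimulus-locked component survives the prediction, which is why downstream encoding and decoding analyses improve.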
Collapse
Affiliation(s)
| | - Mariya Toneva
- Machine Learning Department, Carnegie Mellon University, Pittsburgh, PA, United States
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA, United States
| | - Leila Wehbe
- Machine Learning Department, Carnegie Mellon University, Pittsburgh, PA, United States
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA, United States
| |
Collapse
|
37
|
Generalizable EEG Encoding Models with Naturalistic Audiovisual Stimuli. J Neurosci 2021; 41:8946-8962. [PMID: 34503996 DOI: 10.1523/jneurosci.2891-20.2021] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/13/2020] [Revised: 08/24/2021] [Accepted: 08/29/2021] [Indexed: 11/21/2022] Open
Abstract
In natural conversations, listeners must attend to what others are saying while ignoring extraneous background sounds. Recent studies have used encoding models to predict electroencephalography (EEG) responses to speech in noise-free listening situations, sometimes referred to as "speech tracking." Researchers have analyzed how speech tracking changes with different types of background noise. It is unclear, however, whether neural responses from acoustically rich, naturalistic environments with and without background noise can be generalized to more controlled stimuli. If encoding models for acoustically rich, naturalistic stimuli are generalizable to other tasks, this could aid in data collection from populations of individuals who may not tolerate listening to more controlled and less engaging stimuli for long periods of time. We recorded noninvasive scalp EEG while 17 human participants (8 male/9 female) listened to speech without noise and audiovisual speech stimuli containing overlapping speakers and background sounds. We fit multivariate temporal receptive field encoding models to predict EEG responses to pitch, the acoustic envelope, phonological features, and visual cues in both stimulus conditions. Our results suggested that neural responses to naturalistic stimuli were generalizable to more controlled datasets. EEG responses to speech in isolation were predicted accurately using phonological features alone, while responses to speech in a rich acoustic background were more accurate when including both phonological and acoustic features. Our findings suggest that naturalistic audiovisual stimuli can be used to measure receptive fields that are comparable and generalizable to more controlled audio-only stimuli.SIGNIFICANCE STATEMENT Understanding spoken language in natural environments requires listeners to parse acoustic and linguistic information in the presence of other distracting stimuli. However, most studies of auditory processing rely on highly controlled stimuli with no background noise, or with background noise inserted at specific times. Here, we compare models where EEG data are predicted based on a combination of acoustic, phonetic, and visual features in highly disparate stimuli-sentences from a speech corpus and speech embedded within movie trailers. We show that modeling neural responses to highly noisy, audiovisual movies can uncover tuning for acoustic and phonetic information that generalizes to simpler stimuli typically used in sensory neuroscience experiments.
Collapse
|
38
|
Ki JJ, Dmochowski JP, Touryan J, Parra LC. Neural responses to natural visual motion are spatially selective across the visual field, with selectivity differing across brain areas and task. Eur J Neurosci 2021; 54:7609-7625. [PMID: 34679237 PMCID: PMC9298375 DOI: 10.1111/ejn.15503] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/10/2021] [Revised: 09/16/2021] [Accepted: 10/07/2021] [Indexed: 11/28/2022]
Abstract
It is well established that neural responses to visual stimuli are enhanced at select locations in the visual field. Although spatial selectivity and the effects of spatial attention are well understood for discrete tasks (e.g. visual cueing), little is known for naturalistic experience that involves continuous dynamic visual stimuli (e.g. driving). Here, we assess the strength of neural responses across the visual space during a kart‐race game. Given the varying relevance of visual location in this task, we hypothesized that the strength of neural responses to movement will vary across the visual field, and it would differ between active play and passive viewing. To test this, we measure the correlation strength of scalp‐evoked potentials with optical flow magnitude at individual locations on the screen. We find that neural responses are strongly correlated at task‐relevant locations in visual space, extending beyond the focus of overt attention. Although the driver's gaze is directed upon the heading direction at the centre of the screen, neural responses were robust at the peripheral areas (e.g. roads and surrounding buildings). Importantly, neural responses to visual movement are broadly distributed across the scalp, with visual spatial selectivity differing across electrode locations. Moreover, during active gameplay, neural responses are enhanced at select locations in the visual space. Conventionally, spatial selectivity of neural response has been interpreted as an attentional gain mechanism. In the present study, the data suggest that different brain areas focus attention on different portions of the visual field that are task‐relevant, beyond the focus of overt attention.
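The central measurement — correlating the evoked neural signal with optical-flow magnitude at each location on the screen — amounts to building a per-location correlation map. A hedged sketch on a simulated screen grid (tile counts and the injected task-relevant region are invented):

```python
import numpy as np

rng = np.random.default_rng(9)
T, H, W = 4000, 12, 16                       # time points, screen grid (rows x cols)
flow = np.abs(rng.standard_normal((T, H, W)))        # optical-flow magnitude per tile
eeg = rng.standard_normal(T)
eeg += 0.5 * flow[:, 5:7, 6:10].mean(axis=(1, 2))    # responses driven by mid-screen motion

# Pearson correlation between the evoked signal and flow at every location
fz = (flow - flow.mean(0)) / flow.std(0)
ez = (eeg - eeg.mean()) / eeg.std()
corr_map = (fz * ez[:, None, None]).mean(axis=0)     # (H, W) selectivity map
print("peak correlation at tile", np.unravel_index(corr_map.argmax(), corr_map.shape))
```

Comparing such maps between active play and passive viewing, and across electrodes, is how the study localizes task-dependent spatial selectivity.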
Collapse
Affiliation(s)
- Jason J Ki
- Department of Biomedical Engineering, City College of the City University of New York, New York, New York, USA
| | - Jacek P Dmochowski
- Department of Biomedical Engineering, City College of the City University of New York, New York, New York, USA
| | | | - Lucas C Parra
- Department of Biomedical Engineering, City College of the City University of New York, New York, New York, USA
| |
Collapse
|
39
|
Kang YY, Li JJ, Sun JX, Wei JX, Ding C, Shi CL, Wu G, Li K, Ma YF, Sun Y, Qiao H. Genome-wide scanning for CHD1L gene in papillary thyroid carcinoma complicated with type 2 diabetes mellitus. Clin Transl Oncol 2021; 23:2536-2547. [PMID: 34245428 DOI: 10.1007/s12094-021-02656-z] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/08/2021] [Accepted: 05/28/2021] [Indexed: 02/06/2023]
Abstract
PURPOSE Papillary thyroid carcinoma (PTC) represents the most common subtype of thyroid cancer (TC). This study set out to explore the potential effect of CHD1L on PTC and type 2 diabetes mellitus (T2DM). METHODS We searched for T2DM susceptibility genes through the GWAS database and obtained T2DM-related differentially expressed genes from the GEO database. The expression and clinical data of TC and normal samples were collated from the TCGA database. Receiver operating characteristic (ROC) curve analysis was subsequently applied to assess the sensitivity and specificity of CHD1L for the diagnosis of PTC. The MCP-counter package in R was then utilized to generate immune cell scores to evaluate the relationship between CHD1L expression and immune cells. Then, we performed functional enrichment analysis of co-expressed genes and DEGs to determine significantly enriched GO terms and KEGG pathways to predict the potential functions of CHD1L in PTC samples and T2DM adipose tissue. RESULTS Two genes (ABCB9, CHD1L) were identified as DEGs (p < 1 × 10^-5) that affected survival (HR > 1, p < 0.05) in PTC and served as T2DM susceptibility genes. The gene expression matrix-based scoring of immunocytes suggested that PTC samples with high and low CHD1L expression presented with significant differences in the tumor microenvironment (TME). The enrichment analysis of CHD1L co-expressed genes and DEGs suggested that CHD1L is involved in multiple pathways that regulate the development of PTC. Among them, Kaposi sarcoma-associated herpesvirus infection, Salmonella infection, and TNF signaling were highlighted as the three most relevant pathways. GSEA of PTC samples and T2DM adipose tissue grouped by high and low CHD1L expression suggests that these differential genes are related to the chemokine signaling, leukocyte transendothelial migration, and T cell receptor signaling pathways. CONCLUSION CHD1L may potentially serve as an early diagnostic biomarker for PTC, and as a target of immunotherapy for PTC and T2DM.
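The ROC step the methods describe is standard and easy to sketch. The expression values below are simulated placeholders, not the study's data; only the scikit-learn calls reflect common practice for this kind of diagnostic evaluation.

```python
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

rng = np.random.default_rng(10)
# Hypothetical CHD1L expression: elevated in tumor vs. normal samples
expr_normal = rng.normal(5.0, 1.0, 60)
expr_tumor = rng.normal(6.5, 1.2, 60)
expression = np.concatenate([expr_normal, expr_tumor])
label = np.concatenate([np.zeros(60), np.ones(60)])   # 1 = PTC

fpr, tpr, thresholds = roc_curve(label, expression)
auc = roc_auc_score(label, expression)
youden = thresholds[np.argmax(tpr - fpr)]             # Youden-optimal cutoff
print(f"AUC = {auc:.2f}, best threshold = {youden:.2f}")
```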
Collapse
Affiliation(s)
- Y Y Kang
- Department of Endocrinology and Metabolism, The Second Affiliated Hospital of Harbin Medical University, No. 246, Xuefu Road, Nangang District, Harbin, 150081, Heilongjiang, People's Republic of China.,Department of Endocrinology and Metabolism, The Fourth Affiliated Hospital of Harbin Medical University, Harbin, 150081, Heilongjiang, People's Republic of China
| | - J J Li
- Department of Endocrinology and Metabolism, The Second Affiliated Hospital of Harbin Medical University, No. 246, Xuefu Road, Nangang District, Harbin, 150081, Heilongjiang, People's Republic of China
| | - J X Sun
- Department of Endocrinology and Metabolism, The Second Affiliated Hospital of Harbin Medical University, No. 246, Xuefu Road, Nangang District, Harbin, 150081, Heilongjiang, People's Republic of China
| | - J X Wei
- Department of Endocrinology and Metabolism, The Second Affiliated Hospital of Harbin Medical University, No. 246, Xuefu Road, Nangang District, Harbin, 150081, Heilongjiang, People's Republic of China
| | - C Ding
- Departments of General Surgery, The Second Affiliated Hospital of Harbin Medical University, Harbin, 150081, Heilongjiang, People's Republic of China
| | - C L Shi
- Departments of General Surgery, The Second Affiliated Hospital of Harbin Medical University, Harbin, 150081, Heilongjiang, People's Republic of China
| | - G Wu
- Departments of General Surgery, The Second Affiliated Hospital of Harbin Medical University, Harbin, 150081, Heilongjiang, People's Republic of China
| | - K Li
- Departments of General Surgery, The Second Affiliated Hospital of Harbin Medical University, Harbin, 150081, Heilongjiang, People's Republic of China
| | - Y F Ma
- Departments of General Surgery, The Second Affiliated Hospital of Harbin Medical University, Harbin, 150081, Heilongjiang, People's Republic of China
| | - Y Sun
- Departments of General Surgery, The Second Affiliated Hospital of Harbin Medical University, Harbin, 150081, Heilongjiang, People's Republic of China
| | - H Qiao
- Department of Endocrinology and Metabolism, The Second Affiliated Hospital of Harbin Medical University, No. 246, Xuefu Road, Nangang District, Harbin, 150081, Heilongjiang, People's Republic of China.
| |
Collapse
|
40
|
Boos M, Lücke J, Rieger JW. Generalizable dimensions of human cortical auditory processing of speech in natural soundscapes: A data-driven ultra high field fMRI approach. Neuroimage 2021; 237:118106. [PMID: 33991696 DOI: 10.1016/j.neuroimage.2021.118106] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/27/2021] [Accepted: 04/25/2021] [Indexed: 11/27/2022] Open
Abstract
Speech comprehension in natural soundscapes rests on the ability of the auditory system to extract speech information from a complex acoustic signal with overlapping contributions from many sound sources. Here we reveal the canonical processing of speech in natural soundscapes on multiple scales by using data-driven sound-characterization models to analyze ultra-high-field fMRI recorded while participants listened to the audio soundtrack of a movie. We show that at the functional level the neuronal processing of speech in natural soundscapes can be surprisingly low dimensional in the human cortex, highlighting the functional efficiency of the auditory system for a seemingly complex task. Particularly, we find that a model comprising three functional dimensions of auditory processing in the temporal lobes is shared across participants' fMRI activity. We further demonstrate that the three functional dimensions are implemented in anatomically overlapping networks that process different aspects of speech in natural soundscapes. One is most sensitive to complex auditory features present in speech, another to complex auditory features and fast temporal modulations that are not specific to speech, and one mainly codes sound level. These results were derived with few a priori assumptions and provide a detailed and computationally reproducible account of the cortical activity in the temporal lobe elicited by the processing of speech in natural soundscapes.
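The low-dimensionality claim can be illustrated with a far simpler decomposition than the paper's data-driven model: stack subjects' voxel time series and inspect the singular-value spectrum. This sketch plants three shared dimensions in synthetic data; all sizes are invented.

```python
import numpy as np

rng = np.random.default_rng(12)
n_subj, n_voxels, T, k = 5, 300, 1000, 3
latent = rng.standard_normal((T, k))             # three shared functional dimensions
data = [latent @ rng.standard_normal((k, n_voxels))
        + 0.5 * rng.standard_normal((T, n_voxels)) for _ in range(n_subj)]

# Stack subjects along the voxel axis and take an SVD over time
stacked = np.hstack(data)                        # (T, n_subj * n_voxels)
stacked -= stacked.mean(0)
U, S, Vt = np.linalg.svd(stacked, full_matrices=False)
var = S**2 / np.sum(S**2)
print("variance explained by first 5 components:", np.round(var[:5], 3))
```

A spectrum dominated by the first three components is the signature of a shared low-dimensional functional space of the kind the abstract reports.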
Affiliation(s)
- Moritz Boos: Applied Neurocognitive Psychology Lab, University of Oldenburg, Oldenburg, Germany; Cluster of Excellence "Hearing4all", University of Oldenburg, Oldenburg, Germany
- Jörg Lücke: Machine Learning Division, University of Oldenburg, Oldenburg, Germany; Cluster of Excellence "Hearing4all", University of Oldenburg, Oldenburg, Germany
- Jochem W Rieger: Applied Neurocognitive Psychology Lab, University of Oldenburg, Oldenburg, Germany; Cluster of Excellence "Hearing4all", University of Oldenburg, Oldenburg, Germany
41
Grzywacz NM. Stochasticity, Nonlinear Value Functions, and Update Rules in Learning Aesthetic Biases. Front Hum Neurosci 2021;15:639081. [PMID: 34040509] [PMCID: PMC8141583] [DOI: 10.3389/fnhum.2021.639081]
Abstract
A theoretical framework for the reinforcement learning of aesthetic biases was recently proposed based on brain circuitries revealed by neuroimaging. A model grounded in that framework accounted for interesting features of human aesthetic biases, including individuality, cultural predispositions, stochastic dynamics of learning and aesthetic biases, and the peak-shift effect. However, despite its success in explaining these features, a potential weakness was the linearity of the value function used to predict reward: the learning process employed a value function that assumed a linear relationship between reward and sensory stimuli. Linearity is common in reinforcement learning in neuroscience, but it can be problematic because neural mechanisms and the dependence of reward on sensory stimuli are typically nonlinear. Here, we analyze learning performance with models that include optimal nonlinear value functions. We also compare updating the free parameters of the value functions with the delta rule, which neuroscience models use frequently, versus updating with a new Phi rule that takes the structure of the nonlinearities into account. Our computer simulations showed that optimal nonlinear value functions reduced learning errors when the reward models were nonlinear. Similarly, the new Phi rule reduced these errors. These improvements were accompanied by a straightening of the trajectories of the vector of free parameters in its phase space, meaning that the process became more efficient in learning to predict reward. Surprisingly, however, this improved efficiency had a complex relationship with the rate of learning. Finally, the stochasticity arising from the probabilistic sampling of sensory stimuli, rewards, and motivations helped the learning process narrow the range of free parameters to nearly optimal outcomes. We therefore suggest that value functions and update rules optimized for social and ecological constraints are ideal for learning aesthetic biases.
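To make the linear/nonlinear contrast concrete, here is a minimal sketch of delta-rule learning with a linear versus a matched nonlinear value function. The reward model, learning rate, and tanh parameterization are assumptions for illustration; the paper's Phi rule is not reproduced, since its exact form is specific to that work.

import numpy as np

rng = np.random.default_rng(1)
eta = 0.05

def reward(s):
    # Assumed nonlinear dependence of reward on a scalar sensory stimulus.
    return np.tanh(2.0 * s)

# Linear value function V(s) = w0 + w1*s, trained with the delta rule.
w = np.zeros(2)
for _ in range(5000):
    s = rng.uniform(-1, 1)
    delta = reward(s) - (w[0] + w[1] * s)    # reward prediction error
    w += eta * delta * np.array([1.0, s])    # delta-rule update

# Nonlinear value function V(s) = tanh(a*s), trained by gradient on the same error.
a = 0.0
for _ in range(5000):
    s = rng.uniform(-1, 1)
    v = np.tanh(a * s)
    a += eta * (reward(s) - v) * s * (1 - v ** 2)   # d tanh(a*s)/da = s*(1 - v^2)

# The matched nonlinearity leaves a smaller residual error than the linear fit.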
Affiliation(s)
- Norberto M Grzywacz: Department of Psychology, Loyola University Chicago, Chicago, IL, United States; Department of Molecular Pharmacology and Neuroscience, Loyola University Chicago, Chicago, IL, United States
42
Belo J, Clerc M, Schön D. EEG-Based Auditory Attention Detection and Its Possible Future Applications for Passive BCI. Front Comput Sci 2021. [DOI: 10.3389/fcomp.2021.661178]
Abstract
The ability to discriminate and attend to one specific sound source in a complex auditory environment is a fundamental skill for efficient communication: it allows us to follow a family conversation or converse with a friend in a bar. This ability is challenged in hearing-impaired individuals, particularly those with a cochlear implant (CI): due to the limited spectral resolution of the implant, auditory perception remains quite poor in a noisy environment or in the presence of simultaneous auditory sources. Recent methodological advances now make it possible to detect, on the basis of neural signals, which auditory stream within a set of multiple concurrent streams an individual is attending to. This approach, called EEG-based auditory attention detection (AAD), builds on fundamental research findings demonstrating that, in a multi-speaker scenario, cortical tracking of the envelope of the attended speech is enhanced compared to the unattended speech. Following these findings, other studies showed that EEG/MEG (electroencephalography/magnetoencephalography) can be used to explore auditory attention during speech listening in a cocktail-party-like scenario. Overall, these findings make it possible to conceive of next-generation hearing aids combining conventional technology with AAD. Importantly, AAD also has great potential in the context of passive BCI, in education, and in interactive music performance. In this mini review, we first present the different approaches to AAD and the main limitations of the global concept. We then describe its potential applications in the world of non-clinical passive BCI.
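The core AAD computation, reconstructing the speech envelope from lagged EEG and comparing it against the competing streams, can be sketched in a few lines. The sampling rate, lag window, regularization strength, and array names below are illustrative assumptions.

import numpy as np
from sklearn.linear_model import Ridge

def lagged(eeg, n_lags):
    # Stack time-lagged copies of the channels: (samples, channels * n_lags).
    n, c = eeg.shape
    X = np.zeros((n, c * n_lags))
    for k in range(n_lags):
        X[k:, k * c:(k + 1) * c] = eeg[:n - k]
    return X

# Illustrative data: EEG (samples x channels) and the two speech envelopes.
fs = 64
rng = np.random.default_rng(2)
eeg = rng.standard_normal((fs * 60, 32))
env_attended = rng.standard_normal(fs * 60)
env_unattended = rng.standard_normal(fs * 60)

# Backward model: reconstruct the attended envelope from lagged EEG. In
# practice the decoder is trained and tested on separate trials.
X = lagged(eeg, n_lags=16)                  # ~250 ms of lags at 64 Hz
rec = Ridge(alpha=1e3).fit(X, env_attended).predict(X)

# Attention is decoded as the stream whose envelope best matches the reconstruction.
r_att = np.corrcoef(rec, env_attended)[0, 1]
r_unatt = np.corrcoef(rec, env_unattended)[0, 1]
print("attended" if r_att > r_unatt else "unattended")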
43
Sachdeva PS, Livezey JA, Dougherty ME, Gu BM, Berke JD, Bouchard KE. Improved inference in coupling, encoding, and decoding models and its consequence for neuroscientific interpretation. J Neurosci Methods 2021;358:109195. [PMID: 33905791] [DOI: 10.1016/j.jneumeth.2021.109195]
Abstract
BACKGROUND: A central goal of systems neuroscience is to understand the relationships amongst constituent units in neural populations, and their modulation by external factors, using high-dimensional and stochastic neural recordings. Parametric statistical models (e.g., coupling, encoding, and decoding models) play an instrumental role in accomplishing this goal. However, extracting conclusions from a parametric model requires that it is fit using an inference algorithm capable of selecting the correct parameters and properly estimating their values. Traditional approaches to parameter inference have been shown to suffer from failures in both selection and estimation. The recent development of algorithms that ameliorate these deficiencies raises the question of whether past work relying on such inference procedures has produced inaccurate systems neuroscience models, thereby impairing their interpretation.
NEW METHOD: We used algorithms based on Union of Intersections (UoI), a statistical inference framework based on stability principles that is capable of improved selection and estimation.
COMPARISON WITH EXISTING METHODS: We fit functional coupling, encoding, and decoding models across a battery of neural datasets using both UoI and baseline inference procedures (e.g., ℓ1-penalized GLMs), and compared the structure of their fitted parameters.
RESULTS: Across recording modality, brain region, and task, we found that UoI inferred models with increased sparsity, improved stability, and qualitatively different parameter distributions, while maintaining predictive performance. We obtained highly sparse functional coupling networks with substantially different community structure, more parsimonious encoding models, and decoding models that relied on fewer single units.
CONCLUSIONS: Together, these results demonstrate that improved parameter inference, achieved via UoI, reshapes interpretation in diverse neuroscience contexts.
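As a concrete point of reference, this sketch fits the baseline type of functional coupling model the abstract names: an ℓ1-penalized linear fit per unit, whose nonzero weights define the coupling network. The data shapes are assumptions, and swapping in a UoI estimator (the authors distribute a PyUoI package) in place of LassoCV is the comparison the paper performs.

import numpy as np
from sklearn.linear_model import LassoCV

# Illustrative data: binned spike counts, samples x neurons (shapes assumed).
rng = np.random.default_rng(3)
counts = rng.poisson(2.0, size=(1000, 30)).astype(float)
n_neurons = counts.shape[1]

# Functional coupling: predict each unit from all others with an l1-penalized
# model; the nonzero weights define that unit's coupling network.
coupling = np.zeros((n_neurons, n_neurons))
for i in range(n_neurons):
    X = np.delete(counts, i, axis=1)
    fit = LassoCV(cv=5).fit(X, counts[:, i])
    coupling[i, np.arange(n_neurons) != i] = fit.coef_

# Replacing LassoCV with a Union of Intersections estimator is what the paper
# reports yields sparser, more stable coupling networks.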
Affiliation(s)
- Pratik S Sachdeva: Redwood Center for Theoretical Neuroscience, University of California, Berkeley, 94720, CA, USA; Department of Physics, University of California, Berkeley, 94720, CA, USA; Biological Systems and Engineering Division, Lawrence Berkeley National Laboratory, Berkeley, 94720, CA, USA
- Jesse A Livezey: Redwood Center for Theoretical Neuroscience, University of California, Berkeley, 94720, CA, USA; Biological Systems and Engineering Division, Lawrence Berkeley National Laboratory, Berkeley, 94720, CA, USA
- Maximilian E Dougherty: Biological Systems and Engineering Division, Lawrence Berkeley National Laboratory, Berkeley, 94720, CA, USA
- Bon-Mi Gu: Department of Neurology, University of California, San Francisco, San Francisco, 94143, CA, USA
- Joshua D Berke: Department of Neurology, University of California, San Francisco, San Francisco, 94143, CA, USA; Department of Psychiatry; Neuroscience Graduate Program; Kavli Institute for Fundamental Neuroscience; Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, 94143, CA, USA
- Kristofer E Bouchard: Redwood Center for Theoretical Neuroscience, University of California, Berkeley, 94720, CA, USA; Biological Systems and Engineering Division, Lawrence Berkeley National Laboratory, Berkeley, 94720, CA, USA; Computational Resources Division, Lawrence Berkeley National Laboratory, Berkeley, 94720, CA, USA; Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, 94720, CA, USA
44
Bayet L, Saville A, Balas B. Sensitivity to face animacy and inversion in childhood: Evidence from EEG data. Neuropsychologia 2021;156:107838. [PMID: 33775702] [DOI: 10.1016/j.neuropsychologia.2021.107838]
Abstract
Adults exhibit relative behavioral difficulties in processing inanimate, artificial faces compared to real human faces, with implications for using artificial faces in research and designing artificial social agents. However, the developmental trajectory of inanimate face perception is unknown. To address this gap, we used electroencephalography to investigate inanimate face processing in cross-sectional groups of 5-10-year-old children and adults. A face inversion manipulation was used to test whether face animacy processing relies on expert face processing strategies. Groups of 5-7-year-olds (N = 18), 8-10-year-olds (N = 18), and adults (N = 16) watched pictures of real or doll faces presented in an upright or inverted orientation. Analyses of event-related potentials revealed larger N170 amplitudes in response to doll faces, irrespective of age group or face orientation. Thus, the N170 is sensitive to face animacy by 5-7 years of age, but such sensitivity may not reflect high-level, expert face processing. Multivariate pattern analyses of the EEG signal additionally assessed whether animacy information could be reliably extracted during face processing. Face orientation, but not face animacy, could be reliably decoded from occipitotemporal channels in children and adults. Face animacy could be decoded from whole-scalp channels in adults, but not children. Together, these results suggest that 5-10-year-old children exhibit some sensitivity to face animacy over occipitotemporal regions that is comparable to adults.
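The multivariate decoding analysis can be sketched with standard tools. The epoch count, the channels-by-time feature layout, and the classifier choice below are assumptions for illustration, not the study's exact pipeline.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Illustrative data: 200 epochs, 64 channels x 50 time points flattened into
# one feature vector per epoch; labels 0 = upright face, 1 = inverted face.
rng = np.random.default_rng(4)
X = rng.standard_normal((200, 64 * 50))
y = rng.integers(0, 2, size=200)

# Cross-validated decoding; above-chance accuracy (> 0.5) indicates that the
# labeled distinction (here, face orientation) is reliably present in the EEG.
clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
print(cross_val_score(clf, X, y, cv=5).mean())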
Affiliation(s)
- Laurie Bayet: Department of Psychology and Center for Neuroscience and Behavior, American University, Washington, DC, USA
- Alyson Saville: Department of Psychology, North Dakota State University, Fargo, ND, USA
- Benjamin Balas: Department of Psychology, North Dakota State University, Fargo, ND, USA
45
Modulation Spectra Capture EEG Responses to Speech Signals and Drive Distinct Temporal Response Functions. eNeuro 2021;8:ENEURO.0399-20.2020. [PMID: 33272971] [PMCID: PMC7810259] [DOI: 10.1523/eneuro.0399-20.2020]
Abstract
Speech signals have a uniquely shaped long-term modulation spectrum that is distinct from environmental noise, music, and non-speech vocalizations. Does the human auditory system adapt to the speech long-term modulation spectrum and efficiently extract critical information from speech signals? To answer this question, we tested whether neural responses to speech signals can be captured by specific modulation spectra of non-speech acoustic stimuli. We generated amplitude-modulated (AM) noise with the speech modulation spectrum and with 1/f modulation spectra of different exponents to imitate the temporal dynamics of different natural sounds. We presented these AM stimuli and a 10-min piece of natural speech to 19 human participants undergoing electroencephalography (EEG) recording. We derived temporal response functions (TRFs) to the AM stimuli of different spectrum shapes and found distinct neural dynamics for each type of TRF. We then used the TRFs of the AM stimuli to predict neural responses to the speech signals and found that (1) the TRFs of AM stimuli with modulation-spectrum exponents of 1, 1.5, and 2 preferentially captured EEG responses to speech signals in the δ band, and (2) the θ-band neural responses to speech can be captured by AM stimuli with an exponent of 0.75. Our results suggest that the human auditory system shows specificity to the long-term modulation spectrum and is equipped with characteristic neural algorithms tailored to extracting critical acoustic information from speech signals.
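One way to synthesize AM noise whose modulation spectrum falls off as 1/f**exponent is sketched below: shape the amplitude spectrum of a random-phase modulator, then multiply a broadband noise carrier. The sampling rate, duration, and normalization are assumptions, not the study's exact stimulus code.

import numpy as np

def am_noise(duration, fs, exponent, seed=0):
    # AM noise whose modulation power spectrum falls off as 1/f**exponent.
    rng = np.random.default_rng(seed)
    n = int(duration * fs)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    amp = np.zeros_like(freqs)
    amp[1:] = freqs[1:] ** (-exponent / 2.0)    # amplitude ~ f^(-exp/2), power ~ 1/f^exp
    phases = rng.uniform(0.0, 2.0 * np.pi, size=freqs.size)
    modulator = np.fft.irfft(amp * np.exp(1j * phases), n=n)
    modulator = (modulator - modulator.min()) / np.ptp(modulator)  # scale to [0, 1]
    carrier = rng.standard_normal(n)             # broadband noise carrier
    return modulator * carrier

stim = am_noise(duration=10.0, fs=16000, exponent=1.5)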
46
Extracting human cortical responses to sound onsets and acoustic feature changes in real music, and their relation to event rate. Brain Res 2021;1754:147248. [PMID: 33417893] [DOI: 10.1016/j.brainres.2020.147248]
Abstract
Evoked cortical responses (ERs) have mainly been studied in controlled experiments using simplified stimuli. An outstanding question, however, is how the human cortex responds to the complex stimuli encountered in realistic situations. A few electroencephalography (EEG) studies have used Music Information Retrieval (MIR) tools to extract cortical P1/N1/P2 responses to acoustic changes in real music. However, owing to limitations in the automatic detection of sound onsets, fewer than ten events per music piece could be detected that led to ERs. Moreover, the factors influencing successful extraction of the ERs have not been identified, and previous studies did not localize the sources of the cortical generators. This study is based on an EEG/MEG dataset from 48 healthy normal-hearing participants listening to three real music pieces. Acoustic features were computed from the audio signal of the music with the MIR Toolbox. To overcome the limits of automatic methods, sound onsets were also manually detected. The chance of obtaining detectable ERs based on ten randomly picked onset points was less than 1:10,000. For the first time, we show that naturalistic P1/N1/P2 ERs can be reliably measured across 100 manually identified sound onsets, substantially improving the signal-to-noise level compared to <10 trials. More ERs were measurable in musical sections with slow event rates (0.2 Hz-2.5 Hz) than with fast event rates (>2.5 Hz). Furthermore, during monophonic sections of the music only P1/P2 were measurable, and during polyphonic sections only N1. Finally, MEG source analysis revealed that the naturalistic P2 is located in core areas of the auditory cortex.
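The core extraction step, averaging epochs time-locked to manually identified onsets, can be sketched as follows. The sampling rate, epoch window, and baseline choice are illustrative assumptions.

import numpy as np

# Illustrative data: one EEG channel and 100 manually identified onsets (seconds).
fs = 250
rng = np.random.default_rng(5)
eeg = rng.standard_normal(fs * 600)
onsets = np.sort(rng.uniform(1.0, 599.0, size=100))

pre, post = int(0.1 * fs), int(0.5 * fs)    # -100 ms to +500 ms around each onset
epochs = []
for t in onsets:
    i = int(t * fs)
    seg = eeg[i - pre:i + post].copy()
    seg -= seg[:pre].mean()                 # baseline-correct on the pre-onset window
    epochs.append(seg)

erp = np.mean(epochs, axis=0)  # with ~100 onsets, P1/N1/P2 become measurable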
47
Basgöze Z, White DN, Burge J, Cooper EA. Natural statistics of depth edges modulate perceptual stability. J Vis 2020;20:10. [PMID: 32761107] [PMCID: PMC7438667] [DOI: 10.1167/jov.20.8.10]
Abstract
Binocular fusion relies on matching points in the two eyes that correspond to the same physical feature in the world; however, not all world features are binocularly visible. Near depth edges, some regions of a scene are often visible to only one eye (so-called half occlusions). Accurate detection of these monocularly visible regions is likely to be important for stable visual perception. If monocular regions are not detected as such, the visual system may attempt to binocularly fuse non-corresponding points, which can result in unstable percepts. We investigated the hypothesis that the visual system capitalizes on statistical regularities associated with depth edges in natural scenes to aid binocular fusion and facilitate perceptual stability. By sampling from a large set of stereoscopic natural images with co-registered distance information, we found evidence that monocularly visible regions near depth edges primarily result from background occlusions. Accordingly, monocular regions tended to be more visually similar to the adjacent binocularly visible background region than to the adjacent binocularly visible foreground. Consistent with our hypothesis, perceptual experiments showed that perception tended to be more stable when the image properties of the depth edge were statistically more likely given the probability of occurrence in natural scenes (i.e., when monocular regions were more visually similar to the binocular background). The generality of these results was supported by a parametric study with simulated environments. Exploiting regularities in natural environments may allow the visual system to facilitate fusion and perceptual stability when both binocular and monocular regions are visible.
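The statistical comparison at the heart of the image analysis can be caricatured in a few lines: is a monocular region more similar to the adjacent binocular background than to the foreground? The patches and the mean-luminance similarity measure below are crude stand-in assumptions for the richer measures applied to natural stereo images.

import numpy as np

def similarity(a, b):
    # Crude visual-similarity proxy: negative absolute mean-luminance difference.
    return -abs(a.mean() - b.mean())

# Illustrative patches: because monocular regions near depth edges arise mostly
# from background occlusion, they tend to share the background's statistics.
rng = np.random.default_rng(6)
background = rng.uniform(0.4, 0.6, size=(20, 20))
foreground = rng.uniform(0.7, 0.9, size=(20, 20))
monocular = rng.uniform(0.4, 0.6, size=(20, 5))   # half-occluded strip

print(similarity(monocular, background) > similarity(monocular, foreground))  # True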
48
Sohoglu E, Davis MH. Rapid computations of spectrotemporal prediction error support perception of degraded speech. eLife 2020;9:e58077. [PMID: 33147138] [PMCID: PMC7641582] [DOI: 10.7554/elife.58077]
Abstract
Human speech perception can be described as Bayesian perceptual inference, but how are these Bayesian computations instantiated neurally? We used magnetoencephalographic recordings of brain responses to degraded spoken words and experimentally manipulated signal quality and prior knowledge. We first demonstrate that spectrotemporal modulations in speech are more strongly represented in neural responses than alternative speech representations (e.g., spectrogram or articulatory features). Critically, we found an interaction between speech signal quality and expectations from prior written text on the quality of neural representations: increased signal quality enhanced neural representations of speech that mismatched prior expectations, but led to greater suppression of speech that matched prior expectations. This interaction is a unique neural signature of prediction error computations and is apparent in neural responses within 100 ms of speech input. Our findings contribute to the detailed specification of a computational model of speech perception based on predictive coding frameworks.
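The representational comparison reported in the first result, asking which speech feature set best predicts the neural response, can be sketched as cross-validated encoding models. The feature dimensions and simulated response below are assumptions; by construction, the response here is driven by the modulation features.

import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import cross_val_score

# Illustrative data: two candidate speech representations and a neural response.
rng = np.random.default_rng(7)
modulations = rng.standard_normal((2000, 40))   # spectrotemporal modulations
spectrogram = rng.standard_normal((2000, 40))   # competing spectrogram features
response = modulations @ rng.standard_normal(40) + rng.standard_normal(2000)

# The better representation is the one with the higher cross-validated fit (R^2).
for name, X in [("modulations", modulations), ("spectrogram", spectrogram)]:
    print(name, cross_val_score(RidgeCV(), X, response, cv=5).mean())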
Affiliation(s)
- Ediz Sohoglu: School of Psychology, University of Sussex, Brighton, United Kingdom
- Matthew H Davis: MRC Cognition and Brain Sciences Unit, Cambridge, United Kingdom
49
Sohoglu E, Kumar S, Chait M, Griffiths TD. Multivoxel codes for representing and integrating acoustic features in human cortex. Neuroimage 2020;217:116661. [PMID: 32081785] [PMCID: PMC7339141] [DOI: 10.1016/j.neuroimage.2020.116661]
Abstract
Using fMRI and multivariate pattern analysis, we determined whether spectral and temporal acoustic features are represented by independent or integrated multivoxel codes in human cortex. Listeners heard band-pass noise varying in frequency (spectral) and amplitude-modulation (AM) rate (temporal) features. In the superior temporal plane, changes in multivoxel activity due to frequency were largely invariant with respect to AM rate (and vice versa), consistent with an independent representation. In contrast, in posterior parietal cortex, multivoxel representation was exclusively integrated and tuned to specific conjunctions of frequency and AM features (albeit weakly). Direct between-region comparisons show that whereas independent coding of frequency weakened with increasing levels of the hierarchy, such a progression for AM and integrated coding was less fine-grained and only evident in the higher hierarchical levels from non-core to parietal cortex (with AM coding weakening and integrated coding strengthening). Our findings support the notion that primary auditory cortex can represent spectral and temporal acoustic features in an independent fashion and suggest a role for parietal cortex in feature integration and the structuring of sensory input.
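The invariance test implied here can be sketched as cross-condition decoding: train a frequency classifier on trials at one AM rate and test it on trials at the other; transfer above chance indicates a frequency code that is invariant to AM rate. The simulated voxel patterns below are assumptions that build in the independent-code scenario.

import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(8)
n_trials, n_voxels = 50, 200
# Independent-code assumption: the mean pattern depends on frequency only,
# so changing AM rate leaves the frequency pattern unchanged.
mean_pattern = {f: rng.standard_normal(n_voxels) for f in (0, 1)}

def patterns(freq):
    return mean_pattern[freq] + rng.standard_normal((n_trials, n_voxels))

# Train the frequency decoder on AM rate 1 trials, test on AM rate 2 trials.
X_rate1 = np.vstack([patterns(0), patterns(1)])
X_rate2 = np.vstack([patterns(0), patterns(1)])
y = np.r_[np.zeros(n_trials), np.ones(n_trials)]
clf = LinearSVC(max_iter=5000).fit(X_rate1, y)
print(clf.score(X_rate2, y))   # near 1.0 here, by construction of independence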
Affiliation(s)
- Ediz Sohoglu: School of Psychology, University of Sussex, Brighton, BN1 9QH, United Kingdom
- Sukhbinder Kumar: Institute of Neurobiology, Medical School, Newcastle University, Newcastle Upon Tyne, NE2 4HH, United Kingdom; Wellcome Trust Centre for Human Neuroimaging, University College London, London, WC1N 3BG, United Kingdom
- Maria Chait: Ear Institute, University College London, London, United Kingdom
- Timothy D Griffiths: Institute of Neurobiology, Medical School, Newcastle University, Newcastle Upon Tyne, NE2 4HH, United Kingdom; Wellcome Trust Centre for Human Neuroimaging, University College London, London, WC1N 3BG, United Kingdom
50
Berezutskaya J, Freudenburg ZV, Ambrogioni L, Güçlü U, van Gerven MAJ, Ramsey NF. Cortical network responses map onto data-driven features that capture visual semantics of movie fragments. Sci Rep 2020;10:12077. [PMID: 32694561] [PMCID: PMC7374611] [DOI: 10.1038/s41598-020-68853-y]
Abstract
Research on how the human brain extracts meaning from sensory input relies in principle on methodological reductionism. In the present study, we adopt a more holistic approach by modeling the cortical responses to semantic information extracted from the visual stream of a feature film, employing artificial neural network models. Advances in both computer vision and natural language processing were utilized to extract the semantic representations from the film by combining perceptual and linguistic information. We tested whether these representations were useful in studying human brain data. To this end, we collected electrocorticography responses to a short movie from 37 subjects and fitted their cortical patterns across multiple regions using the semantic components extracted from the film frames. We found that individual semantic components reflected fundamental semantic distinctions in the visual input, such as the presence or absence of people, human movement, landscape scenes, and human faces. Moreover, each semantic component mapped onto a distinct functional cortical network involving high-level cognitive regions in occipitotemporal, frontal, and parietal cortices. The present work demonstrates the potential of data-driven methods from information processing fields to explain patterns of cortical responses, and contributes to the overall discussion about the encoding of high-level perceptual information in the human brain.
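The fitting step, mapping data-driven semantic components of film frames onto cortical responses, can be sketched as a regularized encoding model. The embedding source, array shapes, and the dominance of one component below are assumptions; any frame-level feature extractor could stand in for the paper's vision/language pipeline.

import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import cross_val_score

# Illustrative data: per-frame semantic components (assumed precomputed) and
# a simulated neural response at one electrode, driven by component 0.
rng = np.random.default_rng(9)
semantic_components = rng.standard_normal((3000, 20))   # frames x components
ecog = semantic_components[:, 0] * 2.0 + rng.standard_normal(3000)

# Cross-validated encoding fit; per-component weights indicate which semantic
# distinctions (people, movement, landscapes, faces, ...) an electrode tracks.
score = cross_val_score(RidgeCV(), semantic_components, ecog, cv=5).mean()
weights = RidgeCV().fit(semantic_components, ecog).coef_
print(score, np.argmax(np.abs(weights)))   # component 0 dominates by construction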
Affiliation(s)
- Julia Berezutskaya: Brain Center, Department of Neurology and Neurosurgery, University Medical Center Utrecht, Heidelberglaan 100, 3584 CX, Utrecht, The Netherlands; Donders Institute for Brain, Cognition and Behaviour, Radboud University, Montessorilaan 3, 6525 HR, Nijmegen, The Netherlands
- Zachary V Freudenburg: Brain Center, Department of Neurology and Neurosurgery, University Medical Center Utrecht, Heidelberglaan 100, 3584 CX, Utrecht, The Netherlands
- Luca Ambrogioni: Donders Institute for Brain, Cognition and Behaviour, Radboud University, Montessorilaan 3, 6525 HR, Nijmegen, The Netherlands
- Umut Güçlü: Donders Institute for Brain, Cognition and Behaviour, Radboud University, Montessorilaan 3, 6525 HR, Nijmegen, The Netherlands
- Marcel A J van Gerven: Donders Institute for Brain, Cognition and Behaviour, Radboud University, Montessorilaan 3, 6525 HR, Nijmegen, The Netherlands
- Nick F Ramsey: Brain Center, Department of Neurology and Neurosurgery, University Medical Center Utrecht, Heidelberglaan 100, 3584 CX, Utrecht, The Netherlands