1. Dado T, Papale P, Lozano A, Le L, Wang F, van Gerven M, Roelfsema P, Güçlütürk Y, Güçlü U. Brain2GAN: Feature-disentangled neural encoding and decoding of visual perception in the primate brain. PLoS Comput Biol 2024; 20:e1012058. PMID: 38709818; PMCID: PMC11098503; DOI: 10.1371/journal.pcbi.1012058.
Abstract
A challenging goal of neural coding is to characterize the neural representations underlying visual perception. To this end, multi-unit activity (MUA) of macaque visual cortex was recorded in a passive fixation task upon presentation of faces and natural images. We analyzed the relationship between MUA and latent representations of state-of-the-art deep generative models, including the conventional and feature-disentangled representations of generative adversarial networks (GANs) (i.e., the z- and w-latents of StyleGAN, respectively) and language-contrastive representations of latent diffusion networks (i.e., the CLIP-latents of Stable Diffusion). A mass univariate neural encoding analysis of the latent representations showed that feature-disentangled w representations outperform both z and CLIP representations in explaining neural responses. Further, w-latent features were found to be positioned at the higher end of the complexity gradient, indicating that they capture visual information relevant to high-level neural activity. Subsequently, a multivariate neural decoding analysis of the feature-disentangled representations resulted in state-of-the-art spatiotemporal reconstructions of visual perception. Taken together, our results not only highlight the important role of feature disentanglement in shaping high-level neural representations underlying visual perception but also serve as an important benchmark for the future of neural coding.
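The mass univariate encoding analysis described above can be sketched as one ridge regression per recording site, from latent features to responses, scored by held-out prediction accuracy. This is a minimal illustration on synthetic data, not the paper's actual pipeline; the function name, data shapes, and regularization setting are assumptions:

```python
import numpy as np

def ridge_encode(X_train, Y_train, X_test, alpha=1.0):
    """Fit one ridge regression per recording site (mass univariate encoding).

    X: (n_stimuli, n_features) latent representations (e.g. GAN w-latents).
    Y: (n_stimuli, n_sites) neural responses (e.g. MUA per electrode).
    Returns predicted test responses of shape (n_test, n_sites).
    """
    n_feat = X_train.shape[1]
    # Closed-form ridge solution, shared across sites: W = (X'X + aI)^-1 X'Y
    W = np.linalg.solve(X_train.T @ X_train + alpha * np.eye(n_feat),
                        X_train.T @ Y_train)
    return X_test @ W

# Synthetic check: responses that are a noisy linear readout of the latents
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 16))      # 200 stimuli, 16 latent dims
W_true = rng.standard_normal((16, 8))   # 8 recording sites
Y = X @ W_true + 0.1 * rng.standard_normal((200, 8))

Y_pred = ridge_encode(X[:150], Y[:150], X[150:])
# Per-site encoding performance: correlation of predicted vs. held-out responses
r = [np.corrcoef(Y_pred[:, i], Y[150:, i])[0, 1] for i in range(8)]
```

Comparing the per-site correlations obtained from competing feature spaces (here there is only one) is what allows the kind of z- vs. w- vs. CLIP comparison the abstract reports.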
Affiliation(s)
- Thirza Dado: Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, Netherlands
- Paolo Papale: Department of Vision and Cognition, Netherlands Institute for Neuroscience, Amsterdam, Netherlands
- Antonio Lozano: Department of Vision and Cognition, Netherlands Institute for Neuroscience, Amsterdam, Netherlands
- Lynn Le: Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, Netherlands
- Feng Wang: Department of Vision and Cognition, Netherlands Institute for Neuroscience, Amsterdam, Netherlands
- Marcel van Gerven: Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, Netherlands
- Pieter Roelfsema: Department of Vision and Cognition, Netherlands Institute for Neuroscience, Amsterdam, Netherlands; Laboratory of Visual Brain Therapy, Sorbonne University, Paris, France; Department of Integrative Neurophysiology, VU Amsterdam, Amsterdam, Netherlands; Department of Psychiatry, Amsterdam UMC, Amsterdam, Netherlands
- Yağmur Güçlütürk: Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, Netherlands
- Umut Güçlü: Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, Netherlands
2. de Ruyter van Steveninck J, Nipshagen M, van Gerven M, Güçlü U, Güçlütürk Y, van Wezel R. Gaze-contingent processing improves mobility, scene recognition and visual search in simulated head-steered prosthetic vision. J Neural Eng 2024; 21:026037. PMID: 38502957; DOI: 10.1088/1741-2552/ad357d.
Abstract
Objective.The enabling technology of visual prosthetics for the blind is making rapid progress. However, there are still uncertainties regarding the functional outcomes, which can depend on many design choices in the development. In visual prostheses with a head-mounted camera, a particularly challenging question is how to deal with the gaze-locked visual percept associated with spatial updating conflicts in the brain. The current study investigates a recently proposed compensation strategy based on gaze-contingent image processing with eye-tracking. Gaze-contingent processing is expected to reinforce natural-like visual scanning and reestablished spatial updating based on eye movements. The beneficial effects remain to be investigated for daily life activities in complex visual environments.Approach.The current study evaluates the benefits of gaze-contingent processing versus gaze-locked and gaze-ignored simulations in the context of mobility, scene recognition and visual search, using a virtual reality simulated prosthetic vision paradigm with sighted subjects.Main results.Compared to gaze-locked vision, gaze-contingent processing was consistently found to improve the speed in all experimental tasks, as well as the subjective quality of vision. Similar or further improvements were found in a control condition that ignores gaze-dependent effects, a simulation that is unattainable in the clinical reality.Significance.Our results suggest that gaze-locked vision and spatial updating conflicts can be debilitating for complex visually-guided activities of daily living such as mobility and orientation. Therefore, for prospective users of head-steered prostheses with an unimpaired oculomotor system, the inclusion of a compensatory eye-tracking system is strongly endorsed.
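Gaze-contingent processing of this kind can be illustrated by resampling the head-camera frame around the tracked gaze position before phosphene encoding, so that the percept follows the eyes rather than the head. A minimal sketch assuming a grayscale frame and pixel gaze coordinates (function and variable names are hypothetical, not from the paper's implementation):

```python
import numpy as np

def gaze_contingent_crop(frame, gaze_xy, size=64):
    """Crop a window centred on gaze (col, row), zero-padded at the edges.

    The prosthetic encoder then only sees the region the eyes point at,
    restoring gaze-centred input despite the head-mounted camera.
    """
    h, w = frame.shape
    half = size // 2
    out = np.zeros((size, size), dtype=frame.dtype)
    cx, cy = gaze_xy
    # Source rectangle clipped to the frame
    x0, x1 = max(cx - half, 0), min(cx + half, w)
    y0, y1 = max(cy - half, 0), min(cy + half, h)
    # Where that rectangle lands inside the output window
    ox, oy = x0 - (cx - half), y0 - (cy - half)
    out[oy:oy + (y1 - y0), ox:ox + (x1 - x0)] = frame[y0:y1, x0:x1]
    return out

frame = np.arange(100 * 120).reshape(100, 120).astype(float)
patch = gaze_contingent_crop(frame, gaze_xy=(60, 50))  # gaze near the centre
```

The gaze-locked condition corresponds to always cropping at the frame centre; the gaze-contingent condition updates `gaze_xy` from the eye tracker every frame.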
Affiliation(s)
- Mo Nipshagen: Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
- Marcel van Gerven: Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
- Umut Güçlü: Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
- Yağmur Güçlütürk: Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
- Richard van Wezel: Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands; Biomedical Signals and Systems Group, University of Twente, Enschede, The Netherlands
3. van der Grinten M, de Ruyter van Steveninck J, Lozano A, Pijnacker L, Rueckauer B, Roelfsema P, van Gerven M, van Wezel R, Güçlü U, Güçlütürk Y. Towards biologically plausible phosphene simulation for the differentiable optimization of visual cortical prostheses. eLife 2024; 13:e85812. PMID: 38386406; PMCID: PMC10883675; DOI: 10.7554/elife.85812.
Abstract
Blindness affects millions of people around the world. A promising solution for restoring a form of vision to some individuals is a cortical visual prosthesis, which bypasses part of the impaired visual pathway by converting camera input to electrical stimulation of the visual system. The artificially induced visual percept (a pattern of localized light flashes, or 'phosphenes') has limited resolution, and a great portion of the field's research is devoted to optimizing the efficacy, efficiency, and practical usefulness of the encoding of visual information. A commonly exploited method is non-invasive functional evaluation in sighted subjects or with computational models by using simulated prosthetic vision (SPV) pipelines. An important challenge in this approach is to balance enhanced perceptual realism, biological plausibility, and real-time performance in the simulation of cortical prosthetic vision. We present a biologically plausible, PyTorch-based phosphene simulator that can run in real time and uses differentiable operations to allow for gradient-based computational optimization of phosphene encoding models. The simulator integrates a wide range of clinical results with neurophysiological evidence in humans and non-human primates. The pipeline includes a model of the retinotopic organization and cortical magnification of the visual cortex. Moreover, the quantitative effects of stimulation parameters and temporal dynamics on phosphene characteristics are incorporated. Our results demonstrate the simulator's suitability both for computational applications, such as end-to-end deep learning-based prosthetic vision optimization, and for behavioral experiments. The modular and open-source software provides a flexible simulation framework for computational, clinical, and behavioral neuroscientists working on visual neuroprosthetics.
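The retinotopy and cortical-magnification component of such a simulator might look roughly like the following sketch, which assumes a simple monopole log-polar visuotopic model and purely illustrative parameter values; the actual simulator is PyTorch-based, differentiable end-to-end, and far more detailed:

```python
import numpy as np

def cortex_to_visual_field(z_cortex, k=17.3, a=0.75):
    """Inverse monopole retinotopic map: cortical position (mm, complex)
    -> visual-field position (deg, complex). From w = k*log(z + a) it
    follows that z = exp(w/k) - a. Parameter values are illustrative."""
    return np.exp(z_cortex / k) - a

def render_phosphenes(ecc_x, ecc_y, size=128, fov=20.0, base_sigma=0.3):
    """Render one Gaussian blob per phosphene; blob width grows with
    eccentricity, a crude stand-in for cortical magnification effects."""
    lin = np.linspace(-fov / 2, fov / 2, size)
    gx, gy = np.meshgrid(lin, lin)
    img = np.zeros((size, size))
    for x, y in zip(ecc_x, ecc_y):
        sigma = base_sigma * (1 + 0.5 * np.hypot(x, y))
        img += np.exp(-((gx - x) ** 2 + (gy - y) ** 2) / (2 * sigma ** 2))
    return img

# A small grid of simulated electrodes on the cortical sheet (mm coordinates)
cx, cy = np.meshgrid(np.linspace(5, 40, 6), np.linspace(-15, 15, 6))
z = cortex_to_visual_field(cx + 1j * cy)
img = render_phosphenes(z.real.ravel(), z.imag.ravel())
```

Because every operation here is smooth, a version written with differentiable tensor operations admits gradient-based optimization of the encoder that chooses which electrodes to stimulate, which is the point of the paper's design.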
Affiliation(s)
- Antonio Lozano: Netherlands Institute for Neuroscience, Vrije Universiteit, Amsterdam, Netherlands
- Laura Pijnacker: Donders Institute for Brain, Cognition and Behaviour, Radboud University Nijmegen, Nijmegen, Netherlands
- Bodo Rueckauer: Donders Institute for Brain, Cognition and Behaviour, Radboud University Nijmegen, Nijmegen, Netherlands
- Pieter Roelfsema: Netherlands Institute for Neuroscience, Vrije Universiteit, Amsterdam, Netherlands
- Marcel van Gerven: Donders Institute for Brain, Cognition and Behaviour, Radboud University Nijmegen, Nijmegen, Netherlands
- Richard van Wezel: Donders Institute for Brain, Cognition and Behaviour, Radboud University Nijmegen, Nijmegen, Netherlands; Biomedical Signals and Systems Group, University of Twente, Enschede, Netherlands
- Umut Güçlü: Donders Institute for Brain, Cognition and Behaviour, Radboud University Nijmegen, Nijmegen, Netherlands
- Yağmur Güçlütürk: Donders Institute for Brain, Cognition and Behaviour, Radboud University Nijmegen, Nijmegen, Netherlands
4. Le L, Ambrogioni L, Seeliger K, Güçlütürk Y, van Gerven M, Güçlü U. Brain2Pix: Fully convolutional naturalistic video frame reconstruction from brain activity. Front Neurosci 2022; 16:940972. DOI: 10.3389/fnins.2022.940972.
Abstract
Reconstructing complex and dynamic visual perception from brain activity remains a major challenge in machine learning applications to neuroscience. Here, we present a new method for reconstructing naturalistic images and videos from very large single-participant functional magnetic resonance imaging data that leverages the recent success of image-to-image transformation networks. This is achieved by exploiting spatial information obtained from retinotopic mappings across the visual system. More specifically, we first determine what position each voxel in a particular region of interest would represent in the visual field based on its corresponding receptive field location. Then, the 2D image representation of the brain activity on the visual field is passed to a fully convolutional image-to-image network trained to recover the original stimuli using VGG feature loss with an adversarial regularizer. In our experiments, we show that our method offers a significant improvement over existing video reconstruction techniques.
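The receptive-field projection step described above can be sketched as scattering voxel responses into a visual-field grid, producing the 2D "brain activity image" that the convolutional network then maps back to the stimulus. This is a toy version that assumes RF centres are already known in normalized coordinates; all names are hypothetical and the network itself is omitted:

```python
import numpy as np

def voxels_to_rf_image(activity, rf_x, rf_y, size=32):
    """Scatter voxel activities into a visual-field grid.

    activity: (n_voxels,) responses; rf_x, rf_y: RF centres in [0, 1).
    Voxels landing on the same pixel are averaged. The result is the
    RF-image input a fully convolutional decoder would consume.
    """
    img = np.zeros((size, size))
    count = np.zeros((size, size))
    cols = (rf_x * size).astype(int)
    rows = (rf_y * size).astype(int)
    np.add.at(img, (rows, cols), activity)    # unbuffered scatter-add
    np.add.at(count, (rows, cols), 1)
    return np.divide(img, count, out=np.zeros_like(img), where=count > 0)

rng = np.random.default_rng(1)
n_vox = 500
rf_x, rf_y = rng.random(n_vox), rng.random(n_vox)
activity = rng.standard_normal(n_vox)
rf_img = voxels_to_rf_image(activity, rf_x, rf_y)
```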
5. Geerligs L, Gözükara D, Oetringer D, Campbell KL, van Gerven M, Güçlü U. A partially nested cortical hierarchy of neural states underlies event segmentation in the human brain. eLife 2022; 11:77430. PMID: 36111671; PMCID: PMC9531941; DOI: 10.7554/elife.77430.
Abstract
A fundamental aspect of human experience is that it is segmented into discrete events. This may be underpinned by transitions between distinct neural states. Using an innovative data-driven state segmentation method, we investigate how neural states are organized across the cortical hierarchy and where in the cortex neural state boundaries and perceived event boundaries overlap. Our results show that neural state boundaries are organized in a temporal cortical hierarchy, with short states in primary sensory regions, and long states in lateral and medial prefrontal cortex. State boundaries are shared within and between groups of brain regions that resemble well-known functional networks. Perceived event boundaries overlap with neural state boundaries across large parts of the cortical hierarchy, particularly when those state boundaries demarcate a strong transition or are shared between brain regions. Taken together, these findings suggest that a partially nested cortical hierarchy of neural states forms the basis of event segmentation.
Affiliation(s)
- Linda Geerligs: Donders Institute for Brain, Cognition and Behaviour, Radboud University Nijmegen
- Dora Gözükara: Donders Institute for Brain, Cognition and Behaviour, Radboud University Nijmegen
- Djamari Oetringer: Donders Institute for Brain, Cognition and Behaviour, Radboud University Nijmegen
- Marcel van Gerven: Donders Institute for Brain, Cognition and Behaviour, Radboud University Nijmegen
- Umut Güçlü: Donders Institute for Brain, Cognition and Behaviour, Radboud University Nijmegen
6. Armeni K, Güçlü U, van Gerven M, Schoffelen JM. A 10-hour within-participant magnetoencephalography narrative dataset to test models of language comprehension. Sci Data 2022; 9:278. PMID: 35676293; PMCID: PMC9177538; DOI: 10.1038/s41597-022-01382-7.
Abstract
Recently, cognitive neuroscientists have increasingly studied brain responses to narratives. At the same time, we are witnessing exciting developments in natural language processing, where large-scale neural network models can be used to instantiate cognitive hypotheses in narrative processing. Yet these models learn from text alone, and we lack ways of incorporating biological constraints during training. To mitigate this gap, we provide a narrative comprehension magnetoencephalography (MEG) data resource that can be used to train neural network models directly on brain data. We recorded from three participants, each completing 10 separate hour-long recording sessions, while they listened to audiobooks in English. After story listening, participants answered short questions about their experience. To minimize head movement, the participants wore MEG-compatible head casts, which immobilized their head position during recording. We report a basic evoked-response analysis showing that the responses accurately localize to primary auditory areas. The responses are robust and conserved across the 10 sessions for every participant. We also provide usage notes and briefly outline possible future uses of the resource.
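A basic evoked-response analysis of the kind mentioned above amounts to cutting the continuous recording into epochs around event onsets and averaging. A synthetic-data sketch (channel counts, epoch lengths, and variable names are assumptions, not taken from the dataset):

```python
import numpy as np

def evoked_response(data, onsets, pre=20, post=100):
    """Average epochs around event onsets.

    data: (n_channels, n_samples) continuous recording.
    onsets: sample indices of events; each epoch spans [onset-pre, onset+post).
    Returns the (n_channels, pre+post) evoked average.
    """
    epochs = [data[:, t - pre:t + post] for t in onsets
              if t - pre >= 0 and t + post <= data.shape[1]]
    return np.mean(epochs, axis=0)

# Synthetic check: a fixed response embedded at known onsets, plus noise
rng = np.random.default_rng(2)
n_ch, n_samp = 4, 5000
data = 0.5 * rng.standard_normal((n_ch, n_samp))
template = np.sin(np.linspace(0, np.pi, 100))   # 100-sample "response"
onsets = np.arange(200, 4800, 300)
for t in onsets:
    data[:, t:t + 100] += template
evoked = evoked_response(data, onsets)
```

Averaging suppresses the noise by roughly the square root of the number of events, which is why the embedded response stands out clearly in `evoked` while the pre-onset baseline stays near zero.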
Affiliation(s)
- Kristijan Armeni: Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
- Umut Güçlü: Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
- Marcel van Gerven: Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
- Jan-Mathijs Schoffelen: Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
7.
Abstract
Neural prosthetics may provide a promising solution to restore visual perception in some forms of blindness. The restored prosthetic percept is rudimentary compared to normal vision and can be optimized with a variety of image preprocessing techniques to maximize relevant information transfer. Extracting the most useful features from a visual scene is a nontrivial task, and optimal preprocessing choices strongly depend on the context. Despite rapid advancements in deep learning, research currently faces a difficult challenge in finding a general and automated preprocessing strategy that can be tailored to specific tasks or user requirements. In this paper, we present a novel deep learning approach that explicitly addresses this issue by optimizing the entire process of phosphene generation in an end-to-end fashion. The proposed model is based on a deep auto-encoder architecture and includes a highly adjustable simulation module of prosthetic vision. In computational validation experiments, we show that such an approach is able to automatically find a task-specific stimulation protocol. The results of these proof-of-principle experiments illustrate the potential of end-to-end optimization for prosthetic vision. The presented approach is highly modular and could be extended to automated dynamic optimization of prosthetic vision for everyday tasks, given any specific constraints, accommodating the individual requirements of the end-user.
Affiliation(s)
- Jaap de Ruyter van Steveninck: Department of Artificial Intelligence, Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands; Department of Biophysics, Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
- Umut Güçlü: Department of Artificial Intelligence, Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
- Richard van Wezel: Department of Biophysics, Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands; Biomedical Signal and Systems, MIRA Institute for Biomedical Technology and Technical Medicine, University of Twente, Enschede, The Netherlands
- Marcel van Gerven: Department of Artificial Intelligence, Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
8. de Ruyter van Steveninck J, van Gestel T, Koenders P, van der Ham G, Vereecken F, Güçlü U, van Gerven M, Güçlütürk Y, van Wezel R. Real-world indoor mobility with simulated prosthetic vision: The benefits and feasibility of contour-based scene simplification at different phosphene resolutions. J Vis 2022; 22:1. PMID: 35103758; PMCID: PMC8819280; DOI: 10.1167/jov.22.2.1.
Abstract
Neuroprosthetic implants are a promising technology for restoring some form of vision in people with visual impairments via electrical neurostimulation in the visual pathway. Although an artificially generated prosthetic percept is relatively limited compared with normal vision, it may provide some elementary perception of the surroundings, re-enabling daily living functionality. For mobility in particular, various studies have investigated the benefits of visual neuroprosthetics in a simulated prosthetic vision paradigm with varying outcomes. The previous literature suggests that scene simplification via image processing, and particularly contour extraction, may potentially improve the mobility performance in a virtual environment. In the current simulation study with sighted participants, we explore both the theoretically attainable benefits of strict scene simplification in an indoor environment by controlling the environmental complexity, as well as the practically achieved improvement with a deep learning-based surface boundary detection implementation compared with traditional edge detection. A simulated electrode resolution of 26 × 26 was found to provide sufficient information for mobility in a simple environment. Our results suggest that, for a lower number of implanted electrodes, the removal of background textures and within-surface gradients may be beneficial in theory. However, the deep learning-based implementation for surface boundary detection did not improve mobility performance in the current study. Furthermore, our findings indicate that, for a greater number of electrodes, the removal of within-surface gradients and background textures may deteriorate, rather than improve, mobility. Therefore, finding a balanced amount of scene simplification requires a careful tradeoff between informativity and interpretability that may depend on the number of implanted electrodes.
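Traditional contour-based scene simplification of the kind compared in this study can be approximated by a Sobel edge map downsampled to the simulated electrode grid (e.g. 26 × 26). A minimal sketch, not the study's deep-learning surface-boundary detector; all names and the block-max pooling choice are assumptions:

```python
import numpy as np

def sobel_magnitude(img):
    """Gradient magnitude via 3x3 Sobel filters (zero-padded borders)."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    ky = kx.T
    p = np.pad(img, 1)
    gx = np.zeros_like(img)
    gy = np.zeros_like(img)
    for i in range(3):
        for j in range(3):
            patch = p[i:i + img.shape[0], j:j + img.shape[1]]
            gx += kx[i, j] * patch
            gy += ky[i, j] * patch
    return np.hypot(gx, gy)

def to_phosphene_grid(edges, n=26):
    """Downsample an edge map to an n x n stimulation pattern by block max."""
    h, w = edges.shape
    rows = np.array_split(np.arange(h), n)
    cols = np.array_split(np.arange(w), n)
    return np.array([[edges[np.ix_(r, c)].max() for c in cols] for r in rows])

# A uniform square on a flat background: only its contour should survive
img = np.zeros((104, 104))
img[30:70, 30:70] = 1.0
stim = to_phosphene_grid(sobel_magnitude(img))
```

The within-surface gradients and background textures discussed in the abstract are exactly what this pipeline discards: flat regions have zero gradient and therefore produce no phosphene activation.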
Affiliation(s)
- Jaap de Ruyter van Steveninck: Department of Artificial Intelligence, Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands; Department of Biophysics, Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
- Tom van Gestel: Department of Biophysics, Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
- Paula Koenders: Department of Biophysics, Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
- Guus van der Ham: Department of Biophysics, Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
- Floris Vereecken: Department of Biophysics, Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
- Umut Güçlü: Department of Artificial Intelligence, Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
- Marcel van Gerven: Department of Artificial Intelligence, Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
- Yağmur Güçlütürk: Department of Artificial Intelligence, Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
- Richard van Wezel: Department of Biophysics, Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands; Biomedical Signal and Systems, MIRA Institute for Biomedical Technology and Technical Medicine, University of Twente, Enschede, The Netherlands
9. Dado T, Güçlütürk Y, Ambrogioni L, Ras G, Bosch S, van Gerven M, Güçlü U. Hyperrealistic neural decoding for reconstructing faces from fMRI activations via the GAN latent space. Sci Rep 2022; 12:141. PMID: 34997012; PMCID: PMC8741893; DOI: 10.1038/s41598-021-03938-w.
Abstract
Neural decoding can be conceptualized as the problem of mapping brain responses back to sensory stimuli via a feature space. We introduce (i) a novel experimental paradigm that uses well-controlled yet highly naturalistic stimuli with a priori known feature representations and (ii) an implementation thereof for HYPerrealistic reconstruction of PERception (HYPER) of faces from brain recordings. To this end, we embrace the use of generative adversarial networks (GANs) at the earliest step of our neural decoding pipeline by acquiring fMRI data as participants perceive face images synthesized by the generator network of a GAN. We show that the latent vectors used for generation effectively capture the same defining stimulus properties as the fMRI measurements. As such, these latents (conditioned on the GAN) are used as the in-between feature representations underlying the perceived images that can be predicted in neural decoding for (re-)generation of the originally perceived stimuli, leading to the most accurate reconstructions of perception to date.
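The decoding step, predicting GAN latents from brain responses so that the pretrained generator can re-synthesize the perceived face, can be sketched as a multivariate linear regression. Everything below is simulated stand-in data (the generator itself is omitted, and all shapes and names are assumptions):

```python
import numpy as np

# Synthetic stand-in: voxel responses are a noisy linear function of the
# latent z that generated each (hypothetical) face stimulus.
rng = np.random.default_rng(3)
n_trials, n_vox, n_latent = 300, 120, 24
Z = rng.standard_normal((n_trials, n_latent))             # true GAN latents
A = rng.standard_normal((n_latent, n_vox))
V = Z @ A + 0.2 * rng.standard_normal((n_trials, n_vox))  # "fMRI" responses

# Train a linear decoder voxels -> latents on the first 250 trials
W, *_ = np.linalg.lstsq(V[:250], Z[:250], rcond=None)
Z_hat = V[250:] @ W

# Latent-space decoding accuracy: per-dimension correlation, held-out trials
r = [np.corrcoef(Z_hat[:, i], Z[250:, i])[0, 1] for i in range(n_latent)]
```

In the paper's paradigm the true latents are known by construction, because the stimuli were generated from them; that is what makes this direct latent-space evaluation possible before any image is rendered.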
Affiliation(s)
- Thirza Dado: Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
- Yağmur Güçlütürk: Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
- Luca Ambrogioni: Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
- Gabriëlle Ras: Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
- Sander Bosch: Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
- Marcel van Gerven: Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
- Umut Güçlü: Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
10. Geerligs L, van Gerven M, Güçlü U. Detecting neural state transitions underlying event segmentation. Neuroimage 2021; 236:118085. PMID: 33882350; DOI: 10.1016/j.neuroimage.2021.118085.
Abstract
Segmenting perceptual experience into meaningful events is a key cognitive process that helps us make sense of what is happening around us in the moment, as well as helping us recall past events. Nevertheless, little is known about the underlying neural mechanisms of the event segmentation process. Recent work has suggested that event segmentation can be linked to regional changes in neural activity patterns. Accurate methods for identifying such activity changes are important to allow further investigation of the neural basis of event segmentation and its link to the temporal processing hierarchy of the brain. In this study, we introduce a new set of elegant and simple methods to study these mechanisms. We introduce a method for identifying the boundaries between neural states in a brain area and a complementary one for identifying the number of neural states. Furthermore, we present the results of a comprehensive set of simulations and analyses of empirical fMRI data to provide guidelines for reliable estimation of neural states and show that our proposed methods outperform the current state-of-the-art in the literature. This methodological innovation will allow researchers to make headway in investigating the neural basis of event segmentation and information processing during naturalistic stimulation.
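The core boundary-detection criterion can be illustrated with a one-boundary toy version: choose the split that maximizes the mean within-segment correlation between timepoint activity patterns. The published method places many boundaries greedily by a criterion of this kind; this sketch is an illustration of the idea, not their implementation:

```python
import numpy as np

def best_boundary(patterns):
    """Find the split maximizing mean within-segment pattern correlation.

    patterns: (n_timepoints, n_voxels). Timepoints within one neural state
    share a stable activity pattern, so the true transition is the split
    with the most coherent segments.
    """
    T = patterns.shape[0]
    C = np.corrcoef(patterns)          # timepoint-by-timepoint correlations
    best_b, best_score = None, -np.inf
    for b in range(1, T):
        within = []
        for seg in (C[:b, :b], C[b:, b:]):
            iu = np.triu_indices(seg.shape[0], k=1)
            if iu[0].size:             # segments of one timepoint have no pairs
                within.append(seg[iu].mean())
        score = np.mean(within)
        if score > best_score:
            best_b, best_score = b, score
    return best_b

# Two synthetic neural states with a transition at t = 10
rng = np.random.default_rng(4)
state_a, state_b = rng.standard_normal(30), rng.standard_normal(30)
patterns = np.vstack(
    [state_a + 0.1 * rng.standard_normal(30) for _ in range(10)]
    + [state_b + 0.1 * rng.standard_normal(30) for _ in range(10)])
b = best_boundary(patterns)
```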
Affiliation(s)
- Linda Geerligs: Donders Institute for Brain, Cognition and Behaviour, Radboud University, Thomas van Aquinostraat 4, 6526 GD, Nijmegen, The Netherlands
- Marcel van Gerven: Donders Institute for Brain, Cognition and Behaviour, Radboud University, Thomas van Aquinostraat 4, 6526 GD, Nijmegen, The Netherlands
- Umut Güçlü: Donders Institute for Brain, Cognition and Behaviour, Radboud University, Thomas van Aquinostraat 4, 6526 GD, Nijmegen, The Netherlands
11. Seeliger K, Ambrogioni L, Güçlütürk Y, van den Bulk LM, Güçlü U, van Gerven MAJ. End-to-end neural system identification with neural information flow. PLoS Comput Biol 2021; 17:e1008558. PMID: 33539366; PMCID: PMC7888598; DOI: 10.1371/journal.pcbi.1008558.
Abstract
Neural information flow (NIF) provides a novel approach for system identification in neuroscience. It models the neural computations in multiple brain regions and can be trained end-to-end via stochastic gradient descent from noninvasive data. NIF models represent neural information processing via a network of coupled tensors, each encoding the representation of the sensory input contained in a brain region. The elements of these tensors can be interpreted as cortical columns whose activity encodes the presence of a specific feature in a spatiotemporal location. Each tensor is coupled to the measured data specific to a brain region via low-rank observation models that can be decomposed into the spatial, temporal and feature receptive fields of a localized neuronal population. Both these observation models and the convolutional weights defining the information processing within regions are learned end-to-end by predicting the neural signal during sensory stimulation. We trained a NIF model on the activity of early visual areas using a large-scale fMRI dataset recorded in a single participant. We show that we can recover plausible visual representations and population receptive fields that are consistent with empirical findings.
Affiliation(s)
- K. Seeliger: Radboud University, Donders Institute for Brain, Cognition and Behaviour, Nijmegen, The Netherlands; Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- L. Ambrogioni: Radboud University, Donders Institute for Brain, Cognition and Behaviour, Nijmegen, The Netherlands
- Y. Güçlütürk: Radboud University, Donders Institute for Brain, Cognition and Behaviour, Nijmegen, The Netherlands
- L. M. van den Bulk: Radboud University, Donders Institute for Brain, Cognition and Behaviour, Nijmegen, The Netherlands
- U. Güçlü: Radboud University, Donders Institute for Brain, Cognition and Behaviour, Nijmegen, The Netherlands
- M. A. J. van Gerven: Radboud University, Donders Institute for Brain, Cognition and Behaviour, Nijmegen, The Netherlands
12. Berezutskaya J, Freudenburg ZV, Ambrogioni L, Güçlü U, van Gerven MAJ, Ramsey NF. Cortical network responses map onto data-driven features that capture visual semantics of movie fragments. Sci Rep 2020; 10:12077. PMID: 32694561; PMCID: PMC7374611; DOI: 10.1038/s41598-020-68853-y.
Abstract
Research on how the human brain extracts meaning from sensory input relies in principle on methodological reductionism. In the present study, we adopt a more holistic approach by modeling the cortical responses to semantic information that was extracted from the visual stream of a feature film, employing artificial neural network models. Advances in both computer vision and natural language processing were utilized to extract the semantic representations from the film by combining perceptual and linguistic information. We tested whether these representations were useful in studying the human brain data. To this end, we collected electrocorticography responses to a short movie from 37 subjects and fitted their cortical patterns across multiple regions using the semantic components extracted from film frames. We found that individual semantic components reflected fundamental semantic distinctions in the visual input, such as presence or absence of people, human movement, landscape scenes, human faces, etc. Moreover, each semantic component mapped onto a distinct functional cortical network involving high-level cognitive regions in occipitotemporal, frontal and parietal cortices. The present work demonstrates the potential of the data-driven methods from information processing fields to explain patterns of cortical responses, and contributes to the overall discussion about the encoding of high-level perceptual information in the human brain.
Affiliation(s)
- Julia Berezutskaya
- Brain Center, Department of Neurology and Neurosurgery, University Medical Center Utrecht, Heidelberglaan 100, 3584 CX, Utrecht, The Netherlands
- Donders Institute for Brain, Cognition and Behaviour, Radboud University, Montessorilaan 3, 6525 HR, Nijmegen, The Netherlands
- Zachary V Freudenburg
- Brain Center, Department of Neurology and Neurosurgery, University Medical Center Utrecht, Heidelberglaan 100, 3584 CX, Utrecht, The Netherlands
- Luca Ambrogioni
- Donders Institute for Brain, Cognition and Behaviour, Radboud University, Montessorilaan 3, 6525 HR, Nijmegen, The Netherlands
- Umut Güçlü
- Donders Institute for Brain, Cognition and Behaviour, Radboud University, Montessorilaan 3, 6525 HR, Nijmegen, The Netherlands
- Marcel A J van Gerven
- Donders Institute for Brain, Cognition and Behaviour, Radboud University, Montessorilaan 3, 6525 HR, Nijmegen, The Netherlands
- Nick F Ramsey
- Brain Center, Department of Neurology and Neurosurgery, University Medical Center Utrecht, Heidelberglaan 100, 3584 CX, Utrecht, The Netherlands
13
Berezutskaya J, Freudenburg ZV, Güçlü U, van Gerven MAJ, Ramsey NF. Brain-optimized extraction of complex sound features that drive continuous auditory perception. PLoS Comput Biol 2020; 16:e1007992. [PMID: 32614826 PMCID: PMC7363106 DOI: 10.1371/journal.pcbi.1007992]
Abstract
Understanding how the human brain processes auditory input remains a challenge. Traditionally, a distinction between lower- and higher-level sound features is made, but their definition depends on a specific theoretical framework and might not match the neural representation of sound. Here, we postulate that constructing a data-driven neural model of auditory perception, with a minimum of theoretical assumptions about the relevant sound features, could provide an alternative approach and possibly a better match to the neural responses. We collected electrocorticography recordings from six patients who watched a long-duration feature film. The raw movie soundtrack was used to train an artificial neural network model to predict the associated neural responses. The model achieved high prediction accuracy and generalized well to a second dataset, where new participants watched a different film. The extracted bottom-up features captured acoustic properties that were specific to the type of sound and were associated with various response latency profiles and distinct cortical distributions. Specifically, several features encoded speech-related acoustic properties with some features exhibiting shorter latency profiles (associated with responses in posterior perisylvian cortex) and others exhibiting longer latency profiles (associated with responses in anterior perisylvian cortex). Our results support and extend the current view on speech perception by demonstrating the presence of temporal hierarchies in the perisylvian cortex and involvement of cortical sites outside of this region during audiovisual speech perception.
Affiliation(s)
- Julia Berezutskaya
- Department of Neurology and Neurosurgery, Brain Center, University Medical Center Utrecht, Utrecht, The Netherlands
- Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
- Zachary V. Freudenburg
- Department of Neurology and Neurosurgery, Brain Center, University Medical Center Utrecht, Utrecht, The Netherlands
- Umut Güçlü
- Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
- Marcel A. J. van Gerven
- Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
- Nick F. Ramsey
- Department of Neurology and Neurosurgery, Brain Center, University Medical Center Utrecht, Utrecht, The Netherlands
14
Seeliger K, Güçlü U, Ambrogioni L, Güçlütürk Y, van Gerven M. Generative adversarial networks for reconstructing natural images from brain activity. Neuroimage 2018; 181:775-785. [DOI: 10.1016/j.neuroimage.2018.07.043]
15
Seeliger K, Fritsche M, Güçlü U, Schoenmakers S, Schoffelen JM, Bosch SE, van Gerven MAJ. Convolutional neural network-based encoding and decoding of visual object recognition in space and time. Neuroimage 2017; 180:253-266. [PMID: 28723578 DOI: 10.1016/j.neuroimage.2017.07.018]
Abstract
Representations learned by deep convolutional neural networks (CNNs) for object recognition are a widely investigated model of the processing hierarchy in the human visual system. Using functional magnetic resonance imaging, CNN representations of visual stimuli have previously been shown to correspond to processing stages in the ventral and dorsal streams of the visual system. Whether this correspondence between models and brain signals also holds for activity acquired at high temporal resolution has been explored less exhaustively. Here, we addressed this question by combining CNN-based encoding models with magnetoencephalography (MEG). Human participants passively viewed 1,000 images of objects while MEG signals were acquired. We modelled their high temporal resolution source-reconstructed cortical activity with CNNs, and observed a feed-forward sweep across the visual hierarchy between 75 and 200 ms after stimulus onset. This spatiotemporal cascade was captured by the network layer representations, where the increasingly abstract stimulus representation in the hierarchical network model was reflected in different parts of the visual cortex, following the visual ventral stream. We further validated the accuracy of our encoding model by decoding stimulus identity in a left-out validation set of viewed objects, achieving state-of-the-art decoding accuracy.
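The encode-then-identify pipeline described in this abstract can be sketched with plain ridge regression. In this minimal sketch, random synthetic features stand in for real CNN layer activations, and simulated channel responses stand in for the source-reconstructed MEG signals; the feature counts, noise level, and train/test split sizes are illustrative assumptions, not the paper's values:

```python
import numpy as np

rng = np.random.default_rng(0)
n_stim, n_feat, n_chan = 200, 50, 30

# Stand-in for CNN layer activations of the viewed images.
X = rng.standard_normal((n_stim, n_feat))

# Simulated cortical responses: a linear map of the features plus noise.
W_true = rng.standard_normal((n_feat, n_chan))
Y = X @ W_true + 0.5 * rng.standard_normal((n_stim, n_chan))

def ridge_fit(X, Y, alpha=1.0):
    """Closed-form ridge regression: W = (X'X + aI)^{-1} X'Y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(d), X.T @ Y)

# Fit the encoding model on a training split, predict held-out responses.
X_tr, Y_tr = X[:150], Y[:150]
X_te, Y_te = X[150:], Y[150:]
W = ridge_fit(X_tr, Y_tr)
Y_hat = X_te @ W

# Decode stimulus identity on the held-out set: each observed response is
# assigned to the stimulus whose *predicted* response correlates best with it.
C = np.corrcoef(Y_te, Y_hat)[:50, 50:]   # observed-by-predicted correlations
decoded = np.argmax(C, axis=1)
accuracy = np.mean(decoded == np.arange(50))
```

With real data, one such model would be fit per network layer and per time window, turning the identification accuracy into a spatiotemporal map of layer-to-brain correspondence.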
Affiliation(s)
- K Seeliger
- Radboud University, Donders Institute for Brain, Cognition and Behaviour, Montessorilaan 3, 6525 HR Nijmegen, The Netherlands
- M Fritsche
- Radboud University, Donders Institute for Brain, Cognition and Behaviour, Montessorilaan 3, 6525 HR Nijmegen, The Netherlands
- U Güçlü
- Radboud University, Donders Institute for Brain, Cognition and Behaviour, Montessorilaan 3, 6525 HR Nijmegen, The Netherlands
- S Schoenmakers
- Radboud University, Donders Institute for Brain, Cognition and Behaviour, Montessorilaan 3, 6525 HR Nijmegen, The Netherlands
- J-M Schoffelen
- Radboud University, Donders Institute for Brain, Cognition and Behaviour, Montessorilaan 3, 6525 HR Nijmegen, The Netherlands
- S E Bosch
- Radboud University, Donders Institute for Brain, Cognition and Behaviour, Montessorilaan 3, 6525 HR Nijmegen, The Netherlands
- M A J van Gerven
- Radboud University, Donders Institute for Brain, Cognition and Behaviour, Montessorilaan 3, 6525 HR Nijmegen, The Netherlands
16
Abstract
Encoding models are used for predicting brain activity in response to sensory stimuli with the objective of elucidating how sensory information is represented in the brain. Encoding models typically comprise a nonlinear transformation of stimuli to features (feature model) and a linear convolution of features to responses (response model). While there has been extensive work on developing better feature models, the work on developing better response models has been rather limited. Here, we investigate the extent to which recurrent neural network models can use their internal memories for nonlinear processing of arbitrary feature sequences to predict feature-evoked response sequences as measured by functional magnetic resonance imaging. We show that the proposed recurrent neural network models can significantly outperform established response models by accurately estimating long-term dependencies that drive hemodynamic responses. The results open a new window into modeling the dynamics of brain activity in response to sensory stimuli.
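The contrast the abstract draws, an established linear-convolution response model versus a recurrent model with internal memory, can be sketched as follows. The HRF is the standard double-gamma shape; the feature time course and the recurrent weights are fixed, made-up values for illustration (the paper learns the recurrent model from data):

```python
import numpy as np
from math import gamma as gamma_fn

def gamma_pdf(t, a, b=1.0):
    """Gamma density; building block for the canonical double-gamma HRF."""
    t = np.asarray(t, dtype=float)
    out = np.zeros_like(t)
    pos = t > 0
    out[pos] = b**a * t[pos]**(a - 1) * np.exp(-b * t[pos]) / gamma_fn(a)
    return out

def hrf(t):
    """Canonical double-gamma HRF (peak around 5 s, late undershoot)."""
    return gamma_pdf(t, 6.0) - gamma_pdf(t, 16.0) / 6.0

# A toy feature time course sampled at TR = 1 s: sparse feature "events".
T = 120
t = np.arange(T, dtype=float)
features = (np.sin(0.2 * t) > 0.9).astype(float)

# Established response model: linear convolution of features with the HRF.
bold_linear = np.convolve(features, hrf(np.arange(32.0)))[:T]

# Recurrent response model: a hidden state integrates the feature sequence,
# so the prediction can depend nonlinearly on the distant past.
rng = np.random.default_rng(0)
n_hidden = 8
W_x = rng.standard_normal(n_hidden) * 0.5          # input weights
W_h = rng.standard_normal((n_hidden, n_hidden)) * 0.1  # recurrent weights
w_out = rng.standard_normal(n_hidden) * 0.5        # readout weights
h = np.zeros(n_hidden)
bold_rnn = np.empty(T)
for i in range(T):
    h = np.tanh(W_h @ h + W_x * features[i])
    bold_rnn[i] = w_out @ h
```

The linear model's memory is fixed by the HRF kernel length, whereas the recurrent state can, once trained, capture the longer-range dependencies the abstract refers to.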
Affiliation(s)
- Umut Güçlü
- Donders Institute for Brain, Cognition and Behaviour, Radboud University Nijmegen, Netherlands
- Marcel A J van Gerven
- Donders Institute for Brain, Cognition and Behaviour, Radboud University Nijmegen, Netherlands
17
Schoenmakers S, Güçlü U, van Gerven M, Heskes T. Gaussian mixture models and semantic gating improve reconstructions from human brain activity. Front Comput Neurosci 2015; 8:173. [PMID: 25688202 PMCID: PMC4311641 DOI: 10.3389/fncom.2014.00173]
Abstract
Better acquisition protocols and analysis techniques are making it possible to use fMRI to obtain highly detailed visualizations of brain processes. In particular we focus on the reconstruction of natural images from BOLD responses in visual cortex. We expand our linear Gaussian framework for percept decoding with Gaussian mixture models to better represent the prior distribution of natural images. Reconstruction of such images then boils down to probabilistic inference in a hybrid Bayesian network. In our set-up, different mixture components correspond to different character categories. Our framework can automatically infer higher-order semantic categories from lower-level brain areas. Furthermore, the framework can gate semantic information from higher-order brain areas to enforce the correct category during reconstruction. When categorical information is not available, we show that automatically learned clusters in the data give a similar improvement in reconstruction. The hybrid Bayesian network leads to highly accurate reconstructions in both supervised and unsupervised settings.
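The linear Gaussian framework with a mixture prior can be sketched in a few lines: each mixture component yields a closed-form posterior mean for the stimulus, and the component (the "category") is gated by its marginal likelihood. The dimensions, forward matrix, noise level, and component means below are toy assumptions, not values fit to data:

```python
import numpy as np

rng = np.random.default_rng(1)
d_img, d_bold, sigma = 16, 8, 0.5

# Two prior components, e.g. two character categories with different means.
mus = [2.0 * np.ones(d_img), -2.0 * np.ones(d_img)]
Sigma = np.eye(d_img)                                  # shared prior covariance

B = rng.standard_normal((d_bold, d_img)) / np.sqrt(d_img)  # toy forward model
x_true = mus[0] + rng.standard_normal(d_img)               # drawn from component 0
y = B @ x_true + sigma * rng.standard_normal(d_bold)       # observed responses

# Quantities shared by all components.
S_y = B @ Sigma @ B.T + sigma**2 * np.eye(d_bold)      # marginal covariance of y
S_y_inv = np.linalg.inv(S_y)
K = Sigma @ B.T @ S_y_inv                              # posterior gain matrix

def component_stats(mu):
    """Marginal log-likelihood of y and posterior mean of x for one component."""
    r = y - B @ mu
    loglik = -0.5 * (r @ S_y_inv @ r + np.linalg.slogdet(2 * np.pi * S_y)[1])
    x_map = mu + K @ r
    return loglik, x_map

stats = [component_stats(mu) for mu in mus]
best = int(np.argmax([s[0] for s in stats]))           # gate on the likelihood
x_hat = stats[best][1]                                 # reconstruction
```

The gating step is the sketch's analogue of inferring the semantic category; supplying the category from higher-order brain areas instead would simply fix `best` externally.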
Affiliation(s)
- Sanne Schoenmakers
- Donders Institute for Brain, Cognition and Behaviour, Radboud University Nijmegen, Nijmegen, Netherlands
- Umut Güçlü
- Donders Institute for Brain, Cognition and Behaviour, Radboud University Nijmegen, Nijmegen, Netherlands
- Marcel van Gerven
- Donders Institute for Brain, Cognition and Behaviour, Radboud University Nijmegen, Nijmegen, Netherlands
- Tom Heskes
- Institute for Computing and Information Sciences, Radboud University Nijmegen, Nijmegen, Netherlands
18
Güçlü U, van Gerven MAJ. Unsupervised feature learning improves prediction of human brain activity in response to natural images. PLoS Comput Biol 2014; 10:e1003724. [PMID: 25101625 PMCID: PMC4125038 DOI: 10.1371/journal.pcbi.1003724]
Abstract
Encoding and decoding in functional magnetic resonance imaging has recently emerged as an area of research to noninvasively characterize the relationship between stimulus features and human brain activity. To overcome the challenge of formalizing what stimulus features should modulate single voxel responses, we introduce a general approach for making directly testable predictions of single voxel responses to statistically adapted representations of ecologically valid stimuli. These representations are learned from unlabeled data without supervision. Our approach is validated using a parsimonious computational model of (i) how early visual cortical representations are adapted to statistical regularities in natural images and (ii) how populations of these representations are pooled by single voxels. This computational model is used to predict single voxel responses to natural images and identify natural images from stimulus-evoked multiple voxel responses. We show that statistically adapted low-level sparse and invariant representations of natural images better span the space of early visual cortical representations and can be more effectively exploited in stimulus identification than hand-designed Gabor wavelets. Our results demonstrate the potential of our approach to better probe unknown cortical representations.
An important but difficult problem in contemporary cognitive neuroscience is to find what stimulus features best drive responses in the human brain. The conventional approach to solve this problem is to use descriptive encoding models that predict responses to stimulus features that are known a priori. In this study, we introduce an alternative to this approach that is independent of a priori knowledge. Instead, we use a normative encoding model that predicts responses to stimulus features that are learned from unlabeled data. We show that this normative encoding model learns sparse, topographic and invariant stimulus features from tens of thousands of grayscale natural image patches without supervision, and reproduces the population behavior of simple and complex cells. We find that these stimulus features significantly better drive blood-oxygen-level dependent hemodynamic responses in early visual areas than Gabor wavelets, the fundamental building blocks of the conventional approach. Our approach will improve our understanding of how sensory information is represented beyond early visual areas, since it can theoretically find what stimulus features best drive responses in other sensory areas.
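The hand-designed Gabor-wavelet baseline that the learned features are compared against can be sketched as a quadrature-pair "energy" model of a complex cell: two phase-shifted Gabor filters whose squared outputs are summed, giving orientation tuning with phase invariance. The patch size, spatial frequency, and orientation below are illustrative choices, not the paper's parameters:

```python
import numpy as np

# Build a quadrature pair of Gabor filters on a 16x16 grid.
size, sigma, freq, theta = 16, 3.0, 0.15, np.pi / 4
ax = np.arange(size) - (size - 1) / 2
yy, xx = np.meshgrid(ax, ax, indexing="ij")
u = xx * np.cos(theta) + yy * np.sin(theta)        # coordinate along the carrier
envelope = np.exp(-(xx**2 + yy**2) / (2 * sigma**2))
gabor_even = envelope * np.cos(2 * np.pi * freq * u)
gabor_odd = envelope * np.sin(2 * np.pi * freq * u)

def energy(patch):
    """Complex-cell model: sum of squared quadrature filter outputs."""
    return np.sum(gabor_even * patch) ** 2 + np.sum(gabor_odd * patch) ** 2

def grating(theta_g, phase=0.0):
    """Full-field sinusoidal grating at the filter's spatial frequency."""
    v = xx * np.cos(theta_g) + yy * np.sin(theta_g)
    return np.cos(2 * np.pi * freq * v + phase)

# The energy response is tuned to orientation but insensitive to phase.
e_pref = energy(grating(theta))                    # preferred orientation
e_orth = energy(grating(theta + np.pi / 2))        # orthogonal orientation
e_shift = energy(grating(theta, phase=np.pi / 2))  # preferred, phase-shifted
```

In the conventional pipeline, a bank of such filters over positions, orientations, and frequencies produces the feature vector that the voxel-wise encoding model regresses onto; the paper's point is that unsupervised learned features outperform this fixed bank.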
Affiliation(s)
- Umut Güçlü
- Radboud University Nijmegen, Donders Institute for Brain, Cognition and Behaviour, Nijmegen, Netherlands
- Marcel A. J. van Gerven
- Radboud University Nijmegen, Donders Institute for Brain, Cognition and Behaviour, Nijmegen, Netherlands
19
Güçlü U, Güçlütürk Y, Loo CK. Evaluation of fractal dimension estimation methods for feature extraction in motor imagery based brain computer interface. Procedia Comput Sci 2011. [DOI: 10.1016/j.procs.2010.12.098]