1
Dado T, Papale P, Lozano A, Le L, Wang F, van Gerven M, Roelfsema P, Güçlütürk Y, Güçlü U. Brain2GAN: Feature-disentangled neural encoding and decoding of visual perception in the primate brain. PLoS Comput Biol 2024; 20:e1012058. [PMID: 38709818; PMCID: PMC11098503; DOI: 10.1371/journal.pcbi.1012058]
Abstract
A challenging goal of neural coding is to characterize the neural representations underlying visual perception. To this end, multi-unit activity (MUA) of macaque visual cortex was recorded in a passive fixation task upon presentation of faces and natural images. We analyzed the relationship between MUA and latent representations of state-of-the-art deep generative models, including the conventional and feature-disentangled representations of generative adversarial networks (GANs) (i.e., the z- and w-latents of StyleGAN, respectively) and language-contrastive representations of latent diffusion networks (i.e., the CLIP-latents of Stable Diffusion). A mass univariate neural encoding analysis of the latent representations showed that feature-disentangled w representations outperform both z and CLIP representations in explaining neural responses. Further, w-latent features were found to be positioned at the higher end of the complexity gradient, which indicates that they capture visual information relevant to high-level neural activity. Subsequently, a multivariate neural decoding analysis of the feature-disentangled representations resulted in state-of-the-art spatiotemporal reconstructions of visual perception. Taken together, our results not only highlight the important role of feature disentanglement in shaping high-level neural representations underlying visual perception but also serve as an important benchmark for the future of neural coding.
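A rough sketch of the mass univariate encoding analysis described above, in Python (synthetic data; `w_latents` and `mua` are stand-ins for StyleGAN w-latents and recorded multi-unit activity, and the ridge penalty is an assumed value, not the authors' setting):

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_stim, n_latent, n_sites = 800, 512, 128

# Stand-ins: w-latents of the presented images and MUA per recording site.
w_latents = rng.standard_normal((n_stim, n_latent))
true_map = rng.standard_normal((n_latent, n_sites)) / np.sqrt(n_latent)
mua = w_latents @ true_map + 0.5 * rng.standard_normal((n_stim, n_sites))

X_tr, X_te, y_tr, y_te = train_test_split(w_latents, mua, test_size=0.2, random_state=0)

# Mass univariate encoding: one linear model per site (all sites share the same
# design matrix, so sklearn fits them jointly in a single Ridge call).
enc = Ridge(alpha=1.0).fit(X_tr, y_tr)
pred = enc.predict(X_te)

# Encoding performance: correlation of predicted vs. held-out responses per site.
r = [np.corrcoef(pred[:, i], y_te[:, i])[0, 1] for i in range(n_sites)]
print(f"median prediction r across sites: {np.median(r):.2f}")
```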
Affiliation(s)
- Thirza Dado: Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, Netherlands
- Paolo Papale: Department of Vision and Cognition, Netherlands Institute for Neuroscience, Amsterdam, Netherlands
- Antonio Lozano: Department of Vision and Cognition, Netherlands Institute for Neuroscience, Amsterdam, Netherlands
- Lynn Le: Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, Netherlands
- Feng Wang: Department of Vision and Cognition, Netherlands Institute for Neuroscience, Amsterdam, Netherlands
- Marcel van Gerven: Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, Netherlands
- Pieter Roelfsema: Department of Vision and Cognition, Netherlands Institute for Neuroscience, Amsterdam, Netherlands; Laboratory of Visual Brain Therapy, Sorbonne University, Paris, France; Department of Integrative Neurophysiology, VU Amsterdam, Amsterdam, Netherlands; Department of Psychiatry, Amsterdam UMC, Amsterdam, Netherlands
- Yağmur Güçlütürk: Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, Netherlands
- Umut Güçlü: Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, Netherlands
2
Moore JA, Wilms M, Gutierrez A, Ismail Z, Fakhar K, Hadaeghi F, Hilgetag CC, Forkert ND. Simulation of neuroplasticity in a CNN-based in-silico model of neurodegeneration of the visual system. Front Comput Neurosci 2023; 17:1274824. [PMID: 38105786; PMCID: PMC10722164; DOI: 10.3389/fncom.2023.1274824]
Abstract
The aim of this work was to enhance the biological feasibility of a deep convolutional neural network-based in-silico model of neurodegeneration of the visual system by equipping it with a mechanism to simulate neuroplasticity. To this end, deep convolutional networks of multiple sizes were trained for object recognition tasks and progressively lesioned to simulate neurodegeneration of the visual cortex. More specifically, the injured parts of the network remained injured while we investigated how added retraining steps were able to recover some of the model's baseline object recognition performance. The results showed that, with retraining, the model's object recognition abilities decline more smoothly and gradually with increasing injury levels than without retraining, and are therefore more similar to the longitudinal cognitive impairments of patients diagnosed with Alzheimer's disease (AD). Moreover, with retraining, the injured model exhibits internal activation patterns more similar to those of the healthy baseline model than the injured model without retraining does. Furthermore, we conducted this analysis on a network that had been extensively pruned, resulting in an optimized number of parameters or synapses. Our findings show that this pruned network retained a remarkably similar capability to recover task performance even as the number of viable pathways through the network decreased. In conclusion, adding a retraining step that simulates neuroplasticity to the in-silico setup improves the model's biological feasibility considerably and could prove valuable for testing different rehabilitation approaches in-silico.
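A minimal sketch of the lesion-then-retrain idea in PyTorch (a toy network and an assumed lesion fraction, not the study's CNNs): weights are zeroed by a fixed mask that is reapplied after every update, so injured connections stay injured while the surviving ones adapt.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
net = nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 10))

# Lesion: permanently zero a fraction of the weights in one layer.
lesion_frac = 0.3                      # assumed toy value
layer = net[0]
mask = (torch.rand_like(layer.weight) > lesion_frac).float()
with torch.no_grad():
    layer.weight.mul_(mask)

opt = torch.optim.SGD(net.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()
x, y = torch.randn(16, 64), torch.randint(0, 10, (16,))

# Retraining ("neuroplasticity"): surviving weights adapt, while the mask is
# reapplied after every step so that lesioned connections stay lesioned.
for _ in range(10):
    opt.zero_grad()
    loss_fn(net(x), y).backward()
    opt.step()
    with torch.no_grad():
        layer.weight.mul_(mask)
```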
Affiliation(s)
- Jasmine A. Moore: Department of Radiology, University of Calgary, Calgary, AB, Canada; Hotchkiss Brain Institute, University of Calgary, Calgary, AB, Canada; Biomedical Engineering Program, University of Calgary, Calgary, AB, Canada
- Matthias Wilms: Department of Radiology, University of Calgary, Calgary, AB, Canada; Hotchkiss Brain Institute, University of Calgary, Calgary, AB, Canada; Alberta Children’s Hospital Research Institute, University of Calgary, Calgary, AB, Canada
- Alejandro Gutierrez: Department of Radiology, University of Calgary, Calgary, AB, Canada; Hotchkiss Brain Institute, University of Calgary, Calgary, AB, Canada; Biomedical Engineering Program, University of Calgary, Calgary, AB, Canada
- Zahinoor Ismail: Hotchkiss Brain Institute, University of Calgary, Calgary, AB, Canada; Department of Clinical Neurosciences, University of Calgary, Calgary, AB, Canada
- Kayson Fakhar: Institute of Computational Neuroscience, University Medical Center Hamburg-Eppendorf (UKE), Hamburg, Germany
- Fatemeh Hadaeghi: Institute of Computational Neuroscience, University Medical Center Hamburg-Eppendorf (UKE), Hamburg, Germany
- Claus C. Hilgetag: Institute of Computational Neuroscience, University Medical Center Hamburg-Eppendorf (UKE), Hamburg, Germany; Department of Health Sciences, Boston University, Boston, MA, United States
- Nils D. Forkert: Department of Radiology, University of Calgary, Calgary, AB, Canada; Hotchkiss Brain Institute, University of Calgary, Calgary, AB, Canada; Alberta Children’s Hospital Research Institute, University of Calgary, Calgary, AB, Canada
3
Henderson MM, Tarr MJ, Wehbe L. A Texture Statistics Encoding Model Reveals Hierarchical Feature Selectivity across Human Visual Cortex. J Neurosci 2023; 43:4144-4161. [PMID: 37127366; PMCID: PMC10255092; DOI: 10.1523/jneurosci.1822-22.2023]
Abstract
Midlevel features, such as contour and texture, provide a computational link between low- and high-level visual representations. Although the nature of midlevel representations in the brain is not fully understood, past work has suggested that a texture statistics model, the P-S model (Portilla and Simoncelli, 2000), is a candidate for predicting neural responses in areas V1-V4 as well as human behavioral data. However, it is not currently known how well this model accounts for the responses of higher visual cortex to natural scene images. To examine this, we constructed single-voxel encoding models based on P-S statistics and fit the models to fMRI data from human subjects (both sexes) from the Natural Scenes Dataset (Allen et al., 2022). We demonstrate that the texture statistics encoding model can predict the held-out responses of individual voxels in early retinotopic areas and higher-level category-selective areas. The ability of the model to reliably predict signal in higher visual cortex suggests that the representation of texture statistics features is widespread throughout the brain. Furthermore, using variance partitioning analyses, we identify which features are most uniquely predictive of brain responses and show that the contributions of higher-order texture features increase from early areas to higher areas on the ventral and lateral surfaces. We also demonstrate that patterns of sensitivity to texture statistics can be used to recover broad organizational axes within visual cortex, including dimensions that capture semantic image content. These results provide a key step forward in characterizing how midlevel feature representations emerge hierarchically across the visual system.

SIGNIFICANCE STATEMENT: Intermediate visual features, like texture, play an important role in cortical computations and may contribute to tasks like object and scene recognition. Here, we used a texture model proposed in past work to construct encoding models that predict the responses of neural populations in human visual cortex (measured with fMRI) to natural scene stimuli. We show that responses of neural populations at multiple levels of the visual system can be predicted by this model, and that the model is able to reveal an increase in the complexity of feature representations from early retinotopic cortex to higher areas of ventral and lateral visual cortex. These results support the idea that texture-like representations may play a broad underlying role in visual processing.
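The variance partitioning logic can be illustrated in a few lines (synthetic regressors, not the authors' pipeline): fit regularized models on each feature set alone and combined, and read off unique and shared variance from the held-out R² values.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
n = 600
A = rng.standard_normal((n, 20))            # e.g., lower-level texture features
B = rng.standard_normal((n, 20))            # e.g., higher-order correlation features
voxel = A[:, 0] + 0.5 * B[:, 0] + 0.5 * rng.standard_normal(n)

def held_out_r2(X, y):
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
    return r2_score(y_te, Ridge(alpha=1.0).fit(X_tr, y_tr).predict(X_te))

r2_A, r2_B = held_out_r2(A, voxel), held_out_r2(B, voxel)
r2_AB = held_out_r2(np.hstack([A, B]), voxel)

print(f"unique A: {r2_AB - r2_B:.3f}")      # variance only A explains
print(f"unique B: {r2_AB - r2_A:.3f}")      # variance only B explains
print(f"shared:   {r2_A + r2_B - r2_AB:.3f}")
```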
Affiliation(s)
- Margaret M Henderson: Neuroscience Institute, Department of Psychology, and Machine Learning Department, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213
- Michael J Tarr: Neuroscience Institute, Department of Psychology, and Machine Learning Department, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213
- Leila Wehbe: Neuroscience Institute, Department of Psychology, and Machine Learning Department, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213
4
Muttenthaler L, Hebart MN. THINGSvision: A Python Toolbox for Streamlining the Extraction of Activations From Deep Neural Networks. Front Neuroinform 2021; 15:679838. [PMID: 34630062; PMCID: PMC8494008; DOI: 10.3389/fninf.2021.679838]
Abstract
Over the past decade, deep neural network (DNN) models have received a lot of attention due to their near-human object classification performance and their excellent prediction of signals recorded from biological visual systems. To better understand the function of these networks and relate them to hypotheses about brain activity and behavior, researchers need to extract the activations to images across different DNN layers. The abundance of different DNN variants, however, can often be unwieldy, and the task of extracting DNN activations from different layers may be non-trivial and error-prone for someone without a strong computational background. Thus, researchers in the fields of cognitive science and computational neuroscience would benefit from a library or package that supports a user in the extraction task. THINGSvision is a new Python module that aims to close this gap by providing a simple and unified tool for extracting layer activations for a wide range of pretrained and randomly-initialized neural network architectures, even for users with little to no programming experience. We demonstrate the general utility of THINGSvision by relating extracted DNN activations to a number of functional MRI and behavioral datasets using representational similarity analysis, which can be performed as an integral part of the toolbox. Together, THINGSvision enables researchers across diverse fields to extract features in a streamlined manner for their custom image dataset, thereby improving the ease of relating DNNs, brain activity, and behavior, and improving the reproducibility of findings in these research fields.
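For a sense of the extraction task such a toolbox streamlines, a minimal plain-PyTorch sketch using forward hooks (this is not the THINGSvision API itself, whose interface is documented in the paper and repository):

```python
import torch
from torchvision import models

model = models.alexnet(weights=None).eval()  # pretrained weights optional
activations = {}

def hook(name):
    def fn(module, inputs, output):
        activations[name] = output.detach().flatten(start_dim=1)
    return fn

# Register a hook on one layer; toolboxes manage this per layer and per model.
model.features[8].register_forward_hook(hook("features.8"))

images = torch.randn(4, 3, 224, 224)  # stand-in for a preprocessed image batch
with torch.no_grad():
    model(images)

print(activations["features.8"].shape)  # (4, n_features), ready for RSA etc.
```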
Affiliation(s)
- Lukas Muttenthaler: Vision and Computational Cognition Group, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany; Machine Learning Group, Technical University of Berlin, Berlin, Germany
- Martin N. Hebart: Vision and Computational Cognition Group, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
5
Wu H, Zhu Z, Wang J, Zheng N, Chen B. An Encoding Framework With Brain Inner State for Natural Image Identification. IEEE Trans Cogn Dev Syst 2021. [DOI: 10.1109/tcds.2020.2987352]
6
Berezutskaya J, Freudenburg ZV, Ambrogioni L, Güçlü U, van Gerven MAJ, Ramsey NF. Cortical network responses map onto data-driven features that capture visual semantics of movie fragments. Sci Rep 2020; 10:12077. [PMID: 32694561; PMCID: PMC7374611; DOI: 10.1038/s41598-020-68853-y]
Abstract
Research on how the human brain extracts meaning from sensory input relies in principle on methodological reductionism. In the present study, we adopt a more holistic approach by modeling the cortical responses to semantic information that was extracted from the visual stream of a feature film, employing artificial neural network models. Advances in both computer vision and natural language processing were utilized to extract the semantic representations from the film by combining perceptual and linguistic information. We tested whether these representations were useful for studying human brain data. To this end, we collected electrocorticography responses to a short movie from 37 subjects and fitted their cortical patterns across multiple regions using the semantic components extracted from film frames. We found that individual semantic components reflected fundamental semantic distinctions in the visual input, such as the presence or absence of people, human movement, landscape scenes, human faces, etc. Moreover, each semantic component mapped onto a distinct functional cortical network involving high-level cognitive regions in occipitotemporal, frontal and parietal cortices. The present work demonstrates the potential of data-driven methods from information processing fields to explain patterns of cortical responses, and contributes to the overall discussion about the encoding of high-level perceptual information in the human brain.
Affiliation(s)
- Julia Berezutskaya: Brain Center, Department of Neurology and Neurosurgery, University Medical Center Utrecht, Heidelberglaan 100, 3584 CX, Utrecht, The Netherlands; Donders Institute for Brain, Cognition and Behaviour, Radboud University, Montessorilaan 3, 6525 HR, Nijmegen, The Netherlands
- Zachary V Freudenburg: Brain Center, Department of Neurology and Neurosurgery, University Medical Center Utrecht, Heidelberglaan 100, 3584 CX, Utrecht, The Netherlands
- Luca Ambrogioni: Donders Institute for Brain, Cognition and Behaviour, Radboud University, Montessorilaan 3, 6525 HR, Nijmegen, The Netherlands
- Umut Güçlü: Donders Institute for Brain, Cognition and Behaviour, Radboud University, Montessorilaan 3, 6525 HR, Nijmegen, The Netherlands
- Marcel A J van Gerven: Donders Institute for Brain, Cognition and Behaviour, Radboud University, Montessorilaan 3, 6525 HR, Nijmegen, The Netherlands
- Nick F Ramsey: Brain Center, Department of Neurology and Neurosurgery, University Medical Center Utrecht, Heidelberglaan 100, 3584 CX, Utrecht, The Netherlands
7
Deep learning and cognitive science. Cognition 2020; 203:104365. [PMID: 32563082; DOI: 10.1016/j.cognition.2020.104365]
Abstract
In recent years, the family of algorithms collected under the term "deep learning" has revolutionized artificial intelligence, enabling machines to reach human-like performance in many complex cognitive tasks. Although deep learning models are grounded in the connectionist paradigm, their recent advances were driven primarily by engineering goals. Despite their applied focus, deep learning models eventually seem fruitful for cognitive purposes. This can be thought of as a kind of biological exaptation, where a physiological structure becomes applicable to a function different from that for which it was selected. In this paper, it will be argued that it is time for cognitive science to seriously come to terms with deep learning, and we try to spell out the reasons why this is the case. First, the path of the evolution of deep learning from the connectionist project is traced, demonstrating the remarkable continuity as well as the differences. Then, we consider how deep learning models can be useful for many cognitive topics, especially those where they have achieved performance comparable to humans, from perception to language. We maintain that deep learning poses questions that the cognitive sciences should try to answer. One such question is why deep convolutional models that are disembodied, inactive, unaware of context, and static are by far the closest to the patterns of activation in the brain's visual system.
8
9
Han K, Wen H, Shi J, Lu KH, Zhang Y, Fu D, Liu Z. Variational autoencoder: An unsupervised model for encoding and decoding fMRI activity in visual cortex. Neuroimage 2019; 198:125-136. [PMID: 31103784; PMCID: PMC6592726; DOI: 10.1016/j.neuroimage.2019.05.039]
Abstract
Goal-driven and feedforward-only convolutional neural networks (CNN) have been shown to be able to predict and decode cortical responses to natural images or videos. Here, we explored an alternative deep neural network, variational auto-encoder (VAE), as a computational model of the visual cortex. We trained a VAE with a five-layer encoder and a five-layer decoder to learn visual representations from a diverse set of unlabeled images. Using the trained VAE, we predicted and decoded cortical activity observed with functional magnetic resonance imaging (fMRI) from three human subjects passively watching natural videos. Compared to CNN, VAE could predict the video-evoked cortical responses with comparable accuracy in early visual areas, but relatively lower accuracy in higher-order visual areas. The distinction between CNN and VAE in terms of encoding performance was primarily attributed to their different learning objectives, rather than their different model architecture or number of parameters. Despite lower encoding accuracies, VAE offered a more convenient strategy for decoding the fMRI activity to reconstruct the video input, by first converting the fMRI activity to the VAE's latent variables, and then converting the latent variables to the reconstructed video frames through the VAE's decoder. This strategy was more advantageous than alternative decoding methods, e.g. partial least squares regression, for being able to reconstruct both the spatial structure and color of the visual input. Such findings highlight VAE as an unsupervised model for learning visual representation, as well as its potential and limitations for explaining cortical responses and reconstructing naturalistic and diverse visual experiences.
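The two-stage decoding strategy can be sketched as follows (synthetic data; `vae_decoder` is a hypothetical placeholder for a trained VAE decoder, which this sketch does not train):

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(2)
n_trials, n_voxels, n_latent = 500, 1000, 64

# Toy data: fMRI patterns and the VAE latents of the corresponding stimuli.
fmri = rng.standard_normal((n_trials, n_voxels))
latents = 0.8 * fmri[:, :n_latent] + 0.2 * rng.standard_normal((n_trials, n_latent))

# Stage 1: learn a linear map from fMRI activity to the VAE's latent variables.
to_latent = Ridge(alpha=10.0).fit(fmri, latents)

def vae_decoder(z):
    """Hypothetical stand-in for a trained VAE decoder (latents -> image frames)."""
    return np.tanh(z @ rng.standard_normal((n_latent, 32 * 32 * 3)))

# Stage 2: push decoded latents through the generative decoder to get frames.
z_hat = to_latent.predict(fmri[:5])
frames = vae_decoder(z_hat).reshape(5, 32, 32, 3)
print(frames.shape)
```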
Affiliation(s)
- Kuan Han: School of Electrical and Computer Engineering and Purdue Institute for Integrative Neuroscience, Purdue University, West Lafayette, IN 47906, USA
- Haiguang Wen: School of Electrical and Computer Engineering and Purdue Institute for Integrative Neuroscience, Purdue University, West Lafayette, IN 47906, USA
- Junxing Shi: School of Electrical and Computer Engineering and Purdue Institute for Integrative Neuroscience, Purdue University, West Lafayette, IN 47906, USA
- Kun-Han Lu: School of Electrical and Computer Engineering and Purdue Institute for Integrative Neuroscience, Purdue University, West Lafayette, IN 47906, USA
- Yizhen Zhang: School of Electrical and Computer Engineering and Purdue Institute for Integrative Neuroscience, Purdue University, West Lafayette, IN 47906, USA
- Di Fu: School of Electrical and Computer Engineering and Purdue Institute for Integrative Neuroscience, Purdue University, West Lafayette, IN 47906, USA
- Zhongming Liu: Weldon School of Biomedical Engineering, School of Electrical and Computer Engineering, and Purdue Institute for Integrative Neuroscience, Purdue University, West Lafayette, IN 47906, USA
10
Shehzad Z, McCarthy G. Perceptual and Semantic Phases of Face Identification Processing: A Multivariate Electroencephalography Study. J Cogn Neurosci 2019; 31:1827-1839. [PMID: 31368824; DOI: 10.1162/jocn_a_01453]
Abstract
Rapid identification of a familiar face requires an image-invariant representation of person identity. A varying sample of familiar faces is necessary to disentangle image-level from person-level processing. We investigated the time course of face identity processing using a multivariate electroencephalography analysis. Participants saw ambient exemplars of celebrity faces that differed in pose, lighting, hairstyle, and so forth. A name prime preceded a face on half of the trials to preactivate person-specific information, whereas a neutral prime was used on the remaining half. This manipulation helped dissociate perceptual- and semantic-based identification. Two time intervals within the post-face onset electroencephalography epoch were sensitive to person identity. The early perceptual phase spanned 110-228 msec and was not modulated by the name prime. The late semantic phase spanned 252-1000 msec and was sensitive to person knowledge activated by the name prime. Within this late phase, the identity response occurred earlier in time (300-600 msec) for the name prime with a scalp topography similar to the FN400 ERP. This may reflect a matching of the person primed in memory with the face on the screen. Following a neutral prime, the identity response occurred later in time (500-800 msec) with a scalp topography similar to the P600f ERP. This may reflect activation of semantic knowledge associated with the identity. Our results suggest that processing of identity begins early (110 msec), with some tolerance to image-level variations, and then progresses in stages sensitive to perceptual and then to semantic features.
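Time-resolved multivariate analyses of this kind are often implemented as a classifier trained independently at each time point; a minimal sketch on synthetic epoched EEG (not the authors' pipeline):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(3)
n_trials, n_channels, n_times = 200, 64, 120
X = rng.standard_normal((n_trials, n_channels, n_times))   # epoched EEG
y = rng.integers(0, 2, n_trials)                           # e.g., identity A vs. B

# Inject a decodable identity signal into a mid-latency window of some channels.
X[y == 1, :10, 40:70] += 0.4

# Train and cross-validate a classifier on the spatial pattern at each time point.
acc = [cross_val_score(LogisticRegression(max_iter=1000), X[:, :, t], y, cv=5).mean()
       for t in range(n_times)]
print(f"peak accuracy {max(acc):.2f} at time index {int(np.argmax(acc))}")
```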
11
Drew PJ. Vascular and neural basis of the BOLD signal. Curr Opin Neurobiol 2019; 58:61-69. [PMID: 31336326; DOI: 10.1016/j.conb.2019.06.004]
Abstract
Neural activity in the brain is usually coupled to increases in local cerebral blood flow, leading to the increase in oxygenation that generates the BOLD fMRI signal. Recent work has begun to elucidate the vascular and neural mechanisms underlying the BOLD signal. The dilatory response is distributed throughout the vascular network. Arteries actively dilate within a second following increases in neural activity, while venous distensions are passive and follow a time course that lasts tens of seconds. Vasodilation, and thus local blood flow, is controlled by the activity of both neurons and astrocytes via multiple different pathways. The relationship between sensory-driven neural activity and vascular dynamics in sensory areas is well captured by a linear convolution model. However, depending on the behavioral state or brain region, the coupling between neural activity and hemodynamic signals can be weak or even inverted.
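The linear convolution model referred to here is commonly implemented by convolving a neural time course with a canonical hemodynamic response function (HRF); a minimal sketch using a double-gamma HRF with conventional, assumed parameter values:

```python
import numpy as np
from scipy.stats import gamma

dt = 0.1                                  # seconds per sample
t = np.arange(0, 30, dt)

# Canonical double-gamma HRF: a peak near ~5 s minus a smaller, later undershoot.
hrf = gamma.pdf(t, 6) - 0.35 * gamma.pdf(t, 12)
hrf /= hrf.sum()

# Boxcar "neural activity": two brief sensory events.
neural = np.zeros(600)
neural[50:60] = 1.0
neural[300:310] = 1.0

# Predicted hemodynamic signal = linear convolution of activity with the HRF.
bold = np.convolve(neural, hrf)[: len(neural)]
print(f"predicted signal peaks at {np.argmax(bold) * dt:.1f} s")
```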
Affiliation(s)
- Patrick J Drew: Departments of Engineering Science and Mechanics, Biomedical Engineering and Neurosurgery, Pennsylvania State University, University Park, PA 16802, United States
12
The Ventral Visual Pathway Represents Animal Appearance over Animacy, Unlike Human Behavior and Deep Neural Networks. J Neurosci 2019; 39:6513-6525. [PMID: 31196934; DOI: 10.1523/jneurosci.1714-18.2019]
Abstract
Recent studies showed agreement between how the human brain and neural networks represent objects, suggesting that we might start to understand the underlying computations. However, we know that the human brain is prone to biases at many perceptual and cognitive levels, often shaped by learning history and evolutionary constraints. Here, we explore one such perceptual phenomenon, perceiving animacy, and use the performance of neural networks as a benchmark. We performed an fMRI study that dissociated object appearance (what an object looks like) from object category (animate or inanimate) by constructing a stimulus set that includes animate objects (e.g., a cow), typical inanimate objects (e.g., a mug), and, crucially, inanimate objects that look like the animate objects (e.g., a cow mug). Behavioral judgments and deep neural networks categorized images mainly by animacy, setting all objects (lookalike and inanimate) apart from the animate ones. In contrast, activity patterns in ventral occipitotemporal cortex (VTC) were better explained by object appearance: animals and lookalikes were similarly represented and separated from the inanimate objects. Furthermore, the appearance of an object interfered with proper object identification, such as failing to signal that a cow mug is a mug. The preference in VTC to represent a lookalike as animate was even present when participants performed a task requiring them to report the lookalikes as inanimate. In conclusion, VTC representations, in contrast to neural networks, fail to represent objects when visual appearance is dissociated from animacy, probably due to a preferred processing of visual features typical of animate objects.

SIGNIFICANCE STATEMENT: How does the brain represent objects that we perceive around us? Recent advances in artificial intelligence have suggested that object categorization and its neural correlates have now been approximated by neural networks. Here, we show that neural networks can predict animacy according to human behavior but do not explain visual cortex representations. In ventral occipitotemporal cortex, neural activity patterns were strongly biased toward object appearance, to the extent that objects with visual features resembling animals were represented closely to real animals and separated from other objects from the same category. This organization that privileges animals and their features over objects might be the result of learning history and evolutionary constraints.
13
Kellmeyer P. Big Brain Data: On the Responsible Use of Brain Data from Clinical and Consumer-Directed Neurotechnological Devices. Neuroethics 2018. [DOI: 10.1007/s12152-018-9371-x]
Abstract
The focus of this paper is the ethical, legal and social challenges of ensuring the responsible use of "big brain data": the recording, collection and analysis of individuals' brain data on a large scale with clinical and consumer-directed neurotechnological devices. First, I highlight the benefits of big data and machine learning analytics in neuroscience for basic and translational research. Then, I describe some of the technological, social and psychological barriers to securing brain data from unwarranted access. In this context, I examine ways in which safeguards at the hardware and software level, as well as increasing "data literacy" in society, may enhance the security of neurotechnological devices and protect the privacy of personal brain data. Regarding the ethical and legal ramifications of big brain data, I first discuss effects on autonomy, the sense of agency and authenticity, as well as the self, that may result from the interaction between users and intelligent, particularly closed-loop, neurotechnological devices. I then discuss the impact of the "datafication" of basic and clinical neuroscience research on the just distribution of resources and access to these transformative technologies. In the legal realm, I examine the possible legal consequences that arise from the increasing ability to decode brain states and their corresponding subjective phenomenological experiences, which impinges on the hitherto inaccessible privacy of this information. Finally, I discuss the implications of big brain data for national and international regulatory policies and models of good data governance.
14
Representations of naturalistic stimulus complexity in early and associative visual and auditory cortices. Sci Rep 2018; 8:3439. [PMID: 29467495; PMCID: PMC5821852; DOI: 10.1038/s41598-018-21636-y]
Abstract
The complexity of sensory stimuli has an important role in perception and cognition. However, its neural representation is not well understood. Here, we characterize the representations of naturalistic visual and auditory stimulus complexity in early and associative visual and auditory cortices. This is realized by means of encoding and decoding analyses of two fMRI datasets in the visual and auditory modalities. Our results implicate most early and some associative sensory areas in representing the complexity of naturalistic sensory stimuli. For example, parahippocampal place area, which was previously shown to represent scene features, is shown to also represent scene complexity. Similarly, posterior regions of superior temporal gyrus and superior temporal sulcus, which were previously shown to represent syntactic (language) complexity, are shown to also represent music (auditory) complexity. Furthermore, our results suggest the existence of gradients in sensitivity to naturalistic sensory stimulus complexity in these areas.
15
Holdgraf CR, Rieger JW, Micheli C, Martin S, Knight RT, Theunissen FE. Encoding and Decoding Models in Cognitive Electrophysiology. Front Syst Neurosci 2017; 11:61. [PMID: 29018336; PMCID: PMC5623038; DOI: 10.3389/fnsys.2017.00061]
Abstract
Cognitive neuroscience has seen rapid growth in the size and complexity of data recorded from the human brain as well as in the computational tools available to analyze this data. This data explosion has resulted in an increased use of multivariate, model-based methods for asking neuroscience questions, allowing scientists to investigate multiple hypotheses with a single dataset, to use complex, time-varying stimuli, and to study the human brain under more naturalistic conditions. These tools come in the form of "encoding" models, in which stimulus features are used to model brain activity, and "decoding" models, in which neural features are used to generate a stimulus output. Here we review the current state of encoding and decoding models in cognitive electrophysiology and provide a practical guide toward conducting experiments and analyses in this emerging field. Our examples focus on using linear models in the study of human language and audition. We show how to calculate auditory receptive fields from natural sounds as well as how to decode neural recordings to predict speech. The paper aims to be a useful tutorial to these approaches, and a practical introduction to using machine learning and applied statistics to build models of neural activity. The data analytic approaches we discuss may also be applied to other sensory modalities, motor systems, and cognitive systems, and we cover some examples in these areas. In addition, a collection of Jupyter notebooks is publicly available as a complement to the material covered in this paper, providing code examples and tutorials for predictive modeling in Python. The aim is to provide a practical understanding of predictive modeling of human brain data and to propose best-practices in conducting these analyses.
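In the spirit of the paper's linear encoding models, a minimal sketch of receptive field estimation with a time-lagged design matrix and ridge regression (synthetic spectrogram features; the paper's own notebooks work from real speech and audio):

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(4)
n_times, n_freqs, n_lags = 2000, 16, 25     # spectrogram features, 25-sample history

spec = rng.standard_normal((n_times, n_freqs))
true_rf = rng.standard_normal((n_lags, n_freqs)) * np.hanning(n_lags)[:, None]

def lagged(X, n_lags):
    """Stack time-lagged copies of X: row t holds X[t], X[t-1], ..., X[t-n_lags+1]."""
    return np.hstack([np.roll(X, lag, axis=0) for lag in range(n_lags)])

X = lagged(spec, n_lags)[n_lags:]           # drop wrap-around rows from np.roll
y = X @ true_rf.reshape(-1) + rng.standard_normal(len(X))

rf_hat = Ridge(alpha=10.0).fit(X, y).coef_.reshape(n_lags, n_freqs)
r = np.corrcoef(rf_hat.ravel(), true_rf.ravel())[0, 1]
print(f"recovered receptive field correlates with ground truth: r={r:.2f}")
```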
Affiliation(s)
- Christopher R. Holdgraf: Department of Psychology, Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, CA, United States; Office of the Vice Chancellor for Research, Berkeley Institute for Data Science, University of California, Berkeley, Berkeley, CA, United States
- Jochem W. Rieger: Department of Psychology, Carl-von-Ossietzky University, Oldenburg, Germany
- Cristiano Micheli: Department of Psychology, Carl-von-Ossietzky University, Oldenburg, Germany; Institut des Sciences Cognitives Marc Jeannerod, Lyon, France
- Stephanie Martin: Department of Psychology, Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, CA, United States; Defitech Chair in Brain-Machine Interface, Center for Neuroprosthetics, Ecole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
- Robert T. Knight: Department of Psychology, Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, CA, United States
- Frederic E. Theunissen: Department of Psychology, Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, CA, United States; Department of Psychology, University of California, Berkeley, Berkeley, CA, United States
16
Neural Tuning to Low-Level Features of Speech throughout the Perisylvian Cortex. J Neurosci 2017; 37:7906-7920. [PMID: 28716965; DOI: 10.1523/jneurosci.0238-17.2017]
Abstract
Despite a large body of research, we continue to lack a detailed account of how auditory processing of continuous speech unfolds in the human brain. Previous research showed the propagation of low-level acoustic features of speech from posterior superior temporal gyrus toward anterior superior temporal gyrus in the human brain (Hullett et al., 2016). In this study, we investigate what happens to these neural representations past the superior temporal gyrus and how they engage higher-level language processing areas such as inferior frontal gyrus. We used low-level sound features to model neural responses to speech outside of the primary auditory cortex. Two complementary imaging techniques were used with human participants (both males and females): electrocorticography (ECoG) and fMRI. Both imaging techniques showed tuning of the perisylvian cortex to low-level speech features. With ECoG, we found evidence of propagation of the temporal features of speech sounds along the ventral pathway of language processing in the brain toward inferior frontal gyrus. Increasingly coarse temporal features of speech spreading from posterior superior temporal cortex toward inferior frontal gyrus were associated with linguistic features such as voice onset time, duration of the formant transitions, and phoneme, syllable, and word boundaries. The present findings provide the groundwork for a comprehensive bottom-up account of speech comprehension in the human brain.

SIGNIFICANCE STATEMENT: We know that, during natural speech comprehension, a broad network of perisylvian cortical regions is involved in sound and language processing. Here, we investigated the tuning to low-level sound features within these regions using neural responses to a short feature film. We also looked at whether the tuning organization along these brain regions showed any parallel to the hierarchy of language structures in continuous speech. Our results show that low-level speech features propagate throughout the perisylvian cortex and potentially contribute to the emergence of "coarse" speech representations in inferior frontal gyrus typically associated with high-level language processing. These findings add to the previous work on auditory processing and underline a distinctive role of inferior frontal gyrus in natural speech comprehension.
17
Seeliger K, Fritsche M, Güçlü U, Schoenmakers S, Schoffelen JM, Bosch SE, van Gerven MAJ. Convolutional neural network-based encoding and decoding of visual object recognition in space and time. Neuroimage 2017; 180:253-266. [PMID: 28723578; DOI: 10.1016/j.neuroimage.2017.07.018]
Abstract
Representations learned by deep convolutional neural networks (CNNs) for object recognition are a widely investigated model of the processing hierarchy in the human visual system. Using functional magnetic resonance imaging, CNN representations of visual stimuli have previously been shown to correspond to processing stages in the ventral and dorsal streams of the visual system. Whether this correspondence between models and brain signals also holds for activity acquired at high temporal resolution has been explored less exhaustively. Here, we addressed this question by combining CNN-based encoding models with magnetoencephalography (MEG). Human participants passively viewed 1,000 images of objects while MEG signals were acquired. We modelled their high temporal resolution source-reconstructed cortical activity with CNNs, and observed a feed-forward sweep across the visual hierarchy between 75 and 200 ms after stimulus onset. This spatiotemporal cascade was captured by the network layer representations, where the increasingly abstract stimulus representation in the hierarchical network model was reflected in different parts of the visual cortex, following the visual ventral stream. We further validated the accuracy of our encoding model by decoding stimulus identity in a left-out validation set of viewed objects, achieving state-of-the-art decoding accuracy.
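Encoding high-temporal-resolution activity with CNN features amounts to fitting one regularized linear model per time point; a compact sketch with synthetic data (in practice the features would come from a CNN's layer activations):

```python
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(5)
n_images, n_feat, n_sensors, n_times = 300, 100, 50, 40

feats = rng.standard_normal((n_images, n_feat))        # CNN activations per image
meg = rng.standard_normal((n_images, n_sensors, n_times))
meg[:, :, 15:25] += (feats[:, :n_sensors])[..., None]  # feature-driven time window

F_tr, F_te, M_tr, M_te = train_test_split(feats, meg, test_size=0.2, random_state=0)

# Fit an encoding model at every time point and score it on held-out images.
score = []
for t in range(n_times):
    model = RidgeCV(alphas=[0.1, 1.0, 10.0]).fit(F_tr, M_tr[:, :, t])
    pred = model.predict(F_te)
    score.append(np.mean([np.corrcoef(pred[:, s], M_te[:, s, t])[0, 1]
                          for s in range(n_sensors)]))
print(f"encoding accuracy peaks at time index {int(np.argmax(score))}")
```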
Affiliation(s)
- K Seeliger: Radboud University, Donders Institute for Brain, Cognition and Behaviour, Montessorilaan 3, 6525 HR Nijmegen, The Netherlands
- M Fritsche: Radboud University, Donders Institute for Brain, Cognition and Behaviour, Montessorilaan 3, 6525 HR Nijmegen, The Netherlands
- U Güçlü: Radboud University, Donders Institute for Brain, Cognition and Behaviour, Montessorilaan 3, 6525 HR Nijmegen, The Netherlands
- S Schoenmakers: Radboud University, Donders Institute for Brain, Cognition and Behaviour, Montessorilaan 3, 6525 HR Nijmegen, The Netherlands
- J-M Schoffelen: Radboud University, Donders Institute for Brain, Cognition and Behaviour, Montessorilaan 3, 6525 HR Nijmegen, The Netherlands
- S E Bosch: Radboud University, Donders Institute for Brain, Cognition and Behaviour, Montessorilaan 3, 6525 HR Nijmegen, The Netherlands
- M A J van Gerven: Radboud University, Donders Institute for Brain, Cognition and Behaviour, Montessorilaan 3, 6525 HR Nijmegen, The Netherlands
18
Testolin A, De Filippo De Grazia M, Zorzi M. The Role of Architectural and Learning Constraints in Neural Network Models: A Case Study on Visual Space Coding. Front Comput Neurosci 2017; 11:13. [PMID: 28377709; PMCID: PMC5360096; DOI: 10.3389/fncom.2017.00013]
Abstract
The recent "deep learning revolution" in artificial neural networks has had a strong impact and widespread deployment in engineering applications, but the use of deep learning for neurocomputational modeling has so far been limited. In this article we argue that unsupervised deep learning represents an important step forward for improving neurocomputational models of perception and cognition, because it emphasizes the role of generative learning as opposed to discriminative (supervised) learning. As a case study, we present a series of simulations investigating the emergence of neural coding of visual space for sensorimotor transformations. We compare different network architectures commonly used as building blocks for unsupervised deep learning by systematically testing the type of receptive fields and gain modulation developed by the hidden neurons. In particular, we compare Restricted Boltzmann Machines (RBMs), which are stochastic, generative networks with bidirectional connections trained using contrastive divergence, with autoencoders, which are deterministic networks trained using error backpropagation. For both learning architectures we also explore the role of sparse coding, which has been identified as a fundamental principle of neural computation. The unsupervised models are then compared with supervised, feed-forward networks that learn an explicit mapping between different spatial reference frames. Our simulations show that both architectural and learning constraints strongly influenced the emergent coding of visual space in terms of distribution of tuning functions at the level of single neurons. Unsupervised models, and particularly RBMs, were found to more closely adhere to neurophysiological data from single-cell recordings in the primate parietal cortex. These results provide new insights into how basic properties of artificial neural networks might be relevant for modeling neural information processing in biological systems.
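For reference, the contrastive divergence training compared in this study can be sketched in NumPy (a CD-1 update for a binary RBM; a toy illustration with biases omitted, not the study's training code):

```python
import numpy as np

rng = np.random.default_rng(6)
n_vis, n_hid, lr = 64, 32, 0.05
W = 0.01 * rng.standard_normal((n_vis, n_hid))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(v0, W):
    """One contrastive divergence (CD-1) update for a binary RBM."""
    ph0 = sigmoid(v0 @ W)                          # hidden probabilities given data
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    v1 = sigmoid(h0 @ W.T)                         # reconstruction of the visibles
    ph1 = sigmoid(v1 @ W)                          # hidden probabilities given recon
    return W + lr * (v0.T @ ph0 - v1.T @ ph1) / len(v0)

data = (rng.random((500, n_vis)) < 0.2).astype(float)  # toy binary inputs
for epoch in range(10):
    for batch in np.array_split(data, 10):
        W = cd1_step(batch, W)
print(W.shape)
```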
Affiliation(s)
- Alberto Testolin: Department of General Psychology and Padova Neuroscience Center, University of Padova, Padova, Italy
- Marco Zorzi: Department of General Psychology and Padova Neuroscience Center, University of Padova, Padova, Italy; San Camillo Hospital IRCCS, Venice, Italy
19
Güçlü U, van Gerven MAJ. Modeling the Dynamics of Human Brain Activity with Recurrent Neural Networks. Front Comput Neurosci 2017; 11:7. [PMID: 28232797; PMCID: PMC5299026; DOI: 10.3389/fncom.2017.00007]
Abstract
Encoding models are used for predicting brain activity in response to sensory stimuli with the objective of elucidating how sensory information is represented in the brain. Encoding models typically comprise a nonlinear transformation of stimuli to features (feature model) and a linear convolution of features to responses (response model). While there has been extensive work on developing better feature models, the work on developing better response models has been rather limited. Here, we investigate the extent to which recurrent neural network models can use their internal memories for nonlinear processing of arbitrary feature sequences to predict feature-evoked response sequences as measured by functional magnetic resonance imaging. We show that the proposed recurrent neural network models can significantly outperform established response models by accurately estimating long-term dependencies that drive hemodynamic responses. The results open a new window into modeling the dynamics of brain activity in response to sensory stimuli.
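A minimal PyTorch sketch of the core idea (toy dimensions and random data, not the paper's architecture): a recurrent network consumes a stimulus feature sequence and emits a predicted response sequence, learning temporal dependencies rather than assuming a fixed hemodynamic filter.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
n_feat, n_voxels, seq_len = 32, 10, 100

class RNNResponseModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.rnn = nn.GRU(n_feat, 64, batch_first=True)  # internal memory of features
        self.readout = nn.Linear(64, n_voxels)           # linear map to responses

    def forward(self, feats):
        h, _ = self.rnn(feats)
        return self.readout(h)

feats = torch.randn(8, seq_len, n_feat)                  # stimulus feature sequences
bold = torch.randn(8, seq_len, n_voxels)                 # measured response sequences

model = RNNResponseModel()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for step in range(200):
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(feats), bold)
    loss.backward()
    opt.step()
print(f"final training loss: {loss.item():.3f}")
```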
Affiliation(s)
- Umut Güçlü: Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, Netherlands
- Marcel A J van Gerven: Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, Netherlands
20
Khaligh-Razavi SM, Henriksson L, Kay K, Kriegeskorte N. Fixed versus mixed RSA: Explaining visual representations by fixed and mixed feature sets from shallow and deep computational models. J Math Psychol 2017; 76:184-197. [PMID: 28298702; PMCID: PMC5341758; DOI: 10.1016/j.jmp.2016.10.007]
Abstract
Studies of the primate visual system have begun to test a wide range of complex computational object-vision models. Realistic models have many parameters, which in practice cannot be fitted using the limited amounts of brain-activity data typically available. Task performance optimization (e.g. using backpropagation to train neural networks) provides major constraints for fitting parameters and discovering nonlinear representational features appropriate for the task (e.g. object classification). Model representations can be compared to brain representations in terms of the representational dissimilarities they predict for an image set. This method, called representational similarity analysis (RSA), enables us to test the representational feature space as is (fixed RSA) or to fit a linear transformation that mixes the nonlinear model features so as to best explain a cortical area's representational space (mixed RSA). Like voxel/population-receptive-field modelling, mixed RSA uses a training set (different stimuli) to fit one weight per model feature and response channel (voxels here), so as to best predict the response profile across images for each response channel. We analysed response patterns elicited by natural images, which were measured with functional magnetic resonance imaging (fMRI). We found that early visual areas were best accounted for by shallow models, such as a Gabor wavelet pyramid (GWP). The GWP model performed similarly with and without mixing, suggesting that the original features already approximated the representational space, obviating the need for mixing. However, a higher ventral-stream visual representation (lateral occipital region) was best explained by the higher layers of a deep convolutional network and mixing of its feature set was essential for this model to explain the representation. We suspect that mixing was essential because the convolutional network had been trained to discriminate a set of 1000 categories, whose frequencies in the training set did not match their frequencies in natural experience or their behavioural importance. The latter factors might determine the representational prominence of semantic dimensions in higher-level ventral-stream areas. Our results demonstrate the benefits of testing both the specific representational hypothesis expressed by a model's original feature space and the hypothesis space generated by linear transformations of that feature space.
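The fixed versus mixed distinction can be sketched as follows (synthetic features and voxels; an illustration of the logic, not the authors' implementation): fixed RSA compares the model RDM as-is, while mixed RSA first fits a linear mixing of model features to voxel responses and compares the geometry of the predicted responses.

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr
from sklearn.linear_model import Ridge

rng = np.random.default_rng(7)
n_img, n_feat, n_vox = 120, 80, 60

feats = rng.standard_normal((n_img, n_feat))                  # model feature space
voxels = 0.3 * feats @ rng.standard_normal((n_feat, n_vox)) \
         + rng.standard_normal((n_img, n_vox))                # responses to same images

train, test = np.arange(60), np.arange(60, 120)
brain_rdm = pdist(voxels[test], metric="correlation")

# Fixed RSA: take the model's representational geometry as-is.
r_fixed = spearmanr(pdist(feats[test], metric="correlation"), brain_rdm)[0]

# Mixed RSA: fit a linear mixing of model features to voxel responses on the
# training images, then compare the geometry of the predicted responses.
mix = Ridge(alpha=1.0).fit(feats[train], voxels[train])
mixed_rdm = pdist(mix.predict(feats[test]), metric="correlation")
r_mixed = spearmanr(mixed_rdm, brain_rdm)[0]

print(f"fixed RSA r={r_fixed:.2f}, mixed RSA r={r_mixed:.2f}")
```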
Affiliation(s)
- Seyed-Mahdi Khaligh-Razavi: MRC Cognition and Brain Sciences Unit, Cambridge, UK; Computer Science & Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA
- Linda Henriksson: MRC Cognition and Brain Sciences Unit, Cambridge, UK; Department of Neuroscience and Biomedical Engineering, Aalto University, Espoo, Finland
- Kendrick Kay: Department of Psychology, Washington University in St. Louis, St. Louis, MO, USA
21
Increasingly complex representations of natural movies across the dorsal stream are shared between subjects. Neuroimage 2017; 145:329-336. [DOI: 10.1016/j.neuroimage.2015.12.036]
22
23
Classifying four-category visual objects using multiple ERP components in single-trial ERP. Cogn Neurodyn 2016; 10:275-285. [PMID: 27468316; DOI: 10.1007/s11571-016-9378-0]
Abstract
Object categorization using single-trial electroencephalography (EEG) data measured while participants view images has been studied intensively. In previous studies, multiple event-related potential (ERP) components (e.g., P1, N1, P2, and P3) were used to improve the performance of object categorization of visual stimuli. In this study, we introduce a novel method that uses a multiple-kernel support vector machine to fuse multiple ERP component features. We investigate whether fusing the potential complementary information of different ERP components (e.g., P1, N1, P2a, and P2b) can improve the performance of four-category visual object classification in single-trial EEGs. We also compare the classification accuracy of different ERP component fusion methods. Our experimental results indicate that the classification accuracy increases through multiple ERP fusion. Additional comparative analyses indicate that the multiple-kernel fusion method can achieve a mean classification accuracy higher than 72%, which is substantially better than that achieved with any single ERP component feature (55.07% for the best single ERP component, N1). We compare the classification results with those of other fusion methods and determine that the accuracy of the multiple-kernel fusion method is 5.47%, 4.06%, and 16.90% higher than those of feature concatenation, feature extraction, and decision fusion, respectively. Our study shows that our multiple-kernel fusion method outperforms other fusion methods and thus provides a means to improve the classification performance of single-trial ERPs in brain-computer interface research.
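The multiple-kernel fusion idea can be sketched with precomputed kernels in scikit-learn (a simplified illustration with fixed, assumed kernel weights over synthetic "ERP component" features, whereas the study learned the combination):

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(8)
n_trials = 400
y = rng.integers(0, 4, n_trials)                # four object categories

# Synthetic feature sets standing in for ERP components (P1, N1, P2a, P2b).
components = [rng.standard_normal((n_trials, 20)) + 0.5 * y[:, None]
              for _ in range(4)]

idx_tr, idx_te = train_test_split(np.arange(n_trials), test_size=0.25, random_state=0)

# Fuse components as a weighted sum of per-component kernels (weights fixed here).
weights = [0.25, 0.25, 0.25, 0.25]
K = sum(w * rbf_kernel(c, c) for w, c in zip(weights, components))

clf = SVC(kernel="precomputed").fit(K[np.ix_(idx_tr, idx_tr)], y[idx_tr])
acc = clf.score(K[np.ix_(idx_te, idx_tr)], y[idx_te])
print(f"fused-kernel accuracy: {acc:.2f}")
```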
24
Maloney RT, Clifford CW. Orientation anisotropies in human primary visual cortex depend on contrast. Neuroimage 2015; 119:129-145. [DOI: 10.1016/j.neuroimage.2015.06.034]
25
Deep Neural Networks Reveal a Gradient in the Complexity of Neural Representations across the Ventral Stream. J Neurosci 2015; 35:10005-10014. [PMID: 26157000; DOI: 10.1523/jneurosci.5023-14.2015]
Abstract
Converging evidence suggests that the primate ventral visual pathway encodes increasingly complex stimulus features in downstream areas. We quantitatively show that there indeed exists an explicit gradient for feature complexity in the ventral pathway of the human brain. This was achieved by mapping thousands of stimulus features of increasing complexity across the cortical sheet using a deep neural network. Our approach also revealed a fine-grained functional specialization of downstream areas of the ventral stream. Furthermore, it allowed decoding of representations from human brain activity at an unsurpassed degree of accuracy, confirming the quality of the developed approach. Stimulus features that successfully explained neural responses indicate that population receptive fields were explicitly tuned for object categorization. This provides strong support for the hypothesis that object categorization is a guiding principle in the functional organization of the primate ventral stream.
26
Schoenmakers S, Güçlü U, van Gerven M, Heskes T. Gaussian mixture models and semantic gating improve reconstructions from human brain activity. Front Comput Neurosci 2015; 8:173. [PMID: 25688202; PMCID: PMC4311641; DOI: 10.3389/fncom.2014.00173]
Abstract
Better acquisition protocols and analysis techniques are making it possible to use fMRI to obtain highly detailed visualizations of brain processes. In particular we focus on the reconstruction of natural images from BOLD responses in visual cortex. We expand our linear Gaussian framework for percept decoding with Gaussian mixture models to better represent the prior distribution of natural images. Reconstruction of such images then boils down to probabilistic inference in a hybrid Bayesian network. In our set-up, different mixture components correspond to different character categories. Our framework can automatically infer higher-order semantic categories from lower-level brain areas. Furthermore, the framework can gate semantic information from higher-order brain areas to enforce the correct category during reconstruction. When categorical information is not available, we show that automatically learned clusters in the data give a similar improvement in reconstruction. The hybrid Bayesian network leads to highly accurate reconstructions in both supervised and unsupervised settings.
Affiliation(s)
- Sanne Schoenmakers: Donders Institute for Brain, Cognition and Behaviour, Radboud University Nijmegen, Nijmegen, Netherlands
- Umut Güçlü: Donders Institute for Brain, Cognition and Behaviour, Radboud University Nijmegen, Nijmegen, Netherlands
- Marcel van Gerven: Donders Institute for Brain, Cognition and Behaviour, Radboud University Nijmegen, Nijmegen, Netherlands
- Tom Heskes: Institute for Computing and Information Sciences, Radboud University Nijmegen, Nijmegen, Netherlands