1
Coraci D, Douven I, Cevolani G. Inference to the best neuroscientific explanation. Studies in History and Philosophy of Science 2024; 107:33-42. PMID: 39128362; DOI: 10.1016/j.shpsa.2024.06.009
Abstract
Neuroscientists routinely use reverse inference (RI) to draw conclusions about cognitive processes from neural activation data. However, despite its widespread use, the methodological status of RI is a matter of ongoing controversy, with some critics arguing that it should be rejected wholesale on the grounds that it instantiates a deductively invalid argument form. In response to these critiques, some have proposed to conceive of RI as a form of abduction or inference to the best explanation (IBE). We side with this response but at the same time argue that a defense of RI requires more than identifying it as a form of IBE. In this paper, we give an analysis of what determines the quality of an RI conceived as an IBE and on that basis argue that whether an RI is warranted needs to be decided on a case-by-case basis. Support for our argument will come from a detailed methodological discussion of RI in cognitive neuroscience in light of what the recent literature on IBE has identified as the main quality indicators for IBEs.
Affiliation(s)
- Igor Douven
- CNRS/Panthéon-Sorbonne University, IHPST, France.
2
Lan C, Kou J, Liu Q, Qing P, Zhang X, Song X, Xu D, Zhang Y, Chen Y, Zhou X, Kendrick KM, Zhao W. Oral Oxytocin Blurs Sex Differences in Amygdala Responses to Emotional Scenes. Biological Psychiatry: Cognitive Neuroscience and Neuroimaging 2024; 9:1028-1038. PMID: 38852918; DOI: 10.1016/j.bpsc.2024.05.010
Abstract
BACKGROUND Sex differences are shaped both by innate biological differences and the social environment and are frequently observed in human emotional neural responses. Oral administration of oxytocin (OXT), as an alternative and noninvasive intake method, has been shown to produce sex-dependent effects on emotional face processing. However, it is unclear whether oral OXT produces similar sex-dependent effects on processing continuous emotional scenes. METHODS The current randomized, double-blind, placebo-controlled neuropsychopharmacological functional magnetic resonance imaging experiment was conducted in 147 healthy participants (OXT = 74, men/women = 37/37; placebo = 73, men/women = 36/37) to examine the oral OXT effect on plasma OXT concentrations and neural response to emotional scenes in both sexes. RESULTS At the neuroendocrine level, women showed lower endogenous OXT concentrations than men, but oral OXT increased OXT concentrations equally in both sexes. Regarding neural activity, emotional scenes evoked opposite valence-independent effects on right amygdala activation (women > men) and its functional connectivity with the insula (men > women) in men and women in the placebo group. This sex difference was either attenuated (amygdala response) or even completely eliminated (amygdala-insula functional connectivity) in the OXT group. Multivariate pattern analysis confirmed these findings by developing an accurate sex-predictive neural pattern that included the amygdala and the insula under the placebo but not the OXT condition. CONCLUSIONS The results of the current study suggest a pronounced sex difference in neural responses to emotional scenes that was eliminated by oral OXT, with OXT having opposite modulatory effects in men and women. This may reflect oral OXT enhancing emotional regulation to continuous emotional stimuli in both sexes by facilitating appropriate changes in sex-specific amygdala-insula circuitry.
Affiliation(s)
- Chunmei Lan
- Center of Psychosomatic Medicine, Sichuan Provincial Center for Mental Health, Sichuan Provincial People's Hospital, University of Electronic Science and Technology of China, Chengdu, China
- Juan Kou
- Institute of Brain and Psychological Sciences, Sichuan Normal University, Chengdu, China
- Qi Liu
- Center of Psychosomatic Medicine, Sichuan Provincial Center for Mental Health, Sichuan Provincial People's Hospital, University of Electronic Science and Technology of China, Chengdu, China
- Peng Qing
- Center of Psychosomatic Medicine, Sichuan Provincial Center for Mental Health, Sichuan Provincial People's Hospital, University of Electronic Science and Technology of China, Chengdu, China
- Xiaodong Zhang
- Center of Psychosomatic Medicine, Sichuan Provincial Center for Mental Health, Sichuan Provincial People's Hospital, University of Electronic Science and Technology of China, Chengdu, China
- Xinwei Song
- Center of Psychosomatic Medicine, Sichuan Provincial Center for Mental Health, Sichuan Provincial People's Hospital, University of Electronic Science and Technology of China, Chengdu, China
- Dan Xu
- Center of Psychosomatic Medicine, Sichuan Provincial Center for Mental Health, Sichuan Provincial People's Hospital, University of Electronic Science and Technology of China, Chengdu, China
- Yingying Zhang
- Department of Molecular Psychology, Institute of Psychology and Education, Ulm University, Ulm, Germany
- Yuanshu Chen
- Institute of Brain and Psychological Sciences, Sichuan Normal University, Chengdu, China
- Xinqi Zhou
- Institute of Brain and Psychological Sciences, Sichuan Normal University, Chengdu, China
- Keith M Kendrick
- Center of Psychosomatic Medicine, Sichuan Provincial Center for Mental Health, Sichuan Provincial People's Hospital, University of Electronic Science and Technology of China, Chengdu, China.
- Weihua Zhao
- Center of Psychosomatic Medicine, Sichuan Provincial Center for Mental Health, Sichuan Provincial People's Hospital, University of Electronic Science and Technology of China, Chengdu, China; Institute of Electronic and Information Engineering of University of Electronic Science and Technology of China in Guangdong, Dongguan, China.
3
Yin X, Wu Z, Wang H. A novel DRL-guided sparse voxel decoding model for reconstructing perceived images from brain activity. J Neurosci Methods 2024; 412:110292. PMID: 39299579; DOI: 10.1016/j.jneumeth.2024.110292
Abstract
BACKGROUND Due to the sparse coding character of the human visual cortex and the scarcity of paired {image, fMRI} training samples, voxel selection is an effective means of reconstructing perceived images from fMRI. However, existing data-driven voxel selection methods have not achieved satisfactory results. NEW METHOD Here, a novel deep reinforcement learning-guided sparse voxel (DRL-SV) decoding model is proposed to reconstruct perceived images from fMRI. We innovatively cast voxel selection as a Markov decision process (MDP), training agents to select voxels that are highly involved in specific visual encoding. RESULTS Experimental results on two public datasets verify the effectiveness of the proposed DRL-SV, which can accurately select voxels highly involved in neural encoding, thereby improving the quality of visual image reconstruction. COMPARISON WITH EXISTING METHODS We qualitatively and quantitatively compared our results with state-of-the-art (SOTA) methods and obtained better reconstructions. Compared with traditional data-driven baseline methods, DRL-SV yields sparser voxel selections yet better reconstruction performance. CONCLUSIONS DRL-SV selects voxels involved in visual encoding more accurately than data-driven voxel selection methods in few-shot settings. The proposed decoding model provides a new avenue for improving the image reconstruction quality of the primary visual cortex.
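The authors' DRL-SV implementation is not reproduced in this listing. As a loose, hypothetical sketch of the underlying idea, framing voxel selection as sequential decision-making whose reward is decoding improvement, here is a toy agent with a greedy policy (all data, sizes, and the ridge decoder are invented for illustration; a real DRL agent would learn a policy rather than act greedily):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 200 samples, 50 voxels; only voxels 0-4 carry signal about y.
n, n_vox = 200, 50
X = rng.standard_normal((n, n_vox))
y = X[:, :5] @ rng.standard_normal(5)      # latent stimulus feature
X_tr, X_va = X[:150], X[150:]
y_tr, y_va = y[:150], y[150:]

def reward(sel):
    """Negative held-out MSE of a ridge decoder restricted to `sel` voxels."""
    if not sel:
        return -np.mean(y_va ** 2)
    A = X_tr[:, sel]
    w = np.linalg.solve(A.T @ A + 1e-2 * np.eye(len(sel)), A.T @ y_tr)
    return -np.mean((X_va[:, sel] @ w - y_va) ** 2)

# Greedy "policy" over the decision process: state = current voxel subset,
# action = add one voxel, reward = gain in decoding performance.
selected, score = [], reward([])
for _ in range(10):
    gains = [(reward(selected + [v]), v) for v in range(n_vox) if v not in selected]
    best_gain, best_v = max(gains)
    if best_gain <= score:                 # stop when no voxel helps
        break
    selected.append(best_v)
    score = best_gain

print(sorted(selected))                    # dominated by the informative voxels 0-4
```

The sketch keeps the key property the abstract emphasizes: sparse selections driven by decoding payoff rather than by univariate voxel statistics.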
Affiliation(s)
- Xu Yin
- Key Laboratory of Child Development and Learning Science of Ministry of Education, School of Biological Science & Medical Engineering, Southeast University, Nanjing, Jiangsu 210096, China
- Zhengping Wu
- School of Innovations, Sanjiang University, China; School of Electronic Science and Engineering, Nanjing University, China
- Haixian Wang
- Key Laboratory of Child Development and Learning Science of Ministry of Education, School of Biological Science & Medical Engineering, Southeast University, Nanjing, Jiangsu 210096, China.
4
Contier O, Baker CI, Hebart MN. Distributed representations of behaviour-derived object dimensions in the human visual system. Nat Hum Behav 2024. PMID: 39251723; DOI: 10.1038/s41562-024-01980-y
Abstract
Object vision is commonly thought to involve a hierarchy of brain regions processing increasingly complex image features, with high-level visual cortex supporting object recognition and categorization. However, object vision supports diverse behavioural goals, suggesting basic limitations of this category-centric framework. To address these limitations, we mapped a series of dimensions derived from a large-scale analysis of human similarity judgements directly onto the brain. Our results reveal broadly distributed representations of behaviourally relevant information, demonstrating selectivity to a wide variety of novel dimensions while capturing known selectivities for visual features and categories. Behaviour-derived dimensions were superior to categories at predicting brain responses, yielding mixed selectivity in much of visual cortex and sparse selectivity in category-selective clusters. This framework reconciles seemingly disparate findings regarding regional specialization, explaining category selectivity as a special case of sparse response profiles among representational dimensions, suggesting a more expansive view on visual processing in the human brain.
Affiliation(s)
- Oliver Contier
- Vision and Computational Cognition Group, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany.
- Max Planck School of Cognition, Leipzig, Germany.
- Chris I Baker
- Laboratory of Brain and Cognition, National Institute of Mental Health, National Institutes of Health, Bethesda, MD, USA
- Martin N Hebart
- Vision and Computational Cognition Group, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- Department of Medicine, Justus Liebig University Giessen, Giessen, Germany
5
Lin R, Naselaris T, Kay K, Wehbe L. Stacked regressions and structured variance partitioning for interpretable brain maps. Neuroimage 2024; 298:120772. PMID: 39117095; DOI: 10.1016/j.neuroimage.2024.120772
Abstract
Relating brain activity associated with a complex stimulus to different properties of that stimulus is a powerful approach for constructing functional brain maps. However, when stimuli are naturalistic, their properties are often correlated (e.g., visual and semantic features of natural images, or different layers of a convolutional neural network that are used as features of images). Correlated properties can act as confounders for each other, complicate the interpretability of brain maps, and impair the robustness of statistical estimators. Here, we present an approach for brain mapping based on two proposed methods: stacking different encoding models and structured variance partitioning. Our stacking algorithm combines encoding models, each of which uses as input a feature space that describes a different stimulus attribute. The algorithm learns to predict the activity of a voxel as a linear combination of the outputs of the different encoding models. We show that the resulting combined model can predict held-out brain activity as well as or better than the individual encoding models. Further, the weights of the linear combination are readily interpretable; they show the importance of each feature space for predicting a voxel. We then build on our stacking models to introduce structured variance partitioning, a new type of variance partitioning that takes into account the known relationships between features. Our approach constrains the size of the hypothesis space and allows us to ask targeted questions about the similarity between feature spaces and brain regions even in the presence of correlations between the feature spaces. We validate our approach in simulation, showcase its brain-mapping potential on fMRI data, and release a Python package. Our methods can be useful for researchers interested in aligning brain activity with different layers of a neural network, or with other types of correlated feature spaces.
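The paper releases its own Python package; the sketch below is only a hypothetical toy version of the stacking idea it describes: fit one ridge encoding model per feature space, then learn nonnegative weights over their held-out predictions (all data and parameters are invented; clip-and-renormalise is a crude stand-in for exact simplex projection):

```python
import numpy as np

rng = np.random.default_rng(1)

# Two feature spaces describing the same stimuli; the voxel responds
# mostly to the first one.
n = 300
F1 = rng.standard_normal((n, 8))     # e.g., low-level visual features
F2 = rng.standard_normal((n, 8))     # e.g., semantic features
voxel = 0.8 * F1[:, 0] + 0.2 * F2[:, 0] + 0.1 * rng.standard_normal(n)

tr, va = slice(0, 200), slice(200, None)   # train / held-out split

def ridge_predict(F, y):
    """Fit one encoding model (ridge) on the training split, predict held-out."""
    A = F[tr]
    w = np.linalg.solve(A.T @ A + 1.0 * np.eye(F.shape[1]), A.T @ y[tr])
    return F[va] @ w

# One encoding model per feature space; stack their held-out predictions.
preds = np.column_stack([ridge_predict(F1, voxel), ridge_predict(F2, voxel)])

# Learn a convex combination of model outputs by projected gradient descent.
w = np.full(2, 0.5)
for _ in range(500):
    grad = preds.T @ (preds @ w - voxel[va]) / preds.shape[0]
    w = np.clip(w - 0.1 * grad, 0, None)   # keep weights nonnegative
    w /= w.sum()                           # renormalise onto the simplex

print(np.round(w, 2))   # the weight on the F1 model dominates
```

The learned weights are directly interpretable as each feature space's importance for this voxel, which is the property the abstract highlights.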
Affiliation(s)
- Ruogu Lin
- Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA 15213, United States of America
- Thomas Naselaris
- Department of Neuroscience, University of Minnesota, Minneapolis, MN 55455, United States of America; Center for Magnetic Resonance Research (CMRR), Department of Radiology, University of Minnesota, Minneapolis, MN 55455, United States of America
- Kendrick Kay
- Center for Magnetic Resonance Research (CMRR), Department of Radiology, University of Minnesota, Minneapolis, MN 55455, United States of America
- Leila Wehbe
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213, United States of America; Machine Learning Department, Carnegie Mellon University, Pittsburgh, PA 15213, United States of America.
6
Huffman DJ. An In-depth Exploration of the Interplay between fMRI Methods and Theory in Cognitive Neuroscience. Journal of Undergraduate Neuroscience Education (JUNE) 2024; 22:A273-A288. PMID: 39355664; PMCID: PMC11441438; DOI: 10.59390/zabm1739
Abstract
Functional magnetic resonance imaging (fMRI) has been a cornerstone of cognitive neuroscience since its invention in the 1990s. The methods that we use for fMRI data analysis allow us to test different theories of the brain, so different analyses can lead us to different conclusions about how the brain produces cognition. There has been a centuries-long debate about the nature of neural processing, with some theories arguing for functional specialization or localization (e.g., face and scene processing) while other theories suggest that cognition is implemented in distributed representations across many neurons and brain regions. Importantly, these theories have received support via different types of analyses; therefore, having students implement hands-on data analysis to explore the results of different fMRI analyses can allow them to take a firsthand approach to thinking about highly influential theories in cognitive neuroscience. Moreover, these explorations allow students to see that there are not clear-cut "right" or "wrong" answers in cognitive neuroscience; rather, we effectively instantiate assumptions within our analytical approaches that can lead us to different conclusions. Here, I provide Python code that uses freely available software and data to teach students how to analyze fMRI data using traditional activation analysis and machine-learning-based multivariate pattern analysis (MVPA). Altogether, these resources help teach students about the paramount importance of methodology in shaping our theories of the brain, and I believe they will be helpful for introductory undergraduate courses, graduate-level courses, and as a first analysis for people working in labs that use fMRI.
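In the spirit of the article's teaching materials, a minimal toy contrast between a traditional activation analysis and MVPA might look like this (synthetic data, not the article's actual code or dataset; the conditions and ROI are invented):

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy "ROI": 40 voxels, two conditions (e.g., faces vs. scenes), 100 trials each.
n_trials, n_vox = 100, 40
pattern = rng.standard_normal(n_vox)
pattern -= pattern.mean()     # zero ROI mean: invisible to the regional average
A = rng.standard_normal((n_trials, n_vox)) + 0.3 * pattern   # condition 1 trials
B = rng.standard_normal((n_trials, n_vox)) - 0.3 * pattern   # condition 2 trials

# Traditional activation analysis: compare mean ROI response per condition.
print(round(A.mean(), 2), round(B.mean(), 2))   # essentially identical

# MVPA: split-half nearest-centroid pattern classification.
cA, cB = A[:50].mean(0), B[:50].mean(0)         # training-half centroids

def classify(x):
    """Assign a trial to whichever training centroid its pattern is closer to."""
    return 'A' if np.sum((x - cA) ** 2) < np.sum((x - cB) ** 2) else 'B'

acc = (sum(classify(x) == 'A' for x in A[50:]) +
       sum(classify(x) == 'B' for x in B[50:])) / 100
print(acc)   # far above the 0.5 chance level
```

The toy example makes the pedagogical point concrete: a regional-average analysis sees nothing, while a multivariate analysis of the same data decodes the condition, so the analysis choice shapes the theoretical conclusion.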
Affiliation(s)
- Derek J Huffman
- Department of Psychology, Colby College, Waterville, ME 04901
7
Nigam T, Schwiedrzik CM. Predictions enable top-down pattern separation in the macaque face-processing hierarchy. Nat Commun 2024; 15:7196. PMID: 39169024; PMCID: PMC11339276; DOI: 10.1038/s41467-024-51543-y
Abstract
Distinguishing faces requires well distinguishable neural activity patterns. Contextual information may separate neural representations, leading to enhanced identity recognition. Here, we use functional magnetic resonance imaging to investigate how predictions derived from contextual information affect the separability of neural activity patterns in the macaque face-processing system, a 3-level processing hierarchy in ventral visual cortex. We find that in the presence of predictions, early stages of this hierarchy exhibit well separable and high-dimensional neural geometries resembling those at the top of the hierarchy. This is accompanied by a systematic shift of tuning properties from higher to lower areas, endowing lower areas with higher-order, invariant representations instead of their feedforward tuning properties. Thus, top-down signals dynamically transform neural representations of faces into separable and high-dimensional neural geometries. Our results provide evidence of how predictive context transforms flexible representational spaces to optimally use the computational resources provided by cortical processing hierarchies for better and faster distinction of facial identities.
Affiliation(s)
- Tarana Nigam
- Neural Circuits and Cognition Lab, European Neuroscience Institute Göttingen - A Joint Initiative of the University Medical Center Göttingen and the Max Planck Institute for Multidisciplinary Sciences, Grisebachstraße 5, 37077, Göttingen, Germany
- Perception and Plasticity Group, German Primate Center - Leibniz Institute for Primate Research, Kellnerweg 4, 37077, Göttingen, Germany
- Leibniz ScienceCampus 'Primate Cognition', Göttingen, Germany
- International Max Planck Research School 'Neurosciences', Georg August University Göttingen, Grisebachstraße 5, 37077, Göttingen, Germany
- Caspar M Schwiedrzik
- Neural Circuits and Cognition Lab, European Neuroscience Institute Göttingen - A Joint Initiative of the University Medical Center Göttingen and the Max Planck Institute for Multidisciplinary Sciences, Grisebachstraße 5, 37077, Göttingen, Germany.
- Perception and Plasticity Group, German Primate Center - Leibniz Institute for Primate Research, Kellnerweg 4, 37077, Göttingen, Germany.
- Leibniz ScienceCampus 'Primate Cognition', Göttingen, Germany.
8
Abdel-Ghaffar SA, Huth AG, Lescroart MD, Stansbury D, Gallant JL, Bishop SJ. Occipital-temporal cortical tuning to semantic and affective features of natural images predicts associated behavioral responses. Nat Commun 2024; 15:5531. PMID: 38982092; PMCID: PMC11233618; DOI: 10.1038/s41467-024-49073-8
Abstract
In everyday life, people need to respond appropriately to many types of emotional stimuli. Here, we investigate whether human occipital-temporal cortex (OTC) shows co-representation of the semantic category and affective content of visual stimuli. We also explore whether OTC transformation of semantic and affective features extracts information of value for guiding behavior. Participants viewed 1620 emotional natural images while functional magnetic resonance imaging data were acquired. Using voxel-wise modeling we show widespread tuning to semantic and affective image features across OTC. The top three principal components underlying OTC voxel-wise responses to image features encoded stimulus animacy, stimulus arousal and interactions of animacy with stimulus valence and arousal. At low to moderate dimensionality, OTC tuning patterns predicted behavioral responses linked to each image better than regressors directly based on image features. This is consistent with OTC representing stimulus semantic category and affective content in a manner suited to guiding behavior.
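A hypothetical miniature of the voxel-wise modeling workflow described here, fit a regularized encoding model per voxel and then run PCA on the voxel tuning weights to find the dominant tuning components, could look like the following (all names, sizes, and the two latent components are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy stimuli: 500 images described by 6 features
# (stand-ins for animacy, valence, arousal, and their interactions).
n, n_feat, n_vox = 500, 6, 120
F = rng.standard_normal((n, n_feat))

# Each voxel's tuning mixes two latent components (say, animacy and arousal).
comp = np.zeros((2, n_feat))
comp[0, 0] = comp[1, 1] = 1.0
mix = rng.standard_normal((n_vox, 2))
Y = F @ (mix @ comp).T + 0.5 * rng.standard_normal((n, n_vox))

# Voxel-wise encoding: ridge-regression tuning weights for every voxel.
W = np.linalg.solve(F.T @ F + 1.0 * np.eye(n_feat), F.T @ Y).T   # (n_vox, n_feat)

# PCA across voxels' tuning weights: the top components should recover the
# two latent axes that organize voxel tuning across the simulated cortex.
Wc = W - W.mean(axis=0)
_, s, _ = np.linalg.svd(Wc, full_matrices=False)
var_explained = s[:2] ** 2 / np.sum(s ** 2)
print(round(float(var_explained.sum()), 2))   # top 2 PCs carry nearly all variance
```

This mirrors the abstract's logic at toy scale: principal components of voxel-wise tuning, not individual features, summarize how semantic and affective dimensions organize responses.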
Affiliation(s)
- Samy A Abdel-Ghaffar
- Department of Psychology, UC Berkeley, Berkeley, CA, 94720, USA
- Google LLC, San Francisco, CA, USA
- Alexander G Huth
- Centre for Theoretical and Computational Neuroscience, UT Austin, Austin, TX, 78712, USA
- Mark D Lescroart
- Department of Psychology, University of Nevada Reno, Reno, NV, 89557, USA
- Dustin Stansbury
- Program in Vision Sciences, UC Berkeley, Berkeley, CA, 94720, USA
- Jack L Gallant
- Department of Psychology, UC Berkeley, Berkeley, CA, 94720, USA
- Program in Vision Sciences, UC Berkeley, Berkeley, CA, 94720, USA
- Helen Wills Neuroscience Institute, UC Berkeley, Berkeley, CA, 94720, USA
- Sonia J Bishop
- Department of Psychology, UC Berkeley, Berkeley, CA, 94720, USA.
- Helen Wills Neuroscience Institute, UC Berkeley, Berkeley, CA, 94720, USA.
- School of Psychology, Trinity College Dublin, Dublin, Ireland.
- Trinity College Institute of Neuroscience, Trinity College Dublin, Dublin, D02 PX31, Ireland.
9
Lindsey JW, Issa EB. Factorized visual representations in the primate visual system and deep neural networks. eLife 2024; 13:RP91685. PMID: 38968311; PMCID: PMC11226229; DOI: 10.7554/elife.91685
Abstract
Object classification has been proposed as a principal objective of the primate ventral visual stream and has been used as an optimization target for deep neural network models (DNNs) of the visual system. However, visual brain areas represent many different types of information, and optimizing for classification of object identity alone does not constrain how other information may be encoded in visual representations. Information about different scene parameters may be discarded altogether ('invariance'), represented in non-interfering subspaces of population activity ('factorization') or encoded in an entangled fashion. In this work, we provide evidence that factorization is a normative principle of biological visual representations. In the monkey ventral visual hierarchy, we found that factorization of object pose and background information from object identity increased in higher-level regions and strongly contributed to improving object identity decoding performance. We then conducted a large-scale analysis of factorization of individual scene parameters - lighting, background, camera viewpoint, and object pose - in a diverse library of DNN models of the visual system. Models which best matched neural, fMRI, and behavioral data from both monkeys and humans across 12 datasets tended to be those which factorized scene parameters most strongly. Notably, invariance to these parameters was not as consistently associated with matches to neural and behavioral data, suggesting that maintaining non-class information in factorized activity subspaces is often preferred to dropping it altogether. Thus, we propose that factorization of visual scene information is a widely used strategy in brains and DNN models thereof.
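The paper's factorization analysis is more elaborate than can be shown here; as a hypothetical illustration of the core idea, measuring how much pose-driven variance lies outside the identity subspace, consider this toy linear population (all sizes and the two-axis codes are invented):

```python
import numpy as np

rng = np.random.default_rng(4)

n_units, n_id, n_pose = 60, 8, 10

def pop_responses(entangled):
    """Linear population responses to all (identity, pose) pairs."""
    id_axes = rng.standard_normal((n_units, 2))
    pose_axes = id_axes if entangled else rng.standard_normal((n_units, 2))
    id_codes = rng.standard_normal((n_id, 2))
    pose_codes = rng.standard_normal((n_pose, 2))
    return ((id_codes @ id_axes.T)[:, None, :] +
            (pose_codes @ pose_axes.T)[None, :, :])   # (n_id, n_pose, n_units)

def factorization(R):
    """Share of pose-driven variance lying outside the identity subspace."""
    id_means = R.mean(axis=1)                 # identity tuning, pose averaged out
    id_means = id_means - id_means.mean(axis=0)
    U, _, _ = np.linalg.svd(id_means.T, full_matrices=False)
    basis = U[:, :2]                          # estimated identity subspace
    pose_fluct = (R - R.mean(axis=1, keepdims=True)).reshape(-1, n_units)
    in_sub = pose_fluct @ basis               # pose variance inside that subspace
    return 1 - np.sum(in_sub ** 2) / np.sum(pose_fluct ** 2)

f_indep = factorization(pop_responses(entangled=False))
f_ent = factorization(pop_responses(entangled=True))
print(round(f_indep, 2), round(f_ent, 2))     # high vs. near zero
```

When pose and identity occupy non-interfering subspaces the index is near 1; when pose variance invades the identity subspace it collapses toward 0, which is the distinction between factorization and entanglement the abstract draws.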
Affiliation(s)
- Jack W Lindsey
- Zuckerman Mind Brain Behavior Institute, Columbia University, New York, United States
- Department of Neuroscience, Columbia University, New York, United States
- Elias B Issa
- Zuckerman Mind Brain Behavior Institute, Columbia University, New York, United States
- Department of Neuroscience, Columbia University, New York, United States
10
Ma S, Wang L, Hou S, Zhang C, Yan B. Large-scale parameters framework with large convolutional kernel for encoding visual fMRI activity information. Cereb Cortex 2024; 34:bhae257. PMID: 38997209; DOI: 10.1093/cercor/bhae257
Abstract
Visual encoding models often use deep neural networks to describe the visual cortex's response to external stimuli. Inspired by biological findings, researchers found that large receptive fields built with large convolutional kernels improve convolutional encoding model performance. Motivated by recent scaling laws, this article investigates the performance of large-convolutional-kernel encoding models at larger parameter scales. This paper proposes a large-scale-parameter framework with a sizeable convolutional kernel for encoding visual functional magnetic resonance imaging activity. The proposed framework consists of three parts: first, a stimulus-image feature extraction module built from a large-kernel convolutional network, with increased channel numbers to expand the framework's parameter count; second, a multi-subject fusion module that enlarges the training data to accommodate the increase in parameters; and third, a voxel mapping module that maps stimulus-image features to functional magnetic resonance imaging signals. Compared to large-kernel visual encoding networks at the base parameter scale, our framework improves performance by approximately 7% on the Natural Scenes Dataset, the dedicated dataset for the Algonauts 2023 Challenge. Further analysis shows that the framework trades off encoding performance against trainability. This paper confirms that expanding the parameter count of visual encoding models can bring performance improvements.
Affiliation(s)
- Shuxiao Ma
- Henan Key Laboratory of Imaging and Intelligent Processing, PLA Strategic Support Force Information Engineering University, Zhengzhou, 450000, China
- Linyuan Wang
- Henan Key Laboratory of Imaging and Intelligent Processing, PLA Strategic Support Force Information Engineering University, Zhengzhou, 450000, China
- Senbao Hou
- Henan Key Laboratory of Imaging and Intelligent Processing, PLA Strategic Support Force Information Engineering University, Zhengzhou, 450000, China
- Chi Zhang
- Henan Key Laboratory of Imaging and Intelligent Processing, PLA Strategic Support Force Information Engineering University, Zhengzhou, 450000, China
- Bin Yan
- Henan Key Laboratory of Imaging and Intelligent Processing, PLA Strategic Support Force Information Engineering University, Zhengzhou, 450000, China
11
Li Y, Yang H, Gu S. Enhancing neural encoding models for naturalistic perception with a multi-level integration of deep neural networks and cortical networks. Sci Bull (Beijing) 2024; 69:1738-1747. PMID: 38490889; DOI: 10.1016/j.scib.2024.02.035
Abstract
Cognitive neuroscience aims to develop computational models that can accurately predict and explain neural responses to sensory inputs in the cortex. Recent studies attempt to leverage the representation power of deep neural networks (DNNs) to predict the brain response and suggest a correspondence between artificial and biological neural networks in their feature representations. However, typical voxel-wise encoding models tend to rely on specific networks designed for computer vision tasks, leading to suboptimal brain-wide correspondence during cognitive tasks. To address this challenge, this work proposes a novel approach that upgrades voxel-wise encoding models through multi-level integration of features from DNNs and information from brain networks. Our approach combines DNN feature-level ensemble learning and brain atlas-level model integration, resulting in significant improvements in predicting whole-brain neural activity during naturalistic video perception. Furthermore, this multi-level integration framework enables a deeper understanding of the brain's neural representation mechanism, accurately predicting the neural response to complex visual concepts. We demonstrate that neural encoding models can be optimized by leveraging a framework that integrates both data-driven approaches and theoretical insights into the functional structure of the cortical networks.
Affiliation(s)
- Yuanning Li
- School of Biomedical Engineering & State Key Laboratory of Advanced Medical Materials and Devices, ShanghaiTech University, Shanghai 201210, China.
- Huzheng Yang
- School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China; Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA 19104, USA
- Shi Gu
- School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China; Shenzhen Institute for Advanced Study, University of Electronic Science and Technology of China, Shenzhen 518110, China.
12
Miao HY, Tong F. Convolutional neural network models applied to neuronal responses in macaque V1 reveal limited nonlinear processing. J Vis 2024; 24:1. PMID: 38829629; PMCID: PMC11156204; DOI: 10.1167/jov.24.6.1
Abstract
Computational models of the primary visual cortex (V1) have suggested that V1 neurons behave like Gabor filters followed by simple nonlinearities. However, recent work employing convolutional neural network (CNN) models has suggested that V1 relies on far more nonlinear computations than previously thought. Specifically, unit responses in an intermediate layer of VGG-19 were found to best predict macaque V1 responses to thousands of natural and synthetic images. Here, we evaluated the hypothesis that the poor performance of lower layer units in VGG-19 might be attributable to their small receptive field size rather than to their lack of complexity per se. We compared VGG-19 with AlexNet, which has much larger receptive fields in its lower layers. Whereas the best-performing layer of VGG-19 occurred after seven nonlinear steps, the first convolutional layer of AlexNet best predicted V1 responses. Although the predictive accuracy of VGG-19 was somewhat better than that of standard AlexNet, we found that a modified version of AlexNet could match the performance of VGG-19 after only a few nonlinear computations. Control analyses revealed that decreasing the size of the input images caused the best-performing layer of VGG-19 to shift to a lower layer, consistent with the hypothesis that the relationship between image size and receptive field size can strongly affect model performance. We conducted additional analyses using a Gabor pyramid model to test for nonlinear contributions of normalization and contrast saturation. Overall, our findings suggest that the feedforward responses of V1 neurons can be well explained by assuming only a few nonlinear processing stages.
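The classical "Gabor filter followed by a simple nonlinearity" account of V1 that the study evaluates can be sketched as a quadrature-pair energy model, a textbook construction rather than the authors' code (filter sizes and frequencies below are arbitrary):

```python
import numpy as np

def gabor(size, freq, theta, phase, sigma):
    """2-D Gabor: an oriented sinusoid under a Gaussian envelope."""
    ax = np.arange(size) - size // 2
    x, y = np.meshgrid(ax, ax)
    xr = x * np.cos(theta) + y * np.sin(theta)
    env = np.exp(-(x ** 2 + y ** 2) / (2 * sigma ** 2))
    return env * np.cos(2 * np.pi * freq * xr + phase)

size = 32
even = gabor(size, freq=0.1, theta=0.0, phase=0.0, sigma=5.0)
odd = gabor(size, freq=0.1, theta=0.0, phase=np.pi / 2, sigma=5.0)

# Gratings at the preferred orientation and at the orthogonal one.
ax = np.arange(size) - size // 2
x, y = np.meshgrid(ax, ax)
pref = np.cos(2 * np.pi * 0.1 * x)
orth = np.cos(2 * np.pi * 0.1 * y)

def energy(stim):
    """LN energy model: linear filtering, then a squaring nonlinearity,
    summed over a quadrature pair (phase-invariant 'complex cell' response)."""
    return np.sum(stim * even) ** 2 + np.sum(stim * odd) ** 2

e_pref, e_orth = energy(pref), energy(orth)
print(e_pref > e_orth)   # True: the model unit is orientation tuned
```

Each unit here involves exactly one nonlinear step (squaring) after a linear filter, which is the low-complexity regime the study argues can already account for much of V1's feedforward response.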
Affiliation(s)
- Hui-Yuan Miao: Department of Psychology, Vanderbilt University, Nashville, TN, USA
- Frank Tong: Department of Psychology, Vanderbilt University, Nashville, TN, USA; Vanderbilt Vision Research Center, Vanderbilt University, Nashville, TN, USA

13
Markow ZE, Tripathy K, Svoboda AM, Schroeder ML, Rafferty SM, Richter EJ, Eggebrecht AT, Anastasio MA, Chevillet MA, Mugler EM, Naufel SN, Yin A, Trobaugh JW, Culver JP. Identifying Naturalistic Movies from Human Brain Activity with High-Density Diffuse Optical Tomography. bioRxiv 2024:2023.11.27.566650. [PMID: 38076976 PMCID: PMC10705261 DOI: 10.1101/2023.11.27.566650] [Indexed: 01/31/2024]
Abstract
Modern neuroimaging modalities, particularly functional MRI (fMRI), can decode detailed human experiences. Thousands of viewed images can be identified or classified, and sentences can be reconstructed. Decoding paradigms often leverage encoding models that reduce the stimulus space into a smaller yet generalizable feature set. However, the neuroimaging devices used for detailed decoding are non-portable, like fMRI, or invasive, like electrocorticography, excluding application in naturalistic use. Wearable, non-invasive, but lower-resolution devices such as electroencephalography and functional near-infrared spectroscopy (fNIRS) have been limited to decoding between stimuli used during training. Herein we develop and evaluate model-based decoding with high-density diffuse optical tomography (HD-DOT), a higher-resolution expansion of fNIRS with demonstrated promise as a surrogate for fMRI. Using a motion energy model of visual content, we decoded the identities of novel movie clips outside the training set with accuracy far above chance for single-trial decoding. Decoding was robust to modulations of testing time window, different training and test imaging sessions, hemodynamic contrast, and optode array density. Our results suggest that HD-DOT can translate detailed decoding into naturalistic use.
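The model-based identification scheme described above can be illustrated schematically: predict a brain response for every candidate clip from its stimulus features, then pick the clip whose prediction best correlates with the measured response. The linear encoding matrix `W` and the toy dimensions below are assumptions for illustration, not details from the paper.

```python
import numpy as np

def identify_clip(measured, candidate_features, weights):
    """Model-based identification: predict the response to each candidate
    clip via an assumed linear encoding model (weights @ features), then
    return the index of the best-correlating prediction."""
    scores = []
    for feats in candidate_features:
        predicted = weights @ feats
        scores.append(np.corrcoef(measured, predicted)[0, 1])
    return int(np.argmax(scores))

rng = np.random.default_rng(1)
W = rng.standard_normal((50, 8))                 # channels x stimulus features
clips = [rng.standard_normal(8) for _ in range(10)]  # candidate feature vectors
true_clip = 3
measured = W @ clips[true_clip] + 0.1 * rng.standard_normal(50)  # noisy response
decoded = identify_clip(measured, clips, W)
```

Because the encoding model generalizes over features rather than memorizing stimuli, the candidate set can contain clips never seen during training, which is the key step toward decoding novel stimuli.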
14
Dado T, Papale P, Lozano A, Le L, Wang F, van Gerven M, Roelfsema P, Güçlütürk Y, Güçlü U. Brain2GAN: Feature-disentangled neural encoding and decoding of visual perception in the primate brain. PLoS Comput Biol 2024; 20:e1012058. [PMID: 38709818 PMCID: PMC11098503 DOI: 10.1371/journal.pcbi.1012058] [Received: 06/10/2023] [Revised: 05/16/2024] [Accepted: 04/08/2024] [Indexed: 05/08/2024]
Abstract
A challenging goal of neural coding is to characterize the neural representations underlying visual perception. To this end, multi-unit activity (MUA) of macaque visual cortex was recorded in a passive fixation task upon presentation of faces and natural images. We analyzed the relationship between MUA and latent representations of state-of-the-art deep generative models, including the conventional and feature-disentangled representations of generative adversarial networks (GANs) (i.e., z- and w-latents of StyleGAN, respectively) and language-contrastive representations of latent diffusion networks (i.e., CLIP-latents of Stable Diffusion). A mass univariate neural encoding analysis of the latent representations showed that feature-disentangled w representations outperform both z and CLIP representations in explaining neural responses. Further, w-latent features were found to be positioned at the higher end of the complexity gradient which indicates that they capture visual information relevant to high-level neural activity. Subsequently, a multivariate neural decoding analysis of the feature-disentangled representations resulted in state-of-the-art spatiotemporal reconstructions of visual perception. Taken together, our results not only highlight the important role of feature-disentanglement in shaping high-level neural representations underlying visual perception but also serve as an important benchmark for the future of neural coding.
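A mass univariate encoding analysis of this kind typically fits one regularized linear readout per recording site, mapping latent features to responses, and scores each site by prediction accuracy. The sketch below uses closed-form ridge regression on synthetic data; the dimensions, noise level, and ridge penalty `alpha` are illustrative assumptions, not the study's actual pipeline.

```python
import numpy as np

def fit_encoding(features, responses, alpha=1.0):
    """Ridge regression encoding model: map latent features
    (n_stimuli x n_features) to responses (n_stimuli x n_sites).
    Returns an n_features x n_sites weight matrix (closed-form solution)."""
    n_feat = features.shape[1]
    gram = features.T @ features + alpha * np.eye(n_feat)
    return np.linalg.solve(gram, features.T @ responses)

def explained_correlation(features, responses, weights):
    """Per-site Pearson correlation between predicted and actual responses."""
    pred = features @ weights
    pz = (pred - pred.mean(0)) / pred.std(0)
    rz = (responses - responses.mean(0)) / responses.std(0)
    return (pz * rz).mean(0)

rng = np.random.default_rng(2)
X = rng.standard_normal((200, 16))                    # stimuli x latent dims
true_W = rng.standard_normal((16, 30))                # latent dims x sites
Y = X @ true_W + 0.5 * rng.standard_normal((200, 30)) # noisy "MUA" responses
W_hat = fit_encoding(X, Y, alpha=1.0)
r = explained_correlation(X, Y, W_hat)
```

Comparing such per-site correlations across different latent spaces (z, w, CLIP) is what lets the authors rank representations by how well they explain neural activity.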
Affiliation(s)
- Thirza Dado: Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, Netherlands
- Paolo Papale: Department of Vision and Cognition, Netherlands Institute for Neuroscience, Amsterdam, Netherlands
- Antonio Lozano: Department of Vision and Cognition, Netherlands Institute for Neuroscience, Amsterdam, Netherlands
- Lynn Le: Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, Netherlands
- Feng Wang: Department of Vision and Cognition, Netherlands Institute for Neuroscience, Amsterdam, Netherlands
- Marcel van Gerven: Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, Netherlands
- Pieter Roelfsema: Department of Vision and Cognition, Netherlands Institute for Neuroscience, Amsterdam, Netherlands; Laboratory of Visual Brain Therapy, Sorbonne University, Paris, France; Department of Integrative Neurophysiology, VU Amsterdam, Amsterdam, Netherlands; Department of Psychiatry, Amsterdam UMC, Amsterdam, Netherlands
- Yağmur Güçlütürk: Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, Netherlands
- Umut Güçlü: Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, Netherlands

15
Khonina SN, Kazanskiy NL, Skidanov RV, Butt MA. Exploring Types of Photonic Neural Networks for Imaging and Computing-A Review. Nanomaterials (Basel, Switzerland) 2024; 14:697. [PMID: 38668191 PMCID: PMC11054149 DOI: 10.3390/nano14080697] [Received: 03/13/2024] [Revised: 04/13/2024] [Accepted: 04/15/2024] [Indexed: 04/29/2024]
Abstract
Photonic neural networks (PNNs), utilizing light-based technologies, show immense potential in artificial intelligence (AI) and computing. Compared to traditional electronic neural networks, they offer faster processing speeds, lower energy usage, and improved parallelism. Leveraging light's properties for information processing could revolutionize diverse applications, including complex calculations and advanced machine learning (ML). Furthermore, these networks could address scalability and efficiency challenges in large-scale AI systems, potentially reshaping the future of computing and AI research. In this comprehensive review, we provide current, cutting-edge insights into diverse types of PNNs crafted for both imaging and computing purposes. Additionally, we delve into the intricate challenges they encounter during implementation, while also illuminating the promising perspectives they introduce to the field.
Affiliation(s)
- Muhammad A. Butt: Samara National Research University, 443086 Samara, Russia (N.L.K.)

16
Kim SG, De Martino F, Overath T. Linguistic modulation of the neural encoding of phonemes. Cereb Cortex 2024; 34:bhae155. [PMID: 38687241 PMCID: PMC11059272 DOI: 10.1093/cercor/bhae155] [Received: 06/22/2023] [Revised: 03/21/2024] [Accepted: 03/22/2024] [Indexed: 05/02/2024]
Abstract
Speech comprehension entails the neural mapping of the acoustic speech signal onto learned linguistic units. This acousto-linguistic transformation is bi-directional, whereby higher-level linguistic processes (e.g. semantics) modulate the acoustic analysis of individual linguistic units. Here, we investigated the cortical topography and linguistic modulation of the most fundamental linguistic unit, the phoneme. We presented natural speech and "phoneme quilts" (pseudo-randomly shuffled phonemes) in either a familiar (English) or unfamiliar (Korean) language to native English speakers while recording functional magnetic resonance imaging. This allowed us to dissociate the contribution of acoustic vs. linguistic processes toward phoneme analysis. We show that (i) the acoustic analysis of phonemes is modulated by linguistic analysis and (ii) that this modulation incorporates both acoustic and phonetic information. These results suggest that the linguistic modulation of cortical sensitivity to phoneme classes minimizes prediction error during natural speech perception, thereby aiding speech comprehension in challenging listening situations.
Affiliation(s)
- Seung-Goo Kim: Department of Psychology and Neuroscience, Duke University, 308 Research Dr, Durham, NC 27708, United States; Research Group Neurocognition of Music and Language, Max Planck Institute for Empirical Aesthetics, Grüneburgweg 14, Frankfurt am Main 60322, Germany
- Federico De Martino: Faculty of Psychology and Neuroscience, University of Maastricht, Universiteitssingel 40, 6229 ER Maastricht, Netherlands
- Tobias Overath: Department of Psychology and Neuroscience, Duke University, 308 Research Dr, Durham, NC 27708, United States; Duke Institute for Brain Sciences, Duke University, 308 Research Dr, Durham, NC 27708, United States; Center for Cognitive Neuroscience, Duke University, 308 Research Dr, Durham, NC 27708, United States

17
Li R, Li J, Wang C, Liu H, Liu T, Wang X, Zou T, Huang W, Yan H, Chen H. Multi-Semantic Decoding of Visual Perception with Graph Neural Networks. Int J Neural Syst 2024; 34:2450016. [PMID: 38372016 DOI: 10.1142/s0129065724500163] [Indexed: 02/20/2024]
Abstract
Constructing computational decoding models to account for the cortical representation of semantic information plays a crucial role in understanding visual perception. The human visual system processes interactive relationships among different objects when perceiving the semantic contents of natural visions. However, the existing semantic decoding models commonly regard categories as completely separate and independent visually and semantically and rarely consider the relationships from prior information. In this work, a novel semantic graph learning model was proposed to decode multiple semantic categories of perceived natural images from brain activity. The proposed model was validated on the functional magnetic resonance imaging data collected from five normal subjects while viewing 2750 natural images comprising 52 semantic categories. The results showed that the Graph Neural Network-based decoding model achieved higher accuracies than other deep neural network models. Moreover, the co-occurrence probability among semantic categories showed a significant correlation with the decoding accuracy. Additionally, the results suggested that semantic content organized in a hierarchical way with higher visual areas was more closely related to the internal visual experience. Together, this study provides a superior computational framework for multi-semantic decoding that supports the visual integration mechanism of semantic processing.
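The co-occurrence probability among semantic categories mentioned above can be estimated directly from a binary image-by-category label matrix. A minimal sketch with toy labels (not the study's 52-category data):

```python
import numpy as np

def cooccurrence_probability(labels):
    """Pairwise co-occurrence probabilities P(i and j) estimated from a
    binary image x category label matrix (n_images x n_categories).
    Entry [i, j] is the fraction of images containing both categories."""
    labels = np.asarray(labels, dtype=float)
    n_images = labels.shape[0]
    return (labels.T @ labels) / n_images

# Toy multi-label annotations: 4 images, 3 semantic categories
L = [[1, 1, 0],
     [1, 0, 0],
     [1, 1, 1],
     [0, 0, 1]]
P = cooccurrence_probability(L)
```

It is this kind of prior relational statistic that a graph-based decoder can exploit, and that the authors found correlated with decoding accuracy.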
Affiliation(s)
- Rong Li: The Center of Psychosomatic Medicine, Sichuan Provincial Center for Mental Health, Sichuan Provincial People's Hospital, University of Electronic Science and Technology of China, Chengdu 611731, P. R. China; School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 611731, P. R. China; MOE Key Lab for Neuroinformation, High-Field Magnetic Resonance Brain Imaging, Key Laboratory of Sichuan Province, University of Electronic Science and Technology of China, Chengdu 611731, P. R. China
- Jiyi Li, Chong Wang, Haoxiang Liu, Tao Liu, Xuyang Wang, Ting Zou, Wei Huang, Hongmei Yan: School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 611731, P. R. China; MOE Key Lab for Neuroinformation, High-Field Magnetic Resonance Brain Imaging, Key Laboratory of Sichuan Province, University of Electronic Science and Technology of China, Chengdu 611731, P. R. China
- Huafu Chen: The Center of Psychosomatic Medicine, Sichuan Provincial Center for Mental Health, Sichuan Provincial People's Hospital, University of Electronic Science and Technology of China, Chengdu 611731, P. R. China; School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 611731, P. R. China; MOE Key Lab for Neuroinformation, High-Field Magnetic Resonance Brain Imaging, Key Laboratory of Sichuan Province, University of Electronic Science and Technology of China, Chengdu 611731, P. R. China

18
Spino J. Brain Data Availability Presents Unique Privacy Challenges. AJOB Neurosci 2024; 15:146-148. [PMID: 38568702 DOI: 10.1080/21507740.2024.2326881] [Indexed: 04/05/2024]
19
Jiang C, Chen Z, Wolfe JM. Toward viewing behavior for aerial scene categorization. Cogn Res Princ Implic 2024; 9:17. [PMID: 38530617 PMCID: PMC10965882 DOI: 10.1186/s41235-024-00541-1] [Received: 09/12/2023] [Accepted: 03/07/2024] [Indexed: 03/28/2024]
Abstract
Previous work has demonstrated similarities and differences between aerial and terrestrial image viewing. Aerial scene categorization, a pivotal visual processing task for gathering geoinformation, heavily depends on rotation-invariant information. Aerial image-centered research has revealed effects of low-level features on performance of various aerial image interpretation tasks. However, there are fewer studies of viewing behavior for aerial scene categorization and of higher-level factors that might influence that categorization. In this paper, experienced subjects' eye movements were recorded while they were asked to categorize aerial scenes. A typical viewing center bias was observed. Eye movement patterns varied among categories. We explored the relationship of nine image statistics to observers' eye movements. Results showed that if the images were less homogeneous, and/or if they contained fewer or no salient diagnostic objects, viewing behavior became more exploratory. Higher- and object-level image statistics were predictive at both the image and scene category levels. Scanpaths were generally organized and small differences in scanpath randomness could be roughly captured by critical object saliency. Participants tended to fixate on critical objects. Image statistics included in this study showed rotational invariance. The results supported our hypothesis that the availability of diagnostic objects strongly influences eye movements in this task. In addition, this study provides supporting evidence for Loschky et al.'s (Journal of Vision, 15(6), 11, 2015) speculation that aerial scenes are categorized on the basis of image parts and individual objects. The findings were discussed in relation to theories of scene perception and their implications for automation development.
Affiliation(s)
- Chenxi Jiang: School of Remote Sensing and Information Engineering, Wuhan University, Wuhan, Hubei, China
- Zhenzhong Chen: School of Remote Sensing and Information Engineering, Wuhan University, Wuhan, Hubei, China; Hubei Luojia Laboratory, Wuhan, Hubei, China
- Jeremy M Wolfe: Harvard Medical School, Boston, MA, USA; Brigham & Women's Hospital, Boston, MA, USA

20
Ontivero-Ortega M, Iglesias-Fuster J, Perez-Hidalgo J, Marinazzo D, Valdes-Sosa M, Valdes-Sosa P. Intra-V1 functional networks and classification of observed stimuli. Front Neuroinform 2024; 18:1080173. [PMID: 38528885 PMCID: PMC10961393 DOI: 10.3389/fninf.2024.1080173] [Received: 10/25/2022] [Accepted: 02/08/2024] [Indexed: 03/27/2024]
Abstract
Introduction: Previous studies suggest that co-fluctuations in neural activity within V1 (measured with fMRI) carry information about observed stimuli, potentially reflecting various cognitive mechanisms. This study explores the neural sources shaping this information by using different fMRI preprocessing methods. The common response to stimuli shared by all individuals can be emphasized by using inter-subject correlations or de-emphasized by deconvolving the fMRI signal with hemodynamic response functions (HRFs) before calculating the correlations. The latter approach shifts the balance towards participant-idiosyncratic activity.
Methods: Here, we used multivariate pattern analysis of intra-V1 correlation matrices to predict the Level or Shape of observed Navon letters employing the types of correlations described above. We assessed accuracy in inter-subject prediction of specific conjunctions of properties, and attempted intra-subject cross-classification of stimulus properties (i.e., prediction of one feature despite changes in the other). Weight maps from successful classifiers were projected onto the visual field. A control experiment investigated eye-movement patterns during stimulus presentation.
Results: All inter-subject classifiers accurately predicted the Level and Shape of specific observed stimuli. However, successful intra-subject cross-classification was achieved only for stimulus Level, but not Shape, regardless of preprocessing scheme. Weight maps for successful Level classification differed between inter-subject correlations and deconvolved correlations. The latter revealed asymmetries in visual field link strength that corresponded to known perceptual asymmetries. Post-hoc measurement of eyeball fMRI signals did not find differences in gaze between stimulus conditions, and a control experiment (with derived simulations) also suggested that eye movements do not explain the stimulus-related changes in V1 topology.
Discussion: Our findings indicate that both inter-subject common responses and participant-specific activity contribute to the information in intra-V1 co-fluctuations, albeit through distinct sub-networks. Deconvolution, which enhances subject-specific activity, highlighted interhemispheric links for Global stimuli. Further exploration of intra-V1 networks promises insights into the neural basis of attention and perceptual organization.
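The basic feature construction behind such an analysis, vectorizing an intra-area voxel-by-voxel correlation matrix for pattern classification, can be sketched as follows. The toy dimensions are illustrative, not the study's.

```python
import numpy as np

def connectivity_features(timeseries):
    """Vectorize the upper triangle of a region's voxel x voxel correlation
    matrix: the kind of intra-area co-fluctuation pattern that can be fed
    to a multivariate pattern classifier."""
    c = np.corrcoef(timeseries)          # voxels x voxels correlation matrix
    iu = np.triu_indices_from(c, k=1)    # off-diagonal upper triangle only
    return c[iu]

rng = np.random.default_rng(3)
ts = rng.standard_normal((6, 100))       # 6 voxels, 100 time points
feats = connectivity_features(ts)        # 6*5/2 = 15 unique pairwise links
```

Each stimulus condition yields one such feature vector per scan, and a classifier trained on these vectors tests whether the network topology, rather than mean activation, carries stimulus information.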
Affiliation(s)
- Marlis Ontivero-Ortega: The Clinical Hospital of Chengdu Brain Sciences, University of Electronic Sciences Technology of China, Chengdu, China; Cuban Center for Neuroscience, Havana, Cuba; Department of Data Analysis, Ghent University, Ghent, Belgium
- Mitchell Valdes-Sosa: The Clinical Hospital of Chengdu Brain Sciences, University of Electronic Sciences Technology of China, Chengdu, China; Cuban Center for Neuroscience, Havana, Cuba
- Pedro Valdes-Sosa: The Clinical Hospital of Chengdu Brain Sciences, University of Electronic Sciences Technology of China, Chengdu, China; Cuban Center for Neuroscience, Havana, Cuba

21
Etemadpour R, Shintree S, Shereen AD. Brain Activity is Influenced by How High Dimensional Data are Represented: An EEG Study of Scatterplot Diagnostic (Scagnostics) Measures. Journal of Healthcare Informatics Research 2024; 8:19-49. [PMID: 38273981 PMCID: PMC10805893 DOI: 10.1007/s41666-023-00145-2] [Received: 08/25/2022] [Revised: 07/07/2023] [Accepted: 08/29/2023] [Indexed: 01/27/2024]
Abstract
Visualization and visual analytic tools amplify one's perception of data, facilitating deeper and faster insights that can improve decision making. For multidimensional data sets, one of the most common approaches of visualization methods is to map the data into lower dimensions. Scatterplot matrices (SPLOM) are often used to visualize bivariate relationships between combinations of variables in a multidimensional dataset. However, the number of scatterplots increases quadratically with respect to the number of variables. For high dimensional data, the corresponding enormous number of scatterplots makes data exploration overwhelmingly complex, thereby hindering the usefulness of SPLOM in human decision making processes. One approach to address this difficulty utilizes Graph-theoretic Scatterplot Diagnostic (Scagnostics) to automatically extract a subset of scatterplots with salient features and of manageable size with the hope that the data will be sufficient for improving human decisions. In this paper, we use Electroencephalogram (EEG) to observe brain activity while participants make decisions informed by scatterplots created using different visual measures. We focused on 4 categories of Scagnostics measures: Clumpy, Monotonic, Striated, and Stringy. Our findings demonstrate that by adjusting the level of difficulty in discriminating between data sets based on the Scagnostics measures, different parts of the brain are activated: easier visual discrimination choices involve brain activity mostly in visual sensory cortices located in the occipital lobe, while more difficult discrimination choices tend to recruit more parietal and frontal regions as they are known to be involved in resolving ambiguities. Our results imply that patterns of neural activity are predictive markers of which specific Scagnostics measures most assist human decision making based on visual stimuli such as ours.
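Of the four Scagnostics categories studied, Monotonic is the simplest to state: it is commonly computed as the squared Spearman rank correlation of the scatterplot's two variables, whereas the graph-theoretic measures (Clumpy, Striated, Stringy) additionally require a minimum spanning tree and are omitted here. A minimal sketch that ignores tied ranks:

```python
import numpy as np

def monotonic_score(x, y):
    """Scagnostics 'Monotonic' measure: squared Spearman rank correlation
    between the two plotted variables (simple ranking; ties not handled)."""
    rx = np.argsort(np.argsort(x)).astype(float)  # ranks of x
    ry = np.argsort(np.argsort(y)).astype(float)  # ranks of y
    rho = np.corrcoef(rx, ry)[0, 1]               # Pearson r of the ranks
    return rho ** 2

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
score_up = monotonic_score(x, x**3)   # perfectly monotone relationship -> 1.0
```

Any strictly increasing relationship, however nonlinear, scores 1.0, which is exactly what makes the measure useful for flagging monotone structure in a scatterplot.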
Affiliation(s)
- Ronak Etemadpour: Verus Research, 6100 Uptown Blvd NE, Suite 260, Albuquerque, New Mexico 87110 USA; Radiology Department, UNM Health Sciences Center (UNM), Albuquerque, New Mexico USA; The City College of New York, 160 Convent Ave, New York, NY 10031 USA
- Sonali Shintree: The City College of New York, 160 Convent Ave, New York, NY 10031 USA
- A Duke Shereen: CUNY Advanced Science Research Center, Neuroscience Initiative, Graduate Center, New York, NY USA

22
Koide-Majima N, Nishimoto S, Majima K. Mental image reconstruction from human brain activity: Neural decoding of mental imagery via deep neural network-based Bayesian estimation. Neural Netw 2024; 170:349-363. [PMID: 38016230 DOI: 10.1016/j.neunet.2023.11.024] [Received: 06/10/2023] [Revised: 09/22/2023] [Accepted: 11/08/2023] [Indexed: 11/30/2023]
Abstract
Visual images observed by humans can be reconstructed from their brain activity. However, the visualization (externalization) of mental imagery is challenging. Only a few studies have reported successful visualization of mental imagery, and their visualizable images have been limited to specific domains such as human faces or alphabetical letters. Therefore, visualizing mental imagery for arbitrary natural images stands as a significant milestone. In this study, we achieved this by enhancing a previous method. Specifically, we demonstrated that the visual image reconstruction method proposed in the seminal study by Shen et al. (2019) heavily relied on low-level visual information decoded from the brain and could not efficiently utilize the semantic information that would be recruited during mental imagery. To address this limitation, we extended the previous method to a Bayesian estimation framework and introduced the assistance of semantic information into it. Our proposed framework successfully reconstructed both seen images (i.e., those observed by the human eye) and imagined images from brain activity. Quantitative evaluation showed that our framework could identify seen and imagined images highly accurately compared to the chance accuracy (seen: 90.7%, imagery: 75.6%, chance accuracy: 50.0%). In contrast, the previous method could only identify seen images (seen: 64.3%, imagery: 50.4%). These results suggest that our framework would provide a unique tool for directly investigating the subjective contents of the brain such as illusions, hallucinations, and dreams.
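The Bayesian flavor of such a framework, combining evidence decoded from brain activity with prior (for example, semantic) information, reduces in its simplest discrete form to adding log-likelihood and log-prior before taking the maximum. The toy MAP selection below is a schematic illustration of that principle, not the paper's estimator, which operates in a deep network's feature space.

```python
import numpy as np

def map_candidate(log_likelihoods, log_prior):
    """Bayesian MAP selection over discrete candidates: the posterior is
    proportional to likelihood * prior, i.e. a sum in log space."""
    log_post = np.asarray(log_likelihoods) + np.asarray(log_prior)
    return int(np.argmax(log_post))

# Toy example: the likelihood alone favors candidate 0, but the semantic
# prior shifts the MAP estimate to candidate 2.
log_lik = np.log(np.array([0.4, 0.3, 0.3]))
log_pri = np.log(np.array([0.2, 0.2, 0.6]))
best = map_candidate(log_lik, log_pri)
```

The example makes the abstract's point concrete: when the low-level evidence is weak (as during imagery), a semantic prior can dominate the reconstruction.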
Affiliation(s)
- Naoko Koide-Majima: Center for Information and Neural Networks (CiNet), National Institute of Information and Communications Technology, Osaka 565-0871, Japan; Graduate School of Frontier Biosciences, Osaka University, Osaka 565-0871, Japan
- Shinji Nishimoto: Center for Information and Neural Networks (CiNet), National Institute of Information and Communications Technology, Osaka 565-0871, Japan; Graduate School of Frontier Biosciences, Osaka University, Osaka 565-0871, Japan; Graduate School of Medicine, Osaka University, Osaka 565-0871, Japan
- Kei Majima: Institute for Quantum Life Science, National Institutes for Quantum Science and Technology, Chiba 263-8555, Japan; JST PRESTO, Saitama 332-0012, Japan

23
Peters B, DiCarlo JJ, Gureckis T, Haefner R, Isik L, Tenenbaum J, Konkle T, Naselaris T, Stachenfeld K, Tavares Z, Tsao D, Yildirim I, Kriegeskorte N. How does the primate brain combine generative and discriminative computations in vision? arXiv 2024:arXiv:2401.06005v1. [PMID: 38259351 PMCID: PMC10802669] [Indexed: 01/24/2024]
Abstract
Vision is widely understood as an inference problem. However, two contrasting conceptions of the inference process have each been influential in research on biological vision as well as the engineering of machine vision. The first emphasizes bottom-up signal flow, describing vision as a largely feedforward, discriminative inference process that filters and transforms the visual information to remove irrelevant variation and represent behaviorally relevant information in a format suitable for downstream functions of cognition and behavioral control. In this conception, vision is driven by the sensory data, and perception is direct because the processing proceeds from the data to the latent variables of interest. The notion of "inference" in this conception is that of the engineering literature on neural networks, where feedforward convolutional neural networks processing images are said to perform inference. The alternative conception is that of vision as an inference process in Helmholtz's sense, where the sensory evidence is evaluated in the context of a generative model of the causal processes that give rise to it. In this conception, vision inverts a generative model through an interrogation of the sensory evidence in a process often thought to involve top-down predictions of sensory data to evaluate the likelihood of alternative hypotheses. The authors include scientists rooted in roughly equal numbers in each of the conceptions and motivated to overcome what might be a false dichotomy between them and engage the other perspective in the realm of theory and experiment. The primate brain employs an unknown algorithm that may combine the advantages of both conceptions. We explain and clarify the terminology, review the key empirical evidence, and propose an empirical research program that transcends the dichotomy and sets the stage for revealing the mysterious hybrid algorithm of primate vision.
Affiliation(s)
- Benjamin Peters: Zuckerman Mind Brain Behavior Institute, Columbia University; School of Psychology & Neuroscience, University of Glasgow
- James J DiCarlo: Department of Brain and Cognitive Sciences, MIT; McGovern Institute for Brain Research, MIT; NSF Center for Brains, Minds and Machines, MIT; Quest for Intelligence, Schwarzman College of Computing, MIT
- Ralf Haefner: Brain and Cognitive Sciences, University of Rochester; Center for Visual Science, University of Rochester
- Leyla Isik: Department of Cognitive Science, Johns Hopkins University
- Joshua Tenenbaum: Department of Brain and Cognitive Sciences, MIT; NSF Center for Brains, Minds and Machines, MIT; Computer Science and Artificial Intelligence Laboratory, MIT
- Talia Konkle: Department of Psychology, Harvard University; Center for Brain Science, Harvard University; Kempner Institute for Natural and Artificial Intelligence, Harvard University
- Zenna Tavares: Zuckerman Mind Brain Behavior Institute, Columbia University; Data Science Institute, Columbia University
- Doris Tsao: Dept of Molecular & Cell Biology, University of California Berkeley; Howard Hughes Medical Institute
- Ilker Yildirim: Department of Psychology, Yale University; Department of Statistics and Data Science, Yale University
- Nikolaus Kriegeskorte: Zuckerman Mind Brain Behavior Institute, Columbia University; Department of Psychology, Columbia University; Department of Neuroscience, Columbia University; Department of Electrical Engineering, Columbia University

24
Shrivastava M, Ye L. Neuroimaging and artificial intelligence for assessment of chronic painful temporomandibular disorders-a comprehensive review. Int J Oral Sci 2023; 15:58. [PMID: 38155153 PMCID: PMC10754947 DOI: 10.1038/s41368-023-00254-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/01/2023] [Revised: 10/19/2023] [Accepted: 10/20/2023] [Indexed: 12/30/2023] Open
Abstract
Chronic painful temporomandibular disorders (TMD) are challenging to diagnose and manage because of their complexity and our limited understanding of the underlying brain mechanisms. In the past few decades, neuroimaging research has clarified the neural mechanisms of pain regulation and perception, bridging the gap between brain activity and the subjective experience of pain. Neuroimaging has also made strides toward separating the neural mechanisms underlying chronic painful TMD. Recently, artificial intelligence (AI) has been transforming various sectors by automating tasks that previously required human intelligence, and it has started to contribute to the recognition, assessment, and understanding of painful TMD. The application of AI and neuroimaging to the pathophysiology and diagnosis of chronic painful TMD is still in its early stages. The objective of the present review is to identify the contemporary neuroimaging approaches, such as structural, functional, and molecular techniques, that have been used to investigate the brains of individuals with chronic painful TMD. Furthermore, this review guides practitioners on relevant aspects of AI and on how AI and neuroimaging methods can revolutionize our understanding of the mechanisms of painful TMD and aid in both diagnosis and management to enhance patient outcomes.
Affiliation(s)
- Mayank Shrivastava
- Adams School of Dentistry, University of North Carolina, Chapel Hill, NC, USA
- Liang Ye
- Department of Rehabilitation Medicine, University of Minnesota Medical School, Minneapolis, MN, USA

25
Guan S, Jiang R, Chen DY, Michael A, Meng C, Biswal B. Multifractal long-range dependence pattern of functional magnetic resonance imaging in the human brain at rest. Cereb Cortex 2023; 33:11594-11608. [PMID: 37851793 DOI: 10.1093/cercor/bhad393] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/12/2023] [Revised: 10/01/2023] [Accepted: 10/02/2023] [Indexed: 10/20/2023] Open
Abstract
Long-range dependence is a prevalent phenomenon in various biological systems that characterizes the long-memory effect of temporal fluctuations. While recent research suggests that the functional magnetic resonance imaging signal has fractal properties, the multifractal long-range dependence pattern of resting-state functional magnetic resonance imaging signals remains unknown. The current study applied multifractal detrended fluctuation analysis to highly sampled resting-state functional magnetic resonance imaging scans to investigate the long-range dependence profile of whole-brain voxels as well as of specific functional networks. Our findings revealed the multifractal properties of long-range dependence. Moreover, long-term persistent fluctuations were found in all situations, with stronger persistence in whole-brain regions. Subsets with large fluctuations contribute more to the multifractal spectrum in the whole brain. Additionally, we found that preprocessing with band-pass filtering provided significantly higher reliability for estimating long-range dependence. Our validation analysis confirmed that an optimal pipeline for long-range dependence analysis should include band-pass filtering and removal of daily temporal dependence. Furthermore, multifractal long-range dependence characteristics differ significantly between healthy controls and individuals with schizophrenia. This work provides an analytical pipeline for assessing multifractal long-range dependence in the resting-state functional magnetic resonance imaging signal. The findings suggest differential long-memory effects in the intrinsic functional networks, which may offer a neural marker for understanding brain function and pathology.
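The core of multifractal detrended fluctuation analysis can be sketched as follows (a simplified, order-1-detrending version; function and parameter names are ours, not the paper's). For each scale s, the signal's cumulative profile is split into segments, a linear trend is removed per segment, and the q-th-order fluctuation function F_q(s) is formed; the slope of log F_q(s) versus log s gives the generalised Hurst exponent h(q).

```python
import numpy as np

def mfdfa_hurst(x, scales, qs):
    """Multifractal DFA: return the generalised Hurst exponent h(q) per q.
    Monofractal signals give a flat h(q); multifractal signals give a
    q-dependent h(q)."""
    y = np.cumsum(x - np.mean(x))                      # profile of the signal
    logF = np.zeros((len(qs), len(scales)))
    for j, s in enumerate(scales):
        n = len(y) // s
        segs = y[:n * s].reshape(n, s)
        t = np.arange(s)
        # variance of residuals around a linear (order-1) fit per segment
        var = np.array([np.var(seg - np.polyval(np.polyfit(t, seg, 1), t))
                        for seg in segs])
        for i, q in enumerate(qs):
            if q == 0:                                  # limit case for q -> 0
                logF[i, j] = 0.5 * np.mean(np.log(var))
            else:                                       # log F_q(s)
                logF[i, j] = np.log(np.mean(var ** (q / 2))) / q
    # h(q): slope of log F_q(s) against log s
    return np.array([np.polyfit(np.log(scales), logF[i], 1)[0]
                     for i in range(len(qs))])

rng = np.random.default_rng(0)
h = mfdfa_hurst(rng.standard_normal(4096),
                scales=[16, 32, 64, 128, 256], qs=[-2.0, 0.0, 2.0])
```

For uncorrelated white noise, h(2) should estimate close to 0.5; long-term persistence of the kind reported for fMRI corresponds to h(2) > 0.5, and a spread of h across q indicates multifractality.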
Affiliation(s)
- Sihai Guan
- College of Electronic and Information, Southwest Minzu University, Chengdu 610041, China
- Key Laboratory of Electronic and Information Engineering, State Ethnic Affairs Commission, Chengdu 610041, China
- Runzhou Jiang
- School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 611731, China
- Medical Equipment Department, Xiangyang No.1 People's Hospital, Xiangyang 441000, China
- Donna Y Chen
- Department of Biomedical Engineering, New Jersey Institute of Technology, Newark, NJ 07102, United States
- Andrew Michael
- Duke Institute for Brain Sciences, Duke University, Durham, NC 27708, United States
- Chun Meng
- School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 611731, China
- Bharat Biswal
- School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 611731, China
- Department of Biomedical Engineering, New Jersey Institute of Technology, Newark, NJ 07102, United States

26
Csaky R, van Es MWJ, Jones OP, Woolrich M. Group-level brain decoding with deep learning. Hum Brain Mapp 2023; 44:6105-6119. [PMID: 37753636 PMCID: PMC10619368 DOI: 10.1002/hbm.26500] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/21/2023] [Revised: 07/11/2023] [Accepted: 09/11/2023] [Indexed: 09/28/2023] Open
Abstract
Decoding brain imaging data is gaining popularity, with applications in brain-computer interfaces and the study of neural representations. Decoding is typically subject-specific and does not generalise well across subjects, due to high between-subject variability. Techniques that overcome this will not only provide richer neuroscientific insights but also make it possible for group-level models to outperform subject-specific models. Here, we propose a method that uses subject embedding, analogous to word embedding in natural language processing, to learn and exploit the structure in between-subject variability as part of a decoding model, our adaptation of the WaveNet architecture for classification. We apply this to magnetoencephalography data in which 15 subjects viewed 118 different images, with 30 examples per image, and classify images using the entire 1 s window following image presentation. We show that the combination of deep learning and subject embedding is crucial to closing the performance gap between subject- and group-level decoding models. Importantly, group models outperform subject models on low-accuracy subjects (although they slightly impair high-accuracy subjects) and can be helpful for initialising subject models. While we have not generally found group-level models to perform better than subject-level models, the performance of group modelling is expected to improve with bigger datasets. To provide physiological interpretation at the group level, we make use of permutation feature importance, which provides insights into the spatiotemporal and spectral information encoded in the models. All code is available on GitHub (https://github.com/ricsinaruto/MEG-group-decode).
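The subject-embedding idea (a learnable per-subject vector fed into a shared group-level decoder) can be sketched as follows. This is a toy linear read-out standing in for the paper's WaveNet-style classifier, with randomly initialised rather than trained parameters; all sizes except the 15 subjects and 118 classes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n_subjects, emb_dim, n_features, n_classes = 15, 4, 32, 118

# One learnable embedding row per subject; decoder weights shared by the group.
subject_embedding = rng.normal(size=(n_subjects, emb_dim))
W = rng.normal(size=(n_features + emb_dim, n_classes))

def group_decode(meg_features, subject_id):
    """Concatenate the subject's embedding to the trial's MEG features,
    then apply the shared classifier (here a plain linear read-out)."""
    h = np.concatenate([meg_features, subject_embedding[subject_id]])
    return int(np.argmax(h @ W))

pred = group_decode(rng.normal(size=n_features), subject_id=3)
```

During training, gradients flow into both `W` and the selected row of `subject_embedding`, so the embedding table learns the structure of between-subject variability while the rest of the model stays shared.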
Affiliation(s)
- Richard Csaky
- Oxford Centre for Human Brain Activity, Department of Psychiatry, University of Oxford, Oxford, UK
- Wellcome Centre for Integrative Neuroimaging, Oxford, UK
- Christ Church, Oxford, UK
- Mats W. J. van Es
- Oxford Centre for Human Brain Activity, Department of Psychiatry, University of Oxford, Oxford, UK
- Wellcome Centre for Integrative Neuroimaging, Oxford, UK
- Oiwi Parker Jones
- Wellcome Centre for Integrative Neuroimaging, Oxford, UK
- Jesus College, Oxford, UK
- Department of Engineering Science, University of Oxford, Oxford, UK
- Mark Woolrich
- Oxford Centre for Human Brain Activity, Department of Psychiatry, University of Oxford, Oxford, UK
- Wellcome Centre for Integrative Neuroimaging, Oxford, UK

27
Tang J, Du M, Vo VA, Lal V, Huth AG. Brain encoding models based on multimodal transformers can transfer across language and vision. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 2023; 36:29654-29666. [PMID: 39015152 PMCID: PMC11250991] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Subscribe] [Scholar Register] [Indexed: 07/18/2024]
Abstract
Encoding models have been used to assess how the human brain represents concepts in language and vision. While language and vision rely on similar concept representations, current encoding models are typically trained and tested on brain responses to each modality in isolation. Recent advances in multimodal pretraining have produced transformers that can extract aligned representations of concepts in language and vision. In this work, we used representations from multimodal transformers to train encoding models that can transfer across fMRI responses to stories and movies. We found that encoding models trained on brain responses to one modality can successfully predict brain responses to the other modality, particularly in cortical regions that represent conceptual meaning. Further analysis of these encoding models revealed shared semantic dimensions that underlie concept representations in language and vision. Comparing encoding models trained using representations from multimodal and unimodal transformers, we found that multimodal transformers learn more aligned representations of concepts in language and vision. Our results demonstrate how multimodal transformers can provide insights into the brain's capacity for multimodal processing.
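The cross-modality transfer test described above can be sketched as: fit a ridge-regression encoding model from stimulus features (in the paper, multimodal-transformer representations) to voxel responses in one modality, then evaluate its predictions on responses to the other modality. A minimal sketch with synthetic data standing in for the fMRI responses to stories and movies:

```python
import numpy as np

def fit_encoding_model(features, responses, alpha=1.0):
    """Ridge regression mapping stimulus features -> voxel responses."""
    d = features.shape[1]
    return np.linalg.solve(features.T @ features + alpha * np.eye(d),
                           features.T @ responses)

rng = np.random.default_rng(0)
true_w = rng.normal(size=(16, 50))        # shared feature-to-voxel mapping
story_feats = rng.normal(size=(200, 16))  # 'language' training stimuli
movie_feats = rng.normal(size=(100, 16))  # 'vision' test stimuli

# Train on story responses, then predict movie responses (cross-modal transfer)
W = fit_encoding_model(story_feats, story_feats @ true_w, alpha=0.1)
pred = movie_feats @ W
r = np.corrcoef(pred.ravel(), (movie_feats @ true_w).ravel())[0, 1]
```

Transfer succeeds here because both modalities share the same underlying feature-to-voxel mapping; the paper's finding is that aligned multimodal-transformer features make this approximately true for real story and movie responses in conceptual cortex.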
28
Hopp FR, Amir O, Fisher JT, Grafton S, Sinnott-Armstrong W, Weber R. Moral foundations elicit shared and dissociable cortical activation modulated by political ideology. Nat Hum Behav 2023; 7:2182-2198. [PMID: 37679440 DOI: 10.1038/s41562-023-01693-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/04/2022] [Accepted: 08/03/2023] [Indexed: 09/09/2023]
Abstract
Moral foundations theory (MFT) holds that moral judgements are driven by modular and ideologically variable moral foundations but where and how these foundations are represented in the brain and shaped by political beliefs remains an open question. Using a moral vignette judgement task (n = 64), we probed the neural (dis)unity of moral foundations. Univariate analyses revealed that moral judgement of moral foundations, versus conventional norms, reliably recruits core areas implicated in theory of mind. Yet, multivariate pattern analysis demonstrated that each moral foundation elicits dissociable neural representations distributed throughout the cortex. As predicted by MFT, individuals' liberal or conservative orientation modulated neural responses to moral foundations. Our results confirm that each moral foundation recruits domain-general mechanisms of social cognition but also has a dissociable neural signature malleable by sociomoral experience. We discuss these findings in view of unified versus dissociable accounts of morality and their neurological support for MFT.
Affiliation(s)
- Frederic R Hopp
- Amsterdam School of Communication Research, University of Amsterdam, Amsterdam, the Netherlands
- Ori Amir
- Pomona College, Claremont, CA, USA
- Jacob T Fisher
- Department of Communication, Michigan State University, Lansing, MI, USA
- Scott Grafton
- Department of Psychological & Brain Sciences, University of California, Santa Barbara, CA, USA
- René Weber
- Department of Psychological & Brain Sciences, University of California, Santa Barbara, CA, USA
- Department of Communication, Media Neuroscience Lab, University of California, Santa Barbara, CA, USA
- School of Communication and Media, Ewha Womans University, Seoul, South Korea

29
Fang Z, Bloem IM, Olsson C, Ma WJ, Winawer J. Normalization by orientation-tuned surround in human V1-V3. PLoS Comput Biol 2023; 19:e1011704. [PMID: 38150484 PMCID: PMC10793941 DOI: 10.1371/journal.pcbi.1011704] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/15/2021] [Revised: 01/17/2024] [Accepted: 11/20/2023] [Indexed: 12/29/2023] Open
Abstract
An influential account of neuronal responses in primary visual cortex is the normalized energy model. This model is often implemented as a multi-stage computation. The first stage is linear filtering. The second stage is the extraction of contrast energy, whereby a complex cell computes the squared and summed outputs of a pair of linear filters in quadrature phase. The third stage is normalization, in which a local population of complex cells mutually inhibit one another. Because the population includes cells tuned to a range of orientations and spatial frequencies, the responses are effectively normalized by the local stimulus contrast. Here, using evidence from human functional MRI, we show that the classical model fails to account for the relative responses to two classes of stimuli: straight, parallel, band-passed contours (gratings), and curved, band-passed contours (snakes). The snakes elicit fMRI responses that are about twice as large as the gratings, yet a traditional divisive normalization model predicts responses that are about the same. Motivated by these observations and others from the literature, we implement a divisive normalization model in which cells matched in orientation tuning ("tuned normalization") preferentially inhibit each other. We first show that this model accounts for the differential responses to these two classes of stimuli. We then show that the model successfully generalizes to other band-pass textures, both in V1 and in extrastriate cortex (V2 and V3). We conclude that, even in primary visual cortex, complex features of images, such as the degree of heterogeneity, can have large effects on neural responses.
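The second and third stages, and the "tuned normalization" variant, can be sketched as follows (constants and weight shapes are illustrative, not the paper's fitted parameters; setting kappa = 0 recovers the classical, untuned normalization pool):

```python
import math

def contrast_energy(even, odd):
    """Stage 2: complex-cell energy from a quadrature pair of linear filters."""
    return even ** 2 + odd ** 2

def tuned_normalized_response(energies, prefs, i, sigma=0.1, kappa=2.0):
    """Stage 3: divisive normalization in which cells with similar preferred
    orientations (prefs, in radians) inhibit each other more strongly.
    kappa controls tuning width; kappa = 0 gives the untuned, classical pool."""
    w = [math.exp(kappa * math.cos(2 * (p - prefs[i]))) for p in prefs]
    pool = sum(wj * e for wj, e in zip(w, energies)) / sum(w)
    return energies[i] / (sigma + pool)
```

With tuned weights, a grating (all contours at one orientation) builds a large same-orientation pool and is strongly suppressed, whereas a snake (curved contours spanning many orientations) spreads its energy across the pool, which is the qualitative effect the model needs to capture the roughly twofold response difference.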
Affiliation(s)
- Zeming Fang
- Department of Psychology and Center for Neural Science, New York University, New York City, New York, United States of America
- Department of Cognitive Science, Rensselaer Polytechnic Institute, Troy, New York, United States of America
- Ilona M. Bloem
- Department of Psychology and Center for Neural Science, New York University, New York City, New York, United States of America
- Catherine Olsson
- Department of Psychology and Center for Neural Science, New York University, New York City, New York, United States of America
- Wei Ji Ma
- Department of Psychology and Center for Neural Science, New York University, New York City, New York, United States of America
- Jonathan Winawer
- Department of Psychology and Center for Neural Science, New York University, New York City, New York, United States of America

30
Rastegarnia S, St-Laurent M, DuPre E, Pinsard B, Bellec P. Brain decoding of the Human Connectome Project tasks in a dense individual fMRI dataset. Neuroimage 2023; 283:120395. [PMID: 37832707 DOI: 10.1016/j.neuroimage.2023.120395] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/03/2023] [Revised: 09/21/2023] [Accepted: 09/27/2023] [Indexed: 10/15/2023] Open
Abstract
Brain decoding aims to infer cognitive states from patterns of brain activity. Substantial inter-individual variations in functional brain organization challenge accurate decoding performed at the group level. In this paper, we tested whether accurate brain decoding models can be trained entirely at the individual level. We trained several classifiers on a dense individual functional magnetic resonance imaging (fMRI) dataset for which six participants completed the entire Human Connectome Project (HCP) task battery more than 13 times over ten separate fMRI sessions. We evaluated nine decoding methods, from Support Vector Machines (SVM) and Multi-Layer Perceptrons (MLP) to Graph Convolutional Neural Networks (GCN). All decoders were trained to classify single fMRI volumes into 21 experimental conditions simultaneously, using ∼7 h of fMRI data per participant. The best prediction accuracies were achieved with GCN and MLP models, whose performance (57-67 % accuracy) approached the state-of-the-art accuracy (76 %) of models trained at the group level on >1 K hours of data from the original HCP sample. Our SVM model also performed very well (54-62 % accuracy). Feature importance maps derived from the MLP (our best-performing model) revealed informative features in regions relevant to particular cognitive domains, notably in the motor cortex. We also observed that inter-subject classification achieved substantially lower accuracy than subject-specific models, indicating that our decoders learned individual-specific features. This work demonstrates that densely sampled neuroimaging datasets can be used to train accurate brain decoding models at the individual level. We expect this work to become a useful benchmark for techniques that improve model generalization across multiple subjects and acquisition conditions.
Affiliation(s)
- Shima Rastegarnia
- Université de Montréal, Montréal, QC, Canada; Centre de Recherche de L'Institut Universitaire de Gériatrie de Montréal, Montréal, Canada
- Marie St-Laurent
- Centre de Recherche de L'Institut Universitaire de Gériatrie de Montréal, Montréal, Canada
- Basile Pinsard
- Centre de Recherche de L'Institut Universitaire de Gériatrie de Montréal, Montréal, Canada
- Pierre Bellec
- Université de Montréal, Montréal, QC, Canada; Centre de Recherche de L'Institut Universitaire de Gériatrie de Montréal, Montréal, Canada

31
Khaleghi N, Hashemi S, Ardabili SZ, Sheykhivand S, Danishvar S. Salient Arithmetic Data Extraction from Brain Activity via an Improved Deep Network. SENSORS (BASEL, SWITZERLAND) 2023; 23:9351. [PMID: 38067727 PMCID: PMC10708586 DOI: 10.3390/s23239351] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 10/12/2023] [Revised: 11/06/2023] [Accepted: 11/14/2023] [Indexed: 12/18/2023]
Abstract
Interpretation of neural activity in response to stimuli received from the surrounding environment is necessary to realize automatic brain decoding. Analyzing brain recordings corresponding to visual stimulation helps to infer how visual perception shapes brain activity. In this paper, the impact of arithmetic concepts on vision-related brain recordings is considered, and an efficient convolutional neural network-based generative adversarial network (CNN-GAN) is proposed to map electroencephalogram (EEG) signals to the salient parts of the image stimuli. The first part of the proposed network consists of depth-wise one-dimensional convolution layers that classify the brain signals into 10 categories corresponding to the Modified National Institute of Standards and Technology (MNIST) image digits. The output of the CNN part is fed forward to a fine-tuned GAN in the proposed model. The performance of the proposed CNN part is evaluated on the visually evoked 14-channel MindBigData dataset recorded by David Vivancos, corresponding to images of the 10 digits. An average classification accuracy of 95.4% is obtained for the CNN part. The performance of the proposed CNN-GAN is evaluated using the saliency metrics SSIM and CC, which reach 92.9% and 97.28%, respectively. Furthermore, EEG-based reconstruction of MNIST digits is accomplished by transferring and tuning the improved CNN-GAN's trained weights.
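The depth-wise one-dimensional convolution at the front of such a classifier can be sketched as follows (a naive loop implementation for clarity; the 14 channels match the MindBigData recordings, while kernel length and values are illustrative). "Depth-wise" means each EEG channel is filtered with its own kernel, with no mixing across channels.

```python
import numpy as np

def depthwise_conv1d(x, kernels):
    """Depth-wise 1-D convolution (cross-correlation, as in deep-learning
    libraries): each channel of x is filtered only by its own kernel.
    x: (channels, time); kernels: (channels, k); returns (channels, time-k+1)."""
    c, t = x.shape
    k = kernels.shape[1]
    out = np.zeros((c, t - k + 1))
    for ch in range(c):
        for i in range(t - k + 1):
            out[ch, i] = x[ch, i:i + k] @ kernels[ch]
    return out

# 14 EEG channels, 8 time samples, per-channel kernels of length 3
out = depthwise_conv1d(np.ones((14, 8)), np.ones((14, 3)))
```

In the full model, several such layers would be stacked and followed by a classification head over the 10 digit categories; a pointwise (1x1) convolution is the usual way to then mix information across channels.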
Affiliation(s)
- Nastaran Khaleghi
- Department of Electrical and Computer Engineering, University of Tabriz, Tabriz 51666-16471, Iran
- Shaghayegh Hashemi
- Department of Computer Science and Engineering, Shahid Beheshti University, Tehran 19839-69411, Iran
- Sevda Zafarmandi Ardabili
- Electrical and Computer Engineering Department, Southern Methodist University, Dallas, TX 75205, USA
- Sobhan Sheykhivand
- Department of Biomedical Engineering, University of Bonab, Bonab 55517-61167, Iran
- Sebelan Danishvar
- College of Engineering, Design and Physical Sciences, Brunel University London, Uxbridge UB8 3PH, UK

32
Csaky R, van Es MWJ, Jones OP, Woolrich M. Interpretable many-class decoding for MEG. Neuroimage 2023; 282:120396. [PMID: 37805019 PMCID: PMC10938061 DOI: 10.1016/j.neuroimage.2023.120396] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/12/2023] [Revised: 09/11/2023] [Accepted: 09/27/2023] [Indexed: 10/09/2023] Open
Abstract
Multivariate pattern analysis (MVPA) of magnetoencephalography (MEG) and electroencephalography (EEG) data is a valuable tool for understanding how the brain represents and discriminates between different stimuli. Identifying the spatial and temporal signatures of stimuli is typically a crucial output of these analyses. Such analyses are mainly performed using linear, pairwise, sliding-window decoding models. These allow for relative ease of interpretation, e.g. by estimating a time-course of decoding accuracy, but have limited decoding performance. On the other hand, full-epoch multiclass decoding models, commonly used for brain-computer interface (BCI) applications, can provide better decoding performance. However, interpretation methods for such models have been designed with a low number of classes in mind. In this paper, we propose an approach that combines a multiclass, full-epoch decoding model with supervised dimensionality reduction, while still being able to reveal the contributions of spatiotemporal and spectral features using permutation feature importance. Crucially, we introduce a way of performing supervised dimensionality reduction of input features within a neural network optimised for the classification task, improving performance substantially. We demonstrate the approach on three different many-class task-MEG datasets using image presentations. Our results demonstrate that this approach consistently achieves higher accuracy than the peak accuracy of a sliding-window decoder while estimating the relevant spatiotemporal features in the MEG signal.
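Permutation feature importance, used here to interpret the full-epoch multiclass decoder, can be sketched generically: shuffle one feature's values across trials (breaking its link to the labels) and measure the resulting drop in accuracy. The model below is a hypothetical stand-in callable; the paper applies the same idea to spatiotemporal and spectral MEG features.

```python
import random

def accuracy(model, X, y):
    """Fraction of trials the model classifies correctly."""
    return sum(model(x) == t for x, t in zip(X, y)) / len(y)

def permutation_importance(model, X, y, feature, n_repeats=10, seed=0):
    """Importance of one feature = mean drop in accuracy after shuffling
    that feature's column across trials."""
    rng = random.Random(seed)
    base = accuracy(model, X, y)
    drops = []
    for _ in range(n_repeats):
        Xp = [row[:] for row in X]                 # copy trials
        col = [row[feature] for row in Xp]
        rng.shuffle(col)                           # break feature-label link
        for row, v in zip(Xp, col):
            row[feature] = v
        drops.append(base - accuracy(model, Xp, y))
    return sum(drops) / n_repeats

# Toy decoder that only reads feature 0: feature 0 matters, feature 1 does not.
X = [[i % 2, 5] for i in range(20)]
y = [row[0] for row in X]
model = lambda row: row[0]
```

Because the shuffle is applied to the trained model's inputs rather than to the training data, the method works for any black-box classifier, which is what makes it suitable for full-epoch neural-network decoders.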
Affiliation(s)
- Richard Csaky
- Oxford Centre for Human Brain Activity, Department of Psychiatry, University of Oxford, OX3 7JX, Oxford, UK; Wellcome Centre for Integrative Neuroimaging, OX3 9DU, Oxford, UK; Christ Church, OX1 1DP, Oxford, UK
- Mats W J van Es
- Oxford Centre for Human Brain Activity, Department of Psychiatry, University of Oxford, OX3 7JX, Oxford, UK; Wellcome Centre for Integrative Neuroimaging, OX3 9DU, Oxford, UK
- Oiwi Parker Jones
- Wellcome Centre for Integrative Neuroimaging, OX3 9DU, Oxford, UK; Department of Engineering Science, University of Oxford, OX1 3PJ, Oxford, UK; Jesus College, OX1 3DW, Oxford, UK
- Mark Woolrich
- Oxford Centre for Human Brain Activity, Department of Psychiatry, University of Oxford, OX3 7JX, Oxford, UK; Wellcome Centre for Integrative Neuroimaging, OX3 9DU, Oxford, UK

33
Barbieri R, Töpfer FM, Soch J, Bogler C, Sprekeler H, Haynes JD. Encoding of continuous perceptual choices in human early visual cortex. Front Hum Neurosci 2023; 17:1277539. [PMID: 38021249 PMCID: PMC10679739 DOI: 10.3389/fnhum.2023.1277539] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/14/2023] [Accepted: 10/25/2023] [Indexed: 12/01/2023] Open
Abstract
Introduction: Research on the neural mechanisms of perceptual decision-making has typically focused on simple categorical choices, say between two alternative motion directions. Studies on such discrete alternatives have often suggested that choices are encoded either in a motor-based or in an abstract, categorical format in regions beyond sensory cortex. Methods: In this study, we used motion stimuli that could vary anywhere between 0° and 360° to assess how the brain encodes choices for features that span the full sensory continuum. We employed a combination of neuroimaging and encoding models based on Gaussian process regression to assess how either stimuli or choices were encoded in brain responses. Results: We found that single-voxel tuning patterns could be used to reconstruct the trial-by-trial physical direction of motion as well as the participants' continuous choices. Importantly, these continuous choice signals were primarily observed in early visual areas. The tuning properties in this region generalized between choice encoding and stimulus encoding, even for reports that reflected pure guessing. Discussion: We found only little information related to the decision outcome in regions beyond visual cortex, such as parietal cortex, possibly because our task did not involve differential motor preparation. This could suggest that decisions for continuous stimuli can take place already in sensory brain regions, potentially using mechanisms similar to the sensory recruitment observed in visual working memory.
Affiliation(s)
- Riccardo Barbieri
- Bernstein Center for Computational Neuroscience and Berlin Center for Advanced Neuroimaging, Department of Neurology, Charité – Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin, Humboldt-Universität zu Berlin and Berlin Institute of Health (BIH), Berlin, Germany
- Felix M. Töpfer
- Bernstein Center for Computational Neuroscience and Berlin Center for Advanced Neuroimaging, Department of Neurology, Charité – Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin, Humboldt-Universität zu Berlin and Berlin Institute of Health (BIH), Berlin, Germany
- Joram Soch
- Bernstein Center for Computational Neuroscience and Berlin Center for Advanced Neuroimaging, Department of Neurology, Charité – Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin, Humboldt-Universität zu Berlin and Berlin Institute of Health (BIH), Berlin, Germany
- German Center for Neurodegenerative Diseases, Göttingen, Germany
- Carsten Bogler
- Bernstein Center for Computational Neuroscience and Berlin Center for Advanced Neuroimaging, Department of Neurology, Charité – Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin, Humboldt-Universität zu Berlin and Berlin Institute of Health (BIH), Berlin, Germany
- Henning Sprekeler
- Department for Electrical Engineering and Computer Science, Technische Universität Berlin, Berlin, Germany
- John-Dylan Haynes
- Bernstein Center for Computational Neuroscience and Berlin Center for Advanced Neuroimaging, Department of Neurology, Charité – Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin, Humboldt-Universität zu Berlin and Berlin Institute of Health (BIH), Berlin, Germany
- Berlin School of Mind and Brain and Institute of Psychology, Humboldt-Universität zu Berlin, Berlin, Germany

34
Finn ES, Poldrack RA, Shine JM. Functional neuroimaging as a catalyst for integrated neuroscience. Nature 2023; 623:263-273. [PMID: 37938706 DOI: 10.1038/s41586-023-06670-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/16/2023] [Accepted: 09/22/2023] [Indexed: 11/09/2023]
Abstract
Functional magnetic resonance imaging (fMRI) enables non-invasive access to the awake, behaving human brain. By tracking whole-brain signals across a diverse range of cognitive and behavioural states or mapping differences associated with specific traits or clinical conditions, fMRI has advanced our understanding of brain function and its links to both normal and atypical behaviour. Despite this headway, progress in human cognitive neuroscience that uses fMRI has been relatively isolated from rapid advances in other subdomains of neuroscience, which themselves are also somewhat siloed from one another. In this Perspective, we argue that fMRI is well-placed to integrate the diverse subfields of systems, cognitive, computational and clinical neuroscience. We first summarize the strengths and weaknesses of fMRI as an imaging tool, then highlight examples of studies that have successfully used fMRI in each subdomain of neuroscience. We then provide a roadmap for the future advances that will be needed to realize this integrative vision. In this way, we hope to demonstrate how fMRI can help usher in a new era of interdisciplinary coherence in neuroscience.
Affiliation(s)
- Emily S Finn
- Department of Psychological and Brain Sciences, Dartmouth College, Dartmouth, NH, USA
- James M Shine
- School of Medical Sciences, University of Sydney, Sydney, New South Wales, Australia

35
Xie Y, Sadeh S. Computational assessment of visual coding across mouse brain areas and behavioural states. Front Comput Neurosci 2023; 17:1269019. [PMID: 37899886 PMCID: PMC10613063 DOI: 10.3389/fncom.2023.1269019] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/28/2023] [Accepted: 09/29/2023] [Indexed: 10/31/2023] Open
Abstract
Introduction: Our brain is bombarded by a diverse range of visual stimuli, which are converted into corresponding neuronal responses and processed throughout the visual system. The neural activity patterns that result from these external stimuli vary depending on the object or scene being observed, but they also change as a result of internal or behavioural states. This raises the question of to what extent it is possible to predict the presented visual stimuli from neural activity across behavioural states, and how this varies in different brain regions. Methods: To address this question, we assessed the computational capacity of decoders to extract visual information in awake behaving mice, by analysing publicly available standardised datasets from the Allen Brain Institute. We evaluated how natural movie frames can be distinguished based on the activity of units recorded in distinct brain regions and under different behavioural states. This analysis revealed the spectrum of visual information present in different brain regions in response to binary and multiclass classification tasks. Results: Visual cortical areas showed the highest classification accuracies, followed by thalamic and midbrain regions, with hippocampal regions showing close-to-chance accuracy. In addition, we found that behavioural variability led to a decrease in decoding accuracy, whereby large behavioural changes between train and test sessions reduced the classification performance of the decoders. A generalised linear model analysis suggested that this deterioration in classification might be due to an independent modulation of neural activity by stimulus and behaviour. Finally, we reconstructed the natural movie frames from optimal linear classifiers, and observed a strong similarity between reconstructed and actual movie frames. However, the similarity was significantly higher when the decoders were trained and tested on sessions with similar behavioural states.
Conclusion Our analysis provides a systematic assessment of visual coding in the mouse brain, and sheds light on the spectrum of visual information present across brain areas and behavioural states.
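The decoding setup described above can be caricatured in a few lines: a nearest-centroid classifier trained to tell two "movie frames" apart from simulated population activity, with half of the units remapped by a change in behavioural state between training and testing. All numbers, the unit model, and the classifier are illustrative assumptions, not the paper's pipeline:

```python
import random

random.seed(0)
N_UNITS = 50  # units 0-24 are purely stimulus-driven; units 25-49 are state-sensitive

def pattern(frame, unit):
    """Idealized tuning: each unit prefers one of the two movie frames."""
    return 1.0 if (unit + frame) % 2 == 0 else 0.2

def population_response(frame, state):
    """Noisy population response; the behavioural state remaps the tuning
    of the state-sensitive half of the population."""
    resp = []
    for u in range(N_UNITS):
        f = frame if u < 25 else (frame + state) % 2
        resp.append(pattern(f, u) + random.gauss(0.0, 0.3))
    return resp

def centroids(trials):
    """Training step of a nearest-centroid decoder: mean response per frame."""
    cents = {}
    for frame in (0, 1):
        rs = [r for f, r in trials if f == frame]
        cents[frame] = [sum(col) / len(rs) for col in zip(*rs)]
    return cents

def decode(resp, cents):
    dist = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b))
    return min(cents, key=lambda f: dist(resp, cents[f]))

def accuracy(train_state, test_state, n=200):
    train = [(f, population_response(f, train_state)) for f in [0, 1] * n]
    cents = centroids(train)
    test = [(f, population_response(f, test_state)) for f in [0, 1] * n]
    return sum(decode(r, cents) == f for f, r in test) / len(test)

same_state = accuracy(0, 0)   # train and test under the same behavioural state
cross_state = accuracy(0, 1)  # a state change intervenes between train and test
```

With matched states the toy decoder is near-perfect; remapping the state-sensitive units drops it to chance, mirroring in miniature the reported effect of train/test behavioural changes.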
Affiliation(s)
- Sadra Sadeh: Department of Brain Sciences, Imperial College London, London, United Kingdom

36
Yuste R. Advocating for neurodata privacy and neurotechnology regulation. Nat Protoc 2023; 18:2869-2875. PMID: 37697107; DOI: 10.1038/s41596-023-00873-0.
Abstract
The ability to record and alter brain activity by using implantable and nonimplantable neural devices, while poised to have significant scientific and clinical benefits, also raises complex ethical concerns. In this Perspective, we raise awareness of the ability of artificial intelligence algorithms and data-aggregation tools to decode and analyze data containing highly sensitive information, jeopardizing personal neuroprivacy. Indeed, voids in existing regulatory frameworks allow unrestricted decoding of, and commerce in, neurodata. We advocate for the implementation of proposed ethical and human rights guidelines, alongside technical options such as data encryption, differential privacy and federated learning, to ensure the protection of neurodata privacy. We further encourage regulatory bodies to take a position of responsibility by categorizing all brain-derived data as sensitive health data and applying existing medical regulations to all data gathered via pre-registered neural devices. Lastly, we propose that a technocratic oath could instill a deontology for neurotechnology practitioners akin to what the Hippocratic oath represents in medicine. A conscientious societal position that thoroughly rejects the misuse of neurodata would provide the moral compass for the future development of the neurotechnology field.
Affiliation(s)
- Rafael Yuste: Neurotechnology Center, Columbia University, New York, NY, USA

37
Meng L, Yang C. Dual-Guided Brain Diffusion Model: Natural Image Reconstruction from Human Visual Stimulus fMRI. Bioengineering (Basel) 2023; 10:1117. PMID: 37892847; PMCID: PMC10604156; DOI: 10.3390/bioengineering10101117.
Abstract
The reconstruction of visual stimuli from fMRI signals, which record brain activity, is a challenging task of crucial research value to neuroscience and machine learning. Previous studies tend to emphasize reconstructing either pixel-level features (contours, colors, etc.) or semantic features (object category) of the stimulus image, but typically these properties are not reconstructed together. In this context, we introduce a novel three-stage visual reconstruction approach called the Dual-guided Brain Diffusion Model (DBDM). Initially, we employ the Very Deep Variational Autoencoder (VDVAE) to reconstruct a coarse image from fMRI data, capturing the underlying details of the original image. Subsequently, the Bootstrapping Language-Image Pre-training (BLIP) model is utilized to provide a semantic annotation for each image. Finally, the image-to-image generation pipeline of the Versatile Diffusion (VD) model is utilized to recover natural images from the fMRI patterns, guided by both visual and semantic information. The experimental results demonstrate that DBDM surpasses previous approaches in both qualitative and quantitative comparisons. In particular, DBDM achieves the best performance in reconstructing the semantic details of the original image; the Inception, CLIP and SwAV distances are 0.611, 0.225 and 0.405, respectively. This confirms the efficacy of our model and its potential to advance visual decoding research.
Affiliation(s)
- Lu Meng: College of Information Science and Engineering, Northeastern University, Shenyang 110819, China

38
Ozcelik F, VanRullen R. Natural scene reconstruction from fMRI signals using generative latent diffusion. Sci Rep 2023; 13:15666. PMID: 37731047; PMCID: PMC10511448; DOI: 10.1038/s41598-023-42891-8.
Abstract
In neural decoding research, one of the most intriguing topics is the reconstruction of perceived natural images based on fMRI signals. Previous studies have succeeded in re-creating different aspects of the visual stimulus, such as low-level properties (shape, texture, layout) or high-level features (category of objects, descriptive semantics of scenes), but have typically failed to reconstruct these properties together for complex scene images. Generative AI has recently made a leap forward with latent diffusion models capable of generating high-complexity images. Here, we investigate how to take advantage of this innovative technology for brain decoding. We present a two-stage scene reconstruction framework called "Brain-Diffuser". In the first stage, starting from fMRI signals, we reconstruct images that capture low-level properties and overall layout using a VDVAE (Very Deep Variational Autoencoder) model. In the second stage, we use the image-to-image framework of a latent diffusion model (Versatile Diffusion), conditioned on predicted multimodal (text and visual) features, to generate the final reconstructed images. On the publicly available Natural Scenes Dataset benchmark, our method outperforms previous models both qualitatively and quantitatively. When applied to synthetic fMRI patterns generated from individual ROI (region-of-interest) masks, our trained model creates compelling "ROI-optimal" scenes consistent with neuroscientific knowledge. Thus, the proposed methodology can have an impact on both applied (e.g. brain-computer interface) and fundamental neuroscience.
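This and the preceding entry share a common backbone: before any generative model runs, latent features of the seen image are predicted from fMRI voxels, typically with a regularized linear map. The sketch below shows that step alone on synthetic data. The choice of ridge regression, the toy dimensions, and the noise level are assumptions for illustration, not details taken from the paper:

```python
import random

random.seed(1)

def solve(A, b):
    """Solve Ax = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def ridge_fit(X, y, lam=1.0):
    """Closed-form ridge: w = (X^T X + lam*I)^-1 X^T y."""
    n, d = len(X), len(X[0])
    XtX = [[sum(X[r][i] * X[r][j] for r in range(n)) + (lam if i == j else 0.0)
            for j in range(d)] for i in range(d)]
    Xty = [sum(X[r][i] * y[r] for r in range(n)) for i in range(d)]
    return solve(XtX, Xty)

# Simulate: one latent feature of the seen image is a linear function of voxels.
V, N_TRAIN, N_TEST = 10, 120, 40
w_true = [random.gauss(0, 1) for _ in range(V)]

def sample(n):
    X = [[random.gauss(0, 1) for _ in range(V)] for _ in range(n)]
    y = [sum(w * x for w, x in zip(w_true, row)) + random.gauss(0, 0.5) for row in X]
    return X, y

Xtr, ytr = sample(N_TRAIN)
Xte, yte = sample(N_TEST)
w_hat = ridge_fit(Xtr, ytr, lam=1.0)
pred = [sum(w * x for w, x in zip(w_hat, row)) for row in Xte]

def corr(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    ca = [x - ma for x in a]
    cb = [x - mb for x in b]
    den = (sum(x * x for x in ca) * sum(y * y for y in cb)) ** 0.5
    return sum(x * y for x, y in zip(ca, cb)) / den

r = corr(pred, yte)  # held-out prediction accuracy of the decoded latent
```

In the real pipelines the regression targets would be VDVAE and multimodal latents, and the fitted map would feed the diffusion stage; here a single scalar latent is enough to show the decode-then-generate split.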
Affiliation(s)
- Furkan Ozcelik: CerCo, CNRS UMR5549, Toulouse, France; Universite de Toulouse, Toulouse, France
- Rufin VanRullen: CerCo, CNRS UMR5549, Toulouse, France; Universite de Toulouse, Toulouse, France; ANITI, Toulouse, France

39
Robinson AK, Quek GL, Carlson TA. Visual Representations: Insights from Neural Decoding. Annu Rev Vis Sci 2023; 9:313-335. PMID: 36889254; DOI: 10.1146/annurev-vision-100120-025301.
Abstract
Patterns of brain activity contain meaningful information about the perceived world. Recent decades have welcomed a new era in neural analyses, with computational techniques from machine learning applied to neural data to decode information represented in the brain. In this article, we review how decoding approaches have advanced our understanding of visual representations and discuss efforts to characterize both the complexity and the behavioral relevance of these representations. We outline the current consensus regarding the spatiotemporal structure of visual representations and review recent findings that suggest that visual representations are at once robust to perturbations, yet sensitive to different mental states. Beyond representations of the physical world, recent decoding work has shone a light on how the brain instantiates internally generated states, for example, during imagery and prediction. Going forward, decoding has remarkable potential to assess the functional relevance of visual representations for human behavior, reveal how representations change across development and during aging, and uncover their presentation in various mental disorders.
Affiliation(s)
- Amanda K Robinson: Queensland Brain Institute, The University of Queensland, Brisbane, Australia
- Genevieve L Quek: The MARCS Institute for Brain, Behaviour and Development, Western Sydney University, Sydney, Australia

40
Vinken K, Prince JS, Konkle T, Livingstone MS. The neural code for "face cells" is not face-specific. Sci Adv 2023; 9:eadg1736. PMID: 37647400; PMCID: PMC10468123; DOI: 10.1126/sciadv.adg1736.
Abstract
Face cells are neurons that respond more to faces than to non-face objects. They are found in clusters in the inferotemporal cortex that are thought to process faces specifically, and they are hence studied almost exclusively using faces. Analyzing neural responses in and around macaque face patches to hundreds of objects, we found graded response profiles for non-face objects that predicted the degree of face selectivity and provided information on face-cell tuning beyond that obtained from actual faces. This relationship between non-face and face responses was not predicted by color and simple shape properties, but rather by information encoded in deep neural networks trained on general object classification rather than face classification. These findings contradict the long-standing assumption that face versus non-face selectivity emerges from face-specific features, and they challenge the practice of focusing on only the most effective stimulus. They provide evidence instead that category-selective neurons are best understood by their tuning directions in a domain-general object space.
Affiliation(s)
- Kasper Vinken: Department of Neurobiology, Harvard Medical School, Boston, MA 02115, USA
- Jacob S. Prince: Department of Psychology, Harvard University, Cambridge, MA 02478, USA
- Talia Konkle: Department of Psychology, Harvard University, Cambridge, MA 02478, USA

41
Du C, Fu K, Li J, He H. Decoding Visual Neural Representations by Multimodal Learning of Brain-Visual-Linguistic Features. IEEE Trans Pattern Anal Mach Intell 2023; 45:10760-10777. PMID: 37030711; DOI: 10.1109/tpami.2023.3263181.
Abstract
Decoding human visual neural representations is a challenging task with great scientific significance in revealing vision-processing mechanisms and developing brain-like intelligent machines. Most existing methods are difficult to generalize to novel categories that have no corresponding neural data for training. The two main reasons are 1) the under-exploitation of the multimodal semantic knowledge underlying the neural data and 2) the small number of paired (stimuli-responses) training data. To overcome these limitations, this paper presents a generic neural decoding method called BraVL that uses multimodal learning of brain-visual-linguistic features. We focus on modeling the relationships between brain, visual and linguistic features via multimodal deep generative models. Specifically, we leverage the mixture-of-product-of-experts formulation to infer a latent code that enables a coherent joint generation of all three modalities. To learn a more consistent joint representation and improve the data efficiency in the case of limited brain activity data, we exploit both intra- and inter-modality mutual information maximization regularization terms. In particular, our BraVL model can be trained under various semi-supervised scenarios to incorporate the visual and textual features obtained from the extra categories. Finally, we construct three trimodal matching datasets, and the extensive experiments lead to some interesting conclusions and cognitive insights: 1) decoding novel visual categories from human brain activity is practically possible with good accuracy; 2) decoding models using the combination of visual and linguistic features perform much better than those using either of them alone; 3) visual perception may be accompanied by linguistic influences to represent the semantics of visual stimuli.
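The product-of-experts fusion underlying the mixture-of-products-of-experts formulation has a convenient closed form for Gaussian experts: precisions add, and the mean is precision-weighted. A one-dimensional illustration of that fusion rule (hand-picked numbers, not the BraVL model itself):

```python
def product_of_experts(means, variances):
    """Fuse 1-D Gaussian experts N(mu_i, var_i): the product density is again
    Gaussian, with summed precisions and a precision-weighted mean."""
    precisions = [1.0 / v for v in variances]
    var = 1.0 / sum(precisions)
    mu = var * sum(m * p for m, p in zip(means, precisions))
    return mu, var

# Three "modalities" (say brain, visual, linguistic) give noisy estimates of
# the same latent value; the confident expert (small variance) dominates.
mu, var = product_of_experts([0.0, 2.0, 1.0], [1.0, 1.0, 0.25])
```

The fused estimate is pulled toward the most confident expert and is more certain than any single one, which is what makes this formulation attractive for inferring a coherent joint latent across brain, visual and linguistic features.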
42
Fan JE, Bainbridge WA, Chamberlain R, Wammes JD. Drawing as a versatile cognitive tool. Nat Rev Psychol 2023; 2:556-568. PMID: 39239312; PMCID: PMC11377027; DOI: 10.1038/s44159-023-00212-w.
Abstract
Drawing is a cognitive tool that makes the invisible contents of mental life visible. Humans use this tool to produce a remarkable variety of pictures, from realistic portraits to schematic diagrams. Despite this variety and the prevalence of drawn images, the psychological mechanisms that enable drawings to be so versatile have yet to be fully explored. In this Review, we synthesize contemporary work in multiple areas of psychology, computer science and neuroscience that examines the cognitive processes involved in drawing production and comprehension. This body of findings suggests that the balance of contributions from perception, memory and social inference during drawing production varies depending on the situation, resulting in some drawings that are more realistic and other drawings that are more abstract. We also consider the use of drawings as a research tool for investigating various aspects of cognition, as well as the role that drawing has in facilitating learning and communication. Taken together, information about how drawings are used in different contexts illuminates the central role of visually grounded abstractions in human thought and behaviour.
Affiliation(s)
- Judith E Fan: Department of Psychology, University of California, San Diego, La Jolla, CA, USA; Department of Psychology, Stanford University, Stanford, CA, USA
- Jeffrey D Wammes: Department of Psychology, Centre for Neuroscience Studies, Queen's University, Kingston, Ontario, Canada

43
Ren Z, Li J, Xue X, Li X, Yang F, Jiao Z, Gao X. Reconstructing controllable faces from brain activity with hierarchical multiview representations. Neural Netw 2023; 166:487-500. PMID: 37574622; DOI: 10.1016/j.neunet.2023.07.016.
Abstract
Reconstructing visual experience from brain responses measured by functional magnetic resonance imaging (fMRI) is a challenging yet important research topic in brain decoding, and it has proved especially difficult to decode visually similar stimuli such as faces. Although facial attributes are known to be key to face recognition, most existing methods largely ignore how to decode facial attributes precisely during perceived-face reconstruction, which often leads to indistinguishable reconstructed faces. To solve this problem, we propose a novel neural decoding framework called VSPnet (voxel2style2pixel) that establishes hierarchical encoding and decoding networks mediated by disentangled latent representations, so as to recover visual stimuli in finer detail. We design a hierarchical visual encoder (HVE) to pre-extract features containing both high-level semantic knowledge and low-level visual details from the stimuli. The proposed VSPnet consists of two networks: a multi-branch cognitive encoder and a style-based image generator. The encoder network comprises multiple linear regression branches that map brain signals into the latent space provided by the pre-extracted visual features, yielding representations whose hierarchical information is consistent with the corresponding stimuli. The generator network, inspired by StyleGAN, untangles the complexity of the fMRI representations and generates images. The HVE network is composed of a standard feature pyramid over a ResNet backbone. Extensive experiments on the latest public datasets demonstrate that the reconstruction accuracy of our proposed method outperforms state-of-the-art approaches and that the identifiability of different reconstructed faces is greatly improved. In particular, we achieve feature editing for several facial attributes in the fMRI domain based on the multiview (i.e., visual stimuli and evoked fMRI) latent representations.
Affiliation(s)
- Ziqi Ren: School of Electronic Engineering, Xidian University, Xi'an 710071, China
- Jie Li: School of Electronic Engineering, Xidian University, Xi'an 710071, China
- Xuetong Xue: School of Electronic Engineering, Xidian University, Xi'an 710071, China
- Xin Li: Group 42 (G42), Abu Dhabi, United Arab Emirates
- Fan Yang: Group 42 (G42), Abu Dhabi, United Arab Emirates
- Zhicheng Jiao: The Warren Alpert Medical School, Brown University, RI, USA; Department of Diagnostic Imaging, Rhode Island Hospital, RI, USA
- Xinbo Gao: School of Electronic Engineering, Xidian University, Xi'an 710071, China

44
Sagar V, Shanahan LK, Zelano CM, Gottfried JA, Kahnt T. High-precision mapping reveals the structure of odor coding in the human brain. Nat Neurosci 2023; 26:1595-1602. PMID: 37620443; PMCID: PMC10726579; DOI: 10.1038/s41593-023-01414-4.
Abstract
Odor perception is inherently subjective. Previous work has shown that odorous molecules evoke distributed activity patterns in olfactory cortices, but how these patterns map on to subjective odor percepts remains unclear. In the present study, we collected neuroimaging responses to 160 odors from 3 individual subjects (18 h per subject) to probe the neural coding scheme underlying idiosyncratic odor perception. We found that activity in the orbitofrontal cortex (OFC) represents the fine-grained perceptual identity of odors over and above coarsely defined percepts, whereas this difference is less pronounced in the piriform cortex (PirC) and amygdala. Furthermore, the implementation of perceptual encoding models enabled us to predict olfactory functional magnetic resonance imaging responses to new odors, revealing that the dimensionality of the encoded perceptual spaces increases from the PirC to the OFC. Whereas encoding of lower-order dimensions generalizes across subjects, encoding of higher-order dimensions is idiosyncratic. These results provide new insights into cortical mechanisms of odor coding and suggest that subjective olfactory percepts reside in the OFC.
Affiliation(s)
- Vivek Sagar: Department of Neurology, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
- Christina M Zelano: Department of Neurology, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
- Jay A Gottfried: Department of Neurology, University of Pennsylvania, Philadelphia, PA, USA; Department of Psychology, University of Pennsylvania, Philadelphia, PA, USA
- Thorsten Kahnt: National Institute on Drug Abuse Intramural Research Program, Baltimore, MD, USA

45
Miao HY, Tong F. Convolutional neural network models of neuronal responses in macaque V1 reveal limited non-linear processing. bioRxiv [Preprint] 2023:2023.08.26.554952. PMID: 37693397; PMCID: PMC10491131; DOI: 10.1101/2023.08.26.554952.
Abstract
Computational models of the primary visual cortex (V1) have suggested that V1 neurons behave like Gabor filters followed by simple non-linearities. However, recent work employing convolutional neural network (CNN) models has suggested that V1 relies on far more non-linear computations than previously thought. Specifically, unit responses in an intermediate layer of VGG-19 were found to best predict macaque V1 responses to thousands of natural and synthetic images. Here, we evaluated the hypothesis that the poor performance of lower-layer units in VGG-19 might be attributable to their small receptive field size rather than to their lack of complexity per se. We compared VGG-19 with AlexNet, which has much larger receptive fields in its lower layers. Whereas the best-performing layer of VGG-19 occurred after seven non-linear steps, the first convolutional layer of AlexNet best predicted V1 responses. Although VGG-19's predictive accuracy was somewhat better than standard AlexNet, we found that a modified version of AlexNet could match VGG-19's performance after only a few non-linear computations. Control analyses revealed that decreasing the size of the input images caused the best-performing layer of VGG-19 to shift to a lower layer, consistent with the hypothesis that the relationship between image size and receptive field size can strongly affect model performance. We conducted additional analyses using a Gabor pyramid model to test for non-linear contributions of normalization and contrast saturation. Overall, our findings suggest that the feedforward responses of V1 neurons can be well explained by assuming only a few non-linear processing stages.
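The receptive-field argument can be made concrete with the standard recurrence for stacked convolution/pooling layers (a textbook formula, not code from the paper; the layer configurations below are the published first-layer hyperparameters of VGG-19 and AlexNet):

```python
# Receptive-field recurrence for a feedforward stack:
#   rf_l = rf_{l-1} + (kernel_l - 1) * jump_{l-1},   jump_l = jump_{l-1} * stride_l

def receptive_field(layers):
    """layers: list of (kernel, stride); returns the RF size after each layer."""
    rf, jump, out = 1, 1, []
    for kernel, stride in layers:
        rf += (kernel - 1) * jump
        jump *= stride
        out.append(rf)
    return out

# First few layers of each architecture, as (kernel, stride):
vgg19 = [(3, 1), (3, 1), (2, 2),  # conv1_1, conv1_2, pool1
         (3, 1), (3, 1), (2, 2),  # conv2_1, conv2_2, pool2
         (3, 1)]                  # conv3_1
alexnet = [(11, 4)]               # conv1

print(receptive_field(vgg19))    # RF grows slowly: 3, 5, 6, 10, 14, 16, 24
print(receptive_field(alexnet))  # a single layer already spans 11 pixels
```

Several VGG-19 layers are needed before its units even span the 11 pixels that AlexNet's first convolution covers in one step, which is why input-image size interacts so strongly with the best-predicting layer.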
Affiliation(s)
- Hui-Yuan Miao: Department of Psychology, Vanderbilt University, Nashville, TN 37240, USA
- Frank Tong: Department of Psychology, Vanderbilt University, Nashville, TN 37240, USA; Vanderbilt Vision Research Center, Vanderbilt University, Nashville, TN 37240, USA

46
Wang C, Yan H, Huang W, Sheng W, Wang Y, Fan YS, Liu T, Zou T, Li R, Chen H. Neural encoding with unsupervised spiking convolutional neural network. Commun Biol 2023; 6:880. PMID: 37640808; PMCID: PMC10462614; DOI: 10.1038/s42003-023-05257-4.
Abstract
Accurately predicting the brain responses to various stimuli poses a significant challenge in neuroscience. Despite recent breakthroughs in neural encoding using convolutional neural networks (CNNs) in fMRI studies, there remain critical gaps between the computational rules of traditional artificial neurons and those of real biological neurons. To address this issue, a spiking CNN (SCNN)-based framework is presented in this study to achieve neural encoding in a more biologically plausible manner. The framework utilizes an unsupervised SCNN to extract visual features of image stimuli and employs a receptive field-based regression algorithm to predict fMRI responses from the SCNN features. Experimental results on handwritten characters, handwritten digits and natural images demonstrate that the proposed approach achieves remarkably good encoding performance and can be utilized for "brain reading" tasks such as image reconstruction and identification. This work suggests that SNNs can serve as a promising tool for neural encoding.
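A minimal spiking unit of the kind SCNNs build on is the leaky integrate-and-fire (LIF) neuron, shown here as a generic illustration (Euler integration, unit membrane resistance; all parameter values are arbitrary, and nothing below is the authors' architecture):

```python
def lif_spikes(current, tau=20.0, v_rest=0.0, v_th=1.0, dt=1.0):
    """Euler-integrate a LIF neuron driven by an input current time series;
    return the time indices at which the membrane potential crosses threshold."""
    v, spikes = v_rest, []
    for t, i_t in enumerate(current):
        v += dt * (-(v - v_rest) + i_t) / tau  # leaky integration (R = 1)
        if v >= v_th:                          # threshold crossing
            spikes.append(t)
            v = v_rest                         # reset after the spike
    return spikes

weak = lif_spikes([0.9] * 200)    # subthreshold drive: v saturates near 0.9
strong = lif_spikes([1.5] * 200)  # suprathreshold drive: regular spiking
```

Constant subthreshold input produces no spikes, while suprathreshold input yields regular firing; stacking convolutional layers of such units, trained with spike-based learning rules, gives the kind of unsupervised feature extractor the abstract describes.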
Affiliation(s)
- Chong Wang: The Center of Psychosomatic Medicine, Sichuan Provincial Center for Mental Health, Sichuan Provincial People's Hospital, University of Electronic Science and Technology of China, Chengdu 611731, China; School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China; MOE Key Lab for Neuroinformation, High-Field Magnetic Resonance Brain Imaging Key Laboratory of Sichuan Province, University of Electronic Science and Technology of China, Chengdu 610054, China
- Hongmei Yan: School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China; MOE Key Lab for Neuroinformation, High-Field Magnetic Resonance Brain Imaging Key Laboratory of Sichuan Province, University of Electronic Science and Technology of China, Chengdu 610054, China
- Wei Huang: School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China; MOE Key Lab for Neuroinformation, High-Field Magnetic Resonance Brain Imaging Key Laboratory of Sichuan Province, University of Electronic Science and Technology of China, Chengdu 610054, China
- Wei Sheng: School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China; MOE Key Lab for Neuroinformation, High-Field Magnetic Resonance Brain Imaging Key Laboratory of Sichuan Province, University of Electronic Science and Technology of China, Chengdu 610054, China
- Yuting Wang: School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China; MOE Key Lab for Neuroinformation, High-Field Magnetic Resonance Brain Imaging Key Laboratory of Sichuan Province, University of Electronic Science and Technology of China, Chengdu 610054, China
- Yun-Shuang Fan: School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China; MOE Key Lab for Neuroinformation, High-Field Magnetic Resonance Brain Imaging Key Laboratory of Sichuan Province, University of Electronic Science and Technology of China, Chengdu 610054, China
- Tao Liu: School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Ting Zou: School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Rong Li: School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China; MOE Key Lab for Neuroinformation, High-Field Magnetic Resonance Brain Imaging Key Laboratory of Sichuan Province, University of Electronic Science and Technology of China, Chengdu 610054, China
- Huafu Chen: The Center of Psychosomatic Medicine, Sichuan Provincial Center for Mental Health, Sichuan Provincial People's Hospital, University of Electronic Science and Technology of China, Chengdu 611731, China; School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China; MOE Key Lab for Neuroinformation, High-Field Magnetic Resonance Brain Imaging Key Laboratory of Sichuan Province, University of Electronic Science and Technology of China, Chengdu 610054, China

47
Schütt HH, Kipnis AD, Diedrichsen J, Kriegeskorte N. Statistical inference on representational geometries. eLife 2023; 12:e82566. PMID: 37610302; PMCID: PMC10446828; DOI: 10.7554/elife.82566.
Abstract
Neuroscience has recently made much progress, expanding the complexity of both neural activity measurements and brain-computational models. However, we lack robust methods for connecting theory and experiment by evaluating our new big models with our new big data. Here, we introduce new inference methods enabling researchers to evaluate and compare models based on the accuracy of their predictions of representational geometries: A good model should accurately predict the distances among the neural population representations (e.g. of a set of stimuli). Our inference methods combine novel 2-factor extensions of crossvalidation (to prevent overfitting to either subjects or conditions from inflating our estimates of model accuracy) and bootstrapping (to enable inferential model comparison with simultaneous generalization to both new subjects and new conditions). We validate the inference methods on data where the ground-truth model is known, by simulating data with deep neural networks and by resampling of calcium-imaging and functional MRI data. Results demonstrate that the methods are valid and conclusions generalize correctly. These data analysis methods are available in an open-source Python toolbox (rsatoolbox.readthedocs.io).
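The core inference move, evaluating a model's predicted representational geometry against data RDMs while bootstrapping over both subjects and conditions, can be sketched in miniature. Everything below (the scalar noise model, plain Pearson RDM comparison, bootstrap median) is a simplified stand-in for the paper's far more careful crossvalidated estimators:

```python
import random

random.seed(2)
N_COND, N_SUBJ, DIM = 8, 10, 4

# Ground-truth condition patterns, and noisy per-subject measurements of them.
truth = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(N_COND)]
subjects = [[[x + random.gauss(0, 0.5) for x in cond] for cond in truth]
            for _ in range(N_SUBJ)]

def rdm(patterns):
    """Flattened upper triangle of Euclidean distances among condition patterns."""
    return [sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5
            for i, p in enumerate(patterns) for q in patterns[i + 1:]]

def corr(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    ca = [x - ma for x in a]
    cb = [x - mb for x in b]
    den = (sum(x * x for x in ca) * sum(y * y for y in cb)) ** 0.5
    return sum(x * y for x, y in zip(ca, cb)) / den if den else 0.0

def score(model, subj_idx, cond_idx):
    """Mean model-data RDM correlation over resampled subjects and conditions."""
    m = rdm([model[c] for c in cond_idx])
    return sum(corr(m, rdm([subjects[s][c] for c in cond_idx]))
               for s in subj_idx) / len(subj_idx)

def bootstrap_median(model, n_boot=200):
    """Two-factor bootstrap: resample BOTH subjects and conditions with replacement."""
    scores = []
    for _ in range(n_boot):
        s_idx = [random.randrange(N_SUBJ) for _ in range(N_SUBJ)]
        c_idx = [random.randrange(N_COND) for _ in range(N_COND)]
        scores.append(score(model, s_idx, c_idx))
    scores.sort()
    return scores[n_boot // 2]

good = bootstrap_median(truth)                 # the generating model
bad = bootstrap_median(list(reversed(truth)))  # a mismatched candidate
```

Resampling both factors is what lets the resulting interval speak to generalization over new subjects and new conditions simultaneously, the paper's central point.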
Affiliation(s)
- Heiko H Schütt: Zuckerman Institute, Columbia University, New York, United States

48
Gong Z, Zhou M, Dai Y, Wen Y, Liu Y, Zhen Z. A large-scale fMRI dataset for the visual processing of naturalistic scenes. Sci Data 2023; 10:559. PMID: 37612327; PMCID: PMC10447576; DOI: 10.1038/s41597-023-02471-x.
Abstract
One ultimate goal of visual neuroscience is to understand how the brain processes visual stimuli encountered in the natural environment. Achieving this goal requires records of brain responses under massive amounts of naturalistic stimuli. Although the scientific community has put considerable effort into collecting large-scale functional magnetic resonance imaging (fMRI) data under naturalistic stimuli, more naturalistic fMRI datasets are still urgently needed. We present here the Natural Object Dataset (NOD), a large-scale fMRI dataset containing responses to 57,120 naturalistic images from 30 participants. NOD strives for a balance between sampling variation across individuals and sampling variation across stimuli. This enables NOD to be used not only to determine whether an observation generalizes across many individuals, but also to test whether a response pattern generalizes to a variety of naturalistic stimuli. We anticipate that NOD, together with existing naturalistic neuroimaging datasets, will serve as a new impetus for our understanding of the visual processing of naturalistic stimuli.
Collapse
Affiliation(s)
- Zhengxin Gong: Beijing Key Laboratory of Applied Experimental Psychology, Faculty of Psychology, Beijing Normal University, Beijing, 100875, China
- Ming Zhou: State Key Laboratory of Cognitive Neuroscience and Learning & IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing, 100875, China
- Yuxuan Dai: Beijing Key Laboratory of Applied Experimental Psychology, Faculty of Psychology, Beijing Normal University, Beijing, 100875, China
- Yushan Wen: Beijing Key Laboratory of Applied Experimental Psychology, Faculty of Psychology, Beijing Normal University, Beijing, 100875, China
- Youyi Liu: State Key Laboratory of Cognitive Neuroscience and Learning & IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing, 100875, China
- Zonglei Zhen: Beijing Key Laboratory of Applied Experimental Psychology, Faculty of Psychology, and State Key Laboratory of Cognitive Neuroscience and Learning & IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing, 100875, China
49
LeBel A, Wagner L, Jain S, Adhikari-Desai A, Gupta B, Morgenthal A, Tang J, Xu L, Huth AG. A natural language fMRI dataset for voxelwise encoding models. Sci Data 2023; 10:555. [PMID: 37612332 PMCID: PMC10447563 DOI: 10.1038/s41597-023-02437-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/26/2022] [Accepted: 08/02/2023] [Indexed: 08/25/2023] Open
Abstract
Speech comprehension is a complex process that draws on humans' abilities to extract lexical information, parse syntax, and form semantic understanding. These sub-processes have traditionally been studied using separate neuroimaging experiments that attempt to isolate specific effects of interest. More recently, it has become possible to study all stages of language comprehension in a single neuroimaging experiment using narrative natural language stimuli. The resulting data are richly varied at every level, enabling analyses that can probe everything from spectral representations to high-level representations of semantic meaning. We provide a dataset containing BOLD fMRI responses recorded while 8 participants each listened to 27 complete, natural, narrative stories (~6 hours). The dataset includes pre-processed and raw MRIs, as well as hand-constructed 3D cortical surfaces for each participant. To address the challenges of analyzing naturalistic data, the dataset is accompanied by a Python library containing basic code for creating voxelwise encoding models. Altogether, this dataset provides a large and novel resource for understanding speech and language processing in the human brain.
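A voxelwise encoding model of the kind mentioned here typically regresses each voxel's time course onto stimulus features (for example, word embeddings of the narrated story) and scores held-out prediction accuracy per voxel. The sketch below is a generic closed-form ridge version, not the API of the dataset's accompanying library:

```python
import numpy as np

def fit_ridge(X, Y, alpha=1.0):
    """Closed-form ridge regression: one weight vector per voxel (column of Y)."""
    n_feat = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_feat), X.T @ Y)

def voxelwise_score(X_train, Y_train, X_test, Y_test, alpha=1.0):
    """Per-voxel Pearson correlation between predicted and held-out responses."""
    W = fit_ridge(X_train, Y_train, alpha)
    pred = X_test @ W
    # z-score each voxel's predicted and measured time course, then average
    # the elementwise product over time, which equals the Pearson r
    pz = (pred - pred.mean(0)) / pred.std(0)
    yz = (Y_test - Y_test.mean(0)) / Y_test.std(0)
    return (pz * yz).mean(0)
```

In practice the regularization strength is chosen per voxel by cross-validation, and stimulus features are delayed by several TRs to account for the hemodynamic response.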
Affiliation(s)
- Amanda LeBel: Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, CA, 94704, USA
- Lauren Wagner: Department of Psychiatry and Biobehavioral Sciences, University of California, Los Angeles, CA, 90095, USA
- Shailee Jain: Department of Computer Science, The University of Texas at Austin, Austin, TX, 78712, USA
- Aneesh Adhikari-Desai: Departments of Computer Science and Neuroscience, The University of Texas at Austin, Austin, TX, 78712, USA
- Bhavin Gupta: Department of Computer Science, The University of Texas at Austin, Austin, TX, 78712, USA
- Allyson Morgenthal: Department of Neuroscience, The University of Texas at Austin, Austin, TX, 78712, USA
- Jerry Tang: Department of Computer Science, The University of Texas at Austin, Austin, TX, 78712, USA
- Lixiang Xu: Department of Physics, The University of Texas at Austin, Austin, TX, 78712, USA
- Alexander G Huth: Departments of Computer Science and Neuroscience, The University of Texas at Austin, Austin, TX, 78712, USA
50
Wang Y, Lee H, Kuhl BA. Mapping multidimensional content representations to neural and behavioral expressions of episodic memory. Neuroimage 2023; 277:120222. [PMID: 37327954 PMCID: PMC10424734 DOI: 10.1016/j.neuroimage.2023.120222] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/06/2023] [Revised: 06/06/2023] [Accepted: 06/08/2023] [Indexed: 06/18/2023] Open
Abstract
Human neuroimaging studies have shown that the contents of episodic memories are represented in distributed patterns of neural activity. However, these studies have mostly been limited to decoding simple, unidimensional properties of stimuli. Semantic encoding models, in contrast, offer a means for characterizing the rich, multidimensional information that comprises episodic memories. Here, we extensively sampled four human fMRI subjects to build semantic encoding models and then applied these models to reconstruct content from natural scene images as they were viewed and recalled from memory. First, we found that multidimensional semantic information was successfully reconstructed from activity patterns across visual and lateral parietal cortices, both when viewing scenes and when recalling them from memory. Second, whereas visual cortical reconstructions were much more accurate when images were viewed versus recalled from memory, lateral parietal reconstructions were comparably accurate across visual perception and memory. Third, by applying natural language processing methods to verbal recall data, we showed that fMRI-based reconstructions reliably matched subjects' verbal descriptions of their memories. In fact, reconstructions from ventral temporal cortex more closely matched subjects' own verbal recall than other subjects' verbal recall of the same images. Fourth, encoding models reliably transferred across subjects: memories were successfully reconstructed using encoding models trained on data from entirely independent subjects. Together, these findings provide evidence for successful reconstructions of multidimensional and idiosyncratic memory representations and highlight the differential sensitivity of visual cortical and lateral parietal regions to information derived from the external visual environment versus internally generated memories.
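Matching fMRI-based reconstructions to verbal recall, as described here, amounts to comparing two sets of semantic vectors and asking whether each reconstruction is closest to its own description. A minimal identification-accuracy sketch (function name and vector layout are illustrative assumptions, not the paper's pipeline):

```python
import numpy as np

def identification_accuracy(recon, targets):
    """Fraction of reconstructions whose nearest target (by cosine
    similarity) is their own paired target.

    recon, targets: arrays of shape (n_items, n_dims), e.g. semantic
    embeddings of reconstructions and of verbal recall transcripts.
    """
    r = recon / np.linalg.norm(recon, axis=1, keepdims=True)
    t = targets / np.linalg.norm(targets, axis=1, keepdims=True)
    sim = r @ t.T  # (n_items, n_items) matrix of cosine similarities
    return float(np.mean(sim.argmax(axis=1) == np.arange(len(recon))))
```

The same matrix supports the paper's stronger claim: comparing the diagonal (own recall) against off-diagonal entries (other subjects' recall of the same images) tests whether reconstructions are idiosyncratic rather than merely image-specific.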
Affiliation(s)
- Yingying Wang: Department of Psychology and Behavioral Sciences, Zhejiang University, Hangzhou 310028, China; Department of Psychology, University of Oregon, Eugene, OR 97403, USA
- Hongmi Lee: Department of Psychological and Brain Sciences, Johns Hopkins University, Baltimore, MD 21218, USA
- Brice A Kuhl: Department of Psychology, University of Oregon, Eugene, OR 97403, USA