1
Meng L, Tang Z, Liu Y. Reconstruction of natural images from human fMRI using a three-stage multi-level deep fusion model. J Neurosci Methods 2024; 411:110269. [PMID: 39222796 DOI: 10.1016/j.jneumeth.2024.110269] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/18/2024] [Revised: 06/28/2024] [Accepted: 08/25/2024] [Indexed: 09/04/2024]
Abstract
BACKGROUND: Image reconstruction is a critical task in brain decoding research, primarily utilizing functional magnetic resonance imaging (fMRI) data. However, due to challenges such as limited samples in fMRI data, the quality of reconstruction results often remains poor. NEW METHOD: We proposed a three-stage multi-level deep fusion model (TS-ML-DFM). The model employed a three-stage training process, encompassing components such as image encoders, generators, discriminators, and fMRI encoders. In this method, we incorporated distinct supplementary features derived separately from depth images and original images. Additionally, the method integrated several components, including a random shift module, a dual attention module, and a multi-level feature fusion module. RESULTS: In both qualitative and quantitative comparisons on the Horikawa17 and VanGerven10 datasets, our method exhibited excellent performance. COMPARISON WITH EXISTING METHODS: On the primary Horikawa17 dataset, our method was compared with other leading methods on the following metrics: average hash value, histogram similarity, mutual information, structural similarity accuracy, AlexNet(2), AlexNet(5), and pairwise human perceptual similarity accuracy. Compared to the second-ranked result for each metric, the proposed method achieved improvements of 0.99%, 3.62%, 3.73%, 2.45%, 3.51%, 0.62%, and 1.03%, respectively. On the SwAV top-level semantic metric, it achieved a substantial improvement of 10.53% over the second-ranked result among pixel-level reconstruction methods. CONCLUSIONS: The TS-ML-DFM method proposed in this study, when applied to decoding brain visual patterns using fMRI data, outperformed previous algorithms, thereby facilitating further advances in research within this field.
Affiliation(s)
- Lu Meng
- School of Information Science and Engineering, Northeastern University, Shenyang 110819, China
- Zhenxuan Tang
- School of Information Science and Engineering, Northeastern University, Shenyang 110819, China
2
Ferrante M, Boccato T, Passamonti L, Toschi N. Retrieving and reconstructing conceptually similar images from fMRI with latent diffusion models and a neuro-inspired brain decoding model. J Neural Eng 2024; 21:046001. [PMID: 38885689 DOI: 10.1088/1741-2552/ad593c] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/03/2023] [Accepted: 06/17/2024] [Indexed: 06/20/2024]
Abstract
Objective. Brain decoding is a field of computational neuroscience that aims to infer mental states or internal representations of perceptual inputs from measurable brain activity. This study proposes a novel approach to brain decoding that relies on semantic and contextual similarity. Approach. We use several functional magnetic resonance imaging (fMRI) datasets of natural images as stimuli and create a deep learning decoding pipeline inspired by the bottom-up and top-down processes in human vision. Our pipeline includes a linear brain-to-feature model that maps fMRI activity to semantic visual stimuli features. We assume that the brain projects visual information onto a space that is homeomorphic to the latent space of the last layer of a pretrained neural network, which summarizes and highlights similarities and differences between concepts. These features are categorized in the latent space using a nearest-neighbor strategy, and the results are used to retrieve images or condition a generative latent diffusion model to create novel images. Main results. We demonstrate semantic classification and image retrieval on three different fMRI datasets: Generic Object Decoding (vision perception and imagination), BOLD5000, and NSD. In all cases, a simple mapping between fMRI and a deep semantic representation of the visual stimulus resulted in meaningful classification and retrieved or generated images. We assessed quality using quantitative metrics and a human evaluation experiment that reproduces the multiplicity of conscious and unconscious criteria that humans use to evaluate image similarity. Our method achieved correct evaluation in over 80% of the test set. Significance. Our study proposes a novel approach to brain decoding that relies on semantic and contextual similarity. The results demonstrate that measurable neural correlates can be linearly mapped onto the latent space of a neural network to synthesize images that match the original content. These findings have implications for both cognitive neuroscience and artificial intelligence.
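The two steps named in this abstract, a linear brain-to-feature map followed by nearest-neighbor categorization in a latent space, can be illustrated with a toy sketch. Everything concrete here is an assumption made for illustration: the weights, the two-dimensional "latent space", the concept prototypes, and the function names `linear_map` and `nearest_concept` are hypothetical, and cosine similarity is one plausible choice of distance, not necessarily the one the authors used.

```python
import math

def linear_map(weights, bias, voxels):
    """Apply a pre-fitted (here: made-up) linear brain-to-feature model."""
    return [sum(w * v for w, v in zip(row, voxels)) + b
            for row, b in zip(weights, bias)]

def cosine(a, b):
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def nearest_concept(features, prototypes):
    """Return the label whose prototype is most similar to the predicted features."""
    return max(prototypes, key=lambda label: cosine(features, prototypes[label]))

# Toy data (hypothetical): 3 voxels mapped to a 2-D latent space.
weights = [[1.0, 0.0, 0.5],
           [0.0, 1.0, -0.5]]
bias = [0.0, 0.0]
prototypes = {"dog": [1.0, 0.1], "car": [0.1, 1.0]}

voxels = [0.9, 0.2, 0.1]
predicted = linear_map(weights, bias, voxels)
label = nearest_concept(predicted, prototypes)
print(label)
```

In a real pipeline the selected concept (or the predicted feature vector itself) would then drive image retrieval or condition a generative model, as the abstract describes; that stage is omitted here.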
Affiliation(s)
- Matteo Ferrante
- Department of Biomedicine and Prevention, University of Rome, Tor Vergata, Rome, Italy
- Tommaso Boccato
- Department of Biomedicine and Prevention, University of Rome, Tor Vergata, Rome, Italy
- Luca Passamonti
- CNR, Istituto di Bioimmagini e Fisiologia Molecolare, Milan, Italy
- Nicola Toschi
- Department of Biomedicine and Prevention, University of Rome, Tor Vergata, Rome, Italy
- Martinos Center for Biomedical Imaging, MGH and Harvard Medical School, Boston, MA, United States of America
3
Caplette L, Turk-Browne NB. Computational reconstruction of mental representations using human behavior. Nat Commun 2024; 15:4183. [PMID: 38760341 PMCID: PMC11101448 DOI: 10.1038/s41467-024-48114-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/23/2023] [Accepted: 04/19/2024] [Indexed: 05/19/2024] Open
Abstract
Revealing how the mind represents information is a longstanding goal of cognitive science. However, there is currently no framework for reconstructing the broad range of mental representations that humans possess. Here, we ask participants to indicate what they perceive in images made of random visual features in a deep neural network. We then infer associations between the semantic features of their responses and the visual features of the images. This allows us to reconstruct the mental representations of multiple visual concepts, both those supplied by participants and other concepts extrapolated from the same semantic space. We validate these reconstructions in separate participants and further generalize our approach to predict behavior for new stimuli and in a new task. Finally, we reconstruct the mental representations of individual observers and of a neural network. This framework enables a large-scale investigation of conceptual representations.
Affiliation(s)
- Nicholas B Turk-Browne
- Department of Psychology, Yale University, New Haven, CT, USA
- Wu Tsai Institute, Yale University, New Haven, CT, USA
4
Dado T, Papale P, Lozano A, Le L, Wang F, van Gerven M, Roelfsema P, Güçlütürk Y, Güçlü U. Brain2GAN: Feature-disentangled neural encoding and decoding of visual perception in the primate brain. PLoS Comput Biol 2024; 20:e1012058. [PMID: 38709818 PMCID: PMC11098503 DOI: 10.1371/journal.pcbi.1012058] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/10/2023] [Revised: 05/16/2024] [Accepted: 04/08/2024] [Indexed: 05/08/2024] Open
Abstract
A challenging goal of neural coding is to characterize the neural representations underlying visual perception. To this end, multi-unit activity (MUA) of macaque visual cortex was recorded in a passive fixation task upon presentation of faces and natural images. We analyzed the relationship between MUA and latent representations of state-of-the-art deep generative models, including the conventional and feature-disentangled representations of generative adversarial networks (GANs) (i.e., z- and w-latents of StyleGAN, respectively) and language-contrastive representations of latent diffusion networks (i.e., CLIP-latents of Stable Diffusion). A mass univariate neural encoding analysis of the latent representations showed that feature-disentangled w representations outperform both z and CLIP representations in explaining neural responses. Further, w-latent features were found to be positioned at the higher end of the complexity gradient which indicates that they capture visual information relevant to high-level neural activity. Subsequently, a multivariate neural decoding analysis of the feature-disentangled representations resulted in state-of-the-art spatiotemporal reconstructions of visual perception. Taken together, our results not only highlight the important role of feature-disentanglement in shaping high-level neural representations underlying visual perception but also serve as an important benchmark for the future of neural coding.
Affiliation(s)
- Thirza Dado
- Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, Netherlands
- Paolo Papale
- Department of Vision and Cognition, Netherlands Institute for Neuroscience, Amsterdam, Netherlands
- Antonio Lozano
- Department of Vision and Cognition, Netherlands Institute for Neuroscience, Amsterdam, Netherlands
- Lynn Le
- Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, Netherlands
- Feng Wang
- Department of Vision and Cognition, Netherlands Institute for Neuroscience, Amsterdam, Netherlands
- Marcel van Gerven
- Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, Netherlands
- Pieter Roelfsema
- Department of Vision and Cognition, Netherlands Institute for Neuroscience, Amsterdam, Netherlands
- Laboratory of Visual Brain Therapy, Sorbonne University, Paris, France
- Department of Integrative Neurophysiology, VU Amsterdam, Amsterdam, Netherlands
- Department of Psychiatry, Amsterdam UMC, Amsterdam, Netherlands
- Yağmur Güçlütürk
- Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, Netherlands
- Umut Güçlü
- Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, Netherlands
5
Koide-Majima N, Nishimoto S, Majima K. Mental image reconstruction from human brain activity: Neural decoding of mental imagery via deep neural network-based Bayesian estimation. Neural Netw 2024; 170:349-363. [PMID: 38016230 DOI: 10.1016/j.neunet.2023.11.024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/10/2023] [Revised: 09/22/2023] [Accepted: 11/08/2023] [Indexed: 11/30/2023]
Abstract
Visual images observed by humans can be reconstructed from their brain activity. However, the visualization (externalization) of mental imagery is challenging. Only a few studies have reported successful visualization of mental imagery, and their visualizable images have been limited to specific domains such as human faces or alphabetical letters. Therefore, visualizing mental imagery for arbitrary natural images stands as a significant milestone. In this study, we achieved this by enhancing a previous method. Specifically, we demonstrated that the visual image reconstruction method proposed in the seminal study by Shen et al. (2019) heavily relied on low-level visual information decoded from the brain and could not efficiently utilize the semantic information that would be recruited during mental imagery. To address this limitation, we extended the previous method to a Bayesian estimation framework and introduced the assistance of semantic information into it. Our proposed framework successfully reconstructed both seen images (i.e., those observed by the human eye) and imagined images from brain activity. Quantitative evaluation showed that our framework could identify seen and imagined images highly accurately compared to the chance accuracy (seen: 90.7%, imagery: 75.6%, chance accuracy: 50.0%). In contrast, the previous method could only identify seen images (seen: 64.3%, imagery: 50.4%). These results suggest that our framework would provide a unique tool for directly investigating the subjective contents of the brain such as illusions, hallucinations, and dreams.
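The reported accuracies are stated against a chance level of 50.0%, which is what a two-alternative (pairwise) identification test yields. The abstract does not spell out the procedure, so the following is only a generic sketch of such a test under assumed conventions: images are flat feature vectors, similarity is Pearson correlation, and a trial counts as correct when a reconstruction correlates more strongly with its own target than with a distractor. The function names and toy vectors are hypothetical.

```python
def pearson(a, b):
    """Pearson correlation of two equal-length sequences; 0.0 if either is constant."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5 if var_a and var_b else 0.0

def pairwise_identification_accuracy(reconstructions, targets):
    """Fraction of (reconstruction, distractor) pairs in which the reconstruction
    is more similar to its true target than to the distractor."""
    wins = trials = 0
    for i, rec in enumerate(reconstructions):
        own = pearson(rec, targets[i])
        for j, distractor in enumerate(targets):
            if i == j:
                continue
            trials += 1
            if own > pearson(rec, distractor):
                wins += 1
    return wins / trials

# Toy feature vectors (hypothetical): each reconstruction is a noisy copy of its target.
targets = [[1, 2, 3, 4], [4, 3, 2, 1], [1, 3, 2, 4]]
reconstructions = [[1.1, 2.0, 2.9, 4.2], [3.9, 3.1, 2.0, 0.8], [1.0, 3.2, 1.9, 4.1]]
accuracy = pairwise_identification_accuracy(reconstructions, targets)
print(accuracy)
```

Under this scheme a decoder that produces reconstructions unrelated to the targets would win about half of the comparisons, which is why 50% serves as the baseline.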
Affiliation(s)
- Naoko Koide-Majima
- Center for Information and Neural Networks (CiNet), National Institute of Information and Communications Technology, Osaka 565-0871, Japan; Graduate School of Frontier Biosciences, Osaka University, Osaka 565-0871, Japan
- Shinji Nishimoto
- Center for Information and Neural Networks (CiNet), National Institute of Information and Communications Technology, Osaka 565-0871, Japan; Graduate School of Frontier Biosciences, Osaka University, Osaka 565-0871, Japan; Graduate School of Medicine, Osaka University, Osaka 565-0871, Japan
- Kei Majima
- Institute for Quantum Life Science, National Institutes for Quantum Science and Technology, Chiba 263-8555, Japan; JST PRESTO, Saitama 332-0012, Japan.
6
Kneeland R, Ojeda J, St-Yves G, Naselaris T. Brain-optimized inference improves reconstructions of fMRI brain activity. ARXIV 2023:arXiv:2312.07705v1. [PMID: 38168454 PMCID: PMC10760191] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/05/2024]
Abstract
The release of large datasets and developments in AI have led to dramatic improvements in decoding methods that reconstruct seen images from human brain activity. We evaluate the prospect of further improving recent decoding methods by optimizing for consistency between reconstructions and brain activity during inference. We sample seed reconstructions from a base decoding method, then iteratively refine these reconstructions using a brain-optimized encoding model that maps images to brain activity. At each iteration, we sample a small library of images from an image distribution (a diffusion model) conditioned on a seed reconstruction from the previous iteration. We select those that best approximate the measured brain activity when passed through our encoding model, and use these images for structural guidance during the generation of the small library in the next iteration. We reduce the stochasticity of the image distribution at each iteration, and stop when a criterion on the "width" of the image distribution is met. We show that when this process is applied to recent decoding methods, it outperforms the base decoding method as measured by human raters, a variety of image feature metrics, and alignment to brain activity. These results demonstrate that reconstruction quality can be significantly improved by explicitly aligning decoding distributions to brain activity distributions, even when the seed reconstruction is output from a state-of-the-art decoding algorithm. Interestingly, the rate of refinement varies systematically across visual cortex, with earlier visual areas generally converging more slowly and preferring narrower image distributions, relative to higher-level brain areas. Brain-optimized inference thus offers a succinct and novel method for improving reconstructions and exploring the diversity of representations across visual brain areas.
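The iterative procedure described in this abstract (sample a small library around the current seed, score each candidate with an encoding model, keep the best match to the measured activity, narrow the distribution, stop at a width criterion) can be sketched with toy stand-ins. This is not the authors' implementation: the "images" are short vectors, `encode` is a made-up linear map standing in for the brain-optimized encoding model, Gaussian perturbation stands in for the diffusion model, and keeping only the single best candidate (plus retaining the current seed in the library) is a simplification introduced here.

```python
import random

def encode(image):
    """Hypothetical encoding model: maps an image vector to predicted brain activity."""
    return [2.0 * x for x in image]

def distance(a, b):
    """Euclidean distance between two activity vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def sample_library(seed_image, width, size, rng):
    """Stand-in for a conditional image distribution: Gaussian noise around the seed."""
    return [[x + rng.gauss(0.0, width) for x in seed_image] for _ in range(size)]

def refine(seed_image, measured, width=1.0, decay=0.7, min_width=0.01, size=20, rng=None):
    """Iteratively refine a seed so that its encoded activity approaches `measured`."""
    rng = rng or random.Random(0)
    best = seed_image
    while width > min_width:  # stopping criterion on the distribution "width"
        # Keeping the current best in the library means the error never increases
        # (an assumption of this sketch, not something the abstract states).
        library = sample_library(best, width, size, rng) + [best]
        best = min(library, key=lambda img: distance(encode(img), measured))
        width *= decay  # reduce stochasticity at each iteration
    return best

# Toy usage: recover a hidden target from its simulated activity.
target = [0.5, -1.0, 2.0]
measured = encode(target)
seed = [0.0, 0.0, 0.0]
refined = refine(seed, measured, rng=random.Random(0))
print(distance(encode(seed), measured), distance(encode(refined), measured))
```

The `decay` and `min_width` parameters are hypothetical knobs; the abstract only states that stochasticity is reduced each iteration and that iteration stops once a width criterion is met.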
Affiliation(s)
- Jordyn Ojeda
- Department of Computer Science, University of Minnesota
7
Meng L, Yang C. Dual-Guided Brain Diffusion Model: Natural Image Reconstruction from Human Visual Stimulus fMRI. Bioengineering (Basel) 2023; 10:1117. [PMID: 37892847 PMCID: PMC10604156 DOI: 10.3390/bioengineering10101117] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/05/2023] [Revised: 09/20/2023] [Accepted: 09/21/2023] [Indexed: 10/29/2023] Open
Abstract
The reconstruction of visual stimuli from fMRI signals, which record brain activity, is a challenging task with crucial research value in the fields of neuroscience and machine learning. Previous studies tend to emphasize reconstructing pixel-level features (contours, colors, etc.) or semantic features (object category) of the stimulus image, but typically, these properties are not reconstructed together. In this context, we introduce a novel three-stage visual reconstruction approach called the Dual-guided Brain Diffusion Model (DBDM). Initially, we employ the Very Deep Variational Autoencoder (VDVAE) to reconstruct a coarse image from fMRI data, capturing the underlying details of the original image. Subsequently, the Bootstrapping Language-Image Pre-training (BLIP) model is utilized to provide a semantic annotation for each image. Finally, the image-to-image generation pipeline of the Versatile Diffusion (VD) model is utilized to recover natural images from the fMRI patterns guided by both visual and semantic information. The experimental results demonstrate that DBDM surpasses previous approaches in both qualitative and quantitative comparisons. In particular, the best performance is achieved by DBDM in reconstructing the semantic details of the original image; the Inception, CLIP and SwAV distances are 0.611, 0.225 and 0.405, respectively. This confirms the efficacy of our model and its potential to advance visual decoding research.
Affiliation(s)
- Lu Meng
- College of Information Science and Engineering, Northeastern University, Shenyang 110819, China
8
Ozcelik F, VanRullen R. Natural scene reconstruction from fMRI signals using generative latent diffusion. Sci Rep 2023; 13:15666. [PMID: 37731047 PMCID: PMC10511448 DOI: 10.1038/s41598-023-42891-8] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 03/23/2023] [Accepted: 09/15/2023] [Indexed: 09/22/2023] Open
Abstract
In neural decoding research, one of the most intriguing topics is the reconstruction of perceived natural images based on fMRI signals. Previous studies have succeeded in re-creating different aspects of the visuals, such as low-level properties (shape, texture, layout) or high-level features (category of objects, descriptive semantics of scenes) but have typically failed to reconstruct these properties together for complex scene images. Generative AI has recently made a leap forward with latent diffusion models capable of generating high-complexity images. Here, we investigate how to take advantage of this innovative technology for brain decoding. We present a two-stage scene reconstruction framework called "Brain-Diffuser". In the first stage, starting from fMRI signals, we reconstruct images that capture low-level properties and overall layout using a VDVAE (Very Deep Variational Autoencoder) model. In the second stage, we use the image-to-image framework of a latent diffusion model (Versatile Diffusion) conditioned on predicted multimodal (text and visual) features, to generate final reconstructed images. On the publicly available Natural Scenes Dataset benchmark, our method outperforms previous models both qualitatively and quantitatively. When applied to synthetic fMRI patterns generated from individual ROI (region-of-interest) masks, our trained model creates compelling "ROI-optimal" scenes consistent with neuroscientific knowledge. Thus, the proposed methodology can have an impact on both applied (e.g. brain-computer interface) and fundamental neuroscience.
Affiliation(s)
- Furkan Ozcelik
- CerCo, CNRS UMR5549, Toulouse, France
- Universite de Toulouse, Toulouse, France
- Rufin VanRullen
- CerCo, CNRS UMR5549, Toulouse, France
- Universite de Toulouse, Toulouse, France
- ANITI, Toulouse, France
9
Wilson H, Golbabaee M, Proulx MJ, Charles S, O'Neill E. EEG-based BCI Dataset of Semantic Concepts for Imagination and Perception Tasks. Sci Data 2023; 10:386. [PMID: 37322034 PMCID: PMC10272218 DOI: 10.1038/s41597-023-02287-9] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 02/14/2023] [Accepted: 06/02/2023] [Indexed: 06/17/2023] Open
Abstract
Electroencephalography (EEG) is a widely-used neuroimaging technique in Brain Computer Interfaces (BCIs) due to its non-invasive nature, accessibility and high temporal resolution. A range of input representations has been explored for BCIs. The same semantic meaning can be conveyed in different representations, such as visual (orthographic and pictorial) and auditory (spoken words). These stimuli representations can be either imagined or perceived by the BCI user. In particular, there is a scarcity of existing open source EEG datasets for imagined visual content, and to our knowledge there are no open source EEG datasets for semantics captured through multiple sensory modalities for both perceived and imagined content. Here we present an open source multisensory imagination and perception dataset, with twelve participants, acquired with a 124 EEG channel system. The aim is for the dataset to be open for purposes such as BCI related decoding and for better understanding the neural mechanisms behind perception, imagination and across the sensory modalities when the semantic category is held constant.
Affiliation(s)
- Holly Wilson
- Department of Computer Science, University of Bath, Bath, BA2 7AY, UK
- Mohammad Golbabaee
- Department of Engineering Mathematics, University of Bristol, Bristol, BS8 1TW, UK
- Stephen Charles
- Department of Computer Science, University of Bath, Bath, BA2 7AY, UK
- Eamonn O'Neill
- Department of Computer Science, University of Bath, Bath, BA2 7AY, UK
10
Li W, Zheng S, Liao Y, Hong R, He C, Chen W, Deng C, Li X. The brain-inspired decoder for natural visual image reconstruction. Front Neurosci 2023; 17:1130606. [PMID: 37205046 PMCID: PMC10185745 DOI: 10.3389/fnins.2023.1130606] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/23/2022] [Accepted: 03/28/2023] [Indexed: 05/21/2023] Open
Abstract
The visual system provides a valuable model for studying the working mechanisms of sensory processing and high-level consciousness. A significant challenge in this field is the reconstruction of images from decoded neural activity, which could not only test the accuracy of our understanding of the visual system but also provide a practical tool for solving real-world problems. Although recent advances in deep learning have improved the decoding of neural spike trains, little attention has been paid to the underlying mechanisms of the visual system. To address this issue, we propose a deep learning neural network architecture that incorporates the biological properties of the visual system, such as receptive fields, to reconstruct visual images from spike trains. Our model outperforms current models and has been evaluated on different datasets from both retinal ganglion cells (RGCs) and the primary visual cortex (V1) neural spikes. Our model demonstrated the great potential of brain-inspired algorithms to solve a challenge that our brain solves.
Affiliation(s)
- Wenyi Li
- Brain Cognition and Brain Disease Institute (BCBDI), Shenzhen-Hong Kong Institute of Brain Science-Shenzhen Fundamental Research Institutions, CAS Key Laboratory of Brain Connectome and Manipulation, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
- University of Chinese Academy of Sciences, Beijing, China
- Shengjie Zheng
- Brain Cognition and Brain Disease Institute (BCBDI), Shenzhen-Hong Kong Institute of Brain Science-Shenzhen Fundamental Research Institutions, CAS Key Laboratory of Brain Connectome and Manipulation, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
- University of Chinese Academy of Sciences, Beijing, China
- Yufan Liao
- Clinical Medicine Institute, Chengdu University of Traditional Chinese Medicine, Chengdu, China
- Rongqi Hong
- Brain Cognition and Brain Disease Institute (BCBDI), Shenzhen-Hong Kong Institute of Brain Science-Shenzhen Fundamental Research Institutions, CAS Key Laboratory of Brain Connectome and Manipulation, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
- Chenggang He
- Brain Cognition and Brain Disease Institute (BCBDI), Shenzhen-Hong Kong Institute of Brain Science-Shenzhen Fundamental Research Institutions, CAS Key Laboratory of Brain Connectome and Manipulation, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
- Illinois Institute of Technology, Chicago, IL, United States
- Weiliang Chen
- Brain Cognition and Brain Disease Institute (BCBDI), Shenzhen-Hong Kong Institute of Brain Science-Shenzhen Fundamental Research Institutions, CAS Key Laboratory of Brain Connectome and Manipulation, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
- University of Chinese Academy of Sciences, Beijing, China
- Chunshan Deng
- Brain Cognition and Brain Disease Institute (BCBDI), Shenzhen-Hong Kong Institute of Brain Science-Shenzhen Fundamental Research Institutions, CAS Key Laboratory of Brain Connectome and Manipulation, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
- Xiaojian Li
- Brain Cognition and Brain Disease Institute (BCBDI), Shenzhen-Hong Kong Institute of Brain Science-Shenzhen Fundamental Research Institutions, CAS Key Laboratory of Brain Connectome and Manipulation, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
- *Correspondence: Xiaojian Li
11
Hou X, Zhao J, Zhang H. Reconstruction of perceived face images from brain activities based on multi-attribute constraints. Front Neurosci 2022; 16:1015752. [PMID: 36389231 PMCID: PMC9643433 DOI: 10.3389/fnins.2022.1015752] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/10/2022] [Accepted: 10/10/2022] [Indexed: 11/24/2022] Open
Abstract
Reconstruction of perceived faces from brain signals is a hot topic in brain decoding and an important application in the field of brain-computer interfaces. Existing methods do not fully consider the multiple facial attributes represented in face images, and the distinct activity patterns these attributes evoke in multiple brain regions are often ignored, which leads to very poor reconstruction performance. In the current study, we propose an algorithmic framework that efficiently combines multiple face-selective brain regions for precise multi-attribute perceived face reconstruction. Our framework consists of three modules: a multi-task deep learning network (MTDLN), which simultaneously extracts the multi-dimensional face features attributed to facial expression, identity and gender from a single face image; a set of linear regressions (LR), which maps the relationship between the multi-dimensional face features and the brain signals from multiple brain regions; and a multi-conditional generative adversarial network (mcGAN), which generates the perceived face images constrained by the predicted multi-dimensional face features. We conduct extensive fMRI experiments to evaluate the reconstruction performance of our framework both subjectively and objectively. The results show that, compared with traditional methods, our proposed framework better characterizes the multi-attribute face features in a face image, better predicts the face features from brain signals, and achieves better reconstruction performance for both seen and unseen face images in both visual effects and quantitative assessment. Moreover, besides the state-of-the-art intra-subject reconstruction performance, our proposed framework can also realize inter-subject face reconstruction to a certain extent.
Affiliation(s)
- Xiaoyuan Hou
- School of Engineering Medicine, Beihang University, Beijing, China
- School of Biological Science and Medical Engineering, Beihang University, Beijing, China
- Jing Zhao
- School of Engineering Medicine, Beihang University, Beijing, China
- School of Biological Science and Medical Engineering, Beihang University, Beijing, China
- Hui Zhang
- School of Engineering Medicine, Beihang University, Beijing, China
- Key Laboratory of Biomechanics and Mechanobiology, Ministry of Education, Beihang University, Beijing, China
- Key Laboratory of Big Data-Based Precision Medicine, Ministry of Industry and Information Technology of the People’s Republic of China, Beihang University, Beijing, China
12
Image Semantic Recognition and Segmentation Algorithm of Colorimetric Sensor Array Based on Deep Convolutional Neural Network. COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE 2022; 2022:2439371. [PMID: 36210987 PMCID: PMC9546663 DOI: 10.1155/2022/2439371] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/27/2022] [Revised: 07/30/2022] [Accepted: 08/06/2022] [Indexed: 11/18/2022]
Abstract
Semantic feature recognition in colour images is required for identifying uneven patterns in object detection and classification. The semantic features are identified by segmenting the colorimetric sensor array features through machine learning paradigms. Semantic segmentation is a method for identifying distinct elements in an image. This can be considered a task involving image classification at the pixel level. This article introduces a semantic feature-dependent array segmentation method (SFASM) to improve recognition accuracy due to irregular semantics. The proposed method incorporates a deep convolutional neural network for detecting the semantic and un-semantic features based on sensor array representations. The colour distributions per array are identified for horizontal and vertical semantics analysis. In this analysis, deep learning classifies the uneven patterns based on colour distribution, i.e. the consecutive and scattered colour distribution pixels in an array are correlated for their similarity. This similarity identification is maximized through max-pooling and recurrent iterations, preventing detection errors. The proposed method classifies the semantic features for further correlation sections, improving the accuracy. The proposed method’s performance is thus validated using the metrics precision, analysis time and F1-Score.