1. Li K, Xie F, Chen H, Yuan K, Hu X. An Audio-Visual Speech Separation Model Inspired by Cortico-Thalamo-Cortical Circuits. IEEE Transactions on Pattern Analysis and Machine Intelligence 2024; 46:6637-6651. [PMID: 38564350] [DOI: 10.1109/tpami.2024.3384034]
Abstract
Audio-visual approaches involving visual inputs have laid the foundation for recent progress in speech separation. However, the optimization of the concurrent usage of auditory and visual inputs is still an active research area. Inspired by the cortico-thalamo-cortical circuit, in which the sensory processing mechanisms of different modalities modulate one another via the non-lemniscal sensory thalamus, we propose a novel cortico-thalamo-cortical neural network (CTCNet) for audio-visual speech separation (AVSS). First, the CTCNet learns hierarchical auditory and visual representations in a bottom-up manner in separate auditory and visual subnetworks, mimicking the functions of the auditory and visual cortical areas. Then, inspired by the large number of connections between cortical regions and the thalamus, the model fuses the auditory and visual information in a thalamic subnetwork through top-down connections. Finally, the model transmits this fused information back to the auditory and visual subnetworks, and the above process is repeated several times. The results of experiments on three speech separation benchmark datasets show that CTCNet remarkably outperforms existing AVSS methods with considerably fewer parameters. These results suggest that mimicking the anatomical connectome of the mammalian brain has great potential for advancing the development of deep neural networks.
2. Shi X, Li B, Wang W, Qin Y, Wang H, Wang X. EEG-VTTCNet: A loss joint training model based on the vision transformer and the temporal convolution network for EEG-based motor imagery classification. Neuroscience 2024; 556:42-51. [PMID: 39103043] [DOI: 10.1016/j.neuroscience.2024.07.051]
Abstract
Brain-computer interface (BCI) is a technology that directly connects signals between the human brain and a computer or other external device. Motor imagery electroencephalographic (MI-EEG) signals are considered a promising paradigm for BCI systems, with a wide range of potential applications in medical rehabilitation, human-computer interaction, and virtual reality. Accurate decoding of MI-EEG signals poses a significant challenge due to issues related to the quality of the collected EEG data and subject variability. Therefore, developing an efficient MI-EEG decoding network is crucial and warrants research. This paper proposes a loss joint training model based on the vision transformer (ViT) and the temporal convolution network (EEG-VTTCNet) to classify MI-EEG signals. To exploit multiple modules together, EEG-VTTCNet adopts a shared-convolution strategy and a dual-branch strategy: the two branches perform complementary learning and jointly train the shared convolutional modules for better performance. We conducted experiments on the BCI Competition IV-2a and IV-2b datasets, and the proposed network outperformed the current state-of-the-art techniques with accuracies of 84.58% and 90.94%, respectively, in the subject-dependent mode. In addition, we used t-SNE to visualize the features extracted by the proposed network, further demonstrating the effectiveness of the feature extraction framework. We also conducted extensive ablation and hyperparameter tuning experiments to construct a robust network architecture that generalizes well.
Affiliation(s)
- Xingbin Shi
- The School of Electrical Engineering, Shanghai Dianji University, Shanghai 201306, China; Intelligent Decision and Control Technology Institute, Shanghai Dianji University, Shanghai 201306, China
- Baojiang Li
- The School of Electrical Engineering, Shanghai Dianji University, Shanghai 201306, China; Intelligent Decision and Control Technology Institute, Shanghai Dianji University, Shanghai 201306, China
- Wenlong Wang
- The School of Electrical Engineering, Shanghai Dianji University, Shanghai 201306, China; Intelligent Decision and Control Technology Institute, Shanghai Dianji University, Shanghai 201306, China
- Yuxin Qin
- The School of Electrical Engineering, Shanghai Dianji University, Shanghai 201306, China; Intelligent Decision and Control Technology Institute, Shanghai Dianji University, Shanghai 201306, China
- Haiyan Wang
- The School of Electrical Engineering, Shanghai Dianji University, Shanghai 201306, China; Intelligent Decision and Control Technology Institute, Shanghai Dianji University, Shanghai 201306, China
- Xichao Wang
- The School of Electrical Engineering, Shanghai Dianji University, Shanghai 201306, China; Intelligent Decision and Control Technology Institute, Shanghai Dianji University, Shanghai 201306, China
3. Saraswat M, Dubey AK. EBi-LSTM: an enhanced bi-directional LSTM for time-series data classification by heuristic development of optimal feature integration in brain computer interface. Comput Methods Biomech Biomed Engin 2024; 27:378-399. [PMID: 36951376] [DOI: 10.1080/10255842.2023.2187662]
Abstract
Time-series data is the sequential representation of observations collected from different applications. In brain-computer interface (BCI) systems, electroencephalography (EEG) signals provide such data about neural activity in the brain. Because the data are massive and varied, the signals are non-stationary, which results in poor-quality resolution. To overcome this issue, a new framework of enhanced deep learning methods is proposed. The source signals are collected and undergo feature extraction in four ways, and the resulting features are concatenated to enhance performance. The concatenated features are then given to a probability ratio-based Reptile Search Algorithm (PR-RSA) to select the optimal features. Finally, classification is conducted using an Enhanced Bi-directional Long Short-Term Memory (EBi-LSTM) network, whose hyperparameters are also optimized by PR-RSA. The result analysis confirms that the offered model obtains elevated classification accuracy and thus improved performance.
Affiliation(s)
- Mala Saraswat
- Assistant Professor, School of Computing Science and Engineering, Bennett University, Noida, India
- Anil Kumar Dubey
- Associate Professor, CSE Department, ABES Engineering College Ghaziabad, Ghaziabad, India

4. Alemi R, Wolfe J, Neumann S, Manning J, Towler W, Koirala N, Gracco VL, Deroche M. Audiovisual integration in children with cochlear implants revealed through EEG and fNIRS. Brain Res Bull 2023; 205:110817. [PMID: 37989460] [DOI: 10.1016/j.brainresbull.2023.110817]
Abstract
Sensory deprivation can offset the balance of audio versus visual information in multimodal processing. Such a phenomenon could persist for children born deaf, even after they receive cochlear implants (CIs), and could potentially explain why one modality is given priority over the other. Here, we recorded cortical responses to a single speaker uttering two syllables, presented in audio-only (A), visual-only (V), and audio-visual (AV) modes. Electroencephalography (EEG) and functional near-infrared spectroscopy (fNIRS) were successively recorded in seventy-five school-aged children. Twenty-five were children with normal hearing (NH) and fifty wore CIs, among whom 26 had relatively high language abilities (HL) comparable to those of NH children, while 24 others had low language abilities (LL). In EEG data, visual-evoked potentials were captured in occipital regions, in response to V and AV stimuli, and they were accentuated in the HL group compared to the LL group (the NH group being intermediate). Close to the vertex, auditory-evoked potentials were captured in response to A and AV stimuli and reflected a differential treatment of the two syllables but only in the NH group. None of the EEG metrics revealed any interaction between group and modality. In fNIRS data, each modality induced a corresponding activity in visual or auditory regions, but no group difference was observed in A, V, or AV stimulation. The present study did not reveal any sign of abnormal AV integration in children with CI. An efficient multimodal integrative network (at least for rudimentary speech materials) is clearly not a sufficient condition to exhibit good language and literacy.
Affiliation(s)
- Razieh Alemi
- Department of Psychology, Concordia University, 7141 Sherbrooke St. West, Montreal, Quebec H4B 1R6, Canada
- Jace Wolfe
- Oberkotter Foundation, Oklahoma City, OK, USA
- Sara Neumann
- Hearts for Hearing Foundation, 11500 Portland Av., Oklahoma City, OK 73120, USA
- Jacy Manning
- Hearts for Hearing Foundation, 11500 Portland Av., Oklahoma City, OK 73120, USA
- Will Towler
- Hearts for Hearing Foundation, 11500 Portland Av., Oklahoma City, OK 73120, USA
- Nabin Koirala
- Haskins Laboratories, 300 George St., New Haven, CT 06511, USA
- Mickael Deroche
- Department of Psychology, Concordia University, 7141 Sherbrooke St. West, Montreal, Quebec H4B 1R6, Canada

5. Chen R, Xu G, Zhang H, Zhang X, Li B, Wang J, Zhang S. A novel untrained SSVEP-EEG feature enhancement method using canonical correlation analysis and underdamped second-order stochastic resonance. Front Neurosci 2023; 17:1246940. [PMID: 37859766] [PMCID: PMC10584314] [DOI: 10.3389/fnins.2023.1246940]
Abstract
Objective: Compared with the light-flashing paradigm, ring-shaped motion checkerboard patterns avoid uncomfortable flicker or brightness modulation, improving the practical interactivity of brain-computer interface (BCI) applications. However, because ring-shaped checkerboard patterns elicit fewer harmonic responses and more concentrated frequency energy, mainstream untrained algorithms such as canonical correlation analysis (CCA) and filter bank canonical correlation analysis (FBCCA) show poor recognition performance and a low information transfer rate (ITR).
Methods: To address this issue, a novel untrained SSVEP-EEG feature enhancement method using CCA and underdamped second-order stochastic resonance (USSR) is proposed to extract electroencephalogram (EEG) features.
Results: In contrast to typical unsupervised dimensionality reduction methods such as common average reference (CAR), principal component analysis (PCA), multidimensional scaling (MDS), and locally linear embedding (LLE), CCA exhibits higher adaptability for SSVEP rhythm components.
Conclusion: Forty-two subjects were recruited to evaluate the proposed method, and the experimental results show that the untrained method achieves higher detection accuracy and robustness.
Significance: This untrained method opens the possibility of applying a nonlinear model from one-dimensional signals to multi-dimensional signals.
Affiliation(s)
- Ruiquan Chen
- School of Mechanical Engineering, Xi’an Jiaotong University, Xi’an, China
- Guanghua Xu
- State Key Laboratory for Manufacturing Systems Engineering, School of Mechanical Engineering, Xi’an Jiaotong University, Xi’an, China
- Huanqing Zhang
- School of Mechanical Engineering, Xi’an Jiaotong University, Xi’an, China
- Xun Zhang
- School of Mechanical Engineering, Xi’an Jiaotong University, Xi’an, China
- Baoyu Li
- School of Mechanical Engineering, Xi’an Jiaotong University, Xi’an, China
- Jiahuan Wang
- School of Mechanical Engineering, Xi’an Jiaotong University, Xi’an, China
- Sicong Zhang
- School of Mechanical Engineering, Xi’an Jiaotong University, Xi’an, China

6. Ma W, Wang C, Sun X, Lin X, Niu L, Wang Y. MBGA-Net: A multi-branch graph adaptive network for individualized motor imagery EEG classification. Computer Methods and Programs in Biomedicine 2023; 240:107641. [PMID: 37327754] [DOI: 10.1016/j.cmpb.2023.107641]
Abstract
Background and Objective: The development of deep learning has led to significant improvements in the decoding accuracy of Motor Imagery (MI) EEG signal classification. However, current models are inadequate in ensuring high levels of classification accuracy for an individual. Since MI EEG data is primarily used in medical rehabilitation and intelligent control, it is crucial to ensure that each individual's EEG signal is recognized with precision.
Methods: We propose a multi-branch graph adaptive network (MBGA-Net), which matches each individual EEG signal with a suitable time-frequency domain processing method based on spatio-temporal domain features. We then feed the signal into the relevant model branch using an adaptive technique. Through an enhanced attention mechanism and a deep convolutional method with residual connectivity, each model branch more effectively harvests the features of the related format data.
Results: We validate the proposed model using BCI Competition IV dataset 2a and dataset 2b. On dataset 2a, the average accuracy and kappa values are 87.49% and 0.83, respectively; the standard deviation of individual kappa values is only 0.08. For dataset 2b, the average classification accuracies obtained by feeding the data into the three branches of MBGA-Net are 85.71%, 85.83%, and 86.99%, respectively.
Conclusions: The experimental results demonstrate that MBGA-Net could effectively perform the classification task of motor imagery EEG signals, and it exhibits strong generalization performance. The proposed adaptive matching technique enhances the classification accuracy of each individual, which is beneficial for the practical application of EEG classification.
Affiliation(s)
- Weifeng Ma
- School of Information and Electronic Engineering, Zhejiang University of Science and Technology, Hangzhou 310023, PR China
- Chuanlai Wang
- School of Information and Electronic Engineering, Zhejiang University of Science and Technology, Hangzhou 310023, PR China
- Xiaoyong Sun
- School of Information and Electronic Engineering, Zhejiang University of Science and Technology, Hangzhou 310023, PR China
- Xuefen Lin
- School of Information and Electronic Engineering, Zhejiang University of Science and Technology, Hangzhou 310023, PR China
- Lei Niu
- Faculty of Artificial Intelligence Education, Central China Normal University Wollongong Joint Institute, Wuhan 430079, PR China
- Yuchen Wang
- School of Information and Electronic Engineering, Zhejiang University of Science and Technology, Hangzhou 310023, PR China
7. Bagh N, Reddy MR. Investigation of the dynamical behavior of brain activities during rest and motor imagery movements. Biomed Signal Process Control 2023. [DOI: 10.1016/j.bspc.2022.104153]
8. Wang G, Cerf M. Brain-Computer Interface using neural network and temporal-spectral features. Front Neuroinform 2022; 16:952474. [PMID: 36277476] [PMCID: PMC9580359] [DOI: 10.3389/fninf.2022.952474]
Abstract
Brain-Computer Interfaces (BCIs) are increasingly useful for control. Such BCIs can be used to assist individuals who have lost mobility or control over their limbs, for recreational purposes such as gaming or semi-autonomous driving, or as an interface for man-machine integration. Thus far, the performance of algorithms used for thought decoding has been limited. We show that by extracting temporal and spectral features from electroencephalography (EEG) signals and then using a deep neural network to classify those features, one can significantly improve the performance of BCIs in predicting which motor action was imagined by a subject. Our movement prediction algorithm uses the Sequential Backward Selection technique to jointly choose temporal and spectral features and a radial basis function neural network for the classification. The method shows an average performance increase of 3.50% compared to state-of-the-art benchmark algorithms. Using two popular public datasets, our algorithm reaches 90.08% accuracy (compared to an average benchmark of 79.99%) on the first dataset and 88.74% (average benchmark: 82.01%) on the second dataset. Given the high variability within and across subjects in EEG-based action decoding, we suggest that using features from multiple modalities along with a neural network classification protocol is likely to increase the performance of BCIs across various tasks.
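The feature-selection step described in this abstract can be sketched with a generic sequential backward selection wrapper: starting from the full feature set, it repeatedly drops the feature whose removal costs the least cross-validated accuracy. This is an illustrative reimplementation of the general SBS technique, not the paper's exact pipeline; a k-NN classifier stands in for the radial basis function network, and all names and parameters here are assumptions.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

def sequential_backward_selection(X, y, estimator, k_min=2, cv=3):
    """Greedy SBS: repeatedly drop the feature whose removal hurts
    cross-validated accuracy the least, tracking the best subset seen."""
    remaining = list(range(X.shape[1]))
    best_subset = remaining[:]
    best_score = cross_val_score(estimator, X, y, cv=cv).mean()
    while len(remaining) > k_min:
        # score every candidate one-feature removal
        trials = []
        for f in remaining:
            kept = [g for g in remaining if g != f]
            s = cross_val_score(estimator, X[:, kept], y, cv=cv).mean()
            trials.append((s, f))
        s_best, f_drop = max(trials)  # least harmful removal
        remaining.remove(f_drop)
        if s_best >= best_score:
            best_score, best_subset = s_best, remaining[:]
    return best_subset, best_score
```

On EEG data, the columns of `X` would be the pooled temporal and spectral features per trial; the wrapper is agnostic to which classifier is plugged in.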
Affiliation(s)
- Gan Wang
- School of Mechanical and Electrical Engineering, Soochow University, Suzhou, China
- Moran Cerf
- Interdepartmental Neuroscience Program, Northwestern University, Evanston, IL, United States
- *Correspondence: Moran Cerf

9. Wang Z, Jin J, Xu R, Liu C, Wang X, Cichocki A. Efficient Spatial Filters Enhance SSVEP Target Recognition Based on Task-Related Component Analysis. IEEE Trans Cogn Dev Syst 2022. [DOI: 10.1109/tcds.2021.3096812]
Affiliation(s)
- Zhiqiang Wang
- Key Laboratory of Smart Manufacturing in Energy Chemical Process, Ministry of Education, East China University of Science and Technology, Shanghai, China
- Jing Jin
- Key Laboratory of Smart Manufacturing in Energy Chemical Process, Ministry of Education, East China University of Science and Technology, Shanghai, China
- Ren Xu
- Guger Technologies OG, Graz, Austria
- Chang Liu
- Key Laboratory of Smart Manufacturing in Energy Chemical Process, Ministry of Education, East China University of Science and Technology, Shanghai, China
- Xingyu Wang
- Key Laboratory of Smart Manufacturing in Energy Chemical Process, Ministry of Education, East China University of Science and Technology, Shanghai, China

10. Lu Z, Zhang X, Li H, Zhang T, Gu L, Tao Q. An asynchronous artifact-enhanced electroencephalogram based control paradigm assisted by slight facial expression. Front Neurosci 2022; 16:892794. [PMID: 36051646] [PMCID: PMC9424911] [DOI: 10.3389/fnins.2022.892794]
Abstract
In this study, an asynchronous artifact-enhanced electroencephalogram (EEG)-based control paradigm assisted by slight facial expressions (sFE-paradigm) was developed. Brain connectivity analysis was conducted to reveal the dynamic directional interactions among brain regions under the sFE-paradigm, and component analysis was applied to estimate the dominant components of the sFE-EEG and guide signal processing. By exploiting the artifacts within the detected EEG, the sFE-paradigm targets the main defects of mainstream approaches: insufficient real-time capability, lack of asynchronous logic, and limited robustness. The core algorithm contains four steps: "obvious non-sFE-EEG exclusion," "interface 'ON' detection," "sFE-EEG real-time decoding," and "validity judgment." It provides asynchronous operation, decodes eight instructions from the latest 100 ms of signal, and greatly reduces frequent misoperation. In the offline assessment, the sFE-paradigm achieved 96.46% ± 1.07% accuracy for interface "ON" detection and 92.68% ± 1.21% for sFE-EEG real-time decoding, with a theoretical output timespan of less than 200 ms. The sFE-paradigm was applied to two online manipulations to evaluate stability and agility. In "object-moving with a robotic arm," the averaged intersection-over-union was 60.03 ± 11.53%. In "water-pouring with a prosthetic hand," the average water volume was 202.5 ± 7.0 ml. Online, the sFE-paradigm showed no significant difference (P = 0.6521 and P = 0.7931) from commercial control methods (i.e., FlexPendant and Joystick), indicating a similar level of controllability and agility. This study demonstrates the capability of the sFE-paradigm, enabling a novel solution for non-invasive EEG-based control in real-world challenges.
Affiliation(s)
- Zhufeng Lu
- School of Mechanical Engineering, Xi’an Jiaotong University, Xi’an, China
- Shaanxi Key Laboratory of Intelligent Robot, Xi’an Jiaotong University, Xi’an, China
- Xiaodong Zhang
- School of Mechanical Engineering, Xi’an Jiaotong University, Xi’an, China
- Shaanxi Key Laboratory of Intelligent Robot, Xi’an Jiaotong University, Xi’an, China
- *Correspondence: Xiaodong Zhang
- Hanzhe Li
- School of Mechanical Engineering, Xi’an Jiaotong University, Xi’an, China
- Shaanxi Key Laboratory of Intelligent Robot, Xi’an Jiaotong University, Xi’an, China
- Teng Zhang
- School of Mechanical Engineering, Xi’an Jiaotong University, Xi’an, China
- Shaanxi Key Laboratory of Intelligent Robot, Xi’an Jiaotong University, Xi’an, China
- Linxia Gu
- Department of Biomedical and Chemical Engineering and Sciences, College of Engineering and Science, Florida Institute of Technology, Melbourne, FL, United States
- Qing Tao
- School of Mechanical Engineering, Xinjiang University, Wulumuqi, China

11. Tai J, Forrester J, Sekuler R. Costs and benefits of audiovisual interactions. Perception 2022; 51:639-657. [PMID: 35959630] [DOI: 10.1177/03010066221111501]
Abstract
A strong temporal correlation promotes integration of concurrent sensory signals, either within a single sensory modality, or from different modalities. Although the benefits of such integration are well known, far less attention has been given to possible costs incurred when concurrent sensory signals are uncorrelated. In two experiments, subjects categorized the rate at which a visual object modulated in size, while they also tried to ignore a concurrent task-irrelevant broadband sound. Overall, the experiments showed that (i) losses in accuracy from mismatched auditory and visual rates were larger than gains from matched rates and (ii) mismatched auditory and visual rates slowed responses more than they were sped up when rates matched. Experiment One showed that audiovisual interaction varied with the difference between the visual modulation rate and the modulation rate of a concurrent auditory stimulus. Experiment Two showed that audiovisual interaction depended upon the strength of the task-irrelevant auditory modulation. Although our stimuli involved abstract, low-dimensional stimuli, not speech, the effects we observed parallel key findings on interference in multi-speaker settings.
Affiliation(s)
- Jiayue Tai
- Volen Center for Complex Systems, Brandeis University, Waltham, MA, USA
- Jack Forrester
- Volen Center for Complex Systems, Brandeis University, Waltham, MA, USA
- Robert Sekuler
- Volen Center for Complex Systems, Brandeis University, Waltham, MA, USA

12. Dellaferrera G, Asabuki T, Fukai T. Modeling the Repetition-Based Recovering of Acoustic and Visual Sources With Dendritic Neurons. Front Neurosci 2022; 16:855753. [PMID: 35573290] [PMCID: PMC9097820] [DOI: 10.3389/fnins.2022.855753]
Abstract
In natural auditory environments, acoustic signals originate from the temporal superimposition of different sound sources. The problem of inferring individual sources from ambiguous mixtures of sounds is known as blind source decomposition. Experiments on humans have demonstrated that the auditory system can identify sound sources as repeating patterns embedded in the acoustic input. Source repetition produces temporal regularities that can be detected and used for segregation. Specifically, listeners can identify sounds occurring more than once across different mixtures, but not sounds heard only in a single mixture. However, whether such a behavior can be computationally modeled has not yet been explored. Here, we propose a biologically inspired computational model to perform blind source separation on sequences of mixtures of acoustic stimuli. Our method relies on a somatodendritic neuron model trained with a Hebbian-like learning rule which was originally conceived to detect spatio-temporal patterns recurring in synaptic inputs. We show that the segregation capabilities of our model are reminiscent of the features of human performance in a variety of experimental settings involving synthesized sounds with naturalistic properties. Furthermore, we extend the study to investigate the properties of segregation on task settings not yet explored with human subjects, namely natural sounds and images. Overall, our work suggests that somatodendritic neuron models offer a promising neuro-inspired learning strategy to account for the characteristics of the brain segregation capabilities as well as to make predictions on yet untested experimental settings.
Affiliation(s)
- Giorgia Dellaferrera
- Neural Coding and Brain Computing Unit, Okinawa Institute of Science and Technology, Okinawa, Japan
- Institute of Neuroinformatics, University of Zurich and Swiss Federal Institute of Technology Zurich (ETH), Zurich, Switzerland
- Toshitake Asabuki
- Neural Coding and Brain Computing Unit, Okinawa Institute of Science and Technology, Okinawa, Japan
- Tomoki Fukai
- Neural Coding and Brain Computing Unit, Okinawa Institute of Science and Technology, Okinawa, Japan

13. A Collaborative Brain-Computer Interface Framework for Enhancing Group Detection Performance of Dynamic Visual Targets. Computational Intelligence and Neuroscience 2022; 2022:4752450. [PMID: 35087580] [PMCID: PMC8789438] [DOI: 10.1155/2022/4752450]
Abstract
The superiority of collaborative brain-computer interface (cBCI) in performance enhancement makes it an effective way to break through the performance bottleneck of the BCI-based dynamic visual target detection. However, the existing cBCIs focus on multi-mind information fusion with a static and unidirectional mode, lacking the information interaction and learning guidance among multiple agents. Here, we propose a novel cBCI framework to enhance the group detection performance of dynamic visual targets. Specifically, a mutual learning domain adaptation network (MLDANet) with information interaction, dynamic learning, and individual transferring abilities is developed as the core of the cBCI framework. MLDANet takes P3-sSDA network as individual network unit, introduces mutual learning strategy, and establishes a dynamic interactive learning mechanism between individual networks and collaborative decision-making at the neural decision level. The results indicate that the proposed MLDANet-cBCI framework can achieve the best group detection performance, and the mutual learning strategy can improve the detection ability of individual networks. In MLDANet-cBCI, the F1 scores of collaborative detection and individual network are 0.12 and 0.19 higher than those in the multi-classifier cBCI, respectively, when three minds collaborate. Thus, the proposed framework breaks through the traditional multi-mind collaborative mode and exhibits a superior group detection performance of dynamic visual targets, which is also of great significance for the practical application of multi-mind collaboration.
14. Jin J, Sun H, Daly I, Li S, Liu C, Wang X, Cichocki A. A Novel Classification Framework Using the Graph Representations of Electroencephalogram for Motor Imagery based Brain-Computer Interface. IEEE Trans Neural Syst Rehabil Eng 2021; 30:20-29. [PMID: 34962871] [DOI: 10.1109/tnsre.2021.3139095]
Abstract
The motor imagery (MI) based brain-computer interfaces (BCIs) have been proposed as a potential physical rehabilitation technology. However, the low classification accuracy achievable with MI tasks is still a challenge when building effective BCI systems. We propose a novel MI classification model based on measurement of functional connectivity between brain regions and graph theory. Specifically, motifs describing local network structures in the brain are extracted from functional connectivity graphs. A graph embedding model called Ego-CNNs is then used to build a classifier, which can convert the graph from a structural representation to a fixed-dimensional vector for detecting critical structure in the graph. We validate our proposed method on four datasets, and the results show that our proposed method produces high classification accuracies in two-class classification tasks (92.8% for dataset 1, 93.4% for dataset 2, 96.5% for dataset 3, and 80.2% for dataset 4) and multiclass classification tasks (90.33% for dataset 1). Our proposed method achieves a mean Kappa value of 0.88 across nine participants, which is superior to other methods we compared it to. These results indicate that there is a local structural difference in functional connectivity graphs extracted under different motor imagery tasks. Our proposed method has great potential for motor imagery classification in future studies.
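To make the first stage of such a pipeline concrete, the following is a minimal sketch of building a functional connectivity graph from multichannel EEG via thresholded Pearson correlation, from which degree or motif statistics could then be extracted. This is a generic illustration, not the paper's actual connectivity measure or Ego-CNN embedding; the threshold value and helper names are assumptions.

```python
import numpy as np

def connectivity_graph(eeg, threshold=0.5):
    """Binary functional connectivity graph from multichannel EEG.

    eeg: array of shape (n_channels, n_samples). An edge connects each
    channel pair whose absolute Pearson correlation exceeds the threshold.
    """
    corr = np.corrcoef(eeg)                      # (n_channels, n_channels)
    adj = (np.abs(corr) > threshold).astype(int)
    np.fill_diagonal(adj, 0)                     # no self-loops
    return adj

def degree_sequence(adj):
    """Node degrees: a simple per-node feature usable for motif-style statistics."""
    return adj.sum(axis=1)
```

In a motif-based approach like the one summarized above, local subgraph patterns of this adjacency matrix (rather than raw degrees) would be counted and fed to the graph embedding classifier.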
Collapse
|
15
|
Izzuddin TA, Safri NM, Othman MA. Compact convolutional neural network (CNN) based on SincNet for end-to-end motor imagery decoding and analysis. Biocybern Biomed Eng 2021. [DOI: 10.1016/j.bbe.2021.10.001] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/20/2022]
|
16
|
Kiremitçi I, Yilmaz Ö, Çelik E, Shahdloo M, Huth AG, Çukur T. Attentional Modulation of Hierarchical Speech Representations in a Multitalker Environment. Cereb Cortex 2021; 31:4986-5005. [PMID: 34115102 PMCID: PMC8491717 DOI: 10.1093/cercor/bhab136] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/03/2020] [Revised: 04/01/2021] [Accepted: 04/21/2021] [Indexed: 11/13/2022] Open
Abstract
Humans are remarkably adept in listening to a desired speaker in a crowded environment, while filtering out nontarget speakers in the background. Attention is key to solving this difficult cocktail-party task, yet a detailed characterization of attentional effects on speech representations is lacking. It remains unclear across what levels of speech features and how much attentional modulation occurs in each brain area during the cocktail-party task. To address these questions, we recorded whole-brain blood-oxygen-level-dependent (BOLD) responses while subjects either passively listened to single-speaker stories, or selectively attended to a male or a female speaker in temporally overlaid stories in separate experiments. Spectral, articulatory, and semantic models of the natural stories were constructed. Intrinsic selectivity profiles were identified via voxelwise models fit to passive listening responses. Attentional modulations were then quantified based on model predictions for attended and unattended stories in the cocktail-party task. We find that attention causes broad modulations at multiple levels of speech representations while growing stronger toward later stages of processing, and that unattended speech is represented up to the semantic level in parabelt auditory cortex. These results provide insights on attentional mechanisms that underlie the ability to selectively listen to a desired speaker in noisy multispeaker environments.
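The voxelwise modeling and attentional-modulation quantification described above can be sketched with closed-form ridge regression: fit an encoding model on passive-listening responses, then compare its prediction accuracy for the attended versus the unattended story. This is a schematic reconstruction on synthetic data, not the study's actual pipeline; the variable names and the modulation index are illustrative.

```python
import numpy as np

def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge regression: one voxelwise encoding model
    mapping stimulus features X to a voxel's BOLD response y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(d), X.T @ y)

def modulation_index(w, X_att, y_att, X_unatt, y_unatt):
    """Difference in prediction correlation between the attended and
    unattended stories under the same fitted model."""
    r_att = np.corrcoef(X_att @ w, y_att)[0, 1]
    r_unatt = np.corrcoef(X_unatt @ w, y_unatt)[0, 1]
    return r_att - r_unatt

# Synthetic demo: a voxel that tracks the attended story's features only.
rng = np.random.default_rng(0)
X_att = rng.standard_normal((200, 5))
w_true = rng.standard_normal(5)
y_att = X_att @ w_true                   # response follows the features
X_unatt = rng.standard_normal((200, 5))
y_unatt = rng.standard_normal(200)       # response unrelated to features
w = fit_ridge(X_att, y_att)
mi = modulation_index(w, X_att, y_att, X_unatt, y_unatt)
```

Running one such model per voxel and per feature space (spectral, articulatory, semantic) gives a map of where and at what level of representation attention reshapes speech processing.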
Collapse
Affiliation(s)
- Ibrahim Kiremitçi
- Neuroscience Program, Sabuncu Brain Research Center, Bilkent University, Ankara TR-06800, Turkey
- National Magnetic Resonance Research Center (UMRAM), Bilkent University, Ankara TR-06800, Turkey
- Özgür Yilmaz
- National Magnetic Resonance Research Center (UMRAM), Bilkent University, Ankara TR-06800, Turkey
- Department of Electrical and Electronics Engineering, Bilkent University, Ankara TR-06800, Turkey
- Emin Çelik
- Neuroscience Program, Sabuncu Brain Research Center, Bilkent University, Ankara TR-06800, Turkey
- National Magnetic Resonance Research Center (UMRAM), Bilkent University, Ankara TR-06800, Turkey
- Mo Shahdloo
- National Magnetic Resonance Research Center (UMRAM), Bilkent University, Ankara TR-06800, Turkey
- Department of Experimental Psychology, Wellcome Centre for Integrative Neuroimaging, University of Oxford, Oxford OX3 9DU, UK
- Alexander G Huth
- Department of Neuroscience, The University of Texas at Austin, Austin, TX 78712, USA
- Department of Computer Science, The University of Texas at Austin, Austin, TX 78712, USA
- Helen Wills Neuroscience Institute, University of California, Berkeley, CA 94702, USA
- Tolga Çukur
- Neuroscience Program, Sabuncu Brain Research Center, Bilkent University, Ankara TR-06800, Turkey
- National Magnetic Resonance Research Center (UMRAM), Bilkent University, Ankara TR-06800, Turkey
- Department of Electrical and Electronics Engineering, Bilkent University, Ankara TR-06800, Turkey
- Helen Wills Neuroscience Institute, University of California, Berkeley, CA 94702, USA
Collapse
|
17
|
Ibrahim B, Suppiah S, Ibrahim N, Mohamad M, Hassan HA, Nasser NS, Saripan MI. Diagnostic power of resting-state fMRI for detection of network connectivity in Alzheimer's disease and mild cognitive impairment: A systematic review. Hum Brain Mapp 2021; 42:2941-2968. [PMID: 33942449 PMCID: PMC8127155 DOI: 10.1002/hbm.25369] [Citation(s) in RCA: 77] [Impact Index Per Article: 25.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/01/2020] [Revised: 01/29/2021] [Accepted: 02/01/2021] [Indexed: 12/20/2022] Open
Abstract
Resting‐state fMRI (rs‐fMRI) detects functional connectivity (FC) abnormalities that occur in the brains of patients with Alzheimer's disease (AD) and mild cognitive impairment (MCI). FC of the default mode network (DMN) is commonly impaired in AD and MCI. We conducted a systematic review aimed at determining the diagnostic power of rs‐fMRI to identify FC abnormalities in the DMN of patients with AD or MCI compared with healthy controls (HCs) using machine learning (ML) methods. The multimodal support vector machine (SVM) algorithm was the most commonly used ML method. A multiple-kernel approach can aid classification by incorporating various discriminating features, such as FC graphs based on “nodes” and “edges” together with structural MRI‐based regional cortical thickness and gray matter volume. Other multimodal features include neuropsychiatric testing scores, DTI features, and regional cerebral blood flow. Among AD patients, the posterior cingulate cortex (PCC)/precuneus was a highly affected hub of the DMN that demonstrated overall reduced FC, whereas reduced DMN FC between the PCC and the anterior cingulate cortex (ACC) was observed in MCI patients. Evidence indicates that the nodes of the DMN can offer moderate to high diagnostic power to distinguish AD and MCI patients. Nevertheless, various concerns over the homogeneity of data based on patient selection, scanner effects, and the variable usage of classifiers and algorithms pose a challenge for ML‐based interpretation of rs‐fMRI datasets to become a mainstream option for diagnosing AD and predicting the conversion of HC/MCI to AD.
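The multiple-kernel idea described above, combining per-modality similarity kernels (FC graph features, cortical thickness, gray-matter volume, and so on) into a single kernel for an SVM, can be sketched as a weighted kernel sum. This is a generic illustration of the technique, not code from any reviewed study; the feature blocks and weights are hypothetical.

```python
import numpy as np

def linear_kernel(Z):
    """Similarity kernel for one modality's subject-by-feature matrix."""
    return Z @ Z.T

def combined_kernel(feature_blocks, weights):
    """Weighted sum of per-modality kernels; a nonnegatively weighted
    sum of valid kernels is itself a valid (symmetric PSD) kernel,
    so it can be fed to any kernel classifier such as an SVM."""
    assert len(feature_blocks) == len(weights)
    return sum(w * linear_kernel(Z) for w, Z in zip(weights, feature_blocks))

# Two toy modalities for two subjects (e.g. FC features and thickness).
K = combined_kernel(
    [np.array([[1.0, 0.0], [0.0, 1.0]]),
     np.array([[2.0, 0.0], [0.0, 2.0]])],
    [0.5, 0.5],
)
```

In practice the per-modality weights are tuned (or learned, as in multiple-kernel learning) so that the more discriminative modality dominates the combined similarity.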
Collapse
Affiliation(s)
- Buhari Ibrahim
- Department of Radiology, Faculty of Medicine and Health Sciences, Universiti Putra Malaysia, Serdang, Selangor, Malaysia
- Department of Physiology, Faculty of Basic Medical Sciences, Bauchi State University Gadau, Gadau, Nigeria
- Subapriya Suppiah
- Department of Radiology, Faculty of Medicine and Health Sciences, Universiti Putra Malaysia, Serdang, Selangor, Malaysia
- Normala Ibrahim
- Department of Psychiatry, Faculty of Medicine and Health Sciences, Universiti Putra Malaysia, Serdang, Selangor, Malaysia
- Mazlyfarina Mohamad
- Centre for Diagnostic and Applied Health Sciences, Faculty of Health Sciences, Universiti Kebangsaan Malaysia, Kuala Lumpur, Malaysia
- Hasyma Abu Hassan
- Department of Radiology, Faculty of Medicine and Health Sciences, Universiti Putra Malaysia, Serdang, Selangor, Malaysia
- Nisha Syed Nasser
- Department of Radiology, Faculty of Medicine and Health Sciences, Universiti Putra Malaysia, Serdang, Selangor, Malaysia
- M Iqbal Saripan
- Department of Computer and Communication System Engineering, Universiti Putra Malaysia, Serdang, Selangor, Malaysia
Collapse
|
18
|
A Pilot Study of Game Design in the Unity Environment as an Example of the Use of Neurogaming on the Basis of Brain–Computer Interface Technology to Improve Concentration. NEUROSCI 2021. [DOI: 10.3390/neurosci2020007] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/16/2022] Open
Abstract
The article describes the practical use of Unity technology in neurogaming. For this purpose, it describes Unity technology and brain–computer interface (BCI) technology based on the Emotiv EPOC+ NeuroHeadset device. The process of creating the game world and the test results for the use of a BCI-based device as a control interface for the created game are also presented. The game was created in the Unity graphics engine and the Visual Studio environment in C#. The game is called “NeuroBall” after the player’s object, a big red ball. It requires full focus to make the ball move and aims to improve the user's concentration and train the brain in a user-friendly environment. Through neurogaming, it will be possible to exercise and train a healthy brain, as well as diagnose and treat various symptoms of brain disorders. The project was created entirely in Unity version 2020.1.
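The control loop implied above, where the ball moves only under sustained focus, reduces to mapping a headset attention score to a velocity each frame. The game itself is written in C# for Unity; this thresholded mapping is a language-neutral sketch, and the threshold and gain values are hypothetical.

```python
def ball_velocity(focus, threshold=0.6, gain=5.0):
    """Map a normalized focus score in [0, 1] to a forward velocity;
    below the concentration threshold the ball stays still."""
    return gain * (focus - threshold) if focus > threshold else 0.0
```

In the Unity version the same mapping would run in the per-frame update, reading the focus score from the headset SDK and applying the resulting velocity to the ball's rigidbody.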
Collapse
|
19
|
Future Trends for Human-AI Collaboration: A Comprehensive Taxonomy of AI/AGI Using Multiple Intelligences and Learning Styles. COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE 2021. [DOI: 10.1155/2021/8893795] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/18/2022]
Abstract
This article discusses trends and concepts in developing a new generation of future Artificial General Intelligence (AGI) systems that relate to complex facets and different types of human intelligence, especially social, emotional, attentional, and ethical intelligence. We describe various aspects of multiple human intelligences and learning styles, which may affect a variety of AI problem domains. Using the concept of “multiple intelligences” rather than a single type of intelligence, we categorize and provide working definitions of various AGIs depending on their cognitive skills or capacities. Future AI systems will be able not only to communicate with human users and with each other but also to efficiently exchange knowledge and wisdom, with abilities of cooperation, collaboration, and even co-creation of something new and valuable, and will have meta-learning capacities. Multiagent systems such as these can be used to solve problems that would be difficult for any individual intelligent agent to solve.
Collapse
|