1. Sun Y, Todorovic S, Goodison S. Local-learning-based feature selection for high-dimensional data analysis. IEEE Trans Pattern Anal Mach Intell 2010;32:1610-26. PMID: 20634556; PMCID: PMC3445441; DOI: 10.1109/tpami.2009.190.
Abstract
This paper considers feature selection for data classification in the presence of a huge number of irrelevant features. We propose a new feature-selection algorithm that addresses several major issues with prior work, including problems with algorithm implementation, computational complexity, and solution accuracy. The key idea is to decompose an arbitrarily complex nonlinear problem into a set of locally linear ones through local learning, and then learn feature relevance globally within the large-margin framework. The proposed algorithm is based on well-established machine learning and numerical analysis techniques, without making any assumptions about the underlying data distribution. It is capable of processing many thousands of features within minutes on a personal computer while maintaining a very high accuracy that is nearly insensitive to a growing number of irrelevant features. Theoretical analyses suggest that the algorithm has logarithmic sample complexity with respect to the number of features. Experiments on 11 synthetic and real-world data sets demonstrate the viability of our formulation of the feature-selection problem for supervised learning and the effectiveness of our algorithm.
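The local-learning idea above can be illustrated with a Relief-style sketch (a generic illustration of local margin-based feature weighting, not the authors' exact algorithm): each sample's nearest same-class neighbor ("hit") and nearest other-class neighbor ("miss") define a local margin from which feature relevance is accumulated.

```python
import numpy as np

def local_feature_weights(X, y):
    """Relief-style local learning: for each sample, compare distances to
    its nearest same-class neighbor (hit) and nearest other-class
    neighbor (miss); features that separate misses more than hits gain
    weight."""
    n, d = X.shape
    w = np.zeros(d)
    for i in range(n):
        dist = np.abs(X - X[i]).sum(axis=1)   # L1 distance to all samples
        dist[i] = np.inf                       # exclude the sample itself
        same = y == y[i]
        hit = np.argmin(np.where(same, dist, np.inf))
        miss = np.argmin(np.where(~same, dist, np.inf))
        w += np.abs(X[i] - X[miss]) - np.abs(X[i] - X[hit])
    return np.maximum(w, 0.0)                  # clip negative relevances
```

Features that consistently separate misses more than hits accumulate positive weight, while irrelevant features hover near zero.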
2.
Abstract
It is unclear how public authorities shaped responses to Ebola in Sierra Leone. Focusing on one village, we analyze what happened when "staff, stuff, space, and systems" were absent. Mutuality between neighbors, linked to secret societies, necessitated collective care for infected loved ones, irrespective of the risks. Practical learning was quick. Numbers recovering were reported to be higher among people treated in hidden locations, compared to those taken to Ebola Treatment Centres. Our findings challenge positive post-Ebola narratives about international aid and military deployment. A morally appropriate people's science emerged under the radar of external scrutiny, including that of a paramount chief.
3. Crafton B, Parihar A, Gebhardt E, Raychowdhury A. Direct Feedback Alignment With Sparse Connections for Local Learning. Front Neurosci 2019;13:525. PMID: 31178689; PMCID: PMC6542988; DOI: 10.3389/fnins.2019.00525.
Abstract
Recent advances in deep neural networks (DNNs) owe their success to training algorithms that use backpropagation and gradient descent. Backpropagation, while highly effective on von Neumann architectures, becomes inefficient when scaling to large networks. Each neuron's dependence on the weights and errors located deeper in the network, commonly referred to as the weight transport problem, requires exhaustive data movement, which presents a key obstacle to improving the performance and energy efficiency of machine-learning hardware. In this work, we propose a bio-plausible alternative to backpropagation, drawing from advances in feedback alignment algorithms, in which the error computation at a single synapse reduces to the product of three scalar values. Using a sparse feedback matrix, we show that a neuron needs only a fraction of the information previously used by feedback alignment algorithms. Consequently, memory and compute can be partitioned and distributed in whichever way produces the most efficient forward pass, so long as a single error can be delivered to each neuron. We evaluate our algorithm using standard datasets, including ImageNet, to address the concern of scaling to challenging problems. Our results show orders-of-magnitude improvement in data movement and a 2× improvement in multiply-and-accumulate operations over backpropagation. Like previous work, we observe that any variant of feedback alignment suffers significant losses in classification accuracy on deep convolutional neural networks. By transferring trained convolutional layers and training the fully connected layers using direct feedback alignment, we demonstrate that direct feedback alignment can obtain results competitive with backpropagation. Furthermore, we observe that using an extremely sparse feedback matrix, rather than a dense one, results in a small accuracy drop while yielding hardware advantages. All the code and results are available at https://github.com/bcrafton/ssdfa.
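The core mechanism can be sketched in a few lines (a toy illustration with assumed sizes and constants, not the paper's implementation): a fixed random feedback matrix B, sparsified as the paper proposes, delivers the output error directly to the hidden layer instead of backpropagating it through the transpose of the forward weights.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy network trained with direct feedback alignment (DFA): the hidden
# layer receives the output error through a fixed, sparse, random
# feedback matrix B rather than through the transpose of W2.
n_in, n_hid, n_out = 4, 16, 2
W1 = rng.normal(0.0, 0.5, (n_hid, n_in))
W2 = rng.normal(0.0, 0.5, (n_out, n_hid))
B = rng.normal(0.0, 0.5, (n_hid, n_out))
B *= rng.random(B.shape) < 0.25          # sparsify: keep ~25% of entries

def forward(x):
    h = np.tanh(W1 @ x)
    return h, W2 @ h

x = rng.normal(size=n_in)
t = np.array([1.0, -1.0])                # arbitrary regression target
lr = 0.05
for _ in range(200):
    h, yhat = forward(x)
    e = yhat - t                          # output error (local at the top)
    W2 -= lr * np.outer(e, h)
    dh = (B @ e) * (1.0 - h ** 2)         # error delivered via fixed B
    W1 -= lr * np.outer(dh, x)
```

Because B is fixed and sparse, each hidden neuron only needs a handful of scalar error values, which is what makes the data-movement savings possible in hardware.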
4. Just-in-Time Correntropy Soft Sensor with Noisy Data for Industrial Silicon Content Prediction. Sensors (Basel) 2017;17:1830. PMID: 28786957; PMCID: PMC5579503; DOI: 10.3390/s17081830.
Abstract
Development of accurate data-driven quality prediction models for industrial blast furnaces encounters several challenges, mainly because the collected data are nonlinear, non-Gaussian, and unevenly distributed. In this work, a just-in-time correntropy-based local soft sensing approach is presented to predict silicon content. Without cumbersome efforts at outlier detection, a correntropy support vector regression (CSVR) modeling framework is proposed to handle soft sensor development and outlier detection simultaneously. Moreover, with a continuously updated database and a clustering strategy, a just-in-time CSVR (JCSVR) method is developed. Consequently, more accurate prediction and more efficient implementation of JCSVR can be achieved. The superior prediction performance of JCSVR is validated on online silicon content prediction, compared with traditional soft sensors.
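The outlier robustness of correntropy-based fitting can be sketched as follows (a generic half-quadratic illustration of the correntropy criterion, not the paper's CSVR implementation): maximizing correntropy reduces to iteratively re-weighted least squares in which each sample's weight exp(-e²/2σ²) vanishes for outliers.

```python
import numpy as np

def correntropy_regression(X, y, sigma=1.0, iters=20):
    """Half-quadratic optimization: maximizing correntropy reduces to
    iteratively re-weighted least squares, where each sample's weight
    exp(-e^2 / (2*sigma^2)) shrinks toward zero for outliers."""
    w = np.linalg.lstsq(X, y, rcond=None)[0]      # ordinary LS start
    for _ in range(iters):
        e = y - X @ w
        a = np.exp(-e ** 2 / (2.0 * sigma ** 2))  # per-sample weights
        Xw = X * a[:, None]                       # apply weights to rows
        w = np.linalg.solve(X.T @ Xw, Xw.T @ y)   # weighted LS update
    return w
```

A gross outlier gets a weight that is numerically zero after the first iteration, so the fit is driven by the clean samples without any explicit outlier-detection step.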
5. Borthakur A, Cleland TA. A Spike Time-Dependent Online Learning Algorithm Derived From Biological Olfaction. Front Neurosci 2019;13:656. PMID: 31316339; PMCID: PMC6610532; DOI: 10.3389/fnins.2019.00656.
Abstract
We have developed a spiking neural network (SNN) algorithm for signal restoration and identification based on principles extracted from the mammalian olfactory system and broadly applicable to input from arbitrary sensor arrays. For interpretability and development purposes, we here examine the properties of its initial feedforward projection. Like the full algorithm, this feedforward component is fully spike timing-based and utilizes online learning based on local synaptic rules such as spike timing-dependent plasticity (STDP). Using an intermediate metric to assess the properties of this initial projection, the feedforward network exhibits high classification performance after few-shot learning without catastrophic forgetting, and includes a "none of the above" outcome to reflect classifier confidence. We demonstrate online learning performance using a publicly available machine olfaction dataset with challenges including relatively small training sets, variable stimulus concentrations, and 3 years of sensor drift.
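The local synaptic rule referenced above can be sketched generically (standard pairwise STDP with illustrative constants, not taken from the paper):

```python
import numpy as np

def stdp_update(dt, a_plus=0.05, a_minus=0.025, tau=20.0):
    """Pairwise STDP: dt = t_post - t_pre (ms). Potentiate when the
    presynaptic spike precedes the postsynaptic one (dt > 0),
    depress otherwise; magnitude decays with the timing gap."""
    return np.where(dt > 0,
                    a_plus * np.exp(-dt / tau),
                    -a_minus * np.exp(dt / tau))

# pre fires 5 ms before post -> potentiation; 5 ms after -> depression
deltas = stdp_update(np.array([5.0, -5.0]))
```

Because the update depends only on the timing of the two spikes at that synapse, the rule is fully local and can run online as samples arrive.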
6. Soures N, Kudithipudi D. Deep Liquid State Machines With Neural Plasticity for Video Activity Recognition. Front Neurosci 2019;13:686. PMID: 31333404; PMCID: PMC6621912; DOI: 10.3389/fnins.2019.00686.
Abstract
Real-world applications such as first-person video activity recognition require intelligent edge devices. However, the size, weight, and power constraints of embedded platforms cannot support resource-intensive state-of-the-art algorithms. Lightweight machine learning algorithms, such as reservoir computing, with shallow three-layer networks are computationally frugal, as only the output layer is trained. By reducing network depth and plasticity, reservoir computing minimizes computational cost and complexity, making such algorithms well suited to edge devices. However, as a trade-off for its frugal nature, reservoir computing sacrifices representational power compared to state-of-the-art methods. A good compromise between reservoir computing and fully supervised networks is the proposed deep-LSM network. The deep-LSM is a deep spiking neural network that captures dynamic information over multiple time scales with a combination of randomly connected layers and unsupervised layers. The deep-LSM processes the captured dynamic information through an attention-modulated readout layer to perform classification. We demonstrate that the deep-LSM achieves an average of 84.78% accuracy on the DogCentric video activity recognition task, beating the state of the art. The deep-LSM also shows up to 91.13% memory savings and up to a 91.55% reduction in synaptic operations compared to similar recurrent neural network models. Based on these results, we claim that the deep-LSM is capable of overcoming the limitations of traditional reservoir computing while maintaining the low computational cost associated with it.
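The frugality of reservoir computing, where only the linear readout is trained, can be sketched with a generic echo-state example (assumed sizes and task; a simplification unrelated to the deep-LSM itself):

```python
import numpy as np

rng = np.random.default_rng(2)

# Echo-state sketch: a fixed random recurrent reservoir; only the
# linear readout is trained, here by ridge regression.
n_in, n_res = 1, 50
Win = rng.normal(0.0, 0.5, (n_res, n_in))
Wres = rng.normal(0.0, 1.0, (n_res, n_res))
Wres *= 0.9 / np.max(np.abs(np.linalg.eigvals(Wres)))  # spectral radius < 1

def run(us):
    x, states = np.zeros(n_res), []
    for u in us:
        x = np.tanh(Win @ u + Wres @ x)   # fixed, untrained dynamics
        states.append(x)
    return np.array(states)

# toy task: recall the previous input u[t-1] from the reservoir state
us = rng.uniform(-1.0, 1.0, (300, 1))
S = run(us)
A, y = S[1:], us[:-1, 0]                  # states aligned with lag-1 targets
Wout = np.linalg.solve(A.T @ A + 1e-6 * np.eye(n_res), A.T @ y)
```

All learning is concentrated in the single linear solve for Wout, which is why such models fit within edge-device budgets.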
7. Müller E, Arnold E, Breitwieser O, Czierlinski M, Emmel A, Kaiser J, Mauch C, Schmitt S, Spilger P, Stock R, Stradmann Y, Weis J, Baumbach A, Billaudelle S, Cramer B, Ebert F, Göltz J, Ilmberger J, Karasenko V, Kleider M, Leibfried A, Pehle C, Schemmel J. A Scalable Approach to Modeling on Accelerated Neuromorphic Hardware. Front Neurosci 2022;16:884128. PMID: 35663548; PMCID: PMC9157770; DOI: 10.3389/fnins.2022.884128.
Abstract
Neuromorphic systems open up opportunities to enlarge the explorative space for computational research. However, it is often challenging to unite efficiency and usability. This work presents the software aspects of this endeavor for the BrainScaleS-2 system, a hybrid accelerated neuromorphic hardware architecture based on physical modeling. We introduce key aspects of the BrainScaleS-2 Operating System: experiment workflow, API layering, software design, and platform operation. We present use cases to discuss and derive requirements for the software and showcase the implementation. The focus lies on novel system and software features such as multi-compartmental neurons, fast re-configuration for hardware-in-the-loop training, applications for the embedded processors, the non-spiking operation mode, interactive platform access, and sustainable hardware/software co-development. Finally, we discuss further developments in terms of hardware scale-up, system usability, and efficiency.
8. Oh S, Yoon R, Min KS. Defect-Tolerant Memristor Crossbar Circuits for Local Learning Neural Networks. Nanomaterials (Basel) 2025;15:213. PMID: 39940190; PMCID: PMC11820591; DOI: 10.3390/nano15030213.
Abstract
Local learning algorithms, such as Equilibrium Propagation (EP), have emerged as alternatives to global learning methods like backpropagation for training neural networks. EP offers the potential for more energy-efficient hardware implementation by utilizing only local neuron information for weight updates. However, the practical implementation of EP using memristor-based circuits faces significant challenges due to the immature fabrication processes of memristors, which result in defects and variability issues. Previous implementations of EP with memristor crossbars use two separate circuits for the free and nudge phases. This approach can suffer from differences in defects and variability between the two circuits, potentially leading to significant performance degradation. To overcome these limitations, we propose in this paper a novel time-multiplexing technique that combines the free and nudge phases into a single memristor circuit. The proposed scheme integrates the dynamic equations of the free and nudge phases into one circuit, allowing compensation for defects and variability during training. Simulations using the MNIST dataset demonstrate that our approach maintains a 92% recognition rate even with a 10% defect rate in memristors, compared to 33% for the previous scheme. Furthermore, the proposed circuit reduces area overhead for both the memristor circuit solving EP's dynamic equations and the weight-update control circuit.
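Equilibrium Propagation's two phases can be sketched in software (a toy continuous-Hopfield-style illustration with assumed sizes and constants; the paper's contribution is a circuit realization, not this code): the network settles freely, settles again with the output nudged toward the target, and each weight is updated purely from the two locally observed states.

```python
import numpy as np

rng = np.random.default_rng(1)
rho = lambda s: np.clip(s, 0.0, 1.0)      # hard-sigmoid activation

n_x, n_h, n_y = 2, 8, 1
Wxh = rng.normal(0.0, 0.3, (n_x, n_h))
Why = rng.normal(0.0, 0.3, (n_h, n_y))

def settle(x, y_target=None, beta=0.0, steps=50, eps=0.2):
    """Relax the network state to equilibrium; beta > 0 nudges the output."""
    h, y = np.zeros(n_h), np.zeros(n_y)
    for _ in range(steps):
        dh = rho(x) @ Wxh + rho(y) @ Why.T - h
        dy = rho(h) @ Why - y
        if y_target is not None:
            dy += beta * (y_target - y)   # nudge phase only
        h += eps * dh
        y += eps * dy
    return h, y

beta, lr = 0.5, 0.1
x, t = np.array([1.0, 0.0]), np.array([0.8])
_, y_before = settle(x)                   # free-phase output before training
for _ in range(100):
    h0, y0 = settle(x)                    # free phase
    h1, y1 = settle(x, t, beta)           # nudge phase
    # contrastive, purely local EP weight updates
    Wxh += (lr / beta) * np.outer(rho(x), rho(h1) - rho(h0))
    Why += (lr / beta) * (np.outer(rho(h1), rho(y1)) - np.outer(rho(h0), rho(y0)))
```

Each weight update uses only the activities at that synapse's two endpoints in the two phases, which is what makes crossbar implementations attractive.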
9. Makkeh A, Graetz M, Schneider AC, Ehrlich DA, Priesemann V, Wibral M. A general framework for interpretable neural learning based on local information-theoretic goal functions. Proc Natl Acad Sci U S A 2025;122:e2408125122. PMID: 40042906; PMCID: PMC11912414; DOI: 10.1073/pnas.2408125122.
Abstract
Despite the impressive performance of biological and artificial networks, an intuitive understanding of how their local learning dynamics contribute to network-level task solutions remains a challenge to date. Efforts to bring learning to a more local scale have indeed led to valuable insights; however, a general constructive approach to describing local learning goals that is both interpretable and adaptable across diverse tasks is still missing. We have previously formulated a local information processing goal that is highly adaptable and interpretable for a model neuron with compartmental structure. Building on recent advances in Partial Information Decomposition (PID), we here derive a corresponding parametric local learning rule, which allows us to introduce "infomorphic" neural networks. We demonstrate the versatility of these networks in performing tasks from supervised, unsupervised, and memory learning. By leveraging the interpretable nature of the PID framework, infomorphic networks represent a valuable tool for advancing our understanding of the intricate structure of local learning.
10. Oh S, An J, Cho S, Yoon R, Min KS. Memristor Crossbar Circuits Implementing Equilibrium Propagation for On-Device Learning. Micromachines (Basel) 2023;14:1367. PMID: 37512678; PMCID: PMC10384638; DOI: 10.3390/mi14071367.
Abstract
Equilibrium propagation (EP) has recently been proposed as a neural network training algorithm based on a local learning concept, where only local information is used to calculate the weight update of the neural network. Despite the advantages of local learning, numerical iteration for solving the EP dynamic equations makes the EP algorithm less practical for realizing edge intelligence hardware. Some analog circuits have been suggested to solve the EP dynamic equations physically, not numerically, using the original EP algorithm. However, a few problems remain for circuit implementation: for example, the need to store the free-phase solution and the lack of essential peripheral circuits for calculating and updating synaptic weights. Therefore, in this paper, a new analog circuit technique is proposed to realize the EP algorithm in practical, implementable hardware. This work makes two major contributions toward this objective. First, the free-phase and nudge-phase solutions are calculated by the proposed analog circuits simultaneously, not at different times. This eliminates the need for analog voltage memories, or digital memories with domain-conversion circuits, to store the free-phase solution temporarily in the proposed EP circuit. Second, a simple EP learning rule relying on a fixed amount of conductance change per programming pulse is newly proposed and implemented in peripheral circuits. The modified EP learning rule makes the weight-update circuit practical and implementable without requiring a complicated program-verify scheme. The proposed memristor conductance update circuit is simulated and verified for training synaptic weights on memristor crossbars. The simulation results show that the proposed EP circuit could be used to realize on-device learning in edge intelligence hardware.
11. Guo W, Fouda ME, Eltawil AM, Salama KN. Efficient training of spiking neural networks with temporally-truncated local backpropagation through time. Front Neurosci 2023;17:1047008. PMID: 37090791; PMCID: PMC10117667; DOI: 10.3389/fnins.2023.1047008.
Abstract
Directly training spiking neural networks (SNNs) has remained challenging due to complex neural dynamics and the intrinsic non-differentiability of firing functions. The well-known backpropagation through time (BPTT) algorithm used to train SNNs suffers from a large memory footprint and from backward and update locking, making it impossible to exploit the potential of locally supervised training methods. This work proposes an efficient and direct training algorithm for SNNs that integrates a locally supervised training method with a temporally truncated BPTT algorithm. The proposed algorithm exploits both temporal and spatial locality in BPTT and contributes to a significant reduction in computational cost, including GPU memory utilization, main memory access, and arithmetic operations. We thoroughly explore the design space concerning temporal truncation length and local training block size and benchmark their impact on the classification accuracy of different networks running different types of tasks. The results reveal that temporal truncation has a negative effect on accuracy when classifying frame-based datasets, but leads to improved accuracy on event-based datasets. In spite of the resulting information loss, local training is capable of alleviating overfitting. The combined effect of temporal truncation and local training can slow the accuracy drop and even improve accuracy. In addition, training a deep SNN model such as AlexNet on the CIFAR10-DVS dataset leads to a 7.26% increase in accuracy, an 89.94% reduction in GPU memory, a 10.79% reduction in memory access, and a 99.64% reduction in MAC operations compared to standard end-to-end BPTT. Thus, the proposed method shows high potential to enable fast and energy-efficient on-chip training for real-time learning at the edge.
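Temporal truncation can be illustrated on a scalar toy RNN (an illustration of truncated BPTT in general, not the paper's SNN algorithm): gradients are propagated only K steps into the past, capping memory and computation at the cost of ignoring longer dependencies.

```python
import numpy as np

def truncated_rnn_grad(xs, h0, W, U, K):
    """Gradient of the last-step loss L = 0.5 * h_T**2 of the scalar RNN
    h_t = tanh(W*h_{t-1} + U*x_t), backpropagated only K steps into the
    past (temporally truncated BPTT)."""
    hs = [h0]
    for x in xs:                                  # forward pass
        hs.append(np.tanh(W * hs[-1] + U * x))
    dh, gW = hs[-1], 0.0                          # dL/dh_T = h_T
    T = len(xs)
    for t in range(T - 1, max(T - 1 - K, -1), -1):
        pre = W * hs[t] + U * xs[t]
        dpre = dh * (1.0 - np.tanh(pre) ** 2)     # through the tanh
        gW += dpre * hs[t]                        # local weight gradient
        dh = dpre * W                             # error one step back
    return gW
```

With K equal to the sequence length this reproduces full BPTT; smaller K bounds the backward pass, mirroring the memory/accuracy trade-off the paper explores.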
12. Chen Y, Zhang H, Cameron M, Sejnowski T. Predictive sequence learning in the hippocampal formation. Neuron 2024;112:2645-2658.e4. PMID: 38917804; DOI: 10.1016/j.neuron.2024.05.024.
Abstract
The hippocampus receives sequences of sensory inputs from the cortex during exploration and encodes the sequences with millisecond precision. We developed a predictive autoencoder model of the hippocampus including the trisynaptic and monosynaptic circuits from the entorhinal cortex (EC). CA3 was trained as a self-supervised recurrent neural network to predict its next input. We confirmed that CA3 predicts ahead by analyzing the spike coupling between simultaneously recorded neurons in the dentate gyrus, CA3, and CA1 of the mouse hippocampus. In the model, CA1 neurons signal prediction errors by comparing CA3 predictions to the next direct EC input. The model exhibits the rapid appearance and slow fading of CA1 place cells and displays replay and phase precession from CA3. The model could be learned in a biologically plausible way with error-encoding neurons. Similarities between the hippocampal and thalamocortical circuits suggest that such a computational motif could also underlie self-supervised sequence learning in the cortex.
13. Polykretis I, Danielescu A. Mapless mobile robot navigation at the edge using self-supervised cognitive map learners. Front Robot AI 2024;11:1372375. PMID: 38841433; PMCID: PMC11151295; DOI: 10.3389/frobt.2024.1372375.
Abstract
Navigation of mobile agents in unknown, unmapped environments is a critical task for achieving general autonomy. Recent advancements in combining Reinforcement Learning with Deep Neural Networks have shown promising results in addressing this challenge. However, the inherent complexity of these approaches, characterized by multi-layer networks and intricate reward objectives, limits their autonomy, increases memory footprint, and complicates adaptation to energy-efficient edge hardware. To overcome these challenges, we propose a brain-inspired method that employs a shallow architecture trained by a local learning rule for self-supervised navigation in uncharted environments. Our approach achieves performance comparable to a state-of-the-art Deep Q Network (DQN) method with respect to goal-reaching accuracy and path length, with a similar (slightly lower) number of parameters, operations, and training iterations. Notably, our self-supervised approach combines novelty-based and random walks to alleviate the need for objective reward definition and enhance agent autonomy. At the same time, the shallow architecture and local learning rule do not call for error backpropagation, decreasing the memory overhead and enabling implementation on edge neuromorphic processors. These results contribute to the potential of embodied neuromorphic agents utilizing minimal resources while effectively handling variability.
14. Tao J, Dan Y, Zhou D. Local domain generalization with low-rank constraint for EEG-based emotion recognition. Front Neurosci 2023;17:1213099. PMID: 38027525; PMCID: PMC10662311; DOI: 10.3389/fnins.2023.1213099.
Abstract
As an important branch of affective computing, emotion recognition based on electroencephalography (EEG) faces a long-standing challenge due to individual diversity. To address this challenge, domain adaptation (DA) or domain generalization (DG, i.e., DA without a target domain in the training stage) techniques have been introduced into EEG-based emotion recognition to eliminate the distribution discrepancy between different subjects. Preceding DA or DG methods mainly focus on aligning the global distribution shift between source and target domains, without considering the correlations between the subdomains within the source domain and the target domain of interest. Since ignoring this fine-grained distribution information in the source may still limit DG performance on EEG datasets with multimodal structures, multiple patches (or subdomains) should be reconstructed from the source domain, on which multiple classifiers can be learned collaboratively. Accurately aligning relevant subdomains by exploiting multiple distribution patterns within the source domain is expected to further boost the learning performance of DG/DA. We therefore propose a novel DG method for EEG-based emotion recognition: Local Domain Generalization with low-rank constraint (LDG). Specifically, the source domain is first partitioned into multiple local domains, each of which contains one positive sample together with its positive neighbors and k2 negative neighbors. Multiple subject-invariant classifiers on different subdomains are then co-learned in a unified framework by minimizing a local regression loss with low-rank regularization, which accounts for the knowledge shared among local domains. In the inference stage, the learned local classifiers are discriminatively selected according to their adaptation importance.
Extensive experiments are conducted on two benchmark databases (DEAP and SEED) under two cross-validation evaluation protocols: cross-subject within-dataset and cross-dataset within-session. The experimental results under 5-fold cross-validation demonstrate the superiority of the proposed method compared with several state-of-the-art methods.