1. Series of Semihypergroups of Time-Varying Artificial Neurons and Related Hyperstructures. Symmetry (Basel) 2019. DOI: 10.3390/sym11070927
Abstract
Detailed analysis of the function of the multilayer perceptron (MLP) and its neurons, together with the use of time-varying neurons, allowed the authors to find an analogy with structures of linear differential operators. This procedure allowed the construction of a group and a hypergroup of artificial neurons. In this article, focusing on semihyperstructures and using the procedure described above, the authors bring new insights into structures and hyperstructures of artificial neurons and their possible symmetric relations.

2. Gavrilov D, Strukov D, Likharev KK. Capacity, Fidelity, and Noise Tolerance of Associative Spatial-Temporal Memories Based on Memristive Neuromorphic Networks. Front Neurosci 2018; 12:195. PMID: 29643761; PMCID: PMC5883079; DOI: 10.3389/fnins.2018.00195
Abstract
We have calculated key characteristics of associative (content-addressable) spatial-temporal memories based on neuromorphic networks with restricted connectivity ("CrossNets"). Such networks may be naturally implemented in nanoelectronic hardware using hybrid memristive circuits, which may feature extremely high energy efficiency, approaching that of biological cortical circuits, at much higher operation speed. Our numerical simulations, in some cases confirmed by analytical calculations, show that the characteristics depend substantially on the method of information recording into the memory. Of the four methods we have explored, two look especially promising: one based on quadratic programming, and the other a specific discrete version of gradient descent. The latter method provides a slightly lower memory capacity (at the same fidelity) than the former, but it allows local recording, which may be more readily implemented in nanoelectronic hardware. Most importantly, at synchronous retrieval, both methods provide a capacity higher than that of the well-known ternary content-addressable memories with the same number of nonvolatile memory cells (e.g., memristors), though the input noise immunity of the CrossNet memories is lower.
Affiliation(s)
- Dmitri Gavrilov
- Department of Electrical and Computer Engineering, Stony Brook University, Stony Brook, NY, United States
- Dmitri Strukov
- Department of Electrical and Computer Engineering, University of California, Santa Barbara, Santa Barbara, CA, United States
- Konstantin K Likharev
- Department of Physics and Astronomy, Stony Brook University, Stony Brook, NY, United States
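
To make the recording-method comparison above concrete, here is a minimal toy sketch (our own, not the authors' code) of the "discrete gradient descent" flavor of recording in a Hopfield-style associative memory with small quantized weight increments, followed by synchronous retrieval. CrossNet specifics such as restricted connectivity are omitted; all names and parameters are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
N, P = 64, 6                                 # neurons, stored patterns
X = rng.choice([-1.0, 1.0], size=(P, N))     # bipolar patterns

def sgn(v):
    return np.where(v >= 0, 1.0, -1.0)

# Toy "discrete gradient descent" recording: nudge weights in small
# quantized steps (as a memristive device would permit) until every
# stored pattern is a fixed point of the synchronous update.
W = np.zeros((N, N))
step = 0.05                                  # quantized weight increment
for _ in range(200):
    stable = True
    for x in X:
        wrong = sgn(W @ x) != x              # bits not yet recalled
        if wrong.any():
            stable = False
            W[wrong] += step * np.outer(x[wrong], x)  # local row update
    np.fill_diagonal(W, 0.0)
    if stable:
        break

# Synchronous retrieval from a noisy cue (~15% of bits flipped)
s = X[0] * np.where(rng.random(N) < 0.15, -1.0, 1.0)
for _ in range(10):
    s = sgn(W @ s)
print("overlap with stored pattern:", float(s @ X[0]) / N)
```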

3. Sylvester J, Reggia J, Weems S, Bunting M. Controlling working memory with learned instructions. Neural Netw 2013; 41:23-38. DOI: 10.1016/j.neunet.2013.01.010

4. Dynamically generate a long-lived private key based on password keystroke features and neural network. Inf Sci (N Y) 2012. DOI: 10.1016/j.ins.2012.04.009

5. Nguyen VA, Starzyk JA, Goh WB, Jachyra D. Neural network structure for spatio-temporal long-term memory. IEEE Trans Neural Netw Learn Syst 2012; 23:971-983. PMID: 24806767; DOI: 10.1109/tnnls.2012.2191419
Abstract
This paper proposes a neural network structure for spatio-temporal learning and recognition inspired by the long-term memory (LTM) model of the human cortex. The structure is able to process real-valued and multidimensional sequences. This capability is attained by addressing three critical problems in sequential learning, namely error tolerance, the significance of sequence elements, and memory forgetting. We demonstrate the potential of the framework with a series of synthetic simulations and the Australian sign language (ASL) dataset. Results show, first, that our LTM model is robust to different types of distortions and, second, that it outperforms other sequential processing models in a classification task on the ASL dataset.

6. Carpinteiro OAS, Leite JPRR, Pinheiro CAM, Lima I. Forecasting models for prediction in time series. Artif Intell Rev 2011. DOI: 10.1007/s10462-011-9275-1

7. A hierarchical hybrid neural model with time integrators in long-term load forecasting. Neural Comput Appl 2009. DOI: 10.1007/s00521-009-0290-y

8.

9. Starzyk J, He H. Spatio-Temporal Memories for Machine Learning: A Long-Term Memory Organization. IEEE Trans Neural Netw 2009; 20:768-80. DOI: 10.1109/tnn.2009.2012854

10.

Abstract
This paper introduces a general framework for describing dynamic neural networks: the layered digital dynamic network (LDDN). This framework allows the development of two general algorithms for computing the gradients and Jacobians of these dynamic networks: backpropagation-through-time (BPTT) and real-time recurrent learning (RTRL). The structure of the LDDN framework enables an efficient implementation of both algorithms for arbitrary dynamic networks. This paper demonstrates that the BPTT algorithm is more efficient for gradient calculations, but the RTRL algorithm is more efficient for Jacobian calculations.
Affiliation(s)
- Orlando De Jesús
- Research Department, Halliburton Energy Services, Dallas, TX 75006, USA
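
The BPTT/RTRL contrast described in this abstract can be seen in a few lines. The following sketch (ours; a scalar recurrent network, not the LDDN framework itself) computes the same gradient dL/dw both ways: RTRL carries a sensitivity forward in time, while BPTT carries an adjoint backward.

```python
import numpy as np

# Scalar recurrent net: h[t+1] = tanh(w*h[t] + u*x[t]),
# loss L = 0.5 * sum_t (h[t+1] - y[t])^2
rng = np.random.default_rng(1)
T = 20
x, y = rng.standard_normal(T), rng.standard_normal(T)
w, u = 0.5, 0.8

h = np.zeros(T + 1)                   # forward pass, h[0] = initial state
for t in range(T):
    h[t + 1] = np.tanh(w * h[t] + u * x[t])

# RTRL: carry the sensitivity dh/dw *forward* through time
dh_dw, grad_rtrl = 0.0, 0.0
for t in range(T):
    dh_dw = (1 - h[t + 1] ** 2) * (h[t] + w * dh_dw)
    grad_rtrl += (h[t + 1] - y[t]) * dh_dw

# BPTT: carry the adjoint dL/dh *backward* through time
grad_bptt, delta = 0.0, 0.0           # delta accumulates dL/dh[t+1]
for t in reversed(range(T)):
    delta += h[t + 1] - y[t]          # local loss term at step t
    pre = (1 - h[t + 1] ** 2) * delta # back through the tanh
    grad_bptt += pre * h[t]           # dL/dw contribution at step t
    delta = pre * w                   # adjoint passed to h[t]

print(grad_rtrl, grad_bptt)           # both methods yield the same dL/dw
```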

11. A hierarchical neural model with time windows in long-term electrical load forecasting. Neural Comput Appl 2006. DOI: 10.1007/s00521-006-0072-8

12.

Abstract
This letter presents an algorithm, CrySSMEx, for extracting minimal finite state machine descriptions of dynamic systems such as recurrent neural networks. Unlike previous algorithms, CrySSMEx is parameter-free and deterministic, and it efficiently generates a series of increasingly refined models. A novel finite stochastic model of dynamic systems and a novel vector quantization function have been developed to take into account the state-space dynamics of the system. The experiments show that (1) extraction from systems that can be described as regular grammars is trivial, (2) extraction from high-dimensional systems is feasible, and (3) extraction of approximative models from chaotic systems is possible. The results are promising, and an analysis of shortcomings suggests some possible further improvements. Some largely overlooked connections between the field of rule extraction from recurrent neural networks and other fields are also identified.
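
CrySSMEx itself is parameter-free and uses a specialized vector quantizer; the sketch below (our own simplification) only illustrates the generic idea behind such extractors: drive a small random RNN with symbols, quantize the visited state space (here with naive k-means, which is not CrySSMEx's quantizer), and tabulate the induced finite-state transitions.

```python
import numpy as np
from collections import Counter

rng = np.random.default_rng(2)
n_h, n_sym = 4, 2
W = 0.5 * rng.standard_normal((n_h, n_h))
U = 0.5 * rng.standard_normal((n_h, n_sym))

# Drive the RNN and log (state, symbol, next state) triples
seq = rng.integers(0, n_sym, size=500)
h = np.zeros(n_h)
states, nexts = [], []
for s in seq:
    h_new = np.tanh(W @ h + U[:, s])
    states.append(h); nexts.append(h_new)
    h = h_new
states, nexts = np.array(states), np.array(nexts)

def kmeans(X, k, iters=50):
    # Naive k-means quantization of the visited state space
    C = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        lab = np.argmin(((X[:, None] - C[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if (lab == j).any():
                C[j] = X[lab == j].mean(0)
    return C, lab

C, lab = kmeans(states, k=3)
nxt = np.argmin(((nexts[:, None] - C[None]) ** 2).sum(-1), axis=1)

# Tabulate the induced transitions: (macro-state, symbol) -> macro-state
trans = Counter(zip(lab, seq, nxt))
for (q, s, q2), n in sorted(trans.items()):
    print(f"q{q} --{s}--> q{q2}  ({n}x)")
```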

13. Corbacho F, Nishikawa KC, Weerasuriya A, Liaw JS, Arbib MA. Schema-based learning of adaptable and flexible prey-catching in anurans II. Learning after lesioning. Biol Cybern 2005; 93:410-25. PMID: 16320080; DOI: 10.1007/s00422-005-0014-z
Abstract
The previous companion paper describes the initial (seed) schema architecture that gives rise to the observed prey-catching behavior. In this second paper in the series, we describe the fundamental adaptive processes required during learning after lesioning. Following bilateral transections of the hypoglossal nerve, anurans lunge toward mealworms with no accompanying tongue or jaw movement. Nevertheless, anurans with permanent hypoglossal transections eventually learn to catch their prey by first learning to open their mouth again, and then lunging their body further and increasing their head angle. In this paper we present a new learning framework, called schema-based learning (SBL). SBL emphasizes the importance of the currently existing structure (schemas) that defines a functioning system for the incremental and autonomous construction of ever more complex structure to achieve ever more complex levels of functioning. We may rephrase this statement in the language of schema theory (Arbib 1992, for a comprehensive review) as the learning of new schemas based on the stock of current schemas. SBL emphasizes a fundamental principle of organization called coherence maximization, which deals with the maximization of congruence between the results of an interaction (external or internal) and the expectations generated for that interaction. A central hypothesis is the existence of a hierarchy of predictive internal models (predictive schemas) throughout the control center (the brain) of the agent. Hence, we include predictive models in the perceptual, sensorimotor, and motor components of the autonomous agent architecture. We then show that predictive models are fundamental for structural learning. In particular, we show how a system can learn a new structural component (augmenting the overall network topology) after being lesioned, in order to recover (or even improve) its original functionality. Learning after lesioning is a special case of structural learning, but it clearly shows that solutions cannot be known or hardwired a priori, since it cannot be known in advance which substructure is going to break down.
Affiliation(s)
- Fernando Corbacho
- USC Brain Project, University of Southern California, Los Angeles, 90089-0781, USA.

14.

Abstract
Rule extraction (RE) from recurrent neural networks (RNNs) refers to finding models of an underlying RNN, typically in the form of finite state machines, that mimic the network to a satisfactory degree while having the advantage of being more transparent. RE from RNNs can be argued to allow a deeper and more profound form of analysis of RNNs than other, more or less ad hoc, methods. RE may give us understanding of RNNs at the intermediate levels between quite abstract theoretical knowledge of RNNs as a class of computing devices and quantitative performance evaluations of RNN instantiations. The development of techniques for extraction of rules from RNNs has been an active field since the early 1990s. This article reviews the progress of this development and analyzes it in detail. In order to structure the survey and evaluate the techniques, a taxonomy specifically designed for this purpose has been developed. Moreover, important open research issues are identified that, if addressed properly, can possibly give the field a significant push forward.
Affiliation(s)
- Henrik Jacobsson
- School of Humanities and Informatics, University of Skövde, Skövde, Sweden, and Department of Computer Science, University of Sheffield, United Kingdom

15. Hammer B, Micheli A, Sperduti A. Universal Approximation Capability of Cascade Correlation for Structures. Neural Comput 2005. DOI: 10.1162/0899766053491878
Abstract
Cascade correlation (CC) constitutes a training method for neural networks that determines the weights as well as the neural architecture during training. Various extensions of CC to structured data have been proposed: recurrent cascade correlation (RCC) for sequences, recursive cascade correlation (RecCC) for tree structures with limited fan-out, and contextual recursive cascade correlation (CRecCC) for rooted directed positional acyclic graphs (DPAGs) with limited fan-in and fan-out. We show that these models possess the universal approximation property in the following sense: given a probability measure P on the input set, every measurable function from sequences into a real vector space can be approximated by a sigmoidal RCC up to any desired degree of accuracy, up to inputs of arbitrarily small probability. Every measurable function from tree structures with limited fan-out into a real vector space can be approximated by a sigmoidal RecCC with multiplicative neurons up to any desired degree of accuracy, up to inputs of arbitrarily small probability. For sigmoidal CRecCC networks with multiplicative neurons, we show the universal approximation capability for functions on an important subset of all DPAGs with limited fan-in and fan-out for which a specific linear representation yields unique codes. We give one sufficient structural condition for the latter property, which can easily be tested: the enumeration of ingoing and outgoing edges should be compatible. This property can be fulfilled for every DPAG with fan-in and fan-out two via reenumeration of children and parents, and for larger fan-in and fan-out via an expansion of the fan-in and fan-out and reenumeration of children and parents. In addition, the result can be generalized to the case of input-output isomorphic transductions of structures. Thus, CRecCC networks constitute the first neural models for which the universal approximation capability of functions involving fairly general acyclic graph structures is proved.
Affiliation(s)
- Barbara Hammer
- Institute of Computer Science, Clausthal University of Technology, 38678 Clausthal-Zellerfeld, Germany
- Alessandro Sperduti
- Dipartimento di Matematica Pura ed Applicata, Università di Padova, Padova, Italy
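
As a reminder of how the constructive scheme behind these CC variants works, here is a toy sketch (ours; plain feedforward CC on a static regression task, not the RCC/RecCC/CRecCC models of the paper): hidden units are added one at a time, each chosen to correlate with the current residual error, then frozen. For brevity, candidates come from a random pool rather than being trained by gradient ascent.

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.uniform(-1, 1, size=(200, 2))
y = np.sin(3 * X[:, 0]) * X[:, 1]            # toy target function

def lstsq_fit(F, y):
    # Retrain the linear output layer on the current feature set
    A = np.hstack([F, np.ones((len(F), 1))])  # bias column
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    return w, A @ w

F = X.copy()                                  # features seen by new units
w, pred = lstsq_fit(F, y)
for unit in range(10):
    resid = y - pred
    # Candidate pool of sigmoidal units over *all* current features
    # (inputs plus previously installed hidden units: the "cascade").
    best, best_c = None, -1.0
    for _ in range(50):
        v = rng.standard_normal(F.shape[1] + 1)
        a = np.tanh(F @ v[:-1] + v[-1])
        c = abs(np.corrcoef(a, resid)[0, 1])
        if c > best_c:
            best_c, best = c, a
    F = np.hstack([F, best[:, None]])         # install and freeze the unit
    w, pred = lstsq_fit(F, y)
    print(f"unit {unit + 1}: mse = {np.mean((y - pred) ** 2):.4f}")
```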

16. Hammer B, Micheli A, Sperduti A, Strickert M. Recursive self-organizing network models. Neural Netw 2004; 17:1061-85. PMID: 15555852; DOI: 10.1016/j.neunet.2004.06.009
Abstract
Self-organizing models constitute valuable tools for data visualization, clustering, and data mining. Here, we focus on extensions of basic vector-based models by recursive computation in such a way that sequential and tree-structured data can be processed directly. The aim of this article is to give a unified review of important models recently proposed in the literature, to investigate fundamental mathematical properties of these models, and to compare the approaches by experiments. We first review several models proposed in the literature from a unifying perspective, making use of an underlying general framework which also includes supervised recurrent and recursive models as special cases. We briefly discuss how the models can be related to different neuron lattices. Then, we investigate theoretical properties of the models in detail: we explicitly formalize how structures are internally stored in different context models and which similarity measures are induced by the recursive mapping onto the structures. We assess the representational capabilities of the models, and we briefly discuss the issues of topology preservation and noise tolerance. The models are compared in an experiment with time series data. Finally, we add an experiment for one context model for tree-structured data to demonstrate the capability to process complex structures.
Affiliation(s)
- Barbara Hammer
- Research Group LNM, Department of Mathematics/Computer Science, University of Osnabrück, Albrechtstrasse 28, Osnabrück D-49069, Germany.
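
The recursive models reviewed here share one idea: a winner is selected by comparing both the current input and a context that encodes the previous winner. A minimal sketch of one such model follows (ours; a merge-SOM-style context, one of several context models the paper covers; neighborhood cooperation is omitted for brevity, so this is not a faithful SOM).

```python
import numpy as np

rng = np.random.default_rng(4)
M, dim = 25, 1                       # map size, dimension of sequence entries
W = rng.uniform(-1, 1, (M, dim))     # weight vectors: match the current input
C = np.zeros((M, dim))               # context vectors: match the merged past
alpha, beta, gamma = 1.0, 0.5, 0.5   # input/context balance, merge rate

def winner(x, ctx):
    # Distance combines match to the input and match to the context
    d = alpha * ((W - x) ** 2).sum(1) + beta * ((C - ctx) ** 2).sum(1)
    return int(np.argmin(d))

seq = np.sin(0.3 * np.arange(2000))[:, None]   # toy time series
ctx = np.zeros(dim)
for t, x in enumerate(seq):
    i = winner(x, ctx)
    lr = 0.1 * np.exp(-t / 1000)               # decaying learning rate
    W[i] += lr * (x - W[i])                    # move winner toward input
    C[i] += lr * (ctx - C[i])                  # and toward current context
    # Recursive step: next context is a merged descriptor of this winner
    ctx = (1 - gamma) * W[i] + gamma * C[i]
```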

17.

18. Hammer B, Micheli A, Sperduti A, Strickert M. A general framework for unsupervised processing of structured data. Neurocomputing 2004. DOI: 10.1016/j.neucom.2004.01.008

19. Tino P, Cernanský M, Benusková L. Markovian Architectural Bias of Recurrent Neural Networks. IEEE Trans Neural Netw 2004; 15:6-15. PMID: 15387243; DOI: 10.1109/tnn.2003.820839
Abstract
In this paper, we elaborate upon the claim that clustering in the recurrent layer of recurrent neural networks (RNNs) reflects meaningful information-processing states even prior to training [1], [2]. By concentrating on activation clusters in RNNs, while not throwing away the continuous state-space network dynamics, we extract predictive models that we call neural prediction machines (NPMs). When RNNs with sigmoid activation functions are initialized with small weights (a common technique in the RNN community), the clusters of recurrent activations emerging prior to training are indeed meaningful and correspond to Markov prediction contexts. In this case, the extracted NPMs correspond to a class of Markov models, called variable memory length Markov models (VLMMs). In order to appreciate how much information has really been induced during the training, the RNN performance should always be compared with that of VLMMs and NPMs extracted before training as the "null" base models. Our arguments are supported by experiments on a chaotic symbolic sequence and a context-free language with a deep recursive structure.
Index Terms: complex symbolic sequences, information latching problem, iterative function systems, Markov models, recurrent neural networks (RNNs).
Affiliation(s)
- Peter Tino
- School of Computer Science, University of Birmingham, Edgbaston, Birmingham, B15 2TT, U.K.
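
The NPM construction described above is easy to reproduce in toy form. The sketch below (ours) drives a small-weight, untrained RNN with a random symbol sequence, clusters the recurrent activations, and reads off a next-symbol distribution per cluster, i.e., Markov-like prediction contexts obtained without any training.

```python
import numpy as np

rng = np.random.default_rng(5)
n_h, A = 8, 2                                # hidden units, alphabet size
W = 0.2 * rng.standard_normal((n_h, n_h))    # *small* weights: Markovian regime
U = 0.2 * rng.standard_normal((n_h, A))

seq = rng.integers(0, A, size=3000)          # symbolic driving sequence
h = np.zeros(n_h)
acts, syms = [], []
for t in range(len(seq) - 1):
    h = np.tanh(W @ h + U[:, seq[t]])
    acts.append(h.copy()); syms.append(seq[t + 1])   # next symbol = target
acts, syms = np.array(acts), np.array(syms)

# Cluster the (untrained!) recurrent activations into prediction contexts
k = 4
C = acts[rng.choice(len(acts), k, replace=False)]
for _ in range(30):
    lab = np.argmin(((acts[:, None] - C[None]) ** 2).sum(-1), 1)
    for j in range(k):
        if (lab == j).any():
            C[j] = acts[lab == j].mean(0)

# NPM: per cluster, predict the next symbol from empirical counts
for j in range(k):
    counts = np.bincount(syms[lab == j], minlength=A)
    if counts.sum():
        print(f"context {j}: P(next symbol) = {counts / counts.sum()}")
```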

20.

Abstract
We have recently shown that when initialized with "small" weights, recurrent neural networks (RNNs) with standard sigmoid-type activation functions are inherently biased toward Markov models; even prior to any training, RNN dynamics can be readily used to extract finite memory machines (Hammer & Tiňo, 2002; Tiňo, Čerňanský, & Beňušková, 2002a, 2002b). Following Christiansen and Chater (1999), we refer to this phenomenon as the architectural bias of RNNs. In this article, we extend our work on the architectural bias in RNNs by performing a rigorous fractal analysis of recurrent activation patterns. We assume the network is driven by sequences obtained by traversing an underlying finite-state transition diagram, a scenario that has been frequently considered in the past, for example, when studying RNN-based learning and implementation of regular grammars and finite-state transducers. We obtain lower and upper bounds on various types of fractal dimensions, such as box-counting and Hausdorff dimensions. It turns out that not only can the recurrent activations inside RNNs with small initial weights be explored to build Markovian predictive models, but also the activations form fractal clusters, the dimension of which can be bounded by the scaled entropy of the underlying driving source. The scaling factors are fixed and are given by the RNN parameters.
Affiliation(s)
- Peter Tiňo
- Aston University, Birmingham B4 7ET, U.K.
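
A small-weight RNN driven by symbols acts approximately as a contractive iterated function system, which is where the fractal activation clusters come from. The following sketch (ours; exact affine contractions standing in for an RNN) generates such a cluster and estimates its box-counting dimension. With a uniform three-symbol source and contraction ratio k, the estimate approaches log 3 / log(1/k), an instance of the entropy-scaled bound mentioned above.

```python
import numpy as np

rng = np.random.default_rng(6)
k = 0.4                                       # contraction ratio
offsets = {0: np.array([0.0, 0.0]),           # one affine map per symbol
           1: np.array([0.6, 0.0]),
           2: np.array([0.3, 0.6])}

# Iterate x_t = k * x_{t-1} + b_{s_t} along a random symbol sequence
seq = rng.integers(0, 3, size=20000)
pts = np.zeros((len(seq), 2))
x = np.zeros(2)
for t, s in enumerate(seq):
    x = k * x + offsets[s]
    pts[t] = x

# Box-counting estimate: count occupied boxes at several scales,
# then fit the slope of log(count) against log(1/eps)
eps = np.array([0.2, 0.1, 0.05, 0.025, 0.0125])
counts = [len({tuple(np.floor(p / e).astype(int)) for p in pts}) for e in eps]
slope = np.polyfit(np.log(1 / eps), np.log(counts), 1)[0]
print(f"box-counting dimension ~ {slope:.2f}")  # ~ log(3)/log(1/0.4) ~ 1.2
```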

21. Barreto GDA, Araújo AFR, Kremer SC. A taxonomy for spatiotemporal connectionist networks revisited: the unsupervised case. Neural Comput 2003; 15:1255-320. PMID: 12816574; DOI: 10.1162/089976603321780281
Abstract
Spatiotemporal connectionist networks (STCNs) comprise an important class of neural models that can deal with patterns distributed in both time and space. In this article, we widen the application domain of the taxonomy for supervised STCNs recently proposed by Kremer (2001) to the unsupervised case. This is possible through a reinterpretation of the state vector as a vector of latent (hidden) variables, as proposed by Meinicke (2000). The goal of this generalized taxonomy is then to provide a nonlinear generative framework for describing unsupervised spatiotemporal networks, making it easier to compare and contrast their representational and operational characteristics. Computational properties, representational issues, and learning are also discussed, and a number of references to the relevant source publications are provided. It is argued that the proposed approach is simpler and more powerful than previous attempts from a descriptive and predictive viewpoint. We also discuss the relation of this taxonomy to automata theory and state-space modeling and suggest directions for further work.