1. Mohamed MH. Rules extraction from constructively trained neural networks based on genetic algorithms. Neurocomputing 2011. [DOI: 10.1016/j.neucom.2011.04.009]

2. Subrahmanya N, Shin YC. Constructive training of recurrent neural networks using hybrid optimization. Neurocomputing 2010. [DOI: 10.1016/j.neucom.2010.05.012]

3. Juang CF, Lin CT. A recurrent self-organizing neural fuzzy inference network. IEEE Trans Neural Netw 1999; 10:828-45. [PMID: 18252581] [DOI: 10.1109/72.774232]
Abstract
A recurrent self-organizing neural fuzzy inference network (RSONFIN) is proposed in this paper. The RSONFIN is inherently a recurrent multilayered connectionist network for realizing the basic elements and functions of dynamic fuzzy inference, and may be considered to be constructed from a series of dynamic fuzzy rules. The temporal relations embedded in the network are built by adding some feedback connections representing the memory elements to a feedforward neural fuzzy network. Each weight as well as node in the RSONFIN has its own meaning and represents a special element in a fuzzy rule. There are no hidden nodes (i.e., no membership functions and fuzzy rules) initially in the RSONFIN. They are created on-line via concurrent structure identification (the construction of dynamic fuzzy if-then rules) and parameter identification (the tuning of the free parameters of membership functions). The structure learning together with the parameter learning forms a fast learning algorithm for building a small, yet powerful, dynamic neural fuzzy network. Two major characteristics of the RSONFIN can thus be seen: 1) the recurrent property of the RSONFIN makes it suitable for dealing with temporal problems and 2) no predetermination, like the number of hidden nodes, must be given, since the RSONFIN can find its optimal structure and parameters automatically and quickly. Moreover, to reduce the number of fuzzy rules generated, a flexible input partition method, the aligned clustering-based algorithm, is proposed. Various simulations on temporal problems are done and performance comparisons with some existing recurrent networks are also made. Efficiency of the RSONFIN is verified from these results.
Affiliation(s)
- C F Juang
- Department of Electrical and Control Engineering, National Chiao-Tung University, Hsinchu, Taiwan, R.O.C
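The on-line structure identification described in the abstract above (a new fuzzy rule is created only when no existing membership function fires strongly enough on the current input) can be sketched roughly as follows. The Gaussian membership form and the `threshold` and `init_width` parameters are illustrative assumptions, not the paper's exact formulation:

```python
import math

def gaussian_membership(x, center, width):
    """Degree to which scalar input x belongs to a Gaussian fuzzy set."""
    return math.exp(-((x - center) ** 2) / (width ** 2))

def online_rule_growth(samples, threshold=0.5, init_width=0.4):
    """Start with no rules; create a new rule (cluster center) whenever
    the best firing strength of the existing rules falls below threshold."""
    centers = []
    for x in samples:
        strengths = [gaussian_membership(x, c, init_width) for c in centers]
        if not strengths or max(strengths) < threshold:
            centers.append(x)  # structure identification: add a fuzzy rule
    return centers
```

In the full model each created rule would also carry feedback (memory) terms and its parameters would be tuned on-line; the sketch only shows why no prior choice of the number of hidden nodes is needed.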

4. do Carmo Nicoletti M, Bertini JR, Elizondo D, Franco L, Jerez JM. Constructive Neural Network Algorithms for Feedforward Architectures Suitable for Classification Tasks. In: Constructive Neural Networks. 2009. [DOI: 10.1007/978-3-642-04512-7_1]

5. Delgado M, Cuéllar MP, Pegalajar MC. Multiobjective hybrid optimization and training of recurrent neural networks. IEEE Trans Syst Man Cybern B Cybern 2008; 38:381-403. [PMID: 18348922] [DOI: 10.1109/tsmcb.2007.912937]
Abstract
The application of neural networks to solve a problem involves tasks with a high computational cost until a suitable network is found, and these tasks mainly involve the selection of the network topology and the training step. We usually select the network structure by means of a trial-and-error procedure and then train the network. In the case of recurrent neural networks (RNNs), the lack of suitable training algorithms sometimes hampers these procedures due to vanishing gradient problems. This paper addresses the simultaneous training and topology optimization of RNNs using multiobjective hybrid procedures. The proposal builds hybrid methods on the SPEA2 and NSGA2 algorithms using the Baldwinian hybridization strategy. We also study the effects of the selection of the objectives, crossover, and mutation on diversity during evolution. The proposals are tested in the experimental section to train and optimize the networks on the competition on artificial time-series (CATS) benchmark.
Affiliation(s)
- Miguel Delgado
- Department of Computer Science and Artificial Intelligence, University of Granada, Granada, Spain
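The simultaneous training-and-topology optimization above rests on Pareto dominance over competing objectives such as prediction error and network size. A minimal sketch of the dominance test and the non-dominated filtering that SPEA2/NSGA2-style selection builds on; the (error, n_units) objective pairs are an assumed example, not the paper's exact objective set:

```python
def dominates(a, b):
    """a Pareto-dominates b if it is no worse on every objective
    (lower is better here) and strictly better on at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(candidates):
    """Keep only the non-dominated objective vectors, e.g. (error, n_units)."""
    return [c for c in candidates
            if not any(dominates(o, c) for o in candidates if o is not c)]
```

A multiobjective evolutionary algorithm repeats this filtering each generation, so small-but-inaccurate and accurate-but-large networks survive together instead of being collapsed into one weighted score.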

6. Citterio C, Pelagotti A, Piuri V, Rocca L. Function approximation: a fast-convergence neural approach based on spectral analysis. IEEE Trans Neural Netw 1999; 10:725-40. [PMID: 18252573] [DOI: 10.1109/72.774207]
Abstract
We propose a constructive approach to building single-hidden-layer neural networks for nonlinear function approximation using frequency domain analysis. We introduce a spectrum-based learning procedure that minimizes the difference between the spectrum of the training data and the spectrum of the network's estimates. The network is built up incrementally during training and automatically determines the appropriate number of hidden units. This technique achieves similar or better approximation with faster convergence times than traditional techniques such as backpropagation.
Affiliation(s)
- C Citterio
- Foster Wheeler Italiana S.p.A., 20094 Milano, Italy
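A spectrum-based objective of the kind this entry describes can be sketched by comparing the DFT magnitude spectra of the training data and of the network's estimates. The naive DFT and the squared-difference loss below are illustrative assumptions, not the authors' exact procedure:

```python
import cmath

def dft_magnitudes(signal):
    """Naive DFT magnitude spectrum of a real-valued sequence."""
    n = len(signal)
    return [abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t, x in enumerate(signal)))
            for k in range(n)]

def spectral_error(target, estimate):
    """Sum of squared differences between the two magnitude spectra:
    the frequency-domain mismatch a spectrum-based learner would minimize."""
    ft, fe = dft_magnitudes(target), dft_magnitudes(estimate)
    return sum((a - b) ** 2 for a, b in zip(ft, fe))
```

Driving this error down while adding hidden units one at a time gives a constructive loop that stops when the spectra agree, which is how the network size can be determined automatically.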

7. Sun GZ, Giles CL, Chen HH. The neural network pushdown automaton: Architecture, dynamics and training. Lect Notes Comput Sci 1998. [DOI: 10.1007/bfb0054003]

9. Hammer B, Micheli A, Sperduti A. Universal Approximation Capability of Cascade Correlation for Structures. Neural Comput 2005. [DOI: 10.1162/0899766053491878]
Abstract
Cascade correlation (CC) constitutes a training method for neural networks that determines the weights as well as the neural architecture during training. Various extensions of CC to structured data have been proposed: recurrent cascade correlation (RCC) for sequences, recursive cascade correlation (RecCC) for tree structures with limited fan-out, and contextual recursive cascade correlation (CRecCC) for rooted directed positional acyclic graphs (DPAGs) with limited fan-in and fan-out. We show that these models possess the universal approximation property in the following sense: given a probability measure P on the input set, every measurable function from sequences into a real vector space can be approximated by a sigmoidal RCC up to any desired degree of accuracy up to inputs of arbitrarily small probability. Every measurable function from tree structures with limited fan-out into a real vector space can be approximated by a sigmoidal RecCC with multiplicative neurons up to any desired degree of accuracy up to inputs of arbitrarily small probability. For sigmoidal CRecCC networks with multiplicative neurons, we show the universal approximation capability for functions on an important subset of all DPAGs with limited fan-in and fan-out for which a specific linear representation yields unique codes. We give one sufficient structural condition for the latter property, which can easily be tested: the enumeration of ingoing and outgoing edges should be compatible. This property can be fulfilled for every DPAG with fan-in and fan-out two via reenumeration of children and parents, and for larger fan-in and fan-out via an expansion of the fan-in and fan-out and reenumeration of children and parents. In addition, the result can be generalized to the case of input-output isomorphic transductions of structures. Thus, CRecCC networks constitute the first neural models for which the universal approximation capability of functions involving fairly general acyclic graph structures is proved.
Affiliation(s)
- Barbara Hammer
- Institute of Computer Science, Clausthal University of Technology, 38678 Clausthal-Zellerfeld, Germany
- Alessandro Sperduti
- Dipartimento di Matematica Pura ed Applicata, Università di Padova, Padova, Italy

10. Micheli A, Sona D, Sperduti A. Contextual Processing of Structured Data by Recursive Cascade Correlation. IEEE Trans Neural Netw 2004; 15:1396-410. [PMID: 15565768] [DOI: 10.1109/tnn.2004.837783]
Abstract
This paper proposes a first approach to dealing with contextual information in structured domains by recursive neural networks. The proposed model, contextual recursive cascade correlation (CRCC), a generalization of the recursive cascade correlation (RCC) model, is able to partially remove the causality assumption by exploiting contextual information stored in frozen units. We formally characterize the properties of CRCC, showing that it is able to compute contextual transductions and also some causal supersource transductions that RCC cannot compute. Experimental results on controlled sequences and on a real-world task involving chemical structures confirm the computational limitations of RCC, while assessing the efficiency and efficacy of CRCC in dealing with both pure causal and contextual prediction tasks. Moreover, results obtained for the real-world task show the superiority of the proposed approach over RCC on a task for which it is not known whether the structural causality assumption holds.
Affiliation(s)
- Alessio Micheli
- Computer Science Department, University of Pisa, 56127 Pisa, Italy.

11. Galicki M, Leistritz L, Zwick EB, Witte H. Improving Generalization Capabilities of Dynamic Neural Networks. Neural Comput 2004; 16:1253-82. [PMID: 15130249] [DOI: 10.1162/089976604773717603]
Abstract
This work addresses the problem of improving the generalization capabilities of continuous recurrent neural networks. The learning task is transformed into an optimal control framework in which the weights and the initial network state are treated as unknown controls. A new learning algorithm based on a variational formulation of Pontryagin's maximum principle is proposed. Under reasonable assumptions, its convergence is discussed. Numerical examples are given that demonstrate an essential improvement of generalization capabilities after the learning process of a dynamic network.
Affiliation(s)
- Miroslaw Galicki
- Institute of Medical Statistics, Computer Sciences and Documentation, Friedrich Schiller University, Jena, Germany.

12. Kremer SC. Spatiotemporal Connectionist Networks: A Taxonomy and Review. Neural Comput 2001.
Abstract
This article reviews connectionist network architectures and training algorithms that are capable of dealing with patterns distributed across both space and time, that is, spatiotemporal patterns. It provides common mathematical, algorithmic, and illustrative frameworks for describing spatiotemporal networks, making it easier to compare and contrast their representational and operational characteristics. Computational power, representational issues, and learning are discussed. In addition, references to the relevant source publications are provided. This article can serve as a guide for prospective users of spatiotemporal networks by providing an overview of the operational and representational alternatives available.
Affiliation(s)
- Stefan C. Kremer
- Guelph Natural Computation Group, Department of Computing and Information Science, University of Guelph, Guelph, Ontario, N1G 2W1 Canada

13. Parekh R, Yang J, Honavar V. Constructive neural-network learning algorithms for pattern classification. IEEE Trans Neural Netw 2000; 11:436-51. [DOI: 10.1109/72.839013]

14. Knowledge selection in category learning. Psychology of Learning and Motivation 2000. [DOI: 10.1016/s0079-7421(00)80034-1]

15. Blanco A, Delgado M, Pegalajar M. A genetic algorithm to obtain the optimal recurrent neural network. Int J Approx Reason 2000. [DOI: 10.1016/s0888-613x(99)00032-8]

16. Kremer S. Identification of a specific limitation on local-feedback recurrent networks acting as Mealy-Moore machines. IEEE Trans Neural Netw 1999; 10:433-8. [DOI: 10.1109/72.750574]

17. Campbell C. Constructive learning techniques for designing neural network systems. In: Neural Network Systems Techniques and Applications. 1998. [DOI: 10.1016/s1874-5946(98)80005-9]

18. Sperduti A, Starita A. Supervised neural networks for the classification of structures. IEEE Trans Neural Netw 1997; 8:714-35. [DOI: 10.1109/72.572108]

20. Teng CC, Wah BW. Automated learning for reducing the configuration of a feedforward neural network. IEEE Trans Neural Netw 1996; 7:1072-85. [PMID: 18263505] [DOI: 10.1109/72.536305]
Abstract
In this paper, we present two learning mechanisms for artificial neural networks (ANNs) that can be applied to solve classification problems with binary outputs. These mechanisms reduce the number of hidden units of an ANN trained by the cascade-correlation learning algorithm (CAS). Since CAS adds hidden units incrementally as learning proceeds, it is difficult to predict the number of hidden units required when convergence is reached; further, learning must be restarted when the number of hidden units is larger than expected. Our key idea is to provide alternatives in the learning process and to select the best alternative dynamically based on run-time information. Mixed-mode learning (MM), our first algorithm, provides alternative output matrices so that learning is extended to find one of many one-to-many mappings instead of a unique one-to-one mapping. Since this transformation relaxes the objective of learning, the number of learning epochs can be reduced, which in turn leads to fewer hidden units at convergence. Population-based learning for ANNs (PLAN), our second algorithm, maintains alternative network configurations and selects at run time promising networks to train, based on the error information obtained and the time remaining. This dynamic scheduling avoids training possibly unpromising ANNs to completion before exploring new ones. We show the performance of these two mechanisms by applying them to the two-spiral problem, a two-region classification problem, and the Pima Indian diabetes diagnosis problem.
Affiliation(s)
- C C Teng
- Coordinated Sci. Lab., Illinois Univ., Urbana, IL
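Both mechanisms above sit on top of cascade correlation, which adds hidden units one at a time by freezing the candidate whose output covaries most with the remaining network error. A minimal sketch of that selection step; the function names and the toy unnormalized covariance score are assumptions for illustration:

```python
def covariance_score(candidate_out, residual):
    """|sum (v - mean(v)) * (e - mean(e))|: the correlation-style quantity
    cascade correlation maximizes when picking the next hidden unit."""
    n = len(candidate_out)
    vbar = sum(candidate_out) / n
    ebar = sum(residual) / n
    return abs(sum((v - vbar) * (e - ebar)
                   for v, e in zip(candidate_out, residual)))

def pick_candidate(candidates, residual):
    """Select (freeze) the candidate unit whose output best tracks the
    current residual error over the training set."""
    return max(range(len(candidates)),
               key=lambda i: covariance_score(candidates[i], residual))
```

Because units are added greedily until the error is low enough, the final hidden-layer size is a by-product of training, which is exactly why the paper's run-time mechanisms for bounding it are useful.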