101
Cannas SA, Stariolo D, Tamarit FA. Learning dynamics of simple perceptrons with non-extensive cost functions. Network: Computation in Neural Systems 1996; 7:141-149. PMID: 29480149. DOI: 10.1080/0954898x.1996.11978659.
Abstract
A Tsallis-statistics-based generalization of the gradient descent dynamics (using non-extensive cost functions), recently introduced by one of us, is proposed as a learning rule in a simple perceptron. The resulting Langevin equations are solved numerically for different values of an index q (q = 1 and q ≠ 1 respectively correspond to the extensive and non-extensive cases) and for different cost functions. The results are compared with the learning curve (mean error versus time) obtained from a learning experiment carried out with human beings, showing an excellent agreement for values of q slightly above unity. This fact illustrates the possible importance of including some degree of non-locality (non-extensivity) in computational learning procedures, whenever one wants to mimic human behaviour.
Affiliation(s)
- S A Cannas
- Facultad de Matemática, Astronomía y Física, Universidad Nacional de Córdoba, Haya de la Torre y Medina Allende S/N, Ciudad Universitaria, 5000 Córdoba, Argentina
- D Stariolo
- Centro Brasileiro de Pesquisas Físicas, Rua Xavier Sigaud 150, 22290-180, Rio de Janeiro, Brazil
- F A Tamarit
- Facultad de Matemática, Astronomía y Física, Universidad Nacional de Córdoba, Haya de la Torre y Medina Allende S/N, Ciudad Universitaria, 5000 Córdoba, Argentina
- Centro Brasileiro de Pesquisas Físicas, Rua Xavier Sigaud 150, 22290-180, Rio de Janeiro, Brazil
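As an annotation on entry 101: the q-deformed dynamics can be sketched numerically. This is a toy reading, not the authors' exact Langevin equations: the usual quadratic per-example cost e is replaced by e^q, so q = 1 recovers ordinary stochastic gradient descent, and a temperature parameter adds Langevin-style noise. All names and parameter values below are illustrative assumptions.

```python
import math
import random

random.seed(0)

def q_gradient_descent(q, n=20, steps=2000, eta=0.05, temp=0.0):
    """Toy q-deformed gradient descent for a student perceptron learning
    a fixed teacher. The per-example cost e = err**2 / 2 is replaced by
    e**q, whose gradient picks up a factor q * e**(q - 1); q = 1 recovers
    plain stochastic gradient descent, and temp > 0 adds Langevin-style noise."""
    teacher = [random.gauss(0, 1) for _ in range(n)]
    w = [0.0] * n
    sq_errors = []
    for _ in range(steps):
        x = [random.gauss(0, 1) for _ in range(n)]
        y = sum(t * xi for t, xi in zip(teacher, x)) / math.sqrt(n)
        out = sum(wi * xi for wi, xi in zip(w, x)) / math.sqrt(n)
        err = out - y
        e = 0.5 * err * err + 1e-12            # tiny offset avoids 0**(q-1) for q < 1
        scale = q * e ** (q - 1.0)             # d(e**q)/de
        for i in range(n):
            w[i] -= eta * scale * err * x[i] / math.sqrt(n) + temp * random.gauss(0, 1)
        sq_errors.append(err * err)
    return sum(sq_errors[-100:]) / 100         # late-time mean squared error

final_q1 = q_gradient_descent(1.0)     # extensive case
final_q12 = q_gradient_descent(1.2)    # mildly non-extensive case
print(final_q1, final_q12)
```

Both runs converge toward the teacher; the q > 1 cost merely rescales the gradient as the error shrinks, which is one way non-extensivity changes the shape of the learning curve.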
102
103
Opper M, Haussler D. Bounds for predictive errors in the statistical mechanics of supervised learning. Phys Rev Lett 1995; 75:3772-3775. PMID: 10059723. DOI: 10.1103/physrevlett.75.3772.
104
Biehl M, Riegler P, Stechert M. Learning from noisy data: An exactly solvable model. Phys Rev E 1995; 52:R4624-R4627. PMID: 9964090. DOI: 10.1103/physreve.52.r4624.
105
Raffin B, Gordon MB. Learning and generalization with Minimerror, a temperature-dependent learning algorithm. Neural Comput 1995; 7:1206-1224. PMID: 7584899. DOI: 10.1162/neco.1995.7.6.1206.
Abstract
We study the numerical performance of Minimerror, a recently introduced learning algorithm for the perceptron that has been shown analytically to be optimal for learning both linearly and nonlinearly separable functions. We present its implementation on learning linearly separable Boolean functions. Numerical results are in excellent agreement with the theoretical predictions.
Affiliation(s)
- B Raffin
- CEA/Département de Recherche Fondamentale sur la Matière Condensée, SPSMS/MDN, Centre d'Etudes Nucléaires de Grenoble, France
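As an annotation on entry 105: the temperature-dependent idea can be sketched as gradient descent on a smooth, annealed error count. The per-example cost V = ½(1 − tanh(βλ/2)) (λ the example's stability, β the inverse temperature), the annealing schedule, and all parameter values below are assumptions for illustration; the paper's exact Minimerror algorithm differs in detail.

```python
import math
import random

random.seed(1)

def train_minimerror_sketch(n=10, p=40, epochs=300, eta=0.2, beta0=0.5, beta_rate=1.01):
    """Perceptron learning with a temperature-dependent cost: each example
    contributes V = 0.5 * (1 - tanh(beta * lam / 2)), where lam is the
    example's stability y * (w . x) / ||w||. As beta -> infinity the total
    cost approaches the number of training errors; annealing beta upward
    focuses the updates on examples ever closer to the decision boundary."""
    teacher = [random.gauss(0, 1) for _ in range(n)]
    data = []
    for _ in range(p):
        x = [random.gauss(0, 1) for _ in range(n)]
        y = 1 if sum(t * xi for t, xi in zip(teacher, x)) > 0 else -1
        data.append((x, y))
    w = [random.gauss(0, 0.1) for _ in range(n)]
    beta = beta0
    for _ in range(epochs):
        norm = math.sqrt(sum(wi * wi for wi in w))
        grad = [0.0] * n
        for x, y in data:
            lam = y * sum(wi * xi for wi, xi in zip(w, x)) / norm
            # dV/dlam = -(beta/4) * sech(beta*lam/2)**2: near-boundary examples dominate
            g = -(beta / 4.0) / math.cosh(beta * lam / 2.0) ** 2
            for i in range(n):
                grad[i] += g * y * x[i] / norm   # d(norm)/dw ignored; w is renormalized below
        for i in range(n):
            w[i] -= eta * grad[i]
        norm = math.sqrt(sum(wi * wi for wi in w))
        w = [wi / norm for wi in w]              # keep the weight vector on the unit sphere
        beta *= beta_rate                        # anneal: lower the temperature
    return sum(1 for x, y in data
               if y * sum(wi * xi for wi, xi in zip(w, x)) <= 0)

errors = train_minimerror_sketch()
print(errors)                                    # training errors on the separable set
```

The data are separable by construction (a teacher generates the labels), so the annealed descent drives the smooth cost, and with it the training error, down.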
106
Campbell C, Vicente CP. The target switch algorithm: a constructive learning procedure for feed-forward neural networks. Neural Comput 1995; 7:1245-1264. PMID: 7584901. DOI: 10.1162/neco.1995.7.6.1245.
Abstract
We propose an efficient procedure for constructing and training a feed-forward neural network. The network can perform binary classification for binary or analogue input data. We show that the procedure can also be used to construct feed-forward neural networks with binary-valued weights; such networks are potentially straightforward to implement using microelectronic or optical devices, and they can also exhibit good generalization.
Affiliation(s)
- C Campbell
- Advanced Computing Research Centre, Bristol University, United Kingdom
107
Saad D, Solla SA. On-line learning in soft committee machines. Phys Rev E 1995; 52:4225-4243. PMID: 9963894. DOI: 10.1103/physreve.52.4225.
108
Diambra L, Plastino A. Maximum entropy, pseudoinverse techniques, and time series predictions with layered networks. Phys Rev E 1995; 52:4557-4560. PMID: 9963937. DOI: 10.1103/physreve.52.4557.
109
Diambra L, Fernández J, Plastino A. Pseudoinverse techniques, information theory, and the training of feedforward networks. Phys Rev E 1995; 52:2887-2892. PMID: 9963735. DOI: 10.1103/physreve.52.2887.
110
Kinouchi O, Caticha N. On-line versus off-line learning in the linear perceptron: A comparative study. Phys Rev E 1995; 52:2878-2886. PMID: 9963734. DOI: 10.1103/physreve.52.2878.
111
Barkai N, Seung HS, Sompolinsky H. Local and global convergence of on-line learning. Phys Rev Lett 1995; 75:1415-1418. PMID: 10060287. DOI: 10.1103/physrevlett.75.1415.
112
Bouten M, Schietse J. Gradient descent learning in perceptrons: A review of its possibilities. Phys Rev E 1995; 52:1958-1967. PMID: 9963617. DOI: 10.1103/physreve.52.1958.
113
Abstract
We examine the fluctuations in the test error induced by random, finite training and test sets for the linear perceptron of input dimension n with a spherically constrained weight vector. This variance enables us to address such issues as the partitioning of a data set into a test and a training set. We find that the optimal test set size scales as n^(2/3).
Affiliation(s)
- D. Barber
- Department of Physics, University of Edinburgh, Kings Buildings, Mayfield Road, Edinburgh EH9 3JZ, U.K.
- D. Saad
- Department of Physics, University of Edinburgh, Kings Buildings, Mayfield Road, Edinburgh EH9 3JZ, U.K.
- P. Sollich
- Department of Physics, University of Edinburgh, Kings Buildings, Mayfield Road, Edinburgh EH9 3JZ, U.K.
114
Holden SB, Niranjan M. On the statistical physics of radial basis function networks. Neural Process Lett 1995. DOI: 10.1007/bf02279933.
115
Bex GJ, Serneels R. Storage capacity and generalization error for the reversed-wedge Ising perceptron. Phys Rev E 1995; 51:6309-6312. PMID: 9963383. DOI: 10.1103/physreve.51.6309.
116
Meir R, Merhav N. On the stochastic complexity of learning realizable and unrealizable rules. Mach Learn 1995. DOI: 10.1007/bf00996271.
117
Barbato DM, Fontanari JF. Dilution in a linear neural network. Phys Rev E 1995; 51:6219-6229. PMID: 9963361. DOI: 10.1103/physreve.51.6219.
118
Saad D, Solla SA. Exact solution for on-line learning in multilayer neural networks. Phys Rev Lett 1995; 74:4337-4340. PMID: 10058475. DOI: 10.1103/physrevlett.74.4337.
119
Opper M. Statistical physics estimates for the complexity of feedforward neural networks. Phys Rev E 1995; 51:3613-3618. PMID: 9963043. DOI: 10.1103/physreve.51.3613.
120
Grandvalet Y, Canu S. Comments on "Noise injection into inputs in back propagation learning". IEEE Trans Syst Man Cybern 1995. DOI: 10.1109/21.370200.
121
Deffuant G. An Algorithm for Building Regularized Piecewise Linear Discrimination Surfaces: The Perceptron Membrane. Neural Comput 1995. DOI: 10.1162/neco.1995.7.2.380.
Abstract
The perceptron membrane is a new connectionist model that aims at solving discrimination (classification) problems with piecewise linear surfaces. The discrimination surfaces of perceptron membranes are defined by unions of convex polyhedra. Starting from a single convex polyhedron, new facets and new polyhedra are added during learning. Moreover, the positions and orientations of the facets are continuously adapted according to the training examples. Considering each facet as a perceptron cell, a geometric credit assignment provides a local training domain to each perceptron of the network. This enables one to apply statistical theorems on the probability of good generalization for each unit on its learning domain, and gives a reliable criterion for perceptron elimination (using the Vapnik-Chervonenkis dimension). Furthermore, a regularization procedure is implemented. The model's efficiency is demonstrated on well-known problems such as the 2-spirals or waveforms.
122
Generalization and PAC learning: some new results for the class of generalized single-layer networks. IEEE Trans Neural Netw 1995; 6:368-380. DOI: 10.1109/72.363472.
123
Eisenstein E, Kanter I, Kessler DA, Kinzel W. Generation and prediction of time series by a neural network. Phys Rev Lett 1995; 74:6-9. PMID: 10057685. DOI: 10.1103/physrevlett.74.6.
124
Kabashima Y, Shinomoto S. Learning a Decision Boundary from Stochastic Examples: Incremental Algorithms with and without Queries. Neural Comput 1995. DOI: 10.1162/neco.1995.7.1.158.
Abstract
Even if it is not possible to reproduce a target input-output relation, a learning machine should be able to minimize the probability of making errors. A practical learning algorithm should also be simple enough to go without memorizing example data, if possible. Incremental algorithms such as error backpropagation satisfy this requirement. We propose incremental algorithms that provide fast convergence of the machine parameter θ to its optimal choice θ₀ with respect to the number of examples t. We will consider the binary choice model whose target relation has a blurred boundary and the machine whose parameter θ specifies a decision boundary to make the output prediction. The question we wish to address here is how fast θ can approach θ₀, depending upon whether in the learning stage the machine can specify inputs as queries to the target relation, or the inputs are drawn from a certain distribution. If queries are permitted, the machine can achieve the fastest convergence, (θ - θ₀)^2 ~ O(t^-1). If not, O(t^-1) convergence is generally not attainable. For learning without queries, we showed in a previous paper that the error minimum algorithm exhibits a slow convergence, (θ - θ₀)^2 ~ O(t^-2/3). We propose here a practical algorithm that provides a rather fast convergence, O(t^-4/5). It is possible to further accelerate the convergence by using more elaborate algorithms. The fastest convergence turned out to be O((ln t)^2 t^-1). This scaling is considered optimal among possible algorithms, and is not due to the incremental nature of our algorithm.
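The convergence rates quoted in entry 124's abstract can be compared by measuring log-log slopes of the idealized decay laws themselves. These are the quoted scalings, not simulations of the algorithms; the labels are shorthand for the four cases in the abstract.

```python
import math

def loglog_slope(f, t1=10_000, t2=1_000_000):
    """Empirical slope of log f(t) versus log t between two sample times."""
    return (math.log(f(t2)) - math.log(f(t1))) / (math.log(t2) - math.log(t1))

rates = {
    "with queries":       lambda t: t ** -1.0,             # O(t^-1)
    "error minimum":      lambda t: t ** (-2.0 / 3.0),     # O(t^-2/3)
    "proposed algorithm": lambda t: t ** (-4.0 / 5.0),     # O(t^-4/5)
    "elaborate/optimal":  lambda t: math.log(t) ** 2 / t,  # O((ln t)^2 t^-1)
}
for name, f in rates.items():
    print(f"{name}: slope ~ {loglog_slope(f):.3f}")
```

The (ln t)^2/t law shows an effective slope between -1 and -4/5 at finite t, which is why it sits between the query-assisted rate and the practical algorithm on a log-log learning curve.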
125
126
O'Kane D, Winther O. Learning to classify in large committee machines. Phys Rev E 1994; 50:3201-3209. PMID: 9962365. DOI: 10.1103/physreve.50.3201.
127
Derényi I, Geszti T, Györgyi G. Generalization in the programed teaching of a perceptron. Phys Rev E 1994; 50:3192-3200. PMID: 9962364. DOI: 10.1103/physreve.50.3192.
128
Opper M. Learning and generalization in a two-layer neural network: The role of the Vapnik-Chervonenkis dimension. Phys Rev Lett 1994; 72:2113-2116. PMID: 10055791. DOI: 10.1103/physrevlett.72.2113.
129
Barnard E. A model for nonpolynomial decrease in error rate with increasing sample size. IEEE Trans Neural Netw 1994; 5:994-997. DOI: 10.1109/72.329698.
130
Kang K, Oh JH, Kwon C, Park Y. Generalization in a two-layer neural network. Phys Rev E 1993; 48:4805-4809. PMID: 9961164. DOI: 10.1103/physreve.48.4805.
131
Parrondo JM. Generalization error in a self-similar committee machine. Phys Rev Lett 1993; 71:2355-2359. PMID: 10054659. DOI: 10.1103/physrevlett.71.2355.
132
Engel A. Systems that can learn from examples: Replica calculation of uniform convergence bounds for perceptrons. Phys Rev Lett 1993; 71:1772-1775. PMID: 10054494. DOI: 10.1103/physrevlett.71.1772.
133
Barkai N, Seung HS, Sompolinsky H. Scaling laws in learning of classification tasks. Phys Rev Lett 1993; 70:3167-3170. PMID: 10053792. DOI: 10.1103/physrevlett.70.3167.
134
Kwon C, Park Y, Oh J. Stability of the replica-symmetric solution for a perceptron learning from examples. Phys Rev E 1993; 47:3707-3711. PMID: 9960426. DOI: 10.1103/physreve.47.3707.
135
Bös S, Kinzel W, Opper M. Generalization ability of perceptrons with continuous outputs. Phys Rev E 1993; 47:1384-1391. PMID: 9960140. DOI: 10.1103/physreve.47.1384.
136
Heskes TM, Kappen B. On-line learning processes in artificial neural networks. 1993. DOI: 10.1016/s0924-6509(08)70038-2.
137
Amari S, Kurata K, Nagaoka H. Information geometry of Boltzmann machines. IEEE Trans Neural Netw 1992; 3:260-271. DOI: 10.1109/72.125867.