1. Sun T, Wang X, Li Z, Ding S. Feature-wise scaling and shifting: Improving the generalization capability of neural networks through capturing independent information of features. Neural Netw 2024;170:453-467. PMID: 38039683. DOI: 10.1016/j.neunet.2023.11.040.
Abstract
From the perspective of input features, information can be divided into independent information and correlation information. Current neural networks mainly concentrate on capturing correlation information through connection weight parameters, supplemented by bias parameters. This paper introduces feature-wise scaling and shifting (FwSS) into neural networks to capture the independent information of features, and proposes a new neural network, FwSSNet. In the network, a pair of scale and shift parameters is added before each input of each layer, and the bias is removed. The parameters are initialized to 1 and 0, respectively, and trained at separate learning rates to guarantee that independent and correlation information are fully captured. The learning rates of the FwSS parameters depend on the input data and on the training-speed ratios of adjacent FwSS and connection sublayers, while those of the weight parameters remain the same as in plain networks. Further, FwSS unifies the scaling and shifting operations in batch normalization (BN), and FwSSNet with BN is established by introducing a preprocessing layer. FwSS parameters, except those in the last layer of the network, can simply be trained at the same learning rate as the weight parameters. Experiments show that FwSS is generally helpful in improving the generalization capability of both fully connected and deep convolutional neural networks, and FwSSNets achieve higher accuracies on the UCI repository and CIFAR-10.
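The per-feature transform the abstract describes can be sketched in a few lines of numpy; this is a minimal reading of the idea, with the class name, toy dimensions, and weight initialization being our assumptions, and the paper's separate learning-rate scheme omitted:

```python
import numpy as np

rng = np.random.default_rng(0)

class FwSSLinear:
    """Toy FwSS sublayer: a per-feature scale (init 1) and shift (init 0)
    applied to the inputs of a bias-free linear layer."""

    def __init__(self, n_in, n_out):
        self.scale = np.ones(n_in)   # initialized to 1, as in the paper
        self.shift = np.zeros(n_in)  # initialized to 0
        # Connection weights only; the bias is removed in FwSSNet.
        self.W = rng.standard_normal((n_in, n_out)) * 0.1

    def forward(self, x):
        # Element-wise scaling and shifting before the weighted sum.
        return (x * self.scale + self.shift) @ self.W

layer = FwSSLinear(4, 2)
x = rng.standard_normal((3, 4))
y = layer.forward(x)
print(y.shape)  # at initialization this equals x @ layer.W exactly
```

Because the scales start at 1 and the shifts at 0, the network behaves exactly like a plain bias-free network at initialization; training then moves the FwSS parameters away from identity.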
Affiliation(s)
- Tongfeng Sun, Xiurui Wang, Zhongnian Li, Shifei Ding
- School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 221116, China; Mine Digitization Engineering Research Centre of Ministry of Education of the People's Republic of China, Xuzhou 221116, China
2. Anil Kumar C, Harish S, Ravi P, SVN M, Kumar BPP, Mohanavel V, Alyami NM, Priya SS, Asfaw AK. Lung Cancer Prediction from Text Datasets Using Machine Learning. Biomed Res Int 2022;2022:6254177. PMID: 35872862. PMCID: PMC9303121. DOI: 10.1155/2022/6254177.
Abstract
Lung cancer is the major cause of cancer-related death in this generation, and it is expected to remain so for the foreseeable future. Lung cancer is treatable if the symptoms of the disease are detected early. Current developments in computational intelligence make it possible to construct a sustainable prototype model for the treatment of lung cancer without negatively impacting the environment; by reducing the resources squandered and the work required for manual tasks, such a model saves both time and money. To optimise detection on the lung cancer dataset, a machine learning model based on support vector machines (SVMs) was used. Lung cancer patients are classified based on their symptoms using an SVM classifier, with the Python programming language used to implement the model. The effectiveness of our SVM model was evaluated against several different criteria, and several cancer datasets from the University of California, Irvine, repository were used to evaluate it. The favourable findings of this research will enable smart cities to deliver better healthcare to their citizens: patients with lung cancer can obtain real-time treatment in a cost-effective manner, with minimal effort and latency, from any location and at any time. The proposed model was compared with the existing SVM and SMOTE methods and achieves a 98.8% accuracy rate, outperforming them.
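The pipeline the abstract describes, an SVM classifier over tabular cancer data, can be sketched with scikit-learn. The bundled breast-cancer dataset stands in here for the UCI lung-cancer data (which ships with no ML library), and the RBF kernel, scaling step, and 70/30 split are our assumptions:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Stand-in data: sklearn's bundled breast-cancer dataset (the paper used
# lung-cancer data from the UCI repository, which is not bundled here).
X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=42)

# Scale features, then fit an RBF-kernel SVM classifier.
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
clf.fit(X_tr, y_tr)
acc = clf.score(X_te, y_te)
print(f"test accuracy: {acc:.3f}")
```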
Affiliation(s)
- C. Anil Kumar
- Department of Electronics and Communication Engineering, R. L. Jalappa Institute of Technology Doddaballapur, Bangalore, Karnataka 561203, India
- S. Harish
- Department of Electronics and Communication Engineering, R. L. Jalappa Institute of Technology Doddaballapur, Bangalore, Karnataka 561203, India
- Prabha Ravi
- Medical Electronics Engineering, Ramaiah Institute of Technology, Bangalore, Karnataka 560054, India
- Murthy SVN
- Department of Computer Science and Engineering, S J C Institute of Technology, Chikkaballapur, Karnataka 562101, India
- B. P. Pradeep Kumar
- Department of Electronics and Communication Engineering, HKBK College of Engineering, Bangalore, Karnataka 560045, India
- V. Mohanavel
- Centre for Materials Engineering and Regenerative Medicine, Bharath Institute of Higher Education and Research, Chennai 600073, Tamil Nadu, India
- Department of Mechanical Engineering, Chandigarh University, Mohali, 140413 Punjab, India
- Nouf M. Alyami
- Department of Zoology, College of Science, King Saud University, PO Box 2455, Riyadh 11451, Saudi Arabia
- S. Shanmuga Priya
- Department of Microbiology-Immunology, Northwestern University, Feinberg School of Medicine, Chicago, IL 60611, USA
- Amare Kebede Asfaw
- Department of Computer Science, Kombolcha Institute of Technology, Wollo University, Ethiopia
3. An effective integrated genetic programming and neural network model for electronic nose calibration of air pollution monitoring application. Neural Comput Appl 2022. DOI: 10.1007/s00521-022-07129-0.
4. Lyapunov stability-Dynamic Back Propagation-based comparative study of different types of functional link neural networks for the identification of nonlinear systems. Soft Comput 2020. DOI: 10.1007/s00500-019-04496-0.
5. Waheeb W, Ghazali R. A novel error-output recurrent neural network model for time series forecasting. Neural Comput Appl 2019. DOI: 10.1007/s00521-019-04474-5.
6. Lung cancer prediction using higher-order recurrent neural network based on glowworm swarm optimization. Neural Comput Appl 2018. DOI: 10.1007/s00521-018-3824-3.
7. Tsekouras GE, Trygonis V, Maniatopoulos A, Rigos A, Chatzipavlis A, Tsimikas J, Mitianoudis N, Velegrakis AF. A Hermite neural network incorporating artificial bee colony optimization to model shoreline realignment at a reef-fronted beach. Neurocomputing 2018. DOI: 10.1016/j.neucom.2017.07.070.
8. Waheeb W, Ghazali R, Hussain AJ. Dynamic ridge polynomial neural network with Lyapunov function for time series forecasting. Appl Intell 2017. DOI: 10.1007/s10489-017-1036-7.
9. A new sparse model for traffic sign classification using soft histogram of oriented gradients. Appl Soft Comput 2017. DOI: 10.1016/j.asoc.2016.12.037.
10. Waheeb W, Ghazali R, Herawan T. Ridge Polynomial Neural Network with Error Feedback for Time Series Forecasting. PLoS One 2016;11:e0167248. PMID: 27959927. PMCID: PMC5154507. DOI: 10.1371/journal.pone.0167248.
Abstract
Time series forecasting has gained much attention due to its many practical applications. A higher-order neural network with recurrent feedback is a powerful technique that has been used successfully for time series forecasting: it maintains fast learning and the ability to learn the dynamics of the time series over time. Network output feedback is the most common recurrent feedback in recurrent neural network models, but little attention has been paid to using network error feedback instead. In this study, we propose a novel model, called the Ridge Polynomial Neural Network with Error Feedback (RPNN-EF), that incorporates higher-order terms, recurrence and error feedback. To evaluate the performance of RPNN-EF, we used four univariate time series with different forecasting horizons, namely star brightness, monthly smoothed sunspot numbers, the daily Euro/Dollar exchange rate, and the Mackey-Glass time-delay differential equation. We compared the forecasting performance of RPNN-EF with the ordinary Ridge Polynomial Neural Network (RPNN) and the Dynamic Ridge Polynomial Neural Network (DRPNN). Simulation results showed an average 23.34% improvement in Root Mean Square Error (RMSE) with respect to RPNN and an average 10.74% improvement with respect to DRPNN. This means that using network errors during training helps enhance the overall forecasting performance of the network.
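The core structure behind these models, a ridge polynomial network summing pi-sigma blocks of increasing degree, can be sketched as below. Dimensions, initialization, and the tanh output are our assumptions; error feedback, as the abstract describes, would simply extend the input vector with the previous forecast error:

```python
import numpy as np

rng = np.random.default_rng(1)

def rpnn_forward(x, weight_groups):
    """Ridge polynomial forward pass: a sum of pi-sigma blocks, where the
    k-th block is the product of k affine 'ridge' functions w @ x + b."""
    total = 0.0
    for group in weight_groups:        # group k holds k (w, b) pairs
        prod = 1.0
        for w, b in group:
            prod *= w @ x + b
        total += prod
    return np.tanh(total)              # bounded output for forecasting use

# A degree-3 network on a 2-dimensional input. Error feedback (RPNN-EF)
# would append the previous prediction error to x before this pass.
groups = [[(rng.standard_normal(2), 0.1) for _ in range(k)] for k in range(1, 4)]
x = np.array([0.5, -0.2])
out = rpnn_forward(x, groups)
print(out)
```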
Affiliation(s)
- Waddah Waheeb
- Faculty of Computer Science and Information Technology, Universiti Tun Hussein Onn Malaysia, Batu Pahat, Johor, Malaysia
- Computer Science Department, Hodeidah University, Hodeidah, Yemen
- Rozaida Ghazali
- Faculty of Computer Science and Information Technology, Universiti Tun Hussein Onn Malaysia, Batu Pahat, Johor, Malaysia
- Tutut Herawan
- Department of Information Systems, University of Malaya, Pantai Valley, Kuala Lumpur, Malaysia
- Department of Computer Science, Universitas Teknologi Yogyakarta, Yogyakarta, Indonesia
- AMCS Research Center, Yogyakarta, Indonesia
11. Predicting physical time series using dynamic ridge polynomial neural networks. PLoS One 2014;9:e105766. PMID: 25157950. PMCID: PMC4144909. DOI: 10.1371/journal.pone.0105766.
Abstract
Forecasting naturally occurring phenomena is a common problem in many domains of science, and it has been addressed and investigated by many scientists. The importance of time series prediction stems from its wide range of applications, including control systems, engineering processes, environmental systems and economics. From knowledge of some aspects of the previous behaviour of the system, the aim of the prediction process is to determine or predict its future behaviour. In this paper, we consider a novel application of a higher order polynomial neural network architecture, called the Dynamic Ridge Polynomial Neural Network, that combines the properties of higher order and recurrent neural networks, to the prediction of physical time series. In this study, four types of signals have been used: the Lorenz attractor, the mean value of the AE index, the sunspot number, and heat wave temperature. The simulation results showed good improvements in terms of the signal-to-noise ratio in comparison to a number of higher order and feedforward neural networks used as benchmark techniques.
12. Yu X, Tang L, Chen Q, Xu C. Monotonicity and convergence of asynchronous update gradient method for ridge polynomial neural network. Neurocomputing 2014. DOI: 10.1016/j.neucom.2013.09.015.
13. Fallahnezhad M, Yousefi H. Needle Insertion Force Modeling using Genetic Programming Polynomial Higher Order Neural Network. Robotics 2013. DOI: 10.4018/978-1-4666-4607-0.ch031.
Abstract
Precise insertion of a medical needle as an end-effector of a robotic or computer-aided system into biological tissue is an important issue and should be considered in different operations, such as brain biopsy, prostate brachytherapy, and percutaneous therapies. Proper understanding of the whole procedure leads to better performance by an operator or system. In this chapter, the authors use a 0.98 mm diameter needle with real-time recording of the force, displacement, and velocity of the needle through biological tissue during in-vitro insertions. Using constant-velocity experiments from 5 mm/min up to 300 mm/min, the data set for the force-displacement graph of insertion was gathered. Tissue deformation with a small puncture and constant-velocity penetration are the first two phases of the needle insertion process. The direct effects of different parameters, and their correlations during the process, are modeled using a polynomial neural network. The authors develop different networks of 2nd and 3rd order to model the first two phases of insertion separately. Modeling accuracies were 98% and 86% in phases 1 and 2, respectively.
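As a hypothetical stand-in for such a phase-1 model (the chapter's networks are derived by genetic programming; the data below are entirely synthetic), a 2nd-order polynomial fit to a simulated force-displacement curve looks like this:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical phase-1 data: force rises nonlinearly with displacement
# until the puncture point (all values synthetic, not from the chapter).
d = np.linspace(0.0, 10.0, 50)                            # displacement, mm
f = 0.02 * d**2 + 0.1 * d + rng.normal(0, 0.05, d.size)   # force, N

# A 2nd-order polynomial model, echoing the low-order models used per phase.
coeffs = np.polyfit(d, f, deg=2)
pred = np.polyval(coeffs, d)

ss_res = np.sum((f - pred) ** 2)
ss_tot = np.sum((f - f.mean()) ** 2)
r2 = 1.0 - ss_res / ss_tot
print(f"R^2 = {r2:.3f}")
```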
Affiliation(s)
- Hashem Yousefi
- Amirkabir University of Technology (Tehran Polytechnic), Iran
14. Yu X, Chen Q. Convergence of gradient method with penalty for Ridge Polynomial neural network. Neurocomputing 2012. DOI: 10.1016/j.neucom.2012.05.022.
15. Kwok TY, Yeung DY. Constructive algorithms for structure learning in feedforward neural networks for regression problems. IEEE Trans Neural Netw 1997;8:630-45. PMID: 18255666. DOI: 10.1109/72.572102.
Abstract
In this survey paper, we review the constructive algorithms for structure learning in feedforward neural networks for regression problems. The basic idea is to start with a small network, then add hidden units and weights incrementally until a satisfactory solution is found. By formulating the whole problem as a state-space search, we first describe the general issues in constructive algorithms, with special emphasis on the search strategy. A taxonomy, based on the differences in the state transition mapping, the training algorithm, and the network architecture, is then presented.
Affiliation(s)
- T Y Kwok
- Dept. of Comput. Sci., Hong Kong Univ. of Sci. and Technol., Kowloon
16. Yu X, Deng F. Convergence of gradient method for training ridge polynomial neural network. Neural Comput Appl 2012. DOI: 10.1007/s00521-012-0915-4.
17. Tripathi BK, Kalra PK. On Efficient Learning Machine With Root-Power Mean Neuron in Complex Domain. IEEE Trans Neural Netw 2011;22:727-38. DOI: 10.1109/tnn.2011.2115251.
18. Dehuri S, Cho SB. A comprehensive survey on functional link neural networks and an adaptive PSO–BP learning for CFLNN. Neural Comput Appl 2009. DOI: 10.1007/s00521-009-0288-5.
19. Non-stationary and stationary prediction of financial time series using dynamic ridge polynomial neural network. Neurocomputing 2009. DOI: 10.1016/j.neucom.2008.12.005.
20. Xiong Y, Wu W, Kang X, Zhang C. Training Pi-Sigma Network by Online Gradient Algorithm with Penalty for Small Weight Update. Neural Comput 2007;19:3356-68. DOI: 10.1162/neco.2007.19.12.3356.
Abstract
A pi-sigma network is a class of feedforward neural networks with product units in the output layer. An online gradient algorithm is the simplest and most often used training method for feedforward neural networks. A problem arises, however, when the online gradient algorithm is used for pi-sigma networks: the update increment of the weights may become very small, especially early in training, resulting in very slow convergence. To overcome this difficulty, we introduce an adaptive penalty term into the error function, so as to increase the magnitude of the update increment of the weights when it is too small. This strategy brings about faster convergence, as shown by the numerical experiments carried out in this letter.
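A sketch of the pi-sigma forward pass, and of why small summing-unit outputs shrink the gradient (the effect the adaptive penalty is designed to counteract). Sizes and initialization are ours, and the penalty term itself is not implemented here:

```python
import numpy as np

rng = np.random.default_rng(3)

def pi_sigma(x, W, b):
    """Pi-sigma forward pass: K summing units h_k = w_k @ x + b_k feed a
    single product unit, so the output is the product of the K sums."""
    return np.prod(W @ x + b)

K, n = 3, 4
W = rng.standard_normal((K, n)) * 0.5
b = np.zeros(K)
x = rng.standard_normal(n)
y = pi_sigma(x, W, b)

# The gradient w.r.t. w_k is x times the product of the *other* sums, so
# when the sums are small the weight update shrinks -- the slow-convergence
# issue the paper's adaptive penalty counteracts.
h = W @ x + b
grad_w0 = x * np.prod(h[1:])
print(y, grad_w0)
```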
Affiliation(s)
- Yan Xiong
- Department of Applied Mathematics, Dalian University of Technology, Dalian 116024, People's Republic of China, and Faculty of Science, University of Science and Technology Liaoning, Anshan 114051, People's Republic of China
- Wei Wu
- Department of Applied Mathematics, Dalian University of Technology, Dalian 116024, People's Republic of China
- Xidai Kang
- Department of Applied Mathematics, Dalian University of Technology, Dalian 116024, People's Republic of China
- Chao Zhang
- Department of Applied Mathematics, Dalian University of Technology, Dalian 116024, People's Republic of China
21. Ghazali R, Hussain AJ, Liatsis P, Tawfik H. The application of ridge polynomial neural network to multi-step ahead financial time series prediction. Neural Comput Appl 2007. DOI: 10.1007/s00521-007-0132-8.
22. Shenvi N, Geremia JM, Rabitz H. Efficient chemical kinetic modeling through neural network maps. J Chem Phys 2004;120:9942-51. PMID: 15268013. DOI: 10.1063/1.1718305.
Abstract
An approach to modeling nonlinear chemical kinetics using neural networks is introduced. It is found that neural networks based on a simple multivariate polynomial architecture are useful in approximating a wide variety of chemical kinetic systems. The accuracy and efficiency of these ridge polynomial networks (RPNs) are demonstrated by modeling the kinetics of H(2) bromination, formaldehyde oxidation, and H(2)+O(2) combustion. RPN kinetic modeling has a broad range of applications, including kinetic parameter inversion, simulation of reactor dynamics, and atmospheric modeling.
Affiliation(s)
- Neil Shenvi
- Department of Chemistry, Princeton University, Princeton, NJ 08544, USA
23. Romero E, Alquézar R. A sequential algorithm for feed-forward neural networks with optimal coefficients and interacting frequencies. Neurocomputing 2006. DOI: 10.1016/j.neucom.2005.07.006.
24. Yadav RN, Kalra PK, John J. Neural network learning with generalized-mean based neuron model. Soft Comput 2005. DOI: 10.1007/s00500-005-0479-7.
25. Toh KA, Tran QL, Srinivasan D. Benchmarking a reduced multivariate polynomial pattern classifier. IEEE Trans Pattern Anal Mach Intell 2004;26:740-755. PMID: 18579935. DOI: 10.1109/tpami.2004.3.
Abstract
A novel method using a reduced multivariate polynomial model has been developed for biometric decision fusion, where simplicity and ease of use can be a concern. However, much to our surprise, the reduced model was found to have good classification accuracy on several commonly used data sets from the Web. In this paper, we extend the single-output model to a multiple-outputs model to handle multiple-class problems. The method is particularly suitable for problems with a small number of features and a large number of examples. The basic component of this polynomial model boils down to the construction of new pattern features, which are sums of the original features, and the combination of these new and original features using power and product terms. A linear regularized least-squares predictor is then built using these constructed features. The number of constructed feature terms varies linearly with the order of the polynomial, instead of following a power law as in the case of full multivariate polynomials. The method is simple, as it amounts to only a few lines of Matlab code. We perform extensive experiments on this reduced model using 42 data sets. Our results compare remarkably well with the best reported results of several commonly used algorithms from the literature. Both the classification accuracy and the efficiency of this reduced model are reported.
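The construction described above can be sketched as follows; this is our simplified reading of the reduced model (exact term set and constants may differ from the paper's), showing the linear growth in feature count and the regularized least-squares fit:

```python
import numpy as np

rng = np.random.default_rng(4)

def rm_features(X, order=2):
    """Reduced-multivariate-polynomial feature map (our simplified reading):
    powers of the feature sum, plus each feature times those powers, so the
    number of terms grows linearly with the order, not combinatorially."""
    s = X.sum(axis=1, keepdims=True)
    cols = [np.ones_like(s), X]
    for k in range(1, order + 1):
        cols.append(s ** k)
        if k > 1:
            cols.append(X * s ** (k - 1))
    return np.hstack(cols)

X = rng.standard_normal((200, 3))
# A target that lies exactly in the constructed feature span.
t = 1.0 + 2.0 * X[:, 0] + 0.5 * X.sum(axis=1) ** 2

Phi = rm_features(X, order=2)
lam = 1e-6  # linear regularized least squares on the constructed features
w = np.linalg.solve(Phi.T @ Phi + lam * np.eye(Phi.shape[1]), Phi.T @ t)
err = np.abs(Phi @ w - t).max()
print(f"max fit error: {err:.2e}")
```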
Affiliation(s)
- Kar-Ann Toh
- Institute for Infocomm Research, 21 Heng Mui Keng Terrace, Singapore.
26. Toh KA, Yau WY. Combination of Hyperbolic Functions for Multimodal Biometrics Data Fusion. IEEE Trans Syst Man Cybern B Cybern 2004;34:1196-209. PMID: 15376864. DOI: 10.1109/tsmcb.2003.821868.
Abstract
In this paper, we treat the problem of combining fingerprint and speech biometric decisions as a classifier fusion problem. By exploiting the specialist capabilities of each classifier, a combined classifier may yield results which would not be possible with a single classifier. The feedforward neural network provides a natural choice for such data fusion, as it has been shown to be a universal approximator. However, the training process remains largely a trial-and-error effort, since no learning algorithm can guarantee convergence to an optimal solution within finitely many iterations. In this work, we propose a network model that generates different combinations of hyperbolic functions to achieve certain approximation and classification properties, circumventing the iterative training problem seen in neural network learning. In many decision data fusion applications, since the individual classifiers or estimators to be combined will have attained a certain level of classification or approximation accuracy, this hyperbolic functions network can be used to combine them, taking their decision outputs as inputs to the network. The proposed hyperbolic functions network model is first applied to a function approximation problem to illustrate its approximation capability. This is followed by case studies on pattern classification problems. The model is finally applied to combine fingerprint and speaker verification decisions, showing either better or comparable results with respect to several commonly used methods.
Affiliation(s)
- Kar-Ann Toh
- Institute for Infocomm Research, Singapore 119613.
27.

28.
Abstract
In a great variety of neuron models, neural inputs are combined using the summing operation. We introduce the concept of multiplicative neural networks that contain units that multiply their inputs instead of summing them and thus allow inputs to interact nonlinearly. The class of multiplicative neural networks comprises such widely known and well-studied network types as higher-order networks and product unit networks. We investigate the complexity of computing and learning for multiplicative neural networks. In particular, we derive upper and lower bounds on the Vapnik-Chervonenkis (VC) dimension and the pseudo-dimension for various types of networks with multiplicative units. As the most general case, we consider feedforward networks consisting of product and sigmoidal units, showing that their pseudo-dimension is bounded from above by a polynomial with the same order of magnitude as the currently best-known bound for purely sigmoidal networks. Moreover, we show that this bound holds even when the unit type, product or sigmoidal, may be learned. Crucial for these results are calculations of solution set components bounds for new network classes. As to lower bounds, we construct product unit networks of fixed depth with super-linear VC dimension. For sigmoidal networks of higher order, we establish polynomial bounds that, in contrast to previous results, do not involve any restriction of the network order. We further consider various classes of higher-order units, also known as sigma-pi units, that are characterized by connectivity constraints. In terms of these, we derive some asymptotically tight bounds. Multiplication plays an important role in both neural modeling of biological behavior and computing and learning with artificial neural networks. We briefly survey research in biology and in applications where multiplication is considered an essential computational element. The results we present here provide new tools for assessing the impact of multiplication on the computational power and the learning capabilities of neural networks.
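The product units discussed above compute a multiplicative combination of their inputs. A minimal sketch of a product-unit/sigmoidal network (the weights are arbitrary illustrative values, not from the paper):

```python
import numpy as np

def product_unit(x, w):
    """A product unit multiplies its inputs raised to learnable exponents,
    i.e. prod_i x_i ** w_i (equivalently exp(w @ log x) for positive x)."""
    return np.prod(x ** w)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def mult_net(x, W_prod, v, b):
    """Two product units feeding one sigmoidal output unit -- an instance of
    the product/sigmoidal architecture analyzed in the paper."""
    hidden = np.array([product_unit(x, w) for w in W_prod])
    return sigmoid(v @ hidden + b)

x = np.array([2.0, 4.0])
print(product_unit(x, np.array([1.0, 0.5])))  # 2**1 * 4**0.5 = 4.0
out = mult_net(x, np.array([[1.0, 0.5], [0.5, 0.0]]), np.array([0.3, -0.2]), 0.1)
print(out)
```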
Affiliation(s)
- Michael Schmitt
- Lehrstuhl Mathematik und Informatik, Fakultät für Mathematik, Ruhr-Universität Bochum, D-44780 Bochum, Germany.
29.

30. Ramamurti V, Ghosh J. Structurally adaptive modular networks for nonstationary environments. IEEE Trans Neural Netw 1999;10:152-60. DOI: 10.1109/72.737501.
31.
Abstract
We introduce a constructive, incremental learning system for regression problems that models data by means of spatially localized linear models. In contrast to other approaches, the size and shape of the receptive field of each locally linear model, as well as the parameters of the locally linear model itself, are learned independently, that is, without the need for competition or any other kind of communication. Independent learning is accomplished by incrementally minimizing a weighted local cross-validation error. As a result, we obtain a learning system that can allocate resources as needed while dealing with the bias-variance dilemma in a principled way. The spatial localization of the linear models increases robustness toward negative interference. Our learning system can be interpreted as a nonparametric adaptive bandwidth smoother, as a mixture of experts where the experts are trained in isolation, and as a learning system that profits from combining independent expert knowledge on the same problem. This article illustrates the potential learning capabilities of purely local learning and offers an interesting and powerful approach to learning with receptive fields.
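A sketch of the receptive-field idea under our own choices of centers and bandwidth (the paper learns each field's size and shape independently via weighted local cross-validation; here they are fixed for brevity):

```python
import numpy as np

rng = np.random.default_rng(5)

# Each local model owns a Gaussian receptive field and a weighted local
# linear fit; predictions blend the models by their activations.
centers = np.linspace(-3, 3, 7)
bandwidth = 0.5

X = rng.uniform(-3, 3, 300)
y = np.sin(X) + rng.normal(0, 0.05, X.size)

models = []
for c in centers:
    w = np.exp(-0.5 * ((X - c) / bandwidth) ** 2)   # receptive-field weights
    A = np.stack([np.ones_like(X), X - c], axis=1)  # local affine basis
    beta = np.linalg.solve(A.T @ (w[:, None] * A), A.T @ (w * y))
    models.append((c, beta))

def predict(x):
    acts = np.array([np.exp(-0.5 * ((x - c) / bandwidth) ** 2)
                     for c, _ in models])
    preds = np.array([b0 + b1 * (x - c) for c, (b0, b1) in models])
    return (acts * preds).sum() / acts.sum()

err = max(abs(predict(x) - np.sin(x)) for x in np.linspace(-2.5, 2.5, 50))
print(f"max error on [-2.5, 2.5]: {err:.3f}")
```

Because each local fit touches only the data its receptive field weights highly, the models can be trained in isolation, which is the independence property the abstract emphasizes.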
Affiliation(s)
- S Schaal
- University of Southern California, Department of Computer Science, Los Angeles CA, US, HEDCO Neuroscience Building 103, 90089.
32. Hush D, Horne B. Efficient algorithms for function approximation with piecewise linear sigmoidal networks. IEEE Trans Neural Netw 1998;9:1129-41. DOI: 10.1109/72.728357.
33. Stiles B, Sandberg I, Ghosh J. Complete memory structures for approximating nonlinear discrete-time mappings. IEEE Trans Neural Netw 1997;8:1397-409. DOI: 10.1109/72.641463.