1
|
Kaveh M, Mesgari MS. Application of Meta-Heuristic Algorithms for Training Neural Networks and Deep Learning Architectures: A Comprehensive Review. Neural Process Lett 2022; 55:1-104. [PMID: 36339645 PMCID: PMC9628382 DOI: 10.1007/s11063-022-11055-6] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Accepted: 10/11/2022] [Indexed: 12/02/2022]
Abstract
The learning process and hyper-parameter optimization of artificial neural networks (ANNs) and deep learning (DL) architectures are considered among the most challenging machine learning problems. Several past studies have used gradient-based backpropagation methods to train DL architectures. However, gradient-based methods have major drawbacks, such as getting stuck in local minima of multi-objective cost functions, long execution times due to computing gradient information over thousands of iterations, and the requirement that cost functions be continuous. Since training ANNs and DLs is an NP-hard optimization problem, optimizing their structure and parameters with meta-heuristic (MH) algorithms has attracted considerable attention. MH algorithms can accurately formulate the optimal estimation of DL components (such as hyper-parameters, weights, number of layers, number of neurons, learning rate, etc.). This paper provides a comprehensive review of the optimization of ANNs and DLs using MH algorithms. We review the latest developments in the use of MH algorithms in DL and ANN methods, present their advantages and disadvantages, and point out some research directions to fill the gaps between MHs and DL methods. Moreover, we explain that evolutionary hybrid architectures still have limited applicability in the literature. This paper also classifies the latest MH algorithms in the literature to demonstrate their effectiveness in DL and ANN training for various applications. Most researchers tend to develop novel hybrid algorithms by combining MHs to optimize the hyper-parameters of DLs and ANNs. The development of hybrid MHs helps improve algorithm performance and makes the algorithms capable of solving complex optimization problems. In general, an MH performs well only if it achieves a suitable trade-off between exploration and exploitation. Hence, this paper summarizes various MH algorithms in terms of convergence trend, exploration, exploitation, and the ability to avoid local minima. The integration of MH with DLs is expected to accelerate the training process in the coming few years. However, relevant publications in this area are still rare.
Affiliation(s)
- Mehrdad Kaveh
- Department of Geodesy and Geomatics, K. N. Toosi University of Technology, Tehran, 19967-15433 Iran
- Mohammad Saadi Mesgari
- Department of Geodesy and Geomatics, K. N. Toosi University of Technology, Tehran, 19967-15433 Iran
|
2
|
Evolutionary neural networks for deep learning: a review. Int J Mach Learn Cyb 2022. [DOI: 10.1007/s13042-022-01578-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
|
3
|
Saha P, Nath P, Middya AI, Roy S. Improving temporal predictions through time-series labeling using matrix profile and motifs. Neural Comput Appl 2022. [DOI: 10.1007/s00521-021-06744-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 10/19/2022]
|
4
|
Rabiei M, Khorshidi A, Soltani-Nabipour J. Production of Yttrium-86 radioisotope using genetic algorithm and neural network. Biomed Signal Process Control 2021. [DOI: 10.1016/j.bspc.2021.102449] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 01/07/2023]
|
5
|
Mathematical Modelling of Biosensing Platforms Applied for Environmental Monitoring. Chemosensors 2021. [DOI: 10.3390/chemosensors9030050] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/17/2022]
Abstract
In recent years, mathematical modelling has been widely integrated into different scientific fields. In general, modelling is used to obtain new insights and more quantitative and qualitative information about systems through programming, matrix manipulation, algorithm design, and the plotting of functions and data. Researchers have been inspired by these techniques to explore several methods for solving many problems with high precision. In this direction, simulation and modelling have been employed to develop sensitive and selective detection tools in different fields, including environmental control. Emerging pollutants such as pesticides, heavy metals and pharmaceuticals are contaminating water resources, thus threatening wildlife. As a consequence, various biosensors using modelling have been reported in the literature for efficient environmental monitoring. This review paper gives an overview of recent biosensors inspired by modelling and applied to environmental monitoring. Moreover, the level of success and the analytical performance of each modelling-based biosensor are discussed. Finally, current challenges in this field are highlighted.
|
6
|
A Comparative Assessment of Geostatistical, Machine Learning, and Hybrid Approaches for Mapping Topsoil Organic Carbon Content. ISPRS International Journal of Geo-Information 2019. [DOI: 10.3390/ijgi8040174] [Citation(s) in RCA: 38] [Impact Index Per Article: 7.6] [Indexed: 11/16/2022]
Abstract
Accurate digital soil mapping (DSM) of soil organic carbon (SOC) is still a challenging subject because of its spatial variability and dependency. This study compares six typical methods from three types of DSM techniques for SOC mapping in an area surrounding Changchun in Northeast China. The methods include ordinary kriging (OK) and geographically weighted regression (GWR) from geostatistics, support vector machines for regression (SVR) and artificial neural networks (ANN) from machine learning, and geographically weighted regression kriging (GWRK) and artificial neural networks kriging (ANNK) from hybrid approaches. The hybrid approaches combine GWR from geostatistics and ANN from machine learning, respectively, with the estimation of residuals by ordinary kriging. Environmental variables, including soil properties, climatic, topographic, and remote sensing data, were used for modeling. The mapping results of SOC content from different models were validated against independent testing data based on values of the mean error, root mean squared error and coefficient of determination. The prediction maps depicted the spatial variation and patterns of SOC content in the study area. The results showed that the accuracy ranking of the compared methods, in decreasing order, was ANNK, SVR, ANN, GWRK, OK, and GWR. Two-step hybrid approaches performed better than the corresponding individual models, and non-linear models performed better than the linear models. When uncertainty and efficiency are considered, ML and two-step approaches are more suitable than geostatistics in regional landscapes with high heterogeneity. The study concludes that ANNK is a promising approach for mapping SOC content at a local scale.
|
7
|
Soltoggio A, Stanley KO, Risi S. Born to learn: The inspiration, progress, and future of evolved plastic artificial neural networks. Neural Netw 2018; 108:48-67. [PMID: 30142505 DOI: 10.1016/j.neunet.2018.07.013] [Citation(s) in RCA: 26] [Impact Index Per Article: 4.3] [Received: 10/15/2017] [Revised: 07/24/2018] [Accepted: 07/24/2018] [Indexed: 02/07/2023]
Abstract
Biological neural networks are systems of extraordinary computational capabilities shaped by evolution, development, and lifelong learning. The interplay of these elements leads to the emergence of biological intelligence. Inspired by such intricate natural phenomena, Evolved Plastic Artificial Neural Networks (EPANNs) employ simulated evolution in-silico to breed plastic neural networks with the aim to autonomously design and create learning systems. EPANN experiments evolve networks that include both innate properties and the ability to change and learn in response to experiences in different environments and problem domains. EPANNs' aims include autonomously creating learning systems, bootstrapping learning from scratch, recovering performance in unseen conditions, testing the computational advantages of particular neural components, and deriving hypotheses on the emergence of biological learning. Thus, EPANNs may include a large variety of different neuron types and dynamics, network architectures, plasticity rules, and other factors. While EPANNs have seen considerable progress over the last two decades, current scientific and technological advances in artificial neural networks are setting the conditions for radically new approaches and results. Exploiting the increased availability of computational resources and of simulation environments, the often challenging task of hand-designing learning neural networks could be replaced by more autonomous and creative processes. This paper brings together a variety of inspiring ideas that define the field of EPANNs. The main methods and results are reviewed. Finally, new opportunities and possible developments are presented.
Affiliation(s)
- Andrea Soltoggio
- Department of Computer Science, Loughborough University, LE11 3TU, Loughborough, UK.
- Kenneth O Stanley
- Department of Computer Science, University of Central Florida, Orlando, FL, USA.
|
8
|
Prieto A, Prieto B, Ortigosa EM, Ros E, Pelayo F, Ortega J, Rojas I. Neural networks: An overview of early research, current frameworks and new challenges. Neurocomputing 2016. [DOI: 10.1016/j.neucom.2016.06.014] [Citation(s) in RCA: 161] [Impact Index Per Article: 20.1] [Indexed: 01/24/2023]
|
9
|
Support vector machine and artificial neural network to model soil pollution: a case study in Semnan Province, Iran. Neural Comput Appl 2016. [DOI: 10.1007/s00521-016-2231-x] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Indexed: 12/07/2022]
|
10
|
|
11
|
Han M, Xu M, Liu X, Wang X. Online multivariate time series prediction using SCKF-γESN model. Neurocomputing 2015. [DOI: 10.1016/j.neucom.2014.06.057] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Indexed: 10/25/2022]
|
12
|
de Lima TPF, da Silva AJ, Ludermir TB, de Oliveira WR. An automatic methodology for construction of multi-classifier systems based on the combination of selection and fusion. Progress in Artificial Intelligence 2014. [DOI: 10.1007/s13748-014-0053-6] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/29/2022]
|
13
|
Mueller AV, Hemond HF. Extended artificial neural networks: Incorporation of a priori chemical knowledge enables use of ion selective electrodes for in-situ measurement of ions at environmentally relevant levels. Talanta 2013; 117:112-8. [DOI: 10.1016/j.talanta.2013.08.045] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.9] [Received: 07/01/2013] [Revised: 08/26/2013] [Accepted: 08/27/2013] [Indexed: 10/26/2022]
|
14
|
Donate JP, Cortez P, Sánchez GG, de Miguel AS. Time series forecasting using a weighted cross-validation evolutionary artificial neural network ensemble. Neurocomputing 2013. [DOI: 10.1016/j.neucom.2012.02.053] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.9] [Indexed: 10/27/2022]
|
15
|
Chavoshi S, Azmin Sulaiman WN, Saghafian B, Bin Sulaiman MN, Manaf LA. Regionalization by fuzzy expert system based approach optimized by genetic algorithm. Journal of Hydrology 2013; 486:271-280. [DOI: 10.1016/j.jhydrol.2013.01.033] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 09/02/2023]
|
16
|
Comparison of multi-objective evolutionary neural network, adaptive neuro-fuzzy inference system and bootstrap-based neural network for flood forecasting. Neural Comput Appl 2013. [DOI: 10.1007/s00521-013-1344-8] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.2] [Indexed: 10/27/2022]
|
17
|
Glezakos TJ, Tsiligiridis TA, Yialouris CP. Piecewise evolutionary segmentation for feature extraction in time series models. Neural Comput Appl 2012. [DOI: 10.1007/s00521-012-1212-y] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 11/29/2022]
|
18
|
Tayefeh Mahmoudi M, Taghiyareh F, Forouzideh N, Lucas C. Evolving artificial neural network structure using grammar encoding and colonial competitive algorithm. Neural Comput Appl 2012. [DOI: 10.1007/s00521-012-0905-6] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Indexed: 10/28/2022]
|
19
|
Donate JP, Sánchez GG, de Miguel AS. Time series forecasting. A comparative study between an evolving artificial neural networks system and statistical methods. Int J Artif Intell T 2012. [DOI: 10.1142/s0218213011000462] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.1] [Indexed: 11/18/2022]
Abstract
Accurate time series forecasting is important for showing how the past continues to affect the future and for planning our day-to-day activities. In recent years, a large literature has developed on the use of evolving artificial neural networks (EANN) in many forecasting applications. Evolving neural networks are particularly appealing because of their ability to model an unspecified non-linear relationship between time series variables. This work presents a new approach based on a previous Automatic Design of Artificial Neural Networks (ADANN) system applied to time series forecasting. The automatic process of designing artificial neural networks is carried out by a genetic algorithm (GA). To obtain accurate forecasts, the new methods involve shuffling the training and validation patterns obtained from the time series values and improving the fitness function used in the global learning process (i.e. the GA) by means of a new pattern set called validation II, in addition to the two used until now (i.e. training and validation). The objective of this study is to improve the final forecast and obtain an accurate system. In this paper, we also compare the forecasting ability of the ARIMA approach, evolving artificial neural networks (ADANN), the unobserved components model (UCM) and a forecasting tool called Forecast Pro, using six benchmark time series.
Affiliation(s)
- Juan Peralta Donate
- Computer Science Department, Carlos III University, Av Universidad 30, Leganes, Madrid 28911, Spain
- German Gutierrez Sanchez
- Computer Science Department, Carlos III University, Av Universidad 30, Leganes, Madrid 28911, Spain
|
20
|
Evaluation of effect of blast design parameters on flyrock using artificial neural networks. Neural Comput Appl 2012. [DOI: 10.1007/s00521-012-0917-2] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.9] [Indexed: 10/28/2022]
|
21
|
Aran O, Yildiz OT, Alpaydin E. An incremental framework based on cross-validation for estimating the architecture of a multilayer perceptron. Int J Pattern Recogn 2011. [DOI: 10.1142/s0218001409007132] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.2] [Indexed: 01/17/2023]
Abstract
We define the problem of optimizing the architecture of a multilayer perceptron (MLP) as a state space search and propose the MOST (Multiple Operators using Statistical Tests) framework that incrementally modifies the structure and checks for improvement using cross-validation. We consider five variants that implement forward/backward search, using single/multiple operators and searching depth-first/breadth-first. On 44 classification and 30 regression datasets, we exhaustively search for the optimal and evaluate the goodness based on: (1) Order, the accuracy with respect to the optimal and (2) Rank, the computational complexity. We check for the effect of two resampling methods (5 × 2, ten-fold cv), four statistical tests (5 × 2 cv t, ten-fold cv t, Wilcoxon, sign) and two corrections for multiple comparisons (Bonferroni, Holm). We also compare with Dynamic Node Creation (DNC) and Cascade Correlation (CC). Our results show that: (1) On most datasets, networks with few hidden units are optimal, (2) forward searching finds simpler architectures, (3) variants using single node additions (deletions) generally stop early and get stuck in simple (complex) networks, (4) choosing the best of multiple operators finds networks closer to the optimal, (5) MOST variants generally find simpler networks having lower or comparable error rates than DNC and CC.
Affiliation(s)
- Oya Aran
- Department of Computer Engineering, Boğaziçi University, TR-34342, Istanbul, Turkey
- Olcay Taner Yildiz
- Department of Computer Engineering, Boğaziçi University, TR-34342, Istanbul, Turkey
- Ethem Alpaydin
- Department of Computer Engineering, Boğaziçi University, TR-34342, Istanbul, Turkey
|
22
|
|
23
|
A multi-objective memetic and hybrid methodology for optimizing the parameters and performance of artificial neural networks. Neurocomputing 2010. [DOI: 10.1016/j.neucom.2009.11.007] [Citation(s) in RCA: 56] [Impact Index Per Article: 4.0] [Indexed: 01/17/2023]
|
24
|
Glezakos TJ, Tsiligiridis TA, Iliadis LS, Yialouris CP, Maris FP, Ferentinos KP. Feature extraction for time-series data: An artificial neural network evolutionary training model for the management of mountainous watersheds. Neurocomputing 2009. [DOI: 10.1016/j.neucom.2008.08.024] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Indexed: 11/26/2022]
|
25
|
|
26
|
Speeding up the scaled conjugate gradient algorithm and its application in neuro-fuzzy classifier training. Soft Comput 2009. [DOI: 10.1007/s00500-009-0410-8] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.3] [Indexed: 10/21/2022]
|
27
|
|
28
|
An Evolutionary Approach for Tuning Artificial Neural Network Parameters. Lecture Notes in Computer Science 2009. [DOI: 10.1007/978-3-540-87656-4_20] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 01/02/2023]
|
29
|
|
30
|
|
31
|
|
32
|
Maqsood I, Abraham A. Weather analysis using ensemble of connectionist learning paradigms. Appl Soft Comput 2007. [DOI: 10.1016/j.asoc.2006.06.005] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.9] [Indexed: 10/24/2022]
|
33
|
Ensemble of hybrid neural network learning approaches for designing pharmaceutical drugs. Neural Comput Appl 2007. [DOI: 10.1007/s00521-007-0090-1] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 11/25/2022]
|
34
|
Use of gene dependent mutation probability in evolutionary neural networks for non-stationary problems. Neurocomputing 2006. [DOI: 10.1016/j.neucom.2006.07.005] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 11/20/2022]
|