1
Selamat SN, Abd Majid N, Mohd Taib A. A Comparative Assessment of Sampling Ratios Using Artificial Neural Network (ANN) for Landslide Predictive Model in Langat River Basin, Selangor, Malaysia. SUSTAINABILITY 2023; 15:861. [DOI: 10.3390/su15010861] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/02/2023]
Abstract
Landslides are among the most dangerous natural hazards worldwide, causing extensive property damage and loss of life. Increased human activity in landslide-prone areas has been a major contributor to landslide risk. Machine learning has therefore been used in landslide studies to develop predictive models. The main objective of this study is to identify the most suitable sampling ratio for a landslide predictive model in the Langat River Basin (LRB) using Artificial Neural Networks (ANNs). The landslide inventory was divided randomly into training and testing datasets using four sampling ratios (50:50, 60:40, 70:30, and 80:20). A total of 12 landslide conditioning factors were considered: elevation, slope, aspect, curvature, topographic wetness index (TWI), distance to the road, distance to the river, distance to faults, soil, lithology, land use, and rainfall. The models were evaluated using several statistical measures and the area under the curve (AUC). Finally, the most suitable predictive model was chosen based on the model validation results using the compound factor (CF) method. The predictive model with the 80:20 ratio gave realistic results and ranked first, with an AUC of 0.931 for the training dataset and 0.964 for the testing dataset. These results can guide the choice of training-to-testing sample ratio when building a reliable and complete landslide prediction model for the LRB.
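The random division of an inventory at the four sampling ratios can be sketched as follows. This is a minimal illustration, not the authors' procedure: the inventory size of 200, the `split_inventory` helper, and the fixed seed are all assumptions made for the example, since the abstract does not state them.

```python
import random


def split_inventory(points, train_fraction, seed=0):
    """Shuffle a copy of the inventory and cut it at the given fraction."""
    shuffled = list(points)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical inventory of 200 point IDs (the abstract gives no count).
inventory = list(range(200))
ratios = {"50:50": 0.5, "60:40": 0.6, "70:30": 0.7, "80:20": 0.8}

splits = {name: split_inventory(inventory, frac) for name, frac in ratios.items()}
for name, (train, test) in splits.items():
    print(name, len(train), len(test))
```

Each split keeps every inventory point in exactly one of the two subsets, so the four models differ only in how much data is available for training versus testing.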
2
Landslide Susceptibility Model Using Artificial Neural Network (ANN) Approach in Langat River Basin, Selangor, Malaysia. LAND 2022. [DOI: 10.3390/land11060833] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 02/04/2023]
Abstract
Landslides are a natural hazard that can endanger human life and cause severe environmental damage. A landslide susceptibility map is essential for planning, managing, and preventing landslide occurrences to minimize losses. A variety of techniques are employed to map landslide susceptibility; however, their capability differs between studies. The aim of this research is to produce a landslide susceptibility map for the Langat River Basin in Selangor, Malaysia, using an Artificial Neural Network (ANN). The landslide inventory map contained a total of 140 landslide locations, which were randomly separated into training and testing sets at a ratio of 70:30. Nine landslide conditioning factors were selected as model input: elevation, slope, aspect, curvature, Topographic Wetness Index (TWI), distance to road, distance to river, lithology, and rainfall. The area under the curve (AUC) and several statistical measures (sensitivity, specificity, accuracy, positive predictive value, and negative predictive value) were used to validate the landslide predictive model. The ANN predictive model achieved very good validation results, with an AUC value of 0.940 for both the training and testing datasets. This study found rainfall to be the most crucial factor affecting landslide occurrence in the Langat River Basin, with a weight index of 0.248, followed by distance to road (0.200) and elevation (0.136). The results showed that the most susceptible area is located in the north-east of the Langat River Basin. This map might be useful for development planning and management to prevent landslide occurrences in the Langat River Basin.
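For readers unfamiliar with the AUC used for validation here, it can be read as the probability that a randomly chosen landslide location receives a higher susceptibility score than a randomly chosen non-landslide location. The sketch below computes it by direct pairwise comparison; the scores are hypothetical values invented for illustration and are not taken from the study.

```python
def auc(scores_pos, scores_neg):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical model outputs, not data from the paper.
landslide_scores = [0.9, 0.8, 0.7, 0.4]
non_landslide_scores = [0.6, 0.3, 0.2, 0.1]

auc_value = auc(landslide_scores, non_landslide_scores)
print(auc_value)
```

The pairwise form is quadratic in the number of samples, which is fine for an inventory of a few hundred points; a rank-based formulation gives the same value for larger datasets.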
3
Lee SJ, Tseng CH, Yang HY, Jin X, Jiang Q, Pu B, Hu WH, Liu DR, Huang Y, Zhao N. Random RotBoost: An Ensemble Classification Method Based on Rotation Forest and AdaBoost in Random Subsets and Its Application to Clinical Decision Support. ENTROPY 2022; 24:e24050617. [PMID: 35626502 PMCID: PMC9140905 DOI: 10.3390/e24050617] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 03/08/2022] [Revised: 04/01/2022] [Accepted: 04/19/2022] [Indexed: 02/04/2023]
Abstract
In the era of bathing in big data, it is common to see enormous amounts of data generated daily. As for the medical industry, not only could we collect a large amount of data, but also see each data set with a great number of features. When the number of features is ramping up, a common dilemma is adding computational cost during inferring. To address this concern, the data rotational method by PCA in tree-based methods shows a path. This work tries to enhance this path by proposing an ensemble classification method with an AdaBoost mechanism in random, automatically generating rotation subsets termed Random RotBoost. The random rotation process has replaced the manual pre-defined number of subset features (free pre-defined process). Therefore, with the ensemble of the multiple AdaBoost-based classifier, overfitting problems can be avoided, thus reinforcing the robustness. In our experiments with real-world medical data sets, Random RotBoost reaches better classification performance when compared with existing methods. Thus, with the help from our proposed method, the quality of clinical decisions can potentially be enhanced and supported in medical tasks.
Affiliation(s)
- Shin-Jye Lee
  - Institute of Management of Technology, National Yang Ming Chiao Tung University, Hsinchu 300, Taiwan; (S.-J.L.); (H.-Y.Y.)
- Ching-Hsun Tseng
  - Department of Computer Science, The University of Manchester, Manchester M13 9PL, UK;
- Hui-Yu Yang
  - Institute of Management of Technology, National Yang Ming Chiao Tung University, Hsinchu 300, Taiwan; (S.-J.L.); (H.-Y.Y.)
- Xin Jin
  - National Pilot School of Software, Yunnan University, Kunming 650504, China; (X.J.); (Q.J.)
- Qian Jiang
  - National Pilot School of Software, Yunnan University, Kunming 650504, China; (X.J.); (Q.J.)
- Bin Pu
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha 410082, China;
- Wei-Huan Hu
  - College of Computer Science, National Yang Ming Chiao Tung University, Hsinchu 300, Taiwan;
- Duen-Ren Liu
  - Institute of Information Management, National Yang Ming Chiao Tung University, Hsinchu 300, Taiwan; (D.-R.L.); (Y.H.)
- Yang Huang
  - Institute of Information Management, National Yang Ming Chiao Tung University, Hsinchu 300, Taiwan; (D.-R.L.); (Y.H.)
- Na Zhao
  - National Pilot School of Software, Yunnan University, Kunming 650504, China; (X.J.); (Q.J.)
- Correspondence:
4
Shehab M, Abualigah L, Shambour Q, Abu-Hashem MA, Shambour MKY, Alsalibi AI, Gandomi AH. Machine learning in medical applications: A review of state-of-the-art methods. Comput Biol Med 2022; 145:105458. [PMID: 35364311 DOI: 10.1016/j.compbiomed.2022.105458] [Citation(s) in RCA: 103] [Impact Index Per Article: 51.5] [Received: 11/26/2021] [Revised: 03/23/2022] [Accepted: 03/24/2022] [Indexed: 12/11/2022]
Abstract
Applications of machine learning (ML) methods have been used extensively to solve various complex challenges in recent years in various application areas, such as medical, financial, environmental, marketing, security, and industrial applications. ML methods are characterized by their ability to examine many data and discover exciting relationships, provide interpretation, and identify patterns. ML can help enhance the reliability, performance, predictability, and accuracy of diagnostic systems for many diseases. This survey provides a comprehensive review of the use of ML in the medical field highlighting standard technologies and how they affect medical diagnosis. Five major medical applications are deeply discussed, focusing on adapting the ML models to solve the problems in cancer, medical chemistry, brain, medical imaging, and wearable sensors. Finally, this survey provides valuable references and guidance for researchers, practitioners, and decision-makers framing future research and development directions.
Affiliation(s)
- Mohammad Shehab
  - Information Technology, The World Islamic Sciences and Education University, Amman, Jordan.
- Laith Abualigah
  - Faculty of Computer Sciences and Informatics, Amman Arab University, Amman, Jordan; School of Computer Sciences, Universiti Sains Malaysia, Pulau, Pinang, 11800, Malaysia.
- Qusai Shambour
  - Department of Software Engineering, Al-Ahliyya Amman University, Amman, Jordan.
- Muhannad A Abu-Hashem
  - Department of Geomatics, Faculty of Architecture and Planning, King Abdulaziz University, Jeddah, Saudi Arabia.
- Amir H Gandomi
  - Faculty of Engineering and Information Technology, University of Technology Sydney, Ultimo, NSW, 2007, Australia.
5
Automatic Classification of Fatty Liver Disease Based on Supervised Learning and Genetic Algorithm. APPLIED SCIENCES-BASEL 2022. [DOI: 10.3390/app12010521] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Indexed: 02/04/2023]
Abstract
Fatty liver disease is a critical illness that should be detected and diagnosed at an early stage. In advanced stages, liver cancer or cirrhosis can arise. Radiologists commonly use ultrasound images to identify the disease, but the low quality of these images makes it challenging to recognize. To address this problem, the current study develops a Computer-Aided Diagnosis technique that uses machine learning algorithms and a voting-based classifier to categorize liver tissue as fatty or normal from features extracted from ultrasound images. The method makes four main contributions. First, liver images are classified as normal or fatty without a segmentation phase. Second, the datasets used in previous works were insufficient compared with the one used here. Third, a combination of 26 features is used; the extracted features are Gray-Level Co-Occurrence Matrix (GLCM) and First-Order Statistics (FOS) features. Fourth, a voting classifier is used to determine the liver tissue type. Several trials were performed with the voting-based classifier and the J48 algorithm on a dataset. The obtained TP, TN, FP, and FN rates were 94.28%, 97.14%, 5.71%, and 2.85%, respectively. The achieved precision, sensitivity, specificity, and F1-score were 94.28%, 97.05%, 94.44%, and 95.64%, respectively. The classification accuracy was 95.71% with the voting-based classifier and 93.12% with the J48 algorithm. The proposed work achieved high performance compared with previous research.
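The reported precision, sensitivity, specificity, F1-score, and accuracy follow from the TP, TN, FP, and FN figures through the standard confusion-matrix formulas. The sketch below recomputes them; treating the four reported percentages as proportional cell counts is an assumption of this example, and the `binary_metrics` helper is illustrative rather than the authors' code. The recomputed values agree with the abstract to within a few hundredths of a percentage point, which is consistent with rounding.

```python
def binary_metrics(tp, tn, fp, fn):
    """Standard confusion-matrix metrics for a binary classifier."""
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)  # also called recall
    specificity = tn / (tn + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return {
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
        "accuracy": accuracy,
    }


# Percentages from the abstract, used here as proportional counts (assumption).
metrics = binary_metrics(tp=94.28, tn=97.14, fp=5.71, fn=2.85)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.2f}%")
```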
6
Zhang J, Wen X, Cho A, Whang M. An Empathy Evaluation System Using Spectrogram Image Features of Audio. SENSORS 2021; 21:s21217111. [PMID: 34770419 PMCID: PMC8587789 DOI: 10.3390/s21217111] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/11/2021] [Revised: 10/17/2021] [Accepted: 10/25/2021] [Indexed: 12/01/2022]
Abstract
Watching videos online has become part of a relaxed lifestyle. The music in videos has a sensitive influence on human emotions, perception, and imaginations, which can make people feel relaxed or sad, and so on. Therefore, it is particularly important for people who make advertising videos to understand the relationship between the physical elements of music and empathy characteristics. The purpose of this paper is to analyze the music features in an advertising video and extract the music features that make people empathize. This paper combines both methods of the power spectrum of MFCC and image RGB analysis to find the audio feature vector. In spectral analysis, the eigenvectors obtained in the analysis process range from blue (low range) to green (medium range) to red (high range). The machine learning random forest classifier is used to classify the data obtained by machine learning, and the trained model is used to monitor the development of an advertisement empathy system in real time. The result is that the optimal model is obtained with the training accuracy result of 99.173% and a test accuracy of 86.171%, which can be deemed as correct by comparing the three models of audio feature value analysis. The contribution of this study can be summarized as follows: (1) the low-frequency and high-amplitude audio in the video is more likely to resonate than the high-frequency and high-amplitude audio; (2) it is found that frequency and audio amplitude are important attributes for describing waveforms by observing the characteristics of the machine learning classifier; (3) a new audio extraction method is proposed to induce human empathy. That is, the feature value extracted by the method of spectrogram image features of audio has the most ability to arouse human empathy.
Affiliation(s)
- Jing Zhang
  - Department of Emotion Engineering, University of Sangmyung, Seoul 03016, Korea; (J.Z.); (X.W.); (A.C.)
- Xingyu Wen
  - Department of Emotion Engineering, University of Sangmyung, Seoul 03016, Korea; (J.Z.); (X.W.); (A.C.)
- Ayoung Cho
  - Department of Emotion Engineering, University of Sangmyung, Seoul 03016, Korea; (J.Z.); (X.W.); (A.C.)
- Mincheol Whang
  - Department of Human Centered Artificial Intelligence, University of Sangmyung, Seoul 03016, Korea
- Correspondence: Tel.: +82-2-2287-5293
7
Phong TV, Pham BT, Trinh PT, Ly HB, Vu QH, Ho LS, Le HV, Phong LH, Avand M, Prakash I. Groundwater Potential Mapping Using GIS-Based Hybrid Artificial Intelligence Methods. GROUND WATER 2021; 59:745-760. [PMID: 33745148 DOI: 10.1111/gwat.13094] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 11/26/2019] [Revised: 03/02/2021] [Accepted: 03/17/2021] [Indexed: 06/12/2023]
Abstract
Groundwater is one of the most valuable water resources for communities, agriculture, and industries. In the present study, we developed three novel hybrid artificial intelligence (AI) models, which combine the modified RealAdaBoost (MRAB), bagging (BA), and rotation forest (RF) ensembles with a functional tree (FT) base classifier, for groundwater potential mapping (GPM) in the basaltic terrain of DakLak province, Highland Centre, Vietnam. Based on the literature survey, these proposed hybrid AI models are new and have not previously been used for GPM. Geospatial techniques were applied to geo-hydrological data from 130 groundwater wells and 12 topographical and geo-environmental factors. The One-R Attribute Evaluation feature selection method was used to select relevant input parameters for the AI models. The performance of these models was evaluated using various statistical measures, including the area under the receiver operating characteristic curve (AUC). Results indicated that although all the hybrid models developed in this study enhanced the goodness-of-fit and prediction accuracy, the MRAB-FT model (AUC = 0.742) outperformed the RF-FT (AUC = 0.736), BA-FT (AUC = 0.714), and single FT (AUC = 0.674) models. Therefore, the MRAB-FT model can be considered a promising hybrid AI technique for accurate GPM. Accurate mapping of groundwater potential zones will help in adequately recharging the aquifer for optimum use of groundwater resources by maintaining the balance between consumption and exploitation.
Affiliation(s)
- Tran Van Phong
  - Institute of Geological Sciences, Vietnam Academy of Sciences and Technology, 84 Chua Lang Street, Dong da, Hanoi, Vietnam
- Binh Thai Pham
  - University of Transport Technology, Ha Noi, 100000, Vietnam
  - Civil and Environmental Engineering Program, Graduate School of Advanced Science and Engineering, Hiroshima University, 1-4-1 Kagamiyama, Higashi-Hiroshima, Hiroshima, 739-8527, Japan
- Phan Trong Trinh
  - Institute of Geological Sciences, Vietnam Academy of Sciences and Technology, 84 Chua Lang Street, Dong da, Hanoi, Vietnam
- Hai-Bang Ly
  - University of Transport Technology, Ha Noi, 100000, Vietnam
- Quoc Hung Vu
  - Faculty of Hydraulic Engineering, National University of Civil Engineering, Hanoi, 100000, Vietnam
- Lanh Si Ho
  - University of Transport Technology, Ha Noi, 100000, Vietnam
  - Civil and Environmental Engineering Program, Graduate School of Advanced Science and Engineering, Hiroshima University, 1-4-1 Kagamiyama, Higashi-Hiroshima, Hiroshima, 739-8527, Japan
- Hiep Van Le
  - University of Transport Technology, Ha Noi, 100000, Vietnam
- Lai Hop Phong
  - Institute of Geological Sciences, Vietnam Academy of Sciences and Technology, 84 Chua Lang Street, Dong da, Hanoi, Vietnam
- Mohammadtaghi Avand
  - Department of Watershed Management Engineering and Sciences, Faculty of Natural Resources and Marine Science, Tarbiat Modares University, Tehran, Iran
- Indra Prakash
  - DDG (R) Geological Survey of India, Gandhinagar, 382010, India
8
Software Defect Prediction for Healthcare Big Data: An Empirical Evaluation of Machine Learning Techniques. JOURNAL OF HEALTHCARE ENGINEERING 2021; 2021:8899263. [PMID: 33815733 PMCID: PMC7987450 DOI: 10.1155/2021/8899263] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 09/09/2020] [Revised: 09/29/2020] [Accepted: 02/24/2021] [Indexed: 01/02/2023]
Abstract
Software defect prediction (SDP) in the early phase of the software development life cycle (SDLC) remains a critical and important task. SDP has been studied extensively over the last few decades because it helps assure the quality of software systems. Promptly predicting defective artifacts during software development can help the development team use existing resources efficiently and effectively to deliver high-quality software products within a limited time. Previously, several researchers have developed models for defect prediction using machine learning (ML) and statistical techniques. ML methods are considered an effective approach for pinpointing defective modules by mining hidden patterns among software metrics (attributes). ML techniques have also been used by several researchers on healthcare datasets. This study applies different ML techniques to software defect prediction using seven widely used datasets. The ML techniques include the multilayer perceptron (MLP), support vector machine (SVM), decision tree (J48), radial basis function (RBF), random forest (RF), hidden Markov model (HMM), credal decision tree (CDT), K-nearest neighbor (KNN), average one dependency estimator (A1DE), and Naïve Bayes (NB). The performance of each technique is evaluated using different measures, namely relative absolute error (RAE), mean absolute error (MAE), root mean squared error (RMSE), root relative squared error (RRSE), recall, and accuracy. The overall results show the best performance for RF, with 88.32% average accuracy and a rank value of 2.96; the second-best performance is achieved by SVM, with 87.99% average accuracy and a rank value of 3.83. CDT, with 87.88% average accuracy and a rank value of 3.62, is placed third. The comprehensive results of this research can be used as a reference point for new research in the SDP domain, so that any claimed improvement in prediction from a new technique or model can be benchmarked and verified.
9
GIS-Based Landslide Susceptibility Mapping for Land Use Planning and Risk Assessment. LAND 2021. [DOI: 10.3390/land10020162] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Indexed: 11/16/2022]
Abstract
Landslide susceptibility mapping is essential for sound land use management and risk assessment. In this work, a GIS-based approach is proposed to map landslide susceptibility in the Portofino promontory, a Mediterranean area that is periodically hit by intense rain events that often induce shallow landslides. Based on a landslide inventory covering more than 110 years and on experts’ judgements, a semi-quantitative analytical hierarchy process (AHP) method was applied to assess the role of nine landslide conditioning factors, which include both natural and anthropogenic elements. A separate subset of landslide data was used to validate the map. Our findings reveal that the areas where future landslides may occur are larger than those identified in the current official map adopted in land use and risk management. The new map is oriented more towards possible future landslide scenarios, rather than giving greater weight to existing landslides as the current model does. The paper provides a useful decision support tool for implementing risk mitigation strategies and improving land use planning. Because its factors can be modified according to local features, the proposed methodology may be adopted in other conditions or geographical contexts characterized by rainfall-induced landslide risk.
10
Credal decision tree based novel ensemble models for spatial assessment of gully erosion and sustainable management. Sci Rep 2021; 11:3147. [PMID: 33542340 PMCID: PMC7862281 DOI: 10.1038/s41598-021-82527-3] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 05/12/2020] [Accepted: 01/21/2021] [Indexed: 01/30/2023] Open
Abstract
We introduce novel hybrid ensemble models in gully erosion susceptibility mapping (GESM) through a case study in the Bastam sedimentary plain of Northern Iran. Four new ensemble models including credal decision tree-bagging (CDT-BA), credal decision tree-dagging (CDT-DA), credal decision tree-rotation forest (CDT-RF), and credal decision tree-alternative decision tree (CDT-ADTree) are employed for mapping the gully erosion susceptibility (GES) with the help of 14 predictor factors and 293 gully locations. The relative significance of GECFs in modelling GES is assessed by random forest algorithm. Two cut-off-independent (area under success rate curve and area under predictor rate curve) and six cut-off-dependent metrics (accuracy, sensitivity, specificity, F-score, odd ratio and Cohen Kappa) were utilized based on both calibration as well as testing dataset. Drainage density, distance to road, rainfall and NDVI were found to be the most influencing predictor variables for GESM. The CDT-RF (AUSRC = 0.942, AUPRC = 0.945, accuracy = 0.869, specificity = 0.875, sensitivity = 0.864, RMSE = 0.488, F-score = 0.869 and Cohen's Kappa = 0.305) was found to be the most robust model which showcased outstanding predictive accuracy in mapping GES. Our study shows that the GESM can be utilized for conserving soil resources and for controlling future gully erosion.
11
Performance Assessment of Classification Algorithms on Early Detection of Liver Syndrome. JOURNAL OF HEALTHCARE ENGINEERING 2021; 2020:6680002. [PMID: 33489060 PMCID: PMC7787853 DOI: 10.1155/2020/6680002] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 10/03/2020] [Revised: 11/18/2020] [Accepted: 11/25/2020] [Indexed: 11/17/2022]
Abstract
In recent years, liver disease, which can seriously impair vital functions, has become very common throughout the world. Liver disease has been found to be more prevalent in young people than in other age groups. Once liver function fails, a person survives for barely one or two days, and it is very hard to predict such illness at an early stage. Researchers are trying to build models for early prediction of liver disease using various machine learning approaches. This study compares ten classifiers, namely A1DE, NB, MLP, SVM, KNN, CHIRP, CDT, Forest-PA, J48, and RF, to find the best solution for early and accurate prediction of liver disease. The datasets used in this study are taken from the UCI ML repository and the GitHub repository. The outcomes are assessed via RMSE, RRSE, recall, specificity, precision, G-measure, F-measure, MCC, and accuracy. The experimental results show that RF performs best on the UCI dataset: its RMSE and RRSE are 0.4328 and 87.6766, and its accuracy is 72.1739%, which is better than that of the other classifiers. On the GitHub dataset, however, SVM outperforms the other techniques, with an accuracy of up to 71.3551%. Moreover, the comprehensive results of this study can be used as a reference point for further research, so that any claimed improvement in prediction from a new technique, model, or framework can be benchmarked and confirmed.
12
Risk Assessment of Resources Exposed to Rainfall Induced Landslide with the Development of GIS and RS Based Ensemble Metaheuristic Machine Learning Algorithms. SUSTAINABILITY 2021. [DOI: 10.3390/su13020457] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Indexed: 12/30/2022]
Abstract
Disastrous natural hazards, such as landslides, floods, and forest fires, pose a serious threat to natural resources, assets, and human lives. Consequently, landslide risk assessment has become a requisite for managing resources in the future. This study was designed to develop four ensemble metaheuristic machine learning algorithms, namely grey wolf optimized artificial neural network (GW-ANN), grey wolf optimized random forest (GW-RF), particle swarm optimization optimized ANN (PSO-ANN), and PSO optimized RF, for modeling rainfall-induced landslide susceptibility (LS) in Aqabat Al-Sulbat, Asir region, Saudi Arabia, which experiences landslides frequently. To obtain highly precise and robust predictions from machine learning algorithms, the grey wolf and PSO optimization algorithms were integrated to develop new ensemble machine learning techniques. Subsequently, the LS maps produced from the training dataset were validated using the receiver operating characteristic (ROC) curve based on the testing dataset. The best method for LS modeling was selected based on the area under the curve (AUC) of the ROC curve. We developed a ROC curve-based sensitivity analysis to investigate the influence of the parameters on LS modeling. The Gumbel extreme value distribution was employed to estimate the rainfall at 2, 5, 10, 20, 50, and 100 year return periods. Then, landslide hazard maps were prepared for the different return periods by integrating the best LS model with the estimated rainfall. The theory of danger pixels was employed to prepare a final risk assessment of the resources exposed to landslides. The results showed that 27–42 and 6–15 km² were predicted as the very high and high LS zones using the four ensemble metaheuristic machine learning algorithms. Based on the AUC of the ROC curve, GW-ANN (AUC = 0.905) appeared as the best model for LS modeling. The areas under high and very high landslide hazard increased gradually with the return period (from 26 km² at the 2 year return period to 40 km² at the 100 year return period for the high landslide hazard zone, and from 6 km² to 20 km² for the very high landslide hazard zone). Similarly, the area of danger pixels increased gradually from the 2 to the 100 year return period (37 km² to 62 km²). Various natural resources, such as scrubland, built-up areas, and sparse vegetation, were identified as lying in the risk zone due to landslide hazards. In addition, these resources would be increasingly exposed to landslides as the return period lengthens. Therefore, the outcome of the present study will help planners and scientists to propose high-precision management plans for protecting natural resources exposed to landslides.
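The return-period rainfall estimate mentioned above can be illustrated with the Gumbel quantile formula x_T = mu - beta * ln(-ln(1 - 1/T)). The sketch below is only an illustration under stated assumptions: the annual maximum rainfall series is hypothetical, and the method-of-moments fit is chosen for simplicity, since the abstract does not say how the distribution was fitted.

```python
import math
import statistics

EULER_GAMMA = 0.5772156649  # Euler-Mascheroni constant


def fit_gumbel_moments(annual_maxima):
    """Method-of-moments estimates of the Gumbel location and scale (assumed fit)."""
    mean = statistics.mean(annual_maxima)
    std = statistics.stdev(annual_maxima)
    scale = std * math.sqrt(6) / math.pi
    location = mean - EULER_GAMMA * scale
    return location, scale


def return_level(location, scale, period_years):
    """Value exceeded on average once every `period_years` years."""
    non_exceedance = 1.0 - 1.0 / period_years
    return location - scale * math.log(-math.log(non_exceedance))


# Hypothetical annual maximum daily rainfall in mm, not data from the study.
annual_max_rainfall_mm = [62.0, 48.5, 91.2, 55.0, 70.3, 120.4, 44.1, 83.7, 66.9, 58.2]

location, scale = fit_gumbel_moments(annual_max_rainfall_mm)
periods = (2, 5, 10, 20, 50, 100)
levels = {t: return_level(location, scale, t) for t in periods}
for t, level in levels.items():
    print(f"{t:>3}-year return level: {level:.1f} mm")
```

Because the estimated rainfall grows with the return period, hazard maps driven by it would be expected to show larger high-hazard areas at longer return periods, which matches the trend the abstract reports.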
13
Performance Evaluation and Comparison of Bivariate Statistical-Based Artificial Intelligence Algorithms for Spatial Prediction of Landslides. ISPRS INTERNATIONAL JOURNAL OF GEO-INFORMATION 2020. [DOI: 10.3390/ijgi9120696] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Indexed: 02/03/2023]
Abstract
The purpose of this study is to compare nine models, composed of certainty factors (CFs), weights of evidence (WoE), evidential belief function (EBF) and two machine learning models, namely random forest (RF) and support vector machine (SVM). In the first step, fifteen landslide conditioning factors were selected to prepare thematic maps, including slope aspect, slope angle, elevation, stream power index (SPI), sediment transport index (STI), topographic wetness index (TWI), plan curvature, profile curvature, land use, normalized difference vegetation index (NDVI), soil, lithology, rainfall, distance to rivers and distance to roads. In the second step, 152 landslides were randomly divided into two groups at a ratio of 70/30 as the training and validation datasets. In the third step, the weights of the CF, WoE and EBF models for conditioning factor were calculated separately, and the weights were used to generate the landslide susceptibility maps. The weights of each bivariate model were substituted into the RF and SVM models, respectively, and six integrated models and landslide susceptibility maps were obtained. In the fourth step, the receiver operating characteristic (ROC) curve and related parameters were used for verification and comparison, and then the success rate curve and the prediction rate curves were used for re-analysis. The comprehensive results showed that the hybrid model is superior to the bivariate model, and all nine models have excellent performance. The WoE–RF model has the highest predictive ability (AUC_T: 0.9993, AUC_P: 0.8968). The landslide susceptibility maps produced in this study can be used to manage landslide hazard and risk in Linyou County and other similar areas.
14
Novel Credal Decision Tree-Based Ensemble Approaches for Predicting the Landslide Susceptibility. REMOTE SENSING 2020. [DOI: 10.3390/rs12203389] [Citation(s) in RCA: 26] [Impact Index Per Article: 6.5] [Indexed: 01/08/2023]
Abstract
Landslides are natural and often quasi-normal threats that destroy natural resources and may lead to a persistent loss of human life. Therefore, the preparation of landslide susceptibility maps is necessary in order to mitigate harmful effects. The key objective of this research is to develop landslide susceptibility maps for the Taleghan basin of Alborz province, Iran, using hybrid Machine Learning (ML) algorithms, i.e., k-fold cross validation and ML techniques of credal decision tree (CDT), Alternative Decision Tree (ADTree), and their ensemble method (CDT-ADTree), which have been state-of-the-art soft computing techniques rarely used in the case of landslide susceptibility assessments. In this study, 22 key landslide causative factors (LCFs) were considered to explore their spatial relationship to landslides, based on local geomorphological and geo-environmental influences. The Random Forest (RF) algorithm was used for the identification of variables importance of different LCFs that are more prone to landslide susceptibility. A receiver operation characteristics (ROC) curve with area under the curve (AUC), accuracy, precision, and robustness index was used to evaluate and compare landslide susceptibility models. The output of the model performance shows that the CDT-ADTree model is the more robust model for the landslide susceptibility where the AUC, accuracy, and precision are 0.981, 0.837, and 0.867, respectively, than the standalone model of CDT and ADTree model. Therefore, it is concluded that the CDT-ADTree ensemble model can be applied as a new promising technique for spatial prediction of the landslide in further studies.
|
15
|
Novel Ensemble Landslide Predictive Models Based on the Hyperpipes Algorithm: A Case Study in the Nam Dam Commune, Vietnam. APPLIED SCIENCES-BASEL 2020. [DOI: 10.3390/app10113710] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Indexed: 11/17/2022]
Abstract
Development of landslide predictive models with strong prediction power has become a major focus of many researchers. This study describes the first application of the Hyperpipes (HP) algorithm for the development of the five novel ensemble models that combine the HP algorithm and the AdaBoost (AB), Bagging (B), Dagging, Decorate, and Real AdaBoost (RAB) ensemble techniques for mapping the spatial variability of landslide susceptibility in the Nam Dan commune, Ha Giang province, Vietnam. Information on 76 historical landslides and ten geo-environmental factors (slope degree, slope aspect, elevation, topographic wetness index, curvature, weathering crust, geology, river density, fault density, and distance from roads) were used for the construction of the training and validation datasets that are the prerequisites for building and testing the proposed models. Using different performance metrics (i.e., the area under the receiver operating characteristic curve (AUC), negative predictive value, positive predictive value, accuracy, sensitivity, specificity, root mean square error, and Kappa), we verified the proficiency of all five ensemble learning techniques in increasing the fitness and predictive powers of the base HP model. Based on the AUC values derived from the models, the ensemble ABHP model that yielded an AUC value of 0.922 was identified as the most efficient model for mapping the landslide susceptibility in the Nam Dan commune, followed by RABHP (AUC = 0.919), BHP (AUC = 0.909), Dagging-HP (AUC = 0.897), Decorate-HP (AUC = 0.865), and the single HP model (AUC = 0.856), respectively. The novel ensemble models proposed for the Nam Dan commune and the resultant susceptibility maps can aid land-use planners in the development of efficient mitigation strategies in response to destructive landslides.
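Several of the performance metrics named in this abstract can be derived from a single confusion matrix. The sketch below shows the standard definitions of accuracy, sensitivity, specificity, positive and negative predictive value, and Cohen's Kappa. The function name `confusion_metrics` and the counts passed to it are hypothetical and are not results from the study.

```python
def confusion_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    # Expected agreement by chance, used by Cohen's Kappa.
    p_yes = ((tp + fp) / total) * ((tp + fn) / total)
    p_no = ((tn + fn) / total) * ((tn + fp) / total)
    p_expected = p_yes + p_no
    return {
        "accuracy": accuracy,
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
        "kappa": (accuracy - p_expected) / (1 - p_expected),
    }


# Hypothetical counts for a validation set of 100 samples.
metrics = confusion_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```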
|
16
|
Shallow Landslide Susceptibility Mapping by Random Forest Base Classifier and Its Ensembles in a Semi-Arid Region of Iran. FORESTS 2020. [DOI: 10.3390/f11040421] [Citation(s) in RCA: 60] [Impact Index Per Article: 15.0] [Indexed: 11/16/2022]
Abstract
We generated high-quality shallow landslide susceptibility maps for Bijar County, Kurdistan Province, Iran, using Random Forest (RAF), an ensemble computational intelligence method, and three meta classifiers: Bagging (BA, BA-RAF), Random Subspace (RS, RS-RAF), and Rotation Forest (RF, RF-RAF). Modeling and validation were done on 111 shallow landslide locations using 20 conditioning factors tested by the Information Gain Ratio (IGR) technique. We assessed model performance with statistically based indexes, including sensitivity, specificity, accuracy, kappa, root mean square error (RMSE), and area under the receiver operating characteristic curve (AUC). All four machine learning models that we tested yielded excellent goodness-of-fit and prediction accuracy, but the RF-RAF ensemble model (AUC = 0.936) outperformed the BA-RAF, RS-RAF (AUC = 0.907), and RAF (AUC = 0.812) models. The results also show that the Rotation Forest meta classifier significantly improved the predictive capability of the RAF base classifier and, therefore, can be considered a useful and effective tool in regional shallow landslide susceptibility mapping.
|
17
|
Improvement of Credal Decision Trees Using Ensemble Frameworks for Groundwater Potential Modeling. SUSTAINABILITY 2020. [DOI: 10.3390/su12072622] [Citation(s) in RCA: 28] [Impact Index Per Article: 7.0] [Indexed: 12/13/2022]
Abstract
Groundwater is one of the most important sources of fresh water all over the world, especially in countries where rainfall is erratic, such as Vietnam. Machine learning (ML) models are now used to assess the groundwater potential of a region, and the credal decision tree (CDT) is one of the ML models that has been used in such studies. In the present study, the performance of the CDT has been improved using various ensemble frameworks such as Bagging, Dagging, Decorate, Multiboost, and Random SubSpace. Based on these methods, five hybrid models, namely BCDT, Dagging-CDT, Decorate-CDT, MBCDT, and RSSCDT, were developed and applied for groundwater potential mapping of DakLak province of Vietnam. Data from 227 groundwater wells in the study area were utilized for the construction and validation of the models. Twelve groundwater potential conditioning factors, namely rainfall, slope, elevation, river density, Sediment Transport Index (STI), curvature, flow direction, aspect, soil, land use, Topographic Wetness Index (TWI), and geology, were considered for the model studies. Various statistical measures, including the area under the receiver operating characteristic curve (AUC), were applied to validate and compare the performance of the models. The results show that the performance of the hybrid CDT ensemble models MBCDT (AUC = 0.770), BCDT (AUC = 0.731), Dagging-CDT (AUC = 0.763), Decorate-CDT (AUC = 0.750), and RSSCDT (AUC = 0.766) improved significantly in comparison to the single CDT (AUC = 0.722) model. Therefore, these hybrid models can be applied for better groundwater potential mapping and groundwater resources management in the study area as well as in other regions of the world.
|
18
|
GIS Based Hybrid Computational Approaches for Flash Flood Susceptibility Assessment. WATER 2020. [DOI: 10.3390/w12030683] [Citation(s) in RCA: 74] [Impact Index Per Article: 18.5] [Indexed: 11/16/2022]
Abstract
Flash floods are one of the most devastating natural hazards; they occur within a catchment (region) where the response time of the drainage basin is short. Identification of probable flash flood locations and development of accurate flash flood susceptibility maps are important for proper flash flood management of a region. With this objective, we proposed and compared several novel hybrid computational approaches of machine learning methods for flash flood susceptibility mapping, namely AdaBoostM1 based Credal Decision Tree (ABM-CDT); Bagging based Credal Decision Tree (Bag-CDT); Dagging based Credal Decision Tree (Dag-CDT); MultiBoostAB based Credal Decision Tree (MBAB-CDT), and single Credal Decision Tree (CDT). These models were applied at a catchment of Markazi state in Iran. About 320 past flash flood events and nine flash flood influencing factors, namely distance from rivers, aspect, elevation, slope, rainfall, distance from faults, soil, land use, and lithology, were considered and analyzed for the development of flash flood susceptibility maps. A correlation based feature selection method was used to validate and select the important factors for modeling of flash floods. Based on this feature selection analysis, only eight factors (distance from rivers, aspect, elevation, slope, rainfall, soil, land use, and lithology) were selected for the modeling, where distance from rivers is the most important factor for modeling of flash floods in this area. Performance of the models was validated and compared by using several robust metrics such as statistical measures and the area under the receiver operating characteristic curve (AUC). The results of this study suggested that ABM-CDT (AUC = 0.957) has the best predictive capability in terms of accuracy, followed by Dag-CDT (AUC = 0.947), MBAB-CDT (AUC = 0.933), Bag-CDT (AUC = 0.932), and CDT (AUC = 0.900), respectively. The methods presented in this study would help in the development of accurate flash flood susceptibility maps of watershed areas not only in Iran but also in other parts of the world.
|
19
|
Abstract
In this study, hybrid integration of MultiBoosting based on two artificial intelligence methods (the radial basis function network (RBFN) and credal decision tree (CDT) models) and geographic information systems (GIS) were used to establish landslide susceptibility maps, which were used to evaluate landslide susceptibility in Nanchuan County, China. First, the landslide inventory map was generated based on previous research results combined with GIS and aerial photos. Then, 298 landslides were identified, and the established dataset was divided into a training dataset (70%, 209 landslides) and a validation dataset (30%, 89 landslides) with ensured randomness, fairness, and symmetry of data segmentation. Sixteen landslide conditioning factors (altitude, profile curvature, plan curvature, slope aspect, slope angle, stream power index (SPI), topographical wetness index (TWI), sediment transport index (STI), distance to rivers, distance to roads, distance to faults, rainfall, NDVI, soil, land use, and lithology) were identified in the study area. Subsequently, the CDT, RBFN, and their ensembles with MultiBoosting (MCDT and MRBFN) were used in ArcGIS to generate the landslide susceptibility maps. The performances of the four landslide susceptibility maps were compared and verified based on the area under the curve (AUC). Finally, the verification results of the AUC evaluation show that the landslide susceptibility mapping generated by the MCDT model had the best performance.
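The 70/30 division described above can be sketched as a seeded random shuffle followed by a cut at 70% of the inventory. This is only an illustration of the idea, not the authors' procedure: the function name `split_inventory`, the seed, and the use of integer identifiers for the 298 landslides are assumptions.

```python
import random


def split_inventory(ids, train_fraction=0.7, seed=42):
    """Randomly split landslide identifiers into training and validation sets."""
    rng = random.Random(seed)   # fixed seed so the split is reproducible
    shuffled = list(ids)
    rng.shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


# 298 hypothetical landslide identifiers, matching the inventory size in the abstract.
landslide_ids = list(range(298))
train_ids, valid_ids = split_inventory(landslide_ids)
print(len(train_ids), len(valid_ids))
```

With 298 records and a 0.7 fraction, rounding gives 209 training and 89 validation records, the same counts the abstract reports.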
|
20
|
Landslide Susceptibility Evaluation Using Hybrid Integration of Evidential Belief Function and Machine Learning Techniques. WATER 2019. [DOI: 10.3390/w12010113] [Citation(s) in RCA: 51] [Impact Index Per Article: 10.2] [Indexed: 11/16/2022]
Abstract
In this study, a Random SubSpace-based classification and regression tree (RSCART) model was introduced for landslide susceptibility modeling, and the CART model and logistic regression (LR) model were used as benchmark models. A total of 263 landslide locations in the study area were randomly divided into two parts (70/30) for training and validation of the models. Fourteen landslide influencing factors were selected, namely slope angle, elevation, aspect, sediment transport index (STI), topographical wetness index (TWI), stream power index (SPI), profile curvature, plan curvature, distance to rivers, distance to road, soil, normalized difference vegetation index (NDVI), land use, and lithology. Finally, the hybrid RSCART model and the two benchmark models were applied for landslide susceptibility modeling, and the receiver operating characteristic curve method was used to evaluate the performance of the models. The susceptibility was quantitatively compared pixel by pixel to reveal the system spatial pattern between susceptibility maps. At the same time, the area under the ROC curve (AUC) and landslide density analysis were used to estimate the prediction ability of the landslide susceptibility maps. The results showed that the RSCART model is the optimal model with the highest AUC values of 0.852 and 0.827, followed by the LR and CART models. The results also illustrate that the hybrid model generally improves the prediction ability of a single landslide susceptibility model.
|
21
|
GIS Based Novel Hybrid Computational Intelligence Models for Mapping Landslide Susceptibility: A Case Study at Da Lat City, Vietnam. SUSTAINABILITY 2019. [DOI: 10.3390/su11247118] [Citation(s) in RCA: 29] [Impact Index Per Article: 5.8] [Indexed: 11/17/2022]
Abstract
Landslides affect properties and the lives of a large number of people in many hilly parts of Vietnam and of the world. Damage caused by landslides can be reduced by understanding the distribution, nature, mechanisms, and causes of landslides with the help of model studies for better planning and risk management of the area. Development of landslide susceptibility maps is one of the main steps in landslide management. In this study, the main objective is to develop GIS based hybrid computational intelligence models to generate landslide susceptibility maps of the Da Lat province, which is one of the landslide prone regions of Vietnam. Novel hybrid models of alternating decision trees (ADT) with various ensemble methods, namely bagging, dagging, MultiBoostAB, and RealAdaBoost, were developed, namely B-ADT, D-ADT, MBAB-ADT, and RAB-ADT, respectively. Data on 72 past landslide events were used in conjunction with 11 landslide conditioning factors (curvature, distance from geological boundaries, elevation, land use, Normalized Difference Vegetation Index (NDVI), relief amplitude, stream density, slope, lithology, weathering crust, and soil) in the development and validation of the models. The area under the receiver operating characteristic (ROC) curve (AUC) and several statistical measures were applied to validate these models. Results indicated that the performance of all the models was good (AUC value greater than 0.8), but the B-ADT model performed the best (AUC = 0.856). Landslide susceptibility maps generated using the proposed models would be helpful to decision makers in risk management for land use planning and infrastructure development.
|
22
|
Flood Hazard Mapping Using the Flood and Flash-Flood Potential Index in the Buzău River Catchment, Romania. WATER 2019. [DOI: 10.3390/w11102116] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Indexed: 11/16/2022]
Abstract
Identifying the areas vulnerable to both floods and flash-floods is an important component of risk management, and the assessment of vulnerable areas is a major challenge in the scientific world. The aim of this study is to provide a methodology for identifying the areas vulnerable to floods and flash-floods in the Buzău river catchment by computing two indices: the Flash-Flood Potential Index (FFPI) for the mountainous and the Sub-Carpathian areas, and the Flood Potential Index (FPI) for the low-altitude areas, using the frequency ratio (FR), a bivariate statistical model, the Multilayer Perceptron Neural Networks (MLP), and the ensemble model MLP–FR. A database containing historical flood locations (168 flood locations) and the areas with torrentiality (172 locations with torrentiality) was created and used to train and test the models. The models were computed using GIS techniques, yielding the flood and flash-flood vulnerability maps. The results show that the MLP–FR hybrid model performed best. The use of the two indices represents a preliminary step in creating flood vulnerability maps, which could be an important tool for local authorities and a support for flood risk management policies.
|
23
|
Abstract
Landslides are the most frequent phenomenon in the northern part of Iran, causing considerable financial damage and loss of life every year. One of the most widely used approaches to reduce this damage is preparing a landslide susceptibility map (LSM) using suitable methods and selecting the proper conditioning factors. The current study is aimed at comparing four bivariate models, namely the frequency ratio (FR), Shannon entropy (SE), weights of evidence (WoE), and evidential belief function (EBF), for an LSM of Klijanrestagh Watershed, Iran. Firstly, 109 landslide locations were obtained from field surveys and interpretation of aerial photographs. Then, the locations were randomly categorized into two groups of 70% (74 locations) and 30% (35 locations) for modeling and validation, respectively. Next, 10 conditioning factors, namely slope aspect, curvature, elevation, distance from fault, lithology, normalized difference vegetation index (NDVI), distance from the river, distance from the road, slope angle, and land use, were determined to construct the spatial database. The multicollinearity analysis showed that no collinearity existed between the 10 conditioning factors considered. The receiver operating characteristic (ROC) curve and the area under the curve (AUC) were used for validation of the four resulting LSMs. The AUC results gave success rates of 0.8, 0.86, 0.84, and 0.85 for EBF, WoE, SE, and FR, respectively, and prediction rates of 0.84, 0.83, 0.82, and 0.79 for WoE, FR, SE, and EBF, respectively. Therefore, the WoE model, having the highest AUC, was the most accurate of the four implemented methods in identifying the regions at risk of future landslides in the study area. The outcomes of this research are useful and essential for the government, planners, decision makers, researchers, and general land-use planners in the study area.
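Of the four bivariate models compared above, the frequency ratio is the simplest to state: for each class of a conditioning factor, it is the share of landslides falling in that class divided by the share of the study area the class occupies. The sketch below applies that standard definition to made-up numbers. The function name `frequency_ratio`, the slope classes, and the pixel and landslide counts are all hypothetical and are not data from the study.

```python
def frequency_ratio(class_pixels, class_landslides):
    """Frequency ratio per class: (landslide share) / (area share)."""
    total_pixels = sum(class_pixels.values())
    total_landslides = sum(class_landslides.values())
    ratios = {}
    for name, pixels in class_pixels.items():
        landslide_share = class_landslides.get(name, 0) / total_landslides
        area_share = pixels / total_pixels
        ratios[name] = landslide_share / area_share
    return ratios


# Hypothetical slope classes (degrees) with pixel counts and landslide counts.
slope_pixels = {"0-15": 5000, "15-30": 3000, ">30": 2000}
slope_landslides = {"0-15": 10, "15-30": 30, ">30": 34}
fr = frequency_ratio(slope_pixels, slope_landslides)
for name, value in fr.items():
    print(f"{name}: FR = {value:.2f}")
```

A value above 1 means landslides are over-represented in the class relative to its area, and a value below 1 means they are under-represented.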
|