26
|
Bleichrodt A, Dahal S, Maloney K, Casanova L, Luo R, Chowell G. Real-time forecasting the trajectory of monkeypox outbreaks at the national and global levels, July-October 2022. BMC Med 2023; 21:19. [PMID: 36647108 PMCID: PMC9841951 DOI: 10.1186/s12916-022-02725-2] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 09/08/2022] [Accepted: 12/28/2022] [Indexed: 01/17/2023]
Abstract
BACKGROUND Beginning May 7, 2022, multiple nations reported an unprecedented surge in monkeypox cases. Unlike past outbreaks, differences in affected populations, transmission mode, and clinical characteristics have been noted. With the existing uncertainties of the outbreak, real-time short-term forecasting can guide and evaluate the effectiveness of public health measures. METHODS We obtained publicly available data on confirmed weekly cases of monkeypox at the global level and for seven countries (with the highest burden of disease at the time this study was initiated) from the Our World in Data (OWID) GitHub repository and CDC website. We generated short-term forecasts of new cases of monkeypox across the study areas using an ensemble n-sub-epidemic modeling framework based on weekly cases using 10-week calibration periods. We report and assess the weekly forecasts with quantified uncertainty from the top-ranked, second-ranked, and ensemble sub-epidemic models. Overall, we conducted 324 weekly sequential 4-week ahead forecasts across the models from the week of July 28th, 2022, to the week of October 13th, 2022. RESULTS The last 10 of 12 forecasting periods (starting the week of August 11th, 2022) show either a plateauing or declining trend of monkeypox cases for all models and areas of study. According to our latest 4-week ahead forecast from the top-ranked model, a total of 6232 (95% PI 487.8, 12,468.0) cases could be added globally from the week of 10/20/2022 to the week of 11/10/2022. At the country level, the top-ranked model predicts that the USA will report the highest cumulative number of new cases for the 4-week forecasts (median based on OWID data: 1806 (95% PI 0.0, 5544.5)). The top-ranked and weighted ensemble models outperformed all other models in short-term forecasts. 
CONCLUSIONS Our top-ranked model consistently predicted a decreasing trend in monkeypox cases at the global and country-specific scales during the last ten sequential forecasting periods. Our findings reflect the potential impact of increased immunity and behavioral modification among high-risk populations.
|
27
|
Mujahid M, Rustam F, Alasim F, Siddique M, Ashraf I. What people think about fast food: opinions analysis and LDA modeling on fast food restaurants using unstructured tweets. PeerJ Comput Sci 2023; 9:e1193. [PMID: 37346556 PMCID: PMC10280231 DOI: 10.7717/peerj-cs.1193] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/09/2022] [Accepted: 11/28/2022] [Indexed: 06/23/2023]
Abstract
With the rise of social media platforms, sharing reviews has become a social norm. People check customer views on social networking sites about different fast food restaurants and food items before visiting the restaurants and ordering food. Restaurants can compete to improve the quality of their items or services by carefully analyzing the feedback provided by customers. People tend to visit restaurants with a higher number of positive reviews. However, manually collecting feedback from customers for every product is labor-intensive, as is analyzing its sentiment by hand. To overcome this, we use automated sentiment analysis, which extracts meaningful information from the data. Existing studies predominantly focus on machine learning models, so the performance of deep learning models, and of deep ensemble models in particular, has been largely neglected. To this end, this study adopts several deep ensemble models, including bidirectional long short-term memory and gated recurrent unit (BiLSTM+GRU), LSTM+GRU, GRU+recurrent neural network (GRU+RNN), and BiLSTM+RNN models, using self-collected unstructured tweets. The performance of lexicon-based methods is compared with that of the deep ensemble models for sentiment classification. In addition, the study uses Latent Dirichlet Allocation (LDA) modeling for topic analysis. For the experiments, tweets were collected for the top five fast food companies: KFC, Pizza Hut, McDonald's, Burger King, and Subway. Experimental results reveal that deep ensemble models yield better results than the lexicon-based approach, and BiLSTM+GRU obtains the highest accuracy of 95.31% for the three-class problem. Topic modeling indicates that the highest number of negative sentiments is expressed for Subway restaurants, with high-intensity negative words.
The majority of the people (49%) remain neutral regarding the choice of fast food, 31% seem to like fast food while the rest (20%) dislike fast food.
|
28
|
Mondal S, Lee MA, Chen YK, Wang YC. Ensemble modeling of black pomfret (Parastromateus niger) habitat in the Taiwan Strait based on oceanographic variables. PeerJ 2023; 11:e14990. [PMID: 36919168 PMCID: PMC10008307 DOI: 10.7717/peerj.14990] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 11/25/2022] [Accepted: 02/12/2023] [Indexed: 03/12/2023]
Abstract
This study used the location, effort, number of captures, and time of fishing to assess the geographic distribution of Parastromateus niger in the Taiwan Strait. Generalized linear models (GLMs) based on six oceanographic parameters outperformed the other species distribution models. The sea surface temperature (SST) was between 26.5 °C and 29.5 °C, the sea surface chlorophyll (SSC) level was between 0.3 and 0.44 mg/m3, the sea surface salinity (SSS) was between 33.4 and 34.4 psu, the mixed layer depth was between 10 and 14 m, the sea surface height was between 0.57 and 0.77 m, and the eddy kinetic energy (EKE) was reported as 0.603. According to the statistical findings, SST has only a small effect on species distribution compared with SSS, SSC level, and EKE. An ensemble habitat model was created by combining four effective single-algorithm models with no obvious bias. The ranges of 117°E-119°E and 22°N-24°N have the highest annual distributions of S.CPUE and nominal CPUE.
|
29
|
Umer M, Sadiq S, Karamti H, Abdulmajid Eshmawi A, Nappi M, Usman Sana M, Ashraf I. ETCNN: Extra Tree and Convolutional Neural Network-based Ensemble Model for COVID-19 Tweets Sentiment Classification. Pattern Recognit Lett 2022; 164:224-231. [PMID: 36407854 PMCID: PMC9664766 DOI: 10.1016/j.patrec.2022.11.012] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 05/30/2022] [Revised: 10/09/2022] [Accepted: 11/11/2022] [Indexed: 11/17/2022]
Abstract
Pandemics affect people negatively, causing fear and disappointment. With the global spread of COVID-19, public sentiment has been substantially affected, and analyzing it could help to devise policies that alleviate negative sentiments. Data collected from social media platforms is often unstructured, which leads to low classification accuracy. This study brings forward an ensemble model in which machine learning and deep learning models combine the benefits of handcrafted features and automatic feature extraction. Unstructured data is obtained, preprocessed, and annotated using TextBlob and VADER before training machine learning models. Similarly, the efficiency of Word2Vec, TF, and TF-IDF features is also analyzed. Results reveal the better performance of the extra tree classifier when trained with TF-IDF features from TextBlob annotated data. Overall, machine learning models perform better with TF-IDF and TextBlob. The proposed model obtains superior performance using both annotation techniques, with accuracy scores of 0.97 and 0.95 using TextBlob and VADER respectively with Word2Vec features. Results reveal that using machine learning and deep learning models together with a voting criterion tends to yield better results than other machine learning models. Analysis of sentiments indicates that people predominantly hold negative sentiments regarding COVID-19.
|
30
|
Zhu X, Guo H, Huang JJ, Tian S, Xu W, Mai Y. An ensemble machine learning model for water quality estimation in coastal area based on remote sensing imagery. Journal of Environmental Management 2022; 323:116187. [PMID: 36261960 DOI: 10.1016/j.jenvman.2022.116187] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 02/18/2022] [Revised: 09/01/2022] [Accepted: 09/02/2022] [Indexed: 06/16/2023]
Abstract
The accurate estimation of coastal water quality parameters (WQPs) is crucial for decision-makers to manage water resources. Although various machine learning (ML) models have been developed for coastal water quality estimation using remote sensing data, the performance of these models has significant uncertainties when applied to regional scales. To address this issue, an ensemble ML-based model was developed in this study. The ensemble ML model was applied to estimate chlorophyll-a (Chla), turbidity, and dissolved oxygen (DO) based on Sentinel-2 satellite images in Shenzhen Bay, China. The optimal input features for each WQP were selected from eight spectral bands and seven spectral indices. A local explanation strategy termed Shapley Additive Explanations (SHAP) was employed to quantify the contribution of each feature to model outputs. In addition, the impacts of three climate factors on the variation of each WQP were analyzed. The results suggested that the ensemble ML models performed satisfactorily for Chla (errors = 1.7%), turbidity (errors = 1.5%), and DO estimation (errors = 0.02%). Band 3 (B3) has the highest positive contribution to Chla estimation, while Band Ratio Index 2 (BR2) has the highest negative contribution to turbidity estimation, and Band 7 (B7) has the highest positive contribution to DO estimation. The spatial patterns of the three WQPs revealed that the water quality deterioration in Shenzhen Bay was mainly influenced by the input of terrestrial pollutants from the estuary. Correlation analysis demonstrated that air temperature (Temp) and average air pressure (AAP) exhibited the closest relationship with Chla. DO showed the strongest negative correlation with Temp, while turbidity was not sensitive to Temp, average wind speed (AWS), or AAP. Overall, the ensemble ML model proposed in this study provides an accurate and practical method for long-term Chla, turbidity, and DO estimation in coastal waters.
|
31
|
El-Kenawy ESM, Zerouali B, Bailek N, Bouchouich K, Hassan MA, Almorox J, Kuriqi A, Eid M, Ibrahim A. Improved weighted ensemble learning for predicting the daily reference evapotranspiration under the semi-arid climate conditions. Environmental Science and Pollution Research International 2022; 29:81279-81299. [PMID: 35731435 DOI: 10.1007/s11356-022-21410-8] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 03/18/2022] [Accepted: 06/07/2022] [Indexed: 06/15/2023]
Abstract
Evapotranspiration is an important quantity required in many applications, such as hydrology and agricultural and irrigation planning. Reference evapotranspiration is particularly important, and the prediction of its variations is beneficial for analyzing the needs and management of water resources. In this paper, we explore the predictive ability of hybrid ensemble learning to predict daily reference evapotranspiration (RET) under the semi-arid climate by using meteorological datasets at 12 locations in the Andalusia province in southern Spain. The datasets comprise mean, maximum, and minimum air temperatures and mean relative humidity and mean wind speed. A new modified variant of the grey wolf optimizer, named the PRSFGWO algorithm, is proposed to maximize the ensemble learning's prediction accuracy through optimal weight tuning and evaluate the proposed model's capacity when the climate data is limited. The performance of the proposed approach, based on weighted ensemble learning, is compared with various algorithms commonly adopted in relevant studies. A diverse set of statistical measurements alongside ANOVA tests was used to evaluate the predictive performance of the prediction models. The proposed model showed high-accuracy statistics, with relative root mean errors lower than 0.999% and a minimum R2 of 0.99. The model inputs were also reduced from six variables to only two for cost-effective predictions of daily RET. This shows that the PRSFGWO algorithm is a good RET prediction model for the semi-arid climate region in southern Spain. The results obtained from this research are very promising compared with existing models in the literature.
|
32
|
Yenkikar A, Babu CN, Hemanth DJ. Semantic relational machine learning model for sentiment analysis using cascade feature selection and heterogeneous classifier ensemble. PeerJ Comput Sci 2022; 8:e1100. [PMID: 36262147 PMCID: PMC9575864 DOI: 10.7717/peerj-cs.1100] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 04/21/2022] [Accepted: 08/23/2022] [Indexed: 06/16/2023]
Abstract
The exponential rise in social media via microblogging sites like Twitter has sparked interest in sentiment analysis that exploits user feedback on a targeted product or service. Considering its significance in business intelligence and decision-making, numerous efforts have been made in this area. However, a lack of dictionaries, unannotated data, large-scale unstructured data, and low accuracies have plagued these approaches. Also, sentiment classification through classifier ensembles has been underexplored in the literature. In this article, we propose a Semantic Relational Machine Learning (SRML) model that automatically classifies the sentiment of tweets by using a classifier ensemble and optimal features. The model employs the Cascaded Feature Selection (CFS) strategy, a novel statistical assessment approach based on the Wilcoxon rank sum test, a univariate logistic regression assisted significant predictor test, and a cross-correlation test. It further uses word2vec-based continuous bag-of-words and n-gram feature extraction in conjunction with SentiWordNet to find optimal features for classification. We experiment on six public Twitter sentiment datasets, namely the STS-Gold dataset, the Obama-McCain Debate (OMD) dataset, the healthcare reform (HCR) dataset, and SemEval2017 Task 4A, 4B, and 4C, using a heterogeneous classifier ensemble comprising fourteen individual classifiers from different paradigms. Results from the experimental study indicate that CFS helps attain a higher classification accuracy with up to 50% fewer features than the count vectorizer approach. In the intra-model performance assessment, the Artificial Neural Network-Gradient Descent (ANN-GD) classifier performs comparatively better than other individual classifiers, but the Best Trained Ensemble (BTE) strategy outperforms it on all metrics.
In the inter-model performance assessment against existing state-of-the-art systems, the proposed model achieved higher accuracy and outperformed more accomplished models employing quantum-inspired sentiment representation (QSR), transformer-based methods such as BERT, BERTweet, and RoBERTa, and ensemble techniques. The research thus provides critical insights into applying a similar strategy to build more generic and robust expert systems for sentiment analysis that can be leveraged across industries.
|
33
|
Ezanno P, Picault S, Bareille S, Beaunée G, Boender GJ, Dankwa EA, Deslandes F, Donnelly CA, Hagenaars TJ, Hayes S, Jori F, Lambert S, Mancini M, Munoz F, Pleydell DRJ, Thompson RN, Vergu E, Vignes M, Vergne T. The African swine fever modelling challenge: Model comparison and lessons learnt. Epidemics 2022; 40:100615. [PMID: 35970067 DOI: 10.1016/j.epidem.2022.100615] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 12/20/2021] [Revised: 06/29/2022] [Accepted: 07/20/2022] [Indexed: 11/26/2022]
Abstract
Robust epidemiological knowledge and predictive modelling tools are needed to address challenging objectives, such as: understanding epidemic drivers; forecasting epidemics; and prioritising control measures. Often, multiple modelling approaches can be used during an epidemic to support effective decision making in a timely manner. Modelling challenges contribute to understanding the pros and cons of different approaches and to fostering technical dialogue between modellers. In this paper, we present the results of the first modelling challenge in animal health - the ASF Challenge - which focused on a synthetic epidemic of African swine fever (ASF) on an island. The modelling approaches proposed by five independent international teams were compared. We assessed their ability to predict temporal and spatial epidemic expansion at the interface between domestic pigs and wild boar, and to prioritise a limited number of alternative interventions. We also compared their qualitative and quantitative spatio-temporal predictions over the first two one-month projection phases of the challenge. Top-performing models in predicting the ASF epidemic differed according to the challenge phase, host species, and in predicting spatial or temporal dynamics. Ensemble models built using all team-predictions outperformed any individual model in at least one phase. The ASF Challenge demonstrated that accounting for the interface between livestock and wildlife is key to increasing our effectiveness in controlling emerging animal diseases, and contributed to improving the readiness of the scientific community to face future ASF epidemics. Finally, we discuss the lessons learnt from model comparison to guide decision making.
|
34
|
Park J, Lee WH, Kim KT, Park CY, Lee S, Heo TY. Interpretation of ensemble learning to predict water quality using explainable artificial intelligence. The Science of the Total Environment 2022; 832:155070. [PMID: 35398119 DOI: 10.1016/j.scitotenv.2022.155070] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 01/27/2022] [Revised: 03/31/2022] [Accepted: 04/02/2022] [Indexed: 06/14/2023]
Abstract
Algal bloom is a significant issue when managing water quality in freshwater; specifically, predicting the concentration of algae is essential to maintaining the safety of the drinking water supply system. The chlorophyll-a (Chl-a) concentration is a commonly used indicator to obtain an estimation of algal concentration. In this study, an XGBoost ensemble machine learning (ML) model was developed from eighteen input variables to predict Chl-a concentration. The composition and pretreatment of input variables to the model are important factors for improving model performance. Explainable artificial intelligence (XAI) is an emerging area of ML modeling that provides a reasonable interpretation of model performance. The effect of input variable selection on model performance was estimated, where the priority of input variable selection was determined using three indices: Shapley value (SHAP), feature importance (FI), and variance inflation factor (VIF). SHAP analysis is an XAI algorithm designed to compute the relative importance of input variables with consistency, providing an interpretable analysis for model prediction. The XGB models simulated with independent variables selected using three indices were evaluated with root mean square error (RMSE), RMSE-observation standard deviation ratio, and Nash-Sutcliffe efficiency. This study shows that the model exhibited the most stable performance when the priority of input variables was determined by SHAP. This implies that on-site monitoring can be designed to collect the selected input variables from the SHAP analysis to reduce the cost of overall water quality analysis. The independent variables were further analyzed using SHAP summary plot, force plot, target plot, and partial dependency plot to provide understandable interpretation on the performance of the XGB model. 
While XAI is still in the early stages of development, this study successfully demonstrated a good example of XAI application to improve the interpretation of machine learning model performance in predicting water quality.
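The three evaluation measures named in this abstract (RMSE, the RMSE-observation standard deviation ratio, and Nash-Sutcliffe efficiency) have standard textbook definitions, which can be sketched as follows. This is an illustrative sketch only, not the authors' code; the function names and the sample values are hypothetical.

```python
import math

def _sums(observed, predicted):
    # Sum of squared errors and sum of squared deviations of the observations.
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    return sse, sst

def rmse(observed, predicted):
    # Root mean square error.
    sse, _ = _sums(observed, predicted)
    return math.sqrt(sse / len(observed))

def rsr(observed, predicted):
    # RMSE divided by the standard deviation of the observations.
    sse, sst = _sums(observed, predicted)
    return math.sqrt(sse) / math.sqrt(sst)

def nse(observed, predicted):
    # Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the mean of the observations.
    sse, sst = _sums(observed, predicted)
    return 1.0 - sse / sst

# Hypothetical Chl-a observations and model predictions.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
print(rmse(observed, predicted), rsr(observed, predicted), nse(observed, predicted))
```

Under these definitions RSR squared equals 1 minus NSE, so the two measures rank models identically and differ only in scale.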
|
35
|
Mohsen F, Biswas MR, Ali H, Alam T, Househ M, Shah Z. Customized and Automated Machine Learning-Based Models for Diabetes Type 2 Classification. Stud Health Technol Inform 2022; 295:517-520. [PMID: 35773925 DOI: 10.3233/shti220779] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
This study aims to develop models to accurately classify patients with type 2 diabetes using the Practice Fusion dataset. We use Random Forest (RF), Support Vector Classifier (SVC), AdaBoost classifier, an ensemble model, and automated machine learning (AutoML) model. We compare the performance of all models in a five-fold cross-validation scheme using four evaluation measures. Experimental results demonstrate that the AutoML model outperformed individual and ensemble models in all evaluation measures.
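The five-fold cross-validation scheme mentioned above can be illustrated with a minimal index-splitting sketch. It assumes contiguous, unshuffled folds; the abstract does not say how the folds were drawn, and the function name `k_fold_indices` is hypothetical.

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train_idx, test_idx) pairs for k contiguous folds."""
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size

# Each sample appears in exactly one test fold.
for train_idx, test_idx in k_fold_indices(12, k=5):
    print(len(train_idx), len(test_idx))
```

Each model would be trained on `train_idx` and scored on `test_idx`, with the evaluation measures averaged over the five folds.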
|
36
|
Wang Y, Zhu X, Yang L, Hu X, He K, Yu C, Jiao S, Chen J, Guo R, Yang S. IDDLncLoc: Subcellular Localization of LncRNAs Based on a Framework for Imbalanced Data Distributions. Interdiscip Sci 2022; 14:409-420. [PMID: 35192174 DOI: 10.1007/s12539-021-00497-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/21/2021] [Revised: 12/16/2021] [Accepted: 12/20/2021] [Indexed: 06/14/2023]
Abstract
Long non-coding RNAs (lncRNAs) play a crucial role in many cellular processes, such as genetic markers, RNA splicing, signaling, and protein regulation. Because identifying an lncRNA's localization in the cell through experimental methods is complicated, hard to reproduce, and expensive, we propose a novel method named IDDLncLoc, which adopts an ensemble model to solve the problem of subcellular localization. In the proposed model, dinucleotide-based auto-cross covariance features, k-mer nucleotide composition features, and composition, transition, and distribution features are introduced to encode a raw RNA sequence into a vector. To screen out reliable features, feature selection through the binomial distribution and recursive feature elimination are employed. Furthermore, strategies of oversampling in mini-batch, random sampling, and stacking ensemble are customized to overcome the problem of data imbalance on the benchmark dataset. Finally, compared with the latest methods, IDDLncLoc achieves an accuracy of 94.96% on the benchmark dataset, which is 2.59% higher than the best method, and the results further demonstrate that IDDLncLoc performs well on the subcellular localization of lncRNAs. In addition, a user-friendly web server is available at http://lncloc.club.
|
37
|
Nimmi K, Janet B, Selvan AK, Sivakumaran N. Pre-trained ensemble model for identification of emotion during COVID-19 based on emergency response support system dataset. Appl Soft Comput 2022; 122:108842. [PMID: 35465357 PMCID: PMC9014641 DOI: 10.1016/j.asoc.2022.108842] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/12/2021] [Revised: 03/26/2022] [Accepted: 04/05/2022] [Indexed: 01/17/2023]
Abstract
The COVID-19 precautions, lockdown, and quarantine implemented throughout the epidemic resulted in a worldwide economic disaster. People are facing unprecedented levels of intense threat, necessitating professional, systematic psychiatric intervention and assistance. New psychological services must be established as quickly as possible to support people's mental healthcare needs during the pandemic. This study examines the contents of calls received by the emergency response support system (ERSS) during the pandemic. Furthermore, a combined analysis of Twitter patterns connected to emergency services could be valuable in assisting people during the pandemic and in understanding and supporting their emotions. The proposed Average Voting Ensemble Deep Learning model (AVEDL Model) is based on the average voting technique. The AVEDL Model is used to classify emotion based on transcribed COVID-19-related emergency response support system calls along with tweets. The pre-trained transformer-based models BERT, DistilBERT, and RoBERTa are combined to build the AVEDL Model, which achieves the best results. The AVEDL Model is trained and tested for emotion detection using COVID-19 labeled tweets and the call content of the emergency response support system. To the best of our knowledge, this is the first deep learning ensemble model for COVID-19 emotion analysis. The AVEDL Model outperforms standard deep learning and machine learning models, attaining an accuracy of 86.46 percent and a macro-average F1-score of 85.20 percent.
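Average voting over several classifiers can be sketched in a few lines. The sketch assumes that each base model outputs a class-probability vector and that the ensemble averages those vectors before taking the arg max; the abstract does not spell out this detail, and the probability values and emotion labels below are hypothetical, not taken from the paper.

```python
def average_vote(prob_vectors):
    """Average class probabilities across models and return (winning class index, averaged vector)."""
    n_models = len(prob_vectors)
    n_classes = len(prob_vectors[0])
    avg = [sum(p[c] for p in prob_vectors) / n_models for c in range(n_classes)]
    best = max(range(n_classes), key=lambda c: avg[c])
    return best, avg

# Hypothetical outputs of three base models for one text over three emotion classes.
labels = ["fear", "sadness", "neutral"]
model_a = [0.6, 0.3, 0.1]
model_b = [0.2, 0.5, 0.3]
model_c = [0.5, 0.4, 0.1]

best, avg = average_vote([model_a, model_b, model_c])
print(labels[best], avg)
```

Averaging probabilities lets a confident model outweigh two lukewarm ones, which a hard majority vote over predicted labels would not allow.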
|
38
|
Biney JKM, Vašát R, Blöcher JR, Borůvka L, Němeček K. Using an ensemble model coupled with portable X-ray fluorescence and visible near-infrared spectroscopy to explore the viability of mapping and estimating arsenic in an agricultural soil. The Science of the Total Environment 2022; 818:151805. [PMID: 34813815 DOI: 10.1016/j.scitotenv.2021.151805] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/19/2021] [Revised: 11/07/2021] [Accepted: 11/15/2021] [Indexed: 06/13/2023]
Abstract
Increasing concentrations of potentially toxic elements (PTEs) in agricultural soils remain a major source of public concern. Monitoring PTEs in an agricultural field with no history of contaminants necessitates adequate analysis using a robust model to accurately uncover hidden PTEs. Detecting and mapping the distribution of soil properties using portable X-ray fluorescence (pXRF) and proximal sensing techniques is not only rapid, but also relatively inexpensive. In this study, an ensemble model, consisting of partial least square regression (PLSR), support vector machine (SVM), random forest (RF), and cubist, was used for the prediction and mapping of soil As content in an agricultural field with no history of pollution. The datasets were collected using pXRF and field spectroscopy techniques. The main goal was to compare the ensemble model to each of the calibration techniques in terms of prediction accuracy of As content in such a field. Other components [e.g., soil organic carbon (SOC), Mn, S, soil pH, Fe] that are known to influence As levels in the soil were also retrieved to assess their correlation with soil As. The models were evaluated using the root mean squared error (RMSECV), the coefficient of determination (R2CV), and the ratio of performance to interquartile range (RPIQ). In terms of prediction accuracy, the ensemble model outperformed each of the individual techniques (R2CV = 0.80/0.75) and obtained the lowest error (RMSECV = 1.91/2.16). Overall, all the predictive techniques were able to detect both low and high estimated values of soil As within the study field, but the ensemble model resembled the measurements more closely. The ensemble model, a promising tool as demonstrated by the current study, is highly recommended for inclusion in future studies for more accurate estimation of As and other PTEs in other agricultural fields.
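The ratio of performance to interquartile range (RPIQ) is commonly defined as the interquartile range of the observed values divided by the RMSE of the predictions. A minimal sketch under that definition follows. The soil As values are hypothetical, and the quartile convention is an assumption: `statistics.quantiles` uses its default "exclusive" method here, and other conventions give slightly different values.

```python
import math
import statistics

def rmse(observed, predicted):
    # Root mean squared error of the predictions.
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed))

def rpiq(observed, predicted):
    # Interquartile range of the observations divided by the RMSE.
    q1, _, q3 = statistics.quantiles(observed, n=4)
    return (q3 - q1) / rmse(observed, predicted)

# Hypothetical observed and predicted soil As contents.
observed = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
predicted = [o + 1.0 for o in observed]
print(rpiq(observed, predicted))
```

A higher RPIQ means the prediction error is small relative to the spread of the measured values, so the measure stays informative when the data are skewed.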
|
39
|
Jin Z, Ma Y, Chu L, Liu Y, Dubrow R, Chen K. Predicting spatiotemporally-resolved mean air temperature over Sweden from satellite data using an ensemble model. Environmental Research 2022; 204:111960. [PMID: 34464620 DOI: 10.1016/j.envres.2021.111960] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/19/2021] [Revised: 07/29/2021] [Accepted: 08/23/2021] [Indexed: 06/13/2023]
Abstract
Mapping of air temperature (Ta) at high spatiotemporal resolution is critical to reducing exposure assessment errors in epidemiological studies on the health effects of air temperature. In this study, we applied a three-stage ensemble model to estimate daily mean Ta from satellite-based land surface temperature (Ts) over Sweden during 2001-2019 at a high spatial resolution of 1 × 1 km2. The ensemble model incorporated four base models, including a generalized additive model (GAM), a generalized additive mixed model (GAMM), and two machine learning models (random forest [RF] and extreme gradient boosting [XGBoost]), and allowed the weights for each model to vary over space, with the best-performing model for each grid cell assigned the highest weight. Various spatial predictors were included as adjustment variables in all the base models, including land cover type, normalized difference vegetation index (NDVI), and elevation. The ensemble model showed high performance with an overall R2 of 0.98 and a root mean square error of 1.38 °C in the ten-fold cross-validation, and outperformed each of the four base models. Although each base model performed well, the two machine learning models (RF [R2 = 0.97], XGBoost [R2 = 0.98]) had better performance than the two regression models (GAM [R2 = 0.95], GAMM [R2 = 0.96]). In the machine learning models, Ts was the dominant predictor of Ta, followed by day of year, NDVI, latitude, elevation, and longitude. The highly spatiotemporally-resolved Ta can improve temperature exposure assessment in future epidemiological studies.
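The idea of letting ensemble weights vary over space, with the best-performing base model in each grid cell receiving the highest weight, can be sketched as follows. The abstract does not describe how the weights were estimated, so inverse-RMSE weighting is an assumption made purely for illustration; the function names and all numbers are hypothetical.

```python
def cell_weights(cell_rmse):
    """Turn per-model RMSEs for one grid cell into weights that sum to 1 (lower RMSE gives a higher weight)."""
    inverse = [1.0 / r for r in cell_rmse]
    total = sum(inverse)
    return [v / total for v in inverse]

def ensemble_prediction(predictions, weights):
    """Weighted average of the base-model predictions for one grid cell and day."""
    return sum(p * w for p, w in zip(predictions, weights))

# Hypothetical per-cell RMSEs (°C) for GAM, GAMM, RF, and XGBoost.
rmse_by_model = [1.8, 1.7, 1.5, 1.4]
# Hypothetical daily mean Ta predictions (°C) from the same four models.
ta_by_model = [10.2, 10.5, 11.0, 10.8]

weights = cell_weights(rmse_by_model)
print(weights, ensemble_prediction(ta_by_model, weights))
```

Repeating the weight calculation for every grid cell gives a weight map per base model, so a model that does well in one region can dominate there without dominating everywhere.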
|
40
|
Zhuang H, Zhang C, Jin X, Ge A, Chen M, Ye J, Qiao H, Xiong P, Zhang X, Chen J, Luan X, Wang W. A flagship species-based approach to efficient, cost-effective biodiversity conservation in the Qinling Mountains, China. Journal of Environmental Management 2022; 305:114388. [PMID: 34972047 DOI: 10.1016/j.jenvman.2021.114388] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/14/2021] [Revised: 12/12/2021] [Accepted: 12/22/2021] [Indexed: 06/14/2023]
Abstract
Prioritizing threatened species protection has been proposed as an efficient response to the global biodiversity crisis. We used in-situ conservation data to predict the potential habitat area of four flagship species: the giant panda (Ailuropoda melanoleuca), golden monkey (Rhinopithecus roxella quinlingensis), takin (Budorcas taxicolor bedfordi), and crested ibis (Nipponia nippon). We then designed systematic conservation planning schemes for various scenarios given species habitat preferences and anthropogenic activities and conducted a cost-effectiveness assessment. Broadly, the geographical distributions of suitable habitats for giant pandas, golden monkeys, and takins exhibited high spatial congruence (correlation coefficients of 0.59-0.90), and areas of high congruence were concentrated in the northern portion of the Qinling Mountains at high elevation (>1500 m). By contrast, the crested ibis was negatively correlated in space with its sympatric species (-0.47 to -0.29). Crested ibis habitats were clustered in the southern portion of the region at low elevation (<1500 m). A hypothetical conservation priority area (CPA) based on the giant panda, golden monkey, and takin included 39.64% of the Qinling Mountains and 100%, 99.99%, 99.59%, and 7.84% of the suitable habitats for giant pandas, golden monkeys, takins, and crested ibises, respectively. The same area included 99.07%, 70.87%, and 39.96% of the highly important areas for the ecosystem services of biodiversity conservation, water supply, and soil retention, respectively, and only 4.62%, 16.83%, and 13.4% of the area were associated with high-density residential area, impervious surfaces, and cropland, respectively. Therefore, we conclude that a CPA approach based on the specialist species could result in effective, low-cost biodiversity conservation in the Qinling Mountains. However, we note that existing protected areas account for only 26.52% of the CPA. We recommend that the main area of the proposed Qinling National Park should be based on the CPA developed here.
|
41
|
Ke H, Gong S, He J, Zhang L, Cui B, Wang Y, Mo J, Zhou Y, Zhang H. Development and application of an automated air quality forecasting system based on machine learning. THE SCIENCE OF THE TOTAL ENVIRONMENT 2022; 806:151204. [PMID: 34710417 DOI: 10.1016/j.scitotenv.2021.151204] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 07/15/2021] [Revised: 10/20/2021] [Accepted: 10/20/2021] [Indexed: 06/13/2023]
Abstract
As one of the issues of greatest concern in modern society, air quality has received extensive attention from the public and the government, which has driven the continuous development of air quality forecasting technology. In this study, an automated air quality forecasting system based on machine learning was developed and applied to daily forecasts of six common pollutants (PM2.5, PM10, SO2, NO2, O3, and CO) and pollution levels; the system automatically finds the best "Model + Hyperparameters" combination without human intervention. Five machine learning models and an ensemble model (Stacked Generalization) were integrated into the system, supported by a knowledge base containing meteorological observations, pollutant concentrations, pollutant emissions, and model reanalysis data. Five years of data (2015-2019) for Beijing, Shanghai, Guangzhou, Chengdu, Xi'an, Wuhan, and Changchun in China were then used as an application case to study the effectiveness of the automated forecasting system. Based on an analysis of seven evaluation criteria and pollution level forecasts, combined with the forecasting results for the next 3 days, the automated system achieves satisfactory forecasting performance, better than most numerical model results. This implies that the developed system has good application prospects in the field of environmental meteorology.
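The stacked generalization step named in this abstract can be illustrated with a deliberately reduced sketch. It assumes only two base models and a meta-learner that is a single least-squares blending weight fitted on held-out predictions; the function names and toy numbers are hypothetical and are not taken from the paper, whose system integrates five models.

```python
def fit_blend_weight(pred_a, pred_b, y_true):
    """Least-squares weight w for the blend w*a + (1 - w)*b.

    pred_a and pred_b are held-out predictions of two base models
    (an assumption of this sketch); y_true are the observed values.
    """
    num = sum((a - b) * (y - b) for a, b, y in zip(pred_a, pred_b, y_true))
    den = sum((a - b) ** 2 for a, b in zip(pred_a, pred_b))
    if den == 0:
        # The base models agree everywhere, so any weight gives the same blend.
        return 0.5
    return num / den


def blend(pred_a, pred_b, w):
    """Combine two prediction lists with the fitted weight."""
    return [w * a + (1 - w) * b for a, b in zip(pred_a, pred_b)]


# Toy held-out predictions: model A runs low, model B runs high.
weight = fit_blend_weight([1.0, 2.0, 3.0], [3.0, 4.0, 5.0], [2.0, 3.0, 4.0])
stacked = blend([1.0, 2.0, 3.0], [3.0, 4.0, 5.0], weight)
```

Fitting the weight on held-out rather than training predictions is what keeps the meta-learner from simply rewarding the base model that overfits most.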
|
42
|
Chen YM, Chen YJ, Ho WH, Tsai JT. Classifying chest CT images as COVID-19 positive/negative using a convolutional neural network ensemble model and uniform experimental design method. BMC Bioinformatics 2021; 22:147. [PMID: 34749629 PMCID: PMC8574139 DOI: 10.1186/s12859-021-04083-x] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 03/14/2021] [Accepted: 03/16/2021] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND To classify chest computed tomography (CT) images as positive or negative for coronavirus disease 2019 (COVID-19) quickly and accurately, researchers attempted to develop effective models by using medical images. RESULTS A convolutional neural network (CNN) ensemble model was developed for classifying chest CT images as positive or negative for COVID-19. To classify chest CT images acquired from COVID-19 patients, the proposed COVID19-CNN ensemble model combines multiple trained CNN models with a majority voting strategy. The CNN models were trained to classify chest CT images by transfer learning from well-known pre-trained CNN models and by applying their algorithm hyperparameters as appropriate. The combination of algorithm hyperparameters for a pre-trained CNN model was determined by uniform experimental design. The chest CT images (405 from COVID-19 patients and 397 from healthy patients) used for training and performance testing of the COVID19-CNN ensemble model were obtained from an earlier study by Hu in 2020. Experiments showed that the COVID19-CNN ensemble model achieved 96.7% accuracy in classifying CT images as COVID-19 positive or negative, which was superior to the accuracies obtained by the individual trained CNN models. Other performance measures (i.e., precision, recall, specificity, and F1-score) obtained by the COVID19-CNN ensemble model were higher than those obtained by individual trained CNN models. CONCLUSIONS The COVID19-CNN ensemble model had superior accuracy and excellent capability in classifying chest CT images as COVID-19 positive or negative.
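The majority voting strategy described here can be sketched in a few lines. This is a minimal illustration that assumes each trained model has already produced one hard label per image; the function name and the toy labels are hypothetical, and nothing below reproduces the paper's CNNs.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Return the most common label per sample across models.

    predictions_per_model is a list with one label list per model, all of
    equal length. With an odd number of models and binary labels there are
    no ties; otherwise ties go to the label seen first for that sample.
    """
    voted = []
    for sample_labels in zip(*predictions_per_model):
        voted.append(Counter(sample_labels).most_common(1)[0][0])
    return voted


# Three hypothetical models voting on three images (1 = positive, 0 = negative).
ensemble_labels = majority_vote([[1, 0, 1], [1, 1, 0], [0, 0, 1]])
```

Voting on hard labels ignores how confident each model is; averaging predicted probabilities is a common alternative when those are available.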
|
43
|
Rahmanian S, Pourghasemi HR, Pouyan S, Karami S. Habitat potential modelling and mapping of Teucrium polium using machine learning techniques. ENVIRONMENTAL MONITORING AND ASSESSMENT 2021; 193:759. [PMID: 34718878 DOI: 10.1007/s10661-021-09551-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 05/14/2021] [Accepted: 10/19/2021] [Indexed: 06/13/2023]
Abstract
Determining suitable habitats is important for the successful management and conservation of plant and wildlife species. Teucrium polium L. is a wild plant species found in Iran. It is widely used to treat numerous health problems. The range of this plant is shrinking due to habitat destruction and overexploitation. Therefore, habitat suitability (HS) modeling is critical for conservation. HS modeling can also identify the key characteristics of habitats that support this species. This study models the habitats of T. polium using five data mining models: random forest (RF), flexible discriminant analysis (FDA), multivariate adaptive regression splines (MARS), support vector machine (SVM), and generalized linear model (GLM). A total of 119 T. polium locations were identified and mapped. According to the RF model, the most important factors describing T. polium habitat were elevation, soil texture, and mean annual rainfall. HS maps (HSMs) were prepared, and habitat suitability was classified as low, medium, high, or very high. The percentages of the study area assigned high or very high suitability ratings by each of the models were 44.62% for FDA, 43.75% for GLM, 43.12% for SVM, 38.91% for RF, 28.72% for MARS, and 39.16% for their ensemble. Although the six models were reasonably accurate, the ensemble model had the highest AUC value, demonstrating a strong predictive performance. The rank order of the other models in this regard is RF, MARS, SVM, FDA, and GLM. HSMs can provide useful output to support the sustainable management of rangelands, reclamation, and land protection.
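Since the models in this abstract are ranked by AUC, a short sketch of how that quantity can be computed may help. It uses the pairwise definition (the probability that a randomly chosen presence site scores higher than a randomly chosen absence site, counting ties as one half); the function name and the toy scores are hypothetical.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve from the pairwise-comparison definition.

    scores are model suitability scores; labels are 1 for presence and
    0 for absence. Both classes must be represented.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Two presence and two absence sites with illustrative scores.
auc = roc_auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
```

The pairwise form is quadratic in the number of sites, which is fine for a few hundred locations; rank-based formulas give the same value faster on large datasets.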
|
44
|
Cui L, Wang S. Mapping the daily nitrous acid (HONO) concentrations across China during 2006-2017 through ensemble machine-learning algorithm. THE SCIENCE OF THE TOTAL ENVIRONMENT 2021; 785:147325. [PMID: 33957584 DOI: 10.1016/j.scitotenv.2021.147325] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 02/21/2021] [Revised: 04/19/2021] [Accepted: 04/21/2021] [Indexed: 06/12/2023]
Abstract
Nitrous acid (HONO) is a major source of the hydroxyl radical (OH) and plays a key role in atmospheric photochemistry. The lack of spatially resolved HONO concentration information results in large knowledge gaps regarding HONO and its role in atmospheric chemistry and air pollution in China. In this work, an ensemble machine learning model comprising random forest, gradient boosting, and back propagation neural network was proposed, for the first time, to estimate the long-term (2006-2017) daily HONO concentrations across China at 0.25° resolution. Further, the key factors controlling the space-time variability of HONO concentrations were analyzed based on variable importance values. The ensemble model characterized the spatiotemporal distribution of daily HONO concentrations well, with sample-based, site-based, and by-year cross-validation (CV) R2 (RMSE) of 0.7 (0.36 ppbv), 0.67 (0.36 ppbv), and 0.62 (0.40 ppbv), respectively. HONO hotspots were mainly distributed in the Beijing-Tianjin-Hebei (BTH), Pearl River Delta (PRD), Yangtze River Delta (YRD), and several sites of Sichuan Basin, in line with the distribution patterns of the tropospheric NO2 columns and assimilated surface NO3- levels. The national HONO levels stagnated during 2006-2013, then declined after 2013, benefiting from the implementation of the Action Plan for Air Pollution Prevention and Control. The NO3- concentration, urban area, and NO2 column density ranked as important variables for HONO prediction, while agricultural land, forest, and grassland played minor roles in affecting HONO concentrations, suggesting the significant role of heterogeneous HONO production from anthropogenic precursor emissions. Leveraging the ground-level HONO observations, this study fills the gap of statistically modelling nationwide HONO in China, which provides essential data for atmospheric chemistry research.
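The cross-validation R2 and RMSE figures quoted above can be made concrete with a small sketch. It assumes R2 is the coefficient of determination (some studies report the squared correlation instead, and the abstract does not say which); the function names and toy values are hypothetical.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error, in the same units as the observations."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Illustrative held-out observations and predictions (arbitrary units).
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [2.0, 2.0, 3.0, 3.0]
scores = (r_squared(observed, predicted), rmse(observed, predicted))
```

In a site-based or by-year scheme the same two functions would be applied to predictions for sites or years withheld from training, which is why those scores are usually lower than sample-based ones.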
|
45
|
Ke B, Nguyen H, Bui XN, Bui HB, Choi Y, Zhou J, Moayedi H, Costache R, Nguyen-Trang T. Predicting the sorption efficiency of heavy metal based on the biochar characteristics, metal sources, and environmental conditions using various novel hybrid machine learning models. CHEMOSPHERE 2021; 276:130204. [PMID: 34088091 DOI: 10.1016/j.chemosphere.2021.130204] [Citation(s) in RCA: 25] [Impact Index Per Article: 8.3] [Received: 12/12/2020] [Revised: 02/17/2021] [Accepted: 03/04/2021] [Indexed: 06/12/2023]
Abstract
Heavy metals in water and wastewater are regarded as one of the most hazardous environmental issues, with a significant impact on human health. Biochar systems made from different materials have helped remove heavy metals from water, especially in wastewater treatment systems. Nevertheless, the sorption efficiency of heavy metals on biochar systems is highly dependent on the biochar characteristics, metal sources, and environmental conditions. Therefore, this study examines the feasibility of biochar systems for heavy metal sorption in water/wastewater and the use of artificial intelligence (AI) models in investigating the sorption efficiency of heavy metals on biochar. Accordingly, this work investigated and proposed 20 AI models for forecasting the sorption efficiency of heavy metals onto biochar, based on five machine learning algorithms and the bagging technique (BA). Support vector machine (SVM), random forest (RF), artificial neural network (ANN), M5Tree, and Gaussian process (GP) algorithms were used as the key algorithms. Subsequently, the individual models were bagged with each other to generate new ensemble models. Finally, 20 intelligent models were developed and evaluated: SVM, RF, M5Tree, GP, ANN, BA-SVM, BA-RF, BA-M5Tree, BA-GP, BA-ANN, SVM-RF, SVM-M5Tree, SVM-GP, SVM-ANN, RF-M5Tree, RF-GP, RF-ANN, M5Tree-GP, M5Tree-ANN, and GP-ANN. Of those, the hybrid models (i.e., BA-SVM, BA-RF, BA-M5Tree, BA-GP, BA-ANN, SVM-RF, SVM-M5Tree, SVM-GP, SVM-ANN, RF-M5Tree, RF-GP, RF-ANN, M5Tree-GP, M5Tree-ANN, GP-ANN) are introduced as the novelty of this study for estimating the sorption efficiency of heavy metals on biochar systems. The biochar characteristics, metal sources, and environmental conditions were also comprehensively assessed and used, which is a further novelty of the study. To this end, a dataset of heavy metal sorption efficiency comprising 353 experimental tests was collected and processed. Various performance indexes were applied to evaluate the models, such as RMSE, R2, MAE, color intensity, Taylor diagrams, and box-and-whisker plots. The findings revealed that AI models could predict the sorption efficiency of heavy metals onto biochar with high reliability, and that the efficiency of the ensemble models is higher than that of the individual models. The results also indicated that the SVM-ANN ensemble model is the best of the 20 developed models. The proposed predictive model suggests that the sorption efficiency of heavy metals on biochar can be accurately forecast, supporting early warning of heavy metal water pollution.
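The bagging technique (BA) referred to in this abstract can be sketched as bootstrap resampling followed by averaging. The example below is a minimal illustration of standard bagging around a single base learner, here a one-variable least-squares line chosen only to keep the code short; it is an assumption of this sketch and does not reproduce the paper's SVM, RF, ANN, M5Tree, or GP models or its pairwise hybrids.

```python
import random


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for one predictor."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return slope, mean_y - slope * mean_x


def bagged_predict(xs, ys, x_new, n_models=25, seed=0):
    """Average the predictions of base learners fitted on bootstrap resamples."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(xs)
    predictions = []
    while len(predictions) < n_models:
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        if len({xs[i] for i in idx}) < 2:
            continue  # a line cannot be fitted to a single distinct x value
        slope, intercept = fit_line([xs[i] for i in idx], [ys[i] for i in idx])
        predictions.append(slope * x_new + intercept)
    return sum(predictions) / len(predictions)


# Noise-free toy data following y = 2x + 1.
toy_x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
toy_y = [2.0 * x + 1.0 for x in toy_x]
estimate = bagged_predict(toy_x, toy_y, 3.0)
```

Averaging over resamples mainly reduces variance, so the benefit is largest for unstable base learners such as trees or neural networks rather than for a simple line.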
|
46
|
Mukherjee T, Sharma V, Sharma LK, Thakur M, Joshi BD, Sharief A, Thapa A, Dutta R, Dolker S, Tripathy B, Chandra K. Landscape-level habitat management plan through geometric reserve design for critically endangered Hangul (Cervus hanglu hanglu). THE SCIENCE OF THE TOTAL ENVIRONMENT 2021; 777:146031. [PMID: 33676208 DOI: 10.1016/j.scitotenv.2021.146031] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/20/2020] [Revised: 02/17/2021] [Accepted: 02/17/2021] [Indexed: 06/12/2023]
Abstract
Hangul (Cervus hanglu hanglu), the only red deer subspecies surviving in the Indian subcontinent, is of top conservation priority with global importance. Unfortunately, it has lost much of its historical distribution range, and it is now confined to the Dachigam landscape within the Kashmir valley of India. The Government of India initiated a recovery plan in 2008 to augment hangul numbers through ex-situ conservation programs. However, it was necessary to identify potential hangul habitats in the Kashmir valley for adopting landscape-level conservation planning for the species. Based on geometric aspects of reserve design, we modeled hangul habitat using an ensemble approach. The present model indicates that the conifer and broadleaf mixed forests were the most suitable habitats. Only 9% of the total study landscape was found suitable for the species. We identified corridors among the suitable habitat blocks, which may be vital for the species' long-term genetic viability. We suggest reorganizing the existing management of Dachigam National Park (NP) following landscape-level and habitat block-level management planning based on the core principles of geometric reserve design. We recommend that the identified patch (PID-6) in the southern region of the landscape be converted into a Conservation Reserve or merged with the Overa-Aru Wildlife Sanctuary. This habitat patch PID-6 may be a stepping stone habitat and vital for maintaining the species' landscape connectivity and metapopulation dynamics.
|
47
|
Tanveer MA, Khan MJ, Sajid H, Naseer N. Convolutional neural networks ensemble model for neonatal seizure detection. J Neurosci Methods 2021; 358:109197. [PMID: 33864835 DOI: 10.1016/j.jneumeth.2021.109197] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 12/31/2020] [Revised: 04/11/2021] [Accepted: 04/12/2021] [Indexed: 10/21/2022]
Abstract
BACKGROUND Neonatal seizures are a common occurrence in clinical settings, requiring immediate attention and detection. Previous studies have proposed using manual feature extraction coupled with machine learning, or deep learning, to distinguish between seizure and non-seizure states. NEW METHOD In this paper, a deep learning based approach is used for neonatal seizure classification using electroencephalogram (EEG) signals. The architecture detects seizure activity in raw EEG signals, as opposed to the common state of the art, where manual feature extraction with machine learning algorithms is used. The architecture is a two-dimensional (2D) convolutional neural network (CNN) that distinguishes between seizure and non-seizure states. RESULTS The dataset used for this study is annotated by three experts, and as such three separate models are trained on individual annotations, resulting in average accuracies (ACC) of 95.6 %, 94.8 % and 90.1 % respectively, and average area under the receiver operating characteristic curve (AUC) of 99.2 %, 98.4 % and 96.7 % respectively. The testing was done using 10-fold cross-validation, so that the performance can be an accurate representation of the architecture's classification capability in a clinical setting. After training/testing of the three individual models, a final ensemble model is made consisting of the three models. The ensemble model gives an average ACC and AUC of 96.3 % and 99.3 % respectively. COMPARISON WITH EXISTING METHODS This study outperforms previous studies, with increased ACC and AUC results coupled with the use of small time windows (1 s) for evaluation. CONCLUSION The proposed approach is promising for detecting seizure activity in unseen neonate data in a clinical setting.
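The 10-fold cross-validation used for testing can be illustrated by how the sample indices are partitioned. This sketch assumes contiguous, unshuffled folds over a flat list of samples; the abstract does not state how folds were formed (for EEG data, splitting by subject is a common alternative), and the function name is hypothetical.

```python
def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k (train, test) index pairs.

    Fold sizes differ by at most one, and every index is used for
    testing exactly once.
    """
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        stop = start + size
        test = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        folds.append((train, test))
        start = stop
    return folds


# Ten folds over 25 hypothetical EEG windows.
folds = k_fold_indices(25, 10)
```

Each model is trained on the `train` indices of a fold and scored on its `test` indices, and the reported accuracy and AUC are then averaged over the ten folds.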
|
48
|
Yu X, Yang Q, Wang D, Li Z, Chen N, Kong DX. Predicting lung adenocarcinoma disease progression using methylation-correlated blocks and ensemble machine learning classifiers. PeerJ 2021; 9:e10884. [PMID: 33628643 PMCID: PMC7894106 DOI: 10.7717/peerj.10884] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 07/02/2020] [Accepted: 01/12/2021] [Indexed: 01/20/2023] Open
Abstract
Applying the knowledge that methyltransferases and demethylases can modify adjacent cytosine-phosphate-guanine (CpG) sites in the same DNA strand, we found that combining multiple CpGs into a single block may improve cancer diagnosis. However, survival prediction remains a challenge. In this study, we developed a pipeline named "stacked ensemble of machine learning models for methylation-correlated blocks" (EnMCB) that combined Cox regression, support vector regression (SVR), and elastic-net models to construct signatures based on DNA methylation-correlated blocks for lung adenocarcinoma (LUAD) survival prediction. We used methylation profiles from the Cancer Genome Atlas (TCGA) as the training set, and profiles from the Gene Expression Omnibus (GEO) as validation and testing sets. First, we partitioned the genome into blocks of tightly co-methylated CpG sites, which we termed methylation-correlated blocks (MCBs). After partitioning and feature selection, we observed different diagnostic capacities for predicting patient survival across the models. We combined the multiple models into a single stacking ensemble model. The stacking ensemble model based on the top-ranked block had an area under the receiver operating characteristic curve of 0.622 in the TCGA training set, 0.773 in the validation set, and 0.698 in the testing set. When stratified by clinicopathological risk factors, the risk score predicted by the top-ranked MCB was an independent prognostic factor. Our results showed that our pipeline was a reliable tool that may facilitate MCB selection and survival prediction.
|
49
|
Mukherjee T, Sharma LK, Kumar V, Sharief A, Dutta R, Kumar M, Joshi BD, Thakur M, Venkatraman C, Chandra K. Adaptive spatial planning of protected area network for conserving the Himalayan brown bear. THE SCIENCE OF THE TOTAL ENVIRONMENT 2021; 754:142416. [PMID: 33254933 DOI: 10.1016/j.scitotenv.2020.142416] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 07/08/2020] [Revised: 09/12/2020] [Accepted: 09/14/2020] [Indexed: 06/12/2023]
Abstract
Large mammals that occur in low densities, particularly in high-altitude areas, are globally threatened due to fragile climatic and ecological envelopes. Among bear species, the Himalayan brown bear (Ursus arctos isabellinus) has a distribution that is restricted to Himalayan highlands with relatively small and fragmented populations. To date, very little scientific information on the Himalayan brown bear, which is vital for the conservation of the species and the management of its habitats, especially in protected areas of the landscape, is available. The present study aims to understand the effectiveness of existing Himalayan Protected Areas in terms of representativeness for the conservation of the Himalayan brown bear (HBB), an umbrella species in high-altitude habitats of the Himalayan region. We used the ensemble approach of the species distribution model and then assessed biological connectivity to predict the current and future distribution and movement of HBB in climate change scenarios for the year 2050. Approximately 33 protected areas (PAs) currently possess suitable habitats. Our model suggests a massive decline of approximately 73.38% and 72.87% under representative concentration pathways (RCP) 4.5 and 8.5, respectively, in the year 2050 compared with the current distribution. The predicted change in suitability will result in loss of habitats from thirteen PAs; eight will become completely uninhabitable by the year 2050, followed by loss of connectivity in the majority of PAs. Habitat configuration analysis suggested a 40% decline in the number of suitable patches, a reduction in large habitat patches (up to 50%) and aggregation of suitable areas (9%) by 2050, indicating fragmentation. The predicted change in geographic isotherm will result in loss of habitats from thirteen PAs, eight of which will become completely uninhabitable. Hence, these PAs may lose their effectiveness and representativeness in achieving the very objective of their existence or conservation goals. Therefore, we recommend adaptive spatial planning for protecting suitable habitats distributed outside the PAs for climate change adaptation.
|
50
|
Gifani P, Shalbaf A, Vafaeezadeh M. Automated detection of COVID-19 using ensemble of transfer learning with deep convolutional neural network based on CT scans. Int J Comput Assist Radiol Surg 2021; 16:115-123. [PMID: 33191476 PMCID: PMC7667011 DOI: 10.1007/s11548-020-02286-w] [Citation(s) in RCA: 65] [Impact Index Per Article: 21.7] [Received: 05/23/2020] [Accepted: 10/23/2020] [Indexed: 12/18/2022]
Abstract
PURPOSE COVID-19 has infected millions of people worldwide. One of the most important hurdles in controlling the spread of this disease is the inefficiency and lack of medical tests. Computed tomography (CT) scans are promising in providing accurate and fast detection of COVID-19. However, determining COVID-19 requires highly trained radiologists and suffers from inter-observer variability. To remedy these limitations, this paper introduces an automatic methodology based on an ensemble of deep transfer learning for the detection of COVID-19. METHODS A total of 15 pre-trained convolutional neural network (CNN) architectures, namely EfficientNets (B0-B5), NasNetLarge, NasNetMobile, InceptionV3, ResNet-50, SeResnet 50, Xception, DenseNet121, ResNext50 and Inception_resnet_v2, are used and then fine-tuned on the target task. After that, we built an ensemble method based on majority voting of the best combination of deep transfer learning outputs to further improve the recognition performance. We have used a publicly available dataset of CT scans, which consists of 349 CT scans labeled as being positive for COVID-19 and 397 negative COVID-19 CT scans that are normal or contain other types of lung diseases. RESULTS The experimental results indicate that majority voting over 5 deep transfer learning architectures (EfficientNetB0, EfficientNetB3, EfficientNetB5, Inception_resnet_v2, and Xception) yields better results than any individual transfer learning structure and than the other combinations, based on precision (0.857), recall (0.854) and accuracy (0.85) metrics in diagnosing COVID-19 from CT scans. CONCLUSION Our study shows that an ensemble deep transfer learning system with different pre-trained CNN architectures can work well on a publicly available dataset of CT images for the diagnosis of COVID-19.
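Choosing "the best combination" of models for majority voting can be sketched as a search over model subsets scored on validation labels. The abstract does not say how the combination was found, so the exhaustive search below is an assumption of this sketch, as are the function names, the single-letter model names, and the toy predictions.

```python
from itertools import combinations


def vote(prediction_lists):
    """Binary majority vote per sample (1 wins only with a strict majority)."""
    return [1 if 2 * sum(column) > len(column) else 0 for column in zip(*prediction_lists)]


def accuracy(predicted, truth):
    return sum(p == t for p, t in zip(predicted, truth)) / len(truth)


def best_voting_subset(model_preds, truth, size):
    """Try every subset of the given size and keep the most accurate vote.

    model_preds maps a model name to its list of 0/1 validation predictions.
    Ties keep the first subset in sorted-name order.
    """
    best_names, best_acc = None, -1.0
    for names in combinations(sorted(model_preds), size):
        acc = accuracy(vote([model_preds[name] for name in names]), truth)
        if acc > best_acc:
            best_names, best_acc = names, acc
    return best_names, best_acc


# Four hypothetical fine-tuned models scored on four validation scans.
validation_preds = {
    "a": [1, 0, 1, 0],
    "b": [1, 1, 1, 0],
    "c": [0, 0, 0, 0],
    "d": [1, 0, 0, 1],
}
chosen, chosen_acc = best_voting_subset(validation_preds, [1, 0, 1, 0], 3)
```

With 15 candidate architectures there are 3003 subsets of size 5, so an exhaustive search is cheap once per-model predictions are cached; the selected subset should still be confirmed on data not used for the selection.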
|