Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

Total Articles

64
(from Reference Citation Analysis)

Article PDFs (4)

Cited by > 0 (44)

Searched Name

Ensemble model

Citation Analysis

Yakovyna V, Shakhovska N, Szpakowska A. A novel hybrid supervised and unsupervised hierarchical ensemble for COVID-19 cases and mortality prediction. Sci Rep 2024;14:9782. [PMID: 38684770 PMCID: PMC11059164 DOI: 10.1038/s41598-024-60637-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/15/2023] [Accepted: 04/25/2024] [Indexed: 05/02/2024] Open

Abstract

Although COVID-19 is no longer a pandemic but rather an endemic disease, the epidemiological situation related to the SARS-CoV-2 virus is developing at an alarming rate, impacting every corner of the world. The rapid escalation of the coronavirus has engaged the scientific community, which continually seeks solutions to ensure the comfort and safety of society. Understanding the joint impact of medical and non-medical interventions on COVID-19 spread is essential for making public health decisions that control the pandemic. This paper introduces two novel hybrid machine-learning ensembles that combine supervised and unsupervised learning for COVID-19 data classification and regression. The study uses the publicly available "COVID-19 outbreak and potential predictive features in the USA" dataset, which provides information related to the outbreak of COVID-19 disease in the US, including data from each of 3142 US counties from the beginning of the epidemic (January 2020) until June 2021. The developed hybrid hierarchical classifiers outperform single classification algorithms. The best-achieved performance metrics for the classification task were Accuracy = 0.912, ROC-AUC = 0.916, and F1-score = 0.916. The proposed hybrid hierarchical ensemble combining both supervised and unsupervised learning allows us to increase the accuracy of the regression task by 11% in terms of MSE, 29% in terms of the area under the ROC, and 43% in terms of the MPP metric. Thus, using the proposed approach, it is possible to predict the number of COVID-19 cases and deaths based on demographic, geographic, climatic, traffic, public health, social-distancing-policy adherence, and political characteristics with sufficiently high accuracy. The study reveals that virus pressure is the most important feature in COVID-19 spread for classification and regression analysis. Five other significant features were identified as having the most influence on COVID-19 spread. The combined ensembling approach introduced in this study can help policymakers design prevention and control measures to avoid or minimize public health threats in the future.

Yin Y, Ahmadianfar I, Karim FK, Elmannai H. Advanced forecasting of COVID-19 epidemic: Leveraging ensemble models, advanced optimization, and decomposition techniques. Comput Biol Med 2024;175:108442. [PMID: 38678939 DOI: 10.1016/j.compbiomed.2024.108442] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/06/2024] [Revised: 03/25/2024] [Accepted: 04/07/2024] [Indexed: 05/01/2024]

Abstract

In the global effort to address the outbreak of the new coronavirus pneumonia (COVID-19) pandemic, accurate forecasting of epidemic patterns has become crucial for implementing successful interventions aimed at preventing and controlling the spread of the disease. Correctly predicting the course of COVID-19 outbreaks is a complex and challenging task, mainly because of the significant volatility in COVID-19 data series. Previous studies have been limited by their exclusive use of individual forecasting techniques in epidemic modeling, disregarding the integration of diverse prediction procedures, which can yield suboptimal results. Consequently, this study introduces a novel ensemble framework that integrates three machine learning methods (kernel ridge regression (KRidge), deep random vector functional link (dRVFL), and ridge regression) within a linear relationship (L-KRidge-dRVFL-Ridge). The framework is optimized through a distinctive approach, adaptive differential evolution and particle swarm optimization (A-DEPSO). Moreover, an effective decomposition method, time-varying filter empirical mode decomposition (TVF-EMD), is employed to decompose the input variables. A feature selection technique based on the light gradient boosting machine (LGBM) is also implemented to extract the most influential input variables. Daily COVID-19 datasets collected from two countries, Italy and Poland, were used as experimental examples. Additionally, all models were implemented to forecast COVID-19 at two time horizons: 10 and 14 days ahead (t+10 and t+14). According to the results, the proposed model yields a higher correlation coefficient (R) than the other models for both case studies: Italy (t+10 = 0.965, t+14 = 0.961) and Poland (t+10 = 0.952, t+14 = 0.940). The experimental results demonstrate that the proposed model performs well in various kinds of complex epidemic prediction situations. The proposed ensemble model demonstrates exceptional accuracy and resilience, outperforming all similar models in terms of efficacy.
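
The idea of combining three base forecasts "within a linear relationship" and scoring the result with the correlation coefficient R can be illustrated with a minimal sketch. This is not the authors' implementation: the forecast values and weights below are hypothetical, and the weights are fixed by hand rather than tuned by A-DEPSO.

```python
import math

def linear_ensemble(forecasts, weights, bias=0.0):
    """Combine per-model forecast series as a weighted sum plus a bias term."""
    n = len(forecasts[0])
    return [bias + sum(w * f[t] for w, f in zip(weights, forecasts))
            for t in range(n)]

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical observed values and base-model forecasts (illustrative only).
observed = [10.0, 12.0, 15.0, 14.0, 18.0]
kridge = [9.0, 12.5, 14.0, 15.0, 17.0]
drvfl = [11.0, 11.5, 16.0, 13.5, 19.0]
ridge = [10.5, 13.0, 15.5, 13.0, 18.5]

# Hypothetical weights; an optimizer would normally choose these.
weights = [0.4, 0.3, 0.3]
combined = linear_ensemble([kridge, drvfl, ridge], weights)
r = pearson_r(observed, combined)
print(combined, round(r, 4))
```

In practice the weights would be fitted on a training window and R reported on held-out data for each horizon.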

Liu J, Zhu A, Wang X, Zhou X, Chen L. Predicting the current fishable habitat distribution of Antarctic toothfish (Dissostichus mawsoni) and its shift in the future under climate change in the Southern Ocean. PeerJ 2024;12:e17131. [PMID: 38563000 PMCID: PMC10984185 DOI: 10.7717/peerj.17131] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2023] [Accepted: 02/27/2024] [Indexed: 04/04/2024] Open

Hussain S, Aslam W, Mehmood A, Choi GS, Ashraf I. A machine learning based framework for IoT devices identification using web traffic. PeerJ Comput Sci 2024;10:e1834. [PMID: 38660201 PMCID: PMC11041939 DOI: 10.7717/peerj-cs.1834] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/09/2023] [Accepted: 01/02/2024] [Indexed: 04/26/2024]

Kim C, Park JH, Lee JY. AI-based betting anomaly detection system to ensure fairness in sports and prevent illegal gambling. Sci Rep 2024;14:6470. [PMID: 38499635 PMCID: PMC10948790 DOI: 10.1038/s41598-024-57195-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/11/2023] [Accepted: 03/15/2024] [Indexed: 03/20/2024] Open

Brar AS, Singh K. A multi-objective stacked regression method for distance based colour measuring device. Sci Rep 2024;14:5530. [PMID: 38448462 PMCID: PMC10918078 DOI: 10.1038/s41598-024-54785-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/30/2023] [Accepted: 02/16/2024] [Indexed: 03/08/2024] Open

He X, Yang Z, Wang L, Sun Y, Cao H, Liang Y. NeuTox: A weighted ensemble model for screening potential neuronal cytotoxicity of chemicals based on various types of molecular representations. J Hazard Mater 2024;465:133443. [PMID: 38198870 DOI: 10.1016/j.jhazmat.2024.133443] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/19/2023] [Revised: 01/02/2024] [Accepted: 01/03/2024] [Indexed: 01/12/2024]

Byeon DH, Lee WH. Ensemble evaluation of potential distribution of Procambarus clarkii using multiple species distribution models. Oecologia 2024;204:589-601. [PMID: 38386057 DOI: 10.1007/s00442-024-05516-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/20/2022] [Accepted: 01/20/2024] [Indexed: 02/23/2024]

Jin Z, Zhao H, Xian X, Li M, Qi Y, Guo J, Yang N, Lü Z, Liu W. Early warning and management of invasive crop pests under global warming: estimating the global geographical distribution patterns and ecological niche overlap of three Diabrotica beetles. Environ Sci Pollut Res Int 2024;31:13575-13590. [PMID: 38253826 DOI: 10.1007/s11356-024-32076-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/13/2023] [Accepted: 01/15/2024] [Indexed: 01/24/2024]

Abstract

Invasive alien pests (IAPs) pose a major threat to global agriculture and food production. When multiple IAPs coexist in the same habitat and use the same resources, the economic loss to local agricultural production increases. Many species of the Diabrotica genus, such as Diabrotica barberi, Diabrotica undecimpunctata, and Diabrotica virgifera, originating from the USA and Mexico, have seriously damaged maize production in North America and Europe. However, the potential geographic distributions (PGDs) and degree of ecological niche overlap among the three Diabrotica beetles remain unclear; thus, the potential coexistence zone is unknown. Based on environmental and species occurrence data, we used an ensemble model (EM) to predict the PGDs and overlapping PGD of the three Diabrotica beetles. The n-dimensional hypervolumes concept was used to explore the degree of niche overlap among the three species. The EM showed better reliability than the individual models. According to the EM results, the PGDs and overlapping PGD of the three Diabrotica beetles were mainly distributed in North America, Europe, and Asia. Under the current scenario, D. virgifera has the largest PGD range (1615 × 10⁴ km²). In the future, the PGD of this species will expand further and reach a maximum under the SSP5-8.5 scenario in the 2050s (2499 × 10⁴ km²). Diabrotica virgifera showed the highest potential for invasion under the current and future global warming scenarios. Among the three studied species, the degree of ecological niche overlap was the highest for D. undecimpunctata and D. virgifera, with the highest similarity in the PGD patterns and maximum coexistence range. Under global warming, the PGDs of the three Diabrotica beetles are expected to expand to high latitudes. Identifying the PGDs of the three Diabrotica beetles provides an important reference for quarantine authorities in countries at risk of invasion worldwide to develop specific preventive measures against pests.

Tache IA, Hatfaludi CA, Puiu A, Itu LM, Popa-Fotea NM, Calmac L, Scafa-Udriste A. Assessment of the functional severity of coronary lesions from optical coherence tomography based on ensembled learning. Biomed Eng Online 2023;22:127. [PMID: 38104144 PMCID: PMC10724936 DOI: 10.1186/s12938-023-01192-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/12/2023] [Accepted: 12/07/2023] [Indexed: 12/19/2023] Open

Wang S, Lin M, Meng Y, Jiang T, Fan F, Wang S. Self-expansion full information optimization strategy: Convenient and efficient method for near infrared spectrum auto-analysis. Spectrochim Acta A Mol Biomol Spectrosc 2023;303:123224. [PMID: 37603976 DOI: 10.1016/j.saa.2023.123224] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/19/2023] [Revised: 07/06/2023] [Accepted: 07/31/2023] [Indexed: 08/23/2023]

Abstract

Spectrum preprocessing is an essential step in the application of near-infrared (NIR) spectroscopy. A sound implementation ensures that the effective spectral information is correctly extracted and that the model's accuracy is increased. However, some analysts, particularly less experienced ones, still rely on a manual trial-and-error approach. Previous papers have provided preprocessing optimization algorithms for NIR, but some problems remain to be resolved, such as the unwieldy determination of the preprocessing sequence, fluctuating optimization outcomes, and a lack of sufficient statistical information. This research proposes a spectrum auto-analysis methodology named the self-expansion full information optimization strategy, a new open-source technique that addresses all of these issues simultaneously. For the first time in the field of chemometrics, this algorithm offers a reliable and effective automatic near-infrared modelling method based on statistical informatics. With the aid of its built-in modules, such as information generators and spectrum processors, it is able to fully search the common preprocessing techniques, as determined by Monte Carlo cross-validation. The final ensemble calibration model is then built by employing the optimized preprocessing schemes along with a wavelength variable screening algorithm. The optimization strategy offers the user objective, useful statistical information generated throughout the modeling process to further examine the model's effectiveness. Tests on two groups of actual near-infrared spectral data demonstrate that the suggested method can easily and successfully extract spectral information and develop calibration models. Additionally, this optimization strategy can be applied to other spectrum analysis areas, such as Raman spectroscopy or infrared spectroscopy, by changing a few of its parameters, and has extraordinary application value.
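
Monte Carlo cross-validation, which the abstract names as the selection criterion, repeatedly draws random train/test splits and averages the resulting error. The sketch below is only an illustration of that resampling scheme, not the published algorithm: the function names, the toy response values, and the mean-value placeholder model are all assumptions.

```python
import random

def monte_carlo_cv(n_samples, n_repeats, test_fraction, seed=0):
    """Yield (train_indices, test_indices) for repeated random splits."""
    rng = random.Random(seed)
    n_test = max(1, int(round(n_samples * test_fraction)))
    indices = list(range(n_samples))
    for _ in range(n_repeats):
        shuffled = indices[:]
        rng.shuffle(shuffled)
        yield shuffled[n_test:], shuffled[:n_test]

def mean_predictor_rmse(y, train_idx, test_idx):
    """Placeholder model: predict the training mean, return test RMSE."""
    mean = sum(y[i] for i in train_idx) / len(train_idx)
    squared = sum((y[i] - mean) ** 2 for i in test_idx)
    return (squared / len(test_idx)) ** 0.5

# Hypothetical reference values standing in for a measured property.
y = [float(v) for v in range(20)]
scores = [mean_predictor_rmse(y, train, test)
          for train, test in monte_carlo_cv(len(y), 50, 0.25)]
avg_rmse = sum(scores) / len(scores)
print(round(avg_rmse, 3))
```

To compare preprocessing schemes, the same splits would be reused for each candidate and the scheme with the lowest average error retained.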

K V, Al-onazi BB, Simic V, Tirkolaee EB, Jana C. DeepFND: an ensemble-based deep learning approach for the optimization and improvement of fake news detection in digital platform. PeerJ Comput Sci 2023;9:e1666. [PMID: 38192452 PMCID: PMC10773750 DOI: 10.7717/peerj-cs.1666] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/21/2023] [Accepted: 10/05/2023] [Indexed: 01/10/2024]

Zamani MG, Nikoo MR, Jahanshahi S, Barzegar R, Meydani A. Forecasting water quality variable using deep learning and weighted averaging ensemble models. Environ Sci Pollut Res Int 2023;30:124316-124340. [PMID: 37996598 DOI: 10.1007/s11356-023-30774-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2023] [Accepted: 10/27/2023] [Indexed: 11/25/2023]

Abstract

Water quality variables, including chlorophyll-a (Chl-a), play a pivotal role in comprehending and evaluating the condition of aquatic ecosystems. Chl-a, a pigment present in diverse aquatic organisms, notably algae and cyanobacteria, serves as a valuable indicator of water quality. Thus, the objectives of this study encompass: (1) the assessment of the predictive capabilities of four deep learning (DL) models - namely, recurrent neural network (RNN), long short-term memory (LSTM), gated recurrent unit (GRU), and temporal convolutional network (TCN) - in forecasting Chl-a concentrations; (2) the incorporation of these DL models into ensemble models (EMs) employing genetic algorithm (GA) and non-dominated sorting genetic algorithm (NSGA-II) to harness the strengths of each standalone model; and (3) the evaluation of the efficacy of the developed EMs. Utilizing data collected at 15-min intervals from Small Prespa Lake (SPL) in Greece, the models employed hourly Chl-a concentration lag times, extending up to 6 h, as model inputs to forecast Chl-a (t+1). The proposed models underwent training on 70% of the dataset and were subsequently validated on the remaining 30%. Among the standalone DL models, the GRU model exhibited superior performance in Chl-a forecasting, surpassing the RNN, LSTM, and TCN models by 8%, 2%, and 2%, respectively. Furthermore, the integration of DL models through single-objective GA and multi-objective NSGA-II optimization algorithms yielded hybrid models adept at effectively forecasting both low and high Chl-a concentrations. The ensemble model based on NSGA-II outperformed standalone DL models as well as the GA-based model across a range of evaluation indices. For instance, considering the R-squared metric, the study's findings demonstrated that the EM-NSGA-II stands out with exceptional effectiveness compared to DL and EM-GA models, showcasing improvements of 14% (RNN), 8% (LSTM), 6% (GRU), 8% (TCN), and 3% (EM-GA) during the testing phase.
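
The data preparation described above (lagged inputs up to 6 h, a 70/30 split, and a weighted averaging ensemble) can be sketched as follows. This is a hedged illustration: the series is synthetic, the two base "models" are trivial stand-ins for the DL models, the split is assumed to be chronological, and the weights are fixed by hand rather than found by GA or NSGA-II.

```python
def make_lagged(series, n_lags):
    """Build (inputs, target) pairs: n_lags past values predict the next one."""
    inputs, targets = [], []
    for t in range(n_lags, len(series)):
        inputs.append(series[t - n_lags:t])
        targets.append(series[t])
    return inputs, targets

def chronological_split(inputs, targets, train_percent=70):
    """Keep time order: first part for training, the remainder for validation."""
    cut = len(inputs) * train_percent // 100
    return inputs[:cut], targets[:cut], inputs[cut:], targets[cut:]

def weighted_average(predictions, weights):
    """Weighted averaging ensemble over per-model prediction lists."""
    total = sum(weights)
    return [sum(w * p[i] for w, p in zip(weights, predictions)) / total
            for i in range(len(predictions[0]))]

# Synthetic hourly series standing in for Chl-a concentrations.
hourly = [float(v) for v in range(26)]
X, y = make_lagged(hourly, 6)
X_train, y_train, X_test, y_test = chronological_split(X, y)

# Stand-in base models: last observed value, and mean of the lag window.
persistence = [window[-1] for window in X_test]
window_mean = [sum(window) / len(window) for window in X_test]

# Hypothetical weights; the paper tunes these with GA / NSGA-II.
ensemble = weighted_average([persistence, window_mean], [0.75, 0.25])
print(len(X_train), len(X_test), ensemble[0])
```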

Liu M, Liu H, Wu T, Zhu Y, Zhou Y, Huang Z, Xiang C, Huang J. ACP-Dnnel: anti-coronavirus peptides' prediction based on deep neural network ensemble learning. Amino Acids 2023;55:1121-1136. [PMID: 37402073 DOI: 10.1007/s00726-023-03300-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 04/25/2023] [Accepted: 06/25/2023] [Indexed: 07/05/2023]

Abstract

The ongoing COVID-19 pandemic has caused dramatic loss of human life. There is an urgent need for safe and efficient anti-coronavirus infection drugs. Anti-coronavirus peptides (ACovPs) can inhibit coronavirus infection. With high-efficiency, low-toxicity, and broad-spectrum inhibitory effects on coronaviruses, they are promising candidates to be developed into a new type of anti-coronavirus drug. Experiment is the traditional way of identifying ACovPs, but it is less efficient and more expensive. With the accumulation of experimental data on ACovPs, computational prediction provides a cheaper and faster way to find candidate anti-coronavirus peptides. In this study, we ensemble several state-of-the-art machine learning methodologies to build nine classification models for the prediction of ACovPs. These models were pre-trained using deep neural networks, and the performance of our ensemble model, ACP-Dnnel, was evaluated across three datasets and an independent dataset. We followed Chou's 5-step rules: (1) we constructed the benchmark datasets data1, data2, and data3 for training and testing, and introduced the independent validation dataset ACVP-M; (2) we analyzed the peptide sequence composition features of the benchmark dataset; (3) we constructed the ACP-Dnnel model with a deep convolutional neural network (DCNN) merged with bi-directional long short-term memory (BiLSTM) as the base model for pre-training to extract the features embedded in the benchmark dataset, and then nine classification algorithms were ensembled for classification prediction by voting; (4) tenfold cross-validation was introduced during the training process, and the final model performance was evaluated; (5) finally, we constructed a user-friendly web server accessible to the public at http://150.158.148.228:5000/ . The highest accuracy (ACC) of ACP-Dnnel reaches 97%, and the Matthews correlation coefficient (MCC) value exceeds 0.9. On three different datasets, its average accuracy is 96.0%. On the latest independent validation dataset, ACP-Dnnel improved the MCC, SP, and ACC values by 6.2%, 7.5%, and 6.3%, respectively. It is suggested that ACP-Dnnel can be helpful for the laboratory identification of ACovPs, speeding up anti-coronavirus peptide drug discovery and development.
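
Step (3) combines several classifiers by voting, and performance is reported as ACC and MCC. A minimal sketch of hard majority voting and the MCC formula follows. The labels and per-model predictions are hypothetical, three voters are used instead of nine for brevity, and the abstract does not state whether the published model uses hard or soft voting.

```python
import math

def majority_vote(votes_per_model):
    """Hard voting: predict 1 when more than half of the models predict 1."""
    n_models = len(votes_per_model)
    n_items = len(votes_per_model[0])
    return [1 if 2 * sum(m[i] for m in votes_per_model) > n_models else 0
            for i in range(n_items)]

def confusion(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

def mcc(y_true, y_pred):
    """Matthews correlation coefficient; 0.0 when the denominator is zero."""
    tp, tn, fp, fn = confusion(y_true, y_pred)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Hypothetical labels (1 = ACovP) and predictions from three classifiers.
y_true = [1, 1, 1, 0, 0, 0]
model_a = [1, 1, 0, 1, 0, 1]
model_b = [1, 0, 1, 0, 1, 0]
model_c = [1, 1, 1, 1, 0, 0]

voted = majority_vote([model_a, model_b, model_c])
accuracy = sum(1 for t, p in zip(y_true, voted) if t == p) / len(y_true)
print(voted, round(accuracy, 3), round(mcc(y_true, voted), 3))
```

Using an odd number of voters, as with the nine classifiers in the abstract, avoids ties in a binary vote.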

Shahabi MS, Shalbaf A, Rostami R. Prediction of response to repetitive transcranial magnetic stimulation for major depressive disorder using hybrid Convolutional recurrent neural networks and raw Electroencephalogram Signal. Cogn Neurodyn 2023;17:909-920. [PMID: 37522037 PMCID: PMC10374518 DOI: 10.1007/s11571-022-09881-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/10/2022] [Revised: 08/03/2022] [Accepted: 08/28/2022] [Indexed: 11/30/2022] Open

Chen J, Engelhard M, Henao R, Berchuck S, Eichner B, Perrin EM, Sapiro G, Dawson G. Enhancing early autism prediction based on electronic records using clinical narratives. J Biomed Inform 2023;144:104390. [PMID: 37182592 PMCID: PMC10526711 DOI: 10.1016/j.jbi.2023.104390] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/06/2023] [Revised: 04/14/2023] [Accepted: 05/09/2023] [Indexed: 05/16/2023]

Abstract

Recent work has shown that predictive models can be applied to structured electronic health record (EHR) data to stratify autism likelihood from an early age (<1 year). Integrating clinical narratives (or notes) with structured data has been shown to improve prediction performance in other clinical applications, but the added predictive value of this information in early autism prediction has not yet been explored. In this study, we aimed to enhance the performance of early autism prediction by using both structured EHR data and clinical narratives. We built models based on structured data and clinical narratives separately, and then an ensemble model that integrated both sources of data. We assessed the predictive value of these models using data from the Duke University Health System over a 14-year span, evaluating ensemble models that predict later autism diagnosis (by age 4 years) from data collected from ages 30 to 360 days. Our sample included 11,750 children above age 3 years (385 meeting autism diagnostic criteria). The ensemble model for autism prediction showed superior performance and at age 30 days achieved 46.8% sensitivity (95% confidence interval, CI: 22.0%, 52.9%), 28.0% positive predictive value (PPV) at high (90%) specificity (CI: 2.0%, 33.1%), and AUC4 (with at least 4-year follow-up for controls) reaching 0.769 (CI: 0.715, 0.811). Prediction by 360 days achieved 44.5% sensitivity (CI: 23.6%, 62.9%), 13.7% PPV at high (90%) specificity (CI: 9.6%, 18.9%), and AUC4 reaching 0.797 (CI: 0.746, 0.840). Results show that incorporating clinical narratives in early autism prediction achieved promising accuracy by age 30 days, outperforming models based on structured data only. Furthermore, findings suggest that additional features learned from clinician narratives might be hypothesis generating for understanding early development in autism.
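
Reporting sensitivity and PPV "at high (90%) specificity" means first choosing the score threshold that keeps at least 90% of controls below it, then computing the other metrics at that threshold. The sketch below shows one way to do this, assuming a higher score means higher predicted likelihood; the scores and labels are invented for illustration and are not from the study.

```python
def threshold_at_specificity(scores, labels, target_specificity):
    """Lowest score threshold whose specificity reaches the target.

    A case is called positive when its score is >= the threshold.
    """
    negatives = [s for s, l in zip(scores, labels) if l == 0]
    for thr in sorted(set(scores)):
        true_negatives = sum(1 for s in negatives if s < thr)
        if true_negatives / len(negatives) >= target_specificity:
            return thr
    return float("inf")  # no observed score reaches the target

def sensitivity_and_ppv(scores, labels, thr):
    """Sensitivity and positive predictive value at a fixed threshold."""
    tp = sum(1 for s, l in zip(scores, labels) if l == 1 and s >= thr)
    fn = sum(1 for s, l in zip(scores, labels) if l == 1 and s < thr)
    fp = sum(1 for s, l in zip(scores, labels) if l == 0 and s >= thr)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    ppv = tp / (tp + fp) if (tp + fp) else 0.0
    return sensitivity, ppv

# Hypothetical model scores: ten controls followed by four cases.
scores = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40, 0.45, 0.80,
          0.22, 0.50, 0.60, 0.90]
labels = [0] * 10 + [1] * 4

thr = threshold_at_specificity(scores, labels, 0.9)
sens, ppv = sensitivity_and_ppv(scores, labels, thr)
print(thr, sens, ppv)
```

Because PPV depends on prevalence, values from a balanced toy sample like this one are not comparable to those from a cohort with about 3% cases.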

Dai TY, Radhakrishnan P, Nweye K, Estrada R, Niyogi D, Nagy Z. Analyzing the impact of COVID-19 on the electricity demand in Austin, TX using an ensemble-model based counterfactual and 400,000 smart meters. Comput Urban Sci 2023;3:20. [PMID: 37192956 PMCID: PMC10162906 DOI: 10.1007/s43762-023-00095-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/29/2023] [Revised: 03/17/2023] [Accepted: 03/29/2023] [Indexed: 05/18/2023]

Kim G, Park YM, Yoon HJ, Choi JH. A multi-kernel and multi-scale learning based deep ensemble model for predicting recurrence of non-small cell lung cancer. PeerJ Comput Sci 2023;9:e1311. [PMID: 37346527 PMCID: PMC10280639 DOI: 10.7717/peerj-cs.1311] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/15/2022] [Accepted: 03/06/2023] [Indexed: 06/23/2023]

Yang SQ, Zhang LX, Ge YJ, Zhang JW, Hu JX, Shen CY, Lu AP, Hou TJ, Cao DS. In-silico target prediction by ensemble chemogenomic model based on multi-scale information of chemical structures and protein sequences. J Cheminform 2023;15:48. [PMID: 37088813 PMCID: PMC10123967 DOI: 10.1186/s13321-023-00720-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/14/2022] [Accepted: 04/08/2023] [Indexed: 04/25/2023] Open

Affiliation(s)

Su-Qing Yang: Xiangya School of Pharmaceutical Sciences, Central South University, Changsha, 410013, Hunan, People's Republic of China; Department of Pharmacy, Jiangxi Provincial People's Hospital, The First Affiliated Hospital of Nanchang Medical College, Nanchang, 330006, Jiangxi, People's Republic of China.
Liu-Xia Zhang: The First Hospital of Hunan University of Chinese Medicine, Changsha, 410007, Hunan, People's Republic of China.
You-Jin Ge: Department of Pharmacy, Jiangxi Provincial People's Hospital, The First Affiliated Hospital of Nanchang Medical College, Nanchang, 330006, Jiangxi, People's Republic of China.
Jin-Wei Zhang: Departments of Biomedical Engineering and Pathology, School of Basic Medical Science, Central South University, Changsha, 410013, Hunan, People's Republic of China.
Jian-Xin Hu: Department of Pharmacy, Jiangxi Provincial People's Hospital, The First Affiliated Hospital of Nanchang Medical College, Nanchang, 330006, Jiangxi, People's Republic of China.
Cheng-Ying Shen: Department of Pharmacy, Jiangxi Provincial People's Hospital, The First Affiliated Hospital of Nanchang Medical College, Nanchang, 330006, Jiangxi, People's Republic of China.
Ai-Ping Lu: Institute for Advancing Translational Medicine in Bone and Joint Diseases, School of Chinese Medicine, Hong Kong Baptist University, Hong Kong SAR, People's Republic of China.
Ting-Jun Hou: Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, Zhejiang, People's Republic of China.
Dong-Sheng Cao: Xiangya School of Pharmaceutical Sciences, Central South University, Changsha, 410013, Hunan, People's Republic of China; Institute for Advancing Translational Medicine in Bone and Joint Diseases, School of Chinese Medicine, Hong Kong Baptist University, Hong Kong SAR, People's Republic of China.

Yang X, Zhang X, Zhang P, Bidegain G, Dong J, Hu C, Li M, Zhang Z, Guo H. Ensemble habitat suitability modeling for predicting optimal sites for eelgrass (Zostera marina) in the tidal lagoon ecosystem: Implications for restoration and conservation. J Environ Manage 2023;330:117108. [PMID: 36584472 DOI: 10.1016/j.jenvman.2022.117108] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/19/2022] [Revised: 12/18/2022] [Accepted: 12/20/2022] [Indexed: 06/17/2023]

Abstract

Seagrass systems are in decline, mainly due to anthropogenic pressures and ongoing climate change. Implementing seagrass protection and restoration measures requires accurate assessment of suitable habitats. Commonly, such assessments have been performed using single-algorithm habitat suitability models, nearly always based on low environmental resolution information and short-term species data series. Here we address the large-scale decline (>80%) of eelgrass (Zostera marina) meadows in Shandong province (Yellow Sea, China) by developing an ensemble habitat model (EHM) to inform eelgrass conservation and restoration strategies in the Swan Lake (SL). For this, we applied a weighted EHM derived from ten single-algorithm models including profile, regression, classification, and machine learning methods to generate a high-resolution habitat suitability map. The EHM was constructed based on the predictive performances of each model, by combining a series of presence-absence eelgrass datasets from recent years coupled with oceanographic and sediment data. The model was cross-validated with independent historical datasets, and a final habitat suitability map for conservation and restoration was generated. Our EHM scheme outperformed all single models in terms of habitat suitability, scoring ∼0.95 for both true skill statistic (TSS) and area under the curve (AUC) performance criteria. Machine learning methods outperformed profile, regression and classification methods. Regarding model explanatory variables, overall, topographic characteristics such as depth (DEP) and seafloor slope (SSL) are the most significant factors determining the distribution of eelgrass. The EHM predicted that the overlapping area was almost 90% of the current eelgrass habitat. Using results from our EHM, a LOESS regression model for the relationship of the habitat suitability to both the biomass and density of Z. marina performed better than the classic Ordinary Least Squares regression model. The EHM is a promising tool for supporting eelgrass protection and restoration areas in temperate lagoons as data availability improves.
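
A weighted ensemble "constructed based on the predictive performances of each model" can be sketched by scoring each single-algorithm map with the true skill statistic (TSS = sensitivity + specificity - 1) and using those scores as weights. The example below is an assumption-laden illustration: the observations, the three suitability maps, the 0.5 presence cutoff, and the choice of TSS as the weight are hypothetical, and the paper's exact weighting scheme is not given in the abstract.

```python
def binarize(suitability, cutoff=0.5):
    """Convert suitability values to presence (1) or absence (0)."""
    return [1 if v >= cutoff else 0 for v in suitability]

def tss(y_true, y_pred):
    """True skill statistic: sensitivity + specificity - 1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn) + tn / (tn + fp) - 1

def weighted_suitability(model_maps, model_scores):
    """Cell-by-cell weighted mean of the maps, weights = model scores."""
    total = sum(model_scores)
    n_cells = len(model_maps[0])
    return [sum(s * m[i] for s, m in zip(model_scores, model_maps)) / total
            for i in range(n_cells)]

# Hypothetical presence-absence observations for four grid cells.
observed = [1, 1, 0, 0]
# Hypothetical suitability maps from three single-algorithm models.
maps = [
    [0.9, 0.8, 0.2, 0.1],
    [0.7, 0.4, 0.3, 0.2],
    [0.6, 0.6, 0.6, 0.1],
]

scores = [tss(observed, binarize(m)) for m in maps]
ensemble_map = weighted_suitability(maps, scores)
ensemble_tss = tss(observed, binarize(ensemble_map))
print(scores, ensemble_map, ensemble_tss)
```

For an honest estimate, the weights would be derived from one dataset and the ensemble evaluated on independent records, as the abstract describes with historical data.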

Haoxiang Z, Xiaoqing X, Nianwan Y, Yongjun Z, Hui L, Fanghao W, Jianyang G, Wanxue L. Insights from the biogeographic approach for biocontrol of invasive alien pests: Estimating the ecological niche overlap of three egg parasitoids against Spodoptera frugiperda in China. Sci Total Environ 2023;862:160785. [PMID: 36502977 DOI: 10.1016/j.scitotenv.2022.160785] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 10/30/2022] [Revised: 12/04/2022] [Accepted: 12/05/2022] [Indexed: 06/17/2023]

Abstract

Spodoptera frugiperda, the fall armyworm, causes major damage to maize and >80 other crops worldwide. Since S. frugiperda successfully invaded China in 2018 via long-distance migration from Myanmar, it has caused major maize yield losses and posed a severe threat to maize production and food security. The biocontrol approach for S. frugiperda using natural enemies is environmentally safe and effective. Estimating the potential suitable area (PSA) for S. frugiperda and its natural enemies can provide insights for its biocontrol and management. Therefore, based on the global distribution records and bioclimatic variables, we modeled the PSA of S. frugiperda and three egg parasitoids in China using an ensemble model (EM). We found that the prediction results of the EM were more reliable than those of a single model. The PSAs of S. frugiperda and its three egg parasitoids were mainly attributed to temperature variables. The PSA of S. frugiperda was divided into migratory and overwintering areas using the mean January 10 °C isotherm from 2018 to 2022. In the overwintering area, Trichogramma chilonis had the largest PSA overlap with S. frugiperda (94.57 %), followed by Telenomus remus (68.64 %) and Trichogramma dendrolimi (67.53 %). Telenomus remus and Tr. chilonis were the most effective egg parasitoids against S. frugiperda in the overwintering area. In the migratory area, Tr. chilonis had the largest PSA overlap with S. frugiperda (91.36 %), followed by Tr. dendrolimi (81.70 %) and Te. remus (15.23 %). Trichogramma dendrolimi would be the most effective egg parasitoid against S. frugiperda in the Yangtze River Basin and northeastern China. Trichogramma chilonis was the most effective egg parasitoid against S. frugiperda in central China. Our findings indicate that the three native egg parasitoids would be "good regulators" of S. frugiperda outbreaks in China.
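
The overlap percentages reported in this abstract can be illustrated with a short sketch. It assumes, because the abstract does not define the measure, that overlap is the share of the pest's suitable grid cells that are also suitable for the parasitoid. The grids below are hypothetical flattened binary rasters.

```python
def overlap_percent(pest_suitable, enemy_suitable):
    """Percentage of the pest's suitable cells that are also suitable
    for the natural enemy (assumed definition, see lead-in)."""
    pest_cells = sum(1 for p in pest_suitable if p)
    if pest_cells == 0:
        raise ValueError("the pest has no suitable cells")
    shared = sum(1 for p, e in zip(pest_suitable, enemy_suitable) if p and e)
    return 100.0 * shared / pest_cells


# Hypothetical grids: 1 = suitable cell, 0 = unsuitable cell.
pest = [1, 1, 1, 1, 0, 0]
parasitoid = [1, 1, 1, 0, 1, 0]

result = overlap_percent(pest, parasitoid)
print(f"{result:.2f} %")
```

Normalizing by the pest's area rather than by the union of both areas answers the management question directly: how much of the area at risk could the parasitoid cover.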

Zhu D, Yang W, Xu D, Li H, Zhao Y, Li D. A deep learning based two-layer predictor to identify enhancers and their strength. Methods 2023;211:23-30. [PMID: 36740001 DOI: 10.1016/j.ymeth.2023.01.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/19/2022] [Revised: 01/03/2023] [Accepted: 01/30/2023] [Indexed: 02/05/2023] Open

Xian X, Zhao H, Wang R, Huang H, Chen B, Zhang G, Liu W, Wan F. Climate change has increased the global threats posed by three ragweeds (Ambrosia L.) in the Anthropocene. Sci Total Environ 2023;859:160252. [PMID: 36427731 DOI: 10.1016/j.scitotenv.2022.160252] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Received: 05/17/2022] [Revised: 11/07/2022] [Accepted: 11/14/2022] [Indexed: 06/16/2023]

Abstract

Invasive alien plants (IAPs) substantially affect native biodiversity, agriculture, industry, and human health worldwide. Ambrosia (ragweed) species, which are major IAPs globally, have a significant impact on human health and the natural environment. In particular, invasion of A. artemisiifolia, A. psilostachya, and A. trifida in non-native continents is more extensive and severe than that of other species. Here, we used a biomod2 ensemble model based on environmental and species occurrence data to predict the potential geographical distribution, overlapping geographical distribution areas, and the ecological niche dynamics of these three ragweeds, and further explored the environmental variables shaping the observed patterns to assess the impact of these IAPs on the natural environment and public health. The ecological niche has shifted in the invasive area compared with that in the native area, which increased the invasion risk of the three Ambrosia species during the invasion process worldwide. The potential geographical distribution and overlapping geographical distribution areas of the three Ambrosia species are primarily located in Asia, North America, and Europe, and are expected to increase under four representative concentration pathways in the 2050s. The centers of potential geographical distributions of the three Ambrosia species showed a tendency to shift poleward from the current time to the 2050s. Bioclimatic variables and the human influence index were more significant in shaping these patterns than other factors. In brief, climate change has facilitated the expansion of the geographical distribution and overlapping geographical distribution areas of the three Ambrosia species. Ecomanagement and cross-country management strategies are warranted to mitigate the future effects of the expansion of these ragweed species worldwide in the Anthropocene on the natural environment and public health.

Bleichrodt A, Dahal S, Maloney K, Casanova L, Luo R, Chowell G. Real-time forecasting the trajectory of monkeypox outbreaks at the national and global levels, July-October 2022. BMC Med 2023;21:19. [PMID: 36647108 PMCID: PMC9841951 DOI: 10.1186/s12916-022-02725-2] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 09/08/2022] [Accepted: 12/28/2022] [Indexed: 01/17/2023] Open

Abstract

BACKGROUND

Beginning May 7, 2022, multiple nations reported an unprecedented surge in monkeypox cases. Unlike past outbreaks, differences in affected populations, transmission mode, and clinical characteristics have been noted. With the existing uncertainties of the outbreak, real-time short-term forecasting can guide and evaluate the effectiveness of public health measures.

METHODS

We obtained publicly available data on confirmed weekly cases of monkeypox at the global level and for seven countries (with the highest burden of disease at the time this study was initiated) from the Our World in Data (OWID) GitHub repository and CDC website. We generated short-term forecasts of new cases of monkeypox across the study areas using an ensemble n-sub-epidemic modeling framework based on weekly cases using 10-week calibration periods. We report and assess the weekly forecasts with quantified uncertainty from the top-ranked, second-ranked, and ensemble sub-epidemic models. Overall, we conducted 324 weekly sequential 4-week ahead forecasts across the models from the week of July 28th, 2022, to the week of October 13th, 2022.

RESULTS

The last 10 of 12 forecasting periods (starting the week of August 11th, 2022) show either a plateauing or declining trend of monkeypox cases for all models and areas of study. According to our latest 4-week ahead forecast from the top-ranked model, a total of 6232 (95% PI 487.8, 12,468.0) cases could be added globally from the week of 10/20/2022 to the week of 11/10/2022. At the country level, the top-ranked model predicts that the USA will report the highest cumulative number of new cases for the 4-week forecasts (median based on OWID data: 1806 (95% PI 0.0, 5544.5)). The top-ranked and weighted ensemble models outperformed all other models in short-term forecasts.

CONCLUSIONS

Our top-ranked model consistently predicted a decreasing trend in monkeypox cases at the global and country-specific scales during the last ten sequential forecasting periods. Our findings reflect the potential impact of increased immunity and behavioral modification among high-risk populations.

Mujahid M, Rustam F, Alasim F, Siddique M, Ashraf I. What people think about fast food: opinions analysis and LDA modeling on fast food restaurants using unstructured tweets. PeerJ Comput Sci 2023;9:e1193. [PMID: 37346556 PMCID: PMC10280231 DOI: 10.7717/peerj-cs.1193] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/09/2022] [Accepted: 11/28/2022] [Indexed: 06/23/2023]

Abstract

With the rise of social media platforms, sharing reviews has become a social norm in today's modern society. People check customer views on social networking sites about different fast food restaurants and food items before visiting the restaurants and ordering food. Restaurants can compete to better the quality of their offered items or services by carefully analyzing the feedback provided by customers. People tend to visit restaurants with a higher number of positive reviews. However, manually collecting feedback from customers for every product is a labor-intensive process; the same is true for manual sentiment analysis. To overcome this, we use automated sentiment analysis, which extracts meaningful information from the data. Existing studies predominantly focus on machine learning models. As a consequence, the performance of deep learning models, and of deep ensemble models in particular, has been largely neglected. To this end, this study adopts several deep ensemble models, including bidirectional long short-term memory and gated recurrent unit (BiLSTM+GRU), LSTM+GRU, GRU+recurrent neural network (GRU+RNN), and BiLSTM+RNN models, using self-collected unstructured tweets. The performance of lexicon-based methods is compared with deep ensemble models for sentiment classification. In addition, the study makes use of Latent Dirichlet Allocation (LDA) modeling for topic analysis. For experiments, tweets about the top five fast food companies are collected: KFC, Pizza Hut, McDonald's, Burger King, and Subway. Experimental results reveal that deep ensemble models yield better results than the lexicon-based approach, and BiLSTM+GRU obtains the highest accuracy of 95.31% for the three-class problem. Topic modeling indicates that the highest number of negative sentiments is associated with Subway restaurants, with high-intensity negative words. The largest share of people (49%) remain neutral regarding the choice of fast food, 31% seem to like fast food, while the rest (20%) dislike fast food.

Mondal S, Lee MA, Chen YK, Wang YC. Ensemble modeling of black pomfret (Parastromateus niger) habitat in the Taiwan Strait based on oceanographic variables. PeerJ 2023;11:e14990. [PMID: 36919168 PMCID: PMC10008307 DOI: 10.7717/peerj.14990] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 11/25/2022] [Accepted: 02/12/2023] [Indexed: 03/12/2023] Open

Umer M, Sadiq S, karamti H, Abdulmajid Eshmawi A, Nappi M, Usman Sana M, Ashraf I. ETCNN: Extra Tree and Convolutional Neural Network-based Ensemble Model for COVID-19 Tweets Sentiment Classification. Pattern Recognit Lett 2022;164:224-231. [PMID: 36407854 PMCID: PMC9664766 DOI: 10.1016/j.patrec.2022.11.012] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 05/30/2022] [Revised: 10/09/2022] [Accepted: 11/11/2022] [Indexed: 11/17/2022]

Zhu X, Guo H, Huang JJ, Tian S, Xu W, Mai Y. An ensemble machine learning model for water quality estimation in coastal area based on remote sensing imagery. J Environ Manage 2022;323:116187. [PMID: 36261960 DOI: 10.1016/j.jenvman.2022.116187] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 02/18/2022] [Revised: 09/01/2022] [Accepted: 09/02/2022] [Indexed: 06/16/2023]

Abstract

The accurate estimation of coastal water quality parameters (WQPs) is crucial for decision-makers to manage water resources. Although various machine learning (ML) models have been developed for coastal water quality estimation using remote sensing data, the performance of these models has significant uncertainties when applied to regional scales. To address this issue, an ensemble ML-based model was developed in this study. The ensemble ML model was applied to estimate chlorophyll-a (Chla), turbidity, and dissolved oxygen (DO) based on Sentinel-2 satellite images in Shenzhen Bay, China. The optimal input features for each WQP were selected from eight spectral bands and seven spectral indices. A local explanation strategy termed Shapley Additive Explanations (SHAP) was employed to quantify the contribution of each feature to model outputs. In addition, the impacts of three climate factors on the variation of each WQP were analyzed. The results suggested that the ensemble ML models achieved satisfactory performance for Chla (errors = 1.7%), turbidity (errors = 1.5%) and DO estimation (errors = 0.02%). Band 3 (B3) has the highest positive contribution to Chla estimation, while Band Ratio Index 2 (BR2) has the highest negative contribution to turbidity estimation, and Band 7 (B7) has the highest positive contribution to DO estimation. The spatial patterns of the three WQPs revealed that the water quality deterioration in Shenzhen Bay was mainly influenced by the input of terrestrial pollutants from the estuary. Correlation analysis demonstrated that air temperature (Temp) and average air pressure (AAP) exhibited the closest relationship with Chla. DO showed the strongest negative correlation with Temp, while turbidity was not sensitive to Temp, average wind speed (AWS), or AAP. Overall, the ensemble ML model proposed in this study provides an accurate and practical method for long-term Chla, turbidity, and DO estimation in coastal waters.

El-Kenawy ESM, Zerouali B, Bailek N, Bouchouich K, Hassan MA, Almorox J, Kuriqi A, Eid M, Ibrahim A. Improved weighted ensemble learning for predicting the daily reference evapotranspiration under the semi-arid climate conditions. Environ Sci Pollut Res Int 2022;29:81279-81299. [PMID: 35731435 DOI: 10.1007/s11356-022-21410-8] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 03/18/2022] [Accepted: 06/07/2022] [Indexed: 06/15/2023]

Yenkikar A, Babu CN, Hemanth DJ. Semantic relational machine learning model for sentiment analysis using cascade feature selection and heterogeneous classifier ensemble. PeerJ Comput Sci 2022;8:e1100. [PMID: 36262147 PMCID: PMC9575864 DOI: 10.7717/peerj-cs.1100] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 04/21/2022] [Accepted: 08/23/2022] [Indexed: 06/16/2023]

Abstract

The exponential rise in social media via microblogging sites like Twitter has sparked curiosity in sentiment analysis that exploits user feedback towards a targeted product or service. Considering its significance in business intelligence and decision-making, numerous efforts have been made in this area. However, lack of dictionaries, unannotated data, large-scale unstructured data, and low accuracies have plagued these approaches. Also, sentiment classification through classifier ensemble has been underexplored in the literature. In this article, we propose a Semantic Relational Machine Learning (SRML) model that automatically classifies the sentiment of tweets by using a classifier ensemble and optimal features. The model employs the Cascaded Feature Selection (CFS) strategy, a novel statistical assessment approach based on the Wilcoxon rank sum test, a univariate logistic regression assisted significant predictor test, and a cross-correlation test. It further uses the efficacy of word2vec-based continuous bag-of-words and n-gram feature extraction in conjunction with SentiWordNet for finding optimal features for classification. We experiment on six public Twitter sentiment datasets (the STS-Gold dataset, the Obama-McCain Debate (OMD) dataset, the healthcare reform (HCR) dataset, and SemEval2017 Task 4A, 4B and 4C) with a heterogeneous classifier ensemble comprising fourteen individual classifiers from different paradigms. Results from the experimental study indicate that CFS helps attain a higher classification accuracy with up to 50% fewer features compared to the count vectorizer approach. In intra-model performance assessment, the Artificial Neural Network-Gradient Descent (ANN-GD) classifier performs comparatively better than other individual classifiers, but the Best Trained Ensemble (BTE) strategy outperforms it on all metrics. In inter-model performance assessment with existing state-of-the-art systems, the proposed model achieved higher accuracy and outperformed more accomplished models employing quantum-inspired sentiment representation (QSR), transformer-based methods like BERT, BERTweet and RoBERTa, and ensemble techniques. The research thus provides critical insights into implementing a similar strategy to build more generic and robust expert systems for sentiment analysis that can be leveraged across industries.
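
The idea of a heterogeneous classifier ensemble can be sketched with plain majority voting. This is an illustration only: it is not the paper's Best Trained Ensemble (BTE) strategy, whose details the abstract does not give, and the three rule-based classifiers are hypothetical stand-ins for the fourteen trained classifiers.

```python
from collections import Counter


def majority_vote(classifiers, text):
    """Return the label chosen by the most classifiers.

    Ties resolve to the label that was voted first (Counter ordering).
    """
    votes = [classify(text) for classify in classifiers]
    return Counter(votes).most_common(1)[0][0]


# Hypothetical toy classifiers, each using a different kind of evidence.
def keyword_classifier(text):
    return "positive" if "good" in text.lower() else "negative"


def punctuation_classifier(text):
    return "positive" if text.endswith("!") else "negative"


def length_classifier(text):
    return "positive" if len(text.split()) < 6 else "negative"


ensemble = [keyword_classifier, punctuation_classifier, length_classifier]

label_short = majority_vote(ensemble, "Good service!")
label_long = majority_vote(
    ensemble, "The food was cold and the staff ignored us")
print(label_short, label_long)
```

Voting helps when the members make different kinds of errors, which is the motivation for drawing classifiers from different paradigms.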

Ezanno P, Picault S, Bareille S, Beaunée G, Boender GJ, Dankwa EA, Deslandes F, Donnelly CA, Hagenaars TJ, Hayes S, Jori F, Lambert S, Mancini M, Munoz F, Pleydell DRJ, Thompson RN, Vergu E, Vignes M, Vergne T. The African swine fever modelling challenge: Model comparison and lessons learnt. Epidemics 2022;40:100615. [PMID: 35970067 DOI: 10.1016/j.epidem.2022.100615] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 12/20/2021] [Revised: 06/29/2022] [Accepted: 07/20/2022] [Indexed: 11/26/2022] Open

Affiliation(s)

Pauline Ezanno INRAE, Oniris, BIOEPAR, 44300 Nantes, France.
Sébastien Picault INRAE, Oniris, BIOEPAR, 44300 Nantes, France
Servane Bareille INRAE, Oniris, BIOEPAR, 44300 Nantes, France; INRAE, ENVT, IHAP, Toulouse, France
Gaël Beaunée INRAE, Oniris, BIOEPAR, 44300 Nantes, France
Gert Jan Boender Wageningen Bioveterinary Research, Lelystad, the Netherlands
Emmanuelle A Dankwa Department of Statistics, University of Oxford, Oxford, United Kingdom
François Deslandes Université Paris-Saclay, INRAE, MaIAGE, 78350 Jouy-en-Josas, France
Christl A Donnelly Department of Statistics, University of Oxford, Oxford, United Kingdom; Department of Infectious Disease Epidemiology, Faculty of Medicine, School of Public Health, Imperial College London, United Kingdom
Thomas J Hagenaars Wageningen Bioveterinary Research, Lelystad, the Netherlands
Sarah Hayes Department of Infectious Disease Epidemiology, Faculty of Medicine, School of Public Health, Imperial College London, United Kingdom
Ferran Jori CIRAD, INRAE, Université de Montpellier, ASTRE, 34398 Montpellier, France
Sébastien Lambert Centre for Emerging, Endemic and Exotic Diseases, Department of Pathobiology and Population Sciences, Royal Veterinary College, University of London, United Kingdom
Matthieu Mancini INRAE, Oniris, BIOEPAR, 44300 Nantes, France; INRAE, ENVT, IHAP, Toulouse, France
Facundo Munoz CIRAD, INRAE, Université de Montpellier, ASTRE, 34398 Montpellier, France
David R J Pleydell CIRAD, INRAE, Université de Montpellier, ASTRE, 34398 Montpellier, France
Robin N Thompson Mathematics Institute and Zeeman Institute for Systems Biology and Infectious Disease Epidemiology Research, University of Warwick, Coventry, United Kingdom
Elisabeta Vergu Université Paris-Saclay, INRAE, MaIAGE, 78350 Jouy-en-Josas, France
Matthieu Vignes School of Mathematical and Computational Sciences, Massey University, Palmerston North, New Zealand
Timothée Vergne INRAE, ENVT, IHAP, Toulouse, France

Park J, Lee WH, Kim KT, Park CY, Lee S, Heo TY. Interpretation of ensemble learning to predict water quality using explainable artificial intelligence. Sci Total Environ 2022;832:155070. [PMID: 35398119 DOI: 10.1016/j.scitotenv.2022.155070] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 01/27/2022] [Revised: 03/31/2022] [Accepted: 04/02/2022] [Indexed: 06/14/2023]

Abstract

Algal bloom is a significant issue when managing water quality in freshwater; specifically, predicting the concentration of algae is essential to maintaining the safety of the drinking water supply system. The chlorophyll-a (Chl-a) concentration is a commonly used indicator to obtain an estimation of algal concentration. In this study, an XGBoost ensemble machine learning (ML) model was developed from eighteen input variables to predict Chl-a concentration. The composition and pretreatment of input variables to the model are important factors for improving model performance. Explainable artificial intelligence (XAI) is an emerging area of ML modeling that provides a reasonable interpretation of model performance. The effect of input variable selection on model performance was estimated, where the priority of input variable selection was determined using three indices: Shapley value (SHAP), feature importance (FI), and variance inflation factor (VIF). SHAP analysis is an XAI algorithm designed to compute the relative importance of input variables with consistency, providing an interpretable analysis for model prediction. The XGB models simulated with independent variables selected using the three indices were evaluated with root mean square error (RMSE), RMSE-observation standard deviation ratio, and Nash-Sutcliffe efficiency. This study shows that the model exhibited the most stable performance when the priority of input variables was determined by SHAP. This implies that on-site monitoring can be designed to collect the input variables selected by the SHAP analysis to reduce the cost of overall water quality analysis. The independent variables were further analyzed using the SHAP summary plot, force plot, target plot, and partial dependency plot to provide an understandable interpretation of the performance of the XGB model. While XAI is still in the early stages of development, this study successfully demonstrated a good example of XAI application to improve the interpretation of machine learning model performance in predicting water quality.
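
The three evaluation metrics named in this abstract can be sketched using their standard definitions, which the abstract itself does not spell out. The observed and simulated Chl-a values below are hypothetical.

```python
import math


def rmse(observed, simulated):
    """Root mean square error."""
    n = len(observed)
    return math.sqrt(
        sum((o - s) ** 2 for o, s in zip(observed, simulated)) / n)


def rsr(observed, simulated):
    """RMSE divided by the (population) standard deviation of observations."""
    mean_obs = sum(observed) / len(observed)
    sd_obs = math.sqrt(
        sum((o - mean_obs) ** 2 for o in observed) / len(observed))
    return rmse(observed, simulated) / sd_obs


def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the mean."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance


# Hypothetical observed and simulated concentrations.
observed = [1.0, 2.0, 3.0, 4.0]
simulated = [1.5, 2.0, 2.5, 4.0]

rmse_value = rmse(observed, simulated)
rsr_value = rsr(observed, simulated)
nse_value = nse(observed, simulated)
print(rmse_value, rsr_value, nse_value)
```

With the population standard deviation used here, the two normalized metrics are linked by NSE = 1 - RSR², so they rank models identically and differ only in scale.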

Mohsen F, Biswas MR, Ali H, Alam T, Househ M, Shah Z. Customized and Automated Machine Learning-Based Models for Diabetes Type 2 Classification. Stud Health Technol Inform 2022;295:517-520. [PMID: 35773925 DOI: 10.3233/shti220779] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]

Wang Y, Zhu X, Yang L, Hu X, He K, Yu C, Jiao S, Chen J, Guo R, Yang S. IDDLncLoc: Subcellular Localization of LncRNAs Based on a Framework for Imbalanced Data Distributions. Interdiscip Sci 2022;14:409-420. [PMID: 35192174 DOI: 10.1007/s12539-021-00497-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/21/2021] [Revised: 12/16/2021] [Accepted: 12/20/2021] [Indexed: 06/14/2023]

Affiliation(s)

Yan Wang Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun, China; School of Artificial Intelligence, Jilin University, Changchun, China
Xiaopeng Zhu Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun, China
Lili Yang Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun, China; Department of Obstetrics, The First Hospital of Jilin University, Changchun, China
Xuemei Hu Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun, China
Kai He Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun, China
Cuinan Yu Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun, China
Shaoqing Jiao Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun, China
Jiali Chen Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun, China
Rui Guo Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun, China
Sen Yang Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun, China.

Nimmi K, Janet B, Selvan AK, Sivakumaran N. Pre-trained ensemble model for identification of emotion during COVID-19 based on emergency response support system dataset. Appl Soft Comput 2022;122:108842. [PMID: 35465357 PMCID: PMC9014641 DOI: 10.1016/j.asoc.2022.108842] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/12/2021] [Revised: 03/26/2022] [Accepted: 04/05/2022] [Indexed: 01/17/2023]

Biney JKM, Vašát R, Blöcher JR, Borůvka L, Němeček K. Using an ensemble model coupled with portable X-ray fluorescence and visible near-infrared spectroscopy to explore the viability of mapping and estimating arsenic in an agricultural soil. Sci Total Environ 2022;818:151805. [PMID: 34813815 DOI: 10.1016/j.scitotenv.2021.151805] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/19/2021] [Revised: 11/07/2021] [Accepted: 11/15/2021] [Indexed: 06/13/2023]

Abstract

Increasing concentrations of potentially toxic elements (PTEs) in agricultural soils remain a major source of public concern. Monitoring PTEs in an agricultural field with no history of contaminants necessitates adequate analysis utilizing a robust model to accurately uncover hidden PTEs. Detecting and mapping the distribution of soil properties using portable X-ray fluorescence (pXRF) and proximal sensing techniques is not only rapid, but also relatively inexpensive. In this study, an ensemble model, consisting of partial least squares regression (PLSR), support vector machine (SVM), random forest (RF) and Cubist, was used for the prediction and mapping of soil As content in an agricultural field with no history of pollution. The datasets were collected using pXRF and field spectroscopy techniques. The main goal was to compare the ensemble model to each of the calibration techniques in terms of prediction accuracy of As content in such a field. Other components [e.g., soil organic carbon (SOC), Mn, S, soil pH, Fe] that are known to influence As levels in the soil were also retrieved to assess their correlation with soil As. The models were evaluated using the root mean squared error (RMSE_CV), the coefficient of determination (R²_CV) and the ratio of performance to interquartile range (RPIQ). In terms of prediction accuracy, the ensemble model outperformed each of the individual techniques (R²_CV = 0.80/0.75) and obtained the smallest error (RMSE_CV = 1.91/2.16). Overall, all the predictive techniques were able to detect both low and high estimated values of soil As within the study field, with the ensemble model matching the measurements most closely. The ensemble model, a promising tool as demonstrated by the current study, is highly recommended for inclusion in future studies for more accurate estimation of As and other PTEs in other agricultural fields.
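
The evaluation described in this abstract can be sketched as follows. The sketch assumes, since the abstract does not say how the four models are combined, that the ensemble is an unweighted mean of the individual predictions. RPIQ is computed as the interquartile range of the measurements divided by the RMSE. The model names are labels only and all values are hypothetical.

```python
import math
import statistics


def ensemble_mean(predictions_by_model):
    """Unweighted mean of the models' predictions for each sample."""
    return [statistics.fmean(values)
            for values in zip(*predictions_by_model.values())]


def rmse(measured, predicted):
    n = len(measured)
    return math.sqrt(
        sum((m - p) ** 2 for m, p in zip(measured, predicted)) / n)


def r_squared(measured, predicted):
    """Coefficient of determination: 1 - SSE / SST."""
    mean_m = statistics.fmean(measured)
    sse = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    sst = sum((m - mean_m) ** 2 for m in measured)
    return 1.0 - sse / sst


def rpiq(measured, predicted):
    """Interquartile range of the measurements divided by RMSE.

    Quartiles use statistics.quantiles defaults; other quantile
    conventions give slightly different values.
    """
    q1, _, q3 = statistics.quantiles(measured, n=4)
    return (q3 - q1) / rmse(measured, predicted)


# Hypothetical soil As measurements and two hypothetical model outputs.
measured = [4.0, 6.0, 8.0, 10.0, 12.0]
predictions = {
    "plsr": [5.0, 7.0, 7.0, 11.0, 13.0],
    "svm": [3.0, 5.0, 8.0, 9.0, 12.0],
}

combined = ensemble_mean(predictions)
r2_value = r_squared(measured, combined)
rpiq_value = rpiq(measured, combined)
print(rmse(measured, combined), r2_value, rpiq_value)
```

In this toy data the two models err in opposite directions on most samples, so averaging cancels part of the error. That cancellation is the usual reason an ensemble can beat each of its members.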

Jin Z, Ma Y, Chu L, Liu Y, Dubrow R, Chen K. Predicting spatiotemporally-resolved mean air temperature over Sweden from satellite data using an ensemble model. Environ Res 2022;204:111960. [PMID: 34464620 DOI: 10.1016/j.envres.2021.111960] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/19/2021] [Revised: 07/29/2021] [Accepted: 08/23/2021] [Indexed: 06/13/2023]

Zhuang H, Zhang C, Jin X, Ge A, Chen M, Ye J, Qiao H, Xiong P, Zhang X, Chen J, Luan X, Wang W. A flagship species-based approach to efficient, cost-effective biodiversity conservation in the Qinling Mountains, China. J Environ Manage 2022;305:114388. [PMID: 34972047 DOI: 10.1016/j.jenvman.2021.114388] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/14/2021] [Revised: 12/12/2021] [Accepted: 12/22/2021] [Indexed: 06/14/2023]

Abstract

Prioritizing threatened species protection has been proposed as an efficient response to the global biodiversity crisis. We used in-situ conservation data to predict the potential habitat area of four flagship species: the giant panda (Ailuropoda melanoleuca), golden monkey (Rhinopithecus roxellana qinlingensis), takin (Budorcas taxicolor bedfordi), and crested ibis (Nipponia nippon). We then designed systematic conservation planning schemes for various scenarios given species habitat preferences and anthropogenic activities, and conducted a cost-effectiveness assessment. Broadly, the geographical distributions of suitable habitats for giant pandas, golden monkeys, and takins exhibited high spatial congruence (correlation coefficients of 0.59-0.90), and areas of high congruence were concentrated in the northern portion of the Qinling Mountains at high elevation (>1500 m). By contrast, the crested ibis was negatively correlated in space with its sympatric species (-0.47 to -0.29). Crested ibis habitats were clustered in the southern portion of the region at low elevation (<1500 m). A hypothetical conservation priority area (CPA) based on the giant panda, golden monkey, and takin included 39.64% of the Qinling Mountains and 100%, 99.99%, 99.59%, and 7.84% of the suitable habitats for giant pandas, golden monkeys, takins, and crested ibises, respectively. The same area included 99.07%, 70.87%, and 39.96% of the highly important areas for the ecosystem services of biodiversity conservation, water supply, and soil retention, respectively, and only 4.62%, 16.83%, and 13.4% of the area was associated with high-density residential areas, impervious surfaces, and cropland, respectively. Therefore, we conclude that a CPA approach based on the specialist species could result in effective, low-cost biodiversity conservation in the Qinling Mountains. However, we note that existing protected areas account for only 26.52% of the CPA. We recommend that the main area of the proposed Qinling National Park should be based on the CPA developed here.
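The spatial congruence figures in this abstract are correlation coefficients between habitat suitability maps. The abstract does not say which coefficient was used, so the following is only a minimal sketch assuming a Pearson correlation over per-cell suitability scores; the `pearson` helper and the five-cell scores are hypothetical and are not data from the study.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical suitability scores for five grid cells (illustrative only).
panda = [0.9, 0.8, 0.7, 0.2, 0.1]
takin = [0.8, 0.9, 0.6, 0.3, 0.2]
ibis = [0.1, 0.2, 0.2, 0.8, 0.9]

congruence_panda_takin = pearson(panda, takin)  # positive: habitats overlap
congruence_panda_ibis = pearson(panda, ibis)    # negative: habitats diverge
```

A positive value corresponds to the high congruence reported among pandas, golden monkeys, and takins, and a negative value to the pattern reported for the crested ibis.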

Ke H, Gong S, He J, Zhang L, Cui B, Wang Y, Mo J, Zhou Y, Zhang H. Development and application of an automated air quality forecasting system based on machine learning. Sci Total Environ 2022;806:151204. [PMID: 34710417 DOI: 10.1016/j.scitotenv.2021.151204] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 07/15/2021] [Revised: 10/20/2021] [Accepted: 10/20/2021] [Indexed: 06/13/2023]

Chen YM, Chen YJ, Ho WH, Tsai JT. Classifying chest CT images as COVID-19 positive/negative using a convolutional neural network ensemble model and uniform experimental design method. BMC Bioinformatics 2021;22:147. [PMID: 34749629 PMCID: PMC8574139 DOI: 10.1186/s12859-021-04083-x] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 03/14/2021] [Accepted: 03/16/2021] [Indexed: 11/10/2022] Open

Rahmanian S, Pourghasemi HR, Pouyan S, Karami S. Habitat potential modelling and mapping of Teucrium polium using machine learning techniques. Environ Monit Assess 2021;193:759. [PMID: 34718878 DOI: 10.1007/s10661-021-09551-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 05/14/2021] [Accepted: 10/19/2021] [Indexed: 06/13/2023]

Cui L, Wang S. Mapping the daily nitrous acid (HONO) concentrations across China during 2006-2017 through ensemble machine-learning algorithm. Sci Total Environ 2021;785:147325. [PMID: 33957584 DOI: 10.1016/j.scitotenv.2021.147325] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 02/21/2021] [Revised: 04/19/2021] [Accepted: 04/21/2021] [Indexed: 06/12/2023]

Abstract

Nitrous acid (HONO) is a major source of the hydroxyl radical (OH) and plays a key role in atmospheric photochemistry. The lack of spatially resolved HONO concentration information leaves large gaps in knowledge about HONO and its role in atmospheric chemistry and air pollution in China. In this work, an ensemble machine learning model comprising random forest, gradient boosting, and back propagation neural network was proposed, for the first time, to estimate the long-term (2006-2017) daily HONO concentrations across China at 0.25° resolution. Further, the key factors controlling the space-time variability of HONO concentrations were analyzed based on variable importance values. The ensemble model characterized the spatiotemporal distribution of daily HONO concentrations well, with sample-based, site-based, and by-year cross-validation (CV) R² (RMSE) of 0.7 (0.36 ppbv), 0.67 (0.36 ppbv), and 0.62 (0.40 ppbv), respectively. HONO hotspots were mainly distributed in the Beijing-Tianjin-Hebei (BTH), Pearl River Delta (PRD), Yangtze River Delta (YRD), and several sites of the Sichuan Basin, in line with the distribution patterns of the tropospheric NO₂ columns and assimilated surface NO₃⁻ levels. The national HONO levels stagnated during 2006-2013, then declined after 2013 owing to the implementation of the Action Plan for Air Pollution Prevention and Control. The NO₃⁻ concentration, urban area, and NO₂ column density ranked as important variables for HONO prediction, while agricultural land, forest, and grassland played minor roles in affecting HONO concentrations, suggesting the significant role of heterogeneous HONO production from anthropogenic precursor emissions. Leveraging the ground-level HONO observations, this study fills the gap in statistical modelling of nationwide HONO in China and provides essential data for atmospheric chemistry research.
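The abstract reports an ensemble of three learners scored by cross-validated R² and RMSE, but it does not state how the base predictions are combined. The sketch below assumes a simple unweighted average, which may differ from the authors' scheme. The function names and all numbers are hypothetical toy values, not results from the study.

```python
import math

def ensemble_mean(predictions):
    """Average per-sample predictions from several base models."""
    n_models = len(predictions)
    return [sum(vals) / n_models for vals in zip(*predictions)]

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def r2(y_true, y_pred):
    """Coefficient of determination."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Toy held-out observations and base-model predictions (ppbv, illustrative only).
observed = [1.0, 2.0, 3.0, 4.0]
rf_pred = [1.2, 1.9, 3.3, 3.8]   # stands in for random forest
gb_pred = [0.9, 2.2, 2.8, 4.1]   # stands in for gradient boosting
nn_pred = [1.0, 2.0, 3.1, 3.9]   # stands in for the neural network

blended = ensemble_mean([rf_pred, gb_pred, nn_pred])
blended_rmse = rmse(observed, blended)
blended_r2 = r2(observed, blended)
```

In this toy case the base models err in different directions, so the averaged prediction has a lower RMSE than any single model. That cancellation is the usual motivation for averaging, though it is not guaranteed on real data.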

Ke B, Nguyen H, Bui XN, Bui HB, Choi Y, Zhou J, Moayedi H, Costache R, Nguyen-Trang T. Predicting the sorption efficiency of heavy metal based on the biochar characteristics, metal sources, and environmental conditions using various novel hybrid machine learning models. Chemosphere 2021;276:130204. [PMID: 34088091 DOI: 10.1016/j.chemosphere.2021.130204] [Citation(s) in RCA: 25] [Impact Index Per Article: 8.3] [Received: 12/12/2020] [Revised: 02/17/2021] [Accepted: 03/04/2021] [Indexed: 06/12/2023]

Abstract

Heavy metals in water and wastewater are regarded as one of the most hazardous environmental issues and significantly impact human health. Biochar systems made from different materials have helped remove heavy metals from water, especially in wastewater treatment systems. Nevertheless, the sorption efficiency of heavy metals on biochar systems is highly dependent on the biochar characteristics, metal sources, and environmental conditions. Therefore, this study addresses the feasibility of biochar systems for heavy metal sorption in water/wastewater and the use of artificial intelligence (AI) models to investigate the sorption efficiency of heavy metals on biochar. Accordingly, this work investigated and proposed 20 AI models for forecasting the sorption efficiency of heavy metals onto biochar based on five machine learning algorithms and the bagging technique (BA). Support vector machine (SVM), random forest (RF), artificial neural network (ANN), M5Tree, and Gaussian process (GP) algorithms were used as the key algorithms. Subsequently, the individual models were bagged with each other to generate new ensemble models. In total, 20 models were developed and evaluated: SVM, RF, M5Tree, GP, ANN, BA-SVM, BA-RF, BA-M5Tree, BA-GP, BA-ANN, SVM-RF, SVM-M5Tree, SVM-GP, SVM-ANN, RF-M5Tree, RF-GP, RF-ANN, M5Tree-GP, M5Tree-ANN, and GP-ANN. Of those, the hybrid models (i.e., BA-SVM, BA-RF, BA-M5Tree, BA-GP, BA-ANN, SVM-RF, SVM-M5Tree, SVM-GP, SVM-ANN, RF-M5Tree, RF-GP, RF-ANN, M5Tree-GP, M5Tree-ANN, GP-ANN) are introduced as the novelty of this study for estimating the sorption efficiency of heavy metals on biochar systems. The comprehensive assessment and use of the biochar characteristics, metal sources, and environmental conditions is also considered a novelty of the study. For this aim, a dataset of heavy metal sorption efficiency was collected and processed from 353 experimental tests. Various performance indexes were applied to evaluate the models, such as RMSE, R², MAE, color intensity, Taylor diagrams, and box-and-whisker plots. The findings revealed that AI models could predict the sorption efficiency of heavy metals onto biochar with high reliability, and that the ensemble models performed better than the individual models. The results also showed that the SVM-ANN ensemble model was the best of the 20 developed models. The proposed predictive models suggest that the sorption efficiency of heavy metals on biochar can be forecast accurately, supporting early warning of heavy-metal water pollution.
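The count of 20 models follows from the structure listed in the abstract: 5 base learners, 5 bagged versions, and one hybrid per unordered pair of base learners (10 pairs). As a quick check, the short sketch below enumerates the model names; it only reproduces the naming scheme and does not train anything.

```python
from itertools import combinations

# Base learners in the order given in the abstract.
base = ["SVM", "RF", "M5Tree", "GP", "ANN"]

# Bagged version of each base learner (prefix "BA-").
bagged = [f"BA-{name}" for name in base]

# One hybrid per unordered pair of base learners: C(5, 2) = 10.
pairs = [f"{a}-{b}" for a, b in combinations(base, 2)]

models = base + bagged + pairs
hybrid_models = bagged + pairs  # the 15 models the abstract calls hybrid
```

Here `len(models)` is 5 + 5 + 10 = 20 and `len(hybrid_models)` is 15, matching the two lists in the abstract.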

Mukherjee T, Sharma V, Sharma LK, Thakur M, Joshi BD, Sharief A, Thapa A, Dutta R, Dolker S, Tripathy B, Chandra K. Landscape-level habitat management plan through geometric reserve design for critically endangered Hangul (Cervus hanglu hanglu). Sci Total Environ 2021;777:146031. [PMID: 33676208 DOI: 10.1016/j.scitotenv.2021.146031] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/20/2020] [Revised: 02/17/2021] [Accepted: 02/17/2021] [Indexed: 06/12/2023]

Tanveer MA, Khan MJ, Sajid H, Naseer N. Convolutional neural networks ensemble model for neonatal seizure detection. J Neurosci Methods 2021;358:109197. [PMID: 33864835 DOI: 10.1016/j.jneumeth.2021.109197] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 12/31/2020] [Revised: 04/11/2021] [Accepted: 04/12/2021] [Indexed: 10/21/2022]

Yu X, Yang Q, Wang D, Li Z, Chen N, Kong DX. Predicting lung adenocarcinoma disease progression using methylation-correlated blocks and ensemble machine learning classifiers. PeerJ 2021;9:e10884. [PMID: 33628643 PMCID: PMC7894106 DOI: 10.7717/peerj.10884] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 07/02/2020] [Accepted: 01/12/2021] [Indexed: 01/20/2023] Open

Mukherjee T, Sharma LK, Kumar V, Sharief A, Dutta R, Kumar M, Joshi BD, Thakur M, Venkatraman C, Chandra K. Adaptive spatial planning of protected area network for conserving the Himalayan brown bear. Sci Total Environ 2021;754:142416. [PMID: 33254933 DOI: 10.1016/j.scitotenv.2020.142416] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 07/08/2020] [Revised: 09/12/2020] [Accepted: 09/14/2020] [Indexed: 06/12/2023]

Abstract

Large mammals that occur in low densities, particularly in high-altitude areas, are globally threatened due to fragile climatic and ecological envelopes. Among bear species, the Himalayan brown bear (Ursus arctos isabellinus) has a distribution that is restricted to Himalayan highlands with relatively small and fragmented populations. To date, very little scientific information is available on the Himalayan brown bear, although such information is vital for the conservation of the species and the management of its habitats, especially in protected areas of the landscape. The present study aims to understand the effectiveness of existing Himalayan Protected Areas in terms of representativeness for the conservation of the Himalayan brown bear (HBB), an umbrella species in high-altitude habitats of the Himalayan region. We used the ensemble approach of the species distribution model and then assessed biological connectivity to predict the current and future distribution and movement of HBB in climate change scenarios for the year 2050. Approximately 33 protected areas (PAs) currently possess suitable habitats. Our model suggests a massive decline of approximately 73.38% and 72.87% under representative concentration pathways (RCP) 4.5 and 8.5, respectively, in the year 2050 compared with the current distribution. The predicted change in suitability will result in loss of habitats from thirteen PAs; eight will become completely uninhabitable by the year 2050, followed by loss of connectivity in the majority of PAs. Habitat configuration analysis suggested a 40% decline in the number of suitable patches, a reduction in large habitat patches (up to 50%), and aggregation of suitable areas (9%) by 2050, indicating fragmentation. The predicted change in geographic isotherm will result in loss of habitats from thirteen PAs, eight of which will become completely uninhabitable. Hence, these PAs may lose their effectiveness and representativeness in achieving the very objective of their existence or conservation goals. Therefore, we recommend adaptive spatial planning for protecting suitable habitats distributed outside the PAs for climate change adaptation.

Gifani P, Shalbaf A, Vafaeezadeh M. Automated detection of COVID-19 using ensemble of transfer learning with deep convolutional neural network based on CT scans. Int J Comput Assist Radiol Surg 2021;16:115-123. [PMID: 33191476 PMCID: PMC7667011 DOI: 10.1007/s11548-020-02286-w] [Citation(s) in RCA: 37] [Impact Index Per Article: 12.3] [Received: 05/23/2020] [Accepted: 10/23/2020] [Indexed: 12/18/2022]

Abstract

PURPOSE

COVID-19 has infected millions of people worldwide. One of the most important hurdles in controlling the spread of this disease is the inefficiency and lack of medical tests. Computed tomography (CT) scans are promising in providing accurate and fast detection of COVID-19. However, determining COVID-19 requires highly trained radiologists and suffers from inter-observer variability. To remedy these limitations, this paper introduces an automatic methodology based on an ensemble of deep transfer learning for the detection of COVID-19.

METHODS

A total of 15 pre-trained convolutional neural network (CNN) architectures (EfficientNets (B0-B5), NasNetLarge, NasNetMobile, InceptionV3, ResNet-50, SeResnet 50, Xception, DenseNet121, ResNext50, and Inception_resnet_v2) are used and then fine-tuned on the target task. After that, we built an ensemble method based on majority voting of the best combination of deep transfer learning outputs to further improve the recognition performance. We have used a publicly available dataset of CT scans, which consists of 349 CT scans labeled as being positive for COVID-19 and 397 negative COVID-19 CT scans that are normal or contain other types of lung diseases.

RESULTS

The experimental results indicate that majority voting over 5 deep transfer learning architectures (EfficientNetB0, EfficientNetB3, EfficientNetB5, Inception_resnet_v2, and Xception) outperforms the individual transfer learning models and the other combinations, with precision (0.857), recall (0.854), and accuracy (0.85) in diagnosing COVID-19 from CT scans.

CONCLUSION

Our study shows that an ensemble deep transfer learning system with different pre-trained CNN architectures can work well on a publicly available dataset of CT images for the diagnosis of COVID-19.
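The majority-voting step described in the methods can be illustrated with a minimal sketch. It assumes hard (0/1) labels from five already fine-tuned classifiers. The `majority_vote` helper, the votes, and the ground truth below are hypothetical toy values and do not come from the study.

```python
from collections import Counter

def majority_vote(votes_per_model):
    """Return the most common label per sample across models."""
    return [Counter(sample).most_common(1)[0][0] for sample in zip(*votes_per_model)]

# Rows: five hypothetical classifiers; columns: five CT scans (1 = COVID-19 positive).
model_votes = [
    [1, 0, 1, 1, 0],
    [1, 0, 0, 1, 0],
    [0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0],
    [1, 0, 1, 1, 0],
]
truth = [1, 0, 1, 0, 0]

ensemble = majority_vote(model_votes)

# Confusion counts for the ensemble decision.
tp = sum(1 for t, p in zip(truth, ensemble) if t == 1 and p == 1)
fp = sum(1 for t, p in zip(truth, ensemble) if t == 0 and p == 1)
fn = sum(1 for t, p in zip(truth, ensemble) if t == 1 and p == 0)
tn = sum(1 for t, p in zip(truth, ensemble) if t == 0 and p == 0)

precision = tp / (tp + fp)
recall = tp / (tp + fn)
accuracy = (tp + tn) / len(truth)
```

With an odd number of voters and two classes, no ties can occur, which is one practical reason to combine five models rather than four or six.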

Singh P, Kaur R. An integrated fog and Artificial Intelligence smart health framework to predict and prevent COVID-19. Glob Transit 2020;2:283-292. [PMID: 33205037 PMCID: PMC7659515 DOI: 10.1016/j.glt.2020.11.002] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 07/04/2020] [Revised: 10/09/2020] [Accepted: 11/01/2020] [Indexed: 05/18/2023]

Saha S, Saha M, Mukherjee K, Arabameri A, Ngo PTT, Paul GC. Predicting the deforestation probability using the binary logistic regression, random forest, ensemble rotational forest, REPTree: A case study at the Gumani River Basin, India. Sci Total Environ 2020;730:139197. [PMID: 32402979 DOI: 10.1016/j.scitotenv.2020.139197] [Citation(s) in RCA: 25] [Impact Index Per Article: 6.3] [Received: 02/08/2020] [Revised: 05/01/2020] [Accepted: 05/01/2020] [Indexed: 04/15/2023]

Abstract

Rapid population growth and its corresponding effects, such as the expansion of human settlement, agricultural land, and industry, lead to the loss of forest area in most parts of the world, especially in highly populated nations like India. Forest canopy density (FCD) is a useful measure for assessing forest cover change on its own, as numerous studies of forest change have used only FCD with the help of remote sensing and GIS. Coupling binary logistic regression (BLR), random forest (RF), and an ensemble of rotational forest and reduced error pruning trees (RTF-REPTree) with FCD makes it more convenient to estimate the deforestation probability. Advanced vegetation index (AVI), bare soil index (BSI), shadow index (SI), and scaled vegetation density (VD) derived from Landsat imagery are the main input parameters used to identify the FCD. After preparing the FCDs of 1990, 2000, 2010, and 2017, the deforestation map of the study area was prepared and used as the dependent parameter for deforestation probability modelling. Twelve deforestation-determining factors were then used to delineate the deforestation probability with the BLR, RF, and RTF-REPTree models. These deforestation probability models were validated through area under curve (AUC), receiver operating characteristics (ROC), efficiency, true skill statistics (TSS), and the Kappa coefficient. The validation results show that all the models, BLR (AUC = 0.874), RF (AUC = 0.886), and RTF-REPTree (AUC = 0.919), are capable of assessing the deforestation probability, and among them RTF-REPTree has the highest accuracy. The results also show that the low canopy density area, i.e., area not under dense forest cover, increased by 9.26% from 1990 to 2017. In addition, nearly 30% of the forested land falls in high to very high deforestation probability zones, which need immediate protective measures.
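Three of the validation measures named in this abstract (AUC, TSS, and the Kappa coefficient) can be computed from predicted probabilities and binary labels. The sketch below uses the standard textbook definitions. The 0.5 threshold, the helper names, and the six toy samples are assumptions for illustration, not values from the study.

```python
def auc(labels, scores):
    """AUC as the share of positive/negative pairs ranked correctly (ties count half)."""
    pos = [s for lab, s in zip(labels, scores) if lab == 1]
    neg = [s for lab, s in zip(labels, scores) if lab == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def confusion(labels, scores, threshold=0.5):
    """Return (tp, fp, tn, fn) after thresholding the scores."""
    tp = fp = tn = fn = 0
    for lab, s in zip(labels, scores):
        pred = 1 if s >= threshold else 0
        if pred == 1 and lab == 1:
            tp += 1
        elif pred == 1 and lab == 0:
            fp += 1
        elif pred == 0 and lab == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn

def tss(tp, fp, tn, fn):
    """True skill statistic: sensitivity + specificity - 1."""
    return tp / (tp + fn) + tn / (tn + fp) - 1.0

def kappa(tp, fp, tn, fn):
    """Cohen's kappa: agreement beyond what chance would produce."""
    n = tp + fp + tn + fn
    observed = (tp + tn) / n
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    return (observed - expected) / (1.0 - expected)

# Toy data: 1 = deforested cell, 0 = intact cell (illustrative only).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]

auc_value = auc(labels, scores)
counts = confusion(labels, scores)
tss_value = tss(*counts)
kappa_value = kappa(*counts)
```

AUC is independent of the threshold, whereas TSS and Kappa depend on the chosen cutoff, so reporting them together gives complementary views of model skill.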
