1
Peng C, Zhang X, Wang W. Predicting plant disease epidemics using boosted regression trees. Infect Dis Model 2024; 9:1138-1146. [PMID: 39022297 PMCID: PMC11253225 DOI: 10.1016/j.idm.2024.06.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2024] [Revised: 05/13/2024] [Accepted: 06/18/2024] [Indexed: 07/20/2024]
Abstract
Plant epidemics are often associated with weather-related variables, but it is difficult to identify which weather-related predictors to use in models that predict them. To predict Fusarium head blight (FHB) epidemics of wheat, Shah et al. explored a functional approach, using scalar-on-function regression to model a binary outcome (FHB epidemic or non-epidemic) with respect to weather time series spanning 140 days relative to anthesis. The scalar-on-function models fit the data better than previously described logistic regression models. In this work, given the same dataset and models, we attempt to reproduce the results of Shah et al. using a different approach, boosted regression trees. After fitting, the classification accuracy and model statistics are surprisingly good.
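For readers unfamiliar with the technique named in this abstract, the following is a minimal sketch of boosted regression trees for a binary outcome. It is not the authors' code. It assumes log loss, one-split trees (stumps) whose leaf values are mean residuals, and a made-up six-row dataset with two hypothetical weather summaries (mean relative humidity and mean temperature). All function names are illustrative.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_stump(X, residuals):
    """Return (feature, threshold, left_value, right_value) for the one-split
    regression tree that minimizes squared error on the residuals."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2.0
            left = [r for row, r in zip(X, residuals) if row[j] <= thr]
            right = [r for row, r in zip(X, residuals) if row[j] > thr]
            l_mean = sum(left) / len(left)
            r_mean = sum(right) / len(right)
            sse = (sum((r - l_mean) ** 2 for r in left)
                   + sum((r - r_mean) ** 2 for r in right))
            if best is None or sse < best[0]:
                best = (sse, j, thr, l_mean, r_mean)
    return best[1:]


def fit_boosted_stumps(X, y, n_rounds=50, learning_rate=0.3):
    """Gradient boosting for a binary outcome with log loss and stump learners."""
    p0 = sum(y) / len(y)
    f0 = math.log(p0 / (1.0 - p0))  # initial score: log-odds of the base rate
    scores = [f0] * len(X)
    stumps = []
    for _ in range(n_rounds):
        # The negative gradient of the log loss is (label - predicted probability).
        residuals = [yi - sigmoid(s) for yi, s in zip(y, scores)]
        j, thr, l_val, r_val = fit_stump(X, residuals)
        stumps.append((j, thr, l_val, r_val))
        scores = [s + learning_rate * (l_val if row[j] <= thr else r_val)
                  for row, s in zip(X, scores)]
    return f0, learning_rate, stumps


def predict_proba(model, row):
    """Predicted probability of the positive class (epidemic) for one row."""
    f0, learning_rate, stumps = model
    score = f0 + learning_rate * sum(
        l_val if row[j] <= thr else r_val for j, thr, l_val, r_val in stumps)
    return sigmoid(score)


# Made-up illustration data: [mean relative humidity (%), mean temperature (C)].
X = [[60.0, 15.0], [65.0, 18.0], [70.0, 14.0],
     [85.0, 16.0], [90.0, 17.0], [88.0, 15.0]]
y = [0, 0, 0, 1, 1, 1]  # 1 = epidemic, 0 = non-epidemic (hypothetical labels)
model = fit_boosted_stumps(X, y)
```

Calling `predict_proba(model, [92.0, 16.0])` returns the estimated epidemic probability for a new, equally hypothetical observation. Production implementations typically use deeper trees, shrinkage tuning, and subsampling.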
Affiliation(s)
- Chun Peng
  - School of Mathematics and Statistics, Huaiyin Normal University, Huaian, 223300, PR China
- Xingyue Zhang
  - École Polytechnique Fédérale de Lausanne, Rte Cantonale, 1015, Lausanne, Switzerland
- Weiming Wang
  - School of Mathematics and Statistics, Huaiyin Normal University, Huaian, 223300, PR China
2
Shah DA, De Wolf ED, Paul PA, Madden LV. Into the Trees: Random Forests for Predicting Fusarium Head Blight Epidemics of Wheat in the United States. Phytopathology 2023; 113:1483-1493. [PMID: 36880796 DOI: 10.1094/phyto-10-22-0380-r] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/18/2023]
Abstract
Constructing models that accurately predict Fusarium head blight (FHB) epidemics and are also amenable to large-scale deployment is a challenging task. In the United States, the emphasis has been on simple logistic regression (LR) models, which are easy to implement but may suffer from lower accuracies when compared with more complicated, harder-to-deploy (over large geographies) model frameworks such as functional or boosted regressions. This article examined the plausibility of random forests (RFs) for the binary prediction of FHB epidemics as a possible mediation between model simplicity and complexity without sacrificing accuracy. A minimalist set of predictors was also desirable rather than having the RF model use all 90 candidate variables as predictors. The input predictor set was filtered with the aid of three RF variable selection algorithms (Boruta, varSelRF, and VSURF), using resampling techniques to quantify the variability and stability of selected variable sets. Post-selection filtering produced 58 competitive RF models with no more than 14 predictors each. One variable representing temperature stability in the 20 days before anthesis was the most frequently selected predictor. This was a departure from the prominence of relative humidity-based variables previously reported in LR models for FHB. The RF models had overall superior predictive performance over the LR models and may be suitable candidates for use by the Fusarium Head Blight Prediction Center.
Affiliation(s)
- Denis A Shah
  - Department of Plant Pathology, Kansas State University, Manhattan, KS 66506
- Erick D De Wolf
  - Department of Plant Pathology, Kansas State University, Manhattan, KS 66506
- Pierce A Paul
  - Department of Plant Pathology, The Ohio State University, Ohio Agricultural Research and Development Center, Wooster, OH 44691
- Laurence V Madden
  - Department of Plant Pathology, The Ohio State University, Ohio Agricultural Research and Development Center, Wooster, OH 44691
3
Infantino A, Belocchi A, Quaranta F, Reverberi M, Beccaccioli M, Lombardi D, Vitale M. Effects of climate change on the distribution of Fusarium spp. in Italy. The Science of the Total Environment 2023; 882:163640. [PMID: 37087011 DOI: 10.1016/j.scitotenv.2023.163640] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2023] [Revised: 04/17/2023] [Accepted: 04/17/2023] [Indexed: 05/03/2023]
Abstract
This work studies the incidence of Fusarium spp. on wheat kernels under current and future climatic conditions in Italy. Epidemiological analyses were performed from 2007 to 2013 and the resulting dataset was used to find correlations between the disease incidence of five important Fusarium species monitored in Italy (Fusarium graminearum, F. langsethiae, F. sporotrichioides, F. poae and F. avenaceum) and climatic and geographical parameters. Probabilistic modelling of the actual distribution of Fusarium spp. was achieved by using zero-inflated Poisson regression. The probabilistic geographical distribution of the Fusarium species was assessed by applying future climatic scenarios (RCPs 4.5 and 8.5). The shift from current to future climatic scenarios highlighted changes on a national and regional scale. The tightening of environmental conditions from the RCP4.5 to the RCP8.5 scenario resulted in a sporadic presence of F. avenaceum only in the northern region of Italy. Fusarium graminearum was abundant in the current climate, but the tightening of minimum and maximum temperatures and the decrease in precipitation in May and June under RCP8.5 no longer represent the optimum conditions for it. Fusarium langsethiae was currently distributed throughout Italy, showing an increase in the probability of detecting it when moving from high to low latitudes and from low to high longitudes under RCP8.5. Fusarium poae, unlike other Fusarium species, grows and develops in arid climatic conditions. High values of F. poae were recorded at low latitudes and longitudes. Under the RCP scenarios, it showed high incidence probabilities in the southeast and northeast areas of Italy. Fusarium sporotrichioides is scarcely present in Italy, found at high latitudes and in the central areas. Climate change altered this distribution, and the chances of discovering it increased significantly moving to southern Italy. Overall, the study shows that climate change conditions are likely to lead to an increase in the incidence of Fusarium species on wheat kernels in Italy, highlighting the importance of developing strategies to mitigate the effects of climate change on wheat production, quality, and safety.
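As background on the zero-inflated Poisson model mentioned in this abstract: it mixes a point mass at zero (probability pi) with an ordinary Poisson count (rate lam). The sketch below computes its probability mass function. The parameter names and example values are assumptions chosen for illustration and are not taken from the study.

```python
import math


def zip_pmf(k, lam, pi):
    """P(Y = k) for a zero-inflated Poisson distribution.

    lam: Poisson rate of the count process (lam > 0).
    pi:  probability of a structural (extra) zero, 0 <= pi < 1.
    """
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        # A zero arises either structurally or from the Poisson part.
        return pi + (1.0 - pi) * poisson
    return (1.0 - pi) * poisson


def zip_mean(lam, pi):
    """Expected value of the zero-inflated Poisson distribution."""
    return (1.0 - pi) * lam
```

In a regression setting, both lam and pi would typically be linked to covariates (for example, climatic and geographical parameters), which this sketch leaves out.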
Affiliation(s)
- Alessandro Infantino
  - Research Centre for Plant Protection and Certification, Council for Agricultural Research and Agricultural Economics-CREA, Italy
- Andreina Belocchi
  - Research Centre for Engineering and Agro-Food Processing, Council for Agricultural Research and Agricultural Economics-CREA, Italy
- Fabrizio Quaranta
  - Research Centre for Engineering and Agro-Food Processing, Council for Agricultural Research and Agricultural Economics-CREA, Italy
- Massimo Reverberi
  - Department of Environmental Biology, Sapienza University of Rome, Italy
- Danilo Lombardi
  - Department of Environmental Biology, Sapienza University of Rome, Italy
- Marcello Vitale
  - Department of Environmental Biology, Sapienza University of Rome, Italy
4
Ahmed U, Lin JCW, Srivastava G. Multivariate time-series sensor vital sign forecasting of cardiovascular and chronic respiratory diseases. Sustainable Computing: Informatics and Systems 2023; 38:100868. [PMID: 37168459 PMCID: PMC10076073 DOI: 10.1016/j.suscom.2023.100868] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/16/2021] [Revised: 11/27/2022] [Accepted: 04/02/2023] [Indexed: 05/13/2023]
Abstract
Approximately 19 million people die each year from cardiovascular and chronic respiratory diseases. As a result of the recent COVID-19 epidemic, blood pressure, cholesterol, and blood sugar levels have risen. Not only do healthcare institutions benefit from studying physiological vital signs, but individuals also benefit from being alerted to health problems in a timely manner. This study uses machine learning to categorize and predict cardiovascular and chronic respiratory diseases. By predicting a patient's health status, caregivers and medical professionals can be alerted when needed. We predicted vital signs for 180 seconds using real-world vital sign data. A person's life can be saved if caregivers react quickly and anticipate emergencies. The tree-based pipeline optimization method (TPOT) is used instead of manually adjusting machine learning classifiers. This paper focuses on optimizing classification accuracy by combining feature pre-processors and machine learning models with TPOT genetic programming, making use of linear and Prophet models to predict important indicators. The TPOT tuning parameter combines predicted values with classical classification models such as Naïve Bayes, Support Vector Machines, and Random Forests. As a result of this study, we show the importance of categorizing and increasing the accuracy of predictions. The proposed model achieves its adaptive behavior by conceptually incorporating different machine learning classifiers. We compare the proposed model with several state-of-the-art algorithms using a large amount of training data. Test results at the University of Queensland using data from 32 patients showed that the proposed model outperformed existing algorithms, improving the classification of cardiovascular disease from 0.58 to 0.71 and of chronic respiratory disease from 0.49 to 0.70, while minimizing the mean percent error in vital signs. Our results suggest that the Facebook Prophet prediction model in conjunction with the TPOT classification model can correctly diagnose a patient's health status based on abnormal vital signs and enables patients to receive prompt medical attention.
Affiliation(s)
- Usman Ahmed
  - Department of Computer Science, Electrical Engineering and Mathematical Sciences, Western Norway University of Applied Sciences, 5063, Bergen, Norway
- Jerry Chun-Wei Lin
  - Department of Computer Science, Electrical Engineering and Mathematical Sciences, Western Norway University of Applied Sciences, 5063, Bergen, Norway
- Gautam Srivastava
  - Department of Mathematics & Computer Science, Brandon University, Brandon, Canada
  - Research Centre of Interneural Computing, Taichung, Taiwan
  - Department of Computer Science & Math, Lebanese American University, Beirut, Lebanon
5
Dalla Lana F, Madden LV, Paul PA. Logistic Models Derived via LASSO Methods for Quantifying the Risk of Natural Contamination of Maize Grain with Deoxynivalenol. Phytopathology 2021; 111:2250-2267. [PMID: 34009008 DOI: 10.1094/phyto-03-21-0104-r] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 06/12/2023]
Abstract
Models were developed to quantify the risk of deoxynivalenol (DON) contamination of maize grain based on weather, cultural practices, hybrid resistance, and Gibberella ear rot (GER) intensity. Data on natural DON contamination of 15 to 16 hybrids and weather were collected from 10 Ohio locations over 4 years. Logistic regression with 10-fold cross-validation was used to develop models to predict the risk of DON ≥1 ppm. The presence and severity of GER predicted DON risk with an accuracy of 0.81 and 0.87, respectively. Temperature, relative humidity, surface wetness, and rainfall were used to generate 37 weather-based predictor variables summarized over each of six 15-day windows relative to maize silking (R1). With these variables, least absolute shrinkage and selection operator (LASSO) followed by all-subsets variable selection and logistic regression with 10-fold cross-validation were used to build single-window weather-based models, from which 11 with one or two predictors were selected based on performance metrics and simplicity. LASSO logistic regression was also used to build more complex multiwindow models with up to 22 predictors. The performance of the best single-window models was comparable to that of the best multiwindow models, with accuracy ranging from 0.81 to 0.83 for the former and 0.83 to 0.87 for the latter group of models. These results indicated that the risk of DON ≥1 ppm can be accurately predicted with simple models built using temperature- and moisture-based predictors from a single window. These models will be the foundation for developing tools to predict the risk of DON contamination of maize grain.
Affiliation(s)
- Felipe Dalla Lana
  - Department of Plant Pathology, The Ohio State University, Ohio Agricultural Research and Development Center, Wooster, OH 44691
- Laurence V Madden
  - Department of Plant Pathology, The Ohio State University, Ohio Agricultural Research and Development Center, Wooster, OH 44691
- Pierce A Paul
  - Department of Plant Pathology, The Ohio State University, Ohio Agricultural Research and Development Center, Wooster, OH 44691
6
Shah W, Aleem M, Iqbal MA, Islam MA, Ahmed U, Srivastava G, Lin JCW. A Machine-Learning-Based System for Prediction of Cardiovascular and Chronic Respiratory Diseases. Journal of Healthcare Engineering 2021; 2021:2621655. [PMID: 34760140 PMCID: PMC8575608 DOI: 10.1155/2021/2621655] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/25/2021] [Revised: 08/24/2021] [Accepted: 10/04/2021] [Indexed: 11/17/2022]
Abstract
Cardiovascular and chronic respiratory diseases are global threats to public health and cause approximately 19 million deaths worldwide annually. This high mortality rate can be reduced with the use of technological advancements in medical science that can facilitate continuous monitoring of physiological parameters (blood pressure, cholesterol levels, blood glucose, etc.). The future values of these critical physiological or vital sign parameters not only enable in-time assistance from medical experts and caregivers but also help patients manage their health status by receiving relevant regular alerts/advice from healthcare practitioners. In this study, we propose a machine-learning-based prediction and classification system to determine future values of related vital signs for both cardiovascular and chronic respiratory diseases. Based on the prediction of future values, the proposed system can classify patients' health status to alert the caregivers and medical experts. In this machine-learning-based prediction and classification model, we have used a real vital sign dataset. To predict the next 1-3 minutes of vital sign values, several regression techniques (i.e., linear regression and polynomial regression of degrees 2, 3, and 4) have been tested. A 60-second prediction of vital signs is used for caregivers, and a 3-minute prediction is used to facilitate emergency medical assistance. Based on the predicted vital sign values, the patient's overall health is assessed using three machine learning classifiers, i.e., Support Vector Machine (SVM), Naive Bayes, and Decision Tree. Our results show that the Decision Tree can correctly classify a patient's health status based on abnormal vital sign values and is helpful in providing timely medical care to patients.
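The regression-based forecasting step described in this abstract can be illustrated with the simplest of the listed techniques, linear regression. The sketch below fits a least-squares line to a short series and extrapolates it 60 and 180 seconds ahead. The heart-rate readings, sampling times, and function names are made up for illustration. The paper's actual data and pipeline are not reproduced here.

```python
def fit_line(times, values):
    """Ordinary least-squares fit of values = intercept + slope * time."""
    n = len(times)
    t_mean = sum(times) / n
    v_mean = sum(values) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (v - v_mean) for t, v in zip(times, values))
    slope = sxy / sxx
    intercept = v_mean - slope * t_mean
    return slope, intercept


def forecast(times, values, horizon_s):
    """Extrapolate the fitted line horizon_s seconds past the last observation."""
    slope, intercept = fit_line(times, values)
    return intercept + slope * (times[-1] + horizon_s)


# Hypothetical heart-rate readings (beats per minute) sampled every 10 seconds.
times = [0, 10, 20, 30, 40, 50]
heart_rate = [72.0, 73.0, 74.0, 75.0, 76.0, 77.0]

caregiver_alert = forecast(times, heart_rate, 60)    # 60-second horizon
emergency_alert = forecast(times, heart_rate, 180)   # 3-minute horizon
```

The polynomial variants of degrees 2 to 4 mentioned in the abstract would replace the straight line with a higher-order fit. A straight line is used here only to keep the example short.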
Affiliation(s)
- Wajid Shah
  - Capital University of Science and Technology, Islamabad 44000, Pakistan
- Muhammad Aleem
  - National University of Computer and Emerging Sciences (NUCES), Islamabad 44000, Pakistan
- Muhammad Azhar Iqbal
  - School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu 611756, China
- Muhammad Arshad Islam
  - National University of Computer and Emerging Sciences (NUCES), Islamabad 44000, Pakistan
- Usman Ahmed
  - Department of Computer Science, Electrical Engineering and Mathematical Sciences, Western Norway University of Applied Sciences, Bergen 5063, Norway
- Gautam Srivastava
  - Department of Mathematics and Computer Science, Brandon University, Brandon, Canada
  - Research Centre for Interneural Computing, China Medical University, Taichung 40402, Taiwan
- Jerry Chun-Wei Lin
  - Department of Computer Science, Electrical Engineering and Mathematical Sciences, Western Norway University of Applied Sciences, Bergen 5063, Norway
7
Evolution of Fusarium Head Blight Management in Wheat: Scientific Perspectives on Biological Control Agents and Crop Genotypes Protocooperation. Applied Sciences-Basel 2021. [DOI: 10.3390/app11198960] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 12/23/2022]
Abstract
Over the past century, the economically devastating Fusarium Head Blight (FHB) disease has persistently ravaged small grain cereal crops worldwide. Annual global losses are in the billions of United States dollars (USD), with common bread wheat and durum wheat accounting for a major portion of these losses. Since the unforgettable FHB epidemics of the 1990s and early 2000s in North America, different management strategies have been employed to treat this disease. However, even with some of the best practices, including chemical fungicides and innovative breeding advances that have given rise to a spectrum of moderately resistant cultivars, FHB remains an obstinate problem in cereal farms globally. This is in part due to several constraints, such as the Fusarium complex of species and the struggle to develop and employ methods that can effectively combat more than one pathogenic line or species simultaneously. This review highlights the last 100 years of major FHB epidemics in the US and Canada, as well as the evolution of different management strategies and recent progress in resistance and cultivar development. It also takes a look at protocooperation between specific biocontrol agents and cereal genotypes as a promising tool for combatting FHB.
8
Shah DA, De Wolf ED, Paul PA, Madden LV. Accuracy in the prediction of disease epidemics when ensembling simple but highly correlated models. PLoS Comput Biol 2021; 17:e1008831. [PMID: 33720929 PMCID: PMC7993824 DOI: 10.1371/journal.pcbi.1008831] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 09/03/2020] [Revised: 03/25/2021] [Accepted: 02/23/2021] [Indexed: 11/25/2022]
Abstract
Ensembling combines the predictions made by individual component base models with the goal of achieving a predictive accuracy that is better than that of any one of the constituent member models. Diversity among the base models in terms of predictions is a crucial criterion in ensembling. However, there are practical instances when the available base models produce highly correlated predictions, because they may have been developed within the same research group or may have been built from the same underlying algorithm. We investigated, via a case study on Fusarium head blight (FHB) on wheat in the U.S., whether ensembles of simple yet highly correlated models for predicting the risk of FHB epidemics, all generated from logistic regression, provided any benefit to predictive performance, despite relatively low levels of base model diversity. Three ensembling methods were explored: soft voting, weighted averaging of smaller subsets of the base models, and penalized regression as a stacking algorithm. Soft voting and weighted model averages were generally better at classification than the base models, though not universally so. The performances of stacked regressions were superior to those of the other two ensembling methods we analyzed in this study. Ensembling simple yet correlated models is computationally feasible and is therefore worth pursuing for models of epidemic risk.
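Two of the ensembling methods named in this abstract, soft voting and weighted averaging, reduce to averaging the base models' predicted probabilities. The following sketch assumes three hypothetical base models that have already produced epidemic probabilities for the same two cases. The numbers and weights are invented for illustration, and the stacking approach (penalized regression on base-model outputs) is not shown.

```python
def soft_vote(prob_lists, weights=None):
    """Combine per-model probabilities by (optionally weighted) averaging.

    prob_lists: one list of predicted probabilities per base model, all
                covering the same cases in the same order.
    weights:    optional non-negative weight per model; equal weights
                give plain soft voting.
    """
    if weights is None:
        weights = [1.0] * len(prob_lists)
    total = sum(weights)
    n_cases = len(prob_lists[0])
    return [sum(w * probs[i] for w, probs in zip(weights, prob_lists)) / total
            for i in range(n_cases)]


def classify(probabilities, threshold=0.5):
    """Turn ensemble probabilities into epidemic (1) / non-epidemic (0) calls."""
    return [1 if p >= threshold else 0 for p in probabilities]


# Hypothetical probabilities from three logistic regression base models.
base_predictions = [[0.2, 0.8], [0.4, 0.6], [0.6, 0.7]]

equal_vote = soft_vote(base_predictions)
weighted_vote = soft_vote(base_predictions, weights=[2.0, 1.0, 1.0])
```

When the base models are highly correlated, as in the study, their probabilities tend to move together, so averaging changes individual predictions only modestly. The weights here are arbitrary, and in practice they would be chosen from validation performance.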
Affiliation(s)
- Denis A. Shah
  - Department of Plant Pathology, Kansas State University, Manhattan, Kansas, United States of America
- Erick D. De Wolf
  - Department of Plant Pathology, Kansas State University, Manhattan, Kansas, United States of America
- Pierce A. Paul
  - Department of Plant Pathology, The Ohio State University, Ohio Agricultural Research and Development Center, Wooster, Ohio, United States of America
- Laurence V. Madden
  - Department of Plant Pathology, The Ohio State University, Ohio Agricultural Research and Development Center, Wooster, Ohio, United States of America