1
Zhou G, Lee MC, Wang X, Zhong D, Githeko AK, Yan G. Mapping Potential Malaria Vector Larval Habitats for Larval Source Management in Western Kenya: Introduction to Multimodel Ensembling Approaches. Am J Trop Med Hyg 2024; 110:421-430. [PMID: 38350135 PMCID: PMC10919169 DOI: 10.4269/ajtmh.23-0108] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/15/2023] [Accepted: 11/03/2023] [Indexed: 02/15/2024] Open
Abstract
Identification and mapping of larval sources are prerequisites for effective planning and implementation of mosquito larval source management (LSM). Ensemble modeling is increasingly used for prediction modeling, but it lacks standard procedures. We proposed a detailed framework to predict potential malaria vector larval habitats by using multimodel ensemble modeling, which includes selection of models, ensembling method, and predictors; evaluation of variable importance; prediction of potential larval habitats; and assessment of prediction uncertainty. The models were built and validated based on multisite, multiyear field observations and climatic/environmental variables. Model performance was tested using independent field observations. Overall, we found that the ensembled model predicted larval habitats with about 20% more accuracy than the average of the individual models ensembled. Key larval habitat predictors in western Kenya were elevation, geomorphon class, and precipitation for the 2 months prior. Additional predictors may be required to increase the predictive accuracy of the larva-positive habitats. This is the first study to provide a detailed framework for the process of multimodel ensemble modeling for malaria vector habitats. Mapping of potential habitats will be helpful in LSM planning.
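As a rough illustration of the ensembling idea described in this abstract, the sketch below averages predicted habitat probabilities from several models and compares the ensemble's accuracy with the accuracy of each individual model. The labels, the three model outputs, the 0.5 threshold, and the unweighted-mean combiner are all assumptions made for illustration; the abstract does not state which ensembling method the authors selected.

```python
from statistics import mean


def accuracy(probs, labels, threshold=0.5):
    """Fraction of sites whose thresholded probability matches the label."""
    hits = sum((p >= threshold) == bool(y) for p, y in zip(probs, labels))
    return hits / len(labels)


def ensemble_mean(model_probs):
    """Unweighted mean of per-site probabilities across models (assumed combiner)."""
    return [mean(site) for site in zip(*model_probs)]


# Hypothetical data: 1 = larval habitat present, 0 = absent.
labels = [1, 0, 1, 0, 1, 0]
model_probs = [
    [0.9, 0.2, 0.4, 0.1, 0.8, 0.3],  # model A misses the third site
    [0.7, 0.6, 0.8, 0.2, 0.9, 0.1],  # model B raises a false alarm at the second site
    [0.8, 0.1, 0.7, 0.7, 0.6, 0.2],  # model C raises a false alarm at the fourth site
]

individual = [accuracy(p, labels) for p in model_probs]
ensemble = accuracy(ensemble_mean(model_probs), labels)
print("individual:", individual, "ensemble:", ensemble)
```

In this toy setup each model errs at a different site, so averaging cancels the individual mistakes; real gains depend on how correlated the models' errors are.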
Affiliation(s)
- Guofa Zhou: Program in Public Health, University of California, Irvine, California
- Ming-Chieh Lee: Program in Public Health, University of California, Irvine, California
- Xiaoming Wang: Program in Public Health, University of California, Irvine, California
- Daibin Zhong: Program in Public Health, University of California, Irvine, California
- Andrew K. Githeko: Centre for Global Health Research, Kenya Medical Research Institute, Kisumu, Kenya
- Guiyun Yan: Program in Public Health, University of California, Irvine, California
2
Hosseinzadeh Shahri M, Haghbin F, Raeini YQ, Monfared N. The effects of fake reviews during stepwise topic movement on shopping attitude in social network marketing. MethodsX 2023; 11:102461. [PMID: 38023303 PMCID: PMC10643290 DOI: 10.1016/j.mex.2023.102461] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/19/2023] [Accepted: 10/22/2023] [Indexed: 12/01/2023] Open
Abstract
Although the influence of consumer reviews is increasing in weight, far-flung consumer comments on social networks are a retrogressive problem, disturbing users' attention in reviews studies by interpreting them as misleading messages. The need to investigate the unknown meaning layers of fake reviews in the stepwise topic movement of conversations, and to examine the effects of fake reviews on consumers' shopping attitudes, encouraged us to adopt an integrated approach combining marketing and discourse analyses.
- Qualitative analysis: To investigate the stepwise topic movement of fake reviews in each sampled conversation, three phases were considered: first, identification of the topic opening; then, the topic closing procedure; and finally, the topic switch toward topic drift.
- Quantitative investigation: We developed a questionnaire using multidisciplinary research variables. The reliability and validity of the questionnaire were assessed using Cronbach's alpha and convergent and discriminant values, respectively. The questionnaire was then administered to a research sample, and the data were analysed using structural equation modeling (SEM) and machine learning (ML).
- Conclusion: Fake reviews using topic coherence and grammatical-lexical cohesion mechanisms had positive effects on shopping attitudes. Moreover, fake reviews using topic drift mechanisms influenced consumers' shopping attitudes.
3
Shyaa MA, Zainol Z, Abdullah R, Anbar M, Alzubaidi L, Santamaría J. Enhanced Intrusion Detection with Data Stream Classification and Concept Drift Guided by the Incremental Learning Genetic Programming Combiner. Sensors (Basel, Switzerland) 2023; 23:3736. [PMID: 37050795 PMCID: PMC10098915 DOI: 10.3390/s23073736] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/11/2023] [Revised: 03/27/2023] [Accepted: 03/31/2023] [Indexed: 06/19/2023]
Abstract
Concept drift (CD) in data streaming scenarios such as network intrusion detection systems (IDS) refers to the change in the statistical distribution of the data over time. There are five principal variants related to CD: incremental, gradual, recurrent, sudden, and blip. Genetic programming combiner (GPC) classification is an effective core candidate for data stream classification for IDS. However, its basic structure relies on traditional static machine learning models that receive one-time training, limiting its ability to handle CD. To address this issue, we propose an extended variant of the GPC using three main components. First, we replace existing classifiers with alternatives: online sequential extreme learning machine (OSELM), feature adaptive OSELM (FA-OSELM), and knowledge preservation OSELM (KP-OSELM). Second, we add two new components to the GPC, specifically, a data balancing component and a classifier update component. Third, the coordination between the sub-models produces three novel variants of the GPC: GPC-KOS for KP-OSELM; GPC-FOS for FA-OSELM; and GPC-OS for OSELM. This article presents the first data stream-based classification framework that provides novel strategies for handling CD variants. The experimental results demonstrate that both GPC-KOS and GPC-FOS outperform the traditional GPC and other state-of-the-art methods, and the transfer learning and memory features contribute to the effective handling of most types of CD. Moreover, the application of our incremental variants on real-world datasets (KDD Cup '99, CICIDS-2017, CSE-CIC-IDS-2018, and ISCX '12) demonstrates improved performance (GPC-FOS in connection with CSE-CIC-IDS-2018 and CICIDS-2017; GPC-KOS in connection with ISCX2012 and KDD Cup '99), with maximum accuracy rates of 100% and 98% by GPC-KOS and GPC-FOS, respectively. Additionally, our GPC variants do not show superior performance in handling blip drift.
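To make the definition of concept drift in this abstract concrete, the sketch below flags a change in a stream's distribution by comparing the mean of a fixed reference window with the mean of a sliding recent window. This is a generic windowed check written for illustration, not the GPC method of the paper; the class name `WindowDriftDetector`, the window size, the threshold, and the synthetic stream are all hypothetical.

```python
from collections import deque
from statistics import mean


class WindowDriftDetector:
    """Flags drift when the recent window mean departs from the reference mean."""

    def __init__(self, window=50, threshold=0.5):
        self.window = window
        self.threshold = threshold
        self.reference = []                  # first `window` values seen
        self.recent = deque(maxlen=window)   # sliding window of later values

    def update(self, value):
        """Feed one observation; return True if drift is flagged."""
        if len(self.reference) < self.window:
            self.reference.append(value)
            return False
        self.recent.append(value)
        if len(self.recent) < self.window:
            return False
        return abs(mean(self.recent) - mean(self.reference)) > self.threshold


# Hypothetical stream with a sudden drift at index 100.
stream = [0.0] * 100 + [1.0] * 100
detector = WindowDriftDetector(window=50, threshold=0.5)
flags = [detector.update(x) for x in stream]
first_flag = flags.index(True) if True in flags else None
print("first flagged index:", first_flag)
```

A detector of this kind reacts to sudden drift after a delay that depends on the window size; gradual, recurrent, and blip drift generally need different handling, which is the gap the abstract addresses.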
Affiliation(s)
- Methaq A. Shyaa: School of Computer Sciences, Universiti Sains Malaysia, USM, Gelugor 11800, Pulau Penang, Malaysia
- Zurinahni Zainol: School of Computer Sciences, Universiti Sains Malaysia, USM, Gelugor 11800, Pulau Penang, Malaysia
- Rosni Abdullah: School of Computer Sciences, Universiti Sains Malaysia, USM, Gelugor 11800, Pulau Penang, Malaysia
- Mohammed Anbar: National Advanced IPv6 Centre (NAv6), Universiti Sains Malaysia, USM, Gelugor 11800, Pulau Penang, Malaysia
- Laith Alzubaidi: School of Mechanical, Medical, and Process Engineering, Queensland University of Technology, Brisbane, QLD 4000, Australia; Centre for Data Science, Queensland University of Technology, Brisbane, QLD 4000, Australia
- José Santamaría: Department of Computer Science, University of Jaén, 23071 Jaén, Spain
4
The L2 convergence of stream data mining algorithms based on probabilistic neural networks. Inf Sci (N Y) 2023. [DOI: 10.1016/j.ins.2023.02.074] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/02/2023]
5
Parallelized extreme learning machine for online data classification. Appl Intell 2022. [DOI: 10.1007/s10489-022-03308-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2022]
6
Castro VM, Hart KL, Sacks CA, Murphy SN, Perlis RH, McCoy TH. Longitudinal validation of an electronic health record delirium prediction model applied at admission in COVID-19 patients. Gen Hosp Psychiatry 2022; 74:9-17. [PMID: 34798580 PMCID: PMC8562039 DOI: 10.1016/j.genhosppsych.2021.10.005] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 08/30/2021] [Revised: 10/25/2021] [Accepted: 10/27/2021] [Indexed: 12/15/2022]
Abstract
OBJECTIVE: To validate a previously published machine learning model of delirium risk in hospitalized patients with coronavirus disease 2019 (COVID-19).
METHOD: Using data from six hospitals across two academic medical networks covering care occurring after initial model development, we calculated the predicted risk of delirium using a previously developed risk model applied to diagnostic, medication, laboratory, and other clinical features available in the electronic health record (EHR) at time of hospital admission. We evaluated the accuracy of these predictions against subsequent delirium diagnoses during that admission.
RESULTS: Of the 5102 patients in this cohort, 716 (14%) developed delirium. The model's risk predictions produced a c-index of 0.75 (95% CI, 0.73-0.77) with 27.7% of cases occurring in the top decile of predicted risk scores. Model calibration was diminished compared to the initial COVID-19 wave.
CONCLUSION: This EHR delirium risk prediction model, developed during the initial surge of COVID-19 patients, produced consistent discrimination over subsequent larger waves; however, with changing cohort composition and delirium occurrence rates, model calibration decreased. These results underscore the importance of calibration, and the challenge of developing risk models for clinical contexts where standard of care and clinical populations may shift.
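The two discrimination measures reported in this abstract can be sketched as follows for a binary outcome: the c-index as the probability that a randomly chosen case receives a higher score than a randomly chosen non-case, and the share of all cases that fall in the top decile of predicted risk. The 20 scores and outcomes are hypothetical and are not the study's data; the function names are illustrative only.

```python
def c_index(scores, outcomes):
    """Probability that a random case outscores a random non-case (ties count 0.5)."""
    cases = [s for s, y in zip(scores, outcomes) if y == 1]
    controls = [s for s, y in zip(scores, outcomes) if y == 0]
    concordant = 0.0
    for c in cases:
        for n in controls:
            if c > n:
                concordant += 1.0
            elif c == n:
                concordant += 0.5
    return concordant / (len(cases) * len(controls))


def top_decile_capture(scores, outcomes):
    """Share of all cases that fall in the highest-scoring 10% of patients."""
    ranked = sorted(zip(scores, outcomes), key=lambda pair: pair[0], reverse=True)
    k = max(1, len(ranked) // 10)
    captured = sum(y for _, y in ranked[:k])
    return captured / sum(outcomes)


# Hypothetical cohort of 20 patients, scores sorted from high to low risk.
scores = [0.95, 0.90, 0.80, 0.70, 0.65, 0.60, 0.55, 0.50, 0.45, 0.40,
          0.35, 0.30, 0.28, 0.25, 0.20, 0.15, 0.12, 0.10, 0.05, 0.02]
outcomes = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0,
            0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

print("c-index:", c_index(scores, outcomes))
print("top-decile capture:", top_decile_capture(scores, outcomes))
```

Both measures depend only on the ranking of scores, which is why a model can keep its discrimination across waves while its calibration (agreement between predicted and observed rates) degrades.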
Affiliation(s)
- Victor M. Castro: Center for Quantitative Health, Massachusetts General Hospital, 185 Cambridge Street, Boston, MA 02114, USA; Research Information Science and Computing, Mass General Brigham, 399 Revolution Drive, Somerville, MA 02145, USA
- Kamber L. Hart: Center for Quantitative Health, Massachusetts General Hospital, 185 Cambridge Street, Boston, MA 02114, USA
- Chana A. Sacks: Department of Medicine, Massachusetts General Hospital, 100 Cambridge Street, Boston, MA 02114, USA
- Shawn N. Murphy: Research Information Science and Computing, Mass General Brigham, 399 Revolution Drive, Somerville, MA 02145, USA; Department of Neurology, Massachusetts General Hospital, 55 Fruit Street, Boston, MA 02114, USA
- Roy H. Perlis: Center for Quantitative Health, Massachusetts General Hospital, 185 Cambridge Street, Boston, MA 02114, USA
- Thomas H. McCoy: Center for Quantitative Health, Massachusetts General Hospital, 185 Cambridge Street, Boston, MA 02114, USA. Corresponding author at: Simches Research Building, Massachusetts General Hospital, 185 Cambridge St, 6th Floor, Boston, MA 02114, USA
7
Recurrent Adaptive Classifier Ensemble for Handling Recurring Concept Drifts. Applied Computational Intelligence and Soft Computing 2021. [DOI: 10.1155/2021/5533777] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/17/2022] Open
Abstract
For most real-world data streams, the concept about which data is obtained may shift from time to time, a phenomenon known as concept drift. In many real-world applications, such as nonstationary time-series data, concept drift often occurs in a cyclic fashion and previously seen concepts reappear, giving rise to a distinct kind of concept drift known as recurring concepts. A cyclically drifting concept exhibits a tendency to return to previously visited states. Existing machine learning algorithms handle recurring concepts by retraining a learning model when drift is detected, which loses information if the concept was already well learned by the model and will recur in a later learning phase. A common remedy is to retain and reuse previously learned models, but in nonstationary environments selecting an optimal ensemble classifier capable of accurately adapting to recurring concepts is time-consuming and computationally prohibitive. To learn from streaming data, fast and accurate machine learning algorithms are needed for time-dependent applications. Most existing algorithms designed to handle concept drift do not take into account the presence of recurring concept drift. To accurately and efficiently handle recurring concepts with minimum computational overhead, we propose a novel and evolving ensemble method called Recurrent Adaptive Classifier Ensemble (RACE). The algorithm preserves an archive of previously learned models that are diverse and always trains both new and existing classifiers. Empirical experiments conducted on synthetic and real-world data stream benchmarks show that RACE adapts to recurring concepts significantly more accurately than some state-of-the-art ensemble classifiers based on classifier reuse.