1
Bohm BC, Borges FEDM, Silva SCM, Soares AT, Ferreira DD, Belo VS, Lignon JS, Bruhn FRP. Utilization of machine learning for dengue case screening. BMC Public Health 2024; 24:1573. [PMID: 38862945 PMCID: PMC11167742 DOI: 10.1186/s12889-024-19083-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2023] [Accepted: 06/07/2024] [Indexed: 06/13/2024]
Abstract
Dengue causes approximately 10,000 deaths and 100 million symptomatic infections annually worldwide, making it a significant public health concern. To address this, artificial intelligence tools like machine learning can play a crucial role in developing more effective strategies for control, diagnosis, and treatment. This study identifies relevant variables for the screening of dengue cases through machine learning models and evaluates the accuracy of the models. Data from reported dengue cases in the states of Rio de Janeiro and Minas Gerais for the years 2016 and 2019 were obtained through the National Notifiable Diseases Surveillance System (SINAN). The mutual information technique was used to assess which variables were most related to laboratory-confirmed dengue cases. Next, a random selection of 10,000 confirmed cases and 10,000 discarded cases was performed, and the dataset was divided into training (70%) and testing (30%) sets. Machine learning models were then tested to classify the cases. It was found that the logistic regression model with 10 variables (gender, age, fever, myalgia, headache, vomiting, nausea, back pain, rash, retro-orbital pain) and the Decision Tree and Multilayer Perceptron (MLP) models achieved the best results in decision metrics, with an accuracy of 98%. Therefore, a tree-based model would be suitable for building an application and implementing it on smartphones. This resource would be available to healthcare professionals such as doctors and nurses.
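The mutual information step mentioned in the abstract can be illustrated with a minimal sketch. The records below are made up for illustration and are not SINAN data; the function and variable names are hypothetical, and the abstract does not say which implementation the authors used.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        # Each observed cell contributes p(x,y) * log2(p(x,y) / (p(x) p(y))).
        mi += p_xy * math.log2(p_xy / ((px[x] / n) * (py[y] / n)))
    return mi


# Toy, invented records (1 = present / confirmed, 0 = absent / discarded).
fever = [1, 1, 1, 1, 0, 0, 0, 0]
rash = [1, 0, 1, 0, 1, 0, 1, 0]
confirmed = [1, 1, 1, 1, 0, 0, 0, 0]

scores = {
    "fever": mutual_information(fever, confirmed),
    "rash": mutual_information(rash, confirmed),
}
# Rank the candidate variables from most to least informative.
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked, scores)
```

In this toy data the fever column matches the label exactly and the rash column is independent of it, so the ranking places fever first. Real surveillance data would give intermediate scores.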
Affiliation(s)
- Bianca Conrad Bohm
  - Laboratory of Veterinary Epidemiology, Postgraduate Program in Veterinary, Federal University of Pelotas (UFPel), Capão do Leão, RS, Brazil
- Suellen Caroline Matos Silva
  - Laboratory of Veterinary Epidemiology, Postgraduate Program in Veterinary, Federal University of Pelotas (UFPel), Capão do Leão, RS, Brazil
- Alessandra Talaska Soares
  - Laboratory of Veterinary Epidemiology, Graduate Program in Microbiology and Parasitology, Federal University of Pelotas, Capão do Leão, Rio Grande do Sul, Brazil
- Vinícius Silva Belo
  - Federal University of São João del-Rei, Midwest Dona Lindu campus, Divinópolis, Minas Gerais, Brazil
- Julia Somavilla Lignon
  - Laboratory of Veterinary Epidemiology, Postgraduate Program in Veterinary, Federal University of Pelotas (UFPel), Capão do Leão, RS, Brazil
- Fábio Raphael Pascoti Bruhn
  - Laboratory of Veterinary Epidemiology, Preventive Veterinary Department, Federal University of Pelotas, Capão do Leão, Rio Grande do Sul, Brazil
2
Durge AR, Shrimankar DD. DHFS-ECM: Design of a Dual Heuristic Feature Selection-based Ensemble Classification Model for the Identification of Bamboo Species from Genomic Sequences. Curr Genomics 2024; 25:185-201. [PMID: 39087000 PMCID: PMC11288165 DOI: 10.2174/0113892029268176240125055419] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2023] [Revised: 01/16/2024] [Accepted: 01/16/2024] [Indexed: 08/02/2024]
Abstract
Background: Analyzing genomic sequences plays a crucial role in understanding biological diversity and classifying Bamboo species. Existing methods for genomic sequence analysis suffer from limitations such as complexity, low accuracy, and the need for constant reconfiguration in response to evolving genomic datasets. Aim: This study addresses these limitations by introducing a novel Dual Heuristic Feature Selection-based Ensemble Classification Model (DHFS-ECM) for the precise identification of Bamboo species from genomic sequences. Methods: The proposed DHFS-ECM method employs a Genetic Algorithm to perform dual heuristic feature selection. This process maximizes inter-class variance, leading to the selection of informative N-gram feature sets. Subsequently, intra-class variance levels are used to create optimal training and validation sets, ensuring comprehensive coverage of class-specific features. The selected features are then processed through an ensemble classification layer, combining multiple stratification models for species-specific categorization. Results: Comparative analysis with state-of-the-art methods demonstrates that DHFS-ECM achieves remarkable improvements in accuracy (9.5%), precision (5.9%), recall (8.5%), and AUC performance (4.5%). Importantly, the model maintains its performance even with an increased number of species classes due to the continuous learning facilitated by the Dual Heuristic Genetic Algorithm Model. Conclusion: DHFS-ECM offers several key advantages, including efficient feature extraction, reduced model complexity, enhanced interpretability, and increased robustness and accuracy through the ensemble classification layer. These attributes make DHFS-ECM a promising tool for real-time clinical applications and a valuable contribution to the field of genomic sequence analysis.
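The N-gram feature sets mentioned in the Methods can be pictured with a short sketch of N-gram (k-mer) counting on a nucleotide string. This shows only the feature extraction idea, not the genetic-algorithm selection or the ensemble layer; the sequence, vocabulary, and function names are hypothetical.

```python
from collections import Counter


def ngram_counts(sequence, n):
    """Count every overlapping N-gram of length n in a nucleotide string."""
    seq = sequence.upper()
    return Counter(seq[i:i + n] for i in range(len(seq) - n + 1))


def ngram_vector(sequence, n, vocabulary):
    """Relative frequency of each N-gram in a fixed vocabulary."""
    counts = ngram_counts(sequence, n)
    total = sum(counts.values()) or 1  # guard against sequences shorter than n
    return [counts[gram] / total for gram in vocabulary]


# Invented toy sequence, not taken from any bamboo genome.
toy_sequence = "ACGTACGA"
counts = ngram_counts(toy_sequence, 2)
vector = ngram_vector(toy_sequence, 2, ["AC", "GT", "TT"])
print(counts, vector)
```

A feature selection step would then keep only the vocabulary entries that best separate the classes.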
Affiliation(s)
- Aditi R Durge
  - Department of Computer Science and Engineering, Visvesvaraya National Institute of Technology (VNIT), Nagpur, India
- Deepti D Shrimankar
  - Department of Computer Science and Engineering, Visvesvaraya National Institute of Technology (VNIT), Nagpur, India
3
Hoyos W, Hoyos K, Ruíz R. Using Computational Simulations Based on Fuzzy Cognitive Maps to Detect Dengue Complications. Diagnostics (Basel) 2024; 14:533. [PMID: 38473004 PMCID: PMC10931136 DOI: 10.3390/diagnostics14050533] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/12/2024] [Revised: 02/04/2024] [Accepted: 02/05/2024] [Indexed: 03/14/2024]
Abstract
Dengue remains a globally prevalent and potentially fatal disease, affecting millions of people worldwide each year. Early and accurate detection of dengue complications is crucial to improving clinical outcomes and reducing the burden on healthcare systems. In this study, we explore the use of computational simulations based on fuzzy cognitive maps (FCMs) to improve the detection of dengue complications. We propose an innovative approach that integrates clinical data into a computational model that mimics the decision-making process of a medical expert. Our method uses FCMs to model complexity and uncertainty in dengue. The model was evaluated in simulated scenarios with each of the dengue classifications. These maps allow us to represent and process vague and fuzzy information effectively, capturing relationships that often go unnoticed in conventional approaches. The results of the simulations show the potential of our approach to detecting dengue complications. This innovative strategy has the potential to transform the way clinical management of dengue is approached. This research is a starting point for further development of complication detection approaches for events of public health concern, such as dengue.
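How a fuzzy cognitive map simulation proceeds can be sketched as follows. The abstract does not give the update rule, the concepts, or the weights, so this sketch assumes a commonly used rule (each concept's next activation is a sigmoid of its current value plus the weighted sum of its inputs), and every concept name and weight below is invented for illustration.

```python
import math


def sigmoid(x, steepness=1.0):
    return 1.0 / (1.0 + math.exp(-steepness * x))


def fcm_step(state, weights):
    """One synchronous update; weights[i][j] is the influence of concept i on j."""
    n = len(state)
    return [
        sigmoid(state[j] + sum(weights[i][j] * state[i] for i in range(n)))
        for j in range(n)
    ]


def fcm_run(initial, weights, clamped=(), max_iter=100, tol=1e-6):
    """Iterate until activations stop changing; clamped concepts keep their input value."""
    state = list(initial)
    for _ in range(max_iter):
        nxt = fcm_step(state, weights)
        for k in clamped:
            nxt[k] = initial[k]
        if max(abs(a - b) for a, b in zip(nxt, state)) < tol:
            return nxt
        state = nxt
    return state


# Hypothetical concepts: 0 = fever, 1 = plasma leakage, 2 = severity.
W = [
    [0.0, 0.0, 0.3],  # fever -> severity (assumed weight)
    [0.0, 0.0, 0.8],  # plasma leakage -> severity (assumed weight)
    [0.0, 0.0, 0.0],
]
mild = fcm_run([1.0, 0.0, 0.0], W, clamped=(0, 1))
severe = fcm_run([1.0, 1.0, 0.0], W, clamped=(0, 1))
print(round(mild[2], 3), round(severe[2], 3))
```

Clamping the input concepts is one common way to run a scenario; without it, a sigmoid map of this kind can settle to the same state regardless of the starting values.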
Affiliation(s)
- William Hoyos
  - Grupo de Investigación en Ingeniería Sostenible e Inteligente, Universidad Cooperativa de Colombia, Montería 230002, Colombia
  - Grupo de Investigación en I+D+I en TIC, Universidad EAFIT, Medellín 050022, Colombia
- Kenia Hoyos
  - Laboratorio Clínico Humano, Clínica Salud Social, Sincelejo 700001, Colombia
- Rander Ruíz
  - Grupo de Investigación Interdisciplinario del Bajo Cauca y Sur de Córdoba, Universidad de Antioquia, Campus Caucasia, Caucasia 052410, Colombia
4
Bresani-Salvi CC, Morais CNLD, Neco HVPDC, Farias PCS, Pastor AF, Lima RED, Montarroyos UR, Acioli-Santos B. Interferon-gamma gene diplotype (AA-rs2069716 / AG-rs2069727) may play an important role during secondary outcomes of severe dengue in Brazilian patients. Rev Inst Med Trop Sao Paulo 2023; 65:e43. [PMID: 37403881 DOI: 10.1590/s1678-9946202365043] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/05/2023] [Accepted: 05/04/2023] [Indexed: 07/06/2023]
Abstract
Dengue is a global and growing health threat, especially in Southeast Asia, West Pacific and South America. Infection by the dengue virus (DENV) results in dengue fever, which can evolve to severe forms. Cytokines, especially interferons, are involved in the immunopathogenesis of dengue fever, and so may influence the disease outcomes. The aim of this study was to investigate the association between severe forms of dengue and two single nucleotide polymorphisms (SNPs) in the interferon-gamma gene (IFNG): A256G (rs2069716) and A325G (rs2069727). We included 274 patients infected with DENV serotype 3: 119 cases of dengue without warning signs (DWoWS), and 155 with warning signs (DWWS) or severe dengue (SD). DNA was extracted and genotyped with Illumina Genotyping Kit or real-time PCR (TaqMan probes). We estimated the adjusted odds ratios (OR) by multivariate logistic regression models. When comparing with the ancestral AA/AA diplotype (A256G/A325G), we found a protective association of the AA/AG against DWWS/SD among patients with secondary dengue (OR 0.51; 95% CI 0.24-1.10, p = 0.085), adjusting for age and sex. The variant genotype at locus A325G of the IFNG, in combination with the ancestral genotype at locus A256G, can protect against severe clinical forms of secondary dengue in Brazilian DENV3-infected patients.
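The reported odds ratio, interval, and p-value can be checked against each other with a short calculation. The sketch assumes the interval is a Wald interval, symmetric on the log-odds scale, which the abstract does not state; the function name is hypothetical.

```python
import math


def wald_from_ci(or_point, ci_low, ci_high, z_crit=1.959964):
    """Recover the standard error, z statistic, and two-sided p from an OR and its 95% CI."""
    log_or = math.log(or_point)
    # Assumption: CI = exp(log_or +/- z_crit * se), so the log-width gives se.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = log_or / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return se, z, p


se, z, p = wald_from_ci(0.51, 0.24, 1.10)
print(round(se, 3), round(z, 2), round(p, 3))
```

Under that assumption the recovered p-value is about 0.083, close to the reported 0.085. The interval includes 1, so the association does not reach significance at the conventional 5% level.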
Affiliation(s)
- André Filipe Pastor
  - Instituto Federal de Educação, Ciência e Tecnologia do Sertão Pernambucano, Pernambuco, Floresta, Brazil
- Raul Emídio de Lima
  - Fundação Oswaldo Cruz, Instituto Aggeu Magalhães, Departamento de Virologia, Recife, Pernambuco, Brazil
- Bartolomeu Acioli-Santos
  - Fundação Oswaldo Cruz, Instituto Aggeu Magalhães, Departamento de Virologia, Recife, Pernambuco, Brazil
5
Hu TY, Chow JC, Chien TW, Chou W. Detecting dengue fever in children using online Rasch analysis to develop algorithms for parents: An APP development and usability study. Medicine (Baltimore) 2023; 102:e33296. [PMID: 37000053 PMCID: PMC10063317 DOI: 10.1097/md.0000000000033296] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/13/2023] [Revised: 02/23/2023] [Accepted: 02/24/2023] [Indexed: 04/01/2023]
Abstract
BACKGROUND Dengue fever (DF) is a significant public health concern in Asia. However, detecting the disease using traditional dichotomous criteria (i.e., absent vs present) can be extremely difficult. Convolutional neural networks (CNNs) and artificial neural networks (ANNs), due to their use of a large number of parameters for modeling, have shown the potential to improve prediction accuracy (ACC). To date, there has been no research conducted to understand item features and responses using online Rasch analysis. To verify the hypothesis that a combination of CNN, ANN, K-nearest-neighbor algorithm (KNN), and logistic regression (LR) can improve the ACC of DF prediction for children, further research is required. METHODS We extracted 19 feature variables related to DF symptoms from 177 pediatric patients, of whom 69 were diagnosed with DF. Using the RaschOnline technique for Rasch analysis, we examined 11 variables for their statistical significance in predicting the risk of DF. Based on 2 sets of data, 1 for training (80%) and the other for testing (20%), we calculated the prediction ACC by comparing the areas under the receiver operating characteristic curve (AUCs) between DF + and DF- in both sets. In the training set, we compared 2 scenarios: the combined scheme and individual algorithms. RESULTS Our findings indicate that visual displays of DF data are easily interpreted using Rasch analysis; the k-nearest neighbors algorithm has a lower AUC (<0.50); LR has a relatively higher AUC (0.70); all 3 algorithms have an almost equal AUC (=0.68), which is smaller than the individual algorithms of Naive Bayes, LR in raw data, and Naive Bayes in normalized data; and we developed an app to assist parents in detecting DF in children during the dengue season. CONCLUSION The development of an LR-based APP for the detection of DF in children has been completed. 
To help patients, family members, and clinicians differentiate DF from other febrile illnesses at an early stage, an 11-item model is proposed for developing the APP.
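The AUC values compared in the abstract can be read as the probability that a randomly chosen DF-positive child receives a higher score than a randomly chosen DF-negative child. A minimal sketch of that pairwise definition follows; the scores are invented and the function name is hypothetical.

```python
def auc(scores_pos, scores_neg):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count half)."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Invented model scores for illustration only.
df_positive = [0.9, 0.8, 0.4]
df_negative = [0.7, 0.3, 0.2]
value = auc(df_positive, df_negative)
print(round(value, 3))
```

An AUC of 0.5 corresponds to chance-level ranking, so a value below 0.50 means the scores order cases worse than random guessing.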
Affiliation(s)
- Ting-Yun Hu
  - Department of Pediatrics, Chi Mei Medical Center, Tainan, Taiwan
- Julie Chi Chow
  - Department of Pediatrics, Chi Mei Medical Center, Tainan, Taiwan
  - Department of Pediatrics, School of Medicine, College of Medicine, Taipei Medical University, Taipei, Taiwan
- Tsair-Wei Chien
  - Department of Medical Research, Chi-Mei Medical Center, Tainan, Taiwan
- Willy Chou
  - Department of Physical Medicine and Rehabilitation, Chi Mei Medical Center, Tainan, Taiwan
  - Department of Physical Medicine and Rehabilitation, Chung San Medical University Hospital, Taichung, Taiwan
6
Zargari Marandi R, Leung P, Sigera C, Murray DD, Weeratunga P, Fernando D, Rodrigo C, Rajapakse S, MacPherson CR. Development of a machine learning model for early prediction of plasma leakage in suspected dengue patients. PLoS Negl Trop Dis 2023; 17:e0010758. [PMID: 36913411 PMCID: PMC10035900 DOI: 10.1371/journal.pntd.0010758] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 08/24/2022] [Revised: 03/23/2023] [Accepted: 02/24/2023] [Indexed: 03/14/2023]
Abstract
BACKGROUND At least a third of dengue patients develop plasma leakage with increased risk of life-threatening complications. Predicting plasma leakage using laboratory parameters obtained in early infection as means of triaging patients for hospital admission is important for resource-limited settings. METHODS A Sri Lankan cohort including 4,768 instances of clinical data from N = 877 patients (60.3% patients with confirmed dengue infection) recorded in the first 96 hours of fever was considered. After excluding incomplete instances, the dataset was randomly split into a development and a test set with 374 (70%) and 172 (30%) patients, respectively. From the development set, five most informative features were selected using the minimum description length (MDL) algorithm. Random forest and light gradient boosting machine (LightGBM) were used to develop a classification model using the development set based on nested cross validation. An ensemble of the learners via average stacking was used as the final model to predict plasma leakage. RESULTS Lymphocyte count, haemoglobin, haematocrit, age, and aspartate aminotransferase were the most informative features to predict plasma leakage. The final model achieved the area under the receiver operating characteristics curve, AUC = 0.80 with positive predictive value, PPV = 76.9%, negative predictive value, NPV = 72.5%, specificity = 87.9%, and sensitivity = 54.8% on the test set. CONCLUSION The early predictors of plasma leakage identified in this study are similar to those identified in several prior studies that used non-machine learning based methods. However, our observations strengthen the evidence base for these predictors by showing their relevance even when individual data points, missing data and non-linear associations were considered. Testing the model on different populations using these low-cost observations would identify further strengths and limitations of the presented model.
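The final ensemble and the reported test metrics can be pictured with a small sketch. It reads "average stacking" as averaging the predicted probabilities of the base learners, which is an assumption; the probabilities, labels, and threshold below are invented and do not come from the Sri Lankan cohort.

```python
def average_ensemble(prob_lists):
    """Average the per-patient probabilities predicted by several models."""
    return [sum(ps) / len(ps) for ps in zip(*prob_lists)]


def confusion_metrics(y_true, y_prob, threshold=0.5):
    """Sensitivity, specificity, PPV and NPV at a given probability threshold."""
    pred = [1 if p >= threshold else 0 for p in y_prob]
    tp = sum(1 for t, q in zip(y_true, pred) if t == 1 and q == 1)
    tn = sum(1 for t, q in zip(y_true, pred) if t == 0 and q == 0)
    fp = sum(1 for t, q in zip(y_true, pred) if t == 0 and q == 1)
    fn = sum(1 for t, q in zip(y_true, pred) if t == 1 and q == 0)

    def ratio(a, b):
        return a / b if b else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Invented outputs of two hypothetical base learners for six patients.
y_true = [1, 1, 0, 1, 0, 0]
random_forest_prob = [0.9, 0.4, 0.2, 0.7, 0.1, 0.7]
lightgbm_prob = [0.7, 0.8, 0.2, 0.1, 0.3, 0.5]

ensemble_prob = average_ensemble([random_forest_prob, lightgbm_prob])
metrics = confusion_metrics(y_true, ensemble_prob)
print(metrics)
```

Raising the threshold trades sensitivity for specificity, which is one way a model can end up with high specificity and modest sensitivity as in the reported results.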
Affiliation(s)
- Ramtin Zargari Marandi
  - Centre of Excellence for Health, Immunity and Infections (CHIP), Rigshospitalet, Copenhagen University Hospital, Copenhagen, Denmark
- Preston Leung
  - Centre of Excellence for Health, Immunity and Infections (CHIP), Rigshospitalet, Copenhagen University Hospital, Copenhagen, Denmark
- Daniel Dawson Murray
  - Centre of Excellence for Health, Immunity and Infections (CHIP), Rigshospitalet, Copenhagen University Hospital, Copenhagen, Denmark
- Chaturaka Rodrigo
  - Viral Immunology Systems Program (VISP), Kirby Institute, UNSW Sydney, Sydney, Australia
- Cameron Ross MacPherson
  - Centre of Excellence for Health, Immunity and Infections (CHIP), Rigshospitalet, Copenhagen University Hospital, Copenhagen, Denmark
7
Durge AR, Shrimankar DD, Sawarkar AD. Heuristic Analysis of Genomic Sequence Processing Models for High Efficiency Prediction: A Statistical Perspective. Curr Genomics 2022; 23:299-317. [PMID: 36778194 PMCID: PMC9878859 DOI: 10.2174/1389202923666220927105311] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/25/2022] [Revised: 08/29/2022] [Accepted: 09/01/2022] [Indexed: 11/22/2022]
Abstract
Genome sequences indicate a wide variety of characteristics, which include species and sub-species type, genotype, diseases, growth indicators, yield quality, etc. To analyze and study the characteristics of the genome sequences across different species, various deep learning models have been proposed by researchers, such as Convolutional Neural Networks (CNNs), Deep Belief Networks (DBNs), Multilayer Perceptrons (MLPs), etc., which vary in terms of evaluation performance, area of application and species that are processed. Because the algorithmic implementations differ widely, it is difficult for research programmers to select the best possible genome processing model for their application. To facilitate this selection, the paper reviews a wide variety of such models and compares their performance in terms of accuracy, area of application, computational complexity, processing delay, precision and recall. Thus, in the present review, various deep learning and machine learning models have been presented that possess different accuracies for different applications. For multiple genomic data, Repeated Incremental Pruning to Produce Error Reduction with Support Vector Machine (Ripper SVM) achieves 99.7% accuracy, and for cancer genomic data, the CNN Bayesian method achieves 99.27% accuracy. For Covid genome analysis, Bidirectional Long Short-Term Memory with CNN (BiLSTM CNN) exhibits the highest accuracy of 99.95%. A similar analysis of precision and recall of different models has been reviewed. Finally, this paper concludes with some interesting observations related to the genomic processing models and recommends applications for their efficient use.
Affiliation(s)
- Aditi R. Durge
  - Department of Computer Science and Engineering, Visvesvaraya National Institute of Technology (VNIT), Nagpur, India
- Deepti D. Shrimankar
  - Department of Computer Science and Engineering, Visvesvaraya National Institute of Technology (VNIT), Nagpur, India
- Ankush D. Sawarkar
  - Department of Computer Science and Engineering, Visvesvaraya National Institute of Technology (VNIT), Nagpur, India
8
Ali A, Nisar S, Khan MA, Mohsan SAH, Noor F, Mostafa H, Marey M. A Privacy-Preserved Internet-of-Medical-Things Scheme for Eradication and Control of Dengue Using UAV. Micromachines 2022; 13:1702. [PMID: 36296055 PMCID: PMC9609698 DOI: 10.3390/mi13101702] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/09/2022] [Revised: 09/30/2022] [Accepted: 10/06/2022] [Indexed: 06/16/2023]
Abstract
Dengue is a mosquito-borne viral infection, found in tropical and sub-tropical climates worldwide, mostly in urban and semi-urban areas. Countries like Pakistan receive heavy rains annually, resulting in floods in urban cities due to poor drainage systems. Currently, different cities of Pakistan are at high risk of dengue outbreaks, as multiple dengue cases have been reported due to poor flood control and drainage systems. After heavy rain in urban areas, stagnant water left by poorly maintained drainage systems provides mosquitoes with a favorable environment for breeding and transmission. The history of the dengue virus in Pakistan shows that there is a close relationship between dengue outbreaks and rainfall. There is no specific treatment for dengue; however, the outbreak can be controlled through the internet of medical things (IoMT). In this paper, we propose a novel privacy-preserved IoMT model to control dengue virus outbreaks by tracking dengue virus-infected patients based on bedding location extracted using call data record analysis (CDRA). Once the bedding location of the patient is identified, the actual infected spot can be easily located by using geographic information system mapping. Once the targeted spots are identified, it is very easy to eliminate dengue by spraying the affected areas with the help of unmanned aerial vehicles (UAVs). The proposed model identifies the targeted spots up to 100%, based on the bedding location of the patient using CDRA.
Affiliation(s)
- Amir Ali
  - Military College of Signals (MCS), National University of Sciences and Technology, Islamabad 44000, Pakistan
- Shibli Nisar
  - Military College of Signals (MCS), National University of Sciences and Technology, Islamabad 44000, Pakistan
- Muhammad Asghar Khan
  - Department of Electrical Engineering, Hamdard University, Islamabad 44000, Pakistan
  - Smart Systems Engineering Laboratory, College of Engineering, Prince Sultan University, Riyadh 11586, Saudi Arabia
- Fazal Noor
  - Faculty of Computer and Information Systems, Islamic University of Madinah, Madinah 400411, Saudi Arabia
- Hala Mostafa
  - Department of Information Technology, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, P.O. Box 84428, Riyadh 11671, Saudi Arabia
- Mohamed Marey
  - Smart Systems Engineering Laboratory, College of Engineering, Prince Sultan University, Riyadh 11586, Saudi Arabia
9
Hoyos W, Aguilar J, Toro M. A clinical decision-support system for dengue based on fuzzy cognitive maps. Health Care Manag Sci 2022; 25:666-681. [PMID: 35971038 DOI: 10.1007/s10729-022-09611-6] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 01/19/2022] [Accepted: 07/28/2022] [Indexed: 01/18/2023]
Abstract
Dengue is a viral infection widely distributed in tropical and subtropical regions of the world. Dengue is characterized by high fatality rates when the diagnosis is not made promptly and effectively. To aid in the diagnosis of dengue, we propose a clinical decision-support system that classifies the clinical picture based on its severity, and using causal relationships evaluates the behavior of the clinical and laboratory variables that describe the signs and symptoms related to dengue. The system is based on a fuzzy cognitive map that is defined by the signs, symptoms and laboratory tests used in the conventional diagnosis of dengue. The evaluation of the model was performed on datasets of patients diagnosed with dengue to compare the model with other approaches. The developed model showed a good classification performance with 89.4% accuracy and could evaluate the behaviour of clinical and laboratory variables related to dengue severity (it is an explainable method). This model serves as a diagnostic aid for dengue that can be used by medical professionals in clinical settings.
Affiliation(s)
- William Hoyos
  - Grupo de Investigaciones Microbiológicas y Biomédicas de Córdoba, Universidad de Córdoba, Carrera 6 No 77-305, Montería, Colombia
  - Grupo de Investigación en I+D+i en TIC, Universidad EAFIT, Carrera 48 No 7Sur-50, Medellín, Colombia
- Jose Aguilar
  - Grupo de Investigación en I+D+i en TIC, Universidad EAFIT, Carrera 48 No 7Sur-50, Medellín, Colombia
  - Centro de Estudios en Microelectrónica y Sistemas Distribuidos, Universidad de Los Andes, Núcleo La Hechicera, Mérida, Venezuela
  - Departamento de Automática, Universidad de Alcalá, Alcalá de Henares, Spain
- Mauricio Toro
  - Grupo de Investigación en I+D+i en TIC, Universidad EAFIT, Carrera 48 No 7Sur-50, Medellín, Colombia
10
Shenoy S, Rajan AK, Rashid M, Chandran VP, Poojari PG, Kunhikatta V, Acharya D, Nair S, Varma M, Thunga G. Artificial intelligence in differentiating tropical infections: A step ahead. PLoS Negl Trop Dis 2022; 16:e0010455. [PMID: 35771774 PMCID: PMC9246149 DOI: 10.1371/journal.pntd.0010455] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 01/20/2022] [Accepted: 04/29/2022] [Indexed: 11/19/2022]
Abstract
Background and objective: Differentiating tropical infections is difficult because their clinical and laboratory presentations are similar. Sophisticated differential tests and prediction tools are better ways to tackle this issue. Here, we aimed to develop a clinician-assisted decision-making tool to differentiate the common tropical infections. Methodology: A cross-sectional study using a 9-item self-administered questionnaire was performed to understand the need for a decision-making tool and its parameters. The most significant differential parameters among the identified infections were measured through a retrospective study, and a decision tree was developed. Based on the parameters identified, a multinomial logistic regression model and a machine learning model were developed to better differentiate the infections. Results: A total of 40 physicians involved in the management of tropical infections were included in the need analysis. Dengue, malaria, leptospirosis and scrub typhus were the common tropical infections in our settings. Sodium, total bilirubin, albumin, lymphocytes and platelets were the laboratory parameters, and abdominal pain, arthralgia, myalgia and urine output were the clinical presentations, identified as better predictors. Multinomial logistic regression analysis with dengue as the reference revealed a predictability of 60.7%, 62.5% and 66% for dengue, malaria and leptospirosis, respectively, whereas scrub typhus showed only 38% predictability. The multi-class machine learning model was observed to have an overall predictability of 55–60%, whereas binary classification machine learning algorithms showed an average of 79–84% for one vs other and 69–88% for one vs one disease categories. Conclusion: This is a first-of-its-kind study where both statistical and machine learning approaches were explored simultaneously for differentiating tropical infections. Machine learning techniques in healthcare sectors will aid in early detection and better patient care.
Distinguishing tropical infections is difficult because their clinical and laboratory presentations are similar. This is a first-of-its-kind study where both statistical and machine learning approaches were explored simultaneously for differentiating tropical infections. Dengue, malaria, leptospirosis and scrub typhus were the common tropical infections in our settings as per the need analysis. Better predictors in terms of laboratory parameters and clinical presentations were identified from retrospective analysis and used for the regression and machine learning models. Parameters such as accuracy, true positive rate/sensitivity/recall, false positive rate, precision/positive predictive value, F-measure and ROC area were calculated for both the training and validation sets (10-fold cross validation) for all modelling approaches and diseases (One vs One and One vs others). All the models were observed to have an acceptable range of model performance in differentiating tropical infections. Albumin can be considered the main parameter in differentiating these tropical infections. These models should be implemented in daily clinical routine practice via mobile or desktop assisted applications or tools.
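The "one vs other" and "one vs one" binary setups named in the abstract can be made concrete with a small sketch that builds both kinds of task from the four infections. The patient labels are invented and the helper names are hypothetical; no classifier is trained here.

```python
from itertools import combinations

diseases = ["dengue", "malaria", "leptospirosis", "scrub typhus"]


def one_vs_rest_labels(labels, positive):
    """Binary target: 1 for the chosen disease, 0 for every other disease."""
    return [1 if label == positive else 0 for label in labels]


def one_vs_one_subset(labels, first, second):
    """Keep only patients with one of two diseases; 1 marks the first disease."""
    return [
        (index, 1 if label == first else 0)
        for index, label in enumerate(labels)
        if label in (first, second)
    ]


# Invented diagnoses for six hypothetical patients.
patient_labels = ["dengue", "malaria", "dengue", "scrub typhus", "leptospirosis", "malaria"]

ovr_tasks = {d: one_vs_rest_labels(patient_labels, d) for d in diseases}
ovo_tasks = {pair: one_vs_one_subset(patient_labels, *pair) for pair in combinations(diseases, 2)}
print(len(ovr_tasks), len(ovo_tasks))
```

With four diseases this yields four one-vs-rest tasks and six one-vs-one tasks. Each binary task is easier than the four-way problem, which is consistent with the higher figures reported for the binary classifiers.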
Affiliation(s)
- Shreelaxmi Shenoy
  - Department of Pharmacy Practice, Manipal College of Pharmaceutical Sciences, Manipal Academy of Higher Education, Manipal, India
- Asha K. Rajan
  - Department of Pharmacy Practice, Manipal College of Pharmaceutical Sciences, Manipal Academy of Higher Education, Manipal, India
- Muhammed Rashid
  - Department of Pharmacy Practice, Manipal College of Pharmaceutical Sciences, Manipal Academy of Higher Education, Manipal, India
- Viji Pulikkel Chandran
  - Department of Pharmacy Practice, Manipal College of Pharmaceutical Sciences, Manipal Academy of Higher Education, Manipal, India
- Pooja Gopal Poojari
  - Department of Pharmacy Practice, Manipal College of Pharmaceutical Sciences, Manipal Academy of Higher Education, Manipal, India
- Vijayanarayana Kunhikatta
  - Department of Pharmacy Practice, Manipal College of Pharmaceutical Sciences, Manipal Academy of Higher Education, Manipal, India
- Dinesh Acharya
  - Department of Computer Science & Engineering, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, India
- Sreedharan Nair
  - Department of Pharmacy Practice, Manipal College of Pharmaceutical Sciences, Manipal Academy of Higher Education, Manipal, India
- Muralidhar Varma
  - Department of Infectious Diseases, Kasturba Medical College, Manipal Academy of Higher Education, Manipal, India
- Girish Thunga
  - Department of Pharmacy Practice, Manipal College of Pharmaceutical Sciences, Manipal Academy of Higher Education, Manipal, India
11
Liu YE, Saul S, Rao AM, Robinson ML, Agudelo Rojas OL, Sanz AM, Verghese M, Solis D, Sibai M, Huang CH, Sahoo MK, Gelvez RM, Bueno N, Estupiñan Cardenas MI, Villar Centeno LA, Rojas Garrido EM, Rosso F, Donato M, Pinsky BA, Einav S, Khatri P. An 8-gene machine learning model improves clinical prediction of severe dengue progression. Genome Med 2022; 14:33. [PMID: 35346346 PMCID: PMC8959795 DOI: 10.1186/s13073-022-01034-w] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 10/23/2021] [Accepted: 02/24/2022] [Indexed: 02/06/2023]
Abstract
BACKGROUND Each year 3-6 million people develop life-threatening severe dengue (SD). Clinical warning signs for SD manifest late in the disease course and are nonspecific, leading to missed cases and excess hospital burden. Better SD prognostics are urgently needed. METHODS We integrated 11 public datasets profiling the blood transcriptome of 365 dengue patients of all ages and from seven countries, encompassing biological, clinical, and technical heterogeneity. We performed an iterative multi-cohort analysis to identify differentially expressed genes (DEGs) between non-severe patients and SD progressors. Using only these DEGs, we trained an XGBoost machine learning model on public data to predict progression to SD. All model parameters were "locked" prior to validation in an independent, prospectively enrolled cohort of 377 dengue patients in Colombia. We measured expression of the DEGs in whole blood samples collected upon presentation, prior to SD progression. We then compared the accuracy of the locked XGBoost model and clinical warning signs in predicting SD. RESULTS We identified eight SD-associated DEGs in the public datasets and built an 8-gene XGBoost model that accurately predicted SD progression in the independent validation cohort with 86.4% (95% CI 68.2-100) sensitivity and 79.7% (95% CI 75.5-83.9) specificity. Given the 5.8% proportion of SD cases in this cohort, the 8-gene model had a positive and negative predictive value (PPV and NPV) of 20.9% (95% CI 16.7-25.6) and 99.0% (95% CI 97.7-100.0), respectively. Compared to clinical warning signs at presentation, which had 77.3% (95% CI 58.3-94.1) sensitivity and 39.7% (95% CI 34.7-44.9) specificity, the 8-gene model led to an 80% reduction in the number needed to predict (NNP) from 25.4 to 5.0. Importantly, the 8-gene model accurately predicted subsequent SD in the first three days post-fever onset and up to three days prior to SD progression. 
CONCLUSIONS The 8-gene XGBoost model, trained on heterogeneous public datasets, accurately predicted progression to SD in a large, independent, prospective cohort, including during the early febrile stage when SD prediction remains clinically difficult. The model has potential to be translated to a point-of-care prognostic assay to reduce dengue morbidity and mortality without overwhelming limited healthcare resources.
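The predictive values quoted in the abstract above follow from sensitivity, specificity, and the 5.8% proportion of SD cases. A minimal Python sketch of that arithmetic is given here. The function names are illustrative, and the number needed to predict is assumed to follow the common definition NNP = 1 / (PPV + NPV - 1), which the abstract itself does not state. Because the inputs are rounded percentages, the outputs only approximate the reported figures.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from test characteristics via Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


def number_needed_to_predict(ppv, npv):
    """Assumed definition (not stated in the abstract): 1 / (PPV + NPV - 1)."""
    return 1 / (ppv + npv - 1)


PREVALENCE = 0.058  # proportion of SD cases in the validation cohort

# 8-gene model: 86.4% sensitivity, 79.7% specificity
gene_ppv, gene_npv = predictive_values(0.864, 0.797, PREVALENCE)
gene_nnp = number_needed_to_predict(gene_ppv, gene_npv)

# Clinical warning signs: 77.3% sensitivity, 39.7% specificity
ws_ppv, ws_npv = predictive_values(0.773, 0.397, PREVALENCE)
ws_nnp = number_needed_to_predict(ws_ppv, ws_npv)

print(f"8-gene model:  PPV={gene_ppv:.3f} NPV={gene_npv:.3f} NNP={gene_nnp:.1f}")
print(f"warning signs: PPV={ws_ppv:.3f} NPV={ws_npv:.3f} NNP={ws_nnp:.1f}")
```

The low PPV alongside a very high NPV is what one would expect at a prevalence this low: most positives are false positives even for a reasonably specific test, while negatives are almost always correct.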
Affiliation(s)
- Yiran E. Liu
- Institute for Immunity, Transplantation and Infection, School of Medicine, Stanford University, Stanford, CA, USA; Cancer Biology Graduate Program, School of Medicine, Stanford University, Stanford, CA, USA; Division of Infectious Diseases and Geographic Medicine, Department of Medicine, School of Medicine, Stanford University, Stanford, CA, USA
- Sirle Saul
- Division of Infectious Diseases and Geographic Medicine, Department of Medicine, School of Medicine, Stanford University, Stanford, CA, USA
- Aditya Manohar Rao
- Institute for Immunity, Transplantation and Infection, School of Medicine, Stanford University, Stanford, CA, USA; Immunology Graduate Program, School of Medicine, Stanford University, Stanford, CA, USA
- Makeda Lucretia Robinson
- Division of Infectious Diseases and Geographic Medicine, Department of Medicine, School of Medicine, Stanford University, Stanford, CA, USA; Department of Pathology, School of Medicine, Stanford University, Stanford, CA, USA
- Ana Maria Sanz
- Clinical Research Center, Fundación Valle del Lili, Cali, Colombia
- Michelle Verghese
- Department of Pathology, School of Medicine, Stanford University, Stanford, CA, USA
- Daniel Solis
- Department of Pathology, School of Medicine, Stanford University, Stanford, CA, USA
- Mamdouh Sibai
- Department of Pathology, School of Medicine, Stanford University, Stanford, CA, USA
- Chun Hong Huang
- Department of Pathology, School of Medicine, Stanford University, Stanford, CA, USA
- Malaya Kumar Sahoo
- Department of Pathology, School of Medicine, Stanford University, Stanford, CA, USA
- Rosa Margarita Gelvez
- Centro de Atención y Diagnóstico de Enfermedades Infecciosas (CDI), Bucaramanga, Colombia
- Nathalia Bueno
- Centro de Atención y Diagnóstico de Enfermedades Infecciosas (CDI), Bucaramanga, Colombia
- Fernando Rosso
- Clinical Research Center, Fundación Valle del Lili, Cali, Colombia; Division of Infectious Diseases, Department of Internal Medicine, Fundación Valle del Lili, Cali, Colombia
- Michele Donato
- Institute for Immunity, Transplantation and Infection, School of Medicine, Stanford University, Stanford, CA, USA; Center for Biomedical Informatics Research, Department of Medicine, School of Medicine, Stanford University, Stanford, CA, USA
- Benjamin A. Pinsky
- Division of Infectious Diseases and Geographic Medicine, Department of Medicine, School of Medicine, Stanford University, Stanford, CA, USA; Department of Pathology, School of Medicine, Stanford University, Stanford, CA, USA
- Shirit Einav
- Division of Infectious Diseases and Geographic Medicine, Department of Medicine, School of Medicine, Stanford University, Stanford, CA, USA; Department of Microbiology and Immunology, School of Medicine, Stanford University, Stanford, CA, USA
- Purvesh Khatri
- Institute for Immunity, Transplantation and Infection, School of Medicine, Stanford University, Stanford, CA, USA; Center for Biomedical Informatics Research, Department of Medicine, School of Medicine, Stanford University, Stanford, CA, USA
|
12
|
Hung SJ, Tsai HP, Wang YF, Ko WC, Wang JR, Huang SW. Assessment of the Risk of Severe Dengue Using Intrahost Viral Population in Dengue Virus Serotype 2 Patients via Machine Learning. Front Cell Infect Microbiol 2022; 12:831281. [PMID: 35223554 PMCID: PMC8866709 DOI: 10.3389/fcimb.2022.831281] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/08/2021] [Accepted: 01/10/2022] [Indexed: 11/13/2022] Open
Abstract
Dengue virus, a positive-sense single-stranded RNA virus, continuously threatens human health. Although several criteria for evaluating severe dengue have recently been established, the ability to predict the risk of severe outcomes in dengue patients remains limited. Mutant spectra of RNA viruses, including single nucleotide variants (SNVs) and defective virus genomes (DVGs), contribute to viral virulence and growth. Here, we assess the contribution of the intrahost viral population to progression to severe dengue in patients with primary infection. A total of 65 patients with primary dengue virus serotype 2 infection, including 17 severe cases, were enrolled. We used deep sequencing to directly determine the frequencies of SNVs and the detection times of DVGs in patient sera and analyzed their associations with severe dengue. Among the detected SNVs and DVGs, the frequencies of 9 SNVs and the detection time of 1 DVG differed significantly between patients with dengue fever and those with severe dengue. Using the detected frequencies/times of the selected SNVs/DVG as features, the machine learning model achieved a high average area under the receiver operating characteristic curve (AUROC, 0.966 ± 0.064). Elevated frequencies of SNVs in E (nucleotide positions 995 and 2216), NS2A (nucleotide position 4105), NS3 (nucleotide positions 4536 and 4606), and NS5 (nucleotide positions 7643 and 10067), together with the detection times of the selected DVG, which had a deletion junction in the E protein region (junction between nucleotide positions 969 and 1022), increased the likelihood of severe dengue. In summary, we showed that the detected frequencies/times of SNVs/DVG in dengue patients are associated with severe disease and successfully used them to discriminate severe cases with a machine learning algorithm.
The identified SNVs and DVGs that are associated with severe dengue will expand our understanding of intrahost viral population in dengue pathogenesis.
Affiliation(s)
- Su-Jhen Hung
- National Mosquito-Borne Diseases Control Research Center, National Health Research Institutes, Tainan, Taiwan
- Huey-Pin Tsai
- Department of Pathology, National Cheng Kung University Hospital, College of Medicine, National Cheng Kung University, Tainan, Taiwan
- Department of Medical Laboratory Science and Biotechnology, College of Medicine, National Cheng Kung University, Tainan, Taiwan
- Ya-Fang Wang
- National Institute of Infectious Diseases and Vaccinology, National Health Research Institutes, Tainan, Taiwan
- Wen-Chien Ko
- Department of Internal Medicine, National Cheng Kung University Hospital, College of Medicine, National Cheng Kung University, Tainan, Taiwan
- Department of Medicine, College of Medicine, National Cheng Kung University, Tainan, Taiwan
- Jen-Ren Wang
- Department of Pathology, National Cheng Kung University Hospital, College of Medicine, National Cheng Kung University, Tainan, Taiwan
- Department of Medical Laboratory Science and Biotechnology, College of Medicine, National Cheng Kung University, Tainan, Taiwan
- National Institute of Infectious Diseases and Vaccinology, National Health Research Institutes, Tainan, Taiwan
- Center of Infectious Disease and Signaling Research, National Cheng Kung University, Tainan, Taiwan
- Sheng-Wen Huang
- National Mosquito-Borne Diseases Control Research Center, National Health Research Institutes, Tainan, Taiwan
- *Correspondence: Sheng-Wen Huang,
|
13
|
Swain S, Bhushan B, Dhiman G, Viriyasitavat W. Appositeness of Optimized and Reliable Machine Learning for Healthcare: A Survey. ARCHIVES OF COMPUTATIONAL METHODS IN ENGINEERING : STATE OF THE ART REVIEWS 2022; 29:3981-4003. [PMID: 35342282 PMCID: PMC8939887 DOI: 10.1007/s11831-022-09733-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/05/2021] [Accepted: 02/09/2022] [Indexed: 05/04/2023]
Abstract
Machine Learning (ML) is a branch of Artificial Intelligence (AI) within Computer Science in which programmable machines imitate human learning behavior with the help of statistical methods and data. The healthcare industry is one of the largest and busiest sectors in the world and relies on an extensive amount of manual work at every stage. Most clinical documents concerning patient care are hand-written by experts; only selected reports are machine-generated. This process raises the chance of misdiagnosis, thereby putting patients' lives at risk. Recent efforts to automate manual operations have made extensive use of ML. The paper surveys the applicability of ML approaches to automating medical systems and discusses most of the optimized statistical ML frameworks that support better delivery of clinical services. The widespread adoption of Deep Learning (DL) and ML techniques as the underlying systems for a variety of wellness applications is marked by challenges and accompanied by a myriad of security concerns. This work attempts to identify a variety of vulnerabilities that arise in medical procurement, acknowledging concerns over predictive performance from a privacy point of view. Finally, it provides possible risk-limiting measures and directions for addressing open challenges in the future.
Affiliation(s)
- Subhasmita Swain
- Department of Computer Science and Engineering, School of Engineering and Technology, Sharda University, Greater Noida, India
- Bharat Bhushan
- Department of Computer Science and Engineering, School of Engineering and Technology, Sharda University, Greater Noida, India
- Gaurav Dhiman
- Department of Computer Science, Government Bikram College of Commerce, Patiala, India
- University Centre for Research and Development, Department of Computer Science and Engineering, Chandigarh University, Gharuan, Mohali, India
- Department of Computer Science and Engineering, Graphic Era Deemed to be University, Dehradun, India
- Wattana Viriyasitavat
- Department of Statistics, Faculty of Commerce and Accountancy, Chulalongkorn Business School, Bangkok, Thailand
|
14
|
Data-driven methods for dengue prediction and surveillance using real-world and Big Data: A systematic review. PLoS Negl Trop Dis 2022; 16:e0010056. [PMID: 34995281 PMCID: PMC8740963 DOI: 10.1371/journal.pntd.0010056] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 04/30/2021] [Accepted: 12/06/2021] [Indexed: 12/23/2022] Open
Abstract
Background Traditionally, dengue surveillance is based on case reporting to a central health agency. However, the delay between a case and its notification can limit the system's responsiveness. Machine learning methods have been developed to reduce reporting delays and to predict outbreaks, based on non-traditional and non-clinical data sources. The aim of this systematic review was to identify studies that used real-world data, Big Data and/or machine learning methods to monitor and predict dengue-related outcomes. Methodology/Principal findings We performed a search in PubMed, Scopus, Web of Science and grey literature between January 1, 2000 and August 31, 2020. The review (ID: CRD42020172472) focused on data-driven studies. Reviews, randomized controlled trials and descriptive studies were not included. Among the 119 studies included, 67% were published between 2016 and 2020, and 39% used at least one novel data stream. The aim of the included studies was to predict a dengue-related outcome (55%), assess the validity of data sources for dengue surveillance (23%), or both (22%). Most studies (60%) used a machine learning approach. Studies on dengue prediction compared different prediction models, or identified significant predictors among several covariates in a model. The most significant predictors were rainfall (43%), temperature (41%), and humidity (25%). The two models with the highest performances were Neural Networks and Decision Trees (52%), followed by Support Vector Machine (17%). We cannot rule out a selection bias in our study because of our two main limitations: we did not include preprints and could not obtain the opinion of other international experts. Conclusions/Significance Combining real-world data and Big Data with machine learning methods is a promising approach to improve dengue prediction and monitoring. Future studies should focus on how to better integrate all available data sources and methods to improve the response and dengue management by stakeholders.

Dengue is one of the most important arbovirus infections in the world and its public health, societal and economic burden is increasing. Although the majority of dengue cases are asymptomatic or mild, severe disease forms can lead to death. For this reason, early diagnosis and monitoring of dengue are crucial to decrease mortality. However, most endemic regions still rely on traditional monitoring methods, despite the growing availability of novel data sources and data-driven methods based on real-world data, Big Data, and machine learning algorithms. In this systematic review, we identified and analyzed studies that used these novel approaches for dengue monitoring and/or prediction. We found that novel data streams, such as Internet search engines and social media platforms, and machine learning methods can be successfully used to improve dengue management, but are still largely ignored in practice. These approaches should be combined with traditional methods to help stakeholders better prepare for each outbreak and improve early responsiveness.
|
15
|
Hoyos W, Aguilar J, Toro M. Dengue models based on machine learning techniques: A systematic literature review. Artif Intell Med 2021; 119:102157. [PMID: 34531010 DOI: 10.1016/j.artmed.2021.102157] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 11/19/2020] [Revised: 05/08/2021] [Accepted: 08/17/2021] [Indexed: 12/16/2022]
Abstract
BACKGROUND Dengue modeling is a research topic that has grown in recent years. Early prediction and decision-making are key factors in controlling dengue. This Systematic Literature Review (SLR) analyzes three modeling approaches to dengue: diagnostic, epidemic, and intervention. These approaches require models of prediction, prescription and optimization. This SLR establishes the state of the art in dengue modeling using machine learning in recent years. METHODS Several databases were searched for articles. The selection followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) methodology. Sixty-four articles were obtained and analyzed to describe their strengths and limitations. Finally, challenges and opportunities for research on machine learning for dengue modeling were identified. RESULTS Logistic regression was the most used modeling approach for the diagnosis of dengue (59.1%). The analysis of the epidemic approach showed that linear regression (17.4%) is the most used technique within spatial analysis. Finally, the most used intervention modeling technique is the General Linear Model (70%). CONCLUSIONS We conclude that cause-effect models may improve diagnosis and understanding of dengue. Models that manage uncertainty can also be helpful, because of low data quality in healthcare. Finally, decentralization of data, using federated learning, may decrease computational costs and allow model building without compromising data security.
Affiliation(s)
- William Hoyos
- Grupo de Investigaciones Microbiológicas y Biomédicas de Córdoba, Universidad de Córdoba, Montería, Colombia; Grupo de Investigación en I+D+i en TIC, Universidad EAFIT, Medellín, Colombia
- Jose Aguilar
- Grupo de Investigación en I+D+i en TIC, Universidad EAFIT, Medellín, Colombia; Centro de Estudios en Microelectrónica y Sistemas Distribuidos, Universidad de Los Andes, Mérida, Venezuela; Universidad de Alcalá, Depto. de Automática, Alcalá de Henares, Spain
- Mauricio Toro
- Grupo de Investigación en I+D+i en TIC, Universidad EAFIT, Medellín, Colombia
|
16
|
Software Defect Prediction for Healthcare Big Data: An Empirical Evaluation of Machine Learning Techniques. JOURNAL OF HEALTHCARE ENGINEERING 2021; 2021:8899263. [PMID: 33815733 PMCID: PMC7987450 DOI: 10.1155/2021/8899263] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 09/09/2020] [Revised: 09/29/2020] [Accepted: 02/24/2021] [Indexed: 01/02/2023]
Abstract
Software defect prediction (SDP) in the initial phase of the software development life cycle (SDLC) remains a critical and important task. SDP has been studied extensively over the last few decades because it helps assure the quality of software systems. Early prediction of defective artifacts in software development helps the development team use existing resources efficiently and effectively to deliver high-quality software products within a limited time. Previously, several researchers have developed defect prediction models using machine learning (ML) and statistical techniques. ML methods are considered an effective and practical approach to pinpointing defective modules by mining hidden patterns among software metrics (attributes). ML techniques have also been used by several researchers on healthcare datasets. This study applies different ML techniques to software defect prediction using seven widely used datasets. The ML techniques include the multilayer perceptron (MLP), support vector machine (SVM), decision tree (J48), radial basis function (RBF), random forest (RF), hidden Markov model (HMM), credal decision tree (CDT), K-nearest neighbor (KNN), average one dependency estimator (A1DE), and Naïve Bayes (NB). The performance of each technique is evaluated using different measures, namely relative absolute error (RAE), mean absolute error (MAE), root mean squared error (RMSE), root relative squared error (RRSE), recall, and accuracy. Overall, RF performs best, with 88.32% average accuracy and a rank value of 2.96; SVM achieves the second-best performance, with 87.99% average accuracy and a rank value of 3.83. CDT is placed third, with 87.88% average accuracy and a rank value of 3.62.
The comprehensive results of this research can serve as a reference point for new research in the SDP domain, so that any claimed improvement in prediction from a new technique or model can be benchmarked and verified.
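The four error measures named in the abstract above (MAE, RMSE, RAE, RRSE) can be computed as in the following minimal Python sketch. It assumes the standard textbook definitions, in which RAE and RRSE compare the model's error with that of always predicting the mean of the actual values; the study's own tooling is not described here, and the sample values are invented purely for illustration.

```python
import math


def error_metrics(actual, predicted):
    """Compute MAE, RMSE, RAE and RRSE for paired actual/predicted values."""
    n = len(actual)
    mean_actual = sum(actual) / n
    abs_err = sum(abs(p - a) for a, p in zip(actual, predicted))
    sq_err = sum((p - a) ** 2 for a, p in zip(actual, predicted))
    # Error of the baseline that always predicts the mean of the actual values
    abs_dev = sum(abs(a - mean_actual) for a in actual)
    sq_dev = sum((a - mean_actual) ** 2 for a in actual)
    return {
        "MAE": abs_err / n,
        "RMSE": math.sqrt(sq_err / n),
        "RAE": abs_err / abs_dev,            # multiply by 100 for a percentage
        "RRSE": math.sqrt(sq_err / sq_dev),  # multiply by 100 for a percentage
    }


# Invented example: 1 = defective module, 0 = clean module
actual = [1, 0, 1, 1]
predicted = [0.9, 0.2, 0.8, 0.6]  # hypothetical predicted probabilities

metrics = error_metrics(actual, predicted)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

RAE and RRSE below 1 (below 100% when expressed as a percentage) indicate that the model beats the mean-prediction baseline.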
|
17
|
Araujo-Mariz C, Militão de Albuquerque MDFP, Lopes EP, Ximenes RAA, Lacerda HR, Miranda-Filho DB, Lustosa-Martins BB, Pastor AFP, Acioli-Santos B. Hepatotoxicity during TB treatment in people with HIV/AIDS related to NAT2 polymorphisms in Pernambuco, Northeast Brazil. Ann Hepatol 2021; 19:153-160. [PMID: 31734174 DOI: 10.1016/j.aohep.2019.09.008] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 09/30/2018] [Revised: 09/02/2019] [Accepted: 09/14/2019] [Indexed: 02/04/2023]
Abstract
INTRODUCTION AND OBJECTIVE Hepatotoxicity during tuberculosis (TB) treatment is frequent and may be related to the Arylamine N-Acetyltransferase (NAT2) acetylator profile, in which allele frequencies differ according to the population. The aim of this study was to investigate functional polymorphisms in NAT2 associated with the development of hepatotoxicity after initiating treatment for TB in people living with HIV/AIDS (PLWHA) in Pernambuco, Northeast Brazil. MATERIAL AND METHODS This was a prospective cohort study that investigated seven single nucleotide polymorphisms located in the NAT2 coding region in 173 PLWHA undergoing TB treatment. Hepatotoxicity was defined as aminotransferase levels three times higher than before initiating TB treatment, with associated symptoms of hepatitis. A further 80 healthy subjects, without HIV infection or TB, were used as a control group. All individuals were genotyped by direct sequencing. RESULTS The NAT2*13A and NAT2*6B variant alleles were significantly associated with the development of hepatotoxicity during TB treatment in PLWHA (p < 0.05). Individual comparisons between the wild type and each variant genotype revealed that PLWHA with the genotypes NAT2*13A/NAT2*13A (OR 4.4; 95% CI 1.1-18.8; p = 0.037) and NAT2*13A/NAT2*6B (OR 4.4; 95% CI 1.5-12.7; p = 0.005) had a significantly increased risk of hepatotoxicity. CONCLUSION This study suggests that NAT2*13A and NAT2*6B variant alleles are risk factors for developing hepatotoxicity, and PLWHA with genotypes NAT2*13A/NAT2*13A and NAT2*13A/NAT2*6B should be targeted for specific care to reduce the risk of hepatotoxicity during treatment for tuberculosis.
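For readers unfamiliar with how an odds ratio and its 95% confidence interval are obtained from a 2x2 table, here is a minimal Python sketch using the Wald (Woolf) interval on the log odds ratio. This is an assumption for illustration: the abstract does not state which method the authors used, and the counts below are made up and are not taken from the study.

```python
import math


def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald (Woolf) confidence interval for a 2x2 table.

    a: cases with the genotype,    b: non-cases with the genotype
    c: cases without the genotype, d: non-cases without the genotype
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Made-up counts for illustration only; they are NOT taken from the study.
or_value, ci_low, ci_high = odds_ratio_with_ci(a=10, b=20, c=5, d=40)
print(f"OR {or_value:.1f}; 95% CI {ci_low:.1f}-{ci_high:.1f}")
```

An interval whose lower bound stays above 1, as in the genotype comparisons reported above, is what makes the increased risk statistically significant at the 5% level.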
Affiliation(s)
- Carolline Araujo-Mariz
- Departamento de Medicina Tropical, Universidade Federal de Pernambuco, Recife, PE, Brazil
- Edmundo P Lopes
- Departamento de Medicina Tropical, Universidade Federal de Pernambuco, Recife, PE, Brazil
- Ricardo A A Ximenes
- Departamento de Medicina Tropical, Universidade Federal de Pernambuco, Recife, PE, Brazil
- Heloísa R Lacerda
- Departamento de Medicina Tropical, Universidade Federal de Pernambuco, Recife, PE, Brazil
- André Filipe P Pastor
- Instituto Federal de Educação, Ciência e Tecnologia do Sertão Pernambucano/IFSertão, Floresta, PE, Brazil
|
18
|
Performance Assessment of Classification Algorithms on Early Detection of Liver Syndrome. JOURNAL OF HEALTHCARE ENGINEERING 2021; 2020:6680002. [PMID: 33489060 PMCID: PMC7787853 DOI: 10.1155/2020/6680002] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 10/03/2020] [Revised: 11/18/2020] [Accepted: 11/25/2020] [Indexed: 11/17/2022]
Abstract
In recent times, liver disease that impairs vital function has become extremely common throughout the world. Liver disease has been found to be more prevalent in young people than in other age groups. Once liver function fails, survival is limited to barely one or two days, and it is very hard to predict such illness at an early stage. Researchers are trying to develop models for early prediction of liver disease using various machine learning approaches. This study compares ten classifiers, namely A1DE, NB, MLP, SVM, KNN, CHIRP, CDT, Forest-PA, J48, and RF, to find the optimal solution for early and accurate prediction of liver disease. The datasets used in this study are taken from the UCI ML repository and the GitHub repository. The outcomes are assessed via RMSE, RRSE, recall, specificity, precision, G-measure, F-measure, MCC, and accuracy. The experimental results show that RF performs best on the UCI dataset: its RMSE and RRSE are 0.4328 and 87.6766, and its accuracy of 72.1739% is also better than that of the other classifiers. On the GitHub dataset, however, SVM outperforms the other techniques, with an accuracy of up to 71.3551%. Moreover, the comprehensive results of this study can serve as a reference point for further research, so that any claimed improvement in prediction from a new technique, model, or framework can be benchmarked and confirmed.
|
19
|
Xiao W, Xin L, Gao S, Peng Y, Luo J, Yao W, Ribeiro R, Xu Z, Zhang Z, Liu Y, Li J, Badiwala M, Sun Y. Single-Beat Measurement of Left Ventricular Contractility in Normothermic Ex Situ Perfused Porcine Hearts. IEEE Trans Biomed Eng 2020; 67:3288-3295. [DOI: 10.1109/tbme.2020.2982655] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 11/06/2022]
|
20
|
de Jong W, Asmarawati TP, Verbeek I, Rusli M, Hadi U, van Gorp E, Goeijenbier M. Point-of-care thrombocyte function testing using multiple-electrode aggregometry in dengue patients: an explorative study. BMC Infect Dis 2020; 20:580. [PMID: 32762658 PMCID: PMC7409667 DOI: 10.1186/s12879-020-05248-4] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 11/21/2019] [Accepted: 07/14/2020] [Indexed: 12/13/2022] Open
Abstract
Background Dengue virus (DENV) causes the hospitalisation of an estimated 500,000 people every year. Outbreaks can severely stress healthcare systems, especially in rural settings. It is difficult to discriminate patients who need to be hospitalized from those who do not. Earlier work identified thrombocyte count and subsequent function as a promising prognostic marker of DENV severity. Herein, we investigated the potential of quantitative thrombocyte function tests in those admitted in the very early phase of acute DENV infections, using Multiplate™ multiple-electrode aggregometry to explore its potential in triage. Methods In this prospective cohort study, all patients aged ≥13 admitted to Universitas Airlangga Hospital in Surabaya, Indonesia with a fever (≥38 °C) between 25 January and 1 August 2018 and with a clinical suspicion of DENV were eligible for inclusion. Exclusion criteria were a thrombocyte count below 100 × 10^9/L and the use of any medication with a known anticoagulant effect, nonsteroidal anti-inflammatory drugs and acetyl salicylic acid. Clinical data were collected and blood was taken on admission, day 1 and day 7. Samples were tested for acute DENV using Panbio NS1 ELISA. Platelet aggregation results from the ADP, TRAP and COL tests were presented as Area Under the aggregation Curve (AUC). Significance was tested between DENV+, probable DENV, fever of another origin, and healthy controls (HC). Results A total of 59 patients (DENV+ n = 10, DENV probable n = 25, fever other origin n = 24) and 20 HC were included. We found a significantly lower thrombocyte aggregation in the DENV+ group, compared with both HCs and the fever of another origin group (p < .001). Low ADP AUC values at baseline correlated with a longer hospital stay in DENV+ and probable DENV cases.
Conclusion Thrombocyte aggregation induced by Adenosine diphosphate, Collagen and Thrombin receptor activating peptide-6 is impaired in human DENV cases, compared with healthy controls and other causes of fever. This explorative study provides insights to thrombocyte function in DENV patients and could potentially serve as a future marker in DENV disease.
Affiliation(s)
- Wesley de Jong
- Department of Viroscience, Erasmus MC, Rotterdam, the Netherlands
- Tri Pudy Asmarawati
- Department of Internal Medicine, Universitas Airlangga Hospital, Airlangga University, Surabaya, Indonesia
- Inge Verbeek
- Department of Viroscience, Erasmus MC, Rotterdam, the Netherlands
- Musofa Rusli
- Department of infectious diseases, Rumah Sakit Umum Daerah Dr Soetomo, Airlangga University, Surabaya, Indonesia
- Usman Hadi
- Department of infectious diseases, Rumah Sakit Umum Daerah Dr Soetomo, Airlangga University, Surabaya, Indonesia
- Eric van Gorp
- Department of Viroscience, Erasmus MC, Rotterdam, the Netherlands; Department of internal medicine, Erasmus MC, Rotterdam, the Netherlands
- Marco Goeijenbier
- Department of Viroscience, Erasmus MC, Rotterdam, the Netherlands; Department of internal medicine, Erasmus MC, Rotterdam, the Netherlands
|
21
|
Agany DD, Pietri JE, Gnimpieba EZ. Assessment of vector-host-pathogen relationships using data mining and machine learning. Comput Struct Biotechnol J 2020; 18:1704-1721. [PMID: 32670510 PMCID: PMC7340972 DOI: 10.1016/j.csbj.2020.06.031] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 04/02/2020] [Revised: 06/19/2020] [Accepted: 06/19/2020] [Indexed: 12/15/2022] Open
Abstract
Infectious diseases, including vector-borne diseases transmitted by arthropods, are a leading cause of morbidity and mortality worldwide. In the era of big data, addressing broad-scale, fundamental questions regarding the complex dynamics of these diseases will increasingly require the integration of diverse datasets to produce new biological knowledge. This review provides a current snapshot of the systematic assessment of the relationships between microbial pathogens, arthropod vectors and mammalian hosts using data mining and machine learning. We employ PRISMA to identify 32 key papers relevant to this topic. Our analysis shows an increasing use of data mining and machine learning tasks and techniques, including prediction, classification, clustering, association rules mining, and deep learning, over the last decade. However, it also reveals a number of critical challenges in applying these to the study of vector-host-pathogen interactions at various systems biology levels. Here, relevant studies, current limitations and future directions are discussed. Furthermore, the quality of data in relevant papers was assessed using the FAIR (Findable, Accessible, Interoperable, Reusable) compliance criteria to evaluate and encourage reproducibility and shareability of research outcomes. Although shortcomings in their application remain, data mining and machine learning have significant potential to break new ground in understanding fundamental aspects of vector-host-pathogen relationships and their application in this field should be encouraged. In particular, while predictive modeling, feature engineering and supervised machine learning are already being used in the field, other data mining and machine learning methods such as deep learning and association rules analysis lag behind and should be implemented in combination with established methods to accelerate hypothesis and knowledge generation in the domain.
Affiliation(s)
- Diing D.M. Agany
- University of South Dakota, Biomedical Engineering Program, Sioux Falls, SD, United States
- 2DBEST (2-Dimensional Materials for Biofilm Engineering, Science and Technology), United States
- Jose E. Pietri
- University of South Dakota, Sanford School of Medicine, Division of Basic Biomedical Sciences, Vermillion, SD, United States
- Etienne Z. Gnimpieba
- University of South Dakota, Biomedical Engineering Program, Sioux Falls, SD, United States
- 2DBEST (2-Dimensional Materials for Biofilm Engineering, Science and Technology), United States
| |
Collapse
|
22
|
Sippy R, Farrell DF, Lichtenstein DA, Nightingale R, Harris MA, Toth J, Hantztidiamantis P, Usher N, Cueva Aponte C, Barzallo Aguilar J, Puthumana A, Lupone CD, Endy T, Ryan SJ, Stewart Ibarra AM. Severity Index for Suspected Arbovirus (SISA): Machine learning for accurate prediction of hospitalization in subjects suspected of arboviral infection. PLoS Negl Trop Dis 2020; 14:e0007969. [PMID: 32059026 PMCID: PMC7046343 DOI: 10.1371/journal.pntd.0007969] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 06/11/2019] [Revised: 02/27/2020] [Accepted: 12/03/2019] [Indexed: 02/07/2023]
Abstract
BACKGROUND: Dengue, chikungunya, and Zika are arboviruses of major global health concern. Decisions regarding the clinical management of suspected arboviral infection are challenging in resource-limited settings, particularly when deciding on patient hospitalization. The objective of this study was to determine if hospitalization of individuals with suspected arboviral infections could be predicted using subject intake data. METHODOLOGY/PRINCIPAL FINDINGS: Two prediction models were developed using data from a surveillance study in Machala, a city in southern coastal Ecuador with a high burden of arboviral infections. Data were obtained from subjects who presented at sentinel medical centers with suspected arboviral infection (November 2013 to September 2017). The first prediction model, called the Severity Index for Suspected Arbovirus (SISA), used only demographic and symptom data. The second prediction model, called the Severity Index for Suspected Arbovirus with Laboratory (SISAL), incorporated laboratory data. These models were selected by comparing the prediction ability of seven machine learning algorithms; the area under the receiver operating characteristic curve (AUC) from the prediction of a test dataset was used to select the final algorithm for each model. After eliminating subjects with missing data, the SISA dataset had 534 subjects, and the SISAL dataset had 98 subjects. For SISA, the best prediction algorithm was the generalized boosting model, with an AUC of 0.91. For SISAL, the best prediction algorithm was the elastic net, with an AUC of 0.94. A sensitivity analysis revealed that SISA and SISAL are not directly comparable to one another. CONCLUSIONS/SIGNIFICANCE: Both SISA and SISAL were able to predict arbovirus hospitalization with a high degree of accuracy in our dataset. These algorithms will need to be tested and validated on new data from future patients. Machine learning is a powerful prediction tool and provides an excellent option for new management tools and clinical assessment of arboviral infection.
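
The selection step described in this abstract (score several candidate algorithms on a held-out test set, then keep the one with the highest area under the ROC curve) can be illustrated with a minimal sketch. This is not the authors' code: the labels, the predicted risks, and the names `model_a`, `model_b`, `roc_auc`, and `select_by_auc` are all hypothetical, and the AUC is computed with the standard pairwise-ranking definition rather than with any particular library.

```python
from typing import Dict, Sequence, Tuple


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def select_by_auc(
    labels: Sequence[int], candidate_scores: Dict[str, Sequence[float]]
) -> Tuple[str, float]:
    """Return the name and test-set AUC of the best-scoring candidate."""
    aucs = {name: roc_auc(labels, s) for name, s in candidate_scores.items()}
    best = max(aucs, key=aucs.get)
    return best, aucs[best]


# Hypothetical held-out labels (1 = hospitalized) and predicted risks.
y_test = [1, 1, 1, 0, 0, 0, 0, 1]
candidates = {
    "model_a": [0.9, 0.8, 0.4, 0.3, 0.2, 0.5, 0.1, 0.7],
    "model_b": [0.6, 0.4, 0.5, 0.5, 0.3, 0.7, 0.2, 0.6],
}

best_name, best_auc = select_by_auc(y_test, candidates)
print(best_name, best_auc)
```

The pairwise loop is quadratic in the number of subjects, which is acceptable for datasets of the size reported here (534 and 98 subjects); a rank-based formulation would be preferable for much larger test sets.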
Affiliation(s)
- Rachel Sippy
- Institute for Global Health and Translational Science, SUNY Upstate Medical University, Syracuse, New York, United States of America
- Quantitative Disease Ecology and Conservation Lab, Department of Geography, University of Florida, Gainesville, Florida, United States of America
- Emerging Pathogens Institute, University of Florida, Gainesville, Florida, United States of America
- Daniel F. Farrell
- College of Medicine, MD Program, SUNY Upstate Medical University, Syracuse, New York, United States of America
- Daniel A. Lichtenstein
- College of Medicine, MD Program, SUNY Upstate Medical University, Syracuse, New York, United States of America
- Ryan Nightingale
- College of Medicine, MD Program, SUNY Upstate Medical University, Syracuse, New York, United States of America
- Megan A. Harris
- College of Medicine, MD Program, SUNY Upstate Medical University, Syracuse, New York, United States of America
- Joseph Toth
- College of Medicine, MD Program, SUNY Upstate Medical University, Syracuse, New York, United States of America
- Paris Hantztidiamantis
- College of Medicine, MD Program, SUNY Upstate Medical University, Syracuse, New York, United States of America
- Nicholas Usher
- Office of Undergraduate Biology, Cornell University, Ithaca, New York, United States of America
- Cinthya Cueva Aponte
- Institute for Global Health and Translational Science, SUNY Upstate Medical University, Syracuse, New York, United States of America
- Anthony Puthumana
- College of Medicine, MD Program, SUNY Upstate Medical University, Syracuse, New York, United States of America
- Christina D. Lupone
- Institute for Global Health and Translational Science, SUNY Upstate Medical University, Syracuse, New York, United States of America
- Timothy Endy
- Institute for Global Health and Translational Science, SUNY Upstate Medical University, Syracuse, New York, United States of America
- Department of Medicine, SUNY Upstate Medical University, Syracuse, New York, United States of America
- Sadie J. Ryan
- Quantitative Disease Ecology and Conservation Lab, Department of Geography, University of Florida, Gainesville, Florida, United States of America
- Emerging Pathogens Institute, University of Florida, Gainesville, Florida, United States of America
- Anna M. Stewart Ibarra
- Institute for Global Health and Translational Science, SUNY Upstate Medical University, Syracuse, New York, United States of America
- Department of Medicine, SUNY Upstate Medical University, Syracuse, New York, United States of America
|
23
|
Azevedo BP, Farias PCS, Pastor AF, Davi CCM, Neco HVPDC, Lima RED, Acioli-Santos B. Response to Joob and Wiwanitkit Re: "AA IDO1 Variant Genotype (G2431A, rs3739319) Is Associated with Severe Dengue Risk Development in a DEN-3 Brazilian Cohort". Viral Immunol 2019; 32:319-320. [PMID: 31526262 DOI: 10.1089/vim.2019.0098] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2022]
Affiliation(s)
- Pablo Cantalice S Farias
- Department of Virology, Aggeu Magalhães Institute, Oswaldo Cruz Foundation (FIOCRUZ), Recife, Brazil
- André Filipe Pastor
- Institute of Education, Science, and Technology of Sertão Pernambucano (IFSertão-PE), Floresta, Brazil
- Raul Emidio de Lima
- Department of Virology, Aggeu Magalhães Institute, Oswaldo Cruz Foundation (FIOCRUZ), Recife, Brazil
- Bartolomeu Acioli-Santos
- Department of Virology, Aggeu Magalhães Institute, Oswaldo Cruz Foundation (FIOCRUZ), Recife, Brazil
|
24
|
Azevedo BP, Farias PCS, Pastor AF, Davi CCM, Neco HVPDC, Lima RED, Acioli-Santos B. AA IDO1 Variant Genotype (G2431A, rs3739319) Is Associated with Severe Dengue Risk Development in a DEN-3 Brazilian Cohort. Viral Immunol 2019; 32:296-301. [DOI: 10.1089/vim.2018.0149] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 12/19/2022]
Affiliation(s)
- Pablo Cantalice S. Farias
- Department of Virology, Aggeu Magalhães Institute, Oswaldo Cruz Foundation (FIOCRUZ), Recife/PE, Brazil
- André Filipe Pastor
- Institute of Education, Science, and Technology of Sertão Pernambucano (IFSertão-PE), Floresta, Pernambuco, Brazil
- Raul Emídio de Lima
- Department of Virology, Aggeu Magalhães Institute, Oswaldo Cruz Foundation (FIOCRUZ), Recife/PE, Brazil
- Bartolomeu Acioli-Santos
- Department of Virology, Aggeu Magalhães Institute, Oswaldo Cruz Foundation (FIOCRUZ), Recife/PE, Brazil
|