1
|
Bender D, Licht DJ, Nataraj C. A Novel Embedded Feature Selection and Dimensionality Reduction Method for an SVM Type Classifier to Predict Periventricular Leukomalacia (PVL) in Neonates. APPLIED SCIENCES (BASEL, SWITZERLAND) 2021; 11:11156. [PMID: 37885926 PMCID: PMC10601609 DOI: 10.3390/app112311156] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 01/10/2023]
Abstract
This paper is concerned with predicting the occurrence of periventricular leukomalacia (PVL) in neonates after heart surgery. Our prior work shows that the Support Vector Machine (SVM) classifier can be a powerful tool for predicting clinical outcomes of such complicated and uncommon diseases, even when the number of data samples is low. In the presented work, we first illustrate and discuss the shortcomings of the traditional automatic machine learning (aML) approach. We then describe our methodology for addressing these shortcomings using the designed interactive ML (iML) algorithm. Finally, we conclude with a discussion of the developed method and the results obtained. In sum, by adding a Genetic Algorithm optimization step to the SVM learning framework, we were able to (a) reduce the dimensionality of an SVM model from 248 to 53 features, (b) increase generalization, as confirmed by 100% accuracy on an unseen testing set, and (c) improve the overall SVM model's testing accuracy from 65% to 100% using the proposed iML method.
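The feature-selection step described in this abstract can be pictured as a genetic search over binary feature masks. The sketch below is illustrative only and is not the authors' algorithm: the function `genetic_feature_search`, the operators (truncation selection, one-point crossover, bit-flip mutation, elitism), all parameter values, and the placeholder `toy_fitness` are assumptions. In a setting like the paper's, the fitness would presumably be a validation score of an SVM trained on the masked features.

```python
import random


def genetic_feature_search(n_features, fitness, pop_size=20, generations=30,
                           mutation_rate=0.05, seed=0):
    """Search for a binary feature mask that maximizes `fitness` (hypothetical helper)."""
    rng = random.Random(seed)
    # Each individual is a list of 0/1 flags, one per candidate feature.
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[:pop_size // 2]        # truncation selection
        children = [ranked[0][:]]               # elitism: carry over the best mask
        while len(children) < pop_size:
            mother, father = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)  # one-point crossover
            child = mother[:cut] + father[cut:]
            # Bit-flip mutation with a small per-bit probability.
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]
            children.append(child)
        population = children
    return max(population, key=fitness)


# Placeholder fitness (an assumption): pretend features 0, 3, and 7 are
# informative, and penalize every selected feature to favor small masks.
INFORMATIVE = {0, 3, 7}


def toy_fitness(mask):
    hits = sum(mask[i] for i in INFORMATIVE)
    return hits - 0.1 * sum(mask)


best_mask = genetic_feature_search(12, toy_fitness)
selected = [i for i, bit in enumerate(best_mask) if bit]
print(selected)
```

Swapping `toy_fitness` for a function that trains and scores a classifier on the selected columns would turn this into a wrapper-style feature selector. The size penalty is one simple way to push the search toward lower-dimensional models.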
Affiliation(s)
- Dieter Bender
  - Villanova Center for Analytics of Dynamic Systems, Villanova University, 800 Lancaster Ave, Villanova, PA 19085, USA
- Daniel J. Licht
  - June and Steve Wolfson Laboratory for Clinical and Biomedical Optics, Children’s Hospital of Philadelphia, 324 S 34th St, Philadelphia, PA 19104, USA
- C. Nataraj
  - Villanova Center for Analytics of Dynamic Systems, Villanova University, 800 Lancaster Ave, Villanova, PA 19085, USA
|
2
|
Jalali A, Lonsdale H, Do N, Peck J, Gupta M, Kutty S, Ghazarian SR, Jacobs JP, Rehman M, Ahumada LM. Deep Learning for Improved Risk Prediction in Surgical Outcomes. Sci Rep 2020; 10:9289. [PMID: 32518246 PMCID: PMC7283236 DOI: 10.1038/s41598-020-62971-3] [Citation(s) in RCA: 39] [Impact Index Per Article: 9.8] [Received: 08/02/2019] [Accepted: 03/19/2020] [Indexed: 11/10/2022] Open
Abstract
The Norwood surgical procedure restores functional systemic circulation in neonatal patients with single ventricle congenital heart defects, but this complex procedure carries a high mortality rate. In this study we address the need to provide an accurate patient specific risk prediction for one-year postoperative mortality or cardiac transplantation and prolonged length of hospital stay with the purpose of assisting clinicians and patients' families in the preoperative decision making process. Currently available risk prediction models either do not provide patient specific risk factors or only predict in-hospital mortality rates. We apply machine learning models to predict and calculate individual patient risk for mortality and prolonged length of stay using the Pediatric Heart Network Single Ventricle Reconstruction trial dataset. We applied a Markov Chain Monte-Carlo simulation method to impute missing data and then fed the selected variables to multiple machine learning models. The individual risk of mortality or cardiac transplantation calculation produced by our deep neural network model demonstrated 89 ± 4% accuracy and 0.95 ± 0.02 area under the receiver operating characteristic curve (AUROC). The C-statistics results for prediction of prolonged length of stay were 85 ± 3% accuracy and AUROC 0.94 ± 0.04. These predictive models and calculator may help to inform clinical and organizational decision making.
Affiliation(s)
- Ali Jalali
  - Predictive Analytics, Johns Hopkins All Children's Hospital, St. Petersburg, FL, 33701, USA
  - Department of Anesthesia and Pain Medicine at Johns Hopkins All Children's Hospital, St. Petersburg, FL, 33701, USA
- Hannah Lonsdale
  - Department of Anesthesia and Pain Medicine at Johns Hopkins All Children's Hospital, St. Petersburg, FL, 33701, USA
- Nhue Do
  - Pediatric Cardiac Surgery, Department of Surgery at Vanderbilt University, Nashville, TN, 37240, USA
- Jacquelin Peck
  - Department of Anesthesiology at Mount Sinai Hospital, Miami Beach, FL, 33140, USA
- Monesha Gupta
  - Division of Cardiology at Johns Hopkins All Children's Hospital, St. Petersburg, FL, 33701, USA
- Shelby Kutty
  - Department of Pediatrics, at Johns Hopkins School of Medicine, Baltimore, MD, 21287, USA
- Sharon R Ghazarian
  - Health Informatics Core, Johns Hopkins All Children's Hospital, St. Petersburg, FL, 33701, USA
- Mohamed Rehman
  - Department of Anesthesia and Pain Medicine at Johns Hopkins All Children's Hospital, St. Petersburg, FL, 33701, USA
- Luis M Ahumada
  - Predictive Analytics, Johns Hopkins All Children's Hospital, St. Petersburg, FL, 33701, USA
  - Department of Anesthesia and Pain Medicine at Johns Hopkins All Children's Hospital, St. Petersburg, FL, 33701, USA
|
3
|
Elshawi R, Al-Mallah MH, Sakr S. On the interpretability of machine learning-based model for predicting hypertension. BMC Med Inform Decis Mak 2019; 19:146. [PMID: 31357998 PMCID: PMC6664803 DOI: 10.1186/s12911-019-0874-0] [Citation(s) in RCA: 75] [Impact Index Per Article: 15.0] [Received: 01/25/2019] [Accepted: 07/18/2019] [Indexed: 12/15/2022] Open
Abstract
BACKGROUND Although complex machine learning models commonly outperform traditional simple interpretable models, clinicians find it hard to understand and trust these complex models because their predictions lack intuition and explanation. The aim of this study is to demonstrate the utility of various model-agnostic explanation techniques for machine learning models, with a case study analyzing the outcomes of a random forest model that predicts which individuals are at risk of developing hypertension based on cardiorespiratory fitness data.
METHODS The dataset used in this study contains information on 23,095 patients who underwent clinician-referred exercise treadmill stress testing at Henry Ford Health Systems between 1991 and 2009 and had a complete 10-year follow-up. Five global interpretability techniques (Feature Importance, Partial Dependence Plot, Individual Conditional Expectation, Feature Interaction, Global Surrogate Models) and two local interpretability techniques (Local Surrogate Models, Shapley Value) were applied to show how interpretability techniques can help clinical staff better understand and trust the outcomes of machine learning-based predictions.
RESULTS Several experiments have been conducted and reported. The results show that different interpretability techniques provide different insights into the model's behavior: global interpretations enable clinicians to understand the entire conditional distribution modeled by the trained response function, whereas local interpretations promote the understanding of small parts of the conditional distribution for specific instances.
CONCLUSIONS Interpretability techniques can vary in their explanations of the behavior of a machine learning model. Global interpretability techniques have the advantage that they can generalize over the entire population, while local interpretability techniques focus on giving explanations at the level of instances. Both kinds of method can be equally valid depending on the application need. Both are effective for assisting clinicians in the medical decision process; however, clinicians will always have the final say on accepting or rejecting the outcome of the machine learning models and their explanations based on their domain expertise.
Affiliation(s)
- Radwa Elshawi
  - Data Systems Group, Institute of Computer Science, University of Tartu, 2 J. Liivi St., 50409 Tartu, Estonia
- Sherif Sakr
  - Data Systems Group, Institute of Computer Science, University of Tartu, 2 J. Liivi St., 50409 Tartu, Estonia
|
4
|
Al-Kaysi AM, Al-Ani A, Galvez V, Colleen Loo K, Ling S, Tjeerd Boonstra W. Estimating The Quality of Electroconvulsive Therapy Induced Seizures Using Decision Tree and Fuzzy Inference System Classifiers. ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. ANNUAL INTERNATIONAL CONFERENCE 2018; 2018:3677-3680. [PMID: 30441170 DOI: 10.1109/embc.2018.8513334] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/09/2023]
Abstract
Electroconvulsive therapy (ECT) is an effective and widely used treatment for major depressive disorder, in which a brief electric current is passed through the brain to trigger a brief seizure. This study aims to identify seizure quality ratings from a set of seizure parameters. We used 750 ECT EEG recordings in this experiment. Four seizure-related parameters (time of slowing, regularity, stereotypy, and post-ictal suppression) are used as inputs to two classifiers, a decision tree and a fuzzy inference system (FIS), to predict seizure quality ratings. The two classifiers produced encouraging results, with error rates of 0.31 and 0.25 for the FIS and the decision tree, respectively. The classification results show that the four seizure parameters provide relevant information about the rating of seizure quality. Automatic scoring of seizure quality may be beneficial to clinicians working in this field.
|
5
|
Olive MK, Owens GE. Current monitoring and innovative predictive modeling to improve care in the pediatric cardiac intensive care unit. Transl Pediatr 2018; 7:120-128. [PMID: 29770293 PMCID: PMC5938248 DOI: 10.21037/tp.2018.04.03] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Indexed: 01/20/2023] Open
Abstract
The objectives of this review are (I) to describe the challenges associated with monitoring patients in the pediatric cardiac intensive care unit (PCICU) and (II) to discuss the use of innovative statistical and artificial intelligence (AI) software programs to attempt to predict significant clinical events. Patients cared for in the PCICU are clinically fragile and at risk for fatal decompensation. Current monitoring modalities are often ineffective, sometimes inaccurate, and fail to detect a deteriorating clinical status in a timely manner. Predictive models created by AI and machine learning may lead to earlier detection of patients at risk for clinical decompensation and thereby improve care for critically ill pediatric cardiac patients.
Affiliation(s)
- Mary K Olive
  - Division of Pediatric Cardiology, C.S. Mott Children's Hospital, University of Michigan, Ann Arbor, MI, USA
- Gabe E Owens
  - Division of Pediatric Cardiology, C.S. Mott Children's Hospital, University of Michigan, Ann Arbor, MI, USA
|
6
|
Gálvez JA, Jalali A, Ahumada L, Simpao AF, Rehman MA. Neural Network Classifier for Automatic Detection of Invasive Versus Noninvasive Airway Management Technique Based on Respiratory Monitoring Parameters in a Pediatric Anesthesia. J Med Syst 2017; 41:153. [PMID: 28836107 DOI: 10.1007/s10916-017-0787-3] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 03/31/2017] [Accepted: 07/20/2017] [Indexed: 01/09/2023]
Abstract
Children undergoing general anesthesia require airway monitoring by an anesthesia provider. The airway may be supported with noninvasive devices such as a face mask or invasive devices such as a laryngeal mask airway or an endotracheal tube. The stored physiologic data provide an opportunity to apply machine learning algorithms to distinguish between these modes based on pattern recognition. We retrieved three data sets from patients receiving general anesthesia in 2015 with either mask, laryngeal mask airway, or endotracheal tube. Patients underwent myringotomy, tonsillectomy, adenoidectomy, or inguinal hernia repair procedures. We retrieved measurements for end-tidal carbon dioxide, tidal volume, and peak inspiratory pressure and calculated statistical features for each data element per patient. We applied machine learning algorithms (decision tree, support vector machine, and neural network) to classify patients into noninvasive or invasive airway device support. We identified 300 patients per group (mask, laryngeal mask airway, and endotracheal tube) for a total of 900 patients. The neural network classifier performed better than the boosted trees and support vector machine classifiers based on the test data sets. The sensitivity, specificity, and accuracy for neural network classification are 97.5%, 96.3%, and 95.8%. In contrast, the sensitivity, specificity, and accuracy of the support vector machine are 89.1%, 92.3%, and 88.3%, and with the boosted tree classifier they are 93.8%, 92.1%, and 91.4%. We describe a method to automatically distinguish between noninvasive and invasive airway device support in a pediatric surgical setting based on respiratory monitoring parameters. The results show that the neural network classifier algorithm can accurately classify noninvasive and invasive airway device support.
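Sensitivity, specificity, and accuracy, as reported in this abstract, follow directly from confusion-matrix counts. The snippet below shows the arithmetic. The function name `classification_metrics` and the example counts are hypothetical and are not taken from the study.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return sensitivity, specificity, and accuracy from confusion-matrix counts.

    tp/fn: positive-class cases predicted correctly/incorrectly.
    tn/fp: negative-class cases predicted correctly/incorrectly.
    """
    total = tp + fn + tn + fp
    return {
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "accuracy": (tp + tn) / total,   # fraction of all cases classified correctly
    }


# Illustrative counts only (an assumption, not data from the paper).
metrics = classification_metrics(tp=90, fn=10, tn=80, fp=20)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Which class counts as "positive" (for example, invasive airway support) is a labeling choice that must be fixed before sensitivity and specificity can be compared across classifiers.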
Affiliation(s)
- Jorge A Gálvez
  - Section of Biomedical Informatics, Department of Anesthesiology & Critical Care Medicine, The Children's Hospital of Philadelphia, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, 19104, USA
- Ali Jalali
  - Section of Biomedical Informatics, Department of Anesthesiology & Critical Care Medicine, The Children's Hospital of Philadelphia, Philadelphia, PA, 19104, USA
- Luis Ahumada
  - Enterprise Analytics and Reporting, The Children's Hospital of Philadelphia, Philadelphia, PA, 19104, USA
- Allan F Simpao
  - Section of Biomedical Informatics, Department of Anesthesiology & Critical Care Medicine, The Children's Hospital of Philadelphia, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, 19104, USA
- Mohamed A Rehman
  - Section of Biomedical Informatics, Department of Anesthesiology & Critical Care Medicine, The Children's Hospital of Philadelphia, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, 19104, USA
|
7
|
Application of Mathematical Modeling for Simulation and Analysis of Hypoplastic Left Heart Syndrome (HLHS) in Pre- and Postsurgery Conditions. BIOMED RESEARCH INTERNATIONAL 2015; 2015:987293. [PMID: 26601113 PMCID: PMC4637090 DOI: 10.1155/2015/987293] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Received: 09/04/2014] [Accepted: 02/19/2015] [Indexed: 11/24/2022]
Abstract
This paper is concerned with the mathematical modeling of a severe and common congenital defect called hypoplastic left heart syndrome (HLHS). Surgical approaches are utilized for palliating this heart condition; however, a brain white matter injury called periventricular leukomalacia (PVL) occurs with high prevalence at or around the time of surgery, the exact cause of which is not known presently. Our main goal in this paper is to study the hemodynamic conditions under which HLHS physiology may lead to the occurrence of PVL. A lumped parameter model of the HLHS circulation has been developed integrating diffusion modeling of oxygen and carbon dioxide concentrations in order to study hemodynamic variables such as pressure, flow, and blood gas concentration. Results presented include calculations of blood pressures and flow rates in different parts of the circulation. Simulations also show changes in the ratio of pulmonary to systemic blood flow rates when the sizes of the patent ductus arteriosus and atrial septal defect are varied. These changes lead to unbalanced blood circulations and, when combined with low oxygen and carbon dioxide concentrations in arteries, result in poor oxygen delivery to the brain. We stipulate that PVL occurs as a consequence.
|
8
|
Jalali A, Licht DJ, Nataraj C. Discovering hidden relationships in physiological signals for prediction of Periventricular Leukomalacia. ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. ANNUAL INTERNATIONAL CONFERENCE 2015; 2013:7080-3. [PMID: 24111376 DOI: 10.1109/embc.2013.6611189] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 11/06/2022]
Abstract
This paper is concerned with predicting the occurrence of Periventricular Leukomalacia (PVL) using vital data collected over a period of twelve hours after neonatal cardiac surgery. The vital data contain heart rate (HR), mean arterial pressure (MAP), right atrium pressure (RAP), and oxygen saturation (SpO2). Various features are extracted from the data and then ranked so that an optimal subset of features with the highest discriminative capability can be selected. A decision tree (DT) is then developed for the vital data in order to identify the most important vital measurements. The DT result shows that high-amplitude 20-minute variations and low sample entropy in the data are important factors for the prediction of PVL. Low sample entropy represents a lack of variability in hemodynamic measurement, and constant blood pressure with small fluctuations is an important indicator of PVL occurrence. Finally, using different time frames of the collected data, we show that the first six hours of data contain sufficient information for PVL occurrence prediction.
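Sample entropy, the variability feature highlighted in this abstract, can be computed as the negative log of the ratio between template matches of length m + 1 and of length m. The sketch below is a minimal, unoptimized version under stated assumptions: the function name, the embedding dimension `m=2`, and the absolute tolerance `r` are illustrative choices, and the abstract does not state which settings the authors used.

```python
import math


def sample_entropy(series, m=2, r=0.2):
    """Sample entropy SampEn(m, r) of a 1-D sequence.

    `r` is an absolute tolerance on the Chebyshev distance between templates
    (an assumption; it is often set relative to the signal's standard deviation).
    """
    n = len(series)
    if n <= m + 1:
        raise ValueError("series is too short for the chosen m")

    def count_matches(length):
        # Use the same n - m starting points for both template lengths.
        templates = [series[i:i + length] for i in range(n - m)]
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):  # skip self-matches
                distance = max(abs(a - b)
                               for a, b in zip(templates[i], templates[j]))
                if distance <= r:
                    count += 1
        return count

    b = count_matches(m)      # matching pairs of length m
    a = count_matches(m + 1)  # matching pairs of length m + 1
    if a == 0 or b == 0:
        return math.inf       # undefined ratio; reported here as infinity
    return -math.log(a / b)


# A perfectly regular signal scores 0; an irregular one scores higher.
print(sample_entropy([0, 1] * 10, m=2, r=0.1))
print(sample_entropy([0, 1, 0, 1, 1, 0, 1, 0, 0, 1], m=2, r=0.1))
```

Lower values indicate a more regular, less variable signal, which matches the abstract's reading of low sample entropy as a lack of hemodynamic variability.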
|
9
|
Jalali A, Buckley EM, Lynch JM, Schwab PJ, Licht DJ, Nataraj C. Prediction of periventricular leukomalacia occurrence in neonates after heart surgery. IEEE J Biomed Health Inform 2014; 18:1453-60. [PMID: 24122606 PMCID: PMC4122287 DOI: 10.1109/jbhi.2013.2285011] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Indexed: 01/27/2023]
Abstract
This paper is concerned with predicting the occurrence of periventricular leukomalacia (PVL) using vital and blood gas data which are collected over a period of 12 h after the neonatal cardiac surgery. A data mining approach has been employed to generate a set of rules for classification of subjects as healthy or PVL affected. In view of the fact that blood gas and vital data have different sampling rates, in this study we have divided the data into two categories: 1) high resolution (vital), and 2) low resolution (blood gas), and designed a separate classifier based on each data category. The developed algorithm is composed of several stages; first, a feature pool has been extracted from each data category and the extracted features have been ranked based on the data reliability and their mutual information content with the output. An optimal feature subset with the highest discriminative capability has been formed using simultaneous maximization of the class separability measure and mutual information of a set. Two separate decision trees (DTs) have been developed for the classification purpose and more importantly to discover hidden relationships that exist among the data to help us better understand PVL pathophysiology. The DT result shows that high amplitude 20 min variations and low sample entropy in the vital data and the defined out of range index as well as maximum rate of change in blood gas data are important factors for PVL prediction. Low sample entropy represents lack of variability in hemodynamic measurement, and constant blood pressure with small fluctuations is an important indicator of PVL occurrence. Finally, using the different time frames of data collection, we show that the first 6 h of data contain sufficient information for PVL occurrence prediction.
Affiliation(s)
- Ali Jalali
  - PhD candidate at the Department of Mechanical Engineering, Villanova University, Villanova, PA, 19085 USA
- Erin M. Buckley
  - Post-Doctoral researcher at the Neurovascular Imaging Lab, Division of Child Neurology, Children’s Hospital of Philadelphia, Philadelphia, PA, 19140 USA
- Jennifer M. Lynch
  - PhD candidate at the Neurovascular Imaging Lab, Division of Child Neurology, Children’s Hospital of Philadelphia, Philadelphia, PA, 19140 USA
- Peter J. Schwab
  - Neurovascular Imaging Lab, Division of Child Neurology, Children’s Hospital of Philadelphia, Philadelphia, PA, 19140 USA
- Daniel J. Licht
  - Director of the Neurovascular Imaging Lab, Division of Child Neurology, Children’s Hospital of Philadelphia, Philadelphia, PA, 19140 USA
- C Nataraj
  - Mrs. and Mr. Mortiz, Sr. Endowed Professor in Engineered Systems and Chair of the Department of Mechanical Engineering, Villanova University, Villanova, PA, 19085 USA
|