1
Zhao L, Vidwans A, Bearnot CJ, Rayner J, Lin T, Baird J, Suner S, Jay GD. Prediction of anemia in real-time using a smartphone camera processing conjunctival images. PLoS One 2024; 19:e0302883. [PMID: 38739605 PMCID: PMC11090304 DOI: 10.1371/journal.pone.0302883] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/19/2023] [Accepted: 04/15/2024] [Indexed: 05/16/2024]
Abstract
Anemia is defined as a low hemoglobin (Hb) concentration and is highly prevalent worldwide. We report on the performance of a smartphone application (app) that records images in RAW format of the palpebral conjunctivae and estimates Hb concentration by relying upon computation of the tissue surface high hue ratio. Images of bilateral conjunctivae were obtained prospectively from a convenience sample of 435 Emergency Department patients using a dedicated smartphone. A previous computer-based and validated derivation data set associating estimated conjunctival Hb (HBc) and the actual laboratory-determined Hb (HBl) was used in deriving Hb estimations using a self-contained mobile app. Accuracy of HBc was 75.4% (95% CI 71.3, 79.4%) for all categories of anemia, and Bland-Altman plot analysis showed a bias of 0.10 and limits of agreement (LOA) of (-4.73, 4.93 g/dL). Analysis of HBc estimation accuracy around different anemia thresholds showed that AUC was maximized at transfusion thresholds of 7 and 9 g/dL which showed AUC values of 0.92 and 0.90 respectively. We found that the app is sufficiently accurate for detecting severe anemia and shows promise as a population-sourced screening platform or as a non-invasive point-of-care anemia classifier.
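The reported bias (0.10) and limits of agreement (-4.73, 4.93 g/dL) are symmetric about the bias, which is what the conventional Bland-Altman construction (bias ± 1.96 standard deviations of the paired differences) produces. The abstract does not state the multiplier, so the 1.96 factor below is an assumption, and the paired values are hypothetical illustration data, not study data. A minimal sketch:

```python
import statistics


def bland_altman(estimates, references):
    """Return (bias, lower LOA, upper LOA) for paired measurements.

    Assumes the conventional limits of agreement: bias +/- 1.96 * SD
    of the paired differences (estimate minus reference).
    """
    if len(estimates) != len(references) or len(estimates) < 2:
        raise ValueError("need two equal-length series with at least 2 pairs")
    diffs = [e - r for e, r in zip(estimates, references)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation
    half_width = 1.96 * sd
    return bias, bias - half_width, bias + half_width


# Hypothetical paired hemoglobin values in g/dL (not from the study).
hbc = [11.2, 9.8, 13.5, 7.1, 12.0]   # app estimate
hbl = [10.9, 10.4, 13.0, 7.9, 11.6]  # laboratory value
bias, lower, upper = bland_altman(hbc, hbl)
print(f"bias={bias:.2f}, LOA=({lower:.2f}, {upper:.2f})")
```

Under this convention, the reported limits imply a standard deviation of the differences of roughly (4.93 - 0.10) / 1.96 ≈ 2.46 g/dL.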
Affiliation(s)
- Leon Zhao
- The Warren Alpert Medical School, Brown University, Providence, Rhode Island, United States of America
- Alisa Vidwans
- Rhode Island Hospital, Providence, Rhode Island, United States of America
- Courtney J. Bearnot
- The Warren Alpert Medical School, Brown University, Providence, Rhode Island, United States of America
- James Rayner
- The Warren Alpert Medical School, Brown University, Providence, Rhode Island, United States of America
- Timmy Lin
- The Warren Alpert Medical School, Brown University, Providence, Rhode Island, United States of America
- Janette Baird
- The Warren Alpert Medical School, Brown University, Providence, Rhode Island, United States of America
- Selim Suner
- The Warren Alpert Medical School, Brown University, Providence, Rhode Island, United States of America
- Gregory D. Jay
- The Warren Alpert Medical School, Brown University, Providence, Rhode Island, United States of America
2
Shen S, Yuan X, Wang J, Fan L, Zhao J, Tao J. Evaluation of a machine learning algorithms for predicting the dental age of adolescent based on different preprocessing methods. Front Public Health 2022; 10:1068253. [PMID: 36530730 PMCID: PMC9751184 DOI: 10.3389/fpubh.2022.1068253] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 10/12/2022] [Accepted: 11/14/2022] [Indexed: 12/05/2022]
Abstract
Background: Machine learning (ML) algorithms play a key role in estimating dental age. In this study, three ML models were used for dental age estimation, based on different preprocessing methods. Aim: The seven mandibular teeth on the digital panorama were measured and evaluated according to the Cameriere and the Demirjian methods, respectively. Correlation data were used for decision tree (DT), Bayesian ridge regression (BRR), and k-nearest neighbors (KNN) models for dental age estimation. An accuracy comparison was made among the different methods. Subjects and methods: We analyzed 748 orthopantomographs (392 males and 356 females) from eastern China between the ages of 5 and 13 years in this retrospective study. Three models, DT, BRR, and KNN, were used to estimate the dental age. The input data for the ML models were obtained according to the Cameriere method and the Demirjian method. Five metrics were used to evaluate the accuracy of age estimation: coefficient of determination (R2), mean error (ME), root mean square error (RMSE), mean square error (MSE), and mean absolute error (MAE). Results: Our experimental results showed that the prediction accuracy of dental age was affected by the ML algorithm. The ME, MAE, MSE, and RMSE of the dental age predicted by ML were significantly decreased. Among all the methods, the KNN model based on the Cameriere method had the highest accuracy (ME = 0.015, MAE = 0.473, MSE = 0.340, RMSE = 0.583, R2 = 0.94). Conclusion: The results show that the prediction accuracy of dental age is influenced by the ML algorithm and the preprocessing method. The KNN model based on the Cameriere method was able to infer dental age more accurately in a clinical setting.
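The five accuracy metrics named in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the sign convention for ME (predicted minus actual) is an assumption, and the ages are hypothetical. Note that RMSE is the square root of MSE, which agrees with the reported pair (MSE = 0.340, RMSE = 0.583).

```python
import math


def regression_metrics(actual, predicted):
    """Compute ME, MAE, MSE, RMSE and R2 for paired age values.

    Errors are taken as predicted minus actual (an assumed convention).
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("need two non-empty series of equal length")
    n = len(actual)
    errors = [p - a for a, p in zip(actual, predicted)]
    me = sum(errors) / n
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    ss_res = sum(e * e for e in errors)
    r2 = 1.0 - ss_res / ss_tot
    return {"ME": me, "MAE": mae, "MSE": mse, "RMSE": rmse, "R2": r2}


# Hypothetical chronological and estimated ages in years.
metrics = regression_metrics([6.0, 8.0, 10.0, 12.0], [6.5, 7.5, 10.5, 11.5])
print(metrics)
```

A mean error near zero alongside a larger mean absolute error, as in the reported KNN result, indicates that over- and underestimates largely cancel on average.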
Affiliation(s)
- Shihui Shen
- Department of General Dentistry, Shanghai Ninth People's Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China; College of Stomatology, Shanghai Jiao Tong University, Shanghai, China; National Center for Stomatology, Shanghai, China; National Clinical Research Center for Oral Diseases, Shanghai, China; Shanghai Key Laboratory of Stomatology, Shanghai, China; Shanghai Research Institute of Stomatology, Shanghai, China
- Xiaoyan Yuan
- Department of General Dentistry, Shanghai Ninth People's Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China; College of Stomatology, Shanghai Jiao Tong University, Shanghai, China; National Center for Stomatology, Shanghai, China; National Clinical Research Center for Oral Diseases, Shanghai, China; Shanghai Key Laboratory of Stomatology, Shanghai, China; Shanghai Research Institute of Stomatology, Shanghai, China
- Jian Wang
- Department of General Dentistry, Shanghai Ninth People's Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China; College of Stomatology, Shanghai Jiao Tong University, Shanghai, China; National Center for Stomatology, Shanghai, China; National Clinical Research Center for Oral Diseases, Shanghai, China; Shanghai Key Laboratory of Stomatology, Shanghai, China; Shanghai Research Institute of Stomatology, Shanghai, China
- Linfeng Fan
- College of Stomatology, Shanghai Jiao Tong University, Shanghai, China; National Center for Stomatology, Shanghai, China; National Clinical Research Center for Oral Diseases, Shanghai, China; Shanghai Key Laboratory of Stomatology, Shanghai, China; Shanghai Research Institute of Stomatology, Shanghai, China; Department of Radiology, Shanghai Ninth People's Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Junjun Zhao
- Department of General Dentistry, Shanghai Ninth People's Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China; College of Stomatology, Shanghai Jiao Tong University, Shanghai, China; National Center for Stomatology, Shanghai, China; National Clinical Research Center for Oral Diseases, Shanghai, China; Shanghai Key Laboratory of Stomatology, Shanghai, China; Shanghai Research Institute of Stomatology, Shanghai, China
- Jiang Tao
- Department of General Dentistry, Shanghai Ninth People's Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China; College of Stomatology, Shanghai Jiao Tong University, Shanghai, China; National Center for Stomatology, Shanghai, China; National Clinical Research Center for Oral Diseases, Shanghai, China; Shanghai Key Laboratory of Stomatology, Shanghai, China; Shanghai Research Institute of Stomatology, Shanghai, China
- Correspondence: Jiang Tao
3
Płuciennik A, Płaczek A, Wilk A, Student S, Oczko-Wojciechowska M, Fujarewicz K. Data Integration–Possibilities of Molecular and Clinical Data Fusion on the Example of Thyroid Cancer Diagnostics. Int J Mol Sci 2022; 23:ijms231911880. [PMID: 36233181 PMCID: PMC9569592 DOI: 10.3390/ijms231911880] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/03/2022] [Revised: 09/24/2022] [Accepted: 09/28/2022] [Indexed: 11/23/2022]
Abstract
(1) Background: The data from independent gene expression sources may be integrated for the purpose of molecular diagnostics of cancer. So far, multiple approaches were described. Here, we investigated the impacts of different data fusion strategies on classification accuracy and feature selection stability, which allow the costs of diagnostic tests to be reduced. (2) Methods: We used molecular features (gene expression) combined with a feature extracted from the independent clinical data describing a patient’s sample. We considered the dependencies between selected features in two data fusion strategies (early fusion and late fusion) compared to classification models based on molecular features only. We compared the best accuracy classification models in terms of the number of features, which is connected to the potential cost reduction of the diagnostic classifier. (3) Results: We show that for thyroid cancer, the extracted clinical feature is correlated with (but not redundant to) the molecular data. The usage of data fusion allows a model to be obtained with similar or even higher classification quality (with a statistically significant accuracy improvement, a p-value below 0.05) and with a reduction in molecular dimensionality of the feature space from 15 to 3–8 (depending on the feature selection method). (4) Conclusions: Both strategies give comparable quality results, but the early fusion method provides better feature selection stability.
Affiliation(s)
- Alicja Płuciennik
- Department of Systems Biology and Engineering, Faculty of Automatic Control, Electronics and Computer Science, Silesian University of Technology, Akademicka 16, 44-100 Gliwice, Poland
- Department of Technology Development, Gabos Software Sp z o.o., Mikołowska 100, 40-065 Katowice, Poland
- Correspondence: (A.P.); (S.S.)
- Aleksander Płaczek
- Department of Technology Development, Gabos Software Sp z o.o., Mikołowska 100, 40-065 Katowice, Poland
- Department of Applied Informatics, Faculty of Automatic Control, Electronics and Computer Science, Silesian University of Technology, Akademicka 16, 44-100 Gliwice, Poland
- Agata Wilk
- Department of Systems Biology and Engineering, Faculty of Automatic Control, Electronics and Computer Science, Silesian University of Technology, Akademicka 16, 44-100 Gliwice, Poland
- Department of Biostatistics and Bioinformatics, Maria Sklodowska-Curie National Research Institute of Oncology, Gliwice Branch, Wybrzeze AK 14, 44-100 Gliwice, Poland
- Sebastian Student
- Department of Systems Biology and Engineering, Faculty of Automatic Control, Electronics and Computer Science, Silesian University of Technology, Akademicka 16, 44-100 Gliwice, Poland
- Biotechnology Center, Silesian University of Technology, Bolesława Krzywoustego 8, 44-100 Gliwice, Poland
- Correspondence: (A.P.); (S.S.)
- Małgorzata Oczko-Wojciechowska
- Department of Clinical and Molecular Genetics, Maria Sklodowska-Curie National Research Institute of Oncology, Gliwice Branch, Wybrzeze AK 14, 44-100 Gliwice, Poland
- Krzysztof Fujarewicz
- Department of Systems Biology and Engineering, Faculty of Automatic Control, Electronics and Computer Science, Silesian University of Technology, Akademicka 16, 44-100 Gliwice, Poland
4
Knio ZO, Morales FL, Shah KP, Ondigi OK, Selinski CE, Baldeo CM, Zhuo DX, Bilchick KC, Mehta NK, Kwon Y, Breathett K, Thiele RH, Hulse MC, Mazimba S. A systemic congestive index (systemic pulse pressure to central venous pressure ratio) predicts adverse outcomes in patients undergoing valvular heart surgery. J Card Surg 2022; 37:3259-3266. [PMID: 35842813 PMCID: PMC9543661 DOI: 10.1111/jocs.16772] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/18/2022] [Revised: 06/09/2022] [Accepted: 06/28/2022] [Indexed: 12/26/2022]
Abstract
Background and Aims: Invasive hemodynamics may provide a more nuanced assessment of cardiac function and risk phenotyping in patients undergoing cardiac surgery. The systemic pulse pressure (SPP) to central venous pressure (CVP) ratio represents an integrated index of right and left ventricular function and thus may demonstrate an association with valvular heart surgery outcomes. This study hypothesized that a low SPP/CVP ratio would be associated with mortality in valvular surgery patients. Methods: This retrospective cohort study examined adult valvular surgery patients with preoperative right heart catheterization from 2007 through 2016 at a single tertiary medical center (n = 215). Associations between the SPP/CVP ratio and mortality were investigated with univariate and multivariate analyses. Results: Among 215 patients (age 69.7 ± 12.4 years; 55.8% male), 61 died (28.4%) over a median follow-up of 5.9 years. A SPP/CVP ratio <7.6 was associated with increased mortality (relative risk 1.70, 95% confidence interval [CI] 1.08–2.67, p = .019) and increased length of stay (11.56 ± 13.73 days vs. 7.93 ± 4.92 days, p = .016). It remained an independent predictor of mortality (adjusted odds ratio 3.99, 95% CI 1.47–11.45, p = .008) after adjusting for CVP, mean pulmonary artery pressure, aortic stenosis, tricuspid regurgitation, smoking status, diabetes mellitus, dialysis, and cross-clamp time. Conclusions: A low SPP/CVP ratio was associated with worse outcomes in patients undergoing valvular heart surgery. This metric has potential utility in preoperative risk stratification to guide patient selection, prognosis, and surgical outcomes.
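The index described here is a simple ratio, sketched below for illustration. Pulse pressure is taken as systolic minus diastolic pressure (the standard definition), the 7.6 cutoff is the one reported in the abstract, and the function names and pressure values are hypothetical. This is not a clinical tool.

```python
LOW_RATIO_CUTOFF = 7.6  # threshold reported in the abstract


def spp_cvp_ratio(systolic_mmhg, diastolic_mmhg, cvp_mmhg):
    """Systemic pulse pressure divided by central venous pressure."""
    if cvp_mmhg <= 0:
        raise ValueError("CVP must be positive to form the ratio")
    pulse_pressure = systolic_mmhg - diastolic_mmhg
    return pulse_pressure / cvp_mmhg


def is_low_ratio(ratio, cutoff=LOW_RATIO_CUTOFF):
    """True when the ratio falls below the reported cutoff."""
    return ratio < cutoff


# Hypothetical readings in mmHg.
ratio_a = spp_cvp_ratio(120, 70, 10)  # 50 / 10
ratio_b = spp_cvp_ratio(130, 60, 5)   # 70 / 5
print(ratio_a, is_low_ratio(ratio_a))
print(ratio_b, is_low_ratio(ratio_b))
```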
Affiliation(s)
- Ziyad O Knio
- Department of Anesthesiology, University of Virginia Health System, Charlottesville, Virginia, USA
- Frances L Morales
- University of Virginia School of Medicine, Charlottesville, Virginia, USA
- Kajal P Shah
- Division of Cardiovascular Medicine, Department of Medicine, University of Virginia Health System, Charlottesville, Virginia, USA
- Olivia K Ondigi
- Division of Cardiovascular Medicine, Department of Medicine, University of Virginia Health System, Charlottesville, Virginia, USA
- Christian E Selinski
- Division of Cardiovascular Medicine, Department of Medicine, University of Virginia Health System, Charlottesville, Virginia, USA
- Cherisse M Baldeo
- Division of Cardiovascular Medicine, Department of Medicine, University of Virginia Health System, Charlottesville, Virginia, USA
- David X Zhuo
- Division of Cardiovascular Medicine, Department of Medicine, University of Virginia Health System, Charlottesville, Virginia, USA; Division of Cardiology, Department of Medicine, University Hospitals Cleveland Medical Center, Case Western Reserve University, Cleveland, Ohio, USA
- Kenneth C Bilchick
- Division of Cardiovascular Medicine, Department of Medicine, University of Virginia Health System, Charlottesville, Virginia, USA
- Nishaki K Mehta
- Division of Cardiovascular Medicine, Department of Medicine, University of Virginia Health System, Charlottesville, Virginia, USA; Division of Cardiovascular Medicine, Beaumont Hospital, Royal Oak, Michigan, USA
- Younghoon Kwon
- Division of Cardiology, Department of Medicine, University of Washington, Seattle, Washington, USA
- Khadijah Breathett
- Division of Cardiovascular Medicine, Indiana University, Indianapolis, Indiana, USA
- Robert H Thiele
- Department of Anesthesiology, University of Virginia Health System, Charlottesville, Virginia, USA
- Matthew C Hulse
- Department of Anesthesiology, University of Virginia Health System, Charlottesville, Virginia, USA
- Sula Mazimba
- Division of Cardiovascular Medicine, Department of Medicine, University of Virginia Health System, Charlottesville, Virginia, USA
5
Optimal Data Reduction of Training Data in Machine Learning-Based Modelling: A Multidimensional Bin Packing Approach. Energies 2022. [DOI: 10.3390/en15093092] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 02/04/2023]
Abstract
Now that complex, IT-controlled systems have found their way into many areas, models and the data on which they are based are playing an increasingly important role. Because sensor technology makes it ever easier to collect data, extensive data sets are created that need to be mastered. In concrete terms, this means extracting the information required for a specific problem from the data in high quality. For example, in the field of condition monitoring, this includes relevant system states. Especially in the application field of machine learning, the quality of the data is of significant importance. Here, different methods already exist to reduce the size of data sets without reducing the information value. In this paper, the multidimensional binned reduction (MdBR) method is presented as an approach that has a much lower complexity than these methods and that deals with regression rather than classification, as most other approaches do. The approach merges discretization approaches with non-parametric numerosity reduction via histograms. MdBR has linear complexity and can be used to reduce large multivariate data sets to smaller subsets, which can then be used for model training. The evaluation, based on a dataset from the photovoltaic sector with approximately 92 million samples, aims to train a multilayer perceptron (MLP) model to estimate the output power of the system. The results show that, using the approach, the number of samples for training could be reduced by more than 99% while also increasing the model’s performance. It works best with large data sets of low-dimensional data. Although periodic data often include the most redundant samples and thus provide the best reduction capabilities, the presented approach can only handle time-invariant data and not sequences of samples, as are often used in time series.
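The general idea of histogram-based numerosity reduction can be sketched as follows. This is not the paper's MdBR algorithm, whose details the abstract does not give; it is a minimal illustration that assumes an equal-width grid and keeps the first sample seen in each occupied cell. The function name and both choices are assumptions. It runs in two linear passes over the data, in line with the linear complexity the abstract describes.

```python
def binned_reduction(samples, bins_per_dim):
    """Keep one representative per occupied cell of an equal-width grid.

    samples: list of equal-length numeric tuples.
    bins_per_dim: number of bins along every dimension (assumed uniform).
    """
    if not samples:
        return []
    dims = len(samples[0])
    # First pass: per-dimension value range.
    lows = [min(s[d] for s in samples) for d in range(dims)]
    highs = [max(s[d] for s in samples) for d in range(dims)]
    kept = {}
    # Second pass: map each sample to its grid cell.
    for s in samples:
        key = []
        for d in range(dims):
            span = highs[d] - lows[d]
            if span == 0:
                idx = 0
            else:
                idx = int((s[d] - lows[d]) / span * bins_per_dim)
                idx = min(idx, bins_per_dim - 1)  # maximum goes in last bin
            key.append(idx)
        kept.setdefault(tuple(key), s)  # first sample in a cell wins
    return list(kept.values())


# Hypothetical two-dimensional samples.
data = [(0.0, 0.0), (0.1, 0.1), (0.9, 0.9), (1.0, 1.0), (0.05, 0.95)]
reduced = binned_reduction(data, bins_per_dim=2)
print(reduced)
```

Because the number of grid cells grows exponentially with the number of dimensions, a scheme of this kind tends to remove the most samples on large, low-dimensional data sets, consistent with the abstract's remark.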
6
Knio ZO, Thiele RH, Wright WZ, Mazimba S, Naik BI, Hulse MC. A Novel Hemodynamic Index of Post-operative Right Heart Dysfunction Predicts Mortality in Cardiac Surgical Patients. Semin Cardiothorac Vasc Anesth 2022; 26:200-208. [PMID: 35332827 DOI: 10.1177/10892532221080382] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 11/16/2022]
Abstract
INTRODUCTION: This study aimed to investigate whether mortality following cardiac surgery was associated with the pulmonary artery pulsatility index (PAPi): pulmonary artery pulse pressure divided by central venous pressure (CVP), and a novel index: mean pulmonary artery pressure (mPAP) minus CVP. METHODS: This retrospective analysis investigated all cardiac surgery patients in the Society of Thoracic Surgeons registry at a single academic medical center from January 2017 through March 2020 (n = 1510). The primary and secondary outcomes were mortality at 1 year and serum creatinine increase during index surgical admission, respectively. CVP, mPAP, PAPi, mPAP-CVP gradient, mean arterial pressure (MAP), and cardiac index (CI) were sampled continually from invasive hemodynamic monitors post-operatively. Associations with mortality were tested with univariate and multivariate analyses. The relationship with serum creatinine was investigated with Pearson's correlation at alpha = .05. RESULTS: One-year mortality was observed in 44/1200 patients (3.7%). On univariate analysis, mortality was associated with minimums for mPAP, MAP, and CI and maximums for CVP, mPAP, PAPi, mPAP-CVP gradient, and CI (all P < .10). Model selection revealed that the only independently predictive parameters were minimum MAP (AOR = .880 [.819-.944]), maximum mPAP-CVP gradient (AOR = 1.082 [1.031-1.133]), and maximum CI (AOR = 1.421 [.928-2.068]), with model c-statistic = .770. A maximum mPAP-CVP gradient >20.5 predicted mortality with 54.5% sensitivity and 79.30% specificity, maintaining significance on survival analysis (P < .001). Peak increase in serum creatinine from baseline demonstrated a weak association with all parameters (max |r| = .33). CONCLUSIONS: Mortality was not predicted by the post-operative PAPi; rather, it was independently predicted by the mPAP-CVP gradient, MAP, and CI.
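The two indices defined in this abstract can be written out directly. The sketch below follows the definitions given there (PAPi as pulmonary artery pulse pressure divided by CVP, and the gradient as mPAP minus CVP) and uses the reported 20.5 cutoff; the function names and pressure values are hypothetical, and the code is illustrative only.

```python
GRADIENT_CUTOFF = 20.5  # maximum mPAP-CVP gradient threshold from the abstract


def papi(pa_systolic, pa_diastolic, cvp):
    """Pulmonary artery pulsatility index: PA pulse pressure / CVP."""
    if cvp <= 0:
        raise ValueError("CVP must be positive to form the index")
    return (pa_systolic - pa_diastolic) / cvp


def mpap_cvp_gradient(mpap, cvp):
    """Novel index from the abstract: mean PA pressure minus CVP."""
    return mpap - cvp


def exceeds_cutoff(max_gradient, cutoff=GRADIENT_CUTOFF):
    """True when the maximum observed gradient is above the cutoff."""
    return max_gradient > cutoff


# Hypothetical post-operative readings in mmHg.
index = papi(pa_systolic=40, pa_diastolic=20, cvp=10)
gradients = [mpap_cvp_gradient(m, c) for m, c in [(28, 12), (35, 10), (30, 14)]]
flag = exceeds_cutoff(max(gradients))
print(index, gradients, flag)
```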
Affiliation(s)
- Ziyad O Knio
- Department of Anesthesiology, University of Virginia Health System, Charlottesville, VA, USA
- Robert H Thiele
- Department of Anesthesiology, University of Virginia Health System, Charlottesville, VA, USA
- W Zachary Wright
- Department of Anesthesiology, University of Virginia Health System, Charlottesville, VA, USA
- Sula Mazimba
- Department of Medicine, Division of Cardiovascular Medicine, University of Virginia Health System, Charlottesville, VA, USA
- Bhiken I Naik
- Department of Anesthesiology, University of Virginia Health System, Charlottesville, VA, USA; Department of Neurosurgery, University of Virginia Health System, Charlottesville, VA, USA
- Matthew C Hulse
- Department of Anesthesiology, University of Virginia Health System, Charlottesville, VA, USA
7
Choi Y, Park JH, Hong KJ, Ro YS, Song KJ, Shin SD. Development and validation of a prehospital-stage prediction tool for traumatic brain injury: a multicentre retrospective cohort study in Korea. BMJ Open 2022; 12:e055918. [PMID: 35022177 PMCID: PMC8756263 DOI: 10.1136/bmjopen-2021-055918] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/03/2022]
Abstract
OBJECTIVES: Predicting diagnosis and prognosis of traumatic brain injury (TBI) at the prehospital stage is challenging; however, using comprehensive prehospital information and machine learning may improve the performance of the predictive model. We developed and tested predictive models for TBI that use machine learning algorithms and information that can be obtained in the prehospital stage. DESIGN: This was a multicentre retrospective study. SETTING AND PARTICIPANTS: This study was conducted at three tertiary academic emergency departments (EDs) located in an urban area of South Korea. The data from adult patients with severe trauma who were assessed by emergency medical service providers and transported to three participating hospitals between 2014 and 2018 were analysed. RESULTS: We developed and tested five machine learning algorithms, namely logistic regression analyses, extreme gradient boosting, support vector machine, random forest and elastic net (EN), to predict TBI, TBI with intracranial haemorrhage or injury (TBI-I), TBI with ED or admission result of admission or transferred (TBI with non-discharge (TBI-ND)) and TBI with ED or admission result of death (TBI-D). A total of 1169 patients were included in the final analysis, and the proportions of TBI, TBI-I, TBI-ND and TBI-D were 24.0%, 21.5%, 21.3% and 3.7%, respectively. The EN model yielded an area under receiver-operator curve of 0.799 for TBI, 0.844 for TBI-I, 0.811 for TBI-ND and 0.871 for TBI-D. The EN model also yielded the highest specificity and significant reclassification improvement. Variables related to loss of consciousness, Glasgow Coma Scale and light reflex were the three most important variables to predict all outcomes. CONCLUSION: Our results inform the diagnosis and prognosis of TBI. Machine learning models resulted in significant performance improvement over that with logistic regression analyses, and the best performing model was EN.
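The area under the receiver-operator curve reported for each outcome can be read as the probability that a randomly chosen positive case receives a higher model score than a randomly chosen negative case, with ties counted as one half. The sketch below computes it by direct pairwise comparison; the scores are hypothetical, and the abstract does not say how the authors computed their values.

```python
def auc_from_scores(positive_scores, negative_scores):
    """AUC as the fraction of positive/negative pairs ranked correctly.

    Ties contribute 0.5. Quadratic in the number of cases, which is
    fine for illustration but not for large data sets.
    """
    if not positive_scores or not negative_scores:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical predicted probabilities for cases with and without the outcome.
auc = auc_from_scores([0.9, 0.8, 0.4], [0.7, 0.3, 0.2])
print(round(auc, 3))
```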
Affiliation(s)
- Yeongho Choi
- Laboratory of Emergency Medical Services, Seoul National University Hospital Biomedical Research Institute, Seoul, South Korea
- Department of Emergency Medicine, Seoul National University Hospital, Seoul, South Korea
- Jeong Ho Park
- Laboratory of Emergency Medical Services, Seoul National University Hospital Biomedical Research Institute, Seoul, South Korea
- Department of Emergency Medicine, Seoul National University Hospital, Seoul, South Korea
- Ki Jeong Hong
- Laboratory of Emergency Medical Services, Seoul National University Hospital Biomedical Research Institute, Seoul, South Korea
- Department of Emergency Medicine, Seoul National University Hospital, Seoul, South Korea
- Young Sun Ro
- Laboratory of Emergency Medical Services, Seoul National University Hospital Biomedical Research Institute, Seoul, South Korea
- Department of Emergency Medicine, Seoul National University Hospital, Seoul, South Korea
- Kyoung Jun Song
- Laboratory of Emergency Medical Services, Seoul National University Hospital Biomedical Research Institute, Seoul, South Korea
- Department of Emergency Medicine, Seoul Metropolitan Government Seoul National University Boramae Medical Center, Seoul, South Korea
- Sang Do Shin
- Laboratory of Emergency Medical Services, Seoul National University Hospital Biomedical Research Institute, Seoul, South Korea
- Department of Emergency Medicine, Seoul National University Hospital, Seoul, South Korea
8
Fu K, Li Y, Lv H, Wu W, Song J, Xu J. Development of a Model Predicting the Outcome of In Vitro Fertilization Cycles by a Robust Decision Tree Method. Front Endocrinol (Lausanne) 2022; 13:877518. [PMID: 36093079 PMCID: PMC9449728 DOI: 10.3389/fendo.2022.877518] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/16/2022] [Accepted: 06/21/2022] [Indexed: 11/20/2022]
Abstract
INTRODUCTION: Infertility is a worldwide problem. To evaluate the outcome of in vitro fertilization (IVF) treatment for infertility, many indicators need to be considered and the relations among indicators need to be studied. OBJECTIVES: To construct an IVF prediction model by a robust decision tree method and to find important factors and their interrelations. METHODS: IVF and intracytoplasmic sperm injection (ICSI) cycles between January 2010 and December 2020 in a women's hospital were collected. Comprehensive evaluation and examination of patients, the specific therapy strategy and the outcome of treatment were recorded. Variables were selected through the significance of 1-way analysis between the clinically pregnant group and the nonpregnant group and were then discretized. Then, a gradient boosting decision tree (GBDT) was used to construct the model to compute the score for predicting the rate of clinical pregnancy. RESULTS: Thirty-eight variables with significant differences were selected for binning, and thirty of them, in which the pregnancy rate varied across categories, were chosen to construct the model. The final score computed by the model predicted the clinical pregnancy rate well, with the area under the curve (AUC) value reaching 0.704 and the consistency reaching 98.1%. The number of two-pronuclear (2PN) embryos, age of the woman, AMH level, number of oocytes retrieved and endometrial thickness were important factors related to IVF outcome. Moreover, some interrelations among factors were found from the model, which may assist clinicians in making decisions. CONCLUSION: This study constructed a model predicting the outcome of IVF cycles through a robust decision tree method and achieved satisfactory prediction performance. Important factors related to IVF outcome and some interrelations among factors were found.
Affiliation(s)
- Kaiyou Fu
- First Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, China
- Women’s Hospital, School of Medicine, Zhejiang University, Hangzhou, China
- Yanrui Li
- School of Control Science and Engineering, Zhejiang University, Hangzhou, China
- Houyi Lv
- Women’s Hospital, School of Medicine, Zhejiang University, Hangzhou, China
- Fourth Affiliated Hospital, School of Medicine, Zhejiang University, Yiwu, China
- Wei Wu
- Fourth Affiliated Hospital, School of Medicine, Zhejiang University, Yiwu, China
- Jianyuan Song
- Fourth Affiliated Hospital, School of Medicine, Zhejiang University, Yiwu, China
- Jian Xu
- Women’s Hospital, School of Medicine, Zhejiang University, Hangzhou, China
- Fourth Affiliated Hospital, School of Medicine, Zhejiang University, Yiwu, China
- Correspondence: Jian Xu
9
Different Data Mining Approaches Based Medical Text Data. Journal of Healthcare Engineering 2021; 2021:1285167. [PMID: 34912530 PMCID: PMC8668297 DOI: 10.1155/2021/1285167] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/02/2021] [Accepted: 11/18/2021] [Indexed: 12/15/2022]
Abstract
The amount of medical text data is increasing dramatically. Medical text data record the progress of medicine and contain a large amount of medical knowledge. As natural language, they are semistructured, high-dimensional and high-volume, carry semantics, and cannot participate in arithmetic operations. Therefore, extracting useful knowledge or information from the total available data is a very important task. Various data mining techniques can extract valuable knowledge or information from data. In the current study, we reviewed different approaches that can be applied to medical text data mining. The advantages and shortcomings of each technique were analyzed across the different processes of medical text data mining. We also explored the applications of algorithms for providing insights to users and enabling them to use the resources for the specific challenges in medical text data. Further, the main challenges in medical text data mining were discussed. The findings of this paper can help researchers choose reasonable techniques for mining medical text data and present them with the main challenges in medical text data mining.
10
Shen S, Liu Z, Wang J, Fan L, Ji F, Tao J. Machine learning assisted Cameriere method for dental age estimation. BMC Oral Health 2021; 21:641. [PMID: 34911516 PMCID: PMC8672533 DOI: 10.1186/s12903-021-01996-0] [Citation(s) in RCA: 25] [Impact Index Per Article: 8.3] [Received: 08/25/2021] [Accepted: 11/24/2021] [Indexed: 11/23/2022]
Abstract
Background: Recently, the dental age estimation method developed by Cameriere has been widely recognized and accepted. Although machine learning (ML) methods can improve the accuracy of dental age estimation, no ML research exists on the Cameriere dental age estimation method, which makes this research innovative and meaningful. Aim: The purpose of this research is to use seven lower left permanent teeth and three models [random forest (RF), support vector machine (SVM), and linear regression (LR)] based on the Cameriere method to predict children's dental age, and to compare the results with the Cameriere age estimation. Subjects and methods: This was a retrospective study that collected and analyzed orthopantomograms of 748 children (356 females and 392 males) aged 5–13 years. Data were randomly divided into training and test datasets in an 80–20% proportion for the ML algorithms. The procedure, starting with randomly creating new training and test datasets, was repeated 20 times. Seven permanent developing teeth on the left mandible (except wisdom teeth) were recorded using the Cameriere method. Then, the traditional Cameriere formula and three models (RF, SVM, and LR) were used to estimate the dental age. The age prediction accuracy was measured by five indicators: the coefficient of determination (R²), mean error (ME), root mean square error (RMSE), mean square error (MSE), and mean absolute error (MAE). Results: The ML models were more accurate than the traditional Cameriere formula. The SVM model (ME, MAE, MSE, and RMSE of 0.004, 0.489, 0.392, and 0.625, respectively) and the RF model (−0.004, 0.495, 0.389, and 0.623, respectively) had the lowest errors and therefore the highest accuracy. In contrast, the ME, MAE, MSE, and RMSE of the European Cameriere formula were 0.592, 0.846, 0.755, and 0.869, respectively, and those of the Chinese Cameriere formula were 0.748, 0.812, 0.890, and 0.943, respectively. Conclusions: Compared to the Cameriere formula, ML methods based on Cameriere's maturation stages were more accurate in estimating dental age. These results support the use of ML algorithms instead of the traditional Cameriere formula. Supplementary Information: The online version contains supplementary material available at 10.1186/s12903-021-01996-0.
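For readers unfamiliar with the five error indicators named in this abstract, the following is a minimal Python sketch of how they are commonly computed. It is not the authors' code: the function name `error_metrics`, the sample ages, and the sign convention for ME (estimated minus chronological age) are assumptions made for illustration only.

```python
import math


def error_metrics(actual, predicted):
    """Return ME, MAE, MSE, RMSE and R^2 for paired age values (in years)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(actual)
    # Assumed sign convention: error = estimated age minus chronological age.
    errors = [p - a for a, p in zip(actual, predicted)]
    me = sum(errors) / n                      # mean error (signed bias)
    mae = sum(abs(e) for e in errors) / n     # mean absolute error
    mse = sum(e * e for e in errors) / n      # mean square error
    rmse = math.sqrt(mse)                     # root mean square error
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    # R^2 = 1 - SS_res / SS_tot; undefined when all actual values are equal.
    r2 = 1.0 - (mse * n) / ss_tot if ss_tot > 0 else float("nan")
    return {"ME": me, "MAE": mae, "MSE": mse, "RMSE": rmse, "R2": r2}


# Hypothetical chronological and estimated ages, for illustration only.
chronological = [6.0, 8.5, 10.0, 12.0]
estimated = [6.5, 8.0, 10.5, 11.5]
metrics = error_metrics(chronological, estimated)
```

In this toy example the over- and under-estimates cancel, so ME is zero while MAE and RMSE are both 0.5 years, which shows why a signed and an unsigned indicator are usually reported together.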
Affiliation(s)
- Shihui Shen
- Department of General Dentistry, Shanghai Ninth People's Hospital, Shanghai Jiao Tong University School of Medicine; College of Stomatology, Shanghai Jiao Tong University; National Center for Stomatology; National Clinical Research Center for Oral Diseases, Shanghai Key Laboratory of Stomatology, Shanghai, People's Republic of China
- Zihao Liu
- Department of Nuclear Medicine, Xin Hua Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai, People's Republic of China
- Jian Wang
- Department of General Dentistry, Shanghai Ninth People's Hospital, Shanghai Jiao Tong University School of Medicine; College of Stomatology, Shanghai Jiao Tong University; National Center for Stomatology; National Clinical Research Center for Oral Diseases, Shanghai Key Laboratory of Stomatology, Shanghai, People's Republic of China
- Linfeng Fan
- Department of Radiology, Shanghai Ninth People's Hospital, Shanghai Jiao Tong University School of Medicine; College of Stomatology, Shanghai Jiao Tong University; National Center for Stomatology; National Clinical Research Center for Oral Diseases, Shanghai Key Laboratory of Stomatology, Shanghai, People's Republic of China
- Fang Ji
- Department of Orthodontics, Shanghai Ninth People's Hospital, Shanghai Jiao Tong University School of Medicine; College of Stomatology, Shanghai Jiao Tong University; National Center for Stomatology; National Clinical Research Center for Oral Diseases, Shanghai Key Laboratory of Stomatology, Shanghai, People's Republic of China
- Jiang Tao
- Department of General Dentistry, Shanghai Ninth People's Hospital, Shanghai Jiao Tong University School of Medicine; College of Stomatology, Shanghai Jiao Tong University; National Center for Stomatology; National Clinical Research Center for Oral Diseases, Shanghai Key Laboratory of Stomatology, Shanghai, People's Republic of China
|
11
|
Zamanzadeh DJ, Petousis P, Davis TA, Nicholas SB, Norris KC, Tuttle KR, Bui AAT, Sarrafzadeh M. Autopopulus: A Novel Framework for Autoencoder Imputation on Large Clinical Datasets. ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. ANNUAL INTERNATIONAL CONFERENCE 2021; 2021:2303-2309. [PMID: 34891747 PMCID: PMC8862635 DOI: 10.1109/embc46164.2021.9630135] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
The adoption of electronic health records (EHRs) has made patient data increasingly accessible, precipitating the development of various clinical decision support systems and data-driven models to help physicians. However, missing data are common in EHR-derived datasets and can introduce significant uncertainty, or even invalidate the use of a predictive model. Machine learning (ML)-based imputation methods have shown promise in various domains for the task of estimating values and reducing uncertainty to the point that a predictive model can be employed. We introduce Autopopulus, a novel framework that enables the design and evaluation of various autoencoder architectures for efficient imputation on large datasets. Autopopulus implements existing autoencoder methods as well as a new technique that outputs a range of estimated values (rather than point estimates), and demonstrates a workflow that helps users make an informed decision on an appropriate imputation method. To further illustrate Autopopulus' utility, we use it to identify not only which imputation methods can most accurately impute on a large clinical dataset, but also which imputation methods enable downstream predictive models to achieve the best performance for prediction of chronic kidney disease (CKD) progression.
|
12
|
Alexandre L, Costa RS, Henriques R. DI2: prior-free and multi-item discretization of biological data and its applications. BMC Bioinformatics 2021; 22:426. [PMID: 34496758 PMCID: PMC8425008 DOI: 10.1186/s12859-021-04329-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 03/07/2021] [Accepted: 08/23/2021] [Indexed: 11/24/2022]
Abstract
Background: A considerable number of data mining approaches for biomedical data analysis, including state-of-the-art associative models, require a form of data discretization. Although diverse discretization approaches have been proposed, they generally work under a strict set of statistical assumptions which are arguably insufficient to handle the diversity and heterogeneity of clinical and molecular variables within a given dataset. In addition, although an increasing number of symbolic approaches in bioinformatics are able to assign multiple items to values occurring near discretization boundaries for superior robustness, there are no reference principles on how to perform multi-item discretizations. Results: In this study, an unsupervised discretization method, DI2, for variables with arbitrarily skewed distributions is proposed. Statistical tests applied to assess differences in performance confirm that DI2 generally outperforms well-established discretization methods with statistical significance. Within classification tasks, DI2 displays either competitive or superior levels of predictive accuracy, particularly for classifiers able to accommodate border values. Conclusions: This work proposes a new unsupervised method for data discretization, DI2, that takes into account the underlying data regularities, the presence of outlier values disrupting expected regularities, as well as the relevance of border values. DI2 is available at https://github.com/JupitersMight/DI2. Supplementary Information: The online version contains supplementary material available at 10.1186/s12859-021-04329-8.
Affiliation(s)
- Leonardo Alexandre
- IDMEC, Instituto Superior Técnico, Universidade de Lisboa, Av. Rovisco Pais, 1049-001, Lisbon, Portugal; INESC-ID, Lisbon, Portugal; Instituto Superior Técnico, Universidade de Lisboa, Lisbon, Portugal
- Rafael S Costa
- IDMEC, Instituto Superior Técnico, Universidade de Lisboa, Av. Rovisco Pais, 1049-001, Lisbon, Portugal; LAQV-REQUIMTE, DQ, NOVA School of Science and Technology, Universidade NOVA de Lisboa, 2829-516, Caparica, Portugal
- Rui Henriques
- INESC-ID, Lisbon, Portugal; Instituto Superior Técnico, Universidade de Lisboa, Lisbon, Portugal
|
13
|
Bbosa FF, Nabukenya J, Nabende P, Wesonga R. On the goodness of fit of parametric and non-parametric data mining techniques: the case of malaria incidence thresholds in Uganda. HEALTH AND TECHNOLOGY 2021. [DOI: 10.1007/s12553-021-00551-9] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 10/21/2022]
|
14
|
Evolutionary Algorithm for Improving Decision Tree with Global Discretization in Manufacturing. SENSORS 2021; 21:s21082849. [PMID: 33919558 PMCID: PMC8074051 DOI: 10.3390/s21082849] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 03/16/2021] [Revised: 04/13/2021] [Accepted: 04/15/2021] [Indexed: 11/17/2022]
Abstract
Due to recent advances in the industrial Internet of Things (IoT) in manufacturing, the vast amount of data from sensors has triggered the need for leveraging such big data for fault detection. In particular, interpretable machine learning techniques, such as tree-based algorithms, have drawn attention because of the need to implement reliable manufacturing systems and identify the root causes of faults. However, despite the high interpretability of decision trees, tree-based models make a trade-off between accuracy and interpretability. In order to improve the tree's performance while maintaining its interpretability, an evolutionary algorithm for discretization of multiple attributes, called Decision tree Improved by Multiple sPLits with Evolutionary algorithm for Discretization (DIMPLED), is proposed. The experimental results with two real-world datasets from sensors showed that the decision tree improved by DIMPLED outperformed the single-decision-tree models (C4.5 and CART) that are widely used in practice, and it proved competitive compared to the ensemble methods, which have multiple decision trees. Even though the ensemble methods could produce slightly better performances, the proposed DIMPLED has a more interpretable structure, while maintaining an appropriate performance level.
|
15
|
Cost and Complications of Single-Level Lumbar Decompression in Those Over and Under 75: A Matched Comparison. Spine (Phila Pa 1976) 2021; 46:29-34. [PMID: 32925688 DOI: 10.1097/brs.0000000000003686] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 02/01/2023]
Abstract
STUDY DESIGN: Retrospective database analysis. OBJECTIVE: This study aimed to compare costs and complication rates following single-level lumbar decompression in patients under age 75 versus patients aged 75 and older. SUMMARY OF BACKGROUND DATA: Lumbar decompression is a common surgical treatment for lumbar pathology; however, its effectiveness can be debated in elderly patients because complication rates and costs by age group are not well-defined. METHODS: The Medicare database was queried through the PearlDiver server for patients who underwent single-level lumbar decompression without fusion as an index procedure. The 90-day complication and reoperation rates were compared between age groups after matching for sex and comorbidity burden. Same-day and 90-day costs were also compared. RESULTS: The matched cohort included 89,388 total patients (n = 44,694 for each study arm). Compared to the under 75 age group, the 75 and older age group had greater rates of deep venous thrombosis (odds ratio [OR] 1.443, P = 0.042) and dural tear (OR 1.560, P = 0.043), and a lower rate of seroma complicating the procedure (OR 0.419, P = 0.009). There was no difference in overall 90-day reoperation rate in patients under age 75 versus patients aged 75 and older (9.66% vs. 9.28%, P = 0.051), although the 75 and older age group had a greater rate of laminectomy without discectomy (CPT-63047; OR 1.175, P < 0.001), while having a lower rate of laminotomy with discectomy (CPT-63042 and CPT-63030; OR 0.727 and 0.867, respectively, P = 0.013 and <0.001, respectively). The 75 and older age group had greater same-day ($3329.24 vs. $3138.05, P < 0.001) and 90-day ($5014.82 vs. $4749.44, P < 0.001) mean reimbursement. CONCLUSION: Elderly patients experience greater rates of select perioperative complications, with mildly increased costs. There is no significant difference in overall 90-day reoperation rates. LEVEL OF EVIDENCE: 3.
|
16
|
Wang L, Tong L, Davis D, Arnold T, Esposito T. The application of unsupervised deep learning in predictive models using electronic health records. BMC Med Res Methodol 2020; 20:37. [PMID: 32101147 PMCID: PMC7043035 DOI: 10.1186/s12874-020-00923-1] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 12/31/2018] [Accepted: 02/12/2020] [Indexed: 11/18/2022]
Abstract
Background: The main goal of this study is to explore the use of features representing patient-level electronic health record (EHR) data, generated by the unsupervised deep learning algorithm autoencoder, in predictive modeling. Since autoencoder features are unsupervised, this paper focuses on their general lower-dimensional representation of EHR information in a wide variety of predictive tasks. Methods: We compare the model with autoencoder features to traditional models: a logistic model with least absolute shrinkage and selection operator (LASSO) and the Random Forest algorithm. In addition, we include a predictive model using a small subset of response-specific variables (Simple Reg) and a model combining these variables with features from autoencoder (Enhanced Reg). We performed the study first on simulated data that mimics real-world EHR data and then on actual EHR data from eight Advocate hospitals. Results: On simulated data with incorrect categories and missing data, the precision for autoencoder is 24.16% when fixing recall at 0.7, which is higher than Random Forest (23.61%) and lower than LASSO (25.32%). The precision is 20.92% in Simple Reg and improves to 24.89% in Enhanced Reg. When using real EHR data to predict the 30-day readmission rate, the precision of autoencoder is 19.04%, which again is higher than Random Forest (18.48%) and lower than LASSO (19.70%). The precisions for Simple Reg and Enhanced Reg are 18.70% and 19.69%, respectively. That is, Enhanced Reg can have competitive prediction performance compared to LASSO. In addition, results show that Enhanced Reg usually relies on fewer features under the simulation settings of this paper. Conclusions: We conclude that autoencoder can create useful features that represent the entire space of EHR data and are applicable to a wide array of predictive tasks. Together with important response-specific predictors, we can derive efficient and robust predictive models with less labor in data extraction and model training.
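The comparison metric in this abstract, precision at a fixed recall of 0.7, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `precision_at_recall`, the toy labels and scores, and the simplification that score ties are ignored are all assumptions made for illustration.

```python
def precision_at_recall(labels, scores, target_recall=0.7):
    """Precision at the highest score cutoff whose recall reaches target_recall.

    labels: 0/1 ground-truth outcomes; scores: predicted risk, higher = riskier.
    Simplification (assumption): tied scores are not grouped together.
    """
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("need at least one positive label")
    # Rank cases from highest to lowest predicted risk.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    true_pos = 0
    for flagged, (_, label) in enumerate(ranked, start=1):
        true_pos += label
        if true_pos / total_pos >= target_recall:
            # Precision among the top `flagged` cases at this cutoff.
            return true_pos / flagged
    raise ValueError("target_recall must be at most 1")


# Hypothetical outcomes and risk scores, for illustration only.
toy_labels = [1, 0, 1, 1, 0, 0, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
toy_precision = precision_at_recall(toy_labels, toy_scores, target_recall=0.7)
```

Fixing recall first and then reading off precision puts models that produce differently scaled scores on a common footing, which is presumably why the abstract reports precision this way.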
Affiliation(s)
- Lei Wang
- School of Statistics, Renmin University of China, 59 Zhong Guan Cun Ave, Hai Dian District, Beijing, People's Republic of China; Department of Mathematics, Statistics, and Computer Science, University of Illinois at Chicago, 851 S Morgan St, Chicago, IL, 60607, USA
- Liping Tong
- Advocate Aurora Health, 3075 Highland Parkway, Downers Grove, IL, 60515, USA
- Darcy Davis
- Advocate Aurora Health, 3075 Highland Parkway, Downers Grove, IL, 60515, USA
- Tim Arnold
- Cerner Corporation, 2800 Rockcreek Parkway, North Kansas City, MO, 64117, USA
- Tina Esposito
- Advocate Aurora Health, 3075 Highland Parkway, Downers Grove, IL, 60515, USA
|
17
|
Rodriguez-Morilla B, Estivill E, Estivill-Domènech C, Albares J, Segarra F, Correa A, Campos M, Rol MA, Madrid JA. Application of Machine Learning Methods to Ambulatory Circadian Monitoring (ACM) for Discriminating Sleep and Circadian Disorders. Front Neurosci 2019; 13:1318. [PMID: 31920488 PMCID: PMC6916421 DOI: 10.3389/fnins.2019.01318] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 03/29/2019] [Accepted: 11/25/2019] [Indexed: 12/20/2022]
Abstract
The present study proposes a classification model for the differential diagnosis of primary insomnia (PI) and delayed sleep phase disorder (DSPD), applying machine learning methods to circadian parameters obtained from ambulatory circadian monitoring (ACM). Nineteen healthy controls and 242 patients (PI = 184; DSPD = 58) were selected for a retrospective and non-interventional study from an anonymized Circadian Health Database (https://kronowizard.um.es/). ACM records wrist temperature (T), motor activity (A), body position (P), and environmental light exposure (L) rhythms during a whole week. Sleep was inferred from the integrated variable TAP (from temperature, activity, and position). Non-parametric analyses of TAP and estimated sleep yielded indexes of interdaily stability (IS), intradaily variability (IV), relative amplitude (RA), and a global circadian function index (CFI). Mid-sleep and mid-wake times were estimated from the central time of TAP-L5 (five consecutive hours of lowest values) and TAP-M10 (10 consecutive hours of maximum values), respectively. The most discriminative parameters, determined by ANOVA, Chi-squared, and information gain criteria analysis, were employed to build a decision tree, using machine learning. This model differentiated between healthy controls, DSPD and three insomnia subgroups (compatible with onset, maintenance and mild insomnia), with accuracy, sensitivity, and AUC >85%. In conclusion, circadian parameters can be reliably and objectively used to discriminate and characterize different sleep and circadian disorders, such as DSPD and OI, which are commonly confounded, and between different subtypes of PI. Our findings highlight the importance of considering circadian rhythm assessment in sleep medicine.
Affiliation(s)
- Beatriz Rodriguez-Morilla
- Laboratory of Chronobiology, IMIB-Arrixaca, Department of Physiology, Centro de Investigación Biomédica en Red de Fragilidad y Envejecimiento Saludable, Instituto de Salud Carlos III, University of Murcia, Murcia, Spain
- Javier Albares
- Medicina del Sueño Doctor Albares, Centro Médico Teknon, Barcelona, Spain
- Angel Correa
- Department of Experimental Psychology, Faculty of Psychology, University of Granada, Granada, Spain
- Manuel Campos
- Laboratory of Chronobiology, IMIB-Arrixaca, Department of Physiology, Centro de Investigación Biomédica en Red de Fragilidad y Envejecimiento Saludable, Instituto de Salud Carlos III, University of Murcia, Murcia, Spain
- Department of Computing and Systems, Faculty of Computer Science, University of Murcia, Murcia, Spain
- Maria Angeles Rol
- Laboratory of Chronobiology, IMIB-Arrixaca, Department of Physiology, Centro de Investigación Biomédica en Red de Fragilidad y Envejecimiento Saludable, Instituto de Salud Carlos III, University of Murcia, Murcia, Spain
- Juan Antonio Madrid
- Laboratory of Chronobiology, IMIB-Arrixaca, Department of Physiology, Centro de Investigación Biomédica en Red de Fragilidad y Envejecimiento Saludable, Instituto de Salud Carlos III, University of Murcia, Murcia, Spain
|
18
|
Prediction of good neurological recovery after out-of-hospital cardiac arrest: A machine learning analysis. Resuscitation 2019; 142:127-135. [PMID: 31362082 DOI: 10.1016/j.resuscitation.2019.07.020] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 03/01/2019] [Revised: 06/28/2019] [Accepted: 07/16/2019] [Indexed: 01/28/2023]
Abstract
BACKGROUND: This study aimed to train, validate and compare predictive models that use machine learning analysis for good neurological recovery in OHCA patients. METHODS: Adult OHCA patients who had a presumed cardiac etiology and a sustained return of spontaneous circulation between 2013 and 2016 were analyzed; 80% of the individuals were used for training and 20% for validation. We developed models using six machine learning algorithms: logistic regression (LR), extreme gradient boosting (XGB), support vector machine, random forest, elastic net (EN), and neural network. Variables that could be obtained within 24 hours of the emergency department visit were used. The area under the receiver operating characteristic curve (AUROC) was calculated to assess discrimination. Calibration was assessed by the Hosmer-Lemeshow test. Reclassification was assessed by using the continuous net reclassification index (NRI). RESULTS: A total of 19,860 OHCA patients were included in the analysis. Of the 15,888 patients in the training group, 2228 (14.0%) had a good neurological recovery; of the 3972 patients in the validation group, 577 (14.5%) had a good neurological recovery. The LR, XGB, and EN models showed the highest discrimination power (AUROC 0.949, 95% CI 0.941-0.957, for all three), and all three models were well calibrated (Hosmer-Lemeshow test: p >0.05). The XGB model reclassified patients according to their true risk better than the LR model (NRI: 0.110), but the EN model reclassified patients worse than the LR model (NRI: -1.239). CONCLUSION: The best performing machine learning algorithms were XGB and LR.
|
19
|
Unobtrusive Mattress-Based Identification of Hypertension by Integrating Classification and Association Rule Mining. SENSORS 2019; 19:s19071489. [PMID: 30934719 PMCID: PMC6480150 DOI: 10.3390/s19071489] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 01/16/2019] [Revised: 03/13/2019] [Accepted: 03/22/2019] [Indexed: 11/25/2022]
Abstract
Hypertension is one of the most common cardiovascular diseases, which will cause severe complications if not treated in a timely way. Early and accurate identification of hypertension is essential to prevent the condition from deteriorating further. As a kind of complex physiological state, hypertension is hard to characterize accurately. However, most existing hypertension identification methods usually extract features only from limited aspects such as the time-frequency domain or non-linear domain. It is difficult for them to characterize hypertension patterns comprehensively, which results in limited identification performance. Furthermore, existing methods can only determine whether the subjects suffer from hypertension, but they cannot give additional useful information about the patients’ condition. For example, their classification results cannot explain why the subjects are hypertensive, which is not conducive to further analyzing the patient’s condition. To this end, this paper proposes a novel hypertension identification method by integrating classification and association rule mining. Its core idea is to exploit the association relationship among multi-dimension features to distinguish hypertensive patients from normotensive subjects. In particular, the proposed method can not only identify hypertension accurately, but also generate a set of class association rules (CARs). The CARs are proved to be able to reflect the subject’s physiological status. Experimental results based on a real dataset indicate that the proposed method outperforms two state-of-the-art methods and three common classifiers, and achieves 84.4%, 82.5% and 85.3% in terms of accuracy, precision and recall, respectively.
|
20
|
Nagasato D, Tabuchi H, Ohsugi H, Masumoto H, Enno H, Ishitobi N, Sonobe T, Kameoka M, Niki M, Mitamura Y. Deep-learning classifier with ultrawide-field fundus ophthalmoscopy for detecting branch retinal vein occlusion. Int J Ophthalmol 2019; 12:94-99. [PMID: 30662847 DOI: 10.18240/ijo.2019.01.15] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 08/18/2018] [Accepted: 12/06/2018] [Indexed: 12/27/2022]
Abstract
AIM: To investigate and compare the efficacy of two machine-learning technologies, deep learning (DL) and support vector machine (SVM), for the detection of branch retinal vein occlusion (BRVO) using ultrawide-field fundus images. METHODS: This study included 237 images from 236 patients with BRVO (mean ± standard deviation age, 66.3 ± 10.6 y) and 229 images from 176 non-BRVO healthy subjects (mean age, 64.9 ± 9.4 y). The DL model was constructed by training a deep convolutional neural network on ultrawide-field fundus images. The sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV) and area under the curve (AUC) were calculated to compare the diagnostic abilities of the DL and SVM models. RESULTS: For the DL model, the sensitivity, specificity, PPV, NPV and AUC for diagnosing BRVO were 94.0% (95%CI: 93.8%-98.8%), 97.0% (95%CI: 89.7%-96.4%), 96.5% (95%CI: 94.3%-98.7%), 93.2% (95%CI: 90.5%-96.0%) and 0.976 (95%CI: 0.960-0.993), respectively. In contrast, for the SVM model, these values were 80.5% (95%CI: 77.8%-87.9%), 84.3% (95%CI: 75.8%-86.1%), 83.5% (95%CI: 78.4%-88.6%), 75.2% (95%CI: 72.1%-78.3%) and 0.857 (95%CI: 0.811-0.903), respectively. The DL model outperformed the SVM model in all the aforementioned parameters (P<0.001). CONCLUSION: These results indicate that the combination of the DL model and ultrawide-field fundus ophthalmoscopy may distinguish between healthy and BRVO eyes with a high level of accuracy. The proposed combination may be used for automatically diagnosing BRVO in patients residing in remote areas lacking access to an ophthalmic medical center.
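As a reminder of how the four threshold-based indicators in this abstract relate to a confusion matrix, here is a minimal Python sketch. It does not reproduce the study's data or code: the function name `diagnostic_metrics` and the counts are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, PPV and NPV from confusion-matrix counts."""
    if min(tp + fn, tn + fp, tp + fp, tn + fn) == 0:
        raise ValueError("each denominator must be non-zero")
    return {
        "sensitivity": tp / (tp + fn),  # diseased eyes correctly flagged
        "specificity": tn / (tn + fp),  # healthy eyes correctly cleared
        "ppv": tp / (tp + fp),          # flagged eyes that are truly diseased
        "npv": tn / (tn + fn),          # cleared eyes that are truly healthy
    }


# Hypothetical counts, for illustration only (not taken from the study).
example = diagnostic_metrics(tp=90, fp=20, tn=80, fn=10)
```

Sensitivity and specificity depend only on the classifier, whereas PPV and NPV also depend on how common the disease is in the sample, so the latter two would shift if the same model were applied to a population with a different BRVO prevalence.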
Affiliation(s)
- Daisuke Nagasato
- Department of Ophthalmology, Saneikai Tsukazaki Hospital, Himeji 6711227, Japan
- Hitoshi Tabuchi
- Department of Ophthalmology, Saneikai Tsukazaki Hospital, Himeji 6711227, Japan
- Hideharu Ohsugi
- Department of Ophthalmology, Saneikai Tsukazaki Hospital, Himeji 6711227, Japan
- Hiroki Masumoto
- Department of Ophthalmology, Saneikai Tsukazaki Hospital, Himeji 6711227, Japan
- Naofumi Ishitobi
- Department of Ophthalmology, Saneikai Tsukazaki Hospital, Himeji 6711227, Japan
- Tomoaki Sonobe
- Department of Ophthalmology, Saneikai Tsukazaki Hospital, Himeji 6711227, Japan
- Masahiro Kameoka
- Department of Ophthalmology, Saneikai Tsukazaki Hospital, Himeji 6711227, Japan
- Masanori Niki
- Department of Ophthalmology, Institute of Biomedical Sciences, Tokushima University Graduate School, Tokushima 7708503, Japan
- Yoshinori Mitamura
- Department of Ophthalmology, Institute of Biomedical Sciences, Tokushima University Graduate School, Tokushima 7708503, Japan
|
21
|
Deep Neural Network-Based Method for Detecting Central Retinal Vein Occlusion Using Ultrawide-Field Fundus Ophthalmoscopy. J Ophthalmol 2018; 2018:1875431. [PMID: 30515316 PMCID: PMC6236766 DOI: 10.1155/2018/1875431] [Citation(s) in RCA: 34] [Impact Index Per Article: 5.7] [Received: 09/05/2018] [Accepted: 10/17/2018] [Indexed: 11/17/2022]
Abstract
The aim of this study is to assess the performance of two machine-learning technologies, namely, deep learning (DL) and support vector machine (SVM) algorithms, for detecting central retinal vein occlusion (CRVO) in ultrawide-field fundus images. Images from 125 CRVO patients (n=125 images) and 202 non-CRVO normal subjects (n=238 images) were included in this study. The DL model was constructed by training deep convolutional neural network algorithms on ultrawide-field fundus images. The SVM was implemented with the scikit-learn library using a radial basis function kernel. The diagnostic abilities of DL and the SVM were compared by assessing their sensitivity, specificity, and area under the curve (AUC) of the receiver operating characteristic curve for CRVO. For diagnosing CRVO, the DL model had a sensitivity of 98.4% (95% confidence interval (CI), 94.3–99.8%) and a specificity of 97.9% (95% CI, 94.6–99.1%) with an AUC of 0.989 (95% CI, 0.980–0.999). In contrast, the SVM model had a sensitivity of 84.0% (95% CI, 76.3–89.3%) and a specificity of 87.5% (95% CI, 82.7–91.1%) with an AUC of 0.895 (95% CI, 0.859–0.931). Thus, the DL model outperformed the SVM model in all indices assessed (P < 0.001 for all). Our data suggest that a DL model derived using ultrawide-field fundus images could distinguish between normal and CRVO images with a high level of accuracy and that automatic CRVO detection in ultrawide-field fundus ophthalmoscopy is possible. This proposed DL-based model can also be used in ultrawide-field fundus ophthalmoscopy to accurately diagnose CRVO and improve medical care in remote locations where it is difficult for patients to attend an ophthalmic medical center.
|
22
|
The Impact of Risk Standardization on Variation in CT Use and Emergency Physician Profiling. AJR Am J Roentgenol 2018; 211:392-399. [PMID: 29975119 DOI: 10.2214/ajr.17.19188] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/18/2022]
Abstract
OBJECTIVE: The purpose of this study is to use detailed electronic health record data to profile the use of condition-specific, risk-standardized imaging by emergency physicians. MATERIALS AND METHODS: CT utilization in four emergency departments in a single health care system was retrospectively analyzed. The primary outcome for analysis was indication-specific, risk-standardized CT utilization. We constructed seven clinical cohorts on the basis of the presence or absence of a traumatic indication for the most frequently performed CT studies. Risk standardization was performed using machine learning algorithms and hierarchic logistic regression models. Variation in CT utilization for each cohort was analyzed using coefficients of variation and box plots, the effect of risk standardization on physician profiling was determined using slope diagrams and kappa values, and within-physician correlation was assessed using correlation coefficients and matrices. RESULTS: For the seven cohorts, the number of physicians ordering more than 25 CT studies for a particular indication ranged from 70 to 88, and the number of ED visits ranged from 17,458 to 117,489. The unadjusted variation was large for each indication (coefficient of variation, 30.2-57.9). Risk standardization resulted in reduced but persistent variation for all indications (coefficient of variation, 12.3-22.3). Among indication-specific models, risk standardization resulted in reclassification by two or more deciles for 14.0-39.1% of physicians. The R value for within-physician correlation varied from 0.02 to 0.80 and was highest between chest and abdominal imaging for trauma. CONCLUSION: In this multisite study of CT utilization, risk standardization had a substantial impact on variation in CT utilization and emergency physician profiling. Administrators and payers should include risk standardization in future measures of physician imaging to ensure valid assessment of performance and achieve improvements in emergency care value.
|
23
|
Taylor RA, Moore CL, Cheung KH, Brandt C. Predicting urinary tract infections in the emergency department with machine learning. PLoS One 2018. [PMID: 29513742 PMCID: PMC5841824 DOI: 10.1371/journal.pone.0194085] [Citation(s) in RCA: 100] [Impact Index Per Article: 16.7] [Indexed: 11/18/2022]
Abstract
BACKGROUND: Urinary tract infection (UTI) is a common emergency department (ED) diagnosis with reported high diagnostic error rates. Because a urine culture, part of the gold standard for diagnosis of UTI, is usually not available for 24-48 hours after an ED visit, diagnosis and treatment decisions are based on symptoms, physical findings, and other laboratory results, potentially leading to overutilization, antibiotic resistance, and delayed treatment. Previous research has demonstrated inadequate diagnostic performance for both individual laboratory tests and prediction tools. OBJECTIVE: Our aim was to train, validate, and compare machine-learning based predictive models for UTI in a large diverse set of ED patients. METHODS: Single-center, multi-site, retrospective cohort analysis of 80,387 adult ED visits with urine culture results and UTI symptoms. We developed models for UTI prediction with six machine learning algorithms using demographic information, vitals, laboratory results, medications, past medical history, chief complaint, and structured historical and physical exam findings. Models were developed with both the full set of 211 variables and a reduced set of 10 variables. UTI predictions were compared between models and to proxies of provider judgment (documentation of UTI diagnosis and antibiotic administration). RESULTS: The machine learning models had an area under the curve ranging from 0.826-0.904, with extreme gradient boosting (XGBoost) the top performing algorithm for both full and reduced models. The XGBoost full and reduced models demonstrated greatly improved specificity when compared to the provider judgment proxy of UTI diagnosis OR antibiotic administration with specificity differences of 33.3 (31.3-34.3) and 29.6 (28.5-30.6), while also demonstrating superior sensitivity when compared to documentation of UTI diagnosis with sensitivity differences of 38.7 (38.1-39.4) and 33.2 (32.5-33.9). In the admission and discharge cohorts using the full XGBoost model, approximately 1 in 4 patients (4109/15855) would be re-categorized from a false positive to a true negative and approximately 1 in 11 patients (1372/15855) would be re-categorized from a false negative to a true positive. CONCLUSION: The best performing machine learning algorithm, XGBoost, accurately diagnosed positive urine culture results, and outperformed previously developed models in the literature and several proxies for provider judgment. Future prospective validation is warranted.
Affiliation(s)
- R. Andrew Taylor
- Department of Emergency Medicine, Yale University School of Medicine, New Haven CT, United States of America
- * E-mail:
| | - Christopher L. Moore
- Department of Emergency Medicine, Yale University School of Medicine, New Haven CT, United States of America
| | - Kei-Hoi Cheung
- Department of Emergency Medicine, Yale University School of Medicine, New Haven CT, United States of America
| | - Cynthia Brandt
- Department of Emergency Medicine, Yale University School of Medicine, New Haven CT, United States of America
|
24
|
Edmunds K, Gíslason M, Sigurðsson S, Guðnason V, Harris T, Carraro U, Gargiulo P. Advanced quantitative methods in correlating sarcopenic muscle degeneration with lower extremity function biometrics and comorbidities. PLoS One 2018. [PMID: 29513690 PMCID: PMC5841751 DOI: 10.1371/journal.pone.0193241] [Citation(s) in RCA: 37] [Impact Index Per Article: 6.2] [Indexed: 12/25/2022] Open
Abstract
Sarcopenic muscular degeneration has been consistently identified as an independent risk factor for mortality in aging populations. Recent investigations have realized the quantitative potential of computed tomography (CT) image analysis to describe skeletal muscle volume and composition; however, the optimum approach to assessing these data remains debated. Current literature reports average Hounsfield unit (HU) values and/or segmented soft tissue cross-sectional areas to investigate muscle quality. However, standardized methods for CT analyses and their utility as a comorbidity index remain undefined, and no existing studies compare these methods to the assessment of entire radiodensitometric distributions. The primary aim of this study was to present a comparison of nonlinear trimodal regression analysis (NTRA) parameters of entire radiodensitometric muscle distributions against extant CT metrics and their correlation with lower extremity function (LEF) biometrics (normal/fast gait speed, timed up-and-go, and isometric leg strength) and biochemical and nutritional parameters, such as total solubilized cholesterol (SCHOL) and body mass index (BMI). Data were obtained from 3,162 subjects, aged 66–96 years, from the population-based AGES-Reykjavik Study. 1-D k-means clustering was employed to discretize each biometric and comorbidity dataset into twelve subpopulations, in accordance with Sturges’ Formula for Class Selection. Dataset linear regressions were performed against eleven NTRA distribution parameters and standard CT analyses (fat/muscle cross-sectional area and average HU value). Parameters from NTRA and CT standards were analogously assembled by age and sex. Analysis of specific NTRA parameters with standard CT results showed linear correlation coefficients greater than 0.85, but multiple regression analysis of correlative NTRA parameters yielded a correlation coefficient of 0.99 (P<0.005). 
These results highlight the specificities of each muscle quality metric to LEF biometrics, SCHOL, and BMI, and particularly highlight the value of the connective tissue regime in this regard.
Affiliation(s)
- Kyle Edmunds
- Institute for Biomedical and Neural Engineering, Reykjavík University, Reykjavík, Iceland
- * E-mail:
| | - Magnús Gíslason
- Institute for Biomedical and Neural Engineering, Reykjavík University, Reykjavík, Iceland
| | | | - Vilmundur Guðnason
- Icelandic Heart Association (Hjartavernd), Kópavogur, Iceland
- Faculty of Medicine, University of Iceland, Reykjavík, Iceland
| | - Tamara Harris
- Laboratory of Epidemiology and Population Sciences, National Institute on Aging, Bethesda, MD, United States of America
| | - Ugo Carraro
- IRCCS Fondazione Ospedale San Camillo, Venezia, Italy
| | - Paolo Gargiulo
- Institute for Biomedical and Neural Engineering, Reykjavík University, Reykjavík, Iceland
- Department of Rehabilitation, Landspítali, Reykjavík, Iceland
|
25
|
Rajappan S, Rangasamy D. Estimation of incomplete values in heterogeneous attribute large datasets using discretized Bayesian max–min ant colony optimization. Knowl Inf Syst 2017. [DOI: 10.1007/s10115-017-1123-4] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 11/28/2022]
|
26
|
Towards a Predictive Analytics-Based Intelligent Malaria Outbreak Warning System. APPLIED SCIENCES-BASEL 2017. [DOI: 10.3390/app7080836] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Indexed: 01/06/2023]
|
27
|
Casanova IJ, Campos M, Juarez JM, Fernandez-Fernandez-Arroyo A, Lorente JA. Impact of time series discretization on intensive care burn unit survival classification. PROGRESS IN ARTIFICIAL INTELLIGENCE 2017. [DOI: 10.1007/s13748-017-0130-8] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 12/01/2022]
|
28
|
Gómez I, Ribelles N, Franco L, Alba E, Jerez JM. Supervised discretization can discover risk groups in cancer survival analysis. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2016; 136:11-19. [PMID: 27686699 DOI: 10.1016/j.cmpb.2016.08.006] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 11/25/2015] [Revised: 07/07/2016] [Accepted: 08/12/2016] [Indexed: 06/06/2023]
Abstract
Discretization of continuous variables is a common practice in medical research to identify patient risk groups. This work compares the performance of gold-standard categorization procedures (TNM+A protocol) with that of three supervised discretization methods from Machine Learning (CAIM, ChiM and DTree) in the stratification of patients with breast cancer. The performance of the discretization algorithms was evaluated based on the results obtained after applying standard survival analysis procedures such as Kaplan-Meier curves, Cox regression and predictive modelling. The results show that applying alternative discretization algorithms could give clinicians valuable information for the diagnosis and outcome of the disease. Patient data were collected from the Medical Oncology Service of the Hospital Clínico Universitario (Málaga, Spain) considering a follow-up period from 1982 to 2008.
Affiliation(s)
- Iván Gómez
- Computer Science Department, University of Málaga, Campus de Teatinos S/N, 29071 Málaga, Spain; Málaga Biomedical Research Institute (IBIMA), Málaga, Spain.
| | - Nuria Ribelles
- Málaga Biomedical Research Institute (IBIMA), Málaga, Spain; Virgen de la Victoria Oncology Service, Málaga, Campus de Teatinos S/N, 29071 Málaga, Spain
| | - Leonardo Franco
- Computer Science Department, University of Málaga, Campus de Teatinos S/N, 29071 Málaga, Spain; Málaga Biomedical Research Institute (IBIMA), Málaga, Spain
| | - Emilio Alba
- Málaga Biomedical Research Institute (IBIMA), Málaga, Spain; Virgen de la Victoria Oncology Service, Málaga, Campus de Teatinos S/N, 29071 Málaga, Spain
| | - José M Jerez
- Computer Science Department, University of Málaga, Campus de Teatinos S/N, 29071 Málaga, Spain; Málaga Biomedical Research Institute (IBIMA), Málaga, Spain
|
29
|
Ni Y, Beck AF, Taylor R, Dyas J, Solti I, Grupp-Phelan J, Dexheimer JW. Will they participate? Predicting patients' response to clinical trial invitations in a pediatric emergency department. J Am Med Inform Assoc 2016; 23:671-80. [PMID: 27121609 PMCID: PMC4926740 DOI: 10.1093/jamia/ocv216] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Received: 07/27/2015] [Accepted: 12/30/2015] [Indexed: 12/27/2022] Open
Abstract
Objective: (1) To develop an automated algorithm to predict a patient’s response (i.e., whether the patient agrees or declines) before he/she is approached for a clinical trial invitation; (2) to assess the algorithm performance and the predictors on real-world patient recruitment data for a diverse set of clinical trials in a pediatric emergency department; and (3) to identify directions for future studies in predicting patients’ participation response. Materials and Methods: We collected 3345 patients’ responses to trial invitations for 18 clinical trials at one center that were actively enrolling patients between January 1, 2010 and December 31, 2012. In parallel, we retrospectively extracted demographic, socioeconomic, and clinical predictors from multiple sources to represent the patients’ profiles. Leveraging machine learning methodology, the automated algorithms predicted participation response for individual patients and identified influential features associated with their decision-making. The performance was validated on the collection of actual patient responses, where precision, recall, F-measure, and area under the ROC curve were assessed. Results: Compared to the random response predictor that simulated the current practice, the machine learning algorithms achieved significantly better performance (Precision/Recall/F-measure/area under the ROC curve: 70.82%/92.02%/80.04%/72.78% on 10-fold cross validation and 71.52%/92.68%/80.74%/75.74% on the test set). By analyzing the significant features output by the algorithms, the study confirmed several literature findings and identified challenges that could be mitigated to optimize recruitment. Conclusion: By exploiting predictive variables from multiple sources, we demonstrated that machine learning algorithms have great potential in improving the effectiveness of the recruitment process by automatically predicting patients’ participation response to trial invitations.
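The reported F-measures can be reproduced from the reported precision and recall. The sketch below assumes the abstract's "F-measure" is the balanced F1 score (the harmonic mean of precision and recall); the abstract does not state the weighting, and the function name `f_measure` is illustrative only.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall.

    Assumption: the abstract's F-measure weights precision and recall equally.
    """
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract, expressed as fractions.
cv_f = f_measure(0.7082, 0.9202)    # 10-fold cross validation
test_f = f_measure(0.7152, 0.9268)  # held-out test set

print(f"{cv_f:.2%}")    # 80.04%
print(f"{test_f:.2%}")  # 80.74%
```

Both values agree with the figures in the abstract to two decimal places, which is consistent with the F1 assumption.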
Affiliation(s)
- Yizhao Ni
- Department of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH 45229-3039, USA
| | - Andrew F Beck
- Division of General and Community Pediatrics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH 45229-3039, USA
| | - Regina Taylor
- Division of Emergency Medicine, Cincinnati Children's Hospital Medical Center, Cincinnati, OH 45229-3039, USA
| | - Jenna Dyas
- Division of Emergency Medicine, Cincinnati Children's Hospital Medical Center, Cincinnati, OH 45229-3039, USA
| | - Imre Solti
- Department of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH 45229-3039, USA
| | - Jacqueline Grupp-Phelan
- Division of Emergency Medicine, Cincinnati Children's Hospital Medical Center, Cincinnati, OH 45229-3039, USA
| | - Judith W Dexheimer
- Department of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH 45229-3039, USA; Division of Emergency Medicine, Cincinnati Children's Hospital Medical Center, Cincinnati, OH 45229-3039, USA
|
30
|
Metting EI, in ’t Veen JC, Dekhuijzen PR, van Heijst E, Kocks JW, Muilwijk-Kroes JB, Chavannes NH, van der Molen T. Development of a diagnostic decision tree for obstructive pulmonary diseases based on real-life data. ERJ Open Res 2016; 2:00077-2015. [PMID: 27730177 PMCID: PMC5005160 DOI: 10.1183/23120541.00077-2015] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Received: 10/27/2015] [Accepted: 11/21/2015] [Indexed: 11/05/2022] Open
Abstract
The aim of this study was to develop and explore the diagnostic accuracy of a decision tree derived from a large real-life primary care population. Data from 9297 primary care patients (45% male, mean age 53±17 years) with suspicion of an obstructive pulmonary disease were derived from an asthma/chronic obstructive pulmonary disease (COPD) service where patients were assessed using spirometry, the Asthma Control Questionnaire, the Clinical COPD Questionnaire, history data and medication use. All patients were diagnosed through the Internet by a pulmonologist. The Chi-squared Automatic Interaction Detection method was used to build the decision tree. The tree was externally validated in another real-life primary care population (n=3215). Our tree correctly diagnosed 79% of the asthma patients, 85% of the COPD patients and 32% of the asthma-COPD overlap syndrome (ACOS) patients. External validation showed a comparable pattern (correct: asthma 78%, COPD 83%, ACOS 24%). Our decision tree is considered to be promising because it was based on real-life primary care patients with a specialist's diagnosis. In most patients the diagnosis could be correctly predicted. Predicting ACOS, however, remained a challenge. The total decision tree can be implemented in computer-assisted diagnostic systems for individual patients. A simplified version of this tree can be used in daily clinical practice as a desk tool.
Affiliation(s)
- Esther I. Metting
- Dept of General Practice, University of Groningen, University Medical Center Groningen, Groningen, The Netherlands
- GRIAC Research Institute, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
| | | | | | | | - Janwillem W.H. Kocks
- Dept of General Practice, University of Groningen, University Medical Center Groningen, Groningen, The Netherlands
- GRIAC Research Institute, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
| | | | - Niels H. Chavannes
- Leiden University Medical Center, Dept of Public Health and Primary Care, Leiden, The Netherlands
| | - Thys van der Molen
- Dept of General Practice, University of Groningen, University Medical Center Groningen, Groningen, The Netherlands
- GRIAC Research Institute, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
|
31
|
Cevik M, Ergun MA, Stout NK, Trentham-Dietz A, Craven M, Alagoz O. Using Active Learning for Speeding up Calibration in Simulation Models. Med Decis Making 2015; 36:581-93. [PMID: 26471190 DOI: 10.1177/0272989x15611359] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Received: 11/04/2014] [Accepted: 07/17/2015] [Indexed: 01/08/2023]
Abstract
BACKGROUND Most cancer simulation models include unobservable parameters that determine disease onset and tumor growth. These parameters play an important role in matching key outcomes such as cancer incidence and mortality, and their values are typically estimated via a lengthy calibration procedure, which involves evaluating a large number of combinations of parameter values via simulation. The objective of this study is to demonstrate how machine learning approaches can be used to accelerate the calibration process by reducing the number of parameter combinations that are actually evaluated. METHODS Active learning is a popular machine learning method that enables a learning algorithm such as artificial neural networks to interactively choose which parameter combinations to evaluate. We developed an active learning algorithm to expedite the calibration process. Our algorithm determines the parameter combinations that are more likely to produce desired outputs and therefore reduces the number of simulation runs performed during calibration. We demonstrate our method using the previously developed University of Wisconsin breast cancer simulation model (UWBCS). RESULTS In a recent study, calibration of the UWBCS required the evaluation of 378 000 input parameter combinations to build a race-specific model, and only 69 of these combinations produced results that closely matched observed data. By using the active learning algorithm in conjunction with standard calibration methods, we identify all 69 parameter combinations by evaluating only 5620 of the 378 000 combinations. CONCLUSION Machine learning methods hold potential in guiding model developers in the selection of more promising parameter combinations and hence speeding up the calibration process. Applying our machine learning algorithm to one model shows that evaluating only 1.49% of all parameter combinations would be sufficient for the calibration.
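The 1.49% figure follows directly from the two counts given in the abstract. A minimal arithmetic check, with variable names chosen here for illustration:

```python
# Counts as reported in the abstract.
total_combinations = 378_000  # full parameter grid for the UWBCS calibration
evaluated = 5_620             # combinations evaluated with active learning

fraction_evaluated = evaluated / total_combinations
reduction_factor = total_combinations / evaluated

print(f"{fraction_evaluated:.2%}")  # 1.49%
print(f"{reduction_factor:.1f}x")   # 67.3x
```

In other words, the active learning approach evaluated roughly one in every 67 parameter combinations while still recovering all 69 acceptable ones.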
Affiliation(s)
- Mucahit Cevik
- Department of Industrial and Systems Engineering, University of Wisconsin, Madison, WI, USA (MC, MAE, OA)
| | - Mehmet Ali Ergun
- Department of Industrial and Systems Engineering, University of Wisconsin, Madison, WI, USA (MC, MAE, OA)
| | - Natasha K Stout
- Department of Population Medicine, Harvard Medical School and Harvard Pilgrim Health Care Institute, Boston, MA, USA (NKS)
| | - Amy Trentham-Dietz
- Department of Population Health Sciences and Carbone Cancer Center, University of Wisconsin, Madison, WI, USA (AT-D, OA)
| | - Mark Craven
- Department of Biostatistics and Medical Informatics, University of Wisconsin, Madison, WI, USA (MC)
| | - Oguzhan Alagoz
- Department of Industrial and Systems Engineering, University of Wisconsin, Madison, WI, USA (MC, MAE, OA); Department of Population Health Sciences and Carbone Cancer Center, University of Wisconsin, Madison, WI, USA (AT-D, OA)
|