1
Arnold M, Liou L, Boland MR. Development, evaluation and comparison of machine learning algorithms for predicting in-hospital patient charges for congestive heart failure exacerbations, chronic obstructive pulmonary disease exacerbations and diabetic ketoacidosis. BioData Min 2024; 17:35. [PMID: 39267093 PMCID: PMC11395859 DOI: 10.1186/s13040-024-00387-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/28/2024] [Accepted: 08/30/2024] [Indexed: 09/14/2024] Open
Abstract
BACKGROUND Hospitalizations for exacerbations of congestive heart failure (CHF), chronic obstructive pulmonary disease (COPD) and diabetic ketoacidosis (DKA) are costly in the United States. The purpose of this study was to predict in-hospital charges for each condition using machine learning (ML) models. RESULTS We conducted a retrospective cohort study on national discharge records of hospitalized adult patients from January 1st, 2016, to December 31st, 2019. We constructed six ML models (linear regression, ridge regression, support vector machine, random forest, gradient boosting and extreme gradient boosting) to predict total in-hospital cost for admission for each condition. Our models had good predictive performance, with testing R-squared values of 0.701-0.750 (mean of 0.713) for CHF; 0.694-0.724 (mean 0.709) for COPD; and 0.615-0.729 (mean 0.694) for DKA. We identified important key features driving costs, including patient age, length of stay, number of procedures, and elective/nonelective admission. CONCLUSIONS ML methods may be used to accurately predict costs and identify drivers of high cost for COPD exacerbations, CHF exacerbations and DKA. Overall, our findings may inform future studies that seek to decrease the underlying high patient costs for these conditions.
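As a point of reference for the R-squared values reported above, the coefficient of determination can be computed as in the minimal Python sketch below. The function name `r_squared` and the charge figures are illustrative assumptions; they are not code or data from the study.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_y = sum(y_true) / len(y_true)
    # Residual sum of squares: error left over after the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R-squared is undefined when y_true is constant")
    return 1.0 - ss_res / ss_tot


# Hypothetical held-out charges (USD) and predictions, for illustration only.
charges = [12000.0, 18500.0, 9500.0, 30000.0]
predicted = [13000.0, 17000.0, 11000.0, 27500.0]
print(round(r_squared(charges, predicted), 3))
```

A value of 1 means the predictions match the observed charges exactly, and a value of 0 means the model does no better than predicting the mean charge for every admission.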
Affiliation(s)
- Monique Arnold
- Department of Emergency Medicine, The Mount Sinai Hospital at the Icahn School of Medicine, 306 E 96th Street, #4A, New York, NY, 10128, USA.
- Lathan Liou
- Icahn School of Medicine at Mount Sinai Hospital, New York City, NY, USA
- Mary Regina Boland
- Data Science, Department of Mathematics, Herbert W. Boyer School of Natural Sciences, Mathematics, and Computing, Saint Vincent College, Latrobe, PA, USA
2
Chen S, Yang X, Gu H, Wang Y, Xu Z, Jiang Y, Wang Y. Predictive etiological classification of acute ischemic stroke through interpretable machine learning algorithms: a multicenter, prospective cohort study. BMC Med Res Methodol 2024; 24:199. [PMID: 39256656 PMCID: PMC11384709 DOI: 10.1186/s12874-024-02331-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/24/2023] [Accepted: 09/05/2024] [Indexed: 09/12/2024] Open
Abstract
BACKGROUND The prognosis, recurrence rates, and secondary prevention strategies vary significantly among different subtypes of acute ischemic stroke (AIS). Machine learning (ML) techniques can uncover intricate, non-linear relationships within medical data, enabling the identification of factors associated with etiological classification. However, there is currently a lack of research utilizing ML algorithms for predicting AIS etiology. OBJECTIVE We aimed to use interpretable ML algorithms to develop AIS etiology prediction models, identify critical factors in etiology classification, and enhance existing clinical categorization. METHODS This study involved patients from the Third China National Stroke Registry (CNSR-III). Nine models, which included Natural Gradient Boosting (NGBoost), Categorical Boosting (CatBoost), Extreme Gradient Boosting (XGBoost), Random Forest (RF), Light Gradient Boosting Machine (LGBM), Gradient Boosting Decision Tree (GBDT), Adaptive Boosting (AdaBoost), Support Vector Machine (SVM), and logistic regression (LR), were employed to predict large artery atherosclerosis (LAA), small vessel occlusion (SVO), and cardioembolism (CE) using an 80:20 randomly split training and test set. We designed an SFS-XGB with 10-fold cross-validation for feature selection. The primary evaluation metrics for the models included the area under the receiver operating characteristic curve (AUC) for discrimination and the Brier score (or calibration plots) for calibration. RESULTS A total of 5,213 patients were included, comprising 2,471 (47.4%) with LAA, 2,153 (41.3%) with SVO, and 589 (11.3%) with CE. In both LAA and SVO models, the AUC values of the ML models were significantly higher than that of the LR model (P < 0.001). The optimal model for predicting SVO (AUC [RF model] = 0.932) outperformed the optimal LAA model (AUC [NGB model] = 0.917) and the optimal CE model (AUC [LGBM model] = 0.846). Each model displayed relatively satisfactory calibration. Further analysis showed that the optimal CE model could identify potential CE patients in the undetermined etiology (SUE) group, accounting for 1,900 out of 4,156 (45.7%). CONCLUSIONS The ML algorithm effectively classified patients with LAA, SVO, and CE, demonstrating superior classification performance compared to the LR model. The optimal ML model can identify potential CE patients among SUE patients. These newly identified predictive factors may complement the existing etiological classification system, enabling clinicians to promptly categorize stroke patients' etiology and initiate optimal strategies for secondary prevention.
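The two evaluation metrics named above, AUC for discrimination and the Brier score for calibration, can be illustrated with the short Python sketch below. It assumes a binary outcome (for example, one subtype versus the rest); the function names and the toy labels and scores are hypothetical and are not taken from the study.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and 0/1 outcome."""
    if len(y_true) != len(y_prob) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def roc_auc(y_true, y_score):
    """AUC as the probability that a random positive case outscores a random
    negative case, counting ties as one half."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy labels (1 = subtype present) and predicted probabilities, illustration only.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(labels, scores), brier_score(labels, scores))
```

Lower Brier scores indicate better-calibrated probabilities, while higher AUC values indicate better separation of cases from non-cases; the two can disagree, which is why abstracts such as this one report both.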
Affiliation(s)
- Siding Chen
- Department of Neurology, Beijing Tiantan Hospital, Capital Medical University, No.119 South 4th Ring West Road, Fengtai District, Beijing, 100070, China
- China National Clinical Research Center for Neurological Diseases, No.119 South 4th Ring West Road, Fengtai District, Beijing, 100070, China
- Changping Laboratory, Beijing, China
- Xiaomeng Yang
- Department of Neurology, Beijing Tiantan Hospital, Capital Medical University, No.119 South 4th Ring West Road, Fengtai District, Beijing, 100070, China
- Hongqiu Gu
- Department of Neurology, Beijing Tiantan Hospital, Capital Medical University, No.119 South 4th Ring West Road, Fengtai District, Beijing, 100070, China
- China National Clinical Research Center for Neurological Diseases, No.119 South 4th Ring West Road, Fengtai District, Beijing, 100070, China
- Yanzhao Wang
- School of Statistics, Renmin University of China, No. 59 Zhongguancun Street, Haidian District, Beijing, 100872, China
- Zhe Xu
- Department of Neurology, Beijing Tiantan Hospital, Capital Medical University, No.119 South 4th Ring West Road, Fengtai District, Beijing, 100070, China
- China National Clinical Research Center for Neurological Diseases, No.119 South 4th Ring West Road, Fengtai District, Beijing, 100070, China
- Yong Jiang
- Department of Neurology, Beijing Tiantan Hospital, Capital Medical University, No.119 South 4th Ring West Road, Fengtai District, Beijing, 100070, China.
- China National Clinical Research Center for Neurological Diseases, No.119 South 4th Ring West Road, Fengtai District, Beijing, 100070, China.
- Changping Laboratory, Beijing, China.
- Beijing Advanced Innovation Center for Big Data-Based Precision Medicine, Beihang University & Capital Medical University, Beijing, 100091, China.
- Yongjun Wang
- Department of Neurology, Beijing Tiantan Hospital, Capital Medical University, No.119 South 4th Ring West Road, Fengtai District, Beijing, 100070, China.
- China National Clinical Research Center for Neurological Diseases, No.119 South 4th Ring West Road, Fengtai District, Beijing, 100070, China.
- Changping Laboratory, Beijing, China.
- Advanced Innovation Center for Human Brain Protection, Capital Medical University, Beijing, China.
- Clinical Center for Precision Medicine in Stroke, Capital Medical University, Beijing, China.
- Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences, Shanghai, China.
3
Peng J, Liu X, Cai Z, Huang Y, Lin J, Zhou M, Xiao Z, Lai H, Cao Z, Peng H, Wang J, Xu J. Practice of distributed machine learning in clinical modeling for chronic obstructive pulmonary disease. Heliyon 2024; 10:e33566. [PMID: 39071634 PMCID: PMC11283156 DOI: 10.1016/j.heliyon.2024.e33566] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/15/2023] [Revised: 06/09/2024] [Accepted: 06/24/2024] [Indexed: 07/30/2024] Open
Abstract
Background The high prevalence, morbidity and mortality, and disease heterogeneity of chronic obstructive pulmonary disease (COPD) result in data scattered across patient visits to different medical units. The high cost of integrating these scattered data for analysis and modeling, together with the legal requirement to protect patient privacy, leads to the emergence of data islands. Objectives Integrating scattered patient data from different medical units for high-quality modeling, while protecting patient privacy, benefits the development of digital health. On this basis, we develop a distributed COPD diagnosis system termed COPD average federated learning (COPD_AVG_FL) using FedAvg. Methods First, to build COPD_AVG_FL, real-world clinical data of COPD patients are collected and pre-processed to clean incorrect data, outlier samples and missing values. Then, a classical federated learning architecture is designed as COPD_AVG_FL. Finally, to evaluate the established COPD_AVG_FL system, we develop Centralized Machine Learning (CML). Conclusions Our results suggest that, with the assistance of COPD_AVG_FL, the absolute improvement rates are 13.4% (accuracy), 13.3% (precision), 12.8% (recall), 13.1% (F1-Score) and 12.9% (AUC) on the test data, respectively. The decoupling between model training and raw training data protects patients' privacy and helps to securely integrate more COPD data from different medical units to generate a more comprehensive COPD_AVG_FL model. This approach promotes the practical adoption of wise information technology of medicine for COPD in real clinical settings. Code for our model will be made available at https://github.com/Cczhh/COPD_AVG_FL/tree/master.
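FedAvg, the aggregation rule named in this abstract, combines locally trained model parameters by averaging them with weights proportional to each site's sample count, so raw patient records never leave the contributing unit. The Python sketch below shows only that aggregation step under simplifying assumptions: each model is a flat list of floats, and the function name `fed_avg` and all numbers are hypothetical. It is not the COPD_AVG_FL implementation, for which the authors point to their repository.

```python
def fed_avg(client_weights, client_sizes):
    """Sample-size-weighted average of client parameter vectors.

    client_weights: list of equal-length lists of floats, one per client.
    client_sizes:   number of local training samples held by each client.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client")
    n_params = len(client_weights[0])
    if any(len(w) != n_params for w in client_weights):
        raise ValueError("all clients must share the same parameter layout")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    # Each global parameter is the weighted mean of the clients' values.
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]


# Two hypothetical medical units with 100 and 300 local records.
hospital_a = [1.0, 2.0]
hospital_b = [3.0, 4.0]
print(fed_avg([hospital_a, hospital_b], [100, 300]))
```

In a full federated round, the server would send the averaged parameters back to every site, each site would train on its own data for a few steps, and the aggregation would repeat.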
Affiliation(s)
- Junfeng Peng
- Department of Computer Science and Engineering, Guangdong University of Education, Guangzhou 510303, China
- Xujiang Liu
- Department of Computer Science and Engineering, Guangdong University of Education, Guangzhou 510303, China
- Ziwei Cai
- Department of Computer Science and Engineering, Guangdong University of Education, Guangzhou 510303, China
- Yuanpei Huang
- Department of Computer Science and Engineering, Guangdong University of Education, Guangzhou 510303, China
- Jiayi Lin
- Department of Computer Science and Engineering, Guangdong University of Education, Guangzhou 510303, China
- Mi Zhou
- Third Affiliated Hospital of Sun Yat-Sen University, Guangzhou 510640, China
- Zhenpei Xiao
- Department of Computer Science and Engineering, Guangdong University of Education, Guangzhou 510303, China
- Huifang Lai
- Department of Computer Science and Engineering, Guangdong University of Education, Guangzhou 510303, China
- Zhihao Cao
- Department of Computer Science and Engineering, Guangdong University of Education, Guangzhou 510303, China
- Hui Peng
- Department of Computer Science and Engineering, Guangdong University of Education, Guangzhou 510303, China
- Jihong Wang
- Department of Computer Science and Engineering, Guangdong University of Education, Guangzhou 510303, China
- Jun Xu
- Department of Computer Science and Engineering, Guangdong University of Education, Guangzhou 510303, China
4
Arnold M, Liou L, Boland MR. Development and Optimization of Machine Learning Algorithms for Predicting In-hospital Patient Charges for Congestive Heart Failure Exacerbations, Chronic Obstructive Pulmonary Disease Exacerbations and Diabetic Ketoacidosis. RESEARCH SQUARE 2024:rs.3.rs-4490027. [PMID: 38947079 PMCID: PMC11213225 DOI: 10.21203/rs.3.rs-4490027/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/02/2024]
Abstract
Background Hospitalizations for exacerbations of congestive heart failure (CHF), chronic obstructive pulmonary disease (COPD) and diabetic ketoacidosis (DKA) are costly in the United States. The purpose of this study was to predict in-hospital charges for each condition using machine learning (ML) models. Results We conducted a retrospective cohort study on national discharge records of hospitalized adult patients from January 1st, 2016, to December 31st, 2019. We used numerous ML techniques to predict in-hospital total cost. We found that linear regression (LM), gradient boosting (GBM) and extreme gradient boosting (XGB) models had good predictive performance and were statistically equivalent, with training R-squared values ranging from 0.49-0.95 for CHF, 0.56-0.95 for COPD, and 0.32-0.99 for DKA. We identified important key features driving costs, including patient age, length of stay, number of procedures, and elective/nonelective admission. Conclusions ML methods may be used to accurately predict costs and identify drivers of high cost for COPD exacerbations, CHF exacerbations and DKA. Overall, our findings may inform future studies that seek to decrease the underlying high patient costs for these conditions.
Affiliation(s)
- Monique Arnold
- The Mount Sinai Hospital at the Icahn School of Medicine
- Mary Regina Boland
- Alex G McKenna School of Business, Economics and Government, Saint Vincent College
5
He X, Cui X, Zhao Z, Wu R, Zhang Q, Xue L, Zhang H, Ge Q, Leng Y. A generalizable and easy-to-use COVID-19 stratification model for the next pandemic via immune-phenotyping and machine learning. Front Immunol 2024; 15:1372539. [PMID: 38601145 PMCID: PMC11004273 DOI: 10.3389/fimmu.2024.1372539] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/18/2024] [Accepted: 03/11/2024] [Indexed: 04/12/2024] Open
Abstract
Introduction The coronavirus disease 2019 (COVID-19) pandemic has affected billions of people worldwide, and the lessons learned need to be consolidated to better prepare for the next pandemic. Early identification of high-risk patients is important for appropriate treatment and distribution of medical resources. A generalizable and easy-to-use COVID-19 severity stratification model is vital and may provide references for clinicians. Methods Three COVID-19 cohorts (one discovery cohort and two validation cohorts) were included. Longitudinal peripheral blood mononuclear cells were collected from the discovery cohort (n = 39, mild = 15, critical = 24). The immune characteristics of COVID-19 and critical COVID-19 were analyzed by comparison with those of healthy volunteers (n = 16) and patients with mild COVID-19 using mass cytometry by time of flight (CyTOF). Subsequently, machine learning models were developed based on immune signatures and the most valuable laboratory parameters that performed well in distinguishing mild from critical cases. Finally, single-cell RNA sequencing data from a published study (n = 43) and electronic health records from a prospective cohort study (n = 840) were used to verify the role of crucial clinical laboratory and immune signature parameters in the stratification of COVID-19 severity. Results Patients with COVID-19 were found to have disturbed glucose and tryptophan metabolism in two major innate immune clusters. Critical patients were further characterized by significant depletion of classical dendritic cells (cDCs), regulatory T cells (Tregs), and CD4+ central memory T cells (Tcm), along with increased systemic interleukin-6 (IL-6), interleukin-12 (IL-12), and lactate dehydrogenase (LDH). The machine learning models based on the level of cDCs and LDH showed great potential for predicting critical cases. The model performances in severity stratification were validated in two cohorts (AUC = 0.77 and 0.88, respectively) infected with different strains in different periods. The reference limits of cDCs and LDH as biomarkers for predicting critical COVID-19 were 1.2% and 270.5 U/L, respectively. Conclusion Overall, we developed and validated a generalizable and easy-to-use COVID-19 severity stratification model using machine learning algorithms. The level of cDCs and LDH will assist clinicians in making quick decisions during future pandemics.
Affiliation(s)
- Xinlei He
- Department of Intensive Care Unit, Peking University Third Hospital, Beijing, China
- Xiao Cui
- Department of Intensive Care Unit, Peking University Third Hospital, Beijing, China
- Zhiling Zhao
- Department of Intensive Care Unit, Peking University Third Hospital, Beijing, China
- Rui Wu
- Department of Pulmonary and Critical Care Medicine, Peking University Third Hospital, Beijing, China
- Qiang Zhang
- Department of Intensive Care Unit, Peking University Third Hospital, Beijing, China
- Lei Xue
- Department of Intensive Care Unit, Peking University Third Hospital, Beijing, China
- Hua Zhang
- Department of Research Center of Clinical Epidemiology, Peking University Third Hospital, Beijing, China
- Qinggang Ge
- Department of Intensive Care Unit, Peking University Third Hospital, Beijing, China
- Yuxin Leng
- Department of Intensive Care Unit, Peking University Third Hospital, Beijing, China
6
Wang X, Qiao Y, Cui Y, Ren H, Zhao Y, Linghu L, Ren J, Zhao Z, Chen L, Qiu L. An explainable artificial intelligence framework for risk prediction of COPD in smokers. BMC Public Health 2023; 23:2164. [PMID: 37932692 PMCID: PMC10626705 DOI: 10.1186/s12889-023-17011-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/12/2023] [Accepted: 10/17/2023] [Indexed: 11/08/2023] Open
Abstract
BACKGROUND Because the early signs of Chronic Obstructive Pulmonary Disease (COPD) are inconspicuous, affected individuals often remain unidentified, leading to suboptimal opportunities for timely prevention and treatment. The purpose of this study was to create an explainable artificial intelligence framework combining data preprocessing methods, machine learning methods, and model interpretability methods to identify people at high risk of COPD in the smoking population and to provide a reasonable interpretation of model predictions. METHODS The data comprised questionnaire information, physical examination data and results of pulmonary function tests before and after bronchodilatation. First, the factorial analysis for mixed data (FAMD), Boruta and NRSBoundary-SMOTE resampling methods were used to address the missing data, high dimensionality and category imbalance problems. Then, seven classification models (CatBoost, NGBoost, XGBoost, LightGBM, random forest, SVM and logistic regression) were applied to model the risk level, and the best machine learning (ML) model's decisions were explained using the Shapley additive explanations (SHAP) method and partial dependence plot (PDP). RESULTS In the smoking population, age and 14 other variables were significant factors for predicting COPD. The CatBoost, random forest, and logistic regression models performed reasonably well in unbalanced datasets. CatBoost with NRSBoundary-SMOTE had the best classification performance in balanced datasets when composite indicators (the AUC, F1-score, and G-mean) were used as model comparison criteria. Age, COPD Assessment Test (CAT) score, gross annual income, body mass index (BMI), systolic blood pressure (SBP), diastolic blood pressure (DBP), anhelation, respiratory disease, central obesity, use of polluting fuel for household heating, region, use of polluting fuel for household cooking, and wheezing were important factors for predicting COPD in the smoking population. CONCLUSION This study combined feature screening methods, unbalanced data processing methods, and advanced machine learning methods to enable early identification of COPD risk groups in the smoking population. COPD risk factors in the smoking population were identified using SHAP and PDP, with the goal of providing theoretical support for targeted screening strategies and smoking population self-management strategies.
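Two of the composite indicators mentioned above, the F1-score and the G-mean, are often preferred over plain accuracy when classes are imbalanced. The Python sketch below computes both from confusion-matrix counts using the standard definitions; the function names and the counts are hypothetical and do not come from the study.

```python
import math


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall, written as 2TP / (2TP + FP + FN)."""
    denom = 2 * tp + fp + fn
    if denom == 0:
        raise ValueError("F1 is undefined with no positives and no predictions")
    return 2 * tp / denom


def g_mean(tp, fn, tn, fp):
    """Geometric mean of sensitivity (recall) and specificity."""
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return math.sqrt(sensitivity * specificity)


# Hypothetical screening result: 10 COPD cases among 100 smokers.
tp, fn, fp, tn = 8, 2, 4, 86
print(round(f1_score(tp, fp, fn), 3), round(g_mean(tp, fn, tn, fp), 3))
```

Because the G-mean multiplies the per-class recall rates, a classifier that ignores the minority class scores zero on it even when its overall accuracy is high.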
Affiliation(s)
- Xuchun Wang
- Department of Health Statistics, School of Public Health, Shanxi Medical University, 56 South XinJian Road, Taiyuan, 030001, P.R. China
- Yuchao Qiao
- Department of Health Statistics, School of Public Health, Shanxi Medical University, 56 South XinJian Road, Taiyuan, 030001, P.R. China
- Yu Cui
- Department of Health Statistics, School of Public Health, Shanxi Medical University, 56 South XinJian Road, Taiyuan, 030001, P.R. China
- Hao Ren
- Department of Health Statistics, School of Public Health, Shanxi Medical University, 56 South XinJian Road, Taiyuan, 030001, P.R. China
- Ying Zhao
- Shanxi Centre for Disease Control and Prevention, Taiyuan, Shanxi, 030012, China
- Liqin Linghu
- Department of Health Statistics, School of Public Health, Shanxi Medical University, 56 South XinJian Road, Taiyuan, 030001, P.R. China
- Shanxi Centre for Disease Control and Prevention, Taiyuan, Shanxi, 030012, China
- Jiahui Ren
- Department of Health Statistics, School of Public Health, Shanxi Medical University, 56 South XinJian Road, Taiyuan, 030001, P.R. China
- Zhiyang Zhao
- Department of Health Statistics, School of Public Health, Shanxi Medical University, 56 South XinJian Road, Taiyuan, 030001, P.R. China
- Limin Chen
- The Fifth Hospital (Shanxi People's Hospital) of Shanxi Medical University, Taiyuan, Shanxi, 030012, P.R. China.
- Lixia Qiu
- Department of Health Statistics, School of Public Health, Shanxi Medical University, 56 South XinJian Road, Taiyuan, 030001, P.R. China.
7
Jacobson PK, Lind L, Persson HL. Unleashing the Power of Very Small Data to Predict Acute Exacerbations of Chronic Obstructive Pulmonary Disease. Int J Chron Obstruct Pulmon Dis 2023; 18:1457-1473. [PMID: 37485052 PMCID: PMC10362872 DOI: 10.2147/copd.s412692] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/13/2023] [Accepted: 06/20/2023] [Indexed: 07/25/2023] Open
Abstract
Introduction In this article, we explore to what extent it is possible to leverage very small data to build machine learning (ML) models that predict acute exacerbations of chronic obstructive pulmonary disease (AECOPD). Methods We build ML models using the small data collected during the eHealth Diary telemonitoring study between 2013 and 2017 in Sweden. These data refer to a group of multimorbid patients, namely 18 patients with chronic obstructive pulmonary disease (COPD) as the major reason behind previous hospitalisations. The telemonitoring was supervised by a specialised hospital-based home care (HBHC) unit, which was also responsible for the medical actions needed. Results We implement two different ML approaches, one based on time-dependent covariates and the other based on time-independent covariates. We compare the first approach with standard Cox Proportional Hazards (CPH). For the second one, we use different proportions of synthetic data to build models and then evaluate the best model against authentic data. Discussion To the best of our knowledge, the present ML study shows for the first time that the most important variable for an increased risk of future AECOPDs is "maintenance medication changes by HBHC". This finding is clinically relevant since a sub-optimal maintenance treatment, requiring medication changes, puts the patient at risk for future AECOPDs. Conclusion The experiments return useful insights about the use of small data for ML.
Affiliation(s)
- Petra Kristina Jacobson
- Department of Health, Medicine and Caring Sciences, Linköping University, Linköping, Sweden
- Department of Respiratory Medicine in Linköping, Linköping University, Linköping, Sweden
- Leili Lind
- Department of Biomedical Engineering/Health Informatics, Linköping University, Linköping, Sweden
- Digital Systems Division, Unit Digital Health, RISE Research Institutes of Sweden, Linköping, Sweden
- Hans Lennart Persson
- Department of Health, Medicine and Caring Sciences, Linköping University, Linköping, Sweden
- Department of Respiratory Medicine in Linköping, Linköping University, Linköping, Sweden
8
Meng Q, Wang J, Cui J, Li B, Wu S, Yun J, Aschner M, Wang C, Zhang L, Li X, Chen R. Prediction of COPD acute exacerbation in response to air pollution using exosomal circRNA profile and Machine learning. ENVIRONMENT INTERNATIONAL 2022; 168:107469. [PMID: 36041244 PMCID: PMC9939562 DOI: 10.1016/j.envint.2022.107469] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 06/21/2022] [Revised: 07/19/2022] [Accepted: 08/10/2022] [Indexed: 05/11/2023]
Abstract
Ambient fine particulate matter (PM2.5) is linked to an increased risk of chronic obstructive pulmonary disease (COPD) exacerbations, which significantly increase the risk of mortality in COPD patients. Identifying the subtype of COPD patients who are sensitive to environmental aggressions is necessary. Using in vitro and in vivo PM2.5 exposure models, we demonstrate that exosomal hsa_circ_0005045 is upregulated by PM2.5 and binds to the protein cargo peroxiredoxin2, which functionally aggravates hallmarks of COPD by recruiting neutrophil elastase and triggering in situ release of tumor necrosis factor (TNF)-α by inflammatory cells. The biological function of hsa_circ_0005045 associated with aggravation of COPD is validated using exosome-transplantation and conditional circRNA-knockdown murine models. By sorting the major components of PM2.5, we find that PM2.5-bound heavy metals, which are distinguishable from the components of cigarette smoke, trigger the elevation of exosomal hsa_circ_0005045. Finally, using machine learning models in a cohort with 327 COPD patients, the PM2.5 exposure-sensitive COPD patients are characterized by relatively high hsa_circ_0005045 expression, non-smoking, and group C (mMRC 0-1 (or CAT < 10) and ≥ 2 exacerbations (or ≥ 1 exacerbation leading to hospital admission) in the past year). Thus, our results suggest that environmental reduction in PM2.5 emission provides a targeted approach to protecting non-smoking COPD patients against air pollution-related disease exacerbation.
Affiliation(s)
- Qingtao Meng
- Beijing Key Laboratory of Environmental Toxicology, School of Public Health, Capital Medical University, Beijing 100069, PR China
- Jiajia Wang
- Beijing Key Laboratory of Environmental Toxicology, School of Public Health, Capital Medical University, Beijing 100069, PR China
- Jian Cui
- Jiangsu Key Laboratory of Molecular and Functional Imaging, Department of Radiology, Zhongda Hospital, Medical School of Southeast University, 87, Ding Jia Qiao Road, Nanjing 210009, China; Key Laboratory of Environmental Medicine Engineering, Ministry of Education, School of Public Health, Southeast University, Nanjing 210009, China
- Bin Li
- Beijing Key Laboratory of Environmental Toxicology, School of Public Health, Capital Medical University, Beijing 100069, PR China
- Shenshen Wu
- Beijing Key Laboratory of Environmental Toxicology, School of Public Health, Capital Medical University, Beijing 100069, PR China
- Jun Yun
- Key Laboratory of Environmental Medicine Engineering, Ministry of Education, School of Public Health, Southeast University, Nanjing 210009, China
- Michael Aschner
- Department of Molecular Pharmacology, Albert Einstein College of Medicine, Forchheimer 209, 1300 Morris Park Avenue, Bronx, NY 10461, USA
- Chengshuo Wang
- Department of Otolaryngology, Head and Neck Surgery, Beijing TongRen Hospital, Capital Medical University, Beijing 100730, China; Beijing Key Laboratory of Nasal Diseases, Beijing Institute of Otolaryngology, Beijing 100005, China
- Luo Zhang
- Department of Allergy, Beijing TongRen Hospital, Capital Medical University, Beijing, 100005, China; Beijing Key Laboratory of Nasal Diseases, Beijing Institute of Otolaryngology, Beijing China; Department of Otolaryngology Head and Neck Surgery, Beijing TongRen Hospital, Capital Medical University, Beijing 100005, China.
- Xiaobo Li
- Beijing Key Laboratory of Environmental Toxicology, School of Public Health, Capital Medical University, Beijing 100069, PR China; Key Laboratory of Environmental Medicine Engineering, Ministry of Education, School of Public Health, Southeast University, Nanjing 210009, China.
- Rui Chen
- Beijing Key Laboratory of Environmental Toxicology, School of Public Health, Capital Medical University, Beijing 100069, PR China; School of Public Health, Advanced Innovation Center for Human Brain Protection, Capital Medical University, Beijing 100069, PR China; Institute for Chemical Carcinogenesis, Guangzhou Medical University, Guangzhou 511436, PR China.
9
Chen SD, You J, Yang XM, Gu HQ, Huang XY, Liu H, Feng JF, Jiang Y, Wang YJ. Machine learning is an effective method to predict the 90-day prognosis of patients with transient ischemic attack and minor stroke. BMC Med Res Methodol 2022; 22:195. [PMID: 35842606 PMCID: PMC9287991 DOI: 10.1186/s12874-022-01672-z] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 09/13/2021] [Accepted: 06/30/2022] [Indexed: 11/23/2022] Open
Abstract
OBJECTIVE We aimed to investigate factors related to poor 90-day prognosis (mRS≥3) in patients with transient ischemic attack (TIA) or minor stroke, to construct 90-day poor-prognosis prediction models for these patients, and to compare the predictive performance of machine learning models with that of a logistic regression model. METHOD We selected TIA and minor stroke patients from a prospective registry study (CNSR-III). Demographic characteristics, smoking history, drinking history (≥20 g/day), physiological data, medical history, secondary prevention treatment, in-hospital evaluation and education, laboratory data, neurological severity, mRS score and TOAST classification of patients were assessed. Univariate and multivariate logistic regression analyses were performed in the training set to identify predictors associated with poor outcome (mRS≥3). The predictors were used to establish machine learning models and the traditional logistic model; the data were randomly divided into a training set and a test set at a ratio of 70:30. The training set was used to construct the prediction model, and the test set was used to evaluate the effect of the model. Model evaluation used the area under the curve (AUC) as the discrimination index and the Brier score (or calibration plot) as the calibration index. RESULT A total of 10967 patients with TIA and minor stroke were enrolled in this study, with an average age of 61.77 ± 11.18 years, and women accounted for 30.68%. Factors associated with poor prognosis in TIA and minor stroke patients included sex, age, stroke history, heart rate, D-dimer, creatinine, TOAST classification, admission mRS, discharge mRS, and discharge NIHSS score. All models, both those constructed by logistic regression and those by machine learning, performed well in predicting poor 90-day prognosis (AUC >0.800).
The best-performing model in the test set was Catboost (AUC=0.839), followed by XGBoost, GBDT, random forest and Adaboost (AUCs of 0.838, 0.835, 0.832 and 0.823, respectively). The performance of Catboost and XGBoost in predicting poor prognosis at 90 days was better than that of the logistic model, and the difference was statistically significant (P<0.05). All models, both those constructed by logistic regression and those by machine learning, had good calibration. CONCLUSION Machine learning algorithms were not inferior to the logistic regression model in predicting poor prognosis at 90 days in patients with TIA and minor stroke. Among them, the Catboost model had the best predictive performance. All models provided good discrimination.
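The abstract above evaluates models by AUC (discrimination) and the Brier score (calibration). As a minimal sketch of what these two indicators measure, the following standard-library Python computes both on toy values. The function names and the toy data are illustrative assumptions and are not taken from the study.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def auc(y_true, y_prob):
    """Probability that a random positive case is scored above a random
    negative case, counting ties as one half (pairwise definition of AUC)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy outcomes (1 = poor prognosis) and predicted risks; NOT study data.
y_true = [0, 0, 1, 1, 0, 1]
y_prob = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7]

print(f"AUC   = {auc(y_true, y_prob):.3f}")
print(f"Brier = {brier_score(y_true, y_prob):.3f}")
```

A higher AUC means better ranking of cases above non-cases, while a lower Brier score means predicted probabilities sit closer to the observed outcomes. The pairwise AUC loop is quadratic in sample size, which is fine for illustration but not for a registry of this scale.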
Affiliation(s)
- Si-Ding Chen
  - Department of Neurology, Beijing Tiantan Hospital, Capital Medical University, No.119 South 4th Ring West Road, Fengtai District, Beijing, 100070, China
  - China National Clinical Research Center for Neurological Diseases, No.119 South 4th Ring West Road, Fengtai District, Beijing, 100070, China
- Jia You
  - Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai, 200433, China
- Xiao-Meng Yang
  - Department of Neurology, Beijing Tiantan Hospital, Capital Medical University, No.119 South 4th Ring West Road, Fengtai District, Beijing, 100070, China
- Hong-Qiu Gu
  - China National Clinical Research Center for Neurological Diseases, No.119 South 4th Ring West Road, Fengtai District, Beijing, 100070, China
- Xin-Ying Huang
  - China National Clinical Research Center for Neurological Diseases, No.119 South 4th Ring West Road, Fengtai District, Beijing, 100070, China
- Huan Liu
  - China National Clinical Research Center for Neurological Diseases, No.119 South 4th Ring West Road, Fengtai District, Beijing, 100070, China
- Jian-Feng Feng
  - Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai, 200433, China
- Yong Jiang
  - Department of Neurology, Beijing Tiantan Hospital, Capital Medical University, No.119 South 4th Ring West Road, Fengtai District, Beijing, 100070, China
  - China National Clinical Research Center for Neurological Diseases, No.119 South 4th Ring West Road, Fengtai District, Beijing, 100070, China
  - Beijing Advanced Innovation Center for Big Data-Based Precision Medicine (Beihang University & Capital Medical University), Beijing, 100091, China
- Yong-Jun Wang
  - Department of Neurology, Beijing Tiantan Hospital, Capital Medical University, No.119 South 4th Ring West Road, Fengtai District, Beijing, 100070, China
  - China National Clinical Research Center for Neurological Diseases, No.119 South 4th Ring West Road, Fengtai District, Beijing, 100070, China
  - Advanced Innovation Center for Human Brain Protection, Capital Medical University, Beijing, China
  - Clinical Center for Precision Medicine in Stroke, Capital Medical University, Beijing, China
  - Research Unit of Artificial Intelligence in Cerebrovascular Disease, Chinese Academy of Medical Sciences, 2019RU018, Beijing, China
  - Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences, Shanghai, China
  - Chinese Institute for Brain Research, Beijing, China
10
Explainable Machine Learning Model for Predicting First-Time Acute Exacerbation in Patients with Chronic Obstructive Pulmonary Disease. J Pers Med 2022; 12:jpm12020228. [PMID: 35207716 PMCID: PMC8879653 DOI: 10.3390/jpm12020228] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 01/14/2022] [Revised: 02/02/2022] [Accepted: 02/03/2022] [Indexed: 12/15/2022] Open
Abstract
Background: The study developed accurate explainable machine learning (ML) models for predicting first-time acute exacerbation of chronic obstructive pulmonary disease (COPD, AECOPD) at an individual level. Methods: We conducted a retrospective case–control study. A total of 606 patients with COPD were screened for eligibility using registry data from the COPD Pay-for-Performance Program (COPD P4P program) database at Changhua Christian Hospital between January 2017 and December 2019. Recursive feature elimination technology was used to select the optimal subset of features for predicting the occurrence of AECOPD. We developed four ML models to predict first-time AECOPD, and the highest-performing model was applied. Finally, an explainable approach based on ML and the SHapley Additive exPlanations (SHAP) and a local explanation method were used to evaluate the risk of AECOPD and to generate individual explanations of the model’s decisions. Results: The gradient boosting machine (GBM) and support vector machine (SVM) models exhibited superior discrimination ability (area under curve [AUC] = 0.833 [95% confidence interval (CI) 0.745–0.921] and AUC = 0.836 [95% CI 0.757–0.915], respectively). The decision curve analysis indicated that the GBM model exhibited a higher net benefit in distinguishing patients at high risk for AECOPD when the threshold probability was <0.55. The COPD Assessment Test (CAT) and the symptom of wheezing were the two most important features and exhibited the highest SHAP values, followed by monocyte count and white blood cell (WBC) count, coughing, red blood cell (RBC) count, breathing rate, oral long-acting bronchodilator use, chronic pulmonary disease (CPD), systolic blood pressure (SBP), and others. Higher CAT score; monocyte, WBC, and RBC counts; BMI; diastolic blood pressure (DBP); neutrophil-to-lymphocyte ratio; and eosinophil and lymphocyte counts were associated with AECOPD. 
The presence of symptoms (wheezing, dyspnea, coughing), chronic disease (CPD, congestive heart failure [CHF], sleep disorders, and pneumonia), and use of COPD medications (triple-therapy long-acting bronchodilators, short-acting bronchodilators, oral long-acting bronchodilators, and antibiotics) were also positively associated with AECOPD. A high breathing rate, heart rate, or systolic blood pressure and methylxanthine use were negatively correlated with AECOPD. Conclusions: The ML model was able to accurately assess the risk of AECOPD. Combined with SHAP and the local explanation method, the ML model was able to provide interpretable and visual explanations of individualized risk predictions, which may assist clinical physicians in understanding the effects of key features in the model and the model’s decision-making process.
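The abstract above reports a decision curve analysis, which compares models by net benefit at a chosen threshold probability. A minimal sketch of the standard net-benefit formula, TP/n − (FP/n) × pt/(1 − pt), is given below. The function name and the toy data are illustrative assumptions, not values from the study.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating everyone whose predicted risk is >= threshold.

    Uses the standard decision-curve formula:
        TP/n - FP/n * threshold / (1 - threshold)
    """
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


# Toy outcomes (1 = exacerbation) and predicted risks; NOT study data.
y_true = [1, 0, 1, 0, 0, 1, 0, 1]
y_prob = [0.9, 0.2, 0.6, 0.55, 0.1, 0.3, 0.7, 0.8]

model_nb = net_benefit(y_true, y_prob, 0.5)
# "Treat all" reference strategy: every patient is flagged as high risk.
treat_all_nb = net_benefit(y_true, [1.0] * len(y_true), 0.5)
print(model_nb, treat_all_nb)
```

A decision curve plots this quantity over a range of thresholds. A model is useful at a given threshold when its net benefit exceeds both the treat-all strategy and the treat-none strategy, whose net benefit is zero.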
11
Mohamed I. Prediction of Chronic Obstructive Pulmonary Disease Stages Using Machine Learning Algorithms. International Journal of Decision Support System Technology 2022. [DOI: 10.4018/ijdsst.286693] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 01/04/2023]
Abstract
Identifying chronic obstructive pulmonary disease (COPD) severity stages is of great importance for controlling the related mortality rates and reducing the associated costs. This study aims to build prediction models for COPD stages and to compare the relative performance of five machine learning algorithms to determine the optimal prediction algorithm. This research is based on data collected from a private hospital in Egypt for the two calendar years 2018 and 2019. Five machine learning algorithms were used for the comparison. The F1 score, specificity, sensitivity, accuracy, positive predictive value and negative predictive value were the performance measures used to compare the algorithms. The analysis included the records of 211 patients. Our results show that the best-performing algorithm for most disease stages is the PNN, which achieved the best prediction accuracy; it can therefore be considered a powerful tool for decision makers predicting COPD severity stages.
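The six performance measures named in the abstract above can all be derived from a confusion matrix. The sketch below assumes a one-vs-rest view of a single severity stage, which reduces the multi-class problem to binary counts. The function name and the counts are hypothetical and are not taken from the study.

```python
def binary_metrics(tp, fp, tn, fn):
    """Derive common classification measures from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # true positive rate (recall)
    specificity = tn / (tn + fp)   # true negative rate
    ppv = tp / (tp + fp)           # positive predictive value (precision)
    npv = tn / (tn + fn)           # negative predictive value
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "accuracy": accuracy,
        "f1": f1,
    }


# Hypothetical counts for one stage versus all other stages; NOT study data.
metrics = binary_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Repeating the calculation once per stage gives the per-stage comparison the abstract describes. The sketch does not guard against zero denominators, which can occur when a stage has no predicted or no actual cases.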
12
Peng J, Zhou M, Zou K, Zhu X, Xu J, Teng Y, Zhang F, Chen G. Exploratory study on classification of chronic obstructive pulmonary disease combining multi-stage feature fusion and machine learning. BMC Med Inform Decis Mak 2021; 21:348. [PMID: 34906123 PMCID: PMC8670199 DOI: 10.1186/s12911-021-01708-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/06/2021] [Accepted: 12/01/2021] [Indexed: 11/10/2022] Open
Abstract
Background Owing to the complexity and high heterogeneity of acute exacerbation of chronic obstructive pulmonary disease (AECOPD), the Global Initiative for Chronic Obstructive Lung Disease (GOLD) guidelines cannot fully guide the treatment of AECOPD. Objectives To provide rapid treatment in line with the progression of AECOPD after admission, we propose a multi-stage feature fusion (MSFF) framework combined with machine learning to track the deterioration risk of AECOPD. Methods First, we identify 408 AECOPD patients as the study population. Then, feature segmentation and fusion methods are applied to generate the phased data set. Finally, human studies are designed to evaluate the performance of the MSFF framework. Results The experimental results show that the proposed framework has the potential to track deterioration risk for AECOPD patients across the full care process. The proposed MSFF framework achieves higher average overall accuracy and F1 scores than the four physician groups, i.e., IM, Surgery, Emergency, and ICU. Conclusions The proposed MSFF model may serve as a useful disease tracking tool to estimate the deterioration risk at each stage and thereby support disease monitoring and management for AECOPD patients.
Affiliation(s)
- Junfeng Peng
  - School of Computer Science, Guangdong University of Education, Guangzhou, 510006, China
- Mi Zhou
  - The Third Affiliated Hospital, Sun Yat-sen University, Guangzhou, 510640, China
- Kaiqiang Zou
  - School of Computer Science, Guangdong University of Education, Guangzhou, 510006, China
- Xiongyong Zhu
  - School of Computer Science, Guangdong University of Education, Guangzhou, 510006, China
- Jun Xu
  - School of Computer Science, Guangdong University of Education, Guangzhou, 510006, China
- Yi Teng
  - School of Computer Science, Guangdong University of Education, Guangzhou, 510006, China
- Feifei Zhang
  - School of Computer Science, Guangdong University of Education, Guangzhou, 510006, China
- Guoming Chen
  - School of Computer Science, Guangdong University of Education, Guangzhou, 510006, China
13
Zafari H, Langlois S, Zulkernine F, Kosowan L, Singer A. AI in predicting COPD in the Canadian population. Biosystems 2021; 211:104585. [PMID: 34864143 DOI: 10.1016/j.biosystems.2021.104585] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 04/29/2021] [Revised: 11/17/2021] [Accepted: 11/23/2021] [Indexed: 12/12/2022]
Abstract
Chronic obstructive pulmonary disease (COPD) is a progressive lung disease that produces non-reversible airflow limitations. Approximately 10% of Canadians aged 35 years or older are living with COPD. Primary care is often the first contact an individual will have with the healthcare system, providing acute care, chronic disease management, and services aimed at health maintenance. This study used Electronic Medical Record (EMR) data from primary care clinics in seven provinces across Canada to develop predictive models to identify COPD in the Canadian population. The comprehensive nature of this primary care EMR data, which contains structured numeric, categorical, hybrid, and unstructured text data, enables the predictive models to capture symptoms of COPD and discriminate it from diseases with similar symptoms. We applied two supervised machine learning models, a Multilayer Neural Network (MLNN) model and an Extreme Gradient Boosting (XGB) model, to identify COPD patients. The XGB model achieved an accuracy of 86% in the test dataset compared to 83% achieved by the MLNN. Utilizing feature importance, we identified a set of key features from the EMR for diagnosing COPD, which included medications, health conditions, risk factors, and patient age. Application of this XGB model to primary care structured EMR data can distinguish patients with COPD from others having similar chronic conditions for disease surveillance, and improve evidence-based care delivery.
Affiliation(s)
- Hasan Zafari
  - School of Computing, Queen's University, Kingston, Ontario, Canada
- Sarah Langlois
  - School of Computing, Queen's University, Kingston, Ontario, Canada
- Leanne Kosowan
  - Department of Family Medicine, Max Rady College of Medicine, Rady Faculty of Health Sciences, University of Manitoba, Winnipeg, Manitoba, Canada
- Alexander Singer
  - Department of Family Medicine, Max Rady College of Medicine, Rady Faculty of Health Sciences, University of Manitoba, Winnipeg, Manitoba, Canada
14
De Ramón Fernández A, Ruiz Fernández D, Gilart Iglesias V, Marcos Jorquera D. Analyzing the use of artificial intelligence for the management of chronic obstructive pulmonary disease (COPD). Int J Med Inform 2021; 158:104640. [PMID: 34890934 DOI: 10.1016/j.ijmedinf.2021.104640] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/30/2021] [Revised: 10/21/2021] [Accepted: 11/03/2021] [Indexed: 12/31/2022]
Abstract
OBJECTIVE Chronic obstructive pulmonary disease (COPD) is a disease that causes airflow limitation to the lungs and has a high morbidity around the world. The objective of this study was to evaluate how artificial intelligence (AI) is being applied for the management of the disease, analyzing the objectives that are raised, the algorithms that are used and what results they offer. METHODS We conducted a scoping review following the Arksey and O'Malley (2005) and Levac et al. (2010) guidelines. Two reviewers independently searched, analyzed and extracted data from papers of five databases: Web of Science, PubMed, Scopus, Cinahl and Cochrane. To be included, the studies had to apply some AI techniques for the management of at least one stage of the COPD clinical process. In the event of any discrepancy between both reviewers, the criterion of a third reviewer prevailed. RESULTS 380 papers were identified through database searches. After applying the exclusion criteria, 67 papers were included in the study. The studies were of a different nature and pursued a wide range of objectives, highlighting mainly those focused on the identification, classification and prevention of the disease. Neural nets, support vector machines and decision trees were the AI algorithms most commonly used. The mean and median values of all the performance metrics evaluated were between 80% and 90%. CONCLUSIONS The results obtained show a growing interest in the development of medical applications that manage the different phases of the COPD clinical process, especially predictive models. According to the performance shown, these models could be a useful complementary tool in the decision-making by health specialists, although more high-quality ML studies are needed to endorse the findings of this study.
15
Xing X, Ma Z, Xu S, Zhang M, Zhao W, Song M, Dong WF. Blood pressure assessment with in-ear photoplethysmography. Physiol Meas 2021; 42. [PMID: 34571491 DOI: 10.1088/1361-6579/ac2a71] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 06/02/2021] [Accepted: 09/27/2021] [Indexed: 11/11/2022]
Abstract
Objective. In this study, we aimed to estimate blood pressure (BP) from in-ear photoplethysmography (PPG). This novel implementation provided an unobtrusive and steady way of recording PPG, whereas previous PPG measurements were mostly performed at the wrist, finger, or earlobe. Methods. The time between forward and reflected PPG waves was very short at the ear site. To minimize errors introduced by feature extraction, a multi-Gaussian decomposition of in-ear PPG was performed. Both hand-crafted and whole-based features were extracted, and the best combination of features was selected using a backward-search wrapper method and evaluated by the Akaike information criterion. Hemodynamic parameters such as compliance and inertance were estimated from a four-element Windkessel (WK4) model, which was used to pre-classify PPG signals and generate different BP estimation algorithms. Calibration was done by using previous measurements from the same class. To validate this novel approach, 53 subjects were recruited for a one-month follow-up study, and 17 subjects were recruited for a two-month follow-up study. Calibrated systolic BP estimation accuracy was significantly improved with inertance-based pre-classification, while diastolic BP showed less improvement. Results. With proper feature selection, pre-classification and calibration, we achieved a mean absolute error of 5.35 mmHg for SBP estimation, compared to 6.16 mmHg if no pre-classification was carried out. The performance did not deteriorate in two months, showing a decent BP trend-tracking ability. Conclusion. The study demonstrated the feasibility of in-ear PPG to reliably measure BP, which represents an important technological advancement in terms of unobtrusiveness and steadiness.
Affiliation(s)
- Xiaoman Xing
  - School of Biomedical Engineering (Suzhou), Division of Life Sciences and Medicine, University of Sciences and Technology of China, Suzhou, Jiangsu, People's Republic of China
  - Suzhou Institute of Biomedical Engineering and Technology, Chinese Academy of Sciences, Suzhou, Jiangsu, People's Republic of China
- Zhimin Ma
  - The Affiliated Suzhou Science & Technology Town Hospital of Nanjing Medical University, Suzhou, Jiangsu, People's Republic of China
- Shengkai Xu
  - The Affiliated Suzhou Science & Technology Town Hospital of Nanjing Medical University, Suzhou, Jiangsu, People's Republic of China
- Mingyou Zhang
  - The First Hospital of Jilin University, Changchun, Jilin, People's Republic of China
- Wei Zhao
  - Department of Otolaryngology-Head and Neck Surgery, Shanghai Ninth People's Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, People's Republic of China
- Mingxuan Song
  - Jinan Guoke Medical Technology Development Co., Ltd, Shandong, People's Republic of China
- Wen-Fei Dong
  - Suzhou Institute of Biomedical Engineering and Technology, Chinese Academy of Sciences, Suzhou, Jiangsu, People's Republic of China
16
Gharagozloo M, Amrani A, Wittingstall K, Hamilton-Wright A, Gris D. Machine Learning in Modeling of Mouse Behavior. Front Neurosci 2021; 15:700253. [PMID: 34594182 PMCID: PMC8477014 DOI: 10.3389/fnins.2021.700253] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 04/25/2021] [Accepted: 08/02/2021] [Indexed: 12/02/2022] Open
Abstract
Mouse behavior is a primary outcome in evaluations of therapeutic efficacy. Exhaustive, continuous, multiparametric behavioral phenotyping is a valuable tool for understanding the pathophysiological status of mouse brain diseases. Automated home cage behavior analysis produces highly granulated data both in terms of number of features and sampling frequency. Previously, we demonstrated several ways to reduce feature dimensionality. In this study, we propose novel approaches for analyzing 33-Hz data generated by CleverSys software. We hypothesized that behavioral patterns within short time windows are reflective of physiological state, and that computer modeling of mouse behavioral routines can serve as a predictive tool in classification tasks. To remove bias due to researcher decisions, our data flow is indifferent to the quality, value, and importance of any given feature in isolation. To classify day and night behavior, as an example application, we developed a data preprocessing flow and utilized logistic regression (LG), support vector machines (SVM), random forest (RF), and one-dimensional convolutional neural networks paired with long short-term memory deep neural networks (1DConvBiLSTM). We determined that a 5-min video clip is sufficient to classify mouse behavior with high accuracy. LG, SVM, and RF performed similarly, predicting mouse behavior with 85% accuracy, and combining the three algorithms in an ensemble procedure increased accuracy to 90%. The best performance was achieved by combining the 1DConv and BiLSTM algorithms yielding 96% accuracy. Our findings demonstrate that computer modeling of the home-cage ethome can clearly define mouse physiological state. Furthermore, we showed that continuous behavioral data can be analyzed using approaches similar to natural language processing. These data provide proof of concept for future research in diagnostics of complex pathophysiological changes that are accompanied by changes in behavioral profile.
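The abstract above reports that combining three classifiers in an ensemble procedure raised accuracy, without naming the procedure. One common choice is majority voting, sketched below purely as an assumption. The function name, the label values and the per-model predictions are hypothetical.

```python
from collections import Counter


def majority_vote(*model_predictions):
    """Combine per-sample labels from several models by majority vote.

    Each argument is one model's list of predicted labels, aligned by sample.
    With an odd number of models and two classes, ties cannot occur.
    """
    return [
        Counter(votes).most_common(1)[0][0]
        for votes in zip(*model_predictions)
    ]


# Hypothetical day/night predictions from three models for four video clips.
lg_pred = ["day", "night", "day", "night"]
svm_pred = ["day", "day", "day", "night"]
rf_pred = ["night", "night", "day", "day"]

ensemble_pred = majority_vote(lg_pred, svm_pred, rf_pred)
print(ensemble_pred)
```

Voting helps when the individual models make different mistakes: in the toy example each model is outvoted on the clip where it disagrees with the other two.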
Affiliation(s)
- Marjan Gharagozloo
  - Department of Neurology, Johns Hopkins University, Baltimore, MD, United States
- Abdelaziz Amrani
  - Department of Pediatrics, Faculty of Medicine, Université de Sherbrooke, Sherbrooke, QC, Canada
- Kevin Wittingstall
  - Department of Radiology, Sherbrooke Molecular Imaging Center, Université de Sherbrooke, Sherbrooke, QC, Canada
- Denis Gris
  - Department of Pharmacology and Physiology, Faculty of Medicine, Université de Sherbrooke, Sherbrooke, QC, Canada
17
Wu CT, Li GH, Huang CT, Cheng YC, Chen CH, Chien JY, Kuo PH, Kuo LC, Lai F. Acute Exacerbation of a Chronic Obstructive Pulmonary Disease Prediction System Using Wearable Device Data, Machine Learning, and Deep Learning: Development and Cohort Study. JMIR Mhealth Uhealth 2021; 9:e22591. [PMID: 33955840 PMCID: PMC8138712 DOI: 10.2196/22591] [Citation(s) in RCA: 38] [Impact Index Per Article: 12.7] [Received: 07/17/2020] [Revised: 01/30/2021] [Accepted: 03/23/2021] [Indexed: 12/25/2022] Open
Abstract
Background The World Health Organization has projected that by 2030, chronic obstructive pulmonary disease (COPD) will be the third-leading cause of mortality and the seventh-leading cause of morbidity worldwide. Acute exacerbations of chronic obstructive pulmonary disease (AECOPD) are associated with an accelerated decline in lung function, diminished quality of life, and higher mortality. Accurate early detection of acute exacerbations will enable early management and reduce mortality. Objective The aim of this study was to develop a prediction system using lifestyle data, environmental factors, and patient symptoms for the early detection of AECOPD in the upcoming 7 days. Methods This prospective study was performed at National Taiwan University Hospital. Patients with COPD that did not have a pacemaker and were not pregnant were invited for enrollment. Data on lifestyle, temperature, humidity, and fine particulate matter were collected using wearable devices (Fitbit Versa), a home air quality–sensing device (EDIMAX Airbox), and a smartphone app. AECOPD episodes were evaluated via standardized questionnaires. With these input features, we evaluated the prediction performance of machine learning models, including random forest, decision trees, k-nearest neighbor, linear discriminant analysis, and adaptive boosting, and a deep neural network model. Results The continuous real-time monitoring of lifestyle and indoor environment factors was implemented by integrating home air quality–sensing devices, a smartphone app, and wearable devices. All data from 67 COPD patients were collected prospectively during a mean 4-month follow-up period, resulting in the detection of 25 AECOPD episodes. For 7-day AECOPD prediction, the proposed AECOPD predictive model achieved an accuracy of 92.1%, sensitivity of 94%, and specificity of 90.4%. Receiver operating characteristic curve analysis showed that the area under the curve of the model in predicting AECOPD was greater than 0.9. 
The most important variables in the model were daily steps walked, stairs climbed, and daily distance moved. Conclusions Using wearable devices, home air quality–sensing devices, a smartphone app, and supervised prediction algorithms, we achieved excellent power to predict whether a patient would experience AECOPD within the upcoming 7 days. The AECOPD prediction system provided an effective way to collect lifestyle and environmental data, and yielded reliable predictions of future AECOPD events. Compared with previous studies, we have comprehensively improved the performance of the AECOPD prediction model by adding objective lifestyle and environmental data. This model could yield more accurate prediction results for COPD patients than using only questionnaire data.
Affiliation(s)
- Chia-Tung Wu
  - Department of Computer Science and Information Engineering, National Taiwan University, Taipei, Taiwan
- Guo-Hung Li
  - Graduate Institute of Biomedical Electronics and Bioinformatics, National Taiwan University, Taipei, Taiwan
- Chun-Ta Huang
  - Department of Internal Medicine, National Taiwan University Hospital, College of Medicine, National Taiwan University, Taipei, Taiwan
- Yu-Chieh Cheng
  - Graduate Institute of Biomedical Electronics and Bioinformatics, National Taiwan University, Taipei, Taiwan
- Chi-Hsien Chen
  - Department of Environmental and Occupational Medicine, National Taiwan University Hospital, College of Medicine, National Taiwan University, Taipei, Taiwan
- Jung-Yien Chien
  - Department of Internal Medicine, National Taiwan University Hospital, College of Medicine, National Taiwan University, Taipei, Taiwan
- Ping-Hung Kuo
  - Department of Internal Medicine, National Taiwan University Hospital, College of Medicine, National Taiwan University, Taipei, Taiwan
- Lu-Cheng Kuo
  - Department of Internal Medicine, National Taiwan University Hospital, College of Medicine, National Taiwan University, Taipei, Taiwan
- Feipei Lai
  - Department of Computer Science and Information Engineering, National Taiwan University, Taipei, Taiwan
18
Zhu X, Huang W, Lu H, Wang Z, Ni X, Hu J, Deng S, Tan Y, Li L, Zhang M, Qiu C, Luo Y, Chen H, Huang S, Xiao T, Shang D, Wen Y. A machine learning approach to personalized dose adjustment of lamotrigine using noninvasive clinical parameters. Sci Rep 2021; 11:5568. [PMID: 33692435 PMCID: PMC7946912 DOI: 10.1038/s41598-021-85157-x] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 05/07/2020] [Accepted: 02/23/2021] [Indexed: 12/11/2022] Open
Abstract
The pharmacokinetic variability of lamotrigine (LTG) plays a significant role in its dosing requirements. Our goal here was to use noninvasive clinical parameters to predict the dose-adjusted concentrations (C/D ratio) of LTG based on machine learning (ML) algorithms. A total of 1141 therapeutic drug-monitoring measurements were used, 80% of which were randomly selected as the "derivation cohort" to develop the prediction algorithm, and the remaining 20% constituted the "validation cohort" to test the finally selected model. Fifteen ML models were optimized and evaluated by tenfold cross-validation on the "derivation cohort" and were filtered by the mean absolute error (MAE). On the whole, the nonlinear models outperformed the linear models. The extra-trees regression algorithm delivered good performance and was chosen to establish the predictive model. The important features were then analyzed and the parameters of the model adjusted to develop the best prediction model, which accurately described the C/D ratio of LTG, especially in the intermediate-to-high range (≥ 22.1 μg mL−1 g−1 day), as illustrated by a minimal bias (mean relative error (%) = +3%), good precision (MAE = 8.7 μg mL−1 g−1 day), and a high percentage of predictions within ±20% of the empirical values (60.47%). This is the first study, to the best of our knowledge, to use ML algorithms to predict the C/D ratio of LTG. The results here can help clinicians adjust doses of LTG administered to patients to minimize adverse reactions.
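The abstract above summarizes regression performance with three quantities: bias (mean relative error), precision (MAE), and the share of predictions within ±20% of the observed value. The sketch below shows one plausible way to compute them. The exact definitions used in the study are not given in the abstract, so the formulas, function names and toy values here are assumptions.

```python
def mean_absolute_error(observed, predicted):
    """Average magnitude of the prediction errors, in the units of the outcome."""
    return sum(abs(p - o) for o, p in zip(observed, predicted)) / len(observed)


def mean_relative_error_pct(observed, predicted):
    """Average signed error relative to the observed value, in percent (bias)."""
    return 100 * sum((p - o) / o for o, p in zip(observed, predicted)) / len(observed)


def pct_within(observed, predicted, tolerance=0.20):
    """Percentage of predictions within +/- tolerance of the observed value."""
    hits = sum(1 for o, p in zip(observed, predicted) if abs(p - o) <= tolerance * o)
    return 100 * hits / len(observed)


# Toy C/D ratios (observed vs. predicted); NOT study data.
observed = [20.0, 25.0, 40.0, 50.0]
predicted = [22.0, 21.0, 44.0, 65.0]

print(f"MAE          = {mean_absolute_error(observed, predicted):.2f}")
print(f"MRE (%)      = {mean_relative_error_pct(observed, predicted):.1f}")
print(f"within ±20 % = {pct_within(observed, predicted):.0f}")
```

Reporting all three together is informative because a model can have a small average bias while individual predictions still miss by a wide margin; the MAE and the within-tolerance share expose that spread.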
Affiliation(s)
- Xiuqing Zhu
- Department of Pharmacy, The Affiliated Brain Hospital of Guangzhou Medical University (Guangzhou Huiai Hospital), Guangzhou, 510370, China.,Guangdong Engineering Technology Research Center for Translational Medicine of Mental Disorders, Guangzhou, 510370, China
| | - Wencan Huang
- Department of Pharmacy, Guangzhou Bureau of Civil Affairs Psychiatric Hospital, Guangzhou, 510430, China
| | - Haoyang Lu
- Department of Pharmacy, The Affiliated Brain Hospital of Guangzhou Medical University (Guangzhou Huiai Hospital), Guangzhou, 510370, China.,Guangdong Engineering Technology Research Center for Translational Medicine of Mental Disorders, Guangzhou, 510370, China
| | - Zhanzhang Wang
- Department of Pharmacy, The Affiliated Brain Hospital of Guangzhou Medical University (Guangzhou Huiai Hospital), Guangzhou, 510370, China.,Guangdong Engineering Technology Research Center for Translational Medicine of Mental Disorders, Guangzhou, 510370, China
| | - Xiaojia Ni
- Department of Pharmacy, The Affiliated Brain Hospital of Guangzhou Medical University (Guangzhou Huiai Hospital), Guangzhou, 510370, China.,Guangdong Engineering Technology Research Center for Translational Medicine of Mental Disorders, Guangzhou, 510370, China
| | - Jinqing Hu
- Department of Pharmacy, The Affiliated Brain Hospital of Guangzhou Medical University (Guangzhou Huiai Hospital), Guangzhou, 510370, China.,Guangdong Engineering Technology Research Center for Translational Medicine of Mental Disorders, Guangzhou, 510370, China
| | - Shuhua Deng
- Department of Pharmacy, The Affiliated Brain Hospital of Guangzhou Medical University (Guangzhou Huiai Hospital), Guangzhou, 510370, China.,Guangdong Engineering Technology Research Center for Translational Medicine of Mental Disorders, Guangzhou, 510370, China
| | - Yaqian Tan
- Department of Pharmacy, The Affiliated Brain Hospital of Guangzhou Medical University (Guangzhou Huiai Hospital), Guangzhou, 510370, China.,Guangdong Engineering Technology Research Center for Translational Medicine of Mental Disorders, Guangzhou, 510370, China
| | - Lu Li
- Department of Pharmacy, The Affiliated Brain Hospital of Guangzhou Medical University (Guangzhou Huiai Hospital), Guangzhou, 510370, China.,Guangdong Engineering Technology Research Center for Translational Medicine of Mental Disorders, Guangzhou, 510370, China
| | - Ming Zhang
- Department of Pharmacy, The Affiliated Brain Hospital of Guangzhou Medical University (Guangzhou Huiai Hospital), Guangzhou, 510370, China.,Guangdong Engineering Technology Research Center for Translational Medicine of Mental Disorders, Guangzhou, 510370, China
| | - Chang Qiu
- Department of Pharmacy, The Affiliated Brain Hospital of Guangzhou Medical University (Guangzhou Huiai Hospital), Guangzhou, 510370, China.,Guangdong Engineering Technology Research Center for Translational Medicine of Mental Disorders, Guangzhou, 510370, China
| | - Yayan Luo
- Guangdong Engineering Technology Research Center for Translational Medicine of Mental Disorders, Guangzhou, 510370, China.,Institute of Neuropsychiatry, The Affiliated Brain Hospital of Guangzhou Medical University (Guangzhou Huiai Hospital), Guangzhou, 510370, China
| | - Hongzhen Chen
- Department of Pharmacy, The Affiliated Brain Hospital of Guangzhou Medical University (Guangzhou Huiai Hospital), Guangzhou, 510370, China
| | - Shanqing Huang
- Department of Pharmacy, The Affiliated Brain Hospital of Guangzhou Medical University (Guangzhou Huiai Hospital), Guangzhou, 510370, China
| | - Tao Xiao
- Department of Pharmacy, The Affiliated Brain Hospital of Guangzhou Medical University (Guangzhou Huiai Hospital), Guangzhou, 510370, China
| | - Dewei Shang
- Department of Pharmacy, The Affiliated Brain Hospital of Guangzhou Medical University (Guangzhou Huiai Hospital), Guangzhou, 510370, China.,Guangdong Engineering Technology Research Center for Translational Medicine of Mental Disorders, Guangzhou, 510370, China.
| | - Yuguan Wen
- Department of Pharmacy, The Affiliated Brain Hospital of Guangzhou Medical University (Guangzhou Huiai Hospital), Guangzhou, 510370, China.,Guangdong Engineering Technology Research Center for Translational Medicine of Mental Disorders, Guangzhou, 510370, China.
|
19
|
Feng Y, Wang Y, Zeng C, Mao H. Artificial Intelligence and Machine Learning in Chronic Airway Diseases: Focus on Asthma and Chronic Obstructive Pulmonary Disease. Int J Med Sci 2021; 18:2871-2889. [PMID: 34220314 PMCID: PMC8241767 DOI: 10.7150/ijms.58191] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 01/14/2021] [Accepted: 05/20/2021] [Indexed: 02/05/2023] Open
Abstract
Chronic airway diseases are characterized by airway inflammation, obstruction, and remodeling and show high prevalence, especially in developing countries. Among them, asthma and chronic obstructive pulmonary disease (COPD) show the highest morbidity and socioeconomic burden worldwide. Although there are extensive guidelines for the prevention, early diagnosis, and rational treatment of these lifelong diseases, their value in precision medicine is very limited. Artificial intelligence (AI) and machine learning (ML) techniques have emerged as effective methods for mining and integrating large-scale, heterogeneous medical data for clinical practice, and several AI and ML methods have recently been applied to asthma and COPD. However, very few methods have significantly contributed to clinical practice. Here, we review four aspects of AI and ML implementation in asthma and COPD to summarize existing knowledge and indicate future steps required for the safe and effective application of AI and ML tools by clinicians.
Affiliation(s)
- Yinhe Feng
- Department of Respiratory and Critical Care Medicine, West China Hospital, Sichuan University, Chengdu, Sichuan Province, China.,Department of Respiratory and Critical Care Medicine, People's Hospital of Deyang City, Affiliated Hospital of Chengdu College of Medicine, Deyang, Sichuan Province, China
| | - Yubin Wang
- Department of Respiratory and Critical Care Medicine, West China Hospital, Sichuan University, Chengdu, Sichuan Province, China
| | - Chunfang Zeng
- Department of Respiratory and Critical Care Medicine, People's Hospital of Deyang City, Affiliated Hospital of Chengdu College of Medicine, Deyang, Sichuan Province, China
| | - Hui Mao
- Department of Respiratory and Critical Care Medicine, West China Hospital, Sichuan University, Chengdu, Sichuan Province, China
|
20
|
Lee YW, Choi JW, Shin EH. Machine learning model for predicting malaria using clinical information. Comput Biol Med 2020; 129:104151. [PMID: 33290932 DOI: 10.1016/j.compbiomed.2020.104151] [Citation(s) in RCA: 52] [Impact Index Per Article: 13.0] [Received: 09/22/2020] [Revised: 11/09/2020] [Accepted: 11/24/2020] [Indexed: 12/19/2022]
Abstract
BACKGROUND Rapid diagnosing is crucial for controlling malaria. Various studies have aimed at developing machine learning models to diagnose malaria using blood smear images; however, this approach has many limitations. This study developed a machine learning model for malaria diagnosis using patient information. METHODS To construct datasets, we extracted patient information from the PubMed abstracts from 1956 to 2019. We used two datasets: a solely parasitic disease dataset and total dataset by adding information about other diseases. We compared six machine learning models: support vector machine, random forest (RF), multilayered perceptron, AdaBoost, gradient boosting (GB), and CatBoost. In addition, a synthetic minority oversampling technique (SMOTE) was employed to address the data imbalance problem. RESULTS Concerning the solely parasitic disease dataset, RF was found to be the best model regardless of using SMOTE. Concerning the total dataset, GB was found to be the best. However, after applying SMOTE, RF performed the best. Considering the imbalanced data, nationality was found to be the most important feature in malaria prediction. In case of the balanced data with SMOTE, the most important feature was symptom. CONCLUSIONS The results demonstrated that machine learning techniques can be successfully applied to predict malaria using patient information.
Affiliation(s)
- You Won Lee
- Department of Tropical Medicine and Parasitology, Seoul National University College of Medicine and Institute of Endemic Diseases, Seoul, 03080, Republic of Korea
| | - Jae Woo Choi
- Department of Pharmacology, Yonsei University College of Medicine, Seoul, 03722, Republic of Korea; Severance Biomedical Science Institute, Yonsei University College of Medicine, Seoul, 03722, Republic of Korea
| | - Eun-Hee Shin
- Department of Tropical Medicine and Parasitology, Seoul National University College of Medicine and Institute of Endemic Diseases, Seoul, 03080, Republic of Korea; Seoul National University Bundang Hospital, Seongnam, 13620, Republic of Korea.
|