1
Peng J, Liu X, Cai Z, Huang Y, Lin J, Zhou M, Xiao Z, Lai H, Cao Z, Peng H, Wang J, Xu J. Practice of distributed machine learning in clinical modeling for chronic obstructive pulmonary disease. Heliyon 2024; 10:e33566. [PMID: 39071634 PMCID: PMC11283156 DOI: 10.1016/j.heliyon.2024.e33566] [Received: 05/15/2023] [Revised: 06/09/2024] [Accepted: 06/24/2024] [Indexed: 07/30/2024]
Abstract
Background: The high prevalence, morbidity and mortality, and disease heterogeneity of chronic obstructive pulmonary disease (COPD) leave patient data scattered across the different medical units that patients visit. The huge cost of integrating these scattered data for analysis and modeling, together with the legal demand for patient privacy protection, has led to the emergence of data islands. Objectives: Integrating the scattered data of patients from different medical units for high-quality modeling, on the premise of protecting patient privacy, helps promote the development of digital health. Based on this, we develop a distributed COPD diagnosis system termed COPD average federated learning (COPD_AVG_FL) using FedAvg. Methods: First, to build COPD_AVG_FL, real-world clinical data of COPD patients are collected and pre-processed to clean incorrect data, outlier samples and missing values. Then, a classical federated learning architecture is designed as COPD_AVG_FL. Finally, to evaluate the established COPD_AVG_FL system, we develop Centralized Machine Learning (CML) for comparison. Conclusions: Our results suggest that, with the assistance of COPD_AVG_FL, the absolute improvement rates on the test data are 13.4% (accuracy), 13.3% (precision), 12.8% (recall), 13.1% (F1-Score) and 12.9% (AUC), respectively. The decoupling of model training from raw training data protects patients' privacy and helps to securely integrate more COPD data from different medical units to generate a more comprehensive COPD_AVG_FL model. This approach promotes the adoption of wise information technology of medicine for COPD in real-world clinical practice. Code for our model will be made available at https://github.com/Cczhh/COPD_AVG_FL/tree/master.
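As a rough illustration of the FedAvg aggregation step that the abstract names, the sketch below averages client parameter vectors in proportion to each client's sample count. It is not the authors' COPD_AVG_FL code: the function name `fedavg`, the flat-list weight representation, and the example numbers are all assumptions made for illustration.

```python
from typing import List, Sequence


def fedavg(client_weights: Sequence[Sequence[float]],
           client_sizes: Sequence[int]) -> List[float]:
    """Sample-size-weighted average of client parameter vectors (FedAvg aggregation)."""
    if len(client_weights) != len(client_sizes) or not client_weights:
        raise ValueError("need one sample count per client, and at least one client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(client_weights[0])
    if any(len(w) != dim for w in client_weights):
        raise ValueError("all clients must share one parameter shape")
    # Each global parameter is sum_k (n_k / n) * w_k, computed coordinate-wise.
    return [
        sum(n * w[i] for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


# Hypothetical example: two medical units holding 100 and 300 records.
global_weights = fedavg([[1.0, 2.0], [3.0, 6.0]], [100, 300])
```

Only the parameter vectors and sample counts reach the aggregator here, which mirrors the decoupling of model training from raw data that the abstract describes.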
Affiliation(s)
- Junfeng Peng
  - Department of Computer Science and Engineering, Guangdong University of Education, Guangzhou 510303, China
- Xujiang Liu
  - Department of Computer Science and Engineering, Guangdong University of Education, Guangzhou 510303, China
- Ziwei Cai
  - Department of Computer Science and Engineering, Guangdong University of Education, Guangzhou 510303, China
- Yuanpei Huang
  - Department of Computer Science and Engineering, Guangdong University of Education, Guangzhou 510303, China
- Jiayi Lin
  - Department of Computer Science and Engineering, Guangdong University of Education, Guangzhou 510303, China
- Mi Zhou
  - Third Affiliated Hospital of Sun Yat-Sen University, Guangzhou 510640, China
- Zhenpei Xiao
  - Department of Computer Science and Engineering, Guangdong University of Education, Guangzhou 510303, China
- Huifang Lai
  - Department of Computer Science and Engineering, Guangdong University of Education, Guangzhou 510303, China
- Zhihao Cao
  - Department of Computer Science and Engineering, Guangdong University of Education, Guangzhou 510303, China
- Hui Peng
  - Department of Computer Science and Engineering, Guangdong University of Education, Guangzhou 510303, China
- Jihong Wang
  - Department of Computer Science and Engineering, Guangdong University of Education, Guangzhou 510303, China
- Jun Xu
  - Department of Computer Science and Engineering, Guangdong University of Education, Guangzhou 510303, China
2
Shon S, Lim K, Chae M, Lee H, Choi J. Predicting Sudden Sensorineural Hearing Loss Recovery with Patient-Personalized Seigel's Criteria Using Machine Learning. Diagnostics (Basel) 2024; 14:1296. [PMID: 38928711 PMCID: PMC11202901 DOI: 10.3390/diagnostics14121296] [Received: 04/13/2024] [Revised: 06/04/2024] [Accepted: 06/15/2024] [Indexed: 06/28/2024]
Abstract
BACKGROUND Accurate prognostic prediction is crucial for managing Idiopathic Sudden Sensorineural Hearing Loss (ISSHL). Previous studies developing ISSHL prognosis models often overlooked individual variability in hearing damage by relying on fixed frequency domains. This study aims to develop models predicting ISSHL prognosis one month after treatment, focusing on patient-specific hearing impairments. METHODS Patient-Personalized Seigel's Criteria (PPSC) were developed considering patient-specific hearing impairment related to ISSHL criteria. We performed a statistical test to assess the shift in the recovery assessment when applying PPSC. The utilized dataset of 581 patients comprised demographic information, health records, laboratory testing, onset and treatment, and hearing levels. To reduce the model's reliance on hearing level features, we used only the averages of hearing levels of the impaired frequencies. Then, model development, evaluation, and interpretation proceeded. RESULTS The chi-square test (p-value: 0.106) indicated that the shift in recovery assessment is not statistically significant. The soft-voting ensemble model was most effective, achieving an Area Under the Receiver Operating Characteristic Curve (AUROC) of 0.864 (95% CI: 0.801-0.927), with model interpretation based on the SHapley Additive exPlanations value. CONCLUSIONS With PPSC, providing a hearing assessment comparable to traditional Seigel's criteria, the developed models successfully predicted ISSHL recovery one month post-treatment by considering patient-specific impairments.
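To make the soft-voting idea mentioned in the abstract concrete, the sketch below averages the class probabilities predicted by several models and picks the class with the highest mean. This is a generic, unweighted soft vote, not the study's pipeline: the function names and the three sets of probabilities are hypothetical.

```python
from typing import List, Sequence


def soft_vote(probabilities: Sequence[Sequence[float]]) -> List[float]:
    """Average per-class probabilities across models (unweighted soft voting)."""
    if not probabilities:
        raise ValueError("need at least one model's probabilities")
    n_models = len(probabilities)
    n_classes = len(probabilities[0])
    if any(len(p) != n_classes for p in probabilities):
        raise ValueError("all models must predict the same classes")
    return [sum(p[c] for p in probabilities) / n_models for c in range(n_classes)]


def predict(probabilities: Sequence[Sequence[float]]) -> int:
    """Return the index of the class with the highest averaged probability."""
    averaged = soft_vote(probabilities)
    return max(range(len(averaged)), key=averaged.__getitem__)


# Hypothetical outputs of three classifiers for one patient,
# ordered as [P(no recovery), P(recovery)].
model_outputs = [[0.9, 0.1], [0.4, 0.6], [0.45, 0.55]]
averaged = soft_vote(model_outputs)
label = predict(model_outputs)
```

In this made-up case two of the three models lean toward class 1, so a hard majority vote would choose class 1, while the soft vote chooses class 0 because the first model is far more confident. That sensitivity to confidence is the usual reason for preferring soft voting when the base models output calibrated probabilities.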
Affiliation(s)
- Sanghyun Shon
  - Department of Biomedical Informatics, Korea University College of Medicine, Seoul 02708, Republic of Korea
- Kanghyeon Lim
  - Department of Otorhinolaryngology-Head and Neck Surgery, Korea University Ansan Hospital, Ansan-si 15355, Republic of Korea
- Minsu Chae
  - Department of Biomedical Informatics, Korea University College of Medicine, Seoul 02708, Republic of Korea
- Hwamin Lee
  - Department of Biomedical Informatics, Korea University College of Medicine, Seoul 02708, Republic of Korea
- June Choi
  - Department of Biomedical Informatics, Korea University College of Medicine, Seoul 02708, Republic of Korea
  - Department of Otorhinolaryngology-Head and Neck Surgery, Korea University Ansan Hospital, Ansan-si 15355, Republic of Korea
3
Wang L, Song D, Wang W, Li C, Zhou Y, Zheng J, Rao S, Wang X, Shao G, Cai J, Yang S, Dong J. Data-Driven Assisted Decision Making for Surgical Procedure of Hepatocellular Carcinoma Resection and Prognostic Prediction: Development and Validation of Machine Learning Models. Cancers (Basel) 2023; 15:cancers15061784. [PMID: 36980670 PMCID: PMC10046511 DOI: 10.3390/cancers15061784] [Received: 01/05/2023] [Revised: 03/02/2023] [Accepted: 03/09/2023] [Indexed: 03/18/2023]
Abstract
Background: Currently, surgical decisions for hepatocellular carcinoma (HCC) resection are difficult and not sufficiently personalized. We aimed to develop and validate data-driven prediction models to assist surgeons in selecting the optimal surgical procedure for patients. Methods: Retrospective data from 361 HCC patients who underwent radical resection in two institutions were included. End-to-end deep learning models were built to automatically segment lesions from the arterial phase (AP) of preoperative dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI). Clinical baseline characteristics and radiomic features were rigorously screened. The effectiveness of radiomic features and radiomic-clinical features was also compared. Three ensemble learning models were proposed to make the surgical procedure decision and to predict overall survival (OS) and recurrence-free survival (RFS) after the different procedures, respectively. Results: SegFormer performed best in automatic segmentation, achieving a Mean Intersection over Union (mIoU) of 0.8860. The five-fold cross-validation results showed that inputting radiomic-clinical features outperformed using only radiomic features. The proposed models all outperformed the other mainstream ensemble models. On the external test set, the area under the receiver operating characteristic curve (AUC) of the proposed decision model was 0.7731, and the prognostic prediction models also performed relatively well. The application web server based on automatic lesion segmentation was deployed and is available online. Conclusions: In this study, we developed and externally validated surgical decision-making and prognostic prediction models for HCC for the first time. The results demonstrated relatively accurate predictions and strong generalization, and the models are expected to help clinicians optimize surgical procedures.
Affiliation(s)
- Liyang Wang
  - School of Clinical Medicine, Tsinghua University, Beijing 100084, China
  - Hepato-Pancreato-Biliary Center, Beijing Tsinghua Changgung Hospital, School of Clinical Medicine, Tsinghua University, Beijing 102218, China
- Danjun Song
  - Department of Interventional Therapy, The Cancer Hospital of the University of Chinese Academy of Sciences (Zhejiang Cancer Hospital), Institute of Basic Medicine and Cancer (IBMC), Chinese Academy of Sciences, Hangzhou 310022, China
  - Department of Liver Surgery, Key Laboratory of Carcinogenesis and Cancer Invasion of Ministry of Education, Liver Cancer Institute, Zhongshan Hospital, Fudan University, Shanghai 200032, China
- Wentao Wang
  - Department of Radiology, Zhongshan Hospital, Fudan University, Shanghai 200032, China
- Chengquan Li
  - School of Clinical Medicine, Tsinghua University, Beijing 100084, China
- Yiming Zhou
  - Department of Hepatobiliary and Pancreatic Surgery, The Cancer Hospital of the University of Chinese Academy of Sciences (Zhejiang Cancer Hospital), Institute of Basic Medicine and Cancer (IBMC), Chinese Academy of Sciences, Hangzhou 310022, China
- Jiaping Zheng
  - Department of Interventional Therapy, The Cancer Hospital of the University of Chinese Academy of Sciences (Zhejiang Cancer Hospital), Institute of Basic Medicine and Cancer (IBMC), Chinese Academy of Sciences, Hangzhou 310022, China
- Shengxiang Rao
  - Department of Radiology, Zhongshan Hospital, Fudan University, Shanghai 200032, China
- Xiaoying Wang
  - Department of Liver Surgery, Key Laboratory of Carcinogenesis and Cancer Invasion of Ministry of Education, Liver Cancer Institute, Zhongshan Hospital, Fudan University, Shanghai 200032, China
- Guoliang Shao
  - Department of Interventional Therapy, The Cancer Hospital of the University of Chinese Academy of Sciences (Zhejiang Cancer Hospital), Institute of Basic Medicine and Cancer (IBMC), Chinese Academy of Sciences, Hangzhou 310022, China
  - Department of Radiology, The Cancer Hospital of the University of Chinese Academy of Sciences (Zhejiang Cancer Hospital), Institute of Basic Medicine and Cancer (IBMC), Chinese Academy of Sciences, Hangzhou 310022, China
- Jiabin Cai
  - Department of Liver Surgery, Key Laboratory of Carcinogenesis and Cancer Invasion of Ministry of Education, Liver Cancer Institute, Zhongshan Hospital, Fudan University, Shanghai 200032, China
  - Correspondence: (J.C.); (S.Y.)
- Shizhong Yang
  - Hepato-Pancreato-Biliary Center, Beijing Tsinghua Changgung Hospital, School of Clinical Medicine, Tsinghua University, Beijing 102218, China
  - Correspondence: (J.C.); (S.Y.)
- Jiahong Dong
  - School of Clinical Medicine, Tsinghua University, Beijing 100084, China
  - Hepato-Pancreato-Biliary Center, Beijing Tsinghua Changgung Hospital, School of Clinical Medicine, Tsinghua University, Beijing 102218, China
4
Zhang G, Luo L, Zhang L, Liu Z. Research Progress of Respiratory Disease and Idiopathic Pulmonary Fibrosis Based on Artificial Intelligence. Diagnostics (Basel) 2023; 13:diagnostics13030357. [PMID: 36766460 PMCID: PMC9914063 DOI: 10.3390/diagnostics13030357] [Received: 12/22/2022] [Revised: 01/06/2023] [Accepted: 01/16/2023] [Indexed: 01/21/2023]
Abstract
Machine learning (ML) refers to algorithms based on big data that learn patterns from previously observed data, through classifying, predicting, and optimizing, to accomplish specific tasks. In recent years, ML in medicine has developed rapidly, with applications including lung imaging analysis, intensive medical monitoring, mechanical ventilation, prediction of the need for intubation, etiology prediction, pulmonary function evaluation and prediction, obstructive sleep apnea, and biological information monitoring. ML can perform well and is a tool with great potential, especially in the imaging diagnosis of interstitial lung disease. Idiopathic pulmonary fibrosis (IPF), in which abnormal proliferation of fibroblasts leads to lung tissue destruction, is a major problem in the treatment of respiratory diseases. Diagnosis depends mainly on early detection by imaging, and early treatment can effectively prolong patients' lives. If computers can be used to assist in interpreting examination results related to the effects of fibrosis, timely diagnosis of such diseases will be of great value to both doctors and patients. We also previously proposed a machine learning model that can play a useful clinical guiding role in the early imaging prediction of idiopathic pulmonary fibrosis. At present, AI and machine learning have great potential to transform many aspects of respiratory medicine and are a focus of research. AI needs to become an invisible, seamless, and impartial auxiliary tool that helps patients and doctors make better decisions in an efficient, effective, and acceptable way. The purpose of this paper is to review the current application of machine learning in various aspects of respiratory diseases, in the hope of providing some help and guidance for clinicians when applying algorithm models.
Affiliation(s)
- Gerui Zhang
  - Department of Critical Care Unit, The First Affiliated Hospital of Dalian Medical University, 222, Zhongshan Road, Dalian 116011, China
- Lin Luo
  - Department of Critical Care Unit, The Second Hospital of Dalian Medical University, 467 Zhongshan Road, Shahekou District, Dalian 116023, China
- Limin Zhang
  - Department of Respiratory, The First Affiliated Hospital of Dalian Medical University, 222, Zhongshan Road, Dalian 116011, China
- Zhuo Liu
  - Department of Respiratory, The First Affiliated Hospital of Dalian Medical University, 222, Zhongshan Road, Dalian 116011, China
  - Correspondence:
5
Wang J, Zhou X, Hou Z, Xu X, Zhao Y, Chen S, Zhang J, Shao L, Yan R, Wang M, Ge M, Hao T, Tu Y, Huang H. Homogeneous ensemble models for predicting infection levels and mortality of COVID-19 patients: Evidence from China. Digit Health 2022; 8:20552076221133692. [PMID: 36339905 PMCID: PMC9630904 DOI: 10.1177/20552076221133692] [Received: 09/15/2022] [Accepted: 09/30/2022] [Indexed: 11/07/2022]
Abstract
Background: The persistence of the COVID-19 pandemic has put high pressure on healthcare services worldwide for several years. This article aims to establish models to predict infection levels and mortality of COVID-19 patients in China. Methods: Machine learning models and deep learning models have been built based on the clinical features of COVID-19 patients. The best models are selected by area under the receiver operating characteristic curve (AUC) scores to construct two homogeneous ensemble models for predicting infection levels and mortality, respectively. The first-hand clinical data of 760 patients were collected from Zhongnan Hospital of Wuhan University between 3 January and 8 March 2020. We preprocess the data with cleaning, imputation, and normalization. Results: Our models obtain AUC = 0.7059 and Recall (Weighted avg) = 0.7248 in predicting infection level, and AUC = 0.8436 and Recall (Weighted avg) = 0.8486 in predicting mortality ratio. This study also identifies two sets of essential clinical features. One is C-reactive protein (CRP) or high sensitivity C-reactive protein (hs-CRP) and the other is chest tightness, age, and pleural effusion. Conclusions: Two homogeneous ensemble models are proposed to predict infection levels and mortality of COVID-19 patients in China. New findings on clinical features that benefit the machine learning models are reported. Evaluation on an actual dataset collected from 3 January to 8 March 2020 demonstrates the effectiveness of the models in comparison with state-of-the-art prediction models.
Affiliation(s)
- Jiafeng Wang
  - Department of Head, Neck and Thyroid Surgery, Zhejiang Provincial People's Hospital and People's Hospital Affiliated to Hangzhou Medical College, Hangzhou, China
- Xianlong Zhou
  - Emergency Center, Zhongnan Hospital of Wuhan University, Wuhan, China
  - Hubei Clinical Research Center for Emergency and Resuscitation, Zhongnan Hospital of Wuhan University, Wuhan, China
- Zhitian Hou
  - School of Computer Science, South China Normal University, Guangzhou, China
- Xiaoya Xu
  - School of Business Administration, Guangdong University of Finance & Economics, Guangzhou, China
- Yueyue Zhao
  - Department of Infectious Disease, Zhejiang Provincial People's Hospital and People's Hospital Affiliated to Hangzhou Medical College, Hangzhou, China
  - Graduate School of Clinical Medicine, Bengbu Medical College, Bengbu, China
- Shanshan Chen
  - Department of Infectious Disease, Zhejiang Provincial People's Hospital and People's Hospital Affiliated to Hangzhou Medical College, Hangzhou, China
  - Graduate School of Clinical Medicine, Bengbu Medical College, Bengbu, China
- Jun Zhang
  - Department of Orthopaedic Surgery, Zhejiang Provincial People's Hospital and People's Hospital Affiliated to Hangzhou Medical College, Hangzhou, China
- Lina Shao
  - Department of Nephrology, Zhejiang Provincial People's Hospital and People's Hospital Affiliated to Hangzhou Medical College, Hangzhou, China
- Rong Yan
  - Department of Infectious Disease, Zhejiang Provincial People's Hospital and People's Hospital Affiliated to Hangzhou Medical College, Hangzhou, China
- Mingshan Wang
  - Graduate School of Clinical Medicine, Bengbu Medical College, Bengbu, China
- Minghua Ge
  - Department of Head, Neck and Thyroid Surgery, Zhejiang Provincial People's Hospital and People's Hospital Affiliated to Hangzhou Medical College, Hangzhou, China
- Tianyong Hao
  - School of Computer Science, South China Normal University, Guangzhou, China
- Yuexing Tu
  - Department of Intensive Care Unit, Zhejiang Provincial People's Hospital and People's Hospital Affiliated to Hangzhou Medical College, Hangzhou, China
  - Yuexing Tu, Department of Intensive Unit, Zhejiang Provincial People's Hospital and People's Hospital Affiliated to Hangzhou Medical College, Hangzhou, 310014, China
- Haijun Huang
  - Department of Infectious Disease, Zhejiang Provincial People's Hospital and People's Hospital Affiliated to Hangzhou Medical College, Hangzhou, China
  - Haijun Huang, Department of Infectious Disease, Zhejiang Provincial People's Hospital and People's Hospital Affiliated to Hangzhou Medical College, Hangzhou, 310014, China
6
Haque A, Stubbs D, Hubig NC, Spinale FG, Richardson WJ. Interpretable machine learning predicts cardiac resynchronization therapy responses from personalized biochemical and biomechanical features. BMC Med Inform Decis Mak 2022; 22:282. [PMID: 36316772 PMCID: PMC9620606 DOI: 10.1186/s12911-022-02015-0] [Received: 05/03/2022] [Accepted: 10/04/2022] [Indexed: 11/26/2022]
Abstract
BACKGROUND Cardiac Resynchronization Therapy (CRT) is a widely used, device-based therapy for patients with left ventricle (LV) failure. Unfortunately, many patients do not benefit from CRT, so there is potential value in identifying this group of non-responders before CRT implementation. Past studies suggest that predicting CRT response will require diverse variables, including demographic, biomarker, and LV function data. Accordingly, the objective of this study was to integrate diverse variable types into a machine learning algorithm for predicting individual patient responses to CRT. METHODS We built an ensemble classification algorithm using previously acquired data from the SMART-AV CRT clinical trial (n = 794 patients). We used five-fold stratified cross-validation on 80% of the patients (n = 635) to train the model with variables collected at 0 months (before initiating CRT), and the remaining 20% of the patients (n = 159) were used as a hold-out test set for model validation. To improve model interpretability, we quantified feature importance values using SHapley Additive exPlanations (SHAP) analysis and used Local Interpretable Model-agnostic Explanations (LIME) to explain patient-specific predictions. RESULTS Our classification algorithm incorporated 26 patient demographic and medical history variables, 12 biomarker variables, and 18 LV functional variables, which yielded correct prediction of CRT response in 71% of patients. Additional patient stratification to identify the subgroups with the highest or lowest likelihood of response showed 96% accuracy with 22 correct predictions out of 23 patients in the highest and lowest responder groups. CONCLUSION Computationally integrating general patient characteristics, comorbidities, therapy history, circulating biomarkers, and LV function data available before CRT intervention can improve the prediction of individual patient responses.
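The five-fold stratified cross-validation described in the abstract can be sketched as follows: indices are dealt to folds class by class so that every fold keeps roughly the overall responder ratio. This is a minimal illustration rather than the study's code. The helper name `stratified_folds` is hypothetical, the 381/254 class split is an assumed placeholder (the abstract reports only n = 635 for the training portion), and no shuffling is done.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def stratified_folds(labels: Sequence[int], k: int = 5) -> List[List[int]]:
    """Assign sample indices to k folds, keeping each fold near the overall class ratio."""
    if k < 2:
        raise ValueError("k must be at least 2")
    by_class: Dict[int, List[int]] = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    folds: List[List[int]] = [[] for _ in range(k)]
    for indices in by_class.values():
        # Deal the members of each class round-robin across the folds.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


# 635 training patients as in the abstract; the class balance below is assumed.
labels = [1] * 381 + [0] * 254
folds = stratified_folds(labels, k=5)

# In each round, one fold is held out for validation and the other four train the model.
splits = [
    ([i for j, f in enumerate(folds) if j != held_out for i in f], folds[held_out])
    for held_out in range(5)
]
```

The separate 20% hold-out set (n = 159) mentioned in the abstract would be set aside before this step and never enter any fold.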
Affiliation(s)
- Anamul Haque
  - Biomedical Data Science & Informatics Program, Clemson University, Clemson, SC, USA
- Doug Stubbs
  - Biomedical Data Science & Informatics Program, Clemson University, Clemson, SC, USA
- Nina C Hubig
  - Biomedical Data Science & Informatics Program, Clemson University, Clemson, SC, USA
- Francis G Spinale
  - School of Medicine, Columbia Veterans Affairs Health Care System, University of South Carolina, Columbia, SC, USA
- William J Richardson
  - Biomedical Data Science & Informatics Program, Clemson University, Clemson, SC, USA
  - Bioengineering Department, Clemson University, Clemson, SC, USA
  - 301 Rhodes Engineering Research, 29634, Clemson, SC, USA