1
Caruso CM, Guarrasi V, Ramella S, Soda P. A deep learning approach for overall survival prediction in lung cancer with missing values. Computer Methods and Programs in Biomedicine 2024; 254:108308. [PMID: 38968829 DOI: 10.1016/j.cmpb.2024.108308] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/28/2023] [Revised: 06/24/2024] [Accepted: 06/24/2024] [Indexed: 07/07/2024]
Abstract
BACKGROUND AND OBJECTIVE In the field of lung cancer research, particularly in the analysis of overall survival (OS), artificial intelligence (AI) serves crucial roles with specific aims. Given the prevalent issue of missing data in the medical domain, our primary objective is to develop an AI model capable of dynamically handling this missing data. Additionally, we aim to leverage all accessible data, effectively analyzing both uncensored patients who have experienced the event of interest and censored patients who have not, by embedding a specialized technique within our AI model, not commonly utilized in other AI tasks. Through the realization of these objectives, our model aims to provide precise OS predictions for non-small cell lung cancer (NSCLC) patients, thus overcoming these significant challenges.
METHODS We present a novel approach to survival analysis with missing values in the context of NSCLC, which exploits the strengths of the transformer architecture to account only for available features without requiring any imputation strategy. More specifically, this model tailors the transformer architecture to tabular data by adapting its feature embedding and masked self-attention to mask missing data and fully exploit the available ones. By making use of ad-hoc designed losses for OS, it is able to account for both censored and uncensored patients, as well as changes in risks over time.
RESULTS We compared our method with state-of-the-art models for survival analysis coupled with different imputation strategies. We evaluated the results obtained over a period of 6 years using different time granularities obtaining a Ct-index, a time-dependent variant of the C-index, of 71.97, 77.58 and 80.72 for time units of 1 month, 1 year and 2 years, respectively, outperforming all state-of-the-art methods regardless of the imputation method used.
CONCLUSIONS The results show that our model not only outperforms the state-of-the-art's performance but also simplifies the analysis in the presence of missing data, by effectively eliminating the need to identify the most appropriate imputation strategy for predicting OS in NSCLC patients.
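The masking idea in the METHODS paragraph can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the function name `masked_self_attention` is hypothetical, learned query/key/value projections are omitted (each token serves as its own query, key and value), and missing features are assumed to carry an arbitrary placeholder token that the mask removes from the softmax.

```python
import math

def masked_self_attention(tokens, observed):
    """Scaled dot-product self-attention that never attends to missing features.

    tokens:   list of equal-length float vectors, one per tabular feature
              (missing features hold any placeholder vector).
    observed: list of bools; False marks a missing feature.
    Returns (outputs, weights), one row per feature token.
    """
    idx = [j for j, ok in enumerate(observed) if ok]
    if not idx:
        raise ValueError("at least one feature must be observed")
    d = len(tokens[0])
    outputs, weights = [], []
    for q in tokens:
        # Scores are computed against observed tokens only, which is
        # equivalent to setting the masked scores to -inf before softmax.
        scores = [sum(a * b for a, b in zip(q, tokens[j])) / math.sqrt(d)
                  for j in idx]
        top = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        row = [0.0] * len(tokens)
        for j, e in zip(idx, exps):
            row[j] = e / total
        weights.append(row)
        outputs.append([sum(row[j] * tokens[j][c] for j in idx)
                        for c in range(d)])
    return outputs, weights

# Made-up example: three feature tokens, the second one missing.
tokens = [[1.0, 0.0], [0.0, 0.0], [0.0, 1.0]]
observed = [True, False, True]
outputs, weights = masked_self_attention(tokens, observed)
```

Because the masked token receives zero attention weight, the representations of the observed features do not depend on whatever placeholder stands in for the missing value, which is why no imputation step is required in this kind of design.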
Affiliation(s)
- Camillo Maria Caruso
- Research Unit of Computer Systems and Bioinformatics, Department of Engineering, Università Campus Bio-Medico di Roma, Rome, Italy.
- Valerio Guarrasi
- Research Unit of Computer Systems and Bioinformatics, Department of Engineering, Università Campus Bio-Medico di Roma, Rome, Italy.
- Sara Ramella
- Operative Research Unit of Radiation Oncology, Fondazione Policlinico Universitario Campus Bio-Medico, Rome, Italy.
- Paolo Soda
- Research Unit of Computer Systems and Bioinformatics, Department of Engineering, Università Campus Bio-Medico di Roma, Rome, Italy; Department of Diagnostics and Intervention, Radiation Physics, Biomedical Engineering, Umeå University, Umeå, Sweden.
2
Nguyen PA, Hsu MH, Chang TH, Yang HC, Huang CW, Liao CT, Lu CY, Hsu JC. Taipei Medical University Clinical Research Database: a collaborative hospital EHR database aligned with international common data standards. BMJ Health Care Inform 2024; 31:e100890. [PMID: 38749529 PMCID: PMC11097871 DOI: 10.1136/bmjhci-2023-100890] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/05/2023] [Accepted: 04/29/2024] [Indexed: 05/18/2024] Open
Abstract
OBJECTIVE The objective of this paper is to provide a comprehensive overview of the development and features of the Taipei Medical University Clinical Research Database (TMUCRD), a repository of real-world data (RWD) derived from electronic health records (EHRs) and other sources.
METHODS TMUCRD was developed by integrating EHRs from three affiliated hospitals, including Taipei Medical University Hospital, Wan-Fang Hospital and Shuang-Ho Hospital. The data cover over 15 years and include diverse patient care information. The database was converted to the Observational Medical Outcomes Partnership Common Data Model (OMOP CDM) for standardisation.
RESULTS TMUCRD comprises 89 tables (eg, 29 tables for each hospital and 2 linked tables), including demographics, diagnoses, medications, procedures and measurements, among others. It encompasses data from more than 4.15 million patients with various medical records, spanning from the year 2004 to 2021. The dataset offers insights into disease prevalence, medication usage, laboratory tests and patient characteristics.
DISCUSSION TMUCRD stands out due to its unique advantages, including diverse data types, comprehensive patient information, linked mortality and cancer registry data, regular updates and a swift application process. Its compatibility with the OMOP CDM enhances its usability and interoperability.
CONCLUSION TMUCRD serves as a valuable resource for researchers and scholars interested in leveraging RWD for clinical research. Its availability and integration of diverse healthcare data contribute to a collaborative and data-driven approach to advancing medical knowledge and practice.
Affiliation(s)
- Phung-Anh Nguyen
- Clinical Data Center, Office of Data Science, Taipei Medical University, Taipei, Taiwan
- Research Center of Health Care Industry Data Science, College of Management, Taipei Medical University, Taipei, Taiwan
- Clinical Big Data Research Center, Taipei Medical University Hospital, Taipei Medical University, Taipei, Taiwan
- Min-Huei Hsu
- Office of Data Science, Taipei Medical University, Taipei, Taiwan
- Graduate Institute of Data Science, College of Management, Taipei Medical University, Taipei, Taiwan
- Tzu-Hao Chang
- Clinical Big Data Research Center, Taipei Medical University Hospital, Taipei Medical University, Taipei, Taiwan
- Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University, Taipei, Taiwan
- International Center for Health Information Technology, College of Medical Science and Technology, Taipei Medical University, Taipei, Taiwan
- Hsuan-Chia Yang
- Clinical Big Data Research Center, Taipei Medical University Hospital, Taipei Medical University, Taipei, Taiwan
- Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University, Taipei, Taiwan
- International Center for Health Information Technology, College of Medical Science and Technology, Taipei Medical University, Taipei, Taiwan
- Research Center of Big Data and Meta-Analysis, Wan Fang Hospital, Taipei Medical University, Taipei, Taiwan
- Chih-Wei Huang
- Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University, Taipei, Taiwan
- International Center for Health Information Technology, College of Medical Science and Technology, Taipei Medical University, Taipei, Taiwan
- Chia-Te Liao
- Division of Nephrology, Department of Internal Medicine, Shuang Ho Hospital, Taipei Medical University, New Taipei City, Taiwan
- Division of Nephrology, Department of Internal Medicine, School of Medicine, College of Medicine, Taipei Medical University, Taipei, Taiwan
- Taipei Medical University-Research Center of Urology and Kidney, Taipei Medical University, Taipei, Taiwan
- Christine Y Lu
- Department of Population Medicine, Harvard Medical School and Harvard Pilgrim Health Care Institute, Boston, MA, USA
- Kolling Institute, Faculty of Medicine and Health, The University of Sydney and the Northern Sydney Local Health District, Sydney, NSW, Australia
- School of Pharmacy, Faculty of Medicine and Health, The University of Sydney, Sydney, NSW, Australia
- Jason C Hsu
- Clinical Data Center, Office of Data Science, Taipei Medical University, Taipei, Taiwan
- Research Center of Health Care Industry Data Science, College of Management, Taipei Medical University, Taipei, Taiwan
- Clinical Big Data Research Center, Taipei Medical University Hospital, Taipei Medical University, Taipei, Taiwan
- International Ph.D. Program in Biotech and Healthcare Management, College of Management, Taipei Medical University, Taipei, Taiwan
3
Didier AJ, Nigro A, Noori Z, Omballi MA, Pappada SM, Hamouda DM. Application of machine learning for lung cancer survival prognostication-A systematic review and meta-analysis. Front Artif Intell 2024; 7:1365777. [PMID: 38646415 PMCID: PMC11026647 DOI: 10.3389/frai.2024.1365777] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/04/2024] [Accepted: 03/18/2024] [Indexed: 04/23/2024] Open
Abstract
Introduction Machine learning (ML) techniques have gained increasing attention in the field of healthcare, including predicting outcomes in patients with lung cancer. ML has the potential to enhance prognostication in lung cancer patients and improve clinical decision-making. In this systematic review and meta-analysis, we aimed to evaluate the performance of ML models compared to logistic regression (LR) models in predicting overall survival in patients with lung cancer.
Methods We followed the Preferred Reporting Items for Systematic Reviews and Meta-Analysis (PRISMA) statement. A comprehensive search was conducted in Medline, Embase, and Cochrane databases using a predefined search query. Two independent reviewers screened abstracts and conflicts were resolved by a third reviewer. Inclusion and exclusion criteria were applied to select eligible studies. Risk of bias assessment was performed using predefined criteria. Data extraction was conducted using the Critical Appraisal and Data Extraction for Systematic Reviews of Prediction Modeling Studies (CHARMS) checklist. Meta-analytic analysis was performed to compare the discriminative ability of ML and LR models.
Results The literature search resulted in 3,635 studies, and 12 studies with a total of 211,068 patients were included in the analysis. Six studies reported confidence intervals and were included in the meta-analysis. The performance of ML models varied across studies, with C-statistics ranging from 0.60 to 0.85. The pooled analysis showed that ML models had higher discriminative ability compared to LR models, with a weighted average C-statistic of 0.78 for ML models compared to 0.70 for LR models.
Conclusion Machine learning models show promise in predicting overall survival in patients with lung cancer, with superior discriminative ability compared to logistic regression models. However, further validation and standardization of ML models are needed before their widespread implementation in clinical practice. Future research should focus on addressing the limitations of the current literature, such as potential bias and heterogeneity among studies, to improve the accuracy and generalizability of ML models for predicting outcomes in patients with lung cancer. Further research and development of ML models in this field may lead to improved patient outcomes and personalized treatment strategies.
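A weighted average C-statistic of the kind reported in the Results can be computed in several ways, and the abstract does not say which weighting the review used. The sketch below shows one common option, fixed-effect inverse-variance pooling from 95% confidence intervals, purely as an illustration. The function name `pooled_c_statistic` and the example numbers are hypothetical, and the sketch pools on the raw scale, ignoring that C-statistics are bounded.

```python
import math

def pooled_c_statistic(studies, z=1.96):
    """Fixed-effect inverse-variance pooling of C-statistics.

    studies: iterable of (estimate, ci_lower, ci_upper), with 95% CIs.
    Returns (pooled_estimate, (ci_lower, ci_upper)).
    """
    num = den = 0.0
    for est, lo, hi in studies:
        se = (hi - lo) / (2 * z)   # standard error recovered from the CI width
        w = 1.0 / se ** 2          # narrower CI means larger weight
        num += w * est
        den += w
    pooled = num / den
    se_pooled = math.sqrt(1.0 / den)
    return pooled, (pooled - z * se_pooled, pooled + z * se_pooled)

# Made-up inputs, not data from the review.
example = [(0.70, 0.65, 0.75), (0.80, 0.75, 0.85)]
pooled, ci = pooled_c_statistic(example)
```

Only studies that report a confidence interval can enter this calculation, which matches the abstract's note that six of the twelve studies were included in the meta-analysis.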
Affiliation(s)
- Alexander J. Didier
- Department of Medicine, University of Toledo College of Medicine and Life Sciences, Toledo, OH, United States
- Anthony Nigro
- Department of Medicine, University of Toledo College of Medicine and Life Sciences, Toledo, OH, United States
- Zaid Noori
- Department of Medicine, University of Toledo College of Medicine and Life Sciences, Toledo, OH, United States
- Division of Pulmonary and Critical Care Medicine, Department of Medicine, University of Toledo College of Medicine and Life Sciences, Toledo, OH, United States
- Mohamed A. Omballi
- Department of Medicine, University of Toledo College of Medicine and Life Sciences, Toledo, OH, United States
- Division of Pulmonary and Critical Care Medicine, Department of Medicine, University of Toledo College of Medicine and Life Sciences, Toledo, OH, United States
- Scott M. Pappada
- Department of Medicine, University of Toledo College of Medicine and Life Sciences, Toledo, OH, United States
- Department of Anesthesiology, The University of Toledo College of Medicine and Life Sciences, Toledo, OH, United States
- Danae M. Hamouda
- Department of Medicine, University of Toledo College of Medicine and Life Sciences, Toledo, OH, United States
- Division of Hematology and Oncology, Department of Medicine, The University of Toledo College of Medicine and Life Sciences, Toledo, OH, United States
4
Hien NTK, Tsai FJ, Chang YH, Burton W, Phuc PT, Nguyen PA, Harnod D, Lam CSK, Lu TC, Chen CI, Hsu MH, Lu CY, Huang CW, Yang HC, Hsu JC. Unveiling the future of COVID-19 patient care: groundbreaking prediction models for severe outcomes or mortality in hospitalized cases. Front Med (Lausanne) 2024; 10:1289968. [PMID: 38249981 PMCID: PMC10797111 DOI: 10.3389/fmed.2023.1289968] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/06/2023] [Accepted: 12/14/2023] [Indexed: 01/23/2024] Open
Abstract
Background Previous studies have identified COVID-19 risk factors, such as age and chronic health conditions, linked to severe outcomes and mortality. However, accurately predicting severe illness in COVID-19 patients remains challenging because precise methods are lacking.
Objective This study aimed to leverage clinical real-world data and multiple machine-learning algorithms to formulate innovative predictive models for assessing the risk of severe outcomes or mortality in hospitalized patients with COVID-19.
Methods Data were obtained from the Taipei Medical University Clinical Research Database (TMUCRD), which includes electronic health records from three hospitals in Taiwan. This study included patients admitted to the hospitals who received an initial diagnosis of COVID-19 between January 1, 2021, and May 31, 2022. The primary outcome was defined as the composite of severe infection, including ventilator use, intubation, ICU admission, and mortality. Secondary outcomes consisted of individual indicators. The dataset encompassed demographic data, health status, COVID-19 specifics, comorbidities, medications, and laboratory results. Two modes (full mode and simplified mode) were used; the former includes all features, and the latter includes only the 30 most important features selected based on the algorithm used by the best model in full mode. Seven machine learning algorithms were employed, and the performance of the models was evaluated using metrics such as the area under the receiver operating characteristic curve (AUROC), accuracy, sensitivity, and specificity.
Results The study encompassed 22,192 eligible in-patients diagnosed with COVID-19. In the full mode, the model using the light gradient boosting machine algorithm achieved the highest AUROC value (0.939), with an accuracy of 85.5%, a sensitivity of 0.897, and a specificity of 0.853. Age, vaccination status, neutrophil count, sodium levels, and platelet count were significant features. In the simplified mode, the extreme gradient boosting algorithm yielded an AUROC of 0.935, an accuracy of 89.9%, a sensitivity of 0.843, and a specificity of 0.902.
Conclusion This study illustrates the feasibility of constructing precise predictive models for severe outcomes or mortality in COVID-19 patients by leveraging significant predictors and advanced machine learning. These findings can aid healthcare practitioners in proactively predicting and monitoring severe outcomes or mortality among hospitalized COVID-19 patients, improving treatment and resource allocation.
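For readers less familiar with the metrics quoted above, the following sketch shows how AUROC, accuracy, sensitivity and specificity are defined for a binary outcome. It is a generic illustration, not the study's evaluation code: the function names, the 0.5 decision threshold and the four-patient example are assumptions.

```python
def auroc(labels, scores):
    """AUROC as the probability that a positive case outscores a negative one.

    Ties count as half a win (the Mann-Whitney formulation).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def threshold_metrics(labels, scores, threshold=0.5):
    """Accuracy, sensitivity and specificity at a fixed decision threshold."""
    tp = fp = tn = fn = 0
    for y, s in zip(labels, scores):
        predicted = 1 if s >= threshold else 0
        if y == 1 and predicted == 1:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted == 1:
            fp += 1
        else:
            tn += 1
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
    }

# Made-up labels (1 = severe outcome) and predicted risks.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(auroc(labels, scores))              # 0.75
print(threshold_metrics(labels, scores))
```

AUROC is threshold-free, while the other three depend on the chosen cut-off, which is one reason two models with nearly equal AUROC can report different sensitivity and specificity.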
Affiliation(s)
- Nguyen Thi Kim Hien
- Master Program in Global Health and Health Security, College of Public Health, Taipei Medical University, Taipei, Taiwan
- Feng-Jen Tsai
- Master Program in Global Health and Health Security, College of Public Health, Taipei Medical University, Taipei, Taiwan
- Ph.D. Program in Global Health and Health Security, College of Public Health, Taipei Medical University, Taipei, Taiwan
- Yu-Hui Chang
- PharmD Program, Division of Clinical Pharmacy, College of Pharmacy, Taipei Medical University, Taipei, Taiwan
- Whitney Burton
- International Ph.D. Program in Biotech and Healthcare Management, College of Management, Taipei Medical University, Taipei, Taiwan
- Phan Thanh Phuc
- International Ph.D. Program in Biotech and Healthcare Management, College of Management, Taipei Medical University, Taipei, Taiwan
- Phung-Anh Nguyen
- Clinical Data Center, Office of Data Science, Taipei Medical University, Taipei, Taiwan
- Clinical Big Data Research Center, Taipei Medical University Hospital, Taipei Medical University, Taipei, Taiwan
- Research Center of Health Care Industry Data Science, College of Management, Taipei Medical University, Taipei, Taiwan
- Dorji Harnod
- Department of Emergency, College of Medicine, Taipei Medical University, Taipei, Taiwan
- Department of Emergency and Critical Care Medicine, Shuang Ho Hospital, Taipei Medical University, New Taipei City, Taiwan
- Carlos Shu-Kei Lam
- Department of Emergency, College of Medicine, Taipei Medical University, Taipei, Taiwan
- Division of Emergency, Department of Emergency and Critical Care Medicine, Wan Fang Hospital, Taipei Medical University, Taipei, Taiwan
- Graduate Institute of Injury Prevention and Control, College of Public Health, Taipei Medical University, Taipei, Taiwan
- Tsung-Chien Lu
- Department of Emergency Medicine, National Taiwan University Hospital, Taipei, Taiwan
- Chang-I Chen
- Department of Healthcare Administration, School of Management, Taipei Medical University, Taipei, Taiwan
- Min-Huei Hsu
- Graduate Institute of Data Science, College of Management, Taipei Medical University, Taipei, Taiwan
- Christine Y. Lu
- Department of Population Medicine, Harvard Medical School and Harvard Pilgrim Health Care Institute, Boston, MA, United States
- School of Pharmacy, Faculty of Medicine and Health, The University of Sydney, Sydney, NSW, Australia
- Kolling Institute, Faculty of Medicine and Health, The University of Sydney and the Northern Sydney Local Health District, Sydney, NSW, Australia
- Chih-Wei Huang
- Clinical Big Data Research Center, Taipei Medical University Hospital, Taipei Medical University, Taipei, Taiwan
- International Center for Health Information Technology (ICHIT), Taipei Medical University, Taipei, Taiwan
- Hsuan-Chia Yang
- Clinical Big Data Research Center, Taipei Medical University Hospital, Taipei Medical University, Taipei, Taiwan
- International Center for Health Information Technology (ICHIT), Taipei Medical University, Taipei, Taiwan
- Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University, Taipei, Taiwan
- Research Center of Big Data and Meta-analysis, Wanfang Hospital, Taipei Medical University, Taipei, Taiwan
- Jason C. Hsu
- International Ph.D. Program in Biotech and Healthcare Management, College of Management, Taipei Medical University, Taipei, Taiwan
- Clinical Data Center, Office of Data Science, Taipei Medical University, Taipei, Taiwan
- Clinical Big Data Research Center, Taipei Medical University Hospital, Taipei Medical University, Taipei, Taiwan
- Research Center of Health Care Industry Data Science, College of Management, Taipei Medical University, Taipei, Taiwan
5
Shao CY, Luo J, Ju S, Li CL, Ding C, Chen J, Liu XL, Zhao J, Yang LQ. Online decision tools for personalized survival prediction and treatment optimization in elderly patients with lung squamous cell carcinoma: a retrospective cohort study. BMC Cancer 2023; 23:920. [PMID: 37773106 PMCID: PMC10542697 DOI: 10.1186/s12885-023-11309-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/13/2023] [Accepted: 08/17/2023] [Indexed: 09/30/2023] Open
Abstract
BACKGROUND Despite major advances in cancer therapeutics, therapeutic options specific to lung squamous cell carcinoma (LSCC) remain limited. Furthermore, the current staging system is imperfect for defining a prognosis and guiding treatment due to its simplicity and heterogeneity. We sought to develop prognostic decision tools for individualized survival prediction and treatment optimization in elderly patients with LSCC.
METHODS Clinical data of 4564 patients (stage IB-IIIB) diagnosed from 2010 to 2015 were extracted from the Surveillance, Epidemiology, and End Results (SEER) database for prognostic nomogram development. The proposed models were externally validated using a separate group consisting of 1299 patients (stage IB-IIIB) diagnosed from 2012 to 2015 in China. The prognostic performance was measured using the concordance index (C-index), calibration curves, the average time-dependent area under the receiver operating characteristic curve (AUC), and decision curve analysis.
RESULTS Eleven candidate prognostic variables were identified by the univariable and multivariable Cox regression analysis. The calibration curves showed satisfactory agreement between the actual and nomogram-estimated lung cancer-specific survival (LCSS) rates. Based on the C-indices and average AUC, our nomograms showed higher prognostic accuracy than the current staging system. Clinical usefulness was revealed by the decision curve analysis. User-friendly online decision tools integrating the proposed nomograms were created to estimate survival for patients with different treatment regimens.
CONCLUSIONS The decision tools for individualized survival prediction and treatment optimization might assist clinicians with decision-making, medical teaching, and experimental design. Online tools are expected to be integrated into clinical practice by using the freely available website ( https://loyal-brand-611803.framer.app/ ).
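The concordance index used to assess the nomograms can be sketched as follows. This is a plain Harrell-style C-index written for illustration, not the study's code: the function name `concordance_index` is hypothetical, pairs with tied survival times are simply skipped, and the three-patient example is made up.

```python
def concordance_index(times, events, risks):
    """Harrell-style C-index for right-censored survival data.

    times:  observed follow-up time per patient.
    events: 1 if the event (e.g. death) was observed, 0 if censored.
    risks:  predicted risk score; higher means shorter expected survival.

    A pair is comparable when the patient with the shorter time had an
    observed event. It is concordant when that patient also has the
    higher predicted risk; tied risks count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Made-up cohort: the third patient is censored at 15 months.
times = [5, 10, 15]
events = [1, 1, 0]
risks = [0.9, 0.5, 0.1]
print(concordance_index(times, events, risks))  # 1.0
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so comparing the C-index of a nomogram with that of the staging system is a comparison of how well each orders patients by survival.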
Affiliation(s)
- Chen-Ye Shao
- Department of Thoracic Surgery, The First Affiliated Hospital of Soochow University, 899 Pinghai Road, Gusu District, Suzhou, 215006, China
- Jing Luo
- Department of Thoracic Surgery, The First Affiliated Hospital of Soochow University, 899 Pinghai Road, Gusu District, Suzhou, 215006, China
- Institute of Thoracic Surgery, The First Affiliated Hospital of Soochow University, 899 Pinghai Road, Gusu District, Suzhou, 215006, China
- Sheng Ju
- Department of Thoracic Surgery, The First Affiliated Hospital of Soochow University, 899 Pinghai Road, Gusu District, Suzhou, 215006, China
- Institute of Thoracic Surgery, The First Affiliated Hospital of Soochow University, 899 Pinghai Road, Gusu District, Suzhou, 215006, China
- Chu-Ling Li
- Department of Respiratory Medicine, Jinling Hospital, Nanjing Medical University, Nanjing, China
- Department of Respiratory Medicine, Jinling Hospital Medical School of Nanjing University, Nanjing, China
- Cheng Ding
- Department of Thoracic Surgery, The First Affiliated Hospital of Soochow University, 899 Pinghai Road, Gusu District, Suzhou, 215006, China
- Institute of Thoracic Surgery, The First Affiliated Hospital of Soochow University, 899 Pinghai Road, Gusu District, Suzhou, 215006, China
- Jun Chen
- Department of Thoracic Surgery, The First Affiliated Hospital of Soochow University, 899 Pinghai Road, Gusu District, Suzhou, 215006, China
- Institute of Thoracic Surgery, The First Affiliated Hospital of Soochow University, 899 Pinghai Road, Gusu District, Suzhou, 215006, China
- Xiao-Long Liu
- Department of Cardiothoracic Surgery, Jinling Hospital, Medical School of Nanjing University, 305 Zhongshan East Road, Nanjing, 210002, Jiangsu, China.
- Jun Zhao
- Department of Thoracic Surgery, The First Affiliated Hospital of Soochow University, 899 Pinghai Road, Gusu District, Suzhou, 215006, China.
- Institute of Thoracic Surgery, The First Affiliated Hospital of Soochow University, 899 Pinghai Road, Gusu District, Suzhou, 215006, China.
- Li-Qin Yang
- Department of Thoracic Surgery, The First Affiliated Hospital of Soochow University, 899 Pinghai Road, Gusu District, Suzhou, 215006, China.
- Institute of Thoracic Surgery, The First Affiliated Hospital of Soochow University, 899 Pinghai Road, Gusu District, Suzhou, 215006, China.