1
Zhou YF, Wang J, Wang XL, Song SS, Bai Y, Li JL, Luo JY, Jin QQ, Cai WC, Yuan KM, Li J. A prediction model of elderly hip fracture mortality including preoperative red cell distribution width constructed based on the random survival forest (RSF) and Cox risk ratio regression. Osteoporos Int 2024; 35:613-623. [PMID: 38062161 DOI: 10.1007/s00198-023-06988-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/16/2023] [Accepted: 11/20/2023] [Indexed: 03/22/2024]
Abstract
An independent correlation between preoperative red cell distribution width (pre-RDW) and 1-year mortality after surgery for hip fracture in elderly patients can be used to predict mortality in these patients and has predictive significance in patients with anemia. With further research, a treatment algorithm can be developed to potentially identify patients at high risk of preoperative mortality.
INTRODUCTION Red blood cell distribution width (RDW) is an independent predictor of various disease states in elderly individuals, but its association with the prognosis of elderly hip fracture patients is controversial. This study aimed to evaluate the prognostic value of RDW in such patients, construct a prediction model containing RDW using random survival forest (RSF) and Cox regression analysis, and compare RDW in patients with and without anemia.
METHODS We retrospectively analyzed the data of elderly patients who underwent hip fracture surgery, selected the best variables using RSF, stratified the independent variables by Cox regression analysis, constructed a 1-year mortality prediction model for elderly hip fracture that includes RDW, and conducted internal and external validation.
RESULTS A total of 2,106 patients were included in this study. The RSF algorithm selected 12 important influencing factors, and Cox regression analysis showed that eight variables, including pre-RDW, were independent risk factors for death within 1 year after hip fracture surgery in elderly patients. Stratified analysis showed that pre-RDW was still independently associated with 1-year mortality in the non-anemia group but not in the anemia group. The nomogram prediction model had high discrimination and fit, and the prediction model constructed from the total cohort was also validated in the patients with anemia and showed good clinical benefit.
CONCLUSION An independent correlation between pre-RDW and 1-year mortality after surgery for hip fracture in elderly patients can be used to predict mortality in these patients and has predictive significance in patients with anemia.
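How a Cox-based nomogram turns covariates into a 1-year mortality estimate can be sketched as follows. This is a minimal illustration, not the authors' model: the covariate names, the coefficients in `COEFS`, and `BASELINE_SURVIVAL_1Y` are all hypothetical placeholders, since the abstract does not report them.

```python
import math

# Hypothetical log-hazard-ratio coefficients; NOT the values from the study.
COEFS = {"age_per_10y": 0.40, "pre_rdw_pct": 0.10, "male": 0.25}
# Hypothetical baseline 1-year survival for a patient at the reference values.
BASELINE_SURVIVAL_1Y = 0.95


def linear_predictor(patient, reference):
    """Sum of coefficient * (value - reference value) over all covariates."""
    return sum(COEFS[k] * (patient[k] - reference[k]) for k in COEFS)


def one_year_mortality(patient, reference):
    """Cox model: S(t | x) = S0(t) ** exp(lp); mortality is 1 - S(t | x)."""
    lp = linear_predictor(patient, reference)
    return 1.0 - BASELINE_SURVIVAL_1Y ** math.exp(lp)


reference = {"age_per_10y": 7.5, "pre_rdw_pct": 13.0, "male": 0}
patient = {"age_per_10y": 8.5, "pre_rdw_pct": 15.0, "male": 1}
print(round(one_year_mortality(reference, reference), 3))
print(round(one_year_mortality(patient, reference), 3))
```

A nomogram is a graphical form of the same calculation: each covariate contributes points proportional to its coefficient, and the total maps to a survival probability.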
Affiliation(s)
- Ying-Feng Zhou
- Department of Anesthesiology and Perioperative Medicine, The Second Affiliated Hospital and Yuying Children's Hospital of Wenzhou Medical University, Key Laboratory of Pediatric Anesthesiology, Ministry of Education, Key Laboratory of Anesthesiology of Zhejiang Province, Wenzhou Medical University, Zhejiang, China
- Jiao Wang
- Department of Anesthesiology and Perioperative Medicine, The Second Affiliated Hospital and Yuying Children's Hospital of Wenzhou Medical University, Key Laboratory of Pediatric Anesthesiology, Ministry of Education, Key Laboratory of Anesthesiology of Zhejiang Province, Wenzhou Medical University, Zhejiang, China
- Xin-Lin Wang
- Department of Anesthesiology and Perioperative Medicine, The Second Affiliated Hospital and Yuying Children's Hospital of Wenzhou Medical University, Key Laboratory of Pediatric Anesthesiology, Ministry of Education, Key Laboratory of Anesthesiology of Zhejiang Province, Wenzhou Medical University, Zhejiang, China
- Shu-Shu Song
- Department of Anesthesiology, Wenzhou Central Hospital, Zhejiang, China
- Yue Bai
- Department of Anesthesiology and Perioperative Medicine, The Second Affiliated Hospital and Yuying Children's Hospital of Wenzhou Medical University, Key Laboratory of Pediatric Anesthesiology, Ministry of Education, Key Laboratory of Anesthesiology of Zhejiang Province, Wenzhou Medical University, Zhejiang, China
- Jian-Lin Li
- Department of Anesthesiology and Perioperative Medicine, The Second Affiliated Hospital and Yuying Children's Hospital of Wenzhou Medical University, Key Laboratory of Pediatric Anesthesiology, Ministry of Education, Key Laboratory of Anesthesiology of Zhejiang Province, Wenzhou Medical University, Zhejiang, China
- Jing-Yu Luo
- Department of Anesthesiology and Perioperative Medicine, The Second Affiliated Hospital and Yuying Children's Hospital of Wenzhou Medical University, Key Laboratory of Pediatric Anesthesiology, Ministry of Education, Key Laboratory of Anesthesiology of Zhejiang Province, Wenzhou Medical University, Zhejiang, China
- Qi-Qi Jin
- Department of Anesthesiology and Perioperative Medicine, The Second Affiliated Hospital and Yuying Children's Hospital of Wenzhou Medical University, Key Laboratory of Pediatric Anesthesiology, Ministry of Education, Key Laboratory of Anesthesiology of Zhejiang Province, Wenzhou Medical University, Zhejiang, China
- Wei-Cha Cai
- Department of Anesthesiology and Perioperative Medicine, The Second Affiliated Hospital and Yuying Children's Hospital of Wenzhou Medical University, Key Laboratory of Pediatric Anesthesiology, Ministry of Education, Key Laboratory of Anesthesiology of Zhejiang Province, Wenzhou Medical University, Zhejiang, China
- Kai-Ming Yuan
- Department of Anesthesiology and Perioperative Medicine, The Second Affiliated Hospital and Yuying Children's Hospital of Wenzhou Medical University, Key Laboratory of Pediatric Anesthesiology, Ministry of Education, Key Laboratory of Anesthesiology of Zhejiang Province, Wenzhou Medical University, Zhejiang, China.
- Jun Li
- Department of Anesthesiology and Perioperative Medicine, The Second Affiliated Hospital and Yuying Children's Hospital of Wenzhou Medical University, Key Laboratory of Pediatric Anesthesiology, Ministry of Education, Key Laboratory of Anesthesiology of Zhejiang Province, Wenzhou Medical University, Zhejiang, China.
2
Chen D, Liang S, Chen J, Li K, Mi H. Machine learning-based overall and cancer-specific survival prediction of M0 penile squamous cell carcinoma: A population-based retrospective study. Heliyon 2024; 10:e23442. [PMID: 38163093 PMCID: PMC10755306 DOI: 10.1016/j.heliyon.2023.e23442] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/04/2023] [Revised: 12/02/2023] [Accepted: 12/04/2023] [Indexed: 01/03/2024] Open
Abstract
Background Penile cancer is a rare tumor, and few studies have focused on the prognosis of M0 penile squamous cell carcinoma (PSCC). This retrospective study aimed to identify independent prognostic factors and construct predictive models for the overall survival (OS) and cancer-specific survival (CSS) of patients with M0 PSCC.
Methods Data were extracted from the Surveillance, Epidemiology, and End Results database for patients diagnosed with malignant penile cancer. Eligible patients with M0 PSCC were selected according to predetermined inclusion and exclusion criteria. These patients were then divided into a training set, a validation set, and a test set. Univariate and multivariate Cox regression analyses were initially performed to identify independent prognostic factors for OS and CSS in M0 PSCC patients. Subsequently, traditional and machine learning prognostic models, including random survival forest (RSF), Cox, gradient boosting, and component-wise gradient boosting models, were constructed using the scikit-survival framework. The performance of each model was assessed by calculating the time-dependent area under the curve (AUC), C-index, and integrated Brier score (IBS) to identify the best-performing model. Finally, Shapley additive explanation (SHAP) values, feature importance, and cumulative rate analyses were used to further evaluate the selected models.
Results A total of 2,446 patients were included in our study. Cox regression analyses demonstrated that age, N stage, and tumor size were predictors of OS, while N stage, tumor size, surgery, and residential area were predictors of CSS. The RSF and Cox models had a higher time-dependent AUC and C-index, and a lower IBS, than the other models in OS and CSS prediction. Feature importance analysis revealed N stage as a common significant feature for predicting the survival of M0 PSCC patients. The SHAP and cumulative rate analyses demonstrated that the selected models can effectively evaluate the prognosis of M0 PSCC patients.
Conclusion In M0 PSCC patients, age, N stage, and tumor size were predictors of OS. In addition, N stage, tumor size, surgery, and residential area were predictors of CSS. The machine learning-based RSF and Cox models effectively predicted the prognosis of M0 PSCC patients.
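The C-index used above to rank the models can be illustrated with a small, dependency-free sketch of Harrell's concordance index. It is a simplified stand-in, not the scikit-survival routine the authors used, and the toy data are invented for illustration.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C: share of comparable pairs that the model ranks correctly.

    A pair (i, j) is comparable when subject i has the shorter time and an
    observed event. A higher risk score should go with shorter survival.
    Tied risk scores count as half-concordant.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # Subject i must fail first, with an uncensored event.
            if times[i] < times[j] and events[i]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): follow-up in months, event flag (1 = death), risk.
times = [5, 10, 15, 20]
events = [1, 0, 1, 1]
risk = [0.9, 0.6, 0.05, 0.1]
print(concordance_index(times, events, risk))  # 0.75
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is why a higher C-index favours a model.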
Affiliation(s)
- Kezhen Li
- Department of urology, The First Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi, 530001, China
- Hua Mi
- Department of urology, The First Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi, 530001, China
3
Ma B, Chen B, Cai C, Zhang J. Establishment of survival models for primary prostate cancer and colorectal cancer based on the random survival forest. Asian J Surg 2023; 46:5787-5788. [PMID: 37666701 DOI: 10.1016/j.asjsur.2023.08.156] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/30/2023] [Accepted: 08/24/2023] [Indexed: 09/06/2023] Open
Affiliation(s)
- Bingqing Ma
- Department of Emergency General Surgery, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, China
- Biao Chen
- Department of Emergency General Surgery, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, China
- Chengjun Cai
- Department of Emergency General Surgery, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, China
- Jinxiang Zhang
- Department of Emergency General Surgery, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, China.
4
Huang Y, Li J, Li M, Aparasu RR. Application of machine learning in predicting survival outcomes involving real-world data: a scoping review. BMC Med Res Methodol 2023; 23:268. [PMID: 37957593 PMCID: PMC10641971 DOI: 10.1186/s12874-023-02078-1] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 07/18/2023] [Accepted: 10/20/2023] [Indexed: 11/15/2023] Open
Abstract
BACKGROUND Despite the interest in machine learning (ML) algorithms for analyzing real-world data (RWD) in healthcare, the use of ML in predicting time-to-event data, a common scenario in clinical practice, is less explored. ML models are capable of algorithmically learning from large, complex datasets and can offer advantages in predicting time-to-event data. We reviewed the recent applications of ML for survival analysis using RWD in healthcare.
METHODS PubMed and Embase were searched from database inception through March 2023 to identify peer-reviewed English-language studies of ML models for predicting time-to-event outcomes using RWD. Two reviewers extracted information on the data source, patient population, survival outcome, ML algorithms, and the area under the curve (AUC).
RESULTS Of 257 citations, 28 publications were included. Random survival forests (N = 16, 57%) and neural networks (N = 11, 39%) were the most popular ML algorithms. AUC varied across these ML models (median 0.789, range 0.6-0.950). ML algorithms were predominantly considered for predicting overall survival in oncology (N = 12, 43%). ML survival models were often used to predict disease prognosis or clinical events (N = 27, 96%) in oncology, while fewer were used for treatment outcomes (N = 1, 4%).
CONCLUSIONS The ML algorithms random survival forests and neural networks are mainly used with RWD to predict survival outcomes such as disease prognosis or clinical events in oncology. This review shows that more opportunities remain to apply these ML algorithms to inform treatment decision-making in clinical practice. More methodological work is also needed to ensure the utility and applicability of ML models in survival outcomes.
Affiliation(s)
- Yinan Huang
- Department of Pharmacy Administration, School of Pharmacy, University of Mississippi, University, MS, 38677, USA
- Jieni Li
- Department of Pharmaceutical Health Outcomes and Policy, College of Pharmacy, University of Houston, Houston, TX, 77204, USA
- Mai Li
- Department of Industrial Engineering, Cullen College of Engineering, University of Houston, Houston, TX, USA
- Rajender R Aparasu
- Department of Pharmaceutical Health Outcomes and Policy, College of Pharmacy, University of Houston, Houston, TX, 77204, USA.
5
Wang N, Lin Y, Song H, Huang W, Huang J, Shen L, Chen F, Liu F, Wang J, Qiu Y, Shi B, Lin L, He B. Development and validation of a model for the prediction of disease-specific survival in patients with oral squamous cell carcinoma: based on random survival forest analysis. Eur Arch Otorhinolaryngol 2023; 280:5049-5057. [PMID: 37535081 DOI: 10.1007/s00405-023-08087-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/07/2023] [Accepted: 06/20/2023] [Indexed: 08/04/2023]
Abstract
OBJECTIVE To establish a model for predicting the disease-specific survival (DSS) of patients with oral squamous cell carcinoma (OSCC).
METHODS Patients diagnosed with OSCC in the Surveillance, Epidemiology, and End Results (SEER) database were enrolled and randomly divided into development (n = 14,495) and internal validation (n = 9,625) cohorts. Additionally, a cohort from a hospital located in Southeastern China was used for external validation (n = 582).
RESULTS TNM stage, adjuvant treatment, surgery, tumor site, age, grade, and gender were used to construct the random survival forest (RSF) model in the development cohort. The effectiveness of the model was confirmed through time-dependent ROC curves in the different cohorts. The hazard ratio of death due to OSCC increased almost exponentially with the risk score. In the development, internal validation, and external validation cohorts, the prognosis was significantly worse for patients in groups with higher risk scores (all log-rank P < 0.05).
CONCLUSION Based on RSF, a high-performance prediction model for OSCC prognosis was created and verified in this study.
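The log-rank comparison between risk groups reported above can be sketched for the two-group case as follows. This is a textbook formulation under simplifying assumptions (two groups, right censoring only); the abstract does not say how many risk groups were compared or which software was used, and the toy data are invented.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square, p-value) on 1 df."""
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})
    obs_minus_exp = 0.0
    variance = 0.0
    for t in event_times:
        n = sum(1 for s, _, _ in data if s >= t)  # at risk, both groups
        n_a = sum(1 for s, _, g in data if s >= t and g == 0)  # at risk, group A
        d = sum(1 for s, e, _ in data if s == t and e)  # events at time t
        d_a = sum(1 for s, e, g in data if s == t and e and g == 0)
        # Observed minus expected events in group A under the null hypothesis.
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)
    if variance == 0:
        raise ValueError("variance is zero; the groups cannot be compared")
    chi2 = obs_minus_exp ** 2 / variance
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Toy data (hypothetical): a high-risk group that dies early, a low-risk group later.
high_risk = ([1, 2, 3, 4], [1, 1, 1, 1])
low_risk = ([5, 6, 7, 8], [1, 1, 1, 1])
chi2, p = logrank_test(*high_risk, *low_risk)
print(round(chi2, 2), p < 0.05)
```

A small p-value indicates that the survival curves of the two groups differ more than chance would explain, which is the sense of "log-rank P < 0.05" in the abstract.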
Affiliation(s)
- Na Wang
- Department of Epidemiology and Health Statistics, Fujian Provincial Key Laboratory of Environment Factors and Cancer, School of Public Health, Fujian Medical University, Fujian, China
- Key Laboratory of Ministry of Education for Gastrointestinal Cancer, Fujian Medical University, Fujian, China
- Yulan Lin
- Department of Epidemiology and Health Statistics, Fujian Provincial Key Laboratory of Environment Factors and Cancer, School of Public Health, Fujian Medical University, Fujian, China
- Key Laboratory of Ministry of Education for Gastrointestinal Cancer, Fujian Medical University, Fujian, China
- Haoyuan Song
- Department of Epidemiology and Health Statistics, Fujian Provincial Key Laboratory of Environment Factors and Cancer, School of Public Health, Fujian Medical University, Fujian, China
- Key Laboratory of Ministry of Education for Gastrointestinal Cancer, Fujian Medical University, Fujian, China
- Weihai Huang
- Department of Epidemiology and Health Statistics, Fujian Provincial Key Laboratory of Environment Factors and Cancer, School of Public Health, Fujian Medical University, Fujian, China
- Key Laboratory of Ministry of Education for Gastrointestinal Cancer, Fujian Medical University, Fujian, China
- Jingyao Huang
- Department of Epidemiology and Health Statistics, Fujian Provincial Key Laboratory of Environment Factors and Cancer, School of Public Health, Fujian Medical University, Fujian, China
- Key Laboratory of Ministry of Education for Gastrointestinal Cancer, Fujian Medical University, Fujian, China
- Liling Shen
- Department of Epidemiology and Health Statistics, Fujian Provincial Key Laboratory of Environment Factors and Cancer, School of Public Health, Fujian Medical University, Fujian, China
- Key Laboratory of Ministry of Education for Gastrointestinal Cancer, Fujian Medical University, Fujian, China
- Fa Chen
- Department of Epidemiology and Health Statistics, Fujian Provincial Key Laboratory of Environment Factors and Cancer, School of Public Health, Fujian Medical University, Fujian, China
- Key Laboratory of Ministry of Education for Gastrointestinal Cancer, Fujian Medical University, Fujian, China
- Fengqiong Liu
- Department of Epidemiology and Health Statistics, Fujian Provincial Key Laboratory of Environment Factors and Cancer, School of Public Health, Fujian Medical University, Fujian, China
- Key Laboratory of Ministry of Education for Gastrointestinal Cancer, Fujian Medical University, Fujian, China
- Jing Wang
- Laboratory Center, The Major Subject of Environment and Health of Fujian Key Universities, School of Public Health, Fujian Medical University, Fujian, China
- Yu Qiu
- Department of Oral and Maxillofacial Surgery, the First Affiliated Hospital of Fujian Medical University, Fuzhou, China
- Bin Shi
- Department of Oral and Maxillofacial Surgery, the First Affiliated Hospital of Fujian Medical University, Fuzhou, China
- Lisong Lin
- Department of Oral and Maxillofacial Surgery, the First Affiliated Hospital of Fujian Medical University, Fuzhou, China.
- Baochang He
- Department of Epidemiology and Health Statistics, Fujian Provincial Key Laboratory of Environment Factors and Cancer, School of Public Health, Fujian Medical University, Fujian, China.
- Key Laboratory of Ministry of Education for Gastrointestinal Cancer, Fujian Medical University, Fujian, China.
- Department of Oral and Maxillofacial Surgery, the First Affiliated Hospital of Fujian Medical University, Fuzhou, China.
6
Wang Y, Deng Y, Tan Y, Zhou M, Jiang Y, Liu B. A comparison of random survival forest and Cox regression for prediction of mortality in patients with hemorrhagic stroke. BMC Med Inform Decis Mak 2023; 23:215. [PMID: 37833724 PMCID: PMC10576378 DOI: 10.1186/s12911-023-02293-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/18/2023] [Accepted: 09/11/2023] [Indexed: 10/15/2023] Open
Abstract
OBJECTIVE To evaluate random survival forest (RSF) and Cox models for mortality prediction of hemorrhagic stroke (HS) patients in the intensive care unit (ICU).
METHODS In the training set, the optimal models were selected using five-fold cross-validation and grid search. In the test set, the bootstrap method was used for validation. The area under the curve (AUC) was used to assess discrimination and the Brier score (BS) to assess calibration; positive predictive value (PPV), negative predictive value (NPV), and F1 score were combined for comparison.
RESULTS A total of 2,990 HS patients were included. For predicting 7-day mortality, the mean AUCs for RSF and Cox regression were 0.875 and 0.761, while the mean BS were 0.083 and 0.108. For predicting 28-day mortality, the mean AUCs for RSF and Cox regression were 0.794 and 0.649, while the mean BS were 0.129 and 0.174. The mean AUCs of RSF and Cox versus conventional scores for predicting patients' 7-day mortality were 0.875 (RSF), 0.761 (Cox), 0.736 (SAPS II), 0.723 (OASIS), 0.632 (SIRS), and 0.596 (SOFA).
CONCLUSIONS RSF provided a better clinical reference than Cox. Creatine, temperature, anion gap and sodium were important variables in both models.
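The Brier score, PPV, NPV, and F1 score named in the methods can be computed for a fixed horizon (for example, death within 7 days) as in the sketch below. It assumes every patient's status at the horizon is known, so censoring is ignored; the abstract does not state how the authors handled censoring, and the 0.5 threshold and the toy data are assumptions.

```python
def brier_score(outcomes, predicted_probs):
    """Mean squared difference between predicted probability and 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(outcomes, predicted_probs)) / len(outcomes)


def classification_metrics(outcomes, predicted_probs, threshold=0.5):
    """PPV, NPV and F1 after thresholding the predicted probabilities."""
    tp = fp = tn = fn = 0
    for y, p in zip(outcomes, predicted_probs):
        predicted_positive = p >= threshold
        if predicted_positive and y:
            tp += 1
        elif predicted_positive and not y:
            fp += 1
        elif not predicted_positive and y:
            fn += 1
        else:
            tn += 1
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    npv = tn / (tn + fn) if (tn + fn) else float("nan")
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else float("nan")
    return {"ppv": ppv, "npv": npv, "f1": f1}


# Toy data (hypothetical): 1 = died within the horizon, with predicted risk.
outcomes = [1, 0, 1, 0, 0]
probs = [0.8, 0.2, 0.4, 0.6, 0.1]
print(round(brier_score(outcomes, probs), 3))
print(classification_metrics(outcomes, probs))
```

A lower Brier score is better, so the reported 0.083 (RSF) versus 0.108 (Cox) at 7 days favours RSF on this measure, in line with its higher AUC.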
Affiliation(s)
- Yuxin Wang
- Department of Social Medicine and Health Education, School of Public Health, Peking University, Beijing, China
- Yuhan Deng
- Department of Social Medicine and Health Education, School of Public Health, Peking University, Beijing, China
- Yinliang Tan
- Department of Social Medicine and Health Education, School of Public Health, Peking University, Beijing, China
- Meihong Zhou
- Department of Social Medicine and Health Education, School of Public Health, Peking University, Beijing, China
- Yong Jiang
- Department of Neurology, Beijing Tiantan Hospital, Capital Medical University, Beijing, China.
- China National Clinical Research Center for Neurological Diseases, Beijing, China.
- Baohua Liu
- Department of Social Medicine and Health Education, School of Public Health, Peking University, Beijing, China.
7
Villasanta-Gonzalez A, Mora-Ortiz M, Alcala-Diaz JF, Rivas-Garcia L, Torres-Peña JD, Lopez-Bascon A, Calderon-Santiago M, Arenas-Larriva AP, Priego-Capote F, Malagon MM, Eichelmann F, Perez-Martinez P, Delgado-Lista J, Schulze MB, Camargo A, Lopez-Miranda J. Plasma lipidic fingerprint associated with type 2 diabetes in patients with coronary heart disease: CORDIOPREV study. Cardiovasc Diabetol 2023; 22:199. [PMID: 37537576 PMCID: PMC10401778 DOI: 10.1186/s12933-023-01933-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/21/2023] [Accepted: 07/21/2023] [Indexed: 08/05/2023] Open
Abstract
OBJECTIVE We aimed to identify a lipidic profile associated with type 2 diabetes mellitus (T2DM) development in coronary heart disease (CHD) patients, to provide a new, highly sensitive model which could be used in clinical practice to identify patients at T2DM risk.
METHODS This study considered the 462 patients of the CORDIOPREV study (CHD patients) who were not diabetic at the beginning of the intervention. In total, 107 of them developed T2DM after a median follow-up of 60 months. They were diagnosed using the American Diabetes Association criteria. A novel lipidomic methodology employing liquid chromatography (LC) separation followed by HESI and detection by mass spectrometry (MS) was used to annotate the lipids at the isomer level. The patients were then divided into a Training Set and a Validation Set (60-40). Next, a Random Survival Forest (RSF) was used to detect the lipidic isomers with the lowest prediction error. These lipids were then used to build a Lipidomic Risk (LR) score, which was evaluated through Cox regression. Finally, a production model combining the clinical variables of interest and the lipidic species was built.
RESULTS LC-tandem MS annotated 440 lipid species. From those, the RSF identified 15 lipid species with the lowest prediction error. These lipids were combined in an LR score which showed an association with the development of T2DM. The LR hazard ratio per unit standard deviation was 2.87 and 1.43 in the Training and Validation Sets, respectively. Likewise, patients with higher LR score values had lower insulin sensitivity (P = 0.006) and higher liver insulin resistance (P = 0.005). The receiver operating characteristic (ROC) curve obtained by combining clinical variables and the selected lipidic isomers using a generalised linear model had an area under the curve (AUC) of 81.3%.
CONCLUSION Our study showed the potential of comprehensive lipidomic analysis in identifying patients at risk of developing T2DM. In addition, the lipid species combined with clinical variables provided a new, highly sensitive model which can be used in clinical practice to identify patients at T2DM risk. Moreover, these results also indicate that we need to look closely at isomers to understand the role of this specific compound in T2DM development.
Trial registration NCT00924937.
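What a "hazard ratio per unit standard deviation" implies can be worked through with a short sketch. It assumes the usual Cox setup in which the score enters the model as a z-score with a log-linear effect; the abstract does not spell this out, and the function names and toy scores are hypothetical.

```python
import math
import statistics


def standardize(values):
    """Convert raw scores to z-scores (mean 0, sample standard deviation 1)."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def hazard_ratio_for_shift(hr_per_sd, n_sd):
    """Hazard ratio for a shift of n_sd standard deviations in the score."""
    # Under a log-linear effect, log-hazards add, so hazard ratios multiply.
    return math.exp(math.log(hr_per_sd) * n_sd)


print(standardize([1.0, 2.0, 3.0]))
# Reported Training Set value: HR 2.87 per SD of the LR score.
print(round(hazard_ratio_for_shift(2.87, 2), 2))
# Reported Validation Set value: HR 1.43 per SD.
print(round(hazard_ratio_for_shift(1.43, 2), 2))
```

Under these assumptions, a patient two standard deviations above the mean LR score would have roughly 8.2 times the hazard in the Training Set but only about 2.0 times in the Validation Set, which shows how much the effect shrinks on validation.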
Affiliation(s)
- Alejandro Villasanta-Gonzalez
- Lipids and Atherosclerosis Unit, Department of Internal Medicine, Reina Sofia University Hospital, University of Cordoba, Cordoba, Spain
- Department of Medical and Surgical Sciences, University of Cordoba, Cordoba, Spain
- Instituto Maimonides de Investigación Biomédica de Córdoba (IMIBIC), Córdoba, Spain
- Marina Mora-Ortiz
- Lipids and Atherosclerosis Unit, Department of Internal Medicine, Reina Sofia University Hospital, University of Cordoba, Cordoba, Spain
- Department of Medical and Surgical Sciences, University of Cordoba, Cordoba, Spain
- Instituto Maimonides de Investigación Biomédica de Córdoba (IMIBIC), Córdoba, Spain
- Juan F Alcala-Diaz
- Lipids and Atherosclerosis Unit, Department of Internal Medicine, Reina Sofia University Hospital, University of Cordoba, Cordoba, Spain
- Department of Medical and Surgical Sciences, University of Cordoba, Cordoba, Spain
- Instituto Maimonides de Investigación Biomédica de Córdoba (IMIBIC), Córdoba, Spain
- CIBER Fisiopatología de la Obesidad y Nutrición (CIBEROBN), Instituto de Salud Carlos III, Madrid, Spain
- Lorenzo Rivas-Garcia
- Lipids and Atherosclerosis Unit, Department of Internal Medicine, Reina Sofia University Hospital, University of Cordoba, Cordoba, Spain
- Department of Medical and Surgical Sciences, University of Cordoba, Cordoba, Spain
- Instituto Maimonides de Investigación Biomédica de Córdoba (IMIBIC), Córdoba, Spain
- Jose D Torres-Peña
- Lipids and Atherosclerosis Unit, Department of Internal Medicine, Reina Sofia University Hospital, University of Cordoba, Cordoba, Spain
- Department of Medical and Surgical Sciences, University of Cordoba, Cordoba, Spain
- Instituto Maimonides de Investigación Biomédica de Córdoba (IMIBIC), Córdoba, Spain
- CIBER Fisiopatología de la Obesidad y Nutrición (CIBEROBN), Instituto de Salud Carlos III, Madrid, Spain
- Asuncion Lopez-Bascon
- Department of Analytical Chemistry and Nanochemistry University Institute, University of Cordoba, Cordoba, Spain
- CIBER de Fragilidad y Envejecimiento Saludable (CIBERFES), Instituto de Salud Carlos III, Madrid, Spain
- Monica Calderon-Santiago
- Department of Analytical Chemistry and Nanochemistry University Institute, University of Cordoba, Cordoba, Spain
- CIBER de Fragilidad y Envejecimiento Saludable (CIBERFES), Instituto de Salud Carlos III, Madrid, Spain
- Antonio P Arenas-Larriva
- Lipids and Atherosclerosis Unit, Department of Internal Medicine, Reina Sofia University Hospital, University of Cordoba, Cordoba, Spain
- Department of Medical and Surgical Sciences, University of Cordoba, Cordoba, Spain
- Instituto Maimonides de Investigación Biomédica de Córdoba (IMIBIC), Córdoba, Spain
- Feliciano Priego-Capote
- Department of Analytical Chemistry and Nanochemistry University Institute, University of Cordoba, Cordoba, Spain
- CIBER de Fragilidad y Envejecimiento Saludable (CIBERFES), Instituto de Salud Carlos III, Madrid, Spain
- Maria M Malagon
- Instituto Maimonides de Investigación Biomédica de Córdoba (IMIBIC), Córdoba, Spain
- CIBER Fisiopatología de la Obesidad y Nutrición (CIBEROBN), Instituto de Salud Carlos III, Madrid, Spain
- Department of Cell Biology, Physiology and Immunology, University of Cordoba, Cordoba, Spain
- Fabian Eichelmann
- German Center for Diabetes Research, Munich-Neuherberg, Germany
- Department of Molecular Epidemiology, German Institute of Human Nutrition Potsdam-Rehbrücke, Nuthetal, Germany
- Pablo Perez-Martinez
- Lipids and Atherosclerosis Unit, Department of Internal Medicine, Reina Sofia University Hospital, University of Cordoba, Cordoba, Spain
- Department of Medical and Surgical Sciences, University of Cordoba, Cordoba, Spain
- Instituto Maimonides de Investigación Biomédica de Córdoba (IMIBIC), Córdoba, Spain
- CIBER Fisiopatología de la Obesidad y Nutrición (CIBEROBN), Instituto de Salud Carlos III, Madrid, Spain
- Javier Delgado-Lista
- Lipids and Atherosclerosis Unit, Department of Internal Medicine, Reina Sofia University Hospital, University of Cordoba, Cordoba, Spain
- Department of Medical and Surgical Sciences, University of Cordoba, Cordoba, Spain
- Instituto Maimonides de Investigación Biomédica de Córdoba (IMIBIC), Córdoba, Spain
- CIBER Fisiopatología de la Obesidad y Nutrición (CIBEROBN), Instituto de Salud Carlos III, Madrid, Spain
- Matthias B Schulze
- German Center for Diabetes Research, Munich-Neuherberg, Germany
- Department of Molecular Epidemiology, German Institute of Human Nutrition Potsdam-Rehbrücke, Nuthetal, Germany
- Germany Institute of Nutrition Science, University of Potsdam, Nuthetal, Germany
- Antonio Camargo
- Lipids and Atherosclerosis Unit, Department of Internal Medicine, Reina Sofia University Hospital, University of Cordoba, Cordoba, Spain.
- Department of Medical and Surgical Sciences, University of Cordoba, Cordoba, Spain.
- Instituto Maimonides de Investigación Biomédica de Córdoba (IMIBIC), Córdoba, Spain.
- CIBER Fisiopatología de la Obesidad y Nutrición (CIBEROBN), Instituto de Salud Carlos III, Madrid, Spain.
- Jose Lopez-Miranda
- Lipids and Atherosclerosis Unit, Department of Internal Medicine, Reina Sofia University Hospital, University of Cordoba, Cordoba, Spain.
- Department of Medical and Surgical Sciences, University of Cordoba, Cordoba, Spain.
- Instituto Maimonides de Investigación Biomédica de Córdoba (IMIBIC), Córdoba, Spain.
- CIBER Fisiopatología de la Obesidad y Nutrición (CIBEROBN), Instituto de Salud Carlos III, Madrid, Spain.
8
Liu J, Wu P, Lai S, Wang J, Hou H, Zhang Y. Prognostic models for upper urinary tract urothelial carcinoma patients after radical nephroureterectomy based on a novel systemic immune-inflammation score with machine learning. BMC Cancer 2023; 23:574. [PMID: 37349696 DOI: 10.1186/s12885-023-11058-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/19/2022] [Accepted: 06/11/2023] [Indexed: 06/24/2023] Open
Abstract
PURPOSE This study aimed to evaluate the clinical significance of a novel systemic immune-inflammation score (SIIS) for predicting oncological outcomes in upper urinary tract urothelial carcinoma (UTUC) after radical nephroureterectomy (RNU).
METHOD The clinical data of 483 patients with nonmetastatic UTUC who underwent surgery in our center were analyzed. Five inflammation-related biomarkers were screened in the Lasso-Cox model and then aggregated to generate the SIIS based on the regression coefficients. Overall survival (OS) was assessed using Kaplan-Meier analyses. Cox proportional hazards regression and a random survival forest model were adopted to build the prognostic model. Then we established an effective nomogram for UTUC after RNU based on SIIS. The discrimination and calibration of the nomogram were evaluated using the concordance index (C-index), area under the time-dependent receiver operating characteristic curve (time-dependent AUC), and calibration curves. Decision curve analysis (DCA) was used to assess the net benefits of the nomogram at different threshold probabilities.
RESULT When patients were split at the median SIIS value computed by the Lasso-Cox model, the high-risk group had worse OS (p<0.0001) than the low-risk group. Variables with a minimum depth greater than the depth threshold or with negative variable importance were excluded, and the remaining six variables were included in the model. The areas under the ROC curve (AUROC) of the Cox and random survival forest models were 0.801 and 0.872 for OS at five years, respectively. Multivariate Cox analysis showed that elevated SIIS was significantly associated with poorer OS (p<0.001). In terms of predicting overall survival, a nomogram that considered the SIIS and clinical prognostic factors performed better than the AJCC staging.
CONCLUSION The pretreatment level of SIIS was an independent predictor of prognosis in upper urinary tract urothelial carcinoma after RNU. Therefore, incorporating SIIS into currently available clinical parameters helps predict the long-term survival of UTUC.
Affiliation(s)
- Jianyong Liu
  - Department of Urology, Beijing Hospital, National Center of Gerontology, Institute of the Geriatric Medicine, Chinese Academy of Medical Sciences, No. 1 DaHua Road, Dong Dan, Beijing, China
  - Graduate School of Peking Union Medical College, Chinese Academy of Medical Sciences, Beijing, China
  - Beijing Hospital Continence Center, Beijing, China
- Pengjie Wu
  - Department of Urology, Beijing Hospital, National Center of Gerontology, Institute of the Geriatric Medicine, Chinese Academy of Medical Sciences, No. 1 DaHua Road, Dong Dan, Beijing, China
  - Graduate School of Peking Union Medical College, Chinese Academy of Medical Sciences, Beijing, China
  - Beijing Hospital Continence Center, Beijing, China
- Shicong Lai
  - Department of Urology, Peking University People's Hospital, 100044, Beijing, China
- Jianye Wang
  - Department of Urology, Beijing Hospital, National Center of Gerontology, Institute of the Geriatric Medicine, Chinese Academy of Medical Sciences, No. 1 DaHua Road, Dong Dan, Beijing, China
  - Graduate School of Peking Union Medical College, Chinese Academy of Medical Sciences, Beijing, China
  - Beijing Hospital Continence Center, Beijing, China
- Huimin Hou
  - Department of Urology, Beijing Hospital, National Center of Gerontology, Institute of the Geriatric Medicine, Chinese Academy of Medical Sciences, No. 1 DaHua Road, Dong Dan, Beijing, China
  - Graduate School of Peking Union Medical College, Chinese Academy of Medical Sciences, Beijing, China
  - Beijing Hospital Continence Center, Beijing, China
- Yaoguang Zhang
  - Department of Urology, Beijing Hospital, National Center of Gerontology, Institute of the Geriatric Medicine, Chinese Academy of Medical Sciences, No. 1 DaHua Road, Dong Dan, Beijing, China
  - Graduate School of Peking Union Medical College, Chinese Academy of Medical Sciences, Beijing, China
  - Beijing Hospital Continence Center, Beijing, China
9
Kim Y, Kim KH, Park J, Yoon HI, Sung W. Prognosis prediction for glioblastoma multiforme patients using machine learning approaches: Development of the clinically applicable model. Radiother Oncol 2023; 183:109617. [PMID: 36921767 DOI: 10.1016/j.radonc.2023.109617] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/20/2022] [Revised: 02/28/2023] [Accepted: 03/04/2023] [Indexed: 03/16/2023]
Abstract
BACKGROUND AND PURPOSE We aimed to develop a clinically applicable prognosis prediction model predicting overall survival (OS) and progression-free survival (PFS) for glioblastoma multiforme (GBM) patients. MATERIALS AND METHODS All 467 patients treated with concurrent chemoradiotherapy at Yonsei Cancer Center from 2016 to 2020 were included in this study. We developed a conventional linear regression model, Cox proportional hazards (COX), and two non-linear machine learning models, random survival forest (RSF) and survival support vector machine (SVM), based on 16 clinical variables. After backward feature selection and hyperparameter tuning using grid search, we repeated cross-validation 100 times to combat overfitting and enhance the model performance. Harrell's concordance index (C-index) and integrated Brier score (IBS) were employed as quantitative performance metrics. RESULTS In both predictions, RSF performed much better than COX and SVM (for OS prediction: RSF C-index = 0.72 90%CI [0.71-0.72] and IBS = 0.12 90%CI [0.10-0.13]; for PFS prediction: RSF C-index = 0.70 90%CI [0.70-0.71] and IBS = 0.12 90%CI [0.10-0.14]). Permutation feature importance confirmed that MGMT promoter methylation, extent of resection, age, cone down planning target volume, and subventricular zone involvement are significant prognostic factors for OS. The importance of the extent of resection and MGMT promoter methylation was much higher than that of the other selected input factors in PFS. Our final models accurately stratified two risk groups with root mean square errors less than 0.07. The sensitivity analysis revealed that our final models are highly applicable to newly diagnosed GBM patients. CONCLUSION Our final models can provide a reliable outcome prediction for individual GBM patients. The final OS and PFS predicting models we developed accurately stratify high-risk groups up to 5 years, and the sensitivity analysis confirmed that both final models are clinically applicable.
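As an illustration of the discrimination metric named in this abstract, Harrell's C-index can be sketched in a few lines. This is a minimal pure-Python sketch, not the authors' implementation: the function name `harrell_c_index` and the toy data are hypothetical, and pairs with tied times are simply treated as non-comparable.

```python
def harrell_c_index(times, events, risk_scores):
    """Fraction of comparable pairs in which the higher-risk subject fails first."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when subject i has an observed event
            # strictly before subject j's time (event or censoring).
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: follow-up time, event indicator (1 = event), predicted risk.
times = [5, 8, 12, 20]
events = [1, 0, 1, 0]
risk = [0.9, 0.4, 0.3, 0.5]
c = harrell_c_index(times, events, risk)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering; the quadratic loop is only suitable for small samples.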
Affiliation(s)
- Yeseul Kim
  - Department of Biomedical Engineering and of Biomedicine & Health Science, College of Medicine, The Catholic University of Korea, Seoul 137-70, South Korea
- Kyung Hwan Kim
  - Department of Radiation Oncology, Yonsei Cancer Center, Heavy Ion Therapy Research Institute, Yonsei University College of Medicine, Seoul, South Korea
- Junyoung Park
  - Department of Industrial and Systems Engineering, Korea Advanced Institute of Science and Technology, Daejeon, South Korea
- Hong In Yoon
  - Department of Radiation Oncology, Yonsei Cancer Center, Heavy Ion Therapy Research Institute, Yonsei University College of Medicine, Seoul, South Korea
- Wonmo Sung
  - Department of Biomedical Engineering and of Biomedicine & Health Science, College of Medicine, The Catholic University of Korea, Seoul 137-70, South Korea
10
Chen W, Zhou B, Jeon CY, Xie F, Lin YC, Butler RK, Zhou Y, Luong TQ, Lustigova E, Pisegna JR, Wu BU. Machine learning versus regression for prediction of sporadic pancreatic cancer. Pancreatology 2023:S1424-3903(23)00103-5. [PMID: 37130760 PMCID: PMC10406388 DOI: 10.1016/j.pan.2023.04.009] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 12/30/2022] [Revised: 04/10/2023] [Accepted: 04/23/2023] [Indexed: 05/04/2023]
Abstract
BACKGROUND/OBJECTIVES There is currently no widely accepted approach to identify patients at increased risk for sporadic pancreatic cancer (PC). We aimed to compare the performance of two machine-learning models with a regression-based model in predicting pancreatic ductal adenocarcinoma (PDAC), the most common form of PC. METHODS This retrospective cohort study consisted of patients 50-84 years of age enrolled in either Kaiser Permanente Southern California (KPSC; model training and internal validation) or the Veterans Affairs (VA; external testing) between 2008 and 2017. The performance of random survival forests (RSF) and eXtreme gradient boosting (XGB) models was compared to that of Cox proportional hazards regression (COX). Heterogeneity of the three models was assessed. RESULTS The KPSC and the VA cohorts consisted of 1.8 and 2.7 million patients with 1792 and 4582 incident PDAC cases within 18 months, respectively. Predictors selected into all three models included age, abdominal pain, weight change, and glycated hemoglobin (A1c). Additionally, RSF selected change in alanine transaminase (ALT), whereas XGB and COX selected the rate of change in ALT. The COX model appeared to have a lower AUC (KPSC: 73.7, 95% CI 71.0-76.4; VA: 70.6, 69.9-71.4) than RSF (KPSC: 76.7, 74.4-79.1; VA: 73.1, 72.4-73.9) and XGB (KPSC: 77.9, 75.5-80.2; VA: 74.2, 73.5-75.0). Among patients with top 5% predicted risk from all three models (N = 29,663), 117 developed PDAC, of which RSF, XGB and COX captured 84 (9 unique), 87 (4 unique), and 87 (19 unique) cases, respectively. CONCLUSIONS The three models complement each other, but each has unique contributions.
Affiliation(s)
- Wansu Chen
  - Kaiser Permanente Southern California Research and Evaluation, Pasadena, CA, USA
- Botao Zhou
  - Kaiser Permanente Southern California Research and Evaluation, Pasadena, CA, USA
- Fagen Xie
  - Kaiser Permanente Southern California Research and Evaluation, Pasadena, CA, USA
- Yu-Chen Lin
  - Cedars-Sinai Medical Center, Los Angeles, CA, USA
- Rebecca K Butler
  - Kaiser Permanente Southern California Research and Evaluation, Pasadena, CA, USA
- Yichen Zhou
  - Kaiser Permanente Southern California Research and Evaluation, Pasadena, CA, USA
- Tiffany Q Luong
  - Kaiser Permanente Southern California Research and Evaluation, Pasadena, CA, USA
- Eva Lustigova
  - Kaiser Permanente Southern California Research and Evaluation, Pasadena, CA, USA
- Joseph R Pisegna
  - Division of Gastroenterology and Hepatology, VA Greater Los Angeles Healthcare System, Los Angeles, CA, USA
  - Departments of Medicine and Human Genetics, David Geffen School of Medicine at UCLA, USA
- Bechien U Wu
  - Center for Pancreatic Care, Department of Gastroenterology, Los Angeles Medical Center, Southern California Permanente Medical Group, Los Angeles, CA, USA
11
Liu Y, Xue J, Jiang J. Application of machine learning algorithms in electronic medical records to predict amputation-free survival after first revascularization in patients with peripheral artery disease. Int J Cardiol 2023:S0167-5273(23)00594-6. [PMID: 37119943 DOI: 10.1016/j.ijcard.2023.04.040] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/19/2023] [Revised: 04/07/2023] [Accepted: 04/23/2023] [Indexed: 05/01/2023]
Abstract
BACKGROUND This study aimed to apply eight machine learning algorithms to develop the optimal model to predict amputation-free survival (AFS) after first revascularization in patients with peripheral artery disease (PAD). METHODS Among 2130 patients from 2011 to 2020, 1260 patients who underwent revascularization were randomly assigned to a training set and a validation set in an 8:2 ratio. Sixty-seven clinical parameters were analyzed by lasso regression analysis. Logistic regression, gradient boosting machine, random forest, decision tree, eXtreme gradient boosting, neural network, Cox regression, and random survival forest (RSF) were applied to develop prediction models. The optimal model was compared with the GermanVasc score in a testing set comprising patients from 2010. RESULTS The postoperative 1/3/5-year AFS rates were 90%, 79.4%, and 74.1%, respectively. Age (HR:1.035, 95%CI: 1.015-1.056), atrial fibrillation (HR:2.257, 95%CI: 1.193-4.271), cardiac ejection fraction (HR:0.064, 95%CI: 0.009-0.413), Rutherford grade ≥ 5 (HR:1.899, 95%CI: 1.296-2.782), creatinine (HR:1.03, 95%CI: 1.02-1.04), surgery duration (HR:1.03, 95%CI: 1.01-1.05), and fibrinogen (HR:1.292, 95%CI: 1.098-1.521) were independent risk factors. The optimal model was developed by the RSF algorithm, with 1/3/5-year AUCs in the training set of 0.866 (95% CI:0.819-0.912), 0.854 (95% CI:0.811-0.896), 0.844 (95% CI:0.793-0.894), in the validation set of 0.741 (95% CI:0.580-0.902), 0.768 (95% CI:0.654-0.882), 0.836 (95% CI:0.719-0.953), and in the testing set of 0.821 (95%CI: 0.711-0.931), 0.802 (95%CI: 0.684-0.919), 0.798 (95%CI: 0.657-0.939). The model's c-index exceeded that of the GermanVasc score (0.788 vs 0.730). A dynamic nomogram was published as a Shiny app (https://wyy2023.shinyapps.io/amputation/). CONCLUSION The optimal prediction model for AFS after first revascularization in patients with PAD was developed by the RSF algorithm, which exhibited outstanding prediction performance.
Affiliation(s)
- Yang Liu
  - Department of General surgery, Vascular Surgery, Qilu Hospital of Shandong University, No.107, Road Wen Hua Xi, Jinan, Shandong, China
- Junshuai Xue
  - Department of General surgery, Qilu Hospital of Shandong University, No.107, Road Wen Hua Xi, Jinan, Shandong, China
- Jianjun Jiang
  - Department of General surgery, Vascular Surgery, Qilu Hospital of Shandong University, No.107, Road Wen Hua Xi, Jinan, Shandong, China
12
Murakami M, Fujimori N, Nakata K, Nakamura M, Hashimoto S, Kurahara H, Nishihara K, Abe T, Hashigo S, Kugiyama N, Ozawa E, Okamoto K, Ishida Y, Okano K, Takaki R, Shimamatsu Y, Ito T, Miki M, Oza N, Yamaguchi D, Yamamoto H, Takedomi H, Kawabe K, Akashi T, Miyahara K, Ohuchida J, Ogura Y, Nakashima Y, Ueki T, Ishigami K, Umakoshi H, Ueda K, Oono T, Ogawa Y. Machine learning-based model for prediction and feature analysis of recurrence in pancreatic neuroendocrine tumors G1/G2. J Gastroenterol 2023; 58:586-597. [PMID: 37099152 DOI: 10.1007/s00535-023-01987-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/17/2023] [Accepted: 03/28/2023] [Indexed: 04/27/2023]
Abstract
BACKGROUND Pancreatic neuroendocrine neoplasms (PanNENs) are a heterogeneous group of tumors. Although the prognosis of resected PanNENs is generally considered to be good, a relatively high recurrence rate has been reported. Because PanNENs are rare and large-scale reports on their recurrence are scarce, we aimed to identify the predictors of recurrence in patients with resected PanNENs to improve prognosis. METHODS We established a multicenter database of 573 patients with PanNENs who underwent resection between January 1987 and July 2020 at 22 Japanese centers, mainly in the Kyushu region. We evaluated the clinical characteristics of 371 patients with localized non-functioning pancreatic neuroendocrine tumors (G1/G2). We also constructed a machine learning-based prediction model to analyze the important features that determine recurrence. RESULTS Fifty-two patients (14.0%) experienced recurrence during the follow-up period, with a median time to recurrence of 33.7 months. The random survival forest (RSF) model showed better predictive performance than the Cox proportional hazards regression model in terms of Harrell's C-index (0.841 vs. 0.820). The Ki-67 index, residual tumor, WHO grade, tumor size, and lymph node metastasis were the top five predictors in the RSF model; tumor size above 20 mm was the watershed with increased recurrence probability, whereas the 5-year disease-free survival rate decreased linearly as the Ki-67 index increased. CONCLUSIONS Our study revealed the characteristics of resected PanNENs in real-world clinical practice. Machine learning techniques can be powerful analytical tools that provide new insights into the relationship between the Ki-67 index or tumor size and recurrence.
Affiliation(s)
- Masatoshi Murakami
  - Department of Medicine and Bioregulatory Science, Graduate School of Medical Sciences, Kyushu University, 3-1-1 Maidashi, Higashi-Ku, Fukuoka, 812-8582, Japan
- Nao Fujimori
  - Department of Medicine and Bioregulatory Science, Graduate School of Medical Sciences, Kyushu University, 3-1-1 Maidashi, Higashi-Ku, Fukuoka, 812-8582, Japan
- Kohei Nakata
  - Department of Surgery and Oncology, Graduate School of Medical Sciences, Kyushu University, Fukuoka, Japan
- Masafumi Nakamura
  - Department of Surgery and Oncology, Graduate School of Medical Sciences, Kyushu University, Fukuoka, Japan
- Shinichi Hashimoto
  - Digestive and Lifestyle Diseases, Kagoshima University Graduate School of Medical and Dental Sciences, Kagoshima, Japan
- Hiroshi Kurahara
  - Department of Digestive Surgery, Breast and Thyroid Surgery, Kagoshima University Graduate School of Medical and Dental Sciences, Kagoshima, Japan
- Kazuyoshi Nishihara
  - Department of Surgery, Kitakyushu Municipal Medical Center, Kitakyushu, Japan
- Toshiya Abe
  - Department of Surgery, Kitakyushu Municipal Medical Center, Kitakyushu, Japan
- Shunpei Hashigo
  - Department of Gastroenterology and Hepatology, Graduate School of Medical Sciences, Kumamoto University, Kumamoto, Japan
- Naotaka Kugiyama
  - Department of Gastroenterology and Hepatology, Graduate School of Medical Sciences, Kumamoto University, Kumamoto, Japan
- Eisuke Ozawa
  - Department of Gastroenterology and Hepatology, Nagasaki University Graduate School of Biomedical Sciences, Nagasaki, Japan
- Kazuhisa Okamoto
  - Department of Gastroenterology, Faculty of Medicine, Oita University, Oita, Japan
- Yusuke Ishida
  - Department of Gastroenterology and Medicine, Faculty of Medicine, Fukuoka University, Fukuoka, Japan
- Keiichi Okano
  - Department of Gastroenterological Surgery, Faculty of Medicine, Kagawa University, Kita-gun, Japan
- Ryo Takaki
  - Department of Gastroenterology, Urasoe General Hospital, Urasoe, Japan
- Yutaka Shimamatsu
  - Division of Gastroenterology, Department of Medicine, Kurume University School of Medicine, Kurume, Japan
- Tetsuhide Ito
  - Neuroendocrine Tumor Centre, Fukuoka Sanno Hospital, Fukuoka, Japan
  - Department of Gastroenterology, Graduate School of Medical Sciences, International University of Health and Welfare, Fukuoka, Japan
- Masami Miki
  - Department of Gastroenterology, National Hospital Organization Kyushu Cancer Center, Fukuoka, Japan
- Noriko Oza
  - Department of Hepato-Biliary-Pancreatology, Saga-Ken Medical Centre Koseikan, Saga, Japan
- Daisuke Yamaguchi
  - Department of Gastroenterology, National Hospital Organization Ureshino Medical Center, Ureshino, Japan
- Hironobu Takedomi
  - Division of Gastroenterology, Department of Internal Medicine, Faculty of Medicine, Saga University, Saga, Japan
- Ken Kawabe
  - Department of Gastroenterology, National Hospital Organization Kyushu Medical Center, Fukuoka, Japan
- Tetsuro Akashi
  - Department of Internal Medicine, Saiseikai Fukuoka General Hospital, Fukuoka, Japan
- Koichi Miyahara
  - Department of Internal Medicine, Karatsu Red Cross Hospital, Karatsu, Japan
- Jiro Ohuchida
  - Department of Surgery, Miyazaki Prefectural Miyazaki Hospital, Miyazaki, Japan
- Yasuhiro Ogura
  - Department of Surgery, Fukuoka Red Cross Hospital, Fukuoka, Japan
- Yohei Nakashima
  - Department of Surgery, Japan Community Health Care Organization, Kyushu Hospital, Kitakyushu, Japan
- Toshiharu Ueki
  - Department of Gastroenterology, Fukuoka University Chikushi Hospital, Chikushino, Japan
- Kousei Ishigami
  - Department of Clinical Radiology, Graduate School of Medical Sciences, Kyushu University, Fukuoka, Japan
- Hironobu Umakoshi
  - Department of Medicine and Bioregulatory Science, Graduate School of Medical Sciences, Kyushu University, 3-1-1 Maidashi, Higashi-Ku, Fukuoka, 812-8582, Japan
- Keijiro Ueda
  - Department of Medicine and Bioregulatory Science, Graduate School of Medical Sciences, Kyushu University, 3-1-1 Maidashi, Higashi-Ku, Fukuoka, 812-8582, Japan
- Takamasa Oono
  - Department of Medicine and Bioregulatory Science, Graduate School of Medical Sciences, Kyushu University, 3-1-1 Maidashi, Higashi-Ku, Fukuoka, 812-8582, Japan
- Yoshihiro Ogawa
  - Department of Medicine and Bioregulatory Science, Graduate School of Medical Sciences, Kyushu University, 3-1-1 Maidashi, Higashi-Ku, Fukuoka, 812-8582, Japan
13
Hsu JC, Yang YY, Chuang SL, Lin LY, Chen THH. Prediabetes as a risk factor for new-onset atrial fibrillation: the propensity-score matching cohort analyzed using the Cox regression model coupled with the random survival forest. Cardiovasc Diabetol 2023; 22:35. [PMID: 36804876 PMCID: PMC9940357 DOI: 10.1186/s12933-023-01767-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/03/2022] [Accepted: 02/06/2023] [Indexed: 02/22/2023]
Abstract
BACKGROUND The glycemic continuum often indicates a gradual decline in insulin sensitivity leading to an increase in glucose levels. Although prediabetes is an established risk factor for both macrovascular and microvascular diseases, whether prediabetes is independently associated with the risk of developing atrial fibrillation (AF), particularly the occurrence time, has not been well studied using a high-quality research design in combination with statistical machine-learning algorithms. METHODS Using data from electronic medical records collected at the National Taiwan University Hospital, a tertiary medical center in Taiwan, we conducted a retrospective cohort study consisting of 174,835 adult patients between 2014 and 2019 to investigate the relationship between prediabetes and AF. To make patients with prediabetes comparable to those with a normal glucose test, a propensity-score matching design was used to select matched pairs from the two groups in a 1:1 ratio. The Kaplan-Meier method was used to compare the cumulative risk of AF between prediabetes and normal glucose test using the log-rank test. The multivariable Cox regression model was employed to estimate the adjusted hazard ratio (HR) for prediabetes versus normal glucose test, stratified by three levels of glycosylated hemoglobin (HbA1c). The random survival forest (RSF) machine-learning method was further used to identify the importance of clinical factors associated with AF in patients with prediabetes. RESULTS A sample of 14,309 pairs of patients with prediabetes and normal glucose test results was selected. The incidence of AF was 11.6 cases per 1000 person-years during a median follow-up period of 47.1 months. The Kaplan-Meier analysis revealed that the risk of AF was significantly higher in patients with prediabetes (log-rank p < 0.001). The multivariable Cox regression model indicated that prediabetes was independently associated with a significantly increased risk of AF (HR 1.24, 95% confidence interval 1.11-1.39, p < 0.001), particularly for patients with HbA1c above 5.5%. The RSF method identified elevated N-terminal natriuretic peptide and altered left heart structure as the two most important risk factors for AF among patients with prediabetes. CONCLUSIONS Our study found that prediabetes is independently associated with a higher risk of AF. Furthermore, alterations in left heart structure make a significant contribution to this elevated risk, and these structural changes may begin during the prediabetes stage.
Affiliation(s)
- Jung-Chi Hsu
  - Division of Cardiology, Department of Internal Medicine, Fu Jen Catholic University Hospital, Fu Jen Catholic University, New Taipei City, Taiwan
  - Division of Cardiology, Department of Internal Medicine, National Taiwan University College of Medicine and Hospital, No.7, Chung-Chan South Road, Taipei, 100, Taiwan
- Yen-Yun Yang
  - Department of Medical Research, National Taiwan University Hospital, Taipei, Taiwan
- Shu-Lin Chuang
  - Department of Medical Research, National Taiwan University Hospital, Taipei, Taiwan
- Lian-Yu Lin
  - Division of Cardiology, Department of Internal Medicine, National Taiwan University College of Medicine and Hospital, No.7, Chung-Chan South Road, Taipei, 100, Taiwan
  - Department of Internal Medicine, College of Medicine, National Taiwan University, Taipei, Taiwan
- Tony Hsiu-Hsi Chen
  - Institute of Epidemiology and Preventive Medicine, College of Public Health, National Taiwan University, Taipei, Taiwan
14
Abstract
Background Prediction models for time-to-event outcomes are commonly used in biomedical research to obtain subject-specific probabilities that aid in making important clinical care decisions. There are several regression and machine learning methods for building these models that have been designed or modified to account for the censoring that occurs in time-to-event data. Discrete-time survival models, which have often been overlooked in the literature, provide an alternative approach for predictive modeling in the presence of censoring with limited loss in predictive accuracy. These models can take advantage of the range of nonparametric machine learning classification algorithms and their available software to predict survival outcomes. Methods Discrete-time survival models are applied to a person-period data set to predict the hazard of experiencing the failure event in pre-specified time intervals. This framework allows for any binary classification method to be applied to predict these conditional survival probabilities. Using time-dependent performance metrics that account for censoring, we compare the predictions from parametric and machine learning classification approaches applied within the discrete time-to-event framework to those from continuous-time survival prediction models. We outline the process for training and validating discrete-time prediction models, and demonstrate its application using the open-source R statistical programming environment. Results Using publicly available data sets, we show that some discrete-time prediction models achieve better prediction performance than the continuous-time Cox proportional hazards model. Random survival forests, a machine learning algorithm adapted to survival data, also had improved performance compared to the Cox model, but was sometimes outperformed by the discrete-time approaches. In comparing the binary classification methods in the discrete time-to-event framework, the relative performance of the different methods varied depending on the data set. Conclusions We present a guide for developing survival prediction models using discrete-time methods and assessing their predictive performance with the aim of encouraging their use in medical research settings. These methods can be applied to data sets that have continuous time-to-event outcomes and multiple clinical predictors. They can also be extended to accommodate new binary classification algorithms as they become available. We provide R code for fitting discrete-time survival prediction models in a github repository. Supplementary Information The online version contains supplementary material available at 10.1186/s12874-022-01679-6.
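The person-period construction described in this abstract can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the authors' R code: the helper names `to_person_period` and `survival_from_hazards` are hypothetical, intervals are assumed to be of equal width, and a censored subject is assumed to contribute a row (with label 0) for the interval in which censoring occurs, which is only one of several conventions.

```python
import math


def to_person_period(subjects, interval_width):
    """Expand (id, time, event) records into one row per subject per interval."""
    rows = []
    for subject_id, time, event in subjects:
        # Number of intervals the subject entered (at least one).
        n_intervals = max(1, math.ceil(time / interval_width))
        for k in range(1, n_intervals + 1):
            # The binary label is 1 only in the last interval of a subject
            # whose event was observed; all other rows are labelled 0.
            is_last = k == n_intervals
            rows.append({"id": subject_id, "interval": k,
                         "y": int(bool(event) and is_last)})
    return rows


def survival_from_hazards(hazards):
    """Turn per-interval hazards h_1..h_K into S(t_k) = prod_{j<=k} (1 - h_j)."""
    survival, curve = 1.0, []
    for h in hazards:
        survival *= 1.0 - h
        curve.append(survival)
    return curve


# Hypothetical toy records: (id, follow-up time, event indicator).
rows = to_person_period([("a", 2.5, 1), ("b", 1.0, 0)], interval_width=1.0)
curve = survival_from_hazards([0.1, 0.2])
```

Any binary classifier can then be fitted to the `y` column using the interval index and covariates as features, and its predicted per-interval hazards passed to `survival_from_hazards` to recover a survival curve.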
Affiliation(s)
- Krithika Suresh
  - Department of Biostatistics and Informatics, Colorado School of Public Health, Aurora, USA
- Cameron Severn
  - Child Health Biostatistics Core Department of Pediatrics, Section of Endocrinology, School of Medicine, University of Colorado Anschutz Medical Campus, Aurora, USA
- Debashis Ghosh
  - Department of Biostatistics and Informatics, Colorado School of Public Health, Aurora, USA
15
Cuthbert AR, Giles LC, Glonek G, Kalisch Ellett LM, Pratt NL. A comparison of survival models for prediction of eight-year revision risk following total knee and hip arthroplasty. BMC Med Res Methodol 2022; 22:164. [PMID: 35668349 PMCID: PMC9172144 DOI: 10.1186/s12874-022-01644-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/04/2021] [Accepted: 05/18/2022] [Indexed: 11/25/2022]
Abstract
Background There is increasing interest in the development and use of clinical prediction models, but a lack of evidence-supported guidance on the merits of different modelling approaches. This is especially true for time-to-event outcomes, where few studies have compared the vast number of modelling approaches available. This study compares prediction accuracy and variable importance measures for four modelling approaches in prediction of time-to-revision surgery following total knee arthroplasty (TKA) and total hip arthroplasty (THA). Methods The study included 321,945 TKA and 151,113 THA procedures performed between 1 January 2003 and 31 December 2017. Accuracy of the Cox model, Weibull parametric model, flexible parametric model, and random survival forest were compared, with patient age, sex, comorbidities, and prosthesis characteristics considered as predictors. Prediction accuracy was assessed using the Index of Prediction Accuracy (IPA), c-index, and smoothed calibration curves. Variable importance rankings from the Cox model and random survival forest were also compared. Results Overall, the Cox and flexible parametric survival models performed best for prediction of both TKA revision (integrated IPA 0.056 (95% CI [0.054, 0.057]) compared to 0.054 (95% CI [0.053, 0.056]) for the Weibull parametric model) and THA revision (0.029 (95% CI [0.027, 0.030]) compared to 0.027 (95% CI [0.025, 0.028]) for the random survival forest). The c-index showed broadly similar discrimination between all modelling approaches. Models were generally well calibrated, but random survival forest underfitted the predicted risk of TKA revision compared to regression approaches. The most important predictors of revision were similar in the Cox model and random survival forest for TKA (age, opioid use, and patella resurfacing) and THA (femoral cement, depression, and opioid use). Conclusion The Cox and flexible parametric models had superior overall performance, although all approaches performed similarly. Notably, this study showed no benefit of a tuned random survival forest over regression models in this setting. Supplementary Information The online version contains supplementary material available at 10.1186/s12874-022-01644-3.
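The Index of Prediction Accuracy (IPA) named in this abstract is commonly defined as one minus the ratio of the model's Brier score to that of a null model that predicts the overall event rate for everyone. The sketch below is a simplified illustration at a single time horizon and assumes every subject's status at that horizon is known (no censoring before it); it does not reproduce the censoring-adjusted, integrated version reported in the study. The function names and toy data are hypothetical.

```python
def brier_score(outcomes, predicted_risks):
    """Mean squared difference between predicted risk and the 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(outcomes, predicted_risks)) / len(outcomes)


def index_of_prediction_accuracy(outcomes, predicted_risks):
    """IPA = 1 - Brier(model) / Brier(null), with the null predicting the event rate."""
    prevalence = sum(outcomes) / len(outcomes)
    null_score = brier_score(outcomes, [prevalence] * len(outcomes))
    if null_score == 0:
        raise ValueError("IPA is undefined when all outcomes are identical")
    return 1.0 - brier_score(outcomes, predicted_risks) / null_score


# Hypothetical toy data: event status by the horizon and predicted risks.
outcomes = [1, 0, 0, 0]
predicted = [0.5, 0.25, 0.25, 0.0]
ipa = index_of_prediction_accuracy(outcomes, predicted)
```

An IPA of 0 means the model is no better than the null model, negative values mean it is worse, and values closer to 1 indicate better overall accuracy, which helps explain why IPA values for rare outcomes such as revision surgery can be small.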
Affiliation(s)
- Alana R Cuthbert
  - Quality Use of Medicines and Pharmacy Research Centre, Clinical and Health Sciences, University of South Australia, PO Box 11060, Adelaide, SA, 5001, Australia
  - South Australian Health and Medical Research Institute, Adelaide, SA, 5000, Australia
- Lynne C Giles
  - School of Public Health, The University of Adelaide, Adelaide, SA, 5005, Australia
- Gary Glonek
  - School of Mathematical Sciences, The University of Adelaide, Adelaide, SA, 5005, Australia
- Lisa M Kalisch Ellett
  - Quality Use of Medicines and Pharmacy Research Centre, Clinical and Health Sciences, University of South Australia, PO Box 11060, Adelaide, SA, 5001, Australia
- Nicole L Pratt
  - Quality Use of Medicines and Pharmacy Research Centre, Clinical and Health Sciences, University of South Australia, PO Box 11060, Adelaide, SA, 5001, Australia
16
Zhang L, Huang T, Xu F, Li S, Zheng S, Lyu J, Yin H. Prediction of prognosis in elderly patients with sepsis based on machine learning (random survival forest). BMC Emerg Med 2022; 22:26. [PMID: 35148680 PMCID: PMC8832779 DOI: 10.1186/s12873-022-00582-z] [Citation(s) in RCA: 21] [Impact Index Per Article: 10.5] [Received: 05/23/2021] [Accepted: 02/02/2022] [Indexed: 12/05/2022]
Abstract
Background Elderly patients with sepsis have many comorbidities, and the clinical reaction is not obvious. Thus, clinical treatment is difficult. We planned to use the laboratory test results and comorbidities of elderly patients with sepsis from a large-scale public database, Medical Information Mart for Intensive Care (MIMIC) IV, to build a random survival forest (RSF) model and to evaluate the model's predictive value for these patients. Methods Clinical information of elderly patients with sepsis in the MIMIC IV database was collected retrospectively. Machine learning (RSF) was used to select the top 30 variables in the training cohort to build the final RSF model. The model was compared with the traditional scoring systems SOFA, SAPSII, and APSIII. The performance of the model was evaluated by C-index and calibration curve. Results A total of 6,503 patients were enrolled in the study. The top 30 important variables screened by RSF were used to construct the final RSF model. The new model provided a better C-index (0.731 in the validation cohort). The calibration curve described the agreement between the predicted probability of the RSF model and the observed 30-day survival. Conclusions We constructed a prognostic model to predict 30-day mortality risk in elderly patients with sepsis based on machine learning (RSF algorithm), and it proved superior to the traditional scoring systems. The risk factors affecting the patients were also ranked. In addition to the common risk factors of vasopressors, ventilator use, and urine output, newly added factors such as RDW, type of ICU unit, malignant cancer, and metastatic solid tumor also significantly influence prognosis. Supplementary Information The online version contains supplementary material available at 10.1186/s12873-022-00582-z.
Affiliation(s)
- Luming Zhang
  - Intensive Care Unit, The First Affiliated Hospital of Jinan University, Guangzhou, 510630, People's Republic of China; Department of Clinical Research, The First Affiliated Hospital of Jinan University, Guangzhou, Guangdong Province, China
- Tao Huang
  - Intensive Care Unit, The First Affiliated Hospital of Jinan University, Guangzhou, 510630, People's Republic of China
- Fengshuo Xu
  - Department of Clinical Research, The First Affiliated Hospital of Jinan University, Guangzhou, Guangdong Province, China; School of Public Health, Xi'an Jiaotong University Health Science Center, Xi'an, Shaanxi Province, China
- Shaojin Li
  - Department of Orthopaedics, The First Affiliated Hospital of Jinan University, Guangzhou, Guangdong Province, China
- Shuai Zheng
  - Department of Clinical Research, The First Affiliated Hospital of Jinan University, Guangzhou, Guangdong Province, China; School of Public Health, Shaanxi University of Chinese Medicine, Xianyang, Shaanxi Province, China
- Jun Lyu
  - Department of Clinical Research, The First Affiliated Hospital of Jinan University, Guangzhou, Guangdong Province, China
- Haiyan Yin
  - Intensive Care Unit, The First Affiliated Hospital of Jinan University, Guangzhou, 510630, People's Republic of China
|
17
|
Bohannan ZS, Coffman F, Mitrofanova A. Random survival forest model identifies novel biomarkers of event-free survival in high-risk pediatric acute lymphoblastic leukemia. Comput Struct Biotechnol J 2022; 20:583-597. [PMID: 35116134 PMCID: PMC8777142 DOI: 10.1016/j.csbj.2022.01.003] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 07/17/2021] [Revised: 12/30/2021] [Accepted: 01/01/2022] [Indexed: 12/16/2022]
Abstract
High-risk pediatric B-ALL patients experience 5-year negative event rates up to 25%. Although some biomarkers of relapse are utilized in the clinic, their ability to predict outcomes in high-risk patients is limited. Here, we propose a random survival forest (RSF) machine learning model utilizing interpretable genomic inputs to predict relapse/death in high-risk pediatric B-ALL patients. We utilized whole exome sequencing profiles from 156 patients in the TARGET-ALL study (with samples collected at presentation), further stratified into training and test cohorts (109 and 47 patients, respectively). To avoid overfitting and facilitate the interpretation of machine learning results, input genomic variables were engineered using a stepwise approach involving univariable Cox models to select variables directly associated with outcomes, genomic coordinate-based analysis to select mutational hotspots, and correlation analysis to eliminate feature co-linearity. Model training identified 7 genomic regions most predictive of relapse/death-free survival. The test cohort error rate was 12.47%, and a polygenic score based on the sum of the top 7 variables effectively stratified patients into two groups, with significant differences in time to relapse/death (log-rank P = 0.001, hazard ratio = 5.41). Our model outperformed other event-free survival (EFS) modeling approaches, including an RSF using gold-standard prognostic variables (error rate = 24.35%). Validation in 174 standard-risk patients and 3 patients who failed to respond to induction therapy confirmed that our RSF model and polygenic score were specific to high-risk disease. We propose that our feature selection/engineering approach can increase the clinical interpretability of RSF, and that our polygenic score could be utilized to enhance clinical decision-making in high-risk B-ALL.
Affiliation(s)
- Zachary S. Bohannan
  - Rutgers, The State University of New Jersey, School of Health Professions, Department of Health Informatics, 65 Bergen Street, Suite 120, Newark, NJ 07107-1709, United States
- Frederick Coffman
  - Rutgers, The State University of New Jersey, School of Health Professions, Department of Health Informatics, 65 Bergen Street, Suite 120, Newark, NJ 07107-1709, United States
- Antonina Mitrofanova
  - Rutgers, The State University of New Jersey, School of Health Professions, Department of Health Informatics, 65 Bergen Street, Suite 120, Newark, NJ 07107-1709, United States
|
18
|
Aizawa T. Inequality of opportunity in infant mortality in South Asia: A decomposition analysis of survival data. Econ Hum Biol 2021; 43:101058. [PMID: 34509789 DOI: 10.1016/j.ehb.2021.101058] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/16/2021] [Revised: 08/02/2021] [Accepted: 08/24/2021] [Indexed: 06/13/2023]
Abstract
Early-life environments into which newborn babies are born play principal roles in their development. This study explores inequalities in infant mortality that are rooted in household and parental socio-economic backgrounds in five South-Asian countries: Afghanistan, Bangladesh, India, Nepal and Pakistan. Considering multidimensional aspects of socio-demographic and socio-economic status, this study explores disparities in the trajectory of survival rates across infants with dissimilar circumstantial backgrounds over the first 12 months of their lives. This study proposes a new method to first cluster the data into advantaged and disadvantaged types and explore the differences in survival rates by a clustering approach and a random survival forest. Furthermore, this study extends a Shapley-value decomposition method to explore the determinants of inequality. The results indicate that demographic factors, parental educational background and household living standards are major factors contributing to inequality. In order to ameliorate the inequality of opportunity, priority should be given to protecting marginalised infants by compensating for their disadvantaged backgrounds.
Affiliation(s)
- Toshiaki Aizawa
  - Waseda University, Waseda Institute for Advanced Study (WIAS), Nishi-Waseda Bldg., 1-21-1 Nishi Waseda, Shinjuku-ku, Tokyo 169-0051, Japan
|
19
|
Grendas LN, Chiapella L, Rodante DE, Daray FM. Comparison of traditional model-based statistical methods with machine learning for the prediction of suicide behaviour. J Psychiatr Res 2021; 145:85-91. [PMID: 34883411 DOI: 10.1016/j.jpsychires.2021.11.029] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 06/08/2021] [Revised: 10/23/2021] [Accepted: 11/20/2021] [Indexed: 01/18/2023]
Abstract
BACKGROUND Despite considerable research efforts during the last five decades, the prediction of suicidal behaviour (SB) using traditional model-based statistical methods has been weak. This marks the need to explore new statistical methods. OBJECTIVE To compare the performance of Cox regression models versus Random Survival Forest (RSF) in predicting SB. METHODS Using a data set of more than 300 high-risk suicidal patients from a multicenter prospective cohort study, we compared Cox regression models with RSF to identify predictors of time to suicide reattempt. Cross-validation was used to assess model prediction performance, including the area under the receiver operating characteristic curve (AUC), precision, Integrated Brier Score (IBS), sensitivity, and specificity. RESULTS A variant of the RSF denominated the RSFElimin, in which irrelevant predictor variables were eliminated from the model, presented the best accuracy, sensitivity, AUC and IBS. At the same time, the sensitivity of this method was slightly lower than that obtained with the Cox regression model with all predictor variables (CoxComp). CONCLUSION The RSF, a machine learning model, seems more sensitive and precise than the traditional Cox regression model in predicting suicidal behaviour.
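The Integrated Brier Score named among the performance measures above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes every event time is observed (real survival data needs inverse-probability-of-censoring weights), and the function names, the `predict_survival` callable, and the toy data are all hypothetical.

```python
def brier_score(event_times, predicted_survival, t):
    """Brier score at time t.

    Simplifying assumption: every subject's event time is observed
    (no censoring), so no censoring weights are applied.
    """
    total = 0.0
    for time_i, s_i in zip(event_times, predicted_survival):
        # Observed status at t: 1 if still event-free, else 0.
        still_event_free = 1.0 if time_i > t else 0.0
        total += (still_event_free - s_i) ** 2
    return total / len(event_times)


def integrated_brier_score(event_times, predict_survival, grid):
    """Trapezoidal average of the Brier score over an increasing time grid.

    predict_survival(i, t) is a hypothetical callable returning the
    predicted survival probability S(t) for subject i.
    """
    n = len(event_times)
    scores = [
        brier_score(event_times, [predict_survival(i, t) for i in range(n)], t)
        for t in grid
    ]
    area = sum(
        (grid[k + 1] - grid[k]) * (scores[k] + scores[k + 1]) / 2.0
        for k in range(len(grid) - 1)
    )
    return area / (grid[-1] - grid[0])


# Toy data: an uninformative model that always predicts S(t) = 0.5.
event_times = [2.0, 4.0, 6.0, 8.0]
grid = [1.0, 3.0, 5.0, 7.0]
ibs = integrated_brier_score(event_times, lambda i, t: 0.5, grid)
```

An uninformative constant prediction of 0.5 scores 0.25 at every time point, which is why an IBS well below 0.25 is usually read as evidence of a useful model; lower is better.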
Affiliation(s)
- Leandro Nicolás Grendas
  - University of Buenos Aires, School of Medicine, Institute of Pharmacology, Argentina; Teodoro Alvarez Hospital, Buenos Aires, Argentina
- Luciana Chiapella
  - National University of Rosario, School of Biochemical and Pharmaceutical Sciences, Argentina; National Scientific and Technical Research Council (CONICET), Argentina
- Demian Emanuel Rodante
  - University of Buenos Aires, School of Medicine, Institute of Pharmacology, Argentina; Braulio A. Moyano Neuropsychiatric Hospital, Buenos Aires, Argentina
- Federico Manuel Daray
  - University of Buenos Aires, School of Medicine, Institute of Pharmacology, Argentina; National Scientific and Technical Research Council (CONICET), Argentina
|
20
|
Li Y, Chen M, Lv H, Yin P, Zhang L, Tang P. A novel machine-learning algorithm for predicting mortality risk after hip fracture surgery. Injury 2021; 52:1487-1493. [PMID: 33386157 DOI: 10.1016/j.injury.2020.12.008] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 06/09/2020] [Revised: 11/04/2020] [Accepted: 12/13/2020] [Indexed: 02/02/2023]
Abstract
INTRODUCTION Although several risk stratification models have been developed to predict hip fracture mortality, work in this area continues. Our aim is to (1) construct a risk prediction model for long-term mortality after hip fracture using the RSF method and (2) evaluate how the effects of individual pre- and post-treatment variables on mortality prediction change over time. METHODS A total of 1330 hip fracture surgical patients were included. Forty-five admission and in-hospital variables were analyzed as potential predictors of all-cause mortality. A random survival forest (RSF) algorithm was applied to identify predictors. Cox regression models were then constructed. Sensitivity analyses and internal validation were performed to assess the performance of each model. C statistics were calculated and model calibration was further assessed. RESULTS Our machine-learning RSF algorithm achieved a c statistic of 0.83 for 30-day prediction and 0.75 for 1-year mortality. Additionally, a Cox model constructed using the variables selected by RSF achieved c statistics of 0.75 and 0.72 for 2-year and 4-year mortality prediction, respectively. The presence of post-operative complications remained the strongest risk factor for both short- and long-term mortality. Variables including fracture location, high serum creatinine, age, hypertension, anemia, ASA, hypoproteinemia, abnormal BUN, and RDW became more important as the length of follow-up increased. CONCLUSION The RSF machine-learning algorithm represents a novel approach to identifying important risk factors, and a risk stratification model for patients undergoing hip fracture surgery was built with this approach to identify those at high risk of long-term mortality.
Affiliation(s)
- Yi Li
  - Department of Orthopedics, Chinese PLA General Hospital, Beijing 100853, China; National Clinical Research Center for Orthopedics, Sports Medicine & Rehabilitation, Beijing 100853, China
- Ming Chen
  - Department of Orthopedics, Chinese PLA General Hospital, Beijing 100853, China; National Clinical Research Center for Orthopedics, Sports Medicine & Rehabilitation, Beijing 100853, China
- Houchen Lv
  - Department of Orthopedics, Chinese PLA General Hospital, Beijing 100853, China; National Clinical Research Center for Orthopedics, Sports Medicine & Rehabilitation, Beijing 100853, China
- Pengbin Yin
  - Department of Orthopedics, Chinese PLA General Hospital, Beijing 100853, China; National Clinical Research Center for Orthopedics, Sports Medicine & Rehabilitation, Beijing 100853, China
- Licheng Zhang
  - Department of Orthopedics, Chinese PLA General Hospital, Beijing 100853, China; National Clinical Research Center for Orthopedics, Sports Medicine & Rehabilitation, Beijing 100853, China
- Peifu Tang
  - Department of Orthopedics, Chinese PLA General Hospital, Beijing 100853, China; National Clinical Research Center for Orthopedics, Sports Medicine & Rehabilitation, Beijing 100853, China
|
21
|
Abstract
Low- and middle-income countries in Asia have seen substantial improvements in infant mortality over the last three decades. This study examines the factors contributing to the improvement in infant survival in their first year in six Asian countries: Bangladesh, India, Indonesia, Nepal, Pakistan, and the Philippines. I decompose the overall improvement in the infant survival rate in the respective countries from the 1990s to the 2010s into the part that can be explained by the improvements in circumstantial environments in which infants develop and the remaining part that is due to the structural change in the hazard functions. This decomposition is achieved by employing the random survival forest, allowing me to predict the counterfactual infant survival probability that infants in the 2010s would have under the circumstantial environments of the 1990s. The results show that large parts of the improvement are explained by the improvement in the environments in all the countries being analyzed. I find that the reduction in family size, increased use of antenatal care, longer pregnancy periods, and improved living standards were associated with the improvement of the infant mortality rate in all six countries.
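The decomposition described above can be written out as a short identity. The notation is an assumption introduced here for illustration, not taken from the article: S_{90} and S_{10} denote the survival functions fitted (for example by a random survival forest) on 1990s and 2010s data, and F_{90} and F_{10} denote the distributions of circumstances x in each period.

```latex
\begin{aligned}
\Delta(t)
  &= \mathbb{E}_{x \sim F_{10}}\bigl[S_{10}(t \mid x)\bigr]
   - \mathbb{E}_{x \sim F_{90}}\bigl[S_{90}(t \mid x)\bigr] \\
  &= \underbrace{\mathbb{E}_{x \sim F_{10}}\bigl[S_{10}(t \mid x)\bigr]
   - \mathbb{E}_{x \sim F_{90}}\bigl[S_{10}(t \mid x)\bigr]}_{\text{explained by improved circumstances}}
   + \underbrace{\mathbb{E}_{x \sim F_{90}}\bigl[S_{10}(t \mid x)\bigr]
   - \mathbb{E}_{x \sim F_{90}}\bigl[S_{90}(t \mid x)\bigr]}_{\text{structural change in hazards}}
\end{aligned}
```

The middle term, the 2010s survival function averaged over 1990s circumstances, plays the role of the counterfactual survival probability mentioned in the abstract; adding and subtracting it splits the total change into the two parts.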
Affiliation(s)
- Toshiaki Aizawa
  - Waseda Institute for Advanced Study, Waseda University, Tokyo, Japan
|
22
|
Chen Z, Xu HM, Li ZX, Zhang Y, Zhou T, You WC, Pan KF, Li WQ. [Random survival forest: applying machine learning algorithm in survival analysis of biomedical data]. Zhonghua Yu Fang Yi Xue Za Zhi 2021; 55:104-9. [PMID: 33455140 DOI: 10.3760/cma.j.cn112150-20200911-01197] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/05/2022]
Abstract
Traditional survival methods are widely applied in biomedical research. However, they require the data to meet a set of special assumptions, whereas the Random Survival Forest model can overcome this limitation. Herein, we used the clinical data on Primary Biliary Cholangitis (PBC) from the Mayo Clinic to introduce and demonstrate the Random Survival Forest model, covering its mathematical principles, model building, a practical example, and points requiring attention, with the aim of providing a novel method for survival analysis.
|
23
|
Farhadian M, Dehdar Karsidani S, Mozayanimonfared A, Mahjub H. Risk factors associated with major adverse cardiac and cerebrovascular events following percutaneous coronary intervention: a 10-year follow-up comparing random survival forest and Cox proportional-hazards model. BMC Cardiovasc Disord 2021; 21:38. [PMID: 33461487 PMCID: PMC7814642 DOI: 10.1186/s12872-020-01834-1] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 06/10/2020] [Accepted: 12/22/2020] [Indexed: 11/16/2022]
Abstract
Background Due to the limited number of studies with long-term follow-up of patients undergoing Percutaneous Coronary Intervention (PCI), we investigated the occurrence of Major Adverse Cardiac and Cerebrovascular Events (MACCE) during 10 years of follow-up after coronary angioplasty using Random Survival Forest (RSF) and Cox proportional hazards models. Methods The current retrospective cohort study was performed on 220 patients (69 women and 151 men) undergoing coronary angioplasty from March 2009 to March 2012 in Farchshian Medical Center in Hamadan city, Iran. Survival time (in months), the response variable, was measured from the date of angioplasty to the main endpoint or the end of the follow-up period (September 2019). To identify the factors influencing the occurrence of MACCE, the performance of the Cox and RSF models was investigated in terms of C index, Integrated Brier Score (IBS) and prediction error criteria. Results Ninety-six patients (43.7%) experienced MACCE by the end of the follow-up period, and the median survival time was estimated to be 98 months. Survival decreased from 99% during the first year to 39% at 10 years' follow-up. By applying the Cox model, the predictors were identified as follows: age (HR = 1.03, 95% CI 1.01–1.05), diabetes (HR = 2.17, 95% CI 1.29–3.66), smoking (HR = 2.41, 95% CI 1.46–3.98), and stent length (HR = 1.74, 95% CI 1.11–2.75). The predictive performance of the RSF model was slightly better (RSF vs. Cox: IBS of 0.124 vs. 0.135, C index of 0.648 vs. 0.626, and out-of-bag error rate of 0.352 vs. 0.374). In addition to age, diabetes, smoking, and stent length, RSF also included coronary artery disease (acute or chronic) and hyperlipidemia as the most important variables. Conclusion Machine-learning prediction models such as RSF showed better performance than the Cox proportional hazards model for the prediction of MACCE during long-term follow-up after PCI.
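The C index used to compare the two models can be sketched in a few lines. This is a minimal illustration of Harrell's concordance index for right-censored data, not the code used in the study; the function name and toy data are hypothetical, and pairs with tied follow-up times are simply skipped. The error rates quoted above are consistent with error rate = 1 - C index (1 - 0.648 = 0.352 and 1 - 0.626 = 0.374).

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored data.

    times:       observed follow-up times
    events:      1 if the event was observed, 0 if censored
    risk_scores: higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # Comparable only if times differ and the shorter time is an event.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # earlier event had the higher predicted risk
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): four subjects, the second one censored.
times = [5, 10, 12, 20]
events = [1, 0, 1, 1]
risk_scores = [0.9, 0.4, 0.1, 0.6]
c_index = harrell_c_index(times, events, risk_scores)
error_rate = 1.0 - c_index
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, so C indices around 0.63 to 0.65 indicate modest discrimination for both models.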
Affiliation(s)
- Maryam Farhadian
  - Research Center for Health Sciences, Department of Biostatistics, School of Public Health, Hamadan University of Medical Sciences, P.O. Box 4171-65175, Hamadan, Iran
- Sahar Dehdar Karsidani
  - Department of Biostatistics, School of Public Health, Hamadan University of Medical Sciences, Hamadan, Iran
- Azadeh Mozayanimonfared
  - Department of Cardiology, Medical School, Hamadan University of Medical Sciences, Hamadan, Iran
- Hossein Mahjub
  - Research Center for Health Sciences, Department of Biostatistics, School of Public Health, Hamadan University of Medical Sciences, P.O. Box 4171-65175, Hamadan, Iran
|
24
|
Kantidakis G, Putter H, Lancia C, Boer JD, Braat AE, Fiocco M. Survival prediction models since liver transplantation - comparisons between Cox models and machine learning techniques. BMC Med Res Methodol 2020; 20:277. [PMID: 33198650 PMCID: PMC7667810 DOI: 10.1186/s12874-020-01153-1] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 04/11/2020] [Accepted: 10/26/2020] [Indexed: 01/29/2023]
Abstract
BACKGROUND Predicting survival of recipients after liver transplantation is regarded as one of the most important challenges in contemporary medicine. Hence, improving on current prediction models is of great interest. There is currently a strong discussion in the medical field about machine learning (ML) and whether it has greater potential than traditional regression models when dealing with complex data. Criticism of ML relates to unsuitable performance measures and a lack of interpretability, which is important for clinicians. METHODS In this paper, ML techniques such as random forests and neural networks are applied to a large data set of 62294 patients from the United States, with 97 predictors selected on clinical/statistical grounds from more than 600, to predict survival from transplantation. Of particular interest is also the identification of potential risk factors. A comparison is performed between 3 different Cox models (with all variables, backward selection and LASSO) and 3 machine learning techniques: a random survival forest and 2 partial logistic artificial neural networks (PLANNs). For PLANNs, novel extensions to their original specification are tested. Emphasis is given to the advantages and pitfalls of each method and to the interpretability of the ML techniques. RESULTS Well-established predictive measures from the survival field are employed (C-index, Brier score and Integrated Brier Score) and the strongest prognostic factors are identified for each model. The clinical endpoint is overall graft survival, defined as the time between transplantation and the date of graft failure or death. The random survival forest shows slightly better predictive performance than Cox models based on the C-index. Neural networks show better performance than both Cox models and random survival forest based on the Integrated Brier Score at 10 years.
CONCLUSION In this work, it is shown that machine learning techniques can be a useful tool for both prediction and interpretation in the survival context. From the ML techniques examined here, PLANN with 1 hidden layer predicts survival probabilities the most accurately, being as calibrated as the Cox model with all variables. TRIAL REGISTRATION Retrospective data were provided by the Scientific Registry of Transplant Recipients under Data Use Agreement number 9477 for analysis of risk factors after liver transplantation.
Affiliation(s)
- Georgios Kantidakis
  - Mathematical Institute (MI) Leiden University, Niels Bohrweg 1, Leiden, 2333 CA, the Netherlands; Department of Biomedical Data Sciences, Section Medical Statistics, Leiden University Medical Center (LUMC), Albinusdreef 2, Leiden, 2333 ZA, The Netherlands; Department of Statistics, European Organisation for Research and Treatment of Cancer (EORTC) Headquarters, Ave E. Mounier 83/11, Brussels, 1200, Belgium
- Hein Putter
  - Department of Biomedical Data Sciences, Section Medical Statistics, Leiden University Medical Center (LUMC), Albinusdreef 2, Leiden, 2333 ZA, The Netherlands
- Carlo Lancia
  - Mathematical Institute (MI) Leiden University, Niels Bohrweg 1, Leiden, 2333 CA, the Netherlands
- Jacob de Boer
  - Department of Surgery, Leiden University Medical Center (LUMC), Albinusdreef 2, Leiden, 2333 ZA, the Netherlands
- Andries E Braat
  - Department of Surgery, Leiden University Medical Center (LUMC), Albinusdreef 2, Leiden, 2333 ZA, the Netherlands
- Marta Fiocco
  - Mathematical Institute (MI) Leiden University, Niels Bohrweg 1, Leiden, 2333 CA, the Netherlands; Department of Biomedical Data Sciences, Section Medical Statistics, Leiden University Medical Center (LUMC), Albinusdreef 2, Leiden, 2333 ZA, The Netherlands; Trial and Data Center, Princess Máxima Center for pediatric oncology (PMC), Heidelberglaan 25, Utrecht, 3584 CS, the Netherlands
|
25
|
Roshanaei G, Safari M, Faradmal J, Abbasi M, Khazaei S. Factors affecting the survival of patients with colorectal cancer using random survival forest. J Gastrointest Cancer 2020; 53:64-71. [PMID: 33174117 DOI: 10.1007/s12029-020-00544-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Accepted: 10/28/2020] [Indexed: 11/26/2022]
Abstract
PURPOSE Colorectal cancer is one of the most common cancers and the leading cause of cancer death in Iran. This study aimed to develop and validate a random survival forest (RSF) to identify important risk factors for mortality in colorectal cancer patients based on their demographic and clinical variables. METHODS In this retrospective cohort study, the records of 317 patients with colorectal cancer who were referred to Imam Khomeini Clinic of Hamadan from 2002 to 2017 were examined. Patient survival was calculated from the time of diagnosis to death. The RSF model was used to identify factors affecting patient survival, and its results were compared with those of the Cox model. The data were analyzed using R software (version 3.6.1) and survival packages. RESULTS One-, 2-, 3-, 4-, 5-, and 10-year survival rates of included patients were 81.4%, 63%, 57%, 52%, 45%, and 34%, respectively, and the median survival was 53 months. A total of 150 patients died during this period. The four most important predictors of survival were metastasis to other organs, WBC count, disease stage, and number of lymphomas involved. The RSF method predicted survival better than the conventional Cox proportional hazards model. CONCLUSION We found that metastasis to other organs, WBC count, disease stage, and number of lymphomas involved were the four most important predictors of low survival for colorectal cancer patients.
Affiliation(s)
- Ghodratollah Roshanaei
  - Department of Biostatistics, School of Public Health, Modeling of Noncommunicable Diseases Research Center, Hamadan University of Medical Sciences, Hamadan, Iran
- Malihe Safari
  - Department of Biostatistics, School of Public Health, Modeling of Noncommunicable Diseases Research Center, Hamadan University of Medical Sciences, Hamadan, Iran
- Javad Faradmal
  - Department of Biostatistics, School of Public Health, Modeling of Noncommunicable Diseases Research Center, Hamadan University of Medical Sciences, Hamadan, Iran
- Mohammad Abbasi
  - Department of Internal Medicine, School of Medicine, Hamadan University of Medical Sciences, Hamadan, Iran
- Salman Khazaei
  - Research Center for Health Sciences, Hamadan University of Medical Sciences, Hamadan, Iran
|
26
|
Jamet B, Morvan L, Nanni C, Michaud AV, Bailly C, Chauvie S, Moreau P, Touzeau C, Zamagni E, Bodet-Milin C, Kraeber-Bodéré F, Mateus D, Carlier T. Random survival forest to predict transplant-eligible newly diagnosed multiple myeloma outcome including FDG-PET radiomics: a combined analysis of two independent prospective European trials. Eur J Nucl Med Mol Imaging 2021; 48:1005-15. [PMID: 33006656 DOI: 10.1007/s00259-020-05049-6] [Citation(s) in RCA: 30] [Impact Index Per Article: 7.5] [Received: 07/06/2020] [Accepted: 09/20/2020] [Indexed: 01/15/2023]
Abstract
PURPOSE Fluorodeoxyglucose-positron emission tomography/computed tomography (FDG-PET/CT) is included in the International Myeloma Working Group (IMWG) imaging guidelines for the work-up at diagnosis and the follow-up of multiple myeloma (MM) notably because it is a reliable tool as a predictor of prognosis. Nevertheless, none of the published studies focusing on the prognostic value of PET-derived features at baseline consider tumor heterogeneity, which could be of high importance in MM. The aim of this study was to evaluate the prognostic value of baseline PET-derived features in transplant-eligible newly diagnosed (TEND) MM patients enrolled in two prospective independent European randomized phase III trials using an innovative statistical random survival forest (RSF) approach. METHODS Imaging ancillary studies of IFM/DFCI2009 and EMN02/HO95 trials formed part of the present analysis (IMAJEM and EMN02/HO95, respectively). Among all patients initially enrolled in these studies, those with a positive baseline FDG-PET/CT imaging and focal bone lesions (FLs) and/or extramedullary disease (EMD) were included in the present analysis. A total of 17 image features (visual and quantitative, reflecting whole imaging characteristics) and 5 clinical/histopathological parameters were collected. The statistical analysis was conducted using two RSF approaches (train/validation + test and additional nested cross-validation) to predict progression-free survival (PFS). RESULTS One hundred thirty-nine patients were considered for this study. The final model based on the first RSF (train/validation + test) approach selected 3 features (treatment arm, hemoglobin, and SUVmaxBone Marrow (BM)) among the 22 involved initially, and two risk groups of patients (good and poor prognosis) could be defined with a mean hazard ratio of 4.3 ± 1.5 and a mean log-rank p value of 0.01 ± 0.01. 
The additional RSF (nested cross-validation) analysis highlighted the robustness of the proposed model across different splits of the dataset. Indeed, the first features selected using the train/validation + test approach remained the first ones over the folds with the nested approach. CONCLUSION We proposed a new prognosis model for TEND MM patients at diagnosis based on two RSF approaches. TRIAL REGISTRATION IMAJEM: NCT01309334 and EMN02/HO95: NCT01134484.
|
27
|
Jung SY, Papp JC, Sobel EM, Pellegrini M, Yu H, Zhang ZF. Pro-inflammatory cytokine polymorphisms in ONECUT2 and HNF4A and primary colorectal carcinoma: a post genome-wide gene-lifestyle interaction study. Am J Cancer Res 2020; 10:2955-2976. [PMID: 33042629 PMCID: PMC7539781] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/04/2020] [Accepted: 08/12/2020] [Indexed: 06/11/2023]
Abstract
Immune-related molecular and genetic pathways that are connected to colorectal cancer (CRC) and lifestyles in postmenopausal women are incompletely characterized. In this study, we examined the role of pro-inflammatory biomarkers such as C-reactive protein (CRP) and interleukin-6 (IL-6) in those pathways. Through selection of the best predictive single-nucleotide polymorphisms (SNPs) and lifestyles, our goal was to improve the prediction accuracy and ability for CRC risk. Using large cohort data of postmenopausal women from the Women's Health Initiative Database for Genotypes and Phenotypes Study, we previously conducted a genome-wide association (GWA) for a CRP and IL-6 gene-behavioral interaction study. For the present study, we added GWA-SNPs from outside GWA studies, resulting in a total of 152 SNPs. Together with 41 selected lifestyles, we performed a 2-stage multimodal random survival forest analysis with generalized multifactor dimensionality reduction approach to construct CRC risk profiles. Overall and in obesity strata (by body mass index, waist circumference, waist-to-hip ratio, exercise, and dietary fat intake), we identified the best predictive genetic markers in inflammatory cytokines and lifestyles. Across the strata, 2 SNPs (ONECUT2 rs4092465 and HNF4A rs1800961) and 1 lifestyle factor (relatively short-term past use of oral contraceptives) were the most common and strongest predictive markers for CRC risk. The risk profile that combined those variables exhibited synergistically increased risk for CRC; this pattern appeared more strongly in obese and inactive subgroups. Our results may contribute to improved predictability for CRC and suggest genetically targeted lifestyle interventions for women carrying the inflammatory-risk genotypes, reducing CRC risk.
Affiliation(s)
- Su Yon Jung
  - Translational Sciences Section, Jonsson Comprehensive Cancer Center, School of Nursing, University of California, Los Angeles, CA 90095, USA
- Jeanette C Papp
  - Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, CA 90095, USA
- Eric M Sobel
  - Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, CA 90095, USA
  - Department of Computational Medicine, David Geffen School of Medicine, University of California, Los Angeles, CA 90095, USA
- Matteo Pellegrini
  - Department of Molecular, Cell and Developmental Biology, Life Sciences Division, University of California, Los Angeles, CA 90095, USA
- Herbert Yu
  - Cancer Epidemiology Program, University of Hawaii Cancer Center, Honolulu, HI 96813, USA
- Zuo-Feng Zhang
  - Department of Epidemiology, Fielding School of Public Health, University of California, Los Angeles, CA 90095, USA
  - Center for Human Nutrition, David Geffen School of Medicine, University of California, Los Angeles, CA 90095, USA
|
28
|
Ma B, Geng Y, Meng F, Yan G, Song F. Identification of a Sixteen-gene Prognostic Biomarker for Lung Adenocarcinoma Using a Machine Learning Method. J Cancer 2020; 11:1288-1298. [PMID: 31956375 PMCID: PMC6959071 DOI: 10.7150/jca.34585] [Citation(s) in RCA: 30] [Impact Index Per Article: 7.5] [Received: 03/03/2019] [Accepted: 10/25/2019] [Indexed: 12/27/2022]
Abstract
Objectives: Lung adenocarcinoma (LUAD) accounts for a majority of cancer-related deaths worldwide annually. The identification of prognostic biomarkers and prediction of prognosis for LUAD patients is necessary. Materials and Methods: In this study, LUAD RNA-Seq data and clinical data from the Cancer Genome Atlas (TCGA) were divided into TCGA cohort I (n = 338) and II (n = 168). The cohort I was used for model construction, and the cohort II and data from Gene Expression Omnibus (GSE72094 cohort, n = 393; GSE11969 cohort, n = 149) were utilized for validation. First, the survival-related seed genes were selected from the cohort I using the machine learning model (random survival forest, RSF), and then in order to improve prediction accuracy, the forward selection model was utilized to identify the prognosis-related key genes among the seed genes using the clinically-integrated RNA-Seq data. Second, the survival risk score system was constructed by using these key genes in the cohort II, the GSE72094 cohort and the GSE11969 cohort, and the evaluation metrics such as HR, p value and C-index were calculated to validate the proposed method. Third, the developed approach was compared with the previous five prediction models. Finally, bioinformatics analyses (pathway, heatmap, protein-gene interaction network) have been applied to the identified seed genes and key genes. Results and Conclusion: Based on the RSF model and clinically-integrated RNA-Seq data, we identified sixteen key genes that formed the prognostic gene expression signature. These sixteen key genes could achieve a strong power for prognostic prediction of LUAD patients in cohort II (HR = 3.80, p = 1.63e-06, C-index = 0.656), and were further validated in the GSE72094 cohort (HR = 4.12, p = 1.34e-10, C-index = 0.672) and GSE11969 cohort (HR = 3.87, p = 6.81e-07, C-index = 0.670). 
The experimental results of three independent validation cohorts showed that compared with the traditional Cox model and the use of standalone RNA-Seq data, the machine-learning-based method effectively improved the prediction accuracy of LUAD prognosis, and the derived model was also superior to the other five existing prediction models. KEGG pathway analysis found eleven of the sixteen genes were associated with Nicotine addiction. Thirteen of the sixteen genes were reported for the first time as the LUAD prognosis-related key genes. In conclusion, we developed a sixteen-gene prognostic marker for LUAD, which may provide a powerful prognostic tool for precision oncology.
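The C-index values reported in the abstract above measure how often a risk score orders pairs of patients correctly. The following is a minimal sketch of Harrell's concordance index for right-censored data, not the authors' code; the function name `concordance_index` and its argument names are assumptions made for illustration, and pairs with tied observation times are simply skipped.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Simplified Harrell's C-index for right-censored survival data.

    times: observed follow-up time per patient
    events: 1 if the event was observed, 0 if the patient was censored
    risk_scores: higher score means higher predicted risk (shorter survival)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # tied times are skipped in this simplified sketch
        # Order the pair so that `a` has the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the pair is not comparable
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # higher risk failed first: concordant
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which is the scale on which the reported values of roughly 0.66 to 0.67 should be read.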
Affiliation(s)
- Baoshan Ma
- College of Information Science and Technology, Dalian Maritime University, Dalian 116026, China
- Yao Geng
- College of Information Science and Technology, Dalian Maritime University, Dalian 116026, China
- Fanyu Meng
- College of Information Science and Technology, Dalian Maritime University, Dalian 116026, China
- Ge Yan
- College of Information Science and Technology, Dalian Maritime University, Dalian 116026, China
- Fengju Song
- Department of Epidemiology and Biostatistics, Key Laboratory of Cancer Prevention and Therapy, Tianjin, National Clinical Research Center of Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin 300060, China
|
29
|
Morvan L, Carlier T, Jamet B, Bailly C, Bodet-Milin C, Moreau P, Kraeber-Bodéré F, Mateus D. Leveraging RSF and PET images for prognosis of multiple myeloma at diagnosis. Int J Comput Assist Radiol Surg 2020; 15:129-39. [PMID: 31256359 DOI: 10.1007/s11548-019-02015-y] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 02/03/2019] [Accepted: 06/11/2019] [Indexed: 10/26/2022]
Abstract
PURPOSE Multiple myeloma (MM) is a bone marrow cancer that accounts for 10% of all hematological malignancies. It has been reported that FDG PET imaging provides prognostic information for both baseline and therapeutic follow-up of MM patients using visual analysis. In this study, we aim to develop a computer-assisted method based on PET quantitative image features to assist diagnoses and treatment decisions for MM patients. METHODS Our proposed model relies on a two-stage method with Random Survival Forest (RSF) and variable importance (VIMP) for both feature selection and prediction. The targeted variable for prediction is the progression-free survival (PFS). We consider texture-based (radiomics), conventional (e.g., SUVmax) and clinical biomarkers. We evaluate PFS predictions in terms of C-index and final prognosis separation in two risk groups, from a database of 66 patients who were part of the prospective multi-centric French IMAJEM study. RESULTS Our method (VIMP + RSF) provides better results (1-C-index of 0.36) than conventional methods such as Lasso-Cox and gradient-boosting Cox (0.48 and 0.56, respectively). We showed experimentally the benefit of feature selection (0.61 for RSF without selection) and showed that VIMP selection is more stable and gives better results than minimal depth and variable hunting (0.47 and 0.43). The approach gives better prognosis group separation (a p value of 0.05 against 0.11 to 0.4 for others). CONCLUSION Our results confirm the predictive value of radiomics for MM patients; in particular, they demonstrate that quantitative/heterogeneity image-based features reduce the error of the predicted progression. To our knowledge, this is the first work using RSF on PET images for the progression prediction of MM patients. Moreover, we provide an analysis of the feature selection process, which points toward the identification of clinically relevant biomarkers.
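The variable importance (VIMP) idea used for feature selection above can be illustrated with a permutation scheme: shuffle one feature and measure how much the prediction error grows. The sketch below is a simplified, model-agnostic illustration and not the authors' pipeline; `permutation_importance`, `predict`, and `error_fn` are hypothetical names, and a real random survival forest would typically compute this on out-of-bag samples with an error such as 1 - C-index.

```python
import random


def permutation_importance(predict, rows, error_fn, feature, seed=0):
    """Increase in prediction error after permuting one feature.

    predict: callable mapping a row (dict of feature values) to a prediction
    rows: list of dicts, one per patient
    error_fn: callable mapping the list of predictions to an error value
              (assumed to close over the true outcomes)
    feature: name of the feature to permute
    """
    rng = random.Random(seed)
    baseline_error = error_fn([predict(row) for row in rows])

    # Shuffle the values of the chosen feature across patients.
    shuffled_values = [row[feature] for row in rows]
    rng.shuffle(shuffled_values)
    permuted_rows = [
        {**row, feature: value} for row, value in zip(rows, shuffled_values)
    ]

    permuted_error = error_fn([predict(row) for row in permuted_rows])
    # A large positive difference suggests the model relies on the feature.
    return permuted_error - baseline_error
```

Features whose permutation leaves the error unchanged are candidates for removal, which is the intuition behind keeping only high-VIMP features before refitting the model.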
|
30
|
Shi M, Xu G. Development and validation of GMI signature based random survival forest prognosis model to predict clinical outcome in acute myeloid leukemia. BMC Med Genomics 2019; 12:90. [PMID: 31242922 PMCID: PMC6595612 DOI: 10.1186/s12920-019-0540-5] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 02/27/2019] [Accepted: 05/30/2019] [Indexed: 12/13/2022]
Abstract
Background Acute myeloid leukemia (AML) is a disease with marked molecular heterogeneity and a high early death rate. Our aim was to investigate an integrated Gene expression, miRNA and miRNA-mRNA Interactions (GMI) signature for improving risk stratification of AML. Methods We identified differentially expressed genes by pooling data from 861 human AML patients and 75 normal cases. We then used miRWalk to identify the functional miRNA-mRNA regulatory module. The GMI signature based random survival forest (RSF) prognosis model was developed from a training data set and evaluated in independent patient cohorts from The Cancer Genome Atlas (TCGA) dataset (N = 147). Univariate and multivariate Cox proportional hazards regression analyses were applied to evaluate the prognostic value of the GMI signature. Results We identified 139 differentially expressed genes between normal and abnormal AML samples. We discovered a functional miRNA-mRNA regulatory module that participates in the network of cancer progression. We named 23 differentially expressed genes and 16 validated target miRNAs as the GMI signature. The RSF model-based scores separated independent patient cohorts into two groups with significantly different overall survival (C-index = 0.59, hazard ratio [HR], 2.12; 95% confidence interval [CI], 1.11–4.03; p = 0.019). Similar results were obtained with reversed training and testing datasets (C-index = 0.58, hazard ratio [HR], 2.08; 95% confidence interval [CI], 1.02–4.24; p = 0.038). The GMI signature score contributed more information about recurrence than standard clinical covariates. Conclusion The GMI signature based RSF prognosis model not only reflects regulatory relationships from the identified miRNA-mRNA module but also informs patient prognosis.
While in the TCGA data set the GMI signature score contributed additional information about recurrence in comparison to standard clinical covariates, further studies are needed to determine its clinical significance. Electronic supplementary material The online version of this article (10.1186/s12920-019-0540-5) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Mingguang Shi
- School of Electric Engineering and Automation, Hefei University of Technology, Hefei, 230009, Anhui, China.
- Guofu Xu
- School of Electric Engineering and Automation, Hefei University of Technology, Hefei, 230009, Anhui, China
|
31
|
Abstract
Background Congestive heart failure is one of the most common reasons those aged 65 and over are hospitalized in the United States, creating a considerable economic burden. The precise prediction of hospitalization caused by congestive heart failure in the near future could prevent possible hospitalization, optimize the medical resources, and better meet the healthcare needs of patients. Methods To fully utilize the monthly-updated claim feed data released by The Centers for Medicare and Medicaid Services (CMS), we present a dynamic random survival forest model adapted for periodically updated data to predict the risk of adverse events. We apply our model to dynamically predict the risk of hospital admission among patients with congestive heart failure identified using the Accountable Care Organization Operational System Claim and Claim Line Feed data from Feb 2014 to Sep 2015. We benchmark the proposed model against two models commonly used in the medical application literature: the Cox proportional hazards model and a logistic regression model with an L1-norm penalty. Results Results show that our model has high Area-Under-the-ROC-Curve across time points and C-statistics. In addition to the high performance, it provides measures of variable importance and individual-level instant risk. Conclusion We present an efficient model adapted for periodically updated data such as the monthly updated claim feed data released by CMS to predict the risk of hospitalization. In addition to processing big-volume periodically updated stream-like data, our model can capture event onset information and time-to-event information, incorporate time-varying features, provide insight into variable importance and have good prediction power. To the best of our knowledge, it is the first work combining the sliding window technique with the random survival forest model. The model achieves remarkable performance and could be easily deployed to monitor patients in real time.
Electronic supplementary material The online version of this article (10.1186/s12911-019-0734-y) contains supplementary material, which is available to authorized users.
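The abstract above combines a sliding window with a survival model on monthly updated data. One way to picture the windowing step is sketched below, under stated assumptions: the function `sliding_window_samples`, the 6-month lookback, the 3-month horizon, and the single feature `prior_admissions` are all hypothetical choices for illustration and are not taken from the paper.

```python
def sliding_window_samples(admission_months, first_month, last_month,
                           lookback=6, horizon=3):
    """Build one (landmark, features, time, event) sample per window position.

    admission_months: month indices in which one patient was admitted
    first_month, last_month: first and last month with available claims
    lookback: months of history summarised as features (assumed value)
    horizon: months of follow-up after each landmark (assumed value)
    """
    samples = []
    # `landmark` is the first month of the follow-up period for a window.
    for landmark in range(first_month + lookback, last_month - horizon + 2):
        history = [m for m in admission_months
                   if landmark - lookback <= m < landmark]
        future = sorted(m for m in admission_months
                        if landmark <= m < landmark + horizon)
        features = {"prior_admissions": len(history)}
        if future:
            # Event observed: time counted in months from the landmark.
            time_to_event, event = future[0] - landmark + 1, 1
        else:
            # No admission in the horizon: censored at the end of the window.
            time_to_event, event = horizon, 0
        samples.append((landmark, features, time_to_event, event))
    return samples
```

Each time a new month of claims arrives, `last_month` advances and new samples become available, so a survival model can be refit or rescored on the most recent windows.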
Affiliation(s)
- Tianzhong Yang
- Philips Research North America, Cambridge, MA, 02141, USA; Department of Biostatistics and Data Science, The University of Texas Health Science Center at Houston, Houston, 77030, TX, USA
- Yang Yang
- Philips Research North America, Cambridge, MA, 02141, USA.
- Yugang Jia
- Philips Research North America, Cambridge, MA, 02141, USA
- Xiao Li
- Philips Research North America, Cambridge, MA, 02141, USA; Department of Biostatistics and Data Science, The University of Texas Health Science Center at Houston, Houston, 77030, TX, USA
|
32
|
Verschut TA, Hambäck PA. A random survival forest illustrates the importance of natural enemies compared to host plant quality on leaf beetle survival rates. BMC Ecol 2018; 18:33. [PMID: 30200936 PMCID: PMC6131828 DOI: 10.1186/s12898-018-0187-7] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 03/12/2018] [Accepted: 08/31/2018] [Indexed: 11/11/2022]
Abstract
Background Wetlands are habitats where variation in soil moisture content and associated environmental conditions can strongly affect the survival of herbivorous insects by changing host plant quality and natural enemy densities. In this study, we combined natural enemy exclusion experiments with random survival forest analyses to study the importance of local variation in host plant quality and predation by natural enemies on the egg and larval survival of the leaf beetle Galerucella sagittariae along a soil moisture gradient. Results Our results showed that the exclusion of natural enemies substantially increased the survival probability of G. sagittariae eggs and larvae. Interestingly, the egg survival probability decreased with soil moisture content, while the larval survival probability instead increased with soil moisture content. For both the egg and larval survival, we found that host plant height, the number of eggs or larvae, and vegetation height explained more of the variation than the soil moisture gradient by itself. Moreover, host plant quality related variables, such as leaf nitrogen, carbon and phosphorus content did not influence the survival of G. sagittariae eggs and larvae. Conclusion Our results suggest that the soil moisture content is not an overarching factor that determines the interplay between factors related to host plant quality and factors relating to natural enemies on the survival of G. sagittariae in different microhabitats. Moreover, the natural enemy exclusion experiments and the random survival forest analysis suggest that natural enemies have a stronger indirect impact on the survival of G. sagittariae offspring than host plant quality. Electronic supplementary material The online version of this article (10.1186/s12898-018-0187-7) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Thomas A Verschut
- Department of Ecology, Environment and Plant Sciences, Stockholm University, 106 91, Stockholm, Sweden.
- Peter A Hambäck
- Department of Ecology, Environment and Plant Sciences, Stockholm University, 106 91, Stockholm, Sweden
|
33
|
Audureau E, Chivet A, Ursu R, Corns R, Metellus P, Noel G, Zouaoui S, Guyotat J, Le Reste PJ, Faillot T, Litre F, Desse N, Petit A, Emery E, Lechapt-Zalcman E, Peltier J, Duntze J, Dezamis E, Voirin J, Menei P, Caire F, Dam Hieu P, Barat JL, Langlois O, Vignes JR, Fabbro-Peray P, Riondel A, Sorbets E, Zanello M, Roux A, Carpentier A, Bauchet L, Pallud J; Club de Neuro-Oncologie of the Société Française de Neurochirurgie. Prognostic factors for survival in adult patients with recurrent glioblastoma: a decision-tree-based model. J Neurooncol 2018; 136:565-76. [PMID: 29159777 DOI: 10.1007/s11060-017-2685-4] [Citation(s) in RCA: 37] [Impact Index Per Article: 5.3] [Received: 09/26/2017] [Accepted: 11/11/2017] [Indexed: 01/30/2023]
Abstract
We assessed prognostic factors in relation to OS from progression in recurrent glioblastomas. Retrospective multicentric study enrolling 407 (training set) and 370 (external validation set) adult patients with a recurrent supratentorial glioblastoma treated by surgical resection and standard combined chemoradiotherapy as first-line treatment. Four complementary multivariate prognostic models were evaluated: Cox proportional hazards regression modeling, single-tree recursive partitioning, random survival forest, conditional random forest. Median overall survival from progression was 7.6 months (mean, 10.1; range, 0-86) and 8.0 months (mean, 8.5; range, 0-56) in the training and validation sets, respectively (p = 0.900). Using the Cox model in the training set, independent predictors of poorer overall survival from progression included increasing age at histopathological diagnosis (aHR, 1.47; 95% CI [1.03-2.08]; p = 0.032), RTOG-RPA V-VI classes (aHR, 1.38; 95% CI [1.11-1.73]; p = 0.004), decreasing KPS at progression (aHR, 3.46; 95% CI [2.10-5.72]; p < 0.001), while independent predictors of longer overall survival from progression included surgical resection (aHR, 0.57; 95% CI [0.44-0.73]; p < 0.001) and chemotherapy (aHR, 0.41; 95% CI [0.31-0.55]; p < 0.001). Single-tree recursive partitioning identified KPS at progression, surgical resection at progression, chemotherapy at progression, and RTOG-RPA class at histopathological diagnosis, as main survival predictors in the training set, yielding four risk categories highly predictive of overall survival from progression both in training (p < 0.0001) and validation (p < 0.0001) sets. Both random forest approaches identified KPS at progression as the most important survival predictor. Age, KPS at progression, RTOG-RPA classes, surgical resection at progression and chemotherapy at progression are prognostic for survival in recurrent glioblastomas and should inform the treatment decisions.
|
34
|
Gilhodes J, Zemmour C, Ajana S, Martinez A, Delord JP, Leconte E, Boher JM, Filleron T. Comparison of variable selection methods for high-dimensional survival data with competing events. Comput Biol Med 2017; 91:159-167. [PMID: 29078093 DOI: 10.1016/j.compbiomed.2017.10.021] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Received: 05/05/2017] [Revised: 10/19/2017] [Accepted: 10/19/2017] [Indexed: 11/12/2022]
Abstract
BACKGROUND In the era of personalized medicine, it is essential to identify gene signatures for each event type in the context of competing risks in order to improve risk stratification and treatment strategy. Until recently, little attention was paid to the performance of high-dimensional selection in deriving molecular signatures in this context. In this paper, we investigate the performance of two selection methods developed in the framework of high-dimensional data and competing risks: random survival forest and a boosting approach for fitting proportional subdistribution hazards models. METHODS Using data from bladder cancer patients (GSE5479) and simulated datasets, stability and prognosis performance of the two methods were evaluated using a resampling strategy. For each sample, the data set was split into 100 training and validation sets. Molecular signatures were developed in the training sets by the two selection methods and then applied on the corresponding validation sets. RESULTS Random survival forest and the boosting approach have comparable performance for the prediction of survival data, with few selected genes in common. Nevertheless, many different sets of genes are identified by the resampling approach, with a very low frequency of gene occurrence among the signatures. Also, the smaller the training sample size, the lower the stability of the signatures. CONCLUSION Random survival forest and the boosting approach give good predictive performance, but gene signatures are very unstable. Further work is needed to propose adequate strategies for the analysis of high-dimensional data in the context of competing risks.
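The instability reported above can be quantified once a signature has been derived on each resampled training set. The sketch below shows two simple summaries: how often each gene is selected, and the mean pairwise Jaccard overlap between signatures. The Jaccard index is one common stability measure chosen here for illustration and is not necessarily the one used in the paper; the function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def selection_frequency(signatures):
    """Fraction of resampled signatures in which each gene appears."""
    counts = Counter(gene for sig in signatures for gene in set(sig))
    n = len(signatures)
    return {gene: count / n for gene, count in counts.items()}


def mean_pairwise_jaccard(signatures):
    """Average Jaccard overlap |A & B| / |A | B| over all signature pairs."""
    sets = [set(sig) for sig in signatures]
    pairs = list(combinations(sets, 2))
    if not pairs:
        raise ValueError("need at least two signatures")
    total = 0.0
    for a, b in pairs:
        union = a | b
        # Two empty signatures are treated as identical.
        total += len(a & b) / len(union) if union else 1.0
    return total / len(pairs)
```

Low selection frequencies for most genes together with a mean Jaccard value near zero would correspond to the "very unstable" signatures described in the conclusion.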
Affiliation(s)
- Julia Gilhodes
- Department of Biostatistics, Institut Claudius Regaud, IUCT-O, Toulouse, France
- Christophe Zemmour
- Department of Clinical Research and Investigation, Biostatistics and Methodology Unit, Institut Paoli-Calmettes, Aix Marseille University, INSERM, IRD, SESSTIM, Marseille, France
- Soufiane Ajana
- Department of Biostatistics, Institut Claudius Regaud, IUCT-O, Toulouse, France
- Alejandra Martinez
- Department of Surgery, Institut Claudius Regaud, IUCT-O, Toulouse, France
- Jean-Pierre Delord
- Department of Medical Oncology, Institut Claudius Regaud, IUCT-O, Toulouse, France
- Jean-Marie Boher
- Department of Clinical Research and Investigation, Biostatistics and Methodology Unit, Institut Paoli-Calmettes, Aix Marseille University, INSERM, IRD, SESSTIM, Marseille, France
- Thomas Filleron
- Department of Biostatistics, Institut Claudius Regaud, IUCT-O, Toulouse, France.
|
35
|
Abstract
Over the past decades, there has been considerable interest in applying statistical machine learning methods in survival analysis. Ensemble based approaches, especially random survival forests, have been developed in a variety of contexts due to their high precision and non-parametric nature. This article aims to provide a timely review on recent developments and applications of random survival forests for time-to-event data with high dimensional covariates. This selective review begins with an introduction to the random survival forest framework, followed by a survey of recent developments on splitting criteria, variable selection, and other advanced topics of random survival forests for time-to-event data in high dimensional settings. We also discuss potential directions for future research.
Affiliation(s)
- Hong Wang
- School of Mathematics and Statistics, Central South University, Hunan 410083, China
- Gang Li
- Department of Biostatistics and Biomathematics, School of Public Health, University of California at Los Angeles, CA 90095, USA
|
36
|
Shi M, He J. ColoFinder: a prognostic 9-gene signature improves prognosis for 871 stage II and III colorectal cancer patients. PeerJ 2016; 4:e1804. [PMID: 26989635 PMCID: PMC4793313 DOI: 10.7717/peerj.1804] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Received: 12/30/2015] [Accepted: 02/23/2016] [Indexed: 12/24/2022]
Abstract
Colorectal cancer (CRC) is a heterogeneous disease with a high mortality rate and is still lacking an effective treatment. Our goal is to develop a robust model for predicting prognosis in CRC patients. In this study, 871 stage II and III CRC samples were collected from six gene expression profiling datasets. ColoFinder was developed using a 9-gene signature based Random Survival Forest (RSF) prognosis model. The 9-gene signature recurrence score was derived with 5-fold cross validation to test the association with relapse-free survival, and an AUC of 0.87 was obtained in GSE39582 (95% CI [0.83-0.91]). The low-risk group had a significantly better relapse-free survival (HR, 14.8; 95% CI [8.17-26.8]; P < 0.001) than the high-risk group. We also found that the 9-gene signature recurrence score contributed more information about recurrence than standard clinical and pathological variables in univariate and multivariate Cox analyses when applied to GSE17536 (p = 0.03 and p = 0.01, respectively). Furthermore, ColoFinder improved the predictive ability and better stratified the risk subgroups when applied to CRC gene expression datasets GSE14333, GSE17537, GSE12945 and GSE24551. In summary, ColoFinder significantly improves the risk assessment in stage II and III CRC patients. The 9-gene prognostic classifier informs patient prognosis and treatment response.
Affiliation(s)
- Mingguang Shi
- School of Electric Engineering and Automation, Hefei University of Technology, Hefei, Anhui, China
- Jianmin He
- School of Management, Hefei University of Technology, Hefei, Anhui, China
|
37
|
HAMIDI O, POOROLAJAL J, FARHADIAN M, TAPAK L. Identifying Important Risk Factors for Survival in Kidney Graft Failure Patients Using Random Survival Forests. Iran J Public Health 2016; 45:27-33. [PMID: 27057518 PMCID: PMC4822390] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/31/2022]
Abstract
BACKGROUND Kidney transplantation is the best alternative treatment for end-stage renal disease. Several studies have investigated predisposing factors of graft rejection. However, their results are inconsistent. The objective of the present study was to utilize an intuitive and robust approach for variable selection, random survival forests (RSF), and to identify important risk factors in kidney transplantation patients. METHODS The data set included 378 patients with kidney transplantation obtained through a historical cohort study in Hamadan, western Iran, from 1994 to 2011. The event of interest was chronic nonreversible graft rejection, and the duration between kidney transplantation and rejection was considered as the survival time. The RSF method was used to identify important risk factors for survival of the patients among the potential predictors of graft rejection. RESULTS The mean survival time was 7.35±4.62 yr. Thirty-seven episodes of rejection occurred. The most important predictors of survival were cold ischemic time, recipient's age, creatinine level at discharge, donor's age and duration of hospitalization. The RSF method predicted survival better than the conventional Cox proportional hazards model (out-of-bag C-index of 0.965 for RSF vs. 0.766 for the Cox model and integrated Brier score of 0.081 for RSF vs. 0.088 for the Cox model). CONCLUSION An RSF model in the kidney transplantation patients outperformed the traditional Cox proportional hazards model. RSF is a promising method that may serve as a more intuitive approach to identify important risk factors for graft rejection.
Affiliation(s)
- Omid HAMIDI
- Dept. of Science, Hamadan University of Technology, Hamadan, Iran
- Jalal POOROLAJAL
- Modeling of Noncommunicable Diseases Research Center, Department of Epidemiology, School of Public Health, Hamadan University of Medical Sciences, Hamadan, Iran
- Maryam FARHADIAN
- Modeling of Noncommunicable Diseases Research Center, Dept. of Biostatistics, School of Public Health, Hamadan University of Medical Sciences, Hamadan, Iran
- Leili TAPAK
- Dept. of Biostatistics, School of Public Health, Hamadan University of Medical Sciences, Hamadan, Iran
|
38
|
Stenholm S, Shardell M, Bandinelli S, Guralnik JM, Ferrucci L. Physiological factors contributing to mobility loss over 9 years of follow-up—results from the InCHIANTI study. J Gerontol A Biol Sci Med Sci 2015; 70:591-7. [PMID: 25748030 DOI: 10.1093/gerona/glv004] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Received: 07/07/2014] [Accepted: 12/31/2014] [Indexed: 12/25/2022]
Abstract
BACKGROUND Mobility is an essential aspect of everyday life and enables autonomy and participation. Although many risk factors for mobility loss have been previously described, their relative importance and independent contributions to the long-term risk of losing mobility have not been well defined. METHODS This study is based on 1,013 men and women aged ≥65 years enrolled in 1998-2000 and followed for 9 years through 2007-2008 in the population-based InCHIANTI (Invecchiare in Chianti, aging in the Chianti area) study. We considered 44 different measures assessed at baseline to explore six subsystems: (i) central nervous system, (ii) peripheral nervous system, (iii) muscles, (iv) bone and joints, (v) energy production and delivery, and (vi) perceptual system. The outcome was incident mobility loss defined as self-report of inability to walk 400 m or climb and descend 10 steps without help from another person. Random survival forest analysis was used to rank the candidate predictors by their importance. RESULTS The most important physiological markers predicting mobility loss that emerged from the random survival forest modeling were older age among women (81-95 vs 65-68 years, hazard ratio [HR] 9.60 [95% CI 3.35, 27.50]), weaker ankle dorsiflexion strength (lowest vs highest quintile, HR 5.25 [95% CI 2.35, 11.72]), low hip flexion range of motion (lowest vs highest quintile, HR 2.30 [95% CI 1.20, 4.41]), presence of primitive reflexes (yes vs no, HR 1.47 [95% CI 1.03, 2.09]), and tremor (yes vs no, HR 1.91 [95% CI 1.18, 3.07]). CONCLUSION Prevention of mobility loss with aging should focus on prevention and treatment of neuromuscular impairments.
Affiliation(s)
- Sari Stenholm
- Department of Public Health, University of Turku, Finland. School of Health Sciences, University of Tampere, Finland.
- Jack M Guralnik
- Department of Epidemiology and Public Health, University of Maryland School of Medicine, Baltimore
|