1. Cai K, Fu W, Wang Z, Yang X, Liu H, Ji Z. Optimizing Prognostic Predictions in Liver Cancer with Machine Learning and Survival Analysis. Entropy (Basel, Switzerland) 2024; 26:767. [PMID: 39330100 PMCID: PMC11431161 DOI: 10.3390/e26090767] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/07/2024] [Revised: 09/02/2024] [Accepted: 09/06/2024] [Indexed: 09/28/2024]
Abstract
This study harnesses RNA sequencing data from the Cancer Genome Atlas to unearth pivotal genetic markers linked to the progression of liver hepatocellular carcinoma (LIHC), a major contributor to cancer-related deaths worldwide, characterized by a dire prognosis and limited treatment avenues. We employ advanced feature selection techniques, including sure independence screening (SIS) combined with the least absolute shrinkage and selection operator (Lasso), smoothly clipped absolute deviation (SCAD), information gain (IG), and permutation variable importance (VIMP) methods, to effectively navigate the challenges posed by ultra-high-dimensional data. Through these methods, we identify critical genes like MED8 as significant markers for LIHC. These markers are further analyzed using advanced survival analysis models, including the Cox proportional hazards model, survival tree, and random survival forests. Our findings reveal that SIS-Lasso demonstrates strong predictive accuracy, particularly in combination with the Cox proportional hazards model. However, when coupled with the random survival forests method, the SIS-VIMP approach achieves the highest overall performance. This comprehensive approach not only enhances the prediction of LIHC outcomes but also provides valuable insights into the genetic mechanisms underlying the disease, thereby paving the way for personalized treatment strategies and advancing the field of cancer genomics.
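The screening step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the classical correlation-based form of sure independence screening for an uncensored response: each feature is ranked by its absolute marginal correlation with the outcome and only the top few are kept before a penalized method such as Lasso is applied. For censored survival data the ranking would typically come from a marginal survival model instead, so this is a swapped-in simplification rather than the authors' procedure. The function names `pearson` and `sis_screen` are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def sis_screen(columns, response, keep):
    """Return the sorted indices of the `keep` features with the largest
    absolute marginal correlation with the response.

    Assumption: `response` is fully observed (no censoring).
    """
    scores = [abs(pearson(col, response)) for col in columns]
    ranked = sorted(range(len(columns)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:keep])


# Toy example: four features measured on four samples.
columns = [
    [1, 2, 3, 4],        # perfectly correlated with the response
    [4, 1, 3, 2],        # weakly (negatively) correlated
    [2, 4, 6, 8.5],      # strongly correlated
    [1, 1, 1, 1],        # constant, carries no information
]
response = [1, 2, 3, 4]
selected = sis_screen(columns, response, keep=2)
```

The retained indices would then be handed to the downstream selector or survival model, which only has to work with `keep` features instead of the full ultra-high-dimensional set.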
Affiliation(s)
- Kaida Cai
  - Department of Epidemiology and Biostatistics, School of Public Health, Southeast University, Nanjing 210009, China
  - Department of Statistics and Actuarial Science, School of Mathematics, Southeast University, Nanjing 211189, China
  - Key Laboratory of Environmental Medicine Engineering, Ministry of Education, School of Public Health, Southeast University, Nanjing 210009, China
- Wenzhi Fu
  - Department of Statistics and Actuarial Science, School of Mathematics, Southeast University, Nanjing 211189, China
- Zhengyan Wang
  - Department of Statistics and Actuarial Science, School of Mathematics, Southeast University, Nanjing 211189, China
- Xiaofang Yang
  - Department of Statistics and Actuarial Science, School of Mathematics, Southeast University, Nanjing 211189, China
- Hanwen Liu
  - Department of Statistics and Actuarial Science, School of Mathematics, Southeast University, Nanjing 211189, China
- Ziyang Ji
  - Department of Statistics and Actuarial Science, School of Mathematics, Southeast University, Nanjing 211189, China

2. Davies AL, Coolen AC, Galla T. Delayed kernels for longitudinal survival analysis and dynamic prediction. Stat Methods Med Res 2024:9622802241275382. [PMID: 39211944 DOI: 10.1177/09622802241275382] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/04/2024]
Abstract
Predicting patient survival probabilities based on observed covariates is an important assessment in clinical practice. These patient-specific covariates are often measured over multiple follow-up appointments. It is then of interest to predict survival based on the history of these longitudinal measurements, and to update predictions as more observations become available. The standard approaches to these so-called 'dynamic prediction' assessments are joint models and landmark analysis. Joint models involve high-dimensional parameterizations, and their computational complexity often prohibits including multiple longitudinal covariates. Landmark analysis is simpler, but discards a proportion of the available data at each 'landmark time'. In this work, we propose a 'delayed kernel' approach to dynamic prediction that sits somewhere in between the two standard methods in terms of complexity. By conditioning hazard rates directly on the covariate measurements over the observation time frame, we define a model that takes into account the full history of covariate measurements but is more practical and parsimonious than joint modelling. Time-dependent association kernels describe the impact of covariate changes at earlier times on the patient's hazard rate at later times. Under the constraints that our model (a) reduces to the standard Cox model for time-independent covariates, and (b) contains the instantaneous Cox model as a special case, we derive two natural kernel parameterizations. Upon application to three clinical data sets, we find that the predictive accuracy of the delayed kernel approach is comparable to that of the two existing standard methods.
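To make the data loss in landmark analysis concrete, here is a minimal sketch of how an analysis set for a single landmark time might be assembled. It assumes the common last-observation-carried-forward convention and a hypothetical record layout (`time`, `event`, `visits`); it is not the delayed kernel method proposed in the paper.

```python
def landmark_dataset(subjects, landmark_time):
    """Build the analysis set for one landmark time.

    subjects: list of dicts with hypothetical keys
        'time'   - event or censoring time
        'event'  - 1 if the event was observed, 0 if censored
        'visits' - list of (measurement_time, value) pairs
    """
    rows = []
    for s in subjects:
        # Subjects who had the event or were censored by the landmark are dropped.
        if s["time"] <= landmark_time:
            continue
        # Only measurements taken up to the landmark may be used.
        past = [(t, v) for t, v in s["visits"] if t <= landmark_time]
        if not past:
            continue
        # Carry the most recent measurement forward to the landmark.
        _, last_value = max(past, key=lambda tv: tv[0])
        rows.append({
            "covariate": last_value,
            "residual_time": s["time"] - landmark_time,
            "event": s["event"],
        })
    return rows


subjects = [
    {"time": 5.0, "event": 1, "visits": [(0.0, 1.2), (1.5, 1.8), (3.0, 2.5)]},
    {"time": 1.0, "event": 1, "visits": [(0.0, 0.9)]},
    {"time": 4.0, "event": 0, "visits": [(2.5, 3.0)]},
]
rows = landmark_dataset(subjects, landmark_time=2.0)
```

In this toy input only the first subject survives the filtering: the second had the event before the landmark, and the third has no measurement by then. The first subject's later measurement at time 3.0 is also ignored, which is the kind of discarded information the abstract refers to.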
Affiliation(s)
- Annabel Louisa Davies
  - Department of Physics and Astronomy, University of Manchester, UK
  - Department of Population Health Sciences, Bristol Medical School, University of Bristol, UK
- Anthony CC Coolen
  - Department of Biophysics, Radboud University, the Netherlands
  - Saddle Point Science Ltd, UK
- Tobias Galla
  - Department of Physics and Astronomy, University of Manchester, UK
  - Instituto de Física Interdisciplinar y Sistemas Complejos, IFISC (CSIC-UIB), Campus Universitat Illes Balears, Palma de Mallorca, Spain

3. Abbasi AF, Asim MN, Ahmed S, Vollmer S, Dengel A. Survival prediction landscape: an in-depth systematic literature review on activities, methods, tools, diseases, and databases. Front Artif Intell 2024; 7:1428501. [PMID: 39021434 PMCID: PMC11252047 DOI: 10.3389/frai.2024.1428501] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/07/2024] [Accepted: 06/12/2024] [Indexed: 07/20/2024]
Abstract
Survival prediction integrates patient-specific molecular information and clinical signatures to forecast the anticipated time of an event, such as recurrence, death, or disease progression. It is valuable for guiding treatment decisions, optimizing resource allocation, and informing precision-medicine interventions. The wide range of diseases, the existence of various variants within the same disease, and the reliance on available data necessitate disease-specific computational survival predictors. The widespread adoption of artificial intelligence (AI) methods in crafting survival predictors has undoubtedly revolutionized this field. However, the ever-increasing demand for more sophisticated and effective prediction models necessitates continued innovation. To catalyze these advancements, it is crucial to bring existing knowledge and insights about survival predictors into a centralized platform. The paper at hand thoroughly examines 23 existing review studies and provides a concise overview of their scope and limitations. Focusing on a comprehensive set of the 90 most recent survival predictors across 44 diverse diseases, it examines the types of methods used in the development of disease-specific predictors. This analysis encompasses the utilized data modalities along with a detailed analysis of subsets of clinical features, feature engineering methods, and the specific statistical, machine learning, or deep learning approaches that have been employed. It also provides insights about survival prediction data sources, open-source predictors, and survival prediction frameworks.
Affiliation(s)
- Ahtisham Fazeel Abbasi
  - Department of Computer Science, Rhineland-Palatinate Technical University of Kaiserslautern-Landau, Kaiserslautern, Germany
  - Smart Data & Knowledge Services, Deutsches Forschungszentrum für Künstliche Intelligenz (DFKI), Kaiserslautern, Germany
- Muhammad Nabeel Asim
  - Smart Data & Knowledge Services, Deutsches Forschungszentrum für Künstliche Intelligenz (DFKI), Kaiserslautern, Germany
- Sheraz Ahmed
  - Smart Data & Knowledge Services, Deutsches Forschungszentrum für Künstliche Intelligenz (DFKI), Kaiserslautern, Germany
- Sebastian Vollmer
  - Department of Computer Science, Rhineland-Palatinate Technical University of Kaiserslautern-Landau, Kaiserslautern, Germany
  - Smart Data & Knowledge Services, Deutsches Forschungszentrum für Künstliche Intelligenz (DFKI), Kaiserslautern, Germany
- Andreas Dengel
  - Department of Computer Science, Rhineland-Palatinate Technical University of Kaiserslautern-Landau, Kaiserslautern, Germany
  - Smart Data & Knowledge Services, Deutsches Forschungszentrum für Künstliche Intelligenz (DFKI), Kaiserslautern, Germany

4. Lin CH, Liu ZY, Chen JS, Fann YC, Wen MS, Kuo CF. ECG-surv: A deep learning-based model to predict time to 1-year mortality from 12-lead electrocardiogram. Biomed J 2024:100732. [PMID: 38697480 DOI: 10.1016/j.bj.2024.100732] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/25/2023] [Revised: 03/12/2024] [Accepted: 04/18/2024] [Indexed: 05/05/2024]
Abstract
BACKGROUND: Electrocardiogram (ECG) abnormalities have demonstrated potential as prognostic indicators of patient survival. However, the traditional statistical approach is constrained by structured data input, limiting its ability to fully leverage the predictive value of ECG data in prognostic modeling. METHODS: This study aims to introduce and evaluate a deep-learning model to simultaneously handle censored data and unstructured ECG data for survival analysis. We herein introduce a novel deep neural network called ECG-surv, which includes a feature extraction neural network and a time-to-event analysis neural network. The proposed model is specifically designed to predict the time to 1-year mortality by extracting and analyzing unique features from 12-lead ECG data. ECG-surv was evaluated using both an independent test set and an external set, which were collected using different ECG devices. RESULTS: The performance of ECG-surv surpassed that of the Cox proportional hazards model, which included demographics and ECG waveform parameters, in predicting 1-year all-cause mortality, with a significantly higher concordance index (C-index) in ECG-surv than in the Cox model using both the independent test set (0.860 [95% CI: 0.859-0.861] vs. 0.796 [95% CI: 0.791-0.800]) and the external test set (0.813 [95% CI: 0.807-0.814] vs. 0.764 [95% CI: 0.755-0.770]). ECG-surv also demonstrated exceptional predictive ability for cardiovascular death (C-index of 0.891 [95% CI: 0.890-0.893]), outperforming the Framingham risk Cox model (C-index of 0.734 [95% CI: 0.715-0.752]). CONCLUSION: ECG-surv effectively utilized unstructured ECG data in a survival analysis. It outperformed traditional statistical approaches in predicting 1-year all-cause mortality and cardiovascular death, which makes it a valuable tool for predicting patient survival.
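Since the comparison above rests on the concordance index, a minimal sketch of how a C-index can be computed for right-censored data may help. This follows the usual Harrell-style pairwise definition under two simplifying assumptions: pairs with tied follow-up times are skipped, and tied risk scores count as half concordant. It is an illustration only and not the evaluation code used in the study; the function name `concordance_index` is hypothetical.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell-style C-index for right-censored data.

    times: observed follow-up times
    events: 1 if the event (e.g. death) was observed, 0 if censored
    risk_scores: higher score means higher predicted risk (shorter survival)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that subject a has the shorter follow-up time.
        a, b = (i, j) if times[i] <= times[j] else (j, i)
        # A pair is comparable only if the two times differ and the
        # earlier time is an observed event (simplification: ties skipped).
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


times = [5, 3, 9, 7]
events = [1, 1, 0, 1]
good_scores = [0.6, 0.9, 0.1, 0.4]   # risk decreases as survival time grows
bad_scores = [0.4, 0.1, 0.9, 0.6]    # exactly the reverse ordering
c_good = concordance_index(times, events, good_scores)
c_bad = concordance_index(times, events, bad_scores)
```

A value of 0.5 corresponds to random ranking and 1.0 to a perfect ranking of comparable pairs, which is the scale on which the reported values of 0.860 and 0.796 should be read.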
Affiliation(s)
- Ching-Heng Lin
  - Center for Artificial Intelligence in Medicine, Chang Gung Memorial Hospital, Taoyuan, Taiwan; Bachelor Program in Artificial Intelligence, Chang Gung University, Taoyuan, Taiwan
- Zhi-Yong Liu
  - Center for Artificial Intelligence in Medicine, Chang Gung Memorial Hospital, Taoyuan, Taiwan
- Jung-Sheng Chen
  - Center for Artificial Intelligence in Medicine, Chang Gung Memorial Hospital, Taoyuan, Taiwan
- Yang C Fann
  - Division of Intramural Research, National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, Maryland, United States
- Ming-Shien Wen
  - Division of Cardiology, Chang Gung Memorial Hospital, Taoyuan, Taiwan; School of Medicine, Chang Gung University, Taoyuan, Taiwan
- Chang-Fu Kuo
  - Center for Artificial Intelligence in Medicine, Chang Gung Memorial Hospital, Taoyuan, Taiwan; School of Medicine, Chang Gung University, Taoyuan, Taiwan; Division of Rheumatology, Allergy and Immunology, Chang Gung Memorial Hospital, Taoyuan, Taiwan

5. Lotspeich SC, Ashner MC, Vazquez JE, Richardson BD, Grosser KF, Bodek BE, Garcia TP. Making Sense of Censored Covariates: Statistical Methods for Studies of Huntington's Disease. Annual Review of Statistics and Its Application 2024; 11:255-277. [PMID: 38962579 PMCID: PMC11220439 DOI: 10.1146/annurev-statistics-040522-095944] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/05/2024]
Abstract
The landscape of survival analysis is constantly being revolutionized to answer biomedical challenges, most recently the statistical challenge of censored covariates rather than outcomes. There are many promising strategies to tackle censored covariates, including weighting, imputation, maximum likelihood, and Bayesian methods. Still, this is a relatively fresh area of research, different from the areas of censored outcomes (i.e., survival analysis) or missing covariates. In this review, we discuss the unique statistical challenges encountered when handling censored covariates and provide an in-depth review of existing methods designed to address those challenges. We emphasize each method's relative strengths and weaknesses, providing recommendations to help investigators pinpoint the best approach to handling censored covariates in their data.
Affiliation(s)
- Sarah C Lotspeich
  - Department of Statistical Sciences, Wake Forest University, Winston-Salem, North Carolina, USA
- Marissa C Ashner
  - Department of Biostatistics, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Jesus E Vazquez
  - Department of Biostatistics, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Brian D Richardson
  - Department of Biostatistics, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Kyle F Grosser
  - Department of Biostatistics, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Benjamin E Bodek
  - Department of Biostatistics, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Tanya P Garcia
  - Department of Biostatistics, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA

6. Lee M. Deep Learning Techniques with Genomic Data in Cancer Prognosis: A Comprehensive Review of the 2021-2023 Literature. Biology 2023; 12:893. [PMID: 37508326 PMCID: PMC10376033 DOI: 10.3390/biology12070893] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2023] [Revised: 06/16/2023] [Accepted: 06/20/2023] [Indexed: 07/30/2023]
Abstract
Deep learning has brought about a significant transformation in machine learning, leading to an array of novel methodologies and consequently broadening its influence. The application of deep learning in various sectors, especially biomedical data analysis, has initiated a period filled with noteworthy scientific developments. This trend has strongly influenced cancer prognosis, where the interpretation of genomic data for survival analysis has become a central research focus. The capacity of deep learning to decode intricate patterns embedded within high-dimensional genomic data has provoked a paradigm shift in our understanding of cancer survival. Given the swift progression in this field, there is an urgent need for a comprehensive review that focuses on the most influential studies from 2021 to 2023. This review, through its careful selection and thorough exploration of dominant trends and methodologies, strives to fulfill this need. The paper aims to enrich our existing understanding of the application of deep learning in cancer survival analysis, while also highlighting promising directions for future research in this vibrant and rapidly growing field.
Affiliation(s)
- Minhyeok Lee
  - School of Electrical and Electronics Engineering, Chung-Ang University, Seoul 06974, Republic of Korea

7. Geng T, Zheng M, Wang Y, Reseland JE, Samara A. An artificial intelligence prediction model based on extracellular matrix proteins for the prognostic prediction and immunotherapeutic evaluation of ovarian serous adenocarcinoma. Front Mol Biosci 2023; 10:1200354. [PMID: 37388244 PMCID: PMC10301747 DOI: 10.3389/fmolb.2023.1200354] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 04/04/2023] [Accepted: 05/31/2023] [Indexed: 07/01/2023]
Abstract
Background: Ovarian serous adenocarcinoma is a malignant tumor originating from epithelial cells and one of the most common causes of death from gynecological cancers. The objective of this study was to develop a prediction model based on extracellular matrix proteins, using artificial intelligence techniques. The model aimed to aid healthcare professionals in predicting the overall survival of patients with ovarian cancer (OC) and determining the efficacy of immunotherapy. Methods: The Cancer Genome Atlas Ovarian Cancer (TCGA-OV) data collection was used as the study dataset, whereas the TCGA-Pancancer dataset was used for validation. The prognostic importance of 1068 known extracellular matrix proteins for OC was determined by the Random Forest algorithm and the Lasso algorithm, establishing the ECM risk score. Based on the gene expression data, the differences in mRNA abundance, tumour mutation burden (TMB) and tumour microenvironment (TME) between the high- and low-risk groups were assessed. Results: Combining multiple artificial intelligence algorithms, we were able to identify 15 key extracellular matrix genes, namely AMBN, CXCL11, PI3, CSPG5, TGFBI, TLL1, HMCN2, ESM1, IL12A, MMP17, CLEC5A, FREM2, ANGPTL4, PRSS1, and FGF23, and confirm the validity of this ECM risk score for overall survival prediction. Several other parameters were identified as independent prognostic factors for OC by multivariate Cox analysis. The analysis showed that thyroglobulin (TG) targeted immunotherapy was more effective in the high ECM risk score group, while the low ECM risk score group was more sensitive to the RYR2 gene-related immunotherapy. Additionally, the patients with low ECM risk scores had higher immune checkpoint gene expression and immunophenoscore levels and responded better to immunotherapy. Conclusion: The ECM risk score is an accurate tool to assess the patient's sensitivity to immunotherapy and forecast OC prognosis.
Affiliation(s)
- Tianxiang Geng
  - Department of Biomaterials, FUTURE, Center for Functional Tissue Reconstruction, Faculty of Dentistry, University of Oslo, Oslo, Norway
- Mengxue Zheng
  - Laboratory of Reproductive Biology, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Yongfeng Wang
  - Department of Obstetrics and Gynecology, Seventh People’s Hospital of Shanghai University of Traditional Chinese Medicine, Shanghai, China
- Janne Elin Reseland
  - Department of Biomaterials, FUTURE, Center for Functional Tissue Reconstruction, Faculty of Dentistry, University of Oslo, Oslo, Norway
- Athina Samara
  - Department of Biomaterials, FUTURE, Center for Functional Tissue Reconstruction, Faculty of Dentistry, University of Oslo, Oslo, Norway

8. Wu X, Shi Y, Wang M, Li A. CAMR: cross-aligned multimodal representation learning for cancer survival prediction. Bioinformatics 2023; 39:btad025. [PMID: 36637188 PMCID: PMC9857974 DOI: 10.1093/bioinformatics/btad025] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2022] [Revised: 12/10/2022] [Accepted: 01/12/2023] [Indexed: 01/14/2023]
Abstract
MOTIVATION: Accurately predicting cancer survival is crucial for helping clinicians plan appropriate treatments, which can substantially improve patients' quality of life and reduce the related medical costs. Recent advances in survival prediction methods suggest that integrating complementary information from different modalities, e.g. histopathological images and genomic data, plays a key role in enhancing predictive performance. Despite promising results obtained by existing multimodal methods, the disparate and heterogeneous characteristics of multimodal data cause the so-called modality gap problem, which leads to dramatically different modality representations in feature space. Consequently, detrimental modality gaps make comprehensive integration of multimodal information via representation learning difficult and therefore pose a great challenge to further improvements in cancer survival prediction. RESULTS: To solve the above problems, we propose a novel method called cross-aligned multimodal representation learning (CAMR), which generates both modality-invariant and modality-specific representations for more accurate cancer survival prediction. Specifically, a cross-modality representation alignment learning network is introduced to reduce modality gaps by effectively learning modality-invariant representations in a common subspace, which is achieved by aligning the distributions of different modality representations through adversarial training. In addition, we adopt a cross-modality fusion module to fuse modality-invariant representations into a unified cross-modality representation for each patient. Meanwhile, CAMR learns modality-specific representations that complement the modality-invariant representations and therefore provides a holistic view of the multimodal data for cancer survival prediction. Comprehensive experimental results demonstrate that CAMR can successfully narrow modality gaps and consistently yields better performance than other survival prediction methods using multimodal data. AVAILABILITY AND IMPLEMENTATION: CAMR is freely available at https://github.com/wxq-ustc/CAMR. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Xingqi Wu
  - School of Information Science and Technology, University of Science and Technology of China, Hefei AH230027, China
- Yi Shi
  - School of Information Science and Technology, University of Science and Technology of China, Hefei AH230027, China
- Minghui Wang
  - School of Information Science and Technology, University of Science and Technology of China, Hefei AH230027, China
- Ao Li
  - School of Information Science and Technology, University of Science and Technology of China, Hefei AH230027, China

9. Krzyziński M, Spytek M, Baniecki H, Biecek P. SurvSHAP(t): Time-dependent explanations of machine learning survival models. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2022.110234] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/01/2023]

10. Choi SR, Lee M. Estimating the Prognosis of Low-Grade Glioma with Gene Attention Using Multi-Omics and Multi-Modal Schemes. Biology 2022; 11:1462. [PMID: 36290366 PMCID: PMC9598836 DOI: 10.3390/biology11101462] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 08/31/2022] [Revised: 10/01/2022] [Accepted: 10/02/2022] [Indexed: 11/20/2022]
Abstract
The prognosis estimation of low-grade glioma (LGG) patients with deep learning models using gene expression data has been extensively studied in recent years. However, the deep learning models used in these studies do not utilize the latest deep learning techniques, such as residual learning and ensemble learning. To address this limitation, in this study, a deep learning model using multi-omics and multi-modal schemes, namely the Multi-Prognosis Estimation Network (Multi-PEN), is proposed. When using Multi-PEN, gene attention layers are employed for each datatype, including mRNA and miRNA, thereby allowing us to identify prognostic genes. Additionally, recent developments in deep learning, such as residual learning and layer normalization, are utilized. As a result, Multi-PEN demonstrates competitive performance compared to conventional models for prognosis estimation. Furthermore, the most significant prognostic mRNA and miRNA were identified using the attention layers in Multi-PEN. For instance, MYBL1 was identified as the most significant prognostic mRNA. Such a result accords with the findings in existing studies that have demonstrated that MYBL1 regulates cell survival, proliferation, and differentiation. Additionally, hsa-mir-421 was identified as the most significant prognostic miRNA, and it has been extensively reported that hsa-mir-421 is highly associated with various cancers. These results indicate that the estimations of Multi-PEN are valid and reliable and showcase Multi-PEN's capacity to present hypotheses regarding prognostic mRNAs and miRNAs.

11. Yin Q, Chen W, Zhang C, Wei Z. A convolutional neural network model for survival prediction based on prognosis-related cascaded Wx feature selection. Lab Invest 2022; 102:1064-1074. [PMID: 35810236 DOI: 10.1038/s41374-022-00801-y] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Received: 02/17/2022] [Revised: 04/22/2022] [Accepted: 04/26/2022] [Indexed: 12/14/2022]
Abstract
Great advances in deep learning have provided effective solutions for prediction tasks in the biomedical field. However, accurate prognosis prediction using cancer genomics data remains challenging due to the severe overfitting problem caused by the curse of dimensionality inherent to high-throughput sequencing data. Moreover, survival analysis poses unique challenges, arising from the difficulty of utilizing censored samples whose events of interest are not observed. Convolutional neural network (CNN) models provide the opportunity to extract meaningful hierarchical features to characterize cancer subtype and prognosis outcomes. On the other hand, feature selection can mitigate overfitting and reduce the subsequent model training computation burden by screening out significant genes from redundant genes. To accomplish model simplification, we developed a concise and efficient survival analysis model, named the CNN-Cox model, which combines a special CNN framework with the prognosis-related feature selection method cascaded Wx, with the advantage of lower computational demand owing to its small number of training parameters. Experimental results show that the CNN-Cox model achieved consistently higher C-index values and better survival prediction performance across seven cancer type datasets in The Cancer Genome Atlas cohort, including bladder carcinoma, head and neck squamous cell carcinoma, kidney renal cell carcinoma, brain low-grade glioma, lung adenocarcinoma (LUAD), lung squamous cell carcinoma, and skin cutaneous melanoma, compared with the existing state-of-the-art survival analysis methods. As an illustration of model interpretation, we examined potential prognostic gene signatures of the LUAD dataset using the proposed CNN-Cox model. We conducted protein-protein interaction network analysis to identify potential prognostic genes and further analyzed the biological function of 13 hub genes, including ANLN, RACGAP1, KIF4A, KIF20A, KIF14, ASPM, CDK1, SPC25, NCAPG, MKI67, HJURP, EXO1, and HMMR, whose high expression is significantly associated with poor survival of LUAD patients. These findings confirmed that the CNN-Cox model is effective in extracting not only prognosis factors but also biologically meaningful gene features. The code is available on GitHub: https://github.com/wangwangCCChen/CNN-Cox.
Affiliation(s)
- Qingyan Yin
  - School of Science, Xi'an University of Architecture and Technology, Xi'an, Shaanxi, 710055, China
- Wangwang Chen
  - School of Science, Xi'an University of Architecture and Technology, Xi'an, Shaanxi, 710055, China
- Chunxia Zhang
  - School of Mathematics and Statistics, Xi'an Jiaotong University, Xi'an, Shaanxi, 710049, China
- Zhi Wei
  - Department of Computer Science, New Jersey Institute of Technology, Newark, NJ, 07102, USA

12. Dey R, Zhou W, Kiiskinen T, Havulinna A, Elliott A, Karjalainen J, Kurki M, Qin A, Lee S, Palotie A, Neale B, Daly M, Lin X. Efficient and accurate frailty model approach for genome-wide survival association analysis in large-scale biobanks. Nat Commun 2022; 13:5437. [PMID: 36114182 PMCID: PMC9481565 DOI: 10.1038/s41467-022-32885-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/07/2021] [Accepted: 08/22/2022] [Indexed: 01/11/2023]
Abstract
With decades of electronic health records linked to genetic data, large biobanks provide unprecedented opportunities for systematically understanding the genetics of the natural history of complex diseases. Genome-wide survival association analysis can identify genetic variants associated with ages of onset, disease progression and lifespan. We propose an efficient and accurate frailty model approach for genome-wide survival association analysis of censored time-to-event (TTE) phenotypes by accounting for both population structure and relatedness. Our method utilizes state-of-the-art optimization strategies to reduce the computational cost. The saddlepoint approximation is used to allow for analysis of heavily censored phenotypes (>90%) and low frequency variants (down to minor allele count 20). We demonstrate the performance of our method through extensive simulation studies and analysis of five TTE phenotypes, including lifespan, with heavy censoring rates (90.9% to 99.8%) on ~400,000 UK Biobank participants with white British ancestry and ~180,000 individuals in FinnGen. We further analyzed 871 TTE phenotypes in the UK Biobank and presented the genome-wide scale phenome-wide association results with the PheWeb browser.
Collapse
Affiliation(s)
- Rounak Dey
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, 02115, USA
- Wei Zhou
  - Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
  - Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Stanley Center for Psychiatric Research, Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Institute for Molecular Medicine Finland, Helsinki Institute of Life Sciences, University of Helsinki, Helsinki, Finland
- Tuomo Kiiskinen
  - Institute for Molecular Medicine Finland, Helsinki Institute of Life Sciences, University of Helsinki, Helsinki, Finland
  - Finnish Institute for Health and Welfare, Helsinki, Finland
- Aki Havulinna
  - Institute for Molecular Medicine Finland, Helsinki Institute of Life Sciences, University of Helsinki, Helsinki, Finland
  - Finnish Institute for Health and Welfare, Helsinki, Finland
- Amanda Elliott
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, 02115, USA
  - Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
  - Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Juha Karjalainen
  - Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
  - Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Stanley Center for Psychiatric Research, Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Institute for Molecular Medicine Finland, Helsinki Institute of Life Sciences, University of Helsinki, Helsinki, Finland
- Mitja Kurki
  - Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
  - Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Stanley Center for Psychiatric Research, Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Institute for Molecular Medicine Finland, Helsinki Institute of Life Sciences, University of Helsinki, Helsinki, Finland
- Ashley Qin
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, 02115, USA
- Seunggeun Lee
  - Graduate School of Data Science, Seoul National University, Seoul, Korea
- Aarno Palotie
  - Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
  - Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Stanley Center for Psychiatric Research, Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Institute for Molecular Medicine Finland, Helsinki Institute of Life Sciences, University of Helsinki, Helsinki, Finland
- Benjamin Neale
  - Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
  - Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Stanley Center for Psychiatric Research, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Mark Daly
  - Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
  - Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Stanley Center for Psychiatric Research, Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Institute for Molecular Medicine Finland, Helsinki Institute of Life Sciences, University of Helsinki, Helsinki, Finland
- Xihong Lin
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, 02115, USA
  - Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Department of Statistics, Harvard University, Cambridge, MA, USA
|
13
|
Zhang Y, Wong G, Mann G, Muller S, Yang JYH. SurvBenchmark: comprehensive benchmarking study of survival analysis methods using both omics data and clinical data. Gigascience 2022; 11:6652188. [PMID: 35906887 PMCID: PMC9338425 DOI: 10.1093/gigascience/giac071] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/18/2022] [Revised: 05/16/2022] [Accepted: 06/22/2022] [Indexed: 11/24/2022]
Abstract
Survival analysis is a branch of statistics that deals with both the tracking of time and the survival status simultaneously as the dependent response. Current comparisons of survival model performance mostly center on clinical data with classic statistical survival models, with prediction accuracy often serving as the sole metric of model performance. Moreover, survival analysis approaches for censored omics data have not been thoroughly investigated. The common approach is to binarize the survival time and perform a classification analysis. Here, we develop a benchmarking design, SurvBenchmark, that evaluates a diverse collection of survival models for both clinical and omics data sets. SurvBenchmark not only focuses on classical approaches such as the Cox model but also evaluates state-of-the-art machine learning survival models. All approaches were assessed using multiple performance metrics; these include model predictability, stability, flexibility, and computational issues. Our systematic comparison design with 320 comparisons (20 methods over 16 data sets) shows that the performances of survival models vary in practice over real-world data sets and over the choice of the evaluation metric. In particular, we highlight that using multiple performance metrics is critical in providing a balanced assessment of various models. The results in our study will provide practical guidelines for translational scientists and clinicians, as well as define possible areas of investigation in both survival technique and benchmarking strategies.
Affiliation(s)
- Yunwei Zhang
  - School of Mathematics and Statistics, The University of Sydney, Sydney 2006, Australia
  - Charles Perkins Centre, The University of Sydney, Sydney 2006, Australia
- Germaine Wong
  - Sydney School of Public Health, The University of Sydney, NSW, Sydney 2006, Australia
  - Centre for Kidney Research, Kids Research Institute, The Children's Hospital at Westmead, NSW, 2145, Sydney, Australia
  - Centre for Transplant and Renal Research, Westmead Hospital, NSW, 2145, Sydney, Australia
- Graham Mann
  - John Curtin School of Medical Research, Australian National University, Canberra 2601, Australia
  - Melanoma Institute Australia, North Sydney, NSW 2065, Australia
- Samuel Muller
  - School of Mathematics and Statistics, The University of Sydney, Sydney 2006, Australia
  - Department of Mathematics and Statistics, Macquarie University, Sydney 2109, Australia
- Jean Y H Yang
  - School of Mathematics and Statistics, The University of Sydney, Sydney 2006, Australia
  - Charles Perkins Centre, The University of Sydney, Sydney 2006, Australia
  - Laboratory of Data Discovery for Health Limited (D24H), Science Park, Hong Kong SAR, China
|
14
|
Prediction of Bronchopneumonia Inpatients' Total Hospitalization Expenses Based on BP Neural Network and Support Vector Machine Models. COMPUTATIONAL AND MATHEMATICAL METHODS IN MEDICINE 2022; 2022:9275801. [PMID: 35633928 PMCID: PMC9132643 DOI: 10.1155/2022/9275801] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 03/08/2022] [Revised: 04/13/2022] [Accepted: 05/05/2022] [Indexed: 01/09/2023]
Abstract
Objective: A BP neural network (BPNN) model and a support vector machine (SVM) model were used to predict the total hospitalization expenses of patients with bronchopneumonia. Methods: Data from a total of 355 patients with bronchopneumonia from January 2018 to December 2020 were collected and collated. The data set was randomly divided into a training set (n = 249) and a test set (n = 106) in a 7:3 ratio. The BPNN model and SVM model were constructed to analyze the predictors of total hospitalization expenses. The effectiveness of the two prediction models was compared. Results: The top three influencing factors and their importance for predicting total hospitalization cost by the BPNN model were hospitalization days (0.477), age (0.154), and discharge department (0.083). The top three factors for the SVM model were hospitalization days (0.215), age (0.196), and marital status (0.172). The areas under the curve of the BPNN and SVM models were 0.838 (95% CI: 0.755-0.921) and 0.889 (95% CI: 0.819-0.959), respectively. Conclusion: Both the BPNN model and the SVM model can predict the total hospitalization expenses of patients with bronchopneumonia, but the SVM model predicts better than the BPNN model.
|
15
|
Lee M. An Ensemble Deep Learning Model with a Gene Attention Mechanism for Estimating the Prognosis of Low-Grade Glioma. BIOLOGY 2022; 11:586. [PMID: 35453785 PMCID: PMC9027395 DOI: 10.3390/biology11040586] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 02/28/2022] [Revised: 03/30/2022] [Accepted: 04/11/2022] [Indexed: 06/14/2023]
Abstract
While estimating the prognosis of low-grade glioma (LGG) is a crucial problem, it has not been extensively studied to introduce recent improvements in deep learning to address the problem. The attention mechanism is one of the significant advances; however, it is still unclear how attention mechanisms are used in gene expression data to estimate prognosis because they were designed for convolutional layers and word embeddings. This paper proposes an attention mechanism called gene attention for gene expression data. Additionally, a deep learning model for prognosis estimation of LGG is proposed using gene attention. The proposed Gene Attention Ensemble NETwork (GAENET) outperformed other conventional methods, including survival support vector machine and random survival forest. When evaluated by C-Index, the GAENET exhibited an improvement of 7.2% compared to the second-best model. In addition, taking advantage of the gene attention mechanism, HILS1 was discovered as the most significant prognostic gene in terms of deep learning training. While HILS1 is known as a pseudogene, HILS1 is a biomarker estimating the prognosis of LGG and has demonstrated a possibility of regulating the expression of other prognostic genes.
Affiliation(s)
- Minhyeok Lee
  - School of Electrical and Electronics Engineering, Chung-Ang University, Seoul 06974, Korea
|
16
|
A nine-gene signature identification and prognostic risk prediction for patients with lung adenocarcinoma using novel machine learning approach. Comput Biol Med 2022; 145:105493. [DOI: 10.1016/j.compbiomed.2022.105493] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/29/2022] [Revised: 03/31/2022] [Accepted: 04/02/2022] [Indexed: 02/06/2023]
|
17
|
Novel application of survival models for predicting microbial community transitions with variable selection for eDNA. Appl Environ Microbiol 2022; 88:e0214621. [PMID: 35138931 DOI: 10.1128/aem.02146-21] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/20/2022]
Abstract
Survival analysis is a prolific statistical tool in medicine for inferring risk and time to disease-related events. However, it is under-utilized in microbiome research to predict microbial community mediated events, partly due to the sparsity and high dimensional nature of the data. We advance the application of Cox proportional hazards (Cox PH) survival models to environmental DNA (eDNA) data with feature selection suitable for filtering irrelevant and redundant taxonomic variables. Selection methods are compared in terms of false positives, sensitivity, and survival estimation accuracy in simulation and in a real data setting to forecast harmful cyanobacterial blooms. A novel extension of a method for selecting microbial biomarkers with survival data (SuRFCox) reliably outperforms other methods. We determine Cox PH models with SuRFCox selected predictors are more robust to varied signal, noise, and data correlation structure. SuRFCox also yields the most accurate and consistent prediction of blooms according to cross-validated testing by year over eight different bloom seasons. Identification of common biomarkers among validated survival forecasts over changing conditions has clear biological significance. Survival models with such biomarkers inform risk assessment and provide insight into the causes of critical community transitions. Importance In this paper, we report on a novel approach of selecting microorganisms for model-based prediction of the time to critical microbially-modulated events (e.g., harmful algal blooms, clinical outcomes, community shifts, etc.). Our novel method for identifying biomarkers from large, dynamic communities of microbes has broad utility to environmental and ecological impact risk assessment and public health. Results will also promote theoretical and practical advancements relevant to the biology of specific organisms. 
To address the unique challenge posed by diverse environmental conditions and sparse microbes, we developed a novel method of selecting predictors for modelling time-to-event data. Competing methods for selecting predictors are rigorously compared to determine which is the most accurate and generalizable. Model forecasts are applied to show suitable predictors can precisely quantify the risk over time of biological events like harmful cyanobacterial blooms.
|
18
|
Ning Z, Du D, Tu C, Feng Q, Zhang Y. Relation-Aware Shared Representation Learning for Cancer Prognosis Analysis With Auxiliary Clinical Variables and Incomplete Multi-Modality Data. IEEE TRANSACTIONS ON MEDICAL IMAGING 2022; 41:186-198. [PMID: 34460368 DOI: 10.1109/tmi.2021.3108802] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Indexed: 06/13/2023]
Abstract
The integrative analysis of complementary phenotype information contained in multi-modality data (e.g., histopathological images and genomic data) has advanced the prognostic evaluation of cancers. However, multi-modality based prognosis analysis confronts two challenges: (1) how to explore underlying relations inherent in different modalities data for learning compact and discriminative multi-modality representations; (2) how to take full consideration of incomplete multi-modality data for constructing accurate and robust prognostic model, since a host of complete multi-modality data are not always available. Additionally, many existing multi-modality based prognostic methods commonly ignore relevant clinical variables (e.g., grade and stage), which, however, may provide supplemental information to promote the performance of model. In this paper, we propose a relation-aware shared representation learning method for prognosis analysis of cancers, which makes full use of clinical information and incomplete multi-modality data. The proposed method learns multi-modal shared space tailored for prognostic model via a dual mapping. Within the shared space, it equips with relational regularizers to explore the potential relations (i.e., feature-label and feature-feature relations) among multi-modality data for inducing discriminatory representations and simultaneously obtaining extra sparsity for alleviating overfitting. Moreover, it regresses and incorporates multiple auxiliary clinical attributes with dynamic coefficients to meliorate performance. Furthermore, in training stage, a partial mapping strategy is employed to extend and train a more reliable model with incomplete multi-modality data. We have evaluated our method on three public datasets derived from The Cancer Genome Atlas (TCGA) project, and the experimental results demonstrate the superior performance of the proposed method.
|
19
|
Machine learning-based prediction of 1-year mortality for acute coronary syndrome. J Cardiol 2021; 79:342-351. [PMID: 34857429 DOI: 10.1016/j.jjcc.2021.11.006] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 06/21/2021] [Revised: 09/20/2021] [Accepted: 10/13/2021] [Indexed: 12/23/2022]
Abstract
BACKGROUND Clinical risk assessment with quantitative formal risk scores may add to intuitive physician risk assessment and are advised by the international guidelines for the management of acute coronary syndrome (ACS) patients. Most previous studies have used the binary regression/classification approach (dead/alive) for long-term mortality post-ACS, without considering the time-to-event as in survival analysis. The use of machine learning (ML)-based survival models has yet to be validated. The primary objective was to compare survival prediction performance of 1-year mortality following ACS of two newly developed ML-based models [random survival forest (RSF) and deep learning (DeepSurv)] with the traditional Cox-proportional hazard (CPH) model. The secondary objective was external validation of the findings. METHODS This was a retrospective, supervised learning data mining study based on the Acute Coronary Syndrome Israeli Survey (ACSIS) and the Myocardial Ischemia National Audit Project (MINAP). The ACSIS data were divided to train/test in a 70/30 fashion. Next, the models were externally validated on the MINAP data. Harrell's C-index, inverse probability of censoring weighting (IPCW), and the Brier-score were used for models' performance comparison. RESULTS RSF performed best among the three models, with Harrell's C-index on training and testing sets reaching 0.953 and 0.924 respectively, followed by CPH multivariate selected model (0.805/0.849), CPH Univariate selected model (0.828/0.806), DeepSurv model (0.801/0.804), and the traditional CPH model (0.826/0.738). The RSF model also had the highest performance on the validation data set with 0.811 for Harrell's C-index, 0.844 for IPCW, and 0.093 for Brier score. The CPH model performance on the validation set had C-index range between 0.689 to 0.790, 0.713 to 0.826 for IPCW, and 0.094 to 0.103 Brier score. 
CONCLUSIONS RSF survival predictions for long-term mortality post-ACS show improved model performance compared with the classic statistical method. This may benefit patients by allowing better risk stratification and tailored therapy, however further prospective evaluations are required.
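Several abstracts in this list compare models by Harrell's C-index. As an illustration only, the following is a minimal pure-Python sketch of how that statistic is commonly computed for right-censored data. The function name `harrell_c_index` and its arguments are hypothetical and not taken from any of the cited studies. The sketch assumes that a higher risk score means shorter expected survival, and it simply skips pairs with tied observed times, which is a simplification.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Illustrative Harrell's C-index for right-censored survival data.

    times       -- observed follow-up times
    events      -- 1 if the event was observed, 0 if the subject was censored
    risk_scores -- model output; higher is assumed to mean higher risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that subject `a` has the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only when the two times differ and the
        # shorter time ends in an observed event (not censoring).
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # model ranks the earlier event as riskier
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied scores count as half-concordant
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (made up for illustration): the second subject is censored.
toy_times = [5, 8, 12, 3]
toy_events = [1, 0, 1, 1]
toy_scores = [0.9, 0.4, 0.2, 0.95]
toy_c = harrell_c_index(toy_times, toy_events, toy_scores)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which figures such as 0.924 or 0.811 in the abstract above are read.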
|
20
|
Bottino F, Tagliente E, Pasquini L, Napoli AD, Lucignani M, Figà-Talamanca L, Napolitano A. COVID Mortality Prediction with Machine Learning Methods: A Systematic Review and Critical Appraisal. J Pers Med 2021; 11:893. [PMID: 34575670 PMCID: PMC8467935 DOI: 10.3390/jpm11090893] [Citation(s) in RCA: 34] [Impact Index Per Article: 11.3] [Received: 07/24/2021] [Revised: 08/26/2021] [Accepted: 09/03/2021] [Indexed: 12/21/2022]
Abstract
More than a year has passed since the report of the first case of coronavirus disease 2019 (COVID), and increasing deaths continue to occur. Minimizing the time required for resource allocation and clinical decision making, such as triage, choice of ventilation modes and admission to the intensive care unit is important. Machine learning techniques are acquiring an increasingly sought-after role in predicting the outcome of COVID patients. Particularly, the use of baseline machine learning techniques is rapidly developing in COVID mortality prediction, since a mortality prediction model could rapidly and effectively help clinical decision-making for COVID patients at imminent risk of death. Recent studies reviewed predictive models for SARS-CoV-2 diagnosis, severity, length of hospital stay, intensive care unit admission or mechanical ventilation modes outcomes; however, systematic reviews focused on prediction of COVID mortality outcome with machine learning methods are lacking in the literature. The present review looked into the studies that implemented machine learning, including deep learning, methods in COVID mortality prediction thus trying to present the existing published literature and to provide possible explanations of the best results that the studies obtained. The study also discussed challenging aspects of current studies, providing suggestions for future developments.
Affiliation(s)
- Francesca Bottino
  - Medical Physics Department Bambino Gesù Children’s Hospital, Scientific Institute for Research, Hospitalization and Healthcare (IRCCS), 00165 Rome, Italy
- Emanuela Tagliente
  - Medical Physics Department Bambino Gesù Children’s Hospital, Scientific Institute for Research, Hospitalization and Healthcare (IRCCS), 00165 Rome, Italy
- Luca Pasquini
  - Neuroradiology Unit, NESMOS Department, Sant’Andrea Hospital, La Sapienza University, 00165 Rome, Italy
  - Neuroradiology Service, Radiology Department, Memorial Sloan Kettering Cancer Center, New York, NY 1275, USA
- Alberto Di Napoli
  - Neuroradiology Unit, NESMOS Department, Sant’Andrea Hospital, La Sapienza University, 00165 Rome, Italy
  - Radiology Department, Castelli Romani Hospital, 00040 Ariccia (RM), Italy
- Martina Lucignani
  - Medical Physics Department Bambino Gesù Children’s Hospital, Scientific Institute for Research, Hospitalization and Healthcare (IRCCS), 00165 Rome, Italy
- Lorenzo Figà-Talamanca
  - Neuroradiology Unit, Imaging Department, Bambino Gesù Children’s Hospital, Scientific Institute for Research, Hospitalization and Healthcare (IRCCS), 00165 Rome, Italy
- Antonio Napolitano
  - Medical Physics Department Bambino Gesù Children’s Hospital, Scientific Institute for Research, Hospitalization and Healthcare (IRCCS), 00165 Rome, Italy
|
21
|
Hussein AA, Sabry NA, Abdalla MS, Farid SF. A prospective, randomised clinical study comparing triple therapy regimen to hydrocortisone monotherapy in reducing mortality in septic shock patients. Int J Clin Pract 2021; 75:e14376. [PMID: 34003568 DOI: 10.1111/ijcp.14376] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 02/20/2021] [Accepted: 05/09/2021] [Indexed: 01/03/2023]
Abstract
OBJECTIVES This prospective, comparative and randomised clinical study evaluated the effectiveness of triple therapy regimen (hydrocortisone, thiamine and vitamin C) versus hydrocortisone alone in reducing the mortality rate and preventing progressive organ dysfunction in septic shock patients. METHODS A total of 94 patients were randomly assigned to one of two groups: the first group received hydrocortisone 50 mg/6-h IV for 7 days or till intensive care unit (ICU) discharge, if sooner, followed by tapering. The second group received hydrocortisone 50 mg/6-h IV for 7 days or ICU discharge followed by tapering, vitamin C 1.5 g/6-h IV for 4 days or till ICU discharge and thiamine 200 mg/12-h IV for 4 days or till ICU discharge. RESULTS The triple therapy regimen showed a non-significant reduction in 28-day mortality compared to hydrocortisone alone (17 [36.2%] vs. 21 [44.7%]; P = .4005), but it was significantly lower than the control group regarding shock time and the duration of vasopressor use in days (4.000 [3.000-7.000]; 5.000 [4.000-8.000], [P = .0100]). The patients in the control group were likely to get 0.59 more in SCr level than those in the intervention group by a linear regression model which was significant (P < .05). Also, the number of patients who developed a fever after 216 hours was significantly higher in the control group (P value = .0299). CONCLUSION Vitamin C, thiamine, and hydrocortisone regimen for septic shock management showed non-significant efficacy in decreasing 28-day mortality when compared to hydrocortisone monotherapy. On the other hand, it showed significant efficacy in decreasing the shock time and duration on vasopressors.
Affiliation(s)
- Nirmeen A Sabry
  - Clinical pharmacy department, Faculty of Pharmacy, Cairo University, Cairo, Egypt
- Maged S Abdalla
  - Anaesthesia and Critical Care department, Faculty of Medicine (Kasr-el Ainy), Cairo University, Cairo, Egypt
- Samar F Farid
  - Department of Clinical Pharmacy, Faculty of Pharmacy, Cairo University, Cairo, Egypt
|
22
|
Wilson CM, Li K, Sun Q, Kuan PF, Wang X. Fenchel duality of Cox partial likelihood with an application in survival kernel learning. Artif Intell Med 2021; 116:102077. [PMID: 34020756 PMCID: PMC8159024 DOI: 10.1016/j.artmed.2021.102077] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/19/2020] [Revised: 04/14/2021] [Accepted: 04/19/2021] [Indexed: 11/30/2022]
Abstract
The Cox proportional hazard model is one of the most widely used methods in modeling time-to-event data in the health sciences. Due to the simplicity of the Cox partial likelihood function, many machine learning algorithms use it for survival data. However, due to the nature of censored data, the optimization problem becomes intractable when more complicated regularization is employed, which is necessary when dealing with high dimensional omic data. In this paper, we show that a convex conjugate function of the Cox loss function based on Fenchel duality exists, and provide an alternative framework to optimization based on the primal form. Furthermore, the dual form suggests an efficient algorithm for solving the kernel learning problem with censored survival outcomes. We illustrate performance and properties of the derived duality form of Cox partial likelihood loss in multiple kernel learning problems with simulated and the Skin Cutaneous Melanoma TCGA datasets.
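For readers unfamiliar with the loss this abstract refers to, the following is a minimal sketch of the standard (primal) negative log partial likelihood of the Cox model, assuming no tied event times. It is not the Fenchel dual form derived by the authors, and the function name `cox_neg_log_partial_likelihood` and its arguments are hypothetical, chosen for illustration.

```python
import math


def cox_neg_log_partial_likelihood(times, events, linear_predictors):
    """Illustrative Cox loss: -sum over events of [eta_i - log(sum_{j in R_i} exp(eta_j))].

    times             -- observed follow-up times
    events            -- 1 if the event was observed, 0 if censored
    linear_predictors -- eta_i = x_i . beta for each subject
    R_i is the risk set: all subjects still under observation at time t_i.
    """
    nll = 0.0
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if not d_i:
            continue  # censored subjects contribute only through risk sets
        risk_set = [eta for t, eta in zip(times, linear_predictors) if t >= t_i]
        # Log-sum-exp with the maximum subtracted for numerical stability.
        m = max(risk_set)
        log_sum = m + math.log(sum(math.exp(eta - m) for eta in risk_set))
        nll -= linear_predictors[i] - log_sum
    return nll


# Toy data (made up for illustration): with all predictors equal to zero,
# each event contributes log(size of its risk set).
toy_loss = cox_neg_log_partial_likelihood([1, 2, 3], [1, 1, 0], [0.0, 0.0, 0.0])
```

Censored subjects never add a term of their own but still appear in the risk sets of earlier events, which is how the loss accommodates censoring.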
Affiliation(s)
- Christopher M Wilson
  - Department of Biostatistics and Bioinformatics, H. Lee Moffitt Cancer Center & Research Institute, Tampa, FL 33612, USA
- Kaiqiao Li
  - Department of Applied Math & Statistics, Stony Brook University, Stony Brook, NY 11794, USA
- Qiang Sun
  - Department of Statistical Sciences, University of Toronto, Ontario M5S 3G3, Canada
- Pei Fen Kuan
  - Department of Applied Math & Statistics, Stony Brook University, Stony Brook, NY 11794, USA
- Xuefeng Wang
  - Department of Biostatistics and Bioinformatics, H. Lee Moffitt Cancer Center & Research Institute, Tampa, FL 33612, USA
|
23
|
Abstract
With the development of high-throughput technologies, more and more high-dimensional or ultra-high-dimensional genomic data are being generated. Therefore, effectively analyzing such data has become a significant challenge. Machine learning (ML) algorithms have been widely applied for modeling nonlinear and complicated interactions in a variety of practical fields such as high-dimensional survival data. Recently, multilayer deep neural network (DNN) models have made remarkable achievements. Thus, a Cox-based DNN prediction survival model (DNNSurv model), which was built with Keras and TensorFlow, was developed. However, its results were only evaluated on the survival datasets with high-dimensional or large sample sizes. In this paper, we evaluated the prediction performance of the DNNSurv model using ultra-high-dimensional and high-dimensional survival datasets and compared it with three popular ML survival prediction models (i.e., random survival forest and the Cox-based LASSO and Ridge models). For this purpose, we also present the optimal setting of several hyperparameters, including the selection of a tuning parameter. The proposed method demonstrated via data analysis that the DNNSurv model performed well overall as compared with the ML models, in terms of the three main evaluation measures (i.e., concordance index, time-dependent Brier score, and the time-dependent AUC) for survival prediction performance.
|
24
|
Dessie EY, Tsai JJP, Chang JG, Ng KL. A novel miRNA-based classification model of risks and stages for clear cell renal cell carcinoma patients. BMC Bioinformatics 2021; 22:270. [PMID: 34058987 PMCID: PMC8323484 DOI: 10.1186/s12859-021-04189-2] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 04/18/2021] [Accepted: 05/12/2021] [Indexed: 12/17/2022]
Abstract
Background: Clear cell renal cell carcinoma (ccRCC) is the most common subtype of renal carcinoma, and patients at an advanced stage show poor survival. Although microRNAs (miRNAs) are used as potential biomarkers in many cancers, few miRNA biomarkers for predicting the tumor stage of ccRCC have been identified. Therefore, we proposed a new integrated machine learning (ML) strategy to identify a novel miRNA signature related to tumor stage and prognosis of ccRCC patients using miRNA expression profiles. A multivariate Cox regression model with three hybrid penalties, including the least absolute shrinkage and selection operator (Lasso), adaptive Lasso, and elastic net algorithms, was used to screen relevant prognosis-related miRNAs. The best subset regression (BSR) model was used to identify the optimal prognostic model. Five ML algorithms were used to develop stage classification models. The biological significance of the miRNA signature was analyzed using DIANA-mirPath. Results: A four-miRNA signature associated with survival was identified, and the expression of this signature was strongly correlated with high-risk patients. The high-risk patients had unfavorable overall survival compared with the low-risk group (HR = 4.523, P-value = 2.86e−08). Univariate and multivariate analyses confirmed the independent and translational value of this predictive model. A combined ML algorithm identified six miRNA signatures for cancer staging prediction. After applying the data balancing algorithm SMOTE, the Support Vector Machine (SVM) algorithm achieved the best classification performance (accuracy = 0.923, sensitivity = 0.927, specificity = 0.919, MCC = 0.843) compared with other classifiers. Furthermore, enrichment analysis indicated that the identified miRNA signature is involved in cancer-associated pathways.
Conclusions: A novel miRNA classification model using the identified prognostic and tumor stage associated miRNA signature will be useful for risk and stage stratification in clinical practice, and the identified miRNA signature can provide promising insight into the progression mechanism of ccRCC. Supplementary Information: The online version contains supplementary material available at 10.1186/s12859-021-04189-2.
Affiliation(s)
- Eskezeia Y Dessie
  - Department of Bioinformatics and Medical Engineering, Asia University, Taichung, Taiwan
  - Center for Artificial Intelligence and Precision Medicine Research, Asia University, Taichung, Taiwan
- Jeffrey J P Tsai
  - Department of Bioinformatics and Medical Engineering, Asia University, Taichung, Taiwan
- Jan-Gowth Chang
  - Department of Laboratory Medicine, China Medical University, Taichung, Taiwan
- Ka-Lok Ng
  - Department of Bioinformatics and Medical Engineering, Asia University, Taichung, Taiwan
  - Department of Medical Research, China Medical University Hospital, China Medical University, Taichung, Taiwan
  - Center for Artificial Intelligence and Precision Medicine Research, Asia University, Taichung, Taiwan
|
25
|
Hallak JA. A Machine Learning Model With Survival Statistics to Identify Predictors of Descemet Stripping Automated Endothelial Keratoplasty Graft Failure. JAMA Ophthalmol 2021; 139:198-199. [PMID: 33355603 DOI: 10.1001/jamaophthalmol.2020.5741] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 12/17/2022]
Affiliation(s)
- Joelle A Hallak
  - Department of Ophthalmology and Visual Sciences, University of Illinois at Chicago, Chicago
|
26
|
Chen H, Li C, Zheng L, Lu W, Li Y, Wei Q. A machine learning-based survival prediction model of high grade glioma by integration of clinical and dose-volume histogram parameters. Cancer Med 2021; 10:2774-2786. [PMID: 33760360 PMCID: PMC8026951 DOI: 10.1002/cam4.3838] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 06/30/2020] [Revised: 12/02/2020] [Accepted: 02/23/2021] [Indexed: 01/03/2023]
Abstract
PURPOSE Glioma is the most common type of primary brain tumor in adults, and it causes significant morbidity and mortality, especially in high-grade glioma (HGG) patients. Accurate prognostic prediction of HGG is vital for clinicians when developing therapeutic strategies. Therefore, we propose a machine learning-based survival prediction model that analyzes clinical and dose-volume histogram (DVH) parameters to improve the performance of the risk model in HGG patients. METHODS Eight clinical variables and 39 DVH parameters were extracted for each patient who received radiotherapy for HGG with active follow-up. Ninety-five patients were randomly divided into training and testing cohorts, and we employed random survival forest (RSF), support vector machine (SVM), and Cox proportional hazards (CPH) models to predict survival. Calibration plots, concordance indexes, and decision curve analyses were used to evaluate the calibration, discrimination, and clinical utility of these three models. RESULTS The RSF model showed the best performance among the three models, with concordance indexes of 0.824 and 0.847 in the training and testing sets, respectively, followed by the SVM (0.792/0.823) and CPH (0.821/0.811) models. Specifically, in the RSF model, we identified age, gross tumor volume (GTV), grade, Karnofsky performance status (KPS), isocitrate dehydrogenase (IDH), and D99 as important variables associated with survival. The AUCs of the testing set were 92.4%, 87.7%, and 84.0% for 1-, 2-, and 3-year survival, respectively. According to this model, HGG patients can be divided into high- and low-risk groups. CONCLUSION The machine learning-based RSF model integrating both clinical and DVH variables is an improved and useful tool for predicting the survival of HGG patients.
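The models in this abstract are compared by concordance index. The following is a minimal sketch of Harrell's concordance index for right-censored data, assuming that a higher risk score should mean shorter survival and ignoring tied survival times for simplicity. The toy data and the function name `concordance_index` are illustrative assumptions, not taken from the study.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C: share of comparable pairs whose risk ordering is correct.

    A pair (i, j) is comparable when subject i has the shorter follow-up
    time and an observed event (events[i] == 1). Tied risk scores count 0.5.
    Pairs with tied times are skipped in this simplified version.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (illustrative only): months of follow-up, event flag, model risk.
times = [5, 8, 12, 20]
events = [1, 0, 1, 1]
risk_scores = [0.9, 0.4, 0.1, 0.6]
c_index = concordance_index(times, events, risk_scores)
print(c_index)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which is the scale on which figures such as 0.824 and 0.847 are read.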
Affiliation(s)
- Haiyan Chen
  - Department of Radiation Oncology, Key Laboratory of Cancer Prevention and Intervention, Ministry of Education, The Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China
  - Zhejiang University Cancer Center, Hangzhou, Zhejiang, China
- Chao Li
  - Department of Radiation Oncology, Key Laboratory of Cancer Prevention and Intervention, Ministry of Education, The Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China
- Lin Zheng
  - Department of Radiation Oncology, Key Laboratory of Cancer Prevention and Intervention, Ministry of Education, The Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China
  - Department of Radiation Oncology, Taizhou Tumor Hospital, Taizhou, Zhejiang, China
- Wei Lu
  - Zhejiang University Cancer Center, Hangzhou, Zhejiang, China
  - Department of Colorectal Surgery and Oncology, Key Laboratory of Cancer Prevention and Intervention, Ministry of Education, The Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China
- Yanlin Li
  - College of Science, Hangzhou Normal University, Hangzhou, Zhejiang, China
- Qichun Wei
  - Department of Radiation Oncology, Key Laboratory of Cancer Prevention and Intervention, Ministry of Education, The Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China
  - Zhejiang University Cancer Center, Hangzhou, Zhejiang, China
|
27
|
Elledge CR, LaVigne AW, Fiksel J, Wright JL, McNutt T, Kleinberg LR, Hu C, Smith TJ, Zeger S, DeWeese TL, Alcorn SR. External Validation of the Bone Metastases Ensemble Trees for Survival (BMETS) Machine Learning Model to Predict Survival in Patients With Symptomatic Bone Metastases. JCO Clin Cancer Inform 2021; 5:304-314. [PMID: 33760638 DOI: 10.1200/cci.20.00128] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 12/16/2022]
Abstract
PURPOSE The Bone Metastases Ensemble Trees for Survival (BMETS) model uses a machine learning algorithm to estimate survival time following consultation for palliative radiation therapy for symptomatic bone metastases (SBM). BMETS was developed at a tertiary-care, academic medical center, but its validity and stability when applied to external data sets are unknown. PATIENTS AND METHODS Patients treated with palliative radiation therapy for SBM from May 2013 to May 2016 at two hospital-based community radiation oncology clinics were included, and medical records were retrospectively reviewed to collect model covariates and survival time. The Kaplan-Meier method was used to estimate overall survival from consultation to death or last follow-up. Model discrimination was estimated using time-dependent area under the curve (tAUC), which was calculated using survival predictions from BMETS based on the initial training data set. RESULTS A total of 216 sites of SBM were treated in 182 patients. Most common histologies were breast (27%), lung (23%), and prostate (23%). Compared with the BMETS training set, the external validation population was older (mean age, 67 v 62 years; P < .001), had more primary breast (27% v 19%; P = .03) and prostate cancer (20% v 12%; P = .01), and survived longer (median, 10.7 v 6.4 months). When the BMETS model was applied to the external data set, tAUC values at 3, 6, and 12 months were 0.82, 0.77, and 0.77, respectively. When refit with data from the combined training and external validation sets, tAUC remained > 0.79. CONCLUSION BMETS maintained high discriminative ability when applied to an external validation set and when refit with new data, supporting its generalizability, stability, and the feasibility of dynamic modeling.
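The abstract states that overall survival was estimated with the Kaplan-Meier method and reports median survival. As a minimal sketch of the product-limit estimator, the snippet below computes the survival curve and the median from right-censored follow-up times. The data are toy values and the helper names (`kaplan_meier`, `median_survival`) are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct observed event time.

    times  : follow-up time per subject
    events : 1 if death was observed at that time, 0 if censored
    """
    curve = []
    survival = 1.0
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    for t in event_times:
        at_risk = sum(1 for u in times if u >= t)  # still under follow-up at t
        deaths = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        survival *= 1.0 - deaths / at_risk  # product-limit update
        curve.append((t, survival))
    return curve


def median_survival(curve):
    """First time at which the estimated survival drops to 0.5 or below."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # median not reached within follow-up


# Toy data (illustrative only): months from consultation, death observed flag.
follow_up_months = [2, 3, 3, 5, 8, 10]
death_observed = [1, 1, 0, 1, 0, 1]
curve = kaplan_meier(follow_up_months, death_observed)
median_months = median_survival(curve)
print(curve, median_months)
```

Censored subjects contribute to the at-risk count until their last follow-up but never to the death count, which is what distinguishes this estimate from a naive proportion of deaths.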
Affiliation(s)
- Christen R Elledge
  - Department of Radiation Oncology and Molecular Radiation Sciences, Johns Hopkins University School of Medicine, Baltimore, MD
- Anna W LaVigne
  - Department of Radiation Oncology and Molecular Radiation Sciences, Johns Hopkins University School of Medicine, Baltimore, MD
- Jacob Fiksel
  - Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD
- Jean L Wright
  - Department of Radiation Oncology and Molecular Radiation Sciences, Johns Hopkins University School of Medicine, Baltimore, MD
- Todd McNutt
  - Department of Radiation Oncology and Molecular Radiation Sciences, Johns Hopkins University School of Medicine, Baltimore, MD
- Lawrence R Kleinberg
  - Department of Radiation Oncology and Molecular Radiation Sciences, Johns Hopkins University School of Medicine, Baltimore, MD
- Chen Hu
  - Department of Oncology, Johns Hopkins School of Medicine, Baltimore, MD
- Thomas J Smith
  - Department of Oncology, Johns Hopkins School of Medicine, Baltimore, MD
- Scott Zeger
  - Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD
- Theodore L DeWeese
  - Department of Radiation Oncology and Molecular Radiation Sciences, Johns Hopkins University School of Medicine, Baltimore, MD
- Sara R Alcorn
  - Department of Radiation Oncology and Molecular Radiation Sciences, Johns Hopkins University School of Medicine, Baltimore, MD
|
28
|
Construction and Validation of a Prognostic Gene-Based Model for Overall Survival Prediction in Hepatocellular Carcinoma Using an Integrated Statistical and Bioinformatic Approach. Int J Mol Sci 2021; 22:ijms22041632. [PMID: 33562824 PMCID: PMC7915780 DOI: 10.3390/ijms22041632] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 12/22/2020] [Revised: 01/29/2021] [Accepted: 02/01/2021] [Indexed: 12/24/2022]
Abstract
Hepatocellular carcinoma (HCC) is one of the most common lethal cancers worldwide and is often associated with late diagnosis and poor survival outcomes. Growing evidence shows that gene-based prognostic models can be used to identify high-risk HCC patients. Therefore, our study aimed to construct a novel prognostic model for predicting the prognosis of HCC patients. We used a multivariate Cox regression model with three hybrid penalties, namely the least absolute shrinkage and selection operator (Lasso), adaptive lasso, and elastic net algorithms, to select informative prognosis-related genes. Then, best subset regression was used to identify the best prognostic gene signature. The prognostic gene-based risk score was constructed using the Cox coefficients of the prognostic gene signature. The model was evaluated by Kaplan-Meier (KM) and receiver operating characteristic (ROC) curve analyses. A novel four-gene signature associated with prognosis was identified, and the risk score was constructed based on this signature. The risk score efficiently identified a high-risk group of patients with poor prognosis. The time-dependent ROC analysis revealed that the risk model performed well, with areas under the curve (AUC) of 0.780, 0.732, and 0.733 for 1-, 2-, and 3-year prognosis prediction in The Cancer Genome Atlas (TCGA) dataset. Moreover, the risk score showed high diagnostic performance in classifying HCC versus normal samples. The prognostic and diagnostic performance of the risk score was verified in external validation datasets. A functional enrichment analysis was performed on the four-gene signature and its co-expressed genes, which are involved in metabolic and cell cycle pathways. Overall, we developed a novel gene-based prognostic model to predict high-risk HCC patients, and we hope that our findings can provide promising insight into the role of the four-gene signature in HCC patients and aid risk classification.
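The abstract describes a risk score built from the Cox coefficients of a four-gene signature and used to separate a high-risk group. As a minimal sketch, the score can be written as a coefficient-weighted sum of expression values, with patients then split at a cutoff. The gene names, coefficients, and expression values below are hypothetical placeholders, and the median cutoff is an assumption, since the abstract does not state how the groups were defined.

```python
from statistics import median

# Hypothetical Cox coefficients for a four-gene signature (placeholders only).
coefficients = {"gene_a": 0.42, "gene_b": -0.31, "gene_c": 0.18, "gene_d": 0.27}


def risk_score(expression, coefs):
    """Linear predictor: sum of coefficient * expression over signature genes."""
    return sum(coefs[gene] * expression[gene] for gene in coefs)


# Toy expression profiles (illustrative only).
patients = {
    "p1": {"gene_a": 2.0, "gene_b": 1.0, "gene_c": 0.5, "gene_d": 1.0},
    "p2": {"gene_a": 0.5, "gene_b": 2.0, "gene_c": 1.0, "gene_d": 0.5},
    "p3": {"gene_a": 1.0, "gene_b": 0.5, "gene_c": 2.0, "gene_d": 2.0},
    "p4": {"gene_a": 1.5, "gene_b": 1.5, "gene_c": 1.0, "gene_d": 0.0},
}

scores = {pid: risk_score(expr, coefficients) for pid, expr in patients.items()}

# Assumed rule: scores above the cohort median are labelled high risk.
cutoff = median(scores.values())
groups = {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}
print(scores, cutoff, groups)
```

The two resulting groups are what a Kaplan-Meier comparison or a hazard ratio would then be computed on.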
|
29
|
Zheng X, Amos CI, Frost HR. Cancer prognosis prediction using somatic point mutation and copy number variation data: a comparison of gene-level and pathway-based models. BMC Bioinformatics 2020; 21:467. [PMID: 33081688 PMCID: PMC7574407 DOI: 10.1186/s12859-020-03791-0] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 05/15/2020] [Accepted: 09/30/2020] [Indexed: 01/11/2023]
Abstract
BACKGROUND Genomic profiling of solid human tumors by projects such as The Cancer Genome Atlas (TCGA) has provided important information regarding the somatic alterations that drive cancer progression and patient survival. Although researchers have successfully leveraged TCGA data to build prognostic models, most efforts have focused on specific cancer types and a targeted set of gene-level predictors. Less is known about the prognostic ability of pathway-level variables in a pan-cancer setting. To address these limitations, we systematically evaluated and compared the prognostic ability of somatic point mutation (SPM) and copy number variation (CNV) data, and of gene-level and pathway-level models, for a diverse set of TCGA cancer types and predictive modeling approaches. RESULTS We evaluated gene-level and pathway-level penalized Cox proportional hazards models using SPM and CNV data for 29 different TCGA cohorts. We measured predictive accuracy as the concordance index for predicting survival outcomes. Our comprehensive analysis suggests that the use of pathway-level predictors did not offer superior predictive power relative to gene-level models for all cancer types but had the advantages of robustness and parsimony. We identified a set of cohorts for which somatic alterations could not predict prognosis, and a unique cohort, LGG, for which SPM data were more predictive than CNV data and predictive accuracy was good for all model types. We found that the pathway-level predictors provide superior interpretative value and that gene-level models often suffer from serious collinearity, whereas pathway-level models avoid this issue. CONCLUSION Our comprehensive analysis suggests that when using somatic alterations data for cancer prognosis prediction, pathway-level models are more interpretable, stable, and parsimonious than gene-level models. Pathway-level models also avoid the issue of collinearity, which can be serious for gene-level somatic alterations. The prognostic power of somatic alterations is highly variable across different cancer types, and we have identified a set of cohorts for which somatic alterations could not predict prognosis. In general, CNV data predict prognosis better than SPM data, with the exception of the LGG cohort.
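The abstract refers to penalized Cox proportional hazards models without naming the penalty. As a minimal sketch of the objective such models minimize, the snippet below evaluates the negative log partial likelihood plus an elastic net penalty. The elastic net form, its `lam`/`alpha` parametrization, the Breslow-style handling of risk sets (no tie correction), and the toy data are all assumptions for illustration.

```python
import math


def penalized_neg_log_partial_likelihood(beta, X, times, events, lam=0.1, alpha=0.5):
    """Cox negative log partial likelihood with an elastic net penalty.

    beta   : coefficient per predictor
    X      : one row of predictor values per subject
    times  : follow-up time per subject
    events : 1 if the event was observed, 0 if censored
    lam    : overall penalty strength (assumed parametrization)
    alpha  : mix between L1 (alpha=1) and L2 (alpha=0) penalties
    """
    # Linear predictor for every subject.
    eta = [sum(b * x for b, x in zip(beta, row)) for row in X]
    nll = 0.0
    for i, (t_i, e_i) in enumerate(zip(times, events)):
        if e_i != 1:
            continue  # censored subjects only appear inside risk sets
        # Risk set: everyone still under observation at time t_i.
        risk_sum = sum(math.exp(eta[j]) for j, t_j in enumerate(times) if t_j >= t_i)
        nll -= eta[i] - math.log(risk_sum)
    l1 = sum(abs(b) for b in beta)
    l2 = sum(b * b for b in beta)
    return nll + lam * (alpha * l1 + 0.5 * (1.0 - alpha) * l2)


# Toy data (illustrative only): two binary predictors, three subjects.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
times = [1, 2, 3]
events = [1, 1, 0]
baseline = penalized_neg_log_partial_likelihood([0.0, 0.0], X, times, events)
print(baseline)
```

The L1 term is what shrinks some coefficients exactly to zero, which is how penalized fits cope with many correlated gene-level predictors.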
Affiliation(s)
- Xingyu Zheng
  - Department of Biomedical Data Science, Geisel School of Medicine, Dartmouth College, Hanover, NH, 03755, USA
- Christopher I Amos
  - Department of Biomedical Data Science, Geisel School of Medicine, Dartmouth College, Hanover, NH, 03755, USA
  - Department of Medicine, Institute for Clinical and Translational Research, Baylor College of Medicine, 1 Baylor Plaza, Houston, TX, 77030, USA
- H Robert Frost
  - Department of Biomedical Data Science, Geisel School of Medicine, Dartmouth College, Hanover, NH, 03755, USA
|
30
|
Fuellen G, Liesenfeld O, Kowald A, Barrantes I, Bastian M, Simm A, Jansen L, Tietz-Latza A, Quandt D, Franceschi C, Walter M. The preventive strategy for pandemics in the elderly is to collect in advance samples & data to counteract chronic inflammation (inflammaging). Ageing Res Rev 2020; 62:101091. [PMID: 32454090 PMCID: PMC7245683 DOI: 10.1016/j.arr.2020.101091] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 04/08/2020] [Revised: 05/07/2020] [Accepted: 05/18/2020] [Indexed: 12/15/2022]
Abstract
Fighting the current COVID-19 pandemic, we must not forget to prepare for the next. Since elderly and frail people are at high risk, we wish to predict their vulnerability, and intervene if possible. For example, it would take little effort to take additional swabs or dried blood spots. Such minimally-invasive sampling, exemplified here during screening for potential COVID-19 infection, can yield the data to discover biomarkers to better handle this and the next respiratory disease pandemic. Longitudinal outcome data can then be combined with other epidemics and old-age health data, to discover the best biomarkers to predict (i) coping with infection & inflammation and thus hospitalization or intensive care, (ii) long-term health challenges, e.g. deterioration of lung function after intensive care, and (iii) treatment & vaccination response. Further, there are universal triggers of old-age morbidity & mortality, and the elimination of senescent cells improved health in pilot studies in idiopathic lung fibrosis & osteoarthritis patients alike. Biomarker studies are needed to test the hypothesis that resilience of the elderly during a pandemic can be improved by countering chronic inflammation and/or removing senescent cells. Our review suggests that more samples should be taken and saved systematically, following minimum standards, and data be made available, to maximize healthspan & minimize frailty, leading to savings in health care, gains in quality of life, and preparing us better for the next pandemic, all at the same time.
|
31
|
Shouval R, Fein JA, Savani B, Mohty M, Nagler A. Machine learning and artificial intelligence in haematology. Br J Haematol 2020; 192:239-250. [PMID: 32602593 DOI: 10.1111/bjh.16915] [Citation(s) in RCA: 44] [Impact Index Per Article: 11.0] [Indexed: 12/23/2022]
Abstract
Digitalization of the medical record and integration of genomic methods into clinical practice have resulted in an unprecedented wealth of data. Machine learning is a subdomain of artificial intelligence that attempts to computationally extract meaningful insights from complex data structures. Applications of machine learning in haematological scenarios are steadily increasing. However, basic concepts are often unfamiliar to clinicians and investigators. The purpose of this review is to provide readers with tools to interpret and critically appraise machine learning literature. We begin with the elucidation of standard terminology and then review examples in haematology. Guidelines for designing and evaluating machine-learning studies are provided. Finally, we discuss limitations of the machine-learning approach.
Affiliation(s)
- Roni Shouval
  - Adult Bone Marrow Transplant Service, Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, NY, USA; Hematology and Bone Marrow Transplantation Division, Chaim Sheba Medical Center, Tel-Hashomer, Sackler School of Medicine, Tel Aviv University, Tel Aviv, Israel
- Joshua A Fein
  - University of Connecticut Medical Center, Farmington, CT, USA
- Bipin Savani
  - Division of Hematology-Oncology, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
- Mohamad Mohty
  - European Society for Blood and Marrow Transplantation Paris Study Office/CEREST-TC, Paris, France; Service d'Hématologie Clinique et de Thérapie Cellulaire, Hôpital Saint Antoine, AP-HP, Paris, France
- Arnon Nagler
  - Hematology and Bone Marrow Transplantation Division, Chaim Sheba Medical Center, Tel-Hashomer, Sackler School of Medicine, Tel Aviv University, Tel Aviv, Israel
|