1
|
Garrido NJ, González-Martínez F, Losada S, Plaza A, del Olmo E, Mateo J. Innovation through Artificial Intelligence in Triage Systems for Resource Optimization in Future Pandemics. Biomimetics (Basel) 2024; 9:440. [PMID: 39056881 PMCID: PMC11274710 DOI: 10.3390/biomimetics9070440] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/28/2024] [Revised: 07/12/2024] [Accepted: 07/16/2024] [Indexed: 07/28/2024] Open
Abstract
Artificial intelligence (AI) systems are already being used in various healthcare areas. Similarly, they can offer many advantages in hospital emergency services. The objective of this work is to demonstrate that through the novel use of AI, a trained system can be developed to detect patients at potential risk of infection in a new pandemic more quickly than standardized triage systems. This identification would occur in the emergency department, thus allowing for the early implementation of organizational preventive measures to block the chain of transmission. MATERIALS AND METHODS In this study, we propose the use of a machine learning system in emergency department triage during pandemics to detect patients at the highest risk of death and infection using the COVID-19 era as an example, where rapid decision making and comprehensive support have become increasingly crucial. All patients who consecutively presented to the emergency department were included, and more than 89 variables were automatically analyzed using the extreme gradient boosting (XGB) algorithm. RESULTS The XGB system demonstrated the highest balanced accuracy at 91.61%. Additionally, it obtained results more quickly than traditional triage systems. The variables that most influenced mortality prediction were procalcitonin level, age, and oxygen saturation, followed by lactate dehydrogenase (LDH) level, C-reactive protein, the presence of interstitial infiltrates on chest X-ray, and D-dimer. Our system also identified the importance of oxygen therapy in these patients. CONCLUSIONS These results highlight that XGB is a useful and novel tool in triage systems for guiding the care pathway in future pandemics, thus following the example set by the well-known COVID-19 pandemic.
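The abstract above reports balanced accuracy as its headline metric. As a minimal sketch of what that metric measures, the Python snippet below computes it as the mean of sensitivity and specificity for a binary outcome. The function name and the toy labels are illustrative assumptions, not data or code from the study.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    sensitivity = tp / (tp + fn)  # share of true positives that were detected
    specificity = tn / (tn + fp)  # share of true negatives that were detected
    return (sensitivity + specificity) / 2


# Toy, made-up labels: 4 positives and 8 negatives (an imbalanced sample).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1]
score = balanced_accuracy(y_true, y_pred)
print(f"balanced accuracy = {score:.4f}")
```

Because each class contributes equally, the metric is less flattering than plain accuracy when one outcome (for example, death) is rare.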
Affiliation(s)
- Nicolás J. Garrido
- Internal Medicine, Virgen de la Luz Hospital, 16002 Cuenca, Spain
- Expert Medical Analysis Group, Institute of Technology, University of Castilla-La Mancha, 16071 Cuenca, Spain
| | - Félix González-Martínez
- Expert Medical Analysis Group, Institute of Technology, University of Castilla-La Mancha, 16071 Cuenca, Spain
- Department of Emergency Medicine, Virgen de la Luz Hospital, 16002 Cuenca, Spain
- Expert Medical Analysis Group, Instituto de Investigación Sanitaria de Castilla-La Mancha (IDISCAM), 45071 Toledo, Spain
| | - Susana Losada
- Department of Emergency Medicine, Virgen de la Luz Hospital, 16002 Cuenca, Spain
| | - Adrián Plaza
- Department of Emergency Medicine, Virgen de la Luz Hospital, 16002 Cuenca, Spain
| | - Eneida del Olmo
- Department of Emergency Medicine, Virgen de la Luz Hospital, 16002 Cuenca, Spain
| | - Jorge Mateo
- Expert Medical Analysis Group, Institute of Technology, University of Castilla-La Mancha, 16071 Cuenca, Spain
- Expert Medical Analysis Group, Instituto de Investigación Sanitaria de Castilla-La Mancha (IDISCAM), 45071 Toledo, Spain
| |
|
2
|
Yuan S, Xu S, Lu X, Chen X, Wang Y, Bao R, Sun Y, Xiao X, Su L, Long Y, Li L, He H. A privacy-preserving platform oriented medical healthcare and its application in identifying patients with candidemia. Sci Rep 2024; 14:15589. [PMID: 38971879 PMCID: PMC11227531 DOI: 10.1038/s41598-024-66596-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/05/2024] [Accepted: 07/02/2024] [Indexed: 07/08/2024] Open
Abstract
Federated learning (FL) has emerged as a significant method for developing machine learning models across multiple devices without centralized data collection. Candidemia, a critical but rare disease in ICUs, poses challenges in early detection and treatment. The goal of this study is to develop a privacy-preserving federated learning framework for predicting candidemia in ICU patients. This approach aims to enhance the accuracy of antifungal drug prescriptions and patient outcomes. This study involved the creation of four predictive FL models for candidemia using data from ICU patients across three hospitals in China. The models were designed to prioritize patient privacy while aggregating learnings across different sites. A unique ensemble feature selection strategy was implemented, combining the strengths of XGBoost's feature importance and statistical test p values. This strategy aimed to optimize the selection of relevant features for accurate predictions. The federated learning models demonstrated significant improvements over locally trained models, with a 9% increase in the area under the curve (AUC) and a 24% rise in true positive ratio (TPR). Notably, the FL models excelled in the combined TPR + TNR metric, which is critical for feature selection in candidemia prediction. The ensemble feature selection method proved more efficient than previous approaches, achieving comparable performance. The study successfully developed a set of federated learning models that significantly enhance the prediction of candidemia in ICU patients. By leveraging a novel feature selection method and maintaining patient privacy, the models provide a robust framework for improved clinical decision-making in the treatment of candidemia.
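The abstract above does not say how the per-hospital models were aggregated. Purely as an illustration of the general idea of federated learning, the Python sketch below shows sample-size-weighted parameter averaging (often called FedAvg) for a model whose parameters are a flat list of numbers. The function name, the three site sizes, and the parameter values are hypothetical, and this is not necessarily the aggregation rule used in the study.

```python
def federated_average(site_params, site_sizes):
    """Average per-site parameter vectors, weighting each site by its sample count.

    Only parameters leave each site; raw patient records are never pooled.
    """
    total = sum(site_sizes)
    n_params = len(site_params[0])
    return [
        sum(params[i] * size for params, size in zip(site_params, site_sizes)) / total
        for i in range(n_params)
    ]


# Hypothetical locally trained parameters from three sites (made-up values).
site_params = [[0.2, -1.0], [0.4, -0.5], [0.6, 0.0]]
site_sizes = [100, 300, 600]  # hypothetical number of patients per site
global_params = federated_average(site_params, site_sizes)
print(global_params)
```

In a full training loop the averaged parameters would be sent back to the sites for another round of local training; that loop is omitted here.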
Affiliation(s)
- Siyi Yuan
- Peking Union Medical College Hospital (CAMS), Beijing, China
| | - Song Xu
- Yidu Cloud Technology Company Ltd., Beijing, China
| | - Xiao Lu
- Department of Biomedical Engineering, School of Life Science, Beijing Institute of Technology, Beijing, 100081, China
| | - Xiangyu Chen
- Peking Union Medical College Hospital (CAMS), Beijing, China
- Peking Union Medical College Graduate School, Beijing, China
| | - Yao Wang
- Yidu Cloud Technology Company Ltd., Beijing, China
| | - Renyi Bao
- Yidu Cloud Technology Company Ltd., Beijing, China
| | - Yunbo Sun
- Department of Intensive Care Unit, Affiliated Hospital of Qingdao University, Qingdao, China
| | - Xiongjian Xiao
- Department of Respiratory and Critical Care Medicine, First Affiliated Hospital of Fujian Medical University, Fuzhou, China
| | - Longxiang Su
- Peking Union Medical College Hospital (CAMS), Beijing, China
| | - Yun Long
- Peking Union Medical College Hospital (CAMS), Beijing, China.
| | - Linfeng Li
- Yidu Cloud Technology Company Ltd., Beijing, China.
| | - Huaiwu He
- Peking Union Medical College Hospital (CAMS), Beijing, China.
| |
|
3
|
Tao Y, Ding X, Guo WL. Using machine-learning models to predict extubation failure in neonates with bronchopulmonary dysplasia. BMC Pulm Med 2024; 24:308. [PMID: 38956528 PMCID: PMC11218173 DOI: 10.1186/s12890-024-03133-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/12/2023] [Accepted: 06/26/2024] [Indexed: 07/04/2024] Open
Abstract
AIM To develop a decision-support tool for predicting extubation failure (EF) in neonates with bronchopulmonary dysplasia (BPD) using a set of machine-learning algorithms. METHODS A dataset of 284 BPD neonates on mechanical ventilation was used to develop predictive models via machine-learning algorithms, including extreme gradient boosting (XGBoost), random forest, support vector machine, naïve Bayes, logistic regression, and k-nearest neighbor. The top three models were assessed by the area under the receiver operating characteristic curve (AUC), and their performance was tested by decision curve analysis (DCA). A confusion matrix was used to show the high performance of the best model. The importance matrix plot and SHapley Additive exPlanations values were calculated to evaluate the feature importance and visualize the results. The nomogram and clinical impact curves were used to validate the final model. RESULTS According to the AUC values and DCA results, the XGBoost model performed best (AUC = 0.873, sensitivity = 0.896, specificity = 0.838). The nomogram and clinical impact curve verified that the XGBoost model possessed a significant predictive value. The following were predictive factors for EF: pO2, hemoglobin, mechanical ventilation (MV) rate, pH, Apgar score at 5 min, FiO2, C-reactive protein, Apgar score at 1 min, red blood cell count, PIP, gestational age, highest FiO2 at the first 24 h, heart rate, birth weight, pCO2. Further, pO2, hemoglobin, and MV rate were the three most important factors for predicting EF. CONCLUSIONS The present study indicated that the XGBoost model was significant in predicting EF in BPD neonates with mechanical ventilation, which is helpful in determining the right extubation time among neonates with BPD to reduce the occurrence of complications.
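The models above are ranked by the area under the ROC curve. As a minimal sketch of what that number represents, the Python snippet below computes AUC as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The function name and the toy scores are illustrative assumptions, not data from the study.

```python
def roc_auc(y_true, scores):
    """AUC via pairwise comparison of positive and negative scores."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0  # positive case ranked above negative case
            elif p == n:
                wins += 0.5  # tie counts as half a win
    return wins / (len(pos) * len(neg))


# Toy, made-up predicted probabilities for 3 positive and 3 negative cases.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
auc = roc_auc(y_true, scores)
print(f"AUC = {auc:.3f}")
```

The pairwise form is quadratic in the number of cases, which is acceptable for a cohort of a few hundred patients; rank-based formulas give the same value faster on large datasets.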
Affiliation(s)
- Yue Tao
- Department of radiology, Children's Hospital of Soochow University, 92 Zhongnan District, Suzhou, Jiangsu, 215025, China
| | - Xin Ding
- Department of neonatology, Children's Hospital of Soochow University, 92 Zhongnan District, Suzhou, Jiangsu, 215025, China
| | - Wan-Liang Guo
- Department of radiology, Children's Hospital of Soochow University, 92 Zhongnan District, Suzhou, Jiangsu, 215025, China.
| |
|
4
|
Raj A, Petreaca RC, Mirzaei G. Multi-Omics Integration for Liver Cancer Using Regression Analysis. Curr Issues Mol Biol 2024; 46:3551-3562. [PMID: 38666952 PMCID: PMC11049490 DOI: 10.3390/cimb46040222] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/18/2024] [Revised: 04/11/2024] [Accepted: 04/16/2024] [Indexed: 04/28/2024] Open
Abstract
Genetic biomarkers have played a pivotal role in the classification, prognostication, and guidance of clinical cancer therapies. Large-scale and multi-dimensional analyses of entire cancer genomes, as exemplified by projects like The Cancer Genome Atlas (TCGA), have yielded an extensive repository of data that holds the potential to unveil the underlying biology of these malignancies. Mutations stand out as the principal catalysts of cellular transformation. Nonetheless, other global genomic processes, such as alterations in gene expression and chromosomal re-arrangements, also play crucial roles in conferring cellular immortality. The incorporation of multi-omics data specific to cancer has demonstrated the capacity to enhance our comprehension of the molecular mechanisms underpinning carcinogenesis. This report elucidates how the integration of comprehensive data on methylation, gene expression, and copy number variations can effectively facilitate the unsupervised clustering of cancer samples. We have identified regressors that can effectively classify tumor and normal samples with an optimal integration of RNA sequencing, DNA methylation, and copy number variation while also achieving significant p-values. Further, these regressors were trained using linear and logistic regression with k-means clustering. For comparison, we employed autoencoder- and stacking-based omics integration and computed silhouette scores to evaluate the clusters. The proof of concept is illustrated using liver cancer data. Our analysis serves to underscore the feasibility of unsupervised cancer classification by considering genetic markers beyond mutations, thereby emphasizing the clinical relevance of additional global cellular parameters that contribute to the transformative process in cells. 
This work is clinically relevant because changes in gene expression and genomic re-arrangements have been shown to be signatures of cellular transformation across cancers, as well as in liver cancers.
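The abstract above evaluates clusters with silhouette scores. As a minimal sketch of that metric, the Python snippet below computes the mean silhouette coefficient with Euclidean distance using only the standard library. The function name and the four toy points are illustrative assumptions and have nothing to do with the omics data in the study.

```python
import math


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (requires at least 2 clusters)."""
    clusters = sorted(set(labels))
    scores = []
    for i, (p, lab) in enumerate(zip(points, labels)):
        # a: mean distance to the other members of the point's own cluster
        same = [math.dist(p, q) for j, (q, l) in enumerate(zip(points, labels))
                if l == lab and j != i]
        if not same:
            scores.append(0.0)  # convention for single-member clusters
            continue
        a = sum(same) / len(same)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(p, q) for q, l in zip(points, labels) if l == c)
            / sum(1 for l in labels if l == c)
            for c in clusters if c != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Toy, made-up 2-D points forming two well-separated pairs.
points = [(0, 0), (0, 1), (10, 0), (10, 1)]
good = silhouette_score(points, [0, 0, 1, 1])  # labels follow the real groups
bad = silhouette_score(points, [0, 1, 0, 1])   # labels cut across the groups
print(f"good = {good:.3f}, bad = {bad:.3f}")
```

Values near 1 indicate compact, well-separated clusters; values near or below 0 indicate that points sit closer to another cluster than to their own.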
Affiliation(s)
- Aditya Raj
- Department of Electrical and Computer Engineering, The Ohio State University, Columbus, OH 43210, USA;
| | - Ruben C. Petreaca
- Department of Molecular Genetics, The Ohio State University, Marion, OH 43302, USA;
- Cancer Biology Program, The Ohio State University James Comprehensive Cancer Center, Columbus, OH 43210, USA
| | - Golrokh Mirzaei
- Department of Computer Science and Engineering, The Ohio State University, Marion, OH 43302, USA
| |
|
5
|
Carrillo-Perez F, Pizurica M, Zheng Y, Nandi TN, Madduri R, Shen J, Gevaert O. Generation of synthetic whole-slide image tiles of tumours from RNA-sequencing data via cascaded diffusion models. Nat Biomed Eng 2024:10.1038/s41551-024-01193-8. [PMID: 38514775 DOI: 10.1038/s41551-024-01193-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/27/2022] [Accepted: 02/29/2024] [Indexed: 03/23/2024]
Abstract
Training machine-learning models with synthetically generated data can alleviate the problem of data scarcity when acquiring diverse and sufficiently large datasets is costly and challenging. Here we show that cascaded diffusion models can be used to synthesize realistic whole-slide image tiles from latent representations of RNA-sequencing data from human tumours. Alterations in gene expression affected the composition of cell types in the generated synthetic image tiles, which accurately preserved the distribution of cell types and maintained the cell fraction observed in bulk RNA-sequencing data, as we show for lung adenocarcinoma, kidney renal papillary cell carcinoma, cervical squamous cell carcinoma, colon adenocarcinoma and glioblastoma. Machine-learning models pretrained with the generated synthetic data performed better than models trained from scratch. Synthetic data may accelerate the development of machine-learning models in scarce-data settings and allow for the imputation of missing data modalities.
Affiliation(s)
- Francisco Carrillo-Perez
- Stanford Center for Biomedical Informatics Research (BMIR), Stanford University, School of Medicine, Stanford, CA, USA
| | - Marija Pizurica
- Stanford Center for Biomedical Informatics Research (BMIR), Stanford University, School of Medicine, Stanford, CA, USA
- Internet technology and Data science Lab (IDLab), Ghent University, Ghent, Belgium
| | - Yuanning Zheng
- Stanford Center for Biomedical Informatics Research (BMIR), Stanford University, School of Medicine, Stanford, CA, USA
| | - Tarak Nath Nandi
- Data Science and Learning Division, Argonne National Laboratory, Lemont, IL, USA
| | - Ravi Madduri
- Data Science and Learning Division, Argonne National Laboratory, Lemont, IL, USA
| | - Jeanne Shen
- Department of Pathology, Stanford University, School of Medicine, Palo Alto, CA, USA
| | - Olivier Gevaert
- Stanford Center for Biomedical Informatics Research (BMIR), Stanford University, School of Medicine, Stanford, CA, USA.
- Department of Biomedical Data Science, Stanford University, School of Medicine, Stanford, CA, USA.
| |
|
6
|
Zhang Y, Xiao L, LYu L, Zhang L. Construction of a predictive model for bone metastasis from first primary lung adenocarcinoma within 3 cm based on machine learning algorithm: a retrospective study. PeerJ 2024; 12:e17098. [PMID: 38495760 PMCID: PMC10944632 DOI: 10.7717/peerj.17098] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/28/2023] [Accepted: 02/21/2024] [Indexed: 03/19/2024] Open
Abstract
Background Adenocarcinoma, the most prevalent histological subtype of non-small cell lung cancer, is associated with a significantly higher likelihood of bone metastasis compared to other subtypes. The presence of bone metastasis has a profound adverse impact on patient prognosis. However, to date, there is a lack of accurate bone metastasis prediction models. As a result, this study aims to employ machine learning algorithms for predicting the risk of bone metastasis in patients. Method We collected a dataset comprising 19,454 cases of solitary, primary lung adenocarcinoma with pulmonary nodules measuring less than 3 cm. These cases were diagnosed between 2010 and 2015 and were sourced from the Surveillance, Epidemiology, and End Results (SEER) database. Utilizing clinical feature indicators, we developed predictive models using seven machine learning algorithms, namely extreme gradient boosting (XGBoost), logistic regression (LR), light gradient boosting machine (LightGBM), Adaptive Boosting (AdaBoost), Gaussian Naive Bayes (GNB), multilayer perceptron (MLP) and support vector machine (SVM). Results The results demonstrated that XGBoost exhibited superior performance among the four algorithms (training set: AUC: 0.913; test set: AUC: 0.853). Furthermore, for convenient application, we created an online scoring system accessible at the following URL: https://www.xsmartanalysis.com/model/predict/?mid=731symbol=7Fr16wX56AR9Mk233917, which is based on the highest performing model. Conclusion XGBoost proves to be an effective algorithm for predicting the occurrence of bone metastasis in patients with solitary, primary lung adenocarcinoma featuring pulmonary nodules below 3 cm in size. Moreover, its robust clinical applicability enhances its potential utility.
Affiliation(s)
- Yu Zhang
- Department of Thoracic Surgery, First Affiliated Hospital of Xinjiang Medical University, Urumqi, Xinjiang, China
| | - Lixia Xiao
- Department of Thoracic Surgery, Feicheng Hospital Affiliated to Shandong First Medical University, Taian, Shandong, China
| | - Lan LYu
- Department of Plastic Surgery, Feicheng Hospital Affiliated to Shandong First Medical University, Taian, Shandong, China
| | - Liwei Zhang
- Department of Thoracic Surgery, First Affiliated Hospital of Xinjiang Medical University, Urumqi, Xinjiang, China
| |
|
7
|
Jiang W, Chen Z, Chen C, Wang L, Han T, Wen L. Machine learning algorithms being an auxiliary tool to predict the overall survival of patients with renal cell carcinoma using the SEER database. Transl Androl Urol 2024; 13:53-63. [PMID: 38404544 PMCID: PMC10891382 DOI: 10.21037/tau-23-319] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/31/2023] [Accepted: 11/22/2023] [Indexed: 02/27/2024] Open
Abstract
Background The clinical prognosis assessment of renal cell carcinoma (RCC) still relies on nuclear grading and nuclear scoring by eye under a microscope, which is slow, inefficient, and subject to inconsistent evaluation criteria. Few machine learning (ML) studies in the RCC literature have investigated prognosis, although such models could also quantify the risk of postoperative recurrence in RCC patients and guide individualized postoperative clinical management. This study evaluated the suitability of ML algorithms for survival prediction in patients with RCC. Methods A total of 192,912 RCC patients from 2004 to 2015 were obtained from the Surveillance, Epidemiology, and End Results (SEER) database. Six ML algorithms including support vector machine (SVM), Bayesian method, decision tree, random forest, neural network, and Extreme Gradient Boosting (XGBoost) were applied to predict overall survival (OS) of RCC. Results The patients collected from SEER had a median age of 62 years, and the pathological types were clear cell RCC (47.6%), papillary RCC (9.5%), chromophobe RCC (4.0%) and others (4.1%). After deleting patients with missing data, the most accurate model was XGBoost [area under the curve (AUC) 67.0%]. After deleting patients with missing data and survival time <5 years, the accuracy of random forest, neural network and XGBoost was high, with AUCs of 80.8%, 81.5% and 81.8%, respectively. When only patients with missing tumor diameter were deleted and the remaining missing data were filled with missForest, the most accurate model was random forest (AUC: 71.9%). In this study, the overall accuracy of the SVM model was not high, except in the population obtained by deleting patients with missing tumor diameter and survival time <5 years and filling the missing data with missForest. Random forest, neural network and XGBoost had high accuracy, with AUCs of 84.1%, 84.7% and 84.8%, respectively.
Conclusions ML algorithms could be used to predict the prognosis of RCC. It could quantify the recurrence possibility of patients and help more individualized postoperative clinical management. Given the limitations and complexity of datasets, ML may be used as an auxiliary tool to analyze and process larger datasets and complex data.
Affiliation(s)
- Weixing Jiang
- Department of Urology, Beijing Friendship Hospital, Capital Medical University, Beijing, China
| | - Zhenghao Chen
- Department of Urology, Beijing Friendship Hospital, Capital Medical University, Beijing, China
- Department of Urology, The Second Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, China
| | - Cancan Chen
- Digital Health China Technologies Co., LTD., Beijing, China
| | - Lei Wang
- Department of Urology, Beijing Friendship Hospital, Capital Medical University, Beijing, China
| | - Tiandong Han
- Department of Urology, Beijing Friendship Hospital, Capital Medical University, Beijing, China
| | - Li Wen
- Department of Urology, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
| |
|
8
|
Veeramani N, Jayaraman P, Krishankumar R, Ravichandran KS, Gandomi AH. DDCNN-F: double decker convolutional neural network 'F' feature fusion as a medical image classification framework. Sci Rep 2024; 14:676. [PMID: 38182607 PMCID: PMC10770172 DOI: 10.1038/s41598-023-49721-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/23/2023] [Accepted: 12/11/2023] [Indexed: 01/07/2024] Open
Abstract
Melanoma is a severe skin cancer that involves abnormal cell development. This study aims to provide a new feature fusion framework for melanoma classification that includes a novel 'F' Flag feature for early detection. This novel 'F' indicator efficiently distinguishes benign skin lesions from malignant ones known as melanoma. The article proposes an architecture that is built in a Double Decker Convolutional Neural Network called DDCNN feature fusion. The network's deck one, known as a Convolutional Neural Network (CNN), finds difficult-to-classify hairy images using a confidence factor termed the intra-class variance score. These hirsute image samples are combined to form a Baseline Separated Channel (BSC). By eliminating hair and using data augmentation techniques, the BSC is ready for analysis. The network's second deck trains the pre-processed BSC and generates bottleneck features. The bottleneck features are merged with features generated from the ABCDE clinical bio indicators to promote classification accuracy. The resulting hybrid fused features, together with the novel 'F' Flag feature, are fed to different types of classifiers. The proposed system was trained using the ISIC 2019 and ISIC 2020 datasets to assess its performance. The empirical findings show that the DDCNN feature fusion strategy for detecting malignant melanoma achieved a specificity of 98.4%, accuracy of 93.75%, precision of 98.56%, and Area Under Curve (AUC) value of 0.98. This study proposes a novel approach that can accurately identify and diagnose fatal skin cancer and outperform other state-of-the-art techniques, which is attributed to the DDCNN 'F' Feature fusion framework. Also, this research ascertained improvements in several classifiers when utilising the 'F' indicator, with the largest gain in specificity being +7.34%.
Affiliation(s)
- Nirmala Veeramani
- School of Computing, SASTRA Deemed to Be University, Thanjavur, India
| | | | - Raghunathan Krishankumar
- Information Technology Systems and Analytics Area, Indian Institute of Management Bodh Gaya, Bodh Gaya, Bihar, 824234, India
| | | | - Amir H Gandomi
- Faculty of Engineering and Information Technology, University of Technology Sydney, Ultimo, NSW, Australia.
- University Research and Innovation Center (EKIK), Obuda University, Budapest, Hungary.
| |
|
9
|
Qin ZM, Liang SQ, Long JX, Deng JM, Wei X, Yang ML, Tang SJ, Li HL. Importance of GWAS Risk Loci and Clinical Data in Predicting Asthma Using Machine-learning Approaches. Comb Chem High Throughput Screen 2024; 27:400-407. [PMID: 37278039 DOI: 10.2174/1386207326666230602161939] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/19/2022] [Revised: 04/17/2023] [Accepted: 05/04/2023] [Indexed: 06/07/2023]
Abstract
INTRODUCTION To understand the risk factors of asthma, we combined genome-wide association study (GWAS) risk loci and clinical data in predicting asthma using machine-learning approaches. METHODS A case-control study with 123 asthmatics and 100 controls was conducted in the Zhuang population in Guangxi. GWAS risk loci were detected using polymerase chain reaction, and clinical data were collected. Machine-learning approaches were used to identify the major factors that contribute to asthma. RESULTS A total of 14 GWAS risk loci with clinical data were analyzed on the basis of 10 repetitions of 10-fold cross-validation for all machine-learning models. Using GWAS risk loci or clinical data, the best performances exhibited area under the curve (AUC) values of 64.3% and 71.4%, respectively. Combining GWAS risk loci and clinical data, the XGBoost established the best model with an AUC of 79.7%, indicating that the combination of genetics and clinical data can enable improved performance. We then sorted the importance of features and found the top six risk factors for predicting asthma to be rs3117098, rs7775228, family history, rs2305480, rs4833095, and body mass index. CONCLUSION Asthma-prediction models based on GWAS risk loci and clinical data can accurately predict asthma, and thus provide insights into the disease pathogenesis.
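The evaluation scheme above, 10-fold cross-validation repeated 10 times, can be sketched in a few lines of Python. The generator below yields train and test index lists using only the standard library. The function name, the seed, and the choice of plain (non-stratified) shuffling are assumptions made for illustration; the abstract does not say whether folds were stratified by case status.

```python
import random


def repeated_kfold(n_samples, n_splits=10, n_repeats=10, seed=0):
    """Yield (train_indices, test_indices) for repeated k-fold cross-validation."""
    rng = random.Random(seed)
    for _ in range(n_repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)  # fresh random partition for every repeat
        folds = [indices[k::n_splits] for k in range(n_splits)]
        for k in range(n_splits):
            test = folds[k]
            train = [i for j, fold in enumerate(folds) if j != k for i in fold]
            yield train, test


# 223 samples matches the cohort size in the abstract (123 cases + 100 controls).
splits = list(repeated_kfold(223))
print(len(splits), "train/test splits")
```

Each of the 100 splits would be used to fit a model on the training indices and score it on the held-out fold, and the reported metric would typically be the average over all splits.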
Affiliation(s)
- Zan-Mei Qin
- Department of Respiratory and Critical Care Medicine, First Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi, China
| | - Si-Qiao Liang
- Department of Respiratory and Critical Care Medicine, First Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi, China
| | - Jian-Xiong Long
- Department of Epidemiology and Health Statistics, School of Public Health of Guangxi Medical University, Nanning, Guangxi, China
| | - Jing-Min Deng
- Department of Respiratory and Critical Care Medicine, First Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi, China
| | - Xuan Wei
- Department of Respiratory and Critical Care Medicine, First Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi, China
| | - Mei-Ling Yang
- Department of Respiratory and Critical Care Medicine, First Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi, China
| | - Shao-Jie Tang
- School of Automation, Xi'an University of Posts and Telecommunications, Xi'an, Shaanxi, 710121, China
- Xi'an Key Laboratory of Advanced Controlling and Intelligent Processing (ACIP), Xi'an, Shaanxi, 710121, China
| | - Hai-Li Li
- Department of Respiratory and Critical Care Medicine, First Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi, China
| |
|
10
|
Zheng F, Chen B, Zhang L, Chen H, Zang Y, Chen X, Li Y. Radiogenomic Analysis of Vascular Endothelial Growth Factor in Patients With Glioblastoma. J Comput Assist Tomogr 2023; 47:967-972. [PMID: 37948373 PMCID: PMC10662586 DOI: 10.1097/rct.0000000000001510] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/20/2023] [Accepted: 04/26/2023] [Indexed: 07/29/2023]
Abstract
OBJECTIVES This article aims to predict the presence of vascular endothelial growth factor (VEGF) expression and to predict the expression level of VEGF by machine learning based on preoperative magnetic resonance imaging (MRI) of glioblastoma (GBM). METHODS We analyzed the axial T2-weighted images (T2WI) and T1-weighted contrast-enhancement images (T1CE) of preoperative MRI in 217 patients with pathologically diagnosed GBM. Patients were divided into negative and positive VEGF groups, with the latter group further subdivided into low and high expression. The machine learning models were established with the maximum relevance and minimum redundancy algorithm and the extreme gradient boosting classifier. The area under the receiver operating characteristic curve (AUC) and accuracy were calculated for the training and validation sets. RESULTS VEGF was positive in 63.1% (137/217) of GBM cases, with a high expression ratio of 53.3% (73/137). To predict the positive and negative VEGF expression, 7 radiomic features were selected, with 3 features from T1CE and 4 from T2WI. The accuracy and AUC were 0.83 and 0.81, respectively, in the training set and were 0.73 and 0.74, respectively, in the validation set. To predict high and low levels, 7 radiomic features were selected, with 2 from T1CE, 1 from T2WI, and 4 from the data combinations of T1CE and T2WI. The accuracy and AUC were 0.88 and 0.88, respectively, in the training set and were 0.72 and 0.72, respectively, in the validation set. CONCLUSION The VEGF expression status in GBM can be predicted using a machine learning model. Radiomic features resulting from data combinations of different MRI sequences could be helpful.
Affiliation(s)
| | - Baoshi Chen
- Neurosurgery, Beijing Tiantan Hospital, Capital Medical University, Beijing, P.R. China
| | | | | | | | | | - Yiming Li
- Neurosurgery, Beijing Tiantan Hospital, Capital Medical University, Beijing, P.R. China
| |
|
11
|
Saceleanu VM, Toader C, Ples H, Covache-Busuioc RA, Costin HP, Bratu BG, Dumitrascu DI, Bordeianu A, Corlatescu AD, Ciurea AV. Integrative Approaches in Acute Ischemic Stroke: From Symptom Recognition to Future Innovations. Biomedicines 2023; 11:2617. [PMID: 37892991 PMCID: PMC10604797 DOI: 10.3390/biomedicines11102617] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/23/2023] [Revised: 09/21/2023] [Accepted: 09/21/2023] [Indexed: 10/29/2023] Open
Abstract
Among today's highly prevalent cerebrovascular diseases, acute ischemic stroke stands out, representing a significant worldwide health issue with important socio-economic implications. Prompt diagnosis and intervention are important milestones for the management of this multifaceted pathology, making understanding the various stroke-onset symptoms crucial. A multi-disciplinary team plays an essential role in acute ischemic stroke management, increasing the efficiency of recognition and treatment. Neuroimaging and neuroradiology have evolved dramatically over the years, with multiple approaches that provide a better understanding of the morphological aspects as well as timely recognition of cerebral artery occlusions for effective therapy planning. Regarding treatment, the pharmacological approach, particularly fibrinolytic therapy, has its merits and challenges. Endovascular thrombectomy, a game-changer in stroke management, has witnessed significant advances, with technologies like stent retrievers and aspiration catheters playing pivotal roles. For select patients, combining pharmacological and endovascular strategies offers evidence-backed benefits. The aim of our comprehensive study on acute ischemic stroke is to efficiently compare the current therapies, recognize novel possibilities from the literature, and describe the state of the art in the interdisciplinary approach to acute ischemic stroke. As we aspire to holistic patient management, the emphasis is not just on medical intervention but also on physical therapy, mental health, and community engagement. The future holds promising innovations, with artificial intelligence poised to reshape stroke diagnostics and treatments. Bridging the gap between groundbreaking research and clinical practice remains a challenge, urging continuous collaboration and research.
Affiliation(s)
- Vicentiu Mircea Saceleanu
- Neurosurgery Department, Sibiu County Emergency Hospital, 550245 Sibiu, Romania;
- Neurosurgery Department, “Lucian Blaga” University of Medicine, 550024 Sibiu, Romania
- Corneliu Toader
- Department of Neurosurgery, “Carol Davila” University of Medicine and Pharmacy, 020021 Bucharest, Romania; (R.-A.C.-B.); (H.P.C.); (B.-G.B.); (D.-I.D.); (A.B.); (A.D.C.); (A.V.C.)
- Department of Vascular Neurosurgery, National Institute of Neurology and Neurovascular Diseases, 020022 Bucharest, Romania
- Horia Ples
- Centre for Cognitive Research in Neuropsychiatric Pathology (NeuroPsy-Cog), “Victor Babes” University of Medicine and Pharmacy, 300736 Timisoara, Romania
- Department of Neurosurgery, “Victor Babes” University of Medicine and Pharmacy, 300041 Timisoara, Romania
- Razvan-Adrian Covache-Busuioc
- Department of Neurosurgery, “Carol Davila” University of Medicine and Pharmacy, 020021 Bucharest, Romania; (R.-A.C.-B.); (H.P.C.); (B.-G.B.); (D.-I.D.); (A.B.); (A.D.C.); (A.V.C.)
- Horia Petre Costin
- Department of Neurosurgery, “Carol Davila” University of Medicine and Pharmacy, 020021 Bucharest, Romania; (R.-A.C.-B.); (H.P.C.); (B.-G.B.); (D.-I.D.); (A.B.); (A.D.C.); (A.V.C.)
- Bogdan-Gabriel Bratu
- Department of Neurosurgery, “Carol Davila” University of Medicine and Pharmacy, 020021 Bucharest, Romania; (R.-A.C.-B.); (H.P.C.); (B.-G.B.); (D.-I.D.); (A.B.); (A.D.C.); (A.V.C.)
- David-Ioan Dumitrascu
- Department of Neurosurgery, “Carol Davila” University of Medicine and Pharmacy, 020021 Bucharest, Romania; (R.-A.C.-B.); (H.P.C.); (B.-G.B.); (D.-I.D.); (A.B.); (A.D.C.); (A.V.C.)
- Andrei Bordeianu
- Department of Neurosurgery, “Carol Davila” University of Medicine and Pharmacy, 020021 Bucharest, Romania; (R.-A.C.-B.); (H.P.C.); (B.-G.B.); (D.-I.D.); (A.B.); (A.D.C.); (A.V.C.)
- Antonio Daniel Corlatescu
- Department of Neurosurgery, “Carol Davila” University of Medicine and Pharmacy, 020021 Bucharest, Romania; (R.-A.C.-B.); (H.P.C.); (B.-G.B.); (D.-I.D.); (A.B.); (A.D.C.); (A.V.C.)
- Alexandru Vlad Ciurea
- Department of Neurosurgery, “Carol Davila” University of Medicine and Pharmacy, 020021 Bucharest, Romania; (R.-A.C.-B.); (H.P.C.); (B.-G.B.); (D.-I.D.); (A.B.); (A.D.C.); (A.V.C.)
- Neurosurgery Department, Sanador Clinical Hospital, 010991 Bucharest, Romania
|
12
|
Gkantzios A, Kokkotis C, Tsiptsios D, Moustakidis S, Gkartzonika E, Avramidis T, Tripsianis G, Iliopoulos I, Aggelousis N, Vadikolias K. From Admission to Discharge: Predicting National Institutes of Health Stroke Scale Progression in Stroke Patients Using Biomarkers and Explainable Machine Learning. J Pers Med 2023; 13:1375. [PMID: 37763143 PMCID: PMC10532952 DOI: 10.3390/jpm13091375] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/10/2023] [Revised: 09/03/2023] [Accepted: 09/12/2023] [Indexed: 09/29/2023] Open
Abstract
As a result of social progress and improved living conditions, which have contributed to a prolonged life expectancy, the prevalence of strokes has increased and has become a significant phenomenon. Despite the available stroke treatment options, patients frequently suffer from significant disability after a stroke. Initial stroke severity is a significant predictor of functional dependence and mortality following an acute stroke. The current study aims to collect and analyze data from the hyperacute and acute phases of stroke, as well as from the medical history of the patients, in order to develop an explainable machine learning model for predicting stroke-related neurological deficits at discharge, as measured by the National Institutes of Health Stroke Scale (NIHSS). More specifically, we approached the data as a binary task problem: improvement of NIHSS progression vs. worsening of NIHSS progression at discharge, using baseline data within the first 72 h. For feature selection, a genetic algorithm was applied. Using various classifiers, we found that the best scores were achieved from the Random Forest (RF) classifier at the 15 most informative biomarkers and parameters for the binary task of the prediction of NIHSS score progression. RF achieved 91.13% accuracy, 91.13% recall, 90.89% precision, 91.00% f1-score, 8.87% FNrate and 4.59% FPrate. Those biomarkers are: age, gender, NIHSS upon admission, intubation, history of hypertension and smoking, the initial diagnosis of hypertension, diabetes, dyslipidemia and atrial fibrillation, high-density lipoprotein (HDL) levels, stroke localization, systolic blood pressure levels, as well as erythrocyte sedimentation rate (ESR) levels upon admission and the onset of respiratory infection. The SHapley Additive exPlanations (SHAP) model interpreted the impact of the selected features on the model output. 
Our findings suggest that the aforementioned variables may play a significant role in determining stroke patients' NIHSS progression from the time of admission until their discharge.
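The metrics reported in this abstract (accuracy, recall, precision, F1-score, FN rate and FP rate) all derive from a confusion matrix. The following is a minimal sketch of the standard binary definitions, using hypothetical counts rather than data from the study; the function name `binary_metrics` is an assumption for illustration, and the study may have averaged its metrics differently.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute standard binary-classification metrics from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    recall = tp / (tp + fn)        # also called sensitivity
    precision = tp / (tp + fp)
    f1 = 2 * precision * recall / (precision + recall)
    fn_rate = fn / (fn + tp)       # equals 1 - recall
    fp_rate = fp / (fp + tn)       # equals 1 - specificity
    return {
        "accuracy": accuracy,
        "recall": recall,
        "precision": precision,
        "f1": f1,
        "fn_rate": fn_rate,
        "fp_rate": fp_rate,
    }


# Hypothetical counts for illustration only, not data from the study.
metrics = binary_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Under these definitions the FN rate is the complement of recall, which matches the reported 91.13% recall and 8.87% FN rate.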
Affiliation(s)
- Aimilios Gkantzios
- Department of Neurology, Democritus University of Thrace, 68100 Alexandroupolis, Greece; (D.T.); (I.I.); (K.V.)
- Department of Neurology, Korgialeneio—Benakeio “Hellenic Red Cross” General Hospital of Athens, 11526 Athens, Greece;
- Christos Kokkotis
- Department of Physical Education and Sport Science, Democritus University of Thrace, 69100 Komotini, Greece; (C.K.); (S.M.); (N.A.)
- Dimitrios Tsiptsios
- Department of Neurology, Democritus University of Thrace, 68100 Alexandroupolis, Greece; (D.T.); (I.I.); (K.V.)
- Serafeim Moustakidis
- Department of Physical Education and Sport Science, Democritus University of Thrace, 69100 Komotini, Greece; (C.K.); (S.M.); (N.A.)
- Elena Gkartzonika
- School of Philosophy, University of Ioannina, 45110 Ioannina, Greece;
- Theodoros Avramidis
- Department of Neurology, Korgialeneio—Benakeio “Hellenic Red Cross” General Hospital of Athens, 11526 Athens, Greece;
- Gregory Tripsianis
- Laboratory of Medical Statistics, Democritus University of Thrace, 68100 Alexandroupolis, Greece;
- Ioannis Iliopoulos
- Department of Neurology, Democritus University of Thrace, 68100 Alexandroupolis, Greece; (D.T.); (I.I.); (K.V.)
- Nikolaos Aggelousis
- Department of Physical Education and Sport Science, Democritus University of Thrace, 69100 Komotini, Greece; (C.K.); (S.M.); (N.A.)
- Konstantinos Vadikolias
- Department of Neurology, Democritus University of Thrace, 68100 Alexandroupolis, Greece; (D.T.); (I.I.); (K.V.)
|
13
|
Tanaka T, Goto Y, Horie M, Masuda K, Shinno Y, Matsumoto Y, Okuma Y, Yoshida T, Horinouchi H, Motoi N, Yatabe Y, Watanabe S, Yamamoto N, Ohe Y. Whole Exome Sequencing of Thymoma Patients Exhibiting Exceptional Responses to Pemetrexed Monotherapy. Cancers (Basel) 2023; 15:4018. [PMID: 37627046 PMCID: PMC10452868 DOI: 10.3390/cancers15164018] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/28/2023] [Accepted: 08/01/2023] [Indexed: 08/27/2023] Open
Abstract
BACKGROUND Pemetrexed is used for the chemotherapy of advanced thymoma. Exceptional responses of thymoma to pemetrexed treatment are not frequently observed. The underlying genetic mechanism of the exceptional responses remains unclear. We used whole-exome sequencing to explore the specific genomic aberrations that lead to an extreme and durable response. METHODS Whole-exome sequencing using NovaSeq6000 (150 bp paired-end sequencing) was performed on nine formalin-fixed paraffin-embedded tissues from patients with advanced thymomas treated with pemetrexed (two exceptional responders and seven typical responders). RESULTS We identified 284 somatic single-nucleotide variants (SNVs; 272 missense, 8 missense/splice-site, 3 stop-gain, and 1 stop-gain/splice-site), 34 insertions and deletions (Indels; 33 frameshift and 1 splice region), and 21 copy number variations (CNVs; 15 gains and 6 losses). No difference in the number of SNVs or the distribution of deleterious Indels was observed between the exceptional and typical responders. Interestingly, arm-level chromosomal CNVs (15 gains and 6 losses) were detected in four patients, including an exceptional responder. The highest number of arm-level CNVs was observed in an exceptional responder. CONCLUSION Exceptional responders to pemetrexed for metastatic thymomas may be characterized by arm-level CNVs. Further whole-genome and RNA sequencing studies should be performed.
Affiliation(s)
- Tomohiro Tanaka
- Department of Thoracic Oncology, National Cancer Center Hospital, Tokyo 104-0045, Japan
- Department of Respiratory Medicine and Infectious Diseases, Niigata University Medical & Dental Hospital, Niigata 951-8510, Japan
- Yasushi Goto
- Department of Thoracic Oncology, National Cancer Center Hospital, Tokyo 104-0045, Japan
- Masafumi Horie
- Department of Molecular and Cellular Pathology, Kanazawa University, Kanazawa 920-8640, Japan
- Ken Masuda
- Department of Thoracic Oncology, National Cancer Center Hospital, Tokyo 104-0045, Japan
- Yuki Shinno
- Department of Thoracic Oncology, National Cancer Center Hospital, Tokyo 104-0045, Japan
- Yuji Matsumoto
- Department of Thoracic Oncology, National Cancer Center Hospital, Tokyo 104-0045, Japan
- Yusuke Okuma
- Department of Thoracic Oncology, National Cancer Center Hospital, Tokyo 104-0045, Japan
- Tatsuya Yoshida
- Department of Thoracic Oncology, National Cancer Center Hospital, Tokyo 104-0045, Japan
- Hidehito Horinouchi
- Department of Thoracic Oncology, National Cancer Center Hospital, Tokyo 104-0045, Japan
- Noriko Motoi
- Department of Pathology, Saitama Cancer Center, Saitama 362-0806, Japan
- Yasushi Yatabe
- Department of Pathology and Clinical Laboratory, National Cancer Center Hospital, Tokyo 104-0045, Japan
- Shunichi Watanabe
- Department of Thoracic Surgery, National Cancer Center Hospital, Tokyo 104-0045, Japan
- Noboru Yamamoto
- Department of Thoracic Oncology, National Cancer Center Hospital, Tokyo 104-0045, Japan
- Yuichiro Ohe
- Department of Thoracic Oncology, National Cancer Center Hospital, Tokyo 104-0045, Japan
|
14
|
Carrillo-Perez F, Pizurica M, Zheng Y, Nandi TN, Madduri R, Shen J, Gevaert O. RNA-to-image multi-cancer synthesis using cascaded diffusion models. bioRxiv 2023:2023.01.13.523899. [PMID: 36711711 PMCID: PMC9882105 DOI: 10.1101/2023.01.13.523899] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/18/2023]
Abstract
Data scarcity presents a significant obstacle in the field of biomedicine, where acquiring diverse and sufficient datasets can be costly and challenging. Synthetic data generation offers a potential solution to this problem by expanding dataset sizes, thereby enabling the training of more robust and generalizable machine learning models. Although previous studies have explored synthetic data generation for cancer diagnosis, they have predominantly focused on single modality settings, such as whole-slide image tiles or RNA-Seq data. To bridge this gap, we propose a novel approach, RNA-Cascaded-Diffusion-Model or RNA-CDM, for performing RNA-to-image synthesis in a multi-cancer context, drawing inspiration from successful text-to-image synthesis models used in natural images. In our approach, we employ a variational auto-encoder to reduce the dimensionality of a patient's gene expression profile, effectively distinguishing between different types of cancer. Subsequently, we employ a cascaded diffusion model to synthesize realistic whole-slide image tiles using the latent representation derived from the patient's RNA-Seq data. Our results demonstrate that the generated tiles accurately preserve the distribution of cell types observed in real-world data, with state-of-the-art cell identification models successfully detecting important cell types in the synthetic samples. Furthermore, we illustrate that the synthetic tiles maintain the cell fraction observed in bulk RNA-Seq data and that modifications in gene expression affect the composition of cell types in the synthetic tiles. Next, we utilize the synthetic data generated by RNA-CDM to pretrain machine learning models and observe improved performance compared to training from scratch. 
Our study emphasizes the potential usefulness of synthetic data in developing machine learning models in scarce-data settings, while also highlighting the possibility of imputing missing data modalities by leveraging the available information. In conclusion, our proposed RNA-CDM approach for synthetic data generation in biomedicine, particularly in the context of cancer diagnosis, offers a novel and promising solution to address data scarcity. By generating synthetic data that aligns with real-world distributions and leveraging it to pretrain machine learning models, we contribute to the development of robust clinical decision support systems and potential advancements in precision medicine.
|
15
|
Xiang T, Li T, Li J, Li X, Wang J. Using machine learning to realize genetic site screening and genomic prediction of productive traits in pigs. FASEB J 2023; 37:e22961. [PMID: 37178007 DOI: 10.1096/fj.202300245r] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/11/2023] [Revised: 03/30/2023] [Accepted: 04/25/2023] [Indexed: 05/15/2023]
Abstract
Genomic prediction, which is based on solving linear mixed-model (LMM) equations, is the most popular method for predicting breeding values or phenotypic performance for economic traits in livestock. With the need to further improve the performance of genomic prediction, nonlinear methods have been considered as an alternative and promising approach. The excellent ability to predict phenotypes in animal husbandry has been demonstrated by machine learning (ML) approaches, which have been rapidly developed. To investigate the feasibility and reliability of implementing genomic prediction using nonlinear models, the performances of genomic predictions for pig productive traits using the linear genomic selection model and nonlinear machine learning models were compared. Then, to reduce the high-dimensional features of genome sequence data, different machine learning algorithms, including the random forest (RF), support vector machine (SVM), extreme gradient boosting (XGBoost) and convolutional neural network (CNN) algorithms, were used to perform genomic feature selection as well as genomic prediction on reduced feature genome data. All of the analyses were processed on two real pig datasets: the published PIC pig dataset and a dataset comprising data from a national pig nucleus herd in Chifeng, North China. Overall, the accuracies of predicted phenotypic performance for traits T1, T2, T3 and T5 in the PIC dataset and average daily gain (ADG) in the Chifeng dataset were higher using the ML methods than the LMM method, while those for trait T4 in the PIC dataset and total number of piglets born (TNB) in the Chifeng dataset were slightly lower using the ML methods than the LMM method. Among all the different ML algorithms, SVM was the most appropriate for genomic prediction. For the genomic feature selection experiment, the most stable and most accurate results across different algorithms were achieved using XGBoost in combination with the SVM algorithm. 
Through feature selection, the number of genomic markers can be reduced to 1 in 20, while the predictive performance on some traits can even be improved compared to using the full genome data. Finally, we developed a new tool that can be used to execute combined XGBoost and SVM algorithms to realize genomic feature selection and phenotypic prediction.
Affiliation(s)
- Tao Xiang
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction of Ministry of Education & Key Laboratory of Swine Genetics and Breeding of Ministry of Agriculture, Huazhong Agricultural University, Wuhan, China
- Tao Li
- College of Informatics, Huazhong Agricultural University, Wuhan, China
- Key Laboratory of Smart Farming for Agricultural Animals, Huazhong Agricultural University, Wuhan, China
- Hubei Key Laboratory of Agricultural Bioinformatics, Huazhong Agricultural University, Wuhan, China
- Jielin Li
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction of Ministry of Education & Key Laboratory of Swine Genetics and Breeding of Ministry of Agriculture, Huazhong Agricultural University, Wuhan, China
- Xin Li
- College of Informatics, Huazhong Agricultural University, Wuhan, China
- Key Laboratory of Smart Farming for Agricultural Animals, Huazhong Agricultural University, Wuhan, China
- Hubei Key Laboratory of Agricultural Bioinformatics, Huazhong Agricultural University, Wuhan, China
- Jia Wang
- College of Informatics, Huazhong Agricultural University, Wuhan, China
- Key Laboratory of Smart Farming for Agricultural Animals, Huazhong Agricultural University, Wuhan, China
- Hubei Key Laboratory of Agricultural Bioinformatics, Huazhong Agricultural University, Wuhan, China
|
16
|
Ye L, Gu L, Zheng Z, Zhang X, Xing H, Guo X, Chen W, Wang Y, Wang Y, Liang T, Wang H, Li Y, Jin S, Shi Y, Liu D, Yang T, Liu Q, Deng C, Wang Y, Ma W. An online survival predictor in glioma patients using machine learning based on WHO CNS5 data. Front Neurol 2023; 14:1179761. [PMID: 37273702 PMCID: PMC10237015 DOI: 10.3389/fneur.2023.1179761] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/08/2023] [Accepted: 04/25/2023] [Indexed: 06/06/2023] Open
Abstract
Background The World Health Organization (WHO) CNS5 classification system highlights the significance of molecular biomarkers in providing meaningful prognostic and therapeutic information for gliomas. However, predicting individual patient survival remains challenging due to the lack of integrated quantitative assessment tools. In this study, we aimed to design a WHO CNS5-related risk signature to predict the overall survival (OS) rate of glioma patients using machine learning algorithms. Methods We extracted data from patients who underwent an operation for histopathologically confirmed glioma from our hospital database (2011-2022) and split them into a training set and a hold-out test set in a 7/3 ratio. We used biological markers related to WHO CNS5, clinical data (age, sex, and WHO grade), and prognosis follow-up information to identify prognostic factors and construct a predictive dynamic nomogram to predict the survival rate of glioma patients using four machine learning algorithms (RF, SVM, XGB, and GLM). Results A total of 198 patients with complete WHO5 molecular data and follow-up information were included in the study. The median OS time of all patients was 29.77 [95% confidence interval (CI): 21.19-38.34] months. Age, FGFR2, IDH1, CDK4, CDK6, KIT, and CDKN2A were considered vital indicators related to the prognosis and OS time of glioma. To better predict the prognosis of glioma patients, we constructed a WHO5-related risk signature and nomogram. The AUC values of the ROC curves of the nomogram for predicting the 1-, 3-, and 5-year OS were 0.849, 0.835, and 0.821 in the training set and 0.844, 0.943, and 0.959 in the validation set. The calibration plot confirmed the reliability of the nomogram, and the c-index was 0.742 in the training set and 0.775 in the validation set. Additionally, our nomogram showed a superior net benefit across a broader scale of threshold probabilities in decision curve analysis.
Therefore, we selected it as the backend for the online survival prediction tool (Glioma Survival Calculator, https://who5pumch.shinyapps.io/DynNomapp/), which can calculate a patient's survival probability at a specific time. Conclusion An online prognosis predictor based on WHO5-related biomarkers was constructed. This promising tool may improve the precision of therapy-outcome forecasts and prognosis assessment.
Affiliation(s)
- Liguo Ye
- Department of Neurosurgery, Center for Malignant Brain Tumors, National Glioma MDT Alliance, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Lingui Gu
- Department of Neurosurgery, Center for Malignant Brain Tumors, National Glioma MDT Alliance, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Zhiyao Zheng
- Department of Neurosurgery, Center for Malignant Brain Tumors, National Glioma MDT Alliance, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Department of Neurosurgery, Beijing Tiantan Hospital, Capital Medical University, Beijing, China
- Research Unit of Accurate Diagnosis, Treatment, and Translational Medicine of Brain Tumors (No. 2019RU011), Chinese Academy of Medical Sciences, Beijing, China
- Xin Zhang
- Department of Neurosurgery, Center for Malignant Brain Tumors, National Glioma MDT Alliance, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Hao Xing
- Department of Neurosurgery, Center for Malignant Brain Tumors, National Glioma MDT Alliance, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Xiaopeng Guo
- Department of Neurosurgery, Center for Malignant Brain Tumors, National Glioma MDT Alliance, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- China Anti-Cancer Association Specialty Committee of Glioma, Beijing, China
- Wenlin Chen
- Department of Neurosurgery, Center for Malignant Brain Tumors, National Glioma MDT Alliance, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Yaning Wang
- Department of Neurosurgery, Center for Malignant Brain Tumors, National Glioma MDT Alliance, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Yuekun Wang
- Department of Neurosurgery, Center for Malignant Brain Tumors, National Glioma MDT Alliance, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Tingyu Liang
- Department of Neurosurgery, Center for Malignant Brain Tumors, National Glioma MDT Alliance, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Hai Wang
- Department of Neurosurgery, Center for Malignant Brain Tumors, National Glioma MDT Alliance, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Yilin Li
- Department of Neurosurgery, Center for Malignant Brain Tumors, National Glioma MDT Alliance, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- 4+4 Medical Doctor Program, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Shanmu Jin
- Department of Neurosurgery, Center for Malignant Brain Tumors, National Glioma MDT Alliance, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- 4+4 Medical Doctor Program, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Yixin Shi
- Department of Neurosurgery, Center for Malignant Brain Tumors, National Glioma MDT Alliance, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Eight-year Medical Doctor Program, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Delin Liu
- Department of Neurosurgery, Center for Malignant Brain Tumors, National Glioma MDT Alliance, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Eight-year Medical Doctor Program, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Tianrui Yang
- Department of Neurosurgery, Center for Malignant Brain Tumors, National Glioma MDT Alliance, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Eight-year Medical Doctor Program, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Qianshu Liu
- Department of Neurosurgery, Center for Malignant Brain Tumors, National Glioma MDT Alliance, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Eight-year Medical Doctor Program, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Congcong Deng
- Department of Neurosurgery, Center for Malignant Brain Tumors, National Glioma MDT Alliance, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Yu Wang
- Department of Neurosurgery, Center for Malignant Brain Tumors, National Glioma MDT Alliance, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- China Anti-Cancer Association Specialty Committee of Glioma, Beijing, China
- Wenbin Ma
- Department of Neurosurgery, Center for Malignant Brain Tumors, National Glioma MDT Alliance, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- China Anti-Cancer Association Specialty Committee of Glioma, Beijing, China
|
17
|
Machine learning on MRI radiomic features: identification of molecular subtype alteration in breast cancer after neoadjuvant therapy. Eur Radiol 2023; 33:2965-2974. [PMID: 36418622 DOI: 10.1007/s00330-022-09264-7] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 06/14/2022] [Revised: 09/03/2022] [Accepted: 10/22/2022] [Indexed: 11/25/2022]
Abstract
OBJECTIVES Recent studies have revealed the change of molecular subtypes in breast cancer (BC) after neoadjuvant therapy (NAT). This study aims to construct a non-invasive model for predicting molecular subtype alteration in breast cancer after NAT. METHODS Eighty-two estrogen receptor (ER)-negative/ human epidermal growth factor receptor 2 (HER2)-negative or ER-low-positive/HER2-negative breast cancer patients who underwent NAT and completed baseline MRI were retrospectively recruited between July 2010 and November 2020. Subtype alteration was observed in 21 cases after NAT. A 2D-DenseUNet machine-learning model was built to perform automatic segmentation of breast cancer. 851 radiomic features were extracted from each MRI sequence (T2-weighted imaging, ADC, DCE, and contrast-enhanced T1-weighted imaging), both in the manual and auto-segmentation masks. All samples were divided into a training set (n = 66) and a test set (n = 16). XGBoost model with 5-fold cross-validation was performed to predict molecular subtype alterations in breast cancer patients after NAT. The predictive ability of these models was subsequently evaluated by the AUC of the ROC curve, sensitivity, and specificity. RESULTS A model consisting of three radiomics features from the manual segmentation of multi-sequence MRI achieved favorable predictive efficacy in identifying molecular subtype alteration in BC after NAT (cross-validation set: AUC = 0.908, independent test set: AUC = 0.864); whereas an automatic segmentation approach of BC lesions on the DCE sequence produced good segmentation results (Dice similarity coefficient = 0.720). CONCLUSIONS A machine learning model based on baseline MRI is proven useful for predicting molecular subtype alterations in breast cancer after NAT. 
KEY POINTS • Machine learning models using an MRI-based radiomics signature can predict molecular subtype alterations in breast cancer after neoadjuvant therapy, which subsequently affect treatment protocols. • The application of deep learning to the automatic segmentation of breast cancer lesions from MRI images shows the potential to replace manual segmentation.
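The abstract evaluates automatic segmentation with the Dice similarity coefficient. The following is a minimal sketch of that measure on two small binary masks, assuming masks given as nested lists of 0/1 values; the function name `dice_coefficient` and the example masks are hypothetical and are not taken from the study.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice = 2 * |A intersect B| / (|A| + |B|) for two binary masks of equal size."""
    flat_a = [bool(v) for row in mask_a for v in row]
    flat_b = [bool(v) for row in mask_b for v in row]
    if len(flat_a) != len(flat_b):
        raise ValueError("masks must contain the same number of pixels")
    intersection = sum(a and b for a, b in zip(flat_a, flat_b))
    total = sum(flat_a) + sum(flat_b)
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2 * intersection / total


# Hypothetical 2x3 masks for illustration only.
manual_mask = [[0, 1, 1],
               [0, 1, 0]]
auto_mask = [[0, 1, 0],
             [0, 1, 1]]
dice = dice_coefficient(manual_mask, auto_mask)
print(f"Dice similarity coefficient: {dice:.3f}")
```

A value of 1 means the two masks overlap exactly and 0 means they share no foreground pixels.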
|
18
|
Mohammed MA, Lakhan A, Abdulkareem KH, Garcia-Zapirain B. A hybrid cancer prediction based on multi-omics data and reinforcement learning state action reward state action (SARSA). Comput Biol Med 2023; 154:106617. [PMID: 36753981 DOI: 10.1016/j.compbiomed.2023.106617] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 12/08/2022] [Revised: 01/21/2023] [Accepted: 01/28/2023] [Indexed: 02/05/2023]
Abstract
The incidence of cancer among patients has been growing steadily, and many cancer cases have recently been reported in clinical hospitals. Many machine learning algorithms have been suggested in the literature to predict cancers of the same class types based on training and test data. However, there is still considerable room for further research. This paper examines different types of cancer by analyzing, classifying, and processing a multi-omics dataset in a fog cloud network. Based on SARSA on-policy learning and multi-omics workload learning, made possible by reinforcement learning, the study develops new hybrid cancer detection schemes. The system consists of different layers, such as clinical data collection via laboratories and tool processes (biopsy, colonoscopy, and mammography) at the distributed omics-based clinics in the network. The study considers different cancer classes, such as carcinomas, sarcomas, leukemias, and lymphomas, together with their types, and processes them using the distributed multi-omics clinics. To solve the problem, the study presents omics cancer workload reinforcement learning state action reward state action "SARSA" (OCWLS) schemes, which are made up of an on-policy learning scheme over parameters such as states, actions, timestamps, reward, accuracy, and processing time constraints. The goal is to process multiple cancer classes and perform workload feature matching while reducing processing time across distributed clinical hospitals. Simulation results show that OCWLS outperforms other machine learning methods in processing time, feature extraction from multiple classes of cancer, and matching in the system.
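For readers unfamiliar with SARSA, the following is a minimal sketch of the textbook tabular on-policy update, Q(s, a) ← Q(s, a) + α [r + γ Q(s', a') − Q(s, a)]. It is not the OCWLS scheme from the paper; the state and action labels, the learning rate `alpha` and the discount factor `gamma` are hypothetical values chosen for illustration.

```python
from collections import defaultdict


def sarsa_update(q, state, action, reward, next_state, next_action,
                 alpha=0.1, gamma=0.9):
    """Apply one tabular SARSA update and return the new Q(state, action)."""
    # On-policy target: uses the action actually chosen in the next state.
    td_target = reward + gamma * q[(next_state, next_action)]
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Hypothetical two-state example; unseen state-action pairs default to 0.0.
q_table = defaultdict(float)
q_table[("s1", "a1")] = 1.0
new_value = sarsa_update(q_table, "s0", "a0", reward=1.0,
                         next_state="s1", next_action="a1",
                         alpha=0.5, gamma=0.9)
print(f"Updated Q(s0, a0) = {new_value:.2f}")
```

Because the target uses the next action selected by the current policy rather than the greedy maximum, the update is on-policy, which is what distinguishes SARSA from Q-learning.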
Affiliation(s)
- Mazin Abed Mohammed
- College of Computer Science and Information Technology, University of Anbar, Anbar 31001, Iraq; eVIDA Lab, University of Deusto, 48007 Bilbao, Spain.
- Abdullah Lakhan
- Department of Computer Science, Dawood University of Engineering and Technology, Pakistan.
- Karrar Hameed Abdulkareem
- College of Agriculture, Al-Muthanna University, Samawah 66001, Iraq; College of Engineering, University of Warith Al-Anbiyaa, Karbala 56001, Iraq.
| | | |
Collapse
|
19
|
Appadurai JP, G S, Prabhu Kavin B, C K, Lai WC. Multi-Process Remora Enhanced Hyperparameters of Convolutional Neural Network for Lung Cancer Prediction. Biomedicines 2023; 11:679. [PMID: 36979657 PMCID: PMC10045623 DOI: 10.3390/biomedicines11030679] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/28/2022] [Revised: 01/30/2023] [Accepted: 02/08/2023] [Indexed: 03/30/2023] Open
Abstract
In recent years, lung cancer prediction has become an essential topic for reducing the human death rate. Some approaches reviewed in the literature suffer from reduced accuracy during the prediction stage. Hence, in this paper, we develop a Multi-Process Remora Optimized Hyperparameters of Convolutional Neural Network (MPROH-CNN) aimed at lung cancer prediction. The proposed technique can be applied to CT images of the human lung. The proposed technique proceeds with four phases, including pre-processing, feature extraction and classification. Initially, the datasets are collected from an open-source system. The collected CT images contain unwanted noise, which affects classification efficiency, so pre-processing techniques such as filtering and contrast enhancement are applied to remove the noise from the input images. Following that, the essential features are extracted with feature extraction techniques such as histogram, texture and wavelet. The extracted features are passed to the classification stage. The proposed classifier is a combination of the Remora Optimization Algorithm (ROA) and Convolutional Neural Network (CNN). In the CNN, the ROA is utilized for multi-process optimization such as structure optimization and hyperparameter optimization. The proposed methodology is implemented in MATLAB, and its performance is evaluated using performance metrics such as accuracy, precision, recall, specificity, sensitivity and F_Measure. To validate the proposed approach, it is compared with the traditional techniques CNN, CNN-Particle Swarm Optimization (PSO) and CNN-Firefly Algorithm (FA). From the analysis, the proposed method achieved an accuracy of 0.98 in lung cancer prediction.
Affiliation(s)
- Jothi Prabha Appadurai
- Computer Science and Engineering Department, Kakatiya Institute of Technology and Science, Warangal 506015, Telangana, India
- Suganeshwari G
- School of Computer Science and Engineering, Vellore Institute of Technology, Chennai 600127, Tamil Nadu, India
- Balasubramanian Prabhu Kavin
- Department of Data Science and Business Systems, College of Engineering and Technology, SRM Institute of Science and Technology, SRM Nagar, Chengalpattu District, Chennai 603203, Tamil Nadu, India
- Kavitha C
- Department of Computer Science and Engineering, Sathyabama Institute of Science and Technology, Chennai 600119, Tamil Nadu, India
- Wen-Cheng Lai
- Bachelor Program in Industrial Projects, National Yunlin University of Science and Technology, Douliu 640301, Taiwan
- Department Electronic Engineering, National Yunlin University of Science and Technology, Douliu 640301, Taiwan
|
20
|
Gkantzios A, Kokkotis C, Tsiptsios D, Moustakidis S, Gkartzonika E, Avramidis T, Aggelousis N, Vadikolias K. Evaluation of Blood Biomarkers and Parameters for the Prediction of Stroke Survivors' Functional Outcome upon Discharge Utilizing Explainable Machine Learning. Diagnostics (Basel) 2023; 13:532. [PMID: 36766637 PMCID: PMC9914778 DOI: 10.3390/diagnostics13030532] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 01/09/2023] [Revised: 01/25/2023] [Accepted: 01/29/2023] [Indexed: 02/04/2023] Open
Abstract
Despite therapeutic advancements, stroke remains a leading cause of death and long-term disability. The quality of current stroke prognostic models varies considerably, whereas prediction models of post-stroke disability and mortality are restricted by the sample size, the range of clinical and risk factors and the clinical applicability in general. Accurate prognostication can ease post-stroke discharge planning and help healthcare practitioners individualize aggressive treatment or palliative care, based on projected life expectancy and clinical course. In this study, we aimed to develop an explainable machine learning methodology to predict functional outcomes of stroke patients at discharge, using the Modified Rankin Scale (mRS) as a binary classification problem. We identified 35 parameters from the admission, the first 72 h, as well as the medical history of stroke patients, and used them to train the model. We divided the patients into two classes in two approaches: "Independent" vs. "Non-Independent" and "Non-Disability" vs. "Disability". Using various classifiers, we found that the best models in both approaches had an upward trend, with respect to the selected biomarkers, and achieved a maximum accuracy of 88.57% and 89.29%, respectively. The common features in both approaches included: age, hemispheric stroke localization, stroke localization based on blood supply, development of respiratory infection, National Institutes of Health Stroke Scale (NIHSS) upon admission and systolic blood pressure levels upon admission. Intubation and C-reactive protein (CRP) levels upon admission are additional features for the first approach and Erythrocyte Sedimentation Rate (ESR) levels upon admission for the second. Our results suggest that the said factors may be important predictors of functional outcomes in stroke patients.
Affiliation(s)
- Aimilios Gkantzios
- Department of Neurology, School of Medicine, University Hospital of Alexandroupolis, Democritus University of Thrace, 68100 Alexandroupolis, Greece
- Department of Neurology, Korgialeneio—Benakeio “Hellenic Red Cross” General Hospital of Athens, 11526 Athens, Greece
- Christos Kokkotis
- Department of Physical Education and Sport Science, Democritus University of Thrace, 69100 Komotini, Greece
- Dimitrios Tsiptsios
- Department of Neurology, School of Medicine, University Hospital of Alexandroupolis, Democritus University of Thrace, 68100 Alexandroupolis, Greece
- Serafeim Moustakidis
- Department of Physical Education and Sport Science, Democritus University of Thrace, 69100 Komotini, Greece
- AIDEAS OÜ, Narva mnt 5, 10117 Tallinn, Estonia
- Elena Gkartzonika
- School of Philosophy, University of Ioannina, 45110 Ioannina, Greece
- Theodoros Avramidis
- Department of Neurology, Korgialeneio—Benakeio “Hellenic Red Cross” General Hospital of Athens, 11526 Athens, Greece
- Nikolaos Aggelousis
- Department of Physical Education and Sport Science, Democritus University of Thrace, 69100 Komotini, Greece
- Konstantinos Vadikolias
- Department of Neurology, School of Medicine, University Hospital of Alexandroupolis, Democritus University of Thrace, 68100 Alexandroupolis, Greece
|
21
|
Zhang J, Yang X, Chen J, Han J, Chen X, Fan Y, Zheng H. Construction of a diagnostic classifier for cervical intraepithelial neoplasia and cervical cancer based on XGBoost feature selection and random forest model. J Obstet Gynaecol Res 2023; 49:296-303. [PMID: 36220631 DOI: 10.1111/jog.15458] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/14/2022] [Revised: 08/18/2022] [Accepted: 09/23/2022] [Indexed: 01/19/2023]
Abstract
BACKGROUND The pathological phenotype of early-stage cervical cancer (CC) is similar to that of cervical intraepithelial neoplasia (CIN), which poses a challenge for the diagnosis of cervical precancerous lesions. Meanwhile, the existing diagnostic methods have certain subjectivity and limitations, resulting in the possibility of misdiagnosis or missed diagnosis. Hence, methods are needed to assist the diagnosis of CC and CIN. METHODS Based on the CIN and CC data in the Gene Expression Omnibus (GEO) dataset, the eXtreme Gradient Boosting (XGBoost) algorithm was used to screen the feature genes between CIN and CC for constructing the classifier. An incremental feature selection (IFS) curve was also used for screening. The classifier was validated for reliability using principal component analysis (PCA) dimensionality reduction and heat map analysis of gene expression. Then, differentially expressed genes of CIN and CC were intersected with the classifier genes. Genes in the intersection were used as seeds for protein-protein interaction network construction and random walk with restart analysis. The genes with the top 50 affinity coefficients were selected for Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses to observe the biological functions that differ between CIN and CC. RESULTS The peripheral blood genes of CIN and CC were analyzed, and seven genes were screened. Using these genes for classifier construction, IFS curve screening revealed that the three-feature gene classifier constructed according to the random forest model had the best effect. The results of PCA dimensionality reduction analysis and gene expression heat map analysis showed that the three-gene classifier could effectively distinguish CIN from CC. CONCLUSION A three-gene diagnostic classifier can effectively distinguish CIN patients from CC patients and provide a reference for the clinical diagnosis of early CC.
Affiliation(s)
- Jing Zhang
- Department of Gynaecology and Obstetrics, Jiangsu Xiangshui Hospital of Chinese Medicine, Yancheng, Jiangsu, China
- Xiuqing Yang
- Department of Gynaecology and Obstetrics, Jiangsu Xiangshui Hospital of Chinese Medicine, Yancheng, Jiangsu, China
- Jia Chen
- Department of Gynaecology and Obstetrics, Jiangsu Xiangshui Hospital of Chinese Medicine, Yancheng, Jiangsu, China
- Jing Han
- Department of Gynaecology and Obstetrics, Jiangsu Xiangshui Hospital of Chinese Medicine, Yancheng, Jiangsu, China
- Xiaofeng Chen
- Department of Gynaecology and Obstetrics, Jiangsu Xiangshui Hospital of Chinese Medicine, Yancheng, Jiangsu, China
- Yueping Fan
- Department of Gynaecology and Obstetrics, Jiangsu Xiangshui Hospital of Chinese Medicine, Yancheng, Jiangsu, China
- Hui Zheng
- Department of Gynaecology and Obstetrics, Jiangsu Xiangshui Hospital of Chinese Medicine, Yancheng, Jiangsu, China
|
22
|
Műzes G, Bohusné Barta B, Szabó O, Horgas V, Sipos F. Cell-Free DNA in the Pathogenesis and Therapy of Non-Infectious Inflammations and Tumors. Biomedicines 2022; 10:2853. [PMID: 36359370 PMCID: PMC9687442 DOI: 10.3390/biomedicines10112853] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 10/15/2022] [Revised: 10/31/2022] [Accepted: 11/07/2022] [Indexed: 11/09/2022] Open
Abstract
The basic function of the immune system is the protection of the host against infections, along with the preservation of the individual antigenic identity. The process of self-tolerance covers the discrimination between self and foreign antigens, including proteins, nucleic acids, and larger molecules. Consequently, a broken immunological self-tolerance results in the development of autoimmune or autoinflammatory disorders. Immunocompetent cells express pattern-recognition receptors on their cell membrane and cytoplasm. The majority of endogenous DNA is located intracellularly within nuclei and mitochondria. However, extracellular, cell-free DNA (cfDNA) can also be detected in a variety of diseases, such as autoimmune disorders and malignancies, which has sparked interest in using cfDNA as a possible biomarker. In recent years, the widespread use of liquid biopsies and the increasing demand for screening, as well as monitoring disease activity and therapy response, have enabled the revival of cfDNA research. The majority of studies have mainly focused on the function of cfDNA as a biomarker. However, research regarding the immunological consequences of cfDNA, such as its potential immunomodulatory or therapeutic benefits, is still in its infancy. This article discusses the involvement of various DNA-sensing receptors (e.g., absent in melanoma-2; Toll-like receptor 9; cyclic GMP-AMP synthase/activator of interferon genes) in identifying host cfDNA as a potent danger-associated molecular pattern. Furthermore, we aim to summarize the results of the experimental studies that we recently performed and highlight the immunomodulatory capacity of cfDNA, and thus, the potential for possible therapeutic consideration.
Affiliation(s)
- Ferenc Sipos
- Correspondence: Tel.: +36-20-478-0752; Fax: +36-1-266-0816
|
23
|
Ozcan I, Aydin H, Cetinkaya A. Comparison of Classification Success Rates of Different Machine Learning Algorithms in the Diagnosis of Breast Cancer. Asian Pac J Cancer Prev 2022; 23:3287-3297. [PMID: 36308351 PMCID: PMC9924317 DOI: 10.31557/apjcp.2022.23.10.3287] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/23/2021] [Indexed: 02/18/2023] Open
Abstract
OBJECTIVE To identify which Machine Learning (ML) algorithms are the most successful in predicting and diagnosing breast cancer according to accuracy rates. METHODS The "College of Wisconsin Breast Cancer Dataset", which consists of 569 samples and 30 features, was classified using Support Vector Machine (SVM), Naive Bayes (NB), Random Forest (RF), Decision Tree (DT), K-Nearest Neighbor (KNN), Logistic Regression (LR), Multilayer Perceptron (MLP), Linear Discriminant Analysis (LDA), XgBoost (XGB), Ada-Boost (ABC) and Gradient Boosting (GBC) ML algorithms. Before the classification process, the dataset was preprocessed. Sensitivity, accuracy, and definiteness metrics were used to measure the success of the methods. RESULT Compared to other ML algorithms used in the study, the GBC ML algorithm was found to be the most successful method in the classification of tumors with an accuracy of 99.12%. The XGB ML algorithm was found to be the lowest-performing method with an accuracy rate of 88.10%. In addition, it was determined that the general accuracy rates of the 11 ML algorithms used in the study varied between 88-95%. CONCLUSION When the results obtained from the ML classifiers used in the study are evaluated, the efficiency of the GBC algorithm in the classification of tumors is evident. The success rates obtained from the 11 different ML algorithms used in the study are valuable for predicting different cancer types.
Affiliation(s)
- Irem Ozcan
- Department of Computer Engineering, Faculty of Engineering and Architecture, Istanbul Gelisim University, Istanbul, Turkey.
- Hakan Aydin
- Department of Computer Engineering, Faculty of Engineering, Istanbul Topkapı University, Istanbul, Turkey.
- Ali Cetinkaya
- Department of Electronics Technology, Istanbul Gelisim Vocational School, Istanbul Gelisim University, Istanbul, Turkey.
|
24
|
Kokkotis C, Giarmatzis G, Giannakou E, Moustakidis S, Tsatalas T, Tsiptsios D, Vadikolias K, Aggelousis N. An Explainable Machine Learning Pipeline for Stroke Prediction on Imbalanced Data. Diagnostics (Basel) 2022; 12:2392. [PMID: 36292081 PMCID: PMC9600473 DOI: 10.3390/diagnostics12102392] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/02/2022] [Revised: 09/26/2022] [Accepted: 09/27/2022] [Indexed: 11/16/2022] Open
Abstract
Stroke is an acute neurological dysfunction attributed to a focal injury of the central nervous system due to reduced blood flow to the brain. Nowadays, stroke is a global threat associated with premature death and huge economic consequences. Hence, there is an urgency to model the effect of several risk factors on stroke occurrence, and artificial intelligence (AI) seems to be the appropriate tool. In the present study, we aimed to (i) develop reliable machine learning (ML) prediction models for stroke disease; (ii) cope with a typical severe class imbalance problem, which is posed due to the stroke patients’ class being significantly smaller than the healthy class; and (iii) interpret the model output for understanding the decision-making mechanism. The effectiveness of the proposed ML approach was investigated in a comparative analysis with six well-known classifiers with respect to metrics that are related to both generalization capability and prediction accuracy. The best overall false-negative rate was achieved by the Multi-Layer Perceptron (MLP) classifier (18.60%). Shapley Additive Explanations (SHAP) were employed to investigate the impact of the risk factors on the prediction output. The proposed AI method could lead to the creation of advanced and effective risk stratification strategies for each stroke patient, which would allow for timely diagnosis and the right treatments.
Affiliation(s)
- Christos Kokkotis
- Department of Physical Education and Sport Science, Democritus University of Thrace, 69100 Komotini, Greece
- Georgios Giarmatzis
- Department of Physical Education and Sport Science, Democritus University of Thrace, 69100 Komotini, Greece
- Erasmia Giannakou
- Department of Physical Education and Sport Science, Democritus University of Thrace, 69100 Komotini, Greece
- Themistoklis Tsatalas
- Department of Physical Education and Sport Science, University of Thessaly, 38221 Trikala, Greece
- Dimitrios Tsiptsios
- Department of Neurology, School of Medicine, University Hospital of Alexandroupolis, Democritus University of Thrace, 68100 Alexandroupolis, Greece
- Konstantinos Vadikolias
- Department of Neurology, School of Medicine, University Hospital of Alexandroupolis, Democritus University of Thrace, 68100 Alexandroupolis, Greece
- Nikolaos Aggelousis
- Department of Physical Education and Sport Science, Democritus University of Thrace, 69100 Komotini, Greece
|
25
|
Rasheed K, Qayyum A, Ghaly M, Al-Fuqaha A, Razi A, Qadir J. Explainable, trustworthy, and ethical machine learning for healthcare: A survey. Comput Biol Med 2022; 149:106043. [PMID: 36115302 DOI: 10.1016/j.compbiomed.2022.106043] [Citation(s) in RCA: 30] [Impact Index Per Article: 15.0] [Received: 03/22/2022] [Revised: 08/15/2022] [Accepted: 08/20/2022] [Indexed: 12/18/2022]
Abstract
With the advent of machine learning (ML) and deep learning (DL) empowered applications in critical domains like healthcare, questions about liability, trust, and interpretability of their outputs are arising. The black-box nature of various DL models is a roadblock to clinical utilization. Therefore, to gain the trust of clinicians and patients, we need to provide explanations about the decisions of models. With the promise of enhancing the trust and transparency of black-box models, researchers are in the phase of maturing the field of eXplainable ML (XML). In this paper, we provide a comprehensive review of explainable and interpretable ML techniques for various healthcare applications. Along with highlighting security, safety, and robustness challenges that hinder the trustworthiness of ML, we also discuss the ethical issues arising from the use of ML/DL for healthcare. We also describe how explainable and trustworthy ML can resolve all these ethical problems. Finally, we elaborate on the limitations of existing approaches and highlight various open research problems that require further development.
Affiliation(s)
- Khansa Rasheed
- IHSAN Lab, Information Technology University of the Punjab (ITU), Lahore, Pakistan.
- Adnan Qayyum
- IHSAN Lab, Information Technology University of the Punjab (ITU), Lahore, Pakistan.
- Mohammed Ghaly
- Research Center for Islamic Legislation and Ethics (CILE), College of Islamic Studies, Hamad Bin Khalifa University (HBKU), Doha, Qatar.
- Ala Al-Fuqaha
- Information and Computing Technology Division, College of Science and Engineering, Hamad Bin Khalifa University (HBKU), Doha, Qatar.
- Adeel Razi
- Turner Institute for Brain and Mental Health, Monash University, Clayton, Australia; Monash Biomedical Imaging, Monash University, Clayton, Australia; Wellcome Centre for Human Neuroimaging, UCL, London, United Kingdom; CIFAR Azrieli Global Scholars program, CIFAR, Toronto, Canada.
- Junaid Qadir
- Department of Computer Science and Engineering, College of Engineering, Qatar University, Doha, Qatar.
|
26
|
Yuan Y, Lu H, Ma X, Chen F, Zhang S, Xia Y, Wang M, Shao C, Lu J, Shen F. Is rectal filling optimal for MRI-based radiomics in preoperative T staging of rectal cancer? Abdom Radiol (NY) 2022; 47:1741-1749. [PMID: 35267070 DOI: 10.1007/s00261-022-03477-6] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 01/10/2022] [Revised: 02/22/2022] [Accepted: 02/23/2022] [Indexed: 11/26/2022]
Abstract
PURPOSE To determine whether rectal filling with ultrasound gel is clinically more beneficial in preoperative T staging of patients with rectal cancer (RC) using a radiomics model based on magnetic resonance imaging (MRI). METHODS A total of 94 RC patients were assigned to cohort 1 (leave-one-out cross-validation [LOO-CV] set) and 230 RC patients were assigned to cohort 2 (test set). Patients were grouped according to different pathological T stages. The radiomics features were extracted from high-resolution T2-weighted imaging for all volumes of interest in the two cohorts. Optimal features were selected using the least absolute shrinkage and selection operator (LASSO) algorithm. Model 1 (without rectal filling) and model 2 (with rectal filling) were constructed. LOO-CV was adopted for radiomics model building in cohort 1. Thereafter, cohort 2 was used to test and verify the effectiveness of the two models. RESULTS In total, 204 patients were enrolled, including 60 cases in cohort 1 and 144 cases in cohort 2. Seven optimal features selected with LASSO were used to build model 1, and nine optimal features were used for model 2. The ROC curves showed an AUC of 0.806 and 0.946 for model 1 and model 2 in cohort 1, respectively, and an AUC of 0.783 and 0.920 for model 1 and model 2 in cohort 2, respectively (p = 0.021). CONCLUSION The radiomics model with rectal filling showed an advantage for differentiating T1 + 2 from T3 and had fewer inaccurate categories in the test cohort, suggesting that this model may be useful for T-stage evaluation.
Affiliation(s)
- Yuan Yuan
- Department of Radiology, Changhai Hospital, No.168 Changhai Road, Shanghai, 200433, China
- Haidi Lu
- Department of Radiology, Changhai Hospital, No.168 Changhai Road, Shanghai, 200433, China
- Xiaolu Ma
- Department of Radiology, Changhai Hospital, No.168 Changhai Road, Shanghai, 200433, China
- Fangying Chen
- Department of Radiology, Changhai Hospital, No.168 Changhai Road, Shanghai, 200433, China
- Shaoting Zhang
- Department of Radiology, Changhai Hospital, No.168 Changhai Road, Shanghai, 200433, China
- Yuwei Xia
- Huiying Medical Technology Co., Ltd, B2, Dongsheng Science and Technology Park, HaiDian District, Beijing, China
- Minjie Wang
- Department of Radiology, Changhai Hospital, No.168 Changhai Road, Shanghai, 200433, China
- Chengwei Shao
- Department of Radiology, Changhai Hospital, No.168 Changhai Road, Shanghai, 200433, China
- Jianping Lu
- Department of Radiology, Changhai Hospital, No.168 Changhai Road, Shanghai, 200433, China
- Fu Shen
- Department of Radiology, Changhai Hospital, No.168 Changhai Road, Shanghai, 200433, China.
|
27
|
MRI-Based Radiomics of Rectal Cancer: Assessment of the Local Recurrence at the Site of Anastomosis. Acad Radiol 2021; 28 Suppl 1:S87-S94. [PMID: 33162318 DOI: 10.1016/j.acra.2020.09.024] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 09/01/2020] [Revised: 09/25/2020] [Accepted: 09/30/2020] [Indexed: 01/18/2023]
Abstract
RATIONALE AND OBJECTIVE To investigate the significance of magnetic resonance imaging (MRI)-based radiomics model in differentiating local recurrence of rectal cancer from nonrecurrence lesions at the site of anastomosis. MATERIALS AND METHODS A total of 80 patients with clinically suspected lesions of anastomosis underwent 3.0T pelvic MRI consisting of T2-weighted imaging (T2WI), diffusion-weighted imaging (DWI), and contrast-enhanced T1-weighted volume interpolated body examination (VIBE) imaging. Radiomics features were extracted from volumes of interest (VOIs), delineated manually on multiple MRI sequences. Subsequently, principal component analysis reduced the dimensionality of features for T2WI, DWI, VIBE, and combined multisequences, respectively. On this basis, the extreme gradient boosting (XGBoost) classifier was trained to build ModelT2WI, ModelDWI, ModelVIBE, and Modelcombination. Receiver operating characteristic curves were generated to determine the diagnostic performance of various models. RESULTS Principal component analysis selected eight, four, seven, and six principal components to construct the radiomics model for T2WI, DWI, VIBE, and combined multisequences, respectively. Modelcombination had an area under the receiver operating characteristic curve of 0.864, with sensitivity and specificity of 81.82% and 75.86% in the validation set, demonstrating a more optimal performance compared to other models (p< 0.05). The decision curve analysis confirmed the clinical usefulness of the model. CONCLUSION This study demonstrated that MRI-based radiomics is a sophisticated and noninvasive tool for accurately distinguishing LR from nonrecurrence lesions at the site of anastomosis. Combining multiple sequences significantly improves its performance.
|
28
|
Liu H, Qiu C, Wang B, Bing P, Tian G, Zhang X, Ma J, He B, Yang J. Evaluating DNA Methylation, Gene Expression, Somatic Mutation, and Their Combinations in Inferring Tumor Tissue-of-Origin. Front Cell Dev Biol 2021; 9:619330. [PMID: 34012960 PMCID: PMC8126648 DOI: 10.3389/fcell.2021.619330] [Citation(s) in RCA: 73] [Impact Index Per Article: 24.3] [Received: 10/20/2020] [Accepted: 03/22/2021] [Indexed: 12/18/2022] Open
Abstract
Carcinoma of unknown primary (CUP) is a type of metastatic cancer, the primary tumor site of which cannot be identified. CUP accounts for approximately 5% of cancer incidences in the United States and usually has an unfavorable prognosis, making it a major threat to public health. Traditional methods to identify the tissue-of-origin (TOO) of CUP, such as immunohistochemistry, can only resolve around 20% of CUP patients. In recent years, more and more studies suggest that it is promising to solve the problem by integrating machine learning techniques with big biomedical data involving multiple types of biomarkers including epigenetic, genetic, and gene expression profiles, such as DNA methylation. Different biomarkers play different roles in cancer research; for example, genomic mutations in a patient’s tumor could lead to specific anticancer drugs for treatment; DNA methylation and copy number variation could reveal tumor tissue of origin and molecular classification. However, there is no systematic comparison on which biomarker is better at identifying the cancer type and site of origin. In addition, it might also be possible to further improve the inference accuracy by integrating multiple types of biomarkers. In this study, we used primary tumor data rather than metastatic tumor data. Although the use of primary tumors may lead to some biases in our classification model, their tissues of origin are known. In addition, previous studies have suggested that the CUP prediction model built from primary tumors could efficiently predict TOO of metastatic cancers (Lal et al., 2013; Brachtel et al., 2016). We systematically compared the performances of three types of biomarkers including DNA methylation, gene expression profile, and somatic mutation as well as their combinations in inferring the TOO of CUP patients.
First, we downloaded the gene expression profile, somatic mutation and DNA methylation data of 7,224 tumor samples across 21 common cancer types from the cancer genome atlas (TCGA) and generated seven different feature matrices through various combinations. Second, we performed feature selection by the Pearson correlation method. The selected features for each matrix were used to build up an XGBoost multi-label classification model to infer cancer TOO, an algorithm proven to be effective in a few previous studies. The performance of each biomarker and combination was compared by the 10-fold cross-validation process. Our results showed that the TOO tracing accuracy using gene expression profile was the highest, followed by DNA methylation, while somatic mutation performed the worst. Meanwhile, we found that simply combining multiple biomarkers does not have much effect in improving prediction accuracy.
Affiliation(s)
- Haiyan Liu
- Academician Workstation, Changsha Medical University, Changsha, China; College of Information Engineering, Changsha Medical University, Changsha, China
- Chun Qiu
- Department of Oncology, Hainan General Hospital, Haikou, China
- Bo Wang
- Geneis Beijing Co., Ltd., Beijing, China; Qingdao Geneis Institute of Big Data Mining and Precision Medicine, Qingdao, China
- Pingping Bing
- Academician Workstation, Changsha Medical University, Changsha, China
- Geng Tian
- Geneis Beijing Co., Ltd., Beijing, China; Qingdao Geneis Institute of Big Data Mining and Precision Medicine, Qingdao, China
- Xueliang Zhang
- Department of Oncology, Jiamusi Cancer Hospital, Jiamusi, China
- Jun Ma
- College of Information Engineering, Changsha Medical University, Changsha, China
- Bingsheng He
- Academician Workstation, Changsha Medical University, Changsha, China
- Jialiang Yang
- Academician Workstation, Changsha Medical University, Changsha, China; Geneis Beijing Co., Ltd., Beijing, China; Qingdao Geneis Institute of Big Data Mining and Precision Medicine, Qingdao, China
|
29
|
Zhang Y, Wang Y, Xu J, Zhu B, Chen X, Ding X, Li Y. Comparison of Prediction Models for Acute Kidney Injury Among Patients with Hepatobiliary Malignancies Based on XGBoost and LASSO-Logistic Algorithms. Int J Gen Med 2021; 14:1325-1335. [PMID: 33889012 PMCID: PMC8057825 DOI: 10.2147/ijgm.s302795] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 01/21/2021] [Accepted: 03/17/2021] [Indexed: 01/07/2023] Open
Abstract
Background: Based on admission data, we applied the XGBoost algorithm to build a model for estimating AKI risk in patients with hepatobiliary malignancies and compared its predictive capacity with that of a logistic model.
Methods: We reviewed clinical data of 7968 liver cancer and 589 gallbladder cancer patients admitted to Zhongshan Hospital during 2014 and 2015. They were randomly divided into a training set and a test set. Data were collected from the electronic medical record system. XGBoost and LASSO-logistic were each used to develop prediction models. The performance measures included the classification matrix, the area under the receiver operating characteristic curve (AUC), the lift chart, and the learning curve.
Results: Of 6846 participants in the training set, 792 (11.6%) developed AKI. In the XGBoost model, the three most important variables for AKI in liver cancer patients were serum creatinine (SCr), estimated glomerular filtration rate (eGFR), and antitumor treatment. Similarly, SCr and eGFR ranked second and third in the gallbladder cancer-related AKI model, just after phosphorus. In the classification matrix, the XGBoost model showed comparatively better agreement between actual observations and predictions than the LASSO-logistic model. Youden's index for the XGBoost model was 47.5% and 59.3% (liver and gallbladder cancer, respectively), significantly higher than for the LASSO-logistic model (41.6% and 32.7%). The AUCs of the XGBoost model were 0.822 in liver cancer and 0.850 in gallbladder cancer. By comparison, the AUCs of the logistic models were significantly lower, at 0.793 and 0.740 (p=0.024 and 0.018). As training samples accumulated, the XGBoost model maintained greater robustness in the learning curve.
Conclusion: The XGBoost model based on admission data has higher accuracy and stronger robustness in predicting AKI. It can support AKI risk classification management in clinical practice and allow intervention in advance among patients with hepatobiliary malignancies.
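Youden's index, one of the measures reported above, is defined as sensitivity + specificity − 1. The sketch below shows the calculation from a 2×2 classification matrix. The counts are hypothetical, chosen only to illustrate the arithmetic, because the abstract reports the index values and not the underlying matrices.

```python
def youden_index(tp, fn, tn, fp):
    """Youden's J = sensitivity + specificity - 1, from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # share of actual AKI cases flagged by the model
    specificity = tn / (tn + fp)  # share of non-AKI patients correctly cleared
    return sensitivity + specificity - 1.0


# Hypothetical counts (NOT from the study): 100 AKI cases, 200 non-cases.
j = youden_index(tp=80, fn=20, tn=150, fp=50)
print(f"Youden's J = {j:.3f}")
```

J ranges from −1 to 1. A value of 0 means the classifier does no better than chance at the chosen threshold, so higher values indicate a better trade-off between sensitivity and specificity.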
Affiliation(s)
- Yunlu Zhang
- Department of Nephrology, Zhongshan Hospital, Fudan University, Shanghai, People's Republic of China; Shanghai Medical Center of Kidney, Shanghai, People's Republic of China; Shanghai Key Laboratory of Kidney and Blood Purification, Shanghai, People's Republic of China
| | - Yimei Wang
- Department of Nephrology, Zhongshan Hospital, Fudan University, Shanghai, People's Republic of China; Shanghai Medical Center of Kidney, Shanghai, People's Republic of China; Shanghai Key Laboratory of Kidney and Blood Purification, Shanghai, People's Republic of China
| | - Jiarui Xu
- Department of Nephrology, Zhongshan Hospital, Fudan University, Shanghai, People's Republic of China; Shanghai Medical Center of Kidney, Shanghai, People's Republic of China; Shanghai Key Laboratory of Kidney and Blood Purification, Shanghai, People's Republic of China
| | - Bowen Zhu
- Department of Nephrology, Zhongshan Hospital, Fudan University, Shanghai, People's Republic of China; Shanghai Medical Center of Kidney, Shanghai, People's Republic of China; Shanghai Key Laboratory of Kidney and Blood Purification, Shanghai, People's Republic of China
| | - Xiaohong Chen
- Department of Nephrology, Zhongshan Hospital, Fudan University, Shanghai, People's Republic of China; Shanghai Medical Center of Kidney, Shanghai, People's Republic of China; Shanghai Key Laboratory of Kidney and Blood Purification, Shanghai, People's Republic of China
| | - Xiaoqiang Ding
- Department of Nephrology, Zhongshan Hospital, Fudan University, Shanghai, People's Republic of China; Shanghai Medical Center of Kidney, Shanghai, People's Republic of China; Shanghai Key Laboratory of Kidney and Blood Purification, Shanghai, People's Republic of China
| | - Yang Li
- Department of Nephrology, Zhongshan Hospital, Fudan University, Shanghai, People's Republic of China; Shanghai Medical Center of Kidney, Shanghai, People's Republic of China; Shanghai Key Laboratory of Kidney and Blood Purification, Shanghai, People's Republic of China
| |
|
30
|
Fan J, Chen M, Luo J, Yang S, Shi J, Yao Q, Zhang X, Du S, Qu H, Cheng Y, Ma S, Zhang M, Xu X, Wang Q, Zhan S. The prediction of asymptomatic carotid atherosclerosis with electronic health records: a comparative study of six machine learning models. BMC Med Inform Decis Mak 2021; 21:115. [PMID: 33820531 PMCID: PMC8020544 DOI: 10.1186/s12911-021-01480-3] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 12/14/2020] [Accepted: 03/26/2021] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Screening carotid B-mode ultrasonography is a frequently used method to detect subjects with carotid atherosclerosis (CAS). Because CAS progresses asymptomatically in most patients, early identification is challenging for clinicians, and CAS may trigger ischemic stroke. Recently, machine learning has shown a strong ability to classify data and a potential for prediction in the medical field. The combined use of machine learning and the electronic health records of patients could provide clinicians with a more convenient and precise method to identify asymptomatic CAS.
METHODS: This retrospective cohort study used routine clinical data of medical check-up subjects from April 19, 2010 to November 15, 2019. Six machine learning models (logistic regression [LR], random forest [RF], decision tree [DT], eXtreme Gradient Boosting [XGB], Gaussian Naïve Bayes [GNB], and K-Nearest Neighbour [KNN]) were used to predict asymptomatic CAS, and their predictive performance was compared in terms of the area under the receiver operating characteristic curve (AUCROC), accuracy (ACC), and F1 score (F1).
RESULTS: Of the 18,441 subjects, 6553 were diagnosed with asymptomatic CAS. Compared to DT (AUCROC 0.628, ACC 65.4%, and F1 52.5%), the other five models improved AUCROC: KNN +7.6% (0.704, 68.8%, and 50.9%, respectively), GNB +12.5% (0.753, 67.0%, and 46.8%, respectively), XGB +16.0% (0.788, 73.4%, and 55.7%, respectively), RF +16.6% (0.794, 74.5%, and 56.8%, respectively), and LR +18.1% (0.809, 74.7%, and 59.9%, respectively). The best-performing model, LR, predicted 1045/1966 cases (sensitivity 53.2%) and 3088/3566 non-cases (specificity 86.6%). A tenfold cross-validation scheme further verified the predictive ability of LR.
CONCLUSIONS: Among the machine learning models, LR showed optimal performance in predicting asymptomatic CAS. Our findings set the stage for an early automatic alarming system, allowing a more precise allocation of CAS prevention measures to the individuals most likely to benefit.
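The LR figures reported above can be checked directly from the stated counts. The sketch below assumes that "1045/1966 cases" means true positives over actual cases and "3088/3566 non-cases" means true negatives over actual non-cases. Under that reading, the computed sensitivity, specificity, accuracy, and F1 agree with the reported 53.2%, 86.6%, 74.7%, and 59.9%. The variable names are illustrative.

```python
# Counts reported in the abstract for the LR model.
true_pos, cases = 1045, 1966
true_neg, non_cases = 3088, 3566

# Derived counts under the reading stated above.
false_neg = cases - true_pos      # cases the model missed
false_pos = non_cases - true_neg  # non-cases flagged as cases

sensitivity = true_pos / cases
specificity = true_neg / non_cases
accuracy = (true_pos + true_neg) / (cases + non_cases)
f1 = 2 * true_pos / (2 * true_pos + false_pos + false_neg)

for name, value in [("sensitivity", sensitivity), ("specificity", specificity),
                    ("accuracy", accuracy), ("F1", f1)]:
    print(f"{name}: {value:.1%}")
```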
Affiliation(s)
- Jiaxin Fan
- Department of Neurology, The Second Affiliated Hospital of Xi'an Jiaotong University, No. 157 West Five Road, Xi'an, 710004, Shaanxi, China
| | - Mengying Chen
- Department of Neurology, The Second Affiliated Hospital of Xi'an Jiaotong University, No. 157 West Five Road, Xi'an, 710004, Shaanxi, China
| | - Jian Luo
- Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, China
| | - Shusen Yang
- School of Mathematics and Statistics, Xi'an Jiaotong University, Xi'an, China
| | - Jinming Shi
- Department of Neurology, The Second Affiliated Hospital of Xi'an Jiaotong University, No. 157 West Five Road, Xi'an, 710004, Shaanxi, China
| | - Qingling Yao
- Department of Neurology, The Second Affiliated Hospital of Xi'an Jiaotong University, No. 157 West Five Road, Xi'an, 710004, Shaanxi, China
| | - Xiaodong Zhang
- Department of Neurology, The Second Affiliated Hospital of Xi'an Jiaotong University, No. 157 West Five Road, Xi'an, 710004, Shaanxi, China
| | - Shuang Du
- Department of Neurology, The Second Affiliated Hospital of Xi'an Jiaotong University, No. 157 West Five Road, Xi'an, 710004, Shaanxi, China
| | - Huiyang Qu
- Department of Neurology, The Second Affiliated Hospital of Xi'an Jiaotong University, No. 157 West Five Road, Xi'an, 710004, Shaanxi, China
| | - Yuxuan Cheng
- Department of Neurology, The Second Affiliated Hospital of Xi'an Jiaotong University, No. 157 West Five Road, Xi'an, 710004, Shaanxi, China
| | - Shuyin Ma
- Department of Neurology, The Second Affiliated Hospital of Xi'an Jiaotong University, No. 157 West Five Road, Xi'an, 710004, Shaanxi, China
| | - Meijuan Zhang
- Department of Neurology, The Second Affiliated Hospital of Xi'an Jiaotong University, No. 157 West Five Road, Xi'an, 710004, Shaanxi, China
| | - Xi Xu
- Department of Neurology, The Second Affiliated Hospital of Xi'an Jiaotong University, No. 157 West Five Road, Xi'an, 710004, Shaanxi, China
| | - Qian Wang
- Department of Health Management, The Second Affiliated Hospital of Xi'an Jiaotong University, Xi'an, China
| | - Shuqin Zhan
- Department of Neurology, The Second Affiliated Hospital of Xi'an Jiaotong University, No. 157 West Five Road, Xi'an, 710004, Shaanxi, China
| |
|
31
|
Hynst J, Navrkalova V, Pal K, Pospisilova S. Bioinformatic strategies for the analysis of genomic aberrations detected by targeted NGS panels with clinical application. PeerJ 2021; 9:e10897. [PMID: 33850640 PMCID: PMC8019320 DOI: 10.7717/peerj.10897] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/09/2020] [Accepted: 01/13/2021] [Indexed: 01/21/2023] Open
Abstract
Molecular profiling of tumor samples has acquired importance in cancer research, but currently also plays an important role in the clinical management of cancer patients. Rapid identification of genomic aberrations improves diagnosis, prognosis and effective therapy selection. This can be attributed mainly to the development of next-generation sequencing (NGS) methods, especially targeted DNA panels. Such panels enable a relatively inexpensive and rapid analysis of various aberrations with clinical impact specific to particular diagnoses. In this review, we discuss the experimental approaches and bioinformatic strategies available for the development of an NGS panel for a reliable analysis of selected biomarkers. Compliance with defined analytical steps is crucial to ensure accurate and reproducible results. In addition, a careful validation procedure has to be performed before the application of NGS targeted assays in routine clinical practice. With more focus on bioinformatics, we emphasize the need for thorough pipeline validation and management in relation to the particular experimental setting as an integral part of the NGS method establishment. A robust and reproducible bioinformatic analysis running on powerful machines is essential for proper detection of genomic variants in clinical settings since distinguishing between experimental noise and real biological variants is fundamental. This review summarizes state-of-the-art bioinformatic solutions for careful detection of the SNV/Indels and CNVs for targeted sequencing resulting in translation of sequencing data into clinically relevant information. Finally, we share our experience with the development of a custom targeted NGS panel for an integrated analysis of biomarkers in lymphoproliferative disorders.
Affiliation(s)
- Jakub Hynst
- Center of Molecular Medicine, Central European Institute of Technology, Masaryk University, Brno, Czech Republic; Department of Internal Medicine-Hematology and Oncology, Faculty of Medicine and University Hospital Brno, Masaryk University, Brno, Czech Republic; Department of Medical Genetics and Genomics, Faculty of Medicine and University Hospital Brno, Masaryk University, Brno, Czech Republic
| | - Veronika Navrkalova
- Center of Molecular Medicine, Central European Institute of Technology, Masaryk University, Brno, Czech Republic; Department of Internal Medicine-Hematology and Oncology, Faculty of Medicine and University Hospital Brno, Masaryk University, Brno, Czech Republic
| | - Karol Pal
- Center of Molecular Medicine, Central European Institute of Technology, Masaryk University, Brno, Czech Republic; Department of Hematology, University Hospital Schleswig-Holstein, Kiel, Germany
| | - Sarka Pospisilova
- Center of Molecular Medicine, Central European Institute of Technology, Masaryk University, Brno, Czech Republic; Department of Internal Medicine-Hematology and Oncology, Faculty of Medicine and University Hospital Brno, Masaryk University, Brno, Czech Republic; Department of Medical Genetics and Genomics, Faculty of Medicine and University Hospital Brno, Masaryk University, Brno, Czech Republic
| |
|
32
|
Luxton JJ, McKenna MJ, Lewis AM, Taylor LE, Jhavar SG, Swanson GP, Bailey SM. Telomere Length Dynamics and Chromosomal Instability for Predicting Individual Radiosensitivity and Risk via Machine Learning. J Pers Med 2021; 11:188. [PMID: 33800260 PMCID: PMC8002073 DOI: 10.3390/jpm11030188] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 02/10/2021] [Revised: 02/23/2021] [Accepted: 03/02/2021] [Indexed: 12/11/2022] Open
Abstract
The ability to predict a cancer patient's response to radiotherapy and risk of developing adverse late health effects would greatly improve personalized treatment regimens and individual outcomes. Telomeres represent a compelling biomarker of individual radiosensitivity and risk, as exposure can result in dysfunctional telomere pathologies that coincidentally overlap with many radiation-induced late effects, ranging from degenerative conditions like fibrosis and cardiovascular disease to proliferative pathologies like cancer. Here, telomere length was longitudinally assessed in a cohort of fifteen prostate cancer patients undergoing Intensity Modulated Radiation Therapy (IMRT) utilizing Telomere Fluorescence in situ Hybridization (Telo-FISH). To evaluate genome instability and enhance predictions for individual patient risk of secondary malignancy, chromosome aberrations were assessed utilizing directional Genomic Hybridization (dGH) for high-resolution inversion detection. We present the first implementation of individual telomere length data in a machine learning model, XGBoost, trained on pre-radiotherapy (baseline) and in vitro exposed (4 Gy γ-rays) telomere length measurements, to predict post radiotherapy telomeric outcomes, which together with chromosomal instability provide insight into individual radiosensitivity and risk for radiation-induced late effects.
Affiliation(s)
- Jared J. Luxton
- Department of Environmental and Radiological Health Sciences, Colorado State University, Fort Collins, CO 80523, USA; (J.J.L.); (M.J.M.); (A.M.L.); (L.E.T.)
- Cell and Molecular Biology Program, Colorado State University, Fort Collins, CO 80523, USA
| | - Miles J. McKenna
- Department of Environmental and Radiological Health Sciences, Colorado State University, Fort Collins, CO 80523, USA; (J.J.L.); (M.J.M.); (A.M.L.); (L.E.T.)
- Cell and Molecular Biology Program, Colorado State University, Fort Collins, CO 80523, USA
| | - Aidan M. Lewis
- Department of Environmental and Radiological Health Sciences, Colorado State University, Fort Collins, CO 80523, USA; (J.J.L.); (M.J.M.); (A.M.L.); (L.E.T.)
| | - Lynn E. Taylor
- Department of Environmental and Radiological Health Sciences, Colorado State University, Fort Collins, CO 80523, USA; (J.J.L.); (M.J.M.); (A.M.L.); (L.E.T.)
| | - Sameer G. Jhavar
- Baylor Scott & White Medical Center, Temple, TX 76508, USA; (S.G.J.); (G.P.S.)
| | - Gregory P. Swanson
- Baylor Scott & White Medical Center, Temple, TX 76508, USA; (S.G.J.); (G.P.S.)
| | - Susan M. Bailey
- Department of Environmental and Radiological Health Sciences, Colorado State University, Fort Collins, CO 80523, USA; (J.J.L.); (M.J.M.); (A.M.L.); (L.E.T.)
- Cell and Molecular Biology Program, Colorado State University, Fort Collins, CO 80523, USA
| |
|
33
|
Extended Regression Modeling of the Toxicity of Phenol Derivatives to Tetrahymena pyriformis Using the Electronic-Structure Informatics Descriptor. JOURNAL OF COMPUTER AIDED CHEMISTRY 2021. [DOI: 10.2751/jcac.22.17] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/08/2022]
|
34
|
Choi ES, Sim JA, Na YG, Seon JK, Shin HD. Machine-learning algorithm that can improve the diagnostic accuracy of septic arthritis of the knee. Knee Surg Sports Traumatol Arthrosc 2021; 29:3142-3148. [PMID: 33452576 PMCID: PMC8458173 DOI: 10.1007/s00167-020-06418-2] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 09/16/2020] [Accepted: 12/10/2020] [Indexed: 11/05/2022]
Abstract
PURPOSE: Prompt diagnosis and treatment of septic arthritis of the knee is crucial. Nevertheless, the quality of evidence for the diagnosis of septic arthritis is low. In this study, the authors developed a machine learning-based diagnostic algorithm for septic arthritis of the native knee using clinical data from an emergency department and validated its diagnostic accuracy.
METHODS: Patients (n = 326) who underwent synovial fluid analysis at the emergency department for suspected septic arthritis of the knee were enrolled. Septic arthritis was diagnosed in 164 of the patients (50.3%) using modified Newman criteria. Clinical characteristics of septic and inflammatory arthritis were compared. Area under the receiver-operating characteristic (ROC) curve (AUC) statistics were used to evaluate the efficacy of each variable for the diagnosis of septic arthritis. The dataset was divided into independent training and test sets (comprising 80% and 20% of the data, respectively). Supervised machine-learning techniques (random forest and eXtreme Gradient Boosting, XGBoost) were applied to develop a diagnostic model using the training dataset. The test dataset was subsequently used to validate the developed model. The ROC curves of the machine-learning model and each variable were compared.
RESULTS: Synovial white blood cell (WBC) count was significantly higher in septic arthritis than in inflammatory arthritis in the multivariate analysis (P = 0.001). In the ROC comparison analysis, synovial WBC count yielded a significantly higher AUC than all other single variables (P = 0.002). The diagnostic model using the XGBoost algorithm yielded a higher AUC (0.831, 95% confidence interval 0.751-0.923) than synovial WBC count (0.740, 95% confidence interval 0.684-0.791; P = 0.033). The developed algorithm was deployed as a free-access web-based application (www.septicknee.com).
CONCLUSION: The diagnosis of septic arthritis of the knee might be improved using a machine learning-based prediction model.
LEVEL OF EVIDENCE: Diagnostic study, Level III (case-control study).
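The AUC values compared above have a simple rank interpretation: the AUC is the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The sketch below computes it by direct pair counting. The scores are hypothetical and the function name `auc_from_scores` is illustrative; this is not the study's code or data.

```python
def auc_from_scores(pos_scores, neg_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # a tie counts as half a correct ranking
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical model scores, invented for illustration only.
septic = [0.9, 0.8, 0.4]
inflammatory = [0.7, 0.3, 0.2]

auc = auc_from_scores(septic, inflammatory)  # 8 of the 9 pairs are ranked correctly
print(f"AUC = {auc:.3f}")
```

Pair counting is quadratic in the number of samples, which is acceptable for a cohort of a few hundred patients. Library implementations typically use a sort-based method instead.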
Affiliation(s)
- Eun-Seok Choi
- Department of Orthopaedic Surgery, Chungnam National University School of Medicine, Chungnam National University Hospital, 266 Munhwa-ro, Jung-gu, Daejeon, 35015, Republic of Korea
| | - Jae Ang Sim
- Department of Orthopaedic Surgery, Gachon University College of Medicine, Gil Medical Centre, Incheon, Republic of Korea
| | - Young Gon Na
- Department of Orthopaedic Surgery, CM Hospital, Seoul, Republic of Korea
| | - Jong-Keun Seon
- Department of Orthopaedic Surgery, Chonnam National University School of Medicine, Chonnam National University Hospital, Gwangju, Republic of Korea
| | - Hyun Dae Shin
- Department of Orthopaedic Surgery, Chungnam National University School of Medicine, Chungnam National University Hospital, 266 Munhwa-ro, Jung-gu, Daejeon, 35015, Republic of Korea
| |
|
35
|
Application of Improved LightGBM Model in Blood Glucose Prediction. APPLIED SCIENCES-BASEL 2020. [DOI: 10.3390/app10093227] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Indexed: 12/23/2022]
Abstract
In recent years, increasing social pressure and irregular schedules have led many people to develop unhealthy eating habits, resulting in a growing number of patients with diabetes, a disease that cannot currently be cured and can only be mitigated by early detection and prevention. Measuring the blood glucose of a large number of people during medical examinations requires considerable human and material resources, whereas an ensemble learning model can quickly predict blood glucose levels and assist doctors in treatment. Therefore, an improved LightGBM model, HY_LightGBM, is proposed for blood glucose prediction; its parameters are tuned with a Bayesian hyper-parameter optimization algorithm. Bayesian hyper-parameter optimization is a model-based method for finding the minimum of a function, used here to obtain the optimal parameters of the LightGBM model. Experiments demonstrated that the parameters obtained by the Bayesian hyper-parameter optimization algorithm are superior to those obtained by a genetic algorithm and by random search. The improved LightGBM model achieves a mean square error of 0.5961 in blood glucose prediction, with higher accuracy than the XGBoost and CatBoost models.
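To make the tuning problem concrete, the sketch below implements random search, one of the baselines the abstract compares against; it is not the Bayesian method itself and not HY_LightGBM. The objective `toy_validation_mse` is a hypothetical stand-in for "train a model with these hyper-parameters and return its validation mean square error", and the search ranges are assumptions. `learning_rate` and `num_leaves` are used only as example hyper-parameter names.

```python
import random


def toy_validation_mse(learning_rate, num_leaves):
    """Hypothetical objective standing in for a real train-and-validate run."""
    return (learning_rate - 0.1) ** 2 + ((num_leaves - 31) / 100) ** 2 + 0.5


def random_search(objective, n_trials=200, seed=0):
    """Sample hyper-parameters uniformly and keep the lowest-scoring trial."""
    rng = random.Random(seed)
    best_params, best_score = None, float("inf")
    for _ in range(n_trials):
        params = {
            "learning_rate": rng.uniform(0.01, 0.3),  # assumed search range
            "num_leaves": rng.randint(8, 128),        # assumed search range
        }
        score = objective(**params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score


best_params, best_score = random_search(toy_validation_mse)
print(best_params, round(best_score, 4))
```

Random search treats every trial independently. A model-based method such as Bayesian optimization instead uses the scores of earlier trials to choose where to sample next, which is the property the abstract credits for the better parameters.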
|