Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Zhu Y, Brettin T, Xia F, Partin A, Shukla M, Yoo H, Evrard YA, Doroshow JH, Stevens RL. Converting tabular data into images for deep learning with convolutional neural networks. Sci Rep 2021;11:11325. [PMID: 34059739 PMCID: PMC8166880 DOI: 10.1038/s41598-021-90923-y] [Citation(s) in RCA: 25] [Impact Index Per Article: 8.3] [Received: 02/01/2021] [Accepted: 05/17/2021] [Indexed: 12/11/2022] Open

Cited by Other Article(s)

Hwang J, Lee Y, Yoo SK, Kim JI. Image-based deep learning model using DNA methylation data predicts the origin of cancer of unknown primary. Neoplasia 2024;55:101021. [PMID: 38943996 PMCID: PMC11261876 DOI: 10.1016/j.neo.2024.101021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/30/2024] [Accepted: 06/24/2024] [Indexed: 07/01/2024]

Moulaei K, Afrash MR, Parvin M, Shadnia S, Rahimi M, Mostafazadeh B, Evini PET, Sabet B, Vahabi SM, Soheili A, Fathy M, Kazemi A, Khani S, Mortazavi SM, Hosseini SM. Explainable artificial intelligence (XAI) for predicting the need for intubation in methanol-poisoned patients: a study comparing deep and machine learning models. Sci Rep 2024;14:15751. [PMID: 38977750 PMCID: PMC11231277 DOI: 10.1038/s41598-024-66481-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/07/2024] [Accepted: 07/01/2024] [Indexed: 07/10/2024] Open

Abstract

The need for intubation in methanol-poisoned patients, if not predicted in time, can lead to irreparable complications and even death. Artificial intelligence (AI) techniques like machine learning (ML) and deep learning (DL) greatly aid in accurately predicting intubation needs for methanol-poisoned patients. So, our study aims to assess Explainable Artificial Intelligence (XAI) for predicting intubation necessity in methanol-poisoned patients, comparing deep learning and machine learning models. This study analyzed a dataset of 897 patient records from Loghman Hakim Hospital in Tehran, Iran, encompassing cases of methanol poisoning, including those requiring intubation (202 cases) and those not requiring it (695 cases). Eight established ML (SVM, XGB, DT, RF) and DL (DNN, FNN, LSTM, CNN) models were used. Techniques such as tenfold cross-validation and hyperparameter tuning were applied to prevent overfitting. The study also focused on interpretability through SHAP and LIME methods. Model performance was evaluated based on accuracy, specificity, sensitivity, F1-score, and ROC curve metrics. Among DL models, LSTM showed superior performance in accuracy (94.0%), sensitivity (99.0%), specificity (94.0%), and F1-score (97.0%). CNN led in ROC with 78.0%. For ML models, RF excelled in accuracy (97.0%) and specificity (100%), followed by XGB with sensitivity (99.37%), F1-score (98.27%), and ROC (96.08%). Overall, RF and XGB outperformed other models, with accuracy (97.0%) and specificity (100%) for RF, and sensitivity (99.37%), F1-score (98.27%), and ROC (96.08%) for XGB. ML models surpassed DL models across all metrics, with accuracies from 93.0% to 97.0% for DL and 93.0% to 99.0% for ML. Sensitivities ranged from 98.0% to 99.37% for DL and 93.0% to 99.0% for ML. DL models achieved specificities from 78.0% to 94.0%, while ML models ranged from 93.0% to 100%. F1-scores for DL were between 93.0% and 97.0%, and for ML between 96.0% and 98.27%. 
DL models scored ROC between 68.0% and 78.0%, while ML models ranged from 84.0% to 96.08%. Key features for predicting intubation necessity include GCS at admission, ICU admission, age, longer folic acid therapy duration, elevated BUN and AST levels, VBG_HCO3 at initial record, and hemodialysis presence. This study showcases XAI's effectiveness in predicting intubation necessity in methanol-poisoned patients. ML models, particularly RF and XGB, outperform DL counterparts, underscoring their potential for clinical decision-making.

Affiliation(s)

Khadijeh Moulaei: Department of Health Information Technology, School of Paramedical, Ilam University of Medical Sciences, Ilam, Iran
Mohammad Reza Afrash: Department of Artificial Intelligence, Smart University of Medical Sciences, Tehran, Iran
Mohammad Parvin: Department of Industrial and Systems Engineering, Auburn University, Auburn, AL, USA
Shahin Shadnia: Toxicological Research Center, Excellence Center of Clinical Toxicology, Department of Clinical Toxicology, Loghman Hakim Hospital, Shahid Beheshti University of Medical Sciences, Tehran, Iran
Mitra Rahimi: Toxicological Research Center, Excellence Center of Clinical Toxicology, Department of Clinical Toxicology, Loghman Hakim Hospital, Shahid Beheshti University of Medical Sciences, Tehran, Iran
Babak Mostafazadeh: Toxicological Research Center, Excellence Center of Clinical Toxicology, Department of Clinical Toxicology, Loghman Hakim Hospital, Shahid Beheshti University of Medical Sciences, Tehran, Iran
Peyman Erfan Talab Evini: Toxicological Research Center, Excellence Center of Clinical Toxicology, Department of Clinical Toxicology, Loghman Hakim Hospital, Shahid Beheshti University of Medical Sciences, Tehran, Iran
Babak Sabet: Department of Artificial Intelligence, Smart University of Medical Sciences, Tehran, Iran; Department of Surgery, Faculty of Medicine, Shahid Beheshti University of Medical Sciences, Tehran, Iran
Seyed Mohammad Vahabi: School of Medicine, Tehran University of Medical Sciences, Tehran, Iran
Amirali Soheili: Rajaie Cardiovascular Medical and Research Center, Iran University of Medical Sciences, Tehran, Iran
Mobin Fathy: School of Medicine, Tehran University of Medical Sciences, Tehran, Iran; Students Research Committee, School of Medicine, Shahid Beheshti University of Medical Sciences, Tehran, Iran
Arya Kazemi: Students Research Committee, School of Medicine, Shahid Beheshti University of Medical Sciences, Tehran, Iran
Sina Khani: Students Research Committee, School of Medicine, Shahid Beheshti University of Medical Sciences, Tehran, Iran
Seyed Mohammad Mortazavi: Students Research Committee, School of Medicine, Shahid Beheshti University of Medical Sciences, Tehran, Iran
Sayed Masoud Hosseini: Toxicological Research Center, Excellence Center of Clinical Toxicology, Department of Clinical Toxicology, Loghman Hakim Hospital, Shahid Beheshti University of Medical Sciences, Tehran, Iran

El-Melegy M, Mamdouh A, Ali S, Badawy M, El-Ghar MA, Alghamdi NS, El-Baz A. Prostate Cancer Diagnosis via Visual Representation of Tabular Data and Deep Transfer Learning. Bioengineering (Basel) 2024;11:635. [PMID: 39061717 PMCID: PMC11274351 DOI: 10.3390/bioengineering11070635] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/04/2024] [Revised: 06/10/2024] [Accepted: 06/17/2024] [Indexed: 07/28/2024] Open

Sharma A, López Y, Jia S, Lysenko A, Boroevich KA, Tsunoda T. Enhanced analysis of tabular data through Multi-representation DeepInsight. Sci Rep 2024;14:12851. [PMID: 38834670 DOI: 10.1038/s41598-024-63630-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/06/2023] [Accepted: 05/30/2024] [Indexed: 06/06/2024] Open

Borisov V, Leemann T, Seßler K, Haug J, Pawelczyk M, Kasneci G. Deep Neural Networks and Tabular Data: A Survey. IEEE Trans Neural Netw Learn Syst 2024;35:7499-7519. [PMID: 37015381 DOI: 10.1109/tnnls.2022.3229161] [Citation(s) in RCA: 21] [Impact Index Per Article: 21.0] [Indexed: 06/19/2023]

Abstract

Heterogeneous tabular data are the most commonly used form of data and are essential for numerous critical and computationally demanding applications. On homogeneous datasets, deep neural networks have repeatedly shown excellent performance and have therefore been widely adopted. However, their adaptation to tabular data for inference or data generation tasks remains highly challenging. To facilitate further progress in the field, this work provides an overview of state-of-the-art deep learning methods for tabular data. We categorize these methods into three groups: data transformations, specialized architectures, and regularization models. For each of these groups, our work offers a comprehensive overview of the main approaches. Moreover, we discuss deep learning approaches for generating tabular data and also provide an overview over strategies for explaining deep models on tabular data. Thus, our first contribution is to address the main research streams and existing methodologies in the mentioned areas while highlighting relevant challenges and open research questions. Our second contribution is to provide an empirical comparison of traditional machine learning methods with 11 deep learning approaches across five popular real-world tabular datasets of different sizes and with different learning objectives. Our results, which we have made publicly available as competitive benchmarks, indicate that algorithms based on gradient-boosted tree ensembles still mostly outperform deep learning models on supervised learning tasks, suggesting that the research progress on competitive deep learning models for tabular data is stagnating. To the best of our knowledge, this is the first in-depth overview of deep learning approaches for tabular data; as such, this work can serve as a valuable starting point to guide researchers and practitioners interested in deep learning with tabular data.

Tang X, Prodduturi N, Thompson KJ, Weinshilboum RM, O'Sullivan CC, Boughey JC, Tizhoosh H, Klee EW, Wang L, Goetz MP, Suman V, Kalari KR. OmicsFootPrint: a framework to integrate and interpret multi-omics data using circular images and deep neural networks. bioRxiv 2024:2024.03.21.586001. [PMID: 38585820 PMCID: PMC10996492 DOI: 10.1101/2024.03.21.586001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/09/2024]

Sharma A, Lysenko A, Jia S, Boroevich KA, Tsunoda T. Advances in AI and machine learning for predictive medicine. J Hum Genet 2024:10.1038/s10038-024-01231-y. [PMID: 38424184 DOI: 10.1038/s10038-024-01231-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2023] [Revised: 02/04/2024] [Accepted: 02/12/2024] [Indexed: 03/02/2024]

Yin AA, Zhang X, He YL, Zhao JJ, Zhang X, Fei Z, Lin W, Song BQ. Machine learning prediction models for in-hospital postoperative functional outcome after moderate-to-severe traumatic brain injury. Eur J Trauma Emerg Surg 2024:10.1007/s00068-023-02434-2. [PMID: 38355915 DOI: 10.1007/s00068-023-02434-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/19/2023] [Accepted: 12/28/2023] [Indexed: 02/16/2024]

Abstract

AIM

This study aims to utilize machine learning (ML) and logistic regression (LR) models to predict surgical outcomes among patients with traumatic brain injury (TBI) based on admission examination, assisting in making optimal surgical treatment decisions for these patients.

METHOD

We conducted a retrospective review of patients hospitalized in our department for moderate-to-severe TBI. Patients admitted between October 2011 and October 2022 were assigned to the training set, while patients admitted between November 2022 and May 2023 were designated as the external validation set. Five ML algorithms and an LR model were employed to predict the postoperative Glasgow Outcome Scale (GOS) status at discharge using clinical and routine blood data collected upon admission. The SHapley Additive exPlanations (SHAP) plot was utilized for interpreting the models.

RESULTS

A total of 416 patients were included in this study, and they were divided into the training set (n = 396) and the external validation set (n = 47). The ML models, using both clinical and routine blood data, were able to predict postoperative GOS outcomes with area under the curve (AUC) values ranging from 0.860 to 0.900 during the internal cross-validation and from 0.801 to 0.890 during the external validation. In contrast, the LR model had the lowest AUC values during the internal and external validation (0.844 and 0.567, respectively). When blood data was not available, the ML models achieved AUCs of 0.849 to 0.870 during the internal cross-validation and 0.714 to 0.861 during the external validation. Similarly, the LR model had the lowest AUC values (0.821 and 0.638, respectively). Through repeated cross-validation analysis, we found that routine blood data had a significant association with higher mean AUC values in all ML and LR models. The SHAP plot was used to visualize the contributions of all predictors and highlighted the significance of blood data in the lightGBM model.

CONCLUSION

The study concluded that ML models could provide rapid and accurate predictions for postoperative GOS outcomes at discharge following moderate-to-severe TBI. The study also highlighted the crucial role of routine blood tests in improving such predictions, and may contribute to the optimization of surgical treatment decision-making for patients with TBI.

Sha Y, Meng W, Luo G, Zhai X, Tong HHY, Wang Y, Li K. MetDIT: Transforming and Analyzing Clinical Metabolomics Data with Convolutional Neural Networks. Anal Chem 2024. [PMID: 38324756 DOI: 10.1021/acs.analchem.3c04607] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/09/2024]

Ravaee H, Manshaei MH, Safayani M, Sartakhti JS. Intelligent phenotype-detection and gene expression profile generation with generative adversarial networks. J Theor Biol 2024;577:111636. [PMID: 37944593 DOI: 10.1016/j.jtbi.2023.111636] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/16/2023] [Revised: 08/11/2023] [Accepted: 10/05/2023] [Indexed: 11/12/2023]

Branson N, Cutillas PR, Bessant C. Comparison of multiple modalities for drug response prediction with learning curves using neural networks and XGBoost. Bioinform Adv 2023;4:vbad190. [PMID: 38282976 PMCID: PMC10812874 DOI: 10.1093/bioadv/vbad190] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/22/2023] [Revised: 12/19/2023] [Accepted: 12/22/2023] [Indexed: 01/30/2024]

Narykov O, Zhu Y, Brettin T, Evrard YA, Partin A, Shukla M, Xia F, Clyde A, Vasanthakumari P, Doroshow JH, Stevens RL. Integration of Computational Docking into Anti-Cancer Drug Response Prediction Models. Cancers (Basel) 2023;16:50. [PMID: 38201477 PMCID: PMC10777918 DOI: 10.3390/cancers16010050] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 11/03/2023] [Revised: 12/01/2023] [Accepted: 12/07/2023] [Indexed: 01/12/2024] Open

Affiliation(s)

Oleksandr Narykov: Computing, Environment and Life Sciences, Argonne National Laboratory, Lemont, IL 60439, USA
Yitan Zhu: Computing, Environment and Life Sciences, Argonne National Laboratory, Lemont, IL 60439, USA
Thomas Brettin: Computing, Environment and Life Sciences, Argonne National Laboratory, Lemont, IL 60439, USA
Yvonne A. Evrard: Leidos Biomedical Research, Frederick National Laboratory for Cancer Research, Frederick, MD 21702, USA
Alexander Partin: Computing, Environment and Life Sciences, Argonne National Laboratory, Lemont, IL 60439, USA
Maulik Shukla: Computing, Environment and Life Sciences, Argonne National Laboratory, Lemont, IL 60439, USA
Fangfang Xia: Computing, Environment and Life Sciences, Argonne National Laboratory, Lemont, IL 60439, USA
Austin Clyde: Computing, Environment and Life Sciences, Argonne National Laboratory, Lemont, IL 60439, USA; Department of Computer Science, The University of Chicago, Chicago, IL 60637, USA
Priyanka Vasanthakumari: Computing, Environment and Life Sciences, Argonne National Laboratory, Lemont, IL 60439, USA
James H. Doroshow: Developmental Therapeutics Branch, National Cancer Institute, Bethesda, MD 20892, USA
Rick L. Stevens: Computing, Environment and Life Sciences, Argonne National Laboratory, Lemont, IL 60439, USA; Department of Computer Science, The University of Chicago, Chicago, IL 60637, USA

Medeiros Neto L, Rogerio da Silva Neto S, Endo PT. A comparative analysis of converters of tabular data into image for the classification of Arboviruses using Convolutional Neural Networks. PLoS One 2023;18:e0295598. [PMID: 38064477 PMCID: PMC10707680 DOI: 10.1371/journal.pone.0295598] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/01/2023] [Accepted: 11/24/2023] [Indexed: 12/18/2023] Open

Bazgir O, Lu J. REFINED-CNN framework for survival prediction with high-dimensional features. iScience 2023;26:107627. [PMID: 37664631 PMCID: PMC10474067 DOI: 10.1016/j.isci.2023.107627] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/09/2023] [Revised: 06/21/2023] [Accepted: 08/10/2023] [Indexed: 09/05/2023] Open

Partin A, Brettin T, Zhu Y, Dolezal JM, Kochanny S, Pearson AT, Shukla M, Evrard YA, Doroshow JH, Stevens RL. Data augmentation and multimodal learning for predicting drug response in patient-derived xenografts from gene expressions and histology images. Front Med (Lausanne) 2023;10:1058919. [PMID: 36960342 PMCID: PMC10027779 DOI: 10.3389/fmed.2023.1058919] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/30/2022] [Accepted: 02/10/2023] [Indexed: 03/09/2023] Open

Feed-Forward Deep Neural Network (FFDNN)-Based Deep Features for Static Malware Detection. Int J Intell Syst 2023. [DOI: 10.1155/2023/9544481] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/22/2023]

Partin A, Brettin TS, Zhu Y, Narykov O, Clyde A, Overbeek J, Stevens RL. Deep learning methods for drug response prediction in cancer: Predominant and emerging trends. Front Med (Lausanne) 2023;10:1086097. [PMID: 36873878 PMCID: PMC9975164 DOI: 10.3389/fmed.2023.1086097] [Citation(s) in RCA: 19] [Impact Index Per Article: 19.0] [Received: 11/01/2022] [Accepted: 01/23/2023] [Indexed: 02/17/2023] Open

Sharma A, Lysenko A, Boroevich KA, Tsunoda T. DeepInsight-3D architecture for anti-cancer drug response prediction with deep-learning on multi-omics. Sci Rep 2023;13:2483. [PMID: 36774402 PMCID: PMC9922304 DOI: 10.1038/s41598-023-29644-3] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 11/21/2022] [Accepted: 02/08/2023] [Indexed: 02/13/2023] Open

Silva Rocha ED, de Morais Melo FL, de Mello MEF, Figueiroa B, Sampaio V, Endo PT. On usage of artificial intelligence for predicting mortality during and post-pregnancy: a systematic review of literature. BMC Med Inform Decis Mak 2022;22:334. [PMID: 36536413 PMCID: PMC9764498 DOI: 10.1186/s12911-022-02082-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/13/2022] [Accepted: 12/13/2022] [Indexed: 12/23/2022] Open

Kim T, Lee SJ, Jang T. Application of several machine learning algorithms for the prediction of afatinib treatment outcome in advanced-stage EGFR-mutated non-small-cell lung cancer. Thorac Cancer 2022;13:3353-3361. [PMID: 36278315 PMCID: PMC9715822 DOI: 10.1111/1759-7714.14694] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/20/2022] [Revised: 09/28/2022] [Accepted: 09/30/2022] [Indexed: 01/09/2023] Open

Abstract

BACKGROUND

The present study aimed to evaluate the performance of several machine learning (ML) algorithms in predicting 1-year afatinib continuation and 2-year survival after afatinib initiation and to identify the differences in survival outcomes between ML-classified strata.

METHODS

Data that were also used in the RESET study were retrospectively collected from 16 hospitals in South Korea. A stratified random sampling method was applied to split the data into training and test sets (70:30 split ratio). Clinical information, such as age, sex, tumor stage, smoking, performance status, metastasis, type of metastasis, and dose adjustment, together with pathologic information on EGFR mutations, was used as input. Training was performed using eight ML algorithms: logistic regression, decision tree, deep neural network, random forest, support vector machine, boosting, bagging, and the naïve Bayes classifier. The model performance was assessed based on sensitivity, specificity, and accuracy. Area under the receiver operating characteristic curve (AUC) was calculated and compared between the ML models using DeLong's test. A Kaplan-Meier (KM) curve was used to visualize the identified strata obtained from the ML models.

RESULTS

No significant differences in the input variables were observed between the training and test datasets. The best-performing models were support vector machine in predicting 1-year afatinib continuation (AUC 0.626) and decision tree in 2-year survival after afatinib start (AUC 0.644), although the performances of the ML models were comparable and did not display any predictive roles. KM analysis and log-rank test revealed significant differences between the strata identified from the ML model (p < 0.001) in terms of both time-on-treatment (TOT) and overall survival (OS).

CONCLUSION

The ML models in our study showed no discernible role in predicting afatinib-related outcomes, although the identified strata revealed different TOT and OS in the KM analysis. This implies the strength of ML in predicting the survival outcome, as well as the limitation of electronic medical record-based variables in ML algorithms. Careful consideration of variable inclusion is likely to improve the general model performance.

Image-Based Approach to Intrusion Detection in Cyber-Physical Objects. Information 2022. [DOI: 10.3390/info13120553] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/27/2022] Open

Explainable artificial intelligence through graph theory by generalized social network analysis-based classifier. Sci Rep 2022;12:15210. [PMID: 36075941 PMCID: PMC9458666 DOI: 10.1038/s41598-022-19419-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/13/2022] [Accepted: 08/29/2022] [Indexed: 12/05/2022] Open

Abstract

We propose a new type of supervised visual machine learning classifier, GSNAc, based on graph theory and social network analysis techniques. In a previous study, we employed social network analysis techniques and introduced a novel classification model (called Social Network Analysis-based Classifier—SNAc) which efficiently works with time-series numerical datasets. In this study, we have extended SNAc to work with any type of tabular data by showing its classification efficiency on a broader collection of datasets that may contain numerical and categorical features. This version of GSNAc simply works by transforming traditional tabular data into a network where samples of the tabular dataset are represented as nodes and similarities between the samples are reflected as edges connecting the corresponding nodes. The raw network graph is further simplified and enriched by its edge space to extract a visualizable ‘graph classifier model—GCM’. The concept of the GSNAc classification model relies on the study of node similarities over network graphs. In the prediction step, the GSNAc model maps test nodes into GCM, and evaluates their average similarity to classes by employing vectorial and topological metrics. The novel side of this research lies in transforming multidimensional data into a 2D visualizable domain. This is realized by converting a conventional dataset into a network of ‘samples’ and predicting classes after a careful and detailed network analysis. We exhibit the classification performance of GSNAc as an effective classifier by comparing it with several well-established machine learning classifiers using some popular benchmark datasets. GSNAc has demonstrated superior or comparable performance compared to other classifiers. Additionally, it introduces a visually comprehensible process for the benefit of end-users. 
As a result, the spin-off contribution of GSNAc lies in the interpretability of the prediction task since the process is human-comprehensible; and it is highly visual.
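
The idea summarized in this abstract (samples become nodes, similarities become edges, and a test sample is assigned to the class it resembles most on average) can be illustrated with a toy sketch. This is not the GSNAc algorithm itself: the similarity function, the threshold, the function names, and the example data are all assumptions made for illustration, and the edge simplification and topological metrics mentioned in the abstract are omitted.

```python
import math
from collections import defaultdict

def similarity(a, b):
    """Similarity in (0, 1] derived from Euclidean distance (an assumed choice)."""
    return 1.0 / (1.0 + math.dist(a, b))

def build_sample_graph(samples, threshold):
    """Nodes are sample indices; an edge links two samples whose
    similarity meets the (hypothetical) threshold."""
    edges = {}
    for i in range(len(samples)):
        for j in range(i + 1, len(samples)):
            s = similarity(samples[i], samples[j])
            if s >= threshold:
                edges[(i, j)] = s
    return edges

def predict(samples, labels, query):
    """Assign the class whose training samples are, on average,
    most similar to the query sample."""
    per_class = defaultdict(list)
    for row, label in zip(samples, labels):
        per_class[label].append(similarity(row, query))
    return max(per_class, key=lambda c: sum(per_class[c]) / len(per_class[c]))

# Made-up tabular data: two numeric features per sample, two classes.
samples = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
labels = ["a", "a", "b", "b"]

edges = build_sample_graph(samples, threshold=0.5)
prediction = predict(samples, labels, (4.5, 5.5))
```

With this data, only samples of the same class end up connected, and the query point near the second cluster is assigned to class "b". Categorical features would need a different similarity measure than the Euclidean one assumed here.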

Cao H, Xie X, Shi J, Jiang G, Wang Y. Siamese Network-Based Transfer Learning Model to Predict Geogenic Contaminated Groundwaters. Environ Sci Technol 2022;56:11071-11079. [PMID: 35816418 DOI: 10.1021/acs.est.1c08682] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]

Stevens JM, Li J, Simmons EM, Wisniewski SR, DiSomma S, Fraunhoffer KJ, Geng P, Hao B, Jackson EW. Advancing Base Metal Catalysis through Data Science: Insight and Predictive Models for Ni-Catalyzed Borylation through Supervised Machine Learning. Organometallics 2022. [DOI: 10.1021/acs.organomet.2c00089] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]

Trapotsi MA, Hosseini-Gerami L, Bender A. Computational analyses of mechanism of action (MoA): data, methods and integration. RSC Chem Biol 2022;3:170-200. [PMID: 35360890 PMCID: PMC8827085 DOI: 10.1039/d1cb00069a] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 03/30/2021] [Accepted: 12/09/2021] [Indexed: 12/15/2022] Open

Tang H, Yu X, Liu R, Zeng T. Vec2image: an explainable artificial intelligence model for the feature representation and classification of high-dimensional biological data by vector-to-image conversion. Brief Bioinform 2022;23:6518046. [PMID: 35106553 PMCID: PMC8921615 DOI: 10.1093/bib/bbab584] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 09/14/2021] [Revised: 12/06/2021] [Accepted: 12/20/2021] [Indexed: 01/05/2023] Open

Abstract

Feature representation and discriminative learning are proven models and technologies in artificial intelligence fields; however, major challenges for machine learning on large biological datasets are learning an effective model with mechanistical explanation on the model determination and prediction. To satisfy such demands, we developed Vec2image, an explainable convolutional neural network framework for characterizing the feature engineering, feature selection and classifier training that is mainly based on the collaboration of principal component coordinate conversion, deep residual neural networks and embedded k-nearest neighbor representation on pseudo images of high-dimensional biological data, where the pseudo images represent feature measurements and feature associations simultaneously. Vec2image has achieved better performance compared with other popular methods and illustrated its efficiency on feature selection in cell marker identification from tissue-specific single-cell datasets. In particular, in a case study on type 2 diabetes (T2D) by multiple human islet scRNA-seq datasets, Vec2image first displayed robust performance on T2D classification model building across different datasets, then a specific Vec2image model was trained to accurately recognize the cell state and efficiently rank feature genes relevant to T2D which uncovered potential T2D cellular pathogenesis; and next the cell activity changes, cell composition imbalances and cell–cell communication dysfunctions were associated to our finding T2D feature genes from both population-shared and individual-specific perspectives. Collectively, Vec2image is a new and efficient explainable artificial intelligence methodology that can be widely applied in human-readable classification and prediction on the basis of pseudo image representation of biological deep sequencing data.
