151
Feng Y, McGuire N, Walton A, Fox S, Papa A, Lakhani SR, McCart Reed AE. Predicting breast cancer-specific survival in metaplastic breast cancer patients using machine learning algorithms. J Pathol Inform 2023; 14:100329. [PMID: 37664452 PMCID: PMC10470383 DOI: 10.1016/j.jpi.2023.100329] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/11/2023] [Revised: 08/03/2023] [Accepted: 08/04/2023] [Indexed: 09/05/2023]
Abstract
Metaplastic breast cancer (MpBC) is a rare and aggressive subtype of breast cancer, with data emerging on prognostic factors and survival prediction. This study aimed to develop machine learning models to predict breast cancer-specific survival (BCSS) in MpBC patients, utilizing a dataset of 160 patients with clinical, pathological, and biological variables. An in-depth variable selection process was carried out using gain ratio and correlation-based methods, resulting in 10 variables for model estimation. Five models (decision tree with bagging, logistic regression, multilayer perceptron, naïve Bayes, and random forest) were evaluated using 10-fold cross-validation. Despite the constraints posed by the absence of therapeutic information, the random forest model exhibited the highest performance in predicting BCSS, with an ROC area of 0.808. This study emphasizes the potential of machine learning algorithms in predicting prognosis for complex and heterogeneous cancer subtypes using clinical datasets, and their potential to contribute to patient management. Further research that incorporates additional variables, such as treatment response, and more advanced machine learning techniques will likely enhance the predictive power of MpBC prognostic models.
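As a rough illustration of the evaluation protocol named in this abstract, the sketch below splits 160 sample indices into 10 folds and computes an ROC area from scores. It is not the authors' pipeline: the function names `k_fold_indices` and `roc_auc`, the shuffling seed, and the toy labels and scores are assumptions made for the example.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and split them into k near-equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def roc_auc(labels, scores):
    """ROC area: chance that a random positive is scored above a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    # ties between a positive and a negative count as half a win
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

folds = k_fold_indices(160, k=10)
print([len(fold) for fold in folds])  # ten folds of 16 indices each

# Toy labels and scores (hypothetical, not study data)
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

In a 10-fold run, each fold is held out once while a model is fitted on the other nine, so every patient contributes exactly one out-of-sample prediction to the ROC area.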
Affiliation(s)
- Yufan Feng
- UQ Centre for Clinical Research, Faculty of Medicine, The University of Queensland, Brisbane 4029, Australia
- Natasha McGuire
- UQ Centre for Clinical Research, Faculty of Medicine, The University of Queensland, Brisbane 4029, Australia
- Alexandra Walton
- UQ Centre for Clinical Research, Faculty of Medicine, The University of Queensland, Brisbane 4029, Australia
- Pathology Queensland, The Royal Brisbane and Women’s Hospital, Brisbane 4029, Australia
- Stephen Fox
- Peter MacCallum Cancer Centre and University of Melbourne, Melbourne 3000, Australia
- Antonella Papa
- Monash Biomedicine Discovery Institute, Monash University, Melbourne 3800, Australia
- Sunil R. Lakhani
- UQ Centre for Clinical Research, Faculty of Medicine, The University of Queensland, Brisbane 4029, Australia
- Pathology Queensland, The Royal Brisbane and Women’s Hospital, Brisbane 4029, Australia
- Amy E. McCart Reed
- UQ Centre for Clinical Research, Faculty of Medicine, The University of Queensland, Brisbane 4029, Australia
152
Jiang L, Xu C, Bai Y, Liu A, Gong Y, Wang YP, Deng HW. AutoSurv: Interpretable Deep Learning Framework for Cancer Survival Analysis Incorporating Clinical and Multi-Omics Data. Research Square 2023:rs.3.rs-2486756. [PMID: 37609286 PMCID: PMC10441464 DOI: 10.21203/rs.3.rs-2486756/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/24/2023]
Abstract
Accurate prognosis for cancer patients can provide critical information for optimizing treatment plans and improving life quality. Combining omics data and demographic/clinical information can offer a more comprehensive view of cancer prognosis than using omics or clinical data alone and can reveal the underlying disease mechanisms at the molecular level. In this study, we developed a novel deep learning framework to extract information from high-dimensional gene expression and miRNA expression data and conduct prognosis prediction for breast cancer and ovarian cancer patients. Our model achieved significantly better prognosis prediction than the conventional Cox Proportional Hazard model and other competitive deep learning approaches in various settings. Moreover, an interpretation approach was applied to tackle the "black-box" nature of deep neural networks and we identified features (i.e., genes, miRNA, demographic/clinical variables) that made important contributions to distinguishing predicted high- and low-risk patients. The identified associations were partially supported by previous studies.
Affiliation(s)
- Lindong Jiang
- Tulane Center of Biomedical Informatics and Genomics, School of Medicine, Tulane University, New Orleans, LA, 70112
- Chao Xu
- Department of Biostatistics and Epidemiology, University of Oklahoma Health Sciences Center, Oklahoma City, OK, 73104
- Yuntong Bai
- Department of Biomedical Engineering, School of Science and Engineering, Tulane University, New Orleans, LA, 70118
- Anqi Liu
- Tulane Center of Biomedical Informatics and Genomics, School of Medicine, Tulane University, New Orleans, LA, 70112
- Yun Gong
- Tulane Center of Biomedical Informatics and Genomics, School of Medicine, Tulane University, New Orleans, LA, 70112
- Yu-Ping Wang
- Department of Biomedical Engineering, School of Science and Engineering, Tulane University, New Orleans, LA, 70118
- Hong-Wen Deng
- Tulane Center of Biomedical Informatics and Genomics, School of Medicine, Tulane University, New Orleans, LA, 70112
153
Sukhadia SS, Muller KE, Workman AA, Nagaraj SH. Machine Learning-Based Prediction of Distant Recurrence in Invasive Breast Carcinoma Using Clinicopathological Data: A Cross-Institutional Study. Cancers (Basel) 2023; 15:3960. [PMID: 37568776 PMCID: PMC10416932 DOI: 10.3390/cancers15153960] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/09/2023] [Revised: 07/19/2023] [Accepted: 07/19/2023] [Indexed: 08/13/2023]
Abstract
Breast cancer is the most common type of cancer worldwide. Alarmingly, approximately 30% of breast cancer cases result in disease recurrence at distant organs after treatment. Distant recurrence is more common in some subtypes such as invasive breast carcinoma (IBC). While clinicians have utilized several clinicopathological measurements to predict distant recurrences in IBC, no studies have predicted distant recurrences by combining clinicopathological evaluations of IBC tumors pre- and post-therapy with machine learning (ML) models. The goal of our study was to determine whether classification-based ML techniques could predict distant recurrences in IBC patients using key clinicopathological measurements, including pathological staging of the tumor and surrounding lymph nodes assessed both pre- and post-neoadjuvant therapy, response to therapy via standard-of-care (SOC) imaging, and binary status of adjuvant therapy administered to patients. We trained and tested four clinicopathological ML models using a dataset (144 and 17 patients for training and testing, respectively) from Duke University and validated the best-performing model using an external dataset (8 patients) from Dartmouth Hitchcock Medical Center. The random forest model performed better than the C-support vector classifier, multilayer perceptron, and logistic regression models, yielding AUC values of 1.0 in the testing set and 0.75 in the validation set (p < 0.002) across both institutions, thereby demonstrating the cross-institutional portability and validity of ML models in the field of clinical research in cancer. The top-ranking clinicopathological measurements impacting the prediction of distant recurrences in IBC were identified as tumor response to neoadjuvant therapy as evaluated via SOC imaging and pathology, which included tumor as well as node staging.
Affiliation(s)
- Shrey S. Sukhadia
- Centre for Genomics and Personalised Health, Queensland University of Technology, Brisbane, QLD 4059, Australia
- Department of Pathology and Laboratory Medicine, Dartmouth Hitchcock Medical Center, Lebanon, NH 03766, USA; (K.E.M.); (A.A.W.)
- Kristen E. Muller
- Department of Pathology and Laboratory Medicine, Dartmouth Hitchcock Medical Center, Lebanon, NH 03766, USA; (K.E.M.); (A.A.W.)
- Adrienne A. Workman
- Department of Pathology and Laboratory Medicine, Dartmouth Hitchcock Medical Center, Lebanon, NH 03766, USA; (K.E.M.); (A.A.W.)
- Shivashankar H. Nagaraj
- Centre for Genomics and Personalised Health, Queensland University of Technology, Brisbane, QLD 4059, Australia
154
Timilsina M, Fey D, Buosi S, Janik A, Costabello L, Carcereny E, Abreu DR, Cobo M, Castro RL, Bernabé R, Minervini P, Torrente M, Provencio M, Nováček V. Synergy between imputed genetic pathway and clinical information for predicting recurrence in early stage non-small cell lung cancer. J Biomed Inform 2023; 144:104424. [PMID: 37352900 DOI: 10.1016/j.jbi.2023.104424] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/15/2022] [Revised: 06/06/2023] [Accepted: 06/11/2023] [Indexed: 06/25/2023]
Abstract
OBJECTIVE Lung cancer exhibits unpredictable recurrence in low-stage tumors and variable responses to different therapeutic interventions. Predicting relapse in early-stage lung cancer can facilitate precision medicine and improve patient survivability. While existing machine learning models rely on clinical data, incorporating genomic information could enhance their efficiency. This study aims to impute and integrate specific types of genomic data with clinical data to improve the accuracy of machine learning models for predicting relapse in early-stage, non-small cell lung cancer patients. METHODS The study utilized a publicly available TCGA lung cancer cohort and imputed genetic pathway scores into the Spanish Lung Cancer Group (SLCG) data, specifically in 1348 early-stage patients. Initially, tumor recurrence was predicted without imputed pathway scores. Subsequently, the SLCG data were augmented with pathway scores imputed from TCGA. The integrative approach aimed to enhance relapse risk prediction performance. RESULTS The integrative approach achieved improved relapse risk prediction with the following evaluation metrics: an area under the precision-recall curve (PR-AUC) score of 0.75, an area under the ROC (ROC-AUC) score of 0.80, an F1 score of 0.61, and a Precision of 0.80. The prediction explanation model SHAP (SHapley Additive exPlanations) was employed to explain the machine learning model's predictions. CONCLUSION We conclude that our explainable predictive model is a promising tool for oncologists that addresses an unmet clinical need of post-treatment patient stratification based on the relapse risk while also improving the predictive power by incorporating proxy genomic data not available for specific patients.
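The F1 score and precision quoted in this abstract jointly imply a recall, because F1 is the harmonic mean of the two. The sketch below works through that arithmetic. It assumes both figures were measured at the same decision threshold, which the abstract does not state; the helper name `implied_recall` is introduced only for this example.

```python
def implied_recall(f1, precision):
    """Solve F1 = 2 * P * R / (P + R) for the recall R."""
    return f1 * precision / (2 * precision - f1)

# Values quoted in the abstract: F1 = 0.61, Precision = 0.80
recall = implied_recall(f1=0.61, precision=0.80)
print(round(recall, 2))  # 0.49

# Round trip: the harmonic mean of precision and recall recovers the F1 score
f1_check = 2 * 0.80 * recall / (0.80 + recall)
print(round(f1_check, 2))  # 0.61
```

Under that assumption, roughly half of the true relapses would be flagged, while about 80% of flagged patients would actually relapse.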
Affiliation(s)
- Mohan Timilsina
- Data Science Institute, Insight Centre for Data Analytics, University of Galway, Ireland.
- Dirk Fey
- Systems Biology Ireland, University College Dublin, Ireland.
- Samuele Buosi
- Data Science Institute, Insight Centre for Data Analytics, University of Galway, Ireland.
- Enric Carcereny
- Catalan Institute of Oncology, Hospital Universitari Germans Trias i Pujol, B-ARGO, IGTP, Badalona, Spain.
- Manuel Cobo
- Medical Oncology Intercenter Unit, Regional and Virgen de la Victoria University Hospitals, IBIMA, Málaga, Spain.
- Reyes Bernabé
- Hospital Universitario Virgen del Rocio, Sevilla, Spain.
- Maria Torrente
- Medical Oncology Department, Hospital Universitario Puerta de Hierro Majadahonda, Madrid, Spain.
- Mariano Provencio
- Medical Oncology Department, Hospital Universitario Puerta de Hierro Majadahonda, Madrid, Spain.
- Vít Nováček
- Data Science Institute, Insight Centre for Data Analytics, University of Galway, Ireland; Faculty of Informatics, Masaryk University Brno, Czech Republic; Masaryk Memorial Cancer Institute, Brno, Czech Republic.
155
Gygi JP, Kleinstein SH, Guan L. Predictive overfitting in immunological applications: Pitfalls and solutions. Hum Vaccin Immunother 2023; 19:2251830. [PMID: 37697867 PMCID: PMC10498807 DOI: 10.1080/21645515.2023.2251830] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/01/2023] [Revised: 07/27/2023] [Accepted: 08/21/2023] [Indexed: 09/13/2023]
Abstract
Overfitting describes the phenomenon where a highly predictive model on the training data generalizes poorly to future observations. It is a common concern when applying machine learning techniques to contemporary medical applications, such as predicting vaccination response and disease status in infectious disease or cancer studies. This review examines the causes of overfitting and offers strategies to counteract it, focusing on model complexity reduction, reliable model evaluation, and harnessing data diversity. Through discussion of the underlying mathematical models and illustrative examples using both synthetic data and published real datasets, our objective is to equip analysts and bioinformaticians with the knowledge and tools necessary to detect and mitigate overfitting in their research.
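The definition of overfitting given here can be made concrete with a small synthetic experiment. The sketch below is not taken from the review: it assumes purely random features and coin-flip labels, and uses a 1-nearest-neighbour classifier (the names `make_data`, `predict_1nn`, and `accuracy` are illustrative) to show a model that is perfect on its training data yet no better than chance on new observations.

```python
import random

rng = random.Random(0)

def make_data(n):
    """Random features with coin-flip labels, so there is no signal to learn."""
    return [([rng.random() for _ in range(5)], rng.randint(0, 1)) for _ in range(n)]

def predict_1nn(train, x):
    """Return the label of the closest training point (squared Euclidean distance)."""
    nearest = min(train, key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], x)))
    return nearest[1]

def accuracy(train, data):
    """Fraction of points in data whose label matches the 1-NN prediction."""
    return sum(predict_1nn(train, x) == y for x, y in data) / len(data)

train, test = make_data(200), make_data(200)
train_acc = accuracy(train, train)  # each training point is its own nearest neighbour
test_acc = accuracy(train, test)    # held-out points expose the lack of real signal
print(train_acc, test_acc)
```

Training accuracy is 1.0 by construction, while held-out accuracy hovers around 0.5. This is why evaluation on data not used for fitting is needed to detect overfitting.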
Affiliation(s)
- Jeremy P. Gygi
- Program in Computational Biology & Bioinformatics, Yale University, New Haven, CT, USA
- Steven H. Kleinstein
- Program in Computational Biology & Bioinformatics, Yale University, New Haven, CT, USA
- Department of Pathology, Yale School of Medicine, New Haven, CT, USA
- Department of Immunobiology, Yale School of Medicine, New Haven, CT, USA
- Leying Guan
- Program in Computational Biology & Bioinformatics, Yale University, New Haven, CT, USA
- Department of Biostatistics, Yale School of Public Health, New Haven, CT, USA
156
Mäkitie AA, Alabi RO, Ng SP, Takes RP, Robbins KT, Ronen O, Shaha AR, Bradley PJ, Saba NF, Nuyts S, Triantafyllou A, Piazza C, Rinaldo A, Ferlito A. Artificial Intelligence in Head and Neck Cancer: A Systematic Review of Systematic Reviews. Adv Ther 2023; 40:3360-3380. [PMID: 37291378 PMCID: PMC10329964 DOI: 10.1007/s12325-023-02527-9] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 03/08/2023] [Accepted: 04/20/2023] [Indexed: 06/10/2023]
Abstract
INTRODUCTION Several studies have emphasized the potential of artificial intelligence (AI) and its subfields, such as machine learning (ML), as emerging and feasible approaches to optimize patient care in oncology. As a result, clinicians and decision-makers are faced with a plethora of reviews regarding the state of the art of applications of AI for head and neck cancer (HNC) management. This article provides an analysis of systematic reviews on the current status, and of the limitations of the application of AI/ML as adjunctive decision-making tools in HNC management. METHODS Electronic databases (PubMed, Medline via Ovid, Scopus, and Web of Science) were searched from inception until November 30, 2022. The study selection, searching and screening processes, inclusion, and exclusion criteria followed the Preferred Reporting Items for Systematic Review and Meta-Analysis (PRISMA) guidelines. A risk of bias assessment was conducted using a tailored and modified version of the Assessment of Systematic Review (AMSTAR-2) tool and quality assessment using the Risk of Bias in Systematic Reviews (ROBIS) guidelines. RESULTS Of the 137 search hits retrieved, 17 fulfilled the inclusion criteria. This analysis of systematic reviews revealed that the application of AI/ML as a decision aid in HNC management can be thematized as follows: (1) detection of precancerous and cancerous lesions within histopathologic slides; (2) prediction of the histopathologic nature of a given lesion from various sources of medical imaging; (3) prognostication; (4) extraction of pathological findings from imaging; and (5) different applications in radiation oncology. In addition, the challenges in implementation of AI/ML models for clinical evaluations include the lack of standardized methodological guidelines for the collection of clinical images, development of these models, reporting of their performance, external validation procedures, and regulatory frameworks. 
CONCLUSION At present, there is a paucity of evidence to suggest the adoption of these models in clinical practice due to the aforementioned limitations. Therefore, this manuscript highlights the need for development of standardized guidelines to facilitate the adoption and implementation of these models in the daily clinical practice. In addition, adequately powered, prospective, randomized controlled trials are urgently needed to further assess the potential of AI/ML models in real-world clinical settings for the management of HNC.
Affiliation(s)
- Antti A Mäkitie
- Department of Otorhinolaryngology-Head and Neck Surgery, Helsinki University Hospital, University of Helsinki, P.O. Box 263, 00029, HUS, Helsinki, Finland.
- Research Program in Systems Oncology, Faculty of Medicine, University of Helsinki, Helsinki, Finland.
- Division of Ear, Nose and Throat Diseases, Department of Clinical Sciences, Intervention and Technology, Karolinska Institute and Karolinska University Hospital, Stockholm, Sweden.
- Rasheed Omobolaji Alabi
- Research Program in Systems Oncology, Faculty of Medicine, University of Helsinki, Helsinki, Finland
- Department of Industrial Digitalization, School of Technology and Innovations, University of Vaasa, Vaasa, Finland
- Sweet Ping Ng
- Department of Radiation Oncology, Olivia Newton-John Cancer Wellness and Research Centre, Austin Health, Melbourne, Australia
- Department of Surgery, The University of Melbourne, Melbourne, Australia
- School of Cancer Medicine, La Trobe University, Melbourne, Australia
- School of Imaging and Radiation Sciences, Monash University, Melbourne, Australia
- Robert P Takes
- Department of Otolaryngology and Head and Neck Surgery, Radboud University Medical Center, Nijmegen, The Netherlands
- K Thomas Robbins
- Department of Otolaryngology Head Neck Surgery, SIU School of Medicine, Southern Illinois University, Springfield, IL, USA
- Ohad Ronen
- Department of Otolaryngology-Head and Neck Surgery, Galilee Medical Center Affiliated with Azrieli Faculty of Medicine, Bar Ilan University, Safed, Israel
- Ashok R Shaha
- Head and Neck Service, Department of Surgery, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Patrick J Bradley
- The University of Nottingham, Department of ORLHNS, Queens Medical Centre Campus, Nottingham University Hospital, Derby Road, Nottingham, NG7 2UH, UK
- Nabil F Saba
- Department of Hematology and Medical Oncology, The Winship Cancer Institute, Emory University, Atlanta, GA, USA
- Sandra Nuyts
- Laboratory of Experimental Radiotherapy, Department of Oncology, KU Leuven, 3000, Leuven, Belgium
- Department of Radiation Oncology, Leuven Cancer Institute, University Hospitals Leuven, 3000, Leuven, Belgium
- Asterios Triantafyllou
- Department of Pathology, Liverpool Clinical Laboratories, School of Dentistry, University of Liverpool, Liverpool, UK
- Cesare Piazza
- Unit of Otorhinolaryngology-Head and Neck Surgery, ASST Spedali Civili of Brescia, Brescia, Italy
- Department of Medical and Surgical Specialties, Radiological Sciences, and Public Health, School of Medicine, University of Brescia, Brescia, Italy
- Alfio Ferlito
- Coordinator of the International Head and Neck Scientific Group, Padua, Italy
157
Pan X, Feng T, Liu C, Savjani RR, Chin RK, Sharon Qi X. A survival prediction model via interpretable machine learning for patients with oropharyngeal cancer following radiotherapy. J Cancer Res Clin Oncol 2023; 149:6813-6825. [PMID: 36807760 DOI: 10.1007/s00432-023-04644-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/11/2022] [Accepted: 02/08/2023] [Indexed: 02/21/2023]
Abstract
PURPOSE To explore interpretable machine learning (ML) methods, with the aim of adding prognostic value, for predicting survival in patients with oropharyngeal cancer (OPC). METHODS A cohort of 427 OPC patients (training 341, test 86) from the TCIA database was analyzed. Radiomic features of the gross tumor volume (GTV), extracted from planning CT using Pyradiomics, and patient characteristics such as HPV p16 status were considered as potential predictors. A multi-level dimension reduction algorithm consisting of the least absolute shrinkage and selection operator (Lasso) and sequential floating backward selection (SFBS) was proposed to effectively remove redundant/irrelevant features. The interpretable model was constructed by quantifying the contribution of each feature to the extreme gradient boosting (XGBoost) decision using the SHapley Additive exPlanations (SHAP) algorithm. RESULTS The proposed Lasso-SFBS algorithm selected 14 features, and the prediction model based on this feature set achieved an area under the ROC curve (AUC) of 0.85 on the test dataset. The ranking of the contribution values calculated by SHAP shows that the top predictors most correlated with survival were ECOG performance status, wavelet-LLH_firstorder_Mean, chemotherapy, wavelet-LHL_glcm_InverseVariance, and tumor size. Patients who had chemotherapy, positive HPV p16 status, and lower ECOG performance status tended to have higher SHAP scores and longer survival; those with older age at diagnosis and a heavy drinking and smoking (pack-year) history tended to have lower SHAP scores and shorter survival. CONCLUSION We demonstrated the predictive value of combined patient characteristics and imaging features for the overall survival of OPC patients. The multi-level dimension reduction algorithm can reliably identify the most plausible predictors associated with overall survival. The interpretable patient-specific survival prediction model, capturing correlations between each predictor and clinical outcome, was developed to facilitate clinical decision-making for personalized treatment.
Affiliation(s)
- Xiaoying Pan
- School of Computer Science and Technology, Xi'an University of Posts and Telecommunications, Xi'an, 710121, China.
- Shaanxi Key Laboratory of Network Data Analysis and Intelligent Processing, Xi'an University of Posts and Telecommunications, Xi'an, 710121, China.
- Tianhao Feng
- School of Computer Science and Technology, Xi'an University of Posts and Telecommunications, Xi'an, 710121, China
- Shaanxi Key Laboratory of Network Data Analysis and Intelligent Processing, Xi'an University of Posts and Telecommunications, Xi'an, 710121, China
- Chen Liu
- School of Computer Science and Technology, Xi'an University of Posts and Telecommunications, Xi'an, 710121, China
- Shaanxi Key Laboratory of Network Data Analysis and Intelligent Processing, Xi'an University of Posts and Telecommunications, Xi'an, 710121, China
- Ricky R Savjani
- Department of Radiation Oncology, University of California Los Angeles, Los Angeles, CA, 90095, USA
- Robert K Chin
- Department of Radiation Oncology, University of California Los Angeles, Los Angeles, CA, 90095, USA
- X Sharon Qi
- Department of Radiation Oncology, University of California Los Angeles, Los Angeles, CA, 90095, USA
158
Fei H, Han X, Wang Y, Li S. Mining Prognostic Biomarkers of Thyroid Cancer Patients Based on the Immune-Related Genes and Development of a Reliable Prognostic Risk Model. Mediators Inflamm 2023; 2023:6503476. [PMID: 37554551 PMCID: PMC10406562 DOI: 10.1155/2023/6503476] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/03/2022] [Revised: 04/21/2023] [Accepted: 07/10/2023] [Indexed: 08/10/2023]
Abstract
PURPOSE Tumor immunity plays an essential role in the occurrence and development of thyroid cancer (THCA). The aim of this study is to establish an immune-related prognostic model for THCA patients by using immune-related genes (IRGs). METHODS The Wilcoxon test was used to screen the differentially expressed immune-related genes (DEIRGs) in THCA and normal tissues, and the DEIRGs related to prognosis were then identified using univariate Cox regression analysis. Using The Cancer Genome Atlas (TCGA) cohort, we developed a least absolute shrinkage and selection operator (LASSO) regression prognostic model and validated the predictive value of the model in internal (TCGA) and external (International Cancer Genome Consortium) cohorts, respectively. Finally, we analyzed the correlation among the prognostic model, clinical variables, and immune cell infiltration. RESULTS Eighty-two of 2,498 IRGs were differentially expressed between THCA and normal tissues, and 18 of them were related to prognosis. LASSO Cox regression analysis identified seven DEIRGs with the greatest prognostic value to construct the prognostic model. The risk model showed high predictive value for the survival of THCA in two independent cohorts. The risk score according to the risk model was positively associated with poor survival and the infiltration levels of immune cells, and it can evaluate the prognosis of THCA patients independent of any other clinicopathologic feature. The prognostic value and genetic alterations of the seven risk genes were evaluated separately. CONCLUSION Our study established and verified a dependable immune-related prognostic model for THCA. Both the identified IRGs and the immune-related risk model were clinically significant, which is conducive to promoting individualized immunotherapy against THCA.
Affiliation(s)
- Hongjun Fei
- Department of Reproductive Genetics, International Peace Maternity and Child Health Hospital, Shanghai Key Laboratory of Embryo Original Diseases, Shanghai Municipal Key Clinical Specialty, Shanghai Jiao Tong University School of Medicine, Shanghai 200030, China
- Xu Han
- Department of Reproductive Genetics, International Peace Maternity and Child Health Hospital, Shanghai Key Laboratory of Embryo Original Diseases, Shanghai Municipal Key Clinical Specialty, Shanghai Jiao Tong University School of Medicine, Shanghai 200030, China
- Yanlin Wang
- Department of Reproductive Genetics, International Peace Maternity and Child Health Hospital, Shanghai Key Laboratory of Embryo Original Diseases, Shanghai Municipal Key Clinical Specialty, Shanghai Jiao Tong University School of Medicine, Shanghai 200030, China
- Shuyuan Li
- Department of Reproductive Genetics, International Peace Maternity and Child Health Hospital, Shanghai Key Laboratory of Embryo Original Diseases, Shanghai Municipal Key Clinical Specialty, Shanghai Jiao Tong University School of Medicine, Shanghai 200030, China
159
Gao X, Bu H, Ge J, Gao X, Wang Y, Zhang Z, Wang L. A Comprehensive Analysis of the Prognostic, Immunological and Diagnostic Role of CCNF in Pan-cancer. J Cancer 2023; 14:2431-2442. [PMID: 37670965 PMCID: PMC10475360 DOI: 10.7150/jca.86597] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/29/2023] [Accepted: 07/17/2023] [Indexed: 09/07/2023]
Abstract
Background: Cyclin F (CCNF) is a pivotal member of the cell cycle protein family that also belongs to the F-box protein family and acts as a critical regulatory factor in cell cycle transition. Its heightened expression has been consistently identified across various cancer types, including breast, pancreatic, and colorectal cancer. Nonetheless, a comprehensive exploration of CCNF's involvement in pan-cancer remains lacking. Methods: This study collected transcriptomic data and clinical information from several databases, including The Cancer Genome Atlas (TCGA), Genotype-Tissue Expression (GTEx), and the BioGPS database. Employing bioinformatics methods, we investigated the potential oncogenic role of CCNF, utilizing various databases such as cBioPortal, Human Protein Atlas (HPA), TIMER2, UALCAN, GEPIA, GSCALite, and the CTD database. These analyses focused on exploring CCNF expression, prognosis, gene mutations, immune cell infiltration, DNA methylation levels, and targeted chemical drugs across different tumor types. Additionally, we obtained CCNF-related genes from the GeneMANIA and GEPIA databases and conducted GO and KEGG enrichment analyses to gain deeper insights into the biological processes associated with CCNF. Furthermore, we experimentally validated the differential expression of CCNF in normal human breast and breast cancer cell lines. Results: CCNF exhibited upregulation in the majority of cancer types, demonstrating early diagnostic potential in 15 cancers and prognostic implications for adverse outcomes across numerous malignancies. Furthermore, CCNF was found to be linked with markers of the tumor immune microenvironment in various cancers. Additionally, CCNF expression influenced genetic alterations in pan-cancer. Enrichment analysis revealed that CCNF primarily participates in crucial biological pathways such as the cell cycle, p53 signaling pathway, and cellular senescence pathways. RT-qPCR and WB assays further confirmed that CCNF expression was higher in human cancer cell lines than in normal cell lines. Conclusion: The underlying role and mechanism of CCNF in pan-cancer were elucidated through comprehensive bioinformatics analysis and experimental validation. CCNF holds promise as an invaluable early detection indicator and tumor biomarker, offering potential targets for tumor treatment and prevention.
Affiliation(s)
- Xiaofeng Gao
- School of Basic Medical Sciences, Xianning Medical College, Hubei University of Science and Technology, Xianning 437000, Hubei, PR China
- Medicine Research Institute /Hubei Key Laboratory of Diabetes and Angiopathy, Xianning Medical College, Hubei University of Science and Technology, Xianning 437000, Hubei, PR China
- Huitong Bu
- College of Biology, Hunan University, Hunan, Changsha, 410012, PR China
- Juanjuan Ge
- Medicine Research Institute /Hubei Key Laboratory of Diabetes and Angiopathy, Xianning Medical College, Hubei University of Science and Technology, Xianning 437000, Hubei, PR China
- Xuzheng Gao
- School of Basic Medical Sciences, Xianning Medical College, Hubei University of Science and Technology, Xianning 437000, Hubei, PR China
- Ying Wang
- School of Basic Medical Sciences, Xianning Medical College, Hubei University of Science and Technology, Xianning 437000, Hubei, PR China
- Zhenwang Zhang
- School of Basic Medical Sciences, Xianning Medical College, Hubei University of Science and Technology, Xianning 437000, Hubei, PR China
- Medicine Research Institute /Hubei Key Laboratory of Diabetes and Angiopathy, Xianning Medical College, Hubei University of Science and Technology, Xianning 437000, Hubei, PR China
| | - Long Wang
- School of Basic Medical Sciences, Xianning Medical College, Hubei University of Science and Technology, Xianning 437000, Hubei, PR China
- Medicine Research Institute /Hubei Key Laboratory of Diabetes and Angiopathy, Xianning Medical College, Hubei University of Science and Technology, Xianning 437000, Hubei, PR China
- School of Stomatology and Ophthalmology, Xianning Medical College, Hubei University of Science and Technology, Xianning 437000, Hubei, PR China
| |
|
160
|
Yarahmadi B, Hashemianzadeh SM, Milani Hosseini SMR. Machine-learning-based predictions of imprinting quality using ensemble and non-linear regression algorithms. Sci Rep 2023; 13:12111. [PMID: 37495673 PMCID: PMC10372080 DOI: 10.1038/s41598-023-39374-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 05/04/2023] [Accepted: 07/25/2023] [Indexed: 07/28/2023] Open
Abstract
Molecularly imprinted polymers (MIPs) are artificial polymers in which specific sites for a definite purpose are created during synthesis. Owing to characteristics such as stability, ease of synthesis, reproducibility, reusability, high accuracy, and selectivity, these polymers have many applications. However, the variety of functional monomers, templates, solvents, and synthesis conditions, such as pH, temperature, stirring rate, and time, limits the selectivity of imprinting. Practical optimization of the synthetic conditions has many drawbacks, including chemical compound usage, equipment requirements, and time costs. Using machine learning (ML) to predict the imprinting factor (IF), which indicates the quality of imprinting, is an appealing way to overcome these problems. ML has many advantages, for example a lack of human error, high accuracy, high repeatability, and the ability to make predictions for a large amount of data in minimal time. In this research, ML was used to predict the IF using non-linear regression algorithms, including classification and regression tree, support vector regression, and k-nearest neighbors, and ensemble algorithms, such as gradient boosting (GB), random forest, and extra trees. The data sets were obtained experimentally in the laboratory, and the inputs included pH, the type of template, the type of monomer, the solvent, the distribution coefficient of the MIP (KMIP), and the distribution coefficient of the non-imprinted polymer (KNIP). The mutual information feature selection method was used to select the important features affecting the IF. The results showed that the GB algorithm performed best in predicting the IF, yielding the maximum R² value (R² = 0.871), the minimum mean absolute error (MAE = -0.982), and the minimum mean square error (MSE = -2.303).
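The three regression metrics named in this abstract can be sketched in a few lines of plain Python. This is an illustrative sketch only, not the authors' code: the function name `regression_metrics` and the sample values are hypothetical. Note that MAE and MSE are non-negative by definition, so the negative signs in the reported values may reflect a negated-score convention (for example, scikit-learn's `neg_mean_absolute_error` scorer negates errors so that larger is better); the abstract does not confirm this, so treat it as an assumption.

```python
def regression_metrics(y_true, y_pred):
    """Return (r2, mae, mse) for two equal-length sequences of numbers."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / n
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(r) for r in residuals) / n   # mean absolute error, always >= 0
    mse = sum(r * r for r in residuals) / n    # mean squared error, always >= 0
    ss_res = sum(r * r for r in residuals)     # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when y_true is constant")
    r2 = 1.0 - ss_res / ss_tot
    return r2, mae, mse


# Hypothetical imprinting-factor values, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 4.0]
r2, mae, mse = regression_metrics(observed, predicted)
print(r2, mae, mse)
```

Under a negated-score convention, the values above would be reported as `-mae` and `-mse`, while R² keeps its sign.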
Affiliation(s)
- Bita Yarahmadi
- Real Samples Analysis Laboratory, Department of Chemistry, Iran University of Science and Technology, Tehran, Iran
| | - Seyed Majid Hashemianzadeh
- Molecular Simulation Research Laboratory, Department of Chemistry, Iran University of Science and Technology, Tehran, Iran.
| | | |
|
161
|
Azher ZL, Suvarna A, Chen JQ, Zhang Z, Christensen BC, Salas LA, Vaickus LJ, Levy JJ. Assessment of emerging pretraining strategies in interpretable multimodal deep learning for cancer prognostication. BioData Min 2023; 16:23. [PMID: 37481666 PMCID: PMC10363299 DOI: 10.1186/s13040-023-00338-w] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 11/22/2022] [Accepted: 07/05/2023] [Indexed: 07/24/2023] Open
Abstract
BACKGROUND Deep learning models can infer cancer patient prognosis from molecular and anatomic pathology information. Recent studies that leveraged information from complementary multimodal data improved prognostication, further illustrating the potential utility of such methods. However, current approaches do not 1) comprehensively leverage biological and histomorphological relationships or 2) make use of emerging strategies to "pretrain" models (i.e., train models on a slightly orthogonal dataset/modeling objective), which may aid prognostication by reducing the amount of information required for achieving optimal performance. In addition, model interpretation is crucial for facilitating the clinical adoption of deep learning methods by fostering practitioner understanding and trust in the technology. METHODS Here, we develop an interpretable multimodal modeling framework that combines DNA methylation, gene expression, and histopathology (i.e., tissue slides) data, and we compare performance of crossmodal pretraining, contrastive learning, and transfer learning versus the standard procedure. RESULTS Our models outperform the existing state-of-the-art method (average 11.54% C-index increase), and baseline clinically driven models (average 11.7% C-index increase). Model interpretations elucidate consideration of biologically meaningful factors in making prognosis predictions. DISCUSSION Our results demonstrate that the selection of pretraining strategies is crucial for obtaining highly accurate prognostication models, even more so than devising an innovative model architecture, and further emphasize the all-important role of the tumor microenvironment on disease progression.
Affiliation(s)
- Zarif L Azher
- Thomas Jefferson High School for Science and Technology, Alexandria, VA, USA
| | - Anish Suvarna
- Thomas Jefferson High School for Science and Technology, Alexandria, VA, USA
| | - Ji-Qing Chen
- Cancer Biology Graduate Program, Dartmouth College Geisel School of Medicine, Hanover, NH, USA
- Program in Quantitative Biomedical Sciences, Dartmouth College Geisel School of Medicine, Hanover, NH, USA
- Department of Epidemiology, Dartmouth College Geisel School of Medicine, Hanover, NH, USA
| | - Ze Zhang
- Program in Quantitative Biomedical Sciences, Dartmouth College Geisel School of Medicine, Hanover, NH, USA
- Department of Epidemiology, Dartmouth College Geisel School of Medicine, Hanover, NH, USA
| | - Brock C Christensen
- Department of Epidemiology, Dartmouth College Geisel School of Medicine, Hanover, NH, USA
- Department of Molecular and Systems Biology, Dartmouth College Geisel School of Medicine, Hanover, NH, USA
- Department of Community and Family Medicine, Dartmouth College Geisel School of Medicine, Hanover, NH, USA
| | - Lucas A Salas
- Department of Epidemiology, Dartmouth College Geisel School of Medicine, Hanover, NH, USA
- Department of Molecular and Systems Biology, Dartmouth College Geisel School of Medicine, Hanover, NH, USA
- Integrative Neuroscience at Dartmouth (IND) Graduate Program, Dartmouth College Geisel School of Medicine, Hanover, NH, USA
| | - Louis J Vaickus
- Emerging Diagnostic and Investigative Technologies, Department of Pathology and Laboratory Medicine, Dartmouth Health, Lebanon, NH, USA
| | - Joshua J Levy
- Program in Quantitative Biomedical Sciences, Dartmouth College Geisel School of Medicine, Hanover, NH, USA.
- Department of Epidemiology, Dartmouth College Geisel School of Medicine, Hanover, NH, USA.
- Emerging Diagnostic and Investigative Technologies, Department of Pathology and Laboratory Medicine, Dartmouth Health, Lebanon, NH, USA.
- Department of Dermatology, Dartmouth Health, Lebanon, NH, USA.
| |
|
162
|
Casaes Teixeira B, Toporcov TN, Chiaravalloti-Neto F, Chiavegatto Filho ADP. Spatial Clusters of Cancer Mortality in Brazil: A Machine Learning Modeling Approach. Int J Public Health 2023; 68:1604789. [PMID: 37546351 PMCID: PMC10397398 DOI: 10.3389/ijph.2023.1604789] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/23/2022] [Accepted: 06/26/2023] [Indexed: 08/08/2023] Open
Abstract
Objectives: Our aim was to test if machine learning algorithms can predict cancer mortality (CM) at an ecological level and use these results to identify statistically significant spatial clusters of excess cancer mortality (eCM). Methods: Age-standardized CM was extracted from the official databases of Brazil. Predictive features included sociodemographic and health coverage variables. Machine learning algorithms were selected and trained with 70% of the data, and the performance was tested with the remaining 30%. Clusters of eCM were identified using SatScan. Additionally, separate analyses were performed for the 10 most frequent cancer types. Results: The gradient boosting trees algorithm presented the highest coefficient of determination (R² = 0.66). For total cancer, all algorithms overlapped in the region of Bagé (27% eCM). For esophageal cancer, all algorithms overlapped in west Rio Grande do Sul (48%-96% eCM). The most significant cluster for stomach cancer was in Macapá (82% eCM). The most important variables were the percentage of the white population and residents with computers. Conclusion: We found consistent and well-defined geographic regions in Brazil with significantly higher than expected cancer mortality.
|
163
|
Duan M, Wang Y, Zhao D, Liu H, Zhang G, Li K, Zhang H, Huang L, Zhang R, Zhou F. Orchestrating information across tissues via a novel multitask GAT framework to improve quantitative gene regulation relation modeling for survival analysis. Brief Bioinform 2023; 24:bbad238. [PMID: 37427963 DOI: 10.1093/bib/bbad238] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/09/2023] [Revised: 05/29/2023] [Accepted: 06/08/2023] [Indexed: 07/11/2023] Open
Abstract
Survival analysis is critical to cancer prognosis estimation. High-throughput technologies have increased the dimensionality of genic features, but the number of clinical samples in cohorts remains relatively small for various reasons, including difficulties in participant recruitment and high data-generation costs. The transcriptome is one of the most abundantly available OMIC (referring to high-throughput data, including genomic, transcriptomic, proteomic and epigenomic) data types. This study introduced a multitask graph attention network (GAT) framework, DQSurv, for the survival analysis task. We first used a large dataset of healthy tissue samples to pretrain the GAT-based HealthModel for the quantitative measurement of gene regulatory relations. The multitask survival analysis framework DQSurv used the idea of transfer learning to initialize the GAT model with the pretrained HealthModel and further fine-tuned this model using two tasks, i.e., the main task of survival analysis and the auxiliary task of gene expression prediction. This refined GAT was denoted as DiseaseModel. We fused the original transcriptomic features with the difference vector between the latent features encoded by the HealthModel and DiseaseModel for the final task of survival analysis. The proposed DQSurv model stably outperformed the existing models for the survival analysis of 10 benchmark cancer types and an independent dataset. The ablation study also supported the necessity of the main modules. We released the code and the pretrained HealthModel to facilitate feature encoding and survival analysis in future transcriptome-based studies, especially on small datasets. The model and the code are available at http://www.healthinformaticslab.org/supp/.
Affiliation(s)
- Meiyu Duan
- College of Computer Science and Technology, Jilin University, Changchun, Jilin, China, 130012
| | - Yueying Wang
- College of Computer Science and Technology, Jilin University, Changchun, Jilin, China, 130012
| | - Dong Zhao
- School of Biology and Engineering, and Engineering Research Center of Medical Biotechnology, Guizhou Medical University, Guiyang, Guizhou 550025, China
| | - Hongmei Liu
- School of Biology and Engineering, and Engineering Research Center of Medical Biotechnology, Guizhou Medical University, Guiyang, Guizhou 550025, China
- Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, Jilin, China, 130012
| | - Gongyou Zhang
- School of Biology and Engineering, and Engineering Research Center of Medical Biotechnology, Guizhou Medical University, Guiyang, Guizhou 550025, China
| | - Kewei Li
- College of Computer Science and Technology, Jilin University, Changchun, Jilin, China, 130012
| | - Haotian Zhang
- College of Computer Science and Technology, Jilin University, Changchun, Jilin, China, 130012
| | - Lan Huang
- College of Computer Science and Technology, Jilin University, Changchun, Jilin, China, 130012
- Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, Jilin, China, 130012
| | - Ruochi Zhang
- School of Artificial Intelligence, Jilin University, Changchun, China, 130012
| | - Fengfeng Zhou
- College of Computer Science and Technology, Jilin University, Changchun, Jilin, China, 130012
- Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, Jilin, China, 130012
| |
|
164
|
Wang J, Cong L, Shi W, Xu W, Xu S. Single-Cell Analysis and Classification according to Multiplexed Proteins via Microdroplet-Based Self-Driven Magnetic Surface-Enhanced Raman Spectroscopy Platforms Assisted with Machine Learning Algorithms. Anal Chem 2023. [PMID: 37419505 DOI: 10.1021/acs.analchem.3c01273] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/09/2023]
Abstract
A microdroplet-based surface-enhanced Raman spectroscopy (microdroplet SERS) platform was constructed to envelop individual cells in microdroplets, followed by the SERS detection of their extracellular vesicle-proteins (EV-proteins) via the in-drop immunoassays by use of immunomagnetic beads (iMBs) and immuno-SERS tags (iSERS tags). A unique phenomenon is found that iMBs can start a spontaneous reorientation on the probed cell surface based on the electrostatic force-driven interfacial aggregation effect, which leads EV-proteins and iSERS tags to be gathered from a liquid phase to a cell membrane interface and significantly improves SERS sensitivity to the single-cell analysis level due to the formation of numbers of SERS hotspots. Three EV-proteins from two breast cancer cell lines were collected and further analyzed by machine learning algorithmic tools, which will be helpful for a deeper understanding of breast cancer subtypes from the view of EV-proteins.
Affiliation(s)
- Jiaqi Wang
- State Key Laboratory of Supramolecular Structure and Materials, College of Chemistry, Jilin University, Changchun 130012, P. R. China
| | - Lili Cong
- State Key Laboratory of Supramolecular Structure and Materials, College of Chemistry, Jilin University, Changchun 130012, P. R. China
| | - Wei Shi
- Key Lab for Molecular Enzymology & Engineering of Ministry of Education, Jilin University, Changchun 130012, P. R. China
| | - Weiqing Xu
- State Key Laboratory of Supramolecular Structure and Materials, College of Chemistry, Jilin University, Changchun 130012, P. R. China
- Institute of Theoretical Chemistry, Jilin University, Changchun 130012, P. R. China
| | - Shuping Xu
- State Key Laboratory of Supramolecular Structure and Materials, College of Chemistry, Jilin University, Changchun 130012, P. R. China
- State Key Laboratory of Applied Optics, Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences, Changchun 130033, P. R. China
- Center for Supramolecular Chemical Biology, College of Chemistry, Jilin University, Changchun 130012, P. R. China
- Institute of Theoretical Chemistry, Jilin University, Changchun 130012, P. R. China
| |
|
165
|
Alotaibi FM, Khan YD. A Framework for Prediction of Oncogenomic Progression Aiding Personalized Treatment of Gastric Cancer. Diagnostics (Basel) 2023; 13:2291. [PMID: 37443684 DOI: 10.3390/diagnostics13132291] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/15/2023] [Revised: 06/05/2023] [Accepted: 06/13/2023] [Indexed: 07/15/2023] Open
Abstract
Mutations in genes can alter their DNA patterns, and by recognizing these mutations, many carcinomas can be diagnosed in the progression stages. The human body contains many hidden and enigmatic features that humankind has not yet fully understood. A total of 7539 neoplasm cases were reported from 1 January 2021 to 31 December 2021. Of these, 3156 (41.9%) were seen in male and 4383 (58.1%) in female patients. Several machine learning and deep learning frameworks have already been implemented to detect mutations, but these techniques lack generalized datasets and need to be optimized for better results. Deep learning-based neural networks provide the computational power to model the complex structures of gastric carcinoma-driven gene mutations. This study proposes deep learning approaches, namely long short-term memory (LSTM), gated recurrent units (GRU), and bidirectional LSTM (Bi-LSTM), to help identify the progression of gastric carcinoma in an optimized manner. This study includes 61 carcinogenic driver genes whose mutations can cause gastric cancer. The mutation information was downloaded from intOGen.org and normal gene sequences were downloaded from asia.ensembl.org, as explained in the data collection section. The proposed deep learning models are validated using the self-consistency test (SCT), 10-fold cross-validation test (FCVT), and independent set test (IST). The IST prediction metrics (accuracy, sensitivity, specificity, MCC, and AUC) are 97.18%, 98.35%, 96.01%, 0.94, and 0.98 for LSTM; 99.46%, 98.93%, 100%, 0.989, and 1.00 for Bi-LSTM; and 99.46%, 98.93%, 100%, 0.989, and 1.00 for GRU.
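For readers unfamiliar with the metrics reported in this abstract, the following minimal Python sketch shows how accuracy, sensitivity, specificity, and the Matthews correlation coefficient (MCC) are derived from a binary confusion matrix. The function name and the counts are hypothetical and chosen only for illustration; they are not the study's data.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Compute common binary-classification metrics from confusion-matrix counts."""
    def safe_div(num, den):
        # Return 0.0 instead of raising when a denominator is zero.
        return num / den if den else 0.0

    sensitivity = safe_div(tp, tp + fn)   # true positive rate (recall)
    specificity = safe_div(tn, tn + fp)   # true negative rate
    accuracy = safe_div(tp + tn, tp + fp + tn + fn)
    # MCC ranges from -1 to 1 and stays informative on imbalanced classes.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = safe_div(tp * tn - fp * fn, denom)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }


# Hypothetical counts for an independent test set.
metrics = classification_metrics(tp=90, fp=5, tn=95, fn=10)
print(metrics)
```

A specificity of 100%, as reported for two of the models, corresponds to `fp == 0` in this notation.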
Affiliation(s)
- Fahad M Alotaibi
- Department of Information System, Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah 21589, Saudi Arabia
| | - Yaser Daanial Khan
- Department of Computer Science, University of Management and Technology, Lahore 54770, Pakistan
| |
|
166
|
Kan CM, Pei XM, Yeung MHY, Jin N, Ng SSM, Tsang HF, Cho WCS, Yim AKY, Yu ACS, Wong SCC. Exploring the Role of Circulating Cell-Free RNA in the Development of Colorectal Cancer. Int J Mol Sci 2023; 24:11026. [PMID: 37446204 DOI: 10.3390/ijms241311026] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/21/2023] [Revised: 06/25/2023] [Accepted: 07/02/2023] [Indexed: 07/15/2023] Open
Abstract
Circulating tumor RNA (ctRNA) has recently emerged as a novel and attractive liquid biomarker. CtRNA is capable of providing important information about the expression of a variety of target genes noninvasively, without the need for biopsies, through the use of circulating RNA sequencing. The overexpression of cancer-specific transcripts increases the tumor-derived RNA signal, which overcomes limitations due to low quantities of circulating tumor DNA (ctDNA). The purpose of this work is to present an up-to-date review of current knowledge regarding ctRNAs and their status as biomarkers to address the diagnosis, prognosis, prediction, and drug resistance of colorectal cancer. The final section of the article discusses the practical aspects involved in analyzing plasma ctRNA, including storage and isolation, detection technologies, and their limitations in clinical applications.
Affiliation(s)
- Chau-Ming Kan
- Department of Health Technology and Informatics, The Hong Kong Polytechnic University, Hong Kong SAR, China
| | - Xiao Meng Pei
- Department of Applied Biology & Chemical Technology, The Hong Kong Polytechnic University, Hong Kong SAR, China
| | - Martin Ho Yin Yeung
- Department of Applied Biology & Chemical Technology, The Hong Kong Polytechnic University, Hong Kong SAR, China
| | - Nana Jin
- Codex Genetics Limited, Shatin, Hong Kong SAR, China
| | - Simon Siu Man Ng
- Department of Surgery, Faculty of Medicine, The Chinese University of Hong Kong, Hong Kong SAR, China
| | - Hin Fung Tsang
- Department of Health Technology and Informatics, The Hong Kong Polytechnic University, Hong Kong SAR, China
| | - William Chi Shing Cho
- Department of Clinical Oncology, Queen Elizabeth Hospital, Kowloon, Hong Kong SAR, China
| | | | | | - Sze Chuen Cesar Wong
- Department of Applied Biology & Chemical Technology, The Hong Kong Polytechnic University, Hong Kong SAR, China
| |
|
167
|
Rodrigues PM, Madeiro JP, Marques JAL. Enhancing Health and Public Health through Machine Learning: Decision Support for Smarter Choices. Bioengineering (Basel) 2023; 10:792. [PMID: 37508819 PMCID: PMC10376309 DOI: 10.3390/bioengineering10070792] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/01/2023] [Accepted: 06/29/2023] [Indexed: 07/30/2023] Open
Abstract
In recent years, the integration of Machine Learning (ML) techniques in the field of healthcare and public health has emerged as a powerful tool for improving decision-making processes [...].
Affiliation(s)
- Pedro Miguel Rodrigues
- CBQF-Centro de Biotecnologia e Química Fina-Laboratório Associado, Escola Superior de Biotecnologia, Universidade Católica Portuguesa, Rua de Diogo Botelho 1327, 4169-005 Porto, Portugal
| | - João Paulo Madeiro
- Department of Computing, Federal University of Ceará, Fortaleza 60440-900, Ceará, Brazil
| | | |
|
168
|
Ghanem M, Ghaith AK, Zamanian C, Bon-Nieves A, Bhandarkar A, Bydon M, Quiñones-Hinojosa A. Deep Learning Approaches for Glioblastoma Prognosis in Resource-Limited Settings: A Study Using Basic Patient Demographic, Clinical, and Surgical Inputs. World Neurosurg 2023; 175:e1089-e1109. [PMID: 37088416 DOI: 10.1016/j.wneu.2023.04.072] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/26/2023] [Revised: 04/15/2023] [Accepted: 04/17/2023] [Indexed: 04/25/2023]
Abstract
BACKGROUND Glioblastoma (GBM) is the most common brain tumor in the United States, with an annual incidence rate of 3.21 per 100,000. It is the most aggressive type of diffuse glioma and has a median survival of months after treatment. This study aims to assess the accuracy of different novel deep learning models trained on a set of simple clinical, demographic, and surgical variables to assist in clinical practice, even in areas with constrained health care infrastructure. METHODS Our study included 37,095 patients with GBM from the SEER (Surveillance Epidemiology and End Results) database. All predictors were based on demographic, clinicopathologic, and treatment information of the cases. Our outcomes of interest were months of survival and vital status. Concordance index (C-index) and integrated Brier scores (IBS) were used to evaluate the performance of the models. RESULTS The patient characteristics and the statistical analyses were consistent with the epidemiologic literature. The models' C-index and IBS ranged from 0.6743 to 0.6918 and from 0.0934 to 0.1034, respectively. Probabilistic matrix factorization (0.6918), multitask logistic regression (0.6916), and logistic hazard (0.6916) had the highest C-index scores. The models with the lowest IBS were the probabilistic matrix factorization (0.0934), multitask logistic regression (0.0935), and logistic hazard (0.0936). These models had an accuracy (1-IBS) of 90.66%, 90.65%, and 90.64%, respectively. The deep learning algorithms were deployed on an interactive Web-based tool for practical use available via https://glioblastoma-survanalysis.herokuapp.com/. CONCLUSIONS Novel deep learning algorithms can better predict GBM prognosis than do baseline methods and can lead to more personalized patient care regardless of extensive electronic health record availability.
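The two evaluation measures used in this abstract can be illustrated with a short Python sketch. It shows a plain pairwise (Harrell-style) concordance index and reproduces the abstract's "accuracy (1-IBS)" arithmetic from the reported IBS values. The function name and the toy survival data are hypothetical, and this is not the implementation used in the study.

```python
def concordance_index(times, events, risks):
    """Pairwise C-index: fraction of comparable pairs ranked correctly.

    times  -- observed follow-up time per subject
    events -- 1 if the event (death) was observed, 0 if censored
    risks  -- predicted risk score (higher means shorter expected survival)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when subject i had an observed event
            # strictly before subject j's follow-up time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # ties in predicted risk count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): survival in months, event flags, predicted risks.
c_index = concordance_index(
    times=[2, 4, 6, 8],
    events=[1, 1, 0, 1],
    risks=[0.9, 0.4, 0.5, 0.1],
)

# "Accuracy" as defined in the abstract: 1 - IBS, expressed as a percentage.
reported_ibs = [0.0934, 0.0935, 0.0936]
accuracy_pct = [round((1 - ibs) * 100, 2) for ibs in reported_ibs]
print(c_index, accuracy_pct)
```

A C-index of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which puts the reported range of 0.6743 to 0.6918 in context.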
Affiliation(s)
- Marc Ghanem
- Gilbert and Rose-Marie Chagoury School of Medicine, Lebanese American University, Beirut, Lebanon
| | - Abdul Karim Ghaith
- Mayo Clinic Neuro-Informatics Laboratory, Mayo Clinic, Rochester, Minnesota, USA; Department of Neurological Surgery, Mayo Clinic, Rochester, Minnesota, USA
| | - Cameron Zamanian
- Mayo Clinic Neuro-Informatics Laboratory, Mayo Clinic, Rochester, Minnesota, USA; Department of Neurological Surgery, Mayo Clinic, Rochester, Minnesota, USA
| | - Antonio Bon-Nieves
- Mayo Clinic Neuro-Informatics Laboratory, Mayo Clinic, Rochester, Minnesota, USA; Department of Neurological Surgery, Mayo Clinic, Rochester, Minnesota, USA
| | - Archis Bhandarkar
- Mayo Clinic Neuro-Informatics Laboratory, Mayo Clinic, Rochester, Minnesota, USA; Department of Neurological Surgery, Mayo Clinic, Rochester, Minnesota, USA
| | - Mohamad Bydon
- Mayo Clinic Neuro-Informatics Laboratory, Mayo Clinic, Rochester, Minnesota, USA; Department of Neurological Surgery, Mayo Clinic, Rochester, Minnesota, USA.
| | | |
|
169
|
Liu YS, Thaliffdeen R, Han S, Park C. Use of machine learning to predict bladder cancer survival outcomes: a systematic literature review. Expert Rev Pharmacoecon Outcomes Res 2023; 23:761-771. [PMID: 37306511 DOI: 10.1080/14737167.2023.2224963] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/16/2022] [Accepted: 06/09/2023] [Indexed: 06/13/2023]
Abstract
INTRODUCTION The objective of this systematic review is to summarize the use of machine learning (ML) in predicting overall survival (OS) in patients with bladder cancer. METHODS Search terms for bladder cancer, ML algorithms, and mortality were used to identify studies in PubMed and Web of Science as of February 2022. Notable inclusion/exclusion criteria contained the inclusion of studies that utilized patient-level datasets and exclusion of primary gene expression-related dataset studies. Study quality and bias were assessed using the International Journal of Medical Informatics (IJMEDI) checklist. RESULTS Of the 14 included studies, the most common algorithms were artificial neural networks (n = 8) and logistic regression (n = 4). Nine articles described missing data handling, with five articles removing patients with missing data entirely. With respect to feature selection, the most common sociodemographic variables were age (n = 9), gender (n = 9), and smoking status (n = 3), with clinical variables most commonly including tumor stage (n = 8), grade (n = 7), and lymph node involvement (n = 6). Most studies (n = 10) were of medium IJMEDI quality, with common areas of improvement being the descriptions of data preparation and deployment. CONCLUSIONS ML holds promise for optimizing bladder cancer care through accurate OS predictions, but challenges related to data processing, feature selection, and data source quality must be resolved to develop robust models. While this review is limited by its inability to compare models across studies, this systematic review will inform decision-making by various stakeholders to improve understanding of ML-based OS prediction in bladder cancer and foster interpretability of future models.
Affiliation(s)
- Yi-Shao Liu
- College of Pharmacy, The University of Texas at Austin, 2409 University Ave, Austin, TX, USA
| | - Ryan Thaliffdeen
- College of Pharmacy, The University of Texas at Austin, 2409 University Ave, Austin, TX, USA
| | - Sola Han
- College of Pharmacy, The University of Texas at Austin, 2409 University Ave, Austin, TX, USA
| | - Chanhyun Park
- College of Pharmacy, The University of Texas at Austin, 2409 University Ave, Austin, TX, USA
| |
|
170
|
Huang TC, Hsu TC, Hsieh YH, Che-Lin. Utilizing Graph Neural Networks for Breast Cancer Prognosis Prediction with High-dimensional Genomic Data. ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. ANNUAL INTERNATIONAL CONFERENCE 2023; 2023:1-4. [PMID: 38083007 DOI: 10.1109/embc40787.2023.10340045] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/18/2023]
Abstract
An accurate prediction of breast cancer prognosis is essential to help physicians make appropriate treatment recommendations, reducing the chance of excessive treatment and avoiding unnecessary anxiety for patients. Cancer prognosis is highly related to patients' genomic features, which are high-dimensional in nature. In this study, we utilize a systems biology feature selector for dimension reduction to select 20 prognostic biomarkers that are considered closely related to breast cancer prognosis from the high-dimensional RNA Sequencing (RNA-Seq) data. Furthermore, we establish a graph neural network (GNN) and a multi-layer perceptron (MLP) graph-level readout method to better extract the underlying gene interactions from the corresponding gene interaction network (GIN). With the help of GINs, the model performs the best among all baseline models, especially in the area under the precision-recall curve (AUPRC), by as much as 23%. The results demonstrate that our approach using GNNs can successfully extract high-dimensional and complicated interactions within genomic data.
|
171
|
Zou S, Lin Y, Yu X, Eriksson M, Lin M, Fu F, Yang H. Genetic and lifestyle factors for breast cancer risk assessment in Southeast China. Cancer Med 2023; 12:15504-15514. [PMID: 37264741 PMCID: PMC10417168 DOI: 10.1002/cam4.6198] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/25/2022] [Revised: 04/01/2023] [Accepted: 05/23/2023] [Indexed: 06/03/2023] Open
Abstract
BACKGROUND Despite the rising incidence and mortality of breast cancer among women in China, there are currently few predictive models for breast cancer in the Chinese population, and those available have low accuracy. This study aimed to identify major genetic and lifestyle risk factors in a Chinese population for potential application in risk assessment models. METHODS A case-control study including 1321 breast cancer patients and 2045 controls was conducted in southeast China during 2013-2016, and the data were randomly divided into a training set and a test set in a 7:3 ratio. The association between genetic and lifestyle factors and breast cancer was examined using logistic regression models. Using the area under the ROC curve (AUC), we also compared the performance of the logistic model with that of machine learning models, namely the LASSO regression model and support vector machine (SVM), and with the scores calculated from the CKB, Gail, and Tyrer-Cuzick models in the test set. RESULTS Among all factors considered, the best model was achieved when the polygenic risk score, lifestyle, and reproductive factors were considered jointly in the logistic regression model (AUC = 0.73; 95% CI: 0.70-0.77). The models created in this study performed better than those using scores calculated from the CKB, Gail, and Tyrer-Cuzick models. However, the logistic model and machine learning models did not significantly differ from one another. CONCLUSION In summary, we have found genetic and lifestyle risk predictors for breast cancer with moderate discrimination, which might provide a reference for breast cancer screening in southeast China. Further population-based studies are needed to validate the model for future applications in personalized breast cancer screening programs.
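The evaluation setup described in this abstract (a random 7:3 train/test split followed by an AUC comparison) can be sketched in plain Python. This is an illustration under stated assumptions: the helper names `split_train_test` and `roc_auc`, the fixed seed, and the toy labels and scores are all hypothetical, and the AUC is computed with the standard pairwise-ranking formulation rather than any specific package the authors may have used.

```python
import random


def split_train_test(rows, train_fraction=0.7, seed=0):
    """Randomly split rows into (train, test) with the given training fraction."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


def roc_auc(labels, scores):
    """AUC as the probability that a random case outscores a random control."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5  # tied scores count as half a win
    return wins / (len(pos) * len(neg))


# 1321 cases + 2045 controls = 3366 participants, split 7:3 as in the abstract.
participants = list(range(1321 + 2045))
train, test = split_train_test(participants)

# Toy labels (1 = case, 0 = control) and hypothetical predicted risks.
auc = roc_auc([1, 1, 0, 0, 1, 0], [0.9, 0.6, 0.7, 0.2, 0.8, 0.6])
print(len(train), len(test), auc)
```

An AUC of 0.5 indicates no discrimination and 1.0 perfect discrimination, so the reported 0.73 is consistent with the authors' description of "moderate discrimination".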
Affiliation(s)
- Shuqing Zou
- Department of Epidemiology and Health Statistics, School of Public Health, Fujian Medical University, Fuzhou, China
| | - Yuxiang Lin
- Department of Breast Surgery, Fujian Medical University Union Hospital, Fuzhou, China
- Department of General Surgery, Fujian Medical University Union Hospital, Fuzhou, China
- Breast Cancer Institute, Fujian Medical University, Fuzhou, China
| | - Xingxing Yu
- Department of Epidemiology and Health Statistics, School of Public Health, Fujian Medical University, Fuzhou, China
| | - Mikael Eriksson
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
| | | | - Fangmeng Fu
- Department of Breast Surgery, Fujian Medical University Union Hospital, Fuzhou, China
- Department of General Surgery, Fujian Medical University Union Hospital, Fuzhou, China
- Breast Cancer Institute, Fujian Medical University, Fuzhou, China
| | - Haomin Yang
- Department of Epidemiology and Health Statistics, School of Public Health, Fujian Medical University, Fuzhou, China
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
| |
|
172
|
Lacan A, Sebag M, Hanczar B. GAN-based data augmentation for transcriptomics: survey and comparative assessment. Bioinformatics 2023; 39:i111-i120. [PMID: 37387181 DOI: 10.1093/bioinformatics/btad239] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 07/01/2023] Open
Abstract
MOTIVATION Transcriptomics data are becoming more accessible due to high-throughput and less costly sequencing methods. However, data scarcity prevents exploiting deep learning models' full predictive power for phenotype prediction. Artificially enhancing the training sets, namely data augmentation, is suggested as a regularization strategy. Data augmentation corresponds to label-invariant transformations of the training set (e.g. geometric transformations on images and syntax parsing on text data). Such transformations are, unfortunately, unknown in the transcriptomic field. Therefore, deep generative models such as generative adversarial networks (GANs) have been proposed to generate additional samples. In this article, we analyze GAN-based data augmentation strategies with respect to performance indicators and the classification of cancer phenotypes. RESULTS This work highlights a significant boost in binary and multiclass classification performance due to augmentation strategies. Without augmentation, training a classifier on only 50 RNA-seq samples yields accuracies of 94% and 70% for binary and tissue classification, respectively. In comparison, we achieved accuracies of 98% and 94% when adding 1000 augmented samples. Richer architectures and more expensive training of the GAN yield better augmentation performance and generated data quality overall. Further analysis of the generated data shows that several performance indicators are needed to assess its quality correctly. AVAILABILITY AND IMPLEMENTATION All data used for this research are publicly available and come from The Cancer Genome Atlas. Reproducible code is available on the GitLab repository: https://forge.ibisc.univ-evry.fr/alacan/GANs-for-transcriptomics.
Affiliation(s)
- Alice Lacan
- IBISC, University Paris-Saclay (Univ. Evry), Evry 91000, France
| | - Michèle Sebag
- TAU, CNRS-INRIA-LISN, University Paris-Saclay, Gif-sur-Yvette 91190, France
| | - Blaise Hanczar
- IBISC, University Paris-Saclay (Univ. Evry), Evry 91000, France
| |
|
173
|
Beaude A, Rafiee Vahid M, Augé F, Zehraoui F, Hanczar B. AttOmics: attention-based architecture for diagnosis and prognosis from omics data. Bioinformatics 2023; 39:i94-i102. [PMID: 37387182 DOI: 10.1093/bioinformatics/btad232] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/01/2023] Open
Abstract
MOTIVATION The increasing availability of high-throughput omics data allows for considering a new medicine centered on individual patients. Precision medicine relies on exploiting these high-throughput data with machine-learning models, especially the ones based on deep-learning approaches, to improve diagnosis. Due to the high-dimensional small-sample nature of omics data, current deep-learning models end up with many parameters and have to be fitted with a limited training set. Furthermore, interactions between molecular entities inside an omics profile are not patient specific but are the same for all patients. RESULTS In this article, we propose AttOmics, a new deep-learning architecture based on the self-attention mechanism. First, we decompose each omics profile into a set of groups, where each group contains related features. Then, by applying the self-attention mechanism to the set of groups, we can capture the different interactions specific to a patient. The results of different experiments carried out in this article show that our model can accurately predict the phenotype of a patient with fewer parameters than deep neural networks. Visualizing the attention maps can provide new insights into the essential groups for a particular phenotype. AVAILABILITY AND IMPLEMENTATION The code and data are available at https://forge.ibisc.univ-evry.fr/abeaude/AttOmics. TCGA data can be downloaded from the Genomic Data Commons Data Portal.
Affiliation(s)
- Aurélien Beaude
- IBISC, Université Paris-Saclay, Univ Evry, 23 Boulevard de France, Evry-Courcouronnes 91020, France
- Artificial Intelligence & Deep Analytics, Omics Data Science, Sanofi R&D Data and Data Science, 1 Av. Pierre Brossolette, Chilly-Mazarin 91385, France
| | - Milad Rafiee Vahid
- Sanofi R&D Data and Data Science, Artificial Intelligence & Deep Analytics, Omics Data Science, 450 Water Street, Cambridge, MA 02142, United States
| | - Franck Augé
- Artificial Intelligence & Deep Analytics, Omics Data Science, Sanofi R&D Data and Data Science, 1 Av. Pierre Brossolette, Chilly-Mazarin 91385, France
| | - Farida Zehraoui
- IBISC, Université Paris-Saclay, Univ Evry, 23 Boulevard de France, Evry-Courcouronnes 91020, France
| | - Blaise Hanczar
- IBISC, Université Paris-Saclay, Univ Evry, 23 Boulevard de France, Evry-Courcouronnes 91020, France
| |
|
174
|
Hao Y, Jing XY, Sun Q. Cancer survival prediction by learning comprehensive deep feature representation for multiple types of genetic data. BMC Bioinformatics 2023; 24:267. [PMID: 37380946 DOI: 10.1186/s12859-023-05392-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/07/2023] [Accepted: 06/19/2023] [Indexed: 06/30/2023] Open
Abstract
BACKGROUND Cancer is one of the leading causes of death around the world. Accurate prediction of survival time is important because it can help clinicians choose appropriate therapeutic schemes. Cancer data can be characterized by varied molecular features, clinical behaviors and morphological appearances. However, the cancer heterogeneity problem usually makes patient samples with different risks (i.e., short and long survival time) inseparable, thereby causing unsatisfactory prediction results. Clinical studies have shown that genetic data tend to contain more molecular biomarkers associated with cancer, and hence integrating multi-type genetic data may be a feasible way to deal with cancer heterogeneity. Although multi-type gene data have been used in existing work, how to learn more effective features for cancer survival prediction has not been well studied. RESULTS To this end, we propose a deep learning approach to reduce the negative impact of cancer heterogeneity and improve cancer survival prediction. It represents each type of genetic data as shared and specific features, which can capture the consensus and complementary information among all types of data. We collect mRNA expression, DNA methylation and microRNA expression data for four cancers to conduct experiments. CONCLUSIONS Experimental results demonstrate that our approach substantially outperforms established integrative methods and is effective for cancer survival prediction. AVAILABILITY AND IMPLEMENTATION https://github.com/githyr/ComprehensiveSurvival.
Affiliation(s)
- Yaru Hao
- School of Computer Science, Wuhan University, Wuhan, China.
| | - Xiao-Yuan Jing
- School of Computer Science, Wuhan University, Wuhan, China.
- School of Computer, Guangdong University of Petrochemical Technology, Maoming, China.
- State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China.
| | - Qixing Sun
- School of Computer Science, Wuhan University, Wuhan, China
| |
|
175
|
Li X, Dai A, Tran R, Wang J. Identifying miRNA biomarkers for breast cancer and ovarian cancer: a text mining perspective. Breast Cancer Res Treat 2023:10.1007/s10549-023-06996-y. [PMID: 37329459 DOI: 10.1007/s10549-023-06996-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/01/2023] [Accepted: 05/25/2023] [Indexed: 06/19/2023]
Abstract
BACKGROUND MicroRNAs (miRNAs) are small, non-coding RNAs that mediate post-transcriptional gene silencing. Numerous studies have demonstrated the critical role of miRNAs in the development of breast cancer and ovarian cancer. To reduce potential bias from individual studies, a more comprehensive approach to exploring miRNAs in cancer research is essential. This study aims to explore the role of miRNAs in the development of breast cancer and ovarian cancer. METHODS Abstracts of the publications were tokenized and the biomedical terms (miRNA, gene, disease, species) were identified and extracted for vectorization. Predictive analyses were conducted with four machine learning models: K-Nearest Neighbors (KNN), Support Vector Machines (SVM), Random Forest (RF), and Naïve Bayes. Both holdout validation and cross-validation were utilized. Feature importance was identified for the construction of miRNA-cancer networks. RESULTS We found that miR-182 is highly specific to female cancers. miR-182 targets different genes in regulating breast cancer and ovarian cancer. Naïve Bayes provided a promising prediction model for breast cancer and ovarian cancer using a combination of miRNAs and genes, with an accuracy score greater than 60%. Feature importance analysis identified miR-155 and miR-199 as critical for breast cancer and ovarian cancer prediction, with miR-155 being highly related to breast cancer and miR-199 being more associated with ovarian cancer. CONCLUSION Our approach effectively identified potential miRNA biomarkers associated with breast cancer and ovarian cancer, providing a solid foundation for generating novel research hypotheses and guiding future experimental studies.
Affiliation(s)
- Xin Li
- Ophthalmology Department, Central Hospital Affiliated to Shandong First Medical University, Jinan, 250013, Shandong, China
| | - Andrea Dai
- Oakland University William Beaumont School of Medicine, Rochester, MI, 48309, USA
| | - Richard Tran
- Masters Program in Computer Science, University of Chicago, Chicago, IL, 20833, USA
| | - Jie Wang
- Applied Data Science Program, Syracuse University, Syracuse, NY, 13244, USA.
- MDSight, LLC, Brookeville, MD, 20833, USA.
| |
|
176
|
Manou M, Kanakoglou DS, Loupis T, Vrachnos DM, Theocharis S, Papavassiliou AG, Piperi C. Role of Histone Deacetylases in the Pathogenesis of Salivary Gland Tumors and Therapeutic Targeting Options. Int J Mol Sci 2023; 24:10038. [PMID: 37373187 DOI: 10.3390/ijms241210038] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 05/12/2023] [Revised: 06/09/2023] [Accepted: 06/10/2023] [Indexed: 06/29/2023] Open
Abstract
Salivary gland tumors (SGTs) comprise a rare and heterogenous category of benign/malignant neoplasms with progressively increasing knowledge of the molecular mechanisms underpinning their pathogenesis, poor prognosis, and therapeutic treatment efficacy. Emerging data are pointing toward an interplay of genetic and epigenetic factors contributing to their heterogeneity and diverse clinical phenotypes. Post-translational histone modifications such as histone acetylation/deacetylation have been shown to actively participate in the pathobiology of SGTs, further suggesting that histone deacetylating factors (HDACs), selective or pan-HDAC inhibitors (HDACis), might present effective treatment options for these neoplasms. Herein, we describe the molecular and epigenetic mechanisms underlying the pathology of the different types of SGTs, focusing on histone acetylation/deacetylation effects on gene expression as well as the progress of HDACis in SGT therapy and the current status of relevant clinical trials.
Affiliation(s)
- Maria Manou
- Department of Biological Chemistry, Medical School, National and Kapodistrian University of Athens, 11527 Athens, Greece
| | - Dimitrios S Kanakoglou
- Department of Biological Chemistry, Medical School, National and Kapodistrian University of Athens, 11527 Athens, Greece
| | - Theodoros Loupis
- Haematology Research Laboratory, Clinical, Experimental Surgery and Translational Research Center, Biomedical Research Foundation, Academy of Athens, 11527 Athens, Greece
| | - Dimitrios M Vrachnos
- Haematology Research Laboratory, Clinical, Experimental Surgery and Translational Research Center, Biomedical Research Foundation, Academy of Athens, 11527 Athens, Greece
| | - Stamatios Theocharis
- First Department of Pathology, Medical School, National and Kapodistrian University of Athens, 11527 Athens, Greece
| | - Athanasios G Papavassiliou
- Department of Biological Chemistry, Medical School, National and Kapodistrian University of Athens, 11527 Athens, Greece
| | - Christina Piperi
- Department of Biological Chemistry, Medical School, National and Kapodistrian University of Athens, 11527 Athens, Greece
| |
|
177
|
Mehrpour O, Saeedi F, Abdollahi J, Amirabadizadeh A, Goss F. The value of machine learning for prognosis prediction of diphenhydramine exposure: National analysis of 50,000 patients in the United States. J Res Med Sci 2023; 28:49. [PMID: 37496638 PMCID: PMC10366979 DOI: 10.4103/jrms.jrms_602_22] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/19/2022] [Revised: 02/14/2023] [Accepted: 03/27/2023] [Indexed: 07/28/2023]
Abstract
Background Diphenhydramine (DPH) is an antihistamine medication that in overdose can result in anticholinergic symptoms and serious complications, including arrhythmia and coma. We aimed to compare the value of various machine learning (ML) models, including light gradient boosting machine (LGBM), logistic regression (LR), and random forest (RF), in the outcome prediction of DPH poisoning. Materials and Methods We used the National Poison Data System database and included all human exposures to DPH from January 01, 2017 to December 31, 2017, excluding cases with missing information, duplicated cases, and cases with reported co-ingestion. Data were split into training and test datasets, and three ML models were compared. We developed confusion matrices for each, and standard performance metrics were calculated. Results Our study population included 53,761 patients with DPH exposure. The most common reasons for exposure, outcome, chronicity of exposure, and formulation were captured. Our results showed an average precision-recall area under the curve (AUC) of 0.84. LGBM and RF had the highest performance (average AUC of 0.91), followed by LR (average AUC of 0.90). The specificity of the models was 87.0% in the testing groups. The precision of the models was 75.0%. Recall (sensitivity) of the models ranged between 73% and 75%, with an F1 score of 75.0%. The overall accuracy of the LGBM, LR, and RF models in the test dataset was 74.8%, 74.0%, and 75.1%, respectively. In total, just 1.1% of patients (mostly those with major outcomes) received physostigmine. Conclusion Our study demonstrates the application of ML in the prediction of DPH poisoning.
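The metrics reported in this abstract (precision, recall, specificity, F1, accuracy) all derive from the counts of a confusion matrix. As a reading aid, here is a minimal sketch for the binary case; the function name `binary_metrics` and the counts are hypothetical and chosen only for illustration, and the study itself may have averaged such metrics over several outcome classes.

```python
def _ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def binary_metrics(tp, fp, tn, fn):
    """Standard metrics from binary confusion-matrix counts (hypothetical helper)."""
    precision = _ratio(tp, tp + fp)      # predicted positives that are correct
    recall = _ratio(tp, tp + fn)         # true positives found (sensitivity)
    specificity = _ratio(tn, tn + fp)    # true negatives found
    accuracy = _ratio(tp + tn, tp + fp + tn + fn)
    f1 = _ratio(2 * precision * recall, precision + recall)
    return {
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "accuracy": accuracy,
        "f1": f1,
    }


# Invented counts, not taken from the study.
metrics = binary_metrics(tp=75, fp=25, tn=175, fn=25)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these made-up counts, precision, recall and F1 coincide because the numbers of false positives and false negatives are equal.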
Affiliation(s)
- Omid Mehrpour
- Michigan Poison & Drug Information Center, Wayne State University School of Medicine, Detroit, Michigan, United States
- Rocky Mountain Poison and Drug Safety, Denver Health and Hospital Authority, Denver, CO, United States
| | - Farhad Saeedi
- Medical Toxicology and Drug Abuse Research Center, Birjand University of Medical Sciences, Birjand, Iran
- Student Research Committee, Birjand University of Medical Sciences, Birjand, Iran
| | - Jafar Abdollahi
- Department of Computer Engineering, Ardabil Branch, Islamic Azad University, Ardabil, Iran
| | - Alireza Amirabadizadeh
- Endocrine Research Center, Research Institute for Endocrine Sciences, Shahid Beheshti University of Medical Sciences, Tehran, Iran
| | - Foster Goss
- Department of Emergency Medicine, University of Colorado School of Medicine, Aurora, CO, USA
| |
|
178
|
Hatano Y, Ishihara T, Hirokawa S, Onodera O. Machine Learning Approach for the Prediction of Age-Specific Probability of SCA3 and DRPLA by Survival Curve Analysis. Neurol Genet 2023; 9:e200075. [PMID: 37152445 PMCID: PMC10159758 DOI: 10.1212/nxg.0000000000200075] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/06/2022] [Accepted: 03/23/2023] [Indexed: 05/09/2023]
Abstract
Background and Objectives As the number of repeats in the expansion increases, polyglutamine diseases tend to show at a younger age. From this relationship, attempts have been made to predict age at onset by parametric survival analysis. However, a method for a more accurate prediction has been desirable. In this study, we examined 2 methods for survival analysis using machine learning and 6 conventional methods for parametric survival analysis of spinocerebellar ataxia (SCA)3 and dentatorubral-pallidoluysian atrophy (DRPLA). Methods We compared the performance of 2 machine learning methods of survival analysis (random survival forest [RSF] and DeepSurv) and 6 methods of parametric survival analysis (Weibull, exponential, Gaussian, logistic, loglogistic, and log Gaussian). Training and evaluation were performed using the leave-one-out cross-validation method, and evaluation criteria included root mean squared error (RMSE), mean absolute error (MAE), and the integrated Brier score. The latter was used as the primary end point, and the survival analysis model yielding the best result was used to predict the asymptomatic probability. Results Among the models examined, the RSF and DeepSurv machine learning methods had a higher prediction accuracy than the parametric methods of survival analysis. For both SCA3 and DRPLA, RSF had a higher accuracy than DeepSurv for the assessment of RMSE (SCA3: 7.37, DRPLA: 10.78), MAE (SCA3: 5.52, DRPLA: 8.17), and the integrated Brier score (SCA3: 0.05, DRPLA: 0.077). Using RSF, we determined the age-specific probability distribution of age at onset based on CAG repeat size and current age. Discussion In this study, we have demonstrated the superiority of machine learning methods for predicting age at onset of SCA3 and DRPLA using survival analysis. Such accurate prediction of onset will be useful for genetic counseling of carriers and for devising methods to verify the effects of interventions for unaffected individuals.
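The evaluation loop named in this abstract (leave-one-out cross-validation scored by RMSE and MAE) can be illustrated with a short sketch. Everything below is an assumption made for illustration: the predictor is an ordinary least-squares line standing in for the survival models the study compared, censoring is ignored, the integrated Brier score is not computed, and the repeat sizes and onset ages are invented.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        return 0.0, mean_y  # all x equal: fall back to predicting the mean
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return slope, mean_y - slope * mean_x


def loocv_errors(xs, ys):
    """Leave-one-out cross-validation: refit without sample i, then predict it."""
    residuals = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        residuals.append(ys[i] - (slope * xs[i] + intercept))
    rmse = math.sqrt(sum(r * r for r in residuals) / len(residuals))
    mae = sum(abs(r) for r in residuals) / len(residuals)
    return rmse, mae


# Invented (repeat size, age at onset) pairs, for illustration only.
repeats = [62, 65, 68, 70, 72, 75, 78, 80]
onset_age = [55, 50, 47, 41, 40, 33, 30, 24]
rmse, mae = loocv_errors(repeats, onset_age)
print(f"RMSE = {rmse:.2f} years, MAE = {mae:.2f} years")
```

Because each sample is predicted by a model that never saw it, both errors estimate out-of-sample performance; RMSE is never smaller than MAE and penalizes large misses more heavily.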
Affiliation(s)
- Yuya Hatano
- Department of Neurology, Brain Research Institute, Niigata University, Niigata-shi, Japan
| | - Tomohiko Ishihara
- Department of Neurology, Brain Research Institute, Niigata University, Niigata-shi, Japan
| | - Sachiko Hirokawa
- Department of Neurology, Brain Research Institute, Niigata University, Niigata-shi, Japan
| | - Osamu Onodera
- Department of Neurology, Brain Research Institute, Niigata University, Niigata-shi, Japan
| |
|
179
|
Buk Cardoso L, Cunha Parro V, Verzinhasse Peres S, Curado MP, Fernandes GA, Wünsch Filho V, Natasha Toporcov T. Machine learning for predicting survival of colorectal cancer patients. Sci Rep 2023; 13:8874. [PMID: 37264045 PMCID: PMC10235087 DOI: 10.1038/s41598-023-35649-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/15/2023] [Accepted: 05/22/2023] [Indexed: 06/03/2023] Open
Abstract
Colorectal cancer is one of the cancers with the highest incidence in the world, with almost 2 million new cases annually. In Brazil, the scenario is the same: around 41 thousand new cases were estimated in the last 3 years. This increase in cases further intensifies the interest and importance of studies related to the topic, especially using new approaches. The use of machine learning algorithms for cancer studies has grown in recent years, and they can provide important information to medicine, in addition to making predictions based on the data. In this study, five different classifications were performed, considering patients' survival. Data were extracted from Hospital Based Cancer Registries of São Paulo, which is coordinated by Fundação Oncocentro de São Paulo, containing patients with colorectal cancer from São Paulo state, Brazil, treated between 2000 and 2021. The machine learning models provided predictions and the most important features for each of the algorithms studied. Using part of the dataset to validate our models, the predictors achieved an accuracy of around 77%, with an AUC close to 0.86, and the most important feature in all of them was clinical staging.
Affiliation(s)
- Lucas Buk Cardoso
- Núcleo de Sistemas Eletrônicos Embarcados, Instituto Mauá de Tecnologia, São Paulo, 09580-900, Brazil.
| | - Vanderlei Cunha Parro
- Núcleo de Sistemas Eletrônicos Embarcados, Instituto Mauá de Tecnologia, São Paulo, 09580-900, Brazil
| | - Stela Verzinhasse Peres
- Information and Epidemiology, Fundação Oncocentro de São Paulo, São Paulo, 05409-012, Brazil
| | - Maria Paula Curado
- Epidemiology and Statistics on Cancer Group, A.C. Camargo Cancer Center, São Paulo, 01525-001, Brazil
| | | | - Victor Wünsch Filho
- Information and Epidemiology, Fundação Oncocentro de São Paulo, São Paulo, 05409-012, Brazil
- Epidemiology Department, Faculdade de Saude Pública da Universidade de São Paulo, São Paulo, 01246-904, Brazil
| | - Tatiana Natasha Toporcov
- Epidemiology Department, Faculdade de Saude Pública da Universidade de São Paulo, São Paulo, 01246-904, Brazil
| |
|
180
|
Qasim Gilani S, Syed T, Umair M, Marques O. Skin Cancer Classification Using Deep Spiking Neural Network. J Digit Imaging 2023; 36:1137-1147. [PMID: 36690775 PMCID: PMC10287885 DOI: 10.1007/s10278-023-00776-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/24/2022] [Revised: 12/30/2022] [Accepted: 01/02/2023] [Indexed: 01/24/2023] Open
Abstract
Skin cancer is one of the primary causes of death globally, and experts diagnose it by visual inspection, which can be inaccurate. The need for developing a computer-aided method to aid dermatologists in diagnosing skin cancer is highlighted by the fact that early identification can lower the number of deaths caused by skin malignancies. Among computer-aided techniques, deep learning is the most popular for identifying cancer from skin lesion images. Due to their power-efficient behavior, spiking neural networks are attractive deep neural networks for hardware implementation. We employed deep spiking neural networks using the surrogate gradient descent method to classify 3670 melanoma and 3323 non-melanoma images from the ISIC 2019 dataset. We achieved an accuracy of 89.57% and an F1 score of 90.07% using the proposed spiking VGG-13 model, which is higher than those of VGG-13 and AlexNet while using fewer trainable parameters.
Affiliation(s)
- Syed Qasim Gilani
- Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, 33431 FL USA
| | - Tehreem Syed
- Department of Electrical Engineering and Computer Engineering, Technische Universität Dresden, Dresden, 01069 Saxony Germany
| | - Muhammad Umair
- Department of Electrical and Computer Engineering, George Mason University, Fairfax, 22030 VA USA
| | - Oge Marques
- Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, 33431 FL USA
| |
|
181
|
Ahmad A, Imran M, Ahsan H. Biomarkers as Biomedical Bioindicators: Approaches and Techniques for the Detection, Analysis, and Validation of Novel Biomarkers of Diseases. Pharmaceutics 2023; 15:1630. [PMID: 37376078 DOI: 10.3390/pharmaceutics15061630] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/24/2023] [Revised: 05/24/2023] [Accepted: 05/29/2023] [Indexed: 06/29/2023] Open
Abstract
A biomarker is any measurable biological moiety that can be assessed and measured as a potential index of either normal or abnormal pathophysiology or pharmacological responses to some treatment regimen. Every tissue in the body has a distinct biomolecular make-up, which is known as its biomarkers, which possess particular features, viz., the levels or activities (the ability of a gene or protein to carry out a particular body function) of a gene, protein, or other biomolecules. A biomarker refers to some feature that can be objectively quantified by various biochemical samples and evaluates the exposure of an organism to normal or pathological procedures or their response to some drug interventions. An in-depth and comprehensive realization of the significance of these biomarkers becomes quite important for the efficient diagnosis of diseases and for providing the appropriate directions in case of multiple drug choices being presently available, which can benefit any patient. Presently, advancements in omics technologies have opened up new possibilities to obtain novel biomarkers of different types, employing genomic strategies, epigenetics, metabolomics, transcriptomics, lipid-based analysis, protein studies, etc. Particular biomarkers for specific diseases, their prognostic capabilities, and responses to therapeutic paradigms have been applied for screening of various normal healthy, as well as diseased, tissue or serum samples, and act as appreciable tools in pharmacology and therapeutics, etc. In this review, we have summarized various biomarker types, their classification, and monitoring and detection methods and strategies. Various analytical techniques and approaches of biomarkers have also been described along with various clinically applicable biomarker sensing techniques which have been developed in the recent past. 
A section has also been dedicated to the latest trends in the formulation and designing of nanotechnology-based biomarker sensing and detection developments in this field.
Affiliation(s)
- Anas Ahmad
- Julia McFarlane Diabetes Research Centre (JMDRC), Department of Microbiology, Immunology and Infectious Diseases, Snyder Institute for Chronic Diseases, Hotchkiss Brain Institute, Cumming School of Medicine, Foothills Medical Centre, University of Calgary, Calgary, AB T2N 4N1, Canada
| | - Mohammad Imran
- Therapeutics Research Group, Frazer Institute, Faculty of Medicine, University of Queensland, Brisbane 4102, Australia
| | - Haseeb Ahsan
- Department of Biochemistry, Faculty of Dentistry, Jamia Millia Islamia, New Delhi 110025, India
| |
|
182
|
Nasir MU, Khan MF, Khan MA, Zubair M, Abbas S, Alharbi M, Akhtaruzzaman M. Hematologic Cancer Detection Using White Blood Cancerous Cells Empowered with Transfer Learning and Image Processing. J Healthc Eng 2023; 2023:1406545. [PMID: 37284488 PMCID: PMC10241593 DOI: 10.1155/2023/1406545] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/25/2022] [Revised: 03/23/2023] [Accepted: 03/28/2023] [Indexed: 06/08/2023]
Abstract
Lymphoma and leukemia are fatal syndromes of cancer that cause other diseases and affect all types of age groups including male and female, and disastrous and fatal blood cancer causes an increased savvier death ratio. Both lymphoma and leukemia are associated with the damage and rise of immature lymphocytes, monocytes, neutrophils, and eosinophil cells. So, in the health sector, the early prediction and treatment of blood cancer is a major issue for survival rates. Nowadays, there are various manual techniques to analyze and predict blood cancer using the microscopic medical reports of white blood cell images, which is very steady for prediction and causes a major ratio of deaths. Manual prediction and analysis of eosinophils, lymphocytes, monocytes, and neutrophils are very difficult and time-consuming. In previous studies, they used numerous deep learning and machine learning techniques to predict blood cancer, but there are still some limitations in these studies. So, in this article, we propose a model of deep learning empowered with transfer learning and indulge in image processing techniques to improve the prediction results. The proposed transfer learning model empowered with image processing incorporates different levels of prediction, analysis, and learning procedures and employs different learning criteria like learning rate and epochs. The proposed model used numerous transfer learning models with varying parameters for each model and cloud techniques to choose the best prediction model, and the proposed model used an extensive set of performance techniques and procedures to predict the white blood cells which cause cancer to incorporate image processing techniques. 
So, after extensive procedures of AlexNet, MobileNet, and ResNet with both image processing and without image processing techniques with numerous learning criteria, the stochastic gradient descent momentum incorporated with AlexNet is outperformed with the highest prediction accuracy of 97.3% and the misclassification rate is 2.7% with image processing technique. The proposed model gives good results and can be applied for smart diagnosing of blood cancer using eosinophils, lymphocytes, monocytes, and neutrophils.
Affiliation(s)
- Muhammad Umar Nasir
- Department of Computer Science, Bahria University, Lahore Campus, Lahore 54000, Pakistan
| | - Muhammad Farhan Khan
- Department of Forensic Sciences, University of Health Sciences, Lahore 54000, Pakistan
| | - Muhammad Adnan Khan
- Riphah School of Computing and Innovation, Faculty of Computing, Riphah International University, Lahore Campus, Lahore 54000, Pakistan
- School of Information Technology, Skyline University College, University City Sharjah, Sharjah, UAE
| | - Muhammad Zubair
- Faculty of Computing, Riphah International University, Islamabad 45000, Pakistan
| | - Sagheer Abbas
- School of Computer Science, National College of Business Administration & Economics, Lahore 54000, Pakistan
| | - Meshal Alharbi
- Department of Computer Science, College of Computer Engineering and Sciences, Prince Sattam Bin Abdulaziz University, Alkharjb 11942, Saudi Arabia
| | - Md Akhtaruzzaman
- Department of Computer Science and Engineering, Aisan University of Bangladesh, Ashulia, Dhaka-1230, Bangladesh
| |
Collapse
|
183
|
Syleouni ME, Karavasiloglou N, Manduchi L, Wanner M, Korol D, Ortelli L, Bordoni A, Rohrmann S. Predicting second breast cancer among women with primary breast cancer using machine learning algorithms, a population-based observational study. Int J Cancer 2023. [PMID: 37243372 DOI: 10.1002/ijc.34568] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/10/2022] [Revised: 04/26/2023] [Accepted: 05/05/2023] [Indexed: 05/28/2023]
Abstract
Breast cancer survivors often experience recurrence or a second primary cancer. We developed an automated approach to predict the occurrence of any second breast cancer (SBC) using patient-level data and explored the generalizability of the models with an external validation data source. Breast cancer patients from the cancer registry of Zurich, Zug, Schaffhausen, Schwyz (N = 3213; training dataset) and the cancer registry of Ticino (N = 1073; external validation dataset), diagnosed between 2010 and 2018, were used for model training and validation, respectively. Machine learning (ML) methods, namely a feed-forward neural network (ANN), logistic regression, and extreme gradient boosting (XGB) were employed for classification. The best-performing model was selected based on the receiver operating characteristic (ROC) curve. Key characteristics contributing to a high SBC risk were identified. SBC was diagnosed in 6% of all cases. The most important features for SBC prediction were age at incidence, year of birth, stage, and extent of the pathological primary tumor. The ANN model had the highest area under the ROC curve with 0.78 (95% confidence interval [CI] 0.75-0.82) in the training data and 0.70 (95% CI 0.61-0.79) in the external validation data. Investigating the generalizability of different ML algorithms, we found that the ANN generalized better than the other models on the external validation data. This research is a first step towards the development of an automated tool that could assist clinicians in the identification of women at high risk of developing an SBC and potentially preventing it.
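The area under the ROC curve reported above can be read as the probability that a randomly chosen case with the outcome receives a higher predicted risk than a randomly chosen case without it. The following is a minimal sketch of that pairwise definition, not the authors' evaluation code; the function name `roc_auc` and the example scores and labels are hypothetical and chosen only for illustration.

```python
def roc_auc(scores, labels):
    """Pairwise AUC: share of (positive, negative) pairs ranked correctly.

    Tied scores count as half a correct pair.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical predicted risks and outcomes (1 = event occurred), illustration only.
example_scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
example_labels = [1, 0, 1, 0, 0, 0]
example_auc = roc_auc(example_scores, example_labels)
```

The double loop is quadratic in the number of cases, which is acceptable for a small illustration; rank-based formulas give the same value more efficiently on registry-sized data.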
Affiliation(s)
- Maria-Eleni Syleouni
- Division of Chronic Disease Epidemiology, Epidemiology Biostatistics and Prevention Institute, University of Zurich, Zurich, Switzerland
- Cancer Registry Zurich, Zug, Schaffhausen and Schwyz, University Hospital Zurich, Zurich, Switzerland
- Nena Karavasiloglou
- Division of Chronic Disease Epidemiology, Epidemiology Biostatistics and Prevention Institute, University of Zurich, Zurich, Switzerland
- European Food Safety Authority, Parma, Italy
- Miriam Wanner
- Cancer Registry Zurich, Zug, Schaffhausen and Schwyz, University Hospital Zurich, Zurich, Switzerland
- Dimitri Korol
- Cancer Registry Zurich, Zug, Schaffhausen and Schwyz, University Hospital Zurich, Zurich, Switzerland
- Laura Ortelli
- Ticino Cancer Registry, Public Health Division of Canton Ticino, Locarno, Switzerland
- Andrea Bordoni
- Ticino Cancer Registry, Public Health Division of Canton Ticino, Locarno, Switzerland
- Sabine Rohrmann
- Division of Chronic Disease Epidemiology, Epidemiology Biostatistics and Prevention Institute, University of Zurich, Zurich, Switzerland
- Cancer Registry Zurich, Zug, Schaffhausen and Schwyz, University Hospital Zurich, Zurich, Switzerland
|
184
|
Sinnarasan VSP, Paul D, Das R, Venkatesan A. Gastric Cancer Biomarker Candidates Identified by Machine Learning and Integrative Bioinformatics: Toward Personalized Medicine. OMICS: A Journal of Integrative Biology 2023. [PMID: 37229622 DOI: 10.1089/omi.2023.0015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/27/2023]
Abstract
Gastric cancer (GC) is among the leading causes of cancer-related deaths worldwide. The discovery of robust diagnostic biomarkers for GC remains a challenge. This study sought to identify biomarker candidates for GC by integrating machine learning (ML) and bioinformatics approaches. Transcriptome profiles of patients with GC were analyzed to identify differentially expressed genes between the tumor and adjacent normal tissues. Subsequently, we constructed protein-protein interaction networks so as to find the significant hub genes. Along with the bioinformatics integration of ML methods such as support vector machine, the recursive feature elimination was used to select the most informative genes. The analysis unraveled 160 significant genes, with 88 upregulated and 72 downregulated, 10 hub genes, and 12 features from the variable selection method. The integrated analyses found that EXO1, DTL, KIF14, and TRIP13 genes are significant and poised as potential diagnostic biomarkers in relation to GC. The receiver operating characteristic curve analysis found KIF14 and TRIP13 are strongly associated with diagnosis of GC. We suggest KIF14 and TRIP13 are considered as biomarker candidates that might potentially inform future research on diagnosis, prognosis, or therapeutic targets for GC. These findings collectively offer new future possibilities for precision/personalized medicine research and development for patients with GC.
Affiliation(s)
- Dahrii Paul
- Department for Bioinformatics, School of Life Sciences, Pondicherry University, Puducherry, India
- Rajesh Das
- Department for Bioinformatics, School of Life Sciences, Pondicherry University, Puducherry, India
- Amouda Venkatesan
- Department for Bioinformatics, School of Life Sciences, Pondicherry University, Puducherry, India
|
185
|
Rafiq A, Chursin A, Awad Alrefaei W, Rashed Alsenani T, Aldehim G, Abdel Samee N, Menzli LJ. Detection and Classification of Histopathological Breast Images Using a Fusion of CNN Frameworks. Diagnostics (Basel) 2023; 13:diagnostics13101700. [PMID: 37238186 DOI: 10.3390/diagnostics13101700] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/27/2023] [Revised: 04/07/2023] [Accepted: 04/20/2023] [Indexed: 05/28/2023] Open
Abstract
Breast cancer is responsible for the deaths of thousands of women each year. The diagnosis of breast cancer (BC) frequently makes use of several imaging techniques. However, incorrect identification might occasionally result in unnecessary therapy and diagnostic procedures. Therefore, the accurate identification of breast cancer can save a significant number of patients from undergoing unnecessary surgery and biopsy procedures. As a result of recent developments in the field, deep learning systems used for medical image processing have shown significant performance benefits. Deep learning (DL) models have found widespread use for extracting important features from histopathologic BC images. This has helped to improve the classification performance and has assisted in the automation of the process. In recent times, both convolutional neural networks (CNNs) and hybrid models of deep learning-based approaches have demonstrated impressive performance. In this research, three different types of CNN models are proposed: a straightforward CNN model (1-CNN), a fusion CNN model (2-CNN), and a three CNN model (3-CNN). The findings of the experiment demonstrate that the techniques based on the 3-CNN algorithm performed the best in terms of accuracy (90.10%), recall (89.90%), precision (89.80%), and F1-score (89.90%). In conclusion, the CNN-based approaches that have been developed are contrasted with more modern machine learning and deep learning models. The application of CNN-based methods has resulted in a significant increase in the accuracy of the BC classification.
Affiliation(s)
- Ahsan Rafiq
- School of Automation, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
- Alexander Chursin
- Higher School of Industrial Policy and Entrepreneurship, RUDN University, 6 Miklukho-Maklaya St, Moscow 117198, Russia
- Wejdan Awad Alrefaei
- Department of Programming and Computer Sciences, Applied College in Al-Kharj, Prince Sattam Bin Abdulaziz University, Al-Kharj 16245, Saudi Arabia
- Tahani Rashed Alsenani
- Department of Biology, College of Sciences in Yanbu, Taibah University, Yanbu 46522, Saudi Arabia
- Ghadah Aldehim
- Department of Information Systems, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, P.O. Box 84428, Riyadh 11671, Saudi Arabia
- Nagwan Abdel Samee
- Department of Information Technology, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, P.O. Box 84428, Riyadh 11671, Saudi Arabia
- Leila Jamel Menzli
- Department of Information Systems, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, P.O. Box 84428, Riyadh 11671, Saudi Arabia
|
186
|
Lo CM, Yang YW, Lin JK, Lin TC, Chen WS, Yang SH, Chang SC, Wang HS, Lan YT, Lin HH, Huang SC, Cheng HH, Jiang JK, Lin CC. Modeling the survival of colorectal cancer patients based on colonoscopic features in a feature ensemble vision transformer. Comput Med Imaging Graph 2023; 107:102242. [PMID: 37172354 DOI: 10.1016/j.compmedimag.2023.102242] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 02/21/2023] [Revised: 05/05/2023] [Accepted: 05/07/2023] [Indexed: 05/14/2023]
Abstract
The prognosis of patients with colorectal cancer (CRC) mostly relies on the classic tumor node metastasis (TNM) staging classification. A more accurate and convenient prediction model would provide a better prognosis and assist in treatment. From May 2014 to December 2017, patients who underwent an operation for CRC were enrolled. The proposed feature ensemble vision transformer (FEViT) used ensemble classifiers to combine relevant colonoscopy features from the pretrained vision transformer with clinical features, including sex, age, family history of CRC, and tumor location, to establish the prognostic model. A total of 1729 colonoscopy images were included in the current retrospective study. For the prediction of patient survival, FEViT achieved an accuracy of 94% with an area under the receiver operating characteristic curve of 0.93, which was better than the TNM staging classification (90%, 0.83) in the experiment. FEViT reduced the limited receptive field and gradient disappearance in the conventional convolutional neural network and was a relatively effective and efficient procedure. The promising accuracy of FEViT in modeling survival makes the prognosis of CRC patients more predictable and practical.
Affiliation(s)
- Chung-Ming Lo
- Graduate Institute of Library, Information and Archival Studies, National Chengchi University, Taipei, Taiwan
- Yi-Wen Yang
- Division of Colon and Rectal Surgery, Department of Surgery, Taipei Veterans General Hospital, Taipei, Taiwan; Department of Surgery, School of Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan
- Jen-Kou Lin
- Division of Colon and Rectal Surgery, Department of Surgery, Taipei Veterans General Hospital, Taipei, Taiwan; Department of Surgery, School of Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan
- Tzu-Chen Lin
- Division of Colon and Rectal Surgery, Department of Surgery, Taipei Veterans General Hospital, Taipei, Taiwan; Department of Surgery, School of Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan
- Wei-Shone Chen
- Division of Colon and Rectal Surgery, Department of Surgery, Taipei Veterans General Hospital, Taipei, Taiwan; Department of Surgery, School of Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan
- Shung-Haur Yang
- Division of Colon and Rectal Surgery, Department of Surgery, Taipei Veterans General Hospital, Taipei, Taiwan; Department of Surgery, School of Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan; Department of Surgery, National Yang Ming Chiao Tung University Hospital, Yilan, Taiwan
- Shih-Ching Chang
- Division of Colon and Rectal Surgery, Department of Surgery, Taipei Veterans General Hospital, Taipei, Taiwan; Department of Surgery, School of Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan
- Huann-Sheng Wang
- Division of Colon and Rectal Surgery, Department of Surgery, Taipei Veterans General Hospital, Taipei, Taiwan; Department of Surgery, School of Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan
- Yuan-Tzu Lan
- Division of Colon and Rectal Surgery, Department of Surgery, Taipei Veterans General Hospital, Taipei, Taiwan; Department of Surgery, School of Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan
- Hung-Hsin Lin
- Division of Colon and Rectal Surgery, Department of Surgery, Taipei Veterans General Hospital, Taipei, Taiwan; Department of Surgery, School of Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan
- Sheng-Chieh Huang
- Division of Colon and Rectal Surgery, Department of Surgery, Taipei Veterans General Hospital, Taipei, Taiwan; Department of Surgery, School of Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan
- Hou-Hsuan Cheng
- Division of Colon and Rectal Surgery, Department of Surgery, Taipei Veterans General Hospital, Taipei, Taiwan; Department of Surgery, School of Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan
- Jeng-Kai Jiang
- Division of Colon and Rectal Surgery, Department of Surgery, Taipei Veterans General Hospital, Taipei, Taiwan; Department of Surgery, School of Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan
- Chun-Chi Lin
- Division of Colon and Rectal Surgery, Department of Surgery, Taipei Veterans General Hospital, Taipei, Taiwan; Department of Surgery, School of Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan
|
187
|
Ma J, Guan Y, Xing F, Eltzov E, Wang Y, Li X, Tai B. Accurate and non-destructive monitoring of mold contamination in foodstuffs based on whole-cell biosensor array coupling with machine-learning prediction models. Journal of Hazardous Materials 2023; 449:131030. [PMID: 36827728 DOI: 10.1016/j.jhazmat.2023.131030] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 11/11/2022] [Revised: 02/15/2023] [Accepted: 02/15/2023] [Indexed: 06/18/2023]
Abstract
Mold contamination in foodstuffs causes huge economic losses, quality deterioration and mycotoxin production. Thus, non-destructive and accurate monitoring of mold occurrence in foodstuffs is highly desirable. We proposed a novel whole-cell biosensor array to monitor pre-mold events in foodstuffs. Firstly, 3 volatile markers (ethyl propionate, 1-methyl-1H-pyrrole and 2,3-butanediol) were identified from pre-mold peanuts using gas chromatography-mass spectrometry. Together with 3 other frequently reported volatiles from Aspergillus flavus infection, the volatiles at subinhibitory concentrations induced significant but differential response patterns from 14 stress-responsive Escherichia coli promoters. Subsequently, a whole-cell biosensor array based on the 14 promoters was constructed after whole-cell immobilization in calcium alginate. To discriminate the response patterns of the whole-cell biosensor array to mold-contaminated foodstuffs, optimal classifiers were determined by comparing 6 machine-learning algorithms. Random forest classifiers achieved 100% accuracy in discriminating healthy from moldy peanuts and maize, and 95% and 98% accuracy in discriminating pre-mold stages for infected peanuts and maize, respectively. Sparse partial least squares discriminant analysis separated moldy peanuts from moldy maize with 83% accuracy. The results demonstrated the high accuracy and practicality of our method, based on a whole-cell biosensor array coupled with machine-learning classifiers, for mold monitoring in foodstuffs.
Affiliation(s)
- Junning Ma
- Key Laboratory of Agro-Products Quality and Safety Control in Storage and Transport Process, Ministry of Agriculture and Rural Affairs / Institute of Food Science and Technology, Chinese Academy of Agricultural Sciences, Beijing 100193, China
- Yue Guan
- College of Food Science and Technology, Zhejiang University of Technology, Hangzhou 310014, China
- Fuguo Xing
- Key Laboratory of Agro-Products Quality and Safety Control in Storage and Transport Process, Ministry of Agriculture and Rural Affairs / Institute of Food Science and Technology, Chinese Academy of Agricultural Sciences, Beijing 100193, China
- Evgeni Eltzov
- Department of Postharvest Science, Institute of Postharvest and Food Sciences, The Volcani Center, Agricultural Research Organization, Bet Dagan 50250, Israel
- Yan Wang
- College of Food Science and Technology, Zhejiang University of Technology, Hangzhou 310014, China
- Xu Li
- Key Laboratory of Agro-Products Quality and Safety Control in Storage and Transport Process, Ministry of Agriculture and Rural Affairs / Institute of Food Science and Technology, Chinese Academy of Agricultural Sciences, Beijing 100193, China
- Bowen Tai
- Key Laboratory of Agro-Products Quality and Safety Control in Storage and Transport Process, Ministry of Agriculture and Rural Affairs / Institute of Food Science and Technology, Chinese Academy of Agricultural Sciences, Beijing 100193, China
|
188
|
Pan X, Cong H, Wang X, Zhang H, Ge Y, Hu S. Deep learning-extracted CT imaging phenotypes predict response to total resection in colorectal cancer. Acta Radiol 2023; 64:1783-1791. [PMID: 36762417 DOI: 10.1177/02841851231152685] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/11/2023]
Abstract
BACKGROUND Deep learning surpasses many traditional methods for many vision tasks, allowing the transformation of hierarchical features into more abstract, high-level features. PURPOSE To evaluate the prognostic value of preoperative computed tomography (CT) image texture features and deep learning self-learning high-throughput features (SHF) on postoperative overall survival in the treatment of patients with colorectal cancer (CRC). MATERIAL AND METHODS The dataset consisted of 810 enrolled patients with CRC confirmed from 10 November 2011 to 10 February 2018. In contrast, SHF extracted by deep learning with multi-task training mechanism and texture features were extracted from the CT with tumor volume region of interest, respectively, and combined with the Cox proportional hazard (CoxPH) model for initial validation to obtain a RAD score to classify patients into high- and low-risk groups. The SHF stability was further validated in combination with Neural Multi-Task Logistic Regression (N-MTLR) model. The overall recognition ability and accuracy of CoxPH and N-MTLR model were evaluated by C-index and Integrated Brier Score (IBS). RESULTS SHF had a more significant degree of differentiation than texture features. The result is (SHF vs. texture features: C-index: 0.884 vs. 0.611; IBS: 0.025 vs. 0.073) in the CoxPH model, and (SHF vs. texture features: C-index: 0.861 vs. 0.630; IBS: 0.024 vs. 0.065) in N-MTLR. CONCLUSION SHF is superior to texture features and has potential application for the preoperative prediction of the individualized treatment of CRC.
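The C-index used above to compare feature sets measures how often a model ranks patients correctly by risk when follow-up may be censored. The following is a minimal sketch of Harrell's concordance index, not the study's implementation; the function name `concordance_index` and the toy follow-up times, event flags, and risk scores are hypothetical.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored survival data.

    A pair (i, j) is comparable when patient i had the event (events[i] == 1)
    and a strictly shorter follow-up time than patient j. The pair is
    concordant when the model gave patient i the higher risk; tied risks
    count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # a censored patient cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical follow-up (months), event flags (1 = death, 0 = censored), risk scores.
toy_times = [5, 10, 12, 20]
toy_events = [1, 0, 1, 1]
toy_risks = [0.9, 0.4, 0.1, 0.7]
toy_c_index = concordance_index(toy_times, toy_events, toy_risks)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the reported values of 0.884 and 0.611 should be read.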
Affiliation(s)
- Xiang Pan
- The School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, PR China
- Faculty of Health Sciences, University of Macau, Macau, PR China
- He Cong
- The School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, PR China
- Xiaolei Wang
- The School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, PR China
- Heng Zhang
- Department of Radiology, Affiliated Hospital of Jiangnan University, Wuxi, Jiangsu, PR China
- Yuxi Ge
- Department of Radiology, Affiliated Hospital of Jiangnan University, Wuxi, Jiangsu, PR China
- Shudong Hu
- Department of Radiology, Affiliated Hospital of Jiangnan University, Wuxi, Jiangsu, PR China
|
189
|
Charlton CE, Poon MTC, Brennan PM, Fleuriot JD. Development of prediction models for one-year brain tumour survival using machine learning: a comparison of accuracy and interpretability. Computer Methods and Programs in Biomedicine 2023; 233:107482. [PMID: 36947980 DOI: 10.1016/j.cmpb.2023.107482] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/15/2022] [Revised: 12/15/2022] [Accepted: 03/12/2023] [Indexed: 06/18/2023]
Abstract
BACKGROUND AND OBJECTIVE Prediction of survival in patients diagnosed with a brain tumour is challenging because of heterogeneous tumour behaviours and treatment response. Advances in machine learning have led to the development of clinical prognostic models, but due to the lack of model interpretability, integration into clinical practice is almost non-existent. In this retrospective study, we compare five classification models with varying degrees of interpretability for the prediction of brain tumour survival greater than one year following diagnosis. METHODS 1028 patients aged ≥16 years with a brain tumour diagnosis between April 2012 and April 2020 were included in our study. Three intrinsically interpretable 'glass box' classifiers (Bayesian Rule Lists [BRL], Explainable Boosting Machine [EBM], and Logistic Regression [LR]), and two 'black box' classifiers (Random Forest [RF] and Support Vector Machine [SVM]) were trained on electronic patient records for the prediction of one-year survival. All models were evaluated using balanced accuracy (BAC), F1-score, sensitivity, specificity, and receiver operating characteristics. Black box model interpretability and misclassified predictions were quantified using SHapley Additive exPlanations (SHAP) values and model feature importance was evaluated by clinical experts. RESULTS The RF model achieved the highest BAC of 78.9%, closely followed by SVM (77.7%), LR (77.5%) and EBM (77.1%). Across all models, age, diagnosis (tumour type), functional features, and first treatment were top contributors to the prediction of one-year survival. We used EBM and SHAP to explain model misclassifications and investigated the role of feature interactions in prognosis. CONCLUSION Interpretable models are a natural choice for the domain of predictive medicine. Intrinsically interpretable models, such as EBMs, may provide an advantage over traditional clinical assessment of brain tumour prognosis by weighting potential risk factors and their interactions that may be unknown to clinicians. An agreement between model predictions and clinical knowledge is essential for establishing trust in the model's decision-making process, as well as trust that the model will make accurate predictions when applied to new data.
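Balanced accuracy, the headline metric in this abstract, is the mean of sensitivity and specificity, which keeps a majority class from dominating the score. The following is a minimal sketch under that standard definition, not the authors' code; the function name `balanced_accuracy` and the toy labels are hypothetical.

```python
def balanced_accuracy(y_true, y_pred):
    """Return (sensitivity, specificity, balanced accuracy) for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = tp / (tp + fn)  # share of true positives that were detected
    specificity = tn / (tn + fp)  # share of true negatives that were detected
    return sensitivity, specificity, (sensitivity + specificity) / 2


# Hypothetical imbalanced example: 8 patients with the outcome, 2 without.
toy_true = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]
toy_pred = [1, 1, 1, 1, 1, 1, 1, 1, 1, 0]
toy_sens, toy_spec, toy_bac = balanced_accuracy(toy_true, toy_pred)
toy_plain_accuracy = sum(t == p for t, p in zip(toy_true, toy_pred)) / len(toy_true)
```

In this toy case plain accuracy is 0.9 while balanced accuracy is 0.75, because half of the minority class is misclassified; that gap is the reason to prefer BAC when one outcome is much more common than the other.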
Affiliation(s)
- Colleen E Charlton
- Artificial Intelligence and its Applications Institute, School of Informatics, University of Edinburgh, 10 Crichton Street, Edinburgh EH8 9AB, UK
- Michael T C Poon
- Cancer Research UK Brain Tumour Centre of Excellence, CRUK Edinburgh Centre, University of Edinburgh, Edinburgh, UK; Department of Clinical Neuroscience, Royal Infirmary of Edinburgh, 51 Little France Crescent EH16 4SA, UK; Translational Neurosurgery, Centre for Clinical Brain Sciences, University of Edinburgh, Edinburgh, UK; Centre for Medical Informatics, Usher Institute, University of Edinburgh, Edinburgh, UK
- Paul M Brennan
- Cancer Research UK Brain Tumour Centre of Excellence, CRUK Edinburgh Centre, University of Edinburgh, Edinburgh, UK; Department of Clinical Neuroscience, Royal Infirmary of Edinburgh, 51 Little France Crescent EH16 4SA, UK; Translational Neurosurgery, Centre for Clinical Brain Sciences, University of Edinburgh, Edinburgh, UK
- Jacques D Fleuriot
- Artificial Intelligence and its Applications Institute, School of Informatics, University of Edinburgh, 10 Crichton Street, Edinburgh EH8 9AB, UK
|
190
|
Al-hajjar ALN, Al-Qurabat AKM. An overview of machine learning methods in enabling IoMT-based epileptic seizure detection. The Journal of Supercomputing 2023; 79:1-48. [PMID: 37359338 PMCID: PMC10123593 DOI: 10.1007/s11227-023-05299-9] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Accepted: 04/12/2023] [Indexed: 06/28/2023]
Abstract
The healthcare industry is rapidly automating, in large part because of the Internet of Things (IoT). The sector of the IoT devoted to medical research is sometimes called the Internet of Medical Things (IoMT). Data collection and processing are the fundamental components of all IoMT applications. Given the vast quantity of data involved in healthcare and the value of precise forecasts, machine learning (ML) algorithms need to be integrated into IoMT without delay. Together, IoMT, cloud services, and ML techniques have become effective tools for solving many problems in the healthcare sector, such as epileptic seizure monitoring and detection. One of the biggest hazards to people's lives is epilepsy, a lethal neurological condition that has become a global issue. To prevent the deaths of thousands of epileptic patients each year, there is a critical need for an effective method of detecting epileptic seizures at their earliest stage. Numerous medical procedures, including epileptic monitoring and diagnosis, may be carried out remotely with the use of IoMT, which will reduce healthcare expenses and improve services. This article seeks to act as both a collection and a review of the different cutting-edge ML applications for epilepsy detection that are presently being combined with IoMT.
Affiliation(s)
- Ali Kadhum M. Al-Qurabat
- Department of Computer Science, College of Science for Women, University of Babylon, Babylon, Iraq
|
191
|
Li H, Wang S, Liu B, Fang M, Cao R, He B, Liu S, Hu C, Dong D, Wang X, Wang H, Tian J. A multi-view co-training network for semi-supervised medical image-based prognostic prediction. Neural Netw 2023; 164:455-463. [PMID: 37182347 DOI: 10.1016/j.neunet.2023.04.030] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/16/2022] [Revised: 03/07/2023] [Accepted: 04/18/2023] [Indexed: 05/16/2023]
Abstract
Prognostic prediction has long been a hotspot in disease analysis and management, and the development of image-based prognostic prediction models has significant clinical implications for current personalized treatment strategies. The main challenge in prognostic prediction is to model a regression problem based on censored observations, and semi-supervised learning has the potential to play an important role in improving the utilization efficiency of censored data. However, few effective semi-supervised paradigms are available so far. In this paper, we propose a semi-supervised co-training deep neural network incorporating a support vector regression layer for survival time estimation (Co-DeepSVS) that improves the efficiency in utilizing censored data for prognostic prediction. First, we introduce a support vector regression (SVR) layer in deep neural networks to deal with censored data and directly predict survival time, and more importantly to calculate the labeling confidence of each case. Then, we apply a semi-supervised multi-view co-training framework to achieve accurate prognostic prediction, where labeling confidence estimation with prior knowledge of pseudo time is conducted for each view. Experimental results demonstrate that the proposed Co-DeepSVS has a promising prognostic ability and surpasses most widely used methods on a multi-phase CT dataset. In addition, the introduction of the SVR layer makes the model more robust in the presence of follow-up bias.
Affiliation(s)
- Hailin Li
- Beijing Advanced Innovation Center for Big Data-Based Precision Medicine, School of Engineering Medicine, Beihang University, Beijing, 100191, China; CAS Key Laboratory of Molecular Imaging, Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China
- Siwen Wang
- CAS Key Laboratory of Molecular Imaging, Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China; School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, 100049, China
- Bo Liu
- Lanzhou University Second Hospital, Lanzhou, 730050, Gansu, China; Department of Radiology, Shandong Provincial Hospital Affiliated to Shandong First Medical University, Shandong University, Jinan, 250021, Shandong, China
- Mengjie Fang
- Beijing Advanced Innovation Center for Big Data-Based Precision Medicine, School of Engineering Medicine, Beihang University, Beijing, 100191, China; CAS Key Laboratory of Molecular Imaging, Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China
- Runnan Cao
- CAS Key Laboratory of Molecular Imaging, Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China; School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, 100049, China
- Bingxi He
- Beijing Advanced Innovation Center for Big Data-Based Precision Medicine, School of Engineering Medicine, Beihang University, Beijing, 100191, China; CAS Key Laboratory of Molecular Imaging, Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China
- Shengyuan Liu
- CAS Key Laboratory of Molecular Imaging, Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China; School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, 100049, China
- Chaoen Hu
- CAS Key Laboratory of Molecular Imaging, Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China
- Di Dong
- CAS Key Laboratory of Molecular Imaging, Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China; School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, 100049, China
- Ximing Wang
- Department of Radiology, Shandong Provincial Hospital Affiliated to Shandong First Medical University, Shandong University, Jinan, 250021, Shandong, China
- Hexiang Wang
- Department of Radiology, The Affiliated Hospital of Qingdao University, Qingdao, 266000, Shandong, China
- Jie Tian
- Beijing Advanced Innovation Center for Big Data-Based Precision Medicine, School of Engineering Medicine, Beihang University, Beijing, 100191, China; CAS Key Laboratory of Molecular Imaging, Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China
|
192
|
Qu J, Li C, Liu M, Wang Y, Feng Z, Li J, Wang W, Wu F, Zhang S, Zhao X. Prognostic Models Using Machine Learning Algorithms and Treatment Outcomes of Occult Breast Cancer Patients. J Clin Med 2023; 12:jcm12093097. [PMID: 37176539 PMCID: PMC10179501 DOI: 10.3390/jcm12093097] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/11/2023] [Revised: 03/05/2023] [Accepted: 04/20/2023] [Indexed: 05/15/2023] Open
Abstract
BACKGROUND Occult breast cancer (OBC) is an uncommon malignant tumor, and the prognosis and treatment of OBC remain controversial. Currently, there exists no accurate prognostic clinical model for OBC, and the treatment outcomes of chemotherapy and surgery in its different molecular subtypes are still unknown. METHODS The SEER database provided the data used for this study's analysis (2010-2019). To identify the prognostic variables for patients with OBC, we conducted Cox regression analysis and constructed prognostic models using six machine learning algorithms to predict overall survival (OS) of OBC patients. A series of validation methods, including the calibration curve and the area under the receiver operating characteristic (ROC) curve (AUC), were employed to validate the accuracy and reliability of the logistic regression (LR) models. The effectiveness of clinical application of the predictive models was validated using decision curve analysis (DCA). We also investigated the role of chemotherapy and surgery in OBC patients with different molecular subtypes, with the help of K-M survival analysis as well as propensity score matching, and these results were further validated by subgroup Cox analysis. RESULTS The LR models performed best, with high precision and applicability, and predicted the OS of OBC patients most accurately (test set: 1-year AUC = 0.851, 3-year AUC = 0.790 and 5-year survival AUC = 0.824). Interestingly, we found that the N1 and N2 stage OBC patients had more favorable prognosis than N0 stage patients, but the N3 stage was similar to the N0 stage (OS: N0 vs. N1, HR = 0.6602, 95%CI 0.4568-0.9542, p < 0.05; N0 vs. N2, HR = 0.4716, 95%CI 0.2351-0.9464, p < 0.05; N0 vs. N3, HR = 0.96, 95%CI 0.6176-1.5844, p = 0.96). Age >80 years and distant metastasis were also independent prognostic factors for OBC. In terms of treatment, our multivariate Cox regression analysis found that surgery and radiotherapy were both independent protective variables for OBC patients, but chemotherapy was not. We also found that chemotherapy significantly improved both OS and breast cancer-specific survival (BCSS) only in the HR-/HER2+ molecular subtype (OS: HR = 0.15, 95%CI 0.037-0.57, p < 0.01; BCSS: HR = 0.027, 95%CI 0.027-0.81, p < 0.05). However, surgery improved prognosis only in the HR-/HER2+ and HR+/HER2- subtypes. CONCLUSIONS We analyzed the clinical features and prognostic factors of OBC patients and constructed machine learning prognostic models with high precision and applicability to predict their overall survival. The treatment results in different molecular subtypes suggested that primary surgery might improve the survival of HR+/HER2- and HR-/HER2+ subtypes; however, only the HR-/HER2+ subtype could benefit from chemotherapy. The necessity of surgery and chemotherapy needs to be carefully considered for OBC patients with other subtypes.
Affiliation(s)
- Jingkun Qu
- Department of Oncology, The Second Affiliated Hospital of Xi'an Jiaotong University, 157 West Fifth Street, Xi'an 710004, China
| | - Chaofan Li
- Department of Oncology, The Second Affiliated Hospital of Xi'an Jiaotong University, 157 West Fifth Street, Xi'an 710004, China
| | - Mengjie Liu
- Department of Oncology, The Second Affiliated Hospital of Xi'an Jiaotong University, 157 West Fifth Street, Xi'an 710004, China
| | - Yusheng Wang
- Department of Otolaryngology, The Second Affiliated Hospital of Xi'an Jiaotong University, 157 West Fifth Street, Xi'an 710004, China
| | - Zeyao Feng
- Department of Oncology, The Second Affiliated Hospital of Xi'an Jiaotong University, 157 West Fifth Street, Xi'an 710004, China
| | - Jia Li
- Department of Oncology, The Second Affiliated Hospital of Xi'an Jiaotong University, 157 West Fifth Street, Xi'an 710004, China
| | - Weiwei Wang
- Department of Oncology, The Second Affiliated Hospital of Xi'an Jiaotong University, 157 West Fifth Street, Xi'an 710004, China
| | - Fei Wu
- Department of Oncology, The Second Affiliated Hospital of Xi'an Jiaotong University, 157 West Fifth Street, Xi'an 710004, China
| | - Shuqun Zhang
- Department of Oncology, The Second Affiliated Hospital of Xi'an Jiaotong University, 157 West Fifth Street, Xi'an 710004, China
| | - Xixi Zhao
- Department of Radiation Oncology, The Second Affiliated Hospital of Xi'an Jiaotong University, 157 West Fifth Street, Xi'an 710004, China
|
193
|
Inagaki M, Uchiyama M, Yoshikawa-Kawabe K, Ito M, Murakami H, Gunji M, Minoshima M, Kohnoh T, Ito R, Kodama Y, Tanaka-Sakai M, Nakase A, Goto N, Tsushima Y, Mori S, Kozuka M, Otomo R, Hirai M, Fujino M, Yokoyama T. Comprehensive circulating microRNA profile as a supersensitive biomarker for early-stage lung cancer screening. J Cancer Res Clin Oncol 2023:10.1007/s00432-023-04728-9. [PMID: 37076642 PMCID: PMC10115369 DOI: 10.1007/s00432-023-04728-9] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 01/31/2023] [Accepted: 03/28/2023] [Indexed: 04/21/2023]
Abstract
PURPOSE Less-invasive early diagnosis of lung cancer is essential for improving patient survival rates. The purpose of this study is to demonstrate that the comprehensive serum miRNA profile is a highly sensitive biomarker for early-stage lung cancer, in direct comparison with a conventional blood biomarker, using next-generation sequencing (NGS) technology combined with automated machine learning (AutoML). METHODS We first evaluated the reproducibility of our measurement system using Pearson's correlation coefficients between samples derived from a single pooled RNA sample. To generate comprehensive miRNA profiles, we performed NGS analysis of miRNAs in 262 serum samples. In the discovery set (57 patients with lung cancer and 57 healthy controls), 1123 miRNA-based diagnostic models for lung cancer detection were constructed and screened using AutoML technology. The diagnostic performance of the best-performing model was evaluated on the validation samples (74 patients with lung cancer and 74 healthy controls). RESULTS The Pearson's correlation coefficients between samples derived from the pooled RNA sample were ≥ 0.98. In the validation analysis, the best model showed a high AUC score (0.98) and a high sensitivity for early-stage lung cancer (85.7%, n = 28). Furthermore, in comparison to carcinoembryonic antigen (CEA), a conventional blood biomarker for adenocarcinoma, the miRNA-based model showed higher sensitivity for early-stage lung adenocarcinoma (CEA, 27.8%, n = 18; miRNA-based model, 77.8%, n = 18). CONCLUSION The miRNA-based diagnostic model showed a high sensitivity for lung cancer, including early-stage disease. Our study provides experimental evidence that the comprehensive serum miRNA profile can be a highly sensitive blood biomarker for early-stage lung cancer.
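As a rough illustration of how the sensitivities reported in the abstract above relate to sample counts, the sketch below computes sensitivity as detected cases divided by all cancer cases. The detected-case counts (24 of 28, 5 of 18, 14 of 18) are assumptions back-calculated from the reported percentages; the abstract itself gives only the percentages and the group sizes.

```python
def sensitivity(detected: int, total_cases: int) -> float:
    """Fraction of true cancer cases that the test flags as positive."""
    if total_cases <= 0:
        raise ValueError("total_cases must be positive")
    return detected / total_cases


# Hypothetical counts inferred from the reported percentages (assumption).
assumed_counts = {
    "miRNA model, early-stage lung cancer": (24, 28),
    "CEA, early-stage adenocarcinoma": (5, 18),
    "miRNA model, early-stage adenocarcinoma": (14, 18),
}

for label, (detected, total) in assumed_counts.items():
    print(f"{label}: {sensitivity(detected, total):.1%}")
```

With groups this small, a single additional detected case shifts the sensitivity by several percentage points, which is worth keeping in mind when comparing the two markers.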
Affiliation(s)
- Masayasu Inagaki
- Department of Respiratory Medicine, Japanese Red Cross Aichi Medical Center Nagoya Daiichi Hospital, 3-35 Michishita-Cho, Nakamura-Ku, Nagoya, Aichi, 453-8511, Japan
| | - Makoto Uchiyama
- Research and Development Division, ARKRAY, Inc., Yousuien-Nai, 59 Gansuin-Cho, Kamigyo-Ku, Kyoto, 602-0008, Japan.
| | - Kanae Yoshikawa-Kawabe
- Department of Pathology, Japanese Red Cross Aichi Medical Center Nagoya Daiichi Hospital, Nagoya, Aichi, 453-8511, Japan
| | - Masafumi Ito
- Department of Pathology, Japanese Red Cross Aichi Medical Center Nagoya Daiichi Hospital, Nagoya, Aichi, 453-8511, Japan
| | - Hideki Murakami
- Department of Pathology, Japanese Red Cross Aichi Medical Center Nagoya Daiichi Hospital, Nagoya, Aichi, 453-8511, Japan
| | - Masaharu Gunji
- Department of Cytology and Molecular Pathology, Japanese Red Cross Aichi Medical Center Nagoya Daiichi Hospital, Nagoya, Aichi, 453-8511, Japan
| | - Makoto Minoshima
- Department of Cytology and Molecular Pathology, Japanese Red Cross Aichi Medical Center Nagoya Daiichi Hospital, Nagoya, Aichi, 453-8511, Japan
| | - Takashi Kohnoh
- Department of Respiratory Medicine, Japanese Red Cross Aichi Medical Center Nagoya Daiichi Hospital, 3-35 Michishita-Cho, Nakamura-Ku, Nagoya, Aichi, 453-8511, Japan
| | - Ryota Ito
- Department of Respiratory Medicine, Japanese Red Cross Aichi Medical Center Nagoya Daiichi Hospital, 3-35 Michishita-Cho, Nakamura-Ku, Nagoya, Aichi, 453-8511, Japan
| | - Yuta Kodama
- Department of Respiratory Medicine, Japanese Red Cross Aichi Medical Center Nagoya Daiichi Hospital, 3-35 Michishita-Cho, Nakamura-Ku, Nagoya, Aichi, 453-8511, Japan
| | - Mari Tanaka-Sakai
- Department of Respiratory Medicine, Japanese Red Cross Aichi Medical Center Nagoya Daiichi Hospital, 3-35 Michishita-Cho, Nakamura-Ku, Nagoya, Aichi, 453-8511, Japan
| | - Atsushi Nakase
- Department of Respiratory Medicine, Japanese Red Cross Aichi Medical Center Nagoya Daiichi Hospital, 3-35 Michishita-Cho, Nakamura-Ku, Nagoya, Aichi, 453-8511, Japan
| | - Nozomi Goto
- Department of Respiratory Medicine, Japanese Red Cross Aichi Medical Center Nagoya Daiichi Hospital, 3-35 Michishita-Cho, Nakamura-Ku, Nagoya, Aichi, 453-8511, Japan
| | - Yusuke Tsushima
- Department of Respiratory Medicine, Japanese Red Cross Aichi Medical Center Nagoya Daiichi Hospital, 3-35 Michishita-Cho, Nakamura-Ku, Nagoya, Aichi, 453-8511, Japan
| | - Shoich Mori
- Department of Respiratory Surgery, Japanese Red Cross Aichi Medical Center Nagoya Daiichi Hospital, Nagoya, Aichi, 453-8511, Japan
| | - Masahiro Kozuka
- Research and Development Division, ARKRAY, Inc., Yousuien-Nai, 59 Gansuin-Cho, Kamigyo-Ku, Kyoto, 602-0008, Japan
| | - Ryo Otomo
- Research and Development Division, ARKRAY, Inc., Yousuien-Nai, 59 Gansuin-Cho, Kamigyo-Ku, Kyoto, 602-0008, Japan
| | - Mitsuharu Hirai
- Research and Development Division, ARKRAY, Inc., Yousuien-Nai, 59 Gansuin-Cho, Kamigyo-Ku, Kyoto, 602-0008, Japan
| | - Masahiko Fujino
- Department of Pathology, Japanese Red Cross Aichi Medical Center Nagoya Daiichi Hospital, Nagoya, Aichi, 453-8511, Japan
| | - Toshihiko Yokoyama
- Department of Respiratory Medicine, Japanese Red Cross Aichi Medical Center Nagoya Daiichi Hospital, 3-35 Michishita-Cho, Nakamura-Ku, Nagoya, Aichi, 453-8511, Japan.
|
194
|
Saito S, Sakamoto S, Higuchi K, Sato K, Zhao X, Wakai K, Kanesaka M, Kamada S, Takeuchi N, Sazuka T, Imamura Y, Anzai N, Ichikawa T, Kawakami E. Machine-learning predicts time-series prognosis factors in metastatic prostate cancer patients treated with androgen deprivation therapy. Sci Rep 2023; 13:6325. [PMID: 37072487 PMCID: PMC10113215 DOI: 10.1038/s41598-023-32987-6] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 01/17/2023] [Accepted: 04/05/2023] [Indexed: 05/03/2023] Open
Abstract
Machine learning technology is expected to support diagnosis and prognosis prediction in medicine. We used machine learning to construct a new prognostic prediction model for prostate cancer patients based on longitudinal data obtained from age at diagnosis, peripheral blood and urine tests of 340 prostate cancer patients. Random survival forest (RSF) and survival tree were used for machine learning. In the time-series prognostic prediction model for metastatic prostate cancer patients, the RSF model showed better prediction accuracy than the conventional Cox proportional hazards model for almost all time periods of progression-free survival (PFS), overall survival (OS) and cancer-specific survival (CSS). Based on the RSF model, we created a clinically applicable prognostic prediction model using survival trees for OS and CSS by combining the values of lactate dehydrogenase (LDH) before starting treatment and alkaline phosphatase (ALP) at 120 days after treatment. Machine learning provides useful information for predicting the prognosis of metastatic prostate cancer prior to treatment intervention by considering the nonlinear and combined impacts of multiple features. The addition of data after the start of treatment would allow for more precise prognostic risk assessment of patients and would be beneficial for subsequent treatment selection.
Affiliation(s)
- Shinpei Saito
- Department of Urology, Graduate School of Medicine, Chiba University, 1-8-1 Inohana, Chuo-Ku, Chiba, Chiba, 260-8670, Japan
- Department of Artificial Intelligence Medicine, Graduate School of Medicine, Chiba University, Chiba, Chiba, Japan
| | - Shinichi Sakamoto
- Department of Urology, Graduate School of Medicine, Chiba University, 1-8-1 Inohana, Chuo-Ku, Chiba, Chiba, 260-8670, Japan.
| | | | - Kodai Sato
- Department of Urology, Graduate School of Medicine, Chiba University, 1-8-1 Inohana, Chuo-Ku, Chiba, Chiba, 260-8670, Japan
- Department of Artificial Intelligence Medicine, Graduate School of Medicine, Chiba University, Chiba, Chiba, Japan
| | - Xue Zhao
- Department of Urology, Graduate School of Medicine, Chiba University, 1-8-1 Inohana, Chuo-Ku, Chiba, Chiba, 260-8670, Japan
| | - Ken Wakai
- Teikyo University Chiba Medical Center, Ichihara, Chiba, Japan
| | - Manato Kanesaka
- Department of Urology, Graduate School of Medicine, Chiba University, 1-8-1 Inohana, Chuo-Ku, Chiba, Chiba, 260-8670, Japan
| | - Shuhei Kamada
- Department of Urology, Graduate School of Medicine, Chiba University, 1-8-1 Inohana, Chuo-Ku, Chiba, Chiba, 260-8670, Japan
| | - Nobuyoshi Takeuchi
- Department of Urology, Graduate School of Medicine, Chiba University, 1-8-1 Inohana, Chuo-Ku, Chiba, Chiba, 260-8670, Japan
| | - Tomokazu Sazuka
- Department of Urology, Graduate School of Medicine, Chiba University, 1-8-1 Inohana, Chuo-Ku, Chiba, Chiba, 260-8670, Japan
| | - Yusuke Imamura
- Department of Urology, Graduate School of Medicine, Chiba University, 1-8-1 Inohana, Chuo-Ku, Chiba, Chiba, 260-8670, Japan
| | - Naohiko Anzai
- Department of Pharmacology, Graduate School of Medicine, Chiba University, Chiba, Chiba, Japan
| | - Tomohiko Ichikawa
- Department of Urology, Graduate School of Medicine, Chiba University, 1-8-1 Inohana, Chuo-Ku, Chiba, Chiba, 260-8670, Japan
| | - Eiryo Kawakami
- Department of Artificial Intelligence Medicine, Graduate School of Medicine, Chiba University, Chiba, Chiba, Japan
- Advanced Data Science Project (ADSP), RIKEN Information R&D and Strategy Headquarters, RIKEN, Kanagawa, Japan
- Institute for Advanced Academic Research (IAAR), Chiba University, Chiba, Chiba, Japan
|
195
|
Eiro N, Medina A, Gonzalez LO, Fraile M, Palacios A, Escaf S, Fernández-Gómez JM, Vizoso FJ. Evaluation of Matrix Metalloproteases by Artificial Intelligence Techniques in Negative Biopsies as New Diagnostic Strategy in Prostate Cancer. Int J Mol Sci 2023; 24:ijms24087022. [PMID: 37108185 PMCID: PMC10139111 DOI: 10.3390/ijms24087022] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 03/09/2023] [Revised: 03/27/2023] [Accepted: 04/04/2023] [Indexed: 04/29/2023] Open
Abstract
Usually, after an abnormal level of serum prostate-specific antigen (PSA) or digital rectal exam, men undergo a prostate needle biopsy. However, the traditional sextant technique misses 15-46% of cancers. At present, there are problems regarding disease diagnosis/prognosis, especially in patients' classification, because the information to be handled is complex and challenging to process. Matrix metalloproteases (MMPs) are more highly expressed in prostate cancer (PCa) than in benign prostate tissues. To assess their possible contribution to the diagnosis of PCa, we evaluated the expression of several MMPs in prostate tissues before and after PCa diagnosis using machine learning, classifiers, and supervised algorithms. A retrospective study was conducted on 29 patients diagnosed with PCa with previous benign needle biopsies, 45 patients with benign prostatic hyperplasia (BPH), and 18 patients with high-grade prostatic intraepithelial neoplasia (HGPIN). An immunohistochemical study was performed on tissue samples from tumor and non-tumor areas using specific antibodies against MMP-2, -9, -11, and -13, and the tissue inhibitor of MMPs-3 (TIMP-3); protein expression by different cell types was then analyzed with several machine learning techniques. Compared with BPH or HGPIN specimens, epithelial cells (ECs) and fibroblasts from benign prostate biopsies before the diagnosis of PCa showed a significantly higher expression of MMPs and TIMP-3. Machine learning techniques differentiated these patient groups with greater than 95% accuracy when considering ECs, and with slightly lower accuracy when considering fibroblasts. In addition, evolutionary changes were found in paired tissues from benign biopsy to prostatectomy specimens in the same patient. Thus, ECs from the tumor zone from prostatectomy showed higher expressions of MMPs and TIMP-3 compared to ECs of the corresponding zone from the benign biopsy. Similar differences were found for expressions of MMP-9 and TIMP-3 between fibroblasts from these zones. The classifiers determined that patients with benign prostate biopsies before the diagnosis of PCa showed a high MMPs/TIMP-3 expression by ECs, both in the zone without future cancer development and in the zone with future tumor, compared with biopsy samples from patients with BPH or HGPIN. Expression of MMP-2, -9, -11, and -13, and TIMP-3 phenotypically defines ECs associated with future tumor development. Also, the results suggest that MMPs/TIMPs expression in biopsy tissues may reflect evolutionary changes from prostate benign tissues to PCa. Thus, these findings in combination with other parameters might contribute to improving the suspicion of PCa diagnosis.
Affiliation(s)
- Noemi Eiro
- Research Unit, Fundación Hospital de Jove, Avda. Eduardo Castro, 161, 33920 Gijón, Spain
| | - Antonio Medina
- Research Unit, Fundación Hospital de Jove, Avda. Eduardo Castro, 161, 33920 Gijón, Spain
| | - Luis O Gonzalez
- Research Unit, Fundación Hospital de Jove, Avda. Eduardo Castro, 161, 33920 Gijón, Spain
- Department of Anatomical Pathology, Fundación Hospital de Jove, Avda. Eduardo Castro, 161, 33920 Gijón, Spain
| | - Maria Fraile
- Research Unit, Fundación Hospital de Jove, Avda. Eduardo Castro, 161, 33920 Gijón, Spain
| | - Ana Palacios
- Research Unit, Fundación Hospital de Jove, Avda. Eduardo Castro, 161, 33920 Gijón, Spain
| | - Safwan Escaf
- Research Unit, Fundación Hospital de Jove, Avda. Eduardo Castro, 161, 33920 Gijón, Spain
| | - Jesús M Fernández-Gómez
- Department of Urology, Hospital Universitario Central de Asturias, Universidad de Oviedo, Avda. de Roma s/n, 33011 Oviedo, Spain
| | - Francisco J Vizoso
- Research Unit, Fundación Hospital de Jove, Avda. Eduardo Castro, 161, 33920 Gijón, Spain
|
196
|
Afrash MR, Mirbagheri E, Mashoufi M, Kazemi-Arpanahi H. Optimizing prognostic factors of five-year survival in gastric cancer patients using feature selection techniques with machine learning algorithms: a comparative study. BMC Med Inform Decis Mak 2023; 23:54. [PMID: 37024885 PMCID: PMC10080884 DOI: 10.1186/s12911-023-02154-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/05/2022] [Accepted: 03/15/2023] [Indexed: 04/08/2023] Open
Abstract
BACKGROUND Gastric cancer is the most common malignant tumor worldwide and a leading cause of cancer deaths. This neoplasm has a poor prognosis and heterogeneous outcomes. Survivability prediction may help select the best treatment plan based on an individual's prognosis. Numerous clinical and pathological features are generally used in predicting gastric cancer survival, and their influence on the survival of this cancer has not been fully elucidated. Moreover, the five-year survivability prognosis performances of feature selection methods with machine learning (ML) classifiers for gastric cancer have not been fully benchmarked. Therefore, we adopted several well-known feature selection methods and ML classifiers together to determine the best-paired feature selection-classifier for this purpose. METHODS This was a retrospective study on a dataset of 974 patients diagnosed with gastric cancer in the Ayatollah Talleghani Hospital, Abadan, Iran. First, four feature selection algorithms, including Relief, Boruta, least absolute shrinkage and selection operator (LASSO), and minimum redundancy maximum relevance (mRMR) were used to select a set of relevant features that are very informative for five-year survival prediction in gastric cancer patients. Then, each feature set was fed to three classifiers: XG Boost (XGB), hist gradient boosting (HGB), and support vector machine (SVM) to develop predictive models. Finally, paired feature selection-classifier methods were evaluated to select the best-paired method using the area under the curve (AUC), accuracy, sensitivity, specificity, and f1-score metrics. RESULTS The LASSO feature selection algorithm combined with the XG Boost classifier achieved an accuracy of 89.10%, a specificity of 87.15%, a sensitivity of 89.42%, an AUC of 89.37%, and an f1-score of 90.8%. 
Tumor stage, history of other cancers, lymphatic invasion, tumor site, type of treatment, body weight, histological type, and addiction were identified as the most significant factors affecting gastric cancer survival. CONCLUSIONS This study proved the worth of the paired feature selection-classifier to identify the best path that could augment the five-year survival prediction in gastric cancer patients. Our results were better than those of previous studies, both in terms of the time required to form the models and the performance measurement criteria of the algorithms. These findings may be very promising and can, therefore, inform clinical decision-making and shed light on future studies.
Affiliation(s)
- Mohammad Reza Afrash
- Department of Artificial Intelligence, Smart University of Medical Sciences, Tehran, Iran
| | - Esmat Mirbagheri
- Department of Health Information Management, School of Health Management and Information Sciences, Iran University of Medical Sciences, Tehran, Iran
| | - Mehrnaz Mashoufi
- Department of Health Information Management, Ardabil University of Medical Sciences, Ardabil, Iran
| | - Hadi Kazemi-Arpanahi
- Department of Health Information Technology, Abadan University of Medical Sciences, Abadan, Iran.
|
197
|
Guttà C, Morhard C, Rehm M. Applying a GAN-based classifier to improve transcriptome-based prognostication in breast cancer. PLoS Comput Biol 2023; 19:e1011035. [PMID: 37011102 PMCID: PMC10101642 DOI: 10.1371/journal.pcbi.1011035] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/12/2022] [Revised: 04/13/2023] [Accepted: 03/17/2023] [Indexed: 04/05/2023] Open
Abstract
Established prognostic tests based on limited numbers of transcripts can identify high-risk breast cancer patients yet are approved only for individuals presenting with specific clinical features or disease characteristics. Deep learning algorithms could hold potential for stratifying patient cohorts based on full transcriptome data, yet the development of robust classifiers is hampered by the number of variables in omics datasets typically far exceeding the number of patients. To overcome this hurdle, we propose a classifier based on a data augmentation pipeline consisting of a Wasserstein generative adversarial network (GAN) with gradient penalty and an embedded auxiliary classifier to obtain a trained GAN discriminator (T-GAN-D). Applied to 1244 patients of the METABRIC breast cancer cohort, this classifier outperformed established breast cancer biomarkers in separating low- from high-risk patients (disease specific death, progression or relapse within 10 years from initial diagnosis). Importantly, the T-GAN-D also performed across independent, merged transcriptome datasets (METABRIC and TCGA-BRCA cohorts), and merging data improved overall patient stratification. In conclusion, the reiterative GAN-based training process allowed generating a robust classifier capable of stratifying low- vs high-risk patients based on full transcriptome data and across independent and heterogeneous breast cancer cohorts.
Affiliation(s)
- Cristiano Guttà
- Institute of Cell Biology and Immunology, University of Stuttgart, Stuttgart, Germany
| | | | - Markus Rehm
- Institute of Cell Biology and Immunology, University of Stuttgart, Stuttgart, Germany
- Stuttgart Research Center Systems Biology, University of Stuttgart, Stuttgart, Germany
|
198
|
Mehrpour O, Nakhaee S, Saeedi F, Valizade B, Lotfi E, Nawaz MH. Utility of artificial intelligence to identify antihyperglycemic agents poisoning in the USA: introducing a practical web application using National Poison Data System (NPDS). Environ Sci Pollut Res Int 2023; 30:57801-57810. [PMID: 36973614 DOI: 10.1007/s11356-023-26605-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/14/2023] [Accepted: 03/18/2023] [Indexed: 05/10/2023]
Abstract
The clinical effects of poisoning by different antihyperglycemic agents may overlap, so distinguishing exposure to these drugs can be difficult. This study examined the application of machine learning techniques in identifying antihyperglycemic agent exposure using the national poisoning database in the USA. Data on single exposures to biguanides and sulfonylureas (n=6183) were requested from the National Poison Data System (NPDS) for 2014-2018. We tried five machine learning models (random forest classifier, k-nearest neighbors, XGBoost classifier, logistic regression, and a Keras neural network). For the multiclass classification modeling, we divided the dataset into two parts: train (75%) and test (25%). The performance metrics used were accuracy, specificity, precision, recall, and F1-score. The models classified the two antihyperglycemic agents very accurately, with an accuracy of 91-93%. The precision-recall curve showed average precision of 0.91, 0.97, 0.97, and 0.98 for k-nearest neighbors, logistic regression, random forest, and XGB, respectively. Logistic regression, random forest, and XGB had the highest AUC (AUC=0.97) for both the biguanides and sulfonylureas groups. The negative predictive values (NPV) for all the models were between 89 and 93%. We introduced a practical web application to help physicians distinguish between these agents. Despite variations in accuracy among the different types of algorithms used, all of them could accurately determine the specific exposure to biguanides and sulfonylureas retrospectively. Machine learning can distinguish antihyperglycemic agents, which may be useful for physicians without any background in medical toxicology. In addition, our suggested ML-based web application might help physicians in their diagnosis.
Affiliation(s)
- Omid Mehrpour
- AI and Health LLC, Tucson, AZ, USA.
- Rocky Mountain Poison & Drug Safety, Denver Health, and Hospital Authority, Denver, CO, USA.
| | - Samaneh Nakhaee
- Medical Toxicology and Drug Abuse Research Center (MTDRC), Birjand University of Medical Sciences (BUMS), Birjand, Iran
| | - Farhad Saeedi
- Medical Toxicology and Drug Abuse Research Center (MTDRC), Birjand University of Medical Sciences (BUMS), Birjand, Iran
- Student Research Committee, Birjand University of Medical Sciences, Birjand, Iran
| | - Bahare Valizade
- Medical Toxicology and Drug Abuse Research Center (MTDRC), Birjand University of Medical Sciences (BUMS), Birjand, Iran
| | - Erfan Lotfi
- Medical Toxicology and Drug Abuse Research Center (MTDRC), Birjand University of Medical Sciences (BUMS), Birjand, Iran
| | | |
|
199
|
Bigarré C, Bertucci F, Finetti P, Macgrogan G, Muracciole X, Benzekry S. Mechanistic modeling of metastatic relapse in early breast cancer to investigate the biological impact of prognostic biomarkers. Comput Methods Programs Biomed 2023; 231:107401. [PMID: 36804267 DOI: 10.1016/j.cmpb.2023.107401] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/18/2022] [Revised: 01/12/2023] [Accepted: 02/01/2023] [Indexed: 06/18/2023]
Abstract
BACKGROUND AND OBJECTIVE Estimating the risk of metastatic relapse is a major challenge in deciding adjuvant treatment options in early-stage breast cancer (eBC). To date, distant metastasis-free survival (DMFS) analysis mainly relies on classical, agnostic, statistical models (e.g., Cox regression). Instead, we propose here to derive mechanistic models of DMFS. METHODS The present series consisted of eBC patients who did not receive adjuvant systemic therapy from three datasets, composed respectively of 692 (Bergonié Institute), 591 (Paoli-Calmettes Institute, IPC), and 163 (Public Hospital Marseille, AP-HM) patients with routine clinical annotations. The last dataset also contained expression of three non-routine biomarkers. Our mechanistic model of DMFS relies on two mathematical parameters that represent growth (α) and dissemination (μ). We identified their population distributions using mixed-effects modeling. Critically, we propose a novel variable selection procedure that makes it possible to: (i) identify the association of biological parameters with either α, μ or both, and (ii) generate an optimal candidate model for DMFS prediction. RESULTS We found that Ki67 and Thymidine Kinase-1 were associated with α, and nodal status and Plasminogen Activator Inhibitor-1 with μ. The predictive performances of the model were excellent in calibration but moderate in discrimination, with c-indices of 0.72 (95% CI [0.48, 0.95], AP-HM), 0.63 ([0.44, 0.83], Bergonié) and 0.60 (95% CI [0.54, 0.80], IPC). CONCLUSIONS Overall, we demonstrate that our novel method combining mechanistic and advanced statistical modeling is able to unravel the biological roles of clinicopathological parameters from DMFS data.
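The discrimination figures quoted in the abstract above are c-indices. As a minimal sketch of what such a number measures, the code below computes a Harrell-style concordance index on toy data. It is an illustration only, not the authors' implementation: the function name, the toy values, and the simplification of skipping pairs with tied times are all assumptions made here.

```python
from itertools import combinations


def concordance_index(times, events, risks):
    """Share of comparable patient pairs in which the patient who relapsed
    earlier was assigned the higher predicted risk (ties in risk count 0.5)."""
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplification: pairs with tied times are skipped
        first, second = (i, j) if times[i] < times[j] else (j, i)
        if not events[first]:
            continue  # earlier time is censored, so the order is unknown
        comparable += 1
        if risks[first] > risks[second]:
            concordant += 1.0
        elif risks[first] == risks[second]:
            concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): follow-up time, relapse flag (1 = event), predicted risk.
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_risks = [0.9, 0.7, 0.5, 0.1]
print(concordance_index(toy_times, toy_events, toy_risks))
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which is why c-indices of 0.60 to 0.72 are described as moderate discrimination.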
Affiliation(s)
- Célestin Bigarré
- COMPO, Inria Méditerranée, Cancer Research Center of Marseille, Inserm UMR1068, CNRS UMR7258, Aix Marseille University UM105, 13385 Marseille, France.
| | - François Bertucci
- Predictive Oncology Laboratory, Marseille Cancer Research Centre (CRCM), Inserm U1068, CNRS UMR7258, Institut Paoli-Calmettes, Equipe labellisée Ligue Nationale Contre Le Cancer, Aix-Marseille University, Marseille, France; Department of Medical Oncology, CRCM, Institut Paoli-Calmettes, CNRS, Inserm, Aix-Marseille University, Marseille, France
| | - Pascal Finetti
- Predictive Oncology Laboratory, Marseille Cancer Research Centre (CRCM), Inserm U1068, CNRS UMR7258, Institut Paoli-Calmettes, Equipe labellisée Ligue Nationale Contre Le Cancer, Aix-Marseille University, Marseille, France
| | - Gaëtan Macgrogan
- Department of Biopathology, Institut Bergonié, Regional Comprehensive Cancer Centre, Bordeaux, France; Inserm U1218, Bordeaux Public Health, University of Bordeaux, Bordeaux, France
| | - Xavier Muracciole
- COMPO, Inria Méditerranée, Cancer Research Center of Marseille, Inserm UMR1068, CNRS UMR7258, Aix Marseille University UM105, 13385 Marseille, France; Radiotherapy Department, Assistance Publique - Hôpitaux de Marseille, Aix Marseille University, Marseille, France
| | - Sébastien Benzekry
- COMPO, Inria Méditerranée, Cancer Research Center of Marseille, Inserm UMR1068, CNRS UMR7258, Aix Marseille University UM105, 13385 Marseille, France
|
200
|
Kim Y, Yoon T, Park WB, Na S. Predicting mechanical properties of silk from its amino acid sequences via machine learning. J Mech Behav Biomed Mater 2023; 140:105739. [PMID: 36871478 DOI: 10.1016/j.jmbbm.2023.105739] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/20/2022] [Revised: 02/12/2023] [Accepted: 02/21/2023] [Indexed: 02/25/2023]
Abstract
The silk fiber is increasingly being sought for its superior mechanical properties, biocompatibility, and eco-friendliness, making it promising as a base material for various applications. One of the characteristics of protein fibers, such as silk, is that their mechanical properties are significantly dependent on the amino acid sequence. Numerous studies have been conducted to determine the specific relationship between the amino acid sequence of silk and its mechanical properties. Still, the relationship between the amino acid sequence of silk and its mechanical properties is yet to be clarified. Other fields have adopted machine learning (ML) to establish a relationship between the inputs, such as the ratio of different input material compositions and the resulting mechanical properties. We have proposed a method to convert the amino acid sequence into numerical values for input and succeeded in predicting the mechanical properties of silk from its amino acid sequences. Our study sheds light on predicting mechanical properties of silk fiber from respective amino acid sequences.
|