1
Jing Z, Zheng W, Jianwen S, Hong S, Xiaojian Y, Qiang W, Yunfeng Y, Xinyue W, Shuwen H, Feimin Z. Gut microbes on the risk of advanced adenomas. BMC Microbiol 2024; 24:264. [PMID: 39026166 PMCID: PMC11256391 DOI: 10.1186/s12866-024-03416-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/07/2024] [Accepted: 07/08/2024] [Indexed: 07/20/2024] Open
Abstract
BACKGROUND More than 90% of colorectal cancer (CRC) arises from advanced adenomas (AA), and gut microbes are closely associated with the initiation and progression of both AA and CRC. OBJECTIVE To analyze the characteristic microbes in AA. METHODS Fecal samples were collected from 92 patients with AA and 184 negative controls (NC). The Illumina HiSeq X sequencing platform was used for high-throughput sequencing of microbial populations. The sequencing results were annotated and compared against the NCBI RefSeq database to identify the microbial characteristics of AA. The R vegan package was used to analyze α diversity and β diversity. α diversity was presented as box plots, and β diversity was assessed with principal component analysis (PCA), principal coordinates analysis (PCoA), and non-metric multidimensional scaling (NMDS). AA risk prediction models were constructed with six machine learning algorithms. In addition, unsupervised clustering was used to classify bacteria and viruses. Finally, the characteristics of bacteria and viruses in the different subtypes were analyzed. RESULTS The abundances of Prevotella sp900557255, Alistipes putredinis, and Megamonas funiformis were higher in AA, as were the abundances of Lilyvirus, Felixounavirus, and Drulisvirus. The CatBoost-based model for predicting the risk of AA had the highest accuracy (bacteria test set: 87.27%; virus test set: 83.33%). In addition, 4 subtypes (B1V1, B1V2, B2V1, and B2V2) were distinguished based on the abundance of gut bacteria and enteroviruses (EVs). Escherichia coli D, Prevotella sp900557255, CAG-180 sp000432435, Phocaeicola plebeius A, Teseptimavirus, Svunavirus, Felixounavirus, and Jiaodavirus were the characteristic bacteria and viruses of the 4 subtypes. The CatBoost model results indicated that prediction accuracy improved after incorporating subtypes: the accuracy on the discovery sets was 100%, 96.34%, 100%, and 98.46% for the 4 subtypes, respectively. CONCLUSION Prevotella sp900557255 and Felixounavirus have high value for the early warning of AA. As promising non-invasive biomarkers, gut microbes are potential diagnostic targets for AA, and the accuracy of AA prediction can be improved by subtyping.
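The α and β diversity analysis described in the Methods can be illustrated with a minimal sketch. The study used the R vegan package; the Python below is only a stand-in that computes the Shannon index (a common α-diversity measure) and the Bray-Curtis dissimilarity (a common input to PCoA and NMDS). The abstract does not say which indices were used, so both choices, the function names, and the toy counts are assumptions.

```python
import math

def shannon_index(counts):
    """Shannon diversity H' = -sum(p_i * ln p_i) over taxa with nonzero counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two abundance vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("abundance vectors must have the same length")
    numerator = sum(abs(x - y) for x, y in zip(a, b))
    denominator = sum(a) + sum(b)
    return numerator / denominator if denominator else 0.0

# Hypothetical taxon counts for two fecal samples (not data from the study).
sample_aa = [40, 30, 20, 10]
sample_nc = [25, 25, 25, 25]

print(f"Shannon (AA sample): {shannon_index(sample_aa):.3f}")
print(f"Shannon (NC sample): {shannon_index(sample_nc):.3f}")
print(f"Bray-Curtis: {bray_curtis(sample_aa, sample_nc):.3f}")
```

A perfectly even sample of four taxa reaches the maximum Shannon value of ln 4, which is why the uneven toy AA sample scores lower here; that says nothing about the real cohorts.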
Affiliation(s)
- Zhuang Jing
- Huzhou Central Hospital, Affiliated Central Hospital Huzhou University, Huzhou, Zhejiang Province, China
- Fifth School of Clinical Medicine of Zhejiang Chinese Medical University (Huzhou Central Hospital), Huzhou, Zhejiang Province, China
- Key Laboratory of Multiomics Research and Clinical Transformation of Digestive Cancer, Huzhou, Zhejiang Province, China
- Wu Zheng
- Huzhou Central Hospital, Affiliated Central Hospital Huzhou University, Huzhou, Zhejiang Province, China
- Fifth School of Clinical Medicine of Zhejiang Chinese Medical University (Huzhou Central Hospital), Huzhou, Zhejiang Province, China
- Key Laboratory of Multiomics Research and Clinical Transformation of Digestive Cancer, Huzhou, Zhejiang Province, China
- Song Jianwen
- Huzhou Central Hospital, Affiliated Central Hospital Huzhou University, Huzhou, Zhejiang Province, China
- Fifth School of Clinical Medicine of Zhejiang Chinese Medical University (Huzhou Central Hospital), Huzhou, Zhejiang Province, China
- Key Laboratory of Multiomics Research and Clinical Transformation of Digestive Cancer, Huzhou, Zhejiang Province, China
- Shen Hong
- Huzhou Central Hospital, Affiliated Central Hospital Huzhou University, Huzhou, Zhejiang Province, China
- Yu Xiaojian
- Huzhou Central Hospital, Affiliated Central Hospital Huzhou University, Huzhou, Zhejiang Province, China
- Fifth School of Clinical Medicine of Zhejiang Chinese Medical University (Huzhou Central Hospital), Huzhou, Zhejiang Province, China
- Key Laboratory of Multiomics Research and Clinical Transformation of Digestive Cancer, Huzhou, Zhejiang Province, China
- Wei Qiang
- Huzhou Central Hospital, Affiliated Central Hospital Huzhou University, Huzhou, Zhejiang Province, China
- Fifth School of Clinical Medicine of Zhejiang Chinese Medical University (Huzhou Central Hospital), Huzhou, Zhejiang Province, China
- Key Laboratory of Multiomics Research and Clinical Transformation of Digestive Cancer, Huzhou, Zhejiang Province, China
- Yin Yunfeng
- Huzhou Central Hospital, Affiliated Central Hospital Huzhou University, Huzhou, Zhejiang Province, China
- Fifth School of Clinical Medicine of Zhejiang Chinese Medical University (Huzhou Central Hospital), Huzhou, Zhejiang Province, China
- Key Laboratory of Multiomics Research and Clinical Transformation of Digestive Cancer, Huzhou, Zhejiang Province, China
- Wu Xinyue
- Huzhou Central Hospital, Affiliated Central Hospital Huzhou University, Huzhou, Zhejiang Province, China
- Fifth School of Clinical Medicine of Zhejiang Chinese Medical University (Huzhou Central Hospital), Huzhou, Zhejiang Province, China
- Key Laboratory of Multiomics Research and Clinical Transformation of Digestive Cancer, Huzhou, Zhejiang Province, China
- Han Shuwen
- Huzhou Central Hospital, Affiliated Central Hospital Huzhou University, Huzhou, Zhejiang Province, China.
- Fifth School of Clinical Medicine of Zhejiang Chinese Medical University (Huzhou Central Hospital), Huzhou, Zhejiang Province, China.
- Key Laboratory of Multiomics Research and Clinical Transformation of Digestive Cancer, Huzhou, Zhejiang Province, China.
- ICL, Junia, Université Catholique de Lille, Lille, France.
- Zhao Feimin
- Huzhou Central Hospital, Affiliated Central Hospital Huzhou University, Huzhou, Zhejiang Province, China.
- Fifth School of Clinical Medicine of Zhejiang Chinese Medical University (Huzhou Central Hospital), Huzhou, Zhejiang Province, China.
- Key Laboratory of Multiomics Research and Clinical Transformation of Digestive Cancer, Huzhou, Zhejiang Province, China.
2
Mao J, He Y, Chu J, Hu B, Yao Y, Yan Q, Han S. Analysis of clinical characteristics of mismatch repair status in colorectal cancer: a multicenter retrospective study. Int J Colorectal Dis 2024; 39:100. [PMID: 38967814 PMCID: PMC11226506 DOI: 10.1007/s00384-024-04674-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 06/26/2024] [Indexed: 07/06/2024]
Abstract
BACKGROUND Microsatellite instability (MSI) caused by DNA mismatch repair (MMR) deficiency is of great significance in the occurrence, diagnosis, and treatment of colorectal cancer (CRC). AIM This study aimed to analyze the relationship between mismatch repair status and the clinical characteristics of CRC. METHODS The histopathological results and clinical characteristics of 2029 patients with CRC who underwent surgery at two centers from 2018 to 2020 were reviewed. After the importance of the clinical characteristics was screened with machine learning algorithms, the patients were divided into deficient mismatch repair (dMMR) and proficient mismatch repair (pMMR) groups based on the immunohistochemistry results, and the clinical features of the two groups were compared statistically. RESULTS The dMMR and pMMR groups differed significantly in histologic type, TNM stage, maximum tumor diameter, lymph node metastasis, differentiation grade, gross appearance, and vascular invasion. The MLH1 groups differed significantly in age, histologic type, TNM stage, lymph node metastasis, tumor location, and depth of invasion. The MSH2 groups differed significantly in age. The MSH6 groups differed significantly in age, histologic type, and TNM stage. The PMS2 groups differed significantly in lymph node metastasis and tumor location. Combined loss of MLH1 and PMS2 expression was the dominant pattern in CRC (41.77%). MLH1 was positively correlated with MSH2, and MSH6 with PMS2. CONCLUSIONS The proportions of mucinous adenocarcinoma, protruding type, and poor differentiation are relatively high in dMMR CRCs, but lymph node metastasis is rare. Notably, the expression of MMR proteins has different prognostic significance at different stages of CRC.
Affiliation(s)
- Jing Mao
- Department of General Surgery, Affiliated Huzhou Hospital, Zhejiang University School of Medicine, No.1558, Sanhuan North Road, Wuxing District, Huzhou, Zhejiang, 313000, People's Republic of China
- Key Laboratory of Multiomics Research and Clinical Transformation of Digestive Cancer of Huzhou, No.1558, Sanhuan North Road, Wuxing District, Huzhou, Zhejiang, 313000, People's Republic of China
- Yang He
- Department of Oncology, The First Affiliated Hospital of Wannan Medical College, No. 92, Zheshan West Road, Jinghu District, Wuhu, Anhui, 241001, People's Republic of China
- Jian Chu
- Department of Gastroenterology, The Fifth Affiliated Clinical Medical College of Zhejiang Chinese Medical University, No.1558, Sanhuan North Road, Wuxing District, Huzhou, Zhejiang, 313000, People's Republic of China
- Key Laboratory of Multiomics Research and Clinical Transformation of Digestive Cancer of Huzhou, No.1558, Sanhuan North Road, Wuxing District, Huzhou, Zhejiang, 313000, People's Republic of China
- Boyang Hu
- Department of General Surgery, Affiliated Huzhou Hospital, Zhejiang University School of Medicine, No.1558, Sanhuan North Road, Wuxing District, Huzhou, Zhejiang, 313000, People's Republic of China
- Key Laboratory of Multiomics Research and Clinical Transformation of Digestive Cancer of Huzhou, No.1558, Sanhuan North Road, Wuxing District, Huzhou, Zhejiang, 313000, People's Republic of China
- Yanjun Yao
- Department of General Surgery, Affiliated Huzhou Hospital, Zhejiang University School of Medicine, No.1558, Sanhuan North Road, Wuxing District, Huzhou, Zhejiang, 313000, People's Republic of China
- Key Laboratory of Multiomics Research and Clinical Transformation of Digestive Cancer of Huzhou, No.1558, Sanhuan North Road, Wuxing District, Huzhou, Zhejiang, 313000, People's Republic of China
- Qiang Yan
- Department of General Surgery, Affiliated Huzhou Hospital, Zhejiang University School of Medicine, No.1558, Sanhuan North Road, Wuxing District, Huzhou, Zhejiang, 313000, People's Republic of China.
- Key Laboratory of Multiomics Research and Clinical Transformation of Digestive Cancer of Huzhou, No.1558, Sanhuan North Road, Wuxing District, Huzhou, Zhejiang, 313000, People's Republic of China.
- Shuwen Han
- Department of Oncology, Huzhou Central Hospital, Affiliated Central Hospital Huzhou University, No.1558, Sanhuan North Road, Wuxing District, Huzhou, Zhejiang, 313000, People's Republic of China.
- Key Laboratory of Multiomics Research and Clinical Transformation of Digestive Cancer of Huzhou, No.1558, Sanhuan North Road, Wuxing District, Huzhou, Zhejiang, 313000, People's Republic of China.
3
Chen Z, Luo H, Xu L, Yi Y. Machine learning model for predicting stroke recurrence in adult stroke patients with moyamoya disease and factors of stroke recurrence. Clin Neurol Neurosurg 2024; 242:108308. [PMID: 38733759 DOI: 10.1016/j.clineuro.2024.108308] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/08/2023] [Revised: 04/09/2024] [Accepted: 04/27/2024] [Indexed: 05/13/2024]
Abstract
OBJECTIVE This study aimed to build an effective machine learning model for predicting stroke recurrence in adult stroke patients with moyamoya disease (MMD) and to analyze the factors associated with stroke recurrence. METHODS The data for this retrospective study came from the database of the JiangXi Province Medical Big Data Engineering & Technology Research Center. Information on MMD patients admitted to the Second Affiliated Hospital of Nanchang University from January 1st, 2007 to December 31st, 2019 was acquired. A total of 661 patients admitted from January 1st, 2007 to February 28th, 2017 formed the training set, while the external validation set comprised 284 patients admitted from March 1st, 2017 to December 31st, 2019. First, the characteristics of all subjects were compared between the training set and the external validation set. Key variables were then selected using Lasso regression. Models for predicting stroke recurrence within 1, 2, and 3 years after the initial stroke were built with five different machine learning algorithms, and all models were externally validated and compared. Lastly, the CatBoost model, which performed best, was explained using SHapley Additive exPlanations (SHAP). RESULTS In total, 945 patients with MMD were included, and the recurrence rate of acute stroke within 1, 2, and 3 years after the initial stroke was 11.43% (108/945), 18.94% (179/945), and 23.17% (219/945), respectively. The CatBoost models showed the best prediction performance among all models; the area under the curve (AUC) for predicting stroke recurrence within 1, 2, and 3 years was 0.794 (0.787, 0.801), 0.813 (0.807, 0.818), and 0.789 (0.783, 0.795), respectively. According to the SHAP analysis, high Suzuki stage, young adulthood (aged 18-44 years), no surgical treatment, and the presence of an aneurysm were likely to be significantly correlated with stroke recurrence in adult stroke patients with MMD. CONCLUSION In adult stroke patients with MMD, the CatBoost model was effective in predicting stroke recurrence, yielding accurate and reliable predictions. High Suzuki stage, young adulthood (aged 18-44 years), no surgical treatment, and the presence of an aneurysm are likely to be significantly correlated with stroke recurrence in adult stroke patients with MMD.
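The cohort arithmetic reported in this abstract can be checked directly. The sketch below only re-derives the cohort total and the recurrence percentages from the counts given above; the variable names are illustrative and nothing here reproduces the study's modeling.

```python
# Counts taken from the abstract.
train_n = 661        # patients in the training set (2007 to Feb 2017)
validation_n = 284   # patients in the external validation set (Mar 2017 to 2019)
total_n = train_n + validation_n

# Patients with recurrent stroke within 1, 2, and 3 years of the initial stroke.
recurrences = {1: 108, 2: 179, 3: 219}

# Recurrence rate as a percentage of the whole cohort, rounded to two decimals.
rates = {year: round(100 * n / total_n, 2) for year, n in recurrences.items()}

print(f"cohort size: {total_n}")
for year, rate in rates.items():
    print(f"{year}-year recurrence: {rate:.2f}%")
```

The temporal split puts roughly 70% of patients (661 of 945) in the training set, and the later admissions serve as the external validation set.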
Affiliation(s)
- Zhongjun Chen
- Department of Neurology, the Second Affiliated Hospital of Nanchang University, Nanchang, JiangXi, China; Department of Neurology, ShangRao People's Hospital, ShangRao, JiangXi, China
- Haowen Luo
- Medical Big-Data Center, the Second Affiliated Hospital of Nanchang University, Nanchang, JiangXi, China
- Lijun Xu
- Department of Neurology, the Second Affiliated Hospital of Nanchang University, Nanchang, JiangXi, China
- Yingping Yi
- Medical Big-Data Center, the Second Affiliated Hospital of Nanchang University, Nanchang, JiangXi, China
4
Zhou Y, Qi T, Pan M, Tu J, Zhao X, Ge Q, Lu Z. Deep-Cloud: A Deep Neural Network-Based Approach for Analyzing Differentially Expressed Genes of RNA-seq Data. J Chem Inf Model 2024; 64:2302-2310. [PMID: 37682833 DOI: 10.1021/acs.jcim.3c00766] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/10/2023]
Abstract
At present, the analysis of differentially expressed genes (DEGs) in RNA-seq data is still in its infancy, with new approaches constantly being proposed. Using deep neural networks to explore gene expression information in RNA-seq data offers a novel possibility for the biomedical field. In this study, a novel approach based on a deep learning algorithm and a cloud model, named Deep-Cloud, was developed. Its main advantage is that it not only uses a convolutional neural network and long short-term memory to extract features from the original data and estimate gene expression from RNA-seq data, but also combines the statistical method of the cloud model to quantify uncertainty and carry out in-depth analysis of the DEGs between disease groups and control groups. Compared with traditional DEG analysis software, the Deep-Cloud model further improves the sensitivity and accuracy of obtaining DEGs from RNA-seq data. Overall, Deep-Cloud opens a new pathway for mining RNA-seq data in the biomedical field.
Affiliation(s)
- Ying Zhou
- State Key Laboratory of Bioelectronics, School of Biological Science & Medical Engineering, Southeast University, Nanjing 210096, China
- Ting Qi
- State Key Laboratory of Bioelectronics, School of Biological Science & Medical Engineering, Southeast University, Nanjing 210096, China
- Min Pan
- School of Medicine, Southeast University, Nanjing 210097, China
- Jing Tu
- State Key Laboratory of Bioelectronics, School of Biological Science & Medical Engineering, Southeast University, Nanjing 210096, China
- Xiangwei Zhao
- State Key Laboratory of Bioelectronics, School of Biological Science & Medical Engineering, Southeast University, Nanjing 210096, China
- Qinyu Ge
- State Key Laboratory of Bioelectronics, School of Biological Science & Medical Engineering, Southeast University, Nanjing 210096, China
- Zuhong Lu
- State Key Laboratory of Bioelectronics, School of Biological Science & Medical Engineering, Southeast University, Nanjing 210096, China
5
Padwal MK, Basu S, Basu B. Application of Machine Learning in Predicting Hepatic Metastasis or Primary Site in Gastroenteropancreatic Neuroendocrine Tumors. Curr Oncol 2023; 30:9244-9261. [PMID: 37887568 PMCID: PMC10605255 DOI: 10.3390/curroncol30100668] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/05/2023] [Revised: 10/16/2023] [Accepted: 10/16/2023] [Indexed: 10/28/2023] Open
Abstract
Gastroenteropancreatic neuroendocrine tumors (GEP-NETs) account for 80% of gastroenteropancreatic neuroendocrine neoplasms (GEP-NENs). GEP-NETs are well-differentiated tumors, highly heterogeneous in biology and origin, and are often diagnosed at the metastatic stage. Diagnosis is commonly through clinical symptoms, histopathology, and PET-CT imaging, while molecular markers for metastasis and the primary site are unknown. Here, we report the identification of multi-gene signatures for hepatic metastasis and primary sites through analyses on RNA-SEQ datasets of pancreatic and small intestinal NETs tissue samples. Relevant gene features, identified from the normalized RNA-SEQ data using the mRMRe algorithm, were used to develop seven Machine Learning models (LDA, RF, CART, k-NN, SVM, XGBOOST, GBM). Two multi-gene random forest (RF) models classified primary and metastatic samples with 100% accuracy in training and test cohorts and >90% accuracy in an independent validation cohort. Similarly, three multi-gene RF models identified the pancreas or small intestine as the primary site with 100% accuracy in training and test cohorts, and >95% accuracy in an independent cohort. Multi-label models for concurrent prediction of hepatic metastasis and primary site returned >98.42% and >87.42% accuracies on training and test cohorts, respectively. A robust molecular signature to predict liver metastasis or the primary site for GEP-NETs is reported for the first time and could complement the clinical management of GEP-NETs.
Affiliation(s)
- Mahesh Kumar Padwal
- Molecular Biology Division, Bhabha Atomic Research Centre, Mumbai 400085, India
- Homi Bhabha National Institute, Mumbai 400094, India
- Sandip Basu
- Homi Bhabha National Institute, Mumbai 400094, India
- Radiation Medicine Centre, Bhabha Atomic Research Centre, Tata Memorial Hospital Annexe, Mumbai 400012, India
- Bhakti Basu
- Molecular Biology Division, Bhabha Atomic Research Centre, Mumbai 400085, India
- Homi Bhabha National Institute, Mumbai 400094, India
6
Texture Feature-Based Machine Learning Classification on MRI Image for Sepsis-Associated Encephalopathy Detection: A Pilot Study. Computational and Mathematical Methods in Medicine 2023; 2023:6403556. [PMID: 36778786 PMCID: PMC9911249 DOI: 10.1155/2023/6403556] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/27/2022] [Revised: 12/21/2022] [Accepted: 12/26/2022] [Indexed: 02/05/2023]
Abstract
Objective The objective of this study was to assess the performance of combining MRI-based texture analysis with machine learning for differentiating sepsis-associated encephalopathy (SAE) from sepsis alone. Method Sixty-six MRI-T1WI images of an SAE patient and 125 images of patients with sepsis alone were collected. Frontal lobe, brain stem, hippocampus, and amygdala were selected as regions of interest (ROIs). 279 texture features of each ROI were obtained using MaZda software. After the dimension reduction, 30 highly discriminative features of each ROI were adopted to differentiate SAE from sepsis alone using the CatBoost model. Results The classification models of frontal, brain stem, hippocampus, and amygdala were constructed. The classification accuracy was above 0.83, and the area under the curve (AUC) exceeded 0.90 in the validation set. Conclusion The texture features differed between SAE patients and patients with sepsis alone in different anatomical locations, suggesting that MRI-based texture analysis with machine learning might be helpful in differentiating SAE from sepsis alone.
7
Prelaj A, Galli EG, Miskovic V, Pesenti M, Viscardi G, Pedica B, Mazzeo L, Bottiglieri A, Provenzano L, Spagnoletti A, Marinacci R, De Toma A, Proto C, Ferrara R, Brambilla M, Occhipinti M, Manglaviti S, Galli G, Signorelli D, Giani C, Beninato T, Pircher CC, Rametta A, Kosta S, Zanitti M, Di Mauro MR, Rinaldi A, Di Gregorio S, Antonia M, Garassino MC, de Braud FGM, Restelli M, Lo Russo G, Ganzinelli M, Trovò F, Pedrocchi ALG. Real-world data to build explainable trustworthy artificial intelligence models for prediction of immunotherapy efficacy in NSCLC patients. Front Oncol 2023; 12:1078822. [PMID: 36755856 PMCID: PMC9899835 DOI: 10.3389/fonc.2022.1078822] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/24/2022] [Accepted: 12/14/2022] [Indexed: 01/24/2023] Open
Abstract
Introduction Artificial Intelligence (AI) methods are being increasingly investigated as a means to generate predictive models applicable in clinical practice. In this study, we developed a model to predict the efficacy of immunotherapy (IO) in patients with advanced non-small cell lung cancer (NSCLC) using eXplainable AI (XAI) Machine Learning (ML) methods. Methods We prospectively collected real-world data from patients with advanced NSCLC receiving immune-checkpoint inhibitors (ICIs) either as a single agent or in combination with chemotherapy. With regard to six different outcomes - Disease Control Rate (DCR), Objective Response Rate (ORR), 6- and 24-month Overall Survival (OS6 and OS24), 3-month Progression-Free Survival (PFS3) and Time to Treatment Failure (TTF3) - we evaluated five different classification ML models: CatBoost (CB), Logistic Regression (LR), Neural Network (NN), Random Forest (RF) and Support Vector Machine (SVM). We used Shapley Additive Explanation (SHAP) values to explain model predictions. Results Of the 480 patients included in the study, 407 received immunotherapy alone and 73 received chemo-immunotherapy. Among the ML models, CB performed best for OS6 and TTF3 (accuracy 0.83 and 0.81, respectively). CB and LR reached accuracies of 0.75 and 0.73 for the outcome DCR. SHAP for CB showed that the feature that strongly influenced the model's predictions for all three outcomes was the Neutrophil to Lymphocyte Ratio (NLR). Performance Status (ECOG-PS) was an important feature for the outcomes OS6 and TTF3, while PD-L1, line of IO, and chemo-immunotherapy appeared to be more important in predicting DCR. Conclusions In this study we developed an ML algorithm based on real-world data, explained by SHAP techniques, that is able to accurately predict the efficacy of immunotherapy in sets of NSCLC patients.
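SHAP attributions are built on Shapley values, which split the difference between a model's prediction for one patient and its prediction for a baseline input among the features. As a minimal sketch, the code below computes exact Shapley values for a toy scoring function by enumerating every feature coalition. The function `toy_risk`, the feature names, and all numbers are hypothetical stand-ins; this is not the study's CatBoost model, and the SHAP library uses faster algorithms than brute-force enumeration.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, x, baseline):
    """Exact Shapley values of each feature in x relative to a baseline input.

    Features outside a coalition are replaced by their baseline value.
    """
    names = list(x)
    n = len(names)

    def value(coalition):
        mixed = {k: (x[k] if k in coalition else baseline[k]) for k in names}
        return predict(mixed)

    phi = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {name}) - value(s))
        phi[name] = total
    return phi

# Hypothetical toy score with an interaction term (NOT the study's model).
def toy_risk(f):
    return 0.1 * f["nlr"] + 0.2 * f["ecog_ps"] + 0.05 * f["nlr"] * f["ecog_ps"]

patient = {"nlr": 6.0, "ecog_ps": 2.0}    # made-up patient
baseline = {"nlr": 3.0, "ecog_ps": 1.0}   # made-up reference input

phi = shapley_values(toy_risk, patient, baseline)
for name, contribution in phi.items():
    print(f"{name}: {contribution:+.3f}")
```

The contributions always sum to the gap between the patient's score and the baseline score, which is the property that lets SHAP plots rank features by their influence on a prediction.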
Affiliation(s)
- Arsela Prelaj
- Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, Milan, Italy; Department of Electronics, Information and Bioengineering, Politecnico di Milano, Milan, Italy; *Correspondence: Arsela Prelaj
- Edoardo Gregorio Galli
- Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, Milan, Italy; Niguarda Cancer Center, Grande Ospedale Metropolitano Niguarda, Milan, Italy; Oncology Department, University of Milan, Milan, Italy
- Vanja Miskovic
- Department of Electronics, Information and Bioengineering, Politecnico di Milano, Milan, Italy
- Mattia Pesenti
- Department of Electronics, Information and Bioengineering, Politecnico di Milano, Milan, Italy
- Giuseppe Viscardi
- Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, Milan, Italy; Medical Oncology Unit, Department of Precision Medicine, University of Campania “Luigi Vanvitelli”, Naples, Italy
- Benedetta Pedica
- Department of Electronics, Information and Bioengineering, Politecnico di Milano, Milan, Italy
- Laura Mazzeo
- Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, Milan, Italy; Department of Electronics, Information and Bioengineering, Politecnico di Milano, Milan, Italy; Oncology Department, University of Milan, Milan, Italy
- Achille Bottiglieri
- Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, Milan, Italy; Oncology Department, University of Milan, Milan, Italy
- Leonardo Provenzano
- Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, Milan, Italy; Oncology Department, University of Milan, Milan, Italy
- Andrea Spagnoletti
- Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, Milan, Italy; Oncology Department, University of Milan, Milan, Italy
- Roberto Marinacci
- Department of Electronics, Information and Bioengineering, Politecnico di Milano, Milan, Italy
- Alessandro De Toma
- Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, Milan, Italy
- Claudia Proto
- Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, Milan, Italy
- Roberto Ferrara
- Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, Milan, Italy
- Marta Brambilla
- Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, Milan, Italy
- Mario Occhipinti
- Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, Milan, Italy
- Sara Manglaviti
- Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, Milan, Italy
- Giulia Galli
- Medical Oncology Unit, Policlinico San Matteo Fondazione IRCCS, Pavia, Italy
- Diego Signorelli
- Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, Milan, Italy; Niguarda Cancer Center, Grande Ospedale Metropolitano Niguarda, Milan, Italy
- Claudia Giani
- Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, Milan, Italy; Oncology Department, University of Milan, Milan, Italy
- Teresa Beninato
- Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, Milan, Italy; Oncology Department, University of Milan, Milan, Italy
- Chiara Carlotta Pircher
- Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, Milan, Italy; Oncology Department, University of Milan, Milan, Italy
- Alessandro Rametta
- Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, Milan, Italy; Oncology Department, University of Milan, Milan, Italy
- Sokol Kosta
- Department of Electronic System, Aalborg University, Copenhagen, Aalborg, Denmark
- Michele Zanitti
- Department of Electronic System, Aalborg University, Copenhagen, Aalborg, Denmark
- Maria Rosa Di Mauro
- Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, Milan, Italy
- Arturo Rinaldi
- Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, Milan, Italy
- Settimio Di Gregorio
- Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, Milan, Italy
- Martinetti Antonia
- Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, Milan, Italy
- Marina Chiara Garassino
- Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, Milan, Italy; Thoracic Oncology Program, Section of Hematology/Oncology, University of Chicago, Chicago, IL, United States
- Filippo G. M. de Braud
- Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, Milan, Italy; Oncology Department, University of Milan, Milan, Italy
- Marcello Restelli
- Department of Electronics, Information and Bioengineering, Politecnico di Milano, Milan, Italy
- Giuseppe Lo Russo
- Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, Milan, Italy
- Monica Ganzinelli
- Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, Milan, Italy
- Francesco Trovò
- Department of Electronics, Information and Bioengineering, Politecnico di Milano, Milan, Italy
8
Han S, Zhuang J, Pan Y, Wu W, Ding K. Different Characteristics in Gut Microbiome between Advanced Adenoma Patients and Colorectal Cancer Patients by Metagenomic Analysis. Microbiol Spectr 2022; 10:e0159322. [PMID: 36453905 PMCID: PMC9769752 DOI: 10.1128/spectrum.01593-22] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 05/01/2022] [Accepted: 10/25/2022] [Indexed: 12/03/2022] Open
Abstract
The occurrence and development of colorectal cancer (CRC) and advanced adenoma (AA) are closely related to the gut microbiome, and AA has a high rate of progression to CRC. Current studies have revealed that bacteriological analysis cannot distinguish CRC from AA. The objective was to explore microbial targets that could distinguish CRC from AA from a microecological perspective and to determine the best way to identify CRC based on fecal microbes. Metagenomic sequencing data were used to describe the gut microbiome profile and to analyze the differences in microbial abundance and microbial single nucleotide polymorphism (SNP) characteristics between AA and CRC patients. There were no significant differences in diversity between the two groups. The abundance of bacteria (e.g., Firmicutes, Clostridia, and Blautia), fungi (Hypocreales), archaea (Methanosarcina, Methanoculleus, and Methanolacinia), and viruses (Alphacoronavirus, Sinsheimervirus, and Gammaretrovirus) differed between AA and CRC patients. Multiple machine-learning algorithms were used to establish prediction models for distinguishing CRC from AA. The accuracy of the random forest (RF) model based on the gut microbiome was 86.54%, whereas the accuracy of the SNP-based model was 92.31%. In conclusion, microbial SNPs were the best means of identifying CRC; they were superior to the gut microbiome and could provide new targets for CRC screening. IMPORTANCE There are differences in characteristic microorganisms between AA and CRC. However, current studies have indicated that bacteriological analysis cannot distinguish CRC from AA, and thus, we wondered whether other targets in the gut microbiome could be used for this purpose. Intraindividual differences in gut microbiota SNPs were significantly smaller than interindividual differences. In addition, compared with intestinal microbes, SNPs were less affected by time and showed a degree of stability. Microbial SNPs were better than the gut microbiome for distinguishing CRC from AA. Therefore, screening characteristic microbial SNPs could provide a new research direction for distinguishing CRC from AA.
Affiliation(s)
- Shuwen Han
  - Department of Colorectal Surgery and Oncology, Key Laboratory of Cancer Prevention and Intervention, Ministry of Education, Zhejiang Provincial Clinical Research Center for Cancer, The Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China
  - Department of Medical Oncology, Huzhou Central Hospital, Huzhou, Zhejiang, China
- Jing Zhuang
  - Department of Medical Oncology, Huzhou Central Hospital, Huzhou, Zhejiang, China
- Yuefen Pan
  - Department of Medical Oncology, Huzhou Central Hospital, Huzhou, Zhejiang, China
- Wei Wu
  - Department of Medical Oncology, Huzhou Central Hospital, Huzhou, Zhejiang, China
- Kefeng Ding
  - Department of Colorectal Surgery and Oncology, Key Laboratory of Cancer Prevention and Intervention, Ministry of Education, Zhejiang Provincial Clinical Research Center for Cancer, The Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China
  - Cancer Center, Zhejiang University, Hangzhou, Zhejiang, China
|
9
|
Random Forest Estimation and Trend Analysis of PM2.5 Concentration over the Huaihai Economic Zone, China (2000–2020). Sustainability 2022. [DOI: 10.3390/su14148520] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 02/01/2023]
Abstract
Consisting of ten cities in four Chinese provinces, the Huaihai Economic Zone has suffered serious air pollution over the last two decades, particularly from fine particulate matter (PM2.5). In this study, we used multi-source data, namely MAIAC AOD (at a 1 km spatial resolution), meteorological, topographic, date, and location (latitude and longitude) data, to construct a random forest regression model that estimates the daily PM2.5 concentration over the Huaihai Economic Zone from 2000 to 2020. The variable expressing time (date) had the greatest feature importance when estimating PM2.5. By averaging the modeled daily PM2.5 concentration, we produced a yearly PM2.5 concentration dataset, at a 1 km resolution, for the study area from 2000 to 2020. On comparing modeled daily PM2.5 with observational data, the coefficient of determination (R²) of the modeling was 0.85, the root mean square error (RMSE) was 14.63 μg/m³, and the mean absolute error (MAE) was 10.03 μg/m³. The quality assessment of the synthesized yearly PM2.5 concentration dataset shows that R² = 0.77, RMSE = 6.92 μg/m³, and MAE = 5.42 μg/m³. Despite different trends from 2000–2010 and from 2010–2020, the trend of PM2.5 concentration over the Huaihai Economic Zone during the 21 years was, overall, decreasing. The area with a significantly decreasing trend was small and mainly concentrated in the lake areas of the Zone. It is concluded that PM2.5 can be well estimated from the MAIAC AOD dataset when spatiotemporal variability is incorporated using random forest, and that the resultant PM2.5 concentration data provide a basis for environmental monitoring over large geographic areas.
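The three error measures quoted in this abstract (R², RMSE, and MAE) can be illustrated with a minimal sketch. It assumes that R² is the coefficient of determination computed as 1 − SS_res/SS_tot; the abstract does not state which definition the authors used. The function name `regression_metrics` and the sample values are hypothetical and are not data from the study.

```python
import math


def regression_metrics(observed, predicted):
    """Return (R^2, RMSE, MAE) for paired observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(observed)
    mean_obs = sum(observed) / n
    residuals = [o - p for o, p in zip(observed, predicted)]
    ss_res = sum(r * r for r in residuals)               # residual sum of squares
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)  # total sum of squares
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")
    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(r) for r in residuals) / n
    return r2, rmse, mae


# Hypothetical daily PM2.5 values in ug/m3 (illustrative only).
observed = [35.0, 50.0, 42.0, 80.0, 61.0]
predicted = [38.0, 46.0, 45.0, 72.0, 64.0]
r2, rmse, mae = regression_metrics(observed, predicted)
print(f"R2={r2:.3f} RMSE={rmse:.2f} MAE={mae:.2f}")
```

RMSE penalizes large errors more heavily than MAE, which is consistent with the reported RMSE values being larger than the corresponding MAE values.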
|
10
|
Safaei N, Safaei B, Seyedekrami S, Talafidaryani M, Masoud A, Wang S, Li Q, Moqri M. E-CatBoost: An efficient machine learning framework for predicting ICU mortality using the eICU Collaborative Research Database. PLoS One 2022; 17:e0262895. [PMID: 35511882 PMCID: PMC9070907 DOI: 10.1371/journal.pone.0262895] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 01/24/2021] [Accepted: 01/09/2022] [Indexed: 11/19/2022] Open
Abstract
Improving the Intensive Care Unit (ICU) management network and building cost-effective and well-managed healthcare systems are high priorities for healthcare units. Creating accurate and explainable mortality prediction models helps identify the most critical risk factors in patients' survival/death status and detect the most in-need patients early. This study proposes a highly accurate and efficient machine learning model for predicting ICU mortality status upon discharge using the information available during the first 24 hours of admission. The most important features in mortality prediction are identified, and the effects of changing each feature on the prediction are studied. We used supervised machine learning models and illness severity scoring systems to benchmark the mortality prediction. We also implemented a combination of SHAP, LIME, partial dependence, and individual conditional expectation plots to explain the predictions made by the best-performing model (CatBoost). We proposed E-CatBoost, an optimized and efficient patient mortality prediction model, which can accurately predict the patients' discharge status using only ten input features. We used eICU-CRD v2.0 to train and validate the models; the dataset contains information on over 200,000 ICU admissions. The patients were divided into twelve disease groups, and models were fitted and tuned for each group. The models' predictive performance was evaluated using the area under the receiver operating characteristic curve (AUROC). The AUROC scores were 0.86 [std: 0.02] to 0.92 [std: 0.02] for CatBoost and 0.83 [std: 0.02] to 0.91 [std: 0.03] for E-CatBoost models across the defined disease groups; measured over the entire patient population, their AUROC scores were 7 to 18 and 2 to 12 percent higher than the baseline models, respectively. Based on SHAP explanations, we found age, heart rate, respiratory rate, blood urea nitrogen, and creatinine level to be the most critical cross-disease features in mortality predictions.
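The AUROC metric used in this abstract can be sketched as follows. This is a minimal illustration that computes AUROC as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. It is not the authors' evaluation code; the function name `auroc` and the labels and scores are hypothetical.

```python
def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    labels: iterable of 0/1 outcomes (1 = positive class, e.g. death).
    scores: predicted risk scores, higher meaning more likely positive.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0   # positive correctly ranked above negative
            elif p == q:
                wins += 0.5   # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))


# Hypothetical outcomes and predicted mortality risks (illustrative only).
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.3, 0.6, 0.7, 0.2, 0.8]
print(round(auroc(labels, scores), 3))
```

The pairwise loop is quadratic in the number of cases, so a rank-based or library implementation would be preferable for a dataset of the size described in the abstract.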
Affiliation(s)
- Nima Safaei
  - Department of Business Analytics and Information Systems, Tippie College of Business, University of Iowa, Iowa City, IA, United States of America
- Babak Safaei
  - Civil and Environmental Engineering Department, Michigan State University, East Lansing, MI, United States of America
- Seyedhouman Seyedekrami
  - Department of Computer Science and Engineering, University of Nevada, Reno, NV, United States of America
- Arezoo Masoud
  - Department of Business Analytics and Information Systems, Tippie College of Business, University of Iowa, Iowa City, IA, United States of America
- Shaodong Wang
  - Department of Industrial and Manufacturing Systems Engineering, Iowa State University, Ames, IA, United States of America
- Qing Li
  - Department of Industrial and Manufacturing Systems Engineering, Iowa State University, Ames, IA, United States of America
- Mahdi Moqri
  - Department of Information Systems and Business Analytics, Ivy College of Business, Iowa State University, Ames, IA, United States of America
|
11
|
Pang X, Yan R, Li L, Wang P, Zhang Y, Liu Y, Liu P, Dong W, Miao P, Mei Q. Non-doped and non-modified carbon dots with high quantum yield for the chemosensing of uric acid and living cell imaging. Anal Chim Acta 2022; 1199:339571. [DOI: 10.1016/j.aca.2022.339571] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 10/20/2021] [Revised: 02/01/2022] [Accepted: 02/01/2022] [Indexed: 01/13/2023]
|
12
|
Smith BJ, Silva-Costa LC, Martins-de-Souza D. Human disease biomarker panels through systems biology. Biophys Rev 2021; 13:1179-1190. [PMID: 35059036 PMCID: PMC8724340 DOI: 10.1007/s12551-021-00849-y] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 08/13/2021] [Accepted: 10/01/2021] [Indexed: 12/23/2022] Open
Abstract
As more uses for biomarkers are sought for an increasing number of disease targets, single-target biomarkers are slowly giving way to biomarker panels. These panels incorporate various sources of biomolecular and clinical data to provide greater robustness and power of separation for a clinical test. For multifactorial diseases such as psychiatric disorders, such panels show great potential for clinical use, assisting medical professionals during the analysis of risk and predisposition, disease diagnosis and prognosis, and treatment applicability and efficacy. More specific tests are also being developed to assist in ruling out, distinguishing between, and confirming suspicions of multifactorial diseases, as well as to predict which therapy may be the best option for a given patient's biochemical profile. As more complex datasets involving multi-omic approaches enter the field, systems biology has stepped in to facilitate the discovery and validation steps during biomarker panel generation. Filtering biomolecules and clinical data, pre-validating and cross-validating potential biomarkers, generating final biomarker panels, and testing the robustness and applicability of those panels are all beginning to rely on machine learning and systems biology, and research in this area will only benefit from advances in these approaches.
Affiliation(s)
- Bradley J. Smith
  - Laboratory of Neuroproteomics, Department of Biochemistry and Tissue Biology, Institute of Biology, University of Campinas (UNICAMP), Campinas, Brazil
- Licia C. Silva-Costa
  - Laboratory of Neuroproteomics, Department of Biochemistry and Tissue Biology, Institute of Biology, University of Campinas (UNICAMP), Campinas, Brazil
- Daniel Martins-de-Souza
  - Laboratory of Neuroproteomics, Department of Biochemistry and Tissue Biology, Institute of Biology, University of Campinas (UNICAMP), Campinas, Brazil
  - Instituto Nacional de Biomarcadores Em Neuropsiquiatria (INBION), Conselho Nacional de Desenvolvimento Científico E Tecnológico, Sao Paulo, Brazil
  - Experimental Medicine Research Cluster (EMRC), University of Campinas, Campinas, Brazil
|
13
|
A Review on Recent Progress in Machine Learning and Deep Learning Methods for Cancer Classification on Gene Expression Data. Processes (Basel) 2021. [DOI: 10.3390/pr9081466] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Indexed: 12/14/2022] Open
Abstract
Data-driven models with predictive ability are important in medicine and healthcare. However, the most challenging task in predictive modeling is constructing the prediction model, which can be addressed using machine learning (ML) methods. These methods learn and train the model using a gene expression dataset without being programmed explicitly. Due to the vast amount of gene expression data, this task becomes complex and time-consuming. This paper reviews recent progress in ML and deep learning (DL) for cancer classification, which has received increasing attention in bioinformatics and computational biology. The review focuses mainly on the development of cancer classification methods based on ML and DL. Although many methods have been applied to the cancer classification problem, recent progress shows that most of the successful techniques are those based on supervised and DL methods. In addition, the sources of the healthcare datasets are described. The development of many machine learning methods for insight analysis in cancer classification has brought considerable improvement in healthcare. Currently, there appears to be high demand for further development of efficient classification methods to address the expansion of healthcare applications.
|