1
Novielli P, Romano D, Magarelli M, Bitonto PD, Diacono D, Chiatante A, Lopalco G, Sabella D, Venerito V, Filannino P, Bellotti R, De Angelis M, Iannone F, Tangaro S. Explainable artificial intelligence for microbiome data analysis in colorectal cancer biomarker identification. Front Microbiol 2024; 15:1348974. [PMID: 38426064 PMCID: PMC10901987 DOI: 10.3389/fmicb.2024.1348974] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/03/2023] [Accepted: 01/24/2024] [Indexed: 03/02/2024] Open
Abstract
Background Colorectal cancer (CRC) is a type of tumor caused by the uncontrolled growth of cells in the mucosa lining the last part of the intestine. Emerging evidence underscores an association between CRC and gut microbiome dysbiosis. The high mortality rate of this cancer has made it necessary to develop new early diagnostic methods. Machine learning (ML) techniques offer a way to evaluate the interaction between intestinal microbiota and host physiology. Through explainable artificial intelligence (XAI), it is possible to evaluate the individual contributions of microbial taxonomic markers for each subject. Our work also implements the SHapley Additive exPlanations (SHAP) algorithm to identify, for each subject, which parameters are important in the context of CRC. Results The proposed study aimed to implement an explainable artificial intelligence framework using both gut microbiota data and demographic information from subjects to distinguish control subjects from those with CRC. Our analysis revealed an association between gut microbiota and this disease. We compared three machine learning algorithms, and the Random Forest (RF) algorithm emerged as the best classifier, with a precision of 0.729 ± 0.038 and an area under the Precision-Recall curve of 0.668 ± 0.016. Additionally, SHAP analysis highlighted the most crucial variables in the model's decision-making, facilitating the identification of specific bacteria linked to CRC. Our results confirmed the role of certain bacteria, such as Fusobacterium, Peptostreptococcus, and Parvimonas, whose abundance appears notably associated with the disease, as well as bacteria whose presence is linked to a non-diseased state. Discussion These findings emphasize the potential of leveraging gut microbiota data within an explainable AI framework for CRC classification. The significant association observed aligns with existing knowledge. The precision exhibited by the RF algorithm reinforces its suitability for such classification tasks. The SHAP analysis not only enhanced interpretability but also identified specific bacteria crucial in CRC determination. This approach opens avenues for targeted interventions based on microbial signatures. Further exploration is warranted to deepen our understanding of the intricate interplay between microbiota and health, providing insights for refined diagnostic and therapeutic strategies.
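The per-subject attribution idea described in this abstract can be illustrated with a minimal sketch of exact Shapley values. This is not the authors' pipeline and does not use the SHAP library: the toy risk function, its coefficients, the feature names, and the all-zero baseline are hypothetical, chosen only to show how each feature's contribution to one subject's score is obtained by averaging its marginal effect over all feature orderings.

```python
from itertools import permutations


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample.

    Features not yet 'revealed' take their baseline value; each feature's
    attribution is its marginal contribution averaged over all orderings.
    Cost grows factorially with the number of features, so this is only
    practical for a handful of features.
    """
    n = len(x)
    phi = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)
        prev = predict(current)
        for i in order:
            current[i] = x[i]          # reveal feature i
            new = predict(current)
            phi[i] += new - prev       # marginal contribution of feature i
            prev = new
    return [p / len(orderings) for p in phi]


def toy_model(features):
    """Hypothetical risk score; coefficients are illustrative, not from the paper."""
    fuso, pepto, protective = features
    return 0.5 * fuso + 0.3 * pepto - 0.4 * protective + 0.2 * fuso * pepto


baseline = [0.0, 0.0, 0.0]   # assumed reference subject
subject = [1.0, 1.0, 1.0]    # assumed (scaled) abundances for one subject
phi = shapley_values(toy_model, subject, baseline)

for name, value in zip(["taxon_A", "taxon_B", "taxon_C"], phi):
    print(f"{name}: {value:+.3f}")
```

The attributions sum to the difference between the subject's score and the baseline score, which is the additivity property that makes per-subject explanations comparable across features. Tree-based explainers compute the same quantity far more efficiently for models such as Random Forest.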
Affiliation(s)
- Pierfrancesco Novielli
  - Dipartimento di Scienze del Suolo, della Pianta e degli Alimenti, Università degli Studi di Bari Aldo Moro, Bari, Italy
  - Istituto Nazionale di Fisica Nucleare, Sezione di Bari, Bari, Italy
- Donato Romano
  - Dipartimento di Scienze del Suolo, della Pianta e degli Alimenti, Università degli Studi di Bari Aldo Moro, Bari, Italy
  - Istituto Nazionale di Fisica Nucleare, Sezione di Bari, Bari, Italy
- Michele Magarelli
  - Dipartimento di Scienze del Suolo, della Pianta e degli Alimenti, Università degli Studi di Bari Aldo Moro, Bari, Italy
- Pierpaolo Di Bitonto
  - Dipartimento di Scienze del Suolo, della Pianta e degli Alimenti, Università degli Studi di Bari Aldo Moro, Bari, Italy
- Domenico Diacono
  - Istituto Nazionale di Fisica Nucleare, Sezione di Bari, Bari, Italy
- Annalisa Chiatante
  - Dipartimento di Scienze del Suolo, della Pianta e degli Alimenti, Università degli Studi di Bari Aldo Moro, Bari, Italy
- Giuseppe Lopalco
  - Dipartimento di Medicina di Precisione e Rigenerativa e Area Jonica, Università degli Studi di Bari Aldo Moro, Bari, Italy
- Daniele Sabella
  - Dipartimento di Medicina di Precisione e Rigenerativa e Area Jonica, Università degli Studi di Bari Aldo Moro, Bari, Italy
- Vincenzo Venerito
  - Dipartimento di Medicina di Precisione e Rigenerativa e Area Jonica, Università degli Studi di Bari Aldo Moro, Bari, Italy
- Pasquale Filannino
  - Dipartimento di Scienze del Suolo, della Pianta e degli Alimenti, Università degli Studi di Bari Aldo Moro, Bari, Italy
- Roberto Bellotti
  - Istituto Nazionale di Fisica Nucleare, Sezione di Bari, Bari, Italy
  - Dipartimento Interateneo di Fisica M. Merlin, Università degli Studi di Bari Aldo Moro, Bari, Italy
- Maria De Angelis
  - Dipartimento di Scienze del Suolo, della Pianta e degli Alimenti, Università degli Studi di Bari Aldo Moro, Bari, Italy
- Florenzo Iannone
  - Dipartimento di Medicina di Precisione e Rigenerativa e Area Jonica, Università degli Studi di Bari Aldo Moro, Bari, Italy
- Sabina Tangaro
  - Dipartimento di Scienze del Suolo, della Pianta e degli Alimenti, Università degli Studi di Bari Aldo Moro, Bari, Italy
  - Istituto Nazionale di Fisica Nucleare, Sezione di Bari, Bari, Italy
2
Huang X, Liu J, Huang W. Identification of S100A8 as a common diagnostic biomarkers and exploring potential pathogenesis for osteoarthritis and metabolic syndrome. Front Immunol 2023; 14:1185275. [PMID: 37497233 PMCID: PMC10366475 DOI: 10.3389/fimmu.2023.1185275] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/31/2023] [Accepted: 06/27/2023] [Indexed: 07/28/2023] Open
Abstract
Background Osteoarthritis (OA) is the most frequent musculoskeletal disease and the major contributor to disability worldwide. Metabolic syndrome (MetS) has been recognized as being associated with the pathogenesis of osteoarthritis. However, the exact mechanisms and links between the two are not clear. Methods We downloaded clinical information and gene expression profiles for OA and MetS from the Gene Expression Omnibus (GEO) database, and immune-related genes (IRGs) from the Immunology Database and Analysis Portal (IMMPORT). After screening for differentially expressed genes in OA (OA-DEGs) and MetS (MetS-DEGs), we identified the common immune hub gene by taking the genes shared by the OA-DEGs, MetS-DEGs and IRGs. We then conducted single-gene analysis of S100A8, assessed the correlation of S100A8 with immune cell infiltration, and verified the diagnostic value of S100A8 in OA and MetS datasets, respectively. Results A total of 323 OA-DEGs, 101 MetS-DEGs and one immune-related hub gene, S100A8, were identified. In single-gene analysis of S100A8 in OA samples, GSEA suggested that immune-related biological processes were more significantly enriched. Immune cell infiltration analysis showed that the enrichment fraction of M2 macrophages was significantly higher in the high S100A8-expressing group, and that the level of S100A8 expression was positively correlated with M2 macrophage infiltration. Dataset validation showed that S100A8 expression levels were significantly upregulated in the OA group and performed well in the diagnosis of OA. In single-gene analysis of S100A8 in MetS samples, immune cell infiltration analysis showed that monocyte infiltration was higher in the samples with high S100A8 expression and that there was a positive correlation between the two. Dataset validation showed that S100A8 is of high value for the diagnosis of MetS. In dataset validation for four metabolism-related diseases (obesity, diabetes, hypertension and hyperlipidaemia), S100A8 was expressed at higher levels in the disease group and also had a higher diagnostic value for each of the four diseases. Conclusion S100A8 is a common hub gene and diagnostic biomarker for OA and MetS, and the immune regulation involved in S100A8 may play a central role in the pathogenesis of OA and MetS.
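The overlap screening step described in this abstract amounts to a three-way set intersection. A minimal sketch follows, assuming each gene list is already available as a collection of gene symbols; apart from S100A8, every gene name below is a hypothetical placeholder, and the function name is illustrative rather than taken from the paper.

```python
def common_hub_genes(oa_deg, mets_deg, irg):
    """Return genes present in all three lists, sorted for stable output."""
    return sorted(set(oa_deg) & set(mets_deg) & set(irg))


# Placeholder gene lists; only S100A8 comes from the abstract.
oa_deg = {"S100A8", "GENE_A", "GENE_B", "GENE_C"}
mets_deg = {"S100A8", "GENE_B", "GENE_D"}
irg = {"S100A8", "GENE_C", "GENE_E"}

hub = common_hub_genes(oa_deg, mets_deg, irg)
print(hub)
```

In practice the two DEG lists would come from a differential expression analysis with its own fold-change and significance thresholds, which the abstract does not specify.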
Affiliation(s)
- Xu Huang
  - Department of Critical Care Medicine, The First Affiliated Hospital of Chongqing Medical University, Chongqing, China
- Jiacheng Liu
  - Department of Orthopedics, Orthopedic Laboratory of Chongqing Medical University, The First Affiliated Hospital of Chongqing Medical University, Chongqing, China
- Wei Huang
  - Department of Orthopedics, Orthopedic Laboratory of Chongqing Medical University, The First Affiliated Hospital of Chongqing Medical University, Chongqing, China
3
Machine learning applied to MRI evaluation for the detection of lymph node metastasis in patients with locally advanced cervical cancer treated with neoadjuvant chemotherapy. Arch Gynecol Obstet 2022; 307:1911-1919. [PMID: 36370209 DOI: 10.1007/s00404-022-06824-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/01/2022] [Accepted: 10/10/2022] [Indexed: 11/13/2022]
Abstract
PURPOSE Concurrent cisplatin-based chemotherapy and radiotherapy (CCRT) plus brachytherapy is the standard treatment for locally advanced cervical cancer (LACC). Platinum-based neoadjuvant chemotherapy (NACT) followed by radical hysterectomy is an alternative for patients with stage IB2-IIB disease. Correct pre-treatment staging is therefore essential to the proper management of this disease. Pelvic magnetic resonance imaging (MRI) is the gold standard examination, but studies of MRI accuracy in the detection of lymph node metastasis (LNM) in LACC patients show conflicting data. Machine learning (ML) is emerging as a promising tool for unraveling complex non-linear relationships between patient attributes that cannot be solved by traditional statistical methods. Here we investigated whether ML might improve the accuracy of MRI in the detection of LNM in LACC patients. METHODS We retrospectively analyzed LACC patients who underwent NACT and radical hysterectomy from 2015 to 2020. Demographic, clinical and MRI characteristics before and after NACT were collected, as well as information about post-surgery histopathology. Random features elimination wrapper was used to determine an attribute core set. An ML algorithm, Extreme Gradient Boosting (XGBoost), was trained and validated with tenfold cross-validation. The performance of the algorithm was then assessed. RESULTS Our analysis included 92 patients. FIGO stage was IB2 in 4/92 (4.3%), IB3 in 42/92 (45%), IIA1 in 1/92 (1.1%), IIA2 in 16/92 (17.4%) and IIB in 29/92 (31.5%). Although LNM was not detected at either pre-treatment or post-treatment MRI in any patient, it occurred in 16/92 (17%) patients. The attribute core set used to train ML algorithms included grading, histotypes, age, parity, largest diameter of lesion at pre- and post-treatment MRI, presence/absence of fornix infiltration at pre-treatment MRI and FIGO stage. XGBoost showed a good performance (accuracy 89%, precision 83%, recall 78%, AUROC 0.79). CONCLUSIONS We developed an accurate model to predict LNM in LACC patients undergoing NACT, based on an ML algorithm requiring few easy-to-collect attributes.
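The evaluation scheme named in this abstract (tenfold cross-validation scored by accuracy, precision and recall) can be sketched in plain Python. This is an illustration under stated assumptions, not the study's code: the fold assignment is a simple deterministic interleaving, the helper names are hypothetical, and the labels below are invented toy data rather than patient results.

```python
def k_fold_indices(n_samples, k=10):
    """Split sample indices 0..n_samples-1 into k near-equal test folds (interleaved)."""
    return [list(range(i, n_samples, k)) for i in range(k)]


def classification_metrics(y_true, y_pred):
    """Accuracy, precision and recall for binary labels (1 = positive, e.g. LNM)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Guard against division by zero when no positives are predicted/present.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# 92 matches the cohort size in the abstract; the fold layout is an assumption.
folds = k_fold_indices(92, k=10)
print([len(f) for f in folds])

# Toy labels for illustration only (not data from the study).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

With an imbalanced outcome such as the 16/92 LNM rate reported here, a stratified split that keeps the positive fraction similar in every fold is usually preferable to plain interleaving.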
4
The Use and Utility of Machine Learning in Achieving Precision Medicine in Systemic Sclerosis: A Narrative Review. J Pers Med 2022; 12:jpm12081198. [PMID: 35893293 PMCID: PMC9331823 DOI: 10.3390/jpm12081198] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/17/2022] [Revised: 07/18/2022] [Accepted: 07/19/2022] [Indexed: 11/17/2022] Open
Abstract
Background: Systemic sclerosis (SSc) is a rare connective tissue disease that can affect different organs and has extremely heterogeneous presentations. This complexity makes early diagnosis and subsequent subclassification of the disease difficult, which hinders a personalized approach in clinical practice. In this context, machine learning (ML), a branch of artificial intelligence (AI), is able to recognize relationships in data and predict outcomes. Methods: Here, we performed a narrative review of the application of ML in SSc to define the state of the art and evaluate its role in a precision medicine context. Results: To date, ML has been used to stratify SSc patients and identify those at high risk of severe complications. Additionally, ML may be useful in the early detection of organ involvement. Furthermore, ML might have a role in targeted therapy approaches and in predicting drug response. Conclusion: Available evidence about the utility of ML in SSc is sparse but promising. Future improvements in this field could result in a big step toward precision medicine. Further research is needed to define ML application in clinical practice.