1
Alelyani T, Alshammari MM, Almuhanna A, Asan O. Explainable Artificial Intelligence in Quantifying Breast Cancer Factors: Saudi Arabia Context. Healthcare (Basel) 2024; 12:1025. [PMID: 38786433 PMCID: PMC11120946 DOI: 10.3390/healthcare12101025] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/30/2024] [Revised: 05/11/2024] [Accepted: 05/13/2024] [Indexed: 05/25/2024] Open
Abstract
Breast cancer represents a significant health concern, particularly in Saudi Arabia, where it ranks as the most prevalent cancer type among women. This study focuses on leveraging eXplainable Artificial Intelligence (XAI) techniques to predict benign and malignant breast cancer cases using various clinical and pathological features specific to Saudi Arabian patients. Six distinct models were trained and evaluated based on common performance metrics such as accuracy, precision, recall, F1 score, and AUC-ROC score. To enhance interpretability, Local Interpretable Model-Agnostic Explanations (LIME) and SHapley Additive exPlanations (SHAP) were applied. The analysis identified the Random Forest model as the top performer, achieving an accuracy of 0.72, along with robust precision, recall, F1 score, and AUC-ROC score values. Conversely, the Support Vector Machine model exhibited the poorest performance metrics, indicating its limited predictive capability. Notably, the XAI approaches unveiled variations in the feature importance rankings across models, underscoring the need for further investigation. These findings offer valuable insights into breast cancer diagnosis and machine learning interpretation, aiding healthcare providers in understanding and potentially integrating such technologies into clinical practices.
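The performance metrics named in this abstract can be computed from predicted and true labels as in the minimal sketch below. It is an illustration only, not the authors' code: the function names `binary_metrics` and `auc_roc`, the label coding (1 = malignant, 0 = benign), and the toy data are all assumptions.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 for labels coded 1 (positive) / 0 (negative)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def auc_roc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC-ROC as the probability that a random positive outscores a random negative.

    Tied scores count as half a win (the Mann-Whitney formulation).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical, not from the study).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.7, 0.1]
metrics = binary_metrics(y_true, y_pred)
auc = auc_roc(y_true, scores)
```

Accuracy, precision, recall and F1 depend on a fixed decision threshold, while AUC-ROC summarizes ranking quality across all thresholds, which is one reason studies like this report both.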
Affiliation(s)
- Turki Alelyani
  - Department of Information Systems, College of Computer Science and Information Systems, Najran University, Najran 1988, Saudi Arabia
- Maha M. Alshammari
  - Department of Environmental Health, Institute for Research and Medical Consultations, Imam Abdulrahman Bin Faisal University, Dammam 31441, Saudi Arabia
- Afnan Almuhanna
  - Department of Radiology, College of Medicine, Imam Abdulrahman Bin Faisal University, Dammam 31441, Saudi Arabia
- Onur Asan
  - School of Systems and Enterprises, Stevens Institute of Technology, Hoboken, NJ 07030, USA

2
Huang W, Xu K, Liu Z, Wang Y, Chen Z, Gao Y, Peng R, Zhou Q. Circulating tumor DNA- and cancer tissue-based next-generation sequencing reveals comparable consistency in targeted gene mutations for advanced or metastatic non-small cell lung cancer. Chin Med J (Engl) 2024:00029330-990000000-01055. [PMID: 38711358 DOI: 10.1097/cm9.0000000000003117] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/15/2023] [Indexed: 05/08/2024] Open
Abstract
BACKGROUND: Molecular subtyping is an essential complement to pathological analysis for targeted therapy. This study aimed to investigate the consistency of next-generation sequencing (NGS) results between circulating tumor DNA (ctDNA)-based and tissue-based testing in non-small cell lung cancer (NSCLC) and to identify the patient characteristics that favor ctDNA testing. METHODS: Patients who were diagnosed with NSCLC and received both ctDNA- and cancer tissue-based NGS before surgery or systemic treatment at the Lung Cancer Center, Sichuan University West China Hospital, between December 2017 and August 2022 were enrolled. A 425-cancer panel with a HiSeq 4000 NGS platform was used for NGS. The unweighted Cohen's kappa coefficient was employed to discriminate the high-concordance group from the low-concordance group with a cutoff value of 0.6. Six machine learning models were used to identify patient characteristics that relate to high concordance between ctDNA-based and tissue-based NGS. RESULTS: A total of 85 patients were enrolled, of whom 22.4% (19/85) had stage III disease and 56.5% had stage IV disease. Forty-four patients (51.8%) showed consistent gene mutation types between ctDNA-based and tissue-based NGS, while one patient (1.2%) tested negative in both approaches. Advanced disease and metastases to other organs were suited to ctDNA-based NGS, and the generalized linear model showed that T stage, M stage, and tumor mutation burden were the critical discriminators for predicting the consistency of results between ctDNA-based and tissue-based NGS. CONCLUSION: ctDNA-based NGS showed detection performance comparable to tissue-based NGS for targeted gene mutations, and it could be considered in advanced or metastatic NSCLC.
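The unweighted Cohen's kappa used here to split patients into concordance groups can be sketched as follows. This is a hedged illustration, not the study's pipeline: it assumes each patient is represented by two equal-length lists of per-gene calls (1 = mutation detected, 0 = not detected), the names `cohen_kappa` and `concordance_group` are hypothetical, and treating a kappa exactly equal to the 0.6 cutoff as "high" is an assumption the abstract does not settle.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohen_kappa(a: Sequence[Hashable], b: Sequence[Hashable]) -> float:
    """Unweighted Cohen's kappa: (p_o - p_e) / (1 - p_e)."""
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(a)
    # Observed agreement: fraction of items on which both methods agree.
    p_o = sum(1 for x, y in zip(a, b) if x == y) / n
    # Chance agreement: product of the marginal category frequencies.
    count_a, count_b = Counter(a), Counter(b)
    p_e = sum(count_a[k] * count_b[k] for k in set(count_a) | set(count_b)) / (n * n)
    if p_e == 1.0:
        return 1.0  # both methods gave the same single category for every item
    return (p_o - p_e) / (1.0 - p_e)


def concordance_group(kappa: float, cutoff: float = 0.6) -> str:
    """Label a patient as high or low concordance (inclusive cutoff is an assumption)."""
    return "high" if kappa >= cutoff else "low"


# Hypothetical per-gene calls for one patient (not data from the study).
tissue_calls = [1, 1, 0, 0, 1, 0, 0, 0, 1, 0]
ctdna_calls = [1, 1, 0, 0, 0, 0, 0, 0, 1, 0]
kappa = cohen_kappa(tissue_calls, ctdna_calls)
group = concordance_group(kappa)
```

In this toy case the two methods agree on 9 of 10 genes (p_o = 0.9) with chance agreement p_e = 0.54, so kappa is 0.36 / 0.46, roughly 0.78.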
Affiliation(s)
- Weijia Huang
  - Lung Cancer Center/Lung Cancer Institute, West China Hospital, Sichuan University, Chengdu, Sichuan 610041, China
  - Department of Thoracic Surgery, West China Hospital, Sichuan University, Chengdu, Sichuan 610041, China
- Kai Xu
  - Lung Cancer Center/Lung Cancer Institute, West China Hospital, Sichuan University, Chengdu, Sichuan 610041, China
  - Department of Thoracic Surgery, West China Hospital, Sichuan University, Chengdu, Sichuan 610041, China
- Zhenkun Liu
  - Lung Cancer Center/Lung Cancer Institute, West China Hospital, Sichuan University, Chengdu, Sichuan 610041, China
  - Department of Thoracic Surgery, West China Hospital, Sichuan University, Chengdu, Sichuan 610041, China
- Yifeng Wang
  - Lung Cancer Center/Lung Cancer Institute, West China Hospital, Sichuan University, Chengdu, Sichuan 610041, China
  - Department of Thoracic Surgery, West China Hospital, Sichuan University, Chengdu, Sichuan 610041, China
- Zijia Chen
  - Lung Cancer Center/Lung Cancer Institute, West China Hospital, Sichuan University, Chengdu, Sichuan 610041, China
  - Department of Thoracic Surgery, West China Hospital, Sichuan University, Chengdu, Sichuan 610041, China
- Yanyun Gao
  - Department of General Thoracic Surgery, Inselspital, Bern University Hospital, University of Bern, Bern 3010, Switzerland
  - Department for BioMedical Research, University of Bern, Bern 3010, Switzerland
- Renwang Peng
  - Department of General Thoracic Surgery, Inselspital, Bern University Hospital, University of Bern, Bern 3010, Switzerland
  - Department for BioMedical Research, University of Bern, Bern 3010, Switzerland
- Qinghua Zhou
  - Lung Cancer Center/Lung Cancer Institute, West China Hospital, Sichuan University, Chengdu, Sichuan 610041, China
  - Department of Thoracic Surgery, West China Hospital, Sichuan University, Chengdu, Sichuan 610041, China

3
Mukherjee A, Abraham S, Singh A, Balaji S, Mukunthan KS. From Data to Cure: A Comprehensive Exploration of Multi-omics Data Analysis for Targeted Therapies. Mol Biotechnol 2024:10.1007/s12033-024-01133-6. [PMID: 38565775 DOI: 10.1007/s12033-024-01133-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/27/2023] [Accepted: 02/27/2024] [Indexed: 04/04/2024]
Abstract
In the dynamic landscape of targeted therapeutics, drug discovery has pivoted towards understanding underlying disease mechanisms, placing a strong emphasis on molecular perturbations and target identification. This paradigm shift, crucial for drug discovery, is underpinned by big data, a transformative force in the current era. Omics data, characterized by its heterogeneity and enormity, has ushered biological and biomedical research into the big data domain. Acknowledging the significance of integrating diverse omics data strata, known as multi-omics studies, researchers delve into the intricate interrelationships among various omics layers. This review navigates the expansive omics landscape, showcasing tailored assays for each molecular layer through genomes to metabolomes. The sheer volume of data generated necessitates sophisticated informatics techniques, with machine-learning (ML) algorithms emerging as robust tools. These datasets not only refine disease classification but also enhance diagnostics and foster the development of targeted therapeutic strategies. Through the integration of high-throughput data, the review focuses on targeting and modeling multiple disease-regulated networks, validating interactions with multiple targets, and enhancing therapeutic potential using network pharmacology approaches. Ultimately, this exploration aims to illuminate the transformative impact of multi-omics in the big data era, shaping the future of biological research.
Affiliation(s)
- Arnab Mukherjee
  - Department of Biotechnology, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, India
- Suzanna Abraham
  - Department of Biotechnology, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, India
- Akshita Singh
  - Department of Biotechnology, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, India
- S Balaji
  - Department of Biotechnology, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, India
- K S Mukunthan
  - Department of Biotechnology, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, India

4
Kim SY, Shin SY, Saeed M, Ryu JE, Kim JS, Ahn J, Jung Y, Moon JM, Choi CH, Choi HK. Prediction of Clinical Remission with Adalimumab Therapy in Patients with Ulcerative Colitis by Fourier Transform-Infrared Spectroscopy Coupled with Machine Learning Algorithms. Metabolites 2023; 14:2. [PMID: 38276292 PMCID: PMC10818421 DOI: 10.3390/metabo14010002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2023] [Revised: 12/06/2023] [Accepted: 12/12/2023] [Indexed: 01/27/2024] Open
Abstract
We aimed to develop prediction models for clinical remission associated with adalimumab treatment in patients with ulcerative colitis (UC) using Fourier transform-infrared (FT-IR) spectroscopy coupled with machine learning (ML) algorithms. This prospective, observational, multicenter study enrolled 62 UC patients and 30 healthy controls. The patients were treated with adalimumab for 56 weeks, and clinical remission was evaluated using the Mayo score. Baseline fecal samples were collected and analyzed using FT-IR spectroscopy. Various data preprocessing methods were applied, and prediction models were established by 10-fold cross-validation using various ML methods. Orthogonal partial least squares-discriminant analysis (OPLS-DA) showed a clear separation of healthy controls and UC patients, applying area normalization and Pareto scaling. OPLS-DA models predicting short- and long-term remission (8 and 56 weeks) yielded area-under-the-curve values of 0.76 and 0.75, respectively. Logistic regression and a nonlinear support vector machine were selected as the best prediction models for short- and long-term remission, respectively (accuracy of 0.99). In external validation, prediction models for short-term (logistic regression) and long-term (decision tree) remission performed well, with accuracy values of 0.73 and 0.82, respectively. This was the first study to develop prediction models for clinical remission associated with adalimumab treatment in UC patients by fecal analysis using FT-IR spectroscopy coupled with ML algorithms. Logistic regression, nonlinear support vector machines, and decision tree were suggested as the optimal prediction models for remission, and these were noninvasive, simple, inexpensive, and fast analyses that could be applied to personalized treatments.
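The two preprocessing steps this abstract names, area normalization and Pareto scaling, can be sketched as below. This is a minimal illustration under stated assumptions rather than the authors' procedure: spectra are plain lists of intensities, area is approximated by the sum of absolute intensities, the sample standard deviation is used, and the names `area_normalize` and `pareto_scale` are hypothetical.

```python
import math
from statistics import mean, stdev
from typing import List, Sequence


def area_normalize(spectrum: Sequence[float]) -> List[float]:
    """Divide each intensity by the total absolute area so every spectrum sums to 1."""
    total = sum(abs(v) for v in spectrum)
    if total == 0:
        raise ValueError("cannot normalize an all-zero spectrum")
    return [v / total for v in spectrum]


def pareto_scale(matrix: Sequence[Sequence[float]]) -> List[List[float]]:
    """Pareto scaling: mean-center each variable, then divide by sqrt(std dev).

    Rows are samples and columns are variables (e.g. wavenumbers).
    Constant columns are only centered, to avoid division by zero.
    """
    scaled_columns = []
    for column in zip(*matrix):
        center = mean(column)
        spread = stdev(column)  # sample standard deviation (an assumption)
        divisor = math.sqrt(spread) if spread > 0 else 1.0
        scaled_columns.append([(v - center) / divisor for v in column])
    # Transpose back to the samples-by-variables layout.
    return [list(row) for row in zip(*scaled_columns)]


# Toy inputs (hypothetical, not FT-IR data from the study).
normalized = area_normalize([1.0, 2.0, 1.0])
scaled = pareto_scale([[1.0, 10.0], [3.0, 10.0]])
```

Dividing by the square root of the standard deviation, rather than the standard deviation itself, shrinks large peaks less aggressively than unit-variance scaling, so intense bands keep more weight than noisy baseline regions.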
Affiliation(s)
- Seok-Young Kim
  - College of Pharmacy, Chung-Ang University, Seoul 06974, Republic of Korea
- Seung Yong Shin
  - Department of Internal Medicine, College of Medicine, Chung-Ang University, Seoul 06973, Republic of Korea
- Maham Saeed
  - College of Pharmacy, Chung-Ang University, Seoul 06974, Republic of Korea
- Ji Eun Ryu
  - College of Pharmacy, Chung-Ang University, Seoul 06974, Republic of Korea
- Jung-Seop Kim
  - College of Pharmacy, Chung-Ang University, Seoul 06974, Republic of Korea
- Junyoung Ahn
  - College of Pharmacy, Chung-Ang University, Seoul 06974, Republic of Korea
- Youngmi Jung
  - College of Pharmacy, Chung-Ang University, Seoul 06974, Republic of Korea
- Jung Min Moon
  - Department of Internal Medicine, College of Medicine, Chung-Ang University, Seoul 06973, Republic of Korea
- Chang Hwan Choi
  - Department of Internal Medicine, College of Medicine, Chung-Ang University, Seoul 06973, Republic of Korea
- Hyung-Kyoon Choi
  - College of Pharmacy, Chung-Ang University, Seoul 06974, Republic of Korea

5
Hu K, Guo Y, Li Y, Zhou S, Lu C, Cai C, Yang H, Li Y, Wang W. Identification and Validation of PTGS2 Gene as an Oxidative Stress-Related Biomarker for Arteriovenous Fistula Failure. Antioxidants (Basel) 2023; 13:5. [PMID: 38275625 PMCID: PMC10812504 DOI: 10.3390/antiox13010005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/02/2023] [Revised: 12/06/2023] [Accepted: 12/13/2023] [Indexed: 01/27/2024] Open
Abstract
(1) Background: Arteriovenous fistulas (AVFs) are the preferred site for hemodialysis. Unfortunately, approximately 60% of patients suffer from AVF failure within one year. Oxidative stress plays an important role in the occurrence and development of AVF. However, the underlying mechanisms remain unclear. Therefore, specific oxidative stress-related biomarkers are urgently needed for the diagnosis and treatment of AVF failure. (2) Methods: Bioinformatics analysis was carried out on dataset GSE119296 to screen for PTGS2 as a candidate gene related to oxidative stress and to verify the expression level and diagnostic efficacy of PTGS2 in clinical patients. The effects of NS398, a PTGS2 inhibitor, on hemodynamics, smooth muscle cell proliferation, migration, and oxidative stress were evaluated in a mouse AVF model. (3) Results: Based on 83 oxidative stress-related differentially expressed genes, we identified the important pathways related to oxidative stress. PTGS2 may have diagnostic and therapeutic efficacy for AVF failure. We further confirmed this finding using clinical specimens and validation datasets. The animal experiments illustrated that NS398 administration could reduce neointimal area (average decrease: 49%) and improve peak velocity (average increase: 53%). (4) Conclusions: Our study identified PTGS2 as an important oxidative stress-related biomarker for AVF failure. Targeting PTGS2 reduced oxidative stress and improved hemodynamics in an AVF mouse model.
Affiliation(s)
- Ke Hu
  - Department of Vascular Surgery, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430000, China
- Yi Guo
  - Department of Vascular Surgery, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430000, China
- Yuxuan Li
  - Department of Vascular Surgery, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430000, China
- Shunchang Zhou
  - Center of Experimental Animals, Huazhong University of Science and Technology, Wuhan 430000, China
- Chanjun Lu
  - Department of Vascular Surgery, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430000, China
- Chuanqi Cai
  - Department of Vascular Surgery, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430000, China
- Hongjun Yang
  - Key Laboratory of Green Processing and Functional New Textile Materials of Ministry of Education, Wuhan Textile University, Wuhan 430200, China
- Yiqing Li
  - Department of Vascular Surgery, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430000, China
- Weici Wang
  - Department of Vascular Surgery, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430000, China

6
Alvarez-Frutos L, Barriuso D, Duran M, Infante M, Kroemer G, Palacios-Ramirez R, Senovilla L. Multiomics insights on the onset, progression, and metastatic evolution of breast cancer. Front Oncol 2023; 13:1292046. [PMID: 38169859 PMCID: PMC10758476 DOI: 10.3389/fonc.2023.1292046] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/10/2023] [Accepted: 11/23/2023] [Indexed: 01/05/2024] Open
Abstract
Breast cancer is the most common malignant neoplasm in women. Despite progress to date, 700,000 women worldwide died of this disease in 2020. Apparently, the prognostic markers currently used in the clinic are not sufficient to determine the most appropriate treatment. For this reason, great efforts have been made in recent years to identify new molecular biomarkers that will allow more precise and personalized therapeutic decisions in both primary and recurrent breast cancers. These molecular biomarkers include genetic and post-transcriptional alterations, changes in protein expression, as well as metabolic, immunological or microbial changes identified by multiple omics technologies (e.g., genomics, epigenomics, transcriptomics, proteomics, glycomics, metabolomics, lipidomics, immunomics and microbiomics). This review summarizes studies based on omics analysis that have identified new biomarkers for diagnosis, patient stratification, differentiation between stages of tumor development (initiation, progression, and metastasis/recurrence), and their relevance for treatment selection. Furthermore, this review highlights the importance of clinical trials based on multiomics studies and the need to advance in this direction in order to establish personalized therapies and prolong disease-free survival of these patients in the future.
Affiliation(s)
- Lucia Alvarez-Frutos
  - Laboratory of Cell Stress and Immunosurveillance, Unidad de Excelencia Instituto de Biomedicina y Genética Molecular (IBGM), Universidad de Valladolid – Centro Superior de Investigaciones Cientificas (CSIC), Valladolid, Spain
- Daniel Barriuso
  - Laboratory of Cell Stress and Immunosurveillance, Unidad de Excelencia Instituto de Biomedicina y Genética Molecular (IBGM), Universidad de Valladolid – Centro Superior de Investigaciones Cientificas (CSIC), Valladolid, Spain
- Mercedes Duran
  - Laboratory of Molecular Genetics of Hereditary Cancer, Unidad de Excelencia Instituto de Biomedicina y Genética Molecular (IBGM), Universidad de Valladolid – Centro Superior de Investigaciones Cientificas (CSIC), Valladolid, Spain
- Mar Infante
  - Laboratory of Molecular Genetics of Hereditary Cancer, Unidad de Excelencia Instituto de Biomedicina y Genética Molecular (IBGM), Universidad de Valladolid – Centro Superior de Investigaciones Cientificas (CSIC), Valladolid, Spain
- Guido Kroemer
  - Centre de Recherche des Cordeliers, Equipe labellisée par la Ligue contre le cancer, Université Paris Cité, Sorbonne Université, Inserm U1138, Institut Universitaire de France, Paris, France
  - Metabolomics and Cell Biology Platforms, Institut Gustave Roussy, Villejuif, France
  - Department of Biology, Institut du Cancer Paris CARPEM, Hôpital Européen Georges Pompidou, Paris, France
- Roberto Palacios-Ramirez
  - Laboratory of Cell Stress and Immunosurveillance, Unidad de Excelencia Instituto de Biomedicina y Genética Molecular (IBGM), Universidad de Valladolid – Centro Superior de Investigaciones Cientificas (CSIC), Valladolid, Spain
- Laura Senovilla
  - Laboratory of Cell Stress and Immunosurveillance, Unidad de Excelencia Instituto de Biomedicina y Genética Molecular (IBGM), Universidad de Valladolid – Centro Superior de Investigaciones Cientificas (CSIC), Valladolid, Spain
  - Centre de Recherche des Cordeliers, Equipe labellisée par la Ligue contre le cancer, Université Paris Cité, Sorbonne Université, Inserm U1138, Institut Universitaire de France, Paris, France
  - Metabolomics and Cell Biology Platforms, Institut Gustave Roussy, Villejuif, France

7
Cui Y, Lu W, Xue J, Ge L, Yin X, Jian S, Li H, Zhu B, Dai Z, Shen Q. Machine learning-guided REIMS pattern recognition of non-dairy cream, milk fat cream and whipping cream for fraudulence identification. Food Chem 2023; 429:136986. [PMID: 37516053 DOI: 10.1016/j.foodchem.2023.136986] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/04/2022] [Revised: 07/02/2023] [Accepted: 07/22/2023] [Indexed: 07/31/2023]
Abstract
The illegal adulteration of milk fat cream with non-dairy cream during the manufacturing of baked goods has significantly hindered the robust growth of the dairy industry. In this study, a method based on rapid evaporative ionization mass spectrometry (REIMS) lipidomics pattern recognition integrated with machine learning algorithms was established. A total of 26 important ions were selected by multivariate statistical analysis as the salient features distinguishing milk fat cream from non-dairy cream. Furthermore, machine learning models using discriminant analysis, decision trees, support vector machines, and neural network classifiers were applied to classify non-dairy cream, milk fat cream, and milk fat cream adulterated with minute quantities of non-dairy cream. These approaches were enhanced through hyperparameter optimization and feature engineering, yielding accuracy rates of 98.4-99.6%. This artificial intelligence method of machine learning-guided REIMS pattern recognition can accurately identify adulteration of whipped cream and might help combat food fraud.
Affiliation(s)
- Yiwei Cui
  - Collaborative Innovation Center of Seafood Deep Processing, Institute of Seafood, Zhejiang Gongshang University, Hangzhou 310012, China
  - Zhejiang Province Joint Key Laboratory of Aquatic Products Processing, Institute of Seafood, Zhejiang Gongshang University, Hangzhou 310012, China
- Weibo Lu
  - Collaborative Innovation Center of Seafood Deep Processing, Institute of Seafood, Zhejiang Gongshang University, Hangzhou 310012, China
- Jing Xue
  - Collaborative Innovation Center of Seafood Deep Processing, Institute of Seafood, Zhejiang Gongshang University, Hangzhou 310012, China
  - Zhejiang Province Joint Key Laboratory of Aquatic Products Processing, Institute of Seafood, Zhejiang Gongshang University, Hangzhou 310012, China
- Lijun Ge
  - Collaborative Innovation Center of Seafood Deep Processing, Institute of Seafood, Zhejiang Gongshang University, Hangzhou 310012, China
- Xuelian Yin
  - Collaborative Innovation Center of Seafood Deep Processing, Institute of Seafood, Zhejiang Gongshang University, Hangzhou 310012, China
- Shikai Jian
  - Collaborative Innovation Center of Seafood Deep Processing, Institute of Seafood, Zhejiang Gongshang University, Hangzhou 310012, China
- Haihong Li
  - Hangzhou Linping District Maternal & Child Health Care Hospital, Hangzhou 311113, China
- Beiwei Zhu
  - National Engineering Research Center of Seafood, Collaborative Innovation Center of Provincial and Ministerial Co-Construction for Seafood Deep Processing, School of Food Science and Technology, Dalian Polytechnic University, Dalian 116034, China
- Zhiyuan Dai
  - Collaborative Innovation Center of Seafood Deep Processing, Institute of Seafood, Zhejiang Gongshang University, Hangzhou 310012, China
  - Zhejiang Province Joint Key Laboratory of Aquatic Products Processing, Institute of Seafood, Zhejiang Gongshang University, Hangzhou 310012, China
- Qing Shen
  - Department of Clinical Laboratory, The Quzhou Affiliated Hospital of Wenzhou Medical University, Quzhou People's Hospital, Quzhou 324000, China
  - Zhejiang Province Joint Key Laboratory of Aquatic Products Processing, Institute of Seafood, Zhejiang Gongshang University, Hangzhou 310012, China

8
Zhang C, Xu J, Tang R, Yang J, Wang W, Yu X, Shi S. Novel research and future prospects of artificial intelligence in cancer diagnosis and treatment. J Hematol Oncol 2023; 16:114. [PMID: 38012673 PMCID: PMC10680201 DOI: 10.1186/s13045-023-01514-5] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 10/07/2023] [Accepted: 11/20/2023] [Indexed: 11/29/2023] Open
Abstract
Research into the potential benefits of artificial intelligence for comprehending the intricate biology of cancer has grown as a result of the widespread use of deep learning and machine learning in the healthcare sector and the availability of highly specialized cancer datasets. Here, we review new artificial intelligence approaches and how they are being used in oncology. We describe how artificial intelligence might be used in the detection, prognosis, and administration of cancer treatments and introduce the use of the latest large language models such as ChatGPT in oncology clinics. We highlight artificial intelligence applications for omics data types, and we offer perspectives on how the various data types might be combined to create decision-support tools. We also evaluate the present constraints and challenges to applying artificial intelligence in precision oncology. Finally, we discuss how current challenges may be surmounted to make artificial intelligence useful in clinical settings in the future.
Affiliation(s)
- Chaoyi Zhang
  - Department of Pancreatic Surgery, Fudan University Shanghai Cancer Center, No. 270 Dong'An Road, Shanghai, 200032, People's Republic of China
  - Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, 200032, People's Republic of China
  - Shanghai Pancreatic Cancer Institute, No. 399 Lingling Road, Shanghai, 200032, People's Republic of China
  - Pancreatic Cancer Institute, Fudan University, Shanghai, 200032, People's Republic of China
- Jin Xu
  - Department of Pancreatic Surgery, Fudan University Shanghai Cancer Center, No. 270 Dong'An Road, Shanghai, 200032, People's Republic of China
  - Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, 200032, People's Republic of China
  - Shanghai Pancreatic Cancer Institute, No. 399 Lingling Road, Shanghai, 200032, People's Republic of China
  - Pancreatic Cancer Institute, Fudan University, Shanghai, 200032, People's Republic of China
- Rong Tang
  - Department of Pancreatic Surgery, Fudan University Shanghai Cancer Center, No. 270 Dong'An Road, Shanghai, 200032, People's Republic of China
  - Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, 200032, People's Republic of China
  - Shanghai Pancreatic Cancer Institute, No. 399 Lingling Road, Shanghai, 200032, People's Republic of China
  - Pancreatic Cancer Institute, Fudan University, Shanghai, 200032, People's Republic of China
- Jianhui Yang
  - Department of Pancreatic Surgery, Fudan University Shanghai Cancer Center, No. 270 Dong'An Road, Shanghai, 200032, People's Republic of China
  - Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, 200032, People's Republic of China
  - Shanghai Pancreatic Cancer Institute, No. 399 Lingling Road, Shanghai, 200032, People's Republic of China
  - Pancreatic Cancer Institute, Fudan University, Shanghai, 200032, People's Republic of China
- Wei Wang
  - Department of Pancreatic Surgery, Fudan University Shanghai Cancer Center, No. 270 Dong'An Road, Shanghai, 200032, People's Republic of China
  - Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, 200032, People's Republic of China
  - Shanghai Pancreatic Cancer Institute, No. 399 Lingling Road, Shanghai, 200032, People's Republic of China
  - Pancreatic Cancer Institute, Fudan University, Shanghai, 200032, People's Republic of China
- Xianjun Yu
  - Department of Pancreatic Surgery, Fudan University Shanghai Cancer Center, No. 270 Dong'An Road, Shanghai, 200032, People's Republic of China
  - Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, 200032, People's Republic of China
  - Shanghai Pancreatic Cancer Institute, No. 399 Lingling Road, Shanghai, 200032, People's Republic of China
  - Pancreatic Cancer Institute, Fudan University, Shanghai, 200032, People's Republic of China
- Si Shi
  - Department of Pancreatic Surgery, Fudan University Shanghai Cancer Center, No. 270 Dong'An Road, Shanghai, 200032, People's Republic of China
  - Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, 200032, People's Republic of China
  - Shanghai Pancreatic Cancer Institute, No. 399 Lingling Road, Shanghai, 200032, People's Republic of China
  - Pancreatic Cancer Institute, Fudan University, Shanghai, 200032, People's Republic of China

9
Wang K, Theeke LA, Liao C, Wang N, Lu Y, Xiao D, Xu C. Deep learning analysis of UPLC-MS/MS-based metabolomics data to predict Alzheimer's disease. J Neurol Sci 2023; 453:120812. [PMID: 37776718 DOI: 10.1016/j.jns.2023.120812] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/27/2023] [Revised: 08/22/2023] [Accepted: 09/14/2023] [Indexed: 10/02/2023]
Abstract
OBJECTIVE: Metabolic biomarkers can potentially inform disease progression in Alzheimer's disease (AD). The purpose of this study is to identify and describe a new set of diagnostic biomarkers for developing deep learning (DL) tools to predict AD using Ultra Performance Liquid Chromatography Mass Spectrometry (UPLC-MS/MS)-based metabolomics data. METHODS: A total of 177 individuals, including 78 with AD and 99 who were cognitively normal (CN), were selected from the Alzheimer's Disease Neuroimaging Initiative (ADNI) cohort along with 150 metabolomic biomarkers. We performed feature selection using the Least Absolute Shrinkage and Selection Operator (LASSO). The H2O DL function was used to build multilayer feedforward neural networks to predict AD. RESULTS: The LASSO selected 21 metabolic biomarkers. To develop DL models, the 21 biomarkers identified by LASSO were imported into the H2O package. The data were split into 70% for training and 30% for validation. The best DL model with two layers and 18 neurons achieved an accuracy of 0.881, F1-score of 0.892, and AUC of 0.873. Several metabolomic biomarkers involved in glucose and lipid metabolism, in particular bile acid metabolites, were associated with the APOE-ε4 allele and clinical biomarkers (Aβ42, tTau, pTau), cognitive assessments [the Alzheimer's Disease Assessment Scale-cognitive subscale 13 (ADAS13), the Mini-Mental State Examination (MMSE)], and hippocampus volume. CONCLUSIONS: This study identified a new set of diagnostic metabolomic biomarkers for developing DL tools to predict AD. These biomarkers may help with early diagnosis, prognostic risk stratification, and/or early treatment interventions for patients at risk for AD.
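How LASSO performs feature selection can be illustrated with a small coordinate-descent sketch. This is not the study's implementation: it assumes a squared-error objective (1/(2n))·||y − Xw||² + λ·||w||₁ on centered data with no intercept, whereas the abstract does not state the loss or software settings used, and the names `soft_threshold`, `lasso_coordinate_descent` and `selected_features` are hypothetical.

```python
from typing import List, Sequence


def soft_threshold(rho: float, lam: float) -> float:
    """Shrink rho toward zero by lam; values within [-lam, lam] become exactly 0."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0


def lasso_coordinate_descent(
    X: Sequence[Sequence[float]], y: Sequence[float], lam: float, n_iter: int = 200
) -> List[float]:
    """Minimize (1/(2n)) * ||y - Xw||^2 + lam * ||w||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # mean squared value of feature j
            for i in range(n):
                pred_without_j = sum(X[i][k] * w[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            rho /= n
            z /= n
            w[j] = soft_threshold(rho, lam) / z if z > 0 else 0.0
    return w


def selected_features(w: Sequence[float]) -> List[int]:
    """Indices of features whose coefficient was not shrunk to zero."""
    return [j for j, coef in enumerate(w) if coef != 0.0]


# Toy design with orthogonal columns (hypothetical data).
X = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]
y = [2.05, 1.95, -1.95, -2.05]  # equals 2.0 * column 0 + 0.05 * column 1
weights = lasso_coordinate_descent(X, y, lam=0.1)
kept = selected_features(weights)
```

The weak second feature is driven to exactly zero while the strong one is kept with a shrunken coefficient, which is the behavior that lets LASSO reduce a large biomarker panel to a smaller subset.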
Affiliation(s)
- Kesheng Wang
- School of Nursing, Health Sciences Center, West Virginia University, Morgantown, WV 26506, USA
- Laurie A Theeke
- School of Nursing, The George Washington University, Ashburn, VA 20147, USA
- Christopher Liao
- Department of Electrical and Computer Engineering, Boston University, MA 02215, USA
- Nianyang Wang
- Department of Health Policy and Management, School of Public Health, University of Maryland, College Park, MD 20742, USA
- Yongke Lu
- Department of Biomedical Sciences, Joan C. Edwards School of Medicine, Marshall University, Huntington, WV 25755, USA
- Danqing Xiao
- Department of STEM, School of Arts and Sciences, Regis College, Weston, MA 02493, USA
- Chun Xu
- Department of Health and Biomedical Sciences, College of Health Professions, University of Texas Rio Grande Valley, Brownsville, TX 78520, USA
10
Ren T, Yin N, Du L, Pan M, Ding L. Identification and validation of FPR1, FPR2, IL17RA and TLR7 as immunogenic cell death related genes in osteoarthritis. Sci Rep 2023; 13:16872. [PMID: 37803031 PMCID: PMC10558501 DOI: 10.1038/s41598-023-43440-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/03/2023] [Accepted: 09/24/2023] [Indexed: 10/08/2023] Open
Abstract
Immunogenic cell death (ICD) has gained increasing attention for its significant clinical efficacy in various diseases. Similarly, more and more attention has been paid to the role of immune factors in the pathological process of osteoarthritis (OA). The objective of this study is to reveal the relationship between ICD-related genes and the process of OA at the gene level through bioinformatics analysis. In this study, the limma R package was applied to identify differentially expressed genes (DEGs), and OA-related module genes were determined by weighted gene co-expression network analysis. The ICD-related genes were extracted from a previous study. The module genes related to DEGs and ICD were overlapped. Then, hub genes were identified by a series of analyses using the least absolute shrinkage and selection operator and random forest algorithms, and the expression level and diagnostic value of hub genes were evaluated by logistic regression. In addition, we used Spearman rank correlation analysis to clarify the relationship between hub genes and infiltrating immune cells and immune pathways. The expression levels of FPR1, FPR2, IL17RA, and TLR7 were verified in an SD rat knee joint model of OA by immunohistochemistry. The mRNA expression levels of FPR1, FPR2, IL17RA, and TLR7 were detected in IL-1β-induced rat chondrocytes by qPCR in vitro. Four hub genes (FPR1, FPR2, IL17RA, and TLR7) were ultimately identified as OA biomarkers associated with ICD. Knockdown of TLR7 reversed collagen II and ADAMTS-5 degradation in IL-1β-stimulated chondrocytes. This research may provide new immune-related biomarkers for the diagnosis of OA and serve as a reference for disease treatment monitoring.
Affiliation(s)
- Tingting Ren
- Department of Critical Care Medicine, Renji Hospital, School of Medicine, Shanghai Jiaotong University, Shanghai, 200127, China
- Nuo Yin
- Department of Orthopaedics, Shanghai Jiao Tong University Affiliated Sixth People's Hospital South Campus, Shanghai, 201400, China
- Li Du
- Department of Orthopaedics, Shanghai Jiao Tong University Affiliated Sixth People's Hospital South Campus, Shanghai, 201400, China
- Mingmang Pan
- Department of Orthopaedics, Shanghai Jiao Tong University Affiliated Sixth People's Hospital South Campus, Shanghai, 201400, China
- Liang Ding
- Department of Orthopaedics, Shanghai Jiao Tong University Affiliated Sixth People's Hospital South Campus, Shanghai, 201400, China
11
Bahado‐Singh RO, Turkoglu O, Aydas B, Vishweswaraiah S. Precision oncology: Artificial intelligence, circulating cell-free DNA, and the minimally invasive detection of pancreatic cancer-A pilot study. Cancer Med 2023; 12:19644-19655. [PMID: 37787018 PMCID: PMC10587955 DOI: 10.1002/cam4.6604] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/23/2023] [Revised: 09/18/2023] [Accepted: 09/20/2023] [Indexed: 10/04/2023] Open
Abstract
BACKGROUND Pancreatic cancer (PC) is among the most lethal cancers. The lack of effective tools for early detection results in late tumor detection and, consequently, a high mortality rate. Precision oncology aims to develop targeted individual treatments based on advanced computational approaches of omics data. Biomarkers, such as global alteration of cytosine (CpG) methylation, can be pivotal for these objectives. In this study, we performed DNA methylation profiling of pancreatic cancer patients using circulating cell-free DNA (cfDNA) and artificial intelligence (AI) including Deep Learning (DL) for minimally invasive detection to elucidate the epigenetic pathogenesis of PC. METHODS The Illumina Infinium HD Assay was used for genome-wide DNA methylation profiling of cfDNA in treatment-naïve patients. Six AI algorithms were used to determine PC detection accuracy based on cytosine (CpG) methylation markers. Additional strategies for minimizing overfitting were employed. The molecular pathogenesis was interrogated using enrichment analysis. RESULTS In total, we identified 4556 significantly differentially methylated CpGs (q-value < 0.05; Bonferroni correction) in PC versus controls. Highly accurate PC detection was achieved with all 6 AI platforms (Area under the receiver operator characteristics curve [0.90-1.00]). For example, DL achieved AUC (95% CI): 1.00 (0.95-1.00), with a sensitivity and specificity of 100%. A separate modeling approach based on logistic regression yielded an AUC (95% CI) of 1.0 (1.0-1.0) with a sensitivity and specificity of 100% for PC detection. The top four biological pathways that were epigenetically altered in PC and are known to be linked with cancer are discussed. CONCLUSION Using a minimally invasive approach, AI, and epigenetic analysis of circulating cfDNA, high predictive accuracy for PC was achieved. From a clinical perspective, our findings suggest that early detection leading to improved overall survival may be achievable in the future.
Affiliation(s)
- Ray O. Bahado‐Singh
- Department of Obstetrics and Gynecology, Corewell Health – William Beaumont University Hospital, Royal Oak, Michigan, USA
- Onur Turkoglu
- Department of Obstetrics and Gynecology, Corewell Health – William Beaumont University Hospital, Royal Oak, Michigan, USA
- Buket Aydas
- Department of Care Management Analytics, Blue Cross Blue Shield of Michigan, Detroit, Michigan, USA
12
Zhang Y, Lin X, Gao Z, Wang T, Dong K, Zhang J. An omics data analysis method based on feature linear relationship and graph convolutional network. J Biomed Inform 2023; 145:104479. [PMID: 37634557 DOI: 10.1016/j.jbi.2023.104479] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/27/2023] [Revised: 07/26/2023] [Accepted: 08/23/2023] [Indexed: 08/29/2023]
Abstract
Biological networks are known to be highly modular, and the dysfunction of network modules may cause diseases. Defining the key modules from the omics data and establishing the classification model is helpful in promoting the research of disease diagnosis and prognosis. However, for applying modules in downstream analysis such as disease states discrimination, most methods only utilize the node information, and ignore the node interactions or topological information, which may lead to false positives and limit the model performance. In this study, we propose an omics data analysis method based on feature linear relationship and graph convolutional network (LCNet). In LCNet, we adopt a way of applying the difference of feature linear relationships during disease development to characterize physiological and pathological changes and construct the differential linear relation network, which is simple and interpretable from the perspective of feature linear relationship. A greedy strategy is developed for searching the highly interactive modules with a strong discrimination ability. To fully utilize the information of the detected modules, the personalized sub-graphs for each sample based on the modules are defined, and the graph convolutional network (GCN) classifiers are trained to predict the sample labels. The experimental results on public datasets show the superiority of LCNet in classification performance. For Breast Cancer metabolic data, the identified metabolites by LCNet involve important pathways. Thus, LCNet can identify the module biomarkers by feature linear relationship and a greedy strategy, and label samples by personalized sub-graphs and GCN. It provides a new manner of utilizing node (molecule) information and topological information in the defined modules for better disease classification.
Affiliation(s)
- Yanhui Zhang
- School of Computer Science and Technology, Dalian University of Technology, Dalian 116024, Liaoning, China
- Xiaohui Lin
- School of Computer Science and Technology, Dalian University of Technology, Dalian 116024, Liaoning, China
- Zhenbo Gao
- School of Computer Science and Technology, Dalian University of Technology, Dalian 116024, Liaoning, China
- Tianxiang Wang
- School of Computer Science and Technology, Dalian University of Technology, Dalian 116024, Liaoning, China
- Kunjie Dong
- School of Computer Science and Technology, Dalian University of Technology, Dalian 116024, Liaoning, China
- Jianjun Zhang
- Cancer Hospital of Dalian University of Technology (Liaoning Cancer Hospital & Institute), Liaoning, China
13
Lu W, Sun C, Hou J. Predicting key gene related to immune infiltration and myofibroblast-like valve interstitial cells in patients with calcified aortic valve disease based on bioinformatics analysis. J Thorac Dis 2023; 15:3726-3740. [PMID: 37559614 PMCID: PMC10407485 DOI: 10.21037/jtd-23-72] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/14/2023] [Accepted: 06/09/2023] [Indexed: 08/11/2023]
Abstract
BACKGROUND Calcified aortic valve disease (CAVD) is the most prevalent valvular disease that can be treated only through valve replacement. We aimed to explore potential biomarkers and the role of immune cell infiltration in CAVD progression through bioinformatics analysis. METHODS Differentially expressed genes (DEGs) were screened out based on three microarray datasets: GSE12644, GSE51472 and GSE83453. Gene Ontology (GO) and Kyoto Encyclopaedia of Genes and Genomes (KEGG) pathway enrichment analyses were performed to evaluate gene expression differences. Machine learning algorithms and DEGs were used to screen for the key gene. We used CIBERSORT to evaluate the immune cell infiltration of CAVD and evaluated the correlation between the biomarkers and infiltrating immune cells. We also compared bioinformatics analysis results with the valve interstitial cells (VICs) gene expression in single-cell RNA sequencing. RESULTS Collagen triple helix repeat containing 1 (CTHRC1) was identified as the key gene of CAVD. We identified a cell subtype, valve interstitial cells-fibroblast, which was closely associated with the fibro-calcific progress of the aortic valve. CTHRC1 was highly expressed in the VIC subpopulation. Immune infiltration analysis demonstrated that mast cells, B cells, dendritic cells and eosinophils were involved in the pathogenesis of CAVD. Correlation analysis demonstrated that CTHRC1 was most strongly correlated with mast cells. CONCLUSIONS In summary, the study suggested that CTHRC1 was a key gene of CAVD and CTHRC1 might participate in the potential molecular pathways involved in the connection between infiltrating immune cells and myofibroblast phenotype VICs.
Affiliation(s)
- Wenyuan Lu
- Cardiac Surgery Centre, Fuwai Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Cheng Sun
- Cardiac Surgery Centre, Fuwai Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Jianfeng Hou
- Cardiac Surgery Centre, Fuwai Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
14
Sen P, Orešič M. Integrating Omics Data in Genome-Scale Metabolic Modeling: A Methodological Perspective for Precision Medicine. Metabolites 2023; 13:855. [PMID: 37512562 PMCID: PMC10383060 DOI: 10.3390/metabo13070855] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2023] [Revised: 07/11/2023] [Accepted: 07/17/2023] [Indexed: 07/30/2023] Open
Abstract
Recent advancements in omics technologies have generated a wealth of biological data. Integrating these data within mathematical models is essential to fully leverage their potential. Genome-scale metabolic models (GEMs) provide a robust framework for studying complex biological systems. GEMs have significantly contributed to our understanding of human metabolism, including the intrinsic relationship between the gut microbiome and the host metabolism. In this review, we highlight the contributions of GEMs and discuss the critical challenges that must be overcome to ensure their reproducibility and enhance their prediction accuracy, particularly in the context of precision medicine. We also explore the role of machine learning in addressing these challenges within GEMs. The integration of omics data with GEMs has the potential to lead to new insights, and to advance our understanding of molecular mechanisms in human health and disease.
Affiliation(s)
- Partho Sen
- Turku Bioscience Centre, University of Turku and Åbo Akademi University, FI-20520 Turku, Finland
- School of Medical Sciences, Faculty of Medicine and Health, Örebro University, 702 81 Örebro, Sweden
- Matej Orešič
- Turku Bioscience Centre, University of Turku and Åbo Akademi University, FI-20520 Turku, Finland
- School of Medical Sciences, Faculty of Medicine and Health, Örebro University, 702 81 Örebro, Sweden
15
Godlewski A, Czajkowski M, Mojsak P, Pienkowski T, Gosk W, Lyson T, Mariak Z, Reszec J, Kondraciuk M, Kaminski K, Kretowski M, Moniuszko M, Kretowski A, Ciborowski M. A comparison of different machine-learning techniques for the selection of a panel of metabolites allowing early detection of brain tumors. Sci Rep 2023; 13:11044. [PMID: 37422554 PMCID: PMC10329700 DOI: 10.1038/s41598-023-38243-1] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 02/23/2023] [Accepted: 07/05/2023] [Indexed: 07/10/2023] Open
Abstract
Metabolomics combined with machine learning methods (MLMs) is a powerful tool for searching for novel diagnostic panels. This study was intended to use targeted plasma metabolomics and advanced MLMs to develop strategies for diagnosing brain tumors. Measurement of 188 metabolites was performed on plasma samples collected from 95 patients with gliomas (grade I-IV), 70 with meningioma, and 71 healthy individuals as a control group. Four predictive models to diagnose glioma were prepared using 10 MLMs and a conventional approach. Based on the cross-validation results of the created models, the F1-scores were calculated, and the obtained values were then compared. Subsequently, the best algorithm was applied to perform five comparisons involving gliomas, meningiomas, and controls. The best results were obtained using the newly developed hybrid evolutionary heterogeneous decision tree (EvoHDTree) algorithm, which was validated using Leave-One-Out Cross-Validation, resulting in an F1-score for all comparisons in the range of 0.476-0.948 and the area under the ROC curves ranging from 0.660 to 0.873. Brain tumor diagnostic panels were constructed with unique metabolites, which reduces the likelihood of misdiagnosis. This study proposes a novel interdisciplinary method for brain tumor diagnosis based on metabolomics and EvoHDTree, exhibiting significant predictive coefficients.
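The evaluation scheme named in this abstract, Leave-One-Out Cross-Validation scored by F1, can be sketched in a few lines. This is not the EvoHDTree algorithm: a one-feature nearest-centroid classifier is swapped in purely as a stand-in, and the function names and the toy measurements below are hypothetical.

```python
def f1_score(y_true, y_pred, positive=1):
    """F1 = harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def nearest_centroid_predict(train_x, train_y, x):
    """Stand-in classifier (assumption): pick the class with the closest mean."""
    centroids = {}
    for label in set(train_y):
        values = [v for v, l in zip(train_x, train_y) if l == label]
        centroids[label] = sum(values) / len(values)
    return min(centroids, key=lambda label: abs(centroids[label] - x))


def leave_one_out_predictions(xs, ys):
    """Hold out each sample once, fit on the rest, predict the held-out one."""
    preds = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        preds.append(nearest_centroid_predict(train_x, train_y, xs[i]))
    return preds


# Hypothetical single-metabolite levels; 1 = tumor, 0 = control.
xs = [0.9, 1.1, 1.3, 2.0, 2.9, 3.1, 3.3, 2.2]
ys = [0, 0, 0, 0, 1, 1, 1, 1]
preds = leave_one_out_predictions(xs, ys)
score = f1_score(ys, preds)
```

On this toy data the two borderline samples (2.0 and 2.2) are misclassified when held out, giving precision and recall of 0.75 each and an F1-score of 0.75. Because every prediction is made by a model that never saw that sample, the score is less optimistic than one computed on the training data.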
Affiliation(s)
- Adrian Godlewski
- Clinical Research Centre, Medical University of Bialystok, M. Sklodowskiej-Curie 24a, 15-276, Białystok, Poland
- Marcin Czajkowski
- Faculty of Computer Science, Bialystok University of Technology, Białystok, Poland
- Patrycja Mojsak
- Clinical Research Centre, Medical University of Bialystok, M. Sklodowskiej-Curie 24a, 15-276, Białystok, Poland
- Tomasz Pienkowski
- Clinical Research Centre, Medical University of Bialystok, M. Sklodowskiej-Curie 24a, 15-276, Białystok, Poland
- Wioleta Gosk
- Clinical Research Centre, Medical University of Bialystok, M. Sklodowskiej-Curie 24a, 15-276, Białystok, Poland
- Tomasz Lyson
- Department of Neurosurgery, Medical University of Bialystok, Białystok, Poland
- Zenon Mariak
- Department of Neurosurgery, Medical University of Bialystok, Białystok, Poland
- Joanna Reszec
- Department of Medical Pathomorphology, Medical University of Bialystok, Białystok, Poland
- Marcin Kondraciuk
- Department of Population Medicine and Lifestyle Diseases Prevention, Medical University of Bialystok, Białystok, Poland
- Karol Kaminski
- Department of Population Medicine and Lifestyle Diseases Prevention, Medical University of Bialystok, Białystok, Poland
- Marek Kretowski
- Faculty of Computer Science, Bialystok University of Technology, Białystok, Poland
- Marcin Moniuszko
- Department of Regenerative Medicine and Immune Regulation, Medical University of Bialystok, Białystok, Poland
- Department of Allergology and Internal Medicine, Medical University of Bialystok, Białystok, Poland
- Adam Kretowski
- Clinical Research Centre, Medical University of Bialystok, M. Sklodowskiej-Curie 24a, 15-276, Białystok, Poland
- Department of Endocrinology, Diabetology and Internal Medicine, Medical University of Bialystok, Białystok, Poland
- Michal Ciborowski
- Clinical Research Centre, Medical University of Bialystok, M. Sklodowskiej-Curie 24a, 15-276, Białystok, Poland
16
Wu Z, Wang W, Zhang K, Fan M, Lin R. Epigenetic and Tumor Microenvironment for Prognosis of Patients with Gastric Cancer. Biomolecules 2023; 13:736. [PMID: 37238607 DOI: 10.3390/biom13050736] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2022] [Revised: 04/02/2023] [Accepted: 04/12/2023] [Indexed: 05/28/2023] Open
Abstract
BACKGROUND Epigenetics studies heritable or inheritable mechanisms that regulate gene expression rather than altering the DNA sequence. However, no research has investigated the link between TME-related genes (TRGs) and epigenetic-related genes (ERGs) in GC. METHODS A complete review of genomic data was performed to investigate the relationship between the epigenesis tumor microenvironment (TME) and machine learning algorithms in GC. RESULTS First, non-negative matrix factorization (NMF) clustering analysis of TME-related differentially expressed genes (DEGs) identified two clusters (C1 and C2). Then, Kaplan-Meier curves for overall survival (OS) and progression-free survival (PFS) rates suggested that cluster C1 predicted a poorer prognosis. The Cox-LASSO regression analysis identified eight hub genes (SRMS, MET, OLFML2B, KIF24, CLDN9, RNF43, NETO2, and PRSS21) to build the TRG prognostic model and nine hub genes (TMPO, SLC25A15, SCRG1, ISL1, SOD3, GAD1, LOXL4, AKR1C2, and MAGEA3) to build the ERG prognostic model. Additionally, the signature's area under curve (AUC) values, survival rates, C-index scores, and mean squared error (RMS) curves were evaluated against those of previously published signatures, which revealed that the signature identified in this study performed comparably. Meanwhile, based on the IMvigor210 cohort, a statistically significant difference in OS between immunotherapy and risk scores was observed. Subsequently, LASSO regression analysis identified 17 key DEGs and a support vector machine (SVM) model identified 40 significant DEGs; based on the Venn diagram, eight co-expression genes (ENPP6, VMP1, LY6E, SHISA6, TMEM158, SYT4, IL11, and KLK8) were discovered. CONCLUSION The study identified some hub genes that could be useful in predicting prognosis and management in GC.
Affiliation(s)
- Zenghong Wu
- Division of Gastroenterology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430074, China
- Weijun Wang
- Division of Gastroenterology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430074, China
- Kun Zhang
- Division of Gastroenterology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430074, China
- Mengke Fan
- Division of Gastroenterology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430074, China
- Rong Lin
- Division of Gastroenterology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430074, China
17
Dou J, Dawuti W, Zheng X, Zhu Y, Lin R, Lü G, Zhang Y. Rapid discrimination of Brucellosis in sheep using serum Fourier transform infrared spectroscopy combined with PCA-LDA algorithm. Photodiagnosis Photodyn Ther 2023; 42:103567. [PMID: 37084931 DOI: 10.1016/j.pdpdt.2023.103567] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/25/2023] [Revised: 04/09/2023] [Accepted: 04/11/2023] [Indexed: 04/23/2023]
Abstract
Brucellosis in sheep is an infectious disease caused by Brucella melitensis. The current conventional serological methods for screening Brucella-infected sheep are time-consuming and have low accuracy, so a simple, rapid and highly accurate screening method is needed. The aim of this study was to evaluate the feasibility of diagnosing Brucella-infected sheep from serum samples using Fourier transform infrared (FTIR) spectroscopy. In this study, FTIR spectroscopy of serum from Brucella-infected sheep (n=102) and healthy sheep (n=125) revealed abnormal protein and lipid metabolism in serum from Brucella-infected sheep compared to healthy sheep. The principal component analysis-linear discriminant analysis (PCA-LDA) method was used to differentiate the FTIR spectra of serum from Brucella-infected sheep and healthy sheep in the protein band (3700-3090 cm⁻¹) and lipid band (3000-2800 cm⁻¹), and its overall diagnostic accuracy was 100% (sensitivity 100%, specificity 100%). In conclusion, our results suggest that serum FTIR spectroscopy combined with the PCA-LDA algorithm has great potential for screening brucellosis in sheep.
Affiliation(s)
- Jingrui Dou
- School of Public Health, Xinjiang Medical University, Urumqi 830054, China; State Key Laboratory of Pathogenesis, Prevention, and Treatment of Central Asian High Incidence Diseases, Clinical Medical Research Institute, The First Affiliated Hospital of Xinjiang Medical University, Urumqi 830054, China
- Wubulitalifu Dawuti
- School of Public Health, Xinjiang Medical University, Urumqi 830054, China; State Key Laboratory of Pathogenesis, Prevention, and Treatment of Central Asian High Incidence Diseases, Clinical Medical Research Institute, The First Affiliated Hospital of Xinjiang Medical University, Urumqi 830054, China
- Xiangxiang Zheng
- School of Electronic Engineering, Beijing University of Posts and Telecommunications, Beijing 100876, China
- Yousen Zhu
- Clinical Laboratory, The First Affiliated Hospital of Xinjiang Medical University, Urumqi Xinjiang 830054, China
- Renyong Lin
- State Key Laboratory of Pathogenesis, Prevention, and Treatment of Central Asian High Incidence Diseases, Clinical Medical Research Institute, The First Affiliated Hospital of Xinjiang Medical University, Urumqi 830054, China
- Guodong Lü
- School of Public Health, Xinjiang Medical University, Urumqi 830054, China; State Key Laboratory of Pathogenesis, Prevention, and Treatment of Central Asian High Incidence Diseases, Clinical Medical Research Institute, The First Affiliated Hospital of Xinjiang Medical University, Urumqi 830054, China
- Yujiang Zhang
- School of Public Health, Xinjiang Medical University, Urumqi 830054, China; The Center for Disease Control and Prevention of Xinjiang Uygur Autonomous Region, Urumqi 830002, China
18
Liu W, Zhang L, Bao L, Shen G, Feng J. Accurate Classification and Prediction of Acute Myocardial Infarction through an ARMD Procedure. J Proteome Res 2023; 22:758-767. [PMID: 36710647 DOI: 10.1021/acs.jproteome.2c00488] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/31/2023]
Abstract
The risk stratification of acute myocardial infarction (AMI) patients is of prime importance for clinical management and prognosis assessment. Thus, we propose an ensemble machine learning analysis procedure named ADASYN-RFECV-MDA-DNN (ARMD) to address sample-unbalanced problems and enable stratification and prediction of AMI outcomes. The ARMD analysis procedure was applied to the NMR data of sera from 534 AMI-related subjects in four categories with an extremely imbalanced sample proportion. Firstly, the adaptive synthetic sampling (ADASYN) algorithm was used to address the issue of the original sample imbalance. Secondly, recursive feature elimination with cross-validation (RFECV) and the random forest mean decrease accuracy (RF-MDA) algorithm were applied to identify the differential metabolites corresponding to each AMI outcome. Finally, the deep neural network (DNN) was employed to classify and predict AMI events, and its performance was evaluated by comparing it with four traditional machine learning methods. Compared with the other four machine learning models, DNN presented consistent superiority in almost all of the model parameters including precision, F1-score, sensitivity, specificity, area under the receiver operating characteristic curve (AUC), and classification accuracy, highlighting the potential of deep learning in classification and stratification of clinical diseases. The ARMD analysis procedure was a practical analysis tool for supervised classification and regression modeling of clinical diseases.
Affiliation(s)
- Wuping Liu
- Department of Electronic Science, Fujian Provincial Key Laboratory of Plasma and Magnetic Resonance, Xiamen University, 422 Siming South Road, Siming District, Xiamen, Fujian 361005, China
- Lirong Zhang
- Department of Electronic Science, Fujian Provincial Key Laboratory of Plasma and Magnetic Resonance, Xiamen University, 422 Siming South Road, Siming District, Xiamen, Fujian 361005, China
- Lijun Bao
- Department of Electronic Science, Fujian Provincial Key Laboratory of Plasma and Magnetic Resonance, Xiamen University, 422 Siming South Road, Siming District, Xiamen, Fujian 361005, China
- Guiping Shen
- Department of Electronic Science, Fujian Provincial Key Laboratory of Plasma and Magnetic Resonance, Xiamen University, 422 Siming South Road, Siming District, Xiamen, Fujian 361005, China
- Jianghua Feng
- Department of Electronic Science, Fujian Provincial Key Laboratory of Plasma and Magnetic Resonance, Xiamen University, 422 Siming South Road, Siming District, Xiamen, Fujian 361005, China
19
Combining Machine Learning with Metabolomic and Embryologic Data Improves Embryo Implantation Prediction. Reprod Sci 2023; 30:984-994. [PMID: 36097248 PMCID: PMC10014658 DOI: 10.1007/s43032-022-01071-1] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 05/19/2022] [Accepted: 08/23/2022] [Indexed: 10/14/2022]
Abstract
This study investigated whether combining metabolomic and embryologic data with machine learning (ML) models improves the prediction of embryo implantation potential. In this prospective cohort study, infertile couples (n=56) undergoing day-5 single blastocyst transfer between February 2019 and August 2021 were included. After day-5 single blastocyst transfer, spent culture medium (SCM) was subjected to metabolite analysis using nuclear magnetic resonance (NMR) spectroscopy. Derived metabolite levels and embryologic parameters of the successfully implanted and failed groups were incorporated into ML models to explore their predictive potential regarding embryo implantation. The SCM of blastocysts that resulted in successful embryo implantation had significantly lower pyruvate (p<0.05) and threonine (p<0.05) levels compared to medium control but not compared to SCM related to embryos that failed to implant. Notably, the prediction accuracy increased when classical ML algorithms were combined with metabolomic and embryologic data. Specifically, the custom artificial neural network (ANN) model with regularized parameters for metabolomic data provided 100% accuracy, indicating its efficiency in predicting implantation potential. Hence, combining ML models (specifically, custom ANN) with metabolomic and embryologic data improves the prediction of embryo implantation potential. The approach could potentially be used to derive clinical benefits for patients in real time.
20
Chen ZA, Ma HH, Wang Y, Tian H, Mi JW, Yao DM, Yang CJ. Integrated multiple microarray studies by robust rank aggregation to identify immune-associated biomarkers in Crohn's disease based on three machine learning methods. Sci Rep 2023; 13:2694. [PMID: 36792688 PMCID: PMC9931764 DOI: 10.1038/s41598-022-26345-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/26/2022] [Accepted: 12/13/2022] [Indexed: 02/17/2023] Open
Abstract
Crohn's disease (CD) is a complex autoimmune disorder presumed to be driven by complex interactions of genetic, immune, microbial and even environmental factors. Intrinsic molecular mechanisms in CD, however, remain poorly understood. The identification of novel biomarkers in CD cases based on larger samples through machine learning approaches may inform the diagnosis and treatment of diseases. A comprehensive analysis was conducted on all CD datasets of Gene Expression Omnibus (GEO); our team then used the robust rank aggregation (RRA) method to identify differentially expressed genes (DEGs) between controls and CD patients. PPI (protein‒protein interaction) network and functional enrichment analyses were performed to investigate the potential functions of the DEGs, with molecular complex detection (MCODE) identifying some important functional modules from the PPI network. Three machine learning algorithms, support vector machine-recursive feature elimination (SVM-RFE), random forest (RF), and least absolute shrinkage and selection operator (LASSO), were applied to determine characteristic genes, which were verified by ROC curve analysis and immunohistochemistry (IHC) using clinical samples. Univariable and multivariable logistic regression were used to establish a machine learning score for diagnosis. Single-sample GSEA (ssGSEA) was performed to examine the correlation between immune infiltration and biomarkers. In total, 5 datasets met the inclusion criteria: GSE75214, GSE95095, GSE126124, GSE179285, and GSE186582. Based on RRA integrated analysis, 203 significant DEGs were identified (120 upregulated genes and 83 downregulated genes), and MCODE revealed some important functional modules in the PPI network. Machine learning identified LCN2, REG1A, AQP9, CCL2, GIP, PROK2, DEFA5, CXCL9, and NAMPT; AQP9, PROK2, LCN2, and NAMPT were further verified by ROC curves and IHC in the external cohort. 
The final machine learning score was defined as [Expression level of AQP9 × (2.644)] + [Expression level of LCN2 × (0.958)] + [Expression level of NAMPT × (1.115)]. ssGSEA showed markedly elevated levels of dendritic cells and innate immune cells, such as macrophages and NK cells, in CD, consistent with the gene enrichment results that the DEGs are mainly involved in the IL-17 signaling pathway and humoral immune response. The selected biomarkers analyzed by the RRA method and machine learning are highly reliable. These findings improve our understanding of the molecular mechanisms of CD pathogenesis.
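The score formula quoted in the abstract can be written out as a short sketch. The coefficients are those given above; the function name `ml_score` and the sample expression values are hypothetical, and the abstract gives neither an intercept nor a diagnostic cutoff, so none is assumed here.

```python
# Coefficients quoted in the abstract for the three genes in the score.
COEFFICIENTS = {"AQP9": 2.644, "LCN2": 0.958, "NAMPT": 1.115}


def ml_score(expression):
    """Return the weighted sum of expression levels for AQP9, LCN2 and NAMPT."""
    return sum(weight * expression[gene] for gene, weight in COEFFICIENTS.items())


# Hypothetical expression levels, for illustration only (not study data).
sample = {"AQP9": 1.0, "LCN2": 2.0, "NAMPT": 0.5}
score = ml_score(sample)
print(f"machine learning score: {score:.4f}")
```

Because all three coefficients are positive, higher expression of any of the three genes raises the score.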
Affiliation(s)
- Zi-An Chen
- Department of Gastroenterology, The Second Hospital of Hebei Medical University, Shijiazhuang, 050000, Hebei, China; Hebei Key Laboratory of Gastroenterology, Hebei Institute of Gastroenterology, Hebei Clinical Research Center for Digestive Disease, Shijiazhuang, 050000, Hebei, China
- Hui-hui Ma
- Department of Gastroenterology, The Second Hospital of Hebei Medical University, Shijiazhuang, 050000, Hebei, China; Hebei Key Laboratory of Gastroenterology, Hebei Institute of Gastroenterology, Hebei Clinical Research Center for Digestive Disease, Shijiazhuang, 050000, Hebei, China
- Yan Wang
- Department of Gastroenterology, The Second Hospital of Hebei Medical University, Shijiazhuang, 050000, Hebei, China; Hebei Key Laboratory of Gastroenterology, Hebei Institute of Gastroenterology, Hebei Clinical Research Center for Digestive Disease, Shijiazhuang, 050000, Hebei, China
- Hui Tian
- Department of Gastroenterology, The Second Hospital of Hebei Medical University, Shijiazhuang, 050000, Hebei, China; Hebei Key Laboratory of Gastroenterology, Hebei Institute of Gastroenterology, Hebei Clinical Research Center for Digestive Disease, Shijiazhuang, 050000, Hebei, China
- Jian-wei Mi
- Department of Gastroenterology, The Second Hospital of Hebei Medical University, Shijiazhuang, 050000, Hebei, China; Hebei Key Laboratory of Gastroenterology, Hebei Institute of Gastroenterology, Hebei Clinical Research Center for Digestive Disease, Shijiazhuang, 050000, Hebei, China
- Dong-Mei Yao
- Department of Gastroenterology, The Second Hospital of Hebei Medical University, Shijiazhuang, 050000, Hebei, China; Hebei Key Laboratory of Gastroenterology, Hebei Institute of Gastroenterology, Hebei Clinical Research Center for Digestive Disease, Shijiazhuang, 050000, Hebei, China
- Chuan-Jie Yang
- Department of Gastroenterology, The Second Hospital of Hebei Medical University, Shijiazhuang, 050000, Hebei, China; Hebei Key Laboratory of Gastroenterology, Hebei Institute of Gastroenterology, Hebei Clinical Research Center for Digestive Disease, Shijiazhuang, 050000, Hebei, China
21
Mathema VB, Sen P, Lamichhane S, Orešič M, Khoomrung S. Deep learning facilitates multi-data type analysis and predictive biomarker discovery in cancer precision medicine. Comput Struct Biotechnol J 2023; 21:1372-1382. [PMID: 36817954 PMCID: PMC9929204 DOI: 10.1016/j.csbj.2023.01.043] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 08/17/2022] [Revised: 01/28/2023] [Accepted: 01/29/2023] [Indexed: 02/02/2023] Open
Abstract
Cancer progression is linked to gene-environment interactions that alter cellular homeostasis. The use of biomarkers as early indicators of disease manifestation and progression can substantially improve diagnosis and treatment. Large omics datasets generated by high-throughput profiling technologies, such as microarrays, RNA sequencing, whole-genome shotgun sequencing, nuclear magnetic resonance, and mass spectrometry, have enabled data-driven biomarker discoveries. The identification of differentially expressed traits as molecular markers has traditionally relied on statistical techniques that are often limited to linear parametric modeling. The heterogeneity, epigenetic changes, and high degree of polymorphism observed in oncogenes demand biomarker-assisted personalized medication schemes. Deep learning (DL), a major subfield of machine learning (ML), has been increasingly utilized in recent years to investigate various diseases. The combination of ML/DL approaches for performance optimization across multi-omics datasets produces robust ensemble-learning prediction models, which are becoming useful in precision medicine. This review focuses on the recent development of ML/DL methods to provide integrative solutions in discovering cancer-related biomarkers, and their utilization in precision medicine.
Affiliation(s)
- Vivek Bhakta Mathema
- Metabolomics and Systems Biology, Department of Biochemistry, Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkok 10700, Thailand
- Siriraj Metabolomics and Phenomics Center, Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkok 10700, Thailand
- Partho Sen
- Turku Bioscience Centre, University of Turku and Åbo Akademi University, 20520 Turku, Finland
- School of Medical Sciences, Örebro University, 702 81 Örebro, Sweden
- Santosh Lamichhane
- Turku Bioscience Centre, University of Turku and Åbo Akademi University, 20520 Turku, Finland
- Matej Orešič
- Turku Bioscience Centre, University of Turku and Åbo Akademi University, 20520 Turku, Finland
- School of Medical Sciences, Örebro University, 702 81 Örebro, Sweden
- Sakda Khoomrung
- Metabolomics and Systems Biology, Department of Biochemistry, Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkok 10700, Thailand
- Siriraj Metabolomics and Phenomics Center, Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkok 10700, Thailand
- Center of Excellence for Innovation in Chemistry (PERCH-CIC), Faculty of Science, Mahidol University, Bangkok, Thailand
- Corresponding author at: Metabolomics and Systems Biology, Department of Biochemistry, Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkok 10700, Thailand.
22
Bian R, Xu X, Li W. Uncovering the molecular mechanisms between heart failure and end-stage renal disease via a bioinformatics study. Front Genet 2023; 13:1037520. [PMID: 36704339 PMCID: PMC9871391 DOI: 10.3389/fgene.2022.1037520] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 09/06/2022] [Accepted: 12/20/2022] [Indexed: 01/12/2023] Open
Abstract
Background: Heart failure (HF) is not only a common complication in patients with end-stage renal disease (ESRD) but also a major cause of death. Although clinical studies have shown a close relationship between the two, the underlying mechanism is unclear. The aim of this study is to explore the molecular mechanisms between HF and ESRD through comprehensive bioinformatics analysis, providing a new perspective on the crosstalk between these two diseases. Methods: The HF and ESRD datasets were downloaded from the Gene Expression Omnibus (GEO) database; we identified and analyzed common differentially expressed genes (DEGs). First, Gene Ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG), and gene set variation analyses (GSVA) were applied to explore the potential biological functions and construct protein-protein interaction (PPI) networks. Also, four algorithms, namely, random forest (RF), the Boruta algorithm, least absolute shrinkage and selection operator (LASSO), and support vector machine-recursive feature elimination (SVM-RFE), were used to identify the candidate genes. Subsequently, the diagnostic efficacy of hub genes for HF and ESRD was evaluated using the eXtreme Gradient Boosting (XGBoost) algorithm. CIBERSORT was used to analyze the infiltration of immune cells. Thereafter, we predicted target microRNAs (miRNAs) using databases (miRTarBase, TarBase, and ENCORI), and transcription factors (TFs) were identified using the ChEA3 database. Cytoscape software was applied to construct mRNA-miRNA-TF regulatory networks. Finally, the Drug Signatures Database (DSigDB) was used to identify potential drug candidates. Results: A total of 68 common DEGs were identified. The enrichment analysis results suggest that immune response and inflammatory factors may be common features of the pathophysiology of HF and ESRD. A total of four hub genes (BCL6, CCL5, CNN1, and PCNT) were validated using RF, LASSO, Boruta, and SVM-RFE algorithms. 
Their AUC values were all greater than 0.8. Immune infiltration analysis showed that immune cells such as macrophages, neutrophils, and NK cells were altered in HF myocardial tissue, while neutrophils were significantly correlated with all four hub genes. Finally, 11 target miRNAs and 10 TFs were obtained, and miRNA-mRNA-TF regulatory network construction was performed. In addition, 10 gene-targeted drugs were discovered. Conclusion: Our study revealed important crosstalk between HF and ESRD. These common pathways and pivotal genes may provide new ideas for further clinical treatment and experimental studies.
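One common way to combine several feature selection algorithms is to keep only the genes that every algorithm retains. The sketch below illustrates that idea under the assumption that the hub genes are the overlap of the four candidate lists; the abstract does not state the exact combination rule. The per-algorithm lists and the `GENE_*` placeholders are hypothetical, and only the four hub gene names come from the abstract.

```python
# Hypothetical candidate lists per algorithm; GENE_A/B/C are placeholders.
selected = {
    "RF": {"BCL6", "CCL5", "CNN1", "PCNT", "GENE_A"},
    "LASSO": {"BCL6", "CCL5", "CNN1", "PCNT", "GENE_B"},
    "Boruta": {"BCL6", "CCL5", "CNN1", "PCNT", "GENE_A", "GENE_C"},
    "SVM-RFE": {"BCL6", "CCL5", "CNN1", "PCNT"},
}


def consensus_genes(selections):
    """Return the genes chosen by every selector, sorted for stable output."""
    return sorted(set.intersection(*selections.values()))


hub_genes = consensus_genes(selected)
print(hub_genes)
```

Taking the intersection trades sensitivity for robustness: a gene survives only if no algorithm discards it.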
23
Cao J, Li J, Gu Z, Niu JJ, An GS, Jin QQ, Wang YY, Huang P, Sun JH. Combined metabolomics and machine learning algorithms to explore metabolic biomarkers for diagnosis of acute myocardial ischemia. Int J Legal Med 2023; 137:169-180. [PMID: 35348878 DOI: 10.1007/s00414-022-02816-y] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 02/08/2021] [Accepted: 03/15/2022] [Indexed: 01/10/2023]
Abstract
Acute myocardial ischemia (AMI) remains the leading cause of death worldwide, and the post-mortem diagnosis of AMI represents a current challenge for both clinical and forensic pathologists. In the present study, untargeted metabolomics based on ultra-performance liquid chromatography combined with high-resolution mass spectrometry was applied to analyze serum metabolic signatures from AMI in a rat model (n = 10 per group). A total of 28 endogenous metabolites in serum were significantly altered in the AMI group relative to the control and sham groups. A set of machine learning algorithms, namely gradient tree boosting (GTB), support vector machine (SVM), random forest (RF), logistic regression (LR), and multilayer perceptron (MLP) models, was used to screen the most valuable of the 28 metabolites and optimize the biomarker panel. The results showed that the classification accuracy and performance of the MLP model were better than those of the other algorithms when the panel consisted of L-threonic acid, N-acetyl-L-cysteine, CMPF, glycocholic acid, L-tyrosine, cholic acid, and glycoursodeoxycholic acid. Finally, 17 blood samples from autopsy cases were used to validate the classification model in human samples. The MLP model constructed from the rat dataset achieved an accuracy of 88.23% and an ROC value of 0.89 for predicting AMI type II in autopsy cases of sudden cardiac death. The results demonstrated that the MLP model based on 7 molecular biomarkers had good diagnostic performance for both AMI rats and autopsy-based blood samples. Thus, the combination of metabolomics and machine learning algorithms provides a novel strategy for AMI diagnosis.
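As a quick arithmetic check on the validation figures, the sketch below finds which number of correct calls out of 17 lies closest to the reported 88.23%. It assumes that accuracy was computed as correct predictions divided by the 17 autopsy samples, which the abstract implies but does not spell out.

```python
n_samples = 17      # autopsy blood samples, from the abstract
reported = 88.23    # reported accuracy in percent, from the abstract

# Accuracy (in percent) for every possible number of correct predictions.
candidates = {k: 100 * k / n_samples for k in range(n_samples + 1)}

# Number of correct predictions whose accuracy is nearest to the reported value.
closest = min(candidates, key=lambda k: abs(candidates[k] - reported))
print(closest, round(candidates[closest], 3))
```

With so few samples each misclassification moves accuracy by almost 6 percentage points, so the reported figure carries wide uncertainty.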
Affiliation(s)
- Jie Cao
- Shanghai Key Laboratory of Forensic Medicine (Academy of Forensic Science), 200063, Shanghai, People's Republic of China; School of Forensic Medicine, Shanxi Medical University, No. 98, University Street, Wujinshan Town, Yuci District, Jinzhong, Shanxi Province, 030604, People's Republic of China
- Jian Li
- School of Forensic Medicine, Shanxi Medical University, No. 98, University Street, Wujinshan Town, Yuci District, Jinzhong, Shanxi Province, 030604, People's Republic of China
- Zhen Gu
- School of Forensic Medicine, Shanxi Medical University, No. 98, University Street, Wujinshan Town, Yuci District, Jinzhong, Shanxi Province, 030604, People's Republic of China
- Jia-Jia Niu
- School of Forensic Medicine, Shanxi Medical University, No. 98, University Street, Wujinshan Town, Yuci District, Jinzhong, Shanxi Province, 030604, People's Republic of China
- Guo-Shuai An
- School of Forensic Medicine, Shanxi Medical University, No. 98, University Street, Wujinshan Town, Yuci District, Jinzhong, Shanxi Province, 030604, People's Republic of China
- Qian-Qian Jin
- School of Forensic Medicine, Shanxi Medical University, No. 98, University Street, Wujinshan Town, Yuci District, Jinzhong, Shanxi Province, 030604, People's Republic of China
- Ying-Yuan Wang
- School of Forensic Medicine, Shanxi Medical University, No. 98, University Street, Wujinshan Town, Yuci District, Jinzhong, Shanxi Province, 030604, People's Republic of China
- Ping Huang
- Shanghai Key Laboratory of Forensic Medicine (Academy of Forensic Science), 200063, Shanghai, People's Republic of China
- Jun-Hong Sun
- Shanghai Key Laboratory of Forensic Medicine (Academy of Forensic Science), 200063, Shanghai, People's Republic of China; School of Forensic Medicine, Shanxi Medical University, No. 98, University Street, Wujinshan Town, Yuci District, Jinzhong, Shanxi Province, 030604, People's Republic of China
24
Xu M, Zhou H, Hu P, Pan Y, Wang S, Liu L, Liu X. Identification and validation of immune and oxidative stress-related diagnostic markers for diabetic nephropathy by WGCNA and machine learning. Front Immunol 2023; 14:1084531. [PMID: 36911691 PMCID: PMC9992203 DOI: 10.3389/fimmu.2023.1084531] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 10/30/2022] [Accepted: 02/13/2023] [Indexed: 02/24/2023] Open
Abstract
Background: Diabetic nephropathy (DN) is the primary cause of end-stage renal disease, but existing therapeutics are limited. Therefore, novel molecular pathways that contribute to DN therapy and diagnostics are urgently needed. Methods: Based on the Gene Expression Omnibus (GEO) database and the Limma R package, we identified differentially expressed genes of DN and downloaded oxidative stress-related genes from the GeneCards database. Then, immune and oxidative stress-related hub genes were screened by combining WGCNA, machine learning, and protein-protein interaction (PPI) networks and validated in external validation sets. We conducted ROC analysis to assess the diagnostic efficacy of hub genes. The correlation of hub genes with clinical characteristics was analyzed using the Nephroseq v5 database. To understand the cellular clustering of hub genes in DN, we performed single nucleus RNA sequencing analysis through the KIT database. Results: Ultimately, we screened three hub genes, namely CD36, ITGB2, and SLC1A3, which were all up-regulated. According to ROC analysis, all three demonstrated excellent diagnostic efficacy. Correlation analysis revealed that the expression of hub genes was significantly correlated with the deterioration of renal function, and the results of single nucleus RNA sequencing showed that hub genes were mainly clustered in endothelial cells and leukocyte clusters. Conclusion: By combining three machine learning algorithms with WGCNA analysis, this research identified three hub genes that could serve as novel targets for the diagnosis and therapy of DN.
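ROC analysis of a single gene is usually summarized by the area under the curve (AUC), which equals the probability that a randomly chosen case has a higher expression value than a randomly chosen control. The sketch below computes that quantity directly from all case-control pairs. The function name `roc_auc` and the expression values are hypothetical and are not data from the study.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (case, control) pairs ranked correctly; ties count 0.5."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = sum(
        1.0 if c > n else 0.5 if c == n else 0.0
        for c in cases
        for n in controls
    )
    return wins / (len(cases) * len(controls))


# Hypothetical expression values for one hub gene; 1 = DN, 0 = control.
expression = [2.1, 3.4, 1.2, 4.0, 1.8, 1.9]
is_dn = [0, 1, 0, 1, 1, 0]
auc = roc_auc(expression, is_dn)
print(round(auc, 3))
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of cases from controls.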
Affiliation(s)
- Mingming Xu
- Department of Urology, Tianjin Medical University General Hospital, Tianjin, China
- Hang Zhou
- Department of Urology, Tianjin Medical University General Hospital, Tianjin, China
- Ping Hu
- Department of Orthopedics, Tianjin Medical University General Hospital, Tianjin, China
- Yang Pan
- Department of Urology, Tianjin Medical University General Hospital, Tianjin, China
- Shangren Wang
- Department of Urology, Tianjin Medical University General Hospital, Tianjin, China
- Li Liu
- Department of Urology, Tianjin Medical University General Hospital, Tianjin, China
- Xiaoqiang Liu
- Department of Urology, Tianjin Medical University General Hospital, Tianjin, China
25
Galal A, Talal M, Moustafa A. Applications of machine learning in metabolomics: Disease modeling and classification. Front Genet 2022; 13:1017340. [PMID: 36506316 PMCID: PMC9730048 DOI: 10.3389/fgene.2022.1017340] [Citation(s) in RCA: 22] [Impact Index Per Article: 11.0] [Received: 08/11/2022] [Accepted: 11/07/2022] [Indexed: 11/25/2022] Open
Abstract
Metabolomics research has recently gained popularity because it enables the study of biological traits at the biochemical level and, as a result, can directly reveal what occurs in a cell or a tissue based on health or disease status, complementing other omics such as genomics and transcriptomics. Like other high-throughput biological experiments, metabolomics produces vast volumes of complex data. The application of machine learning (ML) to analyze data, recognize patterns, and build models is expanding across multiple fields. In the same way, ML methods are utilized for the classification, regression, or clustering of highly complex metabolomic data. This review discusses how disease modeling and diagnosis can be enhanced via deep and comprehensive metabolomic profiling using ML. We discuss the general layout of a metabolic workflow and the fundamental ML techniques used to analyze metabolomic data, including support vector machines (SVM), decision trees, random forests (RF), neural networks (NN), and deep learning (DL). Finally, we present the advantages and disadvantages of various ML methods and provide suggestions for different metabolic data analysis scenarios.
Affiliation(s)
- Aya Galal
- Systems Genomics Laboratory, American University in Cairo, New Cairo, Egypt; Institute of Global Health and Human Ecology, American University in Cairo, New Cairo, Egypt
- Marwa Talal
- Systems Genomics Laboratory, American University in Cairo, New Cairo, Egypt; Biotechnology Graduate Program, American University in Cairo, New Cairo, Egypt
- Ahmed Moustafa
- Systems Genomics Laboratory, American University in Cairo, New Cairo, Egypt; Biotechnology Graduate Program, American University in Cairo, New Cairo, Egypt; Department of Biology, American University in Cairo, New Cairo, Egypt
- Correspondence: Ahmed Moustafa
26
The Analysis of Relevant Gene Networks Based on Driver Genes in Breast Cancer. Diagnostics (Basel) 2022; 12:2882. [PMID: 36428940 PMCID: PMC9689550 DOI: 10.3390/diagnostics12112882] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/24/2022] [Revised: 11/08/2022] [Accepted: 11/14/2022] [Indexed: 11/22/2022] Open
Abstract
Background: The occurrence and development of breast cancer are strongly correlated with a person's genetics. Therefore, it is important to analyze the genetic factors of breast cancer to support the future development of targeted therapies at the genetic level. Methods: In this study, we analyze the protein-protein interaction network relevant to breast cancer in three steps: selection of breast cancer-relevant genes using the mutual information method, reconstruction of the protein-protein interaction network based on the STRING database, and identification of vital genes by node centrality analysis. Results: A total of 230 breast cancer-relevant genes were chosen in gene selection to reconstruct the protein-protein interaction network, and vital genes were identified by node centrality analyses. Node centrality analyses conducted with the top 10 and top 20 values of each metric found 19 and 39 statistically vital genes, respectively. To test the biological significance of these vital genes, we carried out survival analysis and DNA methylation analysis, and examined their prognostic value in other cancer tissues and their RNA expression levels in breast cancer. The results all supported the validity of the selected genes. Conclusions: These genes could provide a valuable reference for the clinical treatment of breast cancer patients.
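The node centrality step can be illustrated with a minimal sketch. It uses degree centrality only, as an assumed example of one metric, since the abstract does not name the metrics used. The edge list, the gene labels `G1` to `G5`, and the helper names are hypothetical.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Map each node to its degree divided by (number of nodes - 1)."""
    neighbors = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-interactions
            neighbors[a].add(b)
            neighbors[b].add(a)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(adj) / (n - 1) for node, adj in neighbors.items()}


def top_k(centrality, k):
    """Return the k highest-scoring nodes, breaking ties alphabetically."""
    return sorted(centrality, key=lambda g: (-centrality[g], g))[:k]


# Hypothetical interaction edges between placeholder genes.
edges = [("G1", "G2"), ("G1", "G3"), ("G1", "G4"), ("G2", "G3"), ("G4", "G5")]
centrality = degree_centrality(edges)
vital = top_k(centrality, 2)
print(vital)
```

Repeating the ranking for several metrics and pooling the top-ranked genes of each would yield a combined list, which may be how the top 10 and top 20 cutoffs map to 19 and 39 genes.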
27
Sidak D, Schwarzerová J, Weckwerth W, Waldherr S. Interpretable machine learning methods for predictions in systems biology from omics data. Front Mol Biosci 2022; 9:926623. [PMID: 36387282 PMCID: PMC9650551 DOI: 10.3389/fmolb.2022.926623] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 04/22/2022] [Accepted: 08/15/2022] [Indexed: 12/02/2022] Open
Abstract
Machine learning has become a powerful tool for systems biologists, from diagnosing cancer to optimizing kinetic models and predicting the state, growth dynamics, or type of a cell. Potential predictions from complex biological data sets obtained by “omics” experiments seem endless, but are often not the main objective of biological research. Often we want to understand the molecular mechanisms of a disease to develop new therapies, or we need to justify a crucial decision that is derived from a prediction. In order to gain such knowledge from data, machine learning models need to be extended. A recent trend to achieve this is to design “interpretable” models. However, the notions around interpretability are sometimes ambiguous, and a universal recipe for building well-interpretable models is missing. With this work, we want to familiarize systems biologists with the concept of model interpretability in machine learning. We consider data sets, data preparation, machine learning methods, and software tools relevant to omics research in systems biology. Finally, we try to answer the question: “What is interpretability?” We introduce views from the interpretable machine learning community and propose a scheme for categorizing studies on omics data. We then apply these tools to review and categorize recent studies where predictive machine learning models have been constructed from non-sequential omics data.
Affiliation(s)
- David Sidak
- Department of Functional and Evolutionary Ecology, Faculty of Life Sciences, Molecular Systems Biology (MOSYS), University of Vienna, Vienna, Austria
- Jana Schwarzerová
- Department of Functional and Evolutionary Ecology, Faculty of Life Sciences, Molecular Systems Biology (MOSYS), University of Vienna, Vienna, Austria
- Department of Biomedical Engineering, Faculty of Electrical Engineering and Communication, Brno University of Technology, Brno, Czech Republic
- Wolfram Weckwerth
- Department of Functional and Evolutionary Ecology, Faculty of Life Sciences, Molecular Systems Biology (MOSYS), University of Vienna, Vienna, Austria
- Vienna Metabolomics Center (VIME), Faculty of Life Sciences, University of Vienna, Vienna, Austria
- Steffen Waldherr
- Department of Functional and Evolutionary Ecology, Faculty of Life Sciences, Molecular Systems Biology (MOSYS), University of Vienna, Vienna, Austria
- Correspondence: Steffen Waldherr
28
Omics Data and Data Representations for Deep Learning-Based Predictive Modeling. Int J Mol Sci 2022; 23:12272. [PMID: 36293133 PMCID: PMC9603455 DOI: 10.3390/ijms232012272] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/23/2022] [Revised: 10/03/2022] [Accepted: 10/12/2022] [Indexed: 11/25/2022] Open
Abstract
Medical discoveries mainly depend on the capability to process and analyze biological datasets, which inundate the scientific community and are still expanding as the cost of next-generation sequencing technologies decreases. Deep learning (DL) is a viable method to exploit this massive data stream, since it has advanced quickly through successive innovations. However, an obstacle to scientific progress emerges: applying DL to biology is difficult because both fields are evolving at a breakneck pace, making it hard for an individual to stay at the front line of both. This paper aims to bridge the gap and help computer scientists bring their valuable expertise into the life sciences. This work provides an overview of the most common types of biological data and data representations that are used to train DL models, with additional information on the models themselves and the various tasks that are being tackled. This is the essential information a DL expert with no background in biology needs in order to participate in DL-based research projects in biomedicine, biotechnology, and drug discovery. This study could also be useful to researchers in biology who wish to understand and utilize the power of DL to gain better insights into, and extract important information from, omics data.
29
Identification of HIBCH as a Fatty Acid Metabolism-Related Biomarker in Aortic Valve Calcification Using Bioinformatics. Oxidative Medicine and Cellular Longevity 2022. [DOI: 10.1155/2022/9558713] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2022]
Abstract
Objective. To identify fatty acid metabolism-related biomarkers of aortic valve calcification (AVC) using bioinformatics and to investigate the role of immune cell infiltration in AVC. Methods. The AVC dataset was retrieved from the Gene Expression Omnibus database. R packages were used for differential gene expression analysis and weighted gene coexpression analysis. The differentially coexpressed genes were identified with a Venn diagram, and functions closely related to AVC were identified by Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses of these genes. Genes related to fatty acid metabolism were retrieved from the Molecular Signatures Database (MSigDB). After removing duplicate genes, least absolute shrinkage and selection operator (LASSO) regression analysis, support vector machine recursive feature elimination (SVM-RFE), and random forest were applied to identify biomarkers related to fatty acid metabolism in AVC. The CIBERSORT tool was used to analyze infiltration of immune cells in normal and AVC samples. Correlations between biomarkers and immune cells were calculated. Finally, HIBCH-related pathways were predicted by single-gene gene set enrichment analysis (GSEA). Results. In total, 2416 differentially expressed genes and one coexpression module were identified. A total of 1473 differentially coexpressed genes were acquired. GO and KEGG enrichment analyses demonstrated that differentially coexpressed genes were closely related to fatty acid metabolism. LASSO regression analysis, SVM-RFE, and random forest revealed that 3-hydroxyisobutyryl-CoA hydrolase (HIBCH) was a biomarker among fatty acid metabolism-related genes in AVC. Levels of memory B cells were significantly higher in AVC than in normal samples, while levels of activated natural killer (NK) cells were significantly lower in AVC than in normal samples. 
A significant positive correlation was observed between HIBCH and activated NK cells, regulatory T cells, monocytes, naïve B cells, activated dendritic cells, resting memory CD4 T cells, resting NK cells, and CD8 T cells. A significant negative correlation was observed between HIBCH and activated memory CD4 T cells, memory B cells, neutrophils, gamma delta T cells, M0 macrophages, and plasma cells. The single-gene GSEA results suggest that HIBCH may work through the inhibition of multiple immune-related pathways. Conclusion. HIBCH is closely related to immune cell infiltration in AVC and could be applied as a diagnostic marker for AVC.
30
Chardin D, Gille C, Pourcher T, Humbert O, Barlaud M. Learning a confidence score and the latent space of a new supervised autoencoder for diagnosis and prognosis in clinical metabolomic studies. BMC Bioinformatics 2022; 23:361. [PMID: 36050631 PMCID: PMC9434875 DOI: 10.1186/s12859-022-04900-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/17/2022] [Accepted: 07/27/2022] [Indexed: 11/15/2022] Open
Abstract
Background: Presently, there is a wide variety of classification methods and deep neural network approaches in bioinformatics. Deep neural networks have proven their effectiveness for classification tasks and have outperformed classical methods, but they suffer from a lack of interpretability. Therefore, these innovative methods are not appropriate for decision support systems in healthcare. Indeed, to allow clinicians to make informed and well thought out decisions, the algorithm should provide the main pieces of information used to compute the predicted diagnosis and/or prognosis, as well as a confidence score for this prediction. Methods: Herein, we used a new supervised autoencoder (SAE) approach for classification of clinical metabolomic data. This new method provides a confidence score for each prediction through a softmax classifier, offers a meaningful latent space visualization, and includes a new efficient feature selection method with a structured constraint, which allows for biologically interpretable results. Results: Experimental results on three metabolomics datasets of clinical samples illustrate the effectiveness of our SAE and its confidence score. The supervised autoencoder provides an accurate localization of the patients in the latent space, and an efficient confidence score. Experiments show that the SAE outperforms classical methods (PLS-DA, Random Forests, SVM, and neural networks (NN)). Furthermore, the metabolites selected by the SAE were found to be biologically relevant. Conclusion: In this paper, we describe a new efficient SAE method to support diagnostic or prognostic evaluation based on metabolomics analyses.
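A softmax classifier turns raw class scores (logits) into probabilities that sum to one, and the largest probability is a natural per-prediction confidence value. The sketch below assumes that the confidence score is this maximum softmax probability; the abstract does not give the exact definition. The logits, class names, and function names are hypothetical.

```python
import math


def softmax(logits):
    """Convert logits to probabilities; subtracting the max avoids overflow."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_with_confidence(logits, class_names):
    """Return the most probable class and its softmax probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return class_names[best], probs[best]


# Hypothetical logits for one patient in a two-class problem.
label, confidence = predict_with_confidence([2.0, 0.0], ["disease", "control"])
print(label, round(confidence, 3))
```

For two classes the confidence ranges from 0.5 (undecided) to 1.0, so predictions near 0.5 could be flagged for closer clinical review.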
Affiliation(s)
- David Chardin
- Transporters in Imaging and Radiotherapy in Oncology (TIRO), Direction de la Recherche Fondamentale (DRF), Institut des sciences du vivant Fréderic Joliot, Commissariat à l'Energie Atomique et aux énergies alternatives (CEA), Université Côte d'Azur (UCA), Nice, France; Centre Antoine Lacassagne, Université Côte d'Azur (UCA), Nice, France
- Cyprien Gille
- Laboratoire d'Informatique, Signaux et Systèmes de Sophia Antipolis (I3S), Centre de Recherche Scientifique (CNRS), Université Côte d'Azur (UCA), Sophia Antipolis, France
- Thierry Pourcher
- Transporters in Imaging and Radiotherapy in Oncology (TIRO), Direction de la Recherche Fondamentale (DRF), Institut des sciences du vivant Fréderic Joliot, Commissariat à l'Energie Atomique et aux énergies alternatives (CEA), Université Côte d'Azur (UCA), Nice, France
- Olivier Humbert
- Transporters in Imaging and Radiotherapy in Oncology (TIRO), Direction de la Recherche Fondamentale (DRF), Institut des sciences du vivant Fréderic Joliot, Commissariat à l'Energie Atomique et aux énergies alternatives (CEA), Université Côte d'Azur (UCA), Nice, France; Centre Antoine Lacassagne, Université Côte d'Azur (UCA), Nice, France
- Michel Barlaud
- Laboratoire d'Informatique, Signaux et Systèmes de Sophia Antipolis (I3S), Centre de Recherche Scientifique (CNRS), Université Côte d'Azur (UCA), Sophia Antipolis, France
Collapse
|
31
|
Zhong X, Ran R, Gao S, Shi M, Shi X, Long F, Zhou Y, Yang Y, Tang X, Lin A, He W, Yu T, Han TL. Complex metabolic interactions between ovary, plasma, urine, and hair in ovarian cancer. Front Oncol 2022; 12:916375. [PMID: 35982964 PMCID: PMC9379488 DOI: 10.3389/fonc.2022.916375] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 04/09/2022] [Accepted: 07/06/2022] [Indexed: 11/13/2022] Open
Abstract
Ovarian cancer (OC) is the third most common malignant tumor of women and is accompanied by alteration of systemic metabolism, yet the underlying interactions between the local OC tissue and other system biofluids remain unclear. In this study, we recruited 17 OC patients, 16 benign ovarian tumor (BOT) patients, and 14 control patients to collect biological samples including ovary, plasma, urine, and hair from the same patient. The metabolic features of samples were characterized using a global and targeted metabolic profiling strategy based on gas chromatography-mass spectrometry (GC-MS). Principal component analysis (PCA) revealed that the metabolites display obvious differences in ovary tissue, plasma, and urine between OC and non-malignant groups but not in hair samples. The metabolic alterations in OC tissue, including elevated glycolysis (lactic acid) and TCA cycle intermediates (malic acid, fumaric acid), were related to energy metabolism. Furthermore, increased levels of glutathione and polyunsaturated fatty acids (linoleic acid) together with decreased levels of saturated fatty acid (palmitic acid) were observed, which might be associated with the anti-oxidative stress capability of cancer. Furthermore, metabolite profile changes across different biospecimens were compared in OC patients. Plasma and urine showed a lower concentration of amino acids (alanine, aspartic acid, glutamic acid, proline, leucine, and cysteine) than the malignant ovary. Plasma exhibited the highest concentrations of fatty acids (stearic acid, EPA, and arachidonic acid), while TCA cycle intermediates (succinic acid, citric acid, and malic acid) were most concentrated in the urine. In addition, five plasma metabolites and three urine metabolites showed the best specificity and sensitivity in differentiating the OC group from the control or BOT groups (AUC > 0.90) using machine learning modeling. Overall, this study provided further insight into different specimen metabolic characteristics between OC and non-malignant disease and identified the metabolic fluctuation across ovary and biofluids.
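For readers unfamiliar with the AUC threshold quoted above, the following sketch computes the area under the ROC curve as the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one. The function name and the toy scores are illustrative assumptions and are not taken from the study.

```python
def roc_auc(scores, labels):
    """AUC as P(score of a positive > score of a negative); ties count as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical metabolite-based risk scores: 1 = OC, 0 = control (made-up values).
scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]
auc = roc_auc(scores, labels)
print(round(auc, 3))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, so a reported value above 0.90 indicates that cases are ranked above non-cases in the large majority of pairs.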
Affiliation(s)
- Xiaocui Zhong
- Department of Obstetrics and Gynaecology, The Second Affiliated Hospital of Chongqing Medical University, Chongqing, China
- Rui Ran
- Department of Obstetrics and Gynaecology, The Second Affiliated Hospital of Chongqing Medical University, Chongqing, China
- Shanhu Gao
- State Key Laboratory of Ultrasound Engineering in Medicine Co-Founded by Chongqing and the Ministry of Science and Technology, School of Biomedical Engineering, Chongqing Medical University, Chongqing, China
- Manlin Shi
- Department of Obstetrics and Gynaecology, The Second Affiliated Hospital of Chongqing Medical University, Chongqing, China
- Xian Shi
- Department of Obstetrics and Gynaecology, The Second Affiliated Hospital of Chongqing Medical University, Chongqing, China
- Fei Long
- State Key Laboratory of Ultrasound Engineering in Medicine Co-Founded by Chongqing and the Ministry of Science and Technology, School of Biomedical Engineering, Chongqing Medical University, Chongqing, China
- Yanqiu Zhou
- Department of Obstetrics and Gynaecology, The Second Affiliated Hospital of Chongqing Medical University, Chongqing, China
- Yang Yang
- Department of Obstetrics, The First Affiliated Hospital of Chongqing Medical University, Chongqing, China
- Xianglan Tang
- Department of Obstetrics and Gynaecology, The Second Affiliated Hospital of Chongqing Medical University, Chongqing, China
- Anping Lin
- Department of Obstetrics and Gynaecology, The Second Affiliated Hospital of Chongqing Medical University, Chongqing, China
- Wuyang He
- Department of Oncology, The Second Affiliated Hospital of Chongqing Medical University, Chongqing, China
- Tinghe Yu
- Department of Obstetrics and Gynaecology, The Second Affiliated Hospital of Chongqing Medical University, Chongqing, China
- *Correspondence: Tinghe Yu; Ting-Li Han
- Ting-Li Han
- Department of Obstetrics and Gynaecology, The Second Affiliated Hospital of Chongqing Medical University, Chongqing, China
- Liggins Institute, The University of Auckland, Auckland, New Zealand
- *Correspondence: Tinghe Yu; Ting-Li Han
|
32
|
Petrick LM, Shomron N. AI/ML-driven advances in untargeted metabolomics and exposomics for biomedical applications. Cell Reports Physical Science 2022; 3:100978. [PMID: 35936554 PMCID: PMC9354369 DOI: 10.1016/j.xcrp.2022.100978] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Indexed: 06/15/2023]
Abstract
Metabolomics describes a high-throughput approach for measuring a repertoire of metabolites and small molecules in biological samples. One utility of untargeted metabolomics, unbiased global analysis of the metabolome, is to detect key metabolites as contributors to, or readouts of, human health and disease. In this perspective, we discuss how artificial intelligence (AI) and machine learning (ML) have promoted major advances in untargeted metabolomics workflows and facilitated pivotal findings in the areas of disease screening and diagnosis. We contextualize applications of AI and ML to the emerging field of high-resolution mass spectrometry (HRMS) exposomics, which unbiasedly detects endogenous metabolites and exogenous chemicals in human tissue to characterize exposure linked with disease outcomes. We discuss the state of the science and suggest potential opportunities for using AI and ML to improve data quality, rigor, detection, and chemical identification in untargeted metabolomics and exposomics studies.
Affiliation(s)
- Lauren M. Petrick
- The Bert Strassburger Metabolic Center, Sheba Medical Center, Tel-Hashomer, Israel
- Department of Environmental Medicine and Public Health, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Institute for Exposomics Research, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Noam Shomron
- Faculty of Medicine, Edmond J. Safra Center for Bioinformatics, Sagol School of Neuroscience, Center for Nanoscience and Nanotechnology, Center for Innovation Laboratories (TILabs), Tel Aviv University, Tel Aviv, Israel
|
33
|
Metabolomics of Breast Cancer: A Review. Metabolites 2022; 12:643. [PMID: 35888767 PMCID: PMC9325024 DOI: 10.3390/metabo12070643] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 06/25/2022] [Revised: 07/09/2022] [Accepted: 07/11/2022] [Indexed: 12/10/2022] Open
Abstract
Breast cancer is the most commonly diagnosed cancer in women worldwide. Major advances have been made towards breast cancer prevention and treatment. Unfortunately, the incidence of breast cancer is still increasing globally. Metabolomics is the field of science which studies all the metabolites in a cell, tissue, system, or organism. Metabolomics can provide information on dynamic changes occurring during cancer development and progression. The metabolites identified using cutting-edge metabolomics techniques will result in the identification of biomarkers for the early detection, diagnosis, and treatment of cancers. This review briefly introduces the metabolic changes in cancer with particular focus on breast cancer.
|
34
|
Gomari DP, Schweickart A, Cerchietti L, Paietta E, Fernandez H, Al-Amin H, Suhre K, Krumsiek J. Variational autoencoders learn transferrable representations of metabolomics data. Commun Biol 2022; 5:645. [PMID: 35773471 PMCID: PMC9246987 DOI: 10.1038/s42003-022-03579-3] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 03/25/2021] [Accepted: 06/10/2022] [Indexed: 01/14/2023] Open
Abstract
Dimensionality reduction approaches are commonly used for the deconvolution of high-dimensional metabolomics datasets into underlying core metabolic processes. However, current state-of-the-art methods are widely incapable of detecting nonlinearities in metabolomics data. Variational Autoencoders (VAEs) are a deep learning method designed to learn nonlinear latent representations which generalize to unseen data. Here, we trained a VAE on a large-scale metabolomics population cohort of human blood samples consisting of over 4500 individuals. We analyzed the pathway composition of the latent space using a global feature importance score, which demonstrated that latent dimensions represent distinct cellular processes. To demonstrate model generalizability, we generated latent representations of unseen metabolomics datasets on type 2 diabetes, acute myeloid leukemia, and schizophrenia and found significant correlations with clinical patient groups. Notably, the VAE representations showed stronger effects than latent dimensions derived by linear and non-linear principal component analysis. Taken together, we demonstrate that the VAE is a powerful method that learns biologically meaningful, nonlinear, and transferrable latent representations of metabolomics data.
Affiliation(s)
- Daniel P. Gomari
- Institute of Computational Biology, Helmholtz Center Munich - German Research Center for Environmental Health, 85764 Neuherberg, Germany; Technical University of Munich - School of Life Sciences, 85354 Freising, Germany; Department of Genetics, Stanford University School of Medicine, Stanford, CA USA
- Annalise Schweickart
- Department of Physiology and Biophysics, Weill Cornell Medicine, Institute for Computational Biomedicine, Englander Institute for Precision Medicine, New York, NY 10021 USA
- Leandro Cerchietti
- Department of Medicine, Hematology and Oncology Division, Weill Cornell Medicine, New York, 10065 NY USA
- Elisabeth Paietta
- Albert Einstein College of Medicine-Montefiore Medical Center, Bronx, NY USA
- Hugo Fernandez
- Moffitt Malignant Hematology & Cellular Therapy at Memorial Healthcare System, Pembroke Pines, FL USA
- Hassen Al-Amin
- Department of Psychiatry, Weill Cornell Medicine-Qatar, Education City, P.O. Box 24144, Doha, Qatar
- Karsten Suhre
- Department of Physiology and Biophysics, Weill Cornell Medical College-Qatar Education City, Doha, Qatar
- Jan Krumsiek
- Department of Physiology and Biophysics, Weill Cornell Medicine, Institute for Computational Biomedicine, Englander Institute for Precision Medicine, New York, NY 10021 USA
|
35
|
Quazi S. Artificial intelligence and machine learning in precision and genomic medicine. Med Oncol 2022; 39:120. [PMID: 35704152 PMCID: PMC9198206 DOI: 10.1007/s12032-022-01711-1] [Citation(s) in RCA: 34] [Impact Index Per Article: 17.0] [Received: 03/06/2022] [Accepted: 03/14/2022] [Indexed: 10/28/2022]
Abstract
The advancement of precision medicine in medical care has left behind the conventional symptom-driven treatment process by allowing early risk prediction of disease through improved diagnostics and customization of more effective treatments. It is necessary to scrutinize overall patient data alongside broad factors to observe and differentiate between ill and relatively healthy people to take the most appropriate path toward precision medicine, resulting in an improved vision of biological indicators that can signal health changes. Precision and genomic medicine combined with artificial intelligence have the potential to improve patient healthcare. Patients with less common therapeutic responses or unique healthcare demands are using genomic medicine technologies. AI provides insights through advanced computation and inference, enabling the system to reason and learn while enhancing physician decision making. Many cell characteristics, including gene up-regulation, proteins binding to nucleic acids, and splicing, can be measured at high throughput and used as training objectives for predictive models. Researchers can create a new era of effective genomic medicine with the improved availability of a broad range of datasets and modern computer techniques such as machine learning. This review article elucidates the contributions of ML algorithms in precision and genomic medicine.
Affiliation(s)
- Sameer Quazi
- GenLab Biosolutions Private Limited, Bangalore, Karnataka, 560043, India.
- Department of Biomedical Sciences, School of Life Sciences, Anglia Ruskin University, Cambridge, UK.
|
37
|
Using machine learning and an electronic tongue for discriminating saliva samples from oral cavity cancer patients and healthy individuals. Talanta 2022; 243:123327. [DOI: 10.1016/j.talanta.2022.123327] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 12/01/2021] [Revised: 02/14/2022] [Accepted: 02/16/2022] [Indexed: 11/20/2022]
|
38
|
Bahado-Singh RO, Radhakrishna U, Gordevičius J, Aydas B, Yilmaz A, Jafar F, Imam K, Maddens M, Challapalli K, Metpally RP, Berrettini WH, Crist RC, Graham SF, Vishweswaraiah S. Artificial Intelligence and Circulating Cell-Free DNA Methylation Profiling: Mechanism and Detection of Alzheimer's Disease. Cells 2022; 11:1744. [PMID: 35681440 PMCID: PMC9179874 DOI: 10.3390/cells11111744] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 03/31/2022] [Revised: 05/13/2022] [Accepted: 05/17/2022] [Indexed: 02/01/2023] Open
Abstract
Background: Despite extensive efforts, significant gaps remain in our understanding of Alzheimer’s disease (AD) pathophysiology. Novel approaches using circulating cell-free DNA (cfDNA) have the potential to revolutionize our understanding of neurodegenerative disorders. Methods: We performed DNA methylation profiling of cfDNA from AD patients and compared them to cognitively normal controls. Six Artificial Intelligence (AI) platforms were utilized for the diagnosis of AD while enrichment analysis was used to elucidate the pathogenesis of AD. Results: A total of 3684 CpGs were significantly (adj. p-value < 0.05) differentially methylated in AD versus controls. All six AI algorithms achieved high predictive accuracy (AUC = 0.949−0.998) in an independent test group. As an example, Deep Learning (DL) achieved an AUC (95% CI) = 0.99 (0.95−1.0), with 94.5% sensitivity and specificity. Conclusion: We describe numerous epigenetically altered genes which were previously reported to be differentially expressed in the brain of AD sufferers. Genes identified by AI to be the best predictors of AD were either known to be expressed in the brain or have been previously linked to AD. We highlight enrichment in the Calcium signaling pathway, Glutamatergic synapse, Hedgehog signaling pathway, Axon guidance and Olfactory transduction in AD sufferers. To the best of our knowledge, this is the first reported genome-wide DNA methylation study using cfDNA to detect AD.
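The sensitivity and specificity figures quoted in this abstract can be read against the short sketch below, which derives both metrics from binary predictions. The function name and the toy labels are hypothetical and unrelated to the study's data.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels where 1 = disease."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    # Sensitivity: share of true cases detected; specificity: share of controls cleared.
    return tp / (tp + fn), tn / (tn + fp)


# Made-up example: four cases followed by four controls.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(sens, spec)
```

Reporting both values matters because a classifier can reach high sensitivity simply by labeling most samples as diseased, at the cost of specificity.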
Affiliation(s)
- Ray O. Bahado-Singh
- Department of Obstetrics and Gynecology, Oakland University-William Beaumont School of Medicine, Royal Oak, MI 48309, USA; (R.O.B.-S.); (A.Y.); (S.F.G.)
- Department of Obstetrics and Gynecology, Beaumont Health, 3601 W. 13 Mile Road, Royal Oak, MI 48073, USA; (F.J.); (K.C.)
- Uppala Radhakrishna
- Department of Obstetrics and Gynecology, Beaumont Health, 3601 W. 13 Mile Road, Royal Oak, MI 48073, USA; (F.J.); (K.C.)
- Correspondence: (U.R.); (S.V.); Tel.: +1-248-551-2574 (U.R.); +1-248-551-2569 (S.V.)
- Juozas Gordevičius
- Vugene, LLC, 625 Kenmoor Ave Suite 301 PMB 96578, Grand Rapids, MI 49546, USA;
- Buket Aydas
- Department of Care Management Analytics, Blue Cross Blue Shield of Michigan, Detroit, MI 48226, USA;
- Ali Yilmaz
- Department of Obstetrics and Gynecology, Oakland University-William Beaumont School of Medicine, Royal Oak, MI 48309, USA; (R.O.B.-S.); (A.Y.); (S.F.G.)
- Department of Alzheimer’s Disease Research, Beaumont Research Institute, 3811 W. 13 Mile Road, Royal Oak, MI 48073, USA
- Faryal Jafar
- Department of Obstetrics and Gynecology, Beaumont Health, 3601 W. 13 Mile Road, Royal Oak, MI 48073, USA; (F.J.); (K.C.)
- Khaled Imam
- Department of Internal Medicine, Beaumont Health, 3601 W. 13 Mile Road, Royal Oak, MI 48073, USA; (K.I.); (M.M.)
- Michael Maddens
- Department of Internal Medicine, Beaumont Health, 3601 W. 13 Mile Road, Royal Oak, MI 48073, USA; (K.I.); (M.M.)
- Kshetra Challapalli
- Department of Obstetrics and Gynecology, Beaumont Health, 3601 W. 13 Mile Road, Royal Oak, MI 48073, USA; (F.J.); (K.C.)
- Raghu P. Metpally
- Department of Molecular and Functional Genomics, Geisinger, Danville, PA 17821, USA; (R.P.M.); (W.H.B.)
- Wade H. Berrettini
- Department of Molecular and Functional Genomics, Geisinger, Danville, PA 17821, USA; (R.P.M.); (W.H.B.)
- Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA;
- Richard C. Crist
- Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA;
- Stewart F. Graham
- Department of Obstetrics and Gynecology, Oakland University-William Beaumont School of Medicine, Royal Oak, MI 48309, USA; (R.O.B.-S.); (A.Y.); (S.F.G.)
- Department of Obstetrics and Gynecology, Beaumont Health, 3601 W. 13 Mile Road, Royal Oak, MI 48073, USA; (F.J.); (K.C.)
- Department of Alzheimer’s Disease Research, Beaumont Research Institute, 3811 W. 13 Mile Road, Royal Oak, MI 48073, USA
- Sangeetha Vishweswaraiah
- Department of Obstetrics and Gynecology, Beaumont Health, 3601 W. 13 Mile Road, Royal Oak, MI 48073, USA; (F.J.); (K.C.)
- Correspondence: (U.R.); (S.V.); Tel.: +1-248-551-2574 (U.R.); +1-248-551-2569 (S.V.)
|
39
|
Bahado-Singh R, Vlachos KT, Aydas B, Gordevicius J, Radhakrishna U, Vishweswaraiah S. Precision Oncology: Artificial Intelligence and DNA Methylation Analysis of Circulating Cell-Free DNA for Lung Cancer Detection. Front Oncol 2022; 12:790645. [PMID: 35600397 PMCID: PMC9114890 DOI: 10.3389/fonc.2022.790645] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 10/07/2021] [Accepted: 04/04/2022] [Indexed: 12/12/2022] Open
Abstract
Background: Lung cancer (LC) is a leading cause of cancer deaths globally. Its lethality is due in large part to the paucity of accurate screening markers. Precision Medicine includes the use of omics technology and novel analytic approaches for biomarker development. We combined Artificial Intelligence (AI) and DNA methylation analysis of circulating cell-free tumor DNA (ctDNA), to identify putative biomarkers for and to elucidate the pathogenesis of LC. Methods: Illumina Infinium MethylationEPIC BeadChip array analysis was used to measure cytosine (CpG) methylation changes across the genome in LC. Six different AI platforms including support vector machine (SVM) and Deep Learning (DL) were used to identify CpG biomarkers and for LC detection. Training and validation sets were generated, and 10-fold cross validation was performed. Gene enrichment analysis using g:profiler and GREAT enrichment was used to elucidate the LC pathogenesis. Results: Using a stringent GWAS significance threshold, p-value < 5×10⁻⁸, we identified 4389 CpGs (cytosine methylation loci) in coding genes and 1812 CpGs in non-protein coding DNA regions that were differentially methylated in LC. SVM and three other AI platforms achieved an AUC=1.00; 95% CI (0.90-1.00) for LC detection. DL achieved an AUC=1.00; 95% CI (0.95-1.00) and 100% sensitivity and specificity. High diagnostic accuracies were achieved with only intragenic or only intergenic CpG loci. Gene enrichment analysis found dysregulation of molecular pathways involved in the development of small cell and non-small cell LC. Conclusion: Using AI and DNA methylation analysis of ctDNA, high LC detection rates were achieved. Further, many of the genes that were epigenetically altered are known to be involved in the biology of neoplasms in general and lung cancer in particular.
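The 10-fold cross validation mentioned in the Methods can be pictured with the sketch below, which partitions sample indices so that each sample is held out exactly once. This is a generic illustration, not the study's pipeline; the function name, the seed, and the sample count of 25 are assumptions.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs so every sample is tested exactly once."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)  # shuffle once, reproducibly
    folds = [idx[i::k] for i in range(k)]  # fold sizes differ by at most one
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


# Hypothetical cohort of 25 samples split into 10 folds.
splits = list(k_fold_indices(25, k=10))
for train_idx, test_idx in splits[:2]:
    print(len(train_idx), len(test_idx))
```

In practice a model would be fitted on each `train_idx` and scored on the matching `test_idx`, with the ten scores averaged; any feature selection should be repeated inside each training fold to avoid leaking test information.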
Affiliation(s)
- Ray Bahado-Singh
- Department of Obstetrics and Gynecology, Oakland University William Beaumont School of Medicine, Royal Oak, MI, United States
- Kyriacos T Vlachos
- Department of Biomedical Sciences, Wayne State School of Medicine, Basic Medical Sciences, Detroit, MI, United States
- Buket Aydas
- Department of Healthcare Analytics, Meridian Health Plans, Detroit, MI, United States
- Uppala Radhakrishna
- Department of Obstetrics and Gynecology, Oakland University William Beaumont School of Medicine, Royal Oak, MI, United States
- Sangeetha Vishweswaraiah
- Department of Obstetrics and Gynecology, Beaumont Research Institute, Royal Oak, MI, United States
|
40
|
Lisitsyna A, Moritz F, Liu Y, Al Sadat L, Hauner H, Claussnitzer M, Schmitt-Kopplin P, Forcisi S. Feature Selection Pipelines with Classification for Non-targeted Metabolomics Combining the Neural Network and Genetic Algorithm. Anal Chem 2022; 94:5474-5482. [PMID: 35344349 DOI: 10.1021/acs.analchem.1c03237] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
Abstract
Non-targeted metabolomics via high-resolution mass spectrometry methods, such as direct infusion Fourier transform-ion cyclotron resonance mass spectrometry (DI-FT-ICR MS), produces data sets with thousands of features. By contrast, the number of samples is in general substantially lower. This disparity presents challenges when analyzing non-targeted metabolomics data sets and often requires custom methods to uncover information not always accessible via classical statistical techniques. In this work, we present a pipeline that combines a convolutional neural network with traditional statistical approaches and an adaptation of a genetic algorithm. The developed method was applied to a lifestyle intervention cohort data set, where subjects at risk of type 2 diabetes underwent an oral glucose tolerance test. Feature selection is the final result of the pipeline, achieved through classification of the data set via a neural network, with a precision-recall score of over 0.9 on the test set. The features most relevant for the described classification were then chosen via a genetic algorithm. The output of the developed pipeline encompasses approximately 200 features with high predictive scores, providing a fingerprint of the metabolic changes in the prediabetic class on the data set. Our framework presents a new approach that allows complex modeling based on convolutional neural networks to be applied to the analysis of high-resolution mass spectrometric data.
Affiliation(s)
- Anna Lisitsyna
- Research Unit Analytical BioGeoChemistry, Helmholtz Zentrum München, Neuherberg 85764, Germany; German Center for Diabetes Research (DZD), Neuherberg 85764, Germany
- Franco Moritz
- Research Unit Analytical BioGeoChemistry, Helmholtz Zentrum München, Neuherberg 85764, Germany
- Youzhong Liu
- Analytical Development, Small Molecule Development, Janssen Pharmaceutical Companies of Johnson and Johnson, Beerse 2340, Belgium
- Loubna Al Sadat
- Institute for Nutritional Medicine, School of Medicine, Technical University of Munich, Munich 80686, Germany
- Hans Hauner
- Institute for Nutritional Medicine, School of Medicine, Technical University of Munich, Munich 80686, Germany; Else Kröner-Fresenius-Centre for Nutritional Medicine, School of Life Sciences, Technical University of Munich, Freising 85354, Germany
- Melina Claussnitzer
- Broad Institute of MIT and Harvard, Cambridge 02141-2023 Massachusetts, United States; Division of Gerontology, Department of Medicine, Beth Israel Deaconess Medical Center, Boston, Massachusetts 02108, United States; Harvard Medical School, Harvard University, Boston, Massachusetts 02108, United States
- Philippe Schmitt-Kopplin
- Research Unit Analytical BioGeoChemistry, Helmholtz Zentrum München, Neuherberg 85764, Germany; Chair of Analytical Food Chemistry, TUM School of Life Sciences, Technical University Munich, Munich 80686, Germany
- Sara Forcisi
- Research Unit Analytical BioGeoChemistry, Helmholtz Zentrum München, Neuherberg 85764, Germany; German Center for Diabetes Research (DZD), Neuherberg 85764, Germany
|
41
|
Wang H, Zhang J. Identification of DTL as Related Biomarker and Immune Infiltration Characteristics of Nasopharyngeal Carcinoma via Comprehensive Strategies. Int J Gen Med 2022; 15:2329-2345. [PMID: 35264872 PMCID: PMC8901051 DOI: 10.2147/ijgm.s352330] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/16/2021] [Accepted: 02/02/2022] [Indexed: 11/25/2022] Open
Abstract
Purpose: Although considerable progress has been made in basic and clinical research on nasopharyngeal carcinoma (NPC), the biomarkers of the progression of NPC have not been fully studied and described. This study was designed to identify potential novel biomarkers for NPC using integrated analyses and explore the immune cell infiltration in this pathological process. Methods: Five data sets were downloaded from the Gene Expression Omnibus (GEO) database and analysed to identify differentially expressed genes (DEGs), followed by Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses. Four algorithms were adopted for screening of novel and key biomarkers for NPC: the random forest (RF) machine learning algorithm, least absolute shrinkage and selection operator (LASSO) logistic regression, support vector machine-recursive feature elimination (SVM-RFE), and weighted gene co-expression network analysis (WGCNA). Lastly, CIBERSORT was used to assess the infiltration of immune cells in NPC, and the correlation between diagnostic markers and infiltrating immune cells was analyzed. Results: Herein, we identified 46 DEGs, and enrichment analysis results showed that DEGs and several kinds of signaling pathways might be closely associated with the occurrence and progression of NPC. DTL was recognized as an NPC-related biomarker. DTL, also known as retinoic acid-regulated nuclear matrix-associated protein (RAMP), or DNA replication factor 2 (CDT2), is reported to be correlated with cell proliferation, cell cycle arrest and cell invasion in hepatocellular carcinoma, breast cancer and gastric cancer. Immune infiltration analysis demonstrated that macrophages M0, macrophages M1 and T cells CD4 memory activated were linked to the pathogenesis of NPC. Conclusion: In summary, we adopted a comprehensive strategy to screen DTL as a biomarker related to NPC and explore the critical role of immune cell infiltration in NPC.
Affiliation(s)
- Hehe Wang
- Department of Otolaryngology, Head and Neck Surgery, Ningbo First Hospital, Ningbo, Zhejiang, People’s Republic of China
- Correspondence: Hehe Wang, Department of Otolaryngology Head and Neck Surgery, Ningbo First Hospital, Ningbo, Zhejiang, 315010, People’s Republic of China
- Junge Zhang
- Department of Anesthesiology, Ningbo First Hospital, Ningbo, Zhejiang, People’s Republic of China
|
42
|
Navigating the pitfalls of applying machine learning in genomics. Nat Rev Genet 2022; 23:169-181. [PMID: 34837041 DOI: 10.1038/s41576-021-00434-9] [Citation(s) in RCA: 66] [Impact Index Per Article: 33.0] [Accepted: 10/28/2021] [Indexed: 11/08/2022]
Abstract
The scale of genetic, epigenomic, transcriptomic, cheminformatic and proteomic data available today, coupled with easy-to-use machine learning (ML) toolkits, has propelled the application of supervised learning in genomics research. However, the assumptions behind the statistical models and performance evaluations in ML software frequently are not met in biological systems. In this Review, we illustrate the impact of several common pitfalls encountered when applying supervised ML in genomics. We explore how the structure of genomics data can bias performance evaluations and predictions. To address the challenges associated with applying cutting-edge ML methods to genomics, we describe solutions and appropriate use cases where ML modelling shows great potential.
|
43
|
Abram KJ, McCloskey D. A Comprehensive Evaluation of Metabolomics Data Preprocessing Methods for Deep Learning. Metabolites 2022; 12:metabo12030202. [PMID: 35323644 PMCID: PMC8948616 DOI: 10.3390/metabo12030202] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/28/2021] [Revised: 02/15/2022] [Accepted: 02/17/2022] [Indexed: 12/04/2022] Open
Abstract
Machine learning has greatly advanced over the past decade, owing to advances in algorithmic innovations, hardware acceleration, and benchmark datasets to train on domains such as computer vision, natural-language processing, and more recently the life sciences. In particular, the subfield of machine learning known as deep learning has found applications in genomics, proteomics, and metabolomics. However, a thorough assessment of how the data preprocessing methods required for the analysis of life science data affect the performance of deep learning is lacking. This work contributes to filling that gap by assessing the impact of commonly used as well as newly developed methods employed in data preprocessing workflows for metabolomics that span from raw data to processed data. The results from these analyses are summarized into a set of best practices that can be used by researchers as a starting point for downstream classification and reconstruction tasks using deep learning.
|
44
|
Chen Y, He B, Liu Y, Aung MT, Rosario-Pabón Z, Vélez-Vega CM, Alshawabkeh A, Cordero JF, Meeker JD, Garmire LX. Maternal plasma lipids are involved in the pathogenesis of preterm birth. Gigascience 2022; 11:6528776. [PMID: 35166340 PMCID: PMC8847704 DOI: 10.1093/gigascience/giac004] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 07/12/2021] [Revised: 11/20/2021] [Accepted: 01/12/2022] [Indexed: 12/17/2022] Open
Abstract
BACKGROUND: Preterm birth is defined by the onset of labor at a gestational age of less than 37 weeks, and it can lead to premature birth and pose a threat to newborns' health. The Puerto Rico PROTECT cohort is a well-characterized prospective birth cohort that was designed to investigate environmental and social contributors to preterm birth in Puerto Rico, where preterm birth rates have been elevated in recent decades. To elucidate possible relationships between metabolites and preterm birth in this cohort, we conducted a nested case-control study with untargeted metabolomic characterization of maternal plasma, sampled at 24-28 gestational weeks, from 31 women who experienced preterm birth and 69 controls who underwent full-term labor.
RESULTS: A total of 333 metabolites were identified and annotated with liquid chromatography/mass spectrometry. Subsequent weighted gene correlation network analysis showed that the fatty acid and carene-enriched module has a significant positive association (P = 8e-04, FDR = 0.006) with preterm birth. After controlling for potential clinical confounders, a total of 38 metabolites demonstrated significant changes uniquely associated with preterm birth, of which 17 were preterm biomarkers. Among 7 machine-learning classifiers, random forest achieved a highly accurate and specific prediction (AUC = 0.92) of preterm birth in testing data, demonstrating the strong potential of these metabolites as biomarkers for preterm birth. The 17 preterm biomarkers are involved in cell signaling, lipid metabolism, and lipid peroxidation functions. Additional modeling using only the 19 spontaneous preterm births (sPTB) and controls identified 16 sPTB markers, with an AUC of 0.89 in testing data. Half of the sPTB markers overlap with the markers for preterm birth. Further causality analysis infers that suberic acid upregulates several fatty acids to promote preterm birth.
CONCLUSIONS: Altogether, this study demonstrates the involvement of lipids, particularly fatty acids, in the pathogenesis of preterm birth.
Affiliation(s)
- Yile Chen
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48105, USA
- Bing He
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48105, USA
- Yu Liu
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48105, USA
- Max T Aung
- Program on Reproductive Health and the Environment, Department of Obstetrics, Gynecology, and Reproductive Sciences, University of California, San Francisco, School of Medicine, San Francisco, CA 94158, USA
- Zaira Rosario-Pabón
- University of Puerto Rico Graduate School of Public Health, UPR Medical Sciences Campus, San Juan, Puerto Rico 365067, USA
- Carmen M Vélez-Vega
- University of Puerto Rico Graduate School of Public Health, UPR Medical Sciences Campus, San Juan, Puerto Rico 365067, USA
- Akram Alshawabkeh
- College of Engineering, Northeastern University, Boston, MA 02115, USA
- José F Cordero
- Department of Epidemiology and Biostatistics, University of Georgia, Athens, GA 30602, USA
- John D Meeker
- Department of Environmental and Health Sciences, School of Public Health, University of Michigan, Ann Arbor, MI 48109, USA
- Lana X Garmire
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48105, USA
|
45
|
Debik J, Sangermani M, Wang F, Madssen TS, Giskeødegård GF. Multivariate analysis of NMR-based metabolomic data. NMR Biomed 2022; 35:e4638. [PMID: 34738674 DOI: 10.1002/nbm.4638] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 06/03/2021] [Revised: 09/08/2021] [Accepted: 09/29/2021] [Indexed: 06/13/2023]
Abstract
Nuclear magnetic resonance (NMR) spectroscopy allows for simultaneous detection of a wide range of metabolites and lipids. As metabolites act together in complex metabolic networks, they are often highly correlated, and optimal biological insight is achieved when using methods that take the correlation into account. For this reason, latent-variable-based methods, such as principal component analysis and partial least-squares discriminant analysis, are widely used in metabolomic studies. However, with increasing availability of larger population cohorts, and a shift from analysis of spectral data to using quantified metabolite levels, both more traditional statistical approaches and alternative machine learning methods have become more widely used. This review aims at providing an overview of the current state-of-the-art multivariate methods for the analysis of NMR-based metabolomic data as well as alternative methods, highlighting their strengths and limitations.
Affiliation(s)
- Julia Debik
- Department of Circulation and Medical Imaging, Faculty of Medicine and Health Sciences, Norwegian University of Science and Technology-NTNU, Trondheim, Norway
- Matteo Sangermani
- Department of Circulation and Medical Imaging, Faculty of Medicine and Health Sciences, Norwegian University of Science and Technology-NTNU, Trondheim, Norway
- Feng Wang
- Department of Circulation and Medical Imaging, Faculty of Medicine and Health Sciences, Norwegian University of Science and Technology-NTNU, Trondheim, Norway
- Clinic of Surgery, St. Olavs Hospital HF, Trondheim, Norway
- Torfinn S Madssen
- Department of Circulation and Medical Imaging, Faculty of Medicine and Health Sciences, Norwegian University of Science and Technology-NTNU, Trondheim, Norway
- Guro F Giskeødegård
- Clinic of Surgery, St. Olavs Hospital HF, Trondheim, Norway
- K.G. Jebsen Center for Genetic Epidemiology, Department of Public Health and Nursing, Norwegian University of Science and Technology-NTNU, Trondheim, Norway
|
46
|
Kehoe ER, Fitzgerald BL, Graham B, Islam MN, Sharma K, Wormser GP, Belisle JT, Kirby MJ. Biomarker selection and a prospective metabolite-based machine learning diagnostic for Lyme disease. Sci Rep 2022; 12:1478. [PMID: 35087163 PMCID: PMC8795431 DOI: 10.1038/s41598-022-05451-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 07/22/2021] [Accepted: 01/06/2022] [Indexed: 12/14/2022] Open
Abstract
We provide a pipeline for data preprocessing, biomarker selection, and classification of liquid chromatography–mass spectrometry (LCMS) serum samples to generate a prospective diagnostic test for Lyme disease. We utilize tools of machine learning (ML), e.g., sparse support vector machines (SSVM), iterative feature removal (IFR), and k-fold feature ranking to select several biomarkers and build a discriminant model for Lyme disease. We report a 98.13% test balanced success rate (BSR) of our model based on a sequestered test set of LCMS serum samples. The methodology employed is general and can be readily adapted to other LCMS, or metabolomics, data sets.
Affiliation(s)
- Eric R Kehoe
- Department of Mathematics, Colorado State University, Fort Collins, CO, 80523, USA
- Bryna L Fitzgerald
- Department of Microbiology, Immunology & Pathology, Colorado State University, Fort Collins, CO, 80523, USA
- Barbara Graham
- Department of Microbiology, Immunology & Pathology, Colorado State University, Fort Collins, CO, 80523, USA
- M Nurul Islam
- Department of Microbiology, Immunology & Pathology, Colorado State University, Fort Collins, CO, 80523, USA
- Kartikay Sharma
- Department of Computer Science, Colorado State University, Fort Collins, CO, 80523, USA
- Gary P Wormser
- Department of Medicine, New York Medical College, Valhalla, NY, 10595, USA
- John T Belisle
- Department of Microbiology, Immunology & Pathology, Colorado State University, Fort Collins, CO, 80523, USA
- Michael J Kirby
- Department of Computer Science, Colorado State University, Fort Collins, CO, 80523, USA
- Department of Mathematics, Colorado State University, Fort Collins, CO, 80523, USA
|
47
|
Lee SM, Kim HU. Development of computational models using omics data for the identification of effective cancer metabolic biomarkers. Mol Omics 2021; 17:881-893. [PMID: 34608924 DOI: 10.1039/d1mo00337b] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/23/2022]
Abstract
Identification of novel biomarkers has been an active area of study for the effective diagnosis, prognosis and treatment of cancers. Among various types of cancer biomarkers, metabolic biomarkers, including enzymes, metabolites and metabolic genes, deserve attention as they can serve as a reliable source for diagnosis, prognosis and treatment of cancers. In particular, efforts to identify novel biomarkers have been greatly facilitated by a rapid increase in the volume of multiple omics data generated for a range of cancer cells. These omics data in turn serve as ingredients for developing computational models that can help derive deeper insights into the biology of cancer cells, and identify metabolic biomarkers. In this review, we provide an overview of omics data generated for cancer cells, and discuss recent studies on computational models that were developed using omics data in order to identify effective cancer metabolic biomarkers.
Affiliation(s)
- Sang Mi Lee
- Systems Biology and Medicine Laboratory, Department of Chemical and Biomolecular Engineering (BK21 four), Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Republic of Korea
- Hyun Uk Kim
- Systems Biology and Medicine Laboratory, Department of Chemical and Biomolecular Engineering (BK21 four), Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Republic of Korea
- KAIST Institute for Artificial Intelligence, KAIST, Daejeon 34141, Republic of Korea
- BioProcess Engineering Research Center and BioInformatics Research Center, KAIST, Daejeon 34141, Republic of Korea
|
48
|
Nie H, Pan J, An F, Zheng C, Zhang Q, Zhan Q. Comprehensive Analysis of Serum Metabolites Profiles in Acute Radiation Enteritis Rats by Untargeted Metabolomics. Tohoku J Exp Med 2021; 255:257-265. [PMID: 34853247 DOI: 10.1620/tjem.255.257] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/18/2022]
Abstract
Acute radiation enteritis is a common complication in patients with pelvic and abdominal tumors who receive radiotherapy. It seriously reduces quality of life and can even threaten patients' lives. Untargeted metabolomics is an emerging strategy for exploring novel biomarkers and uncovering the potential pathogenesis of acute radiation enteritis. A rat model of acute radiation enteritis was established by a single abdominal irradiation with a gamma-ray dose of 10 Gy. Serum from 15 acute radiation enteritis rats and 10 controls was extracted for metabolomics analysis by UHPLC-Q-TOF/MS. Clinical manifestations and morphological alterations of the intestine confirmed the successful establishment of acute radiation enteritis. According to the metabolomics data, 6,044 positive peaks and 4,241 negative peaks were extracted from each specimen. OPLS-DA and the cluster-analysis heat map showed satisfactory discriminatory power between acute radiation enteritis rats and controls. Subsequent analysis extracted 66 significantly differentially expressed metabolites, which might be potential biomarkers for the diagnosis of acute radiation enteritis. Moreover, Kyoto Encyclopedia of Genes and Genomes enrichment analyses uncovered the potential mechanisms through which the differentially expressed metabolites participate in the pathogenesis of acute radiation enteritis. In summary, we identified several differentially expressed serum metabolites as potential biomarkers for the diagnosis of acute radiation enteritis and provide clues for elucidating its pathology.
Affiliation(s)
- He Nie
- Department of Gastroenterology, Wuxi People's Hospital Affiliated to Nanjing Medical University
- Jiadong Pan
- Department of Gastroenterology, Wuxi People's Hospital Affiliated to Nanjing Medical University
- Fangmei An
- Department of Gastroenterology, Wuxi People's Hospital Affiliated to Nanjing Medical University
- Chuwei Zheng
- Department of Gastroenterology, Wuxi People's Hospital Affiliated to Nanjing Medical University
- Qinglin Zhang
- Department of Gastroenterology, Wuxi People's Hospital Affiliated to Nanjing Medical University
- Qiang Zhan
- Department of Gastroenterology, Wuxi People's Hospital Affiliated to Nanjing Medical University
|
49
|
Gondal MN, Chaudhary SU. Navigating Multi-Scale Cancer Systems Biology Towards Model-Driven Clinical Oncology and Its Applications in Personalized Therapeutics. Front Oncol 2021; 11:712505. [PMID: 34900668 PMCID: PMC8652070 DOI: 10.3389/fonc.2021.712505] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/20/2021] [Accepted: 10/26/2021] [Indexed: 12/19/2022] Open
Abstract
Rapid advancements in high-throughput omics technologies and experimental protocols have led to the generation of vast amounts of scale-specific biomolecular data on cancer that now populate several online databases and resources. Cancer systems biology models built using these data have the potential to provide specific insights into the complex multifactorial aberrations underpinning tumor initiation, development, and metastasis. Furthermore, annotating these single- and multi-scale models with patient data can assist in designing personalized therapeutic interventions and aid clinical decision-making. Here, we have systematically reviewed the emergence and evolution of (i) repositories with scale-specific and multi-scale biomolecular cancer data, (ii) systems biology models developed using these data, (iii) associated simulation software for the development of personalized cancer therapeutics, and (iv) translational attempts to pipeline multi-scale panomics data for data-driven in silico clinical oncology. The review concludes that the absence of a generic, zero-code, panomics-based multi-scale modeling pipeline and associated software framework impedes the development and seamless deployment of personalized in silico multi-scale models in clinical settings.
Affiliation(s)
- Mahnoor Naseer Gondal
- Biomedical Informatics Research Laboratory, Department of Biology, Syed Babar Ali School of Science and Engineering, Lahore University of Management Sciences, Lahore, Pakistan
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, United States
- Safee Ullah Chaudhary
- Biomedical Informatics Research Laboratory, Department of Biology, Syed Babar Ali School of Science and Engineering, Lahore University of Management Sciences, Lahore, Pakistan
|
50
|
Deep Learning for Human Disease Detection, Subtype Classification, and Treatment Response Prediction Using Epigenomic Data. Biomedicines 2021; 9:biomedicines9111733. [PMID: 34829962 PMCID: PMC8615388 DOI: 10.3390/biomedicines9111733] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 09/23/2021] [Revised: 10/26/2021] [Accepted: 11/17/2021] [Indexed: 12/25/2022] Open
Abstract
Deep learning (DL) is a distinct class of machine learning that has achieved first-class performance in many fields of study. For epigenomics, the application of DL to assist physicians and scientists in human disease-relevant prediction tasks has been relatively unexplored until very recently. In this article, we critically review published studies that employed DL models for disease detection, subtype classification, and treatment response prediction using epigenomic data. A comprehensive search of PubMed, Scopus, Web of Science, Google Scholar, and arXiv.org was performed following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines. Among 1140 initially identified publications, we included 22 articles in our review. DNA methylation and RNA-sequencing data are most frequently used to train the predictive models. The reviewed models achieved high accuracy, ranging from 88.3% to 100.0% for disease detection tasks, from 69.5% to 97.8% for subtype classification tasks, and from 80.0% to 93.0% for treatment response prediction tasks. We generated a workflow for developing a predictive model that encompasses all steps, from first defining human disease-related tasks to finally evaluating model performance. DL holds promise for transforming epigenomic big data into valuable knowledge that will enhance the development of translational epigenomics.
|