1
Cheng H, Miller D, Southwell N, Fischer JL, Taylor I, Salbaum JM, Kappen C, Hu F, Yang C, Gross SS, D'Aurelio M, Chen Q. Untargeted Pixel-by-Pixel Imaging of Metabolite Ratio Pairs as a Novel Tool for Biomedical Discovery in Mass Spectrometry Imaging. bioRxiv 2024:2024.01.10.575105. [PMID: 38370710 PMCID: PMC10871215 DOI: 10.1101/2024.01.10.575105] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/20/2024]
Abstract
Mass spectrometry imaging (MSI) is a powerful technology used to define the spatial distribution and relative abundance of structurally identified and yet-undefined metabolites across tissue cryosections. While numerous software packages enable pixel-by-pixel imaging of individual metabolites, the research community lacks a discovery tool that images all metabolite abundance ratio pairs. Importantly, recognition of correlated metabolite pairs informs discovery of unanticipated molecules contributing to shared metabolic pathways, uncovers hidden metabolic heterogeneity across cells and tissue subregions, and indicates single-timepoint flux through pathways of interest. Here, we describe the development and implementation of an untargeted R package workflow for pixel-by-pixel ratio imaging of all metabolites detected in an MSI experiment. Considering untargeted MSI studies of murine brain and embryogenesis, we demonstrate that ratio imaging minimizes systematic data variation introduced by sample handling and instrument drift, markedly enhances spatial image resolution, and reveals previously unrecognized metabotype-distinct tissue regions. Furthermore, ratio imaging facilitates identification of novel regional biomarkers and provides anatomical information regarding spatial distribution of metabolite-linked biochemical pathways. The algorithm described herein is applicable to any MSI dataset containing spatial information for metabolites, peptides or proteins, offering a potent tool to enhance knowledge obtained from current spatial metabolite profiling technologies.
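The core idea of pixel-by-pixel ratio imaging can be sketched briefly. The example below is a minimal Python illustration, not the authors' R package: the function name `ratio_images`, the `(n_features, height, width)` array layout, and the `eps` threshold are all assumptions made for this sketch. It also shows why a ratio can suppress a multiplicative per-pixel artifact: the same gain factor appears in numerator and denominator and cancels.

```python
from itertools import combinations

import numpy as np


def ratio_images(cube, eps=1e-9):
    """Compute all pairwise ratio images from an intensity cube.

    cube: array of shape (n_features, height, width), non-negative intensities.
    Returns a dict mapping (i, j) with i < j to the image cube[i] / cube[j].
    Pixels whose denominator is <= eps are set to NaN (hypothetical convention).
    """
    cube = np.asarray(cube, dtype=float)
    out = {}
    for i, j in combinations(range(cube.shape[0]), 2):
        num, den = cube[i], cube[j]
        ratio = np.full(num.shape, np.nan)
        # Divide only where the denominator is usable; other pixels stay NaN.
        np.divide(num, den, out=ratio, where=den > eps)
        out[(i, j)] = ratio
    return out


# Toy data: two features on a 2x2 grid, plus a per-pixel multiplicative gain.
base = np.array([[[2.0, 4.0], [6.0, 0.0]],
                 [[1.0, 2.0], [0.0, 5.0]]])
gain = np.array([[1.0, 3.0], [0.5, 2.0]])

clean = ratio_images(base)[(0, 1)]
distorted = ratio_images(base * gain)[(0, 1)]
# The gain cancels in the ratio, so both images agree (NaN where undefined).
print(np.allclose(clean, distorted, equal_nan=True))
```

For n detected features this produces n(n-1)/2 images, so a real workflow would likely need to filter or rank pairs rather than keep them all in memory.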
2
Alhamdan YR, Ayoub NM, Jaradat SK, Shatnawi A, Yaghan RJ. BRAF Expression and Copy Number Alterations Predict Unfavorable Tumor Features and Adverse Outcomes in Patients With Breast Cancer. Int J Breast Cancer 2024; 2024:6373900. [PMID: 38919805 PMCID: PMC11199069 DOI: 10.1155/2024/6373900] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2023] [Revised: 04/15/2024] [Accepted: 05/07/2024] [Indexed: 06/27/2024]
Abstract
Background: The role of BRAF in breast cancer pathogenesis is still unclear. To address this knowledge gap, this study aimed to evaluate the impact of BRAF gene expression and copy number alterations (CNAs) on clinicopathologic characteristics and survival in patients with breast cancer. Methods: The Molecular Taxonomy of Breast Cancer International Consortium (METABRIC) dataset was obtained from the cBioPortal public domain. Tumoral BRAF mRNA expression and CNAs along with demographic and tumor data for patients with breast cancer were retrieved. The association of BRAF expression and CNAs with breast cancer clinicopathologic characteristics was analyzed. The impact of BRAF mRNA expression on the overall survival of patients was assessed using Kaplan-Meier survival analysis. Results: BRAF gene mRNA log intensity expression was positively correlated with tumor size and the Nottingham Prognostic Index (NPI) (p < 0.001). In contrast, BRAF gene expression was negatively correlated with the age at diagnosis (p = 0.003). The average BRAF mRNA expression was significantly higher in premenopausal patients, patients with high tumor grade, hormone receptor-negative status, and non-luminal tumors compared to postmenopausal patients, patients with low-grade, hormone receptor-positive, and luminal disease. BRAF gain and high-level amplification copy numbers were significantly associated with higher NPI scores and larger tumor sizes compared to neutral copy number status. Survival analysis revealed no discernible differences in overall survival for patients with low and high BRAF mRNA expression. Conclusion: High BRAF mRNA expression as well as the gain and high-level amplification copy numbers were associated with advanced tumor characteristics and unfavorable prognostic factors in breast cancer. BRAF could be an appealing target for the treatment of premenopausal patients with hormone receptor-negative breast cancer.
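The Kaplan-Meier analysis named in the Methods can be illustrated with a product-limit estimator. The sketch below is a minimal Python illustration on made-up toy data, not the METABRIC analysis from the study; the function name `kaplan_meier` and the event coding (1 = death observed, 0 = censored) are assumptions of this example.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times:  follow-up time per patient.
    events: 1 if the event was observed at that time, 0 if censored.
    Returns a list of (time, survival probability) at each distinct event time.
    """
    event_times = sorted({t for t, e in zip(times, events) if e})
    survival = 1.0
    curve = []
    for t in event_times:
        # Patients still under observation just before time t.
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e)
        survival *= 1.0 - deaths / at_risk
        curve.append((t, survival))
    return curve


# Toy cohort of five patients (hypothetical values).
km_curve = kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])
for t, s in km_curve:
    print(t, round(s, 3))
```

Comparing two such curves, for example low versus high expression groups, would additionally need a test such as the log-rank test, which is not shown here.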
Affiliation(s)
- Yazan R. Alhamdan
  - Department of Clinical Pharmacy, Faculty of Pharmacy, Jordan University of Science and Technology, PO Box 3030, Irbid 22110, Jordan
- Nehad M. Ayoub
  - Department of Clinical Pharmacy, Faculty of Pharmacy, Jordan University of Science and Technology, PO Box 3030, Irbid 22110, Jordan
- Sara K. Jaradat
  - Department of Clinical Pharmacy, Faculty of Pharmacy, Jordan University of Science and Technology, PO Box 3030, Irbid 22110, Jordan
- Aymen Shatnawi
  - Department of Drug Discovery and Biomedical Sciences, College of Pharmacy, Medical University of South Carolina, 70 President St., Charleston, South Carolina 29425, USA
- Rami J. Yaghan
  - Department of Surgery, College of Medicine and Medical Sciences, Arabian Gulf University, Road 2904, Building 293, Manama, Bahrain
  - Department of General Surgery and Urology, Faculty of Medicine, Jordan University of Science and Technology, PO Box 3030, Irbid 22110, Jordan
3
Jilani M, Degras D, Haspel N. Elucidating Cancer Subtypes by Using the Relationship between DNA Methylation and Gene Expression. Genes (Basel) 2024; 15:631. [PMID: 38790260 PMCID: PMC11121157 DOI: 10.3390/genes15050631] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/17/2024] [Revised: 05/10/2024] [Accepted: 05/14/2024] [Indexed: 05/26/2024]
Abstract
Advancements in the field of next generation sequencing (NGS) have generated vast amounts of data for the same set of subjects. The challenge that arises is how to combine and reconcile results from different omics studies, such as epigenome and transcriptome, to improve the classification of disease subtypes. In this study, we introduce sCClust (sparse canonical correlation analysis with clustering), a technique to combine high-dimensional omics data using sparse canonical correlation analysis (sCCA), such that the correlation between datasets is maximized. This stage is followed by clustering the integrated data in a lower-dimensional space. We apply sCClust to gene expression and DNA methylation data for three cancer genomics datasets from the Cancer Genome Atlas (TCGA) to distinguish between underlying subtypes. We evaluate the identified subtypes using Kaplan-Meier plots and hazard ratio analysis on the three types of cancer: GBM (glioblastoma multiforme), lung cancer, and colon cancer. Comparison with subtypes identified by both single- and multi-omics studies implies improved clinical association. We also perform pathway over-representation analysis in order to identify up-regulated and down-regulated genes as tentative drug targets. The main goal of the paper is twofold: the integration of epigenomic and transcriptomic datasets followed by elucidating subtypes in the latent space. The significance of this study lies in the enhanced categorization of cancer data, which is crucial to precision medicine.
Affiliation(s)
- Muneeba Jilani
  - Department of Computer Science, University of Massachusetts Boston, Boston, MA 02125, USA
- David Degras
  - Department of Mathematics, University of Massachusetts Boston, Boston, MA 02125, USA
- Nurit Haspel
  - Department of Computer Science, University of Massachusetts Boston, Boston, MA 02125, USA
4
Chereda H, Leha A, Beißbarth T. Stable feature selection utilizing Graph Convolutional Neural Network and Layer-wise Relevance Propagation for biomarker discovery in breast cancer. Artif Intell Med 2024; 151:102840. [PMID: 38658129 DOI: 10.1016/j.artmed.2024.102840] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2023] [Revised: 03/05/2024] [Accepted: 03/10/2024] [Indexed: 04/26/2024]
Abstract
High-throughput technologies are becoming increasingly important in discovering prognostic biomarkers and in identifying novel drug targets. With Mammaprint, Oncotype DX, and many other prognostic molecular signatures, breast cancer is one of the paradigmatic examples of the utility of high-throughput data to deliver prognostic biomarkers that can be represented in the form of a rather short gene list. Such gene lists can be obtained as a set of features (genes) that are important for the decisions of a Machine Learning (ML) method applied to high-dimensional gene expression data. Several studies have identified predictive gene lists for patient prognosis in breast cancer, but these lists are unstable and have only a few genes in common. Instability of feature selection impedes biological interpretability: genes that are relevant for cancer pathology should be members of any predictive gene list obtained for the same clinical type of patients. Stability and interpretability of selected features can be improved by including information on molecular networks in ML methods. Graph Convolutional Neural Network (GCNN) is a contemporary deep learning approach applicable to gene expression data structured by a prior knowledge molecular network. Layer-wise Relevance Propagation (LRP) and SHapley Additive exPlanations (SHAP) are methods to explain individual decisions of deep learning models. We used both GCNN+LRP and GCNN+SHAP techniques to construct feature sets by aggregating individual explanations. We suggest a methodology to systematically and quantitatively analyze the stability, the impact on the classification performance, and the interpretability of the selected feature sets. We used this methodology to compare GCNN+LRP to GCNN+SHAP and to more classical ML-based feature selection approaches.
Utilizing a large breast cancer gene expression dataset we show that, while feature selection with SHAP is useful in applications where selected features have to be impactful for classification performance, among all studied methods GCNN+LRP delivers the most stable (reproducible) and interpretable gene lists.
Affiliation(s)
- Hryhorii Chereda
  - Medical Bioinformatics, University Medical Center Göttingen, Goldschmidtstraße 1, Göttingen, 37077, Germany
- Andreas Leha
  - Medical Bioinformatics, University Medical Center Göttingen, Goldschmidtstraße 1, Göttingen, 37077, Germany; Medical Statistics, University Medical Center Göttingen, Humboldtallee 32, Göttingen, 37073, Germany; Scientific Core Facility Medical Biometry and Statistical Bioinformatics, University Medical Center Göttingen, Humboldtallee 32, Göttingen, 37073, Germany
- Tim Beißbarth
  - Medical Bioinformatics, University Medical Center Göttingen, Goldschmidtstraße 1, Göttingen, 37077, Germany; Campus-Institute Data Science (CIDAS), University of Göttingen, Goldschmidtstraße 1, Göttingen, 37077, Germany
5
Chai H, Lin S, Lin J, He M, Yang Y, OuYang Y, Zhao H. An uncertainty-based interpretable deep learning framework for predicting breast cancer outcome. BMC Bioinformatics 2024; 25:88. [PMID: 38418940 PMCID: PMC10902951 DOI: 10.1186/s12859-024-05716-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/27/2023] [Accepted: 02/21/2024] [Indexed: 03/02/2024]
Abstract
BACKGROUND: Predicting outcome of breast cancer is important for selecting appropriate treatments and prolonging the survival periods of patients. Recently, different deep learning-based methods have been carefully designed for cancer outcome prediction. However, the application of these methods is still challenged by interpretability. In this study, we proposed a novel multitask deep neural network called UISNet to predict the outcome of breast cancer. The UISNet is able to interpret the importance of features for the prediction model via an uncertainty-based integrated gradients algorithm. UISNet improved the prediction by introducing prior biological pathway knowledge and utilizing patient heterogeneity information. RESULTS: The model was tested in seven public datasets of breast cancer, and showed better performance (average C-index = 0.691) than the state-of-the-art methods (average C-index = 0.650, ranged from 0.619 to 0.677). Importantly, the UISNet identified 20 genes as associated with breast cancer, among which 11 have been proven to be associated with breast cancer by previous studies, and others are novel findings of this study. CONCLUSIONS: Our proposed method is accurate and robust in predicting breast cancer outcomes, and it is an effective way to identify breast cancer-associated genes. The method codes are available at: https://github.com/chh171/UISNet.
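The C-index reported above measures how well predicted risk scores rank patients by survival time. As a reading aid, here is a minimal Python sketch of Harrell's concordance index on made-up toy data. It is not taken from the UISNet code; the function name `concordance_index`, the event coding (1 = event observed, 0 = censored), and the convention that a higher score means higher risk are assumptions of this example.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored survival data.

    A pair (i, j) is comparable when patient i had an observed event and
    a strictly shorter time than patient j. The pair is concordant when
    the patient with the shorter time also has the higher risk score;
    tied scores count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # censored patients cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy cohort (hypothetical values): the third patient is censored.
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.7, 0.8, 0.1]
c_index = concordance_index(toy_times, toy_events, toy_risk)
print(c_index)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives context for the averages of 0.691 and 0.650 quoted in the abstract.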
Affiliation(s)
- Hua Chai
  - School of Mathematics and Big Data, Foshan University, Foshan, 528000, China
- Siyin Lin
  - School of Data and Computer Science, Sun Yat-Sen University, Guangzhou, 510000, China
- Junqi Lin
  - School of Mathematics and Big Data, Foshan University, Foshan, 528000, China
- Minfan He
  - School of Mathematics and Big Data, Foshan University, Foshan, 528000, China
- Yuedong Yang
  - School of Data and Computer Science, Sun Yat-Sen University, Guangzhou, 510000, China
- Yongzhong OuYang
  - School of Mathematics and Big Data, Foshan University, Foshan, 528000, China
- Huiying Zhao
  - Department of Medical Research Center, Sun Yat-Sen Memorial Hospital, Sun Yat-Sen University, Guangzhou, 510000, China
6
Abbasi EY, Deng Z, Ali Q, Khan A, Shaikh A, Reshan MSA, Sulaiman A, Alshahrani H. A machine learning and deep learning-based integrated multi-omics technique for leukemia prediction. Heliyon 2024; 10:e25369. [PMID: 38352790 PMCID: PMC10862685 DOI: 10.1016/j.heliyon.2024.e25369] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/25/2023] [Revised: 12/13/2023] [Accepted: 01/25/2024] [Indexed: 02/16/2024]
Abstract
In recent years, scientific data on cancer has expanded, providing potential for a better understanding of malignancies and improved tailored care. Advances in Artificial Intelligence (AI) processing power and algorithmic development position Machine Learning (ML) and Deep Learning (DL) as crucial players in predicting Leukemia, a blood cancer, using integrated multi-omics technology. However, realizing these goals demands novel approaches to harness this data deluge. This study introduces a novel Leukemia diagnosis approach, analyzing multi-omics data for accuracy using ML and DL algorithms. ML techniques, including Random Forest (RF), Naive Bayes (NB), Decision Tree (DT), Logistic Regression (LR), Gradient Boosting (GB), and DL methods such as Recurrent Neural Networks (RNN) and Feedforward Neural Networks (FNN) are compared. GB achieved 97% accuracy in ML, while RNN outperformed it with 98% accuracy in DL. This approach filters unclassified data effectively, demonstrating the significance of DL for leukemia prediction. The testing validation was based on 17 different features such as patient age, sex, mutation type, treatment methods, chromosomes, and others. Our study compares ML and DL techniques and chooses the best technique that gives optimum results. The study emphasizes the implications of high-throughput technology in healthcare, offering improved patient care.
Affiliation(s)
- Erum Yousef Abbasi
  - State Key Laboratory of Wireless Network Positioning and Communication Engineering Integration Research, School of Electronics Engineering, Beijing University of Posts and Telecommunications, Beijing, China
- Zhongliang Deng
  - State Key Laboratory of Wireless Network Positioning and Communication Engineering Integration Research, School of Electronics Engineering, Beijing University of Posts and Telecommunications, Beijing, China
- Qasim Ali
  - Department of Software Engineering, Mehran University of Engineering and Technology, Jamshoro, Pakistan
- Adil Khan
  - State Key Laboratory of Wireless Network Positioning and Communication Engineering Integration Research, School of Electronics Engineering, Beijing University of Posts and Telecommunications, Beijing, China
- Asadullah Shaikh
  - Department of Information Systems, College of Computer Science and Information Systems, Najran University, Najran, 61441, Saudi Arabia
- Mana Saleh Al Reshan
  - Department of Information Systems, College of Computer Science and Information Systems, Najran University, Najran, 61441, Saudi Arabia
  - Scientific and Engineering Research Centre, Najran University, Najran, 61441, Saudi Arabia
- Adel Sulaiman
  - Department of Computer Science, College of Computer Science and Information Systems, Najran University, Najran, 61441, Saudi Arabia
- Hani Alshahrani
  - Department of Computer Science, College of Computer Science and Information Systems, Najran University, Najran, 61441, Saudi Arabia
7
Qattous H, Azzeh M, Ibrahim R, Abed Al-Ghafer I, Al Sorkhy M, Alkhateeb A. PaCMAP-embedded convolutional neural network for multi-omics data integration. Heliyon 2024; 10:e23195. [PMID: 38163104 PMCID: PMC10756978 DOI: 10.1016/j.heliyon.2023.e23195] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/05/2023] [Revised: 11/22/2023] [Accepted: 11/29/2023] [Indexed: 01/03/2024]
Abstract
Aims: The multi-omics data integration has emerged as a prominent avenue within the healthcare industry, presenting substantial potential for enhancing predictive models. The main motivation behind this study stems from the imperative need to advance prognostic methodologies in cancer diagnosis, an area where precision is pivotal for effective clinical decision-making. In this context, the present study introduces an innovative methodology that integrates copy number alteration (CNA), DNA methylation, and gene expression data. Methods: The three omics data were successfully merged into a two-dimensional (2D) map using the PaCMAP dimensionality reduction technique. Utilizing the RGB coloring scheme, a visual representation of the integration was produced utilizing the values of the three omics of each sample. Then, the colored 2D maps were fed into a convolutional neural network (CNN) to forecast the Gleason score. Results: Our proposed model outperforms the cutting-edge i-SOM-GSN model by integrating multi-omics data and the CNN architecture with an accuracy of 98.89 and an AUC of 0.9996. Conclusion: This study demonstrates the effectiveness of multi-omics data integration in predicting health outcomes. The proposed methodology, combining PaCMAP for dimensionality reduction, RGB coloring for visualization, and CNN for prediction, offers a comprehensive framework for integrating heterogeneous omics data and improving predictive accuracy. These findings contribute to the advancement of personalized medicine and have the potential to aid in clinical decision-making for prostate cancer patients.
Affiliation(s)
- Hazem Qattous
  - Software Engineering Department, Princess Sumaya University for Technology, Amman P.O. Box 1438, Jordan
- Mohammad Azzeh
  - Data Science Department, Princess Sumaya University for Technology, Amman P.O. Box 1438, Jordan
- Rahmeh Ibrahim
  - Computer Science Department, Princess Sumaya University for Technology, Amman P.O. Box 1438, Jordan
- Ibrahim Abed Al-Ghafer
  - Data Science Department, Princess Sumaya University for Technology, Amman P.O. Box 1438, Jordan
- Mohammad Al Sorkhy
  - Heritage College of Osteopathic Medicine, Ohio University, Cleveland, OH 44122, USA
- Abedalrhman Alkhateeb
  - Computer Science Department, Lakehead University, 955 Oliver Rd, Thunder Bay, ON P7B 5E1, Ontario, Canada
8
Huang Z, Cai Z, Zhang J, Gu Y, Wang J, Yang J, Lv G, Yang C, Zhang Y, Ji C, Jiang S. Integrating proteomics and metabolomics to elucidate the molecular network regulating of inosine monophosphate-specific deposition in Jingyuan chicken. Poult Sci 2023; 102:103118. [PMID: 37862870 PMCID: PMC10590753 DOI: 10.1016/j.psj.2023.103118] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 05/31/2023] [Revised: 09/10/2023] [Accepted: 09/12/2023] [Indexed: 10/22/2023]
Abstract
Inosine monophosphate (IMP) plays a significant role in meat taste, yet the molecular mechanisms controlling IMP deposition in muscle tissues still require elucidation. The present study systematically and comprehensively explores the molecular network governing IMP deposition in different regions of Jingyuan chicken muscle. Two muscle groups, the breast and leg, were examined as test materials. Using nontargeted metabolomic sequencing, we screened and identified 20 metabolites that regulate IMP-specific deposition. Out of these, 5 were identified as significant contributors to the regulation of IMP deposition. The results indicate that PGM1, a key enzyme involved in synthesis, is higher in the breast muscle compared to the leg muscle, which may provide an explanation for the increased deposition of IMP in the breast muscle. The activity of key enzymes (PKM2, AK1, AMPD1) involved in this process was higher in the breast muscle than in the leg muscle. In the case of IMP degradation metabolism, the activity of its participating enzyme (PurH) was lower in the breast muscle than in the leg muscle. These findings suggest that the increased deposition of IMP in Jingyuan chickens' breast muscle may result from elevated metabolism and reduced catabolism of key metabolites. In summary, a metaomic strategy was utilized to assess the molecular network regulation mechanism of IMP-specific deposition in various segments of Jingyuan chicken.
These findings provide insight into genetic improvement and molecular breeding of meat quality traits for top-notch broilers.
Affiliation(s)
- Zengwen Huang
  - Agriculture College, Ningxia University, Ningxia, Yinchuan 750021, China; College of Animal Science, Xichang University, Sichuan, Xichang 615012, China; Xinjiang Taikun Group Co., Ltd., Xinjiang, Changji 831100, China
- Zhengyun Cai
  - Agriculture College, Ningxia University, Ningxia, Yinchuan 750021, China
- Juan Zhang
  - Agriculture College, Ningxia University, Ningxia, Yinchuan 750021, China
- Yaling Gu
  - Agriculture College, Ningxia University, Ningxia, Yinchuan 750021, China
- Jing Wang
  - College of Animal Science, Xichang University, Sichuan, Xichang 615012, China
- Jinzeng Yang
  - Department of Human Nutrition, Food & Animal Sciences, College of Tropical Agriculture and Human Resources, University of Hawaii at Manoa, Manoa, HI 96822
- Gang Lv
  - Xinjiang Taikun Group Co., Ltd., Xinjiang, Changji 831100, China
- Chaoyun Yang
  - College of Animal Science, Xichang University, Sichuan, Xichang 615012, China
- Yi Zhang
  - College of Animal Science, Xichang University, Sichuan, Xichang 615012, China
- Chen Ji
  - College of Animal Science, Xichang University, Sichuan, Xichang 615012, China
- Shengwang Jiang
  - College of Animal Science, Xichang University, Sichuan, Xichang 615012, China
9
Ivanisevic T, Sewduth RN. Multi-Omics Integration for the Design of Novel Therapies and the Identification of Novel Biomarkers. Proteomes 2023; 11:34. [PMID: 37873876 PMCID: PMC10594525 DOI: 10.3390/proteomes11040034] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/21/2023] [Revised: 10/13/2023] [Accepted: 10/19/2023] [Indexed: 10/25/2023]
Abstract
Multi-omics is a cutting-edge approach that combines data from different biomolecular levels, such as DNA, RNA, proteins, metabolites, and epigenetic marks, to obtain a holistic view of how living systems work and interact. Multi-omics has been used for various purposes in biomedical research, such as identifying new diseases, discovering new drugs, personalizing treatments, and optimizing therapies. This review summarizes the latest progress and challenges of multi-omics for designing new treatments for human diseases, focusing on how to integrate and analyze multiple proteome data and examples of how to use multi-proteomics data to identify new drug targets. We also discussed the future directions and opportunities of multi-omics for developing innovative and effective therapies by deciphering proteome complexity.
Affiliation(s)
- Raj N. Sewduth
  - VIB-KU Leuven Center for Cancer Biology (VIB), 3000 Leuven, Belgium
10
Martins S, Coletti R, Lopes MB. Disclosing transcriptomics network-based signatures of glioma heterogeneity using sparse methods. BioData Min 2023; 16:26. [PMID: 37752578 PMCID: PMC10523751 DOI: 10.1186/s13040-023-00341-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/21/2023] [Accepted: 08/13/2023] [Indexed: 09/28/2023]
Abstract
Gliomas are primary malignant brain tumors with poor survival and high resistance to available treatments. Improving the molecular understanding of glioma and disclosing novel biomarkers of tumor development and progression could help to find novel targeted therapies for this type of cancer. Public databases such as The Cancer Genome Atlas (TCGA) provide an invaluable source of molecular information on cancer tissues. Machine learning tools show promise in dealing with the high dimension of omics data and extracting relevant information from it. In this work, network inference and clustering methods, namely Joint Graphical lasso and Robust Sparse K-means Clustering, were applied to RNA-sequencing data from TCGA glioma patients to identify shared and distinct gene networks among different types of glioma (glioblastoma, astrocytoma, and oligodendroglioma) and disclose new patient groups and the relevant genes behind groups' separation. The results obtained suggest that astrocytoma and oligodendroglioma have more similarities compared with glioblastoma, highlighting the molecular differences between glioblastoma and the other glioma subtypes. After a comprehensive literature search on the relevant genes pointed out by our analysis, we identified potential candidates for biomarkers of glioma. Further molecular validation of these genes is encouraged to understand their potential role in diagnosis and in the design of novel therapies.
Affiliation(s)
- Sofia Martins
  - NOVA School of Science and Technology, NOVA University of Lisbon, Caparica, 2829-516, Portugal
- Roberta Coletti
  - Center for Mathematics and Applications (NOVA Math), NOVA School of Science and Technology, Caparica, 2829-516, Portugal
- Marta B Lopes
  - NOVA School of Science and Technology, NOVA University of Lisbon, Caparica, 2829-516, Portugal
  - Center for Mathematics and Applications (NOVA Math), NOVA School of Science and Technology, Caparica, 2829-516, Portugal
  - NOVA Laboratory for Computer Science and Informatics (NOVA LINCS), NOVA School of Science and Technology, Caparica, 2829-516, Portugal
  - UNIDEMI, Department of Mechanical and Industrial Engineering, NOVA School of Science and Technology, Caparica, 2829-516, Portugal
11
Nguyen T, Bian X, Roberson D, Khanna R, Chen Q, Yan C, Beck R, Worman Z, Meerzaman D. Multi-omics Pathways Workflow (MOPAW): An Automated Multi-omics Workflow on the Cancer Genomics Cloud. Cancer Inform 2023; 22:11769351231180992. [PMID: 37342652 PMCID: PMC10278438 DOI: 10.1177/11769351231180992] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/21/2023] [Accepted: 05/22/2023] [Indexed: 06/23/2023]
Abstract
Introduction: In the era of big data, gene-set pathway analyses derived from multi-omics are exceptionally powerful. When preparing and analyzing high-dimensional multi-omics data, the installation process and programming skills required to use existing tools can be challenging. This is especially the case for those who are not familiar with coding. In addition, implementation with high performance computing solutions is required to run these tools efficiently. Methods: We introduce an automatic multi-omics pathway workflow, a point and click graphical user interface to Multivariate Single Sample Gene Set Analysis (MOGSA), hosted on the Cancer Genomics Cloud by Seven Bridges Genomics. This workflow leverages the combination of different tools to perform data preparation for each given data type, dimensionality reduction, and MOGSA pathway analysis. The omics data include copy number alteration, transcriptomics data, proteomics and phosphoproteomics data. We have also provided an additional workflow to help with downloading data from The Cancer Genome Atlas and Clinical Proteomic Tumor Analysis Consortium and preprocessing these data to be used for this multi-omics pathway workflow. Results: The main outputs of this workflow are the distinct pathways for subgroups of interest provided by users, which are displayed in heatmaps if identified. In addition to this, graphs and tables are provided to users for reviewing. Conclusion: Multi-omics Pathway Workflow requires no coding experience. Users can bring their own data or download and preprocess public datasets from The Cancer Genome Atlas and Clinical Proteomic Tumor Analysis Consortium using our additional workflow based on the samples of interest. Distinct overactivated or deactivated pathways for groups of interest can be found. This useful information is important in effective therapeutic targeting.
Collapse
Affiliation(s)
- Trinh Nguyen
- The Computational Genomics and Bioinformatics Branch, Center for Biomedical Informatics and Information Technology, National Cancer Institute, Rockville, MD, USA
- Xiaopeng Bian
- The Computational Genomics and Bioinformatics Branch, Center for Biomedical Informatics and Information Technology, National Cancer Institute, Rockville, MD, USA
- Rakesh Khanna
- The Computational Genomics and Bioinformatics Branch, Center for Biomedical Informatics and Information Technology, National Cancer Institute, Rockville, MD, USA
- Qingrong Chen
- The Computational Genomics and Bioinformatics Branch, Center for Biomedical Informatics and Information Technology, National Cancer Institute, Rockville, MD, USA
- Chunhua Yan
- The Computational Genomics and Bioinformatics Branch, Center for Biomedical Informatics and Information Technology, National Cancer Institute, Rockville, MD, USA
- Daoud Meerzaman
- The Computational Genomics and Bioinformatics Branch, Center for Biomedical Informatics and Information Technology, National Cancer Institute, Rockville, MD, USA
|
12
|
Yoosefzadeh Najafabadi M, Hesami M, Eskandari M. Machine Learning-Assisted Approaches in Modernized Plant Breeding Programs. Genes (Basel) 2023; 14:777. [PMID: 37107535 PMCID: PMC10137951 DOI: 10.3390/genes14040777] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 03/02/2023] [Revised: 03/11/2023] [Accepted: 03/21/2023] [Indexed: 04/29/2023] Open
Abstract
In the face of a growing global population, plant breeding is being used as a sustainable tool for increasing food security. A wide range of high-throughput omics technologies have been developed and used in plant breeding to accelerate crop improvement and develop new varieties with higher yield performance and greater resilience to climate change, pests, and diseases. With the use of these new advanced technologies, large amounts of data have been generated on the genetic architecture of plants, which can be exploited for manipulating the key characteristics of plants that are important for crop improvement. Therefore, plant breeders have relied on high-performance computing, bioinformatics tools, and artificial intelligence (AI), such as machine-learning (ML) methods, to efficiently analyze this vast amount of complex data. The use of big data coupled with ML in plant breeding has the potential to revolutionize the field and increase food security. In this review, some of the challenges of this method along with some of the opportunities it can create will be discussed. In particular, we provide information about the basis of big data, AI, ML, and their related sub-groups. In addition, the bases and functions of some learning algorithms that are commonly used in plant breeding, three common data integration strategies for the better integration of different breeding datasets using appropriate learning algorithms, and future prospects for the application of novel algorithms in plant breeding will be discussed. The use of ML algorithms in plant breeding will equip breeders with efficient and effective tools to accelerate the development of new plant varieties and improve the efficiency of the breeding process, which are important for tackling some of the challenges facing agriculture in the era of climate change.
Affiliation(s)
- Mohsen Hesami
- Department of Plant Agriculture, University of Guelph, Guelph, ON N1G 2W1, Canada
- Milad Eskandari
- Department of Plant Agriculture, University of Guelph, Guelph, ON N1G 2W1, Canada
|
13
|
Kong X, Zhou M, Bian K, Lai W, Hu F, Dai R, Yan J. Research on SPDTRS-PNN based intelligent assistant diagnosis for breast cancer. Sci Rep 2023; 13:4386. [PMID: 36928059 PMCID: PMC10020448 DOI: 10.1038/s41598-023-28316-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/10/2022] [Accepted: 01/17/2023] [Indexed: 03/18/2023] Open
Abstract
Breast cancer is the second most dangerous cancer in the world. Breast cancer data often contain redundant information, which makes auxiliary diagnosis of breast cancer less accurate and more time consuming. A dimension reduction algorithm combined with machine learning can address these problems. This paper proposes the single parameter decision theoretic rough set (SPDTRS) combined with the probability neural network (PNN) model for breast cancer diagnosis. We find that when the parameter value of SPDTRS is 2.5 and the SPREAD value is 0.75, the 30 attributes of the original breast cancer data are reduced to 12, the accuracy of the SPDTRS-PNN model on the training set is 99.25%, the accuracy on the test set is 97.04%, and the test time is 0.093 s. The experimental results show that the SPDTRS-PNN model can improve the accuracy of breast cancer recognition and reduce the time required for diagnosis.
Affiliation(s)
- Xixi Kong
- School of Electrical and Information Engineering, Anhui University of Science and Technology, Huainan, 232001, China
- Mengran Zhou
- School of Electrical and Information Engineering, Anhui University of Science and Technology, Huainan, 232001, China
- Kai Bian
- School of Electrical and Information Engineering, Anhui University of Science and Technology, Huainan, 232001, China
- Wenhao Lai
- School of Electrical and Information Engineering, Anhui University of Science and Technology, Huainan, 232001, China
- Feng Hu
- School of Electrical and Information Engineering, Anhui University of Science and Technology, Huainan, 232001, China
- Rongying Dai
- School of Electrical and Information Engineering, Anhui University of Science and Technology, Huainan, 232001, China
- Jingjing Yan
- School of Electrical and Information Engineering, Anhui University of Science and Technology, Huainan, 232001, China
|
14
|
Unlu Yazici M, Marron JS, Bakir-Gungor B, Zou F, Yousef M. Invention of 3Mint for feature grouping and scoring in multi-omics. Front Genet 2023; 14:1093326. [PMID: 37007972 PMCID: PMC10050723 DOI: 10.3389/fgene.2023.1093326] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2022] [Accepted: 02/27/2023] [Indexed: 03/17/2023] Open
Abstract
Advanced genomic and molecular profiling technologies have accelerated the elucidation of the regulatory mechanisms behind cancer development and progression, and of targeted therapies in patients. Along this line, intense studies with immense amounts of biological information have boosted the discovery of molecular biomarkers. Cancer is one of the leading causes of death around the world in recent years. Elucidation of genomic and epigenetic factors in Breast Cancer (BRCA) can provide a roadmap to uncover the disease mechanisms. Accordingly, unraveling the possible systematic connections between omics data types and their contribution to BRCA tumor progression is crucial. In this study, we have developed a novel machine learning (ML) based integrative approach for multi-omics data analysis. This integrative approach combines information from gene expression (mRNA), microRNA (miRNA) and methylation data. Due to the complexity of cancer, this integrated data is expected to improve the prediction, diagnosis and treatment of disease through patterns only available from the 3-way interactions between these 3-omics datasets. In addition, the proposed method bridges the interpretation gap between the disease mechanisms that drive onset and progression. Our fundamental contribution is the 3 Multi-omics integrative tool (3Mint). This tool aims to perform grouping and scoring of groups using biological knowledge. Another major goal is improved gene selection via detection of novel groups of cross-omics biomarkers. Performance of 3Mint is assessed using different metrics. Our computational performance evaluations showed that 3Mint classifies the BRCA molecular subtypes with a lower number of genes than the miRcorrNet tool, which uses miRNA and mRNA gene expression profiles, while achieving similar performance metrics (95% accuracy). The incorporation of methylation data in 3Mint yields a much more focused analysis.
The 3Mint tool and all other supplementary files are available at https://github.com/malikyousef/3Mint/.
Affiliation(s)
- Miray Unlu Yazici
- Department of Bioengineering, Abdullah Gül University, Kayseri, Türkiye
- J. S. Marron
- Department of Statistics and Operations Research, University of North Carolina, Chapel Hill, NC, United States
- Burcu Bakir-Gungor
- Department of Bioengineering, Abdullah Gül University, Kayseri, Türkiye
- Department of Computer Engineering, Abdullah Gül University, Kayseri, Türkiye
- Fei Zou
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, United States
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, United States
- Malik Yousef
- Department of Information Systems, Zefat Academic College, Zefat, Israel
- Galilee Digital Health Research Center, Zefat Academic College, Zefat, Israel
- *Correspondence: Malik Yousef,
|
15
|
Liao J, Li X, Gan Y, Han S, Rong P, Wang W, Li W, Zhou L. Artificial intelligence assists precision medicine in cancer treatment. Front Oncol 2023; 12:998222. [PMID: 36686757 PMCID: PMC9846804 DOI: 10.3389/fonc.2022.998222] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 07/19/2022] [Accepted: 11/22/2022] [Indexed: 01/06/2023] Open
Abstract
Cancer is a major medical problem worldwide. Due to its high heterogeneity, the same drugs or surgical methods may have different curative effects in patients with the same tumor, creating a need for more accurate treatment methods for tumors and personalized treatments for patients. Precise treatment of tumors is essential, which makes it urgent to obtain an in-depth understanding of the changes that tumors undergo, including changes in their genes, proteins and cancer cell phenotypes, in order to develop targeted treatment strategies for patients. Artificial intelligence (AI) based on big data can extract the hidden patterns, important information, and corresponding knowledge behind the enormous amount of data. For example, ML and deep learning, which are subsets of AI, can be used to mine the deep-level information in genomics, transcriptomics, proteomics, radiomics, digital pathological images, and other data, helping clinicians understand tumors comprehensively. In addition, AI can find new biomarkers from data to assist tumor screening, detection, diagnosis, treatment and prognosis prediction, so as to provide the best treatment for individual patients and improve their clinical outcomes.
Affiliation(s)
- Jinzhuang Liao
- Department of Radiology, The Third Xiangya Hospital of Central South University, Changsha, Hunan, China
- Xiaoying Li
- Department of Radiology, The Third Xiangya Hospital of Central South University, Changsha, Hunan, China
- Yu Gan
- Department of Radiology, The Third Xiangya Hospital of Central South University, Changsha, Hunan, China
- Shuangze Han
- Department of Radiology, The Third Xiangya Hospital of Central South University, Changsha, Hunan, China
- Pengfei Rong
- Department of Radiology, The Third Xiangya Hospital of Central South University, Changsha, Hunan, China
- Cell Transplantation and Gene Therapy Institute, The Third Xiangya Hospital, Central South University, Changsha, Hunan, China
- *Correspondence: Pengfei Rong; Wei Wang; Wei Li; Li Zhou
- Wei Wang
- Department of Radiology, The Third Xiangya Hospital of Central South University, Changsha, Hunan, China
- Cell Transplantation and Gene Therapy Institute, The Third Xiangya Hospital, Central South University, Changsha, Hunan, China
- Wei Li
- Department of Radiology, The Third Xiangya Hospital of Central South University, Changsha, Hunan, China
- Cell Transplantation and Gene Therapy Institute, The Third Xiangya Hospital, Central South University, Changsha, Hunan, China
- Li Zhou
- Department of Radiology, The Third Xiangya Hospital of Central South University, Changsha, Hunan, China
- Cell Transplantation and Gene Therapy Institute, The Third Xiangya Hospital, Central South University, Changsha, Hunan, China
- Department of Pathology, The Xiangya Hospital of Central South University, Changsha, Hunan, China
|