1
Avci CB, Bagca BG, Shademan B, Takanlou LS, Takanlou MS, Nourazarian A. Precision oncology: Using cancer genomics for targeted therapy advancements. Biochim Biophys Acta Rev Cancer 2025; 1880:189250. [PMID: 39701327 DOI: 10.1016/j.bbcan.2024.189250] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/17/2024] [Revised: 12/12/2024] [Accepted: 12/13/2024] [Indexed: 12/21/2024]
Abstract
Cancer genomics plays a crucial role in oncology by enhancing our understanding of how genes drive cancer and facilitating the development of improved treatments. This field meticulously examines various cancers' genetic makeup through various methodologies, leading to groundbreaking discoveries. Innovative tools such as rapid gene sequencing, single-cell studies, spatial gene mapping, epigenetic analysis, liquid biopsies, and computational modeling have significantly progressed the field. These techniques uncover genetic alterations, tumor heterogeneity, and the evolutionary dynamics of cancers. Genetic abnormalities and molecular markers that initiate and propagate distinct cancer types are classified according to tumor type. The integration of precision medicine with cancer genomics emphasizes the significance of utilizing genetic data in treatment decision-making, enabling personalized care and enhancing patient outcomes. Critical topics in cancer genomics encompass tumor diversity, alterations in non-coding DNA, epigenetic modifications, cancer-specific proteins, metabolic changes, and the impact of inherited genes on cancer risk.
Affiliation(s)
- Cigir Biray Avci
  - Department of Medical Biology, Faculty of Medicine, Ege University, Izmir, Turkey
- Bakiye Goker Bagca
  - Department of Medical Biology, Faculty of Medicine, Adnan Menderes University, Aydın, Turkey
- Behrouz Shademan
  - Stem Cell Research Center, Tabriz University of Medical Sciences, Tabriz, Iran
- Alireza Nourazarian
  - Department of Basic Medical Sciences, Khoy University of Medical Sciences, Khoy, Iran.
2
Cai Y, Zhou N, Zhao J, Li W, Wang S. CSSEC: An adaptive approach integrating consensus and specific self-expressive coefficients for multi-omics cancer subtyping. Methods 2025; 235:26-33. [PMID: 39880224 DOI: 10.1016/j.ymeth.2025.01.016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/25/2024] [Revised: 01/05/2025] [Accepted: 01/16/2025] [Indexed: 01/31/2025] Open
Abstract
Cancer is a complex and heterogeneous disease, and accurate cancer subtyping can significantly improve patient survival rates. The complexity of cancer spans multiple omics levels, and analyzing multi-omics data for cancer subtyping has become a major focus of research. However, extracting complementary information from different omics data sources and adaptively integrating them remains a major challenge. To address this, we proposed an adaptive approach integrating consensus and specific self-expressive coefficients for multi-omics cancer subtyping (CSSEC). First, independent self-expressive networks are applied to each omics type to calculate coefficient matrices that measure patient similarity. Then, two feature graph convolutional network modules capture consensus and specific similarity features using the top-K relevant features. Finally, the multi-omics self-expression coefficient matrix is constructed from the consensus and specific similarity features. Furthermore, joint consistency and disparity constraints are applied to regularize the fusion of the self-expressive coefficients. Experimental results demonstrate that CSSEC outperforms existing state-of-the-art methods in survival analysis. Moreover, case studies on kidney cancer confirm that the cancer subtypes identified by CSSEC are biologically significant. The complete code is available at https://github.com/ykxhs/CSSEC.
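The self-expressive idea in this abstract (each patient profile is approximated as a combination of the other patients' profiles, and the coefficients serve as a similarity measure) can be illustrated with a small sketch. This is not the CSSEC network: it swaps in a plain ridge-regularized least-squares self-expression with a closed-form solution, for a single omics matrix. The function names, the `lam` parameter, and the toy data are assumptions made for illustration only.

```python
import numpy as np

def self_expressive_coefficients(X, lam=1.0):
    """Ridge self-expression for one omics matrix X (patients x features).

    Minimizes ||X - C X||_F^2 + lam * ||C||_F^2, whose closed form is
    C = (G + lam * I)^{-1} G with G = X X^T. Hypothetical helper; the
    zero-diagonal constraint used by many subspace methods is omitted.
    """
    n = X.shape[0]
    G = X @ X.T
    return np.linalg.solve(G + lam * np.eye(n), G)

def affinity_from_coefficients(C):
    """Symmetric, non-negative patient affinity built from coefficients."""
    return 0.5 * (np.abs(C) + np.abs(C.T))

# Toy data: two patient groups living in orthogonal feature subspaces.
rng = np.random.default_rng(0)
group_a = np.hstack([rng.normal(size=(5, 3)), np.zeros((5, 3))])
group_b = np.hstack([np.zeros((5, 3)), rng.normal(size=(5, 3))])
X = np.vstack([group_a, group_b])

C = self_expressive_coefficients(X, lam=0.1)
A = affinity_from_coefficients(C)
```

In this toy setting the affinity between the two groups is numerically zero, because patients in one subspace cannot help reconstruct patients in the other; real omics data would give a much noisier matrix.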
Affiliation(s)
- Yueyi Cai
  - Department of Computer Science and Engineering, School of Information Science and Engineering, Yunnan University, Kunming, 650504, Yunnan, China.
- Nan Zhou
  - Department of Computer Science and Engineering, School of Information Science and Engineering, Yunnan University, Kunming, 650504, Yunnan, China.
- Junran Zhao
  - Department of Computer Science and Engineering, School of Information Science and Engineering, Yunnan University, Kunming, 650504, Yunnan, China.
- Weihua Li
  - Department of Computer Science and Engineering, School of Information Science and Engineering, Yunnan University, Kunming, 650504, Yunnan, China.
- Shunfang Wang
  - Department of Computer Science and Engineering, School of Information Science and Engineering, Yunnan University, Kunming, 650504, Yunnan, China.
3
Zhang S, Lv J, Zhang J, Fan Z, Gu B, Fan B, Li C, Wang C, Zhang T. Benchmarking multi-omics integrative clustering methods for subtype identification in colorectal cancer. Comput Methods Programs Biomed 2025; 261:108603. [PMID: 39826483 DOI: 10.1016/j.cmpb.2025.108603] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/05/2024] [Revised: 11/27/2024] [Accepted: 01/12/2025] [Indexed: 01/22/2025]
Abstract
BACKGROUND AND OBJECTIVE: Colorectal cancer (CRC) is a heterogeneous malignancy that imposes a considerable global burden of incidence and mortality. The traditional tumor-node-metastasis staging system has exhibited certain limitations. With the advancement of omics technologies, researchers are directing their focus on developing a more precise multi-omics molecular classification. The use of unsupervised multi-omics integrative clustering methods in CRC therefore calls for a comprehensive benchmark with practical guidelines. METHODS: In this study, we obtained CRC multi-omics data, encompassing DNA methylation, gene expression, and protein expression, from The Cancer Genome Atlas (TCGA) database. We then generated interrelated CRC multi-omics data with various structures based on realistic multi-omics correlations, and performed a comprehensive evaluation of eight representative methods, categorized as early integration, intermediate integration, and late integration, using complementary benchmarks for subtype classification accuracy. Lastly, we employed these methods to integrate real-world CRC multi-omics data; survival and differential analyses were used to highlight differences among the newly identified multi-omics subtypes. RESULTS: Through in-depth comparisons, we observed that similarity network fusion (SNF) exhibited exceptional performance in integrating multi-omics data derived from simulations. Additionally, SNF effectively distinguished CRC patients into five subgroups with the highest classification accuracy. Moreover, we found significant survival differences and molecular distinctions among SNF subtypes. CONCLUSIONS: The findings consistently demonstrate that SNF outperforms other methods in CRC multi-omics integrative clustering. The significant survival differences and molecular distinctions among SNF subtypes provide novel insights into the multi-omics perspective on CRC heterogeneity, with potential implications for clinical treatment.
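Similarity network fusion, the method this benchmark ranks highest, builds one patient-similarity network per omics layer and iteratively diffuses each network through the others until they agree. The sketch below is a simplified illustration under stated assumptions, not the benchmarked implementation: it uses a Gaussian kernel with a fixed `sigma` instead of the scaled exponential kernel, plain row normalization, and hypothetical function names and toy data.

```python
import numpy as np

def gaussian_affinity(X, sigma=2.0):
    """Patient-by-patient Gaussian affinity for one omics view (assumed kernel)."""
    sq_dist = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dist / (2.0 * sigma ** 2))

def knn_kernel(W, k):
    """Keep each row's k largest affinities (self included) and row-normalize."""
    S = np.zeros_like(W)
    idx = np.argsort(-W, axis=1)[:, :k]
    rows = np.arange(W.shape[0])[:, None]
    S[rows, idx] = W[rows, idx]
    return S / S.sum(axis=1, keepdims=True)

def fuse_networks(affinities, k=3, iters=10):
    """Simplified fusion: diffuse each view's status matrix through the others."""
    P = [W / W.sum(axis=1, keepdims=True) for W in affinities]
    S = [knn_kernel(W, k) for W in affinities]
    for _ in range(iters):
        updated = []
        for v in range(len(P)):
            others = np.mean([P[u] for u in range(len(P)) if u != v], axis=0)
            U = S[v] @ others @ S[v].T
            updated.append(U / U.sum(axis=1, keepdims=True))
        P = updated
    fused = np.mean(P, axis=0)
    return 0.5 * (fused + fused.T)  # symmetrize the averaged network

# Toy example: two views of 12 patients forming two well-separated groups.
rng = np.random.default_rng(1)
labels = np.array([0] * 6 + [1] * 6)
views = [np.vstack([rng.normal(0.0, 1.0, (6, 4)), rng.normal(5.0, 1.0, (6, 4))])
         for _ in range(2)]
fused = fuse_networks([gaussian_affinity(X) for X in views])

same_group = labels[:, None] == labels[None, :]
within, between = fused[same_group].mean(), fused[~same_group].mean()
```

The fused matrix would normally be passed to spectral clustering to obtain subtypes; that step is left out to keep the sketch short.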
Affiliation(s)
- Shuai Zhang
  - Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, Shandong, 250012, China
- Jiali Lv
  - Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, Shandong, 250012, China
- Jinglan Zhang
  - School of Life Science, Shandong University, Qingdao, 266237, China
- Zhe Fan
  - Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, Shandong, 250012, China
- Bingbing Gu
  - Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, Shandong, 250012, China
- Bingbing Fan
  - Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, Shandong, 250012, China
- Chunxia Li
  - Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, Shandong, 250012, China
- Cheng Wang
  - Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, Shandong, 250012, China.
- Tao Zhang
  - Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, Shandong, 250012, China; Department of Epidemiology and Biostatistics, School of Public Health, Tianjin Medical University, Tianjin, 300070, China.
4
Abdelaziz EH, Ismail R, Mabrouk MS, Amin E. Multi-omics data integration and analysis pipeline for precision medicine: Systematic review. Comput Biol Chem 2024; 113:108254. [PMID: 39447405 DOI: 10.1016/j.compbiolchem.2024.108254] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/21/2024] [Revised: 09/05/2024] [Accepted: 10/14/2024] [Indexed: 10/26/2024]
Abstract
Precision medicine has gained considerable popularity since the "one-size-fits-all" approach did not seem very effective or reflective of the complexity of the human body. Likewise, because single-omics does not reflect the complexity of the human body's inner workings, it did not result in the expected advancement in the medical field. Therefore, the multi-omics approach has emerged. The multi-omics approach involves integrating data from different omics technologies, such as DNA sequencing, RNA sequencing, mass spectrometry, and others, using computational methods and then analyzing the integrated result for different downstream analysis applications such as survival analysis, cancer classification, or biomarker identification. Most of the recent reviews were constrained to discussing one aspect of the multi-omics analysis pipeline, such as the dimensionality reduction step, the integration methods, or the interpretability aspect; however, very few provide a comprehensive review of every step of the analysis. This study gives an overview of the multi-omics analysis pipeline: it starts with the most popular multi-omics databases used in recent literature and dimensionality reduction techniques, details the different types of data integration techniques and their downstream analysis applications, describes the most commonly used evaluation metrics, highlights the importance of model interpretability, and lastly discusses the challenges and potential future work for multi-omics data integration in precision medicine.
Affiliation(s)
- Rasha Ismail
  - Faculty of Computer and Information Sciences, Ainshams University, Cairo, Egypt.
- Mai S Mabrouk
  - Information Technology and Computer Science School, Nile University, Cairo, Egypt.
- Eman Amin
  - Faculty of Computer and Information Sciences, Ainshams University, Cairo, Egypt.
5
Mansoor S, Hamid S, Tuan TT, Park JE, Chung YS. Advance computational tools for multiomics data learning. Biotechnol Adv 2024; 77:108447. [PMID: 39251098 DOI: 10.1016/j.biotechadv.2024.108447] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/19/2024] [Revised: 09/01/2024] [Accepted: 09/05/2024] [Indexed: 09/11/2024]
Abstract
The burgeoning field of bioinformatics has seen a surge in computational tools tailored for omics data analysis, driven by the heterogeneous and high-dimensional nature of omics data. In biomedical and plant science research, multi-omics data have become pivotal for predictive analytics in the era of big data, necessitating sophisticated computational methodologies. This review explores a diverse array of computational approaches that play a crucial role in processing, normalizing, integrating, and analyzing omics data. Notable methods such as similarity-based methods, network-based approaches, correlation-based methods, Bayesian methods, fusion-based methods, and multivariate techniques, among others, are discussed in detail, each offering unique functionalities to address the complexities of multi-omics data. Furthermore, this review underscores the significance of computational tools in advancing our understanding of data and their transformative impact on research.
Affiliation(s)
- Sheikh Mansoor
  - Department of Plant Resources and Environment, Jeju National University, 63243, Republic of Korea
- Saira Hamid
  - Watson Crick Centre for Molecular Medicine, Islamic University of Science and Technology, Awantipora, Pulwama, J&K, India
- Thai Thanh Tuan
  - Department of Plant Resources and Environment, Jeju National University, 63243, Republic of Korea; Multimedia Communications Laboratory, University of Information Technology, Ho Chi Minh city 70000, Vietnam; Multimedia Communications Laboratory, Vietnam National University, Ho Chi Minh city 70000, Vietnam
- Jong-Eun Park
  - Department of Animal Biotechnology, College of Applied Life Science, Jeju National University, Jeju, Jeju-do, Republic of Korea.
- Yong Suk Chung
  - Department of Plant Resources and Environment, Jeju National University, 63243, Republic of Korea.
6
Miao Y, Xu H, Wang S. PartIES: a disease subtyping framework with Partition-level Integration using diffusion-Enhanced Similarities from multi-omics Data. Brief Bioinform 2024; 26:bbae609. [PMID: 39584699 PMCID: PMC11586768 DOI: 10.1093/bib/bbae609] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/25/2024] [Revised: 10/16/2024] [Accepted: 11/11/2024] [Indexed: 11/26/2024] Open
Abstract
Integrating multi-omics data helps identify disease subtypes. Many similarity-based methods have been developed for disease subtyping using multi-omics data, with many of them focusing on extracting common clustering structures across multiple types of omics data, but not preserving data-type-specific clustering structures. Moreover, the clustering performance of similarity-based methods is affected when similarity measures are noisy. Here we proposed PartIES, a Partition-level Integration using diffusion-Enhanced Similarities, to perform disease subtyping using multi-omics data. PartIES first uses diffusion to reduce noise in the individual similarity/kernel matrices from individual omics data types, then extracts partition information from the diffusion-enhanced similarity matrices and iteratively integrates the partition-level similarity through a weighted average. Simulation studies showed that (1) the diffusion step enhances clustering accuracy, and (2) PartIES outperforms competing methods, particularly when omics data types provide different clustering structures. Using mRNA, long noncoding RNA, and microRNA expression data, DNA methylation data, and somatic mutation data from The Cancer Genome Atlas project, PartIES identified subtypes in bladder urothelial carcinoma, liver hepatocellular carcinoma, and thyroid carcinoma that are most significantly associated with patient survival across all methods. Further investigations suggested that among subtype-associated genes, many of those that are highly interacting with other genes are known important cancer genes. The identified cancer subtypes also have different activity levels for some known cancer-related pathways. The R code can be accessed at https://github.com/yuqimiao/PartIES.git.
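The two ingredients named in this abstract, diffusion on a similarity matrix and integration at the partition level, can be sketched generically. The code below is an illustration under assumptions, not the PartIES algorithm (whose R code is linked above): diffusion is taken to be a short random walk on the row-normalized similarity matrix, partitions are supplied as given label vectors, and the weights are fixed rather than learned iteratively. All function names and toy values are hypothetical.

```python
import numpy as np

def diffuse(K, steps=2):
    """Random-walk diffusion: row-normalize K and take `steps` transitions."""
    P = K / K.sum(axis=1, keepdims=True)
    return np.linalg.matrix_power(P, steps)

def comembership(labels):
    """Partition-level similarity: 1 if two samples share a cluster, else 0."""
    labels = np.asarray(labels)
    return (labels[:, None] == labels[None, :]).astype(float)

def integrate_partitions(label_sets, weights):
    """Weighted average of co-membership matrices, one per omics data type."""
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    return sum(w * comembership(l) for w, l in zip(weights, label_sets))

# Toy similarity matrix for three samples, smoothed by two diffusion steps.
K = np.array([[1.0, 0.8, 0.1],
              [0.8, 1.0, 0.3],
              [0.1, 0.3, 1.0]])
D = diffuse(K, steps=2)

# Two data types that disagree about where sample 1 belongs.
consensus = integrate_partitions([[0, 0, 1], [0, 1, 1]], weights=[1.0, 1.0])
```

Averaging co-membership matrices keeps disagreement visible: a pair grouped together by only one of two equally weighted data types receives a consensus value of 0.5 instead of being forced to 0 or 1.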
Affiliation(s)
- Yuqi Miao
  - Department of Biostatistics, Mailman School of Public Health, Columbia University, New York, NY 10027, United States
- Huang Xu
  - Department of Biostatistics, Mailman School of Public Health, Columbia University, New York, NY 10027, United States
- Shuang Wang
  - Department of Biostatistics, Mailman School of Public Health, Columbia University, New York, NY 10027, United States
7
Kruta J, Carapito R, Trendelenburg M, Martin T, Rizzi M, Voll RE, Cavalli A, Natali E, Meier P, Stawiski M, Mosbacher J, Mollet A, Santoro A, Capri M, Giampieri E, Schkommodau E, Miho E. Machine learning for precision diagnostics of autoimmunity. Sci Rep 2024; 14:27848. [PMID: 39537649 PMCID: PMC11561187 DOI: 10.1038/s41598-024-76093-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/26/2024] [Accepted: 10/10/2024] [Indexed: 11/16/2024] Open
Abstract
Early and accurate diagnosis is crucial to prevent disease development and define therapeutic strategies. Due to predominantly unspecific symptoms, diagnosis of autoimmune diseases (AID) is notoriously challenging. Clinical decision support systems (CDSS) are a promising method with the potential to enhance and expedite precise diagnostics by physicians. However, due to the difficulties of integrating and encoding multi-omics data with clinical values, as well as a lack of standardization, such systems are often limited to certain data types. Accordingly, even sophisticated data models fall short when making accurate disease diagnoses and presenting data analyses in a user-friendly form. Therefore, the integration of various data types is not only an opportunity but also a competitive advantage for research and industry. We have developed an integration pipeline to enable the use of machine learning for patient classification based on multi-omics data in combination with clinical values and laboratory results. The application of our framework resulted in up to 96% prediction accuracy of autoimmune diseases with machine learning models. Our results deliver insights into autoimmune disease research and have the potential to be adapted for applications across disease conditions.
Affiliation(s)
- Jan Kruta
  - School of Life Sciences, FHNW University of Applied Sciences and Arts Northwestern Switzerland, Hofackerstrasse 30, Muttenz, 4132, Switzerland
- Raphael Carapito
  - Laboratoire d'ImmunoRhumatologie Moléculaire, plateforme GENOMAX, Faculté de Médecine, Fédération de Médecine Translationnelle de Strasbourg (FMTS), Institut Thématique Interdisciplinaire TRANSPLANTEX NG, INSERM UMR_S 1109, Fédération Hospitalo-Universitaire OMICARE, Université de Strasbourg, 4 rue Kirschleger, Strasbourg, 67085, France
  - Service d'Immunologie Biologique, Pôle de Biologie, Plateau Technique de Biologie, Nouvel Hôpital Civil, 1 place de l'Hôpital, Strasbourg, 67091, France
- Marten Trendelenburg
  - Division of Internal Medicine, University Hospital Basel, Basel, 4031, Switzerland
- Thierry Martin
  - Laboratoire d'ImmunoRhumatologie Moléculaire, plateforme GENOMAX, Faculté de Médecine, Fédération de Médecine Translationnelle de Strasbourg (FMTS), Institut Thématique Interdisciplinaire TRANSPLANTEX NG, INSERM UMR_S 1109, Fédération Hospitalo-Universitaire OMICARE, Université de Strasbourg, 4 rue Kirschleger, Strasbourg, 67085, France
- Marta Rizzi
  - Department of Rheumatology and Clinical Immunology, Medical Center, University of Freiburg, 79106, Freiburg, Germany
- Reinhard E Voll
  - Department of Rheumatology and Clinical Immunology, Medical Center, University of Freiburg, 79106, Freiburg, Germany
- Andrea Cavalli
  - FaBiT Department of Pharmacy and Biotechnology, Università di Bologna, Bologna, 40126, Italy
- Eriberto Natali
  - School of Life Sciences, FHNW University of Applied Sciences and Arts Northwestern Switzerland, Hofackerstrasse 30, Muttenz, 4132, Switzerland
- Patrick Meier
  - School of Life Sciences, FHNW University of Applied Sciences and Arts Northwestern Switzerland, Hofackerstrasse 30, Muttenz, 4132, Switzerland
- Marc Stawiski
  - School of Life Sciences, FHNW University of Applied Sciences and Arts Northwestern Switzerland, Hofackerstrasse 30, Muttenz, 4132, Switzerland
- Johannes Mosbacher
  - School of Life Sciences, FHNW University of Applied Sciences and Arts Northwestern Switzerland, Hofackerstrasse 30, Muttenz, 4132, Switzerland
- Annette Mollet
  - Institute of Pharmaceutical Medicine, University of Basel, Basel, 4056, Switzerland
- Aurelia Santoro
  - Department of Experimental, Diagnostic and Specialty Medicine (DIMES), University of Bologna, Bologna, 40126, Italy
- Miriam Capri
  - Department of Experimental, Diagnostic and Specialty Medicine (DIMES), University of Bologna, Bologna, 40126, Italy
- Enrico Giampieri
  - Department of Experimental, Diagnostic and Specialty Medicine (DIMES), University of Bologna, Bologna, 40126, Italy
- Erik Schkommodau
  - School of Life Sciences, FHNW University of Applied Sciences and Arts Northwestern Switzerland, Hofackerstrasse 30, Muttenz, 4132, Switzerland
- Enkelejda Miho
  - School of Life Sciences, FHNW University of Applied Sciences and Arts Northwestern Switzerland, Hofackerstrasse 30, Muttenz, 4132, Switzerland.
  - SIB Swiss Institute of Bioinformatics, Lausanne, 1015, Switzerland.
  - aiNET GmbH, Lichtstrasse 35, Basel, 4056, Switzerland.
8
Anwar MY, Highland H, Buchanan VL, Graff M, Young K, Taylor KD, Tracy RP, Durda P, Liu Y, Johnson CW, Aguet F, Ardlie KG, Gerszten RE, Clish CB, Lange LA, Ding J, Goodarzi MO, Chen YDI, Peloso GM, Guo X, Stanislawski MA, Rotter JI, Rich SS, Justice AE, Liu CT, North K. Machine learning-based clustering identifies obesity subgroups with differential multi-omics profiles and metabolic patterns. Obesity (Silver Spring) 2024; 32:2024-2034. [PMID: 39497627 PMCID: PMC11540333 DOI: 10.1002/oby.24137] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/23/2024] [Revised: 06/18/2024] [Accepted: 07/22/2024] [Indexed: 11/08/2024]
Abstract
OBJECTIVE: Individuals living with obesity are differentially susceptible to cardiometabolic diseases. We hypothesized that an integrative multi-omics approach might improve identification of subgroups of individuals with obesity who have distinct cardiometabolic disease patterns. METHODS: We performed machine learning-based, integrative unsupervised clustering to identify proteomics- and metabolomics-defined subpopulations of individuals living with obesity (BMI ≥ 30 kg/m²), leveraging data from 243 individuals in the Multi-Ethnic Study of Atherosclerosis (MESA) cohort. Omics that contributed to the observed clusters were functionally characterized. We performed multivariate regression to assess whether the individuals in each cluster demonstrated differential patterns of cardiometabolic traits. RESULTS: We identified two distinct clusters (iCluster1 and 2). iCluster2 had significantly higher average BMI values, fasting blood glucose, and inflammation. iCluster1 was associated with higher levels of total cholesterol and high-density lipoprotein cholesterol. Pathways mediating cell growth, lipogenesis, and energy expenditures were positively associated with iCluster1. Inflammatory response and insulin resistance pathways were positively associated with iCluster2. CONCLUSIONS: Although the two identified clusters may represent progressive obesity-related pathologic processes measured at different stages, other mechanisms in combination could also underpin the identified clusters given no significant age difference between the comparative groups. For instance, clusters may reflect differences in dietary/behavioral patterns or differential rates of metabolic damage.
Affiliation(s)
- Mohammad Y Anwar
  - Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Heather Highland
  - Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Victoria Lynn Buchanan
  - Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Mariaelisa Graff
  - Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Kristin Young
  - Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Kent D Taylor
  - The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, California, USA
- Russell P Tracy
  - Department of Pathology and Laboratory Medicine, Larner College of Medicine, University of Vermont, Burlington, Vermont, USA
- Peter Durda
  - Department of Pathology and Laboratory Medicine, Larner College of Medicine, University of Vermont, Burlington, Vermont, USA
- Yongmei Liu
  - Department of Medicine, Duke University Medical Center, Durham, North Carolina, USA
- Craig W Johnson
  - Department of Biostatistics, University of Washington, Seattle, Washington, USA
- Francois Aguet
  - Program of Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA
- Kristin G Ardlie
  - Program of Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA
- Robert E Gerszten
  - Cardiovascular Research Center, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts, USA
- Clary B Clish
  - Metabolite Profiling Platform, Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA
- Leslie A Lange
  - Department of Epidemiology, University of Colorado Anschutz Medical Campus, Aurora, Colorado, USA
- Jingzhong Ding
  - Section of Gerontology and Geriatric Medicine, Department of Internal Medicine, Wake Forest University School of Medicine, Winston-Salem, North Carolina, USA
- Mark O Goodarzi
  - Division of Endocrinology, Diabetes, and Metabolism, Cedars-Sinai Medical Center, Los Angeles, California, USA
- Yii-Der Ida Chen
  - The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, California, USA
- Gina M Peloso
  - Department of Biostatistics, Boston University School of Public Health, Boston University, Boston, Massachusetts, USA
- Xiuqing Guo
  - The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, California, USA
- Maggie A Stanislawski
  - Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus, Aurora, Colorado, USA
- Jerome I Rotter
  - The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, California, USA
- Stephen S Rich
  - Center for Public Health Genomics, University of Virginia, Charlottesville, Virginia, USA
- Anne E Justice
  - Department of Population Health Sciences, Geisinger Health System, Danville, Pennsylvania, USA
- Ching-Ti Liu
  - Department of Biostatistics, Boston University School of Public Health, Boston University, Boston, Massachusetts, USA
- Kari North
  - Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
9
Marino R, El Aalamat Y, Bol V, Caselle M, Del Giudice G, Lambert C, Medini D, Wilkinson TMA, Muzzi A. An integrative network-based approach to identify driving gene communities in chronic obstructive pulmonary disease. NPJ Syst Biol Appl 2024; 10:125. [PMID: 39461973 PMCID: PMC11513021 DOI: 10.1038/s41540-024-00425-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/14/2022] [Accepted: 08/19/2024] [Indexed: 10/28/2024] Open
Abstract
Chronic obstructive pulmonary disease (COPD) is an etiologically complex disease characterized by acute exacerbations and stable phases. We aimed to identify biological functions modulated in specific COPD conditions, using whole blood samples collected in the AERIS clinical study (NCT01360398). Considered conditions were exacerbation onset, severity of airway obstruction, and presence of respiratory pathogens in sputum samples. With an integrative multi-network gene community detection (MNGCD) approach, we analyzed expression profiles to identify communities of correlated genes. The approach combined different layers of gene interactions for each explored condition/subset of samples: gene expression similarity, protein-protein interactions, transcription factors, and microRNAs validated regulons. Heme metabolism, interferon-alpha, and interferon-gamma pathways were modulated in patients at both exacerbation and stable-state visits, but with the involvement of distinct sets of genes. An important gene community was enriched with G2M checkpoint, E2F targets, and mitotic spindle pathways during exacerbation. Targets of TAL1 regulator and hsa-let-7b-5p microRNA were modulated with increasing severity of airway obstruction. Bacterial infections with Moraxella catarrhalis and, particularly, Haemophilus influenzae triggered a specific cellular and inflammatory response in acute exacerbations, indicating an active reaction of the host to infections. In conclusion, COPD is a complex multifactorial disease that requires in-depth investigations of its causes and features during its evolution, and whole blood transcriptome profiling can contribute to capturing some relevant regulatory mechanisms associated with this disease. In this work, we explored multi-network modeling that integrated diverse layers of regulatory gene networks and enhanced our comprehension of the biological functions implicated in COPD pathogenesis.
Affiliation(s)
- Tom M A Wilkinson
  - Clinical and Experimental Sciences, Faculty of Medicine, University of Southampton, Southampton, United Kingdom
  - National Institute for Health Research Southampton Biomedical Research Centre, Southampton Centre for Biomedical Research, Southampton General Hospital, Southampton, United Kingdom
10
Zhang H, Liu S, Li B, Zhou X. IPFMC: an iterative pathway fusion approach for enhanced multi-omics clustering in cancer research. Brief Bioinform 2024; 25:bbae541. [PMID: 39470306 PMCID: PMC11514061 DOI: 10.1093/bib/bbae541] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/29/2024] [Revised: 09/13/2024] [Accepted: 10/09/2024] [Indexed: 10/30/2024] Open
Abstract
Using multi-omics data for clustering (cancer subtyping) is crucial for precision medicine research. Despite numerous methods having been proposed, current approaches either do not perform satisfactorily or lack biological interpretability, limiting the practical application of these methods. Based on the biological hypothesis that patients with the same subtype may exhibit similar dysregulated pathways, we developed an Iterative Pathway Fusion approach for enhanced Multi-omics Clustering (IPFMC), a novel multi-omics clustering method involving two data fusion stages. In the first stage, omics data are partitioned at each layer using pathway information, with crucial pathways iteratively selected to represent samples. Ultimately, the representation information from multiple pathways is integrated. In the second stage, similarity network fusion was applied to integrate the representation information from multiple omics. Comparative experiments with nine cancer datasets from The Cancer Genome Atlas (TCGA), involving systematic comparisons with 10 representative methods, reveal that IPFMC outperforms these methods. Additionally, the biological pathways and genes identified by our approach hold biological significance, affirming not only its excellent clustering performance but also its biological interpretability.
Affiliation(s)
- Haoyang Zhang
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, No. 1 Shizishan Street, Hongshan District, Wuhan 430070, People’s Republic of China
- Sha Liu
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, No. 1 Shizishan Street, Hongshan District, Wuhan 430070, People’s Republic of China
- Bingxin Li
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, No. 1 Shizishan Street, Hongshan District, Wuhan 430070, People’s Republic of China
- Xionghui Zhou
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, No. 1 Shizishan Street, Hongshan District, Wuhan 430070, People’s Republic of China
- Key Laboratory of Smart Farming for Agricultural Animals, Ministry of Agriculture and Rural Affairs, No. 1 Shizishan Street, Hongshan District, Wuhan 430070, People’s Republic of China
11
Jia C, Wang T, Cui D, Tian Y, Liu G, Xu Z, Luo Y, Fang R, Yu H, Zhang Y, Cui Y, Cao H. A metagene based similarity network fusion approach for multi-omics data integration identified novel subtypes in renal cell carcinoma. Brief Bioinform 2024; 25:bbae606. [PMID: 39562162 PMCID: PMC11576078 DOI: 10.1093/bib/bbae606] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/31/2024] [Revised: 10/22/2024] [Accepted: 11/13/2024] [Indexed: 11/21/2024] Open
Abstract
Renal cell carcinoma (RCC) ranks among the most prevalent cancers worldwide, with both incidence and mortality rates increasing annually. The heterogeneity among RCC patients presents considerable challenges for developing universally effective treatment strategies, emphasizing the necessity of in-depth research into RCC's molecular mechanisms, understanding the variations among RCC patients and further identifying distinct molecular subtypes for precise treatment. We proposed a metagene-based similarity network fusion (Meta-SNF) method for RCC subtype identification with multi-omics data, using a non-negative matrix factorization technique to capture alternative structures inherent in the dataset as metagenes. These latent metagenes were then integrated to construct a fused network under the Similarity Network Fusion (SNF) framework for more precise subtyping. We conducted simulation studies and analyzed real-world data from two RCC datasets, namely kidney renal clear cell carcinoma (KIRC) and kidney renal papillary cell carcinoma (KIRP) to demonstrate the utility of Meta-SNF. The simulation studies indicated that Meta-SNF achieved higher accuracy in subtype identification compared with the original SNF and other state-of-the-art methods. In analyses of real data, Meta-SNF produced more distinct and well-separated clusters, classifying both KIRC and KIRP into four subtypes with significant differences in survival outcomes. Subsequently, we performed comprehensive bioinformatics analyses focused on subtypes with poor prognoses in KIRC and KIRP and identified several potential biomarkers. Meta-SNF offers a novel strategy for subtype identification using multi-omics data, and its application to RCC datasets has yielded diverse biological insights which are highly valuable for informing clinical decision-making processes in the treatment of RCC.
Affiliation(s)
- Congcong Jia
- Department of Health Statistics, Shanxi Provincial Key Laboratory of Major Diseases Risk Assessment, School of Public Health, Shanxi Medical University, Taiyuan, Shanxi, 030001, PR, China
- MOE Key Laboratory of Coal Environmental Pathogenicity and Prevention, Shanxi Medical University, Taiyuan, Shanxi, 030001, PR, China
- Tong Wang
- Department of Health Statistics, Shanxi Provincial Key Laboratory of Major Diseases Risk Assessment, School of Public Health, Shanxi Medical University, Taiyuan, Shanxi, 030001, PR, China
- Academy of Medical Sciences, Shanxi Medical University, Taiyuan, Shanxi, 030001, PR, China
- Dingtong Cui
- Department of Health Statistics, Shanxi Provincial Key Laboratory of Major Diseases Risk Assessment, School of Public Health, Shanxi Medical University, Taiyuan, Shanxi, 030001, PR, China
- MOE Key Laboratory of Coal Environmental Pathogenicity and Prevention, Shanxi Medical University, Taiyuan, Shanxi, 030001, PR, China
- Yaxin Tian
- Department of Health Statistics, Shanxi Provincial Key Laboratory of Major Diseases Risk Assessment, School of Public Health, Shanxi Medical University, Taiyuan, Shanxi, 030001, PR, China
- Academy of Medical Sciences, Shanxi Medical University, Taiyuan, Shanxi, 030001, PR, China
- Gaiqin Liu
- Department of Health Statistics, Shanxi Provincial Key Laboratory of Major Diseases Risk Assessment, School of Public Health, Shanxi Medical University, Taiyuan, Shanxi, 030001, PR, China
- MOE Key Laboratory of Coal Environmental Pathogenicity and Prevention, Shanxi Medical University, Taiyuan, Shanxi, 030001, PR, China
- Zhaoyang Xu
- Department of Health Statistics, Shanxi Provincial Key Laboratory of Major Diseases Risk Assessment, School of Public Health, Shanxi Medical University, Taiyuan, Shanxi, 030001, PR, China
- Academy of Medical Sciences, Shanxi Medical University, Taiyuan, Shanxi, 030001, PR, China
- Yanhong Luo
- Department of Health Statistics, Shanxi Provincial Key Laboratory of Major Diseases Risk Assessment, School of Public Health, Shanxi Medical University, Taiyuan, Shanxi, 030001, PR, China
- MOE Key Laboratory of Coal Environmental Pathogenicity and Prevention, Shanxi Medical University, Taiyuan, Shanxi, 030001, PR, China
- Ruiling Fang
- Department of Health Statistics, Shanxi Provincial Key Laboratory of Major Diseases Risk Assessment, School of Public Health, Shanxi Medical University, Taiyuan, Shanxi, 030001, PR, China
- MOE Key Laboratory of Coal Environmental Pathogenicity and Prevention, Shanxi Medical University, Taiyuan, Shanxi, 030001, PR, China
- Hongmei Yu
- Department of Health Statistics, Shanxi Provincial Key Laboratory of Major Diseases Risk Assessment, School of Public Health, Shanxi Medical University, Taiyuan, Shanxi, 030001, PR, China
- MOE Key Laboratory of Coal Environmental Pathogenicity and Prevention, Shanxi Medical University, Taiyuan, Shanxi, 030001, PR, China
- Yanbo Zhang
- Department of Health Statistics, Shanxi Provincial Key Laboratory of Major Diseases Risk Assessment, School of Public Health, Shanxi Medical University, Taiyuan, Shanxi, 030001, PR, China
- MOE Key Laboratory of Coal Environmental Pathogenicity and Prevention, Shanxi Medical University, Taiyuan, Shanxi, 030001, PR, China
- Yuehua Cui
- Department of Statistics and Probability, Michigan State University, East Lansing, MI, 48824, United States
- Hongyan Cao
- Department of Health Statistics, Shanxi Provincial Key Laboratory of Major Diseases Risk Assessment, School of Public Health, Shanxi Medical University, Taiyuan, Shanxi, 030001, PR, China
- MOE Key Laboratory of Coal Environmental Pathogenicity and Prevention, Shanxi Medical University, Taiyuan, Shanxi, 030001, PR, China
12
Zhao Y, Li X, Zhou C, Peng H, Zheng Z, Chen J, Ding W. A review of cancer data fusion methods based on deep learning. Information Fusion 2024; 108:102361. [DOI: 10.1016/j.inffus.2024.102361] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/04/2025]
13
Rintala TJ, Fortino V. COPS: A novel platform for multi-omic disease subtype discovery via robust multi-objective evaluation of clustering algorithms. PLoS Comput Biol 2024; 20:e1012275. [PMID: 39102448 PMCID: PMC11326705 DOI: 10.1371/journal.pcbi.1012275] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/28/2023] [Revised: 08/15/2024] [Accepted: 06/25/2024] [Indexed: 08/07/2024] Open
Abstract
Recent research on multi-view clustering algorithms for complex disease subtyping often overlooks aspects like clustering stability and critical assessment of prognostic relevance. Furthermore, current frameworks do not allow for a comparison between data-driven and pathway-driven clustering, highlighting a significant gap in the methodology. We present the COPS R-package, tailored for robust evaluation of single and multi-omics clustering results. COPS features advanced methods, including similarity networks, kernel-based approaches, dimensionality reduction, and pathway knowledge integration. Some of these methods are not accessible through R, and some correspond to new approaches proposed with COPS. Our framework was rigorously applied to multi-omics data across seven cancer types, including breast, prostate, and lung, utilizing mRNA, CNV, miRNA, and DNA methylation data. Unlike previous studies, our approach contrasts data- and knowledge-driven multi-view clustering methods and incorporates cross-fold validation for robustness. Clustering outcomes were assessed using the ARI score, survival analysis via Cox regression models including relevant covariates, and the stability of the results. While survival analysis and gold-standard agreement are standard metrics, they vary considerably across methods and datasets. Therefore, it is essential to assess multi-view clustering methods using multiple criteria, from cluster stability to prognostic relevance, and to provide ways of comparing these metrics simultaneously to select the optimal approach for disease subtype discovery in novel datasets. Emphasizing multi-objective evaluation, we applied the Pareto efficiency concept to gauge the equilibrium of evaluation metrics in each cancer case-study. 
Affinity Network Fusion, Integrative Non-negative Matrix Factorization, and Multiple Kernel K-Means with linear or Pathway Induced Kernels were the most stable and effective in discerning groups with significantly different survival outcomes in several case studies.
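The Pareto efficiency concept mentioned in this abstract can be illustrated with a small sketch. This is not the COPS implementation (COPS is an R package). The function name `pareto_front`, the method labels, and the scores below are hypothetical, and every metric is assumed to be "higher is better".

```python
from typing import Dict, List, Tuple


def pareto_front(scores: Dict[str, Tuple[float, ...]]) -> List[str]:
    """Return the names of methods that no other method dominates.

    Assumption: each tuple holds the same metrics in the same order,
    and larger values are better for every metric.
    """

    def dominates(a: Tuple[float, ...], b: Tuple[float, ...]) -> bool:
        # a dominates b if it is at least as good everywhere
        # and strictly better on at least one metric.
        at_least_as_good = all(x >= y for x, y in zip(a, b))
        strictly_better = any(x > y for x, y in zip(a, b))
        return at_least_as_good and strictly_better

    front = []
    for name, own in scores.items():
        dominated = any(
            dominates(other, own)
            for other_name, other in scores.items()
            if other_name != name
        )
        if not dominated:
            front.append(name)
    return sorted(front)


# Hypothetical scores per method: (agreement, stability, survival relevance).
example_scores = {
    "method_a": (0.60, 0.90, 0.40),
    "method_b": (0.55, 0.95, 0.45),
    "method_c": (0.50, 0.85, 0.35),  # worse than method_a on every metric
    "method_d": (0.70, 0.60, 0.50),
}
front = pareto_front(example_scores)
```

In this toy input, `method_c` is dropped because `method_a` beats it on all three metrics, while the remaining methods each win on at least one metric and so stay on the front.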
Affiliation(s)
- Teemu J. Rintala
- Institute of Biomedicine, School of Medicine, University of Eastern Finland, Kuopio, Finland
- Vittorio Fortino
- Institute of Biomedicine, School of Medicine, University of Eastern Finland, Kuopio, Finland
14
Arici MK, Tuncbag N. Unveiling hidden connections in omics data via pyPARAGON: an integrative hybrid approach for disease network construction. Brief Bioinform 2024; 25:bbae399. [PMID: 39163205 PMCID: PMC11334722 DOI: 10.1093/bib/bbae399] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/22/2023] [Revised: 06/26/2024] [Accepted: 08/07/2024] [Indexed: 08/22/2024] Open
Abstract
Network inference or reconstruction algorithms play an integral role in successfully analyzing and identifying causal relationships between omics hits for detecting dysregulated and altered signaling components in various contexts, encompassing disease states and drug perturbations. However, accurate representation of signaling networks and identification of context-specific interactions within sparse omics datasets in complex interactomes pose significant challenges in integrative approaches. To address these challenges, we present pyPARAGON (PAgeRAnk-flux on Graphlet-guided network for multi-Omic data integratioN), a novel tool that combines network propagation with graphlets. pyPARAGON enhances accuracy and minimizes the inclusion of nonspecific interactions in signaling networks by utilizing network rather than relying on pairwise connections among proteins. Through comprehensive evaluations on benchmark signaling pathways, we demonstrate that pyPARAGON outperforms state-of-the-art approaches in node propagation and edge inference. Furthermore, pyPARAGON exhibits promising performance in discovering cancer driver networks. Notably, we demonstrate its utility in network-based stratification of patient tumors by integrating phosphoproteomic data from 105 breast cancer tumors with the interactome and demonstrating tumor-specific signaling pathways. Overall, pyPARAGON is a novel tool for analyzing and integrating multi-omic data in the context of signaling networks. pyPARAGON is available at https://github.com/netlab-ku/pyPARAGON.
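Network propagation of the kind this abstract refers to is often done with personalized PageRank. The following is a generic sketch of that idea only: it is not pyPARAGON's code and it omits the graphlet-guided step entirely. The function name, the toy network, and the parameter defaults are assumptions made for illustration.

```python
from typing import Dict, Iterable, List


def personalized_pagerank(
    adjacency: Dict[str, List[str]],
    seeds: Iterable[str],
    damping: float = 0.85,
    max_iter: int = 200,
    tol: float = 1e-12,
) -> Dict[str, float]:
    """Power iteration for personalized PageRank on an unweighted graph.

    Assumptions: every node appears as a key of `adjacency`, every
    neighbour is itself a key, and all seeds are nodes of the graph.
    """
    nodes = sorted(adjacency)
    seed_set = set(seeds)
    # Restart distribution: uniform over the seed nodes.
    restart = {n: (1.0 / len(seed_set) if n in seed_set else 0.0) for n in nodes}
    scores = dict(restart)
    for _ in range(max_iter):
        updated = {n: (1.0 - damping) * restart[n] for n in nodes}
        for node in nodes:
            neighbours = adjacency[node]
            if neighbours:
                # Spread this node's score evenly over its neighbours.
                share = damping * scores[node] / len(neighbours)
                for nb in neighbours:
                    updated[nb] += share
            else:
                # Isolated node: send its score back to the seeds.
                for n in nodes:
                    updated[n] += damping * scores[node] * restart[n]
        change = sum(abs(updated[n] - scores[n]) for n in nodes)
        scores = updated
        if change < tol:
            break
    return scores


# Hypothetical four-protein chain A - B - C - D, with A as the omics "hit".
toy_network = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
ppr_scores = personalized_pagerank(toy_network, seeds=["A"])
```

The scores form a probability distribution over nodes, and nodes far from the seed (here `D`) receive less mass than nodes near it, which is the basis for ranking candidate proteins around the seeds.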
Affiliation(s)
- Muslum Kaan Arici
- Graduate School of Informatics, Middle East Technical University, Ankara 06800, Turkey
- Nurcan Tuncbag
- Chemical and Biological Engineering, College of Engineering, Koc University, Istanbul 34450, Turkey
- School of Medicine, Koc University, Istanbul 34450, Turkey
- Koc University Research Center for Translational Medicine (KUTTAM), Koc University, Istanbul 34450, Turkey
15
Liu P, Page D, Ahlquist P, Ong IM, Gitter A. MPAC: a computational framework for inferring cancer pathway activities from multi-omic data. bioRxiv 2024:2024.06.15.599113. [PMID: 38948762 PMCID: PMC11212914 DOI: 10.1101/2024.06.15.599113] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/02/2024]
Abstract
Fully capturing cellular state requires examining genomic, epigenomic, transcriptomic, proteomic, and other assays for a biological sample and comprehensive computational modeling to reason with the complex and sometimes conflicting measurements. Modeling these so-called multi-omic data is especially beneficial in disease analysis, where observations across omic data types may reveal unexpected patient groupings and inform clinical outcomes and treatments. We present Multi-omic Pathway Analysis of Cancer (MPAC), a computational framework that interprets multi-omic data through prior knowledge from biological pathways. MPAC uses network relationships encoded in pathways using a factor graph to infer consensus activity levels for proteins and associated pathway entities from multi-omic data, runs permutation testing to eliminate spurious activity predictions, and groups biological samples by pathway activities to prioritize proteins with potential clinical relevance. Using DNA copy number alteration and RNA-seq data from head and neck squamous cell carcinoma patients from The Cancer Genome Atlas as an example, we demonstrate that MPAC predicts a patient subgroup related to immune responses not identified by analysis with either input omic data type alone. Key proteins identified via this subgroup have pathway activities related to clinical outcome as well as immune cell compositions. Our MPAC R package, available at https://bioconductor.org/packages/MPAC, enables similar multi-omic analyses on new datasets.
Affiliation(s)
- Peng Liu
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, Wisconsin, United States of America
- Carbone Cancer Center, University of Wisconsin-Madison, Madison, Wisconsin, United States of America
- David Page
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, Wisconsin, United States of America
- Carbone Cancer Center, University of Wisconsin-Madison, Madison, Wisconsin, United States of America
- Department of Computer Sciences, University of Wisconsin-Madison, Madison, Wisconsin, United States of America
- Paul Ahlquist
- John and Jeanne Rowe Center for Research in Virology, Morgridge Institute for Research, Madison, Wisconsin, United States of America
- McArdle Laboratory for Cancer Research, University of Wisconsin-Madison, Madison, Wisconsin, United States of America
- Institute for Molecular Virology, University of Wisconsin-Madison, Madison, Wisconsin, United States of America
- Irene M Ong
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, Wisconsin, United States of America
- Carbone Cancer Center, University of Wisconsin-Madison, Madison, Wisconsin, United States of America
- Department of Obstetrics and Gynecology, University of Wisconsin-Madison, Madison, Wisconsin, United States of America
- Center for Human Genomics and Precision Medicine, University of Wisconsin-Madison, Madison, Wisconsin, United States of America
- Anthony Gitter
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, Wisconsin, United States of America
- Department of Computer Sciences, University of Wisconsin-Madison, Madison, Wisconsin, United States of America
- John and Jeanne Rowe Center for Research in Virology, Morgridge Institute for Research, Madison, Wisconsin, United States of America
16
Rashid MM, Selvarajoo K. Advancing drug-response prediction using multi-modal and -omics machine learning integration (MOMLIN): a case study on breast cancer clinical data. Brief Bioinform 2024; 25:bbae300. [PMID: 38904542 PMCID: PMC11190965 DOI: 10.1093/bib/bbae300] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/25/2024] [Revised: 05/30/2024] [Accepted: 06/11/2024] [Indexed: 06/22/2024] Open
Abstract
The inherent heterogeneity of cancer contributes to highly variable responses to anticancer treatments. This underscores the need to first identify precise biomarkers through the complex multi-omics datasets that are now available. Although much research has focused on this aspect, identifying biomarkers associated with distinct drug responders remains a major challenge. Here, we develop MOMLIN, a multi-modal and -omics machine learning integration framework, to enhance drug-response prediction. MOMLIN jointly utilizes sparse correlation algorithms and class-specific feature selection algorithms to identify multi-modal and -omics-associated interpretable components. MOMLIN was applied to breast cancer datasets from 147 patients (clinical, mutation, gene expression, tumor microenvironment cells and molecular pathways) to analyze drug-response class predictions for non-responders and variable responders. Notably, MOMLIN achieves an average AUC of 0.989, which is at least 10% greater than current state-of-the-art methods (data integration analysis for biomarker discovery using latent components, multi-omics factor analysis, sparse canonical correlation analysis). Moreover, MOMLIN not only detects known individual biomarkers such as genes at the mutation/expression level but, most importantly, also correlates multi-modal and -omics network biomarkers for each response class. For example, an interaction between ER-negative-HMCN1-COL5A1 mutations-FBXO2-CSF3R expression-CD8 emerges as a multimodal biomarker for responders, potentially affecting antimicrobial peptides and FLT3 signaling pathways. In contrast, for resistance cases, a distinct combination of lymph node-TP53 mutation-PON3-ENSG00000261116 lncRNA expression-HLA-E-T-cell exclusions emerged as multimodal biomarkers, possibly impacting the neurotransmitter release cycle pathway. MOMLIN, therefore, is expected to advance precision medicine, for example by detecting context-specific multi-omics network biomarkers and better predicting drug-response classifications.
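The AUC reported in this abstract is the area under the ROC curve. As a reminder of what that number measures, here is a minimal sketch for the binary case, using the fact that AUC equals the probability that a randomly chosen positive example is scored above a randomly chosen negative one. The function name and the toy scores are hypothetical, and nothing here reproduces the MOMLIN evaluation itself.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Binary ROC AUC by pairwise comparison (ties count as one half).

    Assumption: labels are 1 for the positive class (for example,
    responders) and 0 for the negative class.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical predicted scores for two responders and two non-responders.
toy_auc = roc_auc([0.9, 0.4, 0.6, 0.2], [1, 1, 0, 0])  # 3 of 4 pairs ordered correctly
```

The pairwise form is quadratic in the number of samples, which is fine for a cohort of this size; rank-based formulas give the same value more efficiently on large datasets.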
Affiliation(s)
- Md Mamunur Rashid
- Biomolecular Sequence to Function Division, BII, (ASTAR), Singapore 138671, Republic of Singapore
- Kumar Selvarajoo
- Biomolecular Sequence to Function Division, BII, (ASTAR), Singapore 138671, Republic of Singapore
- Synthetic Biology Translational Research Program, Yong Loo Lin School of Medicine, NUS, Singapore 117456, Republic of Singapore
- School of Biological Sciences, Nanyang Technological University (NTU), Singapore 639798, Republic of Singapore
17
Reggiani F, El Rashed Z, Petito M, Pfeffer M, Morabito A, Tanda ET, Spagnolo F, Croce M, Pfeffer U, Amaro A. Machine Learning Methods for Gene Selection in Uveal Melanoma. Int J Mol Sci 2024; 25:1796. [PMID: 38339073 PMCID: PMC10855534 DOI: 10.3390/ijms25031796] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/27/2023] [Revised: 01/25/2024] [Accepted: 01/30/2024] [Indexed: 02/12/2024] Open
Abstract
Uveal melanoma (UM) is the most common primary intraocular malignancy, with limited five-year survival for metastatic patients. Few therapeutic options are currently available for metastatic disease, even though the genomics of this tumor has been deeply studied using next-generation sequencing (NGS) and functional experiments. The profound knowledge of the molecular features that characterize this tumor has not led to the development of efficacious therapies, and the survival of metastatic patients has not changed for decades. Several bioinformatics methods have been applied to mine NGS tumor data in order to unveil tumor biology and detect possible molecular targets for new therapies. Some applications are single-domain based, while others focus on integrating data from multiple genomics domains (such as gene expression and methylation data). Examples of single-domain approaches include differentially expressed gene (DEG) analysis on gene expression data with statistical methods such as SAM (significance analysis of microarrays) or gene prioritization with complex algorithms such as deep learning. Data fusion or integration methods merge multiple domains of information to define new clusters of patients or to detect relevant genes according to multiple NGS data. In this work, we compare different strategies to detect relevant genes for metastatic disease prediction in the TCGA uveal melanoma (UVM) dataset. Detected targets are validated with multi-gene score analysis on a larger UM microarray dataset.
Affiliation(s)
- Francesco Reggiani
- Laboratory of Gene Expression Regulation, IRCCS Ospedale Policlinico San Martino, 16132 Genova, Italy; (F.R.); (M.P.); (A.M.)
- Zeinab El Rashed
- Laboratory of Gene Expression Regulation, IRCCS Ospedale Policlinico San Martino, 16132 Genova, Italy; (F.R.); (M.P.); (A.M.)
- Mariangela Petito
- Laboratory of Gene Expression Regulation, IRCCS Ospedale Policlinico San Martino, 16132 Genova, Italy; (F.R.); (M.P.); (A.M.)
- Department of Experimental Medicine (DIMES), University of Genova, Via Leon Battista Alberti, 16132 Genova, Italy
- Max Pfeffer
- Institute of Numerical and Applied Mathematics, University of Göttingen, 37083 Göttingen, Germany;
- Anna Morabito
- Laboratory of Gene Expression Regulation, IRCCS Ospedale Policlinico San Martino, 16132 Genova, Italy; (F.R.); (M.P.); (A.M.)
- Enrica Teresa Tanda
- Skin Cancer Unit, IRCCS Ospedale Policlinico San Martino, 16132 Genova, Italy; (E.T.T.); (F.S.)
- Department of Internal Medicine and Medical Specialties, University of Genova, Viale Benedetto XV, 16132 Genova, Italy
- Francesco Spagnolo
- Skin Cancer Unit, IRCCS Ospedale Policlinico San Martino, 16132 Genova, Italy; (E.T.T.); (F.S.)
- Department of Surgical Sciences and Integrated Diagnostics (DISC), University of Genova, 16132 Genova, Italy
- Michela Croce
- Biotherapies, IRCCS Ospedale Policlinico San Martino, 16132 Genova, Italy;
- Ulrich Pfeffer
- Laboratory of Gene Expression Regulation, IRCCS Ospedale Policlinico San Martino, 16132 Genova, Italy; (F.R.); (M.P.); (A.M.)
- Adriana Amaro
- Laboratory of Gene Expression Regulation, IRCCS Ospedale Policlinico San Martino, 16132 Genova, Italy; (F.R.); (M.P.); (A.M.)
18
Liu W, Pratte KA, Castaldi PJ, Hersh C, Bowler RP, Banaei-Kashani F, Kechris KJ. A Generalized Higher-order Correlation Analysis Framework for Multi-Omics Network Inference. bioRxiv 2024:2024.01.22.576667. [PMID: 38328226 PMCID: PMC10849540 DOI: 10.1101/2024.01.22.576667] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/09/2024]
Abstract
Multiple -omics (genomics, proteomics, etc.) profiles are commonly generated to gain insight into a disease or physiological system. Constructing multi-omics networks with respect to the trait(s) of interest provides an opportunity to understand relationships between molecular features but integration is challenging due to multiple data sets with high dimensionality. One approach is to use canonical correlation to integrate one or two omics types and a single trait of interest. However, these types of methods may be limited due to (1) not accounting for higher-order correlations existing among features, (2) computational inefficiency when extending to more than two omics data when using a penalty term-based sparsity method, and (3) lack of flexibility for focusing on specific correlations (e.g., omics-to-phenotype correlation versus omics-to-omics correlations). In this work, we have developed a novel multi-omics network analysis pipeline called Sparse Generalized Tensor Canonical Correlation Analysis Network Inference (SGTCCA-Net) that can effectively overcome these limitations. We also introduce an implementation to improve the summarization of networks for downstream analyses. Simulation and real-data experiments demonstrate the effectiveness of our novel method for inferring omics networks and features of interest.
Affiliation(s)
- Weixuan Liu
- Department of Biostatistics and Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
- Peter J. Castaldi
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital, Boston, United States
- Craig Hersh
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital, Boston, United States
- Russell P. Bowler
- Division of Pulmonary Medicine, Department of Medicine, National Jewish Health, Denver, CO, USA
- Farnoush Banaei-Kashani
- Department of Computer Science and Engineering, College of Engineering, Design and Computing, University of Colorado Denver, Denver, CO, USA
- Katerina J. Kechris
- Department of Biostatistics and Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
19
Nießl C, Hoffmann S, Ullmann T, Boulesteix AL. Explaining the optimistic performance evaluation of newly proposed methods: A cross-design validation experiment. Biom J 2024; 66:e2200238. [PMID: 36999395 DOI: 10.1002/bimj.202200238] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/31/2022] [Revised: 01/09/2023] [Accepted: 01/10/2023] [Indexed: 04/01/2023]
Abstract
The constant development of new data analysis methods in many fields of research is accompanied by an increasing awareness that these new methods often perform better in their introductory paper than in subsequent comparison studies conducted by other researchers. We attempt to explain this discrepancy by conducting a systematic experiment that we call "cross-design validation of methods". In the experiment, we select two methods designed for the same data analysis task, reproduce the results shown in each paper, and then reevaluate each method based on the study design (i.e., datasets, competing methods, and evaluation criteria) that was used to show the abilities of the other method. We conduct the experiment for two data analysis tasks, namely cancer subtyping using multiomic data and differential gene expression analysis. Three of the four methods included in the experiment indeed perform worse when they are evaluated on the new study design, which is mainly caused by the different datasets. Apart from illustrating the many degrees of freedom existing in the assessment of a method and their effect on its performance, our experiment suggests that the performance discrepancies between original and subsequent papers may not only be caused by the nonneutrality of the authors proposing the new method but also by differences regarding the level of expertise and field of application. Authors of new methods should thus focus not only on a transparent and extensive evaluation but also on comprehensive method documentation that enables the correct use of their methods in subsequent studies.
Affiliation(s)
- Christina Nießl
- Institute for Medical Information Processing, Biometry and Epidemiology, LMU Munich, Munich, Germany
- Munich Center for Machine Learning (MCML), Munich, Germany
- Sabine Hoffmann
- Institute for Medical Information Processing, Biometry and Epidemiology, LMU Munich, Munich, Germany
- Department of Statistics, LMU Munich, Munich, Germany
- Theresa Ullmann
- Institute for Medical Information Processing, Biometry and Epidemiology, LMU Munich, Munich, Germany
- Anne-Laure Boulesteix
- Institute for Medical Information Processing, Biometry and Epidemiology, LMU Munich, Munich, Germany
20
Guo H, Lv X, Li Y, Li M. Attention-based GCN integrates multi-omics data for breast cancer subtype classification and patient-specific gene marker identification. Brief Funct Genomics 2023; 22:463-474. [PMID: 37114942 DOI: 10.1093/bfgp/elad013] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 09/20/2022] [Revised: 02/16/2023] [Accepted: 03/17/2023] [Indexed: 04/29/2023] Open
Abstract
Breast cancer is a heterogeneous disease and can be divided into several subtypes with unique prognostic and molecular characteristics. The classification of breast cancer subtypes plays an important role in the precision treatment and prognosis of breast cancer. Benefitting from the relation-aware ability of a graph convolution network (GCN), we present a multi-omics integrative method, the attention-based GCN (AGCN), for breast cancer molecular subtype classification using messenger RNA expression, copy number variation and deoxyribonucleic acid methylation multi-omics data. In extensive comparative studies, our AGCN models outperform state-of-the-art methods under different experimental conditions, and both the attention mechanisms and the graph convolution subnetwork play an important role in accurate cancer subtype classification. The layer-wise relevance propagation (LRP) algorithm is used to interpret model decisions and can identify patient-specific important biomarkers that are reported to be related to the occurrence and development of breast cancer. Our results highlight the effectiveness of the GCN and attention mechanisms in multi-omics integrative analysis, and the implementation of the LRP algorithm can provide biologically reasonable insights into model decisions.
Affiliation(s)
- Hui Guo
- College of Chemistry at Sichuan University
- Xiang Lv
- College of Chemistry at Sichuan University
- Yizhou Li
- College of Cyber Science and Engineering at Sichuan University
21
Mushtaq AH, Shafqat A, Salah HT, Hashmi SK, Muhsen IN. Machine learning applications and challenges in graft-versus-host disease: a scoping review. Curr Opin Oncol 2023; 35:594-600. [PMID: 37820094 DOI: 10.1097/cco.0000000000000996] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/13/2023]
Abstract
PURPOSE OF REVIEW This review delves into the potential of artificial intelligence (AI), particularly machine learning (ML), in enhancing graft-versus-host disease (GVHD) risk assessment, diagnosis, and personalized treatment. RECENT FINDINGS Recent studies have demonstrated the superiority of ML algorithms over traditional multivariate statistical models in donor selection for allogeneic hematopoietic stem cell transplantation. ML has recently enabled dynamic risk assessment by modeling time-series data, an upgrade from the static, "snapshot" assessment of patients that conventional statistical models and older ML algorithms offer. Regarding diagnosis, a deep learning model, a subset of ML, can accurately identify skin segments affected with chronic GVHD with satisfactory results. ML methods such as Q-learning and deep reinforcement learning have been utilized to develop adaptive treatment strategies (ATS) for the personalized prevention and treatment of acute and chronic GVHD. SUMMARY To capitalize on these promising advancements, there is a need for large-scale, multicenter collaborations to develop generalizable ML models. Furthermore, addressing pertinent issues such as the implementation of stringent ethical guidelines is crucial before the widespread introduction of AI into GVHD care.
Affiliation(s)
- Ali Hassan Mushtaq
- Department of Internal Medicine, Cleveland Clinic Foundation, Cleveland, Ohio, USA
| | - Areez Shafqat
- College of Medicine, Alfaisal University, Riyadh, Saudi Arabia
| | - Haneen T Salah
- Department of Pathology and Genomic Medicine, Houston Methodist Hospital, Houston, Texas
| | - Shahrukh K Hashmi
- Division of Hematology, Department of Medicine, Mayo Clinic, Rochester, Minnesota, USA
- Department of Medicine, Sheikh Shakhbout Medical City
- Medical Affairs, Khalifa University, Abu Dhabi, United Arab Emirates
| | - Ibrahim N Muhsen
- Section of Hematology and Oncology, Department of Medicine, Baylor College of Medicine, Houston, Texas, USA
|
22
|
Chen W, Wang H, Liang C. Deep multi-view contrastive learning for cancer subtype identification. Brief Bioinform 2023; 24:bbad282. [PMID: 37539822 DOI: 10.1093/bib/bbad282] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 03/22/2023] [Revised: 05/29/2023] [Accepted: 07/19/2023] [Indexed: 08/05/2023] Open
Abstract
Cancer heterogeneity has posed great challenges in exploring precise therapeutic strategies for cancer treatment. The identification of cancer subtypes aims to detect patients with distinct molecular profiles and thus could provide new clues on effective clinical therapies. While great efforts have been made, it remains challenging to develop powerful computational methods that can efficiently integrate multi-omics datasets for the task. In this paper, we propose a novel self-supervised learning model called Deep Multi-view Contrastive Learning (DMCL) for cancer subtype identification. Specifically, by incorporating the reconstruction loss, contrastive loss and clustering loss into a unified framework, our model simultaneously encodes the sample discriminative information into the extracted feature representations and well preserves the sample cluster structures in the embedded space. Moreover, DMCL is an end-to-end framework where the cancer subtypes could be directly obtained from the model outputs. We compare DMCL with eight alternatives ranging from classic cancer subtype identification methods to recently developed state-of-the-art systems on 10 widely used cancer multi-omics datasets as well as an integrated dataset, and the experimental results validate the superior performance of our method. We further conduct a case study on liver cancer and the analysis results indicate that different subtypes might have different responses to the selected chemotherapeutic drugs.
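To make the contrastive component of such a multi-view model more concrete, the sketch below computes an InfoNCE-style cross-view loss: each sample's embedding in one omics view is pulled towards the same sample's embedding in another view and pushed away from other samples. This is an assumed, commonly used form of contrastive loss; the abstract does not give DMCL's exact formulation, and the function names, temperature and toy embeddings are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cross_view_contrastive_loss(view_a, view_b, temperature=0.5):
    """Mean InfoNCE-style loss; row i of view_a and row i of view_b are the same sample."""
    losses = []
    for i, anchor in enumerate(view_a):
        logits = [cosine(anchor, other) / temperature for other in view_b]
        # Numerically stable log-sum-exp over all candidates in the other view.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        # Negative log-probability of picking the matching sample.
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)


# Toy embeddings for two samples in two views (hypothetical values).
view_a = [[1.0, 0.0], [0.0, 1.0]]
aligned_b = [[0.9, 0.1], [0.1, 0.9]]      # same samples land close together
misaligned_b = [[0.1, 0.9], [0.9, 0.1]]   # views disagree about the samples

aligned_loss = cross_view_contrastive_loss(view_a, aligned_b)
misaligned_loss = cross_view_contrastive_loss(view_a, misaligned_b)
```

The loss is lower when the two views agree about which samples are similar, which is the signal a training loop would minimize alongside the reconstruction and clustering terms named in the abstract.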
Affiliation(s)
- Wenlan Chen
- School of Information Science and Engineering, Shandong Normal University, Jinan, 250358, China
| | - Hong Wang
- School of Information Science and Engineering, Shandong Normal University, Jinan, 250358, China
| | - Cheng Liang
- School of Information Science and Engineering, Shandong Normal University, Jinan, 250358, China
|
23
|
Chen Y, Wen Y, Xie C, Chen X, He S, Bo X, Zhang Z. MOCSS: Multi-omics data clustering and cancer subtyping via shared and specific representation learning. iScience 2023; 26:107378. [PMID: 37559907 PMCID: PMC10407241 DOI: 10.1016/j.isci.2023.107378] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 02/17/2023] [Revised: 05/23/2023] [Accepted: 07/07/2023] [Indexed: 08/11/2023] Open
Abstract
Cancer is an extremely complex disease and each type of cancer usually has several different subtypes. Multi-omics data can provide more comprehensive biological information for identifying and discovering cancer subtypes. However, existing unsupervised cancer subtyping methods cannot effectively learn comprehensive shared and specific information of multi-omics data. Therefore, a novel method is proposed based on shared and specific representation learning. For each omics data, two autoencoders are applied to extract shared and specific information, respectively. To reduce redundancy and mutual interference, orthogonality constraint is introduced to separate shared and specific information. In addition, contrastive learning is applied to align the shared information and strengthen their consistency. Finally, the obtained shared and specific information for all samples are used for clustering tasks to achieve cancer subtyping. Experimental results demonstrate that the proposed method can effectively capture shared and specific information of multi-omics data and outperform other state-of-the-art methods on cancer subtyping.
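One common way to express an orthogonality constraint between shared and specific representations is to penalize the squared Frobenius norm of the product S^T P, where S and P hold the shared and specific embeddings of the same samples. The sketch below assumes that form; the abstract does not specify the exact penalty used in MOCSS, and the function name and toy matrices are hypothetical.

```python
def orthogonality_penalty(shared, specific):
    """Squared Frobenius norm of shared^T @ specific.

    shared, specific : n_samples x dim matrices (lists of rows) for the same samples.
    The penalty is zero when every shared dimension is orthogonal to every
    specific dimension across the samples.
    """
    penalty = 0.0
    for shared_col in zip(*shared):
        for specific_col in zip(*specific):
            dot = sum(s * p for s, p in zip(shared_col, specific_col))
            penalty += dot * dot
    return penalty


# Toy embeddings for three samples (hypothetical values).
shared = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
disjoint_specific = [[0.0, 0.0], [0.0, 0.0], [1.0, 2.0]]  # uses a different sample direction

no_overlap = orthogonality_penalty(shared, disjoint_specific)
full_overlap = orthogonality_penalty(shared, shared)
```

In a training loop this term would be added to the autoencoder losses so that the specific encoder is discouraged from duplicating information already captured by the shared encoder.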
Affiliation(s)
- Yuxin Chen
- School of Informatics, Xiamen University, Xiamen 361005, China
| | - Yuqi Wen
- Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
| | - Chenyang Xie
- School of Informatics, Xiamen University, Xiamen 361005, China
| | - Xinjian Chen
- School of Informatics, Xiamen University, Xiamen 361005, China
| | - Song He
- Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
| | - Xiaochen Bo
- Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
| | - Zhongnan Zhang
- School of Informatics, Xiamen University, Xiamen 361005, China
|
24
|
Park J, Lee JW, Park M. Comparison of cancer subtype identification methods combined with feature selection methods in omics data analysis. BioData Min 2023; 16:18. [PMID: 37420304 DOI: 10.1186/s13040-023-00334-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/11/2022] [Accepted: 06/30/2023] [Indexed: 07/09/2023] Open
Abstract
BACKGROUND Cancer subtype identification is important for the early diagnosis of cancer and the provision of adequate treatment. Prior to identifying the subtype of cancer in a patient, feature selection is also crucial for reducing the dimensionality of the data by detecting genes that contain important information about the cancer subtype. Numerous cancer subtyping methods have been developed, and their performance has been compared. However, combinations of feature selection and subtype identification methods have rarely been considered. This study aimed to identify the best combination of variable selection and subtype identification methods in single omics data analysis. RESULTS Combinations of six filter-based methods and six unsupervised subtype identification methods were investigated using The Cancer Genome Atlas (TCGA) datasets for four cancers. The number of features selected varied, and several evaluation metrics were used. Although no single combination was found to have a distinctively good performance, Consensus Clustering (CC) and Neighborhood-Based Multi-omics Clustering (NEMO) used with variance-based feature selection had a tendency to show lower p-values, and nonnegative matrix factorization (NMF) stably showed good performance in many cases unless the Dip test was used for feature selection. In terms of accuracy, the combination of NMF and similarity network fusion (SNF) with Monte Carlo Feature Selection (MCFS) and Minimum-Redundancy Maximum Relevance (mRMR) showed good overall performance. NMF always showed among the worst performances without feature selection in all datasets, but performed much better when used with various feature selection methods. iClusterBayes (ICB) had decent performance when used without feature selection. CONCLUSIONS Rather than a single method clearly emerging as optimal, the best methodology was different depending on the data used, the number of features selected, and the evaluation method. A guideline for choosing the best combination method under various situations is provided.
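Variance-based filtering, one of the feature selection approaches mentioned above, can be sketched in a few lines: rank features (genes) by their variance across samples and keep the top k before clustering. The function name, the use of population variance and the toy expression matrix are assumptions for illustration, not details taken from the study.

```python
from statistics import pvariance


def top_variance_features(matrix, k):
    """Return the column indices of the k highest-variance features.

    matrix : samples x features expression values (list of rows).
    k      : number of features to keep.
    """
    variances = [pvariance(column) for column in zip(*matrix)]
    ranked = sorted(range(len(variances)), key=lambda j: variances[j], reverse=True)
    # Sort the kept indices so the original feature order is preserved.
    return sorted(ranked[:k])


# Toy expression matrix: 4 samples x 3 genes (hypothetical values).
expression = [
    [5.0, 1.0, 2.0],
    [5.0, 9.0, 3.0],
    [5.0, 2.0, 4.0],
    [5.0, 8.0, 2.0],
]
kept = top_variance_features(expression, k=2)
reduced = [[row[j] for j in kept] for row in expression]
```

The constant first gene carries no information for separating samples and is dropped, and the reduced matrix is what would then be passed to a subtype identification method such as consensus clustering or NMF.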
Affiliation(s)
- JiYoon Park
- Department of Statistics, Korea University, 145 Anam-Ro, Seongbuk-Gu, Seoul, 02841, South Korea
| | - Jae Won Lee
- Department of Statistics, Korea University, 145 Anam-Ro, Seongbuk-Gu, Seoul, 02841, South Korea
| | - Mira Park
- Department of Preventive Medicine, Eulji University, 77 Gyeryong-Ro, Jung-Gu, Daejeon, 34824, South Korea
|
25
|
Maiorino E, Loscalzo J. Phenomics and Robust Multiomics Data for Cardiovascular Disease Subtyping. Arterioscler Thromb Vasc Biol 2023; 43:1111-1123. [PMID: 37226730 PMCID: PMC10330619 DOI: 10.1161/atvbaha.122.318892] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/12/2022] [Accepted: 05/10/2023] [Indexed: 05/26/2023]
Abstract
The complex landscape of cardiovascular diseases encompasses a wide range of related pathologies arising from diverse molecular mechanisms and exhibiting heterogeneous phenotypes. This variety of manifestations poses significant challenges in the development of treatment strategies. The increasing availability of precise phenotypic and multiomics data of cardiovascular disease patient populations has spurred the development of a variety of computational disease subtyping techniques to identify distinct subgroups with unique underlying pathogeneses. In this review, we outline the essential components of computational approaches to select, integrate, and cluster omics and clinical data in the context of cardiovascular disease research. We delve into the challenges faced during different stages of the analysis, including feature selection and extraction, data integration, and clustering algorithms. Next, we highlight representative applications of subtyping pipelines in heart failure and coronary artery disease. Finally, we discuss the current challenges and future directions in the development of robust subtyping approaches that can be implemented in clinical workflows, ultimately contributing to the ongoing evolution of precision medicine in health care.
Affiliation(s)
- Enrico Maiorino
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, Massachusetts
| | - Joseph Loscalzo
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, Massachusetts
- Division of Cardiovascular Medicine, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, Massachusetts
|
26
|
Chong D, Jones NC, Schittenhelm RB, Anderson A, Casillas-Espinosa PM. Multi-omics Integration and Epilepsy: Towards a Better Understanding of Biological Mechanisms. Prog Neurobiol 2023:102480. [PMID: 37286031 DOI: 10.1016/j.pneurobio.2023.102480] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/15/2023] [Revised: 05/09/2023] [Accepted: 06/03/2023] [Indexed: 06/09/2023]
Abstract
The epilepsies are a group of complex neurological disorders characterised by recurrent seizures. Approximately 30% of patients fail to respond to anti-seizure medications, despite the recent introduction of many new drugs. The molecular processes underlying epilepsy development are not well understood and this knowledge gap impedes efforts to identify effective targets and develop novel therapies against epilepsy. Omics studies allow a comprehensive characterisation of a class of molecules. Omics-based biomarkers have led to clinically validated diagnostic and prognostic tests for personalised oncology, and more recently for non-cancer diseases. We believe that, in epilepsy, the full potential of multi-omics research is yet to be realised and we envisage that this review will serve as a guide to researchers planning to undertake omics-based mechanistic studies.
Affiliation(s)
- Debbie Chong
- Department of Neuroscience, Central Clinical School, Monash University, Melbourne, 3004, Victoria, Australia
| | - Nigel C Jones
- Department of Neuroscience, Central Clinical School, Monash University, Melbourne, 3004, Victoria, Australia; Department of Medicine (The Royal Melbourne Hospital), The University of Melbourne, 3000, Victoria, Australia; Department of Neurology, Alfred Health, Melbourne, 3004, Victoria, Australia
| | - Ralf B Schittenhelm
- Monash Proteomics & Metabolomics Facility and Monash Biomedicine Discovery Institute, Monash University, Clayton, Victoria, 3800, Australia
| | - Alison Anderson
- Department of Neuroscience, Central Clinical School, Monash University, Melbourne, 3004, Victoria, Australia; Department of Medicine (The Royal Melbourne Hospital), The University of Melbourne, 3000, Victoria, Australia; Department of Neurology, Alfred Health, Melbourne, 3004, Victoria, Australia
| | - Pablo M Casillas-Espinosa
- Department of Neuroscience, Central Clinical School, Monash University, Melbourne, 3004, Victoria, Australia; Department of Medicine (The Royal Melbourne Hospital), The University of Melbourne, 3000, Victoria, Australia; Department of Neurology, Alfred Health, Melbourne, 3004, Victoria, Australia
|
27
|
Stokes T, Cen HH, Kapranov P, Gallagher IJ, Pitsillides AA, Volmar C, Kraus WE, Johnson JD, Phillips SM, Wahlestedt C, Timmons JA. Transcriptomics for Clinical and Experimental Biology Research: Hang on a Seq. Adv Genet (Hoboken) 2023; 4:2200024. [PMID: 37288167 PMCID: PMC10242409 DOI: 10.1002/ggn2.202200024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/02/2022] [Indexed: 06/09/2023]
Abstract
Sequencing the human genome empowers translational medicine, facilitating transcriptome-wide molecular diagnosis, pathway biology, and drug repositioning. Initially, microarrays were used to study the bulk transcriptome, but short-read RNA sequencing (RNA-seq) now predominates. Although RNA-seq is positioned as a superior technology that makes the discovery of novel transcripts routine, most RNA-seq analyses are in fact modeled on the known transcriptome. Limitations of the RNA-seq methodology have emerged, while the design of, and the analysis strategies applied to, arrays have matured. An equitable comparison between these technologies is provided, highlighting advantages that modern arrays hold over RNA-seq. Array protocols more accurately quantify constitutively expressed protein coding genes across tissue replicates, and are more reliable for studying lower expressed genes. Arrays reveal that long noncoding RNAs (lncRNAs) are neither sparsely expressed nor expressed at lower levels than protein coding genes. Heterogeneous coverage of constitutively expressed genes observed with RNA-seq undermines the validity and reproducibility of pathway analyses. The factors driving these observations, many of which are relevant to long-read or single-cell sequencing, are discussed. As proposed herein, a reappreciation of bulk transcriptomic methods is required, including wider use of modern high-density array data, to urgently revise existing anatomical RNA reference atlases and assist with more accurate study of lncRNAs.
Affiliation(s)
- Tanner Stokes
- Faculty of Science, McMaster University, Hamilton, L8S 4L8, Canada
| | - Haoning Howard Cen
- Life Sciences Institute, University of British Columbia, Vancouver, V6T 1Z3, Canada
| | | | - Iain J Gallagher
- School of Applied Sciences, Edinburgh Napier University, Edinburgh, EH11 4BN, UK
| | | | | | | | - James D. Johnson
- Life Sciences Institute, University of British Columbia, Vancouver, V6T 1Z3, Canada
| | | | | | - James A. Timmons
- Miller School of Medicine, University of Miami, Miami, FL 33136, USA
- William Harvey Research Institute, Queen Mary University London, London, EC1M 6BQ, UK
- Augur Precision Medicine LTD, Stirling, FK9 5NF, UK
|
28
|
Jiang C, Geng L, Wang J, Liang Y, Guo X, Liu C, Zhao Y, Jin J, Liu Z, Mu Y. Multiplexed Gene Engineering Based on dCas9 and gRNA-tRNA Array Encoded on Single Transcript. Int J Mol Sci 2023; 24:ijms24108535. [PMID: 37239880 DOI: 10.3390/ijms24108535] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/27/2023] [Revised: 05/04/2023] [Accepted: 05/05/2023] [Indexed: 05/28/2023] Open
Abstract
Multiplexed genome engineering and the simultaneous targeting of multiple genomic loci are valuable for elucidating gene interactions and characterizing genetic networks that affect phenotypes. Here, we developed a general CRISPR-based platform to perform four functions and target multiple genome loci encoded in a single transcript. To establish multiple functions for multiple loci targets, we fused four RNA hairpins, MS2, PP7, com and boxB, to stem-loops of gRNA (guide RNA) scaffolds, separately. The RNA-hairpin-binding domains MCP, PCP, Com and λN22 were fused with different functional effectors. These paired combinations of cognate-RNA hairpins and RNA-binding proteins generated the simultaneous, independent regulation of multiple target genes. To ensure that all proteins and RNAs are expressed in one transcript, multiple gRNAs were constructed in a tandemly arrayed tRNA (transfer RNA)-gRNA architecture, and the triplex sequence was cloned between the protein-coding sequences and the tRNA-gRNA array. By leveraging this system, we illustrate the transcriptional activation, transcriptional repression, DNA methylation and DNA demethylation of endogenous targets using up to 16 individual CRISPR gRNAs delivered on a single transcript. This system provides a powerful platform to investigate synthetic biology questions and engineer complex-phenotype medical applications.
Affiliation(s)
- Chaoqian Jiang
- Key Laboratory of Animal Cellular and Genetic Engineering of Heilongjiang Province, Northeast Agricultural University, Harbin 150030, China
- College of Life Science, Northeast Agricultural University, Harbin 150030, China
| | - Lishuang Geng
- Key Laboratory of Animal Cellular and Genetic Engineering of Heilongjiang Province, Northeast Agricultural University, Harbin 150030, China
- College of Life Science, Northeast Agricultural University, Harbin 150030, China
| | - Jinpeng Wang
- Key Laboratory of Animal Cellular and Genetic Engineering of Heilongjiang Province, Northeast Agricultural University, Harbin 150030, China
| | - Yingjuan Liang
- Key Laboratory of Animal Cellular and Genetic Engineering of Heilongjiang Province, Northeast Agricultural University, Harbin 150030, China
| | - Xiaochen Guo
- Key Laboratory of Animal Cellular and Genetic Engineering of Heilongjiang Province, Northeast Agricultural University, Harbin 150030, China
| | - Chang Liu
- Key Laboratory of Animal Cellular and Genetic Engineering of Heilongjiang Province, Northeast Agricultural University, Harbin 150030, China
| | - Yunjing Zhao
- Key Laboratory of Animal Cellular and Genetic Engineering of Heilongjiang Province, Northeast Agricultural University, Harbin 150030, China
| | - Junxue Jin
- Key Laboratory of Animal Cellular and Genetic Engineering of Heilongjiang Province, Northeast Agricultural University, Harbin 150030, China
- College of Life Science, Northeast Agricultural University, Harbin 150030, China
| | - Zhonghua Liu
- Key Laboratory of Animal Cellular and Genetic Engineering of Heilongjiang Province, Northeast Agricultural University, Harbin 150030, China
- College of Life Science, Northeast Agricultural University, Harbin 150030, China
| | - Yanshuang Mu
- Key Laboratory of Animal Cellular and Genetic Engineering of Heilongjiang Province, Northeast Agricultural University, Harbin 150030, China
- College of Life Science, Northeast Agricultural University, Harbin 150030, China
|
29
|
Steyaert S, Pizurica M, Nagaraj D, Khandelwal P, Hernandez-Boussard T, Gentles AJ, Gevaert O. Multimodal data fusion for cancer biomarker discovery with deep learning. Nat Mach Intell 2023; 5:351-362. [PMID: 37693852 PMCID: PMC10484010 DOI: 10.1038/s42256-023-00633-5] [Citation(s) in RCA: 48] [Impact Index Per Article: 24.0] [Received: 07/28/2022] [Accepted: 02/17/2023] [Indexed: 09/12/2023]
Abstract
Technological advances now make it possible to study a patient from multiple angles with high-dimensional, high-throughput multi-scale biomedical data. In oncology, massive amounts of data are being generated ranging from molecular, histopathology, radiology to clinical records. The introduction of deep learning has significantly advanced the analysis of biomedical data. However, most approaches focus on single data modalities leading to slow progress in methods to integrate complementary data types. Development of effective multimodal fusion approaches is becoming increasingly important as a single modality might not be consistent and sufficient to capture the heterogeneity of complex diseases to tailor medical care and improve personalised medicine. Many initiatives now focus on integrating these disparate modalities to unravel the biological processes involved in multifactorial diseases such as cancer. However, many obstacles remain, including lack of usable data as well as methods for clinical validation and interpretation. Here, we cover these current challenges and reflect on opportunities through deep learning to tackle data sparsity and scarcity, multimodal interpretability, and standardisation of datasets.
Affiliation(s)
- Sandra Steyaert
- Stanford Center for Biomedical Informatics Research (BMIR), Department of Medicine, Stanford University
| | - Marija Pizurica
- Stanford Center for Biomedical Informatics Research (BMIR), Department of Medicine, Stanford University
| | | | | | - Tina Hernandez-Boussard
- Stanford Center for Biomedical Informatics Research (BMIR), Department of Medicine, Stanford University
- Department of Biomedical Data Science, Stanford University
| | - Andrew J Gentles
- Stanford Center for Biomedical Informatics Research (BMIR), Department of Medicine, Stanford University
- Department of Biomedical Data Science, Stanford University
| | - Olivier Gevaert
- Stanford Center for Biomedical Informatics Research (BMIR), Department of Medicine, Stanford University
- Department of Biomedical Data Science, Stanford University
|
30
|
Liu C, Duan Y, Zhou Q, Wang Y, Gao Y, Kan H, Hu J. A classification method of gastric cancer subtype based on residual graph convolution network. Front Genet 2023; 13:1090394. [PMID: 36685956 PMCID: PMC9845413 DOI: 10.3389/fgene.2022.1090394] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/05/2022] [Accepted: 12/09/2022] [Indexed: 01/06/2023] Open
Abstract
Background: Clinical diagnosis and treatment of tumors are greatly complicated by their heterogeneity, and the subtype classification of cancer frequently plays a significant role in the subsequent treatment of tumors. Presently, the majority of studies rely far too heavily on gene expression data, omitting the enormous power of multi-omics fusion data and the potential for patient similarities. Method: In this study, we created a gastric cancer subtype classification model called RRGCN based on residual graph convolutional network (GCN) using multi-omics fusion data and patient similarity network. Given the multi-omics data's high dimensionality, we built an artificial neural network Autoencoder (AE) to reduce the dimensionality of the data and extract hidden layer features. The model is then built using the feature data. In addition, we computed the correlation between patients using the Pearson correlation coefficient, and this relationship between patients forms the edge of the graph structure. Four graph convolutional network layers and two residual networks with skip connections make up RRGCN, which reduces the amount of information lost during transmission between layers and prevents model degradation. Results: The results show that RRGCN significantly outperforms other classification methods with an accuracy as high as 0.87 when compared to four other traditional machine learning methods and deep learning models. Conclusion: In terms of subtype classification, RRGCN excels in all areas and has the potential to offer fresh perspectives on disease mechanisms and disease progression. It has the potential to be used for a broader range of disorders and to aid in clinical diagnosis.
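The patient similarity network described in the Method can be sketched as follows: compute the Pearson correlation between every pair of patient feature vectors and keep an edge when the correlation is high enough. The cut-off of 0.8, the function names and the toy feature vectors are assumptions for illustration; the abstract does not say how correlations are turned into edges.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length, non-constant vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)


def similarity_edges(features, threshold=0.8):
    """Return (i, j, r) for each patient pair whose correlation r meets the threshold.

    features  : one feature vector per patient (for example, autoencoder outputs).
    threshold : hypothetical cut-off, not taken from the paper.
    """
    edges = []
    for i in range(len(features)):
        for j in range(i + 1, len(features)):
            r = pearson(features[i], features[j])
            if r >= threshold:
                edges.append((i, j, r))
    return edges


# Toy latent features for three patients (hypothetical values).
patients = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.5],
    [4.0, 3.0, 2.0, 1.0],
]
edges = similarity_edges(patients)
```

Here only patients 0 and 1 are connected, since patient 2 is anti-correlated with both; the resulting edge list would define the graph on which the graph convolution layers operate.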
Affiliation(s)
- Can Liu
- School of Medical Informatics Engineering, Anhui University of Chinese Medicine, Hefei, Anhui, China
- Anhui Computer Application Research Institute of Chinese Medicine, China Academy of Chinese Medical Sciences, Hefei, Anhui, China
| | - Yuchen Duan
- School of Medical Informatics Engineering, Anhui University of Chinese Medicine, Hefei, Anhui, China
| | - Qingqing Zhou
- School of Medical Informatics Engineering, Anhui University of Chinese Medicine, Hefei, Anhui, China
| | - Yongkang Wang
- School of Medical Informatics Engineering, Anhui University of Chinese Medicine, Hefei, Anhui, China
- Anhui Computer Application Research Institute of Chinese Medicine, China Academy of Chinese Medical Sciences, Hefei, Anhui, China
| | - Yong Gao
- School of Medical Informatics Engineering, Anhui University of Chinese Medicine, Hefei, Anhui, China
- Anhui Computer Application Research Institute of Chinese Medicine, China Academy of Chinese Medical Sciences, Hefei, Anhui, China
| | - Hongxing Kan
- School of Medical Informatics Engineering, Anhui University of Chinese Medicine, Hefei, Anhui, China
- Anhui Computer Application Research Institute of Chinese Medicine, China Academy of Chinese Medical Sciences, Hefei, Anhui, China
| | - Jili Hu
- School of Medical Informatics Engineering, Anhui University of Chinese Medicine, Hefei, Anhui, China
- Anhui Computer Application Research Institute of Chinese Medicine, China Academy of Chinese Medical Sciences, Hefei, Anhui, China
|
31
|
Chen J, Rong W, Tao G, Cai H. Similarity Fusion via Exploiting High Order Proximity for Cancer Subtyping. IEEE/ACM Trans Comput Biol Bioinform 2023; 20:658-667. [PMID: 34971537 DOI: 10.1109/tcbb.2021.3139597] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 05/04/2023]
Abstract
Identifying cancer subtypes holds essential promise for improving prognosis and personalized treatment. Cancer subtyping based on multi-omics data has become a hotspot in bioinformatics research. One of the critical approaches of handling data heterogeneity in multi-omics data is first modeling each omics data as a separate similarity graph. Then, the information of multiple graphs is integrated into a unified graph. However, a significant challenge is how to measure the similarity of nodes in each graph and preserve cluster information of each graph. To that end, we exploit a new high order proximity in each graph and propose a similarity fusion method to fuse the high order proximity of multiple graphs while preserving cluster information of multiple graphs. Compared with the current techniques employing the first order proximity, exploiting high order proximity contributes to attaining accurate similarity. The proposed similarity fusion method makes full use of the complementary information from multi-omics data. Experiments in six benchmark multi-omics datasets and two individual cancer case studies confirm that our proposed method achieves statistically significant and biologically meaningful cancer subtypes.
|
32
|
Li B, Zhang F, Niu Q, Liu J, Yu Y, Wang P, Zhang S, Zhang H, Wang Z. A molecular classification of gastric cancer associated with distinct clinical outcomes and validated by an XGBoost-based prediction model. Mol Ther Nucleic Acids 2022; 31:224-240. [PMID: 36700042 PMCID: PMC9843270 DOI: 10.1016/j.omtn.2022.12.014] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 09/10/2022] [Accepted: 12/22/2022] [Indexed: 12/28/2022]
Abstract
Gastric cancer (GC) is a heterogeneous disease and a leading cause of cancer-related deaths. Discovering robust, clinically relevant molecular classifications is critical for guiding personalized therapies for GC. Here, we propose a refined molecular classification scheme for GC using integrated optimal algorithms and multi-omics data. Based on the important features of mRNA, microRNA, and DNA methylation data selected by the multivariate Cox regression model, three subtypes linked to distinct clinical outcomes were identified by combining similarity network fusion and consensus clustering methods. Three subtypes were validated by an extreme gradient boosting machine learning prediction model with 125 differentially expressed genes in multiple independent cohorts. The molecular characteristics of mutation signatures, characteristic gene sets, driver genes, and chemotherapy sensitivity for each subtype were also identified: subtype 1 was associated with favorable prognosis and characterized by high ARID1A and PIK3CA mutations, subtype 2 was associated with a poor prognosis and harbored high recurrent TP53 mutations, and subtype 3 was associated with high CHD1, APOA1 mutations, and a poor prognosis. The proposed three-subtype scheme achieved a better clinical prediction performance (area under the curve value = 0.71) than The Cancer Genome Atlas classification, which may provide a practical subtyping framework to improve the treatment of GC.
Affiliation(s)
- Bing Li
- Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing 100700, China
| | - Fengbin Zhang
- Department of Gastroenterology and Hepatology, The Fourth Hospital of Hebei Medical University, Shijiazhuang 050011, China
| | - Qikai Niu
- Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing 100700, China
| | - Jun Liu
- Institute of Basic Research in Clinical Medicine, China Academy of Chinese Medical Sciences, Beijing 100700, China
| | - Yanan Yu
- Institute of Basic Research in Clinical Medicine, China Academy of Chinese Medical Sciences, Beijing 100700, China
| | - Pengqian Wang
- Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing 100700, China
| | - Siqi Zhang
- Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing 100700, China
| | - Huamin Zhang
- Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing 100700, China,Corresponding author: Huamin Zhang, Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing 100700, China.
| | - Zhong Wang
- Institute of Basic Research in Clinical Medicine, China Academy of Chinese Medical Sciences, Beijing 100700, China,Corresponding author: Zhong Wang, Institute of Basic Research in Clinical Medicine, China Academy of Chinese Medical Sciences, Beijing 100700, China.
| |
Collapse
|
33
|
Evaluation and Comparison of Multi-Omics Data Integration Methods for Subtyping of Cutaneous Melanoma. Biomedicines 2022; 10:biomedicines10123240. [PMID: 36551996 PMCID: PMC9775581 DOI: 10.3390/biomedicines10123240] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 10/31/2022] [Revised: 11/29/2022] [Accepted: 12/07/2022] [Indexed: 12/15/2022]
Abstract
There is a growing number of multi-domain genomic datasets for human tumors. Multi-domain data are usually interpreted after separately analyzing single-domain data and integrating the results post hoc. Data fusion techniques allow for the real integration of multi-domain data to ideally improve the tumor classification results for the prognosis and prediction of response to therapy. We have previously described the joint singular value decomposition (jSVD) technique as a means of data fusion. Here, we report on the development of these methods in open source code based on R and Python and on the application of these data fusion methods. The Cancer Genome Atlas (TCGA) Skin Cutaneous Melanoma (SKCM) dataset was used as a benchmark to evaluate the potential of the data fusion approaches to improve molecular classification of cancers in a clinically relevant manner. Our data show that the data fusion approach does not generate classification results superior to those obtained using single-domain data. Data from different domains are not entirely independent from each other, and molecular classes are characterized by features that penetrate different domains. Data fusion techniques might be better suited for response prediction, where they could contribute to the identification of predictive features in a domain-independent manner to be used as biomarkers.
34
Raufaste-Cazavieille V, Santiago R, Droit A. Multi-omics analysis: Paving the path toward achieving precision medicine in cancer treatment and immuno-oncology. Front Mol Biosci 2022; 9:962743. [PMID: 36304921 PMCID: PMC9595279 DOI: 10.3389/fmolb.2022.962743] [Citation(s) in RCA: 25] [Impact Index Per Article: 8.3] [Received: 06/06/2022] [Accepted: 09/21/2022] [Indexed: 11/13/2022]
Abstract
The acceleration of large-scale sequencing and the progress in high-throughput computational analyses, collectively termed omics, marked a milestone in understanding the biological processes of human health and disease. In oncology, the omics approach, initiated by genomics and transcriptomics studies, has revealed remarkable complexity, with unsuspected molecular diversity within the same tumor type as well as spatial and temporal heterogeneity of tumors. The integration of multiple biological layers of omics studies brought oncology to a new paradigm, from tumor site classification to pan-cancer molecular classification, offering new therapeutic opportunities for precision medicine. In this review, we will provide a comprehensive overview of the latest innovations for multi-omics integration in oncology and summarize the largest multi-omics datasets available for adult and pediatric cancers. We will present multi-omics techniques for characterizing cancer biology and show how multi-omics data can be combined with clinical data for the identification of prognostic and treatment-specific biomarkers, opening the way to personalized therapy. To conclude, we will detail the newest strategies for dissecting the tumor immune environment and host–tumor interaction. We will explore the advances in immunomics and microbiomics for biomarker identification to guide therapeutic decisions in immuno-oncology.
Affiliation(s)
- Raoul Santiago
- CHU de Québec Research Center, Université Laval, Québec, QC, Canada
- Division of Pediatric Hematology-Oncology, Centre Hospitalier Universitaire de L’Université Laval, Charles Bruneau Cancer Center, Québec, QC, Canada
- Arnaud Droit
- CHU de Québec Research Center, Université Laval, Québec, QC, Canada
- *Correspondence: Raoul Santiago; Arnaud Droit
35
36
Suter P, Dazert E, Kuipers J, Ng CKY, Boldanova T, Hall MN, Heim MH, Beerenwinkel N. Multi-omics subtyping of hepatocellular carcinoma patients using a Bayesian network mixture model. PLoS Comput Biol 2022; 18:e1009767. [PMID: 36067230 PMCID: PMC9481159 DOI: 10.1371/journal.pcbi.1009767] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 12/21/2021] [Revised: 09/16/2022] [Accepted: 07/18/2022] [Indexed: 11/18/2022]
Abstract
Comprehensive molecular characterization of cancer subtypes is essential for predicting clinical outcomes and searching for personalized treatments. We present bnClustOmics, a statistical model and computational tool for multi-omics unsupervised clustering, which serves a dual purpose: Clustering patient samples based on a Bayesian network mixture model and learning the networks of omics variables representing these clusters. The discovered networks encode interactions among all omics variables and provide a molecular characterization of each patient subgroup. We conducted simulation studies that demonstrated the advantages of our approach compared to other clustering methods in the case where the generative model is a mixture of Bayesian networks. We applied bnClustOmics to a hepatocellular carcinoma (HCC) dataset comprising genome (mutation and copy number), transcriptome, proteome, and phosphoproteome data. We identified three main HCC subtypes together with molecular characteristics, some of which are associated with survival even when adjusting for the clinical stage. Cluster-specific networks shed light on the links between genotypes and molecular phenotypes of samples within their respective clusters and suggest targets for personalized treatments.
Affiliation(s)
- Polina Suter
- Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Eva Dazert
- Biozentrum, University of Basel, Basel, Switzerland
- Jack Kuipers
- Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Charlotte K. Y. Ng
- SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Department for BioMedical Research (DBMR), University of Bern, Bern, Switzerland
- Department of Biomedicine, University Hospital Basel, University of Basel, Basel, Switzerland
- Institute of Medical Genetics and Pathology, University Hospital Basel, University of Basel, Basel, Switzerland
- Tuyana Boldanova
- Department of Biomedicine, University Hospital Basel, University of Basel, Basel, Switzerland
- Markus H. Heim
- Department of Biomedicine, University Hospital Basel, University of Basel, Basel, Switzerland
- Department of Gastroenterology and Hepatology, Clarunis, University Center for Gastrointestinal and Liver Diseases, Basel, Switzerland
- Niko Beerenwinkel
- Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
37
Guo X, Han J, Song Y, Yin Z, Liu S, Shang X. Using expression quantitative trait loci data and graph-embedded neural networks to uncover genotype–phenotype interactions. Front Genet 2022; 13:921775. [PMID: 36046233 PMCID: PMC9421127 DOI: 10.3389/fgene.2022.921775] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/16/2022] [Accepted: 07/04/2022] [Indexed: 11/13/2022]
Abstract
Motivation: A central goal of current biology is to establish a complete functional link between the genotype and phenotype, the so-called genotype–phenotype map. With the continuous development of high-throughput technology and the decline in sequencing costs, multi-omics analysis has become more widely employed. While this gives us new opportunities to uncover the correlation mechanisms between single-nucleotide polymorphisms (SNPs), genes, and phenotypes, multi-omics still faces certain challenges, specifically: 1) when the sample size is large enough, the number of omics types is often not large enough to meet the requirements of multi-omics analysis; 2) each omics’ internal correlations are often unclear, such as the correlation between genes in genomics; 3) when analyzing a large number of traits (p), the sample size (n) is often smaller than p, n << p, hindering the application of machine learning methods in the classification of disease outcomes. Results: To solve these issues with multi-omics and build a robust classification model, we propose a graph-embedded deep neural network (G-EDNN) based on expression quantitative trait loci (eQTL) data, which achieves sparse connectivity between network layers to prevent overfitting. The correlation within each omics is also considered such that the model more closely resembles biological reality. To verify the capabilities of this method, we conducted experimental analysis using the GSE28127 and GSE95496 data sets from the Gene Expression Omnibus (GEO) database, tested various neural network architectures, and used prior data for feature selection and graph embedding. Results show that the proposed method could achieve a high classification accuracy and easy-to-interpret feature selection. This method represents an extended application of genotype–phenotype association analysis in deep learning networks.
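The notion of sparse connectivity between layers can be made concrete with a masked layer, sketched below. This is an illustrative assumption rather than the G-EDNN architecture itself: the functions `build_mask` and `masked_forward`, the SNP-to-gene edge list, and all numeric values are hypothetical. The sketch only shows how a prior graph (here, pretend eQTL pairs) can zero out every weight that has no supporting edge, which reduces the number of effective parameters.

```python
def build_mask(n_inputs, n_outputs, edges):
    """Binary mask with 1.0 only where a prior edge (input i -> output j) exists."""
    mask = [[0.0] * n_outputs for _ in range(n_inputs)]
    for i, j in edges:
        mask[i][j] = 1.0
    return mask


def masked_forward(x, weights, mask, bias):
    """One sparse layer: out_j = ReLU(bias_j + sum_i x_i * w_ij * mask_ij)."""
    out = []
    for j in range(len(bias)):
        total = bias[j]
        for i, xi in enumerate(x):
            total += xi * weights[i][j] * mask[i][j]  # masked weights contribute 0
        out.append(max(0.0, total))  # ReLU activation
    return out


# Hypothetical prior graph: SNP 0 and SNP 1 regulate gene 0, SNP 2 regulates gene 1.
edges = [(0, 0), (1, 0), (2, 1)]
mask = build_mask(n_inputs=3, n_outputs=2, edges=edges)

# The 9.0 entries sit on masked positions and therefore never influence the output.
weights = [[0.5, 9.0],
           [0.5, 9.0],
           [9.0, 2.0]]
bias = [0.0, -1.0]
out = masked_forward([1.0, 2.0, 3.0], weights, mask, bias)
```

In a trainable model the same mask would also be applied to the gradient (or re-applied after each update) so that the masked weights stay inactive; that part is left out of this forward-pass sketch.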
Affiliation(s)
- Xinpeng Guo
- School of Computer Science and Engineering, Northwestern Polytechnical University, Xi’an, China
- School of Air and Missile Defense, Air Force Engineering University, Xi’an, China
- Jinyu Han
- School of Economics and Management, Chang’an University, Xi’an, China
- Yafei Song
- School of Air and Missile Defense, Air Force Engineering University, Xi’an, China
- Zhilei Yin
- School of Computer Science and Engineering, Northwestern Polytechnical University, Xi’an, China
- Shuaichen Liu
- School of Marine Science and Technology, Northwestern Polytechnical University, Xi’an, China
- Xuequn Shang
- School of Computer Science and Engineering, Northwestern Polytechnical University, Xi’an, China
- *Correspondence: Xuequn Shang
38
A binary dandelion algorithm using seeding and chaos population strategies for feature selection. Appl Soft Comput 2022. [DOI: 10.1016/j.asoc.2022.109166] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2022]
39
Mass Spectrometry Imaging Spatial Tissue Analysis toward Personalized Medicine. Life (Basel, Switzerland) 2022; 12:life12071037. [PMID: 35888125 PMCID: PMC9318569 DOI: 10.3390/life12071037] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 06/20/2022] [Revised: 07/04/2022] [Accepted: 07/10/2022] [Indexed: 12/19/2022]
Abstract
Novel profiling methodologies are redefining the diagnostic capabilities and therapeutic approaches towards more precise and personalized healthcare. Complementary information can be obtained from different omic approaches in combination with the traditional macro- and microscopic analysis of the tissue, providing a more complete assessment of the disease. Mass spectrometry imaging, as a tissue typing approach, provides information on the molecular level directly measured from the tissue. Lipids, metabolites, glycans, and proteins can be used for better understanding imbalances in the DNA to RNA to protein translation, which leads to aberrant cellular behavior. Several studies have explored the capabilities of this technology to be applied to tumor subtyping, patient prognosis, and tissue profiling for intraoperative tissue evaluation. In the future, intercenter studies may provide the needed confirmation on the reproducibility, robustness, and applicability of the developed classification models for tissue characterization to assist in disease management.
40
Nemes E, Fiore-Gartland A, Boggiano C, Coccia M, D'Souza P, Gilbert P, Ginsberg A, Hyrien O, Laddy D, Makar K, McElrath MJ, Ramachandra L, Schmidt AC, Shororbani S, Sunshine J, Tomaras G, Yu WH, Scriba TJ, Frahm N. The quest for vaccine-induced immune correlates of protection against tuberculosis. Vaccine Insights 2022; 1:165-181. [PMID: 37091190 PMCID: PMC10117634 DOI: 10.18609/vac/2022.027] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Indexed: 04/25/2023]
Abstract
Immunization strategies against tuberculosis (TB) that confer better protection than neonatal vaccination with the 101-year-old Bacille Calmette-Guerin (BCG) are urgently needed to control the epidemic, but clinical development is hampered by a lack of established immune correlates of protection (CoPs). Two phase 2b clinical trials offer the first opportunity to discover human CoPs against TB. Adolescent BCG re-vaccination showed partial protection against Mycobacterium tuberculosis (Mtb) infection, as measured by sustained IFNγ release assay (IGRA) conversion. Adult M72/AS01E vaccination showed partial protection against pulmonary TB. We describe two collaborative research programs to discover CoPs against TB and ensure rigorous, streamlined use of available samples, involving international immunology experts in TB and state-of-the-art technologies, sponsors and funders. Hypotheses covering immune responses thought to be important in protection against TB have been defined and prioritized. A statistical framework to integrate the data analysis strategy was developed. Exploratory analyses will be performed to generate novel hypotheses.
Affiliation(s)
- Elisa Nemes
- South African Tuberculosis Vaccine Initiative, Division of Immunology, Department of Pathology and Institute of Infectious Disease and Molecular Medicine, University of Cape Town, Cape Town, South Africa
- Andrew Fiore-Gartland
- Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Center, Seattle, WA, USA
- Cesar Boggiano
- National Institute of Allergy and Infectious Diseases, National Institutes of Health
- Patricia D'Souza
- National Institute of Allergy and Infectious Diseases, National Institutes of Health
- Peter Gilbert
- Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Center, Seattle, WA, USA
- Ann Ginsberg
- Bill & Melinda Gates Foundation, Seattle, WA, USA
- Ollivier Hyrien
- Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Center, Seattle, WA, USA
- Karen Makar
- Bill & Melinda Gates Foundation, Seattle, WA, USA
- M Juliana McElrath
- Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Center, Seattle, WA, USA
- Lakshmi Ramachandra
- National Institute of Allergy and Infectious Diseases, National Institutes of Health
- Solmaz Shororbani
- Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Center, Seattle, WA, USA
- Justine Sunshine
- Bill & Melinda Gates Medical Research Institute, Cambridge, MA, USA
- Georgia Tomaras
- Duke Human Vaccine Institute, Duke University, Durham, NC, USA
- Wen-Han Yu
- Bill & Melinda Gates Medical Research Institute, Cambridge, MA, USA
- Thomas J Scriba
- South African Tuberculosis Vaccine Initiative, Division of Immunology, Department of Pathology and Institute of Infectious Disease and Molecular Medicine, University of Cape Town, Cape Town, South Africa
- Nicole Frahm
- Bill & Melinda Gates Medical Research Institute, Cambridge, MA, USA
41
Hill C, Avila-Palencia I, Maxwell AP, Hunter RF, McKnight AJ. Harnessing the Full Potential of Multi-Omic Analyses to Advance the Study and Treatment of Chronic Kidney Disease. Frontiers in Nephrology 2022; 2:923068. [PMID: 37674991 PMCID: PMC10479694 DOI: 10.3389/fneph.2022.923068] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/18/2022] [Accepted: 05/30/2022] [Indexed: 09/08/2023]
Abstract
Chronic kidney disease (CKD) was the 12th leading cause of death globally in 2017, with the prevalence of CKD estimated at ~9%. Early detection and intervention for CKD may improve patient outcomes, but standard testing approaches, even in developed countries, do not facilitate identification of patients at high risk of developing CKD, nor of those progressing to end-stage kidney disease (ESKD). Recent advances in CKD research are moving towards a more personalised approach for CKD. Heritability for CKD ranges from 30% to 75%, yet identified genetic risk factors account for only a small proportion of the inherited contribution to CKD. More in-depth analysis of genomic sequencing data in large cohorts is revealing new genetic risk factors for common diagnoses of CKD and providing novel diagnoses for rare forms of CKD. Multi-omic approaches are now being harnessed to improve our understanding of CKD and explain some of the so-called 'missing heritability'. The most common omic analyses employed for CKD are genomics, epigenomics, transcriptomics, metabolomics, proteomics and phenomics. While each of these omics has been reviewed individually, integrated multi-omic analysis offers considerable scope to improve our understanding and treatment of CKD. This narrative review summarises current understanding of multi-omic research alongside recent experimental and analytical approaches, discusses current challenges and future perspectives, and offers new insights for CKD.
Affiliation(s)
- Amy Jayne McKnight
- Centre for Public Health, Queen’s University Belfast, Belfast, United Kingdom
42
Integrated Multi-Omics Maps of Lower-Grade Gliomas. Cancers (Basel) 2022; 14:cancers14112797. [PMID: 35681780 PMCID: PMC9179546 DOI: 10.3390/cancers14112797] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 04/12/2022] [Revised: 05/18/2022] [Accepted: 05/31/2022] [Indexed: 02/01/2023]
Abstract
Multi-omics high-throughput technologies produce data sets that comprise multiple omics modalities rather than a single one, often from patient-matched tumour specimens. The integrative analysis of these omics modalities is essential to obtain a holistic view of the otherwise fragmented information hidden in these data. We present an intuitive method enabling the combined analysis of multi-omics data based on self-organizing maps machine learning. It "portrays" the expression, methylation and copy number variation (CNV) landscapes of each tumour using the same gene-centred coordinate system. It enables the visual evaluation and direct comparison of the different omics layers on a personalized basis. We applied this combined molecular portrayal to lower-grade gliomas, a heterogeneous brain tumour entity. This entity classifies into a series of molecular subtypes defined by genetic key lesions, which associate with large-scale effects on DNA methylation and gene expression and ultimately drive cell fate decisions towards oligodendroglioma-, astrocytoma- and glioblastoma-like cancer cell lineages with different prognoses. Consensus modes of concerted changes of expression, methylation and CNV are governed by the degree of co-regulation within and between the omics layers. The method is not restricted to the triple-omics data used here. The similarity landscapes reflect partly independent effects of genetic lesions and DNA methylation, with consequences for cancer hallmark characteristics such as proliferation, inflammation and blocked differentiation in a subtype-specific fashion. The method can be extended to integrate other omics features such as genetic mutation and protein expression data, as well as to extract prognostic markers.
43
Mo H, Breitling R, Francavilla C, Schwartz JM. Data integration and mechanistic modelling for breast cancer biology: Current state and future directions. Current Opinion in Endocrine and Metabolic Research 2022; 24. [PMID: 36034741 PMCID: PMC9402443 DOI: 10.1016/j.coemr.2022.100350] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/07/2022]
Abstract
Breast cancer is one of the most common cancers threatening women worldwide. A limited number of available treatment options, frequent recurrence, and drug resistance exacerbate the prognosis of breast cancer patients. Thus, there is an urgent need for methods to investigate novel treatment options, while taking into account the vast molecular heterogeneity of breast cancer. Recent advances in molecular profiling technologies, including genomics, epigenomics, transcriptomics, proteomics and metabolomics data, enable approaching breast cancer biology at multiple levels of omics interaction networks. Systems biology approaches, including computational inference of ‘big data’ and mechanistic modelling of specific pathways, are emerging to identify potential novel combinations of breast cancer subtype signatures and more diverse targeted therapies.
44
Yang B, Yang Y, Su X. Deep structure integrative representation of multi-omics data for cancer subtyping. Bioinformatics 2022; 38:3337-3342. [PMID: 35639657 DOI: 10.1093/bioinformatics/btac345] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 02/11/2022] [Revised: 04/22/2022] [Accepted: 05/17/2022] [Indexed: 01/01/2023]
Abstract
MOTIVATION: Cancer is a heterogeneous group of diseases. Cancer subtyping is a crucial step in diagnosis, prognosis and treatment. Since high-throughput sequencing technologies provide an unprecedented opportunity to rapidly collect multi-omics data for the same individuals, there is an urgent need to effectively represent and integrate these multi-omics data to achieve clinically meaningful cancer subtyping. RESULTS: We propose a novel deep learning model, called Deep Structure Integrative Representation (DSIR), for cancer subtype identification that integrates representation and clustering of multi-omics data. DSIR simultaneously captures the global structures in sparse subspace and local structures in manifold subspace from multi-omics data and constructs a consensus similarity matrix by utilizing deep neural networks. Extensive tests are performed in twelve different cancers on three levels of omics data from The Cancer Genome Atlas. The results demonstrate that DSIR obtains more significant performance than the state-of-the-art integrative methods. AVAILABILITY: https://github.com/Polytech-bioinf/Deep-structure-integrative-representation.git. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Bo Yang
- School of Computer Science, Xi'an Polytechnic University, Xi'an, 710048, China
- Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, M5S 3E1, ON, Canada
- Yan Yang
- School of Computer Science, Xi'an Polytechnic University, Xi'an, 710048, China
- Xueping Su
- School of Electronics and Information, Xi'an Polytechnic University, Xi'an, 710048, China
45
Hurgobin B, Lewsey MG. Applications of cell- and tissue-specific 'omics to improve plant productivity. Emerg Top Life Sci 2022; 6:163-173. [PMID: 35293572 PMCID: PMC9023014 DOI: 10.1042/etls20210286] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 11/20/2021] [Revised: 02/21/2022] [Accepted: 02/25/2022] [Indexed: 01/05/2023]
Abstract
The individual tissues and cell types of plants each have characteristic properties that contribute to the function of the plant as a whole. These are reflected by unique patterns of gene expression, protein and metabolite content, which enable cell-type-specific patterns of growth, development and physiology. Gene regulatory networks act within the cell types to govern the production and activity of these components. For the broader organism to grow and reproduce successfully, cell-type-specific activity must also function within the context of surrounding cell types, which is achieved by coordination of signalling pathways. We can investigate how gene regulatory networks are constructed and function using integrative 'omics technologies. Historically such experiments in plant biological research have been performed at the bulk tissue level, to organ resolution at best. In this review, we describe recent advances in cell- and tissue-specific 'omics technologies that allow investigation at much improved resolution. We discuss the advantages of these approaches for fundamental and translational plant biology, illustrated through the examples of specialised metabolism in medicinal plants and seed germination. We also discuss the challenges that must be overcome for such approaches to be adopted widely by the community.
Affiliation(s)
- Bhavna Hurgobin
- La Trobe Institute for Agriculture and Food, Department of Animal, Plant and Soil Sciences, School of Life Sciences, La Trobe University, AgriBio Building, Bundoora, VIC 3086, Australia
- Australian Research Council Research Hub for Medicinal Agriculture, La Trobe University, AgriBio Building, Bundoora, VIC 3086, Australia
- Mathew G. Lewsey
- La Trobe Institute for Agriculture and Food, Department of Animal, Plant and Soil Sciences, School of Life Sciences, La Trobe University, AgriBio Building, Bundoora, VIC 3086, Australia
- Australian Research Council Research Hub for Medicinal Agriculture, La Trobe University, AgriBio Building, Bundoora, VIC 3086, Australia
46
John Cremin C, Dash S, Huang X. Big Data: Historic Advances and Emerging Trends in Biomedical Research. Current Research in Biotechnology 2022. [DOI: 10.1016/j.crbiot.2022.02.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022]
47
Sevastre AS, Costachi A, Tataranu LG, Brandusa C, Artene SA, Stovicek O, Alexandru O, Danoiu S, Sfredel V, Dricu A. Glioblastoma pharmacotherapy: A multifaceted perspective of conventional and emerging treatments (Review). Exp Ther Med 2021; 22:1408. [PMID: 34676001 PMCID: PMC8524703 DOI: 10.3892/etm.2021.10844] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 04/08/2021] [Accepted: 09/21/2021] [Indexed: 12/13/2022]
Abstract
Due to its localisation, rapid onset, high relapse rate and resistance to most currently available treatment methods, glioblastoma multiforme (GBM) is considered to be the deadliest type of all gliomas. Although surgical resection, chemotherapy and radiotherapy are among the therapeutic strategies used for the treatment of GBM, the survival rates achieved are not satisfactory, and there is an urgent need for novel effective therapeutic options. In addition to single-target therapy, multi-target therapies are currently under development. Furthermore, drugs are being optimised to improve their ability to cross the blood-brain barrier. In the present review, the main strategies applied for GBM treatment in terms of the most recent therapeutic agents and approaches that are currently under pre-clinical and clinical testing were discussed. In addition, the most recently reported experimental data following the testing of novel therapies, including stem cell therapy, immunotherapy, gene therapy, genomic correction and precision medicine, were reviewed, and their advantages and drawbacks were also summarised.
Affiliation(s)
- Ani-Simona Sevastre
- Department of Pharmaceutical Technology, Faculty of Pharmacy, University of Medicine and Pharmacy of Craiova, 200349 Craiova, Romania
- Alexandra Costachi
- Department of Biochemistry, Faculty of Medicine, University of Medicine and Pharmacy of Craiova, 200349 Craiova, Romania
- Ligia Gabriela Tataranu
- Department of Neurosurgery, ‘Bagdasar-Arseni’ Emergency Clinical Hospital, 041915 Bucharest, Romania
- Corina Brandusa
- Department of Biochemistry, Faculty of Medicine, University of Medicine and Pharmacy of Craiova, 200349 Craiova, Romania
- Stefan Alexandru Artene
- Department of Biochemistry, Faculty of Medicine, University of Medicine and Pharmacy of Craiova, 200349 Craiova, Romania
- Olivian Stovicek
- Department of Pharmacology, Faculty of Nursing Targu Jiu, Titu Maiorescu University of Bucharest, 210106 Targu Jiu, Romania
- Oana Alexandru
- Department of Neurology, Faculty of Medicine, University of Medicine and Pharmacy of Craiova, 200349 Craiova, Romania
- Suzana Danoiu
- Department of Pathophysiology, Faculty of Medicine, University of Medicine and Pharmacy of Craiova, 200349 Craiova, Romania
- Veronica Sfredel
- Department of Physiology, Faculty of Medicine, University of Medicine and Pharmacy of Craiova, 200349 Craiova, Romania
- Anica Dricu
- Department of Biochemistry, Faculty of Medicine, University of Medicine and Pharmacy of Craiova, 200349 Craiova, Romania
48
Liu Q, Cheng B, Jin Y, Hu P. Bayesian tensor factorization-drive breast cancer subtyping by integrating multi-omics data. J Biomed Inform 2021; 125:103958. [PMID: 34839017 DOI: 10.1016/j.jbi.2021.103958] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 03/20/2021] [Revised: 10/13/2021] [Accepted: 11/19/2021] [Indexed: 12/12/2022]
Abstract
Breast cancer is a highly heterogeneous disease. Subtyping the disease and identifying the genomic features driving these subtypes are critical for precision oncology for breast cancer. This study focuses on developing a new computational approach for breast cancer subtyping. We proposed to use Bayesian tensor factorization (BTF) to integrate multi-omics data of breast cancer, which include expression profiles of RNA-sequencing, copy number variation, and DNA methylation measured on 762 breast cancer patients from The Cancer Genome Atlas. We applied a consensus clustering approach to identify breast cancer subtypes using the factorized latent features by BTF. Subtype-specific survival patterns of the breast cancer patients were evaluated using Kaplan-Meier (KM) estimators. The proposed approach was compared with other state-of-the-art approaches for cancer subtyping. The BTF-subtyping analysis identified 17 optimized latent components, which were used to reveal six major breast cancer subtypes. Out of all different approaches, only the proposed approach showed distinct survival patterns (p < 0.05). Statistical tests also showed that the identified clusters have statistically significant distributions. Our results showed that the proposed approach is a promising strategy to efficiently use publicly available multi-omics data to identify breast cancer subtypes.
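The abstract above evaluates subtype-specific survival with Kaplan-Meier estimators. As a rough illustration of that one step only, the following is a minimal sketch of the product-limit estimator in plain Python. The function name `kaplan_meier`, the variable names, and the toy follow-up times are all hypothetical and are not taken from the study; the authors' actual implementation is not described in the abstract.

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each distinct event time.

    durations: follow-up time for each patient
    events:    1 if the event (e.g. death) was observed, 0 if censored
    """
    if len(durations) != len(events):
        raise ValueError("durations and events must have the same length")

    # Number of observed events at each time point.
    deaths = Counter(t for t, e in zip(durations, events) if e)
    # Every patient leaves the risk set at their own follow-up time,
    # whether the event was observed or the patient was censored.
    exits = Counter(durations)

    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Product-limit update: S(t) = S(t-) * (1 - d / n_at_risk)
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


# Hypothetical toy data for two made-up subtypes (not from the study).
subtype_a = kaplan_meier([5, 8, 12, 12, 20], [1, 0, 1, 1, 0])
subtype_b = kaplan_meier([2, 3, 3, 7, 9], [1, 1, 1, 0, 1])

for name, curve in (("A", subtype_a), ("B", subtype_b)):
    for t, s in curve:
        print(f"subtype {name}: S({t}) = {s:.3f}")
```

In practice, curves estimated per subtype in this way would be compared with a formal test such as the log-rank test, which this sketch does not include.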
Affiliation(s)
- Qian Liu
- Department of Biochemistry and Medical Genetics, University of Manitoba, Winnipeg, Canada; Department of Computer Science, University of Manitoba, Winnipeg, Manitoba, Canada
| | - Bowen Cheng
- Dalla Lana School of Public Health, University of Toronto, Toronto, Ontario, Canada
| | - Yongwon Jin
- Department of Biochemistry and Medical Genetics, University of Manitoba, Winnipeg, Canada
| | - Pingzhao Hu
- Department of Biochemistry and Medical Genetics, University of Manitoba, Winnipeg, Canada; Department of Computer Science, University of Manitoba, Winnipeg, Manitoba, Canada; Dalla Lana School of Public Health, University of Toronto, Toronto, Ontario, Canada; CancerCare Manitoba Research Institute, Winnipeg, Manitoba, Canada.
| |
|
49
|
Subbannayya Y, Di Fiore R, Urru SAM, Calleja-Agius J. The Role of Omics Approaches to Characterize Molecular Mechanisms of Rare Ovarian Cancers: Recent Advances and Future Perspectives. Biomedicines 2021; 9:1481. [PMID: 34680597 PMCID: PMC8533212 DOI: 10.3390/biomedicines9101481] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 09/20/2021] [Revised: 10/05/2021] [Accepted: 10/12/2021] [Indexed: 01/02/2023]
Abstract
Rare ovarian cancers are ovarian cancers with an annual incidence of less than 6 cases per 100,000 women. They generally have a poor prognosis due to delayed diagnosis and treatment. Exploration of molecular mechanisms in these cancers has been challenging due to their rarity and research efforts being fragmented across the world. Omics approaches can provide detailed molecular snapshots of the underlying mechanisms of these cancers. Omics approaches, including genomics, transcriptomics, proteomics, and metabolomics, can identify potential candidate biomarkers for diagnosis, prognosis, and screening of rare gynecological cancers and can aid in identifying therapeutic targets. The integration of multiple omics techniques using approaches such as proteogenomics can provide a detailed understanding of the molecular mechanisms of carcinogenesis and cancer progression. Further, omics approaches can provide clues towards developing immunotherapies and understanding cancer recurrence and drug resistance in tumors, and can form a platform for personalized medicine. The current review focuses on the application of omics approaches and integrative biology to gain a better understanding of rare ovarian cancers.
Affiliation(s)
- Yashwanth Subbannayya
- Centre of Molecular Inflammation Research (CEMIR), Department of Clinical and Molecular Medicine (IKOM), Norwegian University of Science and Technology, 7491 Trondheim, Norway
| | - Riccardo Di Fiore
- Department of Anatomy, Faculty of Medicine and Surgery, University of Malta, MSD 2080 Msida, Malta;
- Sbarro Institute for Cancer Research and Molecular Medicine, Center for Biotechnology, College of Science and Technology, Temple University, Philadelphia, PA 19122, USA
| | - Silvana Anna Maria Urru
- Hospital Pharmacy Unit, Trento General Hospital, Autonomous Province of Trento, 38122 Trento, Italy;
- Department of Chemistry and Pharmacy, School of Hospital Pharmacy, University of Sassari, 07100 Sassari, Italy
| | - Jean Calleja-Agius
- Department of Anatomy, Faculty of Medicine and Surgery, University of Malta, MSD 2080 Msida, Malta
| |
|