1
Vitorino R. Transforming Clinical Research: The Power of High-Throughput Omics Integration. Proteomes 2024; 12:25. [PMID: 39311198 PMCID: PMC11417901 DOI: 10.3390/proteomes12030025] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/17/2024] [Revised: 08/31/2024] [Accepted: 09/02/2024] [Indexed: 09/26/2024]
Abstract
High-throughput omics technologies have dramatically changed biological research, providing unprecedented insights into the complexity of living systems. This review presents a comprehensive examination of the current landscape of high-throughput omics pipelines, covering key technologies, data integration techniques and their diverse applications. It looks at advances in next-generation sequencing, mass spectrometry and microarray platforms and highlights their contribution to data volume and precision. In addition, this review looks at the critical role of bioinformatics tools and statistical methods in managing the large datasets generated by these technologies. By integrating multi-omics data, researchers can gain a holistic understanding of biological systems, leading to the identification of new biomarkers and therapeutic targets, particularly in complex diseases such as cancer. The review also looks at the integration of omics data into electronic health records (EHRs) and the potential for cloud computing and big data analytics to improve data storage, analysis and sharing. Despite significant advances, there are still challenges such as data complexity, technical limitations and ethical issues. Future directions include the development of more sophisticated computational tools and the application of advanced machine learning techniques, which are critical for addressing the complexity and heterogeneity of omics datasets. This review aims to serve as a valuable resource for researchers and practitioners, highlighting the transformative potential of high-throughput omics technologies in advancing personalized medicine and improving clinical outcomes.
Affiliation(s)
- Rui Vitorino
  - iBiMED, Department of Medical Sciences, University of Aveiro, 3810-193 Aveiro, Portugal
  - Department of Surgery and Physiology, Cardiovascular R&D Centre—UnIC@RISE, Faculty of Medicine, University of Porto, 4200-319 Porto, Portugal
2
Maier A, Hartung M, Abovsky M, Adamowicz K, Bader G, Baier S, Blumenthal D, Chen J, Elkjaer M, Garcia-Hernandez C, Helmy M, Hoffmann M, Jurisica I, Kotlyar M, Lazareva O, Levi H, List M, Lobentanzer S, Loscalzo J, Malod-Dognin N, Manz Q, Matschinske J, Mee M, Oubounyt M, Pastrello C, Pico A, Pillich R, Poschenrieder J, Pratt D, Pržulj N, Sadegh S, Saez-Rodriguez J, Sarkar S, Shaked G, Shamir R, Trummer N, Turhan U, Wang RS, Zolotareva O, Baumbach J. Drugst.One - a plug-and-play solution for online systems medicine and network-based drug repurposing. Nucleic Acids Res 2024; 52:W481-W488. [PMID: 38783119 PMCID: PMC11223884 DOI: 10.1093/nar/gkae388] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2024] [Revised: 04/08/2024] [Accepted: 04/29/2024] [Indexed: 05/25/2024]
Abstract
In recent decades, the development of new drugs has become increasingly expensive and inefficient, and the molecular mechanisms of most pharmaceuticals remain poorly understood. In response, computational systems and network medicine tools have emerged to identify potential drug repurposing candidates. However, these tools often require complex installation and lack intuitive visual network mining capabilities. To tackle these challenges, we introduce Drugst.One, a platform that assists specialized computational medicine tools in becoming user-friendly, web-based utilities for drug repurposing. With just three lines of code, Drugst.One turns any systems biology software into an interactive web tool for modeling and analyzing complex protein-drug-disease networks. Demonstrating its broad adaptability, Drugst.One has been successfully integrated with 21 computational systems medicine tools. Available at https://drugst.one, Drugst.One has significant potential for streamlining the drug discovery process, allowing researchers to focus on essential aspects of pharmaceutical treatment research.
Affiliation(s)
- Andreas Maier
  - Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany
- Michael Hartung
  - Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany
- Mark Abovsky
  - Division of Orthopaedic Surgery, Schroeder Arthritis Institute, Toronto, Canada
  - Data Science Discovery Centre for Chronic Diseases, Krembil Research Institute, Toronto, ON M5T 0S8, Canada
- Klaudia Adamowicz
  - Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany
- Gary D Bader
  - Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
  - The Donnelly Centre, University of Toronto, Toronto, ON, Canada
  - Department of Computer Science, University of Toronto, Toronto, ON, Canada
  - Princess Margaret Cancer Centre, University Health Network, Toronto, ON, Canada
  - The Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, ON, Canada
- Sylvie Baier
  - Data Science in Systems Biology, TUM School of Life Sciences, Technical University of Munich, Munich, Germany
- David B Blumenthal
  - Department Artificial Intelligence in Biomedical Engineering (AIBE), Friedrich-Alexander University Erlangen-Nürnberg (FAU), 91052 Erlangen, Germany
- Jing Chen
  - Department of Medicine, University of California San Diego, 9500 Gilman Drive, La Jolla, CA 92093, USA
- Maria L Elkjaer
  - Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany
  - Department of Neurology, Odense University Hospital, Odense, Denmark
  - Institute of Clinical Research, University of Southern Denmark, Odense, Denmark
  - Institute of Molecular Medicine, University of Southern Denmark, Odense, Denmark
- Mohamed Helmy
  - Vaccine and Infectious Disease Organization (VIDO), University of Saskatchewan, Canada
  - School of Public Health, University of Saskatchewan, Canada
  - Department of Computer Science, University of Saskatchewan, Canada
  - Department of Computer Science, Lakehead University, Canada
  - Department of Computer Science, Idaho State University, USA
  - Bioinformatics Institute (BII), A*STAR, Singapore
- Markus Hoffmann
  - Data Science in Systems Biology, TUM School of Life Sciences, Technical University of Munich, Munich, Germany
  - Institute for Advanced Study, Technical University of Munich, Germany
  - National Institute of Diabetes, Digestive, and Kidney Diseases, Bethesda, MD 20892, USA
- Igor Jurisica
  - Division of Orthopaedic Surgery, Schroeder Arthritis Institute, Toronto, Canada
  - Data Science Discovery Centre for Chronic Diseases, Krembil Research Institute, Toronto, ON M5T 0S8, Canada
  - Departments of Medical Biophysics and Computer Science, University of Toronto, Toronto, Canada
  - Institute of Neuroimmunology, Slovak Academy of Sciences, Bratislava, Slovakia
- Max Kotlyar
  - Division of Orthopaedic Surgery, Schroeder Arthritis Institute, Toronto, Canada
  - Data Science Discovery Centre for Chronic Diseases, Krembil Research Institute, Toronto, ON M5T 0S8, Canada
- Olga Lazareva
  - Division of Computational Genomics and Systems Genetics, German Cancer Research Center (DKFZ), 69120 Heidelberg, Germany
  - Junior Clinical Cooperation Unit Multiparametric methods for early detection of prostate cancer, German Cancer Research Center (DKFZ), Heidelberg, Germany
  - European Molecular Biology Laboratory, Genome Biology Unit, 69117 Heidelberg, Germany
- Hagai Levi
  - Blavatnik School of Computer Science, Tel-Aviv University, Tel-Aviv, Israel
- Markus List
  - Data Science in Systems Biology, TUM School of Life Sciences, Technical University of Munich, Munich, Germany
- Sebastian Lobentanzer
  - Heidelberg University, Faculty of Medicine, and Heidelberg University Hospital, Institute for Computational Biomedicine, Bioquant, Heidelberg, Germany
- Joseph Loscalzo
  - Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA 02115, USA
- Quirin Manz
  - Data Science in Systems Biology, TUM School of Life Sciences, Technical University of Munich, Munich, Germany
- Julian Matschinske
  - Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany
  - Data Science in Systems Biology, TUM School of Life Sciences, Technical University of Munich, Munich, Germany
- Miles Mee
  - Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
  - The Donnelly Centre, University of Toronto, Toronto, ON, Canada
  - Department of Computer Science, University of Toronto, Toronto, ON, Canada
- Mhaned Oubounyt
  - Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany
- Chiara Pastrello
  - Division of Orthopaedic Surgery, Schroeder Arthritis Institute, Toronto, Canada
  - Data Science Discovery Centre for Chronic Diseases, Krembil Research Institute, Toronto, ON M5T 0S8, Canada
- Alexander R Pico
  - Institute of Data Science and Biotechnology, Gladstone Institutes, 1650 Owens Street, San Francisco, 94158 California, USA
- Rudolf T Pillich
  - Department of Medicine, University of California San Diego, 9500 Gilman Drive, La Jolla, CA 92093, USA
- Julian M Poschenrieder
  - Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany
  - Data Science in Systems Biology, TUM School of Life Sciences, Technical University of Munich, Munich, Germany
- Dexter Pratt
  - Department of Medicine, University of California San Diego, 9500 Gilman Drive, La Jolla, CA 92093, USA
- Nataša Pržulj
  - Barcelona Supercomputing Center (BSC), 08034 Barcelona, Spain
  - Department of Computer Science, University College London, London WC1E 6BT, UK
  - ICREA, Pg. Lluís Companys 23, 08010 Barcelona, Spain
- Sepideh Sadegh
  - Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany
  - Data Science in Systems Biology, TUM School of Life Sciences, Technical University of Munich, Munich, Germany
  - Department of Clinical Genetics, Odense University Hospital, Odense, Denmark
  - Clinical Genome Center, Department of Clinical Research, University of Southern Denmark, Odense, Denmark
- Julio Saez-Rodriguez
  - Heidelberg University, Faculty of Medicine, and Heidelberg University Hospital, Institute for Computational Biomedicine, Bioquant, Heidelberg, Germany
- Suryadipto Sarkar
  - Department Artificial Intelligence in Biomedical Engineering (AIBE), Friedrich-Alexander University Erlangen-Nürnberg (FAU), 91052 Erlangen, Germany
- Gideon Shaked
  - Blavatnik School of Computer Science, Tel-Aviv University, Tel-Aviv, Israel
- Ron Shamir
  - Blavatnik School of Computer Science, Tel-Aviv University, Tel-Aviv, Israel
- Nico Trummer
  - Data Science in Systems Biology, TUM School of Life Sciences, Technical University of Munich, Munich, Germany
- Ugur Turhan
  - Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany
- Rui-Sheng Wang
  - Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA 02115, USA
- Olga Zolotareva
  - Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany
  - Data Science in Systems Biology, TUM School of Life Sciences, Technical University of Munich, Munich, Germany
- Jan Baumbach
  - Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany
  - Computational Biomedicine Lab, Department of Mathematics and Computer Science, University of Southern Denmark, Odense, Denmark
3
Adamowicz K, Arend L, Maier A, Schmidt JR, Kuster B, Tsoy O, Zolotareva O, Baumbach J, Laske T. Proteomic meta-study harmonization, mechanotyping and drug repurposing candidate prediction with ProHarMeD. NPJ Syst Biol Appl 2023; 9:49. [PMID: 37816770 PMCID: PMC10564802 DOI: 10.1038/s41540-023-00311-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2023] [Accepted: 09/25/2023] [Indexed: 10/12/2023]
Abstract
Proteomics technologies, which include a diverse range of approaches such as mass spectrometry-based, array-based, and others, are key technologies for the identification of biomarkers and disease mechanisms, referred to as mechanotyping. Despite over 15,000 published studies in 2022 alone, leveraging publicly available proteomics data for biomarker identification, mechanotyping and drug target identification is not readily possible. Proteomic data addressing similar biological/biomedical questions are made available by multiple research groups in different locations using different model organisms. Furthermore, not only various organisms are employed but different assay systems, such as in vitro and in vivo systems, are used. Finally, even though proteomics data are deposited in public databases, such as ProteomeXchange, they are provided at different levels of detail. Thus, data integration is hampered by non-harmonized usage of identifiers when reviewing the literature or performing meta-analyses to consolidate existing publications into a joint picture. To address this problem, we present ProHarMeD, a tool for harmonizing and comparing proteomics data gathered in multiple studies and for the extraction of disease mechanisms and putative drug repurposing candidates. It is available as a website, Python library and R package. ProHarMeD facilitates ID and name conversions between protein and gene levels, or organisms via ortholog mapping, and provides detailed logs on the loss and gain of IDs after each step. The web tool further determines IDs shared by different studies, proposes potential disease mechanisms as well as drug repurposing candidates automatically, and visualizes these results interactively. We apply ProHarMeD to a set of four studies on bone regeneration. First, we demonstrate the benefit of ID harmonization which increases the number of shared genes between studies by 50%. Second, we identify a potential disease mechanism, with five corresponding drug targets, and the top 20 putative drug repurposing candidates, of which Fondaparinux, the candidate with the highest score, and multiple others are known to have an impact on bone regeneration. Hence, ProHarMeD allows users to harmonize multi-centric proteomics research data in meta-analyses, evaluates the success of the ID conversions and remappings, and finally, it closes the gaps between proteomics, disease mechanism mining and drug repurposing. It is publicly available at https://apps.cosy.bio/proharmed/.
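As a rough illustration of why identifier harmonization can raise the number of genes shared between studies, consider the minimal sketch below. It does not use the ProHarMeD library or reproduce its method; the mapping table `PROTEIN_TO_GENE`, the helper `harmonize`, the placeholder ID `Q00000`, and both toy studies are assumptions made up for this example.

```python
# Toy lookup from protein accessions to gene symbols (hypothetical table,
# standing in for a real ID-mapping resource).
PROTEIN_TO_GENE = {
    "P04637": "TP53",
    "P38398": "BRCA1",
    "P00533": "EGFR",
}


def harmonize(ids):
    """Map every ID to a gene symbol; return (mapped symbols, unmapped IDs)."""
    known_symbols = set(PROTEIN_TO_GENE.values())
    mapped, lost = set(), []
    for identifier in ids:
        if identifier in PROTEIN_TO_GENE:
            mapped.add(PROTEIN_TO_GENE[identifier])
        elif identifier in known_symbols:
            mapped.add(identifier)  # already a gene symbol
        else:
            lost.append(identifier)  # kept so the loss of IDs stays visible
    return mapped, lost


# Two hypothetical studies reporting the same biology with different ID types.
study_a = ["P04637", "P38398", "Q00000"]  # protein accessions; Q00000 is a placeholder
study_b = ["TP53", "EGFR", "BRCA1"]       # gene symbols

shared_raw = set(study_a) & set(study_b)
genes_a, lost_a = harmonize(study_a)
genes_b, lost_b = harmonize(study_b)
shared_harmonized = genes_a & genes_b

print("shared before harmonization:", sorted(shared_raw))
print("shared after harmonization:", sorted(shared_harmonized))
print("unmapped in study A:", lost_a)
```

Before harmonization the two toy studies share no identifiers, although they report overlapping genes; after mapping both to gene symbols the overlap becomes visible, and the list of unmapped IDs plays the role of a conversion log.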
Affiliation(s)
- Klaudia Adamowicz
  - Institute for Computational Systems Biology, University of Hamburg, Hamburg, 22607, Germany
- Lis Arend
  - Institute for Computational Systems Biology, University of Hamburg, Hamburg, 22607, Germany
- Andreas Maier
  - Institute for Computational Systems Biology, University of Hamburg, Hamburg, 22607, Germany
- Johannes R Schmidt
  - Department of Preclinical Development and Validation, Fraunhofer Institute for Cell Therapy and Immunology IZI, Leipzig, Germany
- Bernhard Kuster
  - Chair of Proteomics and Bioanalytics, Technical University of Munich, Freising, Germany
- Olga Tsoy
  - Institute for Computational Systems Biology, University of Hamburg, Hamburg, 22607, Germany
- Olga Zolotareva
  - Institute for Computational Systems Biology, University of Hamburg, Hamburg, 22607, Germany
  - Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, Freising, Germany
- Jan Baumbach
  - Institute for Computational Systems Biology, University of Hamburg, Hamburg, 22607, Germany
  - Department of Mathematics and Computer Science, University of Southern Denmark, Odense, 5230, Denmark
- Tanja Laske
  - Institute for Computational Systems Biology, University of Hamburg, Hamburg, 22607, Germany
4
Sarkar S, Lucchetta M, Maier A, Abdrabbou MM, Baumbach J, List M, Schaefer MH, Blumenthal DB. Online bias-aware disease module mining with ROBUST-Web. Bioinformatics 2023; 39:btad345. [PMID: 37233198 PMCID: PMC10246579 DOI: 10.1093/bioinformatics/btad345] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/14/2022] [Revised: 04/24/2023] [Accepted: 05/25/2023] [Indexed: 05/27/2023]
Abstract
Summary: We present ROBUST-Web, which implements our recently presented ROBUST disease module mining algorithm in a user-friendly web application. ROBUST-Web features seamless downstream disease module exploration via integrated gene set enrichment analysis, tissue expression annotation, and visualization of drug-protein and disease-gene links. Moreover, ROBUST-Web includes bias-aware edge costs for the underlying Steiner tree model as a new algorithmic feature, which allows correcting for study bias in protein-protein interaction networks and further improves the robustness of the computed modules. Availability and implementation: Web application: https://robust-web.net. Source code of web application and Python package with new bias-aware edge costs: https://github.com/bionetslab/robust-web, https://github.com/bionetslab/robust_bias_aware.
Affiliation(s)
- Suryadipto Sarkar
  - Biomedical Network Science Lab, Department of Artificial Intelligence in Biomedical Engineering, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen 91301, Germany
- Marta Lucchetta
  - Department of Experimental Oncology, IEO European Institute of Oncology IRCCS, Milan 20139, Italy
- Andreas Maier
  - Institute for Computational Systems Biology, University of Hamburg, Hamburg 22607, Germany
- Mohamed M Abdrabbou
  - Biomedical Network Science Lab, Department of Artificial Intelligence in Biomedical Engineering, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen 91301, Germany
- Jan Baumbach
  - Institute for Computational Systems Biology, University of Hamburg, Hamburg 22607, Germany
- Markus List
  - Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, Freising 85354, Germany
- Martin H Schaefer
  - Department of Experimental Oncology, IEO European Institute of Oncology IRCCS, Milan 20139, Italy
- David B Blumenthal
  - Biomedical Network Science Lab, Department of Artificial Intelligence in Biomedical Engineering, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen 91301, Germany
5
Sadegh S, Skelton J, Anastasi E, Maier A, Adamowicz K, Möller A, Kriege NM, Kronberg J, Haller T, Kacprowski T, Wipat A, Baumbach J, Blumenthal DB. Lacking mechanistic disease definitions and corresponding association data hamper progress in network medicine and beyond. Nat Commun 2023; 14:1662. [PMID: 36966134 PMCID: PMC10039912 DOI: 10.1038/s41467-023-37349-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 07/29/2022] [Accepted: 03/13/2023] [Indexed: 03/27/2023]
Abstract
A long-term objective of network medicine is to replace our current, mainly phenotype-based disease definitions by subtypes of health conditions corresponding to distinct pathomechanisms. For this, molecular and health data are modeled as networks and are mined for pathomechanisms. However, many such studies rely on large-scale disease association data where diseases are annotated using the very phenotype-based disease definitions the network medicine field aims to overcome. This raises the question to which extent the biases mechanistically inadequate disease annotations introduce in disease association data distort the results of studies which use such data for pathomechanism mining. We address this question using global- and local-scale analyses of networks constructed from disease association data of various types. Our results indicate that large-scale disease association data should be used with care for pathomechanism mining and that analyses of such data should be accompanied by close-up analyses of molecular data for well-characterized patient cohorts.
Affiliation(s)
- Sepideh Sadegh
  - Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, Munich, Germany
  - Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany
- James Skelton
  - School of Computing, Newcastle University, Newcastle upon Tyne, UK
- Elisa Anastasi
  - School of Computing, Newcastle University, Newcastle upon Tyne, UK
- Andreas Maier
  - Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany
- Klaudia Adamowicz
  - Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany
- Anna Möller
  - Biomedical Network Science Lab, Department Artificial Intelligence in Biomedical Engineering, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
- Nils M Kriege
  - Faculty of Computer Science, University of Vienna, Vienna, Austria
  - Research Network Data Science, University of Vienna, Vienna, Austria
- Jaanika Kronberg
  - Estonian Genome Centre, Institute of Genomics, University of Tartu, Tartu, Estonia
- Toomas Haller
  - Estonian Genome Centre, Institute of Genomics, University of Tartu, Tartu, Estonia
- Tim Kacprowski
  - Division Data Science in Biomedicine, Peter L. Reichertz Institute for Medical Informatics of Technische Universität Braunschweig and Hannover Medical School, Braunschweig, Germany
  - Braunschweig Integrated Centre of Systems Biology (BRICS), TU Braunschweig, Braunschweig, Germany
- Anil Wipat
  - School of Computing, Newcastle University, Newcastle upon Tyne, UK
- Jan Baumbach
  - Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany
  - Computational Biomedicine Lab, Department of Mathematics and Computer Science, University of Southern Denmark, Odense, Denmark
- David B Blumenthal
  - Biomedical Network Science Lab, Department Artificial Intelligence in Biomedical Engineering, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
6
Jia X, Yin Z, Peng Y. Gene differential co-expression analysis of male infertility patients based on statistical and machine learning methods. Front Microbiol 2023; 14:1092143. [PMID: 36778885 PMCID: PMC9911419 DOI: 10.3389/fmicb.2023.1092143] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2022] [Accepted: 01/11/2023] [Indexed: 01/28/2023]
Abstract
Male infertility is one of the important factors contributing to infertility in couples of childbearing age. The causes of male infertility include living habits, hereditary factors, etc. Identifying the genetic causes of male infertility can help us understand its biology and can support genetic diagnosis and the choice of clinical treatment options. While current research has made significant progress on the genes that cause sperm defects in men, genetic studies of sperm content defects are still lacking. This article is based on a dataset of X-chromosome gene expression in patients with azoospermia and with mild and severe oligospermia. Because disease severity differs between patients and the genetic causes may differ as well, common classical clustering methods such as k-means and hierarchical clustering cannot effectively identify samples; that is, they cannot cluster samples and features simultaneously. In this paper, we use machine learning and various statistical methods, such as the hypergeometric distribution, Gibbs sampling, and the Fisher test, together with a gene interaction network, for cluster analysis of gene expression data of male infertility patients; this approach has certain advantages over existing methods. The clusters were identified by differential co-expression analysis of the gene expression data, and the clusters recognized by the model were analyzed by multiple gene enrichment methods, showing different degrees of enrichment in various enzyme activities, cancer, virus-related, ATP and ADP production, and other pathways. Because this paper is an unsupervised analysis of genetic factors in male infertility patients, we also constructed a simulated dataset with known clustering results, which can be used to measure how well the model recognizes clusters. The comparison shows that the proposed model achieves better identification performance.
7
Suter P, Dazert E, Kuipers J, Ng CKY, Boldanova T, Hall MN, Heim MH, Beerenwinkel N. Multi-omics subtyping of hepatocellular carcinoma patients using a Bayesian network mixture model. PLoS Comput Biol 2022; 18:e1009767. [PMID: 36067230 PMCID: PMC9481159 DOI: 10.1371/journal.pcbi.1009767] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 12/21/2021] [Revised: 09/16/2022] [Accepted: 07/18/2022] [Indexed: 11/18/2022]
Abstract
Comprehensive molecular characterization of cancer subtypes is essential for predicting clinical outcomes and searching for personalized treatments. We present bnClustOmics, a statistical model and computational tool for multi-omics unsupervised clustering, which serves a dual purpose: Clustering patient samples based on a Bayesian network mixture model and learning the networks of omics variables representing these clusters. The discovered networks encode interactions among all omics variables and provide a molecular characterization of each patient subgroup. We conducted simulation studies that demonstrated the advantages of our approach compared to other clustering methods in the case where the generative model is a mixture of Bayesian networks. We applied bnClustOmics to a hepatocellular carcinoma (HCC) dataset comprising genome (mutation and copy number), transcriptome, proteome, and phosphoproteome data. We identified three main HCC subtypes together with molecular characteristics, some of which are associated with survival even when adjusting for the clinical stage. Cluster-specific networks shed light on the links between genotypes and molecular phenotypes of samples within their respective clusters and suggest targets for personalized treatments.
Affiliation(s)
- Polina Suter
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
  - SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Eva Dazert
  - Biozentrum, University of Basel, Basel, Switzerland
- Jack Kuipers
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
  - SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Charlotte K. Y. Ng
  - SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
  - Department for BioMedical Research (DBMR), University of Bern, Bern, Switzerland
  - Department of Biomedicine, University Hospital Basel, University of Basel, Basel, Switzerland
  - Institute of Medical Genetics and Pathology, University Hospital Basel, University of Basel, Basel, Switzerland
- Tuyana Boldanova
  - Department of Biomedicine, University Hospital Basel, University of Basel, Basel, Switzerland
- Markus H. Heim
  - Department of Biomedicine, University Hospital Basel, University of Basel, Basel, Switzerland
  - Department of Gastroenterology and Hepatology, Clarunis, University Center for Gastrointestinal and Liver Diseases, Basel, Switzerland
- Niko Beerenwinkel
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
  - SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
8
Adamowicz K, Maier A, Baumbach J, Blumenthal DB. Online in silico validation of disease and gene sets, clusterings or subnetworks with DIGEST. Brief Bioinform 2022; 23:6618231. [PMID: 35753693 DOI: 10.1093/bib/bbac247] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/28/2022] [Revised: 05/25/2022] [Accepted: 05/26/2022] [Indexed: 11/12/2022]
Abstract
As the development of new drugs reaches its physical and financial limits, drug repurposing has become more important than ever. For mechanistically grounded drug repurposing, it is crucial to uncover the disease mechanisms and to detect clusters of mechanistically related diseases. Various methods for computing candidate disease mechanisms and disease clusters exist. However, in the absence of ground truth, in silico validation is challenging. This constitutes a major hurdle toward the adoption of in silico prediction tools by experimentalists who are often hesitant to carry out wet-lab validations for predicted candidate mechanisms without clearly quantified initial plausibility. To address this problem, we present DIGEST (in silico validation of disease and gene sets, clusterings or subnetworks), a Python-based validation tool available as a web interface (https://digest-validation.net), as a stand-alone package or over a REST API. DIGEST greatly facilitates in silico validation of gene and disease sets, clusterings or subnetworks via fully automated pipelines comprising disease and gene ID mapping, enrichment analysis, comparisons of shared genes and variants and background distribution estimation. Moreover, functionality is provided to automatically update the external databases used by the pipelines. DIGEST hence allows the user to assess the statistical significance of candidate mechanisms with regard to functional and genetic coherence and enables the computation of empirical $P$-values with just a few mouse clicks.
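The idea of an empirical P-value estimated from a background distribution can be sketched in a few lines. This is not DIGEST's implementation or API; the function name `empirical_p_value`, the toy scores, and the (k + 1) / (n + 1) pseudocount convention are assumptions chosen for illustration.

```python
import random


def empirical_p_value(observed, background):
    """Share of background scores that are at least as large as the observed one.

    The (k + 1) / (n + 1) form keeps the estimate strictly above zero.
    """
    k = sum(1 for score in background if score >= observed)
    return (k + 1) / (len(background) + 1)


# Hypothetical coherence score of a candidate gene set, compared against
# scores of random gene sets of the same size (simulated here as uniform noise).
rng = random.Random(0)
background_scores = [rng.random() for _ in range(999)]
observed_score = 0.97

p = empirical_p_value(observed_score, background_scores)
print(f"empirical P-value: {p:.3f}")
```

With 999 background draws the smallest reachable value is 1/1000, so the resolution of the estimate is set by the number of random sets generated.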
Affiliation(s)
- Klaudia Adamowicz
  - Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany
- Andreas Maier
  - Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany
- Jan Baumbach
  - Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany
  - Department of Mathematics and Computer Science, University of Southern Denmark, Odense, Denmark
- David B Blumenthal
  - Department Artificial Intelligence in Biomedical Engineering (AIBE), Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Erlangen, Germany
9
MoSBi: Automated signature mining for molecular stratification and subtyping. Proc Natl Acad Sci U S A 2022; 119:e2118210119. [PMID: 35412913 PMCID: PMC9169782 DOI: 10.1073/pnas.2118210119] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
Abstract
Molecular patient stratification and disease subtyping are ongoing and high-impact problems that rely on the identification of characteristic molecular signatures. Current computational methods show high sensitivity to custom parameterization, which leads to inconsistent performance on different molecular data. Our new method, MoSBi (molecular signature identification using biclustering), 1) enables so far unmatched high performance for stratification and subtyping across datasets of various different biomolecules, 2) provides a scalable solution for visualizing the results and their correspondence to clinical factors, and 3) has immediate practical relevance through its automatic workflow where individual selection, parameterization, screening, and visualization of biclustering algorithms is not required. MoSBi is a major step forward with a high impact for clinical and wet-lab researchers. The improving access to increasing amounts of biomedical data provides completely new chances for advanced patient stratification and disease subtyping strategies. This requires computational tools that produce uniformly robust results across highly heterogeneous molecular data. Unsupervised machine learning methodologies are able to discover de novo patterns in such data. Biclustering is especially suited by simultaneously identifying sample groups and corresponding feature sets across heterogeneous omics data. The performance of available biclustering algorithms heavily depends on individual parameterization and varies with their application. Here, we developed MoSBi (molecular signature identification using biclustering), an automated multialgorithm ensemble approach that integrates results utilizing an error model-supported similarity network. We systematically evaluated the performance of 11 available and established biclustering algorithms together with MoSBi. 
For this, we used transcriptomics, proteomics, and metabolomics data, as well as synthetic datasets covering various data properties. Profiting from multialgorithm integration, MoSBi identified robust group and disease-specific signatures across all scenarios, overcoming single algorithm specificities. Furthermore, we developed a scalable network-based visualization of bicluster communities that supports biological hypothesis generation. MoSBi is available as an R package and web service to make automated biclustering analysis accessible for application in molecular sample stratification.
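As a rough illustration of what linking biclusters from several algorithms in a similarity network can look like, the sketch below connects two biclusters when the matrix cells they cover overlap strongly. It is not the MoSBi implementation and does not reproduce its error model: the Jaccard measure over cells, the fixed `threshold`, and the function names `cell_jaccard` and `similarity_edges` are all assumptions made for this example.

```python
from itertools import combinations


def cell_jaccard(b1, b2):
    """Jaccard index over the matrix cells covered by two biclusters.

    Each bicluster is a (samples, features) pair of sets; the cells it
    covers are the Cartesian product of the two sets.
    """
    rows1, cols1 = b1
    rows2, cols2 = b2
    shared = len(rows1 & rows2) * len(cols1 & cols2)
    total = len(rows1) * len(cols1) + len(rows2) * len(cols2) - shared
    return shared / total if total else 0.0


def similarity_edges(biclusters, threshold=0.5):
    """Return (i, j, similarity) for every pair at or above the threshold.

    The threshold is a hypothetical fixed cutoff chosen for illustration.
    """
    edges = []
    for (i, b1), (j, b2) in combinations(enumerate(biclusters), 2):
        sim = cell_jaccard(b1, b2)
        if sim >= threshold:
            edges.append((i, j, sim))
    return edges


# Toy biclusters as they might come from different algorithms (made-up labels).
biclusters = [
    ({"s1", "s2", "s3"}, {"g1", "g2"}),
    ({"s1", "s2"}, {"g1", "g2"}),
    ({"s7", "s8"}, {"g9"}),
]
edges = similarity_edges(biclusters)
print(edges)
```

Connected groups in the resulting edge list would correspond to sets of biclusters that describe a similar sample and feature pattern, which is the kind of structure a community-based visualization can build on.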
|
10
|
Ramkumar M, Basker N, Pradeep D, Prajapati R, Yuvaraj N, Arshath Raja R, Suresh C, Vignesh R, Barakkath Nisha U, Srihari K, Alene A. Healthcare Biclustering-Based Prediction on Gene Expression Dataset. BioMed Research International 2022; 2022:2263194. [PMID: 35265709 PMCID: PMC8901349 DOI: 10.1155/2022/2263194] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 12/09/2021] [Revised: 02/02/2022] [Accepted: 02/10/2022] [Indexed: 12/20/2022]
Abstract
In this paper, we develop a biclustering model for healthcare to reduce the difficulties associated with clustering gene expression data. The study uses two separate biclustering approaches to identify specific gene activity in certain environments and to remove duplicated components of broad gene information. Moreover, because populations of potential solutions allow exploration of a greater portion of the search space, machine learning and heuristic algorithms have become widely used for biclustering in healthcare. The study is evaluated in terms of the average match score for nonoverlapping and overlapping modules under the influence of noise, for constant and additive biclusters, and in terms of run time. The results show that the proposed FCM biclustering method achieves a higher average match score and a shorter run time than the existing PSO-SA and fuzzy logic biclustering methods.
Affiliation(s)
- M. Ramkumar: Department of Computer Science and Engineering, HKBK College of Engineering, India
- N. Basker: Department of Computer Science and Engineering, Sona College of Technology, India
- D. Pradeep: Department of Computer Science and Engineering, M.Kumarasamy College of Engineering, Karur, India
- Ramesh Prajapati: Department of Computer Engineering, Shree Swaminarayan Institute of Technology (SSIT), India
- N. Yuvaraj: Research and Publications, ICT Academy, IIT Madras Research Park, India
- R. Arshath Raja: Research and Publications, ICT Academy, IIT Madras Research Park, India
- C. Suresh: CSE, Sri Ranganathar Institute of Engineering and Technology, Coimbatore, India
- Rahul Vignesh: CSE, Dhanalakshmi Srinivasan College of Engineering, Coimbatore, India
- U. Barakkath Nisha: IT Department, Sri Krishna College of Engineering and Technology, Coimbatore, India
- K. Srihari: Department of Computer Science and Engineering, SNS College of Technology, India
- Assefa Alene: Department of Chemical Engineering, College of Biological and Chemical Engineering, Addis Ababa Science and Technology University, Ethiopia
|
11
|
Network medicine for disease module identification and drug repurposing with the NeDRex platform. Nat Commun 2021; 12:6848. [PMID: 34824199 PMCID: PMC8617287 DOI: 10.1038/s41467-021-27138-2] [Citation(s) in RCA: 36] [Impact Index Per Article: 12.0] [Received: 07/28/2021] [Accepted: 11/04/2021] [Indexed: 12/17/2022] Open
Abstract
Traditional drug discovery faces a severe efficacy crisis. Repurposing of registered drugs provides an alternative with lower costs and faster drug development timelines. However, the data necessary for the identification of disease modules, i.e. pathways and sub-networks describing the mechanisms of complex diseases which contain potential drug targets, are scattered across independent databases. Moreover, existing studies are limited to predictions for specific diseases or non-translational algorithmic approaches. There is an unmet need for adaptable tools allowing biomedical researchers to employ network-based drug repurposing approaches for their individual use cases. We close this gap with NeDRex, an integrative and interactive platform for network-based drug repurposing and disease module discovery. NeDRex integrates ten different data sources covering genes, drugs, drug targets, disease annotations, and their relationships. NeDRex allows for constructing heterogeneous biological networks, mining them for disease modules, prioritizing drugs targeting disease mechanisms, and statistical validation. We demonstrate the utility of NeDRex in five specific use-cases.
|
12
|
Lazareva O, Baumbach J, List M, Blumenthal DB. On the limits of active module identification. Brief Bioinform 2021; 22:6189770. [PMID: 33782690 DOI: 10.1093/bib/bbab066] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 01/11/2021] [Revised: 01/29/2021] [Indexed: 12/12/2022] Open
Abstract
In network and systems medicine, active module identification methods (AMIMs) are widely used for discovering candidate molecular disease mechanisms. To this end, AMIMs combine network analysis algorithms with molecular profiling data, most commonly, by projecting gene expression data onto generic protein-protein interaction (PPI) networks. Although active module identification has led to various novel insights into complex diseases, there is increasing awareness in the field that the combination of gene expression data and PPI network is problematic because up-to-date PPI networks have a very small diameter and are subject to both technical and literature bias. In this paper, we report the results of an extensive study where we analyzed for the first time whether widely used AMIMs really benefit from using PPI networks. Our results clearly show that, except for the recently proposed AMIM DOMINO, the tested AMIMs do not produce biologically more meaningful candidate disease modules on widely used PPI networks than on random networks with the same node degrees. AMIMs hence mainly learn from the node degrees and mostly fail to exploit the biological knowledge encoded in the edges of the PPI networks. This has far-reaching consequences for the field of active module identification. In particular, we suggest that novel algorithms are needed which overcome the degree bias of most existing AMIMs and/or work with customized, context-specific networks instead of generic PPI networks.
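The comparison against "random networks with the same node degrees" can be made concrete with a small sketch. The code below is not taken from the study; it uses the standard double edge swap, which rewires two edges at a time while leaving every node's degree unchanged. The function names, the toy edge list, and the retry limit are assumptions chosen for illustration.

```python
import random
from collections import Counter


def degree_preserving_rewire(edges, n_swaps, seed=0):
    """Rewire an undirected simple graph by double edge swaps.

    Each accepted swap turns (a, b), (c, d) into (a, d), (c, b), so every
    node keeps its degree. Swaps that would create a self-loop or a
    duplicate edge are rejected and retried.
    """
    rng = random.Random(seed)
    edge_list = [tuple(sorted(e)) for e in edges]
    edge_set = set(edge_list)
    done, attempts = 0, 0
    max_attempts = 100 * n_swaps  # hypothetical cap to guarantee termination
    while done < n_swaps and attempts < max_attempts:
        attempts += 1
        i, j = rng.sample(range(len(edge_list)), 2)
        a, b = edge_list[i]
        c, d = edge_list[j]
        if rng.random() < 0.5:  # pick one of the two possible pairings
            c, d = d, c
        if a == d or c == b:  # would create a self-loop
            continue
        new1 = tuple(sorted((a, d)))
        new2 = tuple(sorted((c, b)))
        if new1 == new2 or new1 in edge_set or new2 in edge_set:
            continue  # would create a duplicate edge
        edge_set.discard(edge_list[i])
        edge_set.discard(edge_list[j])
        edge_set.update((new1, new2))
        edge_list[i], edge_list[j] = new1, new2
        done += 1
    return edge_list


def degrees(edges):
    """Count the degree of every node in an undirected edge list."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


# Toy network standing in for a PPI network (made-up node labels).
toy_edges = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 1), (1, 4)]
rewired = degree_preserving_rewire(toy_edges, n_swaps=10, seed=42)
print(degrees(toy_edges) == degrees(rewired))
```

If a module identification method scores equally well on `rewired` as on the original edges, its results are driven by node degrees rather than by which specific proteins interact, which is the effect the abstract describes.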
Affiliation(s)
- Olga Lazareva: Chair of Experimental Bioinformatics, Technical University of Munich, Freising, Germany
- Jan Baumbach: Chair of Experimental Bioinformatics, Technical University of Munich, Freising, Germany; Chair of Computational Systems Biology, University of Hamburg, Hamburg, Germany; Department of Mathematics and Computer Science, University of Southern Denmark, Odense, Denmark
- Markus List: Chair of Experimental Bioinformatics, Technical University of Munich, Freising, Germany
- David B Blumenthal: Chair of Experimental Bioinformatics, Technical University of Munich, Freising, Germany
|
13
|
Hoffmann M, Pachl E, Hartung M, Stiegler V, Baumbach J, Schulz MH, List M. SPONGEdb: a pan-cancer resource for competing endogenous RNA interactions. NAR Cancer 2021; 3:zcaa042. [PMID: 34316695 PMCID: PMC8210024 DOI: 10.1093/narcan/zcaa042] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 08/11/2020] [Revised: 11/12/2020] [Accepted: 12/04/2020] [Indexed: 12/12/2022] Open
Abstract
microRNAs (miRNAs) are post-transcriptional regulators involved in many biological processes and human diseases, including cancer. The majority of transcripts compete over a limited pool of miRNAs, giving rise to a complex network of competing endogenous RNA (ceRNA) interactions. Currently, gene-regulatory networks focus mostly on transcription factor-mediated regulation, and dedicated efforts for charting ceRNA regulatory networks are scarce. Recently, it became possible to infer ceRNA interactions genome-wide from matched gene and miRNA expression data. Here, we inferred ceRNA regulatory networks for 22 cancer types and a pan-cancer ceRNA network based on data from The Cancer Genome Atlas. To make these networks accessible to the biomedical community, we present SPONGEdb, a database offering a user-friendly web interface to browse and visualize ceRNA interactions and an application programming interface accessible by accompanying R and Python packages. SPONGEdb allows researchers to identify potent ceRNA regulators via network centrality measures and to assess their potential as cancer biomarkers through survival, cancer hallmark and gene set enrichment analysis. In summary, SPONGEdb is a feature-rich web resource supporting the community in studying ceRNA regulation within and across cancer types.
Affiliation(s)
- Markus Hoffmann: Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, 85354 Freising, Germany
- Elisabeth Pachl: Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, 85354 Freising, Germany
- Michael Hartung: Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, 85354 Freising, Germany
- Veronika Stiegler: Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, 85354 Freising, Germany
- Jan Baumbach: Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, 85354 Freising, Germany
- Marcel H Schulz: Institute for Cardiovascular Regeneration, Goethe University, 60596 Frankfurt am Main, Germany
- Markus List: Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, 85354 Freising, Germany
|