1
Oh VKS, Li RW. Wise Roles and Future Visionary Endeavors of Current Emperor: Advancing Dynamic Methods for Longitudinal Microbiome Meta-Omics Data in Personalized and Precision Medicine. Advanced Science (Weinheim, Baden-Wurttemberg, Germany) 2024; 11:e2400458. [PMID: 39535493 DOI: 10.1002/advs.202400458] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/12/2024] [Revised: 09/16/2024] [Indexed: 11/16/2024]
Abstract
Understanding the etiological complexity of diseases requires identifying biomarkers longitudinally associated with specific phenotypes. Advanced sequencing tools generate dynamic microbiome data, providing insights into microbial community functions and their impact on health. This review aims to explore the current roles and future visionary endeavors of dynamic methods for integrating longitudinal microbiome multi-omics data in personalized and precision medicine. This work seeks to synthesize existing research, propose best practices, and highlight innovative techniques. The development and application of advanced dynamic methods, including the unified analytical frameworks and deep learning tools in artificial intelligence, are critically examined. Aggregating data on microbes, metabolites, genes, and other entities offers profound insights into the interactions among microorganisms, host physiology, and external stimuli. Despite progress, the absence of gold standards for validating analytical protocols and data resources of various longitudinal multi-omics studies remains a significant challenge. The interdependence of workflow steps critically affects overall outcomes. This work provides a comprehensive roadmap for best practices, addressing current challenges with advanced dynamic methods. The review underscores the biological effects of clinical, experimental, and analytical protocol settings on outcomes. Establishing consensus on dynamic microbiome inter-studies and advancing reliable analytical protocols are pivotal for the future of personalized and precision medicine.
Affiliation(s)
- Vera-Khlara S Oh
- Big Biomedical Data Integration and Statistical Analysis (DIANA) Research Center, Department of Data Science, College of Natural Sciences, Jeju National University, Jeju City, Jeju Do, 63243, South Korea
- Robert W Li
- United States Department of Agriculture, Agricultural Research Service, Animal Genomics and Improvement Laboratory, Beltsville, MD, 20705, USA

2
Hayes CN, Nakahara H, Ono A, Tsuge M, Oka S. From Omics to Multi-Omics: A Review of Advantages and Tradeoffs. Genes (Basel) 2024; 15:1551. [PMID: 39766818 PMCID: PMC11675490 DOI: 10.3390/genes15121551] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/09/2024] [Revised: 11/25/2024] [Accepted: 11/28/2024] [Indexed: 01/11/2025]
Abstract
Bioinformatics is a rapidly evolving field charged with cataloging, disseminating, and analyzing biological data. Bioinformatics started with genomics, but while genomics focuses more narrowly on the genes comprising a genome, bioinformatics now encompasses a much broader range of omics technologies. Overcoming barriers of scale and effort that plagued earlier sequencing methods, bioinformatics adopted an ambitious strategy involving high-throughput and highly automated assays. However, as the list of omics technologies continues to grow, the field of bioinformatics has changed in two fundamental ways. Despite enormous success in expanding our understanding of the biological world, the failure of bulk methods to account for biologically important variability among cells of the same or different type has led to a major shift toward single-cell and spatially resolved omics methods, which attempt to disentangle the conflicting signals contained in heterogeneous samples by examining individual cells or cell clusters. The second major shift has been the attempt to integrate two or more different classes of omics data in a single multimodal analysis to identify patterns that bridge biological layers. For example, unraveling the cause of disease may reveal a metabolite deficiency caused by the failure of an enzyme to be phosphorylated because a gene is not expressed due to aberrant methylation as a result of a rare germline variant. Conclusions: There is a fine line between superficial understanding and analysis paralysis, but like a detective novel, multi-omics increasingly provides the clues we need, if only we are able to see them.
Affiliation(s)
- C. Nelson Hayes
- Department of Gastroenterology, Graduate School of Biomedical & Health Sciences, Hiroshima University, Hiroshima 734-8551, Japan
- Hikaru Nakahara
- Department of Clinical and Molecular Genetics, Hiroshima University, Hiroshima 734-8551, Japan
- Atsushi Ono
- Department of Gastroenterology, Graduate School of Biomedical & Health Sciences, Hiroshima University, Hiroshima 734-8551, Japan
- Masataka Tsuge
- Department of Gastroenterology, Graduate School of Biomedical & Health Sciences, Hiroshima University, Hiroshima 734-8551, Japan
- Liver Center, Hiroshima University, Hiroshima 734-8551, Japan
- Shiro Oka
- Department of Gastroenterology, Graduate School of Biomedical & Health Sciences, Hiroshima University, Hiroshima 734-8551, Japan

3
Ruiz-Perez D, Gimon I, Sazal M, Mathee K, Narasimhan G. Unfolding and de-confounding: biologically meaningful causal inference from longitudinal multi-omic networks using METALICA. mSystems 2024; 9:e0130323. [PMID: 39240096 PMCID: PMC11494969 DOI: 10.1128/msystems.01303-23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/19/2023] [Accepted: 07/10/2024] [Indexed: 09/07/2024]
Abstract
A key challenge in the analysis of microbiome data is the integration of multi-omic datasets and the discovery of interactions between microbial taxa, their expressed genes, and the metabolites they consume and/or produce. In an effort to improve the state of the art in inferring biologically meaningful multi-omic interactions, we sought to address some of the most fundamental issues in causal inference from longitudinal multi-omics microbiome data sets. We developed METALICA, a suite of tools and techniques that can infer interactions between microbiome entities. METALICA introduces novel unrolling and de-confounding techniques used to uncover multi-omic entities that are believed to act as confounders for some of the relationships that may be inferred using standard causal inferencing tools. The results lend support to predictions about biological models and processes by which microbial taxa interact with each other in a microbiome. The unrolling process helps identify putative intermediaries (genes and/or metabolites) to explain the interactions between microbes; the de-confounding process identifies putative common causes that may lead to spurious relationships being inferred. METALICA was applied to the networks inferred by existing causal discovery and network inference algorithms applied to a multi-omics data set resulting from a longitudinal study of IBD microbiomes. The most significant unrollings and de-confoundings were manually validated using the existing literature and databases.
Importance: We have developed a suite of tools and techniques capable of inferring interactions between microbiome entities. METALICA introduces novel techniques called unrolling and de-confounding that are employed to uncover multi-omic entities considered to be confounders for some of the relationships that may be inferred using standard causal inferencing tools. To evaluate our method, we conducted tests on the inflammatory bowel disease (IBD) dataset from the iHMP longitudinal study, which we pre-processed in accordance with our previous work. From this dataset, we generated various subsets, encompassing different combinations of metagenomics, metabolomics, and metatranscriptomics datasets. Using these multi-omics datasets, we demonstrate how the unrolling process aids in the identification of putative intermediaries (genes and/or metabolites) to explain the interactions between microbes. Additionally, the de-confounding process identifies potential common causes that may give rise to spurious relationships being inferred. The most significant unrollings and de-confoundings were manually validated using the existing literature and databases.
Affiliation(s)
- Daniel Ruiz-Perez
- Bioinformatics Research Group (BioRG), Florida International University, Miami, Florida, USA
- Isabella Gimon
- Bioinformatics Research Group (BioRG), Florida International University, Miami, Florida, USA
- Musfiqur Sazal
- Bioinformatics Research Group (BioRG), Florida International University, Miami, Florida, USA
- Kalai Mathee
- Florida International University, Miami, Florida, USA
- Biomolecular Sciences Institute, Florida International University, Miami, Florida, USA
- Giri Narasimhan
- Bioinformatics Research Group (BioRG), Florida International University, Miami, Florida, USA
- Biomolecular Sciences Institute, Florida International University, Miami, Florida, USA

4
Hernández-Lemus E, Ochoa S. Methods for multi-omic data integration in cancer research. Front Genet 2024; 15:1425456. [PMID: 39364009 PMCID: PMC11446849 DOI: 10.3389/fgene.2024.1425456] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/29/2024] [Accepted: 08/28/2024] [Indexed: 10/05/2024]
Abstract
Multi-omics data integration is a term that refers to the process of combining and analyzing data from different omic experimental sources, such as genomics, transcriptomics, methylation assays, and microRNA sequencing, among others. Such data integration approaches have the potential to provide a more comprehensive functional understanding of biological systems and have numerous applications in areas such as disease diagnosis, prognosis and therapy. However, quantitative integration of multi-omic data is a complex task that requires the use of highly specialized methods and approaches. Here, we discuss a number of data integration methods that have been developed with multi-omics data in view, including statistical methods, machine learning approaches, and network-based approaches. We also discuss the challenges and limitations of such methods and provide examples of their applications in the literature. Overall, this review aims to provide an overview of the current state of the field and highlight potential directions for future research.
Affiliation(s)
- Enrique Hernández-Lemus
- Computational Genomics Division, National Institute of Genomic Medicine, Mexico City, Mexico
- Center for Complexity Sciences, Universidad Nacional Autónoma de México, Mexico City, Mexico
- Soledad Ochoa
- Computational Genomics Division, National Institute of Genomic Medicine, Mexico City, Mexico
- Department of Obstetrics and Gynecology, Cedars-Sinai Medical Center, Los Angeles, CA, United States

5
Zhou Y, Geng P, Zhang S, Xiao F, Cai G, Chen L, Lu Q. Multimodal functional deep learning for multiomics data. Brief Bioinform 2024; 25:bbae448. [PMID: 39285512 PMCID: PMC11405129 DOI: 10.1093/bib/bbae448] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/24/2024] [Revised: 08/03/2024] [Accepted: 08/28/2024] [Indexed: 09/20/2024]
Abstract
With rapidly evolving high-throughput technologies and consistently decreasing costs, collecting multimodal omics data in large-scale studies has become feasible. Although studying multiomics provides a new comprehensive approach to understanding the complex biological mechanisms of human diseases, the high dimensionality of omics data and the complexity of the interactions among various omics levels in contributing to disease phenotypes present tremendous analytical challenges. There is a great need for novel analytical methods to address these challenges and to facilitate multiomics analyses. In this paper, we propose a multimodal functional deep learning (MFDL) method for the analysis of high-dimensional multiomics data. The MFDL method models the complex relationships between multiomics variants and disease phenotypes through the hierarchical structure of deep neural networks and handles high-dimensional omics data using the functional data analysis technique. Furthermore, MFDL leverages the structure of the multimodal model to capture interactions between different types of omics data. Through simulation studies and real-data applications, we demonstrate the advantages of MFDL in terms of prediction accuracy and its robustness to the high dimensionality and noise within the data.
Affiliation(s)
- Yuan Zhou
- Department of Biostatistics, University of Florida, 2004 Mowry Rd, Gainesville, FL 32611, USA
- Pei Geng
- Department of Mathematics and Statistics, University of New Hampshire, 33 Academic Way, Durham, NH 03824, USA
- Shan Zhang
- Department of Statistics and Probability, Michigan State University, 619 Red Cedar Road, East Lansing, MI 48824, USA
- Feifei Xiao
- Department of Biostatistics, University of Florida, 2004 Mowry Rd, Gainesville, FL 32611, USA
- Guoshuai Cai
- Department of Surgery, University of Florida, 1600 SW Archer Rd, Gainesville, FL 32611, USA
- Li Chen
- Department of Biostatistics, University of Florida, 2004 Mowry Rd, Gainesville, FL 32611, USA
- Qing Lu
- Department of Biostatistics, University of Florida, 2004 Mowry Rd, Gainesville, FL 32611, USA

6
Chakraborty S, Sharma G, Karmakar S, Banerjee S. Multi-OMICS approaches in cancer biology: New era in cancer therapy. Biochim Biophys Acta Mol Basis Dis 2024; 1870:167120. [PMID: 38484941 DOI: 10.1016/j.bbadis.2024.167120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/16/2024] [Revised: 03/06/2024] [Accepted: 03/06/2024] [Indexed: 04/01/2024]
Abstract
Innovative multi-omics frameworks integrate diverse datasets from the same patients to enhance our understanding of the molecular and clinical aspects of cancers. Advanced omics and multi-view clustering algorithms present unprecedented opportunities for classifying cancers into subtypes, refining survival predictions and treatment outcomes, and unravelling key pathophysiological processes across various molecular layers. However, with the increasing availability of cost-effective high-throughput technologies (HTT) that generate vast amounts of data, analyzing single layers often falls short of establishing causal relations. Integrating multi-omics data spanning genomes, epigenomes, transcriptomes, proteomes, metabolomes, and microbiomes offers unique prospects to comprehend the underlying biology of complex diseases like cancer. This discussion explores algorithmic frameworks designed to uncover cancer subtypes, disease mechanisms, and methods for identifying pivotal genomic alterations. It also underscores the significance of multi-omics in tumor classifications, diagnostics, and prognostications. Despite its unparalleled advantages, the integration of multi-omics data has been slow to find its way into everyday clinics. A major hurdle is the uneven maturity of different omics approaches and the widening gap between the generation of large datasets and the capacity to process this data. Initiatives promoting the standardization of sample processing and analytical pipelines, as well as multidisciplinary training for experts in data analysis and interpretation, are crucial for translating theoretical findings into practical applications.
Affiliation(s)
- Sohini Chakraborty
- Department of Biotechnology, School of Biosciences and Technology, Vellore Institute of Technology, Vellore 632014, Tamil Nadu, India
- Gaurav Sharma
- Department of Biotechnology, School of Biosciences and Technology, Vellore Institute of Technology, Vellore 632014, Tamil Nadu, India
- Sricheta Karmakar
- Department of Biotechnology, School of Biosciences and Technology, Vellore Institute of Technology, Vellore 632014, Tamil Nadu, India
- Satarupa Banerjee
- Department of Biotechnology, School of Biosciences and Technology, Vellore Institute of Technology, Vellore 632014, Tamil Nadu, India

7
Holsten L, Dahm K, Oestreich M, Becker M, Ulas T. hCoCena: A toolbox for network-based co-expression analysis and horizontal integration of transcriptomic datasets. STAR Protoc 2024; 5:102922. [PMID: 38427570 PMCID: PMC10918327 DOI: 10.1016/j.xpro.2024.102922] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2023] [Revised: 01/09/2024] [Accepted: 02/13/2024] [Indexed: 03/03/2024]
Abstract
As the number and complexity of transcriptomic datasets increase, there is a rising demand for accessible and user-friendly analysis tools. Here, we present hCoCena (horizontal construction of co-expression networks and analysis), a toolbox facilitating the analysis of a single dataset, as well as the joint analysis of multiple datasets. We describe steps for workspace setup, formatting tables, data processing, and network integration. We then detail procedures for gene clustering, gene set enrichment analysis, and transcription factor enrichment analysis. For complete details on the use and execution of this protocol, please refer to Oestreich et al. [1].
Affiliation(s)
- Lisa Holsten
- Systems Medicine, German Center for Neurodegenerative Diseases (DZNE), 53127 Bonn, Germany; PRECISE Platform for Single Cell Genomics and Epigenomics, DZNE, and University of Bonn, 53127 Bonn, Germany; Genomics and Immunoregulation, Life & Medical Sciences (LIMES) Institute, University of Bonn, 53115 Bonn, Germany; Department of Pediatrics, University Hospital Würzburg, 97080 Würzburg, Germany
- Kilian Dahm
- Systems Medicine, German Center for Neurodegenerative Diseases (DZNE), 53127 Bonn, Germany; PRECISE Platform for Single Cell Genomics and Epigenomics, DZNE, and University of Bonn, 53127 Bonn, Germany; Department of Pediatrics, University Hospital Würzburg, 97080 Würzburg, Germany
- Marie Oestreich
- Systems Medicine, German Center for Neurodegenerative Diseases (DZNE), 53127 Bonn, Germany; Modular High-Performance Computing and Artificial Intelligence, German Center for Neurodegenerative Diseases (DZNE), 53127 Bonn, Germany
- Matthias Becker
- Systems Medicine, German Center for Neurodegenerative Diseases (DZNE), 53127 Bonn, Germany; Modular High-Performance Computing and Artificial Intelligence, German Center for Neurodegenerative Diseases (DZNE), 53127 Bonn, Germany
- Thomas Ulas
- Systems Medicine, German Center for Neurodegenerative Diseases (DZNE), 53127 Bonn, Germany; PRECISE Platform for Single Cell Genomics and Epigenomics, DZNE, and University of Bonn, 53127 Bonn, Germany; Genomics and Immunoregulation, Life & Medical Sciences (LIMES) Institute, University of Bonn, 53115 Bonn, Germany

8
el Bouhaddani S, Höllerhage M, Uh HW, Moebius C, Bickle M, Höglinger G, Houwing-Duistermaat J. Statistical integration of multi-omics and drug screening data from cell lines. PLoS Comput Biol 2024; 20:e1011809. [PMID: 38295113 PMCID: PMC10878536 DOI: 10.1371/journal.pcbi.1011809] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/12/2023] [Revised: 02/20/2024] [Accepted: 01/08/2024] [Indexed: 02/02/2024]
Abstract
Data integration methods are used to obtain a unified summary of multiple datasets. For multi-modal data, we propose a computational workflow to jointly analyze datasets from cell lines. The workflow comprises a novel probabilistic data integration method, named POPLS-DA, for multi-omics data. The workflow is motivated by a study on synucleinopathies where transcriptomics, proteomics, and drug screening data are measured in affected LUHMES cell lines and controls. The aim is to highlight potentially druggable pathways and genes involved in synucleinopathies. First, POPLS-DA is used to prioritize genes and proteins that best distinguish cases and controls. For these genes, an integrated interaction network is constructed where the drug screen data is incorporated to highlight druggable genes and pathways in the network. Finally, functional enrichment analyses are performed to identify clusters of synaptic and lysosome-related genes and proteins targeted by the protective drugs. POPLS-DA is compared to other single- and multi-omics approaches. We found that HSPA5, a member of the heat shock protein 70 family, was one of the most targeted genes by the validated drugs, in particular by AT1-blockers. HSPA5 and AT1-blockers have been previously linked to α-synuclein pathology and Parkinson's disease, showing the relevance of our findings. Our computational workflow identified new directions for therapeutic targets for synucleinopathies. POPLS-DA provided a larger interpretable gene set than other single- and multi-omic approaches. An implementation based on R and markdown is freely available online.
Affiliation(s)
- Hae-Won Uh
- Dept. Data science & Biostatistics, UMC Utrecht, Utrecht, Netherlands
- Claudia Moebius
- Max Planck Institute of Molecular Cell Biology and Genetics, Dresden, Germany
- Marc Bickle
- Roche Institute for Translational Bioengineering, Basel, Switzerland
- Günter Höglinger
- Department of Neurology, Hannover Medical School, Hannover, Germany
- Department of Neurology, Ludwig-Maximilians-Universität, Munich, Germany
- German Center for Neurodegenerative Diseases, Munich, Germany
- Munich Cluster for Systems Neurology (SyNergy), Munich, Germany
- Jeanine Houwing-Duistermaat
- Dept. Data science & Biostatistics, UMC Utrecht, Utrecht, Netherlands
- Dept. of Mathematics, Radboud University, Nijmegen, Netherlands

9
Ruiz-Perez D, Gimon I, Sazal M, Mathee K, Narasimhan G. Unfolding and De-confounding: Biologically meaningful causal inference from longitudinal multi-omic networks using METALICA. bioRxiv: the preprint server for biology 2023:2023.12.12.571384. [PMID: 38168315 PMCID: PMC10760167 DOI: 10.1101/2023.12.12.571384] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/05/2024]
Abstract
A key challenge in the analysis of microbiome data is the integration of multi-omic datasets and the discovery of interactions between microbial taxa, their expressed genes, and the metabolites they consume and/or produce. In an effort to improve the state of the art in inferring biologically meaningful multi-omic interactions, we sought to address some of the most fundamental issues in causal inference from longitudinal multi-omics microbiome data sets. We developed METALICA, a suite of tools and techniques that can infer interactions between microbiome entities. METALICA introduces novel unrolling and de-confounding techniques used to uncover multi-omic entities that are believed to act as confounders for some of the relationships that may be inferred using standard causal inferencing tools. The results lend support to predictions about biological models and processes by which microbial taxa interact with each other in a microbiome. The unrolling process helps to identify putative intermediaries (genes and/or metabolites) to explain the interactions between microbes; the de-confounding process identifies putative common causes that may lead to spurious relationships being inferred. METALICA was applied to the networks inferred by existing causal discovery and network inference algorithms applied to a multi-omics data set resulting from a longitudinal study of IBD microbiomes. The most significant unrollings and de-confoundings were manually validated using the existing literature and databases.
Affiliation(s)
- Daniel Ruiz-Perez
- Bioinformatics Research Group (BioRG), Florida International University, Miami, FL 33199, USA
- Isabella Gimon
- Bioinformatics Research Group (BioRG), Florida International University, Miami, FL 33199, USA
- Musfiqur Sazal
- Bioinformatics Research Group (BioRG), Florida International University, Miami, FL 33199, USA
- Kalai Mathee
- Florida International University, Miami, FL 33199, USA
- Biomolecular Sciences Institute, Florida International University, Miami, FL 33199, USA
- Giri Narasimhan
- Bioinformatics Research Group (BioRG), Florida International University, Miami, FL 33199, USA
- Biomolecular Sciences Institute, Florida International University, Miami, FL 33199, USA

10
Chen JW, Shrestha L, Green G, Leier A, Marquez-Lago TT. The hitchhikers' guide to RNA sequencing and functional analysis. Brief Bioinform 2023; 24:bbac529. [PMID: 36617463 PMCID: PMC9851315 DOI: 10.1093/bib/bbac529] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 07/11/2022] [Revised: 10/18/2022] [Accepted: 11/07/2022] [Indexed: 01/10/2023]
Abstract
DNA and RNA sequencing technologies have revolutionized biology and biomedical sciences, sequencing full genomes and transcriptomes at very high speeds and reasonably low costs. RNA sequencing (RNA-Seq) enables transcript identification and quantification, but once sequencing has concluded researchers can be easily overwhelmed with questions such as how to go from raw data to differential expression (DE), pathway analysis and interpretation. Several pipelines and procedures have been developed to this effect. Even though there is no unique way to perform RNA-Seq analysis, it usually follows these steps: 1) raw reads quality check, 2) alignment of reads to a reference genome, 3) aligned reads' summarization according to an annotation file, 4) DE analysis and 5) gene set analysis and/or functional enrichment analysis. Each step requires researchers to make decisions, and the wide variety of options and resulting large volumes of data often lead to interpretation challenges. There also seems to be insufficient guidance on how best to obtain relevant information and derive actionable knowledge from transcription experiments. In this paper, we explain RNA-Seq steps in detail and outline differences and similarities of different popular options, as well as advantages and disadvantages. We also discuss non-coding RNA analysis, multi-omics, meta-transcriptomics and the use of artificial intelligence methods complementing the arsenal of tools available to researchers. Lastly, we perform a complete analysis from raw reads to DE and functional enrichment analysis, visually illustrating how results are not absolute truths and how algorithmic decisions can greatly impact results and interpretation.
Affiliation(s)
- Jiung-Wen Chen
- Department of Biology, University of Alabama at Birmingham, Birmingham, AL, USA
- Lisa Shrestha
- Department of Genetics, University of Alabama at Birmingham, School of Medicine, Birmingham, AL, USA
- George Green
- Department of Biology, University of Alabama at Birmingham, Birmingham, AL, USA
- André Leier
- Department of Genetics, University of Alabama at Birmingham, School of Medicine, Birmingham, AL, USA
- Department of Cell, Developmental and Integrative Biology, University of Alabama at Birmingham, School of Medicine, Birmingham, AL, USA
- Tatiana T Marquez-Lago
- Department of Genetics, University of Alabama at Birmingham, School of Medicine, Birmingham, AL, USA
- Department of Cell, Developmental and Integrative Biology, University of Alabama at Birmingham, School of Medicine, Birmingham, AL, USA
- Department of Microbiology, University of Alabama at Birmingham, School of Medicine, Birmingham, AL, USA

11
Milanese JS, Marcotte R, Costain WJ, Kablar B, Drouin S. Roles of Skeletal Muscle in Development: A Bioinformatics and Systems Biology Overview. Advances in Anatomy, Embryology, and Cell Biology 2023; 236:21-55. [PMID: 37955770 DOI: 10.1007/978-3-031-38215-4_2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/14/2023]
Abstract
The ability to assess various cellular events consequent to perturbations, such as genetic mutations, disease states and therapies, has been recently revolutionized by technological advances in multiple "omics" fields. The resulting deluge of information has enabled and necessitated the development of tools required to both process and interpret the data. While of tremendous value to basic researchers, the amount and complexity of the data have made it extremely difficult to manually draw inferences and identify factors key to the study objectives. The challenges of data reduction and interpretation are being met by the development of increasingly complex tools that integrate disparate knowledge bases and synthesize coherent models based on current biological understanding. This chapter presents an example of how genomics data can be integrated with biological network analyses to gain further insight into the developmental consequences of genetic perturbations. State-of-the-art methods for conducting similar studies are discussed along with modern methods used to analyze and interpret the data.
Affiliation(s)
- Richard Marcotte
- Human Health Therapeutics, National Research Council of Canada, Montreal, QC, Canada
- Willard J Costain
- Human Health Therapeutics, National Research Council of Canada, Ottawa, ON, Canada
- Boris Kablar
- Department of Medical Neuroscience, Anatomy and Pathology, Faculty of Medicine, Dalhousie University, Halifax, NS, Canada
- Simon Drouin
- Human Health Therapeutics, National Research Council of Canada, Montreal, QC, Canada

12
Hao X, Cheng S, Jiang B, Xin S. Applying multi-omics techniques to the discovery of biomarkers for acute aortic dissection. Front Cardiovasc Med 2022; 9:961991. [PMID: 36588568 PMCID: PMC9797526 DOI: 10.3389/fcvm.2022.961991] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/05/2022] [Accepted: 11/28/2022] [Indexed: 12/23/2022]
Abstract
Acute aortic dissection (AAD) is a cardiovascular disease that manifests suddenly and fatally. Due to the lack of specific early symptoms, many patients with AAD are overlooked or misdiagnosed, which is undoubtedly catastrophic for patients. The particular pathogenic mechanism of AAD is still unknown, which makes clinical pharmacological therapy extremely difficult. Therefore, it is necessary and crucial to find and employ unique biomarkers for AAD as soon as possible in clinical practice and research. This will aid in the early detection of AAD and give clear guidelines for the creation of focused treatment agents. This goal has been made attainable over the past 20 years by the quick advancement of omics technologies and the development of high-throughput tissue specimen biomarker screening. The primary omics data support and add to one another to create a more thorough and three-dimensional picture of the disease. After introducing the main omics technologies, in this review we summarize the current situation and most recent developments in the application of multi-omics technologies to AAD biomarker discovery and emphasize the significance of concentrating on integration concepts for multi-omics data. In this context, we seek to offer fresh concepts and recommendations for fundamental investigation, perspective innovation, and therapeutic development in AAD.
Affiliation(s)
- Xinyu Hao
- Department of Vascular Surgery, The First Affiliated Hospital of China Medical University, China Medical University, Shenyang, China; Key Laboratory of Pathogenesis, Prevention and Therapeutics of Aortic Aneurysm, Shenyang, Liaoning, China
- Shuai Cheng
- Department of Vascular Surgery, The First Affiliated Hospital of China Medical University, China Medical University, Shenyang, China; Key Laboratory of Pathogenesis, Prevention and Therapeutics of Aortic Aneurysm, Shenyang, Liaoning, China
- Bo Jiang
- Department of Vascular Surgery, The First Affiliated Hospital of China Medical University, China Medical University, Shenyang, China; Key Laboratory of Pathogenesis, Prevention and Therapeutics of Aortic Aneurysm, Shenyang, Liaoning, China
- Shijie Xin
- Department of Vascular Surgery, The First Affiliated Hospital of China Medical University, China Medical University, Shenyang, China; Key Laboratory of Pathogenesis, Prevention and Therapeutics of Aortic Aneurysm, Shenyang, Liaoning, China. Correspondence: Shijie Xin
13
Chantada-Vázquez MDP, Bravo SB, Barbosa-Gouveia S, Alvarez JV, Couce ML. Proteomics in Inherited Metabolic Disorders. Int J Mol Sci 2022; 23:14744. [PMID: 36499071 PMCID: PMC9740208 DOI: 10.3390/ijms232314744] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 10/28/2022] [Revised: 11/17/2022] [Accepted: 11/22/2022] [Indexed: 11/29/2022]
Abstract
Inherited metabolic disorders (IMD) are rare medical conditions caused by genetic defects that interfere with the body's metabolism. The clinical phenotype is highly variable and can present at any age, although it more often manifests in childhood. The number of treatable IMDs has increased in recent years, making early diagnosis and a better understanding of the natural history of the disease more important than ever. In this review, we discuss the main challenges faced in applying proteomics to the study of IMDs, and the key advances achieved in this field using tandem mass spectrometry (MS/MS). This technology enables the analysis of large numbers of proteins in different body fluids (serum, plasma, urine, saliva, tears) with a single analysis of each sample, and can even be applied to dried samples. MS/MS has thus emerged as the tool of choice for proteome characterization and has provided new insights into many diseases and biological systems. In the last 10 years, sequential window acquisition of all theoretical fragmentation spectra mass spectrometry (SWATH-MS) has emerged as an accurate, high-resolution technique for the identification and quantification of proteins differentially expressed between healthy controls and IMD patients. Proteomics is a particularly promising approach to help obtain more information on rare genetic diseases, including identification of biomarkers to aid early diagnosis and better understanding of the underlying pathophysiology to guide the development of new therapies. Here, we summarize new and emerging proteomic technologies and discuss current uses and limitations of this approach to identify and quantify proteins. Moreover, we describe the use of proteomics to identify the mechanisms regulating complex IMD phenotypes; an area of research essential to better understand these rare disorders and many other human diseases.
Affiliation(s)
- Maria del Pilar Chantada-Vázquez
- Proteomic Platform, Health Research Institute of Santiago de Compostela (IDIS), Hospital Clínico Universitario de Santiago de Compostela, 15706 Santiago de Compostela, Spain
- Susana B. Bravo
- Proteomic Platform, Health Research Institute of Santiago de Compostela (IDIS), Hospital Clínico Universitario de Santiago de Compostela, 15706 Santiago de Compostela, Spain
- Sofía Barbosa-Gouveia
- Department of Forensic Sciences, Pathology, Gynecology and Obstetrics, Pediatrics, Neonatology Service, Department of Pediatrics, Hospital Clínico Universitario de Santiago de Compostela, Health Research Institute of Santiago de Compostela (IDIS), CIBERER, MetabERN, 15706 Santiago de Compostela, Spain
- José V. Alvarez
- Department of Forensic Sciences, Pathology, Gynecology and Obstetrics, Pediatrics, Neonatology Service, Department of Pediatrics, Hospital Clínico Universitario de Santiago de Compostela, Health Research Institute of Santiago de Compostela (IDIS), CIBERER, MetabERN, 15706 Santiago de Compostela, Spain
- María L. Couce
- Department of Forensic Sciences, Pathology, Gynecology and Obstetrics, Pediatrics, Neonatology Service, Department of Pediatrics, Hospital Clínico Universitario de Santiago de Compostela, Health Research Institute of Santiago de Compostela (IDIS), CIBERER, MetabERN, 15706 Santiago de Compostela, Spain
14
Agamah FE, Bayjanov JR, Niehues A, Njoku KF, Skelton M, Mazandu GK, Ederveen THA, Mulder N, Chimusa ER, 't Hoen PAC. Computational approaches for network-based integrative multi-omics analysis. Front Mol Biosci 2022; 9:967205. [PMID: 36452456 PMCID: PMC9703081 DOI: 10.3389/fmolb.2022.967205] [Citation(s) in RCA: 25] [Impact Index Per Article: 8.3] [Received: 06/12/2022] [Accepted: 10/20/2022] [Indexed: 08/27/2023]
Abstract
Advances in omics technologies allow for holistic studies into biological systems. These studies rely on integrative data analysis techniques to obtain a comprehensive view of the dynamics of cellular processes, and molecular mechanisms. Network-based integrative approaches have revolutionized multi-omics analysis by providing the framework to represent interactions between multiple different omics-layers in a graph, which may faithfully reflect the molecular wiring in a cell. Here we review network-based multi-omics/multi-modal integrative analytical approaches. We classify these approaches according to the type of omics data supported, the methods and/or algorithms implemented, their node and/or edge weighting components, and their ability to identify key nodes and subnetworks. We show how these approaches can be used to identify biomarkers, disease subtypes, crosstalk, causality, and molecular drivers of physiological and pathological mechanisms. We provide insight into the most appropriate methods and tools for research questions as showcased around the aetiology and treatment of COVID-19 that can be informed by multi-omics data integration. We conclude with an overview of challenges associated with multi-omics network-based analysis, such as reproducibility, heterogeneity, (biological) interpretability of the results, and we highlight some future directions for network-based integration.
Affiliation(s)
- Francis E. Agamah
- Division of Human Genetics, Department of Pathology, Institute of Infectious Disease and Molecular Medicine, Faculty of Health Sciences, University of Cape Town, Cape Town, South Africa
- Computational Biology Division, Department of Integrative Biomedical Sciences, Institute of Infectious Disease and Molecular Medicine, CIDRI-Africa Wellcome Trust Centre, Faculty of Health Sciences, University of Cape Town, Cape Town, South Africa
- Jumamurat R. Bayjanov
- Center for Molecular and Biomolecular Informatics (CMBI), Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, Nijmegen, Netherlands
- Anna Niehues
- Center for Molecular and Biomolecular Informatics (CMBI), Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, Nijmegen, Netherlands
- Kelechi F. Njoku
- Division of Human Genetics, Department of Pathology, Institute of Infectious Disease and Molecular Medicine, Faculty of Health Sciences, University of Cape Town, Cape Town, South Africa
- Michelle Skelton
- Computational Biology Division, Department of Integrative Biomedical Sciences, Institute of Infectious Disease and Molecular Medicine, CIDRI-Africa Wellcome Trust Centre, Faculty of Health Sciences, University of Cape Town, Cape Town, South Africa
- Gaston K. Mazandu
- Division of Human Genetics, Department of Pathology, Institute of Infectious Disease and Molecular Medicine, Faculty of Health Sciences, University of Cape Town, Cape Town, South Africa
- Computational Biology Division, Department of Integrative Biomedical Sciences, Institute of Infectious Disease and Molecular Medicine, CIDRI-Africa Wellcome Trust Centre, Faculty of Health Sciences, University of Cape Town, Cape Town, South Africa
- African Institute for Mathematical Sciences, Cape Town, South Africa
- Thomas H. A. Ederveen
- Center for Molecular and Biomolecular Informatics (CMBI), Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, Nijmegen, Netherlands
- Nicola Mulder
- Computational Biology Division, Department of Integrative Biomedical Sciences, Institute of Infectious Disease and Molecular Medicine, CIDRI-Africa Wellcome Trust Centre, Faculty of Health Sciences, University of Cape Town, Cape Town, South Africa
- Emile R. Chimusa
- Department of Applied Sciences, Faculty of Health and Life Sciences, Northumbria University, Newcastle, United Kingdom
- Peter A. C. 't Hoen
- Center for Molecular and Biomolecular Informatics (CMBI), Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, Nijmegen, Netherlands
15
Robin V, Bodein A, Scott-Boyer MP, Leclercq M, Périn O, Droit A. Overview of methods for characterization and visualization of a protein-protein interaction network in a multi-omics integration context. Front Mol Biosci 2022; 9:962799. [PMID: 36158572 PMCID: PMC9494275 DOI: 10.3389/fmolb.2022.962799] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/06/2022] [Accepted: 08/16/2022] [Indexed: 11/26/2022]
Abstract
Protein-protein interactions (PPIs) lie at the heart of the cellular machinery and play a significant role in regulating cellular functions. PPIs can be analyzed with network approaches. Construction of a PPI network requires prediction of the interactions. All PPIs form a network. Different biases, such as lack of data, recurrence of information, and false interactions, make the network unstable. Integrated strategies help address these challenges. These approaches have shown encouraging results for the understanding of molecular mechanisms, drug action mechanisms, and identification of target genes. To weight the importance of an interaction, it is evaluated by different confidence scores. These scores allow the network to be filtered and thus make it easier to represent, both essential steps in identifying and understanding molecular mechanisms. In this review, we discuss the main computational methods for predicting PPIs, including those that confirm an interaction, as well as the integration of PPIs into a network and the visualization of these complex data.
Affiliation(s)
- Vivian Robin
- Molecular Medicine Department, CHU de Québec Research Center, Université Laval, Québec, QC, Canada
- Antoine Bodein
- Molecular Medicine Department, CHU de Québec Research Center, Université Laval, Québec, QC, Canada
- Marie-Pier Scott-Boyer
- Molecular Medicine Department, CHU de Québec Research Center, Université Laval, Québec, QC, Canada
- Mickaël Leclercq
- Molecular Medicine Department, CHU de Québec Research Center, Université Laval, Québec, QC, Canada
- Olivier Périn
- Digital Sciences Department, L'Oréal Advanced Research, Aulnay-sous-bois, France
- Arnaud Droit
- Molecular Medicine Department, CHU de Québec Research Center, Université Laval, Québec, QC, Canada
16
Oestreich M, Holsten L, Agrawal S, Dahm K, Koch P, Jin H, Becker M, Ulas T. hCoCena: Horizontal integration and analysis of transcriptomics datasets. Bioinformatics 2022; 38:4727-4734. [PMID: 36018233 PMCID: PMC9563699 DOI: 10.1093/bioinformatics/btac589] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 03/18/2022] [Revised: 07/29/2022] [Accepted: 08/25/2022] [Indexed: 11/23/2022]
Abstract
Motivation: Transcriptome-based gene co-expression analysis has become a standard procedure for structured and contextualized understanding and comparison of different conditions and phenotypes. Since large study designs with a broad variety of conditions are costly and laborious, extensive comparisons are hindered when utilizing only a single dataset. Thus, there is an increased need for tools that allow the integration of multiple transcriptomic datasets with subsequent joint analysis, which can provide a more systematic understanding of gene co-expression and co-functionality within and across conditions. To make such an integrative analysis accessible to a wide spectrum of users with differing levels of programming expertise it is essential to provide user-friendliness and customizability as well as thorough documentation. Results: This article introduces horizontal CoCena (hCoCena: horizontal construction of co-expression networks and analysis), an R-package for network-based co-expression analysis that allows the analysis of a single transcriptomic dataset as well as the joint analysis of multiple datasets. With hCoCena, we provide a freely available, user-friendly and adaptable tool for integrative multi-study or single-study transcriptomics analyses alongside extensive comparisons to other existing tools. Availability and implementation: The hCoCena R-package is provided together with R Markdowns that implement an exemplary analysis workflow including extensive documentation and detailed descriptions of data structures and objects. Such efforts not only make the tool easy to use but also enable the seamless integration of user-written scripts and functions into the workflow, creating a tool that provides a clear design while remaining flexible and highly customizable. The package and additional information including an extensive Wiki are freely available on GitHub: https://github.com/MarieOestreich/hCoCena. The version at the time of writing has been added to Zenodo under the following link: https://doi.org/10.5281/zenodo.6911782. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Marie Oestreich
- Modular High Performance Computing and Artificial Intelligence, Deutsches Zentrum für Neurodegenerative Erkrankungen (DZNE), Venusberg-Campus 1/99, Bonn, 53127, Germany; Systems Medicine, Deutsches Zentrum für Neurodegenerative Erkrankungen (DZNE) e.V., Bonn, Germany
- Lisa Holsten
- Deutsches Zentrum für Neurodegenerative Erkrankungen (DZNE) e.V., PRECISE Platform for Genomics and Epigenomics at DZNE and University of Bonn, Bonn, Germany; Systems Medicine, Deutsches Zentrum für Neurodegenerative Erkrankungen (DZNE) e.V., Bonn, Germany
- Shobhit Agrawal
- Genomics and Immunoregulation, LIMES-Institute, University of Bonn, Bonn, 53115, Germany; Systems Medicine, Deutsches Zentrum für Neurodegenerative Erkrankungen (DZNE) e.V., Bonn, Germany
- Kilian Dahm
- Deutsches Zentrum für Neurodegenerative Erkrankungen (DZNE) e.V., PRECISE Platform for Genomics and Epigenomics at DZNE and University of Bonn, Bonn, Germany; Systems Medicine, Deutsches Zentrum für Neurodegenerative Erkrankungen (DZNE) e.V., Bonn, Germany
- Philipp Koch
- Deutsches Zentrum für Neurodegenerative Erkrankungen (DZNE) e.V., PRECISE Platform for Genomics and Epigenomics at DZNE and University of Bonn, Bonn, Germany; Systems Medicine, Deutsches Zentrum für Neurodegenerative Erkrankungen (DZNE) e.V., Bonn, Germany
- Han Jin
- Science for Life Laboratory (SciLifelab), KTH Royal Institute of Technology, Stockholm, Sweden
- Matthias Becker
- Modular High Performance Computing and Artificial Intelligence, Deutsches Zentrum für Neurodegenerative Erkrankungen (DZNE), Venusberg-Campus 1/99, Bonn, 53127, Germany; Systems Medicine, Deutsches Zentrum für Neurodegenerative Erkrankungen (DZNE) e.V., Bonn, Germany
- Thomas Ulas
- Deutsches Zentrum für Neurodegenerative Erkrankungen (DZNE) e.V., PRECISE Platform for Genomics and Epigenomics at DZNE and University of Bonn, Bonn, Germany; Genomics and Immunoregulation, LIMES-Institute, University of Bonn, Bonn, 53115, Germany; Systems Medicine, Deutsches Zentrum für Neurodegenerative Erkrankungen (DZNE) e.V., Bonn, Germany
17
Characterization of methylation patterns associated with lifestyle factors and vitamin D supplementation in a healthy elderly cohort from Southwest Sweden. Sci Rep 2022; 12:12670. [PMID: 35879377 PMCID: PMC9310683 DOI: 10.1038/s41598-022-15924-x] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 03/03/2022] [Accepted: 07/01/2022] [Indexed: 11/08/2022]
Abstract
Numerous studies have shown that lifestyle factors, such as regular physical activity and vitamin D intake, may remarkably improve overall health and mental wellbeing. This is especially important in older adults, among whom vitamin D deficiency is highly prevalent. This study aimed to examine the influence of lifestyle and vitamin D on global DNA methylation patterns in an elderly cohort in Southwest Sweden. We also sought to examine the methylation levels of specific genes involved in the molecular and metabolic pathways activated by vitamin D. We performed a genome-wide methylation analysis, using the Illumina Infinium DNA Methylation EPIC 850k BeadChip array, on 277 healthy individuals from Southwest Sweden aged 70–95. The study participants also answered queries on lifestyle, vitamin intake, heart medication, and estimated health. Vitamin D intake did not in general affect methylation patterns, which is consistent with other studies. However, when comparing the group of individuals taking vitamin supplements, including vitamin D, with those not taking supplements, a difference in methylation in the solute carrier family 25 (SLC25A24) gene was found. This confirms a previous finding, where changes in expression of SLC25A24 were associated with vitamin D treatment in human monocytes. The combination of vitamin D intake and high physical activity increased methylation of genes linked to regulation of the vitamin D receptor pathway, the Wnt pathway and general cancer processes. To our knowledge, this is the first study detecting epigenetic markers associated with the combined effects of vitamin D supplementation and high physical activity. These results deserve further investigation in an extended, interventional study cohort, in which the levels of 25(OH)D3 can also be monitored.
18
Cheng B, Zhou P, Chen Y. Machine-learning algorithms based on personalized pathways for a novel predictive model for the diagnosis of hepatocellular carcinoma. BMC Bioinformatics 2022; 23:248. [PMID: 35739471 PMCID: PMC9219178 DOI: 10.1186/s12859-022-04805-9] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 03/30/2022] [Accepted: 06/20/2022] [Indexed: 12/14/2022]
Abstract
BACKGROUND: At present, the ability to diagnose hepatocellular carcinoma (HCC) from serum alpha-fetoprotein level is limited. Finding markers that can effectively distinguish cancerous from non-cancerous tissues is important for improving the diagnostic efficiency of HCC. RESULTS: In this study, we developed a predictive model for HCC diagnosis using personalized biological pathways combined with a machine learning algorithm based on regularized regression and carried out relevant examinations. In two training sets, the overall cross-study-validated area under the receiver operating characteristic curve (AUROC), the area under the precision-recall curve (AUPRC) and the Brier score of the diagnostic model were 0.987 [95% confidence interval (CI): 0.979-0.996], 0.981 and 0.091, respectively. In addition, the model showed good transferability in an external validation set. In the TCGA-LIHC cohort, the AUROC, AUPRC and Brier score were 0.992 (95% CI: 0.985-0.998), 0.967 and 0.112, respectively. The diagnostic model achieved very strong performance in distinguishing HCC from non-cancerous liver tissues. Moreover, we further analyzed the extracted biological pathways to explore molecular features and prognostic factors. The risk score generated from a 12-gene signature extracted from the characteristic pathways was correlated with some immune-related pathways and served as an independent prognostic factor for HCC. CONCLUSION: We used personalized biological pathway analysis and a machine learning algorithm to construct a highly accurate HCC diagnostic model. The interpretability and good transferability of this model give it great potential for personalized medicine, where it can assist clinicians in diagnosing HCC patients.
Affiliation(s)
- Binglin Cheng
- Department of Radiation Oncology, Nanfang Hospital, Southern Medical University, 1838 Guangzhou Avenue North, Baiyun District, Guangzhou, 510515, Guangdong Province, China; The First School of Clinical Medicine, Southern Medical University, Guangzhou, Guangdong Province, China
- Peitao Zhou
- Department of Radiation Oncology, Nanfang Hospital, Southern Medical University, 1838 Guangzhou Avenue North, Baiyun District, Guangzhou, 510515, Guangdong Province, China
- Yuhan Chen
- Department of Radiation Oncology, Nanfang Hospital, Southern Medical University, 1838 Guangzhou Avenue North, Baiyun District, Guangzhou, 510515, Guangdong Province, China
19
Johansson M, Tangruksa B, Heydarkhan-Hagvall S, Jeppsson A, Sartipy P, Synnergren J. Data Mining Identifies CCN2 and THBS1 as Biomarker Candidates for Cardiac Hypertrophy. Life (Basel) 2022; 12:life12050726. [PMID: 35629393 PMCID: PMC9147176 DOI: 10.3390/life12050726] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 04/20/2022] [Revised: 05/06/2022] [Accepted: 05/11/2022] [Indexed: 12/02/2022]
Abstract
Cardiac hypertrophy is a condition that may contribute to the development of heart failure. In this study, we compare the gene-expression patterns of our in vitro stem-cell-based cardiac hypertrophy model with the gene expression of biopsies collected from hypertrophic human hearts. Twenty-five differentially expressed genes (DEGs) from both groups were identified and the expression of selected corresponding secreted proteins was validated using ELISA and Western blot. Several biomarkers, including CCN2, THBS1, NPPA, and NPPB, were identified, which showed significant overexpression in the hypertrophic samples in both the cardiac biopsies and the endothelin-1-treated cells, at both the gene and protein levels. The protein-interaction network analysis revealed CCN2 as a central node among the 25 overlapping DEGs, suggesting that this gene might play an important role in the development of cardiac hypertrophy. GO-enrichment analysis of the 25 DEGs revealed many biological processes associated with cardiac function and the development of cardiac hypertrophy. In conclusion, we identified important similarities between ET-1-stimulated human-stem-cell-derived cardiomyocytes and human hypertrophic cardiac tissue. Novel putative cardiac hypertrophy biomarkers were identified and validated on the protein level, lending support for further investigations to assess their potential for future clinical applications.
Affiliation(s)
- Markus Johansson
- Systems Biology Research Center, School of Bioscience, University of Skövde, SE-541 28 Skövde, Sweden
- Department of Molecular and Clinical Medicine, Institute of Medicine, The Sahlgrenska Academy at University of Gothenburg, SE-413 45 Gothenburg, Sweden
- Correspondence: (M.J.); (B.T.)
- Benyapa Tangruksa
- Systems Biology Research Center, School of Bioscience, University of Skövde, SE-541 28 Skövde, Sweden
- Correspondence: (M.J.); (B.T.)
- Sepideh Heydarkhan-Hagvall
- Systems Biology Research Center, School of Bioscience, University of Skövde, SE-541 28 Skövde, Sweden
- Bioscience, Research and Early Development, Cardiovascular, Renal and Metabolism (CVRM), BioPharmaceuticals R&D, AstraZeneca, SE-413 83 Gothenburg, Sweden
- Anders Jeppsson
- Department of Molecular and Clinical Medicine, Institute of Medicine, The Sahlgrenska Academy at University of Gothenburg, SE-413 45 Gothenburg, Sweden
- Department of Cardiothoracic Surgery, Sahlgrenska University Hospital, SE-413 45 Gothenburg, Sweden
- Peter Sartipy
- Systems Biology Research Center, School of Bioscience, University of Skövde, SE-541 28 Skövde, Sweden
- Jane Synnergren
- Systems Biology Research Center, School of Bioscience, University of Skövde, SE-541 28 Skövde, Sweden
20
Integration of Omics and Phenotypic Data for Precision Medicine. METHODS IN MOLECULAR BIOLOGY (CLIFTON, N.J.) 2022; 2486:19-35. [PMID: 35437716 DOI: 10.1007/978-1-0716-2265-0_2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
Abstract
Over the past two decades, biomedical research has been moving toward a big-data-driven approach. The underlying causes of this transition include the ability to gather genetic or molecular profiles of humans faster, the increasing adoption of electronic health record (EHR) systems, and the growing interest in linking omics and phenotypic data for analysis. The integration of an individual's biological data (e.g., genomics, proteomics, metabolomics) and health-care data has created unprecedented opportunities for precision medicine, that is, a medical model that uses a patient's unique information, mainly genetic, to prevent, diagnose, or treat disease. This chapter reviewed the research opportunities and applications of integrating omics and phenotypic data for precision medicine, such as understanding the relationship between genotype and phenotype, disease subtyping, and diagnosis or prediction of adverse outcomes. We reviewed recent advanced methods, particularly the machine learning and deep learning-based approaches used for harnessing and harmonizing multiomics and phenotypic data to address these applications. We finally discussed the challenges and future directions.
21
Mallick P, Maity S, Chakrabarti O, Chakrabarti S. Role of systems biology and multi-omics analyses in delineating spatial interconnectivity and temporal dynamicity of ER stress mediated cellular responses. BIOCHIMICA ET BIOPHYSICA ACTA. MOLECULAR CELL RESEARCH 2022; 1869:119210. [PMID: 35032474 DOI: 10.1016/j.bbamcr.2022.119210] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/02/2021] [Revised: 12/01/2021] [Accepted: 12/30/2021] [Indexed: 06/14/2023]
Abstract
The endoplasmic reticulum (ER) is a membranous organelle involved in calcium storage, lipid biosynthesis, protein folding and processing. Many patho-physiological conditions and pharmacological agents are known to perturb normal ER function and can lead to ER stress, which severely compromises the protein folding mechanism and hence poses a high risk of proteotoxicity. Upon sensing ER stress, the different stress signaling pathways interconnect with each other and work together to preserve cellular homeostasis. The ER stress response is a part of the integrative stress response (ISR) and might play an important role in the pathogenesis of chronic neurodegenerative diseases, where misfolded protein accumulation and cell death are common. The initiation, manifestation and progression of the ER stress mediated unfolded protein response (UPR) is a complex process involving multiple proteins, pathways and cellular organelles. To understand the causes and consequences of such complex processes, an integrative, holistic approach is required to identify novel players and regulators of ER stress. As multi-omics data-based systems analyses have shown potential to unravel the underlying molecular mechanisms of complex biological systems, it is important to emphasize the utility of this approach in understanding ER stress biology. In this review we first discuss the ER stress signaling pathways and regulatory players, along with their inter-connectivity. We next highlight the importance of systems and network biology approaches using multi-omics data in understanding ER stress mediated cellular responses. This report should help advance our current understanding of the multivariate spatial interconnectivity and temporal dynamicity of ER stress.
Affiliation(s)
- Priyanka Mallick
- Structural Biology and Bioinformatics Division, CSIR-Indian Institute of Chemical Biology, IICB TRUE Campus, CN-6, Sector 5, Salt Lake, Kolkata Pin 700091, WB, India; Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, Uttar Pradesh, India
- Sebabrata Maity
- Biophysics & Structural Genomics Division, Saha Institute of Nuclear Physics, 1/AF Bidhannagar, Kolkata 700064, India; Homi Bhabha National Institute, India
- Oishee Chakrabarti
- Biophysics & Structural Genomics Division, Saha Institute of Nuclear Physics, 1/AF Bidhannagar, Kolkata 700064, India; Homi Bhabha National Institute, India.
- Saikat Chakrabarti
- Structural Biology and Bioinformatics Division, CSIR-Indian Institute of Chemical Biology, IICB TRUE Campus, CN-6, Sector 5, Salt Lake, Kolkata Pin 700091, WB, India; Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, Uttar Pradesh, India.
22
Marwaha S, Knowles JW, Ashley EA. A guide for the diagnosis of rare and undiagnosed disease: beyond the exome. Genome Med 2022; 14:23. [PMID: 35220969 PMCID: PMC8883622 DOI: 10.1186/s13073-022-01026-w] [Citation(s) in RCA: 142] [Impact Index Per Article: 47.3] [Received: 06/09/2021] [Accepted: 02/10/2022] [Indexed: 02/07/2023]
Abstract
Rare diseases affect 30 million people in the USA and more than 300-400 million worldwide, often causing chronic illness, disability, and premature death. Traditional diagnostic techniques rely heavily on heuristic approaches, coupling clinical experience from prior rare disease presentations with the medical literature. A large number of rare disease patients remain undiagnosed for years and many even die without an accurate diagnosis. In recent years, gene panels, microarrays, and exome sequencing have helped to identify the molecular cause of such rare and undiagnosed diseases. These technologies have allowed diagnoses for a sizable proportion (25-35%) of undiagnosed patients, often with actionable findings. However, a large proportion of these patients remain undiagnosed. In this review, we focus on technologies that can be adopted if exome sequencing is unrevealing. We discuss the benefits of sequencing the whole genome and the additional benefit that may be offered by long-read technology, pan-genome reference, transcriptomics, metabolomics, proteomics, and methyl profiling. We highlight computational methods to help identify regionally distant patients with similar phenotypes or similar genetic mutations. Finally, we describe approaches to automate and accelerate genomic analysis. The strategies discussed here are intended to serve as a guide for clinicians and researchers in the next steps when encountering patients with non-diagnostic exomes.
Affiliation(s)
- Shruti Marwaha
- Department of Medicine, Division of Cardiovascular Medicine, School of Medicine, Stanford University, Stanford, CA, USA.
- Stanford Center for Undiagnosed Diseases, Stanford University, Stanford, CA, USA.
- Joshua W Knowles
- Department of Medicine, Division of Cardiovascular Medicine, School of Medicine, Stanford University, Stanford, CA, USA
- Department of Medicine, Diabetes Research Center, Cardiovascular Institute and Prevention Research Center, Stanford, CA, USA
- Euan A Ashley
- Department of Medicine, Division of Cardiovascular Medicine, School of Medicine, Stanford University, Stanford, CA, USA.
- Stanford Center for Undiagnosed Diseases, Stanford University, Stanford, CA, USA.
- Department of Genetics, School of Medicine, Stanford University, Stanford, CA, USA.
|
23
|
John Cremin C, Dash S, Huang X. Big Data: Historic Advances and Emerging Trends in Biomedical Research. CURRENT RESEARCH IN BIOTECHNOLOGY 2022. [DOI: 10.1016/j.crbiot.2022.02.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022] Open
|
24
|
Raza A, Tabassum J, Zahid Z, Charagh S, Bashir S, Barmukh R, Khan RSA, Barbosa F, Zhang C, Chen H, Zhuang W, Varshney RK. Advances in "Omics" Approaches for Improving Toxic Metals/Metalloids Tolerance in Plants. FRONTIERS IN PLANT SCIENCE 2022; 12:794373. [PMID: 35058954 PMCID: PMC8764127 DOI: 10.3389/fpls.2021.794373] [Citation(s) in RCA: 32] [Impact Index Per Article: 10.7] [Received: 10/13/2021] [Accepted: 11/22/2021] [Indexed: 05/17/2023]
Abstract
Food safety has emerged as a high-urgency matter for sustainable agricultural production. Toxic metal contamination of soil and water significantly affects agricultural productivity, which is further aggravated by extreme anthropogenic activities and modern agricultural practices, leaving food safety and human health at risk. In addition to reducing crop production, increased metals/metalloids toxicity also disturbs plants' demand and supply equilibrium. Counterbalancing toxic metals/metalloids toxicity demands a better understanding of the complex mechanisms at physiological, biochemical, molecular, cellular, and plant levels that may result in increased crop productivity. Consequently, plants have established different internal defense mechanisms to cope with the adverse effects of toxic metals/metalloids. Nevertheless, these internal defense mechanisms are not adequate to overwhelm the metals/metalloids toxicity. Plants produce several secondary messengers to trigger cell signaling, activating the numerous transcriptional responses correlated with plant defense. Therefore, the recent advances in omics approaches such as genomics, transcriptomics, proteomics, metabolomics, ionomics, miRNAomics, and phenomics have enabled the characterization of molecular regulators associated with toxic metal tolerance, which can be deployed for developing toxic metal tolerant plants. This review highlights various response strategies adopted by plants to tolerate toxic metals/metalloids toxicity, including physiological, biochemical, and molecular responses. A seven-(omics)-based design is summarized with scientific clues to reveal the stress-responsive genes, proteins, metabolites, miRNAs, trace elements, stress-inducible phenotypes, and metabolic pathways that could potentially help plants to cope with metals/metalloids toxicity in the face of fluctuating environmental conditions. Finally, some bottlenecks and future directions have also been highlighted, which could enable sustainable agricultural production.
Affiliation(s)
- Ali Raza
- Key Laboratory of Ministry of Education for Genetics, Breeding and Multiple Utilization of Crops, Center of Legume Crop Genetics and Systems Biology/College of Agriculture, Oil Crops Research Institute, Fujian Agriculture and Forestry University (FAFU), Fuzhou, China
- Javaria Tabassum
- State Key Laboratory of Rice Biology, China National Rice Research Institute, Chinese Academy of Agricultural Sciences (CAAS), Hangzhou, China
- Zainab Zahid
- School of Civil and Environmental Engineering (SCEE), Institute of Environmental Sciences and Engineering (IESE), National University of Sciences and Technology (NUST), Islamabad, Pakistan
- Sidra Charagh
- State Key Laboratory of Rice Biology, China National Rice Research Institute, Chinese Academy of Agricultural Sciences (CAAS), Hangzhou, China
- Shanza Bashir
- School of Civil and Environmental Engineering (SCEE), Institute of Environmental Sciences and Engineering (IESE), National University of Sciences and Technology (NUST), Islamabad, Pakistan
- Rutwik Barmukh
- Center of Excellence in Genomics & Systems Biology, International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Hyderabad, India
- Rao Sohail Ahmad Khan
- Centre of Agricultural Biochemistry and Biotechnology (CABB), University of Agriculture, Faisalabad, Pakistan
- Fernando Barbosa
- Department of Clinical Analysis, Toxicology and Food Sciences, University of Sao Paulo, Ribeirão Preto, Brazil
- Chong Zhang
- Key Laboratory of Ministry of Education for Genetics, Breeding and Multiple Utilization of Crops, Center of Legume Crop Genetics and Systems Biology/College of Agriculture, Oil Crops Research Institute, Fujian Agriculture and Forestry University (FAFU), Fuzhou, China
- Hua Chen
- Key Laboratory of Ministry of Education for Genetics, Breeding and Multiple Utilization of Crops, Center of Legume Crop Genetics and Systems Biology/College of Agriculture, Oil Crops Research Institute, Fujian Agriculture and Forestry University (FAFU), Fuzhou, China
- Weijian Zhuang
- Key Laboratory of Ministry of Education for Genetics, Breeding and Multiple Utilization of Crops, Center of Legume Crop Genetics and Systems Biology/College of Agriculture, Oil Crops Research Institute, Fujian Agriculture and Forestry University (FAFU), Fuzhou, China
- Rajeev K. Varshney
- Key Laboratory of Ministry of Education for Genetics, Breeding and Multiple Utilization of Crops, Center of Legume Crop Genetics and Systems Biology/College of Agriculture, Oil Crops Research Institute, Fujian Agriculture and Forestry University (FAFU), Fuzhou, China
- Center of Excellence in Genomics & Systems Biology, International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Hyderabad, India
- State Agricultural Biotechnology Centre, Centre for Crop and Food Innovation, Food Futures Institute, Murdoch University, Murdoch, WA, Australia
|
25
|
Ju J, Wismans LV, Mustafa DAM, Reinders MJT, van Eijck CHJ, Stubbs AP, Li Y. Robust deep learning model for prognostic stratification of pancreatic ductal adenocarcinoma patients. iScience 2021; 24:103415. [PMID: 34901786 PMCID: PMC8637475 DOI: 10.1016/j.isci.2021.103415] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 06/07/2021] [Revised: 09/27/2021] [Accepted: 11/05/2021] [Indexed: 02/07/2023] Open
Abstract
A major challenge for treating patients with pancreatic ductal adenocarcinoma (PDAC) is the unpredictability of their prognoses due to high heterogeneity. We present Multi-Omics DEep Learning for Prognosis-correlated subtyping (MODEL-P) to identify PDAC subtypes and to predict prognoses of new patients. MODEL-P was trained on autoencoder integrated multi-omics of 146 patients with PDAC together with their survival outcome. Using MODEL-P, we identified two PDAC subtypes with distinct survival outcomes (median survival 10.1 and 22.7 months, respectively, log rank p = 1 × 10⁻⁶), which correspond to DNA damage repair and immune response. We rigorously validated MODEL-P by stratifying patients in five independent datasets into these two survival groups and achieved significant survival difference, which is superior to current practice and other subtyping schemas. We believe the subtype-specific signatures would facilitate PDAC pathogenesis discovery, and MODEL-P can provide clinicians the prognoses information in the treatment decision-making to better gauge the benefits versus the risks.
- We developed DL-based MODEL-P to identify prognosis-correlated PDAC subtypes
- The identified subtypes related to DNA damage repair and immune response processes
- MODEL-P stratified patients from independent datasets into distinct survival groups
- MODEL-P could be used in clinics to aid treatment decision-making
Affiliation(s)
- Jie Ju
- Department of Pathology & Clinical Bioinformatics, Erasmus MC Cancer Institute, University Medical Center Rotterdam, Rotterdam, the Netherlands
- Leonoor V Wismans
- Department of Surgery, Erasmus MC Cancer Institute, University Medical Center Rotterdam, Rotterdam, the Netherlands
- Dana A M Mustafa
- Department of Pathology & Clinical Bioinformatics, Erasmus MC Cancer Institute, University Medical Center Rotterdam, Rotterdam, the Netherlands
- Marcel J T Reinders
- The Delft Bioinformatics Lab, Delft University of Technology, Rotterdam, the Netherlands
- Casper H J van Eijck
- Department of Surgery, Erasmus MC Cancer Institute, University Medical Center Rotterdam, Rotterdam, the Netherlands
- Andrew P Stubbs
- Department of Pathology & Clinical Bioinformatics, Erasmus MC Cancer Institute, University Medical Center Rotterdam, Rotterdam, the Netherlands
- Yunlei Li
- Department of Pathology & Clinical Bioinformatics, Erasmus MC Cancer Institute, University Medical Center Rotterdam, Rotterdam, the Netherlands
|
26
|
Demirel HC, Arici MK, Tuncbag N. Computational approaches leveraging integrated connections of multi-omic data toward clinical applications. Mol Omics 2021; 18:7-18. [PMID: 34734935 DOI: 10.1039/d1mo00158b] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 01/18/2023]
Abstract
In line with the advances in high-throughput technologies, multiple omic datasets have accumulated to study biological systems and diseases coherently. No single omics data type is capable of fully representing cellular activity. The complexity of the biological processes arises from the interactions between omic entities such as genes, proteins, and metabolites. Therefore, multi-omic data integration is crucial but challenging. The impact of the molecular alterations in multi-omic data is not local in the neighborhood of the altered gene or protein; rather, the impact diffuses in the network and changes the functionality of multiple signaling pathways and regulation of the gene expression. Additionally, multi-omic data is high-dimensional and has background noise. Several integrative approaches have been developed to accurately interpret the multi-omic datasets, including machine learning, network-based methods, and their combination. In this review, we overview the most recent integrative approaches and tools with a focus on network-based methods. We then discuss these approaches according to their specific applications, from disease-network and biomarker identification to patient stratification, drug discovery, and repurposing.
Affiliation(s)
- Habibe Cansu Demirel
- Graduate School of Informatics, Middle East Technical University, Ankara, 06800, Turkey
- Muslum Kaan Arici
- Graduate School of Informatics, Middle East Technical University, Ankara, 06800, Turkey; Foot and Mouth Diseases Institute, Ministry of Agriculture and Forestry, Ankara, 06044, Turkey
- Nurcan Tuncbag
- Chemical and Biological Engineering, College of Engineering, Koc University, Istanbul, 34450, Turkey; School of Medicine, Koc University, Istanbul, 34450, Turkey; Koc University Research Center for Translational Medicine (KUTTAM), Istanbul, Turkey.
|
27
|
Sindhu KJ, Venkatesan N, Karunagaran D. MicroRNA Interactome Multiomics Characterization for Cancer Research and Personalized Medicine: An Expert Review. OMICS-A JOURNAL OF INTEGRATIVE BIOLOGY 2021; 25:545-566. [PMID: 34448651 DOI: 10.1089/omi.2021.0087] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 12/12/2022]
Abstract
MicroRNAs (miRNAs) that are mutually modulated by their interacting partners (interactome) are being increasingly noted for their significant role in pathogenesis and treatment of various human cancers. Recently, miRNA interactome dissected with multiomics approaches has been the subject of focus since individual tools or methods failed to provide the necessary comprehensive clues on the complete interactome. Even though single-omics technologies such as proteomics can uncover part of the interactome, the biological and clinical understanding still remains incomplete. In this study, we present an expert review of studies involving multiomics approaches to identification of miRNA interactome and its application in mechanistic characterization, classification, and therapeutic target identification in a variety of cancers, and with a focus on proteomics. We also discuss individual or multiple miRNA-based interactome identification in various pathological conditions of relevance to clinical medicine. Various new single-omics methods that can be integrated into multiomics cancer research and the computational approaches to analyze and predict miRNA interactome are also highlighted in this review. In all, we contextualize the power of multiomics approaches and the importance of the miRNA interactome to achieve the vision and practice of predictive, preventive, and personalized medicine in cancer research and clinical oncology.
Affiliation(s)
- K J Sindhu
- Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, India
- Nalini Venkatesan
- Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, India
- Devarajan Karunagaran
- Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, India
|
28
|
Park KS, Kim SH, Oh JH, Kim SY. Highly accurate diagnosis of papillary thyroid carcinomas based on personalized pathways coupled with machine learning. Brief Bioinform 2021; 22:bbaa336. [PMID: 33341874 PMCID: PMC8599295 DOI: 10.1093/bib/bbaa336] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 07/09/2020] [Revised: 10/25/2020] [Accepted: 10/26/2020] [Indexed: 01/27/2023] Open
Abstract
Thyroid nodules are neoplasms commonly found among adults, with papillary thyroid carcinoma (PTC) being the most prevalent malignancy. However, current diagnostic methods often subject patients to unnecessary surgical burden. In this study, we developed and validated an automated, highly accurate multi-study-derived diagnostic model for PTCs using personalized biological pathways coupled with a sophisticated machine learning algorithm. Surprisingly, the algorithm achieved near-perfect performance in discriminating PTCs from non-tumoral thyroid samples with an overall cross-study-validated area under the receiver operating characteristic curve (AUROC) of 0.999 (95% confidence interval [CI]: 0.995-1) and a Brier score of 0.013 on three independent development cohorts. In addition, the algorithm showed excellent generalizability and transferability on two large-scale external blind PTC cohorts consisting of The Cancer Genome Atlas (TCGA), which is the largest genomic PTC cohort studied to date, and the post-Chernobyl cohort, which includes PTCs reported after exposure to radiation from the Chernobyl accident. When applied to the TCGA cohort, the model yielded an AUROC of 0.969 (95% CI: 0.950-0.987) and a Brier score of 0.109. On the post-Chernobyl cohort, it yielded an AUROC of 0.962 (95% CI: 0.918-1) and a Brier score of 0.073. The algorithm is also robust across various other clinical scenarios, discriminating malignant from benign lesions as well as clinically aggressive thyroid cancer with poor prognosis from indolent ones. Furthermore, we discovered novel pathway alterations and prognostic signatures for PTC, which can provide directions for follow-up studies.
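As a reading aid for the two metrics quoted in this abstract, the following is a minimal Python sketch of how a Brier score and an AUROC can be computed from binary labels and predicted probabilities. The function names `brier_score` and `auroc` and the toy labels and scores are assumptions made for illustration only; they are not the study's code or data, and the pairwise AUROC formula is used here simply because it is the easiest form to read.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and the 0/1 outcome.

    Lower is better; 0 means every prediction was exactly right.
    """
    if len(y_true) != len(y_prob) or len(y_true) == 0:
        raise ValueError("y_true and y_prob must be non-empty and the same length")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def auroc(y_true, y_prob):
    """Probability that a random positive is scored above a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(
        1.0 if a > b else 0.5 if a == b else 0.0
        for a in pos
        for b in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = tumour, 0 = non-tumoral sample.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]

print("Brier score:", brier_score(labels, scores))
print("AUROC:", auroc(labels, scores))
```

The two metrics answer different questions: AUROC depends only on how the scores rank positives against negatives, while the Brier score also penalises probabilities that are poorly calibrated, which is why a model can have a high AUROC and still a noticeably non-zero Brier score.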
Affiliation(s)
- Jung Hun Oh
- Department of Medical Physics at Memorial Sloan Kettering Cancer Center, USA
|
29
|
Abstract
A key challenge in the analysis of longitudinal microbiome data is the inference of temporal interactions between microbial taxa, their genes, the metabolites that they consume and produce, and host genes. To address these challenges, we developed a computational pipeline, a pipeline for the analysis of longitudinal multi-omics data (PALM), that first aligns multi-omics data and then uses dynamic Bayesian networks (DBNs) to reconstruct a unified model. Our approach overcomes differences in sampling and progression rates, utilizes a biologically inspired multi-omic framework, reduces the large number of entities and parameters in the DBNs, and validates the learned network. Applying PALM to data collected from inflammatory bowel disease patients, we show that it accurately identifies known and novel interactions. Targeted experimental validations further support a number of the predicted novel metabolite-taxon interactions.
IMPORTANCE
While a number of large consortia collect and profile several different types of microbiome and genomic time series data, very few methods exist for joint modeling of multi-omics data sets. We developed a new computational pipeline, PALM, which uses dynamic Bayesian networks (DBNs) and is designed to integrate multi-omics data from longitudinal microbiome studies. When used to integrate sequence, expression, and metabolomics data from microbiome samples along with host expression data, the resulting models identify interactions between taxa, their genes, and the metabolites that they produce and consume, as well as their impact on host expression. We tested the models both by using them to predict future changes in microbiome levels and by comparing the learned interactions to known interactions in the literature. Finally, we performed experimental validations for a few of the predicted interactions to demonstrate the ability of the method to identify novel relationships and their impact.
|
30
|
Aghakhani S, Zerrouk N, Niarakis A. Metabolic Reprogramming of Fibroblasts as Therapeutic Target in Rheumatoid Arthritis and Cancer: Deciphering Key Mechanisms Using Computational Systems Biology Approaches. Cancers (Basel) 2020; 13:E35. [PMID: 33374292 PMCID: PMC7795338 DOI: 10.3390/cancers13010035] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 10/31/2020] [Revised: 12/12/2020] [Accepted: 12/17/2020] [Indexed: 12/29/2022] Open
Abstract
Fibroblasts, the most abundant cells in the connective tissue, are key modulators of the extracellular matrix (ECM) composition. These spindle-shaped cells are capable of synthesizing various extracellular matrix proteins and collagen. They also provide the structural framework (stroma) for tissues and play a pivotal role in the wound healing process. While they are maintainers of the ECM turnover and regulate several physiological processes, they can also undergo transformations responding to certain stimuli and display aggressive phenotypes that contribute to disease pathophysiology. In this review, we focus on the metabolic pathways of glucose and highlight metabolic reprogramming as a critical event that contributes to the transition of fibroblasts from quiescent to activated and aggressive cells. We also cover the emerging evidence that allows us to draw parallels between fibroblasts in autoimmune disorders and more specifically in rheumatoid arthritis and cancer. We link the metabolic changes of fibroblasts to the toxic environment created by the disease condition and discuss how targeting of metabolic reprogramming could be employed in the treatment of such diseases. Lastly, we discuss Systems Biology approaches, and more specifically, computational modeling, as a means to elucidate pathogenetic mechanisms and accelerate the identification of novel therapeutic targets.
Affiliation(s)
- Sahar Aghakhani
- GenHotel, University of Evry, University of Paris-Saclay, Genopole, 91000 Evry, France
- Lifeware Group, Inria Saclay, 91120 Palaiseau, France
- Naouel Zerrouk
- GenHotel, University of Evry, University of Paris-Saclay, Genopole, 91000 Evry, France
- Anna Niarakis
- GenHotel, University of Evry, University of Paris-Saclay, Genopole, 91000 Evry, France
- Lifeware Group, Inria Saclay, 91120 Palaiseau, France
|
31
|
Krassowski M, Das V, Sahu SK, Misra BB. State of the Field in Multi-Omics Research: From Computational Needs to Data Mining and Sharing. Front Genet 2020; 11:610798. [PMID: 33362867 PMCID: PMC7758509 DOI: 10.3389/fgene.2020.610798] [Citation(s) in RCA: 167] [Impact Index Per Article: 33.4] [Received: 09/27/2020] [Accepted: 11/20/2020] [Indexed: 12/24/2022] Open
Abstract
Multi-omics, variously called integrated omics, pan-omics, and trans-omics, aims to combine two or more omics data sets to aid in data analysis, visualization and interpretation to determine the mechanism of a biological process. Multi-omics efforts have taken center stage in biomedical research leading to the development of new insights into biological events and processes. However, the mushrooming of a myriad of tools, datasets, and approaches tends to inundate the literature and overwhelm researchers new to the field. The aims of this review are to provide an overview of the current state of the field, inform on available reliable resources, discuss the application of statistics and machine/deep learning in multi-omics analyses, discuss findable, accessible, interoperable, reusable (FAIR) research, and point to best practices in benchmarking. Thus, we provide guidance to interested users of the domain by addressing challenges of the underlying biology, giving an overview of the available toolset, addressing common pitfalls, and acknowledging current methods' limitations. We conclude with practical advice and recommendations on software engineering and reproducibility practices to share a comprehensive awareness with new researchers in multi-omics for end-to-end workflow.
Affiliation(s)
- Michal Krassowski
- Nuffield Department of Women’s & Reproductive Health, University of Oxford, Oxford, United Kingdom
- Vivek Das
- Novo Nordisk Research Center Seattle, Inc, Seattle, WA, United States
|
32
|
Biswas N, Kumar K, Bose S, Bera R, Chakrabarti S. Analysis of Pan-omics Data in Human Interactome Network (APODHIN). Front Genet 2020; 11:589231. [PMID: 33363571 PMCID: PMC7753071 DOI: 10.3389/fgene.2020.589231] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 07/30/2020] [Accepted: 11/11/2020] [Indexed: 12/24/2022] Open
Abstract
Analysis of Pan-omics Data in Human Interactome Network (APODHIN) is a platform for integrative analysis of transcriptomics, proteomics, genomics, and metabolomics data for identification of key molecular players and their interconnections exemplified in cancer scenario. APODHIN works on a meta-interactome network consisting of human protein-protein interactions (PPIs), miRNA-target gene regulatory interactions, and transcription factor-target gene regulatory relationships. In its first module, APODHIN maps proteins/genes/miRNAs from different omics data in its meta-interactome network and extracts the network of biomolecules that are differentially altered in the given scenario. Using this context specific, filtered interaction network, APODHIN identifies topologically important nodes (TINs) implementing graph theory based network topology analysis and further justifies their role via pathway and disease marker mapping. These TINs could be used as prospective diagnostic and/or prognostic biomarkers and/or potential therapeutic targets. In its second module, APODHIN attempts to identify cross pathway regulatory and PPI links connecting signaling proteins, transcription factors (TFs), and miRNAs to metabolic enzymes via utilization of single-omics and/or pan-omics data and implementation of mathematical modeling. Interconnections between regulatory components such as signaling proteins/TFs/miRNAs and metabolic pathways need to be elucidated more elaborately in order to understand the role of oncogene and tumor suppressors in regulation of metabolic reprogramming during cancer. APODHIN platform contains a web server component where users can upload single/multi omics data to identify TINs and cross-pathway links. Tabular, graphical and 3D network representations of the identified TINs and cross-pathway links are provided for better appreciation. 
Additionally, the platform provides a few example analyses of cancer-specific single- and/or multi-omics datasets for cervical, ovarian, and breast cancers, for which meta-interactome networks, TINs, and cross-pathway links are provided. The APODHIN platform is freely available at http://www.hpppi.iicb.res.in/APODHIN/home.html.
Affiliation(s)
- Saikat Chakrabarti
- Structural Biology and Bioinformatics Division, CSIR-Indian Institute of Chemical Biology, Kolkata, India
|
33
|
Baek B, Lee H. Prediction of survival and recurrence in patients with pancreatic cancer by integrating multi-omics data. Sci Rep 2020; 10:18951. [PMID: 33144687 PMCID: PMC7609582 DOI: 10.1038/s41598-020-76025-1] [Citation(s) in RCA: 46] [Impact Index Per Article: 9.2] [Received: 04/17/2020] [Accepted: 10/20/2020] [Indexed: 01/08/2023] Open
Abstract
Predicting the prognosis of pancreatic cancer is important because of the very low survival rates of patients with this particular cancer. Although several studies have used microRNA and gene expression profiles and clinical data, as well as images of tissues and cells, to predict cancer survival and recurrence, the accuracies of these approaches in the prediction of high-risk pancreatic adenocarcinoma (PAAD) still need to be improved. Accordingly, in this study, we proposed two biological features based on multi-omics datasets to predict survival and recurrence among patients with PAAD. First, the clonal expansion of cancer cells with somatic mutations was used to predict prognosis. Using whole-exome sequencing data from 134 patients with PAAD from The Cancer Genome Atlas (TCGA), we found five candidate genes that were mutated in the early stages of tumorigenesis with high cellular prevalence (CP). CDKN2A, TP53, TTN, KCNJ18, and KRAS had the highest CP values among the patients with PAAD, and survival and recurrence rates were significantly different between the patients harboring mutations in these candidate genes and those harboring mutations in other genes (p = 2.39E-03, p = 8.47E-04, respectively). Second, we generated an autoencoder to integrate the RNA sequencing, microRNA sequencing, and DNA methylation data from 134 patients with PAAD from TCGA. The autoencoder robustly reduced the dimensions of these multi-omics data, and the K-means clustering method was then used to cluster the patients into two subgroups. The subgroups of patients had significant differences in survival and recurrence (p = 1.41E-03, p = 4.43E-04, respectively). Finally, we developed a prediction model for prognosis using these two biological features and clinical data. 
When support vector machines, random forest, logistic regression, and L2 regularized logistic regression were used as prediction models, logistic regression analysis generally revealed the best performance for both disease-free survival (DFS) and overall survival (OS) (accuracy [ACC] = 0.762 and area under the curve [AUC] = 0.795 for DFS; ACC = 0.776 and AUC = 0.769 for OS). Thus, we could classify patients with a high probability of recurrence and at a high risk of poor outcomes. Our study provides insights into new personalized therapies on the basis of mutation status and multi-omics data.
Affiliation(s)
- Bin Baek
- School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology, Gwangju, 61005, Korea
- Hyunju Lee
- School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology, Gwangju, 61005, Korea.
- Artificial Intelligence Graduate School, Gwangju Institute of Science and Technology, Gwangju, 61005, Korea.
|
34
|
Seneviratne CJ, Suriyanarayanan T, Widyarman AS, Lee LS, Lau M, Ching J, Delaney C, Ramage G. Multi-omics tools for studying microbial biofilms: current perspectives and future directions. Crit Rev Microbiol 2020; 46:759-778. [PMID: 33030973 DOI: 10.1080/1040841x.2020.1828817] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Indexed: 02/07/2023]
Abstract
The advent of omics technologies has greatly improved our understanding of microbial biology, particularly in the last two decades. The field of microbial biofilms is, however, relatively new, consolidated in the 1980s. The morphogenic switching by microbes from planktonic to biofilm phenotype confers numerous survival advantages such as resistance to desiccation, antibiotics, biocides, ultraviolet radiation, and host immune responses, thereby complicating treatment strategies for pathogenic microorganisms. Hence, understanding the mechanisms governing the biofilm phenotype can result in efficient treatment strategies directed specifically against molecular markers mediating this process. The application of omics technologies for studying microbial biofilms is relatively less explored and holds great promise in furthering our understanding of biofilm biology. In this review, we provide an overview of the application of omics tools such as transcriptomics, proteomics, and metabolomics as well as multi-omics approaches for studying microbial biofilms in the current literature. We also highlight how the use of omics tools directed at various stages of the biological information flow, from genes to metabolites, can be integrated via multi-omics platforms to provide a holistic view of biofilm biology. Following this, we propose a future artificial intelligence-based multi-omics platform that can predict the pathways associated with different biofilm phenotypes.
Affiliation(s)
- Chaminda J Seneviratne
- Singapore Oral Microbiomics Initiative (SOMI), National Dental Research Institute Singapore, National Dental Centre, Singapore, Singapore.,Duke NUS Medical School, Singapore, Singapore
| | - Tanujaa Suriyanarayanan
- Singapore Oral Microbiomics Initiative (SOMI), National Dental Research Institute Singapore, National Dental Centre, Singapore, Singapore; Duke NUS Medical School, Singapore, Singapore
| | - Armelia Sari Widyarman
- Department of Microbiology, Faculty of Dentistry, Trisakti University, Grogol, West Jakarta, Indonesia
| | - Lye Siang Lee
- Duke-NUS Medical School, Metabolomics Lab, Cardiovascular and Metabolic Disorders, Singapore, Singapore
| | - Matthew Lau
- Singapore Oral Microbiomics Initiative (SOMI), National Dental Research Institute Singapore, National Dental Centre, Singapore, Singapore
| | - Jianhong Ching
- Duke-NUS Medical School, Metabolomics Lab, Cardiovascular and Metabolic Disorders, Singapore, Singapore
| | - Christopher Delaney
- School of Medicine, Dentistry & Nursing, Glasgow Dental Hospital & School, University of Glasgow, Glasgow, UK
| | - Gordon Ramage
- School of Medicine, Dentistry & Nursing, Glasgow Dental Hospital & School, University of Glasgow, Glasgow, UK
|
35
|
Mote RS, Filipov NM. Use of Integrative Interactomics for Improvement of Farm Animal Health and Welfare: An Example with Fescue Toxicosis. Toxins (Basel) 2020; 12:toxins12100633. [PMID: 33019560 PMCID: PMC7600642 DOI: 10.3390/toxins12100633] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 07/08/2020] [Revised: 09/18/2020] [Accepted: 09/24/2020] [Indexed: 02/07/2023] Open
Abstract
Rapid scientific advances are increasing our understanding of the way complex biological interactions integrate to maintain homeostatic balance and how seemingly small, localized perturbations can lead to systemic effects. The ‘omics movement, alongside increased throughput resulting from statistical and computational advances, has transformed our understanding of disease mechanisms and the multi-dimensional interaction between environmental stressors and host physiology through data integration into multi-dimensional analyses, i.e., integrative interactomics. This review focuses on the use of high-throughput technologies in farm animal research, including health- and toxicology-related papers. Although limited, we highlight recent animal agriculture-centered reports from the integrative multi-‘omics movement. We provide an example with fescue toxicosis, an economically costly disease affecting grazing livestock, and describe how integrative interactomics can be applied to a disease with a complex pathophysiology in the pursuit of novel treatment and management approaches. We outline how ‘omics techniques have been used thus far to understand fescue toxicosis pathophysiology, lay out a framework for the fescue toxicosis integrome, identify some challenges we foresee, and offer possible means for addressing these challenges. Finally, we briefly discuss how the example with fescue toxicosis could be used for other agriculturally important animal health and welfare problems.
|
36
|
Abstract
Multi-omics strategies are indispensable tools in the search for new anti-tuberculosis drugs. Omics methodologies, where the ensemble of a class of biological molecules are measured and evaluated together, enable drug discovery programs to answer two fundamental questions. Firstly, in a discovery biology approach, to find new targets in druggable pathways for target-based investigation, advancing from target to lead compound. Secondly, in a discovery chemistry approach, to identify the mode of action of lead compounds derived from high-throughput screens, progressing from compound to target. The advantage of multi-omics methodologies in both of these settings is that omics approaches are unsupervised and unbiased to a priori hypotheses, making omics useful tools to confirm drug action, reveal new insights into compound activity, and discover new avenues for inquiry. This review summarizes the application of Mycobacterium tuberculosis omics technologies to the early stages of tuberculosis antimicrobial drug discovery.
|
37
|
Chierici M, Bussola N, Marcolini A, Francescatto M, Zandonà A, Trastulla L, Agostinelli C, Jurman G, Furlanello C. Integrative Network Fusion: A Multi-Omics Approach in Molecular Profiling. Front Oncol 2020; 10:1065. [PMID: 32714870 PMCID: PMC7340129 DOI: 10.3389/fonc.2020.01065] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 03/31/2020] [Accepted: 05/28/2020] [Indexed: 12/20/2022] Open
Abstract
Recent technological advances and international efforts, such as The Cancer Genome Atlas (TCGA), have made available several pan-cancer datasets encompassing multiple omics layers with detailed clinical information in large collection of samples. The need has thus arisen for the development of computational methods aimed at improving cancer subtyping and biomarker identification from multi-modal data. Here we apply the Integrative Network Fusion (INF) pipeline, which combines multiple omics layers exploiting Similarity Network Fusion (SNF) within a machine learning predictive framework. INF includes a feature ranking scheme (rSNF) on SNF-integrated features, used by a classifier over juxtaposed multi-omics features (juXT). In particular, we show instances of INF implementing Random Forest (RF) and linear Support Vector Machine (LSVM) as the classifier, and two baseline RF and LSVM models are also trained on juXT. A compact RF model, called rSNFi, trained on the intersection of top-ranked biomarkers from the two approaches juXT and rSNF is finally derived. All the classifiers are run in a 10x5-fold cross-validation schema to warrant reproducibility, following the guidelines for an unbiased Data Analysis Plan by the US FDA-led initiatives MAQC/SEQC. INF is demonstrated on four classification tasks on three multi-modal TCGA oncogenomics datasets. Gene expression, protein expression and copy number variants are used to predict estrogen receptor status (BRCA-ER, N = 381) and breast invasive carcinoma subtypes (BRCA-subtypes, N = 305), while gene expression, miRNA expression and methylation data is used as predictor layers for acute myeloid leukemia and renal clear cell carcinoma survival (AML-OS, N = 157; KIRC-OS, N = 181). In test, INF achieved similar Matthews Correlation Coefficient (MCC) values and 97% to 83% smaller feature sizes (FS), compared with juXT for BRCA-ER (MCC: 0.83 vs. 0.80; FS: 56 vs. 1801) and BRCA-subtypes (0.84 vs. 0.80; 302 vs. 1801), improving KIRC-OS performance (0.38 vs. 0.31; 111 vs. 2319). INF predictions are generally more accurate in test than one-dimensional omics models, with smaller signatures too, where transcriptomics consistently play the leading role. Overall, the INF framework effectively integrates multiple data levels in oncogenomics classification tasks, improving over the performance of single layers alone and naive juxtaposition, and provides compact signature sizes.
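As a reading aid for the figures quoted in this abstract, the following minimal Python sketch shows how a Matthews Correlation Coefficient is computed from a binary confusion matrix and checks the quoted feature-size reductions. It is not the INF pipeline or the authors' code; the helper names `mcc` and `size_reduction` are hypothetical, and only the feature counts are taken from the abstract.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews Correlation Coefficient for a binary confusion matrix."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Common convention: return 0 when a row or column of the matrix is empty.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def size_reduction(compact, baseline):
    """Percent reduction in feature count relative to a baseline model."""
    return 100.0 * (1 - compact / baseline)


# Feature sizes quoted in the abstract as (INF, juXT); everything else is illustrative.
reported = {
    "BRCA-ER": (56, 1801),
    "BRCA-subtypes": (302, 1801),
    "KIRC-OS": (111, 2319),
}
reductions = {task: round(size_reduction(c, b)) for task, (c, b) in reported.items()}

for task, pct in reductions.items():
    print(f"{task}: about {pct}% fewer features than juXT")
```

The two BRCA tasks reproduce the "97% to 83%" range stated in the abstract; the KIRC-OS reduction is derived here from the quoted counts and is not stated as a percentage in the source.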
Affiliation(s)
| | - Nicole Bussola
- Fondazione Bruno Kessler, Trento, Italy
- University of Trento, Trento, Italy
| | - Margherita Francescatto
- Fondazione Bruno Kessler, Trento, Italy
- Department of Medical, Surgical and Health Sciences, University of Trieste, Trieste, Italy
|
38
|
Eicher T, Kinnebrew G, Patt A, Spencer K, Ying K, Ma Q, Machiraju R, Mathé EA. Metabolomics and Multi-Omics Integration: A Survey of Computational Methods and Resources. Metabolites 2020; 10:E202. [PMID: 32429287 PMCID: PMC7281435 DOI: 10.3390/metabo10050202] [Citation(s) in RCA: 68] [Impact Index Per Article: 13.6] [Received: 04/15/2020] [Revised: 05/07/2020] [Accepted: 05/13/2020] [Indexed: 02/06/2023] Open
Abstract
As researchers are increasingly able to collect data on a large scale from multiple clinical and omics modalities, multi-omics integration is becoming a critical component of metabolomics research. This introduces a need for increased understanding by the metabolomics researcher of computational and statistical analysis methods relevant to multi-omics studies. In this review, we discuss common types of analyses performed in multi-omics studies and the computational and statistical methods that can be used for each type of analysis. We pinpoint the caveats and considerations for analysis methods, including required parameters, sample size and data distribution requirements, sources of a priori knowledge, and techniques for the evaluation of model accuracy. Finally, for the types of analyses discussed, we provide examples of the applications of corresponding methods to clinical and basic research. We intend that our review may be used as a guide for metabolomics researchers to choose effective techniques for multi-omics analyses relevant to their field of study.
Affiliation(s)
- Tara Eicher
- Biomedical Informatics Department, The Ohio State University College of Medicine, Columbus, OH 43210, USA
- Computer Science and Engineering Department, The Ohio State University College of Engineering, Columbus, OH 43210, USA
| | - Garrett Kinnebrew
- Biomedical Informatics Department, The Ohio State University College of Medicine, Columbus, OH 43210, USA
- Comprehensive Cancer Center, The Ohio State University and James Cancer Hospital, Columbus, OH 43210, USA
- Bioinformatics Shared Resource Group, The Ohio State University, Columbus, OH 43210, USA
| | - Andrew Patt
- Division of Preclinical Innovation, National Center for Advancing Translational Sciences, NIH, 9800 Medical Center Dr., Rockville, MD, 20892, USA
- Biomedical Sciences Graduate Program, The Ohio State University, Columbus, OH 43210, USA
| | - Kyle Spencer
- Biomedical Informatics Department, The Ohio State University College of Medicine, Columbus, OH 43210, USA
- Biomedical Sciences Graduate Program, The Ohio State University, Columbus, OH 43210, USA
- Nationwide Children’s Research Hospital, Columbus, OH 43210, USA
| | - Kevin Ying
- Comprehensive Cancer Center, The Ohio State University and James Cancer Hospital, Columbus, OH 43210, USA
- Molecular, Cellular and Developmental Biology Program, The Ohio State University, Columbus, OH 43210, USA
| | - Qin Ma
- Biomedical Informatics Department, The Ohio State University College of Medicine, Columbus, OH 43210, USA
| | - Raghu Machiraju
- Biomedical Informatics Department, The Ohio State University College of Medicine, Columbus, OH 43210, USA
- Computer Science and Engineering Department, The Ohio State University College of Engineering, Columbus, OH 43210, USA
- Department of Pathology, Wexner Medical Center, The Ohio State University, Columbus, OH 43210, USA
- Translational Data Analytics Institute, The Ohio State University, Columbus, OH 43210, USA
| | - Ewy A. Mathé
- Biomedical Informatics Department, The Ohio State University College of Medicine, Columbus, OH 43210, USA
- Division of Preclinical Innovation, National Center for Advancing Translational Sciences, NIH, 9800 Medical Center Dr., Rockville, MD, 20892, USA
|
39
|
O'Shea K, Misra BB. Software tools, databases and resources in metabolomics: updates from 2018 to 2019. Metabolomics 2020; 16:36. [PMID: 32146531 DOI: 10.1007/s11306-020-01657-3] [Citation(s) in RCA: 58] [Impact Index Per Article: 11.6] [Received: 12/03/2019] [Accepted: 03/01/2020] [Indexed: 12/24/2022]
Abstract
Metabolomics has evolved as a discipline from a discovery and functional genomics tool, and is now a cornerstone in the era of big data-driven precision medicine. Sample preparation strategies and analytical technologies have seen enormous growth, and keeping pace with data analytics is challenging, to say the least. This review introduces and briefly presents around 100 metabolomics software resources, tools, databases, and other utilities that have surfaced or have improved in 2019. Table 1 provides the computational dependencies of the tools, categorizes the resources based on utility and ease of use, and provides hyperlinks to webpages where the tools can be downloaded or used. This review intends to keep the community of metabolomics researchers up to date with all the software tools, resources, and databases developed in 2019, in one place.
Affiliation(s)
- Keiron O'Shea
- Institute of Biological, Environmental, and Rural Studies, Aberystwyth University, Ceredigion, Wales, SY23 3DA, UK
| | - Biswapriya B Misra
- Center for Precision Medicine, Department of Internal Medicine, Section of Molecular Medicine, Wake Forest School of Medicine, Medical Center Boulevard, Winston-Salem, NC, 27157, USA.
|