1
Hernández-Lemus E, Ochoa S. Methods for multi-omic data integration in cancer research. Front Genet 2024; 15:1425456. [PMID: 39364009 PMCID: PMC11446849 DOI: 10.3389/fgene.2024.1425456] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/29/2024] [Accepted: 08/28/2024] [Indexed: 10/05/2024] Open
Abstract
Multi-omics data integration refers to the process of combining and analyzing data from different omic experimental sources, such as genomics, transcriptomics, methylation assays, and microRNA sequencing, among others. Such data integration approaches have the potential to provide a more comprehensive functional understanding of biological systems and have numerous applications in areas such as disease diagnosis, prognosis and therapy. However, quantitative integration of multi-omic data is a complex task that requires the use of highly specialized methods and approaches. Here, we discuss a number of data integration methods that have been developed with multi-omics data in view, including statistical methods, machine learning approaches, and network-based approaches. We also discuss the challenges and limitations of such methods and provide examples of their applications in the literature. Overall, this review aims to provide an overview of the current state of the field and highlight potential directions for future research.
Affiliation(s)
- Enrique Hernández-Lemus
- Computational Genomics Division, National Institute of Genomic Medicine, Mexico City, Mexico
- Center for Complexity Sciences, Universidad Nacional Autónoma de México, Mexico City, Mexico
- Soledad Ochoa
- Computational Genomics Division, National Institute of Genomic Medicine, Mexico City, Mexico
- Department of Obstetrics and Gynecology, Cedars-Sinai Medical Center, Los Angeles, CA, United States
2
Liu W, Vu T, Konigsberg IR, Pratte KA, Zhuang Y, Kechris KJ. SmCCNet 2.0: a comprehensive tool for multi-omics network inference with Shiny visualization. BMC Bioinformatics 2024; 25:276. [PMID: 39179997 PMCID: PMC11344457 DOI: 10.1186/s12859-024-05900-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/07/2024] [Accepted: 08/14/2024] [Indexed: 08/26/2024] Open
Abstract
Sparse multiple canonical correlation network analysis (SmCCNet) is a machine learning technique for integrating omics data along with a variable of interest (e.g., phenotype of complex disease), and reconstructing multi-omics networks that are specific to this variable. We present the second-generation SmCCNet (SmCCNet 2.0) that adeptly integrates single or multiple omics data types along with a quantitative or binary phenotype of interest. In addition, this new package offers a streamlined setup process that can be configured manually or automatically, ensuring a flexible and user-friendly experience. Availability: This package is available on both CRAN (https://cran.r-project.org/web/packages/SmCCNet/index.html) and GitHub (https://github.com/KechrisLab/SmCCNet) under the MIT license. The network visualization tool is available at https://smccnet.shinyapps.io/smccnetnetwork/ .
Affiliation(s)
- Weixuan Liu
- Department of Biostatistics and Informatics, School of Public Health, University of Colorado Anschutz Medical Campus, Aurora, CO, 80045, USA
- Thao Vu
- Department of Biostatistics and Informatics, School of Public Health, University of Colorado Anschutz Medical Campus, Aurora, CO, 80045, USA
- Iain R Konigsberg
- Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO, 80045, USA
- Katherine A Pratte
- Department of Biostatistics, National Jewish Health, Denver, 80206, CO, USA
- Yonghua Zhuang
- Department of Pediatrics, University of Colorado Anschutz Medical Campus, Aurora, 80045, CO, USA
- Katerina J Kechris
- Department of Biostatistics and Informatics, School of Public Health, University of Colorado Anschutz Medical Campus, Aurora, CO, 80045, USA
3
Zitnik M, Li MM, Wells A, Glass K, Morselli Gysi D, Krishnan A, Murali TM, Radivojac P, Roy S, Baudot A, Bozdag S, Chen DZ, Cowen L, Devkota K, Gitter A, Gosline SJC, Gu P, Guzzi PH, Huang H, Jiang M, Kesimoglu ZN, Koyuturk M, Ma J, Pico AR, Pržulj N, Przytycka TM, Raphael BJ, Ritz A, Sharan R, Shen Y, Singh M, Slonim DK, Tong H, Yang XH, Yoon BJ, Yu H, Milenković T. Current and future directions in network biology. Bioinformatics Advances 2024; 4:vbae099. [PMID: 39143982 PMCID: PMC11321866 DOI: 10.1093/bioadv/vbae099] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/27/2023] [Revised: 05/31/2024] [Accepted: 07/08/2024] [Indexed: 08/16/2024]
Abstract
Summary: Network biology is an interdisciplinary field bridging computational and biological sciences that has proved pivotal in advancing the understanding of cellular functions and diseases across biological systems and scales. Although the field has been around for two decades, it remains nascent. It has witnessed rapid evolution, accompanied by emerging challenges. These stem from various factors, notably the growing complexity and volume of data together with the increased diversity of data types describing different tiers of biological organization. We discuss prevailing research directions in network biology, focusing on molecular/cellular networks but also on other biological network types such as biomedical knowledge graphs, patient similarity networks, brain networks, and social/contact networks relevant to disease spread. In more detail, we highlight areas of inference and comparison of biological networks, multimodal data integration and heterogeneous networks, higher-order network analysis, machine learning on networks, and network-based personalized medicine. Following the overview of recent breakthroughs across these five areas, we offer a perspective on future directions of network biology. Additionally, we discuss scientific communities, educational initiatives, and the importance of fostering diversity within the field. This article establishes a roadmap for an immediate and long-term vision for network biology. Availability and implementation: Not applicable.
Affiliation(s)
- Marinka Zitnik
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115, United States
- Michelle M Li
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115, United States
- Aydin Wells
- Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN 46556, United States
- Lucy Family Institute for Data and Society, University of Notre Dame, Notre Dame, IN 46556, United States
- Eck Institute for Global Health, University of Notre Dame, Notre Dame, IN 46556, United States
- Kimberly Glass
- Channing Division of Network Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA 02115, United States
- Deisy Morselli Gysi
- Channing Division of Network Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA 02115, United States
- Department of Statistics, Federal University of Paraná, Curitiba, Paraná 81530-015, Brazil
- Department of Physics, Northeastern University, Boston, MA 02115, United States
- Arjun Krishnan
- Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, United States
- T M Murali
- Department of Computer Science, Virginia Tech, Blacksburg, VA 24061, United States
- Predrag Radivojac
- Khoury College of Computer Sciences, Northeastern University, Boston, MA 02115, United States
- Sushmita Roy
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI 53715, United States
- Wisconsin Institute for Discovery, Madison, WI 53715, United States
- Anaïs Baudot
- Aix Marseille Université, INSERM, MMG, Marseille, France
- Serdar Bozdag
- Department of Computer Science and Engineering, University of North Texas, Denton, TX 76203, United States
- Department of Mathematics, University of North Texas, Denton, TX 76203, United States
- Danny Z Chen
- Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN 46556, United States
- Lenore Cowen
- Department of Computer Science, Tufts University, Medford, MA 02155, United States
- Kapil Devkota
- Department of Computer Science, Tufts University, Medford, MA 02155, United States
- Anthony Gitter
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI 53715, United States
- Morgridge Institute for Research, Madison, WI 53715, United States
- Sara J C Gosline
- Biological Sciences Division, Pacific Northwest National Laboratory, Seattle, WA 98109, United States
- Pengfei Gu
- Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN 46556, United States
- Pietro H Guzzi
- Department of Medical and Surgical Sciences, University Magna Graecia of Catanzaro, Catanzaro, 88100, Italy
- Heng Huang
- Department of Computer Science, University of Maryland College Park, College Park, MD 20742, United States
- Meng Jiang
- Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN 46556, United States
- Ziynet Nesibe Kesimoglu
- Department of Computer Science and Engineering, University of North Texas, Denton, TX 76203, United States
- National Center of Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20814, United States
- Mehmet Koyuturk
- Department of Computer and Data Sciences, Case Western Reserve University, Cleveland, OH 44106, United States
- Jian Ma
- Ray and Stephanie Lane Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, United States
- Alexander R Pico
- Institute of Data Science and Biotechnology, Gladstone Institutes, San Francisco, CA 94158, United States
- Nataša Pržulj
- Department of Computer Science, University College London, London, WC1E 6BT, England
- ICREA, Catalan Institution for Research and Advanced Studies, Barcelona, 08010, Spain
- Barcelona Supercomputing Center (BSC), Barcelona, 08034, Spain
- Teresa M Przytycka
- National Center of Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20814, United States
- Benjamin J Raphael
- Department of Computer Science, Princeton University, Princeton, NJ 08544, United States
- Anna Ritz
- Department of Biology, Reed College, Portland, OR 97202, United States
- Roded Sharan
- School of Computer Science, Tel Aviv University, Tel Aviv, 69978, Israel
- Yang Shen
- Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX 77843, United States
- Mona Singh
- Department of Computer Science, Princeton University, Princeton, NJ 08544, United States
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, United States
- Donna K Slonim
- Department of Computer Science, Tufts University, Medford, MA 02155, United States
- Hanghang Tong
- Department of Computer Science, University of Illinois Urbana-Champaign, Urbana, IL 61801, United States
- Xinan Holly Yang
- Department of Pediatrics, University of Chicago, Chicago, IL 60637, United States
- Byung-Jun Yoon
- Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX 77843, United States
- Computational Science Initiative, Brookhaven National Laboratory, Upton, NY 11973, United States
- Haiyuan Yu
- Department of Computational Biology, Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, United States
- Tijana Milenković
- Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN 46556, United States
- Lucy Family Institute for Data and Society, University of Notre Dame, Notre Dame, IN 46556, United States
- Eck Institute for Global Health, University of Notre Dame, Notre Dame, IN 46556, United States
4
Loers JU, Vermeirssen V. A single-cell multimodal view on gene regulatory network inference from transcriptomics and chromatin accessibility data. Brief Bioinform 2024; 25:bbae382. [PMID: 39207727 PMCID: PMC11359808 DOI: 10.1093/bib/bbae382] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/05/2024] [Revised: 06/27/2024] [Accepted: 07/23/2024] [Indexed: 09/04/2024] Open
Abstract
Eukaryotic gene regulation is a combinatorial, dynamic, and quantitative process that plays a vital role in development and disease and can be modeled at a systems level in gene regulatory networks (GRNs). The wealth of multi-omics data measured on the same samples and even on the same cells has lifted the field of GRN inference to the next stage. Combinations of (single-cell) transcriptomics and chromatin accessibility allow the prediction of fine-grained regulatory programs that go beyond mere correlation of transcription factor and target gene expression, with enhancer GRNs (eGRNs) modeling molecular interactions between transcription factors, regulatory elements, and target genes. In this review, we highlight the key components for successful (e)GRN inference from (sc)RNA-seq and (sc)ATAC-seq data exemplified by state-of-the-art methods as well as open challenges and future developments. Moreover, we address preprocessing strategies, metacell generation and computational omics pairing, transcription factor binding site detection, and linear and three-dimensional approaches to identify chromatin interactions as well as dynamic and causal eGRN inference. We believe that the integration of transcriptomics together with epigenomics data at a single-cell level is the new standard for mechanistic network inference, and that it can be further advanced with integrating additional omics layers and spatiotemporal data, as well as with shifting the focus towards more quantitative and causal modeling strategies.
Affiliation(s)
- Jens Uwe Loers
- Lab for Computational Biology, Integromics and Gene Regulation (CBIGR), Cancer Research Institute Ghent (CRIG), Corneel Heymanslaan 10, 9000 Ghent, Belgium
- Department of Biomedical Molecular Biology, Ghent University, Zwijnaarde-Technologiepark 71, 9052 Ghent, Belgium
- Department of Biomolecular Medicine, Ghent University, Corneel Heymanslaan 10, 9000 Ghent, Belgium
- Vanessa Vermeirssen
- Lab for Computational Biology, Integromics and Gene Regulation (CBIGR), Cancer Research Institute Ghent (CRIG), Corneel Heymanslaan 10, 9000 Ghent, Belgium
- Department of Biomedical Molecular Biology, Ghent University, Zwijnaarde-Technologiepark 71, 9052 Ghent, Belgium
- Department of Biomolecular Medicine, Ghent University, Corneel Heymanslaan 10, 9000 Ghent, Belgium
5
Ahmad P, Hussain A, Siqueira WL. Mass spectrometry-based proteomic approaches for salivary protein biomarkers discovery and dental caries diagnosis: A critical review. Mass Spectrometry Reviews 2024; 43:826-856. [PMID: 36444686 DOI: 10.1002/mas.21822] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 06/16/2023]
Abstract
Dental caries is a multifactorial chronic disease resulting from the intricate interplay among acid-generating bacteria, fermentable carbohydrates, and several host factors such as saliva. Saliva comprises several proteins which could be utilized as biomarkers for caries prevention, diagnosis, and prognosis. Mass spectrometry-based salivary proteomics approaches, owing to their sensitivity, provide the opportunity to investigate and unveil crucial cariogenic pathogen activity and host indicators and may demonstrate clinically relevant biomarkers to improve caries diagnosis and management. The present review outlines the published literature of human clinical proteomics investigations on caries and extensively elucidates frequently reported salivary proteins as biomarkers. This review also discusses important aspects while designing an experimental proteomics workflow. The protein-protein interactions and the clinical relevance of salivary proteins as biomarkers for caries, together with uninvestigated domains of the discipline are also discussed critically.
Affiliation(s)
- Paras Ahmad
- College of Dentistry, University of Saskatchewan, Saskatoon, Saskatchewan, Canada
- Ahmed Hussain
- College of Dentistry, University of Saskatchewan, Saskatoon, Saskatchewan, Canada
- Walter L Siqueira
- College of Dentistry, University of Saskatchewan, Saskatoon, Saskatchewan, Canada
6
Zheng Y, Liu Y, Yang J, Dong L, Zhang R, Tian S, Yu Y, Ren L, Hou W, Zhu F, Mai Y, Han J, Zhang L, Jiang H, Lin L, Lou J, Li R, Lin J, Liu H, Kong Z, Wang D, Dai F, Bao D, Cao Z, Chen Q, Chen Q, Chen X, Gao Y, Jiang H, Li B, Li B, Li J, Liu R, Qing T, Shang E, Shang J, Sun S, Wang H, Wang X, Zhang N, Zhang P, Zhang R, Zhu S, Scherer A, Wang J, Wang J, Huo Y, Liu G, Cao C, Shao L, Xu J, Hong H, Xiao W, Liang X, Lu D, Jin L, Tong W, Ding C, Li J, Fang X, Shi L. Multi-omics data integration using ratio-based quantitative profiling with Quartet reference materials. Nat Biotechnol 2024; 42:1133-1149. [PMID: 37679543 PMCID: PMC11252085 DOI: 10.1038/s41587-023-01934-1] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Received: 10/25/2022] [Accepted: 07/31/2023] [Indexed: 09/09/2023]
Abstract
Characterization and integration of the genome, epigenome, transcriptome, proteome and metabolome of different datasets is difficult owing to a lack of ground truth. Here we develop and characterize suites of publicly available multi-omics reference materials of matched DNA, RNA, protein and metabolites derived from immortalized cell lines from a family quartet of parents and monozygotic twin daughters. These references provide built-in truth defined by relationships among the family members and the information flow from DNA to RNA to protein. We demonstrate how using a ratio-based profiling approach that scales the absolute feature values of a study sample relative to those of a concurrently measured common reference sample produces reproducible and comparable data suitable for integration across batches, labs, platforms and omics types. Our study identifies reference-free 'absolute' feature quantification as the root cause of irreproducibility in multi-omics measurement and data integration and establishes the advantages of ratio-based multi-omics profiling with common reference materials.
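The ratio-based idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `ratio_profile`, the log2 transform, the handling of missing or non-positive values, and all feature values are assumptions made for illustration only.

```python
import math

def ratio_profile(sample, reference):
    """Return log2(sample / reference) for every feature measured in both.

    `sample` and `reference` map feature names to absolute values that were
    measured concurrently (same batch). Hypothetical helper for illustration.
    """
    ratios = {}
    for feature, value in sample.items():
        ref_value = reference.get(feature)
        # Skip features that are missing or non-positive in either sample,
        # because the ratio (and its log) would be undefined.
        if ref_value is None or ref_value <= 0 or value <= 0:
            continue
        ratios[feature] = math.log2(value / ref_value)
    return ratios

# Made-up values: the same study sample measured in two batches, where
# batch B reads every feature four times higher than batch A.
batch_a = {
    "study": {"geneX": 200.0, "geneY": 50.0},
    "reference": {"geneX": 100.0, "geneY": 100.0},
}
batch_b = {
    "study": {"geneX": 800.0, "geneY": 200.0},
    "reference": {"geneX": 400.0, "geneY": 400.0},
}

profile_a = ratio_profile(batch_a["study"], batch_a["reference"])
profile_b = ratio_profile(batch_b["study"], batch_b["reference"])
```

In this toy case the absolute values differ fourfold between batches, yet `profile_a` and `profile_b` are identical, because a multiplicative batch effect shared by the study sample and the reference cancels in the ratio.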
Affiliation(s)
- Yuanting Zheng
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
- Yaqing Liu
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
- Jingcheng Yang
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
- Greater Bay Area Institute of Precision Medicine, Guangzhou, China
- Rui Zhang
- National Center for Clinical Laboratories, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing Hospital, Beijing, China
- Sha Tian
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
- Ying Yu
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
- Luyao Ren
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
- Wanwan Hou
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
- Feng Zhu
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
- Yuanbang Mai
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
- Ling Lin
- Zhangjiang Center for Translational Medicine, Shanghai Biotecan Medical Diagnostics Co. Ltd., Shanghai, China
- Jingwei Lou
- Zhangjiang Center for Translational Medicine, Shanghai Biotecan Medical Diagnostics Co. Ltd., Shanghai, China
- Ruiqiang Li
- Novogene Bioinformatics Institute, Beijing, China
- Jingchao Lin
- Metabo-Profile Biotechnology (Shanghai) Co. Ltd., Shanghai, China
- Depeng Wang
- Nextomics Biosciences Institute, Wuhan, China
- Ding Bao
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
- Zehui Cao
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
- Qiaochu Chen
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
- Qingwang Chen
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
- Xingdong Chen
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
- Yuechen Gao
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
- He Jiang
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
- Bin Li
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
- Bingying Li
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
- Jingjing Li
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
- Nextomics Biosciences Institute, Wuhan, China
- Ruimei Liu
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
- Tao Qing
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
- Erfei Shang
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
- Jun Shang
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
- Shanyue Sun
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
- Haiyan Wang
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
- Xiaolin Wang
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
- Naixin Zhang
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
- Peipei Zhang
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
- Ruolan Zhang
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
- Sibo Zhu
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
- Andreas Scherer
- Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Helsinki, Finland
- EATRIS ERIC-European Infrastructure for Translational Medicine, Amsterdam, the Netherlands
- Jiucun Wang
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
- Jing Wang
- National Institute of Metrology, Beijing, China
- Yinbo Huo
- Key Laboratory of Bioanalysis and Metrology for State Market Regulation, Shanghai Institute of Measurement and Testing Technology, Shanghai, China
- Gang Liu
- Key Laboratory of Bioanalysis and Metrology for State Market Regulation, Shanghai Institute of Measurement and Testing Technology, Shanghai, China
- Chengming Cao
- Key Laboratory of Bioanalysis and Metrology for State Market Regulation, Shanghai Institute of Measurement and Testing Technology, Shanghai, China
- Li Shao
- Key Laboratory of Bioanalysis and Metrology for State Market Regulation, Shanghai Institute of Measurement and Testing Technology, Shanghai, China
- Joshua Xu
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, USA
- Huixiao Hong
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, USA
- Wenming Xiao
- Office of Oncologic Diseases, Office of New Drugs, Center for Drug Evaluation and Research, US Food and Drug Administration, Silver Spring, MD, USA
- Xiaozhen Liang
- Shanghai Institute of Immunity and Infection, Chinese Academy of Sciences, Shanghai, China
- Daru Lu
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
- Li Jin
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
- Weida Tong
- Key Laboratory of Bioanalysis and Metrology for State Market Regulation, Shanghai Institute of Measurement and Testing Technology, Shanghai, China
- Chen Ding
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
- Jinming Li
- National Center for Clinical Laboratories, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing Hospital, Beijing, China
- Xiang Fang
- National Institute of Metrology, Beijing, China
- Leming Shi
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute and Shanghai Cancer Center, Fudan University, Shanghai, China
- International Human Phenome Institutes (Shanghai), Shanghai, China
7
Liu W, Vu T, Konigsberg I, Pratte K, Zhuang Y, Kechris K. SmCCNet 2.0: A Comprehensive Tool for Multi-omics Network Inference with Shiny Visualization. bioRxiv: the preprint server for biology 2024:2023.11.20.567893. [PMID: 38045372 PMCID: PMC10690212 DOI: 10.1101/2023.11.20.567893] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 12/05/2023]
Abstract
Summary: Sparse multiple canonical correlation network analysis (SmCCNet) is a machine learning technique for integrating omics data along with a variable of interest (e.g., phenotype of complex disease), and reconstructing multi-omics networks that are specific to this variable. We present the second-generation SmCCNet (SmCCNet 2.0) that adeptly integrates single or multiple omics data types along with a quantitative or binary phenotype of interest. In addition, this new package offers a streamlined setup process that can be configured manually or automatically, ensuring a flexible and user-friendly experience. Availability: This package is available on both CRAN (https://cran.r-project.org/web/packages/SmCCNet/index.html) and GitHub (https://github.com/KechrisLab/SmCCNet) under the MIT license. The network visualization tool is available at https://smccnet.shinyapps.io/smccnetnetwork/.
Affiliation(s)
- Weixuan Liu
- Department of Biostatistics and Informatics, School of Public Health, University of Colorado Anschutz Medical Campus, Aurora, 80045, CO, USA
- Thao Vu
- Department of Biostatistics and Informatics, School of Public Health, University of Colorado Anschutz Medical Campus, Aurora, 80045, CO, USA
- Iain Konigsberg
- Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus, Aurora, 80045, CO, USA
- Katherine Pratte
- Department of Biostatistics, National Jewish Health, Denver, 80206, CO, USA
- Yonghua Zhuang
- Department of Pediatrics, University of Colorado Anschutz Medical Campus, Aurora, 80045, CO, USA
- Katerina Kechris
- Department of Biostatistics and Informatics, School of Public Health, University of Colorado Anschutz Medical Campus, Aurora, 80045, CO, USA
8
Chen CK. Inference of gene networks using gene expression data with applications. Heliyon 2024; 10:e26065. [PMID: 38449656 PMCID: PMC10915353 DOI: 10.1016/j.heliyon.2024.e26065] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2024] [Revised: 02/06/2024] [Accepted: 02/07/2024] [Indexed: 03/08/2024] Open
Abstract
Gene networks (GNs) use graphs to represent the interaction relationships between genes. Large-scale GNs are often sparse and contain hub genes that interact with many other genes. In this paper, we propose a novel method called NetARD, which utilizes Automatic Relevance Determination (ARD) to estimate partial correlations, to infer GNs with the hub genes from gene expression data. We test NetARD on simulated GNs and in silico GNs, and it outperforms existing methods. In our high-throughput gene expression data analysis, we integrate the NetARD into a method called GN Co-expression Extension (GNCE). This approach infers the GNs of co-expressed genes, with genes from a predefined GN serving as hub genes. We validate this approach by extending the core GN of transcription factor genes of E. coli using microarray data. In an application example, we identify biological process (BP) Gene Ontology (GO) terms that are significantly involved in cancer progression. This task is accomplished by analyzing the GN inferred through GNCE using the core GN associated with the colorectal cancer pathway and RNA-seq data.
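The abstract builds on partial correlations, which measure the association between two genes after removing the effect of the others. The sketch below is not NetARD and involves no Automatic Relevance Determination; it only computes a textbook first-order partial correlation for three variables. The function names and all expression values are invented for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def partial_corr(x, y, z):
    """First-order partial correlation of x and y, controlling for z."""
    r_xy = pearson(x, y)
    r_xz = pearson(x, z)
    r_yz = pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Made-up expression values: a hub gene drives both gene_a and gene_b,
# and each target carries its own small, mutually unrelated deviation.
hub = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
dev_a = [0.1, -0.1, 0.0, 0.0, -0.1, 0.1]
dev_b = [0.1, 0.1, -0.2, -0.2, 0.1, 0.1]
gene_a = [h + e for h, e in zip(hub, dev_a)]
gene_b = [2.0 * h + e for h, e in zip(hub, dev_b)]

marginal = pearson(gene_a, gene_b)          # close to 1: both follow the hub
direct = partial_corr(gene_a, gene_b, hub)  # approximately 0 once the hub is controlled for
```

Here the two target genes look strongly co-expressed, but their partial correlation given the hub is essentially zero, so a partial-correlation network would link each of them to the hub rather than to each other.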
Affiliation(s)
- Chi-Kan Chen
- Department of Applied Mathematics, National Chung Hsing University, 145 Xingda Rd., South Dist., Taichung City, 40227, Taiwan
| |
|
9
|
Wang A. Conceptual breakthroughs of the long noncoding RNA functional system and its endogenous regulatory role in the cancerous regime. Exploration of Targeted Anti-tumor Therapy 2024; 5:170-186. [PMID: 38464381 PMCID: PMC10918237 DOI: 10.37349/etat.2024.00211] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/29/2023] [Accepted: 12/18/2023] [Indexed: 03/12/2024] Open
Abstract
Long noncoding RNAs (lncRNAs) derived from noncoding regions in the human genome were once regarded as junk with no biological significance, but recent studies have shown that these molecules are highly functional, prompting an explosion of studies on their biology. However, these recent efforts have only begun to recognize the biological significance of a small fraction (< 1%) of the lncRNAs. The basic concept of these lncRNA functions remains controversial. This controversy arises primarily from conventional biased observations based on limited datasets. Fortunately, emerging big data provides a promising path to circumvent conventional bias, to build an unbiased big picture of lncRNA biology, and to advance its fundamental principles. This review focuses on big data studies that break through the critical concepts of the lncRNA functional system and its endogenous regulatory roles in all cancers. lncRNAs have unique functional systems distinct from proteins, such as transcriptional initiation and regulation, and they abundantly interact with mitochondria and consume less energy. lncRNAs, rather than proteins as traditionally thought, function as the most critical endogenous regulators of all cancers. lncRNAs regulate the cancer regulatory regime by governing the endogenous regulatory network of all cancers. This is accomplished by dominating the regulatory network module and serving as a key hub and top inducer. These critical conceptual breakthroughs lay a blueprint for a comprehensive functional picture of the human genome. They also lay a blueprint for combating human diseases that are regulated by lncRNAs.
Affiliation(s)
- Anyou Wang
- Feinstone Center for Genomic Research, University of Memphis, Memphis, TN 38152, USA
| |
|
10
|
Hai Y, Ma J, Yang K, Wen Y. Bayesian linear mixed model with multiple random effects for prediction analysis on high-dimensional multi-omics data. Bioinformatics 2023; 39:btad647. [PMID: 37882747 PMCID: PMC10627352 DOI: 10.1093/bioinformatics/btad647] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/27/2023] [Revised: 09/24/2023] [Accepted: 10/24/2023] [Indexed: 10/27/2023] Open
Abstract
MOTIVATION Accurate disease risk prediction is an essential step in the modern quest for precision medicine. While high-dimensional multi-omics data have provided unprecedented data resources for prediction studies, their high dimensionality and complex inter/intra-relationships have posed significant analytical challenges. RESULTS We proposed a two-step Bayesian linear mixed model framework (TBLMM) for risk prediction analysis on multi-omics data. TBLMM models the predictive effects from multi-omics data using a hybrid of sparse regression and a linear mixed model with multiple random effects. It can resemble the shape of the true effect size distributions and accounts for non-linear effects, including interactions, among multi-omics data via kernel fusion. It infers its parameters via a computationally efficient variational Bayes algorithm. Through extensive simulation studies and prediction analyses on positron emission tomography imaging outcomes using data obtained from the Alzheimer's Disease Neuroimaging Initiative, we have demonstrated that TBLMM can consistently outperform the existing method in predicting the risk of complex traits. AVAILABILITY AND IMPLEMENTATION The corresponding R package is available on GitHub (https://github.com/YaluWen/TBLMM).
Affiliation(s)
- Yang Hai
- Department of Health Statistics, Shanxi Medical University, Taiyuan, Shanxi Province 030000, China
- Department of Statistics, University of Auckland, Auckland 1010, New Zealand
| | - Jixiang Ma
- Department of Health Statistics, Shanxi Medical University, Taiyuan, Shanxi Province 030000, China
| | - Kaixin Yang
- Department of Health Statistics, Shanxi Medical University, Taiyuan, Shanxi Province 030000, China
| | - Yalu Wen
- Department of Health Statistics, Shanxi Medical University, Taiyuan, Shanxi Province 030000, China
- Department of Statistics, University of Auckland, Auckland 1010, New Zealand
| |
|
11
|
Henao JD, Lauber M, Azevedo M, Grekova A, Theis F, List M, Ogris C, Schubert B. Multi-omics regulatory network inference in the presence of missing data. Brief Bioinform 2023; 24:bbad309. [PMID: 37670505 PMCID: PMC10516394 DOI: 10.1093/bib/bbad309] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 12/08/2022] [Revised: 05/06/2023] [Accepted: 05/29/2023] [Indexed: 09/07/2023] Open
Abstract
A key problem in systems biology is the discovery of regulatory mechanisms that drive phenotypic behaviour of complex biological systems in the form of multi-level networks. Modern multi-omics profiling techniques probe these fundamental regulatory networks but are often hampered by experimental restrictions leading to missing data or partially measured omics types for subsets of individuals due to cost restrictions. In such scenarios, in which missing data is present, classical computational approaches to infer regulatory networks are limited. In recent years, approaches have been proposed to infer sparse regression models in the presence of missing information. Nevertheless, these methods have not been adopted for regulatory network inference yet. In this study, we integrated regression-based methods that can handle missingness into KiMONo, a Knowledge guided Multi-Omics Network inference approach, and benchmarked their performance on commonly encountered missing data scenarios in single- and multi-omics studies. Overall, two-step approaches that explicitly handle missingness performed best for a wide range of random- and block-missingness scenarios on imbalanced omics-layers dimensions, while methods implicitly handling missingness performed best on balanced omics-layers dimensions. Our results show that robust multi-omics network inference in the presence of missing data with KiMONo is feasible and thus allows users to leverage available multi-omics data to its full extent.
Affiliation(s)
- Juan D Henao
- Helmholtz Zentrum München, Computational Health Department, Ingolstädter Landstraße 1, 85764 Munich, Germany, Member of the German Center for Lung Research (DZL)
| | - Michael Lauber
- Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, Maximus-von-Imhof-Forum 3, 85354 Freising
| | - Manuel Azevedo
- Helmholtz Zentrum München, Computational Health Department, Ingolstädter Landstraße 1, 85764 Munich, Germany, Member of the German Center for Lung Research (DZL)
| | - Anastasiia Grekova
- Helmholtz Zentrum München, Computational Health Department, Ingolstädter Landstraße 1, 85764 Munich, Germany, Member of the German Center for Lung Research (DZL)
| | - Fabian Theis
- Helmholtz Zentrum München, Computational Health Department, Ingolstädter Landstraße 1, 85764 Munich, Germany, Member of the German Center for Lung Research (DZL)
- Department of Mathematics, Technical University of Munich, 85748 Garching bei München, Germany
| | - Markus List
- Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, Maximus-von-Imhof-Forum 3, 85354 Freising
| | - Christoph Ogris
- Helmholtz Zentrum München, Computational Health Department, Ingolstädter Landstraße 1, 85764 Munich, Germany, Member of the German Center for Lung Research (DZL)
| | - Benjamin Schubert
- Helmholtz Zentrum München, Computational Health Department, Ingolstädter Landstraße 1, 85764 Munich, Germany, Member of the German Center for Lung Research (DZL)
- Department of Mathematics, Technical University of Munich, 85748 Garching bei München, Germany
| |
|
12
|
Singhal P, Verma SS, Ritchie MD. Gene Interactions in Human Disease Studies-Evidence Is Mounting. Annu Rev Biomed Data Sci 2023; 6:377-395. [PMID: 37196359 DOI: 10.1146/annurev-biodatasci-102022-120818] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/19/2023]
Abstract
Despite monumental advances in molecular technology to generate genome sequence data at scale, there is still a considerable proportion of heritability in most complex diseases that remains unexplained. Because many of the discoveries have been single-nucleotide variants with small to moderate effects on disease, the functional implication of many of the variants is still unknown and, thus, we have limited new drug targets and therapeutics. We, and many others, posit that our ability to identify novel drug targets from genome-wide association studies may have been limited primarily by unmodeled gene interactions (epistasis), gene-environment interactions, network/pathway effects, or multiomic relationships. We propose that many of these complex models explain much of the underlying genetic architecture of complex disease. In this review, we discuss the evidence from multiple research avenues, ranging from pairs of alleles to multiomic integration studies and pharmacogenomics, that supports the need for further investigation of gene interactions (or epistasis) in genetic and genomic studies of human disease. Our goal is to catalog the mounting evidence for epistasis in genetic studies and the connections between genetic interactions and human health and disease that could enable precision medicine of the future.
Affiliation(s)
- Pankhuri Singhal
- Genetics and Epigenetics Graduate Group, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania, USA
| | - Shefali Setia Verma
- Department of Pathology and Laboratory Medicine, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania, USA
| | - Marylyn D Ritchie
- Department of Genetics, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania, USA
- Penn Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, Pennsylvania, USA
| |
|
13
|
Li R, Rozum JC, Quail MM, Qasim MN, Sindi SS, Nobile CJ, Albert R, Hernday AD. Inferring gene regulatory networks using transcriptional profiles as dynamical attractors. PLoS Comput Biol 2023; 19:e1010991. [PMID: 37607190 PMCID: PMC10473541 DOI: 10.1371/journal.pcbi.1010991] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/02/2023] [Revised: 09/01/2023] [Accepted: 07/19/2023] [Indexed: 08/24/2023] Open
Abstract
Genetic regulatory networks (GRNs) regulate the flow of genetic information from the genome to expressed messenger RNAs (mRNAs) and thus are critical to controlling the phenotypic characteristics of cells. Numerous methods exist for profiling mRNA transcript levels and identifying protein-DNA binding interactions at the genome-wide scale. These enable researchers to determine the structure and output of transcriptional regulatory networks, but uncovering the complete structure and regulatory logic of GRNs remains a challenge. The field of GRN inference aims to meet this challenge using computational modeling to derive the structure and logic of GRNs from experimental data and to encode this knowledge in Boolean networks, Bayesian networks, ordinary differential equation (ODE) models, or other modeling frameworks. However, most existing models do not incorporate dynamic transcriptional data since it has historically been less widely available in comparison to "static" transcriptional data. We report the development of an evolutionary algorithm-based ODE modeling approach (named EA) that integrates kinetic transcription data and the theory of attractor matching to infer GRN architecture and regulatory logic. Our method outperformed six leading GRN inference methods, none of which incorporate kinetic transcriptional data, in predicting regulatory connections among TFs when applied to a small-scale engineered synthetic GRN in Saccharomyces cerevisiae. Moreover, we demonstrate the potential of our method to predict unknown transcriptional profiles that would be produced upon genetic perturbation of the GRN governing a two-state cellular phenotypic switch in Candida albicans. We established an iterative refinement strategy to facilitate candidate selection for experimentation; the experimental results in turn provide validation or improvement for the model. In this way, our GRN inference approach can expedite the development of a sophisticated mathematical model that can accurately describe the structure and dynamics of the in vivo GRN.
Affiliation(s)
- Ruihao Li
- Quantitative and Systems Biology Graduate Program, University of California, Merced, Merced, California, United States of America
| | - Jordan C. Rozum
- Department of Systems Science and Industrial Engineering, Binghamton University (State University of New York), Binghamton, New York, United States of America
| | - Morgan M. Quail
- Quantitative and Systems Biology Graduate Program, University of California, Merced, Merced, California, United States of America
| | - Mohammad N. Qasim
- Quantitative and Systems Biology Graduate Program, University of California, Merced, Merced, California, United States of America
| | - Suzanne S. Sindi
- Department of Applied Mathematics, University of California, Merced, Merced, California, United States of America
| | - Clarissa J. Nobile
- Department of Molecular Cell Biology, University of California, Merced, Merced, California, United States of America
- Health Sciences Research Institute, University of California, Merced, Merced, California, United States of America
| | - Réka Albert
- Department of Physics, Pennsylvania State University, University Park, University Park, Pennsylvania, United States of America
- Department of Biology, Pennsylvania State University, University Park, University Park, Pennsylvania, United States of America
| | - Aaron D. Hernday
- Department of Molecular Cell Biology, University of California, Merced, Merced, California, United States of America
- Health Sciences Research Institute, University of California, Merced, Merced, California, United States of America
| |
|
14
|
Morin A, Chu ECP, Sharma A, Adrian-Hamazaki A, Pavlidis P. Characterizing the targets of transcription regulators by aggregating ChIP-seq and perturbation expression data sets. Genome Res 2023; 33:763-778. [PMID: 37308292 PMCID: PMC10317128 DOI: 10.1101/gr.277273.122] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/31/2022] [Accepted: 04/26/2023] [Indexed: 06/14/2023]
Abstract
Mapping the gene targets of chromatin-associated transcription regulators (TRs) is a major goal of genomics research. ChIP-seq of TRs and experiments that perturb a TR and measure the differential abundance of gene transcripts are a primary means by which direct relationships are tested on a genomic scale. It has been reported that there is a poor overlap in the evidence across gene regulation strategies, emphasizing the need for integrating results from multiple experiments. Although research consortia interested in gene regulation have produced a valuable trove of high-quality data, there is an even greater volume of TR-specific data throughout the literature. In this study, we show a workflow for the identification, uniform processing, and aggregation of ChIP-seq and TR perturbation experiments for the ultimate purpose of ranking human and mouse TR-target interactions. Focusing on an initial set of eight regulators (ASCL1, HES1, MECP2, MEF2C, NEUROD1, PAX6, RUNX1, and TCF4), we identified 497 experiments suitable for analysis. We used this corpus to examine data concordance, to identify systematic patterns of the two data types, and to identify putative orthologous interactions between human and mouse. We build upon commonly used strategies to forward a procedure for aggregating and combining these two genomic methodologies, assessing these rankings against independent literature-curated evidence. Beyond a framework extensible to other TRs, our work also provides empirically ranked TR-target listings, as well as transparent experiment-level gene summaries for community use.
Affiliation(s)
- Alexander Morin
- Michael Smith Laboratories, University of British Columbia, Vancouver, British Columbia V6T 1Z4, Canada
- Department of Psychiatry, University of British Columbia, Vancouver, British Columbia V6T 1Z4, Canada
- Graduate Program in Bioinformatics, University of British Columbia, Vancouver, British Columbia V6T 1Z4, Canada
| | - Eric Ching-Pan Chu
- Michael Smith Laboratories, University of British Columbia, Vancouver, British Columbia V6T 1Z4, Canada
- Department of Psychiatry, University of British Columbia, Vancouver, British Columbia V6T 1Z4, Canada
- Graduate Program in Bioinformatics, University of British Columbia, Vancouver, British Columbia V6T 1Z4, Canada
| | - Aman Sharma
- Michael Smith Laboratories, University of British Columbia, Vancouver, British Columbia V6T 1Z4, Canada
| | - Alex Adrian-Hamazaki
- Michael Smith Laboratories, University of British Columbia, Vancouver, British Columbia V6T 1Z4, Canada
- Department of Psychiatry, University of British Columbia, Vancouver, British Columbia V6T 1Z4, Canada
- Graduate Program in Bioinformatics, University of British Columbia, Vancouver, British Columbia V6T 1Z4, Canada
| | - Paul Pavlidis
- Michael Smith Laboratories, University of British Columbia, Vancouver, British Columbia V6T 1Z4, Canada
- Department of Psychiatry, University of British Columbia, Vancouver, British Columbia V6T 1Z4, Canada
| |
|
15
|
Ferrocino I, Rantsiou K, McClure R, Kostic T, de Souza RSC, Lange L, FitzGerald J, Kriaa A, Cotter P, Maguin E, Schelkle B, Schloter M, Berg G, Sessitsch A, Cocolin L. The need for an integrated multi-OMICs approach in microbiome science in the food system. Compr Rev Food Sci Food Saf 2023; 22:1082-1103. [PMID: 36636774 DOI: 10.1111/1541-4337.13103] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Received: 07/10/2022] [Revised: 12/05/2022] [Accepted: 12/19/2022] [Indexed: 01/14/2023]
Abstract
Microbiome science as an interdisciplinary research field has evolved rapidly over the past two decades, becoming a popular topic not only in the scientific community and among the general public, but also in the food industry due to the growing demand for microbiome-based technologies that provide added-value solutions. Microbiome research has expanded in the context of food systems, strongly driven by methodological advances in different -omics fields that leverage our understanding of microbial diversity and function. However, managing and integrating different complex -omics layers are still challenging. Within the Coordinated Support Action MicrobiomeSupport (https://www.microbiomesupport.eu/), a project supported by the European Commission, the workshop "Metagenomics, Metaproteomics and Metabolomics: the need for data integration in microbiome research" gathered 70 participants from different microbiome research fields relevant to food systems, to discuss challenges in microbiome research and to promote a switch from microbiome-based descriptive studies to functional studies, elucidating the biology and interactive roles of microbiomes in food systems. A combination of technologies is proposed. This will reduce the biases resulting from each individual technology and result in a more comprehensive view of the biological system as a whole. Although combinations of different datasets are still rare, advanced bioinformatics tools and artificial intelligence approaches can contribute to understanding, prediction, and management of the microbiome, thereby providing the basis for the improvement of food quality and safety.
Affiliation(s)
- Ilario Ferrocino
- Department of Agriculture, Forest and Food Science, University of Turin, Grugliasco, Italy
| | - Kalliopi Rantsiou
- Department of Agriculture, Forest and Food Science, University of Turin, Grugliasco, Italy
| | - Ryan McClure
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, Washington, USA
| | - Tanja Kostic
- AIT Austrian Institute of Technology GmbH, Bioresources Unit, Tulln, Austria
| | - Rafael Soares Correa de Souza
- Genomics for Climate Change Research Center (GCCRC), Universidade Estadual de Campinas (UNICAMP), Campinas, São Paulo, Brazil
| | - Lene Lange
- BioEconomy, Research & Advisory, Valby, Denmark
| | - Jamie FitzGerald
- Teagasc Food Research Centre, Moorepark, Fermoy, County Cork, Ireland
| | - Aicha Kriaa
- MICALIS, INRA, AgroParisTech, Université Paris-Saclay, Jouy-en-Josas, France
| | - Paul Cotter
- Teagasc Food Research Centre, Moorepark, Fermoy, County Cork, Ireland
| | - Emmanuelle Maguin
- MICALIS, INRA, AgroParisTech, Université Paris-Saclay, Jouy-en-Josas, France
| | | | | | - Gabriele Berg
- Institute of Environmental Biotechnology, Graz University of Technology, Graz, Austria
| | - Angela Sessitsch
- AIT Austrian Institute of Technology GmbH, Bioresources Unit, Tulln, Austria
| | - Luca Cocolin
- Department of Agriculture, Forest and Food Science, University of Turin, Grugliasco, Italy
| | | |
|
16
|
Paul I, Bolzan D, Youssef A, Gagnon KA, Hook H, Karemore G, Oliphant MUJ, Lin W, Liu Q, Phanse S, White C, Padhorny D, Kotelnikov S, Chen CS, Hu P, Denis GV, Kozakov D, Raught B, Siggers T, Wuchty S, Muthuswamy SK, Emili A. Parallelized multidimensional analytic framework applied to mammary epithelial cells uncovers regulatory principles in EMT. Nat Commun 2023; 14:688. [PMID: 36755019 PMCID: PMC9908882 DOI: 10.1038/s41467-023-36122-x] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 01/11/2022] [Accepted: 01/17/2023] [Indexed: 02/10/2023] Open
Abstract
A proper understanding of disease etiology will require longitudinal systems-scale reconstruction of the multitiered architecture of eukaryotic signaling. Here we combine state-of-the-art data acquisition platforms and bioinformatics tools to devise PAMAF, a workflow that simultaneously examines twelve omics modalities, i.e., protein abundance from whole-cells, nucleus, exosomes, secretome and membrane; N-glycosylation, phosphorylation; metabolites; mRNA, miRNA; and, in parallel, single-cell transcriptomes. We apply PAMAF in an established in vitro model of TGFβ-induced epithelial to mesenchymal transition (EMT) to quantify >61,000 molecules from 12 omics and 10 timepoints over 12 days. Bioinformatics analysis of this EMT-ExMap resource allowed us to identify: topological coupling between omics; four distinct cell states during EMT; omics-specific kinetic paths; stage-specific multi-omics characteristics; distinct regulatory classes of genes; ligand-receptor mediated intercellular crosstalk, by integrating scRNAseq and subcellular proteomics; and combinatorial drug targets (e.g., Hedgehog signaling and CAMK-II) to inhibit EMT, which we validate using a 3D mammary duct-on-a-chip platform. Overall, this study provides a resource on TGFβ signaling and EMT.
Affiliation(s)
- Indranil Paul
- Department of Biochemistry, Boston University School of Medicine, Boston University, 71 East Concord Street, Boston, MA, 02118, USA
| | - Dante Bolzan
- Department of Computer Science, University of Miami, 1356 Memorial Drive, Coral Gables, FL, 33146, USA
| | - Ahmed Youssef
- Graduate Program in Bioinformatics, Boston University, 24 Cummington Mall, Boston, MA, 02215, USA
| | - Keith A Gagnon
- Department of Biomedical Engineering, Boston University, 44 Cummington Mall, Boston, MA, 02215, USA
| | - Heather Hook
- Department of Biology, Boston University, 24 Cummington Mall, Boston, MA, 02115, USA
- Biological Design Center, Boston University, 610 Commonwealth Avenue, Boston, MA, 02215, USA
| | - Gopal Karemore
- Advanced Analytics, Novo Nordisk A/S, 2760, Måløv, Denmark
| | - Michael U J Oliphant
- Cancer Research Institute, Department of Medicine, Beth Israel Deaconess Medical Center, Boston, MA, 02115, USA
| | - Weiwei Lin
- Department of Biochemistry, Boston University School of Medicine, Boston University, 71 East Concord Street, Boston, MA, 02118, USA
| | - Qian Liu
- Department of Biochemistry and Medical Genetics, University of Manitoba, Winnipeg, Manitoba, R3E 0J9, Canada
| | - Sadhna Phanse
- Department of Biochemistry, Boston University School of Medicine, Boston University, 71 East Concord Street, Boston, MA, 02118, USA
| | - Carl White
- Department of Biochemistry, Boston University School of Medicine, Boston University, 71 East Concord Street, Boston, MA, 02118, USA
| | - Dzmitry Padhorny
- Department of Applied Mathematics and Statistics, Stony Brook University, 11794, Stony Brook, NY, USA
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, NY, 11794, USA
| | - Sergei Kotelnikov
- Department of Applied Mathematics and Statistics, Stony Brook University, 11794, Stony Brook, NY, USA
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, NY, 11794, USA
| | - Christopher S Chen
- Department of Biomedical Engineering, Boston University, 44 Cummington Mall, Boston, MA, 02215, USA
- Wyss Institute for Biologically Inspired Engineering, Harvard University, 3 Blackfan Circle, Boston, MA, 02115, USA
| | - Pingzhao Hu
- Department of Biochemistry, Western University, London, ON, N6A 5C1, Canada
| | - Gerald V Denis
- Boston Medical Center Cancer Center, Boston University, Boston University, 72 East Concord Street, Boston, MA, 02118, USA
| | - Dima Kozakov
- Department of Applied Mathematics and Statistics, Stony Brook University, 11794, Stony Brook, NY, USA
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, NY, 11794, USA
| | - Brian Raught
- Discovery Tower (TMDT), 101 College St, Rm. 9-701A, University of Toronto, Toronto, ON, M5G 1L7, Canada
| | - Trevor Siggers
- Department of Biology, Boston University, 24 Cummington Mall, Boston, MA, 02115, USA
- Biological Design Center, Boston University, 610 Commonwealth Avenue, Boston, MA, 02215, USA
| | - Stefan Wuchty
- Department of Computer Science, University of Miami, 1356 Memorial Drive, Coral Gables, FL, 33146, USA
| | - Senthil K Muthuswamy
- Laboratory of Cancer Biology and Genetics, Center for Cancer Research, National Cancer Institute, NIH, Bethesda, MD, USA
| | - Andrew Emili
- Department of Biochemistry, Boston University School of Medicine, Boston University, 71 East Concord Street, Boston, MA, 02118, USA.
- Department of Biology, Charles River Campus, Boston University, Life Science & Engineering (LSEB-602), 24 Cummington Mall, Boston, MA, 02215, USA.
- Division of Oncological Sciences, Knight Cancer Institute, Oregon Health and Science University, Portland, USA.
| |
|
17
|
Vu T, Litkowski EM, Liu W, Pratte KA, Lange L, Bowler RP, Banaei-Kashani F, Kechris KJ. NetSHy: network summarization via a hybrid approach leveraging topological properties. Bioinformatics 2023; 39:6957083. [PMID: 36548341 PMCID: PMC9831052 DOI: 10.1093/bioinformatics/btac818] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 05/26/2022] [Revised: 08/30/2022] [Accepted: 12/20/2022] [Indexed: 12/24/2022] Open
Abstract
MOTIVATION Biological networks can provide a system-level understanding of underlying processes. In many contexts, networks have a high degree of modularity, i.e. they consist of subsets of nodes, often known as subnetworks or modules, which are highly interconnected and may perform separate functions. To investigate the association between an identified module and a variable of interest in subsequent analyses, a module summarization that best explains the module's information and reduces dimensionality is often needed. Conventional approaches for obtaining network representation typically rely only on the profiles of the nodes within the network while disregarding the inherent network topological information. RESULTS In this article, we propose NetSHy, a hybrid approach which is capable of reducing the dimension of a network while incorporating topological properties to aid the interpretation of the downstream analyses. In particular, NetSHy applies principal component analysis (PCA) on a combination of the node profiles and the well-known Laplacian matrix derived directly from the network similarity matrix to extract a summarization at a subject level. Simulation scenarios based on random and empirical networks at varying network sizes and sparsity levels show that NetSHy outperforms the conventional PCA approach applied directly on node profiles, in terms of recovering the true correlation with a phenotype of interest and maintaining a higher amount of explained variation in the data when networks are relatively sparse. The robustness of NetSHy is also demonstrated by a more consistent correlation with the observed phenotype as the sample size decreases. Lastly, a genome-wide association study is performed as an application of a downstream analysis, where NetSHy summarization scores on the biological networks identify more significant single nucleotide polymorphisms than the conventional network representation.
AVAILABILITY AND IMPLEMENTATION R code implementation of NetSHy is available at https://github.com/thaovu1/NetSHy. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
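The summarization step described in this abstract can be sketched as follows. This is a minimal illustration, not the NetSHy package: it assumes that the "combination" of node profiles and Laplacian is the matrix product `X @ L`, with `L` the unnormalized Laplacian of the adjacency matrix, and the function names are hypothetical. The R implementation linked above is the authoritative version.

```python
import numpy as np

def laplacian(A):
    """Unnormalized graph Laplacian L = D - A of a symmetric adjacency matrix."""
    A = np.asarray(A, dtype=float)
    return np.diag(A.sum(axis=1)) - A

def network_summary_scores(X, A, n_components=1):
    """Subject-level summary scores for one module.

    X: (n_subjects x n_nodes) node profiles; A: (n_nodes x n_nodes) adjacency.
    Assumption: topology enters by multiplying centered profiles by the Laplacian.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)          # center each node profile
    Z = Xc @ laplacian(A)            # topology-weighted profiles
    U, S, _ = np.linalg.svd(Z, full_matrices=False)  # PCA via SVD
    scores = U[:, :n_components] * S[:n_components]  # principal component scores
    total = np.sum(S ** 2)
    explained = (S ** 2) / total if total > 0 else np.zeros_like(S)
    return scores, explained[:n_components]
```

The first column of `scores` can then be correlated with a phenotype, in the same way a first principal component of the raw node profiles would be.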
Affiliation(s)
- Thao Vu
- To whom correspondence should be addressed.
| | - Elizabeth M Litkowski
- Department of Epidemiology, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
- Division of Biomedical Informatics & Personalized Medicine, School of Medicine, Colorado University Anschutz Medical Campus, Aurora, CO 80045, USA
| | - Weixuan Liu
- Department of Biostatistics and Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
| | - Katherine A Pratte
- Department of Biostatistics, National Jewish Health, Denver, CO 80206, USA
| | - Leslie Lange
- Division of Biomedical Informatics & Personalized Medicine, School of Medicine, Colorado University Anschutz Medical Campus, Aurora, CO 80045, USA
| | - Russell P Bowler
- Division of Pulmonary Medicine, Department of Medicine, National Jewish Health, Denver, CO 80206, USA
| | - Farnoush Banaei-Kashani
- Department of Computer Science and Engineering, College of Engineering, Design and Computing, University of Colorado Denver, Denver, CO 80204, USA
| | | |
|
18
|
Galindez G, Sadegh S, Baumbach J, Kacprowski T, List M. Network-based approaches for modeling disease regulation and progression. Comput Struct Biotechnol J 2022; 21:780-795. [PMID: 36698974 PMCID: PMC9841310 DOI: 10.1016/j.csbj.2022.12.022] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 08/15/2022] [Revised: 12/14/2022] [Accepted: 12/14/2022] [Indexed: 12/23/2022] Open
Abstract
Molecular interaction networks lay the foundation for studying how biological functions are controlled by the complex interplay of genes and proteins. Investigating perturbed processes using biological networks has been instrumental in uncovering mechanisms that underlie complex disease phenotypes. Rapid advances in omics technologies have prompted the generation of high-throughput datasets, enabling large-scale, network-based analyses. Consequently, various modeling techniques, including network enrichment, differential network extraction, and network inference, have proven to be useful for gaining new mechanistic insights. We provide an overview of recent network-based methods and their core ideas to facilitate the discovery of disease modules or candidate mechanisms. Knowledge generated from these computational efforts will benefit biomedical research, especially drug development and precision medicine. We further discuss current challenges and provide perspectives in the field, highlighting the need for more integrative and dynamic network approaches to model disease development and progression.
Affiliation(s)
- Gihanna Galindez
- Division Data Science in Biomedicine, Peter L. Reichertz Institute for Medical Informatics of Technische Universität Braunschweig and Hannover Medical School, Braunschweig, Germany
- Braunschweig Integrated Centre of Systems Biology (BRICS), TU Braunschweig, Braunschweig, Germany
| | - Sepideh Sadegh
- Chair of Experimental Bioinformatics, TUM School of Life Sciences Weihenstephan, Technical University of Munich, Freising, Germany
- Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany
| | - Jan Baumbach
- Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany
- Department of Mathematics and Computer Science, University of Southern Denmark, Odense, Denmark
| | - Tim Kacprowski
- Division Data Science in Biomedicine, Peter L. Reichertz Institute for Medical Informatics of Technische Universität Braunschweig and Hannover Medical School, Braunschweig, Germany
- Braunschweig Integrated Centre of Systems Biology (BRICS), TU Braunschweig, Braunschweig, Germany
| | - Markus List
- Chair of Experimental Bioinformatics, TUM School of Life Sciences Weihenstephan, Technical University of Munich, Freising, Germany
| |
|
19
|
Tiong KL, Sintupisut N, Lin MC, Cheng CH, Woolston A, Lin CH, Ho M, Lin YW, Padakanti S, Yeang CH. An integrated analysis of the cancer genome atlas data discovers a hierarchical association structure across thirty three cancer types. PLOS DIGITAL HEALTH 2022; 1:e0000151. [PMID: 36812605 PMCID: PMC9931374 DOI: 10.1371/journal.pdig.0000151] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/01/2022] [Accepted: 10/31/2022] [Indexed: 06/18/2023]
Abstract
Cancer cells harbor molecular alterations at all levels of information processing. Genomic/epigenomic and transcriptomic alterations are inter-related between genes, within and across cancer types, and may affect clinical phenotypes. Despite abundant prior studies integrating cancer multi-omics data, none of them organizes these associations in a hierarchical structure and validates the discoveries in extensive external data. We infer this Integrated Hierarchical Association Structure (IHAS) from the complete data of The Cancer Genome Atlas (TCGA) and compile a compendium of cancer multi-omics associations. Intriguingly, diverse alterations on genomes/epigenomes from multiple cancer types impact the transcription of 18 Gene Groups. Half of them are further reduced to three Meta Gene Groups enriched with (1) immune and inflammatory responses, (2) embryonic development and neurogenesis, (3) cell cycle process and DNA repair. Over 80% of the clinical/molecular phenotypes reported in TCGA are aligned with the combinatorial expressions of Meta Gene Groups, Gene Groups, and other IHAS subunits. Furthermore, IHAS derived from TCGA is validated in more than 300 external datasets including multi-omics measurements and cellular responses upon drug treatments and gene perturbations in tumors, cancer cell lines, and normal tissues. To sum up, IHAS stratifies patients in terms of molecular signatures of its subunits, selects targeted genes or drugs for precision cancer therapy, and demonstrates that associations between survival times and transcriptional biomarkers may vary with cancer types. This rich information is critical for the diagnosis and treatment of cancers.
Affiliation(s)
- Khong-Loon Tiong
- Institute of Statistical Science, Academia Sinica, Section 2, Taipei, Taiwan
| | - Nardnisa Sintupisut
- Institute of Statistical Science, Academia Sinica, Section 2, Taipei, Taiwan
| | - Min-Chin Lin
- Institute of Statistical Science, Academia Sinica, Section 2, Taipei, Taiwan
- Psomagen, Rockville, Maryland, United States of America
| | - Chih-Hung Cheng
- Institute of Statistical Science, Academia Sinica, Section 2, Taipei, Taiwan
| | - Andrew Woolston
- Institute of Statistical Science, Academia Sinica, Section 2, Taipei, Taiwan
- Translational Cancer Immunotherapy & Genomics Lab, Barts Cancer Institute, Charterhouse Square, London, United Kingdom
| | - Chih-Hsu Lin
- Institute of Statistical Science, Academia Sinica, Section 2, Taipei, Taiwan
- C3.ai, Redwood City, California, United States of America
| | - Mirrian Ho
- Institute of Statistical Science, Academia Sinica, Section 2, Taipei, Taiwan
| | - Yu-Wei Lin
- Institute of Statistical Science, Academia Sinica, Section 2, Taipei, Taiwan
- AiLife Diagnostics, Pearland, Texas, United States of America
| | - Sridevi Padakanti
- Institute of Statistical Science, Academia Sinica, Section 2, Taipei, Taiwan
| | - Chen-Hsiang Yeang
- Institute of Statistical Science, Academia Sinica, Section 2, Taipei, Taiwan
| |
|
20
|
Athieniti E, Spyrou GM. A guide to multi-omics data collection and integration for translational medicine. Comput Struct Biotechnol J 2022; 21:134-149. [PMID: 36544480 PMCID: PMC9747357 DOI: 10.1016/j.csbj.2022.11.050] [Citation(s) in RCA: 27] [Impact Index Per Article: 13.5] [Received: 06/23/2022] [Revised: 11/25/2022] [Accepted: 11/25/2022] [Indexed: 12/02/2022] Open
Abstract
The emerging high-throughput technologies have led to the shift in the design of translational medicine projects towards collecting multi-omics patient samples and, consequently, their integrated analysis. However, the complexity of integrating these datasets has triggered new questions regarding the appropriateness of the available computational methods. Currently, there is no clear consensus on the best combination of omics to include and the data integration methodologies required for their analysis. This article aims to guide the design of multi-omics studies in the field of translational medicine regarding the types of omics and the integration method to choose. We review articles that perform the integration of multiple omics measurements from patient samples. We identify five objectives in translational medicine applications: (i) detect disease-associated molecular patterns, (ii) subtype identification, (iii) diagnosis/prognosis, (iv) drug response prediction, and (v) understand regulatory processes. We describe common trends in the selection of omic types combined for different objectives and diseases. To guide the choice of data integration tools, we group them into the scientific objectives they aim to address. We describe the main computational methods adopted to achieve these objectives and present examples of tools. We compare tools based on how they deal with the computational challenges of data integration and comment on how they perform against predefined objective-specific evaluation criteria. Finally, we discuss examples of tools for downstream analysis and further extraction of novel insights from multi-omics datasets.
Affiliation(s)
- Efi Athieniti
- Department of Bioinformatics, The Cyprus Institute of Neurology and Genetics, 6 Iroon Avenue, 2371 Ayios Dometios, Nicosia, Cyprus
| | - George M. Spyrou
- Department of Bioinformatics, The Cyprus Institute of Neurology and Genetics, 6 Iroon Avenue, 2371 Ayios Dometios, Nicosia, Cyprus
| |
|
21
|
Agamah FE, Bayjanov JR, Niehues A, Njoku KF, Skelton M, Mazandu GK, Ederveen THA, Mulder N, Chimusa ER, 't Hoen PAC. Computational approaches for network-based integrative multi-omics analysis. Front Mol Biosci 2022; 9:967205. [PMID: 36452456 PMCID: PMC9703081 DOI: 10.3389/fmolb.2022.967205] [Citation(s) in RCA: 22] [Impact Index Per Article: 11.0] [Received: 06/12/2022] [Accepted: 10/20/2022] [Indexed: 08/27/2023] Open
Abstract
Advances in omics technologies allow for holistic studies into biological systems. These studies rely on integrative data analysis techniques to obtain a comprehensive view of the dynamics of cellular processes, and molecular mechanisms. Network-based integrative approaches have revolutionized multi-omics analysis by providing the framework to represent interactions between multiple different omics-layers in a graph, which may faithfully reflect the molecular wiring in a cell. Here we review network-based multi-omics/multi-modal integrative analytical approaches. We classify these approaches according to the type of omics data supported, the methods and/or algorithms implemented, their node and/or edge weighting components, and their ability to identify key nodes and subnetworks. We show how these approaches can be used to identify biomarkers, disease subtypes, crosstalk, causality, and molecular drivers of physiological and pathological mechanisms. We provide insight into the most appropriate methods and tools for research questions as showcased around the aetiology and treatment of COVID-19 that can be informed by multi-omics data integration. We conclude with an overview of challenges associated with multi-omics network-based analysis, such as reproducibility, heterogeneity, (biological) interpretability of the results, and we highlight some future directions for network-based integration.
Affiliation(s)
- Francis E. Agamah
- Division of Human Genetics, Department of Pathology, Institute of Infectious Disease and Molecular Medicine, Faculty of Health Sciences, University of Cape Town, Cape Town, South Africa
- Computational Biology Division, Department of Integrative Biomedical Sciences, Institute of Infectious Disease and Molecular Medicine, CIDRI-Africa Wellcome Trust Centre, Faculty of Health Sciences, University of Cape Town, Cape Town, South Africa
| | - Jumamurat R. Bayjanov
- Center for Molecular and Biomolecular Informatics (CMBI), Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, Nijmegen, Netherlands
| | - Anna Niehues
- Center for Molecular and Biomolecular Informatics (CMBI), Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, Nijmegen, Netherlands
| | - Kelechi F. Njoku
- Division of Human Genetics, Department of Pathology, Institute of Infectious Disease and Molecular Medicine, Faculty of Health Sciences, University of Cape Town, Cape Town, South Africa
| | - Michelle Skelton
- Computational Biology Division, Department of Integrative Biomedical Sciences, Institute of Infectious Disease and Molecular Medicine, CIDRI-Africa Wellcome Trust Centre, Faculty of Health Sciences, University of Cape Town, Cape Town, South Africa
| | - Gaston K. Mazandu
- Division of Human Genetics, Department of Pathology, Institute of Infectious Disease and Molecular Medicine, Faculty of Health Sciences, University of Cape Town, Cape Town, South Africa
- Computational Biology Division, Department of Integrative Biomedical Sciences, Institute of Infectious Disease and Molecular Medicine, CIDRI-Africa Wellcome Trust Centre, Faculty of Health Sciences, University of Cape Town, Cape Town, South Africa
- African Institute for Mathematical Sciences, Cape Town, South Africa
| | - Thomas H. A. Ederveen
- Center for Molecular and Biomolecular Informatics (CMBI), Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, Nijmegen, Netherlands
| | - Nicola Mulder
- Computational Biology Division, Department of Integrative Biomedical Sciences, Institute of Infectious Disease and Molecular Medicine, CIDRI-Africa Wellcome Trust Centre, Faculty of Health Sciences, University of Cape Town, Cape Town, South Africa
| | - Emile R. Chimusa
- Department of Applied Sciences, Faculty of Health and Life Sciences, Northumbria University, Newcastle, United Kingdom
| | - Peter A. C. 't Hoen
- Center for Molecular and Biomolecular Informatics (CMBI), Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, Nijmegen, Netherlands
| |
|
22
|
Hawe JS, Saha A, Waldenberger M, Kunze S, Wahl S, Müller-Nurasyid M, Prokisch H, Grallert H, Herder C, Peters A, Strauch K, Theis FJ, Gieger C, Chambers J, Battle A, Heinig M. Network reconstruction for trans acting genetic loci using multi-omics data and prior information. Genome Med 2022; 14:125. [PMID: 36344995 PMCID: PMC9641770 DOI: 10.1186/s13073-022-01124-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2022] [Accepted: 10/11/2022] [Indexed: 11/09/2022] Open
Abstract
BACKGROUND Molecular measurements of the genome, the transcriptome, and the epigenome, often termed multi-omics data, provide an in-depth view of biological systems, and their integration is crucial for gaining insights into complex regulatory processes. These data can be used to explain disease-related genetic variants by linking them to intermediate molecular traits (quantitative trait loci, QTL). Molecular networks regulating cellular processes leave footprints in QTL results as so-called trans-QTL hotspots. Reconstructing these networks is a complex endeavor, and the use of biological prior information can improve network inference. However, previous efforts were limited in the types of priors used or have only been applied to model systems. In this study, we reconstruct the regulatory networks underlying trans-QTL hotspots using human cohort data and data-driven prior information. METHODS We devised a new strategy to integrate QTL with human population-scale multi-omics data. State-of-the-art network inference methods including BDgraph and glasso were applied to these data. Comprehensive prior information to guide network inference was manually curated from large-scale biological databases. The inference approach was extensively benchmarked using simulated data and cross-cohort replication analyses. The best-performing methods were subsequently applied to real-world human cohort data. RESULTS Our benchmarks showed that prior-based strategies outperform methods without prior information in simulated data and show better replication across datasets. Application of our approach to human cohort data highlighted two novel regulatory networks related to schizophrenia and lean body mass for which we generated novel functional hypotheses. CONCLUSIONS We demonstrate that existing biological knowledge can improve the integrative analysis of networks underlying trans associations and generate novel hypotheses about regulatory mechanisms.
Affiliation(s)
- Johann S Hawe
- Institute of Computational Biology, German Research Center for Environmental Health, HelmholtzZentrum München, Neuherberg, Germany
- German Heart Centre Munich, Department of Cardiology, Technical University Munich, Munich, Germany
- Department of Informatics, Technical University of Munich, Garching, Germany
| | - Ashis Saha
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
| | - Melanie Waldenberger
- Research Unit of Molecular Epidemiology, German Research Center for Environmental Health, HelmholtzZentrum München, Neuherberg, Germany
| | - Sonja Kunze
- Research Unit of Molecular Epidemiology, German Research Center for Environmental Health, HelmholtzZentrum München, Neuherberg, Germany
| | - Simone Wahl
- Research Unit of Molecular Epidemiology, German Research Center for Environmental Health, HelmholtzZentrum München, Neuherberg, Germany
| | - Martina Müller-Nurasyid
- Institute of Genetic Epidemiology, German Research Center for Environmental Health, HelmholtzZentrum München, Neuherberg, Germany
- IBE, Faculty of Medicine, LMU Munich, 81377, Munich, Germany
- Institute of Medical Biostatistics, Epidemiology and Informatics (IMBEI), University Medical Center, Johannes Gutenberg University, Mainz, Germany
- Department of Internal Medicine I (Cardiology), Hospital of the Ludwig-Maximilians-University (LMU) Munich, Munich, Germany
| | - Holger Prokisch
- Institute of Human Genetics, School of Medicine, Technische Universität München, Munich, Germany
| | - Harald Grallert
- Research Unit of Molecular Epidemiology, German Research Center for Environmental Health, HelmholtzZentrum München, Neuherberg, Germany
- Institute of Epidemiology, German Research Center for Environmental Health, HelmholtzZentrum München, Neuherberg, Germany
- German Center for Diabetes Research (DZD), Neuherberg, Germany
| | - Christian Herder
- German Center for Diabetes Research (DZD), Neuherberg, Germany
- Institute for Clinical Diabetology, German Diabetes Center, Leibniz Center for Diabetes Research at Heinrich Heine University, Düsseldorf, Germany
- Division of Endocrinology and Diabetology, Medical Faculty, Heinrich Heine University, Düsseldorf, Germany
| | - Annette Peters
- Institute of Epidemiology, German Research Center for Environmental Health, HelmholtzZentrum München, Neuherberg, Germany
| | - Konstantin Strauch
- Institute of Genetic Epidemiology, German Research Center for Environmental Health, HelmholtzZentrum München, Neuherberg, Germany
- Institute of Medical Biostatistics, Epidemiology and Informatics (IMBEI), University Medical Center, Johannes Gutenberg University, Mainz, Germany
- Chair of Genetic Epidemiology, IBE, Faculty of Medicine, LMU Munich, Munich, Germany
| | - Fabian J Theis
- Department of Informatics, Technical University of Munich, Garching, Germany
- Department of Mathematics, Technical University of Munich, Garching, Germany
| | - Christian Gieger
- Research Unit of Molecular Epidemiology, German Research Center for Environmental Health, HelmholtzZentrum München, Neuherberg, Germany
- Institute of Epidemiology, German Research Center for Environmental Health, HelmholtzZentrum München, Neuherberg, Germany
- German Center for Diabetes Research (DZD), Neuherberg, Germany
| | - John Chambers
- Department of Epidemiology and Biostatistics, MRC-PHE Centre for Environment and Health, School of Public Health, Imperial College London, London, UK
- Lee Kong Chian School of Medicine, Nanyang Technological University, 308232, Singapore, Singapore
| | - Alexis Battle
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
| | - Matthias Heinig
- Institute of Computational Biology, German Research Center for Environmental Health, HelmholtzZentrum München, Neuherberg, Germany
- Department of Informatics, Technical University of Munich, Garching, Germany
- Munich Heart Association, Partner Site Munich, DZHK (German Centre for Cardiovascular Research), 10785, Berlin, Germany
| |
|
23
|
Abdullah-Zawawi MR, Govender N, Harun S, Muhammad NAN, Zainal Z, Mohamed-Hussein ZA. Multi-Omics Approaches and Resources for Systems-Level Gene Function Prediction in the Plant Kingdom. PLANTS (BASEL, SWITZERLAND) 2022; 11:2614. [PMID: 36235479 PMCID: PMC9573505 DOI: 10.3390/plants11192614] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 07/29/2022] [Revised: 09/05/2022] [Accepted: 09/13/2022] [Indexed: 06/16/2023]
Abstract
In higher plants, the complexity of a system and the components within and among species are rapidly dissected by omics technologies. Multi-omics datasets are integrated to infer and enable a comprehensive understanding of the life processes of organisms of interest. Further, growing open-source datasets coupled with the emergence of high-performance computing and development of computational tools for biological sciences have assisted in silico functional prediction of unknown genes, proteins and metabolites, otherwise known as uncharacterized. The systems biology approach includes data collection and filtration, system modelling, experimentation and the establishment of new hypotheses for experimental validation. Informatics technologies add meaningful sense to the output generated by complex bioinformatics algorithms, which are now freely available in a user-friendly graphical user interface. These resources accentuate gene function prediction at a relatively minimal cost and effort. Herein, we present a comprehensive view of relevant approaches available for system-level gene function prediction in the plant kingdom. Together, the most recent applications and sought-after principles for gene mining are discussed to benefit the plant research community. A realistic tabulation of plant genomic resources is included for a less laborious and accurate candidate gene discovery in basic plant research and improvement strategies.
Affiliation(s)
- Muhammad-Redha Abdullah-Zawawi
- UKM Medical Molecular Biology Institute (UMBI), Universiti Kebangsaan Malaysia, Kuala Lumpur 56000, Malaysia
- Institute of System Biology (INBIOSIS), Universiti Kebangsaan Malaysia (UKM), Bangi 43600, Malaysia
| | - Nisha Govender
- Institute of System Biology (INBIOSIS), Universiti Kebangsaan Malaysia (UKM), Bangi 43600, Malaysia
| | - Sarahani Harun
- Institute of System Biology (INBIOSIS), Universiti Kebangsaan Malaysia (UKM), Bangi 43600, Malaysia
| | - Nor Azlan Nor Muhammad
- Institute of System Biology (INBIOSIS), Universiti Kebangsaan Malaysia (UKM), Bangi 43600, Malaysia
| | - Zamri Zainal
- Institute of System Biology (INBIOSIS), Universiti Kebangsaan Malaysia (UKM), Bangi 43600, Malaysia
- Faculty of Science and Technology, Universiti Kebangsaan Malaysia (UKM), Bangi 43600, Malaysia
| | - Zeti-Azura Mohamed-Hussein
- Institute of System Biology (INBIOSIS), Universiti Kebangsaan Malaysia (UKM), Bangi 43600, Malaysia
- Faculty of Science and Technology, Universiti Kebangsaan Malaysia (UKM), Bangi 43600, Malaysia
| |
|
24
|
Robin V, Bodein A, Scott-Boyer MP, Leclercq M, Périn O, Droit A. Overview of methods for characterization and visualization of a protein-protein interaction network in a multi-omics integration context. Front Mol Biosci 2022; 9:962799. [PMID: 36158572 PMCID: PMC9494275 DOI: 10.3389/fmolb.2022.962799] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/06/2022] [Accepted: 08/16/2022] [Indexed: 11/26/2022] Open
Abstract
Protein-protein interactions (PPIs) sit at the heart of the cellular machinery and play a significant role in regulating cellular functions. PPIs can be analyzed with network approaches. Construction of a PPI network requires prediction of the interactions. All PPIs form a network. Different biases such as lack of data, recurrence of information, and false interactions make the network unstable. Integrated strategies help to address these challenges. These approaches have shown encouraging results for the understanding of molecular mechanisms, drug action mechanisms, and identification of target genes. To weight an interaction, it is evaluated by different confidence scores. These scores allow the network to be filtered and thus make it easier to represent, both essential steps toward identifying and understanding molecular mechanisms. In this review, we discuss the main computational methods for predicting PPIs, including those that confirm an interaction, as well as the integration of PPIs into a network, and we discuss visualization of these complex data.
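The idea of filtering a PPI network by confidence score can be sketched in a few lines. This is only an illustration under stated assumptions: the protein identifiers, the scores, and the 0.7 cutoff are hypothetical, and real resources define and scale their confidence scores differently.

```python
def filter_by_confidence(edges, threshold):
    """Keep interactions whose confidence score is at least `threshold`."""
    return [(a, b, score) for a, b, score in edges if score >= threshold]

def to_adjacency(edges):
    """Build an undirected adjacency map from (protein, protein, score) triples."""
    adjacency = {}
    for a, b, _ in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency

# Hypothetical scored interactions (identifiers and scores are made up).
scored_edges = [
    ("P1", "P2", 0.95),
    ("P2", "P3", 0.40),
    ("P1", "P4", 0.72),
    ("P3", "P4", 0.15),
]

# Assumed cutoff; an appropriate value depends on the scoring scheme used.
high_confidence = filter_by_confidence(scored_edges, 0.7)
network = to_adjacency(high_confidence)
```

Raising the cutoff trades coverage for reliability: fewer false interactions remain, but proteins whose only evidence is weak (here P3) drop out of the network entirely.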
Affiliation(s)
- Vivian Robin
- Molecular Medicine Department, CHU de Québec Research Center, Université Laval, Québec, QC, Canada
| | - Antoine Bodein
- Molecular Medicine Department, CHU de Québec Research Center, Université Laval, Québec, QC, Canada
| | - Marie-Pier Scott-Boyer
- Molecular Medicine Department, CHU de Québec Research Center, Université Laval, Québec, QC, Canada
| | - Mickaël Leclercq
- Molecular Medicine Department, CHU de Québec Research Center, Université Laval, Québec, QC, Canada
| | - Olivier Périn
- Digital Sciences Department, L'Oréal Advanced Research, Aulnay-sous-bois, France
| | - Arnaud Droit
- Molecular Medicine Department, CHU de Québec Research Center, Université Laval, Québec, QC, Canada
| |
|
25
|
Dirmeier S, Beerenwinkel N. Structured hierarchical models for probabilistic inference from perturbation screening data. Ann Appl Stat 2022. [DOI: 10.1214/21-aoas1580] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]
Affiliation(s)
- Simon Dirmeier
- Department of Biosystems Science and Engineering, ETH Zurich
| | | |
|
26
|
Dai X, Shen L. Advances and Trends in Omics Technology Development. Front Med (Lausanne) 2022; 9:911861. [PMID: 35860739 PMCID: PMC9289742 DOI: 10.3389/fmed.2022.911861] [Citation(s) in RCA: 89] [Impact Index Per Article: 44.5] [Received: 04/03/2022] [Accepted: 05/09/2022] [Indexed: 12/11/2022] Open
Abstract
Human history has witnessed the rapid development of technologies such as high-throughput sequencing and mass spectrometry, which led to the concept of “omics” and to methodological advances in systematically interrogating a cellular system. Yet the ever-growing types of molecules and regulatory mechanisms being discovered have persistently transformed our understanding of the cellular machinery. As a result, cell omics seems, like the universe, to expand without limit, and our goal of completely harnessing the cellular system seems nearly impossible. Therefore, it is imperative to review what has been done and is being done in order to predict what can be done toward translating omics information into disease control with minimal cell perturbation. With a focus on the “four big omics,” i.e., genomics, transcriptomics, proteomics, and metabolomics, we delineate hierarchies of these omics together with their epiomics and interactomics, and review technologies developed for interrogation. We predict, among others, redoxomics as an emerging omics layer that views cell decision toward the physiological or pathological state as a fine-tuned redox balance.
|
27
|
Rustam, Gunawan AY, Kresnowati MTAP. Data dimensionality reduction technique for clustering problem of metabolomics data. Heliyon 2022; 8:e09715. [PMID: 35721675 PMCID: PMC9201019 DOI: 10.1016/j.heliyon.2022.e09715] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 05/29/2021] [Revised: 02/28/2022] [Accepted: 06/07/2022] [Indexed: 11/27/2022] Open
Abstract
In metabolomics studies, independent analyses, or replicate measurements of metabolite concentrations, are often performed to anticipate errors. At the same time, dataset sizes are increasing. For clustering purposes, chemically representative information must be obtained from the independent analyses. The objective of this study is to develop a data reduction method that yields a dataset representing this chemical information. Overall, a proper data reduction method would simplify the clustering of metabolite data. We propose the modified Weiszfeld algorithm (MWA) to reduce independent analyses. To obtain comprehensive results, we compare MWA with other well-known reduction methods, including PCA, CMDS, LE, and LLE. The reduced datasets are then clustered using the fuzzy c-means (FCM) algorithm, with the Tang Sun Sun (TSS) index and the silhouette index as cluster validity indices. The results show that MWA and PCA both yield the optimal number of clusters, namely four. This result agrees with the optimal number of clusters before dimensionality reduction. The present results show that MWA robustly performs dimensionality reduction of independent analyses while maintaining chemical information in the reduced dataset. Therefore, we recommend MWA as a reliable chemometric technique; the present finding enriches the chemometric techniques available for metabolomics studies.
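For orientation, the classical Weiszfeld iteration computes the geometric median of a set of points, which is one natural way to collapse replicate measurements into a single representative profile. The sketch below shows only that classical algorithm; the abstract does not describe the authors' modification, so nothing here should be read as MWA itself, and treating each replicate as one point is an assumption made for illustration.

```python
import math

def geometric_median(points, tol=1e-9, max_iter=1000):
    """Classical Weiszfeld iteration for the geometric median of `points`."""
    dim = len(points[0])
    # Start from the centroid (arithmetic mean) of the replicates.
    y = [sum(p[k] for p in points) / len(points) for k in range(dim)]
    for _ in range(max_iter):
        numerator = [0.0] * dim
        denominator = 0.0
        for p in points:
            d = math.dist(p, y)
            if d < 1e-12:
                # Simplification: stop if the iterate lands on a data point,
                # where the inverse-distance weight is undefined.
                return list(p)
            weight = 1.0 / d
            denominator += weight
            for k in range(dim):
                numerator[k] += weight * p[k]
        new_y = [c / denominator for c in numerator]
        if math.dist(new_y, y) < tol:
            return new_y
        y = new_y
    return y

# Hypothetical replicates of a two-metabolite profile, one of them an outlier.
replicates = [[1.0, 1.0], [1.2, 0.9], [0.9, 1.2], [1.1, 1.1], [10.0, 10.0]]
median_profile = geometric_median(replicates)
mean_profile = [sum(r[k] for r in replicates) / len(replicates) for k in range(2)]
```

Because each point is weighted by the inverse of its distance to the current estimate, a single aberrant replicate pulls the geometric median far less than it pulls the arithmetic mean.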
Affiliation(s)
- Rustam
- Telkom University, School of Electrical Engineering, Department of Telecommunication Engineering, Jl. Telekomunikasi No.1 Dayeuh Kolot, 40257 Kabupaten Bandung, Jawa Barat, Indonesia
| | - Agus Yodi Gunawan
- Institut Teknologi Bandung, Faculty of Mathematics and Natural Sciences, Industrial and Financial Mathematics Research Group, Jl. Ganesha 10 Bandung 40132, Indonesia
| | - Made Tri Ari Penia Kresnowati
- Institut Teknologi Bandung, Faculty of Industrial Technology, Food and Biomass Processing Technology Research Group, Jl. Ganesha 10 Bandung 40132, Indonesia
| |
|
28
|
Sonawane AR, Aikawa E, Aikawa M. Connections for Matters of the Heart: Network Medicine in Cardiovascular Diseases. Front Cardiovasc Med 2022; 9:873582. [PMID: 35665246 PMCID: PMC9160390 DOI: 10.3389/fcvm.2022.873582] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/11/2022] [Accepted: 04/19/2022] [Indexed: 01/18/2023] Open
Abstract
Cardiovascular diseases (CVD) are diverse disorders affecting the heart and vasculature in millions of people worldwide. Like other fields, CVD research has benefitted from the deluge of multiomics biomedical data. Current CVD research focuses on disease etiologies and mechanisms, identifying disease biomarkers, developing appropriate therapies and drugs, and stratifying patients into correct disease endotypes. Systems biology offers an alternative to traditional reductionist approaches and provides impetus for a comprehensive outlook toward diseases. As a focus area, network medicine specifically aids the translational aspect of in silico research. This review discusses the approach of network medicine and its application to CVD research.
Affiliation(s)
- Abhijeet Rajendra Sonawane
- Center for Interdisciplinary Cardiovascular Sciences, Division of Cardiovascular Medicine, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, United States
- Center for Excellence in Vascular Biology, Division of Cardiovascular Medicine, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, United States
| | - Elena Aikawa
- Center for Interdisciplinary Cardiovascular Sciences, Division of Cardiovascular Medicine, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, United States
- Center for Excellence in Vascular Biology, Division of Cardiovascular Medicine, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, United States
| | - Masanori Aikawa
- Center for Interdisciplinary Cardiovascular Sciences, Division of Cardiovascular Medicine, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, United States
- Center for Excellence in Vascular Biology, Division of Cardiovascular Medicine, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, United States
| |
|
29
|
Abbaszadeh O, Azarpeyvand A, Khanteymoori A, Bahari A. Data-Driven and Knowledge-Based Algorithms for Gene Network Reconstruction on High-Dimensional Data. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:1545-1557. [PMID: 33119511 DOI: 10.1109/tcbb.2020.3034861] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2023]
Abstract
Previous efforts in gene network reconstruction have mainly focused on data-driven modeling, with little attention paid to knowledge-based approaches. Leveraging prior knowledge, however, is a promising paradigm that has been gaining momentum in the network reconstruction and computational biology research communities. This paper proposes two new algorithms for reconstructing a gene network from expression profiles, with and without prior knowledge, in small-sample and high-dimensional settings. First, using tools from statistical estimation theory, particularly the empirical Bayesian approach, the current research estimates a covariance matrix via the shrinkage method. Second, the estimated covariance matrix is employed in the penalized normal likelihood method to select the Gaussian graphical model. This formulation allows the application of prior knowledge in the covariance estimation, as well as in the Gaussian graphical model selection. Experimental results on simulated and real datasets show that, compared to state-of-the-art methods, the proposed algorithms achieve better results in terms of both PR and ROC curves. Finally, the present work applies its method to RNA-seq data of human gastric atrophy patients, obtained from the EMBL-EBI database. The source codes and relevant data can be downloaded from: https://github.com/AbbaszadehO/DKGN.
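The shrinkage step mentioned in the abstract can be illustrated with a generic linear shrinkage estimator. This is a minimal sketch under stated assumptions, not the authors' algorithm: the shrinkage intensity `lam` is fixed by hand here, whereas the paper estimates it with an empirical Bayesian approach, and the diagonal target is only one common choice. The function names are hypothetical.

```python
def sample_covariance(rows):
    """Unbiased sample covariance of a list of observations (rows)."""
    n = len(rows)
    p = len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    return [
        [
            sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
            for j in range(p)
        ]
        for i in range(p)
    ]


def shrink_covariance(cov, lam):
    """Linear shrinkage toward the diagonal: (1 - lam) * S + lam * diag(S).

    lam is a hand-picked intensity in [0, 1] (assumption for illustration).
    Variances are left unchanged; covariances are scaled by (1 - lam).
    """
    p = len(cov)
    return [
        [cov[i][j] if i == j else (1.0 - lam) * cov[i][j] for j in range(p)]
        for i in range(p)
    ]


# Hypothetical expression data: two perfectly collinear genes, three samples.
expr = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
S = sample_covariance(expr)          # singular: determinant is zero
S_shrunk = shrink_covariance(S, 0.5)  # determinant becomes positive
```

The point of the example is that the raw sample covariance is singular, which happens routinely when there are fewer samples than genes, while the shrunk estimate is invertible and can therefore be passed to a penalized likelihood step for Gaussian graphical model selection.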
|
30
|
Perlo V, Margarido GRA, Botha FC, Furtado A, Hodgson-Kratky K, Correr FH, Henry RJ. Transcriptome changes in the developing sugarcane culm associated with high yield and early-season high sugar content. TAG. Theoretical and Applied Genetics. Theoretische und angewandte Genetik 2022; 135:1619-1636. [PMID: 35224663 PMCID: PMC9110458 DOI: 10.1007/s00122-022-04058-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/02/2021] [Accepted: 02/08/2022] [Indexed: 06/14/2023]
Abstract
Sugarcane, with its exceptional carbon dioxide assimilation, biomass and sugar yield, has a high potential for the production of bio-energy, bio-plastics and high-value products in the food and pharmaceutical industries. A crucial challenge for long-term economic viability and environmental sustainability is also to optimize the production of biomass composition and carbon sequestration. Sugarcane varieties such as KQ228 and Q253 are highly utilized in the industry. These varieties are characterized by a high early-season sugar content associated with high yield. In order to investigate these correlations, 1,440 internodes were collected and combined to generate a set of 120 samples in triplicate across 24 sugarcane cultivars at five different development stages. Weighted gene co-expression network analysis (WGCNA) was used and revealed, for the first time, two sets of co-expressed genes with distinct and opposite correlations with fibre and sugar content. Gene identification and metabolic pathway analysis were used to define these two sets of genes. Correlation analysis identified a large number of interconnected metabolic pathways linked to sugar content and fibre content. Unsupervised hierarchical clustering of gene expression revealed a stronger level of segregation associated with the genotypes than with the stage of development, suggesting a dominant genetic influence on biomass composition and facilitating breeding selection. Characterization of these two groups of co-expressed key genes can help to improve breeding programs for high-fibre, high-sugar species or plant synthetic biology.
Affiliation(s)
- Virginie Perlo
- Queensland Alliance for Agriculture and Food Innovation, University of Queensland, Brisbane, QLD 4072 Australia
| | - Gabriel R. A. Margarido
- Departamento de Genética, Escola Superior de Agricultura “Luiz de Queiroz”, Universidade de São Paulo, Piracicaba, São Paulo, 13418-900 Brazil
| | - Frederik C. Botha
- Queensland Alliance for Agriculture and Food Innovation, University of Queensland, Brisbane, QLD 4072 Australia
| | - Agnelo Furtado
- Queensland Alliance for Agriculture and Food Innovation, University of Queensland, Brisbane, QLD 4072 Australia
| | - Katrina Hodgson-Kratky
- Queensland Alliance for Agriculture and Food Innovation, University of Queensland, Brisbane, QLD 4072 Australia
| | - Fernando H. Correr
- Departamento de Genética, Escola Superior de Agricultura “Luiz de Queiroz”, Universidade de São Paulo, Piracicaba, São Paulo, 13418-900 Brazil
| | - Robert J. Henry
- Queensland Alliance for Agriculture and Food Innovation, University of Queensland, Brisbane, QLD 4072 Australia
- The University of Queensland, Level 2, Queensland Bioscience Precinct [#80], 306 Carmody Road St Lucia, St Lucia, QLD 4072 Australia
| |
|
31
|
Panditrao G, Bhowmick R, Meena C, Sarkar RR. Emerging landscape of molecular interaction networks: Opportunities, challenges and prospects. J Biosci 2022. [PMID: 36210749 PMCID: PMC9018971 DOI: 10.1007/s12038-022-00253-y] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Indexed: 11/24/2022]
Abstract
Network biology finds application in interpreting molecular interaction networks and providing insightful inferences using graph theoretical analysis of biological systems. The integration of computational bio-modelling approaches with different hybrid network-based techniques provides additional information about the behaviour of complex systems. With increasing advances in high-throughput technologies in biological research, attempts have been made to incorporate this information into network structures, which has led to a continuous update of network biology approaches over time. The newly minted centrality measures accommodate the details of omics data and regulatory network structure information. The unification of graph network properties with classical mathematical and computational modelling approaches and technologically advanced approaches like machine-learning- and artificial intelligence-based algorithms leverages the potential application of these techniques. These computational advances prove beneficial and serve various applications such as essential gene prediction, identification of drug–disease interaction and gene prioritization. Hence, in this review, we have provided a comprehensive overview of the emerging landscape of molecular interaction networks using graph theoretical approaches. With the aim to provide information on the wide range of applications of network biology approaches in understanding the interaction and regulation of genes, proteins, enzymes and metabolites at different molecular levels, we have reviewed the methods that utilize network topological properties, emerging hybrid network-based approaches and applications that integrate machine learning techniques to analyse molecular interaction networks. Further, we have discussed the applications of these approaches in biomedical research with a note on future prospects.
Affiliation(s)
- Gauri Panditrao
- Chemical Engineering and Process Development Division, CSIR-National Chemical Laboratory, Pune, 411008 India
| | - Rupa Bhowmick
- Chemical Engineering and Process Development Division, CSIR-National Chemical Laboratory, Pune, 411008 India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, 201002 India
| | - Chandrakala Meena
- Chemical Engineering and Process Development Division, CSIR-National Chemical Laboratory, Pune, 411008 India
| | - Ram Rup Sarkar
- Chemical Engineering and Process Development Division, CSIR-National Chemical Laboratory, Pune, 411008 India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, 201002 India
| |
|
32
|
Lavin KM, Coen PM, Baptista LC, Bell MB, Drummer D, Harper SA, Lixandrão ME, McAdam JS, O’Bryan SM, Ramos S, Roberts LM, Vega RB, Goodpaster BH, Bamman MM, Buford TW. State of Knowledge on Molecular Adaptations to Exercise in Humans: Historical Perspectives and Future Directions. Compr Physiol 2022; 12:3193-3279. [PMID: 35578962 PMCID: PMC9186317 DOI: 10.1002/cphy.c200033] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Indexed: 01/08/2023]
Abstract
For centuries, regular exercise has been acknowledged as a potent stimulus to promote, maintain, and restore healthy functioning of nearly every physiological system of the human body. With advancing understanding of the complexity of human physiology, continually evolving methodological possibilities, and an increasingly dire public health situation, the study of exercise as a preventative or therapeutic treatment has never been more interdisciplinary, or more impactful. During the early stages of the NIH Common Fund Molecular Transducers of Physical Activity Consortium (MoTrPAC) Initiative, the field is well-positioned to build substantially upon the existing understanding of the mechanisms underlying benefits associated with exercise. Thus, we present a comprehensive body of the knowledge detailing the current literature basis surrounding the molecular adaptations to exercise in humans to provide a view of the state of the field at this critical juncture, as well as a resource for scientists bringing external expertise to the field of exercise physiology. In reviewing current literature related to molecular and cellular processes underlying exercise-induced benefits and adaptations, we also draw attention to existing knowledge gaps warranting continued research effort. © 2021 American Physiological Society. Compr Physiol 12:3193-3279, 2022.
Affiliation(s)
- Kaleen M. Lavin
- Center for Exercise Medicine, The University of Alabama at Birmingham, Birmingham, Alabama, USA
- Department of Cell, Developmental, and Integrative Biology, The University of Alabama at Birmingham, Birmingham, Alabama, USA
- Center for Human Health, Resilience, and Performance, Institute for Human and Machine Cognition, Pensacola, Florida, USA
| | - Paul M. Coen
- Translational Research Institute for Metabolism and Diabetes, Advent Health, Orlando, Florida, USA
- Sanford Burnham Prebys Medical Discovery Institute, Orlando, Florida, USA
| | - Liliana C. Baptista
- Center for Exercise Medicine, The University of Alabama at Birmingham, Birmingham, Alabama, USA
- Department of Medicine, Division of Gerontology, Geriatrics and Palliative Care, The University of Alabama at Birmingham, Birmingham, Alabama, USA
| | - Margaret B. Bell
- Center for Exercise Medicine, The University of Alabama at Birmingham, Birmingham, Alabama, USA
- Department of Cell, Developmental, and Integrative Biology, The University of Alabama at Birmingham, Birmingham, Alabama, USA
| | - Devin Drummer
- Center for Exercise Medicine, The University of Alabama at Birmingham, Birmingham, Alabama, USA
- Department of Cell, Developmental, and Integrative Biology, The University of Alabama at Birmingham, Birmingham, Alabama, USA
| | - Sara A. Harper
- Center for Exercise Medicine, The University of Alabama at Birmingham, Birmingham, Alabama, USA
- Department of Medicine, Division of Gerontology, Geriatrics and Palliative Care, The University of Alabama at Birmingham, Birmingham, Alabama, USA
| | - Manoel E. Lixandrão
- Center for Exercise Medicine, The University of Alabama at Birmingham, Birmingham, Alabama, USA
- Department of Cell, Developmental, and Integrative Biology, The University of Alabama at Birmingham, Birmingham, Alabama, USA
| | - Jeremy S. McAdam
- Center for Exercise Medicine, The University of Alabama at Birmingham, Birmingham, Alabama, USA
- Department of Cell, Developmental, and Integrative Biology, The University of Alabama at Birmingham, Birmingham, Alabama, USA
| | - Samia M. O’Bryan
- Center for Exercise Medicine, The University of Alabama at Birmingham, Birmingham, Alabama, USA
- Department of Cell, Developmental, and Integrative Biology, The University of Alabama at Birmingham, Birmingham, Alabama, USA
| | - Sofhia Ramos
- Translational Research Institute for Metabolism and Diabetes, Advent Health, Orlando, Florida, USA
- Sanford Burnham Prebys Medical Discovery Institute, Orlando, Florida, USA
| | - Lisa M. Roberts
- Center for Exercise Medicine, The University of Alabama at Birmingham, Birmingham, Alabama, USA
- Department of Medicine, Division of Gerontology, Geriatrics and Palliative Care, The University of Alabama at Birmingham, Birmingham, Alabama, USA
| | - Rick B. Vega
- Translational Research Institute for Metabolism and Diabetes, Advent Health, Orlando, Florida, USA
- Sanford Burnham Prebys Medical Discovery Institute, Orlando, Florida, USA
| | - Bret H. Goodpaster
- Translational Research Institute for Metabolism and Diabetes, Advent Health, Orlando, Florida, USA
- Sanford Burnham Prebys Medical Discovery Institute, Orlando, Florida, USA
| | - Marcas M. Bamman
- Center for Exercise Medicine, The University of Alabama at Birmingham, Birmingham, Alabama, USA
- Department of Cell, Developmental, and Integrative Biology, The University of Alabama at Birmingham, Birmingham, Alabama, USA
- Center for Human Health, Resilience, and Performance, Institute for Human and Machine Cognition, Pensacola, Florida, USA
| | - Thomas W. Buford
- Center for Exercise Medicine, The University of Alabama at Birmingham, Birmingham, Alabama, USA
- Department of Medicine, Division of Gerontology, Geriatrics and Palliative Care, The University of Alabama at Birmingham, Birmingham, Alabama, USA
| |
|
33
|
Zenere A, Rundquist O, Gustafsson M, Altafini C. Multi-omics protein-coding units as massively parallel Bayesian networks: empirical validation of causality structure. iScience 2022; 25:104048. [PMID: 35355520 PMCID: PMC8958332 DOI: 10.1016/j.isci.2022.104048] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2021] [Revised: 01/17/2022] [Accepted: 03/08/2022] [Indexed: 11/29/2022] Open
Abstract
In this article we use high-throughput epigenomics, transcriptomics, and proteomics data to construct fine-graded models of the “protein-coding units” gathering all transcript isoforms and chromatin accessibility peaks associated with more than 4000 genes in humans. Each protein-coding unit has the structure of a directed acyclic graph (DAG) and can be represented as a Bayesian network. The factorization of the joint probability distribution induced by the DAGs imposes a number of conditional independence relationships among the variables forming a protein-coding unit, corresponding to the missing edges in the DAGs. We show that a large fraction of these conditional independencies are indeed verified by the data. Factors driving this verification appear to be the structural and functional annotation of the transcript isoforms, as well as a notion of structural balance (or frustration-free) of the corresponding sample correlation graph, which naturally leads to reduction of correlation (and hence to independence) upon conditioning.
Protein coding unit: DAG associated with epigenetic and gene information of a protein. DAGs correspond to Bayesian networks. Edge absence on a DAG corresponds to conditional independence. Multi-omics data (ATAC-seq, RNA-seq and mass-spec) are used for DAG validation.
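The factorization referred to in the abstract is the standard Bayesian-network identity, written out here for reference. The symbols are generic: $X_1,\dots,X_n$ are the variables of one DAG and $\mathrm{pa}(X_i)$ denotes the parents of $X_i$. The three-node chain $A \to B \to C$ is a hypothetical example, not one of the paper's protein-coding units.

```latex
% Joint distribution induced by a DAG over X_1, ..., X_n
P(X_1, \dots, X_n) \;=\; \prod_{i=1}^{n} P\bigl(X_i \mid \mathrm{pa}(X_i)\bigr)

% Hypothetical chain A -> B -> C (no edge between A and C)
P(A, B, C) \;=\; P(A)\, P(B \mid A)\, P(C \mid B)
\quad\Longrightarrow\quad
A \perp C \mid B
```

In the chain example the missing edge between $A$ and $C$ is exactly what yields the conditional independence $A \perp C \mid B$, which is the kind of statement the abstract describes testing against data.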
|
34
|
Selvaggio G, Cristellon S, Marchetti L. A Novel Hybrid Logic-ODE Modeling Approach to Overcome Knowledge Gaps. Front Mol Biosci 2022; 8:760077. [PMID: 34988115 PMCID: PMC8721169 DOI: 10.3389/fmolb.2021.760077] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/17/2021] [Accepted: 11/09/2021] [Indexed: 11/13/2022] Open
Abstract
Mathematical modeling allows using different formalisms to describe, investigate, and understand biological processes. However, despite the advent of high-throughput experimental techniques, quantitative information is still a challenge when looking for data to calibrate model parameters. Furthermore, quantitative formalisms must cope with stiffness and tractability problems, more so if used to describe multicellular systems. On the other hand, qualitative models may lack the proper granularity to describe the underlying kinetic processes. We propose a hybrid modeling approach that integrates ordinary differential equations and logical formalism to describe distinct biological layers and their communication. We focused on a multicellular system as a case study by applying the hybrid formalism to the well-known Delta-Notch signaling pathway. We used a differential equation model to describe the intracellular pathways while the cell-cell interactions were defined by logic rules. The hybrid approach herein employed allows us to combine the pros of different modeling techniques by overcoming the lack of quantitative information with a qualitative description that discretizes activation and inhibition processes, thus avoiding complexity.
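To make the idea of coupling logic rules with differential equations concrete, here is a toy two-cell sketch. It is not the authors' Delta-Notch model. The single state variable per cell, the threshold rule, the parameter values, and the explicit Euler integration are all assumptions chosen to keep the example small; the function name `simulate_two_cells` is hypothetical.

```python
def simulate_two_cells(n_init, threshold=0.5, production=1.0, decay=1.0,
                       dt=0.01, steps=2000):
    """Toy hybrid model: logic rules between cells, one ODE inside each cell.

    n_init holds the initial Notch activity of cell 0 and cell 1.
    All parameter values are illustrative assumptions.
    """
    n = list(n_init)
    for _ in range(steps):
        # Logic layer: a cell expresses Delta only while its own Notch
        # activity is below the threshold (Notch represses Delta).
        delta_on = [x < threshold for x in n]
        # Cell-cell interaction: each cell senses the other cell's Delta.
        signal = [delta_on[1], delta_on[0]]
        # ODE layer: dn/dt = production * signal - decay * n, advanced by
        # one explicit Euler step, synchronously for both cells.
        n = [x + dt * (production * float(s) - decay * x)
             for x, s in zip(n, signal)]
    return n


# A small initial asymmetry is amplified into opposite cell states.
final_state = simulate_two_cells([0.0, 0.6])
```

With these settings the cell that starts above the threshold never expresses Delta, so its neighbour receives no signal and stays low, while it keeps receiving signal itself and approaches the production-to-decay ratio. The discrete rule replaces kinetic detail that would otherwise need measured rate constants, which is the trade-off the abstract describes.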
Affiliation(s)
- Gianluca Selvaggio
- Piazza Manifattura, Fondazione The Microsoft Research-University of Trento Centre for Computational and Systems Biology (COSBI), Rovereto, Italy
| | - Serena Cristellon
- Piazza Manifattura, Fondazione The Microsoft Research-University of Trento Centre for Computational and Systems Biology (COSBI), Rovereto, Italy
- Department of Mathematics, University of Trento, Trento, Italy
| | - Luca Marchetti
- Piazza Manifattura, Fondazione The Microsoft Research-University of Trento Centre for Computational and Systems Biology (COSBI), Rovereto, Italy
- Department of Cellular, Computational and Integrative Biology (CIBIO), University of Trento, Trento, Italy
| |
|
35
|
Abstract
DNA microarrays are widely used to investigate gene expression. Even though the classical analysis of microarray data is based on the study of differentially expressed genes, it is well known that genes do not act individually. Network analysis can be applied to study association patterns of the genes in a biological system. Moreover, it finds wide application in differential coexpression analysis between different systems. Network-based coexpression studies have, for example, been used in (complex) disease gene prioritization, disease subtyping, and patient stratification. In this chapter we provide an overview of the methods and tools used to create networks from microarray data and describe multiple methods for analyzing a single network or a group of networks. The described methods range from topological metrics and functional group identification to data integration strategies, topological pathway analysis, and graphical models.
Affiliation(s)
- Alisa Pavel
- Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland
- BioMediTech Institute, Tampere University, Tampere, Finland
- Finnish Hub for Development and Validation of Integrated Approaches (FHAIVE), Tampere University, Tampere, Finland
| | - Angela Serra
- Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland
- BioMediTech Institute, Tampere University, Tampere, Finland
- Finnish Hub for Development and Validation of Integrated Approaches (FHAIVE), Tampere University, Tampere, Finland
| | - Luca Cattelani
- Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland
- BioMediTech Institute, Tampere University, Tampere, Finland
- Finnish Hub for Development and Validation of Integrated Approaches (FHAIVE), Tampere University, Tampere, Finland
| | - Antonio Federico
- Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland
- BioMediTech Institute, Tampere University, Tampere, Finland
- Finnish Hub for Development and Validation of Integrated Approaches (FHAIVE), Tampere University, Tampere, Finland
| | - Dario Greco
- Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland
- BioMediTech Institute, Tampere University, Tampere, Finland
- Finnish Hub for Development and Validation of Integrated Approaches (FHAIVE), Tampere University, Tampere, Finland
- Institute of Biotechnology, University of Helsinki, Helsinki, Finland
| |
|
36
|
Randhawa V, Kumar M. An integrated network analysis approach to identify potential key genes, transcription factors, and microRNAs regulating human hematopoietic stem cell aging. Mol Omics 2021; 17:967-984. [PMID: 34605522 DOI: 10.1039/d1mo00199j] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/21/2022]
Abstract
Hematopoietic stem cells (HSCs) undergo functional deterioration with increasing age that causes loss of their self-renewal and regenerative potential. Despite various efforts, significant success in identifying molecular regulators of HSC aging has not been achieved, one prime reason being the non-availability of appropriate human HSC samples. To demonstrate the scope of integrating and re-analyzing the HSC transcriptomics data available, we used existing tools and databases to structure a sequential data analysis pipeline to predict potential candidate genes, transcription factors, and microRNAs simultaneously. This sequential approach comprises (i) collecting matched young and aged mice HSC sample datasets, (ii) identifying differentially expressed genes, (iii) identifying human homologs of differentially expressed genes, (iv) inferring gene co-expression network modules, and (v) inferring the microRNA-transcription factor-gene regulatory network. Systems-level analyses of HSC interaction networks provided various insights based on which several candidates were predicted. For example, 16 HSC aging-related candidate genes were predicted (e.g., CD38, BRCA1, AGTR1, GSTM1, etc.) from GCN analysis. Following this, the shortest path distance-based analyses of the regulatory network predicted several novel candidate miRNAs and TFs. Among these, miR-124-3p was a common regulator in candidate gene modules, while TFs MYC and SP1 were identified to regulate various candidate genes. Based on the regulatory interactions among candidate genes, TFs, and miRNAs, a potential regulation model of biological processes in each of the candidate modules was predicted, which provided systems-level insights into the molecular complexity of each module to regulate HSC aging.
Affiliation(s)
- Vinay Randhawa
- Virology Unit and Bioinformatics Centre, Institute of Microbial Technology, Council of Scientific & Industrial Research, Chandigarh-160036, India
| | - Manoj Kumar
- Virology Unit and Bioinformatics Centre, Institute of Microbial Technology, Council of Scientific & Industrial Research, Chandigarh-160036, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad-201002, India
| |
|
37
|
Kaiser H, Kvist-Hansen A, Becker C, Wang X, McCauley BD, Krakauer M, Gørtz PM, Henningsen KMA, Zachariae C, Skov L, Hansen PR. Multiscale Biology of Cardiovascular Risk in Psoriasis: Protocol for a Case-Control Study. JMIR Res Protoc 2021; 10:e28669. [PMID: 34581684 PMCID: PMC8512189 DOI: 10.2196/28669] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 03/10/2021] [Revised: 08/17/2021] [Accepted: 08/25/2021] [Indexed: 12/13/2022] Open
Abstract
Background: Patients with psoriasis have increased risk of cardiovascular disease (CVD) independent of traditional risk factors. The molecular mechanisms underlying the psoriasis-CVD connection are not fully understood. Advances in high-throughput molecular profiling technologies and computational analysis techniques offer new opportunities to improve the understanding of disease connections.
Objective: We aim to characterize the complexity of cardiovascular risk in patients with psoriasis by integrating deep phenotypic data with systems biology techniques to perform comprehensive multiomic analyses and construct network models of the two interacting diseases.
Methods: The study aims to include 120 adult patients with psoriasis (60 with prior atherosclerotic CVD and 60 without CVD). Half of the patients are already receiving systemic antipsoriatic treatment. All patients complete a questionnaire, and a medical interview is conducted to collect medical history and information on, for example, socioeconomics, mental health, diet, and physical exercise. Participants are examined clinically with assessment of the Psoriasis Area and Severity Index and undergo imaging by transthoracic echocardiography, 18F-fluorodeoxyglucose positron emission tomography/computed tomography (18F-FDG-PET/CT), and carotid artery ultrasonography. Skin swabs are collected for analysis of microbiome metagenomics; skin biopsies and blood samples are collected for transcriptomic profiling by RNA sequencing; skin biopsies are collected for immunohistochemistry; plasma samples are collected for analyses of proteomics, lipidomics, and metabolomics; blood samples are collected for high-dimensional mass cytometry; and feces samples are collected for gut microbiome metagenomics. Bioinformatics and systems biology techniques are utilized to analyze the multiomic data and to integrate data into a network model of CVD in patients with psoriasis.
Results: Recruitment was completed in September 2020. Preliminary results of 18F-FDG-PET/CT data have recently been published, where vascular inflammation was reduced in the ascending aorta (P=.046) and aortic arch (P=.04) in patients treated with statins and was positively associated with inflammation in the visceral adipose tissue (P<.001), subcutaneous adipose tissue (P=.007), pericardial adipose tissue (P<.001), spleen (P=.001), and bone marrow (P<.001).
Conclusions: This systems biology approach with integration of multiomics and clinical data in patients with psoriasis with or without CVD is likely to provide novel insights into the biological mechanisms underlying these diseases and their interplay that can impact future treatment.
International Registered Report Identifier (IRRID): DERR1-10.2196/28669
Affiliation(s)
- Hannah Kaiser
- Department of Dermatology and Allergy, Copenhagen University Hospital Herlev and Gentofte, Copenhagen, Denmark
- Department of Cardiology, Copenhagen University Hospital Herlev and Gentofte, Copenhagen, Denmark
| | - Amanda Kvist-Hansen
- Department of Dermatology and Allergy, Copenhagen University Hospital Herlev and Gentofte, Copenhagen, Denmark
- Department of Cardiology, Copenhagen University Hospital Herlev and Gentofte, Copenhagen, Denmark
| | - Christine Becker
- Division of Clinical Immunology, Department of Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, United States
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, United States
| | - Xing Wang
- Division of Clinical Immunology, Department of Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, United States
| | - Benjamin D McCauley
- Division of Clinical Immunology, Department of Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, United States
| | - Martin Krakauer
- Department of Clinical Physiology and Nuclear Medicine, Copenhagen University Hospital Bispebjerg and Frederiksberg, Copenhagen, Denmark
| | - Peter Michael Gørtz
- Department of Clinical Physiology and Nuclear Medicine, Copenhagen University Hospital Herlev and Gentofte, Copenhagen, Denmark
| | | | - Claus Zachariae
- Department of Dermatology and Allergy, Copenhagen University Hospital Herlev and Gentofte, Copenhagen, Denmark
- Department of Clinical Medicine, University of Copenhagen, Copenhagen, Denmark
| | - Lone Skov
- Department of Dermatology and Allergy, Copenhagen University Hospital Herlev and Gentofte, Copenhagen, Denmark
- Department of Clinical Medicine, University of Copenhagen, Copenhagen, Denmark
| | - Peter Riis Hansen
- Department of Cardiology, Copenhagen University Hospital Herlev and Gentofte, Copenhagen, Denmark
- Department of Clinical Medicine, University of Copenhagen, Copenhagen, Denmark
| |
|
38
|
Westerlund AM, Hawe JS, Heinig M, Schunkert H. Risk Prediction of Cardiovascular Events by Exploration of Molecular Data with Explainable Artificial Intelligence. Int J Mol Sci 2021; 22:10291. [PMID: 34638627 PMCID: PMC8508897 DOI: 10.3390/ijms221910291] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 08/30/2021] [Revised: 09/17/2021] [Accepted: 09/18/2021] [Indexed: 12/11/2022] Open
Abstract
Cardiovascular diseases (CVD) annually take almost 18 million lives worldwide. Most lethal events occur months or years after the initial presentation. Indeed, many patients experience repeated complications or require multiple interventions (recurrent events). Apart from affecting the individual, this leads to high medical costs for society. Personalized treatment strategies aiming at prediction and prevention of recurrent events rely on early diagnosis and precise prognosis. Complementing the traditional environmental and clinical risk factors, multi-omics data provide a holistic view of the patient and disease progression, enabling studies to probe novel angles in risk stratification. Specifically, predictive molecular markers allow insights into regulatory networks, pathways, and mechanisms underlying disease. Moreover, artificial intelligence (AI) represents a powerful, yet adaptive, framework able to recognize complex patterns in large-scale clinical and molecular data with the potential to improve risk prediction. Here, we review the most recent advances in risk prediction of recurrent cardiovascular events, and discuss the value of molecular data and biomarkers for understanding patient risk in a systems biology context. Finally, we introduce explainable AI which may improve clinical decision systems by making predictions transparent to the medical practitioner.
Affiliation(s)
- Annie M. Westerlund
- Department of Cardiology, Deutsches Herzzentrum München, Technical University Munich, Lazarettstrasse 36, 80636 Munich, Germany
- Institute of Computational Biology, HelmholtzZentrum München, Ingolstädter Landstrasse 1, 85764 Munich, Germany
| | - Johann S. Hawe
- Department of Cardiology, Deutsches Herzzentrum München, Technical University Munich, Lazarettstrasse 36, 80636 Munich, Germany
| | - Matthias Heinig
- Institute of Computational Biology, HelmholtzZentrum München, Ingolstädter Landstrasse 1, 85764 Munich, Germany
- Department of Informatics, Technical University Munich, Boltzmannstrasse 3, 85748 Garching, Germany
| | - Heribert Schunkert
- Department of Cardiology, Deutsches Herzzentrum München, Technical University Munich, Lazarettstrasse 36, 80636 Munich, Germany
- Deutsches Zentrum für Herz- und Kreislaufforschung (DZHK), Munich Heart Alliance, Biedersteiner Strasse 29, 80802 Munich, Germany
| |
|
39
|
González-López NM, Huertas-Ortiz KA, Leguizamon-Guerrero JE, Arias-Cortés MM, Tere-Peña CP, García-Castañeda JE, Rivera-Monroy ZJ. Omics in the detection and identification of biosynthetic pathways related to mycotoxin synthesis. ANALYTICAL METHODS : ADVANCING METHODS AND APPLICATIONS 2021; 13:4038-4054. [PMID: 34486583 DOI: 10.1039/d1ay01017d] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 06/13/2023]
Abstract
Mycotoxins are secondary metabolites that are known to be toxic to humans and animals. On the other hand, some mycotoxins and their analogues possess antioxidant as well as antitumor properties, which could be relevant in the fields of pharmaceutical analysis and food research. Omics techniques are a group of analytical tools applied in the biological sciences in order to study genes (genomics), mRNA (transcriptomics), proteins (proteomics), and metabolites (metabolomics). Omics have become a vital tool in the field of mycotoxins, especially contributing to the identification of biomarkers with potential use for the detection of mycotoxigenic species and the gathering of information about the biosynthetic pathways of mycotoxins in different environments. This approach has provided tools for the development of prevention strategies and control measures for different mycotoxins. Additionally, research has revealed important information about the impact of global warming and climate change on the prevalence of mycotoxin issues in society. In the context of foodomics, the aim is to apply omics techniques in order to ensure food safety. The objective of the present review is to determine the state of the art regarding the development of analytical techniques based on omics in the identification of biosynthetic pathways related to mycotoxin synthesis.
Affiliation(s)
| | - Kevin Andrey Huertas-Ortiz
- Facultad de Ciencias, Universidad Nacional de Colombia, Carrera 45 No 26-85, Building 450, Bogotá, Colombia.
| | | | | | | | | | - Zuly Jenny Rivera-Monroy
- Facultad de Ciencias, Universidad Nacional de Colombia, Carrera 45 No 26-85, Building 450, Bogotá, Colombia.
| |
|
40
|
Correa Rojo A, Heylen D, Aerts J, Thas O, Hooyberghs J, Ertaylan G, Valkenborg D. Towards Building a Quantitative Proteomics Toolbox in Precision Medicine: A Mini-Review. Front Physiol 2021; 12:723510. [PMID: 34512391 PMCID: PMC8427610 DOI: 10.3389/fphys.2021.723510] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 06/10/2021] [Accepted: 08/05/2021] [Indexed: 12/26/2022] Open
Abstract
Precision medicine as a framework for disease diagnosis, treatment, and prevention at the molecular level has entered clinical practice. From the start, genetics has been an indispensable tool to understand and stratify the biology of chronic and complex diseases in precision medicine. However, with the advances in biomedical and omics technologies, quantitative proteomics is emerging as a powerful technology complementing genetics. Quantitative proteomics provides insight into the dynamic behaviour of proteins, as they represent intermediate phenotypes. They provide direct biological insights into physiological patterns, while genetics accounts for baseline characteristics. Additionally, quantitative proteomics opens a wide range of applications in clinical diagnostics, treatment stratification, and drug discovery. In this mini-review, we discuss the current status of quantitative proteomics in precision medicine, including the available technologies and common methods to analyze quantitative proteomics data. Furthermore, we highlight the current challenges in bringing quantitative proteomics into clinical settings and provide a perspective on integrating proteomics data with genomics data for future applications in precision medicine.
Affiliation(s)
- Alejandro Correa Rojo
- Data Science Institute, Interuniversity Institute for Biostatistics and Statistical Bioinformatics (I-BioStat), Hasselt University, Diepenbeek, Belgium; Flemish Institute for Technological Research (VITO), Mol, Belgium
| | - Dries Heylen
- Data Science Institute, Interuniversity Institute for Biostatistics and Statistical Bioinformatics (I-BioStat), Hasselt University, Diepenbeek, Belgium; Flemish Institute for Technological Research (VITO), Mol, Belgium
| | - Jan Aerts
- Data Science Institute, Interuniversity Institute for Biostatistics and Statistical Bioinformatics (I-BioStat), Hasselt University, Diepenbeek, Belgium
| | - Olivier Thas
- Data Science Institute, Interuniversity Institute for Biostatistics and Statistical Bioinformatics (I-BioStat), Hasselt University, Diepenbeek, Belgium; Department of Applied Mathematics, Computer Science and Statistics, Faculty of Sciences, Ghent University, Ghent, Belgium; National Institute for Applied Statistics Research Australia (NIASRA), Wollongong, NSW, Australia
| | - Jef Hooyberghs
- Flemish Institute for Technological Research (VITO), Mol, Belgium; Theoretical Physics, Data Science Institute, Hasselt University, Diepenbeek, Belgium
| | - Gökhan Ertaylan
- Flemish Institute for Technological Research (VITO), Mol, Belgium
| | - Dirk Valkenborg
- Data Science Institute, Interuniversity Institute for Biostatistics and Statistical Bioinformatics (I-BioStat), Hasselt University, Diepenbeek, Belgium
| |
|
41
|
Dorantes-Gilardi R, García-Cortés D, Hernández-Lemus E, Espinal-Enríquez J. k-core genes underpin structural features of breast cancer. Sci Rep 2021; 11:16284. [PMID: 34381069 PMCID: PMC8358063 DOI: 10.1038/s41598-021-95313-y] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 02/19/2021] [Accepted: 07/12/2021] [Indexed: 02/07/2023] Open
Abstract
Gene co-expression networks (GCNs) have been developed as relevant analytical tools for the study of the gene expression patterns behind complex phenotypes. Determining the association between structure and function in GCNs is a current challenge in biomedical research. Several structural differences between GCNs of breast cancer and healthy phenotypes have been reported. In a previous study, using co-expression multilayer networks, we showed that there are abrupt differences in the connectivity patterns of the GCN of basal-like breast cancer between top co-expressed gene pairs and the remaining gene pairs. Here, we compared the top-100,000-interaction networks for the four breast cancer phenotypes (Luminal-A, Luminal-B, Her2+ and Basal) in terms of structural properties. For this purpose, we used the graph-theoretical k-core of a network (maximal sub-network with nodes of degree at least k). We developed a comprehensive analysis of the network k-core ([Formula: see text]) structures in cancer and their relationship with biological functions. We found that in the top-100,000-edge networks, the majority of interactions in breast cancer networks are intra-chromosome, while inter-chromosome interactions serve as connecting bridges between clusters. Moreover, core genes in the healthy network are strongly associated with processes such as metabolism and cell cycle. In breast cancer, only the core of Luminal A is related to those processes, and genes in its core are over-expressed. The intersection of the core nodes in all subtypes of cancer is composed only of genes in the chr8q24.3 region. This region has previously been observed to be highly amplified in several cancers, and its appearance in the intersection of the four breast cancer k-cores may suggest that local co-expression is a conserved phenomenon in cancer. 
Considering the many intricacies associated with these phenomena and the vast amount of research on epigenomic regulation currently underway, there is a need for further research on the epigenomic effects on the structure and function of gene co-expression networks in cancer.
Affiliation(s)
- Rodrigo Dorantes-Gilardi
- Network Science Institute and Department of Physics, Northeastern University, Boston, MA 02115 USA; El Colegio de México, Tlalpan, Mexico City, 14110 Mexico; Computational Genomics Division, National Institute of Genomic Medicine (INMEGEN), Mexico City, 14610 Mexico
| | - Diana García-Cortés
- Computational Genomics Division, National Institute of Genomic Medicine (INMEGEN), Mexico City, 14610 Mexico
| | - Enrique Hernández-Lemus
- Computational Genomics Division, National Institute of Genomic Medicine (INMEGEN), Mexico City, 14610 Mexico; Centro de Ciencias de la Complejidad, Universidad Nacional Autónoma de México (UNAM), Mexico City, 04510 Mexico
| | - Jesús Espinal-Enríquez
- Computational Genomics Division, National Institute of Genomic Medicine (INMEGEN), Mexico City, 14610 Mexico; Centro de Ciencias de la Complejidad, Universidad Nacional Autónoma de México (UNAM), Mexico City, 04510 Mexico
| |
|
42
|
Pan Y, Lei X, Zhang Y. Association predictions of genomics, proteinomics, transcriptomics, microbiome, metabolomics, pathomics, radiomics, drug, symptoms, environment factor, and disease networks: A comprehensive approach. Med Res Rev 2021; 42:441-461. [PMID: 34346083 DOI: 10.1002/med.21847] [Citation(s) in RCA: 30] [Impact Index Per Article: 10.0] [Received: 12/26/2020] [Revised: 05/22/2021] [Accepted: 07/07/2021] [Indexed: 12/12/2022]
Abstract
Currently, multi-omics research, spanning genomics, proteinomics, transcriptomics, microbiome, metabolomics, pathomics, and radiomics, is a hot spot. The relationship between multi-omics data, drugs, and diseases has received extensive attention from researchers. At the same time, multi-omics can effectively predict the diagnosis, prognosis, and treatment of diseases. In essence, these research entities, such as genes, RNAs, proteins, microbes, metabolites, pathways as well as pathological and medical imaging data, can all be represented by networks at different levels. Some computer science and biology scholars have tried to use computational methods to explore the potential relationships between biological entities. We summarize a comprehensive research strategy: build a multi-omics heterogeneous network covering multimodal data, and use currently popular computational methods to make predictions. In this study, we first introduce methods for calculating the similarity of biological entities at the data level, and second discuss multimodal data fusion and methods of feature extraction. Finally, the challenges and opportunities at this stage are summarized. Some scholars have used such a framework to calculate and predict. We also summarize them and discuss the challenges. We hope that our review can help scholars who are interested in the fields of bioinformatics, biomedical imaging, and computer research.
Affiliation(s)
- Yi Pan
- Faculty of Computer Science and Control Engineering, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
| | - Xiujuan Lei
- School of Computer Science, Shaanxi Normal University, Xi'an, China
| | - Yuchen Zhang
- School of Computer Science, Shaanxi Normal University, Xi'an, China
| |
|
43
|
Ershov P, Kaluzhskiy L, Mezentsev Y, Yablokov E, Gnedenko O, Ivanov A. Enzymes in the Cholesterol Synthesis Pathway: Interactomics in the Cancer Context. Biomedicines 2021; 9:895. [PMID: 34440098 PMCID: PMC8389681 DOI: 10.3390/biomedicines9080895] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 07/01/2021] [Revised: 07/20/2021] [Accepted: 07/22/2021] [Indexed: 02/06/2023] Open
Abstract
A global protein interactome ensures the maintenance of regulatory, signaling and structural processes in cells, but at the same time, aberrations in the repertoire of protein-protein interactions usually cause disease onset. Many metabolic enzymes catalyze the multistage transformation of cholesterol precursors in the cholesterol biosynthesis pathway. Cancer-associated deregulation of these enzymes through various molecular mechanisms results in pathological accumulation of cholesterol (or its precursors), which can be a disease risk factor. This work is aimed at the systematization and bioinformatic analysis of the available interactomics data on seventeen enzymes in the cholesterol pathway, encoded by the HMGCR, MVK, PMVK, MVD, FDPS, FDFT1, SQLE, LSS, DHCR24, CYP51A1, TM7SF2, MSMO1, NSDHL, HSD17B7, EBP, SC5D, and DHCR7 genes. A spectrum of 165 unique and 21 common protein partners that physically interact with the target enzymes was selected from several interactomic resources. Among them were 47 modifying proteins from different protein kinase/phosphatase and ubiquitin-protein ligase/deubiquitinase families. A literature search, enrichment and gene co-expression analysis showed that about a quarter of the identified protein partners were associated with cancer hallmarks and over-represented in cancer pathways. Our results allow us to update the current fundamental view of protein-protein interactions and regulatory aspects of the cholesterol synthesis enzymes and to annotate their sub-interactomes in terms of possible involvement in cancers, which will contribute to the prioritization of protein targets for future drug development.
|
44
|
Gallego-Paüls M, Hernández-Ferrer C, Bustamante M, Basagaña X, Barrera-Gómez J, Lau CHE, Siskos AP, Vives-Usano M, Ruiz-Arenas C, Wright J, Slama R, Heude B, Casas M, Grazuleviciene R, Chatzi L, Borràs E, Sabidó E, Carracedo Á, Estivill X, Urquiza J, Coen M, Keun HC, González JR, Vrijheid M, Maitre L. Variability of multi-omics profiles in a population-based child cohort. BMC Med 2021; 19:166. [PMID: 34289836 PMCID: PMC8296694 DOI: 10.1186/s12916-021-02027-z] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 03/30/2021] [Accepted: 06/08/2021] [Indexed: 12/17/2022] Open
Abstract
BACKGROUND Multiple omics technologies are increasingly applied to detect early, subtle molecular responses to environmental stressors for future disease risk prevention. However, there is an urgent need for further evaluation of stability and variability of omics profiles in healthy individuals, especially during childhood. METHODS We aimed to estimate intra-, inter-individual and cohort variability of multi-omics profiles (blood DNA methylation, gene expression, miRNA, proteins and serum and urine metabolites) measured 6 months apart in 156 healthy children from five European countries. We further performed a multi-omics network analysis to establish clusters of co-varying omics features and assessed the contribution of key variables (including biological traits and sample collection parameters) to omics variability. RESULTS All omics displayed a large range of intra- and inter-individual variability depending on each omics feature, although all presented a highest median intra-individual variability. DNA methylation was the most stable profile (median 37.6% inter-individual variability) while gene expression was the least stable (6.6%). Among the least stable features, we identified 1% cross-omics co-variation between CpGs and metabolites (e.g. glucose and CpGs related to obesity and type 2 diabetes). Explanatory variables, including age and body mass index (BMI), explained up to 9% of serum metabolite variability. CONCLUSIONS Methylation and targeted serum metabolomics are the most reliable omics to implement in single time-point measurements in large cross-sectional studies. In the case of metabolomics, sample collection and individual traits (e.g. BMI) are important parameters to control for improved comparability, at the study design or analysis stage. This study will be valuable for the design and interpretation of epidemiological studies that aim to link omics signatures to disease, environmental exposures, or both.
Affiliation(s)
- Marta Gallego-Paüls
- ISGlobal, Barcelona, Spain
- Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Consorcio de Investigacion Biomedica en Red de Epidemiologia y Salud Publica (CIBERESP), Madrid, Spain
| | - Carles Hernández-Ferrer
- ISGlobal, Barcelona, Spain
- Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Consorcio de Investigacion Biomedica en Red de Epidemiologia y Salud Publica (CIBERESP), Madrid, Spain
| | - Mariona Bustamante
- ISGlobal, Barcelona, Spain
- Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Consorcio de Investigacion Biomedica en Red de Epidemiologia y Salud Publica (CIBERESP), Madrid, Spain
- Center for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Barcelona, Spain
| | - Xavier Basagaña
- ISGlobal, Barcelona, Spain
- Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Consorcio de Investigacion Biomedica en Red de Epidemiologia y Salud Publica (CIBERESP), Madrid, Spain
| | - Jose Barrera-Gómez
- ISGlobal, Barcelona, Spain
- Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Consorcio de Investigacion Biomedica en Red de Epidemiologia y Salud Publica (CIBERESP), Madrid, Spain
| | - Chung-Ho E Lau
- MRC Centre for Environment and Health, School of Public Health, Imperial College London, London, UK
- Division of Systems Medicine, Department of Metabolism, Digestion and Reproduction, Imperial College London, South Kensington, London, UK
| | - Alexandros P Siskos
- Cancer Metabolism & Systems Toxicology Group, Division of Cancer, Department of Surgery & Cancer and Division of Systems Medicine, Department of Metabolism, Digestion & Reproduction, Imperial College London, London, UK
| | - Marta Vives-Usano
- ISGlobal, Barcelona, Spain
- Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Consorcio de Investigacion Biomedica en Red de Epidemiologia y Salud Publica (CIBERESP), Madrid, Spain
- Center for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Barcelona, Spain
| | - Carlos Ruiz-Arenas
- ISGlobal, Barcelona, Spain
- Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Consorcio de Investigacion Biomedica en Red de Epidemiologia y Salud Publica (CIBERESP), Madrid, Spain
| | - John Wright
- Bradford Institute for Health Research, Bradford Teaching Hospitals NHS Foundation Trust, Bradford, UK
| | - Remy Slama
- Team of Environmental Epidemiology applied to Reproduction and Respiratory Health, Institute for Advanced Biosciences (IAB), Inserm, CNRS, Université Grenoble Alpes, Grenoble, France
| | - Barbara Heude
- Université de Paris, Centre for Research in Epidemiology and Statistics (CRESS), INSERM, INRAE, F-75004, Paris, France
| | - Maribel Casas
- ISGlobal, Barcelona, Spain
- Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Consorcio de Investigacion Biomedica en Red de Epidemiologia y Salud Publica (CIBERESP), Madrid, Spain
| | | | - Leda Chatzi
- Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
| | - Eva Borràs
- Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Center for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Barcelona, Spain
| | - Eduard Sabidó
- Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Center for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Barcelona, Spain
| | - Ángel Carracedo
- Medicine Genomics Group, Centro de Investigación Biomédica en Red Enfermedades Raras (CIBERER), University of Santiago de Compostela, CEGEN-PRB3, Santiago de Compostela, Spain
- Galician Foundation of Genomic Medicine, Instituto de Investigación Sanitaria de Santiago de Compostela (IDIS), Servicio Gallego de Salud (SERGAS), Santiago de Compostela, Galicia, Spain
| | - Xavier Estivill
- Center for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Barcelona, Spain
| | - Jose Urquiza
- ISGlobal, Barcelona, Spain
- Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Consorcio de Investigacion Biomedica en Red de Epidemiologia y Salud Publica (CIBERESP), Madrid, Spain
| | - Muireann Coen
- Division of Systems Medicine, Department of Metabolism, Digestion and Reproduction, Imperial College London, South Kensington, London, UK
- Oncology Safety, Clinical Pharmacology and Safety Sciences, R&D, AstraZeneca, Cambridge, UK
| | - Hector C Keun
- Cancer Metabolism & Systems Toxicology Group, Division of Cancer, Department of Surgery & Cancer and Division of Systems Medicine, Department of Metabolism, Digestion & Reproduction, Imperial College London, London, UK
| | - Juan R González
- ISGlobal, Barcelona, Spain
- Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Consorcio de Investigacion Biomedica en Red de Epidemiologia y Salud Publica (CIBERESP), Madrid, Spain
| | - Martine Vrijheid
- ISGlobal, Barcelona, Spain
- Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Consorcio de Investigacion Biomedica en Red de Epidemiologia y Salud Publica (CIBERESP), Madrid, Spain
| | - Léa Maitre
- ISGlobal, Barcelona, Spain.
- Universitat Pompeu Fabra (UPF), Barcelona, Spain.
- Consorcio de Investigacion Biomedica en Red de Epidemiologia y Salud Publica (CIBERESP), Madrid, Spain.
| |
|
45
|
Kosvyra A, Ntzioni E, Chouvarda I. Network analysis with biological data of cancer patients: A scoping review. J Biomed Inform 2021; 120:103873. [PMID: 34298154 DOI: 10.1016/j.jbi.2021.103873] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/08/2020] [Revised: 06/30/2021] [Accepted: 07/18/2021] [Indexed: 12/25/2022]
Abstract
BACKGROUND & OBJECTIVE Network Analysis (NA) is a mathematical method that allows exploring relations between units and representing them as a graph. Although NA was initially associated with the social sciences, in the past two decades it has been introduced into bioinformatics. The recent growth in the use of networks in biological data analysis reveals the need to further investigate this area. In this work, we attempt to identify the use of NA with biological data, and specifically: (a) what types of data are used and whether they are integrated or not, (b) what the purpose of this analysis is, predictive or descriptive, and (c) the outcome of such analyses, specifically in cancer. METHODS & MATERIALS The literature review was conducted on two databases, PubMed & IEEE, and was restricted to journal articles of the last decade (January 2010 - December 2019). At a first level, all articles were screened by title and abstract, and at a second level the screening was conducted by reading the full-text article, following the predefined inclusion & exclusion criteria, leading to 131 articles of interest. A table was created with the information of interest and was used for the classification of the articles. The articles were initially classified into analysis studies and studies that propose a new algorithm or methodology. Each of these categories was further screened by the following clustering criteria: (a) data used, (b) study purpose, (c) study outcome. Specifically for the studies proposing a new algorithm, the novelty presented in each one was identified. RESULTS & CONCLUSIONS In the past five years, researchers have been focusing on creating new algorithms and methodologies to enhance this field. The articles' classification revealed that only 25% of the analyses integrate multi-omics data, although 50% of the new algorithms developed follow this integrative direction. Moreover, only 20% of the analyses and 10% of the newly developed methodologies have a predictive purpose. 
Regarding the results of the works reviewed, 75% of the studies focus on identifying gene signatures, prognostic or not. In conclusion, this review revealed the need for deploying predictive and multi-omics integrative algorithms and methodologies that can be used to enhance cancer diagnosis, prognosis and treatment.
Affiliation(s)
- A Kosvyra
- Laboratory of Computing, Medical Informatics and Biomedical Imaging Technologies, School of Medicine, Aristotle University of Thessaloniki, Thessaloniki, Greece.
| | - E Ntzioni
- Laboratory of Computing, Medical Informatics and Biomedical Imaging Technologies, School of Medicine, Aristotle University of Thessaloniki, Thessaloniki, Greece
| | - I Chouvarda
- Laboratory of Computing, Medical Informatics and Biomedical Imaging Technologies, School of Medicine, Aristotle University of Thessaloniki, Thessaloniki, Greece
| |
|
46
|
Paredes O, Morales JA, Mendizabal AP, Romo-Vázquez R. Metacode: One code to rule them all. Biosystems 2021; 208:104486. [PMID: 34274462 DOI: 10.1016/j.biosystems.2021.104486] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 05/18/2021] [Revised: 07/07/2021] [Accepted: 07/09/2021] [Indexed: 12/13/2022]
Abstract
The code of codes or metacode is a microcosm where biological layers, as well as their codes, interact together allowing the continuity of information flow in organisms by increasing biological entities' complexity. Through this novel organic code, biological systems scale towards niches with higher informatic freedom building structures that increase the entropy in the universe. Code biology has developed a novel informational framework where biological entities strive themselves through the information flow carried out through organic codes consisting of two molecular or functional landscapes intertwined through arbitrary linkages via an adaptor whose nature is autonomous from molecular determinism. Here we will integrate genomic and epigenomic codes according to the evidence released in ENCODE (phase 3), psychENCODE and GTEx project, outlining the principles of the metacode, to address the continuous nature of biological systems and their inter-layered information flow. This novel complex metacode maps from very constrained sets of elements (i.e., regulation sites modulating gene expression) to new ones with greater freedom of decoding (i.e., a continuous cell phenotypic space). This leads to a new domain in code biology where biological systems are informatic attractors that navigate an energy metaspace through a complexity-noise balance, stalling in emergent niches where organic codes take meaning.
Affiliation(s)
- Omar Paredes
- Computer Sciences Department, CUCEI, Universidad de Guadalajara, Mexico
| | | | - Adriana P Mendizabal
- Molecular Biology Laboratory, Farmacobiology Department, CUCEI, Universidad de Guadalajara, Mexico
| | | |
|
47
|
Klaus VS, Schriever SC, Monroy Kuhn JM, Peter A, Irmler M, Tokarz J, Prehn C, Kastenmüller G, Beckers J, Adamski J, Königsrainer A, Müller TD, Heni M, Tschöp MH, Pfluger PT, Lutter D. Correlation guided Network Integration (CoNI) reveals novel genes affecting hepatic metabolism. Mol Metab 2021; 53:101295. [PMID: 34271221 PMCID: PMC8361260 DOI: 10.1016/j.molmet.2021.101295] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 02/09/2021] [Revised: 06/24/2021] [Accepted: 07/09/2021] [Indexed: 11/19/2022] Open
Abstract
Objective Technological advances have brought a steady increase in the availability of various types of omics data, from genomics to metabolomics. Integrating these multi-omics data is a chance and challenge for systems biology; yet, tools to fully tap their potential remain scarce. Methods We present here a fully unsupervised and versatile correlation-based method – termed Correlation guided Network Integration (CoNI) – to integrate multi-omics data into a hypergraph structure that allows for the identification of effective modulators of metabolism. Our approach yields single transcripts of potential relevance that map to specific, densely connected, metabolic subgraphs or pathways. Results By applying our method on transcriptomics and metabolomics data from murine livers under standard Chow or high-fat diet, we identified eleven genes with potential regulatory effects on hepatic metabolism. Five candidates, including the hepatokine INHBE, were validated in human liver biopsies to correlate with diabetes-related traits such as overweight, hepatic fat content, and insulin resistance (HOMA-IR). Conclusion Our method's successful application to an independent omics dataset confirmed that the novel CoNI framework is a transferable, entirely data-driven, flexible, and versatile tool for multiple omics data integration and interpretation.
Collapse
Affiliation(s)
- Valentina S Klaus
- Computational Discovery Research Unit, Institute for Diabetes and Obesity, Helmholtz Zentrum München, Neuherberg, Germany; TUM School of Medicine, Neurobiology of Diabetes, Technical University Munich, Germany; German Center for Diabetes Research (DZD), Neuherberg, Germany; Institute for Diabetes and Obesity, Helmholtz Diabetes Center at Helmholtz Zentrum München, Germany
- Sonja C Schriever
- German Center for Diabetes Research (DZD), Neuherberg, Germany; Institute for Diabetes and Obesity, Helmholtz Diabetes Center at Helmholtz Zentrum München, Germany; Research Unit Neurobiology of Diabetes, Helmholtz Zentrum München, Neuherberg, Germany
- José Manuel Monroy Kuhn
- Computational Discovery Research Unit, Institute for Diabetes and Obesity, Helmholtz Zentrum München, Neuherberg, Germany; German Center for Diabetes Research (DZD), Neuherberg, Germany; Institute for Diabetes and Obesity, Helmholtz Diabetes Center at Helmholtz Zentrum München, Germany
- Andreas Peter
- German Center for Diabetes Research (DZD), Neuherberg, Germany; Institute for Diabetes Research and Metabolic Diseases of the Helmholtz Center Munich at the University of Tübingen, Tübingen, Germany; Institute for Clinical Chemistry and Pathobiochemistry, University Hospital Tübingen, Germany
- Martin Irmler
- Institute of Experimental Genetics, Helmholtz Zentrum München, Neuherberg, Germany
- Janina Tokarz
- Research Unit Molecular Endocrinology and Metabolism, Helmholtz Zentrum München, Neuherberg, Germany
- Cornelia Prehn
- Research Unit Molecular Endocrinology and Metabolism, Helmholtz Zentrum München, Neuherberg, Germany
- Gabi Kastenmüller
- German Center for Diabetes Research (DZD), Neuherberg, Germany; Institute of Computational Biology, Helmholtz Zentrum München, German Research Center for Environmental Health, Neuherberg, Germany
- Johannes Beckers
- German Center for Diabetes Research (DZD), Neuherberg, Germany; Institute of Experimental Genetics, Helmholtz Zentrum München, Neuherberg, Germany; Chair of Experimental Genetics, Center of Life and Food Sciences Weihenstephan, Technische Universität München, Neuherberg, Germany
- Jerzy Adamski
- Research Unit Molecular Endocrinology and Metabolism, Helmholtz Zentrum München, Neuherberg, Germany; Department of Biochemistry, Yong Loo Lin School of Medicine, National University of Singapore, Singapore; Chair of Experimental Genetics, Center of Life and Food Sciences Weihenstephan, Technische Universität München, Neuherberg, Germany
- Alfred Königsrainer
- Department of General, Visceral and Transplant Surgery, University Hospital Tübingen, Germany
- Timo D Müller
- German Center for Diabetes Research (DZD), Neuherberg, Germany; Institute for Diabetes and Obesity, Helmholtz Diabetes Center at Helmholtz Zentrum München, Germany; Department of Pharmacology and Experimental Therapy, Institute of Experimental and Clinical Pharmacology and Toxicology, Eberhard Karls University Hospitals and Clinics, Tübingen, Germany
- Martin Heni
- German Center for Diabetes Research (DZD), Neuherberg, Germany; Institute for Diabetes Research and Metabolic Diseases of the Helmholtz Center Munich at the University of Tübingen, Tübingen, Germany; Department of Internal Medicine, Division of Endocrinology, Diabetology, and Nephrology, Eberhard Karls University Tübingen, Tübingen, Germany
- Matthias H Tschöp
- TUM School of Medicine, Neurobiology of Diabetes, Technical University Munich, Germany; German Center for Diabetes Research (DZD), Neuherberg, Germany; Institute for Diabetes and Obesity, Helmholtz Diabetes Center at Helmholtz Zentrum München, Germany; Division of Metabolic Diseases, Department of Medicine, Technical University Munich, Munich, Germany
- Paul T Pfluger
- TUM School of Medicine, Neurobiology of Diabetes, Technical University Munich, Germany; German Center for Diabetes Research (DZD), Neuherberg, Germany; Institute for Diabetes and Obesity, Helmholtz Diabetes Center at Helmholtz Zentrum München, Germany; Research Unit Neurobiology of Diabetes, Helmholtz Zentrum München, Neuherberg, Germany
- Dominik Lutter
- Computational Discovery Research Unit, Institute for Diabetes and Obesity, Helmholtz Zentrum München, Neuherberg, Germany; German Center for Diabetes Research (DZD), Neuherberg, Germany; Institute for Diabetes and Obesity, Helmholtz Diabetes Center at Helmholtz Zentrum München, Germany.
48
Bardozzo F, Lió P, Tagliaferri R. Signal metrics analysis of oscillatory patterns in bacterial multi-omic networks. Bioinformatics 2021; 37:1411-1419. [PMID: 33185666 DOI: 10.1093/bioinformatics/btaa966] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/16/2020] [Revised: 09/25/2020] [Accepted: 11/03/2020] [Indexed: 11/14/2022]
Abstract
MOTIVATION One of the branches of Systems Biology is focused on a deep understanding of underlying regulatory networks through the analysis of biomolecular oscillations and their interplay. Synthetic Biology exploits gene and/or protein regulatory networks towards the design of oscillatory networks for producing useful compounds. Therefore, at different levels of application and for different purposes, the study of biomolecular oscillations can lead to different clues about the mechanisms underlying living cells. It is known that network-level interactions involve more than one type of biomolecule as well as biological processes operating at multiple omic levels. By combining network/pathway-level information with genetic information, it is possible to describe well-understood or unknown bacterial mechanisms and organism-specific dynamics. RESULTS Following the methodologies used in signal processing and communication engineering, a methodology is introduced to identify and quantify the extent of multi-omic oscillations. These are due to the process of multi-omic integration and depend on the gene positions on the chromosome. Ad hoc signal metrics are designed to allow further biotechnological explanations and provide important clues about the oscillatory nature of the pathways and their regulatory circuits. Our algorithms designed for the analysis of multi-omic signals are tested and validated on 11 different bacteria for thousands of multi-omic signals perturbed at the network level by different experimental conditions. Information on the order of genes, codon usage, gene expression and protein molecular weight is integrated at three different functional levels. Oscillations show interesting evidence that network-level multi-omic signals present a synchronized response to perturbations and evolutionary relations along taxa.
AVAILABILITY AND IMPLEMENTATION The algorithms, the code (in R), the tool, the pipeline and the whole dataset of multi-omic signal metrics are available at: https://github.com/lodeguns/Multi-omicSignals. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Pietro Lió
- Computer Laboratory, University of Cambridge, Cambridge CB3 0FD, UK
49
Park Y, Heider D, Hauschild AC. Integrative Analysis of Next-Generation Sequencing for Next-Generation Cancer Research toward Artificial Intelligence. Cancers (Basel) 2021; 13:3148. [PMID: 34202427 PMCID: PMC8269018 DOI: 10.3390/cancers13133148] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 05/20/2021] [Revised: 06/16/2021] [Accepted: 06/21/2021] [Indexed: 12/18/2022]
Abstract
The rapid improvement of next-generation sequencing (NGS) technologies and their application in large-scale cohorts in cancer research led to common challenges of big data. It opened a new research area incorporating systems biology and machine learning. As large-scale NGS data accumulated, sophisticated data analysis methods became indispensable. In addition, NGS data have been integrated with systems biology to build better predictive models to determine the characteristics of tumors and tumor subtypes. Therefore, various machine learning algorithms were introduced to identify underlying biological mechanisms. In this work, we review novel technologies developed for NGS data analysis, and we describe how these computational methodologies integrate systems biology and omics data. Subsequently, we discuss how deep neural networks outperform other approaches, the potential of graph neural networks (GNN) in systems biology, and the limitations in NGS biomedical research. To reflect on the various challenges and corresponding computational solutions, we will discuss the following three topics: (i) molecular characteristics, (ii) tumor heterogeneity, and (iii) drug discovery. We conclude that machine learning and network-based approaches can add valuable insights and build highly accurate models. However, a well-informed choice of learning algorithm and biological network information is crucial for the success of each specific research question.
Affiliation(s)
- Youngjun Park
- Department of Mathematics and Computer Science, Philipps-University of Marburg, 35032 Marburg, Germany; (Y.P.); (D.H.)
- Dominik Heider
- Department of Mathematics and Computer Science, Philipps-University of Marburg, 35032 Marburg, Germany; (Y.P.); (D.H.)
- Anne-Christin Hauschild
- Department of Mathematics and Computer Science, Philipps-University of Marburg, 35032 Marburg, Germany; (Y.P.); (D.H.)
- Department of Medical Informatics, University Medical Center Göttingen, 37075 Göttingen, Germany
50
Picard M, Scott-Boyer MP, Bodein A, Périn O, Droit A. Integration strategies of multi-omics data for machine learning analysis. Comput Struct Biotechnol J 2021; 19:3735-3746. [PMID: 34285775 PMCID: PMC8258788 DOI: 10.1016/j.csbj.2021.06.030] [Citation(s) in RCA: 172] [Impact Index Per Article: 57.3] [Received: 03/30/2021] [Revised: 06/17/2021] [Accepted: 06/21/2021] [Indexed: 12/25/2022]
Abstract
Increased availability of high-throughput technologies has generated an ever-growing number of omics data that seek to portray many different but complementary biological layers including genomics, epigenomics, transcriptomics, proteomics, and metabolomics. New insights from these data have been obtained by machine learning algorithms that have produced diagnostic and classification biomarkers. Most biomarkers obtained to date, however, only include one omic measurement at a time and thus do not take full advantage of recent multi-omics experiments that now capture the entire complexity of biological systems. Multi-omics data integration strategies are needed to combine the complementary knowledge brought by each omics layer. We have summarized the most recent data integration methods/frameworks into five different integration strategies: early, mixed, intermediate, late and hierarchical. In this mini-review, we focus on challenges and existing multi-omics integration strategies by paying special attention to machine learning applications.
Affiliation(s)
- Milan Picard
- Molecular Medicine Department, CHU de Québec Research Center, Université Laval, Québec, QC, Canada
- Marie-Pier Scott-Boyer
- Molecular Medicine Department, CHU de Québec Research Center, Université Laval, Québec, QC, Canada
- Antoine Bodein
- Molecular Medicine Department, CHU de Québec Research Center, Université Laval, Québec, QC, Canada
- Olivier Périn
- Digital Sciences Department, L'Oréal Advanced Research, Aulnay-sous-bois, France
- Arnaud Droit
- Molecular Medicine Department, CHU de Québec Research Center, Université Laval, Québec, QC, Canada
- Corresponding author.