1
Bakr S, Brennan K, Mukherjee P, Argemi J, Hernaez M, Gevaert O. Identifying key multifunctional components shared by critical cancer and normal liver pathways via SparseGMM. Cell Reports Methods 2023; 3:100392. [PMID: 36814838 PMCID: PMC9939431 DOI: 10.1016/j.crmeth.2022.100392] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/25/2022] [Revised: 09/16/2022] [Accepted: 12/21/2022] [Indexed: 01/19/2023]
Abstract
Despite the abundance of multimodal data, suitable statistical models that can improve our understanding of diseases with genetic underpinnings are challenging to develop. Here, we present SparseGMM, a statistical approach for gene regulatory network discovery. SparseGMM uses latent variable modeling with sparsity constraints to learn Gaussian mixtures from multiomic data. By combining coexpression patterns with a Bayesian framework, SparseGMM quantitatively measures confidence in regulators and uncertainty in target gene assignment by computing gene entropy. We apply SparseGMM to liver cancer and normal liver tissue data and evaluate discovered gene modules in an independent single-cell RNA sequencing (scRNA-seq) dataset. SparseGMM identifies PROCR as a regulator of angiogenesis and PDCD1LG2 and HNF4A as regulators of immune response and blood coagulation in cancer. Furthermore, we show that more genes have significantly higher entropy in cancer compared with normal liver. Among high-entropy genes are key multifunctional components shared by critical pathways, including p53 and estrogen signaling.
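The abstract does not give the entropy formula, so the following is only a minimal sketch of one common reading: treat each gene's posterior module-membership probabilities from a mixture model as a distribution and compute its Shannon entropy. The function name `assignment_entropy` and the example probabilities are hypothetical, and this is not the authors' implementation.

```python
import math


def assignment_entropy(posteriors):
    """Shannon entropy (in nats) of one gene's module-assignment probabilities.

    `posteriors` is assumed to hold non-negative membership weights for one
    gene across modules; they are renormalized to sum to 1 before use.
    """
    total = sum(posteriors)
    if total <= 0:
        raise ValueError("posteriors must have a positive sum")
    probs = [p / total for p in posteriors]
    # Zero-probability modules contribute nothing (limit of p*log p as p -> 0).
    return -sum(p * math.log(p) for p in probs if p > 0)


# Hypothetical posteriors over three modules for two genes.
confident = assignment_entropy([0.98, 0.01, 0.01])  # almost surely one module
ambiguous = assignment_entropy([0.34, 0.33, 0.33])  # spread across modules
uniform = assignment_entropy([0.25] * 4)            # maximal for four modules
```

Under this reading, a gene assigned almost entirely to one module has entropy near zero, while a gene split evenly across K modules approaches the maximum of log K.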
Affiliation(s)
- Shaimaa Bakr
  - Department of Electrical Engineering, Stanford University, Stanford, CA 94305, USA
  - Stanford Center for Biomedical Informatics Research, Department of Medicine and Biomedical Data Science, Stanford University, Stanford, CA 94305, USA
  - Department of Radiology, Stanford University, Stanford, CA 94305, USA
- Kevin Brennan
  - Stanford Center for Biomedical Informatics Research, Department of Medicine and Biomedical Data Science, Stanford University, Stanford, CA 94305, USA
- Pritam Mukherjee
  - Stanford Center for Biomedical Informatics Research, Department of Medicine and Biomedical Data Science, Stanford University, Stanford, CA 94305, USA
- Josepmaria Argemi
  - Liver Unit, Clinica Universidad de Navarra, Hepatology Program, Center for Applied Medical Research, 31008 Pamplona, Navarra, Spain
- Mikel Hernaez
  - Center for Applied Medical Research, University of Navarra, 31009 Pamplona, Navarra, Spain
- Olivier Gevaert
  - Stanford Center for Biomedical Informatics Research, Department of Medicine and Biomedical Data Science, Stanford University, Stanford, CA 94305, USA
2
Sukhadia SS, Tyagi A, Venkataraman V, Mukherjee P, Prasad P, Gevaert O, Nagaraj SH. ImaGene: a web-based software platform for tumor radiogenomic evaluation and reporting. Bioinformatics Advances 2022; 2:vbac079. [PMID: 36699376 PMCID: PMC9714320 DOI: 10.1093/bioadv/vbac079] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/08/2022] [Revised: 09/26/2022] [Accepted: 11/09/2022] [Indexed: 11/12/2022]
Abstract
Summary: Radiographic imaging techniques provide insight into the imaging features of tumor regions of interest, while immunohistochemistry and sequencing techniques performed on biopsy samples yield omics data. Relationships between tumor genotype and phenotype can be identified from these data through traditional correlation analyses and artificial intelligence (AI) models. However, the radiogenomics community lacks a unified software platform with which to conduct such analyses in a reproducible manner. To address this gap, we developed ImaGene, a web-based platform that takes tumor omics and imaging datasets as inputs, performs correlation analysis between them, and constructs AI models. ImaGene has several modifiable configuration parameters and produces a report displaying model diagnostics. To demonstrate the utility of ImaGene, we utilized data for invasive breast carcinoma (IBC) and head and neck squamous cell carcinoma (HNSCC) and identified potential associations between imaging features and nine genes (WT1, LGI3, SP7, DSG1, ORM1, CLDN10, CST1, SMTNL2, and SLC22A31) for IBC and eight genes (NR0B1, PLA2G2A, MAL, CLDN16, PRDM14, VRTN, LRRN1, and MECOM) for HNSCC. ImaGene has the potential to become a standard platform for radiogenomic tumor analyses due to its ease of use, flexibility, and reproducibility, playing a central role in the establishment of an emerging radiogenomic knowledge base.
Availability and implementation: www.ImaGene.pgxguide.org, https://github.com/skr1/Imagene.git.
Supplementary information: Supplementary data are available at https://github.com/skr1/Imagene.git.
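The correlation step described in the summary can be illustrated with a minimal sketch. It assumes Pearson correlation across matched samples and a fixed reporting threshold; the abstract does not say which correlation measure or cutoff ImaGene uses, and the function names, feature names, gene labels, and values below are all hypothetical toy data, not ImaGene's API or results.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sx * sy)


def correlate_features(imaging, omics, threshold=0.9):
    """Return (imaging_feature, gene, r) for pairs with |r| >= threshold.

    Both inputs map a name to per-sample values, with samples in the same order.
    """
    hits = []
    for feature, f_values in imaging.items():
        for gene, g_values in omics.items():
            r = pearson(f_values, g_values)
            if abs(r) >= threshold:
                hits.append((feature, gene, r))
    return hits


# Hypothetical toy data: four samples, two imaging features, two genes.
imaging = {"f1": [1, 2, 3, 4], "f2": [1, 3, 2, 4]}
omics = {"GENE_A": [2, 4, 6, 8], "GENE_B": [4, 1, 3, 2]}
hits = correlate_features(imaging, omics, threshold=0.9)
```

In practice, testing many feature-gene pairs calls for multiple-testing correction, which this sketch omits.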
Affiliation(s)
- Shrey S Sukhadia
  - Centre for Genomics and Personalised Health, Queensland University of Technology, Brisbane, QLD 4000, Australia; Translational Research Institute, Brisbane, QLD 4000, Australia
- Aayush Tyagi
  - Yardi School of Artificial Intelligence, Indian Institute of Technology, New Delhi 110016, India
- Vivek Venkataraman
  - Centre for Genomics and Personalised Health, Queensland University of Technology, Brisbane, QLD 4000, Australia; Translational Research Institute, Brisbane, QLD 4000, Australia
- Pritam Mukherjee
  - Stanford Center for Biomedical Informatics Research, Department of Medicine and Biomedical Data Science, Stanford University, Stanford, CA 94305-5101, USA
- Pratosh Prasad
  - Department of Electrical Communication Engineering, Indian Institute of Science, Bangalore 560012, India
- Olivier Gevaert
  - Stanford Center for Biomedical Informatics Research, Department of Medicine and Biomedical Data Science, Stanford University, Stanford, CA 94305-5101, USA
- Shivashankar H Nagaraj
  - Centre for Genomics and Personalised Health, Queensland University of Technology, Brisbane, QLD 4000, Australia; Translational Research Institute, Brisbane, QLD 4000, Australia
3
MuSA: a graphical user interface for multi-OMICs data integration in radiogenomic studies. Sci Rep 2021; 11:1550. [PMID: 33452365 PMCID: PMC7811020 DOI: 10.1038/s41598-021-81200-z] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 11/11/2020] [Accepted: 01/04/2021] [Indexed: 12/27/2022]
Abstract
Analysis of large-scale omics data together with biomedical images has gained considerable interest for predicting phenotypic conditions in personalized medicine. Multiple layers of investigation, such as genomics, transcriptomics and proteomics, have led to high dimensionality and heterogeneity of data. Multi-omics data integration can contribute meaningfully to early diagnosis and to an accurate estimate of prognosis and treatment in cancer. Some multi-layer data structures have been developed to integrate multi-omics biological information, but none of these has been developed and evaluated to include radiomic data. We proposed using MultiAssayExperiment (MAE) as an integrated data structure to combine multi-omics data, facilitating the exploration of heterogeneous data. We improved the usability of MAE by developing the Multi-omics Statistical Approaches (MuSA) tool, which uses a Shiny graphical user interface to simplify the management and analysis of radiogenomic datasets. The capabilities of MuSA were shown using public breast cancer datasets from the TCGA-TCIA databases. The MuSA architecture is modular and is divided into pre-processing and downstream analysis sections. The pre-processing section allows data filtering and normalization. The downstream analysis section contains data science modules such as correlation, clustering (i.e., heatmap) and feature selection methods. The results are shown dynamically in MuSA. The MuSA tool provides an easy-to-use way to create, manage and analyze radiogenomic data. The application is specifically designed to guide non-programmer researchers through the different computational steps. Integration analysis is implemented in a modular structure, making MuSA easily extensible open-source software.
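As a rough illustration of the pre-processing stage mentioned in the abstract (data filtering and normalization), the sketch below drops low-variance features and z-scores the rest. MuSA itself is an R/Shiny tool; the abstract does not specify its filters or normalization methods, so the variance cutoff, the z-score choice, the function names, and the toy values here are all assumptions.

```python
from statistics import mean, pstdev, pvariance


def filter_low_variance(features, min_variance):
    """Keep only features whose population variance exceeds min_variance."""
    return {
        name: values
        for name, values in features.items()
        if pvariance(values) > min_variance
    }


def zscore(values):
    """Center values to mean 0 and scale them to population std. dev. 1."""
    mu, sd = mean(values), pstdev(values)
    if sd == 0:
        raise ValueError("cannot z-score a constant feature")
    return [(v - mu) / sd for v in values]


# Hypothetical feature table: name -> per-sample values.
raw = {
    "feature_constant": [5, 5, 5],   # carries no information, filtered out
    "feature_varying": [1, 2, 3],
}
kept = filter_low_variance(raw, min_variance=0.0)
normalized = {name: zscore(values) for name, values in kept.items()}
```

Filtering before normalization avoids dividing by a zero standard deviation for constant features.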
5
Gevaert O, Nabian M, Bakr S, Everaert C, Shinde J, Manukyan A, Liefeld T, Tabor T, Xu J, Lupberger J, Haas BJ, Baumert TF, Hernaez M, Reich M, Quintana FJ, Uhlmann EJ, Krichevsky AM, Mesirov JP, Carey V, Pochet N. Imaging-AMARETTO: An Imaging Genomics Software Tool to Interrogate Multiomics Networks for Relevance to Radiography and Histopathology Imaging Biomarkers of Clinical Outcomes. JCO Clin Cancer Inform 2020; 4:421-435. [PMID: 32383980 PMCID: PMC7265792 DOI: 10.1200/cci.19.00125] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Accepted: 03/16/2020] [Indexed: 12/18/2022]
Abstract
PURPOSE: The availability of increasing volumes of multiomics, imaging, and clinical data in complex diseases such as cancer opens opportunities for the formulation and development of computational imaging genomics methods that can link multiomics, imaging, and clinical data.
METHODS: Here, we present the Imaging-AMARETTO algorithms and software tools to systematically interrogate regulatory networks derived from multiomics data within and across related patient studies for their relevance to radiography and histopathology imaging features predicting clinical outcomes.
RESULTS: To demonstrate its utility, we applied Imaging-AMARETTO to integrate three patient studies of brain tumors, specifically, multiomics with radiography imaging data from The Cancer Genome Atlas (TCGA) glioblastoma multiforme (GBM) and low-grade glioma (LGG) cohorts and transcriptomics with histopathology imaging data from the Ivy Glioblastoma Atlas Project (IvyGAP) GBM cohort. Our results show that Imaging-AMARETTO recapitulates known key drivers of tumor-associated microglia and macrophage mechanisms, mediated by STAT3, AHR, and CCR2, and neurodevelopmental and stemness mechanisms, mediated by OLIG2. Imaging-AMARETTO provides interpretation of their underlying molecular mechanisms in light of imaging biomarkers of clinical outcomes and uncovers novel master drivers, THBS1 and MAP2, that establish relationships across these distinct mechanisms.
CONCLUSION: Our network-based imaging genomics tools serve as hypothesis generators that facilitate the interrogation of known and uncovering of novel hypotheses for follow-up with experimental validation studies. We anticipate that our Imaging-AMARETTO imaging genomics tools will be useful to the community of biomedical researchers for applications to similar studies of cancer and other complex diseases with available multiomics, imaging, and clinical data.
Affiliation(s)
- Olivier Gevaert
  - Stanford Center for Biomedical Informatics Research, Department of Medicine and Biomedical Data Science, Stanford University, Stanford, CA
  - Cell Circuits Program, Broad Institute of MIT and Harvard, Cambridge, MA
- Mohsen Nabian
  - Cell Circuits Program, Broad Institute of MIT and Harvard, Cambridge, MA
  - Ann Romney Center for Neurologic Diseases, Department of Neurology, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA
- Shaimaa Bakr
  - Stanford Center for Biomedical Informatics Research, Department of Medicine and Biomedical Data Science, Stanford University, Stanford, CA
- Celine Everaert
  - Cell Circuits Program, Broad Institute of MIT and Harvard, Cambridge, MA
  - Ann Romney Center for Neurologic Diseases, Department of Neurology, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA
- Jayendra Shinde
  - Stanford Center for Biomedical Informatics Research, Department of Medicine and Biomedical Data Science, Stanford University, Stanford, CA
- Artur Manukyan
  - Cell Circuits Program, Broad Institute of MIT and Harvard, Cambridge, MA
  - Ann Romney Center for Neurologic Diseases, Department of Neurology, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA
- Ted Liefeld
  - Department of Medicine, University of California, San Diego, San Diego, CA
- Thorin Tabor
  - Department of Medicine, University of California, San Diego, San Diego, CA
- Jishu Xu
  - Cell Circuits Program, Broad Institute of MIT and Harvard, Cambridge, MA
  - Rush University Medical Center, Chicago, IL
- Joachim Lupberger
  - INSERM, U1110, Institut de Recherche sur les Maladies Virales et Hépatiques, Université de Strasbourg, Institut Hopitalo-Universitaire, Hôpitaux Universitaires de Strasbourg, Strasbourg, France
- Brian J. Haas
  - Cell Circuits Program, Broad Institute of MIT and Harvard, Cambridge, MA
- Thomas F. Baumert
  - INSERM, U1110, Institut de Recherche sur les Maladies Virales et Hépatiques, Université de Strasbourg, Institut Hopitalo-Universitaire, Hôpitaux Universitaires de Strasbourg, Strasbourg, France
- Mikel Hernaez
  - Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Champaign, IL
- Michael Reich
  - Department of Medicine, University of California, San Diego, San Diego, CA
- Francisco J. Quintana
  - Cell Circuits Program, Broad Institute of MIT and Harvard, Cambridge, MA
  - Ann Romney Center for Neurologic Diseases, Department of Neurology, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA
- Erik J. Uhlmann
  - Ann Romney Center for Neurologic Diseases, Department of Neurology, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA
- Anna M. Krichevsky
  - Ann Romney Center for Neurologic Diseases, Department of Neurology, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA
- Jill P. Mesirov
  - Department of Medicine, University of California, San Diego, San Diego, CA
- Vincent Carey
  - Cell Circuits Program, Broad Institute of MIT and Harvard, Cambridge, MA
  - Channing Division of Network Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA
- Nathalie Pochet
  - Cell Circuits Program, Broad Institute of MIT and Harvard, Cambridge, MA
  - Ann Romney Center for Neurologic Diseases, Department of Neurology, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA