1
Newman NK, Macovsky MS, Rodrigues RR, Bruce AM, Pederson JW, Padiadpu J, Shan J, Williams J, Patil SS, Dzutsev AK, Shulzhenko N, Trinchieri G, Brown K, Morgun A. Transkingdom Network Analysis (TkNA): a systems framework for inferring causal factors underlying host-microbiota and other multi-omic interactions. Nat Protoc 2024; 19:1750-1778. [PMID: 38472495 DOI: 10.1038/s41596-024-00960-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/13/2023] [Accepted: 11/29/2023] [Indexed: 03/14/2024]
Abstract
We present Transkingdom Network Analysis (TkNA), a unique causal-inference analytical framework that offers a holistic view of biological systems by integrating data from multiple cohorts and diverse omics types. TkNA helps to decipher key players and mechanisms governing host-microbiota (or any multi-omic data) interactions in specific conditions or diseases. TkNA reconstructs a network that represents a statistical model capturing the complex relationships between different omics in the biological system. It identifies robust and reproducible patterns of fold change direction and correlation sign across several cohorts to select differential features and their per-group correlations. The framework then uses causality-sensitive metrics, statistical thresholds and topological criteria to determine the final edges forming the transkingdom network. Using the topological features of the resulting network, TkNA identifies nodes controlling a given subnetwork or governing communication between kingdoms and/or subnetworks. Computing the millions of correlations necessary for network reconstruction in TkNA typically takes only a few minutes, depending on the study design. Unlike most other multi-omics approaches that find only associations, TkNA focuses on establishing causality while accounting for the complex structure of multi-omic data. It achieves this without requiring huge sample sizes. Moreover, the TkNA protocol is user friendly, requiring minimal installation and basic familiarity with Unix. Researchers can access the TkNA software at https://github.com/CAnBioNet/TkNA/ .
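The cross-cohort consistency idea in this abstract (keeping only correlations whose sign is reproduced in every cohort) can be illustrated with a minimal sketch. This is not the TkNA implementation: the function name `consistent_sign_pairs`, the input layout and the toy values are assumptions made purely for illustration, and the real protocol also applies statistical thresholds that are omitted here.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def consistent_sign_pairs(cohorts):
    """Return {(f1, f2): sign} for pairs whose correlation sign agrees in all cohorts.

    `cohorts` is a list of dicts mapping feature name -> list of measurements
    (hypothetical layout, one dict per cohort). A correlation of exactly zero
    is treated as negative in this sketch.
    """
    features = sorted(set.intersection(*(set(c) for c in cohorts)))
    kept = {}
    for f1, f2 in combinations(features, 2):
        signs = {1 if pearson(c[f1], c[f2]) > 0 else -1 for c in cohorts}
        if len(signs) == 1:  # same sign in every cohort
            kept[(f1, f2)] = signs.pop()
    return kept


# Toy data: two cohorts, one host gene and two microbial taxa.
cohort_a = {"gene1": [1, 2, 3, 4], "taxon1": [2, 4, 5, 9], "taxon2": [4, 3, 2, 1]}
cohort_b = {"gene1": [2, 3, 5, 7], "taxon1": [1, 2, 2, 6], "taxon2": [1, 2, 3, 5]}
pairs = consistent_sign_pairs([cohort_a, cohort_b])
```

In the toy data only the `gene1`/`taxon1` pair is positively correlated in both cohorts, so it is the only pair retained; the pairs involving `taxon2` flip sign between cohorts and are dropped.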
Affiliation(s)
- Nolan K Newman
  - College of Pharmacy, Oregon State University, Corvallis, OR, USA
- Richard R Rodrigues
  - Basic Science Program, Frederick National Laboratory for Cancer Research, Frederick, MD, USA
  - Microbiome and Genetics Core, Laboratory of Integrative Cancer Immunology, Center for Cancer Research, National Cancer Institute, Bethesda, MD, USA
- Amanda M Bruce
  - College of Pharmacy, Oregon State University, Corvallis, OR, USA
- Jacob W Pederson
  - Carlson College of Veterinary Medicine, Oregon State University, Corvallis, OR, USA
- Jyothi Padiadpu
  - College of Pharmacy, Oregon State University, Corvallis, OR, USA
- Jigui Shan
  - Advanced Biomedical Computational Science, Frederick National Laboratory for Cancer Research, Frederick, MD, USA
- Joshua Williams
  - Advanced Biomedical Computational Science, Frederick National Laboratory for Cancer Research, Frederick, MD, USA
- Sankalp S Patil
  - College of Pharmacy, Oregon State University, Corvallis, OR, USA
- Amiran K Dzutsev
  - Cancer Immunobiology Section, Laboratory of Integrative Cancer Immunology, Center for Cancer Research, National Cancer Institute, Bethesda, MD, USA
- Natalia Shulzhenko
  - Carlson College of Veterinary Medicine, Oregon State University, Corvallis, OR, USA
- Giorgio Trinchieri
  - Cancer Immunobiology Section, Laboratory of Integrative Cancer Immunology, Center for Cancer Research, National Cancer Institute, Bethesda, MD, USA
- Kevin Brown
  - College of Pharmacy, Oregon State University, Corvallis, OR, USA
- Andrey Morgun
  - College of Pharmacy, Oregon State University, Corvallis, OR, USA
2
Naderi Yeganeh P, Teo YY, Karagkouni D, Pita-Juárez Y, Morgan SL, Slack FJ, Vlachos IS, Hide WA. PanomiR: a systems biology framework for analysis of multi-pathway targeting by miRNAs. Brief Bioinform 2023; 24:bbad418. [PMID: 37985452 PMCID: PMC10661971 DOI: 10.1093/bib/bbad418] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/08/2023] [Revised: 10/16/2023] [Accepted: 10/20/2023] [Indexed: 11/22/2023] [Open Access]
Abstract
Charting microRNA (miRNA) regulation across pathways is key to characterizing their function. Yet, no method currently exists that can quantify how miRNAs regulate multiple interconnected pathways or prioritize them for their ability to regulate coordinate transcriptional programs. Existing methods primarily infer one-to-one relationships between miRNAs and pathways using differentially expressed genes. We introduce PanomiR, an in silico framework for studying the interplay of miRNAs and disease functions. PanomiR integrates gene expression, mRNA-miRNA interactions and known biological pathways to reveal coordinated multi-pathway targeting by miRNAs. PanomiR utilizes pathway-activity profiling approaches, a pathway co-expression network and network clustering algorithms to prioritize miRNAs that target broad-scale transcriptional disease phenotypes. It directly resolves differential regulation of pathways, irrespective of their differential gene expression, and captures co-activity to establish functional pathway groupings and the miRNAs that may regulate them. PanomiR uses a systems biology approach to provide broad but precise insights into miRNA-regulated functional programs. It is available at https://bioconductor.org/packages/PanomiR.
Affiliation(s)
- Pourya Naderi Yeganeh
  - Harvard Medical School, Boston, MA, USA
  - Department of Pathology, Beth Israel Deaconess Medical Center, Boston, MA, USA
  - Harvard Medical School Initiative for RNA Medicine, Boston, MA, USA
- Yue Y Teo
  - National University of Singapore, Singapore
  - École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- Dimitra Karagkouni
  - Harvard Medical School, Boston, MA, USA
  - Department of Pathology, Beth Israel Deaconess Medical Center, Boston, MA, USA
  - Harvard Medical School Initiative for RNA Medicine, Boston, MA, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Yered Pita-Juárez
  - Harvard Medical School, Boston, MA, USA
  - Department of Pathology, Beth Israel Deaconess Medical Center, Boston, MA, USA
  - Harvard Medical School Initiative for RNA Medicine, Boston, MA, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Sarah L Morgan
  - Harvard Medical School, Boston, MA, USA
  - Centre for Neuroscience, Surgery and Trauma, Blizard Institute, Queen Mary University of London, London E1 2AT, UK
- Frank J Slack
  - Harvard Medical School, Boston, MA, USA
  - Department of Pathology, Beth Israel Deaconess Medical Center, Boston, MA, USA
  - Harvard Medical School Initiative for RNA Medicine, Boston, MA, USA
- Ioannis S Vlachos
  - Harvard Medical School, Boston, MA, USA
  - Department of Pathology, Beth Israel Deaconess Medical Center, Boston, MA, USA
  - Harvard Medical School Initiative for RNA Medicine, Boston, MA, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Winston A Hide
  - Harvard Medical School, Boston, MA, USA
  - Department of Pathology, Beth Israel Deaconess Medical Center, Boston, MA, USA
  - Harvard Medical School Initiative for RNA Medicine, Boston, MA, USA
3
Wilk G, Braun R. Integrative analysis reveals disrupted pathways regulated by microRNAs in cancer. Nucleic Acids Res 2018; 46:1089-1101. [PMID: 29294105 PMCID: PMC5814839 DOI: 10.1093/nar/gkx1250] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 08/31/2017] [Accepted: 12/01/2017] [Indexed: 02/06/2023] [Open Access]
Abstract
MicroRNAs (miRNAs) are small endogenous regulatory molecules that modulate gene expression post-transcriptionally. Although differential expression of miRNAs has been implicated in many diseases (including cancers), the underlying mechanisms of action remain unclear. Because each miRNA can target multiple genes, miRNAs may potentially have functional implications for the overall behavior of entire pathways. Here, we investigate the functional consequences of miRNA dysregulation through an integrative analysis of miRNA and mRNA expression data using a novel approach that incorporates pathway information a priori. By searching for miRNA-pathway associations that differ between healthy and tumor tissue, we identify specific relationships at the systems level which are disrupted in cancer. Our approach is motivated by the hypothesis that if an miRNA and pathway are associated, then the expression of the miRNA and the collective behavior of the genes in a pathway will be correlated. As such, we first obtain an expression-based summary of pathway activity using Isomap, a dimension reduction method which can articulate non-linear structure in high-dimensional data. We then search for miRNAs that exhibit differential correlations with the pathway summary between phenotypes as a means of finding aberrant miRNA-pathway coregulation in tumors. We apply our method to cancer data using gene and miRNA expression datasets from The Cancer Genome Atlas and compare ∼10^5 miRNA-pathway relationships between healthy and tumor samples from four tissues (breast, prostate, lung and liver). Many of the flagged pairs we identify have a biological basis for disruption in cancer.
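The notion of a "differential correlation" between phenotypes can be made concrete with a short sketch. The abstract does not state which test the authors use, so the Fisher z-transformation below is an assumed, commonly used choice rather than their method; the pathway summary is represented by plain toy numbers instead of an Isomap coordinate, and all names are hypothetical.

```python
from math import atanh, erfc, sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def differential_correlation(r1, n1, r2, n2):
    """Two-sided z-test for a difference between two independent correlations.

    Uses the Fisher transformation z = atanh(r), whose sampling variance is
    approximately 1 / (n - 3). Requires |r| < 1 and n > 3 in both groups.
    """
    z = (atanh(r1) - atanh(r2)) / sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    p = erfc(abs(z) / sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return z, p


# Toy data: one miRNA against one pathway summary, in two phenotypes.
mirna = [1, 2, 3, 4, 5, 6]
pathway_healthy = [1.1, 2.0, 2.9, 4.2, 5.1, 5.8]  # rises with the miRNA
pathway_tumor = [5.9, 5.0, 4.1, 3.0, 2.2, 0.9]    # falls with the miRNA

r_healthy = pearson(mirna, pathway_healthy)
r_tumor = pearson(mirna, pathway_tumor)
z_stat, p_value = differential_correlation(r_healthy, len(mirna), r_tumor, len(mirna))
```

Here the miRNA-pathway correlation is strongly positive in the healthy samples and strongly negative in the tumor samples, so the test flags the pair. With on the order of 10^5 pairs, as in the study, the resulting p-values would also need multiple-testing correction.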
Affiliation(s)
- Gary Wilk
  - Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL 60208, USA
- Rosemary Braun
  - Biostatistics Division, Feinberg School of Medicine, Northwestern University, Chicago, IL 60611, USA
  - Department of Engineering Sciences and Applied Mathematics, Northwestern University, Evanston, IL 60208, USA
4
Identification of pathways associated with chemosensitivity through network embedding. PLoS Comput Biol 2019; 15:e1006864. [PMID: 30893303 PMCID: PMC6443184 DOI: 10.1371/journal.pcbi.1006864] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 08/06/2018] [Revised: 04/01/2019] [Accepted: 02/09/2019] [Indexed: 12/27/2022] [Open Access]
Abstract
Basal gene expression levels have been shown to be predictive of cellular response to cytotoxic treatments. However, such analyses do not fully reveal complex genotype-phenotype relationships, which are partly encoded in highly interconnected molecular networks. Biological pathways provide a complementary way of understanding drug response variation among individuals. In this study, we integrate chemosensitivity data from a large-scale pharmacogenomics study with basal gene expression data from the CCLE project and prior knowledge of molecular networks to identify specific pathways mediating chemical response. We first develop a computational method called PACER, which ranks pathways for enrichment in a given set of genes using a novel network embedding method. It examines a molecular network that encodes known gene-gene as well as gene-pathway relationships, and determines a vector representation of each gene and pathway in the same low-dimensional vector space. The relevance of a pathway to the given gene set is then captured by the similarity between the pathway vector and gene vectors. To apply this approach to chemosensitivity data, we identify genes whose basal expression levels in a panel of cell lines are correlated with cytotoxic response to a compound, and then rank pathways for relevance to these response-correlated genes using PACER. Extensive evaluation of this approach on benchmarks constructed from databases of compound target genes and large collections of drug response signatures demonstrates its advantages in identifying compound-pathway associations compared to existing statistical methods of pathway enrichment analysis. The associations identified by PACER can serve as testable hypotheses on chemosensitivity pathways and help further study the mechanisms of action of specific cytotoxic drugs. More broadly, PACER represents a novel technique of identifying enriched properties of any gene set of interest while also taking into account networks of known gene-gene relationships and interactions.

Gene expression levels have been used to study the cellular response to drug treatments. However, analysis of gene expression without considering gene interactions cannot fully reveal complex genotype-phenotype relationships. Biological pathways reveal the interactions among genes, thus providing a complementary way of understanding the drug response variation among individuals. In this paper, we aim to identify pathways that mediate the chemical response of each drug. We used the recently generated CTRP pharmacogenomics data and CCLE basal expression data to identify these pathways. We showed that using the prior knowledge encoded in molecular networks substantially improves pathway identification. In particular, we integrate genes and pathways into a large heterogeneous network in which links are protein-protein interactions and gene-pathway affiliations. We then project this heterogeneous network onto a low-dimensional space, which enables more precise similarity measurements between pathways and drug-response-correlated genes. Extensive experiments on two benchmarks show that our method substantially improved the pathway identification performance by using the molecular networks. More importantly, our method represents a novel technique of identifying enriched properties of any gene set of interest while also taking into account networks of known gene-gene relationships and interactions.
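The ranking step described in this abstract, scoring each pathway by the similarity between its vector and the vectors of the genes of interest, can be sketched as follows. This is an illustration only and not the PACER code: the use of cosine similarity, the averaging over genes, the function names and the two-dimensional toy embeddings are all assumptions, and the network embedding step that would produce the vectors is not shown.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))


def rank_pathways(pathway_vecs, gene_vecs, gene_set):
    """Rank pathways by mean cosine similarity to the genes in `gene_set`.

    `pathway_vecs` and `gene_vecs` map names to vectors that live in the same
    low-dimensional space (here supplied by hand as toy values).
    """
    scores = {
        name: sum(cosine(vec, gene_vecs[g]) for g in gene_set) / len(gene_set)
        for name, vec in pathway_vecs.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical embeddings: G1 and G2 lie close to pathway_A, G3 to pathway_B.
gene_vecs = {"G1": [1.0, 0.1], "G2": [0.9, 0.2], "G3": [0.0, 1.0]}
pathway_vecs = {"pathway_A": [1.0, 0.0], "pathway_B": [0.0, 1.0]}

# Genes assumed to be correlated with response to some compound.
ranking = rank_pathways(pathway_vecs, gene_vecs, ["G1", "G2"])
```

With these toy vectors `pathway_A` ranks first for the gene set `["G1", "G2"]`, because both genes point in nearly the same direction as that pathway's vector.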
5
Campbell KR, Yau C. A descriptive marker gene approach to single-cell pseudotime inference. Bioinformatics 2019; 35:28-35. [PMID: 29939207 PMCID: PMC6298060 DOI: 10.1093/bioinformatics/bty498] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Received: 11/17/2017] [Revised: 02/05/2018] [Accepted: 06/20/2018] [Indexed: 12/25/2022] [Open Access]
Abstract
Motivation: Pseudotime estimation from single-cell gene expression data allows the recovery of temporal information from otherwise static profiles of individual cells. Conventional pseudotime inference methods emphasize an unsupervised transcriptome-wide approach and use retrospective analysis to evaluate the behaviour of individual genes. However, the resulting trajectories can only be understood in terms of abstract geometric structures and not in terms of interpretable models of gene behaviour.
Results: Here we introduce an orthogonal Bayesian approach termed 'Ouija' that learns pseudotimes from a small set of marker genes that might ordinarily be used to retrospectively confirm the accuracy of unsupervised pseudotime algorithms. Crucially, we model these genes in terms of switch-like or transient behaviour along the trajectory, allowing us to understand why the pseudotimes have been inferred and learn informative parameters about the behaviour of each gene. Since each gene is associated with a switch or peak time the genes are effectively ordered along with the cells, allowing each part of the trajectory to be understood in terms of the behaviour of certain genes. We demonstrate that this small panel of marker genes can recover pseudotimes that are consistent with those obtained using the entire transcriptome. Furthermore, we show that our method can detect differences in the regulation timings between two genes and identify 'metastable' states (discrete cell types along the continuous trajectories) that recapitulate known cell types.
Availability and implementation: An open source implementation is available as an R package at http://www.github.com/kieranrcampbell/ouija and as a Python/TensorFlow package at http://www.github.com/kieranrcampbell/ouijaflow.
Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Kieran R Campbell
  - Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford, UK
  - Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, UK
- Christopher Yau
  - Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, UK
  - Department of Statistics, University of Oxford, Oxford, UK
  - Institute of Cancer and Genomic Sciences, Centre for Computational Biology, University of Birmingham, Birmingham, UK
6
Gonzalez-Valbuena EE, Treviño V. Metrics to estimate differential co-expression networks. BioData Min 2017; 10:32. [PMID: 29151892 PMCID: PMC5681815 DOI: 10.1186/s13040-017-0152-6] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 05/25/2017] [Accepted: 10/30/2017] [Indexed: 11/24/2022] [Open Access]
Abstract
BACKGROUND: Detecting the differences in gene expression data is important for understanding the underlying molecular mechanisms. Although the differentially expressed genes are a large component, differences in correlation are becoming an interesting approach to achieving deeper insights. However, diverse metrics have been used to detect differential correlation, making selection and use of a single metric difficult. In addition, available implementations are metric-specific, complicating their use in different contexts. Moreover, because the analyses in the literature have been performed on real data, there are uncertainties regarding the performance of metrics and procedures.
RESULTS: In this work, we compare four novel and two previously proposed metrics to detect differential correlations. We generated well-controlled datasets into which differences in correlations were carefully introduced by controlled multivariate normal correlation networks and addition of noise. The comparisons were performed on three datasets derived from real tumor data. Our results show that metrics differ in their detection performance and computational time. No single metric was the best in all datasets, but trends show that three metrics are highly correlated and are very good candidates for real data analysis. In contrast, other metrics proposed in the literature seem to show low performance and different detections. Overall, our results suggest that metrics that do not filter correlations perform better. We also show an additional analysis of TCGA breast cancer subtypes.
CONCLUSIONS: We show a methodology to generate controlled datasets for the objective evaluation of differential correlation pipelines, and compare the performance of several metrics. We implemented in R a package called DifCoNet that can provide easy-to-use functions for differential correlation analyses.
Affiliation(s)
- Víctor Treviño
  - Cátedra de Bioinformática, Escuela de Medicina, Tecnológico de Monterrey, 64710 Monterrey, Nuevo León, Mexico
7
Quigley DA, Tahiri A, Lüders T, Riis MH, Balmain A, Børresen-Dale AL, Bukholm I, Kristensen V. Age, estrogen, and immune response in breast adenocarcinoma and adjacent normal tissue. Oncoimmunology 2017; 6:e1356142. [PMID: 29147603 PMCID: PMC5674948 DOI: 10.1080/2162402x.2017.1356142] [Citation(s) in RCA: 30] [Impact Index Per Article: 4.3] [Received: 04/07/2017] [Revised: 07/04/2017] [Accepted: 07/07/2017] [Indexed: 12/13/2022] [Open Access]
Abstract
Chronic inflammation promotes breast tumor growth and invasion by accelerating angiogenesis and tissue remodeling in the tumor microenvironment. There is a complex relationship between inflammation and estrogen, which drives the growth of 70 percent of breast tumors. While low levels of estrogen exposure stimulate macrophages and other inflammatory cell populations, very high levels are immune suppressive. Breast tumor incidence is increased by obesity and age, which interact to influence inflammatory cell populations in normal breast tissue. To characterize the impact of these factors on tumors and the tumor microenvironment, we measured gene expression in 195 breast adenocarcinomas and matched adjacent normal breast tissue samples collected at Akershus University Hospital (AHUS). Age and Body Mass Index (BMI) were independently associated with inflammation in adjacent normal tissue but not tumors. Estrogen Receptor (ER)-negative tumors had elevated macrophage expression compared with matched normal tissue, but ER-positive tumors showed an unexpected decrease in macrophage expression. We found an inverse relationship between the increase in tumor estrogen pathway expression compared with adjacent normal tissue and tumor macrophage score. We validated this finding in 126 breast tumor-normal pairs from the previously published METABRIC cohort. We developed a novel statistic, the Rewiring Coefficient, to quantify the rewiring of gene co-expression networks at the level of individual genes. Differential correlation analysis demonstrated distinct pathways were rewired during tumorigenesis. Our data support an immune suppressive effect of high doses of estrogen signaling in breast tumor microenvironment, suggesting that this effect contributes to the greater presence of prognostic and therapeutically relevant immune cells in ER-negative tumors.
Affiliation(s)
- David A Quigley
  - Department of Genetics, Institute for Cancer Research, Oslo University Hospital, The Norwegian Radium Hospital, Oslo, Norway
  - K.G. Jebsen Centre for Breast Cancer Research, Institute for Clinical Medicine, Faculty of Medicine, University of Oslo, Oslo, Norway
  - Helen Diller Family Comprehensive Cancer Center, University of California at San Francisco, San Francisco, California, USA
  - Department of Epidemiology and Biostatistics, University of California at San Francisco, San Francisco, California, USA
- Andliena Tahiri
  - Department of Clinical Molecular Biology (EpiGen), Medical Division, Akershus University Hospital, Lørenskog, Norway
  - Institute for Clinical Medicine, Faculty of Medicine, University of Oslo, Oslo, Norway
- Torben Lüders
  - Department of Clinical Molecular Biology (EpiGen), Medical Division, Akershus University Hospital, Lørenskog, Norway
  - Institute for Clinical Medicine, Faculty of Medicine, University of Oslo, Oslo, Norway
- Margit H Riis
  - Department of Surgery, Oslo University Hospital, Ullevål, Oslo, Norway
- Allan Balmain
  - Helen Diller Family Comprehensive Cancer Center, University of California at San Francisco, San Francisco, California, USA
- Anne-Lise Børresen-Dale
  - Department of Genetics, Institute for Cancer Research, Oslo University Hospital, The Norwegian Radium Hospital, Oslo, Norway
  - K.G. Jebsen Centre for Breast Cancer Research, Institute for Clinical Medicine, Faculty of Medicine, University of Oslo, Oslo, Norway
- Ida Bukholm
  - Department of Surgery, Oslo University Hospital, Ullevål, Oslo, Norway
  - Department of Breast-Endocrine Surgery, Surgical Division, Akershus University Hospital, Lørenskog, Norway
- Vessela Kristensen
  - Department of Genetics, Institute for Cancer Research, Oslo University Hospital, The Norwegian Radium Hospital, Oslo, Norway
  - K.G. Jebsen Centre for Breast Cancer Research, Institute for Clinical Medicine, Faculty of Medicine, University of Oslo, Oslo, Norway
  - Department of Clinical Molecular Biology (EpiGen), Medical Division, Akershus University Hospital, Lørenskog, Norway
  - Institute for Clinical Medicine, Faculty of Medicine, University of Oslo, Oslo, Norway
8
Koumakis L, Kanterakis A, Kartsaki E, Chatzimina M, Zervakis M, Tsiknakis M, Vassou D, Kafetzopoulos D, Marias K, Moustakis V, Potamias G. MinePath: Mining for Phenotype Differential Sub-paths in Molecular Pathways. PLoS Comput Biol 2016; 12:e1005187. [PMID: 27832067 PMCID: PMC5104320 DOI: 10.1371/journal.pcbi.1005187] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Received: 02/29/2016] [Accepted: 10/10/2016] [Indexed: 01/04/2023] [Open Access]
Abstract
Pathway analysis methodologies couple traditional gene expression analysis with knowledge encoded in established molecular pathway networks, offering a promising approach towards the biological interpretation of phenotype differentiating genes. Early pathway analysis methodologies, known as gene set analysis (GSA), view pathways just as plain lists of genes without taking into account either the underlying pathway network topology or the involved gene regulatory relations. These approaches, even if they achieve computational efficiency and simplicity, consider pathways that involve the same genes as equivalent in terms of their gene enrichment characteristics. Most recent pathway analysis approaches take into account the underlying gene regulatory relations by examining their consistency with gene expression profiles and computing a score for each profile. Even with this approach, assessing and scoring single-relations limits the ability to reveal key gene regulation mechanisms hidden in longer pathway sub-paths. We introduce MinePath, a pathway analysis methodology that addresses and overcomes the aforementioned problems. MinePath facilitates the decomposition of pathways into their constituent sub-paths. Decomposition leads to the transformation of single-relations to complex regulation sub-paths. Regulation sub-paths are then matched with gene expression sample profiles in order to evaluate their functional status and to assess phenotype differential power. Assessment of differential power supports the identification of the most discriminant profiles. In addition, MinePath assesses the significance of the pathways as a whole, ranking them by their p-values. Comparison results with state-of-the-art pathway analysis systems are indicative of the soundness and reliability of the MinePath approach.
In contrast with many pathway analysis tools, MinePath is a web-based system (www.minepath.org) offering dynamic and rich pathway visualization functionality, with the unique characteristic to color regulatory relations between genes and reveal their phenotype inclination. This unique characteristic makes MinePath a valuable tool for in silico molecular biology experimentation as it serves the biomedical researchers' exploratory needs to reveal and interpret the regulatory mechanisms that underlie and putatively govern the expression of target phenotypes.
Affiliation(s)
- Lefteris Koumakis
  - Computational BioMedicine Laboratory (CBML), Institute of Computers Science (ICS), Foundation for Research and Technology-Hellas (FORTH), Heraklion, Crete, Greece
- Alexandros Kanterakis
  - Computational BioMedicine Laboratory (CBML), Institute of Computers Science (ICS), Foundation for Research and Technology-Hellas (FORTH), Heraklion, Crete, Greece
- Evgenia Kartsaki
  - Computational BioMedicine Laboratory (CBML), Institute of Computers Science (ICS), Foundation for Research and Technology-Hellas (FORTH), Heraklion, Crete, Greece
- Maria Chatzimina
  - Computational BioMedicine Laboratory (CBML), Institute of Computers Science (ICS), Foundation for Research and Technology-Hellas (FORTH), Heraklion, Crete, Greece
- Michalis Zervakis
  - School of Electrical and Computer Engineering, Technical University of Crete, Greece
- Manolis Tsiknakis
  - Computational BioMedicine Laboratory (CBML), Institute of Computers Science (ICS), Foundation for Research and Technology-Hellas (FORTH), Heraklion, Crete, Greece
  - Department of Informatics Engineering, Technological Educational Institute of Crete, Greece
- Despoina Vassou
  - Institute of Molecular Biology & Biotechnology, FORTH, Heraklion, Crete, Greece
- Kostas Marias
  - Computational BioMedicine Laboratory (CBML), Institute of Computers Science (ICS), Foundation for Research and Technology-Hellas (FORTH), Heraklion, Crete, Greece
- Vassilis Moustakis
  - School of Production Engineering & Management, Technical University of Crete, Greece
- George Potamias
  - Computational BioMedicine Laboratory (CBML), Institute of Computers Science (ICS), Foundation for Research and Technology-Hellas (FORTH), Heraklion, Crete, Greece
9
Abstract
Background: It is useful to incorporate biological knowledge on the role of genetic determinants in predicting an outcome. It is, however, not always feasible to fully elicit this information when the number of determinants is large. We present an approach to overcome this difficulty. First, using half of the available data, a shortlist of potentially interesting determinants is generated. Second, binary indications of biological importance are elicited for this much smaller number of determinants. Third, an analysis is carried out on this shortlist using the second half of the data.
Results: We show through simulations that, compared with adaptive lasso, this approach leads to models containing more biologically relevant variables, while the prediction mean squared error (PMSE) is comparable or even reduced. We also apply our approach to bone mineral density data, and again final models contain more biologically relevant variables and have reduced PMSEs.
Conclusion: Our method leads to comparable or improved predictive performance, and to models with greater face validity and interpretability, while making it feasible to incorporate biological knowledge into predictive models.
Electronic supplementary material: The online version of this article (doi:10.1186/s12859-016-1210-7) contains supplementary material, which is available to authorized users.
10
Quigley DA, Kandyba E, Huang P, Halliwill KD, Sjölund J, Pelorosso F, Wong CE, Hirst GL, Wu D, Delrosario R, Kumar A, Balmain A. Gene Expression Architecture of Mouse Dorsal and Tail Skin Reveals Functional Differences in Inflammation and Cancer. Cell Rep 2016; 16:1153-1165. [PMID: 27425619 DOI: 10.1016/j.celrep.2016.06.061] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 01/07/2015] [Revised: 03/16/2016] [Accepted: 06/14/2016] [Indexed: 12/13/2022] [Open Access]
Abstract
Inherited germline polymorphisms can cause gene expression levels in normal tissues to differ substantially between individuals. We present an analysis of the genetic architecture of normal adult skin from 470 genetically unique mice, demonstrating the effect of germline variants, skin tissue location, and perturbation by exogenous inflammation or tumorigenesis on gene signaling pathways. Gene networks related to specific cell types and signaling pathways, including sonic hedgehog (Shh), Wnt, Lgr family stem cell markers, and keratins, differed at these tissue sites, suggesting mechanisms for the differential susceptibility of dorsal and tail skin to development of skin diseases and tumorigenesis. The Pten tumor suppressor gene network is rewired in premalignant tumors compared to normal tissue, but this response to perturbation is lost during malignant progression. We present a software package for expression quantitative trait loci (eQTL) network analysis and demonstrate how network analysis of whole tissues provides insights into interactions between cell compartments and signaling molecules.
Affiliation(s)
- David A Quigley
  - Helen Diller Family Comprehensive Cancer Center, University of California, San Francisco, San Francisco, CA 94158, USA; Department of Genetics, Institute for Cancer Research, Oslo University Hospital, The Norwegian Radium Hospital, Oslo 0310, Norway; K.G. Jebsen Centre for Breast Cancer Research, Institute for Clinical Medicine, Faculty of Medicine, University of Oslo, Oslo 0313, Norway; Department of Epidemiology and Biostatistics, University of California, San Francisco, San Francisco, CA 94158, USA
- Eve Kandyba
  - Helen Diller Family Comprehensive Cancer Center, University of California, San Francisco, San Francisco, CA 94158, USA
- Phillips Huang
  - Helen Diller Family Comprehensive Cancer Center, University of California, San Francisco, San Francisco, CA 94158, USA; Genome Institute of Singapore, 60 Biopolis Street, #02-01 Genome Building, Singapore 138672, Singapore
- Kyle D Halliwill
  - Helen Diller Family Comprehensive Cancer Center, University of California, San Francisco, San Francisco, CA 94158, USA
- Jonas Sjölund
  - Helen Diller Family Comprehensive Cancer Center, University of California, San Francisco, San Francisco, CA 94158, USA; Division of Translational Cancer Research, Department of Laboratory Medicine, Lund University, 22381 Lund, Sweden
- Facundo Pelorosso
  - Helen Diller Family Comprehensive Cancer Center, University of California, San Francisco, San Francisco, CA 94158, USA; Instituto de Farmacología, Facultad de Medicina, Universidad de Buenos Aires, Paraguay 2155, 9th Floor, Ciudad Autónoma de Buenos Aires 1121, Argentina
- Christine E Wong
  - Institute of Surgical Pathology, University Hospital Zurich, 8091 Zurich, Switzerland
- Gillian L Hirst
  - Helen Diller Family Comprehensive Cancer Center, University of California, San Francisco, San Francisco, CA 94158, USA
- Di Wu
  - Helen Diller Family Comprehensive Cancer Center, University of California, San Francisco, San Francisco, CA 94158, USA
- Reyno Delrosario
  - Helen Diller Family Comprehensive Cancer Center, University of California, San Francisco, San Francisco, CA 94158, USA
- Atul Kumar
  - Helen Diller Family Comprehensive Cancer Center, University of California, San Francisco, San Francisco, CA 94158, USA
- Allan Balmain
  - Helen Diller Family Comprehensive Cancer Center, University of California, San Francisco, San Francisco, CA 94158, USA; Department of Biochemistry and Biophysics, University of California, San Francisco, San Francisco, CA 94158, USA
|
11
|
Abstract
Modern high-throughput assays yield detailed characterizations of the genomic, transcriptomic, and proteomic states of biological samples, enabling us to probe the molecular mechanisms that regulate hematopoiesis or give rise to hematological disorders. At the same time, the high dimensionality of the data and the complex nature of biological interaction networks present significant analytical challenges in identifying causal variations and modeling the underlying systems biology. In addition to identifying significantly dysregulated genes and proteins, integrative analysis approaches that allow the investigation of these single genes within a functional context are required. This chapter presents a survey of current computational approaches for the statistical analysis of high-dimensional data and the development of systems-level models of cellular signaling and regulation. Specifically, we focus on multi-gene analysis methods and the integration of expression data with domain knowledge (such as biological pathways) and other gene-wise information (e.g., sequence or methylation data) to identify novel functional modules in the complex cellular interaction network.
Affiliation(s)
- Rosemary Braun
  - Biostatistics Division, Department of Preventive Medicine and Northwestern Institute on Complex Systems, Northwestern University, 680 N. Lake Shore Dr., Suite 1400, 60611, Chicago, IL, USA
|
12
|
Stein GY, Yosef N, Reichman H, Horev J, Laser-Azogui A, Berens A, Resau J, Ruppin E, Sharan R, Tsarfaty I. Met kinetic signature derived from the response to HGF/SF in a cellular model predicts breast cancer patient survival. PLoS One 2012; 7:e45969. [PMID: 23049908 PMCID: PMC3457970 DOI: 10.1371/journal.pone.0045969] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 04/12/2012] [Accepted: 08/23/2012] [Indexed: 11/19/2022]
Abstract
To determine the signaling pathways leading from Met activation to metastasis and poor prognosis, we measured the kinetic gene alterations in breast cancer cell lines in response to HGF/SF. Using a network inference tool, we analyzed the putative protein-protein interaction pathways leading from Met to these genes and studied their specificity to Met and their prognostic potential. We identified a Met kinetic signature consisting of 131 genes. The signature correlates with Met activation and with response to anti-Met therapy (p<0.005) in in vitro models. It also identifies breast cancer patients who are at high risk of developing aggressive disease in six large published breast cancer patient cohorts (p<0.01, N>1000). Moreover, we have identified novel putative Met pathways, which correlate with Met activity and patient prognosis. This signature may facilitate personalized therapy by identifying patients who will respond to anti-Met therapy. Moreover, this novel approach may be applied to other tyrosine kinases and other malignancies.
Affiliation(s)
- Gideon Y. Stein
  - Department of Clinical Microbiology and Immunology, Sackler School of Medicine, Tel Aviv University, Tel Aviv, Israel
  - Department of Internal Medicine “B”, Beilinson Hospital, Rabin Medical Center, Petah-Tikva, Israel
- Nir Yosef
  - Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv, Israel
- Hadar Reichman
  - Department of Clinical Microbiology and Immunology, Sackler School of Medicine, Tel Aviv University, Tel Aviv, Israel
- Judith Horev
  - Department of Clinical Microbiology and Immunology, Sackler School of Medicine, Tel Aviv University, Tel Aviv, Israel
- Adi Laser-Azogui
  - Department of Clinical Microbiology and Immunology, Sackler School of Medicine, Tel Aviv University, Tel Aviv, Israel
- Angelique Berens
  - Van Andel Research Institute, Grand Rapids, Michigan, United States of America
- James Resau
  - Van Andel Research Institute, Grand Rapids, Michigan, United States of America
- Eytan Ruppin
  - Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv, Israel
- Roded Sharan
  - Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv, Israel
- Ilan Tsarfaty
  - Department of Clinical Microbiology and Immunology, Sackler School of Medicine, Tel Aviv University, Tel Aviv, Israel
|
13
|
Abstract
In this paper we propose a Bayesian approach for inference about dependence of high throughput gene expression. Our goals are to use prior knowledge about pathways to anchor inference about dependence among genes; to account for this dependence while making inferences about differences in mean expression across phenotypes; and to explore differences in the dependence itself across phenotypes. Useful features of the proposed approach are a model-based parsimonious representation of expression as an ordinal outcome, a novel and flexible representation of prior information on the nature of dependencies, and the use of a coherent probability model over both the structure and strength of the dependencies of interest. We evaluate our approach through simulations and in the analysis of data on expression of genes in the Complement and Coagulation Cascade pathway in ovarian cancer.
Affiliation(s)
- Donatello Telesca
  - Department of Biostatistics, UCLA School of Public Health, Los Angeles, California 90095-1772, USA
- Peter Müller
  - University of Texas, Austin Department of Mathematics, Austin, Texas 78712, USA
- Giovanni Parmigiani
  - Department of Biostatistics, Harvard School of Public Health, Boston, Massachusetts 02115, USA
- Ralph S Freedman
  - University of Texas, M.D. Anderson Cancer Center, Department of Gynecologic Oncology, Houston, Texas 7030, USA
|
14
|
A hypothesis test for equality of Bayesian network models. EURASIP Journal on Bioinformatics & Systems Biology 2010; 2010:947564. [PMID: 20981254 DOI: 10.1155/2010/947564] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 03/26/2010] [Revised: 07/09/2010] [Accepted: 08/05/2010] [Indexed: 11/18/2022]
Abstract
Bayesian network models are commonly used to model gene expression data. Some applications require a comparison of the network structure of a set of genes between varying phenotypes. In principle, separately fit models can be directly compared, but it is difficult to assign statistical significance to any observed differences. There would therefore be an advantage to the development of a rigorous hypothesis test for homogeneity of network structure. In this paper, a generalized likelihood ratio test based on Bayesian network models is developed, with significance level estimated using permutation replications. In order to be computationally feasible, a number of algorithms are introduced. First, a method for approximating multivariate distributions due to Chow and Liu (1968) is adapted, permitting the polynomial-time calculation of a maximum likelihood Bayesian network with maximum indegree of one. Second, sequential testing principles are applied to the permutation test, allowing significant reduction of computation time while preserving reported error rates used in multiple testing. The method is applied to gene-set analysis, using two sets of experimental data, and some advantage to a pathway modelling approach to this problem is reported.
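As a rough illustration of the Chow and Liu (1968) step mentioned in this abstract, the sketch below builds a tree-structured network in which every node has at most one parent. It assumes discrete data given as rows of category labels, estimates pairwise mutual information from empirical counts, and takes a maximum-weight spanning tree with Kruskal's algorithm. The function names are hypothetical, and the paper's adaptation, its likelihood ratio statistic, and its sequential permutation scheme are not reproduced here.

```python
from collections import Counter, deque
from itertools import combinations
from math import log

def mutual_information(data, i, j):
    """Empirical mutual information (in nats) between columns i and j."""
    n = len(data)
    ci = Counter(row[i] for row in data)
    cj = Counter(row[j] for row in data)
    cij = Counter((row[i], row[j]) for row in data)
    return sum((c / n) * log(c * n / (ci[a] * cj[b]))
               for (a, b), c in cij.items())

def chow_liu_edges(data):
    """Undirected edges of a maximum-weight spanning tree over the variables."""
    p = len(data[0])
    weighted = sorted(((mutual_information(data, i, j), i, j)
                       for i, j in combinations(range(p), 2)),
                      reverse=True)
    parent = list(range(p))  # union-find forest for Kruskal's algorithm

    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]  # path halving
            u = parent[u]
        return u

    edges = []
    for _, i, j in weighted:
        ri, rj = find(i), find(j)
        if ri != rj:  # adding this edge does not create a cycle
            parent[ri] = rj
            edges.append((i, j))
    return edges

def orient(edges, root=0):
    """Direct edges away from the root so each node has at most one parent."""
    neighbours = {}
    for i, j in edges:
        neighbours.setdefault(i, []).append(j)
        neighbours.setdefault(j, []).append(i)
    parent_of, queue, seen = {}, deque([root]), {root}
    while queue:
        u = queue.popleft()
        for v in neighbours.get(u, []):
            if v not in seen:
                seen.add(v)
                parent_of[v] = u
                queue.append(v)
    return parent_of

# Toy data: column 1 copies column 0, column 2 is independent of both.
data = [(a, a, b) for a in (0, 1) for b in (0, 1)] * 5
edges = chow_liu_edges(data)
parents = orient(edges, root=0)
```

Only the pairwise mutual information terms are needed, so the tree is found in time polynomial in the number of variables, which is what makes repeated refitting inside a permutation test practical.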
|