1
Computational methods for Gene Regulatory Networks reconstruction and analysis: A review. Artif Intell Med 2019; 95:133-145. [DOI: 10.1016/j.artmed.2018.10.006] [Citation(s) in RCA: 71] [Impact Index Per Article: 14.2] [Received: 06/21/2018] [Revised: 10/23/2018] [Accepted: 10/23/2018] [Indexed: 01/14/2023]
2
Jurman G, Filosi M, Visintainer R, Riccadonna S, Furlanello C. Stability in GRN Inference. Methods Mol Biol 2019; 1883:323-346. [PMID: 30547407 DOI: 10.1007/978-1-4939-8882-2_14] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 12/18/2022]
Abstract
Reconstructing a gene regulatory network from one or more sets of omics measurements has been a major task of computational biology in the last 20 years. Despite an overwhelming number of algorithms proposed to solve the network inference problem either in the general scenario or in an ad-hoc tailored situation, assessing the stability of reconstruction is still an uncharted territory and exploratory studies mainly tackled theoretical aspects. We introduce here empirical stability, which is induced by variability of reconstruction as a function of data subsampling. By evaluating differences between networks that are inferred using different subsets of the same data we obtain quantitative indicators of the robustness of the algorithm, of the noise level affecting the data, and, overall, of the reliability of the reconstructed graph. We show that empirical stability can be used whenever no ground truth is available to compute a direct measure of the similarity between the inferred structure and the true network. The main ingredient here is a suite of indicators, called NetSI, providing statistics of distances between graphs generated by a given algorithm fed with different data subsets, where the chosen metric is the Hamming-Ipsen-Mikhailov (HIM) distance evaluating dissimilarity of graph topologies with shared nodes. Operatively, the NetSI family is demonstrated here on synthetic and high-throughput datasets, inferring graphs at different resolution levels (topology, direction, weight), showing how the stability indicators can be effectively used for the quantitative comparison of the stability of different reconstruction algorithms.
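The subsampling idea described in this abstract can be illustrated with a minimal Python sketch. It is not the NetSI implementation: as a simplifying assumption it infers networks by thresholding Pearson correlations and compares them with a normalized Hamming distance only, whereas the abstract names the HIM distance as the chosen metric. All function names and parameter values below are hypothetical.

```python
import itertools
import random
import statistics


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def infer_network(samples, threshold=0.6):
    """Toy inference (assumption): link genes i, j when |r| >= threshold.

    `samples` is a list of rows, one expression vector per sample.
    Returns the undirected edges as a set of (i, j) pairs with i < j.
    """
    n_genes = len(samples[0])
    columns = [[row[g] for row in samples] for g in range(n_genes)]
    return {
        (i, j)
        for i, j in itertools.combinations(range(n_genes), 2)
        if abs(pearson(columns[i], columns[j])) >= threshold
    }


def hamming_distance(edges_a, edges_b, n_genes):
    """Fraction of possible undirected edges on which two networks disagree."""
    possible = n_genes * (n_genes - 1) // 2
    return len(edges_a ^ edges_b) / possible


def empirical_stability(samples, n_resamples=20, fraction=0.8, seed=0):
    """Mean and standard deviation of pairwise distances between networks
    inferred from random subsamples of the same data set."""
    rng = random.Random(seed)
    n_genes = len(samples[0])
    size = min(len(samples), max(3, int(fraction * len(samples))))
    networks = [infer_network(rng.sample(samples, size)) for _ in range(n_resamples)]
    distances = [
        hamming_distance(a, b, n_genes)
        for a, b in itertools.combinations(networks, 2)
    ]
    return statistics.fmean(distances), statistics.pstdev(distances)
```

A small mean distance with a small spread suggests that the inferred graph changes little when samples are withheld; larger values point to an unstable algorithm or noisy data. No ground-truth network is needed for either reading.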
Affiliation(s)
- Roberto Visintainer
- The Microsoft Research - University of Trento Centre for Computational and Systems Biology (COSBI), Rovereto, Italy
3
Icay K, Liu C, Hautaniemi S. Dynamic visualization of multi-level molecular data: The Director package in R. Comput Methods Programs Biomed 2018; 153:129-136. [PMID: 29157446 DOI: 10.1016/j.cmpb.2017.10.013] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 07/06/2016] [Revised: 02/23/2017] [Accepted: 10/10/2017] [Indexed: 06/07/2023]
Abstract
BACKGROUND AND OBJECTIVE High-throughput measurement technologies have triggered a rise in large-scale cancer studies containing multiple levels of molecular data. While there are a number of efficient methods to analyze individual data types, there are far fewer that enhance data interpretation after analysis. We present the R package Director, a dynamic visualization approach to linking and interrogating multiple levels of molecular data after analysis for clinically meaningful, actionable insights.
METHODS Sankey diagrams are traditionally used to represent quantitative flows through multiple, distinct events. Regulation can be interpreted as a flow of biological information through a series of molecular interactions. Functions in Director introduce novel drawing capabilities to make Sankey diagrams robust to a wide range of quantitative measures and to depict molecular interactions as regulatory cascades. The package streamlines the creation of diagrams, taking as input quantitative measurements that identify nodes as molecules of interest and paths as the interaction strength between two molecules.
RESULTS Director's utility is demonstrated with quantitative measurements of candidate microRNA-gene networks identified in an ovarian cancer dataset. A recent study reported eight miRNAs as master regulators of signature genes in epithelial-mesenchymal transition (EMT). The Sankey diagrams generated with data from this study further the interpretation of the miRNAs' roles by revealing potential co-regulatory behavior in the extracellular matrix (ECM). An additional analysis identified 32 genes differentially expressed between good and poor prognosis patients in four significant pathways (FDR ≤ 0.1), three of which support a complementary role of the ECM in ovarian cancer. The resulting diagram created with Director suggests that elevated levels of COL11A1, INHBA, and THBS2 - a signature feature of metastasis [1] - and decreased levels of their targeting miRNAs define poor prognosis.
CONCLUSION We have demonstrated a visualization approach suitable for implementation in an analysis workflow, linking multiple levels of molecular data to gain novel perspective on candidate biomarkers in a complex disease. The diagrams are dynamic, easily replicable, and rendered locally as HTML files to facilitate sharing. The R package Director is simple to use and widely available on all operating systems through Bioconductor (http://bioconductor.org/packages/Director) and GitHub (http://kzouchka.github.io/Director).
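To make the input described in the methods concrete (nodes as molecules, paths as interaction strengths), the following Python sketch converts a list of interactions into the node and link tables that Sankey renderers commonly expect. It is an illustration only and does not reproduce Director's R functions; the function name, the field names, the miRNA labels, and the numeric strengths are all hypothetical placeholders, not values from the study.

```python
def build_sankey_input(interactions):
    """Turn (source, target, strength) triples into node and link tables.

    Nodes are molecules. Each link stores the absolute interaction strength
    as its width and the sign of the strength as the direction of regulation.
    """
    names = []
    index = {}
    for source, target, _ in interactions:
        for name in (source, target):
            if name not in index:
                index[name] = len(names)
                names.append(name)
    links = [
        {
            "source": index[source],
            "target": index[target],
            "value": abs(strength),
            "sign": "negative" if strength < 0 else "positive",
        }
        for source, target, strength in interactions
    ]
    return names, links


# Placeholder strengths for illustration; not results from the study.
example = [
    ("miRNA-A", "COL11A1", -0.62),
    ("miRNA-A", "INHBA", -0.41),
    ("miRNA-B", "THBS2", -0.55),
]
nodes, links = build_sankey_input(example)
```

Keeping the sign separate from the width lets a renderer draw every path with a positive thickness while still colouring repression and activation differently.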
Affiliation(s)
- Katherine Icay
- Research Programs Unit, Genome-Scale Biology, Faculty of Medicine, University of Helsinki, Helsinki, POB 63, 00014, Finland.
- Chengyu Liu
- Research Programs Unit, Genome-Scale Biology, Faculty of Medicine, University of Helsinki, Helsinki, POB 63, 00014, Finland
- Sampsa Hautaniemi
- Research Programs Unit, Genome-Scale Biology, Faculty of Medicine, University of Helsinki, Helsinki, POB 63, 00014, Finland.
4
Tosadori G, Bestvina I, Spoto F, Laudanna C, Scardoni G. Creating, generating and comparing random network models with NetworkRandomizer. F1000Res 2016; 5:2524. [PMID: 29188012 PMCID: PMC5686481 DOI: 10.12688/f1000research.9203.3] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Accepted: 11/11/2017] [Indexed: 12/20/2022]
Abstract
Biological networks are becoming a fundamental tool for the investigation of high-throughput data in several fields of biology and biotechnology. With the increasing amount of information, network-based models are gaining more and more interest and new techniques are required in order to mine the information and to validate the results. To fill the validation gap we present an app, for the Cytoscape platform, which aims at creating randomised networks and randomising existing, real networks. Since there is a lack of tools that allow performing such operations, our app aims at enabling researchers to exploit different, well known random network models that could be used as a benchmark for validating real, biological datasets. We also propose a novel methodology for creating random weighted networks, i.e. the multiplication algorithm, starting from real, quantitative data. Finally, the app provides a statistical tool that compares real versus randomly computed attributes, in order to validate the numerical findings. In summary, our app aims at creating a standardised methodology for the validation of the results in the context of the Cytoscape platform.
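The benchmarking idea in this abstract, comparing an attribute of a real network against the same attribute in randomised networks, can be sketched in Python as follows. This is not the NetworkRandomizer app and does not implement its multiplication algorithm; as an assumption it uses an Erdős–Rényi G(n, m) null model and the triangle count as the attribute, and all function names are hypothetical.

```python
import itertools
import random
import statistics


def count_triangles(edges):
    """Number of triangles in an undirected graph.

    `edges` is a set of (u, v) pairs with each undirected edge listed once.
    """
    neighbours = {}
    for u, v in edges:
        neighbours.setdefault(u, set()).add(v)
        neighbours.setdefault(v, set()).add(u)
    # Every triangle is counted once per edge, hence the division by 3.
    return sum(len(neighbours[u] & neighbours[v]) for u, v in edges) // 3


def random_network(n_nodes, n_edges, rng):
    """Erdos-Renyi G(n, m) graph: n_edges distinct node pairs drawn uniformly."""
    all_pairs = list(itertools.combinations(range(n_nodes), 2))
    return set(rng.sample(all_pairs, n_edges))


def compare_to_random(real_edges, n_nodes, n_random=200, seed=0):
    """Observed triangle count, its z-score, and an empirical one-sided p-value
    against random networks with the same number of nodes and edges."""
    rng = random.Random(seed)
    observed = count_triangles(real_edges)
    null = [
        count_triangles(random_network(n_nodes, len(real_edges), rng))
        for _ in range(n_random)
    ]
    mean = statistics.fmean(null)
    sd = statistics.pstdev(null)
    z_score = (observed - mean) / sd if sd > 0 else float("nan")
    p_value = (1 + sum(t >= observed for t in null)) / (n_random + 1)
    return observed, z_score, p_value
```

G(n, m) preserves only the node and edge counts. A degree-preserving null model would be a stricter benchmark, and the choice of null model determines what a significant deviation actually means.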
Affiliation(s)
- Gabriele Tosadori
- Department of Computer Science, University of Verona, Verona, 37134, Italy; Center for BioMedical Computing, University of Verona, Verona, 37134, Italy
- Ivan Bestvina
- Faculty of Electrical Engineering and Computing, University of Zagreb, Zagreb, 10000, Croatia
- Fausto Spoto
- Department of Computer Science, University of Verona, Verona, 37134, Italy
- Carlo Laudanna
- Department of Medicine, University of Verona, Verona, 37134, Italy
- Giovanni Scardoni
- Center for BioMedical Computing, University of Verona, Verona, 37134, Italy
5
Tosadori G, Bestvina I, Spoto F, Laudanna C, Scardoni G. Creating, generating and comparing random network models with NetworkRandomizer. F1000Res 2016; 5:2524. [PMID: 29188012 DOI: 10.12688/f1000research.9203.1] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Accepted: 10/06/2016] [Indexed: 11/20/2022]
Abstract
Biological networks are becoming a fundamental tool for the investigation of high-throughput data in several fields of biology and biotechnology. With the increasing amount of information, network-based models are gaining more and more interest and new techniques are required in order to mine the information and to validate the results. To fill the validation gap we present an app, for the Cytoscape platform, which aims at creating randomised networks and randomising existing, real networks. Since there is a lack of tools that allow performing such operations, our app aims at enabling researchers to exploit different, well known random network models that could be used as a benchmark for validating real, biological datasets. We also propose a novel methodology for creating random weighted networks, i.e. the multiplication algorithm, starting from real, quantitative data. Finally, the app provides a statistical tool that compares real versus randomly computed attributes, in order to validate the numerical findings. In summary, our app aims at creating a standardised methodology for the validation of the results in the context of the Cytoscape platform.
Affiliation(s)
- Gabriele Tosadori
- Department of Computer Science, University of Verona, Verona, 37134, Italy; Center for BioMedical Computing, University of Verona, Verona, 37134, Italy
- Ivan Bestvina
- Faculty of Electrical Engineering and Computing, University of Zagreb, Zagreb, 10000, Croatia
- Fausto Spoto
- Department of Computer Science, University of Verona, Verona, 37134, Italy
- Carlo Laudanna
- Department of Medicine, University of Verona, Verona, 37134, Italy
- Giovanni Scardoni
- Center for BioMedical Computing, University of Verona, Verona, 37134, Italy
6
Spirov AV, Myasnikova EM, Holloway DM. Sequential construction of a model for modular gene expression control, applied to spatial patterning of the Drosophila gene hunchback. J Bioinform Comput Biol 2016; 14:1641005. [DOI: 10.1142/s0219720016410055] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Indexed: 11/18/2022]
Abstract
Gene network simulations are increasingly used to quantify mutual gene regulation in biological tissues. These are generally based on linear interactions between single-entity regulatory and target genes. Biological genes, by contrast, commonly have multiple, partially independent, cis-regulatory modules (CRMs) for regulator binding, and can produce variant transcription and translation products. We present a modeling framework to address some of the gene regulatory dynamics implied by this biological complexity. Spatial patterning of the hunchback (hb) gene in Drosophila development involves control by three CRMs producing two distinct mRNA transcripts. We use this example to develop a differential equations model for transcription which takes into account the cis-regulatory architecture of the gene. Potential regulatory interactions are screened by a genetic algorithms (GAs) approach and compared to biological expression data.
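As a rough illustration of how several cis-regulatory modules can feed one transcript in a differential equations model, the Python sketch below integrates dm/dt = max_rate * (sum of CRM activities) - decay * m with the Euler method. These are not the authors' equations: the sigmoid CRM response, the additive combination of modules, and every name and parameter value are assumptions made for the example.

```python
import math


def crm_activity(weights, regulators, bias=0.0):
    """Sigmoid response of one CRM to regulator concentrations.

    `weights` maps regulator names to signed strengths (positive activates,
    negative represses). Regulators without a weight are ignored.
    """
    drive = sum(weights.get(name, 0.0) * level for name, level in regulators.items())
    return 1.0 / (1.0 + math.exp(-(drive + bias)))


def simulate_transcript(crms, regulators, max_rate=1.0, decay=0.1, dt=0.01, t_end=100.0):
    """Euler integration of dm/dt = max_rate * sum_k activity_k - decay * m.

    `crms` is a list of (weights, bias) pairs, one per module driving the
    transcript. Starts from m = 0 and returns m at time t_end.
    """
    production = max_rate * sum(crm_activity(w, regulators, b) for w, b in crms)
    m = 0.0
    for _ in range(int(t_end / dt)):
        m += dt * (production - decay * m)
    return m


# Hypothetical regulators and weights, chosen only to exercise the functions.
regulators = {"activator": 0.8, "repressor": 0.2}
crms = [
    ({"activator": 4.0}, -2.0),
    ({"activator": 2.0, "repressor": -3.0}, 0.0),
]
level = simulate_transcript(crms, regulators)
```

In a screening setting such as the genetic algorithm approach mentioned in the abstract, candidate weight sets would be scored by how closely the simulated levels match measured expression. That scoring loop is omitted here.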
Affiliation(s)
- Alexander V. Spirov
- Computer Science and CEWIT, SUNY Stony Brook, 1500 Stony Brook Road, Stony Brook, NY 11794, USA
- Lab Modeling of Evolution, I. M. Sechenov Institute of Evolutionary Physiology and Biochemistry, Russian Academy of Sciences, pr. Torez 44, St. Petersburg 194223, Russia
- Ekaterina M. Myasnikova
- Center for Advanced Studies, Peter the Great St. Petersburg Polytechnical University, 29 Polytechnicheskaya St. Petersburg 195251, Russia
- Department of Bioinformatics, Moscow Institute of Physics and Technology, 9 Institutskiy per., Dolgoprudny, Moscow 141700, Russia
- David M. Holloway
- Mathematics Department, British Columbia Institute of Technology, 3700 Willingdon Avenue, Burnaby, BC V5G 3H2, Canada
- Department of Biology, University of Victoria, Victoria, BC V8W 2Y2, Canada
7
Emmert-Streib F, Dehmer M, Haibe-Kains B. Untangling statistical and biological models to understand network inference: the need for a genomics network ontology. Front Genet 2014; 5:299. [PMID: 25221572 PMCID: PMC4148777 DOI: 10.3389/fgene.2014.00299] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Received: 06/28/2014] [Accepted: 08/12/2014] [Indexed: 12/31/2022]
Abstract
In this paper, we shed light on approaches that are currently used to infer networks from gene expression data with respect to their biological meaning. As we will show, the biological interpretation of these networks depends on the chosen theoretical perspective. For this reason, we distinguish a statistical perspective from a mathematical modeling perspective and elaborate their differences and implications. Our results indicate the imperative need for a genomic network ontology in order to avoid increasing confusion about the biological interpretation of inferred networks, a confusion that can be compounded by approaches that integrate multiple data sets or data types.
Affiliation(s)
- Frank Emmert-Streib
- Computational Biology and Machine Learning Laboratory, Faculty of Medicine, Health and Life Sciences, Center for Cancer Research and Cell Biology, School of Medicine, Dentistry and Biomedical Sciences, Queen's University Belfast, Belfast, UK
- Matthias Dehmer
- Institute for Bioinformatics and Translational Research, UMIT, Hall in Tyrol, Austria
- Benjamin Haibe-Kains
- Bioinformatics and Computational Genomics Laboratory, Princess Margaret Cancer Centre, University Health Network, Toronto, ON, Canada