1
Böttcher L, Porter MA. Complex networks with complex weights. Phys Rev E 2024; 109:024314. [PMID: 38491610 DOI: 10.1103/physreve.109.024314] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/15/2022] [Accepted: 12/20/2023] [Indexed: 03/18/2024]
Abstract
In many studies, it is common to use binary (i.e., unweighted) edges to examine networks of entities that are either adjacent or not adjacent. Researchers have generalized such binary networks to incorporate edge weights, which allow one to encode node-node interactions with heterogeneous intensities or frequencies (e.g., in transportation networks, supply chains, and social networks). Most such studies have considered real-valued weights, despite the fact that networks with complex weights arise in fields as diverse as quantum information, quantum chemistry, electrodynamics, rheology, and machine learning. Many of the standard network-science approaches in the study of classical systems rely on the real-valued nature of edge weights, so it is necessary to generalize them if one seeks to use them to analyze networks with complex edge weights. In this paper, we examine how standard network-analysis methods fail to capture structural features of networks with complex edge weights. We then generalize several network measures to the complex domain and show that random-walk centralities provide a useful approach to examine node importances in networks with complex weights.
Affiliation(s)
- Lucas Böttcher
- Department of Computational Science and Philosophy, Frankfurt School of Finance and Management, 60322 Frankfurt am Main, Germany
- Department of Medicine, University of Florida, Gainesville, Florida, 32610, USA
- Mason A Porter
- Department of Mathematics, University of California, Los Angeles, California 90095, USA
- Department of Sociology, University of California, Los Angeles, California 90095, USA
- Santa Fe Institute, Santa Fe, New Mexico 87501, USA
2
Loers JU, Vermeirssen V. SUBATOMIC: a SUbgraph BAsed mulTi-OMIcs clustering framework to analyze integrated multi-edge networks. BMC Bioinformatics 2022; 23:363. [PMID: 36064320 PMCID: PMC9442970 DOI: 10.1186/s12859-022-04908-3] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 05/17/2022] [Accepted: 08/24/2022] [Indexed: 11/02/2022] [Open access]
Abstract
BACKGROUND: Representing the complex interplay between different types of biomolecules across different omics layers in multi-omics networks bears great potential to gain a deep mechanistic understanding of gene regulation and disease. However, multi-omics networks easily grow into giant hairball structures that hamper biological interpretation. Module detection methods can decompose these networks into smaller interpretable modules. However, these methods are not adapted to deal with multi-omics data nor consider topological features. When deriving very large modules or ignoring the broader network context, interpretability remains limited. To address these issues, we developed a SUbgraph BAsed mulTi-OMIcs Clustering framework (SUBATOMIC), which infers small and interpretable modules with a specific topology while keeping track of connections to other modules and regulators.
RESULTS: SUBATOMIC groups specific molecular interactions in composite network subgraphs of two and three nodes and clusters them into topological modules. These are functionally annotated, visualized and overlaid with expression profiles to go from static to dynamic modules. To preserve the larger network context, SUBATOMIC investigates statistically the connections in between modules as well as between modules and regulators such as miRNAs and transcription factors. We applied SUBATOMIC to analyze a composite Homo sapiens network containing transcription factor-target gene, miRNA-target gene, protein-protein, homologous and co-functional interactions from different databases. We derived and annotated 5586 modules with diverse topological, functional and regulatory properties. We created novel functional hypotheses for unannotated genes. Furthermore, we integrated modules with condition specific expression data to study the influence of hypoxia in three cancer cell lines. We developed two prioritization strategies to identify the most relevant modules in specific biological contexts: one considering GO term enrichments and one calculating an activity score reflecting the degree of differential expression. Both strategies yielded modules specifically reacting to low oxygen levels.
CONCLUSIONS: We developed the SUBATOMIC framework that generates interpretable modules from integrated multi-omics networks and applied it to hypoxia in cancer. SUBATOMIC can infer and contextualize modules, explore condition or disease specific modules, identify regulators and functionally related modules, and derive novel gene functions for uncharacterized genes. The software is available at https://github.com/CBIGR/SUBATOMIC.
Affiliation(s)
- Jens Uwe Loers
- Lab for Computational Biology, Integromics and Gene Regulation (CBIGR), Cancer Research Institute Ghent (CRIG), Ghent, Belgium; Department of Biomedical Molecular Biology, Ghent University, Ghent, Belgium; Department of Biomolecular Medicine, Ghent University, Ghent, Belgium
- Vanessa Vermeirssen
- Lab for Computational Biology, Integromics and Gene Regulation (CBIGR), Cancer Research Institute Ghent (CRIG), Ghent, Belgium; Department of Biomedical Molecular Biology, Ghent University, Ghent, Belgium; Department of Biomolecular Medicine, Ghent University, Ghent, Belgium
3
Ren Y, Sarkar A, Veltri P, Ay A, Dobra A, Kahveci T. Pattern Discovery in Multilayer Networks. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:741-752. [PMID: 34398763 DOI: 10.1109/tcbb.2021.3105001] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 06/13/2023]
Abstract
MOTIVATION: In bioinformatics, complex cellular modeling and behavior simulation to identify significant molecular interactions is considered a relevant problem. Traditional methods model such complex systems using a single binary network. However, this model is inadequate to represent biological networks, as different sets of interactions can simultaneously take place for different interaction constraints (such as transcription regulation and protein interaction). Furthermore, biological systems may exhibit varying interaction topologies even for the same interaction type under different developmental stages or stress conditions. Therefore, models which consider biological systems as solitary interactions are inaccurate, as they fail to capture the complex behavior of cellular interactions within organisms. Identification and counting of recurrent motifs within a network is one of the fundamental problems in biological network analysis. Existing methods for motif counting on single network topologies are inadequate to capture patterns of molecular interactions that have significant changes in biological expression when identified across different organisms that are similar, or even time-varying networks within the same organism. That is, they fail to identify recurrent interactions, as they consider a single snapshot of a network among a set of multiple networks. Therefore, we need methods geared towards studying multiple network topologies and the pattern conservation among them.
CONTRIBUTIONS: In this paper, we consider the problem of counting the number of instances of a user-supplied motif topology in a given multilayer network. We model interactions among a set of entities (e.g., genes) describing various conditions or temporal variation as multilayer networks. Thus, each layer is a separate network that shows the connectivity of the nodes under a unique network state. Existing motif counting and identification methods are limited to single network topologies, and thus cannot be directly applied to multilayer networks. We apply our model and algorithm to study frequent patterns in cellular networks that are common in varying cellular states under different stress conditions, where the cellular network topology under each stress condition describes a unique network layer.
RESULTS: We develop a methodology and corresponding algorithm based on the proposed model for motif counting in multilayer networks. We performed experiments on both real and synthetic datasets. We modeled the synthetic datasets under a wide spectrum of parameters, such as network size, density, and motif frequency. Results on synthetic datasets demonstrate that our algorithm finds motif embeddings with very high accuracy compared to existing state-of-the-art methods such as G-tries, ESU (FANMODE) and mfinder. Furthermore, we observe that our method runs from several times to several orders of magnitude faster than existing methods. For experiments on a real dataset, we consider the Escherichia coli (E. coli) transcription regulatory network under different experimental conditions. We observe that the genes selected by our method conserve functional characteristics under various stress conditions with very low false discovery rates. Moreover, the method is scalable to real networks in terms of both network size and number of layers.
4
Affiliation(s)
- Mingao Yuan
- Department of Statistics, North Dakota State University
- Ruiqi Liu
- Department of Mathematical Sciences, Texas Tech University
- Yang Feng
- Department of Biostatistics, New York University
- Zuofeng Shang
- Department of Mathematical Sciences, New Jersey Institute of Technology
5
Affiliation(s)
- Mingao Yuan
- Department of Statistics, North Dakota State University, Fargo, North Dakota, USA
- Yehong Nan
- Department of Statistics, North Dakota State University, Fargo, North Dakota, USA
6
Ovens K, Eames BF, McQuillan I. Comparative Analyses of Gene Co-expression Networks: Implementations and Applications in the Study of Evolution. Front Genet 2021; 12:695399. [PMID: 34484293 PMCID: PMC8414652 DOI: 10.3389/fgene.2021.695399] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Received: 04/14/2021] [Accepted: 07/19/2021] [Indexed: 11/13/2022] [Open access]
Abstract
Similarities and differences in the associations of biological entities among species can provide us with a better understanding of evolutionary relationships. Often the evolution of new phenotypes results from changes to interactions in pre-existing biological networks and comparing networks across species can identify evidence of conservation or adaptation. Gene co-expression networks (GCNs), constructed from high-throughput gene expression data, can be used to understand evolution and the rise of new phenotypes. The increasing abundance of gene expression data makes GCNs a valuable tool for the study of evolution in non-model organisms. In this paper, we cover motivations for why comparing these networks across species can be valuable for the study of evolution. We also review techniques for comparing GCNs in the context of evolution, including local and global methods of graph alignment. While some protein-protein interaction (PPI) bioinformatic methods can be used to compare co-expression networks, they often disregard highly relevant properties, including the existence of continuous and negative values for edge weights. Also, the lack of comparative datasets in non-model organisms has hindered the study of evolution using PPI networks. We also discuss limitations and challenges associated with cross-species comparison using GCNs, and provide suggestions for utilizing co-expression network alignments as an indispensable tool for evolutionary studies going forward.
Affiliation(s)
- Katie Ovens
- Augmented Intelligence & Precision Health Laboratory (AIPHL), Research Institute of the McGill University Health Centre, Montreal, QC, Canada
- B. Frank Eames
- Department of Anatomy, Physiology, & Pharmacology, University of Saskatchewan, Saskatoon, SK, Canada
- Ian McQuillan
- Department of Computer Science, University of Saskatchewan, Saskatoon, SK, Canada
7
Yang Z, Telesford QK, Franco AR, Lim R, Gu S, Xu T, Ai L, Castellanos FX, Yan CG, Colcombe S, Milham MP. Measurement reliability for individual differences in multilayer network dynamics: Cautions and considerations. Neuroimage 2021; 225:117489. [PMID: 33130272 PMCID: PMC7829665 DOI: 10.1016/j.neuroimage.2020.117489] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 09/22/2020] [Accepted: 10/21/2020] [Indexed: 01/16/2023] [Open access]
Abstract
Multilayer network models have been proposed as an effective means of capturing the dynamic configuration of distributed neural circuits and quantitatively describing how communities vary over time. Beyond general insights into brain function, a growing number of studies have begun to employ these methods for the study of individual differences. However, test-retest reliabilities for multilayer network measures have yet to be fully quantified or optimized, potentially limiting their utility for individual difference studies. Here, we systematically evaluated the impact of multilayer community detection algorithms, selection of network parameters, scan duration, and task condition on test-retest reliabilities of multilayer network measures (i.e., flexibility, integration, and recruitment). A key finding was that the default method used for community detection by the popular generalized Louvain algorithm can generate erroneous results. Although available, an updated algorithm addressing this issue is yet to be broadly adopted in the neuroimaging literature. Beyond the algorithm, the present work identified parameter selection as a key determinant of test-retest reliability; however, optimization of these parameters and expected reliabilities appeared to be dataset-specific. Once parameters were optimized, consistent with findings from the static functional connectivity literature, scan duration was a much stronger determinant of reliability than scan condition. When the parameters were optimized and scan duration was sufficient, both passive (i.e., resting state, Inscapes, and movie) and active (i.e., flanker) tasks were reliable, although reliability in the movie watching condition was significantly higher than in the other three tasks. The minimal data requirement for achieving reliable measures for the movie watching condition was 20 min, and 30 min for the other three tasks. 
Our results caution the field against the use of default parameters without optimization based on the specific datasets to be employed - a process likely to be limited for most due to the lack of test-retest samples to enable parameter optimization.
Affiliation(s)
- Zhen Yang
- Center for Biomedical Imaging and Neuromodulation, The Nathan S. Kline Institute for Psychiatric Research, 140 Old Orangeburg Rd, Orangeburg, NY 10962, United States; Department of Psychiatry, NYU Grossman School of Medicine, 550 1st Avenue, New York, NY 10016, United States.
- Qawi K Telesford
- Center for Biomedical Imaging and Neuromodulation, The Nathan S. Kline Institute for Psychiatric Research, 140 Old Orangeburg Rd, Orangeburg, NY 10962, United States
- Alexandre R Franco
- Center for Biomedical Imaging and Neuromodulation, The Nathan S. Kline Institute for Psychiatric Research, 140 Old Orangeburg Rd, Orangeburg, NY 10962, United States; Department of Psychiatry, NYU Grossman School of Medicine, 550 1st Avenue, New York, NY 10016, United States; Center for the Developing Brain, The Child Mind Institute, 101 East 56th Street, New York, NY 10022, United States
- Ryan Lim
- Center for Biomedical Imaging and Neuromodulation, The Nathan S. Kline Institute for Psychiatric Research, 140 Old Orangeburg Rd, Orangeburg, NY 10962, United States
- Shi Gu
- University of Electronic Science and Technology of China, Chengdu, Sichuan, China
- Ting Xu
- Center for the Developing Brain, The Child Mind Institute, 101 East 56th Street, New York, NY 10022, United States
- Lei Ai
- Center for the Developing Brain, The Child Mind Institute, 101 East 56th Street, New York, NY 10022, United States
- Francisco X Castellanos
- Center for Biomedical Imaging and Neuromodulation, The Nathan S. Kline Institute for Psychiatric Research, 140 Old Orangeburg Rd, Orangeburg, NY 10962, United States; Department of Child and Adolescent Psychiatry, NYU Grossman School of Medicine, New York, NY 10016, United States
- Chao-Gan Yan
- CAS Key Laboratory of Behavioral Science, Institute of Psychology, Beijing, China
- Stan Colcombe
- Center for Biomedical Imaging and Neuromodulation, The Nathan S. Kline Institute for Psychiatric Research, 140 Old Orangeburg Rd, Orangeburg, NY 10962, United States; Department of Psychiatry, NYU Grossman School of Medicine, 550 1st Avenue, New York, NY 10016, United States
- Michael P Milham
- Center for Biomedical Imaging and Neuromodulation, The Nathan S. Kline Institute for Psychiatric Research, 140 Old Orangeburg Rd, Orangeburg, NY 10962, United States; Center for the Developing Brain, The Child Mind Institute, 101 East 56th Street, New York, NY 10022, United States.
8
Franzese N, Groce A, Murali TM, Ritz A. Hypergraph-based connectivity measures for signaling pathway topologies. PLoS Comput Biol 2019; 15:e1007384. [PMID: 31652258 PMCID: PMC6834280 DOI: 10.1371/journal.pcbi.1007384] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 05/26/2019] [Revised: 11/06/2019] [Accepted: 09/09/2019] [Indexed: 12/12/2022] [Open access]
Abstract
Characterizing cellular responses to different extrinsic signals is an active area of research, and curated pathway databases describe these complex signaling reactions. Here, we revisit a fundamental question in signaling pathway analysis: are two molecules “connected” in a network? This question is the first step towards understanding the potential influence of molecules in a pathway, and the answer depends on the choice of modeling framework. We examined the connectivity of Reactome signaling pathways using four different pathway representations. We find that Reactome is very well connected as a graph, moderately well connected as a compound graph or bipartite graph, and poorly connected as a hypergraph (which captures many-to-many relationships in reaction networks). We present a novel relaxation of hypergraph connectivity that iteratively increases connectivity from a node while preserving the hypergraph topology. This measure, B-relaxation distance, provides a parameterized transition between hypergraph connectivity and graph connectivity. B-relaxation distance is sensitive to the presence of small molecules that participate in many functionally unrelated reactions in the network. We also define a score that quantifies one pathway’s downstream influence on another, which can be calculated as B-relaxation distance gradually relaxes the connectivity constraint in hypergraphs. Computing this score across all pairs of 34 Reactome pathways reveals pairs of pathways with statistically significant influence. We present two such case studies, and we describe the specific reactions that contribute to the large influence score. Finally, we investigate the ability for connectivity measures to capture functional relationships among proteins, and use the evidence channels in the STRING database as a benchmark dataset. 
STRING interactions whose proteins are B-connected in Reactome have statistically significantly higher scores than interactions connected in the bipartite graph representation. Our method lays the groundwork for other generalizations of graph-theoretic concepts to hypergraphs in order to facilitate signaling pathway analysis.
Signaling pathways describe how cells respond to external signals through molecular interactions. As we gain a deeper understanding of these signaling reactions, it is important to understand how molecules may influence downstream responses and how pathways may affect each other. As the amount of information in signaling pathway databases continues to grow, we have the opportunity to analyze properties of pathway structure. We pose an intuitive question about signaling pathways: when are two molecules “connected” in a pathway? The answer varies dramatically based on the assumptions we make about how reactions link molecules. Here, we examine four approaches for modeling the structural topology of signaling pathways, and present methods to quantify whether two molecules are “connected” in a pathway database. We find that existing approaches are either too permissive (molecules are connected to many others) or too restrictive (molecules are connected to a handful of others), and we present a new measure that offers a continuum between these two extremes. We then expand our question to ask when an entire signaling pathway is “downstream” of another pathway, and show two case studies from the Reactome pathway database that uncover pathway influence. Finally, we show that the strict notion of connectivity can capture functional relationships among proteins using an independent benchmark dataset. Our approach to quantifying connectivity in pathways considers a biologically motivated definition of connectivity, laying the foundation for more sophisticated analyses that leverage the detailed information in pathway databases.
Affiliation(s)
- Nicholas Franzese
- Biology Department, Reed College, Portland, Oregon, United States of America
- Computer Science Department, Reed College, Portland, Oregon, United States of America
- Department of Computer Science, Virginia Tech, Blacksburg, Virginia, United States of America
- Adam Groce
- Computer Science Department, Reed College, Portland, Oregon, United States of America
- T. M. Murali
- Department of Computer Science, Virginia Tech, Blacksburg, Virginia, United States of America
- ICTAS Center for Systems Biology of Engineered Tissues, Virginia Tech, Blacksburg, Virginia, United States of America
- Anna Ritz
- Biology Department, Reed College, Portland, Oregon, United States of America
9
Defoort J, Van de Peer Y, Vermeirssen V. Function, dynamics and evolution of network motif modules in integrated gene regulatory networks of worm and plant. Nucleic Acids Res 2018; 46:6480-6503. [PMID: 29873777 PMCID: PMC6061849 DOI: 10.1093/nar/gky468] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 12/07/2017] [Accepted: 05/14/2018] [Indexed: 12/29/2022] [Open access]
Abstract
Gene regulatory networks (GRNs) consist of different molecular interactions that closely work together to establish proper gene expression in time and space. Especially in higher eukaryotes, many questions remain on how these interactions collectively coordinate gene regulation. We study high quality GRNs consisting of undirected protein–protein, genetic and homologous interactions, and directed protein–DNA, regulatory and miRNA–mRNA interactions in the worm Caenorhabditis elegans and the plant Arabidopsis thaliana. Our data-integration framework integrates interactions in composite network motifs, clusters these in biologically relevant, higher-order topological network motif modules, overlays these with gene expression profiles and discovers novel connections between modules and regulators. Similar modules exist in the integrated GRNs of worm and plant. We show how experimental or computational methodologies underlying a certain data type impact network topology. Through phylogenetic decomposition, we found that proteins of worm and plant tend to functionally interact with proteins of a similar age, while at the regulatory level TFs favor same age, but also older target genes. Despite some influence of the duplication mode difference, we also observe at the motif and module level for both species a preference for age homogeneity for undirected and age heterogeneity for directed interactions. This leads to a model where novel genes are added together to the GRNs in a specific biological functional context, regulated by one or more TFs that also target older genes in the GRNs. Overall, we detected topological, functional and evolutionary properties of GRNs that are potentially universal in all species.
Affiliation(s)
- Jonas Defoort
- Department of Plant Biotechnology and Bioinformatics, Ghent University, 9052 Ghent, Belgium; VIB Center for Plant Systems Biology, 9052 Ghent, Belgium; Bioinformatics Institute Ghent, Ghent University, 9052 Ghent, Belgium
- Yves Van de Peer
- Department of Plant Biotechnology and Bioinformatics, Ghent University, 9052 Ghent, Belgium; VIB Center for Plant Systems Biology, 9052 Ghent, Belgium; Bioinformatics Institute Ghent, Ghent University, 9052 Ghent, Belgium; Department of Biochemistry, Genetics and Microbiology, University of Pretoria, Pretoria 0028, South Africa
- Vanessa Vermeirssen
- Department of Plant Biotechnology and Bioinformatics, Ghent University, 9052 Ghent, Belgium; VIB Center for Plant Systems Biology, 9052 Ghent, Belgium; Bioinformatics Institute Ghent, Ghent University, 9052 Ghent, Belgium
10
Erola P, Bonnet E, Michoel T. Learning Differential Module Networks Across Multiple Experimental Conditions. Methods Mol Biol 2019; 1883:303-321. [PMID: 30547406 DOI: 10.1007/978-1-4939-8882-2_13] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 12/12/2022]
Abstract
Module network inference is a statistical method to reconstruct gene regulatory networks, which uses probabilistic graphical models to learn modules of coregulated genes and their upstream regulatory programs from genome-wide gene expression and other omics data. Here, we review the basic theory of module network inference, present protocols for common gene regulatory network reconstruction scenarios based on the Lemon-Tree software, and show, using human gene expression data, how the software can also be applied to learn differential module networks across multiple experimental conditions.
Affiliation(s)
- Pau Erola
- Division of Genetics and Genomics, Roslin Institute, University of Edinburgh, Midlothian, Scotland, UK
- Eric Bonnet
- Centre National de Recherche en Génomique Humaine, Institut de Biologie François Jacob, Direction de la Recherche Fondamentale, CEA, Evry, France
- Tom Michoel
- Division of Genetics and Genomics, The Roslin Institute, University of Edinburgh, Midlothian, Scotland, UK.
- Current Address: Computational Biology Unit, Department of Informatics, University of Bergen, Bergen, Norway.
11
Shen T, Zhang Z, Chen Z, Gu D, Liang S, Xu Y, Li R, Wei Y, Liu Z, Yi Y, Xie X. A genome-scale metabolic network alignment method within a hypergraph-based framework using a rotational tensor-vector product. Sci Rep 2018; 8:16376. [PMID: 30401914 PMCID: PMC6219566 DOI: 10.1038/s41598-018-34692-1] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 03/22/2018] [Accepted: 10/23/2018] [Indexed: 12/14/2022] [Open access]
Abstract
Biological network alignment aims to discover important similarities and differences and thus find a mapping between topological and/or functional components of different biological molecular networks. Then, the mapped components can be considered to correspond to both their places in the network topology and their biological attributes. Development and evolution of biological network alignment methods have been accelerated by the rapidly increasing availability of such biological networks, yielding a repertoire of tens of methods based upon graph theory. However, most biological processes, especially metabolic reactions, are more sophisticated than simple pairwise interactions and contain three or more participating components. Such multi-lateral relations are not captured by graphs, and computational methods to overcome this limitation are currently lacking. This paper introduces hypergraphs and association hypergraphs to describe metabolic networks and their potential alignments, respectively. Within this framework, metabolic networks are aligned by identifying the maximal Z-eigenvalue of a symmetric tensor. A shifted higher-order power method was utilized to identify a solution. A rotational strategy has been introduced to accelerate the tensor-vector product by 250-fold on average and reduce the storage cost by up to 1,000-fold. The algorithm was implemented on a Spark-based distributed computation cluster to significantly increase the convergence rate further by 50- to 80-fold. The parameters have been explored to understand their impact on alignment accuracy and speed. In particular, the influence of initial value selection on the stationary point has been simulated to ensure an accurate approximation of the global optimum. This framework was demonstrated by alignments among the genome-wide metabolic networks of Escherichia coli MG-1655 and Halophilic archaeon DL31.
To our knowledge, this is the first genome-wide metabolic network alignment at both the metabolite level and the enzyme level. These results demonstrate that the method can supply quite a few valuable insights into metabolic networks. First, this method can access the driving force of organic reactions through the chemical evolution of the metabolic network. Second, this method can incorporate the chemical information of enzymes and structural changes of compounds to offer a new way of defining reaction classes and modules, such as those in KEGG. Third, as a vertex-focused treatment, this method can supply novel structural and functional annotation for ill-defined molecules. The related source code is available on request.
Affiliation(s)
- Tie Shen
- Key Laboratory of Information and Computing Science Guizhou Province, Guizhou Normal University, Guiyang, Guizhou, China.
- Zhengdong Zhang
- College of Mathematics and Information Science, Guiyang University, Guiyang, Guizhou, China
- Zhen Chen
- College of Mathematical Science, Guizhou Normal University, Guiyang, Guizhou, China
- Dagang Gu
- College of Mathematics and Information Science, Guiyang University, Guiyang, Guizhou, China
- Shen Liang
- College of Mathematics and Information Science, Guiyang University, Guiyang, Guizhou, China
- Yang Xu
- Key Laboratory of Information and Computing Science Guizhou Province, Guizhou Normal University, Guiyang, Guizhou, China
- Ruiyuan Li
- Key Laboratory of Information and Computing Science Guizhou Province, Guizhou Normal University, Guiyang, Guizhou, China
- Yimin Wei
- School of Mathematics Sciences and Key Laboratory of Mathematics for Nonlinear Sciences, Fudan University, Shanghai, China
| | - Zhijie Liu
- Key Laboratory of Information and Computing Science Guizhou Province, Guizhou Normal University, Guiyang, Guizhou, China
| | - Yin Yi
- Key Laboratory of State Forestry Administration on Biodiversity Conservation in Karst of Southwest Areas China, Guizhou Normal University, Guiyang, Guizhou, China.
| | - Xiaoyao Xie
- Key Laboratory of Information and Computing Science Guizhou Province, Guizhou Normal University, Guiyang, Guizhou, China.
| |
Collapse
|
12
|
Reyes PFL, Michoel T, Joshi A, Devailly G. Meta-analysis of Liver and Heart Transcriptomic Data for Functional Annotation Transfer in Mammalian Orthologs. Comput Struct Biotechnol J 2017; 15:425-432. [PMID: 29187960 PMCID: PMC5691612 DOI: 10.1016/j.csbj.2017.08.002] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 03/31/2017] [Revised: 08/10/2017] [Accepted: 08/11/2017] [Indexed: 11/30/2022]
Abstract
Functional annotation transfer across multi-gene family orthologs can lead to functional misannotations. We hypothesised that co-expression networks would help predict functional orthologs amongst complex homologous gene families. To explore the use of transcriptomic data available in the public domain to identify the functionally equivalent genes among all predicted orthologs, we collected genome-wide expression data in mouse and rat liver from over 1500 experiments with varied treatments. We used a hypergraph clustering method to identify clusters of orthologous genes co-expressed in both mouse and rat. We validated these clusters by analysing expression profiles in each species separately and demonstrating a high overlap. We then focused on genes in 18 homology groups with one-to-many or many-to-many relationships between the two species, to discriminate between functionally equivalent and non-equivalent orthologs. Finally, we collected heart transcriptomic data (over 1400 experiments) in rat and mouse to validate the method in an independent tissue.
Affiliation(s)
- Tom Michoel
  - The Roslin Institute, The University of Edinburgh, Easter Bush, Midlothian, EH25 9RG, Scotland, UK
- Anagha Joshi
  - The Roslin Institute, The University of Edinburgh, Easter Bush, Midlothian, EH25 9RG, Scotland, UK
- Guillaume Devailly
  - The Roslin Institute, The University of Edinburgh, Easter Bush, Midlothian, EH25 9RG, Scotland, UK
|
13
|
|
14
|
Ghoshdastidar D, Dukkipati A. Consistency of spectral hypergraph partitioning under planted partition model. Ann Stat 2017. [DOI: 10.1214/16-aos1453] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.1] [Indexed: 11/19/2022]
|
15
|
Takaguchi T, Yoshida Y. Cycle and flow trusses in directed networks. R Soc Open Sci 2016; 3:160270. [PMID: 28018610 PMCID: PMC5180108 DOI: 10.1098/rsos.160270] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 04/19/2016] [Accepted: 10/31/2016] [Indexed: 06/06/2023]
Abstract
When we represent real-world systems as networks, the directions of links often convey valuable information. Finding module structures that respect link directions is one of the most important tasks for analysing directed networks. Although many notions of a directed module have been proposed, no consensus has been reached. This lack of consensus results partly because there might exist distinct types of modules in a single directed network, whereas most previous studies focused on an independent criterion for modules. To address this issue, we propose a generic notion of the so-called truss structures in directed networks. Our definition of truss is able to extract two distinct types of trusses, named the cycle truss and the flow truss, from a unified framework. By applying the method for finding trusses to empirical networks obtained from a wide range of research fields, we find that most real networks contain both cycle and flow trusses. In addition, the abundance of (and the overlap between) the two types of trusses may be useful to characterize module structures in a wide variety of empirical networks. Our findings shed light on the importance of simultaneously considering different types of modules in directed networks.
Affiliation(s)
- Taro Takaguchi
  - National Institute of Informatics, ERATO, Kawarabayashi Large Graph Project, 2-1-2 Hitotsubashi, Chiyoda-ku, 101-8430 Tokyo, Japan
  - JST, ERATO, Kawarabayashi Large Graph Project, 2-1-2 Hitotsubashi, Chiyoda-ku, 101-8430 Tokyo, Japan
- Yuichi Yoshida
  - National Institute of Informatics, ERATO, Kawarabayashi Large Graph Project, 2-1-2 Hitotsubashi, Chiyoda-ku, 101-8430 Tokyo, Japan
  - Preferred Infrastructure, 1-6-1 Otemachi, Chiyoda-ku, 100-0004 Tokyo, Japan
|
16
|
Pearcy N, Chuzhanova N, Crofts JJ. Complexity and robustness in hypernetwork models of metabolism. J Theor Biol 2016; 406:99-104. [PMID: 27354314 DOI: 10.1016/j.jtbi.2016.06.032] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 03/11/2016] [Revised: 06/17/2016] [Accepted: 06/22/2016] [Indexed: 11/25/2022]
Abstract
Metabolic reaction data is commonly modelled using a complex network approach, whereby nodes represent the chemical species present within the organism of interest, and connections are formed between those nodes participating in the same chemical reaction. Unfortunately, such an approach provides an inadequate description of the metabolic process in general, as a typical chemical reaction will involve more than two nodes, thus risking oversimplification of the system of interest in a potentially significant way. In this paper, we employ a complex hypernetwork formalism to investigate the robustness of bacterial metabolic hypernetworks by extending the concept of a percolation process to hypernetworks. Importantly, this provides a novel method for determining the robustness of these systems and thus for quantifying their resilience to random attacks/errors. Moreover, we performed a site percolation analysis on a large cohort of bacterial metabolic networks and found that hypernetworks that evolved in more variable environments displayed increased levels of robustness and topological complexity.
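As a rough illustration of what site percolation on a hypernetwork can look like, the sketch below removes a set of nodes and reports the size of the largest surviving connected component. It is not the authors' model: the rule that a hyperedge (reaction) survives only if none of its nodes is removed, the function name `largest_component_size`, and the toy hyperedges are assumptions chosen for illustration.

```python
from collections import Counter

def largest_component_size(n_nodes, hyperedges, removed):
    """Size of the largest connected component among surviving nodes.

    Assumption (not taken from the paper): a hyperedge survives only if
    none of its nodes has been removed.
    """
    parent = list(range(n_nodes))

    def find(v):
        # Union-find lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for edge in hyperedges:
        if any(v in removed for v in edge):
            continue                      # the whole hyperedge fails
        root = find(edge[0])
        for v in edge[1:]:
            parent[find(v)] = root        # merge all members of the hyperedge
    sizes = Counter(find(v) for v in range(n_nodes) if v not in removed)
    return max(sizes.values(), default=0)

# Hypothetical toy hypernetwork: three "reactions" over six "metabolites".
hyperedges = [(0, 1, 2), (2, 3), (3, 4, 5)]
intact = largest_component_size(6, hyperedges, removed=set())
after_removing_2 = largest_component_size(6, hyperedges, removed={2})
```

Repeating this for random removal sets of increasing size, and averaging the result, would give a percolation curve under the stated assumption.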
Affiliation(s)
- Nicole Pearcy
  - School of Science and Technology, Department of Physics and Mathematics, Nottingham Trent University, Nottingham NG11 8NS, UK
- Nadia Chuzhanova
  - School of Science and Technology, Department of Physics and Mathematics, Nottingham Trent University, Nottingham NG11 8NS, UK
- Jonathan J Crofts
  - School of Science and Technology, Department of Physics and Mathematics, Nottingham Trent University, Nottingham NG11 8NS, UK
|
17
|
Bonnet E, Calzone L, Michoel T. Integrative multi-omics module network inference with Lemon-Tree. PLoS Comput Biol 2015; 11:e1003983. [PMID: 25679508 PMCID: PMC4332478 DOI: 10.1371/journal.pcbi.1003983] [Citation(s) in RCA: 72] [Impact Index Per Article: 8.0] [Received: 07/29/2014] [Accepted: 10/14/2014] [Indexed: 01/05/2023]
Abstract
Module network inference is an established statistical method to reconstruct co-expression modules and their upstream regulatory programs from integrated multi-omics datasets measuring the activity levels of various cellular components across different individuals, experimental conditions or time points of a dynamic process. We have developed Lemon-Tree, an open-source, platform-independent, modular, extensible software package implementing state-of-the-art ensemble methods for module network inference. We benchmarked Lemon-Tree using large-scale tumor datasets and showed that Lemon-Tree algorithms compare favorably with state-of-the-art module network inference software. We also analyzed a large dataset of somatic copy-number alterations and gene expression levels measured in glioblastoma samples from The Cancer Genome Atlas and found that Lemon-Tree correctly identifies known glioblastoma oncogenes and tumor suppressors as master regulators in the inferred module network. Novel candidate driver genes predicted by Lemon-Tree were validated using tumor pathway and survival analyses. Lemon-Tree is available from http://lemon-tree.googlecode.com under the GNU General Public License version 2.0.
Affiliation(s)
- Eric Bonnet
  - Institut Curie, Paris, France
  - INSERM U900, Paris, France
  - Mines ParisTech, Fontainebleau, France
  - * E-mail: (EB); (TM)
- Laurence Calzone
  - Institut Curie, Paris, France
  - INSERM U900, Paris, France
  - Mines ParisTech, Fontainebleau, France
- Tom Michoel
  - Division of Genetics & Genomics, The Roslin Institute, The University of Edinburgh, Easter Bush, Midlothian, United Kingdom
  - * E-mail: (EB); (TM)
|
18
|
Boccaletti S, Bianconi G, Criado R, del Genio C, Gómez-Gardeñes J, Romance M, Sendiña-Nadal I, Wang Z, Zanin M. The structure and dynamics of multilayer networks. Phys Rep 2014; 544:1-122. [PMID: 32834429 PMCID: PMC7332224 DOI: 10.1016/j.physrep.2014.07.001] [Citation(s) in RCA: 874] [Impact Index Per Article: 87.4] [Accepted: 07/03/2014] [Indexed: 05/05/2023]
Abstract
In the past years, network theory has successfully characterized the interaction among the constituents of a variety of complex systems, ranging from biological to technological, and social systems. However, up until recently, attention was almost exclusively given to networks in which all components were treated on equivalent footing, while neglecting all the extra information about the temporal- or context-related properties of the interactions under study. Only in the last years, taking advantage of the enhanced resolution in real data sets, network scientists have directed their interest to the multiplex character of real-world systems, and explicitly considered the time-varying and multilayer nature of networks. We offer here a comprehensive review on both structural and dynamical organization of graphs made of diverse relationships (layers) between its constituents, and cover several relevant issues, from a full redefinition of the basic structural measures, to understanding how the multilayer nature of the network affects processes and dynamics.
Affiliation(s)
- S. Boccaletti
  - CNR - Institute of Complex Systems, Via Madonna del Piano, 10, 50019 Sesto Fiorentino, Florence, Italy
  - The Italian Embassy in Israel, 25 Hamered st., 68125 Tel Aviv, Israel
- G. Bianconi
  - School of Mathematical Sciences, Queen Mary University of London, London, United Kingdom
- R. Criado
  - Departamento de Matemática Aplicada, Universidad Rey Juan Carlos, 28933 Móstoles, Madrid, Spain
  - Center for Biomedical Technology, Universidad Politécnica de Madrid, 28223 Pozuelo de Alarcón, Madrid, Spain
- C.I. del Genio
  - Warwick Mathematics Institute, University of Warwick, Gibbet Hill Road, Coventry CV4 7AL, United Kingdom
  - Centre for Complexity Science, University of Warwick, Gibbet Hill Road, Coventry CV4 7AL, United Kingdom
  - Warwick Infectious Disease Epidemiology Research (WIDER) Centre, University of Warwick, Gibbet Hill Road, Coventry CV4 7AL, United Kingdom
- J. Gómez-Gardeñes
  - Institute for Biocomputation and Physics of Complex Systems, University of Zaragoza, Zaragoza, Spain
- M. Romance
  - Departamento de Matemática Aplicada, Universidad Rey Juan Carlos, 28933 Móstoles, Madrid, Spain
  - Center for Biomedical Technology, Universidad Politécnica de Madrid, 28223 Pozuelo de Alarcón, Madrid, Spain
- I. Sendiña-Nadal
  - Center for Biomedical Technology, Universidad Politécnica de Madrid, 28223 Pozuelo de Alarcón, Madrid, Spain
  - Complex Systems Group, Universidad Rey Juan Carlos, 28933 Móstoles, Madrid, Spain
- Z. Wang
  - Department of Physics, Hong Kong Baptist University, Kowloon Tong, Hong Kong Special Administrative Region
  - Center for Nonlinear Studies, Beijing–Hong Kong–Singapore Joint Center for Nonlinear and Complex Systems (Hong Kong) and Institute of Computational and Theoretical Studies, Hong Kong Baptist University, Kowloon Tong, Hong Kong Special Administrative Region
- M. Zanin
  - Innaxis Foundation & Research Institute, José Ortega y Gasset 20, 28006 Madrid, Spain
  - Faculdade de Ciências e Tecnologia, Departamento de Engenharia Electrotécnica, Universidade Nova de Lisboa, 2829-516 Caparica, Portugal
|
19
|
Solá L, Romance M, Criado R, Flores J, García del Amo A, Boccaletti S. Eigenvector centrality of nodes in multiplex networks. Chaos 2013; 23:033131. [PMID: 24089967 DOI: 10.1063/1.4818544] [Citation(s) in RCA: 66] [Impact Index Per Article: 6.0] [Indexed: 06/02/2023]
Abstract
We extend the concept of eigenvector centrality to multiplex networks, and introduce several alternative parameters that quantify the importance of nodes in a multi-layered networked system, including the definition of vectorial-type centralities. In addition, we rigorously show that, under reasonable conditions, such centrality measures exist and are unique. Computer experiments and simulations demonstrate that the proposed measures provide substantially different results when applied to the same multiplex structure, and highlight the non-trivial relationships between the different measures of centrality introduced.
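To make concrete the claim that different centrality measures can give different results on the same multiplex structure, the sketch below compares per-layer eigenvector centralities with one simple aggregate: the leading eigenvector of the summed layer adjacency matrices. This aggregate is an illustrative choice and is not claimed to coincide with any of the measures defined in the paper; the two-layer network and the function name `eigenvector_centrality` are hypothetical.

```python
import numpy as np

def eigenvector_centrality(A):
    """Leading eigenvector of a symmetric nonnegative matrix, scaled to sum to 1.

    Assumes the underlying graph is connected, so that the leading
    eigenvector is unique up to sign and has no zero entries.
    """
    eigenvalues, eigenvectors = np.linalg.eigh(A)   # ascending eigenvalue order
    v = np.abs(eigenvectors[:, -1])                 # fix the sign ambiguity
    return v / v.sum()

# Hypothetical two-layer multiplex network on four nodes.
path = np.array([[0, 1, 0, 0],
                 [1, 0, 1, 0],
                 [0, 1, 0, 1],
                 [0, 0, 1, 0]], dtype=float)        # layer 1: path 0-1-2-3
star = np.array([[0, 0, 0, 1],
                 [0, 0, 0, 1],
                 [0, 0, 0, 1],
                 [1, 1, 1, 0]], dtype=float)        # layer 2: star centred on node 3

per_layer = [eigenvector_centrality(layer) for layer in (path, star)]
aggregate = eigenvector_centrality(path + star)
```

In this toy example the path layer favours its two inner nodes, whereas the star layer and the aggregate both rank node 3 highest, so the ranking depends on how the layers are combined.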
Affiliation(s)
- Luis Solá
  - Department of Applied Mathematics, Rey Juan Carlos University, Madrid, Spain 28933
|
20
|
|