1
|
Singh P, Kuder H, Ritz A. Identification of disease modules using higher-order network structure. BIOINFORMATICS ADVANCES 2023; 3:vbad140. [PMID: 37860106 PMCID: PMC10582521 DOI: 10.1093/bioadv/vbad140] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 05/23/2023] [Revised: 09/18/2023] [Accepted: 10/03/2023] [Indexed: 10/21/2023]
Abstract
Motivation Higher-order interaction patterns among proteins have the potential to reveal mechanisms behind molecular processes and diseases. While clustering methods are used to identify functional groups within molecular interaction networks, these methods largely focus on edge density and do not explicitly take into consideration higher-order interactions. Disease genes in these networks have been shown to exhibit rich higher-order structure in their vicinity, and considering these higher-order interaction patterns in network clustering have the potential to reveal new disease-associated modules. Results We propose a higher-order community detection method which identifies community structure in networks with respect to specific higher-order connectivity patterns beyond edges. Higher-order community detection on four different protein-protein interaction networks identifies biologically significant modules and disease modules that conventional edge-based clustering methods fail to discover. Higher-order clusters also identify disease modules from genome-wide association study data, including new modules that were not discovered by top-performing approaches in a Disease Module DREAM Challenge. Our approach provides a more comprehensive view of community structure that enables us to predict new disease-gene associations. Availability and implementation https://github.com/Reed-CompBio/graphlet-clustering.
Collapse
Affiliation(s)
- Pramesh Singh
- Biology Department, Reed College, Portland, OR 97202, United States
- Data Intensive Studies Center, Tufts University, Medford, MA 02155, United States
| | - Hannah Kuder
- Physics Department, Reed College, Portland, OR 97202, United States
| | - Anna Ritz
- Biology Department, Reed College, Portland, OR 97202, United States
| |
Collapse
|
2
|
Jia M, Van Alboom M, Goubert L, Bracke P, Gabrys B, Musial K. Encoding edge type information in graphlets. PLoS One 2022; 17:e0273609. [PMID: 36026434 PMCID: PMC9416998 DOI: 10.1371/journal.pone.0273609] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/02/2022] [Accepted: 08/11/2022] [Indexed: 11/18/2022] Open
Abstract
Graph embedding approaches have been attracting increasing attention in recent years mainly due to their universal applicability. They convert network data into a vector space in which the graph structural information and properties are maximumly preserved. Most existing approaches, however, ignore the rich information about interactions between nodes, i.e., edge attribute or edge type. Moreover, the learned embeddings suffer from a lack of explainability, and cannot be used to study the effects of typed structures in edge-attributed networks. In this paper, we introduce a framework to embed edge type information in graphlets and generate a Typed-Edge Graphlets Degree Vector (TyE-GDV). Additionally, we extend two combinatorial approaches, i.e., the colored graphlets and heterogeneous graphlets approaches to edge-attributed networks. Through applying the proposed method to a case study of chronic pain patients, we find that not only the network structure of a patient could indicate his/her perceived pain grade, but also certain social ties, such as those with friends, colleagues, and healthcare professionals, are more crucial in understanding the impact of chronic pain. Further, we demonstrate that in a node classification task, the edge-type encoded graphlets approaches outperform the traditional graphlet degree vector approach by a significant margin, and that TyE-GDV could achieve a competitive performance of the combinatorial approaches while being far more efficient in space requirements.
Collapse
Affiliation(s)
- Mingshan Jia
- Complex Adaptive Systems Lab, School of Computer Science, University of Technology Sydney, Sydney, NSW, Australia
- * E-mail:
| | | | | | - Piet Bracke
- Health Psychology Lab, Ghent University, Ghent, Belgium
| | - Bogdan Gabrys
- Complex Adaptive Systems Lab, School of Computer Science, University of Technology Sydney, Sydney, NSW, Australia
| | - Katarzyna Musial
- Complex Adaptive Systems Lab, School of Computer Science, University of Technology Sydney, Sydney, NSW, Australia
| |
Collapse
|
3
|
Huang CH, Zaenudin E, Tsai JJ, Kurubanjerdjit N, Ng KL. Network subgraph-based approach for analyzing and comparing molecular networks. PeerJ 2022; 10:e13137. [PMID: 35529499 PMCID: PMC9074881 DOI: 10.7717/peerj.13137] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/19/2021] [Accepted: 02/28/2022] [Indexed: 01/12/2023] Open
Abstract
Molecular networks are built up from genetic elements that exhibit feedback interactions. Here, we studied the problem of measuring the similarity of directed networks by proposing a novel alignment-free approach: the network subgraph-based approach. Our approach does not make use of randomized networks to determine modular patterns embedded in a network, and this method differs from the network motif and graphlet methods. Network similarity was quantified by gauging the difference between the subgraph frequency distributions of two networks using Jensen-Shannon entropy. We applied the subgraph approach to study three types of molecular networks, i.e., cancer networks, signal transduction networks, and cellular process networks, which exhibit diverse molecular functions. We compared the performance of our subgraph detection algorithm with other algorithms, and the results were consistent, but other algorithms could not address the issue of subgraphs/motifs embedded within a subgraph/motif. To evaluate the effectiveness of the subgraph-based method, we applied the method along with the Jensen-Shannon entropy to classify six network models, and it achieves a 100% accuracy of classification. The proposed information-theoretic approach allows us to determine the structural similarity of two networks regardless of node identity and network size. We demonstrated the effectiveness of the subgraph approach to cluster molecular networks that exhibit similar regulatory interaction topologies. As an illustration, our method can identify (i) common subgraph-mediated signal transduction and/or cellular processes in AML and pancreatic cancer, and (ii) scaffold proteins in gastric cancer and hepatocellular carcinoma; thus, the results suggested that there are common regulation modules for cancer formation. We also found that the underlying substructures of the molecular networks are dominated by irreducible subgraphs; this feature is valid for the three classes of molecular networks we studied. The subgraph-based approach provides a systematic scenario for analyzing, compare and classifying molecular networks with diverse functionalities.
Collapse
Affiliation(s)
- Chien-Hung Huang
- Department of Computer Science and Information Engineering, National Formosa University, Yun-Lin, Taiwan
| | - Efendi Zaenudin
- National Research and Innovation Agency, Bandung, Jawa Barat, Republic of Indonesia,Department of Bioinformatics and Medical Engineering, Asia University, Taichung, Taiwan
| | - Jeffrey J.P. Tsai
- Department of Bioinformatics and Medical Engineering, Asia University, Taichung, Taiwan
| | | | - Ka-Lok Ng
- Department of Bioinformatics and Medical Engineering, Asia University, Taichung, Taiwan,Center for Artificial Intelligence and Precision Medicine Research, Asia University, Taichung, Taiwan,Department of Medical Research, China Medical University Hospital, China Medical University, Taichung, Taiwan
| |
Collapse
|
4
|
Zhao Y, Jiang H, Chen Q, Qin Y, Xie H, Wu Y, Liu S, Zhou Z, Xia J, Zhou F. Preserving Minority Structures in Graph Sampling. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2021; 27:1698-1708. [PMID: 33048731 DOI: 10.1109/tvcg.2020.3030428] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/11/2023]
Abstract
Sampling is a widely used graph reduction technique to accelerate graph computations and simplify graph visualizations. By comprehensively analyzing the literature on graph sampling, we assume that existing algorithms cannot effectively preserve minority structures that are rare and small in a graph but are very important in graph analysis. In this work, we initially conduct a pilot user study to investigate representative minority structures that are most appealing to human viewers. We then perform an experimental study to evaluate the performance of existing graph sampling algorithms regarding minority structure preservation. Results confirm our assumption and suggest key points for designing a new graph sampling approach named mino-centric graph sampling (MCGS). In this approach, a triangle-based algorithm and a cut-point-based algorithm are proposed to efficiently identify minority structures. A set of importance assessment criteria are designed to guide the preservation of important minority structures. Three optimization objectives are introduced into a greedy strategy to balance the preservation between minority and majority structures and suppress the generation of new minority structures. A series of experiments and case studies are conducted to evaluate the effectiveness of the proposed MCGS.
Collapse
|
5
|
Huang CH, Zaenudin E, Tsai JJP, Kurubanjerdjit N, Dessie EY, Ng KL. Dissecting molecular network structures using a network subgraph approach. PeerJ 2020; 8:e9556. [PMID: 33005483 PMCID: PMC7512139 DOI: 10.7717/peerj.9556] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/08/2020] [Accepted: 06/25/2020] [Indexed: 11/20/2022] Open
Abstract
Biological processes are based on molecular networks, which exhibit biological functions through interactions of genetic elements or proteins. This study presents a graph-based method to characterize molecular networks by decomposing the networks into directed multigraphs: network subgraphs. Spectral graph theory, reciprocity and complexity measures were used to quantify the network subgraphs. Graph energy, reciprocity and cyclomatic complexity can optimally specify network subgraphs with some degree of degeneracy. Seventy-one molecular networks were analyzed from three network types: cancer networks, signal transduction networks, and cellular processes. Molecular networks are built from a finite number of subgraph patterns and subgraphs with large graph energies are not present, which implies a graph energy cutoff. In addition, certain subgraph patterns are absent from the three network types. Thus, the Shannon entropy of the subgraph frequency distribution is not maximal. Furthermore, frequently-observed subgraphs are irreducible graphs. These novel findings warrant further investigation and may lead to important applications. Finally, we observed that cancer-related cellular processes are enriched with subgraph-associated driver genes. Our study provides a systematic approach for dissecting biological networks and supports the conclusion that there are organizational principles underlying molecular networks.
Collapse
Affiliation(s)
- Chien-Hung Huang
- Department of Computer Science and Information Engineering, National Formosa University, Yunlin, Taiwan
| | - Efendi Zaenudin
- Department of Bioinformatics and Medical Engineering, Asia University, Taichung, Taiwan.,Research Center for Informatics, Indonesian Institute of Sciences, Bandung, Indonesia
| | - Jeffrey J P Tsai
- Department of Bioinformatics and Medical Engineering, Asia University, Taichung, Taiwan
| | | | - Eskezeia Y Dessie
- Department of Bioinformatics and Medical Engineering, Asia University, Taichung, Taiwan
| | - Ka-Lok Ng
- Department of Bioinformatics and Medical Engineering, Asia University, Taichung, Taiwan.,Department of Medical Research, China Medical University Hospital, China Medical University, Taichung, Taiwan
| |
Collapse
|
6
|
Abstract
We develop graphlet analysis for multiplex networks and discuss how this analysis can be extended to multilayer and multilevel networks as well as to graphs with node and/or link categorical attributes. The analysis has been adapted for two typical examples of multiplexes: economic trade data represented as a 957-plex network and 75 social networks each represented as a 12-plex network. We show that wedges (open triads) occur more often in economic trade networks than in social networks, indicating the tendency of a country to produce/trade of a product in local structure of triads which are not closed. Moreover, our analysis provides evidence that the countries with small diversity tend to form correlated triangles. Wedges also appear in the social networks, however the dominant graphlets in social networks are triangles (closed triads). If a multiplex structure indicates a strong tie, the graphlet analysis provides another evidence for the concepts of strong/weak ties and structural holes. In contrast to Granovetter’s seminal work on the strength of weak ties, in which it has been documented that the wedges with only strong ties are absent, here we show that for the analyzed 75 social networks, the wedges with only strong ties are not only present but also significantly correlated.
Collapse
|
7
|
Inhibitory interaction networks among coevolved Streptomyces populations from prairie soils. PLoS One 2019; 14:e0223779. [PMID: 31671139 PMCID: PMC6822729 DOI: 10.1371/journal.pone.0223779] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/04/2019] [Accepted: 09/29/2019] [Indexed: 12/24/2022] Open
Abstract
Soil microbes live within highly complex communities, where community composition, function, and evolution are the product of diverse interactions among community members. Analysis of the complex networks of interactions within communities has the potential to shed light on community stability, functioning, and evolution. However, we have little understanding of the variation in interaction networks among coevolved soil populations. We evaluated networks of antibiotic inhibitory interactions among sympatric Streptomyces communities from prairie soil. Inhibition networks differed significantly in key network characteristics from expectations under null models, largely reflecting variation among Streptomyces in the number of sympatric populations that they inhibited. Moreover, networks of inhibitory interactions within Streptomyces communities differed significantly from each other, suggesting unique network structures among soil communities from different locations. Analyses of tri-partite interactions (triads) showed that some triads were significantly over- or under- represented, and that communities differed in ‘preferred’ triads. These results suggest that local processes generate distinct structures among sympatric Streptomyces inhibition networks in soil. Understanding the properties of microbial interaction networks that generate competitive and functional capacities of soil communities will shed light on the ecological and coevolutionary history of sympatric populations, and provide a foundation for more effective management of inhibitory capacities of soil microbial communities.
Collapse
|
8
|
Sarajlić A, Malod-Dognin N, Yaveroğlu ÖN, Pržulj N. Graphlet-based Characterization of Directed Networks. Sci Rep 2016; 6:35098. [PMID: 27734973 PMCID: PMC5062067 DOI: 10.1038/srep35098] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/29/2016] [Accepted: 09/26/2016] [Indexed: 01/22/2023] Open
Abstract
We are flooded with large-scale, dynamic, directed, networked data. Analyses requiring exact comparisons between networks are computationally intractable, so new methodologies are sought. To analyse directed networks, we extend graphlets (small induced sub-graphs) and their degrees to directed data. Using these directed graphlets, we generalise state-of-the-art network distance measures (RGF, GDDA and GCD) to directed networks and show their superiority for comparing directed networks. Also, we extend the canonical correlation analysis framework that enables uncovering the relationships between the wiring patterns around nodes in a directed network and their expert annotations. On directed World Trade Networks (WTNs), our methodology allows uncovering the core-broker-periphery structure of the WTN, predicting the economic attributes of a country, such as its gross domestic product, from its wiring patterns in the WTN for up-to ten years in the future. It does so by enabling us to track the dynamics of a country's positioning in the WTN over years. On directed metabolic networks, our framework yields insights into preservation of enzyme function from the network wiring patterns rather than from sequence data. Overall, our methodology enables advanced analyses of directed networked data from any area of science, allowing domain-specific interpretation of a directed network's topology.
Collapse
Affiliation(s)
- Anida Sarajlić
- Department of Computing, Imperial College London, SW7 2AZ London, UK
| | - Noël Malod-Dognin
- Department of Computer Science, University College London, WC1E 6BT London, UK
| | | | - Nataša Pržulj
- Department of Computer Science, University College London, WC1E 6BT London, UK
| |
Collapse
|