1
Manipur I, Giordano M, Piccirillo M, Parashuraman S, Maddalena L. Community Detection in Protein-Protein Interaction Networks and Applications. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2023; 20:217-237. [PMID: 34951849 DOI: 10.1109/tcbb.2021.3138142] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 05/04/2023]
Abstract
The ability to identify and characterize not only the protein-protein interactions but also their internal modular organization through network analysis is fundamental for understanding the mechanisms of biological processes at the molecular level. Indeed, the detection of network communities can enhance our understanding of the molecular basis of disease pathology, and promote drug discovery and disease treatment in personalized medicine. This work gives an overview of recent computational methods for the detection of protein complexes and functional modules in protein-protein interaction networks, with a focus on some of their applications. We propose a systematic reformulation of frequently adopted taxonomies for these methods, and add new categories to keep up with the most recent research. We review the literature of the last five years (2017-2021) and provide links to existing data and software resources. Finally, we survey recent works exploiting module identification and analysis, in the context of a variety of disease processes for biomarker identification and therapeutic target detection. Our review provides the interested reader with an up-to-date and self-contained view of the existing research, with links to state-of-the-art literature and resources, as well as hints on open issues and future research directions in complex detection and its applications.
2
Smell Detection Agent Optimisation Framework and Systems Biology Approach to Detect Dys-Regulated Subnetwork in Cancer Data. Biomolecules 2021; 12:biom12010037. [PMID: 35053185 PMCID: PMC8774275 DOI: 10.3390/biom12010037] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/05/2021] [Revised: 12/01/2021] [Accepted: 12/02/2021] [Indexed: 11/23/2022]
Abstract
Network biology has become a key tool in unravelling the mechanisms of complex diseases. Detecting dys-regulated subnetworks from molecular networks is a task that needs efficient computational methods. In this work, we constructed an integrated network using gene interaction data as well as protein–protein interaction data of differentially expressed genes derived from the microarray gene expression data. We considered the level of differential expression as well as the topological weight of proteins in the interaction network to quantify dys-regulation. Then, a nature-inspired Smell Detection Agent (SDA) optimisation algorithm is designed with multiple agents traversing through various paths in the network. Finally, the algorithm provides a maximum weighted module as the optimum dys-regulated subnetwork. The analysis is performed for samples of triple-negative breast cancer as well as colorectal cancer. Biological significance analysis of module genes is also done to validate the results. The breast cancer subnetwork is found to contain (i) valid biomarkers including PIK3CA, PTEN, BRCA1, AR and EGFR; (ii) validated drug targets TOP2A, CDK4, HDAC1, IL6, BRCA1, HSP90AA1 and AR; (iii) synergistic drug targets EGFR and BIRC5. Moreover, based on the weight values assigned to nodes in the subnetwork, PLK1, CTNNB1, IGF1, AURKA, PCNA, HSPA4 and GAPDH are proposed as drug targets for further studies. For the colorectal cancer module, the analysis revealed the occurrence of approved drug targets TYMS, TOP1, BRAF and EGFR. Considering the higher weight values, HSP90AA1, CCNB1, AKT1 and CXCL8 are proposed as drug targets for experimentation. The derived subnetworks possess cancer-related pathways as well. The SDA-derived breast cancer subnetwork was compared with those of tools such as MCODE and Minimum Spanning Tree, and showed a higher enrichment (75%) of significant elements. Thus, the proposed nature-inspired algorithm is a novel approach to derive the optimum dys-regulated subnetwork from a huge molecular network.
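The idea of scoring nodes by both differential expression and topology, then searching for a heavy connected module, can be illustrated with a minimal sketch. This is not the SDA algorithm from the abstract: it swaps in a plain greedy expansion, and the weighting formula, the `alpha` mixing parameter, and the function names are all assumptions made for illustration.

```python
from typing import Dict, Set

Graph = Dict[str, Set[str]]  # undirected adjacency: node -> set of neighbours


def node_weights(graph: Graph, diff_expr: Dict[str, float], alpha: float = 0.5) -> Dict[str, float]:
    """Mix normalised |differential expression| with normalised degree.

    Both the formula and `alpha` are illustrative assumptions, not the
    weighting used in the paper.
    """
    max_deg = max((len(nbrs) for nbrs in graph.values()), default=0) or 1
    max_de = max((abs(v) for v in diff_expr.values()), default=0.0) or 1.0
    return {
        n: alpha * abs(diff_expr.get(n, 0.0)) / max_de
        + (1 - alpha) * len(graph[n]) / max_deg
        for n in graph
    }


def greedy_module(graph: Graph, weights: Dict[str, float], seed: str, size: int) -> Set[str]:
    """Grow a connected module from `seed`, adding the heaviest neighbour each step."""
    module = {seed}
    while len(module) < size:
        frontier = {v for u in module for v in graph[u]} - module
        if not frontier:
            break  # the connected component is exhausted
        # sorted() makes tie-breaking deterministic (alphabetical order)
        module.add(max(sorted(frontier), key=lambda v: weights[v]))
    return module
```

A greedy search like this can get stuck in local optima, which is one motivation for multi-agent or population-based searches such as the one the abstract describes.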
3
Pasquier C, Robichon A. Temporal and sequential order of nonoverlapping gene networks unraveled in mated female Drosophila. Life Sci Alliance 2021; 5:5/2/e202101119. [PMID: 34844981 PMCID: PMC8645335 DOI: 10.26508/lsa.202101119] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/12/2021] [Revised: 11/11/2021] [Accepted: 11/12/2021] [Indexed: 12/13/2022]
Abstract
Mating triggers successive waves of temporal transcriptomic changes within independent gene networks in female Drosophila, suggesting a recruitment of interconnected modules that vanish in late life. In this study, we reanalyzed available datasets of gene expression changes in female Drosophila head induced by mating. Mated females present metabolic phenotypic changes and display behavioral characteristics that are not observed in virgin females, such as repulsion to male sexual aggressiveness, fidelity to food spots selected for oviposition, and restriction to the colonization of new niches. We characterize gene networks that play a role in female brain plasticity after mating using AMINE, a novel algorithm to find dysregulated modules of interacting genes. The uncovered networks of altered genes revealed a strong specificity for each successive period of life span after mating in the female head, with little conservation between them. This finding highlights a temporal order of recruitment of waves of interconnected genes which are apparently transiently modified: the first wave disappears before the emergence of the second wave in a reversible manner and ends with few consolidated gene expression changes at day 20. This analysis might document an extended field of a programmatic control of female phenotypic traits by male seminal fluid.
4
Li D, Pan Z, Hu G, Anderson G, He S. Active Module Identification From Multilayer Weighted Gene Co-Expression Networks: A Continuous Optimization Approach. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2021; 18:2239-2248. [PMID: 32011261 DOI: 10.1109/tcbb.2020.2970400] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/10/2023]
Abstract
Searching for active modules, i.e., regions showing striking changes in molecular activity in biological networks, is important to reveal regulatory and signaling mechanisms of biological systems. Most existing active module identification methods are based on protein-protein interaction networks or metabolic networks, which require comprehensive and accurate prior knowledge. On the other hand, weighted gene co-expression networks (WGCNs) are constructed purely from gene expression profiles. However, existing WGCN analysis methods are designed for identifying functional modules and are not capable of identifying active modules. There is an urgent need to develop an active module identification algorithm for WGCNs to discover regulatory and signaling mechanisms associated with a given cellular response. To address this urgent need, we propose a novel algorithm called active modules on the multi-layer weighted (co-expression gene) network, based on a continuous optimization approach (AMOUNTAIN). The algorithm is capable of identifying active modules not only from single-layer WGCNs but also from multilayer WGCNs such as cross-species and dynamic WGCNs. We first validate AMOUNTAIN on a synthetic benchmark dataset. We then apply AMOUNTAIN to WGCNs constructed from Th17 differentiation gene expression datasets of human and mouse, which include a single-layer, a cross-species two-layer, and a multilayer dynamic WGCN. The identified active modules from WGCNs are enriched by known protein-protein interactions, and more importantly, they reveal some interesting and important regulatory and signaling mechanisms of Th17 cell differentiation.
5
A multi-objective genetic algorithm to find active modules in multiplex biological networks. PLoS Comput Biol 2021; 17:e1009263. [PMID: 34460810 PMCID: PMC8452006 DOI: 10.1371/journal.pcbi.1009263] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 09/23/2020] [Revised: 09/20/2021] [Accepted: 07/09/2021] [Indexed: 12/13/2022]
Abstract
The identification of subnetworks of interest, or active modules, by integrating biological networks with molecular profiles is a key resource to inform on the processes perturbed in different cellular conditions. We here propose MOGAMUN, a Multi-Objective Genetic Algorithm to identify active modules in MUltiplex biological Networks. MOGAMUN optimizes both the density of interactions and the scores of the nodes (e.g., their differential expression). We compare MOGAMUN with state-of-the-art methods, representative of different algorithms dedicated to the identification of active modules in single networks. MOGAMUN identifies dense and high-scoring modules that are also easier to interpret. In addition, to our knowledge, MOGAMUN is the first method able to use multiplex networks. Multiplex networks are composed of different layers of physical and functional relationships between genes and proteins. Each layer is associated to its own meaning, topology, and biases; the multiplex framework allows exploiting this diversity of biological networks. We applied MOGAMUN to identify cellular processes perturbed in Facio-Scapulo-Humeral muscular Dystrophy, by integrating RNA-seq expression data with a multiplex biological network. We identified different active modules of interest, thereby providing new angles for investigating the pathomechanisms of this disease. Availability: MOGAMUN is available at https://github.com/elvanov/MOGAMUN and as a Bioconductor package at https://bioconductor.org/packages/release/bioc/html/MOGAMUN.html. Contact: anais.baudot@univ-amu.fr
Integrating different sources of biological information is a powerful way to uncover the functioning of biological systems. In network biology, in particular, integrating interaction data with expression profiles helps contextualizing the networks and identifying subnetworks of interest, aka active modules. We here propose MOGAMUN, a multi-objective genetic algorithm that optimizes both the overall deregulation and the density to identify active modules, considering jointly multiple sources of biological interactions. We demonstrate the performance of MOGAMUN over state-of-the-art methods, and illustrate its usefulness in unveiling perturbed biological processes in Facio-Scapulo-Humeral muscular Dystrophy.
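The two objectives named in the abstract (node scores and interaction density) and the notion of a multi-objective trade-off can be sketched as follows. This is a hedged illustration rather than MOGAMUN's implementation: averaging density across layers, the edge-list representation, and every function name here are assumptions, and the genetic search itself is omitted.

```python
from typing import Dict, List, Sequence, Set, Tuple

Edge = Tuple[str, str]  # undirected edge, listed once per layer


def average_node_score(module: Set[str], scores: Dict[str, float]) -> float:
    """Mean node score (e.g. a differential-expression score) over the module."""
    return sum(scores[n] for n in module) / len(module)


def multiplex_density(module: Set[str], layers: Sequence[List[Edge]]) -> float:
    """Mean over layers of (edges inside the module) / (possible node pairs).

    Averaging across layers is an assumption made for this sketch.
    """
    pairs = len(module) * (len(module) - 1) / 2
    if pairs == 0:
        return 0.0
    per_layer = [
        sum(1 for u, v in edges if u in module and v in module) / pairs
        for edges in layers
    ]
    return sum(per_layer) / len(per_layer)


def dominates(a: Tuple[float, ...], b: Tuple[float, ...]) -> bool:
    """True if `a` is at least as good as `b` everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(objectives: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Keep the objective vectors that no other vector dominates."""
    return [c for c in objectives if not any(dominates(o, c) for o in objectives)]
```

A multi-objective search returns the whole non-dominated set rather than a single best module, so a candidate with a high score but low density and one with the opposite profile can both survive.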
6
Liang L, Chen V, Zhu K, Fan X, Lu X, Lu S. Integrating data and knowledge to identify functional modules of genes: a multilayer approach. BMC Bioinformatics 2019; 20:225. [PMID: 31046665 PMCID: PMC6498600 DOI: 10.1186/s12859-019-2800-y] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 12/11/2018] [Accepted: 04/09/2019] [Indexed: 01/02/2023]
Abstract
BACKGROUND Characterizing the modular structure of cellular networks is an important way to identify novel genes for targeted therapeutics. This is made possible by the rise of high-throughput technology. Unfortunately, computational methods to identify functional modules have been limited by the data quality issues of high-throughput techniques. This study aims to integrate knowledge extracted from literature to further improve the accuracy of functional module identification. RESULTS Our new model and algorithm were applied to both yeast and human interactomes. Predicted functional modules covered over 90% of the proteins in both organisms, while maintaining a comparable overall accuracy. We found that the combination of both mRNA expression information and biomedical knowledge greatly improved the performance of functional module identification, which is better than those only using protein interaction network weighted with transcriptomic data, literature knowledge, or simply unweighted protein interaction network. Our new algorithm also achieved better performance when compared with some other well-known methods, especially in terms of the positive predictive value (PPV), which indicated the confidence of novel discovery. CONCLUSION Higher PPV with the multiplex approach suggested that information from both sources has been effectively integrated to reduce false positives. With protein coverage higher than 90%, our algorithm is able to generate more novel biological hypotheses with higher confidence.
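For readers unfamiliar with the metric, positive predictive value in its generic form is the fraction of positive predictions that are correct. The sketch below uses that textbook definition; the paper may use a clustering-specific variant, so the function name and the counting scheme are assumptions.

```python
def positive_predictive_value(true_positives: int, false_positives: int) -> float:
    """Generic PPV = TP / (TP + FP); returns 0.0 when nothing was predicted."""
    predicted = true_positives + false_positives
    return true_positives / predicted if predicted else 0.0
```

Under this definition, lowering the number of false positives at a fixed number of true positives raises PPV, which matches the abstract's reading of a higher PPV as fewer false positives.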
Affiliation(s)
- Lifan Liang: Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA, USA
- Vicky Chen: Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA, USA; Frederick National Laboratory for Cancer Research, Leidos Biomedical Research, Inc, Frederick, USA
- Kunju Zhu: Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA, USA; Clinical Medicine Research Institute, Jinan University, Guangzhou, 51063, Guangdong, China
- Xiaonan Fan: Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA, USA; Key Laboratory of Information Fusion Technology of Ministry of Education, School of Automation, Northwestern Polytechnical University, Xi'an, 710072, Shaanxi, China
- Xinghua Lu: Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA, USA
- Songjian Lu: Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA, USA
7
Nguyen H, Shrestha S, Tran D, Shafi A, Draghici S, Nguyen T. A Comprehensive Survey of Tools and Software for Active Subnetwork Identification. Front Genet 2019; 10:155. [PMID: 30891064 PMCID: PMC6411791 DOI: 10.3389/fgene.2019.00155] [Citation(s) in RCA: 37] [Impact Index Per Article: 7.4] [Received: 09/25/2018] [Accepted: 02/13/2019] [Indexed: 12/13/2022]
Abstract
A recent focus of computational biology has been to integrate the complementary information available in molecular profiles as well as in multiple network databases in order to identify connected regions that show significant changes under different conditions. This allows for capturing dynamic and condition-specific mechanisms of the underlying phenomena and disease stages. Here we review 22 such integrative approaches for active module identification published over the last decade. This article only focuses on tools that are currently available for use and are well-maintained. We compare these methods focusing on their primary features, integrative abilities, network structures, mathematical models, and implementations. We also provide real-world scenarios in which these methods have been successfully applied, as well as highlight outstanding challenges in the field that remain to be addressed. The main objective of this review is to help potential users and researchers to choose the best method that is suitable for their data and analysis purpose.
Affiliation(s)
- Hung Nguyen: Department of Computer Science and Engineering, University of Nevada, Reno, NV, United States
- Sangam Shrestha: Department of Computer Science and Engineering, University of Nevada, Reno, NV, United States
- Duc Tran: Department of Computer Science and Engineering, University of Nevada, Reno, NV, United States
- Adib Shafi: Department of Computer Science, Wayne State University, Detroit, MI, United States
- Sorin Draghici: Department of Computer Science, Wayne State University, Detroit, MI, United States; Department of Obstetrics and Gynecology, Wayne State University, Detroit, MI, United States
- Tin Nguyen: Department of Computer Science and Engineering, University of Nevada, Reno, NV, United States