1
Menor-Flores M, Vega-Rodríguez MA. A protein-protein interaction network aligner study in the multi-objective domain. Computer Methods and Programs in Biomedicine 2024; 250:108188. [PMID: 38657382 DOI: 10.1016/j.cmpb.2024.108188] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/10/2023] [Revised: 04/14/2024] [Accepted: 04/17/2024] [Indexed: 04/26/2024]
Abstract
BACKGROUND AND OBJECTIVE Protein-protein interaction (PPI) network alignment has proven to be an efficient technique in the diagnosis and prevention of certain diseases. However, the difficulty of simultaneously maximizing the two qualities that measure the goodness of an alignment (topological and biological quality) has led aligners to produce very different alignments, which makes a comparative study among alignments of such different qualities a big challenge. Multi-objective optimization is a computational method that is very powerful in this kind of context because both conflicting qualities are considered together. Analysing the alignments of each PPI network aligner with multi-objective methodologies gives a broader picture of the alignments and their qualities and leads to interesting conclusions. This paper proposes a comprehensive PPI network aligner study in the multi-objective domain.
METHODS Alignments from each aligner and from all aligners together were studied and compared to each other via Pareto dominance methodologies. The best alignments produced by each aligner and by all aligners together for five different alignment scenarios were displayed in Pareto front graphs. The aligners were then ranked according to the topological, biological, and combined quality of their alignments. Finally, the aligners were also ranked based on their average runtimes.
RESULTS Regarding the aligners constructing the best overall alignments, we found that SAlign, BEAMS, SANA, and HubAlign are the best options. Additionally, the alignments of best topological quality are produced by the SANA, SAlign, and HubAlign aligners. In contrast, the aligners returning the alignments of best biological quality are BEAMS, TAME, and WAVE. However, if there are time constraints, it is recommended to select SAlign to obtain alignments of high topological quality and PISwap or SAlign for alignments of high biological quality.
CONCLUSIONS The SANA aligner is recommended for obtaining the alignments of best topological quality, BEAMS for the alignments of best biological quality, and SAlign for the alignments of best combined topological and biological quality. However, SANA and BEAMS have above-average runtimes. Therefore, when time restrictions apply, it is suggested to choose other, faster aligners such as SAlign or PISwap, whose alignments are also of high quality.
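The Pareto dominance comparison mentioned in METHODS can be illustrated with a minimal Python sketch. It assumes each alignment is summarized by a pair of scores (topological, biological) where higher is better in both; the labels and score values below are hypothetical and are not results from the paper, and the function names are chosen here for illustration only.

```python
from typing import List, Tuple

# (label, topological score, biological score); higher is better in both.
Alignment = Tuple[str, float, float]


def dominates(a: Alignment, b: Alignment) -> bool:
    """True if a is at least as good as b in both objectives and strictly better in one."""
    at_least_as_good = a[1] >= b[1] and a[2] >= b[2]
    strictly_better = a[1] > b[1] or a[2] > b[2]
    return at_least_as_good and strictly_better


def pareto_front(alignments: List[Alignment]) -> List[Alignment]:
    """Keep only the alignments that no other alignment dominates."""
    return [a for a in alignments
            if not any(dominates(b, a) for b in alignments)]


# Hypothetical scores, for illustration only.
alignments = [
    ("A", 0.90, 0.20),
    ("B", 0.60, 0.60),
    ("C", 0.30, 0.85),
    ("D", 0.50, 0.50),  # dominated by B
    ("E", 0.25, 0.80),  # dominated by C
]

front = pareto_front(alignments)
print([a[0] for a in front])
```

Alignments on the front are mutually non-dominated: each trades topological quality against biological quality, which is why a single ranking requires fixing one objective or a combined measure.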
Affiliation(s)
- Manuel Menor-Flores
- Escuela Politécnica, Universidad de Extremadura, Campus Universitario s/n, 10003 Cáceres, Spain.
- Miguel A Vega-Rodríguez
- Escuela Politécnica, Universidad de Extremadura, Campus Universitario s/n, 10003 Cáceres, Spain.
2
Menor-Flores M, Vega-Rodríguez MA. Boosting-based ensemble of global network aligners for PPI network alignment. Expert Systems with Applications 2023; 230:120671. [DOI: 10.1016/j.eswa.2023.120671] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/03/2025]
3
Kernel Embedding Transformation Learning for Graph Matching. Pattern Recognit Lett 2022. [DOI: 10.1016/j.patrec.2022.09.016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]
4
Wang S, Chen X, Frederisy BJ, Mbakogu BA, Kanne AD, Khosravi P, Hayes WB. On the current failure-but bright future-of topology-driven biological network alignment. Advances in Protein Chemistry and Structural Biology 2022; 131:1-44. [PMID: 35871888 DOI: 10.1016/bs.apcsb.2022.05.005] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 01/26/2023]
Abstract
Since the function of a protein is defined by its interaction partners, and since we expect similar interaction patterns across species, the alignment of protein-protein interaction (PPI) networks between species, based on network topology alone, should uncover functionally related proteins across species. Surprisingly, despite the publication of more than fifty algorithms aimed at performing PPI network alignment, few have demonstrated a statistically significant link between network topology and functional similarity, and none have demonstrated that orthologs can be recovered using network topology alone. We find that the major contributing factors to this surprising failure are: (i) edge densities in most currently available experimental PPI networks are demonstrably too low to expect topological network alignment to succeed; (ii) in the few cases where the edge densities are high enough, some measures of topological similarity easily uncover functionally similar proteins while others do not; and (iii) most network alignment algorithms to date perform poorly at optimizing even their own topological objective functions, hampering their ability to use topology effectively. We demonstrate that SANA (the Simulated Annealing Network Aligner) significantly outperforms existing aligners at optimizing their own objective functions, even achieving near-optimal solutions when the optimal solution is known. We offer the first demonstration of global network alignments based on topology alone that align functionally similar proteins with p-values in some cases below $10^{-300}$. We predict that topological network alignment has a bright future as edge densities increase toward the value where good alignments become possible.
We demonstrate that when enough common topology is present at high enough edge densities (for example, in the recent, partly synthetic networks of the Integrated Interaction Database), topological network alignment easily recovers most orthologs, paving the way toward high-throughput functional prediction based on topology-driven network alignment.
Affiliation(s)
- Siyue Wang
- Department of Computer Science, University of California, Irvine, CA, United States.
- Xiaoyin Chen
- Department of Computer Science, University of California, Irvine, CA, United States.
- Brent J Frederisy
- Department of Computer Science, University of California, Irvine, CA, United States.
- Benedict A Mbakogu
- Department of Computer Science, University of California, Irvine, CA, United States.
- Amy D Kanne
- Department of Computer Science, University of California, Irvine, CA, United States.
- Pasha Khosravi
- Department of Computer Science, University of California, Irvine, CA, United States.
- Wayne B Hayes
- Department of Computer Science, University of California, Irvine, CA, United States.
5
Ma L, Shao Z, Li L, Huang J, Wang S, Lin Q, Li J, Gong M, Nandi AK. Heuristics and metaheuristics for biological network alignment: A review. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2021.08.156] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/26/2022]
6
7
Zhu H, Cui C, Deng L, Cheung RCC, Yan H. Elastic Net Constraint-Based Tensor Model for High-Order Graph Matching. IEEE Transactions on Cybernetics 2021; 51:4062-4074. [PMID: 31536028 DOI: 10.1109/tcyb.2019.2936176] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 06/10/2023]
Abstract
Establishing the correspondence between two sets of feature points is important in computer vision applications. In this article, an elastic net constraint-based tensor model is proposed for high-order graph matching. To control the tradeoff between the sparsity and the accuracy of the matching results, an elastic net constraint is introduced into the tensor-based graph matching model. Then, a nonmonotone spectral projected gradient (NSPG) method is derived to solve the proposed matching model. For the optimization with NSPG, we propose an algorithm to calculate the projection onto the feasible convex set of the elastic net constraint. Further, the global convergence of the NSPG method on the proposed model is proved. The superiority of the proposed method is verified through experiments on synthetic data and natural images.
8
Woo HM, Yoon BJ. MONACO: accurate biological network alignment through optimal neighborhood matching between focal nodes. Bioinformatics 2021; 37:1401-1410. [PMID: 33165517 DOI: 10.1093/bioinformatics/btaa962] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 11/02/2019] [Revised: 10/19/2020] [Accepted: 11/02/2020] [Indexed: 11/12/2022] Open
Abstract
MOTIVATION Alignment of protein-protein interaction networks can be used for the unsupervised prediction of functional modules, such as protein complexes and signaling pathways, that are conserved across different species. To date, various algorithms have been proposed for biological network alignment, many of which attempt to incorporate topological similarity between the networks into the alignment process with the goal of constructing accurate and biologically meaningful alignments. In particular, random walk models have been shown to be effective for quantifying the global topological relatedness between nodes that belong to different networks by diffusing node-level similarity along the interaction edges. However, these schemes are not ideal for capturing the local topological similarity between nodes.
RESULTS In this article, we propose MONACO, a novel and versatile network alignment algorithm that finds highly accurate pairwise and multiple network alignments through the iterative optimal matching of 'local' neighborhoods around focal nodes. Extensive performance assessment based on real networks as well as synthetic networks, for which the ground truth is known, demonstrates that MONACO clearly and consistently outperforms all other state-of-the-art network alignment algorithms that we have tested, in terms of accuracy, coherence and topological quality of the aligned network regions. Furthermore, despite the sharply enhanced alignment accuracy, MONACO remains computationally efficient and scales well with increasing size and number of networks.
AVAILABILITY AND IMPLEMENTATION A Matlab implementation is freely available at https://github.com/bjyoontamu/MONACO.
SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Hyun-Myung Woo
- Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX 77843, USA.
- Byung-Jun Yoon
- Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX 77843, USA.
- TEES-AgriLife Center for Bioinformatics and Genomic Systems Engineering, Texas A&M University, College Station, TX 77845, USA.
- Computational Science Initiative, Brookhaven National Laboratory, Upton, NY 11973, USA.
9
Woo HM, Jeong H, Yoon BJ. NAPAbench 2: A network synthesis algorithm for generating realistic protein-protein interaction (PPI) network families. PLoS One 2020; 15:e0227598. [PMID: 31986158 PMCID: PMC6984706 DOI: 10.1371/journal.pone.0227598] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 09/10/2019] [Accepted: 12/23/2019] [Indexed: 11/18/2022] Open
Abstract
Comparative network analysis provides effective computational means for gaining novel insights into the structural and functional compositions of biological networks. In recent years, various methods have been developed for biological network alignment, whose main goal is to identify important similarities and critical differences between networks in terms of their topology and composition. A major impediment to advancing network alignment techniques has been the lack of gold-standard benchmarks that can be used for accurate and comprehensive performance assessment of such algorithms. The original NAPAbench (network alignment performance assessment benchmark) was developed to address this problem, and it has been widely utilized by many researchers for the development, evaluation, and comparison of novel network alignment techniques. In this work, we introduce NAPAbench 2, a major update of the original NAPAbench that was introduced in 2012. NAPAbench 2 includes a completely redesigned network synthesis algorithm that can generate protein-protein interaction (PPI) network families whose characteristics closely match those of the latest real PPI networks. Furthermore, the network synthesis algorithm comes with an intuitive GUI that allows users to easily generate PPI network families with an arbitrary number of networks of any size, according to a flexible user-defined phylogeny. In addition, NAPAbench 2 provides updated benchmark datasets, created using the redesigned network synthesis algorithm, which can be used for comprehensive performance assessment of network alignment algorithms and their scalability.
Affiliation(s)
- Hyun-Myung Woo
- Department of Electrical and Computer Engineering, Texas A&M University, College Station, Texas, United States of America.
- Hyundoo Jeong
- Department of Mechatronics Engineering, Incheon National University, Incheon, Republic of Korea.
- Byung-Jun Yoon
- Department of Electrical and Computer Engineering, Texas A&M University, College Station, Texas, United States of America.
- TEES-AgriLife Center for Bioinformatics and Genomic Systems Engineering, Texas A&M University, College Station, TX, United States of America.
- Computational Science Initiative, Brookhaven National Laboratory, Upton, NY, United States of America.
10
Shen T, Zhang Z, Chen Z, Gu D, Liang S, Xu Y, Li R, Wei Y, Liu Z, Yi Y, Xie X. A genome-scale metabolic network alignment method within a hypergraph-based framework using a rotational tensor-vector product. Sci Rep 2018; 8:16376. [PMID: 30401914 PMCID: PMC6219566 DOI: 10.1038/s41598-018-34692-1] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 03/22/2018] [Accepted: 10/23/2018] [Indexed: 12/14/2022] Open
Abstract
Biological network alignment aims to discover important similarities and differences and thus find a mapping between topological and/or functional components of different biological molecular networks. The mapped components can then be considered to correspond in both their places in the network topology and their biological attributes. The development and evolution of biological network alignment methods have been accelerated by the rapidly increasing availability of such biological networks, yielding a repertoire of tens of methods based upon graph theory. However, most biological processes, especially metabolic reactions, are more sophisticated than simple pairwise interactions and contain three or more participating components. Such multi-lateral relations are not captured by graphs, and computational methods to overcome this limitation are currently lacking. This paper introduces hypergraphs and association hypergraphs to describe metabolic networks and their potential alignments, respectively. Within this framework, metabolic networks are aligned by identifying the maximal Z-eigenvalue of a symmetric tensor. A shifted higher-order power method was utilized to identify a solution. A rotational strategy has been introduced to accelerate the tensor-vector product by 250-fold on average and reduce the storage cost by up to 1,000-fold. The algorithm was implemented on a Spark-based distributed computation cluster to increase the convergence rate further by 50- to 80-fold. The parameters have been explored to understand their impact on alignment accuracy and speed. In particular, the influence of initial value selection on the stationary point has been simulated to ensure an accurate approximation of the global optimum. This framework was demonstrated by alignments among the genome-wide metabolic networks of Escherichia coli MG-1655 and Halophilic archaeon DL31.
To our knowledge, this is the first genome-wide metabolic network alignment at both the metabolite level and the enzyme level. These results demonstrate that it can supply quite a few valuable insights into metabolic networks. First, this method can access the driving force of organic reactions through the chemical evolution of the metabolic network. Second, this method can incorporate the chemical information of enzymes and structural changes of compounds to offer a new way of defining reaction classes and modules, such as those in KEGG. Third, as a vertex-focused treatment, this method can supply novel structural and functional annotation for ill-defined molecules. The related source code is available on request.
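The idea of finding a Z-eigenpair of a symmetric tensor with a shifted higher-order power iteration can be sketched in plain Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: it handles a small dense third-order tensor, it omits the rotational tensor-vector product and the Spark cluster described in the abstract, and the function names, the shift value, the tolerance, and the toy rank-one tensor are all chosen here for illustration.

```python
import math


def tensor_apply(T, x):
    """Return the vector (T x^2)_i = sum over j, k of T[i][j][k] * x[j] * x[k]."""
    n = len(x)
    return [sum(T[i][j][k] * x[j] * x[k] for j in range(n) for k in range(n))
            for i in range(n)]


def normalize(v):
    """Scale v to unit Euclidean length."""
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def rayleigh(T, x):
    """Z-eigenvalue estimate lambda = x . (T x^2) for a unit vector x."""
    return sum(xi * yi for xi, yi in zip(x, tensor_apply(T, x)))


def shifted_power_method(T, x0, shift=1.0, tol=1e-12, max_iter=1000):
    """Iterate x <- normalize(T x^2 + shift * x) until lambda stops changing."""
    x = normalize(x0)
    lam = rayleigh(T, x)
    for _ in range(max_iter):
        y = tensor_apply(T, x)
        x = normalize([yi + shift * xi for yi, xi in zip(y, x)])
        new_lam = rayleigh(T, x)
        converged = abs(new_lam - lam) < tol
        lam = new_lam
        if converged:
            break
    return lam, x


# Toy symmetric tensor T = v (x) v (x) v with unit v, so (1, v) is a Z-eigenpair.
v = [0.6, 0.8]
T = [[[v[i] * v[j] * v[k] for k in range(2)] for j in range(2)] for i in range(2)]

lam, x = shifted_power_method(T, x0=[1.0, 0.0])
print(round(lam, 6), [round(c, 4) for c in x])
```

With a positive shift the iteration climbs toward a local maximum of x . (T x^2) on the unit sphere, so the result can depend on the starting vector, which is consistent with the abstract's remark about the influence of initial value selection.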
Affiliation(s)
- Tie Shen
- Key Laboratory of Information and Computing Science Guizhou Province, Guizhou Normal University, Guiyang, Guizhou, China.
- Zhengdong Zhang
- College of Mathematics and Information Science, Guiyang University, Guiyang, Guizhou, China.
- Zhen Chen
- College of Mathematical Science, Guizhou Normal University, Guiyang, Guizhou, China.
- Dagang Gu
- College of Mathematics and Information Science, Guiyang University, Guiyang, Guizhou, China.
- Shen Liang
- College of Mathematics and Information Science, Guiyang University, Guiyang, Guizhou, China.
- Yang Xu
- Key Laboratory of Information and Computing Science Guizhou Province, Guizhou Normal University, Guiyang, Guizhou, China.
- Ruiyuan Li
- Key Laboratory of Information and Computing Science Guizhou Province, Guizhou Normal University, Guiyang, Guizhou, China.
- Yimin Wei
- School of Mathematics Sciences and Key Laboratory of Mathematics for Nonlinear Sciences, Fudan University, Shanghai, China.
- Zhijie Liu
- Key Laboratory of Information and Computing Science Guizhou Province, Guizhou Normal University, Guiyang, Guizhou, China.
- Yin Yi
- Key Laboratory of State Forestry Administration on Biodiversity Conservation in Karst of Southwest Areas China, Guizhou Normal University, Guiyang, Guizhou, China.
- Xiaoyao Xie
- Key Laboratory of Information and Computing Science Guizhou Province, Guizhou Normal University, Guiyang, Guizhou, China.