1. Yazdani K, Mousapour R, Hayes WB. New GO-based measures in multiple network alignment. Bioinformatics 2024; 40:btae476. [PMID: 39082966 PMCID: PMC11310457 DOI: 10.1093/bioinformatics/btae476] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/20/2024] [Revised: 06/11/2024] [Accepted: 07/30/2024] [Indexed: 08/10/2024]
Abstract
MOTIVATION: Protein-protein interaction (PPI) networks provide valuable insights into the function of biological systems. Aligning multiple PPI networks may expose relationships beyond those observable by pairwise comparisons. However, assessing the biological quality of multiple network alignments is a challenging problem. RESULTS: We propose two new measures to evaluate the quality of multiple network alignments using functional information from Gene Ontology (GO) terms. When aligning multiple real PPI networks across species, we observe that both measures are highly correlated with objective quality indicators, such as common orthologs. Additionally, our measures strongly correlate with an alignment's ability to predict novel GO annotations, which is a unique advantage over existing GO-based measures. AVAILABILITY AND IMPLEMENTATION: The scripts and the links to the raw and alignment data can be accessed at https://github.com/kimiayazdani/GO_Measures.git.
Affiliation(s)
- Kimia Yazdani
  - Department of Computer Science, University of California, Irvine, CA 92697-3435, United States
- Reza Mousapour
  - Department of Computer Engineering, Sharif University of Technology, Tehran 1458889694, Iran
- Wayne B Hayes
  - Department of Computer Science, University of California, Irvine, CA 92697-3435, United States
2. Singh V, Singh V. Characterizing the circadian connectome of Ocimum tenuiflorum using an integrated network theoretic framework. Sci Rep 2023; 13:13108. [PMID: 37567911 PMCID: PMC10421869 DOI: 10.1038/s41598-023-40212-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/05/2023] [Accepted: 08/07/2023] [Indexed: 08/13/2023]
Abstract
Across the three domains of life, the circadian clock is known to regulate vital physiological processes such as growth, development, and defence by anticipating environmental cues. In this work, we report an integrated network theoretic methodology comprising random walk with restart and graphlet degree vectors to characterize genome-wide core circadian clock and clock-associated raw candidate proteins in a plant for which protein interaction information is available. As a case study, we have implemented this framework in Ocimum tenuiflorum (Tulsi), one of the most valuable medicinal plants, which has been utilized since ancient times in the management of a large number of diseases. For that, 24 core clock (CC) proteins were mined in 56 template plant genomes to build their hidden Markov models (HMMs). These HMMs were then used to identify 24 core clock proteins in O. tenuiflorum. The local topology of the interologous Tulsi protein interaction network was explored to predict the CC associated raw candidate proteins. Statistical and biological significance of the raw candidates was determined using permutation and enrichment tests. A total of 66 putative CC associated proteins were identified and their functional annotation was performed.
Affiliation(s)
- Vikram Singh
  - Centre for Computational Biology and Bioinformatics, Central University of Himachal Pradesh, Dharamshala, Himachal Pradesh, 176206, India
- Vikram Singh
  - Centre for Computational Biology and Bioinformatics, Central University of Himachal Pradesh, Dharamshala, Himachal Pradesh, 176206, India.
3. Yu Z, Su Y, Lu Y, Yang Y, Wang F, Zhang S, Chang Y, Wong KC, Li X. Topological identification and interpretation for single-cell gene regulation elucidation across multiple platforms using scMGCA. Nat Commun 2023; 14:400. [PMID: 36697410 PMCID: PMC9877026 DOI: 10.1038/s41467-023-36134-7] [Citation(s) in RCA: 14] [Impact Index Per Article: 14.0] [Received: 04/21/2022] [Accepted: 01/16/2023] [Indexed: 01/26/2023]
Abstract
Single-cell RNA sequencing provides high-throughput gene expression information to explore cellular heterogeneity at the individual cell level. A major challenge in characterizing high-throughput gene expression data arises from dimensionality and the prevalence of dropout events. To address these concerns, we develop a deep graph learning method, scMGCA, for single-cell data analysis. scMGCA is based on a graph-embedding autoencoder that simultaneously learns cell-cell topology representation and cluster assignments. We show that scMGCA is accurate and effective for cell segregation and batch effect correction, outperforming other state-of-the-art models across multiple platforms. In addition, we perform genomic interpretation on the key compressed transcriptomic space of the graph-embedding autoencoder to demonstrate the underlying gene regulation mechanism. We demonstrate that in a pancreatic ductal adenocarcinoma dataset, scMGCA successfully provides annotations on the specific cell types and reveals differential gene expression levels across multiple tumor-associated and cell signalling pathways.
Affiliation(s)
- Zhuohan Yu
  - School of Artificial Intelligence, Jilin University, Jilin, China
- Yanchi Su
  - School of Artificial Intelligence, Jilin University, Jilin, China
- Yifu Lu
  - School of Artificial Intelligence, Jilin University, Jilin, China
- Yuning Yang
  - Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, ON, Canada
- Fuzhou Wang
  - Department of Computer Science, City University of Hong Kong, Hong Kong SAR, China
- Shixiong Zhang
  - Department of Computer Science, City University of Hong Kong, Hong Kong SAR, China
- Yi Chang
  - School of Artificial Intelligence, Jilin University, Jilin, China
- Ka-Chun Wong
  - Department of Computer Science, City University of Hong Kong, Hong Kong SAR, China.
- Xiangtao Li
  - School of Artificial Intelligence, Jilin University, Jilin, China.
4. Lin P, Yan Y, Huang SY. DeepHomo2.0: improved protein-protein contact prediction of homodimers by transformer-enhanced deep learning. Brief Bioinform 2023; 24:6849483. [PMID: 36440949 DOI: 10.1093/bib/bbac499] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Received: 06/30/2022] [Revised: 10/08/2022] [Accepted: 10/21/2022] [Indexed: 11/30/2022]
Abstract
Protein-protein interactions play an important role in many biological processes. However, although structure prediction for monomer proteins has achieved great progress with the advent of advanced deep learning algorithms like AlphaFold, the structure prediction for protein-protein complexes remains an open question. Taking advantage of the Transformer model of ESM-MSA, we have developed a deep learning-based model, named DeepHomo2.0, to predict protein-protein interactions of homodimeric complexes by leveraging the direct-coupling analysis (DCA) and Transformer features of sequences and the structure features of monomers. DeepHomo2.0 was extensively evaluated on diverse test sets and compared with eight state-of-the-art methods including protein language model-based, DCA-based and machine learning-based methods. It was shown that DeepHomo2.0 achieved a high precision of >70% with experimental monomer structures and >60% with predicted monomer structures for the top 10 predicted contacts on the test sets and outperformed the other eight methods. Moreover, even the version without using structure information, named DeepHomoSeq, still achieved a good precision of >55% for the top 10 predicted contacts. Integrating the predicted contacts into protein docking significantly improved the structure prediction of realistic Critical Assessment of Protein Structure Prediction homodimeric complexes. DeepHomo2.0 and DeepHomoSeq are available at http://huanglab.phys.hust.edu.cn/DeepHomo2/.
Affiliation(s)
- Peicong Lin
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, P. R. China
- Yumeng Yan
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, P. R. China
- Sheng-You Huang
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, P. R. China
5. Wang S, Atkinson GRS, Hayes WB. SANA: cross-species prediction of Gene Ontology GO annotations via topological network alignment. NPJ Syst Biol Appl 2022; 8:25. [PMID: 35859153 PMCID: PMC9300714 DOI: 10.1038/s41540-022-00232-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/13/2019] [Accepted: 05/20/2022] [Indexed: 12/31/2022]
Abstract
Topological network alignment aims to align two networks node-wise in order to maximize the observed common connection (edge) topology between them. The topological alignment of two protein-protein interaction (PPI) networks should thus expose protein pairs with similar interaction partners allowing, for example, the prediction of common Gene Ontology (GO) terms. Unfortunately, no network alignment algorithm based on topology alone has been able to achieve this aim, though those that include sequence similarity have seen some success. We argue that this failure of topology alone is due to the sparsity and incompleteness of the PPI network data of almost all species, which provides the network topology with a small signal-to-noise ratio that is effectively swamped when sequence information is added to the mix. Here we show that the weak signal can be detected using multiple stochastic samples of "good" topological network alignments, which allows us to observe regions of the two networks that are robustly aligned across multiple samples. The resulting network alignment frequency (NAF) strongly correlates with GO-based Resnik semantic similarity and enables the first successful cross-species predictions of GO terms based on topology-only network alignments. Our best predictions have an AUPR of about 0.4, which is competitive with state-of-the-art algorithms, even when there is no observable sequence similarity and no known homology relationship. While our results provide only a "proof of concept" on existing network data, we hypothesize that predicting GO terms from topology-only network alignments will become increasingly practical as the volume and quality of PPI network data increase.
Affiliation(s)
- Siyue Wang
  - Department of Computer Science, University of California, Irvine, CA, 92697-3435, USA
- Giles R S Atkinson
  - Department of Computer Science, University of California, Irvine, CA, 92697-3435, USA
- Wayne B Hayes
  - Department of Computer Science, University of California, Irvine, CA, 92697-3435, USA.
6. Wang S, Chen X, Frederisy BJ, Mbakogu BA, Kanne AD, Khosravi P, Hayes WB. On the current failure-but bright future-of topology-driven biological network alignment. ADVANCES IN PROTEIN CHEMISTRY AND STRUCTURAL BIOLOGY 2022; 131:1-44. [PMID: 35871888 DOI: 10.1016/bs.apcsb.2022.05.005] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 01/26/2023]
Abstract
Since the function of a protein is defined by its interaction partners, and since we expect similar interaction patterns across species, the alignment of protein-protein interaction (PPI) networks between species, based on network topology alone, should uncover functionally related proteins across species. Surprisingly, despite the publication of more than fifty algorithms aimed at performing PPI network alignment, few have demonstrated a statistically significant link between network topology and functional similarity, and none have demonstrated that orthologs can be recovered using network topology alone. We find that the major contributing factors to this surprising failure are: (i) edge densities in most currently available experimental PPI networks are demonstrably too low to expect topological network alignment to succeed; (ii) in the few cases where the edge densities are high enough, some measures of topological similarity easily uncover functionally similar proteins while others do not; and (iii) most network alignment algorithms to date perform poorly at optimizing even their own topological objective functions, hampering their ability to use topology effectively. We demonstrate that SANA, the Simulated Annealing Network Aligner, significantly outperforms existing aligners at optimizing their own objective functions, even achieving near-optimal solutions when the optimal solution is known. We offer the first demonstration of global network alignments based on topology alone that align functionally similar proteins with p-values in some cases below 10^-300. We predict that topological network alignment has a bright future as edge densities increase toward the value where good alignments become possible. We demonstrate that when enough common topology is present at high enough edge densities, for example in the recent, partly synthetic networks of the Integrated Interaction Database, topological network alignment easily recovers most orthologs, paving the way toward high-throughput functional prediction based on topology-driven network alignment.
Affiliation(s)
- Siyue Wang
  - Department of Computer Science, University of California, Irvine, CA, United States
- Xiaoyin Chen
  - Department of Computer Science, University of California, Irvine, CA, United States
- Brent J Frederisy
  - Department of Computer Science, University of California, Irvine, CA, United States
- Benedict A Mbakogu
  - Department of Computer Science, University of California, Irvine, CA, United States
- Amy D Kanne
  - Department of Computer Science, University of California, Irvine, CA, United States
- Pasha Khosravi
  - Department of Computer Science, University of California, Irvine, CA, United States
- Wayne B Hayes
  - Department of Computer Science, University of California, Irvine, CA, United States.
7. Ma JX, Yang Y, Li G, Ma BG. Computationally Reconstructed Interactome of Bradyrhizobium diazoefficiens USDA110 Reveals Novel Functional Modules and Protein Hubs for Symbiotic Nitrogen Fixation. Int J Mol Sci 2021; 22:11907. [PMID: 34769335 PMCID: PMC8584416 DOI: 10.3390/ijms222111907] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/30/2021] [Revised: 10/22/2021] [Accepted: 10/28/2021] [Indexed: 11/16/2022]
Abstract
Symbiotic nitrogen fixation is an important part of the nitrogen biogeochemical cycles and the main nitrogen source of the biosphere. As a classical model system for symbiotic nitrogen fixation, rhizobium-legume systems have been studied elaborately for decades. Details about the molecular mechanisms of the communication and coordination between rhizobia and host plants are becoming clearer. For more systematic insights, there is an increasing demand for new studies integrating multiomics information. Here, we present a comprehensive computational framework integrating the reconstructed protein interactome of B. diazoefficiens USDA110 with its transcriptome and proteome data to study the complex protein-protein interaction (PPI) network involved in the symbiosis system. We reconstructed the interactome of B. diazoefficiens USDA110 by computational approaches. Based on the comparison of interactomes between B. diazoefficiens USDA110 and other rhizobia, we inferred that the slow growth of B. diazoefficiens USDA110 may be due to the requirement of more protein modifications, and we further identified 36 conserved functional PPI modules. Integrated with transcriptome and proteome data, interactomes representing free-living cell and symbiotic nitrogen-fixing (SNF) bacteroid were obtained. Based on the SNF interactome, a core-sub-PPI-network for symbiotic nitrogen fixation was determined and nine novel functional modules and eleven key protein hubs playing key roles in symbiosis were identified. The reconstructed interactome of B. diazoefficiens USDA110 may serve as a valuable reference for studying the mechanism underlying the SNF system of rhizobia and legumes.
Affiliation(s)
- Bin-Guang Ma
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, China; (J.-X.M.); (Y.Y.); (G.L.)
8. Zhang X, Wang W, Ren CX, Dai DQ. Learning representation for multiple biological networks via a robust graph regularized integration approach. Brief Bioinform 2021; 23:6381251. [PMID: 34607360 DOI: 10.1093/bib/bbab409] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/23/2021] [Revised: 08/23/2021] [Accepted: 09/06/2021] [Indexed: 01/18/2023]
Abstract
Learning node representation is a fundamental problem in biological network analysis, as compact representation features reveal complicated network structures and carry useful information for downstream tasks such as link prediction and node classification. Recently, multiple networks that profile objects from different aspects are increasingly accumulated, providing the opportunity to learn objects from multiple perspectives. However, the complex common and specific information across different networks poses challenges to node representation methods. Moreover, ubiquitous noise in networks calls for more robust representation. To deal with these problems, we present a representation learning method for multiple biological networks. First, we accommodate the noise and spurious edges in networks using denoised diffusion, providing robust connectivity structures for the subsequent representation learning. Then, we introduce a graph regularized integration model to combine refined networks and compute common representation features. By using the regularized decomposition technique, the proposed model can effectively preserve the common structural property of different networks and simultaneously accommodate their specific information, leading to a consistent representation. A simulation study shows the superiority of the proposed method on different levels of noisy networks. Three network-based inference tasks, including drug-target interaction prediction, gene function identification and fine-grained species categorization, are conducted using representation features learned from our method. Biological networks at different scales and levels of sparsity are involved. Experimental results on real-world data show that the proposed method has robust performance compared with alternatives. Overall, by eliminating noise and integrating effectively, the proposed method is able to learn useful representations from multiple biological networks.
Affiliation(s)
- Xiwen Zhang
  - Intelligent Data Center, School of Mathematics, Sun Yat-Sen University, 510275, Guangzhou, China
- Weiwen Wang
  - Intelligent Data Center, School of Mathematics, Sun Yat-Sen University, 510275, Guangzhou, China
- Chuan-Xian Ren
  - Intelligent Data Center, School of Mathematics, Sun Yat-Sen University, 510275, Guangzhou, China
- Dao-Qing Dai
  - Intelligent Data Center, School of Mathematics, Sun Yat-Sen University, 510275, Guangzhou, China
9. Zambrana C, Xenos A, Böttcher R, Malod-Dognin N, Pržulj N. Network neighbors of viral targets and differentially expressed genes in COVID-19 are drug target candidates. Sci Rep 2021; 11:18985. [PMID: 34556735 PMCID: PMC8460804 DOI: 10.1038/s41598-021-98289-x] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 02/05/2021] [Accepted: 08/23/2021] [Indexed: 12/12/2022]
Abstract
The COVID-19 pandemic is raging. It revealed the importance of rapid scientific advancement towards understanding and treating new diseases. To address this challenge, we adapt an explainable artificial intelligence algorithm for data fusion and utilize it on new omics data on viral-host interactions, human protein interactions, and drugs to better understand SARS-CoV-2 infection mechanisms and predict new drug-target interactions for COVID-19. We discover that in the human interactome, the human proteins targeted by SARS-CoV-2 proteins and the genes that are differentially expressed after the infection have common neighbors central in the interactome that may be key to the disease mechanisms. We uncover 185 new drug-target interactions targeting 49 of these key genes and suggest re-purposing of 149 FDA-approved drugs, including drugs targeting VEGF and nitric oxide signaling, whose pathways coincide with the observed COVID-19 symptoms. Our integrative methodology is universal and can enable insight into this and other serious diseases.
Affiliation(s)
- Noël Malod-Dognin
  - Barcelona Supercomputing Center, Barcelona, Spain
  - Department of Computer Science, University College London, London, WC1E 6BT, UK
- Nataša Pržulj
  - Barcelona Supercomputing Center, Barcelona, Spain.
  - Department of Computer Science, University College London, London, WC1E 6BT, UK.
  - ICREA, Pg. Lluís Companys 23, Barcelona, Spain.
10. Shang H, Zhang H, Ren Z, Zhao H, Zhang Z, Tong J. Characterization of the Potential Role of NTPCR in Epithelial Ovarian Cancer by Integrating Transcriptomic and Metabolomic Analysis. Front Genet 2021; 12:695245. [PMID: 34539736 PMCID: PMC8442909 DOI: 10.3389/fgene.2021.695245] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/14/2021] [Accepted: 07/27/2021] [Indexed: 11/13/2022]
Abstract
Background: Epithelial ovarian carcinoma (EOC) is a malignant tumor with high motility in women. Our previous study found that dysregulated nucleoside-triphosphatase cancer-related (NTPCR) was associated with the prognosis of EOC patients, and thus, this present study attempted to explore the potential roles of NTPCR in disease progression. Methods: The expression level of NTPCR was investigated in EOC tissues by RT-qPCR and Western blot analysis. NTPCR shRNA and overexpression vector were generated and transfected into OVCAR-3 or SKOV3 cells to detect the effect of NTPCR on cell proliferation, cell cycle, cell migration, and invasion. Transcriptomic sequencing and metabolite profiling analysis were performed in shNTPCR groups to identify transcriptome or metabolite alteration that might contribute to EOC. Finally, we searched the overlapped signaling pathways correlated with differential metabolites and differentially expressed genes (DEGs) by integrating analysis. Results: Compared with para-cancerous tissues, NTPCR was highly expressed in cancer tissues (p < 0.05). Overexpression of NTPCR inhibited cell proliferation, migration, and invasion and reduced the proportion of S- and G2/M-phase cells, while downregulation of NTPCR showed the opposite results. RNA sequencing analysis demonstrated cohorts of DEGs were identified in shNTPCR samples. Protein–protein interaction networks were constructed for DEGs. STAT1 (degree = 43) and OAS2 (degree = 36) were identified as hub genes in the network. Several miRNAs together with target genes were predicted to be crucial genes related to disease progression, including hsa-miR-124-3p, hsa-miR-30a-5p, hsa-miR-146a-5, EP300, GATA2, and STAT3. We also screened the differential metabolites from shNTPCR samples, including 22 upregulated and 22 downregulated metabolites. By integrating transcriptomics and metabolomics analysis, eight overlapped pathways were correlated with these DEGs and differential metabolites, such as primary bile acid biosynthesis, protein digestion and absorption, and pentose and glucuronate interconversions. Conclusion: NTPCR might serve as a tumor suppressor in EOC progression. Our results demonstrated that DEGs and differential metabolites were mainly related to several signaling pathways, which might play a crucial role in the progression of NTPCR regulation of EOC.
Affiliation(s)
- Hongkai Shang
  - Department of the Fourth Clinical Medical College, Zhejiang Chinese Medical University, Hangzhou, China
  - Department of Gynecology, Hangzhou First People's Hospital, Hangzhou, China
  - Department of Gynecology, Zhejiang University School of Medicine, Hangzhou, China
- Huizhi Zhang
  - Department of the Fourth Clinical Medical College, Zhejiang Chinese Medical University, Hangzhou, China
  - Department of Gynecology, Hangzhou First People's Hospital, Hangzhou, China
- Ziyao Ren
  - Department of Gynecology, Hangzhou First People's Hospital, Hangzhou, China
  - Department of Gynecology, Zhejiang University School of Medicine, Hangzhou, China
- Hongjiang Zhao
  - Department of the Fourth Clinical Medical College, Zhejiang Chinese Medical University, Hangzhou, China
  - Department of Gynecology, Hangzhou First People's Hospital, Hangzhou, China
- Zhifen Zhang
  - Department of Gynecology, Hangzhou Women's Hospital (Maternity and Child Health Care Hospital), Hangzhou, China
- Jinyi Tong
  - Department of the Fourth Clinical Medical College, Zhejiang Chinese Medical University, Hangzhou, China
  - Department of Gynecology, Hangzhou Women's Hospital (Maternity and Child Health Care Hospital), Hangzhou, China
11. Xenos A, Malod-Dognin N, Milinković S, Pržulj N. Linear functional organization of the omic embedding space. Bioinformatics 2021; 37:3839-3847. [PMID: 34213534 PMCID: PMC8570782 DOI: 10.1093/bioinformatics/btab487] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/03/2020] [Revised: 06/21/2021] [Accepted: 06/30/2021] [Indexed: 11/21/2022]
Abstract
Motivation: We are increasingly accumulating complex omics data that capture different aspects of cellular functioning. A key challenge is to untangle their complexity and effectively mine them for new biomedical information. To decipher this new information, we introduce algorithms based on network embeddings. Such algorithms represent biological macromolecules as vectors in d-dimensional space, in which topologically similar molecules are embedded close in space and knowledge is extracted directly by vector operations. Recently, it has been shown that neural networks used to obtain vectorial representations (embeddings) are implicitly factorizing a mutual information matrix, called Positive Pointwise Mutual Information (PPMI) matrix. Thus, we propose the use of the PPMI matrix to represent the human protein–protein interaction (PPI) network and also introduce the graphlet degree vector PPMI matrix of the PPI network to capture different topological (structural) similarities of the nodes in the molecular network. Results: We generate the embeddings by decomposing these matrices with Nonnegative Matrix Tri-Factorization. We demonstrate that genes that are embedded close in these spaces have similar biological functions, so we can extract new biomedical knowledge directly by doing linear operations on their embedding vector representations. We exploit this property to predict new genes participating in protein complexes and to identify new cancer-related genes based on the cosine similarities between the vector representations of the genes. We validate 80% of our novel cancer-related gene predictions in the literature and also by patient survival curves demonstrating that 93.3% of them have a potential clinical relevance as biomarkers of cancer. Availability and implementation: Code and data are available online at https://gitlab.bsc.es/axenos/embedded-omics-data-geometry/. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- A Xenos
  - Barcelona Supercomputing Center (BSC), 08034 Barcelona, Spain
  - Universitat Politecnica de Catalunya (UPC), 08034 Barcelona, Spain
- N Malod-Dognin
  - Barcelona Supercomputing Center (BSC), 08034 Barcelona, Spain
  - Department of Computer Science, University College London, WC1E 6BT London, United Kingdom
- S Milinković
  - RAF School of Computing, Union University, Belgrade, Serbia
- N Pržulj
  - Barcelona Supercomputing Center (BSC), 08034 Barcelona, Spain
  - Department of Computer Science, University College London, WC1E 6BT London, United Kingdom
  - ICREA, Pg. Lluís Companys 23, 08010 Barcelona, Spain
12. Suratanee A, Buaboocha T, Plaimas K. Prediction of Human-Plasmodium vivax Protein Associations From Heterogeneous Network Structures Based on Machine-Learning Approach. Bioinform Biol Insights 2021; 15:11779322211013350. [PMID: 34188457 PMCID: PMC8212370 DOI: 10.1177/11779322211013350] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 01/13/2021] [Accepted: 04/04/2021] [Indexed: 11/24/2022]
Abstract
Malaria caused by Plasmodium vivax can lead to severe morbidity and death. In addition, resistance has been reported to existing drugs in treating this malaria. Therefore, the identification of new human proteins associated with malaria is urgently needed for the development of additional drugs. In this study, we established an analysis framework to predict human-P. vivax protein associations using network topological profiles from a heterogeneous network structure of human and P. vivax, machine-learning techniques and statistical analysis. Novel associations were predicted and ranked to determine the importance of human proteins associated with malaria. With the best-ranking score, 411 human proteins were identified as promising proteins. Their regulations and functions were statistically analyzed, which led to the identification of proteins involved in the regulation of membrane and vesicle formation, and proteasome complexes as potential targets for the treatment of P. vivax malaria. In conclusion, by integrating related data, our analysis was efficient in identifying potential targets providing an insight into human-parasite protein associations. Furthermore, generalizing this model could allow researchers to gain further insights into other diseases and enhance the field of biomedical science.
Collapse
Affiliation(s)
- Apichat Suratanee
- Department of Mathematics, Faculty of Applied Science, King Mongkut’s University of Technology North Bangkok, Bangkok, Thailand
| | - Teerapong Buaboocha
- Department of Biochemistry, Faculty of Science, Chulalongkorn University, Bangkok, Thailand
- Omics Sciences and Bioinformatics Center, Faculty of Science, Chulalongkorn University, Bangkok, Thailand
| | - Kitiporn Plaimas
- Omics Sciences and Bioinformatics Center, Faculty of Science, Chulalongkorn University, Bangkok, Thailand
- Advanced Virtual and Intelligent Computing (AVIC) Center, Department of Mathematics and Computer Science, Faculty of Science, Chulalongkorn University, Bangkok, Thailand
| |
|
13
|
Maharaj S, Qian T, Ohiba Z, Hayes W. Common Neighbors Extension of the Sticky Model for PPI Networks Evaluated by Global and Local Graphlet Similarity. IEEE/ACM Trans Comput Biol Bioinform 2021; 18:16-26. [PMID: 32809943 DOI: 10.1109/tcbb.2020.3017374] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2023]
Abstract
The structure of protein-protein interaction (PPI) networks has been studied for over a decade. Many theoretical models have been proposed to model PPI network structure, but continuing noise and incompleteness in these networks make conclusions about their structure difficult. Using newer, larger networks from Sept. 2018 BioGRID and Jan. 2019 IID, we show the joint distribution of degree products and common neighbors has a greater impact on PPI edge connectivity than their individual distributions, and introduce two new models (CN and STICKY-CN) for PPI networks employing these features. Since graphlet-based measures are believed to be among the most discerning and sensitive network comparison tools available, we assess their overall global and local fits to PPI networks using Graphlet Kernel (GK). We fit 10 theoretical models to nine BioGRID networks and twelve Integrated Interactive Database (IID) networks and find: (1) STICKY and STICKY-CN are the overall globally best fitting models according to GK, (2) Hyperbolic Geometric Graph model is a better fit than any STICKY-based model on 4 species, (3) though STICKY-CN provides a better local fit than the STICKY model, the CN model provides the greatest local fit over most species. We conclude that the inclusion of CN into STICKY-CN makes it the best overall fit for PPI networks as it is a good fit locally and globally.
|
14
|
Doria-Belenguer S, Youssef MK, Böttcher R, Malod-Dognin N, Pržulj N. Probabilistic graphlets capture biological function in probabilistic molecular networks. Bioinformatics 2020; 36:i804-i812. [PMID: 33381834 DOI: 10.1093/bioinformatics/btaa812] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Accepted: 09/08/2020] [Indexed: 11/13/2022] Open
Abstract
MOTIVATION Molecular interactions have been successfully modeled and analyzed as networks, where nodes represent molecules and edges represent the interactions between them. These networks revealed that molecules with similar local network structure also have similar biological functions. The most sensitive measures of network structure are based on graphlets. However, graphlet-based methods thus far are only applicable to unweighted networks, whereas real-world molecular networks may have weighted edges that can represent the probability of an interaction occurring in the cell. This information is commonly discarded when applying thresholds to generate unweighted networks, which may lead to information loss. RESULTS We introduce probabilistic graphlets as a tool for analyzing the local wiring patterns of probabilistic networks. To assess their performance compared to unweighted graphlets, we generate synthetic networks based on different well-known random network models and edge probability distributions and demonstrate that probabilistic graphlets outperform their unweighted counterparts in distinguishing network structures. Then we model different real-world molecular interaction networks as weighted graphs with probabilities as weights on edges and we analyze them with our new weighted graphlets-based methods. We show that due to their probabilistic nature, probabilistic graphlet-based methods more robustly capture biological information in these data, while simultaneously showing a higher sensitivity to identify condition-specific functions compared to their unweighted graphlet-based method counterparts. AVAILABILITY AND IMPLEMENTATION Our implementation of probabilistic graphlets is available at https://github.com/Serdobe/Probabilistic_Graphlets. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Sergio Doria-Belenguer
- Barcelona Supercomputing Center, Barcelona 08034, Spain.,Universitat Politècnica de Catalunya (UPC), Barcelona 08034, Spain
| | - Markus K Youssef
- Barcelona Supercomputing Center, Barcelona 08034, Spain.,Universitat Politècnica de Catalunya (UPC), Barcelona 08034, Spain
| | - René Böttcher
- Barcelona Supercomputing Center, Barcelona 08034, Spain
| | - Noël Malod-Dognin
- Barcelona Supercomputing Center, Barcelona 08034, Spain.,Department of Computer Science, University College London, London WC1E 6BT, UK
| | - Nataša Pržulj
- Barcelona Supercomputing Center, Barcelona 08034, Spain.,Department of Computer Science, University College London, London WC1E 6BT, UK.,ICREA, Barcelona 08010, Spain
| |
|
15
|
Klimm F, Toledo EM, Monfeuga T, Zhang F, Deane CM, Reinert G. Functional module detection through integration of single-cell RNA sequencing data with protein-protein interaction networks. BMC Genomics 2020; 21:756. [PMID: 33138772 PMCID: PMC7607865 DOI: 10.1186/s12864-020-07144-2] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 11/14/2019] [Accepted: 10/12/2020] [Indexed: 12/14/2022] Open
Abstract
Background Recent advances in single-cell RNA sequencing have allowed researchers to explore transcriptional function at a cellular level. In particular, single-cell RNA sequencing reveals that there exist clusters of cells with similar gene expression profiles, representing different transcriptional states. Results In this study, we present scPPIN, a method for integrating single-cell RNA sequencing data with protein–protein interaction networks that detects active modules in cells of different transcriptional states. We achieve this by clustering RNA-sequencing data, identifying differentially expressed genes, constructing node-weighted protein–protein interaction networks, and finding the maximum-weight connected subgraphs with an exact Steiner-tree approach. As case studies, we investigate two RNA-sequencing data sets from human liver spheroids and human adipose tissue, respectively. With scPPIN we expand the output of differentially expressed gene analysis with information from protein interactions. We find that different transcriptional states have different subnetworks of the protein–protein interaction networks significantly enriched, which represent biological pathways. In these pathways, scPPIN identifies proteins that are not differentially expressed but have a crucial biological function (e.g., as receptors) and therefore reveals biology beyond a standard differentially expressed gene analysis. Conclusions The introduced scPPIN method can be used to systematically analyse differentially expressed genes in single-cell RNA sequencing data by integrating it with protein interaction data. The detected modules that characterise each cluster help to identify and hypothesise a biological function associated with those cells. Our analysis suggests the participation of unexpected proteins in these pathways that are undetectable from the single-cell RNA sequencing data alone. The techniques described here are applicable to other organisms and tissues.
Supplementary Information The online version contains supplementary material available at (doi:10.1186/s12864-020-07144-2).
Affiliation(s)
- Florian Klimm
- Department of Mathematics, Imperial College London, London, SW7 2AZ, UK.,Mitochondrial Biology Unit, University of Cambridge, Cambridge, CB2 0XY, UK.
| | - Enrique M Toledo
- Discovery Technology and Genomics, Novo Nordisk Research Centre Oxford, Oxford, OX3 7FZ, UK
| | - Thomas Monfeuga
- Discovery Technology and Genomics, Novo Nordisk Research Centre Oxford, Oxford, OX3 7FZ, UK
| | - Fang Zhang
- Discovery Technology and Genomics, Novo Nordisk Research Centre Oxford, Oxford, OX3 7FZ, UK
| | | | - Gesine Reinert
- Department of Statistics, University of Oxford, Oxford, OX1 3LB, UK
| |
|
16
|
Abstract
MOTIVATION The structure of chromatin impacts gene expression. Its alteration has been shown to coincide with the occurrence of cancer. A key challenge is in understanding the role of chromatin structure (CS) in cellular processes and its implications in diseases. RESULTS We propose a comparative pipeline to analyze CSs and apply it to study chronic lymphocytic leukemia (CLL). We model the chromatin of the affected and control cells as networks and analyze the network topology by state-of-the-art methods. Our results show that CSs are a rich source of new biological and functional information about DNA elements and cells that can complement protein-protein and co-expression data. Importantly, we show the existence of structural markers of cancer-related DNA elements in the chromatin. Surprisingly, CLL driver genes are characterized by specific local wiring patterns not only in the CS network of CLL cells, but also of healthy cells. This allows us to successfully predict new CLL-related DNA elements. Importantly, this shows that we can identify cancer-related DNA elements in other cancer types by investigating the CS network of the healthy cell of origin, a key new insight paving the road to new therapeutic strategies. This gives us an opportunity to exploit chromosome conformation data in healthy cells to predict new drivers. AVAILABILITY AND IMPLEMENTATION Our predicted CLL genes and RNAs are provided as a free resource to the community at https://life.bsc.es/iconbi/chromatin/index.html. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- N Malod-Dognin
- Department of Life Sciences, Barcelona Supercomputing Center (BSC), Barcelona 08034, Spain
- Department of Computer Science, University College London, London WC1E 6BT, UK
| | - V Pancaldi
- Department of Life Sciences, Barcelona Supercomputing Center (BSC), Barcelona 08034, Spain
- Centre de Recherches en Cancérologie de Toulouse (CRCT), Toulouse 31037, France
- University Paul Sabatier III, Toulouse 31330, France
| | - A Valencia
- Department of Life Sciences, Barcelona Supercomputing Center (BSC), Barcelona 08034, Spain
- ICREA, Pg. Lluís Companys 23, Barcelona 08010, Spain
- Coordination Node, Spanish National Bioinformatics Institute, ELIXIR-Spain (INB, ELIXIR-ES), Madrid 28029, Spain
| | - N Pržulj
- Department of Life Sciences, Barcelona Supercomputing Center (BSC), Barcelona 08034, Spain
- Department of Computer Science, University College London, London WC1E 6BT, UK
- ICREA, Pg. Lluís Companys 23, Barcelona 08010, Spain
| |
|
17
|
Hayes WB. An Introductory Guide to Aligning Networks Using SANA, the Simulated Annealing Network Aligner. Methods Mol Biol 2020; 2074:263-284. [PMID: 31583643 DOI: 10.1007/978-1-4939-9873-9_18] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 06/10/2023]
Abstract
Sequence alignment has had an enormous impact on our understanding of biology, evolution, and disease. The alignment of biological networks holds similar promise. Biological networks generally model interactions between biomolecules such as proteins, genes, metabolites, or mRNAs. There is strong evidence that the network topology (the "structure" of the network) is correlated with the functions performed, so that network topology can be used to help predict or understand function. However, unlike sequence comparison and alignment, which is an essentially solved problem, network comparison and alignment is an NP-complete problem for which heuristic algorithms must be used. Here we introduce SANA, the Simulated Annealing Network Aligner. SANA is one of many algorithms proposed for the arena of biological network alignment. In the context of global network alignment, SANA stands out for its speed, memory efficiency, ease-of-use, and flexibility in producing alignments between two or more networks. SANA produces better alignments in minutes on a laptop than most other algorithms can produce in hours or days of CPU time on large server-class machines. We walk the user through how to use SANA for several types of biomolecular networks.
Affiliation(s)
- Wayne B Hayes
- Department of Computer Science, University of California, Irvine, CA, USA.
| |
|
18
|
Maharaj S, Tracy B, Hayes WB. BLANT—fast graphlet sampling tool. Bioinformatics 2019; 35:5363-5364. [DOI: 10.1093/bioinformatics/btz603] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/24/2019] [Revised: 07/19/2019] [Accepted: 07/30/2019] [Indexed: 11/13/2022] Open
Abstract
Summary
BLAST creates local sequence alignments by first building a database of small k-letter sub-sequences called k-mers. Identical k-mers from different regions provide ‘seeds’ for longer local alignments. This seed-and-extend heuristic makes BLAST extremely fast and has led to its almost exclusive use despite the existence of more accurate, but slower, algorithms. In this paper, we introduce the Basic Local Alignment for Networks Tool (BLANT). BLANT is the analog of BLAST, but for networks: given an input graph, it samples small, induced, k-node sub-graphs called k-graphlets. Graphlets have been used to classify networks, quantify structure, align networks both locally and globally, identify topology-function relationships and build taxonomic trees without the use of sequences. Given an input network, BLANT produces millions of graphlet samples in seconds, orders of magnitude faster than existing methods. BLANT offers sampled graphlets in various forms: distributions of graphlets or their orbits; graphlet degree or graphlet orbit degree vectors, the latter being compatible with ORCA; or an index to be used as the basis for seed-and-extend local alignments. We demonstrate BLANT’s usefulness by using its indexing mode to find functional similarity between yeast and human PPI networks.
Availability and implementation
BLANT is written in C and is available at https://github.com/waynebhayes/BLANT/releases.
Supplementary information
Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Sridevi Maharaj
- Department of Computer Science, University of California Irvine, Irvine, CA, USA
| | - Brennan Tracy
- Department of Computer Science, University of California Irvine, Irvine, CA, USA
| | - Wayne B Hayes
- Department of Computer Science, University of California Irvine, Irvine, CA, USA
| |
|
19
|
Windels SFL, Malod-Dognin N, Pržulj N. Graphlet Laplacians for topology-function and topology-disease relationships. Bioinformatics 2019; 35:5226-5234. [PMID: 31192358 DOI: 10.1093/bioinformatics/btz455] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 11/08/2018] [Revised: 05/08/2019] [Accepted: 06/10/2019] [Indexed: 01/01/2023] Open
Abstract
MOTIVATION Laplacian matrices capture the global structure of networks and are widely used to study biological networks. However, the local structure of the network around a node can also capture biological information. Local wiring patterns are typically quantified by counting how often a node touches different graphlets (small, connected, induced sub-graphs). Currently available graphlet-based methods do not consider whether nodes are in the same network neighbourhood. To combine graphlet-based topological information and membership of nodes to the same network neighbourhood, we generalize the Laplacian to the Graphlet Laplacian, by considering a pair of nodes to be 'adjacent' if they simultaneously touch a given graphlet. RESULTS We utilize Graphlet Laplacians to generalize spectral embedding, spectral clustering and network diffusion. Applying Graphlet Laplacian-based spectral embedding, we visually demonstrate that Graphlet Laplacians capture biological functions. This result is quantified by applying Graphlet Laplacian-based spectral clustering, which uncovers clusters enriched in biological functions dependent on the underlying graphlet. We explain the complementarity of biological functions captured by different Graphlet Laplacians by showing that they capture different local topologies. Finally, diffusing pan-cancer gene mutation scores based on different Graphlet Laplacians, we find complementary sets of cancer-related genes. Hence, we demonstrate that Graphlet Laplacians capture topology-function and topology-disease relationships in biological networks. AVAILABILITY AND IMPLEMENTATION http://www0.cs.ucl.ac.uk/staff/natasa/graphlet-laplacian/index.html. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Sam F L Windels
- Department of Computer Science, University College London, London, WC1E 6BT, United Kingdom
| | | | - Nataša Pržulj
- Department of Computer Science, University College London, London, WC1E 6BT, United Kingdom.,Barcelona Supercomputing Center, Barcelona, 08034, Spain.,ICREA, Pg. Lluís Companys 23, Barcelona, 08010, Spain
| |
|
20
|
Malod-Dognin N, Pržulj N. Functional geometry of protein interactomes. Bioinformatics 2019; 35:3727-3734. [PMID: 30821317 DOI: 10.1093/bioinformatics/btz146] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 07/16/2018] [Revised: 01/15/2019] [Accepted: 02/25/2019] [Indexed: 12/14/2022] Open
Abstract
MOTIVATION Protein-protein interactions (PPIs) are usually modeled as networks. These networks have extensively been studied using graphlets, small induced subgraphs capturing the local wiring patterns around nodes in networks. They revealed that proteins involved in similar functions tend to be similarly wired. However, such simple models can only represent pairwise relationships and cannot fully capture the higher-order organization of protein interactomes, including protein complexes. RESULTS To model the multi-scale organization of these complex biological systems, we utilize simplicial complexes from computational geometry. The question is how to mine these new representations of protein interactomes to reveal additional biological information. To address this, we define simplets, a generalization of graphlets to simplicial complexes. By using simplets, we define a sensitive measure of similarity between simplicial complex representations that allows for clustering them according to their data types better than clustering them by using other state-of-the-art measures, e.g. spectral distance, or facet distribution distance. We model human and baker's yeast protein interactomes as simplicial complexes that capture PPIs and protein complexes as simplices. On these models, we show that our newly introduced simplet-based methods cluster proteins by function better than the clustering methods that use the standard PPI networks, uncovering the new underlying functional organization of the cell. We demonstrate the existence of the functional geometry in the protein interactome data and the superiority of our simplet-based methods to effectively mine for new biological information hidden in the complexity of the higher-order organization of protein interactomes. AVAILABILITY AND IMPLEMENTATION Codes and datasets are freely available at http://www0.cs.ucl.ac.uk/staff/natasa/Simplets/. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Noël Malod-Dognin
- Department of Life Sciences, Barcelona Supercomputing Center, Barcelona, Spain
| | - Nataša Pržulj
- Department of Life Sciences, Barcelona Supercomputing Center, Barcelona, Spain.,ICREA, Pg. Lluís Companys 23, Barcelona, Spain
| |
|
21
|
Balasubramanian K, Gupta SP. Quantum Molecular Dynamics, Topological, Group Theoretical and Graph Theoretical Studies of Protein-Protein Interactions. Curr Top Med Chem 2019; 19:426-443. [PMID: 30836919 DOI: 10.2174/1568026619666190304152704] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 08/11/2018] [Revised: 11/08/2018] [Accepted: 11/28/2018] [Indexed: 12/21/2022]
Abstract
BACKGROUND Protein-protein interactions (PPIs) are becoming increasingly important as PPIs form the basis of multiple aggregation-related diseases such as cancer, Creutzfeldt-Jakob, and Alzheimer's diseases. This mini-review presents hybrid quantum molecular dynamics, quantum chemical, topological, group theoretical, graph theoretical, and docking studies of PPIs. We also show how these theoretical studies facilitate the discovery of some PPI inhibitors of therapeutic importance. OBJECTIVE The objective of this review is to present hybrid quantum molecular dynamics, quantum chemical, topological, group theoretical, graph theoretical, and docking studies of PPIs. We also show how these theoretical studies enable the discovery of some PPI inhibitors of therapeutic importance. METHODS This article presents a detailed survey of hybrid quantum dynamics that combines classical and quantum MD for PPIs. The article also surveys various developments pertinent to topological, graph theoretical, group theoretical and docking studies of PPIs and highlights how the methods facilitate the discovery of some PPI inhibitors of therapeutic importance. RESULTS It is shown that it is important to include higher-level quantum chemical computations for accurate computations of free energies and electrostatics of PPIs and of drugs with PPIs, and thus techniques that combine classical MD tools with quantum MD are preferred choices. Topological, graph theoretical and group theoretical techniques are shown to be important in studying large networks of PPIs comprising over 100,000 proteins, where quantum chemical and other techniques are not feasible. Hence, multiple techniques are needed for PPIs. CONCLUSION Drug discovery and our understanding of complex PPIs require multifaceted techniques that involve several disciplines such as quantum chemistry, topology, graph theory, knot theory and group theory, thus demonstrating a compelling need for a multi-disciplinary approach to the problem.
Affiliation(s)
- Krishnan Balasubramanian
- School of Molecular Sciences, Arizona State University, Tempe, Arizona, AZ 85287-1604, United States
| | - Satya P Gupta
- Department of Pharmaceutical Technology, Meerut Institute of Engineering Technology, Meerut-250002, India
| |
|
22
|
Liu X, Yang Z, Sang S, Lin H, Wang J, Xu B. Detection of protein complexes from multiple protein interaction networks using graph embedding. Artif Intell Med 2019; 96:107-115. [DOI: 10.1016/j.artmed.2019.04.001] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 03/22/2018] [Revised: 04/06/2019] [Accepted: 04/06/2019] [Indexed: 12/22/2022]
|
23
|
Sonawane AR, Weiss ST, Glass K, Sharma A. Network Medicine in the Age of Biomedical Big Data. Front Genet 2019; 10:294. [PMID: 31031797 PMCID: PMC6470635 DOI: 10.3389/fgene.2019.00294] [Citation(s) in RCA: 111] [Impact Index Per Article: 22.2] [Received: 12/26/2018] [Accepted: 03/19/2019] [Indexed: 12/13/2022] Open
Abstract
Network medicine is an emerging area of research dealing with molecular and genetic interactions, network biomarkers of disease, and therapeutic target discovery. Large-scale biomedical data generation offers a unique opportunity to assess the effect and impact of cellular heterogeneity and environmental perturbations on the observed phenotype. Marrying the two, network medicine with biomedical data provides a framework to build meaningful models and extract impactful results at a network level. In this review, we survey existing network types and biomedical data sources. More importantly, we delve into ways in which the network medicine approach, aided by phenotype-specific biomedical data, can be gainfully applied. We provide three paradigms, mainly dealing with three major biological network archetypes: protein-protein interaction, expression-based, and gene regulatory networks. For each of these paradigms, we discuss a broad overview of philosophies under which various network methods work. We also provide a few examples in each paradigm as a test case of its successful application. Finally, we delineate several opportunities and challenges in the field of network medicine. We hope this review provides a lexicon for researchers from biological sciences and network theory to come on the same page to work on research areas that require interdisciplinary expertise. Taken together, the understanding gained from combining biomedical data with networks can be useful for characterizing disease etiologies and identifying therapeutic targets, which, in turn, will lead to better preventive medicine with translational impact on personalized healthcare.
Affiliation(s)
- Abhijeet R. Sonawane
- Channing Division of Network Medicine, Brigham and Women’s Hospital, Boston, MA, United States
- Department of Medicine, Harvard Medical School, Boston, MA, United States
| | - Scott T. Weiss
- Channing Division of Network Medicine, Brigham and Women’s Hospital, Boston, MA, United States
- Department of Medicine, Harvard Medical School, Boston, MA, United States
| | - Kimberly Glass
- Channing Division of Network Medicine, Brigham and Women’s Hospital, Boston, MA, United States
- Department of Medicine, Harvard Medical School, Boston, MA, United States
| | - Amitabh Sharma
- Channing Division of Network Medicine, Brigham and Women’s Hospital, Boston, MA, United States
- Department of Medicine, Harvard Medical School, Boston, MA, United States
- Center for Interdisciplinary Cardiovascular Sciences, Cardiovascular Division, Brigham and Women’s Hospital, Boston, MA, United States
| |
|
24
|
Hayes WB, Mamano N. SANA NetGO: a combinatorial approach to using Gene Ontology (GO) terms to score network alignments. Bioinformatics 2019; 34:1345-1352. [PMID: 29228175 DOI: 10.1093/bioinformatics/btx716] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 04/06/2017] [Accepted: 12/04/2017] [Indexed: 01/05/2023] Open
Abstract
Motivation Gene Ontology (GO) terms are frequently used to score alignments between protein-protein interaction (PPI) networks. Methods exist to measure GO similarity between proteins in isolation, but proteins in a network alignment are not isolated: each pairing is dependent on every other via the alignment itself. Existing measures fail to take into account the frequency of GO terms across networks, instead imposing arbitrary rules on when to allow GO terms. Results Here we develop NetGO, a new measure that naturally weighs infrequent, informative GO terms more heavily than frequent, less informative GO terms, without arbitrary cutoffs, instead downweighting GO terms according to their frequency in the networks being aligned. This is a global measure applicable only to alignments, independent of pairwise GO measures, in the same sense that the edge-based EC or S3 scores are global measures of topological similarity independent of pairwise topological similarities. We demonstrate the superiority of NetGO in alignments of predetermined quality and show that NetGO correlates with alignment quality better than any existing GO-based alignment measures. We also demonstrate that NetGO provides a measure of taxonomic similarity between species, consistent with existing taxonomic measures, a feature not shared with existing GO-based network alignment measures. Finally, we re-score alignments produced by almost a dozen aligners from a previous study and show that NetGO does a better job at separating good alignments from bad ones. Availability and implementation Available as part of SANA. Contact whayes@uci.edu. Supplementary information Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Wayne B Hayes
- Department of Computer Science, University of California, Irvine, CA 92697-3435, USA
| | - Nil Mamano
- Department of Computer Science, University of California, Irvine, CA 92697-3435, USA
| |
|
25
|
Malod-Dognin N, Petschnigg J, Windels SFL, Povh J, Hemingway H, Ketteler R, Pržulj N. Towards a data-integrated cell. Nat Commun 2019; 10:805. [PMID: 30778056 PMCID: PMC6379402 DOI: 10.1038/s41467-019-08797-8] [Citation(s) in RCA: 26] [Impact Index Per Article: 5.2] [Received: 09/05/2018] [Revised: 01/18/2019] [Accepted: 01/25/2019] [Indexed: 01/01/2023] Open
Abstract
We are increasingly accumulating molecular data about a cell. The challenge is how to integrate them within a unified conceptual and computational framework enabling new discoveries. Hence, we propose a novel, data-driven concept of an integrated cell, iCell. Also, we introduce a computational prototype of an iCell, which integrates three omics, tissue-specific molecular interaction network types. We construct iCells of four cancers and the corresponding tissue controls and identify the most rewired genes in cancer. Many of them are of unknown function and cannot be identified as different in cancer in any specific molecular network. We biologically validate that they have a role in cancer by knockdown experiments followed by cell viability assays. We find additional support through Kaplan-Meier survival curves of thousands of patients. Finally, we extend this analysis to uncover pan-cancer genes. Our methodology is universal and enables integrative comparisons of diverse omics data over cells and tissues.
Affiliation(s)
- Noël Malod-Dognin
- Department of Computer Science, University College London, London, WC1E 6BT, UK
- Department of Life Science, Barcelona Supercomputing Center (BSC), Barcelona, 08034, Spain
| | - Julia Petschnigg
- Department of Computer Science, University College London, London, WC1E 6BT, UK
| | - Sam F L Windels
- Department of Computer Science, University College London, London, WC1E 6BT, UK
| | - Janez Povh
- Faculty of Mechanical Engineering, University of Ljubljana, Ljubljana, 1000, Slovenia
| | - Harry Hemingway
- Health Data Research UK London, University College London, London, WC1E 6BT, UK
- Institute of Health Informatics, University College London, London, WC1E 6BT, UK
- The National Institute for Health Research University College London Hospitals Biomedical Research Centre, University College London, London, W1T 7DN, UK
| | - Robin Ketteler
- MRC Laboratory for Molecular Cell Biology, University College London, London, WC1E 6BT, UK
| | - Nataša Pržulj
- Department of Computer Science, University College London, London, WC1E 6BT, UK.
- Department of Life Science, Barcelona Supercomputing Center (BSC), Barcelona, 08034, Spain.
- ICREA, Pg. Lluís Companys 23, 08010, Barcelona, Spain.
| |
Collapse
|
26
|
Mrzic A, Meysman P, Bittremieux W, Moris P, Cule B, Goethals B, Laukens K. Grasping frequent subgraph mining for bioinformatics applications. BioData Min 2018; 11:20. [PMID: 30202444 PMCID: PMC6122726 DOI: 10.1186/s13040-018-0181-9] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 12/19/2017] [Accepted: 08/13/2018] [Indexed: 11/18/2022] Open
Abstract
Searching for interesting common subgraphs in graph data is a well-studied problem in data mining. Subgraph mining techniques focus on the discovery of patterns in graphs that exhibit a specific network structure that is deemed interesting within these data sets. The definition of which subgraphs are interesting and which are not is highly dependent on the application. These techniques have seen numerous applications and are able to tackle a range of biological research questions, spanning from the detection of common substructures in sets of biomolecular compounds, to the discovery of network motifs in large-scale molecular interaction networks. Thus far, information about the bioinformatics application of subgraph mining remains scattered over heterogeneous literature. In this review, we provide an introduction to subgraph mining for life scientists. We give an overview of various subgraph mining algorithms from a bioinformatics perspective and present several of their potential biomedical applications.
Affiliation(s)
- Aida Mrzic
  - Department of Mathematics and Computer Science, University of Antwerp, Antwerp, Belgium
  - Biomedical Informatics Research Center Antwerp (biomina), University of Antwerp/Antwerp University Hospital, Antwerp, Belgium
- Pieter Meysman
  - Department of Mathematics and Computer Science, University of Antwerp, Antwerp, Belgium
  - Biomedical Informatics Research Center Antwerp (biomina), University of Antwerp/Antwerp University Hospital, Antwerp, Belgium
- Wout Bittremieux
  - Department of Mathematics and Computer Science, University of Antwerp, Antwerp, Belgium
  - Biomedical Informatics Research Center Antwerp (biomina), University of Antwerp/Antwerp University Hospital, Antwerp, Belgium
- Pieter Moris
  - Department of Mathematics and Computer Science, University of Antwerp, Antwerp, Belgium
  - Biomedical Informatics Research Center Antwerp (biomina), University of Antwerp/Antwerp University Hospital, Antwerp, Belgium
- Boris Cule
  - Department of Mathematics and Computer Science, University of Antwerp, Antwerp, Belgium
- Bart Goethals
  - Department of Mathematics and Computer Science, University of Antwerp, Antwerp, Belgium
- Kris Laukens
  - Department of Mathematics and Computer Science, University of Antwerp, Antwerp, Belgium
  - Biomedical Informatics Research Center Antwerp (biomina), University of Antwerp/Antwerp University Hospital, Antwerp, Belgium
|
27
|
Gaudelet T, Malod-Dognin N, Pržulj N. Higher-order molecular organization as a source of biological function. Bioinformatics 2018; 34:i944-i953. [PMID: 30423061 PMCID: PMC6129285 DOI: 10.1093/bioinformatics/bty570] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Indexed: 12/27/2022] Open
Abstract
Motivation Molecular interactions have widely been modelled as networks. The local wiring patterns around molecules in molecular networks are linked with their biological functions. However, networks model only pairwise interactions between molecules and cannot explicitly and directly capture the higher-order molecular organization, such as protein complexes and pathways. Hence, we ask if hypergraphs (hypernetworks), that directly capture entire complexes and pathways along with protein-protein interactions (PPIs), carry additional functional information beyond what can be uncovered from networks of pairwise molecular interactions. The mathematical formalism of a hypergraph has long been known, but not often used in studying molecular networks due to the lack of sophisticated algorithms for mining the underlying biological information hidden in the wiring patterns of molecular systems modelled as hypernetworks. Results We propose a new, multi-scale, protein interaction hypernetwork model that utilizes hypergraphs to capture different scales of protein organization, including PPIs, protein complexes and pathways. In analogy to graphlets, we introduce hypergraphlets, small, connected, non-isomorphic, induced sub-hypergraphs of a hypergraph, to quantify the local wiring patterns of these multi-scale molecular hypergraphs and to mine them for new biological information. We apply them to model the multi-scale protein networks of baker's yeast and human and show that the higher-order molecular organization captured by these hypergraphs is strongly related to the underlying biology. Importantly, we demonstrate that our new models and data mining tools reveal different, but complementary biological information compared with classical PPI networks. We apply our hypergraphlets to successfully predict biological functions of uncharacterized proteins. Availability and implementation Code and data are available online at http://www0.cs.ucl.ac.uk/staff/natasa/hypergraphlets.
Affiliation(s)
- Thomas Gaudelet
  - Department of Computer Science, University College London, London, UK
- Noël Malod-Dognin
  - Department of Computer Science, University College London, London, UK
- Nataša Pržulj
  - Department of Computer Science, University College London, London, UK
|
28
|
Wang B, Pourshafeie A, Zitnik M, Zhu J, Bustamante CD, Batzoglou S, Leskovec J. Network enhancement as a general method to denoise weighted biological networks. Nat Commun 2018; 9:3108. [PMID: 30082777 PMCID: PMC6078978 DOI: 10.1038/s41467-018-05469-x] [Citation(s) in RCA: 45] [Impact Index Per Article: 7.5] [Received: 10/20/2017] [Accepted: 07/03/2018] [Indexed: 12/31/2022] Open
Abstract
Networks are ubiquitous in biology where they encode connectivity patterns at all scales of organization, from molecular to the biome. However, biological networks are noisy due to the limitations of measurement technology and inherent natural variation, which can hamper discovery of network patterns and dynamics. We propose Network Enhancement (NE), a method for improving the signal-to-noise ratio of undirected, weighted networks. NE uses a doubly stochastic matrix operator that induces sparsity and provides a closed-form solution that increases spectral eigengap of the input network. As a result, NE removes weak edges, enhances real connections, and leads to better downstream performance. Experiments show that NE improves gene-function prediction by denoising tissue-specific interaction networks, alleviates interpretation of noisy Hi-C contact maps from the human genome, and boosts fine-grained identification accuracy of species. Our results indicate that NE is widely applicable for denoising biological networks.
Affiliation(s)
- Bo Wang
  - Department of Computer Science, Stanford University, 353 Serra Mall, Stanford, 94305, CA, USA
- Armin Pourshafeie
  - Department of Physics, Stanford University, 382 Via Pueblo Mall, Stanford, 94305, CA, USA
- Marinka Zitnik
  - Department of Computer Science, Stanford University, 353 Serra Mall, Stanford, 94305, CA, USA
- Junjie Zhu
  - Department of Electrical Engineering, Stanford University, 350 Serra Mall, Stanford, 94305, CA, USA
- Carlos D Bustamante
  - Department of Biomedical Data Science, Stanford University, 1265 Welch Road, Stanford, 94305, CA, USA
  - Chan Zuckerberg Biohub, 499 Illinois St, San Francisco, 94158, CA, USA
- Serafim Batzoglou
  - Department of Computer Science, Stanford University, 353 Serra Mall, Stanford, 94305, CA, USA
  - Illumina Inc, 499 Illinois Street, San Francisco, 94158, CA, USA
- Jure Leskovec
  - Department of Computer Science, Stanford University, 353 Serra Mall, Stanford, 94305, CA, USA
  - Chan Zuckerberg Biohub, 499 Illinois St, San Francisco, 94158, CA, USA
|
29
|
How JJ, Navlakha S. Evidence of Rentian Scaling of Functional Modules in Diverse Biological Networks. Neural Comput 2018; 30:2210-2244. [DOI: 10.1162/neco_a_01095] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 01/08/2023]
Abstract
Biological networks have long been known to be modular, containing sets of nodes that are highly connected internally. Less emphasis, however, has been placed on understanding how intermodule connections are distributed within a network. Here, we borrow ideas from engineered circuit design and study Rentian scaling, which states that the number of external connections between nodes in different modules is related to the number of nodes inside the modules by a power-law relationship. We tested this property in a broad class of molecular networks, including protein interaction networks for six species and gene regulatory networks for 41 human and 25 mouse cell types. Using evolutionarily defined modules corresponding to known biological processes in the cell, we found that all networks displayed Rentian scaling with a broad range of exponents. We also found evidence for Rentian scaling in functional modules in the Caenorhabditis elegans neural network, but, interestingly, not in three different social networks, suggesting that this property does not inevitably emerge. To understand how such scaling may have arisen evolutionarily, we derived a new graph model that can generate Rentian networks given a target Rent exponent and a module decomposition as inputs. Overall, our work uncovers a new principle shared by engineered circuits and biological networks.
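The power-law relationship named in this abstract can be illustrated with a small fit. The sketch below is not the authors' code: it assumes the usual form of Rent's rule, E = k * N^p, where N is the number of nodes in a module, E is the number of its external connections, and p is the Rent exponent, and it estimates k and p by ordinary least squares on log-log values. The function name `fit_rent_exponent` and the data are hypothetical.

```python
import math


def fit_rent_exponent(module_sizes, external_edges):
    """Fit log E = log k + p * log N by least squares; return (k, p).

    Hypothetical helper for illustration only, not taken from the paper.
    """
    xs = [math.log(n) for n in module_sizes]
    ys = [math.log(e) for e in external_edges]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    p = sxy / sxx                          # slope on the log-log plot
    k = math.exp(mean_y - p * mean_x)      # intercept mapped back to linear scale
    return k, p


# Synthetic data that follows E = 3 * N**0.6 exactly (made up, not from the study).
sizes = [4, 8, 16, 32, 64]
edges = [3 * s ** 0.6 for s in sizes]
k, p = fit_rent_exponent(sizes, edges)
print(round(k, 3), round(p, 3))
```

On real module decompositions the points scatter around the fitted line, so the quality of the fit matters as much as the exponent itself.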
Affiliation(s)
- Javier J. How
  - Salk Institute for Biological Studies, Integrative Biology Laboratory, La Jolla, CA 92037, U.S.A
- Saket Navlakha
  - Salk Institute for Biological Studies, Integrative Biology Laboratory, La Jolla, CA 92037, U.S.A
|
30
|
Yang Z, Tsui SKW. Functional Annotation of Proteins Encoded by the Minimal Bacterial Genome Based on Secondary Structure Element Alignment. J Proteome Res 2018; 17:2511-2520. [PMID: 29757649 DOI: 10.1021/acs.jproteome.8b00262] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Indexed: 11/28/2022]
Abstract
In synthetic biology, one of the key focuses is building a minimal artificial cell which can provide a basic chassis for functional study. Recently, the J. Craig Venter Institute published the latest version of the minimal bacterial genome JCVI-syn3.0, which encodes only 438 essential proteins. However, the functions of 149 of these proteins remain unknown because of the lack of an effective annotation method. Here, we report a secondary structure element alignment method called SSEalign based on an effective training data set extracted from various bacterial genomes. The experimentally validated homologous genes in different species were selected as training positives, while unrelated genes in different species were selected as training negatives. Moreover, SSEalign used a set of well-defined basic alignment elements with the backtracking line search algorithm to derive the best parameters for accurate prediction. Experimental results showed that SSEalign achieved 88.2% test accuracy, which is better than the existing prediction methods. SSEalign was subsequently applied to identify the functions of those unannotated proteins in the latest published minimal bacterial genome JCVI-syn3.0. Results indicated that at least 136 proteins out of 149 unannotated proteins in the JCVI-syn3.0 genome could be annotated by SSEalign. Our method is effective for the identification of protein homology in JCVI-syn3.0 and can be used to annotate those hypothetical proteins in other bacterial genomes.
Affiliation(s)
- Zhiyuan Yang
  - College of Life Information Science & Instrument Engineering, Hangzhou Dianzi University, Hangzhou 310018, China
  - School of Biomedical Sciences, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong
  - Hong Kong Bioinformatics Centre, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong
- Stephen Kwok-Wing Tsui
  - School of Biomedical Sciences, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong
  - Hong Kong Bioinformatics Centre, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong
  - Centre for Microbial Genomics and Proteomics, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong
|
31
|
Liu G, Chai B, Yang K, Yu J, Zhou X. Overlapping functional modules detection in PPI network with pair-wise constrained non-negative matrix tri-factorisation. IET Syst Biol 2018. [PMID: 29533217 PMCID: PMC8687432 DOI: 10.1049/iet-syb.2017.0084] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Indexed: 12/15/2022] Open
Abstract
A large amount of available protein-protein interaction (PPI) data has been generated by high-throughput experimental techniques. Uncovering functional modules from PPI networks will help us better understand the underlying mechanisms of cellular functions. Numerous computational algorithms have been designed to identify functional modules automatically in the past decades. However, most community detection methods (non-overlapping or overlapping types) are unsupervised models, which cannot incorporate the well-known protein complexes as prior knowledge. The authors propose a novel semi-supervised model named pairwise constrained non-negative matrix tri-factorisation (PCNMTF), which takes full advantage of the well-known protein complexes to find overlapping functional modules based on protein module indicator matrix and module correlation matrix simultaneously from PPI networks. PCNMTF determinately models and learns the mixed module memberships of each protein by considering the correlation among modules simultaneously based on the non-negative matrix tri-factorisation. The experimental results on both synthetic and real-world biological networks demonstrate that PCNMTF identifies more precise functional modules than state-of-the-art methods.
Affiliation(s)
- Guangming Liu
  - Beijing Key Lab of Traffic Data Analysis and Mining, Beijing Jiaotong University, No. 3 Shangyuancun Haidian District, Beijing, People's Republic of China
- Bianfang Chai
  - Department of Information Engineering, Hebei GEO University, Shijiazhuang, People's Republic of China
- Kuo Yang
  - Beijing Key Lab of Traffic Data Analysis and Mining, Beijing Jiaotong University, No. 3 Shangyuancun Haidian District, Beijing, People's Republic of China
- Jian Yu
  - Beijing Key Lab of Traffic Data Analysis and Mining, Beijing Jiaotong University, No. 3 Shangyuancun Haidian District, Beijing, People's Republic of China
- Xuezhong Zhou
  - Beijing Key Lab of Traffic Data Analysis and Mining, Beijing Jiaotong University, No. 3 Shangyuancun Haidian District, Beijing, People's Republic of China
|
32
|
Liu G, Wang H, Chu H, Yu J, Zhou X. Functional diversity of topological modules in human protein-protein interaction networks. Sci Rep 2017; 7:16199. [PMID: 29170401 PMCID: PMC5701033 DOI: 10.1038/s41598-017-16270-z] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 05/12/2017] [Accepted: 11/09/2017] [Indexed: 01/18/2023] Open
Abstract
A large-scale molecular interaction network of protein-protein interactions (PPIs) enables the automatic detection of molecular functional modules through a computational approach. However, the functional modules that are typically detected by topological community detection algorithms may be diverse in functional homogeneity and are empirically considered to be default functional modules. Thus, a significant challenge that has been described but not elucidated is investigating the relationship between topological modules and functional modules. We systematically investigated this issue by initially using seven widely used community detection algorithms to partition the PPI network into communities. Four homogeneity measures were subsequently implemented to evaluate the functional homogeneity of each protein community. We determined that a significant portion of topological modules with heterogeneous functionality exists and should be further investigated; moreover, these findings indicated that topologically based functional module detection approaches must be reconsidered. Furthermore, we found that the functional homogeneity of topological modules is positively correlated with their edge densities, degree of association with diseases and general Gene Ontology (GO) terms. Thus, topologically based module detection approaches should be used with caution in the identification of functional modules with high homogeneity.
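As a rough illustration of the edge density mentioned in this abstract, the sketch below assumes the common definition for an undirected module: the number of edges among its nodes divided by the number of possible node pairs. The abstract does not state which definition the authors used, so this is an assumption, and the function name `edge_density` and the toy network are hypothetical.

```python
from itertools import combinations


def edge_density(module, edges):
    """Fraction of possible node pairs in `module` that are connected.

    Assumed definition for illustration; the paper's exact measure may differ.
    """
    nodes = set(module)
    n = len(nodes)
    if n < 2:
        return 0.0
    # Store edges as unordered pairs so (u, v) and (v, u) count once.
    edge_set = {frozenset(e) for e in edges}
    internal = sum(
        1 for u, v in combinations(nodes, 2) if frozenset((u, v)) in edge_set
    )
    return internal / (n * (n - 1) / 2)


# Toy undirected network with made-up protein labels.
toy_edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
dense = edge_density({"A", "B", "C"}, toy_edges)
sparse = edge_density({"B", "C", "D"}, toy_edges)
print(dense, sparse)
```

A triangle scores 1.0 under this definition, while a three-node path scores 2/3, which matches the intuition that denser modules are more tightly wired.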
Affiliation(s)
- Guangming Liu
  - School of Computer and Information Technology and Beijing Key Lab of Traffic Data Analysis and Mining, Beijing Jiaotong University, Beijing, 100044, China
- Huixin Wang
  - School of Computer and Information Technology and Beijing Key Lab of Traffic Data Analysis and Mining, Beijing Jiaotong University, Beijing, 100044, China
- Hongwei Chu
  - Dalian University of Technology, Dalian, 116024, China
  - Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian, 116023, China
- Jian Yu
  - School of Computer and Information Technology and Beijing Key Lab of Traffic Data Analysis and Mining, Beijing Jiaotong University, Beijing, 100044, China
- Xuezhong Zhou
  - School of Computer and Information Technology and Beijing Key Lab of Traffic Data Analysis and Mining, Beijing Jiaotong University, Beijing, 100044, China
|
33
|
Meysman P, Titeca K, Eyckerman S, Tavernier J, Goethals B, Martens L, Valkenborg D, Laukens K. Protein complex analysis: From raw protein lists to protein interaction networks. MASS SPECTROMETRY REVIEWS 2017; 36:600-614. [PMID: 26709718 DOI: 10.1002/mas.21485] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 04/04/2015] [Accepted: 11/17/2015] [Indexed: 06/05/2023]
Abstract
The elucidation of molecular interaction networks is one of the pivotal challenges in the study of biology. Affinity purification-mass spectrometry and other co-complex methods have become widely employed experimental techniques to identify protein complexes. These techniques typically suffer from a high number of false negatives and false positive contaminants due to technical shortcomings and purification biases. To support a diverse range of experimental designs and approaches, a large number of computational methods have been proposed to filter, infer and validate protein interaction networks from experimental pull-down MS data. Nevertheless, this expansion of available methods complicates the selection of the most optimal ones to support systems biology-driven knowledge extraction. In this review, we give an overview of the most commonly used computational methods to process and interpret co-complex results, and we discuss the issues and unsolved problems that still exist within the field. © 2015 Wiley Periodicals, Inc. Mass Spec Rev 36:600-614, 2017.
Affiliation(s)
- Pieter Meysman
  - Advanced Database Research and Modelling (ADReM), Department of Mathematics and Computer Science, University of Antwerp, Antwerp, Belgium
  - Biomedical Informatics Research Center Antwerp (biomina), University of Antwerp/Antwerp University Hospital, Edegem, Belgium
- Kevin Titeca
  - Department of Medical Protein Research, VIB, B-9000 Ghent, Belgium
  - Department of Biochemistry, Ghent University, B-9000 Ghent, Belgium
- Sven Eyckerman
  - Department of Medical Protein Research, VIB, B-9000 Ghent, Belgium
  - Department of Biochemistry, Ghent University, B-9000 Ghent, Belgium
- Jan Tavernier
  - Department of Medical Protein Research, VIB, B-9000 Ghent, Belgium
  - Department of Biochemistry, Ghent University, B-9000 Ghent, Belgium
- Bart Goethals
  - Advanced Database Research and Modelling (ADReM), Department of Mathematics and Computer Science, University of Antwerp, Antwerp, Belgium
- Lennart Martens
  - Department of Medical Protein Research, VIB, B-9000 Ghent, Belgium
  - Department of Biochemistry, Ghent University, B-9000 Ghent, Belgium
- Dirk Valkenborg
  - Flemish Institute for Technological Research (VITO), Mol, Belgium
  - IBioStat, Hasselt University, Hasselt, Belgium
  - CFP-CeProMa, University of Antwerp, Antwerp, Belgium
- Kris Laukens
  - Advanced Database Research and Modelling (ADReM), Department of Mathematics and Computer Science, University of Antwerp, Antwerp, Belgium
  - Biomedical Informatics Research Center Antwerp (biomina), University of Antwerp/Antwerp University Hospital, Edegem, Belgium
|
34
|
Karthikeyan BS, Akbarsha MA, Parthasarathy S. Network analysis and cross species comparison of protein-protein interaction networks of human, mouse and rat cytochrome P450 proteins that degrade xenobiotics. MOLECULAR BIOSYSTEMS 2017; 12:2119-34. [PMID: 27194593 DOI: 10.1039/c6mb00210b] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Indexed: 12/22/2022]
Abstract
Cytochrome P450 (CYP) enzymes that degrade xenobiotics play a critical role in the metabolism and biotransformation of drugs and xenobiotics in humans as well as experimental animal models such as mouse and rat. These proteins function as a network collectively as well as independently. Though there are several reports on the organization, regulation and functionality of various CYP enzymes at the molecular level, the understanding of organization and functionality of these proteins at the holistic level remains unclear. The objective of this study is to understand the organization and functionality of xenobiotic degrading CYP enzymes of human, mouse and rat using network theory approaches and to study species differences that exist among them at the holistic level. For our analysis, a protein-protein interaction (PPI) network for CYP enzymes of human, mouse and rat was constructed using the STRING database. Topology, centrality, modularity and robustness analyses were performed for our predicted CYP PPI networks that were then validated by comparison with randomly generated network models. Network centrality analyses of CYP PPI networks reveal the central/hub proteins in the network. Modular analysis of the CYP PPI networks of human, mouse and rat resulted in functional clusters. These clusters were subjected to ontology and pathway enrichment analysis. The analyses show that the cluster of the human CYP PPI network is enriched with pathways principally related to xenobiotic/drug metabolism. Endo-xenobiotic crosstalk dominated in mouse and rat CYP PPI networks, and they were highly enriched with endogenous metabolic and signaling pathways. Thus, cross-species comparisons and analyses of human, mouse and rat CYP PPI networks gave insights about species differences that existed at the holistic level. More investigations from both reductionist and holistic perspectives can help understand CYP metabolism and species extrapolation in a much better way.
Affiliation(s)
- Bagavathy Shanmugam Karthikeyan
  - Department of Bioinformatics, School of Life Sciences, Bharathidasan University, Tiruchirappalli 620 024, Tamil Nadu, India
  - Mahatma Gandhi-Doerenkamp Center (MGDC) for Alternatives to Use of Animals in Life Science Education, Bharathidasan University, Tiruchirappalli 620 024, Tamil Nadu, India
- Mohammad Abdulkader Akbarsha
  - Mahatma Gandhi-Doerenkamp Center (MGDC) for Alternatives to Use of Animals in Life Science Education, Bharathidasan University, Tiruchirappalli 620 024, Tamil Nadu, India
- Subbiah Parthasarathy
  - Department of Bioinformatics, School of Life Sciences, Bharathidasan University, Tiruchirappalli 620 024, Tamil Nadu, India
|
35
|
Ma CY, Chen YPP, Berger B, Liao CS. Identification of protein complexes by integrating multiple alignment of protein interaction networks. Bioinformatics 2017; 33:1681-1688. [PMID: 28130237 PMCID: PMC5860626 DOI: 10.1093/bioinformatics/btx043] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.6] [Received: 07/31/2016] [Revised: 11/22/2016] [Accepted: 01/20/2017] [Indexed: 01/04/2023] Open
Abstract
MOTIVATION Protein complexes are one of the keys to studying the behavior of a cell system. Many biological functions are carried out by protein complexes. During the past decade, the main strategy used to identify protein complexes from high-throughput network data has been to extract near-cliques or highly dense subgraphs from a single protein-protein interaction (PPI) network. Although experimental PPI data have increased significantly over recent years, most PPI networks still have many false positive interactions and false negative edge loss due to the limitations of high-throughput experiments. In particular, the false negative errors restrict the search space of such conventional protein complex identification approaches. Thus, it has become one of the most challenging tasks in systems biology to automatically identify protein complexes. RESULTS In this study, we propose a new algorithm, NEOComplex (NECC- and Ortholog-based Complex identification by multiple network alignment), which integrates functional orthology information that can be obtained from different types of multiple network alignment (MNA) approaches to expand the search space of protein complex detection. As part of our approach, we also define a new edge clustering coefficient (NECC) to assign weights to interaction edges in PPI networks so that protein complexes can be identified more accurately. The NECC is based on the intuition that there is functional information captured in the common neighbors of the common neighbors as well. Our results show that our algorithm outperforms well-known protein complex identification tools in a balance between precision and recall on three eukaryotic species: human, yeast, and fly. As a result of MNAs of the species, the proposed approach can tolerate edge loss in PPI networks and even discover sparse protein complexes which have traditionally been a challenge to predict.
AVAILABILITY AND IMPLEMENTATION http://acolab.ie.nthu.edu.tw/bionetwork/NEOComplex. CONTACT bab@csail.mit.edu or csliao@ie.nthu.edu.tw. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Cheng-Yu Ma
  - Department of Computer Science, National Tsing Hua University, Hsinchu, Taiwan
  - Department of Computer Science and Computer Engineering, La Trobe University, Melbourne, Vic, Australia
- Yi-Ping Phoebe Chen
  - Department of Computer Science and Computer Engineering, La Trobe University, Melbourne, Vic, Australia
- Bonnie Berger
  - Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA
  - Department of Mathematics and Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA, USA
- Chung-Shou Liao
  - Department of Industrial Engineering and Engineering Management, National Tsing Hua University, Hsinchu, Taiwan
|
36
|
Abstract
Paralleling the increasing availability of protein-protein interaction (PPI) network data, several network alignment methods have been proposed. Network alignments have been used to uncover functionally conserved network parts and to transfer annotations. However, due to the computational intractability of the network alignment problem, aligners are heuristics providing divergent solutions and no consensus exists on a gold standard, or which scoring scheme should be used to evaluate them. We comprehensively evaluate the alignment scoring schemes and global network aligners on large scale PPI data and observe that three methods, HUBALIGN, L-GRAAL and NATALIE, regularly produce the most topologically and biologically coherent alignments. We study the collective behaviour of network aligners and observe that PPI networks are almost entirely aligned with a handful of aligners that we unify into a new tool, Ulign. Ulign enables complete alignment of two networks, which traditional global and local aligners fail to do. Also, multiple mappings of Ulign define biologically relevant soft clusterings of proteins in PPI networks, which may be used for refining the transfer of annotations across networks. Hence, PPI networks are already well investigated by current aligners, so to gain additional biological insights, a paradigm shift is needed. We propose such a shift come from aligning all available data types collectively rather than any particular data type in isolation from others.
|
37
|
Mamano N, Hayes WB. SANA: simulated annealing far outperforms many other search algorithms for biological network alignment. Bioinformatics 2017; 33:2156-2164. [DOI: 10.1093/bioinformatics/btx090] [Citation(s) in RCA: 53] [Impact Index Per Article: 7.6] [Received: 09/19/2016] [Accepted: 02/08/2017] [Indexed: 11/14/2022] Open
|
38
|
Kazemi E, Hassani H, Grossglauser M, Pezeshgi Modarres H. PROPER: global protein interaction network alignment through percolation matching. BMC Bioinformatics 2016; 17:527. [PMID: 27955623 PMCID: PMC5153870 DOI: 10.1186/s12859-016-1395-9] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Received: 05/03/2016] [Accepted: 11/29/2016] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND The alignment of protein-protein interaction (PPI) networks enables us to uncover the relationships between different species, which leads to a deeper understanding of biological systems. Network alignment can be used to transfer biological knowledge between species. Although different PPI-network alignment algorithms were introduced during the last decade, developing an accurate and scalable algorithm that can find alignments with high biological and structural similarities among PPI networks is still challenging. RESULTS In this paper, we introduce a new global network alignment algorithm for PPI networks called PROPER. Compared to other global network alignment methods, our algorithm shows higher accuracy and speed over real PPI datasets and synthetic networks. We show that the PROPER algorithm can detect large portions of conserved biological pathways between species. Also, using a simple parsimonious evolutionary model, we explain why PROPER performs well based on several different comparison criteria. CONCLUSIONS We highlight that PROPER has high potential in further applications such as detecting biological pathways, finding protein complexes and PPI prediction. The PROPER algorithm is available at http://proper.epfl.ch.
Affiliation(s)
- Ehsan Kazemi
- School of Computer and Communication Sciences, EPFL, Lausanne, Switzerland
- Hamed Hassani
- Department of Computer Science, ETHZ, Zurich, Switzerland
|
39
|
Li Z, Liu Z, Zhong W, Huang M, Wu N, Xie Y, Dai Z, Zou X. Large-scale identification of human protein function using topological features of interaction network. Sci Rep 2016; 6:37179. [PMID: 27849060 PMCID: PMC5111120 DOI: 10.1038/srep37179] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 12/11/2015] [Accepted: 10/26/2016] [Indexed: 12/25/2022]
Abstract
The annotation of protein function is a vital step to elucidate the essence of life at a molecular level, and it is also meritorious in biomedical and pharmaceutical industry. Developments of sequencing technology result in constant expansion of the gap between the number of the known sequences and their functions. Therefore, it is indispensable to develop a computational method for the annotation of protein function. Herein, a novel method is proposed to identify protein function based on the weighted human protein-protein interaction network and graph theory. The network topology features with local and global information are presented to characterise proteins. The minimum redundancy maximum relevance algorithm is used to select 227 optimized feature subsets and support vector machine technique is utilized to build the prediction models. The performance of current method is assessed through 10-fold cross-validation test, and the range of accuracies is from 67.63% to 100%. Comparing with other annotation methods, the proposed way possesses a 50% improvement in the predictive accuracy. Generally, such network topology features provide insights into the relationship between protein functions and network architectures. The source code of Matlab is freely available on request from the authors.
Affiliation(s)
- Zhanchao Li
- School of Chemistry and Chemical Engineering, Guangdong Pharmaceutical University, Guangzhou, 510006, People's Republic of China
- Zhiqing Liu
- School of Chemistry and Chemical Engineering, Guangdong Pharmaceutical University, Guangzhou, 510006, People's Republic of China
- Wenqian Zhong
- School of Chemistry and Chemical Engineering, Guangdong Pharmaceutical University, Guangzhou, 510006, People's Republic of China
- Menghua Huang
- School of Chemistry and Chemical Engineering, Guangdong Pharmaceutical University, Guangzhou, 510006, People's Republic of China
- Na Wu
- School of Chemistry and Chemical Engineering, Sun Yat-Sen University, Guangzhou, 510275, People's Republic of China
- Yun Xie
- School of Chemistry and Chemical Engineering, Guangdong Pharmaceutical University, Guangzhou, 510006, People's Republic of China
- Zong Dai
- School of Chemistry and Chemical Engineering, Sun Yat-Sen University, Guangzhou, 510275, People's Republic of China
- Xiaoyong Zou
- SYSU-CMU Shunde International Joint Research Institute, Shunde, 528300, People's Republic of China
- School of Chemistry and Chemical Engineering, Sun Yat-Sen University, Guangzhou, 510275, People's Republic of China
|
40
|
Sarajlić A, Malod-Dognin N, Yaveroğlu ÖN, Pržulj N. Graphlet-based Characterization of Directed Networks. Sci Rep 2016; 6:35098. [PMID: 27734973 PMCID: PMC5062067 DOI: 10.1038/srep35098] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.6] [Received: 03/29/2016] [Accepted: 09/26/2016] [Indexed: 01/22/2023]
Abstract
We are flooded with large-scale, dynamic, directed, networked data. Analyses requiring exact comparisons between networks are computationally intractable, so new methodologies are sought. To analyse directed networks, we extend graphlets (small induced sub-graphs) and their degrees to directed data. Using these directed graphlets, we generalise state-of-the-art network distance measures (RGF, GDDA and GCD) to directed networks and show their superiority for comparing directed networks. Also, we extend the canonical correlation analysis framework that enables uncovering the relationships between the wiring patterns around nodes in a directed network and their expert annotations. On directed World Trade Networks (WTNs), our methodology allows uncovering the core-broker-periphery structure of the WTN, predicting the economic attributes of a country, such as its gross domestic product, from its wiring patterns in the WTN for up-to ten years in the future. It does so by enabling us to track the dynamics of a country's positioning in the WTN over years. On directed metabolic networks, our framework yields insights into preservation of enzyme function from the network wiring patterns rather than from sequence data. Overall, our methodology enables advanced analyses of directed networked data from any area of science, allowing domain-specific interpretation of a directed network's topology.
Affiliation(s)
- Anida Sarajlić
- Department of Computing, Imperial College London, SW7 2AZ London, UK
- Noël Malod-Dognin
- Department of Computer Science, University College London, WC1E 6BT London, UK
- Nataša Pržulj
- Department of Computer Science, University College London, WC1E 6BT London, UK
|
41
|
|
42
|
Hu R, Ren G, Sun G, Sun X. TarNet: An Evidence-Based Database for Natural Medicine Research. PLoS One 2016; 11:e0157222. [PMID: 27337171 PMCID: PMC4919029 DOI: 10.1371/journal.pone.0157222] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 02/03/2016] [Accepted: 05/26/2016] [Indexed: 11/26/2022]
Abstract
Background Complex diseases seriously threaten human health. Drug discovery approaches based on “single genes, single drugs, and single targets” are limited in targeting complex diseases. The development of new multicomponent drugs for complex diseases is imperative, and the establishment of a suitable solution for drug group-target protein network analysis is a key scientific problem that must be addressed. Herbal medicines have formed the basis of sophisticated systems of traditional medicine and have given rise to some key drugs that remain in use today. The search for new molecules is currently taking a different route, whereby scientific principles of ethnobotany and ethnopharmacognosy are being used by chemists in the discovery of different sources and classes of compounds. Results In this study, we developed TarNet, a manually curated database and platform of traditional medicinal plants with natural compounds that includes potential bio-target information. We gathered information on proteins that are related to or affected by medicinal plant ingredients and data on protein–protein interactions (PPIs). TarNet includes in-depth information on both plant–compound–protein relationships and PPIs. Additionally, TarNet can provide researchers with network construction analyses of biological pathways and protein–protein interactions (PPIs) associated with specific diseases. Researchers can upload a gene or protein list mapped to our PPI database that has been manually curated to generate relevant networks. Multiple functions are accessible for network topological calculations, subnetwork analyses, pathway analyses, and compound–protein relationships. Conclusions TarNet will serve as a useful analytical tool that will provide information on medicinal plant compound-affected proteins (potential targets) and system-level analyses for systems biology and network pharmacology researchers. 
TarNet is freely available at http://www.herbbol.org:8001/tarnet, and detailed tutorials on the program are also available.
Affiliation(s)
- Ruifeng Hu
- Beijing Key Laboratory of Innovative Drug Discovery of Traditional Chinese Medicine (Natural Medicine) and Translational Medicine, Institute of Medicinal Plant Development, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing, China
- Key Laboratory of Bioactive Substances and Resource Utilization of Chinese Herbal Medicine, Ministry of Education, Beijing, China
- Zhongguancun Open Laboratory of the Research and Development of Natural Medicine and Health Products, Beijing, China
- Guomin Ren
- Beijing Key Laboratory of Innovative Drug Discovery of Traditional Chinese Medicine (Natural Medicine) and Translational Medicine, Institute of Medicinal Plant Development, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing, China
- Key Laboratory of Bioactive Substances and Resource Utilization of Chinese Herbal Medicine, Ministry of Education, Beijing, China
- Zhongguancun Open Laboratory of the Research and Development of Natural Medicine and Health Products, Beijing, China
- Guibo Sun
- Beijing Key Laboratory of Innovative Drug Discovery of Traditional Chinese Medicine (Natural Medicine) and Translational Medicine, Institute of Medicinal Plant Development, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing, China
- Key Laboratory of Bioactive Substances and Resource Utilization of Chinese Herbal Medicine, Ministry of Education, Beijing, China
- Zhongguancun Open Laboratory of the Research and Development of Natural Medicine and Health Products, Beijing, China
- Xiaobo Sun
- Beijing Key Laboratory of Innovative Drug Discovery of Traditional Chinese Medicine (Natural Medicine) and Translational Medicine, Institute of Medicinal Plant Development, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing, China
- Key Laboratory of Bioactive Substances and Resource Utilization of Chinese Herbal Medicine, Ministry of Education, Beijing, China
- Zhongguancun Open Laboratory of the Research and Development of Natural Medicine and Health Products, Beijing, China
|
43
|
Ding D, Li L, Shu C, Sun X. K-shell Analysis Reveals Distinct Functional Parts in an Electron Transfer Network and Its Implications for Extracellular Electron Transfer. Front Microbiol 2016; 7:530. [PMID: 27148219 PMCID: PMC4837345 DOI: 10.3389/fmicb.2016.00530] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 01/05/2016] [Accepted: 03/31/2016] [Indexed: 01/17/2023]
Abstract
Shewanella oneidensis MR-1 is capable of extracellular electron transfer (EET) and hence has attracted considerable attention. The EET pathways mainly consist of c-type cytochromes, along with some other proteins involved in electron transfer processes. By whole genome study and protein interactions inquisition, we constructed a large-scale electron transfer network containing 2276 interactions among 454 electron transfer related proteins in S. oneidensis MR-1. Using the k-shell decomposition method, we identified and analyzed distinct parts of the electron transfer network. We found that there was a negative correlation between the k_s (k-shell values) and the average DR_100 (disordered regions per 100 amino acids) in every shell, which suggested that disordered regions of proteins played an important role during the formation and extension of the electron transfer network. Furthermore, proteins in the top three shells of the network are mainly located in the cytoplasm and inner membrane; these proteins can be responsible for transfer of electrons into the quinone pool in a wide variety of environmental conditions. In most of the other shells, proteins are broadly located throughout the five cellular compartments (cytoplasm, inner membrane, periplasm, outer membrane, and extracellular), which ensures the important EET ability of S. oneidensis MR-1. Specifically, the fourth shell was responsible for EET and the c-type cytochromes in the remaining shells of the electron transfer network were involved in aiding EET. Taken together, these results show that there are distinct functional parts in the electron transfer network of S. oneidensis MR-1, and the EET processes could achieve high efficiency through cooperation through such an electron transfer network.
Affiliation(s)
- Dewu Ding
- State Key Laboratory of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, Nanjing, China
- Department of Mathematics and Computer Science, Chizhou College, Chizhou, China
- Ling Li
- State Key Laboratory of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, Nanjing, China
- Chuanjun Shu
- State Key Laboratory of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, Nanjing, China
- Xiao Sun
- State Key Laboratory of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, Nanjing, China
|