1
|
Quantitative assessment of gene expression network module-validation methods. Sci Rep 2015; 5:15258. [PMID: 26470848 PMCID: PMC4607977 DOI: 10.1038/srep15258] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Received: 03/20/2015] [Accepted: 09/21/2015] [Indexed: 02/01/2023]
Abstract
Validation of pluripotent modules in diverse networks holds enormous potential for systems biology and network pharmacology. An emerging challenge is how to assess the accuracy of discovering all potential modules from multi-omic networks and how to validate their architectural characteristics using innovative computational methods that go beyond function enrichment and biological validation. To show the progress of frameworks in this domain, we systematically divided the existing Computational Validation Approaches based on Modular Architecture (CVAMA) into topology-based approaches (TBA) and statistics-based approaches (SBA). We compared the available module validation methods on 11 gene expression datasets. Each individual approach gave partially consistent results in the form of homogeneous models, whereas contradictory results were found between TBA and SBA. The TBA of the Zsummary value had a higher Validation Success Ratio (VSR) (51%) and a higher Fluctuation Ratio (FR) (80.92%), whereas the SBA of the approximately unbiased (AU) p-value had a lower VSR (12.3%) and a lower FR (45.84%). The gray-area simulation study revealed a consistent result for these two models and indicated a lower Variation Ratio (VR) (8.10%) of TBA at 6 simulated levels. Despite facing many novel challenges and evidence limitations, CVAMA may offer novel insights into modular networks.
|
2
|
Farhangmehr F, Maurya MR, Tartakovsky DM, Subramaniam S. Information theoretic approach to complex biological network reconstruction: application to cytokine release in RAW 264.7 macrophages. BMC Systems Biology 2014; 8:77. [PMID: 24964861 PMCID: PMC4094931 DOI: 10.1186/1752-0509-8-77] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Received: 12/11/2013] [Accepted: 06/04/2014] [Indexed: 12/27/2022]
Abstract
BACKGROUND High-throughput methods for biological measurements generate vast amounts of quantitative data, which necessitate the development of advanced approaches to data analysis to help understand the underlying mechanisms and networks. Reconstruction of biological networks from measured data of different components is a significant challenge in systems biology. RESULTS We use an information theoretic approach to reconstruct phosphoprotein-cytokine networks in RAW 264.7 macrophage cells. Cytokines are secreted upon activation of a wide range of regulatory signals transduced by the phosphoprotein network. Identifying these components can help identify regulatory modules responsible for the inflammatory phenotype. The information theoretic approach is based on estimation of mutual information of interactions by using kernel density estimators. Mutual information provides a measure of statistical dependencies between interacting components. Using the topology of the network derived, we develop a data-driven parsimonious input-output model of the phosphoprotein-cytokine network. CONCLUSIONS We demonstrate the applicability of our information theoretic approach to reconstruction of biological networks. For the phosphoprotein-cytokine network, this approach not only captures most of the known signaling components involved in cytokine release but also predicts new signaling components involved in the release of cytokines. The results of this study are important for gaining a clear understanding of macrophage activation during the inflammation process.
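The kernel-density estimate of mutual information mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Gaussian kernel, the rule-of-thumb bandwidth, the resubstitution average over the sample points, and all function names are assumptions made for illustration, and the data are synthetic.

```python
import math
import random


def _bandwidth(values):
    """Rule-of-thumb bandwidth for a 1-D Gaussian kernel (an assumed choice)."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return 1.06 * std * n ** (-1 / 5)


def _gauss(u):
    """Standard normal density, used as the smoothing kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2 * math.pi)


def kde_mutual_information(xs, ys):
    """Estimate I(X;Y) in nats as the sample mean of log(p_xy / (p_x * p_y)).

    The joint density uses a product of the two 1-D kernels; reusing the
    1-D bandwidths for the joint estimate is a simplification.
    """
    n = len(xs)
    hx, hy = _bandwidth(xs), _bandwidth(ys)
    total = 0.0
    for i in range(n):
        kx = [_gauss((xs[i] - xs[j]) / hx) / hx for j in range(n)]
        ky = [_gauss((ys[i] - ys[j]) / hy) / hy for j in range(n)]
        p_x = sum(kx) / n
        p_y = sum(ky) / n
        p_xy = sum(a * b for a, b in zip(kx, ky)) / n
        total += math.log(p_xy / (p_x * p_y))
    return total / n


# Synthetic data: one dependent pair and one independent pair.
rng = random.Random(0)
xs = [rng.gauss(0, 1) for _ in range(150)]
ys_dep = [x + rng.gauss(0, 0.3) for x in xs]
ys_ind = [rng.gauss(0, 1) for _ in range(150)]

mi_dep = kde_mutual_information(xs, ys_dep)
mi_ind = kde_mutual_information(xs, ys_ind)
print(f"dependent pair:   {mi_dep:.3f} nats")
print(f"independent pair: {mi_ind:.3f} nats")
```

In a network-reconstruction setting, an estimate like this would be computed for each candidate pair of components, and pairs with high values would be retained as edges; how the threshold is chosen is outside the scope of this sketch.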
Affiliation(s)
- Shankar Subramaniam
- Department of Bioengineering, University of California San Diego, 9500 Gilman Drive, 92093-0412 La Jolla, CA, USA.
|
3
|
Zhao J, Hu X, He T, Li P, Zhang M, Shen X. An edge-based protein complex identification algorithm with gene co-expression data (PCIA-GeCo). IEEE Trans Nanobioscience 2014; 13:80-8. [PMID: 24803023 DOI: 10.1109/tnb.2014.2317519] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Indexed: 02/02/2023]
Abstract
Recent studies have shown that a protein complex is composed of core proteins and attachment proteins, and that proteins inside the core are highly co-expressed. Based on this concept, we reconstruct a weighted PPI network using gene expression data and develop a novel edge-based protein complex identification algorithm (PCIA-GeCo). First, we select edges with high co-expression coefficients as seeds to form the preliminary cores. Then, the preliminary cores are filtered according to the weighted density of the complex core to obtain unique cores. Finally, the protein complexes are generated by identifying attachment proteins for each core. We compare our method with three existing algorithms, HUNTER, COACH and CORE, in terms of F-measure, coverage rate and P-value by matching the predicted complexes against benchmark complexes. The evaluation results show that PCIA-GeCo is effective and identifies protein complexes more accurately.
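The three steps in this abstract (seed edges, density filter, attachment proteins) can be illustrated with a simplified sketch. It is not PCIA-GeCo itself: the abstract does not give the rules, so the Pearson edge weights, the expansion of a seed edge by common neighbours, both thresholds, the "linked to more than half of the core" attachment rule, and the toy data are all assumptions chosen for illustration.

```python
import math
from itertools import combinations


def pearson(a, b):
    """Pearson correlation of two equal-length expression profiles."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def weight_edges(edges, expression):
    """Weight each PPI edge by the co-expression of its two proteins."""
    return {frozenset(e): pearson(expression[e[0]], expression[e[1]]) for e in edges}


def weighted_density(nodes, weights):
    """Sum of edge weights inside `nodes` divided by the number of node pairs."""
    pairs = list(combinations(sorted(nodes), 2))
    return sum(weights.get(frozenset(p), 0.0) for p in pairs) / len(pairs)


def find_cores(weights, seed_threshold, density_threshold):
    """Grow each high-weight seed edge by common neighbours, then filter by density."""
    neighbors = {}
    for edge in weights:
        u, v = tuple(edge)
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    cores = set()  # a set removes duplicate cores grown from different seeds
    for edge, w in weights.items():
        if w < seed_threshold:
            continue
        u, v = tuple(edge)
        core = frozenset(edge | (neighbors[u] & neighbors[v]))
        if weighted_density(core, weights) >= density_threshold:
            cores.add(core)
    return cores


def add_attachments(core, weights):
    """Attach outside proteins linked to more than half of the core (assumed rule)."""
    members = set(core)
    candidates = {p for edge in weights for p in edge} - members
    for p in candidates:
        links = sum(1 for m in core if frozenset((p, m)) in weights)
        if links > len(core) / 2:
            members.add(p)
    return members


# Toy data (hypothetical): five proteins with four expression measurements each.
expression = {
    "A": [1, 2, 3, 4],
    "B": [2, 4, 6, 8.5],
    "C": [1, 2.1, 2.9, 4.2],
    "D": [4, 3, 2, 1],
    "E": [1, 3, 2, 4],
}
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("A", "E"), ("B", "E")]

weights = weight_edges(edges, expression)
cores = find_cores(weights, seed_threshold=0.9, density_threshold=0.85)
complexes = [add_attachments(core, weights) for core in cores]
print(complexes)
```

On this toy network the three strongly co-expressed proteins A, B and C form the only core that passes the density filter, and E joins as an attachment because it interacts with two of the three core members.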
|
4
|
Integrative approaches for finding modular structure in biological networks. Nat Rev Genet 2013; 14:719-32. [PMID: 24045689 DOI: 10.1038/nrg3552] [Citation(s) in RCA: 351] [Impact Index Per Article: 31.9] [Indexed: 12/16/2022]
Abstract
A central goal of systems biology is to elucidate the structural and functional architecture of the cell. To this end, large and complex networks of molecular interactions are being rapidly generated for humans and model organisms. A recent focus of bioinformatics research has been to integrate these networks with each other and with diverse molecular profiles to identify sets of molecules and interactions that participate in a common biological function - that is, 'modules'. Here, we classify such integrative approaches into four broad categories, describe their bioinformatic principles and review their applications.
|
5
|
Detecting protein complexes based on relevancy from protein interaction networks. Interdiscip Sci 2013; 5:167-74. [PMID: 24307408 DOI: 10.1007/s12539-013-0171-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/07/2013] [Revised: 03/30/2013] [Accepted: 06/12/2013] [Indexed: 10/26/2022]
Abstract
In protein-protein interaction networks, proteins combine into macromolecular complexes to execute essential functions in the cell, such as replication, transcription and protein transport. To detect protein complexes from protein interaction networks, we used a relevant graph and an irrelevant graph to represent the connection between a node and a core graph. We defined a variable, Relevancy, to represent whether a node has a dense or loose connection to a core graph. We then proposed the Relevancy Judgment algorithm to detect protein complexes from protein interaction networks. Our algorithm decides whether a node belongs to a protein complex by judging the relevancy between the core graph and nodes outside the core graph. Experimental results show that our algorithm performs well in both accuracy and hit rate.
|
6
|
Affiliation(s)
- Xiaoke Ma
- School of Computer Science and Technology, Xidian University, No. 2 South TaiBai Road, Xi'an, Shaanxi 710071, P.R. China
|
7
|
Rajagopala SV, Sikorski P, Caufield JH, Tovchigrechko A, Uetz P. Studying protein complexes by the yeast two-hybrid system. Methods 2012; 58:392-9. [PMID: 22841565 DOI: 10.1016/j.ymeth.2012.07.015] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.6] [Received: 04/14/2012] [Revised: 07/10/2012] [Accepted: 07/12/2012] [Indexed: 01/13/2023]
Abstract
Protein complexes are typically analyzed by affinity purification and subsequent mass spectrometric analysis. However, in most cases the structure and topology of the complexes remain elusive from such studies. Here we investigate how the yeast two-hybrid system can be used to analyze direct interactions among proteins in a complex. First we tested all pairwise interactions among the seven proteins of Escherichia coli DNA polymerase III as well as an uncharacterized complex that includes MntR and PerR. Four and seven interactions were identified in these two complexes, respectively. In addition, we review Y2H data for three other complexes of known structure which serve as "gold standards", namely Varicella Zoster Virus (VZV) ribonucleotide reductase (RNR), the yeast proteasome, and bacteriophage lambda. Finally, we review a Y2H analysis of the human spliceosome, which may serve as an example of a dynamic mega-complex.
|
8
|
Yu L, Gao L, Sun PG. Research on Algorithms for Complexes and Functional Modules Prediction in Protein-Protein Interaction Networks. ACTA ACUST UNITED AC 2011. [DOI: 10.3724/sp.j.1016.2011.01239] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/25/2022]
|
9
|
Li W, Liu CC, Zhang T, Li H, Waterman MS, Zhou XJ. Integrative analysis of many weighted co-expression networks using tensor computation. PLoS Comput Biol 2011; 7:e1001106. [PMID: 21698123 PMCID: PMC3116899 DOI: 10.1371/journal.pcbi.1001106] [Citation(s) in RCA: 65] [Impact Index Per Article: 5.0] [Received: 07/16/2010] [Accepted: 02/08/2011] [Indexed: 11/18/2022]
Abstract
The rapid accumulation of biological networks poses new challenges and calls for powerful integrative analysis tools. Most existing methods capable of simultaneously analyzing a large number of networks were primarily designed for unweighted networks, and cannot easily be extended to weighted networks. However, it is known that transforming weighted into unweighted networks by dichotomizing the edges of weighted networks with a threshold generally leads to information loss. We have developed a novel, tensor-based computational framework for mining recurrent heavy subgraphs in a large set of massive weighted networks. Specifically, we formulate the recurrent heavy subgraph identification problem as a heavy 3D subtensor discovery problem with sparse constraints. We describe an effective approach to solving this problem by designing a multi-stage, convex relaxation protocol, and a non-uniform edge sampling technique. We applied our method to 130 co-expression networks, and identified 11,394 recurrent heavy subgraphs, grouped into 2,810 families. We demonstrated that the identified subgraphs represent meaningful biological modules by validating against a large set of compiled biological knowledge bases. We also showed that the likelihood for a heavy subgraph to be meaningful increases significantly with its recurrence in multiple networks, highlighting the importance of the integrative approach to biological network analysis. Moreover, our approach based on weighted graphs detects many patterns that would be overlooked using unweighted graphs. In addition, we identified a large number of modules that occur predominately under specific phenotypes. This analysis resulted in a genome-wide mapping of gene network modules onto the phenome. Finally, by comparing module activities across many datasets, we discovered high-order dynamic cooperativeness in protein complex networks and transcriptional regulatory networks.
Affiliation(s)
- Wenyuan Li
- Molecular and Computational Biology, Department of Biological Sciences, University of Southern California, Los Angeles, California, United States of America
- Chun-Chi Liu
- Molecular and Computational Biology, Department of Biological Sciences, University of Southern California, Los Angeles, California, United States of America
- Tong Zhang
- Department of Statistics, Rutgers University, New Brunswick, New Jersey, United States of America
- Haifeng Li
- Molecular and Computational Biology, Department of Biological Sciences, University of Southern California, Los Angeles, California, United States of America
- Michael S. Waterman
- Molecular and Computational Biology, Department of Biological Sciences, University of Southern California, Los Angeles, California, United States of America
- Xianghong Jasmine Zhou
- Molecular and Computational Biology, Department of Biological Sciences, University of Southern California, Los Angeles, California, United States of America
|
10
|
Abstract
The increasing availability of large-scale protein-protein interaction data has made it possible to understand the basic components and organization of cell machinery at the network level. The emerging challenge is how to analyze such complex interaction data to reveal the principles of cellular organization, processes and functions. Many studies have shown that clustering protein interaction networks is an effective approach for identifying protein complexes or functional modules, which has become a major research topic in systems biology. In this review, recent advances in clustering methods for protein interaction networks are presented in detail. The prediction of protein functions and interactions based on modules is also covered. Finally, the performance of different clustering methods is compared and directions for future research are discussed.
Affiliation(s)
- Jianxin Wang
- School of Information Science and Engineering, Central South University, Changsha 410083, China
- Department of Computer Science, Georgia State University, Atlanta, GA 30303, USA
- Min Li
- School of Information Science and Engineering, Central South University, Changsha 410083, China
- Youping Deng
- Rush University Cancer Center, Rush University Medical Center, Chicago, IL 60612, USA
- Yi Pan
- Department of Computer Science, Georgia State University, Atlanta, GA 30303, USA
|
11
|
Cho YR, Zhang A. Identification of functional hubs and modules by converting interactome networks into hierarchical ordering of proteins. BMC Bioinformatics 2010; 11 Suppl 3:S3. [PMID: 20438650 PMCID: PMC2863062 DOI: 10.1186/1471-2105-11-s3-s3] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Indexed: 11/10/2022]
Abstract
BACKGROUND Protein-protein interactions play a key role in the biological processes of proteins within a cell. Recent high-throughput techniques have generated protein-protein interaction data at genome scale. A wide range of computational approaches have been applied to interactome network analysis to uncover functional organization and pathways. However, these approaches are challenged by the complex connectivity of the networks. Studies have found that protein interaction networks are typically characterized by intrinsic topological features: high modularity and hub-oriented structure. Elucidating the structural roles of modules and hubs is a critical step in complex interactome network analysis. RESULTS We propose a novel approach to convert the complex structure of an interactome network into a hierarchical ordering of proteins. This algorithm measures functional similarity between proteins based on the path strength model, and reveals a hub-oriented tree structure hidden in the complex network. We score hub confidence and identify functional modules in the tree structure of proteins retrieved by our algorithm. Our experimental results on the yeast protein interactome network demonstrate that the selected hubs are essential proteins for performing functions. In network topology, they have a role in bridging different functional modules. Furthermore, our approach has high accuracy in identifying hierarchically distributed functional modules. CONCLUSIONS Decomposing, converting, and synthesizing complex interaction networks are fundamental tasks for modeling their structural behaviors. In this study, we systematically analyzed complex interactome network structures to retrieve functional information. Unlike previous hierarchical clustering methods, this approach dynamically explores the hierarchical structure of proteins in a global view. It is well suited to the interactome networks of higher organisms because of its efficiency and scalability.
Affiliation(s)
- Young-Rae Cho
- Department of Computer Science, Baylor University, Waco, TX 76798, USA.
|
12
|
Sadovsky MG. Genomes and information. Biophysics (Nagoya-shi) 2009. [DOI: 10.1134/s0006350909040034] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022]
|
13
|
Gao L, Sun PG, Song J. Clustering algorithms for detecting functional modules in protein interaction networks. J Bioinform Comput Biol 2009; 7:217-42. [PMID: 19226668 DOI: 10.1142/s0219720009004023] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.8] [Received: 09/12/2008] [Revised: 10/21/2008] [Accepted: 10/21/2008] [Indexed: 01/21/2023]
Abstract
Protein-Protein Interaction (PPI) networks are believed to be important sources of information related to biological processes and complex metabolic functions of the cell. When studying the workings of a biological cell, it is useful to be able to detect known protein complexes and predict still undiscovered ones within the cell's PPI networks. Such predictions may be used as an inexpensive tool to direct biological experiments. The increasing amount of available PPI data necessitates a fast, accurate approach to biological complex identification. Because of the importance of this task in studies of protein interaction networks, there are different models and algorithms for identifying functional modules in PPI networks. In this paper, we review some representative algorithms, focusing on the algorithms underlying the approaches and how the algorithms relate to each other. In particular, a comparison is given based on the properties of the algorithms. Since PPI networks are noisy and still incomplete, some methods that consider additional properties for preprocessing and purifying PPI data are presented. We also discuss the functional annotation and validation of protein complexes. Finally, new progress and future research directions are discussed from the computational viewpoint.
Affiliation(s)
- Lin Gao
- School of Computer Science and Technology, Xidian University, Xi'an 710071, China.
|