1
A review of clique-based overlapping community detection algorithms. Knowl Inf Syst 2022. [DOI: 10.1007/s10115-022-01704-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/26/2022]
2
Unsupervised induction of inflectional families. Comput Speech Lang 2022. [DOI: 10.1016/j.csl.2021.101324] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]
3
Hwang S, Lee T, Yoon Y. Exploring disease comorbidity in a module-module interaction network. J Bioinform Comput Biol 2020; 18:2050010. [PMID: 32404015 DOI: 10.1142/s0219720020500109] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
Abstract
Understanding disease comorbidity contributes to improved quality of life in patients who are suffering from multiple diseases. Therefore, to better explore comorbid diseases, the clarification of associations between diseases based on biological functions is essential. In our study, we propose a method for identifying disease comorbidity in a module-based network, named the module-module interaction (MMI) network, which represents how biological functions influence each other. To construct the MMI network, we detected gene modules - sets of genes that have a higher probability of taking part in specific functions - and established a link between these modules. Subsequently, we constructed disease-related networks in the MMI network to understand inherent disease mechanisms and calculated comorbidity scores of disease pairs using Gene Ontology (GO) terms. Our results show that we can obtain further information on disease mechanisms by considering interactions between functional modules instead of between genes. In addition, we verified that predicted comorbid relationships of disease pairs based on the MMI network are more significant than those based on the protein-protein interaction (PPI) network. This study can be useful to elucidate the mechanisms underlying comorbidities for further study, which will provide a broader insight into the pathogenesis of diseases.
Affiliation(s)
- Soyoun Hwang
  - Department of IT Convergence Engineering, Gachon University, 1342 Seongnam-daero, Sujeong-gu, Seongnam-si, Gyeonggi-do, Korea
- Taekeon Lee
  - Department of Computer Engineering, Gachon University, 1342 Seongnam-daero, Sujeong-gu, Seongnam-si, Gyeonggi-do, Korea
- Youngmi Yoon
  - Department of Computer Engineering, Gachon University, 1342 Seongnam-daero, Sujeong-gu, Seongnam-si, Gyeonggi-do, Korea
4
Grbić M, Matić D, Kartelj A, Vračević S, Filipović V. A three-phase method for identifying functionally related protein groups in weighted PPI networks. Comput Biol Chem 2020; 86:107246. [PMID: 32339914 DOI: 10.1016/j.compbiolchem.2020.107246] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/11/2019] [Revised: 01/27/2020] [Accepted: 03/03/2020] [Indexed: 01/17/2023]
Abstract
Identifying significant protein groups is of great importance for further understanding protein functions. This paper introduces a novel three-phase heuristic method for identifying such groups in weighted PPI networks. In the first phase a variable neighborhood search (VNS) algorithm is applied on a weighted PPI network, in order to support protein complexes by adding a minimum number of new PPIs. In the second phase proteins from different complexes are merged into larger protein groups. In the third phase these groups are expanded by a number of 2-level neighbor proteins, favoring proteins that have higher average gene co-expression with the base group proteins. Experimental results show that: (i) the proposed VNS algorithm outperforms the existing approach described in literature and (ii) the above-mentioned three-phase method identifies protein groups with very high statistical significance.
Affiliation(s)
- Milana Grbić
  - University of Banjaluka, Faculty of Natural Sciences and Mathematics, Mladena Stojanovića 2, 78000 Banjaluka, Bosnia and Herzegovina.
- Dragan Matić
  - University of Banjaluka, Faculty of Natural Sciences and Mathematics, Mladena Stojanovića 2, 78000 Banjaluka, Bosnia and Herzegovina.
- Aleksandar Kartelj
  - University of Belgrade, Faculty of Mathematics, Studentski trg 16/IV 11 000, Belgrade, Serbia.
- Savka Vračević
  - University of Banjaluka, Faculty of Natural Sciences and Mathematics, Mladena Stojanovića 2, 78000 Banjaluka, Bosnia and Herzegovina.
- Vladimir Filipović
  - University of Belgrade, Faculty of Mathematics, Studentski trg 16/IV 11 000, Belgrade, Serbia.
5
Mao Y, Chen L, Li J, Shangguan AJ, Kujawa S, Zhao H. A network analysis revealed the essential and common downstream proteins related to inguinal hernia. PLoS One 2020; 15:e0226885. [PMID: 31910207 PMCID: PMC6946160 DOI: 10.1371/journal.pone.0226885] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/07/2019] [Accepted: 12/08/2019] [Indexed: 01/10/2023]
Abstract
Although more than 1 in 4 men develop symptomatic inguinal hernia during their lifetime, the molecular mechanism behind inguinal hernia remains unknown. Here, we explored the protein-protein interaction network built on known inguinal hernia-causative genes to identify essential and common downstream proteins for inguinal hernia formation. We discovered that PIK3R1, PTPN11, TGFBR1, CDC42, SOS1, and KRAS were the most essential inguinal hernia-causative proteins and UBC, GRB2, CTNNB1, HSP90AA1, CBL, PLCG1, and CRK were listed as the most commonly-involved downstream proteins. In addition, the transmembrane receptor protein tyrosine kinase signaling pathway was the most frequently found inguinal hernia-related pathway. Our in silico approach was able to uncover a novel molecular mechanism underlying inguinal hernia formation by identifying inguinal hernia-related essential proteins and potential common downstream proteins of inguinal hernia-causative proteins.
Affiliation(s)
- Yimin Mao
  - School of Information and Technology, Jiangxi University of Science and Technology, Jiangxi, China
  - Applied Science Institute, Jiangxi University of Science and Technology, Jiangxi, China
- Le Chen
  - School of Information and Technology, Jiangxi University of Science and Technology, Jiangxi, China
- Jianghua Li
  - School of Information and Technology, Jiangxi University of Science and Technology, Jiangxi, China
- Anna Junjie Shangguan
  - Department of Radiology, Feinberg School of Medicine, Northwestern University, Chicago, Illinois, United States of America
- Stacy Kujawa
  - Division of Reproductive Science in Medicine, Department of Obstetrics and Gynecology, Feinberg School of Medicine, Northwestern University, Chicago, Illinois, United States of America
- Hong Zhao
  - Division of Reproductive Science in Medicine, Department of Obstetrics and Gynecology, Feinberg School of Medicine, Northwestern University, Chicago, Illinois, United States of America
6
Tarsani E, Kranis A, Maniatis G, Avendano S, Hager-Theodorides AL, Kominakis A. Discovery and characterization of functional modules associated with body weight in broilers. Sci Rep 2019; 9:9125. [PMID: 31235723 PMCID: PMC6591351 DOI: 10.1038/s41598-019-45520-5] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 01/16/2019] [Accepted: 06/04/2019] [Indexed: 12/31/2022]
Abstract
Aim of the present study was to investigate whether body weight (BW) in broilers is associated with functional modular genes. To this end, first a GWAS for BW was conducted using 6,598 broilers and the high density SNP array. The next step was to search for positional candidate genes and QTLs within strong LD genomic regions around the significant SNPs. Using all positional candidate genes, a network was then constructed and community structure analysis was performed. Finally, functional enrichment analysis was applied to infer the functional relevance of modular genes. A total number of 645 positional candidate genes were identified in strong LD genomic regions around 11 genome-wide significant markers. 428 of the positional candidate genes were located within growth related QTLs. Community structure analysis detected 5 modules while functional enrichment analysis showed that 52 modular genes participated in developmental processes such as skeletal system development. An additional number of 14 modular genes (GABRG1, NGF, APOBEC2, STAT5B, STAT3, SMAD4, MED1, CACNB1, SLAIN2, LEMD2, ZC3H18, TMEM132D, FRYL and SGCB) were also identified as related to body weight. Taken together, current results suggested a total number of 66 genes as most plausible functional candidates for the trait examined.
Affiliation(s)
- Eirini Tarsani
  - Department of Animal Science and Aquaculture, Agricultural University of Athens, Iera Odos 75, 11855, Athens, Greece.
- Andreas Kranis
  - Aviagen Ltd., Newbridge, Midlothian, EH28 8SZ, UK
  - The Roslin Institute, University of Edinburgh, EH25 9RG, Midlothian, United Kingdom
- Ariadne L Hager-Theodorides
  - Department of Animal Science and Aquaculture, Agricultural University of Athens, Iera Odos 75, 11855, Athens, Greece
- Antonios Kominakis
  - Department of Animal Science and Aquaculture, Agricultural University of Athens, Iera Odos 75, 11855, Athens, Greece
7
Kim J, Shin M, Kim J, Park C, Lee S, Woo J, Kim H, Seo D, Yu S, Park S. CASS: A distributed network clustering algorithm based on structure similarity for large-scale network. PLoS One 2018; 13:e0203670. [PMID: 30303961 PMCID: PMC6179193 DOI: 10.1371/journal.pone.0203670] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 11/22/2017] [Accepted: 08/24/2018] [Indexed: 12/21/2022]
Abstract
As the size of networks increases, it is becoming important to analyze large-scale network data. A network clustering algorithm is useful for analysis of network data. Conventional network clustering algorithms in a single machine environment rather than a parallel machine environment are actively being researched. However, these algorithms cannot analyze large-scale network data because of memory size issues. As a solution, we propose a network clustering algorithm for large-scale network data analysis using Apache Spark by changing the paradigm of the conventional clustering algorithm to improve its efficiency in the Apache Spark environment. We also apply optimization approaches such as Bloom filter and shuffle selection to reduce memory usage and execution time. By evaluating our proposed algorithm based on an average normalized cut, we confirmed that the algorithm can analyze diverse large-scale network datasets such as biological, co-authorship, internet topology and social networks. Experimental results show that the proposed algorithm can develop more accurate clusters than comparative algorithms with less memory usage. Furthermore, we confirm the proposed optimization approaches and the scalability of the proposed algorithm. In addition, we validate that clusters found from the proposed algorithm can represent biologically meaningful functions.
Affiliation(s)
- Jungrim Kim
  - Department of Computer Science, Yonsei University, Seoul, South Korea
- Mincheol Shin
  - Department of Computer Science, Yonsei University, Seoul, South Korea
- Jeongwoo Kim
  - Department of Computer Science, Yonsei University, Seoul, South Korea
- Chihyun Park
  - Department of Computer Science, Yonsei University, Seoul, South Korea
- Sujin Lee
  - Department of Computer Science, Yonsei University, Seoul, South Korea
- Jaemin Woo
  - Department of Computer Science, Yonsei University, Seoul, South Korea
- Hyerim Kim
  - Department of Computer Science, Yonsei University, Seoul, South Korea
- Dongmin Seo
  - Korea Institute of Science and Technology Information, Daejeon, South Korea
- Seokjong Yu
  - Korea Institute of Science and Technology Information, Daejeon, South Korea
- Sanghyun Park
  - Department of Computer Science, Yonsei University, Seoul, South Korea
8
Wang MY, Liang JW, Olounfeh KM, Sun Q, Zhao N, Meng FH. A Comprehensive In Silico Method to Study the QSTR of the Aconitine Alkaloids for Designing Novel Drugs. Molecules 2018; 23:E2385. [PMID: 30231506 PMCID: PMC6225272 DOI: 10.3390/molecules23092385] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 08/24/2018] [Revised: 09/11/2018] [Accepted: 09/12/2018] [Indexed: 12/22/2022]
Abstract
A combined in silico method was developed to predict potential protein targets that are involved in cardiotoxicity induced by aconitine alkaloids and to study the quantitative structure-toxicity relationship (QSTR) of these compounds. For the prediction research, a Protein-Protein Interaction (PPI) network was built from the extraction of useful information about protein interactions connected with aconitine cardiotoxicity, based on nearly a decade of literature and the STRING database. The software Cytoscape and the PharmMapper server were utilized to screen for essential proteins in the constructed network. The Calcium-Calmodulin-Dependent Protein Kinase II alpha (CAMK2A) and gamma (CAMK2G) were identified as potential targets. To obtain a deeper insight into the relationship between the toxicity and the structure of aconitine alkaloids, the present study utilized QSAR models built in Sybyl software that possess internal robustness and external high predictions. The molecular dynamics simulations carried out here have demonstrated that aconitine alkaloids possess binding stability for the receptor CAMK2G. In conclusion, this comprehensive method will serve as a tool for following a structural modification of the aconitine alkaloids and lead to a better insight into the cardiotoxicity induced by the compounds that have similar structures to its derivatives.
Affiliation(s)
- Ming-Yang Wang
  - School of Pharmacy, China Medical University, Shenyang 110122, Liaoning, China.
- Jing-Wei Liang
  - School of Pharmacy, China Medical University, Shenyang 110122, Liaoning, China.
- Qi Sun
  - School of Pharmacy, China Medical University, Shenyang 110122, Liaoning, China.
- Nan Zhao
  - School of Pharmacy, China Medical University, Shenyang 110122, Liaoning, China.
- Fan-Hao Meng
  - School of Pharmacy, China Medical University, Shenyang 110122, Liaoning, China.
9
Ma C, Xiang BB, Chen HS, Small M, Zhang HF. Detection of core-periphery structure in networks based on 3-tuple motifs. Chaos (Woodbury, N.Y.) 2018; 28:053121. [PMID: 29857652 DOI: 10.1063/1.5023719] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/08/2023]
Abstract
Detecting mesoscale structure, such as community structure, is of vital importance for analyzing complex networks. Recently, a new mesoscale structure, core-periphery (CP) structure, has been identified in many real-world systems. In this paper, we propose an effective algorithm for detecting CP structure based on a 3-tuple motif. In this algorithm, we first define a 3-tuple motif in terms of the patterns of edges as well as the property of nodes, and then a motif adjacency matrix is constructed based on the 3-tuple motif. Finally, the problem is converted to find a cluster that minimizes the smallest motif conductance. Our algorithm works well in different CP structures: including single or multiple CP structure, and local or global CP structures. Results on the synthetic and the empirical networks validate the high performance of our method.
Affiliation(s)
- Chuang Ma
  - School of Mathematical Science, Anhui University, Hefei 230601, People's Republic of China
- Bing-Bing Xiang
  - School of Mathematical Science, Anhui University, Hefei 230601, People's Republic of China
- Han-Shuang Chen
  - School of Physics and Material Science, Anhui University, Hefei 230601, China
- Michael Small
  - Department of Mathematics and Statistics, The University of Western Australia, 35 Stirling Highway, Crawley, WA 6009, Australia
- Hai-Feng Zhang
  - School of Mathematical Science, Anhui University, Hefei 230601, People's Republic of China
10
Xiang BB, Bao ZK, Ma C, Zhang X, Chen HS, Zhang HF. A unified method of detecting core-periphery structure and community structure in networks. Chaos (Woodbury, N.Y.) 2018; 28:013122. [PMID: 29390643 DOI: 10.1063/1.4990734] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 06/07/2023]
Abstract
The core-periphery structure and the community structure are two typical meso-scale structures in complex networks. Although community detection has been extensively investigated from different perspectives, the definition and the detection of the core-periphery structure have not received much attention. Furthermore, the detection problems of the core-periphery and community structure were separately investigated. In this paper, we develop a unified framework to simultaneously detect the core-periphery structure and community structure in complex networks. Moreover, there are several extra advantages of our algorithm: our method can detect not only single but also multiple pairs of core-periphery structures; the overlapping nodes belonging to different communities can be identified; different scales of core-periphery structures can be detected by adjusting the size of the core. The good performance of the method has been validated on synthetic and real complex networks. So, we provide a basic framework to detect the two typical meso-scale structures: the core-periphery structure and the community structure.
Affiliation(s)
- Bing-Bing Xiang
  - School of Mathematical Science, Anhui University, Hefei 230601, People's Republic of China
- Zhong-Kui Bao
  - School of Mathematical Science, Anhui University, Hefei 230601, People's Republic of China
- Chuang Ma
  - School of Mathematical Science, Anhui University, Hefei 230601, People's Republic of China
- Xingyi Zhang
  - Institute of Bio-inspired Intelligence and Mining Knowledge, School of Computer Science and Technology, Anhui University, Hefei 230601, China
- Han-Shuang Chen
  - School of Physics and Material Science, Anhui University, Hefei 230601, China
- Hai-Feng Zhang
  - School of Mathematical Science, Anhui University, Hefei 230601, People's Republic of China
11
Liu G, Wang H, Chu H, Yu J, Zhou X. Functional diversity of topological modules in human protein-protein interaction networks. Sci Rep 2017; 7:16199. [PMID: 29170401 PMCID: PMC5701033 DOI: 10.1038/s41598-017-16270-z] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 05/12/2017] [Accepted: 11/09/2017] [Indexed: 01/18/2023]
Abstract
A large-scale molecular interaction network of protein-protein interactions (PPIs) enables the automatic detection of molecular functional modules through a computational approach. However, the functional modules that are typically detected by topological community detection algorithms may be diverse in functional homogeneity and are empirically considered to be default functional modules. Thus, a significant challenge that has been described but not elucidated is investigating the relationship between topological modules and functional modules. We systematically investigated this issue by initially using seven widely used community detection algorithms to partition the PPI network into communities. Four homogeneity measures were subsequently implemented to evaluate the functional homogeneity of protein community. We determined that a significant portion of topological modules with heterogeneous functionality exists and should be further investigated; moreover, these findings indicated that topologically based functional module detection approaches must be reconsidered. Furthermore, we found that the functional homogeneity of topological modules is positively correlated with their edge densities, degree of association with diseases and general Gene Ontology (GO) terms. Thus, topologically based module detection approaches should be used with caution in the identification of functional modules with high homogeneity.
Affiliation(s)
- Guangming Liu
  - School of Computer and Information Technology and Beijing Key Lab of Traffic Data Analysis and Mining, Beijing Jiaotong University, Beijing, 100044, China
- Huixin Wang
  - School of Computer and Information Technology and Beijing Key Lab of Traffic Data Analysis and Mining, Beijing Jiaotong University, Beijing, 100044, China
- Hongwei Chu
  - Dalian University of Technology, Dalian, 116024, China
  - Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian, 116023, China
- Jian Yu
  - School of Computer and Information Technology and Beijing Key Lab of Traffic Data Analysis and Mining, Beijing Jiaotong University, Beijing, 100044, China.
- Xuezhong Zhou
  - School of Computer and Information Technology and Beijing Key Lab of Traffic Data Analysis and Mining, Beijing Jiaotong University, Beijing, 100044, China.
12
Mao Y, Kuo SW, Chen L, Heckman CJ, Jiang MC. The essential and downstream common proteins of amyotrophic lateral sclerosis: A protein-protein interaction network analysis. PLoS One 2017; 12:e0172246. [PMID: 28282387 PMCID: PMC5345759 DOI: 10.1371/journal.pone.0172246] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 10/30/2016] [Accepted: 02/01/2017] [Indexed: 12/12/2022]
Abstract
Amyotrophic Lateral Sclerosis (ALS) is a devastating neurodegenerative disease characterized by selective loss of motoneurons. While several breakthroughs have been made in identifying ALS genetic defects, the detailed molecular mechanisms are still unclear. These genetic defects are involved in numerous biological processes, which converge to a common destiny: motoneuron degeneration. In addition, the common comorbid Frontotemporal Dementia (FTD) further complicates the investigation of ALS etiology. In this study, we aimed to explore the protein-protein interaction network built on known ALS-causative genes to identify essential proteins and common downstream proteins between classical ALS and ALS+FTD (classical ALS + ALS/FTD) groups. The results suggest that classical ALS and ALS+FTD share a similar essential protein set (VCP, FUS, TDP-43 and hnRNPA1) but have distinctive functional enrichment profiles. Thus, disruptions to these essential proteins might render motoneurons susceptible to cellular stresses and eventually vulnerable to proteinopathies. Moreover, we identified a common downstream protein, ubiquitin-C, extensively interconnected with ALS-causative proteins (22 out of 24), which was not linked to ALS previously. Our in silico approach provides the computational background for identifying ALS therapeutic targets, and points out the potential downstream common ground of ALS-causative mutations.
Affiliation(s)
- Yimin Mao
  - Applied Science Institute, Jiangxi University of Science and Technology, Jiangxi, China
  - Department of Physiology, Northwestern University, Chicago, Illinois, United States of America
- Su-Wei Kuo
  - Department of Physiology, Northwestern University, Chicago, Illinois, United States of America
- Le Chen
  - Applied Science Institute, Jiangxi University of Science and Technology, Jiangxi, China
- C. J. Heckman
  - Department of Physiology, Northwestern University, Chicago, Illinois, United States of America
  - Department of Physical Medicine and Rehabilitation, Northwestern University, Chicago, Illinois, United States of America
  - Department of Physical Therapy and Human Movement Sciences, Northwestern University, Chicago, Illinois, United States of America
- M. C. Jiang
  - Department of Physiology, Northwestern University, Chicago, Illinois, United States of America
13
Zhang K, Li Y, Li T, Li ZG, Hsiang T, Zhang Z, Sun W. Pathogenicity Genes in Ustilaginoidea virens Revealed by a Predicted Protein-Protein Interaction Network. J Proteome Res 2017; 16:1193-1206. [PMID: 28099032 DOI: 10.1021/acs.jproteome.6b00720] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Indexed: 11/29/2022]
Abstract
Rice false smut, caused by Ustilaginoidea virens, produces significant losses in rice yield and grain quality and has recently emerged as one of the most important rice diseases worldwide. Despite its importance in rice production, relatively few studies have been conducted to illustrate the complex interactome and the pathogenicity gene interactions. Here a protein-protein interaction network of U. virens was built through two well-recognized approaches, interolog- and domain-domain interaction-based methods. A total of 20 217 interactions associated with 3305 proteins were predicted after strict filtering. The reliability of the network was assessed computationally and experimentally. The topology of the interactome network revealed highly connected proteins. A pathogenicity-related subnetwork involving up-regulated genes during early U. virens infection was also constructed, and many novel pathogenicity proteins were predicted in the subnetwork. In addition, we built an interspecies PPI network between U. virens and Oryza sativa, providing new insights for molecular interactions of this host-pathogen pathosystem. A web-based publicly available interactive database based on these interaction networks has also been released. In summary, a proteome-scale map of the PPI network was described for U. virens, which will provide new perspectives for finely dissecting interactions of genes related to its pathogenicity.
Affiliation(s)
- Kang Zhang
  - Department of Plant Pathology and the Ministry of Agriculture Key Laboratory for Plant Pathology, China Agricultural University, Beijing 100193, China
- Yuejiao Li
  - Department of Plant Pathology and the Ministry of Agriculture Key Laboratory for Plant Pathology, China Agricultural University, Beijing 100193, China
- Tengjiao Li
  - Department of Plant Pathology and the Ministry of Agriculture Key Laboratory for Plant Pathology, China Agricultural University, Beijing 100193, China
- Zhi-Gang Li
  - Department of Plant Pathology and the Ministry of Agriculture Key Laboratory for Plant Pathology, China Agricultural University, Beijing 100193, China
- Tom Hsiang
  - School of Environmental Sciences, University of Guelph, Guelph, Ontario N1G 2W1, Canada
- Ziding Zhang
  - State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing 100193, China
- Wenxian Sun
  - Department of Plant Pathology and the Ministry of Agriculture Key Laboratory for Plant Pathology, China Agricultural University, Beijing 100193, China
14
Han SK, Kim I, Hwang J, Kim S. Network Modules of the Cross-Species Genotype-Phenotype Map Reflect the Clinical Severity of Human Diseases. PLoS One 2015; 10:e0136300. [PMID: 26301634 PMCID: PMC4547739 DOI: 10.1371/journal.pone.0136300] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 05/14/2015] [Accepted: 08/02/2015] [Indexed: 01/09/2023]
Abstract
Recent advances in genome sequencing techniques have improved our understanding of the genotype-phenotype relationship between genetic variants and human diseases. However, genetic variations uncovered from patient populations do not provide enough information to understand the mechanisms underlying the progression and clinical severity of human diseases. Moreover, building a high-resolution genotype-phenotype map is difficult due to the diverse genetic backgrounds of the human population. We built a cross-species genotype-phenotype map to explain the clinical severity of human genetic diseases. We developed a data-integrative framework to investigate network modules composed of human diseases mapped with gene essentiality measured from a model organism. Essential and nonessential genes connect diseases of different types which form clusters in the human disease network. In a large patient population study, we found that disease classes enriched with essential genes tended to show a higher mortality rate than disease classes enriched with nonessential genes. Moreover, high disease mortality rates are explained by the multiple comorbid relationships and the high pleiotropy of disease genes found in the essential gene-enriched diseases. Our results reveal that the genotype-phenotype map of a model organism can facilitate the identification of human disease-gene associations and predict human disease progression.
Affiliation(s)
- Seong Kyu Han
  - Department of Life Sciences, Pohang University of Science and Technology, Pohang, 790–784, Korea
- Inhae Kim
  - Department of Life Sciences, Pohang University of Science and Technology, Pohang, 790–784, Korea
- Jihye Hwang
  - Department of IT Convergence and Engineering, Pohang University of Science and Technology, Pohang, 790–784, Korea
- Sanguk Kim
  - Department of Life Sciences, Pohang University of Science and Technology, Pohang, 790–784, Korea
15
Li P, He T, Hu X, Zhao J, Shen X, Zhang M, Wang Y. A novel protein complex identification algorithm based on Connected Affinity Clique Extension (CACE). IEEE Trans Nanobioscience 2014; 13:89-96. [PMID: 24803142 DOI: 10.1109/tnb.2014.2317755] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/06/2022]
Abstract
A novel algorithm based on Connected Affinity Clique Extension (CACE) for mining overlapping functional modules in protein interaction network is proposed in this paper. In this approach, the value of protein connected affinity which is inferred from protein complexes is interpreted as the reliability and possibility of interaction. The protein interaction network is constructed as a weighted graph, and the weight is dependent on the connected affinity coefficient. The experimental results of our CACE in two test data sets show that the CACE can detect the functional modules much more effectively and accurately when compared with other state-of-art algorithms CPM and IPC-MCE.
|
16
|
Rotival M, Petretto E. Leveraging gene co-expression networks to pinpoint the regulation of complex traits and disease, with a focus on cardiovascular traits. Brief Funct Genomics 2013; 13:66-78. [PMID: 23960099 DOI: 10.1093/bfgp/elt030] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.4] [Indexed: 01/08/2023] Open
Abstract
Over the past decade, the number of genome-scale transcriptional datasets in publicly available databases has climbed to nearly one million, providing an unprecedented opportunity for extensive analyses of gene co-expression networks. In systems-genetic studies of complex diseases researchers increasingly focus on groups of highly interconnected genes within complex transcriptional networks (referred to as clusters, modules or subnetworks) to uncover specific molecular processes that can inform functional disease mechanisms and pathological pathways. Here, we outline the basic paradigms underlying gene co-expression network analysis and critically review the most commonly used computational methods. Finally, we discuss specific applications of network-based approaches to the study of cardiovascular traits, which highlight the power of integrated analyses of networks, genetic and gene-regulation data to elucidate the complex mechanisms underlying cardiovascular disease.
Affiliation(s)
- Maxime Rotival
- MRC-Clinical Sciences Centre, Hammersmith Hospital Campus, Imperial College Centre for Translational and Experimental Medicine (ICTEM Building), Du Cane Road, London, W12 0NN UK. Tel.: + 44-020-8383-1468; Fax: +44-208-383-8577;
|
17
|
Yang M, Chen JL, Xu LW, Ji G. Navigating traditional Chinese medicine network pharmacology and computational tools. EVIDENCE-BASED COMPLEMENTARY AND ALTERNATIVE MEDICINE : ECAM 2013; 2013:731969. [PMID: 23983798 PMCID: PMC3747450 DOI: 10.1155/2013/731969] [Citation(s) in RCA: 62] [Impact Index Per Article: 5.6] [Received: 06/06/2013] [Accepted: 07/04/2013] [Indexed: 12/17/2022]
Abstract
The concept of "network target" has ushered in a new era in the field of traditional Chinese medicine (TCM). As a new research approach, network pharmacology is based on the analysis of network models and systems biology. Taking advantage of advancements in systems biology, highly integrated data analysis strategies and interpretable visualization provide deeper insights into the underlying mechanisms of TCM theories, including the principles of herb combination, the biological foundations of herb or herbal formulae action, and the molecular basis of TCM syndromes. In this study, we review several recent developments in TCM network pharmacology research and discuss their potential for bridging the gap between traditional and modern medicine. We briefly summarize the two main functional applications of TCM network models: understanding/uncovering and predicting/discovering. In particular, we focus on how TCM network pharmacology research is conducted and highlight different computational tools, such as network-based and machine learning algorithms, and sources that have been proposed and applied to the different steps involved in the research process. To make network pharmacology research commonplace, some basic network definitions and analysis methods are presented.
Affiliation(s)
- Ming Yang
- Longhua Hospital Affiliated to Shanghai University of TCM, Shanghai 200032, China
- Institute of Digestive Disease, Longhua Hospital, Shanghai University of Traditional Chinese Medicine, Shanghai 200032, China
- Jia-Lei Chen
- Longhua Hospital Affiliated to Shanghai University of TCM, Shanghai 200032, China
- Li-Wen Xu
- Longhua Hospital Affiliated to Shanghai University of TCM, Shanghai 200032, China
- Guang Ji
- Institute of Digestive Disease, Longhua Hospital, Shanghai University of Traditional Chinese Medicine, Shanghai 200032, China
|
18
|
Rende D, Baysal N, Kirdar B. Complex disease interventions from a network model for type 2 diabetes. PLoS One 2013; 8:e65854. [PMID: 23776558 PMCID: PMC3679160 DOI: 10.1371/journal.pone.0065854] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 09/12/2012] [Accepted: 05/02/2013] [Indexed: 12/20/2022] Open
Abstract
There is accumulating evidence that the proteins encoded by the genes associated with a common disorder interact with each other, participate in similar pathways and share GO terms. It has been anticipated that the functional modules in a disease-related functional linkage network are informative for revealing significant metabolic processes and the disease's associations with other complex disorders. In the current study, a Type 2 diabetes-associated functional linkage network (T2DFN) containing 2770 proteins and 15041 linkages was constructed. The functional modules in this network were scored and evaluated in terms of shared pathways, co-localization, co-expression and associations with similar diseases. The assembly of top-scoring overlapping members in the functional modules revealed that, along with the well-known biological pathways, circadian rhythm and diverse actions of nuclear receptors in steroid and retinoic acid metabolism occur significantly in the pathophysiology of the disease. The disease's association with other metabolic and neuromuscular disorders was established through shared proteins. Nuclear receptor NRIP1 has a pivotal role in lipid and carbohydrate metabolism, indicating the need to investigate subsequent effects of NRIP1 on Type 2 diabetes. Our study also revealed that CREB binding protein (CREBBP) and cardiotrophin-1 (CTF1) have suggestive roles in linking Type 2 diabetes and neuromuscular diseases.
Affiliation(s)
- Deniz Rende
- Department of Materials Science and Engineering, Rensselaer Polytechnic Institute, Troy, New York, United States of America.
|
19
|
Chin CH, Chen SH, Chen CY, Hsiung CA, Ho CW, Ko MT, Lin CY. Spotlight: assembly of protein complexes by integrating graph clustering methods. Gene 2012; 518:42-51. [PMID: 23274651 DOI: 10.1016/j.gene.2012.11.087] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 10/01/2012] [Accepted: 11/27/2012] [Indexed: 02/01/2023]
Abstract
As is generally assumed, clusters in protein-protein interaction (PPI) networks perform specific, crucial functions in biological systems. Various network community detection methods have been developed to exploit PPI networks in order to identify protein complexes and functional modules. Due to the potential role of various regulatory modes in biological networks, a single method may apply only a single graph property and neglect communities highlighted by other network properties. This work presents a novel integration method to capture protein modules/protein complexes by multiple network features detected by different algorithms. The integration method is further implemented in a web-based platform with a highly effective interactive network analyzer. Conventionally adopted methods with different perspectives on network community detection (e.g., CPM, FastGreedy, HUNTER, MCL, LE, SpinGlass, and WalkTrap) are also executed simultaneously. Analytical results indicate that the proposed method performs better than the conventional ones. The proposed approach can capture the transcription and RNA splicing machineries from the yeast protein network. Meanwhile, proteins that are highly associated with each other, yet not described in either machinery, are also identified. In sum, a protein that is closely connected to components of a known module or complex in the network view implies a functional association among them. Importantly, our method can detect these unique network features, thus facilitating efforts to discover unknown components of functional modules/protein complexes. AVAILABILITY Spotlight is freely accessible at http://hub.iis.sinica.edu.tw/spotlight. Video clips for a quick view of usage are available on the website's online help page.
Affiliation(s)
- Chia-Hao Chin
- Institute of Information Science, Academia Sinica, No. 128 Yan-Chiu-Yuan Rd., Sec. 2, Taipei 115, Taiwan
|
20
|
Thahir M, Sharma T, Ganapathiraju MK. An efficient heuristic method for active feature acquisition and its application to protein-protein interaction prediction. BMC Proc 2012; 6 Suppl 7:S2. [PMID: 23173746 PMCID: PMC3504800 DOI: 10.1186/1753-6561-6-s7-s2] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Machine learning approaches for classification learn the pattern of the feature space of different classes, or learn a boundary that separates the feature space into different classes. The features of the data instances are usually available, and it is only the class-labels of the instances that are unavailable. For example, to classify text documents into different topic categories, the words in the documents are features and they are readily available, whereas the topic is what is predicted. However, in some domains obtaining features may be resource-intensive because of which not all features may be available. An example is that of protein-protein interaction prediction, where not only are the labels ('interacting' or 'non-interacting') unavailable, but so are some of the features. It may be possible to obtain at least some of the missing features by carrying out a few experiments as permitted by the available resources. If only a few experiments can be carried out to acquire missing features, which proteins should be studied and which features of those proteins should be determined? From the perspective of machine learning for PPI prediction, it would be desirable that those features be acquired which when used in training the classifier, the accuracy of the classifier is improved the most. That is, the utility of the feature-acquisition is measured in terms of how much acquired features contribute to improving the accuracy of the classifier. Active feature acquisition (AFA) is a strategy to preselect such instance-feature combinations (i.e. protein and experiment combinations) for maximum utility. The goal of AFA is the creation of optimal training set that would result in the best classifier, and not in determining the best classification model itself. RESULTS We present a heuristic method for active feature acquisition to calculate the utility of acquiring a missing feature. 
This heuristic takes into account the change in belief of the classification model induced by the acquisition of the feature under consideration. Compared with random selection of the proteins on which experiments are performed and of the type of experiment performed, the heuristic method reduces the number of experiments to as few as 40%. The most notable characteristic of this method is that it does not require re-training of the classification model on every possible combination of instance, feature and feature-value tuples. For this reason, our method is far less computationally expensive than previous AFA strategies. CONCLUSIONS The results show that our heuristic method for AFA creates an optimal training set with far fewer features acquired than random acquisition. This shows the value of active feature acquisition in aiding protein-protein interaction prediction where feature acquisition is costly. Compared to previous methods, the proposed method reduces computational cost while also achieving a better F-score. The proposed method is valuable because it presents a direction for AFA at far lower computational expense by removing, for the first time, the need to train a classifier for every combination of instance, feature and feature-value tuples, which would be impractical for several domains.
Affiliation(s)
- Mohamed Thahir
- Department of Biomedical Informatics, School of Medicine, University of Pittsburgh, Pittsburgh, PA, USA.
|
21
|
A degree-distribution based hierarchical agglomerative clustering algorithm for protein complexes identification. Comput Biol Chem 2011; 35:298-307. [DOI: 10.1016/j.compbiolchem.2011.07.005] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 03/07/2011] [Revised: 05/30/2011] [Accepted: 07/03/2011] [Indexed: 11/19/2022]
|
22
|
Yu L, Gao L, Sun PG. Research on Algorithms for Complexes and Functional Modules Prediction in Protein-Protein Interaction Networks. ACTA ACUST UNITED AC 2011. [DOI: 10.3724/sp.j.1016.2011.01239] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/25/2022]
|
23
|
Cui G, Shrestha R, Han K. ModuleSearch: finding functional modules in a protein-protein interaction network. Comput Methods Biomech Biomed Engin 2011; 15:691-9. [PMID: 21827286 DOI: 10.1080/10255842.2011.555404] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 10/17/2022]
Abstract
Many biological processes are performed by a group of proteins rather than by individual proteins. Proteins involved in the same biological process often form a densely connected sub-graph in a protein-protein interaction network. Therefore, finding a dense sub-graph provides useful information to predict the function or protein complex of uncharacterised proteins in the sub-graph. We developed a heuristic algorithm that finds functional modules in a protein-protein interaction network and visualises the modules. The algorithm has been implemented in a platform-independent, standalone program called ModuleSearch. In an interaction network of yeast proteins, ModuleSearch found 366 overlapping modules. Of the modules, 71% have a function shared by more than half the proteins in the module and 58% have a function shared by all proteins in the module. Comparison of ModuleSearch with other programs shows that ModuleSearch finds more sub-graphs than most other programs, yet a higher proportion of the sub-graphs correspond to known functional modules. ModuleSearch and sample data are freely available to academics at http://bclab.inha.ac.kr/ModuleSearch.
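The evaluation reported in this abstract (the share of modules in which a function is common to more than half, or to all, of the member proteins) can be illustrated with a small sketch. The function name `top_function_share`, the annotation layout, and the toy modules below are hypothetical and are not taken from ModuleSearch; they only show one way such percentages could be computed.

```python
from collections import Counter


def top_function_share(module, annotations):
    """Largest fraction of proteins in `module` that share any single function."""
    # Each protein contributes at most once per function because annotations are sets.
    counts = Counter(f for protein in module for f in annotations.get(protein, set()))
    return max(counts.values(), default=0) / len(module)


# Hypothetical annotations (protein -> set of function labels) and overlapping modules.
toy_annotations = {
    "a": {"f1"}, "b": {"f1", "f2"}, "c": {"f1"},
    "d": {"f2"}, "e": {"f3"}, "f": set(),
}
toy_modules = [["a", "b", "c"], ["b", "c", "d"], ["d", "e", "f"]]

shares = [top_function_share(m, toy_annotations) for m in toy_modules]
# Fraction of modules with a function shared by more than half of the members.
more_than_half = sum(s > 0.5 for s in shares) / len(shares)
# Fraction of modules with a function shared by every member.
shared_by_all = sum(s == 1.0 for s in shares) / len(shares)
print(more_than_half, shared_by_all)
```

With the toy data, two of the three modules pass the "more than half" criterion and one passes the "all proteins" criterion.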
Affiliation(s)
- Guangyu Cui
- School of Computer Science and Engineering, Inha University, Incheon, 402-751, South Korea
|
24
|
Rende D, Baysal N, Kirdar B. A novel integrative network approach to understand the interplay between cardiovascular disease and other complex disorders. MOLECULAR BIOSYSTEMS 2011; 7:2205-19. [PMID: 21559538 DOI: 10.1039/c1mb05064h] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Indexed: 01/09/2023]
Abstract
There is accumulating evidence that the proteins encoded by the genes associated with a common disorder interact with each other, participate in similar pathways and share GO terms. It has been anticipated that the functional modules in a disease-related functional linkage network can be integrated with bibliomics to reveal associations with other complex disorders. In this study, the cardiovascular disease functional linkage network (CFN) containing 1536 nodes and 3345 interactions was constructed using proteins encoded by 234 genes associated with the disease. Integration of CFN with bibliomics showed that 227 out of 566 functional modules are significantly associated with one or more diseases. Analysis of functional modules revealed the possible regulatory roles of SP1 and CXCL12 in the pathogenesis of cardiovascular disease (CVD), and modulation of their activities may be considered a potential therapeutic tool. The integration of CFN with bibliomics also indicated significant relations of CVD with other complex disorders. In a stratified map, the members of 227 functional modules and 58 diseases in 15 disease classes were combined. In this map, leprosy, Listeria monocytogenes, myasthenia, hemorrhagic diathesis and Protein S deficiency, which were not previously reported to be associated with CVD, showed significant associations. Several cancers arising from epithelial cells were also found to be linked to other diseases through the hub proteins VEGFA and PTGS2.
Affiliation(s)
- Deniz Rende
- Rensselaer Nanotechnology Center, Rensselaer Polytechnic Institute, Troy, NY12180, USA.
|
25
|
Abstract
The increasing availability of large-scale protein-protein interaction data has made it possible to understand the basic components and organization of cell machinery at the network level. The arising challenge is how to analyze such complex interaction data to reveal the principles of cellular organization, processes and functions. Many studies have shown that clustering protein interaction networks is an effective approach for identifying protein complexes or functional modules, which has become a major research topic in systems biology. In this review, recent advances in clustering methods for protein interaction networks will be presented in detail. The predictions of protein functions and interactions based on modules will be covered. Finally, the performance of different clustering methods will be compared and the directions for future research will be discussed.
Affiliation(s)
- Jianxin Wang
- School of Information Science and Engineering, Central South University, Changsha 410083, China
- Department of Computer Science, Georgia State University, Atlanta, GA30303, USA
- Min Li
- School of Information Science and Engineering, Central South University, Changsha 410083, China
- Youping Deng
- Rush University Cancer Center, Rush University Medical Center, Chicago, IL 60612, USA
- Yi Pan
- Department of Computer Science, Georgia State University, Atlanta, GA30303, USA
|
26
|
Wang J, Liu B, Li M, Pan Y. Identifying protein complexes from interaction networks based on clique percolation and distance restriction. BMC Genomics 2010; 11 Suppl 2:S10. [PMID: 21047377 PMCID: PMC2975417 DOI: 10.1186/1471-2164-11-s2-s10] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.6] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Identification of protein complexes in large interaction networks is crucial for understanding the principles of cellular organization and predicting protein functions, which is one of the most important issues in the post-genomic era. In real protein-protein interaction networks, each protein might belong to multiple protein complexes. Identifying overlapping protein complexes from protein-protein interaction networks is an important research topic. RESULT As an effective algorithm for identifying overlapping module structures, the clique percolation method (CPM) has a wide range of applications in social networks and biological networks. However, the recognition accuracy of CPM is low. Furthermore, CPM is unsuited to identifying meso-scale protein complexes when applied to protein-protein interaction networks. In this paper, we propose a new topological model by extending the definition of the k-clique community of CPM and introducing a distance restriction, and we develop a novel algorithm called CP-DR based on the new topological model for identifying protein complexes. In this new algorithm, the protein complex size is restricted by a distance constraint to overcome the shortcomings of CPM. CP-DR is applied to the protein interaction network of Saccharomyces cerevisiae and identifies many well-known complexes. CONCLUSION The proposed algorithm CP-DR, based on clique percolation and distance restriction, makes it possible to identify dense subgraphs in protein interaction networks, a large number of which correspond to known protein complexes. CP-DR outperforms CPM.
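For readers unfamiliar with the clique percolation method that this abstract builds on, a minimal sketch follows. It assumes the standard definition of a k-clique community: the union of all k-cliques that can be reached from one another through adjacent k-cliques, where two k-cliques are adjacent if they share k - 1 nodes. The function name `k_clique_communities`, the brute-force enumeration, and the toy network are illustrative assumptions. This is not the CP-DR algorithm, whose distance restriction is not specified in the abstract.

```python
from itertools import combinations


def k_clique_communities(edges, k):
    """Return overlapping communities found by brute-force clique percolation.

    edges: iterable of (u, v) pairs of an undirected graph.
    k: clique size; two k-cliques are adjacent when they share k - 1 nodes.
    """
    # Build an adjacency map for fast edge lookups.
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    # Enumerate k-cliques by testing every k-node subset (toy graphs only).
    nodes = sorted(adj)
    cliques = [
        frozenset(c)
        for c in combinations(nodes, k)
        if all(b in adj[a] for a, b in combinations(c, 2))
    ]

    # Union-find over cliques: merge cliques that share k - 1 nodes.
    parent = list(range(len(cliques)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(cliques)), 2):
        if len(cliques[i] & cliques[j]) == k - 1:
            parent[find(i)] = find(j)

    # A community is the union of the nodes of all cliques in one component.
    groups = {}
    for i, clique in enumerate(cliques):
        groups.setdefault(find(i), set()).update(clique)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical toy network: two dense groups that overlap at node "C".
toy_edges = [
    ("A", "B"), ("A", "C"), ("B", "C"),  # triangle A-B-C
    ("B", "D"), ("C", "D"),              # triangle B-C-D shares edge B-C with it
    ("C", "E"), ("C", "F"), ("E", "F"),  # triangle C-E-F shares only node C
]
communities = k_clique_communities(toy_edges, 3)
print(communities)
```

With k = 3 the toy network yields two communities, A-B-C-D and C-E-F, and node C belongs to both, which is the kind of overlap the abstract refers to.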
Affiliation(s)
- Jianxin Wang
- School of Information Science and Engineering, Central South University, Changsha 410083, China.
|
27
|
Chin CH, Chen SH, Ho CW, Ko MT, Lin CY. A hub-attachment based method to detect functional modules from confidence-scored protein interactions and expression profiles. BMC Bioinformatics 2010; 11 Suppl 1:S25. [PMID: 20122197 PMCID: PMC3009496 DOI: 10.1186/1471-2105-11-s1-s25] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.5] [Indexed: 12/11/2022] Open
Abstract
Background Many research results show that biological systems are composed of functional modules. Members of the same module usually have common functions. This is useful information for understanding how biological systems work. Therefore, detecting functional modules is an important research topic in the post-genome era. One approach to detecting functional modules is to find dense regions in Protein-Protein Interaction (PPI) networks. Most current methods neglect the confidence scores of interactions and pay little attention to using gene expression data to improve their results. Results In this paper, we propose a novel hub-attachment based method to detect functional modules from confidence-scored protein interactions and expression profiles, and we name it HUNTER. Our method not only extracts functional modules from a weighted PPI network but can also use gene expression data as optional input to increase the quality of the outcomes. Using HUNTER on yeast data, we found that it can discover more novel components related to the RNA polymerase complex in the yeast interactome than existing methods. These new components show a close relationship with polymerase after functional analysis with Gene Ontology. Conclusion A C++ implementation of our prediction method, the dataset and supplementary material are available at http://hub.iis.sinica.edu.tw/Hunter/. Our proposed HUNTER method has been applied to yeast data, and the empirical results show that our method can accurately identify functional modules. Applications derived from our algorithm can reconstruct the biological machinery, identify undiscovered components and decipher common sub-modules inside complexes such as RNA polymerases I, II and III.
Affiliation(s)
- Chia-Hao Chin
- Institute of Information Science, Academia Sinica, No. 128 Yan-Chiu-Yuan Rd., Sec. 2, Taipei 115, Taiwan.
|
28
|
Zahoránszky LA, Katona GY, Hári P, Málnási-Csizmadia A, Zweig KA, Zahoránszky-Köhalmi G. Breaking the hierarchy--a new cluster selection mechanism for hierarchical clustering methods. Algorithms Mol Biol 2009; 4:12. [PMID: 19840391 PMCID: PMC2774311 DOI: 10.1186/1748-7188-4-12] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Received: 04/01/2009] [Accepted: 10/19/2009] [Indexed: 11/24/2022] Open
Abstract
Background Hierarchical clustering methods like Ward's method have been used for decades to understand biological and chemical data sets. In order to get a partition of the data set, it is necessary to choose an optimal level of the hierarchy by a so-called level selection algorithm. In 2005, a new kind of hierarchical clustering method was introduced by Palla et al. that differs in two ways from Ward's method: it can be used on data on which no full similarity matrix is defined and it can produce overlapping clusters, i.e., allow for multiple membership of items in clusters. These features are optimal for biological and chemical data sets, but until now no level selection algorithm has been published for this method. Results In this article we provide a general selection scheme, the level independent clustering selection method, called LInCS. With it, clusters can be selected from any level in quadratic time with respect to the number of clusters. Since hierarchically clustered data is not necessarily associated with a similarity measure, the selection is based on a graph theoretic notion of cohesive clusters. We present results of our method on two data sets, a set of drug-like molecules and a set of protein-protein interaction (PPI) data. In both cases the method provides a clustering with very good sensitivity and specificity values according to a given reference clustering. Moreover, we can show for the PPI data set that our graph theoretic cohesiveness measure indeed chooses biologically homogeneous clusters and disregards inhomogeneous ones in most cases. We finally discuss how the method can be generalized to other hierarchical clustering methods to allow for a level independent cluster selection. Conclusion Using our new cluster selection method together with the method by Palla et al. provides an interesting new clustering mechanism that allows overlapping clusters to be computed, which is especially valuable for biological and chemical data sets.
|
29
|
Andreopoulos B, Winter C, Labudde D, Schroeder M. Triangle network motifs predict complexes by complementing high-error interactomes with structural information. BMC Bioinformatics 2009; 10:196. [PMID: 19558694 PMCID: PMC2714575 DOI: 10.1186/1471-2105-10-196] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 01/14/2009] [Accepted: 06/27/2009] [Indexed: 11/30/2022] Open
Abstract
Background A lot of high-throughput studies produce protein-protein interaction networks (PPINs) with many errors and missing information. Even for genome-wide approaches, there is often a low overlap between PPINs produced by different studies. Second-level neighbors separated by two protein-protein interactions (PPIs) were previously used for predicting protein function and finding complexes in high-error PPINs. We retrieve second level neighbors in PPINs, and complement these with structural domain-domain interactions (SDDIs) representing binding evidence on proteins, forming PPI-SDDI-PPI triangles. Results We find low overlap between PPINs, SDDIs and known complexes, all well below 10%. We evaluate the overlap of PPI-SDDI-PPI triangles with known complexes from Munich Information center for Protein Sequences (MIPS). PPI-SDDI-PPI triangles have ~20 times higher overlap with MIPS complexes than using second-level neighbors in PPINs without SDDIs. The biological interpretation for triangles is that a SDDI causes two proteins to be observed with common interaction partners in high-throughput experiments. The relatively few SDDIs overlapping with PPINs are part of highly connected SDDI components, and are more likely to be detected in experimental studies. We demonstrate the utility of PPI-SDDI-PPI triangles by reconstructing myosin-actin processes in the nucleus, cytoplasm, and cytoskeleton, which were not obvious in the original PPIN. Using other complementary datatypes in place of SDDIs to form triangles, such as PubMed co-occurrences or threading information, results in a similar ability to find protein complexes. Conclusion Given high-error PPINs with missing information, triangles of mixed datatypes are a promising direction for finding protein complexes. Integrating PPINs with SDDIs improves finding complexes. Structural SDDIs partially explain the high functional similarity of second-level neighbors in PPINs. 
We estimate that relatively little structural information would be sufficient for finding complexes involving most of the proteins and interactions in a typical PPIN.
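The triangle construction described in this abstract can be sketched as follows. The sketch assumes that a PPI-SDDI-PPI triangle consists of two proteins joined by an SDDI that also share a common PPI partner, so that they are second-level neighbors in the PPIN. The function name `ppi_sddi_ppi_triangles`, the edge-list layout, and the toy interactions are hypothetical and are not taken from the paper.

```python
def ppi_sddi_ppi_triangles(ppi_edges, sddi_edges):
    """Find (a, b, partner) triples where a-b is an SDDI and partner has a PPI with both."""
    # Adjacency map of the undirected PPI network.
    ppi = {}
    for u, v in ppi_edges:
        ppi.setdefault(u, set()).add(v)
        ppi.setdefault(v, set()).add(u)

    triangles = set()
    for a, b in sddi_edges:
        # A shared PPI partner makes a and b second-level neighbors in the PPIN.
        shared = ppi.get(a, set()) & ppi.get(b, set())
        for partner in shared - {a, b}:
            first, second = sorted((a, b))
            triangles.add((first, second, partner))
    return sorted(triangles)


# Hypothetical toy data; the protein names are placeholders.
toy_ppi = [("P1", "P3"), ("P2", "P3"), ("P2", "P4"), ("P5", "P4")]
toy_sddi = [("P1", "P2"), ("P2", "P5"), ("P1", "P5")]
triangles = ppi_sddi_ppi_triangles(toy_ppi, toy_sddi)
print(triangles)
```

In the toy data, the SDDI between P1 and P5 forms no triangle because the two proteins have no common PPI partner, while the other two SDDIs each close one triangle.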
Affiliation(s)
- Bill Andreopoulos
- Biotechnology Center (BIOTEC), Technische Universität Dresden, 01307 Dresden, Germany.
|
30
|
Gao L, Sun PG, Song J. Clustering algorithms for detecting functional modules in protein interaction networks. J Bioinform Comput Biol 2009; 7:217-42. [PMID: 19226668 DOI: 10.1142/s0219720009004023] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.8] [Received: 09/12/2008] [Revised: 10/21/2008] [Accepted: 10/21/2008] [Indexed: 01/21/2023]
Abstract
Protein-Protein Interaction (PPI) networks are believed to be important sources of information related to biological processes and complex metabolic functions of the cell. When studying the workings of a biological cell, it is useful to be able to detect known and predict still undiscovered protein complexes within the cell's PPI networks. Such predictions may be used as an inexpensive tool to direct biological experiments. The increasing amount of available PPI data necessitates a fast, accurate approach to biological complex identification. Because of the importance of this problem in the study of protein interaction networks, different models and algorithms have been proposed for identifying functional modules in PPI networks. In this paper, we review some representative algorithms, focusing on the algorithms underlying the approaches and how the algorithms relate to each other. In particular, a comparison is given based on the properties of the algorithms. Since the PPI network is noisy and still incomplete, some methods that use additional properties to preprocess and purify PPI data are presented. We also discuss the functional annotation and validation of protein complexes. Finally, new progress and future research directions are discussed from the computational viewpoint.
Affiliation(s)
- Lin Gao
- School of Computer Science and Technology, Xidian University, Xi'an 710071, China.
|
31
|
Wu Z, Zhao X, Chen L. Identifying responsive functional modules from protein-protein interaction network. Mol Cells 2009; 27:271-7. [PMID: 19326072 DOI: 10.1007/s10059-009-0035-x] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.1] [Received: 01/23/2009] [Accepted: 01/26/2009] [Indexed: 10/21/2022] Open
Abstract
Proteins interact with each other within a cell, and those interactions give rise to the biological function and dynamical behavior of cellular systems. Generally, protein interactions are temporally, spatially, or condition dependent in a specific cell, where only a small fraction of interactions usually takes place under certain conditions. Although a large amount of protein interaction data has recently been collected by high-throughput technologies, the interactions are recorded or summarized under varying conditions and therefore cannot be directly used to identify signaling pathways or active networks, which are believed to work in specific cells under specific conditions. However, protein interactions activated under specific conditions may give hints about the biological processes underlying the corresponding phenotypes. In particular, responsive functional modules consisting of protein interactions activated under specific conditions can provide insight into the mechanisms underlying biological systems; e.g., protein interaction subnetworks found for certain diseases rather than for normal conditions may help to discover potential biomarkers. From a computational viewpoint, identifying responsive functional modules can be formulated as an optimization problem. Efficient computational methods for extracting responsive functional modules are therefore in strong demand because of the NP-hard nature of this combinatorial problem. In this review, we first report recent advances in the development of computational methods for extracting responsive functional modules or active pathways from protein interaction networks and microarray data. Then, from a computational perspective, we discuss the remaining obstacles and perspectives for this attractive and challenging topic in systems biology.
Collapse
Affiliation(s)
- Zikai Wu
- Institute of Systems Biology, Shanghai University, Shanghai 200444, China
|
32
|
Gopalacharyulu PV, Velagapudi VR, Lindfors E, Halperin E, Oresic M. Dynamic network topology changes in functional modules predict responses to oxidative stress in yeast. MOLECULAR BIOSYSTEMS 2009; 5:276-87. [PMID: 19225619 DOI: 10.1039/b815347g] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/21/2022]
Abstract
In response to environmental challenges, biological systems respond with dynamic adaptive changes in order to maintain the functionality of the system. Such adaptations may lead to cumulative stress over time, possibly leading to global failure of the system. When studying such system responses, it is therefore important to understand them in a system-wide and dynamic context. Here we hypothesize that dynamic changes in the topology of functional modules of integrated biological networks reflect their activity under specific environmental challenges. We introduce topological enrichment analysis of functional subnetworks (TEAFS), a method for the analysis of integrated molecular profile and interactome data, which we validated by comprehensive metabolomic analysis of dynamic yeast response under oxidative stress. TEAFS identified activation of multiple stress-response-related mechanisms, such as lipid metabolism and phospholipid biosynthesis. We identified, among others, a fatty acid elongase IFA38 as a hub protein which was absent at all time points under oxidative stress conditions. The deletion mutant of the IFA38 encoding gene is known for the accumulation of ceramides. By applying a comprehensive metabolomic analysis, we confirmed the increased concentrations over time of ceramides and palmitic acid, a precursor of de novo ceramide biosynthesis. Our results imply that the connectivity of the system is being dynamically modulated in response to oxidative stress, progressively leading to the accumulation of (lipo)toxic lipids such as ceramides. Studies of local network topology dynamics can be used to investigate as well as predict the activity of biological processes and the system's responses to environmental challenges and interventions.
|
33
|
Wang RS, Zhang S, Wang Y, Zhang XS, Chen L. Clustering complex networks and biological networks by nonnegative matrix factorization with various similarity measures. Neurocomputing 2008. [DOI: 10.1016/j.neucom.2007.12.043] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/21/2022]
|
34
|
Gene module level analysis: identification to networks and dynamics. Curr Opin Biotechnol 2008; 19:482-91. [PMID: 18725293 DOI: 10.1016/j.copbio.2008.07.011] [Citation(s) in RCA: 61] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/31/2008] [Revised: 07/25/2008] [Accepted: 07/29/2008] [Indexed: 12/23/2022]
Abstract
Nature exhibits modular design in biological systems. Gene module level analysis is based on this module concept, aiming to understand biological network design and systems behavior in disease and development by emphasizing modules of genes rather than individual genes. Module level analysis has been extensively applied in genome-wide analysis, exploring the organization of biological systems from identifying modules to reconstructing module networks and analyzing module dynamics. Such a module level perspective provides a high-level representation of the regulatory scenario and design of biological systems, promising to revolutionize our view of systems biology, genetic engineering, disease mechanisms, and molecular medicine.
|
35
|
|
36
|
Zhang S, Jin G, Zhang XS, Chen L. Discovering functions and revealing mechanisms at molecular level from biological networks. Proteomics 2007; 7:2856-69. [PMID: 17703505 DOI: 10.1002/pmic.200700095] [Citation(s) in RCA: 96] [Impact Index Per Article: 5.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/28/2023]
Abstract
With the increasing accumulation of data from high-throughput technologies, the study of biomolecular networks has become one of the key focuses in systems biology and bioinformatics. In particular, various types of molecular networks (e.g., protein-protein interaction (PPI) network; gene regulatory network (GRN); metabolic network (MN); gene coexpression network (GCEN)) have been extensively investigated, and those studies demonstrate great potential to discover basic functions and to reveal essential mechanisms for various biological phenomena, by understanding biological systems not at the individual component level but at a system-wide level. Recent studies on networks have generated prolific research on many aspects of living organisms. In this paper, we aim to review the recent developments on topics related to molecular networks in a comprehensive manner, with special emphasis on the computational aspect. The contents of the survey cover global topological properties and local structural characteristics, network motifs, network comparison and query, detection of functional modules and network motifs, function prediction from network analysis, inferring molecular networks from biological data as well as representative databases and software tools.
Affiliation(s)
- Shihua Zhang
- Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China
|