1
Gollapalli P, Rudrappa S, Kumar V, Santosh Kumar HS. Domain Architecture Based Methods for Comparative Functional Genomics Toward Therapeutic Drug Target Discovery. J Mol Evol 2023; 91:598-615. [PMID: 37626222 DOI: 10.1007/s00239-023-10129-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/13/2022] [Accepted: 08/06/2023] [Indexed: 08/27/2023]
Abstract
New genes, and with them novel functions, arise during evolution when existing genes duplicate, mutate, recombine, fuse, or undergo fission, or when genes form de novo. Researchers have tried to quantify the causes of these molecular diversification processes to understand how genes increase molecular complexity over time, for instance through protein domain organization. In contrast to global sequence similarity, protein domain architectures can capture key structural and functional characteristics, making them better proxies for describing functional equivalence. In prokaryotes and eukaryotes, domain designs have been shown to be retained over significant evolutionary distances. Protein domain architectures are now being used to categorize and distinguish evolutionarily related proteins and to find homologs among evolutionarily distant species. Additionally, structural information stored in domain structures has accelerated homology identification and sequence search methods. Tools for functional protein annotation have been developed to identify protein domain content, domain order, domain recurrence, and domain position, all of which contribute to the accuracy of protein function prediction. This review attempts to summarise facts and speculations regarding the use of protein domain architecture and modularity to identify possible therapeutic targets among cellular activities, based on an understanding of their linked biological processes.
Affiliation(s)
- Pavan Gollapalli
  - Center for Bioinformatics and Biostatistics, Nitte (Deemed to be University), Mangalore, Karnataka, 575018, India
- Sushmitha Rudrappa
  - Department of Biotechnology and Bioinformatics, Jnana Sahyadri Campus, Kuvempu University, Shankaraghatta, Shivamogga, Karnataka, 577451, India
- Vadlapudi Kumar
  - Department of Biochemistry, Davangere University, Shivagangothri, Davangere, Karnataka, 577007, India
- Hulikal Shivashankara Santosh Kumar
  - Department of Biotechnology and Bioinformatics, Jnana Sahyadri Campus, Kuvempu University, Shankaraghatta, Shivamogga, Karnataka, 577451, India
2
Naveenkumar N, Prabantu VM, Vishwanath S, Sowdhamini R, Srinivasan N. Structures of distantly related interacting protein homologs are less divergent than non-interacting homologs. FEBS Open Bio 2022; 12:2147-2153. [PMID: 36148593 PMCID: PMC9714365 DOI: 10.1002/2211-5463.13492] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/06/2022] [Revised: 08/09/2022] [Accepted: 09/22/2022] [Indexed: 01/25/2023]
Abstract
Homologous proteins can display high structural variation due to evolutionary divergence at low sequence identity. This classical inverse relationship between sequence identity and structural similarity, established many years ago, has remained true between homologous proteins of known structure over time. However, a large number of heteromeric proteins also exist in the structural data bank, where the interacting subunits belong to the same fold and maintain low sequence identity between themselves. It is not clear if there is any selection pressure to deviate from the inverse sequence-structure relationship for such interacting distant homologs, in comparison to pairs of homologs which are not known to interact. We examined 12,824 fold pairs of interacting homologs of known structure, which includes both heteromers and multi-domain proteins. These were compared with monomeric proteins, resulting in 26,082 fold pairs as a dataset of non-interacting homologous systems. Interacting homologs were found to retain higher structural similarity than non-interacting homologs at diminishing sequence identity in a statistically significant manner. Interacting homologs are more similar in their 3D structures than non-interacting homologs and have a preference towards symmetric association. There appears to be a structural constraint between remote homologs due to this commitment.
Affiliation(s)
- Nagarajan Naveenkumar
  - Molecular Biophysics Unit, Indian Institute of Science, Bangalore, India; National Centre for Biological Sciences, Tata Institute of Fundamental Research, Bangalore, India
- Sneha Vishwanath
  - Molecular Biophysics Unit, Indian Institute of Science, Bangalore, India
- Ramanathan Sowdhamini
  - National Centre for Biological Sciences, Tata Institute of Fundamental Research, Bangalore, India
3
OUP accepted manuscript. Brief Funct Genomics 2022; 21:243-269. [DOI: 10.1093/bfgp/elac007] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 11/02/2021] [Revised: 03/17/2022] [Accepted: 03/18/2022] [Indexed: 11/14/2022]
4
Mat Razali N, Hisham SN, Kumar IS, Shukla RN, Lee M, Abu Bakar MF, Nadarajah K. Comparative Genomics: Insights on the Pathogenicity and Lifestyle of Rhizoctonia solani. Int J Mol Sci 2021; 22:ijms22042183. [PMID: 33671736 PMCID: PMC7926851 DOI: 10.3390/ijms22042183] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 12/24/2020] [Revised: 02/06/2021] [Accepted: 02/15/2021] [Indexed: 12/17/2022]
Abstract
Proper management of agricultural disease is important to ensure sustainable food security. Staple food crops like rice, wheat, cereals, and other cash crops hold great export value for countries. Ensuring proper supply is critical; hence any biotic or abiotic factors contributing to the shortfall in yield of these crops should be alleviated. Rhizoctonia solani is a major biotic factor that results in yield losses in many agriculturally important crops. This paper focuses on the genome informatics of our Malaysian Draft R. solani AG1-IA and on comparative genomics (inter- and intra-AG) with four anastomosis groups (AGs): China AG1-IA (AG1-IA_KB317705.1), AG1-IB, AG3, and AG8. The genomic content of repeat elements, transposable elements (TEs), syntenic genomic blocks, functions of protein-coding genes, and the core orthologous genic information that underlies R. solani's pathogenicity strategy were investigated. Our analyses show that all studied AGs have a low content and varying profiles of TEs. Class I TEs were dominant in all AGs, much as in other basidiomycete pathogens. Glycoside hydrolase assignments dominate the protein-coding genes of all AGs, suggesting their importance in infiltration and infection of the host. Our profiling also provides a basis for further investigation of the lack of correlation observed between the number of pathogenicity and enzyme-related genes and host range. Despite being grouped within the same AG as China AG1-IA, our Draft AG1-IA exhibits differences in protein-coding gene proportions and classifications. This implies that strains from the same AG do not necessarily retain similar proportions and classifications of TEs but must have the necessary arsenal to enable successful infiltration and colonization of the host. In a larger perspective, all the studied AGs essentially share core genes that are generally involved in adhesion, penetration, and host colonization. However, the different infiltration strategies depend on the level of host resilience, as is clearly exhibited by the gene sets encoding the processes of infiltration, infection, and protection from the host.
Affiliation(s)
- Nurhani Mat Razali
  - Department of Biological Sciences and Biotechnology, Faculty of Science and Technology, Universiti Kebangsaan Malaysia, Bangi 43600, Selangor, Malaysia
- Siti Norvahida Hisham
  - Department of Biological Sciences and Biotechnology, Faculty of Science and Technology, Universiti Kebangsaan Malaysia, Bangi 43600, Selangor, Malaysia
- Ilakiya Sharanee Kumar
  - Department of Biological Sciences and Biotechnology, Faculty of Science and Technology, Universiti Kebangsaan Malaysia, Bangi 43600, Selangor, Malaysia
- Rohit Nandan Shukla
  - Bionivid Technology Pte Ltd., 209, 4th Cross Rd, B Channasandra, East of NGEF Layout, Kasturi Nagar, Bengaluru 560043, Karnataka, India
- Melvin Lee
  - Codon Genomics Sdn. Bhd., No 26, Jalan Dutamas 7 Taman Dutamas Balakong, Seri Kembangan 43200, Selangor, Malaysia
- Kalaivani Nadarajah
  - Department of Biological Sciences and Biotechnology, Faculty of Science and Technology, Universiti Kebangsaan Malaysia, Bangi 43600, Selangor, Malaysia
5
Chui KT, Shen CW. Tolerance analysis in scale-free social networks with varying degree exponents. Library Hi Tech 2019. [DOI: 10.1108/lht-07-2017-0146] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 11/17/2022]
Abstract
Purpose
Many complex networks, such as the World Wide Web, the internet and social networks, have been reported to be scale-free. The major property of scale-free networks is that their degree distributions follow a power law. Generally, the degree exponents of scale-free networks fall into the range of (2, 3). The purpose of this paper is to investigate other situations where the degree exponents may lie outside this range.
Design/methodology/approach
The analysis varies the degree exponent over the range of (0.5, 4.5). In total, 243 scenarios were generated with network sizes of 1,000, 2,000 and 4,000 and degree exponents in the range of (0.5, 4.5) at intervals of 0.05.
Findings
The following five indicators have been investigated: average density, average clustering coefficient, average path length, average diameter and average node degree. These indicators vary with the network size and degree exponent. If certain indicators do not satisfy the user requirement with degree exponents in (2, 3), one can further increase or decrease the exponent, accepting a tradeoff. Results recommend that for degree exponents in (0.5, 2), 26 possible scale-free networks can be selected, whereas for (3, 4.5), 41 possible scale-free networks can be selected, assuming a 100 percent deviation on the network parameters.
Originality/value
A tolerance analysis is given for the tradeoff, and guidelines are drawn to help design scale-free networks with degree exponents in the ranges of (0.5, 2) and (3, 4.5) using network sizes of 1,000, 2,000 and 4,000. The methodology is applicable to any network size.
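The scenario count quoted in this abstract can be checked with a short sketch. It assumes that both endpoints of the exponent range are included in the 0.05 grid (the abstract writes the range as (0.5, 4.5), so this is an assumption), which gives 81 exponents and 3 x 81 = 243 scenarios. The helper names `exponent_grid`, `density` and `average_degree` are illustrative and do not come from the paper; the two indicator formulas are the standard definitions for a simple undirected graph.

```python
from itertools import product

# Network sizes quoted in the abstract.
SIZES = (1000, 2000, 4000)


def exponent_grid(lo=0.5, hi=4.5, step=0.05):
    """Degree exponents from lo to hi, both endpoints included (an assumption).

    Values are built from an integer step count to avoid floating-point drift.
    """
    n_steps = round((hi - lo) / step)
    return [round(lo + i * step, 2) for i in range(n_steps + 1)]


def density(n_nodes, n_edges):
    """Density of a simple undirected graph: realised edges / possible edges."""
    return 2 * n_edges / (n_nodes * (n_nodes - 1))


def average_degree(n_nodes, n_edges):
    """Mean node degree of an undirected graph (each edge has two endpoints)."""
    return 2 * n_edges / n_nodes


EXPONENTS = exponent_grid()
SCENARIOS = list(product(SIZES, EXPONENTS))

if __name__ == "__main__":
    print(len(EXPONENTS), "exponents x", len(SIZES), "sizes =",
          len(SCENARIOS), "scenarios")
```

Generating the networks themselves, and the remaining three indicators, would need a graph library or a generator of the authors' choosing, which the abstract does not name.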
6
Orelle C, Durmort C, Mathieu K, Duchêne B, Aros S, Fenaille F, André F, Junot C, Vernet T, Jault JM. A multidrug ABC transporter with a taste for GTP. Sci Rep 2018; 8:2309. [PMID: 29396536 PMCID: PMC5797166 DOI: 10.1038/s41598-018-20558-z] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Received: 08/08/2017] [Accepted: 01/19/2018] [Indexed: 01/26/2023]
Abstract
During the evolution of cellular bioenergetics, many protein families have been fashioned to match the availability and replenishment in energy supply. Molecular motors and primary transporters essentially need ATP to function while proteins involved in cell signaling or translation consume GTP. ATP-Binding Cassette (ABC) transporters are one of the largest families of membrane proteins, gathering several medically relevant members that are typically powered by ATP hydrolysis. Here, a Streptococcus pneumoniae ABC transporter responsible for fluoroquinolone resistance in clinical settings, PatA/PatB, is shown to challenge this concept. It clearly favors GTP as the energy supply to expel drugs. This preference is correlated to its ability to hydrolyze GTP more efficiently than ATP, as found with PatA/PatB reconstituted in proteoliposomes or nanodiscs. Importantly, the ATP and GTP concentrations are similar in S. pneumoniae, supporting the physiological relevance of GTP as the energy source of this bacterial transporter.
Affiliation(s)
- Cédric Orelle
  - University of Lyon, CNRS, UMR5086 "Molecular Microbiology and Structural Biochemistry", IBCP, 7 Passage du Vercors, F-69367, Lyon, France
- Claire Durmort
  - Institut de Biologie Structurale (IBS), University Grenoble Alpes, CEA, CNRS, 38044, Grenoble, France
- Khadija Mathieu
  - University of Lyon, CNRS, UMR5086 "Molecular Microbiology and Structural Biochemistry", IBCP, 7 Passage du Vercors, F-69367, Lyon, France
- Benjamin Duchêne
  - Institut de Biologie Structurale (IBS), University Grenoble Alpes, CEA, CNRS, 38044, Grenoble, France
- Sandrine Aros
  - CEA, Institut Joliot, Service de Pharmacologie et d'Immunoanalyse, UMR 0496, Laboratoire d'Etude du Métabolisme des Médicaments, MetaboHUB-Paris, Université Paris Saclay, F-91191, Gif-sur-Yvette cedex, France
- François Fenaille
  - CEA, Institut Joliot, Service de Pharmacologie et d'Immunoanalyse, UMR 0496, Laboratoire d'Etude du Métabolisme des Médicaments, MetaboHUB-Paris, Université Paris Saclay, F-91191, Gif-sur-Yvette cedex, France
- François André
  - Laboratoire Stress Oxydant et Détoxication (LSOD), Institute for Integrative Biology of the Cell (I2BC), CEA, CNRS, Univ Paris-Sud, Université Paris-Saclay, F-91198, Gif-sur-Yvette cedex, France
- Christophe Junot
  - CEA, Institut Joliot, Service de Pharmacologie et d'Immunoanalyse, UMR 0496, Laboratoire d'Etude du Métabolisme des Médicaments, MetaboHUB-Paris, Université Paris Saclay, F-91191, Gif-sur-Yvette cedex, France
- Thierry Vernet
  - Institut de Biologie Structurale (IBS), University Grenoble Alpes, CEA, CNRS, 38044, Grenoble, France
- Jean-Michel Jault
  - University of Lyon, CNRS, UMR5086 "Molecular Microbiology and Structural Biochemistry", IBCP, 7 Passage du Vercors, F-69367, Lyon, France
7
|
Binding Direction-Based Two-Dimensional Flattened Contact Area Computing Algorithm for Protein–Protein Interactions. Molecules 2017; 22:molecules22101722. [PMID: 29027921 PMCID: PMC6151622 DOI: 10.3390/molecules22101722] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 08/31/2017] [Revised: 10/06/2017] [Accepted: 10/12/2017] [Indexed: 11/16/2022]
Abstract
Interactions between protein molecules are essential for the assembly, function, and regulation of proteins. The contact region between two protein molecules in a protein complex is usually complementary in shape for both molecules and the area of the contact region can be used to estimate the binding strength between two molecules. Although the area is a value calculated from the three-dimensional surface, it cannot represent the three-dimensional shape of the surface. Therefore, we propose an original concept of two-dimensional contact area which provides further information such as the ruggedness of the contact region. We present a novel algorithm for calculating the binding direction between two molecules in a protein complex, and then suggest a method to compute the two-dimensional flattened area of the contact region between two molecules based on the binding direction.
8
Ho CH, Tsai SF. Functional and biochemical characterization of a T cell-associated anti-apoptotic protein, GIMAP6. J Biol Chem 2017; 292:9305-9319. [PMID: 28381553 DOI: 10.1074/jbc.m116.768689] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 11/20/2016] [Revised: 03/31/2017] [Indexed: 11/06/2022]
Abstract
GTPases of immunity-associated proteins (GIMAPs) are expressed in lymphocytes and regulate survival/death signaling and cell development within the immune system. We found that human GIMAP6 is expressed primarily in T cell lines. By sorting human peripheral blood mononuclear cells and performing quantitative RT-PCR, GIMAP6 was found to be expressed in CD3+ cells. In Jurkat cells in which GIMAP6 had been knocked down, treatment with hydrogen peroxide, FasL, or okadaic acid significantly increased cell death/apoptosis. Exogenous expression of GIMAP6 protected Huh-7 cells from apoptosis, suggesting that GIMAP6 is an anti-apoptotic protein. Furthermore, knockdown of GIMAP6 not only rendered Jurkat cells sensitive to apoptosis but also accelerated T cell activation under phorbol 12-myristate 13-acetate/ionomycin treatment conditions. Using this experimental system, we also observed a down-regulation of p65 phosphorylation (Ser-536) in GIMAP6 knockdown cells, indicating that GIMAP6 might display anti-apoptotic function through NF-κB activation. The conclusion from the study on cultured T cells was corroborated by the analysis of primary CD3+ T cells, showing that specific knockdown of GIMAP6 led to enhancement of phorbol 12-myristate 13-acetate/ionomycin-mediated activation signals. To characterize the biochemical properties of GIMAP6, we purified the recombinant GIMAP6 to homogeneity and revealed that GIMAP6 had ATPase as well as GTPase activity. We further demonstrated that the hydrolysis activity of GIMAP6 was not essential for its anti-apoptotic function in Huh-7 cells. Combining the expression data, biochemical properties, and cellular features, we conclude that GIMAP6 plays a role in modulating immune function and that it does this by controlling cell death and the activation of T cells.
Affiliation(s)
- Ching-Huang Ho
  - Department of Life Sciences and Institute of Genome Sciences, National Yang-Ming University, Taipei 112, Taiwan
- Shih-Feng Tsai
  - Department of Life Sciences and Institute of Genome Sciences, National Yang-Ming University, Taipei 112, Taiwan; Institute of Molecular and Genomic Medicine, National Health Research Institutes, Zhunan, Miaoli 350, Taiwan
9
Pang E, Hao Y, Sun Y, Lin K. Differential variation patterns between hubs and bottlenecks in human protein-protein interaction networks. BMC Evol Biol 2016; 16:260. [PMID: 27903259 PMCID: PMC5131443 DOI: 10.1186/s12862-016-0840-8] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Received: 11/22/2016] [Accepted: 11/25/2016] [Indexed: 12/14/2022]
Abstract
BACKGROUND: The identification, description and understanding of protein-protein networks are important in cell biology and medicine, especially in systems biology, where the focus is on the interaction of biomolecules. Hubs and bottlenecks are the important proteins of a protein interaction network. Until now, very little attention has been paid to differentiating these two protein groups. RESULTS: By integrating human protein-protein interaction networks with human genome-wide variation across populations, we describe the differences between hubs and bottlenecks. Our findings show that, as in interspecies comparisons, hubs and bottlenecks change significantly more slowly than non-hubs and non-bottlenecks. To distinguish hubs from bottlenecks, we extracted their exclusive members: hub-non-bottlenecks and non-hub-bottlenecks. The differences between these two groups represent the differences between hubs and bottlenecks. We found that the variation rate of hubs was significantly lower than that of bottlenecks. In addition, we verified that stronger constraint is exerted on hubs than on bottlenecks. We further observed fewer non-synonymous sites in the domains of hubs than in those of bottlenecks, and different molecular functions between the two groups. CONCLUSIONS: Based on these results, we conclude that in recent human history, different variation patterns exist in hubs and bottlenecks in protein interaction networks. By revealing the difference between hubs and bottlenecks, our results might provide further insights into the relationship between evolution and biological structure.
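The distinction between hubs and bottlenecks can be illustrated on a toy graph. The sketch below assumes the common convention that a hub is a node of high degree and a bottleneck is a node of high betweenness centrality; the abstract does not state the cutoffs used, so the thresholds `hub_min_degree` and `bottleneck_min_betweenness`, the function names, and the toy network (two four-node cliques joined through a bridge node `X`) are all hypothetical. Betweenness is computed with Brandes' algorithm for unweighted graphs.

```python
from collections import deque
from itertools import combinations


def build_graph(edges):
    """Undirected adjacency sets from an edge list."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def betweenness(adj):
    """Unnormalised betweenness centrality (Brandes, unweighted, undirected)."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        stack = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}   # number of shortest paths from s
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:                   # breadth-first search from s
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        while stack:                   # accumulate dependencies in reverse order
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: b / 2 for v, b in bc.items()}


def classify(adj, hub_min_degree, bottleneck_min_betweenness):
    """Split nodes into the groups named in the abstract (thresholds assumed)."""
    bc = betweenness(adj)
    hubs = {v for v in adj if len(adj[v]) >= hub_min_degree}
    bottlenecks = {v for v in adj if bc[v] >= bottleneck_min_betweenness}
    return {
        "hub_non_bottleneck": hubs - bottlenecks,
        "non_hub_bottleneck": bottlenecks - hubs,
        "hub_bottleneck": hubs & bottlenecks,
    }


LEFT = ["p1", "p2", "p3", "p4"]
RIGHT = ["q1", "q2", "q3", "q4"]
EDGES = (list(combinations(LEFT, 2)) + list(combinations(RIGHT, 2))
         + [("p1", "X"), ("X", "q1")])
GRAPH = build_graph(EDGES)
GROUPS = classify(GRAPH, hub_min_degree=3, bottleneck_min_betweenness=10)

if __name__ == "__main__":
    for name, members in GROUPS.items():
        print(name, sorted(members))
```

In this toy network the bridge node has only two neighbours yet lies on every shortest path between the two cliques, which is the kind of node the abstract calls a non-hub-bottleneck.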
Affiliation(s)
- Erli Pang
  - MOE Key Laboratory for Biodiversity Science and Ecological Engineering, College of Life Sciences, Beijing Normal University, No 19 Xinjiekouwai Street, Beijing, 100875, China; Beijing Key Laboratory of Gene Resource and Molecular Development, College of Life Sciences, Beijing Normal University, Beijing, 100875, China
- Yu Hao
  - MOE Key Laboratory for Biodiversity Science and Ecological Engineering, College of Life Sciences, Beijing Normal University, No 19 Xinjiekouwai Street, Beijing, 100875, China; Beijing Key Laboratory of Gene Resource and Molecular Development, College of Life Sciences, Beijing Normal University, Beijing, 100875, China
- Ying Sun
  - MOE Key Laboratory for Biodiversity Science and Ecological Engineering, College of Life Sciences, Beijing Normal University, No 19 Xinjiekouwai Street, Beijing, 100875, China; Beijing Key Laboratory of Gene Resource and Molecular Development, College of Life Sciences, Beijing Normal University, Beijing, 100875, China
- Kui Lin
  - MOE Key Laboratory for Biodiversity Science and Ecological Engineering, College of Life Sciences, Beijing Normal University, No 19 Xinjiekouwai Street, Beijing, 100875, China; Beijing Key Laboratory of Gene Resource and Molecular Development, College of Life Sciences, Beijing Normal University, Beijing, 100875, China
10
Huang L, Liao L, Wu CH. Protein-protein interaction prediction based on multiple kernels and partial network with linear programming. BMC Systems Biology 2016. [PMCID: PMC4977483 DOI: 10.1186/s12918-016-0296-x] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 08/30/2023]
Abstract
Background: Prediction of de novo protein-protein interactions (PPIs) is a critical step toward reconstructing PPI networks, which is a central task in systems biology. Recent computational approaches have shifted from making PPI predictions based on individual pairs and a single data source to leveraging complementary information from multiple heterogeneous data sources and partial network structure. However, how to quickly learn weights for heterogeneous data sources remains a challenge. In this work, we developed a method to infer de novo PPIs by combining multiple data sources represented in kernel format and obtaining optimal weights based on a random walk over the existing partial networks. Results: Our proposed method utilizes the Barker algorithm and the training data to construct a transition matrix which constrains how a random walk would traverse the partial network. Multiple heterogeneous features for the proteins in the network are then combined into the form of a weighted kernel fusion, which provides a new "adjacency matrix" for the whole network that may consist of disconnected components but is required to comply with the transition matrix on the training subnetwork. This requirement is met by adjusting the weights to minimize the element-wise difference between the transition matrix and the weighted kernels. The minimization problem is solved by linear programming. The weighted kernel fusion is then transformed to a regularized Laplacian (RL) kernel to infer missing or new edges in the PPI network, which can potentially connect the previously disconnected components. Conclusions: The results on synthetic data demonstrated the soundness and robustness of the proposed algorithms under various conditions. The results on real data show that the accuracies of PPI prediction for yeast data and human data, measured as AUC, are increased by up to 19 % and 11 % respectively, compared to a control method without optimal weights. Moreover, the weights learned by our method, Weight Optimization by Linear Programming (WOLP), are very consistent with those learned by sampling, and can provide insights into the relations between PPIs and various feature kernels, thereby improving PPI prediction even for disconnected PPI networks.
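The last two steps described in this abstract, fusing kernels with weights and transforming the result into a regularized Laplacian kernel, can be sketched in a few lines. The abstract gives no formulas, so the sketch assumes the commonly used definition K = (I + alpha * L)^(-1) with graph Laplacian L = D - A, and treats the fused kernel as a weighted adjacency matrix. The function names, the value of `alpha` and the three-node example are illustrative; the linear-programming step that learns the weights is not shown.

```python
def weighted_fusion(kernels, weights):
    """Element-wise weighted sum of equally sized square matrices."""
    n = len(kernels[0])
    return [[sum(w * k[i][j] for k, w in zip(kernels, weights))
             for j in range(n)] for i in range(n)]


def laplacian(adjacency):
    """Graph Laplacian L = D - A for a symmetric (weighted) adjacency matrix."""
    n = len(adjacency)
    degree = [sum(row) for row in adjacency]
    return [[(degree[i] if i == j else 0.0) - adjacency[i][j]
             for j in range(n)] for i in range(n)]


def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting (small dense matrices)."""
    n = len(matrix)
    aug = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def regularized_laplacian(adjacency, alpha):
    """Assumed form of the RL kernel: (I + alpha * L)^(-1)."""
    lap = laplacian(adjacency)
    n = len(lap)
    return invert([[(1.0 if i == j else 0.0) + alpha * lap[i][j]
                    for j in range(n)] for i in range(n)])


# Path network 0 - 1 - 2: nodes 0 and 2 share no direct edge.
PATH = [[0, 1, 0],
        [1, 0, 1],
        [0, 1, 0]]
KERNEL = regularized_laplacian(PATH, alpha=0.5)

if __name__ == "__main__":
    for row in KERNEL:
        print([round(x, 4) for x in row])
```

The entry for the unconnected pair (0, 2) is positive but smaller than the entry for the connected pair (0, 1), which is why such kernel values can be ranked as scores for candidate missing edges.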
11
Huang L, Liao L, Wu CH. Inference of protein-protein interaction networks from multiple heterogeneous data. EURASIP Journal on Bioinformatics & Systems Biology 2016; 2016:8. [PMID: 26941784 PMCID: PMC4761017 DOI: 10.1186/s13637-016-0040-2] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Received: 08/06/2015] [Accepted: 02/09/2016] [Indexed: 11/29/2022]
Abstract
Protein-protein interaction (PPI) prediction is a central task in achieving a better understanding of cellular and intracellular processes. Because high-throughput experimental methods are both expensive and time-consuming, and are also known to suffer from incompleteness and noise, many computational methods have been developed, with varied degrees of success. However, the inference of PPI networks from multiple heterogeneous data sources remains a great challenge. In this work, we developed a novel method based on approximate Bayesian computation and modified differential evolution sampling (ABC-DEP) and a regularized Laplacian (RL) kernel. The method enables inference of PPI networks from topological properties and multiple heterogeneous features, including gene expression and Pfam domain profiles, in the form of weighted kernels. The optimal weights are obtained by ABC-DEP, and the kernel fusion built from the optimal weights serves as input to RL to infer missing or new edges in the PPI network. Detailed comparisons with control methods have been made, and the results show that the accuracy of PPI prediction measured by AUC is increased by up to 23 %, compared to a baseline without optimal weights. The method can provide insights into the relations between PPIs and various feature kernels and demonstrates a strong capability of predicting faraway interactions that cannot be well detected by the traditional RL method.
Affiliation(s)
- Lei Huang
  - Department of Computer and Information Sciences, University of Delaware, 18 Amstel Avenue, Newark, DE 19716, USA
- Li Liao
  - Department of Computer and Information Sciences, University of Delaware, 18 Amstel Avenue, Newark, DE 19716, USA
- Cathy H Wu
  - Department of Computer and Information Sciences, University of Delaware, 18 Amstel Avenue, Newark, DE 19716, USA; Center for Bioinformatics and Computational Biology, University of Delaware, 15 Innovation Way, Newark, DE 19711, USA
12
Kim D, Li R, Dudek SM, Frase AT, Pendergrass SA, Ritchie MD. Knowledge-driven genomic interactions: an application in ovarian cancer. BioData Min 2014; 7:20. [PMID: 25214892 PMCID: PMC4161273 DOI: 10.1186/1756-0381-7-20] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.0] [Received: 03/21/2014] [Accepted: 08/28/2014] [Indexed: 12/11/2022]
Abstract
Background: Effective prediction of cancer clinical outcomes, as a route to understanding the mechanisms of various types of cancer, has been pursued using molecular data such as gene expression profiles, an approach that has promise for providing better diagnostics and supporting further therapies. However, clinical outcome prediction based on gene expression profiles varies between independent data sets. Further, single-gene expression outcome prediction is limited for cancer evaluation since genes do not act in isolation, but rather interact with other genes in complex signaling or regulatory networks. In addition, since pathways are likely to co-operate, it would be desirable to incorporate expert knowledge to combine pathways in a useful and informative manner. Methods: We propose a novel approach for identifying knowledge-driven genomic interactions and applying it to discover models associated with cancer clinical phenotypes using grammatical evolution neural networks (GENN). To demonstrate the utility of the proposed approach, an ovarian cancer data set from The Cancer Genome Atlas (TCGA) was used for predicting clinical stage as a pilot project. Results: We identified knowledge-driven genomic interactions associated with cancer stage not only from single knowledge bases, such as sources of pathway-pathway interaction, but also across different sets of knowledge bases, such as pathway-protein family interactions, by integrating different types of information. Notably, an integration model from different sources of biological knowledge achieved 78.82% balanced accuracy and outperformed the top models with gene expression or single knowledge-based data types alone. Furthermore, the results from the models are more interpretable because they are framed in the context of specific biological pathways or other expert knowledge. Conclusions: The success of the pilot study presented here will allow us to pursue further identification of models predictive of clinical cancer survival and recurrence. Understanding the underlying tumorigenesis and progression in ovarian cancer through a global view of interactions within and between different biological knowledge sources has the potential to provide more effective screening strategies and therapeutic targets for many types of cancer.
Affiliation(s)
- Dokyoon Kim
  - Department of Biochemistry and Molecular Biology, Center for Systems Genomics, Pennsylvania State University, University Park, Pennsylvania, USA
- Ruowang Li
  - Department of Biochemistry and Molecular Biology, Center for Systems Genomics, Pennsylvania State University, University Park, Pennsylvania, USA
- Scott M Dudek
  - Department of Biochemistry and Molecular Biology, Center for Systems Genomics, Pennsylvania State University, University Park, Pennsylvania, USA
- Alex T Frase
  - Department of Biochemistry and Molecular Biology, Center for Systems Genomics, Pennsylvania State University, University Park, Pennsylvania, USA
- Sarah A Pendergrass
  - Department of Biochemistry and Molecular Biology, Center for Systems Genomics, Pennsylvania State University, University Park, Pennsylvania, USA
- Marylyn D Ritchie
  - Department of Biochemistry and Molecular Biology, Center for Systems Genomics, Pennsylvania State University, University Park, Pennsylvania, USA
13
Fadhal E, Mwambene EC, Gamieldien J. Modelling human protein interaction networks as metric spaces has potential in disease research and drug target discovery. BMC Systems Biology 2014; 8:68. [PMID: 24929653 PMCID: PMC4088370 DOI: 10.1186/1752-0509-8-68] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Received: 02/07/2014] [Accepted: 06/04/2014] [Indexed: 01/06/2023]
Abstract
Background By formally modelling human protein interaction networks (PINs) as metric spaces and classifying proteins into zones based on their distance from the topological centre, we have recently shown that hub proteins are primarily centrally located. We also showed that zones closest to the network centre are enriched for critically important proteins and are also functionally very specialised for specific ‘house keeping’ functions. We proposed that proteins closest to the network centre may present good therapeutic targets. Here, we present multiple pieces of novel functional evidence that provide strong support for this hypothesis. Results We found that the human PINs have a highly connected signalling core, with the majority of proteins involved in signalling located in the two zones closest to the topological centre. The majority of essential, disease related, tumour suppressor, oncogenic and approved drug target proteins were found to be centrally located. Similarly, the majority of proteins consistently expressed in 13 types of cancer are also predominantly located in zones closest to the centre. Proteins from zones 1 and 2 were also found to comprise the majority of proteins in key KEGG pathways such as MAPK-signalling, the cell cycle, apoptosis and also pathways in cancer, with very similar patterns seen in pathways that lead to cancers such as melanoma and glioma, and non-neoplastic diseases such as measles, inflammatory bowel disease and Alzheimer’s disease. Conclusions Based on the diversity of evidence uncovered, we propose that when considered holistically, proteins located centrally in the human PINs that also have similar functions to existing drug targets are good candidate targets for novel therapeutics. Similarly, since disease pathways are dominated by centrally located proteins, candidates shortlisted in genome scale disease studies can be further prioritized and contextualised based on whether they occupy central positions in the human PINs.
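The zone idea described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes that the topological centre is the set of nodes with minimum eccentricity in a connected, unweighted graph, and that a protein's zone is its hop distance to the nearest centre node (so centre nodes fall in zone 0 here, which may not match the paper's numbering). The network `toy_pin` and its node names are hypothetical.

```python
from collections import deque

def bfs_distances(graph, source):
    """Hop distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nb in graph[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return dist

def zones_from_centre(graph):
    """Return (centre nodes, zone per node) for a connected graph.

    Assumption: centre = nodes of minimum eccentricity;
    zone = distance to the nearest centre node.
    """
    all_dist = {n: bfs_distances(graph, n) for n in graph}
    ecc = {n: max(d.values()) for n, d in all_dist.items()}
    radius = min(ecc.values())
    centre = sorted(n for n, e in ecc.items() if e == radius)
    zones = {n: min(all_dist[c][n] for c in centre) for n in graph}
    return centre, zones

# Hypothetical toy network (adjacency sets, undirected).
toy_pin = {
    "A": {"B"},
    "B": {"A", "C", "D"},
    "C": {"B", "E"},
    "D": {"B"},
    "E": {"C"},
}

centre, zones = zones_from_centre(toy_pin)
print(centre, zones)
```

All-pairs breadth-first search is quadratic in network size, which is acceptable for a toy example but would need a more careful approach for a full interactome.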
Affiliation(s)
- Junaid Gamieldien
- South African National Bioinformatics Institute/ MRC Unit for Bioinformatics Capacity Development, University of the Western Cape, Bellville 7530, South Africa.
|
14
|
Abstract
Rapid development of genomic and proteomic methodologies has provided a wealth of data for deciphering the biomolecular circuitry of a living cell. The main areas of computational research of proteomes outlined in this review are: understanding the system, its features and parameters to help plan the experiments; data integration, to help produce more reliable data sets; visualization and other forms of data representation to simplify interpretation; modeling of the functional regulation; and systems biology. With false-positive rates reaching 50% even in the more reliable data sets, handling the experimental error remains one of the most challenging tasks. Integrative approaches, incorporating results of various genome- and proteome-wide experiments, allow for minimizing the error and bring with them significant predictive power.
|
15
|
El-Denshary ESM, Rashed LA, Elhussiny M. Mesenchymal stromal cells versus betamethasone can dampen disease activity in the collagen arthritis mouse model. Clin Exp Med 2013; 14:285-95. [PMID: 23990050 DOI: 10.1007/s10238-013-0248-3] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Received: 02/03/2013] [Accepted: 06/08/2013] [Indexed: 12/15/2022]
Abstract
The objective of this study was to compare the effects of mesenchymal stem cells (MSCs) and betamethasone in the treatment of rheumatoid arthritis. Sixty male albino mice were divided equally into two models. In the MSC model, group 1 was a saline control group, group 2 had collagen-induced arthritis (CIA), and group 3 comprised induced-arthritis mice that received an intravenous injection of MSCs. In the betamethasone model, group 1 received phosphate-buffered saline, group 2 had CIA, and group 3 comprised induced-arthritis mice that received an intraperitoneal injection of betamethasone. The mouse arthritis models were assessed by clinical paw edema and X-rays. At the time of sacrifice, tissues were collected and examined using real-time PCR, and synovial tissue was examined for interleukin-10, tumor necrosis factor α, cartilage oligomeric matrix protein and matrix metalloproteinase 3. Serum levels of rheumatoid factor and C-reactive protein were measured with enzyme-linked immunosorbent assay kits, and the blood erythrocyte sedimentation rate was also determined. Histopathological, paw edema and PCR results showed improvement in the groups that received MSCs compared with the diseased group and the groups that received betamethasone. MSC treatment significantly improved collagen-induced arthritis and was superior to betamethasone treatment, likely through the modulation of the expression of various cytokines.
|
16
|
Winter C, Henschel A, Tuukkanen A, Schroeder M. Protein interactions in 3D: From interface evolution to drug discovery. J Struct Biol 2012; 179:347-58. [DOI: 10.1016/j.jsb.2012.04.009] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.9] [Received: 12/16/2011] [Revised: 03/27/2012] [Accepted: 04/18/2012] [Indexed: 11/25/2022]
|
17
|
Pradhan MP, Prasad NKA, Palakal MJ. A systems biology approach to the global analysis of transcription factors in colorectal cancer. BMC Cancer 2012; 12:331. [PMID: 22852817 PMCID: PMC3539921 DOI: 10.1186/1471-2407-12-331] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.8] [Received: 07/14/2011] [Accepted: 06/21/2012] [Indexed: 02/08/2023]
Abstract
Background Biological entities do not perform in isolation, and often, it is the nature and degree of interactions among numerous biological entities which ultimately determines any final outcome. Hence, experimental data on any single biological entity can be of limited value when considered only in isolation. To address this, we propose that augmenting individual entity data with the literature will not only better define the entity’s own significance but also uncover relationships with novel biological entities. To test this notion, we developed a comprehensive text mining and computational methodology that focused on discovering new targets of one class of molecular entities, transcription factors (TFs), within one particular disease, colorectal cancer (CRC). Methods We used 39 molecular entities known to be associated with CRC along with six colorectal cancer terms as the bait list, or list of search terms, for mining the biomedical literature to identify CRC-specific genes and proteins. Using the literature-mined data, we constructed a global TF interaction network for CRC. We then developed a multi-level, multi-parametric methodology to identify TFs relevant to CRC. Results The small bait list, when augmented with literature-mined data, identified a large number of biological entities associated with CRC. The relative importance of these TFs and their associated modules was identified using functional and topological features. Additional validation of these highly-ranked TFs using the literature strengthened our findings. Some of the novel TFs that we identified were: SLUG, RUNX1, IRF1, HIF1A, ATF-2, ABL1, ELK-1 and GATA-1. Some of these TFs are associated with functional modules in known pathways of CRC, including the Beta-catenin/development, immune response, transcription, and DNA damage pathways.
Conclusions Our methodology of using text mining data and a multi-level, multi-parameter scoring technique was able to identify both known and novel TFs that have roles in CRC. Starting with just one TF (SMAD3) in the bait list, the literature mining process identified an additional 116 CRC-associated TFs. Our network-based analysis showed that these TFs all belonged to any of 13 major functional groups that are known to play important roles in CRC. Among these identified TFs, we obtained a novel six-node module consisting of ATF2-P53-JNK1-ELK1-EPHB2-HIF1A, from which the novel JNK1-ELK1 association could potentially be a significant marker for CRC.
Affiliation(s)
- Meeta P Pradhan
- School of Informatics, Indiana University Purdue University Indianapolis, Indianapolis, IN 46202, USA
|
18
|
Planas-Iglesias J, Guney E, García-García J, Robertson KA, Raza S, Freeman TC, Ghazal P, Oliva B. Extending signaling pathways with protein-interaction networks. Application to apoptosis. OMICS: A Journal of Integrative Biology 2012; 16:245-56. [PMID: 22385281 DOI: 10.1089/omi.2011.0130] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 11/12/2022]
Abstract
Cells exploit signaling pathways during responses to environmental changes, and these processes are often modulated during disease. In particular, relevant human pathologies such as cancer or viral infections require downregulating apoptosis signaling pathways to progress. As a result, the identification of proteins responsible for these changes is essential for diagnostics and the development of therapeutics. Transferring functional annotation within protein interaction networks has proven useful to identify such proteins, although this is not a trivial task. Here, we used different scoring methods to transfer annotation from 53 well-studied members of the human apoptosis pathways (as known by 2005) to their protein interactors. All scoring methods produced significant predictions (compared to a random negative model), but their number was too large to be useful. Thus, we made a final prediction using specific combinations of scoring methods and compared it to the proteins related to apoptosis signaling pathways during the last 5 years. We propose 273 candidate proteins that may be relevant in apoptosis signaling pathways. Although some of them have known functions consistent with their proposed apoptosis involvement, the majority have not been annotated yet, leaving room for further experimental studies. We provide our predictions at http://sbi.imim.es/web/Apoptosis.php.
Affiliation(s)
- Joan Planas-Iglesias
- Structural Bioinformatics Group, GRIB, Universitat Pompeu Fabra, Barcelona, Catalonia, Spain
|
19
|
Zhang S, Chang Z, Li Z, DuanMu H, Li Z, Li K, Liu Y, Qiu F, Xu Y. Calculating phenotypic similarity between genes using hierarchical structure data based on semantic similarity. Gene 2012; 497:58-65. [PMID: 22305981 DOI: 10.1016/j.gene.2012.01.014] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 12/19/2011] [Revised: 01/16/2012] [Accepted: 01/18/2012] [Indexed: 01/25/2023]
Abstract
Phenotypic similarity is correlated with a number of measures of gene function, such as relatedness at the level of direct protein-protein interaction. The phenotypic effect of a deleted or mutated gene, which is one part of gene annotation, has attracted broad attention. However, few measures exist for studying phenotypic similarity with data from the Human Phenotype Ontology (HPO) database, so more such measures should be developed and investigated. We used five semantic similarity-based measures (Jiang and Conrath, Lin, Schlicker, Yu and Wu) to calculate the human phenotypic similarity between genes (PSG) with data from the HPO database, and evaluated their accuracy with information on protein-protein interaction, protein complex, protein family, gene function or DNA sequence. Compared with randomly selected gene pairs, the results of these methods were statistically significant (all P<0.001). Furthermore, we assessed the performance of these five measures by receiver operating characteristic (ROC) curve analysis, and found that most of them performed better than the previous methods. This work shows that these semantic similarity-based measures for calculating PSG are effective for hierarchically structured data. Our study contributes to the development and optimization of novel algorithms for PSG calculation and provides more alternative methods to researchers as well as tools and directions for PSG study.
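As a rough illustration of one of the named measures, the sketch below computes Lin's term-level similarity, 2·IC(MICA) / (IC(a) + IC(b)), where IC is the information content of a term and MICA is the most informative common ancestor. The ontology `parents`, the gene annotations and all term names are hypothetical toy data, not HPO content, and the abstract does not say how term-level scores were combined into gene-level scores, so that step is left out.

```python
import math

# Hypothetical toy ontology: term -> set of parent terms (DAG rooted at "root").
parents = {
    "root": set(),
    "t1": {"root"},
    "t2": {"root"},
    "t3": {"t1"},
    "t4": {"t1"},
}

# Hypothetical direct annotations: gene -> set of terms.
annotations = {"g1": {"t3"}, "g2": {"t4"}, "g3": {"t2"}, "g4": {"t3"}}

def ancestors(term):
    """The term itself plus all of its ancestors."""
    seen = {term}
    stack = [term]
    while stack:
        for p in parents[stack.pop()]:
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen

def information_content():
    """IC(t) = -log(fraction of genes annotated with t or a descendant)."""
    counts = {t: 0 for t in parents}
    for terms in annotations.values():
        covered = set()
        for t in terms:
            covered |= ancestors(t)
        for t in covered:
            counts[t] += 1
    n = len(annotations)
    return {t: -math.log(c / n) for t, c in counts.items() if c > 0}

def lin_similarity(a, b, ic):
    """Lin similarity; returns 0.0 when both terms have zero IC (a convention)."""
    mica = max(ic[t] for t in ancestors(a) & ancestors(b))
    denom = ic[a] + ic[b]
    return 2 * mica / denom if denom > 0 else 0.0

ic = information_content()
sim_siblings = lin_similarity("t3", "t4", ic)   # share the ancestor t1
sim_unrelated = lin_similarity("t3", "t2", ic)  # share only the root
sim_self = lin_similarity("t3", "t3", ic)
print(sim_siblings, sim_unrelated, sim_self)
```

Terms that share only the root score zero because the root annotates every gene and so carries no information.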
Affiliation(s)
- Shanzhen Zhang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, PR China
|
20
|
Liu KP, Hsu KC, Huang JW, Chang LS, Yang JM. ATRIPPI: an atom-residue preference scoring function for protein–protein interactions. Int J Artif Intell T 2011. [DOI: 10.1142/s0218213010000169] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/18/2022]
Abstract
We present the ATRIPPI model for analyzing protein–protein interactions. This model is a set of 167-atom-type, residue-specific interaction preferences with distance bins, derived from 641 co-crystallized protein–protein interfaces. The ATRIPPI model yields physically meaningful descriptions of hydrogen bonding, disulfide bonding, electrostatic interactions, van der Waals and aromatic–aromatic interactions. We applied this model to identify the native states and near-native complex structures of 17 bound and 17 unbound complexes from thousands of decoy structures. On average, 77.5% (155) of the 200 top-ranked structures are close to the native structure. These results suggest that the ATRIPPI model retains the advantages of both atom–atom and residue–residue interactions and is a potential knowledge-based scoring function for protein–protein docking methods. We believe that our model is robust and provides biological meaning to support the analysis of protein–protein interactions.
Affiliation(s)
- Kang-Ping Liu
- Institute of Bioinformatics, National Chiao Tung University, Hsinchu, Taiwan
- Kai-Cheng Hsu
- Institute of Bioinformatics, National Chiao Tung University, Hsinchu, Taiwan
- Jhang-Wei Huang
- Institute of Bioinformatics, National Chiao Tung University, Hsinchu, Taiwan
- Lu-Shian Chang
- Institute of Bioinformatics, National Chiao Tung University, Hsinchu, Taiwan
- Jinn-Moon Yang
- Institute of Bioinformatics, National Chiao Tung University, Hsinchu, Taiwan
- Department of Biological Science and Technology, National Chiao Tung University, Hsinchu, Taiwan
- Core Facility for Structural Bioinformatics, National Chiao Tung University, Hsinchu, Taiwan
|
21
|
Hu WJ, Zhou SM, Yang JS, Meng FG. Computational simulations to predict creatine kinase-associated factors: protein-protein interaction studies of brain and muscle types of creatine kinases. Enzyme Res 2011; 2011:328249. [PMID: 21826261 PMCID: PMC3150154 DOI: 10.4061/2011/328249] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 05/17/2011] [Accepted: 05/26/2011] [Indexed: 11/20/2022]
Abstract
Creatine kinase (CK; EC 2.7.3.2) is related to several skin diseases such as psoriasis and dermatomyositis. CK is important in skin energy homeostasis because it catalyzes the reversible transfer of a phosphoryl group from MgATP to creatine. In this study, we predicted CK binding proteins via the use of bioinformatic tools such as protein-protein interaction (PPI) mappings and suggest the putative hub proteins for CK interactions. We obtained 123 proteins for brain type CK and 85 proteins for muscle type CK in the interaction networks. Among them, several hub proteins such as NFKB1, FHL2, MYOC, and ASB9 were predicted. Determination of the binding factors of CK can further promote our understanding of the roles of CK in physiological conditions.
Affiliation(s)
- Wei-Jiang Hu
- Zhejiang Provincial Key Laboratory of Applied Enzymology, Yangtze Delta Region Institute of Tsinghua University, Jiaxing 314006, China
|
22
|
Schneider CM, de Arcangelis L, Herrmann HJ. Modeling the topology of protein interaction networks. Physical Review. E, Statistical, Nonlinear, and Soft Matter Physics 2011; 84:016112. [PMID: 21867262 DOI: 10.1103/physreve.84.016112] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 08/19/2010] [Revised: 06/13/2011] [Indexed: 05/31/2023]
Abstract
A major issue in biology is the understanding of the interactions between proteins. These interactions can be described by a network, where the proteins are modeled by nodes and the interactions by edges. The origin of these protein networks is not well understood yet. Here we present a two-step model, which generates clusters with the same topological properties as networks for protein-protein interactions, namely, the same degree distribution, cluster size distribution, clustering coefficient, and shortest path length. The biological and model networks are not scale-free but exhibit small-world features. The model allows the fitting of different biological systems by tuning a single parameter.
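Two of the topological properties named in this abstract, the clustering coefficient and the shortest path length, can be computed for a small graph as in the following Python sketch. It does not implement the authors' two-step generative model. It assumes an undirected, connected, unweighted graph stored as adjacency sets, and `toy_net` is a hypothetical example.

```python
from collections import deque
from itertools import combinations

def average_clustering(graph):
    """Mean local clustering coefficient; nodes of degree < 2 contribute 0."""
    total = 0.0
    for nbrs in graph.values():
        k = len(nbrs)
        if k < 2:
            continue
        links = sum(1 for a, b in combinations(nbrs, 2) if b in graph[a])
        total += 2 * links / (k * (k - 1))
    return total / len(graph)

def average_shortest_path(graph):
    """Mean hop distance over all ordered pairs of distinct, connected nodes."""
    total, pairs = 0, 0
    for s in graph:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs

# Hypothetical toy network: a triangle (a, b, c) with a pendant node d on c.
toy_net = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}

avg_clust = average_clustering(toy_net)
avg_path = average_shortest_path(toy_net)
print(avg_clust, avg_path)
```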
|
23
|
Seidl MF, Van den Ackerveken G, Govers F, Snel B. A domain-centric analysis of oomycete plant pathogen genomes reveals unique protein organization. Plant Physiology 2011; 155:628-644. [PMID: 21119047 PMCID: PMC3032455 DOI: 10.1104/pp.110.167841] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.0] [Received: 10/19/2010] [Accepted: 11/24/2010] [Indexed: 05/29/2023]
Abstract
Oomycetes comprise a diverse group of organisms that morphologically resemble fungi but belong to the stramenopile lineage within the supergroup of chromalveolates. Recent studies have shown that plant pathogenic oomycetes have expanded gene families that are possibly linked to their pathogenic lifestyle. We analyzed the protein domain organization of 67 eukaryotic species including four oomycete and five fungal plant pathogens. We detected 246 expanded domains in fungal and oomycete plant pathogens. The analysis of genes differentially expressed during infection revealed a significant enrichment of genes encoding expanded domains as well as signal peptides linking a substantial part of these genes to pathogenicity. Overrepresentation and clustering of domain abundance profiles revealed domains that might have important roles in host-pathogen interactions but, as yet, have not been linked to pathogenicity. The number of distinct domain combinations (bigrams) in oomycetes was significantly higher than in fungi. We identified 773 oomycete-specific bigrams, with the majority composed of domains common to eukaryotes. The analyses enabled us to link domain content to biological processes such as host-pathogen interaction, nutrient uptake, or suppression and elicitation of plant immune responses. Taken together, this study represents a comprehensive overview of the domain repertoire of fungal and oomycete plant pathogens and points to novel features like domain expansion and species-specific bigram types that could, at least partially, explain why oomycetes are such remarkable plant pathogens.
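The notion of lineage-specific domain bigrams can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the study's pipeline: a bigram is taken here to be an ordered pair of adjacent domains in a protein's architecture, and the proteomes, protein identifiers and domain names (`domA` and so on) are hypothetical.

```python
def domain_bigrams(architecture):
    """Ordered pairs of adjacent domains in one protein (assumed definition)."""
    return set(zip(architecture, architecture[1:]))

def proteome_bigrams(proteome):
    """Union of bigrams over all proteins in a proteome."""
    bigrams = set()
    for architecture in proteome.values():
        bigrams |= domain_bigrams(architecture)
    return bigrams

# Hypothetical proteomes: protein id -> N- to C-terminal list of domains.
lineage_a = {
    "a1": ["domA", "domB", "domC"],
    "a2": ["domD", "domC"],
}
lineage_b = {
    "b1": ["domA", "domB"],
    "b2": ["domC"],  # single-domain protein: contributes no bigram
}

bigrams_a = proteome_bigrams(lineage_a)
bigrams_b = proteome_bigrams(lineage_b)

# Bigrams found in lineage A but not in lineage B.
specific_to_a = bigrams_a - bigrams_b
print(sorted(specific_to_a))
```

In this toy example the individual domains of the A-specific bigrams also occur in lineage B; only their combination is specific, which mirrors the abstract's observation that most oomycete-specific bigrams are composed of domains common to eukaryotes.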
Affiliation(s)
- Michael F Seidl
- Theoretical Biology and Bioinformatics , Department of Biology, Utrecht University, 3584 CH Utrecht, The Netherlands.
|
24
|
Flórez AF, Park D, Bhak J, Kim BC, Kuchinsky A, Morris JH, Espinosa J, Muskus C. Protein network prediction and topological analysis in Leishmania major as a tool for drug target selection. BMC Bioinformatics 2010; 11:484. [PMID: 20875130 PMCID: PMC2956735 DOI: 10.1186/1471-2105-11-484] [Citation(s) in RCA: 66] [Impact Index Per Article: 4.7] [Received: 01/24/2010] [Accepted: 09/27/2010] [Indexed: 02/06/2023]
Abstract
Background Leishmaniasis is a virulent parasitic infection that causes a worldwide disease burden. Most treatments have toxic side-effects and efficacy has decreased due to the emergence of resistant strains. The outlook is worsened by the absence of promising drug targets for this disease. We have taken a computational approach to the detection of new drug targets, which may become an effective strategy for the discovery of new drugs for this tropical disease. Results We have predicted the protein interaction network of Leishmania major by using three validated methods: PSIMAP, PEIMAP, and iPfam. Combining the results from these methods, we calculated a high confidence network (confidence score > 0.70) with 1,366 nodes and 33,861 interactions. We were able to predict the biological process for 263 interacting proteins by doing enrichment analysis of the clusters detected. Analyzing the topology of the network with metrics such as connectivity and betweenness centrality, we detected 142 potential drug targets after homology filtering with the human proteome. Further experiments can be done to validate these targets. Conclusion We have constructed the first protein interaction network of the Leishmania major parasite by using a computational approach. The topological analysis of the protein network enabled us to identify a set of candidate proteins that may be both (1) essential for parasite survival and (2) without human orthologs. These potential targets are promising for further experimental validation. This strategy, if validated, may augment established drug discovery methodologies, for this and possibly other tropical diseases, with a relatively low additional investment of time and resources.
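The general strategy in this abstract, ranking proteins by a topological metric and then discarding those with human homologs, might be sketched as follows. The sketch uses Brandes' algorithm for betweenness centrality on an unweighted, undirected graph; the network `toy_net`, the protein names and the `human_homologs` set are hypothetical, and the authors' actual scoring and homology thresholds are not reproduced here.

```python
from collections import deque

def betweenness(graph):
    """Brandes' algorithm for an unweighted, undirected graph."""
    bc = {v: 0.0 for v in graph}
    for s in graph:
        stack = []
        preds = {v: [] for v in graph}
        sigma = {v: 0 for v in graph}   # number of shortest paths from s
        dist = {v: -1 for v in graph}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in graph}
        while stack:  # accumulate dependencies in order of decreasing distance
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: b / 2 for v, b in bc.items()}

# Hypothetical predicted parasite network (adjacency sets).
toy_net = {
    "P1": {"P2"},
    "P2": {"P1", "P3"},
    "P3": {"P2", "P4", "P5"},
    "P4": {"P3"},
    "P5": {"P3"},
}
# Hypothetical set of proteins with a detectable human homolog.
human_homologs = {"P3"}

bc = betweenness(toy_net)
candidates = sorted(
    (p for p in toy_net if p not in human_homologs and bc[p] > 0),
    key=lambda p: -bc[p],
)
print(bc, candidates)
```

Here the most central protein, P3, is excluded by the homology filter, leaving P2 as the only candidate, which shows how the two criteria interact.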
Affiliation(s)
- Andrés F Flórez
- Programa de Estudio y Control de Enfermedades Tropicales-PECET, Universidad de Antioquia, Calle 62 No 52-59, Lab. 632, Medellín, Colombia
|
25
|
Schlitt T, Brazma A. Learning about gene regulatory networks from gene deletion experiments. Comp Funct Genomics 2010; 3:499-503. [PMID: 18629255 PMCID: PMC2448417 DOI: 10.1002/cfg.220] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Revised: 09/09/2002] [Accepted: 10/14/2002] [Indexed: 11/10/2022]
Abstract
Gene regulatory networks are a major focus of interest in molecular biology. A crucial question is how complex regulatory systems are encoded and controlled by the genome. Three recent publications have raised the question of what can be learned about gene regulatory networks from microarray experiments on gene deletion mutants. Using this indirect approach, topological features such as connectivity and modularity have been studied.
Affiliation(s)
- Thomas Schlitt
- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK.
|
26
|
Miyamoto-Sato E, Fujimori S, Ishizaka M, Hirai N, Masuoka K, Saito R, Ozawa Y, Hino K, Washio T, Tomita M, Yamashita T, Oshikubo T, Akasaka H, Sugiyama J, Matsumoto Y, Yanagawa H. A comprehensive resource of interacting protein regions for refining human transcription factor networks. PLoS One 2010; 5:e9289. [PMID: 20195357 PMCID: PMC2827538 DOI: 10.1371/journal.pone.0009289] [Citation(s) in RCA: 53] [Impact Index Per Article: 3.8] [Received: 07/09/2009] [Accepted: 01/05/2010] [Indexed: 11/24/2022]
Abstract
Large-scale data sets of protein-protein interactions (PPIs) are a valuable resource for mapping and analysis of the topological and dynamic features of interactome networks. The currently available large-scale PPI data sets only contain information on interaction partners. The data presented in this study also include the sequences involved in the interactions (i.e., the interacting regions, IRs) suggested to correspond to functional and structural domains. Here we present the first large-scale IR data set obtained using mRNA display for 50 human transcription factors (TFs), including 12 transcription-related proteins. The core data set (966 IRs; 943 PPIs) displays a verification rate of 70%. Analysis of the IR data set revealed the existence of IRs that interact with multiple partners. Furthermore, these IRs were preferentially associated with intrinsic disorder. This finding supports the hypothesis that intrinsically disordered regions play a major role in the dynamics and diversity of TF networks through their ability to structurally adapt to and bind with multiple partners. Accordingly, this domain-based interaction resource represents an important step in refining protein interactions and networks at the domain level and in associating network analysis with biological structure and function.
Affiliation(s)
- Etsuko Miyamoto-Sato
- Advanced Research Centers, Keio University, Yokohama, Japan
- Department of Biosciences and Informatics, Faculty of Science and Technology, Keio University, Yokohama, Japan
- * E-mail: (HY); (EM-S)
- Masamichi Ishizaka
- Department of Biosciences and Informatics, Faculty of Science and Technology, Keio University, Yokohama, Japan
- Naoya Hirai
- Advanced Research Centers, Keio University, Yokohama, Japan
- Kazuyo Masuoka
- Advanced Research Centers, Keio University, Yokohama, Japan
- Rintaro Saito
- Department of Environment and Information Studies, Keio University, Fujisawa, Japan
- Yosuke Ozawa
- Systems Biology Program, Graduate School of Media and Governance, Keio University, Fujisawa, Japan
- Katsuya Hino
- Advanced Research Centers, Keio University, Yokohama, Japan
- Department of Biosciences and Informatics, Faculty of Science and Technology, Keio University, Yokohama, Japan
- Takanori Washio
- Department of Biosciences and Informatics, Faculty of Science and Technology, Keio University, Yokohama, Japan
- Masaru Tomita
- Department of Environment and Information Studies, Keio University, Fujisawa, Japan
- Tatsuhiro Yamashita
- Department of Biosciences and Informatics, Faculty of Science and Technology, Keio University, Yokohama, Japan
- BioIT Business Development Unit, Fujitsu Limited, Chiba, Japan
- Tomohiro Oshikubo
- Department of Biosciences and Informatics, Faculty of Science and Technology, Keio University, Yokohama, Japan
- Production Solution Business Unit, Production Solution Division, Solutions and Services Department, Fujitsu Advanced Engineering Limited, Tokyo, Japan
- Hidetoshi Akasaka
- Department of Biosciences and Informatics, Faculty of Science and Technology, Keio University, Yokohama, Japan
- Production Solution Business Unit, Production Solution Division, Solutions and Services Department, Fujitsu Advanced Engineering Limited, Tokyo, Japan
- Jun Sugiyama
- Department of Biosciences and Informatics, Faculty of Science and Technology, Keio University, Yokohama, Japan
- Special Suite Team, Custom Primer Production Department, Haneda Laboratories, Invitrogen Japan K.K., Tokyo, Japan
- Yasuo Matsumoto
- Department of Biosciences and Informatics, Faculty of Science and Technology, Keio University, Yokohama, Japan
- Automation, QIAGEN K.K., Tokyo, Japan
- Hiroshi Yanagawa
- Advanced Research Centers, Keio University, Yokohama, Japan
- Department of Biosciences and Informatics, Faculty of Science and Technology, Keio University, Yokohama, Japan
- * E-mail: (HY); (EM-S)
|
27
|
Park SJ, Choi JS, Kim BC, Jho SW, Ryu JW, Park D, Lee KA, Bhak J, Kim SI. PutidaNET: interactome database service and network analysis of Pseudomonas putida KT2440. BMC Genomics 2009; 10 Suppl 3:S18. [PMID: 19958481 PMCID: PMC2788370 DOI: 10.1186/1471-2164-10-s3-s18] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.1] [Indexed: 11/13/2022]
Abstract
Background Pseudomonas putida KT2440 (P. putida KT2440) is a highly versatile saprophytic soil bacterium. It is a certified bio-safety host for transferring foreign genes. Therefore, the bacterium is used as a model organism for genetic and physiological studies and for the development of biotechnological applications. In order to provide a more systematic application of the organism, we have constructed a protein-protein interaction (PPI) network analysis system of P. putida KT2440. Results PutidaNET is a comprehensive interaction database and server for P. putida KT2440, generated from three protein-protein interaction (PPI) methods: PSIMAP (Protein Structural Interactome MAP), PEIMAP (Protein Experimental Interactome MAP), and domain-domain interactions using iPfam. PutidaNET contains 3,254 proteins and 82,019 possible interactions, consisting of 61,011 (PSIMAP), 4,293 (PEIMAP), and 30,043 (iPfam) interaction pairs, excluding self-interactions. We also performed a case study by integrating a protein interaction network and experimental 1-DE/MS-MS analysis data of P. putida. We found that 1) major functional modules are involved in various metabolic pathways and ribosomes, and 2) existing PPI sub-networks that are specific to succinate or benzoate metabolism are not in the center as predicted. Conclusion We introduce PutidaNET, which provides predicted interaction partners and functional analyses such as physicochemical properties, KEGG pathway assignment, and Gene Ontology mapping of P. putida KT2440. PutidaNET is freely available at http://sequenceome.kobic.kr/PutidaNET.
Affiliation(s)
- Seong-Jin Park
- Korean BioInformation Center (KOBIC), KRIBB, Daejeon 305-806, Korea.
|
28
|
Reja R, Venkatakrishnan AJ, Lee J, Kim BC, Ryu JW, Gong S, Bhak J, Park D. MitoInteractome: mitochondrial protein interactome database, and its application in 'aging network' analysis. BMC Genomics 2009; 10 Suppl 3:S20. [PMID: 19958484 PMCID: PMC2788373 DOI: 10.1186/1471-2164-10-s3-s20] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.9] [Indexed: 12/02/2022]
Abstract
Background Mitochondria play a vital role in the energy production and apoptotic process of eukaryotic cells. Proteins in the mitochondria are encoded by nuclear and mitochondrial genes. Owing to a large increase in the number of identified mitochondrial protein sequences and completed mitochondrial genomes, it has become necessary to provide a web-based database of mitochondrial protein information. Results We present 'MitoInteractome', a consolidated web-based portal containing a wealth of information on predicted protein-protein interactions, physico-chemical properties, polymorphism, and diseases related to the mitochondrial proteome. MitoInteractome contains 6,549 protein sequences which were extracted from the following databases: SwissProt, MitoP, MitoProteome, HPRD and Gene Ontology database. The first general mitochondrial interactome has been constructed based on the concept of 'homologous interaction' using PSIMAP (Protein Structural Interactome MAP) and PEIMAP (Protein Experimental Interactome MAP). Using the above mentioned methods, protein-protein interactions were predicted for 74 species. The mitochondrial protein interaction data of humans was used to construct a network for the aging process. Analysis of the 'aging network' gave us vital insights into the interactions among proteins that influence the aging process. Conclusion MitoInteractome is a comprehensive database that would (1) aid in increasing our understanding of the molecular functions and interaction networks of mitochondrial proteins, (2) help in identifying new target proteins for experimental research using predicted protein-protein interaction information, and (3) help in identifying biomarkers for diagnosis and new molecular targets for drug development related to mitochondria. MitoInteractome is available at http://mitointeractome.kobic.kr/.
Affiliation(s)
- Rohit Reja
- Korean Bioinformation Center, KRIBB, Daejeon, 305-806, Korea.
|
29
|
Cho IH, Lü ZR, Yu JR, Park YD, Yang JM, Hahn MJ, Zou F. Towards Profiling the Gene Expression of Tyrosinase-induced Melanogenesis in HEK293 Cells: a Functional DNA Chip Microarray and Interactomics Studies. J Biomol Struct Dyn 2009; 27:331-46. [DOI: 10.1080/07391102.2009.10507320] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.2] [Indexed: 10/28/2022]
|
30
|
Moritsugu K, Kurkal-Siebert V, Smith JC. REACH coarse-grained normal mode analysis of protein dimer interaction dynamics. Biophys J 2009; 97:1158-67. [PMID: 19686664 DOI: 10.1016/j.bpj.2009.05.015] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Received: 11/03/2008] [Revised: 05/02/2009] [Accepted: 05/05/2009] [Indexed: 01/03/2023]
Abstract
The REACH (realistic extension algorithm via covariance Hessian) coarse-grained biomolecular simulation method is a self-consistent multiscale approach directly mapping atomistic molecular dynamics simulation results onto a residue-scale model. Here, REACH is applied to calculate the dynamics of protein-protein interactions. The intra- and intermolecular fluctuations and the intermolecular vibrational densities of states derived from atomistic molecular dynamics are well reproduced by the REACH normal modes. The phonon dispersion relations derived from the REACH lattice dynamics model of crystalline ribonuclease A are also in satisfactory agreement with the corresponding all-atom results. The REACH model demonstrates that increasing dimer interaction strength decreases the translational and rotational intermolecular vibrational amplitudes, while their vibrational frequencies are relatively unaffected. A comparative study of functionally interacting biological dimers with crystal dimers, which are formed artificially via crystallization, reveals a relation between their static structures and the interprotein dynamics: i.e., the consequence of the extensive interfaces of biological dimers is reduction of the intermonomer translational and rotational amplitudes, but not the frequencies.
Affiliation(s)
- Kei Moritsugu
- Center for Molecular Biophysics, University of Tennessee/Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA.
|
31
|
Lü ZR, Kim WS, Cho IH, Park D, Bhak J, Shi L, Zhou HW, Lee DY, Park YD, Yang JM, Zou F. DNA microarray analyses and interactomic predictions for atopic dermatitis. J Dermatol Sci 2009; 55:123-5. [PMID: 19443183 DOI: 10.1016/j.jdermsci.2009.04.005] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Received: 12/18/2008] [Revised: 02/21/2009] [Accepted: 04/02/2009] [Indexed: 11/29/2022]
|
32
|
Karst JC, Foucher AE, Campbell TL, Di Guilmi AM, Stroebel D, Mangat CS, Brown ED, Jault JM. The ATPase activity of an 'essential' Bacillus subtilis enzyme, YdiB, is required for its cellular function and is modulated by oligomerization. Microbiology 2009; 155:944-956. [PMID: 19246765 DOI: 10.1099/mic.0.021543-0] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.3] [Indexed: 11/18/2022]
Abstract
Characterization of 'unknown' proteins is one of the challenges of the post-genomic era. Here, we report a study of Bacillus subtilis YdiB, which belongs to an uncharted class of bacterial P-loop ATPases. Precise deletion of the ydiB gene yielded a mutant with much reduced growth rate compared to the wild-type strain. In vitro, purified YdiB was in equilibrium among different forms, monomers, dimers and oligomers, and this equilibrium was strongly affected by salts; high concentrations of NaCl favoured the monomeric over the oligomeric form of the enzyme. Interestingly, the ATPase activity of the monomer was about three times higher than that of the oligomer, and the monomer showed a Km of about 60 µM for ATP and a Vmax of about 10 nmol min⁻¹ (mg protein)⁻¹ (kcat approximately 10 h⁻¹). This low ATPase activity was shown to be specific to YdiB because mutation of an invariant lysine residue in the P-loop motif (K41A) strongly attenuated this rate. This mutant was unable to restore a normal growth phenotype when introduced into a conditional knockout strain for ydiB, showing that the ATPase activity of YdiB is required for the in vivo function of the protein. Oligomerization was also observed with the purified YjeE from Escherichia coli, a YdiB orthologue, suggesting that this property is shared by all members of this family of ATPases. Importantly, dimers of YdiB were also observed in a B. subtilis extract, or when stabilized by formaldehyde cross-linking for YjeE from E. coli, suggesting that oligomerization might regulate the function of this new class of proteins in vivo.
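The kinetic constants quoted in this abstract can be turned into a rate curve. The sketch below assumes the YdiB monomer follows simple Michaelis-Menten kinetics, which the abstract implies by reporting a Km and Vmax but does not state outright; the function name and the chosen ATP concentrations are illustrative only.

```python
# Minimal sketch, assuming simple Michaelis-Menten kinetics for the monomer.
# Only KM_UM and VMAX come from the abstract; everything else is illustrative.

def michaelis_menten_rate(substrate_um: float, vmax: float, km_um: float) -> float:
    """Initial rate v = Vmax * [S] / (Km + [S]), with [S] and Km in micromolar."""
    if substrate_um < 0:
        raise ValueError("substrate concentration must be non-negative")
    return vmax * substrate_um / (km_um + substrate_um)


KM_UM = 60.0   # about 60 micromolar ATP (from the abstract)
VMAX = 10.0    # about 10 nmol per min per mg protein (from the abstract)

# Hypothetical ATP concentrations spanning 0.1x, 1x and 10x Km.
for atp_um in (6.0, 60.0, 600.0):
    rate = michaelis_menten_rate(atp_um, VMAX, KM_UM)
    print(f"[ATP] = {atp_um:6.1f} uM -> v = {rate:.2f} nmol/min/mg")
```

At an ATP concentration equal to Km the predicted rate is half of Vmax, which is the defining property of Km under this model.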
Affiliation(s)
- Johanna C Karst
- Institut de Biologie Structurale, UMR 5075 Université Joseph Fourier/CEA/CNRS, 41 rue Jules Horowitz, 38027 Grenoble cedex 1, France
- Anne-Emmanuelle Foucher
- Institut de Biologie Structurale, UMR 5075 Université Joseph Fourier/CEA/CNRS, 41 rue Jules Horowitz, 38027 Grenoble cedex 1, France
- Tracey L Campbell
- Antimicrobial Research Centre, Department of Biochemistry and Biomedical Sciences, McMaster University, 1200 Main Street West, Hamilton, ON L8N 3Z5, Canada
- Anne-Marie Di Guilmi
- Institut de Biologie Structurale, UMR 5075 Université Joseph Fourier/CEA/CNRS, 41 rue Jules Horowitz, 38027 Grenoble cedex 1, France
- David Stroebel
- Institut de Biologie Structurale, UMR 5075 Université Joseph Fourier/CEA/CNRS, 41 rue Jules Horowitz, 38027 Grenoble cedex 1, France
- Chand S Mangat
- Antimicrobial Research Centre, Department of Biochemistry and Biomedical Sciences, McMaster University, 1200 Main Street West, Hamilton, ON L8N 3Z5, Canada
- Eric D Brown
- Antimicrobial Research Centre, Department of Biochemistry and Biomedical Sciences, McMaster University, 1200 Main Street West, Hamilton, ON L8N 3Z5, Canada
- Jean-Michel Jault
- Institut de Biologie Structurale, UMR 5075 Université Joseph Fourier/CEA/CNRS, 41 rue Jules Horowitz, 38027 Grenoble cedex 1, France
|
33
|
Lee S, Brown A, Pitt WR, Higueruelo AP, Gong S, Bickerton GR, Schreyer A, Tanramluk D, Baylay A, Blundell TL. Structural interactomics: informatics approaches to aid the interpretation of genetic variation and the development of novel therapeutics. Mol Biosyst 2009; 5:1456-72. [DOI: 10.1039/b906402h] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Indexed: 01/08/2023]
|
34
|
Park K, Lee S, Ahn HS, Kim D. Predicting the multi-modal binding propensity of small molecules: towards an understanding of drug promiscuity. Mol Biosyst 2009; 5:844-53. [DOI: 10.1039/b901356c] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Indexed: 11/21/2022]
|
35
|
Yoon HK, Sohn KC, Lee JS, Kim YJ, Bhak J, Yang JM, You KH, Kim CD, Lee JH. Prediction and evaluation of protein–protein interaction in keratinocyte differentiation. Biochem Biophys Res Commun 2008; 377:662-667. [DOI: 10.1016/j.bbrc.2008.10.051] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Received: 09/26/2008] [Accepted: 10/10/2008] [Indexed: 11/29/2022]
|
36
|
Lü ZR, Park TH, Lee ES, Kim KJ, Park D, Kim BC, Cho SW, Bhak J, Park YD, Zou F, Yang JM. Dysregulated genes of extrinsic type of atopic dermatitis: 34K microarray and interactomic analyses. J Dermatol Sci 2008; 53:146-50. [PMID: 18824329 DOI: 10.1016/j.jdermsci.2008.08.004] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Received: 05/05/2008] [Revised: 08/06/2008] [Accepted: 08/07/2008] [Indexed: 10/21/2022]
|
37
|
Chen PY, Deane CM, Reinert G. Predicting and validating protein interactions using network structure. PLoS Comput Biol 2008; 4:e1000118. [PMID: 18654616 PMCID: PMC2435280 DOI: 10.1371/journal.pcbi.1000118] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.3] [Received: 10/23/2007] [Accepted: 06/09/2008] [Indexed: 11/18/2022]
Abstract
Protein interactions play a vital part in the function of a cell. As experimental techniques for detection and validation of protein interactions are time consuming, there is a need for computational methods for this task. Protein interactions appear to form a network with a relatively high degree of local clustering. In this paper we exploit this clustering by suggesting a score based on triplets of observed protein interactions. The score utilises both protein characteristics and network properties. Our score based on triplets is shown to complement existing techniques for predicting protein interactions, outperforming them on data sets which display a high degree of clustering. The predicted interactions score highly against test measures for accuracy. Compared to a similar score derived from pairwise interactions only, the triplet score displays higher sensitivity and specificity. By looking at specific examples, we show how an experimental set of interactions can be enriched and validated. As part of this work we also examine the effect of different prior databases upon the accuracy of prediction and find that the interactions from the same kingdom give better results than from across kingdoms, suggesting that there may be fundamental differences between the networks. These results all emphasize that network structure is important and helps in the accurate prediction of protein interactions. The protein interaction data set and the program used in our analysis, and a list of predictions and validations, are available at http://www.stats.ox.ac.uk/bioinfo/resources/PredictingInteractions.
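The idea of exploiting local clustering can be illustrated with a much simpler stand-in for the triplet score described in this abstract. The sketch below only counts shared interaction partners for each unobserved protein pair, so every candidate pair it scores would close at least one triangle. It is not the authors' score, which also uses protein characteristics; the protein names and the toy network are hypothetical.

```python
# Minimal sketch: rank unobserved protein pairs by their number of common
# neighbours. This is a simplified stand-in, not the published triplet score.
from itertools import combinations


def build_adjacency(edges):
    """Map each protein to the set of its observed interaction partners."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def common_neighbour_scores(edges):
    """Score each non-interacting pair by how many partners the two share."""
    adjacency = build_adjacency(edges)
    observed = {frozenset(edge) for edge in edges}
    scores = {}
    for a, b in combinations(sorted(adjacency), 2):
        if frozenset((a, b)) in observed:
            continue  # already a known interaction
        shared = adjacency[a] & adjacency[b]
        if shared:
            scores[(a, b)] = len(shared)
    return scores


# Hypothetical toy network.
toy_edges = [("P1", "P2"), ("P1", "P3"), ("P2", "P4"), ("P3", "P4"), ("P4", "P5")]
scores = common_neighbour_scores(toy_edges)
for pair, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(pair, score)
```

Pairs with more shared partners rank higher, which mirrors the intuition that densely clustered neighbourhoods make a missing edge more plausible.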
Affiliation(s)
- Pao-Yang Chen
- Department of Statistics, University of Oxford, Oxford, United Kingdom.
|
38
|
Mahdavi MA, Lin YH. Prediction of protein-protein interactions using protein signature profiling. Genomics Proteomics Bioinformatics 2008; 5:177-86. [PMID: 18267299 PMCID: PMC5963007 DOI: 10.1016/s1672-0229(08)60005-4] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Indexed: 12/02/2022]
Abstract
Protein domains are conserved and functionally independent structures that play an important role in interactions among related proteins. Domain-domain interactions have been recently used to predict protein-protein interactions (PPI). In general, the interaction probability of a pair of domains is scored using a trained scoring function. Satisfying a threshold, the protein pairs carrying those domains are regarded as “interacting”. In this study, the signature contents of proteins were utilized to predict PPI pairs in Saccharomyces cerevisiae, Caenorhabditis elegans, and Homo sapiens. Similarity between protein signature patterns was scored and PPI predictions were drawn based on the binary similarity scoring function. Results show that the true positive rate of prediction by the proposed approach is approximately 32% higher than that using the maximum likelihood estimation method when compared with a test set, resulting in 22% increase in the area under the receiver operating characteristic (ROC) curve. When proteins containing one or two signatures were removed, the sensitivity of the predicted PPI pairs increased significantly. The predicted PPI pairs are on average 11 times more likely to interact than the random selection at a confidence level of 0.95, and on average 4 times better than those predicted by either phylogenetic profiling or gene expression profiling.
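Scoring similarity between binary signature profiles can be sketched as follows. The abstract does not say which binary similarity function was used, so the Jaccard index here is an assumption, as are the 0.5 threshold, the protein names and the signature identifiers.

```python
# Minimal sketch: compare proteins by the overlap of their signature sets.
# The Jaccard index and the threshold are assumptions, not the paper's method.
from itertools import combinations


def jaccard(signatures_a, signatures_b) -> float:
    """Shared signatures divided by all signatures seen in either protein."""
    a, b = set(signatures_a), set(signatures_b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def predict_pairs(profiles, threshold):
    """Return protein pairs whose profile similarity meets the threshold."""
    predicted = []
    for p, q in combinations(sorted(profiles), 2):
        if jaccard(profiles[p], profiles[q]) >= threshold:
            predicted.append((p, q))
    return predicted


# Hypothetical signature profiles.
profiles = {
    "X": {"SIG_A", "SIG_B", "SIG_C"},
    "Y": {"SIG_B", "SIG_C", "SIG_D"},
    "Z": {"SIG_E"},
}
predicted = predict_pairs(profiles, threshold=0.5)
print(predicted)
```

Proteins carrying only one or two signatures give coarse similarity values under any such measure, which is consistent with the abstract's observation that removing them changed the sensitivity of the predictions.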
Affiliation(s)
- Mahmood A Mahdavi
- Department of Chemical Engineering, University of Saskatchewan, Saskatoon, SK S7N 5A9, Canada
|
39
|
Park D, Kim BC, Cho SW, Park SJ, Choi JS, Kim SI, Bhak J, Lee S. MassNet: a functional annotation service for protein mass spectrometry data. Nucleic Acids Res 2008; 36:W491-5. [PMID: 18448467 PMCID: PMC2447811 DOI: 10.1093/nar/gkn241] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Indexed: 01/23/2023]
Abstract
Although mass spectrometry has been frequently used to identify proteins, there are no web servers that provide comprehensive functional annotation of those identified proteins. It is necessary to provide such web service due to a rapid increase in the data. We, therefore, introduce MassNet, which provides (i) physico-chemical analysis information, (ii) KEGG pathway assignment (iii) Gene Ontology mapping and (iv) protein-protein interaction (PPI) prediction for the data from MASCOT, Prospector and Profound. MassNet provides the prediction information for PPIs using both 3D structural interaction and experimental interaction deposited in PSIMAP, BIND, DIP, HPRD, IntAct, MINT, CYGD and BioGrid. The web service is freely available at http://massnet.kr or http://sequenceome.kobic.re.kr/MassNet/.
Affiliation(s)
- Daeui Park
- Korean BioInformation Center, KRIBB, Daejeon 305-806, Korea
|
40
|
Kim WY, Kang S, Kim BC, Oh J, Cho S, Bhak J, Choi JS. SynechoNET: integrated protein-protein interaction database of a model cyanobacterium Synechocystis sp. PCC 6803. BMC Bioinformatics 2008; 9 Suppl 1:S20. [PMID: 18315852 PMCID: PMC2259421 DOI: 10.1186/1471-2105-9-s1-s20] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.3] [Indexed: 11/10/2022]
Abstract
Background: Cyanobacteria are model organisms for studying photosynthesis, carbon and nitrogen assimilation, evolution of plant plastids, and adaptability to environmental stresses. Despite many studies on cyanobacteria, there is no web-based database of their regulatory and signaling protein-protein interaction networks to date. Description: We report a database and website SynechoNET that provides predicted protein-protein interactions. SynechoNET shows cyanobacterial domain-domain interactions as well as their protein-level interactions using the model cyanobacterium, Synechocystis sp. PCC 6803. It predicts the protein-protein interactions using public interaction databases that contain mutually complementary and redundant data. Furthermore, SynechoNET provides information on transmembrane topology, signal peptide, and domain structure in order to support the analysis of regulatory membrane proteins. Such biological information can be queried and visualized in user-friendly web interfaces that include the interactive network viewer and search pages by keyword and functional category. Conclusion: SynechoNET is an integrated protein-protein interaction database designed to analyze regulatory membrane proteins in cyanobacteria. It provides a platform for biologists to extend the genomic data of cyanobacteria by predicting interaction partners, membrane association, and membrane topology of Synechocystis proteins. SynechoNET is freely available at http://synechocystis.org/ or directly at http://bioportal.kobic.kr/SynechoNET/.
Affiliation(s)
- Woo-Yeon Kim
- Korean BioInformation Center, KRIBB, Daejeon 305-806, Korea.
|
41
|
Park YD, Park D, Bhak J, Yang JM. Proteomic approaches to the analysis of atopic dermatitis and new insights from interactomics. Proteomics Clin Appl 2008; 2:290-300. [DOI: 10.1002/prca.200780063] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.8] [Indexed: 11/10/2022]
|
42
|
Predicting the interactome of Xanthomonas oryzae pathovar oryzae for target selection and DB service. BMC Bioinformatics 2008; 9:41. [PMID: 18215330 PMCID: PMC2246157 DOI: 10.1186/1471-2105-9-41] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.1] [Received: 07/18/2007] [Accepted: 01/24/2008] [Indexed: 12/29/2022]
Abstract
Background: Protein-protein interactions (PPIs) play key roles in various cellular functions. In addition, some critical inter-species interactions such as host-pathogen interactions and pathogenicity occur through PPIs. Phytopathogenic bacteria infect hosts through attachment to host tissue, enzyme secretion, exopolysaccharides production, toxins release, iron acquisition, and effector proteins secretion. Many such mechanisms involve some kind of protein-protein interaction in hosts. Our first aim was to predict the whole protein interaction pairs (interactome) of Xanthomonas oryzae pathovar oryzae (Xoo) that is an important pathogenic bacterium that causes bacterial blight (BB) in rice. We developed a detection protocol to find possibly interacting proteins in its host using whole genome PPI prediction algorithms. The second aim was to build a DB server and a bioinformatic procedure for finding target proteins in Xoo for developing pesticides that block host-pathogen protein interactions within critical biochemical pathways. Description: A PPI network in Xoo proteome was predicted by bioinformatics algorithms: PSIMAP, PEIMAP, and iPfam. We present the resultant species specific interaction network and host-pathogen interaction, XooNET. It is a comprehensive predicted initial PPI data for Xoo. XooNET can be used by experimentalists to pick up protein targets for blocking pathological interactions. XooNET uses most of the major types of PPI algorithms. They are: 1) Protein Structural Interactome MAP (PSIMAP), a method using structural domain of SCOP, 2) Protein Experimental Interactome MAP (PEIMAP), a common method using public resources of experimental protein interaction information such as HPRD, BIND, DIP, MINT, IntAct, and BioGrid, and 3) Domain-domain interactions, a method using Pfam domains such as iPfam. Additionally, XooNET provides information on network properties of the Xoo interactome. Conclusion: XooNET is an open and free public database server for protein interaction information for Xoo. It contains 4,538 proteins and 26,932 possible interactions consisting of 18,503 (PSIMAP), 3,118 (PEIMAP), and 8,938 (iPfam) pairs. In addition, XooNET provides 3,407 possible interaction pairs between two sets of proteins; 141 Xoo proteins that are predicted as membrane proteins and rice proteomes. The resultant interacting partners of a query protein can be easily retrieved by users as well as the interaction networks in graphical web interfaces. XooNET is freely available from .
|
43
|
Jefferson ER, Walsh TP, Barton GJ. A comparison of SCOP and CATH with respect to domain-domain interactions. Proteins 2008; 70:54-62. [PMID: 17634986 DOI: 10.1002/prot.21496] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 11/09/2022]
Abstract
The analysis and prediction of protein-protein interaction sites from structural data are restricted by the limited availability of structural complexes that represent the complete protein-protein interaction space. The domain classification schemes CATH and SCOP are normally used independently in the analysis and prediction of protein domain-domain interactions. In this article, the effect of different domain classification schemes on the number and type of domain-domain interactions observed in structural data is systematically evaluated for the SCOP and CATH hierarchies. Although there is a large overlap in domain assignments between SCOP and CATH, 23.6% of CATH interfaces had no SCOP equivalent and 37.3% of SCOP interfaces had no CATH equivalent in a nonredundant set. Therefore, combining both classifications gives an increase of between 23.6 and 37.3% in domain-domain interfaces. It is suggested that if possible, both domain classification schemes should be used together, but if only one is selected, SCOP provides better coverage than CATH. Employing both SCOP and CATH reduces the false negative rate of predictive methods, which employ homology matching to structural data to predict protein-protein interaction by an estimated 6.5%.
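The coverage comparison in this abstract comes down to asking what fraction of one scheme's interfaces has no counterpart in the other. The sketch below shows that arithmetic on toy data. It assumes interfaces from both schemes have already been mapped onto shared identifiers, which is a non-trivial step the abstract does not describe; the identifiers and set contents are hypothetical and do not reproduce the reported 23.6% and 37.3%.

```python
# Minimal sketch: fraction of interfaces in one classification that have no
# equivalent in the other. Assumes a pre-computed shared identifier space.

def fraction_without_equivalent(own_interfaces, other_interfaces) -> float:
    """Share of own_interfaces that do not appear in other_interfaces."""
    own = set(own_interfaces)
    if not own:
        return 0.0
    return len(own - set(other_interfaces)) / len(own)


# Hypothetical, already-mapped interface identifiers.
cath_interfaces = {"i1", "i2", "i3", "i4"}
scop_interfaces = {"i2", "i3", "i4", "i5", "i6"}

cath_only = fraction_without_equivalent(cath_interfaces, scop_interfaces)
scop_only = fraction_without_equivalent(scop_interfaces, cath_interfaces)
print(f"CATH interfaces with no SCOP equivalent: {cath_only:.1%}")
print(f"SCOP interfaces with no CATH equivalent: {scop_only:.1%}")
```

In this toy case the scheme with the larger unmatched fraction is the one contributing more unique interfaces, which is the sense in which the abstract says SCOP provides better coverage than CATH.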
Affiliation(s)
- Emily R Jefferson
- University of Dundee, School of Life Sciences, Dow Street, Dundee, DD1 5EH Scotland, United Kingdom
|
44
|
Costa S, Cesareni G. Domains mediate protein-protein interactions and nucleate protein assemblies. Handb Exp Pharmacol 2008:383-405. [PMID: 18491061 DOI: 10.1007/978-3-540-72843-6_16] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Indexed: 02/03/2023]
Abstract
Cell physiology is governed by an intricate mesh of physical and functional links among proteins, nucleic acids and other metabolites. The recent information flood coming from large-scale genomic and proteomic approaches allows us to foresee the possibility of compiling an exhaustive list of the molecules present within a cell, enriched with quantitative information on concentration and cellular localization. Moreover, several high-throughput experimental and computational techniques have been devised to map all the protein interactions occurring in a living cell. So far, such maps have been drawn as graphs where nodes represent proteins and edges represent interactions. However, this representation does not take into account the intrinsically modular nature of proteins and thus fails in providing an effective description of the determinants of binding. Since proteins are composed of domains that often confer on proteins their binding capabilities, a more informative description of the interaction network would detail, for each pair of interacting proteins in the network, which domains mediate the binding. Understanding how protein domains combine to mediate protein interactions would allow one to add important features to the protein interaction network, making it possible to discriminate between simultaneously occurring and mutually exclusive interactions. This objective can be achieved by experimentally characterizing domain recognition specificity or by analyzing the frequency of co-occurring domains in proteins that do interact. Such approaches allow gaining insights on the topology of complexes with unknown three-dimensional structure, thus opening the prospect of adopting a more rational strategy in developing drugs designed to selectively target specific protein interactions.
Affiliation(s)
- S Costa
- University of Rome Tor Vergata, Via della Ricerca Scientifica, Rome, Italy
|
45
|
Protein-protein interactions: analysis and prediction. Modern Genome Annotation 2008. [PMCID: PMC7120725 DOI: 10.1007/978-3-211-75123-7_17] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/08/2022]
Abstract
Proteins represent the tools and appliances of the cell — they assemble into larger structural elements, catalyze the biochemical reactions of metabolism, transmit signals, move cargo across membrane boundaries and carry out many other tasks. For most of these functions proteins cannot act in isolation but require close cooperation with other proteins to accomplish their task. Often, this collaborative action implies physical interaction of the proteins involved. Accordingly, experimental detection, in silico prediction and computational analysis of protein-protein interactions (PPI) have attracted great attention in the quest for discovering functional links among proteins and deciphering the complex networks of the cell.
|
46
|
|
47
|
Schuster-Böckler B, Bateman A. Reuse of structural domain-domain interactions in protein networks. BMC Bioinformatics 2007; 8:259. [PMID: 17640363 PMCID: PMC1940023 DOI: 10.1186/1471-2105-8-259] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.4] [Received: 02/23/2007] [Accepted: 07/18/2007] [Indexed: 11/10/2022]
Abstract
Background: Protein interactions are thought to be largely mediated by interactions between structural domains. Databases such as iPfam relate interactions in protein structures to known domain families. Here, we investigate how the domain interactions from the iPfam database are distributed in protein interactions taken from the HPRD, MPact, BioGRID, DIP and IntAct databases. Results: We find that known structural domain interactions can only explain a subset of 4-19% of the available protein interactions, nevertheless this fraction is still significantly bigger than expected by chance. There is a correlation between the frequency of a domain interaction and the connectivity of the proteins it occurs in. Furthermore, a large proportion of protein interactions can be attributed to a small number of domain interactions. We conclude that many, but not all, domain interactions constitute reusable modules of molecular recognition. A substantial proportion of domain interactions are conserved between E. coli, S. cerevisiae and H. sapiens. These domains are related to essential cellular functions, suggesting that many domain interactions were already present in the last universal common ancestor. Conclusion: Our results support the concept of domain interactions as reusable, conserved building blocks of protein interactions, but also highlight the limitations currently imposed by the small number of available protein structures.
Affiliation(s)
- Alex Bateman
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, UK
|
48
|
Probabilistic prediction and ranking of human protein-protein interactions. BMC Bioinformatics 2007; 8:239. [PMID: 17615067 PMCID: PMC1939716 DOI: 10.1186/1471-2105-8-239] [Citation(s) in RCA: 83] [Impact Index Per Article: 4.9] [Received: 05/10/2007] [Accepted: 07/05/2007] [Indexed: 11/24/2022]
Abstract
Background: Although the prediction of protein-protein interactions has been extensively investigated for yeast, few such datasets exist for the far larger proteome in human. Furthermore, it has recently been estimated that the overall average false positive rate of available computational and high-throughput experimental interaction datasets is as high as 90%. Results: The prediction of human protein-protein interactions was investigated by combining orthogonal protein features within a probabilistic framework. The features include co-expression, orthology to known interacting proteins and the full-Bayesian combination of subcellular localization, co-occurrence of domains and post-translational modifications. A novel scoring function for local network topology was also investigated. This topology feature greatly enhanced the predictions and together with the full-Bayes combined features, made the largest contribution to the predictions. Using a conservative threshold, our most accurate predictor identifies 37606 human interactions, 32892 (80%) of which are not present in other publicly available large human interaction datasets, thus substantially increasing the coverage of the human interaction map. A subset of the 32892 novel predicted interactions have been independently validated. Comparison of the prediction dataset to other available human interaction datasets estimates the false positive rate of the new method to be below 80% which is competitive with other methods. Since the new method scores and ranks all human protein pairs, smaller subsets of higher quality can be generated thus leading to even lower false positive prediction rates. Conclusion: The set of interactions predicted in this work increases the coverage of the human interaction map and will help determine the highest confidence human interactions.
|
49
|
A domain-based approach to predict protein-protein interactions. BMC Bioinformatics 2007; 8:199. [PMID: 17567909 PMCID: PMC1919395 DOI: 10.1186/1471-2105-8-199] [Citation(s) in RCA: 46] [Impact Index Per Article: 2.7] [Received: 10/24/2006] [Accepted: 06/13/2007] [Indexed: 12/16/2022]
Abstract
Background: Knowing which proteins exist in a certain organism or cell type and how these proteins interact with each other are necessary for the understanding of biological processes at the whole cell level. The determination of the protein-protein interaction (PPI) networks has been the subject of extensive research. Despite the development of reasonably successful methods, serious technical difficulties still exist. In this paper we present DomainGA, a quantitative computational approach that uses the information about the domain-domain interactions to predict the interactions between proteins. Results: DomainGA is a multi-parameter optimization method in which the available PPI information is used to derive a quantitative scoring scheme for the domain-domain pairs. Obtained domain interaction scores are then used to predict whether a pair of proteins interacts. Using the yeast PPI data and a series of tests, we show the robustness and insensitivity of the DomainGA method to the selection of the parameter sets, score ranges, and detection rules. Our DomainGA method achieves very high explanation ratios for the positive and negative PPIs in yeast. Based on our cross-verification tests on human PPIs, comparison of the optimized scores with the structurally observed domain interactions obtained from the iPFAM database, and sensitivity and specificity analysis; we conclude that our DomainGA method shows great promise to be applicable across multiple organisms. Conclusion: We envision the DomainGA as a first step of a multiple tier approach to constructing organism specific PPIs. As it is based on fundamental structural information, the DomainGA approach can be used to create potential PPIs and the accuracy of the constructed interaction template can be further improved using complementary methods. Explanation ratios obtained in the reported test case studies clearly show that the false prediction rates of the template networks constructed using the DomainGA scores are reasonably low, and the erroneous predictions can be filtered further using supplementary approaches such as those based on literature search or other prediction methods.
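The second stage described in this abstract, using domain-pair scores to decide whether two proteins interact, can be sketched as below. The optimisation that derives the scores is not shown, and the detection rule used here (take the best-scoring domain pair and compare it with a threshold) is an assumption rather than the rule DomainGA applies. All domain names, scores and the threshold are hypothetical.

```python
# Minimal sketch: predict a protein-protein interaction from pre-computed
# domain-pair scores. The max-score rule and threshold are assumptions.
from itertools import product


def protein_pair_score(domains_a, domains_b, pair_scores):
    """Best score over all cross-protein domain pairs, or None if unscored."""
    best = None
    for domain_a, domain_b in product(domains_a, domains_b):
        score = pair_scores.get(frozenset((domain_a, domain_b)))
        if score is not None and (best is None or score > best):
            best = score
    return best


def predict_interaction(domains_a, domains_b, pair_scores, threshold) -> bool:
    """Call the pair interacting when its best domain-pair score is high enough."""
    score = protein_pair_score(domains_a, domains_b, pair_scores)
    return score is not None and score >= threshold


# Hypothetical domain-pair scores and protein domain compositions.
pair_scores = {
    frozenset(("D1", "D2")): 0.9,
    frozenset(("D2", "D3")): 0.2,
}
proteins = {"A": ["D1"], "B": ["D2", "D3"], "C": ["D3"]}

for p, q in (("A", "B"), ("B", "C"), ("A", "C")):
    call = predict_interaction(proteins[p], proteins[q], pair_scores, threshold=0.5)
    print(p, q, call)
```

Keying the score table on an unordered pair means the result does not depend on which protein is listed first, and a pair of identical domains is represented naturally as a one-element set.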
|
50
|
Comprehensive analysis of co-occurring domain sets in yeast proteins. BMC Genomics 2007; 8:161. [PMID: 17562021 PMCID: PMC1919370 DOI: 10.1186/1471-2164-8-161] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.9] [Received: 12/24/2006] [Accepted: 06/11/2007] [Indexed: 11/23/2022]
Abstract
Background: Protein domains are fundamental evolutionary units of protein architecture, composing proteins in a modular manner. Combinations of two or more, possibly non-adjacent, domains are thought to play specific functional roles within proteins. Indeed, while the number of potential co-occurring domain sets (CDSs) is very large, only a few of these occur in nature. Here we study the principles governing domain content of proteins, using yeast as a model species. Results: We design a novel representation of proteins and their constituent domains as a protein-domain network. An analysis of this network reveals 99 CDSs that occur in proteins more than expected by chance. The identified CDSs are shown to preferentially include ancient domains that are conserved from bacteria or archaea. Moreover, the protein sets spanned by these combinations were found to be highly functionally coherent, significantly match known protein complexes, and enriched with protein-protein interactions. These observations serve to validate the biological significance of the identified CDSs. Conclusion: Our work provides a comprehensive list of co-occurring domain sets in yeast, and sheds light on their function and evolution.
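The first step towards finding co-occurring domain sets is counting how often domains appear together in the same protein. The sketch below does this for pairs only, without regard to adjacency. It does not perform the significance test against chance expectation that the abstract describes, and it is not the authors' protein-domain network analysis; the protein and domain names are hypothetical.

```python
# Minimal sketch: count how many proteins contain each unordered domain pair.
# No statistical test is applied; this is only the raw co-occurrence tally.
from collections import Counter
from itertools import combinations


def count_domain_pairs(protein_domains):
    """Tally domain pairs per protein, ignoring order and repeated domains."""
    counts = Counter()
    for domains in protein_domains.values():
        for pair in combinations(sorted(set(domains)), 2):
            counts[pair] += 1
    return counts


# Hypothetical domain annotations.
protein_domains = {
    "prot1": ["DomA", "DomB", "DomC"],
    "prot2": ["DomA", "DomB"],
    "prot3": ["DomB", "DomC", "DomB"],  # repeated domain counted once
    "prot4": ["DomA"],
}
pair_counts = count_domain_pairs(protein_domains)
for pair, count in pair_counts.most_common():
    print(pair, count)
```

A full analysis would compare each count with the number expected if domains were distributed independently, and would extend from pairs to larger sets.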
|