1
Mostaffa NH, Suhaimi AH, Al-Idrus A. Interactomics in plant defence: progress and opportunities. Mol Biol Rep 2023; 50:4605-4618. [PMID: 36920596 DOI: 10.1007/s11033-023-08345-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/28/2022] [Accepted: 02/15/2023] [Indexed: 03/16/2023]
Abstract
Interactomics is a branch of systems biology that deals with the study of protein-protein interactions and how these interactions influence phenotypes. Identifying the interactomes involved during host-pathogen interaction events may bring us a step closer to deciphering the molecular mechanisms underlying plant defence. Here, we conducted a systematic review of plant interactomics studies over the last two decades and found that while substantial progress has been made in the field, plant-pathogen interactomics remains a less-travelled route. To facilitate progress in this field, we provide a comprehensive research pipeline for an in planta plant-pathogen interactomics study that runs from the in silico prediction step to the validation step and is not confined to model plants. We also highlight four challenges in plant-pathogen interactomics, with plausible solution(s) for each.
Affiliation(s)
- Nur Hikmah Mostaffa
  - Programme of Genetics, Institute of Biological Sciences, Faculty of Science, Universiti Malaya, 50603, Kuala Lumpur, Malaysia
- Ahmad Husaini Suhaimi
  - Programme of Genetics, Institute of Biological Sciences, Faculty of Science, Universiti Malaya, 50603, Kuala Lumpur, Malaysia
- Aisyafaznim Al-Idrus
  - Programme of Genetics, Institute of Biological Sciences, Faculty of Science, Universiti Malaya, 50603, Kuala Lumpur, Malaysia
2
Yuen HY, Jansson J. Normalized L3-based link prediction in protein-protein interaction networks. BMC Bioinformatics 2023; 24:59. [PMID: 36814208 PMCID: PMC9945744 DOI: 10.1186/s12859-023-05178-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/26/2021] [Accepted: 02/08/2023] [Indexed: 02/24/2023] Open
Abstract
BACKGROUND: Protein-protein interaction (PPI) data is an important type of data used in functional genomics. However, high-throughput experiments are often insufficient to complete the PPI interactome of different organisms. Computational techniques are thus used to infer missing data, with link prediction being one such approach that uses the structure of the network of PPIs known so far to identify non-edges whose addition to the network would make it more sound, according to some underlying assumptions. Recently, a new idea called the L3 principle introduced biological motivation into PPI link predictions, yielding predictors that are superior to general-purpose link predictors for complex networks. Interestingly, the L3 principle can be interpreted in another way, so that other signatures of PPI networks can also be characterized for PPI predictions. This alternative interpretation uncovers candidate PPIs that the current L3-based link predictors may not be able to fully capture, underutilizing the L3 principle.
RESULTS: In this article, we propose a formulation of link predictors that we call NormalizedL3 (L3N), which addresses certain missing elements within L3 predictors from the perspective of network modeling. Our computational validations show that the L3N predictors are able to find missing PPIs more accurately (in terms of true positives among the predicted PPIs) than the previously proposed methods on several datasets from the literature, including BioGRID, STRING, MINT, and HuRI, at the cost of using more computation time in some of the cases. In addition, we found that L3-based link predictors (including L3N) ranked a different pool of PPIs higher than the general-purpose link predictors did. This suggests that different types of PPIs can be predicted based on different topological assumptions, and that even better PPI link predictors may be obtained in the future by improved network modeling.
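As an illustration of the L3 principle mentioned in this abstract, the following is a minimal sketch of the commonly cited degree-normalised L3 score, in which each simple path x-u-v-y between two unconnected proteins contributes 1/sqrt(k_u * k_v). It is not the L3N formulation proposed in the article, whose formulas are not given in this listing. The function names `build_adjacency` and `l3_scores` and the toy protein labels are assumptions made for the example.

```python
from collections import defaultdict
from math import sqrt


def build_adjacency(edges):
    """Return an undirected adjacency map {node: set(neighbours)}."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # self-interactions are ignored in this sketch
            adj[u].add(v)
            adj[v].add(u)
    return dict(adj)


def l3_scores(edges):
    """Score non-adjacent pairs by degree-normalised simple paths of length 3.

    Assumption: score(x, y) = sum over paths x-u-v-y of 1 / sqrt(k_u * k_v),
    where k is the node degree. Node labels must be mutually comparable.
    """
    adj = build_adjacency(edges)
    scores = defaultdict(float)
    for x in adj:
        for u in adj[x]:
            for v in adj[u]:
                if v == x:
                    continue
                for y in adj[v]:
                    # skip non-simple paths and pairs that already interact
                    if y == x or y == u or y in adj[x]:
                        continue
                    scores[(x, y)] += 1.0 / sqrt(len(adj[u]) * len(adj[v]))
    # each pair is found in both orientations with the same score; keep one
    return {pair: s for pair, s in scores.items() if pair[0] < pair[1]}


if __name__ == "__main__":
    toy_edges = [("a", "u"), ("u", "v"), ("v", "b")]  # hypothetical proteins
    for pair, score in sorted(l3_scores(toy_edges).items()):
        print(pair, score)
```

Ranking the returned pairs by score gives candidate missing interactions; a real evaluation would hide a fraction of known edges and check how many are recovered near the top of the ranking.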
Affiliation(s)
- Ho Yin Yuen
  - Department of Biomedical Engineering, The Hong Kong Polytechnic University, Hong Kong, China
- Jesper Jansson
  - Graduate School of Informatics, Kyoto University, Kyoto, 606-8501, Japan
3
Xu C, Wang B, Yang L, Hu LZ, Yi L, Wang Y, Chen S, Emili A, Wan C. Global Landscape of Native Protein Complexes in Synechocystis sp. PCC 6803. Genomics, Proteomics & Bioinformatics 2022; 20:715-727. [PMID: 33636367 PMCID: PMC9880817 DOI: 10.1016/j.gpb.2020.06.020] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 06/08/2019] [Revised: 04/04/2020] [Accepted: 06/12/2020] [Indexed: 01/31/2023]
Abstract
Synechocystis sp. PCC 6803 (hereafter: Synechocystis) is a model organism for studying photosynthesis, energy metabolism, and environmental stress. Although known as the first fully sequenced phototrophic organism, Synechocystis still has almost half of its proteome without functional annotations. In this study, by using co-fractionation coupled with liquid chromatography-tandem mass spectrometry (LC-MS/MS), we define 291 multi-protein complexes, encompassing 24,092 protein-protein interactions (PPIs) among 2062 distinct gene products. This information not only reveals the roles of photosynthesis in metabolism, cell motility, DNA repair, cell division, and other physiological processes, but also shows how protein functions vary from bacteria to higher plants due to changes in interaction partners. It also allows us to uncover the functions of hypothetical proteins, such as Sll0445, Sll0446, and Sll0447 involved in photosynthesis and cell motility, and Sll1334 involved in regulation of fatty acid biogenesis. Here we present the most extensive PPI data for Synechocystis so far, which provide critical insights into fundamental molecular mechanisms in cyanobacteria.
Affiliation(s)
- Chen Xu
  - School of Life Sciences and Hubei Key Laboratory of Genetic Regulation and Integrative Biology, Central China Normal University, Wuhan 430079, China
- Bing Wang
  - School of Life Sciences and Hubei Key Laboratory of Genetic Regulation and Integrative Biology, Central China Normal University, Wuhan 430079, China
- Lin Yang
  - School of Life Sciences and Hubei Key Laboratory of Genetic Regulation and Integrative Biology, Central China Normal University, Wuhan 430079, China
- Lucas Zhongming Hu
  - Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, ON M5S 2E8, Canada
- Lanxing Yi
  - School of Life Sciences and Hubei Key Laboratory of Genetic Regulation and Integrative Biology, Central China Normal University, Wuhan 430079, China
- Yaxuan Wang
  - School of Life Sciences and Hubei Key Laboratory of Genetic Regulation and Integrative Biology, Central China Normal University, Wuhan 430079, China
- Shenglan Chen
  - School of Life Sciences and Hubei Key Laboratory of Genetic Regulation and Integrative Biology, Central China Normal University, Wuhan 430079, China
- Andrew Emili
  - Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, ON M5S 2E8, Canada
  - Departments of Biochemistry and Biology, Boston University, Boston, MA 02215, USA
- Cuihong Wan (corresponding author)
  - School of Life Sciences and Hubei Key Laboratory of Genetic Regulation and Integrative Biology, Central China Normal University, Wuhan 430079, China
4
Abstract
Since the large-scale experimental characterization of protein–protein interactions (PPIs) is not possible for all species, several computational PPI prediction methods have been developed that harness existing data from other species. While PPI network prediction has been extensively used in eukaryotes, microbial network inference has lagged behind. However, bacterial interactomes can be built using the same principles and techniques; in fact, several methods are better suited to bacterial genomes. These predicted networks allow systems-level analyses in species that lack experimental interaction data. This review describes the current network inference and analysis techniques and summarizes the use of computationally-predicted microbial interactomes to date.
5
OUP accepted manuscript. Brief Funct Genomics 2022; 21:243-269. [DOI: 10.1093/bfgp/elac007] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 11/02/2021] [Revised: 03/17/2022] [Accepted: 03/18/2022] [Indexed: 11/14/2022] Open
6
Zaborowski AB, Walther D. Determinants of correlated expression of transcription factors and their target genes. Nucleic Acids Res 2020; 48:11347-11369. [PMID: 33104784 PMCID: PMC7672440 DOI: 10.1093/nar/gkaa927] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 04/07/2020] [Revised: 10/01/2020] [Accepted: 10/06/2020] [Indexed: 11/14/2022] Open
Abstract
While transcription factors (TFs) are known to regulate the expression of their target genes (TGs), only a weak correlation of expression between TFs and their TGs has generally been observed. As lack of correlation could be caused by additional layers of regulation, the overall correlation distribution may hide the presence of a subset of regulatory TF-TG pairs with tight expression coupling. Using reported regulatory pairs in the plant Arabidopsis thaliana along with comprehensive gene expression information and testing a wide array of molecular features, we aimed to discern the molecular determinants of high expression correlation of TFs and their TGs. TF-family assignment, stress-response process involvement, short genomic distances of the TF-binding sites to the transcription start site of their TGs, few required protein-protein-interaction connections to establish physical interactions between the TF and polymerase-II, unambiguous TF-binding motifs, increased numbers of miRNA target-sites in TF-mRNAs, and a young evolutionary age of TGs were found particularly indicative of high TF-TG correlation. The modulating roles of post-transcriptional, post-translational processes, and epigenetic factors have been characterized as well. Our study reveals that regulatory pairs with high expression coupling are associated with specific molecular determinants.
Affiliation(s)
- Adam B Zaborowski
  - Max Planck Institute of Molecular Plant Physiology, Am Mühlenberg 1, 14476 Potsdam-Golm, Germany
- Dirk Walther
  - Max Planck Institute of Molecular Plant Physiology, Am Mühlenberg 1, 14476 Potsdam-Golm, Germany
7
Kammerscheit X, Chauvat F, Cassier-Chauvat C. From Cyanobacteria to Human, MAPEG-Type Glutathione-S-Transferases Operate in Cell Tolerance to Heat, Cold, and Lipid Peroxidation. Front Microbiol 2019; 10:2248. [PMID: 31681188 PMCID: PMC6798054 DOI: 10.3389/fmicb.2019.02248] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 08/13/2019] [Accepted: 09/13/2019] [Indexed: 11/18/2022] Open
Abstract
The MAPEG2 sub-family of glutathione-S-transferase proteins (GST) has been poorly investigated in vivo, even in prokaryotes such as cyanobacteria, the organisms that are regarded as having developed glutathione-dependent enzymes to protect themselves against the reactive oxygen species (ROS) often produced by their powerful photosynthesis. We report the first in vivo analysis of a cyanobacterial MAPEG2-like protein (Sll1147) in the model cyanobacterium Synechocystis PCC 6803. While Sll1147 is dispensable to cell growth in standard photo-autotrophic conditions, it plays an important role in the resistance to heat and cold, and to n-tertbutyl hydroperoxide (n-tBOOH), which induces lipid peroxidation. These findings suggest that Sll1147 could be involved in membrane fluidity, which is critical for photosynthesis. Attesting to its sensitivity to these stresses, the Δsll1147 mutant lacking Sll1147, when challenged by heat, cold, or n-tBOOH, undergoes transient accumulation of peroxidized lipids and then of reduced and oxidized glutathione. These results are welcome because little is known concerning the signaling and/or protection mechanisms used by cyanobacteria to cope with heat and cold, two inevitable environmental stresses that limit their growth, and thus their production of biomass for our food chain and of biotechnologically interesting chemicals. Also interestingly, the decreased resistance to heat, cold and n-tBOOH of the Δsll1147 mutant could be rescued back to normal (wild-type) levels upon the expression of synthetic MAPEG2-encoding human genes adapted to the cyanobacterial codon usage. These synthetic hmGST2 and hmGST3 genes were also able to increase the tolerance of Escherichia coli to heat and n-tBOOH. Collectively, these findings indicate that the activity of the MAPEG2 proteins has been conserved, at least in part, during evolution from (cyano)bacteria to human.
Affiliation(s)
- Franck Chauvat
  - Institute for Integrative Biology of the Cell (I2BC), CEA, CNRS, University of Paris-Sud, Université Paris-Saclay, Gif-sur-Yvette, France
- Corinne Cassier-Chauvat
  - Institute for Integrative Biology of the Cell (I2BC), CEA, CNRS, University of Paris-Sud, Université Paris-Saclay, Gif-sur-Yvette, France
8
Nakajima N, Hayashida M, Jansson J, Maruyama O, Akutsu T. Determining the minimum number of protein-protein interactions required to support known protein complexes. PLoS One 2018; 13:e0195545. [PMID: 29698482 PMCID: PMC5919440 DOI: 10.1371/journal.pone.0195545] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 09/11/2017] [Accepted: 03/23/2018] [Indexed: 11/18/2022] Open
Abstract
The prediction of protein complexes from protein-protein interactions (PPIs) is a well-studied problem in bioinformatics. However, the currently available PPI data is not enough to describe all known protein complexes. In this paper, we express the problem of determining the minimum number of (additional) required protein-protein interactions as a graph theoretic problem under the constraint that each complex constitutes a connected component in a PPI network. For this problem, we develop two computational methods: one is based on integer linear programming (ILPMinPPI) and the other one is based on an existing greedy-type approximation algorithm (GreedyMinPPI) originally developed in the context of communication and social networks. Since the former method is only applicable to datasets of small size, we apply the latter method to a combination of the CYC2008 protein complex dataset and each of eight PPI datasets (STRING, MINT, BioGRID, IntAct, DIP, BIND, WI-PHI, iRefIndex). The results show that the minimum number of additional required PPIs ranges from 51 (STRING) to 964 (BIND), and that even the four best PPI databases, STRING (51), BioGRID (67), WI-PHI (93) and iRefIndex (85), do not include enough PPIs to form all CYC2008 protein complexes. We also demonstrate that the proposed problem framework and our solutions can enhance the prediction accuracy of existing PPI prediction methods. ILPMinPPI can be freely downloaded from http://sunflower.kuicr.kyoto-u.ac.jp/~nakajima/.
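The connectivity constraint described in this abstract can be illustrated with a small sketch. It does not reproduce ILPMinPPI or GreedyMinPPI. Instead, it assumes the constraint means that the PPI subgraph induced by each complex must be connected, and it computes simple bounds on the number of additional PPIs: a complex whose members fall into c connected pieces needs at least c - 1 new edges, so the largest such value is a lower bound and the sum over complexes is an upper bound (edges shared between overlapping complexes can lower the true minimum). The function names and toy data are hypothetical.

```python
def components_within(members, edges):
    """Count connected components of the PPI subgraph induced by `members`."""
    parent = {m: m for m in members}

    def find(x):
        # union-find lookup with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in edges:
        if u in parent and v in parent:  # only edges inside the complex count
            parent[find(u)] = find(v)
    return len({find(m) for m in members})


def missing_edge_bounds(complexes, edges):
    """Return (lower, upper) bounds on the number of additional PPIs needed."""
    per_complex = [
        components_within(set(c), edges) - 1 for c in complexes if c
    ]
    lower = max(per_complex, default=0)
    upper = sum(per_complex)
    return lower, upper


if __name__ == "__main__":
    toy_complexes = [["A", "B", "C"], ["C", "D"]]  # hypothetical complexes
    toy_ppis = [("A", "B")]                        # hypothetical known PPIs
    print(missing_edge_bounds(toy_complexes, toy_ppis))
```

Closing the gap between the two bounds is where an exact or approximation method, such as the ILP and greedy approaches named in the abstract, becomes necessary.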
Affiliation(s)
- Natsu Nakajima
  - Institute of Molecular and Cellular Biosciences, The University of Tokyo, 1-1-1, Yayoi, Bunkyo-ku, Tokyo 113-0032, Japan
  - * E-mail: (NN); (TA)
- Morihiro Hayashida
  - Department of Electrical Engineering and Computer Science, National Institute of Technology, Matsue College, 14-4, Nishiikumacho, Matsue, Shimane 690-8518, Japan
- Jesper Jansson
  - Department of Computing, The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong
- Osamu Maruyama
  - Institute of Mathematics for Industry, Kyushu University, 744 Motooka, Nishi-ku, Fukuoka 819-0395, Japan
- Tatsuya Akutsu
  - Bioinformatics Center, Institute for Chemical Research, Kyoto University, Gokasho, Uji, Kyoto 611-0011, Japan
  - * E-mail: (NN); (TA)
9
Abstract
The knowledge of protein-protein interactions (PPIs) and PPI networks (PPINs) is the key to starting to understand the biological processes inside the cell. Many computational tools have been designed to help explore PPIs and PPINs, such as those for interaction detection, reliability assessment and interaction network construction. Here, the application of computational tools is reviewed from three perspectives: PPI database construction, PPI prediction, and interaction network construction and analysis. This overview will provide researchers guidance on choosing appropriate methods for exploring PPIs.
Affiliation(s)
- Shaowei Dong
  - Department of Cell and System Biology, Centre for the Analysis of Genome Evolution and Function, University of Toronto, Toronto, ON, Canada
- Nicholas J Provart
  - Department of Cell and System Biology, Centre for the Analysis of Genome Evolution and Function, University of Toronto, Toronto, ON, Canada
10
Neoteric advancement in TB drugs and an overview on the anti-tubercular role of peptides through computational approaches. Microb Pathog 2017; 114:80-89. [PMID: 29174699 DOI: 10.1016/j.micpath.2017.11.034] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Received: 11/08/2017] [Revised: 11/21/2017] [Accepted: 11/22/2017] [Indexed: 11/21/2022]
Abstract
Tuberculosis (TB) is a devastating threat to human health, and how to treat it without the emergence of drug-resistant Mycobacterium tuberculosis (M. tuberculosis) is the million-dollar question at present. The pathogenesis of M. tuberculosis, which employs unique defence strategies when infecting macrophages, has been extensively studied. Several anti-tubercular drugs with varied modes of action and administration, from diversified sources, have been used for the treatment of TB, and these later contributed to the emergence of multidrug-resistant tuberculosis (MDR-TB) and extensively drug-resistant tuberculosis (XDR-TB). However, a few potent anti-tubercular drugs are scheduled for clinical trials in 2017-2018. Peptides of varied origins, such as human immune and non-immune cells, bacteria, fungi, and venoms, have been widely investigated as anti-tubercular agents that could replace existing anti-tubercular drugs in future. In the present review, we spotlight not only the mechanisms of action and modes of administration of currently available anti-tubercular drugs but also the recent comprehensive report of the World Health Organization (WHO) on the TB epidemic, diagnosis, prevention, and treatment. A major part of the review also examines the direct contribution of different computational tools to drug design strategies against M. tuberculosis, in order to grasp the interplay between anti-tubercular peptides and targeted bacterial proteins. The potential of some of these anti-tubercular peptides as therapeutic agents opens a new route towards achieving the goal of the End TB Strategy.
11
Meysman P, Titeca K, Eyckerman S, Tavernier J, Goethals B, Martens L, Valkenborg D, Laukens K. Protein complex analysis: From raw protein lists to protein interaction networks. Mass Spectrometry Reviews 2017; 36:600-614. [PMID: 26709718 DOI: 10.1002/mas.21485] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 04/04/2015] [Accepted: 11/17/2015] [Indexed: 06/05/2023]
Abstract
The elucidation of molecular interaction networks is one of the pivotal challenges in the study of biology. Affinity purification-mass spectrometry and other co-complex methods have become widely employed experimental techniques to identify protein complexes. These techniques typically suffer from a high number of false negatives and false positive contaminants due to technical shortcomings and purification biases. To support a diverse range of experimental designs and approaches, a large number of computational methods have been proposed to filter, infer and validate protein interaction networks from experimental pull-down MS data. Nevertheless, this expansion of available methods complicates the selection of the most optimal ones to support systems biology-driven knowledge extraction. In this review, we give an overview of the most commonly used computational methods to process and interpret co-complex results, and we discuss the issues and unsolved problems that still exist within the field. © 2015 Wiley Periodicals, Inc. Mass Spec Rev 36:600-614, 2017.
Affiliation(s)
- Pieter Meysman
  - Advanced Database Research and Modelling (ADReM), Department of Mathematics and Computer Science, University of Antwerp, Antwerp, Belgium
  - Biomedical Informatics Research Center Antwerp (biomina), University of Antwerp/Antwerp University Hospital, Edegem, Belgium
- Kevin Titeca
  - Department of Medical Protein Research, VIB, B-9000 Ghent, Belgium
  - Department of Biochemistry, Ghent University, B-9000 Ghent, Belgium
- Sven Eyckerman
  - Department of Medical Protein Research, VIB, B-9000 Ghent, Belgium
  - Department of Biochemistry, Ghent University, B-9000 Ghent, Belgium
- Jan Tavernier
  - Department of Medical Protein Research, VIB, B-9000 Ghent, Belgium
  - Department of Biochemistry, Ghent University, B-9000 Ghent, Belgium
- Bart Goethals
  - Advanced Database Research and Modelling (ADReM), Department of Mathematics and Computer Science, University of Antwerp, Antwerp, Belgium
- Lennart Martens
  - Department of Medical Protein Research, VIB, B-9000 Ghent, Belgium
  - Department of Biochemistry, Ghent University, B-9000 Ghent, Belgium
- Dirk Valkenborg
  - Flemish Institute for Technological Research (VITO), Mol, Belgium
  - IBioStat, Hasselt University, Hasselt, Belgium
  - CFP-CeProMa, University of Antwerp, Antwerp, Belgium
- Kris Laukens
  - Advanced Database Research and Modelling (ADReM), Department of Mathematics and Computer Science, University of Antwerp, Antwerp, Belgium
  - Biomedical Informatics Research Center Antwerp (biomina), University of Antwerp/Antwerp University Hospital, Edegem, Belgium
12
Mahajan G, Mande SC. Using structural knowledge in the protein data bank to inform the search for potential host-microbe protein interactions in sequence space: application to Mycobacterium tuberculosis. BMC Bioinformatics 2017; 18:201. [PMID: 28376709 PMCID: PMC5379762 DOI: 10.1186/s12859-017-1550-y] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 08/17/2016] [Accepted: 02/16/2017] [Indexed: 12/31/2022] Open
Abstract
Background: A comprehensive map of the human-M. tuberculosis (MTB) protein interactome would help fill the gaps in our understanding of the disease, and computational prediction can aid and complement experimental studies towards this end. Several sequence-based in silico approaches tap the existing data on experimentally validated protein-protein interactions (PPIs); these PPIs serve as templates from which novel interactions between pathogen and host are inferred. Such comparative approaches typically make use of local sequence alignment, which, in the absence of structural details about the interfaces mediating the template interactions, could lead to incorrect inferences, particularly when multi-domain proteins are involved.
Results: We propose leveraging the domain-domain interaction (DDI) information in PDB complexes to score and prioritize candidate PPIs between host and pathogen proteomes based on targeted sequence-level comparisons. Our method picks out a small set of human-MTB protein pairs as candidates for physical interactions, and the use of functional meta-data suggests that some of them could contribute to the in vivo molecular cross-talk between pathogen and host that regulates the course of the infection. Further, we present numerical data for Pfam domain families that highlight interaction specificity on the domain level. Not every instance of a pair of domains for which interaction evidence has been found in a few instances (i.e. structures) is likely to functionally interact. Our sorting approach scores candidates according to how “distant” they are in sequence space from known examples of DDIs (templates). Thus, it provides a natural way to deal with the heterogeneity in domain-level interactions.
Conclusions: Our method represents a more informed application of local alignment to the sequence-based search for potential human-microbial interactions that uses available PPI data as a prior. Our approach is somewhat limited in its sensitivity by the restricted size and diversity of the template dataset, but, given the rapid accumulation of solved protein complex structures, its scope and utility are expected to keep steadily improving.
Affiliation(s)
- Gaurang Mahajan
  - National Centre for Cell Science, Ganeshkhind, Pune, 411 007, India
  - Indian Institute of Science Education and Research, Pashan, Pune, 411 008, India
- Shekhar C Mande
  - National Centre for Cell Science, Ganeshkhind, Pune, 411 007, India
13
Lv Q, Ma W, Liu H, Li J, Wang H, Lu F, Zhao C, Shi T. Genome-wide protein-protein interactions and protein function exploration in cyanobacteria. Sci Rep 2015; 5:15519. [PMID: 26490033 PMCID: PMC4614683 DOI: 10.1038/srep15519] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Received: 04/13/2015] [Accepted: 09/21/2015] [Indexed: 11/10/2022] Open
Abstract
Genome-wide network analysis is well established for studying proteins of unknown function. Here, we explored protein functions and biological mechanisms in cyanobacteria based on an inferred high-confidence protein-protein interaction (PPI) network. We integrated data from seven different sources and predicted 1,997 PPIs, which were evaluated against experimental evidence of molecular mechanism, text mining of the literature for direct or indirect evidence, and conservation of “interologs”. Combining the predicted PPIs with known PPIs, we obtained 4,715 non-redundant PPIs (involving 3,231 proteins covering over 90% of the genome) to generate the PPI network. Based on the PPI network, Gene Ontology (GO) terms were assigned to proteins of unknown function. Functional modules were identified by dissecting the PPI network into sub-networks and analyzing pathway enrichment, with which we investigated novel functions of the underlying proteins in protein complexes and pathways. Examples from photosynthesis and DNA repair indicate that the network approach is a powerful tool in protein function analysis. Overall, this systems biology approach provides new insight into posterior functional analysis of PPIs in cyanobacteria.
Affiliation(s)
- Qi Lv
  - Center for Bioinformatics and Computational Biology, and the Institute of Biomedical Sciences, School of Life Sciences, East China Normal University, 500 Dongchuan Road, Shanghai, 200241, China
- Weimin Ma
  - College of Life and Environment Sciences, Shanghai Normal University, 100 Guilin Road, Shanghai, 200234, China
- Hui Liu
  - Center for Bioinformatics and Computational Biology, and the Institute of Biomedical Sciences, School of Life Sciences, East China Normal University, 500 Dongchuan Road, Shanghai, 200241, China
- Jiang Li
  - Center for Bioinformatics and Computational Biology, and the Institute of Biomedical Sciences, School of Life Sciences, East China Normal University, 500 Dongchuan Road, Shanghai, 200241, China
- Huan Wang
  - Center for Bioinformatics and Computational Biology, and the Institute of Biomedical Sciences, School of Life Sciences, East China Normal University, 500 Dongchuan Road, Shanghai, 200241, China
- Fang Lu
  - College of Life and Environment Sciences, Shanghai Normal University, 100 Guilin Road, Shanghai, 200234, China
- Chen Zhao
  - Center for Bioinformatics and Computational Biology, and the Institute of Biomedical Sciences, School of Life Sciences, East China Normal University, 500 Dongchuan Road, Shanghai, 200241, China
- Tieliu Shi
  - Center for Bioinformatics and Computational Biology, and the Institute of Biomedical Sciences, School of Life Sciences, East China Normal University, 500 Dongchuan Road, Shanghai, 200241, China
  - The Institute of Plant Physiology and Ecology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, 300 Fenglin Road, Shanghai 200032, China
14
Huo T, Liu W, Guo Y, Yang C, Lin J, Rao Z. Prediction of host - pathogen protein interactions between Mycobacterium tuberculosis and Homo sapiens using sequence motifs. BMC Bioinformatics 2015; 16:100. [PMID: 25887594 PMCID: PMC4456996 DOI: 10.1186/s12859-015-0535-y] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.1] [Received: 10/14/2014] [Accepted: 03/13/2015] [Indexed: 12/28/2022] Open
Abstract
Background: Emergence of multiple drug resistant strains of M. tuberculosis (MDR-TB) threatens to derail global efforts aimed at reining in the pathogen. Co-infections of M. tuberculosis with HIV are difficult to treat. To counter these new challenges, it is essential to study the interactions between M. tuberculosis and the host to learn how these bacteria cause disease.
Results: We report a systematic workflow to predict the host-pathogen interactions (HPIs) between M. tuberculosis and Homo sapiens based on sequence motifs. First, protein sequences were used as initial input for identifying the HPIs by the ‘interolog’ method. HPIs were further filtered by prediction of domain-domain interactions (DDIs). Functional annotations of proteins and publicly available experimental results were applied to filter the remaining HPIs. Using such a strategy, 118 pairs of HPIs were identified, which involve 43 proteins from M. tuberculosis and 48 proteins from Homo sapiens. A biological interaction network between M. tuberculosis and Homo sapiens was then constructed using the predicted inter- and intra-species interactions based on the 118 pairs of HPIs. Finally, a web accessible database named PATH (Protein interactions of M. tuberculosis and Human) was constructed to store these predicted interactions and proteins.
Conclusions: This interaction network will facilitate the research on host-pathogen protein-protein interactions, and may throw light on how M. tuberculosis interacts with its host.
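The first step of the workflow described above, the interolog method, can be sketched as follows. This is a generic illustration of interolog transfer, not the authors' pipeline: it assumes that sequence comparison has already produced ortholog mappings from template proteins to host and pathogen proteins, and every name used here (`infer_interologs`, the mapping dictionaries, the protein labels) is hypothetical. The later DDI and annotation filters from the abstract are not modelled.

```python
def infer_interologs(template_ppis, host_orthologs, pathogen_orthologs):
    """Transfer template interactions to host-pathogen protein pairs.

    template_ppis: iterable of (protein_a, protein_b) known interactions.
    host_orthologs / pathogen_orthologs: dict mapping a template protein to
    a list of its orthologs in the host / pathogen (assumed precomputed).
    Returns a set of (host_protein, pathogen_protein) candidate pairs.
    """
    predicted = set()
    for a, b in template_ppis:
        # a template PPI is undirected, so try both role assignments
        for first, second in ((a, b), (b, a)):
            for h in host_orthologs.get(first, ()):
                for p in pathogen_orthologs.get(second, ()):
                    predicted.add((h, p))
    return predicted


if __name__ == "__main__":
    templates = [("T1", "T2"), ("T3", "T4")]           # hypothetical templates
    host_map = {"T1": ["host_protein_1"]}              # hypothetical orthologs
    pathogen_map = {"T2": ["pathogen_protein_1"], "T4": ["pathogen_protein_2"]}
    print(sorted(infer_interologs(templates, host_map, pathogen_map)))
```

Only the first template yields a candidate in this toy run, because the second has no host ortholog for either partner; in practice the candidate set would then be narrowed by further evidence.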
Affiliation(s)
- Tong Huo
- State Key Laboratory of Medicinal Chemical Biology, Nankai University, Tianjin, 300071, China; College of Life Sciences, Nankai University, Tianjin, 300071, China; Tianjin International Joint Academy of Biotechnology and Medicine, Tianjin, 300457, China.
| | - Wei Liu
- State Key Laboratory of Medicinal Chemical Biology, Nankai University, Tianjin, 300071, China; College of Life Sciences, Nankai University, Tianjin, 300071, China; Tianjin International Joint Academy of Biotechnology and Medicine, Tianjin, 300457, China.
| | - Yu Guo
- State Key Laboratory of Medicinal Chemical Biology, Nankai University, Tianjin, 300071, China; College of Pharmacy, Nankai University, Tianjin, 300071, China; Tianjin International Joint Academy of Biotechnology and Medicine, Tianjin, 300457, China.
| | - Cheng Yang
- State Key Laboratory of Medicinal Chemical Biology, Nankai University, Tianjin, 300071, China; College of Pharmacy, Nankai University, Tianjin, 300071, China; Tianjin International Joint Academy of Biotechnology and Medicine, Tianjin, 300457, China.
| | - Jianping Lin
- State Key Laboratory of Medicinal Chemical Biology, Nankai University, Tianjin, 300071, China; College of Pharmacy, Nankai University, Tianjin, 300071, China; Tianjin International Joint Academy of Biotechnology and Medicine, Tianjin, 300457, China.
| | - Zihe Rao
- State Key Laboratory of Medicinal Chemical Biology, Nankai University, Tianjin, 300071, China; College of Life Sciences, Nankai University, Tianjin, 300071, China; Tianjin International Joint Academy of Biotechnology and Medicine, Tianjin, 300457, China.
|
15
|
Hernández-Prieto MA, Semeniuk TA, Futschik ME. Toward a systems-level understanding of gene regulatory, protein interaction, and metabolic networks in cyanobacteria. Front Genet 2014; 5:191. [PMID: 25071821 PMCID: PMC4079066 DOI: 10.3389/fgene.2014.00191] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/29/2014] [Accepted: 06/11/2014] [Indexed: 12/21/2022] Open
Abstract
Cyanobacteria are essential primary producers in marine ecosystems, playing an important role in both carbon and nitrogen cycles. In the last decade, various genome sequencing and metagenomic projects have generated large amounts of genetic data for cyanobacteria. This wealth of data provides researchers with a new basis for the study of molecular adaptation, ecology and evolution of cyanobacteria, as well as for developing biotechnological applications. It also facilitates the use of multiplex techniques, i.e., expression profiling by high-throughput technologies such as microarrays, RNA-seq, and proteomics. However, exploration and analysis of these data are challenging, and often require advanced computational methods. The data also need to be integrated into our existing framework of knowledge before reliable biological conclusions can be drawn from them. Here, systems biology provides important tools. In particular, the construction and analysis of molecular networks has emerged as a powerful systems-level framework with which to integrate such data and to better understand biologically relevant processes in these organisms. In this review, we provide an overview of the advances and experimental approaches undertaken using multiplex data from genomic, transcriptomic, proteomic, and metabolomic studies in cyanobacteria. Furthermore, we summarize currently available web-based tools dedicated to cyanobacteria, i.e., CyanoBase, CyanoEXpress, ProPortal, Cyanorak, CyanoBIKE, and CINPER. Finally, we present a case study for the freshwater model cyanobacterium, Synechocystis sp. PCC6803, to show the power of meta-analysis, and the potential to extrapolate acquired knowledge to the ecologically important marine cyanobacteria genus, Prochlorococcus.
Affiliation(s)
| | - Trudi A Semeniuk
- Systems Biology and Bioinformatics Laboratory, IBB-CBME, University of Algarve Faro, Portugal
| | - Matthias E Futschik
- Systems Biology and Bioinformatics Laboratory, IBB-CBME, University of Algarve Faro, Portugal ; Centre of Marine Sciences, University of Algarve Faro, Portugal
|
16
|
Deng Y, Gao L, Wang B. ppiPre: predicting protein-protein interactions by combining heterogeneous features. BMC SYSTEMS BIOLOGY 2013; 7 Suppl 2:S8. [PMID: 24565177 PMCID: PMC3851814 DOI: 10.1186/1752-0509-7-s2-s8] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Indexed: 12/11/2022]
Abstract
BACKGROUND Protein-protein interactions (PPIs) are crucial in cellular processes. Since the current biological experimental techniques are time-consuming and expensive, and the results suffer from the problems of incompleteness and noise, developing computational methods and software tools to predict PPIs is necessary. Although several approaches have been proposed, the species supported are often limited and additional data such as homologous interactions in other species, protein sequence and protein expression are often required. Moreover, the predictive abilities of different features for different kinds of PPI data have not been studied. RESULTS In this paper, we propose ppiPre, an open-source framework for PPI analysis and prediction using a combination of heterogeneous features including three GO-based semantic similarities, one KEGG-based co-pathway similarity and three topology-based similarities. It supports up to twenty species. Only the original PPI data and gold-standard PPI data are required from users. The experiments on binary and co-complex gold-standard yeast PPI data sets show that there are large differences among the predictive abilities of different features on different kinds of PPI data sets. The prediction performance on the two data sets also shows that ppiPre is capable of handling PPI data of different kinds and sizes. ppiPre is implemented in the R language and is freely available on the CRAN (http://cran.r-project.org/web/packages/ppiPre/). CONCLUSIONS We applied our framework to both binary and co-complex gold-standard PPI data sets. The detailed analysis on three GO aspects suggests that different GO aspects should be used on different kinds of data sets, and that combining all three aspects of GO often gives the best result. The analysis also shows that using only features based on the topology of the PPI network can give a very good result when predicting the co-complex PPI data. ppiPre provides useful functions for analysing PPI data and can be used to predict PPIs for multiple species.
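The abstract does not name its three topology-based similarities. As an illustration of what such features look like, the sketch below computes three widely used neighbourhood measures (Jaccard, Adamic-Adar, and resource allocation) on an undirected PPI network. Treating these as the relevant measures is an assumption, and the Python function names are hypothetical rather than the ppiPre R API.

```python
import math
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency map from (u, v) interaction pairs."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def jaccard(adj, u, v):
    """Shared neighbours divided by all neighbours of either protein."""
    union = adj[u] | adj[v]
    return len(adj[u] & adj[v]) / len(union) if union else 0.0


def adamic_adar(adj, u, v):
    """Sum over shared neighbours, down-weighting high-degree hubs by log degree."""
    shared = adj[u] & adj[v]
    return sum(1.0 / math.log(len(adj[w])) for w in shared if len(adj[w]) > 1)


def resource_allocation(adj, u, v):
    """Like Adamic-Adar, but penalising hubs by degree rather than log degree."""
    return sum(1.0 / len(adj[w]) for w in adj[u] & adj[v])
```

Each function scores a candidate pair from the network alone, so no annotation data is needed; this is the sense in which topology-only features can be used for prediction.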
|
17
|
Abstract
Protein interaction networks are important for the understanding of regulatory mechanisms, for the explanation of experimental data and for the prediction of protein functions. Unfortunately, most interaction data is available only for model organisms. As a possible remedy, the transfer of interactions to organisms of interest is common practice, but it is not clear when interactions can be transferred from one organism to another and, thus, the confidence in the derived interactions is low. Here, we propose to use a rich set of features to train Random Forests in order to score transferred interactions. We evaluated the transfer from a range of eukaryotic organisms to S. cerevisiae using orthologs. Directly transferred interactions to S. cerevisiae are on average only 24% consistent with the current S. cerevisiae interaction network. By using commonly applied filter approaches the transfer precision can be improved, but at the cost of a large decrease in the number of transferred interactions. Our Random Forest approach uses various features derived from both the target and the source network as well as the ortholog annotations to assign confidence values to transferred interactions. Thereby, we could increase the average transfer consistency to 85%, while still transferring almost 70% of all correctly transferable interactions. We tested our approach for the transfer of interactions to other species and showed that our approach outperforms competing methods for the transfer of interactions to species where no experimental knowledge is available. Finally, we applied our predictor to score transferred interactions to 83 target species and we were able to extend the available interactome of B. taurus, M. musculus and G. gallus with over 40,000 interactions each. Our transferred interaction networks are publicly available via our web interface, which allows users to inspect and download transferred interaction sets of different sizes, for various species, and at specified expected precision levels. AVAILABILITY http://services.bio.ifi.lmu.de/coin-db/.
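The trade-off reported here, higher consistency at the cost of fewer retained interactions, can be illustrated with a small evaluation helper. The sketch assumes that each transferred interaction already carries a confidence score (for example from a classifier) and defines consistency simply as the fraction of retained interactions present in the target network. Both the definition and the function name are assumptions, not the paper's exact procedure.

```python
def evaluate_threshold(scored, target_ppis, threshold):
    """Evaluate a confidence cut-off on transferred interactions.

    scored: dict mapping frozenset({a, b}) to a confidence score
            (hypothetical scores, e.g. from a trained classifier).
    target_ppis: iterable of (a, b) pairs known in the target species.
    Returns (consistency, retained_fraction_of_correct).
    """
    target = {frozenset(pair) for pair in target_ppis}
    correct_all = [pair for pair in scored if pair in target]
    kept = [pair for pair, score in scored.items() if score >= threshold]
    kept_correct = [pair for pair in kept if pair in target]
    consistency = len(kept_correct) / len(kept) if kept else 0.0
    retained = len(kept_correct) / len(correct_all) if correct_all else 0.0
    return consistency, retained
```

Sweeping the threshold over the score range gives a curve of consistency against retained interactions, which is one way to choose an operating point.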
Affiliation(s)
- Robert Pesch
- Institute for Informatics, Ludwig-Maximilians-Universität München, Munich, Germany
- * E-mail:
| | - Ralf Zimmer
- Institute for Informatics, Ludwig-Maximilians-Universität München, Munich, Germany
|
18
|
Dissecting the gene network of dietary restriction to identify evolutionarily conserved pathways and new functional genes. PLoS Genet 2012; 8:e1002834. [PMID: 22912585 PMCID: PMC3415404 DOI: 10.1371/journal.pgen.1002834] [Citation(s) in RCA: 44] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/22/2012] [Accepted: 06/04/2012] [Indexed: 01/19/2023] Open
Abstract
Dietary restriction (DR), limiting nutrient intake from diet without causing malnutrition, delays the aging process and extends lifespan in multiple organisms. The conserved life-extending effect of DR suggests the involvement of fundamental mechanisms, although these remain a subject of debate. To help decipher the life-extending mechanisms of DR, we first compiled a list of genes that if genetically altered disrupt or prevent the life-extending effects of DR. We called these DR–essential genes and identified more than 100 in model organisms such as yeast, worms, flies, and mice. In order for other researchers to benefit from this first curated list of genes essential for DR, we established an online database called GenDR (http://genomics.senescence.info/diet/). To dissect the interactions of DR–essential genes and discover the underlying lifespan-extending mechanisms, we then used a variety of network and systems biology approaches to analyze the gene network of DR. We show that DR–essential genes are more conserved at the molecular level and have more molecular interactions than expected by chance. Furthermore, we employed a guilt-by-association method to predict novel DR–essential genes. In budding yeast, we predicted nine genes related to vacuolar functions; we show experimentally that mutations deleting eight of those genes prevent the life-extending effects of DR. Three of these mutants (OPT2, FRE6, and RCR2) had extended lifespan under ad libitum, indicating that the lack of further longevity under DR is not caused by a general compromise of fitness. These results demonstrate how network analyses of DR using GenDR can be used to make phenotypically relevant predictions. Moreover, gene-regulatory circuits reveal that the DR–induced transcriptional signature in yeast involves nutrient-sensing, stress responses and meiotic transcription factors. 
Finally, comparing the influence of gene expression changes during DR on the interactomes of multiple organisms led us to suggest that DR commonly suppresses translation, while stimulating an ancient reproduction-related process. Dietary restriction has been shown to extend lifespan in diverse, evolutionarily distant species, yet its underlying mechanisms remain unknown. We first constructed a database of genes essential for the life-extending effects of dietary restriction in various model organisms and then studied their interactions using a variety of network and systems biology approaches. This enabled us to predict novel genes related to dietary restriction, which we validated experimentally in yeast. By comparing large-scale data compilations (interactomes and transcriptomes) from multiple organisms, we were able to condense this -omics information to the most conserved essential elements, eliminating species-specific adaptive responses. These results lead us to the rather surprising conclusion that lifespan extension by a restricted diet commonly may exploit an ancient rejuvenation process derived from gametogenesis.
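The guilt-by-association step mentioned in this abstract can be pictured with a minimal neighbour-counting sketch: a gene becomes a candidate if enough of its network neighbours are already known DR-essential genes. The scoring rule, the `min_neighbors` cut-off, and the function name are all assumptions for illustration; the abstract does not specify the method actually used.

```python
from collections import defaultdict


def guilt_by_association(edges, seeds, min_neighbors=2):
    """Rank non-seed genes by the fraction of their neighbours that are seeds.

    edges: iterable of (gene_a, gene_b) interactions.
    seeds: set of genes already known to have the property of interest.
    min_neighbors: hypothetical cut-off on the number of seed neighbours.
    """
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    scores = {}
    for gene, neighbours in adj.items():
        if gene in seeds:
            continue
        hits = len(neighbours & seeds)
        if hits >= min_neighbors:
            scores[gene] = hits / len(neighbours)
    # Highest score first; ties broken alphabetically for reproducibility.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))
```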
|
19
|
Trabuco LG, Betts MJ, Russell RB. Negative protein-protein interaction datasets derived from large-scale two-hybrid experiments. Methods 2012; 58:343-8. [PMID: 22884951 DOI: 10.1016/j.ymeth.2012.07.028] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/27/2012] [Revised: 06/27/2012] [Accepted: 07/28/2012] [Indexed: 01/05/2023] Open
Abstract
Negative protein-protein interaction datasets are needed for training and evaluation of interaction prediction methods, as well as validation of high-throughput interaction discovery experiments. In large-scale two-hybrid assays, the direct interaction of a large number of protein pairs is systematically probed. We present a simple method to harness two-hybrid data to obtain negative protein-protein interaction datasets, which we validated using other available experimental data. The method identifies interactions that were likely tested but not observed in a two-hybrid screen. For each negative interaction, a confidence score is defined as the shortest-path length between the two proteins in the interaction network derived from the two-hybrid experiment. We show that these high-quality negative datasets are particularly important when a specific biological context is considered, such as in the study of protein interaction specificity. We also illustrate the use of a negative dataset in the evaluation of the InterPreTS interaction prediction method.
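The scoring idea in this abstract (shortest-path length in the two-hybrid network as a confidence score for each negative pair) can be sketched as follows. The sketch makes a simplifying assumption that is not taken from the paper: a bait-prey pair counts as "likely tested" if the bait and the prey each appear in at least one observed interaction of the same screen. The function names are hypothetical.

```python
from collections import defaultdict, deque


def shortest_path_length(adj, source, target):
    """Breadth-first search; returns None if the proteins are not connected."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbour in adj[node]:
            if neighbour == target:
                return dist + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None


def negative_pairs(observed):
    """Score untested-positive bait-prey pairs by shortest-path length.

    observed: iterable of (bait, prey) interactions from one screen.
    Returns a dict {(bait, prey): path_length_or_None}.
    """
    observed = set(observed)
    adj = defaultdict(set)
    for bait, prey in observed:
        adj[bait].add(prey)
        adj[prey].add(bait)
    baits = {bait for bait, _ in observed}
    preys = {prey for _, prey in observed}
    negatives = {}
    for bait in sorted(baits):
        for prey in sorted(preys):
            if bait == prey or (bait, prey) in observed or (prey, bait) in observed:
                continue
            negatives[(bait, prey)] = shortest_path_length(adj, bait, prey)
    return negatives
```

Pairs with no connecting path are reported as `None` here; how such pairs should be ranked relative to connected ones is left open.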
|
20
|
Wang C, Marshall A, Zhang D, Wilson ZA. ANAP: an integrated knowledge base for Arabidopsis protein interaction network analysis. PLANT PHYSIOLOGY 2012; 158:1523-33. [PMID: 22345505 PMCID: PMC3320167 DOI: 10.1104/pp.111.192203] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 12/12/2011] [Accepted: 02/12/2012] [Indexed: 05/18/2023]
Abstract
Protein interactions are fundamental to the molecular processes occurring within an organism and can be utilized in network biology to help organize, simplify, and understand biological complexity. Currently, there are more than 10 publicly available Arabidopsis (Arabidopsis thaliana) protein interaction databases. However, there are limitations with these databases, including different types of interaction evidence, a lack of defined standards for protein identifiers, differing levels of information, and, critically, a lack of integration between them. In this paper, we present an interactive bioinformatics Web tool, ANAP (Arabidopsis Network Analysis Pipeline), which serves to effectively integrate the different data sets and maximize access to available data. ANAP has been developed for Arabidopsis protein interaction integration and network-based study to facilitate functional protein network analysis. ANAP integrates 11 Arabidopsis protein interaction databases, comprising 201,699 unique protein interaction pairs, 15,208 identifiers (including 11,931 The Arabidopsis Information Resource Arabidopsis Genome Initiative codes), 89 interaction detection methods, 73 species that interact with Arabidopsis, and 6,161 references. ANAP can be used as a knowledge base for constructing protein interaction networks based on user input and supports both direct and indirect interaction analysis. It has an intuitive graphical interface allowing easy network visualization and provides extensive detailed evidence for each interaction. In addition, ANAP displays the gene and protein annotation in the generated interactive network with links to The Arabidopsis Information Resource, the AtGenExpress Visualization Tool, the Arabidopsis 1,001 Genomes GBrowse, the Protein Knowledgebase, the Kyoto Encyclopedia of Genes and Genomes, and the Ensembl Genome Browser to significantly aid functional network analysis. 
The tool is available open access at http://gmdd.shgmo.org/Computational-Biology/ANAP.
|
21
|
Haw R, Hermjakob H, D'Eustachio P, Stein L. Reactome pathway analysis to enrich biological discovery in proteomics data sets. Proteomics 2011; 11:3598-613. [PMID: 21751369 DOI: 10.1002/pmic.201100066] [Citation(s) in RCA: 78] [Impact Index Per Article: 6.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/06/2022]
Abstract
Reactome (http://www.reactome.org) is an open-source, expert-authored, peer-reviewed, manually curated database of reactions, pathways and biological processes. We provide an intuitive web-based user interface to pathway knowledge and a suite of data analysis tools. The Pathway Browser is a Systems Biology Graphical Notation-like visualization system that supports manual navigation of pathways by zooming, scrolling and event highlighting, and that exploits PSI Common Query Interface web services to overlay pathways with molecular interaction data from the Reactome Functional Interaction Network and interaction databases such as IntAct, ChEMBL and BioGRID. Pathway and expression analysis tools employ web services to provide ID mapping, pathway assignment and over-representation analysis of user-supplied data sets. By applying Ensembl Compara to curated human proteins and reactions, Reactome generates pathway inferences for 20 other species. The Species Comparison tool provides a summary of results for each of these species as a table showing numbers of orthologous proteins found by pathway from which users can navigate to inferred details for specific proteins and reactions. Reactome's diverse pathway knowledge and suite of data analysis tools provide a platform for data mining, modeling and analysis of large-scale proteomics data sets. This Tutorial is part of the International Proteomics Tutorial Programme (IPTP 8).
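Over-representation analysis of a user-supplied protein list is commonly framed as a one-sided hypergeometric test: given N background proteins of which K belong to a pathway, how likely is it that a sample of n proteins contains at least k pathway members? The sketch below shows that calculation. Using the hypergeometric distribution is an assumption about a typical approach, not a statement of the exact statistic used by the Reactome tools, and the function name is hypothetical.

```python
from math import comb


def overrepresentation_pvalue(N, K, n, k):
    """One-sided hypergeometric p-value P(X >= k).

    N: size of the background set.
    K: number of background proteins annotated to the pathway.
    n: size of the user-supplied list.
    k: number of list members annotated to the pathway.
    """
    upper = min(K, n)
    total = comb(N, n)
    # math.comb returns 0 when the second argument exceeds the first,
    # so impossible draws contribute nothing to the sum.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total
```

In practice the p-values for many pathways would then be corrected for multiple testing; that step is not shown.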
Affiliation(s)
- Robin Haw
- Ontario Institute for Cancer Research, Department of Informatics and Bio-computing, Toronto, ON, Canada
|
22
|
Poultney CS, Greenfield A, Bonneau R. Integrated inference and analysis of regulatory networks from multi-level measurements. Methods Cell Biol 2012; 110:19-56. [PMID: 22482944 PMCID: PMC5615108 DOI: 10.1016/b978-0-12-388403-9.00002-3] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/13/2022]
Abstract
Regulatory and signaling networks coordinate the enormously complex interactions and processes that control cellular processes (such as metabolism and cell division), coordinate response to the environment, and carry out multiple cell decisions (such as development and quorum sensing). Regulatory network inference is the process of inferring these networks, traditionally from microarray data but increasingly incorporating other measurement types such as proteomics, ChIP-seq, metabolomics, and mass cytometry. We discuss existing techniques for network inference. We review in detail our pipeline, which consists of an initial biclustering step, designed to estimate co-regulated groups; a network inference step, designed to select and parameterize likely regulatory models for the control of the co-regulated groups from the biclustering step; and a visualization and analysis step, designed to find and communicate key features of the network. Learning biological networks from even the most complete data sets is challenging; we argue that integrating new data types into the inference pipeline produces networks of increased accuracy, validity, and biological relevance.
Affiliation(s)
- Christopher S Poultney
- Department of Biology, Center for Genomics and Systems Biology, New York University, New York, NY, USA
|
23
|
Gallone G, Simpson TI, Armstrong JD, Jarman AP. Bio::Homology::InterologWalk--a Perl module to build putative protein-protein interaction networks through interolog mapping. BMC Bioinformatics 2011; 12:289. [PMID: 21767381 PMCID: PMC3161927 DOI: 10.1186/1471-2105-12-289] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/25/2011] [Accepted: 07/18/2011] [Indexed: 02/25/2023] Open
Abstract
BACKGROUND Protein-protein interaction (PPI) data are widely used to generate network models that aim to describe the relationships between proteins in biological systems. The fidelity and completeness of such networks are primarily limited by the paucity of protein interaction information and by the restriction of most of these data to just a few widely studied experimental organisms. In order to extend the utility of existing PPIs, computational methods can be used that exploit functional conservation between orthologous proteins across taxa to predict putative PPIs or 'interologs'. To date most interolog prediction efforts have been restricted to specific biological domains with fixed underlying data sources and there are no software tools available that provide a generalised framework for 'on-the-fly' interolog prediction. RESULTS We introduce Bio::Homology::InterologWalk, a Perl module to retrieve, prioritise and visualise putative protein-protein interactions through an orthology-walk method. The module uses orthology and experimental interaction data to generate putative PPIs and optionally collates meta-data into an Interaction Prioritisation Index that can be used to help prioritise interologs for further analysis. We show the application of our interolog prediction method to the genomic interactome of the fruit fly, Drosophila melanogaster. We analyse the resulting interaction networks and show that the method proposes new interactome members and interactions that are candidates for future experimental investigation. CONCLUSIONS Our interolog prediction tool employs the Ensembl Perl API and PSICQUIC-enabled protein interaction data sources to generate up-to-date interologs 'on-the-fly'. This represents a significant advance on previous methods for interolog prediction as it allows the use of the latest orthology and protein interaction data for all of the genomes in Ensembl.
The module outputs simple text files, making it easy to customise the results by post-processing, allowing the putative PPI datasets to be easily integrated into existing analysis workflows. The Bio::Homology::InterologWalk module, sample scripts and full documentation are freely available from the Comprehensive Perl Archive Network (CPAN) under the GNU Public license.
Affiliation(s)
- Giuseppe Gallone
- Centre for Integrative Physiology, University of Edinburgh, Hugh Robson Building, George Square, Edinburgh EH8 9XD, UK.
|
24
|
Luo F, Liu J, Li J. Discovering conditional co-regulated protein complexes by integrating diverse data sources. BMC SYSTEMS BIOLOGY 2010; 4 Suppl 2:S4. [PMID: 20840731 PMCID: PMC2982691 DOI: 10.1186/1752-0509-4-s2-s4] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Indexed: 01/15/2023]
Abstract
Affiliation(s)
- Fei Luo
- School of Computer, Wuhan University, Wuhan, Hubei, China.
|
25
|
Wiles AM, Doderer M, Ruan J, Gu TT, Ravi D, Blackman B, Bishop AJR. Building and analyzing protein interactome networks by cross-species comparisons. BMC SYSTEMS BIOLOGY 2010; 4:36. [PMID: 20353594 PMCID: PMC2859380 DOI: 10.1186/1752-0509-4-36] [Citation(s) in RCA: 51] [Impact Index Per Article: 3.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 09/25/2009] [Accepted: 03/30/2010] [Indexed: 11/10/2022]
Abstract
Background A genomic catalogue of protein-protein interactions is a rich source of information, particularly for exploring the relationships between proteins. Numerous systems-wide and small-scale experiments have been conducted to identify interactions; however, our knowledge of all interactions for any one species is incomplete, and alternative means to expand these network maps are needed. We therefore took a comparative biology approach to predict protein-protein interactions across five species (human, mouse, fly, worm, and yeast) and developed InterologFinder for research biologists to easily navigate these data. We also developed a confidence score for interactions based on available experimental evidence and conservation across species. Results The connectivity of the resultant networks was determined to have scale-free distribution, small-world properties, and increased local modularity, indicating that the added interactions do not disrupt our current understanding of protein network structures. We show examples of how these improved interactomes can be used to analyze a genome-scale dataset (RNAi screen) and to assign new function to proteins. Predicted interactions within this dataset were tested by co-immunoprecipitation, resulting in a high rate of validation, suggesting the high quality of networks produced. Conclusions Protein-protein interactions were predicted in five species, based on orthology. An InteroScore, a score accounting for homology, number of orthologues with evidence of interactions, and number of unique observations of interactions, is given to each known and predicted interaction. Our website http://www.interologfinder.org provides research biologists intuitive access to these data.
Affiliation(s)
- Amy M Wiles
- Greehey Children's Cancer Research Institute, The University of Texas Health Science Center at San Antonio, San Antonio, TX 78229, USA
|
26
|
Dutkowski J, Tiuryn J. Phylogeny-guided interaction mapping in seven eukaryotes. BMC Bioinformatics 2009; 10:393. [PMID: 19948065 PMCID: PMC2793266 DOI: 10.1186/1471-2105-10-393] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/01/2009] [Accepted: 11/30/2009] [Indexed: 01/03/2023] Open
Abstract
BACKGROUND The assembly of reliable and complete protein-protein interaction (PPI) maps remains one of the significant challenges in systems biology. Computational methods which integrate and prioritize interaction data can greatly aid in approaching this goal. RESULTS We developed a Bayesian inference framework which uses phylogenetic relationships to guide the integration of PPI evidence across multiple datasets and species, providing more accurate predictions. We apply our framework to reconcile seven eukaryotic interactomes: H. sapiens, M. musculus, R. norvegicus, D. melanogaster, C. elegans, S. cerevisiae and A. thaliana. Comprehensive GO-based quality assessment indicates a 5% to 44% score increase in predicted interactomes compared to the input data. Further support is provided by gold-standard MIPS, CYC2008 and HPRD datasets. We demonstrate the ability to recover known PPIs in well-characterized yeast and human complexes (26S proteasome, endosome and exosome) and suggest possible new partners interacting with the putative SWI/SNF chromatin remodeling complex in A. thaliana. CONCLUSION Our phylogeny-guided approach compares favorably to two standard methods for mapping PPIs across species. Detailed analysis of predictions in selected functional modules uncovers specific PPI profiles among homologous proteins, establishing interaction-based partitioning of protein families. Provided evidence also suggests that interactions within core complex subunits are in general more conserved and easier to transfer accurately to other organisms, than interactions between these subunits.
|
27
|
Aranda B, Achuthan P, Alam-Faruque Y, Armean I, Bridge A, Derow C, Feuermann M, Ghanbarian AT, Kerrien S, Khadake J, Kerssemakers J, Leroy C, Menden M, Michaut M, Montecchi-Palazzi L, Neuhauser SN, Orchard S, Perreau V, Roechert B, van Eijk K, Hermjakob H. The IntAct molecular interaction database in 2010. Nucleic Acids Res 2009; 38:D525-31. [PMID: 19850723 PMCID: PMC2808934 DOI: 10.1093/nar/gkp878] [Citation(s) in RCA: 524] [Impact Index Per Article: 34.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/18/2022] Open
Abstract
IntAct is an open-source, open data molecular interaction database and toolkit. Data is abstracted from the literature or from direct data depositions by expert curators following a deep annotation model providing a high level of detail. As of September 2009, IntAct contains over 200,000 curated binary interaction evidences. In response to the growing data volume and user requests, IntAct now provides a two-tiered view of the interaction data. The search interface allows the user to iteratively develop complex queries, exploiting the detailed annotation with hierarchical controlled vocabularies. Results are provided at any stage in a simplified, tabular view. Specialized views then allow 'zooming in' on the full annotation of interactions, interactors and their properties. IntAct source code and data are freely available at http://www.ebi.ac.uk/intact.
Affiliation(s)
- B Aranda
- EMBL Outstation, European Bioinformatics Institute, Wellcome Trust Genome Campus Hinxton, Cambridge CB10 1SD, UK
|