1. Fang T, Szklarczyk D, Hachilif R, von Mering C. Enhancing coevolutionary signals in protein-protein interaction prediction through clade-wise alignment integration. Sci Rep 2024; 14:6009. [PMID: 38472223 PMCID: PMC10933411 DOI: 10.1038/s41598-024-55655-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/06/2023] [Accepted: 02/26/2024] [Indexed: 03/14/2024] Open
Abstract
Protein-protein interactions (PPIs) play essential roles in most biological processes. The binding interfaces between interacting proteins impose evolutionary constraints that have successfully been employed to predict PPIs from multiple sequence alignments (MSAs). To construct MSAs, critical choices have to be made: how to ensure the reliable identification of orthologs, and how to optimally balance the need for large alignments versus sufficient alignment quality. Here, we propose a divide-and-conquer strategy for MSA generation: instead of building a single, large alignment for each protein, multiple distinct alignments are constructed under distinct clades in the tree of life. Coevolutionary signals are searched separately within these clades, and are only subsequently integrated using machine learning techniques. We find that this strategy markedly improves overall prediction performance, concomitant with better alignment quality. Using the popular DCA algorithm to systematically search pairs of such alignments, a genome-wide all-against-all interaction scan in a bacterial genome is demonstrated. Given the recent successes of AlphaFold in predicting direct PPIs at atomic detail, a discover-and-refine approach is proposed: our method could provide a fast and accurate strategy for pre-screening the entire genome, submitting to AlphaFold only promising interaction candidates, thus reducing false positives as well as computation time.
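The divide-and-conquer idea in this abstract can be illustrated with a toy sketch. It is not the authors' pipeline: plain mutual information stands in for DCA, an unweighted mean stands in for the machine-learning integration step, and the clade names and alignment columns are invented for illustration.

```python
from collections import Counter
from math import log2


def mutual_information(col_a, col_b):
    """Mutual information (in bits) between two equally long alignment columns."""
    n = len(col_a)
    freq_a, freq_b = Counter(col_a), Counter(col_b)
    freq_ab = Counter(zip(col_a, col_b))
    return sum(
        (c / n) * log2((c / n) / ((freq_a[a] / n) * (freq_b[b] / n)))
        for (a, b), c in freq_ab.items()
    )


def clade_wise_scores(clade_columns):
    """Score one column pair separately within each clade."""
    return {clade: mutual_information(a, b) for clade, (a, b) in clade_columns.items()}


def integrate(scores):
    """Placeholder integration (assumption): unweighted mean over clades."""
    return sum(scores.values()) / len(scores)


# Hypothetical toy data: one column from protein A and one from protein B,
# with one character per species in the clade.
toy = {
    "clade_1": ("AAGG", "CCTT"),  # perfectly covarying columns
    "clade_2": ("AAGG", "CTCT"),  # statistically independent columns
}
scores = clade_wise_scores(toy)
combined = integrate(scores)
```

Scoring each clade separately keeps a strong signal in one clade visible as its own value, rather than letting unrelated sequences in a single pooled alignment dilute it before integration.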
Affiliation(s)
- Tao Fang
  - Department of Molecular Life Sciences, University of Zurich, 8057, Zurich, Switzerland
  - SIB Swiss Institute of Bioinformatics, 1015, Lausanne, Switzerland
- Damian Szklarczyk
  - Department of Molecular Life Sciences, University of Zurich, 8057, Zurich, Switzerland
  - SIB Swiss Institute of Bioinformatics, 1015, Lausanne, Switzerland
- Radja Hachilif
  - Department of Molecular Life Sciences, University of Zurich, 8057, Zurich, Switzerland
  - SIB Swiss Institute of Bioinformatics, 1015, Lausanne, Switzerland
- Christian von Mering
  - Department of Molecular Life Sciences, University of Zurich, 8057, Zurich, Switzerland
  - SIB Swiss Institute of Bioinformatics, 1015, Lausanne, Switzerland
2. Moratalla-Navarro F, Moreno V, Sanz-Pamplona R. TALKIEN: crossTALK IntEraction Network. A web-based tool for deciphering molecular communication through ligand-receptor interactions. Mol Omics 2023; 19:688-696. [PMID: 37403821 DOI: 10.1039/d3mo00049d] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/06/2023]
Abstract
Molecular crosstalk, the dialogue between different cell types, is attracting more attention in cancer research. On the one hand, the communication between tumor and non-tumor cells in the microenvironment or between different tumor clones has influential consequences for the progression and spread of tumors and response to treatment. On the other hand, novel techniques such as single-cell sequencing or spatial transcriptomics provide detailed information that needs to be interpreted. TALKIEN: crossTALK IntEraction Network is a simple and intuitive online R/shiny application to visualize molecular crosstalk information through the construction and analysis of a protein-protein interaction network. Taking as input two or more lists of genes or proteins that are representative of cell lineages, TALKIEN extracts information about ligand-receptor interactions, builds a network and analyzes it using systems biology techniques such as centrality measures and component analysis, among others. Moreover, it expands the network by displaying pathways downstream of receptors. The application allows users to select different graphical layouts, performs functional analysis and gives information about drugs targeting receptors. In conclusion, TALKIEN allows users to detect ligand-receptor interactions, generating new in silico predictions of cell-cell communication and thus providing a translational rationale for future experiments. It is freely available at https://www.odap-ico.org/talkien.
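The workflow described above (gene lists in, ligand-receptor edges out, then a centrality measure) can be sketched in a few lines. This is not TALKIEN's code, which is an R/shiny application: the `LIGAND_RECEPTOR` table, the function names and the two gene lists are hypothetical, and a real analysis would draw pairs from a curated ligand-receptor resource.

```python
from collections import defaultdict

# Hypothetical ligand -> receptors lookup (assumption, for illustration only).
LIGAND_RECEPTOR = {
    "TGFB1": {"TGFBR1", "TGFBR2"},
    "CXCL12": {"CXCR4"},
    "EGF": {"EGFR"},
}


def crosstalk_edges(lineage_a, lineage_b):
    """Return (ligand, receptor) pairs where one lineage supplies the ligand
    and the other lineage supplies a matching receptor."""
    edges = set()
    for source, target in ((lineage_a, lineage_b), (lineage_b, lineage_a)):
        for ligand in source:
            for receptor in LIGAND_RECEPTOR.get(ligand, ()):
                if receptor in target:
                    edges.add((ligand, receptor))
    return edges


def degree_centrality(edges):
    """Degree of each node divided by (number of nodes - 1)."""
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    n = len(degree)
    return {node: d / (n - 1) for node, d in degree.items()} if n > 1 else {}


tumor = {"TGFB1", "CXCR4", "EGFR"}
stroma = {"TGFBR1", "TGFBR2", "CXCL12"}
edges = crosstalk_edges(tumor, stroma)
centrality = degree_centrality(edges)
```

In this toy input, `EGFR` yields no edge because neither list contains its ligand, and `TGFB1` is the most central node because it reaches two receptors in the other lineage.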
Affiliation(s)
- Ferran Moratalla-Navarro
  - Unit of Biomarkers and Susceptibility, Oncology Data Analytics Program (ODAP), Catalan Institute of Oncology (ICO), Colorectal Cancer Group, ONCOBELL Program, Bellvitge Biomedical Research Institute (IDIBELL), L'Hospitalet de Llobregat, Barcelona, Spain
  - Centro de Investigación Biomédica en Red de Epidemiologia y Salud Pública (CIBERESP), Spain
  - Department of Clinical Sciences, Faculty of Medicine and Health Sciences, University of Barcelona, Barcelona, Spain
- Víctor Moreno
  - Unit of Biomarkers and Susceptibility, Oncology Data Analytics Program (ODAP), Catalan Institute of Oncology (ICO), Colorectal Cancer Group, ONCOBELL Program, Bellvitge Biomedical Research Institute (IDIBELL), L'Hospitalet de Llobregat, Barcelona, Spain
  - Centro de Investigación Biomédica en Red de Epidemiologia y Salud Pública (CIBERESP), Spain
  - Department of Clinical Sciences, Faculty of Medicine and Health Sciences, University of Barcelona, Barcelona, Spain
- Rebeca Sanz-Pamplona
  - Unit of Biomarkers and Susceptibility, Oncology Data Analytics Program (ODAP), Catalan Institute of Oncology (ICO), Colorectal Cancer Group, ONCOBELL Program, Bellvitge Biomedical Research Institute (IDIBELL), L'Hospitalet de Llobregat, Barcelona, Spain
  - Centro de Investigación Biomédica en Red de Epidemiologia y Salud Pública (CIBERESP), Spain
  - University Hospital Lozano Blesa, Aragon Health Research Institute (IISA), ARAID Foundation, Aragon Government, Zaragoza, Spain
3. Xie S, Xie X, Zhao X, Liu F, Wang Y, Ping J, Ji Z. HNSPPI: a hybrid computational model combing network and sequence information for predicting protein-protein interaction. Brief Bioinform 2023; 24:bbad261. [PMID: 37480553 DOI: 10.1093/bib/bbad261] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 03/05/2023] [Revised: 06/24/2023] [Accepted: 06/26/2023] [Indexed: 07/24/2023] Open
Abstract
Most life activities in organisms are regulated through protein complexes, which are mainly controlled via protein-protein interactions (PPIs). Discovering new interactions between proteins and revealing their biological functions are of great significance for understanding the molecular mechanisms of biological processes and identifying potential targets in drug discovery. Current experimental methods capture only stable protein interactions, which leads to limited coverage. In addition, they are expensive and time-consuming. In recent years, various computational methods have been successfully developed for predicting PPIs based only on protein homology, primary protein sequences or gene ontology information. Computational efficiency and data complexity are still the main bottlenecks for algorithm generalization. In this study, we propose a novel computational framework, HNSPPI, to predict PPIs. As a hybrid supervised learning model, HNSPPI comprehensively characterizes the intrinsic relationship between two proteins by integrating amino acid sequence information and the connection properties of the PPI network. The experimental results show that HNSPPI works very well on six benchmark datasets. Moreover, the comparison analysis showed that our model significantly outperforms five other existing algorithms. Finally, we used the HNSPPI model to explore the SARS-CoV-2-Human interaction system and found several potential regulatory relationships. In summary, HNSPPI is a promising model for predicting new protein interactions from known PPI data.
Affiliation(s)
- Shijie Xie
  - College of Artificial Intelligence, Nanjing Agricultural University, No. 1 Weigang Rd, Nanjing, Jiangsu 210095, China
- Xiaojun Xie
  - College of Artificial Intelligence, Nanjing Agricultural University, No. 1 Weigang Rd, Nanjing, Jiangsu 210095, China
- Xin Zhao
  - Department of Hepatobiliary Surgery, Beijing Chaoyang Hospital affiliated to Capital Medical University, Beijing 100020, China
- Fei Liu
  - Joint International Research Laboratory of Animal Health and Food Safety of Ministry of Education & Single Molecule Nanometry Laboratory (Sinmolab), Nanjing Agricultural University, Nanjing, Jiangsu 210095, China
- Yiming Wang
  - Key Laboratory of Biological Interactions and Crop Health, Department of Plant Pathology, Nanjing Agricultural University, 210095, Nanjing, China
- Jihui Ping
  - MOE International Joint Collaborative Research Laboratory for Animal Health and Food Safety & Jiangsu Engineering Laboratory of Animal Immunology, College of Veterinary Medicine, Nanjing Agricultural University, Nanjing, Jiangsu 210095, China
- Zhiwei Ji
  - College of Artificial Intelligence, Nanjing Agricultural University, No. 1 Weigang Rd, Nanjing, Jiangsu 210095, China
4. Koca MB, Nourani E, Abbasoğlu F, Karadeniz İ, Sevilgen FE. Graph convolutional network based virus-human protein-protein interaction prediction for novel viruses. Comput Biol Chem 2022; 101:107755. [PMID: 36037723 DOI: 10.1016/j.compbiolchem.2022.107755] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/02/2022] [Revised: 07/07/2022] [Accepted: 08/10/2022] [Indexed: 11/03/2022]
Abstract
Computational identification of human-virus protein-protein interactions (PHIs) is a worthwhile step towards understanding infection mechanisms. Analysis of the PHI networks is important for the determination of pathogenic diseases. Prediction of these interactions is a popular problem since experimental detection of PHIs is both time-consuming and expensive. The available methods use biological features like amino acid sequences, molecular structure, or biological activities for prediction. Recent studies show that the topological properties of proteins in protein-protein interaction (PPI) networks increase the performance of the predictions. Basic network projections, random-walk-based models, or graph neural networks are used for generating topologically enriched (hybrid) protein embeddings. In this study, we propose a three-stage machine learning pipeline that generates and uses hybrid embeddings for PHI prediction. In the first stage, numerical features are extracted from the amino acid sequences using the Doc2Vec and Byte Pair Encoding methods. In the second stage, the amino acid embeddings are used as node features while training a modified GraphSAGE model, which is an improved version of the graph convolutional network. Lastly, the hybrid protein embeddings are used for training a binary interaction classifier model that predicts whether or not there is an interaction between two given proteins. The proposed method is evaluated with comprehensive experiments to test its functionality and compare it with the state-of-the-art methods. The experimental results on the benchmark dataset demonstrate the efficiency of the proposed model, which achieves a 3-23% better area under the curve (AUC) score than its competitors.
Affiliation(s)
- Mehmet Burak Koca
  - Department of Computer Engineering, Faculty of Engineering, Gebze Technical University, Kocaeli, Turkey
- Esmaeil Nourani
  - Department of Information Technology, Faculty of Computer Engineering and Information Technology, Azarbaijan Shahid Madani University, Tabriz, Iran
- Ferda Abbasoğlu
  - Department of Computer Engineering, Faculty of Engineering, Gebze Technical University, Kocaeli, Turkey
- İlknur Karadeniz
  - Department of Computer Engineering, Faculty of Engineering and Natural Sciences, Işık University, İstanbul, Turkey
- Fatih Erdoğan Sevilgen
  - Department of Computer Engineering, Faculty of Engineering, Gebze Technical University, Kocaeli, Turkey
  - Institute for Data Science and Artificial Intelligence, Boğaziçi University, İstanbul, Turkey
5. Lugo-Martinez J, Zeiberg D, Gaudelet T, Malod-Dognin N, Przulj N, Radivojac P. Classification in biological networks with hypergraphlet kernels. Bioinformatics 2021; 37:1000-1007. [PMID: 32886115 DOI: 10.1093/bioinformatics/btaa768] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 07/09/2019] [Revised: 06/13/2020] [Accepted: 08/26/2020] [Indexed: 11/15/2022] Open
Abstract
MOTIVATION: Biological and cellular systems are often modeled as graphs in which vertices represent objects of interest (genes, proteins and drugs) and edges represent relational ties between these objects (binds-to, interacts-with and regulates). This approach has been highly successful owing to the theory, methodology and software that support analysis and learning on graphs. Graphs, however, suffer from information loss when modeling physical systems due to their inability to accurately represent multiobject relationships. Hypergraphs, a generalization of graphs, provide a framework to mitigate information loss and unify disparate graph-based methodologies.
RESULTS: We present a hypergraph-based approach for modeling biological systems and formulate vertex classification, edge classification and link prediction problems on (hyper)graphs as instances of vertex classification on (extended, dual) hypergraphs. We then introduce a novel kernel method on vertex- and edge-labeled (colored) hypergraphs for analysis and learning. The method is based on exact and inexact (via hypergraph edit distances) enumeration of hypergraphlets; i.e. small hypergraphs rooted at a vertex of interest. We empirically evaluate this method on fifteen biological networks and show its potential use in a positive-unlabeled setting to estimate the interactome sizes in various species.
AVAILABILITY AND IMPLEMENTATION: https://github.com/jlugomar/hypergraphlet-kernels.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
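The reduction of edge classification to vertex classification rests on the standard dual-hypergraph construction, which a short sketch makes concrete. The code below is not taken from the authors' repository; the function name and the toy complexes are hypothetical.

```python
def dual_hypergraph(hyperedges):
    """Build the dual of a hypergraph.

    `hyperedges` maps each hyperedge name to the set of vertices it contains.
    In the dual, every original hyperedge becomes a vertex and every original
    vertex becomes a hyperedge holding the original hyperedges that touch it.
    """
    dual = {}
    for edge, vertices in hyperedges.items():
        for vertex in vertices:
            dual.setdefault(vertex, set()).add(edge)
    return dual


# Hypothetical toy hypergraph: two complexes sharing protein P3.
complexes = {
    "complex_1": {"P1", "P2", "P3"},
    "complex_2": {"P3", "P4"},
}
dual = dual_hypergraph(complexes)
round_trip = dual_hypergraph(dual)
```

Because each complex is a vertex of `dual`, labeling complexes becomes a vertex classification task there. For a hypergraph with no isolated vertices and no empty hyperedges, taking the dual twice recovers the original.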
Affiliation(s)
- Jose Lugo-Martinez
  - Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Daniel Zeiberg
  - Khoury College of Computer Sciences, Northeastern University, Boston, MA 02115, USA
- Thomas Gaudelet
  - Department of Computer Science, University College London, London WC1E 6BT, UK
- Natasa Przulj
  - Barcelona Supercomputing Center (BSC), Barcelona 08034, Spain
  - ICREA, Pg. Lluis Companys 23, Barcelona 08010, Spain
- Predrag Radivojac
  - Khoury College of Computer Sciences, Northeastern University, Boston, MA 02115, USA
6. Schmitt-Ulms G, Mehrabian M, Williams D, Ehsani S. The IDIP framework for assessing protein function and its application to the prion protein. Biol Rev Camb Philos Soc 2021; 96:1907-1932. [PMID: 33960099 DOI: 10.1111/brv.12731] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 11/13/2020] [Revised: 04/22/2021] [Accepted: 04/26/2021] [Indexed: 01/06/2023]
Abstract
The quest to determine the function of a protein can represent a profound challenge. Although this task is the mandate of countless research groups, a general framework for how it can be approached is conspicuously lacking. Moreover, even expectations for when the function of a protein can be considered to be 'known' are not well defined. In this review, we begin by introducing concepts pertinent to the challenge of protein function assignments. We then propose a framework for inferring a protein's function from four data categories: 'inheritance', 'distribution', 'interactions' and 'phenotypes' (IDIP). We document that the functions of proteins emerge at the intersection of inferences drawn from these data categories and emphasise the benefit of considering them in an evolutionary context. We then apply this approach to the cellular prion protein (PrPC), well known for its central role in prion diseases, whose function continues to be considered elusive by many investigators. We document that available data converge on the conclusion that the function of the prion protein is to control a critical post-translational modification of the neural cell adhesion molecule in the context of epithelial-to-mesenchymal transition and related plasticity programmes. Finally, we argue that this proposed function of PrPC has already passed the test of time and is concordant with the IDIP framework in a way that other functions considered for this protein fail to achieve. We anticipate that the IDIP framework and the concepts analysed herein will aid the investigation of other proteins whose primary functional assignments have thus far been intractable.
Affiliation(s)
- Gerold Schmitt-Ulms
  - Tanz Centre for Research in Neurodegenerative Diseases, University of Toronto, Toronto, ON, M5T 0S8, Canada
  - Department of Laboratory Medicine and Pathobiology, University of Toronto, Toronto, ON, M5S 1A8, Canada
- Declan Williams
  - Tanz Centre for Research in Neurodegenerative Diseases, University of Toronto, Toronto, ON, M5T 0S8, Canada
- Sepehr Ehsani
  - Theoretical and Philosophical Biology, Department of Philosophy, University College London, Bloomsbury, London, WC1E 6BT, U.K.
  - Ronin Institute for Independent Scholarship, Montclair, NJ, 07043, U.S.A.
7. Pathogen and Host-Pathogen Protein Interactions Provide a Key to Identify Novel Drug Targets. Systems Medicine 2021. [DOI: 10.1016/b978-0-12-801238-3.11607-1] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 01/14/2023] Open
8. Lugo-Martinez J, Bar-Joseph Z, Dengjel J, Murphy RF. Integration of Heterogeneous Experimental Data Improves Global Map of Human Protein Complexes. ACM-BCB ... ... : THE ... ACM CONFERENCE ON BIOINFORMATICS, COMPUTATIONAL BIOLOGY AND BIOMEDICINE. ACM CONFERENCE ON BIOINFORMATICS, COMPUTATIONAL BIOLOGY AND BIOMEDICINE 2019; 2019:144-153. [PMID: 32457940 DOI: 10.1145/3307339.3342150] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 12/12/2022]
Abstract
Protein complexes play a significant role in the core functionality of cells. These complexes are typically identified by detecting densely connected subgraphs in protein-protein interaction (PPI) networks. Recently, multiple large-scale mass spectrometry-based experiments have significantly increased the availability of PPI data in order to further expand the set of known complexes. However, high-throughput experimental data generally are incomplete, show limited agreement between experiments, and show frequent false positive interactions. There is a need for computational approaches that can address these limitations in order to improve the coverage and accuracy of human protein complexes. Here, we present a new method that integrates data from multiple heterogeneous experiments and sources in order to increase the reliability and coverage of predicted protein complexes. We first fused the heterogeneous data into a feature matrix and trained classifiers to score pairwise protein interactions. We next used graph based methods to combine pairwise interactions into predicted protein complexes. Our approach improves the accuracy and coverage of protein pairwise interactions, accurately identifies known complexes, and suggests both novel additions to known complexes and entirely new complexes. Our results suggest that integration of heterogeneous experimental data helps improve the reliability and coverage of diverse high-throughput mass-spectrometry experiments, leading to an improved global map of human protein complexes.
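The two steps named in this abstract, scoring protein pairs from a feature matrix and then combining pairs into complexes with graph methods, can be sketched as follows. None of this is the authors' method: a fixed-weight logistic function stands in for their trained classifiers, connected components stand in for their graph-based method, and the weights, threshold and feature values are invented.

```python
from math import exp

# Hypothetical, untrained parameters standing in for a fitted classifier.
WEIGHTS = (2.0, 1.5)
BIAS = -2.0


def pair_score(features):
    """Logistic score in (0, 1) for one protein pair's feature vector."""
    z = BIAS + sum(w * x for w, x in zip(WEIGHTS, features))
    return 1.0 / (1.0 + exp(-z))


def predicted_complexes(pair_features, threshold=0.5):
    """Keep pairs scoring at or above the threshold, then return the
    connected components with more than one member."""
    adjacency = {}
    for (a, b), features in pair_features.items():
        adjacency.setdefault(a, set())
        adjacency.setdefault(b, set())
        if pair_score(features) >= threshold:
            adjacency[a].add(b)
            adjacency[b].add(a)
    seen, complexes = set(), []
    for start in adjacency:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:  # depth-first traversal of one component
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        if len(component) > 1:
            complexes.append(component)
    return complexes


# Hypothetical feature matrix: one row per pair, two evidence sources per row.
toy_pairs = {
    ("A", "B"): (1.0, 1.0),
    ("B", "C"): (1.0, 0.5),
    ("C", "D"): (0.0, 0.2),
    ("D", "E"): (0.9, 0.9),
}
complexes = predicted_complexes(toy_pairs)
```

The weakly supported pair C-D falls below the threshold, so the toy network splits into the candidate complexes {A, B, C} and {D, E}.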
Affiliation(s)
- Jose Lugo-Martinez
  - Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA
- Ziv Bar-Joseph
  - Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA
- Jörn Dengjel
  - Department of Biology, Université de Fribourg, 1700 Fribourg, Switzerland
- Robert F Murphy
  - Computational Biology Department, Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, PA
9. Lange A, Schäfer A, Bender A, Steimle A, Beier S, Parusel R, Frick JS. Galleria mellonella: A Novel Invertebrate Model to Distinguish Intestinal Symbionts From Pathobionts. Front Immunol 2018; 9:2114. [PMID: 30283451 PMCID: PMC6156133 DOI: 10.3389/fimmu.2018.02114] [Citation(s) in RCA: 30] [Impact Index Per Article: 5.0] [Received: 06/12/2018] [Accepted: 08/28/2018] [Indexed: 12/19/2022] Open
Abstract
Insects and mammals share evolutionary conserved innate immune responses to maintain intestinal homeostasis. We investigated whether the larvae of the greater wax moth Galleria mellonella may be used as an experimental organism to distinguish between symbiotic Bacteroides vulgatus and pathobiotic Escherichia coli, which are mammalian intestinal commensals. Oral application of the symbiont or pathobiont to G. mellonella resulted in clearly distinguishable innate immune responses that could be verified by analyzing similar innate immune components in mice in vivo and in vitro. The differential innate immune responses were initiated by the recognition of bacterial components via pattern recognition receptors. The pathobiont detection resulted in increased expression of reactive oxygen and nitrogen species related genes as well as antimicrobial peptide gene expression. In contrast, the treatment/application with symbiotic bacteria led to weakened immune responses in both mammalian and insect models. As symbionts and pathobionts play a crucial role in development of inflammatory bowel diseases, we hence suggest G. mellonella as a future replacement organism in inflammatory bowel disease research.
Affiliation(s)
- Anna Lange
  - Department for Medical Microbiology and Hygiene, Interfacultary Institute for Microbiology and Infection Medicine, University of Tübingen, Tübingen, Germany
- Andrea Schäfer
  - Department for Medical Microbiology and Hygiene, Interfacultary Institute for Microbiology and Infection Medicine, University of Tübingen, Tübingen, Germany
- Annika Bender
  - Department for Medical Microbiology and Hygiene, Interfacultary Institute for Microbiology and Infection Medicine, University of Tübingen, Tübingen, Germany
- Alexander Steimle
  - Department for Medical Microbiology and Hygiene, Interfacultary Institute for Microbiology and Infection Medicine, University of Tübingen, Tübingen, Germany
- Sina Beier
  - Algorithms in Bioinformatics, ZBIT Center for Bioinformatics, University of Tübingen, Tübingen, Germany
- Raphael Parusel
  - Department for Medical Microbiology and Hygiene, Interfacultary Institute for Microbiology and Infection Medicine, University of Tübingen, Tübingen, Germany
- Julia-Stefanie Frick
  - Department for Medical Microbiology and Hygiene, Interfacultary Institute for Microbiology and Infection Medicine, University of Tübingen, Tübingen, Germany
10. Molinarolo S, Lee S, Leisle L, Lueck JD, Granata D, Carnevale V, Ahern CA. Cross-kingdom auxiliary subunit modulation of a voltage-gated sodium channel. J Biol Chem 2018; 293:4981-4992. [PMID: 29371400 PMCID: PMC5892571 DOI: 10.1074/jbc.ra117.000852] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 11/09/2017] [Revised: 01/17/2018] [Indexed: 02/04/2023] Open
Abstract
Voltage-gated, sodium ion-selective channels (NaV) generate electrical signals contributing to the upstroke of the action potential in animals. NaVs are also found in bacteria and are members of a larger family of tetrameric voltage-gated channels that includes CaVs, KVs, and NaVs. Prokaryotic NaVs likely emerged from a homotetrameric Ca2+-selective voltage-gated progenitor, and later developed Na+ selectivity independently. The NaV signaling complex in eukaryotes contains auxiliary proteins, termed beta (β) subunits, which are potent modulators of the expression profiles and voltage-gated properties of the NaV pore, but it is unknown whether they can functionally interact with prokaryotic NaV channels. Herein, we report that the eukaryotic NaVβ1-subunit isoform interacts with and enhances the surface expression as well as the voltage-dependent gating properties of the bacterial NaV, NaChBac, in Xenopus oocytes. A phylogenetic analysis of the β-subunit gene family proteins confirms that these proteins appeared roughly 420 million years ago and that they have no clear homologues in bacterial phyla. However, a comparison between eukaryotic and bacterial NaV structures highlighted the presence of a conserved fold, which could support interactions with the β-subunit. Our electrophysiological, biochemical, structural, and bioinformatics results suggest that the prerequisites for β-subunit regulation are an evolutionarily stable and intrinsic property of some voltage-gated channels.
Affiliation(s)
- Steven Molinarolo
  - Department of Molecular Physiology and Biophysics, Carver College of Medicine, Iowa Neuroscience Institute, University of Iowa, Iowa City, Iowa 52242
- Sora Lee
  - Weill Cornell Medical College, Cornell University, New York, New York 10065
- Lilia Leisle
  - Weill Cornell Medical College, Cornell University, New York, New York 10065
- John D Lueck
  - Department of Molecular Physiology and Biophysics, Carver College of Medicine, Iowa Neuroscience Institute, University of Iowa, Iowa City, Iowa 52242
- Daniele Granata
  - Institute for Computational Molecular Science, Temple University, Philadelphia, Pennsylvania 19122
- Vincenzo Carnevale
  - Institute for Computational Molecular Science, Temple University, Philadelphia, Pennsylvania 19122
- Christopher A Ahern
  - Department of Molecular Physiology and Biophysics, Carver College of Medicine, Iowa Neuroscience Institute, University of Iowa, Iowa City, Iowa 52242
11. Mutations at protein-protein interfaces: Small changes over big surfaces have large impacts on human health. Progress in Biophysics and Molecular Biology 2017; 128:3-13. [DOI: 10.1016/j.pbiomolbio.2016.10.002] [Citation(s) in RCA: 107] [Impact Index Per Article: 15.3] [Received: 04/30/2016] [Revised: 10/15/2016] [Accepted: 10/19/2016] [Indexed: 12/22/2022]
12. Ricci DP, Melfi MD, Lasker K, Dill DL, McAdams HH, Shapiro L. Cell cycle progression in Caulobacter requires a nucleoid-associated protein with high AT sequence recognition. Proc Natl Acad Sci U S A 2016; 113:E5952-E5961. [PMID: 27647925 PMCID: PMC5056096 DOI: 10.1073/pnas.1612579113] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.6] [Indexed: 12/24/2022] Open
Abstract
Faithful cell cycle progression in the dimorphic bacterium Caulobacter crescentus requires spatiotemporal regulation of gene expression and cell pole differentiation. We discovered an essential DNA-associated protein, GapR, that is required for Caulobacter growth and asymmetric division. GapR interacts with adenine and thymine (AT)-rich chromosomal loci, associates with the promoter regions of cell cycle-regulated genes, and shares hundreds of recognition sites with known master regulators of cell cycle-dependent gene expression. GapR target loci are especially enriched in binding sites for the transcription factors GcrA and CtrA and overlap with nearly all of the binding sites for MucR1, a regulator that controls the establishment of swarmer cell fate. Despite constitutive synthesis, GapR accumulates preferentially in the swarmer compartment of the predivisional cell. Homologs of GapR, which are ubiquitous among the α-proteobacteria and are encoded on multiple bacteriophage genomes, also accumulate in the predivisional cell swarmer compartment when expressed in Caulobacter. The Escherichia coli nucleoid-associated protein H-NS, like GapR, selectively associates with AT-rich DNA, yet it does not localize preferentially to the swarmer compartment when expressed exogenously in Caulobacter, suggesting that recognition of AT-rich DNA is not sufficient for the asymmetric accumulation of GapR. Further, GapR does not silence the expression of H-NS target genes when expressed in E. coli, suggesting that GapR and H-NS have distinct functions. We propose that Caulobacter has co-opted a nucleoid-associated protein with high AT recognition to serve as a mediator of cell cycle progression.
Affiliation(s)
- Dante P Ricci
  - Department of Developmental Biology, Stanford University, Stanford, CA 94305
- Michael D Melfi
  - Department of Developmental Biology, Stanford University, Stanford, CA 94305
  - Department of Chemistry, Stanford University, Stanford, CA 94305
- Keren Lasker
  - Department of Developmental Biology, Stanford University, Stanford, CA 94305
- David L Dill
  - Department of Computer Science, Stanford University, Stanford, CA 94305
- Harley H McAdams
  - Department of Developmental Biology, Stanford University, Stanford, CA 94305
- Lucy Shapiro
  - Department of Developmental Biology, Stanford University, Stanford, CA 94305
13. Kludas J, Arvas M, Castillo S, Pakula T, Oja M, Brouard C, Jäntti J, Penttilä M, Rousu J. Machine Learning of Protein Interactions in Fungal Secretory Pathways. PLoS One 2016; 11:e0159302. [PMID: 27441920 PMCID: PMC4956264 DOI: 10.1371/journal.pone.0159302] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 02/15/2016] [Accepted: 06/30/2016] [Indexed: 12/18/2022] Open
Abstract
In this paper we apply machine learning methods for predicting protein interactions in fungal secretion pathways. We assume an inter-species transfer setting, where training data is obtained from a single species and the objective is to predict protein interactions in other, related species. In our methodology, we combine several state-of-the-art machine learning approaches, namely, multiple kernel learning (MKL), pairwise kernels and kernelized structured output prediction in the supervised graph inference framework. For MKL, we apply recently proposed centered kernel alignment and p-norm path following approaches to integrate several feature sets describing the proteins, demonstrating improved performance. For graph inference, we apply input-output kernel regression (IOKR) in supervised and semi-supervised modes as well as output kernel trees (OK3). In our experiments simulating increasing genetic distance, IOKR proved to be the most robust prediction approach. We also show that the MKL approaches improve the predictions compared to uniform combination of the kernels. We evaluate the methods on the task of predicting protein-protein interactions in the secretion pathways of fungi, with S. cerevisiae (baker's yeast) as the source and T. reesei as the target of the inter-species transfer learning. We identify completely novel candidate secretion proteins conserved in filamentous fungi. These proteins could contribute to their unique secretion capabilities.
Affiliation(s)
- Jana Kludas
  - Helsinki Institute for Information Technology HIIT, Department of Computer Science, Aalto University, Espoo, Finland
- Mikko Arvas
  - VTT Technical Research Centre of Finland, Espoo, Finland
- Tiina Pakula
  - VTT Technical Research Centre of Finland, Espoo, Finland
- Merja Oja
  - VTT Technical Research Centre of Finland, Espoo, Finland
- Céline Brouard
  - Helsinki Institute for Information Technology HIIT, Department of Computer Science, Aalto University, Espoo, Finland
- Jussi Jäntti
  - VTT Technical Research Centre of Finland, Espoo, Finland
- Merja Penttilä
  - VTT Technical Research Centre of Finland, Espoo, Finland
- Juho Rousu
  - Helsinki Institute for Information Technology HIIT, Department of Computer Science, Aalto University, Espoo, Finland
|
14
|
Hou Q, Dutilh BE, Huynen MA, Heringa J, Feenstra KA. Sequence specificity between interacting and non-interacting homologs identifies interface residues--a homodimer and monomer use case. BMC Bioinformatics 2015; 16:325. [PMID: 26449222 PMCID: PMC4599308 DOI: 10.1186/s12859-015-0758-y] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Received: 03/30/2015] [Accepted: 09/30/2015] [Indexed: 11/17/2022] Open
Abstract
Background: Protein families participating in protein-protein interactions may contain sub-families that have different binding characteristics, ranging from tight binding to showing no interaction at all. Composition differences at the sequence level in these sub-families are often decisive to their differential functional interaction. Methods to predict interface sites from protein sequences typically exploit conservation as a signal. Here, instead, we provide proof of concept that the sequence specificity between interacting versus non-interacting groups can be exploited to recognise interaction sites. Results: We collected homodimeric and monomeric proteins and formed homologous groups, each having an interacting (homodimer) subgroup and a non-interacting (monomer) subgroup. We then compiled multiple sequence alignments of the proteins in the homologous groups and identified compositional differences between the homodimeric and monomeric subgroups for each of the alignment positions. Our results show that this specificity signal distinguishes interface from other surface residues with 40.9% recall and up to 25.1% precision. Conclusions: To the best of our knowledge, this is the first large-scale study that exploits sequence specificity between interacting and non-interacting homologs to predict interaction sites from sequence information only. The performance obtained indicates that this signal contains valuable information to identify protein-protein interaction sites. Electronic supplementary material: The online version of this article (doi:10.1186/s12859-015-0758-y) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Qingzhen Hou
  - Center for Integrative Bioinformatics VU (IBIVU), Vrije University Amsterdam, De Boelelaan 1081A, 1081 HV, Amsterdam, The Netherlands.
- Bas E Dutilh
  - Theoretical Biology and Bioinformatics, Utrecht University, Padualaan 8, 3584 CH, Utrecht, The Netherlands; Centre for Molecular and Biomolecular Informatics, Radboud Institute for Molecular Life Sciences, Radboud University Medical Centre, Geert Grooteplein 28, 6525 GA, Nijmegen, The Netherlands; Department of Marine Biology, Institute of Biology, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil.
- Martijn A Huynen
  - Centre for Molecular and Biomolecular Informatics, Radboud Institute for Molecular Life Sciences, Radboud University Medical Centre, Geert Grooteplein 28, 6525 GA, Nijmegen, The Netherlands.
- Jaap Heringa
  - Center for Integrative Bioinformatics VU (IBIVU), Vrije University Amsterdam, De Boelelaan 1081A, 1081 HV, Amsterdam, The Netherlands.
- K Anton Feenstra
  - Center for Integrative Bioinformatics VU (IBIVU), Vrije University Amsterdam, De Boelelaan 1081A, 1081 HV, Amsterdam, The Netherlands.
|
15
|
Goncearenco A, Shaytan AK, Shoemaker BA, Panchenko AR. Structural Perspectives on the Evolutionary Expansion of Unique Protein-Protein Binding Sites. Biophys J 2015. [PMID: 26213149 DOI: 10.1016/j.bpj.2015.06.056] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Indexed: 10/23/2022] Open
Abstract
Structures of protein complexes provide atomistic insights into protein interactions. Human proteins represent a quarter of all structures in the Protein Data Bank; however, available protein complexes cover less than 10% of the human proteome. Although it is theoretically possible to infer interactions in human proteins based on structures of homologous protein complexes, it is still unclear to what extent protein interactions and binding sites are conserved, and whether protein complexes from remotely related species can be used to infer interactions and binding sites. We considered biological units of protein complexes and clustered protein-protein binding sites into similarity groups based on their structure and sequence, which allowed us to identify unique binding sites. We showed that the growth rate of the number of unique binding sites in the Protein Data Bank was much slower than the growth rate of the number of structural complexes. Next, we investigated the evolutionary roots of unique binding sites and identified the major phyletic branches with the largest expansion in the number of novel binding sites. We found that many binding sites could be traced to the universal common ancestor of all cellular organisms, whereas relatively few binding sites emerged at the major evolutionary branching points. We analyzed the physicochemical properties of unique binding sites and found that the most ancient sites were the largest in size, involved many salt bridges, and were the most compact and least planar. In contrast, binding sites that appeared more recently in the evolution of eukaryotes were characterized by a larger fraction of polar and aromatic residues, and were less compact and more planar, possibly due to their more transient nature and roles in signaling processes.
Affiliation(s)
- Alexander Goncearenco
  - Computational Biology Branch of the National Center for Biotechnology Information, Bethesda, Maryland
- Alexey K Shaytan
  - Computational Biology Branch of the National Center for Biotechnology Information, Bethesda, Maryland
- Benjamin A Shoemaker
  - Computational Biology Branch of the National Center for Biotechnology Information, Bethesda, Maryland
- Anna R Panchenko
  - Computational Biology Branch of the National Center for Biotechnology Information, Bethesda, Maryland.
|
16
|
Abstract
Cells make decisions to differentiate, divide, or apoptose based on multiple signals of internal and external origin. These decisions are discrete outputs from dynamic networks comprised of signaling pathways. Yet the validity of this decomposition of regulatory proteins into distinct pathways is unclear because many regulatory proteins are pleiotropic and interact through cross-talk with components of other pathways. In addition to the deterministic complexity of interconnected networks, there is stochastic complexity arising from the fluctuations in concentrations of regulatory molecules. Even within a genetically identical population of cells grown in the same environment, cell-to-cell variations in mRNA and protein concentrations can be as high as 50% in yeast and even higher in mammalian cells. Thus, if everything is connected and stochastic, what hope could we have for a quantitative understanding of cellular decisions? Here we discuss the implications of recent advances in genomics, single-cell, and single-cell genomics technology for network modularity and cellular decisions. On the basis of these recent advances, we argue that most gene expression stochasticity and pathway interconnectivity is nonfunctional and that cellular decisions are likely much more predictable than previously expected.
Affiliation(s)
- Oguzhan Atay
  - Department of Biology, Stanford University, Stanford, CA 94305
- Jan M Skotheim
  - Department of Biology, Stanford University, Stanford, CA 94305
|
17
|
Korneenko TV, Pestov NB, Okkelman IA, Modyanov NN, Shakhparonov MI. [P4-ATP-ase Atp8b1/FIC1: structural properties and (patho)physiological functions]. Russian Journal of Bioorganic Chemistry 2015; 41:3-12. [PMID: 26050466 DOI: 10.1134/s1068162015010070] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/23/2022]
Abstract
P4-ATP-ases comprise an interesting family among P-type ATP-ases, since they are thought to play a major role in the transfer of phospholipids such as phosphatidylserine from the outer leaflet to the inner leaflet. Isoforms of P4-ATP-ases are partially interchangeable, but peculiarities of the tissue-specific expression of their genes, the intracellular localization of the proteins, and their regulatory pathways mean that, at the organismal level, serious pathologies may develop when certain isoforms carry structural abnormalities. Among P4-ATP-ases a special place is occupied by ATP8B1, for which several mutations are known that lead to serious hereditary diseases: two forms of congenital cholestasis (PFIC1 or Byler disease and benign recurrent intrahepatic cholestasis) with extrahepatic symptoms such as sensorineural hearing loss. The physiological function of the Atp8b1/FIC1 protein is known in general outline: it is responsible for the transport of certain phospholipids (phosphatidylserine, cardiolipin) from the outer monolayer of the plasma membrane to the inner one. It is well known that perturbation of membrane asymmetry, caused by the lack of Atp8b1 activity, leads to the death of hair cells of the inner ear and to dysfunction of bile acid transport in liver cells, which causes cirrhosis. It is also probable that insufficient activity of Atp8b1/FIC1 increases susceptibility to bacterial pneumonia. The regulatory pathways of Atp8b1/FIC1 activity in vivo remain insufficiently studied, which opens novel perspectives for research in this field that may allow a better understanding of the molecular processes behind the development of certain pathologies and reveal novel therapeutic targets.
|
18
|
Abstract
The assembly of individual proteins into functional complexes is fundamental to nearly all biological processes. In recent decades, many thousands of homomeric and heteromeric protein complex structures have been determined, greatly improving our understanding of the fundamental principles that control symmetric and asymmetric quaternary structure organization. Furthermore, our conception of protein complexes has moved beyond static representations to include dynamic aspects of quaternary structure, including conformational changes upon binding, multistep ordered assembly pathways, and structural fluctuations occurring within fully assembled complexes. Finally, major advances have been made in our understanding of protein complex evolution, both in reconstructing evolutionary histories of specific complexes and in elucidating general mechanisms that explain how quaternary structure tends to evolve. The evolution of quaternary structure occurs via changes in self-assembly state or through the gain or loss of protein subunits, and these processes can be driven by both adaptive and nonadaptive influences.
Affiliation(s)
- Joseph A Marsh
  - Medical Research Council Human Genetics Unit, Institute of Genetics and Molecular Medicine, University of Edinburgh, Western General Hospital, Edinburgh EH4 2XU, United Kingdom
|
19
|
Paulus JD, Link BA. Loss of optineurin in vivo results in elevated cell death and alters axonal trafficking dynamics. PLoS One 2014; 9:e109922. [PMID: 25329564 PMCID: PMC4199637 DOI: 10.1371/journal.pone.0109922] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.1] [Received: 06/20/2014] [Accepted: 09/12/2014] [Indexed: 12/11/2022] Open
Abstract
Mutations in Optineurin have been associated with ALS, glaucoma, and Paget’s disease of bone in humans, but little is known about how these mutations contribute to disease. Most of what is known about the cellular consequences of Optineurin loss has come from in vitro studies, and it remains unclear whether these same defects would be seen in vivo. To answer this question, we assessed the cellular consequences of Optineurin loss in zebrafish embryos to determine whether they showed the same defects as have been described in the in vitro studies. We found that loss of Optineurin resulted in increased cell death, as well as subtle defects in cell morphology, cell migration and vesicle trafficking. However, unlike experiments on cells in culture, we found no indication that the Golgi apparatus was disrupted or that NF-κB target genes were upregulated. Therefore, we conclude that in vivo loss of Optineurin shows some, but not all, of the defects seen in in vitro work.
Affiliation(s)
- Jeremiah D. Paulus
  - Department of Cell Biology, Neurobiology and Anatomy, Medical College of Wisconsin, Milwaukee, WI, United States of America
- Brian A. Link
  - Department of Cell Biology, Neurobiology and Anatomy, Medical College of Wisconsin, Milwaukee, WI, United States of America
|
20
|
Abstract
MOTIVATION: Biological network comparison software largely relies on the concept of alignment where close matches between the nodes of two or more networks are sought. These node matches are based on sequence similarity and/or interaction patterns. However, because of the incomplete and error-prone datasets currently available, such methods have had limited success. Moreover, the results of network alignment are in general not amenable for distance-based evolutionary analysis of sets of networks. In this article, we describe Netdis, a topology-based distance measure between networks, which offers the possibility of network phylogeny reconstruction. RESULTS: We first demonstrate that Netdis is able to correctly separate different random graph model types independent of network size and density. The biological applicability of the method is then shown by its ability to build the correct phylogenetic tree of species based solely on the topology of current protein interaction networks. Our results provide new evidence that the topology of protein interaction networks contains information about evolutionary processes, despite the lack of conservation of individual interactions. As Netdis is applicable to all networks because of its speed and simplicity, we apply it to a large collection of biological and non-biological networks where it clusters diverse networks by type. AVAILABILITY AND IMPLEMENTATION: The source code of the program is freely available at http://www.stats.ox.ac.uk/research/proteins/resources. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Waqar Ali
  - Department of Statistics, University of Oxford, Oxford OX1 3TG, UK and Molecular and Computational Biology Program, Department of Biological Sciences, University of Southern California, CA 90089-2910, USA
- Tiago Rito
  - Department of Statistics, University of Oxford, Oxford OX1 3TG, UK and Molecular and Computational Biology Program, Department of Biological Sciences, University of Southern California, CA 90089-2910, USA
- Gesine Reinert
  - Department of Statistics, University of Oxford, Oxford OX1 3TG, UK and Molecular and Computational Biology Program, Department of Biological Sciences, University of Southern California, CA 90089-2910, USA
- Fengzhu Sun
  - Department of Statistics, University of Oxford, Oxford OX1 3TG, UK and Molecular and Computational Biology Program, Department of Biological Sciences, University of Southern California, CA 90089-2910, USA
- Charlotte M Deane
  - Department of Statistics, University of Oxford, Oxford OX1 3TG, UK and Molecular and Computational Biology Program, Department of Biological Sciences, University of Southern California, CA 90089-2910, USA
|
21
|
Mulder NJ, Akinola RO, Mazandu GK, Rapanoel H. Using biological networks to improve our understanding of infectious diseases. Comput Struct Biotechnol J 2014; 11:1-10. [PMID: 25379138 PMCID: PMC4212278 DOI: 10.1016/j.csbj.2014.08.006] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.9] [Indexed: 11/17/2022] Open
Abstract
Infectious diseases are the leading cause of death, particularly in developing countries. Although many drugs are available for treating the most common infectious diseases, in many cases the mechanism of action of these drugs or even their targets in the pathogen remain unknown. In addition, the key factors or processes in pathogens that facilitate infection and disease progression are often not well understood. Since proteins do not work in isolation, understanding biological systems requires a better understanding of the interconnectivity between proteins in different pathways and processes, which includes both physical and other functional interactions. Such biological networks can be generated within organisms or between organisms sharing a common environment using experimental data and computational predictions. Though different data sources provide different levels of accuracy, confidence in interactions can be measured using interaction scores. Proteins and the connections between them can be represented as the nodes and edges of a graph, and thus studied using existing algorithms and tools from graph theory. There are many different applications of biological networks, and here we discuss three such applications, specifically applied to the infectious disease tuberculosis, with its causative agent Mycobacterium tuberculosis and host, Homo sapiens. The applications include the use of the networks for function prediction, comparison of networks for evolutionary studies, and the generation and use of host–pathogen interaction networks.
Affiliation(s)
- Nicola J Mulder
  - Computational Biology Group, Department of Clinical Laboratory Sciences, IDM, University of Cape Town Faculty of Health Sciences, Anzio Road, Observatory, Cape Town, South Africa
- Richard O Akinola
  - Computational Biology Group, Department of Clinical Laboratory Sciences, IDM, University of Cape Town Faculty of Health Sciences, Anzio Road, Observatory, Cape Town, South Africa
- Gaston K Mazandu
  - Computational Biology Group, Department of Clinical Laboratory Sciences, IDM, University of Cape Town Faculty of Health Sciences, Anzio Road, Observatory, Cape Town, South Africa
- Holifidy Rapanoel
  - Computational Biology Group, Department of Clinical Laboratory Sciences, IDM, University of Cape Town Faculty of Health Sciences, Anzio Road, Observatory, Cape Town, South Africa
|
22
|
Andreani J, Guerois R. Evolution of protein interactions: From interactomes to interfaces. Arch Biochem Biophys 2014; 554:65-75. [DOI: 10.1016/j.abb.2014.05.010] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.3] [Received: 02/10/2014] [Revised: 04/28/2014] [Accepted: 05/12/2014] [Indexed: 12/16/2022]
|
23
|
Goncearenco A, Shoemaker BA, Zhang D, Sarychev A, Panchenko AR. Coverage of protein domain families with structural protein-protein interactions: current progress and future trends. Progress in Biophysics and Molecular Biology 2014; 116:187-93. [PMID: 24931138 DOI: 10.1016/j.pbiomolbio.2014.05.005] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Received: 12/17/2013] [Revised: 04/14/2014] [Accepted: 05/17/2014] [Indexed: 11/16/2022]
Abstract
Protein interactions have evolved into highly precise and regulated networks adding an immense layer of complexity to cellular systems. The most accurate atomistic description of protein binding sites can be obtained directly from structures of protein complexes. The availability of structurally characterized protein interfaces significantly improves our understanding of interactomes, and the progress in structural characterization of protein-protein interactions (PPIs) can be measured by calculating the structural coverage of protein domain families. We analyze the coverage of protein domain families (defined according to CDD and Pfam databases) by structures, structural protein-protein complexes and unique protein binding sites. Structural PPI coverage of currently available protein families is about 30% without any signs of saturation in coverage growth dynamics. Given the current growth rates of domain databases and structural PPI deposition, complete domain coverage with PPIs is not expected in the near future. As a result of this study we identify families without any protein-protein interaction evidence (listed on a supporting website http://www.ncbi.nlm.nih.gov/Structure/ibis/coverage/) and propose them as potential targets for structural studies with a focus on protein interactions.
Affiliation(s)
- Alexander Goncearenco
  - Computational Biology Branch of the National Center for Biotechnology Information in Bethesda, Maryland, United States
- Benjamin A Shoemaker
  - Computational Biology Branch of the National Center for Biotechnology Information in Bethesda, Maryland, United States
- Dachuan Zhang
  - Computational Biology Branch of the National Center for Biotechnology Information in Bethesda, Maryland, United States
- Alexey Sarychev
  - Computational Biology Branch of the National Center for Biotechnology Information in Bethesda, Maryland, United States
- Anna R Panchenko
  - Computational Biology Branch of the National Center for Biotechnology Information in Bethesda, Maryland, United States.
|
24
|
Lua RC, Marciano DC, Katsonis P, Adikesavan AK, Wilkins AD, Lichtarge O. Prediction and redesign of protein-protein interactions. Progress in Biophysics and Molecular Biology 2014; 116:194-202. [PMID: 24878423 DOI: 10.1016/j.pbiomolbio.2014.05.004] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.2] [Received: 02/25/2014] [Revised: 05/02/2014] [Accepted: 05/17/2014] [Indexed: 12/14/2022]
Abstract
Understanding the molecular basis of protein function remains a central goal of biology, with the hope of elucidating the role of human genes in health and in disease, and of rationally designing therapies through targeted molecular perturbations. We review here some of the computational techniques and resources available for characterizing a critical aspect of protein function: that mediated by protein-protein interactions (PPIs). We describe several applications and recent successes of the Evolutionary Trace (ET) in identifying molecular events and shapes that underlie protein function and specificity in both eukaryotes and prokaryotes. ET is one of a set of analytical approaches based on the successes and failures of evolution that enable the rational control of PPIs.
Affiliation(s)
- Rhonald C Lua
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- David C Marciano
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Panagiotis Katsonis
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Anbu K Adikesavan
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Angela D Wilkins
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA; Computational and Integrative Biomedical Research Center, Baylor College of Medicine, Houston, TX 77030, USA
- Olivier Lichtarge
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA; Verna and Marrs McLean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, TX 77030, USA; Computational and Integrative Biomedical Research Center, Baylor College of Medicine, Houston, TX 77030, USA.
|
25
|
Marsh JA, Teichmann SA. Protein flexibility facilitates quaternary structure assembly and evolution. PLoS Biol 2014; 12:e1001870. [PMID: 24866000 PMCID: PMC4035275 DOI: 10.1371/journal.pbio.1001870] [Citation(s) in RCA: 65] [Impact Index Per Article: 6.5] [Received: 11/25/2013] [Accepted: 04/17/2014] [Indexed: 11/25/2022] Open
Abstract
The flexibility of individual proteins aids their evolutionary recruitment into complexes with increasing numbers of distinct subunits.
The intrinsic flexibility of proteins allows them to undergo large conformational fluctuations in solution or upon interaction with other molecules. Proteins also commonly assemble into complexes with diverse quaternary structure arrangements. Here we investigate how the flexibility of individual protein chains influences the assembly and evolution of protein complexes. We find that flexibility appears to be particularly conducive to the formation of heterologous (i.e., asymmetric) intersubunit interfaces. This leads to a strong association between subunit flexibility and homomeric complexes with cyclic and asymmetric quaternary structure topologies. Similarly, we also observe that the more nonhomologous subunits that assemble together within a complex, the more flexible those subunits tend to be. Importantly, these findings suggest that subunit flexibility should be closely related to the evolutionary history of a complex. We confirm this by showing that evolutionarily more recent subunits are generally more flexible than evolutionarily older subunits. Finally, we investigate the very different explorations of quaternary structure space that have occurred in different evolutionary lineages. In particular, the increased flexibility of eukaryotic proteins appears to enable the assembly of heteromeric complexes with more unique components.
Proteins often interact with other proteins and assemble into complexes. Here we show that the flexibility of individual proteins is important for their recruitment to complexes, as it facilitates the formation of asymmetric interfaces between different subunits. The role of flexibility becomes increasingly important as a greater number of distinct proteins are packed together within a single complex: the more distinct subunits, the more flexible those subunits need to be. A consequence of this is that, when a protein complex gains a new subunit during evolution, the newer subunit will tend to be more flexible than the older subunits. This suggests that we may be able to partially reconstruct the evolutionary history of a protein complex by considering the flexibility of its subunits. We also find that the types of protein complexes an organism forms are closely related to the flexibility of its proteins, with eukaryotic species, and particularly animals, using their increased flexibility to assemble complexes involving more distinct components.
Affiliation(s)
- Joseph A. Marsh
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom
- Sarah A. Teichmann
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom
  - Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom
|
26
|
Synthetic genetic array screen identifies PP2A as a therapeutic target in Mad2-overexpressing tumors. Proc Natl Acad Sci U S A 2014; 111:1628-33. [PMID: 24425774 DOI: 10.1073/pnas.1315588111] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.9] [Indexed: 12/22/2022] Open
Abstract
The spindle checkpoint is essential to ensure proper chromosome segregation and thereby maintain genomic stability. Mitotic arrest deficiency 2 (Mad2), a critical component of the spindle checkpoint, is overexpressed in many cancer cells. Thus, we hypothesized that Mad2 overexpression could specifically make cancer cells susceptible to death by inducing a synthetic dosage lethality defect. Because the spindle checkpoint pathway is highly conserved between yeast and humans, we performed a synthetic genetic array analysis in yeast, which revealed that Mad2 overexpression induced lethality in 13 gene deletions. Among the human homologs of candidate genes, knockdown of PPP2R1A, a gene encoding a constant regulatory subunit of protein phosphatase 2, significantly inhibited the growth of Mad2-overexpressing tumor cells. PPP2R1A inhibition induced Mad2 phosphorylation and suppressed Mad2 protein levels. Depletion of PPP2R1A inhibited colony formation of Mad2-overexpressing HeLa cells but not of unphosphorylated Mad2 mutant-overexpressing cells, suggesting that the lethality induced by PP2A depletion in Mad2-overexpressing cells is dependent on Mad2 phosphorylation. Also, the PP2A inhibitor cantharidin induced Mad2 phosphorylation and inhibited the growth of Mad2-overexpressing cancer cells. Aurora B knockdown inhibited Mad2 phosphorylation in mitosis, resulting in the blocking of PPP2R1A inhibition-induced cell death. Taken together, our results strongly suggest that PP2A is a good therapeutic target in Mad2-overexpressing tumors.
|
27
|
Folador EL, Hassan SS, Lemke N, Barh D, Silva A, Ferreira RS, Azevedo V. An improved interolog mapping-based computational prediction of protein–protein interactions with increased network coverage. Integr Biol (Camb) 2014; 6:1080-7. [DOI: 10.1039/c4ib00136b] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Indexed: 12/22/2022]
Abstract
Automated and efficient methods that map ortholog interactions from several organisms and public databases (pDB) are needed to identify new interactions in an organism of interest (interolog mapping).
Affiliation(s)
- Edson Luiz Folador
  - Department of General Biology, Instituto de Ciências Biológicas (ICB), Federal University of Minas Gerais (UFMG), Belo Horizonte, Brazil
- Syed Shah Hassan
  - Department of General Biology, Instituto de Ciências Biológicas (ICB), Federal University of Minas Gerais (UFMG), Belo Horizonte, Brazil
- Ney Lemke
  - Laboratory of Bioinformatic and Computational Biofisic, Instituto de Biociência, Universidade Estadual de São Paulo (UNESP), Botucatu, Brazil
- Debmalya Barh
  - Centre for Genomics and Applied Gene Technology, Institute of Integrative Omics and Applied Biotechnology (IIOAB), Purba Medinipur, India
- Artur Silva
  - Instituto de Ciências Biológicas, Universidade Federal do Para, Belém, Brazil
- Rafaela Salgado Ferreira
  - Department of Biochemistry and Immunology, Federal University of Minas Gerais (UFMG), Belo Horizonte, Brazil
- Vasco Azevedo
  - Department of General Biology, Instituto de Ciências Biológicas (ICB), Federal University of Minas Gerais (UFMG), Belo Horizonte, Brazil
|
28
|
Abrusán G. Integration of new genes into cellular networks, and their structural maturation. Genetics 2013; 195:1407-17. [PMID: 24056411 PMCID: PMC3832282 DOI: 10.1534/genetics.113.152256] [Citation(s) in RCA: 45] [Impact Index Per Article: 4.1] [Received: 04/15/2013] [Accepted: 08/27/2013] [Indexed: 12/21/2022] Open
Abstract
It has been recently discovered that new genes can originate de novo from noncoding DNA, and several biological traits including expression or sequence composition form a continuum from noncoding sequences to conserved genes. In this article, using yeast genes I test whether the integration of new genes into cellular networks and their structural maturation shows such a continuum by analyzing their changes with gene age. I show that 1) the number of regulatory, protein-protein, and genetic interactions increases continuously with gene age, although with very different rates. New regulatory interactions emerge rapidly within a few million years, while the number of protein-protein and genetic interactions increases slowly, with a rate of 2-2.25 × 10⁻⁸/year and 4.8 × 10⁻⁸/year, respectively. 2) Gene essentiality evolves relatively quickly: the youngest essential genes appear in proto-genes ∼14 MY old. 3) In contrast to interactions, the secondary structure of proteins and their robustness to mutations indicate that new genes face a bottleneck in their evolution: proto-genes are characterized by high β-strand content, high aggregation propensity, and low robustness against mutations, while conserved genes are characterized by lower strand content and higher stability, most likely due to the higher probability of gene loss among young genes and accumulation of neutral mutations.
Affiliation(s)
- György Abrusán
- Synthetic and Systems Biology Unit, Institute of Biochemistry, Biological Research Centre of the Hungarian Academy of Sciences, Szeged H-6701, Hungary
|
29
|
Abstract
Protein interaction networks are important for the understanding of regulatory mechanisms, for the explanation of experimental data and for the prediction of protein functions. Unfortunately, most interaction data is available only for model organisms. As a possible remedy, the transfer of interactions to organisms of interest is common practice, but it is not clear when interactions can be transferred from one organism to another and, thus, the confidence in the derived interactions is low. Here, we propose to use a rich set of features to train Random Forests in order to score transferred interactions. We evaluated the transfer from a range of eukaryotic organisms to S. cerevisiae using orthologs. Directly transferred interactions to S. cerevisiae are on average only 24% consistent with the current S. cerevisiae interaction network. By using commonly applied filter approaches the transfer precision can be improved, but at the cost of a large decrease in the number of transferred interactions. Our Random Forest approach uses various features derived from both the target and the source network as well as the ortholog annotations to assign confidence values to transferred interactions. Thereby, we could increase the average transfer consistency to 85%, while still transferring almost 70% of all correctly transferable interactions. We tested our approach for the transfer of interactions to other species and showed that it outperforms competing methods for the transfer of interactions to species where no experimental knowledge is available. Finally, we applied our predictor to score transferred interactions to 83 target species and we were able to extend the available interactome of B. taurus, M. musculus and G. gallus with over 40,000 interactions each.
Our transferred interaction networks are publicly available via our web interface, which allows users to inspect and download transferred interaction sets of different sizes, for various species, and at specified expected precision levels. AVAILABILITY: http://services.bio.ifi.lmu.de/coin-db/.
Affiliation(s)
- Robert Pesch
- Institute for Informatics, Ludwig-Maximilians-Universität München, Munich, Germany
- Ralf Zimmer
- Institute for Informatics, Ludwig-Maximilians-Universität München, Munich, Germany
|
30
|
Csermely P, Korcsmáros T, Kiss HJM, London G, Nussinov R. Structure and dynamics of molecular networks: a novel paradigm of drug discovery: a comprehensive review. Pharmacol Ther 2013; 138:333-408. [PMID: 23384594 PMCID: PMC3647006 DOI: 10.1016/j.pharmthera.2013.01.016] [Citation(s) in RCA: 506] [Impact Index Per Article: 46.0] [Received: 01/22/2013] [Accepted: 01/22/2013] [Indexed: 02/02/2023]
Abstract
Despite considerable progress in genome- and proteome-based high-throughput screening methods and in rational drug design, the increase in approved drugs in the past decade did not match the increase of drug development costs. Network description and analysis not only give a systems-level understanding of drug action and disease complexity, but can also help to improve the efficiency of drug design. We give a comprehensive assessment of the analytical tools of network topology and dynamics. The state-of-the-art use of chemical similarity, protein structure, protein-protein interaction, signaling, genetic interaction and metabolic networks in the discovery of drug targets is summarized. We propose that network targeting follows two basic strategies. The "central hit strategy" selectively targets central nodes/edges of the flexible networks of infectious agents or cancer cells to kill them. The "network influence strategy" works against other diseases, where an efficient reconfiguration of rigid networks needs to be achieved by targeting the neighbors of central nodes/edges. It is shown how network techniques can help in the identification of single-target, edgetic, multi-target and allo-network drug target candidates. We review the recent boom in network methods helping hit identification, lead selection optimizing drug efficacy, as well as minimizing side-effects and drug toxicity. Successful network-based drug development strategies are shown through the examples of infections, cancer, metabolic diseases, neurodegenerative diseases and aging. Summarizing >1200 references we suggest an optimized protocol of network-aided drug development, and provide a list of systems-level hallmarks of drug quality. Finally, we highlight network-related drug development trends helping to achieve these hallmarks by a cohesive, global approach.
Affiliation(s)
- Peter Csermely
- Department of Medical Chemistry, Semmelweis University, P.O. Box 260, H-1444 Budapest 8, Hungary.
|
31
|
Kolář M, Meier J, Mustonen V, Lässig M, Berg J. GraphAlignment: Bayesian pairwise alignment of biological networks. BMC Syst Biol 2012; 6:144. [PMID: 23171476 PMCID: PMC3573967 DOI: 10.1186/1752-0509-6-144] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 05/10/2012] [Accepted: 11/07/2012] [Indexed: 11/10/2022]
Abstract
BACKGROUND: With increased experimental availability and accuracy of bio-molecular networks, tools for their comparative and evolutionary analysis are needed. A key component for such studies is the alignment of networks. RESULTS: We introduce the Bioconductor package GraphAlignment for pairwise alignment of bio-molecular networks. The alignment incorporates information both from network vertices and network edges and is based on an explicit evolutionary model, allowing inference of all scoring parameters directly from empirical data. We compare the performance of our algorithm to an alternative algorithm, Græmlin 2.0. On simulated data, GraphAlignment outperforms Græmlin 2.0 in several benchmarks except for computational complexity. When there is little or no noise in the data, GraphAlignment is slower than Græmlin 2.0. It is faster than Græmlin 2.0 when processing noisy data containing spurious vertex associations. Its typical case complexity grows approximately as O(N^2.6). On empirical bacterial protein-protein interaction networks (PIN) and gene co-expression networks, GraphAlignment outperforms Græmlin 2.0 with respect to coverage and specificity, albeit by a small margin. On large eukaryotic PIN, Græmlin 2.0 outperforms GraphAlignment. CONCLUSIONS: The GraphAlignment algorithm is robust to spurious vertex associations, correctly resolves paralogs, and shows very good performance in identification of homologous vertices defined by high vertex and/or interaction similarity. The simplicity and generality of GraphAlignment edge scoring makes the algorithm an appropriate choice for global alignment of networks.
Affiliation(s)
- Michal Kolář
- Institut für Theoretische Physik, Universität zu Köln, Zülpicher Straße 77, D-50937 Köln, Germany
|