1
Varshney N, Mishra AK. Deep Learning in Phosphoproteomics: Methods and Application in Cancer Drug Discovery. Proteomes 2023; 11:16. [PMID: 37218921 DOI: 10.3390/proteomes11020016] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 02/13/2023] [Revised: 04/24/2023] [Accepted: 04/25/2023] [Indexed: 05/24/2023] Open
Abstract
Protein phosphorylation is a key post-translational modification (PTM) that is a central regulatory mechanism of many cellular signaling pathways. Several protein kinases and phosphatases precisely control this biochemical process. Defects in the functions of these proteins have been implicated in many diseases, including cancer. Mass spectrometry (MS)-based analysis of biological samples provides in-depth coverage of the phosphoproteome. The large amount of MS data available in public repositories has brought big data to the field of phosphoproteomics. To address the challenges associated with handling large data sets and to increase confidence in phosphorylation site prediction, the development of computational algorithms and machine learning-based approaches has gained momentum in recent years. Together, the emergence of experimental methods with high resolution and sensitivity and of data mining algorithms has provided robust analytical platforms for quantitative proteomics. In this review, we compile a comprehensive collection of bioinformatic resources used for the prediction of phosphorylation sites, and their potential therapeutic applications in the context of cancer.
Affiliation(s)
- Neha Varshney
  - Division of Biological Sciences, Department of Cellular and Molecular Medicine, University of California, San Diego, CA 92093, USA
  - Ludwig Institute for Cancer Research, La Jolla, CA 92093, USA
- Abhinava K Mishra
  - Molecular, Cellular and Developmental Biology Department, University of California, Santa Barbara, CA 93106, USA
2
Oyarzun P, Kashyap M, Fica V, Salas-Burgos A, Gonzalez-Galarza FF, McCabe A, Jones AR, Middleton D, Kobe B. A Proteome-Wide Immunoinformatics Tool to Accelerate T-Cell Epitope Discovery and Vaccine Design in the Context of Emerging Infectious Diseases: An Ethnicity-Oriented Approach. Front Immunol 2021; 12:598778. [PMID: 33717077 PMCID: PMC7952308 DOI: 10.3389/fimmu.2021.598778] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 08/25/2020] [Accepted: 01/11/2021] [Indexed: 01/06/2023] Open
Abstract
Emerging infectious diseases (EIDs) caused by viruses are increasing in frequency, causing a high disease burden and mortality world-wide. The COVID-19 pandemic caused by the novel SARS-like coronavirus (SARS-CoV-2) underscores the need to innovate and accelerate the development of effective vaccination strategies against EIDs. Human leukocyte antigen (HLA) molecules play a central role in the immune system by determining the peptide repertoire displayed to the T-cell compartment. Genetic polymorphisms of the HLA system thus confer a strong variability in vaccine-induced immune responses and may complicate the selection of vaccine candidates, because the distribution and frequencies of HLA alleles are highly variable among different ethnic groups. Herein, we build on the emerging paradigm of rational epitope-based vaccine design, by describing an immunoinformatics tool (Predivac-3.0) for proteome-wide T-cell epitope discovery that accounts for ethnic-level variations in immune responsiveness. Predivac-3.0 implements both CD8+ and CD4+ T-cell epitope predictions based on HLA allele frequencies retrieved from the Allele Frequency Net Database. The tool was thoroughly assessed, proving comparable performances (AUC ~0.9) against four state-of-the-art pan-specific immunoinformatics methods capable of population-level analysis (NetMHCPan-4.0, Pickpocket, PSSMHCPan and SMM), as well as a strong accuracy on proteome-wide T-cell epitope predictions for HIV-specific immune responses in the Japanese population. The utility of the method was investigated for the COVID-19 pandemic, by performing in silico T-cell epitope mapping of the SARS-CoV-2 spike glycoprotein according to the ethnic context of the countries where the ChAdOx1 vaccine is currently initiating phase III clinical trials. 
Potentially immunodominant CD8+ and CD4+ T-cell epitopes and population coverages were predicted for each population (the Epitope Discovery mode), along with optimized sets of broadly recognized (promiscuous) T-cell epitopes maximizing coverage in the target populations (the Epitope Optimization mode). Population-specific epitope-rich regions (T-cell epitope clusters) were further predicted in protein antigens based on combined criteria of epitope density and population coverage. Overall, we conclude that Predivac-3.0 holds potential to contribute to the understanding of ethnic-level variations in vaccine-induced immune responsiveness and to guide the development of epitope-based next-generation vaccines against emerging pathogens, whose geographic distributions and populations in need of vaccination are often well defined for regional epidemics.
Affiliation(s)
- Patricio Oyarzun
  - Facultad de Ingeniería y Tecnología, Universidad San Sebastián, Sede Concepción, Concepción, Chile
- Manju Kashyap
  - Facultad de Ingeniería y Tecnología, Universidad San Sebastián, Sede Concepción, Concepción, Chile
- Victor Fica
  - Facultad de Ingeniería y Tecnología, Universidad San Sebastián, Sede Concepción, Concepción, Chile
- Faviel F Gonzalez-Galarza
  - Center for Biomedical Research, Faculty of Medicine, Autonomous University of Coahuila, Torreon, Mexico
- Antony McCabe
  - Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool, United Kingdom
- Andrew R Jones
  - Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool, United Kingdom
- Derek Middleton
  - Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool, United Kingdom
- Bostjan Kobe
  - School of Chemistry and Molecular Biosciences, Institute for Molecular Bioscience and Australian Infectious Diseases Research Centre, University of Queensland, Brisbane, QLD, Australia
3
Bradley D, Viéitez C, Rajeeve V, Selkrig J, Cutillas PR, Beltrao P. Sequence and Structure-Based Analysis of Specificity Determinants in Eukaryotic Protein Kinases. Cell Rep 2021; 34:108602. [PMID: 33440154 PMCID: PMC7809594 DOI: 10.1016/j.celrep.2020.108602] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 08/20/2018] [Revised: 11/03/2020] [Accepted: 12/14/2020] [Indexed: 01/04/2023] Open
Abstract
Protein kinases lie at the heart of cell-signaling processes and are often mutated in disease. Kinase target recognition at the active site is in part determined by a few amino acids around the phosphoacceptor residue. However, relatively little is known about how most preferences are encoded in the kinase sequence or how these preferences evolved. Here, we used alignment-based approaches to predict 30 specificity-determining residues (SDRs) for 16 preferences. These were studied with structural models and were validated by activity assays of mutant kinases. Cancer mutation data revealed that kinase SDRs are mutated more frequently than catalytic residues. We have observed that, throughout evolution, kinase specificity has been strongly conserved across orthologs but can diverge after gene duplication, as illustrated by the G protein-coupled receptor kinase family. The identified SDRs can be used to predict kinase specificity from sequence and aid in the interpretation of evolutionary or disease-related genomic variants.
Affiliation(s)
- David Bradley
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Cambridge CB10 1SD, UK
- Cristina Viéitez
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Cambridge CB10 1SD, UK; European Molecular Biology Laboratory (EMBL), Genome Biology Unit, 69117 Heidelberg, Germany
- Vinothini Rajeeve
  - Integrative Cell Signalling & Proteomics, Centre for Haemato-Oncology, Barts Cancer Institute, Queen Mary University of London, Charterhouse Square, London EC1M 6BQ, UK
- Joel Selkrig
  - European Molecular Biology Laboratory (EMBL), Genome Biology Unit, 69117 Heidelberg, Germany
- Pedro R Cutillas
  - Integrative Cell Signalling & Proteomics, Centre for Haemato-Oncology, Barts Cancer Institute, Queen Mary University of London, Charterhouse Square, London EC1M 6BQ, UK
- Pedro Beltrao
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Cambridge CB10 1SD, UK
4
Savage SR, Zhang B. Using phosphoproteomics data to understand cellular signaling: a comprehensive guide to bioinformatics resources. Clin Proteomics 2020; 17:27. [PMID: 32676006 PMCID: PMC7353784 DOI: 10.1186/s12014-020-09290-x] [Citation(s) in RCA: 30] [Impact Index Per Article: 7.5] [Received: 12/01/2019] [Accepted: 07/04/2020] [Indexed: 12/19/2022] Open
Abstract
Mass spectrometry-based phosphoproteomics is becoming an essential methodology for the study of global cellular signaling. Numerous bioinformatics resources are available to facilitate the translation of phosphopeptide identification and quantification results into novel biological and clinical insights, a critical step in phosphoproteomics data analysis. These resources include knowledge bases of kinases and phosphatases, phosphorylation sites, kinase inhibitors, and sequence variants affecting kinase function, and bioinformatics tools that can predict phosphorylation sites in addition to the kinase that phosphorylates them, infer kinase activity, and predict the effect of mutations on kinase signaling. However, these resources exist in silos and it is challenging to select among multiple resources with similar functions. Therefore, we put together a comprehensive collection of resources related to phosphoproteomics data interpretation, compared the use of tools with similar functions, and assessed the usability from the standpoint of typical biologists or clinicians. Overall, tools could be improved by standardization of enzyme names, flexibility of data input and output format, consistent maintenance, and detailed manuals.
Affiliation(s)
- Sara R. Savage
  - Department of Biomedical Informatics, Vanderbilt University, Nashville, TN USA
  - Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, TX USA
- Bing Zhang
  - Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, TX USA
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX USA
5
Pérez-Mejías G, Velázquez-Cruz A, Guerra-Castellano A, Baños-Jaime B, Díaz-Quintana A, González-Arzola K, Ángel De la Rosa M, Díaz-Moreno I. Exploring protein phosphorylation by combining computational approaches and biochemical methods. Comput Struct Biotechnol J 2020; 18:1852-1863. [PMID: 32728408 PMCID: PMC7369424 DOI: 10.1016/j.csbj.2020.06.043] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 03/30/2020] [Revised: 06/29/2020] [Accepted: 06/30/2020] [Indexed: 12/14/2022] Open
Abstract
Post-translational modifications of proteins expand their functional diversity, regulating the response of cells to a variety of stimuli. Among these modifications, phosphorylation is the most ubiquitous and plays a prominent role in cell signaling. The addition of a phosphate often affects the function of a protein by altering its structure and dynamics. However, these alterations are often difficult to study and the functional and structural implications remain unresolved. New approaches are emerging to overcome common obstacles related to the production and manipulation of these samples. Here, we summarize the available methods for phosphoprotein purification and phosphomimetic engineering, highlighting the advantages and disadvantages of each. We propose a general workflow for protein phosphorylation analysis combining computational and biochemical approaches, building on recent advances that enable user-friendly and easy-to-access Molecular Dynamics simulations. We hope this innovative workflow will inform the best experimental approach to explore such post-translational modifications. We have applied this workflow to two different human protein models: the hemeprotein cytochrome c and the RNA binding protein HuR. Our results illustrate the usefulness of Molecular Dynamics as a decision-making tool to design the most appropriate phosphomimetic variant.
Affiliation(s)
- Gonzalo Pérez-Mejías
  - Instituto de Investigaciones Químicas (IIQ), Centro de Investigaciones Científicas Isla de la Cartuja (cicCartuja), Universidad de Sevilla, Consejo Superior de Investigaciones Científicas (CSIC), Avda. Américo Vespucio 49, Sevilla 41092, Spain
- Alejandro Velázquez-Cruz
  - Instituto de Investigaciones Químicas (IIQ), Centro de Investigaciones Científicas Isla de la Cartuja (cicCartuja), Universidad de Sevilla, Consejo Superior de Investigaciones Científicas (CSIC), Avda. Américo Vespucio 49, Sevilla 41092, Spain
- Alejandra Guerra-Castellano
  - Instituto de Investigaciones Químicas (IIQ), Centro de Investigaciones Científicas Isla de la Cartuja (cicCartuja), Universidad de Sevilla, Consejo Superior de Investigaciones Científicas (CSIC), Avda. Américo Vespucio 49, Sevilla 41092, Spain
- Blanca Baños-Jaime
  - Instituto de Investigaciones Químicas (IIQ), Centro de Investigaciones Científicas Isla de la Cartuja (cicCartuja), Universidad de Sevilla, Consejo Superior de Investigaciones Científicas (CSIC), Avda. Américo Vespucio 49, Sevilla 41092, Spain
- Antonio Díaz-Quintana
  - Instituto de Investigaciones Químicas (IIQ), Centro de Investigaciones Científicas Isla de la Cartuja (cicCartuja), Universidad de Sevilla, Consejo Superior de Investigaciones Científicas (CSIC), Avda. Américo Vespucio 49, Sevilla 41092, Spain
- Katiuska González-Arzola
  - Instituto de Investigaciones Químicas (IIQ), Centro de Investigaciones Científicas Isla de la Cartuja (cicCartuja), Universidad de Sevilla, Consejo Superior de Investigaciones Científicas (CSIC), Avda. Américo Vespucio 49, Sevilla 41092, Spain
- Miguel Ángel De la Rosa
  - Instituto de Investigaciones Químicas (IIQ), Centro de Investigaciones Científicas Isla de la Cartuja (cicCartuja), Universidad de Sevilla, Consejo Superior de Investigaciones Científicas (CSIC), Avda. Américo Vespucio 49, Sevilla 41092, Spain
- Irene Díaz-Moreno
  - Instituto de Investigaciones Químicas (IIQ), Centro de Investigaciones Científicas Isla de la Cartuja (cicCartuja), Universidad de Sevilla, Consejo Superior de Investigaciones Científicas (CSIC), Avda. Américo Vespucio 49, Sevilla 41092, Spain
6
Deznabi I, Arabaci B, Koyutürk M, Tastan O. DeepKinZero: zero-shot learning for predicting kinase-phosphosite associations involving understudied kinases. Bioinformatics 2020; 36:3652-3661. [PMID: 32044914 PMCID: PMC7320620 DOI: 10.1093/bioinformatics/btaa013] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 08/05/2019] [Revised: 12/17/2019] [Accepted: 01/06/2020] [Indexed: 12/24/2022] Open
Abstract
MOTIVATION: Protein phosphorylation is a key regulator of protein function in signal transduction pathways. Kinases are the enzymes that catalyze the phosphorylation of other proteins in a target-specific manner. The dysregulation of phosphorylation is associated with many diseases including cancer. Although the advances in phosphoproteomics enable the identification of phosphosites at the proteome level, most of the phosphoproteome is still in the dark: more than 95% of the reported human phosphosites have no known kinases. Determining which kinase is responsible for phosphorylating a site remains an experimental challenge. Existing computational methods require several examples of known targets of a kinase to make accurate kinase-specific predictions, yet for a large body of kinases, only a few or no target sites are reported.
RESULTS: We present DeepKinZero, the first zero-shot learning approach to predict the kinase acting on a phosphosite for kinases with no known phosphosite information. DeepKinZero transfers knowledge from kinases with many known target phosphosites to those kinases with no known sites through a zero-shot learning model. The kinase-specific positional amino acid preferences are learned using a bidirectional recurrent neural network. We show that DeepKinZero achieves significant improvement in accuracy for kinases with no known phosphosites in comparison to the baseline model and other methods available. By expanding our knowledge on understudied kinases, DeepKinZero can help to chart the phosphoproteome atlas.
AVAILABILITY AND IMPLEMENTATION: The source codes are available at https://github.com/Tastanlab/DeepKinZero.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Iman Deznabi
  - Computer Engineering Department, Bilkent University, Ankara 06800, Turkey
  - College of Information and Computer Sciences, University of Massachusetts, Amherst, MA 01003, USA
- Busra Arabaci
  - Computer Engineering Department, Bilkent University, Ankara 06800, Turkey
- Mehmet Koyutürk
  - Department of Computer and Data Sciences
  - Center for Proteomics & Bioinformatics, Case Western Reserve University, Cleveland, OH 44106, USA
- Oznur Tastan
  - Faculty of Engineering and Natural Sciences, Sabanci University, Istanbul 34956, Turkey
7
McCormick JW, Pincus D, Resnekov O, Reynolds KA. Strategies for Engineering and Rewiring Kinase Regulation. Trends Biochem Sci 2019; 45:259-271. [PMID: 31866305 DOI: 10.1016/j.tibs.2019.11.005] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 09/04/2019] [Revised: 11/13/2019] [Accepted: 11/15/2019] [Indexed: 12/31/2022]
Abstract
Eukaryotic protein kinases (EPKs) catalyze the transfer of a phosphate group onto another protein in response to appropriate regulatory cues. In doing so, they provide a primary means for cellular information transfer. Consequently, EPKs play crucial roles in cell differentiation and cell-cycle progression, and kinase dysregulation is associated with numerous disease phenotypes including cancer. Nonnative cues for synthetically regulating kinases are thus much sought after, both for dissecting cell signaling pathways and for pharmaceutical development. In recent years advances in protein engineering and sequence analysis have led to new approaches for manipulating kinase activity, localization, and in some instances specificity. These tools have revealed fundamental principles of intracellular signaling and suggest paths forward for the design of therapeutic allosteric kinase regulators.
Affiliation(s)
- James W McCormick
  - The Green Center for Systems Biology, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA; Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
- David Pincus
  - Department of Molecular Genetics and Cell Biology, University of Chicago, Chicago, IL 60637, USA; Center for Physics of Evolving Systems, University of Chicago, Chicago, IL 60637, USA
- Kimberly A Reynolds
  - The Green Center for Systems Biology, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA; Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
8
Abstract
Proteomics and phosphoproteomics have been emerging as new dimensions of omics. Phosphorylation has a profound impact on the biological functions and applications of proteins. It influences everything from intrinsic activity and extrinsic executions to cellular localization. This post-translational modification has been studied in detail and, with the advent of faster instrumentation, has become an object of analytical curiosity. The major strength of phosphoproteomic research is that it gives an overall picture of the workforce of the cell. Phosphoproteomics gives deeper insight into the mechanisms behind the development and progression of a disease. For the first time, this review consolidates the list of existing bioinformatics tools developed for phosphoproteomics. The gap between the development of bioinformatics tools and their implementation in clinical research is highlighted. The main challenge to progress is believed to be the interdisciplinary nature of this field of research. For meaningful solutions and deliverables, these tools need to be implemented in clinical studies to answer pharmacodynamic questions, saving time, cost and energy. This review hopes to provoke some thought in this direction.
9
Cheng A, Grant CE, Noble WS, Bailey TL. MoMo: discovery of statistically significant post-translational modification motifs. Bioinformatics 2019; 35:2774-2782. [PMID: 30596994 PMCID: PMC6691336 DOI: 10.1093/bioinformatics/bty1058] [Citation(s) in RCA: 120] [Impact Index Per Article: 24.0] [Received: 09/05/2018] [Revised: 12/14/2018] [Accepted: 12/26/2018] [Indexed: 01/05/2023] Open
Abstract
MOTIVATION: Post-translational modifications (PTMs) of proteins are associated with many significant biological functions and can be identified in high throughput using tandem mass spectrometry. Many PTMs are associated with short sequence patterns called 'motifs' that help localize the modifying enzyme. Accordingly, many algorithms have been designed to identify these motifs from mass spectrometry data. Accurate statistical confidence estimates for discovered motifs are critically important for proper interpretation and in the design of downstream experimental validation.
RESULTS: We describe a method for assigning statistical confidence estimates to PTM motifs, and we demonstrate that this method provides accurate P-values on both simulated and real data. Our methods are implemented in MoMo, a software tool for discovering motifs among sets of PTMs that we make available as a web server and as downloadable source code. MoMo re-implements the two most widely used PTM motif discovery algorithms (motif-x and MoDL) while offering many enhancements. Relative to motif-x, MoMo offers improved statistical confidence estimates and more accurate calculation of motif scores. The MoMo web server offers more proteome databases, more input formats, larger inputs and longer running times than the motif-x web server. Finally, our study demonstrates that the confidence estimates produced by motif-x are inaccurate. This inaccuracy stems in part from the common practice of drawing 'background' peptides from an unshuffled proteome database. Our results thus suggest that many of the papers that use motif-x to find motifs may be reporting results that lack statistical support.
AVAILABILITY AND IMPLEMENTATION: The MoMo web server and source code are provided at http://meme-suite.org.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Alice Cheng
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Charles E Grant
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA
- William S Noble
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA
  - Department of Computer Science and Engineering, University of Washington, Seattle, WA, USA
10
Cao M, Chen G, Yu J, Shi S. Computational prediction and analysis of species-specific fungi phosphorylation via feature optimization strategy. Brief Bioinform 2018; 21:595-608. [DOI: 10.1093/bib/bby122] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 08/02/2018] [Revised: 11/16/2018] [Accepted: 11/22/2018] [Indexed: 11/12/2022] Open
Abstract
Protein phosphorylation is a reversible and ubiquitous post-translational modification that primarily occurs at serine, threonine and tyrosine residues and regulates a variety of biological processes. In this paper, we first briefly summarize current progress in the computational prediction of eukaryotic protein phosphorylation sites, which has mainly focused on animals and plants, especially human, and to a lesser extent on fungi. Since the number of identified fungal phosphorylation sites has greatly increased across a wide variety of organisms and their roles in pathological physiology remain largely unknown, more attention has been paid to the identification of fungi-specific phosphorylation. Here, experimental fungal phosphorylation site data were collected, and most of the sites were classified into different types, encoded with various features and trained via a two-step feature optimization method. A novel method for the prediction of species-specific fungi phosphorylation, PreSSFP, was developed, which can identify fungal phosphorylation in seven species for specific serine, threonine and tyrosine residues (http://computbiol.ncu.edu.cn/PreSSFP). We also critically evaluated the performance of PreSSFP and compared it with other existing tools. The results showed that PreSSFP is a robust predictor. Feature analyses showed some significant differences among the seven species. Species-specific prediction, using the two-step feature optimization method to mine important features for training, could considerably improve prediction performance. We anticipate that our study provides a new lead for future computational analysis of fungi phosphorylation.
Affiliation(s)
- Man Cao
  - Department of Mathematics and Numerical Simulation and High-Performance Computing Laboratory, School of Sciences, Nanchang University, Nanchang, China
- Guodong Chen
  - Department of Mathematics and Numerical Simulation and High-Performance Computing Laboratory, School of Sciences, Nanchang University, Nanchang, China
- Jialin Yu
  - Department of Mathematics and Numerical Simulation and High-Performance Computing Laboratory, School of Sciences, Nanchang University, Nanchang, China
- Shaoping Shi
  - Department of Mathematics and Numerical Simulation and High-Performance Computing Laboratory, School of Sciences, Nanchang University, Nanchang, China
11
Marholz LJ, Zeringo NA, Lou HJ, Turk BE, Parker LL. In Silico Design and in Vitro Characterization of Universal Tyrosine Kinase Peptide Substrates. Biochemistry 2018. [PMID: 29528224 DOI: 10.1021/acs.biochem.8b00044] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 12/14/2022]
Abstract
A majority of the 90 human protein tyrosine kinases (PTKs) are understudied "orphan" enzymes with few or no known substrates. Designing experiments aimed at assaying the catalytic activity of these PTKs has been a long-running problem. In the past, researchers have used polypeptides with a randomized 4:1 molar ratio of glutamic acid to tyrosine as general PTK substrates. However, these substrates are inefficient and perform poorly for many applications. In this work, we apply the KINATEST-ID pipeline for artificial kinase substrate discovery to design a set of candidate "universal" PTK peptide substrate sequences. We identified two unique peptide sequences from this set that had robust activity with a panel of 15 PTKs tested in an initial screen. Kinetic characterization with seven receptor and nonreceptor PTKs confirmed these peptides to be efficient and general PTK substrates. The broad scope of these artificial substrates demonstrates that they should be useful as tools for probing understudied PTK activity.
Affiliation(s)
- Laura J Marholz
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, 420 Washington Avenue Southeast, Minneapolis, Minnesota 55455, United States
- Nicholas A Zeringo
  - Department of Pharmacology, Yale University School of Medicine, P.O. Box 208066, 333 Cedar Street, New Haven, Connecticut 06520, United States
- Hua Jane Lou
  - Department of Pharmacology, Yale University School of Medicine, P.O. Box 208066, 333 Cedar Street, New Haven, Connecticut 06520, United States
- Benjamin E Turk
  - Department of Pharmacology, Yale University School of Medicine, P.O. Box 208066, 333 Cedar Street, New Haven, Connecticut 06520, United States
- Laurie L Parker
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, 420 Washington Avenue Southeast, Minneapolis, Minnesota 55455, United States
12
Karasev DA, Veselova DA, Veselovsky AV, Sobolev BN, Zgoda VG, Archakov AI. Spatial features of proteins related to their phosphorylation and associated structural changes. Proteins 2017; 86:13-20. [DOI: 10.1002/prot.25397] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 03/29/2017] [Revised: 09/13/2017] [Accepted: 10/04/2017] [Indexed: 11/06/2022]
Affiliation(s)
- Dmitry A. Karasev
  - Department of Bioinformatics; Institute of Biomedical Chemistry (IBMC); Moscow Russia
  - Department of Biochemistry; Pirogov Russian National Research Medical University (RNRMU); Moscow Russia
- Darya A. Veselova
  - Department of Bioinformatics; Institute of Biomedical Chemistry (IBMC); Moscow Russia
- Boris N. Sobolev
  - Department of Bioinformatics; Institute of Biomedical Chemistry (IBMC); Moscow Russia
- Victor G. Zgoda
  - Department of Bioinformatics; Institute of Biomedical Chemistry (IBMC); Moscow Russia
- Alexander I. Archakov
  - Department of Bioinformatics; Institute of Biomedical Chemistry (IBMC); Moscow Russia
13
Karabulut NP, Frishman D. Sequence- and Structure-Based Analysis of Tissue-Specific Phosphorylation Sites. PLoS One 2016; 11:e0157896. [PMID: 27332813 PMCID: PMC4917084 DOI: 10.1371/journal.pone.0157896] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 03/20/2016] [Accepted: 06/07/2016] [Indexed: 01/22/2023] Open
Abstract
Phosphorylation is the most widespread and well studied reversible posttranslational modification. Discovering tissue-specific preferences of phosphorylation sites is important as phosphorylation plays a role in regulating almost every cellular activity and disease state. Here we present a comprehensive analysis of global and tissue-specific sequence and structure properties of phosphorylation sites utilizing recent proteomics data. We identified tissue-specific motifs in both sequence and spatial environments of phosphorylation sites. Target site preferences of kinases across tissues indicate that, while many kinases mediate phosphorylation in all tissues, there are also kinases that exhibit more tissue-specific preferences which, notably, are not caused by tissue-specific kinase expression. We also demonstrate that many metabolic pathways are differentially regulated by phosphorylation in different tissues.
Affiliation(s)
- Nermin Pinar Karabulut
- Department of Genome Oriented Bioinformatics, Technische Universität München, Freising, Germany
- Dmitrij Frishman
- Department of Genome Oriented Bioinformatics, Technische Universität München, Freising, Germany
- Helmholtz Zentrum Munich; German Research Center for Environmental Health (GmbH), Institute of Bioinformatics and Systems Biology, Neuherberg, Germany
- St Petersburg State Polytechnical University, St Petersburg, Russia
14
Peppelenbosch MP, Frijns N, Fuhler G. Systems medicine approaches for peptide array-based protein kinase profiling: progress and prospects. Expert Rev Proteomics 2016; 13:571-8. [PMID: 27241729 DOI: 10.1080/14789450.2016.1187564] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Indexed: 12/18/2022]
Abstract
INTRODUCTION Pharmacological manipulation of signalling pathways is becoming an increasingly important avenue for the rational clinical management of disease but is hampered by a lack of technologies that allow the generation of comprehensive descriptions of cellular signalling. AREAS COVERED Herein, the authors discuss the potential of peptide array-based kinome profiling for evaluating cellular signalling in the context of drug discovery. EXPERT COMMENTARY Genomic and proteomic approaches have been of significant value to our elucidation of the molecular mechanisms that govern physiology. However, an equally, if not more, important goal is to define those proteins that participate in signalling pathways that ultimately control cell fate, especially kinases. Traditional genetic and biochemical approaches can certainly provide answers here but, for technical and practical reasons, are typically pursued one gene or pathway at a time. A more comprehensive approach is one in which peptide arrays of kinase-specific substrates are incubated with cell lysates and (33)P-ATP, generating comprehensive descriptions, or one in which arrays are interrogated with phosphospecific antibodies. Both approaches allow analysis of cellular signalling without a priori assumptions about which pathways may be affected.
Affiliation(s)
- Gwenny Fuhler
- Erasmus MC, Erasmus MC Cancer Institute, Rotterdam, Zuid-Holland, Netherlands
15
Róna G, Borsos M, Ellis JJ, Mehdi AM, Christie M, Környei Z, Neubrandt M, Tóth J, Bozóky Z, Buday L, Madarász E, Bodén M, Kobe B, Vértessy BG. Dynamics of re-constitution of the human nuclear proteome after cell division is regulated by NLS-adjacent phosphorylation. Cell Cycle 2015; 13:3551-64. [PMID: 25483092 DOI: 10.4161/15384101.2014.960740] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.2] [Indexed: 11/19/2022] Open
Abstract
Phosphorylation by the cyclin-dependent kinase 1 (Cdk1) adjacent to nuclear localization signals (NLSs) is an important mechanism of regulation of nucleocytoplasmic transport. However, no systematic survey has yet been performed in human cells to analyze this regulatory process, and the corresponding cell-cycle dynamics have not yet been investigated. Here, we focused on the human proteome and found that numerous proteins, previously not identified in this context, are associated with Cdk1-dependent phosphorylation sites adjacent to their NLSs. Interestingly, these proteins are involved in key regulatory events of DNA repair, epigenetics, or RNA editing and splicing. This finding indicates that cell-cycle dependent events of genome editing and gene expression profiling may be controlled by nucleocytoplasmic trafficking. For in-depth investigations, we selected a number of these proteins and analyzed how point mutations, expected to modify the phosphorylation ability of the NLS segments, perturb nucleocytoplasmic localization. In each case, we found that mutations mimicking hyper-phosphorylation abolish nuclear import processes. To understand the mechanism underlying these phenomena, we performed a video microscopy-based kinetic analysis to obtain information on cell-cycle dynamics on a model protein, dUTPase. We show that the NLS-adjacent phosphorylation by Cdk1 of human dUTPase, an enzyme essential for genomic integrity, results in dynamic cell cycle-dependent distribution of the protein. Non-phosphorylatable mutants have drastically altered protein re-import characteristics into the nucleus during the G1 phase. Our results suggest a dynamic Cdk1-driven mechanism of regulation of the nuclear proteome composition during the cell cycle.
Key Words
- Cdc28, cyclin-dependent protein kinase (Cdk) encoded by CDC28
- Cdk1, cyclin-dependent kinase 1
- GO, gene ontology
- NES, nuclear export signal
- NLS, nuclear localization signal
- SNP, single nucleotide polymorphisms
- SV40, Simian virus 40
- UBA1, Ubiquitin-activating enzyme E1
- UNG2, Human Uracil-DNA glycosylase 2
- cNLS, classical nuclear localization signal
- cell cycle
- dNTP, deoxyribonucleotide triphosphate
- dTTP, deoxythymidine triphosphate
- dUMP, deoxyuridine monophosphate
- dUTP, deoxyuridine triphosphate
- dUTPase
- importin
- phosphorylation
- trafficking
Affiliation(s)
- Gergely Róna
- Institute of Enzymology; RCNS; Hungarian Academy of Sciences; Budapest, Hungary
16
Pin1: Intimate involvement with the regulatory protein kinase networks in the global phosphorylation landscape. Biochim Biophys Acta Gen Subj 2015; 1850:2077-86. [PMID: 25766872 DOI: 10.1016/j.bbagen.2015.02.018] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Received: 01/07/2015] [Revised: 02/25/2015] [Accepted: 02/27/2015] [Indexed: 02/06/2023]
Abstract
BACKGROUND Protein phosphorylation is a universal regulatory mechanism that involves an extensive network of protein kinases. The discovery of the phosphorylation-dependent peptidyl-prolyl isomerase Pin1 added an additional layer of complexity to these regulatory networks. SCOPE OF REVIEW We have evaluated interactions between Pin1 and the regulatory kinome and proline-dependent phosphoproteome taking into consideration findings from targeted studies as well as data that has emerged from systematic phosphoproteomic workflows and from curated protein interaction databases. MAJOR CONCLUSIONS The relationship between Pin1 and the regulatory protein kinase networks is not restricted simply to the recognition of proteins that are substrates for proline-directed kinases. In this respect, Pin1 itself is phosphorylated in cells by protein kinases that modulate its functional properties. Furthermore, the phosphorylation-dependent targets of Pin1 include a number of protein kinases as well as other enzymes such as phosphatases and regulatory subunits of kinases that modulate the actions of protein kinases. GENERAL SIGNIFICANCE As a result of its interactions with numerous protein kinases and their substrates, as well as itself being a target for phosphorylation, Pin1 has an intricate relationship with the regulatory protein kinase and phosphoproteomic networks that orchestrate complex cellular processes and respond to environmental cues. This article is part of a Special Issue entitled Proline-directed Foldases: Cell Signaling Catalysts and Drug Targets.
17
Huang SY, Shi SP, Qiu JD, Liu MC. Using support vector machines to identify protein phosphorylation sites in viruses. J Mol Graph Model 2015; 56:84-90. [DOI: 10.1016/j.jmgm.2014.12.005] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 04/18/2014] [Revised: 12/13/2014] [Accepted: 12/16/2014] [Indexed: 10/24/2022]
18
Abstract
The succession of protein activation and deactivation mediated by phosphorylation and dephosphorylation events constitutes a key mechanism of molecular information transfer in cellular systems. To deduce the details of those molecular information cascades and networks has been a central goal pursued by both experimental and computational approaches. Many computational network reconstruction methods employing an array of different statistical learning methods have been developed to infer phosphorylation networks based on different types of molecular data sets such as protein sequence, protein structure, or phosphoproteomics data. In this chapter, different computational network inference methods and resources for biological network reconstruction with a particular focus on phosphorylation networks are surveyed.
19
Patrick R, Lê Cao KA, Kobe B, Bodén M. PhosphoPICK: modelling cellular context to map kinase-substrate phosphorylation events. Bioinformatics 2014; 31:382-9. [PMID: 25304781 DOI: 10.1093/bioinformatics/btu663] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.8] [Indexed: 12/17/2022]
Abstract
MOTIVATION The determinants of kinase-substrate phosphorylation can be found both in the substrate sequence and the surrounding cellular context. Cell cycle progression, interactions with mediating proteins and even prior phosphorylation events are necessary for kinases to maintain substrate specificity. While much work has focussed on the use of sequence-based methods to predict phosphorylation sites, there has been very little work invested into the application of systems biology to understand phosphorylation. Lack of specificity in many kinase substrate binding motifs means that sequence methods for predicting kinase binding sites are susceptible to high false-positive rates. RESULTS We present here a model that takes into account protein-protein interaction information, and protein abundance data across the cell cycle to predict kinase substrates for 59 human kinases that are representative of important biological pathways. The model shows high accuracy for substrate prediction (with an average AUC of 0.86) across the 59 kinases tested. When using the model to complement sequence-based kinase-specific phosphorylation site prediction, we found that the additional information increased prediction performance for most comparisons made, particularly on kinases from the CMGC family. We then used our model to identify functional overlaps between predicted CDK2 substrates and targets from the E2F family of transcription factors. Our results demonstrate that a model harnessing context data can account for the short-falls in sequence information and provide a robust description of the cellular events that regulate protein phosphorylation. AVAILABILITY AND IMPLEMENTATION The method is freely available online as a web server at the website http://bioinf.scmb.uq.edu.au/phosphopick. CONTACT m.boden@uq.edu.au SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
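The AUC reported in this abstract can be read as the probability that a randomly chosen true substrate receives a higher score than a randomly chosen non-substrate. The sketch below shows that calculation under stated assumptions: the function name `auc` and the example scores and labels are hypothetical illustration values, not PhosphoPICK code or output.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney U).

    scores: predicted scores, higher means more likely a substrate.
    labels: 1 for known substrates, 0 for non-substrates.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical scores for one kinase (illustration only).
example_scores = [0.9, 0.7, 0.6, 0.4, 0.2]
example_labels = [1, 1, 0, 1, 0]
print(auc(example_scores, example_labels))
```

An average over kinases, as quoted in the abstract, would simply be the mean of this value computed separately for each kinase.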
Affiliation(s)
- Ralph Patrick
- School of Chemistry and Molecular Biosciences and Queensland Facility for Advanced Bioinformatics, The University of Queensland, St Lucia 4072, Translational Research Institute, The University of Queensland Diamantina Institute, Brisbane, St Lucia 4102, Institute for Molecular Bioscience and Australian Infectious Diseases Research Centre, The University of Queensland, St Lucia, 4072, Australia
- Kim-Anh Lê Cao
- School of Chemistry and Molecular Biosciences and Queensland Facility for Advanced Bioinformatics, The University of Queensland, St Lucia 4072, Translational Research Institute, The University of Queensland Diamantina Institute, Brisbane, St Lucia 4102, Institute for Molecular Bioscience and Australian Infectious Diseases Research Centre, The University of Queensland, St Lucia, 4072, Australia
- Bostjan Kobe
- School of Chemistry and Molecular Biosciences and Queensland Facility for Advanced Bioinformatics, The University of Queensland, St Lucia 4072, Translational Research Institute, The University of Queensland Diamantina Institute, Brisbane, St Lucia 4102, Institute for Molecular Bioscience and Australian Infectious Diseases Research Centre, The University of Queensland, St Lucia, 4072, Australia
- Mikael Bodén
- School of Chemistry and Molecular Biosciences and Queensland Facility for Advanced Bioinformatics, The University of Queensland, St Lucia 4072, Translational Research Institute, The University of Queensland Diamantina Institute, Brisbane, St Lucia 4102, Institute for Molecular Bioscience and Australian Infectious Diseases Research Centre, The University of Queensland, St Lucia, 4072, Australia
20
Palmeri A, Ferrè F, Helmer-Citterich M. Exploiting holistic approaches to model specificity in protein phosphorylation. Front Genet 2014; 5:315. [PMID: 25324856 PMCID: PMC4179730 DOI: 10.3389/fgene.2014.00315] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Received: 06/06/2014] [Accepted: 08/21/2014] [Indexed: 12/27/2022] Open
Abstract
Phosphate plays a chemically unique role in shaping cellular signaling of all current living systems, especially eukaryotes. Protein phosphorylation has been studied at several levels, from the near-site context, both in sequence and structure, to the crowded cellular environment, and ultimately to the systems-level perspective. Despite the tremendous advances in mass spectrometry and efforts dedicated to the development of ad hoc highly sophisticated methods, phosphorylation site inference and associated kinase identification are still unresolved problems in kinome biology. The sequence and structure of the substrate near-site context are not sufficient alone to model the in vivo phosphorylation rules, and they should be integrated with orthogonal information in all possible applications. Here we provide an overview of the different contexts that contribute to protein phosphorylation, discussing their potential impact in phosphorylation site annotation and in predicting kinase-substrate specificity.
Affiliation(s)
- Antonio Palmeri
- Department of Biology, Centre for Molecular Bioinformatics, University of Rome Tor Vergata, Rome, Italy
- Fabrizio Ferrè
- Department of Biology, Centre for Molecular Bioinformatics, University of Rome Tor Vergata, Rome, Italy
- Manuela Helmer-Citterich
- Department of Biology, Centre for Molecular Bioinformatics, University of Rome Tor Vergata, Rome, Italy
21
Santra T, Kolch W, Kholodenko BN. Navigating the multilayered organization of eukaryotic signaling: a new trend in data integration. PLoS Comput Biol 2014; 10:e1003385. [PMID: 24550716 PMCID: PMC3923657 DOI: 10.1371/journal.pcbi.1003385] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Indexed: 11/18/2022] Open
Abstract
The ever-increasing capacity of biological molecular data acquisition outpaces our ability to understand the meaningful relationships between molecules in a cell. Multiple databases were developed to store and organize these molecular data. However, emerging fundamental questions about concerted functions of these molecules in hierarchical cellular networks are poorly addressed. Here we review recent advances in the development of publicly available databases that help us analyze the signal integration and processing by multilayered networks that specify biological responses in model organisms and human cells.
Affiliation(s)
- Tapesh Santra
- Systems Biology Ireland, University College Dublin, Belfield, Dublin, Ireland
- Walter Kolch
- Systems Biology Ireland, University College Dublin, Belfield, Dublin, Ireland
- Conway Institute of Biomolecular and Biomedical Research, University College Dublin, Belfield, Dublin, Ireland
- School of Medicine and Medical Science, University College Dublin, Belfield, Dublin, Ireland
- Boris N. Kholodenko
- Systems Biology Ireland, University College Dublin, Belfield, Dublin, Ireland
- Conway Institute of Biomolecular and Biomedical Research, University College Dublin, Belfield, Dublin, Ireland
- School of Medicine and Medical Science, University College Dublin, Belfield, Dublin, Ireland
22
Roux PP, Thibault P. The coming of age of phosphoproteomics--from large data sets to inference of protein functions. Mol Cell Proteomics 2013; 12:3453-64. [PMID: 24037665 DOI: 10.1074/mcp.r113.032862] [Citation(s) in RCA: 76] [Impact Index Per Article: 6.9] [Indexed: 12/14/2022] Open
Abstract
Protein phosphorylation is one of the most common post-translational modifications used in signal transduction to control cell growth, proliferation, and survival in response to both intracellular and extracellular stimuli. This modification is finely coordinated by a network of kinases and phosphatases that recognize unique sequence motifs and/or mediate their functions through scaffold and adaptor proteins. Detailed information on the nature of kinase substrates and site-specific phosphoregulation is required in order for one to better understand their pathophysiological roles. Recent advances in affinity chromatography and mass spectrometry (MS) sensitivity have enabled the large-scale identification and profiling of protein phosphorylation, but appropriate follow-up experiments are required in order to ascertain the functional significance of identified phosphorylation sites. In this review, we present meaningful technical details for MS-based phosphoproteomic analyses and describe important considerations for the selection of model systems and the functional characterization of identified phosphorylation sites.
Affiliation(s)
- Philippe P Roux
- Institute for Research in Immunology and Cancer, Université de Montréal, P.O. Box 6128, Station Centre-ville, Montréal, Québec H3C 3J7, Canada
23
Oyarzún P, Ellis JJ, Bodén M, Kobe B. PREDIVAC: CD4+ T-cell epitope prediction for vaccine design that covers 95% of HLA class II DR protein diversity. BMC Bioinformatics 2013; 14:52. [PMID: 23409948 PMCID: PMC3598884 DOI: 10.1186/1471-2105-14-52] [Citation(s) in RCA: 44] [Impact Index Per Article: 4.0] [Received: 10/11/2012] [Accepted: 01/31/2013] [Indexed: 12/18/2022] Open
Abstract
Background CD4+ T-cell epitopes play a crucial role in eliciting vigorous protective immune responses during peptide (epitope)-based vaccination. The prediction of these epitopes focuses on the peptide binding process by MHC class II proteins. The ability to account for MHC class II polymorphism is critical for epitope-based vaccine design tools, as different allelic variants can have different peptide repertoires. In addition, the specificity of CD4+ T-cells is often directed to a very limited set of immunodominant peptides in pathogen proteins. The ability to predict what epitopes are most likely to dominate an immune response remains a challenge. Results We developed the computational tool Predivac to predict CD4+ T-cell epitopes. Predivac can make predictions for 95% of all MHC class II protein variants (allotypes), a substantial advance over other available methods. Predivac bases its prediction on the concept of specificity-determining residues. The performance of the method was assessed both for high-affinity HLA class II peptide binding and CD4+ T-cell epitope prediction. In terms of epitope prediction, Predivac outperformed three available pan-specific approaches (delivering the highest specificity). A central finding was the high accuracy delivered by the method in the identification of immunodominant and promiscuous CD4+ T-cell epitopes, which play an essential role in epitope-based vaccine design. Conclusions The comprehensive HLA class II allele coverage along with the high specificity in identifying immunodominant CD4+ T-cell epitopes makes Predivac a valuable tool to aid epitope-based vaccine design in the context of a genetically heterogeneous human population. The tool is available at: http://predivac.biosci.uq.edu.au/.
Affiliation(s)
- Patricio Oyarzún
- School of Chemistry and Molecular Biosciences, Institute for Molecular Bioscience and Australian Infectious Diseases Research Centre, University of Queensland, Brisbane, QLD 4072, Australia.
24
Sim CH, Gabriel K, Mills RD, Culvenor JG, Cheng HC. Analysis of the regulatory and catalytic domains of PTEN-induced kinase-1 (PINK1). Hum Mutat 2012; 33:1408-22. [DOI: 10.1002/humu.22127] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.3] [Received: 01/14/2012] [Accepted: 05/15/2012] [Indexed: 01/23/2023]
25
The importance of conserved features of yeast actin-binding protein 1 (Abp1p): the conditional nature of essentiality. Genetics 2012; 191:1199-211. [PMID: 22661326 DOI: 10.1534/genetics.112.141739] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Indexed: 11/18/2022] Open
Abstract
Saccharomyces cerevisiae Actin-Binding Protein 1 (Abp1p) is a member of the Abp1 family of proteins, which are found in diverse organisms including fungi, nematodes, flies, and mammals. All proteins in this family possess an N-terminal Actin Depolymerizing Factor Homology (ADF-H) domain, a central Proline-Rich Region (PRR), and a C-terminal SH3 domain. In this study, we employed sequence analysis to identify additional conserved features of the family, including sequences rich in proline, glutamic acid, serine, and threonine amino acids (PEST), which are found in all family members examined, and two motifs, Conserved Fungal Motifs 1 and 2 (CFM1 and CFM2), that are conserved in fungi. We also discovered that, similar to its mammalian homologs, Abp1p is phosphorylated in its PRR. This phosphorylation is mediated by the Cdc28p and Pho85p kinases, and it protects Abp1p from proteolysis mediated by the conserved PEST sequences. We provide evidence for an intramolecular interaction between the PRR region and SH3 domain that may be affected by phosphorylation. Although deletion of CFM1 alone caused no detectable phenotype in any genetic background or condition tested, deletion of this motif resulted in a significant reduction of growth when it was combined with a deletion of the ADF-H domain. Importantly, this result demonstrates that deletion of highly conserved domains on its own may produce no phenotype unless the domains are assayed in conjunction with deletions of other functionally important elements within the same protein. Detection of this type of intragenic synthetic lethality provides an important approach for understanding the function of individual protein domains or motifs.
26
Palmeri A, Gherardini PF, Tsigankov P, Ausiello G, Späth GF, Zilberstein D, Helmer-Citterich M. PhosTryp: a phosphorylation site predictor specific for parasitic protozoa of the family trypanosomatidae. BMC Genomics 2011; 12:614. [PMID: 22182631 PMCID: PMC3285042 DOI: 10.1186/1471-2164-12-614] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.4] [Received: 07/29/2011] [Accepted: 12/19/2011] [Indexed: 01/13/2023] Open
Abstract
Background Protein phosphorylation modulates protein function in organisms at all levels of complexity. Parasites of the Leishmania genus undergo various developmental transitions in their life cycle triggered by changes in the environment. The molecular mechanisms that these organisms use to process and integrate these external cues are largely unknown. However, Leishmania lacks transcription factors; therefore, most regulatory processes may occur at a post-translational level, and phosphorylation has recently been demonstrated to be an important player in this process. Experimental identification of phosphorylation sites is a time-consuming task. Moreover, some sites could be missed due to the highly dynamic nature of this process or to difficulties in phospho-peptide enrichment. Results Here we present PhosTryp, a phosphorylation site predictor specific for trypanosomatids. This method uses an SVM-based approach and has been trained with recent Leishmania phosphoproteomics data. PhosTryp achieved a 17% improvement in prediction performance compared with NetPhos, a non-organism-specific predictor. The analysis of the peptides correctly predicted by our method but missed by NetPhos demonstrates that PhosTryp captures Leishmania-specific phosphorylation features. More specifically, our results show that Leishmania kinases have sequence specificities which are different from their counterparts in higher eukaryotes. Consequently, we were able to propose two possible Leishmania-specific phosphorylation motifs. We further demonstrate that this improvement in performance extends to the related trypanosomatids Trypanosoma brucei and Trypanosoma cruzi. Finally, in order to maximize the usefulness of PhosTryp, we trained a predictor combining all the peptides from L. infantum, T. brucei and T. cruzi. Conclusions Our work demonstrates that training on organism-specific data results in an improvement that extends to related species.
PhosTryp is freely available at http://phostryp.bio.uniroma2.it
Affiliation(s)
- Antonio Palmeri
- Centre for Molecular Bioinformatics, Department of Biology, University of Rome Tor Vergata, Via della Ricerca Scientifica, Rome
27
Proteomic databases and tools to decipher post-translational modifications. J Proteomics 2011; 75:127-44. [DOI: 10.1016/j.jprot.2011.09.014] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.5] [Received: 05/28/2011] [Revised: 09/14/2011] [Accepted: 09/18/2011] [Indexed: 01/10/2023]
28
Ben-Shimon A, Niv MY. Deciphering the Arginine-binding preferences at the substrate-binding groove of Ser/Thr kinases by computational surface mapping. PLoS Comput Biol 2011; 7:e1002288. [PMID: 22125489 PMCID: PMC3219626 DOI: 10.1371/journal.pcbi.1002288] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Received: 07/09/2011] [Accepted: 10/12/2011] [Indexed: 11/18/2022] Open
Abstract
Protein kinases are key signaling enzymes that catalyze the transfer of γ-phosphate from an ATP molecule to a phospho-accepting residue in the substrate. Unraveling the molecular features that govern the preference of kinases for particular residues flanking the phosphoacceptor is important for understanding kinase specificities toward their substrates and for designing substrate-like peptidic inhibitors. We applied ANCHORSmap, a new fragment-based computational approach for mapping amino acid side chains on protein surfaces, to predict and characterize the preference of kinases toward Arginine binding. We focus on positions P−2 and P−5, commonly occupied by Arginine (Arg) in substrates of basophilic Ser/Thr kinases. The method accurately identified all the P−2/P−5 Arg binding sites previously determined by X-ray crystallography and produced Arg preferences that corresponded to those experimentally found by peptide arrays. The predicted Arg-binding positions and their associated pockets were analyzed in terms of shape, physicochemical properties, amino acid composition, and in-silico mutagenesis, providing structural rationalization for previously unexplained trends in kinase preferences toward Arg moieties. This methodology sheds light on several kinases that were described in the literature as having non-trivial preferences for Arg, and provides some surprising departures from the prevailing views regarding residues that determine kinase specificity toward Arg. In particular, we found that the preference for a P−5 Arg is not necessarily governed by the 170/230 acidic pair, as was previously assumed, but by several different pairs of acidic residues, selected from positions 133, 169, and 230 (PKA numbering). The acidic residue at position 230 serves as a pivotal element in recognizing Arg from both the P−2 and P−5 positions. 
Protein kinases are key signaling enzymes and major drug targets that catalyze the transfer of a phosphate group to a phospho-accepting residue in the substrate. Unraveling molecular features that govern the preference of kinases for particular residues flanking the phosphoacceptor (substrate consensus sequence, SCS) is important for understanding kinase-substrate specificities and for designing peptidic inhibitors. Current methods used to predict this set of essential residues usually rely on linking experimentally determined SCSs to kinase sequences. As such, these methods are less sensitive when specificity is dictated by subtle or kinase-unique sequence/structural features. In this study, we took a different approach for studying kinase specificities, by applying a new fragment-based method for mapping amino acid side chains on protein surfaces. We predicted and characterized the preference of Ser/Thr kinases toward Arginine binding, using the unbound kinase structures. The method produced high quality predictions and was able to provide novel insights and interesting departures from the prevailing views regarding the specificity-determining elements governing specificity toward Arginine. This work paves the way for studying the kinase binding preferences for other amino acids, for predicting protein-peptide structures, for facilitating the design of novel inhibitors, and for re-engineering of kinase specificities.
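In the P−n notation used in this abstract, the phosphoacceptor is position P0 and P−n is the residue n positions toward the N-terminus. The sketch below checks a substrate peptide for Arg at P−2 and P−5 under stated assumptions: the function names and example peptides are hypothetical, and this plain sequence lookup is only an illustration of the position convention, not the ANCHORSmap surface-mapping method.

```python
def residue_at(peptide, site_index, offset):
    """Return the residue at P<offset> relative to the phosphoacceptor.

    peptide: one-letter amino acid sequence.
    site_index: 0-based index of the phosphoacceptor (P0).
    offset: signed position, e.g. -2 for P-2.
    Returns None if the position falls outside the peptide.
    """
    i = site_index + offset
    return peptide[i] if 0 <= i < len(peptide) else None


def arg_at_positions(peptide, site_index, offsets=(-2, -5)):
    """Map each requested offset to True if Arg (R) occupies it."""
    if peptide[site_index] not in "ST":
        raise ValueError("site_index must point at a Ser or Thr residue")
    return {off: residue_at(peptide, site_index, off) == "R" for off in offsets}


# Hypothetical peptides for illustration only.
print(arg_at_positions("GRAKRASLG", 6))  # Arg at both P-2 and P-5
print(arg_at_positions("LRRASLG", 4))    # Arg at P-2; P-5 lies outside the peptide
```

Returning `None` for out-of-range positions keeps short peptides from wrapping around through Python's negative indexing.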
Affiliation(s)
- Avraham Ben-Shimon
- Institute of Biochemistry, Food Science and Nutrition, The Robert H. Smith Faculty of Agriculture, Food and Environment and The Fritz Haber Center for Molecular Dynamics, The Hebrew University, Israel
- Masha Y. Niv
- Institute of Biochemistry, Food Science and Nutrition, The Robert H. Smith Faculty of Agriculture, Food and Environment and The Fritz Haber Center for Molecular Dynamics, The Hebrew University, Israel
|
29
|
Safaei J, Maňuch J, Gupta A, Stacho L, Pelech S. Prediction of 492 human protein kinase substrate specificities. Proteome Sci 2011; 9 Suppl 1:S6. [PMID: 22165948 PMCID: PMC3379035 DOI: 10.1186/1477-5956-9-s1-s6] [Citation(s) in RCA: 52] [Impact Index Per Article: 4.0] [Indexed: 11/16/2022] Open
Abstract
BACKGROUND Complex intracellular signaling networks monitor diverse environmental inputs to evoke appropriate and coordinated effector responses. Defective signal transduction underlies many pathologies, including cancer, diabetes, autoimmunity and about 400 other human diseases. Therefore, there is high impetus to define the composition and architecture of cellular communications networks in humans. The major components of intracellular signaling networks are protein kinases and protein phosphatases, which catalyze the reversible phosphorylation of proteins. Here, we have focused on identification of kinase-substrate interactions through prediction of the phosphorylation site specificity from knowledge of the primary amino acid sequence of the catalytic domain of each kinase.
RESULTS The presented method predicts 488 different kinase catalytic domain substrate specificity matrices in 478 typical and 4 atypical human kinases that rely on both positive and negative determinants for scoring individual phosphosites for their suitability as kinase substrates. This represents a marked advancement over existing methods such as those used in NetPhorest (179 kinases in 76 groups) and NetworKIN (123 kinases), which consider only positive determinants for kinase substrate prediction. Comparison of our predicted matrices with experimentally derived matrices from about 9,000 known kinase-phosphosite substrate pairs revealed a high degree of concordance with the established preferences of about 150 well-studied protein kinases. Furthermore, for many of the better-known kinases, the predicted optimal phosphosite sequences were more accurate than the consensus phosphosite sequences inferred by simple alignment of the phosphosites of known kinase substrates.
CONCLUSIONS Application of this improved kinase substrate prediction algorithm to the primary structures of over 23,000 proteins encoded by the human genome has permitted the identification of about 650,000 putative phosphosites, which are posted on the open source PhosphoNET website (http://www.phosphonet.ca).
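As a rough illustration of what scoring with both positive and negative determinants can mean, the sketch below builds a log-odds position matrix from aligned phosphosite peptides: residues enriched at a position receive positive scores and depleted residues receive negative scores. This is a generic textbook construction, not the authors' algorithm; the uniform background, the pseudocount, and the toy peptides are all assumptions.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_log_odds(peptides, pseudocount=1.0):
    """Build a per-position log2-odds matrix from equal-length peptides.

    Assumes a uniform background frequency (1/20) for simplicity.
    """
    length = len(peptides[0])
    if any(len(p) != length for p in peptides):
        raise ValueError("all peptides must have the same length")
    background = 1.0 / len(AMINO_ACIDS)
    total = len(peptides) + pseudocount * len(AMINO_ACIDS)
    matrix = []
    for pos in range(length):
        counts = Counter(p[pos] for p in peptides)
        column = {
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        }
        matrix.append(column)
    return matrix


def score(matrix, peptide):
    """Sum the per-position scores; depleted residues lower the total."""
    return sum(column[aa] for column, aa in zip(matrix, peptide))


# Toy training set (hypothetical), centred on the phosphoacceptor.
training = ["RRASL", "RRGSV", "KRASL"]
pssm = build_log_odds(training)
good = score(pssm, "RRASL")
poor = score(pssm, "DDESD")
```

With such a matrix, a candidate site containing residues never seen in known substrates is actively penalized rather than merely failing to gain points, which is the practical difference from a positive-only scheme.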
Affiliation(s)
- Javad Safaei
- Department of Computer Science, University of British Columbia, Vancouver, Canada.
|
30
|
Trost B, Kusalik A. Computational prediction of eukaryotic phosphorylation sites. Bioinformatics 2011; 27:2927-35. [DOI: 10.1093/bioinformatics/btr525] [Citation(s) in RCA: 121] [Impact Index Per Article: 9.3] [Indexed: 11/14/2022] Open
|
31
|
Sanz-García M, Vázquez-Cedeira M, Kellerman E, Renbaum P, Levy-Lahad E, Lazo PA. Substrate profiling of human vaccinia-related kinases identifies coilin, a Cajal body nuclear protein, as a phosphorylation target with neurological implications. J Proteomics 2011; 75:548-60. [PMID: 21920476 DOI: 10.1016/j.jprot.2011.08.019] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.5] [Received: 07/18/2011] [Revised: 08/19/2011] [Accepted: 08/23/2011] [Indexed: 01/13/2023]
Abstract
Protein phosphorylation by kinases plays a central role in the regulation and coordination of multiple biological processes. In general, knowledge of kinase specificity is restricted to substrates identified in the context of specific cellular responses, but kinases are likely to have multiple additional substrates and to be integrated in signaling networks that might be spatially and temporally different, and in which protein complexes and subcellular localization can play an important role. In this report, the substrate specificity of the atypical human vaccinia-related kinases (VRK1 and VRK2) was studied using a human peptide array containing 1080 sequences phosphorylated in known signaling pathways. The two kinases identify a subset of potential peptide targets, all of which yield a consensus sequence composed of at least four basic residues. Linear peptide arrays are therefore a useful approach for characterizing kinases and identifying substrates, which can help delineate the signaling network in which VRK proteins participate. One of these target proteins is coilin, a basic protein located in nuclear Cajal bodies. Coilin is phosphorylated at Ser184 by both VRK1 and VRK2. Coilin colocalizes and interacts with VRK1 in Cajal bodies, but not with the mutant VRK1 (R358X). VRK1 (R358X) is less active than VRK1. Altered regulation of coilin might be implicated in several neurological diseases such as ataxias and spinal muscular atrophies.
Affiliation(s)
- Marta Sanz-García
- Experimental Therapeutics and Translational Oncology Program, Instituto de Biología Molecular y Celular del Cáncer, Consejo Superior de Investigaciones Científicas (CSIC)-Universidad de Salamanca, Salamanca 37007, Spain
|
32
|
Ellis JJ, Kobe B. Predicting protein kinase specificity: Predikin update and performance in the DREAM4 challenge. PLoS One 2011; 6:e21169. [PMID: 21829434 PMCID: PMC3145639 DOI: 10.1371/journal.pone.0021169] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.0] [Received: 10/10/2010] [Accepted: 05/23/2011] [Indexed: 11/18/2022] Open
Abstract
Predikin is a system for making predictions about protein kinase specificity. It was declared the “best performer” in the protein kinase section of the Peptide Recognition Domain specificity prediction category of the recent DREAM4 challenge (an independent test using unpublished data). In this article we discuss some recent improvements to the Predikin web server — including a more streamlined approach to substrate-to-kinase predictions and whole-proteome predictions — and give an analysis of Predikin's performance in the DREAM4 challenge. We also evaluate these improvements using a data set of yeast kinases that have been experimentally characterised, and we discuss the usefulness of Frobenius distance in assessing the predictive power of position weight matrices.
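For readers unfamiliar with the metric, the Frobenius distance between two position weight matrices of identical shape is the square root of the summed squared differences of their entries. The sketch below is a plain-Python version of that standard definition; the matrix layout (rows as positions, columns as residues) and the toy values are assumptions, and it does not reproduce Predikin's evaluation code.

```python
import math


def frobenius_distance(pwm_a, pwm_b):
    """Frobenius norm of (pwm_a - pwm_b) for two equally shaped matrices.

    Each matrix is a list of rows (positions); each row is a list of
    per-residue weights in the same residue order.
    """
    if len(pwm_a) != len(pwm_b) or any(
        len(row_a) != len(row_b) for row_a, row_b in zip(pwm_a, pwm_b)
    ):
        raise ValueError("matrices must have the same shape")
    squared = sum(
        (x - y) ** 2
        for row_a, row_b in zip(pwm_a, pwm_b)
        for x, y in zip(row_a, row_b)
    )
    return math.sqrt(squared)


# Toy 2-position, 2-residue matrices (hypothetical values).
predicted = [[1.0, 0.0], [0.0, 1.0]]
observed = [[0.0, 1.0], [0.0, 1.0]]
distance = frobenius_distance(predicted, observed)
```

A distance of zero means the predicted and experimentally derived matrices are identical; larger values indicate greater disagreement, although the scale depends on matrix size and normalization.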
Affiliation(s)
- Jonathan J Ellis
- School of Chemistry and Molecular Biosciences, Institute for Molecular Bioscience and Centre for Infectious Disease Research, University of Queensland, Brisbane, Queensland, Australia.
|
33
|
Remmerie N, De Vijlder T, Laukens K, Dang TH, Lemière F, Mertens I, Valkenborg D, Blust R, Witters E. Next generation functional proteomics in non-model plants: A survey on techniques and applications for the analysis of protein complexes and post-translational modifications. Phytochemistry 2011; 72:1192-218. [PMID: 21345472 DOI: 10.1016/j.phytochem.2011.01.003] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.3] [Received: 09/06/2010] [Revised: 11/21/2010] [Accepted: 01/03/2011] [Indexed: 05/11/2023]
Abstract
The congruent development of computational technology, bioinformatics and analytical instrumentation makes proteomics ready for the next leap. Present-day state-of-the-art proteomics has grown from a descriptive method into a full stakeholder in systems biology. High-throughput and genome-wide studies are now made at the functional level. These include quantitative aspects, functional aspects with respect to protein interactions as well as post-translational modifications, and advanced computational methods that aid in predicting protein function and mapping these functionalities across the species border. This review gives an overview of the current status of these aspects in plant studies, with special attention to non-genomic model plants.
Affiliation(s)
- Noor Remmerie
- Center for Proteomics, University of Antwerp, Groenenborgerlaan 171, B-2020 Antwerp, Belgium
|
34
|
Xu H, Schaniel C, Lemischka IR, Ma'ayan A. Toward a complete in silico, multi-layered embryonic stem cell regulatory network. Wiley Interdisciplinary Reviews-Systems Biology and Medicine 2011; 2:708-33. [PMID: 20890967 DOI: 10.1002/wsbm.93] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.5] [Indexed: 12/18/2022]
Abstract
Recent efforts in systematically profiling embryonic stem (ES) cells have yielded a wealth of high-throughput data. Complementarily, emerging databases and computational tools facilitate ES cell studies and further pave the way toward the in silico reconstruction of regulatory networks encompassing multiple molecular layers. Here, we briefly survey databases, algorithms, and software tools used to organize and analyze high-throughput experimental data collected to study mammalian cellular systems with a focus on ES cells. The vision of using heterogeneous data to reconstruct a complete multi-layered ES cell regulatory network is discussed. This review also provides an accompanying manually extracted dataset of different types of regulatory interactions from low-throughput experimental ES cell studies available at http://amp.pharm.mssm.edu/iscmid/literature.
Affiliation(s)
- Huilei Xu
- Department of Gene and Cell Medicine and The Black Family Stem Cell Institute, Mount Sinai School of Medicine, New York, NY 10029, USA
|
35
|
Abstract
Methods for predicting protein post-translational modifications have been developed extensively. In this chapter, we review major post-translational modification prediction strategies, with a particular focus on statistical and machine learning approaches. We present the workflow of the methods and summarize the advantages and disadvantages of the methods.
Affiliation(s)
- Chunmei Liu
- Department of Systems and Computer Science, Howard University, Washington, DC, USA.
|
36
|
Abstract
BACKGROUND With the rapid accumulation of phosphoproteomics data, phosphorylation-site prediction is becoming an increasingly active research area. More than a dozen phosphorylation-site prediction tools have been released in the past decade. However, there is currently no open-source framework specifically designed for phosphorylation-site prediction except Musite.
RESULTS Here we present the Musite open-source framework for building applications to perform machine learning based phosphorylation-site prediction. Musite was implemented with six modules loosely coupled with each other. With its well-designed Java application programming interface (API), Musite can be easily extended to integrate various sources of biological evidence for phosphorylation-site prediction.
CONCLUSIONS Released under the GNU GPL open source license, Musite provides an open and extensible framework for phosphorylation-site prediction. The software with its source code is available at http://musite.sourceforge.net.
Affiliation(s)
- Jianjiong Gao
- Department of Computer Science, C.S. Bond Life Sciences Center, University of Missouri, Columbia, Missouri 65211, USA
|
37
|
Trost M, Bridon G, Desjardins M, Thibault P. Subcellular phosphoproteomics. Mass Spectrometry Reviews 2010; 29:962-90. [PMID: 20931658 DOI: 10.1002/mas.20297] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.0] [Indexed: 05/15/2023]
Abstract
Protein phosphorylation represents one of the most extensively studied post-translational modifications, primarily due to the emergence of sensitive methods enabling the detection of this modification both in vitro and in vivo. The availability of enrichment methods combined with sensitive mass spectrometry instrumentation has played a crucial role in uncovering the dynamic changes and the large, expanding repertoire of this reversible modification. The structural changes imparted by the phosphorylation of specific residues afford exquisite mechanisms for the regulation of protein functions by modulating new binding sites on scaffold proteins or by abrogating protein-protein interactions. However, the dynamic interplay of protein phosphorylation does not occur randomly within the cell but is rather finely orchestrated by specific kinases and phosphatases that are unevenly distributed across subcellular compartments. This spatial separation not only regulates protein phosphorylation but can also control the activity of other enzymes and the transfer of other post-translational modifications. While numerous large-scale phosphoproteomics studies highlighted the extent and diversity of phosphoproteins present in total cell lysates, the further understanding of their regulation and biological activities requires a spatio-temporal resolution only achievable through subcellular fractionation. This review presents a first account of the emerging field of subcellular phosphoproteomics where cell fractionation approaches are combined with sensitive mass spectrometry methods to facilitate the identification of low abundance proteins and to unravel the intricate regulation of protein phosphorylation.
Affiliation(s)
- Matthias Trost
- Institute for Research in Immunology and Cancer, Université de Montréal, P.O. Box 6128, Station Centre-ville, Montréal, Québec, Canada H3C 3J7
|
38
|
Marfori M, Mynott A, Ellis JJ, Mehdi AM, Saunders NFW, Curmi PM, Forwood JK, Bodén M, Kobe B. Molecular basis for specificity of nuclear import and prediction of nuclear localization. Biochimica et Biophysica Acta-Molecular Cell Research 2010; 1813:1562-77. [PMID: 20977914 DOI: 10.1016/j.bbamcr.2010.10.013] [Citation(s) in RCA: 303] [Impact Index Per Article: 21.6] [Received: 06/15/2010] [Revised: 10/15/2010] [Accepted: 10/19/2010] [Indexed: 01/03/2023]
Abstract
Although proteins are translated on cytoplasmic ribosomes, many of these proteins play essential roles in the nucleus, mediating key cellular processes including but not limited to DNA replication and repair as well as transcription and RNA processing. Thus, understanding how these critical nuclear proteins are accurately targeted to the nucleus is of paramount importance in biology. Interaction and structural studies in recent years have jointly revealed some general rules on the specificity determinants of the recognition of nuclear targeting signals by their specific receptors, at least for two nuclear import pathways: (i) the classical pathway, which involves the classical nuclear localization sequences (cNLSs) and the receptors importin-α/karyopherin-α and importin-β/karyopherin-β1; and (ii) the karyopherin-β2 pathway, which employs the proline-tyrosine (PY)-NLSs and the receptor transportin-1/karyopherin-β2. The understanding of specificity rules allows the prediction of protein nuclear localization. We review the current understanding of the molecular determinants of the specificity of nuclear import, focusing on the importin-α•cargo recognition, as well as the currently available databases and predictive tools relevant to nuclear localization. This article is part of a Special Issue entitled: Regulation of Signaling and Cellular Fate through Modulation of Nuclear Protein Import.
Affiliation(s)
- Mary Marfori
- School of Chemistry and Molecular Biosciences, University of Queensland, Brisbane, Queensland 4072, Australia
|
39
|
Ia KK, Mills RD, Hossain MI, Chan KC, Jarasrassamee B, Jorissen RN, Cheng HC. Structural elements and allosteric mechanisms governing regulation and catalysis of CSK-family kinases and their inhibition of Src-family kinases. Growth Factors 2010; 28:329-50. [PMID: 20476842 DOI: 10.3109/08977194.2010.484424] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.8] [Indexed: 11/13/2022]
Abstract
C-terminal Src kinase (CSK) and CSK-homologous kinase (CHK) are endogenous inhibitors constraining the activity of the oncogenic Src-family kinases (SFKs) in cells. Both kinases suppress SFKs by selectively phosphorylating their consensus C-terminal regulatory tyrosine. In addition to phosphorylation, CHK can suppress SFKs by a unique non-catalytic inhibitory mechanism that involves tight binding of CHK to SFKs to form stable complexes. In this review, we discuss how allosteric regulators, phosphorylation, and inter-domain interactions interplay to govern the activity of CSK and CHK and their ability to inhibit SFKs. In particular, based upon the published results of structural and biochemical analysis of CSK and CHK, we attempt to chart the allosteric networks in CSK and CHK that govern their catalysis and ability to inhibit SFKs. We also discuss how the published three-dimensional structure of CSK complexed with an SFK member sheds light on the structural basis of substrate recognition by protein kinases.
Affiliation(s)
- Kim K Ia
- Department of Biochemistry and Molecular Biology, Bio21 Molecular Science and Biotechnology Institute, University of Melbourne, Parkville, Victoria, 3010, Australia
|
40
|
Shameer K, Madan LL, Veeranna S, Gopal B, Sowdhamini R. PeptideMine--a webserver for the design of peptides for protein-peptide binding studies derived from protein-protein interactomes. BMC Bioinformatics 2010; 11:473. [PMID: 20858292 PMCID: PMC2955050 DOI: 10.1186/1471-2105-11-473] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 12/19/2009] [Accepted: 09/22/2010] [Indexed: 01/18/2023] Open
Abstract
Background Signal transduction events often involve transient, yet specific, interactions between structurally conserved protein domains and polypeptide sequences in target proteins. The identification and validation of these associating domains is crucial to understand signal transduction pathways that modulate different cellular or developmental processes. Bioinformatics strategies to extract and integrate information from diverse sources have been shown to facilitate the experimental design to understand complex biological events. These methods, primarily based on information from high-throughput experiments, have also led to the identification of new connections thus providing hypothetical models for cellular events. Such models, in turn, provide a framework for directing experimental efforts for validating the predicted molecular rationale for complex cellular processes. In this context, it is envisaged that the rational design of peptides for protein-peptide binding studies could substantially facilitate the experimental strategies to evaluate a predicted interaction. This rational design procedure involves the integration of protein-protein interaction data, gene ontology, physico-chemical calculations, domain-domain interaction data and information on functional sites or critical residues. Results Here we describe an integrated approach called "PeptideMine" for the identification of peptides based on specific functional patterns present in the sequence of an interacting protein. This approach based on sequence searches in the interacting sequence space has been developed into a webserver, which can be used for the identification and analysis of peptides, peptide homologues or functional patterns from the interacting sequence space of a protein. 
To further facilitate experimental validation, the PeptideMine webserver also provides a list of physico-chemical parameters corresponding to the peptide to determine the feasibility of using the peptide for in vitro biochemical or biophysical studies. Conclusions The strategy described here involves the integration of data and tools to identify potential interacting partners for a protein and design criteria for peptides based on desired biochemical properties. Alongside the search for interacting protein sequences using three different search programs, the server also provides the biochemical characteristics of candidate peptides to prune peptide sequences based on features that are most suited for a given experiment. The PeptideMine server is available at the URL: http://caps.ncbs.res.in/peptidemine
Affiliation(s)
- Khader Shameer
- National Centre for Biological Sciences (TIFR), GKVK Campus, Bellary Road, Bangalore, 560065, India
|
41
|
Gao J, Thelen JJ, Dunker AK, Xu D. Musite, a tool for global prediction of general and kinase-specific phosphorylation sites. Mol Cell Proteomics 2010; 9:2586-600. [PMID: 20702892 DOI: 10.1074/mcp.m110.001388] [Citation(s) in RCA: 202] [Impact Index Per Article: 14.4] [Indexed: 11/06/2022] Open
Abstract
Reversible protein phosphorylation is one of the most pervasive post-translational modifications, regulating diverse cellular processes in various organisms. High throughput experimental studies using mass spectrometry have identified many phosphorylation sites, primarily from eukaryotes. However, the vast majority of phosphorylation sites remain undiscovered, even in well studied systems. Because mass spectrometry-based experimental approaches for identifying phosphorylation events are costly, time-consuming, and biased toward abundant proteins and proteotypic peptides, in silico prediction of phosphorylation sites is potentially a useful alternative strategy for whole proteome annotation. Because of various limitations, current phosphorylation site prediction tools were not well designed for comprehensive assessment of proteomes. Here, we present a novel software tool, Musite, specifically designed for large scale predictions of both general and kinase-specific phosphorylation sites. We collected phosphoproteomics data in multiple organisms from several reliable sources and used them to train prediction models by a comprehensive machine-learning approach that integrates local sequence similarities to known phosphorylation sites, protein disorder scores, and amino acid frequencies. Application of Musite on several proteomes yielded tens of thousands of phosphorylation site predictions at a high stringency level. Cross-validation tests show that Musite achieves some improvement over existing tools in predicting general phosphorylation sites, and it is at least comparable with those for predicting kinase-specific phosphorylation sites. In Musite V1.0, we have trained general prediction models for six organisms and kinase-specific prediction models for 13 kinases or kinase families. 
Although the current pretrained models were not correlated with any particular cellular conditions, Musite provides a unique functionality for training customized prediction models (including condition-specific models) from users' own data. In addition, with its easily extensible open source application programming interface, Musite is aimed at being an open platform for community-based development of machine learning-based phosphorylation site prediction applications. Musite is available at http://musite.sourceforge.net/.
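One of the feature types named in this abstract, amino acid frequencies around a candidate site, can be sketched in a few lines of Python. The code below is an illustrative assumption about how such a feature might be computed (a symmetric window with the central residue excluded); it is not Musite's implementation, whose API is in Java, and the function name and window size are hypothetical.

```python
from collections import Counter


def window_frequencies(seq, site, flank=3):
    """Residue frequencies in a window around seq[site], centre excluded.

    The window is truncated at the sequence ends rather than padded.
    """
    start = max(0, site - flank)
    end = min(len(seq), site + flank + 1)
    window = seq[start:site] + seq[site + 1:end]
    if not window:
        return {}
    counts = Counter(window)
    return {aa: n / len(window) for aa, n in counts.items()}


# Hypothetical sequence; the candidate Ser is at 0-based index 4.
features = window_frequencies("AARRSLLA", site=4, flank=2)
```

In a real predictor, a vector like this would be combined with other evidence, such as the disorder scores and similarity to known sites mentioned above, before being passed to a classifier.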
Affiliation(s)
- Jianjiong Gao
- Department of Computer Science, University of Missouri, Columbia, Missouri 65211, USA
|
42
|
Control of cell cycle progression by phosphorylation of cyclin-dependent kinase (CDK) substrates. Biosci Rep 2010; 30:243-55. [DOI: 10.1042/bsr20090171] [Citation(s) in RCA: 93] [Impact Index Per Article: 6.6] [Indexed: 11/17/2022] Open
Abstract
The eukaryotic cell cycle is a fundamental evolutionarily conserved process that regulates cell division from simple unicellular organisms, such as yeast, through to higher multicellular organisms, such as humans. The cell cycle comprises several phases, including the S-phase (DNA synthesis phase) and M-phase (mitotic phase). During S-phase, the genetic material is replicated, and is then segregated into two identical daughter cells following mitotic M-phase and cytokinesis. The S- and M-phases are separated by two gap phases (G1 and G2) that govern the readiness of cells to enter S- or M-phase. Genetic and biochemical studies demonstrate that cell division in eukaryotes is mediated by CDKs (cyclin-dependent kinases). Active CDKs comprise a protein kinase subunit whose catalytic activity is dependent on association with a regulatory cyclin subunit. Cell-cycle-stage-dependent accumulation and proteolytic degradation of different cyclin subunits regulate their association with CDKs to control different stages of cell division. CDKs promote cell cycle progression by phosphorylating critical downstream substrates to alter their activity. Here, we will review some of the well-characterized CDK substrates to provide mechanistic insights into how these kinases control different stages of cell division.
|
43
|
Abstract
Most transcription factors, including nuclear receptors (NRs), act as sensors of the extracellular and intracellular compartments. As such, NRs serve as integrating platforms for a variety of stimuli and are targets for post-translational modifications such as phosphorylation. During the last decade, knowledge of NR phosphorylation advanced considerably because of the emergence of new technologies. Indeed, the development of a wide range of phosphorylation site databases, high-accuracy mass spectrometry, and phospho-specific antibodies allowed the identification of multiple novel phosphorylation sites in NRs. New and improved methods are also emerging to connect these data with the downstream consequences of phosphorylation on NR structure (computational prediction, NMR), intracellular localization (FRAP), interaction with coregulators (proteomics, FRET, FLIM), and affinity for DNA (ChIP, ChIP-seq, FRAP). In the future, such integrated strategies should provide a treasure trove of information about the integration of numerous signaling events by NRs.
|
44
|
Eisenhaber B, Eisenhaber F. Prediction of posttranslational modification of proteins from their amino acid sequence. Methods Mol Biol 2010; 609:365-84. [PMID: 20221930 DOI: 10.1007/978-1-60327-241-4_21] [Citation(s) in RCA: 56] [Impact Index Per Article: 4.0] [Indexed: 12/13/2022]
Abstract
If posttranslational modifications (PTMs) are chemical alterations of the protein primary structure during the protein's life cycle as a result of an enzymatic reaction, then the motif in the substrate protein sequence that is recognized by the enzyme can serve as basis for predictor construction that recognizes PTM sites in database sequences. The recognition motif consists generally of two regions: first, a small, central segment that enters the catalytic cleft of the enzyme and that is specific for this type of PTM and, second, a sequence environment of about 10 or more residues with linker characteristics (a trend for small and polar residues with flexible backbone) on either side of the central part that are needed to provide accessibility of the central segment to the enzyme's catalytic site. In this review, we consider predictors for cleavage of targeting signals, lipid PTMs, phosphorylation, and glycosylation.
Affiliation(s)
- Birgit Eisenhaber
- Experimental Therapeutic Centre, Bioinformatics Institute, Agency for Science, Technology, and Research, Singapore
|
45
|
Annan RB, Lee AY, Reid ID, Sayad A, Whiteway M, Hallett M, Thomas DY. A biochemical genomics screen for substrates of Ste20p kinase enables the in silico prediction of novel substrates. PLoS One 2009; 4:e8279. [PMID: 20020052 PMCID: PMC2791418 DOI: 10.1371/journal.pone.0008279] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 08/17/2009] [Accepted: 11/19/2009] [Indexed: 01/13/2023] Open
Abstract
The Ste20/PAK family is involved in many cellular processes, including the regulation of actin-based cytoskeletal dynamics and the activation of MAPK signaling pathways. Despite its numerous roles, few of its substrates have been identified. To better characterize the roles of the yeast Ste20p kinase, we developed an in vitro biochemical genomics screen to identify its substrates. When applied to 539 purified yeast proteins, the screen reported 14 targets of Ste20p phosphorylation. We used the data resulting from our screen to build an in silico predictor to identify Ste20p substrates on a proteome-wide basis. Since kinase-substrate specificity is often mediated by additional binding events at sites distal to the phosphorylation site, the predictor uses the presence/absence of multiple sequence motifs to evaluate potential substrates. Statistical validation estimates a threefold improvement in substrate recovery over random predictions, despite the lack of a single dominant motif that can characterize Ste20p phosphorylation. The set of predicted substrates significantly overrepresents elements of the genetic and physical interaction networks surrounding Ste20p, suggesting that some of the predicted substrates are in vivo targets. We validated this combined experimental and computational approach for identifying kinase substrates by confirming the in vitro phosphorylation of polarisome components Bni1p and Bud6p, thus suggesting a mechanism by which Ste20p effects polarized growth.
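The two ideas in this abstract that lend themselves to a short example are the presence/absence encoding of several motifs and the notion of a fold improvement over random prediction. The Python sketch below shows one simple way to express both; the regular-expression motifs, the counts, and the function names are hypothetical and are not taken from the study.

```python
import re


def motif_features(sequence, motifs):
    """Encode a protein as a 0/1 vector: is each motif present anywhere?"""
    return [1 if re.search(pattern, sequence) else 0 for pattern in motifs]


def fold_enrichment(hits_in_predicted, n_predicted, hits_in_all, n_all):
    """Ratio of the substrate rate among predictions to the overall rate."""
    predicted_rate = hits_in_predicted / n_predicted
    background_rate = hits_in_all / n_all
    return predicted_rate / background_rate


# Hypothetical motifs and sequence, for illustration only.
example_motifs = [r"R.S", r"PP$", r"W"]
vector = motif_features("MKRRASLPP", example_motifs)

# Hypothetical counts: 3 of 4 predictions are true substrates,
# against 25 true substrates among 100 proteins overall.
enrichment = fold_enrichment(3, 4, 25, 100)
```

A binary vector like this can feed any standard classifier, which lets a predictor draw on several weak motifs when no single dominant motif exists.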
Affiliation(s)
- Robert B Annan
- Department of Biochemistry, McGill University, Montreal, Quebec, Canada.
|
46
|
Kumar N, Mohanty D. Identification of substrates for Ser/Thr kinases using residue-based statistical pair potentials. Bioinformatics 2009; 26:189-97. [DOI: 10.1093/bioinformatics/btp633] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 11/12/2022] Open
|
47
|
Abstract
The function and survival of all organisms is dependent on the dynamic control of energy metabolism, when energy demand is matched to energy supply. The AMP-activated protein kinase (AMPK) αβγ heterotrimer has emerged as an important integrator of signals that control energy balance through the regulation of multiple biochemical pathways in all eukaryotes. In this review, we begin with the discovery of the AMPK family and discuss the recent structural studies that have revealed the molecular basis for AMP binding to the enzyme's γ subunit. AMPK's regulation involves autoinhibitory features and phosphorylation of both the catalytic α subunit and the β-targeting subunit. We review the role of AMPK at the cellular level through examination of its many substrates and discuss how it controls cellular energy balance. We look at how AMPK integrates stress responses such as exercise as well as nutrient and hormonal signals to control food intake, energy expenditure, and substrate utilization at the whole body level. Lastly, we review the possible role of AMPK in multiple common diseases and the role of the new age of drugs targeting AMPK signaling.
Affiliation(s)
- Gregory R Steinberg
- Protein Chemistry and Metabolism, St. Vincent's Institute of Medical Research, University of Melbourne, Fitzroy, Victoria, Australia.
|
48
|
Abstract
Acetylation is a well-studied posttranslational modification that has been associated with a broad spectrum of biological processes, notably gene regulation. Many studies have contributed to our knowledge of the enzymology underlying acetylation, including efforts to understand the molecular mechanism of substrate recognition by several acetyltransferases, but traditional experiments to determine intrinsic features of substrate site specificity have proven challenging. Here, we combine experimental methods with clustering analysis of protein sequences to predict protein acetylation based on the sequence characteristics of acetylated lysines within histones with our unique prediction tool PredMod. We define a local amino acid sequence composition that represents potential acetylation sites by implementing a clustering analysis of histone and nonhistone sequences. We show that this sequence composition has predictive power on 2 independent experimental datasets of acetylation marks. Finally, we detect acetylation for selected putative substrates using mass spectrometry, and report several nonhistone acetylated substrates in budding yeast. Our approach, combined with more traditional experimental methods, may be useful for identifying acetylated substrates proteome-wide.
|