Invergo BM. Accurate, high-coverage assignment of in vivo protein kinases to phosphosites from in vitro phosphoproteomic specificity data. PLoS Comput Biol 2022; 18:e1010110. [PMID: 35560139 PMCID: PMC9132282 DOI: 10.1371/journal.pcbi.1010110]
[Received: 11/15/2021] [Revised: 05/25/2022] [Accepted: 04/15/2022] [Indexed: 12/03/2022]
Abstract
Phosphoproteomic experiments routinely observe thousands of phosphorylation sites. To understand the intracellular signaling processes that generated this data, one or more causal protein kinases must be assigned to each phosphosite. However, limited knowledge of kinase specificity typically restricts assignments to a small subset of a kinome. Starting from a statistical model of a high-throughput, in vitro kinase-substrate assay, I have developed an approach to high-coverage, multi-label kinase-substrate assignment called IV-KAPhE (“In vivo-Kinase Assignment for Phosphorylation Evidence”). Tested on human data, IV-KAPhE outperforms other methods of similar scope. Such computational methods generally predict a densely connected kinase-substrate network, with most sites targeted by multiple kinases, pointing either to unaccounted-for biochemical constraints or significant cross-talk and signaling redundancy. I show that such predictions can potentially identify biased kinase-site misannotations within families of closely related kinase isozymes and they provide a robust basis for kinase activity analysis.
Author summary
Proteins can pass around information inside cells about changes in the environment. This process, called intracellular signaling, helps to trigger appropriate cellular responses to environmental changes. One of the main ways information is passed to proteins is through chemical “tagging,” called phosphorylation, by enzymes called protein kinases. We can measure the phosphorylation state of practically all proteins in a cell at any moment. Starting from known cases of phosphorylation by a kinase, many computational methods have been developed to predict if the kinase might tag a certain spot on another protein or if an observed tag was attached by the kinase, with different models for each kinase. I have developed a new method that instead uses a single model to assign one or more kinases to each observed tag, built from the latest large-scale experimental data. This change in focus and unbiased training data allows my method to be significantly more accurate than past methods. I also explored useful applications for my method. For example, I used it to show that much of our knowledge about which kinase is responsible for each tag is probably inaccurately biased towards the commonly studied ones.