1
Nemzer S, Sabath N, Wool A, Altber Z, Ando H, Pardoll DM, Ganguly S, Turpaz Y, Levine Z, Granit RZ. Abstract 4292: Gene model correction for PVRIG and TIGIT in single cell sequencing data enables accurate detection and study of its functional relevance. Cancer Res 2023. [DOI: 10.1158/1538-7445.am2023-4292] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/07/2023]
Abstract
Single cell RNA sequencing (scRNA-seq) has gained popularity in recent years and has revolutionized the study of cell populations; however, the technology has several caveats when it comes to measuring the expression of specific genes. Here we examine the expression levels of several immune checkpoint genes that are currently being assessed in clinical studies. We find that, unlike in most bulk sequencing studies, PVRIG and murine TIGIT suffer from poor detection in 10x Chromium scRNA-seq and other types of assays that utilize the GENCODE gene model. We show that the default GENCODE gene model, typically used in the analysis of such data, is incorrect in the PVRIG genomic region and also contains a predicted read-through transcript to which PVRIG reads co-align, causing these reads to be discarded and hence hindering proper detection of PVRIG. Moreover, we find that the murine TIGIT 3’ UTR is mis-annotated, leading to the loss of legitimate reads. We thus generated a corrected reference genome by removing the faulty read-through transcript in the case of PVRIG and by extending the TIGIT 3’ UTR, and we demonstrate that with these changes we can correctly capture genuine expression levels of these checkpoints, which align with our findings at the protein level using FACS and CITE-seq. Furthermore, we show that specialized algorithms for handling multimapped reads, such as RSEM and STARsolo, can also partially improve the detection of PVRIG. Our study provides means to better interrogate the expression of PVRIG and murine TIGIT in scRNA-seq and emphasizes the importance of optimizing gene models and alignment algorithms to enable accurate gene expression measurement in scRNA-seq and bulk sequencing. Moreover, our results support detailed study of the expression of immune checkpoints in clinical and pre-clinical studies towards the development of cancer immunotherapy treatments.
Citation Format: Sergey Nemzer, Niv Sabath, Assaf Wool, Zoya Altber, Hirofumi Ando, Drew M. Pardoll, Sudipto Ganguly, Yaron Turpaz, Zurit Levine, Roy Z. Granit. Gene model correction for PVRIG and TIGIT in single cell sequencing data enables accurate detection and study of its functional relevance. [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2023; Part 1 (Regular and Invited Abstracts); 2023 Apr 14-19; Orlando, FL. Philadelphia (PA): AACR; Cancer Res 2023;83(7_Suppl):Abstract nr 4292.
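The two kinds of gene-model edits described in the abstract above (dropping a read-through entry and lengthening a 3' end) can be illustrated on a GTF-formatted annotation. The following is a minimal sketch only, not the authors' pipeline: the function names, the gene identifiers `GENE_A` and `READTHROUGH_AB`, and the 200-base extension are all hypothetical, and the abstract applies the two corrections to different genes (and species) rather than in one pass.

```python
import re

# GTF feature types whose 3' coordinate is moved when a gene is extended.
EXTENDABLE = {"gene", "transcript", "exon", "UTR", "three_prime_utr"}


def gtf_attr(attributes, key):
    """Return the value of a GTF attribute such as gene_id, or None."""
    match = re.search(rf'{key} "([^"]+)"', attributes)
    return match.group(1) if match else None


def correct_gene_model(gtf_lines, drop_gene_ids, extend_gene_id, extend_bp):
    """Drop every record of some genes and extend the 3' end of another.

    drop_gene_ids, extend_gene_id and extend_bp are hypothetical inputs;
    header/comment lines (starting with '#') are skipped, not preserved.
    """
    rows = [line.rstrip("\n").split("\t") for line in gtf_lines
            if line.strip() and not line.startswith("#")]

    # First pass: find the strand and current 3' boundary of the gene.
    boundary = strand = None
    for row in rows:
        if row[2] == "gene" and gtf_attr(row[8], "gene_id") == extend_gene_id:
            strand = row[6]
            boundary = int(row[4]) if strand == "+" else int(row[3])

    # Second pass: filter and extend.
    corrected = []
    for row in rows:
        gene_id = gtf_attr(row[8], "gene_id")
        if gene_id in drop_gene_ids:
            continue  # remove the read-through entry entirely
        if gene_id == extend_gene_id and row[2] in EXTENDABLE:
            if strand == "+" and int(row[4]) == boundary:
                row = row[:4] + [str(boundary + extend_bp)] + row[5:]
            elif strand == "-" and int(row[3]) == boundary:
                row = row[:3] + [str(max(1, boundary - extend_bp))] + row[4:]
        corrected.append("\t".join(row))
    return corrected


# Toy annotation with made-up coordinates and identifiers.
example = [
    'chr1\tHAVANA\tgene\t100\t500\t.\t+\t.\tgene_id "GENE_A";',
    'chr1\tHAVANA\texon\t300\t500\t.\t+\t.\tgene_id "GENE_A"; transcript_id "TX_A1";',
    'chr1\tHAVANA\tgene\t100\t900\t.\t+\t.\tgene_id "READTHROUGH_AB";',
]
fixed = correct_gene_model(example, {"READTHROUGH_AB"}, "GENE_A", 200)
for record in fixed:
    print(record)
```

Only features that end exactly at the gene's 3' boundary are extended, which is a simplification; after editing an annotation in this way, the reference index of whichever aligner is in use would need to be rebuilt from the modified file.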
Affiliation(s)
- Hirofumi Ando
- Johns Hopkins University School of Medicine, Baltimore, MD
2
Alteber Z, Cojocaru G, Frenkel M, Weyl E, Sabath N, Wool A, Novik A, Kliger Y, Levine Z, Ophir E. 252 Novel DNAM-1 axis member, PVRIG, is potentially a dominant checkpoint involved in stem-like memory T cells – dendritic cell interaction. J Immunother Cancer 2021. [DOI: 10.1136/jitc-2021-sitc2021.252] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/04/2022]
Abstract
Background
T-cell accumulation in tumors is a prerequisite for response to cancer immunotherapy. Recent studies highlighted the importance of an early-memory (stem-like) T-cell subpopulation that can self-renew and differentiate into effector cells, and of dendritic cells (DCs), which are essential for T-cell priming and expansion following checkpoint blockade [1, 2]. PVRIG is a novel inhibitory receptor that competes with the co-activating receptor DNAM-1 for binding of a shared ligand, PVRL2. PVRIG expression is induced on tumor-infiltrating T and NK cells, whereas PVRL2 is expressed on tumor, endothelial and myeloid cells in the tumor microenvironment (TME) [3]. We investigated the expression of PVRIG and PVRL2 across TME immune subpopulations.
Methods
Publicly available TME scRNA sequencing datasets were analyzed for the expression of PVRIG and PVRL2 across immune subsets. Unsupervised principal component analysis and hierarchical co-expression pattern analysis among genes known to be expressed on naïve, memory, and exhausted CD8+ T-cells were performed. Observations were validated by flow cytometry and immunohistochemistry analysis across a variety of tumor indications. Proximity Extension Assay (PEA, Olink) was conducted using sera collected at several time points from COM701 (anti-PVRIG antibody) and nivolumab treated patients in a Phase-1 study (NCT03667716).
Results
Across scRNA datasets, PVRIG, like TIGIT and PD-1, was expressed by both stem-like (TCF1+PD1+) and exhausted (TIM3+CD39+) CD8+ T-cells. High resolution unsupervised scRNA gene co-expression analysis revealed that while TIGIT is strongly correlated with PD-1, CTLA-4, and other markers of exhausted T-cells, PVRIG uniquely clusters with markers of early memory T-cells (figure 1). Accordingly, PVRIG protein expression was increased on CD28+ early-memory T-cells across indications (figure 2). RNA expression data also revealed that PVRL2 is more abundantly expressed across DC subtypes compared to PD-L1 and PVR (ligand of TIGIT, figure 3). Flow cytometry confirmed dominant PVRL2 expression on DC subtypes across tumor indications. Immunohistochemistry analysis identified PVRL2 expression in Tertiary Lymphoid Structures (figure 4). Finally, preliminary analysis of serum from COM701+nivolumab treated patients revealed elevated induction of activated-DC markers in two patients that responded clinically (RECIST criteria), compared to non-responders (figure 5).
Abstract 252 Figure 1. PVRIG clusters with early differentiation/memory genes, unlike other immune checkpoints that cluster with exhaustion genes, in CD8+ T cells. (A) Unsupervised PCA analysis was performed on a scRNA expression matrix of TME CD8+ T cells, which includes all variable genes. Using cells as features and genes as entries, hierarchical co-expression pattern analysis among genes known to be expressed on naïve, memory, and exhausted CD8+ T-cells was performed. (B) scRNA-Seq datasets were analyzed for co-expression pattern among 19 genes, including genes known to be expressed on naïve (TCF7, IL7R, SELL), memory (GZMK, EOMES), and exhausted (PDCD1, LAG3, HAVCR2) CD8 T cells. Average gene-gene correlation over all datasets was calculated. Representative dataset of n=13 (CRC, NSCLC, HNSCC, Melanoma, Liver cancer) is presented.
Abstract 252 Figure 2. PVRIG is expressed by early memory CD8+ T cells in the TME. Samples (n=11) of CRC, ovarian and bladder cancer were dissociated to single cell suspensions and analyzed for gene expression by flow cytometry. A paired t-test was used to compare PVRIG expression among cell populations.
Abstract 252 Figure 3. PVRL2 is dominantly expressed on dendritic cells. (A) tSNE map depicting the expression profile of PVR/PVRL2/PDL1 in major dendritic cell subsets in Basal Cell Carcinoma patients. (B) Dot plots showing the percent of cells and average level of expression of PVR/PVRL2/PDL1 in major dendritic cell subsets across multiple scRNA cancer studies.
Abstract 252 Figure 4. PVRL2 is expressed in tertiary lymphoid structures in the tumor bed. Tertiary Lymphoid Structures (TLSs) were identified in subsets of samples across all tumors tested (NSCLC, CRC primary and metastasis, ovarian cancer, endometrial cancer, breast primary TNBC and breast metastasis), and in most cases TLSs were positive for PVRL2. Staining was performed using a proprietary rabbit mAb raised against the ECD of PVRL2 on a Dako Autostainer.
Abstract 252 Figure 5. Elevated induction of activated-DC markers in patients that clinically responded to COM701+nivolumab, compared to non-responders. Serum samples from 7 patients in the nivolumab+COM701 dose escalation arm were analyzed using Olink Explore 1536. For each patient, the differences between all on-treatment time points and pre-treatment were calculated. Group differences based on response by RECIST criteria (responders (R): CR+PR vs. non-responders (NR): SD+PD) were compared by Student's t-test for all available time points grouped.
Conclusions
PVRIG is co-expressed with PD-1 and TIGIT on stem-like and exhausted T cells but has a unique dominant expression on early memory cells, while PVRL2 is abundantly expressed across DC types. PVRIG blockade could therefore enhance activation of memory T cells by DCs, resulting in their increased expansion and differentiation. Accordingly, early data show increased induction of activated-DC markers, potentially following efficient T-DC interaction, in serum of two patients responding to COM701+nivolumab.
References
1. Jansen CS, Prokhnevska N, Master VA, et al. An intra-tumoral niche maintains and differentiates stem-like CD8 T cells. Nature 2019;576:465–470.
2. Held W, et al. Intratumoral CD8+ T cells with stem cell-like properties: implications for cancer immunotherapy. Sci Transl Med 2019;11(515):eaay6863.
3. Whelan S, Ophir E, Kotturi MF, Levy O, Ganguly S, Leung L, et al. PVRIG and PVRL2 are induced in cancer and inhibit CD8+ T-cell function. Cancer Immunol Res 2019;7:257–68.
Ethics Approval
Clinical trial identification: NCT03667716. The study was approved by each site’s ethics board.
Consent
Clinical trial identification: NCT03667716. Written informed consent was obtained from the patient for publication of this abstract and any accompanying images. A copy of the written consent is available for review by the Editor of this journal.
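The Figure 1 caption in the abstract above mentions an average gene-gene correlation calculated over all datasets. A minimal sketch of that kind of calculation follows, assuming Pearson correlation across cells and a plain mean over datasets; the abstract does not state which correlation measure was used. The gene labels `geneA` to `geneC` and all expression values are arbitrary toy numbers, not data from the study.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if one is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def correlation_matrix(expression):
    """expression maps gene name -> per-cell expression values."""
    genes = sorted(expression)
    return {(g, h): pearson(expression[g], expression[h])
            for g in genes for h in genes}


def average_correlation(datasets):
    """Average each gene-gene correlation over datasets sharing the same genes."""
    matrices = [correlation_matrix(d) for d in datasets]
    return {pair: sum(m[pair] for m in matrices) / len(matrices)
            for pair in matrices[0]}


# Two toy "datasets" with made-up values (four cells each).
dataset_1 = {"geneA": [1, 2, 3, 4], "geneB": [2, 4, 6, 8], "geneC": [4, 3, 2, 1]}
dataset_2 = {"geneA": [0, 1, 0, 1], "geneB": [1, 3, 1, 3], "geneC": [5, 2, 5, 2]}
averaged = average_correlation([dataset_1, dataset_2])
print(averaged[("geneA", "geneB")], averaged[("geneB", "geneC")])
```

The averaged matrix could then be passed to a hierarchical clustering routine to group genes with similar co-expression profiles.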
3
Benita Y, Novik A, Cojocaru G, Borukhov I, Wool A, Kliger Y, Zekharya T, Levine Z, Nemzer S, Levy O, Toporik A. Abstract 584: From code to cure: Computational discovery of novel immune checkpoints. Cancer Res 2017. [DOI: 10.1158/1538-7445.am2017-584] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
Abstract
Antibody blockade of the CTLA4 and PD-1 immune checkpoints has emerged as an effective treatment modality for cancer. However, the majority of patients do not achieve sustained long-term benefit, suggesting a need to target additional immune checkpoints. To identify additional B7/CD28 immune checkpoint targets, we developed a unique compendium of computational algorithms that identified multiple novel targets, including TIGIT in 2008, which was an unknown protein at the time of discovery [Proc Natl Acad Sci U S A. 2009 Oct 20;106(42):17858-63], and PVRIG, which we recently disclosed. Since their initial discovery, these targets have been functionally validated, and anti-tumor activity was demonstrated with antibodies that target them.
In this presentation, we will describe the computational algorithms that led to the discovery of these novel immune checkpoints. These algorithms combine two complementary aspects: (i) endogenous immune checkpoint function prediction and (ii) prediction of immuno-modulatory activity in cancer. Immune checkpoint function was predicted based on gene structure similarity to B7/CD28 family members that is reminiscent of ancient common evolutionary origins. A gene structure alignment tool was developed to identify functional homologs of B7/CD28 genes even in the absence of sequence similarity. Next, the expression profile of these candidates was modeled and compared to profiles of known immune checkpoints in normal and cancer tissues. We will review the details of TIGIT and PVRIG discovery, which were among the immune checkpoints predicted in this process.
Our approach demonstrates the powerful ability of computational biology to translate genomic knowledge into rational and reliable drug target discovery.
Citation Format: Yair Benita, Amit Novik, Gady Cojocaru, Itamar Borukhov, Assaf Wool, Yossef Kliger, Tomer Zekharya, Zurit Levine, Sergey Nemzer, Ofer Levy, Amir Toporik. From code to cure: Computational discovery of novel immune checkpoints [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2017; 2017 Apr 1-5; Washington, DC. Philadelphia (PA): AACR; Cancer Res 2017;77(13 Suppl):Abstract nr 584. doi:10.1158/1538-7445.AM2017-584
4
Fuchs TC, Mally A, Wool A, Beiman M, Hewitt P. An Exploratory Evaluation of the Utility of Transcriptional and Urinary Kidney Injury Biomarkers for the Prediction of Aristolochic Acid–Induced Renal Injury in Male Rats. Vet Pathol 2013; 51:680-94. [DOI: 10.1177/0300985813498779] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Indexed: 01/05/2023]
Abstract
The predictive value of different urinary and transcriptional biomarkers was evaluated in a proof-of-principle toxicology study in rats using aristolochic acid (AA), a known nephrotoxic agent. Male Wistar rats were orally dosed with 0.1, 1, or 10 mg/kg for 12 days. Urine was collected on days 1, 5, and 12 over 24 hours. Gene expression analysis was also conducted using quantitative real-time polymerase chain reaction and Illumina whole-genome chips. Protein biomarkers (Kim-1, Timp-1, vascular endothelial growth factor, osteopontin, clusterin, cystatin C, calbindin D-28K, β2-microglobulin, α–glutathione S-transferase, GSTY1b, RPA-1, and neutrophil gelatinase-associated lipocalin) were measured in these urine samples. Treatment with AA resulted in a slight dose- and/or time-dependent increase in urinary β2-microglobulin, lipocalin 2, and osteopontin before an increase in serum creatinine or serum urea nitrogen was observed. A strong decrease in urinary calbindin D-28K was also detected. The Compugen Ltd. prediction model scored both the 1- and 10-mg/kg AA dose groups as positive for nephrotoxicity despite the absence of renal histopathological changes. In addition, several previously described transcriptional biomarkers were identified as early predictors of renal toxicity as they were detected before morphological alterations had occurred. Altogether, these findings demonstrated the predictive values of renal biomarkers approved by the Food and Drug Administration, European Medicines Agency, and Pharmaceuticals & Medical Devices Agency in AA-induced renal injury in rats and confirmed the utility of renal transcriptional biomarkers for detecting progression of compound-induced renal injury in rats. In addition, several transcriptional biomarkers identified in this exploratory study could present early predictors of renal tubular epithelium injury in rats.
Affiliation(s)
- T. C. Fuchs
- Merck Serono, Non-Clinical Safety, Darmstadt, Germany
- A. Mally
- Department of Toxicology, University of Wuerzburg, Wuerzburg, Germany
- A. Wool
- Compugen Ltd., Tel Aviv, Israel
- P. Hewitt
- Merck Serono, Non-Clinical Safety, Darmstadt, Germany
5
Krintel SB, Essioux L, Wool A, Johansen JS, Schreiber E, Zekharya T, Akiva P, Ostergaard M, Hetland ML. CD6 and syntaxin binding protein 6 variants and response to tumor necrosis factor alpha inhibitors in Danish patients with rheumatoid arthritis. PLoS One 2012; 7:e38539. [PMID: 22685579 PMCID: PMC3369852 DOI: 10.1371/journal.pone.0038539] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.3] [Received: 01/30/2012] [Accepted: 05/07/2012] [Indexed: 12/31/2022]
Abstract
BACKGROUND TNFα inhibitor therapy has greatly improved the treatment of patients with rheumatoid arthritis; however, at least 30% do not respond. We aimed to investigate insertions and deletions (INDELS) associated with response to TNFα inhibitors in patients with rheumatoid arthritis (RA). METHODOLOGY AND PRINCIPAL FINDINGS In the DANBIO Registry we identified 237 TNFα inhibitor naïve patients with RA (81% women; median age 56 years; disease duration 6 years) who initiated treatment with infliximab (n=160), adalimumab (n=56) or etanercept (n=21) between 1999 and 2008 according to national treatment guidelines. Clinical response was assessed at week 26 using EULAR response criteria. Based on the literature, we selected 213 INDELS potentially related to RA and treatment response using the GeneVa® (Compugen) in silico database of 350,000 genetic variations in the human genome. Genomic segments were amplified by polymerase chain reaction (PCR) and genotyped by Sanger sequencing or fragment analysis. We tested the association between genotypes and EULAR good response versus no response, and EULAR good response versus moderate/no response, using Fisher's exact test. At baseline the median DAS28 was 5.1. At week 26, 68 (29%) patients were EULAR good responders, while 81 (34%) and 88 (37%) patients were moderate and non-responders, respectively. A 19 base pair insertion within the CD6 gene was associated with EULAR good response vs. no response (OR=4.43, 95% CI: 1.99-10.09, p=7.211×10⁻⁵) and with EULAR good response vs. moderate/no response (OR=4.54, 95% CI: 2.29-8.99, p=3.336×10⁻⁶). A microsatellite within the syntaxin binding protein 6 (STXBP6) gene was associated with EULAR good response vs. no response (OR=4.01, 95% CI: 1.92-8.49, p=5.067×10⁻⁵). CONCLUSION Genetic variations within CD6 and STXBP6 may influence response to TNFα inhibitors in patients with RA.
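The abstract above tests genotype against response category with Fisher's exact test and reports odds ratios. As a minimal sketch of how such a test works on a 2×2 table, the following computes a two-sided p-value and the sample odds ratio using only the standard library. The table values are hypothetical and are not the study's genotype counts, which the abstract does not give; the reported odds ratios may also have been estimated differently (for example by conditional maximum likelihood).

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= p_observed * (1 + 1e-7))


def sample_odds_ratio(a, b, c, d):
    """Cross-product odds ratio; infinite when b * c is zero."""
    return float("inf") if b * c == 0 else (a * d) / (b * c)


# Hypothetical table: rows = variant carrier / non-carrier,
# columns = good responder / non-responder.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
odds = sample_odds_ratio(3, 1, 1, 3)
print(round(p_value, 4), odds)
```

With the toy table the p-value is 34/70, because only the four tables at least as extreme as the observed one contribute to the sum.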
MESH Headings
- Adalimumab
- Adult
- Aged
- Aged, 80 and over
- Antibodies, Monoclonal/therapeutic use
- Antibodies, Monoclonal, Humanized/therapeutic use
- Antigens, CD/genetics
- Antigens, Differentiation, T-Lymphocyte/genetics
- Antirheumatic Agents/therapeutic use
- Arthritis, Rheumatoid/drug therapy
- Arthritis, Rheumatoid/genetics
- Carrier Proteins/genetics
- Cohort Studies
- DNA Mutational Analysis
- Denmark
- Etanercept
- Female
- Genotype
- Humans
- INDEL Mutation
- Immunoglobulin G/therapeutic use
- Infliximab
- Male
- Middle Aged
- Polymerase Chain Reaction
- Receptors, Tumor Necrosis Factor/therapeutic use
- Treatment Outcome
- Tumor Necrosis Factor-alpha/antagonists & inhibitors
- Young Adult
Affiliation(s)
- Sophine B Krintel
- DANBIO Registry and Department of Rheumatology, Copenhagen University Hospital at Glostrup, Copenhagen, Denmark.
6
Pini A, Shemesh R, Samuel CS, Bathgate RAD, Zauberman A, Hermesh C, Wool A, Bani D, Rotman G. Prevention of bleomycin-induced pulmonary fibrosis by a novel antifibrotic peptide with relaxin-like activity. J Pharmacol Exp Ther 2010; 335:589-99. [PMID: 20826567 DOI: 10.1124/jpet.110.170977] [Citation(s) in RCA: 48] [Impact Index Per Article: 3.4] [Indexed: 12/19/2022]
Abstract
Pulmonary fibrosis is a progressive and lethal lung disease characterized by accumulation of extracellular matrix and loss of pulmonary function. No cure exists for this pathologic condition, and current treatments often fail to slow its progression or relieve its symptoms. Relaxin was previously shown to induce a matrix-degrading phenotype in human lung fibroblasts in vitro and to inhibit pulmonary fibrosis in vivo. A novel peptide that targets the relaxin RXFP1/LGR7 receptor was recently identified using our computational platform designed to predict novel G protein-coupled receptor peptide agonists. In this study, we examined the antifibrotic properties of this novel peptide, designated CGEN25009, in human cell-based assays and in a murine model of bleomycin-induced pulmonary fibrosis. Similar to relaxin, CGEN25009 was found to have an inhibitory effect on transforming growth factor-β1-induced collagen deposition in human dermal fibroblasts and to enhance MMP-2 expression. The peptide's biological activity was also similar to relaxin in generating cellular stimulation of cAMP, cGMP, and NO in the THP-1 human cell line. In vivo, 2-week administration of CGEN25009 in a preventive or therapeutic mode (i.e., concurrently with or 7 days after bleomycin treatment, respectively) caused a significant reduction in lung inflammation and injury and ameliorated adverse airway remodeling and peribronchial fibrosis. The results of this study indicate that CGEN25009 displays antifibrotic and anti-inflammatory properties and may offer a new therapeutic option for the treatment of pulmonary fibrosis.
7
Shemesh R, Hermesh C, Toporik A, Levine Z, Novik A, Wool A, Kliger Y, Rosenberg A, Bathgate RAD, Cohen Y. Activation of relaxin-related receptors by short, linear peptides derived from a collagen-containing precursor. Ann N Y Acad Sci 2009; 1160:78-86. [PMID: 19416163 DOI: 10.1111/j.1749-6632.2009.03827.x] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Indexed: 01/29/2023]
Abstract
In a screening effort based on algorithmic predictions for novel G-protein-coupled receptor (GPCR) peptide activators, we were able to identify and examine two novel peptides (P59 and P74) which are short, linear, and derived from a natural, previously unidentified precursor protein containing a collagen-like repeat. Both peptides seemed to show an apparent cAMP-related effect on CHO-K1 cells transiently transfected with either LGR7 or LGR8, usually after treatment with cAMP-generating forskolin, compared to the same cells treated with forskolin plus relaxin. This activation was not found for the relaxin-3 receptor (GPCR135). In a set of follow-up experiments, both peptides were found to stimulate cAMP production, mostly upon initial stimulation of cAMP production by 5 µM forskolin in cells transfected with either LGR7 or LGR8. In a dye-free cell impedance GPCR activation assay, we were able to show that these peptides were also able to activate a cellular response mediated by these receptors. Although untransfected CHO-K1 cells showed some cellular activation by both relaxin and at least one of our newly discovered peptides, both LGR7- and LGR8-transfected cells showed a stronger response, indicating stimulation of a cellular pathway through activation of these receptors. In conclusion, we were able to show that these newly discovered peptides, which have no similarity to any member of the relaxin-insulin-like peptide family, are potential ligands for the relaxin-related family of receptors and as such might serve as novel candidates for relaxin-related therapeutic indications. Both peptides are linear and were found to be active after being chemically synthesized.
8
Shemesh R, Toporik A, Levine Z, Hecht I, Rotman G, Wool A, Dahary D, Gofer E, Kliger Y, Soffer MA, Rosenberg A, Eshel D, Cohen Y. Discovery and validation of novel peptide agonists for G-protein-coupled receptors. J Biol Chem 2008; 283:34643-9. [PMID: 18854305 DOI: 10.1074/jbc.m805181200] [Citation(s) in RCA: 61] [Impact Index Per Article: 3.8] [Indexed: 12/31/2022]
Abstract
G-protein-coupled receptors (GPCRs) represent an important group of targets for pharmaceutical therapeutics. The completion of the human genome revealed a large number of putative GPCRs. However, the identification of their natural ligands, and especially peptides, suffers from low discovery rates, thus impeding development of therapeutics based on these potential drug targets. We describe the discovery of novel GPCR ligands encrypted in the human proteome. Hundreds of potential peptide ligands were predicted by machine learning algorithms. In vitro screening of 33 selected peptides on a set of 152 GPCRs, including a group of designated orphan receptors, was conducted by intracellular calcium measurements and cAMP assays. The screening revealed eight novel peptides as potential agonists that specifically activated six different receptors in a dose-dependent manner. Most of the peptides showed distinct stimulatory patterns targeted at designated and orphan GPCRs. Further analysis demonstrated a significant in vivo effect for one of the peptides in a mouse inflammation model.
Affiliation(s)
- Ronen Shemesh
- Compugen Limited, 72 Pinchas Rosen St., Tel Aviv 69512, Israel.
9
Abstract
Motivation: Many secretory proteins are synthesized as inactive precursors that must undergo post-translational proteolysis in order to mature and become active. In the current study, we address the challenge of sequence-based discovery of proteolytic sites in secreted proteins using machine learning.
Results: The results revealed that only half of the extracellular proteolytic sites are currently annotated, leaving over 3600 unannotated ones. Furthermore, we have found that only 6% of the unannotated sites are similar to known proteolytic sites, whereas the remaining 94% do not share significant similarity with any annotated proteolytic site. The computational challenges in these two cases are very different. While the precision in detecting the former group is close to perfect, only 22% of the latter group were detected with a precision of 80%. The applicability of the classifier is demonstrated through members of the FGF family, in which we verified the conservation of physiologically relevant proteolytic sites in homologous proteins.
Contact: kliger@compugen.co.il; yossef.kliger@gmail.com
Supplementary information: Supplementary data are available at Bioinformatics online.
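The statement above that a fraction of sites was detected at a given precision corresponds to reading recall off a ranked list of predictions at a fixed precision level. The sketch below shows one way to compute that quantity, assuming each candidate site has a classifier score and a known true/false label. The function name and the toy labels are hypothetical, and the abstract does not describe how its own operating point was chosen.

```python
def recall_at_precision(scored_sites, target_precision):
    """Best recall reachable while precision stays at or above the target.

    scored_sites is a list of (score, is_true_site) pairs; candidates are
    accepted in order of decreasing score. Tied scores are not treated
    specially in this simplified version.
    """
    ranked = sorted(scored_sites, key=lambda pair: pair[0], reverse=True)
    total_true = sum(1 for _, is_true in ranked if is_true)
    if total_true == 0:
        return 0.0

    best_recall = 0.0
    true_positives = 0
    for accepted, (_, is_true) in enumerate(ranked, start=1):
        if is_true:
            true_positives += 1
        precision = true_positives / accepted
        if precision >= target_precision:
            best_recall = max(best_recall, true_positives / total_true)
    return best_recall


# Toy ranking with made-up scores: 1 marks a true site, 0 a false candidate.
labels_by_rank = [1, 1, 1, 1, 0, 1, 0, 0, 1, 0]
toy_sites = [(1.0 - 0.1 * rank, bool(label))
             for rank, label in enumerate(labels_by_rank)]
recall = recall_at_precision(toy_sites, 0.80)
print(round(recall, 3))
```

In the toy ranking, accepting the top six candidates gives five true sites out of six accepted, so five of the six true sites are recovered while precision remains above 80%.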
10
Abstract
TwinPeaks, a close variant of the SEQUEST protein identification algorithm, is capable of unrestricted, large-scale identification of post-translational modifications (PTMs). TwinPeaks is applied to a sample of 100441 tandem mass spectra from the HUPO Plasma Proteome Project data set, with the full non-redundant human protein database as the reference. With a 3.5% error rate, TwinPeaks identifies a collection of 539 spectra that were not identified by the usual PTM-restricted identification algorithm. At this error rate, TwinPeaks increases the rate of spectra identifications by at least 17.6%, making unrestricted PTM identification an integral part of proteomics.
Affiliation(s)
- Moshe Havilio
- Notal Vision Limited, 5 Droyanov Street, Tel Aviv, 63143 Israel.
11
Abstract
Identification of proteins using matrix-assisted laser desorption/ionization-time of flight (MALDI-TOF) peptide mass fingerprinting (PMF) is a key technique in proteomics. The method is known to be sensitive as well as amenable to high-throughput operation, but the resulting identifications suffer from a relatively low level of confidence. One way of increasing the confidence is by improving measurement accuracy using one of several calibration methods. This paper presents a new strategy for calibration of MALDI-TOF PMF spectra that makes use of the phenomenon of peptide mass clustering, and enables spectrum calibration prior to the step of database interrogation, before or after peak extraction. Typically, mass errors are reduced by 40-60%. Accuracy improvement at this early stage can help avoid losing protein candidates, reduce the number of external calibration spots, eliminate internal calibrants, and reduce the number of candidates being scored, thereby reducing analysis time. Different variants of the method are discussed and compared to known calibration methods, such as relying on known calibrants or comparison to putative database candidates. In order to allow precise description of the method and to place the results in perspective, theoretical considerations of peptide databases and scoring functions are also discussed.
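Calibration by peptide mass clustering, as described in the abstract above, relies on peptide masses falling near regularly spaced values, so a systematic measurement error appears as a shared shift away from those values. The following is a heavily simplified sketch of that idea and not the published method: it assumes a cluster spacing of about 1.000495 Da (a commonly quoted figure, used here as an assumption), assumes cluster centres at exact multiples of that spacing, and models the error as a single constant offset, whereas real MALDI-TOF errors are usually mass dependent.

```python
import math

# Assumed spacing between peptide mass clusters, in Da (an approximation).
CLUSTER_SPACING = 1.000495


def estimate_offset(masses, spacing=CLUSTER_SPACING):
    """Estimate a constant mass offset from the peaks alone.

    Each mass is mapped to a phase within its cluster period; the circular
    mean of those phases gives the shared shift from the cluster centres.
    The result lies between -spacing/2 and +spacing/2.
    """
    sin_sum = cos_sum = 0.0
    for mass in masses:
        phase = 2 * math.pi * ((mass / spacing) % 1.0)
        sin_sum += math.sin(phase)
        cos_sum += math.cos(phase)
    mean_phase = math.atan2(sin_sum, cos_sum)
    return mean_phase / (2 * math.pi) * spacing


def calibrate(masses, spacing=CLUSTER_SPACING):
    """Subtract the estimated offset from every measured mass."""
    offset = estimate_offset(masses, spacing)
    return [mass - offset for mass in masses]


# Synthetic peaks: ideal cluster centres shifted by a made-up 0.2 Da error.
measured = [CLUSTER_SPACING * n + 0.2 for n in (800, 1200, 1500, 2000)]
offset = estimate_offset(measured)
calibrated = calibrate(measured)
print(round(offset, 4))
```

Because the estimate uses only the measured peaks, it can be applied before any database search, which is the property the abstract highlights; a practical version would fit a mass-dependent correction rather than a single offset.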