1
Parente AD, Bolland DE, Huisinga KL, Provost JJ. Physiology of malate dehydrogenase and how dysregulation leads to disease. Essays Biochem 2024:EBC20230085. [PMID: 38962852 DOI: 10.1042/ebc20230085] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/03/2024] [Revised: 06/10/2024] [Accepted: 06/12/2024] [Indexed: 07/05/2024]
Abstract
Malate dehydrogenase (MDH) is pivotal in mammalian tissue metabolism, participating in various pathways beyond its classical roles and highlighting its adaptability to cellular demands. This enzyme is involved in maintaining redox balance, lipid synthesis, and glutamine metabolism and supports rapidly proliferating cells' energetic and biosynthetic needs. The involvement of MDH in glutamine metabolism underlines its significance in cell physiology. In contrast, its contribution to lipid metabolism highlights its role in essential biosynthetic processes necessary for cell maintenance and proliferation. The enzyme's regulatory mechanisms, such as post-translational modifications, underscore its complexity and importance in metabolic regulation, positioning MDH as a potential target in metabolic dysregulation. Furthermore, the association of MDH with various pathologies, including cancer and neurological disorders, suggests its involvement in disease progression. The overexpression of MDH isoforms MDH1 and MDH2 in cancers like breast, prostate, and pancreatic ductal adenocarcinoma, alongside structural modifications, implies their critical role in the metabolic adaptation of tumor cells. Additionally, mutations in MDH2 linked to pheochromocytomas, paragangliomas, and other metabolic diseases emphasize MDH's role in metabolic homeostasis. This review spotlights MDH's potential as a biomarker and therapeutic target, advocating for further research into its multifunctional roles and regulatory mechanisms in health and disease.
Affiliation(s)
- Amy D Parente
  - Department of Chemistry and Biochemistry, Mercyhurst University, Erie, PA, U.S.A
- Danielle E Bolland
  - Department of Biology, University of Minnesota Morris, Morris, MN 56267, U.S.A
- Kathryn L Huisinga
  - Department of Chemistry and Biochemistry, Malone University, Canton, OH 44709, U.S.A
- Joseph J Provost
  - Department of Chemistry and Biochemistry, University of San Diego, San Diego, CA 92110, U.S.A
2
Jeong SK, Kim CY, Paik YK. ASV-ID, a Proteogenomic Workflow To Predict Candidate Protein Isoforms on the Basis of Transcript Evidence. J Proteome Res 2018; 17:4235-4242. [PMID: 30289715 DOI: 10.1021/acs.jproteome.8b00548] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Indexed: 11/30/2022]
Abstract
One of the goals of the Chromosome-Centric Human Proteome Project (C-HPP) is to map and characterize the functions of protein isoforms produced by alternative splicing of genes. However, identifying alternative splice variants (ASVs) via mass spectrometry remains a major challenge, because ASVs usually contain highly homologous peptide sequences. A routine protein sequence analysis suggests that more than half of the investigated proteins do not generate two or more uniquely mapping peptides that would enable their isoforms to be distinguished. Here, we develop a new proteogenomics method, named "ASV-ID" (alternative splicing variants identification), which enables identification of ASVs by using a cell type-specific protein sequence database that is supported by RNA-Seq data. Using this workflow, we identify 1935 distinct proteins under highly stringent conditions. In fact, transcript evidence for 841 of these proteins helps us distinguish them from other isoforms, even though these proteins are not predicted to yield two or more uniquely mapping peptides. We also demonstrate that ASV-ID enables detection of 19 differentially expressed isoforms present in several cell lines. Thus, a new workflow using ASV-ID has the potential to map yet-to-be-identified difficult protein isoforms in a simple and robust way.
3
Wang J, Dumartin L, Mafficini A, Ulug P, Sangaralingam A, Alamiry NA, Radon TP, Salvia R, Lawlor RT, Lemoine NR, Scarpa A, Chelala C, Crnogorac-Jurcevic T. Splice variants as novel targets in pancreatic ductal adenocarcinoma. Sci Rep 2017; 7:2980. [PMID: 28592875 PMCID: PMC5462735 DOI: 10.1038/s41598-017-03354-z] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.7] [Received: 03/10/2017] [Accepted: 04/26/2017] [Indexed: 12/22/2022]
Abstract
Despite a wealth of genomic information, a comprehensive alternative splicing (AS) analysis of pancreatic ductal adenocarcinoma (PDAC) has not been performed yet. In the present study, we assessed whole exome-based transcriptome and AS profiles of 43 pancreas tissues using Affymetrix exon array. The AS analysis of PDAC indicated on average two AS probe-sets (ranging from 1-28) in 1,354 significantly identified protein-coding genes, with skipped exon and alternative first exon being the most frequently utilised. In addition to overrepresented extracellular matrix (ECM)-receptor interaction and focal adhesion that were also seen in transcriptome differential expression (DE) analysis, Fc gamma receptor-mediated phagocytosis and axon guidance AS genes were also highly represented. Of note, the highest numbers of AS probe-sets were found in collagen genes, which encode the characteristically abundant stroma seen in PDAC. We also describe a set of 37 'hypersensitive' genes which were frequently targeted by somatic mutations, copy number alterations, DE and AS, indicating their propensity for multidimensional regulation. We provide the most comprehensive overview of the AS landscape in PDAC with underlying changes in the spliceosomal machinery. We also collate a set of AS and DE genes encoding cell surface proteins, which present promising diagnostic and therapeutic targets in PDAC.
Affiliation(s)
- Jun Wang
  - Centre for Molecular Oncology, Barts Cancer Institute, Queen Mary University of London, John Vane Science Centre, London, EC1M 6BQ, UK
- Laurent Dumartin
  - Centre for Molecular Oncology, Barts Cancer Institute, Queen Mary University of London, John Vane Science Centre, London, EC1M 6BQ, UK
- Andrea Mafficini
  - ARC-Net Research Centre and Department of Diagnostics and Public Health, Section of Pathology, University and Hospital Trust of Verona, Verona, Italy
- Pinar Ulug
  - Centre for Molecular Oncology, Barts Cancer Institute, Queen Mary University of London, John Vane Science Centre, London, EC1M 6BQ, UK
- Ajanthah Sangaralingam
  - Centre for Molecular Oncology, Barts Cancer Institute, Queen Mary University of London, John Vane Science Centre, London, EC1M 6BQ, UK
- Namaa Audi Alamiry
  - Centre for Molecular Oncology, Barts Cancer Institute, Queen Mary University of London, John Vane Science Centre, London, EC1M 6BQ, UK
- Tomasz P Radon
  - Centre for Molecular Oncology, Barts Cancer Institute, Queen Mary University of London, John Vane Science Centre, London, EC1M 6BQ, UK
- Roberto Salvia
  - ARC-Net Research Centre and Department of Diagnostics and Public Health, Section of Pathology, University and Hospital Trust of Verona, Verona, Italy
- Rita T Lawlor
  - ARC-Net Research Centre and Department of Diagnostics and Public Health, Section of Pathology, University and Hospital Trust of Verona, Verona, Italy
- Nicholas R Lemoine
  - Centre for Molecular Oncology, Barts Cancer Institute, Queen Mary University of London, John Vane Science Centre, London, EC1M 6BQ, UK
- Aldo Scarpa
  - ARC-Net Research Centre and Department of Diagnostics and Public Health, Section of Pathology, University and Hospital Trust of Verona, Verona, Italy
- Claude Chelala
  - Centre for Molecular Oncology, Barts Cancer Institute, Queen Mary University of London, John Vane Science Centre, London, EC1M 6BQ, UK
- Tatjana Crnogorac-Jurcevic
  - Centre for Molecular Oncology, Barts Cancer Institute, Queen Mary University of London, John Vane Science Centre, London, EC1M 6BQ, UK
4
Tran TT, Bollineni RC, Strozynski M, Koehler CJ, Thiede B. Identification of Alternative Splice Variants Using Unique Tryptic Peptide Sequences for Database Searches. J Proteome Res 2017; 16:2571-2578. [PMID: 28508642 DOI: 10.1021/acs.jproteome.7b00126] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Indexed: 01/16/2023]
Abstract
Alternative splicing is a mechanism in eukaryotes by which different forms of mRNAs are generated from the same gene. Identification of alternative splice variants requires the identification of peptides specific for alternative splice forms. For this purpose, we generated a human database that contains only unique tryptic peptides specific for alternative splice forms from Swiss-Prot entries. This database allows easy access to splice variant-specific peptide sequences that match MS data. Furthermore, we combined this database, without alternative splice variant-1-specific peptides, with human Swiss-Prot. This combined database can be used as a general database for searching LC-MS data. LC-MS data derived from in-solution digests of two different cell lines (LNCaP, HeLa) and from phosphoproteomics studies were analyzed using these two databases. Several nonalternative splice variant-1-specific peptides were found in both cell lines, and some of them seemed to be cell-line-specific. Control and apoptotic phosphoproteomes from Jurkat T cells revealed several nonalternative splice variant-1-specific peptides, and some of them showed clear quantitative differences between the two states.
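The idea behind a database of splice-variant-specific tryptic peptides can be illustrated with a minimal sketch. This is not the authors' pipeline: the two isoform sequences are invented toy data, the function names are hypothetical, and the digestion rule is simplified (cleavage after K or R unless the next residue is P, with no missed cleavages).

```python
from collections import defaultdict


def tryptic_peptides(sequence, min_length=6):
    """In-silico trypsin digest: cut after K or R unless followed by P.

    Simplified rule with no missed cleavages; peptides shorter than
    min_length are dropped because they are rarely informative in MS.
    """
    peptides, start = set(), 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.add(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.add(sequence[start:])  # C-terminal remainder
    return {p for p in peptides if len(p) >= min_length}


def isoform_specific_peptides(isoforms, min_length=6):
    """Return, per isoform, the peptides produced by that isoform only."""
    owners = defaultdict(set)
    for name, sequence in isoforms.items():
        for peptide in tryptic_peptides(sequence, min_length):
            owners[peptide].add(name)
    unique = defaultdict(set)
    for peptide, names in owners.items():
        if len(names) == 1:
            unique[next(iter(names))].add(peptide)
    return dict(unique)


# Toy example (hypothetical sequences): isoform_2 skips the segment "LLVNQWK".
isoforms = {
    "isoform_1": "MAAGTLKSPEEDLLVNQWKGGHTYPEK",
    "isoform_2": "MAAGTLKSPEEDGGHTYPEK",
}
unique = isoform_specific_peptides(isoforms)
for name in sorted(unique):
    print(name, sorted(unique[name]))
```

In this toy case the shared N-terminal peptide `MAAGTLK` is discarded, and only the peptides that span or follow the skipped segment remain as isoform-specific evidence.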
Affiliation(s)
- Trung T Tran
  - Department of Biosciences, University of Oslo, Oslo 0316, Norway
- Ravi C Bollineni
  - Department of Biosciences, University of Oslo, Oslo 0316, Norway
- Bernd Thiede
  - Department of Biosciences, University of Oslo, Oslo 0316, Norway
5
Abstract
The Protein Ontology (PRO) is the reference ontology for proteins in the Open Biomedical Ontologies (OBO) foundry and consists of three sub-ontologies representing protein classes of homologous genes, proteoforms (e.g., splice isoforms, sequence variants, and post-translationally modified forms), and protein complexes. PRO defines classes of proteins and protein complexes, both species-specific and species nonspecific, and indicates their relationships in a hierarchical framework, supporting accurate protein annotation at the appropriate level of granularity, analyses of protein conservation across species, and semantic reasoning. In the first section of this chapter, we describe the PRO framework including categories of PRO terms and the relationship of PRO to other ontologies and protein resources. Next, we provide a tutorial about the PRO website (proconsortium.org) where users can browse and search the PRO hierarchy, view reports on individual PRO terms, and visualize relationships among PRO terms in a hierarchical table view, a multiple sequence alignment view, and a Cytoscape network view. Finally, we describe several examples illustrating the unique and rich information available in PRO.
6
Guerrero CR, Jagtap PD, Johnson JE, Griffin TJ. Using Galaxy for Proteomics. PROTEOME INFORMATICS 2016. [DOI: 10.1039/9781782626732-00289] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 01/28/2023]
Abstract
The area of informatics for mass spectrometry (MS)-based proteomics data has steadily grown over the last two decades. Numerous effective software programs now exist for various aspects of proteomic informatics. However, many researchers still have difficulties in using this software. These difficulties arise from problems with running and integrating disparate software programs, scalability issues when dealing with large data volumes, and the inability to share and reproduce workflows composed of different software. The Galaxy framework for bioinformatics provides an attractive option for solving many of these current issues in proteomic informatics. Galaxy was originally developed as a workbench to enable genomic data analysis, and numerous researchers are now turning to it to implement software for MS-based proteomics applications. Here, we provide an introduction to Galaxy and its features, and describe how software tools are deployed, published and shared via the scalable framework. We also describe some of the existing tools in Galaxy for basic MS-based proteomics data analysis and informatics. Finally, we describe how proteomics tools in Galaxy can be combined with other existing tools for genomic and transcriptomic data analysis to enable powerful multi-omic data analysis applications.
Affiliation(s)
- Candace R. Guerrero
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota 321 Church St SE/6-155 Jackson Hall Minneapolis MN 55455 USA
- Pratik D. Jagtap
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota 321 Church St SE/6-155 Jackson Hall Minneapolis MN 55455 USA
  - Center for Mass Spectrometry and Proteomics, University of Minnesota 1479 Gortner Avenue, St. Paul MN 55108 USA
- James E. Johnson
  - Minnesota Supercomputing Institute, University of Minnesota 512 Walter Library, 117 Pleasant Street SE Minneapolis MN 55455 USA
- Timothy J. Griffin
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota 321 Church St SE/6-155 Jackson Hall Minneapolis MN 55455 USA
  - Center for Mass Spectrometry and Proteomics, University of Minnesota 1479 Gortner Avenue, St. Paul MN 55108 USA
7
Gan L, Yang B, Mei H. The effect of iron dextran on the transcriptome of pig hippocampus. Genes Genomics 2016. [DOI: 10.1007/s13258-016-0469-4] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 01/15/2023]
8
Sheynkman GM, Shortreed MR, Cesnik AJ, Smith LM. Proteogenomics: Integrating Next-Generation Sequencing and Mass Spectrometry to Characterize Human Proteomic Variation. ANNUAL REVIEW OF ANALYTICAL CHEMISTRY (PALO ALTO, CALIF.) 2016; 9:521-45. [PMID: 27049631 PMCID: PMC4991544 DOI: 10.1146/annurev-anchem-071015-041722] [Citation(s) in RCA: 73] [Impact Index Per Article: 9.1] [Indexed: 05/09/2023]
Abstract
Mass spectrometry-based proteomics has emerged as the leading method for detection, quantification, and characterization of proteins. Nearly all proteomic workflows rely on proteomic databases to identify peptides and proteins, but these databases typically contain a generic set of proteins that lack variations unique to a given sample, precluding their detection. Fortunately, proteogenomics enables the detection of such proteomic variations and can be defined, broadly, as the use of nucleotide sequences to generate candidate protein sequences for mass spectrometry database searching. Proteogenomics is experiencing heightened significance due to two developments: (a) advances in DNA sequencing technologies that have made complete sequencing of human genomes and transcriptomes routine, and (b) the unveiling of the tremendous complexity of the human proteome as expressed at the levels of genes, cells, tissues, individuals, and populations. We review here the field of human proteogenomics, with an emphasis on its history, current implementations, the types of proteomic variations it reveals, and several important applications.
Affiliation(s)
- Gloria M Sheynkman
  - Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, Massachusetts 02215
  - Department of Genetics, Harvard Medical School, Boston, Massachusetts 02115
  - Department of Chemistry, University of Wisconsin, Madison, Wisconsin 53706
- Michael R Shortreed
  - Department of Chemistry, University of Wisconsin, Madison, Wisconsin 53706
- Anthony J Cesnik
  - Department of Chemistry, University of Wisconsin, Madison, Wisconsin 53706
- Lloyd M Smith
  - Department of Chemistry, University of Wisconsin, Madison, Wisconsin 53706
  - Genome Center of Wisconsin, University of Wisconsin, Madison, Wisconsin 53706
9
Tavares R, Wajnberg G, Scherer NDM, Pauletti BA, Cassoli JS, Ferreira CG, Paes Leme AF, de Araujo-Souza PS, Martins-de-Souza D, Passetti F. Unveiling alterative splice diversity from human oligodendrocyte proteome data. J Proteomics 2016; 151:293-301. [PMID: 27222040 DOI: 10.1016/j.jprot.2016.05.023] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Received: 10/13/2015] [Revised: 05/14/2016] [Accepted: 05/20/2016] [Indexed: 10/21/2022]
Abstract
Oligodendrocytes produce and maintain the myelin sheath of axons in the central nervous system. Because misassembled myelin sheaths have been associated with brain disorders such as multiple sclerosis and schizophrenia, recent advances have been made towards the description of the oligodendrocyte proteome. The identification of splice variants represented in the proteome is as important as determining the level of oligodendrocyte-associated proteins. Here, we used an oligodendrocyte proteome dataset deposited in ProteomeXchange to search against a customized protein sequence file containing computationally predicted splice variants. Our approach resulted in the identification of 39 splice variants, including one variant from the GTPase KRAS gene and another from the human glutaminase gene family. We also detected the mRNA expression of five selected splice variants and demonstrated that a fraction of these have their canonical proteins participating in direct protein-protein interactions. In conclusion, we believe our findings contribute to the molecular characterization of oligodendrocytes and may encourage other research groups working with central nervous system disorders to investigate the biological significance of these splice variants. The splice variants identified in this study may encode proteins that could be targeted in novel treatment strategies and diagnostic methods.
SIGNIFICANCE: Several disorders of the central nervous system (CNS) are associated with misassembled myelin sheaths, which are produced and maintained by oligodendrocytes (OL). Recently, the OL proteome has been explored to identify key proteins and molecular functions associated with CNS disorders. We developed an innovative approach to select, with a higher level of confidence, a relevant list of splice variants from a proteome dataset and detected the mRNA expression of five selected variants: EEF1D, KRAS, MFF, SDR39U1, and SUGT1. We also described splice variants extracted from OL proteome data. Among the splice variants identified, some are from genes previously linked to CNS and related disorders. Our findings may contribute to oligodendrocyte characterization and encourage other research groups to investigate the biological role of splice variants and to improve current treatments and diagnostic methods for CNS disorders.
Affiliation(s)
- Raphael Tavares
  - Laboratory of Functional Genomics and Bioinformatics, Oswaldo Cruz Institute, Fundação Oswaldo Cruz (FIOCRUZ), Rio de Janeiro, RJ, Brazil; Bioinformatics Unit, Clinical Research Coordination, Instituto Nacional de Câncer (INCA), Rio de Janeiro, RJ, Brazil
- Gabriel Wajnberg
  - Laboratory of Functional Genomics and Bioinformatics, Oswaldo Cruz Institute, Fundação Oswaldo Cruz (FIOCRUZ), Rio de Janeiro, RJ, Brazil; Bioinformatics Unit, Clinical Research Coordination, Instituto Nacional de Câncer (INCA), Rio de Janeiro, RJ, Brazil
- Nicole de Miranda Scherer
  - Bioinformatics Unit, Clinical Research Coordination, Instituto Nacional de Câncer (INCA), Rio de Janeiro, RJ, Brazil
- Bianca Alves Pauletti
  - Laboratório de Espectrometria de Massas, Laboratório Nacional de Biociências (LNBio), CNPEM, Campinas, SP, Brazil
- Juliana S Cassoli
  - Laboratory of Neuroproteomics, Department of Biochemistry and Tissue Biology, Institute of Biology, University of Campinas (UNICAMP), Campinas, SP, Brazil
- Carlos Gil Ferreira
  - Clinical Research Coordination, Instituto Nacional de Câncer (INCA), Rio de Janeiro, RJ, Brazil
- Adriana Franco Paes Leme
  - Laboratório de Espectrometria de Massas, Laboratório Nacional de Biociências (LNBio), CNPEM, Campinas, SP, Brazil
- Patricia Savio de Araujo-Souza
  - Department of Immunobiology, Fluminense Federal University (UFF), Niterói, RJ, Brazil; Program of Cellular Biology, Instituto Nacional de Câncer (INCA), Rio de Janeiro, RJ, Brazil
- Daniel Martins-de-Souza
  - Laboratory of Neuroproteomics, Department of Biochemistry and Tissue Biology, Institute of Biology, University of Campinas (UNICAMP), Campinas, SP, Brazil
- Fabio Passetti
  - Laboratory of Functional Genomics and Bioinformatics, Oswaldo Cruz Institute, Fundação Oswaldo Cruz (FIOCRUZ), Rio de Janeiro, RJ, Brazil; Bioinformatics Unit, Clinical Research Coordination, Instituto Nacional de Câncer (INCA), Rio de Janeiro, RJ, Brazil
10
Jenkinson C, Elliott VL, Evans A, Oldfield L, Jenkins RE, O’Brien DP, Apostolidou S, Gentry-Maharaj A, Fourkala EO, Jacobs IJ, Menon U, Cox T, Campbell F, Pereira SP, Tuveson DA, Park BK, Greenhalf W, Sutton R, Timms JF, Neoptolemos JP, Costello E. Decreased Serum Thrombospondin-1 Levels in Pancreatic Cancer Patients Up to 24 Months Prior to Clinical Diagnosis: Association with Diabetes Mellitus. Clin Cancer Res 2016; 22:1734-1743. [PMID: 26573598 PMCID: PMC4820087 DOI: 10.1158/1078-0432.ccr-15-0879] [Citation(s) in RCA: 62] [Impact Index Per Article: 7.8] [Received: 04/10/2015] [Accepted: 09/19/2015] [Indexed: 12/24/2022]
Abstract
PURPOSE: Identification of serum biomarkers enabling earlier diagnosis of pancreatic ductal adenocarcinoma (PDAC) could improve outcome. Serum protein profiles in patients with preclinical disease and at diagnosis were investigated.
EXPERIMENTAL DESIGN: Serum from cases up to 4 years prior to PDAC diagnosis and controls (UKCTOCS, n = 174) were studied, alongside samples from patients diagnosed with PDAC, chronic pancreatitis, benign biliary disease, type 2 diabetes mellitus, and healthy subjects (n = 298). Isobaric tags for relative and absolute quantification (iTRAQ) enabled comparisons of pooled serum from a test set (n = 150). Validation was undertaken using multiple reaction monitoring (MRM) and/or Western blotting in all 472 human samples and samples from a KPC mouse model.
RESULTS: iTRAQ identified thrombospondin-1 (TSP-1) as reduced preclinically and in diagnosed samples. MRM confirmed significant reduction in levels of TSP-1 up to 24 months prior to diagnosis. A combination of TSP-1 and CA19-9 gave an AUC of 0.86, significantly outperforming both markers alone (0.69 and 0.77, respectively; P < 0.01). TSP-1 was also decreased in PDAC patients compared with healthy controls (P < 0.05) and patients with benign biliary obstruction (P < 0.01). Low levels of TSP-1 correlated with poorer survival, preclinically (P < 0.05) and at clinical diagnosis (P < 0.02). In PDAC patients, reduced TSP-1 levels were more frequently observed in those with confirmed diabetes mellitus (P < 0.01). Significantly lower levels were also observed in PDAC patients with diabetes compared with individuals with type 2 diabetes mellitus (P = 0.01).
CONCLUSIONS: Circulating TSP-1 levels decrease up to 24 months prior to diagnosis of PDAC and significantly enhance the diagnostic performance of CA19-9. The influence of diabetes mellitus on biomarker behavior should be considered in future studies.
Affiliation(s)
- Claire Jenkinson
  - Department of Molecular and Clinical Cancer Medicine, University of Liverpool, UK
  - National Institute for Health Research Liverpool Pancreas Biomedical Research Unit, Royal Liverpool University Hospital, UK
- Victoria L. Elliott
  - Department of Molecular and Clinical Cancer Medicine, University of Liverpool, UK
  - National Institute for Health Research Liverpool Pancreas Biomedical Research Unit, Royal Liverpool University Hospital, UK
- Anthony Evans
  - Department of Molecular and Clinical Cancer Medicine, University of Liverpool, UK
  - National Institute for Health Research Liverpool Pancreas Biomedical Research Unit, Royal Liverpool University Hospital, UK
- Lucy Oldfield
  - Department of Molecular and Clinical Cancer Medicine, University of Liverpool, UK
  - National Institute for Health Research Liverpool Pancreas Biomedical Research Unit, Royal Liverpool University Hospital, UK
- Rosalind E. Jenkins
  - MRC Centre for Drug Safety Science, Department of Pharmacology and Therapeutics, University of Liverpool, UK
- Darragh P. O’Brien
  - Department of Women’s Cancer, Institute for Women’s Health, University College London, UK
- Sophia Apostolidou
  - Department of Women’s Cancer, Institute for Women’s Health, University College London, UK
- Evangelia-O Fourkala
  - Department of Women’s Cancer, Institute for Women’s Health, University College London, UK
- Ian J. Jacobs
  - Department of Women’s Cancer, Institute for Women’s Health, University College London, UK
  - Faculty of Medical & Human Sciences, 1.018 Core Technology Facility, University of Manchester, UK
- Usha Menon
  - Department of Women’s Cancer, Institute for Women’s Health, University College London, UK
- Trevor Cox
  - Department of Molecular and Clinical Cancer Medicine, University of Liverpool, UK
- David A. Tuveson
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
- B. Kevin Park
  - MRC Centre for Drug Safety Science, Department of Pharmacology and Therapeutics, University of Liverpool, UK
- William Greenhalf
  - Department of Molecular and Clinical Cancer Medicine, University of Liverpool, UK
  - National Institute for Health Research Liverpool Pancreas Biomedical Research Unit, Royal Liverpool University Hospital, UK
- Robert Sutton
  - Department of Molecular and Clinical Cancer Medicine, University of Liverpool, UK
  - National Institute for Health Research Liverpool Pancreas Biomedical Research Unit, Royal Liverpool University Hospital, UK
- John F. Timms
  - Department of Women’s Cancer, Institute for Women’s Health, University College London, UK
- John P. Neoptolemos
  - Department of Molecular and Clinical Cancer Medicine, University of Liverpool, UK
  - National Institute for Health Research Liverpool Pancreas Biomedical Research Unit, Royal Liverpool University Hospital, UK
- Eithne Costello
  - Department of Molecular and Clinical Cancer Medicine, University of Liverpool, UK
  - National Institute for Health Research Liverpool Pancreas Biomedical Research Unit, Royal Liverpool University Hospital, UK
11
Chan AKC, Bruce JIE, Siriwardena AK. Glucose metabolic phenotype of pancreatic cancer. World J Gastroenterol 2016; 22:3471-3485. [PMID: 27022229 PMCID: PMC4806205 DOI: 10.3748/wjg.v22.i12.3471] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.3] [Received: 12/18/2015] [Revised: 01/30/2016] [Accepted: 03/02/2016] [Indexed: 02/06/2023]
Abstract
AIM: To construct a global “metabolic phenotype” of pancreatic ductal adenocarcinoma (PDAC) reflecting tumour-related metabolic enzyme expression.
METHODS: A systematic review of the literature was performed using OvidSP and PubMed databases using keywords “pancreatic cancer” and individual glycolytic and mitochondrial oxidative phosphorylation (MOP) enzymes. Both human and animal studies investigating the oncological effect of enzyme expression changes and inhibitors in both an in vitro and in vivo setting were included in the review. Data reporting changes in enzyme expression and the effects on PDAC cells, such as survival and metastatic potential, were extracted to construct a metabolic phenotype.
RESULTS: Seven hundred and ten papers were initially retrieved, and were screened to meet the review inclusion criteria. 107 unique articles were identified as reporting data involving glycolytic enzymes, and 28 articles involving MOP enzymes in PDAC. Data extraction followed a pre-defined protocol. There is consistent over-expression of glycolytic enzymes and lactate dehydrogenase in keeping with the Warburg effect to facilitate rapid adenosine-triphosphate production from glycolysis. Certain isoforms of these enzymes were over-expressed specifically in PDAC. Altering expression levels of HK, PGI, FBA, enolase, PK-M2 and LDA-A with metabolic inhibitors have shown a favourable effect on PDAC, thus identifying these as potential therapeutic targets. However, the Warburg effect on MOP enzymes is less clear, with different expression levels at different points in the Krebs cycle resulting in a fundamental change of metabolite levels, suggesting that other essential anabolic pathways are being stimulated.
CONCLUSION: Further characterisation of the PDAC metabolic phenotype is necessary as currently there are few clinical studies and no successful clinical trials targeting metabolic enzymes.
12
Kalvala A, Gao L, Aguila B, Reese T, Otterson GA, Villalona-Calero MA, Duan W. Overexpression of Rad51C splice variants in colorectal tumors. Oncotarget 2016; 6:8777-87. [PMID: 25669972 PMCID: PMC4496183 DOI: 10.18632/oncotarget.3209] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 07/29/2014] [Accepted: 12/24/2014] [Indexed: 01/04/2023]
Abstract
Functional alterations in Rad51C are the cause of the Fanconi anemia complementation group O (FANCO) gene disorder. We have identified novel splice variants of Rad51C mRNA in colorectal tumors and cells. The alternatively spliced transcript variants are formed without exon 7 (variant 1), without exons 6 and 7 (variant 2), or without exons 7 and 8 (variant 3). Real-time PCR analysis of nine pair-matched colorectal tumors and non-tumors showed that variant 1 was overexpressed in tumors compared to matched non-tumors. Among 38 colorectal tumor RNA samples analyzed, 18 contained variant 1, 12 contained variant 2, 14 contained variant 3, and eight expressed full-length Rad51C exclusively. Bisulfite DNA sequencing showed promoter methylation of Rad51C in tumor cells. 5-azacytidine treatment of LS-174T cells caused a 14-fold increase in variant 1, a 4.8-fold increase in variant 3, and a 3.4-fold increase in variant 2, compared with a 2.5-fold increase in WT. Expression of Rad51C variants is associated with FANCD2 foci-positive colorectal tumors and with microsatellite stability in those tumors. Further investigation is needed to elucidate the differential function of the Rad51C variants and to evaluate their potential effects on drug resistance and DNA repair.
Affiliation(s)
- Arjun Kalvala
- Comprehensive Cancer Center, The Ohio State University College of Medicine and Public Health, Columbus, Ohio, U.S.A
| | - Li Gao
- Comprehensive Cancer Center, The Ohio State University College of Medicine and Public Health, Columbus, Ohio, U.S.A
| | - Brittany Aguila
- Comprehensive Cancer Center, The Ohio State University College of Medicine and Public Health, Columbus, Ohio, U.S.A
| | - Tyler Reese
- Comprehensive Cancer Center, The Ohio State University College of Medicine and Public Health, Columbus, Ohio, U.S.A
| | - Gregory A Otterson
- Comprehensive Cancer Center, The Ohio State University College of Medicine and Public Health, Columbus, Ohio, U.S.A.,Division of Medical Oncology Department of Internal Medicine, The Ohio State University College of Medicine and Public Health, Columbus, Ohio, U.S.A
| | - Miguel A Villalona-Calero
- Comprehensive Cancer Center, The Ohio State University College of Medicine and Public Health, Columbus, Ohio, U.S.A.,Division of Medical Oncology Department of Internal Medicine, The Ohio State University College of Medicine and Public Health, Columbus, Ohio, U.S.A.,Department of Pharmacology at The Ohio State University College of Medicine and Public Health, Columbus, Ohio, U.S.A
| | - Wenrui Duan
- Comprehensive Cancer Center, The Ohio State University College of Medicine and Public Health, Columbus, Ohio, U.S.A.,Division of Medical Oncology Department of Internal Medicine, The Ohio State University College of Medicine and Public Health, Columbus, Ohio, U.S.A
| |
|
13
|
Subbannayya Y, Pinto SM, Gowda H, Prasad TSK. Proteogenomics for understanding oncology: recent advances and future prospects. Expert Rev Proteomics 2016; 13:297-308. [PMID: 26697917 DOI: 10.1586/14789450.2016.1136217] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/05/2023]
Abstract
The concept of proteogenomics has emerged rapidly as a valuable approach to integrate mass spectrometry-derived proteomic data with genomic and transcriptomic data. It is used to harness the full potential of the former dataset in the discovery of potential biomarkers, therapeutic targets and novel proteins associated with various biological processes including diseases. Proteogenomic strategies have been successfully utilized to identify novel genes and redefine annotation of existing gene models in various genomes. In recent years, this approach has been extended to the field of cancer biology to unravel complexities in the tumor genomes and proteomes. Standard proteomics workflows employing translated cancer genomes and transcriptomes can potentially identify peptides from mutant proteins, splice variants and fusion proteins in the tumor proteome, which in addition to the currently available biomarker panels can serve as potential diagnostic and prognostic biomarkers, besides having therapeutic utility. This review focuses on the role of proteogenomics to understand cancer biology.
Affiliation(s)
- Yashwanth Subbannayya
- a YU-IOB Center for Systems Biology and Molecular Medicine , Yenepoya University , Mangalore, India.,b Institute of Bioinformatics , Bangalore , India
| | - Sneha M Pinto
- a YU-IOB Center for Systems Biology and Molecular Medicine , Yenepoya University , Mangalore, India.,b Institute of Bioinformatics , Bangalore , India
| | - Harsha Gowda
- a YU-IOB Center for Systems Biology and Molecular Medicine , Yenepoya University , Mangalore, India.,b Institute of Bioinformatics , Bangalore , India
| | - T S Keshava Prasad
- a YU-IOB Center for Systems Biology and Molecular Medicine , Yenepoya University , Mangalore, India.,b Institute of Bioinformatics , Bangalore , India.,c NIMHANS-IOB Proteomics and Bioinformatics Laboratory, Neurobiology Research Centre , National Institute of Mental Health and Neurosciences , Bangalore , India
| |
|
14
|
Ludwig MR, Kojima K, Bowersock GJ, Chen D, Jhala NC, Buchsbaum DJ, Grizzle WE, Klug CA, Mobley JA. Surveying the serologic proteome in a tissue-specific kras(G12D) knockin mouse model of pancreatic cancer. Proteomics 2016; 16:516-31. [DOI: 10.1002/pmic.201500133] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/10/2015] [Revised: 09/30/2015] [Accepted: 11/09/2015] [Indexed: 12/21/2022]
Affiliation(s)
| | - Kyoko Kojima
- Comprehensive Cancer Center; University of Alabama at Birmingham; Birmingham AL USA
| | - Gregory J. Bowersock
- Comprehensive Cancer Center; University of Alabama at Birmingham; Birmingham AL USA
| | - Dongquan Chen
- Comprehensive Cancer Center; University of Alabama at Birmingham; Birmingham AL USA
- Departments of Medicine; University of Alabama at Birmingham; Birmingham AL USA
| | - Nirag C. Jhala
- Department of Pathology and Laboratory Medicine; University of Pennsylvania; Philadelphia PA USA
| | - Donald J. Buchsbaum
- Comprehensive Cancer Center; University of Alabama at Birmingham; Birmingham AL USA
- Radiation Oncology; University of Alabama at Birmingham; Birmingham AL USA
| | - William E. Grizzle
- Comprehensive Cancer Center; University of Alabama at Birmingham; Birmingham AL USA
- Pathology; University of Alabama at Birmingham; Birmingham AL USA
| | - Christopher A. Klug
- Comprehensive Cancer Center; University of Alabama at Birmingham; Birmingham AL USA
- Microbiology; University of Alabama at Birmingham; Birmingham AL USA
| | - James A. Mobley
- Comprehensive Cancer Center; University of Alabama at Birmingham; Birmingham AL USA
- Departments of Medicine; University of Alabama at Birmingham; Birmingham AL USA
- Surgery; University of Alabama at Birmingham; Birmingham AL USA
| |
|
15
|
Hwang H, Park GW, Kim KH, Lee JY, Lee HK, Ji ES, Park SKR, Xu T, Yates JR, Kwon KH, Park YM, Lee HJ, Paik YK, Kim JY, Yoo JS. Chromosome-Based Proteomic Study for Identifying Novel Protein Variants from Human Hippocampal Tissue Using Customized neXtProt and GENCODE Databases. J Proteome Res 2015; 14:5028-37. [DOI: 10.1021/acs.jproteome.5b00472] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/02/2023]
Affiliation(s)
- Heeyoun Hwang
- Biomedical Omics Group, Korea Basic Science Institute, Chungbuk 28119, Republic of Korea
| | - Gun Wook Park
- Biomedical Omics Group, Korea Basic Science Institute, Chungbuk 28119, Republic of Korea
- Graduate School of Analytical Science and Technology, Chungnam National University, Daejeon 34134, Republic of Korea
| | - Kwang Hoe Kim
- Biomedical Omics Group, Korea Basic Science Institute, Chungbuk 28119, Republic of Korea
- Graduate School of Analytical Science and Technology, Chungnam National University, Daejeon 34134, Republic of Korea
| | - Ju Yeon Lee
- Biomedical Omics Group, Korea Basic Science Institute, Chungbuk 28119, Republic of Korea
| | - Hyun Kyoung Lee
- Biomedical Omics Group, Korea Basic Science Institute, Chungbuk 28119, Republic of Korea
- Graduate School of Analytical Science and Technology, Chungnam National University, Daejeon 34134, Republic of Korea
| | - Eun Sun Ji
- Biomedical Omics Group, Korea Basic Science Institute, Chungbuk 28119, Republic of Korea
| | - Sung-Kyu Robin Park
- Department of Chemical Physiology, The Scripps Research Institute, La Jolla, California 92037, United States
| | - Tao Xu
- Department of Chemical Physiology, The Scripps Research Institute, La Jolla, California 92037, United States
| | - John R. Yates
- Department of Chemical Physiology, The Scripps Research Institute, La Jolla, California 92037, United States
| | - Kyung-Hoon Kwon
- Biomedical Omics Group, Korea Basic Science Institute, Chungbuk 28119, Republic of Korea
| | - Young Mok Park
- Center for Cognition and Sociality, Institute for Basic Science, Daejeon 34047, Republic of Korea
| | - Hyoung-Joo Lee
- Yonsei Proteome Research Center and Department of Integrated OMICS for Biomedical Science, and Department of Biochemistry, College of Life Science and Biotechnology, Yonsei University, Seoul 03722, Republic of Korea
| | - Young-Ki Paik
- Yonsei Proteome Research Center and Department of Integrated OMICS for Biomedical Science, and Department of Biochemistry, College of Life Science and Biotechnology, Yonsei University, Seoul 03722, Republic of Korea
| | - Jin Young Kim
- Biomedical Omics Group, Korea Basic Science Institute, Chungbuk 28119, Republic of Korea
| | - Jong Shin Yoo
- Biomedical Omics Group, Korea Basic Science Institute, Chungbuk 28119, Republic of Korea
- Graduate School of Analytical Science and Technology, Chungnam National University, Daejeon 34134, Republic of Korea
| |
|
16
|
Vermillion KL, Jagtap P, Johnson JE, Griffin TJ, Andrews MT. Characterizing Cardiac Molecular Mechanisms of Mammalian Hibernation via Quantitative Proteogenomics. J Proteome Res 2015; 14:4792-804. [DOI: 10.1021/acs.jproteome.5b00575] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/11/2022]
Affiliation(s)
- Katie L. Vermillion
- Department of Biology, University of Minnesota Duluth, 1035 Kirby Drive, Duluth, Minnesota 55812, United States
| | - Pratik Jagtap
- Center for Mass Spectrometry and Proteomics, University of Minnesota, 1479 Gortner Avenue, St. Paul, Minnesota 55108, United States
- Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, 321 Church St SE, Minneapolis, Minnesota 55455, United States
| | - James E. Johnson
- Minnesota Supercomputing Institute, 512 Walter Library 117 Pleasant Street SE, Minneapolis, Minnesota 55455, United States
| | - Timothy J. Griffin
- Center for Mass Spectrometry and Proteomics, University of Minnesota, 1479 Gortner Avenue, St. Paul, Minnesota 55108, United States
- Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, 321 Church St SE, Minneapolis, Minnesota 55455, United States
| | - Matthew T. Andrews
- Department of Biology, University of Minnesota Duluth, 1035 Kirby Drive, Duluth, Minnesota 55812, United States
| |
|
17
|
Jeong SK, Hancock WS, Paik YK. GenomewidePDB 2.0: A Newly Upgraded Versatile Proteogenomic Database for the Chromosome-Centric Human Proteome Project. J Proteome Res 2015; 14:3710-9. [PMID: 26272709 DOI: 10.1021/acs.jproteome.5b00541] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/23/2023]
Abstract
Since the launch of the Chromosome-centric Human Proteome Project (C-HPP) in 2012, the number of "missing" proteins has fallen to 2932, down from ∼5932 since the number was first counted in 2011. We compared the characteristics of missing proteins with those of already annotated proteins with respect to transcriptional expression pattern and the time periods in which newly identified proteins were annotated. We learned that missing proteins commonly exhibit lower levels of transcriptional expression and less tissue-specific expression compared with already annotated proteins. This makes it more difficult to identify missing proteins as time goes on. One of the C-HPP goals is to identify alternatively spliced products of proteins (ASPs), which are usually difficult to find by shotgun proteomic methods due to their sequence similarities with the representative proteins. To resolve this problem, it may be necessary to use a targeted proteomics approach (e.g., selected and multiple reaction monitoring [S/MRM] assays) and an innovative bioinformatics platform that enables the selection of target peptides for rarely expressed missing proteins or ASPs. Given that the success of efforts to identify missing proteins may rely on more informative public databases, it was necessary to upgrade the available integrative databases. To this end, we attempted to improve the features and utility of GenomewidePDB by integrating transcriptomic information (e.g., alternatively spliced transcripts), annotated peptide information, and an advanced search interface that can find proteins of interest when applying a targeted proteomics strategy. This upgraded version of the database, GenomewidePDB 2.0, may not only expedite identification of the remaining missing proteins but also enhance the exchange of information among the proteome community. GenomewidePDB 2.0 is available publicly at http://genomewidepdb.proteomix.org/.
Affiliation(s)
- Seul-Ki Jeong
- Yonsei Proteome Research Center and Biomedical Proteome Research Center , 50 Yonsei-Ro, Seodaemun-gu, Seoul 120-749, Korea
| | - William S Hancock
- Barnett Institute and Department of Chemistry and Chemical Biology, Northeastern University , 12 Oxford Street, Boston, Massachusetts 02115, United States
| | - Young-Ki Paik
- Yonsei Proteome Research Center and Biomedical Proteome Research Center , 50 Yonsei-Ro, Seodaemun-gu, Seoul 120-749, Korea.,Department of Biochemistry, Department of Integrated Omics for Biomedical Science (World Class University Graduate Program), Yonsei University , 50 Yonsei-Ro, Sudaemoon-ku, Seoul 120-749, Korea
| |
|
18
|
Abascal F, Ezkurdia I, Rodriguez-Rivas J, Rodriguez JM, del Pozo A, Vázquez J, Valencia A, Tress ML. Alternatively Spliced Homologous Exons Have Ancient Origins and Are Highly Expressed at the Protein Level. PLoS Comput Biol 2015; 11:e1004325. [PMID: 26061177 PMCID: PMC4465641 DOI: 10.1371/journal.pcbi.1004325] [Citation(s) in RCA: 55] [Impact Index Per Article: 6.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/16/2014] [Accepted: 05/08/2015] [Indexed: 11/19/2022] Open
Abstract
Alternative splicing of messenger RNA can generate a wide variety of mature RNA transcripts, and these transcripts may produce protein isoforms with diverse cellular functions. While there is much supporting evidence for the expression of alternative transcripts, the same is not true for the alternatively spliced protein products. Large-scale mass spectroscopy experiments have identified evidence of alternative splicing at the protein level, but with conflicting results. Here we carried out a rigorous analysis of the peptide evidence from eight large-scale proteomics experiments to assess the scale of alternative splicing that is detectable by high-resolution mass spectroscopy. We find fewer splice events than would be expected: we identified peptides for almost 64% of human protein coding genes, but detected just 282 splice events. This data suggests that most genes have a single dominant isoform at the protein level. Many of the alternative isoforms that we could identify were only subtly different from the main splice isoform. Very few of the splice events identified at the protein level disrupted functional domains, in stark contrast to the two thirds of splice events annotated in the human genome that would lead to the loss or damage of functional domains. The most striking result was that more than 20% of the splice isoforms we identified were generated by substituting one homologous exon for another. This is significantly more than would be expected from the frequency of these events in the genome. These homologous exon substitution events were remarkably conserved—all the homologous exons we identified evolved over 460 million years ago—and eight of the fourteen tissue-specific splice isoforms we identified were generated from homologous exons. The combination of proteomics evidence, ancient origin and tissue-specific splicing indicates that isoforms generated from homologous exons may have important cellular roles. 
Alternative splicing is thought to be one means for generating the protein diversity necessary for the whole range of cellular functions. While the presence of alternatively spliced transcripts in the cell has been amply demonstrated, the same cannot be said for alternatively spliced proteins. The quest for alternative protein isoforms has focused primarily on the analysis of peptides from large-scale mass spectroscopy experiments, but evidence for alternative isoforms has been patchy and contradictory. A careful analysis of the peptide evidence is needed to fully understand the scale of alternative splicing detectable at the protein level. Here we analysed peptides from eight large-scale data sets, identifying just 282 splice events among 12,716 genes. This suggests that most genes have a single dominant isoform. Many of the alternative isoforms that we identified were only subtly different from the main splice variant, and one in five was generated by substitution of homologous exons by swapping one related exon for another. Remarkably, the alternative isoforms generated from homologous exons were highly conserved, first appearing 460 million years ago, and several appear to have tissue-specific roles in the brain and heart. Our results suggest that these particular isoforms are likely to have important cellular roles.
Affiliation(s)
- Federico Abascal
- Structural Biology and Bioinformatics Programme, Spanish National Cancer Research Centre (CNIO), Madrid, Spain
| | - Iakes Ezkurdia
- Unidad de Proteómica, Centro Nacional de Investigaciones Cardiovasculares (CNIC), Madrid, Spain
| | - Juan Rodriguez-Rivas
- Structural Biology and Bioinformatics Programme, Spanish National Cancer Research Centre (CNIO), Madrid, Spain
| | - Jose Manuel Rodriguez
- National Bioinformatics Institute (INB), Spanish National Cancer Research Centre (CNIO), Madrid, Spain
| | - Angela del Pozo
- Instituto de Genetica Medica y Molecular, Hospital Universitario La Paz, Madrid, Spain
| | - Jesús Vázquez
- Laboratorio de Proteómica Cardiovascular, Centro Nacional de Investigaciones Cardiovasculares (CNIC) Madrid, Spain
| | - Alfonso Valencia
- Structural Biology and Bioinformatics Programme, Spanish National Cancer Research Centre (CNIO), Madrid, Spain
- National Bioinformatics Institute (INB), Spanish National Cancer Research Centre (CNIO), Madrid, Spain
| | - Michael L. Tress
- Structural Biology and Bioinformatics Programme, Spanish National Cancer Research Centre (CNIO), Madrid, Spain
| |
|
19
|
Pan S, Brentnall TA, Chen R. Proteomics analysis of bodily fluids in pancreatic cancer. Proteomics 2015; 15:2705-15. [PMID: 25780901 DOI: 10.1002/pmic.201400476] [Citation(s) in RCA: 45] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/13/2014] [Revised: 02/06/2015] [Accepted: 03/13/2015] [Indexed: 12/12/2022]
Abstract
Proteomics study of pancreatic cancer using bodily fluids emphasizes biomarker discovery and clinical application, presenting unique prospects and challenges. Depending on the physiological nature of the bodily fluid and its proximity to pancreatic cancer, the proteomes of bodily fluids, such as pancreatic juice, pancreatic cyst fluid, blood, bile, and urine, can be substantially different in terms of protein constitution and the dynamic range of protein concentration. Thus, a comprehensive discovery and specific detection of cancer-associated proteins within these varied fluids is a complex task, requiring rigorous experimental design and a concerted approach. While major challenges still remain, fluid proteomics studies in pancreatic cancer to date have provided a wealth of information in revealing proteome alterations associated with pancreatic cancer in various bodily fluids.
Affiliation(s)
- Sheng Pan
- Department of Medicine, University of Washington, Seattle, WA, USA
| | | | - Ru Chen
- Department of Medicine, University of Washington, Seattle, WA, USA
| |
|
20
|
Identification of sialylated glycoproteins from metabolically oligosaccharide engineered pancreatic cells. Clin Proteomics 2015; 12:11. [PMID: 25987888 PMCID: PMC4434541 DOI: 10.1186/s12014-015-9083-8] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/03/2014] [Accepted: 03/23/2015] [Indexed: 12/24/2022] Open
Abstract
In this study, we investigated the use of metabolic oligosaccharide engineering and bio-orthogonal ligation reactions combined with lectin microarray and mass spectrometry to analyze sialoglycoproteins in the SW1990 human pancreatic cancer line. Specifically, cells were treated with the azido N-acetylmannosamine analog, 1,3,4-Bu3ManNAz, to label sialoglycoproteins with azide-modified sialic acids. The metabolically labeled sialoglycoproteins were then biotinylated via the Staudinger ligation, and sialoglycopeptides containing azido-sialic acid glycans were immobilized to a solid support. The peptides linked to metabolically labeled sialylated glycans were then released from sialoglycopeptides and analyzed by mass spectrometry; in parallel, the glycans from azido-sialoglycoproteins were characterized by lectin microarrays. This method identified 75 unique N-glycosite-containing peptides from 55 different metabolically labeled sialoglycoproteins, of which 42 were previously linked to cancer in the literature. A comparison of two of these glycoproteins, LAMP1 and ORP150, in histological tumor samples showed overexpression of these proteins in the cancerous tissue, demonstrating that our approach constitutes a viable strategy to identify and discover sialoglycoproteins associated with cancer, which can serve as biomarkers for cancer diagnosis or targets for therapy.
|
21
|
Olivares O, Vasseur S. Metabolic rewiring of pancreatic ductal adenocarcinoma: New routes to follow within the maze. Int J Cancer 2015; 138:787-96. [PMID: 25732227 DOI: 10.1002/ijc.29501] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/03/2014] [Revised: 02/10/2015] [Accepted: 02/17/2015] [Indexed: 12/13/2022]
Abstract
Pancreatic ductal adenocarcinoma (PDAC) is a debilitating and almost universally fatal malignancy. Despite advances in understanding of the oncogenetics of the disease, very few clinical benefits have been shown. One of the main characteristics of PDAC is the tumor architecture, in which tumor cells are surrounded by a firm desmoplasia. By reducing vascularization, and thus the delivery of both oxygen and nutrients to the tumor, this stroma causes the appearance of hypoxic zones, driving metabolic adaptation in surviving tumor cells in order to cope with challenging conditions. This metabolic reprogramming promoted by environmental constraints enhances PDAC aggressiveness. In this review, we provide a brief overview of previous work regarding the importance of glucose and glutamine addiction of PDAC cells. In particular, we aim to highlight the need for exploring the impact of metabolites other than glucose and glutamine, such as non-essential amino acids and oncometabolites, to find new treatments. We also discuss the need for progress in the methodology for metabolite detection. The overall purpose of our review is to emphasize the need to look beyond what is currently known, with a focus on amino acid availability, in order to improve our understanding of PDAC biology.
Affiliation(s)
- Orianne Olivares
- INSERM U1068, Centre De Recherche En Cancérologie De Marseille (CRCM), F-13009, Marseille, France.,Institut Paoli-Calmettes, F-13009, Marseille, France.,CNRS, UMR7258, CRCM, F-13009, Marseille, France.,Université Aix-Marseille, F-13284, Marseille, France
| | - Sophie Vasseur
- INSERM U1068, Centre De Recherche En Cancérologie De Marseille (CRCM), F-13009, Marseille, France.,Institut Paoli-Calmettes, F-13009, Marseille, France.,CNRS, UMR7258, CRCM, F-13009, Marseille, France.,Université Aix-Marseille, F-13284, Marseille, France
| |
|
22
|
Tavares R, Scherer NM, Ferreira CG, Costa FF, Passetti F. Splice variants in the proteome: a promising and challenging field to targeted drug discovery. Drug Discov Today 2015; 20:353-60. [DOI: 10.1016/j.drudis.2014.11.002] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/01/2014] [Revised: 10/19/2014] [Accepted: 11/07/2014] [Indexed: 12/15/2022]
|
23
|
Nesvizhskii AI. Proteogenomics: concepts, applications and computational strategies. Nat Methods 2015; 11:1114-25. [PMID: 25357241 DOI: 10.1038/nmeth.3144] [Citation(s) in RCA: 505] [Impact Index Per Article: 56.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/21/2014] [Accepted: 09/22/2014] [Indexed: 12/19/2022]
Abstract
Proteogenomics is an area of research at the interface of proteomics and genomics. In this approach, customized protein sequence databases generated using genomic and transcriptomic information are used to help identify novel peptides (not present in reference protein sequence databases) from mass spectrometry-based proteomic data; in turn, the proteomic data can be used to provide protein-level evidence of gene expression and to help refine gene models. In recent years, owing to the emergence of new sequencing technologies such as RNA-seq and dramatic improvements in the depth and throughput of mass spectrometry-based proteomics, the pace of proteogenomic research has greatly accelerated. Here I review the current state of proteogenomic methods and applications, including computational strategies for building and using customized protein sequence databases. I also draw attention to the challenge of false positive identifications in proteogenomics and provide guidelines for analyzing the data and reporting the results of proteogenomic studies.
Affiliation(s)
- Alexey I Nesvizhskii
- 1] Department of Pathology, University of Michigan, Ann Arbor, Michigan, USA. [2] Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, USA
| |
|
24
|
Onco-proteogenomics: cancer proteomics joins forces with genomics. Nat Methods 2015; 11:1107-13. [PMID: 25357240 DOI: 10.1038/nmeth.3138] [Citation(s) in RCA: 106] [Impact Index Per Article: 11.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/09/2013] [Accepted: 06/26/2014] [Indexed: 12/21/2022]
Abstract
The complexities of tumor genomes are rapidly being uncovered, but how they are regulated into functional proteomes remains poorly understood. Standard proteomics workflows use databases of known proteins, but these databases do not capture the uniqueness of the cancer transcriptome, with its point mutations, unusual splice variants and gene fusions. Onco-proteogenomics integrates mass spectrometry-generated data with genomic information to identify tumor-specific peptides. Linking tumor-derived DNA, RNA and protein measurements into a central-dogma perspective has the potential to improve our understanding of cancer biology.
|
25
|
Jia C, Hu Y, Liu Y, Li M. Mapping Splicing Quantitative Trait Loci in RNA-Seq. Cancer Inform 2015; 14:45-53. [PMID: 25733796 PMCID: PMC4333812 DOI: 10.4137/cin.s24832] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/18/2014] [Revised: 07/23/2014] [Accepted: 07/25/2014] [Indexed: 01/10/2023] Open
Abstract
BACKGROUND: One of the major mechanisms of generating mRNA diversity is alternative splicing, a regulated process that allows for the flexibility of producing functionally different proteins from the same genomic sequences. This process is often altered in cancer cells to produce aberrant proteins that drive the progression of cancer. A better understanding of the misregulation of alternative splicing will shed light on the development of novel targets for pharmacological interventions of cancer.
METHODS: In this study, we evaluated three statistical methods, random effects meta-regression, beta regression, and generalized linear mixed effects model, for the analysis of splicing quantitative trait loci (sQTL) using RNA-Seq data. All the three methods use exon-inclusion levels estimated by the PennSeq algorithm, a statistical method that utilizes paired-end reads and accounts for non-uniform sequencing coverage.
RESULTS: Using both simulated and real RNA-Seq datasets, we compared these three methods with GLiMMPS, a recently developed method for sQTL analysis. Our results indicate that the most reliable and powerful method was the random effects meta-regression approach, which identified sQTLs at low false discovery rates but higher power when compared to GLiMMPS.
CONCLUSIONS: We have evaluated three statistical methods for the analysis of sQTLs in RNA-Seq. Results from our study will be instructive for researchers in selecting the appropriate statistical methods for sQTL analysis.
Affiliation(s)
- Cheng Jia
- Department of Biostatistics and Epidemiology, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
| | - Yu Hu
- Department of Biostatistics and Epidemiology, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
| | - Yichuan Liu
- Department of Biostatistics and Epidemiology, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
| | - Mingyao Li
- Department of Biostatistics and Epidemiology, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
| |
|
26
|
Gan L, Xie L, Zuo F, Xiang Z, He N. Transcriptomic analysis of Rongchang pig brains and livers. Gene 2015; 560:96-106. [PMID: 25637719 DOI: 10.1016/j.gene.2015.01.051] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/09/2014] [Revised: 12/31/2014] [Accepted: 01/26/2015] [Indexed: 01/01/2023]
Abstract
Recent developments in high-throughput RNA sequencing (RNA-seq) technology have had a dramatic impact on our understanding of the structure and expression profiles of the mammalian transcriptome. To gain insights into the usefulness of the Rongchang pig for swine production and as a biomedical model, the transcriptome of Rongchang pig brains and livers was profiled using RNA-seq technology to uncover functional candidate molecules. In the study, total RNAs from brains and livers of Rongchang pigs were sequenced and 8.6 Gb of sequencing data were obtained. This analysis revealed tissue specificity through the identification of 5575 and 4600 differentially expressed genes (DEGs) in brains and livers, respectively, and through functional analysis of the DEGs. Furthermore, 83 neuropeptide gene transcripts, 69 neuropeptide receptor gene transcripts, 10 pro-neuropeptide convertase gene transcripts and many other neuropeptide-related protein gene transcripts were identified. Overall, the major characteristics of the transcriptional profiles of Rongchang pig brains and livers are presented.
Affiliation(s)
- Ling Gan
- The Department of Veterinary Medicine, Rongchang Campus, Southwest University, Rongchang, Chongqing 402460, China.
| | - Liwei Xie
- Center of Molecular Medicine, University of Georgia, Athens, GA 30602, USA.
| | - Fuyuan Zuo
- The Department of Animal Husbandry, Rongchang Campus, Southwest University, Rongchang, Chongqing 402460, China.
| | - Zhonghuai Xiang
- State Key Laboratory of Silkworm Genome Biology, Southwest University, Beibei, Chongqing 400715, China.
| | - Ningjia He
- State Key Laboratory of Silkworm Genome Biology, Southwest University, Beibei, Chongqing 400715, China.
| |
Collapse
|
27
|
Li HD, Menon R, Omenn GS, Guan Y. Revisiting the identification of canonical splice isoforms through integration of functional genomics and proteomics evidence. Proteomics 2014; 14:2709-18. [PMID: 25265570 DOI: 10.1002/pmic.201400170] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.0] [Received: 04/28/2014] [Revised: 08/11/2014] [Accepted: 09/23/2014] [Indexed: 01/08/2023]
Abstract
Canonical isoforms in different databases have been defined as the most prevalent, most conserved, most expressed, longest, or the one with the clearest description of domains or posttranslational modifications. In this article, we revisit these definitions of canonical isoforms based on functional genomics and proteomics evidence, focusing on mouse data. We report a novel functional relationship network-based approach for identifying the highest connected isoforms (HCIs). We show that 46% of these HCIs are not the longest transcripts. In addition, this approach revealed many genes that have more than one highly connected isoform. Averaged across 175 RNA-seq datasets covering diverse tissues and conditions, 65% of the HCIs show higher expression levels than nonhighest connected isoforms at the transcript level. At the protein level, these HCIs highly overlap with the expressed splice variants, based on proteomic data from eight different normal tissues. These results suggest that a more confident definition of canonical isoforms can be made through integration of multiple lines of evidence, including HCIs defined by biological processes and pathways, expression prevalence at the transcript level, and relative or absolute abundance at the protein level. This integrative proteogenomics approach can successfully identify principal isoforms that are responsible for the canonical functions of genes.
Affiliation(s)
- Hong-Dong Li
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA

28
Lisitsa A, Moshkovskii S, Chernobrovkin A, Ponomarenko E, Archakov A. Profiling proteoforms: promising follow-up of proteomics for biomarker discovery. Expert Rev Proteomics 2014; 11:121-9. [PMID: 24437377 DOI: 10.1586/14789450.2014.878652] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.9] [Indexed: 12/19/2022]
Abstract
Today, proteomics usually compares clinical samples by use of bottom-up profiling with high resolution mass spectrometry, where all protein products of a single gene are considered as an integral whole. At the same time, proteomics of proteoforms, which considers the variety of protein species, offers the potential to discover valuable biomarkers. Proteoforms are protein species that arise as a consequence of genetic polymorphisms, alternative splicing, post-translational modifications and other less-explored molecular events. The comprehensive observation of proteoforms has been an exclusive privilege of top-down proteomics. Here, we review the possibilities of a bottom-up approach to address the microheterogeneity of the human proteome. Special focus is given to shotgun proteomics and structure-based bioinformatics as a source of hypothetical proteoforms, which can potentially be verified by targeted mass spectrometry to determine the relevance of proteoforms to diseases.
Affiliation(s)
- Andrey Lisitsa
- Orekhovich Institute of Biomedical Chemistry of the Russian Academy of Medical Sciences, 119121, Pogodinskaya Street 10, Moscow, Russia

29
Jia C, Hu Y, Liu Y, Li M. Mapping Splicing Quantitative Trait Loci in RNA-Seq. Cancer Inform 2014; 13:35-43. [PMID: 25452687 PMCID: PMC4218654 DOI: 10.4137/cin.s13971] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Received: 04/18/2014] [Revised: 07/23/2014] [Accepted: 07/25/2014] [Indexed: 01/08/2023] Open
Abstract
BACKGROUND One of the major mechanisms of generating mRNA diversity is alternative splicing, a regulated process that allows for the flexibility of producing functionally different proteins from the same genomic sequences. This process is often altered in cancer cells to produce aberrant proteins that drive the progression of cancer. A better understanding of the misregulation of alternative splicing will shed light on the development of novel targets for pharmacological interventions of cancer. METHODS In this study, we evaluated three statistical methods, random effects meta-regression, beta regression, and generalized linear mixed effects model, for the analysis of splicing quantitative trait loci (sQTL) using RNA-Seq data. All three methods use exon-inclusion levels estimated by the PennSeq algorithm, a statistical method that utilizes paired-end reads and accounts for non-uniform sequencing coverage. RESULTS Using both simulated and real RNA-Seq datasets, we compared these three methods with GLiMMPS, a recently developed method for sQTL analysis. Our results indicate that the most reliable and powerful method was the random effects meta-regression approach, which identified sQTLs at low false discovery rates and with higher power than GLiMMPS. CONCLUSIONS We have evaluated three statistical methods for the analysis of sQTLs in RNA-Seq. Results from our study will be instructive for researchers in selecting the appropriate statistical methods for sQTL analysis.
Affiliation(s)
- Cheng Jia
- Department of Biostatistics and Epidemiology, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
- Yu Hu
- Department of Biostatistics and Epidemiology, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
- Yichuan Liu
- Department of Biostatistics and Epidemiology, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
- Mingyao Li
- Department of Biostatistics and Epidemiology, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA

30
Pawar H, Renuse S, Khobragade SN, Chavan S, Sathe G, Kumar P, Mahale KN, Gore K, Kulkarni A, Dixit T, Raju R, Prasad TSK, Harsha HC, Patole MS, Pandey A. Neglected Tropical Diseases and Omics Science: Proteogenomics Analysis of the Promastigote Stage of Leishmania major Parasite. OMICS-A JOURNAL OF INTEGRATIVE BIOLOGY 2014; 18:499-512. [DOI: 10.1089/omi.2013.0159] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.4] [Indexed: 11/12/2022]
Affiliation(s)
- Harsh Pawar
- Institute of Bioinformatics, International Technology Park, Bangalore, India
- Rajiv Gandhi University of Health Sciences, Bangalore, India
- Santosh Renuse
- Institute of Bioinformatics, International Technology Park, Bangalore, India
- Department of Biotechnology, Amrita Vishwa Vidyapeetham, Kollam, India
- Sandip Chavan
- Institute of Bioinformatics, International Technology Park, Bangalore, India
- Manipal University, Madhav Nagar, Manipal, India
- Gajanan Sathe
- Institute of Bioinformatics, International Technology Park, Bangalore, India
- Manipal University, Madhav Nagar, Manipal, India
- Praveen Kumar
- Institute of Bioinformatics, International Technology Park, Bangalore, India
- Tanwi Dixit
- National Centre for Cell Sciences, Pune, India
- Rajesh Raju
- Institute of Bioinformatics, International Technology Park, Bangalore, India
- H. C. Harsha
- Institute of Bioinformatics, International Technology Park, Bangalore, India
- Akhilesh Pandey
- McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, Maryland
- Department of Biological Chemistry, Johns Hopkins University School of Medicine, Baltimore, Maryland
- Department of Oncology, Johns Hopkins University School of Medicine, Baltimore, Maryland
- Department of Pathology, Johns Hopkins University School of Medicine, Baltimore, Maryland

31
Li HD, Menon R, Omenn GS, Guan Y. The emerging era of genomic data integration for analyzing splice isoform function. Trends Genet 2014; 30:340-7. [PMID: 24951248 DOI: 10.1016/j.tig.2014.05.005] [Citation(s) in RCA: 70] [Impact Index Per Article: 7.0] [Received: 03/18/2014] [Revised: 05/21/2014] [Accepted: 05/23/2014] [Indexed: 01/17/2023]
Abstract
The vast majority of multi-exon genes in humans undergo alternative splicing, which greatly increases the functional diversity of protein species. Predicting functions at the isoform level is essential to further our understanding of developmental abnormalities and cancers, which frequently exhibit aberrant splicing and dysregulation of isoform expression. However, determination of isoform function is very difficult, and efforts to predict isoform function have been limited in the functional genomics field. Deep sequencing of RNA now provides an unprecedented amount of expression data at the transcript level. We describe here emerging computational approaches that integrate such large-scale whole-transcriptome sequencing (RNA-seq) data for predicting the functions of alternatively spliced isoforms, and we discuss their applications in developmental and cancer biology. We outline future directions for isoform function prediction, emphasizing the need for heterogeneous genomic data integration and tissue-specific, dynamic isoform-level network modeling, which will allow the field to realize its full potential.
Affiliation(s)
- Hong-Dong Li
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- Rajasree Menon
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- Gilbert S Omenn
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA; Department of Internal Medicine, University of Michigan, Ann Arbor, Michigan, MI, USA
- Yuanfang Guan
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA; Department of Internal Medicine, University of Michigan, Ann Arbor, Michigan, MI, USA; Department of Electrical Engineering and Computer Science, Ann Arbor, MI, USA.

32
Wang X, Zhang B. Integrating genomic, transcriptomic, and interactome data to improve Peptide and protein identification in shotgun proteomics. J Proteome Res 2014; 13:2715-23. [PMID: 24792918 PMCID: PMC4059263 DOI: 10.1021/pr500194t] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.9] [Indexed: 12/26/2022]
Abstract
Mass spectrometry (MS)-based shotgun proteomics is an effective technology for global proteome profiling. The ultimate goal is to assign tandem MS spectra to peptides and subsequently infer proteins and their abundance. In addition to database searching and protein assembly algorithms, computational approaches have been developed to integrate genomic, transcriptomic, and interactome information to improve peptide and protein identification. Earlier efforts focus primarily on making databases more comprehensive using publicly available genomic and transcriptomic data. More recently, with the increasing affordability of the Next Generation Sequencing (NGS) technologies, personalized protein databases derived from sample-specific genomic and transcriptomic data have emerged as an attractive strategy. In addition, incorporating interactome data not only improves protein identification but also puts identified proteins into their functional context and thus facilitates data interpretation. In this paper, we survey the major integrative bioinformatics approaches that have been developed during the past decade and discuss their merits and demerits.
Affiliation(s)
- Xiaojing Wang
- Department of Biomedical Informatics, Vanderbilt-Ingram Cancer Center, and Department of Cancer Biology, Vanderbilt University School of Medicine, Nashville, Tennessee 37232, United States

33
Omenn GS, Guan Y, Menon R. A new class of protein cancer biomarker candidates: differentially expressed splice variants of ERBB2 (HER2/neu) and ERBB1 (EGFR) in breast cancer cell lines. J Proteomics 2014; 107:103-12. [PMID: 24802673 DOI: 10.1016/j.jprot.2014.04.012] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.7] [Received: 01/20/2014] [Revised: 04/09/2014] [Accepted: 04/10/2014] [Indexed: 02/07/2023]
Abstract
Combined RNA-Seq and proteomics analyses reveal striking differential expression of splice isoforms of key proteins in important cancer pathways and networks. Even between primary tumor cell lines from histologically similar inflammatory breast cancers, we find striking differences in hormone receptor-negative cell lines that are ERBB2 (Her2/neu)-amplified versus ERBB1 (EGFR) over-expressed with low ERBB2 activity. We have related these findings to protein-protein interaction networks, signaling and metabolic pathways, and methods for predicting functional variants among multiple alternative isoforms. Understanding the upstream ligands and regulators and the downstream pathways and interaction networks for ERBB receptors is certain to be important for explanation and prediction of the variable levels of expression and therapeutic responses of ERBB+ tumors in the breast and in other organ sites. Alternative splicing is a remarkable evolutionary development that increases protein diversity from multi-exonic genes without requiring expansion of the genome. It is no longer sufficient to report the up- or down-expression of genes and proteins without dissecting the complexity due to alternative splicing. This article is part of a Special Issue entitled: 20 Years of Proteomics in memory of Vitaliano Pallini. Guest Editors: Luca Bini, Juan J. Calvete, Natacha Turck, Denis Hochstrasser and Jean-Charles Sanchez.
Affiliation(s)
- Gilbert S Omenn
- Center for Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109-2218, USA
- Yuanfang Guan
- Center for Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109-2218, USA
- Rajasree Menon
- Center for Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109-2218, USA

34
Abstract
Omics-based technology platforms have made new kinds of cancer profiling tests feasible. There are several valuable examples in clinical practice, and many more under development. A concerted, transparent process of discovery with lock-down of candidate assays and classifiers and clear specification of intended clinical use is essential. The Institute of Medicine has now proposed a three-stage scheme of confirming and validating analytical findings, validating performance on clinical specimens, and demonstrating explicit clinical utility for an approvable test (Micheel et al., Evolution of translational omics: lessons learned and path forward, 2012).

35
Nirujogi RS, Pawar H, Renuse S, Kumar P, Chavan S, Sathe G, Sharma J, Khobragade S, Pande J, Modak B, Prasad TSK, Harsha HC, Patole MS, Pandey A. Moving from unsequenced to sequenced genome: reanalysis of the proteome of Leishmania donovani. J Proteomics 2014; 97:48-61. [PMID: 23665000 PMCID: PMC4710096 DOI: 10.1016/j.jprot.2013.04.021] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.3] [Received: 12/29/2012] [Revised: 04/02/2013] [Accepted: 04/11/2013] [Indexed: 10/26/2022]
Abstract
The kinetoplastid protozoan parasite, Leishmania donovani, is the causative agent of kala azar or visceral leishmaniasis. Kala azar is a severe form of leishmaniasis that is fatal in the majority of untreated cases. Studies on proteomic analysis of L. donovani thus far have been carried out using homology-based identification based on related Leishmania species (L. infantum, L. major and L. braziliensis) whose genomes have been sequenced. Recently, the genome of L. donovani was fully sequenced and the data became publicly available. We took advantage of the availability of its genomic sequence to carry out a more accurate proteogenomic analysis of the L. donovani proteome using our previously generated dataset. This resulted in identification of 17,504 unique peptides upon database-dependent search against the annotated proteins in L. donovani. These peptides were assigned to 3999 unique proteins in L. donovani. 2296 proteins were identified in both the life stages of L. donovani, while 613 and 1090 proteins were identified only from amastigote and promastigote stages, respectively. The proteomic data was also searched against the six-frame translated L. donovani genome, which led to 255 genome search-specific peptides (GSSPs) resulting in identification of 20 novel genes and correction of 40 existing gene models in L. donovani. BIOLOGICAL SIGNIFICANCE Leishmania donovani genome sequencing was recently completed, which permitted us to use a proteogenomic approach to map its proteome and to carry out annotation of its genome. This resulted in mapping of 50% (3999 proteins) of the L. donovani proteome. Our study identified 20 novel genes previously not predicted from the L. donovani genome in addition to correcting annotations of 40 existing gene models. The identified proteins may help in better understanding of stage-specific protein expression profiles in L. donovani and in identifying novel stage-specific drug targets in L. donovani which could be used in the treatment of leishmaniasis. This article is part of a Special Issue entitled: Trends in Microbial Proteomics.
Affiliation(s)
- Raja Sekhar Nirujogi
- Institute of Bioinformatics, International Technology Park, Bangalore 560066, India; Bioinformatics Centre, School of Life Sciences, Pondicherry University, Puducherry 605014, India
- Harsh Pawar
- Institute of Bioinformatics, International Technology Park, Bangalore 560066, India; Rajiv Gandhi University of Health Sciences, Bangalore 560041, India
- Santosh Renuse
- Institute of Bioinformatics, International Technology Park, Bangalore 560066, India; Department of Biotechnology, Amrita Vishwa Vidyapeetham, Kollam 690525, India
- Praveen Kumar
- Institute of Bioinformatics, International Technology Park, Bangalore 560066, India
- Sandip Chavan
- Institute of Bioinformatics, International Technology Park, Bangalore 560066, India; Manipal University, Madhav Nagar, Manipal 576104, India
- Gajanan Sathe
- Institute of Bioinformatics, International Technology Park, Bangalore 560066, India; Manipal University, Madhav Nagar, Manipal 576104, India
- Jyoti Sharma
- Institute of Bioinformatics, International Technology Park, Bangalore 560066, India; Manipal University, Madhav Nagar, Manipal 576104, India
- Bhakti Modak
- National Centre for Cell Sciences, Pune 411007, India
- T S Keshava Prasad
- Institute of Bioinformatics, International Technology Park, Bangalore 560066, India; Bioinformatics Centre, School of Life Sciences, Pondicherry University, Puducherry 605014, India; Manipal University, Madhav Nagar, Manipal 576104, India
- H C Harsha
- Institute of Bioinformatics, International Technology Park, Bangalore 560066, India
- Akhilesh Pandey
- McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore 21205, MD, USA; Department of Biological Chemistry, Johns Hopkins University School of Medicine, Baltimore 21205, MD, USA; Department of Oncology, Johns Hopkins University School of Medicine, Baltimore 21205, MD, USA; Department of Pathology, Johns Hopkins University School of Medicine, Baltimore 21205, MD, USA.

36
Zhong J, Cui Y, Guo J, Chen Z, Yang L, He QY, Zhang G, Wang T. Resolving chromosome-centric human proteome with translating mRNA analysis: a strategic demonstration. J Proteome Res 2013; 13:50-9. [PMID: 24200226 DOI: 10.1021/pr4007409] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.2] [Indexed: 12/13/2022]
Abstract
Chromosome-centric human proteome project (C-HPP) aims at differentiating chromosome-based and tissue-specific protein compositions in terms of protein expression, quantification, and modification. We previously found that the analysis of translating mRNA (mRNA attached to ribosome-nascent chain complex, RNC-mRNA) can explain over 94% of mRNA-protein abundance. Therefore, we propose here to use full-length RNC-mRNA information to illustrate protein expression both qualitatively and quantitatively. We performed RNA-seq on RNC-mRNA (RNC-seq) and detected 12,758 and 14,113 translating genes in human normal bronchial epithelial (HBE) cells and human colorectal adenocarcinoma Caco-2 cells, respectively. We found that most of these genes were mapped with >80% of coding sequence coverage. In Caco-2 cells, we provided translating evidence on 4180 significant single-nucleotide variations. While using RNC-mRNA data as a standard for proteomic data integration, both translating and protein evidence of 7876 genes can be acquired from four interlaboratory data sets with different MS platforms. In addition, we detected 1397 noncoding mRNAs that were attached to ribosomes, suggesting a potential source of new protein explorations. By comparing the two cell lines, a total of 677 differentially translated genes were found to be nonevenly distributed across chromosomes. In addition, 2105 genes in Caco-2 and 750 genes in HBE cells are expressed in a cell-specific manner. These genes are significantly and specifically clustered on multiple chromosomes, such as chromosome 19. We conclude that HPP/C-HPP investigations can be considerably improved by integrating RNC-mRNA analysis with MS, bioinformatics, and antibody-based verifications.
Affiliation(s)
- Jiayong Zhong
- Key Laboratory of Functional Protein Research of Guangdong Higher Education Institutes, Institute of Life and Health Engineering, College of Life Science and Technology, Jinan University, 601 Huangpu Avenue West, Guangzhou 510632, China

37
Pang CNI, Tay AP, Aya C, Twine NA, Harkness L, Hart-Smith G, Chia SZ, Chen Z, Deshpande NP, Kaakoush NO, Mitchell HM, Kassem M, Wilkins MR. Tools to covisualize and coanalyze proteomic data with genomes and transcriptomes: validation of genes and alternative mRNA splicing. J Proteome Res 2013; 13:84-98. [PMID: 24152167 DOI: 10.1021/pr400820p] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.2] [Indexed: 12/19/2022]
Abstract
Direct links between proteomic and genomic/transcriptomic data are not frequently made, partly because of lack of appropriate bioinformatics tools. To help address this, we have developed the PG Nexus pipeline. The PG Nexus allows users to covisualize peptides in the context of genomes or genomic contigs, along with RNA-seq reads. This is done in the Integrated Genome Viewer (IGV). A Results Analyzer reports the precise base position where LC-MS/MS-derived peptides cover genes or gene isoforms, on the chromosomes or contigs where this occurs. In prokaryotes, the PG Nexus pipeline facilitates the validation of genes, where annotation or gene prediction is available, or the discovery of genes using a "virtual protein"-based unbiased approach. We illustrate this with a comprehensive proteogenomics analysis of two strains of Campylobacter concisus. For higher eukaryotes, the PG Nexus facilitates gene validation and supports the identification of mRNA splice junction boundaries and splice variants that are protein-coding. This is illustrated with an analysis of splice junctions covered by human phosphopeptides, and other examples of relevance to the Chromosome-Centric Human Proteome Project. The PG Nexus is open-source and available from https://github.com/IntersectAustralia/ap11_Samifier. It has been integrated into Galaxy and made available in the Galaxy tool shed.
Affiliation(s)
- Chi Nam Ignatius Pang
- Systems Biology Initiative, The University of New South Wales, Sydney, New South Wales 2052, Australia

38
Omenn GS. Plasma proteomics, the Human Proteome Project, and cancer-associated alternative splice variant proteins. BIOCHIMICA ET BIOPHYSICA ACTA-PROTEINS AND PROTEOMICS 2013; 1844:866-73. [PMID: 24211518 DOI: 10.1016/j.bbapap.2013.10.016] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Received: 06/27/2013] [Revised: 10/17/2013] [Accepted: 10/31/2013] [Indexed: 12/24/2022]
Abstract
This article addresses three inter-related subjects: the development of the Human Plasma Proteome Peptide Atlas, the launch of the Human Proteome Project, and the emergence of alternative splice variant transcripts and proteins as important features of evolution and pathogenesis. The current Plasma Peptide Atlas provides evidence on which peptides have been detected for every protein confidently identified in plasma; there are links to their spectra and their estimated abundance, facilitating the planning of targeted proteomics for biomarker studies. The Human Proteome Project (HPP) combines a chromosome-centric C-HPP with a biology and disease-driven B/D-HPP, upon a foundation of mass spectrometry, antibody, and knowledgebase resource pillars. The HPP aims to identify the approximately 7000 "missing proteins" and to characterize all proteins and their many isoforms. Success will enable the larger research community to utilize newly-available peptides, spectra, informative MS transitions, and databases for targeted analyses of priority proteins for each organ and disease. Among the isoforms of proteins, splice variants have the special feature of greatly enlarging protein diversity without enlarging the genome; evidence is accumulating of striking differential expression of splice variants in cancers. In this era of RNA-sequencing and advanced mass spectrometry, it is no longer sufficient to speak simply of increased or decreased expression of genes or proteins without carefully examining the splice variants in the protein mixture produced from each multi-exon gene. This article is part of a Special Issue entitled: Biomarkers: A Proteomic Challenge.
Affiliation(s)
- Gilbert S Omenn
- University of Michigan, Ann Arbor, MI, USA; Institute for Systems Biology, Seattle, WA, USA

39
Menon R, Im H, Zhang EY, Wu SL, Chen R, Snyder M, Hancock WS, Omenn GS. Distinct splice variants and pathway enrichment in the cell-line models of aggressive human breast cancer subtypes. J Proteome Res 2013; 13:212-27. [PMID: 24111759 DOI: 10.1021/pr400773v] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.8] [Indexed: 01/03/2023]
Abstract
This study was conducted as a part of the Chromosome-Centric Human Proteome Project (C-HPP) of the Human Proteome Organization. The United States team of C-HPP is focused on characterizing the protein-coding genes in chromosome 17. Despite its small size, chromosome 17 is rich in protein-coding genes; it contains many cancer-associated genes, including BRCA1, ERBB2 (Her2/neu), and TP53. The goal of this study was to examine the splice variants expressed in three ERBB2-expressing breast cancer cell-line models of hormone-receptor-negative breast cancers by integrating RNA-Seq and proteomic mass spectrometry data. The cell lines represent distinct phenotypic subtypes: SKBR3 (ERBB2+ (overexpression)/ER-/PR-; adenocarcinoma), SUM190 (ERBB2+ (overexpression)/ER-/PR-; inflammatory breast cancer), and SUM149 (ERBB2 (low expression) ER-/PR-; inflammatory breast cancer). We identified more than one splice variant for 1167 genes expressed in at least one of the three cancer cell lines. We found multiple variants of genes that are in the signaling pathways downstream of ERBB2 along with variants specific to one cancer cell line compared with the other two cancer cell lines and with normal mammary cells. The overall transcript profiles based on read counts indicated more similarities between SKBR3 and SUM190. The top-ranking Gene Ontology and BioCarta pathways for the cell-line specific variants pointed to distinct key mechanisms including: amino sugar metabolism, caspase activity, and endocytosis in SKBR3; different aspects of metabolism, especially of lipids in SUM190; cell-to-cell adhesion, integrin, and ERK1/ERK2 signaling; and translational control in SUM149. The analyses indicated an enrichment in the electron transport chain processes in the ERBB2 overexpressed cell line models and an association of nucleotide binding, RNA splicing, and translation processes with the IBC models, SUM190 and SUM149. Detailed experimental studies on the distinct variants identified from each of these three breast cancer cell line models may open opportunities for drug target discovery and help unveil their specific roles in cancer progression and metastasis.
Affiliation(s)
- Rajasree Menon
- Department of Computational Medicine & Bioinformatics, University of Michigan, 100 Washtenaw Avenue, Ann Arbor, Michigan 48109, United States

40
Twenty-one proteins up-regulated in human H-ras oncogene transgenic rat pancreas cancers are up-regulated in human pancreas cancer. Pancreas 2013; 42:1034-9. [PMID: 23648844 DOI: 10.1097/mpa.0b013e3182883624] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 12/13/2022]
Abstract
OBJECTIVES We have established rat models of pancreatic ductal adenocarcinoma (PDAC) in which expression of a human H-ras(G12V) or K-ras(G12V) oncogene regulated by the Cre/lox system drives pancreatic carcinogenesis. Pancreatic ductal adenocarcinoma which develops in H-ras(G12V) and K-ras(G12V) transgenic rats is cytogenetically and histopathologically similar to human PDAC. The present study was designed to determine the feasibility of using the commercially available H-ras(G12V) transgenic rat to find diagnostic protein biomarkers for human pancreatic cancer. METHODS For an animal model to be useful for searching for protein biomarkers for a disease, it is essential that proteins that are up-regulated in the model are also up-regulated in humans. We used liquid chromatography-tandem mass spectrometry (LC-MS/MS) to compare H-ras(G12V) transgenic rat PDAC with surrounding normal pancreas tissue. RESULTS We identified 30 up-regulated proteins in the H-ras(G12V) transgenic rat PDAC lesions; importantly, 21 human homologs of these 30 rat proteins are up-regulated in human pancreatic cancer patients. CONCLUSIONS These results indicate that numerous proteins that are up-regulated in H-ras(G12V) transgenic rat PDAC are also up-regulated in human pancreatic cancer; therefore, this rat model can be used to search for diagnostic biomarkers for this disease.
|
41
|
Sheynkman GM, Shortreed MR, Frey BL, Smith LM. Discovery and mass spectrometric analysis of novel splice-junction peptides using RNA-Seq. Mol Cell Proteomics 2013; 12:2341-53. [PMID: 23629695 DOI: 10.1074/mcp.o113.028142] [Citation(s) in RCA: 105] [Impact Index Per Article: 9.5] [Indexed: 01/22/2023] Open
Abstract
Human proteomic databases required for MS peptide identification are frequently updated and carefully curated, yet are still incomplete because it has been challenging to acquire every protein sequence from the diverse assemblage of proteoforms expressed in every tissue and cell type. In particular, alternative splicing has been shown to be a major source of this cell-specific proteomic variation. Many new alternative splice forms have been detected at the transcript level using next generation sequencing methods, especially RNA-Seq, but it is not known how many of these transcripts are being translated. Leveraging the unprecedented capabilities of next generation sequencing methods, we collected RNA-Seq and proteomics data from the same cell population (Jurkat cells) and created a bioinformatics pipeline that builds customized databases for the discovery of novel splice-junction peptides. Eighty million paired-end Illumina reads and ∼500,000 tandem mass spectra were used to identify 12,873 transcripts (19,320 including isoforms) and 6810 proteins. We developed a bioinformatics workflow to retrieve high-confidence, novel splice junction sequences from the RNA data, translate these sequences into the analogous polypeptide sequence, and create a customized splice junction database for MS searching. Based on the RefSeq gene models, we detected 136,123 annotated and 144,818 unannotated transcript junctions. Of those, 24,834 unannotated junctions passed various quality filters (e.g. minimum read depth) and these entries were translated into 33,589 polypeptide sequences and used for database searching. We discovered 57 splice junction peptides not present in the Uniprot-Trembl proteomic database comprising an array of different splicing events, including skipped exons, alternative donors and acceptors, and noncanonical transcriptional start sites. 
To our knowledge this is the first example of using sample-specific RNA-Seq data to create a splice-junction database and discover new peptides resulting from alternative splicing.
Affiliation(s)
- Gloria M Sheynkman
- Department of Chemistry, University of Wisconsin-Madison, 1101 University Ave., Madison, Wisconsin 53706, USA
|
42
|
Omenn GS, Menon R, Zhang Y. Innovations in proteomic profiling of cancers: alternative splice variants as a new class of cancer biomarker candidates and bridging of proteomics with structural biology. J Proteomics 2013; 90:28-37. [PMID: 23603631 DOI: 10.1016/j.jprot.2013.04.007] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.6] [Received: 02/11/2013] [Revised: 04/05/2013] [Accepted: 04/07/2013] [Indexed: 01/05/2023]
Abstract
Alternative splicing allows a single gene to generate multiple RNA transcripts which can be translated into functionally diverse protein isoforms. Current knowledge of splicing is derived mainly from RNA transcripts, with very little known about the expression level, 3D structures, and functional differences of the proteins. Splicing is a remarkable phenomenon of molecular and biological evolution. Studies which simply report up-regulation or down-regulation of protein or mRNA expression are confounded by the effects of mixtures of these isoforms. Besides understanding the net biological effects of the mixtures, we may be able to develop biomarker tests based on the observable differential expression of particular splice variants or combinations of splice variants in specific disease states. Here we review our work on differential expression of splice variant proteins in cancers and the feasibility of integrating proteomic analysis with structure-based conformational predictions of the differences between such isoforms.
Affiliation(s)
- Gilbert S Omenn
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109-2218, USA.
|
43
|
Khatun J, Yu Y, Wrobel JA, Risk BA, Gunawardena HP, Secrest A, Spitzer WJ, Xie L, Wang L, Chen X, Giddings MC. Whole human genome proteogenomic mapping for ENCODE cell line data: identifying protein-coding regions. BMC Genomics 2013; 14:141. [PMID: 23448259 PMCID: PMC3607840 DOI: 10.1186/1471-2164-14-141] [Citation(s) in RCA: 49] [Impact Index Per Article: 4.5] [Received: 08/10/2012] [Accepted: 02/22/2013] [Indexed: 11/28/2022] Open
Abstract
BACKGROUND Proteogenomic mapping is an approach that uses mass spectrometry data from proteins to directly map protein-coding genes and could aid in locating translational regions in the human genome. In concert with the ENcyclopedia of DNA Elements (ENCODE) project, we applied proteogenomic mapping to produce proteogenomic tracks for the UCSC Genome Browser, to explore which putative translational regions may be missing from the human genome. RESULTS We generated ~1 million high-resolution tandem mass (MS/MS) spectra for Tier 1 ENCODE cell lines K562 and GM12878 and mapped them against the UCSC hg19 human genome, and the GENCODE V7 annotated protein and transcript sets. We then compared the results from the three searches to identify the best-matching peptide for each MS/MS spectrum, thereby increasing the confidence of the putative new protein-coding regions found via the whole genome search. At a 1% false discovery rate, we identified 26,472, 24,406, and 13,128 peptides from the protein, transcript, and whole genome searches, respectively; of these, 481 were found solely via the whole genome search. The proteogenomic mapping data are available on the UCSC Genome Browser at http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg19&g=wgEncodeUncBsuProt. CONCLUSIONS The whole genome search revealed that ~4% of the uniquely mapping identified peptides were located outside GENCODE V7 annotated exons. The comparison of the results from the disparate searches also identified 15% more spectra than would have been found solely from a protein database search. Therefore, whole genome proteogenomic mapping is a complementary method for genome annotation when performed in conjunction with other searches.
Affiliation(s)
- Jainab Khatun
- College of Arts and Sciences, Boise State University, Boise, ID, USA
- Yanbao Yu
- Department of Biochemistry & Biophysics, UNC School of Medicine, Chapel Hill, NC, USA
- John A Wrobel
- Department of Biochemistry & Biophysics, UNC School of Medicine, Chapel Hill, NC, USA
- Brian A Risk
- College of Arts and Sciences, Boise State University, Boise, ID, USA
- Harsha P Gunawardena
- Department of Biochemistry & Biophysics, UNC School of Medicine, Chapel Hill, NC, USA
- Program in Molecular Biology & Biotechnology, UNC School of Medicine, Chapel Hill, NC, USA
- Ashley Secrest
- College of Arts and Sciences, Boise State University, Boise, ID, USA
- Wendy J Spitzer
- College of Arts and Sciences, Boise State University, Boise, ID, USA
- Ling Xie
- Department of Biochemistry & Biophysics, UNC School of Medicine, Chapel Hill, NC, USA
- Li Wang
- Department of Biochemistry & Biophysics, UNC School of Medicine, Chapel Hill, NC, USA
- Xian Chen
- Department of Biochemistry & Biophysics, UNC School of Medicine, Chapel Hill, NC, USA
- Program in Molecular Biology & Biotechnology, UNC School of Medicine, Chapel Hill, NC, USA
- Morgan C Giddings
- College of Arts and Sciences, Boise State University, Boise, ID, USA
- Department of Biochemistry & Biophysics, UNC School of Medicine, Chapel Hill, NC, USA
|
44
|
Pawar H, Sahasrabuddhe NA, Renuse S, Keerthikumar S, Sharma J, Kumar GSS, Venugopal A, Sekhar NR, Kelkar DS, Nemade H, Khobragade SN, Muthusamy B, Kandasamy K, Harsha HC, Chaerkady R, Patole MS, Pandey A. A proteogenomic approach to map the proteome of an unsequenced pathogen - Leishmania donovani. Proteomics 2012; 12:832-44. [DOI: 10.1002/pmic.201100505] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.3] [Indexed: 11/12/2022]
Affiliation(s)
- Harsh Pawar
- Institute of Bioinformatics; International Technology Park; Bangalore Karnataka India
- Rajiv Gandhi University of Health Sciences; Bangalore Karnataka India
- Nandini A. Sahasrabuddhe
- Institute of Bioinformatics; International Technology Park; Bangalore Karnataka India
- Manipal University; Madhav Nagar Manipal Karnataka India
- McKusick-Nathans Institute of Genetic Medicine; Johns Hopkins University School of Medicine; Baltimore MD USA
- Department of Biological Chemistry; Johns Hopkins University School of Medicine; Baltimore MD USA
- Santosh Renuse
- Institute of Bioinformatics; International Technology Park; Bangalore Karnataka India
- McKusick-Nathans Institute of Genetic Medicine; Johns Hopkins University School of Medicine; Baltimore MD USA
- Department of Biological Chemistry; Johns Hopkins University School of Medicine; Baltimore MD USA
- Department of Biotechnology; Amrita Vishwa Vidyapeetham; Kollam Kerala India
- Jyoti Sharma
- Institute of Bioinformatics; International Technology Park; Bangalore Karnataka India
- Manipal University; Madhav Nagar Manipal Karnataka India
- Ghantasala. S. Sameer Kumar
- Institute of Bioinformatics; International Technology Park; Bangalore Karnataka India
- Department of Biotechnology; Kuvempu University; Shimoga Karnataka India
- Abhilash Venugopal
- Institute of Bioinformatics; International Technology Park; Bangalore Karnataka India
- Department of Biotechnology; Kuvempu University; Shimoga Karnataka India
- Nirujogi Raja Sekhar
- Institute of Bioinformatics; International Technology Park; Bangalore Karnataka India
- Bioinformatics Centre; School of Life Sciences; Pondicherry University; Puducherry India
- Dhanashree S. Kelkar
- Institute of Bioinformatics; International Technology Park; Bangalore Karnataka India
- Department of Biotechnology; Amrita Vishwa Vidyapeetham; Kollam Kerala India
- Harshal Nemade
- National Centre for Cell Sciences; Pune Maharashtra India
- Babylakshmi Muthusamy
- Institute of Bioinformatics; International Technology Park; Bangalore Karnataka India
- Bioinformatics Centre; School of Life Sciences; Pondicherry University; Puducherry India
- Kumaran Kandasamy
- Institute of Bioinformatics; International Technology Park; Bangalore Karnataka India
- H. C. Harsha
- Institute of Bioinformatics; International Technology Park; Bangalore Karnataka India
- Raghothama Chaerkady
- Institute of Bioinformatics; International Technology Park; Bangalore Karnataka India
- McKusick-Nathans Institute of Genetic Medicine; Johns Hopkins University School of Medicine; Baltimore MD USA
- Department of Biological Chemistry; Johns Hopkins University School of Medicine; Baltimore MD USA
- Akhilesh Pandey
- McKusick-Nathans Institute of Genetic Medicine; Johns Hopkins University School of Medicine; Baltimore MD USA
- Department of Biological Chemistry; Johns Hopkins University School of Medicine; Baltimore MD USA
- Department of Oncology; Johns Hopkins University School of Medicine; Baltimore MD USA
- Department of Pathology; Johns Hopkins University School of Medicine; Baltimore MD USA
|
45
|
Ezkurdia I, del Pozo A, Frankish A, Rodriguez JM, Harrow J, Ashman K, Valencia A, Tress ML. Comparative proteomics reveals a significant bias toward alternative protein isoforms with conserved structure and function. Mol Biol Evol 2012; 29:2265-83. [PMID: 22446687 PMCID: PMC3424414 DOI: 10.1093/molbev/mss100] [Citation(s) in RCA: 66] [Impact Index Per Article: 5.5] [Indexed: 01/19/2023] Open
Abstract
Advances in high-throughput mass spectrometry are making proteomics an increasingly important tool in genome annotation projects. Peptides detected in mass spectrometry experiments can be used to validate gene models and verify the translation of putative coding sequences (CDSs). Here, we have identified peptides that cover 35% of the genes annotated by the GENCODE consortium for the human genome as part of a comprehensive analysis of experimental spectra from two large publicly available mass spectrometry databases. We detected the translation to protein of “novel” and “putative” protein-coding transcripts as well as transcripts annotated as pseudogenes and nonsense-mediated decay targets. We provide a detailed overview of the population of alternatively spliced protein isoforms that are detectable by peptide identification methods. We found that 150 genes expressed multiple alternative protein isoforms. This constitutes the largest set of reliably confirmed alternatively spliced proteins yet discovered. Three groups of genes were highly overrepresented. We detected alternative isoforms for 10 of the 25 possible heterogeneous nuclear ribonucleoproteins, proteins with a key role in the splicing process. Alternative isoforms generated from interchangeable homologous exons and from short indels were also significantly enriched, both in human experiments and in parallel analyses of mouse and Drosophila proteomics experiments. Our results show that a surprisingly high proportion (almost 25%) of the detected alternative isoforms are only subtly different from their constitutive counterparts. Many of the alternative splicing events that give rise to these alternative isoforms are conserved in mouse. It was striking that very few of these conserved splicing events broke Pfam functional domains or would damage globular protein structures. 
This evidence of a strong bias toward subtle differences in CDS and likely conserved cellular function and structure is remarkable and strongly suggests that the translation of alternative transcripts may be subject to selective constraints.
Affiliation(s)
- Iakes Ezkurdia
- Structural Biology and Biocomputing Programme, Spanish National Cancer Research Centre, Madrid, Spain
|
46
|
Gianazza E, Vegeto E, Eberini I, Sensi C, Miller I. Neglected markers: Altered serum proteome in murine models of disease. Proteomics 2012; 12:691-707. [DOI: 10.1002/pmic.201100320] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 06/20/2011] [Accepted: 08/28/2011] [Indexed: 11/09/2022]
|
47
|
Paulo JA, Lee LS, Banks PA, Steen H, Conwell DL. Difference gel electrophoresis identifies differentially expressed proteins in endoscopically collected pancreatic fluid. Electrophoresis 2011; 32:1939-51. [PMID: 21792986 DOI: 10.1002/elps.201100203] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Indexed: 01/29/2023]
Abstract
Alterations in the pancreatic fluid proteome of individuals with chronic pancreatitis (CP) may offer insights into the development and progression of the disease. The endoscopic pancreatic function test (ePFT) can safely collect large volumes of pancreatic fluid that are potentially amenable to proteomic analyses using difference gel electrophoresis (DIGE) coupled with liquid chromatography-tandem mass spectrometry (LC-MS/MS). Pancreatic fluid was collected endoscopically using the ePFT method following secretin stimulation from three individuals with severe CP and three chronic abdominal pain (CAP) controls. The fluid was processed to minimize protein degradation and the protein profiles of each cohort, as determined by DIGE and LC-MS/MS, were compared. This DIGE-LC-MS/MS analysis reveals proteins that are differentially expressed in CP compared with CAP controls. Proteins with higher abundance in pancreatic fluid from CP individuals include actin, desmoplakin, α-1-antitrypsin, SNC73, and serotransferrin. Those of relatively lower abundance include carboxypeptidase B, lipase, α-1-antichymotrypsin, α-2-macroglobulin, actin-related protein (Arp2/3) subunit 4, glyceraldehyde-3-phosphate dehydrogenase, and protein disulfide isomerase. Endoscopic collection (ePFT) in tandem with DIGE-LC-MS/MS is a suitable approach for pancreatic fluid proteome analysis; however, further optimization of our protocol, as outlined herein, may improve proteome coverage in future analyses.
Affiliation(s)
- Joao A Paulo
- Department of Pathology, Children's Hospital Boston and Harvard Medical School, Boston, MA 02115, USA.
|
48
|
Kooren JA, Rhodus NL, Tang C, Jagtap PD, Horrigan BJ, Griffin TJ. Evaluating the potential of a novel oral lesion exudate collection method coupled with mass spectrometry-based proteomics for oral cancer biomarker discovery. Clin Proteomics 2011; 8:13. [PMID: 21914210 PMCID: PMC3200993 DOI: 10.1186/1559-0275-8-13] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Received: 07/27/2011] [Accepted: 09/13/2011] [Indexed: 01/12/2023] Open
Abstract
Introduction Early diagnosis of Oral Squamous Cell Carcinoma (OSCC) increases the survival rate of oral cancer. For early diagnosis, molecular biomarkers contained in samples collected non-invasively and directly from at-risk oral premalignant lesions (OPMLs) would be ideal. Methods In this pilot study we evaluated the potential of a novel method using commercial PerioPaper absorbent strips for non-invasive collection of oral lesion exudate material coupled with mass spectrometry-based proteomics for oral cancer biomarker discovery. Results Our evaluation focused on three core issues. First, using an "on-strip" processing method, we found that protein can be isolated from exudate samples in amounts compatible with large-scale mass spectrometry-based proteomic analysis. Second, we found that the OPML exudate proteome was distinct from that of whole saliva, while being similar to the OPML epithelial cell proteome, demonstrating the fidelity of our exudate collection method. Third, in a proof-of-principle study, we identified numerous, inflammation-associated proteins showing an expected increase in abundance in OPML exudates compared to healthy oral tissue exudates. These results demonstrate the feasibility of identifying differentially abundant proteins from exudate samples, which is essential for biomarker discovery studies. Conclusions Collectively, our findings demonstrate that our exudate collection method coupled with mass spectrometry-based proteomics has great potential for transforming OSCC biomarker discovery and clinical diagnostics assay development.
Affiliation(s)
- Joel A Kooren
- Department of Biochemistry, Molecular Biology, and Biophysics, University of Minnesota, 321 Church St SE, 6-155 Jackson Hall, Minneapolis, Minnesota, 55455, USA.
|
49
|
Chaerkady R, Kelkar DS, Muthusamy B, Kandasamy K, Dwivedi SB, Sahasrabuddhe NA, Kim MS, Renuse S, Pinto SM, Sharma R, Pawar H, Sekhar NR, Mohanty AK, Getnet D, Yang Y, Zhong J, Dash AP, MacCallum RM, Delanghe B, Mlambo G, Kumar A, Keshava Prasad TS, Okulate M, Kumar N, Pandey A. A proteogenomic analysis of Anopheles gambiae using high-resolution Fourier transform mass spectrometry. Genome Res 2011; 21:1872-81. [PMID: 21795387 DOI: 10.1101/gr.127951.111] [Citation(s) in RCA: 48] [Impact Index Per Article: 3.7] [Indexed: 01/10/2023]
Abstract
Anopheles gambiae is a major mosquito vector responsible for malaria transmission, whose genome sequence was reported in 2002. Genome annotation is a continuing effort, and many of the approximately 13,000 genes listed in VectorBase for Anopheles gambiae are predictions that have still not been validated by any other method. To identify protein-coding genes of An. gambiae based on its genomic sequence, we carried out a deep proteomic analysis using high-resolution Fourier transform mass spectrometry for both precursor and fragment ions. Based on peptide evidence, we were able to support or correct more than 6000 gene annotations including 80 novel gene structures and about 500 translational start sites. An additional validation by RT-PCR and cDNA sequencing was successfully performed for 105 selected genes. Our proteogenomic analysis led to the identification of 2682 genome search-specific peptides. Numerous cases of encoded proteins were documented in regions annotated as intergenic, introns, or untranslated regions. Using a database created to contain potential splice sites, we also identified 35 novel splice junctions. This is a first report to annotate the An. gambiae genome using high-accuracy mass spectrometry data as a complementary technology for genome annotation.
Affiliation(s)
- Raghothama Chaerkady
- McKusick-Nathans Institute of Genetic Medicine and Department of Biological Chemistry, Johns Hopkins University, Baltimore, Maryland 21205, USA
|
50
|
Renuse S, Chaerkady R, Pandey A. Proteogenomics. Proteomics 2011; 11:620-30. [DOI: 10.1002/pmic.201000615] [Citation(s) in RCA: 106] [Impact Index Per Article: 8.2] [Received: 09/29/2010] [Revised: 11/14/2010] [Accepted: 11/16/2010] [Indexed: 12/13/2022]
|