1
Thangudu RR, Rudnick PA, Holck M, Singhal D, MacCoss MJ, Edwards NJ, Ketchum KA, Kinsinger CR, Kim E, Basu A. Abstract LB-242: Proteomic Data Commons: A resource for proteogenomic analysis. Cancer Res 2020. [DOI: 10.1158/1538-7445.am2020-lb-242] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
Abstract
The objective of the National Cancer Institute's Proteomic Data Commons (PDC) is to make cancer-related proteomic datasets accessible to the public. The PDC provides the cancer research community with a unified data repository that enables data sharing across cancer proteomic studies and also enables multi-omic integration in support of precision medicine. As a domain-specific repository within the Cancer Research Data Commons (CRDC), the vision for the PDC is to provide researchers the ability to find and analyze proteomic data across a wide variety of tumor types. Currently, the PDC houses data, supported by a large collection of metadata attributes, for nearly 40 datasets from over 12 cancer types produced by several large-scale cancer research programs, each with cohort sizes greater than 100 patients.
The PDC facilitates the analysis of proteomic, genomic, and imaging data derived from the same tumor. Most of the datasets in the PDC also have corresponding genomic and imaging data available in the Genomic Data Commons and The Cancer Imaging Archive, respectively. Researchers can discover which genomic variants are detectable at the protein level or better understand associations between gene expression, copy number variation, and protein abundance. The resource is currently available to the public in beta phase (https://pdc.esacinc.com) and will be officially launched on the cancer.gov domain in March 2020.
The PDC data portal is supported by a robust and extensible data model and provides user-friendly exploration, visualization and data analysis. This allows researchers to search for and visualize expression of proteins (through their mapped genes) across all studies, analyze protein abundance for all cases in a study through heatmaps, build and explore pan-cancer cohorts using highly curated, clinical metadata, and comprehensively view a study without needing to download the data.
The PDC provides quick access to mapping of peptide identities and quantities on the human genome as well as protein databases containing patient/tumor-specific variants and novel splicing events. It also enables fast, accurate, and convenient proteomic validation of novel genomic alterations through the PepQuery algorithm.
Through a highly versatile application programming interface (API), PDC allows users to interact with data programmatically and facilitates integration with data from other resources in their scripts for multi-omic analysis.
Big data interoperability is critical for progress in precision medicine. PDC is designed to interoperate with other resources including the CRDC nodes, allowing users to analyze PDC data with the tools and pipelines available on the NCI cloud resources. It further allows users to use their own tools to co-analyze genomic and proteomic data available from a common sample on Amazon Web Services (AWS) platform or on a local system.
The presentation will provide an overview of the PDC and its available datasets, as well as a discussion of how it facilitates multi-omic data analyses.
Citation Format: Ratna Rajesh Thangudu, Paul A. Rudnick, Michael Holck, Deepak Singhal, Michael J. MacCoss, Nathan J. Edwards, Karen A. Ketchum, Christopher R. Kinsinger, Erika Kim, Anand Basu. Proteomic Data Commons: A resource for proteogenomic analysis [abstract]. In: Proceedings of the Annual Meeting of the American Association for Cancer Research 2020; 2020 Apr 27-28 and Jun 22-24. Philadelphia (PA): AACR; Cancer Res 2020;80(16 Suppl):Abstract nr LB-242.
Affiliation(s)
- Erika Kim
  - National Cancer Institute, Bethesda, MD
2
Zhang X, Nguyen KD, Rudnick PA, Roper N, Kawaler E, Maity TK, Awasthi S, Gao S, Biswas R, Venugopalan A, Cultraro CM, Fenyö D, Guha U. Quantitative Mass Spectrometry to Interrogate Proteomic Heterogeneity in Metastatic Lung Adenocarcinoma and Validate a Novel Somatic Mutation CDK12-G879V. Mol Cell Proteomics 2019; 18:622-641. [PMID: 30617155 PMCID: PMC6442362 DOI: 10.1074/mcp.ra118.001266] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 12/07/2018] [Revised: 01/04/2019] [Indexed: 12/20/2022] Open
Abstract
Lung cancer is the leading cause of cancer death in both men and women. Tumor heterogeneity is an impediment to targeted treatment of all cancers, including lung cancer. Here, we sought to characterize tumor proteome and phosphoproteome changes by longitudinal, prospective collection of tumor tissue from an exceptional responder lung adenocarcinoma patient who survived with metastatic lung adenocarcinoma for over seven years while undergoing HER2-directed therapy in combination with chemotherapy. We employed "Super-SILAC" and TMT labeling strategies to quantify the proteome and phosphoproteome of a lung metastatic site and eight distinct metastatic progressive lymph nodes collected during these seven years, including five lymph nodes procured at autopsy. We identified specific signaling networks enriched in lung compared with the lymph node metastatic sites. We correlated the changes in protein abundance with changes in copy number alteration (CNA) and transcript expression. ERBB2/HER2 protein expression was higher in lung, consistent with a higher degree of ERBB2 amplification in lung compared with the lymph node metastatic sites. To further interrogate the mass spectrometry data, a patient-specific database was built by incorporating all the somatic and germline variants identified by whole genome sequencing (WGS) of genomic DNA from the lung, one lymph node metastatic site and blood. An extensive validation pipeline was built to confirm variant peptides. We validated 360 spectra corresponding to 55 germline and 6 somatic variant peptides. Targeted MRM assays revealed two novel variant somatic peptides, CDK12-G879V and FASN-R1439Q, expressed in lung and lymph node metastatic sites, respectively. The CDK12-G879V mutation likely results in a nonfunctional CDK12 kinase and chemotherapy susceptibility in lung metastatic sites. 
Knockdown of CDK12 in lung adenocarcinoma cells increased chemotherapy sensitivity which was rescued by wild type, but not CDK12-G879V expression, consistent with the complete resolution of the lung metastatic sites in this patient.
Affiliation(s)
- Xu Zhang
  - Thoracic and GI Malignancies Branch, Center for Cancer Research, NCI, NIH, Bethesda, Maryland
- Khoa Dang Nguyen
  - Thoracic and GI Malignancies Branch, Center for Cancer Research, NCI, NIH, Bethesda, Maryland
- Paul A Rudnick
  - Spectragen Informatics LLC, Bainbridge Island, Washington
- Nitin Roper
  - Thoracic and GI Malignancies Branch, Center for Cancer Research, NCI, NIH, Bethesda, Maryland
- Emily Kawaler
  - Institute for Systems Genetics and Department of Biochemistry and Molecular Pharmacology, NYU School of Medicine, New York, New York
- Tapan K Maity
  - Thoracic and GI Malignancies Branch, Center for Cancer Research, NCI, NIH, Bethesda, Maryland
- Shivangi Awasthi
  - Thoracic and GI Malignancies Branch, Center for Cancer Research, NCI, NIH, Bethesda, Maryland
- Shaojian Gao
  - Thoracic and GI Malignancies Branch, Center for Cancer Research, NCI, NIH, Bethesda, Maryland
- Romi Biswas
  - Thoracic and GI Malignancies Branch, Center for Cancer Research, NCI, NIH, Bethesda, Maryland
- Abhilash Venugopalan
  - Thoracic and GI Malignancies Branch, Center for Cancer Research, NCI, NIH, Bethesda, Maryland
- Constance M Cultraro
  - Thoracic and GI Malignancies Branch, Center for Cancer Research, NCI, NIH, Bethesda, Maryland
- David Fenyö
  - Institute for Systems Genetics and Department of Biochemistry and Molecular Pharmacology, NYU School of Medicine, New York, New York
- Udayan Guha
  - Thoracic and GI Malignancies Branch, Center for Cancer Research, NCI, NIH, Bethesda, Maryland
3
Rudnick PA, Markey SP, Roth J, Mirokhin Y, Yan X, Tchekhovskoi DV, Edwards NJ, Thangudu RR, Ketchum KA, Kinsinger CR, Mesri M, Rodriguez H, Stein SE. A Description of the Clinical Proteomic Tumor Analysis Consortium (CPTAC) Common Data Analysis Pipeline. J Proteome Res 2016; 15:1023-32. [PMID: 26860878 DOI: 10.1021/acs.jproteome.5b01091] [Citation(s) in RCA: 73] [Impact Index Per Article: 9.1] [Indexed: 11/29/2022]
Abstract
The Clinical Proteomic Tumor Analysis Consortium (CPTAC) has produced large proteomics data sets from the mass spectrometric interrogation of tumor samples previously analyzed by The Cancer Genome Atlas (TCGA) program. The availability of the genomic and proteomic data is enabling proteogenomic study for both reference (i.e., contained in major sequence databases) and nonreference markers of cancer. The CPTAC laboratories have focused on colon, breast, and ovarian tissues in the first round of analyses; spectra from these data sets were produced from 2D liquid chromatography-tandem mass spectrometry analyses and represent deep coverage. To reduce the variability introduced by disparate data analysis platforms (e.g., software packages, versions, parameters, sequence databases, etc.), the CPTAC Common Data Analysis Platform (CDAP) was created. The CDAP produces both peptide-spectrum-match (PSM) reports and gene-level reports. The pipeline processes raw mass spectrometry data according to the following: (1) peak-picking and quantitative data extraction, (2) database searching, (3) gene-based protein parsimony, and (4) false-discovery rate-based filtering. The pipeline also produces localization scores for the phosphopeptide enrichment studies using the PhosphoRS program. Quantitative information for each of the data sets is specific to the sample processing, with PSM and protein reports containing the spectrum-level or gene-level ("rolled-up") precursor peak areas and spectral counts for label-free or reporter ion log-ratios for 4plex iTRAQ. The reports are available in simple tab-delimited formats and, for the PSM-reports, in mzIdentML. The goal of the CDAP is to provide standard, uniform reports for all of the CPTAC data to enable comparisons between different samples and cancer types as well as across the major omics fields.
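Step (4) of the pipeline, false-discovery rate-based filtering, can be illustrated with a minimal sketch. The abstract does not say how the CDAP estimates FDR, so the code below assumes the generic target-decoy approach (FDR estimated as decoys divided by targets above a score threshold); the `PSM` record, its fields, and the function name are hypothetical and are not taken from the CDAP.

```python
from dataclasses import dataclass


@dataclass
class PSM:
    """Hypothetical peptide-spectrum match record (not the CDAP report format)."""
    peptide: str
    score: float      # assumption: higher score means a better match
    is_decoy: bool    # True if the match is to a decoy sequence


def filter_by_fdr(psms, max_fdr=0.01):
    """Keep target PSMs above the most permissive score cutoff whose
    estimated FDR (decoys / targets) does not exceed max_fdr."""
    ranked = sorted(psms, key=lambda p: p.score, reverse=True)
    targets = 0
    decoys = 0
    cutoff = 0  # number of top-ranked PSMs to keep
    for i, psm in enumerate(ranked, start=1):
        if psm.is_decoy:
            decoys += 1
        else:
            targets += 1
        estimated_fdr = decoys / max(targets, 1)
        if estimated_fdr <= max_fdr:
            cutoff = i
    # Decoys are only used for estimation and are dropped from the report.
    return [p for p in ranked[:cutoff] if not p.is_decoy]


example = [
    PSM("PEPTIDEA", 10.0, False),
    PSM("PEPTIDEB", 9.0, False),
    PSM("DECOYC", 8.0, True),
    PSM("PEPTIDED", 7.0, False),
    PSM("DECOYE", 6.0, True),
]
strict = [p.peptide for p in filter_by_fdr(example, max_fdr=0.0)]
relaxed = [p.peptide for p in filter_by_fdr(example, max_fdr=0.34)]
```

With the toy scores above, the strict threshold keeps only the two matches ranked ahead of the first decoy, while the relaxed threshold also admits the third target. Tied scores and q-value smoothing are ignored in this sketch.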
Affiliation(s)
- Paul A Rudnick
  - Spectragen Informatics, Bainbridge Island, Washington 98110, United States; Biomolecular Measurement Division, National Institute of Standards and Technology, Gaithersburg, Maryland 20899, United States
- Sanford P Markey
  - Biomolecular Measurement Division, National Institute of Standards and Technology, Gaithersburg, Maryland 20899, United States
- Jeri Roth
  - Biomolecular Measurement Division, National Institute of Standards and Technology, Gaithersburg, Maryland 20899, United States
- Yuri Mirokhin
  - Biomolecular Measurement Division, National Institute of Standards and Technology, Gaithersburg, Maryland 20899, United States
- Xinjian Yan
  - Biomolecular Measurement Division, National Institute of Standards and Technology, Gaithersburg, Maryland 20899, United States
- Dmitrii V Tchekhovskoi
  - Biomolecular Measurement Division, National Institute of Standards and Technology, Gaithersburg, Maryland 20899, United States
- Nathan J Edwards
  - Department of Biochemistry and Molecular & Cellular Biology, Georgetown University Medical Center, Washington, D.C. 20007, United States
- Christopher R Kinsinger
  - Office of Cancer Clinical Proteomics Research, National Cancer Institute, Bethesda, Maryland 20892, United States
- Mehdi Mesri
  - Office of Cancer Clinical Proteomics Research, National Cancer Institute, Bethesda, Maryland 20892, United States
- Henry Rodriguez
  - Office of Cancer Clinical Proteomics Research, National Cancer Institute, Bethesda, Maryland 20892, United States
- Stephen E Stein
  - Biomolecular Measurement Division, National Institute of Standards and Technology, Gaithersburg, Maryland 20899, United States
4
Abstract
Multiple-reaction monitoring (MRM) of peptides has been recognized as a promising technology because it is sensitive and robust. Borrowed from stable-isotope dilution (SID) methodologies in the field of small molecules, MRM is now routinely used in proteomics laboratories. While its usefulness validating candidate targets is widely accepted, it has not been established as a discovery tool. Traditional thinking has been that MRM workflows cannot be multiplexed high enough to efficiently profile. This is due to slower instrument scan rates and the complexities of developing increasingly large scheduling methods. In this issue, Colangelo et al. (Proteomics 2015, 15, 1202-1214) describe a pipeline (xMRM) for discovery-style MRM using label-free methods (i.e. relative quantitation). Label-free comes with cost benefits as does MRM, where data are easier to analyze than full-scan. Their paper offers numerous improvements in method design and data analysis. The robustness of their pipeline was tested on rodent postsynaptic density fractions. There, they were able to accurately quantify 112 proteins at a CV% of 11.4, with only 2.5% of the 1697 transitions requiring user intervention. Colangelo et al. aim to extend the reach of MRM deeper into the realm of discovery proteomics, an area that is currently dominated by data-dependent and data-independent workflows.
5
Rudnick PA. Refining spectral library searching. Proteomics 2014; 13:3247-50. [PMID: 24123856 DOI: 10.1002/pmic.201300426] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/30/2013] [Revised: 09/30/2013] [Accepted: 10/02/2013] [Indexed: 11/09/2022]
Abstract
Spectral library searching has many advantages over sequence database searching, yet it has not been widely adopted. One possible reason for this is that users are unsure exactly how to interpret the similarity scores (e.g., "dot products" are not probability-based scores). Methods to create decoys have been proposed, but, as developers caution, may produce proxies that are not equivalent to reversed sequences. In this issue, Shao et al. (Proteomics 2013, 13, 3273-3283) report advances in spectral library searching where the focus is not on improving the performance of their search engine, SpectraST, but is instead on improving the statistical meaningfulness of its discriminant score and removing the need for decoys. The results in their paper indicate that by "standardizing" the input and library spectra, sensitivity is not lost but is, surprisingly, gained. Their tests also show that false discovery rate (FDR) estimates, derived from their new score, track better with "ground truth" than decoy searching. It is possible that their work strikes a good balance between the theory of library searching and its application. And as such, they hope to have removed a major entrance barrier for some researchers previously unwilling to try library searching.
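To make concrete why a "dot product" is a similarity measure rather than a probability, the sketch below computes a normalized dot product between a query spectrum and a library spectrum. It is a generic illustration only: the unit-width m/z binning, the function names, and the peak-list format are assumptions, and this is neither SpectraST's discriminant score nor the standardized score of Shao et al.

```python
import math
from collections import defaultdict


def bin_spectrum(peaks, bin_width=1.0):
    """Sum peak intensities into fixed-width m/z bins.
    `peaks` is a list of (mz, intensity) pairs (assumed format)."""
    binned = defaultdict(float)
    for mz, intensity in peaks:
        binned[int(mz / bin_width)] += intensity
    return binned


def normalized_dot_product(query_peaks, library_peaks, bin_width=1.0):
    """Cosine similarity of two binned spectra: 1.0 for identical intensity
    patterns, 0.0 when no bins are shared. The value ranks candidates but
    carries no statistical meaning by itself."""
    q = bin_spectrum(query_peaks, bin_width)
    lib = bin_spectrum(library_peaks, bin_width)
    shared = sum(q[k] * lib[k] for k in q if k in lib)
    norm = math.sqrt(sum(v * v for v in q.values())) * math.sqrt(
        sum(v * v for v in lib.values())
    )
    return shared / norm if norm else 0.0


library = [(100.2, 3.0), (200.7, 4.0)]
same = normalized_dot_product(library, library)
disjoint = normalized_dot_product(library, [(300.1, 5.0)])
```

A score of 0.8 from this function says only that the two intensity vectors point in similar directions; turning such a score into an error rate requires additional modeling, which is the gap the commentary above describes.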
Affiliation(s)
- Paul A Rudnick
  - Spectragen Informatics, Rockville, MD, USA; Mass Spectrometry Data Center, National Institute of Standards and Technology, Gaithersburg, MD, USA
6
Dong Q, Yan X, Kilpatrick LE, Liang Y, Mirokhin YA, Roth JS, Rudnick PA, Stein SE. Tandem mass spectral libraries of peptides in digests of individual proteins: Human Serum Albumin (HSA). Mol Cell Proteomics 2014; 13:2435-49. [PMID: 24889059 DOI: 10.1074/mcp.o113.037135] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Indexed: 01/26/2023] Open
Abstract
This work presents a method for creating a mass spectral library containing tandem spectra of identifiable peptide ions in the tryptic digestion of a single protein. Human serum albumin (HSA) was selected for this purpose owing to its ubiquity, high level of characterization and availability of digest data. The underlying experimental data consisted of ∼3000 one-dimensional LC-ESI-MS/MS runs with ion-trap fragmentation. In order to generate a wide range of peptides, studies covered a broad set of instrument and digestion conditions using multiple sources of HSA and trypsin. Computer methods were developed to enable the reliable identification and reference spectrum extraction of all peptide ions identifiable by current sequence search methods. This process made use of both MS2 (tandem) spectra and MS1 (electrospray) data. Identified spectra were generated for 2918 different peptide ions, using a variety of manually-validated filters to ensure spectrum quality and identification reliability. The resulting library was composed of 10% conventional tryptic and 29% semitryptic peptide ions, along with 42% tryptic peptide ions with known or unknown modifications, which included both analytical artifacts and post-translational modifications (PTMs) present in the original HSA. The remaining 19% contained unexpected missed-cleavages or were under/over alkylated. The methods described can be extended to create equivalent spectral libraries for any target protein. Such libraries have a number of applications in addition to their known advantages of speed and sensitivity, including the ready re-identification of known PTMs, rejection of artifact spectra and a means of assessing sample and digestion quality.
Affiliation(s)
- Qian Dong
  - Biomolecular Measurement Division, National Institute of Standards and Technology, 100 Bureau Drive, Stop 8362, Gaithersburg, Maryland 20899, United States
- Xinjian Yan
  - Biomolecular Measurement Division, National Institute of Standards and Technology, 100 Bureau Drive, Stop 8362, Gaithersburg, Maryland 20899, United States
- Lisa E Kilpatrick
  - Biomolecular Measurement Division, National Institute of Standards and Technology, 100 Bureau Drive, Stop 8362, Gaithersburg, Maryland 20899, United States
- Yuxue Liang
  - Biomolecular Measurement Division, National Institute of Standards and Technology, 100 Bureau Drive, Stop 8362, Gaithersburg, Maryland 20899, United States
- Yuri A Mirokhin
  - Biomolecular Measurement Division, National Institute of Standards and Technology, 100 Bureau Drive, Stop 8362, Gaithersburg, Maryland 20899, United States
- Jeri S Roth
  - Biomolecular Measurement Division, National Institute of Standards and Technology, 100 Bureau Drive, Stop 8362, Gaithersburg, Maryland 20899, United States
- Paul A Rudnick
  - Biomolecular Measurement Division, National Institute of Standards and Technology, 100 Bureau Drive, Stop 8362, Gaithersburg, Maryland 20899, United States
- Stephen E Stein
  - Biomolecular Measurement Division, National Institute of Standards and Technology, 100 Bureau Drive, Stop 8362, Gaithersburg, Maryland 20899, United States
7
Rudnick PA, Wang X, Yan X, Sedransk N, Stein SE. Improved normalization of systematic biases affecting ion current measurements in label-free proteomics data. Mol Cell Proteomics 2014; 13:1341-51. [PMID: 24563535 DOI: 10.1074/mcp.m113.030593] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.9] [Indexed: 01/28/2023] Open
Abstract
Normalization is an important step in the analysis of quantitative proteomics data. If this step is ignored, systematic biases can lead to incorrect assumptions about regulation. Most statistical procedures for normalizing proteomics data have been borrowed from genomics where their development has focused on the removal of so-called 'batch effects.' In general, a typical normalization step in proteomics works under the assumption that most peptides/proteins do not change; scaling is then used to give a median log-ratio of 0. The focus of this work was to identify other factors, derived from knowledge of the variables in proteomics, which might be used to improve normalization. Here we have examined the multi-laboratory data sets from Phase I of the NCI's CPTAC program. Surprisingly, the most important bias variables affecting peptide intensities within labs were retention time and charge state. The magnitude of these observations was exaggerated in samples of unequal concentrations or "spike-in" levels, presumably because the average precursor charge for peptides with higher charge state potentials is lower at higher relative sample concentrations. These effects are consistent with reduced protonation during electrospray and demonstrate that the physical properties of the peptides themselves can serve as good reporters of systematic biases. Between labs, retention time, precursor m/z, and peptide length were most commonly the top-ranked bias variables, over the standardly used average intensity (A). A larger set of variables was then used to develop a stepwise normalization procedure. This statistical model was found to perform as well or better on the CPTAC mock biomarker data than other commonly used methods. Furthermore, the method described here does not require a priori knowledge of the systematic biases in a given data set. These improvements can be attributed to the inclusion of variables other than average intensity during normalization.
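The "typical normalization step" mentioned above, scaling so that the median log-ratio is 0, can be sketched as follows. This shows only that baseline, not the stepwise multi-variable procedure the paper develops; the function name and the paired-intensity input format are assumptions made for illustration.

```python
import math
from statistics import median


def median_center_log_ratios(sample, reference):
    """Compute log2(sample / reference) per peptide, then subtract the median
    so the normalized log-ratios have a median of 0. This encodes the
    assumption that most peptides do not change between the two runs."""
    log_ratios = [math.log2(s / r) for s, r in zip(sample, reference)]
    shift = median(log_ratios)
    return [lr - shift for lr in log_ratios]


# Toy intensities: every peptide in `sample` is inflated relative to
# `reference`, as if more material had been loaded.
reference = [1.0, 1.0, 1.0]
sample = [2.0, 4.0, 8.0]
normalized = median_center_log_ratios(sample, reference)
```

A single global shift like this cannot correct biases that vary with retention time or charge state, which is the limitation the abstract's additional bias variables are meant to address.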
Affiliation(s)
- Paul A Rudnick
  - Mass Spectrometry Data Center, National Institute of Standards and Technology, Gaithersburg, Maryland
8
Simón-Manso Y, Lowenthal MS, Kilpatrick LE, Sampson ML, Telu KH, Rudnick PA, Mallard WG, Bearden DW, Schock TB, Tchekhovskoi DV, Blonder N, Yan X, Liang Y, Zheng Y, Wallace WE, Neta P, Phinney KW, Remaley AT, Stein SE. Metabolite Profiling of a NIST Standard Reference Material for Human Plasma (SRM 1950): GC-MS, LC-MS, NMR, and Clinical Laboratory Analyses, Libraries, and Web-Based Resources. Anal Chem 2013; 85:11725-31. [DOI: 10.1021/ac402503m] [Citation(s) in RCA: 184] [Impact Index Per Article: 16.7] [Indexed: 12/20/2022]
Affiliation(s)
- Yamil Simón-Manso
  - Biomolecular Measurement Division, National Institute of Standards and Technology, Gaithersburg, Maryland 20899-8380, United States
- Mark S. Lowenthal
  - Biomolecular Measurement Division, National Institute of Standards and Technology, Gaithersburg, Maryland 20899-8380, United States
- Lisa E. Kilpatrick
  - Biomolecular Measurement Division, National Institute of Standards and Technology, Gaithersburg, Maryland 20899-8380, United States
- Maureen L. Sampson
  - Department of Laboratory Medicine, National Institutes of Health, Bethesda, Maryland 20892, United States
- Kelly H. Telu
  - Chemical Sciences Division, National Institute of Standards and Technology, Gaithersburg, Maryland 20899-8380, United States
- Paul A. Rudnick
  - Biomolecular Measurement Division, National Institute of Standards and Technology, Gaithersburg, Maryland 20899-8380, United States
- W. Gary Mallard
  - Biomolecular Measurement Division, National Institute of Standards and Technology, Gaithersburg, Maryland 20899-8380, United States
- Daniel W. Bearden
  - Hollings Marine Laboratory, Chemical Sciences Division, National Institute of Standards and Technology, Charleston, South Carolina 29412, United States
- Tracey B. Schock
  - Hollings Marine Laboratory, Chemical Sciences Division, National Institute of Standards and Technology, Charleston, South Carolina 29412, United States
- Dmitrii V. Tchekhovskoi
  - Biomolecular Measurement Division, National Institute of Standards and Technology, Gaithersburg, Maryland 20899-8380, United States
- Niksa Blonder
  - Biomolecular Measurement Division, National Institute of Standards and Technology, Gaithersburg, Maryland 20899-8380, United States
- Xinjian Yan
  - Biomolecular Measurement Division, National Institute of Standards and Technology, Gaithersburg, Maryland 20899-8380, United States
- Yuxue Liang
  - Biomolecular Measurement Division, National Institute of Standards and Technology, Gaithersburg, Maryland 20899-8380, United States
- Yufang Zheng
  - Biomolecular Measurement Division, National Institute of Standards and Technology, Gaithersburg, Maryland 20899-8380, United States
- William E. Wallace
  - Chemical Sciences Division, National Institute of Standards and Technology, Gaithersburg, Maryland 20899-8380, United States
- Pedatsur Neta
  - Biomolecular Measurement Division, National Institute of Standards and Technology, Gaithersburg, Maryland 20899-8380, United States
- Karen W. Phinney
  - Biomolecular Measurement Division, National Institute of Standards and Technology, Gaithersburg, Maryland 20899-8380, United States
- Alan T. Remaley
  - Department of Laboratory Medicine, National Institutes of Health, Bethesda, Maryland 20892, United States
- Stephen E. Stein
  - Biomolecular Measurement Division, National Institute of Standards and Technology, Gaithersburg, Maryland 20899-8380, United States
9
Walmsley SJ, Rudnick PA, Liang Y, Dong Q, Stein SE, Nesvizhskii AI. Comprehensive analysis of protein digestion using six trypsins reveals the origin of trypsin as a significant source of variability in proteomics. J Proteome Res 2013; 12:5666-80. [PMID: 24116745 DOI: 10.1021/pr400611h] [Citation(s) in RCA: 54] [Impact Index Per Article: 4.9] [Indexed: 01/25/2023]
Abstract
Trypsin is an endoprotease commonly used for sample preparation in proteomics experiments. Importantly, protein digestion is dependent on multiple factors, including the trypsin origin and digestion conditions. In-depth characterization of trypsin activity could lead to improved reliability of peptide detection and quantitation in both targeted and discovery proteomics studies. To this end, we assembled a data analysis pipeline and suite of visualization tools for quality control and comprehensive characterization of preanalytical variability in proteomics experiments. Using these tools, we evaluated six available proteomics-grade trypsins and their digestion of a single purified protein, human serum albumin (HSA). HSA was aliquoted and then digested for 2 or 18 h for each trypsin, and the resulting digests were desalted and analyzed in triplicate by reversed-phase liquid chromatography-tandem mass spectrometry. Peptides were identified and quantified using the NIST MSQC pipeline and a comprehensive HSA mass spectral library. We performed a statistical analysis of peptide abundances from different digests and further visualized the data using the principal component analysis and quantitative protein "sequence maps". While the performance of individual trypsins across repeat digests was reproducible, significant differences were observed depending on the origin of the trypsin (i.e., bovine vs porcine). Bovine trypsins produced a higher number of peptides containing missed cleavages, whereas porcine trypsins produced more semitryptic peptides. In addition, many cleavage sites showed variable digestion kinetics patterns, evident from the comparison of peptide abundances in 2 h vs 18 h digests. Overall, this work illustrates effects of an often neglected source of variability in proteomics experiments: the origin of the trypsin.
Affiliation(s)
- Scott J Walmsley
  - Department of Pathology, University of Michigan, 4237 Medical Science I, 1301 Catherine Road, Ann Arbor, Michigan 48109, United States
10
Knudtson KL, Chien AS, Reyero Vinas NG, Martin L, Murray JM, Rudnick PA, Searle BC, Zianni M, Hunter TC, Van Ee J, Needleman D, Kuster-Schock E. ABRF 2013: Best Poster Competition. J Biomol Tech 2013. [DOI: 10.7171/jbt.13-2402-007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/20/2022]
11
Ivanov AR, Colangelo CM, Dufresne CP, Friedman DB, Lilley KS, Mechtler K, Phinney BS, Rose KL, Rudnick PA, Searle BC, Shaffer SA, Weintraub ST. Interlaboratory studies and initiatives developing standards for proteomics. Proteomics 2013; 13:904-9. [PMID: 23319436 DOI: 10.1002/pmic.201200532] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.4] [Received: 11/21/2012] [Revised: 12/18/2012] [Accepted: 12/19/2012] [Indexed: 01/02/2023]
Abstract
Proteomics is a rapidly transforming interdisciplinary field of research that embraces a diverse set of analytical approaches to tackle problems in fundamental and applied biology. This viewpoint article highlights the benefits of interlaboratory studies and standardization initiatives to enable investigators to address many of the challenges found in proteomics research. Among these initiatives, we discuss our efforts on a comprehensive performance standard for characterizing PTMs by MS that was recently developed by the Association of Biomolecular Resource Facilities (ABRF) Proteomics Standards Research Group (sPRG).
Affiliation(s)
- Alexander R Ivanov
  - Barnett Institute of Chemical and Biological Analysis, Department of Chemistry and Chemical Biology, Northeastern University, Boston, MA 02115, USA
12
Lowenthal MS, Phillips MM, Rimmer CA, Rudnick PA, Simón-Manso Y, Stein SE, Tchekhovskoi D, Phinney KW. Developing qualitative LC-MS methods for characterization of Vaccinium berry Standard Reference Materials. Anal Bioanal Chem 2012; 405:4451-65. [DOI: 10.1007/s00216-012-6346-3] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 07/03/2012] [Revised: 08/02/2012] [Accepted: 08/07/2012] [Indexed: 10/27/2022]
13
Tu C, Rudnick PA, Martinez MY, Cheek KL, Stein SE, Slebos RJC, Liebler DC. Depletion of abundant plasma proteins and limitations of plasma proteomics. J Proteome Res 2010; 9:4982-91. [PMID: 20677825 DOI: 10.1021/pr100646w] [Citation(s) in RCA: 254] [Impact Index Per Article: 18.1] [Indexed: 11/28/2022]
Abstract
Immunoaffinity depletion with antibodies to the top 7 or top 14 high-abundance plasma proteins is used to enhance detection of lower abundance proteins in both shotgun and targeted proteomic analyses. We evaluated the effects of top 7/top 14 immunodepletion on the shotgun proteomic analysis of human plasma. Our goal was to evaluate the impact of immunodepletion on detection of proteins across detectable ranges of abundance. The depletion columns afforded highly repeatable and efficient plasma protein fractionation. Relatively few nontargeted proteins were captured by the depletion columns. Analyses of unfractionated and immunodepleted plasma by peptide isoelectric focusing (IEF), followed by liquid chromatography-tandem mass spectrometry (LC-MS/MS), demonstrated enrichment of nontargeted plasma proteins by an average of 4-fold, as assessed by MS/MS spectral counting. Either top 7 or top 14 immunodepletion resulted in a 25% increase in identified proteins compared to unfractionated plasma. Although 23 low-abundance (<10 ng mL⁻¹) plasma proteins were detected, they accounted for only 5-6% of total protein identifications in immunodepleted plasma. In both unfractionated and immunodepleted plasma, the 50 most abundant plasma proteins accounted for 90% of cumulative spectral counts and precursor ion intensities, leaving little capacity to sample lower abundance proteins. Untargeted proteomic analyses using current LC-MS/MS platforms, even with immunodepletion, cannot be expected to efficiently discover low-abundance, disease-specific biomarkers in plasma.
Affiliation(s)
- Chengjian Tu
- The Jim Ayers Institute for Precancer Detection and Diagnosis, Vanderbilt-Ingram Cancer Center, Vanderbilt University School of Medicine, Nashville, Tennessee 37232, USA
|
14
|
Tabb DL, Vega-Montoto L, Rudnick PA, Variyath AM, Ham AJL, Bunk DM, Kilpatrick LE, Billheimer DD, Blackman RK, Cardasis HL, Carr SA, Clauser KR, Jaffe JD, Kowalski KA, Neubert TA, Regnier FE, Schilling B, Tegeler TJ, Wang M, Wang P, Whiteaker JR, Zimmerman LJ, Fisher SJ, Gibson BW, Kinsinger CR, Mesri M, Rodriguez H, Stein SE, Tempst P, Paulovich AG, Liebler DC, Spiegelman C. Repeatability and reproducibility in proteomic identifications by liquid chromatography-tandem mass spectrometry. J Proteome Res 2010; 9:761-76. [PMID: 19921851 DOI: 10.1021/pr9006365] [Citation(s) in RCA: 409] [Impact Index Per Article: 29.2] [Indexed: 11/28/2022]
Abstract
The complexity of proteomic instrumentation for LC-MS/MS introduces many possible sources of variability. Data-dependent sampling of peptides constitutes a stochastic element at the heart of discovery proteomics. Although this variation impacts the identification of peptides, proteomic identifications are far from completely random. In this study, we analyzed interlaboratory data sets from the NCI Clinical Proteomic Technology Assessment for Cancer to examine repeatability and reproducibility in peptide and protein identifications. Included data spanned 144 LC-MS/MS experiments on four Thermo LTQ and four Orbitrap instruments. Samples included yeast lysate, the NCI-20 defined dynamic range protein mix, and the Sigma UPS 1 defined equimolar protein mix. Some of our findings reinforced conventional wisdom, such as repeatability and reproducibility being higher for proteins than for peptides. Most lessons from the data, however, were more subtle. Orbitraps proved capable of higher repeatability and reproducibility, but aberrant performance occasionally erased these gains. Even the simplest protein digestions yielded more peptide ions than LC-MS/MS could identify during a single experiment. We observed that peptide lists from pairs of technical replicates overlapped by 35-60%, giving a range for peptide-level repeatability in these experiments. Sample complexity did not appear to affect peptide identification repeatability, even as numbers of identified spectra changed by an order of magnitude. Statistical analysis of protein spectral counts revealed greater stability across technical replicates for Orbitraps, making them superior to LTQ instruments for biomarker candidate discovery. The most repeatable peptides were those corresponding to conventional tryptic cleavage sites, those that produced intense MS signals, and those that resulted from proteins generating many distinct peptides. 
Reproducibility among different instruments of the same type lagged behind repeatability of technical replicates on a single instrument by several percent. These findings reinforce the importance of evaluating repeatability as a fundamental characteristic of analytical technologies.
Affiliation(s)
- David L Tabb
- Vanderbilt University Medical Center, Nashville, Tennessee 37232, USA
|
15
|
Johann DJ, Wei BR, Prieto DA, Chan KC, Ye X, Valera VA, Simpson RM, Rudnick PA, Xiao Z, Stein SE, Issaq HJ, Linehan WM, Veenstra TD, Blonder J. Combined blood/tissue analysis for cancer biomarker discovery: application to renal cell carcinoma. Anal Chem 2010; 82:1584-8. [PMID: 20121140 PMCID: PMC3251958 DOI: 10.1021/ac902204k] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.6] [Indexed: 12/28/2022]
Abstract
A method that relies on subtractive tissue-directed shot-gun proteomics to identify tumor proteins in the blood of a patient newly diagnosed with cancer is described. To avoid analytical and statistical biases caused by physiologic variability of protein expression in the human population, this method was applied on clinical specimens obtained from a single patient diagnosed with nonmetastatic renal cell carcinoma (RCC). The proteomes extracted from tumor, normal adjacent tissue and preoperative plasma were analyzed using 2D-liquid chromatography-mass spectrometry (LC-MS). The lists of identified proteins were filtered to discover proteins that (i) were found in the tumor but not normal tissue, (ii) were identified in matching plasma, and (iii) whose spectral count was higher in tumor tissue than plasma. These filtering criteria resulted in identification of eight tumor proteins in the blood. Subsequent Western-blot analysis confirmed the presence of cadherin-5, cadherin-11, DEAD-box protein-23, and pyruvate kinase in the blood of the patient in the study as well as in the blood of four other patients diagnosed with RCC. These results demonstrate the utility of a combined blood/tissue analysis strategy that permits the detection of tumor proteins in the blood of a patient diagnosed with RCC.
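The three filtering criteria in this abstract can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' pipeline: the function name, the dictionary layout (protein name mapped to spectral count per sample), and the toy counts are all assumptions made for the example.

```python
def filter_tumor_proteins_in_blood(tumor, normal, plasma):
    """Apply the three filters described in the abstract.

    Each argument is a hypothetical dict mapping protein name -> spectral count
    for one sample (tumor tissue, normal adjacent tissue, preoperative plasma).
    """
    candidates = []
    for protein, tumor_count in tumor.items():
        plasma_count = plasma.get(protein, 0)
        in_tumor_only = tumor_count > 0 and normal.get(protein, 0) == 0  # (i)
        in_plasma = plasma_count > 0                                     # (ii)
        higher_in_tumor = tumor_count > plasma_count                     # (iii)
        if in_tumor_only and in_plasma and higher_in_tumor:
            candidates.append(protein)
    return sorted(candidates)


# Toy spectral counts, invented purely for illustration.
tumor = {"protein_A": 12, "protein_B": 5, "protein_C": 3, "protein_D": 7}
normal = {"protein_B": 4}
plasma = {"protein_A": 2, "protein_B": 1, "protein_C": 9}

selected = filter_tumor_proteins_in_blood(tumor, normal, plasma)
print(selected)
```

In the toy data only `protein_A` passes all three filters: `protein_B` is also present in normal tissue, `protein_C` has a higher count in plasma than in tumor, and `protein_D` is absent from plasma.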
Affiliation(s)
- Donald J. Johann
- Medical Oncology Branch, Center for Cancer Research, National Cancer Institute, Bethesda, Maryland 20892
- Bih-Rong Wei
- Laboratory of Cancer Biology and Genetics, Center for Cancer Research, National Cancer Institute, Bethesda, Maryland 20892
- DaRue A. Prieto
- Laboratory of Proteomics and Analytical Technologies, Advanced Technology Program, SAIC-Frederick, Inc., National Cancer Institute at Frederick, P.O. Box B, Frederick, Maryland 21702
- King C. Chan
- Laboratory of Proteomics and Analytical Technologies, Advanced Technology Program, SAIC-Frederick, Inc., National Cancer Institute at Frederick, P.O. Box B, Frederick, Maryland 21702
- Xiaoying Ye
- Laboratory of Proteomics and Analytical Technologies, Advanced Technology Program, SAIC-Frederick, Inc., National Cancer Institute at Frederick, P.O. Box B, Frederick, Maryland 21702
- Vladimir A. Valera
- Urologic Oncology Branch, Center for Cancer Research, National Cancer Institute, Bethesda, Maryland 20892
- R. Mark Simpson
- Laboratory of Cancer Biology and Genetics, Center for Cancer Research, National Cancer Institute, Bethesda, Maryland 20892
- Paul A. Rudnick
- The NIST Mass Spectrometry Data Center, National Institute of Standards and Technology, Gaithersburg, Maryland 20899
- Zhen Xiao
- Medical Oncology Branch, Center for Cancer Research, National Cancer Institute, Bethesda, Maryland 20892
- Stephen E. Stein
- The NIST Mass Spectrometry Data Center, National Institute of Standards and Technology, Gaithersburg, Maryland 20899
- Haleem J. Issaq
- Medical Oncology Branch, Center for Cancer Research, National Cancer Institute, Bethesda, Maryland 20892
- W. Marston Linehan
- Urologic Oncology Branch, Center for Cancer Research, National Cancer Institute, Bethesda, Maryland 20892
- Timothy D. Veenstra
- Medical Oncology Branch, Center for Cancer Research, National Cancer Institute, Bethesda, Maryland 20892
- Josip Blonder
- Laboratory of Proteomics and Analytical Technologies, Advanced Technology Program, SAIC-Frederick, Inc., National Cancer Institute at Frederick, P.O. Box B, Frederick, Maryland 21702
|
16
|
Paulovich AG, Billheimer D, Ham AJL, Vega-Montoto L, Rudnick PA, Tabb DL, Wang P, Blackman RK, Bunk DM, Cardasis HL, Clauser KR, Kinsinger CR, Schilling B, Tegeler TJ, Variyath AM, Wang M, Whiteaker JR, Zimmerman LJ, Fenyo D, Carr SA, Fisher SJ, Gibson BW, Mesri M, Neubert TA, Regnier FE, Rodriguez H, Spiegelman C, Stein SE, Tempst P, Liebler DC. Interlaboratory study characterizing a yeast performance standard for benchmarking LC-MS platform performance. Mol Cell Proteomics 2009; 9:242-54. [PMID: 19858499 DOI: 10.1074/mcp.m900222-mcp200] [Citation(s) in RCA: 140] [Impact Index Per Article: 9.3] [Indexed: 11/06/2022] Open
Abstract
Optimal performance of LC-MS/MS platforms is critical to generating high quality proteomics data. Although individual laboratories have developed quality control samples, there is no widely available performance standard of biological complexity (and associated reference data sets) for benchmarking of platform performance for analysis of complex biological proteomes across different laboratories in the community. Individual preparations of the yeast Saccharomyces cerevisiae proteome have been used extensively by laboratories in the proteomics community to characterize LC-MS platform performance. The yeast proteome is uniquely attractive as a performance standard because it is the most extensively characterized complex biological proteome and the only one associated with several large scale studies estimating the abundance of all detectable proteins. In this study, we describe a standard operating protocol for large scale production of the yeast performance standard and offer aliquots to the community through the National Institute of Standards and Technology where the yeast proteome is under development as a certified reference material to meet the long term needs of the community. Using a series of metrics that characterize LC-MS performance, we provide a reference data set demonstrating typical performance of commonly used ion trap instrument platforms in expert laboratories; the results provide a basis for laboratories to benchmark their own performance, to improve upon current methods, and to evaluate new technologies. Additionally, we demonstrate how the yeast reference, spiked with human proteins, can be used to benchmark the power of proteomics platforms for detection of differentially expressed proteins at different levels of concentration in a complex matrix, thereby providing a metric to evaluate and minimize pre-analytical and analytical variation in comparative proteomics experiments.
|
17
|
Rudnick PA, Clauser KR, Kilpatrick LE, Tchekhovskoi DV, Neta P, Blonder N, Billheimer DD, Blackman RK, Bunk DM, Cardasis HL, Ham AJL, Jaffe JD, Kinsinger CR, Mesri M, Neubert TA, Schilling B, Tabb DL, Tegeler TJ, Vega-Montoto L, Variyath AM, Wang M, Wang P, Whiteaker JR, Zimmerman LJ, Carr SA, Fisher SJ, Gibson BW, Paulovich AG, Regnier FE, Rodriguez H, Spiegelman C, Tempst P, Liebler DC, Stein SE. Performance metrics for liquid chromatography-tandem mass spectrometry systems in proteomics analyses. Mol Cell Proteomics 2009; 9:225-41. [PMID: 19837981 PMCID: PMC2830836 DOI: 10.1074/mcp.m900223-mcp200] [Citation(s) in RCA: 158] [Impact Index Per Article: 10.5] [Indexed: 11/30/2022] Open
Abstract
A major unmet need in LC-MS/MS-based proteomics analyses is a set of tools for quantitative assessment of system performance and evaluation of technical variability. Here we describe 46 system performance metrics for monitoring chromatographic performance, electrospray source stability, MS1 and MS2 signals, dynamic sampling of ions for MS/MS, and peptide identification. Applied to data sets from replicate LC-MS/MS analyses, these metrics displayed consistent, reasonable responses to controlled perturbations. The metrics typically displayed variations less than 10% and thus can reveal even subtle differences in performance of system components. Analyses of data from interlaboratory studies conducted under a common standard operating procedure identified outlier data and provided clues to specific causes. Moreover, interlaboratory variation reflected by the metrics indicates which system components vary the most between laboratories. Application of these metrics enables rational, quantitative quality assessment for proteomics and other LC-MS/MS analytical applications.
Affiliation(s)
- Paul A Rudnick
- National Institute of Standards and Technology, Gaithersburg, Maryland 20899, USA
|
18
|
Addona TA, Abbatiello SE, Schilling B, Skates SJ, Mani DR, Bunk DM, Spiegelman CH, Zimmerman LJ, Ham AJL, Keshishian H, Hall SC, Allen S, Blackman RK, Borchers CH, Buck C, Cardasis HL, Cusack MP, Dodder NG, Gibson BW, Held JM, Hiltke T, Jackson A, Johansen EB, Kinsinger CR, Li J, Mesri M, Neubert TA, Niles RK, Pulsipher TC, Ransohoff D, Rodriguez H, Rudnick PA, Smith D, Tabb DL, Tegeler TJ, Variyath AM, Vega-Montoto LJ, Wahlander Å, Waldemarson S, Wang M, Whiteaker JR, Zhao L, Anderson NL, Fisher SJ, Liebler DC, Paulovich AG, Regnier FE, Tempst P, Carr SA. Erratum: Corrigendum: Multi-site assessment of the precision and reproducibility of multiple reaction monitoring–based measurements of proteins in plasma. Nat Biotechnol 2009. [DOI: 10.1038/nbt0909-864b] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Indexed: 11/09/2022]
|
19
|
Addona TA, Abbatiello SE, Schilling B, Skates SJ, Mani DR, Bunk DM, Spiegelman CH, Zimmerman LJ, Ham AJL, Keshishian H, Hall SC, Allen S, Blackman RK, Borchers CH, Buck C, Cardasis HL, Cusack MP, Dodder NG, Gibson BW, Held JM, Hiltke T, Jackson A, Johansen EB, Kinsinger CR, Li J, Mesri M, Neubert TA, Niles RK, Pulsipher TC, Ransohoff D, Rodriguez H, Rudnick PA, Smith D, Tabb DL, Tegeler TJ, Variyath AM, Vega-Montoto LJ, Wahlander A, Waldemarson S, Wang M, Whiteaker JR, Zhao L, Anderson NL, Fisher SJ, Liebler DC, Paulovich AG, Regnier FE, Tempst P, Carr SA. Multi-site assessment of the precision and reproducibility of multiple reaction monitoring-based measurements of proteins in plasma. Nat Biotechnol 2009; 27:633-41. [PMID: 19561596 DOI: 10.1038/nbt.1546] [Citation(s) in RCA: 819] [Impact Index Per Article: 54.6] [Received: 05/05/2009] [Accepted: 05/31/2009] [Indexed: 01/13/2023]
Abstract
Verification of candidate biomarkers relies upon specific, quantitative assays optimized for selective detection of target proteins, and is increasingly viewed as a critical step in the discovery pipeline that bridges unbiased biomarker discovery to preclinical validation. Although individual laboratories have demonstrated that multiple reaction monitoring (MRM) coupled with isotope dilution mass spectrometry can quantify candidate protein biomarkers in plasma, reproducibility and transferability of these assays between laboratories have not been demonstrated. We describe a multilaboratory study to assess reproducibility, recovery, linear dynamic range and limits of detection and quantification of multiplexed, MRM-based assays, conducted by NCI-CPTAC. Using common materials and standardized protocols, we demonstrate that these assays can be highly reproducible within and across laboratories and instrument platforms, and are sensitive to low µg/ml protein concentrations in unfractionated plasma. We provide data and benchmarks against which individual laboratories can compare their performance and evaluate new technologies for biomarker verification in plasma.
Affiliation(s)
- Terri A Addona
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA
|
20
|
Guo T, Wang W, Rudnick PA, Song T, Li J, Zhuang Z, Weil RJ, DeVoe DL, Lee CS, Balgley BM. Proteome analysis of microdissected formalin-fixed and paraffin-embedded tissue specimens. J Histochem Cytochem 2007; 55:763-72. [PMID: 17409379 DOI: 10.1369/jhc.7a7177.2007] [Citation(s) in RCA: 98] [Impact Index Per Article: 5.8] [Indexed: 12/13/2022] Open
Abstract
Targeted proteomics research, based on the enrichment of disease-relevant proteins from isolated cell populations selected from high-quality tissue specimens, offers great potential for the identification of diagnostic, prognostic, and predictive biological markers for use in the clinical setting and during preclinical testing and clinical trials, as well as for the discovery and validation of new protein drug targets. Formalin-fixed and paraffin-embedded (FFPE) tissue collections, with attached clinical and outcome information, are invaluable resources for conducting retrospective protein biomarker investigations and performing translational studies of cancer and other diseases. Combined capillary isoelectric focusing/nano-reversed-phase liquid chromatography separations equipped with nano-electrospray ionization-tandem mass spectrometry are employed for the studies of proteins extracted from microdissected FFPE glioblastoma tissues using a heat-induced antigen retrieval (AR) technique. A total of 14,478 distinct peptides are identified, leading to the identification of 2733 non-redundant SwissProt protein entries. Eighty-three percent of identified FFPE tissue proteins overlap with those obtained from the pellet fraction of fresh-frozen tissue of the same patient. This large degree of protein overlapping is attributed to the application of detergent-based protein extraction in both the cell pellet preparation protocol and the AR technique.
Affiliation(s)
- Tong Guo
- Department of Chemistry and Biochemistry, University of Maryland, College Park, Maryland, USA
|
21
|
Wang W, Guo T, Rudnick PA, Song T, Li J, Zhuang Z, Zheng W, Devoe DL, Lee CS, Balgley BM. Membrane Proteome Analysis of Microdissected Ovarian Tumor Tissues Using Capillary Isoelectric Focusing/Reversed-Phase Liquid Chromatography−Tandem MS. Anal Chem 2006; 79:1002-9. [PMID: 17263328 DOI: 10.1021/ac061613i] [Citation(s) in RCA: 52] [Impact Index Per Article: 2.9] [Indexed: 11/28/2022]
Abstract
This work expands our tissue proteome capabilities from the analysis of soluble proteins in previous studies to the examination of membrane proteins within the pellets of enriched and selectively isolated tumor cells procured from microdissected tissue specimens. The pellets of targeted ovarian tumor cells are treated by two different membrane protein extraction methods, including the use of detergent and organic solvent. The detergent-based membrane protein preparation protocol not only extracts proteins effectively from cell pellets but also is compatible with subsequent proteome analysis using combined capillary isoelectric focusing/nano reversed-phase liquid chromatography separations coupled with nano electrospray ionization mass spectrometry. Among proteins identified from an amount of pellet equivalent to 20 000 cells, 773 proteins are predicted to contain one or more transmembrane domains, corresponding to 22% membrane proteome coverage within the SwissProt Human protein sequence entries.
Affiliation(s)
- Weijie Wang
- Calibrant Biosystems, 910 Clopper Road, Suite 220N, Gaithersburg, Maryland 20878, USA
|
22
|
Guo T, Rudnick PA, Wang W, Lee CS, Devoe DL, Balgley BM. Characterization of the human salivary proteome by capillary isoelectric focusing/nanoreversed-phase liquid chromatography coupled with ESI-tandem MS. J Proteome Res 2006; 5:1469-78. [PMID: 16739998 DOI: 10.1021/pr060065m] [Citation(s) in RCA: 125] [Impact Index Per Article: 6.9] [Indexed: 11/28/2022]
Abstract
Saliva is a readily available body fluid with great diagnostic potential. The foundation for saliva-based diagnostics, however, is the development of a complete catalog of secreted and "leaked" proteins detectable in saliva. By employing a capillary isoelectric focusing-based multidimensional separation platform coupled with electrospray ionization tandem mass spectrometry (MS), a total of 5338 distinct peptides were sequenced, leading to the identification of 1381 distinct proteins. A search of bacterial protein sequences also identified many peptides unique to several organisms and unique to the NCBI nonredundant database. To the best of our knowledge, this proteome study represents the largest catalog of proteins measured from a single saliva sample to date. Data analysis was performed on individual MS/MS spectra using the highly specific peptide identification algorithm, OMSSA. Searches were conducted against a decoyed SwissProt human database to control the false-positive rate at 1%. Furthermore, the well-curated SwissProt sequences represent perhaps the least redundant human protein sequence database (12,484 records versus the 50,009 records found in the International Protein Index human database), therefore minimizing multiple protein inferences from single peptides. This combined bioanalytical and bioinformatic approach has established a solid foundation for building up the human salivary proteome for the realization of the diagnostic potential of saliva.
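The decoy-database search mentioned in this abstract is commonly turned into a score cutoff along the following lines. This is a minimal sketch under stated assumptions, not the procedure used in the paper: the function name, the `(e_value, is_decoy)` tuple layout, and the FDR estimate (decoy hits divided by target hits among accepted matches) are illustrative choices, and the toy data are invented.

```python
def evalue_cutoff_for_fdr(matches, max_fdr=0.01):
    """Return the loosest E-value cutoff whose estimated FDR stays within max_fdr.

    `matches` is a hypothetical list of (e_value, is_decoy) tuples, one per
    peptide-spectrum match; lower E-values indicate better matches.
    FDR is estimated as decoys / targets among the accepted matches, which is
    one common convention and an assumption of this sketch.
    Returns None if no cutoff satisfies the limit.
    """
    targets = decoys = 0
    cutoff = None
    # Walk from the best match to the worst, tracking the running FDR estimate.
    for e_value, is_decoy in sorted(matches):
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets and decoys / targets <= max_fdr:
            cutoff = e_value
    return cutoff


# Toy data: 200 confident target hits, then a mix of target and decoy hits.
matches = [(1e-6 * (i + 1), False) for i in range(200)]
matches += [(1e-3, False), (2e-3, True), (3e-3, True), (4e-3, True)]

cutoff = evalue_cutoff_for_fdr(matches, max_fdr=0.01)
print(cutoff)
```

With these toy values the third decoy pushes the estimate above 1%, so the cutoff stops at the E-value of the second decoy.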
Affiliation(s)
- Tong Guo
- Department of Chemistry and Biochemistry, University of Maryland, College Park, Maryland 20742, USA
|
23
|
Wang Y, Rudnick PA, Evans EL, Li J, Zhuang Z, Devoe DL, Lee CS, Balgley BM. Proteome analysis of microdissected tumor tissue using a capillary isoelectric focusing-based multidimensional separation platform coupled with ESI-tandem MS. Anal Chem 2006; 77:6549-56. [PMID: 16223239 DOI: 10.1021/ac050491b] [Citation(s) in RCA: 72] [Impact Index Per Article: 4.0] [Indexed: 01/08/2023]
Abstract
This study demonstrates the ability to perform sensitive proteome analysis on the limited protein quantities available through tissue microdissection. Capillary isoelectric focusing combined with nano-reversed-phase liquid chromatography in an automated and integrated platform not only provides systematic resolution of complex peptide mixtures based on their differences in isoelectric point and hydrophobicity but also eliminates peptide loss and analyte dilution. In comparison with strong cation exchange chromatography, the significant advantages of electrokinetic focusing-based separations include high resolving power, high concentration and narrow analyte bands, and effective usage of electrospray ionization-tandem MS toward peptide identifications. Through the use of capillary isoelectric focusing-based multidimensional peptide separations, a total of 6866 fully tryptic peptides were detected, leading to the identification of 1820 distinct proteins. Each distinct protein was identified by at least one distinct peptide sequence. These high mass accuracy and high-confidence identifications were generated from three proteome runs of a single glioblastoma multiforme tissue sample, each run consuming only 10 µg of total protein, an amount corresponding to 20,000 selectively isolated cells. Instead of performing multiple runs of multidimensional separations, the overall peak capacity can be greatly enhanced for mining deeper into tissue proteomics by increasing the number of CIEF fractions without an accompanying increase in sample consumption.
Affiliation(s)
- Yueju Wang
- Department of Chemistry and Biochemistry, University of Maryland, College Park, Maryland 20742, USA
|
24
|
Rudnick PA, Wang Y, Evans E, Lee CS, Balgley BM. Large Scale Analysis of MASCOT Results Using a Mass Accuracy-Based THreshold (MATH) Effectively Improves Data Interpretation. J Proteome Res 2005; 4:1353-60. [PMID: 16083287 DOI: 10.1021/pr0500509] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.6] [Indexed: 11/29/2022]
Abstract
In this report, we take a heuristic approach to studying the effects of mass tolerance settings and database size on the sensitivity and specificity of MASCOT. We also examine the efficacy of the MASCOT Identity Threshold as a discriminator when applied to QqTOF data with an average mass accuracy of 10 ppm or better. As predicted, arbitrarily large mass tolerance settings negatively affect MASCOT's specificity, and to a lesser degree, sensitivity. Increased mass tolerances also render the generation of a significance threshold less effective. To study these effects, we used Bayes' Law to calculate MASCOT's predictive values. With a relatively small search database (Human IPI), MASCOT had a mean positive predictive value of 0.993 when combined with MASCOT's Identity Threshold. However, the corresponding average negative predictive value, or the probability that an ion was not present given no score or a score below threshold, was reduced as mass tolerances were tightened, and had an average value of 0.717. This value was improved upon by extrapolating an empirical threshold using a reversed database search and a new algorithm to rapidly identify false positive identifications. Using the empirical threshold reduced false negative identifications on the average 17% while limiting the false positive rate to below 5%; even larger reductions were obtained using mass tolerances approaching two times the actual error of the experimental data. A simple application of this strategy to the analysis of a microdissected glioblastoma multiforme sample analyzed by IEF/LC-MS/MS is reported, as is a description of the tools required to implement a large scale analysis using this alternative approach.
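The positive and negative predictive values mentioned in this abstract follow from sensitivity, specificity, and prevalence by Bayes' rule. The sketch below shows that standard calculation only; the function name and the input values are illustrative assumptions and are not taken from the paper, so the output does not reproduce the reported 0.993 and 0.717.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Compute (PPV, NPV) from sensitivity, specificity and prevalence via Bayes' rule.

    PPV = P(truly present | scored above threshold)
    NPV = P(truly absent  | no score or score below threshold)
    """
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Hypothetical inputs chosen only to exercise the formula.
ppv, npv = predictive_values(sensitivity=0.80, specificity=0.99, prevalence=0.50)
print(round(ppv, 3), round(npv, 3))
```

The example illustrates the pattern the abstract describes: a search can have a very high positive predictive value while its negative predictive value is noticeably lower when sensitivity is limited.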
Affiliation(s)
- Paul A Rudnick
- Calibrant Biosystems, 7507 Standish Pl., Rockville, MD 20855, USA.
|
25
|
Abstract
For top-down proteomics, nano-reversed phase liquid chromatography (RPLC) plays a major role in both single and multidimensional protein separations in an effort to increase the overall peak capacity for the resolution of complex protein mixtures prior to mass spectrometry analysis. Effects of various chromatography conditions, including alkyl chain length in the stationary phase, capillary column temperature, and ion-pairing agent, on the resolution of intact proteins are studied using nano-RPLC-electrospray ionization-mass spectrometry. Optimal chromatography conditions include the use of a C18 column heated at 60 °C and the addition of trifluoroacetic acid instead of heptafluorobutyric acid as the ion-pairing agent in the mobile phase. Under optimized chromatography conditions, there are no significant differences in the separation performance of yeast cell lysates present in the native versus denatured states. Denatured yeast proteins resolved and eluted from nano-RPLC can be subjected to proteolytic digestion in an on- or off-line approach to provide improved protein sequence coverage toward protein identification in a combined top-down/bottom-up proteome platform.
Affiliation(s)
- Yueju Wang
- Department of Chemistry and Biochemistry, University of Maryland, College Park, MD 20742, USA
|
26
|
Wang Y, Balgley BM, Rudnick PA, Evans EL, DeVoe DL, Lee CS. Integrated Capillary Isoelectric Focusing/Nano-reversed Phase Liquid Chromatography Coupled with ESI−MS for Characterization of Intact Yeast Proteins. J Proteome Res 2005; 4:36-42. [PMID: 15707355 DOI: 10.1021/pr049876l] [Citation(s) in RCA: 33] [Impact Index Per Article: 1.7] [Indexed: 11/28/2022]
Abstract
An integrated protein concentration/separation platform, combining capillary isoelectric focusing (CIEF) with nano-reversed phase liquid chromatography (nano-RPLC), is developed to provide significant protein concentration and high resolving power for the analysis of complex protein mixtures. Upon completion of protein focusing, the proteins are sequentially and hydrodynamically loaded into individual trap columns using a group of microinjection and microselection valves. Repeated protein loadings and injections into trap columns are carried out automatically until the entire CIEF capillary content is sampled and fractionated. Each CIEF fraction "parked" in separate trap columns is further resolved using nano-RPLC, and the eluants are analyzed using electrospray ionization-mass spectrometry.
Affiliation(s)
- Yueju Wang
- Department of Chemistry and Biochemistry, University of Maryland, College Park, Maryland 20742, USA
|
27
|
Abstract
To evaluate the role of uridylyl-transferase, the Sinorhizobium meliloti glnD gene was isolated by heterologous complementation in Azotobacter vinelandii. The glnD gene is cotranscribed with a gene homologous to Salmonella mviN. glnD1::Omega or mviN1::Omega mutants could not be isolated by a powerful sucrose counterselection procedure unless a complementing cosmid was provided, indicating that glnD and mviN are members of an indispensable operon in S. meliloti.
Affiliation(s)
- P A Rudnick
- Laboratoire de Biologie Moléculaire des Relations Plantes-Microorganismes, INRA/CNRS, 31326 Castanet-Tolosan Cedex, France
|