101
|
Ramisetty SR, Washburn MP. Unraveling the dynamics of protein interactions with quantitative mass spectrometry. Crit Rev Biochem Mol Biol 2011; 46:216-28. [PMID: 21438726 DOI: 10.3109/10409238.2011.567244] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.4] [Indexed: 12/16/2022]
Abstract
Knowledge of the structure and dynamics of proteins and protein complexes is important to unveil the molecular basis and mechanisms involved in most biological processes. Protein complex dynamics can be defined as the changes in the composition of a protein complex during a cellular process. Protein dynamics can be defined as conformational changes in a protein during enzyme activation, for example, when a protein binds to a ligand or when a protein binds to another protein. Mass spectrometry (MS) combined with affinity purification has become the analytical tool of choice for mapping protein-protein interaction networks, and recent developments in the quantitative proteomics field have made it possible to identify dynamically interacting proteins. Furthermore, hydrogen/deuterium exchange MS is emerging as a powerful technique to study the structure and conformational dynamics of proteins or protein assemblies in solution. Methods have been developed and applied for the identification of transient and/or weak dynamic interaction partners and for the analysis of conformational dynamics of proteins or protein complexes. This review is an overview of existing and recent developments in studying the overall dynamics of in vivo protein interaction networks and protein complexes using MS-based methods.
|
102
|
Fermin D, Basrur V, Yocum AK, Nesvizhskii AI. Abacus: a computational tool for extracting and pre-processing spectral count data for label-free quantitative proteomic analysis. Proteomics 2011; 11:1340-5. [PMID: 21360675 DOI: 10.1002/pmic.201000650] [Citation(s) in RCA: 80] [Impact Index Per Article: 5.7] [Received: 10/13/2010] [Revised: 12/15/2010] [Accepted: 12/29/2010] [Indexed: 01/16/2023]
Abstract
We describe Abacus, a computational tool for extracting spectral counts from MS/MS data sets. The program aggregates data from multiple experiments, adjusts spectral counts to accurately account for peptides shared across multiple proteins, and performs common normalization steps. It can also output the spectral count data at the gene level, thus simplifying the integration and comparison between gene and protein expression data. Abacus is compatible with the widely used Trans-Proteomic Pipeline suite of tools and comes with a graphical user interface making it easy to interact with the program. The main aim of Abacus is to streamline the analysis of spectral count data by providing an automated, easy to use solution for extracting this information from proteomic data sets for subsequent, more sophisticated statistical analysis.
Affiliation(s)
- Damian Fermin
- Department of Pathology, University of Michigan, Ann Arbor, MI 48109, USA
|
103
|
Neilson KA, Ali NA, Muralidharan S, Mirzaei M, Mariani M, Assadourian G, Lee A, van Sluyter SC, Haynes PA. Less label, more free: approaches in label-free quantitative mass spectrometry. Proteomics 2011; 11:535-53. [PMID: 21243637 DOI: 10.1002/pmic.201000553] [Citation(s) in RCA: 520] [Impact Index Per Article: 37.1] [Received: 09/01/2010] [Revised: 10/21/2010] [Accepted: 11/02/2010] [Indexed: 01/09/2023]
Abstract
In this review we examine techniques, software, and statistical analyses used in label-free quantitative proteomics studies for area under the curve and spectral counting approaches. Recent advances in the field are discussed in an order that reflects a logical workflow design. Examples of studies that follow this design are presented to highlight the requirement for statistical assessment and further experiments to validate results from label-free quantitation. Limitations of label-free approaches are considered, label-free approaches are compared with labelling techniques, and forward-looking applications for label-free quantitative data are presented. We conclude that label-free quantitative proteomics is a reliable, versatile, and cost-effective alternative to labelled quantitation.
Affiliation(s)
- Karlie A Neilson
- Department of Chemistry and Biomolecular Sciences, Macquarie University, Sydney, NSW, Australia
|
104
|
Xie X, Yi Z, Bowen B, Wolf C, Flynn CR, Sinha S, Mandarino LJ, Meyer C. Characterization of the Human Adipocyte Proteome and Reproducibility of Protein Abundance by One-Dimensional Gel Electrophoresis and HPLC-ESI-MS/MS. J Proteome Res 2011; 9:4521-34. [PMID: 20812759 DOI: 10.1021/pr100268f] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.1] [Indexed: 12/12/2022]
Abstract
Abnormalities in adipocytes play an important role in various conditions, including the metabolic syndrome, type 2 diabetes mellitus and cardiovascular disease, but little is known about alterations at the protein level. We therefore sought to (1) comprehensively characterize the human adipocyte proteome for the first time and (2) demonstrate feasibility of measuring adipocyte protein abundances by one-dimensional SDS-PAGE and high performance liquid chromatography-electrospray ionization-tandem mass spectrometry (HPLC-ESI-MS/MS). In adipocytes isolated from approximately 0.5 g of subcutaneous abdominal adipose tissue of three healthy, lean subjects, we identified a total of 1493 proteins. Triplicate analysis indicated a 22.5% coefficient of variation of protein abundances. Proteins ranged from 5.8 to 629 kDa and included a large number of proteins involved in lipid metabolism, such as fatty acid transport, fatty acid oxidation, lipid storage, lipolysis, and lipid droplet maintenance. Furthermore, we found most glycolysis enzymes and numerous proteins associated with oxidative stress, protein synthesis and degradation as well as some adipokines. Of all proteins, 22% were of mitochondrial origin. These results provide the first detailed characterization of the human adipocyte proteome, suggest an important role of adipocyte mitochondria, and demonstrate feasibility of this approach to examine alterations of adipocyte protein abundances in human diseases.
Affiliation(s)
- Xitao Xie
- Center for Metabolic Biology, Arizona State University, Tempe, Arizona, USA
|
105
|
Hoehenwarter W, Wienkoop S. Spectral counting robust on high mass accuracy mass spectrometers. Rapid Commun Mass Spectrom 2010; 24:3609-14. [PMID: 21108307 DOI: 10.1002/rcm.4818] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.5] [Indexed: 05/02/2023]
Abstract
Mass spectrometry is central to shotgun proteomics, an application that seeks to quantify as much of the total protein complement of a biological sample as possible. The high mass accuracy, resolution, capacity and scan rate of modern mass spectrometers have greatly facilitated this endeavor. The sum of MS to MS/MS transitions in tandem mass spectrometry, the spectral count (SC), of a peptide has been shown to be a reliable estimate of its relative abundance. However, when using SCs, optimal MS configurations are crucial in order to maximize the number of low-abundance proteins quantified while keeping the estimates for the highly abundant proteins within the linear dynamic range. In this study, LC/MS/MS analysis was performed using an LTQ-OrbiTrap on a sample containing many highly abundant proteins. Tuning the LTQ-OrbiTrap mass spectrometer to minimize redundant MS/MS acquisition and to maximize resolution of the proteome by accurately measured m/z ratios resulted in an appreciable increase in quantified low-abundance proteins. An exclusion duration of 90 s and an exclusion width of 10 ppm were found to be the best of those tested. The spectral count of individual proteins was found to be highly reproducible and protein abundance ratios were not affected by the different settings that were applied. We conclude that on a high mass accuracy instrument spectral counting is a robust measure of protein abundance even for samples containing many highly abundant proteins and that tuning dynamic exclusion parameters appreciably improves the number of proteins that can be reliably quantified.
Affiliation(s)
- Wolfgang Hoehenwarter
- Department of Molecular Systems Biology, Faculty of Life Sciences, University of Vienna, Althanstrasse 14, A-1090 Vienna, Austria.
|
106
|
Webb-Robertson BJM, McCue LA, Waters KM, Matzke MM, Jacobs JM, Metz TO, Varnum SM, Pounds JG. Combined statistical analyses of peptide intensities and peptide occurrences improves identification of significant peptides from MS-based proteomics data. J Proteome Res 2010; 9:5748-56. [PMID: 20831241 PMCID: PMC2974810 DOI: 10.1021/pr1005247] [Citation(s) in RCA: 78] [Impact Index Per Article: 5.2] [Indexed: 11/29/2022]
Abstract
Liquid chromatography−mass spectrometry-based (LC−MS) proteomics uses peak intensities of proteolytic peptides to infer the differential abundance of peptides/proteins. However, substantial run-to-run variability in intensities and observations (presence/absence) of peptides makes data analysis quite challenging. The missing observations in LC−MS proteomics data are difficult to address with traditional imputation-based approaches because the mechanisms by which data are missing are unknown a priori. Data can be missing due to random mechanisms such as experimental error or nonrandom mechanisms such as a true biological effect. We present a statistical approach that uses a test of independence known as a G-test to test the null hypothesis of independence between the number of missing values across experimental groups. We pair the G-test results, evaluating independence of missing data (IMD) with an analysis of variance (ANOVA) that uses only means and variances computed from the observed data. Each peptide is therefore represented by two statistical confidence metrics, one for qualitative differential observation and one for quantitative differential intensity. We use three LC−MS data sets to demonstrate the robustness and sensitivity of the IMD−ANOVA approach. Missing abundance values in LC−MS data are difficult to analyze statistically because the mechanisms by which the data are missing are unknown (processing or biological effect). We present a new approach that pairs a test of independence on missing data to discern qualitative difference across treatment groups with traditional statistical tests that evaluate quantitative differences. The combination of these two statistics yields a more robust statistical description of the data.
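The G-test on missingness described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes only two experimental groups (a 2x2 table of missing versus observed values for one peptide), and the function name `g_test_missingness` and its arguments are hypothetical. With two groups the statistic has one degree of freedom, so the chi-square tail probability can be computed with the standard library alone.

```python
import math


def g_test_missingness(missing_a, total_a, missing_b, total_b):
    """G-test of independence between group membership and missingness.

    Hypothetical helper for a single peptide measured in two groups.
    Returns (G statistic, p-value with 1 degree of freedom).
    """
    observed = [
        [missing_a, total_a - missing_a],
        [missing_b, total_b - missing_b],
    ]
    n = total_a + total_b
    row_sums = [sum(row) for row in observed]
    col_sums = [observed[0][j] + observed[1][j] for j in range(2)]

    g = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            if o > 0:  # cells with zero observations contribute nothing
                e = row_sums[i] * col_sums[j] / n
                g += 2.0 * o * math.log(o / e)

    # Chi-square survival function for 1 degree of freedom.
    p = math.erfc(math.sqrt(max(g, 0.0) / 2.0))
    return g, p


# Example: a peptide seen in all 5 control runs but in none of 5 treated runs.
g_stat, p_value = g_test_missingness(0, 5, 5, 5)
print(f"G = {g_stat:.2f}, p = {p_value:.4g}")
```

A small p-value here suggests that missingness depends on the group, which is the qualitative signal the abstract pairs with an ANOVA on the observed intensities. More than two groups would need a general chi-square tail function, which this sketch deliberately leaves out.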
|
107
|
Zhou JY, Schepmoes AA, Zhang X, Moore RJ, Monroe ME, Lee JH, Camp DG, Smith RD, Qian WJ. Improved LC-MS/MS spectral counting statistics by recovering low-scoring spectra matched to confidently identified peptide sequences. J Proteome Res 2010; 9:5698-704. [PMID: 20812748 DOI: 10.1021/pr100508p] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.7] [Indexed: 11/29/2022]
Abstract
Spectral counting has become a popular method for LC-MS/MS based proteome quantification; however, this methodology is often not reliable when proteins are identified by a small number of spectra. Here, we present a simple strategy to improve spectral counting based quantification for low-abundance proteins by recovering low-quality or low-scoring spectra for confidently identified peptides. In this approach, stringent data filtering criteria were initially applied to achieve confident peptide identifications with low false discovery rate (e.g., < 1% at peptide level) after LC-MS/MS analysis and database search by SEQUEST. Then, all low-scoring MS/MS spectra that matched to this set of confidently identified peptides were recovered, leading to more than 20% increase of total identified spectra. The validity of these recovered spectra was assessed by the parent ion mass measurement error distribution, retention time distribution, and by comparing the individual low score and high score spectra that correspond to the same peptides. The results support that the recovered low-scoring spectra have similar confidence levels in peptide identifications as the spectra passing the initial stringent filter. The application of this strategy of recovering low-scoring spectra significantly improved the spectral count quantification statistics for low-abundance proteins, as illustrated in the identification of mouse brain region specific proteins.
Affiliation(s)
- Jian-Ying Zhou
- Pacific Northwest National Laboratory, Richland, WA 99352, USA
|
108
|
Washburn MP. Driving biochemical discovery with quantitative proteomics. Trends Biochem Sci 2010; 36:170-7. [PMID: 20880711 DOI: 10.1016/j.tibs.2010.09.001] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Received: 05/27/2010] [Revised: 08/31/2010] [Accepted: 09/01/2010] [Indexed: 11/26/2022]
Abstract
Proteomic analysis of biological samples plays an increasing role in modern research. Although the application of proteomics technologies varies across many disciplines, proteomics largely is a tool for discovery that then leads to novel hypotheses. In recent years, new methods and technologies have been developed and applied in many areas of proteomics, and there is a strong push towards using proteomics in a quantitative manner. Indeed, mass spectrometry-based, quantitative proteomics approaches have been applied to great success in a variety of biochemical studies. In particular, the use of quantitative proteomics provides new insights into protein complexes and post-translational modifications and leads to the generation of novel insights into these important biochemical systems.
Affiliation(s)
- Michael P Washburn
- Stowers Institute for Medical Research, 1000 E. 50th St., Kansas City, MO 64110, USA.
|
109
|
Dicker L, Lin X, Ivanov AR. Increased power for the analysis of label-free LC-MS/MS proteomics data by combining spectral counts and peptide peak attributes. Mol Cell Proteomics 2010; 9:2704-18. [PMID: 20823122 DOI: 10.1074/mcp.m110.002774] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.6] [Indexed: 11/06/2022] Open
Abstract
Liquid chromatography-tandem mass spectrometry (LC-MS/MS)-based proteomics provides a wealth of information about proteins present in biological samples. In bottom-up LC-MS/MS-based proteomics, proteins are enzymatically digested into peptides prior to query by LC-MS/MS. Thus, the information directly available from the LC-MS/MS data is at the peptide level. If a protein-level analysis is desired, the peptide-level information must be rolled up into protein-level information. We propose a principal component analysis-based statistical method, ProPCA, for efficiently estimating relative protein abundance from bottom-up label-free LC-MS/MS data that incorporates both spectral count information and LC-MS peptide ion peak attributes, such as peak area, volume, or height. ProPCA may be used effectively with a variety of quantification platforms and is easily implemented. We show that ProPCA outperformed existing quantitative methods for peptide-protein roll-up, including spectral counting methods and other methods for combining LC-MS peptide peak attributes. The performance of ProPCA was validated using a data set derived from the LC-MS/MS analysis of a mixture of protein standards (the UPS2 proteomic dynamic range standard introduced by The Association of Biomolecular Resource Facilities Proteomics Standards Research Group in 2006). Finally, we applied ProPCA to a comparative LC-MS/MS analysis of digested total cell lysates prepared for LC-MS/MS analysis by alternative lysis methods and show that ProPCA identified more differentially abundant proteins than competing methods.
Affiliation(s)
- Lee Dicker
- Department of Biostatistics, Harvard School of Public Health, Boston, Massachusetts 02115, USA
|
110
|
McKinney KQ, Lee YY, Choi HS, Groseclose G, Iannitti DA, Martinie JB, Russo MW, Lundgren DH, Han DK, Bonkovsky HL, Hwang SI. Discovery of putative pancreatic cancer biomarkers using subcellular proteomics. J Proteomics 2010; 74:79-88. [PMID: 20807598 DOI: 10.1016/j.jprot.2010.08.006] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.5] [Received: 05/26/2010] [Revised: 07/16/2010] [Accepted: 08/24/2010] [Indexed: 01/07/2023]
Abstract
Pancreatic cancer (PC) is a highly aggressive disease that frequently remains undetected until it has progressed to an advanced, systemic stage. Successful treatment of PC is hindered by the lack of early detection. The application of proteomic analysis to PC combined with subcellular fractionation has introduced new possibilities in the field of biomarker discovery. We utilized matched pairs of pancreas tumor and non-tumor pancreas from patients undergoing tumor resection. The tissues were treated to obtain cellular protein fractions corresponding to cytosol, membrane, nucleus and cytoskeleton. The fractions were then separated by molecular weight and digested with trypsin, followed by liquid chromatography and tandem mass spectrometry. The spectra obtained were searched using Sequest engine and combined into a single analysis file to obtain a semi-quantitative number, spectral count, using Scaffold software. We identified 2393 unique proteins in non-tumor and cancer pancreas. Utilizing PLGEM statistical analysis we determined 104 proteins were significantly changed in cancer. From these, we further validated four secreted proteins that are up-regulated in cancer and have potential for development as minimally-invasive diagnostic markers. We conclude that subcellular fractionation followed by gel electrophoresis and tandem mass spectrometry is a powerful strategy for identification of differentially expressed proteins in pancreatic cancer.
Affiliation(s)
- Kimberly Q McKinney
- Proteomics Laboratory for Clinical and Translational Research, Carolinas HealthCare System, Charlotte, NC, USA
|
111
|
Assigning significance in label-free quantitative proteomics to include single-peptide-hit proteins with low replicates. Int J Proteomics 2010; 2010. [PMID: 21152383 PMCID: PMC2997754 DOI: 10.1155/2010/731582] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 11/17/2022]
Abstract
Selecting differentially regulated proteins with an assignment of statistical significance remains difficult for proteins with a single-peptide hit or a small fold-change when sample replicates are limited. This study presents a label-free quantitative proteomics scheme that was used to select differentially regulated proteins with single-peptide hits and with <2-fold change at a 5% false discovery rate. The scheme incorporated a labeled internal control into two unlabeled samples to facilitate error modeling when there were no replicates for the unlabeled samples. The results showed that, while both a power law global error model with a signal-to-noise ratio statistic (PLGEM-STN) and a constant fold-change threshold could be used, neither of them alone was stringent enough to select differentially regulated proteins at a 5% false discovery rate. Thus, the rule of minimum number of permuted significant pairings (MPSP) was introduced to reduce false discovery rates in combination with PLGEM-STN or a fold-change threshold. MPSP played a critical role in extending the selection of differentially regulated proteins to those with single-peptide hits or with lower fold-changes. Although the approaches were demonstrated for limited sample replicates, they should also be applicable to the situation where more sample replicates are available.
|
112
|
Hawkridge AM, Wysocky RB, Petitte JN, Anderson KE, Mozdziak PE, Fletcher OJ, Horowitz JM, Muddiman DC. Measuring the intra-individual variability of the plasma proteome in the chicken model of spontaneous ovarian adenocarcinoma. Anal Bioanal Chem 2010; 398:737-49. [PMID: 20640409 DOI: 10.1007/s00216-010-3979-y] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.3] [Received: 04/19/2010] [Revised: 06/15/2010] [Accepted: 06/28/2010] [Indexed: 01/19/2023]
Abstract
The domestic chicken (Gallus domesticus) has emerged as a powerful experimental model for studying the onset and progression of spontaneous epithelial ovarian cancer (EOC) with a disease prevalence that can exceed 35% between 2 and 7 years of age. An experimental strategy for biomarker discovery is reported herein that combines the chicken model of EOC, longitudinal plasma sample collection with matched tissues, advanced mass spectrometry-based proteomics, and concepts derived from the index of individuality (Harris, Clin Chem 20: 1535-1542, 1974). Blood was drawn from 148 age-matched chickens starting at 2.5 years of age every 3 months for 1 year. At the conclusion of the 1 year sample collection period, the 73 birds that remained alive were euthanized, necropsied, and tissues were collected. Pathological assessment of resected tissues from these 73 birds confirmed that five birds (6.8%) developed EOC. A proteomics workflow including in-gel digestion, nanoLC coupled to high-performance mass spectrometry, and label-free (spectral counting) quantification was used to measure the biological intra-individual variability (CV(W)) of the chicken plasma proteome. Longitudinal plasma sample sets from two birds within the 73-bird biorepository were selected for this study; one bird was considered "healthy" and the second bird developed late-stage EOC. A total of 116 proteins from un-depleted plasma were identified with 80 proteins shared among all sample sets. Analytical variability (CV(A)) of the label-free proteomics workflow was measured using a single plasma sample analyzed five times and was found to be ≥CV(W) in both birds for 16 proteins (20%) and in either bird for 25 proteins (31%). Ovomacroglobulin (ovostatin) was found to increase (p < 0.001) over a 6 month period in the late-stage EOC bird providing an initial candidate protein for further investigation.
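The comparison between analytical and intra-individual variability in this abstract rests on the coefficient of variation. The following Python sketch shows the arithmetic under stated assumptions: the helper names `cv_percent` and `analytical_cv_dominates` are hypothetical, the sample standard deviation (n - 1 denominator) is assumed, and the spectral counts are invented for illustration rather than taken from the study.

```python
import statistics


def cv_percent(values):
    """Coefficient of variation in percent, using the sample standard deviation."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return 100.0 * statistics.stdev(values) / mean


def analytical_cv_dominates(cv_a, cv_w):
    """True when analytical variability is at least the within-individual variability."""
    return cv_a >= cv_w


# Illustrative (made-up) spectral counts for one protein.
technical_replicates = [18, 20, 22, 19, 21]   # same plasma sample, five injections
longitudinal_samples = [15, 24, 19, 28]       # same bird, four time points

cv_a = cv_percent(technical_replicates)
cv_w = cv_percent(longitudinal_samples)
print(f"CV(A) = {cv_a:.1f}%, CV(W) = {cv_w:.1f}%")
print("analytical variability dominates:", analytical_cv_dominates(cv_a, cv_w))
```

Proteins for which the analytical CV is at least as large as the within-individual CV are the ones whose biological variation cannot be resolved by the workflow, which is how the 16 and 25 protein counts in the abstract should be read.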
Affiliation(s)
- Adam M Hawkridge
- W. M. Keck FT-ICR Mass Spectrometry Laboratory and Department of Chemistry, North Carolina State University, Box 8204, Raleigh, NC 27695, USA.
|
113
|
Zhang Y, Wen Z, Washburn MP, Florens L. Effect of dynamic exclusion duration on spectral count based quantitative proteomics. Anal Chem 2010; 81:6317-26. [PMID: 19586016 DOI: 10.1021/ac9004887] [Citation(s) in RCA: 153] [Impact Index Per Article: 10.2] [Indexed: 11/28/2022]
Abstract
To increase proteome coverage, dynamic exclusion (DE) is a widely used tool. When DE is enabled, more proteins can be identified, although the total spectral counts will decrease. To investigate the effects of DE duration on spectral-counting based quantitative proteomics, we analyzed the same sample via multidimensional protein identification technology while enabling different DE durations (15, 60, 90, 300, 600 s) or turning DE off. Normalized spectral abundance factors (NSAFs) measured for abundant proteins varied little with or without DE, while enabling DE led to higher peptide counts, higher NSAFs, and better reproducibility of detection for proteins of relatively lower abundance. The optimal DE duration, which generated the maximum number of peptides, proteins, and peptides per protein, was observed to be 90 s in our settings. We developed a mathematical model for analyzing the effects of DE duration on peptide spectral counts. We found that the optimal DE duration depends on the average chromatographic peak width at the base of eluting peptides and mass spectrometry parameters, leading us to calculate an optimized DE duration of 97.9 s, in excellent agreement with our observations. In this study, we provide a systematic approach for the optimization of spectral counts for improved quantitative proteomics analysis.
Affiliation(s)
- Ying Zhang
- Stowers Institute for Medical Research, 1000 East 50th Street, Kansas City, Missouri 64110, USA
|
114
|
Little KM, Lee JK, Ley K. ReSASC: a resampling-based algorithm to determine differential protein expression from spectral count data. Proteomics 2010; 10:1212-22. [PMID: 20058246 DOI: 10.1002/pmic.200900328] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Indexed: 11/07/2022]
Abstract
Label-free methods for MS/MS quantification of protein expression are becoming more prevalent as instrument sensitivity increases. Spectral counts (SCs) are commonly used, readily obtained, and increase linearly with protein abundance; however, a statistical framework has been lacking. To accommodate the highly non-normal distribution of SCs, we developed ReSASC (resampling-based significance analysis for spectral counts), which evaluates differential expression between two conditions by pooling similarly expressed proteins and sampling from this pool to create permutation-based synthetic sets of SCs for each protein. At a set confidence level and corresponding p-value cutoff, ReSASC defines a new p-value, p', as the number of synthetic SC sets with p>p(cutoff) divided by the total number of sets. We have applied ReSASC to two published SC data sets and found that ReSASC compares favorably with existing methods while being easy to operate and requiring only standard computing resources.
Affiliation(s)
- Kristina M Little
- Division of Inflammation Biology, La Jolla Institute for Allergy and Immunology, La Jolla, CA 92037, USA
|
115
|
Zhang Y, Wen Z, Washburn MP, Florens L. Refinements to label free proteome quantitation: how to deal with peptides shared by multiple proteins. Anal Chem 2010; 82:2272-81. [PMID: 20166708 DOI: 10.1021/ac9023999] [Citation(s) in RCA: 297] [Impact Index Per Article: 19.8] [Indexed: 11/28/2022]
Abstract
Quantitative shotgun proteomics is dependent on the detection, identification, and quantitative analysis of peptides. An issue arises with peptides that are shared between multiple proteins. What protein did they originate from and how should these shared peptides be used in a quantitative proteomics workflow? To systematically evaluate shared peptides in label-free quantitative proteomics, we devised a well-defined protein sample consisting of known concentrations of six albumins from different species, which we added to a highly complex yeast lysate. We used the spectral counts based normalized spectral abundance factor (NSAF) as the starting point for our analysis and compared an exhaustive list of possible combinations of parameters to determine what was the optimal approach for dealing with shared peptides and shared spectral counts. We showed that distributing shared spectral counts based on the number of unique spectral counts led to the most accurate and reproducible results.
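The conclusion of this abstract, distributing shared spectral counts in proportion to unique spectral counts before computing NSAF, can be sketched in Python as follows. This is an illustration, not the authors' code: the function names, the input layout, and the even-split fallback for proteins with no unique counts are all assumptions. NSAF is taken in its usual form, spectral count divided by protein length, normalized by the sum of that quantity over all proteins.

```python
def distribute_shared_counts(unique_counts, shared_peptides):
    """Add shared spectral counts to proteins in proportion to their unique counts.

    unique_counts: dict mapping protein -> unique spectral count.
    shared_peptides: list of (spectral_count, [proteins sharing the peptide]).
    """
    adjusted = {protein: float(count) for protein, count in unique_counts.items()}
    for count, proteins in shared_peptides:
        total_unique = sum(unique_counts.get(p, 0) for p in proteins)
        for p in proteins:
            if total_unique > 0:
                share = unique_counts.get(p, 0) / total_unique
            else:
                # Assumption: split evenly when no protein has unique evidence.
                share = 1.0 / len(proteins)
            adjusted[p] = adjusted.get(p, 0.0) + count * share
    return adjusted


def nsaf(counts, lengths):
    """Normalized spectral abundance factor: (SpC / L) divided by the sum over proteins."""
    saf = {protein: counts[protein] / lengths[protein] for protein in counts}
    total = sum(saf.values())
    return {protein: value / total for protein, value in saf.items()}


# Illustrative (made-up) data: two homologous proteins sharing one peptide.
unique = {"albumin_bovine": 30, "albumin_human": 10}
shared = [(8, ["albumin_bovine", "albumin_human"])]
lengths = {"albumin_bovine": 600, "albumin_human": 400}

adjusted_counts = distribute_shared_counts(unique, shared)
print(adjusted_counts)
print(nsaf(adjusted_counts, lengths))
```

Weighting by unique counts keeps a shared peptide from inflating a low-abundance homolog, which an even split or full assignment to every parent protein would do.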
Affiliation(s)
- Ying Zhang
- Stowers Institute for Medical Research, 1000 East 50th Street, Kansas City, Missouri 64110, USA
|
116
|
|
117
|
Lundgren DH, Hwang SI, Wu L, Han DK. Role of spectral counting in quantitative proteomics. Expert Rev Proteomics 2010; 7:39-53. [PMID: 20121475 DOI: 10.1586/epr.09.69] [Citation(s) in RCA: 322] [Impact Index Per Article: 21.5] [Indexed: 11/08/2022]
Abstract
Spectral count, defined as the total number of spectra identified for a protein, has gained acceptance as a practical, label-free, semiquantitative measure of protein abundance in proteomic studies. In this review, we discuss issues affecting the performance of spectral counting relative to other label-free methods, as well as its limitations. Possible consequences of modifications, which are commonly applied to raw spectral counts to improve abundance estimations, are considered. The use of spectral counting for different types of quantitation studies is explored and critiqued. Different statistical methods and underlying frameworks that have been applied to spectral count analysis are described and compared, and problem areas that undermine confident statistical analysis are considered. Finally, the issue of accurate estimation of false-discovery rates is addressed and identified as a major current challenge in quantitative proteomics.
Affiliation(s)
- Deborah H Lundgren
- Department of Cell Biology and Center for Vascular Biology, University of Connecticut Health Center, 263 Farmington Avenue, Farmington, CT 06030, USA
|
118
|
Heinecke NL, Pratt BS, Vaisar T, Becker L. PepC: proteomics software for identifying differentially expressed proteins based on spectral counting. Bioinformatics 2010; 26:1574-5. [PMID: 20413636 DOI: 10.1093/bioinformatics/btq171] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.5] [Indexed: 11/13/2022] Open
Abstract
Identifying biologically significant changes in protein abundance between two conditions is a key issue when analyzing proteomic data. One widely used approach centers on spectral counting, a label-free method that sums all the tandem mass spectra for a protein observed in an analysis. To assess the significance of the results, we recently combined the t-test and G-test with random permutation analysis, and we validated this approach biochemically. To automate the statistical method, we developed PepC, a software program that balances the trade-off between the number of differentially expressed proteins identified and the false discovery rate. This tool can be applied to a wide range of proteomic datasets, making data analysis rapid, reproducible and easily interpretable by proteomics specialists and non-specialists alike. Availability and implementation: The software is implemented in Java. It has been added to the Trans Proteomic Pipeline project's 'Petunia' web interface, but can also be run as a command line program. The source code is released under the GNU Lesser General Public License and the program is freely available on the web: http://sashimi.svn.sourceforge.net/viewvc/sashimi/trunk/trans_proteomic_pipeline/src/Quantitation/Pepc.
|
119
|
Karp NA, Huber W, Sadowski PG, Charles PD, Hester SV, Lilley KS. Addressing accuracy and precision issues in iTRAQ quantitation. Mol Cell Proteomics 2010; 9:1885-97. [PMID: 20382981 DOI: 10.1074/mcp.m900628-mcp200] [Citation(s) in RCA: 403] [Impact Index Per Article: 26.9] [Indexed: 11/06/2022]
Abstract
iTRAQ (isobaric tags for relative or absolute quantitation) is a mass spectrometry technology that allows quantitative comparison of protein abundance by measuring peak intensities of reporter ions released from iTRAQ-tagged peptides by fragmentation during MS/MS. However, current data analysis techniques for iTRAQ struggle to report reliable relative protein abundance estimates and suffer with problems of precision and accuracy. The precision of the data is affected by variance heterogeneity: low signal data have higher relative variability; however, low abundance peptides dominate data sets. Accuracy is compromised as ratios are compressed toward 1, leading to underestimation of the ratio. This study investigated both issues and proposed a methodology that combines the peptide measurements to give a robust protein estimate even when the data for the protein are sparse or at low intensity. Our data indicated that ratio compression arises from contamination during precursor ion selection, which occurs at a consistent proportion within an experiment and thus results in a linear relationship between expected and observed ratios. We proposed that a correction factor can be calculated from spiked proteins at known ratios. Then we demonstrated that variance heterogeneity is present in iTRAQ data sets irrespective of the analytical packages, LC-MS/MS instrumentation, and iTRAQ labeling kit (4-plex or 8-plex) used. We proposed using an additive-multiplicative error model for peak intensities in MS/MS quantitation and demonstrated that a variance-stabilizing normalization is able to address the error structure and stabilize the variance across the entire intensity range. The resulting uniform variance structure simplifies the downstream analysis. 
Heterogeneity of variance consistent with an additive-multiplicative model has been reported in other MS-based quantitation including fields outside of proteomics; consequently the variance-stabilizing normalization methodology has the potential to increase the capabilities of MS in quantitation across diverse areas of biology and chemistry.
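One common form of a variance-stabilizing transformation for additive-multiplicative error is the inverse hyperbolic sine, h(y) = arsinh(a + b*y). The sketch below assumes that form; the abstract does not state the exact parameterization used in the study, and the offset `a` and scale `b` are placeholders that would normally be estimated from the data.

```python
import math

def vst(intensity, a=0.0, b=1.0):
    """Arsinh transform; a and b are placeholder calibration parameters."""
    return math.asinh(a + b * intensity)

# Near zero the transform is roughly linear, so noisy low-intensity signals
# are not blown up the way a plain logarithm would blow them up.
low = vst(0.01)

# At high intensity it approaches log(2 * b * y), so it acts like a log transform
# where multiplicative error dominates.
high = vst(1e6)
high_log = math.log(2 * 1e6)
```

The design point is that a single function covers both regimes of the error model: additive noise at low signal and multiplicative noise at high signal.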
Affiliation(s)
- Natasha A Karp
- European Bioinformatics Institute, European Molecular Biology Laboratory Outstation, Hinxton, UK
|
120
|
Lundgren DH, Martinez H, Wright ME, Han DK. Protein identification using Sorcerer 2 and SEQUEST. Curr Protoc Bioinformatics 2010; Chapter 13:Unit 13.3. [PMID: 19957274 DOI: 10.1002/0471250953.bi1303s28] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.3] [Indexed: 11/11/2022]
Abstract
Sage-N's Sorcerer 2 provides an integrated data analysis system for comprehensive protein identification and characterization. It runs on a proprietary version of SEQUEST®, the most widely used search engine for identifying proteins in complex mixtures. The protocol presented here describes the basic steps performed to process mass spectrometric data with Sorcerer 2 and how to analyze results using TPP and Scaffold. The unit also provides an overview of the SEQUEST® algorithm, along with Sorcerer-SEQUEST® enhancements, and a discussion of data filtering methods, important considerations in data interpretation, and additional resources that can be of assistance to users running Sorcerer and interpreting SEQUEST® results.
Affiliation(s)
- Deborah H Lundgren
- Department of Cell Biology, Center for Vascular Biology, University of Connecticut Health Center, Farmington, Connecticut, USA
|
121
|
Lee A, Chick JM, Kolarich D, Haynes PA, Robertson GR, Tsoli M, Jankova L, Clarke SJ, Packer NH, Baker MS. Liver membrane proteome glycosylation changes in mice bearing an extra-hepatic tumor. Mol Cell Proteomics 2010; 10:M900538MCP200. [PMID: 20167946 DOI: 10.1074/mcp.m900538-mcp200] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.3] [Indexed: 11/06/2022]
Abstract
Cancer is well known to be associated with alterations in membrane protein glycosylation (Bird, N. C., Mangnall, D., and Majeed, A. W. (2006) Biology of colorectal liver metastases: A review. J. Surg. Oncol. 94, 68-80; Dimitroff, C. J., Pera, P., Dall'Olio, F., Matta, K. L., Chandrasekaran, E. V., Lau, J. T., and Bernacki, R. J. (1999) Cell surface n-acetylneuraminic acid alpha2,3-galactoside-dependent intercellular adhesion of human colon cancer cells. Biochem. Biophys. Res. Commun. 256, 631-636; and Arcinas, A., Yen, T. Y., Kebebew, E., and Macher, B. A. (2009) Cell surface and secreted protein profiles of human thyroid cancer cell lines reveal distinct glycoprotein patterns. J. Proteome Res. 8, 3958-3968). Equally, it has been well established that tumor-associated inflammation through the release of pro-inflammatory cytokines is a common cause of reduced hepatic drug metabolism and increased toxicity in advanced cancer patients being treated with cytotoxic chemotherapies. However, little is known about the impact of bearing a tumor (and downstream effects like inflammation) on liver membrane protein glycosylation. In this study, proteomic and glycomic analyses were used in combination to determine whether liver membrane protein glycosylation was affected in mice bearing the Engelbreth-Holm Swarm sarcoma. Peptide IPG-IEF and label-free quantitation determined that many enzymes involved in the protein glycosylation pathway, specifically mannosidases (Man1a-I, Man1b-I and Man2a-I), mannoside N-acetylglucosaminyltransferases (Mgat-I and Mgat-II), galactosyltransferases (B3GalT-VII, B4GalT-I, B4GalT-III, C1GalT-I, C1GalT-II, and GalNT-I), and sialyltransferases (ST3Gal-I, ST6Gal-I, and ST6GalNAc-VI), were up-regulated in all livers of tumor-bearing mice (n = 3) compared with nontumor bearing controls (n = 3). 
In addition, many cell surface lectins: Sialoadhesin-1 (Siglec-1), C-type lectin family 4f (Kupffer cell receptor), and Galactose-binding lectin 9 (Galectin-9) were determined to be up-regulated in the liver of tumor-bearing compared with control mice. Global glycan analysis identified seven N-glycans and two O-glycans that had changed on the liver membrane proteins derived from tumor-bearing mice. Interestingly, α (2,3) sialic acid was found to be up-regulated on the liver membrane of tumor-bearing mice, which reflected the increased expression of its associated sialyltransferase and lectin receptor (siglec-1). The overall increased sialylation on the liver membrane of Engelbreth-Holm Swarm bearing mice correlates with the increased expression of their associated glycosyltransferases and suggests that glycosylation of proteins in the liver plays a role in tumor-induced liver inflammation.
Affiliation(s)
- Albert Lee
- Department of Chemistry and Biomolecular Sciences, Macquarie University, NSW 2109 Australia
|
122
|
Boehmer J, Ward J, Peters R, Shefcheck K, McFarland M, Bannerman D. Proteomic analysis of the temporal expression of bovine milk proteins during coliform mastitis and label-free relative quantification. J Dairy Sci 2010; 93:593-603. [DOI: 10.3168/jds.2009-2526] [Citation(s) in RCA: 49] [Impact Index Per Article: 3.3] [Received: 06/29/2009] [Accepted: 10/13/2009] [Indexed: 11/19/2022]
|
123
|
Hwang H, Bowen BP, Lefort N, Flynn CR, De Filippis EA, Roberts C, Smoke CC, Meyer C, Højlund K, Yi Z, Mandarino LJ. Proteomics analysis of human skeletal muscle reveals novel abnormalities in obesity and type 2 diabetes. Diabetes 2010; 59:33-42. [PMID: 19833877 PMCID: PMC2797941 DOI: 10.2337/db09-0214] [Citation(s) in RCA: 186] [Impact Index Per Article: 12.4] [Indexed: 12/11/2022]
Abstract
OBJECTIVE: Insulin resistance in skeletal muscle is an early phenomenon in the pathogenesis of type 2 diabetes. Studies of insulin resistance usually are highly focused. However, approaches that give a more global picture of abnormalities in insulin resistance are useful in pointing out new directions for research. In previous studies, gene expression analyses show a coordinated pattern of reduction in nuclear-encoded mitochondrial gene expression in insulin resistance. However, changes in mRNA levels may not predict changes in protein abundance. An approach to identify global protein abundance changes involving the use of proteomics was used here. RESEARCH DESIGN AND METHODS: Muscle biopsies were obtained basally from lean, obese, and type 2 diabetic volunteers (n = 8 each); glucose clamps were used to assess insulin sensitivity. Muscle protein was subjected to mass spectrometry-based quantification using normalized spectral abundance factors. RESULTS: Of 1,218 proteins assigned, 400 were present in at least half of all subjects. Of these, 92 were altered by a factor of 2 in insulin resistance, and of those, 15 were significantly increased or decreased by ANOVA (P < 0.05). Analysis of protein sets revealed patterns of decreased abundance in mitochondrial proteins and altered abundance of proteins involved with cytoskeletal structure (desmin and alpha actinin-2 both decreased), chaperone function (TCP-1 subunits increased), and proteasome subunits (increased). CONCLUSIONS: The results confirm the reduction in mitochondrial proteins in insulin-resistant muscle and suggest that changes in muscle structure, protein degradation, and folding also characterize insulin resistance.
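The abstract quantifies proteins with normalized spectral abundance factors (NSAF). In the usual definition, each protein's spectral count is divided by its length, and the result is normalized by the sum of those length-corrected counts over all proteins in the run. The sketch below assumes that definition; the dictionary inputs and example values are hypothetical and not taken from the study.

```python
def nsaf(spectral_counts, lengths):
    """Normalized spectral abundance factor per protein.

    spectral_counts: dict protein -> spectral count in one run.
    lengths: dict protein -> protein length in residues.
    """
    # Spectral abundance factor: count corrected for protein length.
    saf = {p: spectral_counts[p] / lengths[p] for p in spectral_counts}
    total = sum(saf.values())
    # Normalize so the values in one run sum to 1.
    return {p: value / total for p, value in saf.items()}

# Hypothetical run: equal counts, but protein B is twice as long as protein A.
run = nsaf({"A": 10, "B": 10}, {"A": 100, "B": 200})
```

Because longer proteins yield more peptides, the length correction keeps a long protein from appearing more abundant than a short one that produced the same number of spectra.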
Affiliation(s)
- Hyonson Hwang
- Center for Metabolic Biology, Arizona State University, Tempe, Arizona
- Department of Kinesiology, Arizona State University, Tempe, Arizona
- Benjamin P. Bowen
- Center for Metabolic Biology, Arizona State University, Tempe, Arizona
- Harrington Department of Bioengineering, Arizona State University, Tempe, Arizona
- Natalie Lefort
- Center for Metabolic Biology, Arizona State University, Tempe, Arizona
- Department of Kinesiology, Arizona State University, Tempe, Arizona
- Charles R. Flynn
- Center for Metabolic Biology, Arizona State University, Tempe, Arizona
- Christine Roberts
- Center for Metabolic Biology, Arizona State University, Tempe, Arizona
- Christian Meyer
- Center for Metabolic Biology, Arizona State University, Tempe, Arizona
- Kurt Højlund
- Center for Metabolic Biology, Arizona State University, Tempe, Arizona
- Diabetes Research Centre, Department of Endocrinology, Odense University Hospital, Odense, Denmark
- Zhengping Yi
- Center for Metabolic Biology, Arizona State University, Tempe, Arizona
- School of Life Sciences, Arizona State University, Tempe, Arizona
- Lawrence J. Mandarino
- Center for Metabolic Biology, Arizona State University, Tempe, Arizona
- Department of Kinesiology, Arizona State University, Tempe, Arizona
- School of Life Sciences, Arizona State University, Tempe, Arizona
- Corresponding author: Lawrence J. Mandarino,
|
124
|
Abstract
Mass-spectrometry-based proteomics, the large-scale analysis of proteins by mass spectrometry, has emerged as a new technology over the last decade and become routine in many plant biology laboratories. While early work consisted merely of listing proteins identified in a given organ or under different conditions of interest, there is a growing need to apply comparative and quantitative proteomics strategies toward gaining novel insights into functional aspects of plant proteins and their dynamics. However, during the transition from qualitative to quantitative protein analysis, the potential and challenges will be tightly coupled. Several strategies for differential proteomics that involve stable isotopes or label-free comparisons and their statistical assessment are possible, each having specific strengths and limitations. Furthermore, incomplete proteome coverage and restricted dynamic range still impose the strongest limitations to data throughput and precise quantitative analysis. This review gives an overview of the current state of the art in differential proteomics and possible strategies in data processing.
|
125
|
Shao C, Liu Y, Ruan H, Li Y, Wang H, Kohl F, Goropashnaya AV, Fedorov VB, Zeng R, Barnes BM, Yan J. Shotgun proteomics analysis of hibernating arctic ground squirrels. Mol Cell Proteomics 2009; 9:313-26. [PMID: 19955082 DOI: 10.1074/mcp.m900260-mcp200] [Citation(s) in RCA: 75] [Impact Index Per Article: 4.7] [Indexed: 11/06/2022]
Abstract
Mammalian hibernation involves complex mechanisms of metabolic reprogramming and tissue protection. Previous gene expression studies of hibernation have mainly focused on changes at the mRNA level. Large scale proteomics studies on hibernation have lagged behind largely because of the lack of an adequate protein database specific for hibernating species. We constructed a ground squirrel protein database for protein identification and used a label-free shotgun proteomics approach to analyze protein expression throughout the torpor-arousal cycle during hibernation in arctic ground squirrels (Urocitellus parryii). We identified more than 3,000 unique proteins from livers of arctic ground squirrels. Among them, 517 proteins showed significant differential expression comparing animals sampled after at least 8 days of continuous torpor (late torpid), within 5 h of a spontaneous arousal episode (early aroused), and 1-2 months after hibernation had ended (non-hibernating). Consistent with changes at the mRNA level shown in a previous study on the same tissue samples, proteins involved in glycolysis and fatty acid synthesis were significantly underexpressed at the protein level in both late torpid and early aroused animals compared with non-hibernating animals, whereas proteins involved in fatty acid catabolism were significantly overexpressed. On the other hand, when we compared late torpid and early aroused animals, there were discrepancies between mRNA and protein levels for a large number of genes. Proteins involved in protein translation and degradation, mRNA processing, and oxidative phosphorylation were significantly overexpressed in early aroused animals compared with late torpid animals, whereas no significant changes at the mRNA levels between these stages had been observed. Our results suggest that there is substantial post-transcriptional regulation of proteins during torpor-arousal cycles of hibernation.
Affiliation(s)
- Chunxuan Shao
- Chinese Academy of Sciences-German Max Planck Society (CAS-MPG) Partner Institute for Computational Biology, Shanghai Institutes of Biological Sciences, 320 Yue Yang Road, Shanghai 200031, China
|
126
|
Fournier ML, Paulson A, Pavelka N, Mosley AL, Gaudenz K, Bradford WD, Glynn E, Li H, Sardiu ME, Fleharty B, Seidel C, Florens L, Washburn MP. Delayed correlation of mRNA and protein expression in rapamycin-treated cells and a role for Ggc1 in cellular sensitivity to rapamycin. Mol Cell Proteomics 2009; 9:271-84. [PMID: 19955083 DOI: 10.1074/mcp.m900415-mcp200] [Citation(s) in RCA: 115] [Impact Index Per Article: 7.2] [Indexed: 11/06/2022]
Abstract
To identify new molecular targets of rapamycin, an anticancer and immunosuppressive drug, we analyzed temporal changes in yeast over 6 h in response to rapamycin at the transcriptome and proteome levels and integrated the expression patterns with functional profiling. We show that the integration of transcriptomics, proteomics, and functional data sets provides novel insights into the molecular mechanisms of rapamycin action. We first observed a temporal delay in the correlation of mRNA and protein expression where mRNA expression at 1 and 2 h correlated best with protein expression changes after 6 h of rapamycin treatment. This was especially the case for the inhibition of ribosome biogenesis and induction of heat shock and autophagy essential to promote the cellular sensitivity to rapamycin. However, increased levels of vacuolar protease could enhance resistance to rapamycin. Of the 85 proteins identified as statistically significantly changing in abundance, most of the proteins that decreased in abundance were correlated with a decrease in mRNA expression. However, of the 56 proteins increasing in abundance, 26 were not correlated with an increase in mRNA expression. These protein changes were correlated with unchanged or down-regulated mRNA expression. These proteins, involved in mitochondrial genome maintenance, endocytosis, or drug export, represent new candidates effecting rapamycin action whose expression might be post-transcriptionally or post-translationally regulated. We identified GGC1, a mitochondrial GTP/GDP carrier, as a new component of the rapamycin/target of rapamycin (TOR) signaling pathway. We determined that the protein product of GGC1 was stabilized in the presence of rapamycin, and the deletion of the GGC1 enhanced growth fitness in the presence of rapamycin. 
A dynamic mRNA expression analysis of Δggc1 and wild-type cells treated with rapamycin revealed a key role for Ggc1p in the regulation of ribosome biogenesis and cell cycle progression under TOR control.
|
127
|
Pendyala G, Trauger SA, Kalisiak E, Ellis RJ, Siuzdak G, Fox HS. Cerebrospinal fluid proteomics reveals potential pathogenic changes in the brains of SIV-infected monkeys. J Proteome Res 2009; 8:2253-60. [PMID: 19281240 DOI: 10.1021/pr800854t] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.8] [Indexed: 11/30/2022]
Abstract
The HIV-1-associated neurocognitive disorder occurs in approximately one-third of infected individuals. It has persisted in the current era of antiretroviral therapy, and its study is complicated by the lack of biomarkers for this condition. Since the cerebrospinal fluid is the most proximal biofluid to the site of pathology, we studied the cerebrospinal fluid in a nonhuman primate model for HIV-1-associated neurocognitive disorder. Here we present a simple and efficient liquid chromatography-coupled mass spectrometry-based proteomics approach that utilizes small amounts of cerebrospinal fluid. First, we demonstrate the validity of the methodology using human cerebrospinal fluid. Next, using the simian immunodeficiency virus-infected monkey model, we show its efficacy in identifying proteins such as alpha-1-antitrypsin, complement C3, hemopexin, IgM heavy chain, and plasminogen, whose increased expression is linked to disease. Finally, we find that the increase in cerebrospinal fluid proteins is linked to increased expression of their genes in the brain parenchyma, revealing that the cerebrospinal fluid alterations identified reflect changes in the brain itself and not merely leakage of the blood-brain or blood-cerebrospinal fluid barriers. This study reveals new central nervous system alterations in lentivirus-induced neurological disease, and this technique can be applied to other systems in which limited amounts of biofluids can be obtained.
Affiliation(s)
- Gurudutt Pendyala
- Department of Pharmacology and Experimental Neuroscience, University of Nebraska Medical Center, Omaha, Nebraska 68198-5800, USA
|
128
|
Sardiu ME, Florens L, Washburn MP. Evaluation of Clustering Algorithms for Protein Complex and Protein Interaction Network Assembly. J Proteome Res 2009; 8:2944-52. [DOI: 10.1021/pr900073d] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.8] [Indexed: 11/29/2022]
Affiliation(s)
- Laurence Florens
- Stowers Institute for Medical Research, Kansas City, Missouri 64110
|
129
|
Li Q, Roxas BA. An assessment of false discovery rates and statistical significance in label-free quantitative proteomics with combined filters. BMC Bioinformatics 2009; 10:43. [PMID: 19187558 PMCID: PMC2645366 DOI: 10.1186/1471-2105-10-43] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Received: 08/06/2008] [Accepted: 02/02/2009] [Indexed: 11/25/2022]
Abstract
Background: Many studies have provided algorithms or methods to assess statistical significance in quantitative proteomics when multiple replicates for a protein sample and an LC/MS analysis are available. However, confidence is still lacking when datasets without protein sample replicates are used for biological interpretation. Although a fold-change is a conventional threshold that can be used when there are no sample replicates, it does not provide an assessment of statistical significance such as a false discovery rate (FDR), which is an important indicator of the reliability of identifying differentially expressed proteins. In this work, we investigate whether differentially expressed proteins can be detected with statistical significance from a pair of unlabeled protein samples without replicates and with only duplicate LC/MS injections per sample. An FDR is used to gauge the statistical significance of the differentially expressed proteins. Results: We experimented with several parameters to control the FDR, including a fold-change, a statistical test, and a minimum number of permuted significant pairings. Although none of these parameters alone gives satisfactory control of the FDR, we find that a combination of these parameters provides a very effective means to control the FDR without compromising the sensitivity. The results suggest that it is possible to perform a significance analysis without protein sample replicates. Only duplicate LC/MS injections per sample are needed. We illustrate that differentially expressed proteins can be detected with an FDR between 0 and 15% at a positive rate of 4–16%. The method is evaluated for its sensitivity and specificity by a ROC analysis, and is further validated with a [15N]-labeled internal-standard protein sample and additional unlabeled protein sample replicates. Conclusion: We demonstrate that statistical significance can be inferred without protein sample replicates in label-free quantitative proteomics. 
The approach described in this study would be useful in many exploratory experiments where a sample amount or instrument time is limited. Naturally, this method is also suitable for proteomics experiments where multiple sample replicates are available. It is simple, and is complementary to other more sophisticated algorithms that are not designed for dealing with a small number of sample replicates.
Affiliation(s)
- Qingbo Li
- Center for Pharmaceutical Biotechnology, College of Pharmacy, University of Illinois at Chicago, Chicago, IL 60607, USA.
|
130
|
Abstract
In a recent editorial (J. Proteome Res. 2007, 6, 1633) and elsewhere questions have been raised regarding the lack of attention paid to good analytical practice with respect to the reporting of quantitative results in proteomics. Using those comments as a starting point, several issues are discussed that relate to the challenges involved in achieving adequate sampling with MS-based methods in order to generate valid data for large-scale studies. The discussion touches on the relationships that connect sampling depth and the power to detect protein abundance change, conflict of interest, and strategies to overcome bureaucratic obstacles that impede the use of peer-to-peer technologies for transfer and storage of large data files generated in such experiments.
Affiliation(s)
- Murray Hackett
- Department of Chemical Engineering, University of Washington, Seattle, WA 98195, USA.
|
131
|
Mosley AL, Florens L, Wen Z, Washburn MP. A label free quantitative proteomic analysis of the Saccharomyces cerevisiae nucleus. J Proteomics 2008; 72:110-20. [PMID: 19038371 DOI: 10.1016/j.jprot.2008.10.008] [Citation(s) in RCA: 55] [Impact Index Per Article: 3.2] [Received: 07/10/2008] [Revised: 10/27/2008] [Accepted: 10/28/2008] [Indexed: 01/09/2023]
Abstract
To gain insight into the nuclear proteome of Saccharomyces cerevisiae, nuclei were isolated and fractionated via sucrose gradient sedimentation. The resulting fractions were analyzed using multidimensional protein identification technology and the detected proteins were quantified using normalized spectral counts. A large number of low abundance proteins, many of which are involved in transcriptional regulation, were recovered. Sucrose gradient elution profiles of known protein complex components demonstrated that this approach may provide insight into the question of what percentage of the total population of a protein is in one complex, versus another protein complex, or exists as a free protein.
Affiliation(s)
- Amber L Mosley
- Stowers Institute for Medical Research, 1000 E. 50th St., Kansas City, MO 64110, United States
|
132
|
Rao PK, Rodriguez GM, Smith I, Li Q. Protein dynamics in iron-starved Mycobacterium tuberculosis revealed by turnover and abundance measurement using hybrid-linear ion trap-Fourier transform mass spectrometry. Anal Chem 2008; 80:6860-9. [PMID: 18690695 DOI: 10.1021/ac800288t] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.5] [Indexed: 01/21/2023]
Abstract
To study the proteome response of Mycobacterium tuberculosis H37Rv to a change in iron level, iron-starved late-log-phase cells were diluted in fresh low- and high-iron media containing [15N]-labeled asparagine as the sole nitrogen source for labeling the proteins synthesized upon dilution. We determined the relative protein abundance and protein turnover in M. tuberculosis H37Rv under these two conditions. For measurements, we used a high-resolution hybrid-linear ion trap-Fourier transform mass spectrometer coupled with nanoliquid chromatography separation. While relative protein abundance analysis shows that only 5 proteins were upregulated by high iron, 24 proteins had elevated protein turnover for the cells in the high-iron medium. This suggests that protein turnover is a sensitive parameter to assess the proteome dynamics. Cluster analysis was used to explore the interconnection of protein abundance and turnover, revealing coordination of the cellular processes of protein synthesis, degradation, and secretion that determine the abundance and allocation of a protein in the cytosol and the extracellular matrix of the cells. Further potential utility of the approach is discussed.
Affiliation(s)
- Prahlad K Rao
- Center for Pharmaceutical Biotechnology, College of Pharmacy, University of Illinois at Chicago, Chicago, Illinois 60607, USA
|
133
|
Choi H, Fermin D, Nesvizhskii AI. Significance analysis of spectral count data in label-free shotgun proteomics. Mol Cell Proteomics 2008; 7:2373-85. [PMID: 18644780 DOI: 10.1074/mcp.m800203-mcp200] [Citation(s) in RCA: 287] [Impact Index Per Article: 16.9] [Indexed: 11/06/2022]
Abstract
Spectral counting has become a commonly used approach for measuring protein abundance in label-free shotgun proteomics. At the same time, the development of data analysis methods has lagged behind. Currently most studies utilizing spectral counts rely on simple data transforms and posthoc corrections of conventional signal-to-noise ratio statistics. However, these adjustments can neither handle the bias toward high abundance proteins nor deal with the drawbacks due to the limited number of replicates. We present a novel statistical framework (QSpec) for the significance analysis of differential expression with extensions to a variety of experimental design factors and adjustments for protein properties. Using synthetic and real experimental data sets, we show that the proposed method outperforms conventional statistical methods that search for differential expression for individual proteins. We illustrate the flexibility of the model by analyzing a data set with a complicated experimental design involving cellular localization and time course.
Affiliation(s)
- Hyungwon Choi
- Department of Pathology, University of Michigan, Ann Arbor, Michigan 48109, USA
|