1
|
Tsantilas KA, Merrihew GE, Robbins JE, Johnson RS, Park J, Plubell DL, Canterbury JD, Huang E, Riffle M, Sharma V, MacLean BX, Eckels J, Wu CC, Bereman MS, Spencer SE, Hoofnagle AN, MacCoss MJ. A Framework for Quality Control in Quantitative Proteomics. J Proteome Res 2024; 23:4392-4408. [PMID: 39248652 DOI: 10.1021/acs.jproteome.4c00363] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/10/2024]
Abstract
A thorough evaluation of the quality, reproducibility, and variability of bottom-up proteomics data is necessary at every stage of a workflow, from planning to analysis. We share vignettes applying adaptable quality control (QC) measures to assess sample preparation, system function, and quantitative analysis. System suitability samples are repeatedly measured longitudinally with targeted methods, and we share examples where they are used on three instrument platforms to identify severe system failures and track function over months to years. Internal QCs incorporated at the protein and peptide levels allow our team to assess sample preparation issues and to differentiate system failures from sample-specific issues. External QC samples prepared alongside our experimental samples are used to verify the consistency and quantitative potential of our results during batch correction and normalization before assessing biological phenotypes. We combine these controls with rapid analysis (Skyline), longitudinal QC metrics (AutoQC), and server-based data deposition (PanoramaWeb). We propose that this integrated approach to QC is a useful starting point for groups to facilitate rapid quality control assessment to ensure that valuable instrument time is used to collect the best quality data possible. Data are available on Panorama Public and ProteomeXchange under the identifier PXD051318.
Affiliation(s)
- Kristine A Tsantilas
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, United States
| | - Gennifer E Merrihew
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, United States
| | - Julia E Robbins
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, United States
| | - Richard S Johnson
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, United States
| | - Jea Park
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, United States
| | - Deanna L Plubell
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, United States
| | - Jesse D Canterbury
- Thermo Fisher Scientific, 355 River Oaks Parkway, San Jose, California 95134, United States
| | - Eric Huang
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, United States
| | - Michael Riffle
- Department of Biochemistry, University of Washington, Seattle, Washington 98195, United States
| | - Vagisha Sharma
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, United States
| | - Brendan X MacLean
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, United States
| | - Josh Eckels
- LabKey, 500 Union St #1000, Seattle, Washington 98101, United States
| | - Christine C Wu
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, United States
| | - Michael S Bereman
- Department of Biological Sciences, North Carolina State University, Raleigh, North Carolina 27607, United States
| | - Sandra E Spencer
- Canada's Michael Smith Genome Sciences Centre (BC Cancer Research Institute), University of British Columbia, Vancouver, British Columbia V5Z 4S6, Canada
| | - Andrew N Hoofnagle
- Department of Laboratory Medicine and Pathology, University of Washington, Seattle, Washington 98195, United States
| | - Michael J MacCoss
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, United States
| |
|
2
|
Tsantilas KA, Merrihew GE, Robbins JE, Johnson RS, Park J, Plubell DL, Canterbury JD, Huang E, Riffle M, Sharma V, MacLean BX, Eckels J, Wu CC, Bereman MS, Spencer SE, Hoofnagle AN, MacCoss MJ. A framework for quality control in quantitative proteomics. bioRxiv 2024:2024.04.12.589318. [PMID: 38645098 PMCID: PMC11030400 DOI: 10.1101/2024.04.12.589318] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/23/2024]
Abstract
A thorough evaluation of the quality, reproducibility, and variability of bottom-up proteomics data is necessary at every stage of a workflow, from planning to analysis. We share vignettes applying adaptable quality control (QC) measures to assess sample preparation, system function, and quantitative analysis. System suitability samples are repeatedly measured longitudinally with targeted methods, and we share examples where they are used on three instrument platforms to identify severe system failures and track function over months to years. Internal QCs incorporated at the protein and peptide levels allow our team to assess sample preparation issues and to differentiate system failures from sample-specific issues. External QC samples prepared alongside our experimental samples are used to verify the consistency and quantitative potential of our results during batch correction and normalization before assessing biological phenotypes. We combine these controls with rapid analysis (Skyline), longitudinal QC metrics (AutoQC), and server-based data deposition (PanoramaWeb). We propose that this integrated approach to QC is a useful starting point for groups to facilitate rapid quality control assessment to ensure that valuable instrument time is used to collect the best quality data possible. Data are available on Panorama Public and on ProteomeXchange under the identifier PXD051318.
Affiliation(s)
- Kristine A. Tsantilas
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, United States
| | - Gennifer E. Merrihew
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, United States
| | - Julia E. Robbins
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, United States
| | - Richard S. Johnson
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, United States
| | - Jea Park
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, United States
| | - Deanna L. Plubell
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, United States
| | - Jesse D. Canterbury
- Thermo Fisher Scientific, 355 River Oaks Parkway, San Jose, California 95134, United States
| | - Eric Huang
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, United States
| | - Michael Riffle
- Department of Biochemistry, University of Washington, Seattle, Washington 98195, United States
| | - Vagisha Sharma
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, United States
| | - Brendan X. MacLean
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, United States
| | - Josh Eckels
- LabKey, 500 Union St #1000, Seattle, Washington 98101, United States
| | - Christine C. Wu
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, United States
| | - Michael S. Bereman
- Department of Biological Sciences, North Carolina State University, Raleigh, North Carolina 27607, United States
| | - Sandra E. Spencer
- Canada’s Michael Smith Genome Sciences Centre (BC Cancer Research Institute), University of British Columbia, Vancouver, British Columbia V5Z 4S6, Canada
| | - Andrew N. Hoofnagle
- Department of Laboratory Medicine and Pathology, University of Washington, Seattle, Washington 98195, United States
| | - Michael J. MacCoss
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, United States
| |
|
3
|
Bielow C, Hoffmann N, Jimenez-Morales D, Van Den Bossche T, Vizcaíno JA, Tabb DL, Bittremieux W, Walzer M. Communicating Mass Spectrometry Quality Information in mzQC with Python, R, and Java. J Am Soc Mass Spectrom 2024; 35:1875-1882. [PMID: 38918936 PMCID: PMC11311537 DOI: 10.1021/jasms.4c00174] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/29/2024] [Revised: 06/07/2024] [Accepted: 06/11/2024] [Indexed: 06/27/2024]
Abstract
Mass spectrometry is a powerful technique for analyzing molecules in complex biological samples. However, inter- and intralaboratory variability and bias can affect the data due to various factors, including sample handling and preparation, instrument calibration and performance, and data acquisition and processing. To address this issue, the Quality Control (QC) working group of the Human Proteome Organization's Proteomics Standards Initiative has established the standard mzQC file format for reporting and exchanging information relating to data quality. mzQC is based on the JavaScript Object Notation (JSON) format and provides a lightweight yet versatile file format that can be easily implemented in software. Here, we present open-source software libraries to process mzQC data in three programming languages: Python, using pymzqc; R, using rmzqc; and Java, using jmzqc. The libraries follow a common data model and provide shared functionalities, including the (de)serialization and validation of mzQC files. We demonstrate use of the software libraries in a workflow for extracting, analyzing, and visualizing QC metrics from different sources. Additionally, we show how these libraries can be integrated with each other, with existing software tools, and in automated workflows for the QC of mass spectrometry data. All software libraries are available as open source under the MS-Quality-Hub organization on GitHub (https://github.com/MS-Quality-Hub).
Affiliation(s)
- Chris Bielow
- Bioinformatics Solution Center, Institut für Mathematik und Informatik, Freie Universität Berlin, Takustrasse 9, 14195 Berlin, Germany
| | - Nils Hoffmann
- Institute for Bio- and Geosciences (IBG-5), Forschungszentrum Jülich GmbH, 52428 Jülich, Germany
| | - David Jimenez-Morales
- Department of Medicine, Stanford University School of Medicine, Stanford, California 94305, United States
| | - Tim Van Den Bossche
- Department of Biomolecular Medicine, Faculty of Medicine and Health Sciences, Ghent University, Technologiepark-Zwijnaarde 75, 9052 Ghent, Belgium
- VIB-UGent Center for Medical Biotechnology, VIB, Technologiepark-Zwijnaarde 75, 9052 Ghent, Belgium
| | - Juan Antonio Vizcaíno
- European Molecular Biology Laboratory, EMBL-European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridge CB10 1SD, United Kingdom
| | - David L. Tabb
- European Research Institute for the Biology of Ageing, University Medical Center Groningen, Groningen 9713 AV, The Netherlands
| | - Wout Bittremieux
- Department of Computer Science, University of Antwerp, Antwerpen 2020, Belgium
| | - Mathias Walzer
- European Molecular Biology Laboratory, EMBL-European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridge CB10 1SD, United Kingdom
| |
|
4
|
Hu Y, Schnaubelt M, Chen L, Zhang B, Hoang T, Lih TM, Zhang Z, Zhang H. MS-PyCloud: A Cloud Computing-Based Pipeline for Proteomic and Glycoproteomic Data Analyses. Anal Chem 2024; 96:10145-10151. [PMID: 38869158 DOI: 10.1021/acs.analchem.3c01497] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2024]
Abstract
Rapid development and wide adoption of mass spectrometry-based glycoproteomic technologies have empowered scientists to study proteins and protein glycosylation in complex samples on a large scale. This progress has also created unprecedented challenges for individual laboratories to store, manage, and analyze proteomic and glycoproteomic data, both in the cost for proprietary software and high-performance computing and in the long processing time that discourages on-the-fly changes of data processing settings required in explorative and discovery analysis. We developed an open-source, cloud computing-based pipeline, MS-PyCloud, with graphical user interface (GUI), for proteomic and glycoproteomic data analysis. The major components of this pipeline include data file integrity validation, MS/MS database search for spectral assignments to peptide sequences, false discovery rate estimation, protein inference, quantitation of global protein levels, and specific glycan-modified glycopeptides as well as other modification-specific peptides such as phosphorylation, acetylation, and ubiquitination. To ensure the transparency and reproducibility of data analysis, MS-PyCloud includes open-source software tools with comprehensive testing and versioning for spectrum assignments. Leveraging public cloud computing infrastructure via Amazon Web Services (AWS), MS-PyCloud scales seamlessly based on analysis demand to achieve fast and efficient performance. Application of the pipeline to the analysis of large-scale LC-MS/MS data sets demonstrated the effectiveness and high performance of MS-PyCloud. The software can be downloaded at https://github.com/huizhanglab-jhu/ms-pycloud.
Affiliation(s)
- Yingwei Hu
- Department of Pathology, School of Medicine, Johns Hopkins University, Baltimore, Maryland 21231, United States
| | - Michael Schnaubelt
- Department of Pathology, School of Medicine, Johns Hopkins University, Baltimore, Maryland 21231, United States
| | - Li Chen
- Department of Pathology, School of Medicine, Johns Hopkins University, Baltimore, Maryland 21231, United States
| | - Bai Zhang
- Department of Pathology, School of Medicine, Johns Hopkins University, Baltimore, Maryland 21231, United States
| | - Trung Hoang
- Department of Pathology, School of Medicine, Johns Hopkins University, Baltimore, Maryland 21231, United States
| | - T Mamie Lih
- Department of Pathology, School of Medicine, Johns Hopkins University, Baltimore, Maryland 21231, United States
| | - Zhen Zhang
- Department of Pathology, School of Medicine, Johns Hopkins University, Baltimore, Maryland 21231, United States
| | - Hui Zhang
- Department of Pathology, School of Medicine, Johns Hopkins University, Baltimore, Maryland 21231, United States
| |
|
5
|
Zhang NH, Deutsch EW. SpectiCal: m/z Calibration of MS2 Peptide Spectra Using Known Low Mass Ions. J Proteome Res 2024; 23:1519-1530. [PMID: 38538550 DOI: 10.1021/acs.jproteome.3c00882] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/06/2024]
Abstract
Most tandem mass spectrometry fragmentation spectra have small calibration errors that can lead to suboptimal interpretation and annotation. We developed SpectiCal, a software tool that can read mzML files from data-dependent acquisition proteomics experiments in parallel, compute m/z calibrations for each file prior to identification analysis based on known low-mass ions, and produce information about frequently observed peaks and their explanations. Using calibration coefficients, the data can be corrected to generate new calibrated mzML files. SpectiCal was tested using five public data sets, creating a table of commonly observed low-mass ions and their identifications. Information about the calibration and individual peaks is written in PDF and TSV files. This includes information for each peak, such as the number of runs in which it appears, the percentage of spectra in which it appears, and a plot of the aggregated region surrounding each peak. SpectiCal can be used to compute MS run calibrations, examine MS runs for artifacts that might hinder downstream analysis, and generate tables of detected low-mass ions for further analysis. SpectiCal is freely available at https://github.com/PlantProteomes/SpectiCal.
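The calibration idea in this abstract can be illustrated with a minimal sketch. It is not SpectiCal's implementation: it assumes a single constant ppm shift per run, a hypothetical five-ion reference list, and a hypothetical 20 ppm matching tolerance. The function names `estimate_shift_ppm` and `apply_calibration` are invented for illustration.

```python
from statistics import median

# Monoisotopic m/z of a few singly protonated low-mass ions often seen in
# peptide MS2 spectra. Illustrative short list, not SpectiCal's own table.
REFERENCE_IONS = {
    "His immonium": 110.07127,
    "Phe immonium": 120.08078,
    "Tyr immonium": 136.07569,
    "Lys y1": 147.11280,
    "Arg y1": 175.11895,
}


def ppm_error(observed_mz, reference_mz):
    """Signed mass error of an observed peak in parts per million."""
    return (observed_mz - reference_mz) / reference_mz * 1e6


def estimate_shift_ppm(peaks, tolerance_ppm=20.0):
    """Median ppm error over reference ions matched within the tolerance."""
    errors = []
    for reference_mz in REFERENCE_IONS.values():
        if not peaks:
            break
        # Take the observed peak closest to this reference ion.
        closest = min(peaks, key=lambda mz: abs(mz - reference_mz))
        error = ppm_error(closest, reference_mz)
        if abs(error) <= tolerance_ppm:
            errors.append(error)
    # Assumption: with no matched ions, leave the data uncorrected.
    return median(errors) if errors else 0.0


def apply_calibration(peaks, shift_ppm):
    """Remove a constant ppm shift from every peak."""
    return [mz / (1.0 + shift_ppm * 1e-6) for mz in peaks]


# Synthetic spectrum: each reference ion shifted by +5 ppm, plus one
# unrelated peak that matches no reference ion.
observed = [mz * (1.0 + 5e-6) for mz in REFERENCE_IONS.values()] + [201.1234]
shift = estimate_shift_ppm(observed)
calibrated = apply_calibration(observed, shift)
```

The median is used here so that a single mismatched peak does not dominate the estimate; a real tool may fit an m/z-dependent correction instead.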
Affiliation(s)
- Nathan H Zhang
- Institute for Systems Biology, Seattle, Washington 98109, United States
| | - Eric W Deutsch
- Institute for Systems Biology, Seattle, Washington 98109, United States
| |
|
6
|
Huang J, Zhao Y, Meng B, Lu A, Wei Y, Dong L, Fang X, An D, Dai X. SEAOP: a statistical ensemble approach for outlier detection in quantitative proteomics data. Brief Bioinform 2024; 25:bbae129. [PMID: 38557674 PMCID: PMC10982946 DOI: 10.1093/bib/bbae129] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/23/2023] [Revised: 02/01/2024] [Accepted: 03/07/2024] [Indexed: 04/04/2024] [Open access]
Abstract
Quality control in quantitative proteomics is a persistent challenge, particularly in identifying and managing outliers. Unsupervised learning models, which rely on data structure rather than predefined labels, offer potential solutions. However, without clear labels, their effectiveness might be compromised. Single models are susceptible to the randomness of parameters and initialization, which can result in a high rate of false positives. Ensemble models, on the other hand, have shown capabilities in effectively mitigating the impacts of such randomness and assisting in accurately detecting true outliers. Therefore, we introduced SEAOP, a Python toolbox that utilizes an ensemble mechanism by integrating multi-round data management and a statistics-based decision pipeline with multiple models. Specifically, SEAOP uses multi-round resampling to create diverse sub-data spaces and employs outlier detection methods to identify candidate outliers in each space. Candidates are then aggregated as confirmed outliers via a chi-square test, adhering to a 95% confidence level, to ensure the precision of the unsupervised approaches. Additionally, SEAOP introduces a visualization strategy, specifically designed to intuitively and effectively display the distribution of both outlier and non-outlier samples. Optimal hyperparameter models of SEAOP for outlier detection were identified by using a gradient-simulated standard dataset and Mann-Kendall trend test. The performance of the SEAOP toolbox was evaluated using three experimental datasets, confirming its reliability and accuracy in handling quantitative proteomics.
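The resample-then-confirm pattern described in this abstract can be sketched roughly as follows. This is not the SEAOP toolbox or its API. The detector (a median/MAD score with a 3.5 cutoff), the assumed 5% null flag rate, the 80% subsample fraction, and the function names are all hypothetical choices made for illustration. Only the overall shape comes from the abstract: repeated resampling, per-round candidate outliers, and a chi-square decision at the 95% level.

```python
import random
from statistics import median

# Chi-square critical value for 1 degree of freedom at alpha = 0.05.
CHI2_CRIT_1DF_95 = 3.841


def robust_flags(values, cutoff=3.5):
    """Flag values whose MAD-scaled distance from the median exceeds cutoff."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        return [False] * len(values)
    return [abs(v - med) / (1.4826 * mad) > cutoff for v in values]


def ensemble_outliers(values, rounds=50, fraction=0.8, null_rate=0.05, seed=0):
    """Confirm outliers from per-round flags with a chi-square test.

    Assumption: under the null hypothesis a sample is flagged in a
    `null_rate` share of the rounds in which it was drawn.
    """
    rng = random.Random(seed)
    n = len(values)
    k = min(n, max(3, int(fraction * n)))
    sampled = [0] * n
    flagged = [0] * n
    for _ in range(rounds):
        idx = rng.sample(range(n), k)  # one resampled sub-data space
        flags = robust_flags([values[i] for i in idx])
        for i, is_outlier in zip(idx, flags):
            sampled[i] += 1
            flagged[i] += int(is_outlier)
    confirmed = []
    for i in range(n):
        if sampled[i] == 0:
            continue
        exp_flag = sampled[i] * null_rate
        exp_clean = sampled[i] - exp_flag
        obs_flag = flagged[i]
        obs_clean = sampled[i] - obs_flag
        stat = ((obs_flag - exp_flag) ** 2 / exp_flag
                + (obs_clean - exp_clean) ** 2 / exp_clean)
        # One-sided use of the test: only excess flagging counts.
        if obs_flag > exp_flag and stat > CHI2_CRIT_1DF_95:
            confirmed.append(i)
    return confirmed


# Synthetic per-sample summary values with one injected outlier at index 7.
intensities = [10.0 + 0.03 * (i - 10) for i in range(20)]
intensities[7] = 30.0
outliers = ensemble_outliers(intensities)
```

Aggregating over many subsamples means a sample is confirmed only when it is flagged far more often than the assumed null rate, which is the property the abstract credits with reducing false positives.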
Affiliation(s)
- Jinze Huang
- College of Information and Electrical Engineering, China Agricultural University, Beijing, 100083, China
| | - Yang Zhao
- Technology Innovation Center of Mass Spectrometry for State Market Regulation, Center for Advanced Measurement Science, National Institute of Metrology, Beijing 100029, China
| | - Bo Meng
- Technology Innovation Center of Mass Spectrometry for State Market Regulation, Center for Advanced Measurement Science, National Institute of Metrology, Beijing 100029, China
| | - Ao Lu
- College of Information and Electrical Engineering, China Agricultural University, Beijing, 100083, China
| | - Yaoguang Wei
- College of Information and Electrical Engineering, China Agricultural University, Beijing, 100083, China
| | - Lianhua Dong
- Technology Innovation Center of Mass Spectrometry for State Market Regulation, Center for Advanced Measurement Science, National Institute of Metrology, Beijing 100029, China
| | - Xiang Fang
- Technology Innovation Center of Mass Spectrometry for State Market Regulation, Center for Advanced Measurement Science, National Institute of Metrology, Beijing 100029, China
| | - Dong An
- College of Information and Electrical Engineering, China Agricultural University, Beijing, 100083, China
| | - Xinhua Dai
- Technology Innovation Center of Mass Spectrometry for State Market Regulation, Center for Advanced Measurement Science, National Institute of Metrology, Beijing 100029, China
| |
|
7
|
Naake T, Rainer J, Huber W. MsQuality: an interoperable open-source package for the calculation of standardized quality metrics of mass spectrometry data. Bioinformatics 2023; 39:btad618. [PMID: 37812234 PMCID: PMC10580266 DOI: 10.1093/bioinformatics/btad618] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/15/2023] [Revised: 09/08/2023] [Accepted: 10/06/2023] [Indexed: 10/10/2023] [Open access]
Abstract
MOTIVATION: Multiple factors can impact accuracy and reproducibility of mass spectrometry data. There is a need to integrate quality assessment and control into data analytic workflows. RESULTS: The MsQuality package calculates 43 low-level quality metrics based on the controlled mzQC vocabulary defined by the HUPO-PSI on a single mass spectrometry-based measurement of a sample. It helps to identify low-quality measurements and track data quality. Its use of community-standard quality metrics facilitates comparability of quality assessment and control (QA/QC) criteria across datasets. AVAILABILITY AND IMPLEMENTATION: The R package MsQuality is available through Bioconductor at https://bioconductor.org/packages/MsQuality.
Affiliation(s)
- Thomas Naake
- Genome Biology Unit, European Molecular Biology Laboratory, Heidelberg 69117, Germany
| | - Johannes Rainer
- Institute for Biomedicine (Affiliated to the University of Lübeck), Eurac Research, Bolzano 39100, Italy
| | - Wolfgang Huber
- Genome Biology Unit, European Molecular Biology Laboratory, Heidelberg 69117, Germany
| |
|
8
|
Bowser BL, Patterson KL, Robinson RA. Evaluating cPILOT Data toward Quality Control Implementation. J Am Soc Mass Spectrom 2023; 34:1741-1752. [PMID: 37459602 DOI: 10.1021/jasms.3c00179] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/03/2023]
Abstract
Multiplexing enables the monitoring of hundreds to thousands of proteins in quantitative proteomics analyses and increases sample throughput. In most mass-spectrometry-based proteomics workflows, multiplexing is achieved by labeling biological samples with heavy isotopes via precursor isotopic labeling or isobaric tagging. Enhanced multiplexing strategies, such as combined precursor isotopic labeling and isobaric tagging (cPILOT), combine multiple technologies to afford an even higher sample throughput. Critical to enhanced multiplexing analyses is ensuring that analytical performance is optimal and that missingness of sample channels is minimized. Automation of sample preparation steps and use of quality control (QC) metrics can be incorporated into multiplexing analyses and reduce the likelihood of missing information, thus maximizing the amount of usable quantitative data. Here, we implemented QC metrics previously developed in our laboratory to evaluate a 36-plex cPILOT experiment that encompassed 144 mouse samples of various tissue types, time points, genotypes, and biological replicates. The evaluation focuses on the use of a sample pool generated from all samples in the experiment to monitor the daily instrument performance and to provide a means for data normalization across sample batches. Our results show that tracking QC metrics enabled the quantification of ∼7000 proteins in each sample batch, of which ∼70% had minimal missing values across up to 36 sample channels. Implementation of QC metrics for future cPILOT studies as well as other enhanced multiplexing strategies will help yield high-quality data sets.
Affiliation(s)
- Bailey L Bowser
- Department of Chemistry, Vanderbilt University, Nashville, Tennessee 37235, United States
| | - Khiry L Patterson
- Department of Chemistry, Vanderbilt University, Nashville, Tennessee 37235, United States
| | - Renã A. S. Robinson
- Department of Chemistry, Vanderbilt University, Nashville, Tennessee 37235, United States
- Department of Neurology, Vanderbilt University Medical Center, Nashville, Tennessee 37232, United States
- Vanderbilt Memory & Alzheimer's Center, Nashville, Tennessee 37212, United States
- Vanderbilt Institute of Chemical Biology, Nashville, Tennessee 37232, United States
- Vanderbilt Brain Institute, Nashville, Tennessee 37232, United States
| |
|
9
|
Quality Control—A Stepchild in Quantitative Proteomics: A Case Study for the Human CSF Proteome. Biomolecules 2023; 13:biom13030491. [PMID: 36979426 PMCID: PMC10046854 DOI: 10.3390/biom13030491] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 11/21/2022] [Revised: 02/08/2023] [Accepted: 03/01/2023] [Indexed: 03/09/2023] [Open access]
Abstract
Proteomic studies using mass spectrometry (MS)-based quantification are a main approach to the discovery of new biomarkers. However, a number of analytical conditions before and during MS data acquisition can affect the accuracy of the obtained outcome. Therefore, comprehensive quality assessment of the acquired data plays a central role in quantitative proteomics, though, due to the immense complexity of MS data, it is often neglected. Here, we take a practical approach to the quality assessment of quantitative MS data, describing key steps for the evaluation at the levels of raw data, identification, and quantification. With this, four independent datasets from cerebrospinal fluid, an important biofluid for neurodegenerative disease biomarker studies, were assessed, demonstrating that sample processing-based differences are already reflected at all three levels but with varying impacts on the quality of the quantitative data. Specifically, we provide guidance to critically interpret the quality of MS data for quantitative proteomics. Moreover, we provide the free and open source quality control tool MaCProQC, enabling systematic, rapid and uncomplicated data comparison of raw data, identification and feature detection levels through defined quality metrics and a step-by-step quality control workflow.
|
10
|
Morgenstern D, Barzilay R, Levin Y. RawBeans: A Simple, Vendor-Independent, Raw-Data Quality-Control Tool. J Proteome Res 2021; 20:2098-2104. [PMID: 33657803 PMCID: PMC8041395 DOI: 10.1021/acs.jproteome.0c00956] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 12/01/2022]
Abstract
Every laboratory performing mass-spectrometry-based proteomics strives to generate high-quality data. Among the many factors that impact the outcome of any experiment in proteomics is the LC–MS system performance, which should be monitored within each specific experiment and also long term. This process is termed quality control (QC). We present an easy-to-use tool that rapidly produces a visual, HTML-based report that includes the key parameters needed to monitor the LC–MS system performance, with a focus on monitoring the performance within an experiment. The tool, named RawBeans, generates a report for individual files or for a set of samples from a whole experiment. We anticipate that it will help proteomics users and experts evaluate raw data quality independent of data processing. The tool is available at https://bitbucket.org/incpm/prot-qc/downloads. The mass-spectrometry proteomics data have been deposited to the ProteomeXchange Consortium via the PRIDE partner repository with the data set identifier PXD022816.
Affiliation(s)
- David Morgenstern
- de Botton Institute for Protein Profiling, The Nancy and Stephen Grand Israel National Center for Personalized Medicine, Weizmann Institute of Science, Rehovot 76100, Israel
| | - Rotem Barzilay
- Ilana and Pascal Mantoux Institute for Bioinformatics, The Nancy and Stephen Grand Israel National Center for Personalized Medicine, Weizmann Institute of Science, Rehovot 76100, Israel
| | - Yishai Levin
- de Botton Institute for Protein Profiling, The Nancy and Stephen Grand Israel National Center for Personalized Medicine, Weizmann Institute of Science, Rehovot 76100, Israel
| |
|
11
|
Hackett WE, Zaia J. Calculating Glycoprotein Similarities From Mass Spectrometric Data. Mol Cell Proteomics 2021; 20:100028. [PMID: 32883803 PMCID: PMC8724611 DOI: 10.1074/mcp.r120.002223] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 07/11/2020] [Revised: 08/24/2020] [Accepted: 09/03/2020] [Indexed: 12/23/2022] [Open access]
Abstract
Complex protein glycosylation occurs through biosynthetic steps in the secretory pathway that create macro- and microheterogeneity of structure and function. Required for all life forms, glycosylation diversifies and adapts protein interactions with binding partners that underpin interactions at cell surfaces and pericellular and extracellular environments. Because these biological effects arise from heterogeneity of structure and function, it is necessary to measure their changes as part of the quest to understand nature. Quite often, however, the assumption behind proteomics that posttranslational modifications are discrete additions that can be modeled using the genome as a template does not apply to protein glycosylation. Rather, it is necessary to quantify the glycosylation distribution at each glycosite and to aggregate this information into a population of mature glycoproteins that exist in a given biological system. To date, mass spectrometric methods for assigning singly glycosylated peptides are well-established. But it is necessary to quantify glycosylation heterogeneity accurately in order to gauge the alterations that occur during biological processes. The task is to quantify the glycosylated peptide forms as accurately as possible and then apply appropriate bioinformatics algorithms to the calculation of micro- and macro-similarities. In this review, we summarize current approaches for protein quantification as they apply to this glycoprotein similarity problem.
Affiliation(s)
- William E Hackett
- Bioinformatics Program, Boston University, Boston, Massachusetts, USA
| | - Joseph Zaia
- Bioinformatics Program, Boston University, Boston, Massachusetts, USA; Department of Biochemistry, Boston University, Boston, Massachusetts, USA.
| |
|
12
|
Lombard-Banek C, Schiel JE. Mass Spectrometry Advances and Perspectives for the Characterization of Emerging Adoptive Cell Therapies. Molecules 2020; 25:E1396. [PMID: 32204371 PMCID: PMC7144572 DOI: 10.3390/molecules25061396] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 01/17/2020] [Revised: 03/06/2020] [Accepted: 03/11/2020] [Indexed: 12/12/2022] [Open access]
Abstract
Adoptive cell therapy is an emerging anti-cancer modality, whereby the patient's own immune cells are engineered to express a T-cell receptor (TCR) or chimeric antigen receptor (CAR). CAR-T cell therapies have advanced the furthest, with the Food and Drug Administration recently approving two treatments, Kymriah (tisagenlecleucel) and Yescarta (axicabtagene ciloleucel). Recent developments in proteomic analysis by mass spectrometry (MS) make this technology uniquely suited to enable the comprehensive identification and quantification of the relevant biochemical architecture of CAR-T cell therapies and fulfill current unmet needs for CAR-T product knowledge. These advances include improved sample preparation methods, enhanced separation technologies, and extension of MS-based proteomics to single cells. Innovative technologies such as proteomic analysis of raw material quality attributes (MQA) and final product quality attributes (PQA) may provide insights that could ultimately fuel development strategies and lead to broad implementation.
Affiliation(s)
- Camille Lombard-Banek
- National Institute of Standards and Technology, Gaithersburg, MD 20899, USA;
- Institute for Bioscience and Biotechnology Research, Rockville, MD 20850, USA
| | - John E. Schiel
- National Institute of Standards and Technology, Gaithersburg, MD 20899, USA;
- Institute for Bioscience and Biotechnology Research, Rockville, MD 20850, USA
|
13
|
Solovyeva EM, Lobas AA, Surin AK, Levitsky LI, Gorshkov VA, Gorshkov MV. viQC: Visual and Intuitive Quality Control for Mass Spectrometry-Based Proteome Analysis. JOURNAL OF ANALYTICAL CHEMISTRY 2019. [DOI: 10.1134/s1061934819140119] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/22/2022]
|
14
|
Review of Issues and Solutions to Data Analysis Reproducibility and Data Quality in Clinical Proteomics. Methods Mol Biol 2019. [PMID: 31552637 DOI: 10.1007/978-1-4939-9744-2_15] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2023]
Abstract
In any analytical discipline, data analysis reproducibility is closely interlinked with data quality. In this book chapter focused on mass spectrometry-based proteomics approaches, we introduce how both data analysis reproducibility and data quality can influence each other and how data quality and data analysis designs can be used to increase robustness and improve reproducibility. We first introduce methods and concepts to design and maintain robust data analysis pipelines such that reproducibility can be increased in parallel. The technical aspects related to data analysis reproducibility are challenging, and current ways to increase the overall robustness are multifaceted. Software containerization and cloud infrastructures play an important part. We will also show how quality control (QC) and quality assessment (QA) approaches can be used to spot analytical issues, reduce the experimental variability, and increase confidence in the analytical results of (clinical) proteomics studies, since experimental variability plays a substantial role in analysis reproducibility. Therefore, we give an overview on existing solutions for QC/QA, including different quality metrics, and methods for longitudinal monitoring. The efficient use of both types of approaches undoubtedly provides a way to improve the experimental reliability, reproducibility, and level of consistency in proteomics analytical measurements.
|
15
|
Hou X, Yu M, Liu A, Wang X, Li Y, Liu J, Schnoor JL, Jiang G. Glycosylation of Tetrabromobisphenol A in Pumpkin. ENVIRONMENTAL SCIENCE & TECHNOLOGY 2019; 53:8805-8812. [PMID: 31283198 PMCID: PMC6931399 DOI: 10.1021/acs.est.9b02122] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Indexed: 05/13/2023]
Abstract
Tetrabromobisphenol A (TBBPA) is the most widely used brominated flame retardant (BFR), and it bioaccumulates throughout the food chains. Its fate in the first trophic level, plants, is of special interest. In this study, a four-day hydroponic exposure of TBBPA at a concentration of 1 μmol L⁻¹ to pumpkin seedlings was conducted. A nontarget screening method for hydrophilic bromine-containing metabolites was modified, based on both typical isotope patterns of bromine and mass defect, and used to process mass spectra data. A total of 20 glycosylation and malonyl glycosylation metabolites were found for TBBPA in the pumpkin plants. Representative glycosyl TBBPA reference standards were synthesized to evaluate the contribution of this glycosylation process. Approximately 86% of parent TBBPA was metabolized to form those 20 glycosyl TBBPAs, showing that glycosylation was the most dominant metabolism pathway for TBBPA in pumpkin at the tested exposure concentration.
Affiliation(s)
- Xingwang Hou
- State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, P.O. Box 2871, Beijing, 100085, China
- College of Resources and Environment, University of Chinese Academy of Sciences, Beijing, 100049, China
| | - Miao Yu
- Department of Environmental Medical and Public Health, Icahn School of Medicine at Mount Sinai, New York, New York 10029, United States
| | - Aifeng Liu
- CAS Key Laboratory of Biobased Materials, Qingdao Institute of Bioenergy and Bioprocess Technology, Chinese Academy of Sciences, Qingdao, 266101, China
| | - Xiaoyun Wang
- State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, P.O. Box 2871, Beijing, 100085, China
- College of Resources and Environment, University of Chinese Academy of Sciences, Beijing, 100049, China
| | - Yanlin Li
- Department of Civil and Environmental Engineering, University of Iowa, Iowa City, Iowa 52242, United States
| | - Jiyan Liu
- State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, P.O. Box 2871, Beijing, 100085, China
- College of Resources and Environment, University of Chinese Academy of Sciences, Beijing, 100049, China
- Corresponding Author: Phone: 8610-62849334; fax: 8610-62849339;
| | - Jerald L. Schnoor
- Department of Civil and Environmental Engineering, University of Iowa, Iowa City, Iowa 52242, United States
| | - Guibin Jiang
- State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, P.O. Box 2871, Beijing, 100085, China
- College of Resources and Environment, University of Chinese Academy of Sciences, Beijing, 100049, China
|
16
|
Kim T, Chen IR, Parker BL, Humphrey SJ, Crossett B, Cordwell SJ, Yang P, Yang JYH. QCMAP: An Interactive Web-Tool for Performance Diagnosis and Prediction of LC-MS Systems. Proteomics 2019; 19:e1900068. [PMID: 31099962 DOI: 10.1002/pmic.201900068] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 02/18/2019] [Revised: 05/07/2019] [Indexed: 01/04/2023]
Abstract
The increasing role played by liquid chromatography-mass spectrometry (LC-MS)-based proteomics in biological discovery has led to a growing need for quality control (QC) on the LC-MS systems. While numerous quality control tools have been developed to track the performance of LC-MS systems based on a pre-defined set of performance factors (e.g., mass error, retention time), the precise influence and contribution of the performance factors and their generalization property to different biological samples are not as well characterized. Here, a web-based application (QCMAP) is developed for interactive diagnosis and prediction of the performance of LC-MS systems across different biological sample types. Leveraging on a standardized HeLa cell sample run as QC within a multi-user facility, predictive models are trained on a panel of commonly used performance factors to pinpoint the precise conditions to a (un)satisfactory performance in three LC-MS systems. It is demonstrated that the learned model can be applied to predict LC-MS system performance for brain samples generated from an independent study. By compiling these predictive models into our web-application, QCMAP allows users to benchmark the performance of their LC-MS systems using their own samples and identify key factors for instrument optimization. QCMAP is freely available from: http://shiny.maths.usyd.edu.au/QCMAP/.
Affiliation(s)
- Taiyun Kim
- School of Mathematics and Statistics, University of Sydney, NSW, 2006, Australia; Judith and David Coffey Life Lab, Charles Perkins Centre, University of Sydney, NSW, 2006, Australia
| | - Irene Rui Chen
- School of Mathematics and Statistics, University of Sydney, NSW, 2006, Australia; Judith and David Coffey Life Lab, Charles Perkins Centre, University of Sydney, NSW, 2006, Australia
| | - Benjamin L Parker
- School of Life and Environmental Sciences, University of Sydney, NSW, 2006, Australia
| | - Sean J Humphrey
- School of Life and Environmental Sciences, University of Sydney, NSW, 2006, Australia
| | - Ben Crossett
- Sydney Mass Spectrometry, University of Sydney, NSW, 2006, Australia
| | - Stuart J Cordwell
- School of Life and Environmental Sciences, University of Sydney, NSW, 2006, Australia; Sydney Mass Spectrometry, University of Sydney, NSW, 2006, Australia
| | - Pengyi Yang
- School of Mathematics and Statistics, University of Sydney, NSW, 2006, Australia; Computational Systems Biology Group, Children's Medical Research Institute, Faculty of Medicine and Health, University of Sydney, Westmead, NSW, 2145, Australia
| | - Jean Yee Hwa Yang
- School of Mathematics and Statistics, University of Sydney, NSW, 2006, Australia; Judith and David Coffey Life Lab, Charles Perkins Centre, University of Sydney, NSW, 2006, Australia
|
17
|
Dogu E, Taheri SM, Olivella R, Marty F, Lienert I, Reiter L, Sabido E, Vitek O. MSstatsQC 2.0: R/Bioconductor Package for Statistical Quality Control of Mass Spectrometry-Based Proteomics Experiments. J Proteome Res 2018; 18:678-686. [DOI: 10.1021/acs.jproteome.8b00732] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Indexed: 11/30/2022]
Affiliation(s)
- Eralp Dogu
- Department of Statistics, Muğla Sitki Koçman University, Muğla 48000, Turkey
| | - Sara Mohammad Taheri
- College of Computer Science, Northeastern University, Boston, Massachusetts 02115, United States
| | - Roger Olivella
- Proteomics Unit, Centre de Regulació Genòmica, Barcelona Institute of Science and Technology, Universitat Pompeu Fabra, 08002 Barcelona, Spain
| | | | | | | | - Eduard Sabido
- Proteomics Unit, Centre de Regulació Genòmica, Barcelona Institute of Science and Technology, Universitat Pompeu Fabra, 08002 Barcelona, Spain
| | - Olga Vitek
- College of Computer Science, Northeastern University, Boston, Massachusetts 02115, United States
|
18
|
Stanfill BA, Nakayasu ES, Bramer LM, Thompson AM, Ansong CK, Clauss TR, Gritsenko MA, Monroe ME, Moore RJ, Orton DJ, Piehowski PD, Schepmoes AA, Smith RD, Webb-Robertson BJM, Metz TO. Quality Control Analysis in Real-time (QC-ART): A Tool for Real-time Quality Control Assessment of Mass Spectrometry-based Proteomics Data. Mol Cell Proteomics 2018; 17:1824-1836. [PMID: 29666158 PMCID: PMC6126382 DOI: 10.1074/mcp.ra118.000648] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.0] [Received: 01/31/2018] [Revised: 03/13/2018] [Indexed: 12/29/2022] Open
Abstract
Liquid chromatography-mass spectrometry (LC-MS)-based proteomics studies of large sample cohorts can easily require from months to years to complete. Acquiring consistent, high-quality data in such large-scale studies is challenging because of normal variations in instrumentation performance over time, as well as artifacts introduced by the samples themselves, such as those because of collection, storage and processing. Existing quality control methods for proteomics data primarily focus on post-hoc analysis to remove low-quality data that would degrade downstream statistics; they are not designed to evaluate the data in near real-time, which would allow for interventions as soon as deviations in data quality are detected. In addition to flagging analyses that demonstrate outlier behavior, evaluating how the data structure changes over time can aid in understanding typical instrument performance or identify issues such as a degradation in data quality because of the need for instrument cleaning and/or re-calibration. To address this gap for proteomics, we developed Quality Control Analysis in Real-Time (QC-ART), a tool for evaluating data as they are acquired to dynamically flag potential issues with instrument performance or sample quality. QC-ART has similar accuracy as standard post-hoc analysis methods with the additional benefit of real-time analysis. We demonstrate the utility and performance of QC-ART in identifying deviations in data quality because of both instrument and sample issues in near real-time for LC-MS-based plasma proteomics analyses of a sample subset of The Environmental Determinants of Diabetes in the Young cohort. We also present a case where QC-ART facilitated the identification of oxidative modifications, which are often underappreciated in proteomic experiments.
Affiliation(s)
| | | | - Lisa M Bramer
- Computational and Statistical Analytics Division
| | - Allison M Thompson
- Environmental and Molecular Sciences Laboratory, 902 Battelle Blvd, Pacific Northwest National Laboratory, Richland, Washington
|
19
|
Martínez-Bartolomé S, Medina-Aunon JA, López-García MÁ, González-Tejedo C, Prieto G, Navajas R, Salazar-Donate E, Fernández-Costa C, Yates JR, Albar JP. PACOM: A Versatile Tool for Integrating, Filtering, Visualizing, and Comparing Multiple Large Mass Spectrometry Proteomics Data Sets. J Proteome Res 2018; 17:1547-1558. [DOI: 10.1021/acs.jproteome.7b00858] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 02/03/2023]
Affiliation(s)
- Salvador Martínez-Bartolomé
- Proteomics Laboratory, National Center for Biotechnology, CSIC, Madrid 28049, Spain
- Department of Chemical Physiology, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, California 92037, United States
| | | | | | | | - Gorka Prieto
- Department of Communications Engineering, University of the Basque Country (UPV/EHU), Bilbao 48013, Spain
| | - Rosana Navajas
- Proteomics Laboratory, National Center for Biotechnology, CSIC, Madrid 28049, Spain
| | | | - Carolina Fernández-Costa
- Department of Chemical Physiology, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, California 92037, United States
- Immunology, Centro de Investigaciones Biomédicas (CINBIO), Centro singular de Investigación de Galicia: Instituto de Investigación Sanitaria Galicia Sur (IIS-GS), University of Vigo, Campus Universitario, s/n, Vigo 36310, Spain
| | - John R. Yates
- Department of Chemical Physiology, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, California 92037, United States
| | - Juan Pablo Albar
- Proteomics Laboratory, National Center for Biotechnology, CSIC, Madrid 28049, Spain
|
20
|
Chiva C, Olivella R, Borràs E, Espadas G, Pastor O, Solé A, Sabidó E. QCloud: A cloud-based quality control system for mass spectrometry-based proteomics laboratories. PLoS One 2018; 13:e0189209. [PMID: 29324744 PMCID: PMC5764250 DOI: 10.1371/journal.pone.0189209] [Citation(s) in RCA: 96] [Impact Index Per Article: 16.0] [Received: 04/26/2017] [Accepted: 11/21/2017] [Indexed: 01/03/2023] Open
Abstract
The increasing number of biomedical and translational applications in mass spectrometry-based proteomics poses new analytical challenges and raises the need for automated quality control systems. Despite previous efforts to set standard file formats, data processing workflows and key evaluation parameters for quality control, automated quality control systems are not yet widespread among proteomics laboratories, which limits the acquisition of high-quality results, inter-laboratory comparisons and the assessment of variability of instrumental platforms. Here we present QCloud, a cloud-based system to support proteomics laboratories in daily quality assessment using a user-friendly interface, easy setup, automated data processing and archiving, and unbiased instrument evaluation. QCloud supports the most common targeted and untargeted proteomics workflows, it accepts data formats from different vendors and it enables the annotation of acquired data and reporting incidences. A complete version of the QCloud system has successfully been developed and it is now open to the proteomics community (http://qcloud.crg.eu). QCloud system is an open source project, publicly available under a Creative Commons License Attribution-ShareAlike 4.0.
Affiliation(s)
- Cristina Chiva
- Proteomics Unit, Centre de Regulació Genòmica (CRG), Barcelona Institute of Science and Technology (BIST), Barcelona, Barcelona
- Universitat Pompeu Fabra (UPF), Barcelona, Barcelona
| | - Roger Olivella
- Proteomics Unit, Centre de Regulació Genòmica (CRG), Barcelona Institute of Science and Technology (BIST), Barcelona, Barcelona
- Universitat Pompeu Fabra (UPF), Barcelona, Barcelona
| | - Eva Borràs
- Proteomics Unit, Centre de Regulació Genòmica (CRG), Barcelona Institute of Science and Technology (BIST), Barcelona, Barcelona
- Universitat Pompeu Fabra (UPF), Barcelona, Barcelona
| | - Guadalupe Espadas
- Proteomics Unit, Centre de Regulació Genòmica (CRG), Barcelona Institute of Science and Technology (BIST), Barcelona, Barcelona
- Universitat Pompeu Fabra (UPF), Barcelona, Barcelona
| | - Olga Pastor
- Proteomics Unit, Centre de Regulació Genòmica (CRG), Barcelona Institute of Science and Technology (BIST), Barcelona, Barcelona
- Universitat Pompeu Fabra (UPF), Barcelona, Barcelona
| | - Amanda Solé
- Proteomics Unit, Centre de Regulació Genòmica (CRG), Barcelona Institute of Science and Technology (BIST), Barcelona, Barcelona
- Universitat Pompeu Fabra (UPF), Barcelona, Barcelona
| | - Eduard Sabidó
- Proteomics Unit, Centre de Regulació Genòmica (CRG), Barcelona Institute of Science and Technology (BIST), Barcelona, Barcelona
- Universitat Pompeu Fabra (UPF), Barcelona, Barcelona
|
21
|
Dogu E, Mohammad-Taheri S, Abbatiello SE, Bereman MS, MacLean B, Schilling B, Vitek O. MSstatsQC: Longitudinal System Suitability Monitoring and Quality Control for Targeted Proteomic Experiments. Mol Cell Proteomics 2017; 16:1335-1347. [PMID: 28483925 PMCID: PMC5500765 DOI: 10.1074/mcp.m116.064774] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Received: 10/15/2016] [Revised: 04/12/2017] [Indexed: 01/14/2023] Open
Abstract
Selected Reaction Monitoring (SRM) is a powerful tool for targeted detection and quantification of peptides in complex matrices. An important objective of SRM is to obtain peptide quantifications that are (1) suitable for the investigation, and (2) reproducible across laboratories and runs. The first objective is achieved by system suitability tests (SST), which verify that mass spectrometric instrumentation performs as specified. The second objective is achieved by quality control (QC), which provides in-process quality assurance of the sample profile. A common aspect of SST and QC is the longitudinal nature of the data. Although SST and QC have received a lot of attention in the proteomic community, the currently used statistical methods are limited. This manuscript improves upon the statistical methodology for SST and QC that is currently used in proteomics. It adapts the modern methods of longitudinal statistical process control, such as simultaneous and time weighted control charts and change point analysis, to SST and QC of SRM experiments, discusses their advantages, and provides practical guidelines. Evaluations on simulated data sets, and on data sets from the Clinical Proteomics Technology Assessment for Cancer (CPTAC) consortium, demonstrated that these methods substantially improve our ability of real time monitoring, early detection and prevention of chromatographic and instrumental problems. We implemented the methods in an open-source R-based software package MSstatsQC and its web-based graphical user interface. They are available for use stand-alone, or for integration with automated pipelines. Although the examples focus on targeted proteomics, the statistical methods in this manuscript apply more generally to quantitative proteomics.
Affiliation(s)
- Eralp Dogu
- College of Computer and Information Science, Northeastern University, Massachusetts 02115
- College of Science, Mugla Sitki Kocman University 48000, Turkey
| | - Sara Mohammad-Taheri
- College of Computer and Information Science, Northeastern University, Massachusetts 02115
| | | | - Michael S Bereman
- Department of Biological Sciences, Center for Human Health and the Environment, North Carolina State University, Raleigh, North Carolina 27695
| | - Brendan MacLean
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195
| | - Birgit Schilling
- Buck Institute for Research on Aging, Novato, California 94945
| | - Olga Vitek
- College of Computer and Information Science, Northeastern University, Massachusetts 02115
- College of Science, Northeastern University, Massachusetts 02115
|
22
|
Bittremieux W, Walzer M, Tenzer S, Zhu W, Salek RM, Eisenacher M, Tabb DL. The Human Proteome Organization-Proteomics Standards Initiative Quality Control Working Group: Making Quality Control More Accessible for Biological Mass Spectrometry. Anal Chem 2017; 89:4474-4479. [PMID: 28318237 DOI: 10.1021/acs.analchem.6b04310] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Indexed: 01/23/2023]
Abstract
To have confidence in results acquired during biological mass spectrometry experiments, a systematic approach to quality control is of vital importance. Nonetheless, until now, only scattered initiatives have been undertaken to this end, and these individual efforts have often not been complementary. To address this issue, the Human Proteome Organization-Proteomics Standards Initiative has established a new working group on quality control at its meeting in the spring of 2016. The goal of this working group is to provide a unifying framework for quality control data. The initial focus will be on providing a community-driven standardized file format for quality control. For this purpose, the previously proposed qcML format will be adapted to support a variety of use cases for both proteomics and metabolomics applications, and it will be established as an official PSI format. An important consideration is to avoid enforcing restrictive requirements on quality control but instead provide the basic technical necessities required to support extensive quality control for any type of mass spectrometry-based workflow. We want to emphasize that this is an open community effort, and we seek participation from all scientists with an interest in this field.
Affiliation(s)
- Wout Bittremieux
- Department of Mathematics and Computer Science, University of Antwerp, Middelheimlaan 1, 2020 Antwerp, Belgium; Biomedical Informatics Research Center Antwerp (biomina), University of Antwerp/Antwerp University Hospital, Wilrijkstraat 10, 2650 Edegem, Belgium
| | - Mathias Walzer
- Department of Computer Science, University of Tübingen, Tübingen 72076, Germany; Center for Bioinformatics, University of Tübingen, Tübingen 72074, Germany
| | - Stefan Tenzer
- Institute for Immunology, University Medical Center of the Johannes-Gutenberg University Mainz D 55131, Germany
| | - Weimin Zhu
- National Center for Protein Science, No. 38, Science Park Road, Changping District, Beijing 102206, China
| | - Reza M Salek
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
| | - Martin Eisenacher
- Medical Bioinformatics, Medizinisches Proteom-Center, Ruhr-University Bochum, Bochum 44801, Germany
| | - David L Tabb
- Division of Molecular Biology and Human Genetics, Stellenbosch University Faculty of Medicine and Health Sciences, Tygerberg Hospital, Francie Van Zijl Drive, Cape Town 7505, South Africa
|
23
|
Abstract
Because proteomics experiments are so complex they can readily fail, and do so without clear cause. Using standard experimental design techniques and incorporating quality control can greatly increase the chances of success. This chapter introduces the relevant concepts and provides examples specific to proteomic workflows. Applying these notions to design successful proteomics experiments is straightforward. It can help identify failure causes and greatly increase the likelihood of inter-laboratory reproducibility.
Affiliation(s)
- Daniel Ruderman
- Lawrence J. Ellison Institute for Transformative Medicine of USC, Keck School of Medicine of USC, 2250 Alcazar St. CSC-240, Los Angeles, CA, 90033, USA.
|
24
|
Abstract
Data quality assessment is important for reproducibility of proteomics experiments and reusability of proteomics data. We describe a set of statistical tools to routinely visualize and examine the quality control (QC) metrics obtained for raw LC-MS/MS data on different instrument types and mass spectrometers. The QC metrics used here are the identification free QuaMeter metrics. Statistical assessments introduced include (a) principal component analysis, (b) dissimilarity measures, (c) T²-chart for quality control, and (d) change point analysis. We demonstrate the workflow by a step-by-step assessment of a subset of Study 5 for the Clinical Proteomics Technology Assessment for Cancer (CPTAC) using our R functions.
Affiliation(s)
- Xia Wang
- Department of Mathematical Sciences, University of Cincinnati, 2815 Commons Way, Cincinnati, OH, 45221-0025, USA.
|
25
|
Bittremieux W, Valkenborg D, Martens L, Laukens K. Computational quality control tools for mass spectrometry proteomics. Proteomics 2016; 17. [DOI: 10.1002/pmic.201600159] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.4] [Received: 06/30/2016] [Revised: 07/28/2016] [Accepted: 08/19/2016] [Indexed: 12/30/2022]
Affiliation(s)
- Wout Bittremieux
- Department of Mathematics and Computer Science; University of Antwerp; Antwerp Belgium
- Biomedical Informatics Research Center Antwerp (biomina); University of Antwerp/Antwerp, University Hospital; Edegem Belgium
| | - Dirk Valkenborg
- Flemish Institute for Technological Research (VITO); Mol Belgium
- CFP; University of Antwerp; Antwerp Belgium
- I-BioStat; Hasselt University; Diepenbeek Belgium
| | - Lennart Martens
- Medical Biotechnology Center; VIB; Ghent Belgium
- Department of Biochemistry, Faculty of Medicine and Health Sciences; Ghent University; Ghent Belgium
- Bioinformatics Institute Ghent; Ghent University; Zwijnaarde Belgium
| | - Kris Laukens
- Department of Mathematics and Computer Science; University of Antwerp; Antwerp Belgium
- Biomedical Informatics Research Center Antwerp (biomina); University of Antwerp/Antwerp, University Hospital; Edegem Belgium
|
26
|
Bereman MS, Beri J, Sharma V, Nathe C, Eckels J, MacLean B, MacCoss MJ. An Automated Pipeline to Monitor System Performance in Liquid Chromatography-Tandem Mass Spectrometry Proteomic Experiments. J Proteome Res 2016; 15:4763-4769. [PMID: 27700092 DOI: 10.1021/acs.jproteome.6b00744] [Citation(s) in RCA: 45] [Impact Index Per Article: 5.6] [Indexed: 12/12/2022]
Abstract
We report the development of a completely automated pipeline to monitor system suitability in bottom-up proteomic experiments. LC-MS/MS runs are automatically imported into Skyline and multiple identification-free metrics are extracted from targeted peptides. These data are then uploaded to the Panorama Skyline document repository where metrics can be viewed in a web-based interface using powerful process control techniques, including Levey-Jennings and Pareto plots. The interface is versatile and takes user input, which allows the user significant control over the visualization of the data. The pipeline is vendor and instrument-type neutral, supports multiple acquisition techniques (e.g., MS1 filtering, data-independent acquisition, parallel reaction monitoring, and selected reaction monitoring), can track performance of multiple instruments, and requires no manual intervention aside from initial setup. Data can be viewed from any computer with Internet access and a web browser, facilitating sharing of QC data between researchers. Herein, we describe the use of this pipeline, termed Panorama AutoQC, to evaluate LC-MS/MS performance in a range of scenarios including identification of suboptimal instrument performance, evaluation of ultrahigh pressure chromatography, and identification of the major sources of variation throughout years of peptide data collection.
Affiliation(s)
| | | | - Vagisha Sharma
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, United States
| | - Cory Nathe
- LabKey Software, Seattle, Washington 98109, United States
| | - Josh Eckels
- LabKey Software, Seattle, Washington 98109, United States
| | - Brendan MacLean
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, United States
| | - Michael J MacCoss
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, United States
|
27
|
Avtonomov D, Raskind A, Nesvizhskii AI. BatMass: a Java Software Platform for LC-MS Data Visualization in Proteomics and Metabolomics. J Proteome Res 2016; 15:2500-9. [PMID: 27306858 PMCID: PMC5583644 DOI: 10.1021/acs.jproteome.6b00021] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.4] [Indexed: 12/22/2022]
Abstract
Mass spectrometry (MS) coupled to liquid chromatography (LC) is a commonly used technique in metabolomic and proteomic research. As the size and complexity of LC-MS-based experiments grow, it becomes increasingly more difficult to perform quality control of both raw data and processing results. In a practical setting, quality control steps for raw LC-MS data are often overlooked, and assessment of an experiment's success is based on some derived metrics such as "the number of identified compounds". The human brain interprets visual data much better than plain text, hence the saying "a picture is worth a thousand words". Here, we present the BatMass software package, which allows for performing quick quality control of raw LC-MS data through its fast visualization capabilities. It also serves as a testbed for developers of LC-MS data processing algorithms by providing a data access library for open mass spectrometry file formats and a means of visually mapping processing results back to the original data. We illustrate the utility of BatMass with several use cases of quality control and data exploration.
Affiliation(s)
- Dmitry Avtonomov
- Department of Pathology, University of Michigan, Ann Arbor, MI 48109
| | | | - Alexey I. Nesvizhskii
- Department of Pathology, University of Michigan, Ann Arbor, MI 48109
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109
28
Martens L. Public proteomics data: How the field has evolved from sceptical inquiry to the promise of in silico proteomics. EuPA Open Proteomics 2016; 11:42-44. [PMID: 29900110 PMCID: PMC5988554 DOI: 10.1016/j.euprot.2016.02.005] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 12/12/2015] [Revised: 02/13/2016] [Accepted: 02/15/2016] [Indexed: 12/23/2022]
Abstract
Proteomics data sharing moved from validation to re-use. New tools and services make data very easily accessible. Metadata provision can still benefit from improvements. Quality control metrics will soon be reported along with submitted data. Data re-use will enable the advent of actual in silico proteomics.
Affiliation(s)
- Lennart Martens
- Department of Medical Protein Research, VIB, 9000 Ghent, Belgium; Department of Biochemistry, Ghent University, 9000 Ghent, Belgium; Bioinformatics Institute Ghent, Ghent University, 9000 Ghent, Belgium
29
Bittremieux W, Meysman P, Martens L, Valkenborg D, Laukens K. Unsupervised Quality Assessment of Mass Spectrometry Proteomics Experiments by Multivariate Quality Control Metrics. J Proteome Res 2016; 15:1300-7. [PMID: 26974716 DOI: 10.1021/acs.jproteome.6b00028] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Indexed: 01/09/2023]
Abstract
Despite many technological and computational advances, the results of a mass spectrometry proteomics experiment are still subject to a large variability. For the understanding and evaluation of how technical variability affects the results of an experiment, several computationally derived quality control metrics have been introduced. However, despite the availability of these metrics, a systematic approach to quality control is often still lacking because the metrics are not fully understood and are hard to interpret. Here, we present a toolkit of powerful techniques to analyze and interpret multivariate quality control metrics to assess the quality of mass spectrometry proteomics experiments. We show how unsupervised techniques applied to these quality control metrics can provide an initial discrimination between low-quality experiments and high-quality experiments prior to manual investigation. Furthermore, we provide a technique to obtain detailed information on the quality control metrics that are related to the decreased performance, which can be used as actionable information to improve the experimental setup. Our toolkit is released as open-source and can be downloaded from https://bitbucket.org/proteinspector/qc_analysis/ .
Affiliation(s)
- Wout Bittremieux
- Department of Mathematics and Computer Science, University of Antwerp, 2020 Antwerp, Belgium; Biomedical Informatics Research Center Antwerp (biomina), University of Antwerp/Antwerp University Hospital, 2650 Edegem, Belgium
- Pieter Meysman
- Department of Mathematics and Computer Science, University of Antwerp, 2020 Antwerp, Belgium; Biomedical Informatics Research Center Antwerp (biomina), University of Antwerp/Antwerp University Hospital, 2650 Edegem, Belgium
- Lennart Martens
- Department of Medical Protein Research, VIB, 9000 Ghent, Belgium; Department of Biochemistry, Faculty of Medicine and Health Sciences, Ghent University, 9000 Ghent, Belgium; Bioinformatics Institute Ghent, Ghent University, 9000 Ghent, Belgium
- Dirk Valkenborg
- Flemish Institute for Technological Research (VITO), 2400 Mol, Belgium; CFP, University of Antwerp, 2020 Antwerp, Belgium; I-BioStat, Hasselt University, 3590 Diepenbeek, Belgium
- Kris Laukens
- Department of Mathematics and Computer Science, University of Antwerp, 2020 Antwerp, Belgium; Biomedical Informatics Research Center Antwerp (biomina), University of Antwerp/Antwerp University Hospital, 2650 Edegem, Belgium
30
Bielow C, Mastrobuoni G, Kempa S. Proteomics Quality Control: Quality Control Software for MaxQuant Results. J Proteome Res 2015; 15:777-87. [PMID: 26653327 DOI: 10.1021/acs.jproteome.5b00780] [Citation(s) in RCA: 118] [Impact Index Per Article: 13.1] [Indexed: 11/29/2022]
Abstract
Mass spectrometry-based proteomics coupled to liquid chromatography has matured into an automatized, high-throughput technology, producing data on the scale of multiple gigabytes per instrument per day. Consequently, an automated quality control (QC) and quality analysis (QA) capable of detecting measurement bias, verifying consistency, and avoiding propagation of error is paramount for instrument operators and scientists in charge of downstream analysis. We have developed an R-based QC pipeline called Proteomics Quality Control (PTXQC) for bottom-up LC-MS data generated by the MaxQuant software pipeline. PTXQC creates a QC report containing a comprehensive and powerful set of QC metrics, augmented with automated scoring functions. The automated scores are collated to create an overview heatmap at the beginning of the report, giving valuable guidance also to nonspecialists. Our software supports a wide range of experimental designs, including stable isotope labeling by amino acids in cell culture (SILAC), tandem mass tags (TMT), and label-free data. Furthermore, we introduce new metrics to score MaxQuant's Match-between-runs (MBR) functionality by which peptide identifications can be transferred across Raw files based on accurate retention time and m/z. Last but not least, PTXQC is easy to install and use and represents the first QC software capable of processing MaxQuant result tables. PTXQC is freely available at https://github.com/cbielow/PTXQC .
Affiliation(s)
- Chris Bielow
- Max-Delbrück-Centrum for Molecular Medicine Berlin, Robert-Rössle-Straße 10, 13125 Berlin, Germany; Berlin Institute of Health, Kapelle-Ufer 2, 10117 Berlin, Germany
- Guido Mastrobuoni
- Max-Delbrück-Centrum for Molecular Medicine Berlin, Robert-Rössle-Straße 10, 13125 Berlin, Germany
- Stefan Kempa
- Max-Delbrück-Centrum for Molecular Medicine Berlin, Robert-Rössle-Straße 10, 13125 Berlin, Germany; Berlin Institute of Health, Kapelle-Ufer 2, 10117 Berlin, Germany
31
Beri J, Rosenblatt MM, Strauss E, Urh M, Bereman MS. Reagent for Evaluating Liquid Chromatography–Tandem Mass Spectrometry (LC-MS/MS) Performance in Bottom-Up Proteomic Experiments. Anal Chem 2015; 87:11635-40. [DOI: 10.1021/acs.analchem.5b04121] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Indexed: 11/28/2022]
Affiliation(s)
- Joshua Beri
- Department of Chemistry, North Carolina State University, Raleigh, North Carolina 27695, United States
- Michael M. Rosenblatt
- Promega Corporation, 2800 Woods Hollow Road, Madison, Wisconsin 53711, United States
- Ethan Strauss
- Promega Corporation, 2800 Woods Hollow Road, Madison, Wisconsin 53711, United States
- Marjeta Urh
- Promega Corporation, 2800 Woods Hollow Road, Madison, Wisconsin 53711, United States
- Michael S. Bereman
- Department of Chemistry, North Carolina State University, Raleigh, North Carolina 27695, United States
- Department of Biological Sciences, North Carolina State University, Raleigh, North Carolina 27695, United States
32
Bennett KL, Wang X, Bystrom CE, Chambers MC, Andacht TM, Dangott LJ, Elortza F, Leszyk J, Molina H, Moritz RL, Phinney BS, Thompson JW, Bunger MK, Tabb DL. The 2012/2013 ABRF Proteomic Research Group Study: Assessing Longitudinal Intralaboratory Variability in Routine Peptide Liquid Chromatography Tandem Mass Spectrometry Analyses. Mol Cell Proteomics 2015; 14:3299-309. [PMID: 26435129 DOI: 10.1074/mcp.o115.051888] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 06/01/2015] [Indexed: 11/06/2022]
Abstract
Questions concerning longitudinal data quality and reproducibility of proteomic laboratories spurred the Protein Research Group of the Association of Biomolecular Resource Facilities (ABRF-PRG) to design a study to systematically assess the reproducibility of proteomic laboratories over an extended period of time. Developed as an open study, initially 64 participants were recruited from the broader mass spectrometry community to analyze provided aliquots of a six bovine protein tryptic digest mixture every month for a period of nine months. Data were uploaded to a central repository, and the operators answered an accompanying survey. Ultimately, 45 laboratories submitted a minimum of eight LC-MSMS raw data files collected in data-dependent acquisition (DDA) mode. No standard operating procedures were enforced; rather the participants were encouraged to analyze the samples according to usual practices in the laboratory. Unlike previous studies, this investigation was not designed to compare laboratories or instrument configuration, but rather to assess the temporal intralaboratory reproducibility. The outcome of the study was reassuring with 80% of the participating laboratories performing analyses at a medium to high level of reproducibility and quality over the 9-month period. For the groups that had one or more outlying experiments, the major contributing factor that correlated to the survey data was the performance of preventative maintenance prior to the LC-MSMS analyses. Thus, the Protein Research Group of the Association of Biomolecular Resource Facilities recommends that laboratories closely scrutinize the quality control data following such events. Additionally, improved quality control recording is imperative. This longitudinal study provides evidence that mass spectrometry-based proteomics is reproducible. When quality control measures are strictly adhered to, such reproducibility is comparable among many disparate groups. 
Data from the study are available via ProteomeXchange under the accession code PXD002114.
Affiliation(s)
- Keiryn L Bennett
- CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, Vienna, Austria
- Xia Wang
- University of Cincinnati, Department of Mathematical Sciences, University of Cincinnati, Cincinnati, Ohio, 45221-0025
- Cory E Bystrom
- Cleveland HeartLab, Inc., Research and Development, Cleveland HeartLab, Inc., Cleveland, Ohio, 44103
- Matthew C Chambers
- Vanderbilt University, Department of Biomedical Informatics, Vanderbilt University, Nashville, Tennessee, 37232
- Tracy M Andacht
- Centers for Disease Control and Prevention, Emergency Response Branch, Division of Laboratory Sciences, National Center for Environmental Health, Centers for Disease Control and Prevention, Atlanta, Georgia, 30341
- Larry J Dangott
- Texas A&M University, Department of Biochemistry & Biophysics, Texas A&M University, College Station, Texas, 77843
- Félix Elortza
- CIC bioGUNE, Centro de Investigacion Cooperativa en Biociencias, ProteoRed-ISCIII, Bilbao, Spain
- John Leszyk
- University of Massachusetts, Department of Biochemistry and Molecular Pharmacology Proteomics and Mass Spectrometry Facility, University of Massachusetts Medical School, Shrewsbury, Massachusetts, 01545
- Henrik Molina
- The Rockefeller University, Proteomics Resource Center, The Rockefeller University, New York, New York, 10065
- Brett S Phinney
- University of California, Davis, Proteomics Core, University of California-Davis Genome Center, Davis, California, 95616
- J Will Thompson
- Duke University, Proteomics and Metabolomics Core Facility, Duke University Medical Center, Durham, North Carolina, 27708
- David L Tabb
- Vanderbilt University, Department of Biomedical Informatics, Vanderbilt University, Nashville, Tennessee, 37232
33
Abstract
The urinary proteome is the focus of many studies due to the ease of urine collection and the relative proteome stability. Systems biology allows the combination of multiple omics studies, forming a link between proteomics, metabolomics, genomics and transcriptomics. In-depth data interpretation is achieved by bioinformatics analysis of -omics data sets. It is expected that the contribution of systems biology to the study of the urinary proteome will offer novel insights. The main focus of this review is on technical aspects of proteomics studies, available tools for systems biology analysis and the application of urinary proteomics in clinical studies and systems biology.
34
Campos A, Díaz R, Martínez-Bartolomé S, Sierra J, Gallardo O, Sabidó E, López-Lucendo M, Ignacio Casal J, Pasquarello C, Scherl A, Chiva C, Borras E, Odena A, Elortza F, Azkargorta M, Ibarrola N, Canals F, Albar JP, Oliveira E. Multicenter experiment for quality control of peptide-centric LC-MS/MS analysis - A longitudinal performance assessment with nLC coupled to orbitrap MS analyzers. J Proteomics 2015; 127:264-74. [PMID: 25982386 DOI: 10.1016/j.jprot.2015.05.012] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Received: 02/03/2015] [Revised: 05/07/2015] [Accepted: 05/11/2015] [Indexed: 11/19/2022]
Abstract
Proteomic technologies based on mass spectrometry (MS) have greatly evolved in the past years, and nowadays it is possible to routinely identify thousands of peptides from complex biological samples in a single LC-MS/MS experiment. Despite the advancements in proteomic technologies, the scientific community still faces important challenges in terms of depth and reproducibility of proteomics analyses. Here, we present a multicenter study designed to evaluate long-term performance of LC-MS/MS platforms within the Spanish Proteomics Facilities Network (ProteoRed-ISCIII). The study was performed under well-established standard operating procedures, and demonstrated that it is possible to attain qualitative and quantitative reproducibility over time. Our study highlights the importance of deploying quality assessment metrics routinely in individual laboratories and in multi-laboratory studies. The mass spectrometry data have been deposited to the ProteomeXchange Consortium with the data set identifier PXD000205. This article is part of a Special Issue entitled: HUPO 2014.
Affiliation(s)
- Alex Campos
- ProteoRed-ISCIII, Proteomics Platform, Barcelona Science Park, Barcelona, Spain; Integromics, Madrid, Spain.
- Ramón Díaz
- ProteoRed-ISCIII, Proteomics Platform, Barcelona Science Park, Barcelona, Spain
- Oscar Gallardo
- ProteoRed-ISCIII, CSIC/UAB Proteomics Laboratory, Instituto de Investigaciones Biomédicas de Barcelona, Spanish National Research Council, Barcelona, Spain
- Eduard Sabidó
- ProteoRed-ISCIII, Proteomics Unit, Universitat Pompeu Fabra (UPF) and Centre de Regulació Genòmica (CRG), Barcelona, Spain
- Maria López-Lucendo
- ProteoRed-ISCIII, Proteomics Facility and Functional Proteomics Laboratory, Centro de Investigaciones Biológicas, Madrid, Spain
- J Ignacio Casal
- ProteoRed-ISCIII, Proteomics Facility and Functional Proteomics Laboratory, Centro de Investigaciones Biológicas, Madrid, Spain
- Alexander Scherl
- Department of Human Protein Sciences, CMU, University of Geneva, Switzerland
- Cristina Chiva
- ProteoRed-ISCIII, Proteomics Unit, Universitat Pompeu Fabra (UPF) and Centre de Regulació Genòmica (CRG), Barcelona, Spain
- Eva Borras
- ProteoRed-ISCIII, Proteomics Unit, Universitat Pompeu Fabra (UPF) and Centre de Regulació Genòmica (CRG), Barcelona, Spain
- Antonia Odena
- ProteoRed-ISCIII, Proteomics Platform, Barcelona Science Park, Barcelona, Spain
- Félix Elortza
- ProteoRed-ISCIII, Proteomics Platform, CIC bioGUNE, CIBERehd, Technology Park of Bizkaia, Derio, Spain
- Mikel Azkargorta
- ProteoRed-ISCIII, Proteomics Platform, CIC bioGUNE, CIBERehd, Technology Park of Bizkaia, Derio, Spain
- Nieves Ibarrola
- ProteoRed-ISCIII, Centro de Investigación del Cáncer and Instituto de Biología Molecular y Celular del Cáncer, CSIC-University of Salamanca, Salamanca, Spain
- Francesc Canals
- ProteoRed-ISCIII, Proteomic Laboratory, Vall d'Hebron Institute of Oncology-VHIO, Vall d'Hebron University Hospital, Barcelona, Spain
- Juan P Albar
- ProteoRed-ISCIII, Proteomics Facility, Centro Nacional de Biotecnología - CSIC, Madrid, Spain
- Eliandre Oliveira
- ProteoRed-ISCIII, Proteomics Platform, Barcelona Science Park, Barcelona, Spain
35
Bereman MS. Tools for monitoring system suitability in LC MS/MS centric proteomic experiments. Proteomics 2014; 15:891-902. [DOI: 10.1002/pmic.201400373] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.8] [Received: 08/01/2014] [Revised: 09/12/2014] [Accepted: 10/13/2014] [Indexed: 11/06/2022]
Affiliation(s)
- Michael S. Bereman
- Department of Biological Sciences, Center for Human Health and the Environment; North Carolina State University; Raleigh NC USA
36
Zawadzka AM, Schilling B, Held JM, Sahu AK, Cusack MP, Drake PM, Fisher SJ, Gibson BW. Variation and quantification among a target set of phosphopeptides in human plasma by multiple reaction monitoring and SWATH-MS2 data-independent acquisition. Electrophoresis 2014; 35:3487-97. [PMID: 24853916 PMCID: PMC4565165 DOI: 10.1002/elps.201400167] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Received: 03/29/2014] [Revised: 04/26/2014] [Accepted: 05/13/2014] [Indexed: 11/07/2022]
Abstract
Human plasma contains proteins that reflect overall health and represents a rich source of proteins for identifying and understanding disease pathophysiology. However, few studies have investigated changes in plasma phosphoproteins. In addition, little is known about the normal variations in these phosphoproteins, especially with respect to specific sites of modification. To address these questions, we evaluated variability in plasma protein phosphorylation in healthy individuals using multiple reaction monitoring (MRM) and SWATH-MS2 data-independent acquisition. First, we developed a discovery workflow for phosphopeptide enrichment from plasma and identified targets for MRM assays. Next, we analyzed plasma from healthy donors using an analytical workflow consisting of MRM and SWATH-MS2 that targeted phosphopeptides from 58 and 68 phosphoproteins, respectively. These two methods produced similar results showing low variability in 13 phosphosites from 10 phosphoproteins (CVinter < 30%) and high interpersonal variation of 16 phosphosites from 14 phosphoproteins (CVinter > 30%). Moreover, these phosphopeptides originate from phosphoproteins involved in cellular processes governing homeostasis, immune response, cell-extracellular matrix interactions, lipid and sugar metabolism, and cell signaling. This limited assessment of technical and biological variability in phosphopeptides generated from plasma phosphoproteins among healthy volunteers constitutes a reference for future studies that target protein phosphorylation as biomarkers.
Affiliation(s)
- Anna M. Zawadzka
- Buck Institute for Research on Aging, 8001 Redwood Blvd., Novato, CA 94945
- Birgit Schilling
- Buck Institute for Research on Aging, 8001 Redwood Blvd., Novato, CA 94945
- Jason M. Held
- Division of Oncology and Department of Anesthesiology, Washington University School of Medicine, Campus Box 8069, 660 S. Euclid Avenue, St. Louis, MO 63110
- Alexandria K. Sahu
- Buck Institute for Research on Aging, 8001 Redwood Blvd., Novato, CA 94945
- Michael P. Cusack
- Buck Institute for Research on Aging, 8001 Redwood Blvd., Novato, CA 94945
- Penelope M. Drake
- Department of Obstetrics, Gynecology and Reproductive Sciences, 513 Parnassus Ave., Box 0556, University of California San Francisco, San Francisco, CA 94143
- Susan J. Fisher
- Department of Obstetrics, Gynecology and Reproductive Sciences, 513 Parnassus Ave., Box 0556, University of California San Francisco, San Francisco, CA 94143
- Bradford W. Gibson
- Buck Institute for Research on Aging, 8001 Redwood Blvd., Novato, CA 94945
- Department of Pharmaceutical Chemistry, 513 Parnassus Ave., Box 0556, University of California San Francisco, San Francisco, CA 94143
37
Walzer M, Pernas LE, Nasso S, Bittremieux W, Nahnsen S, Kelchtermans P, Pichler P, van den Toorn HWP, Staes A, Vandenbussche J, Mazanek M, Taus T, Scheltema RA, Kelstrup CD, Gatto L, van Breukelen B, Aiche S, Valkenborg D, Laukens K, Lilley KS, Olsen JV, Heck AJR, Mechtler K, Aebersold R, Gevaert K, Vizcaíno JA, Hermjakob H, Kohlbacher O, Martens L. qcML: an exchange format for quality control metrics from mass spectrometry experiments. Mol Cell Proteomics 2014; 13:1905-13. [PMID: 24760958 PMCID: PMC4125725 DOI: 10.1074/mcp.m113.035907] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.6] [Received: 10/31/2013] [Revised: 03/13/2014] [Indexed: 12/22/2022]
Abstract
Quality control is increasingly recognized as a crucial aspect of mass spectrometry based proteomics. Several recent papers discuss relevant parameters for quality control and present applications to extract these from the instrumental raw data. What has been missing, however, is a standard data exchange format for reporting these performance metrics. We therefore developed the qcML format, an XML-based standard that follows the design principles of the related mzML, mzIdentML, mzQuantML, and TraML standards from the HUPO-PSI (Proteomics Standards Initiative). In addition to the XML format, we also provide tools for the calculation of a wide range of quality metrics as well as a database format and interconversion tools, so that existing LIMS systems can easily add relational storage of the quality control data to their existing schema. We here describe the qcML specification, along with possible use cases and an illustrative example of the subsequent analysis possibilities. All information about qcML is available at http://code.google.com/p/qcml.
Affiliation(s)
- Mathias Walzer
- Applied Bioinformatics, Center for Bioinformatics, Quantitative Biology Center, and Dept. of Computer Science, University of Tuebingen, Germany
- Lucia Espona Pernas
- Department of Biology, Institute of Molecular Systems Biology, Eidgenössische Technische Hochschule Zürich, 8092 Zurich, Switzerland
- Sara Nasso
- Institute of Molecular Life Sciences, University of Zurich, Winterthurerstrasse 190, CH-8057 Zurich, Switzerland; Department of Biology, Institute of Molecular Systems Biology, Eidgenössische Technische Hochschule Zürich, 8092 Zurich, Switzerland
- Wout Bittremieux
- Department of Mathematics and Computer Science, University of Antwerp, Antwerp, Belgium; Biomedical Informatics Research Center Antwerp (biomina), University of Antwerp/Antwerp University Hospital, Antwerp, Belgium
- Sven Nahnsen
- Applied Bioinformatics, Center for Bioinformatics, Quantitative Biology Center, and Dept. of Computer Science, University of Tuebingen, Germany
- Pieter Kelchtermans
- Department of Medical Protein Research, VIB, B-9000 Ghent, Belgium; Department of Biochemistry, Faculty of Medicine and Health Sciences, Ghent University, Ghent, Belgium; Flemish Institute for Technological Research (VITO), Boeretang 200, B-2400 Mol, Belgium
- Peter Pichler
- Research Institute of Molecular Pathology (IMP), Dr. Bohr-Gasse 7, A-1030 Vienna, Austria; Institute of Molecular Biotechnology of the Austrian Academy of Science (IMBA), Dr. Bohr-Gasse 3, A-1030 Vienna, Austria
- Henk W P van den Toorn
- Biomolecular Mass Spectrometry and Proteomics, Bijvoet Center for Biomolecular Research and Utrecht Institute for Pharmaceutical Sciences, Utrecht University, Padualaan 8, 3584 CH Utrecht, Netherlands; Netherlands Proteomics Centre, Padualaan 8, 3584 CH Utrecht, Netherlands
- An Staes
- Department of Medical Protein Research, VIB, B-9000 Ghent, Belgium; Department of Biochemistry, Faculty of Medicine and Health Sciences, Ghent University, Ghent, Belgium
- Jonathan Vandenbussche
- Department of Medical Protein Research, VIB, B-9000 Ghent, Belgium; Department of Biochemistry, Faculty of Medicine and Health Sciences, Ghent University, Ghent, Belgium
- Michael Mazanek
- Research Institute of Molecular Pathology (IMP), Dr. Bohr-Gasse 7, A-1030 Vienna, Austria; Institute of Molecular Biotechnology of the Austrian Academy of Science (IMBA), Dr. Bohr-Gasse 3, A-1030 Vienna, Austria
- Thomas Taus
- Research Institute of Molecular Pathology (IMP), Dr. Bohr-Gasse 7, A-1030 Vienna, Austria; Institute of Molecular Biotechnology of the Austrian Academy of Science (IMBA), Dr. Bohr-Gasse 3, A-1030 Vienna, Austria
- Richard A Scheltema
- Department of Proteomics and Signal Transduction, Max-Planck Institute of Biochemistry, Am Klopferspitz 18, D-82152 Martinsried, Germany
- Christian D Kelstrup
- Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Blegdamsvej 3b, DK-2200 Copenhagen, Denmark
- Laurent Gatto
- Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, 80 Tennis Court Road, CB2 1GA, United Kingdom; Computational Proteomics Unit, Department of Biochemistry, University of Cambridge, Tennis Court Road, Cambridge, CB2 1GA, UK
- Bas van Breukelen
- Biomolecular Mass Spectrometry and Proteomics, Bijvoet Center for Biomolecular Research and Utrecht Institute for Pharmaceutical Sciences, Utrecht University, Padualaan 8, 3584 CH Utrecht, Netherlands; Netherlands Proteomics Centre, Padualaan 8, 3584 CH Utrecht, Netherlands
- Stephan Aiche
- Department of Mathematics and Computer Science, Freie Universität Berlin, Takustr. 9, 14195 Berlin, Germany
- Dirk Valkenborg
- Flemish Institute for Technological Research (VITO), Boeretang 200, B-2400 Mol, Belgium; I-BioStat, Hasselt University, Belgium; CFP-CeProMa, University of Antwerp, Belgium
- Kris Laukens
- Department of Mathematics and Computer Science, University of Antwerp, Antwerp, Belgium; Biomedical Informatics Research Center Antwerp (biomina), University of Antwerp/Antwerp University Hospital, Antwerp, Belgium
- Kathryn S Lilley
- Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, 80 Tennis Court Road, CB2 1GA, United Kingdom
- Jesper V Olsen
- Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Blegdamsvej 3b, DK-2200 Copenhagen, Denmark
- Albert J R Heck
- Biomolecular Mass Spectrometry and Proteomics, Bijvoet Center for Biomolecular Research and Utrecht Institute for Pharmaceutical Sciences, Utrecht University, Padualaan 8, 3584 CH Utrecht, Netherlands; Netherlands Proteomics Centre, Padualaan 8, 3584 CH Utrecht, Netherlands
- Karl Mechtler
- Research Institute of Molecular Pathology (IMP), Dr. Bohr-Gasse 7, A-1030 Vienna, Austria; Institute of Molecular Biotechnology of the Austrian Academy of Science (IMBA), Dr. Bohr-Gasse 3, A-1030 Vienna, Austria
- Ruedi Aebersold
- Department of Biology, Institute of Molecular Systems Biology, Eidgenössische Technische Hochschule Zürich, 8092 Zurich, Switzerland; Faculty of Science, University of Zurich, Zurich, Switzerland
- Kris Gevaert
- Department of Medical Protein Research, VIB, B-9000 Ghent, Belgium; Department of Biochemistry, Faculty of Medicine and Health Sciences, Ghent University, Ghent, Belgium
- Juan Antonio Vizcaíno
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Henning Hermjakob
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Oliver Kohlbacher
- Applied Bioinformatics, Center for Bioinformatics, Quantitative Biology Center, and Dept. of Computer Science, University of Tuebingen, Germany
- Lennart Martens
- Department of Medical Protein Research, VIB, B-9000 Ghent, Belgium; Department of Biochemistry, Faculty of Medicine and Health Sciences, Ghent University, Ghent, Belgium
38
Bereman MS, Johnson R, Bollinger J, Boss Y, Shulman N, MacLean B, Hoofnagle AN, MacCoss MJ. Implementation of statistical process control for proteomic experiments via LC MS/MS. J Am Soc Mass Spectrom 2014; 25:581-7. [PMID: 24496601 PMCID: PMC4020592 DOI: 10.1007/s13361-013-0824-5] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.0] [Received: 10/24/2013] [Revised: 12/22/2013] [Accepted: 12/25/2013] [Indexed: 05/11/2023]
Abstract
Statistical process control (SPC) is a robust set of tools that aids in the visualization, detection, and identification of assignable causes of variation in any process that creates products, services, or information. A tool has been developed termed Statistical Process Control in Proteomics (SProCoP) which implements aspects of SPC (e.g., control charts and Pareto analysis) into the Skyline proteomics software. It monitors five quality control metrics in a shotgun or targeted proteomic workflow. None of these metrics require peptide identification. The source code, written in the R statistical language, runs directly from the Skyline interface, which supports the use of raw data files from several of the mass spectrometry vendors. It provides real time evaluation of the chromatographic performance (e.g., retention time reproducibility, peak asymmetry, and resolution), and mass spectrometric performance (targeted peptide ion intensity and mass measurement accuracy for high resolving power instruments) via control charts. Thresholds are experiment- and instrument-specific and are determined empirically from user-defined quality control standards that enable the separation of random noise and systematic error. Finally, Pareto analysis provides a summary of performance metrics and guides the user to metrics with high variance. The utility of these charts to evaluate proteomic experiments is illustrated in two case studies.
Affiliation(s)
- Michael S Bereman
- Department of Biological Sciences, North Carolina State University, Raleigh, NC, USA
39
Amidan BG, Orton DJ, Lamarche BL, Monroe ME, Moore RJ, Venzin AM, Smith RD, Sego LH, Tardiff MF, Payne SH. Signatures for mass spectrometry data quality. J Proteome Res 2014; 13:2215-22. [PMID: 24611607 PMCID: PMC4104976 DOI: 10.1021/pr401143e] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Indexed: 12/23/2022]
Abstract
Ensuring data quality and proper instrument functionality is a prerequisite for scientific investigation. Manual quality assurance is time-consuming and subjective. Metrics for describing liquid chromatography mass spectrometry (LC–MS) data have been developed; however, the wide variety of LC–MS instruments and configurations precludes applying a simple cutoff. Using 1150 manually classified quality control (QC) data sets, we trained logistic regression classification models to predict whether a data set is in or out of control. Model parameters were optimized by minimizing a loss function that accounts for the trade-off between false positive and false negative errors. The classifier models detected bad data sets with high sensitivity while maintaining high specificity. Moreover, the composite classifier was dramatically more specific than single metrics. Finally, we evaluated the performance of the classifier on a separate validation set where it performed comparably to the results for the testing/training data sets. By presenting the methods and software used to create the classifier, other groups can create a classifier for their specific QC regimen, which is highly variable lab-to-lab. In total, this manuscript presents 3400 LC–MS data sets for the same QC sample (whole cell lysate of Shewanella oneidensis), deposited to the ProteomeXchange with identifiers PXD000320–PXD000324.
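The trade-off between false positive and false negative errors can be illustrated with a short Python sketch. It is a simplification and not the authors' procedure: the abstract describes optimizing the model parameters against such a loss, whereas this sketch only selects a decision threshold on already-computed probabilities. The function names, cost values, and example data are hypothetical.

```python
import math


def out_of_control_probability(metrics, weights, intercept):
    """Logistic model: probability that a QC data set is out of control."""
    score = intercept + sum(w * x for w, x in zip(weights, metrics))
    return 1.0 / (1.0 + math.exp(-score))


def weighted_loss(probs, labels, threshold, fn_cost, fp_cost):
    """Cost of missed bad runs (FN) plus falsely flagged good runs (FP)."""
    fn = sum(1 for p, y in zip(probs, labels) if y == 1 and p < threshold)
    fp = sum(1 for p, y in zip(probs, labels) if y == 0 and p >= threshold)
    return fn_cost * fn + fp_cost * fp


def best_threshold(probs, labels, fn_cost=5.0, fp_cost=1.0):
    """Pick the observed probability that minimizes the weighted loss."""
    candidates = sorted(set(probs))
    return min(
        candidates,
        key=lambda t: weighted_loss(probs, labels, t, fn_cost, fp_cost),
    )


# Hypothetical predicted probabilities and manual labels (1 = out of control).
probs = [0.1, 0.3, 0.4, 0.8, 0.9]
labels = [0, 0, 1, 0, 1]

# Penalizing missed bad runs more heavily lowers the chosen threshold.
strict = best_threshold(probs, labels, fn_cost=5.0, fp_cost=1.0)
# Penalizing false alarms more heavily raises it.
lenient = best_threshold(probs, labels, fn_cost=1.0, fp_cost=5.0)
```

Changing the two costs moves the operating point between sensitivity and specificity, which is the balance the abstract describes.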
Affiliation(s)
- Brett G Amidan
- Pacific Northwest National Laboratory, Richland, Washington 99354, United States
|
40
|
Such-Sanmartín G, Sidoli S, Ventura-Espejo E, Jensen ON. KYSS: Mass spectrometry data quality assessment for protein analysis and large-scale proteomics. Biochem Biophys Res Commun 2014; 445:702-7. [DOI: 10.1016/j.bbrc.2014.01.066] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 01/13/2014] [Accepted: 01/17/2014] [Indexed: 02/02/2023]
|
41
|
Wang X, Chambers MC, Vega-Montoto LJ, Bunk DM, Stein SE, Tabb DL. QC metrics from CPTAC raw LC-MS/MS data interpreted through multivariate statistics. Anal Chem 2014; 86:2497-509. [PMID: 24494671 PMCID: PMC3982976 DOI: 10.1021/ac4034455] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.1] [Indexed: 01/03/2023]
Abstract
Shotgun proteomics experiments integrate a complex sequence of processes, any of which can introduce variability. Quality metrics computed from LC-MS/MS data have relied upon identifying MS/MS scans, but a new mode for the QuaMeter software produces metrics that are independent of identifications. Rather than evaluating each metric independently, we have created a robust multivariate statistical toolkit that accommodates the correlation structure of these metrics and allows for hierarchical relationships among data sets. The framework enables visualization and structural assessment of variability. Study 1 for the Clinical Proteomics Technology Assessment for Cancer (CPTAC), which analyzed three replicates of two common samples at each of two time points among 23 mass spectrometers in nine laboratories, provided the data to demonstrate this framework, and CPTAC Study 5 provided data from complex lysates under Standard Operating Procedures (SOPs) to complement these findings. Identification-independent quality metrics enabled the differentiation of sites and run-times through robust principal components analysis and subsequent factor analysis. Dissimilarity metrics revealed outliers in performance, and a nested ANOVA model revealed the extent to which all metrics or individual metrics were impacted by mass spectrometer and run time. Study 5 data revealed that even when SOPs have been applied, instrument-dependent variability remains prominent, although it may be reduced, while within-site variability is reduced significantly. Finally, identification-independent quality metrics were shown to be predictive of identification sensitivity in these data sets. QuaMeter and the associated multivariate framework are available from http://fenchurch.mc.vanderbilt.edu and http://homepages.uc.edu/~wang2x7/, respectively.
Affiliation(s)
- Xia Wang
- Department of Mathematical Sciences, University of Cincinnati, Cincinnati, Ohio 45221, United States
|
42
|
Martens L. Bringing proteomics into the clinic: the need for the field to finally take itself seriously. Proteomics Clin Appl 2013; 7:388-91. [PMID: 23637000 DOI: 10.1002/prca.201300020] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Received: 03/12/2013] [Revised: 03/29/2013] [Accepted: 03/30/2013] [Indexed: 01/12/2023]
Abstract
Proteomics has fast become a standard tool in the life sciences, with increasingly sophisticated approaches and instruments delivering ever growing numbers of identified and quantified proteins. Yet despite the enormous technological progress, and the triumphant papers published on whole-cell proteomes being collected and analyzed, proteomics has so far failed to enter the clinic for routine applications. This is a peculiar contradiction, and one that warrants some closer study. I here argue that for proteomics to make a difference in the clinic, it needs to stop shirking responsibility, and to mature into an analytical, transparent, and reproducible discipline that also invests in the consolidation of its technology rather than only focusing on the next big leap forward. A key enabling factor in this maturation process is quality control and quality assurance, with bioinformatics, in its least noticeable but most influential form, as a key underlying technology.
Affiliation(s)
- Lennart Martens
- Department of Medical Protein Research, VIB, B-9000 Ghent, Belgium.
|
43
|
Taylor RM, Dance J, Taylor RJ, Prince JT. Metriculator: quality assessment for mass spectrometry-based proteomics. Bioinformatics 2013; 29:2948-9. [PMID: 24002108 DOI: 10.1093/bioinformatics/btt510] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Indexed: 11/13/2022]
Abstract
SUMMARY Quality control in mass spectrometry-based proteomics remains subjective, labor-intensive and inconsistent between laboratories. We introduce Metriculator, a software designed to facilitate long-term storage of extensive performance metrics as introduced by NIST in 2010. Metriculator features a web interface that generates interactive comparison plots for contextual understanding of metric values and an automated metric generation toolkit. The comparison plots are designed for at-a-glance determination of outliers and trends in the datasets, together with relevant statistical comparisons. Easy-to-use quantitative comparisons and a framework for integration plugins will encourage a culture of quality assurance within the proteomics community. AVAILABILITY AND IMPLEMENTATION Available under the MIT license at http://github.com/princelab/metriculator.
Affiliation(s)
- Ryan M Taylor
- Department of Chemistry and Biochemistry, Department of Computer Science and Department of Information Systems, Brigham Young University, Provo, UT 84602, USA
|
44
|
Chen YY, Chambers MC, Li M, Ham AJL, Turner JL, Zhang B, Tabb DL. IDPQuantify: combining precursor intensity with spectral counts for protein and peptide quantification. J Proteome Res 2013; 12:4111-21. [PMID: 23879310 DOI: 10.1021/pr400438q] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Indexed: 11/30/2022]
Abstract
Differentiating and quantifying protein differences in complex samples produces significant challenges in sensitivity and specificity. Label-free quantification can draw from two different information sources: precursor intensities and spectral counts. Intensities are accurate for calculating protein relative abundance, but values are often missing due to peptides that are identified sporadically. Spectral counting can reliably reproduce difference lists, but differentiating peptides or quantifying all but the most concentrated protein changes is usually beyond its abilities. Here we developed new software, IDPQuantify, to align multiple replicates using principal component analysis, extract accurate precursor intensities from MS data, and combine intensities with spectral counts for significant gains in differentiation and quantification. We have applied IDPQuantify to three comparative proteomic data sets featuring gold standard protein differences spiked in complicated backgrounds. The software is able to associate peptides with peaks that are otherwise left unidentified to increase the efficiency of protein quantification, especially for low-abundance proteins. By combining intensities with spectral counts from IDPicker, it gains an average of 30% more true positive differences among top differential proteins. IDPQuantify quantifies protein relative abundance accurately in these test data sets to produce good correlations between known and measured concentrations.
Affiliation(s)
- Yao-Yi Chen
- Department of Biomedical Informatics, Vanderbilt University Medical School, Nashville, Tennessee 37232-8575, United States
|
45
|
Bramwell D. An introduction to statistical process control in research proteomics. J Proteomics 2013; 95:3-21. [PMID: 23791708 DOI: 10.1016/j.jprot.2013.06.010] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.1] [Received: 01/25/2013] [Revised: 05/28/2013] [Accepted: 06/03/2013] [Indexed: 10/26/2022]
Abstract
BACKGROUND Statistical process control is a well-established and respected method which provides a general purpose, and consistent framework for monitoring and improving the quality of a process. It is routinely used in many industries where the quality of final products is critical and is often required in clinical diagnostic laboratories [1,2]. To date, the methodology has been little utilised in research proteomics. It has been shown to be capable of delivering quantitative QC procedures for qualitative clinical assays [3] making it an ideal methodology to apply to this area of biological research. OBJECTIVE To introduce statistical process control as an objective strategy for quality control and show how it could be used to benefit proteomics researchers and enhance the quality of the results they generate. RESULTS We demonstrate that rules which provide basic quality control are easy to derive and implement and could have a major impact on data quality for many studies. CONCLUSIONS Statistical process control is a powerful tool for investigating and improving proteomics research work-flows. The process of characterising measurement systems and defining control rules forces the exploration of key questions that can lead to significant improvements in performance. BIOLOGICAL SIGNIFICANCE This work asserts that QC is essential to proteomics discovery experiments. Every experimenter must know the current capabilities of their measurement system and have an objective means for tracking and ensuring that performance. Proteomic analysis work-flows are complicated and multi-variate. QC is critical for clinical chemistry measurements and huge strides have been made in ensuring the quality and validity of results in clinical biochemistry labs. This work introduces some of these QC concepts and works to bridge their use from single analyte QC to applications in multi-analyte systems. 
This article is part of a Special Issue entitled: Standardization and Quality Control in Proteomics.
Affiliation(s)
- David Bramwell
- Biosignatures Ltd., Keel House, Newcastle Upon Tyne, UK.
|
46
|
Using R and Bioconductor for proteomics data analysis. Biochimica et Biophysica Acta - Proteins and Proteomics 2013; 1844:42-51. [PMID: 23692960 DOI: 10.1016/j.bbapap.2013.04.032] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.5] [Received: 09/28/2012] [Revised: 04/09/2013] [Accepted: 04/30/2013] [Indexed: 10/26/2022]
Abstract
This review presents how R, the popular statistical environment and programming language, can be used in the frame of proteomics data analysis. A short introduction to R is given, with special emphasis on some of the features that make R and its add-on packages premium software for sound and reproducible data analysis. The reader is also advised on how to find relevant R software for proteomics. Several use cases are then presented, illustrating data input/output, quality control, quantitative proteomics and data analysis. Detailed code and additional links to extensive documentation are available in the freely available companion package RforProteomics. This article is part of a Special Issue entitled: Computational Proteomics in the Post-Identification Era. Guest Editors: Martin Eisenacher and Christian Stephan.
|
47
|
Teleman J, Waldemarson S, Malmström J, Levander F. Automated quality control system for LC-SRM setups. J Proteomics 2013; 95:77-83. [PMID: 23584149 DOI: 10.1016/j.jprot.2013.03.029] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 10/09/2012] [Revised: 03/22/2013] [Accepted: 03/25/2013] [Indexed: 10/26/2022]
Abstract
Selected reaction monitoring (SRM) is emerging as a standard tool for high-throughput protein quantification. For reliable and reproducible SRM protein quantification it is essential that system performance is stable. We present here a quality control workflow that is based on repeated analysis of a standard sample to allow insight into the stability of the key properties of an SRM setup. This is supported by automated software to monitor system performance and display information like signal intensities and retention time stability over time, and alert upon deviations from expected metrics. Utilising the software to evaluate 407 repeated injections of a standard sample during half a year, outliers in relative peptide signal intensities and relative peptide fragment ratios are identified, indicating the need for instrument maintenance. We therefore believe that the software could be a vital and powerful tool for any lab regularly performing SRM, increasing the reliability and quality of the SRM platform. BIOLOGICAL SIGNIFICANCE Selected reaction monitoring (SRM) mass spectrometry is becoming established as a standard technique for accurate protein quantification. However, to achieve the required quantification reproducibility of the liquid chromatography (LC)-SRM setup, system performance needs to be monitored over time. Here we introduce a workflow with associated software to enable automated monitoring of LC-SRM setups. We believe that usage of the presented concepts will further strengthen the role of SRM as a reliable tool for protein quantification. This article is part of a Special Issue entitled: Standardization and Quality Control in Proteomics.
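One way to picture the "relative peptide fragment ratios" check is the Python sketch below. It is only an illustration under stated assumptions, not the published software: each transition's intensity is expressed as a fraction of the peptide's summed intensity and compared with a reference injection. The transition names, intensities, and the idea of reporting the largest absolute deviation are hypothetical.

```python
def fragment_ratios(intensities):
    """Express each transition as a fraction of the summed intensity."""
    total = sum(intensities.values())
    return {name: value / total for name, value in intensities.items()}


def max_ratio_deviation(observed, reference):
    """Largest absolute difference in fragment ratio versus the reference."""
    obs = fragment_ratios(observed)
    ref = fragment_ratios(reference)
    return max(abs(obs[name] - ref[name]) for name in ref)


# Hypothetical transition intensities for one peptide of the standard sample.
reference = {"y4": 500.0, "y5": 300.0, "y6": 200.0}

# Twice the signal but the same ratios: no deviation in fragment pattern.
scaled = {"y4": 1000.0, "y5": 600.0, "y6": 400.0}
# Same total signal but a distorted pattern: a large deviation.
distorted = {"y4": 500.0, "y5": 100.0, "y6": 400.0}

dev_scaled = max_ratio_deviation(scaled, reference)
dev_distorted = max_ratio_deviation(distorted, reference)
```

Because the ratios are normalized, a change in overall signal intensity alone does not trigger this check, so it complements intensity monitoring rather than duplicating it.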
Affiliation(s)
- Johan Teleman
- Department of Immunotechnology, Lund University, BMC D13, 22184 Lund, Sweden
|
48
|
Tabb DL. Quality assessment for clinical proteomics. Clin Biochem 2012; 46:411-20. [PMID: 23246537 DOI: 10.1016/j.clinbiochem.2012.12.003] [Citation(s) in RCA: 55] [Impact Index Per Article: 4.6] [Received: 08/31/2012] [Revised: 12/01/2012] [Accepted: 12/03/2012] [Indexed: 12/21/2022]
Abstract
Proteomics has emerged from the labs of technologists to enter widespread application in clinical contexts. This transition, however, has been hindered by overstated early claims of accuracy, concerns about reproducibility, and the challenges of handling batch effects properly. New efforts have produced sets of performance metrics and measurements of variability that establish sound expectations for experiments in clinical proteomics. As researchers begin incorporating these metrics in a quality by design paradigm, the variability of individual steps in experimental pipelines will be reduced, regularizing overall outcomes. This review discusses the evolution of quality assessment in 2D gel electrophoresis, mass spectrometry-based proteomic profiling, tandem mass spectrometry-based protein inventories, and proteomic quantitation. Taken together, the advances in each of these technologies are establishing databases that will be increasingly useful for decision-making in clinical experimentation.
Affiliation(s)
- David L Tabb
- Department of Biomedical Informatics, Vanderbilt University, USA.
|
49
|
Pichler P, Mazanek M, Dusberger F, Weilnböck L, Huber CG, Stingl C, Luider TM, Straube WL, Köcher T, Mechtler K. SIMPATIQCO: a server-based software suite which facilitates monitoring the time course of LC-MS performance metrics on Orbitrap instruments. J Proteome Res 2012; 11:5540-7. [PMID: 23088386 PMCID: PMC3558011 DOI: 10.1021/pr300163u] [Citation(s) in RCA: 48] [Impact Index Per Article: 4.0] [Indexed: 12/20/2022]
Abstract
While the performance of liquid chromatography (LC) and mass spectrometry (MS) instrumentation continues to increase, applications such as analyses of complete or near-complete proteomes and quantitative studies require constant and optimal system performance. For this reason, research laboratories and core facilities alike are recommended to implement quality control (QC) measures as part of their routine workflows. Many laboratories perform sporadic quality control checks. However, successive and systematic longitudinal monitoring of system performance would be facilitated by dedicated automatic or semiautomatic software solutions that aid an effortless analysis and display of QC metrics over time. We present the software package SIMPATIQCO (SIMPle AuTomatIc Quality COntrol) designed for evaluation of data from LTQ Orbitrap, Q-Exactive, LTQ FT, and LTQ instruments. A centralized SIMPATIQCO server can process QC data from multiple instruments. The software calculates QC metrics supervising every step of data acquisition from LC and electrospray to MS. For each QC metric the software learns the range indicating adequate system performance from the uploaded data using robust statistics. Results are stored in a database and can be displayed in a comfortable manner from any computer in the laboratory via a web browser. QC data can be monitored for individual LC runs as well as plotted over time. SIMPATIQCO thus assists the longitudinal monitoring of important QC metrics such as peptide elution times, peak widths, intensities, total ion current (TIC) as well as sensitivity, and overall LC–MS system performance; in this way the software also helps identify potential problems. The SIMPATIQCO software package is available free of charge.
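The abstract does not say which robust statistics SIMPATIQCO uses to learn the acceptable range for each metric. As an illustration only, the Python sketch below assumes a common choice: the median as the center and the median absolute deviation (MAD), scaled by 1.4826 so that it approximates a standard deviation for normally distributed data. The function names, the multiplier `k`, and the example peak widths are hypothetical.

```python
from statistics import median


def robust_range(values, k=3.0):
    """Return (low, high) as median +/- k scaled MADs.

    Assumption: MAD scaled by 1.4826 approximates a standard deviation.
    """
    center = median(values)
    mad = median([abs(v - center) for v in values])
    scale = 1.4826 * mad
    return center - k * scale, center + k * scale


def outside_range(values, limits):
    """Return the values that fall outside the learned range."""
    low, high = limits
    return [v for v in values if v < low or v > high]


# Hypothetical peak widths (seconds); one run is clearly abnormal.
peak_widths = [10.0, 11.0, 9.0, 10.0, 12.0, 10.0, 50.0]
limits = robust_range(peak_widths)
suspect = outside_range(peak_widths, limits)
```

Unlike a mean and standard deviation, the median and MAD are barely shifted by the single abnormal run, so the learned range stays tight enough to flag it.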
Affiliation(s)
- Peter Pichler
- Research Institute of Molecular Pathology, Vienna, Austria.
|