1
He Q, Guo H, Li Y, He G, Li X, Shuai J. SeFilter-DIA: Squeeze-and-Excitation Network for Filtering High-Confidence Peptides of Data-Independent Acquisition Proteomics. Interdiscip Sci 2024; 16:579-592. [PMID: 38472692 DOI: 10.1007/s12539-024-00611-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/17/2023] [Revised: 01/12/2024] [Accepted: 01/21/2024] [Indexed: 03/14/2024]
Abstract
Mass spectrometry is crucial in proteomics analysis, particularly using Data Independent Acquisition (DIA) for reliable and reproducible mass spectrometry data acquisition, enabling broad mass-to-charge ratio coverage and high throughput. DIA-NN, a prominent deep learning software in DIA proteome analysis, generates peptide results but may include low-confidence peptides. Conventionally, biologists have to manually screen peptide fragment ion chromatogram peaks (XIC) to identify high-confidence peptides, a time-consuming and subjective process prone to variability. In this study, we introduce SeFilter-DIA, a deep learning algorithm that aims to automate the identification of high-confidence peptides. Leveraging squeeze-and-excitation network and residual network models, SeFilter-DIA extracts XIC features and effectively discerns between high- and low-confidence peptides. Evaluation on the benchmark datasets shows that SeFilter-DIA achieves 99.6% AUC on the test set and 97% for other performance indicators. Furthermore, SeFilter-DIA is applicable for screening peptides with phosphorylation modifications. These results demonstrate the potential of SeFilter-DIA to replace manual screening, providing an efficient and objective approach for high-confidence peptide identification while mitigating associated limitations.
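As a rough illustration of the squeeze-and-excitation idea named in the title, the sketch below applies a generic channel gate to a small feature map. It is not SeFilter-DIA's architecture: the function name `squeeze_excite`, the toy weights `w1` and `w2`, and the three-channel input are assumptions chosen so the arithmetic can be followed by hand.

```python
import math

def squeeze_excite(x, w1, w2):
    """Generic squeeze-and-excitation gate (illustrative, not the paper's model).

    x  : list of channels, each a list of values along the time axis
    w1 : reduction weights, one row per hidden unit
    w2 : expansion weights, one row per channel
    """
    # Squeeze: average each channel over time to get one summary value per channel.
    z = [sum(channel) / len(channel) for channel in x]
    # Excitation: bottleneck layer with ReLU, then a sigmoid gate per channel.
    hidden = [max(0.0, sum(w * v for w, v in zip(row, z))) for row in w1]
    gate = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
            for row in w2]
    # Scale: multiply every value in a channel by that channel's gate.
    y = [[g * v for v in channel] for g, channel in zip(gate, x)]
    return y, gate

# Toy input: three channels of four time points each (hypothetical values).
x = [[0.0, 2.0, 4.0, 2.0], [1.0, 1.0, 1.0, 1.0], [0.0, 0.0, 8.0, 0.0]]
w1 = [[0.5, -0.5, 0.25]]      # 3 channels -> 1 hidden unit
w2 = [[1.0], [0.0], [-1.0]]   # 1 hidden unit -> 3 channel gates
y, gate = squeeze_excite(x, w1, w2)
```

In a trained network the weights would be learned, so that channels carrying informative chromatogram features receive gates closer to 1 and less informative channels are damped.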
Affiliation(s)
- Qingzu He
  - Department of Physics, National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, 361005, China
  - Wenzhou Key Laboratory of Biophysics, Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou, 325001, China
- Huan Guo
  - Department of Physics, National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, 361005, China
- Yulin Li
  - Department of Physics, National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, 361005, China
- Guoqiang He
  - Wenzhou Key Laboratory of Biophysics, Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou, 325001, China
- Xiang Li
  - Department of Physics, National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, 361005, China.
- Jianwei Shuai
  - Wenzhou Key Laboratory of Biophysics, Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou, 325001, China.
  - Oujiang Laboratory (Zhejiang Lab for Regenerative Medicine, Vision and Brain Health), Wenzhou, 325001, China.

2
Fröhlich K, Fahrner M, Brombacher E, Seredynska A, Maldacker M, Kreutz C, Schmidt A, Schilling O. Data-Independent Acquisition: A Milestone and Prospect in Clinical Mass Spectrometry-Based Proteomics. Mol Cell Proteomics 2024; 23:100800. [PMID: 38880244 PMCID: PMC11380018 DOI: 10.1016/j.mcpro.2024.100800] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/02/2024] [Revised: 06/08/2024] [Accepted: 06/13/2024] [Indexed: 06/18/2024] Open
Abstract
Data-independent acquisition (DIA) has revolutionized the field of mass spectrometry (MS)-based proteomics over the past few years. DIA stands out for its ability to systematically sample all peptides in a given m/z range, allowing an unbiased acquisition of proteomics data. This greatly mitigates the issue of missing values and significantly enhances quantitative accuracy, precision, and reproducibility compared to many traditional methods. This review focuses on the critical role of DIA analysis software tools, primarily focusing on their capabilities and the challenges they address in proteomic research. Advances in MS technology, such as trapped ion mobility spectrometry, or high field asymmetric waveform ion mobility spectrometry require sophisticated analysis software capable of handling the increased data complexity and exploiting the full potential of DIA. We identify and critically evaluate leading software tools in the DIA landscape, discussing their unique features, and the reliability of their quantitative and qualitative outputs. We present the biological and clinical relevance of DIA-MS and discuss crucial publications that paved the way for in-depth proteomic characterization in patient-derived specimens. Furthermore, we provide a perspective on emerging trends in clinical applications and present upcoming challenges including standardization and certification of MS-based acquisition strategies in molecular diagnostics. While we emphasize the need for continuous development of software tools to keep pace with evolving technologies, we advise researchers against uncritically accepting the results from DIA software tools. Each tool may have its own biases, and some may not be as sensitive or reliable as others. Our overarching recommendation for both researchers and clinicians is to employ multiple DIA analysis tools, utilizing orthogonal analysis approaches to enhance the robustness and reliability of their findings.
Affiliation(s)
- Klemens Fröhlich
  - Proteomics Core Facility, Biozentrum Basel, University of Basel, Basel, Switzerland
- Matthias Fahrner
  - Institute for Surgical Pathology, Medical Center - University of Freiburg, Faculty of Medicine, University of Freiburg, Freiburg, Germany; German Cancer Consortium (DKTK) and Cancer Research Center (DKFZ), Freiburg, Germany
- Eva Brombacher
  - Institute of Medical Biometry and Statistics, Faculty of Medicine and Medical Center-University of Freiburg, Freiburg, Germany; Centre for Integrative Biological Signaling Studies (CIBSS), University of Freiburg, Freiburg, Germany; Spemann Graduate School of Biology and Medicine (SGBM), University of Freiburg, Freiburg, Germany; Faculty of Biology, University of Freiburg, Freiburg, Germany
- Adrianna Seredynska
  - Institute for Surgical Pathology, Medical Center - University of Freiburg, Faculty of Medicine, University of Freiburg, Freiburg, Germany; German Cancer Consortium (DKTK) and Cancer Research Center (DKFZ), Freiburg, Germany; Faculty of Biology, University of Freiburg, Freiburg, Germany
- Maximilian Maldacker
  - Institute for Surgical Pathology, Medical Center - University of Freiburg, Faculty of Medicine, University of Freiburg, Freiburg, Germany; Faculty of Biology, University of Freiburg, Freiburg, Germany
- Clemens Kreutz
  - Institute of Medical Biometry and Statistics, Faculty of Medicine and Medical Center-University of Freiburg, Freiburg, Germany; Centre for Integrative Biological Signaling Studies (CIBSS), University of Freiburg, Freiburg, Germany
- Alexander Schmidt
  - Proteomics Core Facility, Biozentrum Basel, University of Basel, Basel, Switzerland
- Oliver Schilling
  - Institute for Surgical Pathology, Medical Center - University of Freiburg, Faculty of Medicine, University of Freiburg, Freiburg, Germany; German Cancer Consortium (DKTK) and Cancer Research Center (DKFZ), Freiburg, Germany.

3
Lou R, Shui W. Acquisition and Analysis of DIA-Based Proteomic Data: A Comprehensive Survey in 2023. Mol Cell Proteomics 2024; 23:100712. [PMID: 38182042 PMCID: PMC10847697 DOI: 10.1016/j.mcpro.2024.100712] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2023] [Revised: 12/27/2023] [Accepted: 01/02/2024] [Indexed: 01/07/2024] Open
Abstract
Data-independent acquisition (DIA) mass spectrometry (MS) has emerged as a powerful technology for high-throughput, accurate, and reproducible quantitative proteomics. This review provides a comprehensive overview of recent advances in both the experimental and computational methods for DIA proteomics, from data acquisition schemes to analysis strategies and software tools. DIA acquisition schemes are categorized based on the design of precursor isolation windows, highlighting wide-window, overlapping-window, narrow-window, scanning quadrupole-based, and parallel accumulation-serial fragmentation-enhanced DIA methods. For DIA data analysis, major strategies are classified into spectrum reconstruction, sequence-based search, library-based search, de novo sequencing, and sequencing-independent approaches. A wide array of software tools implementing these strategies are reviewed, with details on their overall workflows and scoring approaches at different steps. The generation and optimization of spectral libraries, which are critical resources for DIA analysis, are also discussed. Publicly available benchmark datasets covering global proteomics and phosphoproteomics are summarized to facilitate performance evaluation of various software tools and analysis workflows. Continued advances and synergistic developments of versatile components in DIA workflows are expected to further enhance the power of DIA-based proteomics.
Affiliation(s)
- Ronghui Lou
  - iHuman Institute, ShanghaiTech University, Shanghai, China; School of Life Science and Technology, ShanghaiTech University, Shanghai, China.
- Wenqing Shui
  - iHuman Institute, ShanghaiTech University, Shanghai, China; School of Life Science and Technology, ShanghaiTech University, Shanghai, China.

4
He Q, Zhong CQ, Li X, Guo H, Li Y, Gao M, Yu R, Liu X, Zhang F, Guo D, Ye F, Guo T, Shuai J, Han J. Dear-DIA XMBD: Deep Autoencoder Enables Deconvolution of Data-Independent Acquisition Proteomics. RESEARCH (WASHINGTON, D.C.) 2023; 6:0179. [PMID: 37377457 PMCID: PMC10292580 DOI: 10.34133/research.0179] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 12/08/2022] [Accepted: 06/01/2023] [Indexed: 06/29/2023]
Abstract
Data-independent acquisition (DIA) technology for protein identification from mass spectrometry and related algorithms is developing rapidly. The spectrum-centric analysis of DIA data without the use of a spectral library from data-dependent acquisition data represents a promising direction. In this paper, we proposed an untargeted analysis method, Dear-DIA XMBD, for direct analysis of DIA data. Dear-DIA XMBD first integrates the deep variational autoencoder and triplet loss to learn the representations of the extracted fragment ion chromatograms, then uses the k-means clustering algorithm to aggregate fragments with similar representations into the same classes, and finally establishes the inverted index tables to determine the precursors of fragment clusters between precursors and peptides and between fragments and peptides. We show that Dear-DIA XMBD performs superiorly with the highly complicated DIA data of different species obtained by different instrument platforms. Dear-DIA XMBD is publicly available at https://github.com/jianweishuai/Dear-DIA-XMBD.
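The clustering step mentioned in this abstract can be pictured with a minimal k-means sketch. Only the generic algorithm is shown; the autoencoder, the triplet loss, and the inverted index tables are not reproduced. The two-dimensional `embeddings`, the helper names `nearest` and `kmeans`, and the hand-picked starting centroids are assumptions for illustration and do not come from the published tool.

```python
def nearest(point, centroids):
    """Return the index of the centroid closest to point (squared Euclidean distance)."""
    dists = [sum((a - b) ** 2 for a, b in zip(point, c)) for c in centroids]
    return dists.index(min(dists))

def kmeans(points, centroids, iterations=10):
    """Plain Lloyd's algorithm with caller-supplied initial centroids."""
    centroids = [tuple(c) for c in centroids]
    for _ in range(iterations):
        # Assignment step: group points by their nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # Update step: move each centroid to the mean of its members
        # (an empty cluster keeps its previous centroid).
        centroids = [
            tuple(sum(col) / len(members) for col in zip(*members)) if members else old
            for members, old in zip(clusters, centroids)
        ]
    return [nearest(p, centroids) for p in points], centroids

# Hypothetical 2-D representations of six fragment chromatograms.
embeddings = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
labels, centroids = kmeans(embeddings, centroids=[embeddings[0], embeddings[3]])
```

Fragments that receive the same label would be treated as co-eluting candidates from one precursor. Passing the initial centroids explicitly keeps the example deterministic.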
Affiliation(s)
- Qingzu He
  - Department of Physics, and Fujian Provincial Key Laboratory for Soft Functional Materials Research, Xiamen University, Xiamen 361005, China
  - Oujiang Laboratory (Zhejiang Lab for Regenerative Medicine, Vision and Brain Health) and Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou, Zhejiang 325001, China
- Chuan-Qi Zhong
  - School of Life Sciences, Xiamen University, Xiamen 361102, China
  - State Key Laboratory of Cellular Stress Biology, Innovation Center for Cell Signaling Network, Xiamen 361102, China
- Xiang Li
  - Department of Physics, and Fujian Provincial Key Laboratory for Soft Functional Materials Research, Xiamen University, Xiamen 361005, China
  - State Key Laboratory of Cellular Stress Biology, Innovation Center for Cell Signaling Network, Xiamen 361102, China
- Huan Guo
  - Department of Physics, and Fujian Provincial Key Laboratory for Soft Functional Materials Research, Xiamen University, Xiamen 361005, China
- Yiming Li
  - Department of Physics, and Fujian Provincial Key Laboratory for Soft Functional Materials Research, Xiamen University, Xiamen 361005, China
- Mingxuan Gao
  - Department of Computer Science, Xiamen University, Xiamen 361005, China
- Rongshan Yu
  - Department of Computer Science, Xiamen University, Xiamen 361005, China
  - National Institute for Data Science in Health and Medicine, School of Medicine, Xiamen University, Xiamen 361102, China
- Xianming Liu
  - Bruker (Beijing) Scientific Technology Co. Ltd., Beijing, China
- Fangfei Zhang
  - Westlake Laboratory of Life Sciences and Biomedicine, Key Laboratory of Structural Biology of Zhejiang Province, School of Life Sciences, Westlake University, 18 Shilongshan Road, Hangzhou 310024, China
  - Institute of Basic Medical Sciences, Westlake Institute for Advanced Study, 18 Shilongshan Road, Hangzhou 310024, China
- Donghui Guo
  - Department of Electronic Engineering, Xiamen University, Xiamen 361005, China
- Fangfu Ye
  - Oujiang Laboratory (Zhejiang Lab for Regenerative Medicine, Vision and Brain Health) and Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou, Zhejiang 325001, China
- Tiannan Guo
  - Westlake Laboratory of Life Sciences and Biomedicine, Key Laboratory of Structural Biology of Zhejiang Province, School of Life Sciences, Westlake University, 18 Shilongshan Road, Hangzhou 310024, China
  - Institute of Basic Medical Sciences, Westlake Institute for Advanced Study, 18 Shilongshan Road, Hangzhou 310024, China
  - Westlake Omics Ltd., Yunmeng Road 1, Hangzhou, China
- Jianwei Shuai
  - Department of Physics, and Fujian Provincial Key Laboratory for Soft Functional Materials Research, Xiamen University, Xiamen 361005, China
  - Oujiang Laboratory (Zhejiang Lab for Regenerative Medicine, Vision and Brain Health) and Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou, Zhejiang 325001, China
  - State Key Laboratory of Cellular Stress Biology, Innovation Center for Cell Signaling Network, Xiamen 361102, China
  - National Institute for Data Science in Health and Medicine, School of Medicine, Xiamen University, Xiamen 361102, China
- Jiahuai Han
  - School of Life Sciences, Xiamen University, Xiamen 361102, China
  - State Key Laboratory of Cellular Stress Biology, Innovation Center for Cell Signaling Network, Xiamen 361102, China
  - National Institute for Data Science in Health and Medicine, School of Medicine, Xiamen University, Xiamen 361102, China

5
Cox J. Prediction of peptide mass spectral libraries with machine learning. Nat Biotechnol 2023; 41:33-43. [PMID: 36008611 DOI: 10.1038/s41587-022-01424-w] [Citation(s) in RCA: 28] [Impact Index Per Article: 28.0] [Received: 03/01/2022] [Accepted: 07/11/2022] [Indexed: 01/21/2023]
Abstract
The recent development of machine learning methods to identify peptides in complex mass spectrometric data constitutes a major breakthrough in proteomics. Longstanding methods for peptide identification, such as search engines and experimental spectral libraries, are being superseded by deep learning models that allow the fragmentation spectra of peptides to be predicted from their amino acid sequence. These new approaches, including recurrent neural networks and convolutional neural networks, use predicted in silico spectral libraries rather than experimental libraries to achieve higher sensitivity and/or specificity in the analysis of proteomics data. Machine learning is galvanizing applications that involve large search spaces, such as immunopeptidomics and proteogenomics. Current challenges in the field include the prediction of spectra for peptides with post-translational modifications and for cross-linked pairs of peptides. Permeation of machine-learning-based spectral prediction into search engines and spectrum-centric data-independent acquisition workflows for diverse peptide classes and measurement conditions will continue to push sensitivity and dynamic range in proteomics applications in the coming years.
Affiliation(s)
- Jürgen Cox
  - Computational Systems Biochemistry Research Group, Max-Planck Institute of Biochemistry, Martinsried, Germany.
  - Department of Biological and Medical Psychology, University of Bergen, Bergen, Norway.

6
Heil LR, Fondrie WE, McGann CD, Federation AJ, Noble WS, MacCoss MJ, Keich U. Building Spectral Libraries from Narrow-Window Data-Independent Acquisition Mass Spectrometry Data. J Proteome Res 2022; 21:1382-1391. [PMID: 35549345 DOI: 10.1021/acs.jproteome.1c00895] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 11/28/2022]
Abstract
Advances in library-based methods for peptide detection from data-independent acquisition (DIA) mass spectrometry have made it possible to detect and quantify tens of thousands of peptides in a single mass spectrometry run. However, many of these methods rely on a comprehensive, high-quality spectral library containing information about the expected retention time and fragmentation patterns of peptides in the sample. Empirical spectral libraries are often generated through data-dependent acquisition and may suffer from biases as a result. Spectral libraries can be generated in silico, but these models are not trained to handle all possible post-translational modifications. Here, we propose a false discovery rate-controlled spectrum-centric search workflow to generate spectral libraries directly from gas-phase fractionated DIA tandem mass spectrometry data. We demonstrate that this strategy is able to detect phosphorylated peptides and can be used to generate a spectral library for accurate peptide detection and quantitation in wide-window DIA data. We compare the results of this search workflow to other library-free approaches and demonstrate that our search is competitive in terms of accuracy and sensitivity. These results demonstrate that the proposed workflow has the capacity to generate spectral libraries while avoiding the limitations of other methods.
Affiliation(s)
- Lilian R Heil
  - Department of Genome Sciences, University of Washington, Seattle, Washington 98105, United States
- William E Fondrie
  - Department of Genome Sciences, University of Washington, Seattle, Washington 98105, United States
- Christopher D McGann
  - Department of Genome Sciences, University of Washington, Seattle, Washington 98105, United States
- Alexander J Federation
  - Department of Genome Sciences, University of Washington, Seattle, Washington 98105, United States
- William S Noble
  - Department of Genome Sciences, University of Washington, Seattle, Washington 98105, United States
  - Paul G. Allen School for Computer Science and Engineering, University of Washington, Seattle, Washington 98105, United States
- Michael J MacCoss
  - Department of Genome Sciences, University of Washington, Seattle, Washington 98105, United States
- Uri Keich
  - School of Mathematics and Statistics, University of Sydney, Sydney, NSW 2006, Australia

7
Buric F, Zrimec J, Zelezniak A. Parallel Factor Analysis Enables Quantification and Identification of Highly Convolved Data-Independent-Acquired Protein Spectra. PATTERNS (NEW YORK, N.Y.) 2020; 1:100137. [PMID: 33336195 PMCID: PMC7733873 DOI: 10.1016/j.patter.2020.100137] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 05/25/2020] [Revised: 09/14/2020] [Accepted: 10/12/2020] [Indexed: 11/26/2022]
Abstract
High-throughput data-independent acquisition (DIA) is the method of choice for quantitative proteomics, combining the best practices of targeted and shotgun approaches. The resultant DIA spectra are, however, highly convolved and with no direct precursor-fragment correspondence, complicating biological sample analysis. Here, we present CANDIA (canonical decomposition of data-independent-acquired spectra), a GPU-powered unsupervised multiway factor analysis framework that deconvolves multispectral scans to individual analyte spectra, chromatographic profiles, and sample abundances, using parallel factor analysis. The deconvolved spectra can be annotated with traditional database search engines or used as high-quality input for de novo sequencing methods. We demonstrate that spectral libraries generated with CANDIA substantially reduce the false discovery rate underlying the validation of spectral quantification. CANDIA covers up to 33 times more total ion current than library-based approaches, which typically use less than 5% of total recorded ions, thus allowing quantification and identification of signals from unexplored DIA spectra.
- Conventional DIA spectral libraries cover less than 3% of a scan's total ion count
- CANDIA deconvolves peptide signals by leveraging all scan data
- CANDIA uses GPUs to enable tensor algebra on massive DIA mass spectrometry data
- CANDIA output enables high-confidence and precise quantitative proteomics
The latest high-throughput mass spectrometry-based technologies can record virtually all molecules from complex biological samples, providing a holistic picture of proteomes in cells and tissues and enabling an evaluation of the overall status of a person's health. However, current best practices are still only scratching the surface of the wealth of available information obtained from the massive proteome datasets, and efficient novel data-driven strategies are needed. Powered by advances in GPU hardware and open-source machine-learning frameworks, we developed a data-driven approach, CANDIA, which disassembles highly complex proteomics data into the elementary molecular signatures of the proteins in biological samples. Our work provides a performant and adaptable solution that complements existing mass spectrometry techniques. As the central mathematical methods are generic, other scientific fields that are dealing with highly convolved datasets will benefit from this work.
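For orientation, parallel factor analysis (PARAFAC, also known as canonical polyadic decomposition) can be written in its generic textbook form as below. The assignment of indices to samples, retention time, and m/z is an assumption made here to match the three outputs named in the abstract; the symbols are not taken from the paper.

```latex
% Generic rank-R PARAFAC model of a three-way data array X.
% Assumed index roles: i = sample, j = retention-time point, k = m/z bin.
X_{ijk} \approx \sum_{r=1}^{R} a_{ir}\, b_{jr}\, c_{kr}
% a_{ir}: abundance of component r in sample i
% b_{jr}: chromatographic profile of component r at time point j
% c_{kr}: spectrum intensity of component r at m/z bin k
```

Under this reading, each component r is one deconvolved analyte described jointly by its abundance across samples, its elution profile, and its spectrum.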
Affiliation(s)
- Filip Buric
  - Department of Biology and Biological Engineering, Chalmers University of Technology, Kemivägen 10, Gothenburg 412 96, Sweden
- Jan Zrimec
  - Department of Biology and Biological Engineering, Chalmers University of Technology, Kemivägen 10, Gothenburg 412 96, Sweden
- Aleksej Zelezniak
  - Department of Biology and Biological Engineering, Chalmers University of Technology, Kemivägen 10, Gothenburg 412 96, Sweden
  - Science for Life Laboratory, Tomtebodavägen 23a, Stockholm 171 65, Sweden

8
Vaca Jacome AS, Peckner R, Shulman N, Krug K, DeRuff KC, Officer A, Christianson KE, MacLean B, MacCoss MJ, Carr SA, Jaffe JD. Avant-garde: an automated data-driven DIA data curation tool. Nat Methods 2020; 17:1237-1244. [PMID: 33199889 PMCID: PMC7723322 DOI: 10.1038/s41592-020-00986-4] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 02/18/2019] [Accepted: 09/25/2020] [Indexed: 12/03/2022]
Abstract
Several challenges remain in data-independent acquisition (DIA) data analysis, such as to confidently identify peptides, define integration boundaries, remove interferences, and control false discovery rates. In practice, a visual inspection of the signals is still required, which is impractical with large datasets. We present Avant-garde as a tool to refine DIA (and parallel reaction monitoring) data. Avant-garde uses a novel data-driven scoring strategy: signals are refined by learning from the dataset itself, using all measurements in all samples to achieve the best optimization. We evaluate the performance of Avant-garde using benchmark DIA datasets and show that it can determine the quantitative suitability of a peptide peak, and reach the same levels of selectivity, accuracy, and reproducibility as manual validation. Avant-garde is complementary to existing DIA analysis engines and aims to establish a strong foundation for subsequent analysis of quantitative mass spectrometry data.
Affiliation(s)
- Ryan Peckner
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Cogen Therapeutics, Cambridge, MA, USA
- Karsten Krug
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Adam Officer
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Steven A Carr
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Jacob D Jaffe
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA.
  - Inzen Therapeutics, Cambridge, MA, USA.

9
Tsai TH, Choi M, Banfai B, Liu Y, MacLean BX, Dunkley T, Vitek O. Selection of Features with Consistent Profiles Improves Relative Protein Quantification in Mass Spectrometry Experiments. Mol Cell Proteomics 2020; 19:944-959. [PMID: 32234965 PMCID: PMC7261813 DOI: 10.1074/mcp.ra119.001792] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 09/25/2019] [Revised: 02/27/2020] [Indexed: 12/11/2022] Open
Abstract
In bottom-up mass spectrometry-based proteomics, relative protein quantification is often achieved with data-dependent acquisition (DDA), data-independent acquisition (DIA), or selected reaction monitoring (SRM). These workflows quantify proteins by summarizing the abundances of all the spectral features of the protein (e.g. precursor ions, transitions or fragments) in a single value per protein per run. When abundances of some features are inconsistent with the overall protein profile (for technological reasons such as interferences, or for biological reasons such as post-translational modifications), the protein-level summaries and the downstream conclusions are undermined. We propose a statistical approach that automatically detects spectral features with such inconsistent patterns. The detected features can be separately investigated, and if necessary, removed from the data set. We evaluated the proposed approach on a series of benchmark-controlled mixtures and biological investigations with DDA, DIA and SRM data acquisitions. The results demonstrated that it could facilitate and complement manual curation of the data. Moreover, it can improve the estimation accuracy, sensitivity and specificity of detecting differentially abundant proteins, and reproducibility of conclusions across different data processing tools. The approach is implemented as an option in the open-source R-based software MSstats.
Affiliation(s)
- Tsung-Heng Tsai
  - Khoury College of Computer Sciences, Northeastern University, Boston, Massachusetts
- Meena Choi
  - Khoury College of Computer Sciences, Northeastern University, Boston, Massachusetts
- Balazs Banfai
  - Roche Pharmaceutical Research and Early Development, Pharmaceutical Sciences, Roche Innovation Center Basel, Basel, Switzerland
- Yansheng Liu
  - Department of Pharmacology, Yale Cancer Biology Institute, Yale University School of Medicine, West Haven, Connecticut
- Brendan X MacLean
  - Department of Genome Sciences, University of Washington, Seattle, Washington
- Tom Dunkley
  - Roche Pharmaceutical Research and Early Development, Pharmaceutical Sciences, Roche Innovation Center Basel, Basel, Switzerland
- Olga Vitek
  - Khoury College of Computer Sciences, Northeastern University, Boston, Massachusetts.

10
Zhang F, Ge W, Ruan G, Cai X, Guo T. Data-Independent Acquisition Mass Spectrometry-Based Proteomics and Software Tools: A Glimpse in 2020. Proteomics 2020; 20:e1900276. [DOI: 10.1002/pmic.201900276] [Citation(s) in RCA: 116] [Impact Index Per Article: 29.0] [Received: 11/26/2019] [Revised: 03/27/2020] [Indexed: 01/02/2023]
Affiliation(s)
- Fangfei Zhang
  - Key Laboratory of Structural Biology of Zhejiang Province, School of Life Sciences, Westlake University, 18 Shilongshan Road, Hangzhou, Zhejiang Province 310024, China
  - Institute of Basic Medical Sciences, Westlake Institute for Advanced Study, 18 Shilongshan Road, Hangzhou, Zhejiang Province 310024, China
- Weigang Ge
  - Key Laboratory of Structural Biology of Zhejiang Province, School of Life Sciences, Westlake University, 18 Shilongshan Road, Hangzhou, Zhejiang Province 310024, China
  - Institute of Basic Medical Sciences, Westlake Institute for Advanced Study, 18 Shilongshan Road, Hangzhou, Zhejiang Province 310024, China
- Guan Ruan
  - Key Laboratory of Structural Biology of Zhejiang Province, School of Life Sciences, Westlake University, 18 Shilongshan Road, Hangzhou, Zhejiang Province 310024, China
  - Institute of Basic Medical Sciences, Westlake Institute for Advanced Study, 18 Shilongshan Road, Hangzhou, Zhejiang Province 310024, China
- Xue Cai
  - Key Laboratory of Structural Biology of Zhejiang Province, School of Life Sciences, Westlake University, 18 Shilongshan Road, Hangzhou, Zhejiang Province 310024, China
  - Institute of Basic Medical Sciences, Westlake Institute for Advanced Study, 18 Shilongshan Road, Hangzhou, Zhejiang Province 310024, China
- Tiannan Guo
  - Key Laboratory of Structural Biology of Zhejiang Province, School of Life Sciences, Westlake University, 18 Shilongshan Road, Hangzhou, Zhejiang Province 310024, China
  - Institute of Basic Medical Sciences, Westlake Institute for Advanced Study, 18 Shilongshan Road, Hangzhou, Zhejiang Province 310024, China

11
Huang T, Bruderer R, Muntel J, Xuan Y, Vitek O, Reiter L. Combining Precursor and Fragment Information for Improved Detection of Differential Abundance in Data Independent Acquisition. Mol Cell Proteomics 2020; 19:421-430. [PMID: 31888964 PMCID: PMC7000113 DOI: 10.1074/mcp.ra119.001705] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Received: 08/23/2019] [Revised: 12/16/2019] [Indexed: 11/16/2022] Open
Abstract
In bottom-up, label-free discovery proteomics, biological samples are acquired in a data-dependent (DDA) or data-independent (DIA) manner, with peptide signals recorded in an intact (MS1) and fragmented (MS2) form. While DDA has only the MS1 space for quantification, DIA contains both MS1 and MS2 at high quantitative quality. DIA profiles of complex biological matrices such as tissues or cells can contain quantitative interferences, and the interferences at the MS1 and the MS2 signals are often independent. When comparing biological conditions, the interferences can compromise the detection of differential peptide or protein abundance and lead to false positive or false negative conclusions. We hypothesized that the combined use of MS1 and MS2 quantitative signals could improve our ability to detect differentially abundant proteins. Therefore, we developed a statistical procedure incorporating both MS1 and MS2 quantitative information of DIA. We benchmarked the performance of the MS1-MS2-combined method to the individual use of MS1 or MS2 in DIA using four previously published controlled mixtures, as well as in two previously unpublished controlled mixtures. In the majority of the comparisons, the combined method outperformed the individual use of MS1 or MS2. This was particularly true for comparisons with low fold changes, few replicates, and situations where MS1 and MS2 were of similar quality. When applied to a previously unpublished investigation of lung cancer, the MS1-MS2-combined method increased the coverage of known activated pathways. Since recent technological developments continue to increase the quality of MS1 signals (e.g. using the BoxCar scan mode for Orbitrap instruments), the combination of the MS1 and MS2 information has a high potential for future statistical analysis of DIA data.
Affiliation(s)
- Ting Huang
  - Northeastern University, Boston MA 02115
- Jan Muntel
  - Biognosys, Wagistrasse 21, 8952 Schlieren, Switzerland
- Yue Xuan
  - Thermo Fisher Scientific, 28199 Bremen, Germany
- Olga Vitek
  - Northeastern University, Boston MA 02115
- Lukas Reiter
  - Biognosys, Wagistrasse 21, 8952 Schlieren, Switzerland
12
Pascovici D, Wu JX, McKay MJ, Joseph C, Noor Z, Kamath K, Wu Y, Ranganathan S, Gupta V, Mirzaei M. Clinically Relevant Post-Translational Modification Analyses-Maturing Workflows and Bioinformatics Tools. Int J Mol Sci 2018; 20:E16. [PMID: 30577541 PMCID: PMC6337699 DOI: 10.3390/ijms20010016] [Citation(s) in RCA: 31] [Impact Index Per Article: 5.2] [Received: 11/15/2018] [Revised: 12/09/2018] [Accepted: 12/17/2018] [Indexed: 01/04/2023]
Abstract
Post-translational modifications (PTMs) can occur soon after translation or at any stage in the lifecycle of a given protein, and they may help regulate protein folding, stability, cellular localisation, activity, or the interactions proteins have with other proteins or biomolecular species. PTMs are crucial to our functional understanding of biology, and new quantitative mass spectrometry (MS) and bioinformatics workflows are maturing both in labelled multiplexed and label-free techniques, offering increasing coverage and new opportunities to study human health and disease. Techniques such as Data Independent Acquisition (DIA) are emerging as promising approaches due to their re-mining capability. Many bioinformatics tools have been developed to support the analysis of PTMs by mass spectrometry, from prediction and PTM site assignment, to open searches that enable better mining of unassigned mass spectra (many of which likely harbour PTMs), through to understanding PTM associations and interactions. The remaining challenge lies in extracting functional information from clinically relevant PTM studies. This review focuses on canvassing the options and progress of PTM analysis for large quantitative studies, from choosing the platform, through to data analysis, with an emphasis on clinically relevant samples such as plasma and other body fluids, and well-established tools and options for data interpretation.
Affiliation(s)
- Dana Pascovici
  - Department of Molecular Sciences, Macquarie University, Sydney, NSW 2109, Australia.
  - Australian Proteome Analysis Facility, Macquarie University, Sydney, NSW 2109, Australia.
- Jemma X Wu
  - Department of Molecular Sciences, Macquarie University, Sydney, NSW 2109, Australia.
  - Australian Proteome Analysis Facility, Macquarie University, Sydney, NSW 2109, Australia.
- Matthew J McKay
  - Department of Molecular Sciences, Macquarie University, Sydney, NSW 2109, Australia.
  - Australian Proteome Analysis Facility, Macquarie University, Sydney, NSW 2109, Australia.
- Chitra Joseph
  - Department of Clinical Medicine, Macquarie University, Sydney, NSW 2109, Australia.
- Zainab Noor
  - Department of Molecular Sciences, Macquarie University, Sydney, NSW 2109, Australia.
- Karthik Kamath
  - Department of Molecular Sciences, Macquarie University, Sydney, NSW 2109, Australia.
  - Australian Proteome Analysis Facility, Macquarie University, Sydney, NSW 2109, Australia.
- Yunqi Wu
  - Department of Molecular Sciences, Macquarie University, Sydney, NSW 2109, Australia.
  - Australian Proteome Analysis Facility, Macquarie University, Sydney, NSW 2109, Australia.
- Shoba Ranganathan
  - Department of Molecular Sciences, Macquarie University, Sydney, NSW 2109, Australia.
- Vivek Gupta
  - Department of Clinical Medicine, Macquarie University, Sydney, NSW 2109, Australia.
- Mehdi Mirzaei
  - Department of Molecular Sciences, Macquarie University, Sydney, NSW 2109, Australia.
  - Australian Proteome Analysis Facility, Macquarie University, Sydney, NSW 2109, Australia.
  - Department of Clinical Medicine, Macquarie University, Sydney, NSW 2109, Australia.
13
Chromatogram libraries improve peptide detection and quantification by data independent acquisition mass spectrometry. Nat Commun 2018; 9:5128. [PMID: 30510204 PMCID: PMC6277451 DOI: 10.1038/s41467-018-07454-w] [Citation(s) in RCA: 307] [Impact Index Per Article: 51.2] [Received: 03/02/2018] [Accepted: 10/16/2018] [Indexed: 01/27/2023]
Abstract
Data independent acquisition (DIA) mass spectrometry is a powerful technique that is improving the reproducibility and throughput of proteomics studies. Here, we introduce an experimental workflow that uses this technique to construct chromatogram libraries that capture fragment ion chromatographic peak shape and retention time for every detectable peptide in a proteomics experiment. These coordinates calibrate protein databases or spectrum libraries to a specific mass spectrometer and chromatography setup, facilitating DIA-only pipelines and the reuse of global resource libraries. We also present EncyclopeDIA, a software tool for generating and searching chromatogram libraries, and demonstrate the performance of our workflow by quantifying proteins in human and yeast cells. We find that by exploiting calibrated retention time and fragmentation specificity in chromatogram libraries, EncyclopeDIA can detect 20–25% more peptides from DIA experiments than with data dependent acquisition-based spectrum libraries alone. Data-independent acquisition (DIA)-based proteomics often relies on mass spectrum libraries from data-dependent acquisition experiments. Here, the authors present a method to generate DIA-based chromatogram libraries, enabling DIA-only workflows and detecting more peptides than with spectrum libraries alone.
14
Ludwig C, Gillet L, Rosenberger G, Amon S, Collins BC, Aebersold R. Data-independent acquisition-based SWATH-MS for quantitative proteomics: a tutorial. Mol Syst Biol 2018; 14:e8126. [PMID: 30104418 PMCID: PMC6088389 DOI: 10.15252/msb.20178126] [Citation(s) in RCA: 625] [Impact Index Per Article: 104.2] [Received: 11/28/2017] [Revised: 05/11/2018] [Accepted: 05/15/2018] [Indexed: 01/16/2023]
Abstract
Many research questions in fields such as personalized medicine, drug screens or systems biology depend on obtaining consistent and quantitatively accurate proteomics data from many samples. SWATH-MS is a specific variant of data-independent acquisition (DIA) methods and is emerging as a technology that combines deep proteome coverage capabilities with quantitative consistency and accuracy. In a SWATH-MS measurement, all ionized peptides of a given sample that fall within a specified mass range are fragmented in a systematic and unbiased fashion using rather large precursor isolation windows. To analyse SWATH-MS data, a strategy based on peptide-centric scoring has been established, which typically requires prior knowledge about the chromatographic and mass spectrometric behaviour of peptides of interest in the form of spectral libraries and peptide query parameters. This tutorial provides guidelines on how to set up and plan a SWATH-MS experiment, how to perform the mass spectrometric measurement and how to analyse SWATH-MS data using peptide-centric scoring. Furthermore, concepts on how to improve SWATH-MS data acquisition, potential trade-offs of parameter settings and alternative data analysis strategies are discussed.
Affiliation(s)
- Christina Ludwig
  - Bavarian Center for Biomolecular Mass Spectrometry (BayBioMS), Technical University of Munich (TUM), Freising, Germany
- Ludovic Gillet
  - Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
- George Rosenberger
  - Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
  - Department of Systems Biology, Columbia University, New York, NY, USA
- Sabine Amon
  - Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
- Ben C Collins
  - Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
- Ruedi Aebersold
  - Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
  - Faculty of Science, University of Zurich, Zurich, Switzerland
15
Manning AJ, Lee J, Wolfgeher DJ, Kron SJ, Greenberg JT. Simple strategies to enhance discovery of acetylation post-translational modifications by quadrupole-orbitrap LC-MS/MS. Biochimica et Biophysica Acta - Proteins and Proteomics 2018; 1866:224-229. [DOI: 10.1016/j.bbapap.2017.10.006] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 04/18/2017] [Revised: 09/07/2017] [Accepted: 10/13/2017] [Indexed: 12/26/2022]
16
Collins BC, Hunter CL, Liu Y, Schilling B, Rosenberger G, Bader SL, Chan DW, Gibson BW, Gingras AC, Held JM, Hirayama-Kurogi M, Hou G, Krisp C, Larsen B, Lin L, Liu S, Molloy MP, Moritz RL, Ohtsuki S, Schlapbach R, Selevsek N, Thomas SN, Tzeng SC, Zhang H, Aebersold R. Multi-laboratory assessment of reproducibility, qualitative and quantitative performance of SWATH-mass spectrometry. Nat Commun 2017; 8:291. [PMID: 28827567 PMCID: PMC5566333 DOI: 10.1038/s41467-017-00249-5] [Citation(s) in RCA: 367] [Impact Index Per Article: 52.4] [Received: 03/17/2017] [Accepted: 06/12/2017] [Indexed: 01/15/2023]
Abstract
Quantitative proteomics employing mass spectrometry is an indispensable tool in life science research. Targeted proteomics has emerged as a powerful approach for reproducible quantification but is limited in the number of proteins quantified. SWATH-mass spectrometry consists of data-independent acquisition and a targeted data analysis strategy that aims to maintain the favorable quantitative characteristics (accuracy, sensitivity, and selectivity) of targeted proteomics at large scale. While previous SWATH-mass spectrometry studies have shown high intra-lab reproducibility, this has not been evaluated between labs. In this multi-laboratory evaluation study including 11 sites worldwide, we demonstrate that using SWATH-mass spectrometry data acquisition we can consistently detect and reproducibly quantify >4000 proteins from HEK293 cells. Using synthetic peptide dilution series, we show that the sensitivity, dynamic range and reproducibility established with SWATH-mass spectrometry are uniformly achieved. This study demonstrates that the acquisition of reproducible quantitative proteomics data by multiple labs is achievable, and broadly serves to increase confidence in SWATH-mass spectrometry data acquisition as a reproducible method for large-scale protein quantification. SWATH-mass spectrometry consists of a data-independent acquisition and a targeted data analysis strategy that aims to maintain the favorable quantitative characteristics on the scale of thousands of proteins. Here, using data generated by eleven groups worldwide, the authors show that SWATH-MS is capable of generating highly reproducible data across different laboratories.
Affiliation(s)
- Ben C Collins
  - Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, 8093, Zurich, Switzerland
- Yansheng Liu
  - Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, 8093, Zurich, Switzerland
- Birgit Schilling
  - Buck Institute for Research on Aging, 8001 Redwood Boulevard, Novato, CA, 94945, USA
- George Rosenberger
  - Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, 8093, Zurich, Switzerland
  - PhD Program in Systems Biology, University of Zurich and ETH Zurich, Zurich, 8057, Switzerland
- Samuel L Bader
  - Institute for Systems Biology, 401 Terry Avenue North, Seattle, WA, 98109, USA
- Daniel W Chan
  - Department of Pathology, Clinical Chemistry Division, Johns Hopkins University School of Medicine, Baltimore, MD, 21231, USA
- Bradford W Gibson
  - Buck Institute for Research on Aging, 8001 Redwood Boulevard, Novato, CA, 94945, USA
  - Department of Pharmaceutical Chemistry, University of California, San Francisco, CA, 94143, USA
- Anne-Claude Gingras
  - Lunenfeld-Tanenbaum Research Institute, Sinai Health System, Toronto, M5G 1X5, Ontario, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, M5S 1A8, Ontario, Canada
- Jason M Held
  - Departments of Medicine and Anesthesiology, Washington University School of Medicine, 660 South Euclid Avenue, St. Louis, MO, 63110, USA
- Mio Hirayama-Kurogi
  - Department of Pharmaceutical Microbiology, Faculty of Life Sciences, Kumamoto University, 5-1 Oe-honmachi, Chuo-ku, Kumamoto, 862-0973, Japan
- Guixue Hou
  - Proteomics Division, BGI-Shenzhen, Shenzhen, 518083, China
- Christoph Krisp
  - Department of Chemistry and Biomolecular Sciences, Australian Proteome Analysis Facility (APAF), Macquarie University, Sydney, 2109, Australia
- Brett Larsen
  - Lunenfeld-Tanenbaum Research Institute, Sinai Health System, Toronto, M5G 1X5, Ontario, Canada
- Liang Lin
  - Proteomics Division, BGI-Shenzhen, Shenzhen, 518083, China
- Siqi Liu
  - Proteomics Division, BGI-Shenzhen, Shenzhen, 518083, China
- Mark P Molloy
  - Department of Chemistry and Biomolecular Sciences, Australian Proteome Analysis Facility (APAF), Macquarie University, Sydney, 2109, Australia
- Robert L Moritz
  - Institute for Systems Biology, 401 Terry Avenue North, Seattle, WA, 98109, USA
- Sumio Ohtsuki
  - Department of Pharmaceutical Microbiology, Faculty of Life Sciences, Kumamoto University, 5-1 Oe-honmachi, Chuo-ku, Kumamoto, 862-0973, Japan
- Ralph Schlapbach
  - Functional Genomics Center Zurich, ETH Zurich/University of Zurich, Winterthurerstr. 190, 8057, Zurich, Switzerland
- Nathalie Selevsek
  - Functional Genomics Center Zurich, ETH Zurich/University of Zurich, Winterthurerstr. 190, 8057, Zurich, Switzerland
- Stefani N Thomas
  - Department of Pathology, Clinical Chemistry Division, Johns Hopkins University School of Medicine, Baltimore, MD, 21231, USA
- Shin-Cheng Tzeng
  - Departments of Medicine and Anesthesiology, Washington University School of Medicine, 660 South Euclid Avenue, St. Louis, MO, 63110, USA
- Hui Zhang
  - Department of Pathology, Clinical Chemistry Division, Johns Hopkins University School of Medicine, Baltimore, MD, 21231, USA
- Ruedi Aebersold
  - Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, 8093, Zurich, Switzerland
  - Faculty of Science, University of Zurich, Zurich, Switzerland
17
Rosenberger G, Bludau I, Schmitt U, Heusel M, Hunter CL, Liu Y, MacCoss MJ, MacLean BX, Nesvizhskii AI, Pedrioli PGA, Reiter L, Röst HL, Tate S, Ting YS, Collins BC, Aebersold R. Statistical control of peptide and protein error rates in large-scale targeted data-independent acquisition analyses. Nat Methods 2017; 14:921-927. [PMID: 28825704 PMCID: PMC5581544 DOI: 10.1038/nmeth.4398] [Citation(s) in RCA: 145] [Impact Index Per Article: 20.7] [Received: 09/12/2016] [Accepted: 07/07/2017] [Indexed: 12/18/2022]
Abstract
Liquid chromatography coupled to tandem mass spectrometry is the main method for high-throughput identification and quantification of peptides and inferred proteins. Within this field, data-independent acquisition (DIA) combined with peptide-centric scoring, exemplified by SWATH-MS, emerged as a scalable method to achieve deep and consistent proteome coverage across large-scale datasets. Here we discuss the adaptation of statistical concepts developed for discovery proteomics based on spectrum-centric scoring to large-scale DIA experiments analyzed with peptide-centric scoring strategies and provide guidance on their application. We show that optimal tradeoffs between sensitivity and specificity require careful considerations of the relationship between proteins in the samples and proteins represented in the spectral library. We propose the application of a global analyte constraint to prevent accumulation of false positives across large-scale datasets. Furthermore, to increase the quality and reproducibility of published proteomic results, well-established confidence criteria should be reported for detected peptide queries, peptides and inferred proteins.
Affiliation(s)
- George Rosenberger
  - Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
  - PhD Program in Systems Biology, University of Zurich and ETH Zurich, Zurich, Switzerland
- Isabell Bludau
  - Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
  - PhD Program in Systems Biology, University of Zurich and ETH Zurich, Zurich, Switzerland
- Uwe Schmitt
  - ID Scientific IT Services, ETH Zurich, Zurich, Switzerland
- Moritz Heusel
  - Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
  - PhD program in Molecular and Translational Biomedicine, Competence Center Personalized Medicine (CC-PM), ETH Zurich and University of Zurich, Zurich, Switzerland
- Yansheng Liu
  - Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
- Michael J MacCoss
  - Department of Genome Sciences, University of Washington, Seattle, Washington, USA
- Brendan X MacLean
  - Department of Genome Sciences, University of Washington, Seattle, Washington, USA
- Alexey I Nesvizhskii
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, USA
  - Department of Pathology, University of Michigan, Ann Arbor, Michigan, USA
- Patrick G A Pedrioli
  - Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
- Hannes L Röst
  - Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
- Ying S Ting
  - Department of Genome Sciences, University of Washington, Seattle, Washington, USA
- Ben C Collins
  - Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
- Ruedi Aebersold
  - Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
  - Faculty of Science, University of Zurich, Zurich, Switzerland
18
White FM, Wolf-Yadlin A. Methods for the Analysis of Protein Phosphorylation-Mediated Cellular Signaling Networks. Annual Review of Analytical Chemistry (Palo Alto, Calif.) 2016; 9:295-315. [PMID: 27049636 DOI: 10.1146/annurev-anchem-071015-041542] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.8] [Indexed: 06/05/2023]
Abstract
Protein phosphorylation-mediated cellular signaling networks regulate almost all aspects of cell biology, including the responses to cellular stimulation and environmental alterations. These networks are highly complex and comprise hundreds of proteins and potentially thousands of phosphorylation sites. Multiple analytical methods have been developed over the past several decades to identify proteins and protein phosphorylation sites regulating cellular signaling, and to quantify the dynamic response of these sites to different cellular stimulation. Here we provide an overview of these methods, including the fundamental principles governing each method, their relative strengths and weaknesses, and some examples of how each method has been applied to the analysis of complex signaling networks. When applied correctly, each of these techniques can provide insight into the topology, dynamics, and regulation of protein phosphorylation signaling networks.
Affiliation(s)
- Forest M White
  - Department of Biological Engineering and David H. Koch Institute for Integrative Cancer Research, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139
19
Abstract
The ultimate aim of proteomics is to fully identify and quantify the entire complement of proteins and post-translational modifications in biological samples of interest. For the last 15 years, liquid chromatography-tandem mass spectrometry (LC-MS/MS) in data-dependent acquisition (DDA) mode has been the standard for proteomics when sampling breadth and discovery were the main objectives; multiple reaction monitoring (MRM) LC-MS/MS has been the standard for targeted proteomics when precise quantification, reproducibility, and validation were the main objectives. Recently, improvements in mass spectrometer design and bioinformatics algorithms have resulted in the rediscovery and development of another sampling method: data-independent acquisition (DIA). DIA comprehensively and repeatedly samples every peptide in a protein digest, producing a complex set of mass spectra that is difficult to interpret without external spectral libraries. Currently, DIA approaches the identification breadth of DDA while achieving the reproducible quantification characteristic of MRM or its newest version, parallel reaction monitoring (PRM). In comparative de novo identification and quantification studies in human cell lysates, DIA identified up to 89% of the proteins detected in a comparable DDA experiment while providing reproducible quantification of over 85% of them. DIA analysis aided by spectral libraries derived from prior DIA experiments or auxiliary DDA data produces identification and quantification as reproducible and precise as that achieved by MRM/PRM, except on low-abundance peptides that are obscured by stronger signals. DIA is still a work in progress toward the goal of sensitive, reproducible, and precise quantification without external spectral libraries. New software tools applied to DIA analysis have to deal with deconvolution of complex spectra as well as proper filtering of false positives and false negatives. However, the future outlook is positive, and various researchers are working on novel bioinformatics techniques to address these issues and increase the reproducibility, fidelity, and identification breadth of DIA.
Affiliation(s)
- Alex Hu
  - Department of Genome Sciences, University of Washington, Seattle, WA, 98109, USA
- William S Noble
  - Department of Genome Sciences, University of Washington, Seattle, WA, 98109, USA
20
Trevisiol S, Ayoub D, Lesur A, Ancheva L, Gallien S, Domon B. The use of proteases complementary to trypsin to probe isoforms and modifications. Proteomics 2016; 16:715-28. [DOI: 10.1002/pmic.201500379] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Received: 09/17/2015] [Revised: 11/06/2015] [Accepted: 12/08/2015] [Indexed: 12/15/2022]
Affiliation(s)
- Stéphane Trevisiol
  - Luxembourg Clinical Proteomics Center (LCP), Luxembourg Institute of Health, Strassen, Luxembourg
- Daniel Ayoub
  - Luxembourg Clinical Proteomics Center (LCP), Luxembourg Institute of Health, Strassen, Luxembourg
- Antoine Lesur
  - Luxembourg Clinical Proteomics Center (LCP), Luxembourg Institute of Health, Strassen, Luxembourg
- Lina Ancheva
  - Luxembourg Clinical Proteomics Center (LCP), Luxembourg Institute of Health, Strassen, Luxembourg
- Sébastien Gallien
  - Luxembourg Clinical Proteomics Center (LCP), Luxembourg Institute of Health, Strassen, Luxembourg
- Bruno Domon
  - Luxembourg Clinical Proteomics Center (LCP), Luxembourg Institute of Health, Strassen, Luxembourg
21
Keller A, Bader SL, Kusebauch U, Shteynberg D, Hood L, Moritz RL. Opening a SWATH Window on Posttranslational Modifications: Automated Pursuit of Modified Peptides. Mol Cell Proteomics 2015; 15:1151-63. [PMID: 26704149 DOI: 10.1074/mcp.m115.054478] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.1] [Received: 08/10/2015] [Indexed: 11/06/2022]
Abstract
Posttranslational modifications of proteins play an important role in biology. For example, phosphorylation is a key component in signal transduction in all three domains of life, and histones can be modified in such a variety of ways that a histone code for gene regulation has been proposed. Shotgun proteomics is commonly used to identify posttranslational modifications as well as chemical modifications from sample processing. However, it favors the detection of abundant peptides over the repertoire presented, and the data analysis usually requires advance specification of modification masses and target amino acids, their number constrained by available computational resources. Recent advances in data independent acquisition mass spectrometry technologies such as SWATH-MS enable a deeper recording of the peptide contents of samples, including peptides with modifications. Here, we present a novel approach that applies the power of SWATH-MS analysis to the automated pursuit of modified peptides. With the new SWATHProphet(PTM) functionality added to the open source SWATHProphet software, precursor ions consistent with a modification are identified along with the mass and localization of the modification in the peptide sequence in a sensitive and unrestricted manner without the need to anticipate the modifications in advance. Using this method, we demonstrate the detection of a wide assortment of modified peptides, many unanticipated, in samples containing unpurified synthetic peptides and human urine, as well as in phospho-enriched human tissue culture cell samples.
Affiliation(s)
- Andrew Keller
  - Institute for Systems Biology, 401 Terry Avenue North, Seattle, WA 98109
- Samuel L Bader
  - Institute for Systems Biology, 401 Terry Avenue North, Seattle, WA 98109
- Ulrike Kusebauch
  - Institute for Systems Biology, 401 Terry Avenue North, Seattle, WA 98109
- David Shteynberg
  - Institute for Systems Biology, 401 Terry Avenue North, Seattle, WA 98109
- Leroy Hood
  - Institute for Systems Biology, 401 Terry Avenue North, Seattle, WA 98109
- Robert L Moritz
  - Institute for Systems Biology, 401 Terry Avenue North, Seattle, WA 98109
22
Shteynberg D, Mendoza L, Hoopmann MR, Sun Z, Schmidt F, Deutsch EW, Moritz RL. reSpect: software for identification of high and low abundance ion species in chimeric tandem mass spectra. Journal of the American Society for Mass Spectrometry 2015; 26:1837-1847. [PMID: 26419769 PMCID: PMC4750398 DOI: 10.1007/s13361-015-1252-5] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 03/03/2015] [Revised: 06/22/2015] [Accepted: 08/11/2015] [Indexed: 06/05/2023]
Abstract
Most shotgun proteomics data analysis workflows are based on the assumption that each fragment ion spectrum is explained by a single species of peptide ion isolated by the mass spectrometer; however, in reality mass spectrometers often isolate more than one peptide ion within the window of isolation that contribute to additional peptide fragment peaks in many spectra. We present a new tool called reSpect, implemented in the Trans-Proteomic Pipeline (TPP), which enables an iterative workflow whereby fragment ion peaks explained by a peptide ion identified in one round of sequence searching or spectral library search are attenuated based on the confidence of the identification, and then the altered spectrum is subjected to further rounds of searching. The reSpect tool is not implemented as a search engine, but rather as a post-search engine processing step where only fragment ion intensities are altered. This enables the application of any search engine combination in the iterations that follow. Thus, reSpect is compatible with all other protein sequence database search engines as well as peptide spectral library search engines that are supported by the TPP. We show that while some datasets are highly amenable to chimeric spectrum identification and lead to additional peptide identification boosts of over 30% with as many as four different peptide ions identified per spectrum, datasets with narrow precursor ion selection only benefit from such processing at the level of a few percent. We demonstrate a technique that facilitates the determination of the degree to which a dataset would benefit from chimeric spectrum analysis. The reSpect tool is free and open source, provided within the TPP and available at the TPP website.
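The attenuation step described in this abstract can be illustrated with a minimal sketch. It is not the reSpect implementation: the function name `attenuate_explained_peaks`, the fixed m/z tolerance, and the rule of scaling matched intensities by (1 - confidence) are all assumptions made here for illustration, and the tool's actual attenuation rule may differ.

```python
from typing import List, Tuple

Peak = Tuple[float, float]  # (m/z, intensity)


def attenuate_explained_peaks(
    peaks: List[Peak],
    explained_mz: List[float],
    confidence: float,
    tol: float = 0.02,
) -> List[Peak]:
    """Scale down peaks that match fragments of an already identified peptide.

    Assumption: a matched peak keeps the fraction (1 - confidence) of its
    intensity; unmatched peaks are returned unchanged.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must lie in [0, 1]")
    attenuated = []
    for mz, intensity in peaks:
        # A peak counts as explained if any fragment m/z lies within tol of it.
        matched = any(abs(mz - frag) <= tol for frag in explained_mz)
        new_intensity = intensity * (1.0 - confidence) if matched else intensity
        attenuated.append((mz, new_intensity))
    return attenuated


# Hypothetical spectrum in which a first-round identification explains two peaks.
spectrum = [(175.119, 100.0), (262.151, 80.0), (401.200, 50.0)]
attenuated = attenuate_explained_peaks(spectrum, [175.119, 262.151], confidence=0.75)
```

In this sketch only intensities change, so the attenuated spectrum could be passed to any search engine for a further round, which mirrors the post-search processing role the abstract describes.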
Affiliation(s)
- Zhi Sun
  - Institute for Systems Biology, Seattle, WA, USA
- Frank Schmidt
  - ZIK-FunGene Junior Research Group Applied Proteomics, Interfaculty Institute for Genetics and Functional Genomics, University Medicine Greifswald, Greifswald, Germany
23
Schilling B, MacLean B, Held JM, Sahu AK, Rardin MJ, Sorensen DJ, Peters T, Wolfe AJ, Hunter CL, MacCoss MJ, Gibson BW. Multiplexed, Scheduled, High-Resolution Parallel Reaction Monitoring on a Full Scan QqTOF Instrument with Integrated Data-Dependent and Targeted Mass Spectrometric Workflows. Anal Chem 2015; 87:10222-9. [PMID: 26398777 PMCID: PMC5677521 DOI: 10.1021/acs.analchem.5b02983] [Citation(s) in RCA: 82] [Impact Index Per Article: 9.1] [Indexed: 12/13/2022]
Abstract
Recent advances in commercial mass spectrometers with higher resolving power and faster scanning capabilities have expanded their functionality beyond traditional data-dependent acquisition (DDA) to targeted proteomics with higher precision and multiplexing. Using an orthogonal quadrupole time-of-flight (QqTOF) LC-MS system, we investigated the feasibility of implementing large-scale targeted quantitative assays using scheduled, high resolution multiple reaction monitoring (sMRM-HR), also referred to as parallel reaction monitoring (sPRM). We assessed the selectivity and reproducibility of PRM by measuring standard peptide concentration curves and system suitability assays. By evaluating up to 500 peptides in a single assay, the robustness and accuracy of PRM assays were compared to traditional SRM workflows on triple quadrupole instruments. The high resolution and high mass accuracy of the full scan MS/MS spectra resulted in sufficient selectivity to monitor 6-10 MS/MS fragment ions per target precursor, providing flexibility in postacquisition assay refinement and optimization. The general applicability of the sPRM workflow was assessed in complex biological samples by first targeting 532 peptide precursor ions in a yeast lysate, and then 466 peptide precursors from a previously generated candidate list of differentially expressed proteins in whole cell lysates from E. coli. Lastly, we found that sPRM assays could be rapidly and efficiently developed in Skyline from DDA libraries when acquired on the same QqTOF platform, greatly facilitating their successful implementation. These results establish a robust sPRM workflow on a QqTOF platform to rapidly transition from discovery analysis to highly multiplexed, targeted peptide quantitation.
Affiliation(s)
- Birgit Schilling
- Buck Institute for Research on Aging, 8001 Redwood Boulevard, Novato, California 94945, United States
- Brendan MacLean
- Department of Genome Sciences, University of Washington School of Medicine, Foege Building S113, 3720 15th Avenue NE, Seattle, Washington 98195, United States
- Jason M. Held
- Departments of Medicine and Anesthesiology, Washington University School of Medicine, 660 South Euclid Avenue, St. Louis, Missouri 63110, United States
- Alexandria K. Sahu
- Buck Institute for Research on Aging, 8001 Redwood Boulevard, Novato, California 94945, United States
- Matthew J. Rardin
- Buck Institute for Research on Aging, 8001 Redwood Boulevard, Novato, California 94945, United States
- Dylan J. Sorensen
- Buck Institute for Research on Aging, 8001 Redwood Boulevard, Novato, California 94945, United States
- Theodore Peters
- Buck Institute for Research on Aging, 8001 Redwood Boulevard, Novato, California 94945, United States
- Alan J. Wolfe
- Department of Microbiology and Immunology, Stritch School of Medicine, Health Sciences Division, Loyola University Chicago, 2160 South First Avenue, Maywood, Illinois 60153, United States
- Michael J. MacCoss
- Department of Genome Sciences, University of Washington School of Medicine, Foege Building S113, 3720 15th Avenue NE, Seattle, Washington 98195, United States
- Bradford W. Gibson
- Buck Institute for Research on Aging, 8001 Redwood Boulevard, Novato, California 94945, United States
- Department of Pharmaceutical Chemistry, University of California, San Francisco, California 94143, United States
24
Bilbao A, Zhang Y, Varesio E, Luban J, Strambio-De-Castillia C, Lisacek F, Hopfgartner G. Ranking Fragment Ions Based on Outlier Detection for Improved Label-Free Quantification in Data-Independent Acquisition LC-MS/MS. J Proteome Res 2015; 14:4581-93. [PMID: 26412574 DOI: 10.1021/acs.jproteome.5b00394] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.9] [Indexed: 12/14/2022]
Abstract
Data-independent acquisition LC-MS/MS techniques complement supervised methods for peptide quantification. However, due to the wide precursor isolation windows, these techniques are prone to interference at the fragment ion level, which, in turn, is detrimental for accurate quantification. The nonoutlier fragment ion (NOFI) ranking algorithm has been developed to assign low priority to fragment ions affected by interference. By using the optimal subset of high-priority fragment ions, these interfered fragment ions are effectively excluded from quantification. NOFI represents each fragment ion as a vector of four dimensions related to chromatographic and MS fragmentation attributes and applies multivariate outlier detection techniques. Benchmarking conducted on a well-defined quantitative data set (i.e., the SWATH Gold Standard) indicates that NOFI on average is able to accurately quantify 11-25% more peptides than the commonly used Top-N library intensity ranking method. The sum of the area of the Top3-5 NOFIs produces similar coefficients of variation as compared to that with the library intensity method but with more accurate quantification results. On a biologically relevant human dendritic cell digest data set, NOFI properly assigns low-priority ranks to 85% of annotated interferences, resulting in sensitivity values between 0.92 and 0.80, against 0.76 for the Spectronaut interference detection algorithm.
Affiliation(s)
- Aivett Bilbao
- Life Sciences Mass Spectrometry, School of Pharmaceutical Sciences, University of Geneva, University of Lausanne, CH-1211 Geneva 4, Switzerland; Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, CH-1211 Geneva 4, Switzerland
- Ying Zhang
- Life Sciences Mass Spectrometry, School of Pharmaceutical Sciences, University of Geneva, University of Lausanne, CH-1211 Geneva 4, Switzerland
- Emmanuel Varesio
- Life Sciences Mass Spectrometry, School of Pharmaceutical Sciences, University of Geneva, University of Lausanne, CH-1211 Geneva 4, Switzerland
- Jeremy Luban
- Program in Molecular Medicine, University of Massachusetts Medical School, Worcester, Massachusetts 01605, United States
- Caterina Strambio-De-Castillia
- Program in Molecular Medicine, University of Massachusetts Medical School, Worcester, Massachusetts 01605, United States
- Frédérique Lisacek
- Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, CH-1211 Geneva 4, Switzerland; Faculty of Sciences, University of Geneva, CH-1211 Geneva 4, Switzerland
- Gérard Hopfgartner
- Life Sciences Mass Spectrometry, School of Pharmaceutical Sciences, University of Geneva, University of Lausanne, CH-1211 Geneva 4, Switzerland
25
Sweredoski MJ, Moradian A, Raedle M, Franco C, Hess S. High Resolution Parallel Reaction Monitoring with Electron Transfer Dissociation for Middle-Down Proteomics. Anal Chem 2015; 87:8360-6. [DOI: 10.1021/acs.analchem.5b01542] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.8] [Indexed: 01/03/2023]
Affiliation(s)
- Michael J. Sweredoski
- Proteome Exploration Laboratory, Division of Biology and Biological Engineering, Beckman Institute, California Institute of Technology, Pasadena, California 91125, United States
- Annie Moradian
- Proteome Exploration Laboratory, Division of Biology and Biological Engineering, Beckman Institute, California Institute of Technology, Pasadena, California 91125, United States
- Matthias Raedle
- Proteome Exploration Laboratory, Division of Biology and Biological Engineering, Beckman Institute, California Institute of Technology, Pasadena, California 91125, United States
- Hochschule Weihenstephan-Triesdorf, University of Applied Sciences, Faculty of Biotechnology and Bioinformatic, Am Hofgarten 4, 85354 Freising, Germany
- Catarina Franco
- Proteome Exploration Laboratory, Division of Biology and Biological Engineering, Beckman Institute, California Institute of Technology, Pasadena, California 91125, United States
- Instituto de Tecnologia Química e Biológica, Universidade Nova de Lisboa, Av. da República, 2780-157 Oeiras, Portugal
- Sonja Hess
- Proteome Exploration Laboratory, Division of Biology and Biological Engineering, Beckman Institute, California Institute of Technology, Pasadena, California 91125, United States
26
Krautkramer KA, Reiter L, Denu JM, Dowell JA. Quantification of SAHA-Dependent Changes in Histone Modifications Using Data-Independent Acquisition Mass Spectrometry. J Proteome Res 2015; 14:3252-62. [PMID: 26120868 DOI: 10.1021/acs.jproteome.5b00245] [Citation(s) in RCA: 41] [Impact Index Per Article: 4.6] [Indexed: 11/28/2022]
Abstract
Histone post-translational modifications (PTMs) are important regulators of chromatin structure and gene expression. Quantitative analysis of histone PTMs by mass spectrometry remains extremely challenging due to the complex and combinatorial nature of histone PTMs. The most commonly used mass spectrometry-based method for high-throughput histone PTM analysis is data-dependent acquisition (DDA). However, stochastic precursor selection and dependence on MS1 ions for quantification impede comprehensive interrogation of histone PTM states using DDA methods. To overcome these limitations, we utilized a data-independent acquisition (DIA) workflow that provides superior run-to-run consistency and postacquisition flexibility in comparison to DDA methods. In addition, we developed a novel DIA-based methodology to quantify isobaric, co-eluting histone peptides that lack unique MS2 transitions. Our method enabled deconvolution and quantification of histone PTMs that are otherwise refractory to quantitation, including the heavily acetylated tail of histone H4. Using this workflow, we investigated the effects of the histone deacetylase inhibitor SAHA (suberoylanilide hydroxamic acid) on the global histone PTM state of human breast cancer MCF7 cells. A total of 62 unique histone PTMs were quantified, revealing novel SAHA-induced changes in acetylation and methylation of histones H3 and H4.
Affiliation(s)
- Lukas Reiter
- BiognoSYS AG, Wagistrasse 25, CH-8952 Schlieren, Switzerland