1
Fröhlich K, Fahrner M, Brombacher E, Seredynska A, Maldacker M, Kreutz C, Schmidt A, Schilling O. Data-Independent Acquisition: A Milestone and Prospect in Clinical Mass Spectrometry-Based Proteomics. Mol Cell Proteomics 2024; 23:100800. [PMID: 38880244; PMCID: PMC11380018; DOI: 10.1016/j.mcpro.2024.100800] [Received: 02/02/2024] [Revised: 06/08/2024] [Accepted: 06/13/2024] [Indexed: 06/18/2024] Open
Abstract
Data-independent acquisition (DIA) has revolutionized the field of mass spectrometry (MS)-based proteomics over the past few years. DIA stands out for its ability to systematically sample all peptides in a given m/z range, allowing an unbiased acquisition of proteomics data. This greatly mitigates the issue of missing values and significantly enhances quantitative accuracy, precision, and reproducibility compared to many traditional methods. This review focuses on the critical role of DIA analysis software tools, primarily focusing on their capabilities and the challenges they address in proteomic research. Advances in MS technology, such as trapped ion mobility spectrometry, or high field asymmetric waveform ion mobility spectrometry require sophisticated analysis software capable of handling the increased data complexity and exploiting the full potential of DIA. We identify and critically evaluate leading software tools in the DIA landscape, discussing their unique features, and the reliability of their quantitative and qualitative outputs. We present the biological and clinical relevance of DIA-MS and discuss crucial publications that paved the way for in-depth proteomic characterization in patient-derived specimens. Furthermore, we provide a perspective on emerging trends in clinical applications and present upcoming challenges including standardization and certification of MS-based acquisition strategies in molecular diagnostics. While we emphasize the need for continuous development of software tools to keep pace with evolving technologies, we advise researchers against uncritically accepting the results from DIA software tools. Each tool may have its own biases, and some may not be as sensitive or reliable as others. Our overarching recommendation for both researchers and clinicians is to employ multiple DIA analysis tools, utilizing orthogonal analysis approaches to enhance the robustness and reliability of their findings.
Affiliation(s)
- Klemens Fröhlich
  - Proteomics Core Facility, Biozentrum Basel, University of Basel, Basel, Switzerland
- Matthias Fahrner
  - Institute for Surgical Pathology, Medical Center - University of Freiburg, Faculty of Medicine, University of Freiburg, Freiburg, Germany; German Cancer Consortium (DKTK) and Cancer Research Center (DKFZ), Freiburg, Germany
- Eva Brombacher
  - Institute of Medical Biometry and Statistics, Faculty of Medicine and Medical Center-University of Freiburg, Freiburg, Germany; Centre for Integrative Biological Signaling Studies (CIBSS), University of Freiburg, Freiburg, Germany; Spemann Graduate School of Biology and Medicine (SGBM), University of Freiburg, Freiburg, Germany; Faculty of Biology, University of Freiburg, Freiburg, Germany
- Adrianna Seredynska
  - Institute for Surgical Pathology, Medical Center - University of Freiburg, Faculty of Medicine, University of Freiburg, Freiburg, Germany; German Cancer Consortium (DKTK) and Cancer Research Center (DKFZ), Freiburg, Germany; Faculty of Biology, University of Freiburg, Freiburg, Germany
- Maximilian Maldacker
  - Institute for Surgical Pathology, Medical Center - University of Freiburg, Faculty of Medicine, University of Freiburg, Freiburg, Germany; Faculty of Biology, University of Freiburg, Freiburg, Germany
- Clemens Kreutz
  - Institute of Medical Biometry and Statistics, Faculty of Medicine and Medical Center-University of Freiburg, Freiburg, Germany; Centre for Integrative Biological Signaling Studies (CIBSS), University of Freiburg, Freiburg, Germany
- Alexander Schmidt
  - Proteomics Core Facility, Biozentrum Basel, University of Basel, Basel, Switzerland
- Oliver Schilling
  - Institute for Surgical Pathology, Medical Center - University of Freiburg, Faculty of Medicine, University of Freiburg, Freiburg, Germany; German Cancer Consortium (DKTK) and Cancer Research Center (DKFZ), Freiburg, Germany.
2
Wen B, Freestone J, Riffle M, MacCoss MJ, Noble WS, Keich U. Assessment of false discovery rate control in tandem mass spectrometry analysis using entrapment. bioRxiv: the preprint server for biology 2024:2024.06.01.596967. [PMID: 38895431; PMCID: PMC11185562; DOI: 10.1101/2024.06.01.596967] [Indexed: 06/21/2024]
Abstract
A pressing statistical challenge in the field of mass spectrometry proteomics is how to assess whether a given software tool provides accurate error control. Each software tool for searching such data uses its own internally implemented methodology for reporting and controlling the error. Many of these software tools are closed source, with incompletely documented methodology, and the strategies for validating the error are inconsistent across tools. In this work, we identify three different methods for validating false discovery rate (FDR) control in use in the field, one of which is invalid, one of which can only provide a lower bound rather than an upper bound, and one of which is valid but under-powered. The result is that the field has a very poor understanding of how well we are doing with respect to FDR control, particularly for the analysis of data-independent acquisition (DIA) data. We therefore propose a new, more powerful method for evaluating FDR control in this setting, and we then employ that method, along with an existing lower bounding technique, to characterize a variety of popular search tools. We find that the search tools for analysis of data-dependent acquisition (DDA) data generally seem to control the FDR at the peptide level, whereas none of the DIA search tools consistently controls the FDR at the peptide level across all the datasets we investigated. Furthermore, this problem becomes much worse when the latter tools are evaluated at the protein level. These results may have significant implications for various downstream analyses, since proper FDR control has the potential to reduce noise in discovery lists and thereby boost statistical power.
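As a rough illustration of how entrapment can be used to check FDR control, the sketch below computes two entrapment-based estimates of the false discovery proportion (FDP) among a tool's reported discoveries. The abstract does not state any formulas; the two expressions used here (a lower bound and a "combined" upper-bound estimate) are assumptions of this sketch based on common usage in the entrapment literature, and the function and parameter names are hypothetical.

```python
def entrapment_fdp_estimates(n_target: int, n_entrapment: int, r: float = 1.0):
    """Return (lower_bound, combined_estimate) of the FDP among discoveries.

    n_target      -- discoveries matching the original (real) target database
    n_entrapment  -- discoveries matching the entrapment sequences
    r             -- assumed ratio of entrapment to original database size

    Both formulas are assumptions of this sketch, not taken from the abstract.
    """
    if r <= 0:
        raise ValueError("r must be positive")
    total = n_target + n_entrapment
    if total == 0:
        return 0.0, 0.0
    # Entrapment hits are known to be wrong, so their share of all
    # discoveries can only under-estimate the true FDP.
    lower_bound = n_entrapment / total
    # Scale up to account for false discoveries hidden in the original
    # targets (assumed to occur at 1/r times the entrapment rate).
    combined = min(1.0, n_entrapment * (1.0 + 1.0 / r) / total)
    return lower_bound, combined


lower, combined = entrapment_fdp_estimates(n_target=990, n_entrapment=10, r=1.0)
print(f"lower bound: {lower:.3f}, combined estimate: {combined:.3f}")
```

Under these assumptions, a tool reporting results at a nominal 1% FDR would look suspicious if the lower bound clearly exceeded 0.01, and would look consistent with valid control if the combined estimate stayed at or below 0.01.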
Affiliation(s)
- Bo Wen
  - Department of Genome Sciences, University of Washington
- Jack Freestone
  - School of Mathematics and Statistics, University of Sydney
- William S Noble
  - Department of Genome Sciences, University of Washington
  - Paul G. Allen School of Computer Science and Engineering, University of Washington
- Uri Keich
  - School of Mathematics and Statistics, University of Sydney
3
Lin A, See D, Fondrie WE, Keich U, Noble WS. Target-decoy false discovery rate estimation using Crema. Proteomics 2024; 24:e2300084. [PMID: 38380501; DOI: 10.1002/pmic.202300084] [Received: 06/13/2023] [Revised: 01/06/2024] [Accepted: 01/16/2024] [Indexed: 02/22/2024]
Abstract
Assigning statistical confidence estimates to discoveries produced by a tandem mass spectrometry proteomics experiment is critical to enabling principled interpretation of the results and assessing the cost/benefit ratio of experimental follow-up. The most common technique for computing such estimates is to use target-decoy competition (TDC), in which observed spectra are searched against a database of real (target) peptides and a database of shuffled or reversed (decoy) peptides. TDC procedures for estimating the false discovery rate (FDR) at a given score threshold have been developed for application at the level of spectra, peptides, or proteins. Although these techniques are relatively straightforward to implement, it is common in the literature to skip over the implementation details or even to make mistakes in how the TDC procedures are applied in practice. Here we present Crema, an open-source Python tool that implements several TDC methods of spectrum-, peptide- and protein-level FDR estimation. Crema is compatible with a variety of existing database search tools and provides a straightforward way to obtain robust FDR estimates.
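The target-decoy competition (TDC) estimate described in this abstract can be sketched in a few lines. This is a minimal illustration and not Crema's API: the function name, the input layout, and the "+1" correction in the numerator (a commonly used conservative convention) are assumptions of this sketch. Each input item is the winner of the target/decoy competition for one spectrum, with a higher score meaning a better match.

```python
from typing import List, Tuple


def tdc_qvalues(winners: List[Tuple[float, bool]]) -> List[Tuple[float, bool, float]]:
    """Assign q-values to competition winners given as (score, is_decoy)."""
    # Rank from best to worst score.
    ranked = sorted(winners, key=lambda w: w[0], reverse=True)

    fdrs = []
    targets = decoys = 0
    for _score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Estimated FDR among targets accepted at this score threshold.
        fdrs.append(min(1.0, (decoys + 1) / max(targets, 1)))

    # q-value: smallest estimated FDR at this threshold or any looser one.
    qvalues = [0.0] * len(fdrs)
    running_min = 1.0
    for i in range(len(fdrs) - 1, -1, -1):
        running_min = min(running_min, fdrs[i])
        qvalues[i] = running_min

    return [(s, d, q) for (s, d), q in zip(ranked, qvalues)]


example = [(10.0, False), (9.0, False), (8.0, False), (7.0, True), (6.0, False)]
for score, is_decoy, q in tdc_qvalues(example):
    print(score, "decoy" if is_decoy else "target", round(q, 3))
```

The same counting scheme applies at the peptide or protein level once each peptide or protein has been reduced to a single best-scoring entry; how that reduction is done is exactly the kind of implementation detail the abstract warns is often skipped over.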
Affiliation(s)
- Andy Lin
  - Chemical and Biological Signatures, Pacific Northwest National Laboratory, Seattle, Washington, USA
- Donavan See
  - Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, Washington, USA
- Uri Keich
  - School of Mathematics and Statistics, University of Sydney, Sydney, Australia
- William Stafford Noble
  - Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, Washington, USA
  - Department of Genome Sciences, University of Washington, Seattle, Washington, USA
4
Picciani M, Gabriel W, Giurcoiu VG, Shouman O, Hamood F, Lautenbacher L, Jensen CB, Müller J, Kalhor M, Soleymaniniya A, Kuster B, The M, Wilhelm M. Oktoberfest: Open-source spectral library generation and rescoring pipeline based on Prosit. Proteomics 2024; 24:e2300112. [PMID: 37672792; DOI: 10.1002/pmic.202300112] [Received: 06/14/2023] [Revised: 08/17/2023] [Accepted: 08/18/2023] [Indexed: 09/08/2023]
Abstract
Machine learning (ML) and deep learning (DL) models for peptide property prediction such as Prosit have enabled the creation of high quality in silico reference libraries. These libraries are used in various applications, ranging from data-independent acquisition (DIA) data analysis to data-driven rescoring of search engine results. Here, we present Oktoberfest, an open source Python package of our spectral library generation and rescoring pipeline originally only available online via ProteomicsDB. Oktoberfest is largely search engine agnostic and provides access to online peptide property predictions, promoting the adoption of state-of-the-art ML/DL models in proteomics analysis pipelines. We demonstrate its ability to reproduce and even improve our results from previously published rescoring analyses on two distinct use cases. Oktoberfest is freely available on GitHub (https://github.com/wilhelm-lab/oktoberfest) and can easily be installed locally through the cross-platform PyPI Python package.
Affiliation(s)
- Mario Picciani
  - Computational Mass Spectrometry, TUM School of Life Sciences, Technical University of Munich, Freising, Germany
- Wassim Gabriel
  - Computational Mass Spectrometry, TUM School of Life Sciences, Technical University of Munich, Freising, Germany
- Victor-George Giurcoiu
  - Computational Mass Spectrometry, TUM School of Life Sciences, Technical University of Munich, Freising, Germany
- Omar Shouman
  - Computational Mass Spectrometry, TUM School of Life Sciences, Technical University of Munich, Freising, Germany
- Firas Hamood
  - Chair of Proteomics and Bioanalytics, TUM School of Life Sciences, Technical University of Munich, Freising, Germany
- Ludwig Lautenbacher
  - Computational Mass Spectrometry, TUM School of Life Sciences, Technical University of Munich, Freising, Germany
- Cecilia Bang Jensen
  - Chair of Proteomics and Bioanalytics, TUM School of Life Sciences, Technical University of Munich, Freising, Germany
- Julian Müller
  - Chair of Proteomics and Bioanalytics, TUM School of Life Sciences, Technical University of Munich, Freising, Germany
- Mostafa Kalhor
  - Computational Mass Spectrometry, TUM School of Life Sciences, Technical University of Munich, Freising, Germany
- Armin Soleymaniniya
  - Computational Mass Spectrometry, TUM School of Life Sciences, Technical University of Munich, Freising, Germany
- Bernhard Kuster
  - Chair of Proteomics and Bioanalytics, TUM School of Life Sciences, Technical University of Munich, Freising, Germany
- Matthew The
  - Chair of Proteomics and Bioanalytics, TUM School of Life Sciences, Technical University of Munich, Freising, Germany
- Mathias Wilhelm
  - Computational Mass Spectrometry, TUM School of Life Sciences, Technical University of Munich, Freising, Germany
5
Kotimoole CN, Ramya VK, Kaur P, Reiling N, Shandil RK, Narayanan S, Flo TH, Prasad TSK. Discovery of Species-Specific Proteotypic Peptides To Establish a Spectral Library Platform for Identification of Nontuberculosis Mycobacteria from Mass Spectrometry-Based Proteomics. J Proteome Res 2024; 23:1102-1117. [PMID: 38358903; DOI: 10.1021/acs.jproteome.3c00850] [Indexed: 02/17/2024]
Abstract
Nontuberculous mycobacteria (NTMs) are opportunistic bacteria that cause pulmonary and extra-pulmonary infections in humans and closely resemble Mycobacterium tuberculosis. Although genome sequencing strategies have helped to identify NTMs, a common assay for detecting coinfection by multiple NTMs together with M. tuberculosis at the first diagnostic attempt is still elusive. This lack of efficiency leads to delayed therapy, an inappropriate choice of drugs, drug resistance, disease complications, morbidity, and mortality. Although a high-resolution LC-MS/MS-based multiprotein panel assay can be developed due to its specificity and sensitivity, it needs a library of species-specific peptides as a platform. Toward this, we performed an analysis of proteomes of 9 NTM species with more than 20 million peptide spectrum matches gathered from 26 proteome data sets. Our metaproteomic analyses determined 48,172 species-specific proteotypic peptides across 9 NTMs. Notably, M. smegmatis (26,008), M. abscessus (12,442), M. vaccae (6487), M. fortuitum (1623), M. avium subsp. paratuberculosis (844), M. avium subsp. hominissuis (580), and M. marinum (112) displayed >100 species-specific proteotypic peptides. Finally, these peptides and corresponding spectra have been compiled into a spectral library and into FASTA and JSON formats for future reference and validation in clinical cohorts by the biomedical community for further translation.
Affiliation(s)
- Chinmaya Narayana Kotimoole
  - Center for Systems Biology and Molecular Medicine, Yenepoya Research Centre, Yenepoya (Deemed to be University), Mangalore 575018, India
- Vadageri Krishnamurthy Ramya
  - Foundation for Neglected Disease Research, 20A, KIADB Industrial Area, Veerapura Village, Doddaballapur, Bengaluru 561203, India
- Parvinder Kaur
  - Foundation for Neglected Disease Research, 20A, KIADB Industrial Area, Veerapura Village, Doddaballapur, Bengaluru 561203, India
- Norbert Reiling
  - Microbial Interface Biology, Research Center Borstel, Leibniz Lung Center, Parkallee 22, D-23845 Borstel, Germany
  - German Center for Infection Research (DZIF), Site Hamburg-Lübeck-Borstel-Riems, 23845 Borstel, Germany
- Radha Krishan Shandil
  - Foundation for Neglected Disease Research, 20A, KIADB Industrial Area, Veerapura Village, Doddaballapur, Bengaluru 561203, India
- Shridhar Narayanan
  - Foundation for Neglected Disease Research, 20A, KIADB Industrial Area, Veerapura Village, Doddaballapur, Bengaluru 561203, India
- Trude Helen Flo
  - Centre of Molecular Inflammation Research, Department of Clinical and Molecular Medicine, Faculty of Medicine and Health Sciences, Norwegian University of Science and Technology, Kunnskapssenteret, Øya 424.04.035, Norway
6
The M, Picciani M, Jensen C, Gabriel W, Kuster B, Wilhelm M. AI-Assisted Processing Pipeline to Boost Protein Isoform Detection. Methods Mol Biol 2024; 2836:157-181. [PMID: 38995541; DOI: 10.1007/978-1-0716-4007-4_10] [Indexed: 07/13/2024]
Abstract
Proteomics, the study of proteins within biological systems, has seen remarkable advancements in recent years, with protein isoform detection emerging as one of the next major frontiers. One of the primary challenges is achieving the necessary peptide and protein coverage to confidently differentiate isoforms as a result of the protein inference problem and protein false discovery rate estimation challenge in large data. In this chapter, we describe the application of artificial intelligence-assisted peptide property prediction for database search engine rescoring by Oktoberfest, an approach that has proven effective, particularly for complex samples and extensive search spaces, which can greatly increase peptide coverage. Further, it illustrates a method for increasing isoform coverage by the PickedGroupFDR approach that is designed to excel when applied on large data. Real-world examples are provided to illustrate the utility of the tools in the context of rescoring, protein grouping, and false discovery rate estimation. By implementing these cutting-edge techniques, researchers can achieve a substantial increase in both peptide and isoform coverage, thus unlocking the potential of protein isoform detection in their studies and shedding light on their roles and functions in biological processes.
Affiliation(s)
- Matthew The
  - Chair of Proteomics and Bioanalytics, TUM School of Life Sciences, Technical University of Munich, Freising, Germany
- Mario Picciani
  - Computational Mass Spectrometry, TUM School of Life Sciences, Technical University of Munich, Freising, Germany
- Cecilia Jensen
  - Chair of Proteomics and Bioanalytics, TUM School of Life Sciences, Technical University of Munich, Freising, Germany
- Wassim Gabriel
  - Computational Mass Spectrometry, TUM School of Life Sciences, Technical University of Munich, Freising, Germany
- Bernhard Kuster
  - Chair of Proteomics and Bioanalytics, TUM School of Life Sciences, Technical University of Munich, Freising, Germany
- Mathias Wilhelm
  - Computational Mass Spectrometry, TUM School of Life Sciences, Technical University of Munich, Freising, Germany.
7
Liu L, Trendel J, Jiang G, Liu Y, Bruckmann A, Küster B, Sprunck S, Dresselhaus T, Bleckmann A. RBPome identification in egg-cell like callus of Arabidopsis. Biol Chem 2023; 404:1137-1149. [PMID: 37768858; DOI: 10.1515/hsz-2023-0195] [Received: 04/29/2023] [Accepted: 09/11/2023] [Indexed: 09/30/2023]
Abstract
RNA binding proteins (RBPs) have multiple and essential roles in transcriptional and posttranscriptional regulation of gene expression in all living organisms. Their biochemical identification in the proteome of a given cell or tissue requires significant protein amounts, which limits studies in rare and highly specialized cells. As a consequence, we know almost nothing about the role(s) of RBPs in reproductive processes such as egg cell development, fertilization and early embryogenesis in flowering plants. To systematically identify the RBPome of egg cells in the model plant Arabidopsis, we performed RNA interactome capture (RIC) experiments using the egg cell-like RKD2-callus and were able to identify 728 proteins associated with poly(A+)-RNA. Transcripts for 97 % of identified proteins could be verified in the egg cell transcriptome. 46 % of identified proteins can be associated with the RNA life cycle. Proteins involved in mRNA binding, RNA processing and metabolism are highly enriched. Compared with the few available RBPome datasets of vegetative plant tissues, we identified 475 egg cell-enriched RBPs, which will now serve as a resource to study RBP function(s) during egg cell development, fertilization and early embryogenesis. First candidates were already identified showing an egg cell-specific expression pattern in ovules.
Affiliation(s)
- Liping Liu
  - Cell Biology and Plant Biochemistry, University of Regensburg, D-93053 Regensburg, Germany
- Jakob Trendel
  - Chair of Proteomics and Bioanalytics, Technical University of Munich (TUM), D-85354 Freising, Germany
- Guojing Jiang
  - Cell Biology and Plant Biochemistry, University of Regensburg, D-93053 Regensburg, Germany
- Yanhui Liu
  - College of Life Science, Longyan University, Longyan 364012, China
- Astrid Bruckmann
  - Biochemistry I, University of Regensburg, D-93053 Regensburg, Germany
- Bernhard Küster
  - Chair of Proteomics and Bioanalytics, Technical University of Munich (TUM), D-85354 Freising, Germany
- Stefanie Sprunck
  - Cell Biology and Plant Biochemistry, University of Regensburg, D-93053 Regensburg, Germany
- Thomas Dresselhaus
  - Cell Biology and Plant Biochemistry, University of Regensburg, D-93053 Regensburg, Germany
- Andrea Bleckmann
  - Cell Biology and Plant Biochemistry, University of Regensburg, D-93053 Regensburg, Germany
8
Abele M, Doll E, Bayer FP, Meng C, Lomp N, Neuhaus K, Scherer S, Kuster B, Ludwig C. Unified Workflow for the Rapid and In-Depth Characterization of Bacterial Proteomes. Mol Cell Proteomics 2023; 22:100612. [PMID: 37391045; PMCID: PMC10407251; DOI: 10.1016/j.mcpro.2023.100612] [Received: 01/19/2023] [Revised: 05/18/2023] [Accepted: 06/26/2023] [Indexed: 07/02/2023] Open
Abstract
Bacteria are the most abundant and diverse organisms among the kingdoms of life. Due to this excessive variance, finding a unified, comprehensive, and safe workflow for quantitative bacterial proteomics is challenging. In this study, we have systematically evaluated and optimized sample preparation, mass spectrometric data acquisition, and data analysis strategies in bacterial proteomics. We investigated workflow performances on six representative species with highly different physiologic properties to mimic bacterial diversity. The best sample preparation strategy was a cell lysis protocol in 100% trifluoroacetic acid followed by an in-solution digest. Peptides were separated on a 30-min linear microflow liquid chromatography gradient and analyzed in data-independent acquisition mode. Data analysis was performed with DIA-NN using a predicted spectral library. Performance was evaluated according to the number of identified proteins, quantitative precision, throughput, costs, and biological safety. With this rapid workflow, over 40% of all encoded genes were detected per bacterial species. We demonstrated the general applicability of our workflow on a set of 23 taxonomically and physiologically diverse bacterial species. We could confidently identify over 45,000 proteins in the combined dataset, of which 30,000 have not been experimentally validated before. Our work thereby provides a valuable resource for the microbial scientific community. Finally, we grew Escherichia coli and Bacillus cereus in replicates under 12 different cultivation conditions to demonstrate the high-throughput suitability of the workflow. The proteomic workflow we present in this manuscript does not require any specialized equipment or commercial software and can be easily applied by other laboratories to support and accelerate the proteomic exploration of the bacterial kingdom.
Affiliation(s)
- Miriam Abele
  - Bavarian Center for Biomolecular Mass Spectrometry (BayBioMS), TUM School of Life Sciences, Technical University of Munich, Freising, Germany; Division of Proteomics and Bioanalytics, TUM School of Life Sciences, Technical University of Munich, Freising, Germany
- Etienne Doll
  - Division of Microbial Ecology, TUM School of Life Sciences, Technical University of Munich, Freising, Germany
- Florian P Bayer
  - Division of Proteomics and Bioanalytics, TUM School of Life Sciences, Technical University of Munich, Freising, Germany
- Chen Meng
  - Bavarian Center for Biomolecular Mass Spectrometry (BayBioMS), TUM School of Life Sciences, Technical University of Munich, Freising, Germany
- Nina Lomp
  - Bavarian Center for Biomolecular Mass Spectrometry (BayBioMS), TUM School of Life Sciences, Technical University of Munich, Freising, Germany
- Klaus Neuhaus
  - Division of Microbial Ecology, TUM School of Life Sciences, Technical University of Munich, Freising, Germany; Core Facility Microbiome, ZIEL - Institute for Food & Health, TUM School of Life Sciences, Technical University of Munich, Freising, Germany
- Siegfried Scherer
  - Division of Microbial Ecology, TUM School of Life Sciences, Technical University of Munich, Freising, Germany
- Bernhard Kuster
  - Bavarian Center for Biomolecular Mass Spectrometry (BayBioMS), TUM School of Life Sciences, Technical University of Munich, Freising, Germany; Division of Proteomics and Bioanalytics, TUM School of Life Sciences, Technical University of Munich, Freising, Germany
- Christina Ludwig
  - Bavarian Center for Biomolecular Mass Spectrometry (BayBioMS), TUM School of Life Sciences, Technical University of Munich, Freising, Germany.
9
Higgins L, Gerdes H, Cutillas PR. Principles of phosphoproteomics and applications in cancer research. Biochem J 2023; 480:403-420. [PMID: 36961757; PMCID: PMC10212522; DOI: 10.1042/bcj20220220] [Received: 12/13/2022] [Revised: 02/24/2023] [Accepted: 02/28/2023] [Indexed: 03/25/2023]
Abstract
Phosphorylation constitutes the most common and best-studied regulatory post-translational modification in biological systems and archetypal signalling pathways driven by protein and lipid kinases are disrupted in essentially all cancer types. Thus, the study of the phosphoproteome stands to provide unique biological information on signalling pathway activity and on kinase network circuitry that is not captured by genetic or transcriptomic technologies. Here, we discuss the methods and tools used in phosphoproteomics and highlight how this technique has been used, and can be used in the future, for cancer research. Challenges still exist in mass spectrometry phosphoproteomics and in the software required to provide biological information from these datasets. Nevertheless, improvements in mass spectrometers with enhanced scan rates, separation capabilities and sensitivity, in biochemical methods for sample preparation and in computational pipelines are enabling an increasingly deep analysis of the phosphoproteome, where previous bottlenecks in data acquisition, processing and interpretation are being relieved. These powerful hardware and algorithmic innovations are not only providing exciting new mechanistic insights into tumour biology, from where new drug targets may be derived, but are also leading to the discovery of phosphoproteins as mediators of drug sensitivity and resistance and as classifiers of disease subtypes. These studies are, therefore, uncovering phosphoproteins as a new generation of disruptive biomarkers to improve personalised anti-cancer therapies.
Affiliation(s)
- Luke Higgins
  - Cell Signaling and Proteomics Group, Centre for Genomics and Computational Biology, Barts Cancer Institute, Queen Mary University of London, London, U.K.
- Henry Gerdes
  - Cell Signaling and Proteomics Group, Centre for Genomics and Computational Biology, Barts Cancer Institute, Queen Mary University of London, London, U.K.
- Pedro R. Cutillas
  - Cell Signaling and Proteomics Group, Centre for Genomics and Computational Biology, Barts Cancer Institute, Queen Mary University of London, London, U.K.
  - Alan Turing Institute, The British Library, London, U.K.
  - Digital Environment Research Institute, Queen Mary University of London, London, U.K.
10
Prakash A, García-Seisdedos D, Wang S, Kundu DJ, Collins A, George N, Moreno P, Papatheodorou I, Jones AR, Vizcaíno JA. Integrated View of Baseline Protein Expression in Human Tissues. J Proteome Res 2023; 22:729-742. [PMID: 36577097; PMCID: PMC9990129; DOI: 10.1021/acs.jproteome.2c00406] [Indexed: 12/29/2022]
Abstract
The availability of proteomics datasets in the public domain, and in the PRIDE database, in particular, has increased dramatically in recent years. This unprecedented large-scale availability of data provides an opportunity for combined analyses of datasets to get organism-wide protein abundance data in a consistent manner. We have reanalyzed 24 public proteomics datasets from healthy human individuals to assess baseline protein abundance in 31 organs. We defined tissue as a distinct functional or structural region within an organ. Overall, the aggregated dataset contains 67 healthy tissues, corresponding to 3,119 mass spectrometry runs covering 498 samples from 489 individuals. We compared protein abundances between different organs and studied the distribution of proteins across these organs. We also compared the results with data generated in analogous studies. Additionally, we performed gene ontology and pathway-enrichment analyses to identify organ-specific enriched biological processes and pathways. As a key point, we have integrated the protein abundance results into the resource Expression Atlas, where they can be accessed and visualized either individually or together with gene expression data coming from transcriptomics datasets. We believe this is a good mechanism to make proteomics data more accessible for life scientists.
Affiliation(s)
- Ananth Prakash
  - European Molecular Biology Laboratory - European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom; Open Targets, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
- David García-Seisdedos
  - European Molecular Biology Laboratory - European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
- Shengbo Wang
  - European Molecular Biology Laboratory - European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
- Deepti Jaiswal Kundu
  - European Molecular Biology Laboratory - European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
- Andrew Collins
  - Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZB, United Kingdom
- Nancy George
  - European Molecular Biology Laboratory - European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
- Pablo Moreno
  - European Molecular Biology Laboratory - European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
- Irene Papatheodorou
  - European Molecular Biology Laboratory - European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom; Open Targets, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
- Andrew R Jones
  - Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZB, United Kingdom
- Juan Antonio Vizcaíno
  - European Molecular Biology Laboratory - European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom; Open Targets, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
11
Phlairaharn T, Ye Z, Krismer E, Pedersen AK, Pietzner M, Olsen JV, Schoof EM, Searle BC. Optimizing linear ion trap data independent acquisition towards single cell proteomics. bioRxiv: the preprint server for biology 2023:2023.02.21.529444. [PMID: 36865114; PMCID: PMC9980145; DOI: 10.1101/2023.02.21.529444] [Indexed: 02/23/2023]
Abstract
A linear ion trap (LIT) is an affordable, robust mass spectrometer that provides fast scanning speed and high sensitivity; its primary disadvantage is inferior mass accuracy compared to the more commonly used time-of-flight (TOF) or orbitrap (OT) mass analyzers. Previous efforts to utilize the LIT for low-input proteomics analysis still rely on either built-in OTs for collecting precursor data or OT-based library generation. Here, we demonstrate the potential versatility of the LIT for low-input proteomics as a stand-alone mass analyzer for all mass spectrometry measurements, including library generation. To test this approach, we first optimized LIT data acquisition methods and performed library-free searches with and without entrapment peptides to evaluate both the detection and quantification accuracy. We then generated matrix-matched calibration curves to estimate the lower limit of quantification using only 10 ng of starting material. While LIT-MS1 measurements provided poor quantitative accuracy, LIT-MS2 measurements were quantitatively accurate down to 0.5 ng on column. Finally, we optimized a suitable strategy for spectral library generation from low-input material, which we used to analyze single-cell samples by LIT-DIA using LIT-based libraries generated from as few as 40 cells.