1
Vegesna M, Sundararaman N, Bharadwaj A, Washington K, Pandey R, Haghani A, Chazarin B, Binek A, Fu Q, Cheng S, Herrington D, Van Eyk JE. Enhancing Proteomics Quality Control: Insights from the Visualization Tool QCeltis. J Proteome Res 2025; 24:1148-1160. [PMID: 39992359 DOI: 10.1021/acs.jproteome.4c00777] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/25/2025]
Abstract
Large-scale mass-spectrometry-based proteomics experiments are complex and prone to analytical variability, requiring rigorous quality checks across each step in the workflow: sample preparation, chromatography, mass spectrometry, and the bioinformatics stages. This includes quality control (QC) measures that address biological and technical variation. Most QC approaches involve detecting sample outliers and monitoring parameters related to sample preparation and mass spectrometer performance. Evaluating these parameters regularly is essential for reliable downstream analysis and proteomics research. Here, we introduce "QCeltis", a Python package designed to facilitate automated QC analysis across the proteomics workflow, aiding in the identification of technical biases and consistency verification. QCeltis is a versatile tool for detecting QC issues in large-scale data-independent acquisition proteomics experiments by not only identifying sample preparation and acquisition issues but also aiding in differentiating between QC issues vs batch effects. QCeltis is available for command-line use in Windows and Linux environments. We present three case studies showcasing QCeltis's capabilities across different data sets, including depleted plasma, whole blood vs plasma, and dried blood spot samples, emphasizing its potential impact on large-scale proteomics projects. This package can be used to enhance data reliability and enable nuanced downstream analysis and interpretation for proteomics studies.
Affiliation(s)
- Manasa Vegesna
- Smidt Heart Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States
- Advanced Clinical Biosystems Research Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States
| | - Niveda Sundararaman
- Smidt Heart Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States
- Advanced Clinical Biosystems Research Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States
| | - Ajay Bharadwaj
- Smidt Heart Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States
- Advanced Clinical Biosystems Research Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States
| | - Kirstin Washington
- Smidt Heart Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States
- Advanced Clinical Biosystems Research Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States
| | - Rakhi Pandey
- Smidt Heart Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States
- Advanced Clinical Biosystems Research Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States
| | - Ali Haghani
- Smidt Heart Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States
- Advanced Clinical Biosystems Research Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States
| | - Blandine Chazarin
- Smidt Heart Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States
- Advanced Clinical Biosystems Research Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States
| | - Aleksandra Binek
- Smidt Heart Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States
- Advanced Clinical Biosystems Research Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States
| | - Qin Fu
- Smidt Heart Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States
- Advanced Clinical Biosystems Research Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States
| | - Susan Cheng
- Smidt Heart Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States
| | - David Herrington
- Department of Cardiovascular Medicine, Wake Forest University, Winston-Salem, North Carolina 27101, United States
| | - Jennifer E Van Eyk
- Smidt Heart Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States
- Advanced Clinical Biosystems Research Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States
2
Sing JC, Charkow J, Walter A, Gao M, Müller TD, Bittremieux W, Sachsenberg T, Röst HL. pyOpenMS-viz: Streamlining Mass Spectrometry Data Visualization with pandas. J Proteome Res 2025. [PMID: 40019346 DOI: 10.1021/acs.jproteome.4c00873] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/01/2025]
Abstract
Mass spectrometry data visualization is essential for a wide range of applications, such as validation of workflows and results, benchmarking new algorithms, and creating comprehensive quality control reports. Python offers a popular and powerful framework for analyzing and visualizing multidimensional data; however, generating commonly used mass spectrometry plots in Python can be cumbersome. Here we present pyOpenMS-viz, a versatile, unified framework for generating mass spectrometry plots. pyOpenMS-viz directly extends pandas DataFrame plotting for generating figures in a single line of code. This implementation enables easy integration across various Python-based mass spectrometry tools that already use pandas DataFrames to store MS data. pyOpenMS-viz is open-source under a BSD 3-Clause license and freely available at https://github.com/OpenMS/pyopenms_viz.
Affiliation(s)
- Justin Cyril Sing
- Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario M5S 3E1, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario M5S 3E1, Canada
| | - Joshua Charkow
- Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario M5S 3E1, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario M5S 3E1, Canada
| | - Axel Walter
- Applied Bioinformatics, Department of Computer Science, University of Tuebingen, 72076 Tübingen, Germany
- Institute for Bioinformatics and Medical Informatics, University of Tuebingen, Tübingen 72076, Germany
| | - Mingxuan Gao
- Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario M5S 3E1, Canada
| | - Tom David Müller
- Applied Bioinformatics, Department of Computer Science, University of Tuebingen, 72076 Tübingen, Germany
| | - Wout Bittremieux
- Department of Computer Science, University of Antwerp, 2020 Antwerp, Belgium
| | - Timo Sachsenberg
- Applied Bioinformatics, Department of Computer Science, University of Tuebingen, 72076 Tübingen, Germany
- Institute for Bioinformatics and Medical Informatics, University of Tuebingen, Tübingen 72076, Germany
| | - Hannes Luc Röst
- Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario M5S 3E1, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario M5S 3E1, Canada
- Department of Computer Science, University of Toronto, Toronto, Ontario M5S 3E1, Canada
3
Xia D, Pan G, Liu Y, Liu H, Zhao B, Wu J, Tang T, Lu G, Wang R. Unlocking the future potential of SWATH-MS: Advancing non-target screening workflow for the qualitative and quantitative analysis of emerging contaminants. Water Research 2025; 277:123323. [PMID: 40020354 DOI: 10.1016/j.watres.2025.123323] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/22/2024] [Revised: 02/15/2025] [Accepted: 02/17/2025] [Indexed: 03/03/2025]
Abstract
SWATH-MS offers a robust data-independent acquisition method for complex proteomics and metabolomics. This study presents a detailed non-target screening workflow utilizing SWATH-MS to detect and analyze emerging contaminants (ECs) in aquatic environments. Our workflow, covering peak picking, alignment, prioritization, structure identification, and quantification, effectively identified all qualifying peaks from 298 standard compounds with different concentrations, discarding any that did not meet the criteria. In extracts of real water samples spiked at 100 and 10 ng/mL, our workflow prioritized 2083 and 1328 features, respectively. Following structure identification, these features were assigned confidence levels ranging from 1 to 5. Of these, 215 and 92 spiked standards achieved level 1. The remaining standards were not recognized as level 1 due to low intensities or poor peak shapes that failed to meet certain criteria. Additionally, using fragment ion peak areas for quantification significantly improved the linearity of standard curves, enhancing R² values for ∼63% of the standards. Incorporating fragment ion data improved quantification accuracy, increasing compounds within the 80%-120% range from 78% to 90% at 100 ng/mL and within the 50%-150% range from 36% to 69% at 10 ng/mL. These findings underscore SWATH-MS's potential to enhance monitoring of ECs and ecological risk assessments, providing critical insights for environmental management.
Affiliation(s)
- Di Xia
- South China Institute of Environmental Sciences, Ministry of Ecology and Environment, Guangzhou 510655, China; State Environmental Protection Key Laboratory of Water Environmental Simulation and Pollution Control, South China Institute of Environmental Sciences, Ministry of Ecology and Environment, Guangzhou 510655, China
| | - Guofang Pan
- South China Institute of Environmental Sciences, Ministry of Ecology and Environment, Guangzhou 510655, China; State Environmental Protection Key Laboratory of Water Environmental Simulation and Pollution Control, South China Institute of Environmental Sciences, Ministry of Ecology and Environment, Guangzhou 510655, China
| | - Yaxiong Liu
- NMPA Key Laboratory of Rapid Drug Inspection Technology, Guangzhou 510663, China
| | - He Liu
- South China Institute of Environmental Sciences, Ministry of Ecology and Environment, Guangzhou 510655, China; State Environmental Protection Key Laboratory of Water Environmental Simulation and Pollution Control, South China Institute of Environmental Sciences, Ministry of Ecology and Environment, Guangzhou 510655, China
| | - Bo Zhao
- South China Institute of Environmental Sciences, Ministry of Ecology and Environment, Guangzhou 510655, China; State Environmental Protection Key Laboratory of Water Environmental Simulation and Pollution Control, South China Institute of Environmental Sciences, Ministry of Ecology and Environment, Guangzhou 510655, China
| | - Jiahui Wu
- South China Institute of Environmental Sciences, Ministry of Ecology and Environment, Guangzhou 510655, China; State Environmental Protection Key Laboratory of Water Environmental Simulation and Pollution Control, South China Institute of Environmental Sciences, Ministry of Ecology and Environment, Guangzhou 510655, China
| | - Ting Tang
- School of Environment and Energy, South China University of Technology, Guangzhou 510006, China; The Key Lab of Pollution Control and Ecosystem Restoration in Industry Clusters, Ministry of Education, South China University of Technology, Guangzhou Higher Education Mega Centre, Guangzhou 510006, China
| | - Guining Lu
- School of Environment and Energy, South China University of Technology, Guangzhou 510006, China; The Key Lab of Pollution Control and Ecosystem Restoration in Industry Clusters, Ministry of Education, South China University of Technology, Guangzhou Higher Education Mega Centre, Guangzhou 510006, China
| | - Rui Wang
- South China Institute of Environmental Sciences, Ministry of Ecology and Environment, Guangzhou 510655, China; State Environmental Protection Key Laboratory of Water Environmental Simulation and Pollution Control, South China Institute of Environmental Sciences, Ministry of Ecology and Environment, Guangzhou 510655, China; Australian Laboratory for Emerging Contaminants, School of Chemistry, University of Melbourne, Victoria 3010, Australia.
4
Prakash A, Collins A, Vilmovsky L, Fexova S, Jones AR, Vizcaino JA. Integrated View of Baseline Protein Expression in Human Tissues Using Public Data Independent Acquisition Data Sets. J Proteome Res 2025; 24:685-695. [PMID: 39764611 PMCID: PMC11811993 DOI: 10.1021/acs.jproteome.4c00788] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/11/2024] [Revised: 11/18/2024] [Accepted: 12/19/2024] [Indexed: 02/08/2025]
Abstract
The PRIDE database is the largest public data repository of mass spectrometry-based proteomics data and currently stores more than 40,000 data sets covering a wide range of organisms, experimental techniques, and biological conditions. During the past few years, PRIDE has seen a significant increase in the amount of submitted data-independent acquisition (DIA) proteomics data sets. This provides an excellent opportunity for large-scale data reanalysis and reuse. We have reanalyzed 15 public label-free DIA data sets across various healthy human tissues to provide a state-of-the-art view of the human proteome in baseline conditions (without any perturbations). We computed baseline protein abundances and compared them across various tissues, samples, and data sets. Our second aim was to compare the protein abundances obtained here with the results of previous analyses using human baseline data-dependent acquisition (DDA) data sets. We observed a good correlation across some tissues, especially in the liver and colon, but weak correlations were found in others, such as the lung and pancreas. The reanalyzed results, including protein abundance values and curated metadata, are made available to view and download from the resource Expression Atlas.
Affiliation(s)
- Ananth Prakash
- European Molecular Biology Laboratory-European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, U.K.
| | - Andrew Collins
- Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZB, U.K.
| | - Liora Vilmovsky
- European Molecular Biology Laboratory-European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, U.K.
| | - Silvie Fexova
- European Molecular Biology Laboratory-European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, U.K.
| | - Andrew R. Jones
- Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZB, U.K.
| | - Juan Antonio Vizcaino
- European Molecular Biology Laboratory-European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, U.K.
5
Cui Y, Arnold FJ, Li JS, Wu J, Wang D, Philippe J, Colwin MR, Michels S, Chen C, Sallam T, Thompson LM, La Spada AR, Li W. Multi-omic quantitative trait loci link tandem repeat size variation to gene regulation in human brain. Nat Genet 2025; 57:369-378. [PMID: 39809899 DOI: 10.1038/s41588-024-02057-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/26/2024] [Accepted: 12/10/2024] [Indexed: 01/16/2025]
Abstract
Tandem repeat (TR) size variation is implicated in ~50 neurological disorders, yet its impact on gene regulation in the human brain remains largely unknown. In the present study, we quantified the impact of TR size variation on brain gene regulation across distinct molecular phenotypes, based on 4,412 multi-omics samples from 1,597 donors, including 1,586 newly sequenced ones. We identified ~2.2 million TR molecular quantitative trait loci (TR-xQTLs), linking ~139,000 unique TRs to nearby molecular phenotypes, including many known disease-risk TRs, such as the G2C4 expansion in C9orf72 associated with amyotrophic lateral sclerosis. Fine-mapping revealed ~18,700 TRs as potential causal variants. Our in vitro experiments further confirmed the causal and independent regulatory effects of three TRs. Additional colocalization analysis indicated the potential causal role of TR variation in brain-related phenotypes, highlighted by a 3'-UTR TR in NUDT14 linked to cortical surface area and a TG repeat in PLEKHA1, associated with Alzheimer's disease.
Affiliation(s)
- Ya Cui
- Division of Computational Biomedicine, Department of Biological Chemistry, University of California, Irvine, Irvine, CA, USA.
| | - Frederick J Arnold
- Departments of Pathology & Laboratory Medicine, Neurology, Biological Chemistry, and Neurobiology & Behavior, University of California, Irvine, Irvine, CA, USA
| | - Jason Sheng Li
- Division of Computational Biomedicine, Department of Biological Chemistry, University of California, Irvine, Irvine, CA, USA
| | - Jie Wu
- Departments of Psychiatry and Human Behavior, Neurobiology and Behavior, and Biological Chemistry, University of California, Irvine, Irvine, CA, USA
| | - Dan Wang
- Division of Cardiology, Department of Medicine, University of California, Los Angeles, Los Angeles, CA, USA
| | - Julien Philippe
- Departments of Pathology & Laboratory Medicine, Neurology, Biological Chemistry, and Neurobiology & Behavior, University of California, Irvine, Irvine, CA, USA
| | - Michael R Colwin
- Departments of Pathology & Laboratory Medicine, Neurology, Biological Chemistry, and Neurobiology & Behavior, University of California, Irvine, Irvine, CA, USA
| | - Sebastian Michels
- Departments of Pathology & Laboratory Medicine, Neurology, Biological Chemistry, and Neurobiology & Behavior, University of California, Irvine, Irvine, CA, USA
- Department of Neurology, University of Ulm, Oberer Eselsberg, Ulm, Germany
| | - Chaorong Chen
- Division of Computational Biomedicine, Department of Biological Chemistry, University of California, Irvine, Irvine, CA, USA
| | - Tamer Sallam
- Division of Cardiology, Department of Medicine, University of California, Los Angeles, Los Angeles, CA, USA
| | - Leslie M Thompson
- Departments of Psychiatry and Human Behavior, Neurobiology and Behavior, and Biological Chemistry, University of California, Irvine, Irvine, CA, USA.
| | - Albert R La Spada
- Departments of Pathology & Laboratory Medicine, Neurology, Biological Chemistry, and Neurobiology & Behavior, University of California, Irvine, Irvine, CA, USA.
- UCI Center for Neurotherapeutics, University of California, Irvine, Irvine, CA, USA.
| | - Wei Li
- Division of Computational Biomedicine, Department of Biological Chemistry, University of California, Irvine, Irvine, CA, USA.
6
Guo T, Steen JA, Mann M. Mass-spectrometry-based proteomics: from single cells to clinical applications. Nature 2025; 638:901-911. [PMID: 40011722 DOI: 10.1038/s41586-025-08584-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/05/2024] [Accepted: 01/02/2025] [Indexed: 02/28/2025]
Abstract
Mass-spectrometry (MS)-based proteomics has evolved into a powerful tool for comprehensively analysing biological systems. Recent technological advances have markedly increased sensitivity, enabling single-cell proteomics and spatial profiling of tissues. Simultaneously, improvements in throughput and robustness are facilitating clinical applications. In this Review, we present the latest developments in proteomics technology, including novel sample-preparation methods, advanced instrumentation and innovative data-acquisition strategies. We explore how these advances drive progress in key areas such as protein-protein interactions, post-translational modifications and structural proteomics. Integrating artificial intelligence into the proteomics workflow accelerates data analysis and biological interpretation. We discuss the application of proteomics to single-cell analysis and spatial profiling, which can provide unprecedented insights into cellular heterogeneity and tissue architecture. Finally, we examine the transition of proteomics from basic research to clinical practice, including biomarker discovery in body fluids and the promise and challenges of implementing proteomics-based diagnostics. This Review provides a broad and high-level overview of the current state of proteomics and its potential to revolutionize our understanding of biology and transform medical practice.
Affiliation(s)
- Tiannan Guo
- State Key Laboratory of Medical Proteomics, School of Medicine, Westlake University, Hangzhou, China.
- Westlake Center for Intelligent Proteomics, Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou, China.
- Research Center for Industries of the Future, School of Life Sciences, Westlake University, Hangzhou, China.
| | - Judith A Steen
- Department of Neurology, Harvard Medical School, Boston, MA, USA.
- F.M. Kirby Neurobiology Center, Boston Children's Hospital, Boston, MA, USA.
| | - Matthias Mann
- Department of Proteomics and Signal Transduction, Max Planck Institute of Biochemistry, Martinsried, Germany.
- NNF Center for Protein Research, Faculty of Health Sciences, University of Copenhagen, Copenhagen, Denmark.
7
Basharat A, Xiong X, Xu T, Zang Y, Sun L, Liu X. TopDIA: A Software Tool for Top-Down Data-Independent Acquisition Proteomics. J Proteome Res 2025; 24:55-64. [PMID: 39641251 PMCID: PMC11705214 DOI: 10.1021/acs.jproteome.4c00293] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/10/2024] [Revised: 10/06/2024] [Accepted: 11/27/2024] [Indexed: 12/07/2024]
Abstract
Top-down mass spectrometry is widely used for proteoform identification, characterization, and quantification owing to its ability to analyze intact proteoforms. In the past decade, top-down proteomics has been dominated by top-down data-dependent acquisition mass spectrometry (TD-DDA-MS), and top-down data-independent acquisition mass spectrometry (TD-DIA-MS) has not been well studied. While TD-DIA-MS produces complex multiplexed tandem mass spectrometry (MS/MS) spectra, which are challenging to confidently identify, it selects more precursor ions for MS/MS analysis and has the potential to increase proteoform identifications compared with TD-DDA-MS. Here we present TopDIA, the first software tool for proteoform identification by TD-DIA-MS. It generates demultiplexed pseudo MS/MS spectra from TD-DIA-MS data and then searches the pseudo MS/MS spectra against a protein sequence database for proteoform identification. We compared the performance of TD-DDA-MS and TD-DIA-MS using Escherichia coli K-12 MG1655 cells and demonstrated that TD-DIA-MS with TopDIA increased proteoform and protein identifications compared with TD-DDA-MS.
Affiliation(s)
- Abdul Rehman Basharat
- Department of BioHealth Informatics, Luddy School of Informatics, Computing and Engineering, Indiana University-Purdue University Indianapolis, Indianapolis, Indiana 46202, United States
| | - Xingzhao Xiong
- Deming Department of Medicine, Tulane University School of Medicine, New Orleans, Louisiana 70112, United States
| | - Tian Xu
- Department of Chemistry, Michigan State University, East Lansing, Michigan 48824, United States
| | - Yong Zang
- Department of Biostatistics and Health Data Sciences, Indiana University School of Medicine, Indianapolis, Indiana 46202, United States
| | - Liangliang Sun
- Department of Chemistry, Michigan State University, East Lansing, Michigan 48824, United States
| | - Xiaowen Liu
- Deming Department of Medicine, Tulane University School of Medicine, New Orleans, Louisiana 70112, United States
8
Rajczewski AT, Blakeley-Ruiz JA, Meyer A, Vintila S, McIlvin MR, Van Den Bossche T, Searle BC, Griffin TJ, Saito MA, Kleiner M, Jagtap PD. Data-Independent Acquisition Mass Spectrometry as a Tool for Metaproteomics: Interlaboratory Comparison Using a Model Microbiome. bioRxiv 2025:2024.09.18.613707. [PMID: 39345414 PMCID: PMC11430069 DOI: 10.1101/2024.09.18.613707] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/01/2024]
Abstract
Mass spectrometry (MS)-based metaproteomics is used to identify and quantify proteins in microbiome samples, with the most frequently used methodology being Data-Dependent Acquisition mass spectrometry (DDA-MS). However, DDA-MS is limited in its ability to reproducibly identify and quantify lower-abundance peptides and proteins. To address DDA-MS deficiencies, proteomics researchers have started using Data-Independent Acquisition Mass Spectrometry (DIA-MS) for reproducible detection and quantification of peptides and proteins. We sought to evaluate the reproducibility and accuracy of DIA-MS metaproteomic measurements relative to DDA-MS using a mock community of known taxonomic composition. Artificial microbial communities of known composition were analyzed independently in three laboratories using DDA- and DIA-MS acquisition methods. DIA-MS yielded more protein and peptide identifications than DDA-MS in each laboratory. In addition, the protein and peptide identifications were more reproducible in all laboratories and provided an accurate quantification of proteins and taxonomic groups in the samples. We also identified some limitations of current DIA tools when applied to metaproteomic data, highlighting specific needs to improve DIA tools enabling analysis of metaproteomic datasets from complex microbiomes. Ultimately, DIA-MS represents a promising strategy for MS-based metaproteomics due to its large number of detected proteins and peptides, reproducibility, deep sequencing capabilities, and accurate quantitation.
Affiliation(s)
- Andrew T. Rajczewski
- Department of Biochemistry, Molecular Biology, and Biophysics, University of Minnesota, Minneapolis MN USA
| | | | - Annaliese Meyer
- MIT-WHOI Joint Program in Oceanography/Applied Ocean Science and Engineering, Department of Chemistry, Woods Hole Oceanographic Institution, Woods Hole MA USA, Department of Earth, Atmospheric, and Planetary Sciences, Massachusetts Institute of Technology, Cambridge MA USA
| | - Simina Vintila
- Department of Plant and Microbial Biology, North Carolina State University, Raleigh NC USA
| | - Matthew R. McIlvin
- Department of Marine Chemistry and Geochemistry, Woods Hole Oceanographic Institution, Woods Hole MA USA
| | - Tim Van Den Bossche
- VIB-UGent Center for Medical Biotechnology, VIB, Ghent Belgium
- Department of Biomolecular Medicine, Faculty of Medicine and Health Sciences, Ghent University, Ghent Belgium
| | - Brian C. Searle
- Department of Chemistry and Biochemistry, The Ohio State University, Columbus OH USA
| | - Timothy J. Griffin
- Department of Biochemistry, Molecular Biology, and Biophysics, University of Minnesota, Minneapolis MN USA
| | - Mak A. Saito
- Department of Marine Chemistry and Geochemistry, Woods Hole Oceanographic Institution, Woods Hole MA USA
| | - Manuel Kleiner
- Department of Plant and Microbial Biology, North Carolina State University, Raleigh NC USA
| | - Pratik D. Jagtap
- Department of Biochemistry, Molecular Biology, and Biophysics, University of Minnesota, Minneapolis MN USA
9
Li K, Teo GC, Yang KL, Yu F, Nesvizhskii AI. diaTracer enables spectrum-centric analysis of diaPASEF proteomics data. Nat Commun 2025; 16:95. [PMID: 39747075 PMCID: PMC11696033 DOI: 10.1038/s41467-024-55448-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/20/2024] [Accepted: 12/06/2024] [Indexed: 01/04/2025]
Abstract
Data-independent acquisition has become a widely used strategy for peptide and protein quantification in liquid chromatography-tandem mass spectrometry-based proteomics studies. The integration of ion mobility separation into data-independent acquisition analysis, such as the diaPASEF technology available on Bruker's timsTOF platform, further improves the quantification accuracy and protein depth achievable using data-independent acquisition. We introduce diaTracer, a spectrum-centric computational tool optimized for diaPASEF data. diaTracer performs three-dimensional (mass to charge ratio, retention time, ion mobility) peak tracing and feature detection to generate precursor-resolved "pseudo-tandem mass spectra", facilitating direct ("spectral-library free") peptide identification and quantification from diaPASEF data. diaTracer is available as a stand-alone tool and is fully integrated into the widely used FragPipe computational platform. We demonstrate the performance of diaTracer and FragPipe using diaPASEF data from triple-negative breast cancer, cerebrospinal fluid, and plasma samples, data from phosphoproteomics and human leukocyte antigens immunopeptidomics experiments, and low-input data from a spatial proteomics study. We also show that diaTracer enables unrestricted identification of post-translational modifications from diaPASEF data using open/mass-offset searches.
Affiliation(s)
- Kai Li
- Gilbert S. Omenn Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
| | - Guo Ci Teo
- Department of Pathology, University of Michigan, Ann Arbor, MI, USA
| | - Kevin L Yang
- Gilbert S. Omenn Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
| | - Fengchao Yu
- Department of Pathology, University of Michigan, Ann Arbor, MI, USA.
| | - Alexey I Nesvizhskii
- Gilbert S. Omenn Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA.
- Department of Pathology, University of Michigan, Ann Arbor, MI, USA.
10
Wang Q, Chen Q, Lin Y, He D, Ji H, Tan CSH. Spike-In Proteome Enhances Data-Independent Acquisition for Thermal Proteome Profiling. Anal Chem 2024; 96:19695-19705. [PMID: 39618045 DOI: 10.1021/acs.analchem.4c04837] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/11/2024]
Abstract
Target deconvolution is essential for elucidating the molecular mechanisms, therapeutic efficacy, and off-target toxicity of small-molecule drugs. Thermal proteome profiling (TPP) is a robust and popular method for identifying drug-protein interactions. Nevertheless, classical implementation of TPP using isobaric labeling of peptides is tedious, time-consuming, and costly. This prompts the adoption of a label-free approach with data-independent acquisition (DIA), but with substantial compromise in protein coverage and precision. To address these shortcomings, we improvised a spike-in proteome strategy for DIA with TPP to counteract the reduction in protein quantity following sample heating. Protein coverage, data completeness, and quantification precision are significantly improved as result. Additionally, a calibration algorithm was developed to correct for spike-in effects on fold changes. The integration of DIA-TPP with the matrix-augmented pooling strategy (MAPS) to increase experiment throughput demonstrates performance comparable to that of existing TMT-TPP-MAPS. With this spike-in proteome strategy, we also successfully identified the thermal stabilization of CA13 by dorzolamide hydrochloride as well as GSTZ1 and tyrosyl-DNA phosphodiesterase 1 of opicapone that eluded detection without spike-in proteome.
Affiliation(s)
- Qiqi Wang
  - Department of Chemistry and Research Center for Chemical Biology and Omics Analysis, College of Science, Southern University of Science and Technology, Shenzhen, Guangdong 518055, P.R. China
  - Shenzhen Key Laboratory of Functional Proteomics, Guangming Advanced Research Institute, Southern University of Science and Technology, Shenzhen 518055, China
- Qiufen Chen
  - Department of Chemistry and Research Center for Chemical Biology and Omics Analysis, College of Science, Southern University of Science and Technology, Shenzhen, Guangdong 518055, P.R. China
  - Shenzhen Key Laboratory of Functional Proteomics, Guangming Advanced Research Institute, Southern University of Science and Technology, Shenzhen 518055, China
- Yue Lin
  - Department of Chemistry and Research Center for Chemical Biology and Omics Analysis, College of Science, Southern University of Science and Technology, Shenzhen, Guangdong 518055, P.R. China
  - Shenzhen Key Laboratory of Functional Proteomics, Guangming Advanced Research Institute, Southern University of Science and Technology, Shenzhen 518055, China
- Dan He
  - Department of Chemistry and Research Center for Chemical Biology and Omics Analysis, College of Science, Southern University of Science and Technology, Shenzhen, Guangdong 518055, P.R. China
  - Shenzhen Key Laboratory of Functional Proteomics, Guangming Advanced Research Institute, Southern University of Science and Technology, Shenzhen 518055, China
- Hongchao Ji
  - Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518120, China
- Chris Soon Heng Tan
  - Department of Chemistry and Research Center for Chemical Biology and Omics Analysis, College of Science, Southern University of Science and Technology, Shenzhen, Guangdong 518055, P.R. China
  - Shenzhen Key Laboratory of Functional Proteomics, Guangming Advanced Research Institute, Southern University of Science and Technology, Shenzhen 518055, China
|
11
|
Fochtman D, Marczak L, Pietrowska M, Wojakowska A. Challenges of MS-based small extracellular vesicles proteomics. J Extracell Vesicles 2024; 13:e70020. [PMID: 39692094 DOI: 10.1002/jev2.70020] [Received: 06/06/2024] [Revised: 11/06/2024] [Accepted: 11/24/2024] [Indexed: 12/19/2024] Open
Abstract
Proteomic profiling of small extracellular vesicles (sEV) is a powerful tool for discovering biomarkers of various diseases. This process, most often assisted by mass spectrometry (MS), usually lacks standardization and recognition of its challenges, which may lead to unreliable results. General recommendations for sEV MS analyses have been briefly given in the MISEV2023 guidelines. The present work goes into detail for every step of sEV protein profiling, with an overview of the factors influencing such analyses. This includes reporting and defining the sEV source and vesicle isolation, protein solubilization and digestion, 'offline' and 'online' sample complexity reduction, the analysis type itself, and subsequent data analysis. Every stage in this process affects the others, which could result in different outcomes. Although characterizations and comparisons of different sEV isolation methods are known and accessible, and MS-based profiling details are provided for cell or tissue samples, no consensus work has ever been published to describe the whole process of sEV proteomic analysis. Reliable results can be obtained from sEV profiling provided that the analysis is well planned, prepared for, and backed by pilot studies or appropriate research.
Affiliation(s)
- Daniel Fochtman
  - Institute of Bioorganic Chemistry Polish Academy of Sciences, Poznan, Poland
- Lukasz Marczak
  - Institute of Bioorganic Chemistry Polish Academy of Sciences, Poznan, Poland
- Monika Pietrowska
  - Maria Sklodowska-Curie National Research Institute of Oncology, Gliwice, Poland
- Anna Wojakowska
  - Institute of Bioorganic Chemistry Polish Academy of Sciences, Poznan, Poland
|
12
|
Albrecht V, Müller-Reif J, Nordmann TM, Mund A, Schweizer L, Geyer PE, Niu L, Wang J, Post F, Oeller M, Metousis A, Bach Nielsen A, Steger M, Wewer Albrechtsen NJ, Mann M. Bridging the Gap From Proteomics Technology to Clinical Application: Highlights From the 68th Benzon Foundation Symposium. Mol Cell Proteomics 2024; 23:100877. [PMID: 39522756 DOI: 10.1016/j.mcpro.2024.100877] [Received: 10/03/2024] [Revised: 11/05/2024] [Accepted: 11/07/2024] [Indexed: 11/16/2024] Open
Abstract
The 68th Benzon Foundation Symposium brought together leading experts to explore the integration of mass spectrometry-based proteomics and artificial intelligence to revolutionize personalized medicine. This report highlights key discussions on recent technological advances in mass spectrometry-based proteomics, including improvements in sensitivity, throughput, and data analysis. Particular emphasis was placed on plasma proteomics and its potential for biomarker discovery across various diseases. The symposium addressed critical challenges in translating proteomic discoveries to clinical practice, including standardization, regulatory considerations, and the need for robust "business cases" to motivate adoption. Promising applications were presented in areas such as cancer diagnostics, neurodegenerative diseases, and cardiovascular health. The integration of proteomics with other omics technologies and imaging methods was explored, showcasing the power of multimodal approaches in understanding complex biological systems. Artificial intelligence emerged as a crucial tool for the acquisition of large-scale proteomic datasets, extracting meaningful insights, and enhancing clinical decision-making. By fostering dialog between academic researchers, industry leaders in proteomics technology, and clinicians, the symposium illuminated potential pathways for proteomics to transform personalized medicine, advancing the cause of more precise diagnostics and targeted therapies.
Affiliation(s)
- Vincent Albrecht
  - Department of Proteomics and Signal Transduction, Max Planck Institute of Biochemistry, Martinsried, Germany
- Johannes Müller-Reif
  - Department of Proteomics and Signal Transduction, Max Planck Institute of Biochemistry, Martinsried, Germany
- Thierry M Nordmann
  - Department of Proteomics and Signal Transduction, Max Planck Institute of Biochemistry, Martinsried, Germany
- Andreas Mund
  - NNF Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark; BioInnovation Institute, OmicVision Biosciences, Copenhagen, Denmark
- Lisa Schweizer
  - BioInnovation Institute, OmicVision Biosciences, Copenhagen, Denmark
- Philipp E Geyer
  - Department of Proteomics and Signal Transduction, Max Planck Institute of Biochemistry, Martinsried, Germany; ions.bio GmbH, Planegg, Germany
- Lili Niu
  - NNF Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark; Department of Computational Biomarker Discovery, Novo Nordisk, Copenhagen, Denmark
- Juanjuan Wang
  - NNF Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Frederik Post
  - NNF Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Marc Oeller
  - Department of Proteomics and Signal Transduction, Max Planck Institute of Biochemistry, Martinsried, Germany
- Andreas Metousis
  - Department of Proteomics and Signal Transduction, Max Planck Institute of Biochemistry, Martinsried, Germany
- Annelaura Bach Nielsen
  - NNF Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark; Department for Clinical Biochemistry, University Hospital Copenhagen - Bispebjerg, Copenhagen, Copenhagen, Denmark
- Medini Steger
  - Department of Proteomics and Signal Transduction, Max Planck Institute of Biochemistry, Martinsried, Germany
- Nicolai J Wewer Albrechtsen
  - NNF Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark; Department for Clinical Biochemistry, University Hospital Copenhagen - Bispebjerg, Copenhagen, Copenhagen, Denmark
- Matthias Mann
  - Department of Proteomics and Signal Transduction, Max Planck Institute of Biochemistry, Martinsried, Germany; NNF Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark.
|
13
|
Sheng Y, Mills G, Zhao X. Identifying therapeutic strategies for triple-negative breast cancer via phosphoproteomics. Expert Rev Proteomics 2024:1-17. [PMID: 39588933 DOI: 10.1080/14789450.2024.2432477] [Received: 07/22/2024] [Accepted: 11/14/2024] [Indexed: 11/27/2024]
Abstract
INTRODUCTION: Given the poor prognosis of patients with TNBC, it is urgent to identify new biomarkers and therapeutic targets to enable personalized treatment strategies and improve patient survival. Comprehensive insights beyond genomic and transcriptomic analysis are crucial to improved outcomes for patients. As proteins are the workhorses of cellular function with their activity primarily regulated by phosphorylation, advanced phosphoproteomics techniques, such as mass spectrometry and antibody arrays, are essential for elucidating kinase signaling pathways that drive TNBC progression and contribute to therapy resistance. AREA COVERED: This review discusses the critical need to integrate phosphoproteomics into TNBC research, evaluates commonly used technologies and their applications, and explores their advantages and limitations. We highlight significant findings from phosphoproteomic analyses in TNBC and address the challenges of implementing these technologies into clinical practice. EXPERT OPINION: Rapid advances in phosphoproteomics analysis facilitate subtype stratification, adaptive response monitoring, and identification of biomarkers and therapeutic targets in TNBC. However, challenges in analyzing protein phosphorylation, especially in deep spatially resolved analysis of malignant cells and the tumor ecosystem, hinder the translation of phosphoproteomics to the CLIA setting. Nonetheless, phosphoproteomics offers a powerful tool that, when integrated into routine clinical practice, has the potential to revolutionize patient care.
Affiliation(s)
- Yuhan Sheng
  - Division of Oncological Sciences Knight Cancer Institute, Oregon Health and Science University, Portland, OR, USA
  - Cancer Center, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- Gordon Mills
  - Division of Oncological Sciences Knight Cancer Institute, Oregon Health and Science University, Portland, OR, USA
- Xuejiao Zhao
  - Division of Oncological Sciences Knight Cancer Institute, Oregon Health and Science University, Portland, OR, USA
  - Department of Obstetrics and Gynecology, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
|
14
|
Liu Y, Mei L, Liang C, Zhong CQ, Tong M, Yu R. Cross-Run Hybrid Features Improve the Identification of Data-Independent Acquisition Proteomics. ACS Omega 2024; 9:46362-46372. [PMID: 39583733 PMCID: PMC11579728 DOI: 10.1021/acsomega.4c07398] [Received: 08/12/2024] [Revised: 09/25/2024] [Accepted: 10/02/2024] [Indexed: 11/26/2024]
Abstract
The analysis of data-independent acquisition (DIA) mass spectrometry data is crucial for comprehensive proteomics studies. However, traditional single-run methods often fall short in terms of identification depth and consistency. We present HFDiscrim, a specialized multirun DIA analysis tool aimed at enhancing the depth and consistency of reliable peptide identifications of DIA analysis tools. HFDiscrim was extensively benchmarked on multiple data sets, including the MCB data set, the ccRCC data set, and a three-species benchmark mixture. Compared to PyProphet, HFDiscrim identified 22.04% more precursors, 19.1% more peptides, and 13.2% more proteins while maintaining a controllable false discovery rate. Furthermore, HFDiscrim demonstrated higher identification rates and improved reproducibility across multiple runs. HFDiscrim is publicly available at https://github.com/yachliu/HFDiscrim.
Affiliation(s)
- Yachen Liu
  - School of Informatics, Xiamen University, Xiamen, Fujian 361000, China
  - National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, Fujian 361102, China
- Longfei Mei
  - School of Informatics, Xiamen University, Xiamen, Fujian 361000, China
- Chenyu Liang
  - National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, Fujian 361102, China
- Chuan-Qi Zhong
  - School of Life Sciences, Xiamen University, Xiamen, Fujian 361102, China
- Mengsha Tong
  - National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, Fujian 361102, China
  - School of Life Sciences, Xiamen University, Xiamen, Fujian 361102, China
- Rongshan Yu
  - School of Informatics, Xiamen University, Xiamen, Fujian 361000, China
  - National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, Fujian 361102, China
  - Aginome Scientific, Xiamen, Fujian 361005, China
|
15
|
Zhang C, Zhang K, Zhang M, Zhang D, Ye Q, Wang X, Akagi T, Duan Y. SWATH-MS based proteomics reveals the role of photosynthesis related proteins and secondary metabolic pathways in the colored leaves of sweet olive (Osmanthus fragrans). BMC Genomics 2024; 25:1026. [PMID: 39487388 PMCID: PMC11529170 DOI: 10.1186/s12864-024-10867-1] [Received: 05/29/2024] [Accepted: 10/04/2024] [Indexed: 11/04/2024] Open
Abstract
Colored leaves, a notable horticultural trait, have high research and ornamental value. The evergreen sweet olive (Osmanthus fragrans), one of the top ten traditional flowers in China, has been cultivated for more than two thousand years. In recent years, an increasing number of O. fragrans cultivars with colored leaves have been cultivated for their ornamental value. To study the molecular mechanism underlying the observed changes in leaf color, we selected O. fragrans 'Yinbi Shuanghui' (Y), which has yellow-white leaves, and O. fragrans 'Sijigui' (S), which has green leaves, as materials. Pigment content measurement showed that the chlorophyll, carotenoid and anthocyanin contents in Y were lower than in S. According to the SWATH-MS sequencing results, a total of 3,959 proteins were quantitatively identified, 1,300 of which were differentially expressed proteins (DEPs), including 782 up-regulated and 518 down-regulated proteins in Y compared to S. Functional enrichment analysis of DEPs revealed that down-regulated expression of photosynthesis-related proteins may lead to the inhibition of chlorophyll synthesis in Y, which may be the main cause of the leaf color change. Moreover, a protein interaction prediction model also showed that proteins such as PetC, PsbO, PsbP, and PsbQ were key proteins in the interaction network, and the up-regulated proteins participating in the anthocyanin and carotenoid pathways may be related to the formation of yellow-white leaves. Taken together, our findings represent the first SWATH-MS-based proteomic report on colored-leaf O. fragrans and reveal that chlorophyll synthesis and secondary metabolism pathways contribute to the changes in leaf color.
Affiliation(s)
- Cheng Zhang
  - Co-Innovation Center for Sustainable Forestry in Southern China, College of Life Sciences, Nanjing Forestry University, Nanjing, 210037, China
- Kailu Zhang
  - Co-Innovation Center for Sustainable Forestry in Southern China, College of Life Sciences, Nanjing Forestry University, Nanjing, 210037, China
- Min Zhang
  - Co-Innovation Center for Sustainable Forestry in Southern China, College of Life Sciences, Nanjing Forestry University, Nanjing, 210037, China
- Daowu Zhang
  - Co-Innovation Center for Sustainable Forestry in Southern China, College of Life Sciences, Nanjing Forestry University, Nanjing, 210037, China
- Qi Ye
  - Co-Innovation Center for Sustainable Forestry in Southern China, College of Life Sciences, Nanjing Forestry University, Nanjing, 210037, China
- Xianrong Wang
  - Co-Innovation Center for Sustainable Forestry in Southern China, College of Life Sciences, Nanjing Forestry University, Nanjing, 210037, China
- Takashi Akagi
  - Graduate School of Environmental and Life Science, Okayama University, Okayama, Japan.
  - Japan Science and Technology Agency (JST), PRESTO, Kawaguchi-shi, Japan.
- Yifan Duan
  - Co-Innovation Center for Sustainable Forestry in Southern China, College of Life Sciences, Nanjing Forestry University, Nanjing, 210037, China.
  - Zhejiang Provincial Key Laboratory of Forest Aromatic Plants-based Healthcare Functions, Zhejiang A & F University, Hangzhou, 311300, China.
|
16
|
Liu R, Lu G, Hu X, Li J, Zhang Z, Tang K. Capillary zone electrophoresis-tandem mass spectrometry for in-depth proteomics analysis via data-independent acquisition. Anal Bioanal Chem 2024; 416:5805-5814. [PMID: 39196334 DOI: 10.1007/s00216-024-05502-7] [Received: 06/11/2024] [Revised: 08/01/2024] [Accepted: 08/15/2024] [Indexed: 08/29/2024]
Abstract
A capillary zone electrophoresis (CZE) system was coupled to an Orbitrap mass spectrometer operating in a data-independent acquisition (DIA) mode for in-depth proteomics analysis. The performance of this CZE-DIA-MS system was systematically evaluated and optimized under different operating conditions. The performance of the fully optimized CZE-DIA-MS system was subsequently compared with that of the same CZE-MS system operating in a data-dependent acquisition (DDA) mode. The experimental results show that the numbers of identified peptides and proteins acquired in the DIA mode are much higher than those acquired in the DDA mode, especially with small sample loading amounts. Specifically, the numbers of identified peptides and proteins acquired in the DIA mode are 1.8-fold and 2-fold higher than those acquired in the DDA mode when using 12.5 ng of HeLa digest. The proteins identified in the DIA mode also cover almost all the proteins identified in the DDA mode. In addition, a potential cancer biomarker protein, carbohydrate antigen 125, undetected in the DDA mode, can be easily identified in the DIA mode even with 12.5 ng of HeLa digest. This study demonstrates, for the first time, the performance of the CZE-DIA-MS system for in-depth proteomics analysis with a limited sample amount.
Affiliation(s)
- Rong Liu
  - Zhejiang Engineering Research Center of Advanced Mass Spectrometry and Clinical Application, Institute of Mass Spectrometry, Ningbo University, Ningbo, 315211, PR China
  - Zhenhai Institute of Mass Spectrometry, Ningbo, 315211, PR China
  - School of Materials Science and Chemical Engineering, Ningbo University, Ningbo, 315211, PR China
- Gang Lu
  - Institute of Drug Discovery Technology, Ningbo University, Ningbo, 315211, PR China
- Xiaozhong Hu
  - Zhejiang Engineering Research Center of Advanced Mass Spectrometry and Clinical Application, Institute of Mass Spectrometry, Ningbo University, Ningbo, 315211, PR China
  - Zhenhai Institute of Mass Spectrometry, Ningbo, 315211, PR China
  - School of Materials Science and Chemical Engineering, Ningbo University, Ningbo, 315211, PR China
- Junhui Li
  - Zhejiang Engineering Research Center of Advanced Mass Spectrometry and Clinical Application, Institute of Mass Spectrometry, Ningbo University, Ningbo, 315211, PR China
  - Zhenhai Institute of Mass Spectrometry, Ningbo, 315211, PR China
  - School of Materials Science and Chemical Engineering, Ningbo University, Ningbo, 315211, PR China
- Zhenbin Zhang
  - Institute of Drug Discovery Technology, Ningbo University, Ningbo, 315211, PR China
- Keqi Tang
  - Zhejiang Engineering Research Center of Advanced Mass Spectrometry and Clinical Application, Institute of Mass Spectrometry, Ningbo University, Ningbo, 315211, PR China.
  - Zhenhai Institute of Mass Spectrometry, Ningbo, 315211, PR China.
  - School of Materials Science and Chemical Engineering, Ningbo University, Ningbo, 315211, PR China.
|
17
|
Kim J, Jeong K, Kaulich PT, Winkels K, Tholey A, Kohlbacher O. FLASHQuant: A Fast Algorithm for Proteoform Quantification in Top-Down Proteomics. Anal Chem 2024; 96:17227-17234. [PMID: 39424290 PMCID: PMC11525931 DOI: 10.1021/acs.analchem.4c03117] [Received: 06/18/2024] [Revised: 09/11/2024] [Accepted: 10/01/2024] [Indexed: 10/21/2024]
Abstract
Accurate quantification of individual proteoforms is a crucial step in identifying proteome-wide alterations in different biological conditions. Intact proteoforms have been analyzed predominantly by liquid chromatography-mass spectrometry (LC-MS)-based top-down proteomics (TDP) and quantified primarily by the label-free quantification (LFQ) method, as it requires no additional costly labeling. In TDP, due to frequent coelution and complex signal structures, overlapping signals deriving from multiple proteoforms complicate accurate quantification. Here, we introduce FLASHQuant for MS1-level LFQ analysis in TDP, which is capable of automatically resolving and quantifying coeluting proteoforms. In benchmark tests performed with both spike-in proteins and proteome-level mixture data sets, FLASHQuant was shown to perform highly accurate and reproducible quantification in short runtimes of just a few minutes per LC-MS run. In particular, it was demonstrated that resolving overlapping proteoforms boosts the quantification accuracy. FLASHQuant is publicly available as platform-independent open-source software at https://openms.org/flashquant/, accompanied by the simple alignment algorithm ConsensusFeatureGroupDetector for multiple LC-MS runs.
Affiliation(s)
- Jihyung Kim
  - Applied Bioinformatics, Department for Computer Science, University of Tübingen, 72076 Tübingen, Germany
  - Institute for Bioinformatics and Medical Informatics, University of Tübingen, 72076 Tübingen, Germany
- Kyowon Jeong
  - Applied Bioinformatics, Department for Computer Science, University of Tübingen, 72076 Tübingen, Germany
  - Institute for Bioinformatics and Medical Informatics, University of Tübingen, 72076 Tübingen, Germany
- Philipp T. Kaulich
  - Systematic Proteome Research & Bioanalytics, Institute for Experimental Medicine, Christian-Albrechts-Universität zu Kiel, 24105 Kiel, Germany
- Konrad Winkels
  - Systematic Proteome Research & Bioanalytics, Institute for Experimental Medicine, Christian-Albrechts-Universität zu Kiel, 24105 Kiel, Germany
- Andreas Tholey
  - Systematic Proteome Research & Bioanalytics, Institute for Experimental Medicine, Christian-Albrechts-Universität zu Kiel, 24105 Kiel, Germany
- Oliver Kohlbacher
  - Applied Bioinformatics, Department for Computer Science, University of Tübingen, 72076 Tübingen, Germany
  - Institute for Bioinformatics and Medical Informatics, University of Tübingen, 72076 Tübingen, Germany
  - Translational Bioinformatics, University Hospital Tübingen, 72076 Tübingen, Germany
|
18
|
Yang YY, Cao Z, Wang Y. Mass Spectrometry-Based Proteomics for Assessing Epitranscriptomic Regulations. Mass Spectrometry Reviews 2024. [PMID: 39422510 DOI: 10.1002/mas.21911] [Received: 07/08/2024] [Revised: 09/26/2024] [Accepted: 09/28/2024] [Indexed: 10/19/2024]
Abstract
Epitranscriptomics is a rapidly evolving field that explores chemical modifications in RNA and how they contribute to dynamic and reversible regulations of gene expression. These modifications, for example, N6-methyladenosine (m6A), are crucial in various RNA metabolic processes, including splicing, stability, subcellular localization, and translation efficiency of mRNAs. Mass spectrometry-based proteomics has become an indispensable tool in unraveling the complexities of epitranscriptomics, offering high-throughput, precise protein identification, and accurate quantification of differential protein expression. Over the past two decades, advances in mass spectrometry, including the improvement of high-resolution mass spectrometers and innovative sample preparation methods, have allowed researchers to perform in-depth analyses of epitranscriptomic regulations. This review focuses on the applications of bottom-up proteomics in the field of epitranscriptomics, particularly in identifying and quantifying epitranscriptomic reader, writer, and eraser (RWE) proteins and in characterizing their functions, posttranslational modifications, and interactions with other proteins. Together, by leveraging modern proteomics, researchers can gain deep insights into the intricate regulatory networks of RNA modifications, advancing fundamental biology, and fostering potential therapeutic applications.
Affiliation(s)
- Yen-Yu Yang
  - Department of Chemistry, University of California, Riverside, California, USA
- Zhongwen Cao
  - Environmental Toxicology Graduate Program, University of California, Riverside, California, USA
- Yinsheng Wang
  - Department of Chemistry, University of California, Riverside, California, USA
  - Environmental Toxicology Graduate Program, University of California, Riverside, California, USA
|
19
|
Li K, Teo GC, Yang KL, Yu F, Nesvizhskii AI. diaTracer enables spectrum-centric analysis of diaPASEF proteomics data. bioRxiv 2024:2024.05.25.595875. [PMID: 38854051 PMCID: PMC11160675 DOI: 10.1101/2024.05.25.595875] [Indexed: 06/11/2024]
Abstract
Data-independent acquisition (DIA) has become a widely used strategy for peptide and protein quantification in mass spectrometry-based proteomics studies. The integration of ion mobility separation into DIA analysis, such as the diaPASEF technology available on Bruker's timsTOF platform, further improves the quantification accuracy and protein depth achievable using DIA. We introduce diaTracer, a new spectrum-centric computational tool optimized for diaPASEF data. diaTracer performs three-dimensional (m/z, retention time, ion mobility) peak tracing and feature detection to generate precursor-resolved "pseudo-MS/MS" spectra, facilitating direct ("spectral-library free") peptide identification and quantification from diaPASEF data. diaTracer is available as a stand-alone tool and is fully integrated into the widely used FragPipe computational platform. We demonstrate the performance of diaTracer and FragPipe using diaPASEF data from triple-negative breast cancer (TNBC), cerebrospinal fluid (CSF), and plasma samples, data from phosphoproteomics and HLA immunopeptidomics experiments, and low-input data from a spatial proteomics study. We also show that diaTracer enables unrestricted identification of post-translational modifications from diaPASEF data using open/mass-offset searches.
Affiliation(s)
- Kai Li
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- Guo Ci Teo
  - Department of Pathology, University of Michigan, Ann Arbor, MI, USA
- Kevin L. Yang
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- Fengchao Yu
  - Department of Pathology, University of Michigan, Ann Arbor, MI, USA
- Alexey I. Nesvizhskii
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
  - Department of Pathology, University of Michigan, Ann Arbor, MI, USA
|
20
|
Hsiao Y, Zhang H, Li GX, Deng Y, Yu F, Valipour Kahrood H, Steele JR, Schittenhelm RB, Nesvizhskii AI. Analysis and Visualization of Quantitative Proteomics Data Using FragPipe-Analyst. J Proteome Res 2024; 23:4303-4315. [PMID: 39254081 DOI: 10.1021/acs.jproteome.4c00294] [Indexed: 09/11/2024]
Abstract
The FragPipe computational proteomics platform is gaining widespread popularity among the proteomics research community because of its fast processing speed and user-friendly graphical interface. Although FragPipe produces well-formatted output tables that are ready for analysis, there is still a need for an easy-to-use and user-friendly downstream statistical analysis and visualization tool. FragPipe-Analyst addresses this need by providing an R shiny web server to assist FragPipe users in conducting downstream analyses of the resulting quantitative proteomics data. It supports major quantification workflows, including label-free quantification, tandem mass tags, and data-independent acquisition. FragPipe-Analyst offers a range of useful functionalities, such as various missing value imputation options, data quality control, unsupervised clustering, differential expression (DE) analysis using Limma, and gene ontology and pathway enrichment analysis using Enrichr. To support advanced analysis and customized visualizations, we also developed FragPipeAnalystR, an R package encompassing all FragPipe-Analyst functionalities that is extended to support site-specific analysis of post-translational modifications (PTMs). FragPipe-Analyst and FragPipeAnalystR are both open-source and freely available.
Affiliation(s)
- Yi Hsiao
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan 48109, United States
| | - Haijian Zhang
- Monash Proteomics & Metabolomics Platform, Department of Biochemistry and Molecular Biology, Biomedicine Discovery Institute, Monash University, Clayton, Victoria 3800, Australia
| | - Ginny Xiaohe Li
- Department of Pathology, University of Michigan, Ann Arbor, Michigan 48109, United States
| | - Yamei Deng
- Department of Pathology, University of Michigan, Ann Arbor, Michigan 48109, United States
| | - Fengchao Yu
- Department of Pathology, University of Michigan, Ann Arbor, Michigan 48109, United States
| | - Hossein Valipour Kahrood
- Monash Proteomics & Metabolomics Platform, Department of Biochemistry and Molecular Biology, Biomedicine Discovery Institute, Monash University, Clayton, Victoria 3800, Australia
- Monash Genomics & Bioinformatics Platform, Department of Biochemistry and Molecular Biology, Biomedicine Discovery Institute, Monash University, Clayton, Victoria 3800, Australia
| | - Joel R Steele
- Monash Proteomics & Metabolomics Platform, Department of Biochemistry and Molecular Biology, Biomedicine Discovery Institute, Monash University, Clayton, Victoria 3800, Australia
| | - Ralf B Schittenhelm
- Monash Proteomics & Metabolomics Platform, Department of Biochemistry and Molecular Biology, Biomedicine Discovery Institute, Monash University, Clayton, Victoria 3800, Australia
| | - Alexey I Nesvizhskii
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan 48109, United States
- Department of Pathology, University of Michigan, Ann Arbor, Michigan 48109, United States
| |
|
21
|
Kohler D, Staniak M, Yu F, Nesvizhskii AI, Vitek O. An MSstats workflow for detecting differentially abundant proteins in large-scale data-independent acquisition mass spectrometry experiments with FragPipe processing. Nat Protoc 2024; 19:2915-2938. [PMID: 38769142 DOI: 10.1038/s41596-024-01000-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/28/2023] [Accepted: 03/11/2024] [Indexed: 05/22/2024]
Abstract
Technological advances in mass spectrometry and proteomics have made it possible to perform larger-scale and more-complex experiments. The volume and complexity of the resulting data create major challenges for downstream analysis. In particular, next-generation data-independent acquisition (DIA) experiments enable wider proteome coverage than more traditional targeted approaches but require computational workflows that can manage much larger datasets and identify peptide sequences from complex and overlapping spectral features. Data-processing tools such as FragPipe, DIA-NN and Spectronaut have undergone substantial improvements to process spectral features in a reasonable time. Statistical analysis tools are needed to draw meaningful comparisons between experimental samples, but these tools were also originally designed with smaller datasets in mind. This protocol describes an updated version of MSstats that has been adapted to be compatible with large-scale DIA experiments. A very large DIA experiment, processed with FragPipe, is used as an example to demonstrate different MSstats workflows. The choice of workflow depends on the user's computational resources. For datasets that are too large to fit into a standard computer's memory, we demonstrate the use of MSstatsBig, a companion R package to MSstats. The protocol also highlights key decisions that have a major effect on both the results and the processing time of the analysis. The MSstats processing can be expected to take 1-3 h depending on the usage of MSstatsBig. The protocol can be run in the point-and-click graphical user interface MSstatsShiny or implemented with minimal coding expertise in R.
Affiliation(s)
- Devon Kohler
- Khoury College of Computer Sciences, Northeastern University, Boston, MA, USA
- Barnett Institute for Chemical and Biological Analysis, Northeastern University, Boston, MA, USA
| | | | - Fengchao Yu
- Department of Pathology, University of Michigan, Ann Arbor, MI, USA
| | - Alexey I Nesvizhskii
- Department of Pathology, University of Michigan, Ann Arbor, MI, USA
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
| | - Olga Vitek
- Khoury College of Computer Sciences, Northeastern University, Boston, MA, USA.
- Barnett Institute for Chemical and Biological Analysis, Northeastern University, Boston, MA, USA.
| |
|
22
|
Ran P, Wang Y, Li K, He S, Tan S, Lv J, Zhu J, Tang S, Feng J, Qin Z, Li Y, Huang L, Yin Y, Zhu L, Yang W, Ding C. STAVER: a standardized benchmark dataset-based algorithm for effective variation reduction in large-scale DIA-MS data. Brief Bioinform 2024; 25:bbae553. [PMID: 39504480 PMCID: PMC11540132 DOI: 10.1093/bib/bbae553] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/05/2024] [Revised: 09/12/2024] [Accepted: 10/19/2024] [Indexed: 11/08/2024]
Abstract
Mass spectrometry (MS)-based proteomics has become instrumental in comprehensively investigating complex biological systems. Data-independent acquisition (DIA)-MS, utilizing hybrid spectral library search strategies, allows for the simultaneous quantification of thousands of proteins, showing promise in enhancing protein identification and quantification precision. However, low-quality profiles can considerably undermine quantitative precision, resulting in inaccurate protein quantification. To tackle this challenge, we introduced STAVER, a novel algorithm that leverages standardized benchmark datasets to reduce non-biological variation in large-scale DIA-MS analyses. By eliminating unwanted noise in MS signals, STAVER significantly improved protein quantification precision, especially in hybrid spectral library searches. Moreover, we validated STAVER's robustness and applicability across multiple large-scale DIA datasets, demonstrating significantly enhanced precision and reproducibility of protein quantification. STAVER offers an innovative and effective approach for enhancing the quality of large-scale DIA proteomic data, facilitating cross-platform and cross-laboratory comparative analyses. This advancement significantly enhances the consistency and reliability of findings in clinical research. The complete package is available at https://github.com/Ran485/STAVER.
Affiliation(s)
- Peng Ran
- Center for Cell and Gene Therapy, Clinical Research Center for Cell-based Immunotherapy, Shanghai Pudong Hospital, State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute, Fudan University, E301, School of Life Sciences, No. 2005, Songhu Road, Yangpu District, Shanghai 200438, P.R. China
| | - Yunzhi Wang
- Center for Cell and Gene Therapy, Clinical Research Center for Cell-based Immunotherapy, Shanghai Pudong Hospital, State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute, Fudan University, E301, School of Life Sciences, No. 2005, Songhu Road, Yangpu District, Shanghai 200438, P.R. China
| | - Kai Li
- Center for Cell and Gene Therapy, Clinical Research Center for Cell-based Immunotherapy, Shanghai Pudong Hospital, State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute, Fudan University, E301, School of Life Sciences, No. 2005, Songhu Road, Yangpu District, Shanghai 200438, P.R. China
| | - Shiman He
- Center for Cell and Gene Therapy, Clinical Research Center for Cell-based Immunotherapy, Shanghai Pudong Hospital, State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute, Fudan University, E301, School of Life Sciences, No. 2005, Songhu Road, Yangpu District, Shanghai 200438, P.R. China
| | - Subei Tan
- Center for Cell and Gene Therapy, Clinical Research Center for Cell-based Immunotherapy, Shanghai Pudong Hospital, State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute, Fudan University, E301, School of Life Sciences, No. 2005, Songhu Road, Yangpu District, Shanghai 200438, P.R. China
| | - Jiacheng Lv
- Center for Cell and Gene Therapy, Clinical Research Center for Cell-based Immunotherapy, Shanghai Pudong Hospital, State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute, Fudan University, E301, School of Life Sciences, No. 2005, Songhu Road, Yangpu District, Shanghai 200438, P.R. China
| | - Jiajun Zhu
- Center for Cell and Gene Therapy, Clinical Research Center for Cell-based Immunotherapy, Shanghai Pudong Hospital, State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute, Fudan University, E301, School of Life Sciences, No. 2005, Songhu Road, Yangpu District, Shanghai 200438, P.R. China
| | - Shaoshuai Tang
- Center for Cell and Gene Therapy, Clinical Research Center for Cell-based Immunotherapy, Shanghai Pudong Hospital, State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute, Fudan University, E301, School of Life Sciences, No. 2005, Songhu Road, Yangpu District, Shanghai 200438, P.R. China
| | - Jinwen Feng
- Center for Cell and Gene Therapy, Clinical Research Center for Cell-based Immunotherapy, Shanghai Pudong Hospital, State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute, Fudan University, E301, School of Life Sciences, No. 2005, Songhu Road, Yangpu District, Shanghai 200438, P.R. China
| | - Zhaoyu Qin
- Center for Cell and Gene Therapy, Clinical Research Center for Cell-based Immunotherapy, Shanghai Pudong Hospital, State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute, Fudan University, E301, School of Life Sciences, No. 2005, Songhu Road, Yangpu District, Shanghai 200438, P.R. China
| | - Yan Li
- Center for Cell and Gene Therapy, Clinical Research Center for Cell-based Immunotherapy, Shanghai Pudong Hospital, State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute, Fudan University, E301, School of Life Sciences, No. 2005, Songhu Road, Yangpu District, Shanghai 200438, P.R. China
| | - Lin Huang
- Center for Cell and Gene Therapy, Clinical Research Center for Cell-based Immunotherapy, Shanghai Pudong Hospital, State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute, Fudan University, E301, School of Life Sciences, No. 2005, Songhu Road, Yangpu District, Shanghai 200438, P.R. China
| | - Yanan Yin
- Center for Cell and Gene Therapy, Clinical Research Center for Cell-based Immunotherapy, Shanghai Pudong Hospital, State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute, Fudan University, E301, School of Life Sciences, No. 2005, Songhu Road, Yangpu District, Shanghai 200438, P.R. China
| | - Lingli Zhu
- Center for Cell and Gene Therapy, Clinical Research Center for Cell-based Immunotherapy, Shanghai Pudong Hospital, State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute, Fudan University, E301, School of Life Sciences, No. 2005, Songhu Road, Yangpu District, Shanghai 200438, P.R. China
| | - Wenjun Yang
- Department of Pediatric Orthopedics, Xinhua Hospital affiliated to Shanghai Jiao Tong University School of Medicine, No. 1665, Kongjiang Road, Yangpu District, Shanghai 200092, China
| | - Chen Ding
- Center for Cell and Gene Therapy, Clinical Research Center for Cell-based Immunotherapy, Shanghai Pudong Hospital, State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute, Fudan University, E301, School of Life Sciences, No. 2005, Songhu Road, Yangpu District, Shanghai 200438, P.R. China
- Departments of Cancer Research Institute, Affiliated Cancer Hospital of Xinjiang Medical University Xinjiang Key Laboratory of Translational Biomedical Engineering, Urumqi 830000, P. R. China
| |
|
23
|
Shi M, Huang C, Chen R, Chen DDY, Yan B. A New Evaluation Metric for Quantitative Accuracy of LC-MS/MS-Based Proteomics with Data-Independent Acquisition. J Proteome Res 2024; 23:3780-3790. [PMID: 39193824 DOI: 10.1021/acs.jproteome.4c00088] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/29/2024]
Abstract
Data-independent acquisition (DIA) has improved the identification and quantitation coverage of peptides and proteins in liquid chromatography-tandem mass spectrometry-based proteomics. However, different DIA data-processing tools can produce very different identification and quantitation results for the same data set. Currently, benchmarking studies of DIA tools are predominantly focused on comparing the identification results, while the quantitative accuracy of DIA measurements is acknowledged to be important but insufficiently investigated, and the absence of suitable metrics for comparing quantitative accuracy is one of the reasons. A new metric is proposed for the evaluation of quantitative accuracy to avoid the influence of differences in false discovery rate control stringency. The part of the quantitation results with high reliability was acquired from each DIA tool first, and the quantitative accuracy was evaluated by comparing quantification error rates at the same number of accurate ratios. From the results of four benchmark data sets, the proposed metric was shown to be more sensitive to discriminating the quantitative performance of DIA tools. Moreover, the DIA tools with advantages in quantitative accuracy were consistently revealed by this metric. The proposed metric can also help researchers in optimizing algorithms of the same DIA tool and sample preprocessing methods to enhance quantitative accuracy.
Affiliation(s)
- Mengtian Shi
- College of Pharmaceutical Science, Zhejiang Chinese Medical University, Hangzhou 310053, China
- Key Laboratory of Systems Health Science of Zhejiang Province, School of Life Science, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Hangzhou 310024, China
| | - Chiyuan Huang
- Key Laboratory of Systems Health Science of Zhejiang Province, School of Life Science, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Hangzhou 310024, China
| | - Renhui Chen
- Key Laboratory of Systems Health Science of Zhejiang Province, School of Life Science, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Hangzhou 310024, China
| | - David Da Yong Chen
- Department of Chemistry, University of British Columbia, Vancouver, BC V6T 1Z1, Canada
| | - Binjun Yan
- Key Laboratory of Systems Health Science of Zhejiang Province, School of Life Science, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Hangzhou 310024, China
| |
|
24
|
Dens C, Adams C, Laukens K, Bittremieux W. Machine Learning Strategies to Tackle Data Challenges in Mass Spectrometry-Based Proteomics. J Am Soc Mass Spectrom 2024; 35:2143-2155. [PMID: 39074335 DOI: 10.1021/jasms.4c00180] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/31/2024]
Abstract
In computational proteomics, machine learning (ML) has emerged as a vital tool for enhancing data analysis. Despite significant advancements, the diversity of ML model architectures and the complexity of proteomics data present substantial challenges in the effective development and evaluation of these tools. Here, we highlight the necessity for high-quality, comprehensive data sets to train ML models and advocate for the standardization of data to support robust model development. We emphasize the instrumental role of key data sets like ProteomeTools and MassIVE-KB in advancing ML applications in proteomics and discuss the implications of data set size on model performance, highlighting that larger data sets typically yield more accurate models. To address data scarcity, we explore algorithmic strategies such as self-supervised pretraining and multitask learning. Ultimately, we hope that this discussion can serve as a call to action for the proteomics community to collaborate on data standardization and collection efforts, which are crucial for the sustainable advancement and refinement of ML methodologies in the field.
Affiliation(s)
- Ceder Dens
- Adrem Data Lab, Department of Computer Science, University of Antwerp, Middelheimlaan 1, 2020 Antwerpen, Belgium
| | - Charlotte Adams
- Adrem Data Lab, Department of Computer Science, University of Antwerp, Middelheimlaan 1, 2020 Antwerpen, Belgium
| | - Kris Laukens
- Adrem Data Lab, Department of Computer Science, University of Antwerp, Middelheimlaan 1, 2020 Antwerpen, Belgium
| | - Wout Bittremieux
- Adrem Data Lab, Department of Computer Science, University of Antwerp, Middelheimlaan 1, 2020 Antwerpen, Belgium
| |
|
25
|
Capraz T, Huber W. Feature selection by replicate reproducibility and non-redundancy. Bioinformatics 2024; 40:btae548. [PMID: 39254597 PMCID: PMC11410923 DOI: 10.1093/bioinformatics/btae548] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/08/2023] [Revised: 08/05/2024] [Accepted: 09/06/2024] [Indexed: 09/11/2024]
Abstract
MOTIVATION: A fundamental step in many analyses of high-dimensional data is dimension reduction. Two basic approaches are introduction of new synthetic coordinates and selection of extant features. Advantages of the latter include interpretability, simplicity, transferability, and modularity. A common criterion for unsupervised feature selection is variance or dynamic range. However, in practice, high-variance features can be noisy, important features can have low variance, or variances may simply not be comparable across features because they are measured in unrelated numeric scales or physical units. Moreover, users may want to include measures of signal-to-noise ratio and non-redundancy in feature selection. RESULTS: Here, we introduce the RNR algorithm, which selects features based on (i) the reproducibility of their signal across replicates and (ii) their non-redundancy, measured by linear dependence. It takes as input a typically large set of features measured on a collection of objects with two or more replicates per object. It returns an ordered list of features, i_1, i_2, …, i_k, where feature i_1 is the one with the highest reproducibility across replicates, i_2 that with the highest reproducibility across replicates after projecting out the dimension spanned by i_1, and so on. Applications to microscopy-based imaging of cells and proteomics highlight benefits of the approach. AVAILABILITY AND IMPLEMENTATION: The RNR method is available via Bioconductor (Huber W, Carey VJ, Gentleman R et al. Orchestrating high-throughput genomic analysis with Bioconductor. Nat Methods 2015;12:115-21) in the R package FeatSeekR. Its source code is also available at https://github.com/tcapraz/FeatSeekR under the GPL-3 open source license.
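The greedy selection described in this abstract can be illustrated with a minimal Python sketch. It is not the FeatSeekR implementation: the reproducibility score (mean pairwise Pearson correlation between replicates), the per-replicate projection step, and the names `replicate_reproducibility` and `greedy_select` are all assumptions made for illustration, and the package itself is written in R.

```python
import numpy as np


def replicate_reproducibility(x):
    """Mean pairwise Pearson correlation between replicate columns.

    x has shape (n_objects, n_replicates). This score is an assumed
    stand-in; the paper may define reproducibility differently.
    """
    corr = np.corrcoef(x, rowvar=False)            # (n_replicates, n_replicates)
    upper = corr[np.triu_indices_from(corr, k=1)]  # one value per replicate pair
    upper = upper[np.isfinite(upper)]              # drop NaNs from constant columns
    return float(upper.mean()) if upper.size else float("-inf")


def greedy_select(data, k):
    """Return k feature indices in greedy selection order (hypothetical helper).

    data has shape (n_objects, n_features, n_replicates).
    """
    data = np.asarray(data, dtype=float)
    n_objects, n_features, n_reps = data.shape
    # Center each feature within each replicate before projecting.
    resid = data - data.mean(axis=0, keepdims=True)
    selected = []
    for _ in range(k):
        scores = np.full(n_features, -np.inf)
        for j in range(n_features):
            if j not in selected:
                scores[j] = replicate_reproducibility(resid[:, j, :])
        best = int(np.argmax(scores))
        selected.append(best)
        # Project the chosen feature out of all features, replicate by replicate,
        # so that features linearly dependent on it lose their signal.
        for r in range(n_reps):
            v = resid[:, best, r]
            norm = v @ v
            if norm > 0:
                coef = (resid[:, :, r].T @ v) / norm  # (n_features,)
                resid[:, :, r] -= np.outer(v, coef)
    return selected
```

In this sketch, a feature that is nearly a copy of an already selected one scores poorly in later rounds, because the shared signal has been removed and mostly replicate-specific noise remains.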
Affiliation(s)
- Tümay Capraz
- Genome Biology Unit, EMBL, Heidelberg, 69117, Germany
- Faculty of Biosciences, University of Heidelberg, Heidelberg, 69117, Germany
| | | |
|
26
|
He Q, Guo H, Li Y, He G, Li X, Shuai J. SeFilter-DIA: Squeeze-and-Excitation Network for Filtering High-Confidence Peptides of Data-Independent Acquisition Proteomics. Interdiscip Sci 2024; 16:579-592. [PMID: 38472692 DOI: 10.1007/s12539-024-00611-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/17/2023] [Revised: 01/12/2024] [Accepted: 01/21/2024] [Indexed: 03/14/2024]
Abstract
Mass spectrometry is crucial in proteomics analysis, particularly using data-independent acquisition (DIA) for reliable and reproducible mass spectrometry data acquisition, enabling broad mass-to-charge ratio coverage and high throughput. DIA-NN, a prominent deep learning software in DIA proteome analysis, generates peptide results but may include low-confidence peptides. Conventionally, biologists have to manually screen peptide fragment ion chromatogram (XIC) peaks to identify high-confidence peptides, a time-consuming and subjective process prone to variability. In this study, we introduce SeFilter-DIA, a deep learning algorithm that aims to automate the identification of high-confidence peptides. Leveraging squeeze-and-excitation network and residual network models, SeFilter-DIA extracts XIC features and effectively discerns between high- and low-confidence peptides. Evaluation on the benchmark datasets shows that SeFilter-DIA achieves 99.6% AUC on the test set and 97% for other performance indicators. Furthermore, SeFilter-DIA is applicable for screening peptides with phosphorylation modifications. These results demonstrate the potential of SeFilter-DIA to replace manual screening, providing an efficient and objective approach for high-confidence peptide identification while mitigating associated limitations.
Affiliation(s)
- Qingzu He
- Department of Physics, National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, 361005, China
- Wenzhou Key Laboratory of Biophysics, Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou, 325001, China
| | - Huan Guo
- Department of Physics, National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, 361005, China
| | - Yulin Li
- Department of Physics, National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, 361005, China
| | - Guoqiang He
- Wenzhou Key Laboratory of Biophysics, Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou, 325001, China
| | - Xiang Li
- Department of Physics, National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, 361005, China.
| | - Jianwei Shuai
- Wenzhou Key Laboratory of Biophysics, Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou, 325001, China.
- Oujiang Laboratory (Zhejiang Lab for Regenerative Medicine, Vision and Brain Health), Wenzhou, 325001, China.
| |
|
27
|
Jiang Y, Rex DA, Schuster D, Neely BA, Rosano GL, Volkmar N, Momenzadeh A, Peters-Clarke TM, Egbert SB, Kreimer S, Doud EH, Crook OM, Yadav AK, Vanuopadath M, Hegeman AD, Mayta M, Duboff AG, Riley NM, Moritz RL, Meyer JG. Comprehensive Overview of Bottom-Up Proteomics Using Mass Spectrometry. ACS Meas Sci Au 2024; 4:338-417. [PMID: 39193565 PMCID: PMC11348894 DOI: 10.1021/acsmeasuresciau.3c00068] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 11/15/2023] [Revised: 05/03/2024] [Accepted: 05/03/2024] [Indexed: 08/29/2024]
Abstract
Proteomics is the large-scale study of protein structure and function in biological systems through protein identification and quantification. "Shotgun proteomics" or "bottom-up proteomics" is the prevailing strategy, in which proteins are hydrolyzed into peptides that are analyzed by mass spectrometry. Proteomics can be applied to diverse studies ranging from simple protein identification to studies of proteoforms, protein-protein interactions, protein structural alterations, absolute and relative protein quantification, post-translational modifications, and protein stability. To enable this range of different experiments, there are diverse strategies for proteome analysis. The nuances of how proteomic workflows differ may be challenging to understand for new practitioners. Here, we provide a comprehensive overview of different proteomics methods. We cover topics ranging from biochemistry basics and protein extraction to biological interpretation and orthogonal validation. We expect this Review will serve as a handbook for researchers who are new to the field of bottom-up proteomics.
Affiliation(s)
- Yuming Jiang
- Department of Computational Biomedicine, Cedars Sinai Medical Center, Los Angeles, California 90048, United States
- Smidt Heart Institute, Cedars Sinai Medical Center, Los Angeles, California 90048, United States
- Advanced Clinical Biosystems Research Institute, Cedars Sinai Medical Center, Los Angeles, California 90048, United States
| | - Devasahayam Arokia Balaya Rex
- Center for Systems Biology and Molecular Medicine, Yenepoya Research Centre, Yenepoya (Deemed to be University), Mangalore 575018, India
| | - Dina Schuster
- Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich 8093, Switzerland
- Department of Biology, Institute of Molecular Biology and Biophysics, ETH Zurich, Zurich 8093, Switzerland
- Laboratory of Biomolecular Research, Division of Biology and Chemistry, Paul Scherrer Institute, Villigen 5232, Switzerland
| | - Benjamin A. Neely
- Chemical Sciences Division, National Institute of Standards and Technology, NIST, Charleston, South Carolina 29412, United States
| | - Germán L. Rosano
- Mass Spectrometry Unit, Institute of Molecular and Cellular Biology of Rosario, Rosario, 2000 Argentina
| | - Norbert Volkmar
- Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich 8093, Switzerland
| | - Amanda Momenzadeh
- Department of Computational Biomedicine, Cedars Sinai Medical Center, Los Angeles, California 90048, United States
- Smidt Heart Institute, Cedars Sinai Medical Center, Los Angeles, California 90048, United States
- Advanced Clinical Biosystems Research Institute, Cedars Sinai Medical Center, Los Angeles, California 90048, United States
| | - Trenton M. Peters-Clarke
- Department of Pharmaceutical Chemistry, University of California, San Francisco, San Francisco, California, 94158, United States
| | - Susan B. Egbert
- Department of Chemistry, University of Manitoba, Winnipeg, Manitoba, R3T 2N2 Canada
| | - Simion Kreimer
- Smidt Heart Institute, Cedars Sinai Medical Center, Los Angeles, California 90048, United States
- Advanced Clinical Biosystems Research Institute, Cedars Sinai Medical Center, Los Angeles, California 90048, United States
| | - Emma H. Doud
- Center for Proteome Analysis, Indiana University School of Medicine, Indianapolis, Indiana, 46202-3082, United States
| | - Oliver M. Crook
- Oxford Protein Informatics Group, Department of Statistics, University of Oxford, Oxford OX1 3LB, United Kingdom
| | - Amit Kumar Yadav
- Translational Health Science and Technology Institute, NCR Biotech Science Cluster 3rd Milestone Faridabad-Gurgaon Expressway, Faridabad, Haryana 121001, India
| | | | - Adrian D. Hegeman
- Departments of Horticultural Science and Plant and Microbial Biology, University of Minnesota, Twin Cities, Minnesota 55108, United States
| | - Martín L. Mayta
- School of Medicine and Health Sciences, Center for Health Sciences Research, Universidad Adventista del Plata, Libertador San Martin 3103, Argentina
- Molecular Biology Department, School of Pharmacy and Biochemistry, Universidad Nacional de Rosario, Rosario 2000, Argentina
| | - Anna G. Duboff
- Department of Chemistry, University of Washington, Seattle, Washington 98195, United States
| | - Nicholas M. Riley
- Department of Chemistry, University of Washington, Seattle, Washington 98195, United States
| | - Robert L. Moritz
- Institute for Systems Biology, Seattle, Washington 98109, United States
| | - Jesse G. Meyer
- Department of Computational Biomedicine, Cedars Sinai Medical Center, Los Angeles, California 90048, United States
- Smidt Heart Institute, Cedars Sinai Medical Center, Los Angeles, California 90048, United States
- Advanced Clinical Biosystems Research Institute, Cedars Sinai Medical Center, Los Angeles, California 90048, United States
| |
|
28
|
Humphries EM, Loudon C, Craft GE, Hains PG, Robinson PJ. Quantitative Comparison of Deparaffinization, Rehydration, and Extraction Methods for FFPE Tissue Proteomics and Phosphoproteomics. Anal Chem 2024; 96:13358-13370. [PMID: 39102789 DOI: 10.1021/acs.analchem.3c04479] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/07/2024]
Abstract
Formalin-fixed paraffin-embedded (FFPE) tissues are suitable for proteomic and phosphoproteomic biomarker studies by data-independent acquisition mass spectrometry. The choice of the sample preparation method influences the number, intensity, and reproducibility of identifications. By comparing four deparaffinization and rehydration methods, including heptane, histolene, SubX, and xylene, we found that heptane and methanol produced the lowest coefficients of variation (CVs). Using this, five extraction methods from the literature were modified and evaluated for their performance using kidney, leg muscle, lung, and testicular rat organs. All methods performed well, except for SP3 due to insufficient tissue lysis. Heat n' Beat was the fastest and most reproducible method with the highest digestion efficiency and lowest CVs. S-Trap produced the highest peptide yield, while TFE produced the best phosphopeptide enrichment efficiency. The quantitation of FFPE-derived peptides remains an ongoing challenge with bias in UV and fluorescence assays across methods, most notably in SPEED. Functional enrichment analysis demonstrated that each method favored extracting some gene ontology cellular components over others including chromosome, cytoplasmic, cytoskeleton, endoplasmic reticulum, membrane, mitochondrion, and nucleoplasm protein groups. The outcome is a set of recommendations for choosing the most appropriate method for different settings.
Affiliation(s)
- Erin M Humphries
- ProCan, Children's Medical Research Institute, Faculty of Medicine and Health, The University of Sydney, Westmead, New South Wales 2145, Australia
| | - Clare Loudon
- ProCan, Children's Medical Research Institute, Faculty of Medicine and Health, The University of Sydney, Westmead, New South Wales 2145, Australia
| | - George E Craft
- ProCan, Children's Medical Research Institute, Faculty of Medicine and Health, The University of Sydney, Westmead, New South Wales 2145, Australia
| | - Peter G Hains
- ProCan, Children's Medical Research Institute, Faculty of Medicine and Health, The University of Sydney, Westmead, New South Wales 2145, Australia
| | - Phillip J Robinson
- ProCan, Children's Medical Research Institute, Faculty of Medicine and Health, The University of Sydney, Westmead, New South Wales 2145, Australia
| |
|
29
|
Shi J, Liu Y, Xu YJ. MS based foodomics: An edge tool integrated metabolomics and proteomics for food science. Food Chem 2024; 446:138852. [PMID: 38428078 DOI: 10.1016/j.foodchem.2024.138852] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2023] [Revised: 02/05/2024] [Accepted: 02/24/2024] [Indexed: 03/03/2024]
Abstract
Foodomics has become a popular methodology in food science studies. Mass spectrometry (MS)-based metabolomics and proteomics analyses play indispensable roles in foodomics research. So far, several methodologies have been developed to detect metabolites and proteins in diets and consumers, covering sample preparation, MS data acquisition, annotation, and interpretation. Moreover, multiomics analysis integrating metabolomics and proteomics has received considerable attention in the field of food safety and nutrition because it is more comprehensive and in-depth. In this context, we review the emerging strategies and their applications in MS-based foodomics, as well as future challenges and trends. The principles and applications of multiomics are also discussed, such as the optimization of data acquisition, development of analysis algorithms, and exploration of systems biology.
Affiliation(s)
- Jiachen Shi
- State Key Laboratory of Food Science and Technology, School of Food Science and Technology, National Engineering Research Center for Functional Food, National Engineering Laboratory for Cereal Fermentation Technology, Collaborative Innovation Center of Food Safety and Quality Control in Jiangsu Province, Jiangnan University, 1800 Lihu Road, Wuxi 214122, Jiangsu, People's Republic of China.
| | - Yuanfa Liu
- State Key Laboratory of Food Science and Technology, School of Food Science and Technology, National Engineering Research Center for Functional Food, National Engineering Laboratory for Cereal Fermentation Technology, Collaborative Innovation Center of Food Safety and Quality Control in Jiangsu Province, Jiangnan University, 1800 Lihu Road, Wuxi 214122, Jiangsu, People's Republic of China.
| | - Yong-Jiang Xu
- State Key Laboratory of Food Science and Technology, School of Food Science and Technology, National Engineering Research Center for Functional Food, National Engineering Laboratory for Cereal Fermentation Technology, Collaborative Innovation Center of Food Safety and Quality Control in Jiangsu Province, Jiangnan University, 1800 Lihu Road, Wuxi 214122, Jiangsu, People's Republic of China.
| |
|
30
|
Wettstein R, Hugener J, Gillet L, Hernández-Armenta Y, Henggeler A, Xu J, van Gerwen J, Wollweber F, Arter M, Aebersold R, Beltrao P, Pilhofer M, Matos J. Waves of regulated protein expression and phosphorylation rewire the proteome to drive gametogenesis in budding yeast. Dev Cell 2024; 59:1764-1782.e8. [PMID: 38906138 DOI: 10.1016/j.devcel.2024.05.025] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/24/2023] [Revised: 02/25/2024] [Accepted: 05/20/2024] [Indexed: 06/23/2024]
Abstract
Sexually reproducing eukaryotes employ a developmentally regulated cell division program, meiosis, to generate haploid gametes from diploid germ cells. To understand how gametes arise, we generated a proteomic census encompassing the entire meiotic program of budding yeast. We found that concerted waves of protein expression and phosphorylation modify nearly all cellular pathways to support meiotic entry, meiotic progression, and gamete morphogenesis. Leveraging this comprehensive resource, we pinpointed dynamic changes in mitochondrial components and showed that phosphorylation of the FoF1-ATP synthase complex is required for efficient gametogenesis. Furthermore, using cryoET as an orthogonal approach to visualize mitochondria, we uncovered highly ordered filament arrays of Ald4ALDH2, a conserved aldehyde dehydrogenase that is highly expressed and phosphorylated during meiosis. Notably, phosphorylation-resistant mutants failed to accumulate filaments, suggesting that phosphorylation regulates context-specific Ald4ALDH2 polymerization. Overall, this proteomic census constitutes a broad resource to guide the exploration of the unique sequence of events underpinning gametogenesis.
Affiliation(s)
- Rahel Wettstein
- Max Perutz Laboratories, University of Vienna, 1030 Vienna, Austria; Institute of Biochemistry, ETH Zürich, 8093 Zürich, Switzerland
| | - Jannik Hugener
- Max Perutz Laboratories, University of Vienna, 1030 Vienna, Austria; Institute of Biochemistry, ETH Zürich, 8093 Zürich, Switzerland; Institute of Molecular Biology and Biophysics, ETH Zürich, 8093 Zürich, Switzerland
| | - Ludovic Gillet
- Institute of Molecular Systems Biology, ETH Zürich, 8093 Zürich, Switzerland
| | - Yi Hernández-Armenta
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Cambridge, UK
| | - Adrian Henggeler
- Max Perutz Laboratories, University of Vienna, 1030 Vienna, Austria; Institute of Biochemistry, ETH Zürich, 8093 Zürich, Switzerland
| | - Jingwei Xu
- Institute of Molecular Biology and Biophysics, ETH Zürich, 8093 Zürich, Switzerland
| | - Julian van Gerwen
- Institute of Molecular Systems Biology, ETH Zürich, 8093 Zürich, Switzerland
| | - Florian Wollweber
- Institute of Molecular Biology and Biophysics, ETH Zürich, 8093 Zürich, Switzerland
| | - Meret Arter
- Institute of Biochemistry, ETH Zürich, 8093 Zürich, Switzerland
| | - Ruedi Aebersold
- Institute of Molecular Systems Biology, ETH Zürich, 8093 Zürich, Switzerland
| | - Pedro Beltrao
- Institute of Molecular Systems Biology, ETH Zürich, 8093 Zürich, Switzerland; European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Cambridge, UK.
| | - Martin Pilhofer
- Institute of Molecular Biology and Biophysics, ETH Zürich, 8093 Zürich, Switzerland.
| | - Joao Matos
- Max Perutz Laboratories, University of Vienna, 1030 Vienna, Austria; Institute of Biochemistry, ETH Zürich, 8093 Zürich, Switzerland.
| |
|
31
|
Wu E, Xu G, Xie D, Qiao L. Data-independent acquisition in metaproteomics. Expert Rev Proteomics 2024; 21:271-280. [PMID: 39152734 DOI: 10.1080/14789450.2024.2394190] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/11/2024] [Revised: 08/12/2024] [Accepted: 08/14/2024] [Indexed: 08/19/2024]
Abstract
INTRODUCTION Metaproteomics offers insights into the function of complex microbial communities, while it is also capable of revealing microbe-microbe and host-microbe interactions. Data-independent acquisition (DIA) mass spectrometry is an emerging technology, which holds great potential to achieve deep and accurate metaproteomics with higher reproducibility yet still facing a series of challenges due to the inherent complexity of metaproteomics and DIA data. AREAS COVERED This review offers an overview of the DIA metaproteomics approaches, covering aspects such as database construction, search strategy, and data analysis tools. Several cases of current DIA metaproteomics studies are presented to illustrate the procedures. Important ongoing challenges are also highlighted. Future perspectives of DIA methods for metaproteomics analysis are further discussed. Cited references are searched through and collected from Google Scholar and PubMed. EXPERT OPINION Considering the inherent complexity of DIA metaproteomics data, data analysis strategies specifically designed for interpretation are imperative. From this point of view, we anticipate that deep learning methods and de novo sequencing methods will become more prevalent in the future, potentially improving protein coverage in metaproteomics. Moreover, the advancement of metaproteomics also depends on the development of sample preparation methods, data analysis strategies, etc. These factors are key to unlocking the full potential of metaproteomics.
Affiliation(s)
- Enhui Wu
- Department of Thoracic Surgery, Shanghai Pulmonary Hospital, Tongji University School of Medicine, Shanghai, China
- Department of Chemistry, Fudan University, Shanghai, China
| | - Guanyang Xu
- Department of Chemistry, Fudan University, Shanghai, China
| | - Dong Xie
- Department of Thoracic Surgery, Shanghai Pulmonary Hospital, Tongji University School of Medicine, Shanghai, China
| | - Liang Qiao
- Department of Chemistry, Fudan University, Shanghai, China
| |
|
32
|
Karpov OA, Stotland A, Raedschelders K, Chazarin B, Ai L, Murray CI, Van Eyk JE. Proteomics of the heart. Physiol Rev 2024; 104:931-982. [PMID: 38300522 PMCID: PMC11381016 DOI: 10.1152/physrev.00026.2023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/03/2023] [Revised: 12/25/2023] [Accepted: 01/14/2024] [Indexed: 02/02/2024] Open
Abstract
Mass spectrometry-based proteomics is a sophisticated identification tool specializing in portraying protein dynamics at a molecular level. Proteomics provides biologists with a snapshot of context-dependent protein and proteoform expression, structural conformations, dynamic turnover, and protein-protein interactions. Cardiac proteomics can offer a broader and deeper understanding of the molecular mechanisms that underscore cardiovascular disease, and it is foundational to the development of future therapeutic interventions. This review encapsulates the evolution, current technologies, and future perspectives of proteomic-based mass spectrometry as it applies to the study of the heart. Key technological advancements have allowed researchers to study proteomes at a single-cell level and employ robot-assisted automation systems for enhanced sample preparation techniques, and the increase in fidelity of the mass spectrometers has allowed for the unambiguous identification of numerous dynamic posttranslational modifications. Animal models of cardiovascular disease, ranging from early animal experiments to current sophisticated models of heart failure with preserved ejection fraction, have provided the tools to study a challenging organ in the laboratory. Further technological development will pave the way for the implementation of proteomics even closer within the clinical setting, allowing not only scientists but also patients to benefit from an understanding of protein interplay as it relates to cardiac disease physiology.
Affiliation(s)
- Oleg A Karpov
- Smidt Heart Institute, Advanced Clinical Biosystems Research Institute, Cedars-Sinai Medical Center, Los Angeles, California, United States
| | - Aleksandr Stotland
- Smidt Heart Institute, Advanced Clinical Biosystems Research Institute, Cedars-Sinai Medical Center, Los Angeles, California, United States
| | - Koen Raedschelders
- Smidt Heart Institute, Advanced Clinical Biosystems Research Institute, Cedars-Sinai Medical Center, Los Angeles, California, United States
| | - Blandine Chazarin
- Smidt Heart Institute, Advanced Clinical Biosystems Research Institute, Cedars-Sinai Medical Center, Los Angeles, California, United States
| | - Lizhuo Ai
- Smidt Heart Institute, Advanced Clinical Biosystems Research Institute, Cedars-Sinai Medical Center, Los Angeles, California, United States
| | - Christopher I Murray
- Smidt Heart Institute, Advanced Clinical Biosystems Research Institute, Cedars-Sinai Medical Center, Los Angeles, California, United States
| | - Jennifer E Van Eyk
- Smidt Heart Institute, Advanced Clinical Biosystems Research Institute, Cedars-Sinai Medical Center, Los Angeles, California, United States
| |
|
33
|
Zhang Y, Hu C, Wu X, Song J. Calib-RT: an open source python package for peptide retention time calibration in DIA mass spectrometry data. Bioinformatics 2024; 40:btae417. [PMID: 38960865 PMCID: PMC11223842 DOI: 10.1093/bioinformatics/btae417] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/27/2023] [Revised: 05/27/2024] [Accepted: 07/02/2024] [Indexed: 07/05/2024] Open
Abstract
MOTIVATION The data-independent acquisition (DIA) mass spectrometry (MS) method is increasingly popular in the field of proteomics. However, the loss of the correspondence between peptide ions and their spectra in DIA makes identification challenging. One effective approach to reduce false positive identifications is to calculate the deviation between the peptide's estimated retention time (RT) and measured RT. During this process, scaling the spectral library RT into the estimated RT, known as RT calibration, is a prerequisite for calculating the deviation. Currently, within the DIA algorithm ecosystem, there is a lack of engine-independent and readily usable RT calibration toolkits. RESULTS In this work, we introduce Calib-RT, an RT calibration method tailored to the characteristics of RT data. This method can achieve nonlinear calibration across various data scales and tolerate a certain level of noise interference. Calib-RT is expected to enrich the open source DIA algorithm toolchain and assist in the development of DIA identification algorithms. AVAILABILITY AND IMPLEMENTATION Calib-RT is released as open source software under the MIT license and can be installed from PyPI as a Python module. The source code is available on GitHub at https://github.com/chenghui03/Calib_RT.
Affiliation(s)
- Yichi Zhang
- Pasteurien College, Suzhou Medical College of Soochow University, Soochow University, Suzhou 215000, China
| | - Chenghui Hu
- Pasteurien College, Suzhou Medical College of Soochow University, Soochow University, Suzhou 215000, China
| | - Xiaohui Wu
- Pasteurien College, Suzhou Medical College of Soochow University, Soochow University, Suzhou 215000, China
| | - Jian Song
- Pasteurien College, Suzhou Medical College of Soochow University, Soochow University, Suzhou 215000, China
| |
|
34
|
Ling CW, Deng K, Yang Y, Lin HR, Liu CY, Li BY, Hu W, Liang X, Zhao H, Tang XY, Zheng JS, Chen YM. Mapping the gut microecological multi-omics signatures to serum metabolome and their impact on cardiometabolic health in elderly adults. EBioMedicine 2024; 105:105209. [PMID: 38908099 PMCID: PMC11253218 DOI: 10.1016/j.ebiom.2024.105209] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2023] [Revised: 05/04/2024] [Accepted: 06/04/2024] [Indexed: 06/24/2024] Open
Abstract
BACKGROUND Mapping gut microecological features to serum metabolites (SMs) will help identify functional links between gut microbiome and cardiometabolic health. METHODS This study encompassed 836-1021 adults over 9.7 years in a cohort, assessing metabolic syndrome (MS), carotid atherosclerotic plaque (CAP), and other metadata triennially. We analyzed mid-term microbial metagenomics, targeted fecal and serum metabolomics, host genetics, and serum proteomics. FINDINGS Gut microbiota and metabolites (GMM) accounted for 15.1% overall variance in 168 SMs, with individual GMM factors explaining 5.65%-10.1%, host genetics 3.23%, and sociodemographic factors 5.95%. Specifically, GMM elucidated 5.5%-49.6% variance in the top 32 GMM-explained SMs. Each 20% increase in the 32 metabolite score (derived from the 32 SMs) correlated with 73% (95% confidence interval [CI]: 53%-95%) and 19% (95% CI: 11%-27%) increases in MS and CAP incidences, respectively. Among the 32 GMM-explained SMs, sebacic acid, indoleacetic acid, and eicosapentaenoic acid were linked to MS or CAP incidence. Serum proteomics revealed certain proteins, particularly the apolipoprotein family, mediated the relationship between GMM-SMs and cardiometabolic risks. INTERPRETATION This study reveals the significant influence of GMM on SM profiles and illustrates the intricate connections between GMM-explained SMs, serum proteins, and the incidence of MS and CAP, providing insights into the roles of gut dysbiosis in cardiometabolic health via regulating blood metabolites. FUNDING This study was jointly supported by the National Natural Science Foundation of China, Key Research and Development Program of Guangzhou, 5010 Program for Clinical Research of Sun Yat-sen University, and the 'Pioneer' and 'Leading goose' R&D Program of Zhejiang.
Affiliation(s)
- Chu-Wen Ling
- Department of Epidemiology, Guangdong Provincial Key Laboratory of Food, Nutrition and Health, School of Public Health, Sun Yat-sen University, Guangzhou, 510080, China; Department of Clinical Nutrition, Guangdong Provincial People's Hospital (Guangdong Academy of Medical Sciences), Southern Medical University, Guangzhou, 510080, China
| | - Kui Deng
- Department of Epidemiology, Guangdong Provincial Key Laboratory of Food, Nutrition and Health, School of Public Health, Sun Yat-sen University, Guangzhou, 510080, China; Zhejiang Key Laboratory of Multi-Omics in Infection and Immunity, Center for Infectious Disease Research, School of Medicine, Westlake University, Hangzhou, 310030, China
| | - Yingdi Yang
- Department of Epidemiology, Guangdong Provincial Key Laboratory of Food, Nutrition and Health, School of Public Health, Sun Yat-sen University, Guangzhou, 510080, China
| | - Hong-Rou Lin
- Department of Epidemiology, Guangdong Provincial Key Laboratory of Food, Nutrition and Health, School of Public Health, Sun Yat-sen University, Guangzhou, 510080, China
| | - Chun-Ying Liu
- Department of Epidemiology, Guangdong Provincial Key Laboratory of Food, Nutrition and Health, School of Public Health, Sun Yat-sen University, Guangzhou, 510080, China
| | - Bang-Yan Li
- Department of Epidemiology, Guangdong Provincial Key Laboratory of Food, Nutrition and Health, School of Public Health, Sun Yat-sen University, Guangzhou, 510080, China
| | - Wei Hu
- Department of Epidemiology, Guangdong Provincial Key Laboratory of Food, Nutrition and Health, School of Public Health, Sun Yat-sen University, Guangzhou, 510080, China
| | - Xinxiu Liang
- Zhejiang Key Laboratory of Multi-Omics in Infection and Immunity, Center for Infectious Disease Research, School of Medicine, Westlake University, Hangzhou, 310030, China
| | - Hui Zhao
- Zhejiang Key Laboratory of Multi-Omics in Infection and Immunity, Center for Infectious Disease Research, School of Medicine, Westlake University, Hangzhou, 310030, China
| | - Xin-Yi Tang
- Department of Pediatrics, The Third Affiliated Hospital of Sun Yat-sen University, Guangzhou, 510630, China.
| | - Ju-Sheng Zheng
- Zhejiang Key Laboratory of Multi-Omics in Infection and Immunity, Center for Infectious Disease Research, School of Medicine, Westlake University, Hangzhou, 310030, China.
| | - Yu-Ming Chen
- Department of Epidemiology, Guangdong Provincial Key Laboratory of Food, Nutrition and Health, School of Public Health, Sun Yat-sen University, Guangzhou, 510080, China.
| |
|
35
|
Baker C, Bruderer R, Abbott J, Arthur JSC, Brenes AJ. Optimizing Spectronaut Search Parameters to Improve Data Quality with Minimal Proteome Coverage Reductions in DIA Analyses of Heterogeneous Samples. J Proteome Res 2024; 23:1926-1936. [PMID: 38691771 PMCID: PMC11165578 DOI: 10.1021/acs.jproteome.3c00671] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/12/2023] [Revised: 01/18/2024] [Accepted: 04/19/2024] [Indexed: 05/03/2024]
Abstract
Data-independent acquisition has seen breakthroughs that enable comprehensive proteome profiling using short gradients. As the proteome coverage continues to increase, the quality of the data generated becomes much more relevant. Using Spectronaut, we show that the default search parameters can be easily optimized to minimize the occurrence of false positives across different samples. Using an immunological infection model system to demonstrate the impact of adjusting search settings, we analyzed Mus musculus macrophages and compared their proteome to macrophages spiked with Candida albicans. This experimental system enabled the identification of "false positives" as Candida albicans peptides and proteins should not be present in the Mus musculus-only samples. We show that adjusting the search parameters reduced "false positive" identifications by 89% at the peptide and protein level, thereby considerably increasing the quality of the data. We also show that these optimized parameters incurred a moderate cost, only reducing the overall number of "true positive" identifications across each biological replicate by <6.7% at both the peptide and protein level. We believe the value of our updated search parameters extends beyond a two-organism analysis and would be of great value to any DIA experiment analyzing heterogeneous populations of cell types or tissues.
Affiliation(s)
- Christa P. Baker
- Division of Cell Signalling & Immunology, School of Life Sciences, University of Dundee, Dundee DD1 5EH, United Kingdom
| | | | - James Abbott
- Data Analysis Group, Division of Computational Biology, School of Life Sciences, University of Dundee, Dundee DD1 5EH, United Kingdom
| | - J. Simon C. Arthur
- Division of Cell Signalling & Immunology, School of Life Sciences, University of Dundee, Dundee DD1 5EH, United Kingdom
| | - Alejandro J. Brenes
- Division of Cell Signalling & Immunology, School of Life Sciences, University of Dundee, Dundee DD1 5EH, United Kingdom
| |
|
36
|
Staes A, Mendes Maia T, Dufour S, Bouwmeester R, Gabriels R, Martens L, Gevaert K, Impens F, Devos S. Benefit of In Silico Predicted Spectral Libraries in Data-Independent Acquisition Data Analysis Workflows. J Proteome Res 2024; 23:2078-2089. [PMID: 38666436 DOI: 10.1021/acs.jproteome.4c00048] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 02/20/2025]
Abstract
Data-independent acquisition (DIA) has become a well-established method for MS-based proteomics. However, the list of options to analyze this type of data is quite extensive, and the use of spectral libraries has become an important factor in DIA data analysis. More specifically, the use of in silico predicted libraries is gaining more interest. By working with a differential spike-in of human standard proteins (UPS2) in a constant yeast tryptic digest background, we evaluated the sensitivity, precision, and accuracy of the use of in silico predicted libraries in DIA data analysis workflows compared to more established workflows. Three commonly used DIA software tools, DIA-NN, EncyclopeDIA, and Spectronaut, were each tested in spectral library mode and spectral library-free mode. In spectral library mode, we used independent spectral library prediction tools PROSIT and MS2PIP together with DeepLC, next to classical data-dependent acquisition (DDA)-based spectral libraries. In total, we benchmarked 12 computational workflows for DIA. Our comparison showed that DIA-NN reached the highest sensitivity while maintaining a good compromise on the reproducibility and accuracy levels in either library-free mode or using in silico predicted libraries, pointing to a general benefit in using in silico predicted libraries.
Affiliation(s)
- An Staes
- VIB Center for Medical Biotechnology, Technologiepark-Zwijnaarde 75, B9052 Ghent, Belgium
- Department of Biomolecular Medicine, Ghent University, Technologiepark-Zwijnaarde 75, B9052 Ghent, Belgium
- VIB Proteomics Core, B9052 Ghent, Belgium
| | - Teresa Mendes Maia
- VIB Center for Medical Biotechnology, Technologiepark-Zwijnaarde 75, B9052 Ghent, Belgium
- Department of Biomolecular Medicine, Ghent University, Technologiepark-Zwijnaarde 75, B9052 Ghent, Belgium
- VIB Proteomics Core, B9052 Ghent, Belgium
| | - Sara Dufour
- VIB Center for Medical Biotechnology, Technologiepark-Zwijnaarde 75, B9052 Ghent, Belgium
- Department of Biomolecular Medicine, Ghent University, Technologiepark-Zwijnaarde 75, B9052 Ghent, Belgium
- VIB Proteomics Core, B9052 Ghent, Belgium
| | - Robbin Bouwmeester
- VIB Center for Medical Biotechnology, Technologiepark-Zwijnaarde 75, B9052 Ghent, Belgium
- Department of Biomolecular Medicine, Ghent University, Technologiepark-Zwijnaarde 75, B9052 Ghent, Belgium
| | - Ralf Gabriels
- VIB Center for Medical Biotechnology, Technologiepark-Zwijnaarde 75, B9052 Ghent, Belgium
- Department of Biomolecular Medicine, Ghent University, Technologiepark-Zwijnaarde 75, B9052 Ghent, Belgium
| | - Lennart Martens
- VIB Center for Medical Biotechnology, Technologiepark-Zwijnaarde 75, B9052 Ghent, Belgium
- Department of Biomolecular Medicine, Ghent University, Technologiepark-Zwijnaarde 75, B9052 Ghent, Belgium
| | - Kris Gevaert
- VIB Center for Medical Biotechnology, Technologiepark-Zwijnaarde 75, B9052 Ghent, Belgium
- Department of Biomolecular Medicine, Ghent University, Technologiepark-Zwijnaarde 75, B9052 Ghent, Belgium
| | - Francis Impens
- VIB Center for Medical Biotechnology, Technologiepark-Zwijnaarde 75, B9052 Ghent, Belgium
- Department of Biomolecular Medicine, Ghent University, Technologiepark-Zwijnaarde 75, B9052 Ghent, Belgium
- VIB Proteomics Core, B9052 Ghent, Belgium
| | - Simon Devos
- VIB Center for Medical Biotechnology, Technologiepark-Zwijnaarde 75, B9052 Ghent, Belgium
- Department of Biomolecular Medicine, Ghent University, Technologiepark-Zwijnaarde 75, B9052 Ghent, Belgium
- VIB Proteomics Core, B9052 Ghent, Belgium
| |
|
37
|
Lewis JM, Jebeli L, Coulon PML, Lay CE, Scott NE. Glycoproteomic and proteomic analysis of Burkholderia cenocepacia reveals glycosylation events within FliF and MotB are dispensable for motility. Microbiol Spectr 2024; 12:e0034624. [PMID: 38709084 PMCID: PMC11237607 DOI: 10.1128/spectrum.00346-24] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/10/2024] [Accepted: 04/16/2024] [Indexed: 05/07/2024] Open
Abstract
Across the Burkholderia genus O-linked protein glycosylation is highly conserved. While the inhibition of glycosylation has been shown to be detrimental for virulence in Burkholderia cepacia complex species, such as Burkholderia cenocepacia, little is known about how specific glycosylation sites impact protein functionality. Within this study, we sought to improve our understanding of the breadth, dynamics, and requirement for glycosylation across the B. cenocepacia O-glycoproteome. Assessing the B. cenocepacia glycoproteome across different culture media using complementary glycoproteomic approaches, we increase the known glycoproteome to 141 glycoproteins. Leveraging this repertoire of glycoproteins, we quantitatively assessed the glycoproteome of B. cenocepacia using Data-Independent Acquisition (DIA) revealing the B. cenocepacia glycoproteome is largely stable across conditions with most glycoproteins constitutively expressed. Examination of how the absence of glycosylation impacts the glycoproteome reveals that the protein abundance of only five glycoproteins (BCAL1086, BCAL2974, BCAL0525, BCAM0505, and BCAL0127) is altered by the loss of glycosylation. Assessing ΔfliF (ΔBCAL0525), ΔmotB (ΔBCAL0127), and ΔBCAM0505 strains, we demonstrate the loss of FliF, and to a lesser extent MotB, mirror the proteomic effects observed in the absence of glycosylation in ΔpglL. While both MotB and FliF are essential for motility, we find loss of glycosylation sites in MotB or FliF does not impact motility supporting these sites are dispensable for function. Combined this work broadens our understanding of the B. cenocepacia glycoproteome supporting that the loss of glycoproteins in the absence of glycosylation is not an indicator of the requirement for glycosylation for protein function. IMPORTANCE Burkholderia cenocepacia is an opportunistic pathogen of concern within the Cystic Fibrosis community. Despite a greater appreciation of the unique physiology of B. cenocepacia gained over the last 20 years a complete understanding of the proteome and especially the O-glycoproteome, is lacking. In this study, we utilize systems biology approaches to expand the known B. cenocepacia glycoproteome as well as track the dynamics of glycoproteins across growth phases, culturing media and in response to the loss of glycosylation. We show that the glycoproteome of B. cenocepacia is largely stable across conditions and that the loss of glycosylation only impacts five glycoproteins including the motility associated proteins FliF and MotB. Examination of MotB and FliF shows, while these proteins are essential for motility, glycosylation is dispensable. Combined this work supports that B. cenocepacia glycosylation can be dispensable for protein function and may influence protein properties beyond stability.
Affiliation(s)
- Jessica M Lewis
- Department of Microbiology and Immunology, University of Melbourne at the Peter Doherty Institute for Infection and Immunity, Melbourne, Australia
| | - Leila Jebeli
- Department of Microbiology and Immunology, University of Melbourne at the Peter Doherty Institute for Infection and Immunity, Melbourne, Australia
| | - Pauline M L Coulon
- Department of Microbiology and Immunology, University of Melbourne at the Peter Doherty Institute for Infection and Immunity, Melbourne, Australia
| | - Catrina E Lay
- Department of Microbiology and Immunology, University of Melbourne at the Peter Doherty Institute for Infection and Immunity, Melbourne, Australia
| | - Nichollas E Scott
- Department of Microbiology and Immunology, University of Melbourne at the Peter Doherty Institute for Infection and Immunity, Melbourne, Australia
| |
|
38
|
Alfahel L, Gschwendtberger T, Kozareva V, Dumas L, Gibbs R, Kertser A, Baruch K, Zaccai S, Kahn J, Thau-Habermann N, Eggenschwiler R, Sterneckert J, Hermann A, Sundararaman N, Vaibhav V, Van Eyk JE, Rafuse VF, Fraenkel E, Cantz T, Petri S, Israelson A. Targeting low levels of MIF expression as a potential therapeutic strategy for ALS. Cell Rep Med 2024; 5:101546. [PMID: 38703766 PMCID: PMC11148722 DOI: 10.1016/j.xcrm.2024.101546] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2023] [Revised: 11/03/2023] [Accepted: 04/10/2024] [Indexed: 05/06/2024]
Abstract
Mutations in SOD1 cause amyotrophic lateral sclerosis (ALS), a neurodegenerative disease characterized by motor neuron (MN) loss. We previously discovered that macrophage migration inhibitory factor (MIF), whose levels are extremely low in spinal MNs, inhibits mutant SOD1 misfolding and toxicity. In this study, we show that a single peripheral injection of adeno-associated virus (AAV) delivering MIF into adult SOD1G37R mice significantly improves their motor function, delays disease progression, and extends survival. Moreover, MIF treatment reduces neuroinflammation and misfolded SOD1 accumulation, rescues MNs, and corrects dysregulated pathways as observed by proteomics and transcriptomics. Furthermore, we reveal low MIF levels in human induced pluripotent stem cell-derived MNs from familial ALS patients with different genetic mutations, as well as in post mortem tissues of sporadic ALS patients. Our findings indicate that peripheral MIF administration may provide a potential therapeutic mechanism for modulating misfolded SOD1 in vivo and disease outcome in ALS patients.
Affiliation(s)
- Leenor Alfahel
- Department of Physiology and Cell Biology, Faculty of Health Sciences, Ben-Gurion University of the Negev, P.O.B. 653, Beer Sheva 84105, Israel; The School of Brain Sciences and Cognition, Ben-Gurion University of the Negev, P.O.B. 653, Beer Sheva 84105, Israel
| | - Thomas Gschwendtberger
- Department of Neurology, Hannover Medical School, 30625 Hannover, Germany; Center for Systems Neuroscience, Hannover Medical School, 30625 Hannover, Germany
| | - Velina Kozareva
- Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
| | - Laura Dumas
- Department of Medical Neuroscience, Dalhousie University, Halifax, Nova Scotia B3H 1X5, Canada; Brain Repair Centre, Life Sciences Research Institute, Halifax, Nova Scotia B3H 4R2, Canada
| | - Rachel Gibbs
- Department of Medical Neuroscience, Dalhousie University, Halifax, Nova Scotia B3H 1X5, Canada; Brain Repair Centre, Life Sciences Research Institute, Halifax, Nova Scotia B3H 4R2, Canada
| | | | - Kuti Baruch
- ImmunoBrain Checkpoint Ltd., Ness Ziona 7404905, Israel
| | - Shir Zaccai
- Department of Physiology and Cell Biology, Faculty of Health Sciences, Ben-Gurion University of the Negev, P.O.B. 653, Beer Sheva 84105, Israel; The School of Brain Sciences and Cognition, Ben-Gurion University of the Negev, P.O.B. 653, Beer Sheva 84105, Israel
| | - Joy Kahn
- Department of Physiology and Cell Biology, Faculty of Health Sciences, Ben-Gurion University of the Negev, P.O.B. 653, Beer Sheva 84105, Israel; The School of Brain Sciences and Cognition, Ben-Gurion University of the Negev, P.O.B. 653, Beer Sheva 84105, Israel
| | | | - Reto Eggenschwiler
- Gastroenterology, Hepatology and Endocrinology Department, Hannover Medical School, 30625 Hannover, Germany; Translational Hepatology and Stem Cell Biology, REBIRTH - Research Center for Translational Regenerative Medicine and Department of Gastroenterology, Hepatology and Endocrinology, Hannover Medical School, 30625 Hannover, Germany
| | - Jared Sterneckert
- Center for Regenerative Therapies Dresden, Technical University Dresden, 01307 Dresden, Germany
| | - Andreas Hermann
- Translational Neurodegeneration Section, "Albrecht Kossel", Department of Neurology, University Medical Center Rostock, University of Rostock, 18147 Rostock, Germany; Deutsches Zentrum für Neurodegenerative Erkrankungen (DZNE) Rostock/Greifswald, 18147 Rostock, Germany; Center for Transdisciplinary Neurosciences Rostock (CTNR), University Medical Center Rostock, University of Rostock, 18147 Rostock, Germany
| | - Niveda Sundararaman
- Smidt Heart Institute, Cedars-Sinai Medical Center, Los Angeles, CA 90048, USA; Advanced Clinical Biosystems Research Institute, Cedars-Sinai Medical Center, Los Angeles, CA 90048, USA
| | - Vineet Vaibhav
- Smidt Heart Institute, Cedars-Sinai Medical Center, Los Angeles, CA 90048, USA; Advanced Clinical Biosystems Research Institute, Cedars-Sinai Medical Center, Los Angeles, CA 90048, USA
| | - Jennifer E Van Eyk
- Smidt Heart Institute, Cedars-Sinai Medical Center, Los Angeles, CA 90048, USA; Advanced Clinical Biosystems Research Institute, Cedars-Sinai Medical Center, Los Angeles, CA 90048, USA
| | - Victor F Rafuse
- Department of Medical Neuroscience, Dalhousie University, Halifax, Nova Scotia B3H 1X5, Canada; Brain Repair Centre, Life Sciences Research Institute, Halifax, Nova Scotia B3H 4R2, Canada
| | - Ernest Fraenkel
- Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
| | - Tobias Cantz
- Gastroenterology, Hepatology and Endocrinology Department, Hannover Medical School, 30625 Hannover, Germany; Translational Hepatology and Stem Cell Biology, REBIRTH - Research Center for Translational Regenerative Medicine and Department of Gastroenterology, Hepatology and Endocrinology, Hannover Medical School, 30625 Hannover, Germany; Max Planck Institute for Molecular Biomedicine, Cell and Developmental Biology, 48149 Münster, Germany
| | - Susanne Petri
- Department of Neurology, Hannover Medical School, 30625 Hannover, Germany; Center for Systems Neuroscience, Hannover Medical School, 30625 Hannover, Germany
| | - Adrian Israelson
- Department of Physiology and Cell Biology, Faculty of Health Sciences, Ben-Gurion University of the Negev, P.O.B. 653, Beer Sheva 84105, Israel; The School of Brain Sciences and Cognition, Ben-Gurion University of the Negev, P.O.B. 653, Beer Sheva 84105, Israel.
| |
|
39
|
Rosenberger G, Li W, Turunen M, He J, Subramaniam PS, Pampou S, Griffin AT, Karan C, Kerwin P, Murray D, Honig B, Liu Y, Califano A. Network-based elucidation of colon cancer drug resistance mechanisms by phosphoproteomic time-series analysis. Nat Commun 2024; 15:3909. [PMID: 38724493 PMCID: PMC11082183 DOI: 10.1038/s41467-024-47957-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/18/2023] [Accepted: 04/16/2024] [Indexed: 05/12/2024] Open
Abstract
Aberrant signaling pathway activity is a hallmark of tumorigenesis and progression, which has guided targeted inhibitor design for over 30 years. Yet, adaptive resistance mechanisms, induced by rapid, context-specific signaling network rewiring, continue to challenge therapeutic efficacy. Leveraging progress in proteomic technologies and network-based methodologies, we introduce Virtual Enrichment-based Signaling Protein-activity Analysis (VESPA), an algorithm designed to elucidate mechanisms of cell response and adaptation to drug perturbations, and use it to analyze 7-point phosphoproteomic time series from colorectal cancer cells treated with clinically relevant inhibitors and control media. Interrogating tumor-specific enzyme/substrate interactions accurately infers kinase and phosphatase activity, based on their substrate phosphorylation state, effectively accounting for signal crosstalk and sparse phosphoproteome coverage. The analysis elucidates time-dependent signaling pathway response to each drug perturbation and, more importantly, cell adaptive response and rewiring, experimentally confirmed by CRISPR knock-out assays, suggesting broad applicability to cancer and other diseases.
Affiliation(s)
- George Rosenberger
- Department of Systems Biology, Columbia University Irving Medical Center, New York, NY, USA
| | - Wenxue Li
- Yale Cancer Biology Institute, Yale University, West Haven, CT, USA
| | - Mikko Turunen
- Department of Systems Biology, Columbia University Irving Medical Center, New York, NY, USA
| | - Jing He
- Department of Systems Biology, Columbia University Irving Medical Center, New York, NY, USA
- Regeneron Genetics Center, Tarrytown, NY, USA
| | - Prem S Subramaniam
- Department of Systems Biology, Columbia University Irving Medical Center, New York, NY, USA
| | - Sergey Pampou
- Department of Systems Biology, Columbia University Irving Medical Center, New York, NY, USA
- J.P. Sulzberger Columbia Genome Center, Columbia University Irving Medical Center, New York, NY, USA
| | - Aaron T Griffin
- Department of Systems Biology, Columbia University Irving Medical Center, New York, NY, USA
- Medical Scientist Training Program, Columbia University Irving Medical Center, New York, NY, USA
| | - Charles Karan
- Department of Systems Biology, Columbia University Irving Medical Center, New York, NY, USA
- J.P. Sulzberger Columbia Genome Center, Columbia University Irving Medical Center, New York, NY, USA
| | - Patrick Kerwin
- Department of Systems Biology, Columbia University Irving Medical Center, New York, NY, USA
| | - Diana Murray
- Department of Systems Biology, Columbia University Irving Medical Center, New York, NY, USA
| | - Barry Honig
- Department of Systems Biology, Columbia University Irving Medical Center, New York, NY, USA
- Department of Medicine, Columbia University Irving Medical Center, New York, NY, USA
- Department of Biochemistry & Molecular Biophysics, Columbia University Irving Medical Center, New York, NY, USA
- Zuckerman Mind Brain and Behavior Institute, Columbia University, New York, NY, USA
- Herbert Irving Comprehensive Cancer Center, Columbia University Irving Medical Center, New York, NY, USA
| | - Yansheng Liu
- Yale Cancer Biology Institute, Yale University, West Haven, CT, USA.
- Department of Pharmacology, Yale University School of Medicine, New Haven, CT, USA.
| | - Andrea Califano
- Department of Systems Biology, Columbia University Irving Medical Center, New York, NY, USA.
- Department of Medicine, Columbia University Irving Medical Center, New York, NY, USA.
- Department of Biochemistry & Molecular Biophysics, Columbia University Irving Medical Center, New York, NY, USA.
- Herbert Irving Comprehensive Cancer Center, Columbia University Irving Medical Center, New York, NY, USA.
- Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, NY, USA.
- Chan Zuckerberg Biohub New York, New York, NY, USA.
| |
|
40
|
Sing JC, Charkow J, AlHigaylan M, Horecka I, Xu L, Röst HL. MassDash: A Web-Based Dashboard for Data-Independent Acquisition Mass Spectrometry Visualization. J Proteome Res 2024. [PMID: 38684072 DOI: 10.1021/acs.jproteome.4c00026] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/02/2024]
Abstract
With the increased usage and diversity of methods and instruments being applied to analyze Data-Independent Acquisition (DIA) data, visualization is becoming increasingly important to validate automated software results. Here we present MassDash, a cross-platform DIA mass spectrometry visualization and validation software for comparing features and results across popular tools. MassDash provides a web-based interface and Python package for interactive feature visualizations and summary report plots across multiple automated DIA feature detection tools, including OpenSwath, DIA-NN, and dreamDIA. Furthermore, MassDash processes peptides on the fly, enabling interactive visualization of peptides across dozens of runs simultaneously on a personal computer. MassDash supports various multidimensional visualizations across retention time, ion mobility, m/z, and intensity, providing additional insights into the data. The modular framework is easily extendable, enabling rapid algorithm development of novel peak-picker techniques, such as deep-learning-based approaches and refinement of existing tools. MassDash is open-source under a BSD 3-Clause license and freely available at https://github.com/Roestlab/massdash, and a demo version can be accessed at https://massdash.streamlit.app.
Affiliation(s)
- Justin C Sing
- Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario M5S 3E1, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario M5G 1A8, Canada
| | - Joshua Charkow
- Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario M5S 3E1, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario M5G 1A8, Canada
| | - Mohammed AlHigaylan
- Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario M5S 3E1, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario M5G 1A8, Canada
| | - Ira Horecka
- Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario M5S 3E1, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario M5G 1A8, Canada
| | - Leon Xu
- Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario M5S 3E1, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario M5G 1A8, Canada
| | - Hannes L Röst
- Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario M5S 3E1, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario M5G 1A8, Canada
- Department of Computer Science, University of Toronto, Toronto, Ontario M5G 1A8, Canada
| |
|
41
|
Li T, Liu Y, Zhu H, Cao L, Zhou Y, Liu D, Shen Q. Cellular ATP redistribution achieved by deleting Tgparp improves lignocellulose utilization of Trichoderma under heat stress. Biotechnology for Biofuels and Bioproducts 2024; 17:54. [PMID: 38637859 PMCID: PMC11027231 DOI: 10.1186/s13068-024-02502-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/14/2023] [Accepted: 04/05/2024] [Indexed: 04/20/2024]
Abstract
BACKGROUND Thermotolerance is widely acknowledged as a pivotal factor for fungal survival across diverse habitats. Heat stress induces a cascade of disruptions in various life processes, especially in the acquisition of carbon sources, while the mechanisms by which filamentous fungi adapt to heat stress and maintain carbon sources are still not fully understood. RESULTS Using Trichoderma guizhouense, a representative beneficial microorganism for plants, we discover that heat stress severely inhibits lignocellulase secretion, affecting carbon source utilization efficiency. Proteomic results at different temperatures suggest that proteins involved in the poly ADP-ribosylation pathway (TgPARP and TgADPRase) may play pivotal roles in thermal adaptation and lignocellulose utilization. TgPARP is induced by heat stress, while the deletion of Tgparp significantly improves the lignocellulose utilization capacity and lignocellulase secretion in T. guizhouense. Simultaneously, the absence of Tgparp prevents the excessive depletion of ATP and NAD+, enhances the protective role of mitochondrial membrane potential (MMP), and elevates the expression levels of the unfolded protein response (UPR)-related regulatory factor Tgire. Further investigations reveal that a stable MMP can establish energy homeostasis, allocating more ATP within the endoplasmic reticulum (ER) to reduce protein accumulation in the ER, thereby enhancing lignocellulase secretion in T. guizhouense under heat stress. CONCLUSIONS Overall, these findings underscore the significance of Tgparp as a pivotal regulator of lignocellulose utilization under heat stress and provide further insights into the molecular mechanism by which filamentous fungi utilize lignocellulose.
Affiliation(s)
- Tuo Li
- Key Lab of Organic-Based Fertilizers of China and Jiangsu Provincial Key Lab for Solid Organic Waste Utilization, Nanjing, China
- College of Resources & Environmental Sciences, Nanjing Agricultural University, Nanjing, 210095, Jiangsu, China
| | - Yang Liu
- Key Lab of Organic-Based Fertilizers of China and Jiangsu Provincial Key Lab for Solid Organic Waste Utilization, Nanjing, China
- College of Resources & Environmental Sciences, Nanjing Agricultural University, Nanjing, 210095, Jiangsu, China
| | - Han Zhu
- Key Lab of Organic-Based Fertilizers of China and Jiangsu Provincial Key Lab for Solid Organic Waste Utilization, Nanjing, China
- College of Resources & Environmental Sciences, Nanjing Agricultural University, Nanjing, 210095, Jiangsu, China
| | - Linhua Cao
- Key Lab of Organic-Based Fertilizers of China and Jiangsu Provincial Key Lab for Solid Organic Waste Utilization, Nanjing, China
- College of Resources & Environmental Sciences, Nanjing Agricultural University, Nanjing, 210095, Jiangsu, China
| | - Yihao Zhou
- Key Lab of Organic-Based Fertilizers of China and Jiangsu Provincial Key Lab for Solid Organic Waste Utilization, Nanjing, China
- College of Resources & Environmental Sciences, Nanjing Agricultural University, Nanjing, 210095, Jiangsu, China
| | - Dongyang Liu
- Key Lab of Organic-Based Fertilizers of China and Jiangsu Provincial Key Lab for Solid Organic Waste Utilization, Nanjing, China.
- College of Resources & Environmental Sciences, Nanjing Agricultural University, Nanjing, 210095, Jiangsu, China.
| | - Qirong Shen
- Key Lab of Organic-Based Fertilizers of China and Jiangsu Provincial Key Lab for Solid Organic Waste Utilization, Nanjing, China
- College of Resources & Environmental Sciences, Nanjing Agricultural University, Nanjing, 210095, Jiangsu, China
| |
|
42
|
Basharat AR, Xiong X, Xu T, Zang Y, Sun L, Liu X. TopDIA: A Software Tool for Top-Down Data-Independent Acquisition Proteomics. bioRxiv 2024:2024.04.05.588302. [PMID: 38645171 PMCID: PMC11030422 DOI: 10.1101/2024.04.05.588302] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/23/2024]
Abstract
Top-down mass spectrometry is widely used for proteoform identification, characterization, and quantification owing to its ability to analyze intact proteoforms. In the last decade, top-down proteomics has been dominated by top-down data-dependent acquisition mass spectrometry (TD-DDA-MS), and top-down data-independent acquisition mass spectrometry (TD-DIA-MS) has not been well studied. While TD-DIA-MS produces complex multiplexed tandem mass spectrometry (MS/MS) spectra, which are challenging to confidently identify, it selects more precursor ions for MS/MS analysis and has the potential to increase proteoform identifications compared with TD-DDA-MS. Here we present TopDIA, the first software tool for proteoform identification by TD-DIA-MS. It generates demultiplexed pseudo MS/MS spectra from TD-DIA-MS data and then searches the pseudo MS/MS spectra against a protein sequence database for proteoform identification. We compared the performance of TD-DDA-MS and TD-DIA-MS using Escherichia coli K-12 MG1655 cells and demonstrated that TD-DIA-MS with TopDIA increased proteoform and protein identifications compared with TD-DDA-MS.
Affiliation(s)
- Abdul Rehman Basharat
- Department of BioHealth Informatics, Luddy School of Informatics, Computing, and Engineering, Indiana University-Purdue University Indianapolis, Indianapolis, IN, 46202, USA
| | - Xingzhao Xiong
- Deming Department of Medicine, Tulane University School of Medicine, New Orleans, LA, 70112, USA
| | - Tian Xu
- Department of Chemistry, Michigan State University, East Lansing, MI, 48824, USA
| | - Yong Zang
- Department of Biostatistics and Health Data Sciences, Indiana University School of Medicine, Indianapolis, IN, 46202, USA
| | - Liangliang Sun
- Department of Chemistry, Michigan State University, East Lansing, MI, 48824, USA
| | - Xiaowen Liu
- Deming Department of Medicine, Tulane University School of Medicine, New Orleans, LA, 70112, USA
| |
|
43
|
Joyce AW, Searle BC. Computational approaches to identify sites of phosphorylation. Proteomics 2024; 24:e2300088. [PMID: 37897210 DOI: 10.1002/pmic.202300088] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/15/2023] [Revised: 10/07/2023] [Accepted: 10/09/2023] [Indexed: 10/29/2023]
Abstract
Due to their oftentimes ambiguous nature, phosphopeptide positional isomers can present challenges in bottom-up mass spectrometry-based workflows as search engine scores alone are often not enough to confidently distinguish them. Additional scoring algorithms can remedy this by providing confidence metrics in addition to these search results, reducing ambiguity. Here we describe challenges to interpreting phosphoproteomics data and review several different approaches to determine sites of phosphorylation for both data-dependent and data-independent acquisition-based workflows. Finally, we discuss open questions regarding neutral losses, gas-phase rearrangement, and false localization rate estimation experienced by both types of acquisition workflows and best practices for managing ambiguity in phosphosite determination.
Affiliation(s)
- Alex W Joyce
- Department of Biomedical Informatics, The Ohio State University Medical Center, Columbus, Ohio, USA
- Pelotonia Institute for Immuno-Oncology, The Ohio State University Comprehensive Cancer Center, Columbus, Ohio, USA
| | - Brian C Searle
- Department of Biomedical Informatics, The Ohio State University Medical Center, Columbus, Ohio, USA
- Pelotonia Institute for Immuno-Oncology, The Ohio State University Comprehensive Cancer Center, Columbus, Ohio, USA
- Department of Chemistry and Biochemistry, The Ohio State University, Columbus, Ohio, USA
| |
|
44
|
Hsiao Y, Zhang H, Li GX, Deng Y, Yu F, Kahrood HV, Steele JR, Schittenhelm RB, Nesvizhskii AI. Analysis and visualization of quantitative proteomics data using FragPipe-Analyst. bioRxiv 2024:2024.03.05.583643. [PMID: 38496650 PMCID: PMC10942459 DOI: 10.1101/2024.03.05.583643] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/19/2024]
Abstract
The FragPipe computational proteomics platform is gaining widespread popularity among the proteomics research community because of its fast processing speed and user-friendly graphical interface. Although FragPipe produces well-formatted output tables that are ready for analysis, there is still a need for an easy-to-use and user-friendly downstream statistical analysis and visualization tool. FragPipe-Analyst addresses this need by providing an R shiny web server to assist FragPipe users in conducting downstream analyses of the resulting quantitative proteomics data. It supports major quantification workflows including label-free quantification, tandem mass tags, and data-independent acquisition. FragPipe-Analyst offers a range of useful functionalities, such as various missing value imputation options, data quality control, unsupervised clustering, differential expression (DE) analysis using Limma, and gene ontology and pathway enrichment analysis using Enrichr. To support advanced analysis and customized visualizations, we also developed FragPipeAnalystR, an R package encompassing all FragPipe-Analyst functionalities that is extended to support site-specific analysis of post-translational modifications (PTMs). FragPipe-Analyst and FragPipeAnalystR are both open-source and freely available.
Affiliation(s)
- Yi Hsiao
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
| | - Haijian Zhang
- Monash Proteomics & Metabolomics Platform, Department of Biochemistry and Molecular Biology, Biomedicine Discovery Institute, Monash University, Clayton, Victoria 3800, Australia
| | - Ginny Xiaohe Li
- Department of Pathology, University of Michigan, Ann Arbor, MI 48109, USA
| | - Yamei Deng
- Department of Pathology, University of Michigan, Ann Arbor, MI 48109, USA
| | - Fengchao Yu
- Department of Pathology, University of Michigan, Ann Arbor, MI 48109, USA
| | - Hossein Valipour Kahrood
- Monash Proteomics & Metabolomics Platform, Department of Biochemistry and Molecular Biology, Biomedicine Discovery Institute, Monash University, Clayton, Victoria 3800, Australia
- Monash Genomics & Bioinformatics Platform, Department of Biochemistry and Molecular Biology, Biomedicine Discovery Institute, Monash University, Clayton, Victoria 3800, Australia
| | - Joel R. Steele
- Monash Proteomics & Metabolomics Platform, Department of Biochemistry and Molecular Biology, Biomedicine Discovery Institute, Monash University, Clayton, Victoria 3800, Australia
| | - Ralf B. Schittenhelm
- Monash Proteomics & Metabolomics Platform, Department of Biochemistry and Molecular Biology, Biomedicine Discovery Institute, Monash University, Clayton, Victoria 3800, Australia
| | - Alexey I. Nesvizhskii
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
- Department of Pathology, University of Michigan, Ann Arbor, MI 48109, USA
| |
|
45
|
Strauss MT, Bludau I, Zeng WF, Voytik E, Ammar C, Schessner JP, Ilango R, Gill M, Meier F, Willems S, Mann M. AlphaPept: a modern and open framework for MS-based proteomics. Nat Commun 2024; 15:2168. [PMID: 38461149 PMCID: PMC10924963 DOI: 10.1038/s41467-024-46485-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/20/2022] [Accepted: 02/20/2024] [Indexed: 03/11/2024] Open
Abstract
In common with other omics technologies, mass spectrometry (MS)-based proteomics produces ever-increasing amounts of raw data, making efficient analysis a principal challenge. A plethora of different computational tools can process the MS data to derive peptide and protein identification and quantification. However, in recent years there has been dramatic progress in computer science, including collaboration tools that have transformed research and industry. To leverage these advances, we develop AlphaPept, a Python-based open-source framework for efficient processing of large high-resolution MS data sets. Numba for just-in-time compilation on CPU and GPU achieves hundred-fold speed improvements. AlphaPept uses the Python scientific stack of highly optimized packages, reducing the code base to domain-specific tasks while accessing the latest advances. We provide an easy on-ramp for community contributions through the concept of literate programming, implemented in Jupyter Notebooks. Large datasets can rapidly be processed as shown by the analysis of hundreds of proteomes in minutes per file, many-fold faster than acquisition. AlphaPept can be used to build automated processing pipelines with web-serving functionality and compatibility with downstream analysis tools. It provides easy access via one-click installation, a modular Python library for advanced users, and an open GitHub repository for developers.
Affiliation(s)
- Maximilian T Strauss
- Department of Proteomics and Signal Transduction, Max Planck Institute of Biochemistry, Martinsried, Germany.
- NNF Center for Protein Research, Faculty of Health Sciences, University of Copenhagen, Copenhagen, Denmark.
| | - Isabell Bludau
- Department of Proteomics and Signal Transduction, Max Planck Institute of Biochemistry, Martinsried, Germany
| | - Wen-Feng Zeng
- Department of Proteomics and Signal Transduction, Max Planck Institute of Biochemistry, Martinsried, Germany
| | - Eugenia Voytik
- Department of Proteomics and Signal Transduction, Max Planck Institute of Biochemistry, Martinsried, Germany
| | - Constantin Ammar
- Department of Proteomics and Signal Transduction, Max Planck Institute of Biochemistry, Martinsried, Germany
| | - Julia P Schessner
- Department of Proteomics and Signal Transduction, Max Planck Institute of Biochemistry, Martinsried, Germany
| | | | | | - Florian Meier
- Department of Proteomics and Signal Transduction, Max Planck Institute of Biochemistry, Martinsried, Germany
- Functional Proteomics, Jena University Hospital, Jena, Germany
| | - Sander Willems
- Department of Proteomics and Signal Transduction, Max Planck Institute of Biochemistry, Martinsried, Germany
| | - Matthias Mann
- Department of Proteomics and Signal Transduction, Max Planck Institute of Biochemistry, Martinsried, Germany.
- NNF Center for Protein Research, Faculty of Health Sciences, University of Copenhagen, Copenhagen, Denmark.
| |
|
46
|
Pfeuffer J, Bielow C, Wein S, Jeong K, Netz E, Walter A, Alka O, Nilse L, Colaianni PD, McCloskey D, Kim J, Rosenberger G, Bichmann L, Walzer M, Veit J, Boudaud B, Bernt M, Patikas N, Pilz M, Startek MP, Kutuzova S, Heumos L, Charkow J, Sing JC, Feroz A, Siraj A, Weisser H, Dijkstra TMH, Perez-Riverol Y, Röst H, Kohlbacher O, Sachsenberg T. OpenMS 3 enables reproducible analysis of large-scale mass spectrometry data. Nat Methods 2024; 21:365-367. [PMID: 38366242 DOI: 10.1038/s41592-024-02197-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/18/2024]
Affiliation(s)
- Julianus Pfeuffer
- Algorithmic Bioinformatics, Freie Universität Berlin, Berlin, Germany
- Visual and Data-Centric Computing, Zuse Institute Berlin, Berlin, Germany
| | - Chris Bielow
- Bioinformatics Solution Center, Institut für Mathematik und Informatik, Freie Universität Berlin, Berlin, Germany
| | - Samuel Wein
- Applied Bioinformatics, Department of Computer Science, University of Tuebingen, Tübingen, Germany
- Institute for Bioinformatics and Medical Informatics, University of Tuebingen, Tübingen, Germany
| | - Kyowon Jeong
- Applied Bioinformatics, Department of Computer Science, University of Tuebingen, Tübingen, Germany
- Institute for Bioinformatics and Medical Informatics, University of Tuebingen, Tübingen, Germany
| | - Eugen Netz
- Applied Bioinformatics, Department of Computer Science, University of Tuebingen, Tübingen, Germany
- Institute for Bioinformatics and Medical Informatics, University of Tuebingen, Tübingen, Germany
| | - Axel Walter
- Applied Bioinformatics, Department of Computer Science, University of Tuebingen, Tübingen, Germany
- Institute for Bioinformatics and Medical Informatics, University of Tuebingen, Tübingen, Germany
| | - Oliver Alka
- Applied Bioinformatics, Department of Computer Science, University of Tuebingen, Tübingen, Germany
- Institute for Bioinformatics and Medical Informatics, University of Tuebingen, Tübingen, Germany
| | - Lars Nilse
- Institute of Molecular Medicine and Cell Research, University of Freiburg, Freiburg, Germany
| | | | - Douglas McCloskey
- Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Lyngby, Denmark
- BioMed X Institute, Heidelberg, Germany
| | - Jihyung Kim
- Applied Bioinformatics, Department of Computer Science, University of Tuebingen, Tübingen, Germany
- Institute for Bioinformatics and Medical Informatics, University of Tuebingen, Tübingen, Germany
| | | | - Leon Bichmann
- Yale Center for Systems and Engineering Immunology and Department of Immunobiology, Yale University School of Medicine, New Haven, CT, USA
| | - Mathias Walzer
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, UK
| | - Johannes Veit
- Applied Bioinformatics, Department of Computer Science, University of Tuebingen, Tübingen, Germany
- Institute for Bioinformatics and Medical Informatics, University of Tuebingen, Tübingen, Germany
| | - Bertrand Boudaud
- Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Lyngby, Denmark
| | - Matthias Bernt
- Department of Computational Biology, Helmholtz Centre for Environmental Research GmbH-UFZ, Leipzig, Germany
| | - Nikolaos Patikas
- Evergrande Center for Immunologic Diseases Harvard Medical School and Brigham and Women's Hospital, Boston, MA, USA
| | - Matteo Pilz
- Applied Bioinformatics, Department of Computer Science, University of Tuebingen, Tübingen, Germany
- Institute for Bioinformatics and Medical Informatics, University of Tuebingen, Tübingen, Germany
| | - Michał Piotr Startek
- Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Warsaw, Poland
- Institute for Immunology, University Medical Center of the Johannes-Gutenberg University, Mainz, Germany
| | - Svetlana Kutuzova
- Department of Computer Science/Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Copenhagen, Denmark
| | - Lukas Heumos
- Institute of Computational Biology, Department of Computational Health, Helmholtz Munich, Oberschleißheim, Germany
- Institute of Lung Health and Immunity and Comprehensive Pneumology Center with the CPC-M bioArchive, Helmholtz Zentrum Munich, Member of the German Center for Lung Research (DZL), Munich, Germany
- TUM School of Life Sciences Weihenstephan, Technical University of Munich, Freising, Germany
| | - Joshua Charkow
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
- The Donnelly Centre, University of Toronto, Toronto, Ontario, Canada
| | - Justin Cyril Sing
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
- The Donnelly Centre, University of Toronto, Toronto, Ontario, Canada
| | - Ayesha Feroz
- Applied Bioinformatics, Department of Computer Science, University of Tuebingen, Tübingen, Germany
- Institute for Bioinformatics and Medical Informatics, University of Tuebingen, Tübingen, Germany
| | - Arslan Siraj
- Applied Bioinformatics, Department of Computer Science, University of Tuebingen, Tübingen, Germany
- Institute for Bioinformatics and Medical Informatics, University of Tuebingen, Tübingen, Germany
| | | | - Tjeerd M H Dijkstra
- Department for Women's Health, University Clinic Tübingen, Tübingen, Germany
- Institute for Translational Bioinformatics, University Hospital Tübingen, Tübingen, Germany
| | - Yasset Perez-Riverol
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, UK
| | - Hannes Röst
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
- The Donnelly Centre, University of Toronto, Toronto, Ontario, Canada
| | - Oliver Kohlbacher
- Applied Bioinformatics, Department of Computer Science, University of Tuebingen, Tübingen, Germany
- Institute for Bioinformatics and Medical Informatics, University of Tuebingen, Tübingen, Germany
- Institute for Translational Bioinformatics, University Hospital Tübingen, Tübingen, Germany
| | - Timo Sachsenberg
- Applied Bioinformatics, Department of Computer Science, University of Tuebingen, Tübingen, Germany.
- Institute for Bioinformatics and Medical Informatics, University of Tuebingen, Tübingen, Germany.
|
47
|
Govender MA, Stoychev SH, Brandenburg JT, Ramsay M, Fabian J, Govender IS. Proteomic insights into the pathophysiology of hypertension-associated albuminuria: Pilot study in a South African cohort. Clin Proteomics 2024; 21:15. [PMID: 38402394 PMCID: PMC10893729 DOI: 10.1186/s12014-024-09458-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/30/2023] [Accepted: 02/06/2024] [Indexed: 02/26/2024] Open
Abstract
BACKGROUND Hypertension is an important public health priority with a high prevalence in Africa. It is also an independent risk factor for kidney outcomes. We aimed to identify potential proteins and pathways involved in hypertension-associated albuminuria by assessing urinary proteomic profiles in black South African participants with combined hypertension and albuminuria compared to those who have neither condition.
METHODS The study included 24 South African cases with both hypertension and albuminuria and 49 control participants who had neither condition. Protein was extracted from urine samples and analysed using ultra-high-performance liquid chromatography coupled with mass spectrometry. Data were generated using data-independent acquisition (DIA) and processed using Spectronaut™ 15. Statistical and functional data annotation were performed on Perseus and Cytoscape to identify and annotate differentially abundant proteins. Machine learning was applied to the dataset using the OmicLearn platform.
RESULTS Overall, a mean of 1,225 and 915 proteins were quantified in the control and case groups, respectively. Three hundred and thirty-two differentially abundant proteins were constructed into a network. Pathways associated with these differentially abundant proteins included the immune system (q-value [false discovery rate] = 1.4 × 10⁻⁴⁵), innate immune system (q = 1.1 × 10⁻³²), extracellular matrix (ECM) organisation (q = 0.03) and activation of matrix metalloproteinases (q = 0.04). Proteins with high disease scores (76-100% confidence) for both hypertension and chronic kidney disease included angiotensinogen (AGT), albumin (ALB), apolipoprotein L1 (APOL1), and uromodulin (UMOD). A machine learning approach was able to identify a set of 20 proteins, differentiating between cases and controls.
CONCLUSIONS The urinary proteomic data combined with the machine learning approach was able to classify disease status and identify proteins and pathways associated with hypertension-associated albuminuria.
Affiliation(s)
- Melanie A Govender
- Division of Human Genetics, National Health Laboratory Service and School of Pathology, Faculty of Health Sciences, University of the Witwatersrand, Johannesburg, South Africa.
- Sydney Brenner Institute for Molecular Bioscience, Faculty of Health Sciences, University of the Witwatersrand, Johannesburg, South Africa.
| | - Stoyan H Stoychev
- Council for Scientific and Industrial Research, NextGen Health, Pretoria, South Africa
- ReSyn Biosciences, Edenvale, South Africa
| | - Jean-Tristan Brandenburg
- Sydney Brenner Institute for Molecular Bioscience, Faculty of Health Sciences, University of the Witwatersrand, Johannesburg, South Africa
- Strengthening Oncology Services, Faculty of Health Sciences, University of the Witwatersrand, Johannesburg, South Africa
| | - Michèle Ramsay
- Division of Human Genetics, National Health Laboratory Service and School of Pathology, Faculty of Health Sciences, University of the Witwatersrand, Johannesburg, South Africa
- Sydney Brenner Institute for Molecular Bioscience, Faculty of Health Sciences, University of the Witwatersrand, Johannesburg, South Africa
| | - June Fabian
- Wits Donald Gordon Medical Centre, School of Clinical Medicine, Faculty of Health Sciences, University of the Witwatersrand, Johannesburg, South Africa
- Medical Research Council/Wits University Rural Public Health and Health Transitions Research Unit (Agincourt), School of Public Health, Faculty of Health Sciences, University of the Witwatersrand, Johannesburg, South Africa
| | - Ireshyn S Govender
- Council for Scientific and Industrial Research, NextGen Health, Pretoria, South Africa.
- ReSyn Biosciences, Edenvale, South Africa.
|
48
|
Li Y, He Q, Guo H, Shuai SC, Cheng J, Liu L, Shuai J. AttnPep: A Self-Attention-Based Deep Learning Method for Peptide Identification in Shotgun Proteomics. J Proteome Res 2024; 23:834-843. [PMID: 38252705 DOI: 10.1021/acs.jproteome.3c00729] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/24/2024]
Abstract
In shotgun proteomics, a search engine analyzes the experimentally acquired mass spectra and reports a peptide-spectrum match (PSM) for each spectrum. However, most of the PSMs identified are incorrect, so various postprocessing tools have been developed to rerank the peptide identifications. Yet these methods suffer from issues such as dependency on distribution, reliance on shallow models, and limited effectiveness. In this work, we propose AttnPep, a deep learning model for rescoring PSMs that utilizes a self-attention module. This module helps the neural network focus on features relevant to the classification of PSMs and ignore irrelevant ones, which allows AttnPep to analyze the output of different search engines and improve PSM discrimination accuracy. We considered a PSM to be correct if it achieved a q-value < 0.01 and compared AttnPep with the existing mainstream software PeptideProphet, Percolator, and proteoTorch. AttnPep identified, on average, 9.29% more correct PSMs than the other methods. Additionally, AttnPep better distinguished correct from incorrect PSMs and found more synthetic peptides in the complex SWATH data set.
Affiliation(s)
- Yulin Li
- Department of Physics, Xiamen University, Xiamen 361005, China
| | - Qingzu He
- Department of Physics, Xiamen University, Xiamen 361005, China
- Wenzhou Key Laboratory of Biophysics, Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou, Zhejiang 325001, China
| | - Huan Guo
- Department of Physics, Xiamen University, Xiamen 361005, China
| | - Stella C Shuai
- Biological Science, Northwestern University, Evanston, Illinois 60208, United States
| | - Jinyan Cheng
- Wenzhou Key Laboratory of Biophysics, Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou, Zhejiang 325001, China
| | - Liyu Liu
- Wenzhou Key Laboratory of Biophysics, Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou, Zhejiang 325001, China
| | - Jianwei Shuai
- Department of Physics, Xiamen University, Xiamen 361005, China
- Wenzhou Key Laboratory of Biophysics, Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou, Zhejiang 325001, China
|
49
|
Osipov A, Nikolic O, Gertych A, Parker S, Hendifar A, Singh P, Filippova D, Dagliyan G, Ferrone CR, Zheng L, Moore JH, Tourtellotte W, Van Eyk JE, Theodorescu D. The Molecular Twin artificial-intelligence platform integrates multi-omic data to predict outcomes for pancreatic adenocarcinoma patients. Nat Cancer 2024; 5:299-314. [PMID: 38253803 PMCID: PMC10899109 DOI: 10.1038/s43018-023-00697-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/31/2022] [Accepted: 11/30/2023] [Indexed: 01/24/2024]
Abstract
Contemporary analyses focused on a limited number of clinical and molecular biomarkers have been unable to accurately predict clinical outcomes in pancreatic ductal adenocarcinoma. Here we describe a precision medicine platform known as the Molecular Twin, consisting of advanced machine-learning models, and use it to analyze a dataset of 6,363 clinical and multi-omic molecular features from patients with resected pancreatic ductal adenocarcinoma to accurately predict disease survival (DS). We show that a full multi-omic model predicts DS with the highest accuracy and that plasma protein is the top single-omic predictor of DS. A parsimonious model learning only 589 multi-omic features demonstrated predictive performance similar to that of the full multi-omic model. Our platform enables discovery of parsimonious biomarker panels and performance assessment of outcome prediction models learning from resource-intensive panels. This approach has considerable potential to impact clinical care and democratize precision cancer medicine worldwide.
Affiliation(s)
- Arsen Osipov
- Department of Medicine (Medical Oncology), Cedars-Sinai Medical Center, Los Angeles, CA, USA
- Samuel Oschin Comprehensive Cancer Institute, Cedars-Sinai Medical Center, Los Angeles, CA, USA
- Department of Oncology, Pancreatic Cancer Precision Medicine Center of Excellence, Johns Hopkins University, Baltimore, MD, USA
| | | | - Arkadiusz Gertych
- Samuel Oschin Comprehensive Cancer Institute, Cedars-Sinai Medical Center, Los Angeles, CA, USA
- Department of Pathology and Laboratory Medicine, Cedars-Sinai Medical Center, Los Angeles, CA, USA
- Department of Surgery, Cedars-Sinai Medical Center, Los Angeles, CA, USA
| | - Sarah Parker
- Samuel Oschin Comprehensive Cancer Institute, Cedars-Sinai Medical Center, Los Angeles, CA, USA
- Department of Biomedical Sciences and Smidt Heart Institute, Cedars-Sinai Medical Center, Los Angeles, CA, USA
| | - Andrew Hendifar
- Department of Medicine (Medical Oncology), Cedars-Sinai Medical Center, Los Angeles, CA, USA
- Samuel Oschin Comprehensive Cancer Institute, Cedars-Sinai Medical Center, Los Angeles, CA, USA
| | | | | | - Grant Dagliyan
- Samuel Oschin Comprehensive Cancer Institute, Cedars-Sinai Medical Center, Los Angeles, CA, USA
| | - Cristina R Ferrone
- Samuel Oschin Comprehensive Cancer Institute, Cedars-Sinai Medical Center, Los Angeles, CA, USA
- Department of Surgery, Cedars-Sinai Medical Center, Los Angeles, CA, USA
| | - Lei Zheng
- Department of Oncology, Pancreatic Cancer Precision Medicine Center of Excellence, Johns Hopkins University, Baltimore, MD, USA
| | - Jason H Moore
- Samuel Oschin Comprehensive Cancer Institute, Cedars-Sinai Medical Center, Los Angeles, CA, USA
- Department of Computational Biomedicine, Cedars-Sinai Medical Center, Los Angeles, CA, USA
| | - Warren Tourtellotte
- Samuel Oschin Comprehensive Cancer Institute, Cedars-Sinai Medical Center, Los Angeles, CA, USA
- Department of Pathology and Laboratory Medicine, Cedars-Sinai Medical Center, Los Angeles, CA, USA
| | - Jennifer E Van Eyk
- Samuel Oschin Comprehensive Cancer Institute, Cedars-Sinai Medical Center, Los Angeles, CA, USA
- Department of Pathology and Laboratory Medicine, Cedars-Sinai Medical Center, Los Angeles, CA, USA
- Department of Biomedical Sciences and Smidt Heart Institute, Cedars-Sinai Medical Center, Los Angeles, CA, USA
| | - Dan Theodorescu
- Samuel Oschin Comprehensive Cancer Institute, Cedars-Sinai Medical Center, Los Angeles, CA, USA.
- Department of Pathology and Laboratory Medicine, Cedars-Sinai Medical Center, Los Angeles, CA, USA.
- Department of Urology, Cedars-Sinai Medical Center, Los Angeles, CA, USA.
|
50
|
Lou R, Shui W. Acquisition and Analysis of DIA-Based Proteomic Data: A Comprehensive Survey in 2023. Mol Cell Proteomics 2024; 23:100712. [PMID: 38182042 PMCID: PMC10847697 DOI: 10.1016/j.mcpro.2024.100712] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2023] [Revised: 12/27/2023] [Accepted: 01/02/2024] [Indexed: 01/07/2024] Open
Abstract
Data-independent acquisition (DIA) mass spectrometry (MS) has emerged as a powerful technology for high-throughput, accurate, and reproducible quantitative proteomics. This review provides a comprehensive overview of recent advances in both the experimental and computational methods for DIA proteomics, from data acquisition schemes to analysis strategies and software tools. DIA acquisition schemes are categorized based on the design of precursor isolation windows, highlighting wide-window, overlapping-window, narrow-window, scanning quadrupole-based, and parallel accumulation-serial fragmentation-enhanced DIA methods. For DIA data analysis, major strategies are classified into spectrum reconstruction, sequence-based search, library-based search, de novo sequencing, and sequencing-independent approaches. A wide array of software tools implementing these strategies are reviewed, with details on their overall workflows and scoring approaches at different steps. The generation and optimization of spectral libraries, which are critical resources for DIA analysis, are also discussed. Publicly available benchmark datasets covering global proteomics and phosphoproteomics are summarized to facilitate performance evaluation of various software tools and analysis workflows. Continued advances and synergistic developments of versatile components in DIA workflows are expected to further enhance the power of DIA-based proteomics.
Affiliation(s)
- Ronghui Lou
- iHuman Institute, ShanghaiTech University, Shanghai, China; School of Life Science and Technology, ShanghaiTech University, Shanghai, China.
| | - Wenqing Shui
- iHuman Institute, ShanghaiTech University, Shanghai, China; School of Life Science and Technology, ShanghaiTech University, Shanghai, China.
|