1
|
Dens C, Adams C, Laukens K, Bittremieux W. Machine Learning Strategies to Tackle Data Challenges in Mass Spectrometry-Based Proteomics. J Am Soc Mass Spectrom 2024. [PMID: 39074335 DOI: 10.1021/jasms.4c00180] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/31/2024]
Abstract
In computational proteomics, machine learning (ML) has emerged as a vital tool for enhancing data analysis. Despite significant advancements, the diversity of ML model architectures and the complexity of proteomics data present substantial challenges in the effective development and evaluation of these tools. Here, we highlight the necessity for high-quality, comprehensive data sets to train ML models and advocate for the standardization of data to support robust model development. We emphasize the instrumental role of key data sets like ProteomeTools and MassIVE-KB in advancing ML applications in proteomics and discuss the implications of data set size on model performance, highlighting that larger data sets typically yield more accurate models. To address data scarcity, we explore algorithmic strategies such as self-supervised pretraining and multitask learning. Ultimately, we hope that this discussion can serve as a call to action for the proteomics community to collaborate on data standardization and collection efforts, which are crucial for the sustainable advancement and refinement of ML methodologies in the field.
Affiliation(s)
- Ceder Dens
- Adrem Data Lab, Department of Computer Science, University of Antwerp, Middelheimlaan 1, 2020 Antwerpen, Belgium
| | - Charlotte Adams
- Adrem Data Lab, Department of Computer Science, University of Antwerp, Middelheimlaan 1, 2020 Antwerpen, Belgium
| | - Kris Laukens
- Adrem Data Lab, Department of Computer Science, University of Antwerp, Middelheimlaan 1, 2020 Antwerpen, Belgium
| | - Wout Bittremieux
- Adrem Data Lab, Department of Computer Science, University of Antwerp, Middelheimlaan 1, 2020 Antwerpen, Belgium
| |
|
2
|
Shi J, Liu Y, Xu YJ. MS based foodomics: An edge tool integrated metabolomics and proteomics for food science. Food Chem 2024; 446:138852. [PMID: 38428078 DOI: 10.1016/j.foodchem.2024.138852] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2023] [Revised: 02/05/2024] [Accepted: 02/24/2024] [Indexed: 03/03/2024]
Abstract
Foodomics has become a popular methodology in food science studies. Mass spectrometry (MS)-based metabolomics and proteomics analyses play indispensable roles in foodomics research. So far, several methodologies have been developed to detect the metabolites and proteins in diets and consumers, covering sample preparation, MS data acquisition, annotation, and interpretation. Moreover, multiomics analyses that integrate metabolomics and proteomics have received considerable attention in the field of food safety and nutrition because they provide a more comprehensive and deeper view. In this context, we review the emerging strategies and their applications in MS-based foodomics, as well as future challenges and trends. The principles and applications of multiomics are also discussed, such as the optimization of data acquisition, the development of analysis algorithms, and the exploration of systems biology.
Affiliation(s)
- Jiachen Shi
- State Key Laboratory of Food Science and Technology, School of Food Science and Technology, National Engineering Research Center for Functional Food, National Engineering Laboratory for Cereal Fermentation Technology, Collaborative Innovation Center of Food Safety and Quality Control in Jiangsu Province, Jiangnan University, 1800 Lihu Road, Wuxi 214122, Jiangsu, People's Republic of China.
| | - Yuanfa Liu
- State Key Laboratory of Food Science and Technology, School of Food Science and Technology, National Engineering Research Center for Functional Food, National Engineering Laboratory for Cereal Fermentation Technology, Collaborative Innovation Center of Food Safety and Quality Control in Jiangsu Province, Jiangnan University, 1800 Lihu Road, Wuxi 214122, Jiangsu, People's Republic of China.
| | - Yong-Jiang Xu
- State Key Laboratory of Food Science and Technology, School of Food Science and Technology, National Engineering Research Center for Functional Food, National Engineering Laboratory for Cereal Fermentation Technology, Collaborative Innovation Center of Food Safety and Quality Control in Jiangsu Province, Jiangnan University, 1800 Lihu Road, Wuxi 214122, Jiangsu, People's Republic of China.
| |
|
3
|
Wettstein R, Hugener J, Gillet L, Hernández-Armenta Y, Henggeler A, Xu J, van Gerwen J, Wollweber F, Arter M, Aebersold R, Beltrao P, Pilhofer M, Matos J. Waves of regulated protein expression and phosphorylation rewire the proteome to drive gametogenesis in budding yeast. Dev Cell 2024; 59:1764-1782.e8. [PMID: 38906138 DOI: 10.1016/j.devcel.2024.05.025] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/24/2023] [Revised: 02/25/2024] [Accepted: 05/20/2024] [Indexed: 06/23/2024]
Abstract
Sexually reproducing eukaryotes employ a developmentally regulated cell division program, meiosis, to generate haploid gametes from diploid germ cells. To understand how gametes arise, we generated a proteomic census encompassing the entire meiotic program of budding yeast. We found that concerted waves of protein expression and phosphorylation modify nearly all cellular pathways to support meiotic entry, meiotic progression, and gamete morphogenesis. Leveraging this comprehensive resource, we pinpointed dynamic changes in mitochondrial components and showed that phosphorylation of the FoF1-ATP synthase complex is required for efficient gametogenesis. Furthermore, using cryoET as an orthogonal approach to visualize mitochondria, we uncovered highly ordered filament arrays of Ald4ALDH2, a conserved aldehyde dehydrogenase that is highly expressed and phosphorylated during meiosis. Notably, phosphorylation-resistant mutants failed to accumulate filaments, suggesting that phosphorylation regulates context-specific Ald4ALDH2 polymerization. Overall, this proteomic census constitutes a broad resource to guide the exploration of the unique sequence of events underpinning gametogenesis.
Affiliation(s)
- Rahel Wettstein
- Max Perutz Laboratories, University of Vienna, 1030 Vienna, Austria; Institute of Biochemistry, ETH Zürich, 8093 Zürich, Switzerland
| | - Jannik Hugener
- Max Perutz Laboratories, University of Vienna, 1030 Vienna, Austria; Institute of Biochemistry, ETH Zürich, 8093 Zürich, Switzerland; Institute of Molecular Biology and Biophysics, ETH Zürich, 8093 Zürich, Switzerland
| | - Ludovic Gillet
- Institute of Molecular Systems Biology, ETH Zürich, 8093 Zürich, Switzerland
| | - Yi Hernández-Armenta
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Cambridge, UK
| | - Adrian Henggeler
- Max Perutz Laboratories, University of Vienna, 1030 Vienna, Austria; Institute of Biochemistry, ETH Zürich, 8093 Zürich, Switzerland
| | - Jingwei Xu
- Institute of Molecular Biology and Biophysics, ETH Zürich, 8093 Zürich, Switzerland
| | - Julian van Gerwen
- Institute of Molecular Systems Biology, ETH Zürich, 8093 Zürich, Switzerland
| | - Florian Wollweber
- Institute of Molecular Biology and Biophysics, ETH Zürich, 8093 Zürich, Switzerland
| | - Meret Arter
- Institute of Biochemistry, ETH Zürich, 8093 Zürich, Switzerland
| | - Ruedi Aebersold
- Institute of Molecular Systems Biology, ETH Zürich, 8093 Zürich, Switzerland
| | - Pedro Beltrao
- Institute of Molecular Systems Biology, ETH Zürich, 8093 Zürich, Switzerland; European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Cambridge, UK.
| | - Martin Pilhofer
- Institute of Molecular Biology and Biophysics, ETH Zürich, 8093 Zürich, Switzerland.
| | - Joao Matos
- Max Perutz Laboratories, University of Vienna, 1030 Vienna, Austria; Institute of Biochemistry, ETH Zürich, 8093 Zürich, Switzerland.
| |
|
4
|
Karpov OA, Stotland A, Raedschelders K, Chazarin B, Ai L, Murray CI, Van Eyk JE. Proteomics of the heart. Physiol Rev 2024; 104:931-982. [PMID: 38300522 DOI: 10.1152/physrev.00026.2023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/03/2023] [Revised: 12/25/2023] [Accepted: 01/14/2024] [Indexed: 02/02/2024] Open
Abstract
Mass spectrometry-based proteomics is a sophisticated identification tool specializing in portraying protein dynamics at a molecular level. Proteomics provides biologists with a snapshot of context-dependent protein and proteoform expression, structural conformations, dynamic turnover, and protein-protein interactions. Cardiac proteomics can offer a broader and deeper understanding of the molecular mechanisms that underscore cardiovascular disease, and it is foundational to the development of future therapeutic interventions. This review encapsulates the evolution, current technologies, and future perspectives of proteomic-based mass spectrometry as it applies to the study of the heart. Key technological advancements have allowed researchers to study proteomes at a single-cell level and employ robot-assisted automation systems for enhanced sample preparation techniques, and the increase in fidelity of the mass spectrometers has allowed for the unambiguous identification of numerous dynamic posttranslational modifications. Animal models of cardiovascular disease, ranging from early animal experiments to current sophisticated models of heart failure with preserved ejection fraction, have provided the tools to study a challenging organ in the laboratory. Further technological development will pave the way for the implementation of proteomics even closer within the clinical setting, allowing not only scientists but also patients to benefit from an understanding of protein interplay as it relates to cardiac disease physiology.
Affiliation(s)
- Oleg A Karpov
- Smidt Heart Institute, Advanced Clinical Biosystems Research Institute, Cedars-Sinai Medical Center, Los Angeles, California, United States
| | - Aleksandr Stotland
- Smidt Heart Institute, Advanced Clinical Biosystems Research Institute, Cedars-Sinai Medical Center, Los Angeles, California, United States
| | - Koen Raedschelders
- Smidt Heart Institute, Advanced Clinical Biosystems Research Institute, Cedars-Sinai Medical Center, Los Angeles, California, United States
| | - Blandine Chazarin
- Smidt Heart Institute, Advanced Clinical Biosystems Research Institute, Cedars-Sinai Medical Center, Los Angeles, California, United States
| | - Lizhuo Ai
- Smidt Heart Institute, Advanced Clinical Biosystems Research Institute, Cedars-Sinai Medical Center, Los Angeles, California, United States
| | - Christopher I Murray
- Smidt Heart Institute, Advanced Clinical Biosystems Research Institute, Cedars-Sinai Medical Center, Los Angeles, California, United States
| | - Jennifer E Van Eyk
- Smidt Heart Institute, Advanced Clinical Biosystems Research Institute, Cedars-Sinai Medical Center, Los Angeles, California, United States
| |
|
5
|
Zhang Y, Hu C, Wu X, Song J. Calib-RT: an open source python package for peptide retention time calibration in DIA mass spectrometry data. Bioinformatics 2024; 40:btae417. [PMID: 38960865 PMCID: PMC11223842 DOI: 10.1093/bioinformatics/btae417] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/27/2023] [Revised: 05/27/2024] [Accepted: 07/02/2024] [Indexed: 07/05/2024] Open
Abstract
MOTIVATION The data independent acquisition (DIA) mass spectrometry (MS) method is increasingly popular in the field of proteomics. However, the loss of the correspondence between peptide ions and their spectra in DIA makes identification challenging. One effective approach to reduce false positive identifications is to calculate the deviation between the peptide's estimated retention time (RT) and measured RT. During this process, scaling the spectral library RT into the estimated RT, known as RT calibration, is a prerequisite for calculating the deviation. Currently, within the DIA algorithm ecosystem, there is a lack of engine-independent and readily usable RT calibration toolkits. RESULTS In this work, we introduce Calib-RT, an RT calibration method tailored to the characteristics of RT data. This method can achieve nonlinear calibration across various data scales and tolerate a certain level of noise interference. Calib-RT is expected to enrich the open source DIA algorithm toolchain and assist in the development of DIA identification algorithms. AVAILABILITY AND IMPLEMENTATION Calib-RT is released as open source software under the MIT license and can be installed from PyPI as a Python module. The source code is available on GitHub at https://github.com/chenghui03/Calib_RT.
Affiliation(s)
- Yichi Zhang
- Pasteurien College, Suzhou Medical College of Soochow University, Soochow University, Suzhou 215000, China
| | - Chenghui Hu
- Pasteurien College, Suzhou Medical College of Soochow University, Soochow University, Suzhou 215000, China
| | - Xiaohui Wu
- Pasteurien College, Suzhou Medical College of Soochow University, Soochow University, Suzhou 215000, China
| | - Jian Song
- Pasteurien College, Suzhou Medical College of Soochow University, Soochow University, Suzhou 215000, China
| |
|
6
|
Ling CW, Deng K, Yang Y, Lin HR, Liu CY, Li BY, Hu W, Liang X, Zhao H, Tang XY, Zheng JS, Chen YM. Mapping the gut microecological multi-omics signatures to serum metabolome and their impact on cardiometabolic health in elderly adults. EBioMedicine 2024; 105:105209. [PMID: 38908099 PMCID: PMC11253218 DOI: 10.1016/j.ebiom.2024.105209] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2023] [Revised: 05/04/2024] [Accepted: 06/04/2024] [Indexed: 06/24/2024] Open
Abstract
BACKGROUND Mapping gut microecological features to serum metabolites (SMs) will help identify functional links between gut microbiome and cardiometabolic health. METHODS This study encompassed 836-1021 adults followed over 9.7 years in a cohort, assessing metabolic syndrome (MS), carotid atherosclerotic plaque (CAP), and other metadata triennially. We analyzed mid-term microbial metagenomics, targeted fecal and serum metabolomics, host genetics, and serum proteomics. FINDINGS Gut microbiota and metabolites (GMM) accounted for 15.1% of the overall variance in 168 SMs, with individual GMM factors explaining 5.65%-10.1%, host genetics 3.23%, and sociodemographic factors 5.95%. Specifically, GMM explained 5.5%-49.6% of the variance in the top 32 GMM-explained SMs. Each 20% increase in the 32 metabolite score (derived from the 32 SMs) correlated with 73% (95% confidence interval [CI]: 53%-95%) and 19% (95% CI: 11%-27%) increases in MS and CAP incidences, respectively. Among the 32 GMM-explained SMs, sebacic acid, indoleacetic acid, and eicosapentaenoic acid were linked to MS or CAP incidence. Serum proteomics revealed that certain proteins, particularly the apolipoprotein family, mediated the relationship between GMM-SMs and cardiometabolic risks. INTERPRETATION This study reveals the significant influence of GMM on SM profiles and illustrates the intricate connections between GMM-explained SMs, serum proteins, and the incidence of MS and CAP, providing insights into the roles of gut dysbiosis in cardiometabolic health via regulating blood metabolites. FUNDING This study was jointly supported by the National Natural Science Foundation of China, Key Research and Development Program of Guangzhou, 5010 Program for Clinical Research of Sun Yat-sen University, and the 'Pioneer' and 'Leading goose' R&D Program of Zhejiang.
Affiliation(s)
- Chu-Wen Ling
- Department of Epidemiology, Guangdong Provincial Key Laboratory of Food, Nutrition and Health, School of Public Health, Sun Yat-sen University, Guangzhou, 510080, China; Department of Clinical Nutrition, Guangdong Provincial People's Hospital (Guangdong Academy of Medical Sciences), Southern Medical University, Guangzhou, 510080, China
| | - Kui Deng
- Department of Epidemiology, Guangdong Provincial Key Laboratory of Food, Nutrition and Health, School of Public Health, Sun Yat-sen University, Guangzhou, 510080, China; Zhejiang Key Laboratory of Multi-Omics in Infection and Immunity, Center for Infectious Disease Research, School of Medicine, Westlake University, Hangzhou, 310030, China
| | - Yingdi Yang
- Department of Epidemiology, Guangdong Provincial Key Laboratory of Food, Nutrition and Health, School of Public Health, Sun Yat-sen University, Guangzhou, 510080, China
| | - Hong-Rou Lin
- Department of Epidemiology, Guangdong Provincial Key Laboratory of Food, Nutrition and Health, School of Public Health, Sun Yat-sen University, Guangzhou, 510080, China
| | - Chun-Ying Liu
- Department of Epidemiology, Guangdong Provincial Key Laboratory of Food, Nutrition and Health, School of Public Health, Sun Yat-sen University, Guangzhou, 510080, China
| | - Bang-Yan Li
- Department of Epidemiology, Guangdong Provincial Key Laboratory of Food, Nutrition and Health, School of Public Health, Sun Yat-sen University, Guangzhou, 510080, China
| | - Wei Hu
- Department of Epidemiology, Guangdong Provincial Key Laboratory of Food, Nutrition and Health, School of Public Health, Sun Yat-sen University, Guangzhou, 510080, China
| | - Xinxiu Liang
- Zhejiang Key Laboratory of Multi-Omics in Infection and Immunity, Center for Infectious Disease Research, School of Medicine, Westlake University, Hangzhou, 310030, China
| | - Hui Zhao
- Zhejiang Key Laboratory of Multi-Omics in Infection and Immunity, Center for Infectious Disease Research, School of Medicine, Westlake University, Hangzhou, 310030, China
| | - Xin-Yi Tang
- Department of Pediatrics, The Third Affiliated Hospital of Sun Yat-sen University, Guangzhou, 510630, China.
| | - Ju-Sheng Zheng
- Zhejiang Key Laboratory of Multi-Omics in Infection and Immunity, Center for Infectious Disease Research, School of Medicine, Westlake University, Hangzhou, 310030, China.
| | - Yu-Ming Chen
- Department of Epidemiology, Guangdong Provincial Key Laboratory of Food, Nutrition and Health, School of Public Health, Sun Yat-sen University, Guangzhou, 510080, China.
| |
|
7
|
Baker C, Bruderer R, Abbott J, Arthur JSC, Brenes AJ. Optimizing Spectronaut Search Parameters to Improve Data Quality with Minimal Proteome Coverage Reductions in DIA Analyses of Heterogeneous Samples. J Proteome Res 2024; 23:1926-1936. [PMID: 38691771 PMCID: PMC11165578 DOI: 10.1021/acs.jproteome.3c00671] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/12/2023] [Revised: 01/18/2024] [Accepted: 04/19/2024] [Indexed: 05/03/2024]
Abstract
Data-independent acquisition has seen breakthroughs that enable comprehensive proteome profiling using short gradients. As the proteome coverage continues to increase, the quality of the data generated becomes much more relevant. Using Spectronaut, we show that the default search parameters can be easily optimized to minimize the occurrence of false positives across different samples. Using an immunological infection model system to demonstrate the impact of adjusting search settings, we analyzed Mus musculus macrophages and compared their proteome to macrophages spiked with Candida albicans. This experimental system enabled the identification of "false positives" as Candida albicans peptides and proteins should not be present in the Mus musculus-only samples. We show that adjusting the search parameters reduced "false positive" identifications by 89% at the peptide and protein level, thereby considerably increasing the quality of the data. We also show that these optimized parameters incurred a moderate cost, only reducing the overall number of "true positive" identifications across each biological replicate by <6.7% at both the peptide and protein level. We believe the value of our updated search parameters extends beyond a two-organism analysis and would be of great value to any DIA experiment analyzing heterogeneous populations of cell types or tissues.
Affiliation(s)
- Christa P. Baker
- Division of Cell Signalling & Immunology, School of Life Sciences, University of Dundee, Dundee DD1 5EH, United Kingdom
| | | | - James Abbott
- Data Analysis Group, Division of Computational Biology, School of Life Sciences, University of Dundee, Dundee DD1 5EH, United Kingdom
| | - J. Simon C. Arthur
- Division of Cell Signalling & Immunology, School of Life Sciences, University of Dundee, Dundee DD1 5EH, United Kingdom
| | - Alejandro J. Brenes
- Division of Cell Signalling & Immunology, School of Life Sciences, University of Dundee, Dundee DD1 5EH, United Kingdom
| |
|
8
|
Lewis JM, Jebeli L, Coulon PML, Lay CE, Scott NE. Glycoproteomic and proteomic analysis of Burkholderia cenocepacia reveals glycosylation events within FliF and MotB are dispensable for motility. Microbiol Spectr 2024; 12:e0034624. [PMID: 38709084 DOI: 10.1128/spectrum.00346-24] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/10/2024] [Accepted: 04/16/2024] [Indexed: 05/07/2024] Open
Abstract
Across the Burkholderia genus, O-linked protein glycosylation is highly conserved. While the inhibition of glycosylation has been shown to be detrimental for virulence in Burkholderia cepacia complex species, such as Burkholderia cenocepacia, little is known about how specific glycosylation sites impact protein functionality. Within this study, we sought to improve our understanding of the breadth, dynamics, and requirement for glycosylation across the B. cenocepacia O-glycoproteome. Assessing the B. cenocepacia glycoproteome across different culture media using complementary glycoproteomic approaches, we increase the known glycoproteome to 141 glycoproteins. Leveraging this repertoire of glycoproteins, we quantitatively assessed the glycoproteome of B. cenocepacia using Data-Independent Acquisition (DIA), revealing that the B. cenocepacia glycoproteome is largely stable across conditions, with most glycoproteins constitutively expressed. Examination of how the absence of glycosylation impacts the glycoproteome reveals that the protein abundance of only five glycoproteins (BCAL1086, BCAL2974, BCAL0525, BCAM0505, and BCAL0127) is altered by the loss of glycosylation. Assessing ΔfliF (ΔBCAL0525), ΔmotB (ΔBCAL0127), and ΔBCAM0505 strains, we demonstrate that the loss of FliF, and to a lesser extent MotB, mirrors the proteomic effects observed in the absence of glycosylation in ΔpglL. While both MotB and FliF are essential for motility, we find that loss of glycosylation sites in MotB or FliF does not impact motility, supporting that these sites are dispensable for function. Combined, this work broadens our understanding of the B. cenocepacia glycoproteome, supporting that the loss of glycoproteins in the absence of glycosylation is not an indicator of the requirement for glycosylation for protein function. IMPORTANCE Burkholderia cenocepacia is an opportunistic pathogen of concern within the Cystic Fibrosis community. Despite a greater appreciation of the unique physiology of B. cenocepacia gained over the last 20 years, a complete understanding of the proteome, and especially the O-glycoproteome, is lacking. In this study, we utilize systems biology approaches to expand the known B. cenocepacia glycoproteome as well as track the dynamics of glycoproteins across growth phases, culturing media, and in response to the loss of glycosylation. We show that the glycoproteome of B. cenocepacia is largely stable across conditions and that the loss of glycosylation only impacts five glycoproteins, including the motility-associated proteins FliF and MotB. Examination of MotB and FliF shows that, while these proteins are essential for motility, glycosylation is dispensable. Combined, this work supports that B. cenocepacia glycosylation can be dispensable for protein function and may influence protein properties beyond stability.
Affiliation(s)
- Jessica M Lewis
- Department of Microbiology and Immunology, University of Melbourne at the Peter Doherty Institute for Infection and Immunity, Melbourne, Australia
| | - Leila Jebeli
- Department of Microbiology and Immunology, University of Melbourne at the Peter Doherty Institute for Infection and Immunity, Melbourne, Australia
| | - Pauline M L Coulon
- Department of Microbiology and Immunology, University of Melbourne at the Peter Doherty Institute for Infection and Immunity, Melbourne, Australia
| | - Catrina E Lay
- Department of Microbiology and Immunology, University of Melbourne at the Peter Doherty Institute for Infection and Immunity, Melbourne, Australia
| | - Nichollas E Scott
- Department of Microbiology and Immunology, University of Melbourne at the Peter Doherty Institute for Infection and Immunity, Melbourne, Australia
| |
|
9
|
Li K, Teo GC, Yang KL, Yu F, Nesvizhskii AI. diaTracer enables spectrum-centric analysis of diaPASEF proteomics data. bioRxiv 2024:2024.05.25.595875. [PMID: 38854051 PMCID: PMC11160675 DOI: 10.1101/2024.05.25.595875] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2024]
Abstract
Data-independent acquisition (DIA) has become a widely used strategy for peptide and protein quantification in mass spectrometry-based proteomics studies. The integration of ion mobility separation into DIA analysis, such as the diaPASEF technology available on Bruker's timsTOF platform, further improves the quantification accuracy and protein depth achievable using DIA. We introduce diaTracer, a new spectrum-centric computational tool optimized for diaPASEF data. diaTracer performs three-dimensional (m/z, retention time, ion mobility) peak tracing and feature detection to generate precursor-resolved "pseudo-MS/MS" spectra, facilitating direct ("spectral-library free") peptide identification and quantification from diaPASEF data. diaTracer is available as a stand-alone tool and is fully integrated into the widely used FragPipe computational platform. We demonstrate the performance of diaTracer and FragPipe using diaPASEF data from cerebrospinal fluid (CSF) and plasma samples, data from phosphoproteomics and HLA immunopeptidomics experiments, and low-input data from a spatial proteomics study. We also show that diaTracer enables unrestricted identification of post-translational modifications from diaPASEF data using open/mass offset searches.
Affiliation(s)
- Kai Li
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
| | - Guo Ci Teo
- Department of Pathology, University of Michigan, Ann Arbor, MI, USA
| | - Kevin L. Yang
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
| | - Fengchao Yu
- Department of Pathology, University of Michigan, Ann Arbor, MI, USA
| | - Alexey I. Nesvizhskii
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- Department of Pathology, University of Michigan, Ann Arbor, MI, USA
| |
|
10
|
Alfahel L, Gschwendtberger T, Kozareva V, Dumas L, Gibbs R, Kertser A, Baruch K, Zaccai S, Kahn J, Thau-Habermann N, Eggenschwiler R, Sterneckert J, Hermann A, Sundararaman N, Vaibhav V, Van Eyk JE, Rafuse VF, Fraenkel E, Cantz T, Petri S, Israelson A. Targeting low levels of MIF expression as a potential therapeutic strategy for ALS. Cell Rep Med 2024; 5:101546. [PMID: 38703766 PMCID: PMC11148722 DOI: 10.1016/j.xcrm.2024.101546] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2023] [Revised: 11/03/2023] [Accepted: 04/10/2024] [Indexed: 05/06/2024]
Abstract
Mutations in SOD1 cause amyotrophic lateral sclerosis (ALS), a neurodegenerative disease characterized by motor neuron (MN) loss. We previously discovered that macrophage migration inhibitory factor (MIF), whose levels are extremely low in spinal MNs, inhibits mutant SOD1 misfolding and toxicity. In this study, we show that a single peripheral injection of adeno-associated virus (AAV) delivering MIF into adult SOD1G37R mice significantly improves their motor function, delays disease progression, and extends survival. Moreover, MIF treatment reduces neuroinflammation and misfolded SOD1 accumulation, rescues MNs, and corrects dysregulated pathways as observed by proteomics and transcriptomics. Furthermore, we reveal low MIF levels in human induced pluripotent stem cell-derived MNs from familial ALS patients with different genetic mutations, as well as in post mortem tissues of sporadic ALS patients. Our findings indicate that peripheral MIF administration may provide a potential therapeutic mechanism for modulating misfolded SOD1 in vivo and disease outcome in ALS patients.
Affiliation(s)
- Leenor Alfahel
- Department of Physiology and Cell Biology, Faculty of Health Sciences, Ben-Gurion University of the Negev, P.O.B. 653, Beer Sheva 84105, Israel; The School of Brain Sciences and Cognition, Ben-Gurion University of the Negev, P.O.B. 653, Beer Sheva 84105, Israel
| | - Thomas Gschwendtberger
- Department of Neurology, Hannover Medical School, 30625 Hannover, Germany; Center for Systems Neuroscience, Hannover Medical School, 30625 Hannover, Germany
| | - Velina Kozareva
- Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
| | - Laura Dumas
- Department of Medical Neuroscience, Dalhousie University, Halifax, Nova Scotia B3H 1X5, Canada; Brain Repair Centre, Life Sciences Research Institute, Halifax, Nova Scotia B3H 4R2, Canada
| | - Rachel Gibbs
- Department of Medical Neuroscience, Dalhousie University, Halifax, Nova Scotia B3H 1X5, Canada; Brain Repair Centre, Life Sciences Research Institute, Halifax, Nova Scotia B3H 4R2, Canada
| | | | - Kuti Baruch
- ImmunoBrain Checkpoint Ltd., Ness Ziona 7404905, Israel
| | - Shir Zaccai
- Department of Physiology and Cell Biology, Faculty of Health Sciences, Ben-Gurion University of the Negev, P.O.B. 653, Beer Sheva 84105, Israel; The School of Brain Sciences and Cognition, Ben-Gurion University of the Negev, P.O.B. 653, Beer Sheva 84105, Israel
| | - Joy Kahn
- Department of Physiology and Cell Biology, Faculty of Health Sciences, Ben-Gurion University of the Negev, P.O.B. 653, Beer Sheva 84105, Israel; The School of Brain Sciences and Cognition, Ben-Gurion University of the Negev, P.O.B. 653, Beer Sheva 84105, Israel
| | | | - Reto Eggenschwiler
- Gastroenterology, Hepatology and Endocrinology Department, Hannover Medical School, 30625 Hannover, Germany; Translational Hepatology and Stem Cell Biology, REBIRTH - Research Center for Translational Regenerative Medicine and Department of Gastroenterology, Hepatology and Endocrinology, Hannover Medical School, 30625 Hannover, Germany
| | - Jared Sterneckert
- Center for Regenerative Therapies Dresden, Technical University Dresden, 01307 Dresden, Germany
| | - Andreas Hermann
- Translational Neurodegeneration Section, "Albrecht Kossel", Department of Neurology, University Medical Center Rostock, University of Rostock, 18147 Rostock, Germany; Deutsches Zentrum für Neurodegenerative Erkrankungen (DZNE) Rostock/Greifswald, 18147 Rostock, Germany; Center for Transdisciplinary Neurosciences Rostock (CTNR), University Medical Center Rostock, University of Rostock, 18147 Rostock, Germany
| | - Niveda Sundararaman
- Smidt Heart Institute, Cedars-Sinai Medical Center, Los Angeles, CA 90048, USA; Advanced Clinical Biosystems Research Institute, Cedars-Sinai Medical Center, Los Angeles, CA 90048, USA
| | - Vineet Vaibhav
- Smidt Heart Institute, Cedars-Sinai Medical Center, Los Angeles, CA 90048, USA; Advanced Clinical Biosystems Research Institute, Cedars-Sinai Medical Center, Los Angeles, CA 90048, USA
| | - Jennifer E Van Eyk
- Smidt Heart Institute, Cedars-Sinai Medical Center, Los Angeles, CA 90048, USA; Advanced Clinical Biosystems Research Institute, Cedars-Sinai Medical Center, Los Angeles, CA 90048, USA
| | - Victor F Rafuse
- Department of Medical Neuroscience, Dalhousie University, Halifax, Nova Scotia B3H 1X5, Canada; Brain Repair Centre, Life Sciences Research Institute, Halifax, Nova Scotia B3H 4R2, Canada
| | - Ernest Fraenkel
- Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
| | - Tobias Cantz
- Gastroenterology, Hepatology and Endocrinology Department, Hannover Medical School, 30625 Hannover, Germany; Translational Hepatology and Stem Cell Biology, REBIRTH - Research Center for Translational Regenerative Medicine and Department of Gastroenterology, Hepatology and Endocrinology, Hannover Medical School, 30625 Hannover, Germany; Max Planck Institute for Molecular Biomedicine, Cell and Developmental Biology, 48149 Münster, Germany
| | - Susanne Petri
- Department of Neurology, Hannover Medical School, 30625 Hannover, Germany; Center for Systems Neuroscience, Hannover Medical School, 30625 Hannover, Germany
| | - Adrian Israelson
- Department of Physiology and Cell Biology, Faculty of Health Sciences, Ben-Gurion University of the Negev, P.O.B. 653, Beer Sheva 84105, Israel; The School of Brain Sciences and Cognition, Ben-Gurion University of the Negev, P.O.B. 653, Beer Sheva 84105, Israel.
|
11
|
Kohler D, Staniak M, Yu F, Nesvizhskii AI, Vitek O. An MSstats workflow for detecting differentially abundant proteins in large-scale data-independent acquisition mass spectrometry experiments with FragPipe processing. Nat Protoc 2024:10.1038/s41596-024-01000-3. [PMID: 38769142 DOI: 10.1038/s41596-024-01000-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/28/2023] [Accepted: 03/11/2024] [Indexed: 05/22/2024]
Abstract
Technological advances in mass spectrometry and proteomics have made it possible to perform larger-scale and more-complex experiments. The volume and complexity of the resulting data create major challenges for downstream analysis. In particular, next-generation data-independent acquisition (DIA) experiments enable wider proteome coverage than more traditional targeted approaches but require computational workflows that can manage much larger datasets and identify peptide sequences from complex and overlapping spectral features. Data-processing tools such as FragPipe, DIA-NN and Spectronaut have undergone substantial improvements to process spectral features in a reasonable time. Statistical analysis tools are needed to draw meaningful comparisons between experimental samples, but these tools were also originally designed with smaller datasets in mind. This protocol describes an updated version of MSstats that has been adapted to be compatible with large-scale DIA experiments. A very large DIA experiment, processed with FragPipe, is used as an example to demonstrate different MSstats workflows. The choice of workflow depends on the user's computational resources. For datasets that are too large to fit into a standard computer's memory, we demonstrate the use of MSstatsBig, a companion R package to MSstats. The protocol also highlights key decisions that have a major effect on both the results and the processing time of the analysis. The MSstats processing can be expected to take 1-3 h depending on the usage of MSstatsBig. The protocol can be run in the point-and-click graphical user interface MSstatsShiny or implemented with minimal coding expertise in R.
Affiliation(s)
- Devon Kohler
- Khoury College of Computer Sciences, Northeastern University, Boston, MA, USA
- Barnett Institute for Chemical and Biological Analysis, Northeastern University, Boston, MA, USA
| | | | - Fengchao Yu
- Department of Pathology, University of Michigan, Ann Arbor, MI, USA
| | - Alexey I Nesvizhskii
- Department of Pathology, University of Michigan, Ann Arbor, MI, USA
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
| | - Olga Vitek
- Khoury College of Computer Sciences, Northeastern University, Boston, MA, USA.
- Barnett Institute for Chemical and Biological Analysis, Northeastern University, Boston, MA, USA.
|
12
|
Rosenberger G, Li W, Turunen M, He J, Subramaniam PS, Pampou S, Griffin AT, Karan C, Kerwin P, Murray D, Honig B, Liu Y, Califano A. Network-based elucidation of colon cancer drug resistance mechanisms by phosphoproteomic time-series analysis. Nat Commun 2024; 15:3909. [PMID: 38724493 PMCID: PMC11082183 DOI: 10.1038/s41467-024-47957-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/18/2023] [Accepted: 04/16/2024] [Indexed: 05/12/2024] Open
Abstract
Aberrant signaling pathway activity is a hallmark of tumorigenesis and progression, which has guided targeted inhibitor design for over 30 years. Yet, adaptive resistance mechanisms, induced by rapid, context-specific signaling network rewiring, continue to challenge therapeutic efficacy. Leveraging progress in proteomic technologies and network-based methodologies, we introduce Virtual Enrichment-based Signaling Protein-activity Analysis (VESPA), an algorithm designed to elucidate mechanisms of cell response and adaptation to drug perturbations, and use it to analyze 7-point phosphoproteomic time series from colorectal cancer cells treated with clinically relevant inhibitors and control media. Interrogating tumor-specific enzyme/substrate interactions accurately infers kinase and phosphatase activity, based on their substrate phosphorylation state, effectively accounting for signal crosstalk and sparse phosphoproteome coverage. The analysis elucidates time-dependent signaling pathway response to each drug perturbation and, more importantly, cell adaptive response and rewiring, experimentally confirmed by CRISPR knock-out assays, suggesting broad applicability to cancer and other diseases.
Affiliation(s)
- George Rosenberger
- Department of Systems Biology, Columbia University Irving Medical Center, New York, NY, USA
| | - Wenxue Li
- Yale Cancer Biology Institute, Yale University, West Haven, CT, USA
| | - Mikko Turunen
- Department of Systems Biology, Columbia University Irving Medical Center, New York, NY, USA
| | - Jing He
- Department of Systems Biology, Columbia University Irving Medical Center, New York, NY, USA
- Regeneron Genetics Center, Tarrytown, NY, USA
| | - Prem S Subramaniam
- Department of Systems Biology, Columbia University Irving Medical Center, New York, NY, USA
| | - Sergey Pampou
- Department of Systems Biology, Columbia University Irving Medical Center, New York, NY, USA
- J.P. Sulzberger Columbia Genome Center, Columbia University Irving Medical Center, New York, NY, USA
| | - Aaron T Griffin
- Department of Systems Biology, Columbia University Irving Medical Center, New York, NY, USA
- Medical Scientist Training Program, Columbia University Irving Medical Center, New York, NY, USA
| | - Charles Karan
- Department of Systems Biology, Columbia University Irving Medical Center, New York, NY, USA
- J.P. Sulzberger Columbia Genome Center, Columbia University Irving Medical Center, New York, NY, USA
| | - Patrick Kerwin
- Department of Systems Biology, Columbia University Irving Medical Center, New York, NY, USA
| | - Diana Murray
- Department of Systems Biology, Columbia University Irving Medical Center, New York, NY, USA
| | - Barry Honig
- Department of Systems Biology, Columbia University Irving Medical Center, New York, NY, USA
- Department of Medicine, Columbia University Irving Medical Center, New York, NY, USA
- Department of Biochemistry & Molecular Biophysics, Columbia University Irving Medical Center, New York, NY, USA
- Zuckerman Mind Brain and Behavior Institute, Columbia University, New York, NY, USA
- Herbert Irving Comprehensive Cancer Center, Columbia University Irving Medical Center, New York, NY, USA
| | - Yansheng Liu
- Yale Cancer Biology Institute, Yale University, West Haven, CT, USA.
- Department of Pharmacology, Yale University School of Medicine, New Haven, CT, USA.
| | - Andrea Califano
- Department of Systems Biology, Columbia University Irving Medical Center, New York, NY, USA.
- Department of Medicine, Columbia University Irving Medical Center, New York, NY, USA.
- Department of Biochemistry & Molecular Biophysics, Columbia University Irving Medical Center, New York, NY, USA.
- Herbert Irving Comprehensive Cancer Center, Columbia University Irving Medical Center, New York, NY, USA.
- Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, NY, USA.
- Chan Zuckerberg Biohub New York, New York, NY, USA.
|
13
|
Sing JC, Charkow J, AlHigaylan M, Horecka I, Xu L, Röst HL. MassDash: A Web-Based Dashboard for Data-Independent Acquisition Mass Spectrometry Visualization. J Proteome Res 2024. [PMID: 38684072 DOI: 10.1021/acs.jproteome.4c00026] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/02/2024]
Abstract
With the increased usage and diversity of methods and instruments being applied to analyze Data-Independent Acquisition (DIA) data, visualization is becoming increasingly important to validate automated software results. Here we present MassDash, a cross-platform DIA mass spectrometry visualization and validation software for comparing features and results across popular tools. MassDash provides a web-based interface and Python package for interactive feature visualizations and summary report plots across multiple automated DIA feature detection tools, including OpenSwath, DIA-NN, and dreamDIA. Furthermore, MassDash processes peptides on the fly, enabling interactive visualization of peptides across dozens of runs simultaneously on a personal computer. MassDash supports various multidimensional visualizations across retention time, ion mobility, m/z, and intensity, providing additional insights into the data. The modular framework is easily extendable, enabling rapid algorithm development of novel peak-picker techniques, such as deep-learning-based approaches and refinement of existing tools. MassDash is open-source under a BSD 3-Clause license and freely available at https://github.com/Roestlab/massdash, and a demo version can be accessed at https://massdash.streamlit.app.
Affiliation(s)
- Justin C Sing
- Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario M5S 3E1, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario M5G 1A8, Canada
| | - Joshua Charkow
- Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario M5S 3E1, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario M5G 1A8, Canada
| | - Mohammed AlHigaylan
- Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario M5S 3E1, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario M5G 1A8, Canada
| | - Ira Horecka
- Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario M5S 3E1, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario M5G 1A8, Canada
| | - Leon Xu
- Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario M5S 3E1, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario M5G 1A8, Canada
| | - Hannes L Röst
- Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario M5S 3E1, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario M5G 1A8, Canada
- Department of Computer Science, University of Toronto, Toronto, Ontario M5G 1A8, Canada
|
14
|
Li T, Liu Y, Zhu H, Cao L, Zhou Y, Liu D, Shen Q. Cellular ATP redistribution achieved by deleting Tgparp improves lignocellulose utilization of Trichoderma under heat stress. BIOTECHNOLOGY FOR BIOFUELS AND BIOPRODUCTS 2024; 17:54. [PMID: 38637859 PMCID: PMC11027231 DOI: 10.1186/s13068-024-02502-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/14/2023] [Accepted: 04/05/2024] [Indexed: 04/20/2024]
Abstract
BACKGROUND: Thermotolerance is widely acknowledged as a pivotal factor for fungal survival across diverse habitats. Heat stress induces a cascade of disruptions in various life processes, especially in the acquisition of carbon sources, while the mechanisms by which filamentous fungi adapt to heat stress and maintain carbon sources are still not fully understood. RESULTS: Using Trichoderma guizhouense, a representative beneficial microorganism for plants, we discover that heat stress severely inhibits lignocellulase secretion, affecting carbon source utilization efficiency. Proteomic results at different temperatures suggest that proteins involved in the poly ADP-ribosylation pathway (TgPARP and TgADPRase) may play pivotal roles in thermal adaptation and lignocellulose utilization. TgPARP is induced by heat stress, while the deletion of Tgparp significantly improves the lignocellulose utilization capacity and lignocellulase secretion in T. guizhouense. Simultaneously, the absence of Tgparp prevents the excessive depletion of ATP and NAD+, enhances the protective role of mitochondrial membrane potential (MMP), and elevates the expression levels of the unfolded protein response (UPR)-related regulatory factor Tgire. Further investigations reveal that a stable MMP can establish energy homeostasis, allocating more ATP within the endoplasmic reticulum (ER) to reduce protein accumulation in the ER, thereby enhancing lignocellulase secretion in T. guizhouense under heat stress. CONCLUSIONS: Overall, these findings underscore the significance of Tgparp as a pivotal regulator of lignocellulose utilization under heat stress and provide further insights into the molecular mechanism by which filamentous fungi utilize lignocellulose.
Affiliation(s)
- Tuo Li
- Key Lab of Organic-Based Fertilizers of China and Jiangsu Provincial Key Lab for Solid Organic Waste Utilization, Nanjing, China
- College of Resources & Environmental Sciences, Nanjing Agricultural University, Nanjing, 210095, Jiangsu, China
| | - Yang Liu
- Key Lab of Organic-Based Fertilizers of China and Jiangsu Provincial Key Lab for Solid Organic Waste Utilization, Nanjing, China
- College of Resources & Environmental Sciences, Nanjing Agricultural University, Nanjing, 210095, Jiangsu, China
| | - Han Zhu
- Key Lab of Organic-Based Fertilizers of China and Jiangsu Provincial Key Lab for Solid Organic Waste Utilization, Nanjing, China
- College of Resources & Environmental Sciences, Nanjing Agricultural University, Nanjing, 210095, Jiangsu, China
| | - Linhua Cao
- Key Lab of Organic-Based Fertilizers of China and Jiangsu Provincial Key Lab for Solid Organic Waste Utilization, Nanjing, China
- College of Resources & Environmental Sciences, Nanjing Agricultural University, Nanjing, 210095, Jiangsu, China
| | - Yihao Zhou
- Key Lab of Organic-Based Fertilizers of China and Jiangsu Provincial Key Lab for Solid Organic Waste Utilization, Nanjing, China
- College of Resources & Environmental Sciences, Nanjing Agricultural University, Nanjing, 210095, Jiangsu, China
| | - Dongyang Liu
- Key Lab of Organic-Based Fertilizers of China and Jiangsu Provincial Key Lab for Solid Organic Waste Utilization, Nanjing, China.
- College of Resources & Environmental Sciences, Nanjing Agricultural University, Nanjing, 210095, Jiangsu, China.
| | - Qirong Shen
- Key Lab of Organic-Based Fertilizers of China and Jiangsu Provincial Key Lab for Solid Organic Waste Utilization, Nanjing, China
- College of Resources & Environmental Sciences, Nanjing Agricultural University, Nanjing, 210095, Jiangsu, China
|
15
|
Basharat AR, Xiong X, Xu T, Zang Y, Sun L, Liu X. TopDIA: A Software Tool for Top-Down Data-Independent Acquisition Proteomics. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.04.05.588302. [PMID: 38645171 PMCID: PMC11030422 DOI: 10.1101/2024.04.05.588302] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/23/2024]
Abstract
Top-down mass spectrometry is widely used for proteoform identification, characterization, and quantification owing to its ability to analyze intact proteoforms. In the last decade, top-down proteomics has been dominated by top-down data-dependent acquisition mass spectrometry (TD-DDA-MS), and top-down data-independent acquisition mass spectrometry (TD-DIA-MS) has not been well studied. While TD-DIA-MS produces complex multiplexed tandem mass spectrometry (MS/MS) spectra, which are challenging to confidently identify, it selects more precursor ions for MS/MS analysis and has the potential to increase proteoform identifications compared with TD-DDA-MS. Here we present TopDIA, the first software tool for proteoform identification by TD-DIA-MS. It generates demultiplexed pseudo MS/MS spectra from TD-DIA-MS data and then searches the pseudo MS/MS spectra against a protein sequence database for proteoform identification. We compared the performance of TD-DDA-MS and TD-DIA-MS using Escherichia coli K-12 MG1655 cells and demonstrated that TD-DIA-MS with TopDIA increased proteoform and protein identifications compared with TD-DDA-MS.
Affiliation(s)
- Abdul Rehman Basharat
- Department of BioHealth Informatics, Luddy School of Informatics, Computing, and Engineering, Indiana University-Purdue University Indianapolis, Indianapolis, IN, 46202, USA
| | - Xingzhao Xiong
- Deming Department of Medicine, Tulane University School of Medicine, New Orleans, LA, 70112, USA
| | - Tian Xu
- Department of Chemistry, Michigan State University, East Lansing, MI, 48824, USA
| | - Yong Zang
- Department of Biostatistics and Health Data Sciences, Indiana University School of Medicine, Indianapolis, IN, 46202, USA
| | - Liangliang Sun
- Department of Chemistry, Michigan State University, East Lansing, MI, 48824, USA
| | - Xiaowen Liu
- Deming Department of Medicine, Tulane University School of Medicine, New Orleans, LA, 70112, USA
|
16
|
Joyce AW, Searle BC. Computational approaches to identify sites of phosphorylation. Proteomics 2024; 24:e2300088. [PMID: 37897210 DOI: 10.1002/pmic.202300088] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/15/2023] [Revised: 10/07/2023] [Accepted: 10/09/2023] [Indexed: 10/29/2023]
Abstract
Due to their oftentimes ambiguous nature, phosphopeptide positional isomers can present challenges in bottom-up mass spectrometry-based workflows as search engine scores alone are often not enough to confidently distinguish them. Additional scoring algorithms can remedy this by providing confidence metrics in addition to these search results, reducing ambiguity. Here we describe challenges to interpreting phosphoproteomics data and review several different approaches to determine sites of phosphorylation for both data-dependent and data-independent acquisition-based workflows. Finally, we discuss open questions regarding neutral losses, gas-phase rearrangement, and false localization rate estimation experienced by both types of acquisition workflows and best practices for managing ambiguity in phosphosite determination.
Affiliation(s)
- Alex W Joyce
- Department of Biomedical Informatics, The Ohio State University Medical Center, Columbus, Ohio, USA
- Pelotonia Institute for Immuno-Oncology, The Ohio State University Comprehensive Cancer Center, Columbus, Ohio, USA
| | - Brian C Searle
- Department of Biomedical Informatics, The Ohio State University Medical Center, Columbus, Ohio, USA
- Pelotonia Institute for Immuno-Oncology, The Ohio State University Comprehensive Cancer Center, Columbus, Ohio, USA
- Department of Chemistry and Biochemistry, The Ohio State University, Columbus, Ohio, USA
|
17
|
He Q, Guo H, Li Y, He G, Li X, Shuai J. SeFilter-DIA: Squeeze-and-Excitation Network for Filtering High-Confidence Peptides of Data-Independent Acquisition Proteomics. Interdiscip Sci 2024:10.1007/s12539-024-00611-4. [PMID: 38472692 DOI: 10.1007/s12539-024-00611-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/17/2023] [Revised: 01/12/2024] [Accepted: 01/21/2024] [Indexed: 03/14/2024]
Abstract
Mass spectrometry is crucial in proteomics analysis, particularly using Data Independent Acquisition (DIA) for reliable and reproducible mass spectrometry data acquisition, enabling broad mass-to-charge ratio coverage and high throughput. DIA-NN, a prominent deep learning software in DIA proteome analysis, generates peptide results but may include low-confidence peptides. Conventionally, biologists have to manually screen peptide fragment ion chromatogram (XIC) peaks to identify high-confidence peptides, a time-consuming and subjective process prone to variability. In this study, we introduce SeFilter-DIA, a deep learning algorithm that aims to automate the identification of high-confidence peptides. Leveraging squeeze-and-excitation network and residual network models, SeFilter-DIA extracts XIC features and effectively discerns between high- and low-confidence peptides. Evaluation on the benchmark datasets demonstrates that SeFilter-DIA achieves 99.6% AUC on the test set and 97% for other performance indicators. Furthermore, SeFilter-DIA is applicable for screening peptides with phosphorylation modifications. These results demonstrate the potential of SeFilter-DIA to replace manual screening, providing an efficient and objective approach for high-confidence peptide identification while mitigating associated limitations.
Affiliation(s)
- Qingzu He
- Department of Physics, National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, 361005, China
- Wenzhou Key Laboratory of Biophysics, Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou, 325001, China
| | - Huan Guo
- Department of Physics, National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, 361005, China
| | - Yulin Li
- Department of Physics, National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, 361005, China
| | - Guoqiang He
- Wenzhou Key Laboratory of Biophysics, Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou, 325001, China
| | - Xiang Li
- Department of Physics, National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, 361005, China.
| | - Jianwei Shuai
- Wenzhou Key Laboratory of Biophysics, Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou, 325001, China.
- Oujiang Laboratory (Zhejiang Lab for Regenerative Medicine, Vision and Brain Health), Wenzhou, 325001, China.
|
18
|
Hsiao Y, Zhang H, Li GX, Deng Y, Yu F, Kahrood HV, Steele JR, Schittenhelm RB, Nesvizhskii AI. Analysis and visualization of quantitative proteomics data using FragPipe-Analyst. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.03.05.583643. [PMID: 38496650 PMCID: PMC10942459 DOI: 10.1101/2024.03.05.583643] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/19/2024]
Abstract
The FragPipe computational proteomics platform is gaining widespread popularity among the proteomics research community because of its fast processing speed and user-friendly graphical interface. Although FragPipe produces well-formatted output tables that are ready for analysis, there is still a need for an easy-to-use and user-friendly downstream statistical analysis and visualization tool. FragPipe-Analyst addresses this need by providing an R shiny web server to assist FragPipe users in conducting downstream analyses of the resulting quantitative proteomics data. It supports major quantification workflows including label-free quantification, tandem mass tags, and data-independent acquisition. FragPipe-Analyst offers a range of useful functionalities, such as various missing value imputation options, data quality control, unsupervised clustering, differential expression (DE) analysis using Limma, and gene ontology and pathway enrichment analysis using Enrichr. To support advanced analysis and customized visualizations, we also developed FragPipeAnalystR, an R package encompassing all FragPipe-Analyst functionalities that is extended to support site-specific analysis of post-translational modifications (PTMs). FragPipe-Analyst and FragPipeAnalystR are both open-source and freely available.
Affiliation(s)
- Yi Hsiao
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
| | - Haijian Zhang
- Monash Proteomics & Metabolomics Platform, Department of Biochemistry and Molecular Biology, Biomedicine Discovery Institute, Monash University, Clayton, Victoria 3800, Australia
| | - Ginny Xiaohe Li
- Department of Pathology, University of Michigan, Ann Arbor, MI 48109, USA
| | - Yamei Deng
- Department of Pathology, University of Michigan, Ann Arbor, MI 48109, USA
| | - Fengchao Yu
- Department of Pathology, University of Michigan, Ann Arbor, MI 48109, USA
| | - Hossein Valipour Kahrood
- Monash Proteomics & Metabolomics Platform, Department of Biochemistry and Molecular Biology, Biomedicine Discovery Institute, Monash University, Clayton, Victoria 3800, Australia
- Monash Genomics & Bioinformatics Platform, Department of Biochemistry and Molecular Biology, Biomedicine Discovery Institute, Monash University, Clayton, Victoria 3800, Australia
| | - Joel R. Steele
- Monash Proteomics & Metabolomics Platform, Department of Biochemistry and Molecular Biology, Biomedicine Discovery Institute, Monash University, Clayton, Victoria 3800, Australia
| | - Ralf B. Schittenhelm
- Monash Proteomics & Metabolomics Platform, Department of Biochemistry and Molecular Biology, Biomedicine Discovery Institute, Monash University, Clayton, Victoria 3800, Australia
| | - Alexey I. Nesvizhskii
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
- Department of Pathology, University of Michigan, Ann Arbor, MI 48109, USA
|
19
|
Strauss MT, Bludau I, Zeng WF, Voytik E, Ammar C, Schessner JP, Ilango R, Gill M, Meier F, Willems S, Mann M. AlphaPept: a modern and open framework for MS-based proteomics. Nat Commun 2024; 15:2168. [PMID: 38461149 PMCID: PMC10924963 DOI: 10.1038/s41467-024-46485-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/20/2022] [Accepted: 02/20/2024] [Indexed: 03/11/2024] Open
Abstract
In common with other omics technologies, mass spectrometry (MS)-based proteomics produces ever-increasing amounts of raw data, making efficient analysis a principal challenge. A plethora of different computational tools can process the MS data to derive peptide and protein identification and quantification. However, during the last years there has been dramatic progress in computer science, including collaboration tools that have transformed research and industry. To leverage these advances, we develop AlphaPept, a Python-based open-source framework for efficient processing of large high-resolution MS data sets. Numba for just-in-time compilation on CPU and GPU achieves hundred-fold speed improvements. AlphaPept uses the Python scientific stack of highly optimized packages, reducing the code base to domain-specific tasks while accessing the latest advances. We provide an easy on-ramp for community contributions through the concept of literate programming, implemented in Jupyter Notebooks. Large datasets can rapidly be processed as shown by the analysis of hundreds of proteomes in minutes per file, many-fold faster than acquisition. AlphaPept can be used to build automated processing pipelines with web-serving functionality and compatibility with downstream analysis tools. It provides easy access via one-click installation, a modular Python library for advanced users, and via an open GitHub repository for developers.
Affiliation(s)
- Maximilian T Strauss
- Department of Proteomics and Signal Transduction, Max Planck Institute of Biochemistry, Martinsried, Germany.
- NNF Center for Protein Research, Faculty of Health Sciences, University of Copenhagen, Copenhagen, Denmark.
| | - Isabell Bludau
- Department of Proteomics and Signal Transduction, Max Planck Institute of Biochemistry, Martinsried, Germany
| | - Wen-Feng Zeng
- Department of Proteomics and Signal Transduction, Max Planck Institute of Biochemistry, Martinsried, Germany
| | - Eugenia Voytik
- Department of Proteomics and Signal Transduction, Max Planck Institute of Biochemistry, Martinsried, Germany
| | - Constantin Ammar
- Department of Proteomics and Signal Transduction, Max Planck Institute of Biochemistry, Martinsried, Germany
| | - Julia P Schessner
- Department of Proteomics and Signal Transduction, Max Planck Institute of Biochemistry, Martinsried, Germany
| | | | | | - Florian Meier
- Department of Proteomics and Signal Transduction, Max Planck Institute of Biochemistry, Martinsried, Germany
- Functional Proteomics, Jena University Hospital, Jena, Germany
| | - Sander Willems
- Department of Proteomics and Signal Transduction, Max Planck Institute of Biochemistry, Martinsried, Germany
| | - Matthias Mann
- Department of Proteomics and Signal Transduction, Max Planck Institute of Biochemistry, Martinsried, Germany.
- NNF Center for Protein Research, Faculty of Health Sciences, University of Copenhagen, Copenhagen, Denmark.
|
20
|
Pfeuffer J, Bielow C, Wein S, Jeong K, Netz E, Walter A, Alka O, Nilse L, Colaianni PD, McCloskey D, Kim J, Rosenberger G, Bichmann L, Walzer M, Veit J, Boudaud B, Bernt M, Patikas N, Pilz M, Startek MP, Kutuzova S, Heumos L, Charkow J, Sing JC, Feroz A, Siraj A, Weisser H, Dijkstra TMH, Perez-Riverol Y, Röst H, Kohlbacher O, Sachsenberg T. OpenMS 3 enables reproducible analysis of large-scale mass spectrometry data. Nat Methods 2024; 21:365-367. [PMID: 38366242 DOI: 10.1038/s41592-024-02197-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/18/2024]
Affiliation(s)
- Julianus Pfeuffer
- Algorithmic Bioinformatics, Freie Universität Berlin, Berlin, Germany
- Visual and Data-Centric Computing, Zuse Institute Berlin, Berlin, Germany
| | - Chris Bielow
- Bioinformatics Solution Center, Institut für Mathematik und Informatik, Freie Universität Berlin, Berlin, Germany
| | - Samuel Wein
- Applied Bioinformatics, Department of Computer Science, University of Tuebingen, Tübingen, Germany
- Institute for Bioinformatics and Medical Informatics, University of Tuebingen, Tübingen, Germany
| | - Kyowon Jeong
- Applied Bioinformatics, Department of Computer Science, University of Tuebingen, Tübingen, Germany
- Institute for Bioinformatics and Medical Informatics, University of Tuebingen, Tübingen, Germany
| | - Eugen Netz
- Applied Bioinformatics, Department of Computer Science, University of Tuebingen, Tübingen, Germany
- Institute for Bioinformatics and Medical Informatics, University of Tuebingen, Tübingen, Germany
| | - Axel Walter
- Applied Bioinformatics, Department of Computer Science, University of Tuebingen, Tübingen, Germany
- Institute for Bioinformatics and Medical Informatics, University of Tuebingen, Tübingen, Germany
| | - Oliver Alka
- Applied Bioinformatics, Department of Computer Science, University of Tuebingen, Tübingen, Germany
- Institute for Bioinformatics and Medical Informatics, University of Tuebingen, Tübingen, Germany
| | - Lars Nilse
- Institute of Molecular Medicine and Cell Research, University of Freiburg, Freiburg, Germany
| | | | - Douglas McCloskey
- Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Lyngby, Denmark
- BioMed X Institute, Heidelberg, Germany
| | - Jihyung Kim
- Applied Bioinformatics, Department of Computer Science, University of Tuebingen, Tübingen, Germany
- Institute for Bioinformatics and Medical Informatics, University of Tuebingen, Tübingen, Germany
| | | | - Leon Bichmann
- Yale Center for Systems and Engineering Immunology and Department of Immunobiology, Yale University School of Medicine, New Haven, CT, USA
| | - Mathias Walzer
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, UK
| | - Johannes Veit
- Applied Bioinformatics, Department of Computer Science, University of Tuebingen, Tübingen, Germany
- Institute for Bioinformatics and Medical Informatics, University of Tuebingen, Tübingen, Germany
| | - Bertrand Boudaud
- Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Lyngby, Denmark
| | - Matthias Bernt
- Department of Computational Biology, Helmholtz Centre for Environmental Research GmbH-UFZ, Leipzig, Germany
| | - Nikolaos Patikas
- Evergrande Center for Immunologic Diseases Harvard Medical School and Brigham and Women's Hospital, Boston, MA, USA
| | - Matteo Pilz
- Applied Bioinformatics, Department of Computer Science, University of Tuebingen, Tübingen, Germany
- Institute for Bioinformatics and Medical Informatics, University of Tuebingen, Tübingen, Germany
| | - Michał Piotr Startek
- Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Warsaw, Poland
- Institute for Immunology, University Medical Center of the Johannes-Gutenberg University, Mainz, Germany
| | - Svetlana Kutuzova
- Department of Computer Science/Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Copenhagen, Denmark
| | - Lukas Heumos
- Institute of Computational Biology, Department of Computational Health, Helmholtz Munich, Oberschleißheim, Germany
- Institute of Lung Health and Immunity and Comprehensive Pneumology Center with the CPC-M bioArchive, Helmholtz Zentrum Munich, Member of the German Center for Lung Research (DZL), Munich, Germany
- TUM School of Life Sciences Weihenstephan, Technical University of Munich, Freising, Germany
| | - Joshua Charkow
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
- The Donnelly Centre, University of Toronto, Toronto, Ontario, Canada
| | - Justin Cyril Sing
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
- The Donnelly Centre, University of Toronto, Toronto, Ontario, Canada
| | - Ayesha Feroz
- Applied Bioinformatics, Department of Computer Science, University of Tuebingen, Tübingen, Germany
- Institute for Bioinformatics and Medical Informatics, University of Tuebingen, Tübingen, Germany
| | - Arslan Siraj
- Applied Bioinformatics, Department of Computer Science, University of Tuebingen, Tübingen, Germany
- Institute for Bioinformatics and Medical Informatics, University of Tuebingen, Tübingen, Germany
| | | | - Tjeerd M H Dijkstra
- Department for Women's Health, University Clinic Tübingen, Tübingen, Germany
- Institute for Translational Bioinformatics, University Hospital Tübingen, Tübingen, Germany
| | - Yasset Perez-Riverol
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, UK
| | - Hannes Röst
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
- The Donnelly Centre, University of Toronto, Toronto, Ontario, Canada
| | - Oliver Kohlbacher
- Applied Bioinformatics, Department of Computer Science, University of Tuebingen, Tübingen, Germany
- Institute for Bioinformatics and Medical Informatics, University of Tuebingen, Tübingen, Germany
- Institute for Translational Bioinformatics, University Hospital Tübingen, Tübingen, Germany
| | - Timo Sachsenberg
- Applied Bioinformatics, Department of Computer Science, University of Tuebingen, Tübingen, Germany.
- Institute for Bioinformatics and Medical Informatics, University of Tuebingen, Tübingen, Germany.
| |
|
21
|
Govender MA, Stoychev SH, Brandenburg JT, Ramsay M, Fabian J, Govender IS. Proteomic insights into the pathophysiology of hypertension-associated albuminuria: Pilot study in a South African cohort. Clin Proteomics 2024; 21:15. [PMID: 38402394 PMCID: PMC10893729 DOI: 10.1186/s12014-024-09458-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/30/2023] [Accepted: 02/06/2024] [Indexed: 02/26/2024] Open
Abstract
BACKGROUND Hypertension is an important public health priority with a high prevalence in Africa. It is also an independent risk factor for kidney outcomes. We aimed to identify potential proteins and pathways involved in hypertension-associated albuminuria by assessing urinary proteomic profiles in black South African participants with combined hypertension and albuminuria compared to those who have neither condition. METHODS The study included 24 South African cases with both hypertension and albuminuria and 49 control participants who had neither condition. Protein was extracted from urine samples and analysed using ultra-high-performance liquid chromatography coupled with mass spectrometry. Data were generated using data-independent acquisition (DIA) and processed using Spectronaut™ 15. Statistical and functional data annotation were performed on Perseus and Cytoscape to identify and annotate differentially abundant proteins. Machine learning was applied to the dataset using the OmicLearn platform. RESULTS Overall, a mean of 1,225 and 915 proteins were quantified in the control and case groups, respectively. Three hundred and thirty-two differentially abundant proteins were constructed into a network. Pathways associated with these differentially abundant proteins included the immune system (q-value [false discovery rate] = 1.4 × 10⁻⁴⁵), innate immune system (q = 1.1 × 10⁻³²), extracellular matrix (ECM) organisation (q = 0.03) and activation of matrix metalloproteinases (q = 0.04). Proteins with high disease scores (76-100% confidence) for both hypertension and chronic kidney disease included angiotensinogen (AGT), albumin (ALB), apolipoprotein L1 (APOL1), and uromodulin (UMOD). A machine learning approach was able to identify a set of 20 proteins, differentiating between cases and controls.
CONCLUSIONS The urinary proteomic data combined with the machine learning approach was able to classify disease status and identify proteins and pathways associated with hypertension-associated albuminuria.
Affiliation(s)
- Melanie A Govender
- Division of Human Genetics, National Health Laboratory Service and School of Pathology, Faculty of Health Sciences, University of the Witwatersrand, Johannesburg, South Africa.
- Sydney Brenner Institute for Molecular Bioscience, Faculty of Health Sciences, University of the Witwatersrand, Johannesburg, South Africa.
| | - Stoyan H Stoychev
- Council for Scientific and Industrial Research, NextGen Health, Pretoria, South Africa
- ReSyn Biosciences, Edenvale, South Africa
| | - Jean-Tristan Brandenburg
- Sydney Brenner Institute for Molecular Bioscience, Faculty of Health Sciences, University of the Witwatersrand, Johannesburg, South Africa
- Strengthening Oncology Services, Faculty of Health Sciences, University of the Witwatersrand, Johannesburg, South Africa
| | - Michèle Ramsay
- Division of Human Genetics, National Health Laboratory Service and School of Pathology, Faculty of Health Sciences, University of the Witwatersrand, Johannesburg, South Africa
- Sydney Brenner Institute for Molecular Bioscience, Faculty of Health Sciences, University of the Witwatersrand, Johannesburg, South Africa
| | - June Fabian
- Wits Donald Gordon Medical Centre, School of Clinical Medicine, Faculty of Health Sciences, University of the Witwatersrand, Johannesburg, South Africa
- Medical Research Council/Wits University Rural Public Health and Health Transitions Research Unit (Agincourt), School of Public Health, Faculty of Health Sciences, University of the Witwatersrand, Johannesburg, South Africa
| | - Ireshyn S Govender
- Council for Scientific and Industrial Research, NextGen Health, Pretoria, South Africa.
- ReSyn Biosciences, Edenvale, South Africa.
| |
|
22
|
Li Y, He Q, Guo H, Shuai SC, Cheng J, Liu L, Shuai J. AttnPep: A Self-Attention-Based Deep Learning Method for Peptide Identification in Shotgun Proteomics. J Proteome Res 2024; 23:834-843. [PMID: 38252705 DOI: 10.1021/acs.jproteome.3c00729] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/24/2024]
Abstract
In shotgun proteomics, the proteome search engine analyzes mass spectra obtained by experiments, and then a peptide-spectrum match (PSM) is reported for each spectrum. However, most of the PSMs identified are incorrect, and therefore various postprocessing tools have been developed for reranking the peptide identifications. Yet these methods suffer from issues such as dependency on distribution, reliance on shallow models, and limited effectiveness. In this work, we propose AttnPep, a deep learning model for rescoring PSMs that utilizes the Self-Attention module. This module helps the neural network focus on features relevant to the classification of PSMs and ignore irrelevant features. This allows AttnPep to analyze the output of different search engines and improve PSM discrimination accuracy. We considered a PSM to be correct if it achieves a q-value <0.01 and compared AttnPep with existing mainstream software PeptideProphet, Percolator, and proteoTorch. The results indicated that AttnPep found an average increase in correct PSMs of 9.29% relative to the other methods. Additionally, AttnPep was able to better distinguish between correct and incorrect PSMs and found more synthetic peptides in the complex SWATH data set.
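The acceptance criterion in this abstract (a PSM counts as correct at q-value < 0.01) can be illustrated with a minimal sketch of target-decoy q-value estimation. This is not AttnPep's code: the abstract does not say how q-values were computed, so the target-decoy approach, the function names `q_values` and `count_accepted`, and the `(score, is_decoy)` input layout are all assumptions made for illustration.

```python
from typing import List, Tuple


def q_values(psms: List[Tuple[float, bool]]) -> List[float]:
    """Estimate q-values for PSMs given as (score, is_decoy) pairs.

    Assumes higher scores are better and that decoy hits estimate the
    number of false target hits (standard target-decoy assumption).
    Returns q-values in the same order as the input.
    """
    # Rank PSMs from best to worst score.
    order = sorted(range(len(psms)), key=lambda i: psms[i][0], reverse=True)

    # FDR estimate at each rank: decoys seen so far / targets seen so far.
    fdrs = []
    targets = decoys = 0
    for i in order:
        if psms[i][1]:
            decoys += 1
        else:
            targets += 1
        fdrs.append(decoys / max(targets, 1))

    # q-value: the smallest FDR at this rank or any worse rank.
    q = [0.0] * len(psms)
    running_min = float("inf")
    for rank in range(len(order) - 1, -1, -1):
        running_min = min(running_min, fdrs[rank])
        q[order[rank]] = running_min
    return q


def count_accepted(psms: List[Tuple[float, bool]], threshold: float = 0.01) -> int:
    """Count target PSMs whose q-value is below the threshold."""
    qs = q_values(psms)
    return sum(
        1 for (_, is_decoy), qv in zip(psms, qs) if not is_decoy and qv < threshold
    )
```

Under this scheme, a rescoring method that separates targets from decoys more cleanly pushes decoys toward the bottom of the ranking, so more target PSMs fall under the same q-value threshold.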
Affiliation(s)
- Yulin Li
- Department of Physics, Xiamen University, Xiamen 361005, China
| | - Qingzu He
- Department of Physics, Xiamen University, Xiamen 361005, China
- Wenzhou Key Laboratory of Biophysics, Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou, Zhejiang 325001, China
| | - Huan Guo
- Department of Physics, Xiamen University, Xiamen 361005, China
| | - Stella C Shuai
- Biological Science, Northwestern University, Evanston, Illinois 60208, United States
| | - Jinyan Cheng
- Wenzhou Key Laboratory of Biophysics, Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou, Zhejiang 325001, China
| | - Liyu Liu
- Wenzhou Key Laboratory of Biophysics, Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou, Zhejiang 325001, China
| | - Jianwei Shuai
- Department of Physics, Xiamen University, Xiamen 361005, China
- Wenzhou Key Laboratory of Biophysics, Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou, Zhejiang 325001, China
| |
|
23
|
Osipov A, Nikolic O, Gertych A, Parker S, Hendifar A, Singh P, Filippova D, Dagliyan G, Ferrone CR, Zheng L, Moore JH, Tourtellotte W, Van Eyk JE, Theodorescu D. The Molecular Twin artificial-intelligence platform integrates multi-omic data to predict outcomes for pancreatic adenocarcinoma patients. NATURE CANCER 2024; 5:299-314. [PMID: 38253803 PMCID: PMC10899109 DOI: 10.1038/s43018-023-00697-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/31/2022] [Accepted: 11/30/2023] [Indexed: 01/24/2024]
Abstract
Contemporary analyses focused on a limited number of clinical and molecular biomarkers have been unable to accurately predict clinical outcomes in pancreatic ductal adenocarcinoma. Here we describe a precision medicine platform known as the Molecular Twin consisting of advanced machine-learning models and use it to analyze a dataset of 6,363 clinical and multi-omic molecular features from patients with resected pancreatic ductal adenocarcinoma to accurately predict disease survival (DS). We show that a full multi-omic model predicts DS with the highest accuracy and that plasma protein is the top single-omic predictor of DS. A parsimonious model learning only 589 multi-omic features demonstrated similar predictive performance as the full multi-omic model. Our platform enables discovery of parsimonious biomarker panels and performance assessment of outcome prediction models learning from resource-intensive panels. This approach has considerable potential to impact clinical care and democratize precision cancer medicine worldwide.
Affiliation(s)
- Arsen Osipov
- Department of Medicine (Medical Oncology), Cedars-Sinai Medical Center, Los Angeles, CA, USA
- Samuel Oschin Comprehensive Cancer Institute, Cedars-Sinai Medical Center, Los Angeles, CA, USA
- Department of Oncology, Pancreatic Cancer Precision Medicine Center of Excellence, Johns Hopkins University, Baltimore, MD, USA
| | | | - Arkadiusz Gertych
- Samuel Oschin Comprehensive Cancer Institute, Cedars-Sinai Medical Center, Los Angeles, CA, USA
- Department of Pathology and Laboratory Medicine, Cedars-Sinai Medical Center, Los Angeles, CA, USA
- Department of Surgery, Cedars-Sinai Medical Center, Los Angeles, CA, USA
| | - Sarah Parker
- Samuel Oschin Comprehensive Cancer Institute, Cedars-Sinai Medical Center, Los Angeles, CA, USA
- Department of Biomedical Sciences and Smidt Heart Institute, Cedars-Sinai Medical Center, Los Angeles, CA, USA
| | - Andrew Hendifar
- Department of Medicine (Medical Oncology), Cedars-Sinai Medical Center, Los Angeles, CA, USA
- Samuel Oschin Comprehensive Cancer Institute, Cedars-Sinai Medical Center, Los Angeles, CA, USA
| | | | | | - Grant Dagliyan
- Samuel Oschin Comprehensive Cancer Institute, Cedars-Sinai Medical Center, Los Angeles, CA, USA
| | - Cristina R Ferrone
- Samuel Oschin Comprehensive Cancer Institute, Cedars-Sinai Medical Center, Los Angeles, CA, USA
- Department of Surgery, Cedars-Sinai Medical Center, Los Angeles, CA, USA
| | - Lei Zheng
- Department of Oncology, Pancreatic Cancer Precision Medicine Center of Excellence, Johns Hopkins University, Baltimore, MD, USA
| | - Jason H Moore
- Samuel Oschin Comprehensive Cancer Institute, Cedars-Sinai Medical Center, Los Angeles, CA, USA
- Department of Computational Biomedicine, Cedars-Sinai Medical Center, Los Angeles, CA, USA
| | - Warren Tourtellotte
- Samuel Oschin Comprehensive Cancer Institute, Cedars-Sinai Medical Center, Los Angeles, CA, USA
- Department of Pathology and Laboratory Medicine, Cedars-Sinai Medical Center, Los Angeles, CA, USA
| | - Jennifer E Van Eyk
- Samuel Oschin Comprehensive Cancer Institute, Cedars-Sinai Medical Center, Los Angeles, CA, USA
- Department of Pathology and Laboratory Medicine, Cedars-Sinai Medical Center, Los Angeles, CA, USA
- Department of Biomedical Sciences and Smidt Heart Institute, Cedars-Sinai Medical Center, Los Angeles, CA, USA
| | - Dan Theodorescu
- Samuel Oschin Comprehensive Cancer Institute, Cedars-Sinai Medical Center, Los Angeles, CA, USA.
- Department of Pathology and Laboratory Medicine, Cedars-Sinai Medical Center, Los Angeles, CA, USA.
- Department of Urology, Cedars-Sinai Medical Center, Los Angeles, CA, USA.
| |
|
24
|
Lou R, Shui W. Acquisition and Analysis of DIA-Based Proteomic Data: A Comprehensive Survey in 2023. Mol Cell Proteomics 2024; 23:100712. [PMID: 38182042 PMCID: PMC10847697 DOI: 10.1016/j.mcpro.2024.100712] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2023] [Revised: 12/27/2023] [Accepted: 01/02/2024] [Indexed: 01/07/2024] Open
Abstract
Data-independent acquisition (DIA) mass spectrometry (MS) has emerged as a powerful technology for high-throughput, accurate, and reproducible quantitative proteomics. This review provides a comprehensive overview of recent advances in both the experimental and computational methods for DIA proteomics, from data acquisition schemes to analysis strategies and software tools. DIA acquisition schemes are categorized based on the design of precursor isolation windows, highlighting wide-window, overlapping-window, narrow-window, scanning quadrupole-based, and parallel accumulation-serial fragmentation-enhanced DIA methods. For DIA data analysis, major strategies are classified into spectrum reconstruction, sequence-based search, library-based search, de novo sequencing, and sequencing-independent approaches. A wide array of software tools implementing these strategies are reviewed, with details on their overall workflows and scoring approaches at different steps. The generation and optimization of spectral libraries, which are critical resources for DIA analysis, are also discussed. Publicly available benchmark datasets covering global proteomics and phosphoproteomics are summarized to facilitate performance evaluation of various software tools and analysis workflows. Continued advances and synergistic developments of versatile components in DIA workflows are expected to further enhance the power of DIA-based proteomics.
Affiliation(s)
- Ronghui Lou
- iHuman Institute, ShanghaiTech University, Shanghai, China; School of Life Science and Technology, ShanghaiTech University, Shanghai, China.
| | - Wenqing Shui
- iHuman Institute, ShanghaiTech University, Shanghai, China; School of Life Science and Technology, ShanghaiTech University, Shanghai, China.
| |
|
25
|
Shahbazy M, Ramarathinam SH, Li C, Illing PT, Faridi P, Croft NP, Purcell AW. MHCpLogics: an interactive machine learning-based tool for unsupervised data visualization and cluster analysis of immunopeptidomes. Brief Bioinform 2024; 25:bbae087. [PMID: 38487848 PMCID: PMC10940831 DOI: 10.1093/bib/bbae087] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/06/2023] [Revised: 12/12/2023] [Accepted: 02/15/2024] [Indexed: 03/18/2024] Open
Abstract
The major histocompatibility complex (MHC) encodes a range of immune response genes, including the human leukocyte antigens (HLAs) in humans. These molecules bind peptide antigens and present them on the cell surface for T cell recognition. The repertoires of peptides presented by HLA molecules are termed immunopeptidomes. The highly polymorphic nature of the genes that encode the HLA molecules confers allotype-specific differences in the sequences of bound ligands. Allotype-specific ligand preferences are often defined by peptide-binding motifs. Individuals express up to six classical class I HLA allotypes, which likely present peptides displaying different binding motifs. Such complex datasets make the deconvolution of immunopeptidomic data into allotype-specific contributions and further dissection of binding-specificities challenging. Herein, we developed MHCpLogics as an interactive machine learning-based tool for mining peptide-binding sequence motifs and visualization of immunopeptidome data across complex datasets. We showcase the functionalities of MHCpLogics by analyzing both in-house and published mono- and multi-allelic immunopeptidomics data. The visualization modalities of MHCpLogics allow users to inspect clustered sequences down to individual peptide components and to examine broader sequence patterns within multiple immunopeptidome datasets. MHCpLogics can deconvolute large immunopeptidome datasets enabling the interrogation of clusters for the segregation of allotype-specific peptide sequence motifs, identification of sub-peptidome motifs, and the exportation of clustered peptide sequence lists. The tool facilitates rapid inspection of immunopeptidomes as a resource for the immunology and vaccine communities. MHCpLogics is a standalone application available via an executable installation at: https://github.com/PurcellLab/MHCpLogics.
Affiliation(s)
- Mohammad Shahbazy
- Department of Biochemistry and Molecular Biology and Infection and Immunity Program, Biomedicine Discovery Institute, Monash University, Melbourne, VIC 3800, Australia
| | - Sri H Ramarathinam
- Department of Biochemistry and Molecular Biology and Infection and Immunity Program, Biomedicine Discovery Institute, Monash University, Melbourne, VIC 3800, Australia
| | - Chen Li
- Department of Biochemistry and Molecular Biology and Infection and Immunity Program, Biomedicine Discovery Institute, Monash University, Melbourne, VIC 3800, Australia
| | - Patricia T Illing
- Department of Biochemistry and Molecular Biology and Infection and Immunity Program, Biomedicine Discovery Institute, Monash University, Melbourne, VIC 3800, Australia
| | - Pouya Faridi
- Centre for Cancer Research, Hudson Institute of Medical Research, Clayton, VIC 3168, Australia
- Monash Proteomics and Metabolomics Platform, Department of Medicine, School of Clinical Sciences, Monash University, Clayton, VIC 3800, Australia
| | - Nathan P Croft
- Department of Biochemistry and Molecular Biology and Infection and Immunity Program, Biomedicine Discovery Institute, Monash University, Melbourne, VIC 3800, Australia
| | - Anthony W Purcell
- Department of Biochemistry and Molecular Biology and Infection and Immunity Program, Biomedicine Discovery Institute, Monash University, Melbourne, VIC 3800, Australia
| |
|
26
|
Zila N, Eichhoff OM, Steiner I, Mohr T, Bileck A, Cheng PF, Leitner A, Gillet L, Sajic T, Goetze S, Friedrich B, Bortel P, Strobl J, Reitermaier R, Hogan SA, Martínez Gómez JM, Staeger R, Tuchmann F, Peters S, Stary G, Kuttke M, Elbe-Bürger A, Hoeller C, Kunstfeld R, Weninger W, Wollscheid B, Dummer R, French LE, Gerner C, Aebersold R, Levesque MP, Paulitschke V. Proteomic Profiling of Advanced Melanoma Patients to Predict Therapeutic Response to Anti-PD-1 Therapy. Clin Cancer Res 2024; 30:159-175. [PMID: 37861398 DOI: 10.1158/1078-0432.ccr-23-0562] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/01/2023] [Revised: 07/17/2023] [Accepted: 10/18/2023] [Indexed: 10/21/2023]
Abstract
PURPOSE Despite high clinical need, there are no biomarkers that accurately predict the response of patients with metastatic melanoma to anti-PD-1 therapy. EXPERIMENTAL DESIGN In this multicenter study, we applied protein depletion and enrichment methods prior to various proteomic techniques to analyze a serum discovery cohort (n = 56) and three independent serum validation cohorts (n = 80, n = 12, n = 17). Further validation analyses by literature and survival analysis followed. RESULTS We identified several significantly regulated proteins as well as biological processes such as neutrophil degranulation, cell-substrate adhesion, and extracellular matrix organization. Analysis of the three independent serum validation cohorts confirmed the significant differences between responders (R) and nonresponders (NR) observed in the initial discovery cohort. In addition, literature-based validation highlighted 30 markers overlapping with previously published signatures. Survival analysis using the TCGA database showed that overexpression of 17 of the markers we identified correlated with lower overall survival in patients with melanoma. CONCLUSIONS Ultimately, this multilayered serum analysis led to a potential marker signature with 10 key markers significantly altered in at least two independent serum cohorts: CRP, LYVE1, SAA2, C1RL, CFHR3, LBP, LDHB, S100A8, S100A9, and SAA1, which will serve as the basis for further investigation. In addition to patient serum, we analyzed primary melanoma tumor cells from NR and found a potential marker signature with four key markers: LAMC1, PXDN, SERPINE1, and VCAN.
Affiliation(s)
- Nina Zila
- Department of Dermatology, Medical University of Vienna, Vienna, Austria
- Division of Biomedical Science, University of Applied Sciences FH Campus Wien, Vienna, Austria
| | - Ossia M Eichhoff
- Department of Dermatology, University Hospital Zurich, University of Zurich, Zurich, Switzerland
| | - Irene Steiner
- Center for Medical Data Science, Institute of Medical Statistics, Medical University of Vienna, Vienna, Austria
| | - Thomas Mohr
- Department of Medicine I, Institute of Cancer Research, Medical University of Vienna, Vienna, Austria
| | - Andrea Bileck
- Department of Analytical Chemistry, University of Vienna, Vienna, Austria
- Joint Metabolome Facility, University of Vienna and Medical University of Vienna, Vienna, Austria
| | - Phil F Cheng
- Department of Dermatology, University Hospital Zurich, University of Zurich, Zurich, Switzerland
| | - Alexander Leitner
- Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
| | - Ludovic Gillet
- Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
| | - Tatjana Sajic
- Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
| | - Sandra Goetze
- Department of Health Sciences and Technology, Institute of Translational Medicine, ETH Zurich, Zurich, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
- ETH PHRT Swiss Multi-Omics Center (SMOC), Zurich, Switzerland
| | - Betty Friedrich
- Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
| | - Patricia Bortel
- Department of Analytical Chemistry, University of Vienna, Vienna, Austria
| | - Johanna Strobl
- Department of Dermatology, Medical University of Vienna, Vienna, Austria
| | - René Reitermaier
- Department of Dermatology, Medical University of Vienna, Vienna, Austria
| | - Sabrina A Hogan
- Department of Dermatology, University Hospital Zurich, University of Zurich, Zurich, Switzerland
| | - Julia M Martínez Gómez
- Department of Dermatology, University Hospital Zurich, University of Zurich, Zurich, Switzerland
| | - Ramon Staeger
- Department of Dermatology, University Hospital Zurich, University of Zurich, Zurich, Switzerland
| | - Felix Tuchmann
- Department of Dermatology, Medical University of Vienna, Vienna, Austria
| | - Sophie Peters
- Department of Dermatology, Medical University of Vienna, Vienna, Austria
| | - Georg Stary
- Department of Dermatology, Medical University of Vienna, Vienna, Austria
- Ludwig Boltzmann Institute for Rare and Undiagnosed Diseases, Vienna, Austria
- CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, Vienna, Austria
| | - Mario Kuttke
- Center for Physiology and Pharmacology, Institute of Vascular Biology and Thrombosis Research, Medical University of Vienna, Vienna, Austria
| | | | - Christoph Hoeller
- Department of Dermatology, Medical University of Vienna, Vienna, Austria
| | - Rainer Kunstfeld
- Department of Dermatology, Medical University of Vienna, Vienna, Austria
| | - Wolfgang Weninger
- Department of Dermatology, Medical University of Vienna, Vienna, Austria
| | - Bernd Wollscheid
- Department of Health Sciences and Technology, Institute of Translational Medicine, ETH Zurich, Zurich, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
| | - Reinhard Dummer
- Department of Dermatology, University Hospital Zurich, University of Zurich, Zurich, Switzerland
| | - Lars E French
- Department of Dermatology and Allergy University Hospital, Ludwig-Maximilian-University Munich, Munich, Germany
- Dr. Phillip Frost Department of Dermatology and Cutaneous Surgery, University of Miami Miller School of Medicine, Miami, Florida
| | - Christopher Gerner
- Department of Analytical Chemistry, University of Vienna, Vienna, Austria
- Joint Metabolome Facility, University of Vienna and Medical University of Vienna, Vienna, Austria
| | - Ruedi Aebersold
- Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
| | - Mitchell P Levesque
- Department of Dermatology, University Hospital Zurich, University of Zurich, Zurich, Switzerland
| | - Verena Paulitschke
- Department of Dermatology, Medical University of Vienna, Vienna, Austria
| |
|
27
|
Szyrwiel L, Gille C, Mülleder M, Demichev V, Ralser M. Fast proteomics with dia-PASEF and analytical flow-rate chromatography. Proteomics 2024; 24:e2300100. [PMID: 37287406 DOI: 10.1002/pmic.202300100] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 02/17/2023] [Revised: 05/22/2023] [Accepted: 05/22/2023] [Indexed: 06/09/2023]
Abstract
Increased throughput in proteomic experiments can improve accessibility of proteomic platforms, reduce costs, and facilitate new approaches in systems biology and biomedical research. Here we propose combining analytical flow rate chromatography with ion mobility separation of peptide ions, data-independent acquisition, and data analysis with the DIA-NN software suite, to achieve high-quality proteomic experiments from limited sample amounts, at a throughput of up to 400 samples per day. For instance, when benchmarking our workflow using a 500-μL/min flow rate and 3-min chromatographic gradients, we report the quantification of 5211 proteins from 2 μg of a mammalian cell-line standard at high quantitative accuracy and precision. We further used this platform to analyze blood plasma samples from a cohort of COVID-19 inpatients, using a 3-min chromatographic gradient and alternating column regeneration on a dual pump system. The method delivered a comprehensive view of the COVID-19 plasma proteome, allowing classification of the patients according to disease severity and revealing plasma biomarker candidates.
Affiliation(s)
- Lukasz Szyrwiel
- Department of Biochemistry, Charité - Universitätsmedizin Berlin, Berlin, Germany
| | - Christoph Gille
- Department of Biochemistry, Charité - Universitätsmedizin Berlin, Berlin, Germany
- Core Facility High-Throughput Mass Spectrometry, Charité - Universitätsmedizin Berlin, Berlin, Germany
| | - Michael Mülleder
- Core Facility High-Throughput Mass Spectrometry, Charité - Universitätsmedizin Berlin, Berlin, Germany
| | - Vadim Demichev
- Department of Biochemistry, Charité - Universitätsmedizin Berlin, Berlin, Germany
| | - Markus Ralser
- Department of Biochemistry, Charité - Universitätsmedizin Berlin, Berlin, Germany
- The Wellcome Centre for Human Genetics, Nuffield Department of Medicine, University of Oxford, Oxford, UK
- Max Planck Institute for Molecular Genetics, Berlin, Germany
|
28
|
Wu W, Huang Z, Kong W, Peng H, Goh WWB. Optimizing the PROTREC network-based missing protein prediction algorithm. Proteomics 2024; 24:e2200332. [PMID: 37876146 DOI: 10.1002/pmic.202200332] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/28/2022] [Revised: 09/30/2023] [Accepted: 10/06/2023] [Indexed: 10/26/2023]
Abstract
This article summarizes the PROTREC method and investigates the impact of different hyper-parameters on the task of missing protein prediction using PROTREC. We evaluate missing protein recovery rates using different PROTREC score selection approaches (MAX, MIN, MEDIAN, and MEAN), different PROTREC score thresholds, and different complex size thresholds. In addition, we included two additional cancer datasets in our analysis and introduced a new validation method to check both the robustness of the PROTREC method and the correctness of our analysis. Our analysis showed that the missing protein recovery rate can be improved by adopting the PROTREC score selection operations MIN, MEDIAN, and MEAN instead of the default MAX. However, this may come at the cost of fewer proteins being predicted and validated. Users should therefore choose their hyper-parameters carefully to balance the accuracy-quantity trade-off. We also explored the possibility of combining PROTREC with a p-value-based method (FCS) and demonstrated that PROTREC performs well independently, without any help from a p-value-based method. Furthermore, we conducted a downstream enrichment analysis using the recovered proteins to understand the biological pathways and protein networks within the cancerous tissues.
- The missing protein recovery rate using PROTREC can be improved by selecting a different PROTREC score selection method.
- Different PROTREC score selection methods and other hyper-parameters, such as the PROTREC score threshold and the complex size threshold, introduce an accuracy-quantity trade-off.
- PROTREC is able to perform well independently of any filtering using a p-value-based method.
- Verification of the PROTREC method on additional cancer datasets.
- Downstream enrichment analysis to understand the biological pathways and protein networks in cancerous tissues.
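The score selection operations and thresholds named in this abstract can be illustrated with a small sketch. It is not the PROTREC implementation: it assumes, for illustration only, that every complex carries one score, that a protein collects the scores of all complexes it belongs to, and that a protein is predicted when its selected score reaches a threshold. All names and the toy data are hypothetical.

```python
from statistics import mean, median

# The four selection operations named in the abstract.
SELECTORS = {"MAX": max, "MIN": min, "MEDIAN": median, "MEAN": mean}


def select_scores(complex_scores, complex_members, method="MAX", min_complex_size=1):
    """Collapse per-complex scores into one score per protein.

    complex_scores: dict mapping complex id -> score (hypothetical input format)
    complex_members: dict mapping complex id -> set of protein ids
    Complexes smaller than min_complex_size are ignored.
    """
    per_protein = {}
    for cid, members in complex_members.items():
        if len(members) < min_complex_size:
            continue
        for protein in members:
            per_protein.setdefault(protein, []).append(complex_scores[cid])
    aggregate = SELECTORS[method]
    return {protein: aggregate(scores) for protein, scores in per_protein.items()}


def predict_missing(protein_scores, observed, threshold):
    """Proteins not observed in the sample whose selected score reaches the threshold."""
    return {p for p, s in protein_scores.items() if s >= threshold and p not in observed}


if __name__ == "__main__":
    scores = {"C1": 0.9, "C2": 0.5}
    members = {"C1": {"A", "B", "C"}, "C2": {"A", "D"}}
    for method in SELECTORS:
        selected = select_scores(scores, members, method)
        print(method, sorted(predict_missing(selected, observed={"B"}, threshold=0.8)))
```

In this toy setting MAX predicts both A and C, while MIN predicts only C, because A also belongs to a low-scoring complex. Stricter selection returns fewer candidates, which mirrors the accuracy-quantity trade-off the abstract describes.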
Affiliation(s)
- Wenshan Wu
- School of Computer Science and Engineering, Nanyang Technological University, Singapore, Singapore
| | - Zelu Huang
- School of Chemistry, Chemical Engineering and Biotechnology, Nanyang Technological University, Singapore, Singapore
| | - Weijia Kong
- Department of Computer Science, National University of Singapore, Singapore, Singapore
- School of Biological Science, Nanyang Technological University, Singapore, Singapore
| | - Hui Peng
- School of Biological Science, Nanyang Technological University, Singapore, Singapore
| | - Wilson Wen Bin Goh
- School of Biological Science, Nanyang Technological University, Singapore, Singapore
- Lee Kong Chian School of Medicine, Nanyang Technological University, Singapore, Singapore
- Center for Biomedical Informatics, Nanyang Technological University, Singapore, Singapore
|
29
|
Bichmann L, Gupta S, Röst H. Data-Independent Acquisition Peptidomics. Methods Mol Biol 2024; 2758:77-88. [PMID: 38549009 DOI: 10.1007/978-1-0716-3646-6_4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/02/2024]
Abstract
In recent years, data-independent acquisition (DIA) has emerged as a powerful analysis method in biological mass spectrometry (MS). Compared to the previously predominant data-dependent acquisition (DDA), it offers a way to achieve greater reproducibility, sensitivity, and dynamic range in MS measurements. To make DIA accessible to non-expert users, a multifunctional, automated high-throughput pipeline DIAproteomics was implemented in the computational workflow framework "Nextflow" ( https://nextflow.io ). This allows high-throughput processing of proteomics and peptidomics DIA datasets on diverse computing infrastructures. This chapter provides a short summary and usage protocol guide for the most important modes of operation of this pipeline regarding the analysis of peptidomics datasets using the command line. In brief, DIAproteomics is a wrapper around the OpenSwathWorkflow and relies on either existing or ad-hoc generated spectral libraries from matching DDA runs. The OpenSwathWorkflow extracts chromatograms from the DIA runs and performs chromatographic peak-picking. Further downstream of the pipeline, these peaks are scored, aligned, and statistically evaluated for qualitative and quantitative differences across conditions depending on the user's interest. DIAproteomics is open-source and available under a permissive license. We encourage the scientific community to use or modify the pipeline to meet their specific requirements.
Affiliation(s)
- Leon Bichmann
- Department of Computer Science, Applied Bioinformatics, University of Tübingen, Tübingen, Germany
| | - Shubham Gupta
- Donnelly Center for Cellular and Biomolecular Research, University of Toronto, Toronto, ON, Canada
| | - Hannes Röst
- Donnelly Center for Cellular and Biomolecular Research, University of Toronto, Toronto, ON, Canada
|
30
|
Qian Y, Guo X, Wang Y, Ouyang Z, Ma X. Mobility-Modulated Sequential Dissociation Analysis Enables Structural Lipidomics in Mass Spectrometry Imaging. Angew Chem Int Ed Engl 2023; 62:e202312275. [PMID: 37946693 DOI: 10.1002/anie.202312275] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/22/2023] [Revised: 10/09/2023] [Accepted: 11/09/2023] [Indexed: 11/12/2023]
Abstract
Spatial lipidomics based on mass spectrometry imaging (MSI) is a powerful tool for fundamental biology studies and biomarker discovery. However, the structure-resolving capability of MSI is limited by the lack of a multiplexed tandem mass spectrometry (MS/MS) method, primarily due to the small sample amount available from each pixel and the poor ion usage in MS/MS analysis. Here, we report a mobility-modulated sequential dissociation (MMSD) strategy for multiplex MS/MS imaging of distinct lipids from biological tissues. With ion mobility-enabled data-independent acquisition and automated spectrum deconvolution, MS/MS spectra of a large number of lipid species from each tissue pixel are acquired at no expense of imaging speed. MMSD imaging is highlighted by MS/MS imaging of 24 structurally distinct lipids in the mouse brain and by revealing the correlation of a structurally distinct phosphatidylethanolamine isomer (PE 18:1_18:1) from a human hepatocellular carcinoma (HCC) tissue. Mapping of structurally distinct lipid isomers is now enabled, and spatial lipidomics becomes feasible for MSI.
Affiliation(s)
- Yao Qian
- State Key Laboratory of Precision Measurement Technology and Instruments, Department of Precision Instrument, Tsinghua University, Beijing, 100084, China
| | - Xiangyu Guo
- State Key Laboratory of Precision Measurement Technology and Instruments, Department of Precision Instrument, Tsinghua University, Beijing, 100084, China
| | - Yunfang Wang
- Hepato-pancreato-biliary Center, Beijing Tsinghua Changgung Hospital, Tsinghua University, Beijing, 102218, China
| | - Zheng Ouyang
- State Key Laboratory of Precision Measurement Technology and Instruments, Department of Precision Instrument, Tsinghua University, Beijing, 100084, China
| | - Xiaoxiao Ma
- State Key Laboratory of Precision Measurement Technology and Instruments, Department of Precision Instrument, Tsinghua University, Beijing, 100084, China
|
31
|
Yan B, Shi M, Cai S, Su Y, Chen R, Huang C, Chen DDY. Data-Driven Tool for Cross-Run Ion Selection and Peak-Picking in Quantitative Proteomics with Data-Independent Acquisition LC-MS/MS. Anal Chem 2023; 95:16558-16566. [PMID: 37906674 DOI: 10.1021/acs.analchem.3c02689] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 11/02/2023]
Abstract
Proteomics provides molecular bases of biology and disease, and liquid chromatography-tandem mass spectrometry (LC-MS/MS) is a platform widely used for bottom-up proteomics. Data-independent acquisition (DIA) improves the run-to-run reproducibility of LC-MS/MS in proteomics research. However, the existing DIA data processing tools sometimes produce large deviations from true values for the peptides and proteins in quantification. Peak-picking error and incorrect ion selection are the two main causes of the deviations. We present a cross-run ion selection and peak-picking (CRISP) tool that utilizes the important advantage of run-to-run consistency of DIA and simultaneously examines the DIA data from the whole set of runs to filter out the interfering signals, instead of only looking at a single run at a time. Eight datasets acquired by mass spectrometers from different vendors with different types of mass analyzers were used to benchmark our CRISP-DIA against other currently available DIA tools. In the benchmark datasets, for analytes with large content variation among samples, CRISP-DIA generally resulted in 20 to 50% relative decrease in error rates compared to other DIA tools, at both the peptide precursor level and the protein level. CRISP-DIA detected differentially expressed proteins more efficiently, with 3.3 to 90.3% increases in the numbers of true positives and 12.3 to 35.3% decreases in the false positive rates, in some cases. In the real biological datasets, CRISP-DIA showed better consistencies of the quantification results. The advantages of assimilating DIA data in multiple runs for quantitative proteomics were demonstrated, which can significantly improve the quantification accuracy.
Affiliation(s)
- Binjun Yan
- Key Laboratory of Systems Biology, School of Life Science, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Hangzhou 310024, China
| | - Mengtian Shi
- Key Laboratory of Systems Biology, School of Life Science, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Hangzhou 310024, China
- College of Pharmaceutical Science, Zhejiang Chinese Medical University, Hangzhou 310053, China
| | - Siyu Cai
- Key Laboratory of Systems Biology, School of Life Science, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Hangzhou 310024, China
- College of Pharmaceutical Science, Zhejiang Chinese Medical University, Hangzhou 310053, China
| | - Yuan Su
- Key Laboratory of Systems Biology, School of Life Science, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Hangzhou 310024, China
- College of Pharmaceutical Science, Zhejiang Chinese Medical University, Hangzhou 310053, China
| | - Renhui Chen
- Key Laboratory of Systems Biology, School of Life Science, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Hangzhou 310024, China
| | - Chiyuan Huang
- Key Laboratory of Systems Biology, School of Life Science, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Hangzhou 310024, China
| | - David Da Yong Chen
- Department of Chemistry, University of British Columbia, Vancouver, British Columbia V6T 1Z1, Canada
|
32
|
Kitata RB, Yang JC, Chen YJ. Advances in data-independent acquisition mass spectrometry towards comprehensive digital proteome landscape. MASS SPECTROMETRY REVIEWS 2023; 42:2324-2348. [PMID: 35645145 DOI: 10.1002/mas.21781] [Citation(s) in RCA: 37] [Impact Index Per Article: 37.0] [Received: 10/09/2021] [Revised: 12/17/2021] [Accepted: 01/21/2022] [Indexed: 06/15/2023]
Abstract
Data-independent acquisition mass spectrometry (DIA-MS) has rapidly evolved as a powerful alternative for highly reproducible proteome profiling, with the unique strength of generating permanent digital maps for retrospective analysis of biological systems. Recent advancements in data analysis software tools for complex DIA-MS/MS spectra, coupled with fast MS scanning speed and high mass accuracy, have greatly expanded the sensitivity and coverage of DIA-based proteomics profiling. Here, we review the evolution of DIA-MS techniques, from the earlier proof-of-principle of parallel fragmentation of all ions or of ions in a selected m/z range, through sequential window acquisition of all theoretical mass spectra (SWATH-MS), to the latest innovations; recent developments in computational algorithms for data informatics; and auxiliary tools and advanced instrumentation that enhance the performance of DIA-MS. We further summarize recent applications of DIA-MS, as well as experimentally derived and in silico spectral library resources for large-scale profiling, to facilitate biomarker discovery and drug development in human diseases, with emphasis on proteomic profiling coverage. Toward next-generation DIA-MS for clinical proteomics, we outline the challenges in processing multidimensional DIA data sets and large-scale clinical proteomics, and the continuing need for higher profiling coverage and sensitivity.
Affiliation(s)
| | - Jhih-Ci Yang
- Institute of Chemistry, Academia Sinica, Taipei, Taiwan
- Sustainable Chemical Science and Technology, Taiwan International Graduate Program, Academia Sinica and National Yang Ming Chiao Tung University, Taipei, Taiwan
- Department of Applied Chemistry, National Yang Ming Chiao Tung University, Hsinchu, Taiwan
| | - Yu-Ju Chen
- Institute of Chemistry, Academia Sinica, Taipei, Taiwan
- Sustainable Chemical Science and Technology, Taiwan International Graduate Program, Academia Sinica and National Yang Ming Chiao Tung University, Taipei, Taiwan
- Department of Chemistry, National Taiwan University, Taipei, Taiwan
|
33
|
Hay BN, Akinlaja MO, Baker TC, Houfani AA, Stacey RG, Foster LJ. Integration of data-independent acquisition (DIA) with co-fractionation mass spectrometry (CF-MS) to enhance interactome mapping capabilities. Proteomics 2023; 23:e2200278. [PMID: 37144656 DOI: 10.1002/pmic.202200278] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/28/2022] [Revised: 04/03/2023] [Accepted: 04/14/2023] [Indexed: 05/06/2023]
Abstract
Proteomics technologies are continually advancing, providing opportunities to develop stronger and more robust protein interaction networks (PINs). In part, this is due to the ever-growing number of high-throughput proteomics methods that are available. This review discusses how data-independent acquisition (DIA) and co-fractionation mass spectrometry (CF-MS) can be integrated to enhance interactome mapping abilities. Furthermore, integrating these two techniques can improve data quality and network generation through extended protein coverage, less missing data, and reduced noise. CF-DIA-MS shows promise in expanding our knowledge of interactomes, notably for non-model organisms (NMOs). CF-MS is a valuable technique on its own, but upon the integration of DIA, the potential to develop robust PINs increases, offering a unique approach for researchers to gain an in-depth understanding of the dynamics of numerous biological processes.
Affiliation(s)
- Brenna N Hay
- Michael Smith Laboratories and Department of Biochemistry & Molecular Biology, University of British Columbia, Vancouver, BC, Canada
| | - Mopelola O Akinlaja
- Michael Smith Laboratories and Department of Biochemistry & Molecular Biology, University of British Columbia, Vancouver, BC, Canada
| | - Teesha C Baker
- Michael Smith Laboratories and Department of Biochemistry & Molecular Biology, University of British Columbia, Vancouver, BC, Canada
| | - Aicha Asma Houfani
- Michael Smith Laboratories and Department of Biochemistry & Molecular Biology, University of British Columbia, Vancouver, BC, Canada
| | - R Greg Stacey
- Michael Smith Laboratories and Department of Biochemistry & Molecular Biology, University of British Columbia, Vancouver, BC, Canada
| | - Leonard J Foster
- Michael Smith Laboratories and Department of Biochemistry & Molecular Biology, University of British Columbia, Vancouver, BC, Canada
|
34
|
Verma N, Khare D, Poe AJ, Amador C, Ghiam S, Fealy A, Ebrahimi S, Shadrokh O, Song XY, Santiskulvong C, Mastali M, Parker S, Stotland A, Van Eyk JE, Ljubimov AV, Saghizadeh M. MicroRNA and Protein Cargos of Human Limbal Epithelial Cell-Derived Exosomes and Their Regulatory Roles in Limbal Stromal Cells of Diabetic and Non-Diabetic Corneas. Cells 2023; 12:2524. [PMID: 37947602 PMCID: PMC10649916 DOI: 10.3390/cells12212524] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/13/2023] [Revised: 10/08/2023] [Accepted: 10/18/2023] [Indexed: 11/12/2023] Open
Abstract
Epithelial and stromal/mesenchymal limbal stem cells contribute to corneal homeostasis and cell renewal. Extracellular vesicles (EVs), including exosomes (Exos), can be paracrine mediators of intercellular communication. Previously, we described cargos and regulatory roles of limbal stromal cell (LSC)-derived Exos in non-diabetic (N) and diabetic (DM) limbal epithelial cells (LECs). Presently, we quantify the miRNA and proteome profiles of human LEC-derived Exos and their regulatory roles in N- and DM-LSCs. We revealed some miRNA and protein differences in DM- vs. N-LEC-derived Exos' cargos, including proteins involved in Exo biogenesis and packaging that may affect Exo production and ultimately cellular crosstalk and corneal function. Treatment by N-Exos, but not by DM-Exos, enhanced wound healing in cultured N-LSCs and increased proliferation rates in N- and DM-LSCs vs. corresponding untreated (control) cells. N-Exos-treated LSCs showed reduced levels of the keratocyte markers ALDH3A1 and lumican and increased levels of the MSC markers CD73, CD90, and CD105 vs. control LSCs. These changes were opposite to those quantified in wounded LSCs. Overall, N-LEC Exos have a more pronounced effect on LSC wound healing, proliferation, and stem cell marker expression than DM-LEC Exos. This suggests that regulatory miRNA and protein cargo differences in DM- vs. N-LEC-derived Exos could contribute to the disease state.
Affiliation(s)
- Nagendra Verma
- Eye Program, Board of Governors Regenerative Medicine Institute, Cedars Sinai Medical Center, 8700 Beverly Boulevard, AHSP-A8104, Los Angeles, CA 90048, USA; (N.V.); (D.K.); (C.A.); (A.F.); (S.E.); (O.S.); (A.V.L.)
- Departments of Biomedical Sciences, Cedars Sinai Medical Center, Los Angeles, CA 90048, USA
| | - Drirh Khare
- Eye Program, Board of Governors Regenerative Medicine Institute, Cedars Sinai Medical Center, 8700 Beverly Boulevard, AHSP-A8104, Los Angeles, CA 90048, USA; (N.V.); (D.K.); (C.A.); (A.F.); (S.E.); (O.S.); (A.V.L.)
- Departments of Biomedical Sciences, Cedars Sinai Medical Center, Los Angeles, CA 90048, USA
- Division of Pediatric Blood and Marrow Transplantation & Cellular Therapy, University of Minnesota, Minneapolis, MN 55455, USA
| | - Adam J. Poe
- Eye Program, Board of Governors Regenerative Medicine Institute, Cedars Sinai Medical Center, 8700 Beverly Boulevard, AHSP-A8104, Los Angeles, CA 90048, USA; (N.V.); (D.K.); (C.A.); (A.F.); (S.E.); (O.S.); (A.V.L.)
- Departments of Biomedical Sciences, Cedars Sinai Medical Center, Los Angeles, CA 90048, USA
| | - Cynthia Amador
- Eye Program, Board of Governors Regenerative Medicine Institute, Cedars Sinai Medical Center, 8700 Beverly Boulevard, AHSP-A8104, Los Angeles, CA 90048, USA; (N.V.); (D.K.); (C.A.); (A.F.); (S.E.); (O.S.); (A.V.L.)
- Departments of Biomedical Sciences, Cedars Sinai Medical Center, Los Angeles, CA 90048, USA
| | - Sean Ghiam
- Eye Program, Board of Governors Regenerative Medicine Institute, Cedars Sinai Medical Center, 8700 Beverly Boulevard, AHSP-A8104, Los Angeles, CA 90048, USA; (N.V.); (D.K.); (C.A.); (A.F.); (S.E.); (O.S.); (A.V.L.)
- Departments of Biomedical Sciences, Cedars Sinai Medical Center, Los Angeles, CA 90048, USA
- Sackler School of Medicine, New York State/American Program of Tel Aviv University, Tel Aviv 6997801, Israel
| | - Andrew Fealy
- Eye Program, Board of Governors Regenerative Medicine Institute, Cedars Sinai Medical Center, 8700 Beverly Boulevard, AHSP-A8104, Los Angeles, CA 90048, USA; (N.V.); (D.K.); (C.A.); (A.F.); (S.E.); (O.S.); (A.V.L.)
- Departments of Biomedical Sciences, Cedars Sinai Medical Center, Los Angeles, CA 90048, USA
| | - Shaghaiegh Ebrahimi
- Eye Program, Board of Governors Regenerative Medicine Institute, Cedars Sinai Medical Center, 8700 Beverly Boulevard, AHSP-A8104, Los Angeles, CA 90048, USA; (N.V.); (D.K.); (C.A.); (A.F.); (S.E.); (O.S.); (A.V.L.)
- Departments of Biomedical Sciences, Cedars Sinai Medical Center, Los Angeles, CA 90048, USA
| | - Odelia Shadrokh
- Eye Program, Board of Governors Regenerative Medicine Institute, Cedars Sinai Medical Center, 8700 Beverly Boulevard, AHSP-A8104, Los Angeles, CA 90048, USA; (N.V.); (D.K.); (C.A.); (A.F.); (S.E.); (O.S.); (A.V.L.)
- Departments of Biomedical Sciences, Cedars Sinai Medical Center, Los Angeles, CA 90048, USA
| | - Xue-Ying Song
- Genomics Core, Cedars Sinai Medical Center, Los Angeles, CA 90048, USA; (X.-Y.S.); (C.S.)
| | - Chintda Santiskulvong
- Genomics Core, Cedars Sinai Medical Center, Los Angeles, CA 90048, USA; (X.-Y.S.); (C.S.)
| | - Mitra Mastali
- Advanced Clinical Biosystems Research Institute, The Smidt Heart Institute, Cedars Sinai Medical Center, Los Angeles, CA 90048, USA; (M.M.); (S.P.); (A.S.); (J.E.V.E.)
| | - Sarah Parker
- Advanced Clinical Biosystems Research Institute, The Smidt Heart Institute, Cedars Sinai Medical Center, Los Angeles, CA 90048, USA; (M.M.); (S.P.); (A.S.); (J.E.V.E.)
| | - Aleksandr Stotland
- Advanced Clinical Biosystems Research Institute, The Smidt Heart Institute, Cedars Sinai Medical Center, Los Angeles, CA 90048, USA; (M.M.); (S.P.); (A.S.); (J.E.V.E.)
| | - Jennifer E. Van Eyk
- Advanced Clinical Biosystems Research Institute, The Smidt Heart Institute, Cedars Sinai Medical Center, Los Angeles, CA 90048, USA; (M.M.); (S.P.); (A.S.); (J.E.V.E.)
| | - Alexander V. Ljubimov
- Eye Program, Board of Governors Regenerative Medicine Institute, Cedars Sinai Medical Center, 8700 Beverly Boulevard, AHSP-A8104, Los Angeles, CA 90048, USA; (N.V.); (D.K.); (C.A.); (A.F.); (S.E.); (O.S.); (A.V.L.)
- Departments of Biomedical Sciences, Cedars Sinai Medical Center, Los Angeles, CA 90048, USA
- Department of Medicine, David Geffen School of Medicine at UCLA, Los Angeles, CA 90024, USA
| | - Mehrnoosh Saghizadeh
- Eye Program, Board of Governors Regenerative Medicine Institute, Cedars Sinai Medical Center, 8700 Beverly Boulevard, AHSP-A8104, Los Angeles, CA 90048, USA; (N.V.); (D.K.); (C.A.); (A.F.); (S.E.); (O.S.); (A.V.L.)
- Departments of Biomedical Sciences, Cedars Sinai Medical Center, Los Angeles, CA 90048, USA
- Department of Medicine, David Geffen School of Medicine at UCLA, Los Angeles, CA 90024, USA
|
35
|
Lei JT, Jaehnig EJ, Smith H, Holt MV, Li X, Anurag M, Ellis MJ, Mills GB, Zhang B, Labrie M. The Breast Cancer Proteome and Precision Oncology. Cold Spring Harb Perspect Med 2023; 13:a041323. [PMID: 37137501 PMCID: PMC10547392 DOI: 10.1101/cshperspect.a041323] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/05/2023]
Abstract
The goal of precision oncology is to translate the molecular features of cancer into predictive and prognostic tests that can be used to individualize treatment leading to improved outcomes and decreased toxicity. Success for this strategy in breast cancer is exemplified by efficacy of trastuzumab in tumors overexpressing ERBB2 and endocrine therapy for tumors that are estrogen receptor positive. However, other effective treatments, including chemotherapy, immune checkpoint inhibitors, and CDK4/6 inhibitors are not associated with strong predictive biomarkers. Proteomics promises another tier of information that, when added to genomic and transcriptomic features (proteogenomics), may create new opportunities to improve both treatment precision and therapeutic hypotheses. Here, we review both mass spectrometry-based and antibody-dependent proteomics as complementary approaches. We highlight how these methods have contributed toward a more complete understanding of breast cancer and describe the potential to guide diagnosis and treatment more accurately.
Affiliation(s)
- Jonathan T Lei
- Lester and Sue Smith Breast Center and Dan L. Duncan Comprehensive Cancer Center, Baylor College of Medicine, Houston, Texas 77030, USA
| | - Eric J Jaehnig
- Lester and Sue Smith Breast Center and Dan L. Duncan Comprehensive Cancer Center, Baylor College of Medicine, Houston, Texas 77030, USA
| | - Hannah Smith
- Knight Cancer Institute, Oregon Health & Science University, Portland, Oregon 97239, USA
| | - Matthew V Holt
- Lester and Sue Smith Breast Center and Dan L. Duncan Comprehensive Cancer Center, Baylor College of Medicine, Houston, Texas 77030, USA
| | - Xi Li
- Knight Cancer Institute, Oregon Health & Science University, Portland, Oregon 97239, USA
| | - Meenakshi Anurag
- Lester and Sue Smith Breast Center and Dan L. Duncan Comprehensive Cancer Center, Baylor College of Medicine, Houston, Texas 77030, USA
| | - Matthew J Ellis
- Lester and Sue Smith Breast Center and Dan L. Duncan Comprehensive Cancer Center, Baylor College of Medicine, Houston, Texas 77030, USA
| | - Gordon B Mills
- Knight Cancer Institute, Oregon Health & Science University, Portland, Oregon 97239, USA
| | - Bing Zhang
- Lester and Sue Smith Breast Center and Dan L. Duncan Comprehensive Cancer Center, Baylor College of Medicine, Houston, Texas 77030, USA
| | - Marilyne Labrie
- Knight Cancer Institute, Oregon Health & Science University, Portland, Oregon 97239, USA
|
36
|
Zhang B, Bassani-Sternberg M. Current perspectives on mass spectrometry-based immunopeptidomics: the computational angle to tumor antigen discovery. J Immunother Cancer 2023; 11:e007073. [PMID: 37899131 PMCID: PMC10619091 DOI: 10.1136/jitc-2023-007073] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 07/21/2023] [Indexed: 10/31/2023] Open
Abstract
Identification of tumor antigens presented by the human leucocyte antigen (HLA) molecules is essential for the design of effective and safe cancer immunotherapies that rely on T cell recognition and killing of tumor cells. Mass spectrometry (MS)-based immunopeptidomics enables high-throughput, direct identification of HLA-bound peptides from a variety of cell lines, tumor tissues, and healthy tissues. It involves immunoaffinity purification of HLA complexes followed by MS profiling of the extracted peptides using data-dependent acquisition, data-independent acquisition, or targeted approaches. By incorporating DNA, RNA, and ribosome sequencing data into immunopeptidomics data analysis, the proteogenomic approach provides a powerful means for identifying tumor antigens encoded within the canonical open reading frames of annotated coding genes and non-canonical tumor antigens derived from presumably non-coding regions of our genome. We discuss emerging computational challenges in immunopeptidomics data analysis and tumor antigen identification, highlighting key considerations in the proteogenomics-based approach, including accurate DNA, RNA, and ribosomal sequencing data analysis, careful incorporation of predicted novel protein sequences into the reference protein database, special quality control in MS data analysis due to the expanded and heterogeneous search space, cancer-specificity determination, and immunogenicity prediction. Advancements in technology and computation are continually enabling us to identify tumor antigens with higher sensitivity and accuracy, paving the way toward the development of more effective cancer immunotherapies.
Affiliation(s)
- Bing Zhang
- Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, Texas, USA
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, USA
| | - Michal Bassani-Sternberg
- Ludwig Institute for Cancer Research, University of Lausanne, Lausanne, Switzerland
- Department of Oncology, Centre Hospitalier Universitaire Vaudois, Lausanne, Switzerland
- Agora Cancer Research Centre, Lausanne, Switzerland
|
37
|
Buljan M, Banaei-Esfahani A, Blattmann P, Meier-Abt F, Shao W, Vitek O, Tang H, Aebersold R. A computational framework for the inference of protein complex remodeling from whole-proteome measurements. Nat Methods 2023; 20:1523-1529. [PMID: 37749212 PMCID: PMC10555833 DOI: 10.1038/s41592-023-02011-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/30/2020] [Accepted: 08/16/2023] [Indexed: 09/27/2023]
Abstract
Protein complexes are responsible for the enactment of most cellular functions. For the protein complex to form and function, its subunits often need to be present at defined quantitative ratios. Typically, global changes in protein complex composition are assessed with experimental approaches that tend to be time consuming. Here, we have developed a computational algorithm for the detection of altered protein complexes based on the systematic assessment of subunit ratios from quantitative proteomic measurements. We applied it to measurements from breast cancer cell lines and patient biopsies and were able to identify strong remodeling of HDAC2 epigenetic complexes in more aggressive forms of cancer. The presented algorithm is available as an R package and enables the inference of changes in protein complex states by extracting functionally relevant information from bottom-up proteomic datasets.
Affiliation(s)
- Marija Buljan
- Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland.
- EMPA, Swiss Federal Laboratories for Materials Science and Technology, St Gallen, Switzerland.
- Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland.
| | - Amir Banaei-Esfahani
- Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
- Department of Pathology and Molecular Pathology, University Hospital Zurich, Zurich, Switzerland
| | - Peter Blattmann
- Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
- Idorsia Pharmaceuticals, Allschwil, Switzerland
| | - Fabienne Meier-Abt
- Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
- Department of Medical Oncology and Hematology, University and University Hospital Zurich, Zurich, Switzerland
- Institute of Medical Genetics, University of Zurich, Zurich, Switzerland
| | - Wenguang Shao
- Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
- State Key Laboratory of Microbial Metabolism, School of Life Science & Biotechnology, and Joint International Research Laboratory of Metabolic & Developmental Sciences, Shanghai Jiao Tong University, Shanghai 200240, China
| | - Olga Vitek
- Khoury College of Computer Sciences, Northeastern University, Boston, MA, USA
| | - Hua Tang
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
| | - Ruedi Aebersold
- Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland.
- Faculty of Science, University of Zurich, Zurich, Switzerland.
| |
|
38
|
Govender IS, Mokoena R, Stoychev S, Naicker P. Urine-HILIC: Automated Sample Preparation for Bottom-Up Urinary Proteome Profiling in Clinical Proteomics. Proteomes 2023; 11:29. [PMID: 37873871 PMCID: PMC10594433 DOI: 10.3390/proteomes11040029] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/31/2023] [Revised: 09/20/2023] [Accepted: 09/25/2023] [Indexed: 10/25/2023] Open
Abstract
Urine provides a diverse source of information related to a patient's health status and is ideal for clinical proteomics due to its ease of collection. To date, most methods for the preparation of urine samples lack the throughput required to analyze large clinical cohorts. To this end, we developed a novel workflow, urine-HILIC (uHLC), based on on-bead protein capture, clean-up, and digestion without the need for bottleneck processing steps such as protein precipitation or centrifugation. The workflow was applied to an acute kidney injury (AKI) pilot study. Urine from clinical samples and a pooled sample was subjected to automated sample preparation in a KingFisher™ Flex magnetic handling station using the novel approach based on MagReSyn® HILIC microspheres. For benchmarking, the pooled sample was also prepared using a published protocol based on an on-membrane (OM) protein capture and digestion workflow. Peptides were analyzed by LC-MS in data-independent acquisition (DIA) mode using a Dionex Ultimate 3000 UPLC coupled to a Sciex 5600 mass spectrometer. The data were searched in Spectronaut™ 17. Both workflows showed similar peptide and protein identifications in the pooled sample. The uHLC workflow was easier to set up and complete, with less hands-on time and fewer manual processing steps than the OM method. Lower peptide and protein coefficients of variation were observed in the uHLC technical replicates. Following statistical analysis, candidate protein markers were filtered at ≥8.35-fold change in abundance, ≥2 unique peptides, and ≤1% false discovery rate, revealing 121 significant, differentially abundant proteins, some of which have known associations with kidney injury. The pilot data derived using this novel workflow provide information on the urinary proteome of patients with AKI. Further exploration in a larger cohort using this novel high-throughput method is warranted.
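The candidate-marker filter described in this abstract can be illustrated with a minimal Python sketch. The three thresholds are taken from the abstract; everything else (the `ProteinResult` record, its field names, the `filter_candidates` helper, and the example values) is hypothetical and is not the authors' code or Spectronaut output. The sketch also assumes the 1% false discovery rate is available as a per-protein q-value, which the abstract does not state.

```python
from dataclasses import dataclass


@dataclass
class ProteinResult:
    """Hypothetical per-protein summary; field names are illustrative only."""
    protein: str
    fold_change: float    # absolute fold change in abundance (>= 1)
    unique_peptides: int  # number of unique peptides supporting the protein
    q_value: float        # assumed FDR-adjusted value for the protein


def filter_candidates(results, min_fold_change=8.35,
                      min_unique_peptides=2, max_fdr=0.01):
    """Keep proteins that pass all three thresholds quoted in the abstract."""
    return [
        r for r in results
        if r.fold_change >= min_fold_change
        and r.unique_peptides >= min_unique_peptides
        and r.q_value <= max_fdr
    ]


# Made-up example values, for illustration only.
example = [
    ProteinResult("P1", 10.2, 3, 0.004),  # passes all three criteria
    ProteinResult("P2", 9.0, 1, 0.002),   # too few unique peptides
    ProteinResult("P3", 2.1, 5, 0.001),   # fold change too small
    ProteinResult("P4", 8.35, 2, 0.01),   # exactly on every threshold
]
kept = [r.protein for r in filter_candidates(example)]
print(kept)
```

All three comparisons are inclusive, matching the "≥" and "≤" signs in the abstract, so a protein sitting exactly on each threshold is retained.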
Affiliation(s)
- Ireshyn Selvan Govender
- NextGen Health, Council for Scientific and Industrial Research, Pretoria 0001, South Africa
- ReSyn Biosciences, Edenvale 1610, South Africa
| | - Rethabile Mokoena
- NextGen Health, Council for Scientific and Industrial Research, Pretoria 0001, South Africa
- School of Molecular and Cellular Biology, University of the Witwatersrand, Johannesburg 2193, South Africa
| | - Stoyan Stoychev
- NextGen Health, Council for Scientific and Industrial Research, Pretoria 0001, South Africa
- ReSyn Biosciences, Edenvale 1610, South Africa
| | - Previn Naicker
- NextGen Health, Council for Scientific and Industrial Research, Pretoria 0001, South Africa
| |
|
39
|
Zhang X, Ruan C, Wang Y, Wang K, Liu X, Lyu J, Ye M. Integrated Protein Solubility Shift Assays for Comprehensive Drug Target Identification on a Proteome-Wide Scale. Anal Chem 2023; 95:13779-13787. [PMID: 37676971 DOI: 10.1021/acs.analchem.3c00072] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/09/2023]
Abstract
Target proteins are often stabilized after binding with a ligand and thereby typically become more resistant to denaturation. Based on this phenomenon, several methods without the need to covalently modify the ligand have been developed to identify target proteins for a specific ligand. These methods usually employ complicated workflows with high cost and limited throughput. Here, we develop an iso-pH shift assay (ipHSA) method, a proteome-wide target identification method that detects ligand-induced protein solubility shifts by precipitating proteins with a single concentration of acidic agent followed by protein quantification via data-independent acquisition (DIA). Using a pan-kinase inhibitor, staurosporine, we demonstrated that ipHSA increased throughput compared to the previously developed pH-dependent protein precipitation (pHDPP) method. ipHSA was found to have high complementarity in staurosporine target identification compared with the improved isothermal shift assay (iTSA) and isosolvent shift assay (iSSA) using DIA instead of tandem mass tags (TMTs) for quantification. To further improve target identification sensitivity, we developed an integrated protein solubility shift assay (IPSSA) by pooling the supernatants yielded from ipHSA, iTSA, and iSSA methods. IPSSA exhibited increased sensitivity in screening staurosporine targets by 38, 29, and 38% compared to individual methods. Increasing the number of replicate experiments further enhanced the sensitivity of target identification. Meanwhile, IPSSA also improved the throughput and reduced the cost compared with previous methods. As a fast and efficient tool for drug target identification, IPSSA is expected to have broad applications in the study of the mechanism of action.
Affiliation(s)
- Xiaolei Zhang
- CAS Key Laboratory of Separation Sciences for Analytical Chemistry, National Chromatographic R & A Center, Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian 116023, China
| | - Chengfei Ruan
- CAS Key Laboratory of Separation Sciences for Analytical Chemistry, National Chromatographic R & A Center, Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian 116023, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - Yan Wang
- CAS Key Laboratory of Separation Sciences for Analytical Chemistry, National Chromatographic R & A Center, Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian 116023, China
| | - Keyun Wang
- CAS Key Laboratory of Separation Sciences for Analytical Chemistry, National Chromatographic R & A Center, Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian 116023, China
| | - Xiaoyan Liu
- CAS Key Laboratory of Separation Sciences for Analytical Chemistry, National Chromatographic R & A Center, Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian 116023, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - Jiawen Lyu
- CAS Key Laboratory of Separation Sciences for Analytical Chemistry, National Chromatographic R & A Center, Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian 116023, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - Mingliang Ye
- CAS Key Laboratory of Separation Sciences for Analytical Chemistry, National Chromatographic R & A Center, Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian 116023, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| |
|
40
|
Ruan T, Li P, Wang H, Li T, Jiang G. Identification and Prioritization of Environmental Organic Pollutants: From an Analytical and Toxicological Perspective. Chem Rev 2023; 123:10584-10640. [PMID: 37531601 DOI: 10.1021/acs.chemrev.3c00056] [Citation(s) in RCA: 13] [Impact Index Per Article: 13.0] [Indexed: 08/04/2023]
Abstract
Exposure to environmental organic pollutants has triggered significant ecological impacts and adverse health outcomes, which have received substantial and increasing attention. The contribution of unidentified chemical components is considered the most significant knowledge gap in understanding the combined effects of pollutant mixtures. To address this issue, remarkable analytical breakthroughs have recently been made. In this review, the basic principles for recognizing environmental organic pollutants are overviewed. Complementary analytical methodologies (i.e., quantitative structure-activity relationship prediction, mass spectrometric nontarget screening, and effect-directed analysis) and experimental platforms are briefly described. The stages of technique development and/or essential parts of the analytical workflow for each of the methodologies are then reviewed. Finally, plausible technical paths and applications of future nontarget screening methods, interdisciplinary techniques for achieving toxicant identification, and burgeoning strategies for risk assessment of chemical cocktails are discussed.
Affiliation(s)
- Ting Ruan
- State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - Pengyang Li
- State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - Haotian Wang
- State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - Tingyu Li
- State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - Guibin Jiang
- State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| |
|
41
|
Hartman E, Scott AM, Karlsson C, Mohanty T, Vaara ST, Linder A, Malmström L, Malmström J. Interpreting biologically informed neural networks for enhanced proteomic biomarker discovery and pathway analysis. Nat Commun 2023; 14:5359. [PMID: 37660105 PMCID: PMC10475049 DOI: 10.1038/s41467-023-41146-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/13/2023] [Accepted: 08/22/2023] [Indexed: 09/04/2023] Open
Abstract
The incorporation of machine learning methods into proteomics workflows improves the identification of disease-relevant biomarkers and biological pathways. However, machine learning models, such as deep neural networks, typically suffer from a lack of interpretability. Here, we present a deep learning approach to combine biological pathway analysis and biomarker identification to increase the interpretability of proteomics experiments. Our approach integrates a priori knowledge of the relationships between proteins and biological pathways and biological processes into sparse neural networks to create biologically informed neural networks. We employ these networks to differentiate between clinical subphenotypes of septic acute kidney injury and COVID-19, as well as acute respiratory distress syndrome of different aetiologies. To gain biological insight into the complex syndromes, we utilize feature attribution methods to introspect the networks for the identification of proteins and pathways important for distinguishing between subtypes. The algorithms are implemented in a freely available open-source Python package (https://github.com/InfectionMedicineProteomics/BINN).
Affiliation(s)
- Erik Hartman
- Division of Infection Medicine, Department of Clinical Sciences Lund, Faculty of Medicine, Lund University, Lund, Sweden.
| | - Aaron M Scott
- Division of Infection Medicine, Department of Clinical Sciences Lund, Faculty of Medicine, Lund University, Lund, Sweden
| | - Christofer Karlsson
- Division of Infection Medicine, Department of Clinical Sciences Lund, Faculty of Medicine, Lund University, Lund, Sweden
| | - Tirthankar Mohanty
- Division of Infection Medicine, Department of Clinical Sciences Lund, Faculty of Medicine, Lund University, Lund, Sweden
| | - Suvi T Vaara
- Department of Perioperative and Intensive Care, University of Helsinki and Helsinki University Hospital, Helsinki, Finland
| | - Adam Linder
- Division of Infection Medicine, Department of Clinical Sciences Lund, Faculty of Medicine, Lund University, Lund, Sweden
| | - Lars Malmström
- Division of Infection Medicine, Department of Clinical Sciences Lund, Faculty of Medicine, Lund University, Lund, Sweden
| | - Johan Malmström
- Division of Infection Medicine, Department of Clinical Sciences Lund, Faculty of Medicine, Lund University, Lund, Sweden.
| |
|
42
|
Zhang F, Ge W, Huang L, Li D, Liu L, Dong Z, Xu L, Ding X, Zhang C, Sun Y, A J, Gao J, Guo T. A Comparative Analysis of Data Analysis Tools for Data-Independent Acquisition Mass Spectrometry. Mol Cell Proteomics 2023; 22:100623. [PMID: 37481071 PMCID: PMC10458344 DOI: 10.1016/j.mcpro.2023.100623] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 12/20/2022] [Revised: 06/12/2023] [Accepted: 07/18/2023] [Indexed: 07/24/2023] Open
Abstract
Data-independent acquisition (DIA) mass spectrometry-based proteomics generates reproducible proteome data. The complex processing of the DIA data has led to the development of multiple data analysis tools. In this study, we assessed the performance of five tools (OpenSWATH, EncyclopeDIA, Skyline, DIA-NN, and Spectronaut) using six DIA datasets obtained from TripleTOF, Orbitrap, and TimsTOF Pro instruments. By comparing identification and quantification metrics and examining shared and unique cross-tool identifications, we evaluated both library-based and library-free approaches. Our findings indicate that library-free approaches outperformed library-based methods when the spectral library had limited comprehensiveness. However, our results also suggest that constructing a comprehensive library still offers benefits for most DIA analyses. This study provides comprehensive guidance for DIA data analysis tools, benefiting both experienced and novice users of DIA-mass spectrometry technology.
Affiliation(s)
- Fangfei Zhang
- Center for Intelligent Proteomics, Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou, Zhejiang Province, China; Key Laboratory of Structural Biology of Zhejiang Province, School of Life Sciences, Westlake University, Hangzhou, Zhejiang Province, China.
| | - Weigang Ge
- Westlake Omics, Ltd, Hangzhou, Zhejiang Province, China
| | | | - Dan Li
- Westlake Omics, Ltd, Hangzhou, Zhejiang Province, China
| | - Lijuan Liu
- Westlake Omics, Ltd, Hangzhou, Zhejiang Province, China
| | - Zhen Dong
- Center for Intelligent Proteomics, Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou, Zhejiang Province, China; Key Laboratory of Structural Biology of Zhejiang Province, School of Life Sciences, Westlake University, Hangzhou, Zhejiang Province, China
| | - Luang Xu
- Center for Intelligent Proteomics, Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou, Zhejiang Province, China; Key Laboratory of Structural Biology of Zhejiang Province, School of Life Sciences, Westlake University, Hangzhou, Zhejiang Province, China
| | - Xuan Ding
- Center for Intelligent Proteomics, Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou, Zhejiang Province, China; Key Laboratory of Structural Biology of Zhejiang Province, School of Life Sciences, Westlake University, Hangzhou, Zhejiang Province, China
| | - Cheng Zhang
- Center for Intelligent Proteomics, Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou, Zhejiang Province, China; Key Laboratory of Structural Biology of Zhejiang Province, School of Life Sciences, Westlake University, Hangzhou, Zhejiang Province, China
| | - Yingying Sun
- Center for Intelligent Proteomics, Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou, Zhejiang Province, China; Key Laboratory of Structural Biology of Zhejiang Province, School of Life Sciences, Westlake University, Hangzhou, Zhejiang Province, China
| | - Jun A
- Center for Intelligent Proteomics, Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou, Zhejiang Province, China; Key Laboratory of Structural Biology of Zhejiang Province, School of Life Sciences, Westlake University, Hangzhou, Zhejiang Province, China
| | - Jinlong Gao
- Center for Intelligent Proteomics, Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou, Zhejiang Province, China; Key Laboratory of Structural Biology of Zhejiang Province, School of Life Sciences, Westlake University, Hangzhou, Zhejiang Province, China
| | - Tiannan Guo
- Center for Intelligent Proteomics, Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou, Zhejiang Province, China; Key Laboratory of Structural Biology of Zhejiang Province, School of Life Sciences, Westlake University, Hangzhou, Zhejiang Province, China.
| |
|
43
|
Zhang Y, Liao J, Le W, Wu G, Zhang W. Improving the Data Quality of Untargeted Metabolomics through a Targeted Data-Dependent Acquisition Based on an Inclusion List of Differential and Preidentified Ions. Anal Chem 2023; 95:12964-12973. [PMID: 37594469 DOI: 10.1021/acs.analchem.3c02888] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 08/19/2023]
Abstract
Metabolomics based on high-resolution mass spectrometry has become a powerful technique in biomedical research. The development of various analytical tools and online libraries has promoted the identification of biomarkers. However, how to make the mass spectrometer acquire more informative data is an important but underappreciated research topic. Herein, we combined full-scan and data-dependent acquisition (DDA) modes to develop a new targeted DDA based on the inclusion list of differential and preidentified ions (dpDDA). In this workflow, the MS1 datasets for statistical analysis and metabolite preidentification were first obtained using full-scan, and then the MS/MS datasets for metabolite identification were obtained using targeted DDA of quality control samples based on the inclusion list. Compared with current methods (DDA, data-independent acquisition, targeted DDA with time-staggered precursor ion list, and iterative exclusion DDA), dpDDA showed better stability, higher characteristic ion coverage, higher MS/MS coverage of differential metabolites, and higher-quality MS/MS spectra. Moreover, the same trend was verified in the analysis of large-scale clinical samples. More surprisingly, dpDDA can distinguish patients with different severities of coronary heart disease (CHD) based on the Canadian Cardiovascular Society angina classification, which could not be distinguished through conventional metabolomics data acquisition. Finally, dpDDA was employed to differentiate CHD patients from healthy controls, and targeted metabolomics confirmed that dpDDA could identify a more complete metabolic pathway network. At the same time, four unreported potential CHD biomarkers were identified, and the area under the receiver operating characteristic curve was greater than 0.85. These results show that dpDDA can expand the discovery of biomarkers based on metabolomics, more comprehensively explore the key metabolites and their association with diseases, and promote the development of precision medicine.
Affiliation(s)
- Yuhao Zhang
- School of Traditional Chinese Pharmacy, China Pharmaceutical University, Nanjing, Jiangsu 211198, China
| | - Jingyu Liao
- School of Pharmacy, Guangdong Pharmaceutical University, Guangdong 510006, China
| | - Wanqi Le
- Institute of Interdisciplinary Medical Sciences, Shanghai University of Traditional Chinese Medicine, Shanghai 201203, China
| | - Gaosong Wu
- Institute of Interdisciplinary Medical Sciences, Shanghai University of Traditional Chinese Medicine, Shanghai 201203, China
| | - Weidong Zhang
- School of Traditional Chinese Pharmacy, China Pharmaceutical University, Nanjing, Jiangsu 211198, China
- Institute of Interdisciplinary Medical Sciences, Shanghai University of Traditional Chinese Medicine, Shanghai 201203, China
- Institute of Medicinal Plant Development, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing 100193, China
| |
|
44
|
Allen C, Meinl R, Paez JS, Searle BC, Just S, Pino LK, Fondrie WE. nf-encyclopedia: A Cloud-Ready Pipeline for Chromatogram Library Data-Independent Acquisition Proteomics Workflows. J Proteome Res 2023; 22:2743-2749. [PMID: 37417926 DOI: 10.1021/acs.jproteome.2c00613] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/08/2023]
Abstract
Data-independent acquisition (DIA) mass spectrometry methods provide systematic and comprehensive quantification of the proteome; yet, relatively few open-source tools are available to analyze DIA proteomics experiments. Fewer still are tools that can leverage gas phase fractionated (GPF) chromatogram libraries to enhance the detection and quantification of peptides in these experiments. Here, we present nf-encyclopedia, an open-source NextFlow pipeline that connects three open-source tools, MSConvert, EncyclopeDIA, and MSstats, to analyze DIA proteomics experiments with or without chromatogram libraries. We demonstrate that nf-encyclopedia is reproducible when run on either a cloud platform or a local workstation and provides robust peptide and protein quantification. Additionally, we found that MSstats enhances protein-level quantitative performance over EncyclopeDIA alone. Finally, we benchmarked the ability of nf-encyclopedia to scale to large experiments in the cloud by leveraging the parallelization of compute resources. The nf-encyclopedia pipeline is available under a permissive Apache 2.0 license; run it on your desktop, cluster, or in the cloud: https://github.com/TalusBio/nf-encyclopedia.
Affiliation(s)
- Carolyn Allen
- Talus Bioscience, Seattle, Washington 98122, United States
| | - Rico Meinl
- Talus Bioscience, Seattle, Washington 98122, United States
| | | | - Brian C Searle
- Department of Biomedical Informatics, The Ohio State University, Columbus, Ohio 43210, United States
- Pelotonia Institute for Immuno-Oncology, The Ohio State University Comprehensive Cancer Center, Columbus, Ohio 43210, United States
- Proteome Software, Inc., Portland, Oregon 97219, United States
| | - Seth Just
- Proteome Software, Inc., Portland, Oregon 97219, United States
| | - Lindsay K Pino
- Talus Bioscience, Seattle, Washington 98122, United States
| | | |
|
45
|
Yi X, Zhu J, Liu W, Peng L, Lu C, Sun P, Huang L, Nie X, Huang S, Guo T, Zhu Y. Proteome Landscapes of Human Hepatocellular Carcinoma and Intrahepatic Cholangiocarcinoma. Mol Cell Proteomics 2023; 22:100604. [PMID: 37353004 PMCID: PMC10413158 DOI: 10.1016/j.mcpro.2023.100604] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 09/30/2022] [Revised: 04/12/2023] [Accepted: 06/20/2023] [Indexed: 06/25/2023] Open
Abstract
Liver cancer is among the leading causes of cancer mortality worldwide. In particular, hepatocellular carcinoma (HCC) and intrahepatic cholangiocarcinoma (CCA) have been extensively investigated from the perspective of tumor biology. However, a comprehensive and systematic understanding of the molecular characteristics of HCC and CCA remains absent. Here, we characterized the proteome landscapes of HCC and CCA using the data-independent acquisition (DIA) mass spectrometry (MS) method. By comparing the quantitative proteomes of HCC and CCA, we found several differences between the two cancer types. In particular, we found abnormal lipid metabolism in HCC and activated extracellular matrix-related pathways in CCA. We next developed a three-protein classifier to distinguish CCA from HCC, achieving an area under the curve (AUC) of 0.92 and an accuracy of 90% in an independent validation cohort of 51 patients. The distinct molecular characteristics of HCC and CCA presented in this study provide new insights into the tumor biology of these two major primary liver cancers. Our findings may help develop more efficient diagnostic approaches and new targeted drug treatments.
Affiliation(s)
- Xiao Yi
- Center for ProtTalks, Westlake Laboratory of Life Sciences and Biomedicine, Key Laboratory of Structural Biology of Zhejiang Province, School of Life Sciences, Westlake University, Hangzhou, Zhejiang, China; Institute of Basic Medical Sciences, Westlake Institute for Advanced Study, Hangzhou, Zhejiang, China
| | - Jiang Zhu
- Center for Stem Cell Research and Application, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, China; Key laboratory of Biological Targeted Therapy, The Ministry of Education, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, China
| | - Wei Liu
- Westlake Omics (Hangzhou) Biotechnology Co, Ltd, Hangzhou, Zhejiang, China
| | - Li Peng
- Department of Pathology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, China
| | - Cong Lu
- Center for Stem Cell Research and Application, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, China; Key laboratory of Biological Targeted Therapy, The Ministry of Education, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, China
| | - Ping Sun
- Department of Hepatobiliary Surgery, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, China
| | - Lingling Huang
- Westlake Omics (Hangzhou) Biotechnology Co, Ltd, Hangzhou, Zhejiang, China
| | - Xiu Nie
- Department of Pathology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, China
| | - Shi'ang Huang
- Center for Stem Cell Research and Application, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, China; Key laboratory of Biological Targeted Therapy, The Ministry of Education, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, China
| | - Tiannan Guo
- Center for ProtTalks, Westlake Laboratory of Life Sciences and Biomedicine, Key Laboratory of Structural Biology of Zhejiang Province, School of Life Sciences, Westlake University, Hangzhou, Zhejiang, China; Institute of Basic Medical Sciences, Westlake Institute for Advanced Study, Hangzhou, Zhejiang, China.
| | - Yi Zhu
- Center for ProtTalks, Westlake Laboratory of Life Sciences and Biomedicine, Key Laboratory of Structural Biology of Zhejiang Province, School of Life Sciences, Westlake University, Hangzhou, Zhejiang, China; Institute of Basic Medical Sciences, Westlake Institute for Advanced Study, Hangzhou, Zhejiang, China.
| |
|
46
|
Xue Z, Zhu T, Zhang F, Zhang C, Xiang N, Qian L, Yi X, Sun Y, Liu W, Cai X, Wang L, Dai X, Yue L, Li L, Pham TV, Piersma SR, Xiao Q, Luo M, Lu C, Zhu J, Zhao Y, Wang G, Xiao J, Liu T, Liu Z, He Y, Wu Q, Gong T, Zhu J, Zheng Z, Ye J, Li Y, Jimenez CR, A J, Guo T. DPHL v.2: An updated and comprehensive DIA pan-human assay library for quantifying more than 14,000 proteins. PATTERNS (NEW YORK, N.Y.) 2023; 4:100792. [PMID: 37521047 PMCID: PMC10382975 DOI: 10.1016/j.patter.2023.100792] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 01/06/2023] [Revised: 04/29/2023] [Accepted: 06/12/2023] [Indexed: 08/01/2023]
Abstract
A comprehensive pan-human spectral library is critical for biomarker discovery using mass spectrometry (MS)-based proteomics. DPHL v.1, a previous pan-human library built from 1,096 data-dependent acquisition (DDA) MS data files of 16 human tissue types, allows quantification of 10,943 proteins. Here, we generated DPHL v.2 from 1,608 DDA-MS data files. These included 586 DDA-MS files acquired from 18 tissue types, while 1,022 files were derived from DPHL v.1. DPHL v.2 thus comprises data from 24 sample types, including several cancer types (lung, breast, kidney, and prostate cancer, among others). We generated four variants of DPHL v.2 to include semi-tryptic peptides and protein isoforms. DPHL v.2 was then applied to two colorectal cancer cohorts. The numbers of identified and significantly dysregulated proteins increased by at least 21.7% and 14.2%, respectively, compared with DPHL v.1. Our findings show that the increased human proteome coverage of DPHL v.2 provides larger pools of potential protein biomarkers.
Affiliation(s)
- Zhangzhi Xue
- iMarker Lab, Westlake Laboratory of Life Sciences and Biomedicine, Key Laboratory of Structural Biology of Zhejiang Province, School of Life Sciences, Westlake University, Hangzhou, Zhejiang Province 310024, China
- Institute of Basic Medical Sciences, Westlake Institute for Advanced Study, Hangzhou, Zhejiang Province 310024, China
- Research Center for Industries of the Future, Westlake University, 600 Dunyu Road, Hangzhou, Zhejiang 310030, China
| | - Tiansheng Zhu
- iMarker Lab, Westlake Laboratory of Life Sciences and Biomedicine, Key Laboratory of Structural Biology of Zhejiang Province, School of Life Sciences, Westlake University, Hangzhou, Zhejiang Province 310024, China
- Institute of Basic Medical Sciences, Westlake Institute for Advanced Study, Hangzhou, Zhejiang Province 310024, China
- Research Center for Industries of the Future, Westlake University, 600 Dunyu Road, Hangzhou, Zhejiang 310030, China
- College of Mathematics and Computer Science, Zhejiang A & F University, Hangzhou, Zhejiang 311300, China
| | - Fangfei Zhang
- iMarker Lab, Westlake Laboratory of Life Sciences and Biomedicine, Key Laboratory of Structural Biology of Zhejiang Province, School of Life Sciences, Westlake University, Hangzhou, Zhejiang Province 310024, China
- Institute of Basic Medical Sciences, Westlake Institute for Advanced Study, Hangzhou, Zhejiang Province 310024, China
- Research Center for Industries of the Future, Westlake University, 600 Dunyu Road, Hangzhou, Zhejiang 310030, China
| | - Cheng Zhang
- iMarker Lab, Westlake Laboratory of Life Sciences and Biomedicine, Key Laboratory of Structural Biology of Zhejiang Province, School of Life Sciences, Westlake University, Hangzhou, Zhejiang Province 310024, China
- Institute of Basic Medical Sciences, Westlake Institute for Advanced Study, Hangzhou, Zhejiang Province 310024, China
- Research Center for Industries of the Future, Westlake University, 600 Dunyu Road, Hangzhou, Zhejiang 310030, China
| | - Nan Xiang
- Westlake Omics (Hangzhou) Biotechnology Co., Ltd., Hangzhou 310024, China
| | - Liujia Qian
- iMarker Lab, Westlake Laboratory of Life Sciences and Biomedicine, Key Laboratory of Structural Biology of Zhejiang Province, School of Life Sciences, Westlake University, Hangzhou, Zhejiang Province 310024, China
- Institute of Basic Medical Sciences, Westlake Institute for Advanced Study, Hangzhou, Zhejiang Province 310024, China
- Research Center for Industries of the Future, Westlake University, 600 Dunyu Road, Hangzhou, Zhejiang 310030, China
| | - Xiao Yi
- Westlake Omics (Hangzhou) Biotechnology Co., Ltd., Hangzhou 310024, China
| | - Yaoting Sun
- iMarker Lab, Westlake Laboratory of Life Sciences and Biomedicine, Key Laboratory of Structural Biology of Zhejiang Province, School of Life Sciences, Westlake University, Hangzhou, Zhejiang Province 310024, China
- Institute of Basic Medical Sciences, Westlake Institute for Advanced Study, Hangzhou, Zhejiang Province 310024, China
- Research Center for Industries of the Future, Westlake University, 600 Dunyu Road, Hangzhou, Zhejiang 310030, China
| | - Wei Liu
- Westlake Omics (Hangzhou) Biotechnology Co., Ltd., Hangzhou 310024, China
| | - Xue Cai
- iMarker Lab, Westlake Laboratory of Life Sciences and Biomedicine, Key Laboratory of Structural Biology of Zhejiang Province, School of Life Sciences, Westlake University, Hangzhou, Zhejiang Province 310024, China
- Institute of Basic Medical Sciences, Westlake Institute for Advanced Study, Hangzhou, Zhejiang Province 310024, China
- Research Center for Industries of the Future, Westlake University, 600 Dunyu Road, Hangzhou, Zhejiang 310030, China
| | - Linyan Wang
- Department of Ophthalmology, The Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang 310000, China
| | - Xizhe Dai
- Department of Ophthalmology, The Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang 310000, China
| | - Liang Yue
- iMarker Lab, Westlake Laboratory of Life Sciences and Biomedicine, Key Laboratory of Structural Biology of Zhejiang Province, School of Life Sciences, Westlake University, Hangzhou, Zhejiang Province 310024, China
- Institute of Basic Medical Sciences, Westlake Institute for Advanced Study, Hangzhou, Zhejiang Province 310024, China
- Research Center for Industries of the Future, Westlake University, 600 Dunyu Road, Hangzhou, Zhejiang 310030, China
| | - Lu Li
- iMarker Lab, Westlake Laboratory of Life Sciences and Biomedicine, Key Laboratory of Structural Biology of Zhejiang Province, School of Life Sciences, Westlake University, Hangzhou, Zhejiang Province 310024, China
- Institute of Basic Medical Sciences, Westlake Institute for Advanced Study, Hangzhou, Zhejiang Province 310024, China
- Research Center for Industries of the Future, Westlake University, 600 Dunyu Road, Hangzhou, Zhejiang 310030, China
| | - Thang V. Pham
- OncoProteomics Laboratory, Department of Medical Oncology, VU University Medical Center, VU University, 1011 Amsterdam, the Netherlands
| | - Sander R. Piersma
- OncoProteomics Laboratory, Department of Medical Oncology, VU University Medical Center, VU University, 1011 Amsterdam, the Netherlands
| | - Qi Xiao
- iMarker Lab, Westlake Laboratory of Life Sciences and Biomedicine, Key Laboratory of Structural Biology of Zhejiang Province, School of Life Sciences, Westlake University, Hangzhou, Zhejiang Province 310024, China
- Institute of Basic Medical Sciences, Westlake Institute for Advanced Study, Hangzhou, Zhejiang Province 310024, China
- Research Center for Industries of the Future, Westlake University, 600 Dunyu Road, Hangzhou, Zhejiang 310030, China
| | - Meng Luo
- Songjiang Research Institute and Songjiang Hospital, Department of Anatomy and Physiology, College of Basic Medical Science, Shanghai Jiao Tong University School of Medicine, Shanghai 201600, China
| | - Cong Lu
- Center for Stem Cell Research and Application, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei 430074, China
| | - Jiang Zhu
- Center for Stem Cell Research and Application, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei 430074, China
| | - Yongfu Zhao
- Department of General Surgery, The Second Hospital of Dalian Medical University, Dalian, Liaoning Province 116044, China
| | - Guangzhi Wang
- Department of General Surgery, The Second Hospital of Dalian Medical University, Dalian, Liaoning Province 116044, China
| | - Junhong Xiao
- Department of General Surgery, The Second Hospital of Dalian Medical University, Dalian, Liaoning Province 116044, China
| | - Tong Liu
- Harbin Medical University Cancer Hospital, Harbin, Heilongjiang Province 150081, China
| | - Zhiyu Liu
- Department of Urology, The Second Hospital of Dalian Medical University, No.467 Zhongshan Road, Dalian, Liaoning Province 116044, China
| | - Yi He
- Department of Urology, The Second Hospital of Dalian Medical University, No.467 Zhongshan Road, Dalian, Liaoning Province 116044, China
| | - Qijun Wu
- Department of Clinical Epidemiology, Shengjing Hospital of China Medical University, Shenyang, Liaoning Province 110000, China
| | - Tingting Gong
- Department of Clinical Epidemiology, Shengjing Hospital of China Medical University, Shenyang, Liaoning Province 110000, China
| | - Jianqin Zhu
- The Cancer Hospital of the University of Chinese Academy of Sciences (Zhejiang Cancer Hospital), Hangzhou, Zhejiang 310000, China
- Institute of Basic Medicine and Cancer (IBMC), Chinese Academy of Sciences, Hangzhou, Zhejiang 310000, China
| | - Zhiguo Zheng
- The Cancer Hospital of the University of Chinese Academy of Sciences (Zhejiang Cancer Hospital), Hangzhou, Zhejiang 310000, China
- Institute of Basic Medicine and Cancer (IBMC), Chinese Academy of Sciences, Hangzhou, Zhejiang 310000, China
| | - Juan Ye
- Department of Ophthalmology, The Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang 310000, China
| | - Yan Li
- Songjiang Research Institute and Songjiang Hospital, Department of Anatomy and Physiology, College of Basic Medical Science, Shanghai Jiao Tong University School of Medicine, Shanghai 201600, China
| | - Connie R. Jimenez
- OncoProteomics Laboratory, Department of Medical Oncology, VU University Medical Center, VU University, 1011 Amsterdam, the Netherlands
| | - Jun A
- iMarker Lab, Westlake Laboratory of Life Sciences and Biomedicine, Key Laboratory of Structural Biology of Zhejiang Province, School of Life Sciences, Westlake University, Hangzhou, Zhejiang Province 310024, China
- Institute of Basic Medical Sciences, Westlake Institute for Advanced Study, Hangzhou, Zhejiang Province 310024, China
- Research Center for Industries of the Future, Westlake University, 600 Dunyu Road, Hangzhou, Zhejiang 310030, China
| | - Tiannan Guo
- iMarker Lab, Westlake Laboratory of Life Sciences and Biomedicine, Key Laboratory of Structural Biology of Zhejiang Province, School of Life Sciences, Westlake University, Hangzhou, Zhejiang Province 310024, China
- Institute of Basic Medical Sciences, Westlake Institute for Advanced Study, Hangzhou, Zhejiang Province 310024, China
- Research Center for Industries of the Future, Westlake University, 600 Dunyu Road, Hangzhou, Zhejiang 310030, China
47
Yu F, Teo GC, Kong AT, Fröhlich K, Li GX, Demichev V, Nesvizhskii AI. Analysis of DIA proteomics data using MSFragger-DIA and FragPipe computational platform. Nat Commun 2023; 14:4154. [PMID: 37438352 PMCID: PMC10338508 DOI: 10.1038/s41467-023-39869-5] [Citation(s) in RCA: 30] [Impact Index Per Article: 30.0] [Received: 11/07/2022] [Accepted: 06/28/2023] [Indexed: 07/14/2023] Open
Abstract
Liquid chromatography (LC) coupled with data-independent acquisition (DIA) mass spectrometry (MS) has been increasingly used in quantitative proteomics studies. Here, we present a fast and sensitive approach for direct peptide identification from DIA data, MSFragger-DIA, which leverages the unmatched speed of the fragment ion indexing-based search engine MSFragger. Unlike most existing methods, MSFragger-DIA conducts a database search of the DIA tandem mass (MS/MS) spectra prior to spectral feature detection and peak tracing across the LC dimension. To streamline the analysis of DIA data and enable easy reproducibility, we integrate MSFragger-DIA into the FragPipe computational platform for seamless support of peptide identification and spectral library building from DIA, data-dependent acquisition (DDA), or both data types combined. We compare MSFragger-DIA with other DIA tools, such as the DIA-Umpire-based workflow in FragPipe, Spectronaut, DIA-NN library-free, and MaxDIA. We demonstrate the fast, sensitive, and accurate performance of MSFragger-DIA across a variety of sample types and data acquisition schemes, including single-cell proteomics, phosphoproteomics, and large-scale tumor proteome profiling studies.
Affiliation(s)
- Fengchao Yu
- Department of Pathology, University of Michigan, Ann Arbor, MI, USA.
| | - Guo Ci Teo
- Department of Pathology, University of Michigan, Ann Arbor, MI, USA
| | - Andy T Kong
- Department of Pathology, University of Michigan, Ann Arbor, MI, USA
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
| | - Klemens Fröhlich
- Proteomics Core Facility, Biozentrum, University of Basel, Basel, Switzerland
| | - Ginny Xiaohe Li
- Department of Pathology, University of Michigan, Ann Arbor, MI, USA
| | - Vadim Demichev
- Department of Biochemistry, Charité - Universitätsmedizin Berlin, Berlin, Germany
- Department of Biochemistry, University of Cambridge, Cambridge, UK
| | - Alexey I Nesvizhskii
- Department of Pathology, University of Michigan, Ann Arbor, MI, USA.
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA.
48
Mehta S, Bernt M, Chambers M, Fahrner M, Föll MC, Gruening B, Horro C, Johnson JE, Loux V, Rajczewski AT, Schilling O, Vandenbrouck Y, Gustafsson OJR, Thang WCM, Hyde C, Price G, Jagtap PD, Griffin TJ. A Galaxy of informatics resources for MS-based proteomics. Expert Rev Proteomics 2023; 20:251-266. [PMID: 37787106 DOI: 10.1080/14789450.2023.2265062] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 07/05/2023] [Accepted: 09/06/2023] [Indexed: 10/04/2023]
Abstract
INTRODUCTION: Continuous advances in mass spectrometry (MS) technologies have enabled deeper and more reproducible proteome characterization and a better understanding of biological systems when integrated with other 'omics data. Bioinformatic resources meeting the analysis requirements of increasingly complex MS-based proteomic data and associated multi-omic data are critically needed. These requirements include availability of software that spans diverse types of analyses, scalability for large-scale, compute-intensive applications, and mechanisms to ease adoption of the software. AREAS COVERED: The Galaxy ecosystem meets these requirements by offering a multitude of open-source tools for MS-based proteomics analyses and applications, all in an adaptable, scalable, and accessible computing environment. A thriving global community maintains this software and the associated training resources to empower researcher-driven analyses. EXPERT OPINION: The community-supported Galaxy ecosystem remains a crucial contributor to basic biological and clinical studies using MS-based proteomics. In addition to the current status of Galaxy-based resources, we describe ongoing developments for meeting emerging challenges in MS-based proteomic informatics. We hope this review will catalyze increased use of Galaxy by researchers employing MS-based proteomics and inspire software developers to join the community and implement new tools, workflows, and associated training content that will add further value to this already rich ecosystem.
Affiliation(s)
- Subina Mehta
- Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, MN, USA
| | - Matthias Bernt
- Helmholtz Centre for Environmental Research - UFZ, Department Computational Biology, Leipzig, Germany
| | | | - Matthias Fahrner
- Institute for Surgical Pathology, Medical Center - University of Freiburg, Freiburg, Germany
- German Cancer Consortium (DKTK) and German Cancer Research Center (DKFZ), Heidelberg, Germany
| | - Melanie Christine Föll
- Institute for Surgical Pathology, Medical Center - University of Freiburg, Freiburg, Germany
- German Cancer Consortium (DKTK) and German Cancer Research Center (DKFZ), Heidelberg, Germany
- Khoury College of Computer Sciences, Northeastern University, Boston, MA, USA
| | - Bjoern Gruening
- Bioinformatics Group, Department of Computer Science, Albert-Ludwigs-University Freiburg, Freiburg, Germany
| | - Carlos Horro
- Proteomics Unit, Department of Biomedicine, University of Bergen, Bergen, Norway
- Computational Biology Unit, Department of Informatics, University of Bergen, Bergen, Norway
| | - James E Johnson
- Minnesota Supercomputing Institute, University of Minnesota, Minneapolis, MN, USA
| | - Valentin Loux
- Université Paris-Saclay, INRAE, MaIAGE, Jouy-en-Josas, France
- Université Paris-Saclay, INRAE, BioinfOmics, MIGALE bioinformatics facility, Jouy-en-Josas, France
| | - Andrew T Rajczewski
- Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, MN, USA
| | - Oliver Schilling
- Institute for Surgical Pathology, Medical Center - University of Freiburg, Freiburg, Germany
- German Cancer Consortium (DKTK) and German Cancer Research Center (DKFZ), Heidelberg, Germany
| | | | | | - W C Mike Thang
- Queensland Cyber Infrastructure Foundation (QCIF), Australia
- Institute of Molecular Bioscience, University of Queensland, St Lucia, Australia
| | - Cameron Hyde
- Queensland Cyber Infrastructure Foundation (QCIF), Australia
- Sippy Downs, University of the Sunshine Coast, Australia
| | - Gareth Price
- Queensland Cyber Infrastructure Foundation (QCIF), Australia
- Institute of Molecular Bioscience, University of Queensland, St Lucia, Australia
| | - Pratik D Jagtap
- Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, MN, USA
| | - Timothy J Griffin
- Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, MN, USA
49
He Q, Zhong CQ, Li X, Guo H, Li Y, Gao M, Yu R, Liu X, Zhang F, Guo D, Ye F, Guo T, Shuai J, Han J. Dear-DIA XMBD: Deep Autoencoder Enables Deconvolution of Data-Independent Acquisition Proteomics. RESEARCH (WASHINGTON, D.C.) 2023; 6:0179. [PMID: 37377457 PMCID: PMC10292580 DOI: 10.34133/research.0179] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 12/08/2022] [Accepted: 06/01/2023] [Indexed: 06/29/2023]
Abstract
Data-independent acquisition (DIA) technology for protein identification from mass spectrometry, along with its related algorithms, is developing rapidly. Spectrum-centric analysis of DIA data without a spectral library built from data-dependent acquisition data represents a promising direction. In this paper, we propose an untargeted analysis method, Dear-DIA XMBD, for direct analysis of DIA data. Dear-DIA XMBD first integrates a deep variational autoencoder and triplet loss to learn representations of the extracted fragment ion chromatograms, then uses the k-means clustering algorithm to aggregate fragments with similar representations into the same classes, and finally builds inverted index tables, between precursors and peptides and between fragments and peptides, to determine the precursors of the fragment clusters. We show that Dear-DIA XMBD achieves superior performance on highly complex DIA data from different species acquired on different instrument platforms. Dear-DIA XMBD is publicly available at https://github.com/jianweishuai/Dear-DIA-XMBD.
Affiliation(s)
- Qingzu He
- Department of Physics, and Fujian Provincial Key Laboratory for Soft Functional Materials Research, Xiamen University, Xiamen 361005, China
- Oujiang Laboratory (Zhejiang Lab for Regenerative Medicine, Vision and Brain Health) and Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou, Zhejiang 325001, China
| | - Chuan-Qi Zhong
- School of Life Sciences, Xiamen University, Xiamen 361102, China
- State Key Laboratory of Cellular Stress Biology, Innovation Center for Cell Signaling Network, Xiamen 361102, China
| | - Xiang Li
- Department of Physics, and Fujian Provincial Key Laboratory for Soft Functional Materials Research, Xiamen University, Xiamen 361005, China
- State Key Laboratory of Cellular Stress Biology, Innovation Center for Cell Signaling Network, Xiamen 361102, China
| | - Huan Guo
- Department of Physics, and Fujian Provincial Key Laboratory for Soft Functional Materials Research, Xiamen University, Xiamen 361005, China
| | - Yiming Li
- Department of Physics, and Fujian Provincial Key Laboratory for Soft Functional Materials Research, Xiamen University, Xiamen 361005, China
| | - Mingxuan Gao
- Department of Computer Science, Xiamen University, Xiamen 361005, China
| | - Rongshan Yu
- Department of Computer Science, Xiamen University, Xiamen 361005, China
- National Institute for Data Science in Health and Medicine, School of Medicine, Xiamen University, Xiamen 361102, China
| | - Xianming Liu
- Bruker (Beijing) Scientific Technology Co. Ltd., Beijing, China
| | - Fangfei Zhang
- Westlake Laboratory of Life Sciences and Biomedicine, Key Laboratory of Structural Biology of Zhejiang Province, School of Life Sciences, Westlake University, 18 Shilongshan Road, Hangzhou 310024, China
- Institute of Basic Medical Sciences, Westlake Institute for Advanced Study, 18 Shilongshan Road, Hangzhou 310024, China
| | - Donghui Guo
- Department of Electronic Engineering, Xiamen University, Xiamen 361005, China
| | - Fangfu Ye
- Oujiang Laboratory (Zhejiang Lab for Regenerative Medicine, Vision and Brain Health) and Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou, Zhejiang 325001, China
| | - Tiannan Guo
- Westlake Laboratory of Life Sciences and Biomedicine, Key Laboratory of Structural Biology of Zhejiang Province, School of Life Sciences, Westlake University, 18 Shilongshan Road, Hangzhou 310024, China
- Institute of Basic Medical Sciences, Westlake Institute for Advanced Study, 18 Shilongshan Road, Hangzhou 310024, China
- Westlake Omics Ltd., Yunmeng Road 1, Hangzhou, China
| | - Jianwei Shuai
- Department of Physics, and Fujian Provincial Key Laboratory for Soft Functional Materials Research, Xiamen University, Xiamen 361005, China
- Oujiang Laboratory (Zhejiang Lab for Regenerative Medicine, Vision and Brain Health) and Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou, Zhejiang 325001, China
- State Key Laboratory of Cellular Stress Biology, Innovation Center for Cell Signaling Network, Xiamen 361102, China
- National Institute for Data Science in Health and Medicine, School of Medicine, Xiamen University, Xiamen 361102, China
| | - Jiahuai Han
- School of Life Sciences, Xiamen University, Xiamen 361102, China
- State Key Laboratory of Cellular Stress Biology, Innovation Center for Cell Signaling Network, Xiamen 361102, China
- National Institute for Data Science in Health and Medicine, School of Medicine, Xiamen University, Xiamen 361102, China
50
Scott AM, Karlsson C, Mohanty T, Hartman E, Vaara ST, Linder A, Malmström J, Malmström L. Generalized precursor prediction boosts identification rates and accuracy in mass spectrometry based proteomics. Commun Biol 2023; 6:628. [PMID: 37301900 PMCID: PMC10257694 DOI: 10.1038/s42003-023-04977-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/19/2022] [Accepted: 05/24/2023] [Indexed: 06/12/2023] Open
Abstract
Data-independent acquisition mass spectrometry (DIA-MS) has recently emerged as an important method for the identification of blood-based biomarkers. However, the large search space required to identify novel biomarkers from the plasma proteome can introduce a high rate of false positives that compromise the accuracy of false discovery rates (FDR) using existing validation methods. We developed a generalized precursor scoring (GPS) method trained on 2.75 million precursors that can confidently control FDR while increasing the number of identified proteins in DIA-MS independent of the search space. We demonstrate how GPS can generalize to new data, increase protein identification rates, and increase the overall quantitative accuracy. Finally, we apply GPS to the identification of blood-based biomarkers and identify, from undepleted plasma, a panel of proteins that discriminates between subphenotypes of septic acute kidney injury with high accuracy, showcasing the utility of GPS in discovery DIA-MS proteomics.
Affiliation(s)
- Aaron M Scott
- Division of Infection Medicine, Department of Clinical Sciences, Lund University, Lund, Sweden.
| | - Christofer Karlsson
- Division of Infection Medicine, Department of Clinical Sciences, Lund University, Lund, Sweden
| | - Tirthankar Mohanty
- Division of Infection Medicine, Department of Clinical Sciences, Lund University, Lund, Sweden
| | - Erik Hartman
- Division of Infection Medicine, Department of Clinical Sciences, Lund University, Lund, Sweden
| | - Suvi T Vaara
- Division of Anaesthesia and Intensive Care Medicine Department of Surgery, Intensive Care Units, Helsinki University Central Hospital, Box 340, 00029 HUS, Helsinki, Finland
| | - Adam Linder
- Division of Infection Medicine, Department of Clinical Sciences, Lund University, Lund, Sweden
| | - Johan Malmström
- Division of Infection Medicine, Department of Clinical Sciences, Lund University, Lund, Sweden
| | - Lars Malmström
- Division of Infection Medicine, Department of Clinical Sciences, Lund University, Lund, Sweden.