1. Wu E, Xu G, Xie D, Qiao L. Data-independent acquisition in metaproteomics. Expert Rev Proteomics 2024. [PMID: 39152734 DOI: 10.1080/14789450.2024.2394190] [Received: 03/11/2024] [Revised: 08/12/2024] [Accepted: 08/14/2024] [Indexed: 08/19/2024]
Abstract
INTRODUCTION Metaproteomics offers insights into the function of complex microbial communities and can also reveal microbe-microbe and host-microbe interactions. Data-independent acquisition (DIA) mass spectrometry is an emerging technology that holds great potential for deep and accurate metaproteomics with higher reproducibility, yet it still faces a series of challenges due to the inherent complexity of metaproteomics and DIA data. AREAS COVERED This review offers an overview of DIA metaproteomics approaches, covering aspects such as database construction, search strategy, and data analysis tools. Several current DIA metaproteomics studies are presented to illustrate the procedures. Important ongoing challenges are highlighted, and future perspectives of DIA methods for metaproteomics analysis are discussed. Cited references were collected from searches of Google Scholar and PubMed. EXPERT OPINION Considering the inherent complexity of DIA metaproteomics data, data analysis strategies specifically designed for its interpretation are imperative. From this point of view, we anticipate that deep learning methods and de novo sequencing methods will become more prevalent in the future, potentially improving protein coverage in metaproteomics. Moreover, the advancement of metaproteomics also depends on the development of sample preparation methods and data analysis strategies. These factors are key to unlocking the full potential of metaproteomics.
Affiliation(s)
- Enhui Wu
  - Department of Thoracic Surgery, Shanghai Pulmonary Hospital, Tongji University School of Medicine, Shanghai, China
  - Department of Chemistry, Fudan University, Shanghai, China
- Guanyang Xu
  - Department of Chemistry, Fudan University, Shanghai, China
- Dong Xie
  - Department of Thoracic Surgery, Shanghai Pulmonary Hospital, Tongji University School of Medicine, Shanghai, China
- Liang Qiao
  - Department of Chemistry, Fudan University, Shanghai, China
2. Lapcik P, Synkova K, Janacova L, Bouchalova P, Potesil D, Nenutil R, Bouchal P. A hybrid DDA/DIA-PASEF based assay library for a deep proteotyping of triple-negative breast cancer. Sci Data 2024; 11:794. [PMID: 39025866 PMCID: PMC11258311 DOI: 10.1038/s41597-024-03632-2] [Received: 02/14/2024] [Accepted: 07/10/2024] [Indexed: 07/20/2024] [Open Access]
Abstract
Triple-negative breast cancer (TNBC) is the most aggressive subtype of breast cancer, and deeper proteome coverage is needed for its molecular characterization. We present a comprehensive library of targeted mass spectrometry assays specific for TNBC and demonstrate its applicability. Proteins were extracted from 105 TNBC tissues and digested. Aliquots were pooled, fractionated using hydrophilic chromatography, and analyzed by LC-MS/MS in data-dependent acquisition (DDA) parallel accumulation-serial fragmentation (PASEF) mode on a timsTOF Pro LC-MS system. Sixteen individual lysates were analyzed in data-independent acquisition (DIA)-PASEF mode. The hybrid library was generated in Spectronaut software and covers 244,464 precursors, 168,006 peptides and 11,564 protein groups (FDR = 1%). Applying our library to a pilot quantitative analysis of 16 tissues increased identification numbers in Spectronaut 18.5 and DIA-NN 1.8.1 compared to the library-free setting, with Spectronaut achieving the best results: 190,310 precursors, 140,566 peptides, and 10,463 protein groups. In conclusion, we introduce an assay library that offers the deepest coverage of the TNBC proteome to date. The TNBC library is available via the PRIDE repository (PXD047793).
Grants
- NU22-08-00230 Ministerstvo Zdravotnictví Ceské Republiky (Ministry of Health of the Czech Republic)
- LX22NPO5102 Ministerstvo Školství, Mládeže a Tělovýchovy (Ministry of Education, Youth and Sports)
- CZ.02.1.01/0.0/0.0/18_046/0015974 Ministerstvo Školství, Mládeže a Tělovýchovy (Ministry of Education, Youth and Sports)
- LM2023033 Ministerstvo Školství, Mládeže a Tělovýchovy (Ministry of Education, Youth and Sports)
Affiliation(s)
- Petr Lapcik
  - Department of Biochemistry, Faculty of Science, Masaryk University, Brno, Czech Republic
- Klara Synkova
  - Department of Biochemistry, Faculty of Science, Masaryk University, Brno, Czech Republic
- Lucia Janacova
  - Department of Biochemistry, Faculty of Science, Masaryk University, Brno, Czech Republic
- Pavla Bouchalova
  - Department of Biochemistry, Faculty of Science, Masaryk University, Brno, Czech Republic
- David Potesil
  - Central European Institute of Technology, Masaryk University, Brno, Czech Republic
- Rudolf Nenutil
  - Department of Oncological Pathology, Masaryk Memorial Cancer Institute, Brno, Czech Republic
- Pavel Bouchal
  - Department of Biochemistry, Faculty of Science, Masaryk University, Brno, Czech Republic
3. Wu E, Qiao L. [Microbial metaproteomics: From sample processing to data acquisition and analysis]. Se Pu 2024; 42:658-668. [PMID: 38966974 PMCID: PMC11224941 DOI: 10.3724/sp.j.1123.2024.02009] [Received: 02/13/2024] [Indexed: 07/06/2024] [Open Access]
Abstract
Microorganisms are closely associated with human diseases and health. Understanding the composition and function of microbial communities requires extensive research. Metaproteomics has recently become an important method for the thorough and in-depth study of microorganisms. However, major challenges in sample processing, mass spectrometric data acquisition, and data analysis limit the development of metaproteomics owing to the complexity and high heterogeneity of microbial community samples. In metaproteomic analysis, it is often necessary to optimize the preprocessing method for different types of samples and to adopt different microbial isolation, enrichment, extraction, and lysis schemes. As in single-species proteomics, the mass spectrometric data acquisition modes for metaproteomics include data-dependent acquisition (DDA) and data-independent acquisition (DIA). DIA can collect comprehensive peptide information from a sample and holds great potential for future development. However, data analysis for DIA is challenged by the complexity of metaproteome samples, which hinders deeper coverage of metaproteomes. The most important step in data analysis is the construction of a protein sequence database. The size and completeness of the database strongly influence not only the number of identifications, but also analyses at the species and functional levels. The current gold standard for metaproteome database construction is the metagenomic sequencing-based protein sequence database. A public database-filtering method based on an iterative database search has been proven to have strong practical value. The peptide-centric DIA data analysis method is a mainstream data analysis strategy. The development of deep learning and artificial intelligence will greatly improve the accuracy, coverage, and speed of metaproteomic analysis. In terms of downstream bioinformatics analysis, a series of annotation tools that can perform species annotation at the protein, peptide, and gene levels has been developed in recent years to determine the composition of microbial communities. The functional analysis of microbial communities is a unique feature of metaproteomics compared with other omics approaches. Metaproteomics has become an important component of the multi-omics analysis of microbial communities, and has great development potential in terms of depth of coverage, sensitivity of detection, and completeness of data analysis.
4. He G, He Q, Cheng J, Yu R, Shuai J, Cao Y. ProPept-MT: A Multi-Task Learning Model for Peptide Feature Prediction. Int J Mol Sci 2024; 25:7237. [PMID: 39000344 PMCID: PMC11241495 DOI: 10.3390/ijms25137237] [Received: 05/28/2024] [Revised: 06/26/2024] [Accepted: 06/28/2024] [Indexed: 07/16/2024] [Open Access]
Abstract
In the realm of quantitative proteomics, data-independent acquisition (DIA) has emerged as a promising approach, offering enhanced reproducibility and quantitative accuracy compared to traditional data-dependent acquisition (DDA) methods. However, the analysis of DIA data is currently hindered by its reliance on project-specific spectral libraries derived from DDA analyses, which not only limits proteome coverage but also proves to be a time-intensive process. To overcome these challenges, we propose ProPept-MT, a novel deep learning-based multi-task prediction model designed to accurately forecast key features such as retention time (RT), ion intensity, and ion mobility (IM). Leveraging advanced techniques such as multi-head attention and BiLSTM for feature extraction, coupled with Nash-MTL for gradient coordination, ProPept-MT demonstrates superior prediction performance. Integrating ion mobility alongside RT, mass-to-charge ratio (m/z), and ion intensity forms 4D proteomics. Then, we outline a comprehensive workflow tailored for 4D DIA proteomics research, integrating the use of 4D in silico libraries predicted by ProPept-MT. Evaluation on a benchmark dataset showcases ProPept-MT's exceptional predictive capabilities, with impressive results including a 99.9% Pearson correlation coefficient (PCC) for RT prediction, a median dot product (DP) of 96.0% for fragment ion intensity prediction, and a 99.3% PCC for IM prediction on the test set. Notably, ProPept-MT manifests efficacy in predicting both unmodified and phosphorylated peptides, underscoring its potential as a valuable tool for constructing high-quality 4D DIA in silico libraries.
Affiliation(s)
- Guoqiang He
  - Postgraduate Training Base Alliance, Wenzhou Medical University, Wenzhou 325000, China
  - Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou 325000, China
- Qingzu He
  - Department of Physics, and Fujian Provincial Key Laboratory for Soft Functional Materials Research, Xiamen University, Xiamen 361005, China
- Jinyan Cheng
  - Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou 325000, China
- Rongwen Yu
  - Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou 325000, China
- Jianwei Shuai
  - Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou 325000, China
- Yi Cao
  - Postgraduate Training Base Alliance, Wenzhou Medical University, Wenzhou 325000, China
  - Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou 325000, China
5. Baker C, Bruderer R, Abbott J, Arthur JSC, Brenes AJ. Optimizing Spectronaut Search Parameters to Improve Data Quality with Minimal Proteome Coverage Reductions in DIA Analyses of Heterogeneous Samples. J Proteome Res 2024; 23:1926-1936. [PMID: 38691771 PMCID: PMC11165578 DOI: 10.1021/acs.jproteome.3c00671] [Received: 10/12/2023] [Revised: 01/18/2024] [Accepted: 04/19/2024] [Indexed: 05/03/2024]
Abstract
Data-independent acquisition has seen breakthroughs that enable comprehensive proteome profiling using short gradients. As the proteome coverage continues to increase, the quality of the data generated becomes much more relevant. Using Spectronaut, we show that the default search parameters can be easily optimized to minimize the occurrence of false positives across different samples. Using an immunological infection model system to demonstrate the impact of adjusting search settings, we analyzed Mus musculus macrophages and compared their proteome to macrophages spiked with Candida albicans. This experimental system enabled the identification of "false positives", as Candida albicans peptides and proteins should not be present in the Mus musculus-only samples. We show that adjusting the search parameters reduced "false positive" identifications by 89% at the peptide and protein level, thereby considerably increasing the quality of the data. We also show that these optimized parameters incurred a moderate cost, only reducing the overall number of "true positive" identifications across each biological replicate by <6.7% at both the peptide and protein level. We believe the value of our updated search parameters extends beyond a two-organism analysis and would be of great value to any DIA experiment analyzing heterogeneous populations of cell types or tissues.
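The two-organism design in this abstract works as an entrapment check: any identification assigned to Candida albicans in a mouse-only sample is counted as a false positive. A minimal Python sketch of that bookkeeping follows. The record layout, the function names `entrapment_summary` and `relative_reduction`, and the toy data are hypothetical; they are not taken from the paper or from Spectronaut's export format.

```python
from collections import Counter


def entrapment_summary(identifications, absent_species):
    """Count identifications assigned to a species that cannot be in the sample.

    identifications: iterable of (peptide, species) tuples for ONE sample.
    absent_species: species known to be absent from that sample.
    Returns (false_positives, total, false_positive_fraction).
    """
    counts = Counter(species for _, species in identifications)
    total = sum(counts.values())
    false_positives = counts.get(absent_species, 0)
    fraction = false_positives / total if total else 0.0
    return false_positives, total, fraction


def relative_reduction(before, after):
    """Fractional reduction of a count after changing the search settings."""
    if before == 0:
        raise ValueError("no false positives before optimization")
    return (before - after) / before


# Toy mouse-only sample searched with two hypothetical parameter sets.
default_hits = [("PEPA", "mouse"), ("PEPB", "mouse"), ("PEPC", "candida"),
                ("PEPD", "candida"), ("PEPE", "mouse")]
optimized_hits = [("PEPA", "mouse"), ("PEPB", "mouse"), ("PEPD", "candida"),
                  ("PEPE", "mouse")]

fp_default, n_default, _ = entrapment_summary(default_hits, "candida")
fp_optimized, n_optimized, _ = entrapment_summary(optimized_hits, "candida")
print(relative_reduction(fp_default, fp_optimized))  # 0.5
```

The same two counts, taken before and after a parameter change, give both the false-positive reduction and (using the species that is present) the cost in true-positive identifications.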
Affiliation(s)
- Christa P. Baker
  - Division of Cell Signalling & Immunology, School of Life Sciences, University of Dundee, Dundee DD1 5EH, United Kingdom
- James Abbott
  - Data Analysis Group, Division of Computational Biology, School of Life Sciences, University of Dundee, Dundee DD1 5EH, United Kingdom
- J. Simon C. Arthur
  - Division of Cell Signalling & Immunology, School of Life Sciences, University of Dundee, Dundee DD1 5EH, United Kingdom
- Alejandro J. Brenes
  - Division of Cell Signalling & Immunology, School of Life Sciences, University of Dundee, Dundee DD1 5EH, United Kingdom
6. Hamaneh M, Ogurtsov AY, Obolensky OI, Yu YK. Systematic Assessment of Deep Learning-Based Predictors of Fragmentation Intensity Profiles. J Proteome Res 2024; 23:1983-1999. [PMID: 38728051 PMCID: PMC11165591 DOI: 10.1021/acs.jproteome.3c00857] [Received: 12/06/2023] [Revised: 03/05/2024] [Accepted: 04/16/2024] [Indexed: 06/13/2024]
Abstract
In recent years, several deep learning-based methods have been proposed for predicting peptide fragment intensities. This study aims to provide a comprehensive assessment of six such methods, namely Prosit, DeepMass:Prism, pDeep3, AlphaPeptDeep, Prosit Transformer, and the method proposed by Guan et al. To this end, we evaluated the accuracy of the predicted intensity profiles for close to 1.7 million precursors (including both tryptic and HLA peptides) corresponding to more than 18 million experimental spectra procured from 40 independent submissions to the PRIDE repository that were acquired for different species using a variety of instruments and different dissociation types/energies. Specifically, for each method, distributions of similarity (measured by Pearson's correlation and normalized angle) between the predicted and the corresponding experimental b and y fragment intensities were generated. These distributions were used to ascertain the prediction accuracy and rank the prediction methods for particular types of experimental conditions. The effect of variables like precursor charge, length, and collision energy on the prediction accuracy was also investigated. In addition to prediction accuracy, the methods were evaluated in terms of prediction speed. The systematic assessment of these six methods may help in choosing the right method for MS/MS spectra prediction for particular needs.
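The two similarity measures named in this abstract can be sketched in a few lines of Python. This is an illustration only: the abstract does not define "normalized angle", so the sketch assumes the commonly used normalized spectral angle, 1 - 2·arccos(cosine similarity)/π, and the function names and toy intensity vectors are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length intensity vectors."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two vectors of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / math.sqrt(var_x * var_y)


def normalized_spectral_angle(x, y):
    """1 - 2*arccos(cosine similarity)/pi (assumed definition).

    For non-negative intensities the result lies in [0, 1]:
    1 for identical profiles, 0 for profiles with no shared fragments.
    """
    if len(x) != len(y):
        raise ValueError("vectors must have equal length")
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("angle is undefined for an all-zero vector")
    # Clamp to guard against rounding slightly outside [-1, 1].
    cosine = max(-1.0, min(1.0, dot / (norm_x * norm_y)))
    return 1.0 - 2.0 * math.acos(cosine) / math.pi


# Toy b/y fragment intensities for one precursor (made-up numbers).
predicted = [0.10, 0.80, 0.30, 0.00, 0.55]
experimental = [0.12, 0.75, 0.35, 0.02, 0.50]
print(pearson(predicted, experimental))
print(normalized_spectral_angle(predicted, experimental))
```

Both functions compare vectors that are aligned fragment by fragment, so predicted and experimental intensities must be matched to the same b and y ions before scoring.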
Affiliation(s)
- Mehdi B. Hamaneh
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, United States
- Aleksey Y. Ogurtsov
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, United States
- Yi-Kuo Yu
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, United States
7. Lu XY, Wu HP, Ma H, Li H, Li J, Liu YT, Pan ZY, Xie Y, Wang L, Ren B, Liu GK. Deep Learning-Assisted Spectrum-Structure Correlation: State-of-the-Art and Perspectives. Anal Chem 2024; 96:7959-7975. [PMID: 38662943 DOI: 10.1021/acs.analchem.4c01639] [Indexed: 05/22/2024]
Abstract
Spectrum-structure correlation is playing an increasingly crucial role in spectral analysis and has undergone significant development in recent decades. With the advancement of spectrometers, high-throughput detection has triggered explosive growth of spectral data, and the extension of research from small molecules to biomolecules brings a massive chemical space. Facing this evolving landscape, conventional chemometrics has become ill-equipped, and deep learning-assisted chemometrics is rapidly emerging as a flourishing approach with a superior ability to extract latent features and make precise predictions. In this review, the molecular and spectral representations and fundamental knowledge of deep learning are first introduced. We then summarize how deep learning has helped establish the correlation between spectrum and molecular structure over the past 5 years, by empowering spectral prediction (i.e., forward structure-spectrum correlation) and further enabling library matching and de novo molecular generation (i.e., inverse spectrum-structure correlation). Finally, we highlight the most important open issues that persist, along with potential solutions. With the fast development of deep learning, an ultimate solution for establishing spectrum-structure correlation is expected soon, which would trigger substantial development in various disciplines.
Affiliation(s)
- Xin-Yu Lu
  - State Key Laboratory of Physical Chemistry of Solid Surfaces, Collaborative Innovation Center of Chemistry for Energy Materials (iChEM), College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, P. R. China
  - Tan Kah Kee Innovation Laboratory, Xiamen 361005, P. R. China
- Hao-Ping Wu
  - State Key Laboratory of Marine Environmental Science, Fujian Provincial Key Laboratory for Coastal Ecology and Environmental Studies, Center for Marine Environmental Chemistry & Toxicology, College of the Environment and Ecology, Xiamen University, Xiamen, Fujian 361102, P. R. China
- Hao Ma
  - State Key Laboratory of Physical Chemistry of Solid Surfaces, Collaborative Innovation Center of Chemistry for Energy Materials (iChEM), College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, P. R. China
  - Tan Kah Kee Innovation Laboratory, Xiamen 361005, P. R. China
- Hui Li
  - Key Laboratory of Multimedia Trusted Perception and Efficient Computing, Ministry of Education of China, Xiamen University, Xiamen 361005, P. R. China
- Jia Li
  - Institute of Artificial Intelligence, Xiamen University, Xiamen 361005, P. R. China
- Yan-Ti Liu
  - State Key Laboratory of Physical Chemistry of Solid Surfaces, Collaborative Innovation Center of Chemistry for Energy Materials (iChEM), College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, P. R. China
  - Tan Kah Kee Innovation Laboratory, Xiamen 361005, P. R. China
- Zheng-Yan Pan
  - State Key Laboratory of Physical Chemistry of Solid Surfaces, Collaborative Innovation Center of Chemistry for Energy Materials (iChEM), College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, P. R. China
- Yi Xie
  - School of Informatics, Xiamen University, Xiamen 361005, P. R. China
- Lei Wang
  - Pen-Tung Sah Institute of Micro-Nano Science and Technology, Xiamen University, Xiamen 361005, P. R. China
- Bin Ren
  - State Key Laboratory of Physical Chemistry of Solid Surfaces, Collaborative Innovation Center of Chemistry for Energy Materials (iChEM), College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, P. R. China
  - Tan Kah Kee Innovation Laboratory, Xiamen 361005, P. R. China
- Guo-Kun Liu
  - State Key Laboratory of Marine Environmental Science, Fujian Provincial Key Laboratory for Coastal Ecology and Environmental Studies, Center for Marine Environmental Chemistry & Toxicology, College of the Environment and Ecology, Xiamen University, Xiamen, Fujian 361102, P. R. China
8. Lee H, Ozbulak U, Park H, Depuydt S, De Neve W, Vankerschaver J. Assessing the reliability of point mutation as data augmentation for deep learning with genomic data. BMC Bioinformatics 2024; 25:170. [PMID: 38689247 PMCID: PMC11059627 DOI: 10.1186/s12859-024-05787-6] [Received: 01/03/2024] [Accepted: 04/15/2024] [Indexed: 05/02/2024] [Open Access]
Abstract
BACKGROUND Deep neural networks (DNNs) have the potential to revolutionize our understanding and treatment of genetic diseases. An inherent limitation of deep neural networks, however, is their high demand for data during training. To overcome this challenge, other fields, such as computer vision, use various data augmentation techniques to artificially increase the available training data for DNNs. Unfortunately, most data augmentation techniques used in other domains do not transfer well to genomic data. RESULTS Most genomic data possesses peculiar properties and data augmentations may significantly alter the intrinsic properties of the data. In this work, we propose a novel data augmentation technique for genomic data inspired by biology: point mutations. By employing point mutations as substitutes for codons, we demonstrate that our newly proposed data augmentation technique enhances the performance of DNNs across various genomic tasks that involve coding regions, such as translation initiation and splice site detection. CONCLUSION Silent and missense mutations are found to positively influence effectiveness, while nonsense mutations and random mutations in non-coding regions generally lead to degradation. Overall, point mutation-based augmentations in genomic datasets present valuable opportunities for improving the accuracy and reliability of predictive models for DNA sequences.
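To make the augmentation idea concrete, here is a minimal Python sketch of one silent (synonymous) point mutation applied to a coding sequence. It is not the authors' implementation. It assumes an in-frame, uppercase DNA sequence over A, C, G, T and the standard genetic code, and the function names are hypothetical.

```python
import itertools
import random

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(itertools.product(BASES, repeat=3), AMINO)
}


def translate(seq):
    """Translate in frame 0, ignoring any trailing partial codon."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


def silent_neighbors(codon):
    """Codons one substitution away that encode the same amino acid."""
    neighbors = []
    for pos in range(3):
        for base in BASES:
            if base != codon[pos]:
                candidate = codon[:pos] + base + codon[pos + 1:]
                if CODON_TABLE[candidate] == CODON_TABLE[codon]:
                    neighbors.append(candidate)
    return neighbors


def silent_point_mutation(seq, rng):
    """Return a copy of seq with one randomly chosen silent substitution.

    The sequence is returned unchanged if no codon has a synonymous
    single-base neighbor (for example, a sequence of only ATG and TGG).
    """
    usable = len(seq) - len(seq) % 3
    starts = [i for i in range(0, usable, 3) if silent_neighbors(seq[i:i + 3])]
    if not starts:
        return seq
    i = rng.choice(starts)
    new_codon = rng.choice(silent_neighbors(seq[i:i + 3]))
    return seq[:i] + new_codon + seq[i + 3:]


original = "ATGGCTCTGAAATGA"
augmented = silent_point_mutation(original, random.Random(0))
print(translate(original) == translate(augmented))  # True
```

A missense or nonsense variant of the same sketch would keep the single-base substitution but drop or invert the synonymy check in `silent_neighbors`.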
Affiliation(s)
- Utku Ozbulak
  - Center for Biosystems and Biotech Data Science, Ghent University Global Campus, Incheon, South Korea
- Homin Park
  - Center for Biosystems and Biotech Data Science, Ghent University Global Campus, Incheon, South Korea
  - IDLab, Department of Electronics and Information Systems, Ghent University, Ghent, Belgium
- Stephen Depuydt
  - Erasmus Brussels University of Applied Sciences and Arts, Brussels, Belgium
- Wesley De Neve
  - Center for Biosystems and Biotech Data Science, Ghent University Global Campus, Incheon, South Korea
  - IDLab, Department of Electronics and Information Systems, Ghent University, Ghent, Belgium
- Joris Vankerschaver
  - Center for Biosystems and Biotech Data Science, Ghent University Global Campus, Incheon, South Korea
  - Department of Applied Mathematics, Computer Science and Statistics, Ghent University, Ghent, Belgium
9. Basharat AR, Xiong X, Xu T, Zang Y, Sun L, Liu X. TopDIA: A Software Tool for Top-Down Data-Independent Acquisition Proteomics. bioRxiv 2024:2024.04.05.588302. [PMID: 38645171 PMCID: PMC11030422 DOI: 10.1101/2024.04.05.588302] [Indexed: 04/23/2024]
Abstract
Top-down mass spectrometry is widely used for proteoform identification, characterization, and quantification owing to its ability to analyze intact proteoforms. In the last decade, top-down proteomics has been dominated by top-down data-dependent acquisition mass spectrometry (TD-DDA-MS), and top-down data-independent acquisition mass spectrometry (TD-DIA-MS) has not been well studied. While TD-DIA-MS produces complex multiplexed tandem mass spectrometry (MS/MS) spectra, which are challenging to confidently identify, it selects more precursor ions for MS/MS analysis and has the potential to increase proteoform identifications compared with TD-DDA-MS. Here we present TopDIA, the first software tool for proteoform identification by TD-DIA-MS. It generates demultiplexed pseudo MS/MS spectra from TD-DIA-MS data and then searches the pseudo MS/MS spectra against a protein sequence database for proteoform identification. We compared the performance of TD-DDA-MS and TD-DIA-MS using Escherichia coli K-12 MG1655 cells and demonstrated that TD-DIA-MS with TopDIA increased proteoform and protein identifications compared with TD-DDA-MS.
Affiliation(s)
- Abdul Rehman Basharat
  - Department of BioHealth Informatics, Luddy School of Informatics, Computing, and Engineering, Indiana University-Purdue University Indianapolis, Indianapolis, IN, 46202, USA
- Xingzhao Xiong
  - Deming Department of Medicine, Tulane University School of Medicine, New Orleans, LA, 70112, USA
- Tian Xu
  - Department of Chemistry, Michigan State University, East Lansing, MI, 48824, USA
- Yong Zang
  - Department of Biostatistics and Health Data Sciences, Indiana University School of Medicine, Indianapolis, IN, 46202, USA
- Liangliang Sun
  - Department of Chemistry, Michigan State University, East Lansing, MI, 48824, USA
- Xiaowen Liu
  - Deming Department of Medicine, Tulane University School of Medicine, New Orleans, LA, 70112, USA
10. Yang Y, Fang Q. Prediction of glycopeptide fragment mass spectra by deep learning. Nat Commun 2024; 15:2448. [PMID: 38503734 PMCID: PMC10951270 DOI: 10.1038/s41467-024-46771-1] [Received: 09/13/2023] [Accepted: 03/11/2024] [Indexed: 03/21/2024] [Open Access]
Abstract
Deep learning has achieved notable success in mass spectrometry-based proteomics and is now emerging in glycoproteomics. While various deep learning models can predict fragment mass spectra of peptides with good accuracy, they cannot cope with the non-linear glycan structure in an intact glycopeptide. Herein, we present DeepGlyco, a deep learning-based approach for the prediction of fragment spectra of intact glycopeptides. Our model adopts tree-structured long short-term memory networks to process the glycan moiety and a graph neural network architecture to incorporate potential fragmentation pathways of a specific glycan structure. This feature benefits model explainability and the ability to differentiate glycan structural isomers. We further demonstrate that predicted spectral libraries can be used for data-independent acquisition glycoproteomics as a supplement for library completeness. We expect that this work will provide a valuable deep learning resource for glycoproteomics.
Affiliation(s)
- Yi Yang
  - ZJU-Hangzhou Global Scientific and Technological Innovation Center, Zhejiang University, Hangzhou, 311200, China
- Qun Fang
  - ZJU-Hangzhou Global Scientific and Technological Innovation Center, Zhejiang University, Hangzhou, 311200, China
  - Department of Chemistry, Zhejiang University, Hangzhou, 310058, China
11. He Q, Guo H, Li Y, He G, Li X, Shuai J. SeFilter-DIA: Squeeze-and-Excitation Network for Filtering High-Confidence Peptides of Data-Independent Acquisition Proteomics. Interdiscip Sci 2024:10.1007/s12539-024-00611-4. [PMID: 38472692 DOI: 10.1007/s12539-024-00611-4] [Received: 07/17/2023] [Revised: 01/12/2024] [Accepted: 01/21/2024] [Indexed: 03/14/2024]
Abstract
Mass spectrometry is crucial in proteomics analysis, particularly using data-independent acquisition (DIA) for reliable and reproducible data acquisition, enabling broad mass-to-charge ratio coverage and high throughput. DIA-NN, a prominent deep learning software in DIA proteome analysis, generates peptide results but may include low-confidence peptides. Conventionally, biologists have to manually screen the extracted ion chromatogram (XIC) peaks of peptide fragment ions to identify high-confidence peptides, a time-consuming and subjective process prone to variability. In this study, we introduce SeFilter-DIA, a deep learning algorithm that aims to automate the identification of high-confidence peptides. Leveraging squeeze-and-excitation network and residual network models, SeFilter-DIA extracts XIC features and effectively discerns between high- and low-confidence peptides. Evaluation on the benchmark datasets shows that SeFilter-DIA achieves 99.6% AUC on the test set and 97% for other performance indicators. Furthermore, SeFilter-DIA is applicable for screening peptides with phosphorylation modifications. These results demonstrate the potential of SeFilter-DIA to replace manual screening, providing an efficient and objective approach for high-confidence peptide identification while mitigating associated limitations.
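For readers unfamiliar with the squeeze-and-excitation idea named in the title, the following is a minimal pure-Python sketch of a single SE block applied to multi-channel chromatogram traces. It does not reproduce the SeFilter-DIA architecture: the weights are fixed toy values rather than learned parameters, biases are omitted, and the function name and input layout are hypothetical.

```python
import math


def squeeze_excite(channels, w1, w2):
    """Apply one squeeze-and-excitation step to C traces of length T.

    channels: list of C lists, each holding T intensity values.
    w1: R x C weight matrix (reduction layer, followed by ReLU).
    w2: C x R weight matrix (expansion layer, followed by sigmoid).
    Returns (rescaled_channels, gates), with one gate in (0, 1) per channel.
    """
    # Squeeze: global average over the time axis gives one value per channel.
    squeezed = [sum(trace) / len(trace) for trace in channels]
    # Excitation: small bottleneck network producing per-channel gates.
    hidden = [max(0.0, sum(w * z for w, z in zip(row, squeezed))) for row in w1]
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Scale: multiply every point of a channel by that channel's gate.
    rescaled = [[g * v for v in trace] for g, trace in zip(gates, channels)]
    return rescaled, gates


# Two toy fragment-ion traces: a clear peak and a weak, flat one.
traces = [[0.0, 1.0, 2.0, 1.0, 0.0],
          [0.0, 0.1, 0.1, 0.1, 0.0]]
w1 = [[1.0, 1.0]]        # C=2 channels reduced to R=1 hidden unit
w2 = [[2.0], [-2.0]]     # emphasize channel 0, suppress channel 1
rescaled, gates = squeeze_excite(traces, w1, w2)
print(gates)
```

In a trained network the two weight matrices are learned, so the gates become a data-dependent weighting of which channels matter for the downstream classifier.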
Affiliation(s)
- Qingzu He
  - Department of Physics, National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, 361005, China
  - Wenzhou Key Laboratory of Biophysics, Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou, 325001, China
- Huan Guo
  - Department of Physics, National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, 361005, China
- Yulin Li
  - Department of Physics, National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, 361005, China
- Guoqiang He
  - Wenzhou Key Laboratory of Biophysics, Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou, 325001, China
- Xiang Li
  - Department of Physics, National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, 361005, China
- Jianwei Shuai
  - Wenzhou Key Laboratory of Biophysics, Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou, 325001, China
  - Oujiang Laboratory (Zhejiang Lab for Regenerative Medicine, Vision and Brain Health), Wenzhou, 325001, China
12
|
Palstrøm NB, Campbell AJ, Lindegaard CA, Cakar S, Matthiesen R, Beck HC. Spectral library search for improved TMTpro labelled peptide assignment in human plasma proteomics. Proteomics 2024; 24:e2300236. [PMID: 37706597 DOI: 10.1002/pmic.202300236] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/01/2023] [Revised: 06/21/2023] [Accepted: 06/22/2023] [Indexed: 09/15/2023]
Abstract
Clinical biomarker discovery is often based on the analysis of human plasma samples. However, the high dynamic range and complexity of plasma pose significant challenges to mass spectrometry-based proteomics. Current methods for improving protein identifications require laborious pre-analytical sample preparation. In this study, we developed and evaluated a TMTpro-specific spectral library for improved protein identification in human plasma proteomics. The library was constructed by LC-MS/MS analysis of highly fractionated TMTpro-tagged human plasma, human cell lysates, and relevant arterial tissues. The library was curated using several quality filters to ensure reliable peptide identifications. Our results show that spectral library searching using the TMTpro spectral library improves the identification of proteins in plasma samples compared to conventional sequence database searching. Protein identifications made by the spectral library search engine demonstrated a high degree of complementarity with the sequence database search engine, indicating the feasibility of increasing the number of protein identifications without additional pre-analytical sample preparation. The TMTpro-specific spectral library provides a resource for future plasma proteomics research and optimization of search algorithms for greater accuracy and speed in protein identifications in human plasma proteomics, and is made publicly available to the research community via ProteomeXchange with identifier PXD042546.
Affiliation(s)
- Nicolai B Palstrøm
- Department of Clinical Biochemistry, Odense University Hospital, Odense, Denmark
- Amanda J Campbell
- Department of Clinical Biochemistry, Odense University Hospital, Odense, Denmark
- Samir Cakar
- Department of Clinical Biochemistry, Odense University Hospital, Odense, Denmark
- Rune Matthiesen
- Computational and Experimental Biology Group, CEDOC, Chronic Diseases Research Centre, NOVA Medical School, Faculdade de Ciências Médicas, Universidade NOVA de Lisboa, Lisbon, Portugal
- Hans C Beck
- Department of Clinical Biochemistry, Odense University Hospital, Odense, Denmark

13
Lapin J, Yan X, Dong Q. UniSpec: Deep Learning for Predicting the Full Range of Peptide Fragment Ion Series to Enhance the Proteomics Data Analysis Workflow. Anal Chem 2024. [PMID: 38329031 DOI: 10.1021/acs.analchem.3c02321] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/09/2024]
Abstract
We present UniSpec, an attention-driven deep neural network designed to predict comprehensive collision-induced fragmentation spectra, thereby improving peptide identification in shotgun proteomics. Utilizing a training data set of 1.8 million unique high-quality tandem mass spectra (MS2) from 0.8 million unique peptide ions, UniSpec learned with a peptide fragmentation dictionary encompassing 7919 fragment peaks. Among these, 5712 are neutral loss peaks, with 2310 corresponding to modification-specific neutral losses. Remarkably, UniSpec can predict 73%-77% of fragment intensities based on our NIST reference library spectra, a significant leap from the 35%-45% coverage of only b and y ions. Comparative studies with Prosit elucidate that while both models are strong at predicting their respective fragment ion series, UniSpec particularly shines in generating more complex MS2 spectra with diverse ion annotations. The integration of UniSpec's predictions into shotgun proteomics data analysis boosts the identification rate of tryptic peptides by 48% at a 1% false discovery rate (FDR) and 60% at a more confident 0.1% FDR. Using UniSpec's predicted in-silico spectral library, the search results closely matched those from search engines and experimental spectral libraries used in peptide identification, highlighting its potential as a stand-alone identification tool. The source code and Python scripts are available on GitHub (https://github.com/usnistgov/UniSpec) and Zenodo (https://zenodo.org/records/10452792), and all data sets and analysis results generated in this work were deposited in Zenodo (https://zenodo.org/records/10052268).
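The identification gains above are quoted at fixed false discovery rate (FDR) cutoffs. As a minimal sketch of how such a cutoff is commonly applied, the snippet below uses a generic target-decoy estimate (decoys divided by targets among accepted matches). This is an illustration only, not the procedure used in the UniSpec study; the function name `score_threshold_at_fdr` and the toy scores are hypothetical.

```python
from typing import List, Tuple


def score_threshold_at_fdr(
    psms: List[Tuple[float, bool]], max_fdr: float
) -> Tuple[float, int]:
    """Find the loosest score cutoff whose estimated FDR stays within max_fdr.

    Each PSM is (score, is_decoy); a higher score means a better match.
    FDR is estimated as (#decoys / #targets) among PSMs at or above the cutoff.
    Returns (score cutoff, number of accepted target PSMs).
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    targets = 0
    decoys = 0
    best = (float("inf"), 0)  # nothing accepted yet
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Remember the deepest point in the ranking that still meets the cutoff.
        if targets > 0 and decoys / targets <= max_fdr:
            best = (score, targets)
    return best


# Toy example (hypothetical scores): five targets and one decoy.
toy_psms = [(10.0, False), (9.0, False), (8.0, False),
            (7.0, True), (6.0, False), (5.0, False)]
cutoff, accepted = score_threshold_at_fdr(toy_psms, max_fdr=0.25)
```

A stricter cutoff such as 0.1% simply moves the threshold higher up the ranked list, which is why fewer identifications survive at the more confident level.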
Affiliation(s)
- Joel Lapin
- Department of Physics, Georgetown University, Washington, D.C. 20057, United States
- Associate, Mass Spectrometry Data Center, Biomolecular Measurement Division, National Institute of Standards and Technology, 100 Bureau Drive, Gaithersburg, Maryland 20899, United States
- Xinjian Yan
- Mass Spectrometry Data Center, Biomolecular Measurement Division, National Institute of Standards and Technology, 100 Bureau Drive, Gaithersburg, Maryland 20899, United States
- Qian Dong
- Mass Spectrometry Data Center, Biomolecular Measurement Division, National Institute of Standards and Technology, 100 Bureau Drive, Gaithersburg, Maryland 20899, United States

14
Lou R, Shui W. Acquisition and Analysis of DIA-Based Proteomic Data: A Comprehensive Survey in 2023. Mol Cell Proteomics 2024; 23:100712. [PMID: 38182042 PMCID: PMC10847697 DOI: 10.1016/j.mcpro.2024.100712] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2023] [Revised: 12/27/2023] [Accepted: 01/02/2024] [Indexed: 01/07/2024] Open
Abstract
Data-independent acquisition (DIA) mass spectrometry (MS) has emerged as a powerful technology for high-throughput, accurate, and reproducible quantitative proteomics. This review provides a comprehensive overview of recent advances in both the experimental and computational methods for DIA proteomics, from data acquisition schemes to analysis strategies and software tools. DIA acquisition schemes are categorized based on the design of precursor isolation windows, highlighting wide-window, overlapping-window, narrow-window, scanning quadrupole-based, and parallel accumulation-serial fragmentation-enhanced DIA methods. For DIA data analysis, major strategies are classified into spectrum reconstruction, sequence-based search, library-based search, de novo sequencing, and sequencing-independent approaches. A wide array of software tools implementing these strategies are reviewed, with details on their overall workflows and scoring approaches at different steps. The generation and optimization of spectral libraries, which are critical resources for DIA analysis, are also discussed. Publicly available benchmark datasets covering global proteomics and phosphoproteomics are summarized to facilitate performance evaluation of various software tools and analysis workflows. Continued advances and synergistic developments of versatile components in DIA workflows are expected to further enhance the power of DIA-based proteomics.
Affiliation(s)
- Ronghui Lou
- iHuman Institute, ShanghaiTech University, Shanghai, China; School of Life Science and Technology, ShanghaiTech University, Shanghai, China
- Wenqing Shui
- iHuman Institute, ShanghaiTech University, Shanghai, China; School of Life Science and Technology, ShanghaiTech University, Shanghai, China

15
Jiao X, Li X, Zhang N, Zhang W, Yan B, Huang J, Zhao J, Zhang H, Chen W, Fan D. Postmortem Muscle Proteome Characteristics of Silver Carp (Hypophthalmichthys molitrix): Insights from Full-Length Transcriptome and Deep 4D Label-Free Proteomic. J Agric Food Chem 2024; 72:1376-1390. [PMID: 38165648 DOI: 10.1021/acs.jafc.3c06902] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 01/04/2024]
Abstract
The coverage of the protein database directly determines the results of shotgun proteomics. In this study, PacBio single-molecule real-time sequencing was applied to postmortem silver carp muscle transcripts. A total of 42.43 Gb of clean data, 35,834 nonredundant transcripts, and 15,413 unigenes were obtained. In total, 99.32% of the unigenes were successfully annotated and assigned specific functions. PacBio long-read isoform sequencing (Iso-Seq) analysis can provide more accurate protein information with a higher proportion of complete coding sequences and longer lengths. Subsequently, 2671 proteins were identified by deep 4D proteomics informed by full-length transcriptomics, an approach that has been shown to improve the identification of low-abundance muscle proteins and potential protein isoforms. The features of the sarcomeric protein profile and information on more than 30 major proteins in the white dorsal muscle of silver carp are reported here for the first time. Overall, this study provides valuable transcriptome data resources and the most comprehensive muscle protein information detected to date for further study of the processing characteristics of early postmortem fish muscle, as well as a spectral library for data-independent acquisition and data processing. This batch of muscle-specific data-dependent acquisition data is available via PRIDE with identifier PXD043702.
Affiliation(s)
- Xidong Jiao
- State Key Laboratory of Food Science and Resources, Jiangnan University, Wuxi 214122, China
- School of Food Science and Technology, Jiangnan University, Wuxi 214122, China
- Xingying Li
- State Key Laboratory of Food Science and Resources, Jiangnan University, Wuxi 214122, China
- School of Food Science and Technology, Jiangnan University, Wuxi 214122, China
- Nana Zhang
- State Key Laboratory of Food Science and Resources, Jiangnan University, Wuxi 214122, China
- Key Laboratory of Refrigeration and Conditioning Aquatic Products Processing, Ministry of Agriculture and Rural Affairs, Xiamen 361022, China
- School of Food Science and Technology, Jiangnan University, Wuxi 214122, China
- Wenhai Zhang
- Key Laboratory of Refrigeration and Conditioning Aquatic Products Processing, Ministry of Agriculture and Rural Affairs, Xiamen 361022, China
- Fujian Provincial Key Laboratory of Refrigeration and Conditioning Aquatic Products Processing, Xiamen 361022, China
- Anjoy Foods Group Co., Ltd., Xiamen 361022, China
- Bowen Yan
- State Key Laboratory of Food Science and Resources, Jiangnan University, Wuxi 214122, China
- Key Laboratory of Refrigeration and Conditioning Aquatic Products Processing, Ministry of Agriculture and Rural Affairs, Xiamen 361022, China
- School of Food Science and Technology, Jiangnan University, Wuxi 214122, China
- Jianlian Huang
- Key Laboratory of Refrigeration and Conditioning Aquatic Products Processing, Ministry of Agriculture and Rural Affairs, Xiamen 361022, China
- Fujian Provincial Key Laboratory of Refrigeration and Conditioning Aquatic Products Processing, Xiamen 361022, China
- Anjoy Foods Group Co., Ltd., Xiamen 361022, China
- Jianxin Zhao
- State Key Laboratory of Food Science and Resources, Jiangnan University, Wuxi 214122, China
- School of Food Science and Technology, Jiangnan University, Wuxi 214122, China
- Hao Zhang
- State Key Laboratory of Food Science and Resources, Jiangnan University, Wuxi 214122, China
- School of Food Science and Technology, Jiangnan University, Wuxi 214122, China
- Wei Chen
- State Key Laboratory of Food Science and Resources, Jiangnan University, Wuxi 214122, China
- School of Food Science and Technology, Jiangnan University, Wuxi 214122, China
- Daming Fan
- State Key Laboratory of Food Science and Resources, Jiangnan University, Wuxi 214122, China
- Key Laboratory of Refrigeration and Conditioning Aquatic Products Processing, Ministry of Agriculture and Rural Affairs, Xiamen 361022, China
- School of Food Science and Technology, Jiangnan University, Wuxi 214122, China

16
Chan CMJ, Lam H. Merging Full-Spectrum and Fragment Ion Intensity Predictions from Deep Learning for High-Quality Spectral Libraries. J Proteome Res 2023; 22:3692-3702. [PMID: 37910637 DOI: 10.1021/acs.jproteome.3c00180] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/03/2023]
Abstract
Spectral libraries are useful resources in proteomic data analysis. Recent advances in deep learning allow tandem mass spectra of peptides to be predicted from their amino acid sequences. This enables predicted spectral libraries to be compiled, and searching against such libraries has been shown to improve the sensitivity in peptide identification over conventional sequence database searching. However, current prediction models lack support for longer peptides, and thus far, predicted library searching has only been demonstrated for backbone ion-only spectrum prediction methods. Here, we propose a deep learning-based full-spectrum prediction method to generate predicted spectral libraries for peptide identification. We demonstrated the superiority of using full-spectrum libraries over backbone ion-only prediction approaches in spectral library searching. Furthermore, merging spectra from different prediction models, as a form of ensemble learning, can produce improved spectral libraries, in terms of identification sensitivity. We also show that a hybrid library combining predicted and experimental spectra can lead to 20% more confident identifications over experimental library searching or sequence database searching.
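The abstract does not say how spectra from different prediction models are merged or scored. As a rough sketch of the general idea, assuming each predicted spectrum is a mapping from fragment annotation to intensity, one simple ensemble is to average unit-normalized intensities over the union of annotated peaks and to compare spectra by cosine similarity. The names `merge_spectra` and `cosine_similarity`, the averaging rule, and the toy peaks are all assumptions for illustration, not the authors' method.

```python
import math
from typing import Dict

# Hypothetical representation: fragment annotation (e.g. "y3") -> intensity.
Spectrum = Dict[str, float]


def normalize(spec: Spectrum) -> Spectrum:
    """Scale intensities to unit Euclidean length (empty spectra are returned as-is)."""
    norm = math.sqrt(sum(v * v for v in spec.values()))
    if norm == 0.0:
        return dict(spec)
    return {k: v / norm for k, v in spec.items()}


def merge_spectra(a: Spectrum, b: Spectrum) -> Spectrum:
    """Average two unit-normalized spectra over the union of their annotated peaks.

    A peak predicted by only one model contributes half its normalized intensity.
    """
    na, nb = normalize(a), normalize(b)
    return {k: (na.get(k, 0.0) + nb.get(k, 0.0)) / 2.0 for k in set(na) | set(nb)}


def cosine_similarity(a: Spectrum, b: Spectrum) -> float:
    """Normalized dot product over shared annotations, a common library-match score."""
    na, nb = normalize(a), normalize(b)
    return sum(v * nb.get(k, 0.0) for k, v in na.items())


# Toy example: a backbone-only prediction and a second prediction with one peak.
backbone_only = {"b2": 3.0, "y1": 4.0}
second_model = {"y1": 1.0}
merged = merge_spectra(backbone_only, second_model)
```

In this toy case the merged spectrum keeps the `b2` peak that only one model predicts while reinforcing the `y1` peak both agree on, which is the intuition behind treating merging as a form of ensemble learning.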
Affiliation(s)
- Chak Ming Jerry Chan
- Department of Chemical and Biological Engineering, The Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong 999077, China
- Henry Lam
- Department of Chemical and Biological Engineering, The Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong 999077, China

17
Kitata RB, Yang JC, Chen YJ. Advances in data-independent acquisition mass spectrometry towards comprehensive digital proteome landscape. Mass Spectrom Rev 2023; 42:2324-2348. [PMID: 35645145 DOI: 10.1002/mas.21781] [Citation(s) in RCA: 37] [Impact Index Per Article: 37.0] [Received: 10/09/2021] [Revised: 12/17/2021] [Accepted: 01/21/2022] [Indexed: 06/15/2023]
Abstract
Data-independent acquisition mass spectrometry (DIA-MS) has rapidly evolved as a powerful alternative for highly reproducible proteome profiling, with the unique strength of generating permanent digital maps for retrospective analysis of biological systems. Recent advancements in data analysis software tools for complex DIA-MS/MS spectra, coupled with fast MS scanning speed and high mass accuracy, have greatly expanded the sensitivity and coverage of DIA-based proteomics profiling. Here, we review the evolution of DIA-MS techniques, from the earlier proof-of-principle of parallel fragmentation of all ions or of ions in a selected m/z range, through sequential window acquisition of all theoretical mass spectra (SWATH-MS), to the latest innovations; recent developments in computational algorithms for data informatics; and auxiliary tools and advanced instrumentation that enhance the performance of DIA-MS. We further summarize recent applications of DIA-MS and experimentally derived as well as in silico spectral library resources for large-scale profiling to facilitate biomarker discovery and drug development in human diseases, with emphasis on proteomic profiling coverage. Toward next-generation DIA-MS for clinical proteomics, we outline the challenges in processing multidimensional DIA data sets and large-scale clinical proteomics, and the continuing need for higher profiling coverage and sensitivity.
Affiliation(s)
- Jhih-Ci Yang
- Institute of Chemistry, Academia Sinica, Taipei, Taiwan
- Sustainable Chemical Science and Technology, Taiwan International Graduate Program, Academia Sinica and National Yang Ming Chiao Tung University, Taipei, Taiwan
- Department of Applied Chemistry, National Yang Ming Chiao Tung University, Hsinchu, Taiwan
- Yu-Ju Chen
- Institute of Chemistry, Academia Sinica, Taipei, Taiwan
- Sustainable Chemical Science and Technology, Taiwan International Graduate Program, Academia Sinica and National Yang Ming Chiao Tung University, Taipei, Taiwan
- Department of Chemistry, National Taiwan University, Taipei, Taiwan

18
Jin L, Wang F, Wang X, Harvey BP, Bi Y, Hu C, Cui B, Darcy AT, Maull JW, Phillips BR, Kim Y, Jenkins GJ, Sornasse TR, Tian Y. Identification of Plasma Biomarkers from Rheumatoid Arthritis Patients Using an Optimized Sequential Window Acquisition of All THeoretical Mass Spectra (SWATH) Proteomics Workflow. Proteomes 2023; 11:32. [PMID: 37873874 PMCID: PMC10594463 DOI: 10.3390/proteomes11040032] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/16/2023] [Revised: 09/28/2023] [Accepted: 10/02/2023] [Indexed: 10/25/2023] Open
Abstract
Rheumatoid arthritis (RA) is a systemic autoimmune and inflammatory disease. Plasma biomarkers are critical for understanding disease mechanisms, treatment effects, and diagnosis. Mass spectrometry-based proteomics is a powerful tool for unbiased biomarker discovery. However, plasma proteomics is significantly hampered by signal interference from high-abundance proteins, low overall protein coverage, and high levels of missing data from data-dependent acquisition (DDA). To achieve quantitative proteomics analysis for plasma samples with a balance of throughput, performance, and cost, we developed a workflow incorporating plate-based high abundance protein depletion and sample preparation, comprehensive peptide spectral library building, and data-independent acquisition (DIA) SWATH mass spectrometry-based methodology. In this study, we analyzed plasma samples from both RA patients and healthy donors. The results showed that the new workflow performance exceeded that of the current state-of-the-art depletion-based plasma proteomic platforms in terms of both data quality and proteome coverage. Proteins from biological processes related to the activation of systemic inflammation, suppression of platelet function, and loss of muscle mass were enriched and differentially expressed in RA. Some plasma proteins, particularly acute-phase reactant proteins, showed great power to distinguish between RA patients and healthy donors. Moreover, protein isoforms in the plasma were also analyzed, providing even deeper proteome coverage. This workflow can serve as a basis for further application in discovering plasma biomarkers of other diseases.
Affiliation(s)
- Liang Jin
- Research & Development, AbbVie, North Chicago, IL 60064, USA
- Fei Wang
- Research & Development, AbbVie, North Chicago, IL 60064, USA
- Xue Wang
- Research & Development, AbbVie, North Chicago, IL 60064, USA
- Bohdan P. Harvey
- Research & Development, AbbVie, North Chicago, IL 60064, USA
- Yingtao Bi
- Research & Development, AbbVie, North Chicago, IL 60064, USA
- Chenqi Hu
- DMPK, Takeda Development Center Americas Inc., Cambridge, MA 02142, USA
- Baoliang Cui
- Research & Development, AbbVie, North Chicago, IL 60064, USA
- Anhdao T. Darcy
- Research & Development, AbbVie, North Chicago, IL 60064, USA
- John W. Maull
- Research & Development, AbbVie, North Chicago, IL 60064, USA
- Ben R. Phillips
- Research & Development, AbbVie, North Chicago, IL 60064, USA
- Youngjae Kim
- DMPK, Takeda Development Center Americas Inc., Cambridge, MA 02142, USA
- Gary J. Jenkins
- Research & Development, AbbVie, North Chicago, IL 60064, USA
- Thierry R. Sornasse
- Research & Development, AbbVie, North Chicago, IL 60064, USA
- Yu Tian
- Research & Development, AbbVie, North Chicago, IL 60064, USA

19
Zhang B, Bassani-Sternberg M. Current perspectives on mass spectrometry-based immunopeptidomics: the computational angle to tumor antigen discovery. J Immunother Cancer 2023; 11:e007073. [PMID: 37899131 PMCID: PMC10619091 DOI: 10.1136/jitc-2023-007073] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 07/21/2023] [Indexed: 10/31/2023] Open
Abstract
Identification of tumor antigens presented by the human leucocyte antigen (HLA) molecules is essential for the design of effective and safe cancer immunotherapies that rely on T cell recognition and killing of tumor cells. Mass spectrometry (MS)-based immunopeptidomics enables high-throughput, direct identification of HLA-bound peptides from a variety of cell lines, tumor tissues, and healthy tissues. It involves immunoaffinity purification of HLA complexes followed by MS profiling of the extracted peptides using data-dependent acquisition, data-independent acquisition, or targeted approaches. By incorporating DNA, RNA, and ribosome sequencing data into immunopeptidomics data analysis, the proteogenomic approach provides a powerful means for identifying tumor antigens encoded within the canonical open reading frames of annotated coding genes and non-canonical tumor antigens derived from presumably non-coding regions of our genome. We discuss emerging computational challenges in immunopeptidomics data analysis and tumor antigen identification, highlighting key considerations in the proteogenomics-based approach, including accurate analysis of DNA, RNA, and ribosome sequencing data, careful incorporation of predicted novel protein sequences into the reference protein database, special quality control in MS data analysis due to the expanded and heterogeneous search space, cancer-specificity determination, and immunogenicity prediction. Advancements in technology and computation are continually enabling us to identify tumor antigens with higher sensitivity and accuracy, paving the way toward the development of more effective cancer immunotherapies.
Affiliation(s)
- Bing Zhang
- Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, Texas, USA
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, USA
- Michal Bassani-Sternberg
- Ludwig Institute for Cancer Research, University of Lausanne, Lausanne, Switzerland
- Department of Oncology, Centre Hospitalier Universitaire Vaudois, Lausanne, Switzerland
- Agora Cancer Research Centre, Lausanne, Switzerland

20
Hao Y, Chen M, Huang X, Xu H, Wu P, Chen S. 4D-diaXLMS: Proteome-wide Four-Dimensional Data-Independent Acquisition Workflow for Cross-Linking Mass Spectrometry. Anal Chem 2023; 95:14077-14085. [PMID: 37691250 DOI: 10.1021/acs.analchem.3c02824] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/12/2023]
Abstract
Cross-linking mass spectrometry (XL-MS) is a powerful tool for examining protein structures and interactions. Nevertheless, analysis of low-abundance cross-linked peptides is often limited in the data-dependent acquisition (DDA) mode due to its semistochastic nature. To address this issue, we introduced a workflow called 4D-diaXLMS, representing the first application of four-dimensional data-independent acquisition for proteome-wide cross-linking analysis. Cross-linking studies of the HeLa cell proteome were evaluated using the classical cross-linker disuccinimidyl suberate as an example. Compared with the DDA analysis, 4D-diaXLMS exhibited marked improvement in the identification coverage of cross-linked peptides, with a total increase of 36% in single-shot analysis across all 16 SCX fractions. This advantage was further amplified when the number of fractions was reduced to 8 and 4, resulting in improvements of 125% and 149%, respectively. Using 4D-diaXLMS, up to 83% of the cross-linked peptides were repeatedly identified in three replicates, more than twice the 38% in the DDA mode. Furthermore, 4D-diaXLMS showed good performance in the quantitative analysis of yeast cross-linked peptides even in a 15-fold excess of HeLa cell matrix, with a low coefficient of variation and high quantitative accuracy at all concentrations. Overall, 4D-diaXLMS was shown to provide high coverage, good reproducibility, and accurate quantification for in-depth XL-MS analysis of complex samples, demonstrating its immense potential for advances in the field.
Affiliation(s)
- Yanhong Hao
- The Institute for Advanced Studies, Wuhan University, Wuhan, Hubei 430072, China
- Moran Chen
- The Institute for Advanced Studies, Wuhan University, Wuhan, Hubei 430072, China
- Xiao Huang
- The Institute for Advanced Studies, Wuhan University, Wuhan, Hubei 430072, China
- Hui Xu
- The Institute for Advanced Studies, Wuhan University, Wuhan, Hubei 430072, China
- Pengfei Wu
- The Institute for Advanced Studies, Wuhan University, Wuhan, Hubei 430072, China
- Suming Chen
- The Institute for Advanced Studies, Wuhan University, Wuhan, Hubei 430072, China

21
McGann CD, Barshop WD, Canterbury JD, Lin C, Gabriel W, Huang J, Bergen D, Zabrouskov V, Melani RD, Wilhelm M, McAlister GC, Schweppe DK. Real-Time Spectral Library Matching for Sample Multiplexed Quantitative Proteomics. J Proteome Res 2023; 22:2836-2846. [PMID: 37557900 DOI: 10.1021/acs.jproteome.3c00085] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/11/2023]
Abstract
Sample multiplexed quantitative proteomics assays have proved to be a highly versatile means to assay molecular phenotypes. Yet, stochastic precursor selection and precursor coisolation can dramatically reduce the efficiency of data acquisition and quantitative accuracy. To address this, intelligent data acquisition (IDA) strategies have recently been developed to improve instrument efficiency and quantitative accuracy for both discovery and targeted methods. Toward this end, we sought to develop and implement a new real-time spectral library searching (RTLS) workflow that could enable intelligent scan triggering and peak selection within milliseconds of scan acquisition. To ensure ease of use and general applicability, we built an application to read in diverse spectral libraries and file types from both empirical and predicted spectral libraries. We demonstrate that RTLS methods enable improved quantitation of multiplexed samples, particularly with consideration for quantitation from chimeric fragment spectra. We used RTLS to profile proteome responses to small molecule perturbations and were able to quantify up to 15% more significantly regulated proteins in half the gradient time compared to traditional methods. Taken together, the development of RTLS expands the IDA toolbox to improve instrument efficiency and quantitative accuracy for sample multiplexed analyses.
Affiliation(s)
- Chris D McGann
- University of Washington, Seattle, Washington 98105, United States
- Chuwei Lin
- University of Washington, Seattle, Washington 98105, United States
- Jingjing Huang
- Thermo Fisher Scientific, San Jose, California 95134, United States
- David Bergen
- Thermo Fisher Scientific, San Jose, California 95134, United States
- Vlad Zabrouskov
- Thermo Fisher Scientific, San Jose, California 95134, United States
- Rafael D Melani
- Thermo Fisher Scientific, San Jose, California 95134, United States
- Devin K Schweppe
- University of Washington, Seattle, Washington 98105, United States

22
Sun Z, Ning Z, Cheng K, Duan H, Wu Q, Mayne J, Figeys D. MetaPep: A core peptide database for faster human gut metaproteomics database searches. Comput Struct Biotechnol J 2023; 21:4228-4237. [PMID: 37692080 PMCID: PMC10491838 DOI: 10.1016/j.csbj.2023.08.025] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/14/2023] [Revised: 08/18/2023] [Accepted: 08/25/2023] [Indexed: 09/12/2023] Open
Abstract
Metaproteomics has increasingly been applied to study functional changes in the human gut microbiome. Peptide identification is an important step in metaproteomics research, with sequence database search (SDS) and spectral library search (SLS) as the two main methods for identifying peptides. However, the large search space in metaproteomics studies poses significant challenges for both identification methods. Moreover, with the development of mass spectrometry, it is now feasible to perform metaproteomic projects involving 100-1000 individual microbiomes. These large-scale projects create a conundrum for searching large databases. In this study, we constructed MetaPep, a core peptide database (including collections of both peptide sequences and tandem MS spectra) that greatly accelerates peptide identification. Raw files from fifteen metaproteomics projects were re-analyzed, and the identified peptide-spectrum matches (PSMs) were used to construct the MetaPep database. The constructed MetaPep database achieved rapid and accurate identification of peptides for human gut metaproteomics. MetaPep has a large collection of peptides and spectra that have been identified in published human gut metaproteomics datasets. The MetaPep database can be used as an important resource in the current stage of human gut metaproteomics research. This study showed the possibility of applying a core peptide database as a generic metaproteomics workflow. MetaPep could also be an important resource for future human gut metaproteomics research, such as data-independent acquisition (DIA) analysis.
Collapse
Affiliation(s)
- Zhongzhi Sun
- School of Pharmaceutical Sciences, Faculty of Medicine, University of Ottawa, Ottawa, ON K1H 8M5, Canada
- Department of Biochemistry, Microbiology and Immunology, Faculty of Medicine, University of Ottawa, Ottawa, ON K1H 8M5, Canada
| | - Zhibin Ning
- School of Pharmaceutical Sciences, Faculty of Medicine, University of Ottawa, Ottawa, ON K1H 8M5, Canada
| | - Kai Cheng
- School of Pharmaceutical Sciences, Faculty of Medicine, University of Ottawa, Ottawa, ON K1H 8M5, Canada
| | - Haonan Duan
- School of Pharmaceutical Sciences, Faculty of Medicine, University of Ottawa, Ottawa, ON K1H 8M5, Canada
- Department of Biochemistry, Microbiology and Immunology, Faculty of Medicine, University of Ottawa, Ottawa, ON K1H 8M5, Canada
| | - Qing Wu
- School of Pharmaceutical Sciences, Faculty of Medicine, University of Ottawa, Ottawa, ON K1H 8M5, Canada
- Department of Biochemistry, Microbiology and Immunology, Faculty of Medicine, University of Ottawa, Ottawa, ON K1H 8M5, Canada
| | - Janice Mayne
- School of Pharmaceutical Sciences, Faculty of Medicine, University of Ottawa, Ottawa, ON K1H 8M5, Canada
| | - Daniel Figeys
- School of Pharmaceutical Sciences, Faculty of Medicine, University of Ottawa, Ottawa, ON K1H 8M5, Canada
|
23
|
Son J, Na S, Paek E. DbyDeep: Exploration of MS-Detectable Peptides via Deep Learning. Anal Chem 2023; 95:11193-11200. [PMID: 37459568 PMCID: PMC10401496 DOI: 10.1021/acs.analchem.3c00460] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2023] [Accepted: 07/05/2023] [Indexed: 08/02/2023]
Abstract
Predicting peptide detectability is useful in a variety of mass spectrometry (MS)-based proteomics applications, particularly targeted proteomics. However, most machine learning-based computational methods have relied solely on information from the peptide itself, such as its amino acid sequences or physicochemical properties, despite the fact that peptides detected by MS are dependent on many factors, including protein sample preparation, digestion, separation, ionization, and precursor selection during MS experiments. DbyDeep (Detectability by Deep learning) is an innovative end-to-end LSTM network model for peptide detectability prediction that incorporates sequence contexts of peptides and their cleavage sites (by protease). Utilizing the cleavage site contexts could improve the performance of prediction, and DbyDeep outperformed existing methods in predicting peptides recognizable from multiple MS/MS data sets with diverse species and MS instruments. We argue for the necessity of a learning model that encompasses several contexts associated with peptide detection, as opposed to depending just on peptide sequences. There is a Python implementation of DbyDeep at https://github.com/BISCodeRepo/DbyDeep.
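The idea of pairing each peptide with the sequence context around its cleavage sites can be illustrated with a minimal sketch. This is not the DbyDeep code: the trypsin rule (cleave after K or R unless followed by P), the window size `flank`, the padding character, and the function names are all assumptions chosen for illustration, and the real model feeds such contexts into an LSTM rather than returning them as strings.

```python
def cleavage_sites(protein):
    """Return indices i where trypsin is assumed to cut between protein[i-1] and protein[i]."""
    return [
        i
        for i in range(1, len(protein))
        if protein[i - 1] in "KR" and protein[i] != "P"
    ]


def peptides_with_context(protein, flank=3, min_len=5, pad="-"):
    """List (peptide, n_term_context, c_term_context) for fully cleaved peptides.

    Each context is a window of `flank` residues on either side of the
    cleavage site; positions beyond the protein termini are filled with `pad`.
    """
    bounds = [0] + cleavage_sites(protein) + [len(protein)]
    # Padding lets windows at the protein termini keep a fixed length.
    padded = pad * flank + protein + pad * flank
    result = []
    for start, end in zip(bounds, bounds[1:]):
        if end - start < min_len:
            continue  # skip peptides too short to be useful
        # Protein index i sits at padded index i + flank, so the window
        # [i - flank, i + flank) becomes padded[i : i + 2 * flank].
        n_context = padded[start : start + 2 * flank]
        c_context = padded[end : end + 2 * flank]
        result.append((protein[start:end], n_context, c_context))
    return result


if __name__ == "__main__":
    for row in peptides_with_context("AAAKGGGGGRPCCCCKDDDDD"):
        print(row)
```

A detectability model in this spirit would take the peptide and both context windows as separate inputs, so that two identical peptides from different proteins can receive different predictions.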
Affiliation(s)
- Juho Son
- Department of Computer Science, Hanyang University, Seoul 04763, Republic of Korea
| | - Seungjin Na
- Department of Computer Science, Hanyang University, Seoul 04763, Republic of Korea
- Institute for Artificial Intelligence Research, Hanyang University, Seoul 04763, Republic of Korea
| | - Eunok Paek
- Department of Computer Science, Hanyang University, Seoul 04763, Republic of Korea
- Institute for Artificial Intelligence Research, Hanyang University, Seoul 04763, Republic of Korea
|
24
|
Yang KL, Yu F, Teo GC, Li K, Demichev V, Ralser M, Nesvizhskii AI. MSBooster: improving peptide identification rates using deep learning-based features. Nat Commun 2023; 14:4539. [PMID: 37500632 PMCID: PMC10374903 DOI: 10.1038/s41467-023-40129-9] [Citation(s) in RCA: 34] [Impact Index Per Article: 34.0] [Received: 11/07/2022] [Accepted: 07/06/2023] [Indexed: 07/29/2023] Open
Abstract
Peptide identification in liquid chromatography-tandem mass spectrometry (LC-MS/MS) experiments relies on computational algorithms for matching acquired MS/MS spectra against sequences of candidate peptides using database search tools, such as MSFragger. Here, we present a new tool, MSBooster, for rescoring peptide-to-spectrum matches using additional features incorporating deep learning-based predictions of peptide properties, such as LC retention time, ion mobility, and MS/MS spectra. We demonstrate the utility of MSBooster, in tandem with MSFragger and Percolator, in several different workflows, including nonspecific searches (immunopeptidomics), direct identification of peptides from data independent acquisition data, single-cell proteomics, and data generated on an ion mobility separation-enabled timsTOF MS platform. MSBooster is fast, robust, and fully integrated into the widely used FragPipe computational platform.
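The rescoring idea described in this abstract, comparing deep learning predictions with what was observed and passing the agreement to a rescoring tool as extra features, can be sketched as follows. This is an illustrative sketch, not MSBooster's implementation: the dictionary layout of a PSM, the feature names, and the choice of cosine similarity and absolute retention time difference are assumptions.

```python
import math


def cosine_similarity(predicted, observed):
    """Cosine similarity between two spectra given as {fragment_ion: intensity}."""
    ions = set(predicted) | set(observed)
    dot = sum(predicted.get(ion, 0.0) * observed.get(ion, 0.0) for ion in ions)
    norm_p = math.sqrt(sum(v * v for v in predicted.values()))
    norm_o = math.sqrt(sum(v * v for v in observed.values()))
    if norm_p == 0.0 or norm_o == 0.0:
        return 0.0  # an empty spectrum carries no evidence
    return dot / (norm_p * norm_o)


def rescoring_features(psm):
    """Derive two extra features for one peptide-spectrum match (hypothetical layout)."""
    return {
        "spectral_similarity": cosine_similarity(
            psm["predicted_intensities"], psm["observed_intensities"]
        ),
        "delta_rt": abs(psm["predicted_rt"] - psm["observed_rt"]),
    }


if __name__ == "__main__":
    example_psm = {
        "predicted_intensities": {"b2": 0.4, "y3": 1.0, "y4": 0.6},
        "observed_intensities": {"b2": 0.5, "y3": 0.9, "y5": 0.2},
        "predicted_rt": 10.0,
        "observed_rt": 12.5,
    }
    print(rescoring_features(example_psm))
```

A good match tends to show high spectral similarity and a small retention time difference, which is what lets a downstream classifier such as Percolator separate true from false matches more cleanly.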
Affiliation(s)
- Kevin L Yang
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
| | - Fengchao Yu
- Department of Pathology, University of Michigan, Ann Arbor, MI, USA.
| | - Guo Ci Teo
- Department of Pathology, University of Michigan, Ann Arbor, MI, USA
| | - Kai Li
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
| | - Vadim Demichev
- Department of Biochemistry, Charité Universitätsmedizin, Berlin, Germany
- Department of Biochemistry, University of Cambridge, Cambridge, UK
| | - Markus Ralser
- Department of Biochemistry, Charité Universitätsmedizin, Berlin, Germany
- Nuffield Department of Medicine, The Wellcome Centre for Human Genetics, University of Oxford, Oxford, UK
- Max Planck Institute for Molecular Genetics, Berlin, Germany
| | - Alexey I Nesvizhskii
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA.
- Department of Pathology, University of Michigan, Ann Arbor, MI, USA.
|
25
|
Yu F, Teo GC, Kong AT, Fröhlich K, Li GX, Demichev V, Nesvizhskii AI. Analysis of DIA proteomics data using MSFragger-DIA and FragPipe computational platform. Nat Commun 2023; 14:4154. [PMID: 37438352 PMCID: PMC10338508 DOI: 10.1038/s41467-023-39869-5] [Citation(s) in RCA: 30] [Impact Index Per Article: 30.0] [Received: 11/07/2022] [Accepted: 06/28/2023] [Indexed: 07/14/2023] Open
Abstract
Liquid chromatography (LC) coupled with data-independent acquisition (DIA) mass spectrometry (MS) has been increasingly used in quantitative proteomics studies. Here, we present a fast and sensitive approach for direct peptide identification from DIA data, MSFragger-DIA, which leverages the unmatched speed of the fragment ion indexing-based search engine MSFragger. Different from most existing methods, MSFragger-DIA conducts a database search of the DIA tandem mass (MS/MS) spectra prior to spectral feature detection and peak tracing across the LC dimension. To streamline the analysis of DIA data and enable easy reproducibility, we integrate MSFragger-DIA into the FragPipe computational platform for seamless support of peptide identification and spectral library building from DIA, data-dependent acquisition (DDA), or both data types combined. We compare MSFragger-DIA with other DIA tools, such as DIA-Umpire based workflow in FragPipe, Spectronaut, DIA-NN library-free, and MaxDIA. We demonstrate the fast, sensitive, and accurate performance of MSFragger-DIA across a variety of sample types and data acquisition schemes, including single-cell proteomics, phosphoproteomics, and large-scale tumor proteome profiling studies.
Affiliation(s)
- Fengchao Yu
- Department of Pathology, University of Michigan, Ann Arbor, MI, USA.
| | - Guo Ci Teo
- Department of Pathology, University of Michigan, Ann Arbor, MI, USA
| | - Andy T Kong
- Department of Pathology, University of Michigan, Ann Arbor, MI, USA
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
| | - Klemens Fröhlich
- Proteomics Core Facility, Biozentrum, University of Basel, Basel, Switzerland
| | - Ginny Xiaohe Li
- Department of Pathology, University of Michigan, Ann Arbor, MI, USA
| | - Vadim Demichev
- Department of Biochemistry, Charité - Universitätsmedizin Berlin, Berlin, Germany
- Department of Biochemistry, University of Cambridge, Cambridge, UK
| | - Alexey I Nesvizhskii
- Department of Pathology, University of Michigan, Ann Arbor, MI, USA.
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA.
|
26
|
Geer LY, Lapin J, Slotta DJ, Mak TD, Stein SE. AIomics: Exploring More of the Proteome Using Mass Spectral Libraries Extended by Artificial Intelligence. J Proteome Res 2023; 22:2246-2255. [PMID: 37232537 PMCID: PMC10542943 DOI: 10.1021/acs.jproteome.2c00807] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/27/2023]
Abstract
The unbounded permutations of biological molecules, including proteins and their constituent peptides, present a dilemma in identifying the components of complex biosamples. Sequence search algorithms used to identify peptide spectra can be expanded to cover larger classes of molecules, including more modifications, isoforms, and atypical cleavage, but at the cost of false positives or false negatives due to the simplified spectra they compute from sequence records. Spectral library searching can help solve this issue by precisely matching experimental spectra to library spectra with excellent sensitivity and specificity. However, compiling spectral libraries that span entire proteomes is pragmatically difficult. Neural networks that predict complete spectra containing a full range of annotated and unannotated ions can be used to replace these simplified spectra with libraries of fully predicted spectra, including modified peptides. Using such a network, we created predicted spectral libraries that were used to rescore matches from a sequence search done over a large search space, including a large number of modifications. Rescoring improved the separation of true and false hits by 82%, yielding an 8% increase in peptide identifications, including a 21% increase in nonspecifically cleaved peptides and a 17% increase in phosphopeptides.
Affiliation(s)
- Lewis Y. Geer
- Mass Spectrometry Data Center, National Institute of Standards and Technology, Biomolecular Measurement Division, 100 Bureau Dr., Gaithersburg, Maryland 20899, United States
| | - Joel Lapin
- Department of Physics, Georgetown University, Washington, DC 20057, United States
- Associate, Mass Spectrometry Data Center, National Institute of Standards and Technology, Biomolecular Measurement Division, 100 Bureau Dr., Gaithersburg, Maryland 20899, United States
| | - Douglas J. Slotta
- Mass Spectrometry Data Center, National Institute of Standards and Technology, Biomolecular Measurement Division, 100 Bureau Dr., Gaithersburg, Maryland 20899, United States
| | - Tytus D. Mak
- Mass Spectrometry Data Center, National Institute of Standards and Technology, Biomolecular Measurement Division, 100 Bureau Dr., Gaithersburg, Maryland 20899, United States
| | - Stephen E. Stein
- Mass Spectrometry Data Center, National Institute of Standards and Technology, Biomolecular Measurement Division, 100 Bureau Dr., Gaithersburg, Maryland 20899, United States
|
27
|
Liu K, Zhang L, Qi Q, Li J, Yan F, Hou J. Growth hormone treatment improves the development of follicles and oocytes in prepubertal lambs. J Ovarian Res 2023; 16:132. [PMID: 37408062 DOI: 10.1186/s13048-023-01209-y] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 01/31/2023] [Accepted: 06/17/2023] [Indexed: 07/07/2023] Open
Abstract
BACKGROUND When prepubertal lambs are superovulated, the ovarian response to gonadotropin stimulation varies greatly among individuals and the collected oocytes have lower developmental ability than those of adult ewes. Over the years, growth hormone (GH) has been used in assisted reproduction because it can improve the reproductive performance in humans and animals. However, the effect of GH on ovaries and oocytes of prepubertal lambs remains unclear. METHODS Before and during follicle-stimulating hormone (FSH) superovulation of prepubertal lambs (4‒6-week-old), the lambs were treated with a high (50 mg) or low dose (25 mg) of ovine GH over a long (5 days) or short (2 days) period. The recovered oocytes were used for in vitro maturation and fertilization, and several parameters of oocyte quality and development capacity were evaluated. The possible underlying mechanisms of GH action were explored by analysis of granulosa cell (GC) transcriptome, ovarian proteome and follicular fluid metabolome. RESULTS Treatment of lambs with 50 mg GH over 5 days (long treatment) potentially promoted the response of lambs to superovulation and improved the development capacity of retrieved oocytes, consequently increasing the high-quality embryo yield from lambs. A number of differentially expressed genes or proteins were found in ovaries between GH-treated and untreated lambs. Cellular experiments revealed that GH reduced the oxidative stress of GCs and promoted GC proliferation, probably through activation of the PI3K/Akt signaling pathway. Finally, analysis of follicular fluid metabolome indicated that GH treatment altered the abundance of many metabolites in follicular fluid, such as antioxidants and fatty acids. CONCLUSIONS GH treatment has a beneficial effect on the function of lamb ovaries, which supports the development of follicles and oocytes and improves the efficiency of embryo production from prepubertal lambs.
Affiliation(s)
- Kexiong Liu
- State Key Laboratory of Animal Biotech Breeding, College of Biological Sciences, China Agricultural University, Yuan-Ming-Yuan West Road, Haidian District, Beijing, 100193, China
| | - Luyao Zhang
- State Key Laboratory of Animal Biotech Breeding, College of Biological Sciences, China Agricultural University, Yuan-Ming-Yuan West Road, Haidian District, Beijing, 100193, China
| | - Qi Qi
- State Key Laboratory of Animal Biotech Breeding, College of Biological Sciences, China Agricultural University, Yuan-Ming-Yuan West Road, Haidian District, Beijing, 100193, China
| | - Junjin Li
- State Key Laboratory of Animal Biotech Breeding, College of Biological Sciences, China Agricultural University, Yuan-Ming-Yuan West Road, Haidian District, Beijing, 100193, China
| | - Fengxiang Yan
- State Key Laboratory of Animal Biotech Breeding, College of Biological Sciences, China Agricultural University, Yuan-Ming-Yuan West Road, Haidian District, Beijing, 100193, China
| | - Jian Hou
- State Key Laboratory of Animal Biotech Breeding, College of Biological Sciences, China Agricultural University, Yuan-Ming-Yuan West Road, Haidian District, Beijing, 100193, China.
|
28
|
Révész Á, Hevér H, Steckel A, Schlosser G, Szabó D, Vékey K, Drahos L. Collision energies: Optimization strategies for bottom-up proteomics. Mass Spectrom Rev 2023; 42:1261-1299. [PMID: 34859467 DOI: 10.1002/mas.21763] [Citation(s) in RCA: 14] [Impact Index Per Article: 14.0] [Received: 06/11/2021] [Revised: 11/17/2021] [Accepted: 11/17/2021] [Indexed: 06/07/2023]
Abstract
Mass spectrometry coupled to liquid chromatography is an indispensable tool in the field of proteomics. In the last decades, more and more complex and diverse biochemical and biomedical questions have arisen. Problems to be solved involve protein identification, quantitative analysis, screening of low-abundance modifications, handling matrix effects, and concentrations differing by orders of magnitude. This led to the development of more tailored protocols and problem-centered proteomics workflows, including advanced choice of experimental parameters. In the most widespread bottom-up approach, the choice of collision energy in tandem mass spectrometric experiments plays an outstanding role. This review presents the collision energy optimization strategies in the field of proteomics which can help fully exploit the potential of MS-based proteomics techniques. A systematic collection of use case studies is then presented to serve as a starting point for related further scientific work. Finally, this article discusses the issue of comparing results from different studies or obtained on different instruments, and it gives some hints on methodology transfer between laboratories based on measurement of reference species.
Affiliation(s)
- Ágnes Révész
- MS Proteomics Research Group, Institute of Organic Chemistry, Research Centre for Natural Sciences, Budapest, Hungary
| | - Helga Hevér
- Chemical Works of Gedeon Richter Plc, Budapest, Hungary
| | - Arnold Steckel
- Department of Analytical Chemistry, MTA-ELTE Lendület Ion Mobility Mass Spectrometry Research Group, Institute of Chemistry, ELTE Eötvös Loránd University, Budapest, Hungary
| | - Gitta Schlosser
- Department of Analytical Chemistry, MTA-ELTE Lendület Ion Mobility Mass Spectrometry Research Group, Institute of Chemistry, ELTE Eötvös Loránd University, Budapest, Hungary
| | - Dániel Szabó
- MS Proteomics Research Group, Institute of Organic Chemistry, Research Centre for Natural Sciences, Budapest, Hungary
| | - Károly Vékey
- MS Proteomics Research Group, Institute of Organic Chemistry, Research Centre for Natural Sciences, Budapest, Hungary
| | - László Drahos
- MS Proteomics Research Group, Institute of Organic Chemistry, Research Centre for Natural Sciences, Budapest, Hungary
|
29
|
He Q, Zhong CQ, Li X, Guo H, Li Y, Gao M, Yu R, Liu X, Zhang F, Guo D, Ye F, Guo T, Shuai J, Han J. Dear-DIA XMBD: Deep Autoencoder Enables Deconvolution of Data-Independent Acquisition Proteomics. Research (Wash D C) 2023; 6:0179. [PMID: 37377457 PMCID: PMC10292580 DOI: 10.34133/research.0179] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 12/08/2022] [Accepted: 06/01/2023] [Indexed: 06/29/2023]
Abstract
Data-independent acquisition (DIA) technology for protein identification from mass spectrometry and related algorithms is developing rapidly. The spectrum-centric analysis of DIA data without the use of a spectral library from data-dependent acquisition data represents a promising direction. In this paper, we propose an untargeted analysis method, Dear-DIA XMBD, for direct analysis of DIA data. Dear-DIA XMBD first integrates the deep variational autoencoder and triplet loss to learn the representations of the extracted fragment ion chromatograms, then uses the k-means clustering algorithm to aggregate fragments with similar representations into the same classes, and finally establishes the inverted index tables to determine the precursors of fragment clusters between precursors and peptides and between fragments and peptides. We show that Dear-DIA XMBD performs superiorly with the highly complicated DIA data of different species obtained by different instrument platforms. Dear-DIA XMBD is publicly available at https://github.com/jianweishuai/Dear-DIA-XMBD.
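The clustering step in this abstract, grouping fragment ions whose chromatograms look alike, can be sketched in a few lines. This is a simplified illustration and not the published method: max-normalised raw intensity traces stand in for the learned autoencoder representations, the centroids are initialised from the first `k` traces to keep the example deterministic, and all function names are hypothetical.

```python
def normalise(trace):
    """Scale a chromatogram so that its apex intensity is 1."""
    apex = max(trace)
    return [v / apex for v in trace] if apex > 0 else list(trace)


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20):
    """Plain k-means; returns one cluster label per point."""
    # Deterministic initialisation (an assumption made for this sketch).
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


if __name__ == "__main__":
    # Four fragment chromatograms: traces 0 and 2 peak at the third scan,
    # traces 1 and 3 at the second, so they should come from two precursors.
    xics = [
        [0, 1, 5, 1, 0],
        [0, 5, 1, 0, 0],
        [0, 2, 10, 2, 0],
        [0, 10, 2, 0, 0],
    ]
    print(kmeans([normalise(x) for x in xics], k=2))
```

Fragments that share a label co-elute and are therefore candidates for the same precursor, which is the grouping that the later index lookup would operate on.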
Affiliation(s)
- Qingzu He
- Department of Physics, and Fujian Provincial Key Laboratory for Soft Functional Materials Research, Xiamen University, Xiamen 361005, China
- Oujiang Laboratory (Zhejiang Lab for Regenerative Medicine, Vision and Brain Health) and Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou, Zhejiang 325001, China
| | - Chuan-Qi Zhong
- School of Life Sciences, Xiamen University, Xiamen 361102, China
- State Key Laboratory of Cellular Stress Biology, Innovation Center for Cell Signaling Network, Xiamen 361102, China
| | - Xiang Li
- Department of Physics, and Fujian Provincial Key Laboratory for Soft Functional Materials Research, Xiamen University, Xiamen 361005, China
- State Key Laboratory of Cellular Stress Biology, Innovation Center for Cell Signaling Network, Xiamen 361102, China
| | - Huan Guo
- Department of Physics, and Fujian Provincial Key Laboratory for Soft Functional Materials Research, Xiamen University, Xiamen 361005, China
| | - Yiming Li
- Department of Physics, and Fujian Provincial Key Laboratory for Soft Functional Materials Research, Xiamen University, Xiamen 361005, China
| | - Mingxuan Gao
- Department of Computer Science, Xiamen University, Xiamen 361005, China
| | - Rongshan Yu
- Department of Computer Science, Xiamen University, Xiamen 361005, China
- National Institute for Data Science in Health and Medicine, School of Medicine, Xiamen University, Xiamen 361102, China
| | - Xianming Liu
- Bruker (Beijing) Scientific Technology Co. Ltd., Beijing, China
| | - Fangfei Zhang
- Westlake Laboratory of Life Sciences and Biomedicine, Key Laboratory of Structural Biology of Zhejiang Province, School of Life Sciences, Westlake University, 18 Shilongshan Road, Hangzhou 310024, China
- Institute of Basic Medical Sciences, Westlake Institute for Advanced Study, 18 Shilongshan Road, Hangzhou 310024, China
| | - Donghui Guo
- Department of Electronic Engineering, Xiamen University, Xiamen 361005, China
| | - Fangfu Ye
- Oujiang Laboratory (Zhejiang Lab for Regenerative Medicine, Vision and Brain Health) and Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou, Zhejiang 325001, China
| | - Tiannan Guo
- Westlake Laboratory of Life Sciences and Biomedicine, Key Laboratory of Structural Biology of Zhejiang Province, School of Life Sciences, Westlake University, 18 Shilongshan Road, Hangzhou 310024, China
- Institute of Basic Medical Sciences, Westlake Institute for Advanced Study, 18 Shilongshan Road, Hangzhou 310024, China
- Westlake Omics Ltd., Yunmeng Road 1, Hangzhou, China
| | - Jianwei Shuai
- Department of Physics, and Fujian Provincial Key Laboratory for Soft Functional Materials Research, Xiamen University, Xiamen 361005, China
- Oujiang Laboratory (Zhejiang Lab for Regenerative Medicine, Vision and Brain Health) and Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou, Zhejiang 325001, China
- State Key Laboratory of Cellular Stress Biology, Innovation Center for Cell Signaling Network, Xiamen 361102, China
- National Institute for Data Science in Health and Medicine, School of Medicine, Xiamen University, Xiamen 361102, China
| | - Jiahuai Han
- School of Life Sciences, Xiamen University, Xiamen 361102, China
- State Key Laboratory of Cellular Stress Biology, Innovation Center for Cell Signaling Network, Xiamen 361102, China
- National Institute for Data Science in Health and Medicine, School of Medicine, Xiamen University, Xiamen 361102, China
|
30
|
Hamza GM, Miele E, Wojchowski DM, Toran P, Worsfold CR, Anthonymuthu TS, Bergo VB, Zhang AX, Silva JC. Affi-BAMS™: A Robust Targeted Proteomics Microarray Platform to Measure Histone Post-Translational Modifications. Int J Mol Sci 2023; 24:10060. [PMID: 37373206 DOI: 10.3390/ijms241210060] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/25/2023] [Revised: 06/08/2023] [Accepted: 06/11/2023] [Indexed: 06/29/2023] Open
Abstract
For targeted protein panels, the ability to specifically assay post-translational modifications (PTMs) in a quantitative, sensitive, and straightforward manner would substantially advance biological and pharmacological studies. The present study highlights the effectiveness of the Affi-BAMS™ epitope-directed affinity bead capture/MALDI MS platform for quantitatively defining complex PTM marks of H3 and H4 histones. Using H3 and H4 histone peptides and isotopically labelled derivatives, this affinity bead and MALDI MS platform achieves a range of >3 orders of magnitude with a technical precision CV of <5%. Using nuclear cellular lysates, Affi-BAMS PTM-peptide capture resolves heterogeneous histone N-terminal PTMs with as little as 100 µg of starting material. In an HDAC inhibitor and MCF7 cell line model, the ability to monitor dynamic histone H3 acetylation and methylation events is further demonstrated (including SILAC quantification). Affi-BAMS (and its capacity for the multiplexing of samples and target PTM-proteins) thus provides a uniquely efficient and effective approach for analyzing dynamic epigenetic histone marks, which is critical for the regulation of chromatin structure and gene expression.
Affiliation(s)
- Ghaith M Hamza
- Discovery Biology, Discovery Sciences, R&D, AstraZeneca, Boston, MA 02451, USA
- Molecular, Cellular and Biomedical Sciences, University of New Hampshire, Durham, NH 03824, USA
| | - Eric Miele
- Discovery Biology, Discovery Sciences, R&D, AstraZeneca, Boston, MA 02451, USA
| | - Don M Wojchowski
- Molecular, Cellular and Biomedical Sciences, University of New Hampshire, Durham, NH 03824, USA
| | - Paul Toran
- Molecular, Cellular and Biomedical Sciences, University of New Hampshire, Durham, NH 03824, USA
| | | | | | | | - Andrew X Zhang
- Discovery Biology, Discovery Sciences, R&D, AstraZeneca, Boston, MA 02451, USA
| | - Jeffrey C Silva
- Adeptrix Corporation, Beverly, MA 01915, USA
- Cell Signaling Technology, Danvers, MA 01915, USA
|
31
|
Souza Junior DR, Silva ARM, Ronsein GE. Strategies for consistent and automated quantification of HDL proteome using data-independent acquisition (DIA). J Lipid Res 2023:100397. [PMID: 37286042 PMCID: PMC10339053 DOI: 10.1016/j.jlr.2023.100397] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2023] [Revised: 05/11/2023] [Accepted: 05/31/2023] [Indexed: 06/09/2023] Open
Abstract
The introduction of mass spectrometry-based proteomics has revolutionized the HDL field, with the description, characterization and implication of HDL-associated proteins in an array of pathologies. However, acquiring robust, reproducible data is still a challenge in the quantitative assessment of the HDL proteome. Data-independent acquisition (DIA) is a mass spectrometry methodology that allows the acquisition of reproducible data, but data analysis remains a challenge in the field. To date, there is no consensus on how to process DIA-derived data for HDL proteomics. Here, we developed a pipeline aiming to standardize HDL proteome quantification. We optimized instrument parameters, and compared the performance of four freely available, user-friendly software tools (DIA-NN, EncyclopeDIA, MaxDIA and Skyline) in processing DIA data. Importantly, pooled samples were used as quality controls throughout our experimental setup. A careful evaluation of precision, linearity, and detection limits, first using E. coli background for HDL proteomics, and second using HDL proteome and synthetic peptides, was undertaken. Finally, as a proof of concept, we employed our optimized and automated pipeline to quantify the proteome of HDL and apolipoprotein B (APOB)-containing lipoproteins. Our results show that determination of precision is key to confidently and consistently quantifying HDL proteins. Taking this precaution, any of the available software tested here would be appropriate for quantification of the HDL proteome, although their performance varied considerably.
Affiliation(s)
| | | | - Graziella Eliza Ronsein
- Department of Biochemistry, Institute of Chemistry, University of São Paulo, São Paulo, Brazil.
|
32
|
Wen C, Wu X, Lin G, Yan W, Gan G, Xu X, Chen XY, Chen X, Liu X, Fu G, Zhong CQ. Evaluation of DDA Library-Free Strategies for Phosphoproteomics and Ubiquitinomics Data-Independent Acquisition Data. J Proteome Res 2023. [PMID: 37256709 DOI: 10.1021/acs.jproteome.2c00735] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/02/2023]
Abstract
Phosphoproteomics and ubiquitinomics data-independent acquisition (DIA) mass spectrometry (MS) data is typically analyzed by using a data-dependent acquisition (DDA) spectral library. The performance of various library-free strategies for analyzing phosphoproteomics and ubiquitinomics DIA MS data has not been evaluated. In this study, we systematically compare four commonly used DDA library-free approaches including Spectronaut's directDIA, DIA-Umpire, DIA-MSFragger, and in silico-predicted library for analysis of phosphoproteomics SWATH, DIA, and diaPASEF data as well as ubiquitinomics diaPASEF data. Spectronaut's directDIA shows the highest sensitivity for phosphopeptide detection not only in synthetic phosphopeptide samples but also in phosphoproteomics SWATH-MS and DIA data from real biological samples, when compared to the other three library-free strategies. For phosphoproteomics diaPASEF data, Spectronaut's directDIA and the in silico-predicted library based on DIA-NN identify almost the same number of phosphopeptides as a project-specific DDA spectral library. However, only about 30% of the total phosphopeptides are commonly identified, suggesting that the library-free strategies for phospho-diaPASEF data need further improvement in terms of sensitivity. For ubiquitinomics diaPASEF data, the in silico-predicted library performs the best among the four workflows and detects ∼50% more K-GG peptides than a project-specific DDA spectral library. Our results demonstrate that Spectronaut's directDIA is suitable for the analysis of phosphoproteomics SWATH-MS and DIA MS data, while the in silico-predicted library based on DIA-NN shows substantial advantages for ubiquitinomics diaPASEF MS data.
Affiliation(s)
- Chengwen Wen
- State Key Laboratory of Cellular Stress Biology, School of Life Sciences, Faculty of Medicine and Life Sciences, Xiamen University, Xiamen 361005, Fujian, China
| | - Xiurong Wu
- State Key Laboratory of Cellular Stress Biology, School of Life Sciences, Faculty of Medicine and Life Sciences, Xiamen University, Xiamen 361005, Fujian, China
| | - Guanzhong Lin
- State Key Laboratory of Cellular Stress Biology, School of Life Sciences, Faculty of Medicine and Life Sciences, Xiamen University, Xiamen 361005, Fujian, China
| | - Wei Yan
- State Key Laboratory of Cellular Stress Biology, School of Life Sciences, Faculty of Medicine and Life Sciences, Xiamen University, Xiamen 361005, Fujian, China
| | - Guohong Gan
- State Key Laboratory of Cellular Stress Biology, School of Life Sciences, Faculty of Medicine and Life Sciences, Xiamen University, Xiamen 361005, Fujian, China
| | - Xiao Xu
- State Key Laboratory of Cellular Stress Biology, School of Life Sciences, Faculty of Medicine and Life Sciences, Xiamen University, Xiamen 361005, Fujian, China
| | - Xiang-Yu Chen
- State Key Laboratory of Cellular Stress Biology, School of Life Sciences, Faculty of Medicine and Life Sciences, Xiamen University, Xiamen 361005, Fujian, China
| | - Xi Chen
- SpecAlly Life Technology Co., Ltd., Wuhan 430074, Hubei, China
| | - Xianming Liu
- Shanghai Cancer Center and Institutes of Biomedical Sciences, Fudan University, Shanghai 200030, China
| | - Guo Fu
- School of Medicine, Faculty of Medicine and Life Sciences, Xiamen University, Xiamen 361005, Fujian, China
| | - Chuan-Qi Zhong
- State Key Laboratory of Cellular Stress Biology, School of Life Sciences, Faculty of Medicine and Life Sciences, Xiamen University, Xiamen 361005, Fujian, China
|
33
|
Chen M, Zhu P, Wan Q, Ruan X, Wu P, Hao Y, Zhang Z, Sun J, Nie W, Chen S. High-Coverage Four-Dimensional Data-Independent Acquisition Proteomics and Phosphoproteomics Enabled by Deep Learning-Driven Multidimensional Predictions. Anal Chem 2023; 95:7495-7502. [PMID: 37126374 DOI: 10.1021/acs.analchem.2c05414] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 05/02/2023]
Abstract
Four-dimensional (4D) data-independent acquisition (DIA)-based proteomics is a promising technology. However, its full performance is restricted by the time-consuming construction and limited coverage of a project-specific experimental library. Herein, we developed Deep4D, a versatile multifunctional deep learning model based on self-attention that can predict the collisional cross section, retention time, fragment ion intensity, and charge state with high accuracy for both unmodified and phosphorylated peptides, and thus established complete workflows for high-coverage 4D DIA proteomics and phosphoproteomics based on multidimensional predictions. A 4D predicted library containing ∼2 million peptides was established that enables DIA analysis without an experimental library, and 33% more proteins were identified than with an experimental library from a single-shot measurement in the example of HeLa cells. These results demonstrate the great value of these convenient, high-coverage 4D DIA proteomics methods.
Affiliation(s)
- Moran Chen
- The Institute for Advanced Studies, Wuhan University, Wuhan, Hubei 430072, China
| | - Pujia Zhu
- The Institute for Advanced Studies, Wuhan University, Wuhan, Hubei 430072, China
| | - Qiongqiong Wan
- The Institute for Advanced Studies, Wuhan University, Wuhan, Hubei 430072, China
| | - Xianqin Ruan
- The Institute for Advanced Studies, Wuhan University, Wuhan, Hubei 430072, China
| | - Pengfei Wu
- The Institute for Advanced Studies, Wuhan University, Wuhan, Hubei 430072, China
| | - Yanhong Hao
- The Institute for Advanced Studies, Wuhan University, Wuhan, Hubei 430072, China
| | - Zhourui Zhang
- The Institute for Advanced Studies, Wuhan University, Wuhan, Hubei 430072, China
| | - Jian Sun
- The Institute for Advanced Studies, Wuhan University, Wuhan, Hubei 430072, China
| | - Wenjing Nie
- The Institute for Advanced Studies, Wuhan University, Wuhan, Hubei 430072, China
| | - Suming Chen
- The Institute for Advanced Studies, Wuhan University, Wuhan, Hubei 430072, China
|
34
|
Zhang Q. Mzion enables deep and precise identification of peptides in data-dependent acquisition proteomics. Sci Rep 2023; 13:7056. [PMID: 37120666 PMCID: PMC10148867 DOI: 10.1038/s41598-023-34323-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/19/2023] [Accepted: 04/27/2023] [Indexed: 05/01/2023] Open
Abstract
Sensitive and reliable identification of proteins and peptides forms the basis of proteomics. We introduce Mzion, a new database search tool for data-dependent acquisition (DDA) proteomics. Our tool utilizes an intensity tally strategy and generally achieves higher performance in terms of depth and precision across 20 datasets, ranging from large-scale to single-cell proteomics. Compared to several other search engines, Mzion matches on average 20% more peptide spectra at tryptic enzymatic specificity and 80% more at no enzymatic specificity from six large-scale, global datasets. Mzion also identifies more phosphopeptide spectra that can be explained by fewer proteins, as demonstrated by six large-scale, local datasets corresponding to the global data. Our findings highlight the potential of Mzion for improving proteomic analysis and advancing our understanding of protein biology.
Affiliation(s)
- Qiang Zhang
- Division of Endocrinology, Metabolism and Lipid Research, Washington University School of Medicine, St. Louis, MO, USA.
|
35
|
Zong Y, Wang Y, Yang Y, Zhao D, Wang X, Shen C, Qiao L. DeepFLR facilitates false localization rate control in phosphoproteomics. Nat Commun 2023; 14:2269. [PMID: 37080984 PMCID: PMC10119288 DOI: 10.1038/s41467-023-38035-1] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 06/02/2022] [Accepted: 04/06/2023] [Indexed: 04/22/2023] Open
Abstract
Protein phosphorylation is a post-translational modification crucial for many cellular processes and protein functions. Accurate identification and quantification of protein phosphosites at the proteome-wide level are challenging, not least because efficient tools for protein phosphosite false localization rate (FLR) control are lacking. Here, we propose DeepFLR, a deep learning-based framework for controlling the FLR in phosphoproteomics. DeepFLR includes a phosphopeptide tandem mass spectrum (MS/MS) prediction module based on deep learning and an FLR assessment module based on a target-decoy approach. DeepFLR improves the accuracy of phosphopeptide MS/MS prediction compared to existing tools. Furthermore, DeepFLR estimates FLR accurately for both synthetic and biological datasets, and localizes more phosphosites than probability-based methods. DeepFLR is compatible with data from different organisms, instrument types, and both data-dependent and data-independent acquisition approaches, thus enabling FLR estimation for a broad range of phosphoproteomics experiments.
Affiliation(s)
- Yu Zong
- Department of Chemistry, and Shanghai Stomatological Hospital, Fudan University, Shanghai, China
| | - Yuxin Wang
- Department of Chemistry, and Shanghai Stomatological Hospital, Fudan University, Shanghai, China
- Department of Computer Science, and Institute of Modern Languages and Linguistics, Fudan University, Shanghai, China
| | - Yi Yang
- Department of Chemistry, and Shanghai Stomatological Hospital, Fudan University, Shanghai, China
| | - Dan Zhao
- Department of Chemistry, and Shanghai Stomatological Hospital, Fudan University, Shanghai, China
| | | | | | - Liang Qiao
- Department of Chemistry, and Shanghai Stomatological Hospital, Fudan University, Shanghai, China.
|
36
|
Yang Y, Yang L, Zheng M, Cao D, Liu G. Data acquisition methods for non-targeted screening in environmental analysis. Trends Analyt Chem 2023. [DOI: 10.1016/j.trac.2023.116966] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/10/2023]
|
37
|
Martinković F, Popović M, Smolec O, Mrljak V, Eckersall PD, Horvatić A. Data Independent Acquisition Reveals In-Depth Serum Proteome Changes in Canine Leishmaniosis. Metabolites 2023; 13:metabo13030365. [PMID: 36984805 PMCID: PMC10059658 DOI: 10.3390/metabo13030365] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2023] [Revised: 02/19/2023] [Accepted: 02/26/2023] [Indexed: 03/05/2023] Open
Abstract
Comprehensive profiling of the serum proteome provides valuable clues about health status and pathophysiological processes, making it the main strategy in biomarker discovery. However, the high dynamic range significantly decreases the number of detectable proteins, obstructing insights into the underlying biological processes. To circumvent various serum enrichment methods, obtain high-quality proteome-wide information using next-generation proteomics, and study the host response in canine leishmaniosis, we applied data-independent acquisition mass spectrometry (DIA-MS) for deep proteomic profiling of clinical samples. The non-depleted serum samples of healthy and naturally Leishmania-infected dogs were analyzed using the label-free 60-min gradient sequential window acquisition of all theoretical mass spectra (SWATH-MS) method. As a result, we identified 554 proteins, 140 of which differed significantly in abundance. These proteins were involved in lipid metabolism, hematological abnormalities, immune response, and oxidative stress, providing valuable information about the complex molecular basis of the clinical and pathological landscape in canine leishmaniosis. Our results show that DIA-MS is a method of choice for understanding complex pathophysiological processes in serum and for serum biomarker development.
Affiliation(s)
- Franjo Martinković
- Faculty of Veterinary Medicine, University of Zagreb, Heinzelova 55, HR-10000 Zagreb, Croatia
| | - Marin Popović
- Department of Safety and Protection, Karlovac University of Applied Sciences, Trg Josipa Juraja Strossmayera 9, HR-47000 Karlovac, Croatia
| | - Ozren Smolec
- Faculty of Veterinary Medicine, University of Zagreb, Heinzelova 55, HR-10000 Zagreb, Croatia
| | - Vladimir Mrljak
- Faculty of Veterinary Medicine, University of Zagreb, Heinzelova 55, HR-10000 Zagreb, Croatia
| | - Peter David Eckersall
- School of Biodiversity, One Health and Veterinary Medicine, University of Glasgow, Bearsden Rd, Glasgow G61 1QH, UK
- Interdisciplinary Laboratory of Clinical Analysis of the University of Murcia (Interlab-UMU), Department of Animal Medicine and Surgery, Veterinary School, University of Murcia, 30100 Murcia, Spain
| | - Anita Horvatić
- Faculty of Food Technology and Biotechnology, University of Zagreb, Pierottijeva 6, HR-10000 Zagreb, Croatia
|
38
|
Searle BC, Shannon AE, Wilburn DB. Scribe: Next Generation Library Searching for DDA Experiments. J Proteome Res 2023; 22:482-490. [PMID: 36695531 DOI: 10.1021/acs.jproteome.2c00672] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Indexed: 01/26/2023]
Abstract
Spectrum library searching is a powerful alternative to database searching for data dependent acquisition experiments, but has been historically limited to identifying previously observed peptides in libraries. Here we present Scribe, a new library search engine designed to leverage deep learning fragmentation prediction software such as Prosit. Rather than relying on highly curated DDA libraries, this approach predicts fragmentation and retention times for every peptide in a FASTA database. Scribe embeds Percolator for false discovery rate correction and an interference tolerant, label-free quantification integrator for an end-to-end proteomics workflow. By leveraging expected relative fragmentation and retention time values, we find that library searching with Scribe can outperform traditional database searching tools both in terms of sensitivity and quantitative precision. Scribe and its graphical interface are easy to use, freely accessible, and fully open source.
Affiliation(s)
- Brian C Searle
- Department of Biomedical Informatics, The Ohio State University Medical Center, Columbus, Ohio 43210, United States; Department of Chemistry and Biochemistry, The Ohio State University, Columbus, Ohio 43210, United States; Pelotonia Institute for Immuno-Oncology, The Ohio State University Comprehensive Cancer Center, Columbus, Ohio 43210, United States; Proteome Software Inc., Portland, Oregon 97219, United States
| | - Ariana E Shannon
- Department of Biomedical Informatics, The Ohio State University Medical Center, Columbus, Ohio 43210, United States; Department of Chemistry and Biochemistry, The Ohio State University, Columbus, Ohio 43210, United States; Pelotonia Institute for Immuno-Oncology, The Ohio State University Comprehensive Cancer Center, Columbus, Ohio 43210, United States
| | - Damien Beau Wilburn
- Department of Biomedical Informatics, The Ohio State University Medical Center, Columbus, Ohio 43210, United States; Department of Chemistry and Biochemistry, The Ohio State University, Columbus, Ohio 43210, United States; Pelotonia Institute for Immuno-Oncology, The Ohio State University Comprehensive Cancer Center, Columbus, Ohio 43210, United States
|
39
|
Wang Y, Yang M, Ge F, Jiang B, Hu R, Zhou X, Yang Y, Liu M. Lysine Succinylation of VBS Contributes to Sclerotia Development and Aflatoxin Biosynthesis in Aspergillus flavus. Mol Cell Proteomics 2023; 22:100490. [PMID: 36566904 PMCID: PMC9879794 DOI: 10.1016/j.mcpro.2022.100490] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/10/2022] [Revised: 12/06/2022] [Accepted: 12/19/2022] [Indexed: 12/24/2022] Open
Abstract
Aspergillus flavus is a common saprophytic and pathogenic fungus, and its secondary metabolic pathways are one of the most highly characterized owing to its aflatoxin (AF) metabolite affecting global economic crops and human health. Different natural environments can cause significant variations in AF synthesis. Succinylation was recently identified as one of the most critical regulatory post-translational modifications affecting metabolic pathways. It is primarily reported in human cells and bacteria with few studies on fungi. Proteomic quantification of lysine succinylation (Ksuc) exploring its potential involvement in secondary metabolism regulation (including AF production) has not been performed under natural conditions in A. flavus. In this study, a quantification method was performed based on tandem mass tag labeling and antibody-based affinity enrichment of succinylated peptides via high accuracy nano-liquid chromatography with tandem mass spectrometry to explore the succinylation mechanism affecting the pathogenicity of naturally isolated A. flavus strains with varying toxin production. Altogether, 1240 Ksuc sites in 768 proteins were identified with 1103 sites in 685 proteins quantified. Comparing succinylated protein levels between high and low AF-producing A. flavus strains, bioinformatics analysis indicated that most succinylated proteins located in the AF biosynthetic pathway were downregulated, which directly affected AF synthesis. Versicolorin B synthase is a key catalytic enzyme for heterochrome B synthesis during AF synthesis. Site-directed mutagenesis and biochemical studies revealed that versicolorin B synthase succinylation is an important regulatory mechanism affecting sclerotia development and AF biosynthesis in A. flavus. In summary, our quantitative study of the lysine succinylome in high/low AF-producing strains revealed the role of Ksuc in regulating AF biosynthesis. 
We revealed novel insights into the metabolism of AF biosynthesis using naturally isolated A. flavus strains and identified a rich source of metabolism-related enzymes regulated by succinylation.
Affiliation(s)
- Yu Wang
- State Key Laboratory of Magnetic Resonance and Atomic Molecular Physics, Key Laboratory of Magnetic Resonance in Biological Systems, National Center for Magnetic Resonance in Wuhan, Wuhan Institute of Physics and Mathematics, Innovation Academy for Precision Measurement Science and Technology, Chinese Academy of Sciences - Wuhan National Laboratory for Optoelectronics, Hubei Optics Valley Laboratory, Wuhan, China; University of Chinese Academy of Sciences, Beijing, China
| | - Mingkun Yang
- University of Chinese Academy of Sciences, Beijing, China; State Key Laboratory of Freshwater Ecology and Biotechnology, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan, China
| | - Feng Ge
- University of Chinese Academy of Sciences, Beijing, China; State Key Laboratory of Freshwater Ecology and Biotechnology, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan, China
| | - Bin Jiang
- State Key Laboratory of Magnetic Resonance and Atomic Molecular Physics, Key Laboratory of Magnetic Resonance in Biological Systems, National Center for Magnetic Resonance in Wuhan, Wuhan Institute of Physics and Mathematics, Innovation Academy for Precision Measurement Science and Technology, Chinese Academy of Sciences - Wuhan National Laboratory for Optoelectronics, Hubei Optics Valley Laboratory, Wuhan, China; University of Chinese Academy of Sciences, Beijing, China.
| | - Rui Hu
- State Key Laboratory of Magnetic Resonance and Atomic Molecular Physics, Key Laboratory of Magnetic Resonance in Biological Systems, National Center for Magnetic Resonance in Wuhan, Wuhan Institute of Physics and Mathematics, Innovation Academy for Precision Measurement Science and Technology, Chinese Academy of Sciences - Wuhan National Laboratory for Optoelectronics, Hubei Optics Valley Laboratory, Wuhan, China; University of Chinese Academy of Sciences, Beijing, China.
| | - Xin Zhou
- State Key Laboratory of Magnetic Resonance and Atomic Molecular Physics, Key Laboratory of Magnetic Resonance in Biological Systems, National Center for Magnetic Resonance in Wuhan, Wuhan Institute of Physics and Mathematics, Innovation Academy for Precision Measurement Science and Technology, Chinese Academy of Sciences - Wuhan National Laboratory for Optoelectronics, Hubei Optics Valley Laboratory, Wuhan, China; University of Chinese Academy of Sciences, Beijing, China
| | - Yunhuang Yang
- State Key Laboratory of Magnetic Resonance and Atomic Molecular Physics, Key Laboratory of Magnetic Resonance in Biological Systems, National Center for Magnetic Resonance in Wuhan, Wuhan Institute of Physics and Mathematics, Innovation Academy for Precision Measurement Science and Technology, Chinese Academy of Sciences - Wuhan National Laboratory for Optoelectronics, Hubei Optics Valley Laboratory, Wuhan, China; University of Chinese Academy of Sciences, Beijing, China
| | - Maili Liu
- State Key Laboratory of Magnetic Resonance and Atomic Molecular Physics, Key Laboratory of Magnetic Resonance in Biological Systems, National Center for Magnetic Resonance in Wuhan, Wuhan Institute of Physics and Mathematics, Innovation Academy for Precision Measurement Science and Technology, Chinese Academy of Sciences - Wuhan National Laboratory for Optoelectronics, Hubei Optics Valley Laboratory, Wuhan, China; University of Chinese Academy of Sciences, Beijing, China
|
40
|
Zhao J, Yang Y, Xu H, Zheng J, Shen C, Chen T, Wang T, Wang B, Yi J, Zhao D, Wu E, Qin Q, Xia L, Qiao L. Data-independent acquisition boosts quantitative metaproteomics for deep characterization of gut microbiota. NPJ Biofilms Microbiomes 2023; 9:4. [PMID: 36693863 PMCID: PMC9873935 DOI: 10.1038/s41522-023-00373-9] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 07/06/2022] [Accepted: 01/11/2023] [Indexed: 01/26/2023] Open
Abstract
Metaproteomics can provide valuable insights into the functions of human gut microbiota (GM), but is challenging due to the extreme complexity and heterogeneity of GM. Data-independent acquisition (DIA) mass spectrometry (MS) has been an emerging quantitative technique in conventional proteomics, but is still at the early stage of development in the field of metaproteomics. Herein, we applied library-free DIA (directDIA)-based metaproteomics and compared the directDIA with other MS-based quantification techniques for metaproteomics on simulated microbial communities and feces samples spiked with bacteria with known ratios, demonstrating the superior performance of directDIA by a comprehensive consideration of proteome coverage in identification as well as accuracy and precision in quantification. We characterized human GM in two cohorts of clinical fecal samples of pancreatic cancer (PC) and mild cognitive impairment (MCI). About 70,000 microbial proteins were quantified in each cohort and annotated to profile the taxonomic and functional characteristics of GM in different diseases. Our work demonstrated the utility of directDIA in quantitative metaproteomics for investigating intestinal microbiota and its related disease pathogenesis.
Affiliation(s)
- Jinzhi Zhao
- Department of Chemistry, Shanghai Stomatological Hospital, Fudan University, 200000, Shanghai, China
| | - Yi Yang
- Department of Chemistry, Shanghai Stomatological Hospital, Fudan University, 200000, Shanghai, China.,ZJU-Hangzhou Global Scientific and Technological Innovation Center, Zhejiang University, 311200, Hangzhou, China
| | - Hua Xu
- Department of Core Facility of Basic Medical Sciences, and Department of Psychiatry of Shanghai Mental Health Center, Shanghai Jiao Tong University School of Medicine, 200000, Shanghai, China
| | - Jianxujie Zheng
- Department of Chemistry, Shanghai Stomatological Hospital, Fudan University, 200000, Shanghai, China
| | - Chengpin Shen
- Shanghai Omicsolution Co., Ltd, 201100, Shanghai, China
| | - Tian Chen
- Changhai Hospital, The Naval Military Medical University, 200433, Shanghai, China
| | - Tao Wang
- Department of Core Facility of Basic Medical Sciences, and Department of Psychiatry of Shanghai Mental Health Center, Shanghai Jiao Tong University School of Medicine, 200000, Shanghai, China
| | - Bing Wang
- College of Food Science and Technology, Shanghai Ocean University, 201306, Shanghai, China
| | - Jia Yi
- Department of Chemistry, Shanghai Stomatological Hospital, Fudan University, 200000, Shanghai, China
| | - Dan Zhao
- Department of Chemistry, Shanghai Stomatological Hospital, Fudan University, 200000, Shanghai, China
| | - Enhui Wu
- Department of Chemistry, Shanghai Stomatological Hospital, Fudan University, 200000, Shanghai, China
| | - Qin Qin
- Changhai Hospital, The Naval Military Medical University, 200433, Shanghai, China.
| | - Li Xia
- Department of Core Facility of Basic Medical Sciences, and Department of Psychiatry of Shanghai Mental Health Center, Shanghai Jiao Tong University School of Medicine, 200000, Shanghai, China.
| | - Liang Qiao
- Department of Chemistry, Shanghai Stomatological Hospital, Fudan University, 200000, Shanghai, China.
|
41
|
Cox J. Prediction of peptide mass spectral libraries with machine learning. Nat Biotechnol 2023; 41:33-43. [PMID: 36008611 DOI: 10.1038/s41587-022-01424-w] [Citation(s) in RCA: 23] [Impact Index Per Article: 23.0] [Received: 03/01/2022] [Accepted: 07/11/2022] [Indexed: 01/21/2023]
Abstract
The recent development of machine learning methods to identify peptides in complex mass spectrometric data constitutes a major breakthrough in proteomics. Longstanding methods for peptide identification, such as search engines and experimental spectral libraries, are being superseded by deep learning models that allow the fragmentation spectra of peptides to be predicted from their amino acid sequence. These new approaches, including recurrent neural networks and convolutional neural networks, use predicted in silico spectral libraries rather than experimental libraries to achieve higher sensitivity and/or specificity in the analysis of proteomics data. Machine learning is galvanizing applications that involve large search spaces, such as immunopeptidomics and proteogenomics. Current challenges in the field include the prediction of spectra for peptides with post-translational modifications and for cross-linked pairs of peptides. Permeation of machine-learning-based spectral prediction into search engines and spectrum-centric data-independent acquisition workflows for diverse peptide classes and measurement conditions will continue to push sensitivity and dynamic range in proteomics applications in the coming years.
Affiliation(s)
- Jürgen Cox
- Computational Systems Biochemistry Research Group, Max-Planck Institute of Biochemistry, Martinsried, Germany.
- Department of Biological and Medical Psychology, University of Bergen, Bergen, Norway.
|
42
|
de Jonge NF, Mildau K, Meijer D, Louwen JJR, Bueschl C, Huber F, van der Hooft JJJ. Good practices and recommendations for using and benchmarking computational metabolomics metabolite annotation tools. Metabolomics 2022; 18:103. [PMID: 36469190 PMCID: PMC9722809 DOI: 10.1007/s11306-022-01963-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/16/2022] [Accepted: 11/18/2022] [Indexed: 12/12/2022]
Abstract
BACKGROUND Untargeted metabolomics approaches based on mass spectrometry obtain comprehensive profiles of complex biological samples. However, on average only 10% of the molecules can be annotated. This low annotation rate hampers biochemical interpretation and effective comparison of metabolomics studies. Furthermore, de novo structural characterization of mass spectral data remains a complicated and time-intensive process. Recently, the field of computational metabolomics has gained traction and novel methods have started to enable large-scale and reliable metabolite annotation. Molecular networking and machine learning-based in-silico annotation tools have been shown to greatly assist metabolite characterization in diverse fields such as clinical metabolomics and natural product discovery. AIM OF REVIEW We highlight recent advances in computational metabolite annotation workflows with a special focus on their evaluation and comparison with other tools. Whilst the progress is substantial and promising, we also argue that inconsistencies in benchmarking different tools hamper users from selecting the most appropriate and promising method for their research. We summarize benchmarking strategies of the different tools and outline several recommendations for benchmarking and comparing novel tools. KEY SCIENTIFIC CONCEPTS OF REVIEW This review focuses on recent advances in mass spectral library-based and machine learning-supported metabolite annotation workflows. We discuss large-scale library matching and analogue search, the current bloom of mass spectral similarity scores, and how molecular networking has changed the field. In addition, the potentials and challenges of machine learning-supported metabolite annotation workflows are highlighted. 
Overall, recent developments in computational metabolomics have started to fundamentally change metabolomics workflows, and we expect that as a community we will be able to overcome current method performance ambiguities and annotation bottlenecks.
Affiliation(s)
- Niek F. de Jonge
- Bioinformatics Group, Wageningen University, Wageningen, the Netherlands
| | - Kevin Mildau
- Department of Analytical Chemistry, Biochemical Network Analysis Lab, University of Vienna, Vienna, Austria
| | - David Meijer
- Bioinformatics Group, Wageningen University, Wageningen, the Netherlands
| | - Joris J. R. Louwen
- Bioinformatics Group, Wageningen University, Wageningen, the Netherlands
| | - Christoph Bueschl
- Department of Analytical Chemistry, Biochemical Network Analysis Lab, University of Vienna, Vienna, Austria
| | - Florian Huber
- Centre for Digitalization and Digitality (ZDD), University of Applied Sciences Düsseldorf, Düsseldorf, Germany
| | - Justin J. J. van der Hooft
- Bioinformatics Group, Wageningen University, Wageningen, the Netherlands
- Department of Biochemistry, University of Johannesburg, Johannesburg, South Africa
|
43
|
Omics Data and Data Representations for Deep Learning-Based Predictive Modeling. Int J Mol Sci 2022; 23:ijms232012272. [PMID: 36293133 PMCID: PMC9603455 DOI: 10.3390/ijms232012272] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/23/2022] [Revised: 10/03/2022] [Accepted: 10/12/2022] [Indexed: 11/25/2022] Open
Abstract
Medical discoveries mainly depend on the capability to process and analyze biological datasets, which inundate the scientific community and are still expanding as the cost of next-generation sequencing technologies decreases. Deep learning (DL) is a viable method to exploit this massive data stream, since it has advanced quickly through successive innovations. However, an obstacle to scientific progress emerges: the difficulty of applying DL to biology. Both fields are evolving at a breakneck pace, making it hard for an individual to stay at the forefront of both. This paper aims to bridge the gap and help computer scientists bring their valuable expertise into the life sciences. This work provides an overview of the most common types of biological data and data representations that are used to train DL models, with additional information on the models themselves and the various tasks that are being tackled. This is the essential information a DL expert with no background in biology needs in order to participate in DL-based research projects in biomedicine, biotechnology, and drug discovery. This study could also be useful to researchers in biology who wish to understand and utilize the power of DL to gain better insights into, and extract important information from, omics data.
|
44
|
Wang X, Hebert DD, Runsewe DO, Pohlman GE, Hoffmann WD, Irvin JA. Electroactive Polymer-Based Spray Ionization for Direct Mass Spectrometric Analysis. J Am Soc Mass Spectrom 2022; 33:1840-1849. [PMID: 36149251 DOI: 10.1021/jasms.2c00148] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/16/2023]
Abstract
Electrochemically deposited electroactive polymer (EAP) films were investigated for their potential to enhance the performance of ambient ionization mass spectrometry (MS). Several EAPs of varying hydrophobicity were evaluated, including the superhydrophobic polymer poly[3,4-(2-dodecylethylenedioxy)thiophene] (PEDOT-C12). The EAPs were electropolymerized onto indium tin oxide-coated glass, placed in front of the inlet of a mass spectrometer, and charged to 3.5-4.5 kV. Analyte solutions were then applied to the surface, initiating ionization events. Analytes including peptides and small molecule pharmaceuticals were studied in 0.1% formic acid in methanol/water ("spray solvent") as well as in synthetic biological fluid matrices, using both EAP spray ionization (EAPSI) and paper spray ionization (PSI). Each EAPSI analysis required as little as 0.1 μL of solution, and the resulting sprays were stable and reproducible. The sensitivity, limit of detection (LOD), and limit of quantification (LOQ) were evaluated using bradykinin, cannabinol, and cannabidiol, which were prepared in pure solvents, artificial urine, and artificial saliva. The limits of detection and quantitation for EAPSI were improved relative to PSI by 1-2 orders of magnitude for analytes prepared in methanol/water and on the same order of magnitude as PSI for analytes prepared in artificial saliva and urine. This EAP-based spray ionization technique offers possibilities for rapid MS analysis with small sample sizes, high accuracy, and miniaturization of MS instruments.
Affiliation(s)
- Xu Wang
  - Materials Science, Engineering and Commercialization Program, Texas State University, San Marcos, Texas 78666, United States
- David D Hebert
  - Department of Chemistry and Biochemistry, Texas State University, San Marcos, Texas 78666, United States
- Damilola O Runsewe
  - Materials Science, Engineering and Commercialization Program, Texas State University, San Marcos, Texas 78666, United States
- Gabriel E Pohlman
  - Department of Chemistry and Biochemistry, Texas State University, San Marcos, Texas 78666, United States
- William D Hoffmann
  - Materials Science, Engineering and Commercialization Program, Texas State University, San Marcos, Texas 78666, United States
  - Department of Chemistry and Biochemistry, Texas State University, San Marcos, Texas 78666, United States
- Jennifer A Irvin
  - Materials Science, Engineering and Commercialization Program, Texas State University, San Marcos, Texas 78666, United States
  - Department of Chemistry and Biochemistry, Texas State University, San Marcos, Texas 78666, United States
|
45
|
Wu J, Cao L, Wang J, Wang Y, Hao H, Huang L. Characterization of serum protein expression profiles in the early sarcopenia older adults with low grip strength: a cross-sectional study. BMC Musculoskelet Disord 2022; 23:894. [PMID: 36192674 PMCID: PMC9528053 DOI: 10.1186/s12891-022-05844-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/16/2022] [Accepted: 09/20/2022] [Indexed: 11/25/2022] Open
Abstract
Background: Sarcopenia refers to the progressive loss of skeletal muscle mass and muscle function, which seriously threatens the quality of life of older adults. Therefore, early diagnosis is urgently needed. This study aimed to explore changes in serum protein profiles in sarcopenia patients through a cross-sectional study and to provide a reference for clinical diagnosis. Methods: This cross-sectional study was carried out in the Tianjin institute of physical education teaching experiment training center from December 2019 to December 2020. Ten older adults were recruited, including 5 with sarcopenia and 5 healthy older adults. After a detailed diagnostic evaluation, blood samples were collected to prepare serum for proteomic analysis using the HPLC System Easy nLC method. The differentially expressed proteins (DEPs) were screened using the limma package in R (version 4.1.0). Results: A total of 114 DEPs were identified between the patients and healthy older adults, including 48 up-regulated and 66 down-regulated proteins. Functional enrichment analysis showed that the 114 DEPs were significantly enriched in 153 GO terms, mainly involving low-density lipoprotein particle remodeling and negative regulation of immune response, among others. The PPI network further suggested that cholesteryl ester transfer protein and apolipoprotein A2 could serve as biomarkers to facilitate diagnosis of sarcopenia. Conclusions: This study provided a serum proteomic profile of sarcopenia patients and identified two proteins with diagnostic value, which might help to improve the diagnostic accuracy of sarcopenia. Supplementary Information: The online version contains supplementary material available at 10.1186/s12891-022-05844-2.
Affiliation(s)
- Jingqiong Wu
  - TianJin University of Sport, No.16 Donghai Road, West Tuanbo New Town, Jinghai District, Tianjin, 301617, PR China
  - Guangxi Medical University, Nanning, 530021, Guangxi, PR China
- Longjun Cao
  - TianJin University of Sport, No.16 Donghai Road, West Tuanbo New Town, Jinghai District, Tianjin, 301617, PR China
- Jiazhi Wang
  - TianJin University of Sport, No.16 Donghai Road, West Tuanbo New Town, Jinghai District, Tianjin, 301617, PR China
- Yizhao Wang
  - Tianjin Huanhu Hospital, Tianjin, 300350, PR China
- Huimin Hao
  - TianJin University of Sport, No.16 Donghai Road, West Tuanbo New Town, Jinghai District, Tianjin, 301617, PR China
- Liping Huang
  - TianJin University of Sport, No.16 Donghai Road, West Tuanbo New Town, Jinghai District, Tianjin, 301617, PR China
|
46
|
Frankenfield AM, Ni J, Ahmed M, Hao L. Protein Contaminants Matter: Building Universal Protein Contaminant Libraries for DDA and DIA Proteomics. J Proteome Res 2022; 21:2104-2113. [PMID: 35793413 PMCID: PMC10040255 DOI: 10.1021/acs.jproteome.2c00145] [Citation(s) in RCA: 37] [Impact Index Per Article: 18.5] [Indexed: 11/28/2022]
Abstract
Mass spectrometry-based proteomics is constantly challenged by the presence of contaminant background signals. In particular, protein contaminants from reagents and sample handling are almost impossible to avoid. For data-dependent acquisition (DDA) proteomics, an exclusion list can be used to reduce the influence of protein contaminants. However, protein contamination has not been evaluated and is rarely addressed in data-independent acquisition (DIA). How protein contaminants influence proteomic data is also unclear. In this study, we established new protein contaminant FASTA and spectral libraries that are applicable to all proteomic workflows and evaluated the impact of protein contaminants on both DDA and DIA proteomics. We demonstrated that including our contaminant libraries can reduce false discoveries and increase protein identifications, without influencing the quantification accuracy in various proteomic software platforms. With the pressing need to standardize proteomic workflow in the research community, we highly recommend including our contaminant FASTA and spectral libraries in all bottom-up proteomic data analysis. Our contaminant libraries and a step-by-step tutorial to incorporate these libraries in various DDA and DIA data analysis platforms can be valuable resources for proteomic researchers, freely accessible at https://github.com/HaoGroup-ProtContLib.
Affiliation(s)
- Ashley M Frankenfield
  - Department of Chemistry, The George Washington University, Science and Engineering Hall 4000, 800, 22nd St., Northwest, Washington, DC 20052, United States
- Jiawei Ni
  - Department of Chemistry, The George Washington University, Science and Engineering Hall 4000, 800, 22nd St., Northwest, Washington, DC 20052, United States
- Mustafa Ahmed
  - Department of Chemistry, The George Washington University, Science and Engineering Hall 4000, 800, 22nd St., Northwest, Washington, DC 20052, United States
- Ling Hao
  - Department of Chemistry, The George Washington University, Science and Engineering Hall 4000, 800, 22nd St., Northwest, Washington, DC 20052, United States
|
47
|
Bacala R, Hatcher DW, Perreault H, Fu BX. Challenges and opportunities for proteomics and the improvement of bread wheat quality. Journal of Plant Physiology 2022; 275:153743. [PMID: 35749977 DOI: 10.1016/j.jplph.2022.153743] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/28/2022] [Revised: 05/13/2022] [Accepted: 05/30/2022] [Indexed: 06/15/2023]
Abstract
Wheat remains a critical global food source, pressured by climate change and the need to maximize yield, improve processing and nutritional quality and ensure safety. An enormous amount of research has been conducted to understand gluten protein composition and structure in relation to end-use quality, yet progress has become stagnant. This is mainly due to the need and inability to biochemically characterize the intact functional glutenin polymer in order to correlate to quality, necessitating reduction to monomeric subunits and a loss of contextual information. While some individual gluten proteins might have a positive or negative influence on gluten quality, it is the sum total of these proteins, their relative and absolute expression, their sub-cellular trafficking, the amount and size of glutenin polymers, and ratios between gluten protein classes that define viscoelasticity of gluten. The sub-cellular trafficking of gluten proteins during seed maturation is still not completely clear and there is evidence of dual pathways and therefore different destinations for proteins, either constitutively or temporally. The trafficking of proteins is also unclear in endosperm cells as they undergo programmed cell death; Golgi disappear around 12 DPA but protein filling continues at least to 25 DPA. Modulation of the timing of cellular events will invariably affect protein deposition and therefore gluten strength and function. Existing and emerging proteomics technologies such as proteoform profiling and top-down proteomics offer new tools to study gluten protein composition as a whole system and identify compositional patterns that can modify gluten structure with improved functionality.
Affiliation(s)
- Ray Bacala
  - Canadian Grain Commission, Grain Research Laboratory, 1404-303 Main Street, Winnipeg, Manitoba, R3C 3G8, Canada
  - University of Manitoba, Department of Chemistry, 144 Dysart Road, Winnipeg, Manitoba, R3T 2N2, Canada
- Dave W Hatcher
  - Canadian Grain Commission, Grain Research Laboratory, 1404-303 Main Street, Winnipeg, Manitoba, R3C 3G8, Canada
- Héléne Perreault
  - University of Manitoba, Department of Chemistry, 144 Dysart Road, Winnipeg, Manitoba, R3T 2N2, Canada
- Bin Xiao Fu
  - Canadian Grain Commission, Grain Research Laboratory, 1404-303 Main Street, Winnipeg, Manitoba, R3C 3G8, Canada
  - Department of Food and Human Nutritional Sciences, 209 - 35 Chancellor's Circle, University of Manitoba, Winnipeg, Manitoba, R3T 2N2, Canada
|
48
|
Srinivasan A, Sing JC, Gingras AC, Röst HL. Improving Phosphoproteomics Profiling Using Data-Independent Mass Spectrometry. J Proteome Res 2022; 21:1789-1799. [PMID: 35877786 DOI: 10.1021/acs.jproteome.2c00172] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Indexed: 11/28/2022]
Abstract
Mass spectrometry-based profiling of the phosphoproteome is a powerful method of identifying phosphorylation events at a systems level. Most phosphoproteomics studies have used data-dependent acquisition (DDA) mass spectrometry as their method of choice. In this Perspective, we review some recent studies benchmarking DDA and DIA methods for phosphoproteomics and discuss data analysis options for DIA phosphoproteomics. In order to evaluate the impact of data-dependent and data-independent acquisition (DIA) on identification and quantification, we analyze a previously published phosphopeptide-enriched data set consisting of 10 replicates acquired by DDA and DIA each. We find that though more unique identifications are made in DDA data, phosphopeptides are identified more consistently across replicates in DIA. We further discuss the challenges of identifying chromatographically coeluting phosphopeptide isomers and investigate the impact on reproducibility of identifying high-confidence site-localized phosphopeptides in replicates.
Affiliation(s)
- Aparna Srinivasan
  - Department of Molecular Genetics, University of Toronto, Toronto, Ontario M5S 1A8, Canada
  - Lunenfeld Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, Ontario M5G 1X5, Canada
  - Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario M5S 3E1, Canada
- Justin C Sing
  - Department of Molecular Genetics, University of Toronto, Toronto, Ontario M5S 1A8, Canada
  - Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario M5S 3E1, Canada
- Anne-Claude Gingras
  - Department of Molecular Genetics, University of Toronto, Toronto, Ontario M5S 1A8, Canada
  - Lunenfeld Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, Ontario M5G 1X5, Canada
- Hannes L Röst
  - Department of Molecular Genetics, University of Toronto, Toronto, Ontario M5S 1A8, Canada
  - Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario M5S 3E1, Canada
|
49
|
Perez-Riverol Y. Proteomic repository data submission, dissemination, and reuse: key messages. Expert Rev Proteomics 2022; 19:297-310. [PMID: 36529941 PMCID: PMC7614296 DOI: 10.1080/14789450.2022.2160324] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/23/2022]
Abstract
INTRODUCTION The creation of ProteomeXchange data workflows in 2012 transformed the field of proteomics by standardizing data submission and dissemination and enabling the widespread reanalysis of public MS proteomics data worldwide. ProteomeXchange has triggered a growing trend toward public dissemination of proteomics data, facilitating the assessment, reuse, comparative analysis, and extraction of new findings from public datasets. As of 2022, the consortium comprises PRIDE, PeptideAtlas, MassIVE, jPOST, iProX, and Panorama Public. AREAS COVERED Here, we review and discuss the current ecosystem of resources, guidelines, and file formats for proteomics data dissemination and reanalysis. Special attention is drawn to exciting new quantitative and post-translational modification-oriented resources. The challenges and future directions for data deposition, including the lack of metadata and of cloud-based and high-performance software solutions for fast and reproducible reanalysis of the available data, are discussed. EXPERT OPINION The success of ProteomeXchange and the amount of proteomics data available in the public domain have triggered the creation and/or growth of other protein knowledgebase resources. Data reuse is a leading, active, and evolving field, supporting the creation of new formats, tools, and workflows to rediscover and reshape public proteomics data.
Affiliation(s)
- Yasset Perez-Riverol
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
|
50
|
Pietilä S, Suomi T, Elo LL. Introducing untargeted data-independent acquisition for metaproteomics of complex microbial samples. ISME Communications 2022; 2:51. [PMID: 37938742 PMCID: PMC9723653 DOI: 10.1038/s43705-022-00137-0] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 02/04/2022] [Revised: 05/27/2022] [Accepted: 06/14/2022] [Indexed: 05/17/2023]
Abstract
Mass spectrometry-based metaproteomics is a relatively new field of research that enables the characterization of the functionality of microbiota. Recently, we demonstrated the applicability of data-independent acquisition (DIA) mass spectrometry to the analysis of complex metaproteomic samples. This allowed us to circumvent many of the drawbacks of the previously used data-dependent acquisition (DDA) mass spectrometry, mainly the limited reproducibility when analyzing samples with complex microbial composition. However, the DDA-assisted DIA approach still required additional DDA data on the samples to assist the analysis. Here, we introduce, for the first time, an untargeted DIA metaproteomics tool that does not require any DDA data, but instead generates a pseudospectral library directly from the DIA data. This reduces the amount of required mass spectrometry data to a single DIA run per sample. The new DIA-only metaproteomics approach is implemented as a new open-source software package named glaDIAtor, including a modern web-based graphical user interface to facilitate wide use of the tool by the community.
Affiliation(s)
- Sami Pietilä
  - Turku Bioscience Centre, University of Turku and Åbo Akademi University, FI-20520, Turku, Finland
- Tomi Suomi
  - Turku Bioscience Centre, University of Turku and Åbo Akademi University, FI-20520, Turku, Finland
- Laura L Elo
  - Turku Bioscience Centre, University of Turku and Åbo Akademi University, FI-20520, Turku, Finland
  - Institute of Biomedicine, University of Turku, FI-20520, Turku, Finland
|