1
He G, He Q, Cheng J, Yu R, Shuai J, Cao Y. ProPept-MT: A Multi-Task Learning Model for Peptide Feature Prediction. Int J Mol Sci 2024; 25:7237. [PMID: 39000344 PMCID: PMC11241495 DOI: 10.3390/ijms25137237] [Received: 05/28/2024] [Revised: 06/26/2024] [Accepted: 06/28/2024] [Indexed: 07/16/2024] Open
Abstract
In the realm of quantitative proteomics, data-independent acquisition (DIA) has emerged as a promising approach, offering enhanced reproducibility and quantitative accuracy compared to traditional data-dependent acquisition (DDA) methods. However, the analysis of DIA data is currently hindered by its reliance on project-specific spectral libraries derived from DDA analyses, which not only limits proteome coverage but also proves to be a time-intensive process. To overcome these challenges, we propose ProPept-MT, a novel deep learning-based multi-task prediction model designed to accurately forecast key features such as retention time (RT), ion intensity, and ion mobility (IM). Leveraging advanced techniques such as multi-head attention and BiLSTM for feature extraction, coupled with Nash-MTL for gradient coordination, ProPept-MT demonstrates superior prediction performance. Integrating ion mobility alongside RT, mass-to-charge ratio (m/z), and ion intensity forms 4D proteomics. Then, we outline a comprehensive workflow tailored for 4D DIA proteomics research, integrating the use of 4D in silico libraries predicted by ProPept-MT. Evaluation on a benchmark dataset showcases ProPept-MT's exceptional predictive capabilities, with impressive results including a 99.9% Pearson correlation coefficient (PCC) for RT prediction, a median dot product (DP) of 96.0% for fragment ion intensity prediction, and a 99.3% PCC for IM prediction on the test set. Notably, ProPept-MT manifests efficacy in predicting both unmodified and phosphorylated peptides, underscoring its potential as a valuable tool for constructing high-quality 4D DIA in silico libraries.
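The two evaluation metrics named in this abstract can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: the function names are hypothetical, and it assumes that "dot product" means the normalized dot product (cosine similarity) between predicted and observed fragment intensity vectors, which the abstract does not spell out.

```python
import math
import statistics


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (norm_x * norm_y)


def normalized_dot_product(predicted, observed):
    """Cosine similarity of two intensity vectors (assumed DP definition)."""
    num = sum(p * o for p, o in zip(predicted, observed))
    den = math.sqrt(sum(p * p for p in predicted)) * math.sqrt(
        sum(o * o for o in observed)
    )
    return num / den if den else 0.0


def median_dot_product(spectrum_pairs):
    """Median normalized dot product over (predicted, observed) spectrum pairs."""
    return statistics.median(
        normalized_dot_product(p, o) for p, o in spectrum_pairs
    )
```

Under these assumptions, `pearson` would be applied to predicted versus measured RT or IM values across peptides, and `median_dot_product` to the per-peptide fragment intensity vectors.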
Affiliation(s)
- Guoqiang He
  - Postgraduate Training Base Alliance, Wenzhou Medical University, Wenzhou 325000, China
  - Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou 325000, China
- Qingzu He
  - Department of Physics, and Fujian Provincial Key Laboratory for Soft Functional Materials Research, Xiamen University, Xiamen 361005, China
- Jinyan Cheng
  - Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou 325000, China
- Rongwen Yu
  - Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou 325000, China
- Jianwei Shuai
  - Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou 325000, China
- Yi Cao
  - Postgraduate Training Base Alliance, Wenzhou Medical University, Wenzhou 325000, China
  - Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou 325000, China
2
Zhu F, Niu Q, Li X, Zhao Q, Su H, Shuai J. FM-FCN: A Neural Network with Filtering Modules for Accurate Vital Signs Extraction. Research (Washington, D.C.) 2024; 7:0361. [PMID: 38737196 PMCID: PMC11082448 DOI: 10.34133/research.0361] [Received: 01/05/2024] [Accepted: 04/01/2024] [Indexed: 05/14/2024]
Abstract
Neural networks excel at capturing local spatial patterns through convolutional modules, but they may struggle to identify and effectively utilize the morphological and amplitude periodic nature of physiological signals. In this work, we propose a novel network named filtering module fully convolutional network (FM-FCN), which fuses traditional filtering techniques with neural networks to amplify physiological signals and suppress noise. First, instead of using a fully connected layer, we use an FCN to preserve the time-dimensional correlation information of physiological signals, enabling multiple cycles of signals in the network and providing a basis for signal processing. Second, we introduce the FM as a network module that adapts to eliminate unwanted interference, leveraging the structure of the filter. This approach builds a bridge between deep learning and signal processing methodologies. Finally, we evaluate the performance of FM-FCN using remote photoplethysmography. Experimental results demonstrate that FM-FCN outperforms the second-ranked method in terms of both blood volume pulse (BVP) signal and heart rate (HR) accuracy. It substantially improves the quality of BVP waveform reconstruction, with a decrease of 20.23% in mean absolute error (MAE) and an increase of 79.95% in signal-to-noise ratio (SNR). Regarding HR estimation accuracy, FM-FCN achieves decreases of 35.85% in MAE, 29.65% in error standard deviation, and 32.88% in the width of the 95% limits of agreement, meeting clinical standards for HR accuracy requirements. The results highlight its potential in improving the accuracy and reliability of vital sign measurement through high-quality BVP signal extraction. The code and datasets are available online at https://github.com/zhaoqi106/FM-FCN.
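The HR accuracy figures quoted in this abstract (MAE, error standard deviation, width of the 95% limits of agreement) can be computed as in the sketch below. It is an illustration only: the function name is hypothetical, and it assumes the usual Bland-Altman convention of bias ± 1.96 × sample standard deviation, which the abstract does not state.

```python
import statistics


def hr_error_summary(estimated, reference):
    """Summarize agreement between estimated and reference heart rates (bpm)."""
    if len(estimated) != len(reference) or len(estimated) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    errors = [e - r for e, r in zip(estimated, reference)]
    mae = sum(abs(d) for d in errors) / len(errors)
    bias = statistics.mean(errors)      # mean signed error
    sd = statistics.stdev(errors)       # sample standard deviation of the errors
    lower, upper = bias - 1.96 * sd, bias + 1.96 * sd
    return {
        "mae": mae,
        "bias": bias,
        "sd": sd,
        "loa": (lower, upper),
        "loa_width": upper - lower,     # equals 2 * 1.96 * sd
    }
```

A relative improvement such as those reported would then be `(baseline - new) / baseline * 100` for each of these quantities, computed between two methods on the same recordings.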
Affiliation(s)
- Fangfang Zhu
  - Department of Physics, and Fujian Provincial Key Laboratory for Soft Functional Materials Research, Xiamen University, Xiamen 361005, China
  - National Institute for Data Science in Health and Medicine, and State Key Laboratory of Cellular Stress Biology, Innovation Center for Cell Signaling Network, Xiamen University, Xiamen 361005, China
- Qichao Niu
  - Vitalsilicon Technology Co. Ltd., Jiaxing, Zhejiang 314006, China
- Xiang Li
  - Department of Physics, and Fujian Provincial Key Laboratory for Soft Functional Materials Research, Xiamen University, Xiamen 361005, China
- Qi Zhao
  - School of Computer Science and Software Engineering, University of Science and Technology Liaoning, Anshan 114051, China
- Honghong Su
  - Yangtze Delta Region Institute of Tsinghua University, Zhejiang, Jiaxing 314006, China
- Jianwei Shuai
  - Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou 325001, China
  - Oujiang Laboratory (Zhejiang Lab for Regenerative Medicine, Vision and Brain Health), Wenzhou 325001, China
3
Basharat AR, Xiong X, Xu T, Zang Y, Sun L, Liu X. TopDIA: A Software Tool for Top-Down Data-Independent Acquisition Proteomics. bioRxiv: The Preprint Server for Biology 2024:2024.04.05.588302. [PMID: 38645171 PMCID: PMC11030422 DOI: 10.1101/2024.04.05.588302] [Indexed: 04/23/2024]
Abstract
Top-down mass spectrometry is widely used for proteoform identification, characterization, and quantification owing to its ability to analyze intact proteoforms. In the last decade, top-down proteomics has been dominated by top-down data-dependent acquisition mass spectrometry (TD-DDA-MS), and top-down data-independent acquisition mass spectrometry (TD-DIA-MS) has not been well studied. While TD-DIA-MS produces complex multiplexed tandem mass spectrometry (MS/MS) spectra, which are challenging to confidently identify, it selects more precursor ions for MS/MS analysis and has the potential to increase proteoform identifications compared with TD-DDA-MS. Here we present TopDIA, the first software tool for proteoform identification by TD-DIA-MS. It generates demultiplexed pseudo MS/MS spectra from TD-DIA-MS data and then searches the pseudo MS/MS spectra against a protein sequence database for proteoform identification. We compared the performance of TD-DDA-MS and TD-DIA-MS using Escherichia coli K-12 MG1655 cells and demonstrated that TD-DIA-MS with TopDIA increased proteoform and protein identifications compared with TD-DDA-MS.
Affiliation(s)
- Abdul Rehman Basharat
  - Department of BioHealth Informatics, Luddy School of Informatics, Computing, and Engineering, Indiana University-Purdue University Indianapolis, Indianapolis, IN, 46202, USA
- Xingzhao Xiong
  - Deming Department of Medicine, Tulane University School of Medicine, New Orleans, LA, 70112, USA
- Tian Xu
  - Department of Chemistry, Michigan State University, East Lansing, MI, 48824, USA
- Yong Zang
  - Department of Biostatistics and Health Data Sciences, Indiana University School of Medicine, Indianapolis, IN, 46202, USA
- Liangliang Sun
  - Department of Chemistry, Michigan State University, East Lansing, MI, 48824, USA
- Xiaowen Liu
  - Deming Department of Medicine, Tulane University School of Medicine, New Orleans, LA, 70112, USA
4
He Q, Guo H, Li Y, He G, Li X, Shuai J. SeFilter-DIA: Squeeze-and-Excitation Network for Filtering High-Confidence Peptides of Data-Independent Acquisition Proteomics. Interdiscip Sci 2024:10.1007/s12539-024-00611-4. [PMID: 38472692 DOI: 10.1007/s12539-024-00611-4] [Received: 07/17/2023] [Revised: 01/12/2024] [Accepted: 01/21/2024] [Indexed: 03/14/2024]
Abstract
Mass spectrometry is crucial in proteomics analysis, particularly using data-independent acquisition (DIA) for reliable and reproducible mass spectrometry data acquisition, enabling broad mass-to-charge ratio coverage and high throughput. DIA-NN, a prominent deep learning software in DIA proteome analysis, generates peptide results but may include low-confidence peptides. Conventionally, biologists have to manually screen the extracted ion chromatogram (XIC) peaks of peptide fragment ions to identify high-confidence peptides, a time-consuming and subjective process prone to variability. In this study, we introduce SeFilter-DIA, a deep learning algorithm that aims to automate the identification of high-confidence peptides. Leveraging squeeze-and-excitation network and residual network models, SeFilter-DIA extracts XIC features and effectively discerns between high- and low-confidence peptides. Evaluation on the benchmark datasets shows that SeFilter-DIA achieves 99.6% AUC on the test set and 97% for the other performance indicators. Furthermore, SeFilter-DIA is applicable for screening peptides with phosphorylation modifications. These results demonstrate the potential of SeFilter-DIA to replace manual screening, providing an efficient and objective approach for high-confidence peptide identification while mitigating associated limitations.
Affiliation(s)
- Qingzu He
  - Department of Physics, National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, 361005, China
  - Wenzhou Key Laboratory of Biophysics, Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou, 325001, China
- Huan Guo
  - Department of Physics, National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, 361005, China
- Yulin Li
  - Department of Physics, National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, 361005, China
- Guoqiang He
  - Wenzhou Key Laboratory of Biophysics, Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou, 325001, China
- Xiang Li
  - Department of Physics, National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, 361005, China
- Jianwei Shuai
  - Wenzhou Key Laboratory of Biophysics, Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou, 325001, China
  - Oujiang Laboratory (Zhejiang Lab for Regenerative Medicine, Vision and Brain Health), Wenzhou, 325001, China
5
Li Y, He Q, Guo H, Shuai SC, Cheng J, Liu L, Shuai J. AttnPep: A Self-Attention-Based Deep Learning Method for Peptide Identification in Shotgun Proteomics. J Proteome Res 2024; 23:834-843. [PMID: 38252705 DOI: 10.1021/acs.jproteome.3c00729] [Indexed: 01/24/2024]
Abstract
In shotgun proteomics, the proteome search engine analyzes mass spectra obtained by experiments, and then a peptide-spectrum match (PSM) is reported for each spectrum. However, most of the PSMs identified are incorrect, and therefore various postprocessing tools have been developed for reranking the peptide identifications. Yet these methods suffer from issues such as dependency on distribution, reliance on shallow models, and limited effectiveness. In this work, we propose AttnPep, a deep learning model for rescoring PSM scores that utilizes the Self-Attention module. This module helps the neural network focus on features relevant to the classification of PSMs and ignore irrelevant features. This allows AttnPep to analyze the output of different search engines and improve PSM discrimination accuracy. We considered a PSM to be correct if it achieved a q-value <0.01 and compared AttnPep with the existing mainstream tools PeptideProphet, Percolator, and proteoTorch. The results indicated that AttnPep identified, on average, 9.29% more correct PSMs than the other methods. Additionally, AttnPep was able to better distinguish between correct and incorrect PSMs and found more synthetic peptides in the complex SWATH data set.
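The "q-value <0.01" acceptance criterion can be illustrated with a small sketch. It assumes a target-decoy search with the FDR estimated as decoys divided by targets above a score threshold, which is a common convention but is not described in this abstract; the function name and input layout are hypothetical.

```python
def q_values(psms):
    """Return a q-value for each PSM, in input order.

    psms: list of (score, is_decoy) tuples, where a higher score is better.
    Assumes a target-decoy setup with FDR estimated as decoys / targets.
    """
    # Rank PSMs from best to worst score.
    order = sorted(range(len(psms)), key=lambda i: psms[i][0], reverse=True)

    # FDR estimate when the threshold is placed just after each ranked PSM.
    fdrs = []
    targets = decoys = 0
    for i in order:
        if psms[i][1]:
            decoys += 1
        else:
            targets += 1
        fdrs.append(decoys / max(targets, 1))

    # The q-value is the minimum FDR at this rank or any worse rank.
    result = [0.0] * len(psms)
    running_min = float("inf")
    for rank in range(len(order) - 1, -1, -1):
        running_min = min(running_min, fdrs[rank])
        result[order[rank]] = running_min
    return result
```

With this in hand, the PSMs counted as correct would be the target PSMs whose q-value is below 0.01; a rescoring model changes the scores and therefore how many targets pass that cut.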
Affiliation(s)
- Yulin Li
  - Department of Physics, Xiamen University, Xiamen 361005, China
- Qingzu He
  - Department of Physics, Xiamen University, Xiamen 361005, China
  - Wenzhou Key Laboratory of Biophysics, Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou, Zhejiang 325001, China
- Huan Guo
  - Department of Physics, Xiamen University, Xiamen 361005, China
- Stella C Shuai
  - Biological Science, Northwestern University, Evanston, Illinois 60208, United States
- Jinyan Cheng
  - Wenzhou Key Laboratory of Biophysics, Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou, Zhejiang 325001, China
- Liyu Liu
  - Wenzhou Key Laboratory of Biophysics, Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou, Zhejiang 325001, China
- Jianwei Shuai
  - Department of Physics, Xiamen University, Xiamen 361005, China
  - Wenzhou Key Laboratory of Biophysics, Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou, Zhejiang 325001, China
6
Lou R, Shui W. Acquisition and Analysis of DIA-Based Proteomic Data: A Comprehensive Survey in 2023. Mol Cell Proteomics 2024; 23:100712. [PMID: 38182042 PMCID: PMC10847697 DOI: 10.1016/j.mcpro.2024.100712] [Received: 10/31/2023] [Revised: 12/27/2023] [Accepted: 01/02/2024] [Indexed: 01/07/2024] Open
Abstract
Data-independent acquisition (DIA) mass spectrometry (MS) has emerged as a powerful technology for high-throughput, accurate, and reproducible quantitative proteomics. This review provides a comprehensive overview of recent advances in both the experimental and computational methods for DIA proteomics, from data acquisition schemes to analysis strategies and software tools. DIA acquisition schemes are categorized based on the design of precursor isolation windows, highlighting wide-window, overlapping-window, narrow-window, scanning quadrupole-based, and parallel accumulation-serial fragmentation-enhanced DIA methods. For DIA data analysis, major strategies are classified into spectrum reconstruction, sequence-based search, library-based search, de novo sequencing, and sequencing-independent approaches. A wide array of software tools implementing these strategies are reviewed, with details on their overall workflows and scoring approaches at different steps. The generation and optimization of spectral libraries, which are critical resources for DIA analysis, are also discussed. Publicly available benchmark datasets covering global proteomics and phosphoproteomics are summarized to facilitate performance evaluation of various software tools and analysis workflows. Continued advances and synergistic developments of versatile components in DIA workflows are expected to further enhance the power of DIA-based proteomics.
Affiliation(s)
- Ronghui Lou
  - iHuman Institute, ShanghaiTech University, Shanghai, China; School of Life Science and Technology, ShanghaiTech University, Shanghai, China
- Wenqing Shui
  - iHuman Institute, ShanghaiTech University, Shanghai, China; School of Life Science and Technology, ShanghaiTech University, Shanghai, China