1. Kalhor M, Lapin J, Picciani M, Wilhelm M. Rescoring Peptide Spectrum Matches: Boosting Proteomics Performance by Integrating Peptide Property Predictors Into Peptide Identification. Mol Cell Proteomics 2024; 23:100798. [PMID: 38871251 DOI: 10.1016/j.mcpro.2024.100798] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/02/2024] [Revised: 05/26/2024] [Accepted: 06/09/2024] [Indexed: 06/15/2024] Open
Abstract
Rescoring of peptide spectrum matches originating from database search engines enabled by peptide property predictors is exceeding the performance of peptide identification from traditional database search engines. In contrast to the peptide spectrum match scores calculated by traditional database search engines, rescoring peptide spectrum matches generates scores based on comparing observed and predicted peptide properties, such as fragment ion intensities and retention times. These newly generated scores enable a more efficient discrimination between correct and incorrect peptide spectrum matches. This approach was shown to lead to substantial improvements in the number of confidently identified peptides, facilitating the analysis of challenging datasets in various fields such as immunopeptidomics, metaproteomics, proteogenomics, and single-cell proteomics. In this review, we summarize the key elements leading up to the recent introduction of multiple data-driven rescoring pipelines. We provide an overview of relevant post-processing rescoring tools, introduce prominent data-driven rescoring pipelines for various applications, and highlight limitations, opportunities, and future perspectives of this approach and its impact on mass spectrometry-based proteomics.
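The comparison of observed and predicted peptide properties described in this abstract can be illustrated with a small sketch. The abstract does not name a similarity measure, so the normalized spectral angle used here is an assumption (it is one commonly used choice), and the function names `spectral_angle` and `retention_time_error` are hypothetical.

```python
import math

def spectral_angle(observed, predicted):
    """Normalized spectral angle between two fragment-intensity vectors.

    Both vectors are assumed to be aligned (same fragment ions, same order)
    and non-negative. Returns 1.0 for identical patterns and 0.0 for
    patterns that share no intensity.
    """
    if len(observed) != len(predicted):
        raise ValueError("intensity vectors must have the same length")
    dot = sum(o * p for o, p in zip(observed, predicted))
    norm_o = math.sqrt(sum(o * o for o in observed))
    norm_p = math.sqrt(sum(p * p for p in predicted))
    if norm_o == 0 or norm_p == 0:
        return 0.0
    # Clamp to guard against floating-point drift outside [-1, 1].
    cosine = max(-1.0, min(1.0, dot / (norm_o * norm_p)))
    return 1.0 - 2.0 * math.acos(cosine) / math.pi

def retention_time_error(observed_rt, predicted_rt):
    """Absolute difference between observed and predicted retention time."""
    return abs(observed_rt - predicted_rt)
```

In a rescoring pipeline, values like these would be appended to each peptide spectrum match as additional features before a post-processing tool re-ranks the matches; the exact feature set differs between pipelines.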
Affiliation(s)
- Mostafa Kalhor
- Computational Mass Spectrometry, TUM School of Life Sciences, Technical University of Munich, Freising, Germany
- Joel Lapin
- Computational Mass Spectrometry, TUM School of Life Sciences, Technical University of Munich, Freising, Germany
- Mario Picciani
- Computational Mass Spectrometry, TUM School of Life Sciences, Technical University of Munich, Freising, Germany
- Mathias Wilhelm
- Computational Mass Spectrometry, TUM School of Life Sciences, Technical University of Munich, Freising, Germany; Munich Data Science Institute, Technical University of Munich, Garching, Germany.
2. Sun Y, Chen S, Jiang H, Qin B, Li D, Jia K, Wang C. Towards interpretable machine learning for observational quantification of soil heavy metal concentrations under environmental constraints. Sci Total Environ 2024; 926:171931. [PMID: 38531447 DOI: 10.1016/j.scitotenv.2024.171931] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/21/2024] [Revised: 03/18/2024] [Accepted: 03/21/2024] [Indexed: 03/28/2024]
Abstract
Monitoring heavy metal concentrations in soils is central to assessing agricultural production safety. Satellite observations permit inferring concentrations from spectrum, thereby contributing to the prevention and control of soil heavy metal pollution. However, heavy metals exhibit weak spectral responses, particularly at low and medium concentrations, and are predominantly influenced by other soil components. Machine learning (ML)-driven modelling can produce predictions but lacks interpretability. Here, we present an interpretable ML framework for concentration quantification modelling and investigated the contributions of spectral and environmental factors (pH and organic carbon) to the estimation of metals with multiple concentration gradients, as analysed through SHAP (SHapley Additive exPlanations) data derived from four learning-based scenarios. The results indicated that scenarios SHC (spectral, pH, and organic carbon) and SH (spectral and pH) were the most optimal for chromium (Cr) [RPD = 1.42, Adj R² = 0.62], and cadmium (Cd) [RPD = 1.80, Adj R² = 0.80]. Under environmental constraints, the spectral predictability for Cr and Cd was improved by 67 % and 87 %, respectively. We concluded that interpretable modelling, utilising both spectral and soil environmental factors, holds significant potential for estimating heavy metals across concentration gradients. It is recommended that samples with higher organic carbon content and lower pH be selected to enhance Cr and Cd predictions. An advanced grasp of interpretable predictions facilitates earlier warning of heavy metal contamination and guides the formulation of robust sampling strategies.
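The two accuracy figures quoted in this abstract can be made concrete with a short sketch. It assumes that RPD denotes the ratio of performance to deviation (standard deviation of the observations divided by the prediction RMSE) and that the adjusted coefficient of determination follows the usual formula; the abstract states neither, and the use of the sample standard deviation is likewise an assumption. All function names are hypothetical.

```python
import math

def rmse(observed, predicted):
    """Root mean squared error of the predictions."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def rpd(observed, predicted):
    """Sample standard deviation of the observations divided by the RMSE."""
    n = len(observed)
    mean = sum(observed) / n
    sd = math.sqrt(sum((o - mean) ** 2 for o in observed) / (n - 1))
    return sd / rmse(observed, predicted)

def adjusted_r2(observed, predicted, n_predictors):
    """R squared penalized for the number of predictors in the model."""
    n = len(observed)
    mean = sum(observed) / n
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean) ** 2 for o in observed)
    r2 = 1.0 - ss_res / ss_tot
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)
```

Under these definitions a higher RPD means the prediction error is small relative to the natural spread of the measured concentrations.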
Affiliation(s)
- Yishan Sun
- Guangdong Provincial Key Laboratory of Remote Sensing and Geographical Information System, Guangdong Open Laboratory of Geospatial Information Technology and Application, Guangdong Engineering Technology Research Center of Remote Sensing Big Data Application, Guangzhou Institute of Geography, Guangdong Academy of Science, Guangzhou 510070, China; Guangzhou Institute of Geochemistry, Chinese Academy of Sciences, Guangzhou 510640, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Shuisen Chen
- Guangdong Provincial Key Laboratory of Remote Sensing and Geographical Information System, Guangdong Open Laboratory of Geospatial Information Technology and Application, Guangdong Engineering Technology Research Center of Remote Sensing Big Data Application, Guangzhou Institute of Geography, Guangdong Academy of Science, Guangzhou 510070, China; Joint Laboratory on Low-carbon Digital Monitoring, Guangdong Institute of Carbon Neutrality (Shaoguan), Shaoguan ShenBay Low Carbon Digital Technology Co., Ltd., Shaoguan 512026, China.
- Hao Jiang
- Guangdong Provincial Key Laboratory of Remote Sensing and Geographical Information System, Guangdong Open Laboratory of Geospatial Information Technology and Application, Guangdong Engineering Technology Research Center of Remote Sensing Big Data Application, Guangzhou Institute of Geography, Guangdong Academy of Science, Guangzhou 510070, China
- Boxiong Qin
- Guangdong Provincial Key Laboratory of Remote Sensing and Geographical Information System, Guangdong Open Laboratory of Geospatial Information Technology and Application, Guangdong Engineering Technology Research Center of Remote Sensing Big Data Application, Guangzhou Institute of Geography, Guangdong Academy of Science, Guangzhou 510070, China; Joint Laboratory on Low-carbon Digital Monitoring, Guangdong Institute of Carbon Neutrality (Shaoguan), Shaoguan ShenBay Low Carbon Digital Technology Co., Ltd., Shaoguan 512026, China
- Dan Li
- Guangdong Provincial Key Laboratory of Remote Sensing and Geographical Information System, Guangdong Open Laboratory of Geospatial Information Technology and Application, Guangdong Engineering Technology Research Center of Remote Sensing Big Data Application, Guangzhou Institute of Geography, Guangdong Academy of Science, Guangzhou 510070, China; Joint Laboratory on Low-carbon Digital Monitoring, Guangdong Institute of Carbon Neutrality (Shaoguan), Shaoguan ShenBay Low Carbon Digital Technology Co., Ltd., Shaoguan 512026, China
- Kai Jia
- Guangdong Provincial Key Laboratory of Remote Sensing and Geographical Information System, Guangdong Open Laboratory of Geospatial Information Technology and Application, Guangdong Engineering Technology Research Center of Remote Sensing Big Data Application, Guangzhou Institute of Geography, Guangdong Academy of Science, Guangzhou 510070, China; Joint Laboratory on Low-carbon Digital Monitoring, Guangdong Institute of Carbon Neutrality (Shaoguan), Shaoguan ShenBay Low Carbon Digital Technology Co., Ltd., Shaoguan 512026, China
- Chongyang Wang
- Guangdong Provincial Key Laboratory of Remote Sensing and Geographical Information System, Guangdong Open Laboratory of Geospatial Information Technology and Application, Guangdong Engineering Technology Research Center of Remote Sensing Big Data Application, Guangzhou Institute of Geography, Guangdong Academy of Science, Guangzhou 510070, China
3. Yang Y, Fang Q. Prediction of glycopeptide fragment mass spectra by deep learning. Nat Commun 2024; 15:2448. [PMID: 38503734 PMCID: PMC10951270 DOI: 10.1038/s41467-024-46771-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/13/2023] [Accepted: 03/11/2024] [Indexed: 03/21/2024] Open
Abstract
Deep learning has achieved a notable success in mass spectrometry-based proteomics and is now emerging in glycoproteomics. While various deep learning models can predict fragment mass spectra of peptides with good accuracy, they cannot cope with the non-linear glycan structure in an intact glycopeptide. Herein, we present DeepGlyco, a deep learning-based approach for the prediction of fragment spectra of intact glycopeptides. Our model adopts tree-structured long-short term memory networks to process the glycan moiety and a graph neural network architecture to incorporate potential fragmentation pathways of a specific glycan structure. This feature is beneficial to model explainability and differentiation ability of glycan structural isomers. We further demonstrate that predicted spectral libraries can be used for data-independent acquisition glycoproteomics as a supplement for library completeness. We expect that this work will provide a valuable deep learning resource for glycoproteomics.
Affiliation(s)
- Yi Yang
- ZJU-Hangzhou Global Scientific and Technological Innovation Center, Zhejiang University, Hangzhou, 311200, China.
- Qun Fang
- ZJU-Hangzhou Global Scientific and Technological Innovation Center, Zhejiang University, Hangzhou, 311200, China.
- Department of Chemistry, Zhejiang University, Hangzhou, 310058, China.
4. Lapin J, Yan X, Dong Q. UniSpec: Deep Learning for Predicting the Full Range of Peptide Fragment Ion Series to Enhance the Proteomics Data Analysis Workflow. Anal Chem 2024. [PMID: 38329031 DOI: 10.1021/acs.analchem.3c02321] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/09/2024]
Abstract
We present UniSpec, an attention-driven deep neural network designed to predict comprehensive collision-induced fragmentation spectra, thereby improving peptide identification in shotgun proteomics. Utilizing a training data set of 1.8 million unique high-quality tandem mass spectra (MS2) from 0.8 million unique peptide ions, UniSpec learned with a peptide fragmentation dictionary encompassing 7919 fragment peaks. Among these, 5712 are neutral loss peaks, with 2310 corresponding to modification-specific neutral losses. Remarkably, UniSpec can predict 73%-77% of fragment intensities based on our NIST reference library spectra, a significant leap from the 35%-45% coverage of only b and y ions. Comparative studies with Prosit elucidate that while both models are strong at predicting their respective fragment ion series, UniSpec particularly shines in generating more complex MS2 spectra with diverse ion annotations. The integration of UniSpec's predictions into shotgun proteomics data analysis boosts the identification rate of tryptic peptides by 48% at a 1% false discovery rate (FDR) and 60% at a more confident 0.1% FDR. Using UniSpec's predicted in-silico spectral library, the search results closely matched those from search engines and experimental spectral libraries used in peptide identification, highlighting its potential as a stand-alone identification tool. The source code and Python scripts are available on GitHub (https://github.com/usnistgov/UniSpec) and Zenodo (https://zenodo.org/records/10452792), and all data sets and analysis results generated in this work were deposited in Zenodo (https://zenodo.org/records/10052268).
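The identification gains in this abstract are reported at fixed false discovery rates (1% and 0.1%). The abstract does not say how FDR was estimated; the sketch below assumes a target-decoy estimate, which is a common approach in shotgun proteomics but is not confirmed by the source. The function name and the `(score, is_decoy)` input layout are hypothetical.

```python
def count_targets_at_fdr(psms, fdr_threshold=0.01):
    """Count target matches accepted at a given FDR threshold.

    psms: iterable of (score, is_decoy) pairs; a higher score is better.
    FDR at each rank is estimated as decoys / targets seen so far.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    fdrs = []
    targets = decoys = 0
    for _, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        fdrs.append(decoys / max(targets, 1))
    # q-value: the lowest FDR at which a match would still be accepted,
    # computed as a running minimum from the worst-scoring match upward.
    q_values = []
    running_min = float("inf")
    for fdr in reversed(fdrs):
        running_min = min(running_min, fdr)
        q_values.append(running_min)
    q_values.reverse()
    return sum(
        1
        for (_, is_decoy), q in zip(ranked, q_values)
        if not is_decoy and q <= fdr_threshold
    )
```

Comparing the count before and after adding prediction-based scores, at the same threshold, is one way a percentage gain in identifications could be computed.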
Affiliation(s)
- Joel Lapin
- Department of Physics, Georgetown University, Washington, D.C. 20057, United States
- Associate, Mass Spectrometry Data Center, Biomolecular Measurement Division, National Institute of Standards and Technology, 100 Bureau Drive, Gaithersburg, Maryland 20899, United States
- Xinjian Yan
- Mass Spectrometry Data Center, Biomolecular Measurement Division, National Institute of Standards and Technology, 100 Bureau Drive, Gaithersburg, Maryland 20899, United States
- Qian Dong
- Mass Spectrometry Data Center, Biomolecular Measurement Division, National Institute of Standards and Technology, 100 Bureau Drive, Gaithersburg, Maryland 20899, United States
5. Ye J, He X, Wang S, Dong MQ, Wu F, Lu S, Feng F. Test-Time Training for Deep MS/MS Spectrum Prediction Improves Peptide Identification. J Proteome Res 2024; 23:550-559. [PMID: 38153036 DOI: 10.1021/acs.jproteome.3c00229] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/29/2023]
Abstract
In bottom-up proteomics, peptide-spectrum matching is critical for peptide and protein identification. Recently, deep learning models have been used to predict tandem mass spectra of peptides, enabling the calculation of similarity scores between the predicted and experimental spectra for peptide-spectrum matching. These models follow the supervised learning paradigm, which trains a general model using paired peptides and spectra from standard data sets and directly employs the model on experimental data. However, this approach can lead to inaccurate predictions due to differences between the training data and the experimental data, such as sample types, enzyme specificity, and instrument calibration. To tackle this problem, we developed a test-time training paradigm that adapts the pretrained model to generate experimental data-specific models, namely, PepT3. PepT3 yields a 10-40% increase in peptide identification depending on the variability in training and experimental data. Intriguingly, when applied to a patient-derived immunopeptidomic sample, PepT3 increases the identification of tumor-specific immunopeptide candidates by 60%. Two-thirds of the newly identified candidates are predicted to bind to the patient's human leukocyte antigen isoforms. To facilitate access of the model and all the results, we have archived all the intermediate files in Zenodo.org with identifier 8231084.
Affiliation(s)
- Jianbai Ye
- MoE Key Laboratory of Brain-inspired Intelligent Perception and Cognition, University of Science and Technology of China, Hefei, Anhui 230026, China
- Xiangnan He
- MoE Key Laboratory of Brain-inspired Intelligent Perception and Cognition, University of Science and Technology of China, Hefei, Anhui 230026, China
- Shujuan Wang
- National Institute of Biological Sciences, Beijing 102206, China
- Meng-Qiu Dong
- National Institute of Biological Sciences, Beijing 102206, China
- Feng Wu
- MoE Key Laboratory of Brain-inspired Intelligent Perception and Cognition, University of Science and Technology of China, Hefei, Anhui 230026, China
- Shan Lu
- Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, California 92093, United States
- Fuli Feng
- MoE Key Laboratory of Brain-inspired Intelligent Perception and Cognition, University of Science and Technology of China, Hefei, Anhui 230026, China
6. Schrader M. Origins, Technological Advancement, and Applications of Peptidomics. Methods Mol Biol 2024; 2758:3-47. [PMID: 38549006 DOI: 10.1007/978-1-0716-3646-6_1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/02/2024]
Abstract
Peptidomics is the comprehensive characterization of peptides from biological sources instead of heading for a few single peptides in former peptide research. Mass spectrometry allows to detect a multitude of peptides in complex mixtures and thus enables new strategies leading to peptidomics. The term was established in the year 2001, and up to now, this new field has grown to over 3000 publications. Analytical techniques originally developed for fast and comprehensive analysis of peptides in proteomics were specifically adjusted for peptidomics. Although it is thus closely linked to proteomics, there are fundamental differences with conventional bottom-up proteomics. Fundamental technological advancements of peptidomics since have occurred in mass spectrometry and data processing, including quantification, and more slightly in separation technology. Different strategies and diverse sources of peptidomes are mentioned by numerous applications, such as discovery of neuropeptides and other bioactive peptides, including the use of biochemical assays. Furthermore, food and plant peptidomics are introduced similarly. Additionally, applications with a clinical focus are included, comprising biomarker discovery as well as immunopeptidomics. This overview extensively reviews recent methods, strategies, and applications including links to all other chapters of this book.
Affiliation(s)
- Michael Schrader
- Department of Bioengineering Sciences, Weihenstephan-Tr. University of Applied Sciences, Freising, Germany.
7. Schrader M, Fricker LD. Current Challenges and Future Directions in Peptidomics. Methods Mol Biol 2024; 2758:485-498. [PMID: 38549031 DOI: 10.1007/978-1-0716-3646-6_26] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/02/2024]
Abstract
The field of peptidomics has been under development since its start more than 20 years ago. In this chapter we provide a personal outlook for future directions in this field. The applications of peptidomics technologies are spreading more and more from classical research of peptide hormones and neuropeptides towards commercial applications in plant and food-science. Many clinical applications have been developed to analyze the complexity of biofluids, which are being addressed with new instrumentation, automization, and data processing. Additionally, the newly developed field of immunopeptidomics is showing promise for cancer therapies. In conclusion, peptidomics will continue delivering important information in classical fields like neuropeptides and peptide hormones, benefiting from improvements in state-of-the-art technologies. Moreover, new directions of research such as immunopeptidomics will further complement classical omics technologies and may become routine clinical procedures. Taken together, discoveries of new substances, networks, and applications of peptides can be expected in different disciplines.
Affiliation(s)
- Michael Schrader
- Department of Bioengineering Sciences, Weihenstephan-Tr. University of Applied Sciences, Freising, Germany.
- Lloyd D Fricker
- Departments of Molecular Pharmacology and Neuroscience, Albert Einstein College of Medicine, Bronx, NY, USA
8. Sosa-Acosta P, Nogueira FCS, Domont GB. Proteomics and Metabolomics in Congenital Zika Syndrome: A Review of Molecular Insights and Biomarker Discovery. Adv Exp Med Biol 2024; 1443:63-85. [PMID: 38409416 DOI: 10.1007/978-3-031-50624-6_4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/28/2024]
Abstract
Zika virus (ZIKV) infection can be transmitted vertically, leading to the development of congenital Zika syndrome (CZS) in infected fetuses. During the early stages of gestation, the fetuses face an elevated risk of developing CZS. However, it is important to note that late-stage infections can also result in adverse outcomes. The differences between CZS and non-CZS phenotypes remain poorly understood. In this review, we provide a summary of the molecular mechanisms underlying ZIKV infection and placental and blood-brain barriers trespassing. Also, we have included molecular alterations that elucidate the progression of CZS by proteomics and metabolomics studies. Lastly, this review comprises investigations into body fluid samples, which have aided to identify potential biomarkers associated with CZS.
Affiliation(s)
- Patricia Sosa-Acosta
- Proteomics Unit, Department of Biochemistry, Institute of Chemistry, Universidade Federal do Rio de Janeiro, Rio de Janeiro, Brazil
- Laboratory of Proteomics (LabProt), LADETEC, Institute of Chemistry, Universidade Federal do Rio de Janeiro, Rio de Janeiro, Brazil
- Precision Medicine Research Center, Institute of Biophysics Carlos Chagas Filho, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil
- Fábio C S Nogueira
- Proteomics Unit, Department of Biochemistry, Institute of Chemistry, Universidade Federal do Rio de Janeiro, Rio de Janeiro, Brazil.
- Laboratory of Proteomics (LabProt), LADETEC, Institute of Chemistry, Universidade Federal do Rio de Janeiro, Rio de Janeiro, Brazil.
- Precision Medicine Research Center, Institute of Biophysics Carlos Chagas Filho, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil.
- Gilberto B Domont
- Proteomics Unit, Department of Biochemistry, Institute of Chemistry, Universidade Federal do Rio de Janeiro, Rio de Janeiro, Brazil.
- Precision Medicine Research Center, Institute of Biophysics Carlos Chagas Filho, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil.
9. Van JAD, Luo Y, Danska JS, Dai F, Alexeeff SE, Gunderson EP, Rost H, Wheeler MB. Postpartum defects in inflammatory response after gestational diabetes precede progression to type 2 diabetes: a nested case-control study within the SWIFT study. Metabolism 2023; 149:155695. [PMID: 37802200 DOI: 10.1016/j.metabol.2023.155695] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/03/2023] [Revised: 09/27/2023] [Accepted: 09/30/2023] [Indexed: 10/08/2023]
Abstract
BACKGROUND: Gestational diabetes (GDM) is a distinctive form of diabetes that first presents in pregnancy. While most women return to normoglycemia after delivery, they are nearly ten times more likely to develop type 2 diabetes than women with uncomplicated pregnancies. Current prevention strategies remain limited due to our incomplete understanding of the early underpinnings of progression.
AIM: To comprehensively characterize the postpartum profiles of women shortly after a GDM pregnancy and identify key mechanisms responsible for the progression to overt type 2 diabetes using multi-dimensional approaches.
METHODS: We conducted a nested case-control study of 200 women from the Study of Women, Infant Feeding and Type 2 Diabetes After GDM Pregnancy (SWIFT) to examine biochemical, proteomic, metabolomic, and lipidomic profiles at 6-9 weeks postpartum (baseline) after a GDM pregnancy. At baseline and annually up to two years, SWIFT administered research 2-hour 75-gram oral glucose tolerance tests. Women who developed incident type 2 diabetes within four years of delivery (incident case group, n = 100) were pair-matched by age, race, and pre-pregnancy body mass index to those who remained free of diabetes for at least 8 years (control group, n = 100). Correlation analyses were used to assess and integrate relationships across profiling platforms.
RESULTS: At baseline, all 200 women were free of diabetes. The case group was more likely to present with dysglycemia (e.g., impaired fasting glucose levels, glucose tolerance, or both). We also detected differences between groups across all omic platforms. Notably, protein profiles revealed an underlying inflammatory response with perturbations in protease inhibitors, coagulation components, extracellular matrix components, and lipoproteins, whereas metabolite and lipid profiles implicated disturbances in amino acids and triglycerides at individual and class levels with future progression. We identified significant correlations between profile features and fasting plasma insulin levels, but not with fasting glucose levels. Additionally, specific cross-omic relationships, particularly among proteins and lipids, were accentuated or activated in the case group but not the control group.
CONCLUSIONS: Overall, we applied orthogonal, complementary profiling techniques to uncover an inflammatory response linked to elevated triglyceride levels shortly after a GDM pregnancy, which is more pronounced in women who progress to overt diabetes.
Affiliation(s)
- Julie A D Van
- Department of Physiology, Faculty of Medicine, University of Toronto, Toronto, Ontario, Canada; Metabolism Research Group, Division of Advanced Diagnostics, Toronto General Research Institute, Toronto, Ontario, Canada.
- Yihan Luo
- Department of Physiology, Faculty of Medicine, University of Toronto, Toronto, Ontario, Canada; Metabolism Research Group, Division of Advanced Diagnostics, Toronto General Research Institute, Toronto, Ontario, Canada
- Jayne S Danska
- Program in Genetics and Genome Biology, Hospital for Sick Children Research Institute, Toronto, Ontario, Canada; Departments of Immunology and Medical Biophysics, University of Toronto, Toronto, Ontario, Canada
- Feihan Dai
- Department of Physiology, Faculty of Medicine, University of Toronto, Toronto, Ontario, Canada
- Stacey E Alexeeff
- Division of Research, Kaiser Permanente Northern California, Oakland, California, United States of America
- Erica P Gunderson
- Division of Research, Kaiser Permanente Northern California, Oakland, California, United States of America; Department of Health Systems Science, Kaiser Permanente Bernard J. Tyson School of Medicine, Pasadena, California, United States of America
- Hannes Rost
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada; Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario, Canada
- Michael B Wheeler
- Department of Physiology, Faculty of Medicine, University of Toronto, Toronto, Ontario, Canada; Metabolism Research Group, Division of Advanced Diagnostics, Toronto General Research Institute, Toronto, Ontario, Canada.
10. Jia W, Peng J, Zhang Y, Zhu J, Qiang X, Zhang R, Shi L. Exploring novel ANGICon-EIPs through ameliorated peptidomics techniques: Can deep learning strategies as a core breakthrough in peptide structure and function prediction? Food Res Int 2023; 174:113640. [PMID: 37986483 DOI: 10.1016/j.foodres.2023.113640] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 09/05/2023] [Revised: 10/23/2023] [Accepted: 10/24/2023] [Indexed: 11/22/2023]
Abstract
Dairy-derived angiotensin-I-converting enzyme inhibitory peptides (ANGICon-EIPs) have been regarded as a relatively safe supplementary diet-therapy strategy for individuals with hypertension, and short-chain peptides may have more relevant antihypertensive benefits due to their direct intestinal absorption. Our previous explorations have confirmed that endogenous goat milk short-chain peptides are also an essential source of ANGICon-EIPs. Nonetheless, there are limited explorations on endogenous ANGICon-EIPs owing to the limitations of the extraction and enrichment of endogenous peptides, currently. This review outlined ameliorated pre-treatment strategies, data acquisition methods, and tools for the prediction of peptide structure and function, aiming to provide creative ideas for discovering novel ANGICon-EIPs. Currently, deep learning-based peptide structure and function prediction algorithms have achieved significant advancements. The convolutional neural network (CNN) and peptide sequence-based multi-label deep learning approach for determining the multi-functionalities of bioactive peptides (MLBP) can predict multiple peptide functions with absolute true value and accuracy of 0.699 and 0.708, respectively. Utilizing peptide sequence input, torsion angles, and inter-residue distance to train neural networks, APPTEST predicted the average backbone root mean square deviation (RMSD) value of peptide (5-40 aa) structures as low as 1.96 Å. Overall, with the exploration of more neural network architectures, deep learning could be considered a critical research tool to reduce the cost and improve the efficiency of identifying novel endogenous ANGICon-EIPs.
Affiliation(s)
- Wei Jia
- School of Food and Bioengineering, Shaanxi University of Science and Technology, Xi'an 710021, China; Inspection and Testing Center of Fuping County (Shaanxi goat milk product quality supervision and Inspection Center), Weinan 711700, China; Shaanxi Research Institute of Agricultural Products Processing Technology, Xi'an 710021, China.
- Jian Peng
- School of Food and Bioengineering, Shaanxi University of Science and Technology, Xi'an 710021, China
- Yan Zhang
- Inspection and Testing Center of Fuping County (Shaanxi goat milk product quality supervision and Inspection Center), Weinan 711700, China
- Jiying Zhu
- School of Food and Bioengineering, Shaanxi University of Science and Technology, Xi'an 710021, China
- Xin Qiang
- Inspection and Testing Center of Fuping County (Shaanxi goat milk product quality supervision and Inspection Center), Weinan 711700, China
- Rong Zhang
- School of Food and Bioengineering, Shaanxi University of Science and Technology, Xi'an 710021, China
- Lin Shi
- School of Food and Bioengineering, Shaanxi University of Science and Technology, Xi'an 710021, China
11. Sinitcyn P, Richards AL, Weatheritt RJ, Brademan DR, Marx H, Shishkova E, Meyer JG, Hebert AS, Westphall MS, Blencowe BJ, Cox J, Coon JJ. Global detection of human variants and isoforms by deep proteome sequencing. Nat Biotechnol 2023; 41:1776-1786. [PMID: 36959352 PMCID: PMC10713452 DOI: 10.1038/s41587-023-01714-x] [Citation(s) in RCA: 43] [Impact Index Per Article: 43.0] [Received: 08/26/2022] [Accepted: 02/15/2023] [Indexed: 03/25/2023]
Abstract
An average shotgun proteomics experiment detects approximately 10,000 human proteins from a single sample. However, individual proteins are typically identified by peptide sequences representing a small fraction of their total amino acids. Hence, an average shotgun experiment fails to distinguish different protein variants and isoforms. Deeper proteome sequencing is therefore required for the global discovery of protein isoforms. Using six different human cell lines, six proteases, deep fractionation and three tandem mass spectrometry fragmentation methods, we identify a million unique peptides from 17,717 protein groups, with a median sequence coverage of approximately 80%. Direct comparison with RNA expression data provides evidence for the translation of most nonsynonymous variants. We have also hypothesized that undetected variants likely arise from mutation-induced protein instability. We further observe comparable detection rates for exon-exon junction peptides representing constitutive and alternative splicing events. Our dataset represents a resource for proteoform discovery and provides direct evidence that most frame-preserving alternatively spliced isoforms are translated.
Affiliation(s)
- Pavel Sinitcyn
- Computational Systems Biochemistry Research Group, Max Planck Institute of Biochemistry, Martinsried, Germany
- Morgridge Institute for Research, Madison, WI, USA
| | - Alicia L Richards
- National Center for Quantitative Biology of Complex Systems, University of Wisconsin-Madison, Madison, WI, USA
- Department of Chemistry, University of Wisconsin-Madison, Madison, WI, USA
| | - Robert J Weatheritt
- EMBL Australia and Garvan Institute of Medical Research, Sydney, New South Wales, Australia
- School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, New South Wales, Australia
| | - Dain R Brademan
- Morgridge Institute for Research, Madison, WI, USA
- Department of Biomolecular Chemistry, University of Wisconsin-Madison, Madison, WI, USA
| | - Harald Marx
- National Center for Quantitative Biology of Complex Systems, University of Wisconsin-Madison, Madison, WI, USA
- Department of Biomolecular Chemistry, University of Wisconsin-Madison, Madison, WI, USA
- Department of Microbiology and Ecosystem Science, University of Vienna, Vienna, Austria
| | - Evgenia Shishkova
- National Center for Quantitative Biology of Complex Systems, University of Wisconsin-Madison, Madison, WI, USA
- Department of Biomolecular Chemistry, University of Wisconsin-Madison, Madison, WI, USA
| | - Jesse G Meyer
- National Center for Quantitative Biology of Complex Systems, University of Wisconsin-Madison, Madison, WI, USA
- Department of Biomolecular Chemistry, University of Wisconsin-Madison, Madison, WI, USA
| | - Alexander S Hebert
- National Center for Quantitative Biology of Complex Systems, University of Wisconsin-Madison, Madison, WI, USA
| | - Michael S Westphall
- National Center for Quantitative Biology of Complex Systems, University of Wisconsin-Madison, Madison, WI, USA
- Department of Biomolecular Chemistry, University of Wisconsin-Madison, Madison, WI, USA
| | - Benjamin J Blencowe
- The Donnelly Centre, University of Toronto, Toronto, Ontario, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
| | - Jürgen Cox
- Computational Systems Biochemistry Research Group, Max Planck Institute of Biochemistry, Martinsried, Germany.
| | - Joshua J Coon
- Morgridge Institute for Research, Madison, WI, USA.
- National Center for Quantitative Biology of Complex Systems, University of Wisconsin-Madison, Madison, WI, USA.
- Department of Chemistry, University of Wisconsin-Madison, Madison, WI, USA.
- Department of Biomolecular Chemistry, University of Wisconsin-Madison, Madison, WI, USA.
| |
|
12
|
Chan CMJ, Lam H. Merging Full-Spectrum and Fragment Ion Intensity Predictions from Deep Learning for High-Quality Spectral Libraries. J Proteome Res 2023; 22:3692-3702. [PMID: 37910637 DOI: 10.1021/acs.jproteome.3c00180] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/03/2023]
Abstract
Spectral libraries are useful resources in proteomic data analysis. Recent advances in deep learning allow tandem mass spectra of peptides to be predicted from their amino acid sequences. This enables predicted spectral libraries to be compiled, and searching against such libraries has been shown to improve the sensitivity in peptide identification over conventional sequence database searching. However, current prediction models lack support for longer peptides, and thus far, predicted library searching has only been demonstrated for backbone ion-only spectrum prediction methods. Here, we propose a deep learning-based full-spectrum prediction method to generate predicted spectral libraries for peptide identification. We demonstrated the superiority of using full-spectrum libraries over backbone ion-only prediction approaches in spectral library searching. Furthermore, merging spectra from different prediction models, as a form of ensemble learning, can produce improved spectral libraries, in terms of identification sensitivity. We also show that a hybrid library combining predicted and experimental spectra can lead to 20% more confident identifications over experimental library searching or sequence database searching.
Affiliation(s)
- Chak Ming Jerry Chan
- Department of Chemical and Biological Engineering, The Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong 999077, China
| | - Henry Lam
- Department of Chemical and Biological Engineering, The Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong 999077, China
| |
|
13
|
Lei X, Xie Z, Sun Y, Qiu J, Yang X. Recent progress in identification of water disinfection byproducts and opportunities for future research. Environ Pollut 2023; 337:122601. [PMID: 37742858 DOI: 10.1016/j.envpol.2023.122601] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/14/2023] [Revised: 07/26/2023] [Accepted: 09/20/2023] [Indexed: 09/26/2023]
Abstract
Numerous disinfection by-products (DBPs) are formed from reactions between disinfectants and organic/inorganic matter during water disinfection. More than seven hundred DBPs have been identified in disinfected water, only a fraction of which are regulated by drinking water guidelines, including trihalomethanes, haloacetic acids, bromate, and chlorite. Toxicity assessments have demonstrated that the identified DBPs cannot fully explain the overall toxicity of disinfected water; therefore, the identification of unknown DBPs is an important prerequisite for understanding the adverse effects of drinking water disinfection. Herein, we review progress in the identification of unknown DBPs over the past five years, classified as halogenated or nonhalogenated and aliphatic or aromatic, followed by specific halogen groups. The concentration and toxicity data of newly identified DBPs are also included. Based on current advances and existing shortcomings, we outline future perspectives in this field.
Affiliation(s)
- Xiaoxiao Lei
- Guangdong Provincial Key Laboratory of Environmental Pollution Control and Remediation Technology, School of Environmental Science and Engineering, Sun Yat-sen University, Guangzhou 510006, China
| | - Ziyan Xie
- Guangdong Provincial Key Laboratory of Environmental Pollution Control and Remediation Technology, School of Environmental Science and Engineering, Sun Yat-sen University, Guangzhou 510006, China
| | - Yijia Sun
- Guangdong Provincial Key Laboratory of Environmental Pollution Control and Remediation Technology, School of Environmental Science and Engineering, Sun Yat-sen University, Guangzhou 510006, China
| | - Junlang Qiu
- Guangdong Provincial Key Laboratory of Environmental Pollution Control and Remediation Technology, School of Environmental Science and Engineering, Sun Yat-sen University, Guangzhou 510006, China.
| | - Xin Yang
- Guangdong Provincial Key Laboratory of Environmental Pollution Control and Remediation Technology, School of Environmental Science and Engineering, Sun Yat-sen University, Guangzhou 510006, China
| |
|
14
|
Wallmann G, Leduc A, Slavov N. Data-Driven Optimization of DIA Mass Spectrometry by DO-MS. J Proteome Res 2023; 22:3149-3158. [PMID: 37695820 PMCID: PMC10591957 DOI: 10.1021/acs.jproteome.3c00177] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/27/2023] [Indexed: 09/13/2023]
Abstract
Mass spectrometry (MS) enables specific and accurate quantification of proteins with ever-increasing throughput and sensitivity. Maximizing this potential of MS requires optimizing data acquisition parameters and performing efficient quality control for large datasets. To facilitate these objectives for data-independent acquisition (DIA), we developed a second version of our framework for data-driven optimization of MS methods (DO-MS). The DO-MS app v2.0 (do-ms.slavovlab.net) allows one to optimize and evaluate results from both label-free and multiplexed DIA (plexDIA) and supports optimizations particularly relevant to single-cell proteomics. We demonstrate multiple use cases, including optimization of duty cycle methods, peptide separation, number of survey scans per duty cycle, and quality control of single-cell plexDIA data. DO-MS allows for interactive data display and generation of extensive reports, including publication-quality figures that can be easily shared. The source code is available at github.com/SlavovLab/DO-MS.
Affiliation(s)
- Georg Wallmann
- Departments of Bioengineering, Biology, Chemistry and Chemical Biology, Single Cell Proteomics Center, Northeastern University, Boston, Massachusetts 02115, United States
| | - Andrew Leduc
- Departments of Bioengineering, Biology, Chemistry and Chemical Biology, Single Cell Proteomics Center, Northeastern University, Boston, Massachusetts 02115, United States
| | - Nikolai Slavov
- Departments of Bioengineering, Biology, Chemistry and Chemical Biology, Single Cell Proteomics Center, Northeastern University, Boston, Massachusetts 02115, United States
- Parallel Squared Technology Institute, Watertown, Massachusetts 02472, United States
| |
|
15
|
Lu H, Xie T, Wu Q, Hu Z, Luo Y, Luo F. Alpha-Glucosidase Inhibitory Peptides: Sources, Preparations, Identifications, and Action Mechanisms. Nutrients 2023; 15:4267. [PMID: 37836551 PMCID: PMC10574726 DOI: 10.3390/nu15194267] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/18/2023] [Revised: 09/29/2023] [Accepted: 10/03/2023] [Indexed: 10/15/2023] Open
Abstract
With changes in people's lifestyles, diabetes has emerged as a chronic disease that poses a serious threat to human health, alongside tumors and cardiovascular and cerebrovascular diseases. α-Glucosidase inhibitors, which are oral drugs, have proven effective in preventing and managing this disease. Studies have suggested that bioactive peptides could serve as a potential source of α-glucosidase inhibitors. These peptides possess hypoglycemic activity and can effectively regulate postprandial blood glucose levels by inhibiting α-glucosidase activity, thereby helping to manage diabetes. This paper provides a systematic summary of the sources, isolation, purification, bioavailability, and possible mechanisms of α-glucosidase inhibitory peptides. The sources of α-glucosidase inhibitory peptides are introduced with emphasis on animals, plants, and microorganisms. This paper also points out problems in research on α-glucosidase inhibitory peptides, with a view to providing theoretical support for their further study.
Affiliation(s)
- Han Lu
- Hunan Key Laboratory of Grain-Oil Deep Process and Quality Control, Central South University of Forestry and Technology, Changsha 410004, China
| | - Tiantian Xie
- Hunan Key Laboratory of Grain-Oil Deep Process and Quality Control, Central South University of Forestry and Technology, Changsha 410004, China
- Hunan Key Laboratory of Forestry Edible Resources Safety and Processing, Central South University of Forestry and Technology, Changsha 410004, China
| | - Qi Wu
- Hunan Key Laboratory of Grain-Oil Deep Process and Quality Control, Central South University of Forestry and Technology, Changsha 410004, China
| | - Zuomin Hu
- Hunan Key Laboratory of Grain-Oil Deep Process and Quality Control, Central South University of Forestry and Technology, Changsha 410004, China
| | - Yi Luo
- Department of Gastroenterology, Xiangya School of Medicine, Central South University, Changsha 410008, China
| | - Feijun Luo
- Hunan Key Laboratory of Grain-Oil Deep Process and Quality Control, Central South University of Forestry and Technology, Changsha 410004, China
- Hunan Key Laboratory of Forestry Edible Resources Safety and Processing, Central South University of Forestry and Technology, Changsha 410004, China
| |
|
16
|
Li X. Recent applications of quantitative mass spectrometry in biopharmaceutical process development and manufacturing. J Pharm Biomed Anal 2023; 234:115581. [PMID: 37494866 DOI: 10.1016/j.jpba.2023.115581] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/28/2023] [Revised: 06/27/2023] [Accepted: 07/12/2023] [Indexed: 07/28/2023]
Abstract
Biopharmaceutical products have seen rapid growth over the past few decades and continue to dominate the global pharmaceutical market. Aligning with the quality by design (QbD) framework and realization, recent advances in liquid chromatography-mass spectrometry (LC-MS) instrumentation and related techniques have enhanced biopharmaceutical characterization capabilities and have supported an increased development of biopharmaceutical products. Beyond its routine qualitative characterization, the quantitative feature of LC-MS has unique applications in biopharmaceutical process development and manufacturing. This review describes the recent applications and implications of the advancement of quantitative MS methods in biopharmaceutical process development, and characterization of biopharmaceutical product, product-related variants, and process-related impurities. We also provide insights on the emerging applications of quantitative MS in the lifecycle of biopharmaceutical product development including quality control in the Good Manufacturing Practice (GMP) environment and process analytical technology (PAT) practices during process development and manufacturing. Through collaboration with instrument and software vendors and regulatory agencies, we envision broader adoption of phase-appropriate quantitative MS-based methods for the analysis of biopharmaceutical products, which in turn has the potential to enable manufacture of higher quality products for patients.
Affiliation(s)
- Xuanwen Li
- Analytical Research and Development, MRL, Merck & Co., Inc., 126 E. Lincoln Avenue, Rahway, NJ 07065, USA.
| |
|
17
|
Baker JL. Illuminating the oral microbiome and its host interactions: recent advancements in omics and bioinformatics technologies in the context of oral microbiome research. FEMS Microbiol Rev 2023; 47:fuad051. [PMID: 37667515 PMCID: PMC10503653 DOI: 10.1093/femsre/fuad051] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2023] [Revised: 08/02/2023] [Accepted: 09/01/2023] [Indexed: 09/06/2023] Open
Abstract
The oral microbiota has an enormous impact on human health, with oral dysbiosis now linked to many oral and systemic diseases. Recent advancements in sequencing, mass spectrometry, bioinformatics, computational biology, and machine learning are revolutionizing oral microbiome research, enabling analysis at an unprecedented scale and level of resolution using omics approaches. This review provides a comprehensive perspective on the current state-of-the-art tools available to perform genomics, metagenomics, phylogenomics, pangenomics, transcriptomics, proteomics, metabolomics, lipidomics, and multi-omics analysis on (all) microbiomes, and then provides examples of how the techniques have been applied to research of the oral microbiome, specifically. Key findings of these studies and remaining challenges for the field are highlighted. Although the methods discussed here are placed in the context of their contributions to oral microbiome research specifically, they are pertinent to the study of any microbiome, and the intended audience of this review includes researchers who would simply like an introduction to microbial omics and/or an update on the latest omics methods. Continued research of the oral microbiota using omics approaches is crucial and will lead to dramatic improvements in human health, longevity, and quality of life.
Affiliation(s)
- Jonathon L Baker
- Department of Oral Rehabilitation & Biosciences, School of Dentistry, Oregon Health & Science University, 3181 Sam Jackson Park Road, Portland, OR 97202, United States
- Genomic Medicine Group, J. Craig Venter Institute, La Jolla, CA 92037, United States
- Department of Pediatrics, UC San Diego School of Medicine, La Jolla, CA 92093, United States
| |
|
18
|
Wallmann G, Leduc A, Slavov N. Data-Driven Optimization of DIA Mass Spectrometry by DO-MS. bioRxiv 2023:2023.02.02.526809. [PMID: 36778474 PMCID: PMC9915643 DOI: 10.1101/2023.02.02.526809] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 02/05/2023]
Abstract
Mass spectrometry (MS) enables specific and accurate quantification of proteins with ever-increasing throughput and sensitivity. Maximizing this potential of MS requires optimizing data acquisition parameters and performing efficient quality control for large datasets. To facilitate these objectives for data-independent acquisition (DIA), we developed a second version of our framework for data-driven optimization of mass spectrometry methods (DO-MS). The DO-MS app v2.0 (do-ms.slavovlab.net) allows one to optimize and evaluate results from both label-free and multiplexed DIA (plexDIA) and supports optimizations particularly relevant for single-cell proteomics. We demonstrate multiple use cases, including optimization of duty cycle methods, peptide separation, number of survey scans per duty cycle, and quality control of single-cell plexDIA data. DO-MS allows for interactive data display and generation of extensive reports, including publication-quality figures, that can be easily shared. The source code is available at github.com/SlavovLab/DO-MS.
Affiliation(s)
- Georg Wallmann
- Departments of Bioengineering, Biology, Chemistry and Chemical Biology, Single Cell Proteomics Center, Northeastern University, Boston, MA 02115, USA
| | - Andrew Leduc
- Departments of Bioengineering, Biology, Chemistry and Chemical Biology, Single Cell Proteomics Center, Northeastern University, Boston, MA 02115, USA
| | - Nikolai Slavov
- Departments of Bioengineering, Biology, Chemistry and Chemical Biology, Single Cell Proteomics Center, Northeastern University, Boston, MA 02115, USA
- Parallel Squared Technology Institute, Watertown, MA 02472, USA
| |
|
19
|
Ng CCA, Zhou Y, Yao ZP. Algorithms for de-novo sequencing of peptides by tandem mass spectrometry: A review. Anal Chim Acta 2023; 1268:341330. [PMID: 37268337 DOI: 10.1016/j.aca.2023.341330] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 12/22/2022] [Revised: 05/04/2023] [Accepted: 05/06/2023] [Indexed: 06/04/2023]
Abstract
Peptide sequencing is of great significance to fundamental and applied research in fields such as the chemical, biological, medicinal and pharmaceutical sciences. With the rapid development of mass spectrometry and sequencing algorithms, de-novo peptide sequencing using tandem mass spectrometry (MS/MS) has become the main method for determining amino acid sequences of novel and unknown peptides. Advanced algorithms allow the amino acid sequence information to be accurately obtained from MS/MS spectra in a short time. In this review, algorithms from exhaustive search to state-of-the-art machine learning and neural networks for high-throughput and automated de-novo sequencing are introduced and compared. Impacts of datasets on algorithm performance are highlighted. The current limitations and promising directions of de-novo peptide sequencing are also discussed in this review.
Affiliation(s)
- Cheuk Chi A Ng
- State Key Laboratory of Chemical Biology and Drug Discovery, and Department of Applied Biology and Chemical Technology, The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong Special Administrative Region of China; Research Institute for Future Food, and Research Center for Chinese Medicine Innovation, The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong Special Administrative Region of China; State Key Laboratory of Chinese Medicine and Molecular Pharmacology (Incubation), and Shenzhen Key Laboratory of Food Biological Safety Control, The Hong Kong Polytechnic University Shenzhen Research Institute, Shenzhen, 518057, China
| | - Yin Zhou
- State Key Laboratory of Chemical Biology and Drug Discovery, and Department of Applied Biology and Chemical Technology, The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong Special Administrative Region of China; Research Institute for Future Food, and Research Center for Chinese Medicine Innovation, The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong Special Administrative Region of China; State Key Laboratory of Chinese Medicine and Molecular Pharmacology (Incubation), and Shenzhen Key Laboratory of Food Biological Safety Control, The Hong Kong Polytechnic University Shenzhen Research Institute, Shenzhen, 518057, China
| | - Zhong-Ping Yao
- State Key Laboratory of Chemical Biology and Drug Discovery, and Department of Applied Biology and Chemical Technology, The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong Special Administrative Region of China; Research Institute for Future Food, and Research Center for Chinese Medicine Innovation, The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong Special Administrative Region of China; State Key Laboratory of Chinese Medicine and Molecular Pharmacology (Incubation), and Shenzhen Key Laboratory of Food Biological Safety Control, The Hong Kong Polytechnic University Shenzhen Research Institute, Shenzhen, 518057, China.
| |
|
20
|
Claushuis B, Cordfunke RA, de Ru AH, Otte A, van Leeuwen HC, Klychnikov OI, van Veelen PA, Corver J, Drijfhout JW, Hensbergen PJ. In-Depth Specificity Profiling of Endopeptidases Using Dedicated Mix-and-Split Synthetic Peptide Libraries and Mass Spectrometry. Anal Chem 2023; 95:11621-11631. [PMID: 37495545 PMCID: PMC10413326 DOI: 10.1021/acs.analchem.3c01215] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/20/2023] [Accepted: 07/10/2023] [Indexed: 07/28/2023]
Abstract
Proteases comprise the class of enzymes that catalyzes the hydrolysis of peptide bonds, thereby playing a pivotal role in many aspects of life. The amino acids surrounding the scissile bond determine the susceptibility toward protease-mediated hydrolysis. A detailed understanding of the cleavage specificity of a protease can lead to the identification of its endogenous substrates, while it is also essential for the design of inhibitors. Although many methods for protease activity and specificity profiling exist, none of these combine the advantages of combinatorial synthetic libraries, i.e., high diversity, equimolar concentration, custom design regarding peptide length, and randomization, with the sensitivity and detection power of mass spectrometry. Here, we developed such a method and applied it to study a group of bacterial metalloproteases that have the unique specificity to cleave between two prolines, i.e., Pro-Pro endopeptidases (PPEPs). We not only confirmed the prime-side specificity of PPEP-1 and PPEP-2, but also revealed some new unexpected peptide substrates. Moreover, we have characterized a new PPEP (PPEP-3) that has a prime-side specificity that is very different from that of the other two PPEPs. Importantly, the approach that we present in this study is generic and can be extended to investigate the specificity of other proteases.
Affiliation(s)
- Bart Claushuis
- Center for Proteomics and Metabolomics, Leiden University Medical Center, Leiden, 2333 ZA, The Netherlands
| | - Robert A. Cordfunke
- Department of Immunology, Leiden University Medical Center, Leiden, 2333 ZA, The Netherlands
| | - Arnoud H. de Ru
- Center for Proteomics and Metabolomics, Leiden University Medical Center, Leiden, 2333 ZA, The Netherlands
| | - Annemarie Otte
- Center for Proteomics and Metabolomics, Leiden University Medical Center, Leiden, 2333 ZA, The Netherlands
| | - Hans C. van Leeuwen
- Department of CBRN Protection, Netherlands Organization for Applied Scientific Research TNO, Rijswijk, 2280 AA, The Netherlands
| | - Oleg I. Klychnikov
- Department of Biochemistry, Moscow State University, Moscow 119991, Russian Federation
| | - Peter A. van Veelen
- Center for Proteomics and Metabolomics, Leiden University Medical Center, Leiden, 2333 ZA, The Netherlands
| | - Jeroen Corver
- Department of Medical Microbiology, Leiden University Medical Center, Leiden, 2333 ZA, The Netherlands
| | - Jan W. Drijfhout
- Department of Immunology, Leiden University Medical Center, Leiden, 2333 ZA, The Netherlands
| | - Paul J. Hensbergen
- Center for Proteomics and Metabolomics, Leiden University Medical Center, Leiden, 2333 ZA, The Netherlands
| |
|
21
|
Leonard TA, Loose M, Martens S. The membrane surface as a platform that organizes cellular and biochemical processes. Dev Cell 2023; 58:1315-1332. [PMID: 37419118 DOI: 10.1016/j.devcel.2023.06.001] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Received: 02/23/2023] [Revised: 04/22/2023] [Accepted: 06/08/2023] [Indexed: 07/09/2023]
Abstract
Membranes are essential for life. They act as semi-permeable boundaries that define cells and organelles. In addition, their surfaces actively participate in biochemical reaction networks, where they confine proteins, align reaction partners, and directly control enzymatic activities. Membrane-localized reactions shape cellular membranes, define the identity of organelles, compartmentalize biochemical processes, and can even be the source of signaling gradients that originate at the plasma membrane and reach into the cytoplasm and nucleus. The membrane surface is, therefore, an essential platform upon which myriad cellular processes are scaffolded. In this review, we summarize our current understanding of the biophysics and biochemistry of membrane-localized reactions with particular focus on insights derived from reconstituted and cellular systems. We discuss how the interplay of cellular factors results in their self-organization, condensation, assembly, and activity, and the emergent properties derived from them.
Affiliation(s)
- Thomas A Leonard
- Max Perutz Labs, Vienna Biocenter Campus (VBC), Dr. Bohr-Gasse 9, 1030, Vienna, Austria; Medical University of Vienna, Center for Medical Biochemistry, Dr. Bohr-Gasse 9, 1030, Vienna, Austria.
| | - Martin Loose
- Institute of Science and Technology Austria, Am Campus 1, 3400 Klosterneuburg, Austria.
| | - Sascha Martens
- Max Perutz Labs, Vienna Biocenter Campus (VBC), Dr. Bohr-Gasse 9, 1030, Vienna, Austria; University of Vienna, Center for Molecular Biology, Department of Biochemistry and Cell Biology, Dr. Bohr-Gasse 9, 1030, Vienna, Austria.
| |
|
22
|
Bader JM, Albrecht V, Mann M. MS-based proteomics of body fluids: The end of the beginning. Mol Cell Proteomics 2023:100577. [PMID: 37209816 PMCID: PMC10388585 DOI: 10.1016/j.mcpro.2023.100577] [Citation(s) in RCA: 18] [Impact Index Per Article: 18.0] [Received: 02/13/2023] [Revised: 05/07/2023] [Accepted: 05/11/2023] [Indexed: 05/22/2023] Open
Abstract
Accurate biomarkers are a crucial and necessary precondition for precision medicine, yet existing ones are often unspecific and new ones have been very slow to enter the clinic. Mass spectrometry (MS)-based proteomics excels by its untargeted nature, specificity of identification and quantification making it an ideal technology for biomarker discovery and routine measurement. It has unique attributes compared to affinity binder technologies, such as OLINK Proximity Extension Assay and SOMAscan. In a previous review we described technological and conceptual limitations that had held back success (Geyer et al., 2017). We proposed a 'rectangular strategy' to better separate true biomarkers by minimizing cohort-specific effects. Today, this has converged with advances in MS-based proteomics technology, such as increased sample throughput, depth of identification and quantification. As a result, biomarker discovery studies have become more successful, producing biomarker candidates that withstand independent verification and, in some cases, already outperform state-of-the-art clinical assays. We summarize developments over the last years, including the benefits of large and independent cohorts, which are necessary for clinical acceptance. They are also required for machine learning or deep learning. Shorter gradients, new scan modes and multiplexing are about to drastically increase throughput, cross-study integration, and quantification, including proxies for absolute levels. We have found that multi-protein panels are inherently more robust than current single analyte tests and better capture the complexity of human phenotypes. Routine MS measurement in the clinic is fast becoming a viable option. The full set of proteins in a body fluid (global proteome) is the most important reference and the best process control. Additionally, it increasingly has all the information that could be obtained from targeted analysis although the latter may be the most straightforward way to enter into regular use. Many challenges remain, not least of a regulatory and ethical nature, but the outlook for MS-based clinical applications has never been brighter.
Affiliation(s)
- Jakob M Bader
- Department of Proteomics and Signal Transduction, Max Planck Institute of Biochemistry, 82152 Martinsried, Germany
| | - Vincent Albrecht
- Department of Proteomics and Signal Transduction, Max Planck Institute of Biochemistry, 82152 Martinsried, Germany
| | - Matthias Mann
- Department of Proteomics and Signal Transduction, Max Planck Institute of Biochemistry, 82152 Martinsried, Germany; Novo Nordisk Foundation Center for Protein Research, Faculty of Health Sciences, University of Copenhagen, 2200 Copenhagen, Denmark.
| |
|
23
|
Admon A. The biogenesis of the immunopeptidome. Semin Immunol 2023; 67:101766. [PMID: 37141766 DOI: 10.1016/j.smim.2023.101766] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 03/08/2023] [Revised: 04/26/2023] [Accepted: 04/26/2023] [Indexed: 05/06/2023]
Abstract
The immunopeptidome is the repertoire of peptides bound and presented by the MHC class I, class II, and non-classical molecules. The peptides are produced by the degradation of most cellular proteins, and in some cases, peptides are produced from extracellular proteins taken up by the cells. This review attempts to first describe some of its known and well-accepted concepts, and next, raise some questions about a few of the established dogmas in this field. The production of novel peptides by splicing is questioned, suggesting here that spliced peptides are extremely rare, if existent at all. The degree of the contribution to the immunopeptidome by proteasomal degradation of cellular proteins is doubted, and this review attempts to explain why this contribution is likely overstated. The contribution of defective ribosome products (DRiPs) and non-canonical peptides to the immunopeptidome is noted and methods are suggested to quantify them. In addition, the common misconception that the MHC class II peptidome is mostly derived from extracellular proteins is noted and corrected. It is stressed that the confirmation of sequence assignments of non-canonical and spliced peptides should rely on targeted mass spectrometry using spiking-in of heavy isotope-labeled peptides. Finally, the new methodologies and modern instrumentation currently available for high throughput kinetics and quantitative immunopeptidomics are described. These advanced methods open up new possibilities for utilizing the big data generated and taking a fresh look at the established dogmas and reevaluating them critically.
Affiliation(s)
- Arie Admon
- Faculty of Biology, Technion-Israel Institute of Technology, Israel.
| |
|