1
Dai Y, Yang Y, Wu E, Shen C, Qiao L. Deep Learning Powers Protein Identification from Precursor MS Information. J Proteome Res 2024; 23:3837-3846. [PMID: 39167422 DOI: 10.1021/acs.jproteome.4c00118] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/23/2024]
Abstract
Proteome analysis currently relies heavily on tandem mass spectrometry (MS/MS), which does not fully utilize MS1 features, as many precursors remain unselected for MS/MS fragmentation, especially in low-abundance samples and in samples with a wide dynamic range of abundance. Therefore, leveraging MS1 features as a complement to MS/MS has become an attractive option to improve the coverage of feature identification. Herein, we propose MonoMS1, an approach combining deep learning-based retention time, ion mobility, and detectability prediction with logistic regression-based scoring for MS1 feature identification. The approach achieved a significant increase in MS1 feature identification based on an E. coli data set. Application of MonoMS1 to data sets with wide dynamic range, such as human serum proteome samples, and with low sample abundance, such as single-cell proteome samples, enabled substantial complementation of MS/MS-based peptide and protein identification. This method opens a new avenue for proteomic analysis and can boost proteomic research on complex samples.
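As a rough illustration of the logistic regression-based scoring mentioned in this abstract, the sketch below combines a retention time error, an ion mobility error, and a predicted detectability into a single score. It is not the MonoMS1 implementation: the feature names, weights, and bias are hypothetical placeholders, and a real model would learn them from labelled matches.

```python
import math

# Hypothetical weights and bias, chosen only for illustration.
# A real model would fit these to labelled target/decoy matches.
WEIGHTS = {"rt_error": -0.8, "im_error": -25.0, "detectability": 3.0}
BIAS = -1.0


def match_score(rt_error_min, im_error, detectability):
    """Return a logistic score in (0, 1) for one peptide-to-feature match.

    rt_error_min  : predicted minus observed retention time, in minutes
    im_error      : predicted minus observed ion mobility (instrument units)
    detectability : predicted detectability of the peptide, in [0, 1]
    """
    z = (
        BIAS
        + WEIGHTS["rt_error"] * abs(rt_error_min)
        + WEIGHTS["im_error"] * abs(im_error)
        + WEIGHTS["detectability"] * detectability
    )
    return 1.0 / (1.0 + math.exp(-z))


# A close match on a detectable peptide versus a distant match on a poorly detectable one.
good = match_score(rt_error_min=0.2, im_error=0.005, detectability=0.9)
poor = match_score(rt_error_min=3.0, im_error=0.05, detectability=0.2)
```

With these placeholder weights, small prediction errors and high detectability push the score toward 1, while large errors push it toward 0.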
Affiliation(s)
- Yameng Dai
  - Department of Chemistry, and Shanghai Stomatological Hospital, Fudan University, Shanghai 200000, China
- Yi Yang
  - ZJU-Hangzhou Global Scientific and Technological Innovation Center, Zhejiang University, Hangzhou 311200, China
- Enhui Wu
  - Department of Chemistry, and Shanghai Stomatological Hospital, Fudan University, Shanghai 200000, China
- Chengpin Shen
  - Shanghai Omicsolution Co., Ltd., Shanghai 201100, China
- Liang Qiao
  - Department of Chemistry, and Shanghai Stomatological Hospital, Fudan University, Shanghai 200000, China
2
Ivanov MV, Kopeykina AS, Gorshkov MV. Reanalysis of DIA Data Demonstrates the Capabilities of MS/MS-Free Proteomics to Reveal New Biological Insights in Disease-Related Samples. Journal of the American Society for Mass Spectrometry 2024; 35:1775-1785. [PMID: 38938158 DOI: 10.1021/jasms.4c00134] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/29/2024]
Abstract
Data-independent acquisition (DIA) at the shortened data acquisition time is becoming a method of choice for quantitative proteomic applications requiring high throughput analysis of large cohorts of samples. With the advent of the combination of high resolution mass spectrometry with an asymmetric track lossless analyzer, these DIA capabilities were further extended with the recent demonstration of quantitative analyses at the speed of up to hundreds of samples per day. In particular, the proteomic data for the brain samples related to multiple system atrophy disease were acquired using 7 and 28 min chromatography gradients (Guzman et al., Nat. Biotech. 2024). In this work, we applied the recently introduced DirectMS1 method to reanalysis of these data using only MS1 spectra. Both DirectMS1 and DIA results were matched against long gradient DDA analysis from the earlier study of the same sample cohort. While the quantitation efficiency of DirectMS1 was comparable with DIA on the same data sets, we found an additional five proteins of biological significance relevant to the analyzed tissue samples. Among the findings, DirectMS1 was able to detect decreased caspase activity for Vimentin protein in the multiple system atrophy samples missed by the MS/MS-based quantitation methods. Our study suggests that DirectMS1 can be an efficient MS1-only addition to the analysis of DIA data in high-throughput quantitative proteomic studies.
Affiliation(s)
- Mark V Ivanov
  - V. L. Talrose Institute for Energy Problems of Chemical Physics, N. N. Semenov Federal Research Center of Chemical Physics, Russian Academy of Sciences, Moscow 119334, Russia
- Anna S Kopeykina
  - V. L. Talrose Institute for Energy Problems of Chemical Physics, N. N. Semenov Federal Research Center of Chemical Physics, Russian Academy of Sciences, Moscow 119334, Russia
- Mikhail V Gorshkov
  - V. L. Talrose Institute for Energy Problems of Chemical Physics, N. N. Semenov Federal Research Center of Chemical Physics, Russian Academy of Sciences, Moscow 119334, Russia
3
Fedorov II, Protasov SA, Tarasova IA, Gorshkov MV. Ultrafast Proteomics. Biochemistry. Biokhimiia 2024; 89:1349-1361. [PMID: 39245450 DOI: 10.1134/s0006297924080017] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/23/2024] [Revised: 06/21/2024] [Accepted: 06/24/2024] [Indexed: 09/10/2024]
Abstract
The current stage of proteomic research in biology, medicine, drug development, population screening, and personalized therapy dictates the need to analyze large sets of samples within a reasonable experimental time. Until recently, mass spectrometry measurements in proteomics were characterized as unique in identifying and quantifying cellular protein composition but low in throughput, requiring many hours to analyze a single sample. This was in conflict with the dynamics of changes in biological systems at the whole cellular proteome level upon the influence of external and internal factors. Thus, the low speed of whole proteome analysis has become the main factor limiting developments in functional proteomics, where it is necessary to annotate intracellular processes not only in a wide range of conditions, but also over a long period of time. The enormous heterogeneity of tissue cells or tumors, even of the same type, dictates the need to analyze biological systems at the level of individual cells. These studies involve obtaining molecular characteristics for tens, if not hundreds of thousands, of individual cells, including their whole proteome profiles. The development of mass spectrometry technologies providing high resolution and mass measurement accuracy, predictive chromatography, new methods for peptide separation by ion mobility, and processing of proteomic data based on artificial intelligence algorithms has opened the way for a significant, if not radical, increase in the throughput of whole proteome analysis and led to implementation of the novel concept of ultrafast proteomics. Work done in just the last few years has demonstrated proteome-wide analysis throughput of several hundred samples per day at a depth of several thousand proteins, levels unimaginable three or four years ago. The review examines the background of these developments, as well as modern methods and approaches that implement ultrafast analysis of the entire proteome.
Affiliation(s)
- Ivan I Fedorov
  - Moscow Institute of Physics and Technology (National University), Dolgoprudny, Moscow Region, 141700, Russia
  - V. L. Talrose Institute for Energy Problems of Chemical Physics, N. N. Semenov Federal Research Center for Chemical Physics, Russian Academy of Sciences, Moscow, 119334, Russia
- Sergey A Protasov
  - Moscow Institute of Physics and Technology (National University), Dolgoprudny, Moscow Region, 141700, Russia
  - V. L. Talrose Institute for Energy Problems of Chemical Physics, N. N. Semenov Federal Research Center for Chemical Physics, Russian Academy of Sciences, Moscow, 119334, Russia
- Irina A Tarasova
  - V. L. Talrose Institute for Energy Problems of Chemical Physics, N. N. Semenov Federal Research Center for Chemical Physics, Russian Academy of Sciences, Moscow, 119334, Russia
- Mikhail V Gorshkov
  - V. L. Talrose Institute for Energy Problems of Chemical Physics, N. N. Semenov Federal Research Center for Chemical Physics, Russian Academy of Sciences, Moscow, 119334, Russia
4
Kusainova TT, Emekeeva DD, Kazakova EM, Gorshkov VA, Kjeldsen F, Kuskov ML, Zhigach AN, Olkhovskaya IP, Bogoslovskaya OA, Glushchenko NN, Tarasova IA. Ultra-Fast Mass Spectrometry in Plant Biochemistry: Response of Winter Wheat Proteomics to Pre-Sowing Treatment with Iron Compounds. Biochemistry. Biokhimiia 2023; 88:1390-1403. [PMID: 37770405 DOI: 10.1134/s0006297923090183] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/05/2023] [Revised: 08/06/2023] [Accepted: 08/09/2023] [Indexed: 09/30/2023]
Abstract
In recent years, ultrafast liquid chromatography/mass spectrometry methods have been extensively developed for use in proteome profiling in biochemical studies. These methods are intended for express monitoring of cell response to biotic stimuli and elucidation of the correlation of molecular changes with biological processes and phenotypical changes. New technologies, including the use of nanomaterials, are actively introduced to increase agricultural production. However, this requires comprehensive testing of new fertilizers and investigation of the mechanisms underlying the biotic effects on the germination, growth, and development of plants. The aim of this work was to adapt the method of ultrafast chromatography/mass spectrometry for rapid quantitative profiling of molecular changes in 7-day-old wheat seedlings in response to pre-sowing seed treatment with iron compounds. The method allows analysis of up to 200 samples per day; its practical value lies in the possibility of express proteomic diagnostics of the biotic action of new treatments, including those intended for agricultural needs. Changes in the regulation of photosynthesis, biosynthesis of chlorophyll and porphyrin- and tetrapyrrole-containing compounds, glycolysis (in shoot tissues), and polysaccharide metabolism (in root tissues) were shown after seed treatment with suspensions containing film-forming polymers (PEG 400, Na-CMC, Na2-EDTA), iron (II, III) nanoparticles, or iron (II) sulfate. Observations at the protein level were consistent with the results of morphometry, superoxide dismutase activity assay, and microelement analysis of 3-day-old germinated seeds and shoots and roots of 7-day-old seedlings. A characteristic molecular signature involving proteins participating in the regulation of photosynthesis and the glycolytic process was suggested as a potential marker of the biotic effects of seed treatment with iron compounds, which will be confirmed in further studies.
Affiliation(s)
- Tomiris T Kusainova
  - Talroze Institute for Energy Problems of Chemical Physics, Semenov Federal Research Center of Chemical Physics, Russian Academy of Sciences, Moscow, 119334, Russia
  - Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region, 141701, Russia
- Daria D Emekeeva
  - Talroze Institute for Energy Problems of Chemical Physics, Semenov Federal Research Center of Chemical Physics, Russian Academy of Sciences, Moscow, 119334, Russia
  - Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region, 141701, Russia
- Elizaveta M Kazakova
  - Talroze Institute for Energy Problems of Chemical Physics, Semenov Federal Research Center of Chemical Physics, Russian Academy of Sciences, Moscow, 119334, Russia
  - Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region, 141701, Russia
- Vladimir A Gorshkov
  - Department of Biochemistry and Molecular Biology, University of Southern Denmark, Odense M, DK-5230, Denmark
- Frank Kjeldsen
  - Department of Biochemistry and Molecular Biology, University of Southern Denmark, Odense M, DK-5230, Denmark
- Mikhail L Kuskov
  - Talroze Institute for Energy Problems of Chemical Physics, Semenov Federal Research Center of Chemical Physics, Russian Academy of Sciences, Moscow, 119334, Russia
- Alexey N Zhigach
  - Talroze Institute for Energy Problems of Chemical Physics, Semenov Federal Research Center of Chemical Physics, Russian Academy of Sciences, Moscow, 119334, Russia
- Irina P Olkhovskaya
  - Talroze Institute for Energy Problems of Chemical Physics, Semenov Federal Research Center of Chemical Physics, Russian Academy of Sciences, Moscow, 119334, Russia
- Olga A Bogoslovskaya
  - Talroze Institute for Energy Problems of Chemical Physics, Semenov Federal Research Center of Chemical Physics, Russian Academy of Sciences, Moscow, 119334, Russia
- Natalia N Glushchenko
  - Talroze Institute for Energy Problems of Chemical Physics, Semenov Federal Research Center of Chemical Physics, Russian Academy of Sciences, Moscow, 119334, Russia
- Irina A Tarasova
  - Talroze Institute for Energy Problems of Chemical Physics, Semenov Federal Research Center of Chemical Physics, Russian Academy of Sciences, Moscow, 119334, Russia
5
Penanes P, Gorshkov V, Ivanov MV, Gorshkov MV, Kjeldsen F. Potential of Negative-Ion-Mode Proteomics: An MS1-Only Approach. J Proteome Res 2023; 22:2734-2742. [PMID: 37395192 PMCID: PMC10407931 DOI: 10.1021/acs.jproteome.3c00307] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 05/22/2023] [Indexed: 07/04/2023]
Abstract
Current proteomics approaches rely almost exclusively on using the positive ionization mode, resulting in inefficient ionization of many acidic peptides. This study investigates protein identification efficiency in the negative ionization mode using the DirectMS1 method. DirectMS1 is an ultrafast data acquisition method based on accurate peptide mass measurements and predicted retention times. Our method achieves the highest rate of protein identification in the negative ion mode to date, identifying over 1000 proteins in a human cell line at a 1% false discovery rate. This is accomplished using a single-shot 10 min separation gradient, comparable to lengthy MS/MS-based analyses. Optimizing separation and experimental conditions was achieved by utilizing mobile buffers containing 2.5 mM imidazole and 3% isopropanol. The study emphasized the complementary nature of data obtained in positive and negative ion modes. Combining the results from all replicates in both polarities increased the number of identified proteins to 1774. Additionally, we analyzed the method's efficiency using different proteases for protein digestion. Among the four studied proteases (LysC, GluC, AspN, and trypsin), trypsin and LysC demonstrated the highest protein identification yield. This suggests that digestion procedures utilized in positive-mode proteomics can be effectively applied in the negative ion mode. Data are deposited to ProteomeXchange: PXD040583.
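The abstract describes identification from accurate peptide mass measurements and predicted retention times. A minimal sketch of that idea follows, assuming a simple tolerance window on both quantities. The tolerances, dictionary keys, and peptide labels are hypothetical, and this is not the DirectMS1 code.

```python
def ppm_error(observed_mass, theoretical_mass):
    """Relative mass error in parts per million."""
    return (observed_mass - theoretical_mass) / theoretical_mass * 1e6


def match_feature(feature, candidates, ppm_tol=5.0, rt_tol_min=1.0):
    """Return labels of candidates whose mass and predicted RT both fall in tolerance.

    feature    : dict with 'mass' (Da) and 'rt' (min) of an observed MS1 feature
    candidates : list of dicts with 'peptide', 'mass' (Da), 'predicted_rt' (min)
    ppm_tol, rt_tol_min : assumed tolerances, not values from the paper
    """
    hits = []
    for cand in candidates:
        mass_ok = abs(ppm_error(feature["mass"], cand["mass"])) <= ppm_tol
        rt_ok = abs(feature["rt"] - cand["predicted_rt"]) <= rt_tol_min
        if mass_ok and rt_ok:
            hits.append(cand["peptide"])
    return hits


# Hypothetical feature and candidate list.
feature = {"mass": 1000.5000, "rt": 4.2}
candidates = [
    {"peptide": "PEPTIDE_A", "mass": 1000.5020, "predicted_rt": 4.5},  # about 2 ppm, RT close
    {"peptide": "PEPTIDE_B", "mass": 1000.5200, "predicted_rt": 4.3},  # about 20 ppm off
    {"peptide": "PEPTIDE_C", "mass": 1000.4990, "predicted_rt": 8.0},  # mass fine, RT far
]
matched = match_feature(feature, candidates)
```

Only the first candidate passes both filters here. In practice several candidates can share a window, which is why a scoring and false discovery rate step is still needed downstream.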
Affiliation(s)
- Pelayo A. Penanes
  - Department of Biochemistry and Molecular Biology, University of Southern Denmark, DK-5230 Odense M, Denmark
- Vladimir Gorshkov
  - Department of Biochemistry and Molecular Biology, University of Southern Denmark, DK-5230 Odense M, Denmark
- Mark V. Ivanov
  - V. L. Talrose Institute for Energy Problems of Chemical Physics, N. N. Semenov Federal Research Center for Chemical Physics, Russian Academy of Sciences, 38 Leninsky Pr., Bld. 2, Moscow 119334, Russia
- Mikhail V. Gorshkov
  - V. L. Talrose Institute for Energy Problems of Chemical Physics, N. N. Semenov Federal Research Center for Chemical Physics, Russian Academy of Sciences, 38 Leninsky Pr., Bld. 2, Moscow 119334, Russia
- Frank Kjeldsen
  - Department of Biochemistry and Molecular Biology, University of Southern Denmark, DK-5230 Odense M, Denmark
6
Solovyeva EM, Bubis JA, Tarasova IA, Lobas AA, Ivanov MV, Nazarov AA, Shutkov IA, Gorshkov MV. On the Feasibility of Using an Ultra-Fast DirectMS1 Method of Proteome-Wide Analysis for Searching Drug Targets in Chemical Proteomics. Biochemistry. Biokhimiia 2022; 87:1342-1353. [PMID: 36509723 DOI: 10.1134/s000629792211013x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
Abstract
Protein quantitation in tissue cells or physiological fluids based on liquid chromatography/mass spectrometry is one of the key sources of information on the mechanisms of cell functioning during chemotherapeutic treatment. Information on significant changes in protein expression upon treatment can be obtained by chemical proteomics and requires analysis of the cellular proteomes, as well as development of experimental and bioinformatic methods for identification of the drug targets. The low throughput of whole proteome analysis based on liquid chromatography and tandem mass spectrometry is one of the main factors limiting the scale of these studies. The method of direct mass spectrometric identification of proteins, DirectMS1, is one of the approaches developed in recent years allowing ultrafast proteome-wide analyses employing minute-scale gradients for separation of proteolytic mixtures. The aim of this work was to evaluate both the possibilities and limitations of the method for identification of drug targets at the level of the whole proteome and for revealing cellular processes activated by the treatment. In particular, the available literature data on chemical proteomics obtained earlier for a large set of onco-pharmaceuticals using multiplex quantitative proteome profiling were analyzed. The results obtained were further compared with the proteome-wide data acquired by the DirectMS1 method using ultrashort separation gradients to evaluate the efficiency of the method in identifying known drug targets. Using ovarian cancer cell line A2780 as an example, a whole-proteome comparison of two cell lysis techniques was performed, including the freeze-thaw lysis commonly employed in chemical proteomics and the one based on ultrasonication for cell disruption, which is widely accepted as a standard in proteomic studies. Also, proteome-wide profiling was performed using the ultrafast DirectMS1 method for the A2780 cell line treated with lonidamine, followed by gene ontology analyses to evaluate the capabilities of the method in revealing regulation of proteins in the cellular processes associated with drug treatment.
Affiliation(s)
- Elizaveta M Solovyeva
  - V. L. Talrose Institute for Energy Problems of Chemical Physics, N. N. Semenov Federal Research Center for Chemical Physics, Russian Academy of Sciences, Moscow, 119334, Russia
- Julia A Bubis
  - V. L. Talrose Institute for Energy Problems of Chemical Physics, N. N. Semenov Federal Research Center for Chemical Physics, Russian Academy of Sciences, Moscow, 119334, Russia
- Irina A Tarasova
  - V. L. Talrose Institute for Energy Problems of Chemical Physics, N. N. Semenov Federal Research Center for Chemical Physics, Russian Academy of Sciences, Moscow, 119334, Russia
- Anna A Lobas
  - V. L. Talrose Institute for Energy Problems of Chemical Physics, N. N. Semenov Federal Research Center for Chemical Physics, Russian Academy of Sciences, Moscow, 119334, Russia
- Mark V Ivanov
  - V. L. Talrose Institute for Energy Problems of Chemical Physics, N. N. Semenov Federal Research Center for Chemical Physics, Russian Academy of Sciences, Moscow, 119334, Russia
- Alexey A Nazarov
  - Faculty of Chemistry, Lomonosov Moscow State University, Moscow, 119991, Russia
- Ilya A Shutkov
  - Faculty of Chemistry, Lomonosov Moscow State University, Moscow, 119991, Russia
- Mikhail V Gorshkov
  - V. L. Talrose Institute for Energy Problems of Chemical Physics, N. N. Semenov Federal Research Center for Chemical Physics, Russian Academy of Sciences, Moscow, 119334, Russia
7
Ivanov MV, Bubis JA, Gorshkov V, Tarasova IA, Levitsky LI, Solovyeva EM, Lipatova AV, Kjeldsen F, Gorshkov MV. DirectMS1Quant: Ultrafast Quantitative Proteomics with MS/MS-Free Mass Spectrometry. Anal Chem 2022; 94:13068-13075. [PMID: 36094425 DOI: 10.1021/acs.analchem.2c02255] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Indexed: 11/29/2022]
Abstract
Recently, we presented the DirectMS1 method of ultrafast proteome-wide analysis based on minute-long LC gradients and MS1-only mass spectra acquisition. Currently, the method provides the depth of human cell proteome coverage of 2500 proteins at a 1% false discovery rate (FDR) when using 5 min LC gradients and 7.3 min runtime in total. While the standard MS/MS approaches provide 4000-5000 protein identifications within a couple of hours of instrumentation time, we advocate here that the higher number of identified proteins does not always translate into better quantitation quality of the proteome analysis. To further elaborate on this issue, we performed a one-on-one comparison of quantitation results obtained using DirectMS1 with three popular MS/MS-based quantitation methods: label-free (LFQ) and tandem mass tag quantitation (TMT), both based on data-dependent acquisition (DDA) and data-independent acquisition (DIA). For comparison, we performed a series of proteome-wide analyses of well-characterized (ground truth) and biologically relevant samples, including a mix of UPS1 proteins spiked at different concentrations into an Escherichia coli digest used as a background and a set of glioblastoma cell lines. MS1-only data was analyzed using a novel quantitation workflow called DirectMS1Quant developed in this work. The results obtained in this study demonstrated comparable quantitation efficiency of 5 min DirectMS1 with both TMT and DIA methods, yet the latter two utilized a 10-20-fold longer instrumentation time.
Affiliation(s)
- Mark V Ivanov
  - V. L. Talrose Institute for Energy Problems of Chemical Physics, N. N. Semenov Federal Research Center of Chemical Physics, Russian Academy of Sciences, 119334 Moscow, Russia
- Julia A Bubis
  - V. L. Talrose Institute for Energy Problems of Chemical Physics, N. N. Semenov Federal Research Center of Chemical Physics, Russian Academy of Sciences, 119334 Moscow, Russia
- Vladimir Gorshkov
  - Department of Biochemistry and Molecular Biology, University of Southern Denmark, DK-5230 Odense M, Denmark
- Irina A Tarasova
  - V. L. Talrose Institute for Energy Problems of Chemical Physics, N. N. Semenov Federal Research Center of Chemical Physics, Russian Academy of Sciences, 119334 Moscow, Russia
- Lev I Levitsky
  - V. L. Talrose Institute for Energy Problems of Chemical Physics, N. N. Semenov Federal Research Center of Chemical Physics, Russian Academy of Sciences, 119334 Moscow, Russia
- Elizaveta M Solovyeva
  - V. L. Talrose Institute for Energy Problems of Chemical Physics, N. N. Semenov Federal Research Center of Chemical Physics, Russian Academy of Sciences, 119334 Moscow, Russia
- Anastasiya V Lipatova
  - Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, 119991 Moscow, Russia
- Frank Kjeldsen
  - Department of Biochemistry and Molecular Biology, University of Southern Denmark, DK-5230 Odense M, Denmark
- Mikhail V Gorshkov
  - V. L. Talrose Institute for Energy Problems of Chemical Physics, N. N. Semenov Federal Research Center of Chemical Physics, Russian Academy of Sciences, 119334 Moscow, Russia
8
Fedorov II, Lineva VI, Tarasova IA, Gorshkov MV. Mass Spectrometry-Based Chemical Proteomics for Drug Target Discoveries. Biochemistry. Biokhimiia 2022; 87:983-994. [PMID: 36180990 DOI: 10.1134/s0006297922090103] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 05/27/2022] [Revised: 07/04/2022] [Accepted: 07/06/2022] [Indexed: 06/16/2023]
Abstract
Chemical proteomics, emerging rapidly in recent years, has become a main approach to identifying interactions between the small molecules and proteins in the cells on a proteome scale and mapping the signaling and/or metabolic pathways activated and regulated by these interactions. The methods of chemical proteomics allow not only identifying proteins targeted by drugs, characterizing their toxicity and discovering possible off-target proteins, but also elucidation of the fundamental mechanisms of cell functioning under conditions of drug exposure or due to the changes in physiological state of the organism itself. Solving these problems is essential for both basic research in biology and clinical practice, including approaches to early diagnosis of various forms of serious diseases or prediction of the effectiveness of therapeutic treatment. At the same time, recent developments in high-resolution mass spectrometry have provided the technology for searching the drug targets across the whole cell proteomes. This review provides a concise description of the main objectives and problems of mass spectrometry-based chemical proteomics, the methods and approaches to their solution, and examples of implementation of these methods in biomedical research.
Affiliation(s)
- Ivan I Fedorov
  - V. L. Talrose Institute for Energy Problems of Chemical Physics, N. N. Semenov Federal Research Center for Chemical Physics, Russian Academy of Sciences, Moscow, 119334, Russia
  - Moscow Institute of Physics and Technology (National University), Dolgoprudny, Moscow Region, 141700, Russia
- Victoria I Lineva
  - Moscow Institute of Physics and Technology (National University), Dolgoprudny, Moscow Region, 141700, Russia
- Irina A Tarasova
  - V. L. Talrose Institute for Energy Problems of Chemical Physics, N. N. Semenov Federal Research Center for Chemical Physics, Russian Academy of Sciences, Moscow, 119334, Russia
- Mikhail V Gorshkov
  - V. L. Talrose Institute for Energy Problems of Chemical Physics, N. N. Semenov Federal Research Center for Chemical Physics, Russian Academy of Sciences, Moscow, 119334, Russia
9
MetaProClust-MS1: an MS1 Profiling Approach for Large-Scale Microbiome Screening. mSystems 2022; 7:e0038122. [PMID: 35950762 PMCID: PMC9426440 DOI: 10.1128/msystems.00381-22] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/24/2022]
Abstract
Metaproteomics is used to explore the functional dynamics of microbial communities. However, acquiring metaproteomic data by tandem mass spectrometry (MS/MS) is time-consuming and resource-intensive, and there is a demand for computational methods that can be used to reduce these resource requirements. We present MetaProClust-MS1, a computational framework for microbiome feature screening developed to prioritize samples for follow-up MS/MS. In this proof-of-concept study, we tested and compared MetaProClust-MS1 results on gut microbiome data, from fecal samples, acquired using short 15-min MS1-only chromatographic gradients and MS1 spectra from longer 60-min gradients to MS/MS-acquired data. We found that MetaProClust-MS1 identified robust gut microbiome responses caused by xenobiotics with significantly correlated cluster topologies of comparable data sets. We also used MetaProClust-MS1 to reanalyze data from both a clinical MS/MS diagnostic study of pediatric patients with inflammatory bowel disease and an experiment evaluating the therapeutic effects of a small molecule on the brain tissue of Alzheimer's disease mouse models. MetaProClust-MS1 clusters could distinguish between inflammatory bowel disease diagnoses (ulcerative colitis and Crohn's disease) using mucosal luminal interface samples and identified hippocampal proteome shifts of Alzheimer's disease mouse models after small-molecule treatment. Therefore, we demonstrate that MetaProClust-MS1 can screen both microbiomes and single-species proteomes using only MS1 profiles, and our results suggest that this approach may be generalizable to any proteomics experiment. MetaProClust-MS1 may be especially useful for large-scale metaproteomic screening for the prioritization of samples for further metaproteomic characterization, using MS/MS, for instance, in addition to being a promising novel approach for clinical diagnostic screening.
IMPORTANCE Growing evidence suggests that human gut microbiome composition and function are highly associated with health and disease. As such, high-throughput metaproteomic studies are becoming more common in gut microbiome research. However, using a conventional long liquid chromatography (LC)-MS/MS gradient metaproteomics approach as an initial screen in large-scale microbiome experiments can be slow and expensive. To combat this challenge, we introduce MetaProClust-MS1, a computational framework for microbiome screening using MS1-only profiles. In this proof-of-concept study, we show that MetaProClust-MS1 identifies clusters of gut microbiome treatments using MS1-only profiles similar to those identified using MS/MS. Our approach allows researchers to prioritize samples and treatments of interest for further metaproteomic analyses and may be generally applicable to any proteomic analysis. In particular, this approach may be especially useful for large-scale metaproteomic screening or in clinical settings where rapid diagnostic evidence is required.
10
Masuda K, Kasahara K, Narumi R, Shimojo M, Shimizu Y. Versatile and multiplexed mass spectrometry-based absolute quantification with cell-free-synthesized internal standard peptides. J Proteomics 2021; 251:104393. [PMID: 34678518 DOI: 10.1016/j.jprot.2021.104393] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/01/2021] [Revised: 10/04/2021] [Accepted: 10/04/2021] [Indexed: 10/20/2022]
Abstract
Preparation of stable isotope-labeled internal standard peptides is crucial for mass spectrometry (MS)-based targeted proteomics. Herein, we developed a versatile and multiplexed absolute protein quantification method using MS. A previously developed method based on the cell-free peptide synthesis system, termed MS-based quantification by isotope-labeled cell-free products (MS-QBiC), was improved for multiple peptide synthesis in a one-pot reaction. We pluralized the quantification tags used for the quantification of synthesized peptides and thus made it possible to use cell-free synthesized isotope-labeled peptides as mixtures for absolute quantification. The improved multiplexed MS-QBiC method was shown to be applicable to clarifying ribosomal protein stoichiometry in the ribosomal subunit, one of the largest cellular complexes. The study demonstrates that the developed method enables the preparation of several dozen and even several hundred internal standard peptides within a few days for quantification of multiple proteins with only a single run of MS analysis. SIGNIFICANCE: The developed method can be applied for the preparation of internal standard peptides without limiting the number of peptides to be synthesized, which may result in more practical screening of quantitatively reliable peptides, one of the fundamental steps in reliable absolute quantification using MS. Furthermore, the method is highly versatile for proteome analysis of any organism or species without any cDNA or SIL peptide libraries. The quantification can be finished in a few days, including design and preparation of appropriate SIL peptides using small-scale batch cell-free reactions, which has the potential to become part of the standard methodology in the field of quantitative proteomics.
Affiliation(s)
- Keiko Masuda
- Laboratory for Cell-Free Protein Synthesis, RIKEN Center for Biosystems Dynamics Research, Suita, Osaka 565-0874, Japan
| | - Keiko Kasahara
- Department of Surgery, Kyoto University Graduate School of Medicine, Sakyo-ku, Kyoto, Kyoto 606-8501, Japan; Laboratory of Proteome Research, National Institutes of Biomedical Innovation, Health and Nutrition, Ibaraki, Osaka 567-0085, Japan
| | - Ryohei Narumi
- Laboratory of Proteome Research, National Institutes of Biomedical Innovation, Health and Nutrition, Ibaraki, Osaka 567-0085, Japan
| | - Masaru Shimojo
- Laboratory for Cell-Free Protein Synthesis, RIKEN Center for Biosystems Dynamics Research, Suita, Osaka 565-0874, Japan
| | - Yoshihiro Shimizu
- Laboratory for Cell-Free Protein Synthesis, RIKEN Center for Biosystems Dynamics Research, Suita, Osaka 565-0874, Japan
| |
|
11
|
Ivanov MV, Bubis JA, Gorshkov V, Abdrakhimov DA, Kjeldsen F, Gorshkov MV. Boosting MS1-only Proteomics with Machine Learning Allows 2000 Protein Identifications in Single-Shot Human Proteome Analysis Using 5 min HPLC Gradient. J Proteome Res 2021; 20:1864-1873. [PMID: 33720732 DOI: 10.1021/acs.jproteome.0c00863] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Indexed: 01/04/2023]
Abstract
Proteome-wide analyses rely on tandem mass spectrometry and the extensive separation of proteolytic mixtures. This imposes considerable instrumental time consumption, which is one of the main obstacles in the broader acceptance of proteomics in biomedical and clinical research. Recently, we presented a fast proteomic method termed DirectMS1 based on ultrashort LC gradients as well as MS1-only mass spectra acquisition and data processing. The method allows significant reduction of the proteome-wide analysis time to a few minutes at the depth of quantitative proteome coverage of 1000 proteins at 1% false discovery rate (FDR). In this work, to further increase the capabilities of the DirectMS1 method, we explored the opportunities presented by the recent progress in the machine-learning area and applied the LightGBM decision tree boosting algorithm to the scoring of peptide feature matches when processing MS1 spectra. Furthermore, we integrated the peptide feature identification algorithm of DirectMS1 with the recently introduced peptide retention time prediction utility, DeepLC. Additional approaches to improve the performance of the DirectMS1 method are discussed and demonstrated, such as using FAIMS for gas-phase ion separation. As a result of all improvements to DirectMS1, we succeeded in identifying more than 2000 proteins at 1% FDR from the HeLa cell line in a 5 min gradient LC-FAIMS/MS1 analysis. The data sets generated and analyzed during the current study have been deposited to the ProteomeXchange Consortium via the PRIDE partner repository with the data set identifier PXD023977.
Affiliation(s)
- Mark V Ivanov
- V. L. Talrose Institute for Energy Problems of Chemical Physics, N. N. Semenov Federal Research Center for Chemical Physics, Russian Academy of Sciences, 38 Leninsky Pr., Bld. 2, Moscow 119334, Russia
| | - Julia A Bubis
- V. L. Talrose Institute for Energy Problems of Chemical Physics, N. N. Semenov Federal Research Center for Chemical Physics, Russian Academy of Sciences, 38 Leninsky Pr., Bld. 2, Moscow 119334, Russia
| | - Vladimir Gorshkov
- Department of Biochemistry and Molecular Biology, University of Southern Denmark, DK-5230 Odense M, Denmark
| | - Daniil A Abdrakhimov
- V. L. Talrose Institute for Energy Problems of Chemical Physics, N. N. Semenov Federal Research Center for Chemical Physics, Russian Academy of Sciences, 38 Leninsky Pr., Bld. 2, Moscow 119334, Russia; Moscow Institute of Physics and Technology, Institutsky lane 9, Dolgoprudny, Moscow Region 141700, Russia
| | - Frank Kjeldsen
- Department of Biochemistry and Molecular Biology, University of Southern Denmark, DK-5230 Odense M, Denmark
| | - Mikhail V Gorshkov
- V. L. Talrose Institute for Energy Problems of Chemical Physics, N. N. Semenov Federal Research Center for Chemical Physics, Russian Academy of Sciences, 38 Leninsky Pr., Bld. 2, Moscow 119334, Russia
| |
|
12
|
Abdrakhimov DA, Bubis JA, Gorshkov V, Kjeldsen F, Gorshkov MV, Ivanov MV. Biosaur: An open-source Python software for liquid chromatography-mass spectrometry peptide feature detection with ion mobility support. Rapid Commun Mass Spectrom 2021:e9045. [PMID: 33450063 DOI: 10.1002/rcm.9045] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 10/01/2020] [Revised: 12/20/2020] [Accepted: 01/04/2021] [Indexed: 06/12/2023]
Abstract
RATIONALE: One of the important steps in initial data processing of peptide mass spectra is the detection of peptide features in full-range mass spectra. Ion mobility offers advantages over previous methods performing this detection by providing an additional structure-specific separation dimension. However, there is a lack of open-source software that utilizes these advantages and detects peptide features in mass spectra acquired along with ion mobility data using new instruments such as timsTOF and/or FAIMS-Orbitrap. METHODS: Recently, a utility called Dinosaur was presented, which provides an efficient way for feature detection in peptide ion mass spectra. In this work, we extended its functionality by developing the Biosaur software to fully employ the additional information provided by ion mobility data. Biosaur was developed using the Python 3.8 programming language. RESULTS: Biosaur supports the processing of data acquired using mass spectrometers with ion mobility capabilities, specifically timsTOF and FAIMS. In addition, it processes mass spectra obtained in negative ion mode and reports a cosine correlation table for peptide features, which is useful for differentiating between in-source fragments and semi-tryptic peptides. CONCLUSIONS: Biosaur is a utility for detecting peptide features in liquid chromatography-mass spectra with ion mobility and negative ion mode support. The software is distributed under the open-source Apache 2.0 license and is freely available on GitHub: https://github.com/abdrakhimov1/Biosaur.
Affiliation(s)
- Daniil A Abdrakhimov
- V. L. Talrose Institute for Energy Problems of Chemical Physics, N. N. Semenov Federal Research Center for Chemical Physics, Russian Academy of Sciences, 38 Leninsky Pr., Bld. 2, Moscow, 119334, Russia
- Moscow Institute of Physics and Technology, National Research University, G. Dolgoprudny, Institutsky Lane 9, Dolgoprudnyj, RU, 141701, Russia
| | - Julia A Bubis
- V. L. Talrose Institute for Energy Problems of Chemical Physics, N. N. Semenov Federal Research Center for Chemical Physics, Russian Academy of Sciences, 38 Leninsky Pr., Bld. 2, Moscow, 119334, Russia
| | - Vladimir Gorshkov
- Department of Biochemistry and Molecular Biology, University of Southern Denmark, Odense M, DK-5230, Denmark
| | - Frank Kjeldsen
- Department of Biochemistry and Molecular Biology, University of Southern Denmark, Odense M, DK-5230, Denmark
| | - Mikhail V Gorshkov
- V. L. Talrose Institute for Energy Problems of Chemical Physics, N. N. Semenov Federal Research Center for Chemical Physics, Russian Academy of Sciences, 38 Leninsky Pr., Bld. 2, Moscow, 119334, Russia
| | - Mark V Ivanov
- V. L. Talrose Institute for Energy Problems of Chemical Physics, N. N. Semenov Federal Research Center for Chemical Physics, Russian Academy of Sciences, 38 Leninsky Pr., Bld. 2, Moscow, 119334, Russia
| |
|
13
|
Ivanov MV, Bubis JA, Gorshkov V, Tarasova IA, Levitsky LI, Lobas AA, Solovyeva EM, Pridatchenko ML, Kjeldsen F, Gorshkov MV. DirectMS1: MS/MS-Free Identification of 1000 Proteins of Cellular Proteomes in 5 Minutes. Anal Chem 2020; 92:4326-4333. [DOI: 10.1021/acs.analchem.9b05095] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Indexed: 02/01/2023]
Affiliation(s)
- Mark V. Ivanov
- V. L. Talrose Institute for Energy Problems of Chemical Physics, N. N. Semenov Federal Research Center for Chemical Physics, Russian Academy of Sciences, 119334 Moscow, Russia
| | - Julia A. Bubis
- V. L. Talrose Institute for Energy Problems of Chemical Physics, N. N. Semenov Federal Research Center for Chemical Physics, Russian Academy of Sciences, 119334 Moscow, Russia
| | - Vladimir Gorshkov
- Department of Biochemistry and Molecular Biology, University of Southern Denmark, Odense M DK-5230, Denmark
| | - Irina A. Tarasova
- V. L. Talrose Institute for Energy Problems of Chemical Physics, N. N. Semenov Federal Research Center for Chemical Physics, Russian Academy of Sciences, 119334 Moscow, Russia
| | - Lev I. Levitsky
- V. L. Talrose Institute for Energy Problems of Chemical Physics, N. N. Semenov Federal Research Center for Chemical Physics, Russian Academy of Sciences, 119334 Moscow, Russia
| | - Anna A. Lobas
- V. L. Talrose Institute for Energy Problems of Chemical Physics, N. N. Semenov Federal Research Center for Chemical Physics, Russian Academy of Sciences, 119334 Moscow, Russia
| | - Elizaveta M. Solovyeva
- V. L. Talrose Institute for Energy Problems of Chemical Physics, N. N. Semenov Federal Research Center for Chemical Physics, Russian Academy of Sciences, 119334 Moscow, Russia
| | - Marina L. Pridatchenko
- V. L. Talrose Institute for Energy Problems of Chemical Physics, N. N. Semenov Federal Research Center for Chemical Physics, Russian Academy of Sciences, 119334 Moscow, Russia
| | - Frank Kjeldsen
- Department of Biochemistry and Molecular Biology, University of Southern Denmark, Odense M DK-5230, Denmark
| | - Mikhail V. Gorshkov
- V. L. Talrose Institute for Energy Problems of Chemical Physics, N. N. Semenov Federal Research Center for Chemical Physics, Russian Academy of Sciences, 119334 Moscow, Russia
- Moscow Institute of Physics and Technology (State University), 141700 Dolgoprudny, Russia
| |
|
14
|
Levitsky LI, Klein JA, Ivanov MV, Gorshkov MV. Pyteomics 4.0: Five Years of Development of a Python Proteomics Framework. J Proteome Res 2019; 18:709-714. [PMID: 30576148 DOI: 10.1021/acs.jproteome.8b00717] [Citation(s) in RCA: 89] [Impact Index Per Article: 17.8] [Indexed: 11/29/2022]
Abstract
Many of the novel ideas that drive today's proteomic technologies are focused essentially on experimental or data-processing workflows. The latter are implemented and published in a number of ways, from custom scripts and programs, to projects built using general-purpose or specialized workflow engines; a large part of routine data processing is performed manually or with custom scripts that remain unpublished. Facilitating the development of reproducible data-processing workflows becomes essential for increasing the efficiency of proteomic research. To assist in overcoming the bioinformatics challenges in the daily practice of proteomic laboratories, 5 years ago we developed and announced Pyteomics, a freely available open-source library providing Python interfaces to proteomic data. We summarize the new functionality of Pyteomics developed during the time since its introduction.
Affiliation(s)
- Lev I Levitsky
- Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region 141701, Russia; V.L. Talrose Institute for Energy Problems of Chemical Physics, Russian Academy of Sciences, Moscow 119334, Russia
| | - Joshua A Klein
- Bioinformatics Program, Boston University, Boston, Massachusetts 02215, United States
| | - Mark V Ivanov
- V.L. Talrose Institute for Energy Problems of Chemical Physics, Russian Academy of Sciences, Moscow 119334, Russia
| | - Mikhail V Gorshkov
- V.L. Talrose Institute for Energy Problems of Chemical Physics, Russian Academy of Sciences, Moscow 119334, Russia
| |
|