1. A Comprehensive Study of Gradient Conditions for Deep Proteome Discovery in a Complex Protein Matrix. Int J Mol Sci 2022; 23:11714. [PMID: 36233016 PMCID: PMC9569591 DOI: 10.3390/ijms231911714] [Received: 08/11/2022] [Revised: 09/26/2022] [Accepted: 09/27/2022]
Abstract
Bottom–up mass-spectrometry-based proteomics is a well-developed technology based on complex peptide mixtures from proteolytic cleavage of proteins and is widely applied in protein identification, characterization, and quantitation. A tims-ToF mass spectrometer is an excellent platform for bottom–up proteomics studies due to its rapid acquisition with high sensitivity. It remains challenging for bottom–up proteomics approaches to achieve 100% proteome coverage. Liquid chromatography (LC) is commonly used prior to mass spectrometry (MS) analysis to fractionate peptide mixtures, and the LC gradient can affect the peptide fractionation and proteome coverage. We investigated the effects of gradient type and time duration to find optimal gradient conditions. Five gradient types (linear, logarithm-like, exponent-like, stepwise, and step-linear), three different gradient lengths (22 min, 44 min, and 66 min), two sample loading amounts (100 ng and 200 ng), and two loading conditions (the use of trap column and no trap column) were studied. The effect of these chromatography variables on protein groups, peptides, and spectral counts using HeLa cell digests was explored. The results indicate that (1) a step-linear gradient performs best among the five gradient types studied; (2) the optimal gradient duration depends on protein sample loading amount; (3) the use of a trap column helps to enhance protein identification, especially low-abundance proteins; (4) MSFragger and PEAKS Studio have high similarity in protein group identification; (5) MSFragger identified more protein groups among the different gradient conditions compared to PEAKS Studio; and (6) combining results from both database search engines can expand identified protein groups by 9–11%.
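The gradient shapes named in this abstract can be pictured as functions that map run time to percent organic solvent (%B). The sketch below is illustrative only: the 2% to 35% B range, the curvature constants `k`, and the number of steps are assumptions rather than values from the study, and all function names are hypothetical. The step-linear profile is left out because the abstract does not define it.

```python
import math

def linear(t, T, lo=2.0, hi=35.0):
    """%B rises at a constant rate from lo to hi over T minutes."""
    return lo + (hi - lo) * t / T

def log_like(t, T, lo=2.0, hi=35.0, k=9.0):
    """Concave profile: fast rise early, flattening late (k is assumed)."""
    return lo + (hi - lo) * math.log1p(k * t / T) / math.log1p(k)

def exp_like(t, T, lo=2.0, hi=35.0, k=3.0):
    """Convex profile: slow rise early, steepening late (k is assumed)."""
    return lo + (hi - lo) * math.expm1(k * t / T) / math.expm1(k)

def stepwise(t, T, lo=2.0, hi=35.0, steps=4):
    """%B held constant within each of `steps` equal time segments."""
    idx = min(int(steps * t / T), steps - 1)
    return lo + (hi - lo) * (idx + 1) / steps

if __name__ == "__main__":
    # Gradient lengths taken from the abstract: 22, 44, and 66 min.
    for T in (22, 44, 66):
        mid = T / 2
        print(T, [round(f(mid, T), 1) for f in (linear, log_like, exp_like, stepwise)])
```

At the midpoint of any run the logarithm-like curve sits above the linear one and the exponent-like curve below it, which is the qualitative difference between these profiles.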
2. Fang M, Wang Z, Cupp-Sutton KA, Welborn T, Smith K, Wu S. High-throughput hydrogen deuterium exchange mass spectrometry (HDX-MS) coupled with subzero-temperature ultrahigh pressure liquid chromatography (UPLC) separation for complex sample analysis. Anal Chim Acta 2021; 1143:65-72. [PMID: 33384131 PMCID: PMC8265693 DOI: 10.1016/j.aca.2020.11.022] [Received: 09/24/2020] [Revised: 10/31/2020] [Accepted: 11/16/2020]
Abstract
Hydrogen deuterium exchange coupled with mass spectrometry (HDX-MS) is a powerful technique for the characterization of protein dynamics and protein interactions. Recent technological developments in the HDX-MS field, such as sub-zero LC separations, large-scale data analysis tools, and efficient protein digestion methods, have allowed for the application of HDX-MS to the analysis of multi protein systems in addition to pure protein analysis. Still, high-throughput HDX-MS analysis of complex samples is not widespread because the co-elution of peptides combined with increased peak complexity after labeling makes peak de-convolution extremely difficult. Here, for the first time, we evaluated and optimized long gradient subzero-temperature ultra-high-pressure liquid chromatography (UPLC) separation conditions for the HDX-MS analysis of complex protein samples such as E. coli cell lysate digest. Under the optimized conditions, we identified 1419 deuterated peptides from 320 proteins at -10 °C, which is about 3-fold more when compared with a 15-min gradient separation under the same conditions. Interestingly, our results suggested that the peptides eluted late in the gradient are well-protected by peptide-column interactions at -10 °C so that peptides eluted even at the end of the gradient maintain high levels of deuteration. Overall, our study suggests that the optimized, sub-zero, long-gradient UPLC separation is capable of characterizing thousands of peptides in a single HDX-MS analysis with low back-exchange rates. As a result, this technique holds great potential for characterizing complex samples such as cell lysates using HDX-MS.
Affiliation(s)
- Mulin Fang: Department of Chemistry and Biochemistry, University of Oklahoma, Norman, OK, 73019, USA
- Zhe Wang: Department of Chemistry and Biochemistry, University of Oklahoma, Norman, OK, 73019, USA
- Kellye A Cupp-Sutton: Department of Chemistry and Biochemistry, University of Oklahoma, Norman, OK, 73019, USA
- Thomas Welborn: Department of Chemistry and Biochemistry, University of Oklahoma, Norman, OK, 73019, USA
- Kenneth Smith: Department of Arthritis and Clinical Immunology, Oklahoma Medical Research Foundation, Oklahoma City, OK, 73104, USA
- Si Wu: Department of Chemistry and Biochemistry, University of Oklahoma, Norman, OK, 73019, USA
3. Samuelsson J, Eiriksson FF, Åsberg D, Thorsteinsdóttir M, Fornstedt T. Determining gradient conditions for peptide purification in RPLC with machine-learning-based retention time predictions. J Chromatogr A 2019; 1598:92-100. [DOI: 10.1016/j.chroma.2019.03.043] [Received: 02/02/2019] [Revised: 03/20/2019] [Accepted: 03/21/2019]
4. Aalizadeh R, Nika MC, Thomaidis NS. Development and application of retention time prediction models in the suspect and non-target screening of emerging contaminants. J Hazard Mater 2019; 363:277-285. [PMID: 30312924 DOI: 10.1016/j.jhazmat.2018.09.047] [Received: 08/02/2018] [Revised: 09/16/2018] [Accepted: 09/17/2018]
Abstract
Hydrophilic interaction liquid chromatography (HILIC) and reversed phase LC (RPLC) coupled to high resolution mass spectrometry (HRMS) are widely used for the identification of suspects and unknown compounds in the environment. For the identification of unknowns, apart from mass accuracy and isotopic fitting, retention time (tR) and MS/MS spectra evaluation is required. In this context, a novel comprehensive workflow was developed to study the tR behavior of large groups of emerging contaminants using Quantitative Structure-Retention Relationships (QSRR). A total of 682 compounds were analyzed by HILIC-HRMS in positive electrospray ionization (ESI) mode. Moreover, an extensive dataset was built for RPLC-HRMS including 1830 and 308 compounds for positive and negative ESI, respectively. Support Vector Machines (SVM) were used to model the tR data. The applicability domains of the models were studied by Monte Carlo Sampling (MCS) methods. The MCS method was also used to calculate the acceptable error windows for the predicted tR from various LC conditions. This paper provides validated models for predicting tR in HILIC/RPLC-HRMS platforms to facilitate identification of new emerging contaminants by suspect and non-target HRMS screening. The models were applied to the identification of transformation products (TPs) of emerging contaminants and biocides in wastewater and sludge.
Affiliation(s)
- Reza Aalizadeh: Laboratory of Analytical Chemistry, Department of Chemistry, National and Kapodistrian University of Athens, Panepistimiopolis Zographou, 15771, Athens, Greece
- Maria-Christina Nika: Laboratory of Analytical Chemistry, Department of Chemistry, National and Kapodistrian University of Athens, Panepistimiopolis Zographou, 15771, Athens, Greece
- Nikolaos S Thomaidis: Laboratory of Analytical Chemistry, Department of Chemistry, National and Kapodistrian University of Athens, Panepistimiopolis Zographou, 15771, Athens, Greece
5. Lobas AA, Levitsky LI, Fichtenbaum A, Surin AK, Pridatchenko ML, Mitulovic G, Gorshkov AV, Gorshkov MV. Predictive Liquid Chromatography of Peptides Based on Hydrophilic Interactions for Mass Spectrometry-Based Proteomics. J Anal Chem 2018. [DOI: 10.1134/s1061934817140076]
6. Parr MK, Schmidt AH. Life cycle management of analytical methods. J Pharm Biomed Anal 2018; 147:506-517. [DOI: 10.1016/j.jpba.2017.06.020] [Received: 05/14/2017] [Revised: 06/10/2017] [Accepted: 06/12/2017]
7. Mikulášek K, Jaroň KS, Kulhánek P, Bittová M, Havliš J. Sequence-dependent separation of trinucleotides by ion-interaction reversed-phase liquid chromatography: A structure-retention study assisted by soft-modelling and molecular dynamics. J Chromatogr A 2016; 1469:88-95. [PMID: 27692640 DOI: 10.1016/j.chroma.2016.09.060] [Received: 05/23/2016] [Revised: 09/22/2016] [Accepted: 09/24/2016]
Abstract
We studied sequence-dependent retention properties of synthetic 5'-terminal phosphate absent trinucleotides containing adenine, guanine and thymine through reversed-phase liquid chromatography (RPLC) and QSRR modelling. We investigated the influence of separation conditions, namely mobile phase composition (ion interaction agent content, pH and organic constituent content), on sequence-dependent separation by means of ion-interaction RPLC (II-RPLC) using two types of models: experimental design-artificial neural networks (ED-ANN), and linear regression based on molecular dynamics data. The aim was to determine those properties of the above-mentioned analytes responsible for the retention dependence of the sequence. Our results show that there is a deterministic relation between sequence and II-RPLC retention properties of the studied trinucleotides. Further, we can conclude that the higher the content of ion-interaction agent in the mobile phase, the more prominent these properties are. We also show that if we approximate the polar component of solvation energy in QSRR by the electrostatic work in transferring molecules from vacuum to water, and the non-polar component by the solvent accessible surface area, these parameters best describe the retention properties of trinucleotides. There are some exceptions to this finding, namely sequences 5'-NAN-3', 5'-ANN-3', 5'-TGN-3', 5'-NTA-3' and 5'-NGA-3' (N stands for generic nucleotide). Their role is still unknown, but since linear regression including these specific constellations showed a higher observable variance coverage than the model with only the basic descriptors, we may assume that solvent-analyte interactions are responsible for the exceptional behaviour of 5'-NAN-3' and 5'-ANN-3' trinucleotides and some intramolecular interactions of neighbouring nucleobases for 5'-TGN-3', 5'-NTA-3' and 5'-NGA-3' trinucleotides.
Affiliation(s)
- Kamil Mikulášek: Masaryk University, Faculty of Science, Department of Chemistry, Kamenice 5, 62500 Brno, Czech Republic; Masaryk University, CEITEC - Central European Institute of Technology, Kamenice 5, 62500 Brno, Czech Republic
- Kamil S Jaroň: Academy of Sciences of the Czech Republic, Institute of Vertebrate Biology, Květná 8, 603 65 Brno, Czech Republic
- Petr Kulhánek: Masaryk University, CEITEC - Central European Institute of Technology, Kamenice 5, 62500 Brno, Czech Republic; Masaryk University, Faculty of Science, National Centre of Biomolecular Research, Kamenice 5, 62500 Brno, Czech Republic
- Miroslava Bittová: Masaryk University, Faculty of Science, Department of Chemistry, Kamenice 5, 62500 Brno, Czech Republic
- Jan Havliš: Masaryk University, CEITEC - Central European Institute of Technology, Kamenice 5, 62500 Brno, Czech Republic; Masaryk University, Faculty of Science, National Centre of Biomolecular Research, Kamenice 5, 62500 Brno, Czech Republic
8. Toward greener analytical techniques for the absolute quantification of peptides in pharmaceutical and biological samples. J Pharm Biomed Anal 2015; 113:181-8. [DOI: 10.1016/j.jpba.2015.03.023] [Received: 02/11/2015] [Revised: 03/19/2015] [Accepted: 03/23/2015]
9. Le Maux S, Nongonierma AB, FitzGerald RJ. Improved short peptide identification using HILIC–MS/MS: Retention time prediction model based on the impact of amino acid position in the peptide sequence. Food Chem 2015; 173:847-54. [DOI: 10.1016/j.foodchem.2014.10.104] [Received: 06/11/2014] [Revised: 10/04/2014] [Accepted: 10/18/2014]
10. Smith R, Mathis AD, Ventura D, Prince JT. Proteomics, lipidomics, metabolomics: a mass spectrometry tutorial from a computer scientist's point of view. BMC Bioinformatics 2014; 15 Suppl 7:S9. [PMID: 25078324 PMCID: PMC4110734 DOI: 10.1186/1471-2105-15-s7-s9]
Abstract
Background: For decades, mass spectrometry data has been analyzed to investigate a wide array of research interests, including disease diagnostics, biological and chemical theory, genomics, and drug development. Progress towards solving any of these disparate problems depends upon overcoming the common challenge of interpreting the large data sets generated. Despite interim successes, many data interpretation problems in mass spectrometry are still challenging. Further, though these challenges are inherently interdisciplinary in nature, the significant domain-specific knowledge gap between disciplines makes interdisciplinary contributions difficult. Results: This paper provides an introduction to the burgeoning field of computational mass spectrometry. We illustrate key concepts, vocabulary, and open problems in MS-omics, as well as provide invaluable resources such as open data sets and key search terms and references. Conclusions: This paper will facilitate contributions from mathematicians, computer scientists, and statisticians to MS-omics that will fundamentally improve results over existing approaches and inform novel algorithmic solutions to open problems.
11. Kelchtermans P, Bittremieux W, De Grave K, Degroeve S, Ramon J, Laukens K, Valkenborg D, Barsnes H, Martens L. Machine learning applications in proteomics research: how the past can boost the future. Proteomics 2014; 14:353-66. [PMID: 24323524 DOI: 10.1002/pmic.201300289] [Received: 07/12/2013] [Revised: 09/24/2013] [Accepted: 10/14/2013]
Abstract
Machine learning is a subdiscipline within artificial intelligence that focuses on algorithms that allow computers to learn to solve a (complex) problem from existing data. This ability can be used to generate a solution to a particularly intractable problem, given that enough data are available to train and subsequently evaluate an algorithm on. Since MS-based proteomics has no shortage of complex problems, and since public data are becoming available in ever-growing amounts, machine learning is fast becoming a very popular tool in the field. We therefore present an overview of the different applications of machine learning in proteomics that together cover nearly the entire wet- and dry-lab workflow, and that address key bottlenecks in experiment planning and design, as well as in data processing and analysis.
Affiliation(s)
- Pieter Kelchtermans: Department of Medical Protein Research, VIB, Ghent, Belgium; Faculty of Medicine and Health Sciences, Department of Biochemistry, Ghent University, Ghent, Belgium; Flemish Institute for Technological Research (VITO), Boeretang, Mol, Belgium
12. Lobas AA, Verenchikov AN, Goloborodko AA, Levitsky LI, Gorshkov MV. Combination of Edman degradation of peptides with liquid chromatography/mass spectrometry workflow for peptide identification in bottom-up proteomics. Rapid Commun Mass Spectrom 2013; 27:391-400. [PMID: 23280970 DOI: 10.1002/rcm.6462] [Received: 08/27/2012] [Revised: 11/01/2012] [Accepted: 11/02/2012]
Abstract
Rationale: High-throughput methods of proteomics are essential for identification of proteins in a cell or tissue under certain conditions. Most of these methods require tandem mass spectrometry (MS/MS). A multidimensional approach including predictive chromatography and partial chemical degradation could be a valuable alternative and/or addition to MS/MS. Methods: In the proposed strategy peptides are identified in a three-dimensional (3D) search space consisting of retention time (RT), mass, and reduced mass after one-step partial Edman degradation. The strategy was evaluated in silico for two databases: baker's yeast and human proteins. Rates of unambiguous identifications were estimated for mass accuracies from 0.001 to 0.05 Da and RT prediction accuracies from 0.1 to 5 min. Rates of Edman reactions were measured for test peptides. Results: A 3D description of proteolytic peptides allowing unambiguous identification without employing MS/MS of up to 95% and 80% of tryptic peptides from the yeast and human proteomes, respectively, was considered. Further extension of the search space to a four-dimensional one by incorporating the second N-terminal amino acid residue as the fourth dimension was also considered and was shown to result in up to 90% of human peptides being identified unambiguously. Conclusions: The proposed 3D search space can be a useful alternative to MS/MS-based peptide identification approach. Experimental implementations of the proposed method within the on-line liquid chromatography/mass spectrometry (LC/MS) and off-line matrix-assisted laser desorption/ionization (MALDI) workflows are in progress.
Affiliation(s)
- Anna A Lobas: Institute for Energy Problems of Chemical Physics, Russian Academy of Sciences, Moscow, Russia
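The 3D search space described in the abstract of entry 12 (mass, reduced mass after one Edman step, and retention time) can be illustrated with a toy uniqueness check. This is a minimal sketch under stated assumptions, not the authors' method: the reduced mass is approximated as the peptide mass minus the N-terminal residue mass, retention times are supplied by the caller rather than predicted, and the function names and example peptides are hypothetical. Residue masses are standard monoisotopic values.

```python
# Monoisotopic residue masses (Da) for the residues used in this toy example.
MONO = {"G": 57.02146, "S": 87.03203, "L": 113.08406,
        "K": 128.09496, "F": 147.06841}
WATER = 18.01056

def peptide_mass(seq):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(MONO[a] for a in seq) + WATER

def coords(seq, rt):
    """(mass, reduced mass, RT); reduced mass assumes loss of the first residue."""
    m = peptide_mass(seq)
    return (m, m - MONO[seq[0]], rt)

def unambiguous(peptides, tol):
    """Return peptides with no neighbour inside the tolerance box.

    peptides: dict mapping sequence -> retention time (min), caller-supplied.
    tol: (mass_tol, reduced_mass_tol, rt_tol).
    """
    pts = {s: coords(s, rt) for s, rt in peptides.items()}
    unique = []
    for s, p in pts.items():
        clash = any(
            o != s and all(abs(a - b) <= t for a, b, t in zip(p, q, tol))
            for o, q in pts.items()
        )
        if not clash:
            unique.append(s)
    return unique

if __name__ == "__main__":
    # GLK and LGK are isobaric and given close (made-up) retention times.
    peps = {"GLK": 10.0, "LGK": 10.2, "SFK": 5.0}
    print(unambiguous(peps, (0.01, 0.01, 1.0)))          # all three dimensions
    print(unambiguous(peps, (0.01, float("inf"), 1.0)))  # reduced mass ignored
```

With the reduced-mass dimension switched off (infinite tolerance), the two isobaric peptides collide; turning it on separates them because their N-terminal residues differ. The tolerances used here, 0.01 Da and 1 min, fall inside the accuracy ranges quoted in the abstract.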
13. Moskovets E, Goloborodko AA, Gorshkov AV, Gorshkov MV. Limitation of predictive 2-D liquid chromatography in reducing the database search space in shotgun proteomics: in silico studies. J Sep Sci 2012; 35:1771-8. [PMID: 22807359 DOI: 10.1002/jssc.201100798]
Abstract
A two-dimensional (2-D) liquid chromatography (LC) separation of complex peptide mixtures that combines a normal phase utilizing hydrophilic interactions and a reversed phase reportedly offers the highest level of 2-D LC orthogonality by providing an even spread of peptides across multiple LC fractions. Matching experimental peptide retention times to those predicted by empirical models describing chromatographic separation in each LC dimension leads to a significant reduction in the database search space. In this work, we calculated the retention times of tryptic peptides separated in the C18 reversed phase at different separation conditions (pH 2 and pH 10) and in TSK gel Amide-80 normal phase. We show that retention times calculated for different 2-D LC separation schemes utilizing these phases start to correlate once the mass range of peptides under analysis becomes progressively narrower. This effect is explained by the high degree of correlation between retention coefficients in the considered phases.
14.
Abstract
Selected reaction monitoring (SRM) has a long history of use in the area of quantitative MS. In recent years, the approach has seen increased application to quantitative proteomics, facilitating multiplexed relative and absolute quantification studies in a variety of organisms. This article discusses SRM, after introducing the context of quantitative proteomics (specifically primarily absolute quantification) where it finds most application, and considers topics such as the theory and advantages of SRM, the selection of peptide surrogates for protein quantification, the design of optimal SRM co-ordinates and the handling of SRM data. A number of published studies are also discussed to demonstrate the impact that SRM has had on the field of quantitative proteomics.
15. On the utility of predictive chromatography to complement mass spectrometry based intact protein identification. Anal Bioanal Chem 2011; 402:2521-9. [PMID: 21901462 DOI: 10.1007/s00216-011-5350-3] [Received: 05/26/2011] [Revised: 07/22/2011] [Accepted: 08/19/2011]
Abstract
The amino acid sequence determines the individual protein three-dimensional structure and its functioning in an organism. Therefore, "reading" a protein sequence and determining its changes due to mutations or post-translational modifications is one of the objectives of proteomic experiments. The commonly utilized approach is gradient high-performance liquid chromatography (HPLC) in combination with tandem mass spectrometry. While serving as a way to simplify the protein mixture, the liquid chromatography may be an additional analytical tool providing complementary information about the protein structure. Previous attempts to develop "predictive" HPLC for large biomacromolecules were limited by empirically derived equations based purely on the adsorption mechanisms of the retention and applicable to relatively small polypeptide molecules. A mechanism of the large biomacromolecule retention in reversed-phase gradient HPLC was described recently in thermodynamics terms by the analytical model of liquid chromatography at critical conditions (BioLCCC). In this work, we applied the BioLCCC model to predict retention of the intact proteins as well as their large proteolytic peptides separated under different HPLC conditions. The specific aim of these proof-of-principle studies was to demonstrate the feasibility of using "predictive" HPLC as a complementary tool to support the analysis of identified intact proteins in top-down, middle-down, and/or targeted selected reaction monitoring (SRM)-based proteomic experiments.
16. Retention Time of Unretained Compound Calculation and Determination of Retention Indices of Some Monosubstituted Benzenes Using a Multiparametric Method in Binary, Ternary and Quaternary Solvent RP-LC Systems. Chromatographia 2011. [DOI: 10.1007/s10337-011-2088-1]
17. Sargaeva NP, Goloborodko AA, O'Connor PB, Moskovets E, Gorshkov MV. Sequence-specific predictive chromatography to assist mass spectrometric analysis of asparagine deamidation and aspartate isomerization in peptides. Electrophoresis 2011; 32:1962-9. [DOI: 10.1002/elps.201000507] [Received: 09/30/2010] [Revised: 12/21/2010] [Accepted: 12/30/2010]