1
Obradović D, Stavrianidi A, Fedorova E, Bogojević A, Shpigun O, Buryak A, Lazović S. A comparative study of the predictive performance of different descriptor calculation tools: Molecular-based elution order modeling and interpretation of retention mechanism for isomeric compounds from METLIN database. J Chromatogr A 2024; 1719:464731. [PMID: 38377661 DOI: 10.1016/j.chroma.2024.464731] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2023] [Revised: 02/08/2024] [Accepted: 02/09/2024] [Indexed: 02/22/2024]
Abstract
In the pharmaceutical industry, the need for analytical standards is a bottleneck for comprehensive evaluation and quality control of intermediate and end products. These are complex mixtures containing structurally related molecules. In this regard, chromatographic peak annotation, especially for critical pairs of isomers and closest structural analogs, can be supported by using a Quantitative Structure Retention Relationship (QSRR) approach. In our study, we investigated the fundamental basis of the reversed-phase (RP) retention mechanism for 1141 isomeric compounds from the METLIN SMRT dataset. Nine different descriptor calculation tools combined with different feature selection methods (genetic algorithm (GA), stepwise, Boruta) and machine learning (ML) approaches (support vector machine (SVM), multiple linear regression (MLR), random forest (RF), XGBoost) were applied to provide a reliable molecular structure-based interpretation of RP retention behaviour of the isomeric compounds. Strict internal and external validation metrics were used to select models with the best predictive capabilities (rtest > 0.73, order of elution > 60 %). For the developed models, mean absolute errors were in the range of 60 to 110 s. Stepwise and GA showed the most suitable performance as descriptor selection methods, while SVM and XGBoost modeling gave satisfactory predictive characteristics in most cases. Validation performed on the published experimental data for structurally related pharmaceutical compounds confirmed the best accuracy of MLR modeling in combination with GA feature selection of general physico-chemical properties. The resulting models will be useful for the prediction of separation and identification of structurally related compounds in pharmaceutical analysis, providing a simultaneous understanding of the interaction mechanisms leading to their retention under RP conditions.
Affiliation(s)
- Darija Obradović
  - Institute of Physics Belgrade, National Institute of the Republic of Serbia, Pregrevica 118, Belgrade 11080, Serbia
- Andrey Stavrianidi
  - Chemistry Department, Lomonosov Moscow State University, 1/3 Leninskie Gory, GSP-1, Moscow 119991, Russia; A.N. Frumkin Institute of Physical Chemistry and Electrochemistry, Russian Academy of Sciences, 31 Leninsky Prospect, GSP-1, Moscow 119071, Russia
- Elizaveta Fedorova
  - A.N. Frumkin Institute of Physical Chemistry and Electrochemistry, Russian Academy of Sciences, 31 Leninsky Prospect, GSP-1, Moscow 119071, Russia
- Aleksandar Bogojević
  - Institute of Physics Belgrade, National Institute of the Republic of Serbia, Pregrevica 118, Belgrade 11080, Serbia
- Oleg Shpigun
  - Chemistry Department, Lomonosov Moscow State University, 1/3 Leninskie Gory, GSP-1, Moscow 119991, Russia
- Aleksey Buryak
  - A.N. Frumkin Institute of Physical Chemistry and Electrochemistry, Russian Academy of Sciences, 31 Leninsky Prospect, GSP-1, Moscow 119071, Russia
- Saša Lazović
  - Institute of Physics Belgrade, National Institute of the Republic of Serbia, Pregrevica 118, Belgrade 11080, Serbia
2
Kumari P, Van Laethem T, Hubert P, Fillet M, Sacré PY, Hubert C. Quantitative Structure Retention-Relationship Modeling: Towards an Innovative General-Purpose Strategy. Molecules 2023; 28:1696. [PMID: 36838689 PMCID: PMC9964055 DOI: 10.3390/molecules28041696] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/04/2023] [Revised: 02/05/2023] [Accepted: 02/08/2023] [Indexed: 02/12/2023] Open
Abstract
Reversed-Phase Liquid Chromatography (RPLC) is a common liquid chromatographic mode used for the control of pharmaceutical compounds during their drug life cycle. Nevertheless, determining the optimal chromatographic conditions that enable this separation is time consuming and requires a lot of lab work. Quantitative Structure Retention Relationship models (QSRR) are helpful for doing this job with minimal time and cost expenditures by predicting retention times of known compounds without performing experiments. In the current work, several QSRR models were built and compared for their adequacy in predicting the retention times. The regression models were based on a combination of linear and non-linear algorithms such as Multiple Linear Regression, Support Vector Regression, Least Absolute Shrinkage and Selection Operator, Random Forest, and Gradient Boosted Regression. Models were built for five pH conditions, i.e., at pH 2.7, 3.5, 6.5, and 8.0. In the end, the model predictions were combined using stacking and the performances of all models were compared. The k-nearest neighbor-based application domain filter was established to assess the reliability of the prediction for further compound prioritization. Altogether, this study can be insightful for analytical chemists working with RPLC to begin with the computational prediction modeling such as QSRR to predict the separation of small molecules.
Affiliation(s)
- Priyanka Kumari
  - Department of Pharmacy, Laboratory of Pharmaceutical Analytical Chemistry, University of Liège (ULiege), CIRM, Quartier Hopital (B36 Tower 4), Avenue Hippocrate, 4000 Liège, Belgium
  - Laboratory for the Analysis of Medicines, University of Liège (ULiege), CIRM, Quartier Hopital (B36 Tower 4), Avenue Hippocrate, 4000 Liège, Belgium
  - Correspondence: (P.K.); (C.H.); Tel.: +32-(0)-43664326 (C.H.)
- Thomas Van Laethem
  - Department of Pharmacy, Laboratory of Pharmaceutical Analytical Chemistry, University of Liège (ULiege), CIRM, Quartier Hopital (B36 Tower 4), Avenue Hippocrate, 4000 Liège, Belgium
  - Laboratory for the Analysis of Medicines, University of Liège (ULiege), CIRM, Quartier Hopital (B36 Tower 4), Avenue Hippocrate, 4000 Liège, Belgium
- Philippe Hubert
  - Department of Pharmacy, Laboratory of Pharmaceutical Analytical Chemistry, University of Liège (ULiege), CIRM, Quartier Hopital (B36 Tower 4), Avenue Hippocrate, 4000 Liège, Belgium
- Marianne Fillet
  - Laboratory for the Analysis of Medicines, University of Liège (ULiege), CIRM, Quartier Hopital (B36 Tower 4), Avenue Hippocrate, 4000 Liège, Belgium
- Pierre-Yves Sacré
  - Department of Pharmacy, Laboratory of Pharmaceutical Analytical Chemistry, University of Liège (ULiege), CIRM, Quartier Hopital (B36 Tower 4), Avenue Hippocrate, 4000 Liège, Belgium
- Cédric Hubert
  - Department of Pharmacy, Laboratory of Pharmaceutical Analytical Chemistry, University of Liège (ULiege), CIRM, Quartier Hopital (B36 Tower 4), Avenue Hippocrate, 4000 Liège, Belgium
  - Correspondence: (P.K.); (C.H.); Tel.: +32-(0)-43664326 (C.H.)
3
Prediction of surface excess adsorption and retention factors in reversed-phase liquid chromatography from molecular dynamics simulations. J Chromatogr A 2022; 1685:463627. [DOI: 10.1016/j.chroma.2022.463627] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/07/2022] [Revised: 10/27/2022] [Accepted: 10/29/2022] [Indexed: 11/06/2022]
4
Gritti F. Perspective on the Future Approaches to Predict Retention in Liquid Chromatography. Anal Chem 2021; 93:5653-5664. [PMID: 33797872 DOI: 10.1021/acs.analchem.0c05078] [Citation(s) in RCA: 31] [Impact Index Per Article: 10.3] [Indexed: 12/15/2022]
Abstract
The demand for rapid column screening, computer-assisted method development and method transfer, and unambiguous compound identification by LC/MS analyses has pushed analysts to adopt experimental protocols and software for the accurate prediction of the retention time in liquid chromatography (LC). This Perspective discusses the classical approaches used to predict retention times in LC over the last three decades and proposes future requirements to increase their accuracy. First, inverse methods for retention prediction are essentially applied during screening and gradient method optimization: a minimum number of experiments or design of experiments (DoE) is run to train and calibrate a model (either purely statistical or based on the principles and fundamentals of liquid chromatography) by a mere fitting process. They do not require the accurate knowledge of the true column hold-up volume V0, system dwell volume Vdwell (in gradient elution), and the retention behavior (k versus the content of strong solvent φ, temperature T, pH, and ionic strength I) of the analytes. Their relative accuracy is often excellent below a few percent. Statistical methods are expected to be the most attractive to handle very complex retention behavior such as in mixed-mode chromatography (MMC). Fundamentally correct retention models accounting for the simultaneous impact of φ, I, pH, and T in MMC are needed for method development based on chromatography principles. Second, direct methods for retention prediction are ideally suited for accurate method transfer from one column/system configuration to another: these quality by design (QbD) methods are based on the fundamentals and principles of solid-liquid adsorption and gradient chromatography. 
No model calibration is necessary; however, they require universal conventions for the accurate determination of true retention factors (for 1 < k < 30) as a function of the experimental variables (φ, T, pH, and I) and of the true column/system parameters (V0, Vdwell, dispersion volume, σ, and relaxation volume, τ, of the programmed gradient profile at the column inlet and gradient distortion at the column outlet). Finally, when the molecular structure of the analytes is either known or assumed, retention prediction has essentially been made on the basis of statistical approaches such as the linear solvation energy relationships (LSERs) and the quantitative structure retention relationships (QSRRs): their ability to accurately predict the retention remains limited within 10-30%. They have been combined with molecular similarity approaches (where the retention model is calibrated with compounds having structures similar to that of the targeted analytes) and artificial intelligence algorithms to further improve their accuracy below 10%. In this Perspective, it is proposed to adopt a more rigorous and fundamental approach by considering the very details of the solid-liquid adsorption process: Monte Carlo (MC) or molecular dynamics (MD) simulations are promising tools to explain and interpret retention data that are too complex to be described by either empirical or statistical retention models.
Affiliation(s)
- Fabrice Gritti
  - Waters Corporation, 34 Maple Street, Milford, Massachusetts 01757, United States
5
Andries JPM, Goodarzi M, Heyden YV. Improvement of quantitative structure-retention relationship models for chromatographic retention prediction of peptides applying individual local partial least squares models. Talanta 2020; 219:121266. [PMID: 32887157 DOI: 10.1016/j.talanta.2020.121266] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 03/07/2020] [Revised: 06/02/2020] [Accepted: 06/03/2020] [Indexed: 10/24/2022]
Abstract
In Reversed-Phase Liquid Chromatography, Quantitative Structure-Retention Relationship (QSRR) models for retention prediction of peptides can be built, starting from large sets of theoretical molecular descriptors. Good predictive QSRR models can be obtained after selecting the most informative descriptors. Reliable retention prediction may be an aid in the correct identification of proteins/peptides in proteomics and in chromatographic method development. Traditionally, global QSRR models are built, using a calibration set containing a representative range of analytes. In this study, a strategy is presented to build individual local Partial Least Squares (PLS) models for peptides, based on selected local calibration samples, most similar to the specific query peptide to be predicted. Similar local calibration peptides are selected from a possible calibration set. The calibration samples with the lowest Euclidean distances to the query peptide are considered the most similar. Two Euclidean distances are investigated as similarity parameters: (i) in the autoscaled descriptor space and (ii) in the PLS factor space of the global calibration samples, both after variable selection by the Final Complexity Adapted Models (FCAM) method. The predictive abilities of individual local QSRR PLS models for peptides, developed with both Euclidean distances, are found to be significantly better than those of two global models, i.e. before and after FCAM variable selection. The predictive abilities of the local models, developed with distances calculated in the PLS factor space, were best.
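The selection step described in this abstract (picking the calibration samples closest to a query in autoscaled descriptor space) can be sketched as follows. This is a minimal illustration only: the function names, the toy two-descriptor data and the choice of `k` are hypothetical, and fitting the local PLS model on the selected samples is not shown.

```python
import math

def autoscale(rows):
    """Mean-centre each descriptor column and scale it to unit standard deviation."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    stds = []
    for c, m in zip(cols, means):
        var = sum((v - m) ** 2 for v in c) / (n - 1)  # sample variance
        stds.append(math.sqrt(var) or 1.0)  # constant columns fall back to 1.0
    scaled = [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]
    return scaled, means, stds

def select_local_calibration(calibration, query, k):
    """Return indices of the k calibration samples nearest to the query (Euclidean distance)."""
    scaled, means, stds = autoscale(calibration)
    # The query is scaled with the calibration-set statistics, not its own.
    q = [(v - m) / s for v, m, s in zip(query, means, stds)]
    distances = [math.dist(row, q) for row in scaled]
    return sorted(range(len(calibration)), key=distances.__getitem__)[:k]

# Hypothetical descriptor table: five calibration samples, two descriptors each.
calibration = [
    [1.0, 10.0],
    [1.2, 11.0],
    [5.0, 50.0],
    [5.1, 49.0],
    [9.0, 90.0],
]
query = [5.05, 50.5]
local_idx = select_local_calibration(calibration, query, k=2)
print(local_idx)
```

A local model would then be trained only on `calibration[i]` for `i` in `local_idx`; the abstract's second variant computes the same distances in PLS factor space instead of descriptor space.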
Affiliation(s)
- Jan P M Andries
  - Research Group Analysis Techniques in the Life Sciences, Avans Hogeschool, University of Professional Education, P.O. Box 90116, 4800, RA Breda, the Netherlands
- Mohammad Goodarzi
  - Department of Biochemistry, University of Texas Southwestern Medical Center, Dallas, TX, 75390, United States
- Yvan Vander Heyden
  - Department of Analytical Chemistry, Applied Chemometrics and Molecular Modelling (FABI), Vrije Universiteit Brussel (VUB), Laarbeeklaan 103, B-1090, Brussels, Belgium
6
Prediction of Chromatographic Elution Order of Analytical Mixtures Based on Quantitative Structure-Retention Relationships and Multi-Objective Optimization. Molecules 2020; 25:3085. [PMID: 32640765 PMCID: PMC7411958 DOI: 10.3390/molecules25133085] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 04/24/2020] [Revised: 06/29/2020] [Accepted: 07/02/2020] [Indexed: 11/16/2022] Open
Abstract
Prediction of the retention time from the molecular structure using quantitative structure-retention relationships is a powerful tool for the development of methods in reversed-phase HPLC. However, its fundamental limitation lies in the fact that low error in the prediction of the retention time does not necessarily guarantee a prediction of the elution order. Here, we propose a new method for the prediction of the elution order from quantitative structure-retention relationships using multi-objective optimization. Two case studies were evaluated: (i) separation of organic molecules in a Supelcosil LC-18 column, and (ii) separation of peptides in seven columns under varying conditions. Results have shown that, when compared to predictions based on the conventional model, the relative root mean square error of the elution order decreases by 48.84%, while the relative root mean square error of the retention time increases by 4.22% on average across both case studies. The predictive ability in terms of both retention time and elution order and the corresponding applicability domains were defined. The models were deemed stable and robust with few to no structural outliers.
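The limitation named in this abstract, that a low retention-time error does not guarantee a correct elution order, can be seen in a small numerical sketch. The retention times and both "models" below are invented for illustration, and the rank-based error is a simplified stand-in rather than the relative RMSE used in the paper.

```python
import math

def rmse(observed, predicted):
    """Root mean square error between two equally long sequences."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed))

def elution_ranks(times):
    """Rank 1 is the first-eluting compound; assumes no tied retention times."""
    order = sorted(range(len(times)), key=times.__getitem__)
    ranks = [0] * len(times)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    return ranks

# Hypothetical retention times in seconds for four compounds.
observed = [60.0, 120.0, 125.0, 300.0]
model_a = [62.0, 124.0, 121.0, 302.0]   # small time errors, but compounds 2 and 3 swap
model_b = [80.0, 140.0, 150.0, 330.0]   # larger time errors, order preserved

for name, pred in (("A", model_a), ("B", model_b)):
    rt_err = rmse(observed, pred)
    order_err = rmse(elution_ranks(observed), elution_ranks(pred))
    print(f"model {name}: RMSE(tR) = {rt_err:.2f} s, RMSE(order) = {order_err:.2f}")
```

Model A wins on retention-time error yet misorders the critical pair, while model B does the reverse. That trade-off is what a multi-objective formulation is meant to balance.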
7
Liu JJ, Alipuly A, Bączek T, Wong MW, Žuvela P. Quantitative Structure-Retention Relationships with Non-Linear Programming for Prediction of Chromatographic Elution Order. Int J Mol Sci 2019; 20:3443. [PMID: 31336981 PMCID: PMC6678770 DOI: 10.3390/ijms20143443] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 06/10/2019] [Revised: 07/07/2019] [Accepted: 07/10/2019] [Indexed: 11/16/2022] Open
Abstract
In this work, we employed a non-linear programming (NLP) approach via quantitative structure–retention relationships (QSRRs) modelling for prediction of elution order in reversed phase-liquid chromatography. With our rapid and efficient approach, error in prediction of retention time is sacrificed in favor of decreasing the error in elution order. Two case studies were evaluated: (i) analysis of 62 organic molecules on the Supelcosil LC-18 column; and (ii) analysis of 98 synthetic peptides on seven reversed phase-liquid chromatography (RP-LC) columns with varied gradients and column temperatures. On average across all the columns, all the chromatographic conditions and all the case studies, percentage root mean square error (%RMSE) of retention time exhibited a relative increase of 29.13%, while the %RMSE of elution order a relative decrease of 37.29%. Therefore, sacrificing %RMSE(tR) led to a considerable increase in the elution order predictive ability of the QSRR models across all the case studies. Results of our preliminary study show that the real value of the developed NLP-based method lies in its ability to easily obtain better-performing QSRR models that can accurately predict both retention time and elution order, even for complex mixtures, such as proteomics and metabolomics mixtures.
Affiliation(s)
- J Jay Liu
  - Department of Chemical Engineering, Pukyong National University, Busan 48-513, Korea
- Alham Alipuly
  - Department of Chemical Engineering, Pukyong National University, Busan 48-513, Korea
- Tomasz Bączek
  - Department of Pharmaceutical Chemistry, Medical University of Gdańsk, Al. Gen. Hallera 107, 80-416 Gdańsk, Poland
- Ming Wah Wong
  - Department of Chemistry, National University of Singapore, 3 Science Drive 3, Singapore 117543, Singapore
- Petar Žuvela
  - Department of Chemistry, National University of Singapore, 3 Science Drive 3, Singapore 117543, Singapore
8
Žuvela P, David J, Yang X, Huang D, Wong MW. Non-Linear Quantitative Structure-Activity Relationships Modelling, Mechanistic Study and In-Silico Design of Flavonoids as Potent Antioxidants. Int J Mol Sci 2019; 20:E2328. [PMID: 31083440 PMCID: PMC6539043 DOI: 10.3390/ijms20092328] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 04/02/2019] [Revised: 05/04/2019] [Accepted: 05/07/2019] [Indexed: 02/05/2023] Open
Abstract
In this work, we developed quantitative structure-activity relationships (QSAR) models for prediction of oxygen radical absorbance capacity (ORAC) of flavonoids. Both linear (partial least squares-PLS) and non-linear models (artificial neural networks-ANNs) were built using parameters of two well-established antioxidant activity mechanisms, namely, the hydrogen atom transfer (HAT) mechanism defined with the minimum bond dissociation enthalpy, and the sequential proton-loss electron transfer (SPLET) mechanism defined with proton affinity and electron transfer enthalpy. Due to pronounced solvent effects within the ORAC assay, the hydration energy was also considered. The four-parameter PLS-QSAR model yielded relatively high root mean square errors (RMSECV = 0.783, RMSEE = 0.668, RMSEP = 0.900). Conversely, the ANN-QSAR model yielded considerably lower errors (RMSEE = 0.180 ± 0.059, RMSEP1 = 0.164 ± 0.128, and RMSEP2 = 0.151 ± 0.114) due to the inherent non-linear relationships between molecular structures of flavonoids and ORAC values. Five-fold cross-validation was found to be unsuitable for the internal validation of the ANN-QSAR model, with a high RMSECV of 0.999 ± 0.253, which is due to the limited sample size, where resampling with replacement is a considerably better alternative. Chemical domains of applicability were defined for both models, confirming their reliability and robustness. Based on the PLS coefficients and partial derivatives, both models were interpreted in terms of the HAT and SPLET mechanisms. Theoretical computations based on density functional theory at the ωB97XD/6-311++G(d,p) level of theory were also carried out to shed further light on the plausible mechanism of anti-peroxy radical activity.
Calculated energetics for simplified models (genistein and quercetin) with peroxyl radical derived from 2,2'-azobis (2-amidino-propane) dihydrochloride suggested that both SPLET and single electron transfer followed by proton loss (SETPL) mechanisms are competitive and more favorable than HAT in aqueous medium. The finding is in good accord with the ANN-based QSAR modelling results. Finally, the strongly predictive ANN-QSAR model was used to predict antioxidant activities for a series of 115 flavonoids designed combinatorially with flavone as a template. Structural trends were analyzed, and general guidelines for synthesis of new flavonoid derivatives with potentially potent antioxidant activities were given.
Affiliation(s)
- Petar Žuvela
  - Department of Chemistry, National University of Singapore, 3 Science Drive 3, Singapore 117543, Singapore
- Jonathan David
  - Department of Chemistry, National University of Singapore, 3 Science Drive 3, Singapore 117543, Singapore
- Xin Yang
  - Food Science and Technology Program, Department of Chemistry, National University of Singapore, 3 Science Drive 3, Singapore 117543, Singapore
- Dejian Huang
  - Food Science and Technology Program, Department of Chemistry, National University of Singapore, 3 Science Drive 3, Singapore 117543, Singapore
- Ming Wah Wong
  - Department of Chemistry, National University of Singapore, 3 Science Drive 3, Singapore 117543, Singapore
9
A Quantitative Structure-Property Relationship Model Based on Chaos-Enhanced Accelerated Particle Swarm Optimization Algorithm and Back Propagation Artificial Neural Network. Applied Sciences-Basel 2018. [DOI: 10.3390/app8071121] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 12/26/2022]
10
Žuvela P, David J, Wong MW. Interpretation of ANN-based QSAR models for prediction of antioxidant activity of flavonoids. J Comput Chem 2018; 39:953-963. [PMID: 29399831 DOI: 10.1002/jcc.25168] [Citation(s) in RCA: 26] [Impact Index Per Article: 4.3] [Received: 12/03/2017] [Revised: 01/04/2018] [Accepted: 01/07/2018] [Indexed: 01/18/2023]
Abstract
Quantitative structure-activity relationships (QSARs) built using machine learning methods, such as artificial neural networks (ANNs), are powerful in prediction of (antioxidant) activity from quantum mechanical (QM) parameters describing the molecular structure, but are usually not interpretable. This obvious difficulty is one of the most common obstacles in application of ANN-based QSAR models for design of potent antioxidants or elucidating the underlying mechanism. Interpreting the resulting models is often omitted or performed erroneously altogether. In this work, a comprehensive comparative study of six methods (PaD, PaD2, weights, stepwise, perturbation and profile) for exploration and interpretation of ANN models built for prediction of Trolox-equivalent antioxidant capacity (TEAC) from QM descriptors is presented. Sum of ranking differences (SRD) was used for ranking of the six methods with respect to the contributions of the calculated QM molecular descriptors toward TEAC. The results show that the PaD, PaD2 and profile methods are the most stable and give rise to realistic interpretation of the observed correlations. Therefore, they are safely applicable for future interpretations without the opinion of an experienced chemist or bio-analyst. © 2018 Wiley Periodicals, Inc.
Affiliation(s)
- Petar Žuvela
  - Department of Chemistry, National University of Singapore, 12 Science Drive 2, Singapore, 11754
- Jonathan David
  - Department of Chemistry, National University of Singapore, 12 Science Drive 2, Singapore, 11754
- Ming Wah Wong
  - Department of Chemistry, National University of Singapore, 12 Science Drive 2, Singapore, 11754
11
Ciura K, Dziomba S, Nowakowska J, Markuszewski MJ. Thin layer chromatography in drug discovery process. J Chromatogr A 2017; 1520:9-22. [DOI: 10.1016/j.chroma.2017.09.015] [Citation(s) in RCA: 31] [Impact Index Per Article: 4.4] [Received: 06/28/2017] [Revised: 08/29/2017] [Accepted: 09/06/2017] [Indexed: 10/18/2022]
12
Locus-specific Retention Predictor (LsRP): A Peptide Retention Time Predictor Developed for Precision Proteomics. Sci Rep 2017; 7:43959. [PMID: 28303880 PMCID: PMC5356008 DOI: 10.1038/srep43959] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 11/01/2016] [Accepted: 01/31/2017] [Indexed: 11/08/2022] Open
Abstract
The precise prediction of peptide retention time (RT) plays an increasingly important role in liquid chromatography-tandem mass spectrometry (LC-MS/MS) based proteomics. Owing to the high reproducibility of liquid chromatography, RT prediction provides promising information for the design of both identification and quantification experiments. In this work, we present a Locus-specific Retention Predictor (LsRP) for precise prediction of peptide RT, which is based on amino acid locus information and the Support Vector Regression (SVR) algorithm. According to the amino acid locus, each peptide sequence was converted to a featured locus vector consisting of zeros and ones. With locus vector information from LC-MS/MS data sets, an SVR computational process was trained and evaluated. LsRP provided a prediction correlation coefficient of 0.95 to 0.99. We compared our method with two common predictors. Results showed that LsRP outperforms these methods and tracked up to 30% more peptides in an extraction RT window of 2 min. A new strategy combining LsRP with the calibration peptide approach was then proposed, which opens up new opportunities for precision proteomics.
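One plausible reading of the "locus vector consisting of zeros and ones" is a position-wise one-hot encoding of the peptide sequence. The sketch below rests on that assumption; the abstract does not specify the exact layout, so the alphabet ordering, the zero padding and the `max_length` parameter are all hypothetical. The SVR training step is not shown.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, in an arbitrary fixed order

def locus_vector(peptide, max_length):
    """Encode a peptide as a flat 0/1 vector: one block of 20 entries per sequence position.

    Positions beyond the peptide's length stay all-zero (padding), which is an
    assumption of this sketch rather than something stated in the abstract.
    """
    if len(peptide) > max_length:
        raise ValueError("peptide longer than max_length")
    vector = [0] * (max_length * len(AMINO_ACIDS))
    for position, residue in enumerate(peptide):
        # AMINO_ACIDS.index raises ValueError for non-standard residues.
        vector[position * len(AMINO_ACIDS) + AMINO_ACIDS.index(residue)] = 1
    return vector

vec = locus_vector("ACDK", max_length=6)
print(len(vec), sum(vec))
```

Vectors of this kind, paired with measured retention times, could serve as the feature matrix for a support vector regressor.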
13
Mikulášek K, Jaroň KS, Kulhánek P, Bittová M, Havliš J. Sequence-dependent separation of trinucleotides by ion-interaction reversed-phase liquid chromatography-A structure-retention study assisted by soft-modelling and molecular dynamics. J Chromatogr A 2016; 1469:88-95. [PMID: 27692640 DOI: 10.1016/j.chroma.2016.09.060] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 05/23/2016] [Revised: 09/22/2016] [Accepted: 09/24/2016] [Indexed: 10/21/2022]
Abstract
We studied sequence-dependent retention properties of synthetic 5'-terminal phosphate absent trinucleotides containing adenine, guanine and thymine through reversed-phase liquid chromatography (RPLC) and QSRR modelling. We investigated the influence of separation conditions, namely mobile phase composition (ion interaction agent content, pH and organic constituent content), on sequence-dependent separation by means of ion-interaction RPLC (II-RPLC) using two types of models: experimental design-artificial neural networks (ED-ANN), and linear regression based on molecular dynamics data. The aim was to determine those properties of the above-mentioned analytes responsible for the retention dependence of the sequence. Our results show that there is a deterministic relation between sequence and II-RPLC retention properties of the studied trinucleotides. Further, we can conclude that the higher the content of ion-interaction agent in the mobile phase, the more prominent these properties are. We also show that if we approximate the polar component of solvation energy in QSRR by the electrostatic work in transferring molecules from vacuum to water, and the non-polar component by the solvent accessible surface area, these parameters best describe the retention properties of trinucleotides. There are some exceptions to this finding, namely sequences 5'-NAN-3', 5'-ANN-3', 5'-TGN-3', 5'-NTA-3' and 5'-NGA-3' (N stands for generic nucleotide). Their role is still unknown, but since linear regression including these specific constellations showed a higher observable variance coverage than the model with only the basic descriptors, we may assume that solvent-analyte interactions are responsible for the exceptional behaviour of 5'-NAN-3' and 5'-ANN-3' trinucleotides and some intramolecular interactions of neighbouring nucleobases for 5'-TGN-3', 5'-NTA-3' and 5'-NGA-3' trinucleotides.
Affiliation(s)
- Kamil Mikulášek
  - Masaryk University, Faculty of Science, Department of Chemistry, Kamenice 5, 62500 Brno, Czech Republic; Masaryk University, CEITEC - Central European Institute of Technology, Kamenice 5, 62500 Brno, Czech Republic
- Kamil S Jaroň
  - Academy of Sciences of the Czech Republic, Institute of Vertebrate Biology, Květná 8, 603 65 Brno, Czech Republic
- Petr Kulhánek
  - Masaryk University, CEITEC - Central European Institute of Technology, Kamenice 5, 62500 Brno, Czech Republic; Masaryk University, Faculty of Science, National Centre of Biomolecular Research, Kamenice 5, 62500 Brno, Czech Republic
- Miroslava Bittová
  - Masaryk University, Faculty of Science, Department of Chemistry, Kamenice 5, 62500 Brno, Czech Republic
- Jan Havliš
  - Masaryk University, CEITEC - Central European Institute of Technology, Kamenice 5, 62500 Brno, Czech Republic; Masaryk University, Faculty of Science, National Centre of Biomolecular Research, Kamenice 5, 62500 Brno, Czech Republic
14
Zhou W, Fan Y, Cai X, Xiang Y, Jiang P, Dai Z, Chen Y, Tan S, Yuan Z. High-accuracy QSAR models of narcosis toxicities of phenols based on various data partition, descriptor selection and modelling methods. RSC Adv 2016. [DOI: 10.1039/c6ra21076g] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 11/21/2022] Open
Abstract
The environmental protection agency thinks that quantitative structure–activity relationship (QSAR) analysis can better replace toxicity tests.
Affiliation(s)
- Wei Zhou
  - Hunan Provincial Key Laboratory for Biology and Control of Plant Diseases and Insect Pests, Hunan Agricultural University, Changsha 410128, P. R. China
  - Hunan Provincial Key Laboratory of Crop Germplasm Innovation and Utilization
- Yanjun Fan
  - Hunan Provincial Key Laboratory for Biology and Control of Plant Diseases and Insect Pests, Hunan Agricultural University, Changsha 410128, P. R. China
- Xunhui Cai
  - Hunan Provincial Key Laboratory for Biology and Control of Plant Diseases and Insect Pests, Hunan Agricultural University, Changsha 410128, P. R. China
- Yan Xiang
  - Hunan Provincial Key Laboratory for Biology and Control of Plant Diseases and Insect Pests, Hunan Agricultural University, Changsha 410128, P. R. China
- Peng Jiang
  - Hunan Provincial Key Laboratory for Biology and Control of Plant Diseases and Insect Pests, Hunan Agricultural University, Changsha 410128, P. R. China
- Zhijun Dai
  - Hunan Provincial Key Laboratory for Biology and Control of Plant Diseases and Insect Pests, Hunan Agricultural University, Changsha 410128, P. R. China
- Yuan Chen
  - Hunan Provincial Key Laboratory for Biology and Control of Plant Diseases and Insect Pests, Hunan Agricultural University, Changsha 410128, P. R. China
- Siqiao Tan
  - Hunan Provincial Key Laboratory for Biology and Control of Plant Diseases and Insect Pests, Hunan Agricultural University, Changsha 410128, P. R. China
- Zheming Yuan
  - Hunan Provincial Key Laboratory for Biology and Control of Plant Diseases and Insect Pests, Hunan Agricultural University, Changsha 410128, P. R. China
  - Hunan Provincial Key Laboratory of Crop Germplasm Innovation and Utilization