1
van Herwerden D, Nikolopoulos A, Barron LP, O'Brien JW, Pirok BWJ, Thomas KV, Samanipour S. Exploring the chemical subspace of RPLC: A data driven approach. Anal Chim Acta 2024; 1317:342869. [PMID: 39029998 DOI: 10.1016/j.aca.2024.342869] [Received: 03/22/2024] [Revised: 06/11/2024] [Accepted: 06/11/2024] [Indexed: 07/21/2024]
Abstract
BACKGROUND: The chemical space comprises a vast number of possible structures, of which an unknown portion makes up the human and environmental exposome. Exposome samples are frequently analyzed using non-targeted analysis via liquid chromatography (LC) coupled to high-resolution mass spectrometry, often employing a reversed phase (RP) column. However, prior to analysis, the contents of these samples are unknown and could comprise thousands of known and unknown chemical constituents. Moreover, it is unknown which part of the chemical space is sufficiently retained and eluted using RPLC. RESULTS: We present a generic framework that uses a data driven approach to predict whether molecules fall 'inside', 'maybe' inside, or 'outside' of the RPLC subspace. Firstly, three retention index random forest (RF) regression models were constructed, which showed that molecular fingerprints are able to predict RPLC retention behavior. Secondly, these models were used to set up the dataset for building an RPLC RF classification model. The RPLC classification model was able to correctly predict whether a chemical belonged to the RPLC subspace with an accuracy of 92% for the testing set. Finally, applying this model to the 91 737 small molecules (i.e., ≤1 000 Da) in NORMAN SusDat showed that 19.1% fall 'outside' of the RPLC subspace. SIGNIFICANCE AND NOVELTY: The RPLC chemical space model provides a major step towards mapping the chemical space and is able to assess whether chemicals can potentially be measured with an RPLC method (i.e., not every RPLC method) or if a different selectivity should be considered. Moreover, knowing which chemicals are outside of the RPLC subspace can assist in reducing potential candidates for library searching and avoid screening for chemicals that will not be present in RPLC data.
Affiliation(s)
- Denice van Herwerden
  - Van 't Hoff Institute for Molecular Sciences (HIMS), University of Amsterdam, Amsterdam, 1098 XH, the Netherlands
- Alexandros Nikolopoulos
  - Van 't Hoff Institute for Molecular Sciences (HIMS), University of Amsterdam, Amsterdam, 1098 XH, the Netherlands
- Leon P Barron
  - Van 't Hoff Institute for Molecular Sciences (HIMS), University of Amsterdam, Amsterdam, 1098 XH, the Netherlands
  - MRC Centre for Environment and Health, Environmental Research Group, School of Public Health, Faculty of Medicine, Imperial College London, London, W12 0BZ, United Kingdom
- Jake W O'Brien
  - Van 't Hoff Institute for Molecular Sciences (HIMS), University of Amsterdam, Amsterdam, 1098 XH, the Netherlands
  - Queensland Alliance for Environmental Health Sciences (QAEHS), The University of Queensland, Brisbane, QLD, 4102, Australia
- Bob W J Pirok
  - Van 't Hoff Institute for Molecular Sciences (HIMS), University of Amsterdam, Amsterdam, 1098 XH, the Netherlands
- Kevin V Thomas
  - Queensland Alliance for Environmental Health Sciences (QAEHS), The University of Queensland, Brisbane, QLD, 4102, Australia
- Saer Samanipour
  - Van 't Hoff Institute for Molecular Sciences (HIMS), University of Amsterdam, Amsterdam, 1098 XH, the Netherlands
  - UvA Data Science Center, University of Amsterdam, Amsterdam, 1012 WP, the Netherlands
2
Bandini E, Castellano Ontiveros R, Kajtazi A, Eghbali H, Lynen F. Physicochemical modelling of the retention mechanism of temperature-responsive polymeric columns for HPLC through machine learning algorithms. J Cheminform 2024; 16:72. [PMID: 38907264 PMCID: PMC11193285 DOI: 10.1186/s13321-024-00873-6] [Received: 08/14/2023] [Accepted: 06/14/2024] [Indexed: 06/23/2024]
Abstract
Temperature-responsive liquid chromatography (TRLC) offers a promising alternative to reversed-phase liquid chromatography (RPLC) for environmentally friendly analytical techniques by utilizing pure water as a mobile phase, eliminating the need for harmful organic solvents. TRLC columns, packed with temperature-responsive polymers coupled to silica particles, exhibit a unique retention mechanism influenced by temperature-induced polymer hydration. An investigation of the physicochemical parameters driving separation at high and low temperatures is crucial for better column manufacturing and selectivity control. Assessment of predictability using a dataset of 139 molecules analyzed at different temperatures elucidated the molecular descriptors (MDs) relevant to retention mechanisms. Linear regression, support vector regression (SVR), and tree-based ensemble models were evaluated, with no standout performer. The precision, accuracy, and robustness of models were validated through metrics, such as r and mean absolute error (MAE), and statistical analysis. At 45 °C, logP predominantly influenced retention, akin to reversed-phase columns, while at 5 °C, complex interactions with lipophilic and negative MDs, along with specific functional groups, dictated retention. These findings provide deeper insights into TRLC mechanisms, facilitating method development and maximizing column potential.
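The two validation metrics named in this abstract can be illustrated with a minimal sketch. It assumes that "r" denotes the Pearson correlation coefficient between observed and predicted retention, which the abstract does not state explicitly. The function names and the retention values are hypothetical and are not taken from the study.

```python
import math


def mean_absolute_error(observed, predicted):
    """Average absolute difference between observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def pearson_r(observed, predicted):
    """Pearson correlation coefficient (assumed meaning of 'r')."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)


# Hypothetical retention values in arbitrary units (illustration only).
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.5, 2.5, 4.5]

mae = mean_absolute_error(observed, predicted)
r = pearson_r(observed, predicted)
print(f"MAE = {mae:.3f}, r = {r:.3f}")
```

Reporting both values is useful because r only measures how well the predictions track the trend, while MAE measures how far off they are in the units of the retention scale.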
Affiliation(s)
- Elena Bandini
  - Separation Science Group, Department of Organic and Macromolecular Chemistry, University of Ghent, Krijgslaan 281 S4bis, Ghent, 9000, Belgium
- Rodrigo Castellano Ontiveros
  - School of Electrical Engineering and Computer Science, KTH Royal Institute of Technology, Stockholm, 11428, Sweden
- Ardiana Kajtazi
  - Separation Science Group, Department of Organic and Macromolecular Chemistry, University of Ghent, Krijgslaan 281 S4bis, Ghent, 9000, Belgium
- Hamed Eghbali
  - Packaging and Specialty Plastics R&D, Dow Benelux B.V., Terneuzen, 4530 AA, the Netherlands
- Frédéric Lynen
  - Separation Science Group, Department of Organic and Macromolecular Chemistry, University of Ghent, Krijgslaan 281 S4bis, Ghent, 9000, Belgium
3
Vosough M, Schmidt TC, Renner G. Non-target screening in water analysis: recent trends of data evaluation, quality assurance, and their future perspectives. Anal Bioanal Chem 2024; 416:2125-2136. [PMID: 38300263 PMCID: PMC10951028 DOI: 10.1007/s00216-024-05153-8] [Received: 10/02/2023] [Revised: 01/12/2024] [Accepted: 01/12/2024] [Indexed: 02/02/2024]
Abstract
This trend article provides an overview of recent advancements in Non-Target Screening (NTS) for water quality assessment, focusing on new methods in data evaluation, qualification, quantification, and quality assurance (QA/QC). It highlights the evolution in NTS data processing, where open-source platforms address challenges in result comparability and data complexity. Advanced chemometrics and machine learning (ML) are pivotal for trend identification and correlation analysis, with a growing emphasis on automated workflows and robust classification models. The article also discusses the rigorous QA/QC measures essential in NTS, such as internal standards, batch effect monitoring, and matrix effect assessment. It examines the progress in quantitative NTS (qNTS), noting advancements in ionization efficiency-based quantification and predictive modeling despite challenges in sample variability and analytical standards. Selected studies illustrate NTS's role in water analysis, combining high-resolution mass spectrometry with chromatographic techniques for enhanced chemical exposure assessment. The article addresses chemical identification and prioritization challenges, highlighting the integration of database searches and computational tools for efficiency. Finally, the article outlines the future research needs in NTS, including establishing comprehensive guidelines, improving QA/QC measures, and reporting results. It underscores the potential to integrate multivariate chemometrics, AI/ML tools, and multi-way methods into NTS workflows and combine various data sources to understand ecosystem health and protection comprehensively.
Affiliation(s)
- Maryam Vosough
  - Instrumental Analytical Chemistry, University of Duisburg-Essen, Universitätsstr. 5, Essen, 45141, North Rhine-Westphalia, Germany
  - Centre for Water and Environmental Research (ZWU), University of Duisburg-Essen, Universitätsstr. 2, Essen, 45141, North Rhine-Westphalia, Germany
  - Department of Clean Technologies, Chemistry and Chemical Engineering Research Center of Iran, P.O. Box 14335-186, Tehran, Iran
- Torsten C Schmidt
  - Instrumental Analytical Chemistry, University of Duisburg-Essen, Universitätsstr. 5, Essen, 45141, North Rhine-Westphalia, Germany
  - Centre for Water and Environmental Research (ZWU), University of Duisburg-Essen, Universitätsstr. 2, Essen, 45141, North Rhine-Westphalia, Germany
  - IWW Water Centre, Moritzstr. 26, Mülheim an der Ruhr, 45476, North Rhine-Westphalia, Germany
- Gerrit Renner
  - Instrumental Analytical Chemistry, University of Duisburg-Essen, Universitätsstr. 5, Essen, 45141, North Rhine-Westphalia, Germany
  - Centre for Water and Environmental Research (ZWU), University of Duisburg-Essen, Universitätsstr. 2, Essen, 45141, North Rhine-Westphalia, Germany
4
van Herwerden D, O’Brien JW, Lege S, Pirok BWJ, Thomas KV, Samanipour S. Cumulative Neutral Loss Model for Fragment Deconvolution in Electrospray Ionization High-Resolution Mass Spectrometry Data. Anal Chem 2023; 95:12247-12255. [PMID: 37549176 PMCID: PMC10448439 DOI: 10.1021/acs.analchem.3c00896] [Received: 02/28/2023] [Accepted: 07/03/2023] [Indexed: 08/09/2023]
Abstract
Clean high-resolution mass spectra (HRMS) are essential to a successful structural elucidation of an unknown feature during nontarget analysis (NTA) workflows. This is a crucial step, particularly for the spectra generated during data-independent acquisition or during direct infusion experiments. The most commonly available tools only take advantage of the time domain for spectral cleanup. Here, we present an algorithm that combines the time domain and mass domain information to perform spectral deconvolution. The algorithm employs a probability-based cumulative neutral loss (CNL) model for fragment deconvolution. The optimized model, with a mass tolerance of 0.005 Da and a scoreCNL threshold of 0.00, was able to achieve a true positive rate (TPr) of 95.0%, a false discovery rate (FDr) of 20.6%, and a reduction rate of 35.4%. Additionally, the CNL model was extensively tested on real samples containing predominantly pesticides at different concentration levels and with matrix effects. Overall, the model was able to obtain a TPr above 88.8% with FD rates between 33 and 79% and reduction rates between 9 and 45%. Finally, the CNL model was compared with the retention time difference method and peak shape correlation analysis, showing that a combination of correlation analysis and the CNL model was the most effective for fragment deconvolution, obtaining a TPr of 84.7%, an FDr of 54.4%, and a reduction rate of 51.0%.
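The three performance figures quoted in this abstract can be made concrete with a small sketch. The definitions used here are assumptions rather than the authors' stated formulas: the true positive rate is taken as the share of true fragments that are kept, the false discovery rate as the share of kept signals that are not true fragments, and the reduction rate as the share of all signals removed from the spectrum. The function name and the signal identifiers are hypothetical.

```python
def deconvolution_metrics(all_signals, true_fragments, kept_signals):
    """Return (TPr, FDr, reduction rate) under the assumed definitions.

    all_signals    -- every signal in the raw spectrum
    true_fragments -- signals known to belong to the compound of interest
    kept_signals   -- signals retained by the deconvolution step
    """
    all_set = set(all_signals)
    true_set = set(true_fragments)
    kept_set = set(kept_signals)
    if not all_set or not true_set:
        raise ValueError("need at least one signal and one true fragment")

    true_positives = kept_set & true_set
    false_positives = kept_set - true_set

    tpr = len(true_positives) / len(true_set)
    # With nothing kept there are no discoveries, so report 0.0 by convention.
    fdr = len(false_positives) / len(kept_set) if kept_set else 0.0
    reduction = 1.0 - len(kept_set) / len(all_set)
    return tpr, fdr, reduction


# Hypothetical spectrum: ten signals labelled 0..9, four of them true fragments.
all_signals = range(10)
true_fragments = [0, 1, 2, 3]
kept_signals = [0, 1, 2, 7, 8]  # three true fragments plus two interferences

tpr, fdr, reduction = deconvolution_metrics(all_signals, true_fragments, kept_signals)
print(f"TPr = {tpr:.1%}, FDr = {fdr:.1%}, reduction = {reduction:.1%}")
```

Under these definitions the three values pull against one another: removing more signals raises the reduction rate but risks discarding true fragments, which lowers the TPr.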
Affiliation(s)
- Denice van Herwerden
  - Van ’t Hoff Institute for Molecular Sciences (HIMS), University of Amsterdam, Amsterdam 1012 WX, The Netherlands
- Jake W. O’Brien
  - Van ’t Hoff Institute for Molecular Sciences (HIMS), University of Amsterdam, Amsterdam 1012 WX, The Netherlands
  - Queensland Alliance for Environmental Health Sciences (QAEHS), The University of Queensland, Brisbane 4102, Australia
- Sascha Lege
  - Agilent Technologies Deutschland GmbH, Waldbronn 76337, Germany
- Bob W. J. Pirok
  - Van ’t Hoff Institute for Molecular Sciences (HIMS), University of Amsterdam, Amsterdam 1012 WX, The Netherlands
- Kevin V. Thomas
  - Queensland Alliance for Environmental Health Sciences (QAEHS), The University of Queensland, Brisbane 4102, Australia
- Saer Samanipour
  - Van ’t Hoff Institute for Molecular Sciences (HIMS), University of Amsterdam, Amsterdam 1012 WX, The Netherlands
  - Queensland Alliance for Environmental Health Sciences (QAEHS), The University of Queensland, Brisbane 4102, Australia
  - UvA Data Science Center, University of Amsterdam, Amsterdam 1012 WP, The Netherlands