1
Sadia M, Boudguiyer Y, Helmus R, Seijo M, Praetorius A, Samanipour S. A stochastic approach for parameter optimization of feature detection algorithms for non-target screening in mass spectrometry. Anal Bioanal Chem 2024:10.1007/s00216-024-05425-3. [PMID: 38995405 DOI: 10.1007/s00216-024-05425-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/12/2024] [Revised: 06/05/2024] [Accepted: 06/18/2024] [Indexed: 07/13/2024]
Abstract
Feature detection plays a crucial role in non-target screening (NTS), requiring careful selection of algorithm parameters to minimize false positive (FP) features. In this study, a stochastic approach was employed to optimize the parameter settings of feature detection algorithms used in processing high-resolution mass spectrometry data. This approach was demonstrated using four open-source algorithms (OpenMS, SAFD, XCMS, and KPIC2) within the patRoon software platform for processing extracts from drinking water samples spiked with 46 per- and polyfluoroalkyl substances (PFAS). The designed method is based on a stochastic strategy involving random sampling from the variable space and the use of Pearson correlation to assess the impact of each parameter on the number of detected suspect analytes. Using our approach, the optimized parameters improved algorithm performance by increasing suspect hits in the case of SAFD and XCMS, and by reducing the total number of detected features (i.e., minimizing FP) for OpenMS. These improvements were further validated on three different drinking water samples as a test dataset. The optimized parameters resulted in a lower false discovery rate (FDR%) compared to the default parameters, effectively increasing the detection of true positive features. This work also highlights the necessity of algorithm parameter optimization prior to starting the NTS to reduce the complexity of such datasets.
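The strategy described in this abstract (random draws from the parameter space, then a Pearson correlation between each parameter and the number of suspect hits) can be illustrated with a minimal Python sketch. Everything named here is an assumption for illustration: the parameter names, their ranges, and `mock_suspect_hits`, which stands in for an actual feature detection run and is not taken from the paper or from patRoon.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


def sample_parameters(space, rng):
    """Draw one random parameter set, uniformly within each (low, high) range."""
    return {name: rng.uniform(low, high) for name, (low, high) in space.items()}


def mock_suspect_hits(params):
    """Hypothetical stand-in for a feature detection run.

    Here the hit count depends only on the signal-to-noise threshold.
    """
    return max(0, round(46 - 4.0 * params["snr_threshold"]))


def rank_parameters(space, evaluate, n_draws=200, seed=0):
    """Correlate each sampled parameter with the resulting number of hits."""
    rng = random.Random(seed)
    draws = [sample_parameters(space, rng) for _ in range(n_draws)]
    hits = [evaluate(p) for p in draws]
    return {name: pearson([d[name] for d in draws], hits) for name in space}


# Hypothetical parameter space: names and ranges are illustrative only.
space = {"snr_threshold": (1.0, 10.0), "min_peak_width_s": (2.0, 20.0)}
correlations = rank_parameters(space, mock_suspect_hits)
for name, r in correlations.items():
    print(f"{name}: r = {r:+.2f}")
```

In this toy setup the threshold parameter shows a strong negative correlation with the hit count, while the unrelated parameter stays near zero, which is the kind of signal such a screening step would use to decide which parameters are worth tuning.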
Affiliation(s)
- Mohammad Sadia
  - Institute for Biodiversity and Ecosystem Dynamics, University of Amsterdam, Amsterdam, The Netherlands
- Youssef Boudguiyer
  - Institute for Biodiversity and Ecosystem Dynamics, University of Amsterdam, Amsterdam, The Netherlands
- Rick Helmus
  - Institute for Biodiversity and Ecosystem Dynamics, University of Amsterdam, Amsterdam, The Netherlands
- Marianne Seijo
  - Institute for Biodiversity and Ecosystem Dynamics, University of Amsterdam, Amsterdam, The Netherlands
- Antonia Praetorius
  - Institute for Biodiversity and Ecosystem Dynamics, University of Amsterdam, Amsterdam, The Netherlands
- Saer Samanipour
  - Van 't Hoff Institute for Molecular Sciences (HIMS), University of Amsterdam, Amsterdam, The Netherlands
2
Reuschenbach M, Drees F, Leupold MS, Tintrop LK, Schmidt TC, Renner G. qPeaks: A Linear Regression-Based Asymmetric Peak Model for Parameter-Free Automatized Detection and Characterization of Chromatographic Peaks in Non-Target Screening Data. Anal Chem 2024; 96:7120-7129. [PMID: 38666514 DOI: 10.1021/acs.analchem.4c00494] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/08/2024]
Abstract
We present qPeaks (quality peaks), a novel, user-parameter-free algorithm for peak detection and peak characterization applicable to chromatographic data. The algorithm is based on a linearizable regression model that analyzes asymmetric peaks and estimates the specific uncertainties associated with the peak regression parameters. The uncertainties of the parameters are used to derive a data quality score DQSpeak, rendering low reliability results more transparent during processing and allowing for the prioritization of generated features. High DQSpeak chromatographic peaks have a lower chance of being classified as false-positive and show higher repeatability over multiple measurements. The high efficiency of the algorithm makes it particularly useful for application within processing routines of nontarget screening through chromatography coupled with high-resolution mass spectrometry. qPeaks is integrated into the qAlgorithms nontarget screening processing toolbox and appends a parameter-free chromatographic peak detection and characterization step to it. With qAlgorithms, now high-resolution mass spectra are centroided using the qCentroids algorithms, centroids are clustered to form extracted ion chromatograms (EICs) with the qBinning algorithm, and chromatographic peaks are found on the generated EICs with qPeaks. However, all tools from qAlgorithms can also be used independently.
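To make the idea of a "linearizable regression model for asymmetric peaks" concrete, here is a hedged sketch of one possible form: the logarithm of a Gaussian-like peak is a parabola, so fitting ln(intensity) with separate quadratic terms left and right of the apex is an ordinary least-squares problem. The model form, the function names, and the synthetic data are assumptions for illustration; the exact model and uncertainty estimation used in qPeaks may differ.

```python
import math


def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_log_parabola(xs, ys):
    """Least-squares fit of ln(y) = b0 + b1*x + b2*x^2 (x < 0) + b3*x^2 (x >= 0).

    x is the position relative to the assumed apex; different b2 and b3
    allow the left and right flanks to have different widths.
    """
    rows = [[1.0, x, x * x if x < 0 else 0.0, x * x if x >= 0 else 0.0] for x in xs]
    logs = [math.log(y) for y in ys]
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(4)] for i in range(4)]
    aty = [sum(r[i] * v for r, v in zip(rows, logs)) for i in range(4)]
    return solve_linear(ata, aty)


# Synthetic, noise-free asymmetric peak (illustrative coefficients only).
true_b = [math.log(1000.0), 0.1, -0.5, -0.2]
xs = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
ys = [
    math.exp(true_b[0] + true_b[1] * x + (true_b[2] if x < 0 else true_b[3]) * x * x)
    for x in xs
]
coeffs = fit_log_parabola(xs, ys)
print([round(c, 4) for c in coeffs])
```

Because the fit is linear in its coefficients, standard regression theory gives coefficient uncertainties directly, which is the kind of information the abstract says is turned into a quality score.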
Affiliation(s)
- Max Reuschenbach
  - Instrumental Analytical Chemistry, University of Duisburg-Essen, Universitätsstr.5, Essen 45141, Germany
  - Centre for Water and Environmental Research (ZWU), University of Duisburg-Essen, Universitätsstr.2, Essen 45141, Germany
- Felix Drees
  - Instrumental Analytical Chemistry, University of Duisburg-Essen, Universitätsstr.5, Essen 45141, Germany
  - Centre for Water and Environmental Research (ZWU), University of Duisburg-Essen, Universitätsstr.2, Essen 45141, Germany
- Michael S Leupold
  - Instrumental Analytical Chemistry, University of Duisburg-Essen, Universitätsstr.5, Essen 45141, Germany
  - Centre for Water and Environmental Research (ZWU), University of Duisburg-Essen, Universitätsstr.2, Essen 45141, Germany
- Lucie K Tintrop
  - Instrumental Analytical Chemistry, University of Duisburg-Essen, Universitätsstr.5, Essen 45141, Germany
  - Centre for Water and Environmental Research (ZWU), University of Duisburg-Essen, Universitätsstr.2, Essen 45141, Germany
- Torsten C Schmidt
  - Instrumental Analytical Chemistry, University of Duisburg-Essen, Universitätsstr.5, Essen 45141, Germany
  - Centre for Water and Environmental Research (ZWU), University of Duisburg-Essen, Universitätsstr.2, Essen 45141, Germany
  - IWW Water Center, Moritzstr.26, Mülheim an der Ruhr 45476, Germany
- Gerrit Renner
  - Instrumental Analytical Chemistry, University of Duisburg-Essen, Universitätsstr.5, Essen 45141, Germany
  - Centre for Water and Environmental Research (ZWU), University of Duisburg-Essen, Universitätsstr.2, Essen 45141, Germany
3
Schulze B, Heffernan AL, Samanipour S, Gomez Ramos MJ, Veal C, Thomas KV, Kaserzon SL. Is Nontarget Analysis Ready for Regulatory Application? Influence of Peak-Picking Algorithms on Data Analysis. Anal Chem 2023; 95:18361-18369. [PMID: 38061068 DOI: 10.1021/acs.analchem.3c03003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/20/2023]
Abstract
The use of peak-picking algorithms is an essential step in all nontarget analysis (NTA) workflows. However, algorithm choice may influence reliability and reproducibility of results. Using a real-world data set, the aim of this study was to investigate how different peak-picking algorithms influence NTA results when exploring temporal and/or spatial trends. For this, drinking water catchment monitoring data, using passive samplers collected twice per year across Southeast Queensland, Australia (n = 18 sites) between 2014 and 2019, was investigated. Data were acquired using liquid chromatography coupled to high-resolution mass spectrometry. Peak picking was performed using five different programs/algorithms (SCIEX OS, MSDial, self-adjusting-feature-detection, two algorithms within MarkerView), keeping parameters identical whenever possible. The resulting feature lists revealed low overlap: 7.2% of features were picked by >3 algorithms, while 74% of features were only picked by a single algorithm. Trend evaluation of the data, using principal component analysis, showed significant variability between the approaches, with only one temporal and no spatial trend being identified by all algorithms. Manual evaluation of features of interest (p-value <0.01, log fold change >2) for one sampling site revealed high rates of incorrectly picked peaks (>70%) for three algorithms. Lower rates (<30%) were observed for the other algorithms, but with the caveat of not successfully picking all internal standards used as quality control. The choice is therefore currently between comprehensive and strict peak picking, either resulting in increased noise or missed peaks, respectively. Reproducibility of NTA results remains challenging when applied for regulatory frameworks.
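The overlap figures quoted in this abstract (share of features picked by more than three algorithms versus by only one) come down to counting, for each feature, how many algorithms reported it. The sketch below assumes the feature lists have already been aligned to shared identifiers; a real workflow would first match features across algorithms by m/z and retention time tolerances. The algorithm labels and feature identifiers are made up for illustration.

```python
from collections import Counter


def overlap_summary(feature_lists):
    """Summarise how many algorithms picked each (already aligned) feature.

    feature_lists maps an algorithm label to an iterable of feature identifiers.
    """
    counts = Counter()
    for features in feature_lists.values():
        counts.update(set(features))  # count each feature once per algorithm
    total = len(counts)
    if total == 0:
        return {"total": 0, "pct_single": 0.0, "pct_gt3": 0.0}
    single = sum(1 for c in counts.values() if c == 1)
    more_than_three = sum(1 for c in counts.values() if c > 3)
    return {
        "total": total,
        "pct_single": 100.0 * single / total,
        "pct_gt3": 100.0 * more_than_three / total,
    }


# Toy example with five hypothetical algorithms.
toy_lists = {
    "algo_A": ["f1", "f2", "f3"],
    "algo_B": ["f1", "f2", "f4"],
    "algo_C": ["f1", "f2", "f5"],
    "algo_D": ["f1", "f6"],
    "algo_E": ["f1", "f7"],
}
summary = overlap_summary(toy_lists)
print(summary)
```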
Affiliation(s)
- Bastian Schulze
  - Queensland Alliance for Environmental Health Sciences (QAEHS), The University of Queensland, 20 Cornwall Street, Woolloongabba, QLD 4102, Australia
- Amy L Heffernan
  - Queensland Alliance for Environmental Health Sciences (QAEHS), The University of Queensland, 20 Cornwall Street, Woolloongabba, QLD 4102, Australia
- Saer Samanipour
  - Queensland Alliance for Environmental Health Sciences (QAEHS), The University of Queensland, 20 Cornwall Street, Woolloongabba, QLD 4102, Australia
  - Van 't Hoff Institute for Molecular Sciences (HIMS), University of Amsterdam, Science Park 904, 1098 XH Amsterdam, The Netherlands
- Maria Jose Gomez Ramos
  - Chemistry and Physics Department, University of Almeria, Agrifood Campus of International Excellence (ceiA3), 04120 Almería, Spain
- Cameron Veal
  - Seqwater, 117 Brisbane Street, Ipswich, QLD 4305, Australia
  - UQ School of Civil Engineering, The University of Queensland, Building 49 Advanced Engineering Building, Staff House Road, St Lucia, QLD 4072, Australia
- Kevin V Thomas
  - Queensland Alliance for Environmental Health Sciences (QAEHS), The University of Queensland, 20 Cornwall Street, Woolloongabba, QLD 4102, Australia
- Sarit L Kaserzon
  - Queensland Alliance for Environmental Health Sciences (QAEHS), The University of Queensland, 20 Cornwall Street, Woolloongabba, QLD 4102, Australia
4
Reuschenbach M, Drees F, Schmidt TC, Renner G. qBinning: Data Quality-Based Algorithm for Automized Ion Chromatogram Extraction from High-Resolution Mass Spectrometry. Anal Chem 2023; 95:13804-13812. [PMID: 37658322 DOI: 10.1021/acs.analchem.3c01079] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/03/2023]
Abstract
Due to the complexity and volume of data generated through non-target screening (NTS) using chromatographic couplings with high-resolution mass spectrometry, automized processing routines are necessary. The processing routines usually consist of many individual steps that are user-parameter-dependent and, thus, require labor-intensive optimization. Additionally, the effect of variations in raw data quality on the processing results is unclear and not fully understood. Within this work, we present qBinning, a novel algorithm for constructing extracted ion chromatograms (EICs) based on statistical principles and, thus, without the need to set user parameters. Furthermore, we give the user feedback on the specific qualities of the generated EICs using a scoring system (DQSbin). The DQSbin measures reliability as it correlates with the probability of correct classification of masses into EICs and the degree of overlap between different EIC construction algorithms. This work is a big step forward in understanding the behavior of NTS data and increasing the overall transparency in the results of NTS.
Affiliation(s)
- Max Reuschenbach
  - Instrumental Analytical Chemistry, University of Duisburg-Essen, Universitätsstr. 5, 45141 Essen, Germany
  - Centre for Water and Environmental Research (ZWU), University of Duisburg-Essen, Universitätsstr. 2, 45141 Essen, Germany
- Felix Drees
  - Instrumental Analytical Chemistry, University of Duisburg-Essen, Universitätsstr. 5, 45141 Essen, Germany
  - Centre for Water and Environmental Research (ZWU), University of Duisburg-Essen, Universitätsstr. 2, 45141 Essen, Germany
- Torsten C Schmidt
  - Instrumental Analytical Chemistry, University of Duisburg-Essen, Universitätsstr. 5, 45141 Essen, Germany
  - Centre for Water and Environmental Research (ZWU), University of Duisburg-Essen, Universitätsstr. 2, 45141 Essen, Germany
  - IWW Water Center, Moritzstr. 26, 45476 Mülheim an der Ruhr, Germany
- Gerrit Renner
  - Instrumental Analytical Chemistry, University of Duisburg-Essen, Universitätsstr. 5, 45141 Essen, Germany
  - Centre for Water and Environmental Research (ZWU), University of Duisburg-Essen, Universitätsstr. 2, 45141 Essen, Germany
5
van Herwerden D, O’Brien JW, Lege S, Pirok BWJ, Thomas KV, Samanipour S. Cumulative Neutral Loss Model for Fragment Deconvolution in Electrospray Ionization High-Resolution Mass Spectrometry Data. Anal Chem 2023; 95:12247-12255. [PMID: 37549176 PMCID: PMC10448439 DOI: 10.1021/acs.analchem.3c00896] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/28/2023] [Accepted: 07/03/2023] [Indexed: 08/09/2023]
Abstract
Clean high-resolution mass spectra (HRMS) are essential to a successful structural elucidation of an unknown feature during nontarget analysis (NTA) workflows. This is a crucial step, particularly for the spectra generated during data-independent acquisition or during direct infusion experiments. The most commonly available tools only take advantage of the time domain for spectral cleanup. Here, we present an algorithm that combines the time domain and mass domain information to perform spectral deconvolution. The algorithm employs a probability-based cumulative neutral loss (CNL) model for fragment deconvolution. The optimized model, with a mass tolerance of 0.005 Da and a scoreCNL threshold of 0.00, was able to achieve a true positive rate (TPr) of 95.0%, a false discovery rate (FDr) of 20.6%, and a reduction rate of 35.4%. Additionally, the CNL model was extensively tested on real samples containing predominantly pesticides at different concentration levels and with matrix effects. Overall, the model was able to obtain a TPr above 88.8% with FD rates between 33 and 79% and reduction rates between 9 and 45%. Finally, the CNL model was compared with the retention time difference method and peak shape correlation analysis, showing that a combination of correlation analysis and the CNL model was the most effective for fragment deconvolution, obtaining a TPr of 84.7%, an FDr of 54.4%, and a reduction rate of 51.0%.
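The three performance figures reported in this abstract can be illustrated with a short sketch. The definitions used here are common conventions and are assumptions, not taken from the paper: TPr = TP / (TP + FN), FDr = FP / (TP + FP), and reduction rate = fraction of raw fragments removed by the deconvolution. Fragments are matched within the 0.005 Da mass tolerance mentioned above; the m/z values are invented for the example.

```python
def matches(mz, reference, tol=0.005):
    """True if mz lies within tol (Da) of any value in reference."""
    return any(abs(mz - ref) <= tol for ref in reference)


def deconvolution_metrics(raw, retained, true_fragments, tol=0.005):
    """Return (TPr, FDr, reduction rate) for one deconvoluted spectrum.

    raw            -- m/z values before deconvolution
    retained       -- m/z values kept by the deconvolution step
    true_fragments -- m/z values known to belong to the precursor
    """
    tp = sum(1 for t in true_fragments if matches(t, retained, tol))
    fn = len(true_fragments) - tp
    fp = sum(1 for m in retained if not matches(m, true_fragments, tol))
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fdr = fp / (tp + fp) if (tp + fp) else 0.0
    reduction = 1.0 - len(retained) / len(raw) if raw else 0.0
    return tpr, fdr, reduction


# Invented example spectrum.
raw = [50.000, 77.039, 91.054, 105.070, 120.000, 150.200]
retained = [77.040, 91.054, 120.000]
true_fragments = [77.039, 91.054, 105.070]
tpr, fdr, reduction = deconvolution_metrics(raw, retained, true_fragments)
print(f"TPr={tpr:.1%}  FDr={fdr:.1%}  reduction={reduction:.1%}")
```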
Affiliation(s)
- Denice van Herwerden
  - Van ’t Hoff Institute for Molecular Sciences (HIMS), University of Amsterdam, Amsterdam 1012 WX, The Netherlands
- Jake W. O’Brien
  - Van ’t Hoff Institute for Molecular Sciences (HIMS), University of Amsterdam, Amsterdam 1012 WX, The Netherlands
  - Queensland Alliance for Environmental Health Sciences (QAEHS), The University of Queensland, Brisbane 4102, Australia
- Sascha Lege
  - Agilent Technologies Deutschland GmbH, Waldbronn 76337, Germany
- Bob W. J. Pirok
  - Van ’t Hoff Institute for Molecular Sciences (HIMS), University of Amsterdam, Amsterdam 1012 WX, The Netherlands
- Kevin V. Thomas
  - Queensland Alliance for Environmental Health Sciences (QAEHS), The University of Queensland, Brisbane 4102, Australia
- Saer Samanipour
  - Van ’t Hoff Institute for Molecular Sciences (HIMS), University of Amsterdam, Amsterdam 1012 WX, The Netherlands
  - Queensland Alliance for Environmental Health Sciences (QAEHS), The University of Queensland, Brisbane 4102, Australia
  - UvA Data Science Center, University of Amsterdam, Amsterdam 1012 WP, The Netherlands
6
Feraud M, O'Brien JW, Samanipour S, Dewapriya P, van Herwerden D, Kaserzon S, Wood I, Rauert C, Thomas KV. InSpectra - A platform for identifying emerging chemical threats. J Hazard Mater 2023; 455:131486. [PMID: 37172382 DOI: 10.1016/j.jhazmat.2023.131486] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 11/10/2022] [Revised: 04/20/2023] [Accepted: 04/23/2023] [Indexed: 05/14/2023]
Abstract
Non-target analysis (NTA) employing high-resolution mass spectrometry (HRMS) coupled with liquid chromatography is increasingly being used to identify chemicals of biological relevance. HRMS datasets are large and complex making the identification of potentially relevant chemicals extremely challenging. As they are recorded in vendor-specific formats, interpreting them is often reliant on vendor-specific software that may not accommodate advancements in data processing. Here we present InSpectra, a vendor independent automated platform for the systematic detection of newly identified emerging chemical threats. InSpectra is web-based, open-source/access and modular providing highly flexible and extensible NTA and suspect screening workflows. As a cloud-based platform, InSpectra exploits parallel computing and big data archiving capabilities with a focus for sharing and community curation of HRMS data. InSpectra offers a reproducible and transparent approach for the identification, tracking and prioritisation of emerging chemical threats.
Affiliation(s)
- Mathieu Feraud
  - Queensland Alliance for Environmental Health Sciences (QAEHS), The University of Queensland, Australia
- Jake W O'Brien
  - Queensland Alliance for Environmental Health Sciences (QAEHS), The University of Queensland, Australia; Van 't Hoff Institute for Molecular Sciences (HIMS), University of Amsterdam, Netherlands
- Saer Samanipour
  - Queensland Alliance for Environmental Health Sciences (QAEHS), The University of Queensland, Australia; Van 't Hoff Institute for Molecular Sciences (HIMS), University of Amsterdam, Netherlands; UvA Data Science Center, University of Amsterdam, Netherlands
- Pradeep Dewapriya
  - Queensland Alliance for Environmental Health Sciences (QAEHS), The University of Queensland, Australia
- Denice van Herwerden
  - Van 't Hoff Institute for Molecular Sciences (HIMS), University of Amsterdam, Netherlands
- Sarit Kaserzon
  - Queensland Alliance for Environmental Health Sciences (QAEHS), The University of Queensland, Australia
- Ian Wood
  - School of Mathematics and Physics, The University of Queensland, Australia
- Cassandra Rauert
  - Queensland Alliance for Environmental Health Sciences (QAEHS), The University of Queensland, Australia
- Kevin V Thomas
  - Queensland Alliance for Environmental Health Sciences (QAEHS), The University of Queensland, Australia
7
Boelrijk J, van Herwerden D, Ensing B, Forré P, Samanipour S. Predicting RP-LC retention indices of structurally unknown chemicals from mass spectrometry data. J Cheminform 2023; 15:28. [PMID: 36829215 PMCID: PMC9960388 DOI: 10.1186/s13321-023-00699-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 08/29/2022] [Accepted: 02/13/2023] [Indexed: 02/26/2023] Open
Abstract
Non-target analysis combined with liquid chromatography high resolution mass spectrometry is considered one of the most comprehensive strategies for the detection and identification of known and unknown chemicals in complex samples. However, many compounds remain unidentified due to data complexity and the limited number of structures in chemical databases. In this work, we have developed and validated a novel machine learning algorithm to predict the retention index (r_i) values for structurally (un)known chemicals based on their measured fragmentation pattern. The developed model, for the first time, enabled the prediction of r_i values without the need for the exact structure of the chemicals, with an R² of 0.91 and 0.77 and root mean squared error (RMSE) of 47 and 67 r_i units for the NORMAN ([Formula: see text]) and amide ([Formula: see text]) test sets, respectively. This fragment based model showed comparable accuracy in r_i prediction compared to conventional descriptor-based models that rely on known chemical structure, which obtained an R² of 0.85 with an RMSE of 67.
Affiliation(s)
- Jim Boelrijk
  - AI4Science Lab, University of Amsterdam, Amsterdam, The Netherlands; Institute for Informatics, University of Amsterdam, Amsterdam, The Netherlands
- Denice van Herwerden
  - Van 't Hoff Institute for Molecular Sciences (HIMS), University of Amsterdam, Amsterdam, The Netherlands
- Bernd Ensing
  - AI4Science Lab, University of Amsterdam, Amsterdam, The Netherlands; Computational Chemistry Group, Van 't Hoff Institute for Molecular Sciences (HIMS), Amsterdam, The Netherlands
- Patrick Forré
  - AI4Science Lab, University of Amsterdam, Amsterdam, The Netherlands; Institute for Informatics, University of Amsterdam, Amsterdam, The Netherlands
- Saer Samanipour
  - Van 't Hoff Institute for Molecular Sciences (HIMS), University of Amsterdam, Amsterdam, The Netherlands; UvA Data Science Center, University of Amsterdam, Amsterdam, The Netherlands; Queensland Alliance for Environmental Health Sciences (QAEHS), The University of Queensland, Woolloongabba, Australia
8
Alonso LL, Slagboom J, Casewell NR, Samanipour S, Kool J. Metabolome-Based Classification of Snake Venoms by Bioinformatic Tools. Toxins (Basel) 2023; 15:161. [PMID: 36828475 PMCID: PMC9963137 DOI: 10.3390/toxins15020161] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/15/2022] [Revised: 01/31/2023] [Accepted: 02/10/2023] [Indexed: 02/18/2023] Open
Abstract
Snakebite is considered a neglected tropical disease, and it is one of the most intricate ones. The variability found in snake venom is what makes it immensely complex to study. These variations are present both in the big and the small molecules found in snake venom. This study focused on examining the variability found in the venom's small molecules (i.e., mass range of 100-1000 Da) between two main families of venomous snakes (Elapidae and Viperidae), managing to create a model able to classify unknown samples by means of specific features, which can be extracted from their LC-MS data and output in a comprehensive list. The developed model also allowed further insight into the composition of snake venom by highlighting the most relevant metabolites of each group by clustering similarly composed venoms. The model was created by means of support vector machines and used 20 features, which were merged into 10 principal components. All samples from the first and second validation data subsets were correctly classified. Biological hypotheses relevant to the variation regarding the metabolites that were identified are also given.
Affiliation(s)
- Luis L. Alonso
  - Division of BioAnalytical Chemistry, Amsterdam Institute of Molecular and Life Sciences, Vrije Universiteit Amsterdam, De Boelelaan 1085, 1081 HV Amsterdam, The Netherlands
  - Centre for Analytical Sciences Amsterdam (CASA), 1012 WX Amsterdam, The Netherlands
- Julien Slagboom
  - Division of BioAnalytical Chemistry, Amsterdam Institute of Molecular and Life Sciences, Vrije Universiteit Amsterdam, De Boelelaan 1085, 1081 HV Amsterdam, The Netherlands
  - Centre for Analytical Sciences Amsterdam (CASA), 1012 WX Amsterdam, The Netherlands
- Nicholas R. Casewell
  - Centre for Snakebite Research and Interventions, Liverpool School of Tropical Medicine, Pembroke Place, Liverpool L3 5QA, UK
- Saer Samanipour
  - Van 't Hoff Institute for Molecular Sciences, University of Amsterdam, Science Park 904, 1098 XH Amsterdam, The Netherlands
- Jeroen Kool
  - Division of BioAnalytical Chemistry, Amsterdam Institute of Molecular and Life Sciences, Vrije Universiteit Amsterdam, De Boelelaan 1085, 1081 HV Amsterdam, The Netherlands
  - Centre for Analytical Sciences Amsterdam (CASA), 1012 WX Amsterdam, The Netherlands
9
Renai L, Del Bubba M, Samanipour S, Stafford R, Gargano AF. Development of a comprehensive two-dimensional liquid chromatographic mass spectrometric method for the non-targeted identification of poly- and perfluoroalkyl substances in aqueous film-forming foams. Anal Chim Acta 2022; 1232:340485. [DOI: 10.1016/j.aca.2022.340485] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/23/2022] [Revised: 10/02/2022] [Accepted: 10/03/2022] [Indexed: 11/26/2022]
10
Rousis NI, Li Z, Bade R, McLachlan MS, Mueller JF, O'Brien JW, Samanipour S, Tscharke BJ, Thomaidis NS, Thomas KV. Socioeconomic status and public health in Australia: A wastewater-based study. Environ Int 2022; 167:107436. [PMID: 35914338 DOI: 10.1016/j.envint.2022.107436] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 04/28/2022] [Revised: 06/22/2022] [Accepted: 07/25/2022] [Indexed: 06/15/2023]
Abstract
Analysis of untreated municipal wastewater is recognized as an innovative approach to assess population exposure to or consumption of various substances. Currently, there are no published wastewater-based studies investigating the relationships between catchment social, demographic, and economic characteristics with chemicals using advanced non-targeted techniques. In this study, fifteen wastewater samples covering 27% of the Australian population were collected during a population Census. The samples were analysed with a workflow employing liquid chromatography high-resolution mass spectrometry and chemometric tools for non-target analysis. Socioeconomic characteristics of catchment areas were generated using Geospatial Information Systems software. Potential correlations were explored between pseudo-mass loads of the identified compounds and socioeconomic and demographic descriptors of the wastewater catchments derived from Census data. Markers of public health (e.g., cardiac arrhythmia, cardiovascular disease, anxiety disorder and type 2 diabetes) were identified in the wastewater samples by the proposed workflow. They were positively correlated with descriptors of disadvantage in education, occupation, marital status and income, and negatively correlated with descriptors of advantage in education and occupation. In addition, markers of polypropylene glycol (PPG) and polyethylene glycol (PEG) related compounds were positively correlated with housing and occupation disadvantage. High positive correlations were found between separated and divorced people and specific drugs used to treat cardiac arrhythmia, cardiovascular disease, and depression. Our robust non-targeted methodology in combination with Census data can identify relationships between biomarkers of public health, human behaviour and lifestyle and socio-demographics of whole populations. 
Furthermore, it can identify specific areas and socioeconomic groups that may need more assistance than others for public health issues. This approach complements important public health information and enables large-scale national coverage with a relatively small number of samples.
Affiliation(s)
- Nikolaos I Rousis
  - Queensland Alliance for Environmental Health Sciences (QAEHS), The University of Queensland, 20 Cornwall Street, Woolloongabba, QLD 4102, Australia; Laboratory of Analytical Chemistry, Department of Chemistry, National and Kapodistrian University of Athens, Panepistimiopolis Zografou, 15771 Athens, Greece
- Zhe Li
  - Department of Environmental Science, Stockholm University, SE-106 91 Stockholm, Sweden
- Richard Bade
  - Queensland Alliance for Environmental Health Sciences (QAEHS), The University of Queensland, 20 Cornwall Street, Woolloongabba, QLD 4102, Australia
- Michael S McLachlan
  - Department of Environmental Science, Stockholm University, SE-106 91 Stockholm, Sweden
- Jochen F Mueller
  - Queensland Alliance for Environmental Health Sciences (QAEHS), The University of Queensland, 20 Cornwall Street, Woolloongabba, QLD 4102, Australia
- Jake W O'Brien
  - Queensland Alliance for Environmental Health Sciences (QAEHS), The University of Queensland, 20 Cornwall Street, Woolloongabba, QLD 4102, Australia
- Saer Samanipour
  - Faculty of Science, Van 't Hoff Institute for Molecular Sciences, University of Amsterdam, Science Park, 904 GD Amsterdam, the Netherlands
- Benjamin J Tscharke
  - Queensland Alliance for Environmental Health Sciences (QAEHS), The University of Queensland, 20 Cornwall Street, Woolloongabba, QLD 4102, Australia
- Nikolaos S Thomaidis
  - Laboratory of Analytical Chemistry, Department of Chemistry, National and Kapodistrian University of Athens, Panepistimiopolis Zografou, 15771 Athens, Greece
- Kevin V Thomas
  - Queensland Alliance for Environmental Health Sciences (QAEHS), The University of Queensland, 20 Cornwall Street, Woolloongabba, QLD 4102, Australia
11
Samanipour S, Choi P, O'Brien JW, Pirok BWJ, Reid MJ, Thomas KV. From Centroided to Profile Mode: Machine Learning for Prediction of Peak Width in HRMS Data. Anal Chem 2021; 93:16562-16570. [PMID: 34843646 PMCID: PMC8674881 DOI: 10.1021/acs.analchem.1c03755] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 12/31/2022]
Abstract
Centroiding is one of the major approaches used for size reduction of the data generated by high-resolution mass spectrometry. During centroiding, performed either during acquisition or as a pre-processing step, the mass profiles are represented by a single value (i.e., the centroid). While being effective in reducing the data size, centroiding also reduces the level of information density present in the mass peak profile. Moreover, each step of the centroiding process and their consequences on the final results may not be completely clear. Here, we present Cent2Prof, a package containing two algorithms that enables the conversion of the centroided data to mass peak profile data and vice versa. The centroiding algorithm uses the resolution-based mass peak width parameter as the first guess and self-adjusts to fit the data. In addition to the m/z values, the centroiding algorithm also generates the measured mass peak widths at half-height, which can be used during the feature detection and identification. The mass peak profile prediction algorithm employs a random-forest model for the prediction of mass peak widths, which is consequently used for mass profile reconstruction. The centroiding results were compared to the outputs of the MZmine-implemented centroiding algorithm. Our algorithm resulted in rates of false detection ≤5% while the MZmine algorithm resulted in 30% rate of false positive and 3% rate of false negative. The error in profile prediction was ≤56% independent of the mass, ionization mode, and intensity, which was 6 times more accurate than the resolution-based estimated values.
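As a rough illustration of the centroiding step described in this abstract, the sketch below uses the standard resolution definition (R = m/z divided by the peak width at half height) to get a first-guess peak width, then computes an intensity-weighted mean m/z within that window. This is a simplified assumption-based example, not the Cent2Prof algorithm: it does not self-adjust the width to the data, and the profile points and function names are invented.

```python
def expected_fwhm(mz, resolution):
    """First-guess peak width at half height from R = (m/z) / FWHM."""
    return mz / resolution


def centroid(profile, apex_mz, resolution):
    """Intensity-weighted mean m/z of profile points near the apex.

    profile is a list of (m/z, intensity) pairs; points farther than one
    expected FWHM from the apex are ignored.
    """
    half_window = expected_fwhm(apex_mz, resolution)
    points = [(mz, i) for mz, i in profile if abs(mz - apex_mz) <= half_window]
    total = sum(i for _, i in points)
    if total == 0:
        raise ValueError("no signal within the centroiding window")
    return sum(mz * i for mz, i in points) / total


# Invented, symmetric profile peak around m/z 200 at a nominal resolution of 20,000.
profile = [
    (199.992, 10.0),
    (199.996, 50.0),
    (200.000, 100.0),
    (200.004, 50.0),
    (200.008, 10.0),
    (200.500, 80.0),  # unrelated neighbouring signal, outside the window
]
width = expected_fwhm(200.0, 20000)
center = centroid(profile, apex_mz=200.0, resolution=20000)
print(f"expected FWHM = {width:.4f}, centroid = {center:.4f}")
```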
Collapse
Affiliation(s)
- Saer Samanipour
- Van't Hoff Institute for Molecular Sciences (HIMS), University of Amsterdam, Science Park 904, Amsterdam 1098 XH, The Netherlands; Queensland Alliance for Environmental Health Sciences (QAEHS), The University of Queensland, Woolloongabba, Queensland 4102, Australia; Norwegian Institute for Water Research (NIVA), Økernveien 94, Oslo 0579, Norway
- Phil Choi
- Queensland Alliance for Environmental Health Sciences (QAEHS), The University of Queensland, Woolloongabba, Queensland 4102, Australia; Water Unit, Health Protection Branch, Prevention Division, Queensland Department of Health, Brisbane, Queensland 4000, Australia
- Jake W O'Brien
- Queensland Alliance for Environmental Health Sciences (QAEHS), The University of Queensland, Woolloongabba, Queensland 4102, Australia
- Bob W J Pirok
- Van't Hoff Institute for Molecular Sciences (HIMS), University of Amsterdam, Science Park 904, Amsterdam 1098 XH, The Netherlands
- Malcolm J Reid
- Norwegian Institute for Water Research (NIVA), Økernveien 94, Oslo 0579, Norway
- Kevin V Thomas
- Queensland Alliance for Environmental Health Sciences (QAEHS), The University of Queensland, Woolloongabba, Queensland 4102, Australia
|
12
|
Celma A, Ahrens L, Gago-Ferrero P, Hernández F, López F, Lundqvist J, Pitarch E, Sancho JV, Wiberg K, Bijlsma L. The relevant role of ion mobility separation in LC-HRMS based screening strategies for contaminants of emerging concern in the aquatic environment. Chemosphere 2021; 280:130799. [PMID: 34162120 DOI: 10.1016/j.chemosphere.2021.130799] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 02/25/2021] [Revised: 04/29/2021] [Accepted: 05/01/2021] [Indexed: 05/24/2023]
Abstract
Ion mobility separation (IMS) coupled to high resolution mass spectrometry (IMS-HRMS) is a promising technique for (non-)target/suspect analysis of micropollutants in complex matrices. IMS separates ionized compounds based on their charge, shape and size, facilitating the removal of co-eluting isomeric/isobaric species. Additionally, IMS data can be translated into collision cross-section (CCS) values, which can be used to increase the identification reliability. However, IMS-HRMS for the screening of contaminants of emerging concern (CECs) has been scarcely explored. In this study, the role of IMS-HRMS for the identification of CECs in complex matrices is highlighted, with emphasis on when and for what purpose it is of use. The utilization of IMS can result in much cleaner mass spectra, which considerably facilitates data interpretation and the obtaining of reliable identifications. Furthermore, the robustness of IMS measurements across matrices permits the use of CCS as an additional relevant parameter during the identification step even when reference standards are not available. Moreover, an effect on the number of true and false identifications could be demonstrated by including IMS restrictions within the identification workflow. Data shown in this work is of special interest for environmental researchers dealing with the detection of CECs with state-of-the-art IMS-HRMS instruments.
Affiliation(s)
- Alberto Celma
- Environmental and Public Health Analytical Chemistry, Research Institute for Pesticides and Water, University Jaume I, Castelló, E-12071, Spain
- Lutz Ahrens
- Department of Aquatic Sciences and Assessment, Swedish University of Agricultural Sciences (SLU), Box 7050, SE-750 07, Uppsala, Sweden
- Pablo Gago-Ferrero
- Institute of Environmental Assessment and Water Research (IDAEA) Severo Ochoa Excellence Center, Spanish Council for Scientific Research (CSIC), Jordi Girona 18-26, E-08034, Barcelona, Spain
- Félix Hernández
- Environmental and Public Health Analytical Chemistry, Research Institute for Pesticides and Water, University Jaume I, Castelló, E-12071, Spain
- Francisco López
- Environmental and Public Health Analytical Chemistry, Research Institute for Pesticides and Water, University Jaume I, Castelló, E-12071, Spain
- Johan Lundqvist
- Department of Biomedicine and Veterinary Public Health, Swedish University of Agricultural Sciences, Box 7028, SE-750 07, Uppsala, Sweden
- Elena Pitarch
- Environmental and Public Health Analytical Chemistry, Research Institute for Pesticides and Water, University Jaume I, Castelló, E-12071, Spain
- Juan Vicente Sancho
- Environmental and Public Health Analytical Chemistry, Research Institute for Pesticides and Water, University Jaume I, Castelló, E-12071, Spain
- Karin Wiberg
- Department of Aquatic Sciences and Assessment, Swedish University of Agricultural Sciences (SLU), Box 7050, SE-750 07, Uppsala, Sweden
- Lubertus Bijlsma
- Environmental and Public Health Analytical Chemistry, Research Institute for Pesticides and Water, University Jaume I, Castelló, E-12071, Spain
|
13
|
Inter-laboratory mass spectrometry dataset based on passive sampling of drinking water for non-target analysis. Sci Data 2021; 8:223. [PMID: 34429429 PMCID: PMC8384892 DOI: 10.1038/s41597-021-01002-w] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 01/14/2021] [Accepted: 07/12/2021] [Indexed: 11/09/2022]
Abstract
Non-target analysis (NTA) employing high-resolution mass spectrometry is a commonly applied approach for the detection of novel chemicals of emerging concern in complex environmental samples. NTA typically results in large and information-rich datasets that require computer aided (ideally automated) strategies for their processing and interpretation. Such strategies do however raise the challenge of reproducibility between and within different processing workflows. An effective strategy to mitigate such problems is the implementation of inter-laboratory studies (ILS) with the aim to evaluate different workflows and agree on harmonized/standardized quality control procedures. Here we present the data generated during such an ILS. This study was organized through the Norman Network and included 21 participants from 11 countries. A set of samples based on the passive sampling of drinking water pre and post treatment was shipped to all the participating laboratories for analysis, using one pre-defined method and one locally (i.e. in-house) developed method. The data generated represents a valuable resource (i.e. benchmark) for future developments of algorithms and workflows for NTA experiments.
Measurement(s): chemical • drinking water
Technology Type(s): high resolution mass spectrometry • non-target analysis • Interlaboratory
Factor Type(s): method
Sample Characteristic - Environment: laboratory environment
Machine-accessible metadata file describing the reported data: 10.6084/m9.figshare.15028665
|
14
|
Zhu L, Jia W, Wang Q, Zhuang P, Wan X, Ren Y, Zhang Y. Nontargeted metabolomics-based mapping urinary metabolic fingerprints after exposure to acrylamide. Ecotoxicology and Environmental Safety 2021; 224:112625. [PMID: 34411821 DOI: 10.1016/j.ecoenv.2021.112625] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/21/2021] [Revised: 07/31/2021] [Accepted: 08/08/2021] [Indexed: 06/13/2023]
Abstract
Acrylamide, classified as a probable carcinogen to humans, is a high production volume chemical in industrial applications released to aquatic and environmental ecosystems, and is also widely found in the thermal processing of starch-rich foods. To gain insight into the urinary metabolomics that may induce physiological responses stimulated by acrylamide, rats were orally administered a single dose of ¹³C₃-acrylamide (10 mg/kg bw) in the treatment group, and urine samples were continuously collected every 2 h during the first 18 h and every 3 h during the period from 18 h to 36 h. A reliable nontargeted screening method for the analysis of urinary metabolomics in rats was developed using ultra-high performance liquid chromatography coupled to quadrupole-Orbitrap high-resolution mass spectrometry. All metabolites in urine of rats receiving isotope-labeled acrylamide were screened by validated orthogonal partial least squares-discriminant analyses compared to the animals in the control group, while exposure biomarkers were further confirmed according to the characteristic fragmentation rules and time-dependent profiles. Here we identified 2 new specific exposure biomarkers, named N-acetyl-S-(2-carbamoyl-2-hydroxyethyl)-L-cysteine-sulfoxide and N-acetyl-S-(2-carboxyl)-L-cysteine, compared to 4 currently acknowledged mercapturic acid adducts of acrylamide. In addition, our findings on analysis of acrylamide metabolic pathway and identification of exposure biomarkers confirmed that acrylamide could significantly affect energy metabolism and amino acid metabolism by the Kyoto Encyclopedia of Genes and Genomes pathway analysis for key metabolites. Homocysteine thiolactone and hypoxanthine may be potential biomarkers for the cardiotoxicity, while methionine sulfoxide, hippuric acid and melatonin may be specifically related to the neurotoxicity. Thus, the current study provided new evidence on the identification of emerging exposure biomarkers and specific signature metabolites related to the toxicity of acrylamide, and shed light on how acrylamide affected energy and amino acid metabolism by further mapping urinary metabolic fingerprints.
Affiliation(s)
- Li Zhu
- National Engineering Laboratory of Intelligent Food Technology and Equipment, Zhejiang Key Laboratory for Agro-Food Processing, College of Biosystems Engineering and Food Science, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Wei Jia
- National Engineering Laboratory of Intelligent Food Technology and Equipment, Zhejiang Key Laboratory for Agro-Food Processing, College of Biosystems Engineering and Food Science, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Qiao Wang
- National Engineering Laboratory of Intelligent Food Technology and Equipment, Zhejiang Key Laboratory for Agro-Food Processing, College of Biosystems Engineering and Food Science, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Pan Zhuang
- National Engineering Laboratory of Intelligent Food Technology and Equipment, Zhejiang Key Laboratory for Agro-Food Processing, College of Biosystems Engineering and Food Science, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Xuzhi Wan
- National Engineering Laboratory of Intelligent Food Technology and Equipment, Zhejiang Key Laboratory for Agro-Food Processing, College of Biosystems Engineering and Food Science, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Yiping Ren
- Yangtze Delta Region Institute of Tsinghua University, Jiaxing 314006, Zhejiang, China
- Yu Zhang
- National Engineering Laboratory of Intelligent Food Technology and Equipment, Zhejiang Key Laboratory for Agro-Food Processing, College of Biosystems Engineering and Food Science, Zhejiang University, Hangzhou 310058, Zhejiang, China
|
15
|
An assessment of quality assurance/quality control efforts in high resolution mass spectrometry non-target workflows for analysis of environmental samples. Trends Analyt Chem 2020. [DOI: 10.1016/j.trac.2020.116063] [Citation(s) in RCA: 34] [Impact Index Per Article: 8.5] [Indexed: 02/06/2023]
|
16
|
McLean C, Kujawinski EB. AutoTuner: High Fidelity and Robust Parameter Selection for Metabolomics Data Processing. Anal Chem 2020; 92:5724-5732. [PMID: 32212641 PMCID: PMC7310949 DOI: 10.1021/acs.analchem.9b04804] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Indexed: 12/15/2022]
Abstract
Untargeted metabolomics experiments provide a snapshot of cellular metabolism but remain challenging to interpret due to the computational complexity involved in data processing and analysis. Prior to any interpretation, raw data must be processed to remove noise and to align mass-spectral peaks across samples. This step requires selection of dataset-specific parameters, as erroneous parameters can result in noise inflation. While several algorithms exist to automate parameter selection, each depends on gradient descent optimization functions. In contrast, our new parameter optimization algorithm, AutoTuner, obtains parameter estimates from raw data in a single step as opposed to many iterations. Here, we tested the accuracy and the run-time of AutoTuner in comparison to isotopologue parameter optimization (IPO), the most commonly used parameter selection tool, and compared the resulting parameters' influence on the properties of feature tables after processing. We performed a Monte Carlo experiment to test the robustness of AutoTuner parameter selection and found that AutoTuner generated similar parameter estimates from random subsets of samples. We conclude that AutoTuner is a desirable alternative to existing tools, because it is scalable, highly robust, and very fast (∼100–1000× speed improvement from other algorithms going from days to minutes). AutoTuner is freely available as an R package through BioConductor.
Affiliation(s)
- Craig McLean
- Department of Marine Chemistry and Geochemistry, Woods Hole Oceanographic Institution, Woods Hole, Massachusetts 02543, United States; MIT/WHOI Joint Program in Oceanography/Applied Ocean Science and Engineering, Department of Marine Chemistry and Geochemistry, Woods Hole Oceanographic Institution, Woods Hole, Massachusetts 02543, United States
- Elizabeth B Kujawinski
- Department of Marine Chemistry and Geochemistry, Woods Hole Oceanographic Institution, Woods Hole, Massachusetts 02543, United States
|
17
|
Samanipour S, Reid MJ, Rundberget JT, Frost TK, Thomas KV. Concentration and Distribution of Naphthenic Acids in the Produced Water from Offshore Norwegian North Sea Oilfields. Environmental Science & Technology 2020; 54:2707-2714. [PMID: 32019310 DOI: 10.1021/acs.est.9b05784] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Indexed: 05/24/2023]
Abstract
Naphthenic acids (NAs) constitute one of the toxic components of the produced water (PW) from offshore oil platforms discharged into the marine environment. We employed liquid chromatography (LC) coupled to high-resolution mass spectrometry with electrospray ionization (ESI) in negative mode for the comprehensive chemical characterization and quantification of NAs in PW samples from six different Norwegian offshore oil platforms. In total, we detected 55 unique NA isomer groups, out of the 181 screened homologous groups, across all tested samples. The frequency of detected NAs in the samples varied between 14 and 44 isomer groups. Principal component analysis (PCA) indicated a clear distinction of the PW from the tested platforms based on the distribution of NAs in these samples. The averaged total concentration of NAs varied between 6 and 56 mg L⁻¹ among the tested platforms, whereas the concentrations of the individual NA isomer groups ranged between 0.2 and 44 mg L⁻¹. Based on both the distribution and the concentration of NAs in the samples, the C₈H₁₄O₂ isomer group appeared to be a reasonable indicator of the presence and the total concentration of NAs in the samples with a Pearson correlation coefficient of 0.89.
Affiliation(s)
- Saer Samanipour
- Norwegian Institute for Water Research (NIVA), Gaustadalléen 21, Oslo 0349, Norway
- Queensland Alliance for Environmental Health Sciences (QAEHS), The University of Queensland, 20 Cornwall St, Woolloongabba, Queensland 4102, Australia
- Malcolm J Reid
- Norwegian Institute for Water Research (NIVA), Gaustadalléen 21, Oslo 0349, Norway
- Tone K Frost
- Equinor, Arkitekt Ebbels veg 10, Rotvoll, Trondheim 7005, Norway
- Kevin V Thomas
- Norwegian Institute for Water Research (NIVA), Gaustadalléen 21, Oslo 0349, Norway
- Queensland Alliance for Environmental Health Sciences (QAEHS), The University of Queensland, 20 Cornwall St, Woolloongabba, Queensland 4102, Australia
|