1
Liu Q, Walker D, Uppal K, Liu Z, Ma C, Tran V, Li S, Jones DP, Yu T. Addressing the batch effect issue for LC/MS metabolomics data in data preprocessing. Sci Rep 2020; 10:13856. [PMID: 32807888 PMCID: PMC7431853 DOI: 10.1038/s41598-020-70850-0] [Received: 02/03/2020] [Accepted: 07/28/2020]
Abstract
With the growth of metabolomics research, more and more studies are conducted on large numbers of samples. Due to technical limitations of the Liquid Chromatography–Mass Spectrometry (LC/MS) platform, samples often need to be processed in multiple batches. Across different batches, we often observe differences in data characteristics. In this work, we specifically focus on data generated in multiple batches on the same LC/MS machinery. Traditional preprocessing methods treat all samples as a single group. Such practice can result in errors in the alignment of peaks, which cannot be corrected by post hoc application of batch effect correction methods. In this work, we developed a new approach that addresses the batch effect issue in the preprocessing stage, resulting in better peak detection, alignment and quantification. It can be combined with downstream batch effect correction methods to further correct for between-batch intensity differences. The method is implemented in the existing workflow of the apLCMS platform. Analyzing data with multiple batches, both generated from standardized quality control (QC) plasma samples and from real biological studies, the new method resulted in feature tables with better consistency, as well as better downstream analysis results. The method can be a useful addition to the tools available for large studies involving multiple batches. The method is available as part of the apLCMS package. Download link and instructions are at https://mypage.cuhk.edu.cn/academics/yutianwei/apLCMS/.
Affiliation(s)
- Qin Liu
  - School of Software Engineering, Tongji University, Shanghai, 201804, China
- Douglas Walker
  - Department of Environmental Medicine and Public Health, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- Karan Uppal
  - Department of Medicine, School of Medicine, Emory University, Atlanta, GA, 30322, USA
- Zihe Liu
  - School of Software Engineering, Tongji University, Shanghai, 201804, China
- Chunyu Ma
  - Department of Medicine, School of Medicine, Emory University, Atlanta, GA, 30322, USA
- ViLinh Tran
  - Department of Medicine, School of Medicine, Emory University, Atlanta, GA, 30322, USA
- Shuzhao Li
  - The Jackson Laboratory, Farmington, CT, 06032, USA
- Dean P Jones
  - Department of Medicine, School of Medicine, Emory University, Atlanta, GA, 30322, USA
- Tianwei Yu
  - School of Data Science, The Chinese University of Hong Kong - Shenzhen, Shenzhen, 518172, Guangdong Province, China
2
Farahani KZ, Benvidi A, Rezaeinasab M, Abbasi S, Abdollahi-Alibeik M, Rezaeipoor-Anari A, Zarchi MAK, Abadi SSADM. Potentiality of PARAFAC approaches for simultaneous determination of N-acetylcysteine and acetaminophen based on the second-order data obtained from differential pulse voltammetry. Talanta 2018; 192:439-447. [PMID: 30348415 DOI: 10.1016/j.talanta.2018.08.092] [Received: 04/13/2018] [Revised: 08/27/2018] [Accepted: 08/31/2018]
Abstract
N-acetylcysteine (N-AC) is widely used as a pharmaceutical drug and nutritional supplement. Its adverse effects include rash, urticaria, and itchiness, and large doses of N-AC could potentially damage the heart and lungs. Therefore, in this work, a sensitive voltammetric sensor based on a carbon paste electrode modified with silica nanoparticles (i.e., Mobil Composition of Matter No. 41 modified with boron trifluoride, or BF3@MCM-41) in combination with 4,4'-dihydroxybiphenyl (DHB) (BF3@MCM-41/DHB/CPE) was designed for the determination of N-AC. The electrochemical oxidation of N-AC was examined using various techniques such as cyclic voltammetry (CV), chronoamperometry and differential pulse voltammetry (DPV). Under the optimum conditions, parameters such as the electron transfer coefficient (α) and heterogeneous rate constant (ks) were estimated for N-AC. Because N-AC is used to treat acetaminophen (AC) overdose, the modified electrode was investigated for the simultaneous determination of N-AC and AC in blood serum and tablet samples. Since the signals of these species overlap and interfering species are present in blood samples, their simultaneous determination is difficult or impossible. To overcome this challenge, parallel factor analysis (PARAFAC) was used to analyze the complex matrices and obtain the spectral profile of each component and interference. To this end, electrochemical second-order data were generated by simply changing the pulse height in differential pulse voltammetry. The results of the proposed strategy for real sample analysis are similar to those obtained with HPLC. Thus, the proposed method performs acceptably for simultaneous determination of the two species in real samples.
Affiliation(s)
- Ali Benvidi
  - Department of Chemistry, Faculty of Science, Yazd University, Yazd 89195-741, Iran
- Masoud Rezaeinasab
  - Department of Chemistry, Faculty of Science, Yazd University, Yazd 89195-741, Iran
- Saleheh Abbasi
  - Department of Chemistry, Faculty of Science, Yazd University, Yazd 89195-741, Iran
- Ali Rezaeipoor-Anari
  - Department of Chemistry, Faculty of Science, Yazd University, Yazd 89195-741, Iran
3
Adutwum LA, Abel RJ, Harynuk J. Total Ion Spectra versus Segmented Total Ion Spectra as Preprocessing Tools for Gas Chromatography - Mass Spectrometry Data. J Forensic Sci 2017; 63:1059-1068. [PMID: 29023723 DOI: 10.1111/1556-4029.13657] [Received: 05/24/2017] [Revised: 08/21/2017] [Accepted: 09/08/2017]
Abstract
Alignment of fire debris data from GC-MS for chemometric analysis is challenged by highly variable, uncontrolled sample and matrix composition. The total ion spectrum (TIS) obviates the need for alignment but loses all separation information. We introduce the segmented total ion spectrum (STIS), which keeps the advantages of TIS while preserving some retention information. We compare the performance of STIS with TIS for the classification of casework fire debris samples. TIS and STIS achieve good model prediction accuracies of 96% and 98%, respectively. Baseline removal improved model prediction accuracies for both TIS and STIS to 97% and 99%, respectively. The importance of maintaining some chromatographic information to aid in deciphering the underlying chemistry of the results and reasons for false positive/negative results was also examined.
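The difference between a TIS and an STIS can be illustrated with a small sketch. It assumes the GC-MS run is already available as a list of scans, each a list of intensities on a shared m/z grid, and that segments are equal-width blocks of consecutive scans. The function names and the equal-width segmentation are illustrative assumptions; the abstract does not say how the authors choose segment boundaries.

```python
from typing import List, Sequence


def total_ion_spectrum(scans: Sequence[Sequence[float]]) -> List[float]:
    """Sum the intensity of every m/z channel over all scans (no time axis left)."""
    n_channels = len(scans[0])
    return [sum(scan[j] for scan in scans) for j in range(n_channels)]


def segmented_total_ion_spectrum(
    scans: Sequence[Sequence[float]], n_segments: int
) -> List[float]:
    """Compute one TIS per block of consecutive scans and concatenate them.

    Hypothetical segmentation: equal-width blocks of scans. The result keeps
    coarse retention information (which segment a signal came from).
    """
    n_scans = len(scans)
    n_channels = len(scans[0])
    # Integer segment boundaries covering all scans exactly once.
    bounds = [k * n_scans // n_segments for k in range(n_segments + 1)]
    stis: List[float] = []
    for start, stop in zip(bounds[:-1], bounds[1:]):
        segment = scans[start:stop]
        if segment:
            stis.extend(total_ion_spectrum(segment))
        else:
            # More segments than scans: pad the empty segment with zeros.
            stis.extend([0.0] * n_channels)
    return stis


# Toy run: 4 scans, 3 m/z channels.
scans = [[1, 0, 2], [0, 3, 0], [4, 0, 0], [0, 0, 5]]
print(total_ion_spectrum(scans))               # [5, 3, 7]
print(segmented_total_ion_spectrum(scans, 2))  # [1, 3, 2, 4, 0, 5]
```

With one segment the STIS reduces to the TIS. More segments retain more retention information at the cost of a longer feature vector that is more sensitive to retention time shifts.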
Affiliation(s)
- Lawrence A Adutwum
  - Department of Chemistry, University of Alberta, Edmonton, Alberta, Canada
- Robin J Abel
  - Department of Chemistry, University of Alberta, Edmonton, Alberta, Canada
- James Harynuk
  - Department of Chemistry, University of Alberta, Edmonton, Alberta, Canada
4
Abou-el-karam S, Ratel J, Kondjoyan N, Truan C, Engel E. Marker discovery in volatolomics based on systematic alignment of GC-MS signals: Application to food authentication. Anal Chim Acta 2017; 991:58-67. [DOI: 10.1016/j.aca.2017.08.019] [Received: 12/07/2016] [Revised: 07/26/2017] [Accepted: 08/19/2017]
5
Zhang W, Lei Z, Huhman D, Sumner LW, Zhao PX. MET-XAlign: a metabolite cross-alignment tool for LC/MS-based comparative metabolomics. Anal Chem 2015; 87:9114-9. [PMID: 26247233 DOI: 10.1021/acs.analchem.5b01324]
Abstract
Liquid chromatography/mass spectrometry (LC/MS) metabolite profiling has been widely used in comparative metabolomics studies; however, LC/MS-based comparative metabolomics currently faces several critical challenges. One of the greatest challenges is how to effectively align metabolites across different LC/MS profiles; a single metabolite can give rise to multiple peak features, and the grouped peak features that can be used to construct a spectrum pattern of a single metabolite can vary greatly between biochemical experiments and even between instrument runs. Another major challenge is that the observed retention time for a single metabolite can also be significantly affected by experimental conditions. To overcome these two key challenges, we present a novel metabolite-based alignment approach entitled MET-XAlign to align metabolites across LC/MS metabolomics profiles. MET-XAlign takes the deduced molecular mass and estimated compound retention time information that can be extracted by our previously published tool, MET-COFEA, and aligns metabolites based on this information. We demonstrate that MET-XAlign is able to cross-align metabolite compounds, either known or unknown, in LC/MS profiles not only across different samples but also across different biological experiments and different electrospray ionization modes. Therefore, our proposed metabolite-based cross-alignment approach is a great step forward and its implementation, MET-XAlign, is a very useful tool in LC/MS-based comparative metabolomics. MET-XAlign has been successfully implemented with core algorithm coding in C++, making it very efficient, and visualization interface coding in the Microsoft .NET Framework. The MET-XAlign software along with demonstrative data is freely available at http://bioinfo.noble.org/manuscript-support/met-xalign/.
Affiliation(s)
- Wenchao Zhang
  - Plant Biology Division, The Samuel Roberts Noble Foundation, 2510 Sam Noble Parkway, Ardmore, Oklahoma 73401, United States
- Zhentian Lei
  - Plant Biology Division, The Samuel Roberts Noble Foundation, 2510 Sam Noble Parkway, Ardmore, Oklahoma 73401, United States
- David Huhman
  - Plant Biology Division, The Samuel Roberts Noble Foundation, 2510 Sam Noble Parkway, Ardmore, Oklahoma 73401, United States
- Lloyd W Sumner
  - Plant Biology Division, The Samuel Roberts Noble Foundation, 2510 Sam Noble Parkway, Ardmore, Oklahoma 73401, United States
- Patrick X Zhao
  - Plant Biology Division, The Samuel Roberts Noble Foundation, 2510 Sam Noble Parkway, Ardmore, Oklahoma 73401, United States
6
Zhang W, Chang J, Lei Z, Huhman D, Sumner LW, Zhao PX. MET-COFEA: a liquid chromatography/mass spectrometry data processing platform for metabolite compound feature extraction and annotation. Anal Chem 2014; 86:6245-53. [PMID: 24856452 DOI: 10.1021/ac501162k]
Abstract
In this paper, we present a novel liquid chromatography/mass spectrometry (LC/MS) data processing and analysis platform, MET-COFEA (METabolite COmpound Feature Extraction and Annotation). MET-COFEA detects and clusters chromatographic peak features for each metabolite compound by first comprehensively evaluating retention time and peak shape criteria and then annotating the associations between each peak's observed m/z value with the corresponding metabolite compound's molecular mass. MET-COFEA integrates a series of innovative approaches, including novel mass trace based extracted-ion chromatogram (EIC) extraction, continuous wavelet transform (CWT)-based peak detection, and compound-associated peak clustering and peak annotation algorithms. On the basis of the deduced neutral molecular mass and retention time, we have also developed a new alignment algorithm that uses compound-associated peak groups instead of individual peaks to align the same metabolite compound across samples from different electrospray ionization (ESI) modes, different instruments, even different experimental conditions. MET-COFEA has been systematically tested on a series of LC/MS profiles of mixed standards at different concentrations as well as real untargeted LC/MS plant metabolomics data. We compared the performances of MET-COFEA with the existing publicly available tools at LC/MS peak analysis level and demonstrated its excellent performance in this arena. MET-COFEA is freely available at http://bioinfo.noble.org/manuscript-support/met-cofea/.
Affiliation(s)
- Wenchao Zhang
  - Plant Biology Division, The Samuel Roberts Noble Foundation, 2510 Sam Noble Parkway, Ardmore, Oklahoma 73401, United States
7
Dellicour S, Lecocq T. GCALIGNER 1.0: An alignment program to compute a multiple sample comparison data matrix from large eco-chemical datasets obtained by GC. J Sep Sci 2013; 36:3206-9. [DOI: 10.1002/jssc.201300388] [Received: 04/11/2013] [Revised: 06/21/2013] [Accepted: 07/16/2013]
Affiliation(s)
- Simon Dellicour
  - Evolutionary Biology and Ecology; Université Libre de Bruxelles; Brussels Belgium
- Thomas Lecocq
  - Laboratoire de Zoologie; University of Mons; Mons Belgium
8
Urban J, Vaněk J, Štys D. Unsupervised adaptive filter for baseline thresholding and elimination in liquid chromatography-mass spectrometry via approximation of the standard deviation of baseline distribution in retention time domain. Acta Chromatogr 2013. [DOI: 10.1556/achrom.25.2013.2.4]
9
Zheng YB, Zhang ZM, Liang YZ, Zhan DJ, Huang JH, Yun YH, Xie HL. Application of fast Fourier transform cross-correlation and mass spectrometry data for accurate alignment of chromatograms. J Chromatogr A 2013; 1286:175-82. [DOI: 10.1016/j.chroma.2013.02.063] [Received: 04/25/2012] [Revised: 01/28/2013] [Accepted: 02/18/2013]
10
Hoffmann N, Keck M, Neuweger H, Wilhelm M, Högy P, Niehaus K, Stoye J. Combining peak- and chromatogram-based retention time alignment algorithms for multiple chromatography-mass spectrometry datasets. BMC Bioinformatics 2012; 13:214. [PMID: 22920415 PMCID: PMC3546004 DOI: 10.1186/1471-2105-13-214] [Received: 06/06/2012] [Accepted: 08/03/2012]
Abstract
BACKGROUND Modern analytical methods in biology and chemistry use separation techniques coupled to sensitive detectors, such as gas chromatography-mass spectrometry (GC-MS) and liquid chromatography-mass spectrometry (LC-MS). These hyphenated methods provide high-dimensional data. Comparing such data manually to find corresponding signals is a laborious task, as each experiment usually consists of thousands of individual scans, each containing hundreds or even thousands of distinct signals. In order to allow for successful identification of metabolites or proteins within such data, especially in the context of metabolomics and proteomics, an accurate alignment and matching of corresponding features between two or more experiments is required. Such a matching algorithm should capture fluctuations in the chromatographic system which lead to non-linear distortions on the time axis, as well as systematic changes in recorded intensities. Many different algorithms for the retention time alignment of GC-MS and LC-MS data have been proposed and published, but all of them focus either on aligning previously extracted peak features or on aligning and comparing the complete raw data containing all available features. RESULTS In this paper we introduce two algorithms for retention time alignment of multiple GC-MS datasets: multiple alignment by bidirectional best hits peak assignment and cluster extension (BIPACE) and center-star multiple alignment by pairwise partitioned dynamic time warping (CeMAPP-DTW). We show how the similarity-based peak group matching method BIPACE may be used for multiple alignment calculation individually and how it can be used as a preprocessing step for the pairwise alignments performed by CeMAPP-DTW. We evaluate the algorithms individually and in combination on a previously published small GC-MS dataset studying the Leishmania parasite and on a larger GC-MS dataset studying grains of wheat (Triticum aestivum). 
CONCLUSIONS We have shown that BIPACE achieves very high precision and recall and a very low number of false positive peak assignments on both evaluation datasets. CeMAPP-DTW finds a high number of true positives when executed on its own, but achieves even better results when BIPACE is used to constrain its search space. The source code of both algorithms is included in the OpenSource software framework Maltcms, which is available from http://maltcms.sf.net. The evaluation scripts of the present study are available from the same source.
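The "bidirectional best hits" idea behind BIPACE can be illustrated for the two-run case. The sketch below is not the Maltcms implementation: it assumes a precomputed similarity matrix between the peaks of run A and run B (how similarity is scored, and the later cluster-extension step across many runs, are left out), and the function name is hypothetical.

```python
from typing import List, Sequence, Tuple


def bidirectional_best_hits(
    sim: Sequence[Sequence[float]],
) -> List[Tuple[int, int]]:
    """Return peak pairs (i, j) that are each other's best match.

    sim[i][j] is an assumed, precomputed similarity between peak i of run A
    and peak j of run B (higher means more similar).
    """
    n_a, n_b = len(sim), len(sim[0])
    # Best partner in B for every peak in A (row-wise argmax).
    best_for_a = [max(range(n_b), key=lambda j: sim[i][j]) for i in range(n_a)]
    # Best partner in A for every peak in B (column-wise argmax).
    best_for_b = [max(range(n_a), key=lambda i: sim[i][j]) for j in range(n_b)]
    # Keep a pair only when the preference is mutual.
    return [(i, j) for i, j in enumerate(best_for_a) if best_for_b[j] == i]


# Toy similarity matrix: 3 peaks in run A, 3 peaks in run B.
sim = [
    [0.90, 0.20, 0.10],
    [0.30, 0.80, 0.70],
    [0.10, 0.60, 0.65],
]
print(bidirectional_best_hits(sim))  # [(0, 0), (1, 1)]
```

Peak 2 of run A prefers peak 2 of run B, but that peak prefers peak 1 of run A, so the pair is dropped. Requiring mutual preference is what keeps the number of false positive assignments low, at the cost of leaving some peaks unmatched.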
Affiliation(s)
- Nils Hoffmann
  - Genome Informatics Group, Faculty of Technology, Bielefeld University, Bielefeld, Germany
11
Abstract
Small molecules are central to all biological processes and metabolomics is becoming an increasingly important discovery tool. Robust, accurate and efficient experimental approaches are critical to supporting and validating predictions from post-genomic studies. To accurately predict metabolic changes and dynamics, experimental design requires multiple biological replicates and usually multiple treatments. Mass spectra from each run are processed and metabolite features are extracted. Because of machine resolution and variation in replicates, one metabolite may have different implementations (values) of retention time and mass in different spectra. A major impediment to effectively utilizing untargeted metabolomics data is ensuring accurate spectral alignment, enabling precise recognition of features (metabolites) across spectra. Existing alignment algorithms use either a global merge strategy or a local merge strategy. The former delivers an accurate alignment, but lacks efficiency. The latter is fast, but often inaccurate. Here we document a new algorithm employing a technique known as quicksort. The results on both simulated data and real data show that this algorithm provides a dramatic increase in alignment speed and also improves alignment accuracy.
Affiliation(s)
- Zheng Rong Yang
  - Biosciences, College of Life and Environmental Science, University of Exeter, Exeter, United Kingdom
12
O'Callaghan S, De Souza DP, Isaac A, Wang Q, Hodkinson L, Olshansky M, Erwin T, Appelbe B, Tull DL, Roessner U, Bacic A, McConville MJ, Likić VA. PyMS: a Python toolkit for processing of gas chromatography-mass spectrometry (GC-MS) data. Application and comparative study of selected tools. BMC Bioinformatics 2012; 13:115. [PMID: 22647087 PMCID: PMC3533878 DOI: 10.1186/1471-2105-13-115] [Received: 02/08/2012] [Accepted: 04/17/2012]
Abstract
Background Gas chromatography–mass spectrometry (GC-MS) is a technique frequently used in targeted and non-targeted measurements of metabolites. Most existing software tools for processing of raw instrument GC-MS data tightly integrate data processing methods with a graphical user interface facilitating interactive data processing. While interactive processing remains critically important in GC-MS applications, high-throughput studies increasingly dictate the need for command line tools, suitable for scripting of high-throughput, customized processing pipelines. Results PyMS comprises a library of functions for processing of instrument GC-MS data developed in Python. PyMS currently provides a complete set of GC-MS processing functions, including reading of standard data formats (ANDI-MS/NetCDF and JCAMP-DX), noise smoothing, baseline correction, peak detection, peak deconvolution, peak integration, and peak alignment by dynamic programming. A novel common ion single quantitation algorithm allows automated, accurate quantitation of GC-MS electron impact (EI) fragmentation spectra when a large number of experiments are being analyzed. PyMS implements parallel processing for by-row and by-column data processing tasks based on Message Passing Interface (MPI), allowing processing to scale on multiple CPUs in distributed computing environments. A set of specifically designed experiments was performed in-house and used to comparatively evaluate the performance of PyMS and three widely used software packages for GC-MS data processing (AMDIS, AnalyzerPro, and XCMS). Conclusions PyMS is a novel software package for the processing of raw GC-MS data, particularly suitable for scripting of customized processing pipelines and for data processing in batch mode. PyMS provides limited graphical capabilities and can be used both for routine data processing and interactive/exploratory data analysis. In real-life GC-MS data processing scenarios PyMS performs as well as or better than leading software packages. We demonstrate data processing scenarios simple to implement in PyMS, yet difficult to achieve with many conventional GC-MS data processing software packages. Automated sample processing and quantitation with PyMS can provide substantial time savings compared to more traditional interactive software systems that tightly integrate data processing with the graphical user interface.
Affiliation(s)
- Sean O'Callaghan
  - Bio21 Molecular Science and Biotechnology Institute, The University of Melbourne, Parkville, Victoria, Australia
13
Global urinary metabolic profiling procedures using gas chromatography–mass spectrometry. Nat Protoc 2011; 6:1483-99. [DOI: 10.1038/nprot.2011.375]
14
Shmookler Reis RJ, Xu L, Lee H, Chae M, Thaden JJ, Bharill P, Tazearslan C, Siegel E, Alla R, Zimniak P, Ayyadevara S. Modulation of lipid biosynthesis contributes to stress resistance and longevity of C. elegans mutants. Aging (Albany NY) 2011; 3:125-47. [PMID: 21386131 PMCID: PMC3082008 DOI: 10.18632/aging.100275]
Abstract
Many lifespan-modulating genes are involved in either generation of oxidative substrates and end-products, or their detoxification and removal. Among such metabolites, only lipoperoxides have the ability to produce free-radical chain reactions. For this study, fatty-acid profiles were compared across a panel of C. elegans mutants that span a tenfold range of longevities in a uniform genetic background. Two lipid structural properties correlated extremely well with lifespan in these worms: fatty-acid chain length and susceptibility to oxidation both decreased sharply in the longest-lived mutants (affecting the insulinlike-signaling pathway). This suggested a functional model in which longevity benefits from a reduction in lipid peroxidation substrates, offset by a coordinate decline in fatty-acid chain length to maintain membrane fluidity. This model was tested by disrupting the underlying steps in lipid biosynthesis, using RNAi knockdown to deplete transcripts of genes involved in fatty-acid metabolism. These interventions produced effects on longevity that were fully consistent with the functions and abundances of their products. Most knockdowns also produced concordant effects on survival of hydrogen peroxide stress, which can trigger lipoperoxide chain reactions.
15
Barupal DK, Kind T, Kothari SL, Lee DY, Fiehn O. Hydrocarbon phenotyping of algal species using pyrolysis-gas chromatography mass spectrometry. BMC Biotechnol 2010; 10:40. [PMID: 20492649 PMCID: PMC2883956 DOI: 10.1186/1472-6750-10-40] [Received: 10/23/2009] [Accepted: 05/21/2010]
Abstract
BACKGROUND Biofuels derived from algae biomass and algae lipids might reduce dependence on fossil fuels. Existing analytical techniques need to facilitate rapid characterization of algal species by phenotyping hydrocarbon-related constituents. RESULTS In this study, we compared the hydrocarbon-rich algae Botryococcus braunii against the photoautotrophic model algae Chlamydomonas reinhardtii using pyrolysis-gas chromatography quadrupole mass spectrometry (pyGC-MS). Sequences of up to 48 dried samples can be analyzed using pyGC-MS in an automated manner without any sample preparation. Chromatograms of 30-min run times are sufficient to profile pyrolysis products from C8 to C40 carbon chain length. The freely available software tools AMDIS and SpectConnect enable straightforward data processing. In Botryococcus samples, we identified fatty acids, vitamins, sterols and fatty acid esters and several long chain hydrocarbons. The algae species C. reinhardtii, B. braunii race A and B. braunii race B were readily discriminated using their hydrocarbon phenotypes. Substructure annotation and spectral clustering yielded network graphs of similar components for visual overviews of abundant and minor constituents. CONCLUSION Pyrolysis-GC-MS facilitates large scale screening of hydrocarbon phenotypes for comparisons of strain differences in algae or impact of altered growth and nutrient conditions.
16
Abstract
The uses of metabolic profiling technologies such as mass spectrometry and nuclear magnetic resonance spectroscopy in parasitology have been multi-faceted. Traditional uses of spectroscopic platforms focused on determining the chemical composition of drugs or natural products used for treatment of parasitic infection. A natural progression of the use of these tools led to the generation of chemical profiles of the parasite in in vitro systems, monitoring the response of the parasite to chemotherapeutics, profiling metabolic consequences in the host organism and to deriving host-parasite interactions. With the dawn of the post-genomic era the paradigm in many research areas shifted towards Systems Biology and the integration of biomolecular interactions at the level of the gene, protein and metabolite. Although these technologies have yet to deliver their full potential, metabolic profiling has a key role to play in defining diagnostic or even prognostic metabolic signatures of parasitic infection and in deciphering the molecular mechanisms underpinning the development of parasite-induced pathologies. The strengths and weaknesses of the various spectroscopic technologies and analytical strategies are summarized here with respect to achieving these goals.
17
Christin C, Hoefsloot HCJ, Smilde AK, Suits F, Bischoff R, Horvatovich PL. Time Alignment Algorithms Based on Selected Mass Traces for Complex LC-MS Data. J Proteome Res 2010; 9:1483-95. [DOI: 10.1021/pr9010124]
Affiliation(s)
- Christin Christin, Huub C. J. Hoefsloot, Age K. Smilde, Frank Suits, Rainer Bischoff, Peter L. Horvatovich
  - Analytical Biochemistry, Department of Pharmacy, University of Groningen, A. Deusinglaan 1, 9713 AV Groningen, The Netherlands; Biosystem Data Analysis, Swammerdam Institute for Life Science, University of Amsterdam, Nieuwe Achtergracht 166, 1018 WV Amsterdam, The Netherlands; and IBM T.J. Watson Research Centre, Yorktown Heights, New York 10598
18
Watkins PJ, Clifford D, Rose G, Allen D, Warner RD, Dunshea FR, Pethick DW. Sheep category can be classified using machine learning techniques applied to fatty acid profiles derivatised as trimethylsilyl esters. Animal Production Science 2010. [DOI: 10.1071/an10034]
Abstract
Eruption of permanent incisors (dentition) is used as a proxy for age for defining meat quality in Australian sheep meat. However, this approach may not be reliable. While not presently available, an objective method could be used to determine sheep age, and thus sheep category, which would then potentially remove any inaccuracies that may occur in classifying sheep meat product. Statistical classification algorithms have been successfully used in bioinformatics. In this paper we review the performance of three algorithms (support vector machines, recursive partitioning and random forests) for determining sheep age. The algorithms were applied to the measured fatty acid profiles of fat samples from 533 carcasses; 254 lamb (<1 year old), 131 hogget (~1–2 years old) and 148 mutton (>2 years old) samples. Three data pretreatments (range transformation, column mean centering and range transformation with mean centering) were also examined to determine their impact on the performance of the algorithms. The random forests algorithm, when applied to mean-centred data, gave 100% predictive accuracy when classifying sheep category. This approach could be used for the development of an objective test for determining sheep age and category.
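The three data pretreatments named above can be sketched as follows. The abstract does not give formulas, so this uses common chemometrics definitions as an assumption: range transformation rescales each column (variable) to [0, 1], and column mean centering subtracts each column's mean. The function names are illustrative, and rows are samples.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def _columns(X: Sequence[Sequence[float]]) -> Matrix:
    """Transpose a row-major matrix into a list of columns."""
    return [list(col) for col in zip(*X)]


def range_transform(X: Sequence[Sequence[float]]) -> Matrix:
    """Rescale every column to [0, 1]: (x - min) / (max - min)."""
    out = []
    for col in _columns(X):
        lo, span = min(col), max(col) - min(col)
        # A constant column has zero range; map it to 0.0 to avoid dividing by zero.
        out.append([(v - lo) / span if span else 0.0 for v in col])
    return _columns(out)  # transpose back to rows


def mean_center(X: Sequence[Sequence[float]]) -> Matrix:
    """Subtract each column's mean so that every column averages zero."""
    out = []
    for col in _columns(X):
        mean = sum(col) / len(col)
        out.append([v - mean for v in col])
    return _columns(out)


def range_then_center(X: Sequence[Sequence[float]]) -> Matrix:
    """Range transformation followed by column mean centering."""
    return mean_center(range_transform(X))


# Toy data: 3 samples (rows) x 2 fatty-acid variables (columns).
X = [[1.0, 10.0], [2.0, 30.0], [3.0, 20.0]]
print(range_transform(X))    # [[0.0, 0.0], [0.5, 1.0], [1.0, 0.5]]
print(mean_center(X))        # [[-1.0, -10.0], [0.0, 10.0], [1.0, 0.0]]
print(range_then_center(X))  # [[-0.5, -0.5], [0.0, 0.5], [0.5, 0.0]]
```

In a real classification workflow the column minima, ranges and means would be estimated on the training samples only and then reused for held-out samples, so that the reported predictive accuracy is not inflated by information from the test set.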
|
19
|
Wren JD, Gusev Y, Isokpehi RD, Berleant D, Braga-Neto U, Wilkins D, Bridges S. Proceedings of the 2009 MidSouth Computational Biology and Bioinformatics Society (MCBIOS) Conference. BMC Bioinformatics 2009; 10 Suppl 11:S1. [PMID: 19811674 PMCID: PMC3313274 DOI: 10.1186/1471-2105-10-s11-s1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/10/2022] Open
|
20
|
Zheng L, Watson DG, Johnston BF, Clark RL, Edrada-Ebel R, Elseheri W. A chemometric study of chromatograms of tea extracts by correlation optimization warping in conjunction with PCA, support vector machines and random forest data modeling. Anal Chim Acta 2008; 642:257-65. [PMID: 19427484 DOI: 10.1016/j.aca.2008.12.015] [Citation(s) in RCA: 51] [Impact Index Per Article: 3.2] [Received: 08/11/2008] [Revised: 12/07/2008] [Accepted: 12/08/2008] [Indexed: 10/21/2022]
Abstract
A reverse phase high performance liquid chromatography (HPLC) separation was established for profiling water soluble compounds in extracts from tea. Whole chromatograms were pre-processed by techniques including baseline correction, binning and normalisation. In addition, peak alignment by correction of retention time shifts was performed using correlation optimization warping (COW), producing a correlation score of 0.96. To extract the chemically relevant information from the data, a variety of chemometric approaches were employed. Principal component analysis (PCA) was used to group the tea samples according to their chromatographic differences. Three principal components (PCs) described 78% of the total variance after peak alignment (64% before), and analysis of the score and loading plots provided insight into the main chemical differences between the samples. Finally, PCA, support vector machines (SVMs) and random forest (RF) machine learning methods were evaluated comparatively on their ability to predict unknown tea samples using models constructed from a predetermined training set. The best predictions of identity were obtained by using RF.
Affiliation(s)
- L Zheng
- Strathclyde Institute of Pharmacy and Biomedical Sciences, 27 Taylor Street, Glasgow G4 0NR, UK
|
21
|
Wren JD, Wilkins D, Fuscoe JC, Bridges S, Winters-Hilt S, Gusev Y. Proceedings of the 2008 MidSouth Computational Biology and Bioinformatics Society (MCBIOS) Conference. BMC Bioinformatics 2008; 9 Suppl 9:S1. [PMID: 18793454 PMCID: PMC2537572 DOI: 10.1186/1471-2105-9-s9-s1] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/21/2022] Open
Affiliation(s)
- Jonathan D Wren
- Arthritis and Immunology Research Program, Oklahoma Medical Research Foundation; 825 N.E. 13th Street, Oklahoma City, OK 73104-5005, USA.
|