1
Emwas AH, Saccenti E, Gao X, McKay RT, dos Santos VAPM, Roy R, Wishart DS. Recommended strategies for spectral processing and post-processing of 1D 1H-NMR data of biofluids with a particular focus on urine. Metabolomics 2018; 14:31. [PMID: 29479299; PMCID: PMC5809546; DOI: 10.1007/s11306-018-1321-4] [Citation(s) in RCA: 79] [Impact Index Per Article: 13.2] [Received: 08/12/2017] [Accepted: 01/09/2018] [Indexed: 12/11/2022]
Abstract
1H NMR spectra from urine can yield information-rich data sets that offer important insights into many biological and biochemical phenomena. However, the quality and utility of these insights can be profoundly affected by how the NMR spectra are processed and interpreted. For instance, if the NMR spectra are incorrectly referenced or inconsistently aligned, the identification of many compounds will be incorrect. If the NMR spectra are mis-phased or if the baseline correction is flawed, the estimated concentrations of many compounds will be systematically biased. Furthermore, because NMR permits the measurement of concentrations spanning up to five orders of magnitude, several problems can arise with data analysis. For instance, signals originating from the most abundant metabolites may prove to be the least biologically relevant while signals arising from the least abundant metabolites may prove to be the most important but hardest to accurately and precisely measure. As a result, a number of data processing techniques such as scaling, transformation and normalization are often required to address these issues. Therefore, proper processing of NMR data is a critical step to correctly extract useful information in any NMR-based metabolomic study. In this review we highlight the significance, advantages and disadvantages of different NMR spectral processing steps that are common to most NMR-based metabolomic studies of urine. These include: chemical shift referencing, phase and baseline correction, spectral alignment, spectral binning, scaling and normalization. We also provide a set of recommendations for best practices regarding spectral and data processing for NMR-based metabolomic studies of biofluids, with a particular focus on urine.
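As an illustration of the binning, normalization and scaling steps listed in the abstract, the following is a minimal Python sketch. The function names, the choice of total-area normalization and Pareto scaling, and the toy inputs are assumptions made for illustration; the review compares several alternatives, and this is not presented as its recommended pipeline.

```python
import math
from statistics import mean, stdev


def bin_spectrum(intensities, points_per_bin):
    """Equidistant binning: sum consecutive points into fixed-width bins."""
    return [sum(intensities[i:i + points_per_bin])
            for i in range(0, len(intensities), points_per_bin)]


def total_area_normalize(binned):
    """Divide each bin by the total spectral area so the bins sum to 1."""
    total = sum(binned)
    return [value / total for value in binned]


def pareto_scale(matrix):
    """Mean-center each variable (column) and divide by the square root
    of its sample standard deviation. Constant columns are only centered."""
    scaled_columns = []
    for column in zip(*matrix):
        center = mean(column)
        spread = stdev(column)
        divisor = math.sqrt(spread) if spread > 0 else 1.0
        scaled_columns.append([(value - center) / divisor for value in column])
    # Transpose back to samples-by-variables layout.
    return [list(row) for row in zip(*scaled_columns)]


# Hypothetical toy data: two "spectra" of six points each.
spectra = [[1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
           [2.0, 2.0, 1.0, 1.0, 7.0, 7.0]]
normalized = [total_area_normalize(bin_spectrum(s, 2)) for s in spectra]
scaled = pareto_scale(normalized)
```

Normalization acts on each sample (row) to remove dilution differences, whereas scaling acts on each variable (column) to reduce the dominance of high-abundance metabolites.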
Affiliation(s)
- Abdul-Hamid Emwas
  - Imaging and Characterization Core Lab, KAUST, Thuwal, 23955-6900 Kingdom of Saudi Arabia
- Edoardo Saccenti
  - Laboratory of Systems and Synthetic Biology, Wageningen University and Research, Stippeneng 4, 6708 WE Wageningen, The Netherlands
- Xin Gao
  - Computer, Electrical and Mathematical Sciences and Engineering Division, Computational Bioscience Research Center, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955 Kingdom of Saudi Arabia
- Ryan T. McKay
  - Department of Chemistry, University of Alberta, Edmonton, Canada
- Vitor A. P. Martins dos Santos
  - Laboratory of Systems and Synthetic Biology, Wageningen University and Research, Stippeneng 4, 6708 WE Wageningen, The Netherlands
- Raja Roy
  - Centre of Biomedical Research (formerly Centre of Biomedical Magnetic Resonance), Sanjay Gandhi Post-Graduate Institute of Medical Sciences Campus, Lucknow, India
- David S. Wishart
  - Department of Biological Sciences, University of Alberta, Edmonton, Canada

2
Gu Q, Veselkov K. Bi-clustering of metabolic data using matrix factorization tools. Methods 2018; 151:12-20. [PMID: 29438828; PMCID: PMC6297113; DOI: 10.1016/j.ymeth.2018.02.004] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 01/17/2018] [Revised: 02/04/2018] [Accepted: 02/06/2018] [Indexed: 01/08/2023] [Open Access]
Abstract
We propose a positive matrix factorization bi-clustering strategy for metabolic data. The approach automatically determines the number and composition of bi-clusters. We demonstrate its superior performance compared to other techniques.
Metabolic phenotyping technologies based on Nuclear Magnetic Resonance (NMR) spectroscopy and Mass Spectrometry (MS) generate vast amounts of unrefined data from biological samples. Clustering strategies are frequently employed to provide insight into patterns of relationships between samples and metabolites. Here, we propose the use of a non-negative matrix factorization driven bi-clustering strategy for metabolic phenotyping data in order to discover subsets of interrelated metabolites that exhibit similar behaviour across subsets of samples. The proposed strategy incorporates bi-cross validation and statistical segmentation techniques to automatically determine the number and structure of bi-clusters. This contrasts with the widely used conventional clustering approaches, which incorporate all molecular peaks for clustering in metabolic studies and require a priori specification of the number of clusters. We perform a comparative analysis of the proposed strategy against other bi-clustering approaches that were developed in the context of genomics and transcriptomics research. We demonstrate the superior performance of the proposed bi-clustering strategy on both simulated (NMR) and real (MS) bacterial metabolic data.
Affiliation(s)
- Quan Gu
  - MRC-University of Glasgow Centre for Virus Research, University of Glasgow, Garscube Estate, Glasgow G61 1QH, UK
- Kirill Veselkov
  - Department of Surgery and Cancer, Faculty of Medicine, Imperial College London, Sir Alexander Fleming Building, Exhibition Road, South Kensington, London SW7 2AZ, UK

3
Puig-Castellví F, Alfonso I, Tauler R. Untargeted assignment and automatic integration of 1H NMR metabolomic datasets using a multivariate curve resolution approach. Anal Chim Acta 2017; 964:55-66. [PMID: 28351639; DOI: 10.1016/j.aca.2017.02.010] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 10/19/2016] [Revised: 02/09/2017] [Accepted: 02/10/2017] [Indexed: 01/06/2023]
Abstract
In this article, we propose the use of the Multivariate Curve Resolution - Alternating Least Squares (MCR-ALS) chemometrics method to resolve the 1H NMR spectra and concentration of the individual metabolites in their mixtures in untargeted metabolomics studies. A decision tree-based strategy is presented to optimally select and implement spectra estimates and equality constraints during MCR-ALS optimization. The proposed method has been satisfactorily evaluated using different 1H NMR metabolomics datasets. In a first study, 1H NMR spectra of the metabolites in a simulated mixture were successfully recovered and assigned. In a second study, more than 30 metabolites were characterized and quantified from an experimental unknown mixture analyzed by 1H NMR. In this work, MCR-ALS is shown to be a convenient tool for metabolite investigation and sample screening using 1H NMR, and it opens a new path for performing metabolomics studies with this chemometric technique.
Affiliation(s)
- Francesc Puig-Castellví
  - Department of Environmental Chemistry, Institute of Environmental Assessment and Water Research (IDAEA-CSIC), Barcelona, Spain
- Ignacio Alfonso
  - Department of Biological Chemistry and Molecular Modelling, Institute of Advanced Chemistry of Catalonia (IQAC-CSIC), Barcelona, Spain
- Romà Tauler
  - Department of Environmental Chemistry, Institute of Environmental Assessment and Water Research (IDAEA-CSIC), Barcelona, Spain

4
Ramachandran GK, Yong WP, Yeow CH. Identification of Gastric Cancer Biomarkers Using 1H Nuclear Magnetic Resonance Spectrometry. PLoS One 2016; 11:e0162222. [PMID: 27611679; PMCID: PMC5017672; DOI: 10.1371/journal.pone.0162222] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 07/04/2016] [Accepted: 08/18/2016] [Indexed: 12/16/2022] [Open Access]
Abstract
Existing methods for diagnosing gastric cancer are invasive; hence, a reliable non-invasive diagnostic method is needed. As a starting point, we used 1H NMR to identify gastric cancer biomarkers using a panel of gastric cancer spheroids and normal gastric spheroids. We were able to identify 8 chemical shift biomarkers for gastric cancer spheroids. Our data suggest that the cancerous and non-cancerous spheroids differ significantly in lipid composition and energy metabolism. These results encourage the translation of these biomarkers into an in vivo gastric cancer detection methodology using MRI-MS.
Affiliation(s)
- Wei Peng Yong
  - Department of Haematology-Oncology, National University Cancer Institute, Singapore (NCIS), Singapore
- Chen Hua Yeow
  - Department of Biomedical Engineering, National University of Singapore, Singapore, Singapore

5
Zou X, Holmes E, Nicholson JK, Loo RL. Automatic Spectroscopic Data Categorization by Clustering Analysis (ASCLAN): A Data-Driven Approach for Distinguishing Discriminatory Metabolites for Phenotypic Subclasses. Anal Chem 2016; 88:5670-9. [PMID: 27149575; DOI: 10.1021/acs.analchem.5b04020] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Indexed: 01/01/2023]
Abstract
We propose a novel data-driven approach aiming to reliably distinguish discriminatory metabolites from nondiscriminatory metabolites for a given spectroscopic data set containing two biological phenotypic subclasses. The automatic spectroscopic data categorization by clustering analysis (ASCLAN) algorithm aims to categorize spectral variables within a data set into three clusters corresponding to noise, nondiscriminatory and discriminatory metabolite regions. This is achieved by clustering each spectral variable based on the r² value representing the loading weight of each spectral variable as extracted from an orthogonal partial least-squares discriminant analysis (OPLS-DA) model of the data set. The variables are ranked according to r² values, and a series of principal component analysis (PCA) models is then built for subsets of these spectral data corresponding to ranges of r² values. The Q²X value for each PCA model is extracted. K-means clustering is then applied to the Q²X values to generate two clusters based on a minimum Euclidean distance criterion. The cluster consisting of lower Q²X values is deemed devoid of metabolic information (noise), while the cluster consisting of higher Q²X values is further subclustered into two groups based on the r² values. We considered the cluster with high Q²X but low r² values as nondiscriminatory, and the cluster with high Q²X and high r² values as discriminatory variables. The boundaries between these three clusters of spectral variables, on the basis of the r² values, were taken as the cutoff values for defining the noise, nondiscriminatory and discriminatory variables. We evaluated the ASCLAN algorithm using six simulated ¹H NMR spectroscopic data sets representing small, medium and large data sets (N = 50, 500, and 1000 samples per group, respectively), each with a reduced and full resolution set of variables (0.005 and 0.0005 ppm, respectively).
ASCLAN correctly identified all discriminatory metabolites and showed zero false positives (100% specificity and positive predictive value) irrespective of the spectral resolution or the sample size in all six simulated data sets. This error rate was found to be superior to existing methods for ascertaining feature significance: univariate t test with Bonferroni correction (up to 10% false positive rate), Benjamini-Hochberg correction (up to 35% false positive rate) and metabolome-wide significance level (MWSL, up to 0.4% false positive rate), as well as to various OPLS-DA parameters: variable importance to projection (up to 15% false positive rate), loading coefficients (up to 35% false positive rate), and regression coefficients (up to 39% false positive rate). The application of ASCLAN was further exemplified using a widely investigated renal toxin, mercury(II) chloride (HgCl2), in a rat model. ASCLAN successfully identified many of the known metabolites related to renal toxicity, such as increased excretion of urinary creatinine and different amino acids. The ASCLAN algorithm provides a framework for reliably differentiating discriminatory metabolites from nondiscriminatory metabolites in a biological data set without the need to set an arbitrary cutoff value, as applied in some of the conventional methods. This offers significant advantages over existing methods and the possibility of automating high-throughput screening in "omics" data.
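The two-cluster k-means step that separates noise from informative variables can be sketched in a few lines of Python. This is a simplified illustration of that single step only, not the authors' implementation: the function name, the initialization at the minimum and maximum, and the Q²X values are hypothetical, and the OPLS-DA and PCA modeling that would produce such values is omitted.

```python
def kmeans_1d_two_clusters(values, max_iter=100):
    """Split one-dimensional values into a low and a high cluster by
    minimum distance to the cluster centers (two-cluster k-means)."""
    low_center, high_center = min(values), max(values)
    low, high = list(values), []
    for _ in range(max_iter):
        # Assign every value to the nearer center (ties go to the low cluster).
        low = [v for v in values if abs(v - low_center) <= abs(v - high_center)]
        high = [v for v in values if abs(v - low_center) > abs(v - high_center)]
        new_low = sum(low) / len(low)
        new_high = sum(high) / len(high) if high else high_center
        if new_low == low_center and new_high == high_center:
            break  # centers stopped moving
        low_center, high_center = new_low, new_high
    return low, high


# Hypothetical Q2X values from a series of PCA models.
q2x = [0.02, 0.05, 0.03, 0.81, 0.90, 0.85]
noise_like, informative = kmeans_1d_two_clusters(q2x)
```

In the procedure described above, the higher-Q²X cluster would then be subclustered again on the r² values to separate nondiscriminatory from discriminatory variables.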
Affiliation(s)
- Xin Zou
  - Key Laboratory of Systems Biomedicine (Ministry of Education), Shanghai Center for Systems Biomedicine, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai 200240, China
  - Medway Metabonomics Research Group, Medway School of Pharmacy, Universities of Kent and Greenwich, Chatham Maritime, Kent, ME4 4TB, U.K.
- Elaine Holmes
  - Section of Biomolecular Medicine, Division of Computational and Systems Medicine, Department of Surgery and Cancer, Imperial College London, London SW7 2AZ, U.K.
  - MRC-NIHR Phenome Centre, Imperial College London, London SW7 2AZ, U.K.
- Jeremy K Nicholson
  - Section of Biomolecular Medicine, Division of Computational and Systems Medicine, Department of Surgery and Cancer, Imperial College London, London SW7 2AZ, U.K.
  - MRC-NIHR Phenome Centre, Imperial College London, London SW7 2AZ, U.K.
- Ruey Leng Loo
  - Medway Metabonomics Research Group, Medway School of Pharmacy, Universities of Kent and Greenwich, Chatham Maritime, Kent, ME4 4TB, U.K.
  - Section of Biomolecular Medicine, Division of Computational and Systems Medicine, Department of Surgery and Cancer, Imperial College London, London SW7 2AZ, U.K.

6
Xia YG, Liang J, Yang BY, Wang QH, Kuang HX. Structural studies of an arabinan from the stems of Ephedra sinica by methylation analysis and 1D and 2D NMR spectroscopy. Carbohydr Polym 2015; 121:449-56. [DOI: 10.1016/j.carbpol.2014.12.058] [Citation(s) in RCA: 39] [Impact Index Per Article: 4.3] [Received: 11/02/2014] [Revised: 12/23/2014] [Accepted: 12/25/2014] [Indexed: 10/24/2022]

7
Zou X, Holmes E, Nicholson JK, Loo RL. Statistical HOmogeneous Cluster SpectroscopY (SHOCSY): an optimized statistical approach for clustering of ¹H NMR spectral data to reduce interference and enhance robust biomarkers selection. Anal Chem 2014; 86:5308-15. [PMID: 24773160; PMCID: PMC4110102; DOI: 10.1021/ac500161k] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Received: 01/04/2014] [Accepted: 04/28/2014] [Indexed: 12/24/2022]
Abstract
We propose a novel statistical approach to improve the reliability of ¹H NMR spectral analysis in complex metabolic studies. The Statistical HOmogeneous Cluster SpectroscopY (SHOCSY) algorithm aims to reduce the variation within biological classes by selecting subsets of homogeneous ¹H NMR spectra that contain specific spectroscopic metabolic signatures related to each biological class in a study. In SHOCSY, we used a clustering method to categorize the whole data set into a number of clusters of samples, with each cluster showing a similar spectral feature and hence biochemical composition, and we then used an enrichment test to identify the associations between the clusters and the biological classes in the data set. We evaluated the performance of the SHOCSY algorithm using a simulated ¹H NMR data set emulating renal tubule toxicity and further exemplified this method with a ¹H NMR spectroscopic study of hydrazine-induced liver toxicity in rats. The SHOCSY algorithm improved the predictive ability of the orthogonal partial least-squares discriminatory analysis (OPLS-DA) model through the use of "truly" representative samples in each biological class (i.e., homogeneous subsets). This method ensures that the analyses are no longer confounded by idiosyncratic responders and thus improves the reliability of biomarker extraction. SHOCSY is a useful tool for removing irrelevant variation that interferes with the interpretation and predictive ability of models and has widespread applicability to other spectroscopic data, as well as other "omics" types of data.
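The enrichment step, which tests whether a cluster of samples is associated with a biological class, can be illustrated with a one-sided hypergeometric test. The abstract does not name the specific enrichment test used, so the hypergeometric choice, the function name and the example counts below are assumptions for illustration only.

```python
from math import comb


def hypergeom_enrichment_p(n_total, class_total, cluster_size, class_in_cluster):
    """Probability of observing at least `class_in_cluster` members of a
    class in a cluster of `cluster_size` samples drawn without replacement
    from `n_total` samples, of which `class_total` belong to the class."""
    upper = min(cluster_size, class_total)
    favourable = sum(
        comb(class_total, k) * comb(n_total - class_total, cluster_size - k)
        for k in range(class_in_cluster, upper + 1)
    )
    return favourable / comb(n_total, cluster_size)


# Hypothetical example: 20 samples, 10 of them treated; a cluster of
# 5 samples contains only treated animals.
p_value = hypergeom_enrichment_p(n_total=20, class_total=10,
                                 cluster_size=5, class_in_cluster=5)
```

A small p-value suggests the cluster is enriched for that class, so its spectra could serve as a homogeneous subset for that class.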
Affiliation(s)
- Xin Zou
  - Medway School of Pharmacy, Universities of Kent and Greenwich, Anson Building, Central Avenue, Chatham, Kent ME4 4TB, U.K.
- Elaine Holmes
  - Section of Biomolecular Medicine, Department of Surgery and Cancer, Imperial College London, South Kensington Campus, London SW7 2AZ, U.K.
  - MRC-HPA Centre for Environment and Health, Imperial College London, 150 Stamford Street, London SE1 9NH, U.K.
- Jeremy K. Nicholson
  - Section of Biomolecular Medicine, Department of Surgery and Cancer, Imperial College London, South Kensington Campus, London SW7 2AZ, U.K.
  - MRC-HPA Centre for Environment and Health, Imperial College London, 150 Stamford Street, London SE1 9NH, U.K.
- Ruey Leng Loo
  - Medway School of Pharmacy, Universities of Kent and Greenwich, Anson Building, Central Avenue, Chatham, Kent ME4 4TB, U.K.
  - Section of Biomolecular Medicine, Department of Surgery and Cancer, Imperial College London, South Kensington Campus, London SW7 2AZ, U.K.

8
Blaise BJ. Data-driven sample size determination for metabolic phenotyping studies. Anal Chem 2013; 85:8943-50. [PMID: 23972438; DOI: 10.1021/ac4022314] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.4] [Indexed: 11/30/2022]
Abstract
Sample size determination is a key question in the experimental design of medical studies. The number of patients to include in a clinical study is actually critical to evaluate costs and inclusion requirements to achieve a sufficient statistical power of test and the identification of significant variations among the factors under study. Metabolic phenotyping is an expanding field of translational research in medicine, focusing on the identification of metabolism rearrangements due to various pathophysiological conditions. This top-down hypothesis-free approach uses analytical chemistry methods, coupled to statistical analysis, to quantify subtle and coordinated metabolite concentration variations and eventually identify candidate biomarkers. The sample size determination in metabolic phenotyping studies is difficult considering the absence of a priori metabolic target. This technical note introduces a data-driven sample size determination for metabolic phenotyping studies. Starting from nuclear magnetic resonance (NMR) spectra belonging to a small cohort, metabolic NMR variables are identified by the statistical recoupling of variables (SRV) procedure. A larger data set is then generated on the basis of Kernel density estimation of SRV variable distributions. Statistically significant variations of metabolic NMR signals identified by SRV are assessed by the Benjamini-Yekutieli correction for simulated data sets of variable sizes. Simulated model robustness is evaluated by receiver operating characteristic analysis (sensitivity and specificity) on an independent cohort and cross-validation. Sample size determination is obtained by identifying the optimal data set size, depending on the purpose of the study: at least one statistically significant variation (biomarker discovery) or a maximum of statistically significant variations (metabolic exploration).
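The Benjamini-Yekutieli step used to assess significance across many NMR variables can be sketched as follows. This is a generic illustration of the correction itself; the function name and the example p-values are hypothetical, and the SRV and kernel density simulation stages described above are not reproduced.

```python
def benjamini_yekutieli(p_values, alpha=0.05):
    """Return the indices of hypotheses rejected by the Benjamini-Yekutieli
    procedure, which controls the false discovery rate under arbitrary
    dependence between tests."""
    m = len(p_values)
    if m == 0:
        return []
    # Harmonic correction factor c(m) = 1 + 1/2 + ... + 1/m.
    c_m = sum(1.0 / i for i in range(1, m + 1))
    order = sorted(range(m), key=lambda i: p_values[i])
    largest_rank = 0
    for rank, index in enumerate(order, start=1):
        if p_values[index] <= rank * alpha / (m * c_m):
            largest_rank = rank
    # Reject every hypothesis ranked at or below the largest passing rank.
    return sorted(order[:largest_rank])


# Hypothetical p-values for four NMR variables.
significant = benjamini_yekutieli([0.5, 0.001, 0.03, 0.01], alpha=0.05)
```

Counting the rejected variables for simulated data sets of increasing size gives the kind of curve from which a sample size could be read off for either study purpose.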
Affiliation(s)
- Benjamin J Blaise
  - Hospices Civils de Lyon, Département de réanimation néonatale et néonatalogie, Hôpital Femme Mère Enfant, 59 bd Pinel, Bron Cedex, Bron 69677, France

9
Meyer H, Weidmann H, Lalk M. Methodological approaches to help unravel the intracellular metabolome of Bacillus subtilis. Microb Cell Fact 2013; 12:69. [PMID: 23844891; PMCID: PMC3722095; DOI: 10.1186/1475-2859-12-69] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.3] [Received: 12/06/2012] [Accepted: 07/01/2013] [Indexed: 11/16/2022] [Open Access]
Abstract
Background: Bacillus subtilis (B. subtilis) has become widely accepted as a model organism for studies on Gram-positive bacteria. A deeper insight into the physiology of this prokaryote requires advanced studies of its metabolism. To provide a reliable basis for metabolome investigations, a validated experimental protocol is needed since the quality of the analytical sample and the final data are strongly affected by the sampling steps. To ensure that the sample analyzed precisely reflects the biological condition of interest, outside biases have to be avoided during sample preparation.
Results: Procedures for sampling, quenching, extraction of metabolites, cell disruption, as well as metabolite leakage were tested and optimized for B. subtilis. In particular, the energy status of the bacterial cell, characterized by the adenylate energy charge, was used to evaluate sampling accuracy. Moreover, the results of the present study demonstrate that the cultivation medium can affect the efficiency of the developed sampling procedure.
Conclusion: The final workflow presented here allows for the reproducible and reliable generation of physiological data. The method with the highest qualitative and quantitative metabolite yield was chosen and, when used together with complementary bioanalytical methods (i.e., GC-MS, LC-MS and 1H-NMR), provides a solid basis to gather information on the metabolome of B. subtilis.
Affiliation(s)
- Hanna Meyer
  - Institute of Biochemistry, Ernst-Moritz-Arndt-University Greifswald, Felix-Hausdorff-Strasse 4, 17487 Greifswald, Germany

10

11
Cazier JB, Kaisaki PJ, Argoud K, Blaise BJ, Veselkov K, Ebbels TMD, Tsang T, Wang Y, Bihoreau MT, Mitchell SC, Holmes EC, Lindon JC, Scott J, Nicholson JK, Dumas ME, Gauguier D. Untargeted Metabolome Quantitative Trait Locus Mapping Associates Variation in Urine Glycerate to Mutant Glycerate Kinase. J Proteome Res 2011; 11:631-42. [DOI: 10.1021/pr200566t] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.7] [Indexed: 02/03/2023]
Affiliation(s)
- Jean-Baptiste Cazier
  - Wellcome Trust Centre for Human Genetics, Roosevelt Drive, University of Oxford, Oxford OX3 7BN, U.K.
- Pamela J. Kaisaki
  - Wellcome Trust Centre for Human Genetics, Roosevelt Drive, University of Oxford, Oxford OX3 7BN, U.K.
- Karène Argoud
  - Wellcome Trust Centre for Human Genetics, Roosevelt Drive, University of Oxford, Oxford OX3 7BN, U.K.
- Benjamin J. Blaise
  - Université de Lyon, FRE 3008 CNRS/ENS-Lyon/UCBL1 Centre de RMN à Très Hauts Champs, 5 rue de la Doua, 69100 Villeurbanne, France
- Kirill Veselkov
  - Biomolecular Medicine, Department of Surgery and Cancer, Imperial College London, South Kensington, London SW7 2AZ, U.K.
- Timothy M. D. Ebbels
  - Biomolecular Medicine, Department of Surgery and Cancer, Imperial College London, South Kensington, London SW7 2AZ, U.K.
- Tsz Tsang
  - Biomolecular Medicine, Department of Surgery and Cancer, Imperial College London, South Kensington, London SW7 2AZ, U.K.
- Yulan Wang
  - State Key Laboratory of Magnetic Resonance and Atomic and Molecular Physics, Wuhan Centre for Magnetic Resonance, Wuhan Institute of Physics and Mathematics, The Chinese Academy of Sciences, PR China
- Marie-Thérèse Bihoreau
  - Wellcome Trust Centre for Human Genetics, Roosevelt Drive, University of Oxford, Oxford OX3 7BN, U.K.
- Steve C. Mitchell
  - Biomolecular Medicine, Department of Surgery and Cancer, Imperial College London, South Kensington, London SW7 2AZ, U.K.
  - Metabometrix Ltd, Imperial Incubator, Bessemer Building, Prince Consort Road, London SW7 2BP, U.K.
- Elaine C. Holmes
  - Biomolecular Medicine, Department of Surgery and Cancer, Imperial College London, South Kensington, London SW7 2AZ, U.K.
- John C. Lindon
  - Biomolecular Medicine, Department of Surgery and Cancer, Imperial College London, South Kensington, London SW7 2AZ, U.K.
- James Scott
  - Department of Vascular Science, National Heart and Lung Institute, Imperial College London, South Kensington, London SW7 2AZ, U.K.
- Jeremy K. Nicholson
  - Biomolecular Medicine, Department of Surgery and Cancer, Imperial College London, South Kensington, London SW7 2AZ, U.K.
- Marc-Emmanuel Dumas
  - Université de Lyon, FRE 3008 CNRS/ENS-Lyon/UCBL1 Centre de RMN à Très Hauts Champs, 5 rue de la Doua, 69100 Villeurbanne, France
  - Biomolecular Medicine, Department of Surgery and Cancer, Imperial College London, South Kensington, London SW7 2AZ, U.K.
  - State Key Laboratory of Magnetic Resonance and Atomic and Molecular Physics, Wuhan Centre for Magnetic Resonance, Wuhan Institute of Physics and Mathematics, The Chinese Academy of Sciences, PR China
- Dominique Gauguier
  - Wellcome Trust Centre for Human Genetics, Roosevelt Drive, University of Oxford, Oxford OX3 7BN, U.K.
  - Centre de Recherche des Cordeliers, INSERM UMRS872, 15 rue de l’Ecole de Médecine, 75006 Paris, France

12
Sands CJ, Coen M, Ebbels TMD, Holmes E, Lindon JC, Nicholson JK. Data-driven approach for metabolite relationship recovery in biological 1H NMR data sets using iterative statistical total correlation spectroscopy. Anal Chem 2011; 83:2075-82. [PMID: 21323345; DOI: 10.1021/ac102870u] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.8] [Indexed: 12/30/2022]
Abstract
Statistical total correlation spectroscopy (STOCSY) is a well-established and valuable method in the elucidation of both inter- and intrametabolite correlations in NMR metabonomic data sets. Here, the STOCSY approach is extended in a novel Iterative-STOCSY (I-STOCSY) tool in which correlations are calculated initially from a driver peak of interest and subsequently for all peaks identified as correlating with a correlation coefficient greater than a set threshold. Consequently, in a single automated run, the majority of the information contained in multiple STOCSY calculations from all peaks recursively correlated to the original user-defined driver peak of interest is recovered. In addition, highly correlating peaks are clustered into putative structurally related sets, and the results are presented in a fully interactive plot where each set is represented by a node; node-to-node connections are plotted alongside corresponding spectral data colored by the strength of connection, thus allowing the intuitive exploration of both inter- and intrametabolite connections. The I-STOCSY approach has been applied here to a ¹H NMR data set of 24 h postdose aqueous liver extracts from rats treated with the model hepatotoxin galactosamine and has been shown both to recover the previously deduced major metabolic effects of treatment and to generate new hypotheses even on this well-studied model system. I-STOCSY thus represents a significant advance in correlation-based analysis and visualization, providing insight into inter- and intrametabolite relationships following metabolic perturbations.
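The recursive idea, starting from a driver peak and then repeating the correlation search from every newly recovered peak, can be sketched as a breadth-first traversal. This is an illustrative simplification rather than the published tool: the function names, the use of Pearson correlation, the 0.9 threshold and the toy peak intensities are assumptions, and the clustering and interactive plotting stages are omitted.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 for constant inputs."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def iterative_stocsy(peaks, driver, threshold=0.9):
    """Recover all peaks reachable from `driver` through chains of
    correlations above `threshold`. `peaks` maps a peak label to its
    intensities across samples."""
    recovered = [driver]
    queue = [driver]
    while queue:
        current = queue.pop(0)
        for name, values in peaks.items():
            if name not in recovered and pearson(peaks[current], values) > threshold:
                recovered.append(name)
                queue.append(name)  # the new peak becomes a driver in turn
    return recovered


# Hypothetical intensities of three peaks across four samples.
peaks = {"A": [1, 2, 3, 4], "B": [2, 4, 6, 8], "C": [1, -1, -1, 1]}
related = iterative_stocsy(peaks, driver="A")
```

Because each recovered peak is used as a driver in turn, a peak that correlates only weakly with the original driver can still be reached through an intermediate peak.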
Affiliation(s)
- Caroline J Sands
  - Biomolecular Medicine, Department of Surgery and Cancer, Faculty of Medicine, Imperial College London, Sir Alexander Fleming Building, South Kensington, SW7 2AZ, United Kingdom