1
Måge I, Wubshet SG, Wold JP, Solberg LE, Böcker U, Dankel K, Lintvedt TA, Kafle B, Cattaldo M, Matić J, Sorokina L, Afseth NK. The role of biospectroscopy and chemometrics as enabling technologies for upcycling of raw materials from the food industry. Anal Chim Acta 2023; 1284:342005. [PMID: 37996160 DOI: 10.1016/j.aca.2023.342005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/29/2023] [Revised: 09/25/2023] [Accepted: 11/05/2023] [Indexed: 11/25/2023]
Abstract
It is important to utilize the entire animal in meat and fish production to ensure sustainability. Rest raw materials, such as bones, heads, trimmings, and skin, contain essential nutrients that can be transformed into high-value products. Enzymatic protein hydrolysis (EPH) is a bioprocess that can upcycle these materials to create valuable proteins and fats. This paper focuses on the role of spectroscopy and chemometrics in characterizing the quality of the resulting protein product and understanding how raw material quality and processing affect it. The article presents recent developments in chemical characterisation and process modelling, with a focus on rest raw materials from poultry and salmon production. Even if some of the technology is relatively mature and implemented in many laboratories and industries, there are still open challenges and research questions. The main challenges are related to the transition of technology and insights from laboratory to industrial scale, and the link between peptide composition and critical product quality attributes.
Affiliation(s)
- Ingrid Måge
- Nofima - Norwegian Institute for Food, Fisheries and Aquaculture Research, Muninbakken 9-13, Breivika, 9291, Tromsø, Norway.
- Sileshi Gizachew Wubshet
- Nofima - Norwegian Institute for Food, Fisheries and Aquaculture Research, Muninbakken 9-13, Breivika, 9291, Tromsø, Norway
- Jens Petter Wold
- Nofima - Norwegian Institute for Food, Fisheries and Aquaculture Research, Muninbakken 9-13, Breivika, 9291, Tromsø, Norway
- Lars Erik Solberg
- Nofima - Norwegian Institute for Food, Fisheries and Aquaculture Research, Muninbakken 9-13, Breivika, 9291, Tromsø, Norway
- Ulrike Böcker
- Nofima - Norwegian Institute for Food, Fisheries and Aquaculture Research, Muninbakken 9-13, Breivika, 9291, Tromsø, Norway
- Katinka Dankel
- Nofima - Norwegian Institute for Food, Fisheries and Aquaculture Research, Muninbakken 9-13, Breivika, 9291, Tromsø, Norway
- Tiril Aurora Lintvedt
- Nofima - Norwegian Institute for Food, Fisheries and Aquaculture Research, Muninbakken 9-13, Breivika, 9291, Tromsø, Norway; Norwegian University of Life Sciences, Faculty of Science and Technology, 1432, Ås, Norway
- Bijay Kafle
- Nofima - Norwegian Institute for Food, Fisheries and Aquaculture Research, Muninbakken 9-13, Breivika, 9291, Tromsø, Norway; Norwegian University of Life Sciences, Faculty of Science and Technology, 1432, Ås, Norway
- Marco Cattaldo
- Nofima - Norwegian Institute for Food, Fisheries and Aquaculture Research, Muninbakken 9-13, Breivika, 9291, Tromsø, Norway; Universidad Politécnica de Valencia, Department of Applied Statistics, Operations Research and Quality, 46022, Valencia, Spain
- Josipa Matić
- Nofima - Norwegian Institute for Food, Fisheries and Aquaculture Research, Muninbakken 9-13, Breivika, 9291, Tromsø, Norway
- Liudmila Sorokina
- Nofima - Norwegian Institute for Food, Fisheries and Aquaculture Research, Muninbakken 9-13, Breivika, 9291, Tromsø, Norway; University of Oslo, Department of Chemistry, 0371, Oslo, Norway
- Nils Kristian Afseth
- Nofima - Norwegian Institute for Food, Fisheries and Aquaculture Research, Muninbakken 9-13, Breivika, 9291, Tromsø, Norway
2
de Figueiredo M, Rudaz S, Boccard J. A unified strategy to rebalance multifactorial designs with unequal group sizes: application to analysis of variance multiblock orthogonal partial least squares. Anal Chim Acta 2023; 1263:341284. [PMID: 37225336 DOI: 10.1016/j.aca.2023.341284] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 12/22/2022] [Revised: 04/19/2023] [Accepted: 04/25/2023] [Indexed: 05/26/2023]
Abstract
BACKGROUND Adequately handling unbalanced groups remains one of the major challenges for the analysis of multivariate data collected from multifactorial experimental designs. While partial least squares-based methods, such as analysis of variance multiblock orthogonal partial least squares (AMOPLS), can offer better discrimination between factor levels, they can be more heavily affected by this issue, and unbalanced designs of experiments may lead to a substantial confusion of the effects. Even state-of-the-art analysis of variance (ANOVA) decomposition methodologies using general linear models (GLM) lack the ability to efficiently disentangle these sources of variation when combined with AMOPLS. RESULTS A versatile solution developed as an extension of a prior rebalancing strategy is proposed for the first decomposition step based on ANOVA. This approach has the advantage of yielding an unbiased estimation of the parameters and retaining the within-group variation in the rebalanced design, while preserving the orthogonality of effect matrices, even in the presence of unequal group sizes. This property is of utmost importance for model interpretation because it avoids mixing sources of variation related to the different effects in the design. A real case study involving metabolomic data from in vitro toxicological experiments was used to demonstrate the potential of this strategy to handle unequal group sizes using a supervised approach. Primary 3D rat neural cell cultures were exposed to trimethyltin following a multifactorial design of experiments involving three fixed effect factors. SIGNIFICANCE AND NOVELTY The rebalancing strategy was demonstrated as a novel and potent solution to handle unbalanced experimental designs by offering unbiased parameter estimators and orthogonal submatrices, thus avoiding confusion of the effects and facilitating model interpretation. Moreover, it can be combined with any multivariate method used for the analysis of high-dimensional data collected from multifactorial designs.
Affiliation(s)
- Miguel de Figueiredo
- School of Pharmaceutical Sciences, University of Geneva, Geneva, Switzerland; Institute of Pharmaceutical Sciences of Western Switzerland, University of Geneva, Geneva, Switzerland
- Serge Rudaz
- School of Pharmaceutical Sciences, University of Geneva, Geneva, Switzerland; Institute of Pharmaceutical Sciences of Western Switzerland, University of Geneva, Geneva, Switzerland
- Julien Boccard
- School of Pharmaceutical Sciences, University of Geneva, Geneva, Switzerland; Institute of Pharmaceutical Sciences of Western Switzerland, University of Geneva, Geneva, Switzerland.
3
Camacho J, Vitale R, Morales-Jiménez D, Gómez-Llorente C. Variable-selection ANOVA Simultaneous Component Analysis (VASCA). Bioinformatics 2022; 39:6887137. [PMID: 36495189 PMCID: PMC9825241 DOI: 10.1093/bioinformatics/btac795] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/29/2022] [Revised: 11/07/2022] [Accepted: 12/09/2022] [Indexed: 12/14/2022] Open
Abstract
MOTIVATION ANOVA Simultaneous Component Analysis (ASCA) is a popular method for the analysis of multivariate data yielded by designed experiments. Meaningful associations between factors/interactions of the experimental design and measured variables in the dataset are typically identified via significance testing, with permutation tests being the standard go-to choice. However, in settings with large numbers of variables, like omics (genomics, transcriptomics, proteomics and metabolomics) experiments, the 'holistic' testing approach of ASCA (all variables considered) often overlooks statistically significant effects encoded by only a few variables (biomarkers). RESULTS We hereby propose Variable-selection ASCA (VASCA), a method that generalizes ASCA through variable selection, augmenting its statistical power without inflating the Type-I error risk. The method is evaluated with simulations and with a real dataset from a multi-omic clinical experiment. We show that VASCA is more powerful than both ASCA and the widely adopted false discovery rate controlling procedure; the latter is used as a benchmark for variable selection based on multiple significance testing. We further illustrate the usefulness of VASCA for exploratory data analysis in comparison to the popular partial least squares discriminant analysis method and its sparse counterpart. AVAILABILITY AND IMPLEMENTATION The code for VASCA is available in the MEDA Toolbox at https://github.com/josecamachop/MEDA-Toolbox (release v1.3). The simulation results and motivating example can be reproduced using the repository at https://github.com/josecamachop/VASCA/tree/v1.0.0 (DOI 10.5281/zenodo.7410623). SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Raffaele Vitale
- University of Lille, CNRS, LASIRE (UMR 8516), Laboratoire Avancé de Spectroscopie pour les Interactions, la Réactivité et l’Environnement, Lille F-59000, France
- David Morales-Jiménez
- Signal Theory, Networking and Communications Department, University of Granada, Granada 18014, Spain
- Carolina Gómez-Llorente
- Department of Biochemistry and Molecular Biology II, School of Pharmacy, Institute of Nutrition and Food Technology “José Mataix”, Biomedical Research Center, University of Granada, Granada 18160, Spain; Instituto de Investigación Biosanitaria, ibs.GRANADA, Granada, Spain; CIBEROBN (Physiopathology of Obesity and Nutrition CB12/03/30038), Instituto de Salud Carlos III, Madrid 28029, Spain
4
Comparison of Multivariate ANOVA-Based Approaches for the Determination of Relevant Variables in Experimentally Designed Metabolomic Studies. Molecules 2022; 27:molecules27103304. [PMID: 35630781 PMCID: PMC9147242 DOI: 10.3390/molecules27103304] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/26/2022] [Revised: 05/08/2022] [Accepted: 05/19/2022] [Indexed: 02/01/2023] Open
Abstract
The use of chemometric methods based on the analysis of variance (ANOVA) allows evaluation of the statistical significance of the experimental factors used in a study. However, classical multivariate ANOVA (MANOVA) has a number of requirements that make it impractical for dealing with metabolomics data. For this reason, different options that overcome these limitations have appeared in recent years. In this work, we evaluate the performance of three of these multivariate ANOVA-based methods (ANOVA simultaneous component analysis, ASCA; regularized MANOVA, rMANOVA; and group-wise ANOVA-simultaneous component analysis, GASCA) in the framework of metabolomics studies. Our main goals are to compare these ANOVA-based approaches and evaluate their performance on experimentally designed metabolomic studies, both in finding the significant factors and in identifying the most relevant variables (potential markers) from the obtained results. Two experimental data sets with designs of different complexity were generated by liquid chromatography coupled to mass spectrometry (LC-MS) to evaluate the performance of the statistical approaches. Results show that the three ANOVA-based methods perform similarly in detecting statistically significant factors. However, the relevant variables pointed out by GASCA seem to be more reliable, as they closely match the variables detected by the widely used partial least squares discriminant analysis (PLS-DA) method.
5
Pérez-Cova M, Jaumot J, Tauler R. Untangling comprehensive two-dimensional liquid chromatography data sets using regions of interest and multivariate curve resolution approaches. Trends Analyt Chem 2021. [DOI: 10.1016/j.trac.2021.116207] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 12/12/2022]
6
Chemometric applications in metabolomic studies using chromatography-mass spectrometry. Trends Analyt Chem 2021. [DOI: 10.1016/j.trac.2020.116165] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Indexed: 12/14/2022]
7
Tortorella S, Cinti S. How Can Chemometrics Support the Development of Point of Need Devices? Anal Chem 2021; 93:2713-2722. [DOI: 10.1021/acs.analchem.0c04151] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Indexed: 02/07/2023]
Affiliation(s)
- Sara Tortorella
- Molecular Horizon srl, Via Montelino 30, 06084 Bettona, Perugia, Italy
- Stefano Cinti
- Department of Pharmacy, University of Naples “Federico II”, Via Domenico Montesano 49, 80131 Naples, Italy
- BAT Center−Interuniversity Center for Studies on Bioinspired Agro-Environmental Technology, University of Napoli “Federico II”, 80055 Portici, Naples, Italy
8
FT-IR biomarkers of sexual dimorphism in yerba-mate plants: Seasonal and light accessibility effects. Microchem J 2020. [DOI: 10.1016/j.microc.2020.105329] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 11/24/2022]
9
Serum metabolomics approach to monitor the changes in metabolite profiles following renal transplantation. Sci Rep 2020; 10:17223. [PMID: 33057167 PMCID: PMC7560840 DOI: 10.1038/s41598-020-74245-z] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 01/16/2020] [Accepted: 09/23/2020] [Indexed: 02/06/2023] Open
Abstract
Systemic metabolic changes after renal transplantation reflect the key processes that are related to graft accommodation. In order to describe and better understand these changes, a ¹H NMR-based metabolomics approach was used. The changes of 47 metabolites in the serum samples of 19 individuals were interpreted over time with respect to their levels prior to transplantation. Considering the specific repeated measures design of the experiments, data analysis was mainly focused on the multiple analyses of variance (ANOVA) methods such as ANOVA simultaneous component analysis and ANOVA-target projection. We also propose here the combined use of ANOVA and classification and regression trees (ANOVA-CART), under the assumption that binary splits on a small set of metabolites may better describe the graft accommodation processes over time. This assumption is very important for developing a medical protocol for evaluating a patient's health state. The results showed that besides creatinine, which is routinely used to monitor renal activity, the changes in levels of hippurate, mannitol and alanine may be associated with the changes in renal function during the post-transplantation recovery period. Specifically, the level of hippurate (or histidine) is more sensitive to any short-term changes in renal activity than creatinine.
10
Bertinetto C, Engel J, Jansen J. ANOVA simultaneous component analysis: A tutorial review. Anal Chim Acta X 2020; 6:100061. [PMID: 33392497 PMCID: PMC7772684 DOI: 10.1016/j.acax.2020.100061] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Received: 06/22/2020] [Revised: 09/29/2020] [Accepted: 10/02/2020] [Indexed: 12/27/2022] Open
Abstract
When analyzing experimental chemical data, it is often necessary to incorporate the structure of the study design into the chemometric/statistical models to effectively address the research questions of interest. ANOVA-Simultaneous Component Analysis (ASCA) is one of the most prominent methods to include such information in the quantitative analysis of multivariate data, especially when the number of variables is large. This tutorial review intends to explain in a simple way how ASCA works, how it is operated and how to correctly interpret ASCA results, with approachable mathematical and visual descriptions. Two examples are given: the first, a simulated chemical reaction, serves to illustrate the ASCA steps and the second, from a real chemical ecology data set, the interpretation of results. An overview of methods closely related to ASCA is also provided, pointing out their differences and scope, to give a wide-ranging picture of the available options to build multivariate models that take experimental design into account. ASCA is a multivariate method for analysis of multi-factor data. An overview of the (mathematical) principles of ASCA is presented. Key aspects for practical application of ASCA are discussed. Detailed explanation of ASCA output in terms of score and loading plots is given. Literature review of other multivariate techniques for analysis of multi-factor data.
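As a rough illustration of the kind of steps this tutorial abstract refers to, the sketch below runs a one-factor ASCA on toy data: the centered data matrix is split into an effect matrix of level means plus residuals, and PCA (here via SVD) is applied to the effect matrix to obtain scores and loadings. It assumes a balanced design with a single factor; the function name `asca_one_factor` and the simulated data are hypothetical and not taken from the paper.

```python
import numpy as np

def asca_one_factor(X, labels, n_components=2):
    """One-factor ASCA sketch: ANOVA-style split, then PCA of the effect matrix."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    Xc = X - X.mean(axis=0)                    # remove the grand mean per variable
    effect = np.zeros_like(Xc)
    for level in np.unique(labels):
        rows = labels == level
        effect[rows] = Xc[rows].mean(axis=0)   # every row gets its level mean
    residual = Xc - effect                     # within-level variation
    # PCA of the effect matrix via singular value decomposition
    U, s, Vt = np.linalg.svd(effect, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]
    loadings = Vt[:n_components].T
    return effect, residual, scores, loadings

# Toy data (assumption): 3 factor levels, 4 replicates each, 5 variables
rng = np.random.default_rng(0)
labels = np.repeat(["A", "B", "C"], 4)
shift = {"A": 0.0, "B": 1.0, "C": 2.0}
X = rng.normal(size=(12, 5)) + np.array([shift[l] for l in labels])[:, None]

effect, residual, scores, loadings = asca_one_factor(X, labels)
```

With three levels the effect matrix has rank at most two, so two components capture all of the between-level variation in this toy case. Significance testing (for example by permutation) and multi-factor or interaction terms are left out of the sketch.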
Affiliation(s)
- Carlo Bertinetto
- Department of Analytical Chemistry, Institute of Molecular Materials, Radboud University, the Netherlands
- Jasper Engel
- Biometris, Wageningen UR, Droevendaalsesteeg 1, 6708 PB, Wageningen, the Netherlands
- Jeroen Jansen
- Department of Analytical Chemistry, Institute of Molecular Materials, Radboud University, the Netherlands
11
Chemometric Strategies for Spectroscopy-Based Food Authentication. APPLIED SCIENCES-BASEL 2020. [DOI: 10.3390/app10186544] [Citation(s) in RCA: 41] [Impact Index Per Article: 10.3] [Indexed: 01/05/2023]
Abstract
In the last decades, spectroscopic techniques have played an increasingly crucial role in analytical chemistry, due to the numerous advantages they offer. Several of these techniques (e.g., Near-InfraRed—NIR—or Fourier Transform InfraRed—FT-IR—spectroscopy) are considered particularly valuable because, by means of suitable equipment, they enable a fast and non-destructive sample characterization. This aspect, together with the possibility of easily developing devices for on- and in-line applications, has recently favored the diffusion of such approaches especially in the context of foodstuff quality control. Nevertheless, the complex nature of the signal yielded by spectroscopy instrumentation (regardless of the spectral range investigated) inevitably calls for the use of multivariate chemometric strategies for its accurate assessment and interpretation. This review aims at providing a comprehensive overview of some of the chemometric tools most commonly exploited for spectroscopy-based foodstuff analysis and authentication. More in detail, three different scenarios will be surveyed here: data exploration, calibration and classification. The main methodologies suited to addressing each one of these different tasks will be outlined and examples illustrating their use will be provided alongside their description.
12
Reduction of repeatability error for analysis of variance-Simultaneous Component Analysis (REP-ASCA): Application to NIR spectroscopy on coffee sample. Anal Chim Acta 2019; 1101:23-31. [PMID: 32029115 DOI: 10.1016/j.aca.2019.12.024] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 07/16/2019] [Revised: 11/04/2019] [Accepted: 12/09/2019] [Indexed: 11/22/2022]
Abstract
A method to reduce repeatability error in multivariate data for Analysis of variance-Simultaneous Component Analysis (REP-ASCA) has been developed. This method proposes to adapt the acquisition protocol by adding a set containing repeated measures for describing repeatability error. Then, an orthogonal projection is performed in the row-space to reduce the repeatability error of the original dataset. Finally, ASCA is performed on the orthogonalized dataset. This method was evaluated on NIR spectral data of coffee beans. This study shows that the repeatability error due to physical variations between measurements can alter results of the analysis of variance. These effects are predominant in factors analysis and can be seen on spectra as constant or non-constant baselines. By reducing repeatability error with REP-ASCA, baselines are removed and factor analysis provides more information about chemical content of the factors of interest.
13
Untargeted analysis of nanoLC-HRMS data by ANOVA-PCA to highlight metabolites in Gammarus fossarum after in vivo exposure to pharmaceuticals. Talanta 2019; 202:221-229. [PMID: 31171174 DOI: 10.1016/j.talanta.2019.04.028] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 12/07/2018] [Revised: 04/05/2019] [Accepted: 04/10/2019] [Indexed: 12/22/2022]
Abstract
In Western Europe, river water quality can be assessed using sentinel species such as the amphipod Gammarus fossarum. In this environmental metabolomics study, the objective was to develop suitable chemometric methods, using a limited number of individuals, to assess changes in the metabolism of G. fossarum exposed to two human pharmaceuticals. Male and female gammarids were exposed to a mixture of the anxiolytic oxazepam and the antiepileptic carbamazepine (1000 ng L⁻¹) for 14 days under laboratory conditions according to a 2² full factorial design (repeated 5 times). They were analyzed at the single-individual scale using a method comprising a μQuEChERS-type extraction followed by nanoliquid chromatography coupled to high-resolution mass spectrometry. The molecular fingerprints obtained were investigated using XCMS. Several corrections of experimental drift (using lock mass and quality control samples) were tested before applying the APCA+ method to the unbalanced designed data. Signal reproducibility was greatly improved by lock mass normalisation. APCA+ highlighted a significant effect of both experimental factors, "exposure to the mixture" and "gammarid gender", on the measured signals. Finally, the results made it possible to identify the variables responsible for each factor effect.
14
De Beer D, Tobin J, Walczak B, Van Der Rijst M, Joubert E. Phenolic composition of rooibos changes during simulated fermentation: Effect of endogenous enzymes and fermentation temperature on reaction kinetics. Food Res Int 2019; 121:185-196. [DOI: 10.1016/j.foodres.2019.03.041] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 12/07/2018] [Revised: 03/13/2019] [Accepted: 03/17/2019] [Indexed: 11/27/2022]
15
Malyjurek Z, de Beer D, Joubert E, Walczak B. Working with log-ratios. Anal Chim Acta 2019; 1059:16-27. [PMID: 30876628 DOI: 10.1016/j.aca.2019.01.041] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 08/30/2018] [Revised: 01/16/2019] [Accepted: 01/22/2019] [Indexed: 10/27/2022]
Abstract
Instrumental signals of samples cannot be compared and/or analysed directly if their concentrations are unknown. Differences in overall concentration need to be removed at the data normalization step. The choice of normalization method has a profound effect on the final results of data analysis, and especially on biomarker identification. One of the possible approaches to deal with the 'size effect' is to work with size-irrelevant (log) ratios instead of the original variables. In the presented study, the performance of log-ratio methods, namely pairwise log-ratio (plr) and centered log-ratio (clr), is discussed for real and simulated data sets with different characteristics. It was found that the clr method can lead to distribution of local differences along an entire signal and as such, it should be avoided in all studies aiming to identify biomarkers.
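The two transformations named in this abstract can be sketched in a few lines. The example below is a minimal illustration on a made-up four-variable signal, not the authors' code: the function names `clr` and `pairwise_log_ratios` and all numbers are assumptions. It shows that both transforms are unaffected by an overall scaling (the 'size effect'), and that a change in one variable shifts every clr value while it only touches the pairwise log-ratios that involve that variable.

```python
import numpy as np

def clr(x):
    """Centered log-ratio: log of each value relative to the geometric mean."""
    logx = np.log(np.asarray(x, dtype=float))
    return logx - logx.mean(axis=-1, keepdims=True)

def pairwise_log_ratios(x):
    """All pairwise log-ratios log(x_i / x_j), i < j, of a 1-D positive signal."""
    logx = np.log(np.asarray(x, dtype=float))
    i, j = np.triu_indices(logx.size, k=1)
    return logx[i] - logx[j]

signal = np.array([2.0, 4.0, 8.0, 16.0])   # hypothetical peak intensities
diluted = 0.5 * signal                     # same sample, half the concentration
changed = signal.copy()
changed[0] *= 4.0                          # only the first variable differs

# Size invariance: scaling the whole signal leaves both transforms unchanged
same_clr = np.allclose(clr(signal), clr(diluted))
same_plr = np.allclose(pairwise_log_ratios(signal), pairwise_log_ratios(diluted))

# Effect of a local change on each representation
clr_shift = clr(changed) - clr(signal)
plr_shift = pairwise_log_ratios(changed) - pairwise_log_ratios(signal)
n_clr_affected = int(np.sum(~np.isclose(clr_shift, 0.0)))
n_plr_affected = int(np.sum(~np.isclose(plr_shift, 0.0)))
```

In this toy case all four clr values move although only one variable changed, whereas only the three ratios involving the first variable move among the six pairwise log-ratios. That is one way to picture the abstract's point about clr distributing local differences along the whole signal.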
Affiliation(s)
- Zuzanna Malyjurek
- Institute of Chemistry, University of Silesia, 40-007, Katowice, Szkolna 9, Poland
- Dalene de Beer
- Plant Bioactives Group, Post-Harvest & Agro-Processing Technologies, Agricultural Research Council (ARC), Infruitec-Nietvoorbij, Private Bag X5026, Stellenbosch, 7599, South Africa; Department of Food Science, Stellenbosch University, Matieland, Private Bag X1, Stellenbosch, South Africa
- Elizabeth Joubert
- Plant Bioactives Group, Post-Harvest & Agro-Processing Technologies, Agricultural Research Council (ARC), Infruitec-Nietvoorbij, Private Bag X5026, Stellenbosch, 7599, South Africa; Department of Food Science, Stellenbosch University, Matieland, Private Bag X1, Stellenbosch, South Africa
- Beata Walczak
- Institute of Chemistry, University of Silesia, 40-007, Katowice, Szkolna 9, Poland.
16
Strani L, Grassi S, Casiraghi E, Alamprese C, Marini F. Milk Renneting: Study of Process Factor Influences by FT-NIR Spectroscopy and Chemometrics. FOOD BIOPROCESS TECH 2019. [DOI: 10.1007/s11947-019-02266-2] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 01/26/2023]
17
Multivariate Analysis of Multiple Datasets: a Practical Guide for Chemical Ecology. J Chem Ecol 2018; 44:215-234. [PMID: 29479643 DOI: 10.1007/s10886-018-0932-6] [Citation(s) in RCA: 57] [Impact Index Per Article: 9.5] [Received: 07/28/2017] [Revised: 02/03/2018] [Accepted: 02/04/2018] [Indexed: 10/17/2022]
Abstract
Chemical ecology has strong links with metabolomics, the large-scale study of all metabolites detectable in a biological sample. Consequently, chemical ecologists are often challenged by the statistical analyses of such large datasets. This holds especially true when the purpose is to integrate multiple datasets to obtain a holistic view and a better understanding of a biological system under study. The present article provides a comprehensive resource to analyze such complex datasets using multivariate methods. It starts from the necessary pre-treatment of data including data transformations and distance calculations, to the application of both gold standard and novel multivariate methods for the integration of different omics data. We illustrate the process of analysis along with detailed results interpretations for six issues representative of the different types of biological questions encountered by chemical ecologists. We provide the necessary knowledge and tools with reproducible R codes and chemical-ecological datasets to practice and teach multivariate methods.
18
Tobin J, Walach J, de Beer D, Williams PJ, Filzmoser P, Walczak B. Untargeted analysis of chromatographic data for green and fermented rooibos: Problem with size effect removal. J Chromatogr A 2017; 1525:109-115. [PMID: 29037593 DOI: 10.1016/j.chroma.2017.10.024] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 07/14/2017] [Revised: 10/02/2017] [Accepted: 10/08/2017] [Indexed: 10/18/2022]
Abstract
Chromatographic data must be preprocessed properly before exploration and/or supervised modeling. To make chromatographic signals comparable, it is crucial to remove the scaling effect caused by differences in overall sample concentrations. One efficient method of signal scaling is Probabilistic Quotient Normalization (PQN) [1]. However, it can be applied only to data for which the majority of features do not vary systematically among the studied classes of signals. When studying the influence of the traditional "fermentation" (oxidation) process on the concentration of 56 individual peaks detected in rooibos plant material, this assumption is not fulfilled. In this case, the only possible solution is the analysis of pairwise log-ratios, which are not influenced by the scaling constant. To estimate significant features, i.e., peaks differentiating the studied classes of samples (green and fermented rooibos plant material), we propose the application of rPLR (robust pair-wise log-ratios) as proposed by Walach et al. [2]. It allows fast computation and identification of the significant features in terms of the original variables (peaks), which is problematic when working with the unfolded pairwise log-ratios. As demonstrated, it can be applied to designed data sets and, in the case of contaminated data, it still allows proper conclusions.
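To make the scaling step mentioned in this abstract concrete, here is a minimal sketch of Probabilistic Quotient Normalization as it is commonly described: take a reference signal (the feature-wise median here), divide each sample by it, and rescale each sample by the median of its quotients. The function name `pqn`, the choice of a median reference, and the toy data are assumptions. The preliminary total-intensity normalization that is often applied first is omitted, and all intensities are assumed strictly positive.

```python
import numpy as np

def pqn(X, reference=None):
    """Probabilistic Quotient Normalization sketch for a samples x features matrix."""
    X = np.asarray(X, dtype=float)
    if reference is None:
        reference = np.median(X, axis=0)      # feature-wise median signal
    quotients = X / reference                 # per-feature quotient to the reference
    factors = np.median(quotients, axis=1)    # most probable dilution per sample
    return X / factors[:, None], factors

# Toy data (assumption): one peak profile measured at three overall concentrations
base = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
X = np.vstack([1.0 * base, 2.0 * base, 0.5 * base])

X_norm, factors = pqn(X)
```

The median quotient is a sensible dilution estimate only when most features do not change between classes. The abstract reports that this does not hold for green versus fermented rooibos, which is why the authors turn to pairwise log-ratios instead.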
Affiliation(s)
- Jade Tobin
- Plant Bioactives Group, Post-Harvest and Wine Technology Division, Agricultural Research Council (ARC), Infruitec-Nietvoorbij, Private Bag X5026, Stellenbosch, 7599, South Africa; Department of Food Science, Stellenbosch University, Private Bag X1, Matieland, Stellenbosch, South Africa
- Jan Walach
- Institute of Statistics and Mathematical Methods in Economics, Vienna University of Technology, Vienna, Austria
- Dalene de Beer
- Plant Bioactives Group, Post-Harvest and Wine Technology Division, Agricultural Research Council (ARC), Infruitec-Nietvoorbij, Private Bag X5026, Stellenbosch, 7599, South Africa; Department of Food Science, Stellenbosch University, Private Bag X1, Matieland, Stellenbosch, South Africa
- Paul J Williams
- Department of Food Science, Stellenbosch University, Private Bag X1, Matieland, Stellenbosch, South Africa
- Peter Filzmoser
- Institute of Statistics and Mathematical Methods in Economics, Vienna University of Technology, Vienna, Austria
- Beata Walczak
- University of Silesia, Institute of Chemistry, Szkolna 9, 400-006, Katowice, Poland.
19
Brereton RG, Jansen J, Lopes J, Marini F, Pomerantsev A, Rodionova O, Roger JM, Walczak B, Tauler R. Chemometrics in analytical chemistry-part I: history, experimental design and data analysis tools. Anal Bioanal Chem 2017; 409:5891-5899. [PMID: 28776070 DOI: 10.1007/s00216-017-0517-1] [Citation(s) in RCA: 59] [Impact Index Per Article: 8.4] [Received: 03/06/2017] [Revised: 06/23/2017] [Accepted: 07/10/2017] [Indexed: 11/25/2022]
Abstract
Chemometrics has achieved major recognition and progress in the analytical chemistry field. In the first part of this tutorial, major achievements and contributions of chemometrics to some of the more important stages of the analytical process, like experimental design, sampling, and data analysis (including data pretreatment and fusion), are summarised. The tutorial is intended to give a general updated overview of the chemometrics field to further contribute to its dissemination and promotion in analytical chemistry.
Affiliation(s)
- Richard G Brereton
- School of Chemistry, University of Bristol, Cantocks Close, Bristol, BS8 1TS, UK
| | - Jeroen Jansen
- Institute for Molecules and Materials, Radboud University, Postvak 61, P.O. Box 9010, 6500 GL, Nijmegen, The Netherlands
| | - João Lopes
- Research Institute for Medicines (iMed.ULisboa), Faculdade de Farmácia, Universidade de Lisboa, Av. Prof. Gama Pinto, 1649-003, Lisbon, Portugal
| | - Federico Marini
- Department of Chemistry, University of Rome "La Sapienza", Piazzale Aldo Moro 5, 00185, Rome, Italy
| | - Alexey Pomerantsev
- Institute of Chemical Physics RAS, 4, Kosygin Str, 119991, Moscow, Russia
| | - Oxana Rodionova
- Institute of Chemical Physics RAS, 4, Kosygin Str, 119991, Moscow, Russia
| | - Jean Michel Roger
- Irstea, UMR ITAP, 361 Rue Jean-François Breton, 34000, Montpellier, France
| | - Beata Walczak
- Institute of Chemistry, University of Silesia, 40-006, Katowice, Poland
| | - Romà Tauler
- IDAEA-CSIC, Jordi Girona 18-26, 08034, Barcelona, Spain.
|
20
|
Combining ANOVA-PCA with POCHEMON to analyse micro-organism development in a polymicrobial environment. Anal Chim Acta 2017; 963:1-16. [DOI: 10.1016/j.aca.2017.01.064] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 09/28/2016] [Revised: 01/26/2017] [Accepted: 01/31/2017] [Indexed: 11/23/2022]
|
21
|
Marini F, de Beer D, Walters NA, de Villiers A, Joubert E, Walczak B. Multivariate analysis of variance of designed chromatographic data. A case study involving fermentation of rooibos tea. J Chromatogr A 2017; 1489:115-125. [PMID: 28189260 DOI: 10.1016/j.chroma.2017.02.007] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 12/13/2016] [Revised: 02/02/2017] [Accepted: 02/03/2017] [Indexed: 11/29/2022]
Abstract
An ultimate goal of investigations of rooibos plant material subjected to different stages of fermentation is to identify the chemical changes taking place in the phenolic composition, using an untargeted approach and chromatographic fingerprints. Realization of this goal requires, among other things, identification of the main components of the plant material involved in chemical reactions during the fermentation process. Quantitative chromatographic data for the compounds in extracts of green, semi-fermented and fermented rooibos form the basis of a preliminary study following a targeted approach. The aim is to estimate whether treatment has a significant effect based on all quantified compounds and to identify the compounds that contribute significantly to it. Analysis of variance is performed using modern multivariate methods such as ANOVA-Simultaneous Component Analysis, ANOVA-Target Projection and regularized MANOVA. This study is the first one in which all three approaches are compared and evaluated. For the data studied, all three methods reveal the same significance of the fermentation effect on the extract compositions, but they lead to different interpretations of it.
Affiliation(s)
- Federico Marini
- Department of Chemistry, University of Rome "La Sapienza", P.le Aldo Moro 5, I-00185 Rome, Italy
| | - Dalene de Beer
- Post-Harvest and Wine Technology Division, Agricultural Research Council (ARC), Infruitec-Nietvoorbij, Private Bag X5026, 7599 Stellenbosch, South Africa; Department of Food Science, Stellenbosch University, Private Bag X1, Matieland, Stellenbosch, South Africa
| | - Nico A Walters
- Post-Harvest and Wine Technology Division, Agricultural Research Council (ARC), Infruitec-Nietvoorbij, Private Bag X5026, 7599 Stellenbosch, South Africa; Department of Food Science, Stellenbosch University, Private Bag X1, Matieland, Stellenbosch, South Africa
| | - André de Villiers
- Department of Chemistry and Polymer Science, Stellenbosch University, Private Bag X1, Matieland, Stellenbosch, South Africa
| | - Elizabeth Joubert
- Post-Harvest and Wine Technology Division, Agricultural Research Council (ARC), Infruitec-Nietvoorbij, Private Bag X5026, 7599 Stellenbosch, South Africa; Department of Food Science, Stellenbosch University, Private Bag X1, Matieland, Stellenbosch, South Africa
| | - Beata Walczak
- University of Silesia, Institute of Chemistry, Szkolna 9, 40-006 Katowice, Poland.
|
22
|
Calvani R, Marini F, Cesari M, Tosato M, Picca A, Anker SD, von Haehling S, Miller RR, Bernabei R, Landi F, Marzetti E. Biomarkers for physical frailty and sarcopenia. Aging Clin Exp Res 2017; 29:29-34. [PMID: 28155180 DOI: 10.1007/s40520-016-0708-1] [Citation(s) in RCA: 37] [Impact Index Per Article: 5.3] [Received: 01/12/2016] [Accepted: 10/10/2016] [Indexed: 12/14/2022]
Abstract
Physical frailty (PF) and sarcopenia are major health issues in geriatric populations, given their high prevalence and association with several adverse outcomes. Nevertheless, the lack of a univocal operational definition for the two conditions has so far hampered their clinical implementation. Existing definitional ambiguities of PF and sarcopenia, together with their complex underlying pathophysiology, also account for the absence of robust biomarkers that can be used for screening, diagnostic and/or prognostication purposes. This review provides an overview of currently available biological markers for PF and sarcopenia, as well as a critical appraisal of strengths and weaknesses of traditional procedures for biomarker development in the field. A novel approach for biomarker identification and validation, based on multivariate methodologies, is also discussed. This strategy relies on the multidimensional modeling of complementary biomarkers to cope with the phenotypical and pathophysiological complexity of PF and sarcopenia. Biomarkers identified through the implementation of multivariate strategies may be used to support the detection of the two conditions, track their progression over time or in response to interventions, and reveal the onset of complications (e.g., mobility disability) at a very early stage.
|
23
|
Xu Y, Muhamadali H, Sayqal A, Dixon N, Goodacre R. Partial Least Squares with Structured Output for Modelling the Metabolomics Data Obtained from Complex Experimental Designs: A Study into the Y-Block Coding. Metabolites 2016; 6:metabo6040038. [PMID: 27801817 PMCID: PMC5192444 DOI: 10.3390/metabo6040038] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 08/31/2016] [Revised: 10/20/2016] [Accepted: 10/24/2016] [Indexed: 12/31/2022] Open
Abstract
Partial least squares (PLS) is one of the most commonly used supervised modelling approaches for analysing multivariate metabolomics data. PLS is typically employed as either a regression model (PLS-R) or a classification model (PLS-DA). However, in metabolomics studies it is common to investigate multiple, potentially interacting, factors simultaneously following a specific experimental design. Such data often cannot be considered as a “pure” regression or a classification problem. Nevertheless, these data have often still been treated as a regression or classification problem and this could lead to ambiguous results. In this study, we investigated the feasibility of designing a hybrid target matrix Y that better reflects the experimental design than simple regression or binary class membership coding commonly used in PLS modelling. The new design of Y coding was based on the same principle used by structural modelling in machine learning techniques. Two real metabolomics datasets were used as examples to illustrate how the new Y coding can improve the interpretability of the PLS model compared to classic regression/classification coding.
Affiliation(s)
- Yun Xu
- School of Chemistry, Manchester Institute of Biotechnology, The University of Manchester, Manchester M1 7DN, UK.
| | - Howbeer Muhamadali
- School of Chemistry, Manchester Institute of Biotechnology, The University of Manchester, Manchester M1 7DN, UK.
| | - Ali Sayqal
- School of Chemistry, Umm Al-Qura University, Al Taif Road, Mecca 24382, Saudi Arabia.
| | - Neil Dixon
- School of Chemistry, Manchester Institute of Biotechnology, The University of Manchester, Manchester M1 7DN, UK.
| | - Royston Goodacre
- School of Chemistry, Manchester Institute of Biotechnology, The University of Manchester, Manchester M1 7DN, UK.
|
24
|
Xiongjie Z, Ye Z, Yang L, Bin T. Evaluation of the uniformity of concentration of radon in a radon chamber. Appl Radiat Isot 2016; 110:183-188. [DOI: 10.1016/j.apradiso.2016.01.021] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 09/07/2015] [Revised: 01/11/2016] [Accepted: 01/18/2016] [Indexed: 11/12/2022]
|
25
|
Boccard J, Rudaz S. Exploring Omics data from designed experiments using analysis of variance multiblock Orthogonal Partial Least Squares. Anal Chim Acta 2016; 920:18-28. [PMID: 27114219 DOI: 10.1016/j.aca.2016.03.042] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.1] [Received: 10/15/2015] [Revised: 03/22/2016] [Accepted: 03/23/2016] [Indexed: 11/17/2022]
Abstract
Many experimental factors may have an impact on chemical or biological systems. A thorough investigation of the potential effects and interactions between the factors is made possible by rationally planning the trials using systematic procedures, i.e. design of experiments. However, assessing factors' influences often remains a challenging task when dealing with hundreds to thousands of correlated variables, whereas only a limited number of samples is available. In that context, most of the existing strategies involve the ANOVA-based partitioning of sources of variation and the separate analysis of ANOVA submatrices using multivariate methods, to account for both the intrinsic characteristics of the data and the study design. However, these approaches lack the ability to summarise the data using a single model and remain somewhat limited for detecting and interpreting subtle perturbations hidden in complex Omics datasets. In the present work, a supervised multiblock algorithm based on the Orthogonal Partial Least Squares (OPLS) framework is proposed for the joint analysis of ANOVA submatrices. This strategy has several advantages: (i) the evaluation of a unique multiblock model accounting for all sources of variation; (ii) the computation of a robust estimator (goodness of fit) for assessing the ANOVA decomposition reliability; (iii) the investigation of an effect-to-residuals ratio to quickly evaluate the relative importance of each effect and (iv) an easy interpretation of the model with appropriate outputs. Case studies from metabolomics and transcriptomics, highlighting the ability of the method to handle Omics data obtained from fixed-effects full factorial designs, are proposed for illustration purposes. Signal variations are easily related to main effects or interaction terms, while relevant biochemical information can be derived from the models.
Affiliation(s)
- Julien Boccard
- School of Pharmaceutical Sciences, University of Geneva, University of Lausanne, Geneva, Switzerland.
| | - Serge Rudaz
- School of Pharmaceutical Sciences, University of Geneva, University of Lausanne, Geneva, Switzerland
|