1
Spiers RC, Kalivas JH. Calibration Model Updating to Novel Sample and Measurement Conditions without Reference Values. Anal Chem 2021; 93:9688-9696. [PMID: 34236832 DOI: 10.1021/acs.analchem.1c00578] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 01/08/2023]
Abstract
Updating a calibration model formed in original (primary) sample and spectral measurement conditions to predict analyte values in novel (secondary) conditions is an essential activity in analytical chemistry in order to avoid a complete recalibration. Established model updating methods require sample analyte reference values for a small set of secondary domain samples (labeled data) to be used in updating processes. Because obtaining reference values is time consuming and is the costly part of any calibration, methods are needed that do not require labeled secondary samples, thereby allowing on demand model updating. This paper compares model updating methods with and without labeled secondary samples. A hybrid model updating approach is also developed and evaluated. Unfortunately, a major impediment to adapting a model without secondary analyte reference values has been model selection. Because multiple tuning parameters are commonly involved in model updating methods, thousands of models are formed, making model selection complex. A recently developed framework is evaluated for automatic model selection of several two to three tuning parameter-based model updating methods without secondary analyte reference values (labels). The model selection method is based on model diversity and prediction similarity (MDPS) of the unlabeled samples to be predicted. The new secondary samples to be predicted can be used to form the updated models and again to select the final predicting models. Because models are formed and selected on demand to directly predict target samples, complicated cross-validation processes are not needed. Four near-infrared data sets covering 40 model updating situations are evaluated showing that MDPS can select reliable updated models outperforming or rivaling prediction errors from total recalibrations with secondary reference values.
Affiliation(s)
- Robert C Spiers, Department of Chemistry, Idaho State University, Pocatello, Idaho 83209, United States
- John H Kalivas, Department of Chemistry, Idaho State University, Pocatello, Idaho 83209, United States
2
Spiers RC, Kalivas JH. Reliable Model Selection without Reference Values by Utilizing Model Diversity with Prediction Similarity. J Chem Inf Model 2021; 61:2220-2230. [PMID: 33900749 DOI: 10.1021/acs.jcim.0c01493] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/29/2022]
Abstract
Predictive modeling (calibration or training) with various data formats, such as near-infrared (NIR) spectra and quantitative structure-activity relationship (QSAR) data, provides essential information if a proper model is selected. Similarly, with a general model selection approach, spectral model maintenance (updating) from original modeling conditions to new conditions can be performed for dynamic modeling. Fundamental modeling (partial least-squares (PLS) and others) and maintenance processes (domain adaptation or transfer learning and others) require selection of tuning parameter(s) values to isolate models that can accurately predict new samples or molecules, e.g., number of PLS latent variables to predict analyte concentration. Regardless of the modeling task, model selection is complex and without a reliable protocol. Tuning parameter selection typically depends on only one model quality measure assessing model bias using prediction accuracy. Developed in this paper is a generic model selection process using concepts from consensus modeling and QSAR activity landscapes. It is a consensus filtering approach that prioritizes model diversity (MD) while conserving prediction similarity (PS) fused with a common bias-variance trade-off measure. A significant feature of MDPS is that a cross-validation scheme is not needed because models are selected relative to predicting new samples or molecules, i.e., model selection uses unlabeled samples (without reference values) for active predictions. The versatility and reliability of MDPS model selection is shown using four NIR data sets and a QSAR data set. The study also substantiates the Rashomon effect where there is not one best model tuning parameter value that provides accurate predictions.
Affiliation(s)
- Robert C Spiers, Department of Chemistry, Idaho State University, Pocatello, Idaho 83209, United States
- John H Kalivas, Department of Chemistry, Idaho State University, Pocatello, Idaho 83209, United States
3
Rapid prediction of multiple wine quality parameters using infrared spectroscopy coupling with chemometric methods. J Food Compost Anal 2020. [DOI: 10.1016/j.jfca.2020.103509] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Indexed: 11/20/2022]
4
Tutorial: multivariate classification for vibrational spectroscopy in biological samples. Nat Protoc 2020; 15:2143-2162. [PMID: 32555465 DOI: 10.1038/s41596-020-0322-8] [Citation(s) in RCA: 136] [Impact Index Per Article: 34.0] [Received: 10/01/2019] [Accepted: 03/20/2020] [Indexed: 12/26/2022]
Abstract
Vibrational spectroscopy techniques, such as Fourier-transform infrared (FTIR) and Raman spectroscopy, have been successful methods for studying the interaction of light with biological materials and facilitating novel cell biology analysis. Spectrochemical analysis is very attractive in disease screening and diagnosis, microbiological studies and forensic and environmental investigations because of its low cost, minimal sample preparation, non-destructive nature and substantially accurate results. However, there is now an urgent need for multivariate classification protocols allowing one to analyze biologically derived spectrochemical data to obtain accurate and reliable results. Multivariate classification comprises discriminant analysis and class-modeling techniques where multiple spectral variables are analyzed in conjunction to distinguish and assign unknown samples to pre-defined groups. The requirement for such protocols is demonstrated by the fact that applications of deep-learning algorithms of complex datasets are being increasingly recognized as critical for extracting important information and visualizing it in a readily interpretable form. Hereby, we have provided a tutorial for multivariate classification analysis of vibrational spectroscopy data (FTIR, Raman and near-IR) highlighting a series of critical steps, such as preprocessing, data selection, feature extraction, classification and model validation. This is an essential aspect toward the construction of a practical spectrochemical analysis model for biological analysis in real-world applications, where fast, accurate and reliable classification models are fundamental.
5
Zhang J, Cui X, Cai W, Shao X. A variable importance criterion for variable selection in near-infrared spectral analysis. Sci China Chem 2018. [DOI: 10.1007/s11426-018-9368-9] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Indexed: 12/31/2022]
6
Wentzell PD, Wicks CC, Braga JW, Soares LF, Pastore TC, Coradin VT, Davrieux F. Implications of measurement error structure on the visualization of multivariate chemical data: hazards and alternatives. Can J Chem 2018. [DOI: 10.1139/cjc-2017-0730] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 11/22/2022]
Abstract
The analysis of multivariate chemical data is commonplace in fields ranging from metabolomics to forensic classification. Many of these studies rely on exploratory visualization methods that represent the multidimensional data in spaces of lower dimensionality, such as hierarchical cluster analysis (HCA) or principal components analysis (PCA). However, such methods rely on assumptions of independent measurement errors with uniform variance and can fail to reveal important information when these assumptions are violated, as they often are for chemical data. This work demonstrates how two alternative methods, maximum likelihood principal components analysis (MLPCA) and projection pursuit analysis (PPA), can reveal chemical information hidden from more traditional techniques. Experimental data to compare different methods consists of near-infrared (NIR) reflectance spectra from 108 samples of wood that are derived from four different species of Brazilian trees. The measurement error characteristics of the spectra are examined and it is shown that, by incorporating measurement error information into the data analysis (through MLPCA) or using alternative projection criteria (i.e., PPA), samples can be separated by species. These techniques are proposed as powerful tools for multivariate data analysis in chemistry.
Affiliation(s)
- Peter D. Wentzell, Trace Analysis Research Centre, Department of Chemistry, Dalhousie University, P.O. Box 15000, Halifax, NS B3H 4R2, Canada
- Chelsi C. Wicks, Trace Analysis Research Centre, Department of Chemistry, Dalhousie University, P.O. Box 15000, Halifax, NS B3H 4R2, Canada
- Jez W.B. Braga, Chemistry Institute, University of Brasilia, Brasília, 72910-000, Brasilia, DF, Brasil
- Liz F. Soares, Chemistry Institute, University of Brasilia, Brasília, 72910-000, Brasilia, DF, Brasil
- Tereza C.M. Pastore, Forest Products Laboratory, Brazilian Forest Service, 70818-970, Brasilia, DF, Brasil
- Vera T.R. Coradin, Forest Products Laboratory, Brazilian Forest Service, 70818-970, Brasilia, DF, Brasil
- Fabrice Davrieux, French Agricultural Research Center for International Development, CIRAD-UMR Qualisud, F-34398, Montpellier Cedex 5, France
7
Corro-Herrera VA, Gómez-Rodríguez J, Hayward-Jones PM, Barradas-Dermitz DM, Gschaedler-Mathis AC, Aguilar-Uscanga MG. Real-time monitoring of ethanol production during Pichia stipitis NRRL Y-7124 alcoholic fermentation using transflection near infrared spectroscopy. Eng Life Sci 2018; 18:643-653. [PMID: 32624944 DOI: 10.1002/elsc.201700189] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 11/14/2017] [Revised: 03/31/2018] [Accepted: 05/15/2018] [Indexed: 01/25/2023]
Abstract
The application of in situ near-infrared spectroscopy monitoring of xylose metabolizing yeast such as Pichia stipitis for ethanol production with semisynthetic media, applying chemometrics, was investigated. During the process in a bioreactor, biomass, glucose, xylose, ethanol, acetic acid, and glycerol determinations were performed by a transflection probe immersed in the culture broth and connected to a near-infrared process analyzer. Wavelength windows in near-infrared spectra recorded between 800 and 2200 nm were pretreated using Savitzky-Golay smoothing, second derivative and multiplicative scattering correction in order to perform a partial least squares regression and generate the calibration models. These calibration models were tested by external validation (78 samples). Calibration and validation criteria were defined and evaluated in order to generate robust and reliable models for an alcoholic fermentation process matrix. Moreover, regression coefficients (β) and variable influence on projection (VIP) plots were used to assess the results. A novelty is the use of β versus VIP dispersion plots to determine which vectors have more influence on the response in order to improve process comprehension and operability. Validated models were used in real-time monitoring during P. stipitis NRRL Y-7124 semisynthetic media fermentations.
Affiliation(s)
- Víctor Abel Corro-Herrera, Bioengineering Laboratory, Food Research and Development Unit, Veracruz Institute of Technology, Veracruz, México
- Javier Gómez-Rodríguez, Bioengineering Laboratory, Food Research and Development Unit, Veracruz Institute of Technology, Veracruz, México
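One of the pretreatments named in the abstract above, multiplicative scatter correction, can be illustrated with a minimal sketch. It assumes the common textbook formulation (regress each spectrum on the mean spectrum, then remove the fitted offset and slope); the settings actually used in the paper are not stated in the abstract, and the function name `msc` is hypothetical.

```python
def msc(spectra):
    """Multiplicative scatter correction against the mean spectrum.

    Assumption: each spectrum x is modelled as x = a + b * reference,
    and the corrected spectrum is (x - a) / b.
    `spectra` is a list of equal-length lists of floats.
    """
    n_points = len(spectra[0])
    # Reference spectrum: point-wise mean over all spectra.
    reference = [sum(s[j] for s in spectra) / len(spectra) for j in range(n_points)]
    ref_mean = sum(reference) / n_points
    ref_ss = sum((r - ref_mean) ** 2 for r in reference)

    corrected = []
    for s in spectra:
        s_mean = sum(s) / n_points
        # Ordinary least-squares slope and intercept of s on the reference.
        slope = sum((r - ref_mean) * (x - s_mean) for r, x in zip(reference, s)) / ref_ss
        if slope == 0:
            raise ValueError("spectrum is uncorrelated with the reference")
        intercept = s_mean - slope * ref_mean
        corrected.append([(x - intercept) / slope for x in s])
    return corrected, reference
```

In this sketch, two spectra that differ only by an offset and a scale factor are both mapped onto the shared reference, which is the effect the correction is meant to achieve.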
8
Pereira LS, Lisboa FL, Neto JC, Valladão FN, Sena MM. Direct classification of new psychoactive substances in seized blotter papers by ATR-FTIR and multivariate discriminant analysis. Microchem J 2017. [DOI: 10.1016/j.microc.2017.03.032] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Indexed: 10/19/2022]
9
Development and analytical validation of a screening method for simultaneous detection of five adulterants in raw milk using mid-infrared spectroscopy and PLS-DA. Food Chem 2015; 181:31-7. [DOI: 10.1016/j.foodchem.2015.02.077] [Citation(s) in RCA: 150] [Impact Index Per Article: 16.7] [Received: 08/30/2014] [Revised: 01/11/2015] [Accepted: 02/14/2015] [Indexed: 11/19/2022]
10
Deng BC, Yun YH, Liang YZ, Cao DS, Xu QS, Yi LZ, Huang X. A new strategy to prevent over-fitting in partial least squares models based on model population analysis. Anal Chim Acta 2015; 880:32-41. [PMID: 26092335 DOI: 10.1016/j.aca.2015.04.045] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.7] [Received: 09/30/2014] [Revised: 04/11/2015] [Accepted: 04/23/2015] [Indexed: 11/28/2022]
Abstract
Partial least squares (PLS) is one of the most widely used methods for chemical modeling. However, like many other methods with tunable parameters, it has a strong tendency toward over-fitting. Thus, a crucial step in PLS model building is to select the optimal number of latent variables (nLVs). Cross-validation (CV) is the most popular method for PLS model selection because it selects a model from the perspective of prediction ability. However, a clear minimum of prediction errors may not be obtained in CV, which makes the model selection difficult. To solve the problem, we proposed a new strategy for PLS model selection which combines the cross-validated coefficient of determination (Q²cv) and model stability (S). S is defined as the stability of PLS regression vectors which is obtained using model population analysis (MPA). The results show that, when a clear maximum of Q²cv is not obtained, S can provide additional information about over-fitting and it helps in finding the optimal nLVs. Compared with other regression vector based indicators such as the Euclidean 2-norm (B2), the Durbin Watson statistic (DW) and the jaggedness (J), S is more sensitive to over-fitting. The model selected by our method has both good prediction ability and stability.
Affiliation(s)
- Bai-Chuan Deng, Department of Chemistry, University of Bergen, Bergen N-5007, Norway; School of Chemistry and Chemical Engineering, Central South University, Changsha 410083, PR China
- Yong-Huan Yun, School of Chemistry and Chemical Engineering, Central South University, Changsha 410083, PR China
- Yi-Zeng Liang, School of Chemistry and Chemical Engineering, Central South University, Changsha 410083, PR China
- Dong-Sheng Cao, School of Pharmaceutical Sciences, Central South University, Changsha 410083, PR China
- Qing-Song Xu, School of Mathematics and Statistics, Central South University, Changsha 410083, PR China
- Lun-Zhao Yi, Yunnan Food Safety Research Institute, Kunming University of Science and Technology, Kunming 650500, PR China
- Xin Huang, School of Chemistry and Chemical Engineering, Central South University, Changsha 410083, PR China
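The cross-validated coefficient of determination that the abstract above pairs with model stability can be sketched as follows. This is only an illustration under the common definition Q²cv = 1 − PRESS / SS_total, computed with the overall mean of the reference values; the stability measure S is not defined in the abstract and is not reproduced here. The names `q2_cv` and `best_n_lvs` are hypothetical.

```python
def q2_cv(y_true, y_cv_pred):
    """Cross-validated coefficient of determination: 1 - PRESS / SS_total.

    y_true:    reference values.
    y_cv_pred: predictions for the same samples made while each sample
               was held out during cross-validation.
    """
    if len(y_true) != len(y_cv_pred):
        raise ValueError("y_true and y_cv_pred must have the same length")
    mean_y = sum(y_true) / len(y_true)
    press = sum((y - p) ** 2 for y, p in zip(y_true, y_cv_pred))
    ss_total = sum((y - mean_y) ** 2 for y in y_true)
    return 1.0 - press / ss_total


def best_n_lvs(y_true, cv_preds_by_nlv):
    """Return the number of latent variables with the largest Q2cv,
    together with the full Q2cv profile (dict: nLVs -> Q2cv)."""
    profile = {n: q2_cv(y_true, preds) for n, preds in cv_preds_by_nlv.items()}
    return max(profile, key=profile.get), profile
```

Returning the whole profile rather than only the maximum makes it possible to see whether the maximum is sharp or flat, which is the situation the abstract says calls for additional information.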
11
Olivieri AC. Practical guidelines for reporting results in single- and multi-component analytical calibration: A tutorial. Anal Chim Acta 2015; 868:10-22. [DOI: 10.1016/j.aca.2015.01.017] [Citation(s) in RCA: 145] [Impact Index Per Article: 16.1] [Received: 10/29/2014] [Revised: 01/08/2015] [Accepted: 01/12/2015] [Indexed: 10/24/2022]
12
Guimarães CC, Simeone MLF, Parrella RA, Sena MM. Use of NIRS to predict composition and bioethanol yield from cell wall structural components of sweet sorghum biomass. Microchem J 2014. [DOI: 10.1016/j.microc.2014.06.029] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.7] [Indexed: 11/25/2022]
13
Orzel J, Daszykowski M, Kazura M, de Beer D, Joubert E, Schulze AE, Beelders T, de Villiers A, Malherbe CJ, Walczak B. Modeling of the total antioxidant capacity of rooibos (Aspalathus linearis) tea infusions from chromatographic fingerprints and identification of potential antioxidant markers. J Chromatogr A 2014; 1366:101-9. [PMID: 25283576 DOI: 10.1016/j.chroma.2014.09.030] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.1] [Received: 06/27/2014] [Revised: 09/12/2014] [Accepted: 09/12/2014] [Indexed: 02/06/2023]
Abstract
Models to predict the total antioxidant capacity (TAC) of rooibos tea infusions from their chromatographic fingerprints and peak table data (content of individual phenolic compounds), obtained using HPLC with diode array detection, were developed in order to identify potential antioxidant markers. Peak table data included the content of 12 compounds, namely phenylpyruvic acid-2-O-glucoside, aspalathin, nothofagin, isoorientin, orientin, ferulic acid, quercetin-3-O-robinobioside, vitexin, hyperoside, rutin, isovitexin and isoquercitrin. The TAC values, measured using the oxygen radical absorbance capacity (ORAC) and DPPH radical scavenging assays, could be predicted from the peak table data or the chromatographic fingerprints (prediction errors 9-12%) using partial least squares (PLS) regression. Prediction models created from samples of only two production years could additionally be used to predict the TAC of samples from another production year (prediction errors<13%) indicating the robustness of the models in a quality control environment. Furthermore, the uninformative variable elimination (UVE)-PLS method was used to identify potential antioxidant markers for rooibos infusions. All individual phenolic compounds that were quantified were selected as informative variables, except vitexin, while UVE-PLS models developed from chromatographic fingerprints indicated additional antioxidant markers, namely (S)-eriodictyol-6-C-glucoside, (R)-eriodictyol-6-C-glucoside, aspalalinin and two unidentified compounds. The potential antioxidant markers should be validated prior to use in quality control of rooibos tea.
Affiliation(s)
- Joanna Orzel, Institute of Chemistry, The University of Silesia, 9 Szkolna Street, 40-006 Katowice, Poland
- Michal Daszykowski, Institute of Chemistry, The University of Silesia, 9 Szkolna Street, 40-006 Katowice, Poland
- Malgorzata Kazura, Institute of Chemistry, The University of Silesia, 9 Szkolna Street, 40-006 Katowice, Poland
- Dalene de Beer, Post-Harvest and Wine Technology Division, Agricultural Research Council (ARC), Infruitec-Nietvoorbij, Private Bag X5026, Stellenbosch 7599, South Africa
- Elizabeth Joubert, Post-Harvest and Wine Technology Division, Agricultural Research Council (ARC), Infruitec-Nietvoorbij, Private Bag X5026, Stellenbosch 7599, South Africa; Department of Food Science, Stellenbosch University, Private Bag X1, Matieland (Stellenbosch) 7602, South Africa
- Alexandra E Schulze, Department of Food Science, Stellenbosch University, Private Bag X1, Matieland (Stellenbosch) 7602, South Africa
- Theresa Beelders, Department of Food Science, Stellenbosch University, Private Bag X1, Matieland (Stellenbosch) 7602, South Africa
- André de Villiers, Department of Chemistry and Polymer Science, Stellenbosch University, Private Bag X1, Matieland (Stellenbosch) 7602, South Africa
- Christiaan J Malherbe, Post-Harvest and Wine Technology Division, Agricultural Research Council (ARC), Infruitec-Nietvoorbij, Private Bag X5026, Stellenbosch 7599, South Africa
- Beata Walczak, Institute of Chemistry, The University of Silesia, 9 Szkolna Street, 40-006 Katowice, Poland
14
Airado-Rodríguez D, Høy M, Skaret J, Wold JP. From multispectral imaging of autofluorescence to chemical and sensory images of lipid oxidation in cod caviar paste. Talanta 2014; 122:70-9. [PMID: 24720964 DOI: 10.1016/j.talanta.2013.12.052] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Received: 07/09/2013] [Revised: 12/22/2013] [Accepted: 12/24/2013] [Indexed: 11/18/2022]
Affiliation(s)
- Diego Airado-Rodríguez, Nofima AS, Osloveien 1, N-1430 Ås, Norway; Department of Analytical Chemistry, Faculty of Sciences, University of Extremadura, Avenida de Elvas s/n, E-06006 Badajoz, Spain
- Martin Høy, Nofima AS, Osloveien 1, N-1430 Ås, Norway
15
Hernández-Hierro JM, Esquerre C, Valverde J, Villacreces S, Reilly K, Gaffney M, González-Miret ML, Heredia FJ, O’Donnell CP, Downey G. Preliminary study on the use of near infrared hyperspectral imaging for quantitation and localisation of total glucosinolates in freeze-dried broccoli. J Food Eng 2014. [DOI: 10.1016/j.jfoodeng.2013.11.005] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.4] [Indexed: 11/28/2022]
16
Li B, Shanahan M, Calvet A, Leister KJ, Ryder AG. Comprehensive, quantitative bioprocess productivity monitoring using fluorescence EEM spectroscopy and chemometrics. Analyst 2014; 139:1661-71. [DOI: 10.1039/c4an00007b] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.8] [Indexed: 01/11/2023]
Abstract
Using fluorescence excitation-emission matrix spectroscopy and chemometric methods we demonstrate an effective and rapid method for quantitative monitoring of a mammalian cell culture based manufacturing process.
Affiliation(s)
- Boyan Li, Nanoscale Biophotonics Laboratory, School of Chemistry, National University of Ireland, Galway, Ireland
- Michael Shanahan, Nanoscale Biophotonics Laboratory, School of Chemistry, National University of Ireland, Galway, Ireland
- Amandine Calvet, Nanoscale Biophotonics Laboratory, School of Chemistry, National University of Ireland, Galway, Ireland
- Kirk J. Leister, Bristol-Myers Squibb, Process Analytical Sciences, Syracuse, USA
- Alan G. Ryder, Nanoscale Biophotonics Laboratory, School of Chemistry, National University of Ireland, Galway, Ireland
17
Li B, Ray BH, Leister KJ, Ryder AG. Performance monitoring of a mammalian cell based bioprocess using Raman spectroscopy. Anal Chim Acta 2013; 796:84-91. [DOI: 10.1016/j.aca.2013.07.058] [Citation(s) in RCA: 38] [Impact Index Per Article: 3.5] [Received: 03/26/2013] [Revised: 07/26/2013] [Accepted: 07/28/2013] [Indexed: 10/26/2022]
18
Kuligowski J, Carrión D, Quintás G, Garrigues S, de la Guardia M. Direct determination of polymerised triacylglycerides in deep-frying vegetable oil by near infrared spectroscopy using Partial Least Squares regression. Food Chem 2012. [DOI: 10.1016/j.foodchem.2011.07.139] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.3] [Indexed: 11/30/2022]
19
A new and efficient variable selection algorithm based on ant colony optimization. Applications to near infrared spectroscopy/partial least-squares analysis. Anal Chim Acta 2011; 699:18-25. [DOI: 10.1016/j.aca.2011.04.061] [Citation(s) in RCA: 85] [Impact Index Per Article: 6.5] [Received: 01/26/2011] [Revised: 04/06/2011] [Accepted: 04/28/2011] [Indexed: 11/19/2022]
20
Gröger T, Zimmermann R. Application of parallel computing to speed up chemometrics for GC×GC–TOFMS based metabolic fingerprinting. Talanta 2011; 83:1289-94. [DOI: 10.1016/j.talanta.2010.09.015] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.8] [Received: 04/23/2010] [Revised: 08/25/2010] [Accepted: 09/08/2010] [Indexed: 12/13/2022]
21
Keithley RB, Carelli RM, Wightman RM. Rank estimation and the multivariate analysis of in vivo fast-scan cyclic voltammetric data. Anal Chem 2010; 82:5541-51. [PMID: 20527815 DOI: 10.1021/ac100413t] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.6] [Indexed: 11/28/2022]
Abstract
Principal component regression has been used in the past to separate current contributions from different neuromodulators measured with in vivo fast-scan cyclic voltammetry. Traditionally, a percent cumulative variance approach has been used to determine the rank of the training set voltammetric matrix during model development; however, this approach suffers from several disadvantages including the use of arbitrary percentages and the requirement of extreme precision of training sets. Here, we propose that Malinowski's F-test, a method based on a statistical analysis of the variance contained within the training set, can be used to improve factor selection for the analysis of in vivo fast-scan cyclic voltammetric data. These two methods of rank estimation were compared at all steps in the calibration protocol including the number of principal components retained, overall noise levels, model validation as determined using a residual analysis procedure, and predicted concentration information. By analyzing 119 training sets from two different laboratories amassed over several years, we were able to gain insight into the heterogeneity of in vivo fast-scan cyclic voltammetric data and study how differences in factor selection propagate throughout the entire principal component regression analysis procedure. Visualizing cyclic voltammetric representations of the data contained in the retained and discarded principal components showed that using Malinowski's F-test for rank estimation of in vivo training sets allowed for noise to be more accurately removed. Malinowski's F-test also improved the robustness of our criterion for judging multivariate model validity, even though signal-to-noise ratios of the data varied. In addition, pH change was the majority noise carrier of in vivo training sets while dopamine prediction was more sensitive to noise.
Affiliation(s)
- Richard B Keithley, Department of Chemistry, Neuroscience Center and Neurobiology Curriculum, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, USA
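The percent cumulative variance approach that the abstract above describes as the traditional way to estimate rank can be sketched as follows. This is a minimal illustration, assuming the singular values of the training-set matrix are already available; the function name `rank_by_cumulative_variance` and the default threshold are hypothetical, and Malinowski's F-test, which the paper proposes instead, is not implemented here.

```python
def rank_by_cumulative_variance(singular_values, threshold_percent=99.5):
    """Smallest number of components whose cumulative variance reaches
    the given percentage of the total variance.

    Variance captured by a component is taken as its squared singular value.
    """
    variances = sorted((s ** 2 for s in singular_values), reverse=True)
    total = sum(variances)
    cumulative = 0.0
    for k, v in enumerate(variances, start=1):
        cumulative += v
        if 100.0 * cumulative / total >= threshold_percent:
            return k
    return len(variances)
```

With singular values 10, 3, 1 and 0.1, thresholds of 90%, 99% and 99.5% retain one, two and three components respectively, which illustrates the sensitivity to an arbitrary percentage that the abstract lists as a disadvantage.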
22
Hennessy S, Downey G, O'Donnell CP. Attempted confirmation of the provenance of Corsican PDO honey using FT-IR spectroscopy and multivariate data analysis. J Agric Food Chem 2010; 58:9401-9406. [PMID: 20695639 DOI: 10.1021/jf101500n] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.3] [Indexed: 05/29/2023]
Abstract
This study investigated the potential of Fourier-transform infrared (FT-IR) spectroscopy and chemometric techniques to produce a mathematical model that would confirm or refute the provenance of honeys claiming to be Corsican. Authentic honey samples from two harvest seasons (2004/2005 and 2005/2006) were collected from Ireland (n=2), Italy (n=30), Austria (n=40), Germany (n=36), mainland France (n=46), and Corsica (n=219). Prior to scanning, samples were diluted with distilled water to a standard solids content (70 degrees Brix). Spectra (2500-12500 nm) were recorded at room temperature using a FT-IR spectrometer equipped with a germanium attenuated total reflectance (ATR) accessory. Standard normal variate (SNV) and first- and second-derivative data pretreatments were applied to the recorded spectra, which were processed using factorial discriminant analysis (FDA) and partial least-squares (PLS) regression analysis. Overall correct classification figures of 82% (FDA) and 87% (PLS) were obtained for a separate validation set comprising samples from both harvests.
Affiliation(s)
- Siobhán Hennessy, Teagasc, Ashtown Food Research Centre, Ashtown, Dublin 15, Ireland
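The standard normal variate (SNV) pretreatment mentioned in the abstract above can be sketched in a few lines. This assumes the usual definition, in which each spectrum is centred on its own mean and divided by its own sample standard deviation; the function name `snv` is hypothetical and the paper's software is not known from the abstract.

```python
from statistics import mean, stdev


def snv(spectrum):
    """Standard normal variate: centre and scale a single spectrum
    using its own mean and sample standard deviation (n - 1)."""
    m = mean(spectrum)
    s = stdev(spectrum)
    if s == 0:
        raise ValueError("spectrum is constant; SNV is undefined")
    return [(x - m) / s for x in spectrum]
```

Because the transform uses only the spectrum itself, a spectrum and a shifted, rescaled copy of it give the same SNV result, for example `snv([1.0, 2.0, 3.0])` and `snv([12.0, 14.0, 16.0])`.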
23
Airado-Rodríguez D, Skaret J, Wold JP. Assessment of the quality attributes of cod caviar paste by means of front-face fluorescence spectroscopy. J Agric Food Chem 2010; 58:5276-5285. [PMID: 20373814 DOI: 10.1021/jf100342u] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Indexed: 05/29/2023]
Abstract
This paper describes the fluorescent behavior of cod caviar paste, stored under different conditions, in terms of light exposure and concentration of oxygen in the headspace. Multivariate curve resolution was employed to decompose the overall fluorescence spectra into pure fluorescent components and calculate the relative concentrations of these components in the different samples. Profiles corresponding to protoporphyrin IX, photoprotoporphyrin, and fluorescent oxidation products were identified. Sensory evaluation, TBARS, and analysis of volatiles are typical methods employed in the routine analysis and quality control of such food. Successful calibration models were established between fluorescence and those routine methods. Correlation coefficients higher than 0.80 were found for 79% and higher than 0.90 for 50% of the assessed odors and flavors. For instance, R values of 0.94 and 0.96 were obtained for fresh and rancid flavors, respectively, and 0.89 for TBARS. On the basis of these data, it can be argued that front-face fluorescence spectroscopy can substitute for all of these expensive and tedious methodologies.
24
Keithley RB, Heien ML, Wightman RM. Multivariate concentration determination using principal component regression with residual analysis. Trends Analyt Chem 2009; 28:1127-1136. [PMID: 20160977 PMCID: PMC2760950 DOI: 10.1016/j.trac.2009.07.002] [Citation(s) in RCA: 128] [Impact Index Per Article: 8.5] [Indexed: 11/20/2022]
Abstract
Data analysis is an essential tenet of analytical chemistry, extending the possible information obtained from the measurement of chemical phenomena. Chemometric methods have grown considerably in recent years, but their wide use is hindered because some still consider them too complicated. The purpose of this review is to describe a multivariate chemometric method, principal component regression, in a simple manner from the point of view of an analytical chemist, to demonstrate the need for proper quality-control (QC) measures in multivariate analysis and to advocate the use of residuals as a proper QC method.
Affiliation(s)
- Richard B. Keithley, The University of North Carolina, Department of Chemistry, B-5 Venable Hall CB#3290, Chapel Hill, NC 27599, USA
- Michael L. Heien, The Pennsylvania State University, Department of Chemistry, 104 Chemistry Building, University Park, PA, 16802, USA
- R. Mark Wightman, The University of North Carolina, Department of Chemistry, B-5 Venable Hall CB#3290, Chapel Hill, NC 27599, USA