1
Dumancas GG, Setijadi C, Dufour B, Aglobo J, Carisma MS, Bello GA, Dalisay DS, Saludes JP. Comparison of Genetic and Non-genetic Algorithm Partial Least Squares for Sugar Quantification in Philippine Honeys. Anal Lett 2022. [DOI: 10.1080/00032719.2022.2033985] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/01/2022]
Affiliation(s)
- Gerard G. Dumancas
  - Department of Chemistry, Loyola Science Center, The University of Scranton, Scranton, PA, USA
  - Balik Scientist Program, Philippine Council for Health Research and Development, Department of Science and Technology, Taguig City, Philippines
- Catherine Setijadi
  - Department of Mathematics and Physical Sciences, Louisiana State University–Alexandria, Alexandria, LA, USA
- Ben Dufour
  - Department of Mathematics and Physical Sciences, Louisiana State University–Alexandria, Alexandria, LA, USA
- Jastine Aglobo
  - Gregor Mendel Research Laboratories, University of San Agustin, Iloilo City, Philippines
- Marjorie S. Carisma
  - Gregor Mendel Research Laboratories, University of San Agustin, Iloilo City, Philippines
  - Department of Chemistry, College of Liberal Arts, Sciences, and Education, University of San Agustin, Iloilo City, Philippines
- Ghalib A. Bello
  - Department of Environmental Medicine and Public Health, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Doralyn S. Dalisay
  - Balik Scientist Program, Philippine Council for Health Research and Development, Department of Science and Technology, Taguig City, Philippines
  - Department of Biology, College of Liberal Arts, Sciences, and Education, University of San Agustin, Iloilo City, Philippines
  - Center for Chemical Biology and Biotechnology (C2B2), University of San Agustin, Iloilo City, Philippines
- Jonel P. Saludes
  - Balik Scientist Program, Philippine Council for Health Research and Development, Department of Science and Technology, Taguig City, Philippines
  - Gregor Mendel Research Laboratories, University of San Agustin, Iloilo City, Philippines
  - Department of Chemistry, College of Liberal Arts, Sciences, and Education, University of San Agustin, Iloilo City, Philippines
  - Center for Natural Drug Discovery and Development (CND3), University of San Agustin, Iloilo City, Philippines
2
Statistical Analysis of Chemical Element Compositions in Food Science: Problems and Possibilities. Molecules 2021; 26:5752. [PMID: 34641296 PMCID: PMC8510397 DOI: 10.3390/molecules26195752] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 08/09/2021] [Revised: 09/16/2021] [Accepted: 09/17/2021] [Indexed: 11/17/2022] [Open Access]
Abstract
In recent years, many analyses have been carried out to investigate the chemical components of food. However, studies rarely consider the compositional pitfalls of such analyses. This is problematic as it may lead to arbitrary results when non-compositional statistical analysis is applied to compositional datasets. In this study, compositional data analysis (CoDa), which is widely used in other research fields, is compared with classical statistical analysis to demonstrate how the results vary depending on the approach and to show the best possible statistical analysis. For example, honey and saffron are highly susceptible to adulteration and imitation, so the determination of their chemical elements requires the best possible statistical analysis. Our study demonstrated how principal component analysis (PCA) and classification results are influenced by the pre-processing steps conducted on the raw data, and by the replacement strategies for missing values and non-detects. Furthermore, it demonstrated the differences in results when compositional and non-compositional methods were applied. Our results suggested that the log-ratio analysis provided better separation between the pure and adulterated data and allowed for easier interpretability of the results and a higher accuracy of classification. Similarly, it showed that classification with artificial neural networks (ANNs) works poorly if the CoDa pre-processing steps are left out. From these results, we advise the application of CoDa methods for analyses of the chemical elements of food and for the characterization and authentication of food products.
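To make the log-ratio idea in this abstract concrete, here is a minimal sketch of one common CoDa pre-processing step, the centered log-ratio (CLR) transform. The abstract does not say which log-ratio transform the authors used, so CLR is an assumption here, and the function name `clr` and the sample values are hypothetical.

```python
import numpy as np

def clr(composition):
    """Centered log-ratio transform of strictly positive compositional rows."""
    x = np.asarray(composition, dtype=float)
    if np.any(x <= 0):
        # Zeros and non-detects need a replacement strategy before any log-ratio step.
        raise ValueError("CLR needs strictly positive parts")
    log_x = np.log(x)
    # Subtract the per-row mean of the logs (the log of the geometric mean).
    return log_x - log_x.mean(axis=-1, keepdims=True)

# Hypothetical element concentrations for one sample, in two different units.
sample_mg_per_kg = np.array([120.0, 30.0, 6.0])
sample_g_per_kg = sample_mg_per_kg / 1000.0

clr_a = clr(sample_mg_per_kg)
clr_b = clr(sample_g_per_kg)
```

Both calls give the same CLR coordinates, which illustrates why log-ratio methods depend only on the ratios between parts and not on the measurement scale.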
3
Bresolin T, Dórea JRR. Infrared Spectrometry as a High-Throughput Phenotyping Technology to Predict Complex Traits in Livestock Systems. Front Genet 2020; 11:923. [PMID: 32973876 PMCID: PMC7468402 DOI: 10.3389/fgene.2020.00923] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 04/06/2020] [Accepted: 07/24/2020] [Indexed: 12/17/2022] [Open Access]
Abstract
High-throughput phenotyping technologies are growing in importance in livestock systems due to their ability to generate real-time, non-invasive, and accurate animal-level information. Collecting such individual-level information can generate novel traits and potentially improve animal selection and management decisions in livestock operations. One of the most relevant tools used in the dairy and beef industry to predict complex traits is infrared spectrometry, which is based on the analysis of the interaction between electromagnetic radiation and matter. Electromagnetic radiation spans an enormous range of wavelengths and frequencies known as the electromagnetic spectrum. The spectrum is divided into different regions, with the near- and mid-infrared regions being the main spectral regions used in livestock applications. The advantages of infrared spectrometry include speed, non-destructive measurement, and great potential for on-line analysis. This paper aims to review the use of mid- and near-infrared spectrometry techniques as tools to predict complex dairy and beef phenotypes, such as milk composition, feed efficiency, methane emission, fertility, energy balance, health status, and meat quality traits. Although several research studies have used these technologies to predict a wide range of phenotypes, most of them are based on Partial Least Squares (PLS) and did not consider other machine learning (ML) techniques to improve prediction quality. Therefore, we will discuss the role of analytical methods employed on spectral data to improve the predictive ability for complex traits in livestock operations. Furthermore, we will discuss different approaches to reduce data dimensionality and the impact of validation strategies on predictive quality.
Affiliation(s)
- Tiago Bresolin
  - Department of Animal and Dairy Sciences, University of Wisconsin-Madison, Madison, WI, United States
- João R R Dórea
  - Department of Animal and Dairy Sciences, University of Wisconsin-Madison, Madison, WI, United States
4
Ho PN, Bonfatti V, Luke TDW, Pryce JE. Classifying the fertility of dairy cows using milk mid-infrared spectroscopy. J Dairy Sci 2019; 102:10460-10470. [PMID: 31495611 DOI: 10.3168/jds.2019-16412] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 01/31/2019] [Accepted: 07/23/2019] [Indexed: 12/11/2022]
Abstract
The objective of this study was to investigate the potential of milk mid-infrared (MIR) spectroscopy, MIR-derived traits including milk composition, milk fatty acids, and blood metabolic profiles (fatty acids, β-hydroxybutyrate, and urea), and other on-farm data for discriminating cows of good versus poor likelihood of conception to first insemination (i.e., pregnant vs. open). A total of 6,488 spectral and milk production records of 2,987 cows from 19 commercial dairy herds across 3 Australian states were used. Seven models, comprising different explanatory variables, were examined. Model 1 included milk production; concentrations of fat, protein, and lactose; somatic cell count; age at calving; days in milk at herd test; and days from calving to insemination. Model 2 included, in addition to the variables in model 1, milk fatty acids and blood metabolic profiles. The MIR spectrum collected before first insemination was added to model 2 to form model 3. Fat, protein, and lactose percentages, milk fatty acids, and blood metabolic profiles were removed from model 3 to create model 4. Model 5 and model 6 comprised model 4 and either fertility genomic estimated breeding value or principal components obtained from a genomic relationship matrix derived using animal genotypes, respectively. In model 7, all previously described sources of information, but not MIR-derived traits, were used. The models were developed using partial least squares discriminant analysis. The performance of each model was evaluated in 2 ways: 10-fold random cross-validation and herd-by-herd external validation. The accuracy measures were sensitivity (i.e., the proportion of pregnant cows that were correctly classified), specificity (i.e., the proportion of open cows that were correctly classified), and area under the curve (AUC) for the receiver operating characteristic curve. The results showed that in all models, prediction accuracy obtained through 10-fold random cross-validation was higher than that of herd-by-herd external validation, with the difference in AUC ranging between 0.01 and 0.09. In the herd-by-herd external validation, using basic on-farm information (model 1) was not sufficient to classify good- and poor-fertility cows; the sensitivity, specificity, and AUC were around 0.66. Compared with model 1, adding milk fatty acids and blood metabolic profiles (model 2) increased the sensitivity, specificity, and AUC by 0.01, 0.02, and 0.02 units, respectively (i.e., 0.65, 0.63, and 0.678). Incorporating MIR spectra into model 2 resulted in sensitivity, specificity, and AUC values of 0.73, 0.63, and 0.72, respectively (model 3). The comparable prediction accuracies observed for models 3 and 4 mean that useful information from MIR-derived traits is already included in the spectra. Adding the fertility genomic estimated breeding value and animal genotypes (model 7) produced the highest prediction accuracy, with sensitivity, specificity, and AUC values of 0.75, 0.66, and 0.75, respectively. However, removing either the fertility estimated breeding value or animal genotype from model 7 resulted in a reduction of the prediction accuracy of only 0.01 and 0.02, respectively. In conclusion, this study indicates that MIR and other on-farm data could be used to classify cows of good and poor likelihood of conception with promising accuracy.
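The three accuracy measures defined in this abstract can be sketched in a few lines. This is only an illustration on made-up labels and scores, not the authors' code: the function names, the toy data, and the 0.5 decision threshold are all assumptions, and the AUC is computed through the pairwise (Mann-Whitney) formulation rather than by tracing the curve.

```python
import numpy as np

def sensitivity_specificity(y_true, y_pred):
    """Sensitivity: share of pregnant cows (1) classified as pregnant.
    Specificity: share of open cows (0) classified as open."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    tp = np.sum(y_true & y_pred)
    fn = np.sum(y_true & ~y_pred)
    tn = np.sum(~y_true & ~y_pred)
    fp = np.sum(~y_true & y_pred)
    return tp / (tp + fn), tn / (tn + fp)

def auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random negative."""
    y_true = np.asarray(y_true, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos = scores[y_true]
    neg = scores[~y_true]
    diff = pos[:, None] - neg[None, :]
    # Ties between a positive and a negative count as half a win.
    return (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / (pos.size * neg.size)

# Hypothetical data: 1 = pregnant, 0 = open; scores are model outputs.
y_true = np.array([1, 1, 1, 1, 0, 0, 0, 0])
scores = np.array([0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1])
y_pred = scores >= 0.5  # assumed decision threshold

sens, spec = sensitivity_specificity(y_true, y_pred)
roc_auc = auc(y_true, scores)
print(sens, spec, roc_auc)  # 0.75 0.75 0.875
```

Sensitivity and specificity depend on the chosen threshold, while AUC summarizes ranking quality across all thresholds, which is why the abstract reports all three.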
Affiliation(s)
- P N Ho
  - Agriculture Victoria, AgriBio, Centre for AgriBioscience, Bundoora, Victoria 3083, Australia
- V Bonfatti
  - Department of Comparative Biomedicine and Food Science, University of Padova, Legnaro 35020, Italy
- T D W Luke
  - Agriculture Victoria, AgriBio, Centre for AgriBioscience, Bundoora, Victoria 3083, Australia; School of Applied Systems Biology, La Trobe University, Bundoora, Victoria 3083, Australia
- J E Pryce
  - Agriculture Victoria, AgriBio, Centre for AgriBioscience, Bundoora, Victoria 3083, Australia; School of Applied Systems Biology, La Trobe University, Bundoora, Victoria 3083, Australia
5
El Jabri M, Sanchez MP, Trossat P, Laithier C, Wolf V, Grosperrin P, Beuvier E, Rolet-Répécaud O, Gavoye S, Gaüzère Y, Belysheva O, Notz E, Boichard D, Delacroix-Buchet A. Comparison of Bayesian and partial least squares regression methods for mid-infrared prediction of cheese-making properties in Montbéliarde cows. J Dairy Sci 2019; 102:6943-6958. [PMID: 31178172 DOI: 10.3168/jds.2019-16320] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 01/17/2019] [Accepted: 04/23/2019] [Indexed: 01/17/2023]
Abstract
Assessing the cheese-making properties (CMP) of milks with a rapid and cost-effective method is of particular interest for the Protected Designation of Origin cheese sector. The aims of this study were to evaluate the potential of mid-infrared (MIR) spectra to estimate coagulation and acidification properties, as well as curd yield (CY) traits of Montbéliarde cow milk. Samples from 250 cows were collected in 216 commercial herds in Franche-Comté with the objectives to maximize the genetic diversity as well as the variation in milk composition. All coagulation and CY traits showed high variability (10 to 43%). Reference analyses performed for soft (SC) and pressed cooked (PCC) cheese technology were matched with MIR spectra. Prediction models were built on 446 informative wavelengths not tainted by the water absorbance, using different approaches such as partial least squares (PLS), uninformative variable elimination PLS, random forest PLS, Bayes A, Bayes B, Bayes C, and Bayes RR. We assessed equation performances for a set of 20 CMP traits (coagulation: 5 for SC and 4 for PCC; acidification: 5 for SC and 3 for PCC; laboratory CY: 3) by comparing prediction accuracies based on cross-validation. Overall, variable selection before PLS did not significantly improve the performances of the PLS regression, the prediction differences between Bayesian methods were negligible, and PLS models always outperformed Bayesian models. This was likely a result of the prior use of informative wavelengths of the MIR spectra. The best accuracies were obtained for curd yields expressed in dry matter (CYDM) or fresh (CYFRESH) and for coagulation traits (curd firmness for PCC and SC) using the PLS regression. Prediction models of other CMP traits were moderately to poorly accurate. Whatever the prediction methodology, the best results were always obtained for CY traits, probably because these traits are closely related to milk composition. The CYDM predictions showed coefficient of determination (R²) values up to 0.92 and 0.87, and RSy,x values of 3 and 4% for PLS and Bayes regressions, respectively. Finally, we divided the data set into calibration (2/3) and validation (1/3) sets and developed prediction models in external validation using PLS regression only. In conclusion, we confirmed, in the validation set, an excellent prediction for CYDM [R² = 0.91, ratio of performance to deviation (RPD) = 3.39] and a very good prediction for CYFRESH (R² = 0.84, RPD = 2.49), adequate for analytical purposes. We also obtained good results for both PCC and SC curd firmness traits (R² ≥ 0.70, RPD ≥ 1.8), which enable quantitative prediction.
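The validation statistics quoted in this abstract can be illustrated with a short sketch. The abstract does not give the exact formulas the authors used, so this assumes one common convention: R² as one minus the ratio of residual to total sum of squares, and RPD as the sample standard deviation of the reference values divided by the root mean squared error of prediction. The function name and the data are hypothetical.

```python
import numpy as np

def r2_and_rpd(y_ref, y_hat):
    """Return (R², RPD) for reference values and model predictions."""
    y_ref = np.asarray(y_ref, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    residuals = y_ref - y_hat
    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum((y_ref - y_ref.mean()) ** 2)
    rmse = np.sqrt(np.mean(residuals ** 2))
    r2 = 1.0 - ss_res / ss_tot
    # Assumed convention: sample SD (ddof=1) over RMSE; some authors use a
    # bias-corrected standard error of prediction instead.
    rpd = np.std(y_ref, ddof=1) / rmse
    return r2, rpd

# Hypothetical reference and predicted values for a validation set.
y_ref = [10.0, 12.0, 14.0, 16.0, 18.0]
y_hat = [11.0, 11.0, 15.0, 15.0, 19.0]

r2, rpd = r2_and_rpd(y_ref, y_hat)
print(round(r2, 3), round(rpd, 3))  # 0.875 3.162
```

Under this convention, RPD expresses how many times smaller the prediction error is than the natural spread of the trait, so a larger value means the model resolves differences between samples more finely.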
Affiliation(s)
- M El Jabri
  - Institut de l'Elevage, F-75012 Paris, France
- M-P Sanchez
  - GABI, INRA, AgroParisTech, Université Paris-Saclay, F-78350 Jouy-en-Josas, France
- C Laithier
  - Institut de l'Elevage, F-75012 Paris, France
- V Wolf
  - Conseil Elevage 25-90, F-25640 Roulans, France
- E Beuvier
  - URTAL, INRA, F-39800 Poligny, France
- S Gavoye
  - ACTALIA, F-39800 Poligny, France
- Y Gaüzère
  - Ecole Nationale d'Industrie Laitière et des Biotechnologies, F-39800 Poligny, France
- O Belysheva
  - Ecole Nationale d'Industrie Laitière et des Biotechnologies, F-39800 Poligny, France
- E Notz
  - Centre Technique des Fromages Comtois, F-39800 Poligny, France
- D Boichard
  - GABI, INRA, AgroParisTech, Université Paris-Saclay, F-78350 Jouy-en-Josas, France
- A Delacroix-Buchet
  - GABI, INRA, AgroParisTech, Université Paris-Saclay, F-78350 Jouy-en-Josas, France
6
Zhang H, Tian JY, Huang J, Huang XH, Quan GJ, Yan S, Liu PR. Rapid and pollution-free characterization of intracellular polyphosphate and orthophosphate using mid-infrared spectroscopy combined with chemometrics in the denitrifying phosphorus removal process. RSC Adv 2016. [DOI: 10.1039/c6ra23756h] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/21/2022] [Open Access]
Abstract
Variation in the content of intracellular polyphosphate (Poly-P) and orthophosphate may be predicted rapidly by mid-infrared spectroscopy and the PLS method in the denitrifying phosphorus removal process.
Affiliation(s)
- H. Zhang
  - Key Laboratory of Anhui Province of Water Pollution Control and Wastewater Reuse, School of Environment and Energy Engineering, Anhui Jianzhu University, Hefei, China
- J. Y. Tian
  - Key Laboratory of Anhui Province of Water Pollution Control and Wastewater Reuse, School of Environment and Energy Engineering, Anhui Jianzhu University, Hefei, China
- J. Huang
  - Key Laboratory of Anhui Province of Water Pollution Control and Wastewater Reuse, School of Environment and Energy Engineering, Anhui Jianzhu University, Hefei, China
- X. H. Huang
  - Key Laboratory of Anhui Province of Water Pollution Control and Wastewater Reuse, School of Environment and Energy Engineering, Anhui Jianzhu University, Hefei, China
- G. J. Quan
  - Key Laboratory of Anhui Province of Water Pollution Control and Wastewater Reuse, School of Environment and Energy Engineering, Anhui Jianzhu University, Hefei, China
- S. Yan
  - Key Laboratory of Anhui Province of Water Pollution Control and Wastewater Reuse, School of Environment and Energy Engineering, Anhui Jianzhu University, Hefei, China
- P. R. Liu
  - Key Laboratory of Anhui Province of Water Pollution Control and Wastewater Reuse, School of Environment and Energy Engineering, Anhui Jianzhu University, Hefei, China