1
Allegrini F, Olivieri AC. Linear or non-linear multivariate calibration models? That is the question. Anal Chim Acta 2022; 1226:340248. [DOI: 10.1016/j.aca.2022.340248] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/14/2022] [Revised: 08/05/2022] [Accepted: 08/07/2022] [Indexed: 11/16/2022]
2
Predicting the Properties of High-Performance Epoxy Resin by Machine Learning Using Molecular Dynamics Simulations. Nanomaterials 2022; 12:2353. [PMID: 35889577 PMCID: PMC9317641 DOI: 10.3390/nano12142353] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/30/2022] [Revised: 06/23/2022] [Accepted: 07/06/2022] [Indexed: 11/18/2022]
Abstract
Epoxy resin is one of the most widely used adhesives for various applications owing to its outstanding properties. The performance of epoxy systems varies significantly depending on the composition of the base resin and curing agent. However, there are limitations in exploring numerous formulations of epoxy resins to optimize adhesive properties because of the expense and time-consuming nature of the trial-and-error process. Herein, molecular dynamics (MD) simulations and machine learning (ML) methods were used to overcome these challenges and predict the adhesive properties of epoxy resin. Datasets for diverse epoxy adhesive formulations were constructed by considering the degree of crosslinking, density, free volume, cohesive energy density, modulus, and glass transition temperature. A linear correlation analysis demonstrated that the content of the curing agents, especially dicyandiamide (DICY), had the greatest correlation with the cohesive energy density. Moreover, the content of tetraglycidyl methylene dianiline (TGMDA) had the highest correlation with the modulus, and the content of diglycidyl ether of bisphenol A (DGEBA) had the highest correlation with the glass transition temperature. An optimized artificial neural network (ANN) model was constructed using test sets divided from MD datasets through error and linear regression analyses. The root mean square error (RMSE) and correlation coefficient (R²) showed the potential of each model in predicting epoxy properties, with high linear correlations (0.835–0.986). This technique can be extended for optimizing the composition of other epoxy resin systems.
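As an illustration of the two metrics named in this abstract, the sketch below computes RMSE and a squared Pearson correlation between reference and predicted values. It assumes that the reported R² is the squared linear correlation from a regression of predicted against reference values; the abstract does not state the exact definition, and the function names are hypothetical.

```python
import math

def rmse(y_true, y_pred):
    """Root mean square error between reference and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def squared_pearson(y_true, y_pred):
    """Squared Pearson correlation (assumed meaning of R² here)."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov ** 2 / (var_t * var_p)
```

Note that a squared correlation can be high even when predictions are biased, which is one reason to report it together with RMSE.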
3
Sensitivity and generalized analytical sensitivity expressions for quantitative analysis using convolutional neural networks. Anal Chim Acta 2022; 1192:338697. [DOI: 10.1016/j.aca.2021.338697] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/23/2021] [Revised: 05/21/2021] [Accepted: 05/23/2021] [Indexed: 11/17/2022]
4
Rasouli Z, Maeder M, Abdollahi H. Chemical model-based optimization of a sensor array for simultaneous determination of glucose and fructose. Microchem J 2022. [DOI: 10.1016/j.microc.2021.106944] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
5
Yang Y, Wang X, Zhao X, Huang M, Zhu Q. M3GPSpectra: A novel approach integrating variable selection/construction and MLR modeling for quantitative spectral analysis. Anal Chim Acta 2021; 1160:338453. [PMID: 33894955 DOI: 10.1016/j.aca.2021.338453] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 02/02/2021] [Revised: 03/20/2021] [Accepted: 03/23/2021] [Indexed: 11/24/2022]
Abstract
Quantitative analysis of the physical or chemical properties of various materials by using spectral analysis technology combined with chemometrics has become an important method in the field of analytical chemistry. This method aims to build a model relationship (called a prediction model) between feature variables acquired by spectral sensors and components to be measured. Feature selection or transformation should be conducted to reduce the interference of irrelevant information on the prediction model because original spectral feature variables contain redundant information and massive noise. Most existing feature selection and transformation methods are single linear or nonlinear operations, which easily lead to the loss of feature information and affect the accuracy of subsequent prediction models. This research proposes a novel spectroscopic technology-oriented, quantitative analysis model construction strategy named M3GPSpectra. This tool uses a genetic programming algorithm to select and reconstruct the original feature variables, evaluates the performance of the selected and reconstructed variables using a multivariate regression model (MLR), and obtains the best feature combination and the final parameters of the MLR through iterative learning. M3GPSpectra integrates feature selection, linear/nonlinear feature transformation, and subsequent model construction into a unified framework and thus easily realizes end-to-end parameter learning to significantly improve the accuracy of the prediction model. When applied to six types of datasets, M3GPSpectra obtains 19 prediction models, which are compared with those obtained by seven popular linear or nonlinear methods. Experimental results show that M3GPSpectra obtains the best performance among the eight methods tested. Further investigation verifies that the proposed method is not sensitive to the size of the training samples. Hence, M3GPSpectra is a promising spectral quantitative analytical tool.
Affiliation(s)
- Yu Yang
  - Key Laboratory of Advanced Process Control for Light Industry (Ministry of Education), Jiangnan University, Wuxi, China
- Xin Wang
  - Key Laboratory of Advanced Process Control for Light Industry (Ministry of Education), Jiangnan University, Wuxi, China
- Xin Zhao
  - Key Laboratory of Advanced Process Control for Light Industry (Ministry of Education), Jiangnan University, Wuxi, China
- Min Huang
  - Key Laboratory of Advanced Process Control for Light Industry (Ministry of Education), Jiangnan University, Wuxi, China
- Qibing Zhu
  - Key Laboratory of Advanced Process Control for Light Industry (Ministry of Education), Jiangnan University, Wuxi, China
6
Chiappini FA, Allegrini F, Goicoechea HC, Olivieri AC. Sensitivity for Multivariate Calibration Based on Multilayer Perceptron Artificial Neural Networks. Anal Chem 2020; 92:12265-12272. [DOI: 10.1021/acs.analchem.0c01863] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Indexed: 11/28/2022]
Affiliation(s)
- Fabricio A. Chiappini
  - Laboratorio de Desarrollo Analítico y Quimiometría (LADAQ), Cátedra de Química Analítica I, Facultad de Bioquímica y Ciencias Biológicas, Universidad Nacional del Litoral, Ciudad Universitaria, Santa Fe S3000ZAA, Argentina
  - Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), Godoy Cruz 2290 CABA C1425FQB, Argentina
- Héctor C. Goicoechea
  - Laboratorio de Desarrollo Analítico y Quimiometría (LADAQ), Cátedra de Química Analítica I, Facultad de Bioquímica y Ciencias Biológicas, Universidad Nacional del Litoral, Ciudad Universitaria, Santa Fe S3000ZAA, Argentina
  - Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), Godoy Cruz 2290 CABA C1425FQB, Argentina
- Alejandro C. Olivieri
  - Departamento de Química Analítica, Facultad de Ciencias Bioquímicas y Farmacéuticas, Universidad Nacional de Rosario, Instituto de Química de Rosario (IQUIR-CONICET), Suipacha 531, Rosario S2002LRK, Argentina
  - Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), Godoy Cruz 2290 CABA C1425FQB, Argentina
7
Tutorial: multivariate classification for vibrational spectroscopy in biological samples. Nat Protoc 2020; 15:2143-2162. [PMID: 32555465 DOI: 10.1038/s41596-020-0322-8] [Citation(s) in RCA: 130] [Impact Index Per Article: 32.5] [Received: 10/01/2019] [Accepted: 03/20/2020] [Indexed: 12/26/2022]
Abstract
Vibrational spectroscopy techniques, such as Fourier-transform infrared (FTIR) and Raman spectroscopy, have been successful methods for studying the interaction of light with biological materials and facilitating novel cell biology analysis. Spectrochemical analysis is very attractive in disease screening and diagnosis, microbiological studies and forensic and environmental investigations because of its low cost, minimal sample preparation, non-destructive nature and substantially accurate results. However, there is now an urgent need for multivariate classification protocols allowing one to analyze biologically derived spectrochemical data to obtain accurate and reliable results. Multivariate classification comprises discriminant analysis and class-modeling techniques where multiple spectral variables are analyzed in conjunction to distinguish and assign unknown samples to pre-defined groups. The requirement for such protocols is demonstrated by the fact that applications of deep-learning algorithms of complex datasets are being increasingly recognized as critical for extracting important information and visualizing it in a readily interpretable form. Hereby, we have provided a tutorial for multivariate classification analysis of vibrational spectroscopy data (FTIR, Raman and near-IR) highlighting a series of critical steps, such as preprocessing, data selection, feature extraction, classification and model validation. This is an essential aspect toward the construction of a practical spectrochemical analysis model for biological analysis in real-world applications, where fast, accurate and reliable classification models are fundamental.
8
Kotani A, Hakamata H, Hayashi Y. An automated assessment system of limits of detection and quantitation in gradient high-performance liquid chromatography with ultraviolet detection. J Chromatogr A 2020; 1621:461077. [DOI: 10.1016/j.chroma.2020.461077] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 01/21/2020] [Revised: 03/18/2020] [Accepted: 03/24/2020] [Indexed: 01/16/2023]
9
Chiappini FA, Teglia CM, Forno ÁG, Goicoechea HC. Modelling of bioprocess non-linear fluorescence data for at-line prediction of etanercept based on artificial neural networks optimized by response surface methodology. Talanta 2020; 210:120664. [DOI: 10.1016/j.talanta.2019.120664] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 11/04/2019] [Revised: 12/17/2019] [Accepted: 12/20/2019] [Indexed: 11/24/2022]
10
Abstract
Multisensor arrays employing various sensing principles are a rapidly developing field of research as they allow simple and inexpensive quantification of various parameters in complex samples. Quantitative analysis with such systems is based on multivariate regression techniques, and deriving traditional analytical figures of merit (e.g., sensitivity, selectivity, limit of detection, and limit of quantitation) for such systems is neither obvious nor straightforward. Nevertheless, it is essential for further development of the multisensor research field and for introducing these instruments into the general context of analytical chemistry. Here, we report a protocol for calculating sensitivity, selectivity, and detection limits for multisensor arrays. The results are provided and discussed in detail for several real-world data sets.
Affiliation(s)
- Hadi Parastar
  - Department of Chemistry, Sharif University of Technology, P.O. Box 11155-3516, Tehran 1458889694, Iran
- Dmitry Kirsanov
  - Institute of Chemistry, Saint Petersburg State University, Saint Petersburg 199034, Russia
11
An automated system for predicting detection limit and precision profile from a chromatogram. J Chromatogr A 2020; 1612:460644. [PMID: 31676091 DOI: 10.1016/j.chroma.2019.460644] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 07/29/2019] [Revised: 10/13/2019] [Accepted: 10/19/2019] [Indexed: 11/22/2022]
Abstract
This paper presents a basic model of an automated system for predicting the detection limit and precision profile (plot of relative standard deviation (RSD) of measurements against concentration) in chromatography. The fundamental assumption is that the major source of response errors at low sample concentrations is background noise and at high concentrations, it is the volumes injected into an HPLC system by a sample injector. The noise is approximated by the mixed random processes of the first order autoregressive process AR(1) and white noise. The research procedures are: (1) the description of the standard deviation (SD) of measurements in terms of the parameters of the mixed random processes; (2) the algorithm for the parameter estimation of the mixed processes from actual background noise; (3) the mathematical distinction between noise and signal in a chromatogram. When compounds are chromatographically separated, each obtained signal is given the detection limit and precision profile on laboratory-made software. A file of a chromatogram is the only requirement for the theoretical prediction of measurement uncertainty and therefore the repeated measurements of real samples can be dispensed with. The theoretically predicted RSDs are verified by comparing them with the statistical RSDs obtained by repeated measurements. Signal shapes on noise are illustrated at the detection limit and quantitation limit, the signal-to-noise ratios of which are close to the widely adopted values, 3 and 10, respectively.
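The noise model named in this abstract, a first-order autoregressive process AR(1) mixed with white noise, can be sketched as follows. This is only an illustration of the stochastic model, not the authors' software: the parameter names (`phi`, `sigma_ar`, `sigma_white`) are hypothetical, and the sketch compares the point-wise noise SD with its stationary value rather than reproducing the paper's SD expression for measured signals.

```python
import math
import random

def theoretical_noise_sd(phi, sigma_ar, sigma_white):
    """SD of stationary AR(1) noise plus independent white noise."""
    if not -1.0 < phi < 1.0:
        raise ValueError("phi must lie in (-1, 1) for a stationary AR(1) process")
    # Stationary variance of x[t] = phi * x[t-1] + e[t] with Var(e) = sigma_ar**2.
    ar_variance = sigma_ar ** 2 / (1.0 - phi ** 2)
    return math.sqrt(ar_variance + sigma_white ** 2)

def simulate_noise(n, phi, sigma_ar, sigma_white, seed=0):
    """Simulate n points of AR(1) noise with white noise added on top."""
    rng = random.Random(seed)
    x = 0.0
    noise = []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, sigma_ar)          # AR(1) component
        noise.append(x + rng.gauss(0.0, sigma_white))   # add white component
    return noise

def sample_sd(values):
    """Sample standard deviation (n - 1 in the denominator)."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (len(values) - 1))
```

For a long simulated trace, `sample_sd(simulate_noise(...))` should approach `theoretical_noise_sd(...)`. A detection limit would then be set relative to the noise SD of the measured signal, which the abstract reports as corresponding to signal-to-noise ratios close to 3 (detection) and 10 (quantitation).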
12
13
Novel application of neural network modelling for multicomponent herbal medicine optimization. Sci Rep 2019; 9:15442. [PMID: 31659222 PMCID: PMC6817903 DOI: 10.1038/s41598-019-51956-6] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 07/09/2019] [Accepted: 10/10/2019] [Indexed: 11/08/2022]
Abstract
The conventional method for effective or toxic chemical substance identification of multicomponent herbal medicine is based on single component separation, which is time-consuming, labor intensive, inefficient, and neglects the interaction and integrity among the components; therefore, it is necessary to find an alternative routine to evaluate the components more efficiently and scientifically. In this study, sodium aescinate injection (SAI), obtained from different manufacturers and prepared as "components knockout" samples, was chosen as the case study. The chemical fingerprints of SAI were obtained by high-performance liquid chromatography to provide the chemical information. The effectiveness and irritation of each sample were evaluated using anti-inflammatory and irritation tests, and then "Gray correlation" analysis (GCA) was applied to rank the effectiveness and irritability of each component to provide a preliminary judgment for product optimization. The prediction model of the proportions of the expected components was constructed using the artificial neural network. The results of the GCA showed that the irritation sorting of each SAI component was in the order of B > A > G > J > I > H > D > F > E > C and the effectiveness sorting of SAI components was in the order of D > C > B > A > F > E > H > I > G > J; the predictive proportion of SAI was optimized by the BP neural network as A: B: C: D: E: F = 0.7526: 0.5005: 5.4565: 1.4149: 0.8113: 1.0642. This study provided a scientific, accurate, reliable, and efficient approach for the proportion optimization of multicomponent drugs, which has a good prospect of popularization and application in product upgrading and development of herbal medicine.
14
DeepSpectra: An end-to-end deep learning approach for quantitative spectral analysis. Anal Chim Acta 2019; 1058:48-57. [PMID: 30851853 DOI: 10.1016/j.aca.2019.01.002] [Citation(s) in RCA: 108] [Impact Index Per Article: 21.6] [Received: 09/25/2018] [Revised: 11/24/2018] [Accepted: 01/02/2019] [Indexed: 11/20/2022]
Abstract
Learning patterns from spectra is critical for the development of chemometric analysis of spectroscopic data. Conventional two-stage calibration approaches consist of data preprocessing and modeling analysis. Misuse of preprocessing may introduce artifacts or remove useful patterns and result in worse model performance. An end-to-end deep learning approach incorporating an Inception module, named DeepSpectra, is presented to learn patterns from raw data to improve the model performance. The DeepSpectra model is compared with three CNN models on the raw data, and 16 preprocessing approaches are included to evaluate the impact of preprocessing by testing four open-access visible and near-infrared spectroscopic datasets (corn, tablets, wheat, and soil). The DeepSpectra model outperforms the other three convolutional neural network models on the four datasets and obtains better results on raw data than on preprocessed data for most scenarios. The model is also compared with linear partial least squares (PLS), nonlinear artificial neural network (ANN) and support vector regression (SVR) methods on raw and preprocessed data. The results show that the DeepSpectra approach provides better results than conventional linear and nonlinear calibration approaches in most scenarios. Increasing the number of training samples can improve the model repeatability and accuracy.
15
Application of chemometric methods to XRF-data – A tutorial review. Anal Chim Acta 2018; 1040:19-32. [DOI: 10.1016/j.aca.2018.05.023] [Citation(s) in RCA: 38] [Impact Index Per Article: 6.3] [Received: 12/13/2017] [Revised: 04/13/2018] [Accepted: 05/06/2018] [Indexed: 01/16/2023]
16
Morais CLM, Lima KMG, Martin FL. Uncertainty estimation and misclassification probability for classification models based on discriminant analysis and support vector machines. Anal Chim Acta 2018; 1063:40-46. [PMID: 30967184 DOI: 10.1016/j.aca.2018.09.022] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 07/19/2018] [Revised: 09/05/2018] [Accepted: 09/11/2018] [Indexed: 10/28/2022]
Abstract
Uncertainty estimation provides a quantitative value of the predictive performance of a classification model based on its misclassification probability. Low misclassification probabilities are associated with a low degree of uncertainty, indicating high trustworthiness, while high misclassification probabilities are associated with a high degree of uncertainty, indicating a high susceptibility to generating incorrect classifications. Herein, misclassification probability estimations based on uncertainty estimation by bootstrap were developed for classification models using discriminant analysis [linear discriminant analysis (LDA) and quadratic discriminant analysis (QDA)] and support vector machines (SVM). Principal component analysis (PCA) was used as a variable reduction technique prior to classification. Four spectral datasets were tested (1 simulated and 3 real applications) for binary and ternary classifications. Models with lower misclassification probabilities were more stable when the spectra were perturbed with white Gaussian noise, indicating better robustness. Thus, misclassification probability can be used as an additional figure of merit to assess model robustness, providing a reliable metric to evaluate the predictive performance of a classifier.
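The general idea of a bootstrap-based misclassification probability can be sketched as below. This is a simplified illustration under stated assumptions, not the authors' procedure: a one-dimensional nearest-centroid classifier stands in for LDA, QDA or SVM, there is no PCA step, and all function names are hypothetical. The training set is resampled with replacement, the classifier is refitted each time, and the probability is taken as the fraction of refitted models that assign a sample to the wrong class.

```python
import random

def fit_centroids(xs, labels):
    """Fit a nearest-centroid classifier: one mean per class."""
    sums, counts = {}, {}
    for x, lab in zip(xs, labels):
        sums[lab] = sums.get(lab, 0.0) + x
        counts[lab] = counts.get(lab, 0) + 1
    return {lab: sums[lab] / counts[lab] for lab in sums}

def predict(centroids, x):
    """Assign x to the class with the closest centroid."""
    return min(centroids, key=lambda lab: abs(x - centroids[lab]))

def bootstrap_misclassification(xs, labels, x_new, true_label, n_boot=500, seed=0):
    """Fraction of bootstrap-refitted models that misclassify x_new."""
    rng = random.Random(seed)
    n = len(xs)
    n_classes = len(set(labels))
    wrong = 0
    used = 0
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        boot_labels = [labels[i] for i in idx]
        if len(set(boot_labels)) < n_classes:
            continue  # skip resamples in which a class is missing
        centroids = fit_centroids([xs[i] for i in idx], boot_labels)
        used += 1
        if predict(centroids, x_new) != true_label:
            wrong += 1
    if used == 0:
        raise ValueError("no bootstrap resample contained every class")
    return wrong / used
```

A sample lying far from the class boundary yields a probability near 0, whereas a sample close to the boundary yields an intermediate value, which matches the abstract's reading of the probability as a measure of uncertainty.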
Affiliation(s)
- Camilo L M Morais
  - School of Pharmacy and Biomedical Sciences, University of Central Lancashire, Preston PR1 2HE, United Kingdom
- Kássio M G Lima
  - Biological Chemistry and Chemometrics, Institute of Chemistry, Federal University of Rio Grande do Norte, Natal, 59072-970, Brazil
- Francis L Martin
  - School of Pharmacy and Biomedical Sciences, University of Central Lancashire, Preston PR1 2HE, United Kingdom
17
Du C, Dai S, Qiao Y, Wu Z. Error propagation of partial least squares for parameters optimization in NIR modeling. Spectrochim Acta A Mol Biomol Spectrosc 2018; 192:244-250. [PMID: 29154215 DOI: 10.1016/j.saa.2017.10.069] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 06/13/2017] [Revised: 10/18/2017] [Accepted: 10/26/2017] [Indexed: 06/07/2023]
Abstract
A novel methodology is proposed to determine the error propagation of partial least squares (PLS) for parameter optimization in near-infrared (NIR) modeling. The parameters include spectral pretreatment, latent variables and variable selection. In this paper, an open-source dataset (corn) and a complicated dataset (Gardenia) were used to establish PLS models under different modeling parameters. The error propagation of the modeling parameters for water quantity in corn and geniposide quantity in Gardenia was presented in terms of both type I and type II errors. For example, when variable importance in the projection (VIP), interval partial least squares (iPLS) and backward interval partial least squares (BiPLS) variable selection algorithms were used for geniposide in Gardenia, compared with synergy interval partial least squares (SiPLS), the error weight varied from 5% to 65%, 55% and 15%. The results demonstrated how and to what extent the different modeling parameters affect the error propagation of PLS for parameter optimization in NIR modeling. The larger the error weight, the worse the model. Finally, our trials yielded an effective process for developing robust PLS models for corn and Gardenia under the optimal modeling parameters. Furthermore, this could provide significant guidance for the selection of modeling parameters of other multivariate calibration models.
Affiliation(s)
- Chenzhao Du
  - Beijing University of Chinese Medicine, 100102, China; Pharmaceutical Engineering and New Drug Development of Traditional Chinese Medicine (TCM) of Ministry of Education, 100102, China; Key Laboratory of TCM-information Engineering of State Administration of TCM, Beijing, 100102, China
- Shengyun Dai
  - Beijing University of Chinese Medicine, 100102, China; Pharmaceutical Engineering and New Drug Development of Traditional Chinese Medicine (TCM) of Ministry of Education, 100102, China; Key Laboratory of TCM-information Engineering of State Administration of TCM, Beijing, 100102, China
- Yanjiang Qiao
  - Beijing University of Chinese Medicine, 100102, China; Pharmaceutical Engineering and New Drug Development of Traditional Chinese Medicine (TCM) of Ministry of Education, 100102, China; Key Laboratory of TCM-information Engineering of State Administration of TCM, Beijing, 100102, China
- Zhisheng Wu
  - Beijing University of Chinese Medicine, 100102, China; Pharmaceutical Engineering and New Drug Development of Traditional Chinese Medicine (TCM) of Ministry of Education, 100102, China; Key Laboratory of TCM-information Engineering of State Administration of TCM, Beijing, 100102, China