1
Zou X, Wang Q, Chen Y, Wang J, Xu S, Zhu Z, Yan C, Shan P, Wang S, Fu Y. Fusion of convolutional neural network with XGBoost feature extraction for predicting multi-constituents in corn using near infrared spectroscopy. Food Chem 2025; 463:141053. [PMID: 39241414 DOI: 10.1016/j.foodchem.2024.141053] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/21/2024] [Revised: 07/31/2024] [Accepted: 08/21/2024] [Indexed: 09/09/2024]
Abstract
Near-infrared (NIR) spectroscopy has been widely used to predict multiple constituents of corn in agriculture. However, directly extracting constituent information from NIR spectra is challenging because the absorption bands are broad, overlapping, and non-specific. To address these problems and extract implicit features from raw NIR spectra to improve the performance of quantitative models, a one-dimensional shallow convolutional neural network (CNN) model based on an eXtreme Gradient Boosting (XGBoost) feature extraction method was proposed in this paper. The leaf node feature information in XGBoost was encoded and reconstructed to obtain the implicit features of the raw NIR spectra. A two-parametric Swish (TSwish or TS) activation function was proposed to improve the performance of the CNN, and the elastic net (EN) was applied to avoid overfitting of the CNN model. The performance of the developed XGBoost-CNN-TS-EN model was evaluated using two public NIR spectroscopy datasets of corn and soil. The determination coefficients (R²) obtained on the test set for moisture, oil, protein, and starch of the corn were 0.993, 0.991, 0.998, and 0.992, respectively, and that for soil organic matter was 0.992. The XGBoost-CNN-TS-EN model exhibits superior stability, good prediction accuracy, and generalization ability, demonstrating its great potential for quantitative analysis of multiple constituents in spectroscopic applications.
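For readers unfamiliar with the two quantities named in this abstract, the following is a minimal sketch of how a determination coefficient and an elastic-net penalty are commonly computed. It is not the authors' implementation: the function names, the penalty weights `l1` and `l2`, and the sample values are all assumptions made for illustration, and the abstract does not state how the penalty was weighted.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def elastic_net_penalty(weights, l1=0.01, l2=0.01):
    """Elastic-net term l1 * sum|w| + l2 * sum(w^2); l1 and l2 are assumed values."""
    return l1 * sum(abs(w) for w in weights) + l2 * sum(w * w for w in weights)


# Hypothetical reference values and predictions, for illustration only.
y_true = [10.0, 10.5, 11.0, 9.5, 10.2]
y_pred = [10.1, 10.4, 10.9, 9.6, 10.2]
print(r_squared(y_true, y_pred))
print(elastic_net_penalty([0.5, -0.2, 0.0]))
```

In a training loop the penalty would typically be added to the regression loss, so that large or numerous non-zero weights are discouraged.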
Affiliation(s)
- Xin Zou
  - College of Information Science and Engineering, Northeastern University, Shenyang, Liaoning Province 110819, China
- Qiaoyun Wang
  - College of Information Science and Engineering, Northeastern University, Shenyang, Liaoning Province 110819, China; Hebei Key Laboratory of Micro-Nano Precision Optical Sensing and Measurement Technology, Qinhuangdao 066004, China
- Yinji Chen
  - College of Information Science and Engineering, Northeastern University, Shenyang, Liaoning Province 110819, China
- Jilong Wang
  - College of Information Science and Engineering, Northeastern University, Shenyang, Liaoning Province 110819, China
- Shunyuan Xu
  - College of Information Science and Engineering, Northeastern University, Shenyang, Liaoning Province 110819, China
- Ziheng Zhu
  - College of Information Science and Engineering, Northeastern University, Shenyang, Liaoning Province 110819, China
- Chongyue Yan
  - College of Information Science and Engineering, Northeastern University, Shenyang, Liaoning Province 110819, China
- Peng Shan
  - College of Information Science and Engineering, Northeastern University, Shenyang, Liaoning Province 110819, China
- Shuyu Wang
  - College of Information Science and Engineering, Northeastern University, Shenyang, Liaoning Province 110819, China
- YongQing Fu
  - Faculty of Engineering & Environment, Northumbria University, Newcastle upon Tyne NE1 8ST, UK
2
Sharkawi MMZ, Farid NF, Hassan MH, Hassan SA. New chemometrics-assisted spectrophotometric methods for simultaneous determination of co-formulated drugs montelukast, rupatadine, and desloratadine in their different dosage combinations. BMC Chem 2024; 18:232. [PMID: 39563400 PMCID: PMC11574986 DOI: 10.1186/s13065-024-01345-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/22/2024] [Accepted: 11/05/2024] [Indexed: 11/21/2024] Open
Abstract
Two accurate, precise, and robust multivariate chemometric methods were developed for the simultaneous determination of montelukast sodium (MON), rupatadine fumarate (RUP), and desloratadine (DES). These methods provide a cost-effective alternative to chromatographic techniques by utilizing spectrophotometry in pharmaceutical quality control. The proposed approaches, partial least squares-1 (PLS-1) and artificial neural network (ANN), were optimized using a genetic algorithm (GA) to select the most influential wavelengths, enhancing model performance. A five-level, three-factor design was employed to construct a calibration set of 25 mixtures, with concentration ranges of 3-19, 5-25, and 4-20 µg mL⁻¹ for MON, RUP, and DES, respectively. An independent validation set was employed to assess the performance of the models. GA significantly improved the PLS-1 and ANN models for RUP and DES, though minimal enhancement was observed for MON. These methods were successfully applied to the simultaneous quantification of the compounds in pharmaceutical formulations and proved useful as stability-indicating assays for RUP, given that DES is a known degradation product. The developed methods offer a valuable tool for impurity profiling and quality control in pharmaceutical analysis.
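The GA-based wavelength selection mentioned in this abstract can be pictured with the toy sketch below. It is only an illustration under stated assumptions: the abstract does not give the GA settings, so the population size, mutation rate, tournament selection, one-point crossover, and elitism used here are generic choices, and `toy_fitness` is a hypothetical stand-in for what would in practice be a model-quality score (for example, a cross-validated error of a PLS-1 or ANN model built on the selected wavelengths).

```python
import random


def ga_select(n_vars, fitness, pop_size=20, generations=30, p_mut=0.05, seed=0):
    """Toy genetic algorithm over binary masks (1 = keep the wavelength)."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_vars)] for _ in range(pop_size)]
    history = []  # best fitness seen at the start of each generation
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        history.append(fitness(pop[0]))
        next_pop = [pop[0][:]]  # elitism: carry the best mask over unchanged
        while len(next_pop) < pop_size:
            # Tournament selection: the fitter of two random masks becomes a parent.
            a = max(rng.sample(pop, 2), key=fitness)
            b = max(rng.sample(pop, 2), key=fitness)
            cut = rng.randrange(1, n_vars)  # one-point crossover position
            child = a[:cut] + b[cut:]
            # Bit-flip mutation with probability p_mut per position.
            child = [1 - g if rng.random() < p_mut else g for g in child]
            next_pop.append(child)
        pop = next_pop
    return max(pop, key=fitness), history


# Hypothetical fitness: reward three "informative" positions, penalize mask size.
INFORMATIVE = {2, 3, 7}


def toy_fitness(mask):
    return sum(mask[i] for i in INFORMATIVE) - 0.2 * sum(mask)


best_mask, history = ga_select(10, toy_fitness)
print(best_mask, toy_fitness(best_mask))
```

Because the best mask is always copied into the next generation, the best fitness never decreases from one generation to the next.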
Affiliation(s)
- Marco M Z Sharkawi
  - Pharmaceutical Analytical Chemistry Department, Faculty of Pharmacy, Beni-Suef University, Alshaheed Shehata Ahmad Hegazy St, Beni-Suef, 62514, Egypt
- Nehal F Farid
  - Pharmaceutical Analytical Chemistry Department, Faculty of Pharmacy, Beni-Suef University, Alshaheed Shehata Ahmad Hegazy St, Beni-Suef, 62514, Egypt
- Moataz H Hassan
  - Pharmaceutical Analytical Chemistry Department, College of Pharmaceutical Sciences and Drug Manufacturing, Misr University for Science and Technology, 6th of October City, 12566, Giza, Egypt
- Said A Hassan
  - Pharmaceutical Analytical Chemistry Department, Faculty of Pharmacy, Cairo University, Kasr El-Aini Street, Cairo, 11562, Egypt
3
Serag A, Alnemari RM, Abduljabbar MH, Alosaimi ME, Almalki AH. Synchronous spectrofluorimetry and chemometric modeling: A synergistic approach for analyzing simeprevir and daclatasvir, with application to pharmacokinetics evaluation. Spectrochim Acta A Mol Biomol Spectrosc 2024; 315:124245. [PMID: 38581722 DOI: 10.1016/j.saa.2024.124245] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/16/2024] [Revised: 03/30/2024] [Accepted: 04/01/2024] [Indexed: 04/08/2024]
Abstract
Simeprevir and daclatasvir represent a cornerstone in the management of Hepatitis C Virus (HCV) infection, a global health concern that affects millions of people worldwide. In this study, we propose a synergistic approach combining synchronous spectrofluorimetry and chemometric modeling, namely partial least squares (PLS-1), for the analysis of simeprevir and daclatasvir in different matrices. Moreover, the study employs the firefly algorithm to further optimize the chemometric models by selecting the most informative features, thus improving the accuracy and robustness of the calibration models. The firefly algorithm was able to reduce the number of selected wavelengths to 47% and 44% for simeprevir and daclatasvir, respectively, offering a fast and sensitive technique for their determination. Validation results underscore the models' effectiveness, as evidenced by recovery rates close to 100% with relative root mean square error of prediction (RRMSEP) values of 2.253 and 2.1381 for simeprevir and daclatasvir, respectively. Moreover, the proposed models have been applied to determine the pharmacokinetics of simeprevir and daclatasvir, providing valuable insights into their distribution and elimination patterns. Overall, the study demonstrates the effectiveness of synchronous spectrofluorimetry coupled with multivariate calibration optimized by the firefly algorithm in accurately quantifying simeprevir and daclatasvir in HCV antiviral treatment, offering potential applications in pharmaceutical formulation analysis and pharmacokinetic studies for these drugs.
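The validation figures quoted in this abstract (recovery and RRMSEP) can be illustrated with the short sketch below. The abstract does not state the formulas used, so this assumes one common convention: RRMSEP is the RMSEP expressed as a percentage of the mean reference concentration, and recovery is the mean of predicted/reference ratios in percent. Function names and sample values are hypothetical.

```python
import math


def rmsep(y_true, y_pred):
    """Root mean square error of prediction."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def rrmsep(y_true, y_pred):
    """RMSEP as a percentage of the mean reference value (assumed convention)."""
    return 100.0 * rmsep(y_true, y_pred) / (sum(y_true) / len(y_true))


def mean_recovery(y_true, y_pred):
    """Mean percentage recovery: predicted / reference * 100, averaged over samples."""
    return 100.0 * sum(p / t for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical validation concentrations, for illustration only.
reference = [10.0, 10.0]
predicted = [9.0, 11.0]
print(rrmsep(reference, predicted), mean_recovery(reference, predicted))
```

The two numbers answer different questions: recovery can sit at 100% even when individual predictions scatter, while RRMSEP captures that scatter.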
Affiliation(s)
- Ahmed Serag
  - Pharmaceutical Analytical Chemistry Department, Faculty of Pharmacy, Al-Azhar University, 11751 Nasr City, Cairo, Egypt
- Reem M Alnemari
  - Department of Pharmaceutics and Pharmaceutical Technology, College of Pharmacy, Taif University, P.O. Box 11099, Taif 21944, Saudi Arabia
- Maram H Abduljabbar
  - Department of Pharmacology and Toxicology, College of Pharmacy, Taif University, P.O. Box 11099, 21944 Taif, Saudi Arabia
- Manal E Alosaimi
  - Department of Basic Sciences, College of Medicine, Princess Nourah bint Abdulrahman University, P.O. Box 84428, Riyadh 11671, Saudi Arabia
- Atiah H Almalki
  - Department of Pharmaceutical Chemistry, College of Pharmacy, Taif University, P.O. Box 11099, 21944 Taif, Saudi Arabia; Addiction and Neuroscience Research Unit, Health Science Campus, Taif University, P.O. Box 11099, 21944 Taif, Saudi Arabia
4
Almalki AH, Abduljabbar MH, Alzhrani RM, Alosaimi ME, Serag A. Determination of valsartan and pitavastatin using synchronous spectrofluorimetry and augmented least squares chemometric models: A comparative study with greenness and blueness assessment. Luminescence 2024; 39:e4803. [PMID: 38880967 DOI: 10.1002/bio.4803] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/30/2024] [Revised: 05/22/2024] [Accepted: 05/31/2024] [Indexed: 06/18/2024]
Abstract
Hypertension and hyperlipidemia are two common conditions that require effective management to reduce the risk of cardiovascular diseases. Among the medications commonly used to treat these conditions, valsartan and pitavastatin have shown significant efficacy in lowering blood pressure and cholesterol levels, respectively. In this study, synchronous spectrofluorimetry coupled with chemometric analysis tools, specifically concentration residual augmented classical least squares (CRACLS) and spectral residual augmented classical least squares (SRACLS), was employed for the simultaneous determination of valsartan and pitavastatin. The developed models exhibited excellent predictive performance, with relative root mean square error of prediction (RRMSEP) values of 2.253 and 2.1381 for valsartan and pitavastatin, respectively. These models were successfully applied to the analysis of synthetic samples, commercial formulations, and plasma samples with high accuracy and precision. In addition, the greenness and blueness profiles were evaluated to assess the environmental impact and analytical practicability. The results demonstrated excellent greenness and blueness, with an AGREE score of 0.7 and a BAGI score of 75, positioning the proposed method as a reliable and sensitive approach for the determination of valsartan and pitavastatin with potential applications in pharmaceutical quality control, bioanalytical studies, and therapeutic drug monitoring.
Affiliation(s)
- Atiah H Almalki
  - Department of Pharmaceutical Chemistry, College of Pharmacy, Taif University, Taif, Saudi Arabia
  - Addiction and Neuroscience Research Unit, Health Science Campus, Taif University, Taif, Saudi Arabia
- Maram H Abduljabbar
  - Department of Pharmacology and Toxicology, College of Pharmacy, Taif University, Taif, Saudi Arabia
- Rami M Alzhrani
  - Department of Pharmaceutics and Industrial Pharmacy, College of Pharmacy, Taif University, Taif, Saudi Arabia
- Manal E Alosaimi
  - Department of Basic Sciences, College of Medicine, Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia
- Ahmed Serag
  - Pharmaceutical Analytical Chemistry Department, Faculty of Pharmacy, Al-Azhar University, Cairo, Nasr City, Egypt
5
Tan C, Chen H, Xie F, Huang Y. Feasibility study on identifying the source of cigarette ash based on infrared spectroscopy and chemometrics. Spectrochim Acta A Mol Biomol Spectrosc 2024; 311:124042. [PMID: 38354675 DOI: 10.1016/j.saa.2024.124042] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/23/2023] [Revised: 01/18/2024] [Accepted: 02/10/2024] [Indexed: 02/16/2024]
Abstract
Crime scene investigation is a key step in collecting and identifying physical evidence that may be closely related to the crime. The size of physical evidence can range from macro to micro. Cigarettes are popular consumables, and their burned ashes are valuable sources of physical evidence since they contain important information such as brand preferences. This work explores the feasibility of using attenuated total reflection mid-infrared (ATR-MIR) spectroscopy and chemometrics to recognize cigarette brands from burned ash. A total of 600 cigarette samples from ten brands were collected for experiments, and the samples were divided into a training set and a testing set in a 2:1 ratio. The Relief-F algorithm was used to rank the variables, and a forward search was then used to obtain the optimal subset of variables. Based on this subset, a partial least-squares discriminant analysis (PLS-DA) model was established, achieving a total accuracy of 97% on the test set. As a reference, the maximum correlation coefficient method was also used for classification, with an accuracy of only 73%. The results suggest that the variable selection and modeling scheme proposed in this article is feasible for identifying cigarette brands from burned ash.
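The "rank, then forward search" scheme described in this abstract can be sketched as follows. This is a generic greedy forward selection, not the authors' exact procedure, which the abstract does not detail. The ranking is assumed to be given (for example by Relief-F weights), and `toy_score` with its variable names is a hypothetical stand-in for a validation accuracy of a classifier such as PLS-DA.

```python
def forward_search(ranked_vars, score):
    """Greedy forward selection over variables already sorted by relevance.

    A variable is kept only if adding it strictly improves the score.
    """
    selected, best = [], float("-inf")
    for var in ranked_vars:
        trial = score(selected + [var])
        if trial > best:
            selected.append(var)
            best = trial
    return selected, best


# Hypothetical contribution of each variable; unlisted variables add nothing.
USEFUL = {"v3": 0.5, "v1": 0.3, "v7": 0.17}


def toy_score(subset):
    # Small per-variable cost so that uninformative variables are rejected.
    return sum(USEFUL.get(v, 0.0) for v in subset) - 0.01 * len(subset)


ranked = ["v3", "v1", "v9", "v7", "v2"]  # assumed ranking, most relevant first
subset, final_score = forward_search(ranked, toy_score)
print(subset)
```

With the 2:1 split stated in the abstract, 600 samples correspond to 400 training and 200 test samples; in practice the score would be estimated on the training portion only.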
Affiliation(s)
- Chao Tan
  - Key Lab of Process Analysis and Control of Sichuan Universities, Yibin University, Yibin, Sichuan 644000, China; College of Materials and Chemical Engineering, Yibin University, Yibin, Sichuan 644000, China
- Hui Chen
  - Key Lab of Process Analysis and Control of Sichuan Universities, Yibin University, Yibin, Sichuan 644000, China; Hospital, Yibin University, Yibin, Sichuan 644000, China
- Fan Xie
  - Key Lab of Process Analysis and Control of Sichuan Universities, Yibin University, Yibin, Sichuan 644000, China; College of Materials and Chemical Engineering, Yibin University, Yibin, Sichuan 644000, China
- Yushuang Huang
  - Key Lab of Process Analysis and Control of Sichuan Universities, Yibin University, Yibin, Sichuan 644000, China; College of Materials and Chemical Engineering, Yibin University, Yibin, Sichuan 644000, China
6
Chen N, Chen S, Zhang Q, Wang SR, Tang LJ, Jiang JH, Yu RQ, Zhou YP. Robust classification and biomarker discovery of inherited metabolic diseases using GC-MS urinary metabolomics analysis combined with chemometrics. Microchem J 2023. [DOI: 10.1016/j.microc.2023.108600] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/08/2023]
7
Fouad MA, Serag A, Tolba EH, El-Shal MA, El Kerdawy AM. QSRR modeling of the chromatographic retention behavior of some quinolone and sulfonamide antibacterial agents using firefly algorithm coupled to support vector machine. BMC Chem 2022; 16:85. [PMID: 36329493 PMCID: PMC9635186 DOI: 10.1186/s13065-022-00874-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/03/2022] [Accepted: 10/04/2022] [Indexed: 11/06/2022] Open
Abstract
Quinolones and sulfonamides are two classes of antibacterial agents with a rich history of medicinal chemistry features that contribute to their bacterial spectrum, efficacy, pharmacokinetics, and adverse effect profiles. The urgent need for their use, combined with the escalating rate of resistance to them, necessitates the development of suitable analytical methods that accelerate and facilitate their analysis. In this study, the advanced firefly algorithm (FFA) coupled with support vector regression (SVR) was used to select the most significant descriptors and to construct two quantitative structure-retention relationship (QSRR) models using a series of 11 selected quinolone and 13 sulfonamide drugs, respectively, to predict their retention behavior in HPLC. Specifically, the effects of the pH value and the acetonitrile composition of the mobile phase on the retention behavior of quinolones and sulfonamides, respectively, were studied. The obtained QSRR models performed well in both internal and external validations, demonstrating their robustness and predictive ability. Y-randomization validation demonstrated that the obtained models did not arise by statistical chance. Moreover, the obtained results shed light on the molecular features that influence the retention behavior of these two classes under the current chromatographic conditions.
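The Y-randomization check mentioned in this abstract can be illustrated with the minimal sketch below. It is not the authors' FFA-SVR workflow: to stay self-contained it assumes a single descriptor and an ordinary least-squares line in place of SVR, and the data, function names, and number of shuffling rounds are hypothetical. The idea is the same, though: if models refitted on shuffled responses score far worse than the real model, the real fit is unlikely to be a chance correlation.

```python
import random


def fit_r2(x, y):
    """R^2 of a one-descriptor least-squares line (squared Pearson correlation)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return (sxy * sxy) / (sxx * syy)


def y_randomization(x, y, n_rounds=100, seed=0):
    """Return the real R^2 and the R^2 values obtained after shuffling y."""
    rng = random.Random(seed)
    shuffled_scores = []
    for _ in range(n_rounds):
        y_shuffled = y[:]
        rng.shuffle(y_shuffled)  # break the descriptor-response pairing
        shuffled_scores.append(fit_r2(x, y_shuffled))
    return fit_r2(x, y), shuffled_scores


# Hypothetical descriptor values and retention responses with small alternating noise.
x = [float(i) for i in range(20)]
y = [2.0 * xi + 1.0 + (0.5 if i % 2 == 0 else -0.5) for i, xi in enumerate(x)]
r2_real, r2_shuffled = y_randomization(x, y)
print(r2_real, sum(r2_shuffled) / len(r2_shuffled))
```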
Affiliation(s)
- Marwa A. Fouad
  - Pharmaceutical Chemistry Department, Faculty of Pharmacy, Cairo University, Kasr El-Aini St, P.O. Box 11562, Cairo, Egypt
  - Department of Pharmaceutical Chemistry, School of Pharmacy, Newgiza University (NGU), Newgiza, km 22 Cairo–Alexandria Desert Road, Cairo, Egypt
- Ahmed Serag
  - Pharmaceutical Analytical Chemistry Department, Faculty of Pharmacy, Al-Azhar University, 11751 Cairo, Egypt
- Enas H. Tolba
  - Egyptian Drug Authority (Former National Organization for Drug Control and Research), Cairo, Egypt
- Manal A. El-Shal
  - Egyptian Drug Authority (Former National Organization for Drug Control and Research), Cairo, Egypt
- Ahmed M. El Kerdawy
  - Pharmaceutical Chemistry Department, Faculty of Pharmacy, Cairo University, Kasr El-Aini St, P.O. Box 11562, Cairo, Egypt
8
Liu S, Wang S, Hu C, Zhan S, Kong D, Wang J. Rapid and accurate determination of diesel multiple properties through NIR data analysis assisted by machine learning. Spectrochim Acta A Mol Biomol Spectrosc 2022; 277:121261. [PMID: 35490664 DOI: 10.1016/j.saa.2022.121261] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/05/2021] [Revised: 04/05/2022] [Accepted: 04/10/2022] [Indexed: 06/14/2023]
Abstract
The rapid and accurate determination of multiple diesel properties is an important research topic in the petrochemical industry that is conducive to diesel quality assessment and environmental pollution mitigation. To that end, this paper developed a new machine learning model for near infrared (NIR) spectroscopy capable of simultaneously determining diesel density, viscosity, freezing point, boiling point, cetane number, and total aromatics. The model combined improved XY co-occurrence distance (ISPXY) and differential evolution-gray wolf optimization support vector machine (DEGWO-SVM) to achieve both speed and accuracy. Experimental results indicated that the average recovery, mean square error, mean absolute percentage error, and determination coefficient of the presented method outperformed those of existing machine learning methods. The proposed hybrid model provides a superior solution to the problem of low efficiency and high cost in diesel quality detection, and has the potential to be used as a tool for routine diesel monitoring.
Affiliation(s)
- Shiyu Liu
  - Measurement Technology and Instrumentation Key Lab of Hebei Province, School of Electrical Engineering, Yanshan University, Qinhuangdao, Hebei 066004, China
- Shutao Wang
  - Measurement Technology and Instrumentation Key Lab of Hebei Province, School of Electrical Engineering, Yanshan University, Qinhuangdao, Hebei 066004, China
- Chunhai Hu
  - Measurement Technology and Instrumentation Key Lab of Hebei Province, School of Electrical Engineering, Yanshan University, Qinhuangdao, Hebei 066004, China
- Shujie Zhan
  - Measurement Technology and Instrumentation Key Lab of Hebei Province, School of Electrical Engineering, Yanshan University, Qinhuangdao, Hebei 066004, China
- Deming Kong
  - Measurement Technology and Instrumentation Key Lab of Hebei Province, School of Electrical Engineering, Yanshan University, Qinhuangdao, Hebei 066004, China
- Junzhu Wang
  - Flow Measurement Technology Key Lab of Zhejiang Province, College of Metrology & Measurement Engineering, China Jiliang University, Hangzhou, Zhejiang 310018, China
9
Serag A, Hasan MA, Tolba EH, Abdelzaher AM, Elmaaty AA. Analysis of the ternary antiretroviral therapy dolutegravir, lamivudine and abacavir using UV spectrophotometry and chemometric tools. Spectrochim Acta A Mol Biomol Spectrosc 2022; 264:120334. [PMID: 34481252 DOI: 10.1016/j.saa.2021.120334] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 06/28/2021] [Revised: 08/22/2021] [Accepted: 08/24/2021] [Indexed: 06/13/2023]
Abstract
Herein, a simple spectrophotometric method coupled with chemometric techniques, i.e. partial least squares (PLS) and genetic algorithm (GA), was utilized for the simultaneous determination of the vital ternary antiretroviral therapy dolutegravir (DTG), lamivudine (LMV), and abacavir (ACV) in their combined dosage form. Calibration (25 samples) and validation (13 samples) sets were prepared for these drugs at different concentrations by implementing partial factorial experimental designs. The zero-order UV spectra of the calibration and validation sets were measured and then subjected to further chemometric analysis. Partial least squares with and without a variable selection procedure, i.e. the genetic algorithm, was utilized to untangle the UV spectral overlap of these mixtures. Cross-validation and external validation methods were applied to compare the performance of these chemometric techniques in terms of accuracy and predictive ability. It was found that six latent variables were optimal for modelling DTG, four for modelling LMV, and three for modelling ACV. Although good recoveries with prompt predictive ability were attained by the PLS models, GA-PLS showed better analytical performance owing to its capability to remove redundant variables, i.e. the number of absorbance variables was reduced to about 21-29%. The proposed chemometric methods can be reliably applied for the simultaneous determination of DTG, LMV, and ACV in their laboratory-prepared mixtures and pharmaceutical preparation, positioning these chemometric methods as worthy and substantial analytical tools for in-process testing and quality control analysis of many antiretroviral pharmaceutical preparations.
Affiliation(s)
- Ahmed Serag
  - Pharmaceutical Analytical Chemistry Department, Faculty of Pharmacy, Al-Azhar University, Cairo 11751, Egypt
- Mohamed A Hasan
  - Pharmaceutical Analytical Chemistry Department, Faculty of Pharmacy, Al-Azhar University, Cairo 11751, Egypt
- Enas H Tolba
  - National Organization for Drug Control and Research (NODCAR), Giza, P.O. Box 35521, Egypt
- Ahmed M Abdelzaher
  - Pharmaceutical Analytical Chemistry Department, Faculty of Pharmacy, Al-Azhar University, Cairo 11751, Egypt
- Ayman Abo Elmaaty
  - Department of Medicinal Chemistry, Faculty of Pharmacy, Port Said University, Port Said 42526, Egypt
10
Yang Y, Wang X, Zhao X, Huang M, Zhu Q. M3GPSpectra: A novel approach integrating variable selection/construction and MLR modeling for quantitative spectral analysis. Anal Chim Acta 2021; 1160:338453. [PMID: 33894955 DOI: 10.1016/j.aca.2021.338453] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 02/02/2021] [Revised: 03/20/2021] [Accepted: 03/23/2021] [Indexed: 11/24/2022]
Abstract
Quantitative analysis of the physical or chemical properties of various materials by using spectral analysis technology combined with chemometrics has become an important method in the field of analytical chemistry. This method aims to build a model relationship (called a prediction model) between feature variables acquired by spectral sensors and the components to be measured. Because the original spectral feature variables contain redundant information and substantial noise, feature selection or transformation should be conducted to reduce the interference of irrelevant information on the prediction model. Most existing feature selection and transformation methods are single linear or nonlinear operations, which easily lead to the loss of feature information and affect the accuracy of subsequent prediction models. This research proposes a novel strategy for constructing quantitative analysis models for spectroscopic technology, named M3GPSpectra. This tool uses a genetic programming algorithm to select and reconstruct the original feature variables, evaluates the performance of the selected and reconstructed variables by using a multivariate regression model (MLR), and obtains the best feature combination and the final parameters of the MLR through iterative learning. M3GPSpectra integrates feature selection, linear/nonlinear feature transformation, and subsequent model construction into a unified framework and thus readily realizes end-to-end parameter learning to significantly improve the accuracy of the prediction model. When applied to six types of datasets, M3GPSpectra obtains 19 prediction models, which are compared with those obtained by seven popular linear or nonlinear methods. Experimental results show that M3GPSpectra obtains the best performance among the eight methods tested. Further investigation verifies that the proposed method is not sensitive to the size of the training set. Hence, M3GPSpectra is a promising tool for quantitative spectral analysis.
Affiliation(s)
- Yu Yang
  - Key Laboratory of Advanced Process Control for Light Industry (Ministry of Education), Jiangnan University, Wuxi, China
- Xin Wang
  - Key Laboratory of Advanced Process Control for Light Industry (Ministry of Education), Jiangnan University, Wuxi, China
- Xin Zhao
  - Key Laboratory of Advanced Process Control for Light Industry (Ministry of Education), Jiangnan University, Wuxi, China
- Min Huang
  - Key Laboratory of Advanced Process Control for Light Industry (Ministry of Education), Jiangnan University, Wuxi, China
- Qibing Zhu
  - Key Laboratory of Advanced Process Control for Light Industry (Ministry of Education), Jiangnan University, Wuxi, China