1
Geng Y, Ni H, Shen H, Wang H, Wu J, Pan K, Wu Y, Chen Y, Luo Y, Xu T, Liu X. Feasibility of an NIR spectral calibration transfer algorithm based on optimized feature variables to predict tobacco samples in different states. ANALYTICAL METHODS: ADVANCING METHODS AND APPLICATIONS 2023; 15:719-728. [PMID: 36722963 DOI: 10.1039/d2ay01805e] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 06/18/2023]
Abstract
The prediction accuracy of calibration models for near-infrared (NIR) spectroscopy typically relies on the morphology and homogeneity of the samples. To enable non-destructive and rapid analysis of non-homogeneous tobacco samples, a method is proposed here that predicts tobacco filament samples using reliable models built on the corresponding tobacco powder. First, because a simple and robust calibration model with excellent performance is needed, the key feature variables were screened, starting from the full-wavelength partial least squares regression model (Full-PLSR), by three methods: competitive adaptive reweighted sampling (CARS), variable combination population analysis-iteratively retaining informative variables (VCPA-IRIV), and variable combination population analysis-genetic algorithm (VCPA-GA). The partial least squares regression (PLSR) models for predicting the total sugar content in tobacco were established on the three optimal wavelength sets and named CARS-PLSR, VCPA-IRIV-PLSR and VCPA-GA-PLSR, respectively. Subsequently, they were combined with different calibration transfer algorithms, including calibration transfer based on canonical correlation analysis (CTCCA), slope/bias correction (S/B) and the non-supervised parameter-free framework for calibration enhancement (NS-PFCE), to identify the best prediction model for the tobacco filament samples. Compared with the other two transfer algorithms, NS-PFCE performed best under the various wavelength conditions. The prediction results indicated that the most successful approach for predicting the tobacco filament samples was VCPA-IRIV-PLSR coupled with the NS-PFCE method, which obtained the highest determination coefficient (Rp² = 0.9340) and the lowest root mean square error of the prediction set (RMSEP = 0.8425). With only 31 variables, VCPA-IRIV simplifies the calibration model and improves the efficiency of model transfer. Furthermore, it ensures the prediction accuracy of the tobacco filament samples when combined with NS-PFCE. In summary, calibration transfer based on optimized feature variables can eliminate prediction errors caused by sample morphological differences and proves to be a more beneficial method for online application in the tobacco industry.
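Of the transfer algorithms named in the abstract, slope/bias correction is the simplest to illustrate. The following is a minimal sketch of the general idea only, not the authors' implementation: predictions made by a model calibrated on one sample state (for example, powder) are regressed against reference values measured on the other state (for example, filament), and the fitted line is then used to correct new predictions. The function names and the numbers are hypothetical and purely illustrative. The helpers for RMSEP and the determination coefficient follow their usual definitions.

```python
import numpy as np

def fit_slope_bias(pred_secondary, reference):
    """Fit reference ~ slope * prediction + bias by ordinary least squares."""
    slope, bias = np.polyfit(np.asarray(pred_secondary, dtype=float),
                             np.asarray(reference, dtype=float), deg=1)
    return float(slope), float(bias)

def apply_slope_bias(pred, slope, bias):
    """Correct raw predictions with the fitted slope and bias."""
    return slope * np.asarray(pred, dtype=float) + bias

def rmsep(y_true, y_pred):
    """Root mean square error of prediction."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def r2(y_true, y_pred):
    """Determination coefficient: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Hypothetical transfer set: reference values and systematically
# distorted predictions from a model calibrated on the other state.
reference = np.array([10.0, 12.0, 15.0, 18.0, 20.0])
raw_pred = 0.8 * reference + 1.5  # made-up linear distortion

slope, bias = fit_slope_bias(raw_pred, reference)
corrected = apply_slope_bias(raw_pred, slope, bias)
```

Slope/bias correction only removes a linear, systematic offset between the two states, which is consistent with the abstract's finding that a more flexible method (NS-PFCE) performed better.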
Affiliation(s)
- Yingrui Geng
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Hongfei Ni
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Hangzhou 310018, China
- Huanchao Shen
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Hangzhou 310018, China
- Hui Wang
  - Technology Center, China Tobacco Zhejiang Industrial Co., Ltd, Hangzhou 310008, China
- Jizhong Wu
  - Technology Center, China Tobacco Zhejiang Industrial Co., Ltd, Hangzhou 310008, China
- Keyu Pan
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Yongjiang Wu
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Yong Chen
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Yingjie Luo
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Tengfei Xu
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Xuesong Liu
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
2
Liu S, Yang C, Liu L. Identifying spatial relations of industrial carbon emissions among provinces of China: evidence from unsupervised clustering algorithms. ENVIRONMENTAL SCIENCE AND POLLUTION RESEARCH INTERNATIONAL 2022; 29:77958-77972. [PMID: 35687286 DOI: 10.1007/s11356-022-20784-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/05/2022] [Accepted: 05/09/2022] [Indexed: 06/15/2023]
Abstract
Reducing the total carbon emissions of modern industry is of great significance for China to achieve its carbon peak mission. This paper proposes the MD-SNA spatial correlation measure methodology, which is based on a similarity-measure clustering algorithm. The social network analysis (SNA) method was also incorporated to explore the spatial relationship of provincial industrial carbon emissions. The GINI coefficient, Theil index (GE0), and mean of logarithmic deviation (GE1) were used to measure the regional differences of China's industrial carbon emissions. More specifically, we combined the spatial difference and spatial correlation frameworks. The primary objective of the proposed methodology is to empirically investigate the structural characteristics and spatial relations of different provinces. The results of the case study are as follows. First, the regional industrial carbon emission intensity was unbalanced, with energy-rich provinces and developed eastern provinces showing relatively high intensity. Second, Beijing, Shandong, Shaanxi, Henan, Sichuan, and Xinjiang were located at the center of the spatial network of industrial carbon emissions. Third, our work clarified the node attributes and different functions of provinces. More than half of the core provinces belonged to the primary beneficial block, which was in the central position of the spatial correlation network. The conclusions can help policymakers clarify the overall spatial pattern of the industrial sector and the provinces' roles and functions.
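The three inequality measures mentioned in the abstract have standard textbook definitions, sketched below. This is an illustration only: the function names are hypothetical, the emission figures are made up, and the paper's exact estimators (for example, any population weighting) may differ. All three measures are zero when every region has the same value and grow as the distribution becomes more unequal.

```python
import numpy as np

def gini(values):
    """Gini coefficient from the sorted-rank formula (values must be non-negative)."""
    x = np.sort(np.asarray(values, dtype=float))
    n = x.size
    ranks = np.arange(1, n + 1)
    return float(2.0 * np.sum(ranks * x) / (n * np.sum(x)) - (n + 1.0) / n)

def theil_index(values):
    """Theil T index: mean of (x/mean) * ln(x/mean); values must be positive."""
    x = np.asarray(values, dtype=float)
    ratio = x / x.mean()
    return float(np.mean(ratio * np.log(ratio)))

def mean_log_deviation(values):
    """Mean logarithmic deviation: mean of ln(mean/x); values must be positive."""
    x = np.asarray(values, dtype=float)
    return float(np.mean(np.log(x.mean() / x)))

# Made-up emission figures for four hypothetical regions.
emissions = [1.0, 2.0, 3.0, 4.0]
print(gini(emissions), theil_index(emissions), mean_log_deviation(emissions))
```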
Affiliation(s)
- Shuning Liu
  - School of Public Economics and Administration, Shanghai University of Finance and Economics, Shanghai, 200433, People's Republic of China
- Chaojun Yang
  - Faculty of Management and Economics, Kunming University of Science and Technology, Kunming, Yunnan, 650093, People's Republic of China
- Liju Liu
  - Faculty of Management and Economics, Kunming University of Science and Technology, Kunming, Yunnan, 650093, People's Republic of China
3
Mailagaha Kumbure M, Luukka P. A generalized fuzzy k-nearest neighbor regression model based on Minkowski distance. GRANULAR COMPUTING 2021. [DOI: 10.1007/s41066-021-00288-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/30/2022]
Abstract
The fuzzy k-nearest neighbor (FKNN) algorithm, one of the most well-known and effective supervised learning techniques, has often been used in data classification problems but rarely in regression settings. This paper introduces a new, more general fuzzy k-nearest neighbor regression model. The generalization is based on using the Minkowski distance instead of the usual Euclidean distance. The Euclidean distance is often not the optimal choice for practical problems, and better results can be obtained by generalizing it. Using the Minkowski distance allows the proposed method to obtain more reasonable nearest neighbors to the target sample. Another key advantage of this method is that the nearest neighbors are weighted by fuzzy weights based on their similarity to the target sample, leading to a more accurate prediction through a weighted average. The performance of the proposed method is tested with eight real-world datasets from different fields and benchmarked against the k-nearest neighbor and three other state-of-the-art regression methods. The Manhattan distance- and Euclidean distance-based FKNNreg methods are also implemented, and the results are compared. The empirical results show that the proposed Minkowski distance-based fuzzy regression (Md-FKNNreg) method outperforms the benchmarks and can be a good algorithm for regression problems. In particular, the Md-FKNNreg model gave a significantly lower overall average root mean square error (0.0769) than all other regression methods used. As a special case of the Minkowski distance, the Manhattan distance yielded the optimal conditions for Md-FKNNreg and achieved the best performance for most of the datasets.
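The combination described in the abstract, Minkowski-distance neighbor search followed by a fuzzy-weighted average, can be sketched as below. This is an illustration under stated assumptions, not the Md-FKNNreg method as published: the weight formula 1 / d^(2/(m-1)) is borrowed from the classical fuzzy k-nearest neighbor membership rule, and the function names, the fuzzifier `m`, the `eps` guard and the toy data are all hypothetical. Setting `p=1` gives the Manhattan distance and `p=2` the Euclidean distance.

```python
import numpy as np

def minkowski_distance(a, b, p):
    """Minkowski distance of order p along the last axis."""
    diff = np.abs(np.asarray(a, dtype=float) - np.asarray(b, dtype=float))
    return np.sum(diff ** p, axis=-1) ** (1.0 / p)

def fuzzy_knn_regress(X_train, y_train, x_query, k=3, p=1.0, m=2.0, eps=1e-12):
    """Predict one target as a fuzzy-weighted average of the k nearest neighbors.

    Assumed weighting: w_i = 1 / (d_i ** (2 / (m - 1)) + eps), with m > 1.
    """
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    distances = minkowski_distance(X_train, x_query, p)
    nearest = np.argsort(distances)[:k]          # indices of the k closest samples
    weights = 1.0 / (distances[nearest] ** (2.0 / (m - 1.0)) + eps)
    return float(np.sum(weights * y_train[nearest]) / np.sum(weights))

# Toy data, made up for illustration only.
X = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
y = [1.0, 2.0, 3.0, 10.0]
prediction = fuzzy_knn_regress(X, y, [0.2, 0.1], k=3, p=1.0)
```

Because closer neighbors receive larger weights, a query that coincides with a training sample is predicted almost exactly as that sample's target, while a query equidistant from its neighbors gets their plain average.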
4
Borges-Miranda A, Silva-Mata FJ, Talavera-Bustamante I, Jiménez-Chacón J, Álvarez-Prieto M, Pérez-Martínez CS. The role of chemosensory relationships to improve raw materials’ selection for Premium cigar manufacture. CHEMICAL PAPERS 2021. [DOI: 10.1007/s11696-021-01577-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2022]