1
Xin X, Tian X, Chen C, Chen C, Li K, Ma X, Zhao L, Lv X. A method for accurate identification of Uyghur medicinal components based on Raman spectroscopy and multi-label deep learning. Spectrochimica Acta. Part A, Molecular and Biomolecular Spectroscopy 2024; 315:124251. [PMID: 38626675 DOI: 10.1016/j.saa.2024.124251] [Received: 12/14/2023] [Revised: 03/10/2024] [Accepted: 04/03/2024] [Indexed: 04/18/2024]
Abstract
Uyghur medicine is one of the four major ethnic medicines in China and is a component of traditional Chinese medicine. The intrinsic quality of Uyghur medicinal materials directly affects the clinical efficacy of Uyghur medicinal preparations. However, problems such as adulteration and different materials sharing the same name persist, so quality control of Uyghur medicines must be strengthened to guarantee their efficacy. Identifying the components of Uyghur medicines clarifies which medicinal materials are used; it is a crucial step in quality control and an important step in screening for effective components. Current identification methods rely on manual detection, which suffers from highly toxic developing agents, poor stability, high cost, and low efficiency. This paper therefore proposes Mix2Com, a model based on Raman spectroscopy and multi-label deep learning, for accurate identification of Uyghur medicine components. The experiments use computer-simulated mixtures as the dataset, encode the Raman spectral data with a Long Short-Term Memory (LSTM) network and an attention mechanism, decode with multiple parallel networks, and ultimately realize macro parallel prediction of the medicine components. The trained model achieves 90.76% accuracy, 99.41% precision, 95.42% recall, and a 97.37% F1 score. Compared with the traditional XGBoost model, the proposed method improves accuracy by 49% and recall by 18%; compared with the DeepRaman model, it improves accuracy by 9% and recall by 14%. The method provides a new solution for the accurate identification of Uyghur medicinal components. It helps to improve the quality standard of Uyghur medicinal materials, advance research on screening the effective chemical components of Uyghur medicines and their mechanisms of action, and thereby promote the modernization and development of Uyghur medicine.
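The gap between the reported accuracy (90.76%) and precision (99.41%) is typical of multi-label evaluation, where one missed component makes a whole mixture count as wrong. The sketch below illustrates this under two assumptions that the abstract does not confirm: accuracy is taken as exact-match (every label of a mixture must be right), and precision, recall, and F1 are micro-averaged over all labels. The function name and the label matrices are hypothetical.

```python
def multilabel_metrics(y_true, y_pred):
    """Exact-match accuracy plus micro-averaged precision, recall and F1.

    y_true and y_pred are lists of equal-length 0/1 lists: one row per
    mixture, one column per candidate component. (Assumed conventions,
    not necessarily those used for Mix2Com.)
    """
    tp = fp = fn = exact = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            exact += 1  # every component of this mixture was predicted correctly
        for t, p in zip(truth, pred):
            if t == 1 and p == 1:
                tp += 1
            elif t == 0 and p == 1:
                fp += 1
            elif t == 1 and p == 0:
                fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": exact / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical labels: three mixtures, four candidate components.
y_true = [[1, 0, 1, 0], [0, 1, 1, 0], [1, 1, 0, 0]]
y_pred = [[1, 0, 1, 0], [0, 1, 0, 0], [1, 1, 0, 0]]
metrics = multilabel_metrics(y_true, y_pred)
```

In this toy case a single missed component leaves precision untouched but removes one of three mixtures from the exact-match count, so accuracy falls well below precision.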
Affiliation(s)
- Xiaotong Xin
  - College of Software, Xinjiang University, Urumqi 830046, China.
- Xuecong Tian
  - College of Information Science and Engineering, Xinjiang University, Urumqi 830046, China.
- Cheng Chen
  - College of Software, Xinjiang University, Urumqi 830046, China.
- Chen Chen
  - College of Information Science and Engineering, Xinjiang University, Urumqi 830046, China; Key Laboratory of Signal Detection and Processing, Xinjiang University, Urumqi 830046, China; Xinjiang Cloud Computing Application Laboratory, Karamay 834099, China.
- Keao Li
  - Xinjiang Qikang Habowei Medicine Co., Ltd., Urumqi 830010, China.
- Xuan Ma
  - Xinjiang Qimu Institute of Medicine, Urumqi 830010, China.
- Lu Zhao
  - Xinjiang Qimu Institute of Medicine, Urumqi 830010, China.
- Xiaoyi Lv
  - College of Software, Xinjiang University, Urumqi 830046, China; Key Laboratory of Signal Detection and Processing, Xinjiang University, Urumqi 830046, China.
2
Mou JY, Usman M, Tang JW, Yuan Q, Ma ZW, Wen XR, Liu Z, Wang L. Pseudo-Siamese network combined with label-free Raman spectroscopy for the quantification of mixed trace amounts of antibiotics in human milk: A feasibility study. Food Chem X 2024; 22:101507. [PMID: 38855098 PMCID: PMC11157215 DOI: 10.1016/j.fochx.2024.101507] [Received: 03/12/2024] [Revised: 05/16/2024] [Accepted: 05/22/2024] [Indexed: 06/11/2024]
Abstract
Antibiotic use is prevalent among lactating mothers, so the rapid determination of trace amounts of antibiotics in human milk is crucial for ensuring the healthy development of infants. In this study, we constructed a human milk system containing residual doxycycline (DXC) and/or tetracycline (TC). Machine learning models and clustering algorithms were applied to classify and predict very low concentrations of single and mixed antibiotics from label-free surface-enhanced Raman scattering (SERS) spectra. The experimental results demonstrate that the convolutional neural network (CNN) model reaches a recognition accuracy of 98.85% with optimal hyperparameter combinations. Furthermore, we employed Independent Component Analysis (ICA) and the pseudo-Siamese Convolutional Neural Network (pSCNN) to quantify the ratios of individual antibiotics in mixed human milk samples. Integrating the SERS technique with machine learning algorithms shows significant potential for rapid discrimination and precise quantification of single and mixed antibiotics at very low concentrations in human milk.
Affiliation(s)
- Jing-Yi Mou
  - Laboratory Medicine, Guangdong Provincial People's Hospital (Guangdong Academy of Medical Sciences), Southern Medical University, Guangzhou, Guangdong Province, China
  - Department of Clinical Medicine, School of the 1st Clinical Medicine, Xuzhou Medical University, Xuzhou, Jiangsu Province, China
- Muhammad Usman
  - School of Medical Informatics and Engineering, Xuzhou Medical University, Xuzhou, Jiangsu Province, China
- Jia-Wei Tang
  - Laboratory Medicine, Guangdong Provincial People's Hospital (Guangdong Academy of Medical Sciences), Southern Medical University, Guangzhou, Guangdong Province, China
- Quan Yuan
  - Laboratory Medicine, Guangdong Provincial People's Hospital (Guangdong Academy of Medical Sciences), Southern Medical University, Guangzhou, Guangdong Province, China
  - School of Medical Informatics and Engineering, Xuzhou Medical University, Xuzhou, Jiangsu Province, China
- Zhang-Wen Ma
  - Jiangsu Key Laboratory of New Drug Research and Clinical Pharmacy, School of Pharmacy, Xuzhou Medical University, Xuzhou, Jiangsu Province, China
  - Department of Pharmaceutical Analysis, School of Pharmacy, Xuzhou Medical University, Xuzhou, Jiangsu Province, China
- Xin-Ru Wen
  - School of Medical Informatics and Engineering, Xuzhou Medical University, Xuzhou, Jiangsu Province, China
- Zhao Liu
  - Department of Clinical Medicine, School of the 1st Clinical Medicine, Xuzhou Medical University, Xuzhou, Jiangsu Province, China
  - Department of Thyroid and Breast Surgery, Affiliated Hospital of Xuzhou Medical University, Xuzhou, Jiangsu Province, China
- Liang Wang
  - Laboratory Medicine, Guangdong Provincial People's Hospital (Guangdong Academy of Medical Sciences), Southern Medical University, Guangzhou, Guangdong Province, China
  - Division of Microbiology and Immunology, School of Biomedical Sciences, The University of Western Australia, Crawley, Western Australia, Australia
  - School of Agriculture and Food Sustainability, University of Queensland, Brisbane, Queensland, Australia
  - Center for Precision Health, School of Medical and Health Sciences, Edith Cowan University, Perth, Western Australia, Australia
3
Lu XY, Wu HP, Ma H, Li H, Li J, Liu YT, Pan ZY, Xie Y, Wang L, Ren B, Liu GK. Deep Learning-Assisted Spectrum-Structure Correlation: State-of-the-Art and Perspectives. Anal Chem 2024; 96:7959-7975. [PMID: 38662943 DOI: 10.1021/acs.analchem.4c01639] [Indexed: 05/22/2024]
Abstract
Spectrum-structure correlation plays an increasingly crucial role in spectral analysis and has developed significantly in recent decades. With the advancement of spectrometers, high-throughput detection has triggered explosive growth of spectral data, and the extension of research from small molecules to biomolecules brings a massive chemical space. Facing this evolving landscape, conventional chemometrics is ill-equipped, and deep-learning-assisted chemometrics is rapidly emerging as a flourishing approach with a superior ability to extract latent features and make precise predictions. In this review, molecular and spectral representations and the fundamentals of deep learning are first introduced. We then summarize how deep learning has helped establish the correlation between spectrum and molecular structure over the past 5 years, by empowering spectral prediction (i.e., forward structure-spectrum correlation) and further enabling library matching and de novo molecular generation (i.e., inverse spectrum-structure correlation). Finally, we highlight the most important open issues along with potential solutions. With the fast development of deep learning, an ultimate solution for establishing spectrum-structure correlation is expected soon, which would trigger substantial development in various disciplines.
Affiliation(s)
- Xin-Yu Lu
  - State Key Laboratory of Physical Chemistry of Solid Surfaces, Collaborative Innovation Center of Chemistry for Energy Materials (iChEM), College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, P. R. China
  - Tan Kah Kee Innovation Laboratory, Xiamen 361005, P. R. China
- Hao-Ping Wu
  - State Key Laboratory of Marine Environmental Science, Fujian Provincial Key Laboratory for Coastal Ecology and Environmental Studies, Center for Marine Environmental Chemistry & Toxicology, College of the Environment and Ecology, Xiamen University, Xiamen, Fujian 361102, P. R. China
- Hao Ma
  - State Key Laboratory of Physical Chemistry of Solid Surfaces, Collaborative Innovation Center of Chemistry for Energy Materials (iChEM), College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, P. R. China
  - Tan Kah Kee Innovation Laboratory, Xiamen 361005, P. R. China
- Hui Li
  - Key Laboratory of Multimedia Trusted Perception and Efficient Computing, Ministry of Education of China, Xiamen University, Xiamen 361005, P. R. China
- Jia Li
  - Institute of Artificial Intelligence, Xiamen University, Xiamen 361005, P. R. China
- Yan-Ti Liu
  - State Key Laboratory of Physical Chemistry of Solid Surfaces, Collaborative Innovation Center of Chemistry for Energy Materials (iChEM), College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, P. R. China
  - Tan Kah Kee Innovation Laboratory, Xiamen 361005, P. R. China
- Zheng-Yan Pan
  - State Key Laboratory of Physical Chemistry of Solid Surfaces, Collaborative Innovation Center of Chemistry for Energy Materials (iChEM), College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, P. R. China
- Yi Xie
  - School of Informatics, Xiamen University, Xiamen 361005, P. R. China
- Lei Wang
  - Pen-Tung Sah Institute of Micro-Nano Science and Technology, Xiamen University, Xiamen 361005, P. R. China
- Bin Ren
  - State Key Laboratory of Physical Chemistry of Solid Surfaces, Collaborative Innovation Center of Chemistry for Energy Materials (iChEM), College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, P. R. China
  - Tan Kah Kee Innovation Laboratory, Xiamen 361005, P. R. China
- Guo-Kun Liu
  - State Key Laboratory of Marine Environmental Science, Fujian Provincial Key Laboratory for Coastal Ecology and Environmental Studies, Center for Marine Environmental Chemistry & Toxicology, College of the Environment and Ecology, Xiamen University, Xiamen, Fujian 361102, P. R. China
4
Guo Z, Fan Y, Yu C, Lu H, Zhang Z. GCMSFormer: A Fully Automatic Method for the Resolution of Overlapping Peaks in Gas Chromatography-Mass Spectrometry. Anal Chem 2024; 96:5878-5886. [PMID: 38560891 DOI: 10.1021/acs.analchem.3c05772] [Indexed: 04/04/2024]
Abstract
Gas chromatography-mass spectrometry (GC-MS) is one of the most important techniques for analyzing volatile organic compounds. However, the complexity of real samples and the limited separation capability of chromatography lead to coeluting compounds that are not ideally separated. In this study, a Transformer-based automatic resolution method (GCMSFormer) is proposed to resolve mass spectra from GC-MS peaks in an end-to-end manner, predicting the mass spectra of components directly from the raw overlapping-peak data. Furthermore, orthogonal projection resolution (OPR) was integrated into GCMSFormer to resolve minor components. The GCMSFormer model was trained, validated, and tested using 100,000 augmented data. It achieves a bilingual evaluation understudy (BLEU) score of 99.88% on the test set, significantly higher than the 97.68% of the baseline long short-term memory (LSTM) sequence-to-sequence model. GCMSFormer was also compared with two non-deep-learning resolution tools (MZmine and AMDIS) and two deep learning resolution tools (PARAFAC2 with DL and MSHub/GNPS) on a real plant essential oil GC-MS data set. The resolution results were compared on evaluation metrics including the number of compounds resolved, mass spectral match score, correlation coefficient, explained variance, and resolution speed. The results demonstrate that GCMSFormer has better resolution performance, higher automation, and faster resolution speed. In summary, GCMSFormer is an end-to-end, fast, fully automatic, and accurate method for analyzing GC-MS data of complex samples.
Affiliation(s)
- Zixuan Guo
  - College of Chemistry and Chemical Engineering, Central South University, Hunan, Changsha 410083, China
- Yingjie Fan
  - College of Chemistry and Chemical Engineering, Central South University, Hunan, Changsha 410083, China
- Chuanxiu Yu
  - College of Chemistry and Chemical Engineering, Central South University, Hunan, Changsha 410083, China
- Hongmei Lu
  - College of Chemistry and Chemical Engineering, Central South University, Hunan, Changsha 410083, China
- Zhimin Zhang
  - College of Chemistry and Chemical Engineering, Central South University, Hunan, Changsha 410083, China
5
Duan C, Liu X, Cai W, Shao X. Interpretable Perturbator for Variable Selection in near-Infrared Spectral Analysis. J Chem Inf Model 2024; 64:2508-2514. [PMID: 37801639 DOI: 10.1021/acs.jcim.3c01290] [Indexed: 10/08/2023]
Abstract
A perturbator was developed for variable selection in near-infrared (NIR) spectral analysis, based on the perturbation strategy used in deep learning to develop interpretation methods. A deep learning predictor was first constructed to predict the targets from the spectra in the training set. Then, taking the output of the predictor as a reference, the perturbator was trained to derive the perturbation-positive (P+) and perturbation-negative (P-) features from the spectra. The weight (σ) of the perturbator layer can therefore serve as a criterion for evaluating the importance of the variables in the spectra. With the spectral variables ranked by this criterion, the number of variables used in the quantitative model can be determined through cross-validation. Three NIR data sets were used to evaluate the proposed method. The root mean squared error was found to be comparable with or superior to that obtained by commonly used methods. Moreover, the selected spectral variables are interpretable in that they identify the key spectral features related to the prediction target. The proposed method therefore provides not only an effective tool for optimizing quantitative models but also an efficient way of explaining the spectra of multicomponent samples.
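The general idea of scoring variables by how much a perturbation changes the predictor's output can be sketched as follows. This is a deliberately simpler stand-in, not the trained perturbator layer with weights σ described in the abstract: each variable is shifted by a fixed amount and the mean absolute change in the prediction is recorded. The linear predictor, its weights, the spectra, and the function names are all hypothetical.

```python
def perturbation_importance(predict, spectra, delta=1.0):
    """Mean absolute change in the prediction when one variable is shifted.

    A simple occlusion-style score used here as a stand-in for a learned
    perturbator; larger scores mean the predictor is more sensitive to
    that spectral variable.
    """
    n_vars = len(spectra[0])
    scores = []
    for j in range(n_vars):
        total = 0.0
        for x in spectra:
            perturbed = list(x)
            perturbed[j] += delta  # perturb only variable j
            total += abs(predict(perturbed) - predict(x))
        scores.append(total / len(spectra))
    return scores


# Hypothetical predictor: a fixed linear model over four spectral variables.
weights = [0.0, 2.0, -0.5, 0.0]


def predict(x):
    return sum(w * v for w, v in zip(weights, x))


spectra = [[0.1, 0.4, 0.3, 0.2], [0.2, 0.1, 0.5, 0.3]]
scores = perturbation_importance(predict, spectra)

# Rank variables from most to least important.
ranking = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
```

Given such a ranking, the number of top-ranked variables to keep would then be chosen by cross-validation, as the abstract describes.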
Affiliation(s)
- Chaoshu Duan
  - Research Center for Analytical Sciences, Tianjin Key Laboratory of Biosensing and Molecular Recognition, State Key Laboratory of Medicinal Chemical Biology, College of Chemistry, Nankai University, Tianjin 300071, P. R. China
  - Haihe Laboratory of Sustainable Chemical Transformations, Tianjin 300192, P. R. China
- Xuyang Liu
  - Research Center for Analytical Sciences, Tianjin Key Laboratory of Biosensing and Molecular Recognition, State Key Laboratory of Medicinal Chemical Biology, College of Chemistry, Nankai University, Tianjin 300071, P. R. China
  - Haihe Laboratory of Sustainable Chemical Transformations, Tianjin 300192, P. R. China
- Wensheng Cai
  - Research Center for Analytical Sciences, Tianjin Key Laboratory of Biosensing and Molecular Recognition, State Key Laboratory of Medicinal Chemical Biology, College of Chemistry, Nankai University, Tianjin 300071, P. R. China
  - Haihe Laboratory of Sustainable Chemical Transformations, Tianjin 300192, P. R. China
- Xueguang Shao
  - Research Center for Analytical Sciences, Tianjin Key Laboratory of Biosensing and Molecular Recognition, State Key Laboratory of Medicinal Chemical Biology, College of Chemistry, Nankai University, Tianjin 300071, P. R. China
  - Haihe Laboratory of Sustainable Chemical Transformations, Tianjin 300192, P. R. China
6
Lu XY, Wang CY, Tang H, Qin YF, Cui L, Wang X, Liu GK, Ren B. Patch-Based Convolutional Encoder: A Deep Learning Algorithm for Spectral Classification Balancing the Local and Global Information. Anal Chem 2024. [PMID: 38324760 DOI: 10.1021/acs.analchem.3c03889] [Indexed: 02/09/2024]
Abstract
Molecular vibrational spectroscopies, including infrared absorption and Raman scattering, provide molecular fingerprint information and are powerful tools for qualitative and quantitative analysis. They benefit from the recent development of deep-learning-based algorithms that improve spectral, spatial, and temporal resolution. Although a variety of deep-learning-based algorithms, including those that simultaneously extract global and local spectral features, have been developed for spectral classification, the classification accuracy is still far from satisfactory when the spectral differences are very subtle. Here, we developed a lightweight algorithm named patch-based convolutional encoder (PACE), which effectively improves the accuracy of spectral classification by extracting spectral features while balancing local and global information. The local information is captured by segmenting the spectrum into patches of an appropriate size. The global information is extracted by constructing the correlation between different patches with depthwise separable convolutions. On five open-source spectral data sets, PACE achieved state-of-the-art performance. The more difficult the classification, the better PACE performed relative to the residual neural network (ResNet), vision transformer (ViT), and other commonly used deep learning algorithms. PACE improved the accuracy to 92.1% in the Raman identification of pathogen-derived extracellular vesicles at different physiological states, much better than ResNet (85.1%) and ViT (86.0%). In general, the precise recognition and extraction of subtle differences offered by PACE are expected to help make vibrational spectroscopy a powerful tool for revealing the relevant chemical reaction mechanisms in surface science or realizing early diagnosis in life science.
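The patching step mentioned in the abstract can be illustrated with a minimal sketch. It assumes non-overlapping patches and zero-padding of the tail, neither of which the abstract specifies, and it covers only the segmentation, not the depthwise separable convolutions that relate the patches to one another. The function name and the toy spectrum are hypothetical.

```python
def split_into_patches(spectrum, patch_size):
    """Split a 1-D spectrum into consecutive, non-overlapping patches.

    The tail is zero-padded so that every patch has the same length
    (an assumed convention, chosen here for simplicity).
    """
    if patch_size <= 0:
        raise ValueError("patch_size must be positive")
    padded = list(spectrum)
    remainder = len(padded) % patch_size
    if remainder:
        padded.extend([0.0] * (patch_size - remainder))
    return [padded[i:i + patch_size] for i in range(0, len(padded), patch_size)]


# Hypothetical 10-point spectrum split into patches of 4 points.
spectrum = [0.0, 0.1, 0.9, 0.2, 0.0, 0.3, 0.7, 0.1, 0.0, 0.05]
patches = split_into_patches(spectrum, patch_size=4)
```

A small patch size keeps narrow peaks inside a single patch, while a larger one trades local detail for fewer patches to correlate, which is the balance the abstract refers to when it mentions an appropriate patch size.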
Affiliation(s)
- Xin-Yu Lu
  - State Key Laboratory of Physical Chemistry of Solid Surfaces, Collaborative Innovation Center of Chemistry for Energy Materials (iChEM), College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, China
- Chen-Yue Wang
  - Institute of Artificial Intelligence, Xiamen University, Xiamen 361005, China
- Hui Tang
  - State Key Laboratory of Physical Chemistry of Solid Surfaces, Collaborative Innovation Center of Chemistry for Energy Materials (iChEM), College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, China
- Yi-Fei Qin
  - Xiamen Key Laboratory of Indoor Air and Health, Key Laboratory of Urban Environment and Health, Institute of Urban Environment, Chinese Academy of Sciences, Xiamen 361021, China
  - College of Resources and Environment, University of Chinese Academy of Sciences, Beijing 100049, China
- Li Cui
  - Xiamen Key Laboratory of Indoor Air and Health, Key Laboratory of Urban Environment and Health, Institute of Urban Environment, Chinese Academy of Sciences, Xiamen 361021, China
  - College of Resources and Environment, University of Chinese Academy of Sciences, Beijing 100049, China
- Xiang Wang
  - State Key Laboratory of Physical Chemistry of Solid Surfaces, Collaborative Innovation Center of Chemistry for Energy Materials (iChEM), College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, China
  - Tan Kah Kee Innovation Laboratory, Xiamen 361005, China
- Guo-Kun Liu
  - State Key Laboratory of Marine Environmental Science, Fujian Provincial Key Laboratory for Coastal Ecology and Environmental Studies, Center for Marine Environmental Chemistry & Toxicology, College of the Environment and Ecology, Xiamen University, Xiamen, Fujian 361102, China
- Bin Ren
  - State Key Laboratory of Physical Chemistry of Solid Surfaces, Collaborative Innovation Center of Chemistry for Energy Materials (iChEM), College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, China
  - Institute of Artificial Intelligence, Xiamen University, Xiamen 361005, China
  - Tan Kah Kee Innovation Laboratory, Xiamen 361005, China
7
Ye Q, Wu M, Xu Q, Zeng S, Jiang T, Xiong W, Fu S, Birowosuto MD, Gu C. Porous carbon film/WO3-x nanosheets based SERS substrate combined with deep learning technique for molecule detection. Spectrochimica Acta. Part A, Molecular and Biomolecular Spectroscopy 2024; 310:123962. [PMID: 38309005 DOI: 10.1016/j.saa.2024.123962] [Received: 10/30/2023] [Revised: 01/21/2024] [Accepted: 01/22/2024] [Indexed: 02/05/2024]
Abstract
Surface-enhanced Raman scattering (SERS) is an attractive optical detection method with high sensitivity and detectivity; however, challenges in large-area signal uniformity and complex spectral analysis have always hindered its wide application. Herein, a highly sensitive and uniform SERS detection strategy is reported, supported by a noble-metal-free SERS substrate based on porous carbon film/WO3-x nanosheets (PorC/WO3-x) and a deep learning algorithm. Experimentally, the PorC/WO3-x substrate was prepared by high-temperature annealing of PorC/WO3 films under an argon atmosphere. The defect density of the WO3 was controlled by tuning the reduction reaction time during annealing. The SERS performance was evaluated using R6G as the Raman reporter: the SERS intensity obtained on the substrate with the optimal annealing time of 3 h was about 8 times that obtained on the PorC/WO3 substrate without annealing. A detection limit of 10⁻⁷ M and a Raman enhancement factor of 10⁶ were achieved. Moreover, the optimal SERS substrate was used to detect the flavonoids quercetin, 3-hydroxyflavone, and flavone, and a deep learning algorithm was incorporated to identify quercetin. The results show that quercetin can be accurately detected among these flavonoids, with a lowest detectable concentration of 10⁻⁵ M.
Affiliation(s)
- Qinli Ye
  - The Research Institute of Advanced Technology, Ningbo University, Ningbo 315211, Zhejiang, China
- Miaomiao Wu
  - The Research Institute of Advanced Technology, Ningbo University, Ningbo 315211, Zhejiang, China; Ningbo Institute of Oceanography, Ningbo 315800, China
- Qian Xu
  - Department of Nursing, The First Hospital of Ningbo University, Ningbo 315010, Zhejiang, China
- Shuwen Zeng
  - Light, Nanomaterials & Nanotechnologies (L2n), CNRS-UMR 7004, Université de Technologie de Troyes, 10000 Troyes, France
- Tao Jiang
  - The Research Institute of Advanced Technology, Ningbo University, Ningbo 315211, Zhejiang, China
- Wei Xiong
  - The Research Institute of Advanced Technology, Ningbo University, Ningbo 315211, Zhejiang, China
- Songyin Fu
  - The Research Institute of Advanced Technology, Ningbo University, Ningbo 315211, Zhejiang, China
- Muhammad Danang Birowosuto
  - Łukasiewicz Research Network-PORT Polish Center for Technology Development, Stabłowicka 147, 54-066 Wrocław, Poland
- Chenjie Gu
  - The Research Institute of Advanced Technology, Ningbo University, Ningbo 315211, Zhejiang, China; Ningbo Institute of Oceanography, Ningbo 315800, China; Department of Nursing, The First Hospital of Ningbo University, Ningbo 315010, Zhejiang, China
8
Wang Y, Wei W, Du W, Cai J, Liao Y, Lu H, Kong B, Zhang Z. Deep-Learning-Based Mixture Identification for Nuclear Magnetic Resonance Spectroscopy Applied to Plant Flavors. Molecules 2023; 28:7380. [PMID: 37959799 PMCID: PMC10648966 DOI: 10.3390/molecules28217380] [Received: 09/27/2023] [Revised: 10/25/2023] [Accepted: 10/30/2023] [Indexed: 11/15/2023]
Abstract
Nuclear magnetic resonance (NMR) is a crucial technique for analyzing mixtures of small molecules, offering non-destructive, fast, reproducible, and unbiased analysis. However, mixture identification is challenging because of the chemical shift offsets and peak overlaps that often occur in mixtures such as plant flavors. Here, we propose a deep-learning-based mixture identification method (DeepMID) that can identify plant flavors (mixtures) in a formulated flavor (a mixture consisting of several plant flavors) without the need to know the specific components of the plant flavors. A pseudo-Siamese convolutional neural network (pSCNN) and a spatial pyramid pooling (SPP) layer were used to solve these problems because of their high accuracy and robustness. The DeepMID model was trained, validated, and tested on an augmented data set containing 50,000 pairs of formulated and plant flavors. We demonstrate that DeepMID achieves excellent prediction results on the augmented test set (ACC = 99.58%, TPR = 99.48%, FPR = 0.32%) and on two experimentally obtained data sets (one with ACC = 97.60%, TPR = 92.81%, FPR = 0.78%; the other with ACC = 92.31%, TPR = 80.00%, FPR = 0.00%). In conclusion, DeepMID is a reliable method for identifying plant flavors in formulated flavors based on NMR spectroscopy, which can assist researchers in accelerating the design of flavor formulations.
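The three reported rates follow the usual binary-classification definitions: ACC is the fraction of all pairs classified correctly, TPR the fraction of truly present plant flavors that are detected, and FPR the fraction of absent ones that are wrongly flagged. The sketch below computes them from hypothetical pair labels (1 meaning the plant flavor is present in the formulated flavor); the function name and data are illustrative only and are not taken from the paper.

```python
def binary_rates(y_true, y_pred):
    """Accuracy, true positive rate and false positive rate for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    acc = (tp + tn) / len(pairs)
    tpr = tp / (tp + fn) if tp + fn else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    return acc, tpr, fpr


# Hypothetical labels for eight (formulated flavor, plant flavor) pairs.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
acc, tpr, fpr = binary_rates(y_true, y_pred)
```

Because absent flavors usually outnumber present ones, ACC can stay high even when TPR is noticeably lower, which is consistent with the pattern of the second experimental data set (ACC = 92.31%, TPR = 80.00%, FPR = 0.00%).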
Affiliation(s)
- Yufei Wang
  - College of Chemistry and Chemical Engineering, Central South University, Changsha 410083, China; (Y.W.); (Y.L.); (H.L.)
- Weiwei Wei
  - Technology Center, China Tobacco Hunan Industrial Co., Ltd., Changsha 410014, China; (W.W.); (W.D.); (J.C.)
- Wen Du
  - Technology Center, China Tobacco Hunan Industrial Co., Ltd., Changsha 410014, China; (W.W.); (W.D.); (J.C.)
- Jiaxiao Cai
  - Technology Center, China Tobacco Hunan Industrial Co., Ltd., Changsha 410014, China; (W.W.); (W.D.); (J.C.)
- Yuxuan Liao
  - College of Chemistry and Chemical Engineering, Central South University, Changsha 410083, China; (Y.W.); (Y.L.); (H.L.)
- Hongmei Lu
  - College of Chemistry and Chemical Engineering, Central South University, Changsha 410083, China; (Y.W.); (Y.L.); (H.L.)
- Bo Kong
  - Technology Center, China Tobacco Hunan Industrial Co., Ltd., Changsha 410014, China; (W.W.); (W.D.); (J.C.)
- Zhimin Zhang
  - College of Chemistry and Chemical Engineering, Central South University, Changsha 410083, China; (Y.W.); (Y.L.); (H.L.)
9
Cheng F, Yang C, Zhu H, Li Y, Lan L, Wang K. Semi-Supervised Deep Learning-Based Multi-component Spectral Calibration Modeling for UV-vis and Near-Infrared Spectroscopy without Information Loss. Anal Chem 2023; 95:13446-13455. [PMID: 37638661 DOI: 10.1021/acs.analchem.3c01132] [Indexed: 08/29/2023]
Abstract
Spectral analysis is an important method for characterizing and identifying chemical species. However, quantitative spectral analysis of multiple chemical properties in real-world samples has always been challenging because of the strong correlation, heavy noise, and serious information overlap of the spectral features. Here, we present NICEM, a new semi-supervised spectral calibration method based on information-lossless decoupling of spectral features. To separate and extract key latent features, the method uses the flow-based model non-linear independent component estimation (NICE) to learn the sample distribution. The reversible structure of the deep network transforms the spectral data into independent latent variables obeying a Gaussian distribution without information loss, so as to find the essential properties and realize a nonlinear decomposition of the features. Moreover, the association between the latent feature variables and the attributes is evaluated by the maximum mutual information coefficient to eliminate the adverse effects of irrelevant information in the latent variable space and mine key information. Since the latent variables are independent in each dimension, NICEM makes it easier to establish an accurate semi-supervised multi-component calibration model even for highly overlapping and complex spectral data. The applicability of the proposed method is demonstrated on three ultraviolet-visible and near-infrared spectral data sets with 15 physical and chemical properties, covering diesel fuels, corn, and a multi-metal ion solution. Results show that NICEM has the highest determination coefficient (R²) and significantly improves extrapolation compared with seven state-of-the-art methods. The proposed method is intuitive because it obviates complex feature engineering and prior knowledge, and it is a promising spectral calibration tool for quantitative analysis in other spectroscopy applications.
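The "without information loss" property comes from the reversible building block of NICE, the additive coupling layer: half of the input passes through unchanged and the other half is shifted by a function of the first half, so the input can be recovered from the output. The sketch below shows one such layer. The `shift` function is a hypothetical stand-in for the learned network, and the variable names are illustrative; this is not the NICEM implementation.

```python
import math


def coupling_forward(x, shift):
    """Additive coupling: y1 = x1, y2 = x2 + m(x1)."""
    if len(x) % 2:
        raise ValueError("this sketch expects an even-length input")
    half = len(x) // 2
    x1, x2 = x[:half], x[half:]
    m = shift(x1)
    return x1 + [b + s for b, s in zip(x2, m)]


def coupling_inverse(y, shift):
    """Inverse of the additive coupling: x1 = y1, x2 = y2 - m(y1)."""
    half = len(y) // 2
    y1, y2 = y[:half], y[half:]
    m = shift(y1)
    return y1 + [b - s for b, s in zip(y2, m)]


def shift(x1):
    # Hypothetical stand-in for the learned network m(.); any function of
    # x1 works, because the inverse recomputes it from the unchanged half.
    return [math.tanh(3.0 * v) + v * v for v in x1]


x = [0.2, -0.5, 1.0, 0.3]
y = coupling_forward(x, shift)
x_recovered = coupling_inverse(y, shift)
```

Because the shift never has to be inverted itself, it can be an arbitrarily complex network, and the Jacobian determinant of an additive coupling layer is 1, so the transformation preserves volume.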
Affiliation(s)
- Fei Cheng
  - School of Automation, Central South University, Changsha 410083, China
- Chunhua Yang
  - School of Automation, Central South University, Changsha 410083, China
- Hongqiu Zhu
  - School of Automation, Central South University, Changsha 410083, China
- Yonggang Li
  - School of Automation, Central South University, Changsha 410083, China
- Lijuan Lan
  - School of Automation, Central South University, Changsha 410083, China
- Kai Wang
  - School of Automation, Central South University, Changsha 410083, China