1
Cardoso Rial R. AI in analytical chemistry: Advancements, challenges, and future directions. Talanta 2024; 274:125949. [PMID: 38569367 DOI: 10.1016/j.talanta.2024.125949] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/28/2023] [Revised: 03/09/2024] [Accepted: 03/17/2024] [Indexed: 04/05/2024]
Abstract
This article explores the influence and applications of Artificial Intelligence (AI) in analytical chemistry, highlighting its potential to revolutionize the analysis of complex data sets and the development of innovative analytical methods. Additionally, it discusses the role of AI in interpreting large-scale data and optimizing experimental processes. AI has been fundamental in managing heterogeneous data and in advanced analysis of complex spectra in areas such as spectroscopy and chromatography. The article also examines the historical development of AI in chemistry, its current challenges, including the interpretation of AI models and the integration of large volumes of data. Finally, it forecasts future trends and the potential impact of AI on analytical chemistry, emphasizing the need for ethical and secure approaches in the use of AI.
Affiliation(s)
- Rafael Cardoso Rial
- Federal Institute of Mato Grosso do Sul, 79750-000, Nova Andradina, MS, Brazil.
2
Guo Z, Fan Y, Yu C, Lu H, Zhang Z. GCMSFormer: A Fully Automatic Method for the Resolution of Overlapping Peaks in Gas Chromatography-Mass Spectrometry. Anal Chem 2024; 96:5878-5886. [PMID: 38560891 DOI: 10.1021/acs.analchem.3c05772] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/04/2024]
Abstract
Gas chromatography-mass spectrometry (GC-MS) is one of the most important instruments for analyzing volatile organic compounds. However, the complexity of real samples and the limitations of chromatographic separation lead to coeluting compounds that are not fully separated. In this study, a Transformer-based automatic resolution method (GCMSFormer) is proposed to resolve mass spectra from GC-MS peaks in an end-to-end manner, predicting the mass spectra of components directly from the raw overlapping-peak data. Furthermore, orthogonal projection resolution (OPR) was integrated into GCMSFormer to resolve minor components. The GCMSFormer model was trained, validated, and tested on 100,000 augmented data samples. It achieves a bilingual evaluation understudy (BLEU) score of 99.88% on the test set, significantly higher than the 97.68% BLEU score of the baseline sequence-to-sequence long short-term memory (LSTM) model. GCMSFormer was also compared with two non-deep-learning resolution tools (MZmine and AMDIS) and two deep learning resolution tools (PARAFAC2 with DL and MSHub/GNPS) on a real plant essential oil GC-MS data set. Their resolution results were compared on evaluation metrics including the number of compounds resolved, mass spectral match score, correlation coefficient, explained variance, and resolution speed. The results demonstrate that GCMSFormer has better resolution performance, higher automation, and faster resolution speed. In summary, GCMSFormer is an end-to-end, fast, fully automatic, and accurate method for analyzing GC-MS data of complex samples.
Affiliation(s)
- Zixuan Guo
- College of Chemistry and Chemical Engineering, Central South University, Hunan, Changsha 410083, China
- Yingjie Fan
- College of Chemistry and Chemical Engineering, Central South University, Hunan, Changsha 410083, China
- Chuanxiu Yu
- College of Chemistry and Chemical Engineering, Central South University, Hunan, Changsha 410083, China
- Hongmei Lu
- College of Chemistry and Chemical Engineering, Central South University, Hunan, Changsha 410083, China
- Zhimin Zhang
- College of Chemistry and Chemical Engineering, Central South University, Hunan, Changsha 410083, China
3
Schmeis Arroyo V, Iosa M, Antonucci G, De Bartolo D. Predicting Male Infertility Using Artificial Neural Networks: A Review of the Literature. Healthcare (Basel) 2024; 12:781. [PMID: 38610202 PMCID: PMC11011284 DOI: 10.3390/healthcare12070781] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/05/2024] [Revised: 03/27/2024] [Accepted: 03/31/2024] [Indexed: 04/14/2024] Open
Abstract
Male infertility is a relevant public health problem, but there is no systematic review of the different machine learning (ML) models and their accuracy so far. The present review aims to comprehensively investigate the use of ML algorithms in predicting male infertility, reporting the accuracy of the models used as the primary outcome. Particular attention is paid to the use of artificial neural networks (ANNs). A comprehensive literature search was conducted in PubMed, Scopus, and Science Direct between 15 July and 23 October 2023, following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. We assessed the quality of the included studies using the tools recommended for each study design. We also screened the Risk of Bias (RoB) associated with the included studies. In total, 43 relevant publications were included in this review, covering 40 different ML models. The included studies were of good quality, although the RoB assessment was not always favorable for all study types. The included studies reported a median accuracy of 88% in predicting male infertility using ML models. We found only seven studies using ANN models for male infertility prediction, reporting a median accuracy of 84%.
Affiliation(s)
- Vivian Schmeis Arroyo
- Department of Psychology, University Sapienza of Rome, 00185 Rome, Italy (M.I.); (G.A.)
- Marco Iosa
- Department of Psychology, University Sapienza of Rome, 00185 Rome, Italy (M.I.); (G.A.)
- Santa Lucia Foundation, Scientific Institute for Research, Hospitalization and Health Care (IRCCS), 00179 Rome, Italy
- Gabriella Antonucci
- Department of Psychology, University Sapienza of Rome, 00185 Rome, Italy (M.I.); (G.A.)
- Santa Lucia Foundation, Scientific Institute for Research, Hospitalization and Health Care (IRCCS), 00179 Rome, Italy
- Daniela De Bartolo
- Santa Lucia Foundation, Scientific Institute for Research, Hospitalization and Health Care (IRCCS), 00179 Rome, Italy
4
Peng Z, Wang Y, Wu X, Yang S, Du X, Xu X, Hu C, Liu W, Zhu Y, Dong B, Pan J, Bao Q, Qian K, Dong L, Xue W. Identifying High Gleason Score Prostate Cancer by Prostate Fluid Metabolic Fingerprint-Based Multi-Modal Recognition. Small Methods 2024:e2301684. [PMID: 38258603 DOI: 10.1002/smtd.202301684] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/05/2023] [Revised: 01/09/2024] [Indexed: 01/24/2024]
Abstract
Prostate cancer (PCa) is the second most common cancer in males worldwide. The Gleason scoring system, which classifies the pathological growth pattern of cancer, is considered one of the most important prognostic factors for PCa. Compared to indolent PCa, PCa with high Gleason score (h-GS PCa, GS ≥ 8) has greater clinical significance due to its high aggressiveness and poor prognosis. It is crucial to establish a rapid, non-invasive diagnostic modality to decipher patients with h-GS PCa as early as possible. In this study, ferric nanoparticle-assisted laser desorption/ionization mass spectrometry (FeNPALDI-MS) to extract prostate fluid metabolic fingerprint (PSF-MF) is employed and combined with the clinical features of patients, such as prostate-specific antigen (PSA), to establish a multi-modal diagnosis assisted by machine learning. This approach yields an impressive area under the curve (AUC) of 0.87 to diagnose patients with h-GS, surpassing the results of single-modal diagnosis using only PSF-MF or PSA, respectively. Additionally, using various screening methods, six key metabolites that exhibit greater diagnostic efficacy (AUC = 0.96) are identified. These findings also provide insights into related metabolic pathways, which may provide valuable information for further elucidation of the pathological mechanisms underlying h-GS PCa.
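The AUC values quoted in this abstract can be read as the probability that a randomly chosen positive case receives a higher model score than a randomly chosen negative case. A minimal Python sketch of that rank-based reading follows; the function name `auc_from_scores` and the example scores are hypothetical and are not taken from the paper, whose models and data are not reproduced here.

```python
def auc_from_scores(pos_scores, neg_scores):
    """Rank-based AUC: fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    if not pos_scores or not neg_scores:
        raise ValueError("both classes need at least one score")
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical classifier scores (illustration only).
example_auc = auc_from_scores([0.9, 0.8, 0.4], [0.7, 0.3, 0.2])
print(f"AUC = {example_auc:.3f}")  # 8 of the 9 pairs are ranked correctly
```

The pairwise loop is quadratic in the number of samples, which is acceptable for a small illustration; sorting-based implementations give the same value for larger data sets.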
Affiliation(s)
- Zehong Peng
- Department of Urology, Ren Ji Hospital, School of Medicine, Shanghai Jiao Tong University, 160 Pujian Road, Shanghai, 200127, P. R. China
- Yuning Wang
- State Key Laboratory of Systems Medicine for Cancer, School of Biomedical Engineering and Institute of Medical Robotics, Shanghai Jiao Tong University, Shanghai, 200030, P. R. China
- Xinrui Wu
- Department of Urology, Ren Ji Hospital, School of Medicine, Shanghai Jiao Tong University, 160 Pujian Road, Shanghai, 200127, P. R. China
- Shouzhi Yang
- State Key Laboratory of Systems Medicine for Cancer, School of Biomedical Engineering and Institute of Medical Robotics, Shanghai Jiao Tong University, Shanghai, 200030, P. R. China
- Xinxing Du
- Department of Urology, Ren Ji Hospital, School of Medicine, Shanghai Jiao Tong University, 160 Pujian Road, Shanghai, 200127, P. R. China
- Xiaoyu Xu
- State Key Laboratory of Systems Medicine for Cancer, School of Biomedical Engineering and Institute of Medical Robotics, Shanghai Jiao Tong University, Shanghai, 200030, P. R. China
- Cong Hu
- Department of Urology, Ren Ji Hospital, School of Medicine, Shanghai Jiao Tong University, 160 Pujian Road, Shanghai, 200127, P. R. China
- Wanshan Liu
- State Key Laboratory of Systems Medicine for Cancer, School of Biomedical Engineering and Institute of Medical Robotics, Shanghai Jiao Tong University, Shanghai, 200030, P. R. China
- Yinjie Zhu
- Department of Urology, Ren Ji Hospital, School of Medicine, Shanghai Jiao Tong University, 160 Pujian Road, Shanghai, 200127, P. R. China
- Baijun Dong
- Department of Urology, Ren Ji Hospital, School of Medicine, Shanghai Jiao Tong University, 160 Pujian Road, Shanghai, 200127, P. R. China
- Jiahua Pan
- Department of Urology, Ren Ji Hospital, School of Medicine, Shanghai Jiao Tong University, 160 Pujian Road, Shanghai, 200127, P. R. China
- Qingui Bao
- Fosun Diagnostics (Shanghai) Co., Ltd., No. 830, Chengyin Road, Baoshan, Shanghai, 200435, P. R. China
- Kun Qian
- State Key Laboratory of Systems Medicine for Cancer, School of Biomedical Engineering and Institute of Medical Robotics, Shanghai Jiao Tong University, Shanghai, 200030, P. R. China
- Liang Dong
- Department of Urology, Ren Ji Hospital, School of Medicine, Shanghai Jiao Tong University, 160 Pujian Road, Shanghai, 200127, P. R. China
- Wei Xue
- Department of Urology, Ren Ji Hospital, School of Medicine, Shanghai Jiao Tong University, 160 Pujian Road, Shanghai, 200127, P. R. China
5
Wang Y, Wei W, Du W, Cai J, Liao Y, Lu H, Kong B, Zhang Z. Deep-Learning-Based Mixture Identification for Nuclear Magnetic Resonance Spectroscopy Applied to Plant Flavors. Molecules 2023; 28:7380. [PMID: 37959799 PMCID: PMC10648966 DOI: 10.3390/molecules28217380] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/27/2023] [Revised: 10/25/2023] [Accepted: 10/30/2023] [Indexed: 11/15/2023] Open
Abstract
Nuclear magnetic resonance (NMR) is a crucial technique for analyzing mixtures consisting of small molecules, providing non-destructive, fast, reproducible, and unbiased benefits. However, it is challenging to perform mixture identification because of the offset of chemical shifts and peak overlaps that often exist in mixtures such as plant flavors. Here, we propose a deep-learning-based mixture identification method (DeepMID) that can be used to identify plant flavors (mixtures) in a formulated flavor (mixture consisting of several plant flavors) without the need to know the specific components in the plant flavors. A pseudo-Siamese convolutional neural network (pSCNN) and a spatial pyramid pooling (SPP) layer were used to solve the problems due to their high accuracy and robustness. The DeepMID model is trained, validated, and tested on an augmented data set containing 50,000 pairs of formulated and plant flavors. We demonstrate that DeepMID can achieve excellent prediction results in the augmented test set: ACC = 99.58%, TPR = 99.48%, FPR = 0.32%; and two experimentally obtained data sets: one shows ACC = 97.60%, TPR = 92.81%, FPR = 0.78% and the other shows ACC = 92.31%, TPR = 80.00%, FPR = 0.00%. In conclusion, DeepMID is a reliable method for identifying plant flavors in formulated flavors based on NMR spectroscopy, which can assist researchers in accelerating the design of flavor formulations.
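The ACC, TPR, and FPR figures reported above follow the standard confusion-matrix definitions. A minimal Python sketch of those definitions is given here for readers who want to reproduce such figures on their own data; the function name `classification_rates` and the counts are hypothetical and are not the paper's confusion matrices, which the abstract does not give.

```python
def classification_rates(tp: int, fp: int, tn: int, fn: int) -> dict[str, float]:
    """Accuracy, true positive rate, and false positive rate from raw counts."""
    if tp + fn == 0 or fp + tn == 0:
        raise ValueError("need at least one positive and one negative sample")
    total = tp + fp + tn + fn
    return {
        "ACC": (tp + tn) / total,  # fraction of all pairs classified correctly
        "TPR": tp / (tp + fn),     # fraction of true positives that are found
        "FPR": fp / (fp + tn),     # fraction of true negatives wrongly flagged
    }


# Hypothetical counts for 100 flavor pairs (illustration only).
rates = classification_rates(tp=45, fp=2, tn=48, fn=5)
for name, value in rates.items():
    print(f"{name} = {value:.2%}")
```

Reporting TPR and FPR alongside ACC matters when positives are rare, because a high accuracy can then coexist with a low TPR.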
Affiliation(s)
- Yufei Wang
- College of Chemistry and Chemical Engineering, Central South University, Changsha 410083, China; (Y.W.); (Y.L.); (H.L.)
- Weiwei Wei
- Technology Center, China Tobacco Hunan Industrial Co., Ltd., Changsha 410014, China; (W.W.); (W.D.); (J.C.)
- Wen Du
- Technology Center, China Tobacco Hunan Industrial Co., Ltd., Changsha 410014, China; (W.W.); (W.D.); (J.C.)
- Jiaxiao Cai
- Technology Center, China Tobacco Hunan Industrial Co., Ltd., Changsha 410014, China; (W.W.); (W.D.); (J.C.)
- Yuxuan Liao
- College of Chemistry and Chemical Engineering, Central South University, Changsha 410083, China; (Y.W.); (Y.L.); (H.L.)
- Hongmei Lu
- College of Chemistry and Chemical Engineering, Central South University, Changsha 410083, China; (Y.W.); (Y.L.); (H.L.)
- Bo Kong
- Technology Center, China Tobacco Hunan Industrial Co., Ltd., Changsha 410014, China; (W.W.); (W.D.); (J.C.)
- Zhimin Zhang
- College of Chemistry and Chemical Engineering, Central South University, Changsha 410083, China; (Y.W.); (Y.L.); (H.L.)
6
Liao Y, Tian M, Zhang H, Lu H, Jiang Y, Chen Y, Zhang Z. Highly automatic and universal approach for pure ion chromatogram construction from liquid chromatography-mass spectrometry data using deep learning. J Chromatogr A 2023; 1705:464172. [PMID: 37392637 DOI: 10.1016/j.chroma.2023.464172] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/28/2022] [Revised: 06/14/2023] [Accepted: 06/18/2023] [Indexed: 07/03/2023]
Abstract
Feature extraction is the most fundamental step when analyzing liquid chromatography-mass spectrometry (LC-MS) datasets. However, traditional methods require optimal parameter selections and re-optimization for different datasets, thus hindering efficient and objective large-scale data analysis. Pure ion chromatogram (PIC) is widely used because it avoids the peak splitting problem of the extracted ion chromatogram (EIC) and regions of interest (ROIs). Here, we developed a deep learning-based pure ion chromatogram method (DeepPIC) to find PICs using a customized U-Net from centroid mode data of LC-MS directly and automatically. A model was trained, validated, and tested on the Arabidopsis thaliana dataset with 200 input-label pairs. DeepPIC was integrated into KPIC2. The combination enables the entire processing pipeline from raw data to discriminant models for metabolomics datasets. The KPIC2 with DeepPIC was compared against other competing methods (XCMS, FeatureFinderMetabo, and peakonly) on the MM48, simulated MM48, and quantitative datasets. These comparisons showed that DeepPIC outperforms XCMS, FeatureFinderMetabo, and peakonly in recall rates and correlation with sample concentrations. Five datasets of different instruments and samples were used to evaluate the quality of PICs and the universal applicability of DeepPIC, and 95.12% of the found PICs could precisely match their manually labeled PICs. Therefore, KPIC2+DeepPIC is an automatic, practical, and off-the-shelf method to extract features from raw data directly, exceeding traditional methods with careful parameter tuning. It is publicly available at https://github.com/yuxuanliao/DeepPIC.
Affiliation(s)
- Yuxuan Liao
- College of Chemistry and Chemical Engineering, Central South University, Changsha 410083, China
- Miao Tian
- College of Chemistry and Chemical Engineering, Central South University, Changsha 410083, China
- Hailiang Zhang
- College of Chemistry and Chemical Engineering, Central South University, Changsha 410083, China
- Hongmei Lu
- College of Chemistry and Chemical Engineering, Central South University, Changsha 410083, China
- Yonglei Jiang
- Yunnan Academy of Tobacco Agricultural Sciences, Kunming, Yunnan 650021, China
- Yi Chen
- Yunnan Academy of Tobacco Agricultural Sciences, Kunming, Yunnan 650021, China
- Zhimin Zhang
- College of Chemistry and Chemical Engineering, Central South University, Changsha 410083, China
7
Fan X, Wang Y, Yu C, Lv Y, Zhang H, Yang Q, Wen M, Lu H, Zhang Z. A Universal and Accurate Method for Easily Identifying Components in Raman Spectroscopy Based on Deep Learning. Anal Chem 2023; 95:4863-4870. [PMID: 36908216 DOI: 10.1021/acs.analchem.2c03853] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Indexed: 03/14/2023]
Abstract
Raman spectroscopy has been widely used to provide the structural fingerprint for molecular identification. Due to interference from coexisting components, noise, baseline, and systematic differences between spectrometers, component identification with Raman spectra is challenging, especially for mixtures. In this study, a method entitled DeepRaman has been proposed to solve those problems by combining the comparison ability of a pseudo-Siamese neural network (pSNN) and the input-shape flexibility of spatial pyramid pooling (SPP). DeepRaman was trained, validated, and tested with 41,564 augmented Raman spectra from two databases (pharmaceutical material and S.T. Japan). It can achieve 96.29% accuracy, 98.40% true positive rate (TPR), and 94.36% true negative rate (TNR) on the test set. Another six data sets measured on different instruments were used to evaluate the performance of the proposed method from different aspects. DeepRaman can provide accurate identification results and significantly outperform the hit quality index (HQI) method and other deep learning models. In addition, it performs well in cases of different spectral complexity and low-content components. Once the model is established, it can be used directly on different data sets without retraining or transfer learning. Furthermore, it also obtains promising results for the analysis of surface-enhanced Raman spectroscopy (SERS) data sets and Raman imaging data sets. In summary, it is an accurate, universal, and ready-to-use method for component identification in various application scenarios.
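For context on the hit quality index (HQI) baseline named above, one commonly used definition is the squared cosine similarity between a query spectrum and a library spectrum sampled on the same wavenumber axis. The Python sketch below illustrates that definition only; the abstract does not state which HQI variant the authors used, and the function name `hit_quality_index` and the toy intensity vectors are hypothetical.

```python
def hit_quality_index(query, reference):
    """Squared cosine similarity between two equally sampled spectra (0 to 1)."""
    if len(query) != len(reference):
        raise ValueError("spectra must share the same axis and length")
    dot = sum(q * r for q, r in zip(query, reference))
    query_norm_sq = sum(q * q for q in query)
    reference_norm_sq = sum(r * r for r in reference)
    if query_norm_sq == 0 or reference_norm_sq == 0:
        raise ValueError("spectra must not be all zeros")
    return (dot * dot) / (query_norm_sq * reference_norm_sq)


# Toy intensity vectors (illustration only, not real Raman spectra).
library_spectrum = [1.0, 2.0, 3.0]
scaled_copy = [2.0, 4.0, 6.0]  # same shape, twice the intensity
print(hit_quality_index(scaled_copy, library_spectrum))
```

Because the index depends only on spectral shape, a uniformly scaled copy of a library spectrum scores the same as the original, while baseline drift or overlapping components from a mixture lower the score.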
Affiliation(s)
- Xiaqiong Fan
- College of Chemistry and Chemical Engineering, Central South University, Changsha 410083, China
- Yue Wang
- College of Chemistry and Chemical Engineering, Central South University, Changsha 410083, China
- Chuanxiu Yu
- College of Chemistry and Chemical Engineering, Central South University, Changsha 410083, China
- Yuanxia Lv
- College of Chemistry and Chemical Engineering, Central South University, Changsha 410083, China
- Hailiang Zhang
- College of Chemistry and Chemical Engineering, Central South University, Changsha 410083, China
- Qiong Yang
- College of Chemistry and Chemical Engineering, Central South University, Changsha 410083, China
- Ming Wen
- College of Chemistry and Chemical Engineering, Central South University, Changsha 410083, China
- Hongmei Lu
- College of Chemistry and Chemical Engineering, Central South University, Changsha 410083, China
- Zhimin Zhang
- College of Chemistry and Chemical Engineering, Central South University, Changsha 410083, China
8
Fan Y, Yu C, Lu H, Chen Y, Hu B, Zhang X, Su J, Zhang Z. Deep learning-based method for automatic resolution of gas chromatography-mass spectrometry data from complex samples. J Chromatogr A 2023; 1690:463768. [PMID: 36641940 DOI: 10.1016/j.chroma.2022.463768] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/04/2022] [Revised: 12/21/2022] [Accepted: 12/28/2022] [Indexed: 12/31/2022]
Abstract
Modern gas chromatography-mass spectrometry (GC-MS) is the workhorse for the high-throughput profiling of volatile compounds in complex samples. It can produce a considerable amount of two-dimensional data, and automatic methods are required to distill chemical information from raw GC-MS data efficiently. In this study, we proposed an Automatic Resolution method (AutoRes) based on pseudo-Siamese convolutional neural networks (pSCNN) to extract the meaningful features swamped by the noises, baseline drifts, retention time shifts, and overlapped peaks. Two pSCNN models were trained with 400,000 augmented spectral pairs, respectively. They can predict the selective region (pSCNN1) and elution region (pSCNN2) of compounds in an untargeted manner. The accuracies of the pSCNN1 model and the pSCNN2 model on their test sets are 99.9% and 92.6%, respectively. Then, the chromatographic profile of each component was automatically resolved by full rank resolution (FRR) based on the predicted regions by these models. The performance of AutoRes was evaluated on the simulated and plant essential oil datasets. Compared to AMDIS and MZmine, AutoRes resolves more reasonable mass spectra, chromatograms, and peak areas to identify and quantify compounds. The average match scores of AutoRes (925 and 936) outperformed AMDIS (909 and 925) and MZmine (888 and 916) when resolving mass spectra from overlapped peaks on the Set Ⅰ and Set Ⅱ of plant essential oil dataset and matching them against the NIST17 library. It extracted peak areas and mass spectra automatically from 10 GC-MS files of plant essential oils, and the entire process was completed in 8 min without any prior information or manual intervention. It is implemented in Python and is available as an open-source package at https://github.com/dyjfan/AutoRes.
Affiliation(s)
- Yingjie Fan
- College of Chemistry and Chemical Engineering, Central South University, Changsha 410083, Hunan, China
- Chuanxiu Yu
- College of Chemistry and Chemical Engineering, Central South University, Changsha 410083, Hunan, China
- Hongmei Lu
- College of Chemistry and Chemical Engineering, Central South University, Changsha 410083, Hunan, China
- Yi Chen
- Yunnan Academy of Tobacco Agricultural Sciences, Kunming 650021, Yunnan, China
- Binbin Hu
- Yunnan Academy of Tobacco Agricultural Sciences, Kunming 650021, Yunnan, China
- Xingren Zhang
- Yunnan Academy of Tobacco Agricultural Sciences, Kunming 650021, Yunnan, China; Baoshan City Branch of Yunnan Tobacco Company, Baoshan 678000, Yunnan, China
- Jiaen Su
- Dali Prefecture Branch of Yunnan Tobacco Company, Dali 671000, Yunnan, China
- Zhimin Zhang
- College of Chemistry and Chemical Engineering, Central South University, Changsha 410083, Hunan, China
9
Zhang H, Xu Z, Fan X, Wang Y, Yang Q, Sun J, Wen M, Kang X, Zhang Z, Lu H. Fusion of Quality Evaluation Metrics and Convolutional Neural Network Representations for ROI Filtering in LC-MS. Anal Chem 2023; 95:612-620. [PMID: 36597722 DOI: 10.1021/acs.analchem.2c01398] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/05/2023]
Abstract
Region of interest (ROI) extraction is a fundamental step in analyzing metabolomic datasets acquired by liquid chromatography-mass spectrometry (LC-MS). However, noise and background in LC-MS data often affect the quality of extracted ROIs. Therefore, developing effective ROI evaluation algorithms is necessary to eliminate false positives while keeping the false-negative rate as low as possible. In this study, a deep fused filter of ROIs (dffROI) was proposed to improve the accuracy of ROI extraction by combining handcrafted evaluation metrics with convolutional neural network (CNN)-learned representations. To evaluate its performance, dffROI was compared with peakonly (CNN-learned representation) and five handcrafted metrics on three LC-MS datasets and a gas chromatography-mass spectrometry (GC-MS) dataset. Results show that dffROI achieves higher accuracy, a better true-positive rate, and a lower false-positive rate. Its accuracy, true-positive rate, and false-positive rate are 0.9841, 0.9869, and 0.0186 on the test set, respectively. The classification error rate of dffROI (1.59%) is significantly reduced compared with peakonly (2.73%). The model-agnostic feature importance demonstrates the necessity of fusing handcrafted evaluation metrics with the convolutional neural network representations. dffROI is an automatic, robust, and universal method for ROI filtering by virtue of information fusion and end-to-end learning. It is implemented in the Python programming language and open-sourced at https://github.com/zhanghailiangcsu/dffROI under the BSD License. Furthermore, it has been integrated into the KPIC2 framework previously proposed by our group to facilitate real metabolomic LC-MS dataset analysis.
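The accuracy and error-rate figures in this abstract are linked by simple arithmetic: the classification error rate is one minus the accuracy. The short Python sketch below checks that relationship using only the numbers quoted in the abstract; the relative reduction it prints is derived here for illustration and is not a value reported by the authors.

```python
# Figures quoted in the abstract.
accuracy_dffroi = 0.9841   # test-set accuracy of dffROI
error_peakonly = 0.0273    # classification error rate of peakonly (2.73%)

# Error rate is the complement of accuracy: 1 - 0.9841 = 0.0159 (1.59%).
error_dffroi = 1.0 - accuracy_dffroi

# Derived quantity (an assumption of this sketch, not reported in the paper):
# the relative reduction in error rate of dffROI with respect to peakonly.
relative_reduction = (error_peakonly - error_dffroi) / error_peakonly

print(f"dffROI error rate:  {error_dffroi:.2%}")
print(f"relative reduction: {relative_reduction:.1%}")
```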
Affiliation(s)
- Hailiang Zhang
- College of Chemistry and Chemical Engineering, Central South University, Changsha 410083, China
- Zhenbo Xu
- College of Chemistry and Chemical Engineering, Central South University, Changsha 410083, China
- Xiaqiong Fan
- College of Chemistry and Chemical Engineering, Central South University, Changsha 410083, China
- Yue Wang
- College of Chemistry and Chemical Engineering, Central South University, Changsha 410083, China
- Qiong Yang
- College of Chemistry and Chemical Engineering, Central South University, Changsha 410083, China
- Jinyu Sun
- College of Chemistry and Chemical Engineering, Central South University, Changsha 410083, China
- Ming Wen
- College of Chemistry and Chemical Engineering, Central South University, Changsha 410083, China
- Xiao Kang
- College of Chemistry and Chemical Engineering, Central South University, Changsha 410083, China
- Zhimin Zhang
- College of Chemistry and Chemical Engineering, Central South University, Changsha 410083, China
- Hongmei Lu
- College of Chemistry and Chemical Engineering, Central South University, Changsha 410083, China; National International Collaborative Research Center for Medical Metabolomics, Central South University, Changsha 410083, China
10
Review of contemporary chemometric strategies applied on preparing GC–MS data in forensic analysis. Microchem J 2022. [DOI: 10.1016/j.microc.2022.107732] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 02/06/2023]
11
Unveiling Chemical Cues of Insect-Tree and Insect-Insect Interactions for the Eucalyptus Weevil and Its Egg Parasitoid by Multidimensional Gas Chromatographic Methods. Molecules 2022; 27:4042. [PMID: 35807301 PMCID: PMC9268296 DOI: 10.3390/molecules27134042] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/26/2022] [Revised: 06/17/2022] [Accepted: 06/17/2022] [Indexed: 02/01/2023] Open
Abstract
Multidimensional gas chromatography is presently an established and powerful analytical tool, owing to its higher resolving power compared with classical 1D chromatographic approaches. Applied to multiple areas, it allows the isolation, detection and identification of a larger number of compounds present in complex matrices, even in trace amounts. Research was conducted to determine which compounds, emitted by host plants of the eucalyptus weevil, Gonipterus platensis, might mediate host selection behavior. The identification of a pheromone blend of G. platensis is presented; the blend proved more attractive to weevils of both sexes than the individual compounds. The volatile organic compounds (VOCs) were collected by headspace solid phase microextraction (HS-SPME), MonoTrap™ disks, and simultaneous distillation-extraction (SDE). Combining one-dimensional (1D) and two-dimensional (2D) chromatographic systems (comprehensive and heart-cut two-dimensional gas chromatography, GC×GC and H/C-MD-GC, respectively) with mass spectrometry (MS) and electroantennographic (EAD) detection enabled the selection and identification of pertinent semiochemicals that were detected by the insect antennal olfactory system. The behavioral effect of a selected blend of compounds was assessed in a two-arm olfactometer with ten parallel walking chambers, coupled to video tracking and data analysis software. An active blend composed of cis- and trans-verbenol, verbenene, myrtenol and trans-pinocarveol was obtained.