1
Gaida M, Stefanuto PH, Focant JF. Theoretical modeling and machine learning-based data processing workflows in comprehensive two-dimensional gas chromatography: A review. J Chromatogr A 2023; 1711:464467. [PMID: 37871505 DOI: 10.1016/j.chroma.2023.464467] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/24/2023] [Revised: 10/15/2023] [Accepted: 10/17/2023] [Indexed: 10/25/2023]
Abstract
In recent years, comprehensive two-dimensional gas chromatography (GC × GC) has been gradually gaining prominence as a preferred method for the analysis of complex samples due to its higher peak capacity and resolution power compared to conventional gas chromatography (GC). Nonetheless, to fully benefit from the capabilities of GC × GC, a holistic approach to method development and data processing is essential for a successful and informative analysis. Method development enables the fine-tuning of the chromatographic separation, resulting in high-quality data. While generating such data is pivotal, it does not necessarily guarantee that meaningful information will be extracted from it. To this end, the first part of this manuscript reviews the importance of theoretical modeling in optimizing the separation conditions, ultimately improving the quality of the chromatographic separation. Multiple theoretical modeling approaches are discussed, with a special focus on thermodynamic-based modeling. The second part of this review highlights the importance of establishing robust data processing workflows, with a special emphasis on the use of advanced data processing tools such as machine learning (ML) algorithms. Three widely used ML algorithms are discussed: Random Forest (RF), Support Vector Machine (SVM), and Partial Least Squares-Discriminant Analysis (PLS-DA), highlighting their role in discovery-based analysis.
Affiliation(s)
- Meriem Gaida: Organic and Biological Analytical Chemistry Group (OBiAChem), MolSys Research Unit, Liège University, Belgium
- Pierre-Hugues Stefanuto: Organic and Biological Analytical Chemistry Group (OBiAChem), MolSys Research Unit, Liège University, Belgium
- Jean-François Focant: Organic and Biological Analytical Chemistry Group (OBiAChem), MolSys Research Unit, Liège University, Belgium
2
Singh YR, Shah DB, Kulkarni M, Patel SR, Maheshwari DG, Shah JS, Shah S. Current trends in chromatographic prediction using artificial intelligence and machine learning. Analytical Methods 2023; 15:2785-2797. [PMID: 37264667 DOI: 10.1039/d3ay00362k] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/03/2023]
Abstract
Artificial intelligence (AI) and machine learning (ML) have grown tremendously and are rapidly becoming popular in various fields of prediction because of their accuracy and speed. Machine learning algorithms use historical data to analyze or predict information from patterns or trends. AI and ML have been widely employed in chromatographic prediction and are particularly attractive options for liquid chromatography method development, as they can help achieve desired results faster, more accurately, and more efficiently. This review explores the AI and ML models employed in the determination of chromatographic characteristics. It also aims to provide insight into reported artificial neural network (ANN) techniques, which achieved better accuracy than classical linear models for predicting chromatographic characteristics in liquid chromatography, and it emphasizes the integration of a fuzzy system with an ANN, as this combination provides more efficient and accurate chromatographic prediction than linear models. The review also covers retention prediction for a target molecule using quantitative structure-retention relationship (QSRR) methodology combined with an ANN, a more effective technique than QSRR alone. This approach shows the benefits of combining AI or ML algorithms with QSRR to obtain more accurate retention predictions, underscoring the potential of AI and ML for overcoming difficulties in analytical chemistry.
Affiliation(s)
- Yash Raj Singh: Department of Pharmaceutical Quality Assurance, LJ Institute of Pharmacy, LJ University, Ahmedabad, Gujarat, India
- Darshil B Shah: Department of Pharmaceutical Quality Assurance, LJ Institute of Pharmacy, LJ University, Ahmedabad, Gujarat, India
- Mangesh Kulkarni: Department of Pharmaceutical Technology, LJ Institute of Pharmacy, LJ University, Ahmedabad, Gujarat, India
- Shreyanshu R Patel: Department of Pharmaceutical Technology, LJ Institute of Pharmacy, LJ University, Ahmedabad, Gujarat, India
- Dilip G Maheshwari: Department of Pharmaceutical Quality Assurance, LJ Institute of Pharmacy, LJ University, Ahmedabad, Gujarat, India
- Jignesh S Shah: Department of Pharmaceutical Regulatory Affairs, LJ Institute of Pharmacy, LJ University, Ahmedabad, Gujarat, India
- Shreeraj Shah: Department of Pharmaceutical Technology, LJ Institute of Pharmacy, LJ University, Ahmedabad, Gujarat, India
3
Bos TS, Knol WC, Molenaar SR, Niezen LE, Schoenmakers PJ, Somsen GW, Pirok BW. Recent applications of chemometrics in one- and two-dimensional chromatography. J Sep Sci 2020; 43:1678-1727. [PMID: 32096604 PMCID: PMC7317490 DOI: 10.1002/jssc.202000011] [Citation(s) in RCA: 32] [Impact Index Per Article: 8.0] [Received: 01/07/2020] [Revised: 02/20/2020] [Accepted: 02/21/2020] [Indexed: 12/28/2022]
Abstract
The proliferation of increasingly more sophisticated analytical separation systems, often incorporating increasingly more powerful detection techniques, such as high-resolution mass spectrometry, causes an urgent need for highly efficient data-analysis and optimization strategies. This is especially true for comprehensive two-dimensional chromatography applied to the separation of very complex samples. In this contribution, the requirement for chemometric tools is explained and the latest developments in approaches for (pre-)processing and analyzing data arising from one- and two-dimensional chromatography systems are reviewed. The final part of this review focuses on the application of chemometrics for method development and optimization.
Affiliation(s)
- Tijmen S. Bos: Division of Bioanalytical Chemistry, Amsterdam Institute for Molecules, Medicines and Systems, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands; Centre for Analytical Sciences Amsterdam (CASA), Amsterdam, The Netherlands
- Wouter C. Knol: Analytical Chemistry Group, van ’t Hoff Institute for Molecular Sciences, Faculty of Science, University of Amsterdam, Amsterdam, The Netherlands; Centre for Analytical Sciences Amsterdam (CASA), Amsterdam, The Netherlands
- Stef R.A. Molenaar: Analytical Chemistry Group, van ’t Hoff Institute for Molecular Sciences, Faculty of Science, University of Amsterdam, Amsterdam, The Netherlands; Centre for Analytical Sciences Amsterdam (CASA), Amsterdam, The Netherlands
- Leon E. Niezen: Analytical Chemistry Group, van ’t Hoff Institute for Molecular Sciences, Faculty of Science, University of Amsterdam, Amsterdam, The Netherlands; Centre for Analytical Sciences Amsterdam (CASA), Amsterdam, The Netherlands
- Peter J. Schoenmakers: Analytical Chemistry Group, van ’t Hoff Institute for Molecular Sciences, Faculty of Science, University of Amsterdam, Amsterdam, The Netherlands; Centre for Analytical Sciences Amsterdam (CASA), Amsterdam, The Netherlands
- Govert W. Somsen: Division of Bioanalytical Chemistry, Amsterdam Institute for Molecules, Medicines and Systems, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands; Centre for Analytical Sciences Amsterdam (CASA), Amsterdam, The Netherlands
- Bob W.J. Pirok: Analytical Chemistry Group, van ’t Hoff Institute for Molecular Sciences, Faculty of Science, University of Amsterdam, Amsterdam, The Netherlands; Centre for Analytical Sciences Amsterdam (CASA), Amsterdam, The Netherlands
4
Mommers J, van der Wal S. Column Selection and Optimization for Comprehensive Two-Dimensional Gas Chromatography: A Review. Crit Rev Anal Chem 2020; 51:183-202. [DOI: 10.1080/10408347.2019.1707643] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 10/25/2022]
Affiliation(s)
- John Mommers: DSM Material Science Center, Geleen, The Netherlands
- Sjoerd van der Wal: Polymer-Analysis Group, University of Amsterdam, Amsterdam, The Netherlands
5
Pojjanapornpun S, Kulsing C, Kakanopas P, Nolvachai Y, Aryusuk K, Krisnangkura K, Marriott PJ. Simulation of peak position and response profiles in comprehensive two-dimensional gas chromatography. J Chromatogr A 2019; 1607:460392. [DOI: 10.1016/j.chroma.2019.460392] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 04/18/2019] [Revised: 07/04/2019] [Accepted: 07/21/2019] [Indexed: 10/26/2022]
6
Characterisation of Gas-Chromatographic Poly(Siloxane) Stationary Phases by Theoretical Molecular Descriptors and Prediction of McReynolds Constants. Int J Mol Sci 2019; 20:2120. [PMID: 31035726 PMCID: PMC6539345 DOI: 10.3390/ijms20092120] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 03/20/2019] [Revised: 04/23/2019] [Accepted: 04/25/2019] [Indexed: 12/01/2022]
Abstract
Retention in gas–liquid chromatography is mainly governed by the extent of intermolecular interactions between the solute and the stationary phase. While molecular descriptors of computational origin are commonly used to encode the effect of the solute structure in quantitative structure–retention relationship (QSRR) approaches, characterisation of stationary phases is historically based on empirical scales, the McReynolds system of phase constants being one of the most popular. In this work, poly(siloxane) stationary phases, which occupy a dominant position in modern gas–liquid chromatography, were characterised by theoretical molecular descriptors. With this aim, the first five McReynolds constants of 29 columns were modelled by multilinear regression (MLR) coupled with genetic algorithm (GA) variable selection applied to the molecular descriptors provided by the Dragon software. The generalisation ability of the established GA-MLR models, evaluated by both external prediction and repeated calibration/evaluation splitting, was better than that reported in analogous studies regarding nonpolymeric (molecular) stationary phases. Principal component analysis on the significant molecular descriptors made it possible to classify the poly(siloxanes) according to their chemical composition and partitioning properties. Development of QSRR-based models combining molecular descriptors of both solutes and stationary phases, which will be applied to transfer retention data among different columns, is in progress.
7
Artificial Neural Network Prediction of Retention of Amino Acids in Reversed-Phase HPLC under Application of Linear Organic Modifier Gradients and/or pH Gradients. Molecules 2019; 24:632. [PMID: 30754702 PMCID: PMC6384946 DOI: 10.3390/molecules24030632] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 01/21/2019] [Revised: 02/06/2019] [Accepted: 02/07/2019] [Indexed: 12/29/2022]
Abstract
A multi-layer artificial neural network (ANN) was used to model the retention behavior of 16 o-phthalaldehyde derivatives of amino acids in reversed-phase liquid chromatography under application of various gradient elution modes. The retention data, taken from literature, were collected in acetonitrile–water eluents under application of linear organic modifier gradients (φ gradients), pH gradients, or double pH/φ gradients. At first, retention data collected in φ gradients and pH gradients were modeled separately; these were subsequently combined in one dataset and fitted simultaneously. Specific ANN-based models were generated by combining the descriptors of the gradient profiles with 16 inputs representing the amino acids and providing the retention time of these solutes as the response. Categorical "bit-string" descriptors were adopted to identify the solutes, which allowed simultaneous modeling of the retention times of all 16 target amino acids. The ANN-based models tested on external gradients provided mean errors for the predicted retention times of 1.1% (φ gradients), 1.4% (pH gradients), 2.5% (combined φ and pH gradients), and 2.5% (double pH/φ gradients). The accuracy of ANN prediction was better than that previously obtained by fitting the same data with retention models based on the solution of the fundamental equation of gradient elution.
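The input layout described in this abstract (gradient-profile descriptors joined to a 16-position categorical "bit string" that identifies the solute) can be illustrated with a minimal sketch. The function names and the three example gradient descriptors are hypothetical and are not taken from the paper; only the count of 16 solutes comes from the abstract.

```python
# Minimal sketch of a one-hot "bit-string" solute descriptor joined to
# gradient-profile descriptors. Names and example values are assumptions.

N_SOLUTES = 16  # the abstract models 16 amino acid derivatives


def solute_bit_string(index, n_solutes=N_SOLUTES):
    """Return a list of n_solutes floats with a single 1.0 at `index`."""
    if not 0 <= index < n_solutes:
        raise ValueError("solute index out of range")
    return [1.0 if i == index else 0.0 for i in range(n_solutes)]


def build_input(gradient_descriptors, solute_index):
    """Concatenate gradient descriptors with the solute bit string."""
    gradient = [float(v) for v in gradient_descriptors]
    return gradient + solute_bit_string(solute_index)


# Hypothetical gradient profile: initial modifier fraction, final modifier
# fraction, and gradient time in minutes (illustrative values only).
x = build_input([0.20, 0.60, 30.0], solute_index=3)
print(len(x))
```

Because each solute occupies its own input position, a single network can be trained on all solutes at once, which is the property the abstract attributes to the bit-string descriptors.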
8
Retention time prediction in thermally modulated comprehensive two-dimensional gas chromatography: Correcting second dimension retention time modeling error. J Chromatogr A 2018; 1581-1582:116-124. [DOI: 10.1016/j.chroma.2018.10.054] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 04/20/2018] [Revised: 10/25/2018] [Accepted: 10/29/2018] [Indexed: 11/18/2022]
9
Retention-time prediction in comprehensive two-dimensional gas chromatography to aid identification of unknown contaminants. Anal Bioanal Chem 2018; 410:7931-7941. [PMID: 30361914 PMCID: PMC6244764 DOI: 10.1007/s00216-018-1415-x] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 07/26/2018] [Revised: 09/27/2018] [Accepted: 10/02/2018] [Indexed: 11/29/2022]
Abstract
Comprehensive two-dimensional (2D) gas chromatography (GC × GC) coupled to mass spectrometry (MS, GC × GC-MS), which enhances selectivity compared to GC-MS analysis, can be used for non-directed analysis (non-target screening) of environmental samples. Additional tools that aid in identifying unknown compounds are needed to handle the large amount of data generated. These tools include retention indices for characterizing relative retention of compounds and prediction of such. In this study, two quantitative structure–retention relationship (QSRR) approaches for prediction of retention times (¹tR and ²tR) and indices (linear retention indices (LRIs) and a new polyethylene glycol–based retention index (PEG-²I)) in GC × GC were explored, and their predictive power compared. In the first method, molecular descriptors combined with partial least squares (PLS) analysis were used to predict times and indices. In the second method, the commercial software package ChromGenius (ACD/Labs), based on a “federation of local models,” was employed. Overall, the PLS approach exhibited better accuracy than the ChromGenius approach. Although average errors for the LRI prediction via ChromGenius were slightly lower, PLS was superior in all other cases. The average deviations between the predicted and the experimental value were 5% and 3% for the ¹tR and LRI, and 5% and 12% for the ²tR and PEG-²I, respectively. These results are comparable to or better than those reported in previous studies. Finally, the developed model was successfully applied to an independent dataset and led to the discovery of 12 wrongly assigned compounds. The results of the present work represent the first-ever prediction of the PEG-²I.
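For readers unfamiliar with the linear retention indices mentioned in this abstract, the sketch below computes an LRI by linear interpolation between bracketing n-alkanes, following the commonly used van den Dool and Kratz definition for temperature-programmed GC. This is a general illustration, not the workflow of the cited study; the function name and the alkane retention times are hypothetical.

```python
from bisect import bisect_right


def linear_retention_index(t_x, alkane_times):
    """Interpolate a linear retention index for retention time t_x.

    alkane_times maps n-alkane carbon number -> retention time, all
    measured under the same temperature program as the analyte.
    """
    carbons = sorted(alkane_times)
    times = [alkane_times[c] for c in carbons]
    if len(times) < 2:
        raise ValueError("need at least two reference alkanes")
    if any(later <= earlier for earlier, later in zip(times, times[1:])):
        raise ValueError("alkane retention times must increase with carbon number")
    if not times[0] <= t_x <= times[-1]:
        raise ValueError("analyte elutes outside the alkane bracket")

    # Index of the last alkane eluting at or before the analyte,
    # clamped so that a following alkane always exists.
    pos = min(bisect_right(times, t_x) - 1, len(times) - 2)
    n, t_n = carbons[pos], times[pos]
    n_next, t_next = carbons[pos + 1], times[pos + 1]
    return 100.0 * (n + (n_next - n) * (t_x - t_n) / (t_next - t_n))


# Hypothetical retention times in minutes for C10 to C12.
alkanes = {10: 8.0, 11: 10.0, 12: 12.5}
print(linear_retention_index(9.0, alkanes))
```

An analyte eluting halfway between decane and undecane in this made-up example lands halfway between the index values 1000 and 1100.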
10
Zhokhov AK, Loskutov AY, Rybal’chenko IV. Methodological Approaches to the Calculation and Prediction of Retention Indices in Capillary Gas Chromatography. Journal of Analytical Chemistry 2018. [DOI: 10.1134/s1061934818030127] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Indexed: 11/23/2022]
11
Application of a quantitative structure retention relationship approach for the prediction of the two-dimensional gas chromatography retention times of polycyclic aromatic sulfur heterocycle compounds. J Chromatogr A 2016; 1437:191-202. [DOI: 10.1016/j.chroma.2016.02.006] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 11/10/2015] [Revised: 01/28/2016] [Accepted: 02/01/2016] [Indexed: 10/22/2022]
12
Weggler BA, Gröger T, Zimmermann R. Advanced scripting for the automated profiling of two-dimensional gas chromatography-time-of-flight mass spectrometry data from combustion aerosol. J Chromatogr A 2014; 1364:241-8. [PMID: 25234498 DOI: 10.1016/j.chroma.2014.08.091] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.8] [Received: 03/21/2014] [Revised: 07/28/2014] [Accepted: 08/27/2014] [Indexed: 11/19/2022]
Abstract
Multidimensional gas chromatography is an appropriate tool for the non-targeted and comprehensive characterisation of complex samples generated from combustion processes. Particulate matter (PM) emission is composed of a large number of compounds, including condensed semi-volatile organic compounds (SVOCs). However, the complex amount of information gained from such comprehensive techniques is associated with difficult and time-consuming data analysis. Because of this obstacle, two-dimensional gas chromatography still receives relatively little use in aerosol science [1-4]. To remedy this problem, advanced scripting algorithms based on knowledge-based rules (KBRs) were developed in-house and applied to GCxGC-TOFMS data. Previously reported KBRs and newer findings were considered for the development of these algorithms. The novelty of the presented advanced scripting tools is a notably selective search criterion for data screening, which is primarily based on fragmentation patterns and the presence of specific fragments. Combined with "classical" approaches based on retention times, a fast, accurate and automated data evaluation method was developed, which was evaluated qualitatively and quantitatively for type 1 and type 2 errors. The method's applicability was further tested for PM filter samples obtained from ship fuel combustion. Major substance classes, including polycyclic aromatic hydrocarbons (PAH), alkanes, benzenes, esters and ethers, can be targeted. This approach allows the classification of approximately 75% of the peaks of interest within real PM samples. Various conditions of combustion, such as fuel composition and engine load, could be clearly characterised and differentiated.
Affiliation(s)
- Benedikt A Weggler: Joint Mass Spectrometry Centre, Cooperation Group "Comprehensive Molecular Analytics", Helmholtz Zentrum Muenchen, D85764 Neuherberg, Germany; Joint Mass Spectrometry Centre, Institute of Chemistry, Chair of Analytical Chemistry, University of Rostock, D18057 Rostock, Germany; Helmholtz Virtual Institute of Complex Molecular Systems in Environmental Health - Aerosol and Health (HICE)
- Thomas Gröger: Joint Mass Spectrometry Centre, Cooperation Group "Comprehensive Molecular Analytics", Helmholtz Zentrum Muenchen, D85764 Neuherberg, Germany; Joint Mass Spectrometry Centre, Institute of Chemistry, Chair of Analytical Chemistry, University of Rostock, D18057 Rostock, Germany
- Ralf Zimmermann: Joint Mass Spectrometry Centre, Cooperation Group "Comprehensive Molecular Analytics", Helmholtz Zentrum Muenchen, D85764 Neuherberg, Germany; Joint Mass Spectrometry Centre, Institute of Chemistry, Chair of Analytical Chemistry, University of Rostock, D18057 Rostock, Germany; Helmholtz Virtual Institute of Complex Molecular Systems in Environmental Health - Aerosol and Health (HICE)
13
Interpretation of comprehensive two-dimensional gas chromatography data using advanced chemometrics. Trends Analyt Chem 2014. [DOI: 10.1016/j.trac.2013.08.009] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.4] [Indexed: 12/17/2022]
14
Giaginis C, Tsantili-Kakoulidou A. Quantitative Structure–Retention Relationships as Useful Tool to Characterize Chromatographic Systems and Their Potential to Simulate Biological Processes. Chromatographia 2012. [DOI: 10.1007/s10337-012-2374-6] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.0] [Indexed: 11/29/2022]
15
Seeley JV, Seeley SK. Multidimensional Gas Chromatography: Fundamental Advances and New Applications. Anal Chem 2012; 85:557-78. [DOI: 10.1021/ac303195u] [Citation(s) in RCA: 183] [Impact Index Per Article: 15.3] [Indexed: 12/17/2022]
Affiliation(s)
- John V. Seeley: Oakland University, Department of Chemistry, Rochester, Michigan, 48309
- Stacy K. Seeley: Kettering University, Department of Chemistry and Biochemistry, 1700 University Avenue, Flint, Michigan, 48504
16
Prediction of retention times in comprehensive two-dimensional gas chromatography using thermodynamic models. J Chromatogr A 2012; 1255:184-9. [DOI: 10.1016/j.chroma.2012.02.023] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.3] [Received: 12/01/2011] [Revised: 02/02/2012] [Accepted: 02/08/2012] [Indexed: 11/17/2022]
17
Predictions of comprehensive two-dimensional gas chromatography separations from isothermal data. J Chromatogr A 2012; 1233:147-51. [DOI: 10.1016/j.chroma.2012.02.032] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.3] [Received: 12/08/2011] [Revised: 02/13/2012] [Accepted: 02/13/2012] [Indexed: 11/23/2022]
18
D’Archivio AA, Incani A, Ruggieri F. Cross-column prediction of gas-chromatographic retention of polychlorinated biphenyls by artificial neural networks. J Chromatogr A 2011; 1218:8679-90. [DOI: 10.1016/j.chroma.2011.09.071] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.6] [Received: 07/14/2011] [Revised: 09/26/2011] [Accepted: 09/27/2011] [Indexed: 10/17/2022]
19
Chemometrics in comprehensive multidimensional separations. Anal Bioanal Chem 2011; 401:2373-86. [DOI: 10.1007/s00216-011-5139-4] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.4] [Received: 04/11/2011] [Revised: 05/22/2011] [Accepted: 05/23/2011] [Indexed: 10/18/2022]