1
|
Lasfar R, Tóth G. Patch seriation to visualize data and model parameters. J Cheminform 2023; 15:78. [PMID: 37689697 PMCID: PMC10492365 DOI: 10.1186/s13321-023-00757-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/05/2023] [Accepted: 08/31/2023] [Indexed: 09/11/2023]
Abstract
We developed a new seriation merit function for enhancing the visual information of data matrices. A local similarity matrix is calculated, where the average similarity of neighbouring objects is calculated in a limited variable space and a global function is constructed to maximize the local similarities and cluster them into patches by simple row and column ordering. The method identifies data clusters in a powerful way, if the similarity of objects is caused by some variables and these variables differ for the distinct clusters. The method can be used in the presence of missing data and also on more than two-dimensional data arrays. We show the feasibility of the method on different data sets: on QSAR, chemical, material science, food science, cheminformatics and environmental data in two- and three-dimensional cases. The method can be used during the development and the interpretation of artificial neural network models by seriating different features of the models. It helps to identify interpretable models by elucidating clusters of objects, variables and hidden layer neurons.
Affiliation(s)
- Rita Lasfar
- Institute of Chemistry, Eötvös Loránd University, Pázmány sétány 1/a, Budapest, 1117, Hungary
- Gergely Tóth
- Institute of Chemistry, Eötvös Loránd University, Pázmány sétány 1/a, Budapest, 1117, Hungary.
|
2
|
Quantifying orthogonality and separability: A method for optimizing resin selection and design. J Chromatogr A 2020; 1628:461429. [DOI: 10.1016/j.chroma.2020.461429] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 03/30/2020] [Revised: 07/10/2020] [Accepted: 07/22/2020] [Indexed: 01/27/2023]
|
3
|
|
4
|
How to compare separation selectivity of high-performance liquid chromatographic columns properly? J Chromatogr A 2017; 1488:45-56. [DOI: 10.1016/j.chroma.2017.01.066] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 11/25/2016] [Revised: 01/23/2017] [Accepted: 01/24/2017] [Indexed: 11/24/2022]
|
5
|
Clustering and diagnostic modelling of slimming aids based on chromatographic and mass spectrometric fingerprints. Drug Test Anal 2016; 9:230-242. [DOI: 10.1002/dta.1964] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 12/03/2015] [Revised: 01/23/2016] [Accepted: 01/24/2016] [Indexed: 01/03/2023]
|
6
|
Galea C, Mangelings D, Heyden YV. Method development for impurity profiling in SFC: The selection of a dissimilar set of stationary phases. J Pharm Biomed Anal 2015; 111:333-43. [DOI: 10.1016/j.jpba.2014.12.043] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.0] [Received: 10/15/2014] [Revised: 12/23/2014] [Accepted: 12/27/2014] [Indexed: 11/24/2022]
|
7
|
Exploratory data analysis as a tool for similarity assessment and clustering of chiral polysaccharide-based systems used to separate pharmaceuticals in supercritical fluid chromatography. J Chromatogr A 2014; 1326:110-24. [DOI: 10.1016/j.chroma.2013.12.052] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.4] [Received: 08/30/2013] [Revised: 12/13/2013] [Accepted: 12/17/2013] [Indexed: 11/20/2022]
|
8
|
Baertschi SW, Pack BW, Hoaglund Hyzer CS, Nussbaum MA. Assessing mass balance in pharmaceutical drug products: New insights into an old topic. Trends Analyt Chem 2013. [DOI: 10.1016/j.trac.2013.06.006] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Indexed: 11/16/2022]
|
9
|
Deconinck E, Verstraete T, Van Gyseghem E, Vander Heyden Y, Coomans D. Orthogonal Chromatographic Descriptors for Modelling Caco-2 Drug Permeability. J Chromatogr Sci 2012; 50:175-83. [DOI: 10.1093/chromsci/bmr044] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 01/22/2023]
|
10
|
Al Bakain R, Rivals I, Sassiat P, Thiébaut D, Hennion MC, Euvrard G, Vial J. Comparison of different statistical approaches to evaluate the orthogonality of chromatographic separations: Application to reverse phase systems. J Chromatogr A 2011; 1218:2963-75. [DOI: 10.1016/j.chroma.2011.03.031] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.5] [Received: 09/22/2010] [Revised: 03/09/2011] [Accepted: 03/15/2011] [Indexed: 10/18/2022]
|
11
|
Dumarey M, Vander Heyden Y, Rutan SC. Evaluation of the identification power of RPLC analyses in the screening for drug compounds. Anal Chem 2010; 82:6056-65. [PMID: 20578680 DOI: 10.1021/ac1006415] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 11/29/2022]
Abstract
The identification of drugs of abuse is an important issue in forensic science. The main goal is to trace and identify as many drugs as possible in the shortest possible time, preferably with a simple analysis method. One possibility is to screen samples using a Liquid Chromatography-Diode Array Detection (LC-DAD) system. However, when simultaneously performing another analysis on a chromatographic column exhibiting selectivity differences from the first one, that is, orthogonal or dissimilar columns, a greater number of drugs can possibly be identified without investing a lot of extra time or money. The primary difficulty is then selecting the most appropriate columns. In this paper, it is demonstrated that selecting the most dissimilar columns based on measures such as correlation or Snyder's F(s) value is not optimal, because these measures do not take into account the identification power of the individual systems. This implies that a large number of drugs may not necessarily be identified on the systems selected using these criteria. Therefore, three other measures are tested to evaluate the identification power obtained by parallel screening on two columns or by comprehensive two-dimensional LC (LC x LC). The simplest approach is counting the number of compounds separable with a difference in retention time greater than a predefined critical value. However, this measure reflects neither the coelution pattern of the unidentified drugs nor the separation degree of all compounds. The second tested measure, information, enables differentiation between systems identifying the same number of compounds but resulting in a different coelution pattern. Multivariate selectivity, the third tested parameter, takes into account the degree of separation of all compounds and has the advantage that it reflects the gain in identification power achieved by introducing DAD data. All three proposed measures also enable evaluation of whether the corresponding LC x LC method will result in a greater identification power.
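The simplest of the three measures, counting the compounds whose retention time differs from that of every other compound by more than a critical value, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the retention times and the critical value are hypothetical, and treating a compound as identified when it is resolved on at least one of the two columns is an assumption about how parallel screening combines the systems.

```python
def separable_on_system(times, critical):
    """Indices of compounds whose retention time differs from that of
    every other compound by more than `critical` (same time unit)."""
    return {
        i
        for i, t in enumerate(times)
        if all(abs(t - u) > critical for j, u in enumerate(times) if j != i)
    }


def identified_in_parallel(times_a, times_b, critical):
    """Assumption: a compound counts as identified if it is separable
    on at least one of the two columns screened in parallel."""
    return separable_on_system(times_a, critical) | separable_on_system(times_b, critical)


# Hypothetical retention times (min) of four compounds on two columns.
column_a = [1.0, 1.1, 3.0, 5.0]   # compounds 0 and 1 coelute
column_b = [2.0, 4.0, 4.05, 6.0]  # compounds 1 and 2 coelute
identified = identified_in_parallel(column_a, column_b, critical=0.2)
print(sorted(identified), len(identified))
```

As the abstract notes, a count of this kind says nothing about how the unresolved compounds coelute, which is what the information and multivariate selectivity measures are meant to capture.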
Affiliation(s)
- Melanie Dumarey
- Analytical Chemistry and Pharmaceutical Technology, Vrije Universiteit Brussel, Laarbeeklaan 103, 1090 Brussels, Belgium
|
12
|
Zapadlo M, Krupčík J, Májek P, Armstrong DW, Sandra P. Use of a polar ionic liquid as second column for the comprehensive two-dimensional GC separation of PCBs. J Chromatogr A 2010; 1217:5859-67. [DOI: 10.1016/j.chroma.2010.07.024] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.1] [Received: 03/03/2010] [Revised: 07/08/2010] [Accepted: 07/13/2010] [Indexed: 11/17/2022]
|
13
|
Dumarey M, Smets I, Vander Heyden Y. Prediction and interpretation of the antioxidant capacity of green tea from dissimilar chromatographic fingerprints. J Chromatogr B Analyt Technol Biomed Life Sci 2010; 878:2733-40. [PMID: 20829123 DOI: 10.1016/j.jchromb.2010.08.012] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.5] [Received: 12/14/2009] [Revised: 07/26/2010] [Accepted: 08/14/2010] [Indexed: 11/25/2022]
Abstract
Previously, multivariate calibration techniques have been successfully applied to model and predict the antioxidant activity of green tea from its chromatographic fingerprint. Since the selectivity differences between dissimilar chromatographic systems have already proved valuable in several applications, this paper studies whether combining the complementary information contained in two dissimilar fingerprints can improve the predictive capacity of the multivariate calibration model. The simplest way of combining the data is concatenating both fingerprints for each sample. The resulting matrix can then be subjected to Orthogonal Projections to Latent Structures (O-PLS). Unfortunately, this approach resulted in a more complex model with a prediction error of about the average of the errors obtained with the individual fingerprints. Secondly, only the peaks with high loading and low orthogonal loading from both chromatograms were included in the O-PLS model. This resulted in a reduced complexity, but not in better predictions, probably due to a lack of complementarity of the information concerning the antioxidant capacity. Finally, the concatenated fingerprints were subjected to stepwise multiple linear regression (MLR) in order to build a model based on the variables most correlated with the antioxidant capacity. The obtained prediction error was lower than those of both previous approaches, but still higher than the error of the model based on a single analysis. This is probably again caused by a lack of complementarity in the variables. Nevertheless, it was advantageous to develop fingerprints on dissimilar systems, because this makes it possible to choose the most suitable chromatographic profile for building a multivariate calibration model for the considered purpose. Contrary to expectation, the study showed that the simplest (and thus the worst separated) fingerprints resulted in the best predictions. On the other hand, a more complex fingerprint in which more compounds are separated is still important to improve the interpretability of the model.
Affiliation(s)
- M Dumarey
- Vrije Universiteit Brussel (VUB), Department of Analytical Chemistry and Pharmaceutical Technology, Laarbeeklaan 103, 1090 Brussels, Belgium
|
14
|
|
15
|
Dumarey M, Sneyers R, Janssens W, Somers I, Vander Heyden Y. Drug impurity profiling: Method optimization on dissimilar chromatographic systems: Part I: pH optimization of the aqueous phase. Anal Chim Acta 2009; 656:85-92. [PMID: 19932818 DOI: 10.1016/j.aca.2009.10.013] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 07/09/2009] [Revised: 09/29/2009] [Accepted: 10/10/2009] [Indexed: 10/20/2022]
Abstract
The use of dissimilar chromatographic systems in drug impurity profiling can be very advantageous. Screening a new-drug impurity mixture on those systems not only enhances the chance that all impurities are revealed, but also allows choosing a suitable system for further method development. In this paper several strategies were evaluated to predict the optimal pH (of the buffer used in the mobile phase) from the screening results. Four or five dissimilar stationary phases were screened at four pH values (between 2.5 and 9.4), in order to obtain maximal information about the composition of the sample and to select one column for the subsequent optimization. Different linear models (straight lines, 2nd and 3rd degree polynomials) based on these experiments were tested for their ability to predict the retention times (t(R)) of the impurities at intermediate pH values. The predicted t(R) values were then used to calculate minimal resolutions and eventually to select an optimal pH at which the highest minimal resolution is predicted. None of the applied models is accurate enough to predict correctly which peaks are worst separated at the indicated optimal pH. However, the best strategy (applying a second degree polynomial describing the t(R) measured at 3 consecutive screening pH values) did succeed in indicating an optimal pH at which a good separation of the impurities is obtained. Unfortunately, the resulting separation quality is no better, or only slightly better, than the best separation obtained during screening. Therefore, it can be concluded that the most (time-) efficient approach to develop an impurity profile of a new drug is to screen it on four or five dissimilar columns at four different pH values and to retain the best screening conditions (without making predictions for intermediate conditions) for further optimization of the organic modifier composition of the mobile phase, and occasionally the temperature and the gradient. This is at least the case when the profiles have a complexity similar to those studied.
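The best-performing strategy described above, a second degree polynomial through the t(R) values measured at three consecutive screening pH values, followed by a search for the pH with the highest predicted minimal resolution, can be sketched as below. This is an illustrative sketch under stated assumptions rather than the authors' procedure: all names and numbers are hypothetical, and because the abstract gives no peak widths, every peak is assumed to have the same baseline width, so that the resolution of two adjacent peaks reduces to their retention-time difference divided by that width.

```python
def quadratic_through(points):
    """Second degree polynomial (Lagrange form) through three (pH, tR) points."""
    (x0, y0), (x1, y1), (x2, y2) = points

    def predict(x):
        return (
            y0 * (x - x1) * (x - x2) / ((x0 - x1) * (x0 - x2))
            + y1 * (x - x0) * (x - x2) / ((x1 - x0) * (x1 - x2))
            + y2 * (x - x0) * (x - x1) / ((x2 - x0) * (x2 - x1))
        )

    return predict


def minimal_resolution(times, width):
    """Smallest resolution between adjacent peaks.
    Assumption: every peak has the same baseline width `width`."""
    ordered = sorted(times)
    return min((b - a) / width for a, b in zip(ordered, ordered[1:]))


def best_ph(screening, ph_grid, width):
    """pH on `ph_grid` with the highest predicted minimal resolution.
    `screening` holds, per impurity, three (pH, tR) screening points."""
    models = [quadratic_through(points) for points in screening]
    return max(ph_grid, key=lambda ph: minimal_resolution([m(ph) for m in models], width))


# Hypothetical screening data for two impurities at three pH values.
screening = [
    [(2.5, 5.0), (4.8, 5.0), (7.0, 5.0)],  # retention independent of pH
    [(2.5, 2.5), (4.8, 4.8), (7.0, 7.0)],  # retention increasing with pH
]
print(best_ph(screening, ph_grid=[3.0, 4.0, 5.0], width=0.5))
```

The abstract's own conclusion is a useful caution here: an interpolated optimum of this kind was not, or only slightly, better than the best condition already found during screening.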
Affiliation(s)
- M Dumarey
- Analytical Chemistry and Pharmaceutical Technology, Vrije Universiteit Brussel-VUB, Laarbeeklaan 103, 1090 Brussels, Belgium
|
16
|
West C, Lesellier E. Orthogonal screening system of columns for supercritical fluid chromatography. J Chromatogr A 2008; 1203:105-13. [DOI: 10.1016/j.chroma.2008.07.016] [Citation(s) in RCA: 64] [Impact Index Per Article: 4.0] [Received: 05/14/2008] [Revised: 07/04/2008] [Accepted: 07/08/2008] [Indexed: 11/30/2022]
|
17
|
Dissimilar or orthogonal reversed-phase chromatographic systems: A comparison of selection techniques. Anal Chim Acta 2008; 609:223-34. [PMID: 18261518 DOI: 10.1016/j.aca.2007.12.047] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.1] [Received: 10/24/2007] [Revised: 12/11/2007] [Accepted: 12/20/2007] [Indexed: 11/24/2022]
Abstract
Developing an analytical separation procedure for an unknown mixture is a challenging issue. An important example is the separation and quantification of a new drug and its impurities. One approach to start method development is the screening of the mixture on dissimilar chromatographic systems, i.e. systems with large selectivity differences. After screening, the most suited system is retained for further method development. In a step prior to such strategy dissimilar chromatographic systems need to be selected. In this paper the performance of different chemometric selection approaches, described in the literature, was visually evaluated and compared. Additionally, orthogonal projection approach (OPA) was tested as another potential selection method. All techniques, including the OPA method, were able to select (a set of) dissimilar chromatographic systems and many similarities between the selections were observed. However, the Kennard and Stone algorithm performed best in selecting the most dissimilar systems in the earliest steps of the selection procedure. The generalized pairwise correlation method (GPCM) and the auto-associative multivariate regression trees (AAMRT) were also performing well. OPA and weighted pair group method using arithmetic averages (WPGMA) are less preferable.
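The Kennard and Stone algorithm mentioned above selects objects one at a time so that each new pick is as far as possible from everything already chosen. A minimal sketch of that max-min rule follows. It is not taken from the paper: the function name and the data are hypothetical, each chromatographic system is assumed to be described by a vector of retention values for a common set of test compounds, and Euclidean distance is used for simplicity, although a correlation-based dissimilarity could equally be substituted.

```python
import math


def kennard_stone(points, n_select):
    """Indices of `n_select` points chosen by the Kennard and Stone rule:
    start with the two most distant points, then repeatedly add the point
    whose distance to its nearest already-selected point is largest."""
    n = len(points)
    if not 2 <= n_select <= n:
        raise ValueError("n_select must lie between 2 and the number of points")
    # Assumption: Euclidean distance between retention vectors.
    dist = [[math.dist(p, q) for q in points] for p in points]
    first, second = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda pair: dist[pair[0]][pair[1]],
    )
    selected = [first, second]
    while len(selected) < n_select:
        remaining = [i for i in range(n) if i not in selected]
        selected.append(max(remaining, key=lambda i: min(dist[i][s] for s in selected)))
    return selected


# Hypothetical retention vectors (two test compounds) for four systems.
systems = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (5.0, 0.0)]
print(kennard_stone(systems, 3))
```

Because the two most distant systems are taken first, the most dissimilar candidates appear in the earliest selection steps, which is the property the abstract singles out.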
|
18
|
Review on modelling aspects in reversed-phase liquid chromatographic quantitative structure–retention relationships. Anal Chim Acta 2007; 602:164-72. [DOI: 10.1016/j.aca.2007.09.014] [Citation(s) in RCA: 98] [Impact Index Per Article: 5.8] [Received: 06/13/2007] [Revised: 09/03/2007] [Accepted: 09/04/2007] [Indexed: 11/22/2022]
|
19
|
Pierce KM, Hoggard JC, Mohler RE, Synovec RE. Recent advancements in comprehensive two-dimensional separations with chemometrics. J Chromatogr A 2007; 1184:341-52. [PMID: 17697686 DOI: 10.1016/j.chroma.2007.07.059] [Citation(s) in RCA: 134] [Impact Index Per Article: 7.9] [Received: 05/08/2007] [Revised: 07/17/2007] [Accepted: 07/18/2007] [Indexed: 11/30/2022]
Abstract
Comprehensive two-dimensional (2D) separations provide the analyst with a tremendous amount of complex data. In order to glean useful information from this complex data, advancements in commercially available software that implement chemometrics are currently available and continue to evolve. Future advancements will no doubt involve commercializing (or adapting) specialized, in-house chemometric techniques that are currently found only in the hands of technical experts and researchers in industry, government, and academia. In order to make timely advancements, future commercialization of novel chemometric techniques should involve collaborations among instrument software manufacturers, professional programmers, technical experts, and researchers. During the last decade, this field has seen a steady advancement from single analyte target analysis to comprehensive non-target analysis of entire multidimensional sample profiles (involving sample classification and/or data mining for discovery-based sample comparisons). The advancements in instrumentation and chemometric software tools have a tremendous impact in various applications: fuels, food, environmental, pharmaceuticals, metabolomics, etc. Most of the development has been for software to apply with gas chromatography-based instrumentation, such as comprehensive two-dimensional gas chromatography (GC x GC) and comprehensive two-dimensional gas chromatography with time-of-flight mass spectrometry (GC x GC-TOF-MS). More recently there have been notable advancements in liquid-phase instrumentation as well.
Affiliation(s)
- Karisa M Pierce
- Department of Chemistry, Box 351700, University of Washington, Seattle, WA 98195-1700 USA
|
20
|
Flumignan DL, Tininis AG, Ferreira FDO, de Oliveira JE. Screening Brazilian C gasoline quality: Application of the SIMCA chemometric method to gas chromatographic data. Anal Chim Acta 2007; 595:128-35. [PMID: 17605992 DOI: 10.1016/j.aca.2007.02.049] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.9] [Received: 09/12/2006] [Revised: 02/03/2007] [Accepted: 02/16/2007] [Indexed: 11/30/2022]
Abstract
A total of 2400 samples of commercial Brazilian C gasoline were collected over a 6-month period from different gas stations in the São Paulo state, Brazil, and analysed with respect to 12 physicochemical parameters according to regulation 309 of the Brazilian Government Petroleum, Natural Gas and Biofuels Agency (ANP). The percentages (v/v) of hydrocarbons (olefins, aromatics and saturated) were also determined. Hierarchical cluster analysis (HCA) was employed to select 150 representative samples that exhibited least similarity on the basis of their physicochemical parameters and hydrocarbon compositions. The chromatographic profiles of the selected samples were measured by gas chromatography with flame ionisation detection and analysed using the soft independent modelling of class analogy (SIMCA) method in order to create a classification scheme to identify gasolines that conform to ANP regulation 309. Following the optimisation of the SIMCA algorithm, it was possible to classify correctly 96% of the commercial gasoline samples present in the training set of 100. In order to check the quality of the model, an external group of 50 gasoline samples (the prediction set) was analysed and the developed SIMCA model classified 94% of these correctly. The developed chemometric method is recommended for screening commercial gasoline quality and detection of potential adulteration.
Affiliation(s)
- Danilo Luiz Flumignan
- Centro de Monitoramento e Pesquisa da Qualidade de Combustíveis, Petróleo e Derivados, Departamento de Química Orgânica, Instituto de Química de Araraquara, Universidade Estadual Paulista Júlio de Mesquita Filho, Araraquara, SP, Brazil
|
21
|
Put R, Vander Heyden Y. The evaluation of two-step multivariate adaptive regression splines for chromatographic retention prediction of peptides. Proteomics 2007; 7:1664-77. [PMID: 17443841 DOI: 10.1002/pmic.200600676] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.8] [Indexed: 11/09/2022]
Abstract
Both the multivariate adaptive regression splines (MARS) and the two-step MARS (TMARS) methodologies were applied in a quantitative structure-retention relationship (QSRR) context. For seven RPLC systems, QSRR models were built that describe the retention times of a set of peptides using a large set of molecular descriptors as potential predictor variables. The use of QSRR models for chromatographic retention prediction of peptides may be valuable in proteomic research to improve the number of correct peptide identifications. In every case, 70% of the peptides were used to derive the QSRR models (calibration set), whereas the remaining 30% were treated as an independent external test set. For four systems, the models obtained by TMARS have better predictive abilities than the MARS models. The MARS and TMARS model performance was compared with that of other multivariate modelling techniques. For five out of seven systems it was observed that the uninformative variable elimination by the partial least squares (PLS) approach outperforms all other methods studied. For three systems predictive errors smaller than 30 s were obtained. PLS regression and a multiple linear regression model based on three descriptors led to the best predictivities for the remaining two systems.
Affiliation(s)
- Raf Put
- FABI, Department of Analytical Chemistry and Pharmaceutical Technology, Pharmaceutical Institute, Vrije Universiteit Brussel (VUB), Brussels, Belgium
|
22
|
Deconinck E, Ates H, Callebaut N, Van Gyseghem E, Vander Heyden Y. Evaluation of chromatographic descriptors for the prediction of gastro-intestinal absorption of drugs. J Chromatogr A 2007; 1138:190-202. [PMID: 17097093 DOI: 10.1016/j.chroma.2006.10.068] [Citation(s) in RCA: 32] [Impact Index Per Article: 1.9] [Received: 07/27/2006] [Revised: 10/25/2006] [Accepted: 10/30/2006] [Indexed: 11/23/2022]
Abstract
The use of chromatographic descriptors in QSAR was evaluated. To this end, retentions were measured on an immobilized artificial membrane system, 2 micellar liquid chromatography systems and 17 orthogonal or dissimilar reversed-phase liquid chromatographic systems. It was investigated whether it was possible to model gastro-intestinal absorption as a function of chromatographic retentions applying two linear and one non-linear multivariate modeling technique. In a second step it was evaluated whether models built with theoretical descriptors could be improved by adding the measured retention factors to the data set of descriptive variables. It was seen that gastro-intestinal absorption could be modelled as a function of chromatographic retention using the non-linear modeling technique multivariate adaptive regression splines (MARS). The best models were obtained using a combination of theoretical and chromatographic descriptors with MARS as modeling technique.
Affiliation(s)
- E Deconinck
- Department of Analytical Chemistry and Pharmaceutical Technology, Pharmaceutical Institute, Vrije Universiteit Brussel-VUB, Laarbeeklaan 103, B-1090 Brussels, Belgium
|
23
|
Van Gyseghem E, Elkihel A, Jimidar M, Sneyers R, Vander Heyden Y. Chemometric selection of a small set of pharmaceutical active substances used in determining the orthogonality and similarity of chromatographic systems. Anal Chim Acta 2006. [DOI: 10.1016/j.aca.2006.01.030] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 10/25/2022]
|