1
|
Zheng R, Li M, Chen X, Zhao S, Wu FX, Pan Y, Wang J. An Ensemble Method to Reconstruct Gene Regulatory Networks Based on Multivariate Adaptive Regression Splines. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2021; 18:347-354. [PMID: 30794516 DOI: 10.1109/tcbb.2019.2900614] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 06/09/2023]
Abstract
Gene regulatory networks (GRNs) play a key role in biological processes. However, GRNs differ under different biological conditions. Reconstructing GRNs from gene expression data has been an important opportunity and challenge over the past decades. Although many methods exist for inferring the topology of GRNs, such as mutual information, random forest, and partial least squares, their accuracy remains low because of the noise and high dimensionality of expression data. In this paper, we introduce PBMarsNet, an ensemble method based on Multivariate Adaptive Regression Splines (MARS), to reconstruct directed GRNs from multifactorial gene expression data. PBMarsNet uses part mutual information (PMI) to pre-weight the candidate regulatory genes and then uses MARS to detect nonlinear regulatory links. Moreover, we apply bootstrapping to run MARS multiple times and average the outputs of the runs as the final score of each regulatory link. Results on the DREAM4 and DREAM5 challenge datasets show that PBMarsNet has superior performance and generalization compared with other state-of-the-art methods.
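The bootstrap-and-average step described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names `bootstrap_scores` and `abs_corr_score` are hypothetical, the data are synthetic, and the per-run scorer is a simple absolute-correlation stand-in rather than MARS or the PMI pre-weighting used by PBMarsNet.

```python
import numpy as np

def bootstrap_scores(X, y, score_fn, n_boot=50, seed=0):
    """Average per-regulator scores over bootstrap resamples of the samples."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    total = np.zeros(X.shape[1])
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)  # draw rows with replacement
        total += score_fn(X[idx], y[idx])
    return total / n_boot

def abs_corr_score(X, y):
    """Stand-in scorer (NOT MARS): absolute Pearson correlation with the target."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    return np.abs(Xc.T @ yc) / np.where(denom == 0, 1.0, denom)

# Synthetic example: the target gene depends nonlinearly on candidate regulator 0 only.
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 5))
y = np.maximum(0.0, X[:, 0]) + 0.1 * rng.normal(size=100)
scores = bootstrap_scores(X, y, abs_corr_score)
```

In a fuller pipeline, `score_fn` would presumably be replaced by a variable-importance measure taken from a fitted MARS model, and the procedure would be repeated with each gene in turn as the target to obtain a directed network.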
|
2
|
Przybyłek M, Recki Ł, Mroczyńska K, Jeliński T, Cysewski P. Experimental and theoretical solubility advantage screening of bi-component solid curcumin formulations. J Drug Deliv Sci Technol 2019. [DOI: 10.1016/j.jddst.2019.01.023] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 02/07/2023]
|
3
|
Application of Multivariate Adaptive Regression Splines (MARSplines) for Predicting Hansen Solubility Parameters Based on 1D and 2D Molecular Descriptors Computed from SMILES String. J CHEM-NY 2019. [DOI: 10.1155/2019/9858371] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 11/18/2022] Open
Abstract
A new method for predicting Hansen solubility parameters (HSPs) was developed by combining the multivariate adaptive regression splines (MARSplines) methodology with a simple multivariable regression involving 1D and 2D PaDEL molecular descriptors. To adapt the MARSplines approach to QSPR/QSAR problems, several optimization procedures were proposed and tested. The effectiveness of the obtained models was checked via standard QSPR/QSAR internal validation procedures provided by the QSARINS software and by predicting the solubility classification of polymers and drug-like solid solutes in collections of solvents. Using only information derived from SMILES strings, the obtained models allow all three Hansen solubility parameters (dispersion, polarization, and hydrogen bonding) to be computed. Although several descriptors are required for proper parameter estimation, the proposed procedure is simple and straightforward and does not require molecular geometry optimization. The obtained HSP values are highly correlated with experimental data, and applying them to solubility problems gives essentially the same quality of results as the original parameters. Based on the provided models, it is possible to characterize any solvent or liquid solute for which HSP data are unavailable.
|
4
|
Genetic programming based quantitative structure–retention relationships for the prediction of Kovats retention indices. J Chromatogr A 2015; 1420:98-109. [DOI: 10.1016/j.chroma.2015.09.086] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Received: 08/04/2015] [Revised: 09/25/2015] [Accepted: 09/25/2015] [Indexed: 11/20/2022]
|
5
|
Chou YC, Shih JS. Bi-channel Surface Acoustic Wave Gas Sensor for Carbon Disulfide and Methanol Vapors in Polymer Plants. J CHIN CHEM SOC-TAIP 2014. [DOI: 10.1002/jccs.201400120] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 11/06/2022]
|
6
|
Shao YE, Hou CD, Chiu CC. Hybrid intelligent modeling schemes for heart disease classification. Appl Soft Comput 2014. [DOI: 10.1016/j.asoc.2013.09.020] [Citation(s) in RCA: 71] [Impact Index Per Article: 7.1] [Indexed: 11/29/2022]
|
7
|
|
8
|
Jalali-Heravi M, Mani-Varnosfaderani A, Taherinia D, Mahmoodi MM. The use of Bayesian nonlinear regression techniques for the modelling of the retention behaviour of volatile components of Artemisia species. SAR and QSAR in Environmental Research 2012; 23:461-483. [PMID: 22452344 DOI: 10.1080/1062936x.2012.665083] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/31/2023]
Abstract
The main aim of this work was to assess the ability of Bayesian multivariate adaptive regression splines (BMARS) and Bayesian radial basis function (BRBF) techniques to model the gas chromatographic retention indices of volatile components of Artemisia species. A diverse set of molecular descriptors was calculated and used as the descriptor pool for modelling the retention indices. The ability of the BMARS and BRBF techniques to select the most relevant descriptors and proper basis functions for modelling was explored. The results revealed that the BRBF technique is more reproducible than BMARS for modelling the retention indices and can be used as a method for variable selection and modelling in quantitative structure-property relationship (QSPR) studies. It is also concluded that the Markov chain Monte Carlo (MCMC) search engine implemented in the BRBF algorithm is a suitable method for selecting the most important features from a large pool of candidates. The correlation values between the calculated and experimental retention indices for the training and prediction sets (0.935 and 0.902, respectively) demonstrate the predictive power of the BRBF model in estimating the retention indices of volatile components of Artemisia species.
Affiliation(s)
- M Jalali-Heravi
- Department of Chemistry, Sharif University of Technology, Tehran, Iran
|
9
|
Rykowska I, Bielecki P, Wasiak W. Retention indices and quantum-chemical descriptors of aromatic compounds on stationary phases with chemically bonded copper complexes. J Chromatogr A 2010; 1217:1971-6. [DOI: 10.1016/j.chroma.2010.01.073] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 11/23/2009] [Revised: 01/15/2010] [Accepted: 01/22/2010] [Indexed: 11/30/2022]
|
10
|
Jalali-Heravi M, Mani-Varnosfaderani A. QSAR Modeling of 1-(3,3-Diphenylpropyl)-Piperidinyl Amides as CCR5 Modulators Using Multivariate Adaptive Regression Spline and Bayesian Regularized Genetic Neural Networks. QSAR & Combinatorial Science 2009. [DOI: 10.1002/qsar.200860136] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.3] [Indexed: 11/09/2022]
|
11
|
Fatemi MH, Shamseddin H, Malekzadeh H. Quantitative structure migration relationship modeling of migration factor for some benzene derivatives in micellar electrokinetic chromatography. J Sep Sci 2009; 32:1934-40. [DOI: 10.1002/jssc.200800764] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 11/11/2022]
|
12
|
Chen CC, Shih JS. Multi-Channel Piezoelectric Quartz Crystal Sensor with Principal Component Analysis and Back-Propagation Neural Network for Organic Pollutants from Petrochemical Plants. J CHIN CHEM SOC-TAIP 2008. [DOI: 10.1002/jccs.200800145] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/12/2022]
|
13
|
Jönsson S, Eriksson L, van Bavel B. Multivariate characterisation and quantitative structure–property relationship modelling of nitroaromatic compounds. Anal Chim Acta 2008; 621:155-62. [DOI: 10.1016/j.aca.2008.05.037] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 12/19/2007] [Revised: 05/13/2008] [Accepted: 05/14/2008] [Indexed: 11/29/2022]
|
14
|
Deconinck E, Zhang M, Petitet F, Dubus E, Ijjaali I, Coomans D, Vander Heyden Y. Boosted regression trees, multivariate adaptive regression splines and their two-step combinations with multiple linear regression or partial least squares to predict blood–brain barrier passage: A case study. Anal Chim Acta 2008; 609:13-23. [DOI: 10.1016/j.aca.2007.12.033] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.3] [Received: 05/08/2007] [Revised: 12/04/2007] [Accepted: 12/19/2007] [Indexed: 11/16/2022]
|
15
|
Put R, Vander Heyden Y. The evaluation of two-step multivariate adaptive regression splines for chromatographic retention prediction of peptides. Proteomics 2007; 7:1664-77. [PMID: 17443841 DOI: 10.1002/pmic.200600676] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.8] [Indexed: 11/09/2022]
Abstract
Both the multivariate adaptive regression splines (MARS) and the two-step MARS (TMARS) methodologies were applied in a quantitative structure-retention relationship (QSRR) context. For seven RPLC systems, QSRR models were built that describe the retention times of a set of peptides using a large set of molecular descriptors as potential predictor variables. The use of QSRR models for chromatographic retention prediction of peptides may be valuable in proteomic research to improve the number of correct peptide identifications. In every case, 70% of the peptides were used to derive the QSRR models (calibration set), whereas the remaining 30% were treated as an independent external test set. For four systems, the models obtained by TMARS had better predictive abilities than the MARS models. The performance of the MARS and TMARS models was compared with that of other multivariate modelling techniques. For five out of seven systems, uninformative variable elimination by partial least squares (PLS) outperformed all other methods studied. For three systems, predictive errors smaller than 30 s were obtained. PLS regression and a multiple linear regression model based on three descriptors led to the best predictivities for the remaining two systems.
Affiliation(s)
- Raf Put
- FABI, Department of Analytical Chemistry and Pharmaceutical Technology, Pharmaceutical Institute, Vrije Universiteit Brussel (VUB), Brussels, Belgium
|
16
|
Zhu XH, Wang W, Schramm KW, Niu W. Prediction of the Kováts Retention Indices of Thiols by Use of Quantum Chemical and Physicochemical Descriptors. Chromatographia 2007. [DOI: 10.1365/s10337-007-0237-3] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Indexed: 11/05/2022]
|
17
|
Deconinck E, Coomans D, Vander Heyden Y. Exploration of linear modelling techniques and their combination with multivariate adaptive regression splines to predict gastro-intestinal absorption of drugs. J Pharm Biomed Anal 2007; 43:119-30. [PMID: 16859855 DOI: 10.1016/j.jpba.2006.06.022] [Citation(s) in RCA: 33] [Impact Index Per Article: 1.9] [Received: 04/19/2006] [Revised: 06/09/2006] [Accepted: 06/10/2006] [Indexed: 11/16/2022]
Abstract
In general, linear modelling techniques such as multiple linear regression (MLR), principal component regression (PCR) and partial least squares (PLS) are used to model QSAR data. This type of data can be very complex, and linear modelling techniques often capture only a limited part of the information in the data. In this study, an attempt was made to combine linear techniques with the flexible non-linear technique multivariate adaptive regression splines (MARS). Models were built using an MLR model (combined with either a stepwise procedure or a genetic algorithm for variable selection), a PCR model or a PLS model as the starting point for the MARS algorithm. The descriptive and predictive power of the models was evaluated in a QSAR context and compared with the performance of the individual linear models and the single MARS model. In general, the combined methods resulted in significant improvements compared to the linear models and can be considered valuable techniques for modelling complex QSAR data. For the data set used, the best model was obtained using a combination of PLS and MARS. This combination resulted in a model with a Pearson correlation coefficient of 0.90 and a cross-validation error of 9.9%, evaluated with 10-fold cross-validation, indicating good descriptive and high predictive properties.
Affiliation(s)
- E Deconinck
- Department of Analytical Chemistry and Pharmaceutical Technology, Pharmaceutical Institute, Vrije Universiteit Brussel-VUB, Laarbeeklaan 103, B-1090 Brussels, Belgium
|
18
|
Deconinck E, Xu QS, Put R, Coomans D, Massart DL, Vander Heyden Y. Prediction of gastro-intestinal absorption using multivariate adaptive regression splines. J Pharm Biomed Anal 2005; 39:1021-30. [PMID: 16040225 DOI: 10.1016/j.jpba.2005.05.034] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.3] [Received: 04/15/2005] [Revised: 05/30/2005] [Accepted: 05/30/2005] [Indexed: 10/25/2022]
Abstract
Multivariate adaptive regression splines (MARS) and a derived method, two-step MARS (TMARS), were used for modelling the gastro-intestinal absorption of 140 drug-like molecules. The published absorption values for these molecules were used as the response variable and calculated molecular descriptors as potential explanatory variables. The two methods were compared and their potential use in a quantitative structure-activity relationship (QSAR) context was evaluated. The predictive abilities of the models were studied using different sequences of Monte Carlo cross validation (MCCV). It was shown that both types of models had good predictive abilities and that, for the data used, MARS gave better results than TMARS. It was concluded that both methods could be valuable for QSAR modelling.
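Monte Carlo cross validation, as mentioned in this abstract, repeatedly splits the data at random into training and test parts and averages the test error. A minimal sketch follows, assuming a plain least-squares model in place of MARS or TMARS, synthetic data, and an arbitrary 30% test fraction; the function name `mccv_rmse` and all parameter values are hypothetical and not taken from the paper.

```python
import numpy as np

def mccv_rmse(X, y, n_splits=100, test_fraction=0.3, seed=0):
    """Mean and spread of test RMSE over repeated random train/test splits."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    n_test = max(1, int(round(test_fraction * n)))
    errors = []
    for _ in range(n_splits):
        perm = rng.permutation(n)                 # fresh random split each round
        test, train = perm[:n_test], perm[n_test:]
        A = np.column_stack([np.ones(train.size), X[train]])
        coef, *_ = np.linalg.lstsq(A, y[train], rcond=None)  # stand-in linear model
        pred = np.column_stack([np.ones(test.size), X[test]]) @ coef
        errors.append(np.sqrt(np.mean((pred - y[test]) ** 2)))
    return float(np.mean(errors)), float(np.std(errors))

# Synthetic data with 140 samples and 3 descriptors (illustrative only).
rng = np.random.default_rng(2)
X = rng.normal(size=(140, 3))
y = 1.0 + X @ np.array([2.0, -1.0, 0.5]) + 0.2 * rng.normal(size=140)
mean_rmse, sd_rmse = mccv_rmse(X, y)
```

Unlike k-fold cross-validation, the test sets of different rounds may overlap, so the number of rounds and the test fraction can be chosen independently.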
Affiliation(s)
- E Deconinck
- Department of Pharmaceutical and Biomedical Analysis, Pharmaceutical Institute, Vrije Universiteit Brussel-VUB, Laarbeeklaan 103, B-1090 Brussels, Belgium
|
19
|
Caetano S, Decaestecker T, Put R, Daszykowski M, Van Bocxlaer J, Vander Heyden Y. Exploring and modelling the responses of electrospray and atmospheric pressure chemical ionization techniques based on molecular descriptors. Anal Chim Acta 2005. [DOI: 10.1016/j.aca.2005.06.069] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.7] [Indexed: 11/25/2022]
|
20
|
Caetano S, Aires-de-Sousa J, Daszykowski M, Heyden YV. Prediction of enantioselectivity using chirality codes and Classification and Regression Trees. Anal Chim Acta 2005. [DOI: 10.1016/j.aca.2004.12.012] [Citation(s) in RCA: 32] [Impact Index Per Article: 1.7] [Indexed: 11/29/2022]
|
21
|
Can H, Dimoglo A, Kovalishyn V. Application of artificial neural networks for the prediction of sulfur polycyclic aromatic compounds retention indices. Journal of Molecular Structure: THEOCHEM 2005. [DOI: 10.1016/j.theochem.2005.03.004] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Indexed: 11/24/2022]
|
22
|
Artificial neural network prediction of quantitative structure-retention relationships of polycyclic aromatic hydrocarbons in gas chromatography. Journal of the Serbian Chemical Society 2005. [DOI: 10.2298/jsc0511291s] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Indexed: 11/27/2022]
Abstract
A feed-forward artificial neural network (ANN) model was used to link molecular structures (boiling points, connectivity indices and molecular weights) and retention indices of polycyclic aromatic hydrocarbons (PAHs) in linear temperature-programmed gas chromatography. A randomly selected subset of the PAH retention data reported by Lee et al. [Anal. Chem. 51 (1979) 768], containing retention index data for 30 PAHs, was used to build the ANN model. The predictive ability of the trained ANN was tested on unseen data for 18 PAHs from the same article, as well as on retention data for 7 PAHs obtained experimentally in this work. In addition, two different data sets with known retention indices taken from the literature were analyzed with the same ANN model. It was shown that the relative accuracy, taken as the degree of agreement between the measured and predicted retention indices, was within the experimental error margins (±3%) for most of the studied PAHs in all testing sets.
|
23
|
Garkani-Nejad Z, Karlovits M, Demuth W, Stimpfl T, Vycudilik W, Jalali-Heravi M, Varmuza K. Prediction of gas chromatographic retention indices of a diverse set of toxicologically relevant compounds. J Chromatogr A 2004; 1028:287-95. [PMID: 14989482 DOI: 10.1016/j.chroma.2003.12.003] [Citation(s) in RCA: 46] [Impact Index Per Article: 2.3] [Indexed: 11/27/2022]
Abstract
For a set of 846 organic compounds relevant in forensic analytical chemistry and with highly diverse chemical structures, the gas chromatographic Kovats retention indices have been quantitatively modeled using a large set of molecular descriptors generated by the Dragon software. The best, and very similar, prediction performances were obtained by a partial least squares regression (PLS) model using all 529 descriptors considered and by a multiple linear regression (MLR) model using only 15 descriptors obtained by stepwise feature selection. The standard deviations of the prediction errors (SEP) were estimated in four experiments with differently distributed training and prediction sets. For the best models, SEP is about 80 retention index units, corresponding to 2.1-7.2% of the covered retention index interval of 1110-3870. The molecular properties known to be relevant for GC retention data, such as molecular size, branching and polar functional groups, are well covered by the 15 selected descriptors. The developed models support the identification of substances in forensic analytical work by GC-MS in cases where retention data for candidate structures are not available.
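The percentages quoted for SEP can be checked with a short calculation. The sketch below assumes that the 2.1-7.2% range expresses the SEP of 80 units relative to the upper and lower ends of the retention index interval; this reading is an inference from the numbers, not something the abstract states, and the variable names are illustrative.

```python
sep = 80.0                         # standard error of prediction, in retention index units
ri_low, ri_high = 1110.0, 3870.0   # covered retention index interval

pct_at_low = 100.0 * sep / ri_low                 # relative error at the lowest index
pct_at_high = 100.0 * sep / ri_high               # relative error at the highest index
pct_of_span = 100.0 * sep / (ri_high - ri_low)    # relative to the interval width

print(f"{pct_at_high:.1f}% to {pct_at_low:.1f}% of the index value")
print(f"{pct_of_span:.1f}% of the interval width")
```

Under this assumption the endpoints reproduce the quoted 2.1% and 7.2%, whereas SEP relative to the width of the interval (2760 units) would be about 2.9%.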
Affiliation(s)
- Z Garkani-Nejad
- Faculty of Science, Vali-e Asr University of Rafsanjan, Rafsanjan, Iran
|