1
Du M, Ren Y, Zhang Y, Li W, Yang H, Chu H, Zhao Y. CSEL-BGC: A Bioinformatics Framework Integrating Machine Learning for Defining the Biosynthetic Evolutionary Landscape of Uncharacterized Antibacterial Natural Products. Interdiscip Sci 2024:10.1007/s12539-024-00656-5. [PMID: 39348072 DOI: 10.1007/s12539-024-00656-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/03/2024] [Revised: 08/26/2024] [Accepted: 08/28/2024] [Indexed: 10/01/2024]
Abstract
The sluggish pace of new antibacterial drug development reflects a vulnerability in the face of the current severe threat posed by bacterial resistance. Microbial natural products (NPs), as a reservoir of immense chemical potential, have emerged as the most promising avenue for the discovery of next-generation antibacterial agents. Directly assessing the antibacterial activity of potential products derived from biosynthetic gene clusters (BGCs) would significantly expedite the process. To tackle this issue, we propose a CSEL-BGC framework that integrates machine learning (ML) techniques. This framework involves the development of a novel cascade-stacking ensemble learning (CSEL) model and the establishment of a groundbreaking model evaluation system. Based on this framework, we predict 6,666 BGCs with antibacterial activity from 3,468 complete bacterial genomes and elucidate a biosynthetic evolutionary landscape to reveal their antibacterial potential. This provides crucial insights for interpreting the synthesis and secretion mechanisms of unknown NPs.
Affiliation(s)
- Minghui Du
  - School of Life Science and Bio-Pharmaceutics, Shenyang Pharmaceutical University, Shenyang, 110016, China
- Yuxiang Ren
  - School of Life Science and Bio-Pharmaceutics, Shenyang Pharmaceutical University, Shenyang, 110016, China
- Yang Zhang
  - School of Life Science and Bio-Pharmaceutics, Shenyang Pharmaceutical University, Shenyang, 110016, China
- Wenwen Li
  - School of Life Science and Bio-Pharmaceutics, Shenyang Pharmaceutical University, Shenyang, 110016, China
- Hongtao Yang
  - School of Life Science and Bio-Pharmaceutics, Shenyang Pharmaceutical University, Shenyang, 110016, China
- Huiying Chu
  - State Key Laboratory of Molecular Reaction Dynamics, Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian, 116000, China
- Yongshan Zhao
  - School of Life Science and Bio-Pharmaceutics, Shenyang Pharmaceutical University, Shenyang, 110016, China
2
Guo L, Chang Z, Tong J, Gao P, Zhang Y, Liu Y, Yang Y, Wang C. Design of vilazodone-donepezil chimeric derivatives as acetylcholinesterase inhibitors by QSAR, molecular docking and molecular dynamics simulations. Phys Chem Chem Phys 2024; 26:18149-18161. [PMID: 38896464 DOI: 10.1039/d4cp01741b] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/21/2024]
Abstract
Alzheimer's disease (AD) affects the cognitive abilities of older adults and is one of the biggest global medical challenges of the 21st century. Inhibiting acetylcholinesterase (AChE) can increase acetylcholine concentrations and improve cognitive function in patients, which makes AChE a potential target for small-molecule inhibitors for the treatment of AD. In this study, 29 vilazodone-donepezil chimeric derivatives are systematically studied using 3D-QSAR modeling, and a robust and reliable Topomer CoMFA model is obtained with q² = 0.720, r² = 0.991, F = 287.234, N = 6, and SEE = 0.098. Based on the established model and combined with the ZINC20 database, 33 new compounds with ideal inhibitory activity are designed. Molecular docking and ADMET property prediction also show that these newly designed compounds bind well to the target protein and meet the requirements for medicinal use. Subsequently, four new compounds with good overall properties are selected for molecular dynamics simulation, and the simulation results indicate that the newly designed compounds have a certain degree of reliability and stability. This study provides guidance for vilazodone-donepezil chimeric derivatives as potential AChE inhibitors and has certain theoretical value.
Affiliation(s)
- Liyuan Guo
  - College of Chemistry and Chemical Engineering, Shaanxi University of Science and Technology, Xi'an 710021, China
  - Shaanxi Key Laboratory of Chemical Additives for Industry, Xi'an 710021, China
- Zelei Chang
  - College of Chemistry and Chemical Engineering, Shaanxi University of Science and Technology, Xi'an 710021, China
  - Shaanxi Key Laboratory of Chemical Additives for Industry, Xi'an 710021, China
- Jianbo Tong
  - College of Chemistry and Chemical Engineering, Shaanxi University of Science and Technology, Xi'an 710021, China
  - Shaanxi Key Laboratory of Chemical Additives for Industry, Xi'an 710021, China
- Peng Gao
  - College of Chemistry and Chemical Engineering, Shaanxi University of Science and Technology, Xi'an 710021, China
  - Shaanxi Key Laboratory of Chemical Additives for Industry, Xi'an 710021, China
- Yakun Zhang
  - College of Chemistry and Chemical Engineering, Shaanxi University of Science and Technology, Xi'an 710021, China
  - Shaanxi Key Laboratory of Chemical Additives for Industry, Xi'an 710021, China
- Yuan Liu
  - College of Chemistry and Chemical Engineering, Shaanxi University of Science and Technology, Xi'an 710021, China
  - Shaanxi Key Laboratory of Chemical Additives for Industry, Xi'an 710021, China
- Yulu Yang
  - College of Chemistry and Chemical Engineering, Shaanxi University of Science and Technology, Xi'an 710021, China
  - Shaanxi Key Laboratory of Chemical Additives for Industry, Xi'an 710021, China
- Chunying Wang
  - College of Chemistry and Chemical Engineering, Shaanxi University of Science and Technology, Xi'an 710021, China
  - Shaanxi Key Laboratory of Chemical Additives for Industry, Xi'an 710021, China
3
Tian YY, Tong JB, Liu Y, Tian Y. QSAR Study, Molecular Docking and Molecular Dynamic Simulation of Aurora Kinase Inhibitors Derived from Imidazo[4,5-b]pyridine Derivatives. Molecules 2024; 29:1772. [PMID: 38675594 PMCID: PMC11052498 DOI: 10.3390/molecules29081772] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/09/2024] [Revised: 04/03/2024] [Accepted: 04/05/2024] [Indexed: 04/28/2024] [Open access]
Abstract
Cancer is a serious threat to human life and social development, and the use of scientific methods for cancer prevention and control is necessary. In this study, HQSAR, CoMFA, CoMSIA and Topomer CoMFA methods are used to establish models of 65 imidazo[4,5-b]pyridine derivatives to explore the quantitative structure-activity relationship between their anticancer activities and molecular conformations. The cross-validation coefficients q² of HQSAR, CoMFA, CoMSIA and Topomer CoMFA are 0.892, 0.866, 0.877 and 0.905, respectively. The non-cross-validation coefficients r² are 0.948, 0.983, 0.995 and 0.971, respectively. The external validation correlation coefficients r²pred are 0.814, 0.829, 0.758 and 0.855, respectively. The PLS analysis indicates that the QSAR models have high predictive ability and stability. Based on these statistics, R-group-based virtual screening of the ZINC database is performed using the Topomer search technology. Finally, 10 new compounds with higher activity are designed from the screened fragments. To explore the binding modes between ligands and the protein receptor, the newly designed compounds are docked into the macromolecular protein (PDB ID: 1MQ4). Furthermore, to study the dynamic behavior of the newly designed compounds and the stability of the protein-ligand complexes, molecular dynamics simulations of 50 ns are carried out for N3, N4, N5 and N7 docked with the 1MQ4 structure. A free energy landscape is computed to search for the most stable conformation. These results support the efficacy and stability of the newly designed compounds. Finally, ADMET prediction is used to assess the pharmacology and toxicity of the 10 designed molecules.
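For readers unfamiliar with the three statistics quoted in this abstract, the block below writes out the definitions most commonly used in the QSAR literature. It is a sketch under that assumption: the abstract does not state which exact formulas the authors used, and the symbols (training set, test set, leave-one-out prediction) are introduced here for illustration only.

```latex
% r^2: goodness of fit on the training set
r^2 = 1 - \frac{\sum_{i \in \mathrm{train}} (y_i - \hat{y}_i)^2}
               {\sum_{i \in \mathrm{train}} (y_i - \bar{y}_{\mathrm{train}})^2}

% q^2: leave-one-out cross-validation, where \hat{y}_{(i)} is the
% prediction for compound i from a model fitted without compound i
q^2 = 1 - \frac{\sum_{i \in \mathrm{train}} (y_i - \hat{y}_{(i)})^2}
               {\sum_{i \in \mathrm{train}} (y_i - \bar{y}_{\mathrm{train}})^2}

% r^2_pred: external validation on compounds never used for fitting
r^2_{\mathrm{pred}} = 1 - \frac{\sum_{j \in \mathrm{test}} (y_j - \hat{y}_j)^2}
                               {\sum_{j \in \mathrm{test}} (y_j - \bar{y}_{\mathrm{train}})^2}
```

Under these definitions all three values are at most 1, and a large gap between r² and q² or r²pred is usually read as a sign of overfitting.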
Affiliation(s)
- Yang-Yang Tian
  - College of Petroleum Engineering, Xi’an Shiyou University, Xi’an 710065, China
  - Shaanxi Key Laboratory of Advanced Stimulation Technology for Oil & Gas Reservoirs, Xi’an 710065, China
- Jian-Bo Tong
  - College of Chemistry and Chemical Engineering, Shaanxi University of Science and Technology, Xi’an 710021, China
- Yuan Liu
  - College of Chemistry and Chemical Engineering, Shaanxi University of Science and Technology, Xi’an 710021, China
- Yu Tian
  - College of Chemistry and Chemical Engineering, Shaanxi University of Science and Technology, Xi’an 710021, China
4
Liu Y, Tong JB, Gao P, Fan XL, Xiao XC, Xing YC. Combining QSAR techniques, molecular docking, and molecular dynamics simulations to explore anti-tumor inhibitors targeting Focal Adhesion Kinase. J Biomol Struct Dyn 2024:1-17. [PMID: 38173145 DOI: 10.1080/07391102.2023.2301055] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/13/2023] [Accepted: 12/15/2023] [Indexed: 01/05/2024]
Abstract
Focal Adhesion Kinase (FAK) is an important target for tumor therapy and is closely related to tumor cell genesis and progression. In this paper, we selected 46 FAK inhibitors with anticancer activity and a pyrrolopyrimidine backbone to establish 3D/2D-QSAR models and explore the relationship between inhibitory activity and molecular structure. We established two models, namely the Topomer CoMFA model (q² = 0.715, r² = 0.984) and the Holographic Quantitative Structure-Activity Relationship (HQSAR) model (q² = 0.707, r² = 0.899). Both models demonstrate excellent external prediction capabilities. Based on the QSAR results, we designed 20 structurally modified novel compounds, which were subjected to molecular docking and molecular dynamics studies. The results showed that the new compounds formed many robust interactions with residues within the active pocket and could maintain stable binding to the receptor proteins. This study not only provides a powerful screening tool for designing novel FAK inhibitors, but also presents a series of novel FAK inhibitors with high micromolar activity that can be used for further characterization. It provides a reference for addressing the shortcomings of drug metabolism and drug resistance of traditional FAK inhibitors, as well as for the development of novel clinically applicable FAK inhibitors. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Yuan Liu
  - College of Chemistry and Chemical Engineering, Shaanxi University of Science and Technology, Xi'an, China
  - Shaanxi Key Laboratory of Chemical Additives for Industry, Shaanxi University of Science and Technology, Xi'an, China
- Jian-Bo Tong
  - College of Chemistry and Chemical Engineering, Shaanxi University of Science and Technology, Xi'an, China
  - Shaanxi Key Laboratory of Chemical Additives for Industry, Shaanxi University of Science and Technology, Xi'an, China
- Peng Gao
  - College of Chemistry and Chemical Engineering, Shaanxi University of Science and Technology, Xi'an, China
  - Shaanxi Key Laboratory of Chemical Additives for Industry, Shaanxi University of Science and Technology, Xi'an, China
- Xuan-Lu Fan
  - College of Chemistry and Chemical Engineering, Shaanxi University of Science and Technology, Xi'an, China
  - Shaanxi Key Laboratory of Chemical Additives for Industry, Shaanxi University of Science and Technology, Xi'an, China
- Xue-Chun Xiao
  - College of Chemistry and Chemical Engineering, Shaanxi University of Science and Technology, Xi'an, China
  - Shaanxi Key Laboratory of Chemical Additives for Industry, Shaanxi University of Science and Technology, Xi'an, China
- Yi-Chaung Xing
  - College of Chemistry and Chemical Engineering, Shaanxi University of Science and Technology, Xi'an, China
  - Shaanxi Key Laboratory of Chemical Additives for Industry, Shaanxi University of Science and Technology, Xi'an, China
5
Parvatikar PP, Patil S, Khaparkhuntikar K, Patil S, Singh PK, Sahana R, Kulkarni RV, Raghu AV. Artificial intelligence: Machine learning approach for screening large database and drug discovery. Antiviral Res 2023; 220:105740. [PMID: 37935248 DOI: 10.1016/j.antiviral.2023.105740] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/19/2023] [Revised: 10/17/2023] [Accepted: 10/26/2023] [Indexed: 11/09/2023]
Abstract
Recent drug discovery research faces many difficulties, including the development of new drugs during disease outbreaks and drug resistance due to rapidly accumulating mutations. Virtual screening is the most widely used method in computer-aided drug discovery. It has a prominent ability in screening drug targets from large molecular databases. Recently, a number of web servers have been developed for quickly screening publicly accessible chemical databases. In a nutshell, deep learning algorithms and artificial neural networks have modernised the field. Several drug discovery processes have used machine learning and deep learning algorithms, including peptide synthesis, structure-based virtual screening, ligand-based virtual screening, toxicity prediction, drug monitoring and release, pharmacophore modelling, quantitative structure-activity relationship, drug repositioning, polypharmacology, and physicochemical activity. Although a wide variety of data-driven AI/ML tools are presently available, the majority of these tools have, up to this point, been developed in the context of non-communicable diseases like cancer, and a number of obstacles have prevented the translation of these tools to the discovery of treatments against infectious diseases. This review discusses various aspects of AI and ML in virtual screening of large databases. With an emphasis on antivirals as well as other diseases, it offers a perspective on the advantages, drawbacks, and hazards of AI/ML techniques in the search for innovative treatments.
Affiliation(s)
- Prachi P Parvatikar
  - Department of Biotechnology, Allied Health Science, BLDE (Deemed-to-be University), Vijayapur 586103, Karnataka, India
- Sudha Patil
  - Department of Pharmaceutics, BLDEA's SSM College of Pharmacy and Research Centre, Vijayapur 586 103, Karnataka, India
- Kedar Khaparkhuntikar
  - Department of Pharmaceutics, National Institute of Pharmaceutical Education and Research (NIPER), Hyderabad, Telangana, 500037, India
- Shruti Patil
  - Department of Biotechnology, Allied Health Science, BLDE (Deemed-to-be University), Vijayapur 586103, Karnataka, India
- Pankaj K Singh
  - Department of Pharmaceutics, National Institute of Pharmaceutical Education and Research (NIPER), Hyderabad, Telangana, 500037, India
- R Sahana
  - Department of Computer Science and Engineering, RV Institute of Technology and Management, 560076, Bengaluru, India
- Raghavendra V Kulkarni
  - Department of Biotechnology, Allied Health Science, BLDE (Deemed-to-be University), Vijayapur 586103, Karnataka, India
  - Department of Pharmaceutics, BLDEA's SSM College of Pharmacy and Research Centre, Vijayapur 586 103, Karnataka, India
- Anjanapura V Raghu
  - Department of Science and Technology, BLDE (Deemed-to-be University), Vijayapur 586103, Karnataka, India
6
Redžepović I, Furtula B. Chemical similarity of molecules with physiological response. Mol Divers 2023; 27:1603-1612. [PMID: 35976549 DOI: 10.1007/s11030-022-10514-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/10/2022] [Accepted: 08/08/2022] [Indexed: 11/30/2022]
Abstract
Measuring the similarity among molecules is an important task in various chemically oriented problems. This elusive concept is hard to define and quantify. Moreover, the problem becomes more complex when the notion of molecular similarity is split into structural and chemical similarity. While the structural similarity of molecules is extensively researched, the so-called chemical similarity is scarcely mentioned. Here, we propose a way of converting physico-chemical properties into molecular fingerprints. Then, using the apparatus for measuring structural similarity, the chemical similarity can be assessed. A proof of concept is demonstrated on a set of molecules that induce diverse physiological responses.
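The idea in this abstract (turn physico-chemical properties into a fingerprint, then reuse a structural-similarity measure) can be illustrated with a small sketch. The abstract does not say how the properties are encoded, so the threshold ("thermometer") encoding, the property names, the cut-off values and the three example molecules below are all assumptions made for illustration; only the Tanimoto coefficient is a standard fingerprint similarity measure.

```python
def property_fingerprint(props, thresholds):
    """Set one bit per threshold that a property value reaches.

    This thermometer-style encoding is an assumption of this sketch,
    not the encoding used in the cited paper.
    """
    bits = set()
    for name, cuts in thresholds.items():
        for k, cut in enumerate(cuts):
            if props[name] >= cut:
                bits.add((name, k))
    return bits


def tanimoto(a, b):
    """Tanimoto coefficient: shared bits divided by bits set in either."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical cut-offs and made-up property values, for illustration only.
THRESHOLDS = {
    "logp": [0.0, 1.0, 2.0, 3.0],
    "mol_weight": [100, 200, 300, 400],
    "h_donors": [1, 2, 3],
}
mol_a = {"logp": 2.5, "mol_weight": 250, "h_donors": 1}
mol_b = {"logp": 2.1, "mol_weight": 310, "h_donors": 1}
mol_c = {"logp": -0.5, "mol_weight": 120, "h_donors": 3}

fp_a = property_fingerprint(mol_a, THRESHOLDS)
fp_b = property_fingerprint(mol_b, THRESHOLDS)
fp_c = property_fingerprint(mol_c, THRESHOLDS)

print(tanimoto(fp_a, fp_b))  # molecules with similar properties
print(tanimoto(fp_a, fp_c))  # molecules with dissimilar properties
```

With this encoding, molecules whose property values fall in nearby ranges share most of their bits, so the similarity degrades gradually rather than dropping to zero when a value crosses a single bin boundary.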
Affiliation(s)
- Izudin Redžepović
  - Department of Chemistry, Faculty of Science, University of Kragujevac, P. O. Box 60, 34000, Kragujevac, Serbia
- Boris Furtula
  - Department of Chemistry, Faculty of Science, University of Kragujevac, P. O. Box 60, 34000, Kragujevac, Serbia
7
Al Fahoum AS, Abu Al-Haija AO, Alshraideh HA. Identification of Coronary Artery Diseases Using Photoplethysmography Signals and Practical Feature Selection Process. Bioengineering (Basel) 2023; 10:249. [PMID: 36829743 PMCID: PMC9952145 DOI: 10.3390/bioengineering10020249] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 01/09/2023] [Revised: 01/19/2023] [Accepted: 02/07/2023] [Indexed: 02/16/2023] [Open access]
Abstract
A low-cost, fast, dependable, repeatable, non-invasive, portable, and simple-to-use vascular screening tool for coronary artery diseases (CADs) is preferred. Photoplethysmography (PPG), a low-cost optical pulse wave technology, is one method with this potential. PPG signals come from changes in the amount of blood in the microvascular bed of tissue. Therefore, these signals can be used to figure out anomalies within the cardiovascular system. This work shows how to use PPG signals and feature selection-based classifiers to identify cardiorespiratory disorders based on the extraction of time-domain features. Data were collected from 360 healthy and cardiovascular disease patients. For analysis and identification, five types of cardiovascular disorders were considered. The categories of cardiovascular diseases were identified using a two-stage classification process. The first stage was utilized to differentiate between healthy and unhealthy subjects. Subjects who were found to be abnormal were then entered into the second stage classifier, which was used to determine the type of the disease. Seven different classifiers were employed to classify the dataset. Based on the subset of features found by the classifier, the Naïve Bayes classifier obtained the best test accuracy, with 94.44% for the first stage and 89.37% for the second stage. The results of this study show how vital the PPG signal is. Many time-domain parts of the PPG signal can be easily extracted and analyzed to find out if there are problems with the heart. The results were accurate and precise enough that they did not need to be looked at or analyzed further. The PPG classifier built on a simple microcontroller will work better than more expensive ones and will not make the patient nervous.
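The two-stage scheme described in this abstract (first healthy versus abnormal, then the disease type for abnormal subjects only) can be sketched as a small cascade. This is a minimal illustration and not the authors' pipeline: it omits the feature-selection step, uses a hand-written Gaussian Naive Bayes, and the two features, the class names and every data value are invented for the example.

```python
import math
from collections import defaultdict


class GaussianNB:
    """Minimal Gaussian Naive Bayes over numeric feature tuples."""

    def fit(self, rows, labels):
        grouped = defaultdict(list)
        for row, label in zip(rows, labels):
            grouped[label].append(row)
        self.stats = {}
        for label, members in grouped.items():
            prior = len(members) / len(rows)
            params = []
            for column in zip(*members):
                mean = sum(column) / len(column)
                # Small constant keeps the variance strictly positive.
                var = sum((v - mean) ** 2 for v in column) / len(column) + 1e-6
                params.append((mean, var))
            self.stats[label] = (prior, params)
        return self

    def predict(self, row):
        def log_posterior(label):
            prior, params = self.stats[label]
            score = math.log(prior)
            for value, (mean, var) in zip(row, params):
                score += -0.5 * math.log(2 * math.pi * var)
                score -= (value - mean) ** 2 / (2 * var)
            return score

        return max(self.stats, key=log_posterior)


def train_two_stage(rows, labels, healthy="healthy"):
    """Stage 1 separates healthy from abnormal; stage 2 names the disease."""
    stage1_labels = [lab if lab == healthy else "abnormal" for lab in labels]
    stage1 = GaussianNB().fit(rows, stage1_labels)
    sick = [(row, lab) for row, lab in zip(rows, labels) if lab != healthy]
    stage2 = GaussianNB().fit([r for r, _ in sick], [lab for _, lab in sick])

    def classify(row):
        if stage1.predict(row) == healthy:
            return healthy
        return stage2.predict(row)

    return classify


# Made-up two-feature data; real PPG time-domain features would go here.
rows = [
    (1.0, 1.0), (1.2, 0.9), (0.8, 1.1),   # healthy
    (5.0, 1.0), (5.2, 1.2), (4.8, 0.9),   # hypothetical disease_a
    (5.0, 5.0), (5.1, 4.8), (4.9, 5.2),   # hypothetical disease_b
]
labels = ["healthy"] * 3 + ["disease_a"] * 3 + ["disease_b"] * 3

classify = train_two_stage(rows, labels)
print(classify((1.1, 1.0)), classify((5.0, 1.1)), classify((5.0, 4.9)))
```

One reason to prefer a cascade over a single multi-class model is that the stage-2 classifier is trained only on abnormal subjects, so it can concentrate on differences between diseases rather than on the healthy-versus-abnormal boundary.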
Affiliation(s)
- Amjed S. Al Fahoum
  - Biomedical Systems and Informatics Engineering Department, Yarmouk University, Irbid 21163, Jordan
- Ansam Omar Abu Al-Haija
  - Biomedical Systems and Informatics Engineering Department, Yarmouk University, Irbid 21163, Jordan
  - Industrial Engineering Department, JUST, Irbid 22110, Jordan
8
Király P, Kiss R, Kovács D, Ballaj A, Tóth G. The Relevance of Goodness-of-fit, Robustness and Prediction Validation Categories of OECD-QSAR Principles with Respect to Sample Size and Model Type. Mol Inform 2022; 41:e2200072. [PMID: 35773201 PMCID: PMC9787734 DOI: 10.1002/minf.202200072] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/31/2022] [Accepted: 06/30/2022] [Indexed: 12/30/2022]
Abstract
We investigated the relevance of the validation principles for Quantitative Structure-Activity Relationship models issued by the Organisation for Economic Co-operation and Development. We checked the goodness-of-fit, robustness and predictivity categories in linear and nonlinear models using benchmark datasets. Most of our conclusions are drawn from the sample size dependence of the different validation parameters. We found that the goodness-of-fit parameters misleadingly overestimate the models on small samples. In the case of neural network and support vector models, the feasibility of the goodness-of-fit parameters might often be questioned. We propose to use the simplest y-scrambling method to estimate chance correlation. We found that the leave-one-out and leave-many-out cross-validation parameters can be rescaled to each other in all models, and the computationally feasible method should be chosen depending on the model type. We assessed the interdependence of the validation parameters by calculating their rank correlations. Goodness of fit and robustness correlate quite well over a sample size for linear models, and one of the approaches might be redundant. In the rank correlation between internal and external validation parameters, we found that the assignment of well and poorly modellable data to the training or the test set causes negative correlations.
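The three checks named in this abstract (goodness of fit, leave-one-out robustness, and y-scrambling for chance correlation) can be sketched on a one-descriptor linear model. This is an illustration under stated assumptions rather than the authors' procedure: the descriptor and activity values are made up, the model is ordinary least squares with a single variable, and q² is computed against the overall mean of the responses.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def r_squared(xs, ys):
    """Goodness of fit: 1 - RSS/TSS on the data used for fitting."""
    a, b = fit_line(xs, ys)
    my = sum(ys) / len(ys)
    rss = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    tss = sum((y - my) ** 2 for y in ys)
    return 1.0 - rss / tss


def q_squared_loo(xs, ys):
    """Robustness: leave-one-out q2 = 1 - PRESS/TSS."""
    my = sum(ys) / len(ys)
    press = 0.0
    for i in range(len(xs)):
        # Refit without compound i, then predict compound i.
        a, b = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (a + b * xs[i])) ** 2
    tss = sum((y - my) ** 2 for y in ys)
    return 1.0 - press / tss


def y_scrambled_r2(xs, ys, n_rounds=200, seed=0):
    """Chance correlation: mean r2 after shuffling the responses."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_rounds):
        shuffled = ys[:]
        rng.shuffle(shuffled)
        total += r_squared(xs, shuffled)
    return total / n_rounds


# Made-up descriptor values and activities, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [1.1, 1.9, 3.2, 3.9, 5.1, 6.2, 6.8, 8.1]

r2 = r_squared(xs, ys)
q2 = q_squared_loo(xs, ys)
r2_scrambled = y_scrambled_r2(xs, ys)
print(r2, q2, r2_scrambled)
```

A model is usually considered free of chance correlation when the scrambled r² is far below the real one. For a least-squares fit, q² computed this way can never exceed r², which is one way to see why goodness of fit alone is optimistic on small samples.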
Affiliation(s)
- Péter Király
  - Institute of Chemistry, Loránd Eötvös University, Pázmány S. 1/A, 1117 Budapest, Hungary
- Ramóna Kiss
  - Institute of Chemistry, Loránd Eötvös University, Pázmány S. 1/A, 1117 Budapest, Hungary
- Dániel Kovács
  - Institute of Chemistry, Loránd Eötvös University, Pázmány S. 1/A, 1117 Budapest, Hungary
- Amine Ballaj
  - Institute of Chemistry, Loránd Eötvös University, Pázmány S. 1/A, 1117 Budapest, Hungary
- Gergely Tóth
  - Institute of Chemistry, Loránd Eötvös University, Pázmány S. 1/A, 1117 Budapest, Hungary
9
Shayanfar S, Shayanfar A. Comparison of various methods for validity evaluation of QSAR models. BMC Chem 2022; 16:63. [PMID: 35999611 PMCID: PMC9396839 DOI: 10.1186/s13065-022-00856-4] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 04/22/2022] [Accepted: 08/09/2022] [Indexed: 11/10/2022] [Open access]
Abstract
BACKGROUND: Quantitative structure-activity relationship (QSAR) modeling is one of the most important computational tools employed in drug discovery and development. External validation is the main way to check the reliability of developed QSAR models for predicting the activity of not-yet-synthesized compounds. Different criteria have been used for it in the literature. METHODS: In this study, 44 QSAR models for biologically active compounds reported in scientific papers were collected. Various statistical parameters of external validation of a QSAR model were calculated, and the results were discussed. RESULTS: The findings revealed that the coefficient of determination (r²) alone could not indicate the validity of a QSAR model. The established criteria for external validation have advantages and disadvantages that should be considered in QSAR studies. CONCLUSION: This study showed that these methods alone are not enough to indicate the validity or invalidity of a QSAR model.
Affiliation(s)
- Shadi Shayanfar
  - Student Research Committee, Faculty of Pharmacy, Tabriz University of Medical Sciences, Tabriz, Iran
- Ali Shayanfar
  - Pharmaceutical Analysis Research Center, Tabriz University of Medical Sciences, Tabriz, Iran
  - Editorial Office of Pharmaceutical Sciences Journal, Faculty of Pharmacy, Tabriz University of Medical Sciences, Tabriz, Iran
10
Orosz Á, Héberger K, Rácz A. Comparison of Descriptor- and Fingerprint Sets in Machine Learning Models for ADME-Tox Targets. Front Chem 2022; 10:852893. [PMID: 35755260 PMCID: PMC9214226 DOI: 10.3389/fchem.2022.852893] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 01/11/2022] [Accepted: 04/14/2022] [Indexed: 01/12/2023] [Open access]
Abstract
The screening of compounds for ADME-Tox targets plays an important role in drug design. QSPR models can increase the speed of these specific tasks, although the performance of the models highly depends on several factors, such as the applied molecular descriptors. In this study, a detailed comparison of the most popular descriptor groups has been carried out for six main ADME-Tox classification targets: Ames mutagenicity, P-glycoprotein inhibition, hERG inhibition, hepatotoxicity, blood–brain-barrier permeability, and cytochrome P450 2C9 inhibition. The literature-based, medium-sized binary classification datasets (all above 1,000 molecules) were used for the model building by two common algorithms, XGBoost and the RPropMLP neural network. Five molecular representation sets were compared along with their joint applications: Morgan, Atompairs, and MACCS fingerprints, and the traditional 1D and 2D molecular descriptors, as well as 3D molecular descriptors, separately. The statistical evaluation of the model performances was based on 18 different performance parameters. Although all the developed models were close to the usual performance of QSPR models for each specific ADME-Tox target, the results clearly showed the superiority of the traditional 1D, 2D, and 3D descriptors in the case of the XGBoost algorithm. It is worth trying the classical tools in single model building because the use of 2D descriptors can produce even better models for almost every dataset than the combination of all the examined descriptor sets.
Affiliation(s)
- Álmos Orosz
  - Plasma Chemistry Research Group, Research Centre for Natural Sciences, Budapest, Hungary
- Károly Héberger
  - Plasma Chemistry Research Group, Research Centre for Natural Sciences, Budapest, Hungary
- Anita Rácz
  - Plasma Chemistry Research Group, Research Centre for Natural Sciences, Budapest, Hungary
11
Multiobject Optimization of National Football League Drafts: Comparison of Teams and Experts. APPLIED SCIENCES-BASEL 2022. [DOI: 10.3390/app12136303] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/10/2022]
Abstract
Predicting the success of National Football League drafts has always been an exciting issue for teams, fans and even scientists. Among the numerous approaches, one of the best techniques is to ask the opinion of sport experts, who have the knowledge and past experience to rate the drafts of the teams. When a set of sport experts is asked to evaluate the performances of teams, a multicriteria decision-making problem unavoidably arises. The current paper uses the draft evaluations of the 32 NFL teams given by 18 experts and applies a novel multicriteria decision-making tool, the sum of ranking differences (SRD). We introduce a quick and easy-to-follow approach for evaluating the performance of the teams and the experts at the same time. Our results on the 2021 NFL draft data indicate that the Green Bay Packers have the most promising draft for 2021, while the experts fall into three distinct groups based on their distance to the hypothetical best evaluation. Even the coding options can be tailored according to the experts’ opinions. Statistically correct (pairwise or group) comparisons can be made using analysis of variance (ANOVA). A comparison to TOPSIS ranking revealed that SRD gives a more objective ranking due to the lack of predefined weights.
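The core SRD computation mentioned in this abstract can be sketched in a few lines. The sketch assumes the common setup in which rows are the objects being rated (here, teams), columns are the raters to compare (here, experts), and the reference ranking comes from the row-wise average; the abstract does not state which reference the authors chose, and the grade table below is invented. The validation steps that normally accompany SRD (randomization test, cross-validation) are left out.

```python
def average_ranks(values):
    """Rank values ascending (1 = smallest); tied values share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def srd(table):
    """Sum of absolute rank differences of each column to the reference.

    table[row][col]: rows are rated objects, columns are raters.
    The row-wise mean is used as the reference (an assumption of this sketch).
    """
    n_rows, n_cols = len(table), len(table[0])
    reference = [sum(row) / n_cols for row in table]
    ref_ranks = average_ranks(reference)
    result = []
    for c in range(n_cols):
        col_ranks = average_ranks([table[r][c] for r in range(n_rows)])
        result.append(sum(abs(a - b) for a, b in zip(col_ranks, ref_ranks)))
    return result


# Invented grades: 4 teams (rows) rated by 3 experts (columns).
grades = [
    [9, 8, 3],
    [7, 7, 9],
    [5, 6, 5],
    [3, 2, 7],
]
scores = srd(grades)
print(scores)  # a smaller SRD means the expert is closer to the consensus
```

Transposing the table gives the complementary view, in which the teams rather than the experts are compared against a reference.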
12
QSAR study, molecular docking, and ADMET prediction of vinyl sulfone-containing Nrf2 activator derivatives for treating Parkinson disease. Struct Chem 2022. [DOI: 10.1007/s11224-022-01909-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
13
Abonyi J, Czvetkó T, Kosztyán ZT, Héberger K. Factor analysis, sparse PCA, and Sum of Ranking Differences-based improvements of the Promethee-GAIA multicriteria decision support technique. PLoS One 2022; 17:e0264277. [PMID: 35213620 PMCID: PMC8880814 DOI: 10.1371/journal.pone.0264277] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/02/2021] [Accepted: 02/07/2022] [Indexed: 11/21/2022] [Open access]
Abstract
The Promethee-GAIA method is a multicriteria decision support technique that defines the aggregated ranks of multiple criteria and visualizes them based on Principal Component Analysis (PCA). In the case of numerous criteria, the PCA biplot-based visualization does not reveal how a criterion influences the decision problem. The central question is how the Promethee-GAIA-based decision-making process can be improved to gain more interpretable results that reveal more characteristic inner relationships between the criteria. To improve the Promethee-GAIA method, we suggest three techniques that eliminate redundant criteria, clearly outline which criterion belongs to which factor, and explore the similarities between criteria. These methods are the following: A) principal factoring with rotation and communality analysis (P-PFA), B) the integration of Sparse PCA into the Promethee II method (P-sPCA), and C) the Sum of Ranking Differences method (P-SRD). The suggested methods are presented through an I4.0+ dataset that measures the Industry 4.0 readiness of NUTS 2-classified regions. The proposed methods are useful tools for handling multicriteria ranking problems when the criteria are numerous.
Affiliation(s)
- János Abonyi
  - MTA-PE “Lendület” Complex Systems Monitoring Research Group, University of Pannonia, Veszprém, Hungary
- Tímea Czvetkó
  - MTA-PE “Lendület” Complex Systems Monitoring Research Group, University of Pannonia, Veszprém, Hungary
- Zsolt T. Kosztyán
  - Department of Quantitative Methods, Faculty of Business and Economics, University of Pannonia, Veszprém, Hungary
- Károly Héberger
  - ELKH Research Centre for Natural Sciences, Institute of Excellence of the Hungarian Academy of Sciences, Budapest, Hungary
14
Comprehensible Visualization of Multidimensional Data: Sum of Ranking Differences-Based Parallel Coordinates. MATHEMATICS 2021. [DOI: 10.3390/math9243203] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 01/31/2023]
Abstract
A novel visualization technique is proposed for the sum of ranking differences method (SRD) based on parallel coordinates. An axis is defined for each variable, on which the data are depicted row-wise. By connecting data, the lines may intersect. The fewer intersections between the variables, the more similar they are and the clearer the figure becomes. Therefore, the visualization depends on what techniques are used to order the variables. The key idea is to employ the SRD method to measure the degree of similarity of the variables, establishing a distance-based order. The distances between the axes are not uniformly distributed in the proposed visualization; their closeness reflects similarity, according to their SRD value. The proposed algorithm identifies false similarities through an iterative approach, where the angles between the SRD values determine on which side a variable is plotted. Visualization of the algorithm is provided by MATLAB/Octave source code. The proposed tool is applied to study how the sources of greenhouse gas emissions can be grouped based on the statistical data of the countries. A comparison to multidimensional scaling (MDS)-based ordering is also given. The use case demonstrates the applicability of the method and the synergies of the incorporation of the SRD method into parallel coordinates.
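As a rough illustration of the SRD measure this abstract builds on, the sketch below ranks each variable and a reference column, then sums the absolute rank differences. It is not the authors' MATLAB/Octave code. The function names `ranks` and `srd` are hypothetical, and using the row-wise mean as the default reference is an assumption (a common choice, but not stated in the abstract).

```python
def ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def srd(columns, reference=None):
    """Sum of absolute rank differences between each column and a reference.

    columns: dict mapping a variable name to a list of values (equal lengths).
    reference: optional list of reference values; defaults to the row-wise
    mean of all columns (an assumption made for this sketch).
    """
    n_rows = len(next(iter(columns.values())))
    if reference is None:
        reference = [
            sum(col[i] for col in columns.values()) / len(columns)
            for i in range(n_rows)
        ]
    ref_ranks = ranks(reference)
    return {
        name: sum(abs(r - q) for r, q in zip(ranks(col), ref_ranks))
        for name, col in columns.items()
    }


# Hypothetical data: three variables measured on four objects.
data = {"a": [1, 2, 3, 4], "b": [4, 3, 2, 1], "c": [1, 3, 2, 4]}
scores = srd(data, reference=[1, 2, 3, 4])
```

A smaller SRD value means the variable orders the objects more like the reference does, which is the similarity notion the abstract uses to order and space the axes.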
|
15
|
Shakour N, Hadizadeh F, Kesharwani P, Sahebkar A. 3D-QSAR Studies of 1,2,4-Oxadiazole Derivatives as Sortase A Inhibitors. BIOMED RESEARCH INTERNATIONAL 2021; 2021:6380336. [PMID: 34912894 PMCID: PMC8668286 DOI: 10.1155/2021/6380336] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 08/02/2021] [Revised: 10/23/2021] [Accepted: 11/13/2021] [Indexed: 12/20/2022]
Abstract
Sortase A (SrtA) is an enzyme that catalyzes the attachment of proteins to the cell wall of Gram-positive bacterial membrane, preventing the spread of pathogenic bacterial strains. Here, one class of oxadiazole compounds was distinguished as an efficient inhibitor of SrtA via the "S. aureus Sortase A" substrate-based virtual screening. The current study on 3D-QSAR was done by utilizing preparation of the structure in the Schrödinger software suite and an assessment of 120 derivatives with the crystal structure of 1,2,4-oxadiazole, which was extracted from the PDB data bank. The docking operation of the best compound in terms of pMIC (pMIC = 2.77) was done to determine the drug-likeness and binding form of 1,2,4-oxadiazole derivatives as antibiotics in the active site. Using the kNN-MFA method, seven 3D-QSAR models were created, and one of them was selected as the best. The chosen model, based on q2 (pred_r2) and R2 values related to the sixth factor of PLS, illustrates better and more acceptable external and internal predictions. Values of cross-validation (pred_r2), validation (q2), and F were 0.5479, 0.6319, and 179.0, respectively, for a test group of 24 molecules and a training group of 96 molecules. The external reliability outcomes showed that the selected 3D-QSAR model had a high predictive potential (R2 = 0.9235), which was confirmed by the Y-randomization test. Besides, the model applicability domain was described successfully to validate the estimation of the model.
Affiliation(s)
- Neda Shakour
- Department of Medicinal Chemistry, School of Pharmacy, Mashhad University of Medical Sciences, Mashhad, Iran
- Student Research Committee, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran
| | - Farzin Hadizadeh
- Department of Medicinal Chemistry, School of Pharmacy, Mashhad University of Medical Sciences, Mashhad, Iran
- Biotechnology Research Center, Pharmaceutical Technology Institute, Mashhad University of Medical Sciences, Mashhad, Iran
| | - Prashant Kesharwani
- Department of Pharmaceutics, School of Pharmaceutical Education and Research, Jamia Hamdard, New Delhi 110062, India
| | - Amirhossein Sahebkar
- Biotechnology Research Center, Pharmaceutical Technology Institute, Mashhad University of Medical Sciences, Mashhad, Iran
- Applied Biomedical Research Center, Mashhad University of Medical Sciences, Mashhad, Iran
- Department of Biotechnology, School of Pharmacy, Mashhad University of Medical Sciences, Mashhad, Iran
| |
|
16
|
Modelling in Synthesis and Optimization of Active Vaccinal Components. NANOMATERIALS 2021; 11:nano11113001. [PMID: 34835765 PMCID: PMC8625944 DOI: 10.3390/nano11113001] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/09/2021] [Revised: 10/28/2021] [Accepted: 11/04/2021] [Indexed: 12/24/2022]
Abstract
Cancer is the second leading cause of mortality worldwide, behind heart disease, accounting for 10 million deaths each year. This study focusses on adenocarcinoma, which is a target of a number of anticancer therapies presently being tested in medical and pharmaceutical studies. The innovative study for a therapeutic vaccine comprises the investigation of gold nanoparticles and their influence on the immune response for the annihilation of cancer cells. The model is intended to be realized using Quantitative Structure-Activity Relationship (QSAR) methods, explicitly artificial neural networks combined with fuzzy rules, to enhance automated properties of neural nets with human perception characteristics. Image processing techniques such as morphological transformations and watershed segmentation are used to extract and calculate certain molecular characteristics from hyperspectral images. The quantification of single-cell properties is one of the key resolutions, representing the treatment efficiency in therapy of colon and rectum cancerous conditions. This was accomplished by using manually counted cells as a reference point for comparing segmentation results. The early findings acquired are conclusive for further study; thus, the extracted features will be used in the feature optimization process first, followed by neural network building of the required model.
|
17
|
Dunn TB, Seabra GM, Kim TD, Juárez-Mercado KE, Li C, Medina-Franco JL, Miranda-Quintana RA. Diversity and Chemical Library Networks of Large Data Sets. J Chem Inf Model 2021; 62:2186-2201. [PMID: 34723537 DOI: 10.1021/acs.jcim.1c01013] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Indexed: 01/05/2023]
Abstract
The quantification of chemical diversity has many applications in drug discovery, organic chemistry, food, and natural product chemistry, to name a few. As the size of the chemical space is expanding rapidly, it is imperative to develop efficient methods to quantify the diversity of large and ultralarge chemical libraries and visualize their mutual relationships in chemical space. Herein, we show an application of our recently introduced extended similarity indices to measure the fingerprint-based diversity of 19 chemical libraries typically used in drug discovery and natural products research with over 18 million compounds. Based on this concept, we introduce the Chemical Library Networks (CLNs) as a general and efficient framework to represent visually the chemical space of large chemical libraries providing a global perspective of the relation between the libraries. For the 19 compound libraries explored in this work, it was found that the (extended) Tanimoto index offers the best description of extended similarity in combination with RDKit fingerprints. CLNs are general and can be explored with any structure representation and similarity coefficient for large chemical libraries.
Affiliation(s)
- Timothy B Dunn
- Department of Chemistry, University of Florida, Gainesville, Florida 32611, United States
| | - Gustavo M Seabra
- Department of Medicinal Chemistry, University of Florida, Gainesville, Florida 32610, United States
- Center for Natural Products, Drug Discovery and Development (CNPD3), University of Florida, Gainesville, Florida 32610, United States
| | - Taewon David Kim
- Department of Chemistry, University of Florida, Gainesville, Florida 32611, United States
| | - K Eurídice Juárez-Mercado
- DIFACQUIM Research Group, Department of Pharmacy, National Autonomous University of Mexico, Mexico City 04510, Mexico
| | - Chenglong Li
- Department of Medicinal Chemistry, University of Florida, Gainesville, Florida 32610, United States
- Center for Natural Products, Drug Discovery and Development (CNPD3), University of Florida, Gainesville, Florida 32610, United States
| | - José L Medina-Franco
- DIFACQUIM Research Group, Department of Pharmacy, National Autonomous University of Mexico, Mexico City 04510, Mexico
| | - Ramón Alain Miranda-Quintana
- Department of Chemistry, University of Florida, Gainesville, Florida 32611, United States
- Quantum Theory Project, University of Florida, Gainesville, Florida 32611, United States
| |
|
18
|
Tong JB, Bian S, Zhang X, Luo D. QSAR analysis of 3-pyrimidin-4-yl-oxazolidin-2-one derivatives isocitrate dehydrogenase inhibitors using Topomer CoMFA and HQSAR methods. Mol Divers 2021; 26:1017-1037. [PMID: 33974175 DOI: 10.1007/s11030-021-10222-6] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 01/13/2021] [Accepted: 04/02/2021] [Indexed: 01/03/2023]
Abstract
A series of mIDH1 inhibitors derived from 3-pyrimidine-4-oxazolidin-2-ketone derivatives were studied with QSAR models to explore the key factors that inhibit mIDH1 activity. The generated models were validated with and without cross-validation using the Topomer CoMFA and HQSAR methods; the independent test set was verified by the PLS method; the Topomer search technology was used for virtual screening and molecular design; and the Surflex-Dock method and ADMET technology were used for molecular docking and for pharmacology and toxicity prediction of the designed drug molecules. The Topomer CoMFA and HQSAR cross-validation coefficients q2 are 0.783 and 0.784, respectively, and the non-cross-validation coefficients r2 are 0.978 and 0.934, respectively. Ten new drug molecules have been designed using Topomer search technology. The results of molecular docking and ADMET show that the newly designed drug molecules are effective. The docking situation, pharmacology and toxicity prediction results are good. The model can be used to predict the bioactivity of the same type of new compounds and their derivatives. The prediction results of molecular design, molecular docking and ADMET can provide some ideas for the design and development of novel mIDH1 inhibitor anticancer drugs, and provide a theoretical basis for the experimental verification of new compounds in the future. Docking the newly designed molecules with the corresponding proteins in the PDB library makes it possible to explore the protein targets of the drug molecules and the related forces, which is very helpful for the design of new drugs and for understanding the mechanism of drug action.
Affiliation(s)
- Jian-Bo Tong
- College of Chemistry and Chemical Engineering, Shaanxi University of Science and Technology, Xi'an, 710021, China
- Shaanxi Key Laboratory of Chemical Additives for Industry, Xi'an, 710021, China
| | - Shuai Bian
- College of Chemistry and Chemical Engineering, Shaanxi University of Science and Technology, Xi'an, 710021, China
- Shaanxi Key Laboratory of Chemical Additives for Industry, Xi'an, 710021, China
| | - Xing Zhang
- College of Chemistry and Chemical Engineering, Shaanxi University of Science and Technology, Xi'an, 710021, China
- Shaanxi Key Laboratory of Chemical Additives for Industry, Xi'an, 710021, China
| | - Ding Luo
- College of Chemistry and Chemical Engineering, Shaanxi University of Science and Technology, Xi'an, 710021, China
- Shaanxi Key Laboratory of Chemical Additives for Industry, Xi'an, 710021, China
| |
|
19
|
Miranda-Quintana RA, Bajusz D, Rácz A, Héberger K. Extended similarity indices: the benefits of comparing more than two objects simultaneously. Part 1: Theory and characteristics †. J Cheminform 2021; 13:32. [PMID: 33892802 PMCID: PMC8067658 DOI: 10.1186/s13321-021-00505-3] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 09/15/2020] [Accepted: 03/12/2021] [Indexed: 12/14/2022] Open
Abstract
Quantification of the similarity of objects is a key concept in many areas of computational science. This includes cheminformatics, where molecular similarity is usually quantified based on binary fingerprints. While there is a wide selection of available molecular representations and similarity metrics, there were no previous efforts to extend the computational framework of similarity calculations to the simultaneous comparison of more than two objects (molecules) at the same time. The present study bridges this gap, by introducing a straightforward computational framework for comparing multiple objects at the same time and providing extended formulas for as many similarity metrics as possible. In the binary case (i.e. when comparing two molecules pairwise) these are naturally reduced to their well-known formulas. We provide a detailed analysis on the effects of various parameters on the similarity values calculated by the extended formulas. The extended similarity indices are entirely general and do not depend on the fingerprints used. Two types of variance analysis (ANOVA) help to understand the main features of the indices: (i) ANOVA of mean similarity indices; (ii) ANOVA of sum of ranking differences (SRD). Practical aspects and applications of the extended similarity indices are detailed in the accompanying paper: Miranda-Quintana et al. J Cheminform. 2021. https://doi.org/10.1186/s13321-021-00504-4 . Python code for calculating the extended similarity metrics is freely available at: https://github.com/ramirandaq/MultipleComparisons .
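For orientation, the pairwise case that the extended indices reduce to can be sketched with the familiar Tanimoto coefficient on binary fingerprints. This is only the well-known two-molecule formula, not the extended n-ary indices of the paper, whose reference implementation is the authors' repository linked above. The function name `tanimoto` and the convention of returning 1.0 for two all-zero fingerprints are assumptions of this sketch.

```python
def tanimoto(fp_a, fp_b):
    """Pairwise Tanimoto similarity of two equal-length binary fingerprints.

    Computed as (bits set in both) / (bits set in at least one).
    Returning 1.0 when neither fingerprint has any bit set is a convention
    chosen for this sketch; other tools may return 0.0 instead.
    """
    if len(fp_a) != len(fp_b):
        raise ValueError("fingerprints must have the same length")
    both = sum(1 for a, b in zip(fp_a, fp_b) if a and b)
    either = sum(1 for a, b in zip(fp_a, fp_b) if a or b)
    return both / either if either else 1.0


# Hypothetical 8-bit fingerprints for two molecules.
mol_1 = [1, 1, 0, 0, 1, 0, 0, 1]
mol_2 = [1, 0, 0, 0, 1, 1, 0, 1]
similarity = tanimoto(mol_1, mol_2)  # 3 shared bits out of 5 set in either
```

Real fingerprints are usually thousands of bits long and stored as bit vectors, so a production version would use bitwise operations rather than Python lists.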
Affiliation(s)
| | - Dávid Bajusz
- Medicinal Chemistry Research Group, Research Centre for Natural Sciences, Magyar tudósok krt. 2, 1117, Budapest, Hungary
| | - Anita Rácz
- Plasma Chemistry Research Group, ELKH Research Centre for Natural Sciences, Magyar tudósok krt. 2, 1117, Budapest, Hungary
| | - Károly Héberger
- Plasma Chemistry Research Group, ELKH Research Centre for Natural Sciences, Magyar tudósok krt. 2, 1117, Budapest, Hungary.
| |
|
20
|
Abdizadeh R, Heidarian E, Hadizadeh F, Abdizadeh T. QSAR Modeling, Molecular Docking and Molecular Dynamics Simulations Studies of Lysine-Specific Demethylase 1 (LSD1) Inhibitors as Anticancer Agents. Anticancer Agents Med Chem 2021; 21:987-1018. [PMID: 32698753 DOI: 10.2174/1871520620666200721134010] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 01/07/2020] [Revised: 05/07/2020] [Accepted: 05/17/2020] [Indexed: 11/22/2022]
Abstract
BACKGROUND: Histone lysine demethylase 1 (LSD1), which plays a crucial role in epigenetic modulation of gene expression, is a promising target for cancer treatment. Inhibition of LSD1 with small molecules has emerged as a vital mechanism to treat cancer. OBJECTIVE: In the present research, molecular modeling investigations, such as CoMFA, CoMFA-RF, CoMSIA and HQSAR, molecular docking and Molecular Dynamics (MD) simulations were carried out on some tranylcypromine derivatives as LSD1 inhibitors. METHODS: The QSAR models were built on a series of tranylcypromine derivatives as the data set via the SYBYL-X2.1.1 program. Molecular docking and MD simulations were carried out by the MOE software and the SYBYL program, respectively. The internal and external predictability performances of the generated models for these LSD1 inhibitors were justified by evaluating the cross-validated correlation coefficient (q2), non-cross-validated correlation coefficient (r2ncv) and predicted correlation coefficient (r2pred) of the training and test set molecules, respectively. RESULTS: The CoMFA (q2, 0.670; r2ncv, 0.930; r2pred, 0.968), CoMFA-RF (q2, 0.694; r2ncv, 0.926; r2pred, 0.927), CoMSIA (q2, 0.834; r2ncv, 0.956; r2pred, 0.958) and HQSAR models (q2, 0.854; r2ncv, 0.900; r2pred, 0.728) for training as well as the test set of LSD1 inhibition resulted in significant findings. CONCLUSION: These QSAR models were found to be perfect and strong with better predictability. Contour maps of all models were generated and it was proven by molecular docking studies and molecular dynamics simulation that the hydrophobic, electrostatic and hydrogen bonding fields are crucial in these models for improving the binding affinity and determining the structure-activity relationship. These theoretical results are possibly beneficial to design new strong LSD1 inhibitors with enhanced activity to treat cancer.
Affiliation(s)
- Rahman Abdizadeh
- Department of Medical Parasitology and Mycology, Faculty of Medicine, Shahrekord University of Medical Sciences, Shahrekord, Iran
| | - Esfandiar Heidarian
- Clinical Biochemistry Research Center, Basic Health Sciences Institute, Shahrekord University of Medical Sciences, Shahrekord, Iran
| | - Farzin Hadizadeh
- Biotechnology Research Center, Pharmaceutical Technology Institute, Mashhad University of Medical Sciences, Mashhad, Iran
| | - Tooba Abdizadeh
- Clinical Biochemistry Research Center, Basic Health Sciences Institute, Shahrekord University of Medical Sciences, Shahrekord, Iran
| |
|
21
|
Kovács D, Király P, Tóth G. Sample-size dependence of validation parameters in linear regression models and in QSAR. SAR AND QSAR IN ENVIRONMENTAL RESEARCH 2021; 32:247-268. [PMID: 33749419 DOI: 10.1080/1062936x.2021.1890208] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 11/09/2020] [Accepted: 02/10/2021] [Indexed: 06/12/2023]
Abstract
We investigated how statistical validation parameters depend on the size of the sample used to fit multivariate linear models. We observed that R2 and related internal parameters were misleading, as they overestimated the goodness-of-fit of models at small sample sizes. Cross-validation metrics showed correct trends. It was possible to scale the leave-one-out and the leave-many-out results close to identical by correcting the degrees of freedom of the models. y- and x-randomized validation parameters were calculated and the methods provided close to identical results. We suggest using the simplest methods in both cases. The external parameters followed correct trends with respect to the sample size, but their sensitivity differed. We plotted the Roy-Ojha metrics in 2D and coloured them with respect to other external parameters to provide an easy classification of models. The rank correlations were calculated between the performance parameters. Up to a certain sample size, goodness-of-fit and robustness were distinguishable, but above it, the parameters were redundant. The external-internal pairs were weakly correlated. Our data show that all three aspects of validation are necessary at small sample sizes, but the internal check of robustness is not informative above a given sample size.
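The contrast between goodness-of-fit (R2) and leave-one-out robustness (q2) described in this abstract can be illustrated with a minimal sketch. It assumes a single descriptor and ordinary least squares for brevity, whereas the paper treats multivariate models. The helper names are hypothetical, and computing q2 as 1 - PRESS/SS_tot with the overall mean of y is one common convention, assumed here rather than taken from the paper.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def total_sum_of_squares(y):
    mean_y = sum(y) / len(y)
    return sum((yi - mean_y) ** 2 for yi in y)


def r_squared(x, y):
    """Goodness-of-fit on the same data used for fitting."""
    slope, intercept = fit_line(x, y)
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    return 1 - ss_res / total_sum_of_squares(y)


def q_squared_loo(x, y):
    """Leave-one-out q2: each point is predicted by a model fitted without it."""
    press = 0.0
    for i in range(len(x)):
        slope, intercept = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - (slope * x[i] + intercept)) ** 2
    return 1 - press / total_sum_of_squares(y)


# Hypothetical small data set: five points scattered around a line.
x_values = [1.0, 2.0, 3.0, 4.0, 5.0]
y_values = [1.1, 1.9, 3.2, 3.8, 5.1]
r2 = r_squared(x_values, y_values)
q2 = q_squared_loo(x_values, y_values)
```

For least-squares fits the leave-one-out residuals are never smaller in magnitude than the fitted residuals, so q2 cannot exceed R2 under this convention; the gap tends to be widest for small samples, which is the regime the abstract warns about.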
Affiliation(s)
- D Kovács
- Institute of Chemistry, Loránd Eötvös University, Budapest, Hungary
| | - P Király
- Institute of Chemistry, Loránd Eötvös University, Budapest, Hungary
| | - G Tóth
- Institute of Chemistry, Loránd Eötvös University, Budapest, Hungary
| |
|
22
|
Zhou P, Liu Q, Wu T, Miao Q, Shang S, Wang H, Chen Z, Wang S, Wang H. Systematic Comparison and Comprehensive Evaluation of 80 Amino Acid Descriptors in Peptide QSAR Modeling. J Chem Inf Model 2021; 61:1718-1731. [DOI: 10.1021/acs.jcim.0c01370] [Citation(s) in RCA: 44] [Impact Index Per Article: 14.7] [Indexed: 02/07/2023]
Affiliation(s)
- Peng Zhou
- Center for Informational Biology, University of Electronic Science and Technology of China (UESTC) at Qingshuihe Campus, Chengdu 611731, China
- School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC) at Shahe Campus, Chengdu 610054, China
| | - Qian Liu
- Center for Informational Biology, University of Electronic Science and Technology of China (UESTC) at Qingshuihe Campus, Chengdu 611731, China
- School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC) at Shahe Campus, Chengdu 610054, China
| | - Ting Wu
- School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC) at Shahe Campus, Chengdu 610054, China
| | - Qingqing Miao
- Center for Informational Biology, University of Electronic Science and Technology of China (UESTC) at Qingshuihe Campus, Chengdu 611731, China
- School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC) at Shahe Campus, Chengdu 610054, China
| | - Shuyong Shang
- College of Chemistry and Life Science, Chengdu Normal University, Chengdu 611130, China
| | - Heyi Wang
- Center for Informational Biology, University of Electronic Science and Technology of China (UESTC) at Qingshuihe Campus, Chengdu 611731, China
- School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC) at Shahe Campus, Chengdu 610054, China
| | - Zheng Chen
- Center for Informational Biology, University of Electronic Science and Technology of China (UESTC) at Qingshuihe Campus, Chengdu 611731, China
- School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC) at Shahe Campus, Chengdu 610054, China
| | - Shaozhou Wang
- School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC) at Shahe Campus, Chengdu 610054, China
| | - Heyan Wang
- School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC) at Shahe Campus, Chengdu 610054, China
| |
|
23
|
Tong JB, Luo D, Xu HY, Bian S, Zhang X, Xiao XC, Wang J. A computational approach for designing novel SARS-CoV-2 Mpro inhibitors: combined QSAR, molecular docking, and molecular dynamics simulation techniques. NEW J CHEM 2021. [DOI: 10.1039/d1nj02127c] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 01/13/2023]
Abstract
The promising compound T21 for treating COVID-19 at the active site of SARS-CoV-2 Mpro.
Affiliation(s)
- Jian-Bo Tong
- College of Chemistry and Chemical Engineering
- Shaanxi University of Science and Technology
- Xi’an 710021
- China
- Shaanxi Key Laboratory of Chemical Additives for Industry
| | - Ding Luo
- College of Chemistry and Chemical Engineering
- Shaanxi University of Science and Technology
- Xi’an 710021
- China
- Shaanxi Key Laboratory of Chemical Additives for Industry
| | - Hai-Yin Xu
- College of Chemistry and Chemical Engineering
- Shaanxi University of Science and Technology
- Xi’an 710021
- China
- Shaanxi Key Laboratory of Chemical Additives for Industry
| | - Shuai Bian
- College of Chemistry and Chemical Engineering
- Shaanxi University of Science and Technology
- Xi’an 710021
- China
- Shaanxi Key Laboratory of Chemical Additives for Industry
| | - Xing Zhang
- College of Chemistry and Chemical Engineering
- Shaanxi University of Science and Technology
- Xi’an 710021
- China
- Shaanxi Key Laboratory of Chemical Additives for Industry
| | - Xue-Chun Xiao
- College of Chemistry and Chemical Engineering
- Shaanxi University of Science and Technology
- Xi’an 710021
- China
- Shaanxi Key Laboratory of Chemical Additives for Industry
| | - Jie Wang
- College of Chemistry and Chemical Engineering
- Shaanxi University of Science and Technology
- Xi’an 710021
- China
- Shaanxi Key Laboratory of Chemical Additives for Industry
| |
|
24
|
Gere A, Rácz A, Bajusz D, Héberger K. Multicriteria decision making for evergreen problems in food science by sum of ranking differences. Food Chem 2020; 344:128617. [PMID: 33221108 DOI: 10.1016/j.foodchem.2020.128617] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 07/16/2020] [Revised: 10/08/2020] [Accepted: 11/08/2020] [Indexed: 12/21/2022]
Abstract
Finding optimal solutions usually requires multicriteria optimization. The sum of ranking differences (SRD) algorithm can efficiently solve such problems. Its principles and earlier applications will be discussed here, along with meta-analyses of papers published in various subfields of food science, such as analytics in food chemistry, food engineering, food technology, food microbiology, quality control, and sensory analysis. Carefully selected real case studies give an overview of the wide range of applications for multicriteria optimizations, using a free, easy-to-use and validated method. Results are presented and discussed in a way that helps scientists and practitioners, who are less familiar with multicriteria optimization, to integrate the method into their research projects. The utility of SRD, optionally coupled with other statistical methods such as ANOVA, is demonstrated on altogether twelve case studies, covering diverse method comparison and data evaluation scenarios from various subfields of food science.
Affiliation(s)
- Attila Gere
- Sensory Laboratory, Institute of Food Technology, Szent István University, Villányi út 29-43., 1118 Budapest, Hungary
| | - Anita Rácz
- Plasma Chemistry Research Group, Research Centre for Natural Sciences, H-1117 Budapest, Magyar tudósok krt. 2, Hungary
| | - Dávid Bajusz
- Medicinal Chemistry Research Group, Research Centre for Natural Sciences, H-1117 Budapest, Magyar tudósok krt. 2, Hungary
| | - Károly Héberger
- Plasma Chemistry Research Group, Research Centre for Natural Sciences, H-1117 Budapest, Magyar tudósok krt. 2, Hungary.
| |
|
25
|
Rajathei DM, Parthasarathy S, Selvaraj S. Combined QSAR Model and Chemical Similarity Search for Novel HMG-CoA Reductase Inhibitors for Coronary Heart Disease. Curr Comput Aided Drug Des 2020; 16:473-485. [DOI: 10.2174/1573409915666190904114247] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 03/25/2019] [Revised: 06/30/2019] [Accepted: 08/01/2019] [Indexed: 11/22/2022]
Abstract
Background: Coronary heart disease generally occurs due to cholesterol accumulation in the walls of the heart arteries. Statins are the most widely used drugs, which work by inhibiting the active site of the 3-Hydroxy-3-methylglutaryl-CoA reductase (HMGCR) enzyme that is responsible for cholesterol synthesis. A series of atorvastatin analogs with HMGCR inhibition activity have been synthesized experimentally, which would be expensive and time-consuming. Methods: In the present study, we employed both the QSAR model and chemical similarity search for identifying novel HMGCR inhibitors for heart-related diseases. To implement this, a 2D QSAR model was developed by correlating the structural properties of a series of atorvastatin analogs reported as HMGCR inhibitors with their biological activity. Then, the chemical similarity search of atorvastatin analogs was performed by using PubChem database search. Results and Discussion: The three-descriptor model of charge (GATS1p), connectivity (SCH-7) and distance (VE1_D) of the molecules is obtained for HMGCR inhibition with the statistical values of R2 = 0.67, RMSEtr = 0.33, R2ext = 0.64 and CCCext = 0.76. The 109 novel compounds were obtained by chemical similarity search and the inhibition activities of the compounds were predicted using the QSAR model, which were close in the range of experimentally observed threshold. Conclusion: The present study suggests that the QSAR model and chemical similarity search could be used in combination for in silico identification of novel active compounds with less computation and effort.
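One of the external validation statistics quoted here, CCCext, is presumably Lin's concordance correlation coefficient computed on the external set; that reading is an assumption, since the abstract does not define it. Under that assumption, a minimal sketch of the coefficient looks as follows. The function name `concordance_ccc` and the sample values are hypothetical.

```python
def concordance_ccc(observed, predicted):
    """Lin's concordance correlation coefficient.

    CCC = 2 * cov(o, p) / (var(o) + var(p) + (mean(o) - mean(p)) ** 2),
    using population (divide-by-n) variance and covariance.
    """
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    var_o = sum((o - mean_o) ** 2 for o in observed) / n
    var_p = sum((p - mean_p) ** 2 for p in predicted) / n
    cov = sum(
        (o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted)
    ) / n
    return 2 * cov / (var_o + var_p + (mean_o - mean_p) ** 2)


# Hypothetical external-set activities and model predictions.
activity_obs = [5.2, 6.1, 6.8, 7.4, 8.0]
activity_pred = [5.5, 6.0, 6.5, 7.6, 7.7]
ccc_ext = concordance_ccc(activity_obs, activity_pred)
```

Unlike a plain correlation coefficient, this statistic also penalizes a systematic offset between predictions and observations: predictions that are perfectly correlated but shifted by a constant score below 1.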
Affiliation(s)
- David Mary Rajathei
- Department of Bioinformatics, School of Life Sciences, Bharathidasan University, Tiruchirappalli 620 024, India
| | - Subbiah Parthasarathy
- Department of Bioinformatics, School of Life Sciences, Bharathidasan University, Tiruchirappalli 620 024, India
| | - Samuel Selvaraj
- Department of Bioinformatics, School of Life Sciences, Bharathidasan University, Tiruchirappalli 620 024, India
| |
|
26
|
Abdizadeh R, Heidarian E, Hadizadeh F, Abdizadeh T. Investigation of pyrimidine analogues as xanthine oxidase inhibitors to treat of hyperuricemia and gout through combined QSAR techniques, molecular docking and molecular dynamics simulations. J Taiwan Inst Chem Eng 2020. [DOI: 10.1016/j.jtice.2020.08.028] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 10/23/2022]
|
27
|
Abstract
At the end of her academic career, the author summarizes the main aspects of QSAR modeling, giving comments and suggestions according to her 23 years' experience in QSAR research on environmental topics. The focus is mainly on Multiple Linear Regression, particularly Ordinary Least Squares, using a Genetic Algorithm for variable selection from various theoretical molecular descriptors, but the comments can be useful also for other QSAR methods. The need for rigorous validation, also external, and for applicability domain check to guarantee predictivity and reliability of QSAR models is particularly highlighted. The commented approach is the “predictive” one, based on chemometrics, and is usefully applied to the prioritization of environmental pollutants. All the discussed points and the author's ideas are implemented in the software QSARINS, as a legacy to the QSAR community.
|
28
|
Hao Q, Zhou J, Zhou L, Kang L, Nan T, Yu Y, Guo L. Prediction the contents of fructose, glucose, sucrose, fructo-oligosaccharides and iridoid glycosides in Morinda officinalis radix using near-infrared spectroscopy. SPECTROCHIMICA ACTA. PART A, MOLECULAR AND BIOMOLECULAR SPECTROSCOPY 2020; 234:118275. [PMID: 32217454 DOI: 10.1016/j.saa.2020.118275] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 08/30/2019] [Revised: 02/24/2020] [Accepted: 03/15/2020] [Indexed: 05/23/2023]
Abstract
Morindae officinalis radix (MOR) is a famous Chinese herbal medicine with a long history of use in medicine and food. MOR and MOR with steaming process (PMOR) are the most commonly used forms in clinical and health care. In order to establish a fast and mostly nondestructive quality control method for MOR, 183 batches of MOR samples and 20 batches of PMOR samples were collected commercially from major producing areas in Guangdong, Fujian and Guangxi Provinces of China. To predict the main components of MOR, a calibration model was established based on near-infrared spectroscopy with partial least square regression. The model was optimized by comparing the root mean square error of prediction (RMSEP), root mean square error of cross validation (RMSECV), coefficient of correlation (R2) and ratio of performance to deviation (RPD). Comparative studies were performed to evaluate the performance of models built with different spectra preprocessing methods and different data sets. The results showed that the model performance was improved with the standard normal variate spectra preprocessing method and when the data set contained both MOR and PMOR samples. Adding a few PMOR samples to the MOR data set improved the model's predictive performance. The contents of 14 components were predicted in MOR with lower RMSEP and RMSECV, and higher R2 and RPD, including fructose (12.8 mg/g, 16.3 mg/g, 0.9873, 10.10), glucose (7.28 mg/g, 8.73 mg/g, 0.9611, 6.21), sucrose (9.24 mg/g, 9.10 mg/g, 0.8419, 1.75), GF2 (9.42 mg/g, 11.3 mg/g, 0.8526, 2.03), GF3 (7.98 mg/g, 9.20 mg/g, 0.8756, 2.74), GF4 (6.81 mg/g, 8.93 mg/g, 0.8663, 3.06), GF5 (8.13 mg/g, 8.85 mg/g, 0.9001, 3.06), GF6 (6.40 mg/g, 6.95 mg/g, 0.9145, 3.27), GF7 (5.53 mg/g, 6.15 mg/g, 0.9195, 3.57), GF8 (5.40 mg/g, 6.02 mg/g, 0.9179, 3.31), GF9 (3.00 mg/g, 4.35 mg/g, 0.9446, 5.03), GF10 (4.08 mg/g, 5.34 mg/g, 0.8983, 3.62), GF11 (8.97 mg/g, 7.70 mg/g, 0.8683, 2.01) and iridoid glycosides (4.12 mg/g, 5.51 mg/g, 0.8712, 2.43).
The model established in this paper could predict 14 components of MOR. The results would provide a reference method for the quality control of Chinese medical materials and their process products.
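The figures of merit named in this abstract (RMSEP, R2, RPD) can be illustrated with a minimal sketch. It assumes that R2 is computed as the coefficient of determination and that RPD is the sample standard deviation of the reference values divided by the RMSEP; the abstract does not state which conventions the authors used, and the sample values below are hypothetical, not data from the paper.

```python
import math
import statistics


def rmse(reference, predicted):
    """Root mean square error between reference and predicted values."""
    n = len(reference)
    return math.sqrt(sum((r - p) ** 2 for r, p in zip(reference, predicted)) / n)


def r_squared(reference, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed convention)."""
    mean_ref = statistics.fmean(reference)
    ss_res = sum((r - p) ** 2 for r, p in zip(reference, predicted))
    ss_tot = sum((r - mean_ref) ** 2 for r in reference)
    return 1.0 - ss_res / ss_tot


def rpd(reference, predicted):
    """Ratio of performance to deviation: SD of reference values / RMSE."""
    return statistics.stdev(reference) / rmse(reference, predicted)


# Hypothetical contents (mg/g) for a prediction set; illustration only.
reference = [100.0, 150.0, 200.0, 250.0, 300.0]
predicted = [110.0, 140.0, 205.0, 245.0, 310.0]

print(f"RMSEP = {rmse(reference, predicted):.2f} mg/g")
print(f"R2    = {r_squared(reference, predicted):.4f}")
print(f"RPD   = {rpd(reference, predicted):.2f}")
```

A larger RPD means the prediction error is small relative to the natural spread of the reference values, which is why it is reported alongside RMSEP.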
Affiliation(s)
- Qingxiu Hao
- National Resource Center for Chinese Materia Medica, China Academy of Chinese Medical Sciences, State Key Laboratory Breeding Base of Dao-di Herbs, National Resource Center for Chinese Materia Medica-Infinitus (China) Joint Laboratory Herbs Quality Research, No.16 Nanxiaojie, Dongzhimen Nei Ave., Beijing 100700, China
| | - Jie Zhou
- University of Jinan, No.336 Westnanxinzhuang Road, Jinan, Shandong 250022, China
| | - Li Zhou
- National Resource Center for Chinese Materia Medica, China Academy of Chinese Medical Sciences, State Key Laboratory Breeding Base of Dao-di Herbs, National Resource Center for Chinese Materia Medica-Infinitus (China) Joint Laboratory Herbs Quality Research, No.16 Nanxiaojie, Dongzhimen Nei Ave., Beijing 100700, China
| | - Liping Kang
- National Resource Center for Chinese Materia Medica, China Academy of Chinese Medical Sciences, State Key Laboratory Breeding Base of Dao-di Herbs, National Resource Center for Chinese Materia Medica-Infinitus (China) Joint Laboratory Herbs Quality Research, No.16 Nanxiaojie, Dongzhimen Nei Ave., Beijing 100700, China
| | - Tiegui Nan
- National Resource Center for Chinese Materia Medica, China Academy of Chinese Medical Sciences, State Key Laboratory Breeding Base of Dao-di Herbs, National Resource Center for Chinese Materia Medica-Infinitus (China) Joint Laboratory Herbs Quality Research, No.16 Nanxiaojie, Dongzhimen Nei Ave., Beijing 100700, China
| | - Yi Yu
- Infinitus (China) Company Ltd, The 1st floor, 19 Sicheng Road, Tianhe District, Guangzhou City 510663, China.
| | - Lanping Guo
- National Resource Center for Chinese Materia Medica, China Academy of Chinese Medical Sciences, State Key Laboratory Breeding Base of Dao-di Herbs, National Resource Center for Chinese Materia Medica-Infinitus (China) Joint Laboratory Herbs Quality Research, No.16 Nanxiaojie, Dongzhimen Nei Ave., Beijing 100700, China.
| |
|
29
|
Mora JR, Marrero-Ponce Y, García-Jacas CR, Suarez Causado A. Ensemble Models Based on QuBiLS-MAS Features and Shallow Learning for the Prediction of Drug-Induced Liver Toxicity: Improving Deep Learning and Traditional Approaches. Chem Res Toxicol 2020; 33:1855-1873. [PMID: 32406679 DOI: 10.1021/acs.chemrestox.0c00030] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/16/2022]
Abstract
Drug-induced liver injury (DILI) is a key safety issue in the drug discovery pipeline and a regulatory concern. Thus, many in silico tools have been proposed to improve the hepatotoxicity prediction of organic-type chemicals. Here, classifiers for the prediction of DILI were developed by using QuBiLS-MAS 0-2.5D molecular descriptors and shallow machine learning techniques, on a training set composed of 1075 molecules. The best ensemble model built, E13, was obtained with good statistical parameters for the learning series, namely: accuracy = 0.840, sensitivity = 0.890, specificity = 0.761, Matthews correlation coefficient = 0.660, and area under the ROC curve = 0.904. The model was also satisfactorily evaluated with a Y-scrambling test, repeated k-fold cross-validation, and repeated k-holdout validation. In addition, an exhaustive external validation was carried out by using two test sets and five external test sets, with an average accuracy value equal to 0.854 (±0.062) and a coverage equal to 98.4% according to its applicability domain. A statistical comparison of the performance of the E13 model, with regard to results and tools (e.g., Padel DDPredictor Software, Deep Learning DILIserver, and Vslead) reported in the literature, was also performed. In general, E13 presented the best global performance in all experiments. The sum of the ranking differences procedure provided a very similar grouping pattern to that of the M-ANOVA statistical analysis, where E13 was identified as the best model for DILI predictions. A noncommercial and fully cross-platform software for DILI prediction was also developed, which is freely available at http://tomocomd.com/apps/ptoxra.
This software was used for the screening of seven data sets, containing natural products, leads, toxic materials, and FDA approved drugs, to assess the usefulness of the QSAR models in the DILI labeling of organic substances; it was found that 50-92% of the evaluated molecules are positive-DILI compounds. All in all, it can be stated that the E13 model is a relevant method for the prediction of DILI risk in humans, as it shows the best results among all of the methods analyzed.
Affiliation(s)
- Jose R Mora
- Grupo de Química Computacional y Teórica (QCT-USFQ), Departamento de Ingeniería Química, Universidad San Francisco de Quito (USFQ), Quito 17-1200-841, Ecuador.,Instituto de Simulación Computacional (ISC-USFQ), Universidad San Francisco de Quito (USFQ), Diego de Robles y Vía Interoceánica, Quito 17-1200-841, Ecuador
| | - Yovani Marrero-Ponce
- Instituto de Simulación Computacional (ISC-USFQ), Universidad San Francisco de Quito (USFQ), Diego de Robles y Vía Interoceánica, Quito 17-1200-841, Ecuador.,Grupo de Medicina Molecular y Traslacional (MeM&T), Colegio de Ciencias de la Salud (COCSA), Escuela de Medicina, Edificio de Especialidades Médicas, and Instituto de Simulación Computacional (ISC-USFQ), Universidad San Francisco de Quito (USFQ), Diego de Robles y vía Interoceánica, Quito, Pichincha 170157, Ecuador
| | - César R García-Jacas
- Cátedras Conacyt-Departamento de Ciencias de la Computación, Centro de Investigación Científica y de Educación Superior de Ensenada (CICESE), Ensenada, Baja California 22860, México
| | - Amileth Suarez Causado
- Grupo de Investigación Prometeus & Biomedicina Aplicada a las Ciencias Clínicas, Área de Bioquímica, Campus de Zaragocilla, Facultad de Medicina, Universidad de Cartagena, Cartagena de Indias 130001, Colombia
| |
|
30
|
Drug design by machine-trained elastic networks: predicting Ser/Thr-protein kinase inhibitors' activities. Mol Divers 2020; 25:899-909. [PMID: 32222890 DOI: 10.1007/s11030-020-10074-6] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/21/2020] [Accepted: 03/11/2020] [Indexed: 12/23/2022]
Abstract
An elastic network model (ENM) represents a molecule as a matrix of pairwise atomic interactions. Rich in coded information, ENMs are hereby proposed as a novel tool for the prediction of the activity of series of molecules, with widely different chemical structures, but a common biological activity. The new approach is developed and tested using a set of 183 inhibitors of serine/threonine-protein kinase enzyme (Plk3) which is an enzyme implicated in the regulation of cell cycle and tumorigenesis. The elastic network (EN) predictive model is found to exhibit high accuracy and speed compared to descriptor-based machine-trained modeling. EN modeling appears to be a highly promising new tool for the high demands of industrial applications such as drug and material design.
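The abstract does not say which elastic network formulation was used. As a minimal sketch of "a matrix of pairwise atomic interactions", the example below assumes a Gaussian-network-style connectivity (Kirchhoff) matrix, in which two atoms interact when they lie within a distance cutoff. The cutoff value and the coordinates are hypothetical.

```python
import math


def kirchhoff_matrix(coords, cutoff):
    """Connectivity matrix: -1 for atom pairs within `cutoff`, degree on the diagonal."""
    n = len(coords)
    gamma = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                gamma[i][j] = gamma[j][i] = -1
                gamma[i][i] += 1
                gamma[j][j] += 1
    return gamma


# Three hypothetical atoms on a line, 1.5 units apart; cutoff of 2.0 units.
coords = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
gamma = kirchhoff_matrix(coords, cutoff=2.0)
for row in gamma:
    print(row)
```

Each row of such a matrix sums to zero, and the matrix depends only on geometry, which is one reason this kind of representation can be applied to chemically diverse molecules.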
|
31
|
Jalili-Jahani N, Fatehi A. Multivariate image analysis-quantitative structure-retention relationship study of polychlorinated biphenyls using partial least squares and radial basis function neural networks. J Sep Sci 2020; 43:1479-1488. [PMID: 32052926 DOI: 10.1002/jssc.201901101] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/19/2019] [Revised: 01/21/2020] [Accepted: 02/07/2020] [Indexed: 11/10/2022]
Abstract
Polychlorinated biphenyls belong to a class of hazardous environmental pollutants. Gas chromatography separation and experimental relative retention time evaluation of these compounds on a poly (94% methyl/5% phenyl) silicone-based capillary non-bonded and cross-linked column are time consuming and expensive. In this study, relative retention times were estimated using two-dimensional images of molecules based on a newly implemented rapid and simple quantitative structure-retention relationship methodology. The resulting descriptors were subjected to partial least squares and principal component-radial basis function neural networks as linear and nonlinear models, respectively, to attain a statistical explanation of the retention behavior of the molecules. The high correlation coefficients and low root mean square errors of the partial least squares model confirm the superiority of this model as well as the linear dependency of the relative retention times on the images of the molecules. The best correlation model was evaluated using internal and external tests, and its applicability domain was checked using a distance to the model in the X-Space plot. This study provides a practical and effective method for analytical chemists working with chromatographic platforms to improve predictive confidence of studies that seek to identify unknown molecules or impurities.
Affiliation(s)
- Nasser Jalili-Jahani
- Green Land Shiraz Eksir Chemical and Agricultural Industries Company, Shiraz, Iran
| | - Azadeh Fatehi
- Green Land Shiraz Eksir Chemical and Agricultural Industries Company, Shiraz, Iran
| |
|
32
|
In silico studies of novel scaffold of thiazolidin-4-one derivatives as anti-Toxoplasma gondii agents by 2D/3D-QSAR, molecular docking, and molecular dynamics simulations. Struct Chem 2020. [DOI: 10.1007/s11224-019-01458-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/02/2023]
|
33
|
QSAR analysis of coumarin-based benzamides as histone deacetylase inhibitors using CoMFA, CoMSIA and HQSAR methods. J Mol Struct 2020. [DOI: 10.1016/j.molstruc.2019.126961] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/16/2022]
|
34
|
Martin EJ, Polyakov VR, Zhu XW, Tian L, Mukherjee P, Liu X. All-Assay-Max2 pQSAR: Activity Predictions as Accurate as Four-Concentration IC 50s for 8558 Novartis Assays. J Chem Inf Model 2019; 59:4450-4459. [PMID: 31518124 DOI: 10.1021/acs.jcim.9b00375] [Citation(s) in RCA: 42] [Impact Index Per Article: 8.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/20/2023]
Abstract
Profile-quantitative structure-activity relationship (pQSAR) is a massively multitask, two-step machine learning method with unprecedented scope, accuracy, and applicability domain. In step one, a "profile" of conventional single-assay random forest regression models are trained on a very large number of biochemical and cellular pIC50 assays using Morgan 2 substructural fingerprints as compound descriptors. In step two, a panel of partial least squares (PLS) models are built using the profile of pIC50 predictions from those random forest regression models as compound descriptors (hence the name). Previously described for a panel of 728 biochemical and cellular kinase assays, we have now built an enormous pQSAR from 11 805 diverse Novartis (NVS) IC50 and EC50 assays. This large number of assays, and hence of compound descriptors for PLS, dictated reducing the profile by only including random forest regression models whose predictions correlate with the assay being modeled. The random forest regression and pQSAR models were evaluated with our "realistically novel" held-out test set, whose median average similarity to the nearest training set member across the 11 805 assays was only 0.34, comparable to the novelty of compounds actually selected from virtual screens. For the 11 805 single-assay random forest regression models, the median correlation of prediction with the experiment was only rext2 = 0.05, virtually random, and only 8% of the models achieved our standard success threshold of rext2 = 0.30. For pQSAR, the median correlation was rext2 = 0.53, comparable to four-concentration experimental IC50s, and 72% of the models met our rext2 > 0.30 standard, totaling 8558 successful models. The successful models included assays from all of the 51 annotated target subclasses, as well as 4196 phenotypic assays, indicating that pQSAR can be applied to virtually any disease area. 
Every month, all models are updated to include new measurements, and predictions are made for 5.5 million NVS compounds, totaling 50 billion predictions. Common uses have included virtual screening, selectivity design, toxicity and promiscuity prediction, mechanism-of-action prediction, and others. Several such actual applications are described.
Affiliation(s)
- Eric J Martin
- Novartis Institute for Biomedical Research, 5300 Chiron Way, Emeryville, California 94608-2916, United States
| | - Valery R Polyakov
- Novartis Institute for Biomedical Research, 5300 Chiron Way, Emeryville, California 94608-2916, United States
| | - Xiang-Wei Zhu
- Novartis Institute for Biomedical Research, 5300 Chiron Way, Emeryville, California 94608-2916, United States
| | - Li Tian
- Novartis Institute for Biomedical Research, 5300 Chiron Way, Emeryville, California 94608-2916, United States.,China Novartis Institutes for BioMedical Research Company, Limited, 2F, Building 4, Novartis Campus, No. 4218 Jinke Road, Zhangjiang, Pudong, Shanghai 201203, China
| | - Prasenjit Mukherjee
- Novartis Institute for Biomedical Research, 5300 Chiron Way, Emeryville, California 94608-2916, United States
| | - Xin Liu
- Novartis Institute for Biomedical Research, 5300 Chiron Way, Emeryville, California 94608-2916, United States.,China Novartis Institutes for BioMedical Research Company, Limited, 2F, Building 4, Novartis Campus, No. 4218 Jinke Road, Zhangjiang, Pudong, Shanghai 201203, China
| |
|
35
|
Venugopal PP, Das BK, Soorya E, Chakraborty D. Effect of hydrophobic and hydrogen bonding interactions on the potency of ß-alanine analogs of G-protein coupled glucagon receptor inhibitors. Proteins 2019; 88:327-344. [PMID: 31443129 DOI: 10.1002/prot.25807] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/19/2019] [Revised: 08/09/2019] [Accepted: 08/21/2019] [Indexed: 01/06/2023]
Abstract
G-protein coupled glucagon receptors (GCGRs) play an important role in glucose homeostasis and the pathophysiology of Type-II Diabetes Mellitus (T2DM). The allosteric pocket located at the trans-membrane domain of GCGR consists of hydrophobic (TM5) and hydrophilic (TM7) units. Hydrophobic interactions with the amino acid residues present at TM5 were found to facilitate the favorable orientation of the antagonist in the GCGR allosteric pocket. A statistically robust and highly predictive 3D-QSAR model was developed using 58 β-alanine based GCGR antagonists with significant variation in structure and potency profile. The correlation coefficient (R2) and cross-validation coefficient (Q2) of the developed model were found to be 0.9981 and 0.8253, respectively, at a PLS factor of 8. The favorable and unfavorable contributions of different structural features of the glucagon receptor antagonists were analyzed with 3D-QSAR contour plots. Hydrophobic and hydrogen bonding interactions were found to be the dominant non-bonding interactions in docking studies. The presence of the highest occupied molecular orbital (HOMO) in the polar part and the lowest unoccupied molecular orbital (LUMO) in the hydrophobic part of the antagonists leads to favorable protein-ligand interactions. Molecular mechanics/generalized Born surface area (MM/GBSA) calculations showed that van der Waals and nonpolar solvation energy terms are crucial components for thermodynamically stable binding of the inhibitors. The binding free energy of the highly potent compound was found to be -63.475 kcal/mol, whereas the least active compound exhibited a binding energy of -41.097 kcal/mol. Further, five 100 ns molecular dynamics (MD) simulations were performed to confirm the stability of the inhibitor-receptor complex. Outcomes of the present study can serve as the basis for designing improved GCGR antagonists.
Affiliation(s)
- Pushyaraga P Venugopal
- Biophysical and Computational Chemistry Laboratory, Department of Chemistry, National Institute of Technology Karnataka, Surathkal, Mangalore, India
| | - Bratin K Das
- Biophysical and Computational Chemistry Laboratory, Department of Chemistry, National Institute of Technology Karnataka, Surathkal, Mangalore, India
| | - E Soorya
- Biophysical and Computational Chemistry Laboratory, Department of Chemistry, National Institute of Technology Karnataka, Surathkal, Mangalore, India
| | - Debashree Chakraborty
- Biophysical and Computational Chemistry Laboratory, Department of Chemistry, National Institute of Technology Karnataka, Surathkal, Mangalore, India
| |
|
36
|
Rácz A, Bajusz D, Héberger K. Multi-Level Comparison of Machine Learning Classifiers and Their Performance Metrics. Molecules 2019; 24:E2811. [PMID: 31374986 PMCID: PMC6695655 DOI: 10.3390/molecules24152811] [Citation(s) in RCA: 38] [Impact Index Per Article: 7.6] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/17/2019] [Accepted: 07/30/2019] [Indexed: 01/28/2023] Open
Abstract
Machine learning classification algorithms are widely used for the prediction and classification of different properties of molecules, such as toxicity or biological activity. The prediction of toxic vs. non-toxic molecules is important because testing on living animals has both ethical and cost drawbacks. The quality of classification models can be determined with several performance parameters, which often give conflicting results. In this study, we performed a multi-level comparison with the use of different performance metrics and machine learning classification methods. Well-established and standardized protocols for the machine learning tasks were used in each case. The comparison was applied to three datasets (acute and aquatic toxicities) and the robust, yet sensitive, sum of ranking differences (SRD) and analysis of variance (ANOVA) were applied for evaluation. The effect of dataset composition (balanced vs. imbalanced) and 2-class vs. multiclass classification scenarios was also studied. Most of the performance metrics are sensitive to dataset composition, especially in 2-class classification problems. The optimal machine learning algorithm also depends significantly on the composition of the dataset.
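The core of the sum of ranking differences (SRD) calculation mentioned in this abstract can be sketched as follows. The sketch assumes the row-wise average as the reference ranking (a common choice, not confirmed by the abstract) and leaves out the normalization and randomization-based validation that a full SRD analysis includes. The scores are hypothetical.

```python
def average_ranks(values):
    """Ascending ranks starting at 1; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def srd(columns, reference=None):
    """Sum of absolute rank differences between each column and the reference."""
    n_rows = len(next(iter(columns.values())))
    if reference is None:
        # Assumed reference: the row-wise mean over all compared columns.
        reference = [
            sum(col[r] for col in columns.values()) / len(columns)
            for r in range(n_rows)
        ]
    ref_ranks = average_ranks(reference)
    return {
        name: sum(abs(a - b) for a, b in zip(average_ranks(col), ref_ranks))
        for name, col in columns.items()
    }


# Hypothetical scores of three methods (columns) on four cases (rows).
scores = {
    "A": [0.9, 0.8, 0.7, 0.6],
    "B": [0.6, 0.7, 0.8, 0.9],
    "C": [0.9, 0.7, 0.8, 0.6],
}
result = srd(scores)
print(result)
```

A smaller SRD value means the column orders the cases more like the reference does; here column C reproduces the reference ranking exactly.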
Affiliation(s)
- Anita Rácz
- Plasma Chemistry Research Group, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Magyar tudósok krt. 2, H-1117 Budapest, Hungary
| | - Dávid Bajusz
- Medicinal Chemistry Research Group, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Magyar tudósok krt. 2, H-1117 Budapest, Hungary.
| | - Károly Héberger
- Plasma Chemistry Research Group, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Magyar tudósok krt. 2, H-1117 Budapest, Hungary
| |
|
37
|
Rácz A, Bajusz D, Héberger K. Intercorrelation Limits in Molecular Descriptor Preselection for QSAR/QSPR. Mol Inform 2019; 38:e1800154. [PMID: 30945814 PMCID: PMC6767540 DOI: 10.1002/minf.201800154] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/19/2018] [Accepted: 03/13/2019] [Indexed: 01/03/2023]
Abstract
QSAR/QSPR (quantitative structure-activity/property relationship) modeling has been a prevalent approach in various, overlapping sub-fields of computational, medicinal and environmental chemistry for decades. The generation and selection of molecular descriptors is an essential part of this process. In typical QSAR workflows, the starting pool of molecular descriptors is rationalized based on filtering out descriptors which are (i) constant throughout the whole dataset, or (ii) very strongly correlated to another descriptor. While the former is fairly straightforward, the latter involves a level of subjectivity when deciding what exactly is considered to be a strong correlation. Despite that, most QSAR modeling studies do not report on this step. In this study, we examine in detail the effect of various possible descriptor intercorrelation limits on the resulting QSAR models. Statistical comparisons are carried out based on four case studies from contemporary QSAR literature, using a combined methodology based on sum of ranking differences (SRD) and analysis of variance (ANOVA).
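The two preselection filters described in this abstract, removing constant descriptors and removing descriptors that correlate too strongly with another one, can be sketched as below. The 0.95 limit, the descriptor names and the rule of keeping whichever descriptor comes first are illustrative assumptions; the study itself examines how the choice of limit affects the resulting models.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def preselect(descriptors, limit=0.95):
    """Drop constant descriptors, then any descriptor whose absolute
    correlation with an already kept descriptor exceeds `limit`."""
    kept = {}
    for name, values in descriptors.items():
        if max(values) == min(values):
            continue  # (i) constant throughout the dataset
        if any(abs(pearson(values, other)) > limit for other in kept.values()):
            continue  # (ii) too strongly correlated with a kept descriptor
        kept[name] = values
    return list(kept)


# Hypothetical descriptor table: four molecules, four descriptors.
descriptors = {
    "mw": [100.0, 150.0, 200.0, 250.0],
    "const": [1.0, 1.0, 1.0, 1.0],
    "mw_scaled": [1.0, 1.5, 2.0, 2.5],
    "logp": [2.0, 0.5, 3.0, 1.0],
}
selected = preselect(descriptors)
print(selected)
```

Because the filter is greedy, the outcome depends on descriptor order as well as on the limit, which is part of the subjectivity the abstract points out.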
Affiliation(s)
- Anita Rácz
- Plasma Chemistry Research Group, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Magyar tudósok krt. 2, 1117 Budapest, Hungary
| | - Dávid Bajusz
- Medicinal Chemistry Research Group, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Magyar tudósok krt. 2, 1117 Budapest, Hungary
| | - Károly Héberger
- Plasma Chemistry Research Group, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Magyar tudósok krt. 2, 1117 Budapest, Hungary
| |
|
38
|
Karadžić Banjac MŽ, Kovačević SZ, Jevrić LR, Podunavac-Kuzmanović SO, Mandić AI. On the characterization of novel biologically active steroids: Selection of lipophilicity models of newly synthesized steroidal derivatives by classical and non-parametric ranking approaches. Comput Biol Chem 2019; 80:23-30. [DOI: 10.1016/j.compbiolchem.2019.03.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/26/2018] [Accepted: 03/09/2019] [Indexed: 10/27/2022]
|
39
|
Gere A, Radványi D, Héberger K. Which insect species can best be proposed for human consumption? INNOV FOOD SCI EMERG 2019. [DOI: 10.1016/j.ifset.2019.01.016] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/26/2022]
|
40
|
Sheikhpour R, Sarram MA, Sheikhpour E. Semi-supervised sparse feature selection via graph Laplacian based scatter matrix for regression problems. Inf Sci (N Y) 2018. [DOI: 10.1016/j.ins.2018.08.035] [Citation(s) in RCA: 25] [Impact Index Per Article: 4.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022]
|
41
|
Rajathei DM, Parthasarathy S, Selvaraj S. QSAR Analysis of Multimodal Antidepressants Vortioxetine Analogs Using Physicochemical Descriptors and MLR Modeling. Curr Comput Aided Drug Des 2018; 15:294-307. [PMID: 30317998 DOI: 10.2174/1573409914666181011144810] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/09/2018] [Revised: 07/10/2018] [Accepted: 10/03/2018] [Indexed: 11/22/2022]
Abstract
BACKGROUND Vortioxetine is a multimodal antidepressant drug with combined effects on SERT as an inhibitor, 5-HT1A as an agonist and 5-HT3A as an antagonist. Series of vortioxetine analogs have been reported as multimodal antidepressant compounds; they block serotonin transport into the neuronal cells, activate the postsynaptic 5-HT1A receptors and eliminate the low activity of 5-HT3A receptors. OBJECTIVE To explore the important properties of vortioxetine analogs involved in antidepressant activity by developing 2D QSAR models. METHODS Significant descriptors were selected by the Least Absolute Shrinkage and Selection Operator (LASSO) method, and the Multiple Linear Regression (MLR) method, together with the All Subsets and GA algorithms included in the QSARINS software, was used to generate the QSAR models. Further, virtual screening was performed based on bioactivity and structure similarity using the PubChem database. RESULTS A four-descriptor model of complementary information content (CIC2), solubility (bcutp3), mass (bcutm8) and partial charge in van der Waals surface area (PEOEVSA7) of the molecules was obtained for SERT inhibition, with statistics of R2= 0.69, RMSEtr= 0.44, R2 ext= 0.62 and CCCext= 0.78. For the 5-HT1A agonist, a two-descriptor model of molecular shape (Kappm3) and van der Waals volume of the atoms (bcutv11) with R2= 0.78, RMSEtr= 0.33, R2 ext= 0.83 and CCCext= 0.87 was established. A three-descriptor model of information content (IC3), solubility (bcutp9) and electronegativity (GATSe5) of the molecules with R2= 0.61, RMSEtr= 0.34, R2 ext= 0.69 and CCCext= 0.72 was obtained for the 5-HT3A antagonist. The antidepressant activities of 16 virtually screened compounds were predicted using the developed models. CONCLUSION The developed QSAR models may be useful for predicting the antidepressant activity of newly synthesized vortioxetine analogs.
Affiliation(s)
- David M Rajathei
- Department of Bioinformatics, School of Life Sciences, Bharathidasan University, Tiruchirappalli 620 024, Tamilnadu, India
| | - Subbiah Parthasarathy
- Department of Bioinformatics, School of Life Sciences, Bharathidasan University, Tiruchirappalli 620 024, Tamilnadu, India
| | - Samuel Selvaraj
- Department of Bioinformatics, School of Life Sciences, Bharathidasan University, Tiruchirappalli 620 024, Tamilnadu, India
| |
|
42
|
Consonni V, Todeschini R, Ballabio D, Grisoni F. On the Misleading Use of $Q^{2}_{F3}$ for QSAR Model Comparison. Mol Inform 2018; 38:e1800029. [PMID: 30142701 DOI: 10.1002/minf.201800029] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/07/2018] [Accepted: 07/20/2018] [Indexed: 11/06/2022]
Abstract
Quantitative Structure-Activity Relationship (QSAR) models play a central role in medicinal chemistry, toxicology and computer-assisted molecular design, as well as a support for regulatory decisions and animal testing reduction. Thus, assessing their predictive ability becomes an essential step for any prospective application. Many metrics have been proposed to estimate the model predictive ability of QSARs, which have created confusion on how models should be evaluated and properly compared. Recently, we showed that the metric $Q^{2}_{F3}$ is particularly well-suited for comparing the external predictivity of different models developed on the same training dataset. However, when comparing models developed on different training data, this function becomes inadequate and only dispersion measures like the root-mean-square error (RMSE) should be used. The intent of this work is to provide clarity on the correct and incorrect uses of $Q^{2}_{F3}$, discussing its behavior towards the training data distribution and illustrating some cases in which $Q^{2}_{F3}$ estimates may be misleading. Hereby, we encourage the usage of measures of dispersion when models trained on different datasets have to be compared and evaluated.
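The dependence on the training data that this abstract warns about can be illustrated with a short sketch. It assumes the usual definition of the metric, one minus the mean squared external prediction error divided by the mean squared deviation of the training responses from their mean. The response values are hypothetical.

```python
import math


def rmse_ext(y_ext, y_pred):
    """Root-mean-square error on the external set."""
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(y_ext, y_pred)) / len(y_ext))


def q2_f3(y_train, y_ext, y_pred):
    """Assumed definition: 1 - (mean squared external error) /
    (mean squared deviation of training responses from the training mean)."""
    mean_train = sum(y_train) / len(y_train)
    mse_ext = sum((y - p) ** 2 for y, p in zip(y_ext, y_pred)) / len(y_ext)
    var_train = sum((y - mean_train) ** 2 for y in y_train) / len(y_train)
    return 1.0 - mse_ext / var_train


# Hypothetical external set with identical prediction errors in both cases.
y_ext = [5.0, 6.0, 7.0]
y_pred = [5.5, 5.5, 7.5]

narrow_train = [5.0, 6.0, 7.0]   # training responses with a small spread
wide_train = [2.0, 6.0, 10.0]    # training responses with a large spread

print("RMSE:", rmse_ext(y_ext, y_pred))
print("Q2F3 (narrow training set):", q2_f3(narrow_train, y_ext, y_pred))
print("Q2F3 (wide training set):", q2_f3(wide_train, y_ext, y_pred))
```

The external errors are the same in both cases, so the RMSE is unchanged, yet the metric rises when the training responses are more spread out. This is the kind of situation in which comparing models trained on different data by this metric can mislead.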
Affiliation(s)
- Viviana Consonni
- University of Milano-Bicocca, Dept. of Earth and Environmental Sciences, piazza della Scienza 1, 20126, Milano, Italy
| | - Roberto Todeschini
- University of Milano-Bicocca, Dept. of Earth and Environmental Sciences, piazza della Scienza 1, 20126, Milano, Italy
| | - Davide Ballabio
- University of Milano-Bicocca, Dept. of Earth and Environmental Sciences, piazza della Scienza 1, 20126, Milano, Italy
| | - Francesca Grisoni
- University of Milano-Bicocca, Dept. of Earth and Environmental Sciences, piazza della Scienza 1, 20126, Milano, Italy
| |
|
43
|
Sheikhpour R, Sarram MA, Rezaeian M, Sheikhpour E. QSAR modelling using combined simple competitive learning networks and RBF neural networks. SAR AND QSAR IN ENVIRONMENTAL RESEARCH 2018; 29:257-276. [PMID: 29372662 DOI: 10.1080/1062936x.2018.1424030] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 11/14/2017] [Accepted: 01/02/2018] [Indexed: 06/07/2023]
Abstract
The aim of this study was to propose a QSAR modelling approach based on the combination of simple competitive learning (SCL) networks with radial basis function (RBF) neural networks for predicting the biological activity of chemical compounds. The proposed QSAR method consisted of two phases. In the first phase, an SCL network was applied to determine the centres of an RBF neural network. In the second phase, the RBF neural network was used to predict the biological activity of various phenols and Rho kinase (ROCK) inhibitors. The predictive ability of the proposed QSAR models was evaluated and compared with other QSAR models using external validation. The results of this study showed that the proposed QSAR modelling approach leads to better performances than other models in predicting the biological activity of chemical compounds. This indicated the efficiency of simple competitive learning networks in determining the centres of RBF neural networks.
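The first phase described in this abstract, using competitive learning to place the centres of a radial basis function network, can be sketched as follows. This is a generic winner-take-all update, not the authors' implementation; the learning rate, epoch count, Gaussian width and data points are all hypothetical, and the second phase (fitting output weights on the RBF features) is omitted.

```python
import math
import random


def scl_centres(points, n_centres, lr=0.1, epochs=50, seed=0):
    """Simple competitive learning: for each point, move only the nearest
    centre (the winner) a fraction `lr` of the way towards that point."""
    rng = random.Random(seed)
    centres = [list(p) for p in rng.sample(points, n_centres)]
    for _ in range(epochs):
        for p in points:
            winner = min(centres, key=lambda c: math.dist(c, p))
            for d in range(len(p)):
                winner[d] += lr * (p[d] - winner[d])
    return centres


def rbf_features(point, centres, width):
    """Gaussian RBF activations of one point with respect to each centre."""
    return [
        math.exp(-math.dist(point, c) ** 2 / (2 * width ** 2)) for c in centres
    ]


# Two hypothetical, well-separated groups of descriptor vectors.
points = [
    (0.0, 0.0), (0.2, 0.0), (0.0, 0.2),
    (5.0, 5.0), (5.2, 5.0), (5.0, 5.2),
]
centres = sorted(scl_centres(points, n_centres=2))
print(centres)
print(rbf_features((0.1, 0.1), centres, width=1.0))
```

In a complete model, the activations returned by `rbf_features` would serve as inputs to a linear output layer fitted to the measured activities.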
Affiliation(s)
- R Sheikhpour
- Department of Computer Engineering, Yazd University, Yazd, Iran
| | - M A Sarram
- Department of Computer Engineering, Yazd University, Yazd, Iran
| | - M Rezaeian
- Department of Computer Engineering, Yazd University, Yazd, Iran
| | - E Sheikhpour
- Hematology and Oncology Research Center, Shahid Sadoughi University of Medical Sciences, Yazd, Iran
| |
|
44
|
De P, Roy K. Greener chemicals for the future: QSAR modelling of the PBT index using ETA descriptors. SAR AND QSAR IN ENVIRONMENTAL RESEARCH 2018; 29:319-337. [PMID: 29457543 DOI: 10.1080/1062936x.2018.1436086] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/17/2023]
Abstract
Persistent, bioaccumulative and toxic (PBT) chemicals symbolize a group of substances that are not easily degraded; instead, they accumulate in different organisms and exhibit an acute or chronic toxicity. The limited empirical data on PBT chemicals, the high cost of testing together with the regulatory constraints and the international push for reduced animal testing motivate a greater reliance on predictive computational methods like quantitative structure-activity relationship (QSAR) models in PBT assessment. Papa and Gramatica have recently proposed a PBT index that could be computed directly from structural features. In the current study, we have modelled the experimentally derived PBT index data using an extended topological atom (ETA) along with constitutional descriptors to show the usefulness of the ETA indices in modelling the endpoint. The models developed through a double cross-validation (DCV) method gave the best results in terms of both internal and external validation metrics. The developed models were comparable in predictive quality to those previously reported. The current models were further used for consensus predictions of PBT behaviour for a set of pharmaceuticals and a set of synthetic drug-like compounds. The developed models can be used in PBT hazard screening for identification and prioritization of chemicals from the structural information alone.
Affiliation(s)
- P De
- a Drug Theoretics and Cheminformatics Laboratory, Department of Pharmaceutical Technology, Jadavpur University, Kolkata 700 032, India
| | - K Roy
- a Drug Theoretics and Cheminformatics Laboratory, Department of Pharmaceutical Technology, Jadavpur University, Kolkata 700 032, India
| |
|
45
|
Karlberg M, von Stosch M, Glassey J. Exploiting mAb structure characteristics for a directed QbD implementation in early process development. Crit Rev Biotechnol 2018. [DOI: 10.1080/07388551.2017.1421899] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 10/17/2022]
Affiliation(s)
- Micael Karlberg
- School of Chemical Engineering and Advanced Materials, Newcastle University, Newcastle upon Tyne, UK
| | - Moritz von Stosch
- School of Chemical Engineering and Advanced Materials, Newcastle University, Newcastle upon Tyne, UK
| | - Jarka Glassey
- School of Chemical Engineering and Advanced Materials, Newcastle University, Newcastle upon Tyne, UK
| |
|
46
|
Retention prediction in reversed phase high performance liquid chromatography using quantitative structure-retention relationships applied to the Hydrophobic Subtraction Model. J Chromatogr A 2018; 1541:1-11. [PMID: 29454529 DOI: 10.1016/j.chroma.2018.01.053] [Citation(s) in RCA: 40] [Impact Index Per Article: 6.7] [Received: 12/22/2017] [Revised: 01/31/2018] [Accepted: 01/31/2018] [Indexed: 11/22/2022]
Abstract
Quantitative Structure-Retention Relationships (QSRR) methodology combined with the Hydrophobic Subtraction Model (HSM) has been utilized to accurately predict retention times for a selection of analytes on several different reversed phase liquid chromatography (RPLC) columns. This approach is designed to facilitate early prediction of co-elution of analytes, for example in pharmaceutical drug discovery applications where it is advantageous to predict whether impurities might be co-eluted with the active drug component. The QSRR model utilized VolSurf+ descriptors and a Partial Least Squares regression combined with a Genetic Algorithm (GA-PLS) to predict the solute coefficients in the HSM. It was found that only the hydrophobicity (η'H) term in the HSM was required to give the accuracy necessary to predict potential co-elution of analytes. Global QSRR models derived from all 148 compounds in the dataset were compared to QSRR models derived using a range of local modelling techniques based on clustering of compounds in the dataset by structural similarity (as represented by the Tanimoto similarity index), physico-chemical similarity (represented by log D), the neutral, acidic, or basic nature of the compound, and the second dominant interaction between analyte and stationary phase after hydrophobicity. The global model showed reasonable prediction accuracy for retention time, with errors of 30 s or less for up to 50% of modeled compounds. The local models based on the Tanimoto, nature-of-compound and second-dominant-interaction approaches all exhibited prediction errors of less than 30 s in retention time for nearly 70% of the compounds for which models could be derived. Predicted retention times of five representative compounds on nine reversed-phase columns were compared with known experimental retention data for these columns, and this comparison showed that the accuracy of the proposed modelling approach is sufficient to reliably predict the retention times of analytes based only on their chemical structures.
|
47
|
Rácz A, Gere A, Bajusz D, Héberger K. Is soft independent modeling of class analogies a reasonable choice for supervised pattern recognition? RSC Adv 2018. [DOI: 10.1039/c7ra08901e] [Citation(s) in RCA: 25] [Impact Index Per Article: 4.2] [Indexed: 11/21/2022] Open
Abstract
A thorough survey of classification data sets and a rigorous comparison of classification methods show the unambiguous superiority of other techniques over soft independent modeling of class analogies (SIMCA – one class modeling) for classification.
Affiliation(s)
- Anita Rácz
- Plasma Chemistry Research Group, Research Centre for Natural Sciences, Hungarian Academy of Sciences, H-1117 Budapest, Hungary
| | - Attila Gere
- Szent István University, Faculty of Food Science, Sensory Laboratory, H-1118 Budapest, Hungary
| | - Dávid Bajusz
- Medicinal Chemistry Research Group, Research Centre for Natural Sciences, Hungarian Academy of Sciences, H-1117 Budapest, Hungary
| | - Károly Héberger
- Plasma Chemistry Research Group, Research Centre for Natural Sciences, Hungarian Academy of Sciences, H-1117 Budapest, Hungary
| |
|
48
|
A combined Fisher and Laplacian score for feature selection in QSAR based drug design using compounds with known and unknown activities. J Comput Aided Mol Des 2017; 32:375-384. [DOI: 10.1007/s10822-017-0094-6] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Received: 08/21/2017] [Accepted: 12/15/2017] [Indexed: 10/18/2022]
|
49
|
Towards a chromatographic similarity index to establish localised quantitative structure-retention relationships for retention prediction. II Use of Tanimoto similarity index in ion chromatography. J Chromatogr A 2017; 1523:173-182. [DOI: 10.1016/j.chroma.2017.02.054] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 12/21/2016] [Revised: 02/20/2017] [Accepted: 02/23/2017] [Indexed: 11/19/2022]
|
50
|
Taraji M, Haddad PR, Amos RIJ, Talebi M, Szucs R, Dolan JW, Pohl CA. Chemometric-assisted method development in hydrophilic interaction liquid chromatography: A review. Anal Chim Acta 2017; 1000:20-40. [PMID: 29289311 DOI: 10.1016/j.aca.2017.09.041] [Citation(s) in RCA: 62] [Impact Index Per Article: 8.9] [Received: 06/07/2017] [Revised: 09/22/2017] [Accepted: 09/24/2017] [Indexed: 02/09/2023]
Abstract
With the enormous growth in the application of hydrophilic interaction liquid chromatography (HILIC), there has also been significant progress in HILIC method development. HILIC is a chromatographic method that utilises hydro-organic mobile phases with a high organic content and a hydrophilic stationary phase. It has been applied predominantly in the determination of small polar compounds. Theoretical studies using computer-aided modelling tools, most importantly predictive quantitative structure-retention relationship (QSRR) modelling methods, have attracted the attention of researchers, and these approaches greatly assist the method development process. This review focuses on the application of computer-aided modelling tools in understanding the retention mechanism, the classification of HILIC stationary phases, prediction of retention times in HILIC systems, optimisation of chromatographic conditions, and description of the interaction effects of the chromatographic factors in HILIC separations. Additionally, the review reports what has been achieved in applying QSRR methodology together with experimental design to optimise chromatographic separation conditions during HILIC method development. Developing robust predictive QSRR models will undoubtedly facilitate wider application of this chromatographic mode in a broader variety of research areas, significantly minimising the cost and time of experimental work.
Affiliation(s)
- Maryam Taraji
- Australian Centre for Research on Separation Science (ACROSS), School of Physical Sciences-Chemistry, University of Tasmania, Private Bag 75, Hobart 7001, Australia
| | - Paul R Haddad
- Australian Centre for Research on Separation Science (ACROSS), School of Physical Sciences-Chemistry, University of Tasmania, Private Bag 75, Hobart 7001, Australia
| | - Ruth I J Amos
- Australian Centre for Research on Separation Science (ACROSS), School of Physical Sciences-Chemistry, University of Tasmania, Private Bag 75, Hobart 7001, Australia
| | - Mohammad Talebi
- Australian Centre for Research on Separation Science (ACROSS), School of Physical Sciences-Chemistry, University of Tasmania, Private Bag 75, Hobart 7001, Australia
| | - Roman Szucs
- Pfizer Global Research and Development, CT13 9NJ, Sandwich, UK
| | - John W Dolan
- LC Resources, 1795 NW Wallace Rd., McMinnville, OR 97128, USA
| | | |
|