1. Huang X, Xie X, Huang S, Wu S, Huang L. Predicting non-chemotherapy drug-induced agranulocytosis toxicity through ensemble machine learning approaches. Front Pharmacol 2024; 15:1431941. [PMID: 39206259 PMCID: PMC11349714 DOI: 10.3389/fphar.2024.1431941] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/13/2024] [Accepted: 08/02/2024] [Indexed: 09/04/2024] Open
Abstract
Agranulocytosis, induced by non-chemotherapy drugs, is a serious medical condition that presents a formidable challenge in predictive toxicology due to its idiosyncratic nature and complex mechanisms. In this study, we assembled a dataset of 759 compounds and applied a rigorous feature selection process prior to employing ensemble machine learning classifiers to forecast non-chemotherapy drug-induced agranulocytosis (NCDIA) toxicity. The balanced bagging classifier combined with a gradient boosting decision tree (BBC + GBDT), utilizing the combined descriptor set of DS and RDKit comprising 237 features, emerged as the top-performing model, with an external validation AUC of 0.9164, ACC of 83.55%, and MCC of 0.6095. The model's predictive reliability was further substantiated by an applicability domain analysis. Feature importance, assessed through permutation importance within the BBC + GBDT model, highlighted key molecular properties that significantly influence NCDIA toxicity. Additionally, 16 structural alerts identified by SARpy software further revealed potential molecular signatures associated with toxicity, enriching our understanding of the underlying mechanisms. We also applied the constructed models to assess the NCDIA toxicity of novel drugs approved by FDA. This study advances predictive toxicology by providing a framework to assess and mitigate agranulocytosis risks, ensuring the safety of pharmaceutical development and facilitating post-market surveillance of new drugs.
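The abstract above names a balanced bagging classifier but does not describe its internals. As a rough illustration of the general idea only, the sketch below draws class-balanced subsamples by randomly undersampling the majority class, so that each base learner in a bag would see equal class counts. The function name `balanced_subsample` and the label counts are hypothetical and are not taken from the paper.

```python
import random
from collections import defaultdict


def balanced_subsample(labels, rng):
    """Return indices of one class-balanced subsample.

    Every class is randomly undersampled (without replacement)
    to the size of the rarest class.
    """
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    n_min = min(len(idxs) for idxs in by_class.values())
    sample = []
    for idxs in by_class.values():
        sample.extend(rng.sample(idxs, n_min))
    rng.shuffle(sample)
    return sample


# Hypothetical imbalanced labels (1 = toxic, 0 = non-toxic); not the paper's data.
labels = [1] * 20 + [0] * 80
rng = random.Random(0)

# One balanced subsample per base learner; a bagging ensemble would fit
# one model (for example a gradient boosting tree model) on each bag.
bags = [balanced_subsample(labels, rng) for _ in range(10)]
```

Each bag here holds 40 indices, 20 per class. Fitting the base learners and averaging their predictions is left out, since the abstract gives no detail on those steps.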
Affiliation(s)
- Xiaojie Huang
  - Department of Clinical Pharmacy, Jieyang People’s Hospital, Jieyang, China

2. Jha T, Jana R, Banerjee S, Baidya SK, Amin SA, Gayen S, Ghosh B, Adhikari N. Exploring different classification-dependent QSAR modelling strategies for HDAC3 inhibitors in search of meaningful structural contributors. SAR and QSAR in Environmental Research 2024; 35:367-389. [PMID: 38757181 DOI: 10.1080/1062936x.2024.2350504] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/10/2024] [Accepted: 04/28/2024] [Indexed: 05/18/2024]
Abstract
Histone deacetylase 3 (HDAC3), a Zn2+-dependent class I HDAC, contributes to numerous disorders such as neurodegenerative disorders, diabetes, cardiovascular disease, kidney disease and several types of cancers. Therefore, the development of novel and selective HDAC3 inhibitors might be promising to combat such diseases. Here, different classification-based molecular modelling studies such as Bayesian classification, recursive partitioning (RP), SARpy and linear discriminant analysis (LDA) were conducted on a set of HDAC3 inhibitors to pinpoint essential structural requirements contributing to HDAC3 inhibition, followed by a molecular docking study and molecular dynamics (MD) simulation analyses. The current study revealed the importance of the hydroxamate function for Zn2+ chelation as well as for hydrogen bonding interaction with the Tyr298 residue. The importance of the hydroxamate function for higher HDAC3 inhibition was noticed in the Bayesian classification, recursive partitioning and SARpy models. Also, the importance of a substituted thiazole ring was revealed, whereas the presence of linear alkyl groups with a carboxylic acid function, any type of ester function, a benzodiazepine moiety or a methoxy group in the molecular structure can be detrimental to HDAC3 inhibition. Therefore, this study can aid in the design and discovery of effective novel HDAC3 inhibitors in the future.
Affiliation(s)
- T Jha
  - Natural Science Laboratory, Division of Medicinal and Pharmaceutical Chemistry, Department of Pharmaceutical Technology, Jadavpur University, Kolkata, India
- R Jana
  - Natural Science Laboratory, Division of Medicinal and Pharmaceutical Chemistry, Department of Pharmaceutical Technology, Jadavpur University, Kolkata, India
- S Banerjee
  - Natural Science Laboratory, Division of Medicinal and Pharmaceutical Chemistry, Department of Pharmaceutical Technology, Jadavpur University, Kolkata, India
- S K Baidya
  - Natural Science Laboratory, Division of Medicinal and Pharmaceutical Chemistry, Department of Pharmaceutical Technology, Jadavpur University, Kolkata, India
- S A Amin
  - Laboratory of Drug Design and Discovery, Department of Pharmaceutical Technology, Jadavpur University, Kolkata, India
- S Gayen
  - Laboratory of Drug Design and Discovery, Department of Pharmaceutical Technology, Jadavpur University, Kolkata, India
- B Ghosh
  - Epigenetic Research Laboratory, Department of Pharmacy, Birla Institute of Technology and Science-Pilani, Hyderabad, India
- N Adhikari
  - Natural Science Laboratory, Division of Medicinal and Pharmaceutical Chemistry, Department of Pharmaceutical Technology, Jadavpur University, Kolkata, India

3. Guo W, Liu J, Dong F, Song M, Li Z, Khan MKH, Patterson TA, Hong H. Review of machine learning and deep learning models for toxicity prediction. Exp Biol Med (Maywood) 2023; 248:1952-1973. [PMID: 38057999 PMCID: PMC10798180 DOI: 10.1177/15353702231209421] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 12/08/2023] Open
Abstract
The ever-increasing number of chemicals has raised public concerns due to their adverse effects on human health and the environment. To protect public health and the environment, it is critical to assess the toxicity of these chemicals. Traditional in vitro and in vivo toxicity assays are complicated, costly, and time-consuming and may face ethical issues. These constraints raise the need for alternative methods for assessing the toxicity of chemicals. Recently, due to the advancement of machine learning algorithms and the increase in computational power, many toxicity prediction models have been developed using various machine learning and deep learning algorithms such as support vector machine, random forest, k-nearest neighbors, ensemble learning, and deep neural network. This review summarizes the machine learning- and deep learning-based toxicity prediction models developed in recent years. Support vector machine and random forest are the most popular machine learning algorithms, and hepatotoxicity, cardiotoxicity, and carcinogenicity are the frequently modeled toxicity endpoints in predictive toxicology. It is known that datasets impact model performance. The quality of datasets used in the development of toxicity prediction models using machine learning and deep learning is vital to the performance of the developed models. Different toxicity assignments for the same chemicals have been observed among different datasets for the same type of toxicity, indicating that benchmarking datasets are needed for developing reliable toxicity prediction models using machine learning and deep learning algorithms. This review provides insights into current machine learning models in predictive toxicology, which are expected to promote the development and application of toxicity prediction models in the future.
Affiliation(s)
- Wenjing Guo
  - National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA
- Jie Liu
  - National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA
- Fan Dong
  - National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA
- Meng Song
  - National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA
- Zoe Li
  - National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA
- Md Kamrul Hasan Khan
  - National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA
- Tucker A Patterson
  - National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA
- Huixiao Hong
  - National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA

4. A Graph Convolution Network with Subgraph Embedding for Mutagenic Prediction in Aromatic Hydrocarbons. Neurocomputing 2023. [DOI: 10.1016/j.neucom.2023.01.091] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/05/2023]

5. Cavasotto CN, Scardino V. Machine Learning Toxicity Prediction: Latest Advances by Toxicity End Point. ACS Omega 2022; 7:47536-47546. [PMID: 36591139 PMCID: PMC9798519 DOI: 10.1021/acsomega.2c05693] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 09/02/2022] [Accepted: 11/28/2022] [Indexed: 05/29/2023]
Abstract
Machine learning (ML) models to predict the toxicity of small molecules have garnered great attention and have become widely used in recent years. Computational toxicity prediction is particularly advantageous in the early stages of drug discovery in order to filter out molecules with high probability of failing in clinical trials. This has been helped by the increase in the number of large toxicology databases available. However, being an area of recent application, a greater understanding of the scope and applicability of ML methods is still necessary. There are various kinds of toxic end points that have been predicted in silico. Acute oral toxicity, hepatotoxicity, cardiotoxicity, mutagenicity, and the 12 Tox21 data end points are among the most commonly investigated. Machine learning methods exhibit different performances on different data sets due to dissimilar complexity, class distributions, or chemical space covered, which makes it hard to compare the performance of algorithms over different toxic end points. The general pipeline to predict toxicity using ML has already been analyzed in various reviews. In this contribution, we focus on the recent progress in the area and the outstanding challenges, making a detailed description of the state-of-the-art models implemented for each toxic end point. The type of molecular representation, the algorithm, and the evaluation metric used in each research work are explained and analyzed. A detailed description of end points that are usually predicted, their clinical relevance, the available databases, and the challenges they bring to the field are also highlighted.
Affiliation(s)
- Claudio N. Cavasotto
  - Computational Drug Design and Biomedical Informatics Laboratory, Instituto de Investigaciones en Medicina Traslacional (IIMT), CONICET-Universidad Austral, Pilar, B1629AHJ Buenos Aires, Argentina
  - Austral Institute for Applied Artificial Intelligence, Universidad Austral, Pilar, B1629AHJ Buenos Aires, Argentina
  - Facultad de Ciencias Biomédicas, Facultad de Ingeniería, Universidad Austral, Pilar, B1630FHB Buenos Aires, Argentina
- Valeria Scardino
  - Austral Institute for Applied Artificial Intelligence, Universidad Austral, Pilar, B1629AHJ Buenos Aires, Argentina
  - Meton AI, Inc., Wilmington, Delaware 19801, United States

6. Amin SA, Nandi S, Kashaw SK, Jha T, Gayen S. A critical analysis of urea transporter B inhibitors: molecular fingerprints, pharmacophore features for the development of next-generation diuretics. Mol Divers 2022; 26:2549-2559. [PMID: 34978011 DOI: 10.1007/s11030-021-10353-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/06/2021] [Accepted: 11/12/2021] [Indexed: 10/19/2022]
Abstract
Urea transporters are membrane transport proteins that transfer urea across the cell membrane in humans. Along with urea transporter A, urea transporter B (UT-B) is responsible for regulating urea concentration and blood pressure in humans. Inhibitors of urea transporters have attracted considerable attention as an alternative, safe class of diuretics. Unlike conventional diuretics, these inhibitors are suitable for long-term therapy without disturbing the electrolyte balance of the human body. In this study, UT-B inhibitors were analysed using multi-chemometric modelling approaches. The possible pharmacophore features, along with favourable and unfavourable sub-structural fingerprints for UT-B inhibition, were extracted. This information will guide medicinal chemists in designing potent UT-B inhibitors in the future.
Affiliation(s)
- Sk Abdul Amin
  - Natural Science Laboratory, Division of Medicinal and Pharmaceutical Chemistry, Department of Pharmaceutical Technology, Jadavpur University, P. O. Box 17020, Kolkata, India
- Sudipta Nandi
  - Department of Pharmaceutical Sciences, Dr. Harisingh Gour University, Sagar, Madhya Pradesh, India
  - Laboratory of Drug Design and Discovery, Department of Pharmaceutical Technology, Jadavpur University, Kolkata, India
- Sushil Kumar Kashaw
  - Department of Pharmaceutical Sciences, Dr. Harisingh Gour University, Sagar, Madhya Pradesh, India
- Tarun Jha
  - Natural Science Laboratory, Division of Medicinal and Pharmaceutical Chemistry, Department of Pharmaceutical Technology, Jadavpur University, P. O. Box 17020, Kolkata, India
- Shovanlal Gayen
  - Laboratory of Drug Design and Discovery, Department of Pharmaceutical Technology, Jadavpur University, Kolkata, India

7. Amin SA, Kumar J, Khatun S, Das S, Qureshi IA, Jha T, Gayen S. Binary quantitative activity-activity relationship (QAAR) studies to explore selective HDAC8 inhibitors: In light of mathematical models, DFT-based calculation and molecular dynamic simulation studies. J Mol Struct 2022. [DOI: 10.1016/j.molstruc.2022.132833] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/15/2022]

8. Dou H, Tan J, Wei H, Wang F, Yang J, Ma XG, Wang J, Zhou T. Transfer inhibitory potency prediction to binary classification: A model only needs a small training set. Computer Methods and Programs in Biomedicine 2022; 215:106633. [PMID: 35091229 DOI: 10.1016/j.cmpb.2022.106633] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 08/30/2021] [Revised: 12/28/2021] [Accepted: 01/10/2022] [Indexed: 06/14/2023]
Abstract
One of the most laborious tasks in drug discovery is selecting compounds from a library for experimental evaluation. Hence, we propose a machine learning model that only needs to be trained on a small dataset to predict the inhibition constant (Ki) and half maximal inhibitory concentration (IC50) for a compound. We transfer the prediction task to a simpler binary classification task based on a naive but effective idea: we only need the relative rank of a compound to determine whether to take it for further examination. To achieve this, we design a data augmentation strategy to effectively leverage the relationship between the compounds in the training set. After that, we formulate a new reward function for deep reinforcement learning to balance the feature selection and the accuracy. We employ a particle swarm optimized support vector machine for the binary classification task. Finally, a soft voting mechanism is introduced to resolve contradictions among the binary classifications. Extensive experiments show that our model achieves high and reliable accuracy, and is capable of ranking compounds based on a selected set of molecular descriptors. The current results show that our model provides a potential ligand-based in silico approach for prioritizing chemicals for experimental studies.
Affiliation(s)
- Haowen Dou
  - Department of Computer Science, Shantou University, Shantou, China
- Jie Tan
  - School of Pharmaceutical Sciences (Shenzhen), Sun Yat-sen University, China
- Huiling Wei
  - School of Pharmaceutical Sciences (Shenzhen), Sun Yat-sen University, China
- Fei Wang
  - Department of Computer Science, Shantou University, Shantou, China; Key Laboratory of Intelligent Manufacturing Technology (Shantou University), Ministry of Education, Shantou, China
- Jinzhu Yang
  - School of Computer Science and Engineering, Northeastern University, Shenyang, China; Key Laboratory of Intelligent Computing in Medical Image, Ministry of Education, Northeastern University, Shenyang, China
- X-G Ma
  - Foshan Graduate School, Northeastern University, Foshan, China; The State Key Laboratory of Synthetical Automation for Process Industries, Northeastern University, Shenyang, China
- Jiaqi Wang
  - School of Pharmaceutical Sciences (Shenzhen), Sun Yat-sen University, China
- Teng Zhou
  - Department of Computer Science, Shantou University, Shantou, China; Key Laboratory of Intelligent Manufacturing Technology (Shantou University), Ministry of Education, Shantou, China

9. OUP accepted manuscript. Mutagenesis 2022; 37:191-202. [DOI: 10.1093/mutage/geac010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/10/2021] [Accepted: 04/09/2022] [Indexed: 11/14/2022] Open

10. Amin SA, Banerjee S, Singh S, Qureshi IA, Gayen S, Jha T. First structure-activity relationship analysis of SARS-CoV-2 virus main protease (Mpro) inhibitors: an endeavor on COVID-19 drug discovery. Mol Divers 2021; 25:1827-1838. [PMID: 33400085 PMCID: PMC7782049 DOI: 10.1007/s11030-020-10166-3] [Citation(s) in RCA: 28] [Impact Index Per Article: 9.3] [Received: 09/21/2020] [Accepted: 11/28/2020] [Indexed: 11/10/2022]
Abstract
Main protease (Mpro) of severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) intervenes in the replication and transcription processes of the virus. Hence, it is a lucrative target for anti-viral drug development. In this study, molecular modeling analyses were performed on the structure activity data of recently reported diverse SARS-CoV-2 Mpro inhibitors to understand the structural requirements for higher inhibitory activity. The classification-based quantitative structure-activity relationship (QSAR) models were generated between SARS-CoV-2 Mpro inhibitory activities and different descriptors. Identification of structural fingerprints to increase or decrease in the inhibitory activity was mapped for possible inclusion/exclusion of these fingerprints in the lead optimization process. Challenges in ADME properties of protease inhibitors were also discussed to overcome the problems of oral bioavailability. Further, depending on the modeling results, we have proposed novel as well as potent SARS-CoV-2 Mpro inhibitors.
Affiliation(s)
- Sk Abdul Amin
  - Natural Science Laboratory, Division of Medicinal and Pharmaceutical Chemistry, Department of Pharmaceutical Technology, Jadavpur University, Kolkata, 700032, India
- Suvankar Banerjee
  - Natural Science Laboratory, Division of Medicinal and Pharmaceutical Chemistry, Department of Pharmaceutical Technology, Jadavpur University, Kolkata, 700032, India
- Samayaditya Singh
  - Department of Biotechnology and Bioinformatics, School of Life Sciences, University of Hyderabad, Hyderabad, 500046, Telangana, India
- Insaf Ahmed Qureshi
  - Department of Biotechnology and Bioinformatics, School of Life Sciences, University of Hyderabad, Hyderabad, 500046, Telangana, India
- Shovanlal Gayen
  - Laboratory of Drug Design and Discovery, Department of Pharmaceutical Sciences, Dr. Harisingh Gour University, Sagar, 470003, MP, India
- Tarun Jha
  - Natural Science Laboratory, Division of Medicinal and Pharmaceutical Chemistry, Department of Pharmaceutical Technology, Jadavpur University, Kolkata, 700032, India

11. Rácz A, Bajusz D, Miranda-Quintana RA, Héberger K. Machine learning models for classification tasks related to drug safety. Mol Divers 2021; 25:1409-1424. [PMID: 34110577 PMCID: PMC8342376 DOI: 10.1007/s11030-021-10239-x] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Received: 04/01/2021] [Accepted: 05/27/2021] [Indexed: 12/23/2022]
Abstract
In this review, we outline the current trends in the field of machine learning-driven classification studies related to ADME (absorption, distribution, metabolism and excretion) and toxicity endpoints from the past six years (2015-2021). The study focuses only on classification models with large datasets (i.e. more than a thousand compounds). A comprehensive literature search and meta-analysis was carried out for nine different targets: hERG-mediated cardiotoxicity, blood-brain barrier penetration, permeability glycoprotein (P-gp) substrate/inhibitor, cytochrome P450 enzyme family, acute oral toxicity, mutagenicity, carcinogenicity, respiratory toxicity and irritation/corrosion. The comparison of the best classification models was targeted to reveal the differences between machine learning algorithms and modeling types, endpoint-specific performances, dataset sizes and the different validation protocols. Based on the evaluation of the data, we can say that tree-based algorithms are (still) dominating the field, with consensus modeling being an increasing trend in drug safety predictions. Although one can already find classification models with great performances to hERG-mediated cardiotoxicity and the isoenzymes of the cytochrome P450 enzyme family, these targets are still central to ADMET-related research efforts.
Affiliation(s)
- Anita Rácz
  - Plasma Chemistry Research Group, Research Centre for Natural Sciences, Magyar tudósok krt. 2, Budapest, 1117, Hungary
- Dávid Bajusz
  - Medicinal Chemistry Research Group, Research Centre for Natural Sciences, Magyar tudósok krt. 2, Budapest, 1117, Hungary
- Károly Héberger
  - Plasma Chemistry Research Group, Research Centre for Natural Sciences, Magyar tudósok krt. 2, Budapest, 1117, Hungary

12. CWLy-RF: A novel approach for identifying cell wall lyases based on random forest classifier. Genomics 2021; 113:2919-2924. [PMID: 34186189 DOI: 10.1016/j.ygeno.2021.06.038] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 05/15/2021] [Revised: 06/20/2021] [Accepted: 06/25/2021] [Indexed: 02/05/2023]
Abstract
Drug resistance of pathogenic bacteria has become increasingly serious due to the abuse of antibiotics in recent years. Researchers have found that cell wall lyases are effective antibacterial agents that can specifically recognize target bacteria and degrade bacterial peptidoglycan. Traditional wet experiments are usually expensive, time-consuming and laborious for the identification of lyases. Therefore, there is an urgent need to develop prediction tools based on computer methods to identify lyases quickly and accurately. In this paper, a new predictor, CWLy-RF, is proposed based on the random forest (RF) algorithm to identify cell wall lyases. In this method, we combined three features, namely, 400D, 188D and the composition of k-spaced amino acid group pairs, using mixed-feature representation methods. Afterward, we improved the feature representation ability with the selected top 100 features by using the information gain method and trained a predictive model using RF. The constructed prediction model is evaluated by using 10-fold cross-validation. The accuracy obtained was 96.09%, the AUC was 0.993, the MCC was 0.922, the sensitivity was 94.92%, and the specificity was 97.32%. We have proved that the proposed predictor CWLy-RF is superior to other latest models, and it will hopefully become an effective and useful tool for identifying lyases.
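The abstract above reports accuracy, sensitivity, specificity and MCC. As a reading aid, the sketch below shows how these four quantities are conventionally derived from a binary confusion matrix. The function name `binary_metrics` and the counts passed to it are hypothetical and are not the paper's data; the sketch assumes both classes are present in the evaluation set.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts.

    Assumes at least one actual positive and one actual negative.
    """
    total = tp + fp + tn + fn
    # Denominator of the Matthews correlation coefficient (MCC).
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        # MCC is conventionally set to 0 when the denominator vanishes.
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }


# Hypothetical counts for illustration only.
m = binary_metrics(tp=45, fp=10, tn=40, fn=5)
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, which is why it is often reported alongside accuracy when classes are imbalanced.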

13. Kumar R, Khan FU, Sharma A, Siddiqui MH, Aziz IB, Kamal MA, Ashraf GM, Alghamdi BS, Uddin MS. A deep neural network-based approach for prediction of mutagenicity of compounds. Environmental Science and Pollution Research International 2021; 28:47641-47650. [PMID: 33895950 DOI: 10.1007/s11356-021-14028-9] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 02/17/2021] [Accepted: 04/16/2021] [Indexed: 02/05/2023]
Abstract
We are exposed to various chemical compounds present in the environment, cosmetics, and drugs almost every day. Mutagenicity is a valuable property that plays a significant role in establishing a chemical compound's safety. Exposure to and handling of mutagenic chemicals in the environment pose a high health risk; therefore, identification and screening of these chemicals are essential. Considering the time constraints and the pressure to avoid laboratory animals' use, the shift to alternative methodologies that can establish a rapid and cost-effective detection without undue over-conservation seems critical. In this regard, computational detection and identification of the mutagens in environmental samples like drugs, pesticides, dyes, reagents, wastewater, cosmetics, and other substances is vital. Over the last two decades, there have been numerous efforts to develop prediction models for mutagenicity, and by far, machine learning methods have demonstrated some noteworthy performance and reliability. However, the accuracy of such prediction models has always been one of the major concerns for the researchers working in this area. The mutagenicity prediction models were developed using deep neural network (DNN), support vector machine, k-nearest neighbor, and random forest. The developed classifiers were based on 3039 compounds and validated on 1014 compounds; each of them encoded with 1597 molecular feature vectors. The DNN-based prediction model yielded the highest prediction accuracy of 92.95% and 83.81% with the training and test data, respectively. The area under the receiver operating characteristic curve and precision-recall curve values were found to be 0.894 and 0.838, respectively. The DNN-based classifier not only fits the data with better performance as compared to traditional machine learning algorithms, viz., support vector machine, k-nearest neighbor, and random forest (with and without feature reduction) but also yields better performance metrics. In the current work, we propose a DNN-based model to predict mutagenicity of compounds.
Affiliation(s)
- Rajnish Kumar
  - Amity Institute of Biotechnology, Amity University Uttar Pradesh, Lucknow Campus, Lucknow, Uttar Pradesh, India
- Farhat Ullah Khan
  - Computer and Information Sciences Department, Universiti Teknologi Petronas, 32610, Seri Iskander, Perak, Malaysia
- Anju Sharma
  - Department of Applied Science, Indian Institute of Information Technology, Allahabad, Uttar Pradesh, India
- Mohammed Haris Siddiqui
  - Department of Bioengineering, Integral University, Dasauli, P.O. Basha, Kursi Road, Lucknow, Uttar Pradesh, India
- Izzatdin Ba Aziz
  - Computer and Information Sciences Department, Universiti Teknologi Petronas, 32610, Seri Iskander, Perak, Malaysia
- Mohammad Amjad Kamal
  - West China School of Nursing / Institutes for Systems Genetics, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University, Chengdu 610041, Sichuan, China
  - King Fahd Medical Research Center, King Abdulaziz University, P. O. Box 80216, Jeddah 21589, Saudi Arabia
  - Enzymoics, Novel Global Community Educational Foundation, Hebersham, New South Wales, Australia
- Ghulam Md Ashraf
  - Pre-Clinical Research Unit, King Fahd Medical Research Center, King Abdulaziz University, Jeddah, Saudi Arabia
  - Department of Medical Laboratory Technology, Faculty of Applied Medical Sciences, King Abdulaziz University, Jeddah, Saudi Arabia
- Badrah S Alghamdi
  - Pre-Clinical Research Unit, King Fahd Medical Research Center, King Abdulaziz University, Jeddah, Saudi Arabia
  - Department of Physiology, Neuroscience Unit, Faculty of Medicine, King Abdulaziz University, Jeddah, Saudi Arabia
- Md Sahab Uddin
  - Department of Pharmacy, Southeast University, Dhaka, Bangladesh
  - Pharmakon Neuroscience Research Network, Dhaka, Bangladesh

14. Arunthavanathan R, Khan F, Ahmed S, Imtiaz S. An analysis of process fault diagnosis methods from safety perspectives. Comput Chem Eng 2021. [DOI: 10.1016/j.compchemeng.2020.107197] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Indexed: 10/22/2022]

15. Wang MWH, Goodman JM, Allen TEH. Machine Learning in Predictive Toxicology: Recent Applications and Future Directions for Classification Models. Chem Res Toxicol 2020; 34:217-239. [PMID: 33356168 DOI: 10.1021/acs.chemrestox.0c00316] [Citation(s) in RCA: 30] [Impact Index Per Article: 7.5] [Indexed: 02/07/2023]
Abstract
In recent times, machine learning has become increasingly prominent in predictive toxicology as it has shifted from in vivo studies toward in silico studies. Currently, in vitro methods together with other computational methods such as quantitative structure-activity relationship modeling and absorption, distribution, metabolism, and excretion calculations are being used. An overview of machine learning and its applications in predictive toxicology is presented here, including support vector machines (SVMs), random forest (RF) and decision trees (DTs), neural networks, regression models, naïve Bayes, k-nearest neighbors, and ensemble learning. The recent successes of these machine learning methods in predictive toxicology are summarized, and a comparison of some models used in predictive toxicology is presented. In predictive toxicology, SVMs, RF, and DTs are the dominant machine learning methods due to the characteristics of the data available. Lastly, this review describes the current challenges facing the use of machine learning in predictive toxicology and offers insights into the possible areas of improvement in the field.
Affiliation(s)
- Marcus W H Wang
  - Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
- Jonathan M Goodman
  - Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
- Timothy E H Allen
  - Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
  - MRC Toxicology Unit, University of Cambridge, Hodgkin Building, Lancaster Road, Leicester LE1 7HB, United Kingdom
16
|
Structural analysis of arylsulfonamide-based carboxylic acid derivatives: a QSAR study to identify the structural contributors toward their MMP-9 inhibition. Struct Chem 2020. [DOI: 10.1007/s11224-020-01635-4] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 12/18/2022]
|
17
|
Liu M, Zhang L, Li S, Yang T, Liu L, Zhao J, Liu H. Prediction of hERG potassium channel blockage using ensemble learning methods and molecular fingerprints. Toxicol Lett 2020; 332:88-96. [PMID: 32629073 DOI: 10.1016/j.toxlet.2020.07.003] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 02/14/2020] [Revised: 06/16/2020] [Accepted: 07/02/2020] [Indexed: 11/30/2022]
Abstract
The human ether-a-go-go-related gene (hERG) encodes a tetrameric potassium channel called Kv11.1. This channel can be blocked by certain drugs, which leads to long QT syndrome, causing cardiotoxicity. This is a significant problem during drug development. Using computer models to predict compound cardiotoxicity during the early stages of drug design will help to solve this problem. In this study, we used a dataset of 1865 compounds exhibiting known hERG inhibitory activities as a training set. Thirty cardiotoxicity classification models were established using three machine learning algorithms based on molecular fingerprints and molecular descriptors. Using these models as base classifiers, a new cardiotoxicity classification model with better predictive performance was developed using an ensemble learning method. The accuracy of the best base classifier, which was generated using the XGBoost method with molecular descriptors, was 84.8%, and the area under the receiver-operating characteristic curve (AUC) was 0.876 in five-fold cross-validation. All of the ensemble models that we developed had higher predictive performance than the base classifiers in five-fold cross-validation. The best predictive performance was achieved by the Ensemble-Top7 model, with accuracy of 84.9% and AUC of 0.887. We also tested the ensemble model using external validation data and achieved accuracy of 85.0% and AUC of 0.786. Furthermore, we identified several hERG-related substructures, which provide valuable information for designing drug candidates.
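The abstract does not say how the base classifiers were combined into the Ensemble-Top7 model. One common construction, shown here only as a minimal sketch under that assumption, ranks the base models by cross-validated AUC and averages the predicted probabilities of the top k (soft voting). The function name `top_k_soft_vote`, the model names, and all numbers below are hypothetical and are not taken from the paper.

```python
from statistics import mean


def top_k_soft_vote(cv_auc, probabilities, k=7, threshold=0.5):
    """Average the predicted probabilities of the k best base models.

    cv_auc:        dict mapping model name -> cross-validated AUC
    probabilities: dict mapping model name -> list of P(blocker) per compound
    """
    if not cv_auc:
        raise ValueError("at least one base model is required")
    # Rank model names by cross-validated AUC, best first, and keep the top k.
    ranked = sorted(cv_auc, key=cv_auc.get, reverse=True)[:k]
    n_samples = len(probabilities[ranked[0]])
    # Soft vote: mean probability per compound across the selected models.
    avg = [mean(probabilities[name][i] for name in ranked) for i in range(n_samples)]
    # Convert averaged probabilities to class labels with a fixed cutoff.
    labels = [1 if p >= threshold else 0 for p in avg]
    return ranked, avg, labels


# Hypothetical base models and scores, for illustration only.
cv_auc = {"xgb_desc": 0.88, "rf_fp": 0.85, "svm_fp": 0.80}
probabilities = {
    "xgb_desc": [0.9, 0.2, 0.6],
    "rf_fp": [0.7, 0.4, 0.3],
    "svm_fp": [0.1, 0.9, 0.9],
}
ranked, avg, labels = top_k_soft_vote(cv_auc, probabilities, k=2)
print(ranked, labels)
```

With `k=2` the weakest model is dropped before averaging, which is the point of a top-k ensemble: low-ranked base classifiers do not dilute the vote.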
Affiliation(s)
- Miao Liu
- School of Life Science, Liaoning University, Shenyang, 110036, China
- Li Zhang
- School of Life Science, Liaoning University, Shenyang, 110036, China; Research Center for Computer Simulating and Information Processing of Bio-macromolecules of Shenyang, Liaoning University, Shenyang, 110036, China; Engineering Laboratory for Molecular Simulation and Designing of Drug Molecules of Liaoning, Liaoning University, Shenyang, 110036, China
- Shimeng Li
- School of Life Science, Liaoning University, Shenyang, 110036, China
- Tianzhou Yang
- School of Life Science, Liaoning University, Shenyang, 110036, China
- Lili Liu
- School of Life Science, Liaoning University, Shenyang, 110036, China
- Jian Zhao
- School of Life Science, Liaoning University, Shenyang, 110036, China
- Hongsheng Liu
- School of Life Science, Liaoning University, Shenyang, 110036, China; Research Center for Computer Simulating and Information Processing of Bio-macromolecules of Shenyang, Liaoning University, Shenyang, 110036, China; Engineering Laboratory for Molecular Simulation and Designing of Drug Molecules of Liaoning, Liaoning University, Shenyang, 110036, China
|
18
|
Ghosh K, Bhardwaj B, Amin SA, Jha T, Gayen S. Identification of structural fingerprints for ABCG2 inhibition by using Monte Carlo optimization, Bayesian classification, and structural and physicochemical interpretation (SPCI) analysis. SAR and QSAR in Environmental Research 2020; 31:439-455. [PMID: 32539470 DOI: 10.1080/1062936x.2020.1771769] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 03/13/2020] [Accepted: 05/17/2020] [Indexed: 06/11/2023]
Abstract
The human breast cancer resistance protein (BCRP), one of the members of the large ATP binding cassette (ABC) transporter superfamily, is crucial for resistance against chemotherapeutic agents. It has emerged as one of the best biological targets for designing small-molecule drugs capable of eliminating multidrug resistance in breast cancer. In order to gain insights into the relationship between the molecular structure of compounds and ABCG2 inhibition, a multi-QSAR approach using different methods was performed on a dataset of 294 ABCG2 inhibitors with diverse scaffolds. The best models obtained by different chemometric methods have the following statistical characteristics: Monte Carlo optimization-based QSAR (sensitivity = 0.905, specificity = 0.6255, accuracy = 0.756, and MCC = 0.545); Bayesian classification model (sensitivity = 0.735, specificity = 0.775, and concordance = 0.757); structural and physicochemical interpretation analysis-random forest method (balanced accuracy = 0.750, sensitivity = 0.810, and specificity = 0.700). Additionally, structural fingerprints modulating the ABCG2 inhibitory properties were identified from the best models of each method and cross-validated against each other. The current modelling study is an attempt to gain deeper insight into the important structural fingerprints modulating ABCG2 inhibition.
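The statistics quoted in this abstract (sensitivity, specificity, accuracy, balanced accuracy, MCC) all derive from the four cells of a binary confusion matrix. The sketch below uses the standard definitions; the function name `classification_metrics` and the example counts are hypothetical, chosen for illustration, and do not reproduce the paper's data.

```python
from math import sqrt


def classification_metrics(tp, fn, tn, fp):
    """Standard binary-classification metrics from confusion-matrix counts.

    Assumes both classes are present (tp + fn > 0 and tn + fp > 0).
    """
    sensitivity = tp / (tp + fn)              # true positive rate
    specificity = tn / (tn + fp)              # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    balanced_accuracy = (sensitivity + specificity) / 2
    # Matthews correlation coefficient; defined as 0 when the denominator is 0.
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "balanced_accuracy": balanced_accuracy,
        "mcc": mcc,
    }


# Hypothetical counts: 50 actives (40 found) and 50 inactives (30 found).
m = classification_metrics(tp=40, fn=10, tn=30, fp=20)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

Balanced accuracy and MCC are less sensitive to class imbalance than plain accuracy, which is one reason abstracts in this list report them alongside it.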
Affiliation(s)
- K Ghosh
- Laboratory of Drug Design and Discovery, Department of Pharmaceutical Sciences, Dr. H. S. Gour University, Sagar, India
- B Bhardwaj
- Laboratory of Drug Design and Discovery, Department of Pharmaceutical Sciences, Dr. H. S. Gour University, Sagar, India
- S A Amin
- Natural Science Laboratory, Division of Medicinal and Pharmaceutical Chemistry, Department of Pharmaceutical Technology, Jadavpur University, Kolkata, India
- T Jha
- Natural Science Laboratory, Division of Medicinal and Pharmaceutical Chemistry, Department of Pharmaceutical Technology, Jadavpur University, Kolkata, India
- S Gayen
- Laboratory of Drug Design and Discovery, Department of Pharmaceutical Sciences, Dr. H. S. Gour University, Sagar, India
|
19
|
Game PS, Vaze V, Emmanuel M. Optimized Decision tree rules using divergence based grey wolf optimization for big data classification in health care. Evolutionary Intelligence 2019. [DOI: 10.1007/s12065-019-00267-w] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 11/28/2022]
|
20
|
Lv H, Zhang ZM, Li SH, Tan JX, Chen W, Lin H. Evaluation of different computational methods on 5-methylcytosine sites identification. Brief Bioinform 2019; 21:982-995. [DOI: 10.1093/bib/bbz048] [Citation(s) in RCA: 82] [Impact Index Per Article: 16.4] [Received: 02/20/2019] [Revised: 03/25/2019] [Accepted: 04/01/2019] [Indexed: 11/13/2022] Open
Abstract
5-Methylcytosine (m5C) plays an extremely important role in basic biochemical processes. Despite the great increase in identified m5C sites across a wide variety of organisms, their epigenetic roles remain largely unknown. Hence, accurate identification of m5C sites is a key step in understanding their biological functions. Over the past several years, more attention has been paid to the identification of m5C sites in multiple species. In this work, we first summarized the current progress in computational prediction of m5C sites and then constructed a more powerful and reliable model for identifying m5C sites. To train the model, we collected experimentally confirmed m5C data from Homo sapiens, Mus musculus, Saccharomyces cerevisiae and Arabidopsis thaliana, and compared the performance of different feature extraction methods and classification algorithms to optimize the prediction model. Based on the optimal model, a novel predictor called iRNA-m5C was developed for the recognition of m5C sites. Finally, we critically evaluated the performance of iRNA-m5C and compared it with existing methods. The results showed that iRNA-m5C produced the best prediction performance. We hope that this paper can provide a guide to the computational identification of m5C sites, and we anticipate that the proposed iRNA-m5C will become a powerful tool for large-scale identification of m5C sites.
Affiliation(s)
- Hao Lv
- Key Laboratory for NeuroInformation of Ministry of Education, School of Life Science and Technology and Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
- Zi-Mei Zhang
- Key Laboratory for NeuroInformation of Ministry of Education, School of Life Science and Technology and Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
- Shi-Hao Li
- Key Laboratory for NeuroInformation of Ministry of Education, School of Life Science and Technology and Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
- Jiu-Xin Tan
- Key Laboratory for NeuroInformation of Ministry of Education, School of Life Science and Technology and Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
- Wei Chen
- Key Laboratory for NeuroInformation of Ministry of Education, School of Life Science and Technology and Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
- Innovative Institute of Chinese Medicine and Pharmacy, Chengdu University of Traditional Chinese Medicine, Chengdu, China
- Hao Lin
- Key Laboratory for NeuroInformation of Ministry of Education, School of Life Science and Technology and Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
|
21
|
Structural exploration of arylsulfonamide-based ADAM17 inhibitors through validated comparative multi-QSAR modelling studies. J Mol Struct 2019. [DOI: 10.1016/j.molstruc.2019.02.081] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 12/15/2022]
|
22
|
Norinder U, Ahlberg E, Carlsson L. Predicting Ames Mutagenicity Using Conformal Prediction in the Ames/QSAR International Challenge Project. Mutagenesis 2018; 34:33-40. [DOI: 10.1093/mutage/gey038] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 04/27/2018] [Revised: 10/10/2018] [Accepted: 11/13/2018] [Indexed: 12/19/2022] Open
Affiliation(s)
- Ulf Norinder
- Swetox, Unit of Toxicology Sciences, Karolinska Institutet, Södertälje, Sweden
- Department of Computer and Systems Sciences, Stockholm University, Kista, Sweden
- Ernst Ahlberg
- Drug Safety and Metabolism, Innovative Medicines and Early Development Biotech Unit, AstraZeneca R&D Gothenburg, Mölndal, Sweden
- Lars Carlsson
- Computer Learning Research Centre, Royal Holloway, University of London, Egham, Surrey, UK
|
23
|
Amin SA, Adhikari N, Jha T, Ghosh B. Designing potential HDAC3 inhibitors to improve memory and learning. J Biomol Struct Dyn 2018; 37:2133-2142. [PMID: 30044204 DOI: 10.1080/07391102.2018.1477625] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Indexed: 12/12/2022]
Affiliation(s)
- Sk. Abdul Amin
- Department of Pharmaceutical Technology, Division of Medicinal and Pharmaceutical Chemistry, Natural Science Laboratory, Jadavpur University, Kolkata, West Bengal, India
- Nilanjan Adhikari
- Department of Pharmaceutical Technology, Division of Medicinal and Pharmaceutical Chemistry, Natural Science Laboratory, Jadavpur University, Kolkata, West Bengal, India
- Tarun Jha
- Department of Pharmaceutical Technology, Division of Medicinal and Pharmaceutical Chemistry, Natural Science Laboratory, Jadavpur University, Kolkata, West Bengal, India
- Balaram Ghosh
- Department of Pharmacy, BITS-Pilani, Hyderabad Campus, Shamirpet, Hyderabad, India
|
24
|
Zhang H, Ren JX, Ma JX, Ding L. Development of an in silico prediction model for chemical-induced urinary tract toxicity by using naïve Bayes classifier. Mol Divers 2018; 23:381-392. [DOI: 10.1007/s11030-018-9882-8] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Received: 07/17/2018] [Accepted: 09/25/2018] [Indexed: 12/16/2022]
|
25
|
Petinrin OO, Saeed F. Bioactive molecule prediction using majority voting-based ensemble method. Journal of Intelligent & Fuzzy Systems 2018. [DOI: 10.3233/jifs-169596] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 11/15/2022]
Affiliation(s)
- Faisal Saeed
- College of Computer Science and Engineering, Taibah University, Medina, Saudi Arabia
- Department of Information Systems, Faculty of Computing, Universiti Teknologi Malaysia, Johor Bahru, Johor, Malaysia
|
26
|
Diverse classes of HDAC8 inhibitors: in search of molecular fingerprints that regulate activity. Future Med Chem 2018; 10:1589-1602. [PMID: 29953251 DOI: 10.4155/fmc-2018-0005] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.8] [Indexed: 01/15/2023] Open
Abstract
Aim: HDAC8 is one of the crucial enzymes involved in malignancy. Structural explorations of HDAC8 inhibitory activity and selectivity are required. Materials & methods: A mathematical framework was constructed to explore important molecular fragments responsible for HDAC8 inhibition. Bayesian classification models were developed on a large set of structurally diverse HDAC8 inhibitors. Results: This study helps to understand the structural importance of HDAC8 inhibitors. The hydrophobic aryl cap function is important for HDAC8 inhibition whereas benzamide moiety shows a negative impact on HDAC8 inhibition. Conclusion: This work validates our previously proposed structural features for better HDAC8 inhibition. The comparative learning between the statistical and intelligent methods will surely enrich future drug design aspects of HDAC8 inhibitors.
|
27
|
Amin SA, Adhikari N, Bhargava S, Jha T, Gayen S. Structural exploration of hydroxyethylamines as HIV-1 protease inhibitors: new features identified. SAR and QSAR in Environmental Research 2018; 29:385-408. [PMID: 29566580 DOI: 10.1080/1062936x.2018.1447511] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Indexed: 05/07/2023]
Abstract
The current study deals with chemometric modelling strategies (Naïve Bayes classification, hologram-based quantitative structure-activity relationship (HQSAR), comparative molecular field analysis (CoMFA) and comparative molecular similarity index analysis (CoMSIA)) to explore the important features of hydroxylamine derivatives for exerting potent human immunodeficiency virus-1 (HIV-1) protease inhibition. Depending on the statistically validated reliable and robust quantitative structure-activity relationship (QSAR) models, important and crucial structural features have been identified that may be responsible for enhancing the activity profile of these hydroxylamine compounds. Arylsulfonamide function along with methoxy or fluoro substitution is important for enhancing activity. Bulky steric substitution at the sulfonamide nitrogen disfavours activity whereas smaller hydrophobic substitution at the same position is found to be favourable. Apart from the crucial oxazolidinone moiety, pyrrolidine, cyclic urea and methyl ester functions are also responsible for increasing the HIV-1 protease inhibitory profile. Observations derived from these modelling studies may be utilized further in designing promising HIV-1 protease inhibitors of this class.
Affiliation(s)
- S A Amin
- Natural Science Laboratory, Division of Medicinal and Pharmaceutical Chemistry, Department of Pharmaceutical Technology, P.O. Box 17020, Jadavpur University, Kolkata 700032, West Bengal, India
- N Adhikari
- Natural Science Laboratory, Division of Medicinal and Pharmaceutical Chemistry, Department of Pharmaceutical Technology, P.O. Box 17020, Jadavpur University, Kolkata 700032, West Bengal, India
- S Bhargava
- Laboratory of Drug Design and Discovery, Department of Pharmaceutical Sciences, Dr Hari Singh Gour University, Sagar 470003, Madhya Pradesh, India
- T Jha
- Natural Science Laboratory, Division of Medicinal and Pharmaceutical Chemistry, Department of Pharmaceutical Technology, P.O. Box 17020, Jadavpur University, Kolkata 700032, West Bengal, India
- S Gayen
- Laboratory of Drug Design and Discovery, Department of Pharmaceutical Sciences, Dr Hari Singh Gour University, Sagar 470003, Madhya Pradesh, India
|
28
|
Structural exploration for the refinement of anticancer matrix metalloproteinase-2 inhibitor designing approaches through robust validated multi-QSARs. J Mol Struct 2018. [DOI: 10.1016/j.molstruc.2017.12.005] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Indexed: 11/18/2022]
|
29
|
Sajedi H, Mohammadipanah F, Shariat Panahi HK. An image analysis-aided method for redundancy reduction in differentiation of identical Actinobacterial strains. Future Microbiol 2018; 13:313-329. [DOI: 10.2217/fmb-2016-0096] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Indexed: 11/21/2022] Open
Abstract
Aim: To simplify the recognition of Actinobacteria, at different stages of the growth phase, from a mixed culture to facilitate the isolation of novel strains of these bacteria for drug discovery purposes. Materials & methods: A method was developed based on the Gabor transform and machine learning, using k-Nearest Neighbors, Naive Bayes, LogitBoost, Bagging and Random Forest classifiers to automatically categorize the colonies. Results: A signature pattern was inferred by the model, making the differentiation of identical strains possible. Additionally, higher performance was achieved compared with other classification methods. Conclusion: This automated approach can help accelerate the drug discovery process while reducing the budget lost to redundant work by inexperienced researchers.
Affiliation(s)
- Hedieh Sajedi
- Department of Computer Science, School of Mathematics, Statistics & Computer Science, College of Science, University of Tehran, 14155-6455, Tehran, Iran
- Fatemeh Mohammadipanah
- Department of Microbiology, School of Biology & Center of Excellence in Phylogeny of Living Organisms, College of Science, University of Tehran, 14155-6455, Tehran, Iran
- Hamed Kazemi Shariat Panahi
- Department of Microbiology, School of Biology & Center of Excellence in Phylogeny of Living Organisms, College of Science, University of Tehran, 14155-6455, Tehran, Iran
|
30
|
Jaeger S, Fulle S, Turk S. Mol2vec: Unsupervised Machine Learning Approach with Chemical Intuition. J Chem Inf Model 2018; 58:27-35. [DOI: 10.1021/acs.jcim.7b00616] [Citation(s) in RCA: 251] [Impact Index Per Article: 41.8] [Indexed: 12/28/2022]
Affiliation(s)
- Sabrina Jaeger
- BioMed X Innovation Center, Im Neuenheimer Feld 515, 69120 Heidelberg, Germany
- Simone Fulle
- BioMed X Innovation Center, Im Neuenheimer Feld 515, 69120 Heidelberg, Germany
- Samo Turk
- BioMed X Innovation Center, Im Neuenheimer Feld 515, 69120 Heidelberg, Germany
|
31
|
A self-adaptive k-means classifier for business incentive in a fashion design environment. Applied Computing and Informatics 2018. [DOI: 10.1016/j.aci.2017.05.002] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Indexed: 11/18/2022]
|
32
|
Zhang H, Yu P, Ren JX, Li XB, Wang HL, Ding L, Kong WB. Development of novel prediction model for drug-induced mitochondrial toxicity by using naïve Bayes classifier method. Food Chem Toxicol 2017; 110:122-129. [PMID: 29042293 DOI: 10.1016/j.fct.2017.10.021] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Received: 09/01/2017] [Revised: 10/10/2017] [Accepted: 10/13/2017] [Indexed: 02/05/2023]
Abstract
Mitochondrial dysfunction is considered an important contributing factor in the etiology of drug-induced organ toxicity, and it also plays an important role in the pathogenesis of some diseases. The objective of this investigation was to develop a novel prediction model of drug-induced mitochondrial toxicity by using a naïve Bayes classifier. For comparison, a recursive partitioning classifier prediction model was also constructed. Of the two methods, the naïve Bayes classifier established here performed best, yielding average overall prediction accuracies of 95 ± 0.6% for the internal 5-fold cross-validation of the training set and 81 ± 1.1% for the external test set. In addition, four important molecular descriptors and some representative substructures of toxicants produced by ECFP_6 fingerprints were identified. We hope the established naïve Bayes prediction model can be employed for mitochondrial toxicity assessment, and that the information obtained on mitochondrial toxicants can provide guidance for medicinal chemists working in drug discovery and lead optimization.
Affiliation(s)
- Hui Zhang
- College of Life Science, Northwest Normal University, Lanzhou, Gansu 730070, PR China; State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, West China Medical School, Sichuan University, Chengdu, Sichuan 610041, PR China
- Peng Yu
- College of Life Science, Northwest Normal University, Lanzhou, Gansu 730070, PR China
- Ji-Xia Ren
- State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, West China Medical School, Sichuan University, Chengdu, Sichuan 610041, PR China; College of Life Science, Liaocheng University, Liaocheng, Shandong 252059, PR China
- Xi-Bo Li
- College of Life Science, Northwest Normal University, Lanzhou, Gansu 730070, PR China
- He-Li Wang
- College of Life Science, Northwest Normal University, Lanzhou, Gansu 730070, PR China
- Lan Ding
- College of Life Science, Northwest Normal University, Lanzhou, Gansu 730070, PR China
- Wei-Bao Kong
- College of Life Science, Northwest Normal University, Lanzhou, Gansu 730070, PR China
|