1
|
Le NQK, Tran TX, Nguyen PA, Ho TT, Nguyen VN. Recent progress in machine learning approaches for predicting carcinogenicity in drug development. Expert Opin Drug Metab Toxicol 2024; 20:621-628. [PMID: 38742542 DOI: 10.1080/17425255.2024.2356162] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/03/2024] [Accepted: 05/13/2024] [Indexed: 05/16/2024]
Abstract
INTRODUCTION: This review explores the transformative impact of machine learning (ML) on carcinogenicity prediction within drug development. It discusses the historical context and recent advancements, emphasizing the significance of ML methodologies in overcoming challenges related to data interpretation, ethical considerations, and regulatory acceptance. AREAS COVERED: The review comprehensively examines the integration of ML, deep learning, and diverse artificial intelligence (AI) approaches in various aspects of drug development safety assessments. It explores applications ranging from early-phase compound screening to clinical trial optimization, highlighting the versatility of ML in enhancing predictive accuracy and efficiency. EXPERT OPINION: Through the analysis of traditional approaches such as in vivo rodent bioassays and in vitro assays, the review underscores the limitations and resource intensity associated with these methods. It provides expert insights into how ML offers innovative solutions to address these challenges, revolutionizing safety assessments in drug development.
Affiliation(s)
- Nguyen Quoc Khanh Le
- Professional Master Program in Artificial Intelligence in Medicine, College of Medicine, Taipei Medical University, Taipei, Taiwan
- AIBioMed Research Group, Taipei Medical University, Taipei, Taiwan
- Research Center for Artificial Intelligence in Medicine, Taipei Medical University, Taipei, Taiwan
- Translational Imaging Research Center, Taipei Medical University Hospital, Taipei, Taiwan
| | - Thi-Xuan Tran
- University of Economics and Business Administration, Thai Nguyen University, Thai Nguyen, Vietnam
| | - Phung-Anh Nguyen
- Clinical Data Center, Office of Data Science, Taipei Medical University, Taipei, Taiwan
- Clinical Big Data Research Center, Taipei Medical University Hospital, Taipei Medical University, Taipei, Taiwan
| | - Trang-Thi Ho
- Department of Computer Science and Information Engineering, TamKang University, New Taipei, Taiwan
| | - Van-Nui Nguyen
- University of Information and Communication Technology, Thai Nguyen University, Thai Nguyen, Vietnam
| |
|
2
|
Chen Z, Zhang L, Sun J, Meng R, Yin S, Zhao Q. DCAMCP: A deep learning model based on capsule network and attention mechanism for molecular carcinogenicity prediction. J Cell Mol Med 2023; 27:3117-3126. [PMID: 37525507 PMCID: PMC10568665 DOI: 10.1111/jcmm.17889] [Citation(s) in RCA: 38] [Impact Index Per Article: 38.0] [Received: 06/07/2023] [Revised: 07/11/2023] [Accepted: 07/22/2023] [Indexed: 08/02/2023] Open
Abstract
The carcinogenicity of drugs can have a serious impact on human health, so carcinogenicity testing of new compounds is very necessary before being put on the market. Currently, many methods have been used to predict the carcinogenicity of compounds. However, most methods have limited predictive power and there is still much room for improvement. In this study, we construct a deep learning model based on capsule network and attention mechanism named DCAMCP to discriminate between carcinogenic and non-carcinogenic compounds. We train the DCAMCP on a dataset containing 1564 different compounds through their molecular fingerprints and molecular graph features. The trained model is validated by fivefold cross-validation and external validation. DCAMCP achieves an average accuracy (ACC) of 0.718 ± 0.009, sensitivity (SE) of 0.721 ± 0.006, specificity (SP) of 0.715 ± 0.014 and area under the receiver-operating characteristic curve (AUC) of 0.793 ± 0.012. Meanwhile, comparable results can be achieved on an external validation dataset containing 100 compounds, with an ACC of 0.750, SE of 0.778, SP of 0.727 and AUC of 0.811, which demonstrate the reliability of DCAMCP. The results indicate that our model has made progress in cancer risk assessment and could be used as an efficient tool in drug design.
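The ACC, SE, SP and AUC figures quoted in this abstract follow the usual binary-classification definitions. As a reading aid only, the sketch below shows how such metrics are typically computed from true labels, predicted labels and predicted scores. The function names are hypothetical and the code is not taken from DCAMCP; it assumes labels are coded 1 (carcinogenic) and 0 (non-carcinogenic).

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for 0/1 labels, where 1 means 'carcinogenic'."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = tn = fp = fn = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1
        elif t == 0 and p == 0:
            tn += 1
        elif t == 0 and p == 1:
            fp += 1
        else:
            fn += 1
    return tp, tn, fp, fn


def binary_metrics(y_true, y_pred):
    """Accuracy (ACC), sensitivity (SE) and specificity (SP)."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    return {
        "ACC": (tp + tn) / (tp + tn + fp + fn),
        "SE": tp / (tp + fn),   # fraction of carcinogens detected
        "SP": tn / (tn + fp),   # fraction of non-carcinogens cleared
    }


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties count as half a win. This pairwise form is O(n^2) and is meant
    for illustration, not for large datasets.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present in y_true")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

Reporting SE and SP alongside ACC matters for carcinogenicity data because a model can reach a high accuracy on an imbalanced set while missing most carcinogens.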
Affiliation(s)
- Zhe Chen
- School of Mathematics and Statistics, Liaoning University, Shenyang, China
| | - Li Zhang
- School of Life Science, Liaoning University, Shenyang, China
| | - Jianqiang Sun
- School of Information Science and Engineering, Linyi University, Linyi, China
| | - Rui Meng
- School of Computer Science and Software Engineering, University of Science and Technology Liaoning, Anshan, China
| | - Shuaidong Yin
- School of Computer Science and Software Engineering, University of Science and Technology Liaoning, Anshan, China
| | - Qi Zhao
- School of Computer Science and Software Engineering, University of Science and Technology Liaoning, Anshan, China
| |
|
3
|
Hao N, Sun P, Zhao W, Li X. Application of a developed triple-classification machine learning model for carcinogenic prediction of hazardous organic chemicals to the US, EU, and WHO based on Chinese database. Ecotoxicology and Environmental Safety 2023; 255:114806. [PMID: 36948010 DOI: 10.1016/j.ecoenv.2023.114806] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/23/2022] [Revised: 03/04/2023] [Accepted: 03/16/2023] [Indexed: 06/18/2023]
Abstract
Cancer, the second largest human disease, has become a major public health problem. The prediction of chemicals' carcinogenicity before their synthesis is crucial. In this paper, seven machine learning algorithms (i.e., Random Forest (RF), Logistic Regression (LR), Support Vector Machines (SVM), Complement Naive Bayes (CNB), K-Nearest Neighbor (KNN), XGBoost, and Multilayer Perceptron (MLP)) were used to construct the carcinogenicity triple classification prediction (TCP) model (i.e., 1A, 1B, Category 2). A total of 1444 descriptors of 118 hazardous organic chemicals were calculated by Discovery Studio 2020, Sybyl X-2.0 and PaDEL-Descriptor software. The constructed carcinogenicity TCP model was evaluated through five model evaluation indicators (i.e., Accuracy, Precision, Recall, F1 Score and AUC). The model evaluation results show that Accuracy, Precision, Recall, F1 Score and AUC evaluation indicators meet requirements (greater than 0.6). The accuracy of RF, LR, XGBoost, and MLP models for predicting carcinogenicity of Category 2 is 91.67%, 79.17%, 100%, and 100%, respectively. In addition, the constructed machine learning model in this study has potential for error correction. Taking the XGBoost model as an example, the predicted carcinogenicity level of 1,2,3-Trichloropropane (96-18-4) is Category 2, but the actual carcinogenicity level is 1B. However, the difference between Category 2 and 1B is only 0.004, indicating that XGBoost is one optimum model of the seven constructed machine learning models. Besides, results showed that functional groups like chlorine and benzene ring might influence the prediction of carcinogenic classification. Therefore, considering functional group characteristics of chemicals before constructing the carcinogenicity prediction model of organic chemicals is recommended. The predicted carcinogenicity of the organic chemicals using the optimum machine learning model (i.e., XGBoost) was also evaluated and verified by the toxicokinetics.
The RF and XGBoost TCP models constructed in this paper can be used for carcinogenicity detection before synthesizing new organic substances. It also provides technical support for the subsequent management of organic chemicals.
Affiliation(s)
- Ning Hao
- College of New Energy and Environment, Jilin University, Changchun 130012, China
| | - Peixuan Sun
- College of New Energy and Environment, Jilin University, Changchun 130012, China
| | - Wenjin Zhao
- College of New Energy and Environment, Jilin University, Changchun 130012, China.
| | - Xixi Li
- State Environmental Protection Key Laboratory of Ecological Effect and Risk Assessment of Chemicals, Chinese Research Academy of Environmental Sciences, Beijing 100012, China; Northern Region Persistent Organic Pollution Control (NRPOP) Laboratory, Faculty of Engineering and Applied Science, Memorial University, St. John's, A1B 3X5, Canada.
| |
|
4
|
Kour S, Biswas I, Sheoran S, Arora S, Sheela P, Duppala SK, Murthy DK, Pawar SC, Singh H, Kumar D, Prabhu D, Vuree S, Kumar R. Artificial intelligence and nanotechnology for cervical cancer treatment: Current status and future perspectives. J Drug Deliv Sci Technol 2023. [DOI: 10.1016/j.jddst.2023.104392] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/05/2023]
|
5
|
Limbu S, Dakshanamurthy S. Predicting Chemical Carcinogens Using a Hybrid Neural Network Deep Learning Method. Sensors (Basel, Switzerland) 2022; 22:s22218185. [PMID: 36365881 PMCID: PMC9653664 DOI: 10.3390/s22218185] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/13/2022] [Revised: 10/11/2022] [Accepted: 10/23/2022] [Indexed: 05/28/2023]
Abstract
Determining environmental chemical carcinogenicity is urgently needed as humans are increasingly exposed to these chemicals. In this study, we developed a hybrid neural network (HNN) method called HNN-Cancer to predict potential carcinogens of real-life chemicals. The HNN-Cancer included a new SMILES feature representation method by modifying our previous 3D array representation of 1D SMILES simulated by the convolutional neural network (CNN). We developed binary classification, multiclass classification, and regression models based on diverse non-congeneric chemicals. Along with the HNN-Cancer model, we developed models based on the random forest (RF), bootstrap aggregating (Bagging), and adaptive boosting (AdaBoost) methods for binary and multiclass classification. We developed regression models using HNN-Cancer, RF, support vector regressor (SVR), gradient boosting (GB), kernel ridge (KR), decision tree with AdaBoost (DT), KNeighbors (KN), and a consensus method. The performance of the models for all classifications was assessed using various statistical metrics. The accuracy of the HNN-Cancer, RF, and Bagging models was 74%, and their AUC was ~0.81 for binary classification models developed with 7994 chemicals. The sensitivity was 79.5% and the specificity was 67.3% for the HNN-Cancer, which outperforms the other methods. In the case of multiclass classification models with 1618 chemicals, we obtained the optimal accuracy of 70% with an AUC of 0.7 for HNN-Cancer, RF, Bagging, and AdaBoost. In the case of regression models, the correlation coefficient (R) was around 0.62 for HNN-Cancer and RF, higher than the SVM, GB, KR, DTBoost, and NN machine learning methods. Overall, the HNN-Cancer performed better for the majority of the known carcinogen experimental datasets. Further, the predictive performance of HNN-Cancer on diverse chemicals is comparable to the literature-reported models that included similar and less diverse molecules.
Our HNN-Cancer could be used in identifying potentially carcinogenic chemicals for a wide variety of chemical classes.
|
6
|
Mittal A, Mohanty SK, Gautam V, Arora S, Saproo S, Gupta R, Sivakumar R, Garg P, Aggarwal A, Raghavachary P, Dixit NK, Singh VP, Mehta A, Tayal J, Naidu S, Sengupta D, Ahuja G. Artificial intelligence uncovers carcinogenic human metabolites. Nat Chem Biol 2022; 18:1204-1213. [PMID: 35953549 DOI: 10.1038/s41589-022-01110-7] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 12/02/2021] [Accepted: 07/07/2022] [Indexed: 12/14/2022]
Abstract
The genome of a eukaryotic cell is often vulnerable to both intrinsic and extrinsic threats owing to its constant exposure to a myriad of heterogeneous compounds. Despite the availability of innate DNA damage responses, some genomic lesions trigger malignant transformation of cells. Accurate prediction of carcinogens is an ever-challenging task owing to the limited information about bona fide (non-)carcinogens. We developed Metabokiller, an ensemble classifier that accurately recognizes carcinogens by quantitatively assessing their electrophilicity, their potential to induce proliferation, oxidative stress, genomic instability, epigenome alterations, and anti-apoptotic response. Concomitant with the carcinogenicity prediction, Metabokiller is fully interpretable and outperforms existing best-practice methods for carcinogenicity prediction. Metabokiller unraveled potential carcinogenic human metabolites. To cross-validate Metabokiller predictions, we performed multiple functional assays using Saccharomyces cerevisiae and human cells with two Metabokiller-flagged human metabolites, namely 4-nitrocatechol and 3,4-dihydroxyphenylacetic acid, and observed high synergy between Metabokiller predictions and experimental validations.
Affiliation(s)
- Aayushi Mittal
- Department of Computational Biology, Indraprastha Institute of Information Technology-Delhi, Okhla, Phase III, New Delhi, Delhi, India
| | - Sanjay Kumar Mohanty
- Department of Computational Biology, Indraprastha Institute of Information Technology-Delhi, Okhla, Phase III, New Delhi, Delhi, India
| | - Vishakha Gautam
- Department of Computational Biology, Indraprastha Institute of Information Technology-Delhi, Okhla, Phase III, New Delhi, Delhi, India
| | - Sakshi Arora
- Department of Computational Biology, Indraprastha Institute of Information Technology-Delhi, Okhla, Phase III, New Delhi, Delhi, India
| | - Sheetanshu Saproo
- Department of Bio-Medical Engineering, Indian Institute of Technology Ropar, Rupnagar, Punjab, India
| | - Ria Gupta
- Department of Computational Biology, Indraprastha Institute of Information Technology-Delhi, Okhla, Phase III, New Delhi, Delhi, India
| | - Roshan Sivakumar
- Department of Computational Biology, Indraprastha Institute of Information Technology-Delhi, Okhla, Phase III, New Delhi, Delhi, India
| | - Prakriti Garg
- Department of Computational Biology, Indraprastha Institute of Information Technology-Delhi, Okhla, Phase III, New Delhi, Delhi, India
| | - Anmol Aggarwal
- Department of Computational Biology, Indraprastha Institute of Information Technology-Delhi, Okhla, Phase III, New Delhi, Delhi, India
| | - Padmasini Raghavachary
- Department of Computational Biology, Indraprastha Institute of Information Technology-Delhi, Okhla, Phase III, New Delhi, Delhi, India
| | - Nilesh Kumar Dixit
- Department of Computational Biology, Indraprastha Institute of Information Technology-Delhi, Okhla, Phase III, New Delhi, Delhi, India
| | - Vijay Pal Singh
- CSIR-Institute of Genomics & Integrative Biology, New Delhi, Delhi, India
| | - Anurag Mehta
- Rajiv Gandhi Cancer Institute & Research Centre, New Delhi, Delhi, India
| | - Juhi Tayal
- Rajiv Gandhi Cancer Institute & Research Centre, New Delhi, Delhi, India
| | - Srivatsava Naidu
- Department of Bio-Medical Engineering, Indian Institute of Technology Ropar, Rupnagar, Punjab, India
| | - Debarka Sengupta
- Department of Computational Biology, Indraprastha Institute of Information Technology-Delhi, Okhla, Phase III, New Delhi, Delhi, India.
| | - Gaurav Ahuja
- Department of Computational Biology, Indraprastha Institute of Information Technology-Delhi, Okhla, Phase III, New Delhi, Delhi, India.
| |
|
7
|
Staszak M, Staszak K, Wieszczycka K, Bajek A, Roszkowski K, Tylkowski B. Machine learning in drug design: Use of artificial intelligence to explore the chemical structure–biological activity relationship. WIREs Computational Molecular Science 2022. [DOI: 10.1002/wcms.1568] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 12/13/2022]
Affiliation(s)
- Maciej Staszak
- Institute of Technology and Chemical Engineering, Poznan University of Technology, Poznan, Poland
| | - Katarzyna Staszak
- Institute of Technology and Chemical Engineering, Poznan University of Technology, Poznan, Poland
| | - Karolina Wieszczycka
- Institute of Technology and Chemical Engineering, Poznan University of Technology, Poznan, Poland
| | - Anna Bajek
- Department of Tissue Engineering, Collegium Medicum, Nicolaus Copernicus University, Bydgoszcz, Poland
| | - Krzysztof Roszkowski
- Department of Oncology, Collegium Medicum, Nicolaus Copernicus University, Bydgoszcz, Poland
| | - Bartosz Tylkowski
- Department of Chemical Engineering, University Rovira i Virgili, Tarragona, Spain
- Eurecat, Centre Tecnològic de Catalunya, Chemical Technologies Unit, Tarragona, Spain
| |
|
8
|
Li F, Fan T, Sun G, Zhao L, Zhong R, Peng Y. Systematic QSAR and iQCCR modelling of fused/non-fused aromatic hydrocarbons (FNFAHs) carcinogenicity to rodents: reducing unnecessary chemical synthesis and animal testing. Green Chemistry 2022; 24:5304-5319. [DOI: 10.1039/d2gc00986b] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Indexed: 11/07/2023]
Abstract
The prediction of new or untested FNFAHs will reduce unnecessary chemical synthesis and animal testing, and contribute to the design of safer chemicals for production activities.
Affiliation(s)
- Feifan Li
- Beijing Key Laboratory of Environmental and Viral Oncology, Faculty of Environment and Life, Beijing University of Technology, Beijing 100124, P. R. China
| | - Tengjiao Fan
- Beijing Key Laboratory of Environmental and Viral Oncology, Faculty of Environment and Life, Beijing University of Technology, Beijing 100124, P. R. China
- Department of Medical Technology, Beijing Pharmaceutical University of Staff and Workers, Beijing 100079, China
| | - Guohui Sun
- Beijing Key Laboratory of Environmental and Viral Oncology, Faculty of Environment and Life, Beijing University of Technology, Beijing 100124, P. R. China
| | - Lijiao Zhao
- Beijing Key Laboratory of Environmental and Viral Oncology, Faculty of Environment and Life, Beijing University of Technology, Beijing 100124, P. R. China
| | - Rugang Zhong
- Beijing Key Laboratory of Environmental and Viral Oncology, Faculty of Environment and Life, Beijing University of Technology, Beijing 100124, P. R. China
| | - Yongzhen Peng
- National Engineering Laboratory for Advanced Municipal Wastewater Treatment and Reuse Technology, Faculty of Environment and Life, Beijing University of Technology, Beijing 100124, China
| |
|
9
|
Abstract
Motivation: Molecular carcinogenicity is a preventable cause of cancer, but systematically identifying carcinogenic compounds, which involves performing experiments on animal models, is expensive, time consuming and low throughput. As a result, carcinogenicity information is limited and building data-driven models with good prediction accuracy remains a major challenge. Results: In this work, we propose CONCERTO, a deep learning model that uses a graph transformer in conjunction with a molecular fingerprint representation for carcinogenicity prediction from molecular structure. Special efforts have been made to overcome the data size constraint, such as multi-round pre-training on related but lower quality mutagenicity data, and transfer learning from a large self-supervised model. Extensive experiments demonstrate that our model performs well and can generalize to external validation sets. CONCERTO could be useful for guiding future carcinogenicity experiments and provide insight into the molecular basis of carcinogenicity. Availability and implementation: The code and data underlying this article are available on GitHub at https://github.com/bowang-lab/CONCERTO
Affiliation(s)
- Philip Fradkin
- Department of Electrical & Computer Engineering, University of Toronto, Toronto, ON M5S 3G8, Canada
- Vector Institute, Toronto, ON M5G 1M1, Canada
| | - Adamo Young
- Vector Institute, Toronto, ON M5G 1M1, Canada
- Department of Computer Science, University of Toronto, Toronto, ON M5S 2E4, Canada
| | - Lazar Atanackovic
- Department of Electrical & Computer Engineering, University of Toronto, Toronto, ON M5S 3G8, Canada
- Vector Institute, Toronto, ON M5G 1M1, Canada
| | | | - Leo J Lee
- To whom correspondence should be addressed.
| | - Bo Wang
- To whom correspondence should be addressed.
| |
|
10
|
Pérez Santín E, Rodríguez Solana R, González García M, García Suárez MDM, Blanco Díaz GD, Cima Cabal MD, Moreno Rojas JM, López Sánchez JI. Toxicity prediction based on artificial intelligence: A multidisciplinary overview. WIREs Computational Molecular Science 2021. [DOI: 10.1002/wcms.1516] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Indexed: 12/29/2022]
Affiliation(s)
- Efrén Pérez Santín
- Escuela Superior de Ingeniería y Tecnología (ESIT), Universidad Internacional de La Rioja (UNIR), Logroño, Spain
| | - Raquel Rodríguez Solana
- Department of Food Science and Health, Andalusian Institute of Agricultural and Fisheries Research and Training (IFAPA), Alameda del Obispo, Avda Córdoba, Andalucía, Spain
| | - Mariano González García
- Escuela Superior de Ingeniería y Tecnología (ESIT), Universidad Internacional de La Rioja (UNIR), Logroño, Spain
| | - María Del Mar García Suárez
- Escuela Superior de Ingeniería y Tecnología (ESIT), Universidad Internacional de La Rioja (UNIR), Logroño, Spain
| | - Gerardo David Blanco Díaz
- Escuela Superior de Ingeniería y Tecnología (ESIT), Universidad Internacional de La Rioja (UNIR), Logroño, Spain
| | - María Dolores Cima Cabal
- Escuela Superior de Ingeniería y Tecnología (ESIT), Universidad Internacional de La Rioja (UNIR), Logroño, Spain
| | - José Manuel Moreno Rojas
- Department of Food Science and Health, Andalusian Institute of Agricultural and Fisheries Research and Training (IFAPA), Alameda del Obispo, Avda Córdoba, Andalucía, Spain
| | - José Ignacio López Sánchez
- Escuela Superior de Ingeniería y Tecnología (ESIT), Universidad Internacional de La Rioja (UNIR), Logroño, Spain
| |
|
11
|
Wang YW, Huang L, Jiang SW, Li K, Zou J, Yang SY. CapsCarcino: A novel sparse data deep learning tool for predicting carcinogens. Food Chem Toxicol 2020; 135:110921. [PMID: 31669597 DOI: 10.1016/j.fct.2019.110921] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 06/24/2019] [Revised: 09/21/2019] [Accepted: 10/23/2019] [Indexed: 12/11/2022]
Affiliation(s)
- Yi-Wei Wang
- State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, Chengdu, Sichuan, 610041, PR China; College of Preclinical Medicine, Southwest Medical University, Luzhou, Sichuan, 646000, PR China
| | - Lei Huang
- School of Computer Science & Engineer, University of Electronic Science and Technology of China, Chengdu, Sichuan, 611731, PR China; Basic Teaching Department, Sichuan College of Architectural Technology, Deyang, Sichuan, 61800, PR China
| | - Si-Wen Jiang
- School of Computer Science & Engineer, University of Electronic Science and Technology of China, Chengdu, Sichuan, 611731, PR China
| | - Kan Li
- State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, Chengdu, Sichuan, 610041, PR China
| | - Jun Zou
- State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, Chengdu, Sichuan, 610041, PR China.
| | - Sheng-Yong Yang
- State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, Chengdu, Sichuan, 610041, PR China.
| |
|
12
|
Devillers J, Devillers H. Toxicity profiling and prioritization of plant-derived antimalarial agents. SAR and QSAR in Environmental Research 2019; 30:801-824. [PMID: 31565973 DOI: 10.1080/1062936x.2019.1665844] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 06/27/2019] [Accepted: 09/06/2019] [Indexed: 06/10/2023]
Abstract
Human malaria is the most widespread mosquito-borne life-threatening disease worldwide. In the absence of effective vaccines, prevention and treatment of malaria only depend on prophylaxis and drug-based therapy either in monotherapy or in combination. Unfortunately, the number of available antimalarial drugs presenting different mechanisms of action is rather limited. In addition, the appearance of drug-resistance in the parasite strains impacts the efficacy of the treatments. As a result, there is a crucial need to find new drugs to circumvent resistance problems. In the quest to identify new antimalarial agents a huge number of plant-derived compounds (PDCs) have been investigated. Surprisingly in the in silico PDC screening programs, toxicity filters are either never used or so simple that their interest is limited. In this context, the goal of this study was to show how to take advantage of validated toxicity QSAR models for refining the selection of PDCs. From an original data set of 507 PDCs collected from the literature, the use of toxicity filters for endocrine disruption, developmental toxicity, and hepatotoxicity in conjunction with classical pharmacokinetic filters allowed us to obtain a list of 31 compounds of potential interest. The pros and cons of such a strategy have been discussed.
Affiliation(s)
| | - H Devillers
- Micalis Institute, INRA, AgroParisTech, Université Paris-Saclay, Jouy-en-Josas, France
| |
|
13
|
Yang H, Sun L, Li W, Liu G, Tang Y. In Silico Prediction of Chemical Toxicity for Drug Design Using Machine Learning Methods and Structural Alerts. Front Chem 2018. [PMID: 29515993 PMCID: PMC5826228 DOI: 10.3389/fchem.2018.00030] [Citation(s) in RCA: 101] [Impact Index Per Article: 16.8] [Indexed: 12/17/2022] Open
Abstract
During drug development, safety is always the most important issue, including a variety of toxicities and adverse drug effects, which should be evaluated in preclinical and clinical trial phases. This review article at first simply introduced the computational methods used in prediction of chemical toxicity for drug design, including machine learning methods and structural alerts. Machine learning methods have been widely applied in qualitative classification and quantitative regression studies, while structural alerts can be regarded as a complementary tool for lead optimization. The emphasis of this article was put on the recent progress of predictive models built for various toxicities. Available databases and web servers were also provided. Though the methods and models are very helpful for drug design, there are still some challenges and limitations to be improved for drug safety assessment in the future.
Affiliation(s)
- Hongbin Yang
- Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai, China
| | - Lixia Sun
- Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai, China
| | - Weihua Li
- Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai, China
| | - Guixia Liu
- Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai, China
| | - Yun Tang
- Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai, China
| |
|
14
|
Guan D, Fan K, Spence I, Matthews S. Combining machine learning models of in vitro and in vivo bioassays improves rat carcinogenicity prediction. Regul Toxicol Pharmacol 2018; 94:8-15. [PMID: 29337192 DOI: 10.1016/j.yrtph.2018.01.008] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 08/10/2017] [Revised: 01/09/2018] [Accepted: 01/10/2018] [Indexed: 12/18/2022]
Abstract
In vitro genotoxicity bioassays are cost-efficient methods of assessing potential carcinogens. However, many genotoxicity bioassays are inappropriate for detecting chemicals eliciting non-genotoxic mechanisms, such as tumour promotion; this necessitates the use of in vivo rodent carcinogenicity (IVRC) assays. In silico IVRC modelling could potentially address the low throughput and high cost of this assay. We aimed to develop and combine computational QSAR models of novel bioassays for the prediction of IVRC results and compare with existing software. QSAR models were generated from existing Ames (n = 6512), Syrian Hamster Embryonic (SHE, n = 410), ISSCAN rodent carcinogenicity (ISC, n = 834) and GreenScreen GADD45a-GFP (n = 1415) chemical datasets. These models mapped the molecular descriptors of each compound to their respective assay result using machine learning algorithms (AdaBoost, k-Nearest Neighbours, C4.5 Decision Tree, Multilayer Perceptron, Random Forest). The best performing models were combined with k-Nearest Neighbours to create a cascade model for IVRC prediction. High QSAR model performance was observed from ten times 10-fold cross-validation with above 80% accuracy and 0.85 AUC for each assay dataset. The cascade model predicted rat carcinogenicity with 69.3% accuracy and 0.700 AUC. This study demonstrates the novelty of a combined approach for IVRC prediction, with higher performance than existing software.
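The evaluation protocol named in this abstract, 10-fold cross-validation repeated ten times, can be illustrated with a small index generator. This is a generic sketch of repeated k-fold splitting, not the authors' code; the function name `repeated_kfold`, its parameters and the fixed seed are assumptions made for the example.

```python
import random


def repeated_kfold(n_samples, k=10, repeats=10, seed=0):
    """Yield (repeat, fold, train_idx, test_idx) for repeated k-fold CV.

    Each repeat reshuffles the sample indices, cuts them into k folds of
    near-equal size, and uses every fold once as the held-out test set.
    """
    if n_samples < k:
        raise ValueError("need at least k samples")
    rng = random.Random(seed)  # fixed seed so the splits are reproducible
    for r in range(repeats):
        idx = list(range(n_samples))
        rng.shuffle(idx)
        folds = [idx[i::k] for i in range(k)]  # strided slices as folds
        for f in range(k):
            test_idx = folds[f]
            train_idx = [i for g, fold in enumerate(folds) if g != f
                         for i in fold]
            yield r, f, train_idx, test_idx
```

Averaging a metric over all `k * repeats` held-out folds reduces the dependence of the estimate on any single random partition, which is useful for assay datasets as small as the 410-compound SHE set mentioned above.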
Affiliation(s)
- Davy Guan
- Sydney Medical School, The University of Sydney, Australia
| | - Kevin Fan
- Sydney Medical School, The University of Sydney, Australia
| | - Ian Spence
- Sydney Medical School, The University of Sydney, Australia
| | - Slade Matthews
- Sydney Medical School, The University of Sydney, Australia.
| |
|
15
|
|
16
|
Zhang L, Ai H, Chen W, Yin Z, Hu H, Zhu J, Zhao J, Zhao Q, Liu H. CarcinoPred-EL: Novel models for predicting the carcinogenicity of chemicals using molecular fingerprints and ensemble learning methods. Sci Rep 2017; 7:2118. [PMID: 28522849 PMCID: PMC5437031 DOI: 10.1038/s41598-017-02365-0] [Citation(s) in RCA: 111] [Impact Index Per Article: 15.9] [Received: 01/17/2017] [Accepted: 04/10/2017] [Indexed: 01/11/2023] Open
Abstract
Carcinogenicity refers to a highly toxic end point of certain chemicals, and has become an important issue in the drug development process. In this study, three novel ensemble classification models, namely Ensemble SVM, Ensemble RF, and Ensemble XGBoost, were developed to predict carcinogenicity of chemicals using seven types of molecular fingerprints and three machine learning methods based on a dataset containing 1003 diverse compounds with rat carcinogenicity. Among these three models, Ensemble XGBoost is found to be the best, giving an average accuracy of 70.1 ± 2.9%, sensitivity of 67.0 ± 5.0%, and specificity of 73.1 ± 4.4% in five-fold cross-validation and an accuracy of 70.0%, sensitivity of 65.2%, and specificity of 76.5% in external validation. In comparison with some recent methods, the ensemble models outperform some machine learning-based approaches and yield equal accuracy and higher specificity but lower sensitivity than rule-based expert systems. It is also found that the ensemble models could be further improved if more data were available. As an application, the ensemble models are employed to discover potential carcinogens in the DrugBank database. The results indicate that the proposed models are helpful in predicting the carcinogenicity of chemicals. A web server called CarcinoPred-EL has been built for these models (http://ccsipb.lnu.edu.cn/toxicity/CarcinoPred-EL/).
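The abstract describes ensembles built from base classifiers trained on seven fingerprint types but does not say how their outputs are combined. One common choice, shown here purely as an assumption, is soft voting: average the predicted carcinogenicity probabilities of the base models and threshold the mean. The function name `soft_vote` and the fingerprint keys in the usage note are hypothetical.

```python
def soft_vote(prob_by_model, threshold=0.5):
    """Average per-model probabilities and threshold the mean.

    prob_by_model maps a base-model name (for example, one model per
    fingerprint type) to a list of predicted probabilities, one per
    compound. Returns (mean_probabilities, labels) with label 1 meaning
    'predicted carcinogenic'. Assumption: soft voting, which may differ
    from the combination rule used in CarcinoPred-EL.
    """
    columns = list(prob_by_model.values())
    if not columns:
        raise ValueError("at least one base model is required")
    n = len(columns[0])
    if any(len(col) != n for col in columns):
        raise ValueError("all models must score the same compounds")
    mean_prob = [sum(col[i] for col in columns) / len(columns)
                 for i in range(n)]
    labels = [1 if p >= threshold else 0 for p in mean_prob]
    return mean_prob, labels
```

For instance, `soft_vote({"maccs": [0.9, 0.2], "ecfp": [0.6, 0.4], "pubchem": [0.3, 0.3]})` labels the first compound as carcinogenic and the second as non-carcinogenic, because the mean probabilities fall on opposite sides of the 0.5 threshold.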
Affiliation(s)
- Li Zhang
- School of Life Science, Liaoning University, Shenyang, 110036, China; Research Center for Computer Simulating and Information Processing of Bio-macromolecules of Liaoning Province, Shenyang, 110036, China
- Haixin Ai
- School of Life Science, Liaoning University, Shenyang, 110036, China; Research Center for Computer Simulating and Information Processing of Bio-macromolecules of Liaoning Province, Shenyang, 110036, China; Engineering Laboratory for Molecular Simulation and Designing of Drug Molecules of Liaoning, Shenyang, 110036, China
- Wen Chen
- School of Information, Liaoning University, Shenyang, 110036, China
- Zimo Yin
- School of Information, Liaoning University, Shenyang, 110036, China
- Huan Hu
- School of Life Science, Liaoning University, Shenyang, 110036, China
- Junfeng Zhu
- School of Life Science, Liaoning University, Shenyang, 110036, China
- Jian Zhao
- School of Life Science, Liaoning University, Shenyang, 110036, China
- Qi Zhao
- Research Center for Computer Simulating and Information Processing of Bio-macromolecules of Liaoning Province, Shenyang, 110036, China; School of Mathematics, Liaoning University, Shenyang, 110036, China
- Hongsheng Liu
- School of Life Science, Liaoning University, Shenyang, 110036, China; Research Center for Computer Simulating and Information Processing of Bio-macromolecules of Liaoning Province, Shenyang, 110036, China; Engineering Laboratory for Molecular Simulation and Designing of Drug Molecules of Liaoning, Shenyang, 110036, China
|
17
|
Zhang H, Cao ZX, Li M, Li YZ, Peng C. Novel naïve Bayes classification models for predicting the carcinogenicity of chemicals. Food Chem Toxicol 2016; 97:141-149. [PMID: 27597133 DOI: 10.1016/j.fct.2016.09.005] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.8] [Received: 01/18/2016] [Revised: 08/02/2016] [Accepted: 09/01/2016] [Indexed: 02/05/2023]
Abstract
Carcinogenicity prediction has become a significant issue for the pharmaceutical industry. The purpose of this investigation was to develop a novel model for predicting the carcinogenicity of chemicals using a naïve Bayes classifier. The established model was validated by internal 5-fold cross-validation and an external test set. The naïve Bayes classifier gave an average overall prediction accuracy of 90 ± 0.8% for the training set and 68 ± 1.9% for the external test set. Moreover, five simple molecular descriptors (e.g., AlogP, molecular weight (MW), number of H donors, Apol, and Wiener) considered important for the carcinogenicity of chemicals were identified, along with some substructures related to carcinogenicity. Thus, we hope the established naïve Bayes prediction model can be applied to filter early-stage molecules for this potential adverse effect, and that the five identified molecular descriptors and the substructures of carcinogens will give a better understanding of the carcinogenicity of chemicals and guide medicinal chemists in the design of new candidate drugs and in lead optimization, ultimately reducing the attrition rate in later stages of drug development.
Affiliation(s)
- Hui Zhang
- College of Life Science, Northwest Normal University, Lanzhou, Gansu, 730070, PR China; State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, West China Medical School, Sichuan University, Chengdu, Sichuan, 610041, PR China
- Zhi-Xing Cao
- Pharmacy College, Chengdu University of Traditional Chinese Medicine, Key Laboratory of Systematic Research, Development and Utilization of Chinese Medicine Resources in Sichuan Province-key Laboratory Breeding Base of Co-founded by Sichuan Province and MOST, Chengdu, Sichuan, PR China; State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, West China Medical School, Sichuan University, Chengdu, Sichuan, 610041, PR China
- Meng Li
- College of Life Science, Northwest Normal University, Lanzhou, Gansu, 730070, PR China
- Yu-Zhi Li
- Pharmacy College, Chengdu University of Traditional Chinese Medicine, Key Laboratory of Systematic Research, Development and Utilization of Chinese Medicine Resources in Sichuan Province-key Laboratory Breeding Base of Co-founded by Sichuan Province and MOST, Chengdu, Sichuan, PR China
- Cheng Peng
- Pharmacy College, Chengdu University of Traditional Chinese Medicine, Key Laboratory of Systematic Research, Development and Utilization of Chinese Medicine Resources in Sichuan Province-key Laboratory Breeding Base of Co-founded by Sichuan Province and MOST, Chengdu, Sichuan, PR China
|
18
|
Predicting the acute neurotoxicity of diverse organic solvents using probabilistic neural networks based QSTR modeling approaches. Neurotoxicology 2016; 53:45-52. [DOI: 10.1016/j.neuro.2015.12.013] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 07/22/2015] [Revised: 12/17/2015] [Accepted: 12/17/2015] [Indexed: 12/23/2022]
|
19
|
Predicting human intestinal absorption of diverse chemicals using ensemble learning based QSAR modeling approaches. Comput Biol Chem 2016; 61:178-96. [PMID: 26881740 DOI: 10.1016/j.compbiolchem.2016.01.005] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.5] [Received: 09/24/2014] [Revised: 01/18/2016] [Accepted: 01/21/2016] [Indexed: 11/20/2022]
Abstract
Human intestinal absorption (HIA) of orally administered drugs is an important criterion for candidate molecules. Computational prediction of the HIA of molecules may facilitate the screening of new drugs. In this study, ensemble learning (EL) based qualitative and quantitative structure-activity relationship (SAR) models (gradient boosted tree, GBT, and bagged decision tree, BDT) have been established for the binary classification and HIA prediction of chemicals, using selected molecular descriptors. The structural diversity of the chemicals and the nonlinear structure in the considered data were tested by the similarity index and Brock-Dechert-Scheinkman statistics. The external predictive power of the developed SAR models was evaluated through the internal and external validation procedures recommended in the literature. All the statistical parameters derived for the performance of the constructed SAR models were above their respective thresholds, suggesting their robustness for future applications. In complete data, the qualitative SAR models rendered classification accuracy of >99%, while the quantitative SAR models yielded correlation (R(2)) of >0.91 between the measured and predicted HIA values. The performances of the EL-based SAR models were also compared with linear models (linear discriminant analysis, LDA, and multiple linear regression, MLR). The GBT and BDT SAR models performed better than the LDA and MLR methods. A comparison of our models with previously reported QSARs for HIA prediction suggested that they perform better. The results suggest that the developed SAR models can reliably predict the HIA of structurally diverse chemicals and can serve as useful tools for the initial screening of molecules in the drug development process.
|
20
|
Lopinavir Resistance Classification with Imbalanced Data Using Probabilistic Neural Networks. J Med Syst 2016; 40:69. [PMID: 26733278 DOI: 10.1007/s10916-015-0428-7] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 07/07/2015] [Accepted: 12/23/2015] [Indexed: 10/22/2022]
Abstract
Resistance to antiretroviral drugs has been a major obstacle for long-lasting treatment of HIV-infected patients. The development of models to predict drug resistance is recognized as useful for helping the decision of the best therapy for each HIV+ individual. The aim of this study was to develop classifiers for predicting resistance to the HIV protease inhibitor lopinavir using a probabilistic neural network (PNN). The data were provided by the Molecular Virology Laboratory of the Health Sciences Center, Federal University of Rio de Janeiro (CCS-UFRJ/Brazil). Using bootstrap and stepwise techniques, ten features were selected by logistic regression (LR) to be used as inputs to the network. Bootstrap and cross-validation were used to define the smoothing parameter of the PNN networks. Four balanced models were designed and evaluated using a separate test set. The accuracies of the classifiers with the test set ranged from 0.89 to 0.94, and the area under the receiver operating characteristic (ROC) curve (AUC) ranged from 0.96 to 0.97. The sensitivity ranged from 0.94 to 1.00, and the specificity was between 0.88 and 0.92. Four classifiers showed performances very close to three existing expert-based interpretation systems, the HIVdb, the Rega and the ANRS algorithms, and to a k-Nearest Neighbor.
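The role of the smoothing parameter mentioned in this abstract can be illustrated with a minimal probabilistic neural network, which is in essence a Parzen-window classifier: each class score is the average Gaussian kernel response of that class's training patterns, and the class with the highest score wins. The sketch below assumes equal class priors and a single shared smoothing parameter `sigma`; the two-dimensional toy points and class labels are hypothetical and unrelated to the study's data or its ten selected features.

```python
import math
from collections import defaultdict

def pnn_predict(train_x, train_y, x, sigma):
    """Classify x with a basic PNN (Gaussian Parzen windows, equal class priors)."""
    kernel_sums = defaultdict(float)
    class_counts = defaultdict(int)
    for xi, yi in zip(train_x, train_y):
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, xi))
        # Pattern layer: Gaussian kernel centred on each training point.
        kernel_sums[yi] += math.exp(-sq_dist / (2.0 * sigma ** 2))
        class_counts[yi] += 1
    # Summation layer: average kernel response per class.
    scores = {c: kernel_sums[c] / class_counts[c] for c in kernel_sums}
    # Decision layer: pick the class with the largest score.
    return max(scores, key=scores.get), scores

# Hypothetical toy data: two well-separated clusters.
train_x = [(0, 0), (0, 1), (1, 0), (3, 3), (3, 4), (4, 3)]
train_y = ["susceptible"] * 3 + ["resistant"] * 3

label_a, _ = pnn_predict(train_x, train_y, (0.5, 0.5), sigma=1.0)
label_b, _ = pnn_predict(train_x, train_y, (3.5, 3.5), sigma=1.0)
print(label_a, label_b)
```

A small `sigma` makes the classifier behave like a nearest-neighbour rule, while a large one smooths class boundaries, which is why the value is usually tuned by resampling, as the abstract describes.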
|
21
|
Basant N, Gupta S, Singh KP. Predicting binding affinities of diverse pharmaceutical chemicals to human serum plasma proteins using QSPR modelling approaches. SAR QSAR Environ Res 2016; 27:67-85. [PMID: 26854728 DOI: 10.1080/1062936x.2015.1133700] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Indexed: 06/05/2023]
Abstract
The prediction of the plasma protein binding (PPB) affinity of chemicals is of paramount significance in the drug development process. In this study, ensemble machine learning-based QSPR models have been established for a four-category classification and PPB affinity prediction of diverse compounds using a large PPB dataset of 930 compounds and in accordance with the OECD guidelines. The structural diversity of the chemicals was tested by the Tanimoto similarity index. The external predictive power of the developed QSPR models was evaluated through internal and external validations. In the QSPR models, XLogP was the most important descriptor. In the test data, the classification QSPR models rendered an accuracy of >93%, while the regression QSPR models yielded r(2) of >0.920 between the measured and predicted PPB affinities, with a root mean squared error of <9.77. Values of the statistical coefficients derived for the test data were above their threshold limits, lending high confidence to this analysis. The QSPR models in this study performed better than those of previous studies. The results suggest that the developed QSPR models are reliable for predicting the PPB affinity of structurally diverse chemicals. They can be useful for initial screening of candidate molecules in the drug development process.
Affiliation(s)
- N Basant
- ETRC, Gomtinagar, Lucknow, India
- S Gupta
- Environmental Chemistry Division, CSIR-Indian Institute of Toxicology Research, Lucknow, India
- K P Singh
- Environmental Chemistry Division, CSIR-Indian Institute of Toxicology Research, Lucknow, India
|
22
|
Wu X, Zhang Q, Wang H, Hu J. Predicting carcinogenicity of organic compounds based on CPDB. Chemosphere 2015; 139:81-90. [PMID: 26070146 DOI: 10.1016/j.chemosphere.2015.05.056] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Received: 03/19/2015] [Revised: 05/13/2015] [Accepted: 05/17/2015] [Indexed: 06/04/2023]
Abstract
Cancer is a major threat to human health, and predicting the carcinogenicity of chemicals is of great importance. In this article, QSAR models predicting the carcinogenicity of organic compounds in rats and mice were developed from CPDB data. The models were built on data for a specific target site, the liver, and were separated by the sex of the rats and mice. Models were further separated according to whether the molecular structure contains a ring, in order to reduce structural diversity. Eight local models were therefore developed in total. Given the complexity of carcinogenesis, and in order to capture as much information as possible, DRAGON descriptors were selected as the variables used to develop the models. Fitting ability, robustness and predictive power of the models were assessed according to the OECD principles. The external predictive coefficients for the validation sets of each model were in the range of 0.711-0.906, and those for the whole data in each model were all greater than 0.8, indicating that all models have good predictivity. To study the mechanism of carcinogenesis, standardized regression coefficients were calculated for all predictor variables. In addition, the effect of animal sex on carcinogenesis was compared, and a trend emerged in which females showed stronger tolerance to carcinogens than males in both species.
Affiliation(s)
- Xiuchao Wu
- Environment Research Institute, Shandong University, Jinan 250100, PR China
- Qingzhu Zhang
- Environment Research Institute, Shandong University, Jinan 250100, PR China
- Hui Wang
- School of Environment, Tsinghua University, Beijing 100084, PR China
- Jingtian Hu
- Environment Research Institute, Shandong University, Jinan 250100, PR China
|
23
|
Gupta S, Basant N, Rai P, Singh KP. Modeling the binding affinity of structurally diverse industrial chemicals to carbon using the artificial intelligence approaches. Environ Sci Pollut Res Int 2015; 22:17810-17827. [PMID: 26160122 DOI: 10.1007/s11356-015-4965-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/15/2015] [Accepted: 06/25/2015] [Indexed: 06/04/2023]
Abstract
The binding affinity of chemicals to carbon is an important characteristic with vast industrial applications. Experimental determination of the adsorption capacity of diverse chemicals onto carbon is both time and resource intensive, and the development of computational approaches has been widely advocated. In this study, ten different artificial intelligence (AI)-based qualitative and quantitative structure-property relationship (QSPR) models (MLPN, RBFN, PNN/GRNN, CCN, SVM, GEP, GMDH, SDT, DTF, DTB) were established for the prediction of the adsorption capacity of structurally diverse chemicals to activated carbon following the OECD guidelines. Structural diversity of the chemicals and nonlinear dependence in the data were evaluated using the Tanimoto similarity index and Brock-Dechert-Scheinkman statistics. The generalization and prediction abilities of the constructed models were established through rigorous internal and external validation procedures employing a wide series of statistical checks. In the complete dataset, the qualitative models rendered classification accuracies between 97.04 and 99.93%, while the quantitative models yielded correlation (R(2)) values of 0.877-0.977 between the measured and the predicted endpoint values. The quantitative prediction accuracies for the higher molecular weight (MW) compounds (class 4) were relatively better than those for the low MW compounds. In both the qualitative and quantitative models, polarizability was the most influential descriptor. Structural alerts responsible for the extreme adsorption behavior of the compounds were identified. A higher number of carbon atoms and the presence of higher halogens in a molecule rendered higher binding affinity. The proposed QSPR models performed well and outperformed those in previous reports. The relatively better performance of the ensemble learning models (DTF, DTB) may be attributed to the strengths of the bagging and boosting algorithms, which enhance predictive accuracy. The proposed AI models can be useful tools for screening chemicals for their binding affinities toward carbon for their safe management.
Affiliation(s)
- Shikha Gupta
- Environmental Chemistry Division, CSIR-Indian Institute of Toxicology Research, Post Box 80, Mahatma Gandhi Marg, Lucknow, 226 001, India
- Nikita Basant
- KanbanSystems Pvt. Ltd., Laxmi Nagar, Delhi, 110092, India
- Premanjali Rai
- Environmental Chemistry Division, CSIR-Indian Institute of Toxicology Research, Post Box 80, Mahatma Gandhi Marg, Lucknow, 226 001, India
- Kunwar P Singh
- Environmental Chemistry Division, CSIR-Indian Institute of Toxicology Research, Post Box 80, Mahatma Gandhi Marg, Lucknow, 226 001, India
|
24
|
Gupta S, Basant N, Singh KP. Nonlinear QSAR modeling for predicting cytotoxicity of ionic liquids in leukemia rat cell line: an aid to green chemicals designing. Environ Sci Pollut Res Int 2015; 22:12699-12710. [PMID: 25913312 DOI: 10.1007/s11356-015-4526-3] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 02/09/2015] [Accepted: 04/09/2015] [Indexed: 06/04/2023]
Abstract
Safety assessment and designing of safer ionic liquids (ILs) are among the priorities of the chemists and toxicologists today. Computational approaches have been considered as appropriate methods for prior safety assessment of chemicals and tools to aid in structural designing. The present study is an attempt to investigate the chemical attributes of a wide variety of ILs towards their cytotoxicity in leukemia rat cell line IPC-81 through the development of nonlinear quantitative structure-activity relationship (QSAR) models in the light of the OECD principles for QSAR development. Here, the cascade correlation network (CCN), probabilistic neural network (PNN), and generalized regression neural networks (GRNN) QSAR models were established for the discrimination of ILs in four categories of cytotoxicity and their end-point prediction using few simple descriptors. The diversity and nonlinearity of the considered dataset were evaluated through computing the Euclidean distance and Brock-Dechert-Scheinkman statistics. The constructed QSAR models were validated with external test data. The predictive power of these models was established through a variety of stringent parameters recommended in QSAR literature. The classification QSARs rendered the accuracy of >86%, and the regression models yielded correlation (R(2)) of >0.90 in test data. The developed QSAR models exhibited high statistical confidence and identified the structural elements of the ILs responsible for their cytotoxicity and, hence, could be useful tools in structural designing of safer and green ILs.
Affiliation(s)
- Shikha Gupta
- Environmental Chemistry Division, CSIR-Indian Institute of Toxicology Research, Post Box 80, Mahatma Gandhi Marg, Lucknow, 226 001, India
|
25
|
Gupta S, Basant N, Singh KP. Estimating sensory irritation potency of volatile organic chemicals using QSARs based on decision tree methods for regulatory purpose. Ecotoxicology 2015; 24:873-86. [PMID: 25707485 DOI: 10.1007/s10646-015-1431-y] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Accepted: 02/12/2015] [Indexed: 05/26/2023]
Abstract
Volatile organic compounds (VOCs) are among the priority atmospheric pollutants that have high indoor and outdoor exposure potential. The toxicity assessment of VOCs to living ecosystems has received considerable attention in recent years. Development of computational methods for safety assessment of chemicals has been advocated by various regulatory agencies. The paper proposes robust and reliable quantitative structure-activity relationships (QSARs) for estimating the sensory irritation potency and screening of the VOCs. Here, decision tree (DT) based classification and regression QSARs models, such as single DT, decision tree forest (DTF), and decision tree boost (DTB) were developed using the sensory irritation data on VOCs in mice following the OECD principles. Structural diversity and nonlinearity in the data were evaluated through the Euclidean distance and Brock-Dechert-Scheinkman statistics. The constructed QSAR models were validated with external test data and the predictive performance of these models was established through a set of coefficients recommended in QSAR literature. The performance of all three classification and regression QSAR models was satisfactory, but DTF and DTB performed relatively better. The classification and regression QSAR models (DTF, DTB) rendered classification accuracies of 98.59 and 100 %, and yielded correlations (R(2)) of 0.950 and 0.971, respectively in complete data. The lipoaffinity index and SwHBa were identified as the most influential descriptors in proposed QSARs. The developed QSARs performed better than the previous studies. The developed models exhibited high statistical confidence and identified the structural properties of the VOCs responsible for their sensory irritation, and hence could be useful tools in screening of chemicals for regulatory purpose.
Affiliation(s)
- Shikha Gupta
- Academy of Scientific and Innovative Research, Anusandhan Bhawan, Rafi Marg, New Delhi, 110 001, India
|
26
|
Li X, Du Z, Wang J, Wu Z, Li W, Liu G, Shen X, Tang Y. In Silico Estimation of Chemical Carcinogenicity with Binary and Ternary Classification Methods. Mol Inform 2015; 34:228-35. [PMID: 27490168 DOI: 10.1002/minf.201400127] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.4] [Received: 09/12/2014] [Accepted: 01/11/2015] [Indexed: 11/07/2022]
Abstract
Carcinogenicity is one of the properties of chemicals of greatest concern to human health, so it is important to identify chemical carcinogenicity as early as possible. In this study, 829 diverse compounds with rat carcinogenicity data were collected from the Carcinogenic Potency Database (CPDB). Using six types of fingerprints to represent the molecules, 30 binary and ternary classification models were generated to predict chemical carcinogenicity by five machine learning methods. The models were evaluated with an external validation set containing 87 chemicals from the ISSCAN database. The best binary model was developed with MACCS keys and the kNN algorithm, with a predictive accuracy of 83.91%, while the best ternary model was also generated with MACCS keys and the kNN algorithm, with an overall accuracy of 80.46%. Furthermore, the best binary and ternary classification models were used to estimate the carcinogenicity of 2251 tobacco smoke components. Of these, 981 were predicted as carcinogens by the binary classification model, while 110 compounds were predicted as strong carcinogens and 807 as weak carcinogens by the ternary classification model. The results indicated that our models would be helpful for the prediction of chemical carcinogenicity.
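A fingerprint-plus-kNN classifier of the kind named in this abstract can be sketched in a few lines. The example below represents each fingerprint as the set of its on-bit indices and ranks neighbours by Tanimoto similarity, a common choice for bit fingerprints; the abstract does not state which similarity measure or value of k was used, so both are assumptions here. The bit sets are invented toy data, not real MACCS keys.

```python
from collections import Counter

def tanimoto(a, b):
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def knn_predict(train, query, k=3):
    """Majority vote among the k training fingerprints most similar to the query."""
    ranked = sorted(train, key=lambda item: tanimoto(item[0], query), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical training set: (on-bit indices, class label).
train = [
    ({1, 2, 3, 4}, "carcinogen"),
    ({1, 2, 3, 5}, "carcinogen"),
    ({2, 3, 4, 6}, "carcinogen"),
    ({7, 8, 9}, "non-carcinogen"),
    ({7, 8, 10}, "non-carcinogen"),
    ({8, 9, 11}, "non-carcinogen"),
]

print(knn_predict(train, {1, 2, 3}))
print(knn_predict(train, {7, 8, 9, 10}))
```

A ternary model of the kind described would use the same procedure with three labels (for example strong, weak, and non-carcinogen) instead of two.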
Affiliation(s)
- Xiao Li
- Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, P. R. China; phone: +86-21-6425-1052; fax: +86-21-6425-1033; Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, P. R. China
- Zheng Du
- Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, P. R. China; phone: +86-21-6425-1052; fax: +86-21-6425-1033
- Jie Wang
- Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, P. R. China; phone: +86-21-6425-1052; fax: +86-21-6425-1033
- Zengrui Wu
- Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, P. R. China; phone: +86-21-6425-1052; fax: +86-21-6425-1033
- Weihua Li
- Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, P. R. China; phone: +86-21-6425-1052; fax: +86-21-6425-1033
- Guixia Liu
- Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, P. R. China; phone: +86-21-6425-1052; fax: +86-21-6425-1033
- Xu Shen
- Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, P. R. China
- Yun Tang
- Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, P. R. China; phone: +86-21-6425-1052; fax: +86-21-6425-1033; Key Laboratory of Cigarette Smoke, Technical Center, Shanghai Tobacco Group Co. Ltd., Shanghai 200082, P. R. China
|
27
|
Singh KP, Gupta S, Basant N. QSTR modeling for predicting aquatic toxicity of pharmacological active compounds in multiple test species for regulatory purpose. Chemosphere 2015; 120:680-689. [PMID: 25462313 DOI: 10.1016/j.chemosphere.2014.10.025] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Received: 03/04/2014] [Revised: 09/25/2014] [Accepted: 10/12/2014] [Indexed: 06/04/2023]
Abstract
High concentrations of pharmacologically active compounds (PACs) detected in global drinking water resources, and their toxicological implications for aquatic life, have become a matter of concern, compelling the development of reliable QSTRs (qualitative/quantitative structure-toxicity relationships) for their risk assessment. Robust QSTRs, namely decision treeboost (DTB) and decision tree forest (DTF) models implementing stochastic gradient boosting and bagging algorithms, were established from experimental toxicity data of structurally diverse PACs in daphnia, using molecular descriptors, to predict the toxicity of new untested compounds in multiple test species. The developed models were rigorously validated using OECD-recommended internal and external validation procedures, and their predictive power was tested with external data from test species of different trophic levels (algae and fish). Classification QSTRs (DTB, DTF) rendered accuracies of 98.73% and 97.47%, respectively, in daphnia, 84.38% and 85.94% in algae, and 78.46% and 79.23% in fish. The regression QSTRs (DTB, DTF) yielded squared correlation coefficient values of 0.831 and 0.852 (daphnia), 0.534 and 0.556 (algae), and 0.620 and 0.637 (fish). The QSTRs developed in this study passed the OECD validation criteria, performed better than those reported earlier for predicting the toxicity of PACs, and can be used for screening new untested compounds for regulatory purposes.
Affiliation(s)
- Kunwar P Singh
- Academy of Scientific and Innovative Research, Anusandhan Bhawan, Rafi Marg, New Delhi 110 001, India; Environmental Chemistry Division, CSIR-Indian Institute of Toxicology Research, Post Box 80, Mahatma Gandhi Marg, Lucknow 226 001, India
- Shikha Gupta
- Academy of Scientific and Innovative Research, Anusandhan Bhawan, Rafi Marg, New Delhi 110 001, India; Environmental Chemistry Division, CSIR-Indian Institute of Toxicology Research, Post Box 80, Mahatma Gandhi Marg, Lucknow 226 001, India
- Nikita Basant
- KanbanSystems Pvt. Ltd., Laxmi Nagar, Delhi 110 092, India
|
28
|
Gupta S, Basant N, Singh KP. Qualitative and quantitative structure-activity relationship modelling for predicting blood-brain barrier permeability of structurally diverse chemicals. SAR QSAR Environ Res 2015; 26:95-124. [PMID: 25629764 DOI: 10.1080/1062936x.2014.994562] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 06/04/2023]
Abstract
In this study, structure-activity relationship (SAR) models have been established for qualitative and quantitative prediction of the blood-brain barrier (BBB) permeability of chemicals. The structural diversity of the chemicals and nonlinear structure in the data were tested. The predictive and generalization ability of the developed SAR models were tested through internal and external validation procedures. In complete data, the QSAR models rendered ternary classification accuracy of >98.15%, while the quantitative SAR models yielded correlation (r(2)) of >0.926 between the measured and the predicted BBB permeability values with the mean squared error (MSE) <0.045. The proposed models were also applied to an external new in vitro data and yielded classification accuracy of >82.7% and r(2) > 0.905 (MSE < 0.019). The sensitivity analysis revealed that topological polar surface area (TPSA) has the highest effect in qualitative and quantitative models for predicting the BBB permeability of chemicals. Moreover, these models showed predictive performance superior to those reported earlier in the literature. This demonstrates the appropriateness of the developed SAR models to reliably predict the BBB permeability of new chemicals, which can be used for initial screening of the molecules in the drug development process.
Affiliation(s)
- S Gupta
- Academy of Scientific and Innovative Research, Anusandhan Bhawan, New Delhi, India
|
29
|
Dearden JC, Rowe PH. Use of artificial neural networks in the QSAR prediction of physicochemical properties and toxicities for REACH legislation. Methods Mol Biol 2015; 1260:65-88. [PMID: 25502376 DOI: 10.1007/978-1-4939-2239-0_5] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Indexed: 05/25/2023]
Abstract
With the introduction of the REACH legislation in the European Union, there is a requirement for property and toxicity data on chemicals produced in or imported into the EU at levels of 1 tonne/year or more. This has meant an increase in the in silico prediction of such data. One of the chief predictive approaches is QSAR (quantitative structure-activity relationships), which is widely used in many fields. A QSAR approach that is increasingly being used is that of artificial neural networks (ANNs), and this chapter discusses its application to the range of physicochemical properties and toxicities required by REACH. ANNs generally outperform the main QSAR approach of multiple linear regression (MLR), although other approaches such as support vector machines sometimes outperform ANNs. Most ANN QSARs reported to date comply with only two of the five OECD Guidelines for the Validation of (Q)SARs.
Affiliation(s)
- John C Dearden
- School of Pharmacy & Biomolecular Sciences, Liverpool John Moores University, Byrom Street, Liverpool, L3 3AF, UK
|
30
|
Singh KP, Gupta S, Basant N. Predicting toxicities of ionic liquids in multiple test species – an aid in designing green chemicals. RSC Adv 2014. [DOI: 10.1039/c4ra11252k] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.6] [Indexed: 12/11/2022] Open
|
31
|
Singh KP, Gupta S, Basant N, Mohan D. QSTR Modeling for Qualitative and Quantitative Toxicity Predictions of Diverse Chemical Pesticides in Honey Bee for Regulatory Purposes. Chem Res Toxicol 2014; 27:1504-15. [DOI: 10.1021/tx500100m] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.6] [Indexed: 02/08/2023]
Affiliation(s)
- Kunwar P. Singh
- Academy of Scientific and Innovative Research, Anusandhan Bhawan, Rafi Marg, New Delhi-110 001, India
- Environmental Chemistry Division, CSIR-Indian Institute of Toxicology Research, Post Box 80, Mahatma Gandhi Marg, Lucknow-226 001, India
- Shikha Gupta
- Academy of Scientific and Innovative Research, Anusandhan Bhawan, Rafi Marg, New Delhi-110 001, India
- Environmental Chemistry Division, CSIR-Indian Institute of Toxicology Research, Post Box 80, Mahatma Gandhi Marg, Lucknow-226 001, India
- Nikita Basant
- Kanban Systems Pvt. Ltd., Laxmi Nagar, Delhi-110092, India
- Dinesh Mohan
- School of Environmental Sciences, Jawaharlal Nehru University, New Delhi-110067, India
|
32
|
Singh KP, Gupta S, Kumar A, Mohan D. Multispecies QSAR modeling for predicting the aquatic toxicity of diverse organic chemicals for regulatory toxicology. Chem Res Toxicol 2014; 27:741-53. [PMID: 24738471 DOI: 10.1021/tx400371w] [Citation(s) in RCA: 48] [Impact Index Per Article: 4.8] [Indexed: 02/06/2023]
Abstract
The research aims to develop multispecies quantitative structure-activity relationship (QSAR) modeling tools capable of predicting the acute toxicity of diverse chemicals in various Organization for Economic Co-operation and Development (OECD) recommended test species of different trophic levels for regulatory toxicology. Accordingly, ensemble learning (EL) based classification and regression QSAR models, namely decision treeboost (DTB) and decision tree forest (DTF), which implement stochastic gradient boosting and bagging algorithms, were developed using experimental toxicity data for chemicals in algae (P. subcapitata). The EL-QSAR models were successfully applied to predict the toxicities of wide groups of chemicals in other test species, including algae (S. obliquus), daphnia, fish, and bacteria. The structural diversity of the selected chemicals and the diversity of the end-point toxicity data of the five test species were tested using the Tanimoto similarity index and Kruskal-Wallis (K-W) statistics. Predictive and generalization abilities of the constructed QSAR models were compared using statistical parameters. The classification QSAR models (DTB and DTF) yielded high classification accuracy on the complete data of the model-building (algae) species (97.82% and 99.01%, respectively), with accuracies of 92.50%-94.26% (DTB) and 92.14%-94.12% (DTF) in the four test species. The regression QSAR models (DTB and DTF) gave a high correlation (R²) between the measured and model-predicted toxicity end-point values and a low mean-squared error (MSE) in the model-building (algae) species (R² = 0.918, MSE = 0.15 for DTB; R² = 0.905, MSE = 0.21 for DTF); in the four test species, R² and MSE ranged over 0.575-0.672 and 0.18-0.51 for DTB, and 0.605-0.689 and 0.20-0.45 for DTF. The developed QSAR models exhibited good predictive and generalization abilities in test species of varied trophic levels and can be used to predict the toxicities of new chemicals for screening and prioritization of chemicals for regulation.
Affiliation(s)
- Kunwar P Singh
- Academy of Scientific and Innovative Research, Anusandhan Bhawan, Rafi Marg, New Delhi-110 001, India
|
33
|
Singh KP, Gupta S. Nano-QSAR modeling for predicting biological activity of diverse nanomaterials. RSC Adv 2014. [DOI: 10.1039/c4ra01274g] [Citation(s) in RCA: 104] [Impact Index Per Article: 10.4] [Indexed: 01/22/2023] Open
Abstract
Case study-1 (diverse metal core NPs); case study-2 (similar metal core NPs); case study-3 (metal oxide NPs); case study-4 (surface modified multi-walled CNTs); case study-5 (fullerene derivatives).
Affiliation(s)
- Kunwar P. Singh
- Academy of Scientific and Innovative Research, New Delhi-110 001, India
- Environmental Chemistry Division, CSIR-Indian Institute of Toxicology Research, Lucknow-226 001, India
- Shikha Gupta
- Academy of Scientific and Innovative Research, New Delhi-110 001, India
- Environmental Chemistry Division, CSIR-Indian Institute of Toxicology Research, Lucknow-226 001, India
|