1
Richard-Bollans A, Aitken C, Antonelli A, Bitencourt C, Goyder D, Lucas E, Ondo I, Pérez-Escobar OA, Pironon S, Richardson JE, Russell D, Silvestro D, Wright CW, Howes MJR. Machine learning enhances prediction of plants as potential sources of antimalarials. Frontiers in Plant Science 2023; 14:1173328. [PMID: 37304721 PMCID: PMC10248027 DOI: 10.3389/fpls.2023.1173328] [Received: 02/24/2023] [Accepted: 04/20/2023] [Indexed: 06/13/2023]
Abstract
Plants are a rich source of bioactive compounds and a number of plant-derived antiplasmodial compounds have been developed into pharmaceutical drugs for the prevention and treatment of malaria, a major public health challenge. However, identifying plants with antiplasmodial potential can be time-consuming and costly. One approach for selecting plants to investigate is based on ethnobotanical knowledge which, though having provided some major successes, is restricted to a relatively small group of plant species. Machine learning, incorporating ethnobotanical and plant trait data, provides a promising approach to improve the identification of antiplasmodial plants and accelerate the search for new plant-derived antiplasmodial compounds. In this paper we present a novel dataset on antiplasmodial activity for three flowering plant families - Apocynaceae, Loganiaceae and Rubiaceae (together comprising c. 21,100 species) - and demonstrate the ability of machine learning algorithms to predict the antiplasmodial potential of plant species. We evaluate the predictive capability of a variety of algorithms - Support Vector Machines, Logistic Regression, Gradient Boosted Trees and Bayesian Neural Networks - and compare these to two ethnobotanical selection approaches - based on usage as an antimalarial and general usage as a medicine. We evaluate the approaches using the given data and when the given samples are reweighted to correct for sampling biases. In both evaluation settings, each of the machine learning models has a higher precision than the ethnobotanical approaches. In the bias-corrected scenario, the Support Vector classifier performs best - attaining a mean precision of 0.67 compared to the best performing ethnobotanical approach with a mean precision of 0.46. We also use the bias correction method and the Support Vector classifier to estimate the potential of plants to provide novel antiplasmodial compounds. We estimate that 7677 species in Apocynaceae, Loganiaceae and Rubiaceae warrant further investigation and that at least 1300 active antiplasmodial species are highly unlikely to be investigated by conventional approaches. While traditional and Indigenous knowledge remains vital to our understanding of people-plant relationships and an invaluable source of information, these results indicate a vast and relatively untapped source in the search for new plant-derived antiplasmodial compounds.
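The abstract compares precision on the given samples and on samples reweighted to correct for sampling bias. The following is only a minimal sketch of how a sample-weighted precision can be computed; the abstract does not describe the authors' reweighting scheme, so the function name `weighted_precision`, the toy labels and the weights are all hypothetical.

```python
def weighted_precision(y_true, y_pred, weights=None):
    """Precision = weighted true positives / weighted predicted positives.

    Labels are 1 (active) or 0 (inactive). With weights=None every sample
    counts equally, which gives the ordinary (unweighted) precision.
    """
    if weights is None:
        weights = [1.0] * len(y_true)
    if not (len(y_true) == len(y_pred) == len(weights)):
        raise ValueError("y_true, y_pred and weights must have equal length")
    tp = sum(w for t, p, w in zip(y_true, y_pred, weights) if p == 1 and t == 1)
    fp = sum(w for t, p, w in zip(y_true, y_pred, weights) if p == 1 and t == 0)
    if tp + fp == 0:
        return float("nan")  # no positive predictions, precision undefined
    return tp / (tp + fp)


# Toy data (hypothetical): five species, three predicted active.
y_true = [1, 1, 0, 0, 1]
y_pred = [1, 1, 1, 0, 0]
# Hypothetical bias-correction weights: over-sampled species are
# down-weighted (0.5), an under-sampled one is up-weighted (2.0).
weights = [0.5, 0.5, 2.0, 1.0, 1.0]

unweighted = weighted_precision(y_true, y_pred)
reweighted = weighted_precision(y_true, y_pred, weights)
print(f"unweighted precision: {unweighted:.3f}")
print(f"reweighted precision: {reweighted:.3f}")
```

In this toy case the reweighted precision is lower than the unweighted one because the single false positive carries a large weight, which illustrates why the two evaluation settings can rank selection approaches differently.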
Affiliation(s)
- Conal Aitken
- Royal Botanic Gardens, Kew, Richmond, United Kingdom
- EaStCHEM, School of Chemistry, University of St Andrews, St Andrews, United Kingdom
- Alexandre Antonelli
- Royal Botanic Gardens, Kew, Richmond, United Kingdom
- Gothenburg Global Biodiversity Centre, Department of Biological and Environmental Sciences, University of Gothenburg, Gothenburg, Sweden
- Department of Biology, University of Oxford, Oxford, United Kingdom
- David Goyder
- Royal Botanic Gardens, Kew, Richmond, United Kingdom
- Eve Lucas
- Royal Botanic Gardens, Kew, Richmond, United Kingdom
- Ian Ondo
- Royal Botanic Gardens, Kew, Richmond, United Kingdom
- Samuel Pironon
- Royal Botanic Gardens, Kew, Richmond, United Kingdom
- UN Environment Programme World Conservation Monitoring Centre (UNEP-WCMC), Cambridge, United Kingdom
- James E. Richardson
- School of Biological, Earth and Environmental Sciences, University College Cork, Cork, Ireland
- Tropical Diversity Section, Royal Botanic Garden, Edinburgh, United Kingdom
- Departamento de Biología, Facultad de Ciencias Naturales, Universidad del Rosario, Bogotá, Colombia
- Environmental Research Institute, University College Cork, Cork, Ireland
- David Russell
- Royal Botanic Gardens, Kew, Richmond, United Kingdom
- Daniele Silvestro
- Gothenburg Global Biodiversity Centre, Department of Biological and Environmental Sciences, University of Gothenburg, Gothenburg, Sweden
- Department of Biology, University of Fribourg, Fribourg, Switzerland
- Swiss Institute of Bioinformatics, Fribourg, Switzerland
- Colin W. Wright
- School of Pharmacy and Medical Sciences, University of Bradford, Bradford, United Kingdom
- Melanie-Jayne R. Howes
- Royal Botanic Gardens, Kew, Richmond, United Kingdom
- Institute of Pharmaceutical Science, King’s College London, Franklin-Wilkins Building, London, United Kingdom
2
Mughal H, Bell EC, Mughal K, Derbyshire ER, Freundlich JS. Random Forest Model Predictions Afford Dual-Stage Antimalarial Agents. ACS Infect Dis 2022; 8:1553-1562. [PMID: 35894649 PMCID: PMC9987178 DOI: 10.1021/acsinfecdis.2c00189] [Indexed: 11/28/2022]
Abstract
The need for novel antimalarials is apparent given the continuing disease burden worldwide, despite significant drug discovery advances from the bench to the bedside. In particular, small-molecule agents with potent efficacy against both the liver and blood stages of Plasmodium parasite infection are critical for clinical settings as they would simultaneously prevent and treat malaria with a reduced selection pressure for resistance. While experimental screens for such dual-stage inhibitors have been conducted, the time and cost of these efforts limit their scope. Here, we have focused on leveraging machine learning approaches to discover novel antimalarials with such properties. A random forest modeling approach was taken to predict small molecules with in vitro efficacy versus liver-stage Plasmodium berghei parasites and a lack of human liver cell cytotoxicity. Empirical validation of the model was achieved with the realization of hits with liver-stage efficacy after prospective scoring of a commercial diversity library and consideration of structural diversity. A subset of these hits also demonstrated promising blood-stage Plasmodium falciparum efficacy. These 18 validated dual-stage antimalarials represent novel starting points for drug discovery and mechanism of action studies with significant potential for seeding a new generation of therapies.
Affiliation(s)
- Haseeb Mughal
- Department of Pharmacology, Physiology, and Neuroscience, Rutgers University – New Jersey Medical School, 185 South Orange Ave, Newark, NJ, 07103
- Elise C. Bell
- Department of Chemistry, Duke University, 124 Science Drive, Durham, NC 27708, USA
- Khadija Mughal
- Department of Pharmacology, Physiology, and Neuroscience, Rutgers University – New Jersey Medical School, 185 South Orange Ave, Newark, NJ, 07103
- Emily R. Derbyshire
- Department of Chemistry, Duke University, 124 Science Drive, Durham, NC 27708, USA
- Department of Molecular Genetics and Microbiology, Duke University Medical Center, 213 Research Drive, Durham, NC 27710, USA
- Joel S. Freundlich
- Department of Pharmacology, Physiology, and Neuroscience, Rutgers University – New Jersey Medical School, 185 South Orange Ave, Newark, NJ, 07103
- Department of Medicine, Center for Emerging and Re-emerging Pathogens, Rutgers University – New Jersey Medical School, Newark, NJ, 07103
3
Oguike OE, Ugwuishiwu CH, Asogwa CN, Nnadi CO, Obonga WO, Attama AA. Systematic review on the application of machine learning to quantitative structure-activity relationship modeling against Plasmodium falciparum. Mol Divers 2022; 26:3447-3462. [PMID: 35064444 PMCID: PMC8782692 DOI: 10.1007/s11030-022-10380-1] [Received: 10/05/2021] [Accepted: 01/07/2022] [Indexed: 11/29/2022]
Abstract
Malaria accounts for over two million deaths globally. To flatten this curve, there is a need to develop new and highly potent drugs against Plasmodium falciparum. Some major challenges include the dearth of suitable animal models for anti-P. falciparum assays, resistance to first-line drugs, lack of vaccines and the complex life cycle of Plasmodium. Fortunately, newer approaches to antimalarial drug discovery have emerged due to the release of large datasets by pharmaceutical companies. This review provides insights into these new approaches to drug discovery, covering different machine learning tools that enhance the development of new compounds. It provides a systematic review on the use and prospects of machine learning in predicting, classifying and clustering IC50 values of bioactive compounds against P. falciparum. The authors identified many machine learning tools yet to be applied for this purpose. However, Random Forest and Support Vector Machines have been applied extensively, though on a limited dataset of compounds.
Affiliation(s)
- Osondu Everestus Oguike
- Machine Learning Research Group, University of Nigeria, Nsukka, 410001, Enugu State, Nigeria
- Department of Computer Science, Faculty of Physical Sciences, University of Nigeria, Nsukka, 410001, Enugu State, Nigeria
- Chikodili Helen Ugwuishiwu
- Machine Learning Research Group, University of Nigeria, Nsukka, 410001, Enugu State, Nigeria
- Department of Computer Science, Faculty of Physical Sciences, University of Nigeria, Nsukka, 410001, Enugu State, Nigeria
- Caroline Ngozi Asogwa
- Machine Learning Research Group, University of Nigeria, Nsukka, 410001, Enugu State, Nigeria
- Department of Computer Science, Faculty of Physical Sciences, University of Nigeria, Nsukka, 410001, Enugu State, Nigeria
- Charles Okeke Nnadi
- Machine Learning Research Group, University of Nigeria, Nsukka, 410001, Enugu State, Nigeria
- Department of Pharmaceutical and Medicinal Chemistry, Faculty of Pharmaceutical Sciences, University of Nigeria, Nsukka, 410001, Enugu State, Nigeria
- Wilfred Ofem Obonga
- Machine Learning Research Group, University of Nigeria, Nsukka, 410001, Enugu State, Nigeria
- Department of Pharmaceutical and Medicinal Chemistry, Faculty of Pharmaceutical Sciences, University of Nigeria, Nsukka, 410001, Enugu State, Nigeria
- Anthony Amaechi Attama
- Machine Learning Research Group, University of Nigeria, Nsukka, 410001, Enugu State, Nigeria
- Department of Pharmaceutics, Faculty of Pharmaceutical Sciences, University of Nigeria, Nsukka, 410001, Enugu State, Nigeria
4
Antimalarial Drug Predictions Using Molecular Descriptors and Machine Learning against Plasmodium Falciparum. Biomolecules 2021; 11:biom11121750. [PMID: 34944394 PMCID: PMC8698534 DOI: 10.3390/biom11121750] [Received: 10/20/2021] [Revised: 11/12/2021] [Accepted: 11/17/2021] [Indexed: 11/16/2022]
Abstract
Malaria caused by the Plasmodium falciparum parasite remains one of the most threatening and dangerous illnesses. Chloroquine (CQ) and first-line artemisinin-based combination treatment (ACT) have long been the drugs of choice for treating and controlling malaria; however, CQ-resistant and artemisinin-resistant parasites have now emerged in most areas where malaria is endemic. In this work, we developed five machine learning models to predict the antimalarial bioactivity of a drug against Plasmodium falciparum from features (i.e., molecular descriptor values) computed by the PaDEL software from the SMILES of compounds, and compared the models experimentally on our collected data of 4794 instances. We found that three of the five models, namely artificial neural network (ANN), extreme gradient boost (XGB), and random forest (RF), outperform the others in terms of accuracy, while observing that, using roughly a quarter of the promising descriptors picked by the feature selection algorithm, the five models achieved comparable performance. The contribution of all molecular descriptors to the models was also investigated by comparing their rank values from the feature selection algorithm; the most relevant descriptors, which come from the ‘Autocorrelation’ module, contributed the most, while the ‘Atom type electrotopological state’ descriptors contributed the least.
5
Nandi S, Kumar P, Amin SA, Jha T, Gayen S. First molecular modelling report on tri-substituted pyrazolines as phosphodiesterase 5 (PDE5) inhibitors through classical and machine learning based multi-QSAR analysis. SAR and QSAR in Environmental Research 2021; 32:917-939. [PMID: 34727793 DOI: 10.1080/1062936x.2021.1989721] [Received: 08/08/2021] [Accepted: 10/03/2021] [Indexed: 06/13/2023]
Abstract
Phosphodiesterase 5 (PDE5) falls under a broad category of metallohydrolase enzymes responsible for the catalysis of the phosphodiesterase bond, and thus it can terminate the action of cyclic guanosine monophosphate (cGMP). Overexpression of this enzyme leads to development of a number of pathological conditions. Thus, targeting the enzyme to develop inhibitors could be useful for the treatment of erectile dysfunction as well as pulmonary hypertension. In the current study, several molecular modelling techniques were utilized including Bayesian classification, single tree and forest tree recursive partitioning, and genetic function approximation to identify crucial structural fingerprints important for optimization of tri-substituted pyrazoline derivatives as PDE5 inhibitors. Later, various machine learning models were also developed that could be utilized to predict and screen PDE5 inhibitors in the future.
Affiliation(s)
- S Nandi
- Department of Pharmaceutical Sciences, Dr. Harisingh Gour University, Sagar, India
- P Kumar
- Department of Computer Science, Institute of Science, Banaras Hindu University, Varanasi, India
- S A Amin
- Natural Science Laboratory, Division of Medicinal and Pharmaceutical Chemistry, Department of Pharmaceutical Technology, Jadavpur University, Kolkata, India
- T Jha
- Natural Science Laboratory, Division of Medicinal and Pharmaceutical Chemistry, Department of Pharmaceutical Technology, Jadavpur University, Kolkata, India
- S Gayen
- Department of Pharmaceutical Sciences, Dr. Harisingh Gour University, Sagar, India
- Laboratory of Drug Design and Discovery, Department of Pharmaceutical Technology, Jadavpur University, Kolkata, India
6
Nguyen-Vo TH, Trinh QH, Nguyen L, Do TTT, Chua MCH, Nguyen BP. Predicting Antimalarial Activity in Natural Products Using Pretrained Bidirectional Encoder Representations from Transformers. J Chem Inf Model 2021; 62:5050-5058. [DOI: 10.1021/acs.jcim.1c00584] [Indexed: 11/29/2022]
Affiliation(s)
- Thanh-Hoang Nguyen-Vo
- School of Mathematics and Statistics, Victoria University of Wellington, Kelburn Parade, Wellington 6140, New Zealand
- Quang H. Trinh
- Computational Biology Center, International University−VNU HCMC, Ho Chi Minh City 700000, Vietnam
- Loc Nguyen
- Computational Biology Center, International University−VNU HCMC, Ho Chi Minh City 700000, Vietnam
- Trang T. T. Do
- School of Business and Information Technology, Wellington Institute of Technology, 21 Kensington Avenue, Lower Hutt 5012, New Zealand
- Matthew Chin Heng Chua
- Institute of Systems Science, National University of Singapore, 29 Heng Mui Keng Terrace, Singapore 119620, Singapore
- Binh P. Nguyen
- School of Mathematics and Statistics, Victoria University of Wellington, Kelburn Parade, Wellington 6140, New Zealand
7
Predicting Potential Endocrine Disrupting Chemicals Binding to Estrogen Receptor α (ERα) Using a Pipeline Combining Structure-Based and Ligand-Based in Silico Methods. Int J Mol Sci 2021; 22:ijms22062846. [PMID: 33799614 PMCID: PMC7999354 DOI: 10.3390/ijms22062846] [Received: 01/27/2021] [Revised: 03/08/2021] [Accepted: 03/08/2021] [Indexed: 02/07/2023]
Abstract
The estrogen receptors α (ERα) are transcription factors that belong to the nuclear receptor (NR) protein family and are involved in several physiological processes. Besides the endogenous ligands, several other chemicals are able to bind to those receptors. Among them are endocrine disrupting chemicals (EDCs) that can trigger toxicological pathways. Many studies have focused on predicting EDCs based on their ability to bind NRs, mainly estrogen receptors (ER), thyroid hormone receptors (TR), androgen receptors (AR), glucocorticoid receptors (GR), and peroxisome proliferator-activated receptors gamma (PPARγ). In this work, we suggest a pipeline designed for the prediction of ERα binding activity. The flagged compounds can be further explored using experimental techniques to assess their potential to be EDCs. The pipeline is a combination of structure-based (docking and pharmacophore models) and ligand-based (pharmacophore models) methods. The models have been constructed using the Environmental Protection Agency (EPA) data encompassing a large number of structurally diverse compounds. Validation was then performed using two external databases: the NR-DBIND (Nuclear Receptors DataBase Including Negative Data) and the EADB (Estrogenic Activity DataBase). Different combination protocols were explored. Results showed that the combination of models performed better than each model taken individually. The consensus protocol that reached values of 0.81 and 0.54 for sensitivity and specificity, respectively, was the best suited for our toxicological study. Insights and recommendations were drawn to improve the screening quality of other projects focusing on ERα binding predictions.
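The abstract reports sensitivity and specificity for a consensus of several models. As a rough illustration of how those two metrics respond when predictions are combined, the sketch below uses an "any model flags the compound" rule; this rule, the function names and the toy predictions are assumptions for illustration only, since the abstract does not specify the combination protocols that were explored.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels (1 = binder)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


def consensus_any(predictions):
    """Hypothetical consensus rule: flag a compound if any model flags it."""
    return [int(any(votes)) for votes in zip(*predictions)]


# Toy data (hypothetical): six compounds, the first three are true binders.
y_true = [1, 1, 1, 0, 0, 0]
docking_pred = [1, 0, 0, 1, 0, 0]        # stand-in for a structure-based model
pharmacophore_pred = [0, 1, 0, 0, 0, 1]  # stand-in for a ligand-based model

combined = consensus_any([docking_pred, pharmacophore_pred])
single_sens, single_spec = sensitivity_specificity(y_true, docking_pred)
cons_sens, cons_spec = sensitivity_specificity(y_true, combined)
print(f"docking only: sensitivity={single_sens:.2f}, specificity={single_spec:.2f}")
print(f"consensus:    sensitivity={cons_sens:.2f}, specificity={cons_spec:.2f}")
```

With this permissive rule the consensus catches more true binders at the cost of more false alarms, the same direction of trade-off as a protocol that favours sensitivity over specificity for toxicological screening.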
8
Liu Q, Deng J, Liu M. Classification models for predicting the antimalarial activity against Plasmodium falciparum. SAR and QSAR in Environmental Research 2020; 31:313-324. [PMID: 32191533 DOI: 10.1080/1062936x.2020.1740890] [Received: 01/23/2020] [Accepted: 03/07/2020] [Indexed: 06/10/2023]
Abstract
Support vector machine (SVM) and general regression neural network (GRNN) were used to develop classification models for predicting the antimalarial activity against Plasmodium falciparum. Only 15 molecular descriptors were used to build the classification models for the antimalarial activities of 4750 compounds, which were divided into a training set (3887 compounds) and a test set (863 compounds). For the SVM model, the prediction accuracies are 89.5% for the training set and 87.3% for the test set. For the GRNN model, the prediction accuracies for the two sets are 99.7% and 88.9%, respectively. Both the SVM and GRNN models have better prediction ability than the classification model based on binary logistic regression (BLR) analysis. Compared with previously published classification models, both the SVM and GRNN models are satisfactory in predicting the antimalarial activities of compounds while using fewer descriptors.
Affiliation(s)
- Q Liu
- Hunan Provincial Key Laboratory of Environmental Catalysis & Waste Regeneration, College of Materials and Chemical Engineering, Hunan Institute of Engineering, Xiangtan, China
- J Deng
- Hunan Provincial Key Laboratory of Environmental Catalysis & Waste Regeneration, College of Materials and Chemical Engineering, Hunan Institute of Engineering, Xiangtan, China
- M Liu
- School of Chemistry and Materials Engineering, Huizhou University, Huizhou, PR China
9
Devillers J, Devillers H. Toxicity profiling and prioritization of plant-derived antimalarial agents. SAR and QSAR in Environmental Research 2019; 30:801-824. [PMID: 31565973 DOI: 10.1080/1062936x.2019.1665844] [Received: 06/27/2019] [Accepted: 09/06/2019] [Indexed: 06/10/2023]
Abstract
Human malaria is the most widespread mosquito-borne life-threatening disease worldwide. In the absence of effective vaccines, prevention and treatment of malaria only depend on prophylaxis and drug-based therapy either in monotherapy or in combination. Unfortunately, the number of available antimalarial drugs presenting different mechanisms of action is rather limited. In addition, the appearance of drug-resistance in the parasite strains impacts the efficacy of the treatments. As a result, there is a crucial need to find new drugs to circumvent resistance problems. In the quest to identify new antimalarial agents a huge number of plant-derived compounds (PDCs) have been investigated. Surprisingly in the in silico PDC screening programs, toxicity filters are either never used or so simple that their interest is limited. In this context, the goal of this study was to show how to take advantage of validated toxicity QSAR models for refining the selection of PDCs. From an original data set of 507 PDCs collected from the literature, the use of toxicity filters for endocrine disruption, developmental toxicity, and hepatotoxicity in conjunction with classical pharmacokinetic filters allowed us to obtain a list of 31 compounds of potential interest. The pros and cons of such a strategy have been discussed.
Affiliation(s)
- H Devillers
- Micalis Institute, INRA, AgroParisTech, Université Paris-Saclay, Jouy-en-Josas, France