1
Vittorio S, Lunghini F, Morerio P, Gadioli D, Orlandini S, Silva P, Martinovic J, Pedretti A, Bonanni D, Del Bue A, Palermo G, Vistoli G, Beccari AR. Addressing docking pose selection with structure-based deep learning: Recent advances, challenges and opportunities. Comput Struct Biotechnol J 2024; 23:2141-2151. [PMID: 38827235 PMCID: PMC11141151 DOI: 10.1016/j.csbj.2024.05.024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/23/2024] [Revised: 05/15/2024] [Accepted: 05/15/2024] [Indexed: 06/04/2024] Open
Abstract
Molecular docking is a widely used technique in drug discovery to predict the binding mode of a given ligand to its target. However, identifying the near-native binding pose in docking experiments remains challenging: the scoring functions currently employed by docking programs are parametrized to predict binding affinity and therefore often fail to identify the native binding conformation of the ligand. Selecting the correct binding mode is crucial to obtaining meaningful results and to conveniently optimizing new hit compounds. Deep learning (DL) algorithms have been an area of growing interest in this sense for their capability to extract the relevant information directly from the protein-ligand structure. Our review aims to present the recent advances regarding the development of DL-based pose selection approaches, discussing limitations and possible future directions. Moreover, a comparison between the performances of some classical scoring functions and DL-based methods concerning their ability to select the correct binding mode is reported. In this regard, two novel DL-based pose selectors developed by us are presented.
Affiliation(s)
- Serena Vittorio
  - Dipartimento di Scienze Farmaceutiche, Università degli Studi di Milano, Via Luigi Mangiagalli 25, I-20133 Milano, Italy
- Filippo Lunghini
  - EXSCALATE, Dompé Farmaceutici SpA, Via Tommaso de Amicis 95, 80123 Naples, Italy
- Pietro Morerio
  - Pattern Analysis and Computer Vision, Fondazione Istituto Italiano di Tecnologia, Via Morego, 30, 16163 Genova, Italy
- Davide Gadioli
  - Dipartimento di Elettronica Informazione e Bioingegneria, Politecnico di Milano, Via Ponzio 34/5, I-20133 Milano, Italy
- Sergio Orlandini
  - SCAI, SuperComputing Applications and Innovation Department, CINECA, Via dei Tizii 6, Rome 00185, Italy
- Paulo Silva
  - IT4Innovations, VSB – Technical University of Ostrava, 17. listopadu 2172/15, 70800 Ostrava-Poruba, Czech Republic
- Jan Martinovic
  - IT4Innovations, VSB – Technical University of Ostrava, 17. listopadu 2172/15, 70800 Ostrava-Poruba, Czech Republic
- Alessandro Pedretti
  - Dipartimento di Scienze Farmaceutiche, Università degli Studi di Milano, Via Luigi Mangiagalli 25, I-20133 Milano, Italy
- Domenico Bonanni
  - Department of Physical and Chemical Sciences, University of L'Aquila, via Vetoio, L'Aquila 67010, Italy
- Alessio Del Bue
  - Pattern Analysis and Computer Vision, Fondazione Istituto Italiano di Tecnologia, Via Morego, 30, 16163 Genova, Italy
- Gianluca Palermo
  - Dipartimento di Elettronica Informazione e Bioingegneria, Politecnico di Milano, Via Ponzio 34/5, I-20133 Milano, Italy
- Giulio Vistoli
  - Dipartimento di Scienze Farmaceutiche, Università degli Studi di Milano, Via Luigi Mangiagalli 25, I-20133 Milano, Italy
- Andrea R. Beccari
  - EXSCALATE, Dompé Farmaceutici SpA, Via Tommaso de Amicis 95, 80123 Naples, Italy
2
Abbas MKG, Rassam A, Karamshahi F, Abunora R, Abouseada M. The Role of AI in Drug Discovery. Chembiochem 2024; 25:e202300816. [PMID: 38735845 DOI: 10.1002/cbic.202300816] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/03/2023] [Revised: 05/09/2024] [Accepted: 05/10/2024] [Indexed: 05/14/2024]
Abstract
The emergence of Artificial Intelligence (AI) in drug discovery marks a pivotal shift in pharmaceutical research, blending sophisticated computational techniques with conventional scientific exploration to break through enduring obstacles. This review paper elucidates the multifaceted applications of AI across various stages of drug development, highlighting significant advancements and methodologies. It delves into AI's instrumental role in drug design, polypharmacology, chemical synthesis, drug repurposing, and the prediction of drug properties such as toxicity, bioactivity, and physicochemical characteristics. Despite AI's promising advancements, the paper also addresses the challenges and limitations encountered in the field, including data quality, generalizability, computational demands, and ethical considerations. By offering a comprehensive overview of AI's role in drug discovery, this paper underscores the technology's potential to significantly enhance drug development, while also acknowledging the hurdles that must be overcome to fully realize its benefits.
Affiliation(s)
- M K G Abbas
  - Center for Advanced Materials, Qatar University, P.O. Box, 2713, Doha, Qatar
- Abrar Rassam
  - Secondary Education, Educational Sciences, Qatar University, P.O. Box, 2713, Doha, Qatar
- Fatima Karamshahi
  - Department of Chemistry and Earth Sciences, Qatar University, P.O. Box, 2713, Doha, Qatar
- Rehab Abunora
  - Faculty of Medicine, General Medicine and Surgery, Helwan University, Cairo, Egypt
- Maha Abouseada
  - Department of Chemistry and Earth Sciences, Qatar University, P.O. Box, 2713, Doha, Qatar
3
Snyder SH, Vignaux PA, Ozalp MK, Gerlach J, Puhl AC, Lane TR, Corbett J, Urbina F, Ekins S. The Goldilocks paradigm: comparing classical machine learning, large language models, and few-shot learning for drug discovery applications. Commun Chem 2024; 7:134. [PMID: 38866916 PMCID: PMC11169557 DOI: 10.1038/s42004-024-01220-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2023] [Accepted: 06/04/2024] [Indexed: 06/14/2024] Open
Abstract
Recent advances in machine learning (ML) have led to newer model architectures including transformers (large language models, LLMs) showing state-of-the-art results in text generation and image analysis, as well as few-shot learning (FSLC) models, which offer predictive power with extremely small datasets. These new architectures may offer promise, yet the 'no free lunch' theorem suggests that no single model algorithm can outperform at all possible tasks. Here, we explore the capabilities of classical (SVR), FSLC, and transformer models (MolBART) over a range of dataset tasks and show a 'goldilocks zone' for each model type, in which dataset size and feature distribution (i.e. dataset "diversity") determine the optimal algorithm strategy. When datasets are small (<50 molecules), FSLC tend to outperform both classical ML and transformers. When datasets are small-to-medium sized (50-240 molecules) and diverse, transformers outperform both classical models and few-shot learning. Finally, when datasets are larger and of sufficient size, classical models perform the best, suggesting that the optimal model to choose likely depends on the dataset available, its size and diversity. These findings may help to answer the perennial question of which ML algorithm is to be used when faced with a new dataset.
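The size thresholds reported in this abstract can be read as a simple decision rule. The sketch below is only an illustration of that reading, not code from the study: the function name `suggest_model_family`, the boolean `is_diverse` flag, the cut-off of "more than 240 molecules" for the classical regime, and the handling of small-to-medium datasets that are not diverse (which the abstract does not address) are all assumptions.

```python
def suggest_model_family(n_molecules: int, is_diverse: bool) -> str:
    """Map dataset size and diversity to the model family the abstract favours.

    Hypothetical helper: the thresholds come from the abstract, the
    function itself does not appear in the paper.
    """
    if n_molecules < 0:
        raise ValueError("n_molecules must be non-negative")
    if n_molecules < 50:
        # Small datasets: few-shot learning tends to win.
        return "few-shot learning"
    if n_molecules <= 240:
        # The abstract only covers the diverse case in this range.
        return "transformer" if is_diverse else "not specified in the abstract"
    # Assumption: anything above 240 molecules counts as "larger and of
    # sufficient size", where classical models perform best.
    return "classical ML"


for size, diverse in [(30, True), (120, True), (120, False), (5000, False)]:
    print(size, diverse, suggest_model_family(size, diverse))
```

A rule like this could at most serve as a first guess; the abstract itself frames the choice as depending on both size and diversity of the dataset at hand.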
Affiliation(s)
- Scott H Snyder
  - Collaborations Pharmaceuticals, Inc., 840 Main Campus Drive, Lab 3510, Raleigh, NC, 27606, USA
- Patricia A Vignaux
  - Collaborations Pharmaceuticals, Inc., 840 Main Campus Drive, Lab 3510, Raleigh, NC, 27606, USA
- Mustafa Kemal Ozalp
  - Collaborations Pharmaceuticals, Inc., 840 Main Campus Drive, Lab 3510, Raleigh, NC, 27606, USA
- Jacob Gerlach
  - Collaborations Pharmaceuticals, Inc., 840 Main Campus Drive, Lab 3510, Raleigh, NC, 27606, USA
- Ana C Puhl
  - Collaborations Pharmaceuticals, Inc., 840 Main Campus Drive, Lab 3510, Raleigh, NC, 27606, USA
- Thomas R Lane
  - Collaborations Pharmaceuticals, Inc., 840 Main Campus Drive, Lab 3510, Raleigh, NC, 27606, USA
- John Corbett
  - Collaborations Pharmaceuticals, Inc., 840 Main Campus Drive, Lab 3510, Raleigh, NC, 27606, USA
- Fabio Urbina
  - Collaborations Pharmaceuticals, Inc., 840 Main Campus Drive, Lab 3510, Raleigh, NC, 27606, USA
- Sean Ekins
  - Collaborations Pharmaceuticals, Inc., 840 Main Campus Drive, Lab 3510, Raleigh, NC, 27606, USA
4
Sutanto H. Transforming clinical cardiology through neural networks and deep learning: A guide for clinicians. Curr Probl Cardiol 2024; 49:102454. [PMID: 38342351 DOI: 10.1016/j.cpcardiol.2024.102454] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/03/2024] [Accepted: 02/08/2024] [Indexed: 02/13/2024]
Abstract
The rapid evolution of neural networks and deep learning has revolutionized various fields, with clinical cardiology being no exception. As traditional methods in cardiology encounter limitations, the integration of advanced computational techniques offers unprecedented opportunities in diagnostics and patient care. This review explores the transformative role of neural networks and deep learning in clinical cardiology, particularly focusing on their applications in electrocardiogram (ECG) analysis, imaging technologies, and cardiac prediction models. Among others, Deep Neural Networks (DNNs) have significantly surpassed traditional approaches in accuracy and efficiency in automatic ECG diagnosis. Convolutional Neural Networks (CNNs) are successfully applied in PET/CT and PET/MR imaging, enhancing diagnostic capabilities. Furthermore, deep learning algorithms have shown potential in improving cardiac prediction models, although challenges in interpretability and clinical integration remain. The review also addresses the 'black box' nature of neural networks and the ethical considerations surrounding their use in clinical settings. Overall, this review underscores the significant impact of neural networks and deep learning in cardiology, providing insights into current applications and future directions in the field.
Affiliation(s)
- Henry Sutanto
  - Department of Internal Medicine, Faculty of Medicine, Universitas Airlangga, Surabaya, Indonesia
5
Satalkar V, Degaga GD, Li W, Pang YT, McShan AC, Gumbart JC, Mitchell JC, Torres MP. Generative β-hairpin design using a residue-based physicochemical property landscape. Biophys J 2024:S0006-3495(24)00070-5. [PMID: 38297834 DOI: 10.1016/j.bpj.2024.01.029] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/23/2023] [Revised: 12/20/2023] [Accepted: 01/25/2024] [Indexed: 02/02/2024] Open
Abstract
De novo peptide design is a new frontier that has broad application potential in the biological and biomedical fields. Most existing models for de novo peptide design are largely based on sequence homology that can be restricted based on evolutionarily derived protein sequences and lack the physicochemical context essential in protein folding. Generative machine learning for de novo peptide design is a promising way to synthesize theoretical data that are based on, but unique from, the observable universe. In this study, we created and tested a custom peptide generative adversarial network intended to design peptide sequences that can fold into the β-hairpin secondary structure. This deep neural network model is designed to establish a preliminary foundation of the generative approach based on physicochemical and conformational properties of 20 canonical amino acids, for example, hydrophobicity and residue volume, using extant structure-specific sequence data from the PDB. The beta generative adversarial network model robustly distinguishes secondary structures of β hairpin from α helix and intrinsically disordered peptides with an accuracy of up to 96% and generates artificial β-hairpin peptide sequences with minimum sequence identities around 31% and 50% when compared against the current NCBI PDB and nonredundant databases, respectively. These results highlight the potential of generative models specifically anchored by physicochemical and conformational property features of amino acids to expand the sequence-to-structure landscape of proteins beyond evolutionary limits.
Affiliation(s)
- Vardhan Satalkar
  - School of Biological Sciences, Georgia Institute of Technology, Atlanta, Georgia
- Gemechis D Degaga
  - Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee
- Wei Li
  - School of Biological Sciences, Georgia Institute of Technology, Atlanta, Georgia
- Yui Tik Pang
  - School of Physics, Georgia Institute of Technology, Atlanta, Georgia
- Andrew C McShan
  - School of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, Georgia
- James C Gumbart
  - School of Physics, Georgia Institute of Technology, Atlanta, Georgia; School of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, Georgia
- Julie C Mitchell
  - Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee
- Matthew P Torres
  - School of Biological Sciences, Georgia Institute of Technology, Atlanta, Georgia; School of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, Georgia
6
Chen M, Yang J, Tang C, Lu X, Wei Z, Liu Y, Yu P, Li H. Improving ADMET Prediction Accuracy for Candidate Drugs: Factors to Consider in QSPR Modeling Approaches. Curr Top Med Chem 2024; 24:222-242. [PMID: 38083894 DOI: 10.2174/0115680266280005231207105900] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/19/2023] [Revised: 11/02/2023] [Accepted: 11/10/2023] [Indexed: 05/04/2024]
Abstract
Quantitative Structure-Property Relationship (QSPR) employs mathematical and statistical methods to reveal quantitative correlations between the pharmacokinetics of compounds and their molecular structures, as well as their physical and chemical properties. QSPR models have been widely applied in the prediction of drug absorption, distribution, metabolism, excretion, and toxicity (ADMET). However, the accuracy of QSPR models for predicting drug ADMET properties still needs improvement. Therefore, this paper comprehensively reviews the tools employed in various stages of QSPR predictions for drug ADMET. It summarizes commonly used approaches to building QSPR models, systematically analyzing the advantages and limitations of each modeling method to ensure their judicious application. We provide an overview of recent advancements in the application of QSPR models for predicting drug ADMET properties. Furthermore, this review explores the inherent challenges in QSPR modeling while also proposing a range of considerations aimed at enhancing model prediction accuracy. The objective is to enhance the predictive capabilities of QSPR models in the field of drug development and provide valuable reference and guidance for researchers in this domain.
Affiliation(s)
- Meilun Chen
  - Xiangya School of Pharmaceutical Sciences, Central South University, No. 172, Tongzipo Road, Changsha, Hunan, 410013, China
- Jie Yang
  - Xiangya School of Pharmaceutical Sciences, Central South University, No. 172, Tongzipo Road, Changsha, Hunan, 410013, China
- Chunhua Tang
  - Xiangya School of Pharmaceutical Sciences, Central South University, No. 172, Tongzipo Road, Changsha, Hunan, 410013, China
- Xiaoling Lu
  - Xiangya School of Pharmaceutical Sciences, Central South University, No. 172, Tongzipo Road, Changsha, Hunan, 410013, China
- Zheng Wei
  - Xiangya School of Pharmaceutical Sciences, Central South University, No. 172, Tongzipo Road, Changsha, Hunan, 410013, China
- Yijie Liu
  - Xiangya School of Pharmaceutical Sciences, Central South University, No. 172, Tongzipo Road, Changsha, Hunan, 410013, China
- Peng Yu
  - Xiangya School of Pharmaceutical Sciences, Central South University, No. 172, Tongzipo Road, Changsha, Hunan, 410013, China
- HuanHuan Li
  - Xiangya School of Pharmaceutical Sciences, Central South University, No. 172, Tongzipo Road, Changsha, Hunan, 410013, China
7
Yuan J, Zhang Y, Wang X. Application of machine learning in the management of lymphoma: Current practice and future prospects. Digit Health 2024; 10:20552076241247963. [PMID: 38628632 PMCID: PMC11020711 DOI: 10.1177/20552076241247963] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/04/2023] [Accepted: 03/28/2024] [Indexed: 04/19/2024] Open
Abstract
In the past decade, digitization of medical records and multiomics data analysis in lymphoma has led to the accessibility of high-dimensional records. The digitization of medical records, the visualization of extensive volume data extracted from medical images, and the integration of multiomics methods into clinical decision-making have produced many datasets. As a promising auxiliary tool, machine learning (ML) aims to extract homologous features from large-scale datasets and encode them into various patterns to complete complicated tasks. At present, artificial intelligence and digital mining have shown promising prospects in the field of lymphoma pathological image analysis. The paradigm shift from qualitative analysis to quantitative analysis makes the pathological diagnosis more intelligent and the results more accurate and objective. ML can promote accurate lymphoma diagnosis and provide patients with prognostic information and more individualized treatment options. Based on the above, this comprehensive review of the general workflow of ML highlights recent advances in ML techniques in the diagnosis, treatment, and prognosis of lymphoma, and clarifies the limitations and future directions of ML techniques in the clinical practice of lymphoma.
Affiliation(s)
- Junyun Yuan
  - Department of Hematology, Shandong Provincial Hospital Affiliated to Shandong First Medical University, Jinan, Shandong, China
- Ya Zhang
  - Department of Hematology, Shandong Provincial Hospital Affiliated to Shandong First Medical University, Jinan, Shandong, China
  - Department of Hematology, Shandong Provincial Hospital, Shandong University, Jinan, Shandong, China
  - Taishan Scholars Program of Shandong Province, Jinan, Shandong, China
- Xin Wang
  - Department of Hematology, Shandong Provincial Hospital Affiliated to Shandong First Medical University, Jinan, Shandong, China
  - Department of Hematology, Shandong Provincial Hospital, Shandong University, Jinan, Shandong, China
  - Taishan Scholars Program of Shandong Province, Jinan, Shandong, China
  - Branch of National Clinical Research Center for Hematologic Diseases, Jinan, Shandong, China
  - National Clinical Research Center for Hematologic Diseases, Hospital of Soochow University, Suzhou, China
8
Saha P, Nguyen MT. Electron density mapping of boron clusters via convolutional neural networks to augment structure prediction algorithms. RSC Adv 2023; 13:30743-30752. [PMID: 37869387 PMCID: PMC10586239 DOI: 10.1039/d3ra05851d] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/28/2023] [Accepted: 10/10/2023] [Indexed: 10/24/2023] Open
Abstract
Determination and prediction of atomic cluster structures is an important endeavor in the field of nanoclusters and thereby in materials research. To a large extent, the fundamental properties of a nanocluster are governed by its molecular structure. Traditionally, structure elucidation is achieved using quantum mechanics (QM)-based calculations that are usually tedious and time-consuming for large nanoclusters. Various structural prediction algorithms have been reported in the literature (CALYPSO, USPEX). Although they tend to accelerate the structure exploration, they still require the aid of QM-based calculations for structure evaluation. This makes the structure prediction process computationally expensive. In this paper, we report on the creation of a convolutional neural network model, which can give relatively accurate energies for the ground state of nanoclusters from the promolecule density on the fly and could thereby be utilized for aiding structure prediction algorithms. We tested our model on a dataset consisting of pure boron nanoclusters of varying sizes.
Affiliation(s)
- Pinaki Saha
  - School of Physics, Engineering and Computer Science, University of Hertfordshire, UK
- Minh Tho Nguyen
  - Laboratory for Chemical Computation and Modelling, Institute for Artificial Intelligence, Van Lang University, Ho Chi Minh City, Vietnam
  - Faculty of Applied Technology, School of Technology, Van Lang University, Ho Chi Minh City, Vietnam
9
Rockholt MM, Kenefati G, Doan LV, Chen ZS, Wang J. In search of a composite biomarker for chronic pain by way of EEG and machine learning: where do we currently stand? Front Neurosci 2023; 17:1186418. [PMID: 37389362 PMCID: PMC10301750 DOI: 10.3389/fnins.2023.1186418] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/14/2023] [Accepted: 05/12/2023] [Indexed: 07/01/2023] Open
Abstract
Machine learning is becoming an increasingly common component of routine data analyses in clinical research. The past decade in pain research has witnessed great advances in human neuroimaging and machine learning. With each finding, the pain research community takes one step closer to uncovering fundamental mechanisms underlying chronic pain and at the same time proposing neurophysiological biomarkers. However, it remains challenging to fully understand chronic pain due to its multidimensional representations within the brain. By utilizing cost-effective and non-invasive imaging techniques such as electroencephalography (EEG) and analyzing the resulting data with advanced analytic methods, we have the opportunity to better understand and identify specific neural mechanisms associated with the processing and perception of chronic pain. This narrative literature review summarizes studies from the last decade describing the utility of EEG as a potential biomarker for chronic pain by synergizing clinical and computational perspectives.
Affiliation(s)
- Mika M. Rockholt
  - Department of Anesthesiology, Perioperative Care and Pain Management, New York University Grossman School of Medicine, New York, NY, United States
- George Kenefati
  - Department of Anesthesiology, Perioperative Care and Pain Management, New York University Grossman School of Medicine, New York, NY, United States
- Lisa V. Doan
  - Department of Anesthesiology, Perioperative Care and Pain Management, New York University Grossman School of Medicine, New York, NY, United States
- Zhe Sage Chen
  - Department of Psychiatry, New York University Grossman School of Medicine, New York, NY, United States
  - Department of Neuroscience & Physiology, Neuroscience Institute, New York University Grossman School of Medicine, New York, NY, United States
  - Department of Biomedical Engineering, New York University Tandon School of Engineering, Brooklyn, NY, United States
- Jing Wang
  - Department of Anesthesiology, Perioperative Care and Pain Management, New York University Grossman School of Medicine, New York, NY, United States
  - Department of Neuroscience & Physiology, Neuroscience Institute, New York University Grossman School of Medicine, New York, NY, United States
  - Department of Biomedical Engineering, New York University Tandon School of Engineering, Brooklyn, NY, United States
10
Diaz-del-Pino S, Trelles-Martinez R, González-Fernández F, Guil N. Artificial intelligence to assist specialists in the detection of haematological diseases. Heliyon 2023; 9:e15940. [PMID: 37215889 PMCID: PMC10195887 DOI: 10.1016/j.heliyon.2023.e15940] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/24/2023] [Revised: 04/27/2023] [Accepted: 04/28/2023] [Indexed: 05/24/2023] Open
Abstract
Artificial intelligence, particularly the growth of neural network research and development, has become an invaluable tool for data analysis, offering unrivalled solutions for image generation, natural language processing, and personalised suggestions. In the meantime, biomedicine has been presented as one of the pressing challenges of the 21st century. The inversion of the age pyramid, the increase in longevity, and the negative environment due to pollution and bad habits of the population have created a need for research into methodologies that can help to mitigate and fight against these changes. The combination of both fields has already achieved remarkable results in drug discovery, cancer prediction or gene activation. However, challenges such as data labelling, architecture improvements, interpretability of the models and translational implementation of the proposals still remain. In haematology, conventional protocols follow a stepwise approach that includes several tests and doctor-patient interactions to make a diagnosis. This procedure results in significant costs and workload for hospitals. In this paper, we present an artificial intelligence model based on neural networks to support practitioners in the identification of different haematological diseases using only routine and inexpensive blood count tests. In particular, we present both binary and multiclass classification of haematological diseases using a specialised neural network architecture in which data are studied and combined along the network, taking into account the clinical knowledge of the problem, obtaining results of up to 96% accuracy for the binary classification experiment. Furthermore, we compare this method against traditional machine learning algorithms such as gradient boosting decision trees and transformers for tabular data. The use of these machine learning techniques could reduce the cost and decision time and improve the quality of life for both specialists and patients while producing more precise diagnoses.
Affiliation(s)
- Nicolas Guil
  - Computer Architecture Department, University of Malaga, Spain
11
Chu T, Nguyen TT, Hai BD, Nguyen QH, Nguyen T. Graph Transformer for Drug Response Prediction. IEEE/ACM Trans Comput Biol Bioinform 2023; 20:1065-1072. [PMID: 36107906 DOI: 10.1109/tcbb.2022.3206888] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Indexed: 05/04/2023]
Abstract
Previous models have shown that learning drug features from their graph representation is more efficient than learning from their string or numeric representations. Furthermore, integrating multi-omics data of cell lines increases the performance of drug response prediction. However, these models have shown drawbacks in extracting drug features from the graph representation and in incorporating redundant information from multi-omics data. This paper proposes a deep learning model, GraTransDRP, to improve drug representation and reduce information redundancy. First, a graph transformer was utilized to extract the drug representation more efficiently. Next, convolutional neural networks (CNNs) were used to learn the mutation, methylation, and transcriptomics features. However, the dimension of the transcriptomics features was up to 17737. Therefore, KernelPCA was applied to the transcriptomics features to reduce the dimension and transform them into a dense representation before putting them through the CNN model. Finally, drug and omics features were combined to predict a response value by a fully connected network. Experimental results show that our model outperforms some state-of-the-art methods, including GraphDRP and GraOmicDRP.
12
Mazur H, Erbrich L, Quodbach J. Investigations into the use of machine learning to predict drug dosage form design to obtain desired release profiles for 3D printed oral medicines. Pharm Dev Technol 2023; 28:219-231. [PMID: 36715438 DOI: 10.1080/10837450.2023.2173778] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/31/2023]
Abstract
Three-dimensional (3D) printing, digitalization, and artificial intelligence (AI) are gaining increasing interest in modern medicine. All three aspects are combined in personalized medicine where 3D-printed dosage forms are advantageous because of their variable geometry design. The geometry design can be used to determine the surface area to volume (SA/V) ratio, which affects drug release from the dosage forms. This study investigated artificial neural networks (ANN) to predict suitable geometries for the desired dose and release profile. Filaments with 5% API load and polyvinyl alcohol were 3D printed using Fused Deposition Modeling to provide a wide variety of geometries with different dosages and SA/V ratios. These were dissolved in vitro, and the API release profiles were described mathematically. Using these data, ANN architectures were designed with the goal of predicting a suitable dosage form geometry. Poor accuracies of 68.5% in the training and 44.4% in the test settings were achieved with a classification architecture. However, the SA/V ratio could be predicted accurately with a mean squared error loss of only 0.05. This study shows that the prediction of the SA/V ratio using AI works, but not of the exact geometry. For this purpose, a global database could be built with a range of geometries to simplify the prescription process.
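To make the two quantities in this abstract concrete, the sketch below computes a surface area to volume (SA/V) ratio and a mean squared error. It is a minimal illustration under stated assumptions: the abstract does not say which geometries were printed, so a solid cylinder is used purely as an example shape, and the function names `cylinder_sa_to_v` and `mean_squared_error`, the millimetre units and the sample numbers are hypothetical.

```python
import math


def cylinder_sa_to_v(radius_mm: float, height_mm: float) -> float:
    """SA/V ratio (in 1/mm) of a solid cylinder; example shape, not from the study."""
    if radius_mm <= 0 or height_mm <= 0:
        raise ValueError("radius and height must be positive")
    # Two circular faces plus the lateral surface.
    surface = 2 * math.pi * radius_mm**2 + 2 * math.pi * radius_mm * height_mm
    volume = math.pi * radius_mm**2 * height_mm
    return surface / volume  # algebraically equal to 2/height + 2/radius


def mean_squared_error(predicted, observed):
    """Average squared difference between paired predictions and observations."""
    if not predicted or len(predicted) != len(observed):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(predicted)


# A flatter cylinder of similar volume exposes more surface per unit volume.
print(cylinder_sa_to_v(radius_mm=5.0, height_mm=4.0))
print(cylinder_sa_to_v(radius_mm=7.0, height_mm=2.0))
print(mean_squared_error([0.9, 1.3], [1.0, 1.2]))
```

Because SA/V is a single continuous number while "exact geometry" is a discrete class, regression on SA/V is an easier target than classification of geometry, which is consistent with the contrast the abstract reports.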
Affiliation(s)
- Hellen Mazur
  - Institute of Pharmaceutics and Biopharmaceutics, Heinrich Heine University, Düsseldorf, Germany
- Leon Erbrich
  - Institute of Pharmaceutics and Biopharmaceutics, Heinrich Heine University, Düsseldorf, Germany
- Julian Quodbach
  - Institute of Pharmaceutics and Biopharmaceutics, Heinrich Heine University, Düsseldorf, Germany; Department of Pharmaceutics, Utrecht Institute for Pharmaceutical Sciences, Utrecht University, Utrecht, The Netherlands
13
Heart rate variability is not suitable as surrogate marker for pain intensity in patients with chronic pain. Pain 2023:00006396-990000000-00252. [PMID: 36722463 DOI: 10.1097/j.pain.0000000000002868] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 09/07/2022] [Accepted: 01/03/2023] [Indexed: 02/02/2023]
Abstract
The search for more objective outcome measurements, and consequently surrogate markers for pain, started decades ago; however, no generally accepted biomarker for pain has qualified yet. The goal is to explore the value of heart rate variability (HRV) as a surrogate marker for pain intensity in a chronic pain setting. Pain intensity scores and HRV were collected in 366 patients with chronic pain, through a cross-sectional multicenter study. Pain intensity was measured with both the Visual Analogue Scale and Numeric Rating Scale, while 16 statistical HRV parameters were derived. Canonical correlation analysis was performed to evaluate the correlation between the dependent pain variables and the HRV parameters. Surrogacy was determined for each HRV parameter with point estimates between 0 and 1, whereby values close to 1 indicate a strong association between the surrogate and the true endpoint at the patient level. Weak correlations were revealed between HRV parameters and pain intensity scores. The highest surrogacy point estimate was found for mean heart rate as a marker for average pain intensity on the Numeric Rating Scale, with point estimates of 0.0961 (95% CI from 0.0384 to 0.1537) and 0.0209 (95% CI from 0 to 0.05) for patients without and with medication use, respectively. This study indicated that HRV parameters as separate entities are not suitable surrogate candidates for pain intensity in a population of chronic pain patients. Further potential surrogate candidates and clinically robust true endpoints should be explored in order to find a surrogate measure for the highly individual pain experience.
|
14
|
On the ability of machine learning methods to discover novel scaffolds. J Mol Model 2022; 29:22. [PMID: 36574054 DOI: 10.1007/s00894-022-05359-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/12/2022] [Accepted: 10/21/2022] [Indexed: 12/28/2022]
Abstract
The recent advances in the application of machine learning to drug discovery have made it a 'hot topic' for research, with hundreds of academic groups and companies integrating machine learning into their drug discovery projects. Nevertheless, there remains great uncertainty regarding the most appropriate ways to evaluate the relative performance of these powerful methods against more traditional cheminformatics approaches, and many pitfalls remain for the unwary. In 2020, researchers at MIT (Stokes et al., Cell 180(4), 688-702, 2020) reported the discovery of a new compound with antibacterial activity, halicin, through the use of a neural network machine learning method. A robust ability to identify new active chemotypes through computational methods would be very useful. In this study, we have used the Stokes et al. dataset to compare the performance of this method to two other approaches, Mapping of Activity Through Dichotomic Scores (MADS) by Todeschini et al. (J Chemom 32(4):e2994, 2018) and Random Matrix Theory (RMT) by Lee et al. (Proc Natl Acad Sci 116(9):3373-3378, 2019). Our results demonstrate that all three methods are capable of predicting halicin as an active antibacterial compound, but that this result is dependent on the dataset composition, pre-processing and the molecular fingerprint used. We have further assessed overall performance as determined by several performance metrics. We also investigated the scaffold hopping potential of the methods by modifying the dataset by removal of the β-lactam and fluoroquinolone chemotypes. MADS and RMT are able to identify actives in the test set that contained these substructures. This ability arises because of high scoring fragments of the withheld chemotypes that are in common with other active antibiotic classes. Interestingly, MADS is relatively better compared to the other two methods based on general predictive performance.
|
15
|
van Tilborg D, Alenicheva A, Grisoni F. Exposing the Limitations of Molecular Machine Learning with Activity Cliffs. J Chem Inf Model 2022; 62:5938-5951. [PMID: 36456532 PMCID: PMC9749029 DOI: 10.1021/acs.jcim.2c01073] [Citation(s) in RCA: 25] [Impact Index Per Article: 12.5] [Received: 08/23/2022] [Indexed: 12/03/2022]
Abstract
Machine learning has become a crucial tool in drug discovery and chemistry at large, e.g., to predict molecular properties, such as bioactivity, with high accuracy. However, activity cliffs (pairs of molecules that are highly similar in their structure but exhibit large differences in potency) have received limited attention for their effect on model performance. Not only are these edge cases informative for molecule discovery and optimization but also models that are well equipped to accurately predict the potency of activity cliffs have increased potential for prospective applications. Our work aims to fill the current knowledge gap on best-practice machine learning methods in the presence of activity cliffs. We benchmarked a total of 24 machine and deep learning approaches on curated bioactivity data from 30 macromolecular targets for their performance on activity cliff compounds. While all methods struggled in the presence of activity cliffs, machine learning approaches based on molecular descriptors outperformed more complex deep learning methods. Our findings highlight large case-by-case differences in performance, advocating for (a) the inclusion of dedicated "activity-cliff-centered" metrics during model development and evaluation and (b) the development of novel algorithms to better predict the properties of activity cliffs. To this end, the methods, metrics, and results of this study have been encapsulated into an open-access benchmarking platform named MoleculeACE (Activity Cliff Estimation, available on GitHub at: https://github.com/molML/MoleculeACE). MoleculeACE is designed to steer the community toward addressing the pressing but overlooked limitation of molecular machine learning models posed by activity cliffs.
Affiliation(s)
- Derek van Tilborg
- Institute for Complex Molecular Systems and Dept. Biomedical Engineering, Eindhoven University of Technology, 5612 AZ Eindhoven, The Netherlands
- Centre for Living Technologies, Alliance TU/e, WUR, UU, UMC Utrecht, 3584 CB Utrecht, The Netherlands
- Francesca Grisoni
- Institute for Complex Molecular Systems and Dept. Biomedical Engineering, Eindhoven University of Technology, 5612 AZ Eindhoven, The Netherlands
- Centre for Living Technologies, Alliance TU/e, WUR, UU, UMC Utrecht, 3584 CB Utrecht, The Netherlands
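The activity-cliff definition in the abstract above (structurally very similar molecules with large potency differences) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the MoleculeACE code: the function names, the use of Tanimoto similarity on sets of fingerprint bit indices, the thresholds (similarity of at least 0.9 and a potency gap of at least one log unit), and the toy compounds.

```python
from itertools import combinations


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto (Jaccard) similarity between two sets of fingerprint bit indices."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def find_activity_cliffs(compounds, min_similarity=0.9, min_log_diff=1.0):
    """Return name pairs that are highly similar yet differ strongly in potency.

    `compounds` is a list of (name, fingerprint_bits, log_potency) tuples.
    Both thresholds are illustrative assumptions, not values from the source.
    """
    cliffs = []
    for (name_a, bits_a, pot_a), (name_b, bits_b, pot_b) in combinations(compounds, 2):
        similar = tanimoto(bits_a, bits_b) >= min_similarity
        potency_gap = abs(pot_a - pot_b) >= min_log_diff
        if similar and potency_gap:
            cliffs.append((name_a, name_b))
    return cliffs


# Hypothetical toy data: bit sets stand in for real molecular fingerprints.
toy_compounds = [
    ("cpd_A", frozenset(range(0, 20)), 8.5),
    ("cpd_B", frozenset(range(0, 19)), 6.0),   # similarity to cpd_A is 19/20
    ("cpd_C", frozenset(range(10, 30)), 8.4),  # similarity to cpd_A is 10/30
]

cliff_pairs = find_activity_cliffs(toy_compounds)
print(cliff_pairs)
```

In this toy set only cpd_A and cpd_B qualify: they share 19 of 20 bits but differ by 2.5 log units. A real workflow would compute fingerprints with a cheminformatics toolkit and could combine several similarity measures.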
|
16
|
Using Artificial Intelligence for Drug Discovery: A Bibliometric Study and Future Research Agenda. Pharmaceuticals (Basel) 2022; 15:ph15121492. [PMID: 36558943 PMCID: PMC9785219 DOI: 10.3390/ph15121492] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 10/31/2022] [Revised: 11/23/2022] [Accepted: 11/27/2022] [Indexed: 12/03/2022] Open
Abstract
Drug discovery is usually a rule-based process that is carefully carried out by pharmacists. However, a new trend is emerging in research and practice where artificial intelligence is being used for drug discovery to increase efficiency or to develop new drugs for previously untreatable diseases. Nevertheless, so far, no study takes a holistic view of AI-based drug discovery research. Given the importance and potential of AI for drug discovery, this lack of research is surprising. This study aimed to close this research gap by conducting a bibliometric analysis to identify all relevant studies and to analyze interrelationships among algorithms, institutions, countries, and funding sponsors. For this purpose, a sample of 3884 articles was examined bibliometrically, including studies from 1991 to 2022. We utilized various qualitative and quantitative methods, such as performance analysis, science mapping, and thematic analysis. Based on these findings, we furthermore developed a research agenda that aims to serve as a foundation for future researchers.
|
17
|
Lane TR, Urbina F, Zhang X, Fye M, Gerlach J, Wright SH, Ekins S. Machine Learning Models Identify New Inhibitors for Human OATP1B1. Mol Pharm 2022; 19:4320-4332. [PMID: 36269563 PMCID: PMC9873312 DOI: 10.1021/acs.molpharmaceut.2c00662] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 01/26/2023]
Abstract
The uptake transporter OATP1B1 (SLCO1B1) is largely localized to the sinusoidal membrane of hepatocytes and is a known victim of unwanted drug-drug interactions. Computational models are useful for identifying potential substrates and/or inhibitors of clinically relevant transporters. Our goal was to generate OATP1B1 in vitro inhibition data for [3H] estrone-3-sulfate (E3S) transport in CHO cells and use it to build machine learning models to facilitate a comparison of seven different classification models [Deep learning, Adaboosted decision trees, Bernoulli naïve bayes, k-nearest neighbors (knn), random forest, support vector classifier (SVC), logistic regression (lreg), and XGBoost (xgb)] using ECFP6 fingerprints to perform 5-fold, nested cross validation. In addition, we compared models using 3D pharmacophores, simple chemical descriptors alone or plus ECFP6, as well as ECFP4 and ECFP8 fingerprints. Several machine learning algorithms (SVC, lreg, xgb, and knn) had excellent nested cross validation statistics, particularly for accuracy, AUC, and specificity. An external test set containing 207 unique compounds not in the training set demonstrated that at every threshold SVC outperformed the other algorithms based on a rank normalized score. A prospective validation test set was chosen using prediction scores from the SVC models with ECFP fingerprints and tested in vitro; 15 of 19 compounds (84% accuracy) predicted as active (≥20% inhibition) showed inhibition. Of these compounds, six (abamectin, asiaticoside, berbamine, doramectin, mobocertinib, and umbralisib) appear to be novel inhibitors of OATP1B1 not previously reported. These validated machine learning models can now be used to make predictions for drug-drug interactions for human OATP1B1 alongside other machine learning models for important drug transporters in our MegaTrans software.
Affiliation(s)
- Thomas R. Lane
- Collaborations Pharmaceuticals, Inc., 840 Main Campus Drive, Lab 3510 Raleigh, NC 27606, USA
- Fabio Urbina
- Collaborations Pharmaceuticals, Inc., 840 Main Campus Drive, Lab 3510 Raleigh, NC 27606, USA
- Xiaohong Zhang
- Department of Physiology, College of Medicine, University of Arizona, Tucson, AZ, 85724, USA
- Margret Fye
- Department of Physiology, College of Medicine, University of Arizona, Tucson, AZ, 85724, USA
- Jacob Gerlach
- Collaborations Pharmaceuticals, Inc., 840 Main Campus Drive, Lab 3510 Raleigh, NC 27606, USA
- Stephen H. Wright
- Department of Physiology, College of Medicine, University of Arizona, Tucson, AZ, 85724, USA
- Sean Ekins
- Collaborations Pharmaceuticals, Inc., 840 Main Campus Drive, Lab 3510 Raleigh, NC 27606, USA
|
18
|
Parastar H, Tauler R. Big (Bio)Chemical Data Mining Using Chemometric Methods: A Need for Chemists. Angew Chem Int Ed Engl 2022. [DOI: 10.1002/ange.201801134] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
Affiliation(s)
- Hadi Parastar
- Department of Chemistry, Sharif University of Technology, Tehran, Iran
- Roma Tauler
- Department of Environmental Chemistry, IDAEA-CSIC, 08034 Barcelona, Spain
|
19
|
Learning Functions and Classes Using Rules. AI 2022. [DOI: 10.3390/ai3030044] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022] Open
Abstract
In the current work, a novel method is presented for generating rules for data classification as well as for regression problems. The proposed method generates simple rules in a high-level programming language with the help of grammatical evolution. The method does not depend on any prior knowledge of the dataset; the memory it requires for its execution is constant regardless of the objective problem, and it can be used to detect any hidden dependencies between the features of the input problem as well. The proposed method was tested on an extensive range of problems from the relevant literature, and comparative results against other machine learning techniques are presented in this manuscript.
|
20
|
Tarasova OA, Rudik AV, Biziukova NY, Filimonov DA, Poroikov VV. Chemical named entity recognition in the texts of scientific publications using the naïve Bayes classifier approach. J Cheminform 2022; 14:55. [PMID: 35964150 PMCID: PMC9375066 DOI: 10.1186/s13321-022-00633-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/04/2022] [Accepted: 07/12/2022] [Indexed: 11/24/2022] Open
Abstract
Motivation: Application of chemical named entity recognition (CNER) algorithms allows retrieval of information from texts about chemical compound identifiers and creates associations with physical–chemical properties and biological activities. Scientific texts represent low-formalized sources of information. Most methods aimed at CNER are based on machine learning approaches, including conditional random fields and deep neural networks. In general, most machine learning approaches require either vector or sparse word representation of texts. Chemical named entities (CNEs) constitute only a small fraction of the whole text, and the datasets used for training are highly imbalanced. Methods and results: We propose a new method for extracting CNEs from texts based on the naïve Bayes classifier combined with specially developed filters. In contrast to the earlier developed CNER methods, our approach uses the representation of the data as a set of fragments of text (FoTs) with the subsequent preparation of a set of multi-n-grams (sequences from one to n symbols) for each FoT. Our approach may provide the recognition of novel CNEs. For the CHEMDNER corpus, sensitivity (recall) was 0.95, precision was 0.74, specificity was 0.88, and balanced accuracy was 0.92 based on five-fold cross validation. We applied the developed algorithm to the extracted CNEs of potential severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) main protease (Mpro) inhibitors. A set of CNEs corresponding to the chemical substances evaluated in the biochemical assays used for the discovery of Mpro inhibitors was retrieved. Manual analysis of the appropriate texts showed that CNEs of potential SARS-CoV-2 Mpro inhibitors were successfully identified by our method.
Conclusion: The obtained results show that the proposed method can be used for filtering out words that are not related to CNEs; therefore, it can be successfully applied to the extraction of CNEs for the purposes of cheminformatics and medicinal chemistry. Supplementary information: The online version contains supplementary material available at 10.1186/s13321-022-00633-4.
Affiliation(s)
- O A Tarasova
- Laboratory of Structure-Function Based Drug Design, Institute of Biomedical Chemistry, 10 bldg. 8, Pogodinskaya Str., Moscow, 119121, Russia
- A V Rudik
- Laboratory of Structure-Function Based Drug Design, Institute of Biomedical Chemistry, 10 bldg. 8, Pogodinskaya Str., Moscow, 119121, Russia
- N Yu Biziukova
- Laboratory of Structure-Function Based Drug Design, Institute of Biomedical Chemistry, 10 bldg. 8, Pogodinskaya Str., Moscow, 119121, Russia
- D A Filimonov
- Laboratory of Structure-Function Based Drug Design, Institute of Biomedical Chemistry, 10 bldg. 8, Pogodinskaya Str., Moscow, 119121, Russia
- V V Poroikov
- Laboratory of Structure-Function Based Drug Design, Institute of Biomedical Chemistry, 10 bldg. 8, Pogodinskaya Str., Moscow, 119121, Russia
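The multi-n-gram representation mentioned in the abstract above ("sequences from one to n symbols" for each fragment of text) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: `multi_ngrams` and `balanced_accuracy` are hypothetical names, and "symbols" is read here as individual characters. The second helper uses the standard definition of balanced accuracy as the mean of sensitivity and specificity.

```python
def multi_ngrams(fragment: str, n: int) -> list[str]:
    """Return every contiguous character sequence of length 1..n in `fragment`.

    Assumption: "symbols" are single characters; the original method may
    tokenize differently.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    grams = []
    for size in range(1, n + 1):
        for start in range(len(fragment) - size + 1):
            grams.append(fragment[start:start + size])
    return grams


def balanced_accuracy(sensitivity: float, specificity: float) -> float:
    """Standard balanced accuracy: the mean of sensitivity and specificity."""
    return (sensitivity + specificity) / 2


# Hypothetical fragment of text; any string works.
print(multi_ngrams("NaCl", 2))
print(balanced_accuracy(0.95, 0.88))
```

With the reported sensitivity of 0.95 and specificity of 0.88, the mean is about 0.915, which agrees with the reported balanced accuracy of 0.92 after rounding.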
|
21
|
Sheridan RP, Culberson JC, Joshi E, Tudor M, Karnachi P. Prediction Accuracy of Production ADMET Models as a Function of Version: Activity Cliffs Rule. J Chem Inf Model 2022; 62:3275-3280. [PMID: 35796226 DOI: 10.1021/acs.jcim.2c00699] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 12/23/2022]
Abstract
As with many other institutions, our company maintains many quantitative structure-activity relationship (QSAR) models of absorption, distribution, metabolism, excretion, and toxicity (ADMET) end points and updates the models regularly. We recently examined version-to-version predictivity for these models over a period of 10 years. In this approach we monitor the goodness of prediction of new molecules relative to the training set of model version V before they are incorporated in the updated model V+1. Using a cell-based permeability assay (Papp) as an example, we illustrate how the QSAR models made from this data are generally predictive and can be utilized to enrich chemical designs and synthesis. Despite the obvious utility of these models, we turned up unexpected behavior in Papp and other ADMET activities for which the explanation is not obvious. One such behavior is that the apparent predictivity of the models as measured by root-mean-square-error can vary greatly from version to version and is sometimes very poor. One intuitively appealing explanation is that the observed activities of the new molecules fall outside the bulk of activities in the training set. Alternatively, one may think that the new molecules are exploring different regions of chemical space than the training set. However, the real explanation has to do with activity cliffs. If the observed activities of the new molecules are different than expected based on similar molecules in the training set, the predictions will be less accurate. This is true for all our ADMET end points.
Affiliation(s)
- Robert P Sheridan
- Computational and Structural Chemistry, Merck & Co., Inc., Kenilworth, New Jersey 07033, United States
- J Chris Culberson
- Computational and Structural Chemistry, Merck & Co., Inc., Kenilworth, New Jersey 07033, United States
- Elizabeth Joshi
- Computational and Structural Chemistry, Merck & Co., Inc., Kenilworth, New Jersey 07033, United States
- Matthew Tudor
- Computational and Structural Chemistry, Merck & Co., Inc., Kenilworth, New Jersey 07033, United States
- Prabha Karnachi
- Computational and Structural Chemistry, Merck & Co., Inc., Kenilworth, New Jersey 07033, United States
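The version-to-version check described in the abstract above (scoring model version V on new molecules before they enter the training set of V+1, using root-mean-square error) could be sketched as follows. This is a minimal sketch under assumptions: the function names, the dictionary layout, and the numbers are hypothetical and do not come from the paper.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error between observed and predicted activities."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def prospective_rmse_by_version(batches):
    """Map each model version to the RMSE on molecules it had not yet seen.

    `batches` maps a version label to (observed, predicted) lists, where the
    predictions were made by that version before retraining.
    """
    return {version: rmse(obs, pred) for version, (obs, pred) in batches.items()}


# Hypothetical activity values for two successive model versions.
toy_batches = {
    "V1": ([1.0, 2.0], [1.0, 2.0]),
    "V2": ([1.0, 2.0], [2.0, 4.0]),
}
rmse_per_version = prospective_rmse_by_version(toy_batches)
print(rmse_per_version)
```

Tracking this value per version is one simple way to spot the kind of version-to-version swings in apparent predictivity that the abstract attributes to activity cliffs.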
|
22
|
Suresh N, Chinnakonda Ashok Kumar N, Subramanian S, Srinivasa G. Memory augmented recurrent neural networks for de-novo drug design. PLoS One 2022; 17:e0269461. [PMID: 35737661 PMCID: PMC9223405 DOI: 10.1371/journal.pone.0269461] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/16/2021] [Accepted: 05/22/2022] [Indexed: 12/01/2022] Open
Abstract
A recurrent neural network (RNN) is a machine learning model that learns the relationship between elements of an input series, in addition to inferring a relationship between the data input to the model and target output. Memory augmentation allows the RNN to learn the interrelationships between elements of the input over a protracted length of the input series. Inspired by the success of stack augmented RNN (StackRNN) to generate strings for various applications, we present two memory augmented RNN-based architectures: the Neural Turing Machine (NTM) and the Differentiable Neural Computer (DNC) for the de-novo generation of small molecules. We trained a character-level convolutional neural network (CNN) to predict the properties of a generated string and compute a reward or loss in a deep reinforcement learning setup to bias the Generator to produce molecules with the desired property. Further, we compare the performance of these architectures to gain insight into their relative merits in terms of the validity and novelty of the generated molecules and the degree of property bias towards the computational generation of de-novo drugs. We also compare the performance of these architectures with simpler recurrent neural networks (Vanilla RNN, LSTM, and GRU) without an external memory component to explore the impact of augmented memory in the task of de-novo generation of small molecules.
Affiliation(s)
- Naveen Suresh
- PES Center for Pattern Recognition and Department of Computer Science and Engineering, PES University, Bengaluru, Karnataka, India
- Neelesh Chinnakonda Ashok Kumar
- PES Center for Pattern Recognition and Department of Computer Science and Engineering, PES University, Bengaluru, Karnataka, India
- Srikumar Subramanian
- PES Center for Pattern Recognition and Department of Computer Science and Engineering, PES University, Bengaluru, Karnataka, India
- Gowri Srinivasa
- PES Center for Pattern Recognition and Department of Computer Science and Engineering, PES University, Bengaluru, Karnataka, India
|
23
|
Dlamini Z, Skepu A, Kim N, Mkhabele M, Khanyile R, Molefi T, Mbatha S, Setlai B, Mulaudzi T, Mabongo M, Bida M, Kgoebane-Maseko M, Mathabe K, Lockhat Z, Kgokolo M, Chauke-Malinga N, Ramagaga S, Hull R. AI and precision oncology in clinical cancer genomics: From prevention to targeted cancer therapies-an outcomes based patient care. INFORMATICS IN MEDICINE UNLOCKED 2022. [DOI: 10.1016/j.imu.2022.100965] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 12/11/2022] Open
|
24
|
Nag S, Baidya ATK, Mandal A, Mathew AT, Das B, Devi B, Kumar R. Deep learning tools for advancing drug discovery and development. 3 Biotech 2022; 12:110. [PMID: 35433167 PMCID: PMC8994527 DOI: 10.1007/s13205-022-03165-8] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Received: 11/06/2021] [Accepted: 03/18/2022] [Indexed: 12/26/2022] Open
Abstract
A few decades ago, drug discovery and development were limited to a bunch of medicinal chemists working in a lab with an enormous amount of testing, validations, and synthetic procedures, all contributing to considerable investments in time and wealth to get one drug out into the clinics. The advancements in computational techniques combined with a boom in multi-omics data led to the development of various bioinformatics/pharmacoinformatics/cheminformatics tools that have helped speed up the drug development process. But with the advent of artificial intelligence (AI), machine learning (ML) and deep learning (DL), the conventional drug discovery process has been further rationalized. Extensive biological data in the form of big data present in various databases across the globe act as the raw material for the ML/DL-based approaches and help in accurate identification of patterns and models which can be used to identify therapeutically active molecules with much lower investments of time, workforce and wealth. In this review, we have begun by introducing the general concepts in the drug discovery pipeline, followed by an outline of the fields in the drug discovery process where ML/DL can be utilized. We have also introduced ML and DL along with their applications, various learning methods, and training models used to develop the ML/DL-based algorithms. Furthermore, we have summarized various DL-based tools existing in the public domain with their application in the drug discovery paradigm, which includes DL tools for identification of drug targets and drug–target interaction such as DeepCPI, DeepDTA, WideDTA, PADME, DeepAffinity, and DeepPocket. Additionally, we have discussed various DL-based models used in protein structure prediction, de novo design of new chemical scaffolds, virtual screening of chemical libraries for hit identification, absorption, distribution, metabolism, excretion, and toxicity (ADMET) prediction, metabolite prediction, clinical trial design, and oral bioavailability prediction. In the end, we have tried to shed light on some of the successful ML/DL-based models used in the drug discovery and development pipeline while also discussing the current challenges and prospects of the application of DL tools in drug discovery and development. We believe that this review will be useful for medicinal and computational chemists searching for DL tools for use in their drug discovery projects.
Affiliation(s)
- Sagorika Nag
- Department of Pharmaceutical Engineering and Technology, Indian Institute of Technology (B.H.U.), Varanasi, UP 221005 India
- Anurag T. K. Baidya
- Department of Pharmaceutical Engineering and Technology, Indian Institute of Technology (B.H.U.), Varanasi, UP 221005 India
- Abhimanyu Mandal
- Department of Pharmaceutical Engineering and Technology, Indian Institute of Technology (B.H.U.), Varanasi, UP 221005 India
- Alen T. Mathew
- Department of Pharmaceutical Engineering and Technology, Indian Institute of Technology (B.H.U.), Varanasi, UP 221005 India
- Bhanuranjan Das
- Department of Pharmaceutical Engineering and Technology, Indian Institute of Technology (B.H.U.), Varanasi, UP 221005 India
- Bharti Devi
- Department of Pharmaceutical Engineering and Technology, Indian Institute of Technology (B.H.U.), Varanasi, UP 221005 India
- Rajnish Kumar
- Department of Pharmaceutical Engineering and Technology, Indian Institute of Technology (B.H.U.), Varanasi, UP 221005 India
|
25
|
Rodríguez-Pérez R, Miljković F, Bajorath J. Machine Learning in Chemoinformatics and Medicinal Chemistry. Annu Rev Biomed Data Sci 2022; 5:43-65. [PMID: 35440144 DOI: 10.1146/annurev-biodatasci-122120-124216] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 11/09/2022]
Abstract
In chemoinformatics and medicinal chemistry, machine learning has evolved into an important approach. In recent years, increasing computational resources and new deep learning algorithms have put machine learning onto a new level, addressing previously unmet challenges in pharmaceutical research. In silico approaches for compound activity predictions, de novo design, and reaction modeling have been further advanced by new algorithmic developments and the emergence of big data in the field. Herein, novel applications of machine learning and deep learning in chemoinformatics and medicinal chemistry are reviewed. Opportunities and challenges for new methods and applications are discussed, placing emphasis on proper baseline comparisons, robust validation methodologies, and new applicability domains. Expected final online publication date for the Annual Review of Biomedical Data Science, Volume 5 is August 2022. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.
Affiliation(s)
- Raquel Rodríguez-Pérez
- Department of Life Science Informatics, B-IT (Bonn-Aachen International Center for Information Technology), Chemical Biology and Medicinal Chemistry Program Unit, LIMES (Life and Medical Sciences Institute), Rheinische Friedrich-Wilhelms-Universität, Bonn, Germany. Current affiliation: Novartis Institutes for Biomedical Research, Novartis Campus, Basel, Switzerland
- Filip Miljković
- Department of Life Science Informatics, B-IT (Bonn-Aachen International Center for Information Technology), Chemical Biology and Medicinal Chemistry Program Unit, LIMES (Life and Medical Sciences Institute), Rheinische Friedrich-Wilhelms-Universität, Bonn, Germany. Current affiliation: Data Science and AI, Imaging and Data Analytics, Clinical Pharmacology and Safety Sciences, R&D AstraZeneca, Gothenburg, Sweden
- Jürgen Bajorath
- Department of Life Science Informatics, B-IT (Bonn-Aachen International Center for Information Technology), Chemical Biology and Medicinal Chemistry Program Unit, LIMES (Life and Medical Sciences Institute), Rheinische Friedrich-Wilhelms-Universität, Bonn, Germany
|
26
|
Constructing Features Using a Hybrid Genetic Algorithm. SIGNALS 2022. [DOI: 10.3390/signals3020012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022] Open
Abstract
A hybrid procedure that incorporates grammatical evolution and a weight decaying technique is proposed here for various classification and regression problems. The proposed method has two main phases: the creation of features and the evaluation of these features. During the first phase, using grammatical evolution, new features are created as non-linear combinations of the original features of the datasets. In the second phase, based on the characteristics of the first phase, the original dataset is modified and a neural network trained with a genetic algorithm is applied to this dataset. The proposed method was applied to an extremely wide set of datasets from the relevant literature and the experimental results were compared with four other techniques.
|
27
|
Baskin I, Epshtein A, Ein-Eli Y. Benchmarking machine learning methods for modeling physical properties of ionic liquids. J Mol Liq 2022. [DOI: 10.1016/j.molliq.2022.118616] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 01/21/2023]
|
28
|
Graph neural network approaches for drug-target interactions. Curr Opin Struct Biol 2022; 73:102327. [DOI: 10.1016/j.sbi.2021.102327] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 04/16/2021] [Revised: 11/22/2021] [Accepted: 12/13/2021] [Indexed: 01/06/2023]
|
29
|
Rodríguez-Pérez R, Bajorath J. Evolution of Support Vector Machine and Regression Modeling in Chemoinformatics and Drug Discovery. J Comput Aided Mol Des 2022; 36:355-362. [PMID: 35304657 PMCID: PMC9325859 DOI: 10.1007/s10822-022-00442-9] [Citation(s) in RCA: 22] [Impact Index Per Article: 11.0] [Received: 02/08/2022] [Accepted: 02/15/2022] [Indexed: 11/05/2022]
Abstract
The support vector machine (SVM) algorithm is one of the most widely used machine learning (ML) methods for predicting active compounds and molecular properties. In chemoinformatics and drug discovery, SVM has been a state-of-the-art ML approach for more than a decade. A unique attribute of SVM is that it operates in feature spaces of increasing dimensionality. Hence, SVM conceptually departs from the paradigm of low dimensionality that applies to many other methods for chemical space navigation. The SVM approach is applicable to compound classification and ranking, multi-class predictions, and (in algorithmically modified form) regression modeling. In the emerging era of deep learning (DL), SVM retains its relevance as one of the premier ML methods in chemoinformatics, for reasons discussed herein. We describe the SVM methodology including strengths and weaknesses and discuss selected applications that have contributed to the evolution of SVM as a premier approach for compound classification, property predictions, and virtual compound screening.
Affiliation(s)
- Raquel Rodríguez-Pérez
- Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Friedrich-Hirzebruch-Allee 6, D-53115, Bonn, Germany; Novartis Institutes for Biomedical Research, Novartis Campus, CH-4002, Basel, Switzerland
- Jürgen Bajorath
- Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Friedrich-Hirzebruch-Allee 6, D-53115, Bonn, Germany; Novartis Institutes for Biomedical Research, Novartis Campus, CH-4002, Basel, Switzerland
|
30
|
Choudhury C, Arul Murugan N, Deva Priyakumar U. Structure-based drug repurposing: traditional and advanced AI/ML-aided methods. Drug Discov Today 2022; 27:1847-1861. [PMID: 35301148 PMCID: PMC8920090 DOI: 10.1016/j.drudis.2022.03.006] [Citation(s) in RCA: 27] [Impact Index Per Article: 13.5] [Received: 10/04/2021] [Revised: 02/16/2022] [Accepted: 03/10/2022] [Indexed: 02/08/2023]
Abstract
The current global health emergency in the form of the coronavirus disease 2019 (COVID-19) pandemic has highlighted the need for fast, accurate, and efficient drug discovery pipelines. Traditional drug discovery projects relying on in vitro high-throughput screening (HTS) involve large investments and sophisticated experimental set-ups, affordable only to big biopharmaceutical companies. In this scenario, application of efficient state-of-the-art computational methods and modern artificial intelligence (AI)-based algorithms for rapid screening of repurposable chemical space [approved drugs and natural products (NPs) with proven pharmacokinetic profiles] to identify the initial leads is a powerful option to save resources and time. Structure-based drug repurposing is a popular in silico repurposing approach. In this review, we discuss traditional and modern AI-based computational methods and tools applied at various stages of structure-based drug discovery (SBDD) pipelines. Additionally, we highlight the role of generative models in generating molecules with scaffolds from repurposable chemical space. Teaser: This review highlights the importance of repurposable chemical space, and the contributions of conventional in silico approaches and modern machine-learning algorithms for rapid structure-based drug repurposing.
Affiliation(s)
- Chinmayee Choudhury
- Department of Experimental Medicine and Biotechnology, Postgraduate Institute of Medical Education and Research, Sector-12, Chandigarh 160012, India
- N Arul Murugan
- Department of Computer Science, School of Electrical Engineering and Computer Sciences, KTH Royal Institute of Technology, S-100 44, Stockholm, Sweden; Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi 110020, India
- U Deva Priyakumar
- Center for Computational Natural Sciences and Bioinformatics, International Institute of Information Technology, Hyderabad 500 032, India

31
Alkindi KM, Mukherjee K, Pandey M, Arora A, Janizadeh S, Pham QB, Anh DT, Ahmadi K. Prediction of groundwater nitrate concentration in a semiarid region using hybrid Bayesian artificial intelligence approaches. ENVIRONMENTAL SCIENCE AND POLLUTION RESEARCH INTERNATIONAL 2022; 29:20421-20436. [PMID: 34735705 DOI: 10.1007/s11356-021-17224-9] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 06/22/2021] [Accepted: 10/21/2021] [Indexed: 06/13/2023]
Abstract
Nitrate is a major pollutant in groundwater whose main sources are municipal wastewater and agricultural activities. In the present study, Bayesian approaches such as the Bayesian generalized linear model (BGLM), Bayesian regularized neural network (BRNN), Bayesian additive regression tree (BART), and Bayesian ridge regression (BRR) were used to model groundwater nitrate contamination in a semiarid region, the Marvdasht watershed, Fars province, Iran. Eleven groundwater (GW) nitrate conditioning factors were taken as input parameters for predictive modeling. The results showed that the Bayesian models used in this study were all competent to model groundwater nitrate and that the BART model, with R² = 0.83, was more efficient than the other models. The variable importance analysis showed that potassium (K) has the highest importance in the models, followed by rainfall, altitude, groundwater depth, and distance from the residential area. The results of the study can support the decision-making process to control and reduce the sources of nitrate pollution.
Affiliation(s)
- Khalifa M Alkindi
- UNESCO Chair on Aflaj Studies, Archaeohydrology, University of Nizwa, Nizwa, Oman
- Kaustuv Mukherjee
- Department of Geography, Chandidas Mahavidyalaya, Birbhum, WB, 731215, India
- Manish Pandey
- University Center for Research & Development (UCRD), Chandigarh University, Mohali, 140413, Punjab, India
- Department of Civil Engineering, University Institute of Engineering, Chandigarh University, Mohali, 140413, Punjab, India
- Aman Arora
- Department of Geography, Faculty of Natural Sciences, Jamia Millia Islamia, New Delhi, 10025, Delhi, India
- Saeid Janizadeh
- Department of Watershed Management Engineering and Sciences, Faculty in Natural Resources and Marine Science, Tarbiat Modares University, 14115-111, Tehran, Iran
- Quoc Bao Pham
- Institute of Applied Technology, Thu Dau Mot University, Binh Duong Province, Vietnam
- Duong Tran Anh
- Ho Chi Minh City University of Technology (HUTECH) 475A, Dien Bien Phu, Ward 25, Binh Thanh District, Ho Chi Minh City, Vietnam
- Kourosh Ahmadi
- Department of Forestry, Faculty in Natural Resources and Marine Science, Tarbiat Modares University, 14115-111, Tehran, Iran

32
Ivanov SM, Lagunin AA, Filimonov DA, Poroikov VV. Relationships between the Structure and Severe Drug-Induced Liver Injury for Low, Medium, and High Doses of Drugs. Chem Res Toxicol 2022; 35:402-411. [PMID: 35172101 DOI: 10.1021/acs.chemrestox.1c00307] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/29/2022]
Abstract
Assessment of structure-activity relationships (SARs) for predicting severe drug-induced liver injury (DILI) is essential since in vivo and in vitro preclinical methods cannot detect many druglike compounds disrupting liver functions. To date, plenty of SAR models for the prediction of DILI have been developed; however, none of them considered the route of drug administration and daily dose, which may introduce significant bias into prediction results. We have created a dataset of 617 drugs with parenteral and oral administration routes and consistent information on DILI severity. We have found a clear relationship between route, dose, and DILI severity. According to SAR, nearly 40% of moderate- and non-DILI-causing drugs would cause severe DILI if they were administered at high oral doses. We have proposed the following approach to predict severe DILI. New compounds recommended to be used at low oral doses (<∼10 mg daily), or parenterally, can be considered not causing severe DILI. DILI for compounds administered at medium oral doses (∼10-100 mg daily; 22.2% of drugs under consideration) can be considered unpredictable because reasonable SAR models were not obtained due to the small size and heterogeneity of the corresponding dataset. The DILI potential of the compounds recommended to be used at high oral doses (more than ∼100 mg daily) can be estimated using SAR modeling. The balanced accuracy of the approach calculated by a 10-fold cross-validation procedure is 0.803. The developed approach can be used to estimate severe DILI for druglike compounds proposed to use at low and high oral doses or parenterally at the early stages of drug development.
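The dose-based triage described in this abstract can be sketched as a small decision function. This is an illustrative sketch only, not the authors' code: the function name `triage_severe_dili`, the `sar_model` callable, the returned labels, and the exact handling of the boundary values are assumptions (the abstract gives only approximate thresholds of about 10 and 100 mg daily).

```python
from typing import Callable, Optional

# Approximate thresholds taken from the abstract; the exact cut-offs are assumed.
LOW_DOSE_MG = 10.0
HIGH_DOSE_MG = 100.0


def triage_severe_dili(
    route: str,
    daily_dose_mg: float,
    sar_model: Optional[Callable[[], bool]] = None,
) -> str:
    """Return a coarse severe-DILI label from route and daily dose.

    `sar_model` is a hypothetical stand-in for a trained SAR classifier that
    returns True when it predicts severe DILI for the compound of interest.
    """
    # Parenteral drugs and low oral doses are treated as not causing severe DILI.
    if route == "parenteral" or daily_dose_mg < LOW_DOSE_MG:
        return "not severe"
    # Medium oral doses: the abstract reports that no reasonable SAR model exists.
    if daily_dose_mg <= HIGH_DOSE_MG:
        return "unpredictable"
    # High oral doses: defer to the SAR model when one is supplied.
    if sar_model is None:
        return "needs SAR model"
    return "severe" if sar_model() else "not severe"
```

For example, `triage_severe_dili("oral", 50.0)` falls in the medium-dose band and returns `"unpredictable"`, while a high oral dose is passed on to whatever SAR classifier the caller provides.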
Affiliation(s)
- Sergey M Ivanov
- Institute of Biomedical Chemistry, Pogodinskaya Str., 10/8, Moscow 119121, Russia; Pirogov Russian National Research Medical University, Ostrovityanova Str., 1, Moscow 117997, Russia
- Alexey A Lagunin
- Institute of Biomedical Chemistry, Pogodinskaya Str., 10/8, Moscow 119121, Russia; Pirogov Russian National Research Medical University, Ostrovityanova Str., 1, Moscow 117997, Russia
- Dmitry A Filimonov
- Institute of Biomedical Chemistry, Pogodinskaya Str., 10/8, Moscow 119121, Russia
- Vladimir V Poroikov
- Institute of Biomedical Chemistry, Pogodinskaya Str., 10/8, Moscow 119121, Russia

33
Ksenofontov AA, Lukanov MM, Bocharov PS, Berezin MB, Tetko IV. Deep neural network model for highly accurate prediction of BODIPYs absorption. SPECTROCHIMICA ACTA. PART A, MOLECULAR AND BIOMOLECULAR SPECTROSCOPY 2022; 267:120577. [PMID: 34776377 DOI: 10.1016/j.saa.2021.120577] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 06/11/2021] [Revised: 10/12/2021] [Accepted: 10/31/2021] [Indexed: 06/13/2023]
Abstract
The possibility of accurately predicting the absorption maximum wavelength of BODIPYs was investigated. We found that previously reported models had a low accuracy (40-57 nm) when predicting BODIPYs due to the limited dataset sizes and/or number of BODIPYs (a few hundred). New models developed in this study were based on data of 6000-plus fluorescent dyes (including 4000-plus BODIPYs) and the deep neural network architecture. A high prediction accuracy (five-fold cross-validation root mean squared error (RMSE) of 18.4 nm) was obtained using a consensus model, which was more accurate than individual models. This model provided excellent accuracy (RMSE of 8 nm) for molecules previously synthesized in our laboratory as well as for prospective validation of three new BODIPYs. We found that solvent properties did not significantly influence the model accuracy since only a few BODIPYs exhibited solvatochromism. The analysis of large prediction errors suggested that compounds able to have intermolecular interactions with solvent or salts were likely to be incorrectly predicted. The consensus model is freely available at https://ochem.eu/article/134921 and can help other researchers accelerate the design of new dyes with desired properties.
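To make the RMSE figures and the consensus idea concrete, the following minimal sketch computes RMSE for two individual predictors and for their average. The abstract does not state how the consensus model combines its members, so simple averaging is an assumption here, and all wavelengths are made-up illustrative values, not data from the study.

```python
import math


def rmse(predicted, observed):
    """Root mean squared error between two equal-length sequences."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


def consensus(predictions_per_model):
    """Average the predictions of several models, compound by compound.

    Plain averaging is an assumed combination rule, chosen for illustration.
    """
    n_models = len(predictions_per_model)
    return [sum(values) / n_models for values in zip(*predictions_per_model)]


# Hypothetical absorption maxima in nm (illustrative only).
observed = [500.0, 520.0, 640.0]
model_a = [510.0, 515.0, 650.0]
model_b = [492.0, 527.0, 628.0]

averaged = consensus([model_a, model_b])
```

In this toy case the two models err in opposite directions, so `rmse(averaged, observed)` is smaller than the RMSE of either individual model; real ensembles benefit only to the extent that their members' errors are not strongly correlated.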
Affiliation(s)
- Alexander A Ksenofontov
- G.A. Krestov Institute of Solution Chemistry of the Russian Academy of Sciences, Akademicheskaya Street, 153045 Ivanovo, Russia
- Michail M Lukanov
- G.A. Krestov Institute of Solution Chemistry of the Russian Academy of Sciences, Akademicheskaya Street, 153045 Ivanovo, Russia; Ivanovo State University of Chemistry and Technology, 7, Sheremetevskiy Avenue, Ivanovo 153000, Russia
- Pavel S Bocharov
- G.A. Krestov Institute of Solution Chemistry of the Russian Academy of Sciences, Akademicheskaya Street, 153045 Ivanovo, Russia; Ivanovo State University of Chemistry and Technology, 7, Sheremetevskiy Avenue, Ivanovo 153000, Russia
- Michail B Berezin
- G.A. Krestov Institute of Solution Chemistry of the Russian Academy of Sciences, Akademicheskaya Street, 153045 Ivanovo, Russia
- Igor V Tetko
- G.A. Krestov Institute of Solution Chemistry of the Russian Academy of Sciences, Akademicheskaya Street, 153045 Ivanovo, Russia; Helmholtz Zentrum München - German Research Center for Environmental Health (GmbH), Institute of Structural Biology, Ingolstädter Landstraße 1, 85764 Neuherberg, Germany; BIGCHEM GmbH, Valerystr. 49, 85716 Unterschleißheim, Germany

34
Saldívar-González FI, Aldas-Bulos VD, Medina-Franco JL, Plisson F. Natural product drug discovery in the artificial intelligence era. Chem Sci 2022; 13:1526-1546. [PMID: 35282622 PMCID: PMC8827052 DOI: 10.1039/d1sc04471k] [Citation(s) in RCA: 50] [Impact Index Per Article: 25.0] [Received: 08/13/2021] [Accepted: 12/10/2021] [Indexed: 12/19/2022] Open
Abstract
Natural products (NPs) are primarily recognized as privileged structures to interact with protein drug targets. Their unique characteristics and structural diversity continue to marvel scientists for developing NP-inspired medicines, even though the pharmaceutical industry has largely given up. High-performance computer hardware, extensive storage, accessible software and affordable online education have democratized the use of artificial intelligence (AI) in many sectors and research areas. The last decades have introduced natural language processing and machine learning algorithms, two subfields of AI, to tackle NP drug discovery challenges and open up opportunities. In this article, we review and discuss the rational applications of AI approaches developed to assist in discovering bioactive NPs and capturing the molecular "patterns" of these privileged structures for combinatorial design or target selectivity.
Affiliation(s)
- F I Saldívar-González
- DIFACQUIM Research Group, School of Chemistry, Department of Pharmacy, Universidad Nacional Autónoma de México, Avenida Universidad 3000, 04510 Mexico, Mexico
- V D Aldas-Bulos
- Unidad de Genómica Avanzada, Laboratorio Nacional de Genómica para la Biodiversidad (Langebio), Centro de Investigación y de Estudios Avanzados del IPN, Irapuato, Guanajuato, Mexico
- J L Medina-Franco
- DIFACQUIM Research Group, School of Chemistry, Department of Pharmacy, Universidad Nacional Autónoma de México, Avenida Universidad 3000, 04510 Mexico, Mexico
- F Plisson
- CONACYT - Unidad de Genómica Avanzada, Laboratorio Nacional de Genómica para la Biodiversidad (Langebio), Centro de Investigación y de Estudios Avanzados del IPN, Irapuato, Guanajuato, Mexico

35
Lin E, Lin CH, Lane HY. De Novo Peptide and Protein Design Using Generative Adversarial Networks: An Update. J Chem Inf Model 2022; 62:761-774. [DOI: 10.1021/acs.jcim.1c01361] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 12/11/2022]
Affiliation(s)
- Eugene Lin
- Department of Biostatistics, University of Washington, Seattle, Washington 98195, United States
- Department of Electrical & Computer Engineering, University of Washington, Seattle, Washington 98195, United States
- Graduate Institute of Biomedical Sciences, China Medical University, Taichung 40402, Taiwan
- Chieh-Hsin Lin
- Graduate Institute of Biomedical Sciences, China Medical University, Taichung 40402, Taiwan
- Department of Psychiatry, Kaohsiung Chang Gung Memorial Hospital, Chang Gung University College of Medicine, Kaohsiung 83301, Taiwan
- School of Medicine, Chang Gung University, Taoyuan 33302, Taiwan
- Hsien-Yuan Lane
- Graduate Institute of Biomedical Sciences, China Medical University, Taichung 40402, Taiwan
- Department of Psychiatry, China Medical University Hospital, Taichung 40447, Taiwan
- Brain Disease Research Center, China Medical University Hospital, Taichung 40447, Taiwan
- Department of Psychology, College of Medical and Health Sciences, Asia University, Taichung 41354, Taiwan

36
Nguyen T, Nguyen GTT, Nguyen T, Le DH. Graph Convolutional Networks for Drug Response Prediction. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2022; 19:146-154. [PMID: 33606633 DOI: 10.1109/tcbb.2021.3060430] [Citation(s) in RCA: 39] [Impact Index Per Article: 19.5] [Indexed: 06/12/2023]
Abstract
BACKGROUND Drug response prediction is an important problem in computational personalized medicine. Many machine-learning-based methods, especially deep learning-based ones, have been proposed for this task. However, these methods often represent the drugs as strings, which are not a natural way to depict molecules. Also, interpretation (e.g., which mutations or copy number aberrations contribute to the drug response) has not been considered thoroughly. METHODS In this study, we propose a novel method, GraphDRP, based on graph convolutional networks for this problem. In GraphDRP, drugs were represented as molecular graphs directly capturing the bonds among atoms, while cell lines were depicted as binary vectors of genomic aberrations. Representative features of drugs and cell lines were learned by convolution layers, then combined to represent each drug-cell line pair. Finally, the response value of each drug-cell line pair was predicted by a fully-connected neural network. Four variants of graph convolutional networks were used for learning the features of drugs. RESULTS We found that GraphDRP outperforms tCNNS in all performance measures for all experiments. Also, through saliency maps of the resulting GraphDRP models, we discovered the contribution of the genomic aberrations to the responses. CONCLUSION Representing drugs as graphs can improve the performance of drug response prediction. Availability of data and materials: Data and source code can be downloaded at https://github.com/hauldhut/GraphDRP.

37
Dzobo K. The Role of Natural Products as Sources of Therapeutic Agents for Innovative Drug Discovery. COMPREHENSIVE PHARMACOLOGY 2022. [PMCID: PMC8016209 DOI: 10.1016/b978-0-12-820472-6.00041-4] [Citation(s) in RCA: 35] [Impact Index Per Article: 17.5] [Indexed: 11/04/2022]
Abstract
Emerging threats to human health require a concerted effort in search of both preventive and treatment strategies, placing natural products at the center of efforts to obtain new therapies and reduce disease spread and associated mortality. The therapeutic value of compounds found in plants has been known for ages, resulting in their utilization in homes and in clinics for the treatment of many ailments ranging from common headache to serious conditions such as wounds. Despite the advancement observed in the world, plant based medicines are still being used to treat many pathological conditions or are used as alternatives to modern medicines. In most cases, these natural products or plant-based medicines are used in an un-purified state as extracts. A lot of research is underway to identify and purify the active compounds responsible for the healing process. Some of the current drugs used in clinics have their origins as natural products or came from plant extracts. In addition, several synthetic analogues are natural product-based or plant-based. With the emergence of novel infectious agents such as the SARS-CoV-2 in addition to already burdensome diseases such as diabetes, cancer, tuberculosis and HIV/AIDS, there is need to come up with new drugs that can cure these conditions. Natural products offer an opportunity to discover new compounds that can be converted into drugs given their chemical structure diversity. Advances in analytical processes make drug discovery a multi-dimensional process involving computational designing and testing and eventual laboratory screening of potential drug candidates. Lead compounds will then be evaluated for safety, pharmacokinetics and efficacy. New technologies including Artificial Intelligence, better organ and tissue models such as organoids allow virtual screening, automation and high-throughput screening to be part of drug discovery. 
The use of bioinformatics and computation means that drug discovery can be a fast and efficient process and enable the use of natural products structures to obtain novel drugs. The removal of potential bottlenecks resulting in minimal false positive leads in drug development has enabled an efficient system of drug discovery. This review describes the biosynthesis and screening of natural products during drug discovery as well as methods used in studying natural products.

38
Karthikeyan A, Priyakumar UD. Artificial intelligence: machine learning for chemical sciences. J CHEM SCI 2021; 134:2. [PMID: 34955617 PMCID: PMC8691161 DOI: 10.1007/s12039-021-01995-2] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 07/09/2021] [Revised: 09/08/2021] [Accepted: 09/14/2021] [Indexed: 12/05/2022]
Abstract
Research in molecular sciences witnessed the rise and fall of Artificial Intelligence (AI)/Machine Learning (ML) methods, especially artificial neural networks, a few decades ago. However, we have seen a major resurgence in the use of modern ML methods in scientific research during the last few years. These methods have had phenomenal success in the areas of computer vision, speech recognition, natural language processing (NLP), etc. This has inspired chemists and biologists to apply these algorithms to problems in natural sciences. Availability of high performance Graphics Processing Unit (GPU) accelerators, large datasets, new algorithms, and libraries has enabled this surge. ML algorithms have successfully been applied to various domains in molecular sciences by providing much faster and sometimes more accurate solutions compared to traditional methods like Quantum Mechanical (QM) calculations, Density Functional Theory (DFT) or Molecular Mechanics (MM) based methods, etc. Some of the areas where ML methods have been shown to be effective are drug design, prediction of high-level quantum mechanical energies, molecular design, molecular dynamics, materials, and retrosynthesis of organic compounds. This article intends to conceptually introduce various modern ML methods and their relevance and applications in computational natural sciences. Synopsis: The recent surge in the application of machine learning (ML) methods in fundamental sciences has led to a perspective that these methods may become important tools in chemical science. This perspective provides an overview of the modern ML methods and their successful applications in chemistry during the last few years.
Affiliation(s)
- Akshaya Karthikeyan
- Center for Computational Natural Sciences and Bioinformatics, International Institute of Information Technology, Hyderabad, 500 032 India
- U Deva Priyakumar
- Center for Computational Natural Sciences and Bioinformatics, International Institute of Information Technology, Hyderabad, 500 032 India

39
Maggiora G, Vogt M. Set-Theoretic Formalism for Treating Ligand-Target Datasets. Molecules 2021; 26:molecules26247419. [PMID: 34946500 PMCID: PMC8704321 DOI: 10.3390/molecules26247419] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2021] [Revised: 11/16/2021] [Accepted: 12/03/2021] [Indexed: 11/20/2022] Open
Abstract
Data on ligand–target (LT) interactions has played a growing role in drug research for several decades. Even though the amount of data has grown significantly in size and coverage during this period, most datasets remain difficult to analyze because of their extreme sparsity, as there is no activity data whatsoever for many LT pairs. Even within clusters of data there tends to be a lack of data completeness, making the analysis of LT datasets problematic. The current effort extends earlier works on the development of set-theoretic formalisms for treating thresholded LT datasets. Unlike many approaches that do not address pairs of unknown interaction, the current work specifically takes account of their presence in addition to that of active and inactive pairs. Because a given LT pair can be in any one of three states, the binary logic of classical set-theoretic methods does not strictly apply. The current work develops a formalism, based on ternary set-theoretic relations, for treating thresholded LT datasets. It also describes an extension of the concept of data completeness, which is typically applied to sets of ligands and targets, to the local data completeness of individual ligands and targets. The set-theoretic formalism is applied to the analysis of simple and joint polypharmacologies based on LT activity profiles, and it is shown that null pairs provide a means for determining bounds to these values. The methodology is applied to a dataset of protein kinase inhibitors as an illustration of the method. Although not dealt with here, work is currently underway on a more refined treatment of activity values that is based on increasing the number of activity classes.
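One plausible reading of the three-state idea in this abstract can be sketched as follows. This is not the authors' formalism: the dictionary representation, the function names, and the specific definitions of local completeness and of the bounds are assumptions made for illustration. Each ligand-target pair is active (`True`), inactive (`False`), or unknown (absent from the dictionary).

```python
def local_completeness(data, ligand, targets):
    """Fraction of the given targets for which the ligand has a known state.

    `data` maps (ligand, target) pairs to True (active) or False (inactive);
    pairs missing from `data` are treated as unknown (null) pairs.
    """
    known = sum((ligand, target) in data for target in targets)
    return known / len(targets)


def active_target_bounds(data, ligand, targets):
    """Lower and upper bounds on the number of targets the ligand is active on.

    The lower bound counts confirmed actives; the upper bound additionally
    assumes that every unknown pair could turn out to be active.
    """
    active = sum(data.get((ligand, target)) is True for target in targets)
    unknown = sum((ligand, target) not in data for target in targets)
    return active, active + unknown


# Hypothetical thresholded dataset: one active pair, one inactive pair,
# and two pairs with no measurement at all.
lt_data = {("L1", "T1"): True, ("L1", "T2"): False}
panel = ["T1", "T2", "T3", "T4"]
```

Here `local_completeness(lt_data, "L1", panel)` is 0.5, and the ligand's target count lies between 1 and 3, which illustrates how null pairs widen the interval rather than being silently counted as inactive.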
Affiliation(s)
- Gerald Maggiora
- BIO5 Institute, University of Arizona, Tucson, AZ 85721, USA
- Martin Vogt
- Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Friedrich-Hirzebruch-Allee 5-6, D-53115 Bonn, Germany

40
Machine learning modelling of chemical reaction characteristics: yesterday, today, tomorrow. MENDELEEV COMMUNICATIONS 2021. [DOI: 10.1016/j.mencom.2021.11.003] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/23/2022]

41
Ghosh D, Koch U, Hadian K, Sattler M, Tetko IV. Highly Accurate Filters to Flag Frequent Hitters in AlphaScreen Assays by Suggesting their Mechanism. Mol Inform 2021; 41:e2100151. [PMID: 34676998 DOI: 10.1002/minf.202100151] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 06/05/2021] [Accepted: 09/29/2021] [Indexed: 11/06/2022]
Abstract
AlphaScreen is one of the most widely used assay technologies in drug discovery due to its versatility, dynamic range and sensitivity. However, the presence of false positives and frequent hitters makes the measured HTS data difficult to interpret. Although filters do exist to identify frequent hitters for AlphaScreen, they are frequently based on privileged scaffolds. The development of such filters is time consuming and requires deep domain knowledge. Recently, machine learning and artificial intelligence methods have emerged as important tools to advance drug discovery and chemoinformatics, including their application to the identification of frequent hitters in screening assays. However, the relative performance and complementarity of machine learning and scaffold-based techniques have not yet been comprehensively compared. In this study, we compared filters based on privileged scaffolds with filters built using machine learning. Our results demonstrate that machine-learning methods provide more accurate filters for identification of frequent hitters in AlphaScreen assays than scaffold-based methods and can be easily redeveloped once new data are measured. We present highly accurate models to identify frequent hitters in AlphaScreen assays.
Affiliation(s)
- Dipan Ghosh
- Lead Discovery Center GmbH, Otto-Hahn-Straße 15, 44227, Dortmund, Germany
- Uwe Koch
- Lead Discovery Center GmbH, Otto-Hahn-Straße 15, 44227, Dortmund, Germany
- Kamyar Hadian
- Assay Development and Screening Platform, Helmholtz Zentrum München - German Research Center for Environmental Health (GmbH), Ingolstädter Landstraße 1, D-85764, Neuherberg, Germany
- Michael Sattler
- Bavarian NMR Center, Department Chemie, Technische Universität München, Ernst-Otto-Fischerstraße 2, D-85747, Garching, Germany; Institute of Structural Biology, Helmholtz Zentrum München - German Research Center for Environmental Health (GmbH), Ingolstädter Landstraße 1, D-85764, Neuherberg, Germany
- Igor V Tetko
- Institute of Structural Biology, Helmholtz Zentrum München - German Research Center for Environmental Health (GmbH), Ingolstädter Landstraße 1, D-85764, Neuherberg, Germany; G.A. Krestov Institute of Solution Chemistry of the Russian Academy of Sciences, Akademicheskaya Street 1, 153045, Ivanovo, Russia; BIGCHEM GmbH, Valerystr. 49, D-85716, Unterschleißheim, Germany

42
Mermer A. The role of machine learning method in the synthesis and biological investigation of heterocyclic compounds. Mol Divers 2021; 26:1875-1892. [PMID: 34669112 DOI: 10.1007/s11030-021-10264-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/16/2021] [Accepted: 06/22/2021] [Indexed: 11/25/2022]
Abstract
Machine learning (ML) methods have attracted increasing interest in chemistry, as in all fields of science, in recent years. These methods are of great importance for the design of targeted bioactive compounds, especially because they avoid the loss of time, money, and chemicals. There are many online web-based platforms, such as LibSVM and OCHEM, for the application of ML methods. This paper examines the literature data on activity predictions of heterocyclic compounds, biological activity results such as antiurease, HIV-1 integrase, E. coli DNA gyrase B, and antifungal activity, pharmacophore-based studies, synthesis, and the search for possible inhibitors using different machine learning methods.
Affiliation(s)
- Arif Mermer
- Experimental Medicine Research and Application Center, University of Health Sciences Turkey, Uskudar, 34662, Istanbul, Turkey

43
Mak KK, Balijepalli MK, Pichika MR. Success stories of AI in drug discovery - where do things stand? Expert Opin Drug Discov 2021; 17:79-92. [PMID: 34553659 DOI: 10.1080/17460441.2022.1985108] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Indexed: 01/20/2023]
Abstract
INTRODUCTION Artificial intelligence (AI) in drug discovery and development (DDD) has gained more traction in the past few years. Many scientific reviews have already been made available in this area. Thus, in this review, the authors have focused on the success stories of AI-driven drug candidates and the scientometric analysis of the literature in this field. AREA COVERED The authors explore the literature to compile the success stories of AI-driven drug candidates that are currently being assessed in clinical trials or have investigational new drug (IND) status. The authors also provide the reader with their expert perspectives for future developments and their opinions on the field. EXPERT OPINION Partnerships between AI companies and the pharma industry are booming. The early signs of the impact of AI on DDD are encouraging, and the pharma industry is hoping for breakthroughs. AI can be a promising technology to unveil the greatest successes, but it has yet to be proven as AI is still at the embryonic stage.
Affiliation(s)
- Kit-Kay Mak
- School of Postgraduate Studies and Research, International Medical University, Bukit Jalil, Malaysia; Department of Pharmaceutical Chemistry, School of Pharmacy, International Medical University, Bukit Jalil, Malaysia; Centre for Bioactive Molecules and Drug Delivery, Institute for Research, Development, and Innovation (Irdi), International Medical University, Bukit Jalil, Malaysia
- Mallikarjuna Rao Pichika
- Department of Pharmaceutical Chemistry, School of Pharmacy, International Medical University, Bukit Jalil, Malaysia; Centre for Bioactive Molecules and Drug Delivery, Institute for Research, Development, and Innovation (Irdi), International Medical University, Bukit Jalil, Malaysia

44
Perpetuo L, Klein J, Ferreira R, Guedes S, Amado F, Leite-Moreira A, Silva AMS, Thongboonkerd V, Vitorino R. How can artificial intelligence be used for peptidomics? Expert Rev Proteomics 2021; 18:527-556. [PMID: 34343059 DOI: 10.1080/14789450.2021.1962303] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 01/01/2023]
Abstract
INTRODUCTION Peptidomics is an emerging field of omics sciences using advanced isolation, analysis, and computational techniques that enable qualitative and quantitative analyses of various peptides in biological samples. Peptides can act as useful biomarkers and as therapeutic molecules for diseases. AREAS COVERED The use of therapeutic peptides can be predicted quickly and efficiently using data-driven computational methods, particularly artificial intelligence (AI) approaches. Various AI approaches are useful for peptide-based drug discovery, such as support vector machine, random forest, extremely randomized trees, and other more recently developed deep learning methods. AI methods are relatively new to the development of peptide-based therapies, but these techniques have already become essential tools in protein science by dissecting novel therapeutic peptides and their functions (Figure 1). EXPERT OPINION Researchers have shown that AI models can facilitate the development of peptidomics and selective peptide therapies in the field of peptide science. Biopeptide prediction is important for the discovery and development of successful peptide-based drugs. Due to their ability to predict therapeutic roles based on sequence details, many AI-dependent prediction tools have been developed (Figure 1).
Affiliation(s)
- Luís Perpetuo
- iBiMED, Department of Medical Sciences, University of Aveiro, Aveiro
| | - Julie Klein
- Institut National de la Santé et de la Recherche Médicale (INSERM), U1297, Institute of Cardiovascular and Metabolic Disease, Université Toulouse III, Toulouse, France
| | - Rita Ferreira
- LAQV/REQUIMTE, Department of Chemistry, University of Aveiro, Aveiro
| | - Sofia Guedes
- LAQV/REQUIMTE, Department of Chemistry, University of Aveiro, Aveiro
| | - Francisco Amado
- LAQV/REQUIMTE, Department of Chemistry, University of Aveiro, Aveiro
| | - Adelino Leite-Moreira
- UnIC, Departamento de Cirurgia e Fisiologia, Faculdade de Medicina da Universidade do Porto, Porto
| | - Artur M S Silva
- LAQV/REQUIMTE, Department of Chemistry, University of Aveiro, Aveiro
| | - Visith Thongboonkerd
- Medical Proteomics Unit, Office for Research and Development, Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkok 10700, Thailand
| | - Rui Vitorino
- iBiMED, Department of Medical Sciences, University of Aveiro, Aveiro.,LAQV/REQUIMTE, Department of Chemistry, University of Aveiro, Aveiro.,UnIC, Departamento de Cirurgia e Fisiologia, Faculdade de Medicina da Universidade do Porto, Porto
| |
|
45
|
Piroozmand F, Mohammadipanah F, Sajedi H. Spectrum of deep learning algorithms in drug discovery. Chem Biol Drug Des 2021; 96:886-901. [PMID: 33058458 DOI: 10.1111/cbdd.13674] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 10/31/2019] [Revised: 02/11/2020] [Accepted: 02/19/2020] [Indexed: 12/16/2022]
Abstract
Deep learning (DL) algorithms are a subset of machine learning algorithms that aim to model complex mappings between a set of elements and their classes. In parallel with advances in revealing the molecular bases of diseases, notable innovation has been undertaken to apply DL in data/library management, reaction optimization, differentiating uncertainties, molecule construction, creating metrics from qualitative results, and prediction of structures or interactions. From source identification to lead discovery and medicinal chemistry of the drug candidate, drug delivery, and modification, the challenges can be subjected to artificial intelligence algorithms to aid in the generation and interpretation of data. Both discovery and design approaches demand automation, large-scale data management, and data fusion as high-throughput methods advance. The application of DL can accelerate the exploration of drug mechanisms, the finding of novel indications for existing drugs (drug repositioning), drug development, and preclinical and clinical studies. The impact of DL on the workflow of drug discovery, design, and their complementary tools is highlighted in this review. Additionally, the types of DL algorithms used for this purpose, their pros and cons, and the dominant directions of future research are presented.
Affiliation(s)
- Firoozeh Piroozmand
- Pharmaceutical Biotechnology Lab, Department of Microbiology, School of Biology and Center of Excellence in Phylogeny of Living Organisms, College of Science, University of Tehran, Tehran, Iran
| | - Fatemeh Mohammadipanah
- Pharmaceutical Biotechnology Lab, Department of Microbiology, School of Biology and Center of Excellence in Phylogeny of Living Organisms, College of Science, University of Tehran, Tehran, Iran
| | - Hedieh Sajedi
- Department of Computer Science, School of Mathematics, Statistics and Computer Science, College of Science, University of Tehran, Tehran, Iran
| |
|
46
|
Ahmad F, Mahmood A, Muhmood T. Machine learning-integrated omics for the risk and safety assessment of nanomaterials. Biomater Sci 2021; 9:1598-1608. [PMID: 33443512 DOI: 10.1039/d0bm01672a] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Indexed: 12/14/2022]
Abstract
With the advancement in nanotechnology, we are experiencing a transformation in world order with deep insemination of nanoproducts from basic necessities to advanced electronics, health care products and medicines. Nanoproducts, however, can have negative side effects and must be strictly monitored to avoid negative outcomes. Future toxicity and safety challenges regarding nanomaterial incorporation into consumer products, including the rapid addition of nanomaterials with diverse functionalities and attributes, highlight the limitations of traditional safety evaluation tools. Currently, artificial intelligence and machine learning algorithms are envisioned for enhancing and improving nano-bio-interaction simulation and modeling, and they extend to the post-marketing surveillance of nanomaterials in the real world. Thus, hyphenation of machine learning with biology and nanomaterials could provide exclusive insights into the perturbations of delicate biological functions after integration with nanomaterials. In this review, we discuss the potential of combining integrative omics with machine learning in profiling nanomaterial safety and risk assessment and provide guidance for regulatory authorities as well.
Affiliation(s)
- Farooq Ahmad
- College of Engineering and Applied Sciences, Nanjing National Laboratory of Microstructures, Jiangsu Key Laboratory of Artificial Functional Materials, Nanjing University, Nanjing, Jiangsu 210093, China.
| | - Asif Mahmood
- Beijing Key Laboratory of Photoelectronic/Electrophotonic Conversion Materials, School of Chemistry and Chemical Engineering, Beijing Institute of Technology, Beijing, 100081, China
| | - Tahir Muhmood
- State Key Lab of Metal Matrix Composites, School of Materials Science and Engineering, Shanghai Jiao Tong University, Shanghai, 200240, China
| |
|
47
|
Tarasova O, Poroikov V. Machine Learning in Discovery of New Antivirals and Optimization of Viral Infections Therapy. Curr Med Chem 2021; 28:7840-7861. [PMID: 33949929 DOI: 10.2174/0929867328666210504114351] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 11/10/2020] [Revised: 02/13/2021] [Accepted: 02/24/2021] [Indexed: 11/22/2022]
Abstract
Nowadays, computational approaches play an important role in the design of new drug-like compounds and optimization of pharmacotherapeutic treatment of diseases. The emerging growth of viral infections, including those caused by the Human Immunodeficiency Virus (HIV), Ebola virus, recently detected coronavirus, and some others, leads to many newly infected people with a high risk of death or severe complications. A huge amount of chemical, biological, and clinical data is at the disposal of researchers. Therefore, there are many opportunities to find relationships between particular features of chemical data and the antiviral activity of biologically active compounds based on machine learning approaches. Biological and clinical data can also be used for building models to predict relationships between viral genotype and drug resistance, which might help determine the clinical outcome of treatment. In the current study, we consider machine-learning approaches in the antiviral research carried out during the past decade. We overview in detail the application of machine-learning methods for the design of new potential antiviral agents and vaccines, drug resistance prediction, and analysis of virus-host interactions. Our review also covers the perspectives of using machine-learning approaches for antiviral research, including Dengue, Ebola viruses, Influenza A, Human Immunodeficiency Virus, coronaviruses, and some others.
Affiliation(s)
- Olga Tarasova
- Department of Bioinformatics, Institute of Biomedical Chemistry, Moscow, Russian Federation
| | - Vladimir Poroikov
- Department of Bioinformatics, Institute of Biomedical Chemistry, Moscow, Russian Federation
| |
|
48
|
Kimber TB, Chen Y, Volkamer A. Deep Learning in Virtual Screening: Recent Applications and Developments. Int J Mol Sci 2021; 22:4435. [PMID: 33922714 PMCID: PMC8123040 DOI: 10.3390/ijms22094435] [Citation(s) in RCA: 60] [Impact Index Per Article: 20.0] [Received: 03/18/2021] [Revised: 04/13/2021] [Accepted: 04/14/2021] [Indexed: 01/03/2023] Open
Abstract
Drug discovery is a cost- and time-intensive process that is often assisted by computational methods, such as virtual screening, to speed up and guide the design of new compounds. For many years, machine learning methods have been successfully applied in the context of computer-aided drug discovery. Recently, thanks to the rise of novel technologies as well as the increasing amount of available chemical and bioactivity data, deep learning has gained a tremendous impact in rational active compound discovery. Herein, recent applications and developments of machine learning, with a focus on deep learning, in virtual screening for active compound design are reviewed. This includes introducing different compound and protein encodings, deep learning techniques, and frequently used bioactivity and benchmark data sets for model training and testing. Finally, the present state of the art, including the current challenges and emerging problems, is examined and discussed.
Affiliation(s)
| | | | - Andrea Volkamer
- In Silico Toxicology and Structural Bioinformatics, Institute of Physiology, Charité-Universitätsmedizin Berlin, Charitéplatz 1, 10117 Berlin, Germany; (T.B.K.); (Y.C.)
| |
|
49
|
Marchetti F, Moroni E, Pandini A, Colombo G. Machine Learning Prediction of Allosteric Drug Activity from Molecular Dynamics. J Phys Chem Lett 2021; 12:3724-3732. [PMID: 33843228 PMCID: PMC8154828 DOI: 10.1021/acs.jpclett.1c00045] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 01/06/2021] [Accepted: 04/05/2021] [Indexed: 05/13/2023]
Abstract
Allosteric drugs have been attracting increasing interest over the past few years. In this context, it is common practice to use high-throughput screening for the discovery of non-natural allosteric drugs. While the discovery stage is supported by a growing amount of biological information and increasing computing power, major challenges still remain in selecting allosteric ligands and predicting their effect on the target protein's function. Indeed, allosteric compounds can act both as inhibitors and activators of biological responses. Computational approaches to the problem have focused on variations on the theme of molecular docking coupled to molecular dynamics with the aim of recovering information on the (long-range) modulation typical of allosteric proteins.
Affiliation(s)
- Filippo Marchetti
- Department of Chemistry, Università Degli Studi di Pavia, Viale Taramelli 12, 27100 Pavia, Italy
- Università Degli Studi di Milano, Via C. Golgi, 19, I-20133 Milan, Italy
| | - Elisabetta Moroni
- Istituto di Scienze e Tecnologie Chimiche, Via Mario Bianco 9, 20131 Milano, Italy
| | | | - Giorgio Colombo
- Department of Chemistry, Università Degli Studi di Pavia, Viale Taramelli 12, 27100 Pavia, Italy
- Istituto di Scienze e Tecnologie Chimiche, Via Mario Bianco 9, 20131 Milano, Italy
| |
|
50
|
Baskin II. Practical constraints with machine learning in drug discovery. Expert Opin Drug Discov 2021; 16:929-931. [PMID: 33605818 DOI: 10.1080/17460441.2021.1887133] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/22/2022]
Affiliation(s)
- Igor I Baskin
- Department of Materials Science and Engineering, Technion - Israel Institute of Technology, Haifa, Israel.,Butlerov Institute of Chemistry, Kazan Federal University, Kazan, Russia
| |
|