1
Tutone M, Almerico AM. Computational Approaches and Drug Discovery: Where Are We Going? Molecules 2024; 29:969. [PMID: 38474481 DOI: 10.3390/molecules29050969] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/19/2023] [Accepted: 02/21/2024] [Indexed: 03/14/2024]
Abstract
Science is a point of view [...].
Affiliation(s)
- Marco Tutone
- Dipartimento di Scienze e Tecnologie Biologiche Chimiche e Farmaceutiche (STEBICEF), Università degli Studi di Palermo, Via Archirafi 32, 90123 Palermo, Italy
- Anna Maria Almerico
- Dipartimento di Scienze e Tecnologie Biologiche Chimiche e Farmaceutiche (STEBICEF), Università degli Studi di Palermo, Via Archirafi 32, 90123 Palermo, Italy

2
Sanchez AJ, Maier S, Raghavachari K. Leveraging DFT and Molecular Fragmentation for Chemically Accurate pKa Prediction Using Machine Learning. J Chem Inf Model 2024; 64:712-723. [PMID: 38301279 DOI: 10.1021/acs.jcim.3c01923] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/03/2024]
Abstract
We present a quantum mechanical/machine learning (ML) framework based on random forest to accurately predict the pKas of complex organic molecules using inexpensive density functional theory (DFT) calculations. By including physics-based features from low-level DFT calculations and structural features from our connectivity-based hierarchy (CBH) fragmentation protocol, we can correct the systematic error associated with DFT. The generalizability and performance of our model are evaluated on two benchmark sets (SAMPL6 and Novartis). We believe the carefully curated input of physics-based features lessens the model's data dependence and need for complex deep learning architectures, without compromising the accuracy of the test sets. As a point of novelty, our work extends the applicability of CBH, employing it for the generation of viable molecular descriptors for ML.
Affiliation(s)
- Alec J Sanchez
- Department of Chemistry, Indiana University, Bloomington, Indiana 47405, United States
- Sarah Maier
- Department of Chemistry, Indiana University, Bloomington, Indiana 47405, United States
- Krishnan Raghavachari
- Department of Chemistry, Indiana University, Bloomington, Indiana 47405, United States
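The error-correction idea in the abstract of Sanchez et al. (learn a correction that maps inexpensive DFT pKa estimates toward experimental values) can be illustrated with a much simpler stand-in. The sketch below is not the authors' method: it swaps their random forest and CBH descriptors for a one-feature least-squares line, and every number in it is made-up toy data chosen only to show the mechanics.

```python
from statistics import mean


def fit_linear_correction(dft_pkas, exp_pkas):
    """Fit exp ~ a * dft + b by ordinary least squares on a single feature.

    Stand-in for the random forest described in the abstract; a real model
    would use many physics-based and structural features.
    """
    x_bar, y_bar = mean(dft_pkas), mean(exp_pkas)
    sxx = sum((x - x_bar) ** 2 for x in dft_pkas)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(dft_pkas, exp_pkas))
    a = sxy / sxx
    b = y_bar - a * x_bar
    return a, b


def apply_correction(dft_pkas, a, b):
    """Apply the fitted linear correction to raw DFT pKa estimates."""
    return [a * x + b for x in dft_pkas]


def mean_absolute_error(predicted, reference):
    """Mean absolute error between two equally long sequences."""
    return mean(abs(p - r) for p, r in zip(predicted, reference))


if __name__ == "__main__":
    # Toy values, NOT taken from the paper or from any benchmark set.
    dft = [2.0, 4.0, 6.0, 8.0]
    exp = [3.0, 4.0, 5.0, 6.0]
    a, b = fit_linear_correction(dft, exp)
    corrected = apply_correction(dft, a, b)
    print("raw MAE:", mean_absolute_error(dft, exp))
    print("corrected MAE:", mean_absolute_error(corrected, exp))
```

The demo scores the correction on the same points it was fitted to, which flatters it; the abstract reports evaluation on separate benchmark sets (SAMPL6 and Novartis) instead.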

3
Chen LY, Li YP. Enhancing chemical synthesis: a two-stage deep neural network for predicting feasible reaction conditions. J Cheminform 2024; 16:11. [PMID: 38268009 PMCID: PMC11301986 DOI: 10.1186/s13321-024-00805-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/22/2023] [Accepted: 01/14/2024] [Indexed: 01/26/2024]
Abstract
In the field of chemical synthesis planning, the accurate recommendation of reaction conditions is essential for achieving successful outcomes. This work introduces an innovative deep learning approach designed to address the complex task of predicting appropriate reagents, solvents, and reaction temperatures for chemical reactions. Our proposed methodology combines a multi-label classification model with a ranking model to offer tailored reaction condition recommendations based on relevance scores derived from anticipated product yields. To tackle the challenge of limited data for unfavorable reaction contexts, we employed the technique of hard negative sampling to generate reaction conditions that might be mistakenly classified as suitable, forcing the model to refine its decision boundaries, especially in challenging cases. Our developed model excels in proposing conditions where an exact match to the recorded solvents and reagents is found within the top-10 predictions 73% of the time. It also predicts temperatures within ±20 °C of the recorded temperature in 89% of test cases. Notably, the model demonstrates its capacity to recommend multiple viable reaction conditions, with accuracy varying based on the availability of condition records associated with each reaction. What sets this model apart is its ability to suggest alternative reaction conditions beyond the constraints of the dataset. This underscores its potential to inspire innovative approaches in chemical research, presenting a compelling opportunity for advancing chemical synthesis planning and elevating the field of reaction engineering. Scientific contribution: The combination of multi-label classification and ranking models provides tailored recommendations for reaction conditions based on the reaction yields. A novel approach is presented to address the issue of data scarcity in negative reaction conditions through data augmentation.
Affiliation(s)
- Lung-Yi Chen
- Department of Chemical Engineering, National Taiwan University, No. 1, Sec. 4, Roosevelt Road, Taipei, 10617, Taiwan
- Yi-Pei Li
- Department of Chemical Engineering, National Taiwan University, No. 1, Sec. 4, Roosevelt Road, Taipei, 10617, Taiwan
- Taiwan International Graduate Program on Sustainable Chemical Science and Technology (TIGP-SCST), No. 128, Sec. 2, Academia Road, Taipei, 11529, Taiwan
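The two headline numbers in the abstract of Chen and Li (an exact match within the top-10 predictions, and a temperature within a fixed tolerance of the recorded value) are both simple hit rates. The following is a minimal sketch of how such metrics could be computed. The function names and the representation of a condition as a set of solvent and reagent names are assumptions made for illustration, not the authors' evaluation code.

```python
def top_k_exact_match(ranked_predictions, recorded, k=10):
    """Fraction of reactions whose recorded condition set appears,
    as an exact set match, among the first k ranked predictions.

    ranked_predictions: one list per reaction, best prediction first;
                        each prediction is an iterable of names.
    recorded:           one iterable of names per reaction.
    """
    hits = 0
    for predictions, truth in zip(ranked_predictions, recorded):
        truth_set = frozenset(truth)
        if any(frozenset(p) == truth_set for p in predictions[:k]):
            hits += 1
    return hits / len(recorded)


def within_tolerance(predicted_temps, recorded_temps, tolerance=20.0):
    """Fraction of predictions whose absolute error is at most `tolerance`
    (same unit as the inputs, e.g. degrees Celsius)."""
    hits = sum(
        abs(p - r) <= tolerance
        for p, r in zip(predicted_temps, recorded_temps)
    )
    return hits / len(recorded_temps)


if __name__ == "__main__":
    # Hypothetical toy predictions, not data from the paper.
    ranked = [
        [{"THF", "NaH"}, {"DMF", "K2CO3"}],
        [{"MeOH"}],
    ]
    truth = [{"DMF", "K2CO3"}, {"EtOH"}]
    print(top_k_exact_match(ranked, truth, k=10))
    print(within_tolerance([25, 80, 120], [30, 110, 100]))
```

Comparing conditions as sets makes the match insensitive to the order in which solvents and reagents are listed. Whether the published evaluation treats order or duplicates the same way is not stated in the abstract.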

4
Heifetz A. Accelerating COVID-19 Drug Discovery with High-Performance Computing. Methods Mol Biol 2024; 2716:405-411. [PMID: 37702951 DOI: 10.1007/978-1-0716-3449-3_19] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/14/2023]
Abstract
The recent COVID-19 pandemic has served as a timely reminder that existing drug discovery is a laborious, expensive, and slow process. Never has there been such global demand for a therapeutic treatment to be identified as a matter of such urgency. Unfortunately, this is a scenario likely to repeat itself in the future, so it is of interest to explore ways in which to accelerate drug discovery at pandemic speed. Computational methods naturally lend themselves to this because they can be performed rapidly if sufficient computational resources are available. Recently, high-performance computing (HPC) technologies have led to remarkable achievements in computational drug discovery and yielded a series of new platforms, algorithms, and workflows. The application of artificial intelligence (AI) and machine learning (ML) approaches is also a promising and relatively new avenue to revolutionize the drug design process and therefore reduce costs. In this review, I describe how molecular dynamics (MD) simulations were successfully integrated with ML and adapted to HPC to form a powerful tool to study inhibitors for four of the COVID-19 target proteins. The emphasis of this review is on the strategy that was used, with an explanation of each of the steps in the accelerated drug discovery workflow. For specific technical details, the reader is directed to the relevant research publications.

5
Kotev M, Diaz Gonzalez C. Molecular Dynamics and Other HPC Simulations for Drug Discovery. Methods Mol Biol 2024; 2716:265-291. [PMID: 37702944 DOI: 10.1007/978-1-0716-3449-3_12] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/14/2023]
Abstract
High performance computing (HPC) is taking an increasingly important place in drug discovery. It makes possible the simulation of complex biochemical systems with high precision in a short time, thanks to the use of sophisticated algorithms. It promotes the advancement of knowledge in fields that are inaccessible or difficult to access through experimentation and it contributes to accelerating the discovery of drugs for unmet medical needs while reducing costs. Herein, we report how computational performance has evolved over the past years, and then we detail three domains where HPC is essential. Molecular dynamics (MD) is commonly used to explore the flexibility of proteins, thus generating a better understanding of different possible approaches to modulate their activity. Modeling and simulation of biopolymer complexes enables the study of protein-protein interactions (PPI) in healthy and disease states, thus helping the identification of targets of pharmacological interest. Virtual screening (VS) also benefits from HPC to predict in a short time, among millions or billions of virtual chemical compounds, the best potential ligands that will be tested in relevant assays to start a rational drug design process.
Affiliation(s)
- Martin Kotev
- Evotec SE, Integrated Drug Discovery, Molecular Architects, Campus Curie, Toulouse, France

6
Thomson TM. On the importance for drug discovery of a transnational Latin American database of natural compound structures. Front Pharmacol 2023; 14:1207559. [PMID: 37426821 PMCID: PMC10324963 DOI: 10.3389/fphar.2023.1207559] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/17/2023] [Accepted: 06/15/2023] [Indexed: 07/11/2023]
Affiliation(s)
- Timothy M. Thomson
- Institute for Molecular Biology (IBMB-CSIC), Barcelona, Spain
- CIBER de Enfermedades Hepáticas y Digestivas (CIBERehd), Madrid, Spain
- Universidad Peruana Cayetano Heredia, Lima, Peru

7
David L, Mdahoma A, Singh N, Buchoux S, Pihan E, Diaz C, Rabal O. A toolkit for covalent docking with GOLD: from automated ligand preparation with KNIME to bound protein-ligand complexes. Bioinformatics Advances 2022; 2:vbac090. [PMID: 36699353 PMCID: PMC9722222 DOI: 10.1093/bioadv/vbac090] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/16/2022] [Revised: 10/22/2022] [Accepted: 11/28/2022] [Indexed: 12/02/2022]
Abstract
Motivation: Current covalent docking tools have limitations that make them difficult to use for performing large-scale structure-based covalent virtual screening (VS). They require time-consuming tasks for the preparation of proteins and compounds (standardization, filtering according to the type of warheads), as well as for setting up covalent reactions. We have developed a toolkit to help accelerate drug discovery projects in the phases of hit identification by VS of ultra-large covalent libraries and hit expansion by exploration of the binding of known covalent compounds. With this application note, we offer the community a toolkit for performing automated covalent docking in a fast and efficient way.
Results: The toolkit comprises a KNIME workflow for ligand preparation and a Python program to perform the covalent docking of ligands with the GOLD docking engine running in a parallelized fashion.
Availability and implementation: The KNIME workflow entitled 'Evotec_Covalent_Processing_forGOLD.knwf' for the preparation of the ligands is available in the KNIME Hub https://hub.knime.com/emilie_pihan/spaces.
Supplementary information: Supplementary data are available at Bioinformatics Advances online.
Affiliation(s)
- Emilie Pihan
- Evotec SE, Molecular Architects, Integrated Drug Discovery, Toulouse 31036, France
- Constantino Diaz
- Evotec SE, Molecular Architects, Integrated Drug Discovery, Toulouse 31036, France

8
Sharma T, Saralamma VVG, Lee DC, Imran MA, Choi J, Baig MH, Dong JJ. Combining structure-based pharmacophore modeling and machine learning for the identification of novel BTK inhibitors. Int J Biol Macromol 2022; 222:239-250. [PMID: 36130643 DOI: 10.1016/j.ijbiomac.2022.09.151] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 07/20/2022] [Revised: 09/13/2022] [Accepted: 09/16/2022] [Indexed: 11/05/2022]
Abstract
Bruton's tyrosine kinase (BTK) is a critical enzyme involved in multiple signaling pathways that regulate cellular survival, activation, and proliferation, making it a major cancer therapeutic target. We applied a novel integrated approach combining structure-based pharmacophore modeling, machine learning, and other in silico studies to screen the Korean chemical database (KCB) and identify potential BTK inhibitors (BTKi). Further evaluation of these inhibitors on three different human cancer cell lines showed significant cell growth inhibitory activity. Among the 13 compounds shortlisted, four demonstrated consistent cell inhibition activity among breast, gastric, and lung cancer cells (IC50 below 3 μM). The selected compounds also showed significant kinase inhibition activity (IC50 below 5 μM). The current study suggests the potential of these inhibitors for targeting BTK malignant tumors.
Affiliation(s)
- Tanuj Sharma
- Department of Family Medicine, Gangnam Severance Hospital, Yonsei University College of Medicine, Gangnam-gu, Seoul 120-752, Republic of Korea
- Venu Venkatarame Gowda Saralamma
- Department of Family Medicine, Gangnam Severance Hospital, Yonsei University College of Medicine, Gangnam-gu, Seoul 120-752, Republic of Korea
- Duk Chul Lee
- Department of Family Medicine, Gangnam Severance Hospital, Yonsei University College of Medicine, Gangnam-gu, Seoul 120-752, Republic of Korea
- Mohammad Azhar Imran
- Department of Family Medicine, Gangnam Severance Hospital, Yonsei University College of Medicine, Gangnam-gu, Seoul 120-752, Republic of Korea
- Jaehyuk Choi
- BNJBiopharma, 2nd floor Memorial Hall, 85, Songdogwahak-ro, Yeonsu-gu, Incheon 21983, Republic of Korea
- Mohammad Hassan Baig
- Department of Family Medicine, Gangnam Severance Hospital, Yonsei University College of Medicine, Gangnam-gu, Seoul 120-752, Republic of Korea
- Jae-June Dong
- Department of Family Medicine, Gangnam Severance Hospital, Yonsei University College of Medicine, Gangnam-gu, Seoul 120-752, Republic of Korea

9
Identification of Pharmacophoric Fragments of DYRK1A Inhibitors Using Machine Learning Classification Models. Molecules 2022; 27:1753. [PMID: 35335117 PMCID: PMC8954712 DOI: 10.3390/molecules27061753] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/24/2022] [Revised: 03/04/2022] [Accepted: 03/05/2022] [Indexed: 11/17/2022]
Abstract
Dual-specificity tyrosine phosphorylation-regulated kinase 1A (DYRK1A) has been regarded as a potential therapeutic target for neurodegenerative diseases, and considerable progress has been made in the discovery of DYRK1A inhibitors. Identification of pharmacophoric fragments provides valuable information for structure- and fragment-based design of potent and selective DYRK1A inhibitors. In this study, seven machine learning methods along with five molecular fingerprints were employed to develop qualitative classification models of DYRK1A inhibitors, which were evaluated by cross-validation, a test set, and an external validation set with four performance indicators: predictive classification accuracy (CA), the area under the receiver operating characteristic curve (AUC), Matthews correlation coefficient (MCC), and balanced accuracy (BA). The PubChem fingerprint-support vector machine model (CA = 0.909, AUC = 0.933, MCC = 0.717, BA = 0.855) and the PubChem fingerprint-artificial neural network model (CA = 0.862, AUC = 0.911, MCC = 0.705, BA = 0.870) were considered the optimal models for the training set and test set, respectively. A hybrid data balancing method, SMOTETL, a combination of the synthetic minority over-sampling technique (SMOTE) and Tomek link (TL) algorithms, was applied to explore the impact of balanced learning on the performance of the models. Based on frequency analysis and information gain, pharmacophoric fragments related to DYRK1A inhibition were also identified. These results provide theoretical support and clues for the screening and design of novel DYRK1A inhibitors.
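The four performance indicators named in this abstract (CA, AUC, MCC, BA) have standard definitions for binary classification, and a short sketch may help readers who want to check such figures on their own predictions. The code below is an illustrative pure-Python implementation under the usual textbook definitions. The function names and the toy labels are assumptions, not material from the paper, and both classes are assumed to be present in the labels.

```python
from math import sqrt


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels coded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    return tp, tn, fp, fn


def classification_metrics(y_true, y_pred):
    """Classification accuracy, balanced accuracy, and Matthews correlation."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    ca = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)  # recall on the positive class
    specificity = tn / (tn + fp)  # recall on the negative class
    ba = (sensitivity + specificity) / 2
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention MCC is reported as 0 when the denominator vanishes.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"CA": ca, "BA": ba, "MCC": mcc}


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random
    negative, counting ties as one half (pairwise formulation)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy labels and scores for illustration only.
    y_true = [1, 1, 1, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
    scores = [0.9, 0.8, 0.3, 0.1, 0.2, 0.4, 0.05, 0.7]
    print(classification_metrics(y_true, y_pred))
    print(roc_auc(y_true, scores))
```

On imbalanced sets such as typical inhibitor data, CA can look strong while BA and MCC stay modest, which is presumably why the abstract reports all four together and also explores SMOTE with Tomek links for rebalancing.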

10
Zhao B, Kurgan L. Deep learning in prediction of intrinsic disorder in proteins. Comput Struct Biotechnol J 2022; 20:1286-1294. [PMID: 35356546 PMCID: PMC8927795 DOI: 10.1016/j.csbj.2022.03.003] [Citation(s) in RCA: 23] [Impact Index Per Article: 11.5] [Received: 01/10/2022] [Revised: 03/04/2022] [Accepted: 03/04/2022] [Indexed: 12/12/2022]
Abstract
Intrinsic disorder prediction is an active research area in which over 100 predictors have been developed. We identify and investigate a recent trend towards the development of deep neural network (DNN)-based methods. The first DNN-based method was released in 2013, and since 2019 deep learners account for the majority of new disorder predictors. We find that the 13 currently available DNN-based predictors are diverse in their topologies, the sizes of their networks, and the inputs that they utilize. We empirically show that the deep learners are statistically more accurate than other types of disorder predictors using the blind test dataset from the recent community assessment of intrinsic disorder predictions (CAID). We also identify several well-rounded DNN-based predictors that are accurate, fast and/or conveniently available. The popularity, favorable predictive performance, and architectural flexibility suggest that deep networks are likely to fuel the development of future disorder predictors. Novel hybrid designs of deep networks could be used to adequately accommodate the diversity of types and flavors of intrinsic disorder. We also discuss the scarcity of DNN-based methods for the prediction of disordered binding regions and the need to develop more accurate methods for this prediction.
Affiliation(s)
- Bi Zhao
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
- Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA