1
Liu H, Chen P, Hu B, Wang S, Wang H, Luan J, Wang J, Lin B, Cheng M. FaissMolLib: An efficient and easy deployable tool for ligand-based virtual screening. Comput Biol Chem 2024; 110:108057. [PMID: 38581840 DOI: 10.1016/j.compbiolchem.2024.108057] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/23/2023] [Revised: 03/06/2024] [Accepted: 03/20/2024] [Indexed: 04/08/2024]
Abstract
Virtual screening based on molecular similarity and fingerprints is crucial in drug design, target prediction, and ADMET prediction, aiding in the identification of potential hits and the optimization of lead compounds. However, comprehensive open-source molecular fingerprint databases and efficient search methods for virtual screening are still lacking. To address these issues, we introduce FaissMolLib, an open-source virtual screening tool that integrates 2.8 million compounds from the ChEMBL and ZINC databases. Notably, FaissMolLib employs the highly efficient Faiss search algorithm, which outperforms the Tanimoto algorithm in identifying similar molecules, as shown by tighter clustering in scatter plots and lower mean, standard deviation, and variance in key molecular properties. Faiss enables FaissMolLib to screen 2.8 million compounds in just 0.05 seconds, offering researchers an efficient, easily deployable solution for virtual screening on laptops and for building their own compound databases. This advance holds great potential for accelerating drug discovery and enhancing chemical data analysis. FaissMolLib is freely available at http://liuhaihan.gnway.cc:80. The code and dataset of FaissMolLib are freely available at https://github.com/Superhaihan/FiassMolLib.
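For readers unfamiliar with the Tanimoto baseline mentioned in this abstract, the following is a minimal sketch of an exhaustive Tanimoto search over bit fingerprints. It is not FaissMolLib's code: the fingerprints are toy bit sets stored as Python integers, and the names `fingerprint`, `top_k`, and `mol_a` to `mol_c` are hypothetical. How FaissMolLib encodes molecules and configures its Faiss index is not stated in the abstract.

```python
from heapq import nlargest


def fingerprint(on_bits):
    """Pack a list of set-bit positions into one Python int (toy fingerprint)."""
    fp = 0
    for bit in on_bits:
        fp |= 1 << bit
    return fp


def tanimoto(a: int, b: int) -> float:
    """Tanimoto coefficient: shared on-bits divided by on-bits in either."""
    union = bin(a | b).count("1")
    if union == 0:
        return 0.0  # convention chosen here for two empty fingerprints
    return bin(a & b).count("1") / union


def top_k(query, library, k=3):
    """Exhaustive scan: score every library entry and keep the k best."""
    scored = ((tanimoto(query, fp), name) for name, fp in library.items())
    return nlargest(k, scored)


# Hypothetical three-molecule library; real libraries hold millions of entries.
library = {
    "mol_a": fingerprint([1, 4, 9, 16]),
    "mol_b": fingerprint([1, 4, 9, 25]),
    "mol_c": fingerprint([2, 3, 5, 7]),
}
query = fingerprint([1, 4, 9, 16])
hits = top_k(query, library, k=2)
print(hits)
```

The scan touches every fingerprint once per query, so its cost grows linearly with library size; vector-search libraries such as Faiss are designed to make this kind of nearest-neighbour lookup fast over large collections.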
Affiliation(s)
- Haihan Liu
  - Key Laboratory of Structure-Based Drug Design & Discovery of Ministry of Education, Shenyang Pharmaceutical University, Shenyang 110016, People's Republic of China; Key Laboratory of Intelligent Drug Design and New Drug Discovery of Liaoning Province, Shenyang Pharmaceutical University, Shenyang 110016, People's Republic of China; School of Pharmaceutical Engineering, Shenyang Pharmaceutical University, Shenyang 110016, People's Republic of China
- Peiying Chen
  - Key Laboratory of Structure-Based Drug Design & Discovery of Ministry of Education, Shenyang Pharmaceutical University, Shenyang 110016, People's Republic of China; Key Laboratory of Intelligent Drug Design and New Drug Discovery of Liaoning Province, Shenyang Pharmaceutical University, Shenyang 110016, People's Republic of China; School of Pharmaceutical Engineering, Shenyang Pharmaceutical University, Shenyang 110016, People's Republic of China
- Baichun Hu
  - Key Laboratory of Structure-Based Drug Design & Discovery of Ministry of Education, Shenyang Pharmaceutical University, Shenyang 110016, People's Republic of China; Key Laboratory of Intelligent Drug Design and New Drug Discovery of Liaoning Province, Shenyang Pharmaceutical University, Shenyang 110016, People's Republic of China; School of Pharmaceutical Engineering, Shenyang Pharmaceutical University, Shenyang 110016, People's Republic of China
- Shizun Wang
  - Key Laboratory of Structure-Based Drug Design & Discovery of Ministry of Education, Shenyang Pharmaceutical University, Shenyang 110016, People's Republic of China; Key Laboratory of Intelligent Drug Design and New Drug Discovery of Liaoning Province, Shenyang Pharmaceutical University, Shenyang 110016, People's Republic of China; School of Pharmaceutical Engineering, Shenyang Pharmaceutical University, Shenyang 110016, People's Republic of China
- Hanxun Wang
  - Key Laboratory of Structure-Based Drug Design & Discovery of Ministry of Education, Shenyang Pharmaceutical University, Shenyang 110016, People's Republic of China; Key Laboratory of Intelligent Drug Design and New Drug Discovery of Liaoning Province, Shenyang Pharmaceutical University, Shenyang 110016, People's Republic of China; School of Pharmaceutical Engineering, Shenyang Pharmaceutical University, Shenyang 110016, People's Republic of China
- Jiasi Luan
  - Key Laboratory of Structure-Based Drug Design & Discovery of Ministry of Education, Shenyang Pharmaceutical University, Shenyang 110016, People's Republic of China; Key Laboratory of Intelligent Drug Design and New Drug Discovery of Liaoning Province, Shenyang Pharmaceutical University, Shenyang 110016, People's Republic of China; School of Medical Devices, Shenyang Pharmaceutical University, Shenyang 110016, People's Republic of China
- Jian Wang
  - Key Laboratory of Structure-Based Drug Design & Discovery of Ministry of Education, Shenyang Pharmaceutical University, Shenyang 110016, People's Republic of China; Key Laboratory of Intelligent Drug Design and New Drug Discovery of Liaoning Province, Shenyang Pharmaceutical University, Shenyang 110016, People's Republic of China; School of Pharmaceutical Engineering, Shenyang Pharmaceutical University, Shenyang 110016, People's Republic of China
- Bin Lin
  - Key Laboratory of Structure-Based Drug Design & Discovery of Ministry of Education, Shenyang Pharmaceutical University, Shenyang 110016, People's Republic of China; Key Laboratory of Intelligent Drug Design and New Drug Discovery of Liaoning Province, Shenyang Pharmaceutical University, Shenyang 110016, People's Republic of China; School of Pharmaceutical Engineering, Shenyang Pharmaceutical University, Shenyang 110016, People's Republic of China
- Maosheng Cheng
  - Key Laboratory of Structure-Based Drug Design & Discovery of Ministry of Education, Shenyang Pharmaceutical University, Shenyang 110016, People's Republic of China; Key Laboratory of Intelligent Drug Design and New Drug Discovery of Liaoning Province, Shenyang Pharmaceutical University, Shenyang 110016, People's Republic of China; School of Pharmaceutical Engineering, Shenyang Pharmaceutical University, Shenyang 110016, People's Republic of China
2
Bhattacharjee S, Saha B, Saha S. Symptom-based drug prediction of lifestyle-related chronic diseases using unsupervised machine learning techniques. Comput Biol Med 2024; 174:108413. [PMID: 38608323 DOI: 10.1016/j.compbiomed.2024.108413] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/27/2023] [Revised: 02/13/2024] [Accepted: 04/03/2024] [Indexed: 04/14/2024]
Abstract
BACKGROUND AND OBJECTIVES: Lifestyle-related diseases (LSDs) impose a substantial economic burden on patients and health care services. LSDs are chronic in nature and can directly affect the heart and lungs. Therapeutic interventions based only on symptoms can be crucial for prompt treatment initiation in LSDs, as symptoms are the first information available to clinicians. This work therefore aims to apply unsupervised machine learning (ML) techniques to develop models that predict drugs from symptoms for LSDs, with a specific focus on pulmonary and heart diseases.
METHODS: The drug-disease and disease-symptom associations of 143 LSDs, 1271 drugs, and 305 symptoms were used to compute direct associations between drugs and symptoms. ML models based on four different algorithms (K-Means, Bisecting K-Means, Mean Shift, and Balanced Iterative Reducing and Clustering using Hierarchies (BIRCH)) were developed to cluster the drugs using symptoms as features. The optimal model was saved on a server, and a web application was developed to perform predictions with it.
RESULTS: The Bisecting K-Means model showed the best performance, with a silhouette coefficient of 0.647, and generated 138 drug clusters. The drugs within the optimal clusters showed good similarity based on i) gene ontology annotations of the gene targets, ii) chemical ontology annotations, and iii) maximum common substructure of the drugs. In the web application, the model also provides a confidence score for each drug predicted from a new set of input symptoms.
CONCLUSION: In summary, direct associations between drugs and symptoms were computed and used to develop a symptom-based drug prediction tool for LSDs with unsupervised ML models. The ML-based prediction can provide a second opinion to clinicians to aid their decision-making for early treatment of LSD patients. The web application (URL: http://bicresources.jcbose.ac.in/ssaha4/sdldpred) provides a simple interface for all end-users to perform the ML-based prediction.
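The silhouette coefficient used above to select the best clustering can be illustrated with a small sketch. This is a plain-Python rendering of the standard definition, not the authors' pipeline: the four two-dimensional points stand in for hypothetical drug feature vectors, Euclidean distance is an assumption, and the function name `silhouette_score` is chosen here for illustration.

```python
from math import dist
from statistics import mean


def silhouette_score(points, labels):
    """Mean silhouette over all points; needs at least two distinct labels."""
    scores = []
    for i, (p, lab) in enumerate(zip(points, labels)):
        # a: mean distance to the other members of the point's own cluster
        same = [dist(p, q) for j, (q, l) in enumerate(zip(points, labels))
                if l == lab and j != i]
        if not same:
            scores.append(0.0)  # singleton clusters score 0 by convention
            continue
        a = mean(same)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            mean(dist(p, q) for q, l in zip(points, labels) if l == other)
            for other in set(labels) if other != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return mean(scores)


# Hypothetical feature vectors: two tight pairs that are far apart.
points = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0)]
good = silhouette_score(points, [0, 0, 1, 1])  # pairs kept together
bad = silhouette_score(points, [0, 1, 0, 1])   # pairs split apart
print(round(good, 3), round(bad, 3))
```

Values near 1 indicate compact, well-separated clusters, values near 0 indicate overlapping clusters, and negative values suggest that points sit closer to a neighbouring cluster than to their own.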
Affiliation(s)
- Sudipto Bhattacharjee
  - Department of Computer Science and Engineering, University of Calcutta, JD-2, Sector-III, Salt Lake, Kolkata, 700098, India.
- Banani Saha
  - Department of Computer Science and Engineering, University of Calcutta, JD-2, Sector-III, Salt Lake, Kolkata, 700098, India.
- Sudipto Saha
  - Department of Biological Sciences, Bose Institute, EN 80, Sector V, Bidhan Nagar, Kolkata, 700091, India.
3
Sulimov AV, Ilin IS, Tashchilova AS, Kondakova OA, Kutov DC, Sulimov VB. Docking and other computing tools in drug design against SARS-CoV-2. SAR and QSAR in Environmental Research 2024; 35:91-136. [PMID: 38353209 DOI: 10.1080/1062936x.2024.2306336] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2023] [Accepted: 01/10/2024] [Indexed: 02/16/2024]
Abstract
Computer simulation methods have become an indispensable component in identifying drugs against the SARS-CoV-2 coronavirus. There is a huge body of literature on the application of molecular modelling to predict inhibitors of SARS-CoV-2 target proteins. To keep our review clear and readable, we limited ourselves primarily to works that use computational methods to find inhibitors and test the predicted compounds experimentally, either in target protein assays or in cell culture with live SARS-CoV-2. Some works that report experimental discovery of such inhibitors without computer modelling are included as examples of success. Some computational works without experimental confirmation are also included if their simulation methods or the databases they use attracted our attention. This review collects studies that use various molecular modelling methods: docking, molecular dynamics, quantum mechanics, machine learning, and others. Most of these studies are based on docking, and other methods are used mainly for post-processing to select the best compounds among those found through docking. Simulation methods are presented concisely, information is also provided on databases of organic compounds that can be useful for virtual screening, and the review itself is structured according to the coronavirus target proteins.
Affiliation(s)
- A V Sulimov
  - Dimonta Ltd., Moscow, Russia
  - Research Computing Center, Lomonosov Moscow State University, Moscow, Russia
- I S Ilin
  - Research Computing Center, Lomonosov Moscow State University, Moscow, Russia
- A S Tashchilova
  - Dimonta Ltd., Moscow, Russia
  - Research Computing Center, Lomonosov Moscow State University, Moscow, Russia
- O A Kondakova
  - Research Computing Center, Lomonosov Moscow State University, Moscow, Russia
- D C Kutov
  - Dimonta Ltd., Moscow, Russia
  - Research Computing Center, Lomonosov Moscow State University, Moscow, Russia
- V B Sulimov
  - Dimonta Ltd., Moscow, Russia
  - Research Computing Center, Lomonosov Moscow State University, Moscow, Russia
4
Kosonocky CW, Feller AL, Wilke CO, Ellington AD. Using alternative SMILES representations to identify novel functional analogues in chemical similarity vector searches. Patterns (New York, N.Y.) 2023; 4:100865. [PMID: 38106612 PMCID: PMC10724362 DOI: 10.1016/j.patter.2023.100865] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/17/2023] [Revised: 08/09/2023] [Accepted: 10/06/2023] [Indexed: 12/19/2023]
Abstract
Chemical similarity searches are a widely used family of in silico methods for identifying pharmaceutical leads. These methods historically relied on structure-based comparisons to compute similarity. Here, we use a chemical language model to create a vector-based chemical search. We extend previous implementations by creating a prompt engineering strategy that utilizes two different chemical string representation algorithms: one for the query and the other for the database. We explore this method by reviewing search results from nine queries with diverse targets. We find that the method identifies molecules with similar patent-derived functionality to the query, as determined by our validated LLM-assisted patent summarization pipeline. Further, many of these functionally similar molecules have different structures and scaffolds from the query, making them unlikely to be found with traditional chemical similarity searches. This method may serve as a new tool for the discovery of novel molecular structural classes that achieve target functionality.
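The vector-based search described in this abstract can be pictured as nearest-neighbour ranking over embeddings. The sketch below shows only that ranking step with cosine similarity. The three-dimensional vectors and the names `cand_1` to `cand_3` are hypothetical placeholders, the chemical language model that would produce real embeddings is not reproduced, and the choice of cosine similarity is an assumption rather than a detail taken from the abstract.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = sqrt(sum(x * x for x in u))
    norm_v = sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def search(query_vec, database, k=2):
    """Return the names of the k database entries most similar to the query."""
    ranked = sorted(database.items(),
                    key=lambda item: cosine(query_vec, item[1]),
                    reverse=True)
    return [name for name, _ in ranked[:k]]


# Placeholder embeddings; in the described method the query and the database
# molecules would be embedded from two different string representations.
database = {
    "cand_1": [0.9, 0.1, 0.0],
    "cand_2": [0.0, 1.0, 0.0],
    "cand_3": [0.7, 0.7, 0.1],
}
query_vec = [1.0, 0.0, 0.0]
results = search(query_vec, database)
print(results)
```

Because ranking depends only on the embedding geometry, two molecules with different scaffolds can still be neighbours if the model places them close together, which is the behaviour the abstract reports.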
Affiliation(s)
- Clayton W. Kosonocky
  - Department of Molecular Biosciences, University of Texas at Austin, Austin, TX 78705, USA
- Aaron L. Feller
  - Department of Molecular Biosciences, University of Texas at Austin, Austin, TX 78705, USA
- Claus O. Wilke
  - Department of Integrative Biology, University of Texas at Austin, Austin, TX 78705, USA
- Andrew D. Ellington
  - Department of Molecular Biosciences, University of Texas at Austin, Austin, TX 78705, USA
  - Center for Systems and Synthetic Biology, University of Texas at Austin, Austin, TX 78705, USA
5
Jiang W, Chen J, Zhang P, Zheng N, Ma L, Zhang Y, Zhang H. Repurposing Drugs for Inhibition against ALDH2 via a 2D/3D Ligand-Based Similarity Search and Molecular Simulation. Molecules 2023; 28:7325. [PMID: 37959744 PMCID: PMC10650273 DOI: 10.3390/molecules28217325] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/12/2023] [Revised: 10/22/2023] [Accepted: 10/27/2023] [Indexed: 11/15/2023]
Abstract
Aldehyde dehydrogenase-2 (ALDH2) is a crucial enzyme in intracellular aldehyde metabolism and is acknowledged as a potential therapeutic target for the treatment of alcohol use disorder and other addictive behaviors. Using the previously reported ALDH2 inhibitors Daidzin, CVT-10216, and CHEMBL114083 as reference molecules, we perform a ligand-based virtual screening of world-approved drugs via 2D/3D similarity search methods, followed by molecular docking, toxicity prediction, molecular simulation, and molecular mechanics Poisson-Boltzmann surface area (MM-PBSA) analysis. The 2D molecular fingerprints ECFP4 and FCFP4 and the 3D molecule-shape-based USRCAT method perform well in selecting compounds that bind strongly to ALDH2. Three compounds, Zeaxanthin (q = 0), Troglitazone (q = 0), and Saquinavir (q = +1 e), are singled out as potential inhibitors; Zeaxanthin is hit only via USRCAT. These drugs bind more strongly than the reported potent inhibitor CVT-10216. Sarizotan (q = +1 e) and Netarsudil (q = 0/+1 e) also bind strongly to ALDH2, but they penetrate only shallowly into the substrate-binding tunnel of ALDH2 and cannot fully occupy it. This likely leaves space for substrate binding, so they are not ideal inhibitors. The MM-PBSA results indicate that binding of the negatively charged compounds selected by the similarity search and Vina scoring is thermodynamically unfavorable, mainly due to electrostatic repulsion with the receptor (q = -6 e for ALDH2). Electrostatic attraction with positively charged compounds, however, yields very strong binding to ALDH2. These findings reveal a deficiency in the modeling of electrostatic interactions (in particular, between charged moieties) in virtual screening via the 2D/3D similarity search and molecular docking with the Vina scoring system.
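The charge argument in this abstract is easier to follow with the usual MM-PBSA decomposition written out. The equations below are the commonly used general form, given here as background rather than as the authors' exact protocol; whether the entropy term was evaluated, and which parameters were used, is not stated in the abstract.

```latex
\begin{align}
\Delta G_{\mathrm{bind}} &= \langle G_{\mathrm{complex}} \rangle
  - \langle G_{\mathrm{receptor}} \rangle - \langle G_{\mathrm{ligand}} \rangle \\
G_{x} &= E_{\mathrm{MM}} + G_{\mathrm{PB}} + G_{\mathrm{SA}} - T S \\
E_{\mathrm{MM}} &= E_{\mathrm{bonded}} + E_{\mathrm{ele}} + E_{\mathrm{vdW}},
\qquad
E_{\mathrm{ele}} \propto \sum_{i<j} \frac{q_i q_j}{\varepsilon \, r_{ij}}
\end{align}
```

Each Coulomb term takes the sign of the charge product $q_i q_j$. With a net receptor charge of $-6\,e$, the ligand's net charge therefore tends to add a repulsive (positive) contribution when it is negative and an attractive (negative) one when it is positive, which matches the trend reported for the charged compounds.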
Affiliation(s)
- Haiyang Zhang
  - Department of Biological Science and Engineering, School of Chemistry and Biological Engineering, University of Science and Technology Beijing, Beijing 100083, China
6
Wu S, Pan Z, Li X, Wang Y, Tang J, Li H, Lu G, Li J, Feng Z, He Y, Liu X. Machine Learning Assisted Photothermal Conversion Efficiency Prediction of Anticancer Photothermal Agents. Chem Eng Sci 2023. [DOI: 10.1016/j.ces.2023.118619] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/11/2023]
7
Bajusz D, Keserű GM. Maximizing the integration of virtual and experimental screening in hit discovery. Expert Opin Drug Discov 2022; 17:629-640. [PMID: 35671403 DOI: 10.1080/17460441.2022.2085685] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/04/2022]
Abstract
INTRODUCTION: Experimental and virtual screening contribute to the discovery of more than 50% of clinical candidates. Given their similar concepts and goals, early-phase drug discovery would benefit from the effective integration of these approaches.
AREAS COVERED: After reviewing recent trends in both experimental and virtual screening, the authors discuss different integration strategies, including parallel, focused, sequential, and iterative screening. Strategic considerations are demonstrated in a number of real-life case studies.
EXPERT OPINION: Experimental and virtual screening are complementary approaches that should be integrated in lead discovery settings. Virtual screening can access an extremely large, synthetically feasible chemical space that can be searched effectively on GPU clusters or cloud architectures. Experimental screening provides reliable datasets through quantitative HTS applications, and DNA-encoded libraries (DEL) have enlarged the chemical space covered by these technologies. These developments, together with the use of artificial intelligence methods, represent new options for their efficient integration. The case studies discussed here demonstrate the benefits of complementary strategies, such as focused and iterative screening.
Affiliation(s)
- Dávid Bajusz
  - Medicinal Chemistry Research Group, Research Centre for Natural Sciences, Budapest, Hungary
- György M Keserű
  - Medicinal Chemistry Research Group, Research Centre for Natural Sciences, Budapest, Hungary