1
Lu X, Xie L, Xu L, Mao R, Xu X, Chang S. Multimodal fused deep learning for drug property prediction: Integrating chemical language and molecular graph. Comput Struct Biotechnol J 2024; 23:1666-1679. [PMID: 38680871 PMCID: PMC11046066 DOI: 10.1016/j.csbj.2024.04.030] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/24/2024] [Revised: 04/01/2024] [Accepted: 04/10/2024] [Indexed: 05/01/2024] Open
Abstract
Accurately predicting molecular properties is a challenging but essential task in drug discovery. Recently, many mono-modal deep learning methods have been successfully applied to molecular property prediction. However, mono-modal learning is inherently limited as it relies solely on a single modality of molecular representation, which restricts a comprehensive understanding of drug molecules. To overcome the limitations, we propose a multimodal fused deep learning (MMFDL) model to leverage information from different molecular representations. Specifically, we construct a triple-modal learning model by employing Transformer-Encoder, Bidirectional Gated Recurrent Unit (BiGRU), and graph convolutional network (GCN) to process three modalities of information from chemical language and molecular graph: SMILES-encoded vectors, ECFP fingerprints, and molecular graphs, respectively. We evaluate the proposed triple-modal model using five fusion approaches on six molecule datasets, including Delaney, Llinas2020, Lipophilicity, SAMPL, BACE, and pKa from DataWarrior. The results show that the MMFDL model achieves the highest Pearson coefficients, and stable distribution of Pearson coefficients in the random splitting test, outperforming mono-modal models in accuracy and reliability. Furthermore, we validate the generalization ability of our model in the prediction of binding constants for protein-ligand complex molecules, and assess the resilience capability against noise. Through analysis of feature distributions in chemical space and the assigned contribution of each modal model, we demonstrate that the MMFDL model shows the ability to acquire complementary information by using proper models and suitable fusion approaches. By leveraging diverse sources of bioinformatics information, multimodal deep learning models hold the potential for successful drug discovery.
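As a rough illustration of the kind of evaluation described in this abstract, the sketch below combines per-modality predictions with a weighted average and scores the fused output with the Pearson coefficient. The function names `pearson_r` and `fuse_predictions`, the weights, and the toy values are all illustrative assumptions; the abstract does not specify the five fusion approaches, and this is not the authors' implementation.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def fuse_predictions(per_modality, weights):
    """Weighted average of per-modality predictions (a simple late fusion).

    per_modality: one list of predictions per modality, all the same length.
    weights: one weight per modality; they are normalised by their sum.
    """
    if len(per_modality) != len(weights):
        raise ValueError("one weight is required per modality")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n = len(per_modality[0])
    if any(len(preds) != n for preds in per_modality):
        raise ValueError("all modalities must predict the same molecules")
    return [
        sum(w * preds[i] for w, preds in zip(weights, per_modality)) / total
        for i in range(n)
    ]


# Hypothetical toy predictions for four molecules from three modalities.
smiles_pred = [0.9, 2.1, 2.9, 4.2]   # stand-in for the SMILES branch
ecfp_pred = [1.2, 1.8, 3.3, 3.9]     # stand-in for the ECFP branch
graph_pred = [1.0, 2.0, 3.1, 4.0]    # stand-in for the graph branch
measured = [1.0, 2.0, 3.0, 4.0]      # made-up experimental values

fused = fuse_predictions([smiles_pred, ecfp_pred, graph_pred], [0.3, 0.3, 0.4])
print("Pearson r of fused predictions:", round(pearson_r(fused, measured), 3))
```

In a learned fusion the weights would be fitted rather than fixed by hand; the fixed weights here only serve to show how a per-modality contribution enters the fused prediction.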
Affiliation(s)
- Xiaohua Lu
  - Institute of Bioinformatics and Medical Engineering, Jiangsu University of Technology, Changzhou 213001, China
- Liangxu Xie
  - Institute of Bioinformatics and Medical Engineering, Jiangsu University of Technology, Changzhou 213001, China
- Lei Xu
  - Institute of Bioinformatics and Medical Engineering, Jiangsu University of Technology, Changzhou 213001, China
- Rongzhi Mao
  - Institute of Bioinformatics and Medical Engineering, Jiangsu University of Technology, Changzhou 213001, China
- Xiaojun Xu
  - Institute of Bioinformatics and Medical Engineering, Jiangsu University of Technology, Changzhou 213001, China
- Shan Chang
  - Institute of Bioinformatics and Medical Engineering, Jiangsu University of Technology, Changzhou 213001, China
2
Gillani M, Pollastri G. Protein subcellular localization prediction tools. Comput Struct Biotechnol J 2024; 23:1796-1807. [PMID: 38707539 PMCID: PMC11066471 DOI: 10.1016/j.csbj.2024.04.032] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/13/2024] [Revised: 04/11/2024] [Accepted: 04/11/2024] [Indexed: 05/07/2024] Open
Abstract
Protein subcellular localization prediction is of great significance in bioinformatics and biological research. Because most proteins lack experimentally determined localization information, computational prediction methods and tools have been an active research area for more than two decades. Knowledge of the subcellular location of a protein provides valuable information about its functionalities, the functioning of the cell, and other possible interactions with proteins. Fast, reliable, and accurate predictors provide platforms to harness the abundance of sequence data to predict subcellular locations accordingly. During the last decade, there has been a considerable amount of research effort aimed at developing subcellular localization predictors. This paper reviews recent subcellular localization prediction tools in the Eukaryotic, Prokaryotic, and Virus-based categories followed by a detailed analysis. Each predictor is discussed based on its main features, strengths, weaknesses, algorithms used, prediction techniques, and analysis. This review is supported by prediction tools taxonomies that highlight their relevant area and examples for uncomplicated categorization and ease of understandability. These taxonomies help users find suitable tools according to their needs. Furthermore, recent research gaps and challenges are discussed to cover areas that need the utmost attention. This survey provides an in-depth analysis of the most recent prediction tools to facilitate readers and can be considered a quick guide for researchers to identify and explore the recent literature advancements.
Affiliation(s)
- Maryam Gillani
  - School of Computer Science, University College Dublin (UCD), Dublin, D04 V1W8, Ireland
- Gianluca Pollastri
  - School of Computer Science, University College Dublin (UCD), Dublin, D04 V1W8, Ireland
3
Shi W, Yang H, Xie L, Yin XX, Zhang Y. A review of machine learning-based methods for predicting drug-target interactions. Health Inf Sci Syst 2024; 12:30. [PMID: 38617016 PMCID: PMC11014838 DOI: 10.1007/s13755-024-00287-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/11/2023] [Accepted: 03/04/2024] [Indexed: 04/16/2024] Open
Abstract
The prediction of drug-target interactions (DTI) is a crucial preliminary stage in drug discovery and development, given the substantial risk of failure and the prolonged validation period associated with in vitro and in vivo experiments. In the contemporary landscape, various machine learning-based methods have emerged as indispensable tools for DTI prediction. This paper begins by placing emphasis on the data representation employed by these methods, delineating five representations for drugs and four for proteins. The methods are then categorized into traditional machine learning-based approaches and deep learning-based ones, with a discussion of representative approaches in each category and the introduction of a novel taxonomy for deep neural network models in DTI prediction. Additionally, we present a synthesis of commonly used datasets and evaluation metrics to facilitate practical implementation. In conclusion, we address current challenges and outline potential future directions in this research field.
Affiliation(s)
- Wen Shi
  - Cyberspace Institute of Advanced Technology, Guangzhou University, Guangzhou, 510006 China
  - School of Computer Science and Technology, Zhejiang Normal University, Jinhua, 321004 China
- Hong Yang
  - Cyberspace Institute of Advanced Technology, Guangzhou University, Guangzhou, 510006 China
- Linhai Xie
  - State Key Laboratory of Proteomics, National Center for Protein Sciences (Beijing), Beijing, 102206 China
- Xiao-Xia Yin
  - Cyberspace Institute of Advanced Technology, Guangzhou University, Guangzhou, 510006 China
- Yanchun Zhang
  - School of Computer Science and Technology, Zhejiang Normal University, Jinhua, 321004 China
  - Department of New Networks, Peng Cheng Laboratory, Shenzhen, 518000 China
4
Nayarisseri A, Abdalla M, Joshi I, Yadav M, Bhrdwaj A, Chopra I, Khan A, Saxena A, Sharma K, Panicker A, Panwar U, Mendonça Junior FJB, Singh SK. Potential inhibitors of VEGFR1, VEGFR2, and VEGFR3 developed through Deep Learning for the treatment of Cervical Cancer. Sci Rep 2024; 14:13251. [PMID: 38858458 PMCID: PMC11164920 DOI: 10.1038/s41598-024-63762-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/15/2023] [Accepted: 05/31/2024] [Indexed: 06/12/2024] Open
Abstract
Cervical cancer stands as a prevalent gynaecologic malignancy affecting women globally, often linked to persistent human papillomavirus infection. Biomarkers associated with cervical cancer, including VEGF-A, VEGF-B, VEGF-C, VEGF-D, and VEGF-E, show upregulation and are linked to angiogenesis and lymphangiogenesis. This research aims to employ in-silico methods to target the tyrosine kinase receptor proteins VEGFR-1, VEGFR-2, and VEGFR-3, and to identify novel inhibitors for Vascular Endothelial Growth Factor receptors (VEGFRs). A comprehensive literature study was conducted which identified 26 established inhibitors for VEGFR-1, VEGFR-2, and VEGFR-3 receptor proteins. Compounds with high-affinity scores, including PubChem ID-25102847, 369976, and 208908 were chosen from pre-existing compounds for creating Deep Learning-based models. RD-Kit, a Deep learning algorithm, was used to generate 43 million compounds for VEGFR-1, VEGFR-2, and VEGFR-3 targets. Molecular docking studies were conducted on the top 10 molecules for each target to validate the receptor-ligand binding affinity. The results of Molecular Docking indicated that PubChem IDs-71465,645 and 11152946 exhibited strong affinity, designating them as the most efficient molecules. To further investigate their potential, a Molecular Dynamics Simulation was performed to assess conformational stability, and a pharmacophore analysis was also conducted for indoctrinating interactions.
Affiliation(s)
- Anuraj Nayarisseri
  - In silico Research Laboratory, Eminent Biosciences, 91, Sector-A, Mahalakshmi Nagar, Indore, Madhya Pradesh, 452010, India.
  - Bioinformatics Research Laboratory, LeGene Biosciences Pvt Ltd, 91, Sector-A, Mahalakshmi Nagar, Indore, Madhya Pradesh, 452010, India.
- Mohnad Abdalla
  - Key Laboratory of Chemical Biology (Ministry of Education), Department of Pharmaceutics, School of Pharmaceutical Sciences, Cheeloo College of Medicine, Shandong University, 44 Cultural West Road, Jinan, 250012, Shandong Province, People's Republic of China
- Isha Joshi
  - In silico Research Laboratory, Eminent Biosciences, 91, Sector-A, Mahalakshmi Nagar, Indore, Madhya Pradesh, 452010, India
- Manasi Yadav
  - In silico Research Laboratory, Eminent Biosciences, 91, Sector-A, Mahalakshmi Nagar, Indore, Madhya Pradesh, 452010, India
- Anushka Bhrdwaj
  - In silico Research Laboratory, Eminent Biosciences, 91, Sector-A, Mahalakshmi Nagar, Indore, Madhya Pradesh, 452010, India
  - Computer Aided Drug Designing and Molecular Modeling Lab, Department of Bioinformatics, Alagappa University, Karaikudi, Tamil Nadu, 630003, India
- Ishita Chopra
  - In silico Research Laboratory, Eminent Biosciences, 91, Sector-A, Mahalakshmi Nagar, Indore, Madhya Pradesh, 452010, India
  - School of Medicine and Health Sciences, The George Washington University, Ross Hall, 2300 Eye Street, Washington, D.C., NW, 20037, USA
- Arshiya Khan
  - In silico Research Laboratory, Eminent Biosciences, 91, Sector-A, Mahalakshmi Nagar, Indore, Madhya Pradesh, 452010, India
  - Computer Aided Drug Designing and Molecular Modeling Lab, Department of Bioinformatics, Alagappa University, Karaikudi, Tamil Nadu, 630003, India
- Arshiya Saxena
  - In silico Research Laboratory, Eminent Biosciences, 91, Sector-A, Mahalakshmi Nagar, Indore, Madhya Pradesh, 452010, India
- Khushboo Sharma
  - In silico Research Laboratory, Eminent Biosciences, 91, Sector-A, Mahalakshmi Nagar, Indore, Madhya Pradesh, 452010, India
  - Computer Aided Drug Designing and Molecular Modeling Lab, Department of Bioinformatics, Alagappa University, Karaikudi, Tamil Nadu, 630003, India
- Aravind Panicker
  - In silico Research Laboratory, Eminent Biosciences, 91, Sector-A, Mahalakshmi Nagar, Indore, Madhya Pradesh, 452010, India
- Umesh Panwar
  - Computer Aided Drug Designing and Molecular Modeling Lab, Department of Bioinformatics, Alagappa University, Karaikudi, Tamil Nadu, 630003, India
- Sanjeev Kumar Singh
  - Computer Aided Drug Designing and Molecular Modeling Lab, Department of Bioinformatics, Alagappa University, Karaikudi, Tamil Nadu, 630003, India.
5
Kairys V, Baranauskiene L, Kazlauskiene M, Zubrienė A, Petrauskas V, Matulis D, Kazlauskas E. Recent advances in computational and experimental protein-ligand affinity determination techniques. Expert Opin Drug Discov 2024; 19:649-670. [PMID: 38715415 DOI: 10.1080/17460441.2024.2349169] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/18/2024] [Accepted: 04/25/2024] [Indexed: 05/22/2024]
Abstract
INTRODUCTION: Modern drug discovery revolves around designing ligands that target the chosen biomolecule, typically proteins. For this, the evaluation of affinities of putative ligands is crucial. This has given rise to a multitude of dedicated computational and experimental methods that are constantly being developed and improved. AREAS COVERED: In this review, the authors reassess both the industry mainstays and the newest trends among the methods for protein-small-molecule affinity determination. They discuss both computational affinity predictions and experimental techniques, describing their basic principles, main limitations, and advantages. Together, this serves as an initial guide to the currently most popular and cutting-edge ligand-binding assays employed in rational drug design. EXPERT OPINION: The affinity determination methods continue to develop toward miniaturization, high-throughput, and in-cell application. Moreover, the availability of data analysis tools has been constantly increasing. Nevertheless, cross-verification of data using at least two different techniques and careful result interpretation remain of utmost importance.
Affiliation(s)
- Visvaldas Kairys
  - Department of Bioinformatics, Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania
- Lina Baranauskiene
  - Department of Biothermodynamics and Drug Design, Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania
- Asta Zubrienė
  - Department of Biothermodynamics and Drug Design, Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania
- Vytautas Petrauskas
  - Department of Biothermodynamics and Drug Design, Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania
- Daumantas Matulis
  - Department of Biothermodynamics and Drug Design, Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania
- Egidijus Kazlauskas
  - Department of Biothermodynamics and Drug Design, Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania
6
Xu Z, Li W, Dong X, Chen Y, Zhang D, Wang J, Zhou L, He G. Precision medicine in colorectal cancer: Leveraging multi-omics, spatial omics, and artificial intelligence. Clin Chim Acta 2024; 559:119686. [PMID: 38663471 DOI: 10.1016/j.cca.2024.119686] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/27/2023] [Revised: 04/22/2024] [Accepted: 04/22/2024] [Indexed: 05/03/2024]
Abstract
Colorectal cancer (CRC) is a leading cause of cancer-related deaths. Recent advancements in genomic technologies and analytical approaches have revolutionized CRC research, enabling precision medicine. This review highlights the integration of multi-omics, spatial omics, and artificial intelligence (AI) in advancing precision medicine for CRC. Multi-omics approaches have uncovered molecular mechanisms driving CRC progression, while spatial omics have provided insights into the spatial heterogeneity of gene expression in CRC tissues. AI techniques have been utilized to analyze complex datasets, identify new treatment targets, and enhance diagnosis and prognosis. Despite the tumor's heterogeneity and genetic and epigenetic complexity, the fusion of multi-omics, spatial omics, and AI shows the potential to overcome these challenges and advance precision medicine in CRC. The future lies in integrating these technologies to provide deeper insights and enable personalized therapies for CRC patients.
Affiliation(s)
- Zishan Xu
  - Department of Pathology, Xinxiang Medical University, Xinxiang 453000, China
- Wei Li
  - School of Forensic Medicine, Xinxiang Medical University, Xinxiang 453000, China
- Xiangyang Dong
  - Department of Pathology, Xinxiang Medical University, Xinxiang 453000, China
- Yingying Chen
  - School of Basic Medical Sciences, Xinxiang Medical University, Xinxiang 453000, China
- Dan Zhang
  - Department of Pathology, Xinxiang Medical University, Xinxiang 453000, China
- Jingnan Wang
  - Xinxiang Medical University SanQuan Medical College, Xinxiang 453003, China
- Lin Zhou
  - Department of Breast and Thyroid Surgery, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430022, China.
- Guoyang He
  - Department of Pathology, Xinxiang Medical University, Xinxiang 453000, China.
7
Khan MK, Raza M, Shahbaz M, Hussain I, Khan MF, Xie Z, Shah SSA, Tareen AK, Bashir Z, Khan K. The recent advances in the approach of artificial intelligence (AI) towards drug discovery. Front Chem 2024; 12:1408740. [PMID: 38882215 PMCID: PMC11176507 DOI: 10.3389/fchem.2024.1408740] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/29/2024] [Accepted: 04/26/2024] [Indexed: 06/18/2024] Open
Abstract
Artificial intelligence (AI) has recently emerged as a unique developmental influence that is playing an important role in the development of medicine. The AI medium is showing potential for unprecedented advancements in truth and efficiency. The intersection of AI has the potential to revolutionize drug discovery. However, AI also has limitations, and experts should be aware of its data access and ethical issues. The use of AI techniques for drug discovery applications has increased considerably over the past few years, including combinatorial QSAR and QSPR, virtual screening, and de novo drug design. The purpose of this survey is to give a general overview of drug discovery based on artificial intelligence and its associated applications. We also highlight the gaps present in the traditional method for drug designing. In addition, potential strategies and approaches to overcome current challenges are discussed to address the constraints of AI within this field. We hope that this survey plays a comprehensive role in understanding the potential of AI in drug discovery.
Affiliation(s)
- Mahroza Kanwal Khan
  - College of Chemistry and Environmental Engineering, Shenzhen University, Shenzhen, China
- Mohsin Raza
  - Additive Manufacturing Institute, Shenzhen University, Shenzhen, China
- Muhammad Shahbaz
  - Additive Manufacturing Institute, Shenzhen University, Shenzhen, China
- Iftikhar Hussain
  - Department of Mechanical Engineering, City University of Hong Kong, Kowloon, Hong Kong SAR, China
  - A. J. Drexel Nanomaterials Institute and Department of Materials Science and Engineering, Drexel University, Philadelphia, PA, United States
- Muhammad Farooq Khan
  - Department of Electrical Engineering, Sejong University, Seoul, Republic of Korea
- Zhongjian Xie
  - Shenzhen Children's Hospital, Clinical Medical College of Southern University of Science and Technology, Shenzhen, China
- Syed Shoaib Ahmad Shah
  - Department of Chemistry, School of Natural Sciences, National University of Sciences and Technology, Islamabad, Pakistan
- Ayesha Khan Tareen
  - School of Mechanical Engineering, Dongguan University of Technology, Dongguan, China
- Zoobia Bashir
  - College of Chemistry and Environmental Engineering, Shenzhen University, Shenzhen, China
- Karim Khan
  - Additive Manufacturing Institute, Shenzhen University, Shenzhen, China
8
López-López E, Medina-Franco JL. Toward structure-multiple activity relationships (SMARts) using computational approaches: A polypharmacological perspective. Drug Discov Today 2024; 29:104046. [PMID: 38810721 DOI: 10.1016/j.drudis.2024.104046] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/06/2024] [Revised: 05/13/2024] [Accepted: 05/22/2024] [Indexed: 05/31/2024]
Abstract
In the current era of biological big data, which are rapidly populating the biological chemical space, in silico polypharmacology drug design approaches help to decode structure-multiple activity relationships (SMARts). Current computational methods can predict or categorize multiple properties simultaneously, which aids the generation, identification, curation, prioritization, optimization, and repurposing of molecules. Computational methods have generated opportunities and challenges in medicinal chemistry, pharmacology, food chemistry, toxicology, bioinformatics, and chemoinformatics. It is anticipated that computer-guided SMARts could contribute to the full automatization of drug design and drug repurposing campaigns, facilitating the prediction of new biological targets, side and off-target effects, and drug-drug interactions.
Affiliation(s)
- Edgar López-López
  - Department of Chemistry and Graduate Program in Pharmacology, Center for Research and Advanced Studies of the National Polytechnic Institute, Section 14-740, Mexico City 07000, Mexico; DIFACQUIM Research Group, Department of Pharmacy, School of Chemistry, Universidad Nacional Autónoma de México, Mexico City 04510, Mexico.
- José L Medina-Franco
  - DIFACQUIM Research Group, Department of Pharmacy, School of Chemistry, Universidad Nacional Autónoma de México, Mexico City 04510, Mexico.
9
Tang Q, Ratnayake R, Seabra G, Jiang Z, Fang R, Cui L, Ding Y, Kahveci T, Bian J, Li C, Luesch H, Li Y. Morphological profiling for drug discovery in the era of deep learning. Brief Bioinform 2024; 25:bbae284. [PMID: 38886164 PMCID: PMC11182685 DOI: 10.1093/bib/bbae284] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/09/2024] [Revised: 05/13/2024] [Accepted: 06/03/2024] [Indexed: 06/20/2024] Open
Abstract
Morphological profiling is a valuable tool in phenotypic drug discovery. The advent of high-throughput automated imaging has enabled the capturing of a wide range of morphological features of cells or organisms in response to perturbations at the single-cell resolution. Concurrently, significant advances in machine learning and deep learning, especially in computer vision, have led to substantial improvements in analyzing large-scale high-content images at high throughput. These efforts have facilitated understanding of compound mechanism of action, drug repurposing, characterization of cell morphodynamics under perturbation, and ultimately contributing to the development of novel therapeutics. In this review, we provide a comprehensive overview of the recent advances in the field of morphological profiling. We summarize the image profiling analysis workflow, survey a broad spectrum of analysis strategies encompassing feature engineering- and deep learning-based approaches, and introduce publicly available benchmark datasets. We place a particular emphasis on the application of deep learning in this pipeline, covering cell segmentation, image representation learning, and multimodal learning. Additionally, we illuminate the application of morphological profiling in phenotypic drug discovery and highlight potential challenges and opportunities in this field.
Affiliation(s)
- Qiaosi Tang
  - Calico Life Sciences, South San Francisco, CA 94080, United States
- Ranjala Ratnayake
  - Department of Medicinal Chemistry, Center for Natural Products, Drug Discovery and Development, University of Florida, Gainesville, FL 32610, United States
- Gustavo Seabra
  - Department of Medicinal Chemistry, Center for Natural Products, Drug Discovery and Development, University of Florida, Gainesville, FL 32610, United States
- Zhe Jiang
  - Department of Computer & Information Science & Engineering, University of Florida, Gainesville, FL 32611, United States
- Ruogu Fang
  - Department of Computer & Information Science & Engineering, University of Florida, Gainesville, FL 32611, United States
  - J. Crayton Pruitt Family Department of Biomedical Engineering, Herbert Wertheim College of Engineering, University of Florida, Gainesville, FL 32611, United States
- Lina Cui
  - Department of Medicinal Chemistry, Center for Natural Products, Drug Discovery and Development, University of Florida, Gainesville, FL 32610, United States
- Yousong Ding
  - Department of Medicinal Chemistry, Center for Natural Products, Drug Discovery and Development, University of Florida, Gainesville, FL 32610, United States
- Tamer Kahveci
  - Department of Computer & Information Science & Engineering, University of Florida, Gainesville, FL 32611, United States
- Jiang Bian
  - Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL 32611, United States
- Chenglong Li
  - Department of Medicinal Chemistry, Center for Natural Products, Drug Discovery and Development, University of Florida, Gainesville, FL 32610, United States
- Hendrik Luesch
  - Department of Medicinal Chemistry, Center for Natural Products, Drug Discovery and Development, University of Florida, Gainesville, FL 32610, United States
- Yanjun Li
  - Department of Medicinal Chemistry, Center for Natural Products, Drug Discovery and Development, University of Florida, Gainesville, FL 32610, United States
  - Department of Computer & Information Science & Engineering, University of Florida, Gainesville, FL 32611, United States
10
Li J, Sun L, Liu L, Li Z. MIFAM-DTI: a drug-target interactions predicting model based on multi-source information fusion and attention mechanism. Front Genet 2024; 15:1381997. [PMID: 38770418 PMCID: PMC11102998 DOI: 10.3389/fgene.2024.1381997] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/04/2024] [Accepted: 04/15/2024] [Indexed: 05/22/2024] Open
Abstract
Accurate identification of potential drug-target pairs is a crucial step in drug development and drug repositioning, which is characterized by the ability of the drug to bind to and modulate the activity of the target molecule, resulting in the desired therapeutic effect. As machine learning and deep learning technologies advance, an increasing number of models are being engaged for the prediction of drug-target interactions. However, there is still a great challenge to improve the accuracy and efficiency of predicting. In this study, we proposed a deep learning method called Multi-source Information Fusion and Attention Mechanism for Drug-Target Interaction (MIFAM-DTI) to predict drug-target interactions. Firstly, the physicochemical property feature vector and the Molecular ACCess System molecular fingerprint feature vector of a drug were extracted based on its SMILES sequence. The dipeptide composition feature vector and the Evolutionary Scale Modeling -1b feature vector of a target were constructed based on its amino acid sequence information. Secondly, the PCA method was employed to reduce the dimensionality of the four feature vectors, and the adjacency matrices were constructed by calculating the cosine similarity. Thirdly, the two feature vectors of each drug were concatenated and the two adjacency matrices were subjected to a logical OR operation. And then they were fed into a model composed of graph attention network and multi-head self-attention to obtain the final drug feature vectors. With the same method, the final target feature vectors were obtained. Finally, these final feature vectors were concatenated, which served as the input to a fully connected layer, resulting in the prediction output. 
MIFAM-DTI not only integrated multi-source information to capture the drug and target features more comprehensively, but also utilized the graph attention network and multi-head self-attention to autonomously learn attention weights and more comprehensively capture information in sequence data. Experimental results demonstrated that MIFAM-DTI outperformed state-of-the-art methods in terms of AUC and AUPR. Case study results of coenzymes involved in cellular energy metabolism also demonstrated the effectiveness and practicality of MIFAM-DTI. The source code and experimental data for MIFAM-DTI are available at https://github.com/Search-AB/MIFAM-DTI.
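The graph-construction step in this abstract (cosine-similarity adjacency matrices combined with a logical OR, alongside concatenated feature vectors) can be sketched in a few lines. This is a minimal illustration under stated assumptions: the abstract does not say how similarities are turned into edges, so the `threshold` parameter is hypothetical, as are the function names and the toy feature values. PCA, the graph attention network, and the self-attention layers are omitted.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def adjacency_from_features(features, threshold):
    """Binary adjacency matrix: edge where cosine similarity >= threshold.

    The thresholding rule is an assumption made for this sketch.
    """
    n = len(features)
    return [
        [1 if cosine_similarity(features[i], features[j]) >= threshold else 0
         for j in range(n)]
        for i in range(n)
    ]


def logical_or(adj_a, adj_b):
    """Element-wise logical OR of two binary adjacency matrices."""
    return [[a | b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(adj_a, adj_b)]


def concatenate_features(feats_a, feats_b):
    """Concatenate the two feature vectors of each item."""
    return [row_a + row_b for row_a, row_b in zip(feats_a, feats_b)]


# Hypothetical toy features for three drugs from two sources.
physchem = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]      # stand-in for property features
fingerprint = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]   # stand-in for fingerprint features

adj = logical_or(
    adjacency_from_features(physchem, threshold=0.5),
    adjacency_from_features(fingerprint, threshold=0.5),
)
node_features = concatenate_features(physchem, fingerprint)
print(adj)
print(node_features)
```

The OR keeps an edge between two drugs if either feature source considers them similar, so the combined graph is at least as dense as each input graph.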
Affiliation(s)
- Jianwei Li
  - Institute of Computational Medicine, School of Artificial Intelligence, Hebei University of Technology, Tianjin, China
11
Yamasan BE, Korkmaz S. Binding Activity Classification of Anti-SARS-CoV-2 Molecules using Deep Learning Across Multiple Assays. Balkan Med J 2024; 41:186-192. [PMID: 38462979 PMCID: PMC11077922 DOI: 10.4274/balkanmedj.galenos.2024.2024-1-73] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/29/2024] [Accepted: 02/15/2024] [Indexed: 03/12/2024] Open
Abstract
Background: The coronavirus disease-2019 (COVID-19) pandemic, caused by severe acute respiratory syndrome-coronavirus-2 (SARS-CoV-2), has urgently necessitated effective therapeutic solutions, with a focus on rapidly identifying and classifying potential small-molecule drugs. Given traditional methods’ labor-intensive and time-consuming nature, deep learning has emerged as an essential tool for efficiently processing and extracting insights from complex biological data.
Aims: To utilize deep learning techniques, particularly deep neural networks (DNN) enhanced with the synthetic minority oversampling technique (SMOTE), to enhance the classification of binding activities in anti-SARS-CoV-2 molecules across various bioassays.
Methods: We used 11 bioassay datasets covering various SARS-CoV-2 interactions and inhibitory mechanisms. These assays ranged from spike-ACE2 protein-protein interaction to ACE2 enzymatic activity and 3CL enzymatic activity. To address the prevalent class imbalance in these datasets, the SMOTE technique was employed to generate new samples for the minority class. In our model-building approach, we divided the dataset into 80% training and 20% test sets, reserving 10% of the training set for validation. Our approach involved employing a DNN that integrates ReLU and sigmoid activation functions, incorporates batch normalization, and uses Adam optimization. The hyperparameters and architecture of the DNN were optimized through various tests on layers, minibatch sizes, epoch sizes, and learning rates. A 40% dropout rate was incorporated to mitigate overfitting. For model evaluation, we computed performance metrics, such as balanced accuracy (BACC), precision, recall, F1 score, Matthews’ correlation coefficient (MCC), and area under the curve (AUC).
Results: The performance of the DNN across 11 bioassay test sets revealed varying outcomes, significantly influenced by the ratios of active-to-inactive compounds. Assays, such as AlphaLISA and CoV-PPE, demonstrated robust performance across various metrics, including BACC, precision, recall, and AUC, when configured with more balanced ratios (1:3 and 1:1, respectively). This suggests the effective identification of active compounds in both cases. In contrast, assays with higher imbalance ratios, such as 3CL (1:38) and cytopathic effect (1:15), demonstrated higher recall but lower precision, highlighting challenges in accurately identifying active compounds among numerous inactive compounds. However, even in these challenging settings, the model achieved favorable BACC and recall scores. Overall, the DNN model generally performed well, as indicated by the BACC, MCC, and AUC values, especially when considering the degree of dataset imbalance in each assay.
Conclusion: This study demonstrates the significant impact of deep learning, particularly DNN models enhanced with SMOTE, in improving the identification of active compounds in bioassay datasets for COVID-19 drug discovery, outperforming traditional machine learning models. Furthermore, this study highlights the efficacy of advanced computational techniques in addressing high-throughput screening data imbalances.
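The pattern reported in this abstract for highly imbalanced assays (high recall, low precision, yet favorable BACC) follows directly from how the metrics are defined. The sketch below is illustrative only: the confusion-matrix counts are hypothetical, chosen to mimic a 1:38 active-to-inactive ratio, and are not taken from the study; the function names are likewise assumptions.

```python
import math


def safe_div(num, den):
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def classification_metrics(tp, fp, tn, fn):
    """Compute binary-classification metrics from confusion-matrix counts."""
    recall = safe_div(tp, tp + fn)            # fraction of actives recovered
    specificity = safe_div(tn, tn + fp)       # fraction of inactives rejected
    precision = safe_div(tp, tp + fp)         # fraction of predicted actives that are real
    bacc = (recall + specificity) / 2         # balanced accuracy
    f1 = safe_div(2 * precision * recall, precision + recall)
    mcc = safe_div(
        tp * tn - fp * fn,
        math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    )
    return {
        "recall": recall,
        "precision": precision,
        "balanced_accuracy": bacc,
        "f1": f1,
        "mcc": mcc,
    }


# Hypothetical test set at a 1:38 ratio: 100 actives, 3800 inactives.
# Assumed classifier: finds 90% of actives and rejects 90% of inactives.
metrics = classification_metrics(tp=90, fp=380, tn=3420, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these assumed counts, recall and balanced accuracy are both 0.90 while precision stays below 0.20, because even a 10% false-positive rate on 3800 inactives yields far more false positives than there are true actives.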
Affiliation(s)
- Bilge Eren Yamasan
  - Department of Biophysics, Trakya University Faculty of Medicine, Edirne, Türkiye
- Selçuk Korkmaz
  - Department of Biostatistics and Medical Informatics, Trakya University Faculty of Medicine, Edirne, Türkiye
|
12
|
Nada H, Kim S, Lee K. PT-Finder: A multi-modal neural network approach to target identification. Comput Biol Med 2024; 174:108444. [PMID: 38636325 DOI: 10.1016/j.compbiomed.2024.108444] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/02/2024] [Revised: 04/04/2024] [Accepted: 04/07/2024] [Indexed: 04/20/2024]
Abstract
Efficient target identification for bioactive compounds, including novel synthetic analogs, is crucial for accelerating the drug discovery pipeline. However, target identification presents significant challenges and is often expensive, which in turn can hinder drug discovery efforts. To address these challenges, machine learning applications have arisen as a promising approach for predicting the targets of novel chemical compounds. These methods allow the exploration of ligand-target interactions, the uncovering of biochemical mechanisms, and the investigation of drug repurposing. Typically, current target identification tools rely on assessing ligand structural similarities. Herein, a multi-modal neural network model was built using a library of proteins, their respective sequences, and active inhibitors. Subsequent validations showed the model to possess an accuracy of 82% and an MPRAUC of 0.80. Leveraging the trained model, we developed PT-Finder (Protein Target Finder), a user-friendly offline application that is capable of predicting the target proteins for hundreds of compounds within a few seconds. This combination of offline operation, speed, and accuracy positions PT-Finder as a powerful tool to accelerate drug discovery workflows. PT-Finder and its source code have been made freely accessible for download at https://github.com/PT-Finder/PT-Finder.
Affiliation(s)
- Hossam Nada
  - BK21 FOUR Team and Integrated Research Institute for Drug Development, College of Pharmacy, Dongguk University-Seoul, Goyang, 10326, Republic of Korea
- Sungdo Kim
  - BK21 FOUR Team and Integrated Research Institute for Drug Development, College of Pharmacy, Dongguk University-Seoul, Goyang, 10326, Republic of Korea
- Kyeong Lee
  - BK21 FOUR Team and Integrated Research Institute for Drug Development, College of Pharmacy, Dongguk University-Seoul, Goyang, 10326, Republic of Korea
|
13
|
Chen X, Lu Z, Xiao J, Xia W, Pan Y, Xia H, Chen YH, Zhang H. Small-Molecule Inhibitors of TIPE3 Protein Identified through Deep Learning Suppress Cancer Cell Growth In Vitro. Cells 2024; 13:771. [PMID: 38727307 PMCID: PMC11082981 DOI: 10.3390/cells13090771] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/14/2024] [Revised: 04/17/2024] [Accepted: 04/26/2024] [Indexed: 05/13/2024] Open
Abstract
Tumor necrosis factor-α-induced protein 8-like 3 (TNFAIP8L3 or TIPE3) functions as a transfer protein for lipid second messengers. TIPE3 is highly upregulated in several human cancers and has been established to significantly promote tumor cell proliferation, migration, and invasion and inhibit the apoptosis of cancer cells. Thus, inhibiting the function of TIPE3 is expected to be an effective strategy against cancer. The advancement of artificial intelligence (AI)-driven drug development has recently invigorated research in anti-cancer drug development. In this work, we incorporated DFCNN, Autodock Vina docking, DeepBindBC, MD, and metadynamics to efficiently identify inhibitors of TIPE3 from a ZINC compound dataset. Six potential candidates were selected for further experimental study to validate their anti-tumor activity. Among these, three small-molecule compounds (K784-8160, E745-0011, and 7238-1516) showed significant anti-tumor activity in vitro, leading to reduced tumor cell viability, proliferation, and migration and enhanced apoptotic tumor cell death. Notably, E745-0011 and 7238-1516 exhibited selective cytotoxicity toward tumor cells with high TIPE3 expression while having little or no effect on normal human cells or tumor cells with low TIPE3 expression. A molecular docking analysis further supported their interactions with TIPE3, highlighting hydrophobic interactions and their shared interaction residues and offering insights for designing more effective inhibitors. Taken together, this work demonstrates the feasibility of incorporating deep learning and MD simulations in virtual drug screening and provides inhibitors with significant potential for anti-cancer drug development against TIPE3-.
Affiliation(s)
- Xiaodie Chen
  - Center for Cancer Immunology, Institute of Biomedicine and Biotechnology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China; (X.C.); (Z.L.); (H.X.)
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Zhen Lu
  - Center for Cancer Immunology, Institute of Biomedicine and Biotechnology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China; (X.C.); (Z.L.); (H.X.)
- Jin Xiao
  - Faculty of Synthetic Biology and Institute of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China; (J.X.); (W.X.); (Y.P.)
- Wei Xia
  - Faculty of Synthetic Biology and Institute of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China; (J.X.); (W.X.); (Y.P.)
- Yi Pan
  - Faculty of Synthetic Biology and Institute of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China; (J.X.); (W.X.); (Y.P.)
- Houjun Xia
  - Center for Cancer Immunology, Institute of Biomedicine and Biotechnology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China; (X.C.); (Z.L.); (H.X.)
  - Faculty of Pharmaceutical Sciences, Shenzhen University of Advanced Technology, Shenzhen 518055, China
- Youhai H. Chen
  - Center for Cancer Immunology, Institute of Biomedicine and Biotechnology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China; (X.C.); (Z.L.); (H.X.)
  - University of Chinese Academy of Sciences, Beijing 100049, China
  - Faculty of Pharmaceutical Sciences, Shenzhen University of Advanced Technology, Shenzhen 518055, China
- Haiping Zhang
  - Faculty of Synthetic Biology and Institute of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China; (J.X.); (W.X.); (Y.P.)
|
14
|
Ghazikhani H, Butler G. Exploiting protein language models for the precise classification of ion channels and ion transporters. Proteins 2024. [PMID: 38656743 DOI: 10.1002/prot.26694] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/31/2023] [Revised: 03/26/2024] [Accepted: 04/08/2024] [Indexed: 04/26/2024]
Abstract
This study introduces TooT-PLM-ionCT, a comprehensive framework that consolidates three distinct systems, each meticulously tailored for one of the following tasks: distinguishing ion channels (ICs) from membrane proteins (MPs), segregating ion transporters (ITs) from MPs, and differentiating ICs from ITs. Drawing upon the strengths of six Protein Language Models (PLMs)-ProtBERT, ProtBERT-BFD, ESM-1b, ESM-2 (650M parameters), and ESM-2 (15B parameters), TooT-PLM-ionCT employs a combination of traditional classifiers and deep learning models for nuanced protein classification. Originally validated on an existing dataset by previous researchers, our systems demonstrated superior performance in identifying ITs from MPs and distinguishing ICs from ITs, with the IC-MP discrimination achieving state-of-the-art results. In light of recommendations for additional validation, we introduced a new dataset, significantly enhancing the robustness and generalization of our models across bioinformatics challenges. This new evaluation underscored the effectiveness of TooT-PLM-ionCT in adapting to novel data while maintaining high classification accuracy. Furthermore, this study explores critical factors affecting classification accuracy, such as dataset balancing, the impact of using frozen versus fine-tuned PLM representations, and the variance between half and full precision in floating-point computations. To facilitate broader application and accessibility, a web server (https://tootsuite.encs.concordia.ca/service/TooT-PLM-ionCT) has been developed, allowing users to evaluate unknown protein sequences through our specialized systems for IC-MP, IT-MP, and IC-IT classification tasks.
Affiliation(s)
- Hamed Ghazikhani
  - Department of Computer Science and Software Engineering, Concordia University, Montréal, Québec, Canada
- Gregory Butler
  - Centre for Structural and Functional Genomics, Concordia University, Montréal, Québec, Canada
|
15
|
Meewan I, Panmanee J, Petchyam N, Lertvilai P. HBCVTr: an end-to-end transformer with a deep neural network hybrid model for anti-HBV and HCV activity predictor from SMILES. Sci Rep 2024; 14:9262. [PMID: 38649402 PMCID: PMC11035669 DOI: 10.1038/s41598-024-59933-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/09/2024] [Accepted: 04/16/2024] [Indexed: 04/25/2024] Open
Abstract
Hepatitis B and C viruses (HBV and HCV) are significant causes of chronic liver diseases, with approximately 350 million infections globally. To accelerate the finding of effective treatment options, we introduce HBCVTr, a novel ligand-based drug design (LBDD) method for predicting the inhibitory activity of small molecules against HBV and HCV. HBCVTr employs a hybrid model consisting of double encoders of transformers and a deep neural network to learn the relationship between small molecules' simplified molecular-input line-entry system (SMILES) and their antiviral activity against HBV or HCV. The prediction accuracy of HBCVTr has surpassed baseline machine learning models and existing methods, with R-squared values of 0.641 and 0.721 for the HBV and HCV test sets, respectively. The trained models were successfully applied to virtual screening against 10 million compounds within 240 h, leading to the discovery of the top novel inhibitor candidates, including IJN04 for HBV and IJN12 and IJN19 for HCV. Molecular docking and dynamics simulations identified IJN04, IJN12, and IJN19 target proteins as the HBV core antigen, HCV NS5B RNA-dependent RNA polymerase, and HCV NS3/4A serine protease, respectively. Overall, HBCVTr offers a new and rapid drug discovery and development screening method targeting HBV and HCV.
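For readers unfamiliar with the R-squared figures quoted in this abstract, the following is a minimal sketch of how the coefficient of determination is computed for a regression model. The activity values are hypothetical placeholders, not data from HBCVTr, and the function name is an assumption. The last lines simply restate the screening throughput implied by "10 million compounds within 240 h".

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)             # total sum of squares
    return 1.0 - ss_res / ss_tot


# Hypothetical measured vs. predicted activity values (illustration only).
measured = [5.0, 6.0, 7.0, 8.0]
predicted = [5.5, 5.5, 7.5, 7.5]
score = r_squared(measured, predicted)
print(f"R-squared: {score:.3f}")

# Throughput implied by the abstract: 10 million compounds in 240 hours.
compounds_per_second = 10_000_000 / (240 * 3600)
print(f"Throughput: {compounds_per_second:.1f} compounds per second")
```

An R-squared of 1.0 would mean the predictions explain all variance in the measured values; a value of 0.0 is what a model that always predicts the mean would achieve.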
Affiliation(s)
- Ittipat Meewan
  - Center for Advanced Therapeutics, Institute of Molecular Biosciences, Mahidol University, Nakhon Pathom, 73170, Thailand
- Jiraporn Panmanee
  - Research Center for Neuroscience, Institute of Molecular Biosciences, Mahidol University, Nakhon Pathom, 73170, Thailand
- Nopphon Petchyam
  - Center for Advanced Therapeutics, Institute of Molecular Biosciences, Mahidol University, Nakhon Pathom, 73170, Thailand
- Pichaya Lertvilai
  - Scripps Institution of Oceanography, University of California San Diego, La Jolla, CA, 92037, USA
|
16
|
Kengkanna A, Ohue M. Enhancing property and activity prediction and interpretation using multiple molecular graph representations with MMGX. Commun Chem 2024; 7:74. [PMID: 38580841 PMCID: PMC10997661 DOI: 10.1038/s42004-024-01155-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/08/2023] [Accepted: 03/18/2024] [Indexed: 04/07/2024] Open
Abstract
Graph Neural Networks (GNNs) excel in compound property and activity prediction, but the choice of molecular graph representation significantly influences model learning and interpretation. While atom-level molecular graphs resemble natural topology, they overlook key substructures or functional groups, and their interpretation only partially aligns with chemical intuition. Recent research suggests alternative representations that use reduced molecular graphs to integrate higher-level chemical information and that leverage both representations in a single model. However, there is a lack of studies on the applicability and impact of different molecular graphs on model learning and interpretation. Here, we introduce MMGX (Multiple Molecular Graph eXplainable discovery), investigating the effects of multiple molecular graphs, including Atom, Pharmacophore, JunctionTree, and FunctionalGroup, on model learning and interpretation from various perspectives. Our findings indicate that multiple graphs improve model performance, though to varying degrees depending on the dataset. Interpretation from multiple graphs in different views provides more comprehensive features and potential substructures consistent with background knowledge. These results help to understand model decisions and offer valuable insights for subsequent tasks. The concept of multiple molecular graph representations and diverse interpretation perspectives has broad applicability across tasks, architectures, and explanation techniques, enhancing model learning and interpretation for relevant applications in drug discovery.
Affiliation(s)
- Apakorn Kengkanna
  - Department of Computer Science, School of Computing, Tokyo Institute of Technology, Kanagawa, 226-8501, Japan
- Masahito Ohue
  - Department of Computer Science, School of Computing, Tokyo Institute of Technology, Kanagawa, 226-8501, Japan
|
17
|
Zhang C, Xie L, Lu X, Mao R, Xu L, Xu X. Developing an Improved Cycle Architecture for AI-Based Generation of New Structures Aimed at Drug Discovery. Molecules 2024; 29:1499. [PMID: 38611779 PMCID: PMC11013495 DOI: 10.3390/molecules29071499] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/25/2024] [Revised: 03/18/2024] [Accepted: 03/21/2024] [Indexed: 04/14/2024] Open
Abstract
Drug discovery involves a crucial step of optimizing molecules with the desired structural groups. In the domain of computer-aided drug discovery, deep learning has emerged as a prominent technique in molecular modeling. Deep generative models play a crucial role in generating novel molecules during molecular optimization. However, many existing molecular generative models are limited because they process input information in the forward direction only. To overcome this limitation, we propose an improved generative model called BD-CycleGAN, which incorporates BiLSTM (bidirectional long short-term memory) and Mol-CycleGAN (molecular cycle generative adversarial network) to preserve the information of the molecular input. To evaluate the proposed model, we analyze the structural distribution and evaluation metrics of the generated molecules in the process of structural transformation. The results demonstrate that the BD-CycleGAN model achieves a higher success rate and exhibits increased diversity in molecular generation. Furthermore, we demonstrate its application in molecular docking, where it successfully increases the docking score for the generated molecules. The proposed BD-CycleGAN architecture harnesses the power of deep learning to facilitate the generation of molecules with desired structural features, thus offering promising advancements in drug discovery.
Affiliation(s)
- Lei Xu
  - Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou 213001, China; (C.Z.); (L.X.); (X.L.); (R.M.)
- Xiaojun Xu
  - Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou 213001, China; (C.Z.); (L.X.); (X.L.); (R.M.)
|
18
|
Farzan R. Artificial intelligence in Immuno-genetics. Bioinformation 2024; 20:29-35. [PMID: 38352901 PMCID: PMC10859949 DOI: 10.6026/973206300200029] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/01/2024] [Revised: 01/31/2024] [Accepted: 01/31/2024] [Indexed: 02/16/2024] Open
Abstract
Rapid advancements in the field of artificial intelligence (AI) have opened up unprecedented opportunities to revolutionize various scientific domains, including immunology and genetics. Therefore, it is of interest to explore the emerging applications of AI in immunology and genetics, with the objective of enhancing our understanding of the dynamic intricacies of the immune system, disease etiology, and genetic variations. Hence, this review examines the use of AI methodologies on immunological and genetic datasets, which facilitates the development of innovative approaches to diagnosis, treatment, and personalized medicine.
Affiliation(s)
- Raed Farzan
  - Department of Clinical Laboratory Sciences, College of Applied Medical Sciences, King Saud University, Riyadh - 11433, Saudi Arabia
  - Center of Excellence in Biotechnology Research, King Saud University, Riyadh - 11433, Saudi Arabia
  - Medical and Molecular Genetics Research, King Saud University, Riyadh - 11433, Saudi Arabia
|
19
|
Nitulescu GM. Techniques and Strategies in Drug Design and Discovery. Int J Mol Sci 2024; 25:1364. [PMID: 38338643 PMCID: PMC10855429 DOI: 10.3390/ijms25031364] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/09/2024] [Accepted: 01/18/2024] [Indexed: 02/12/2024] Open
Abstract
The process of drug discovery constitutes a highly intricate and formidable undertaking, encompassing the identification and advancement of novel therapeutic entities [...].
Affiliation(s)
- George Mihai Nitulescu
  - Faculty of Pharmacy, "Carol Davila" University of Medicine and Pharmacy, 6 Traian Vuia Street, 020956 Bucharest, Romania
|
20
|
Liu J, Yang M, Yu Y, Xu H, Li K, Zhou X. Large language models in bioinformatics: applications and perspectives. ARXIV 2024:arXiv:2401.04155v1. [PMID: 38259343 PMCID: PMC10802675] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/24/2024]
Abstract
Large language models (LLMs) are a class of artificial intelligence models based on deep learning, which have great performance in various tasks, especially in natural language processing (NLP). Large language models typically consist of artificial neural networks with numerous parameters, trained on large amounts of unlabeled input using self-supervised or semi-supervised learning. However, their potential for solving bioinformatics problems may even exceed their proficiency in modeling human language. In this review, we will present a summary of the prominent large language models used in natural language processing, such as BERT and GPT, and focus on exploring the applications of large language models at different omics levels in bioinformatics, mainly including applications of large language models in genomics, transcriptomics, proteomics, drug discovery and single cell analysis. Finally, this review summarizes the potential and prospects of large language models in solving bioinformatic problems.
Affiliation(s)
- Jiajia Liu
  - Center for Computational Systems Medicine, McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas, 77030, USA
- Mengyuan Yang
  - School of Life Sciences, Zhengzhou University, Zhengzhou, Henan 450001, China
- Yankai Yu
  - School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu, Sichuan 611756, China
- Haixia Xu
  - The Center of Gerontology and Geriatrics, West China Hospital, Sichuan University, Chengdu, Sichuan 610041, China
  - West China Biomedical Big Data Center, West China Hospital, Sichuan University, Chengdu, Sichuan 610041, China
- Kang Li
  - West China Biomedical Big Data Center, West China Hospital, Sichuan University, Chengdu, Sichuan 610041, China
- Xiaobo Zhou
  - Center for Computational Systems Medicine, McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas, 77030, USA
  - McGovern Medical School, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
  - School of Dentistry, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
|
21
|
Jeje O, Otun S, Aloke C, Achilonu I. Exploring NAD+ metabolism and NNAT: Insights from structure, function, and computational modeling. Biochimie 2024; 220:84-98. [PMID: 38182101 DOI: 10.1016/j.biochi.2024.01.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/29/2023] [Revised: 12/18/2023] [Accepted: 01/02/2024] [Indexed: 01/07/2024]
Abstract
Nicotinamide Adenine Dinucleotide (NAD+), a coenzyme, is ubiquitously distributed and serves crucial functions in diverse biological processes, encompassing redox reactions, energy metabolism, and cellular signalling. This review article explores the intricate realm of NAD+ metabolism, with a particular emphasis on the complex relationship between its structure, function, and the pivotal enzyme in its biosynthesis, Nicotinate Nucleotide Adenylyltransferase (NNAT), also known as nicotinate mononucleotide adenylyltransferase (NaMNAT). Our findings indicate that NAD+ biosynthesis in humans and bacteria occurs via the same de novo synthesis route and the pyridine ring salvage pathway. Maintaining NAD+ homeostasis in bacteria is imperative, as most bacterial species cannot obtain NAD+ from their surroundings. However, owing to the low sequence identity and structurally distant relationship between bacterial NNAT, including that of E. faecium and K. pneumoniae, and its human counterpart, inhibiting NNAT, an indispensable enzyme in NAD+ biosynthesis, is a viable alternative for curtailing infections caused by E. faecium and K. pneumoniae. By merging empirical and computational discoveries and connecting the intricate NAD+ metabolism network with NNAT's crucial role, it becomes clear that the synergistic effect of these insights may lead to a more profound understanding of the coenzyme's function and its potential applications in the fields of therapeutics and biotechnology.
Affiliation(s)
- Olamide Jeje
  - Protein Structure-Function and Research Unit, School of Molecular and Cell Biology, Faculty of Science, University of the Witwatersrand, Braamfontein, Johannesburg, 2050, South Africa
- Sarah Otun
  - Protein Structure-Function and Research Unit, School of Molecular and Cell Biology, Faculty of Science, University of the Witwatersrand, Braamfontein, Johannesburg, 2050, South Africa
- Chinyere Aloke
  - Protein Structure-Function and Research Unit, School of Molecular and Cell Biology, Faculty of Science, University of the Witwatersrand, Braamfontein, Johannesburg, 2050, South Africa; Department of Medical Biochemistry, Alex Ekwueme Federal University Ndufu-Alike, Ebonyi State, Nigeria
- Ikechukwu Achilonu
  - Protein Structure-Function and Research Unit, School of Molecular and Cell Biology, Faculty of Science, University of the Witwatersrand, Braamfontein, Johannesburg, 2050, South Africa
|
22
|
Li Y, Cardoso-Silva J, Kelly JM, Delves MJ, Furnham N, Papageorgiou LG, Tsoka S. Optimisation-based modelling for explainable lead discovery in malaria. Artif Intell Med 2024; 147:102700. [PMID: 38184363 DOI: 10.1016/j.artmed.2023.102700] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/21/2023] [Revised: 10/17/2023] [Accepted: 10/29/2023] [Indexed: 01/08/2024]
Abstract
BACKGROUND: The search for new antimalarial treatments is urgent due to growing resistance to existing therapies. The Open Source Malaria (OSM) project offers a promising starting point, having extensively screened various compounds for their effectiveness. Further analysis of the chemical space surrounding these compounds could provide the means for innovative drugs.
METHODS: We report an optimisation-based method for quantitative structure-activity relationship (QSAR) modelling that provides explainable modelling of ligand activity through a mathematical programming formulation. The methodology is based on piecewise regression principles and offers optimal detection of breakpoint features, efficient allocation of samples into distinct sub-groups based on breakpoint feature values, and insightful regression coefficients. Analysis of OSM antimalarial compounds yields interpretable results through rules generated by the model that reflect the contribution of individual fingerprint fragments in ligand activity prediction. Using knowledge of fragment prioritisation and screening of commercially available compound libraries, potential lead compounds for antimalarials are identified and evaluated experimentally via a Plasmodium falciparum asexual growth inhibition assay (PfGIA) and a human cell cytotoxicity assay.
CONCLUSIONS: Three compounds are identified as potential leads for antimalarials using the methodology described above. This work illustrates how explainable predictive models based on mathematical optimisation can pave the way towards more efficient fragment-based lead discovery as applied in malaria.
Affiliation(s)
- Yutong Li
  - Department of Informatics, King's College London, Bush House, London, WC2B 4BG, UK
- Jonathan Cardoso-Silva
  - Data Science Institute, London School of Economics and Political Science, Houghton St, London, WC2A 2AE, UK
- John M Kelly
  - Department of Infection Biology, London School of Hygiene and Tropical Medicine, Keppel St, London, WC1E 7HT, UK
- Michael J Delves
  - Department of Infection Biology, London School of Hygiene and Tropical Medicine, Keppel St, London, WC1E 7HT, UK
- Nicholas Furnham
  - Department of Infection Biology, London School of Hygiene and Tropical Medicine, Keppel St, London, WC1E 7HT, UK
- Lazaros G Papageorgiou
  - The Sargent Centre for Process Systems Engineering, Department of Chemical Engineering, University College London, Torrington Place, London, WC1E 7JE, UK
- Sophia Tsoka
  - Department of Informatics, King's College London, Bush House, London, WC2B 4BG, UK
|
23
|
Guzman-Pando A, Ramirez-Alonso G, Arzate-Quintana C, Camarillo-Cisneros J. Deep learning algorithms applied to computational chemistry. Mol Divers 2023:10.1007/s11030-023-10771-y. [PMID: 38151697 DOI: 10.1007/s11030-023-10771-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/20/2023] [Accepted: 11/14/2023] [Indexed: 12/29/2023]
Abstract
Recently, there has been a significant increase in the use of deep learning techniques in the molecular sciences, which have shown high performance on datasets and the ability to generalize across data. However, no model has achieved perfect performance in solving all problems, and the pros and cons of each approach remain unclear to those new to the field. Therefore, this paper aims to review deep learning algorithms that have been applied to solve molecular challenges in computational chemistry. We propose a comprehensive categorization that encompasses two primary approaches: conventional deep learning and geometric deep learning models. This classification takes into account the distinct techniques employed by the algorithms within each approach. We present an up-to-date analysis of these algorithms, emphasizing their key features and open issues. This includes details of input descriptors, datasets used, open-source code availability, task solutions, and actual research applications, focusing on general applications rather than specific ones such as drug discovery. Furthermore, our report discusses trends and future directions in molecular algorithm design, including the input descriptors used for each deep learning model, GPU usage, training and forward processing time, model parameters, the most commonly used datasets, libraries, and optimization schemes. This information aids in identifying the most suitable algorithms for a given task. It also serves as a reference for the datasets and input data frequently used for each algorithm technique. In addition, it provides insights into the benefits and open issues of each technique, and supports the development of novel computational chemistry systems.
Affiliation(s)
- Abimael Guzman-Pando
  - Computational Chemistry Physics Laboratory, Facultad de Medicina y Ciencias Biomédicas, Universidad Autónoma de Chihuahua, Campus II, 31125, Chihuahua, Mexico
- Graciela Ramirez-Alonso
  - Faculty of Engineering, Universidad Autónoma de Chihuahua, Campus II, 31125, Chihuahua, Mexico
- Carlos Arzate-Quintana
  - Computational Chemistry Physics Laboratory, Facultad de Medicina y Ciencias Biomédicas, Universidad Autónoma de Chihuahua, Campus II, 31125, Chihuahua, Mexico
- Javier Camarillo-Cisneros
  - Computational Chemistry Physics Laboratory, Facultad de Medicina y Ciencias Biomédicas, Universidad Autónoma de Chihuahua, Campus II, 31125, Chihuahua, Mexico
|
24
|
Amoroso N, Gambacorta N, Mastrolorito F, Togo MV, Trisciuzzi D, Monaco A, Pantaleo E, Altomare CD, Ciriaco F, Nicolotti O. Making sense of chemical space network shows signs of criticality. Sci Rep 2023; 13:21335. [PMID: 38049451 PMCID: PMC10696027 DOI: 10.1038/s41598-023-48107-3] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 09/13/2023] [Accepted: 11/22/2023] [Indexed: 12/06/2023] Open
Abstract
Chemical space modelling has great importance in unveiling and visualising latent information, which is critical in predictive toxicology related to the drug discovery process. While the use of traditional molecular descriptors and fingerprints may suffer from the so-called curse of dimensionality, complex networks are devoid of the typical drawbacks of coordinate-based representations. Herein, we use chemical space networks (CSNs) to analyse the case of developmental toxicity (Dev Tox), which remains a challenging endpoint because of the difficulty of gathering enough reliable data, despite being very important for the protection of maternal and child health. Our study proved that the Dev Tox CSN has a complex non-random organisation and can thus provide a wealth of meaningful information, also for predictive purposes. At a phase transition, chemical similarities highlight well-established toxicophores, such as aryl derivatives, mostly neurotoxic hydantoins, barbiturates and amino alcohols, steroids, and volatile organic compounds ether-like chemicals, which are strongly suspected of the Dev Tox onset and can thus be employed as effective alerts for prioritising chemicals before testing.
Affiliation(s)
- Nicola Amoroso
  - Dipartimento di Farmacia - Scienze del Farmaco, Università degli studi di Bari Aldo Moro, via E. Orabona, 4, 70125, Bari, Italy
  - Istituto Nazionale di Fisica Nucleare, Sezione di Bari, via E. Orabona, 4, 70125, Bari, Italy
- Nicola Gambacorta
  - Dipartimento di Farmacia - Scienze del Farmaco, Università degli studi di Bari Aldo Moro, via E. Orabona, 4, 70125, Bari, Italy
  - Division of Medical Genetics, Fondazione IRCCS-Casa Sollievo della Sofferenza, San Giovanni Rotondo (Foggia), Italy
- Fabrizio Mastrolorito
  - Dipartimento di Farmacia - Scienze del Farmaco, Università degli studi di Bari Aldo Moro, via E. Orabona, 4, 70125, Bari, Italy
- Maria Vittoria Togo
  - Dipartimento di Farmacia - Scienze del Farmaco, Università degli studi di Bari Aldo Moro, via E. Orabona, 4, 70125, Bari, Italy
- Daniela Trisciuzzi
  - Dipartimento di Farmacia - Scienze del Farmaco, Università degli studi di Bari Aldo Moro, via E. Orabona, 4, 70125, Bari, Italy
- Alfonso Monaco
  - Istituto Nazionale di Fisica Nucleare, Sezione di Bari, via E. Orabona, 4, 70125, Bari, Italy
  - Dipartimento Interateneo di Fisica "M. Merlin", Università degli studi di Bari Aldo Moro, Via Giovanni Amendola, 173, 70125, Bari, Italy
- Ester Pantaleo
  - Istituto Nazionale di Fisica Nucleare, Sezione di Bari, via E. Orabona, 4, 70125, Bari, Italy
  - Dipartimento Interateneo di Fisica "M. Merlin", Università degli studi di Bari Aldo Moro, Via Giovanni Amendola, 173, 70125, Bari, Italy
- Cosimo Damiano Altomare
  - Dipartimento di Farmacia - Scienze del Farmaco, Università degli studi di Bari Aldo Moro, via E. Orabona, 4, 70125, Bari, Italy
- Fulvio Ciriaco
  - Dipartimento di Chimica, Università degli studi di Bari Aldo Moro, via E. Orabona, 4, 70125, Bari, Italy
- Orazio Nicolotti
  - Dipartimento di Farmacia - Scienze del Farmaco, Università degli studi di Bari Aldo Moro, via E. Orabona, 4, 70125, Bari, Italy
25
Sinha K, Ghosh N, Sil PC. A Review on the Recent Applications of Deep Learning in Predictive Drug Toxicological Studies. Chem Res Toxicol 2023; 36:1174-1205. [PMID: 37561655 DOI: 10.1021/acs.chemrestox.2c00375] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/12/2023]
Abstract
Drug toxicity prediction is an important step in ensuring patient safety during drug design studies. While traditional preclinical studies have historically relied on animal models to evaluate toxicity, recent advances in deep-learning approaches have shown great promise in advancing drug safety science and reducing animal use in preclinical studies. However, deep-learning-based approaches also face challenges in handling large biological data sets, model interpretability, and regulatory acceptance. In this review, we provide an overview of recent developments in deep-learning-based approaches for predicting drug toxicity, highlighting their potential advantages over traditional methods and the need to address their limitations. Deep-learning models have demonstrated excellent performance in predicting toxicity outcomes from various data sources such as chemical structures, genomic data, and high-throughput screening assays. The potential of deep learning for automated feature engineering is also discussed. This review emphasizes the need to address ethical concerns related to the use of deep learning in drug toxicity studies, including the reduction of animal use and ensuring regulatory acceptance. Furthermore, emerging applications of deep learning in drug toxicity prediction, such as predicting drug-drug interactions and toxicity in rare subpopulations, are highlighted. The integration of deep-learning-based approaches with traditional methods is discussed as a way to develop more reliable and efficient predictive models for drug safety assessment, paving the way for safer and more effective drug discovery and development. Overall, this review highlights the critical role of deep learning in predictive toxicology and drug safety evaluation, emphasizing the need for continued research and development in this rapidly evolving field. 
By addressing the limitations of traditional methods, leveraging the potential of deep learning for automated feature engineering, and addressing ethical concerns, deep-learning-based approaches have the potential to revolutionize drug toxicity prediction and improve patient safety in drug discovery and development.
Affiliation(s)
- Krishnendu Sinha
  - Department of Zoology, Jhargram Raj College, Jhargram 721507, West Bengal, India
- Nabanita Ghosh
  - Department of Zoology, Maulana Azad College, Kolkata 700013, West Bengal, India
- Parames C Sil
  - Division of Molecular Medicine, Bose Institute, Kolkata 700054, West Bengal, India
26
Li F, Nian Y, Sun Z, Tao C. Advancing Biomedicine with Graph Representation Learning: Recent Progress, Challenges, and Future Directions. Yearb Med Inform 2023; 32:215-224. [PMID: 38147863 PMCID: PMC10751115 DOI: 10.1055/s-0043-1768735] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/28/2023] Open
Abstract
OBJECTIVES: Graph representation learning (GRL) has emerged as a pivotal field that has contributed significantly to breakthroughs in various fields, including biomedicine. The objective of this survey is to review the latest advancements in GRL methods and their applications in the biomedical field. We also highlight key challenges currently faced by GRL and outline potential directions for future research.
METHODS: We conducted a comprehensive search of multiple databases, including PubMed, Web of Science, IEEE Xplore, and Google Scholar, to collect relevant publications from the past two years (2021-2022). The studies selected for review were chosen based on their relevance to the topic and their publication quality.
RESULTS: A total of 78 articles were included in our analysis. We identified three main categories of GRL methods and summarized their methodological foundations and notable models. In terms of GRL applications, we focused on two main topics: drug and disease. We analyzed the study frameworks and achievements of the prominent research. Based on the current state of the art, we discussed the challenges and future directions.
CONCLUSIONS: GRL methods applied in the biomedical field demonstrated several key characteristics, including the utilization of attention mechanisms to prioritize relevant features, a growing emphasis on model interpretability, and the combination of various techniques to improve model performance. There are also challenges that need to be addressed, including mitigating model bias, accommodating the heterogeneity of large-scale knowledge graphs, and improving the availability of high-quality graph data. To fully leverage the potential of GRL, future efforts should prioritize these areas of research.
Affiliation(s)
- Fang Li
  - McWilliams School of Biomedical Informatics, the University of Texas Health Science Center at Houston, Houston, TX, USA
- Yi Nian
  - McWilliams School of Biomedical Informatics, the University of Texas Health Science Center at Houston, Houston, TX, USA
- Zenan Sun
  - McWilliams School of Biomedical Informatics, the University of Texas Health Science Center at Houston, Houston, TX, USA
- Cui Tao
  - McWilliams School of Biomedical Informatics, the University of Texas Health Science Center at Houston, Houston, TX, USA
27
Koutroumpa NM, Papavasileiou KD, Papadiamantis AG, Melagraki G, Afantitis A. A Systematic Review of Deep Learning Methodologies Used in the Drug Discovery Process with Emphasis on In Vivo Validation. Int J Mol Sci 2023; 24:6573. [PMID: 37047543 PMCID: PMC10095548 DOI: 10.3390/ijms24076573] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/18/2022] [Revised: 03/24/2023] [Accepted: 03/28/2023] [Indexed: 04/05/2023] Open
Abstract
The discovery and development of new drugs are extremely long and costly processes. Recent progress in artificial intelligence has made a positive impact on the drug development pipeline. Numerous challenges have been addressed with the growing exploitation of drug-related data and the advancement of deep learning technology. Several model frameworks have been proposed to enhance the performance of deep learning algorithms in molecular design. However, only a few have had an immediate impact on drug development, since computational results may not be confirmed experimentally. This systematic review aims to summarize the different deep learning architectures that are used in the drug discovery process and validated with further in vivo experiments. For each presented study, the proposed molecule or peptide that has been generated or identified by the deep learning model has been biologically evaluated in animal models. These state-of-the-art studies highlight that even if artificial intelligence in drug discovery is still in its infancy, it has great potential to accelerate the drug discovery cycle, reduce the required costs, and contribute to the integration of the 3R (Replacement, Reduction, Refinement) principles. Out of all the reviewed scientific articles, seven algorithms were identified: recurrent neural networks, specifically long short-term memory networks (LSTM-RNNs); Autoencoders (AEs) and their Wasserstein Autoencoder (WAE) and Variational Autoencoder (VAE) variants; Convolutional Neural Networks (CNNs); Directed Message Passing Neural Networks (D-MPNNs); and Multitask Deep Neural Networks (MTDNNs). LSTM-RNNs were the most used architectures, with molecules or peptide sequences as inputs.
Affiliation(s)
- Nikoletta-Maria Koutroumpa
  - Department of ChemoInformatics, NovaMechanics Ltd., Nicosia 1070, Cyprus
  - School of Chemical Engineering, National Technical University of Athens, 157 80 Athens, Greece
  - Division of Data Driven Innovation, Entelos Institute, Larnaca 6059, Cyprus
- Konstantinos D. Papavasileiou
  - Department of ChemoInformatics, NovaMechanics Ltd., Nicosia 1070, Cyprus
  - Division of Data Driven Innovation, Entelos Institute, Larnaca 6059, Cyprus
  - Department of ChemoInformatics, NovaMechanics MIKE., 185 45 Piraeus, Greece
- Anastasios G. Papadiamantis
  - Department of ChemoInformatics, NovaMechanics Ltd., Nicosia 1070, Cyprus
  - Division of Data Driven Innovation, Entelos Institute, Larnaca 6059, Cyprus
- Georgia Melagraki
  - Division of Physical Sciences & Applications, Hellenic Military Academy, 166 73 Vari, Greece
- Antreas Afantitis
  - Department of ChemoInformatics, NovaMechanics Ltd., Nicosia 1070, Cyprus
  - Division of Data Driven Innovation, Entelos Institute, Larnaca 6059, Cyprus
  - Department of ChemoInformatics, NovaMechanics MIKE., 185 45 Piraeus, Greece
28
Das P, Mazumder DH. An extensive survey on the use of supervised machine learning techniques in the past two decades for prediction of drug side effects. Artif Intell Rev 2023; 56:1-28. [PMID: 36819660 PMCID: PMC9930028 DOI: 10.1007/s10462-023-10413-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 02/01/2023] [Indexed: 02/19/2023]
Abstract
Drugs approved for sale must be effective and safe, implying that the drug's advantages outweigh its known harmful side effects. Side effects (SE) of drugs are one of the common reasons for drug failure that may halt the whole drug discovery pipeline. The side effects might vary from minor concerns like a runny nose to potentially life-threatening issues like liver damage, heart attack, and death. Therefore, predicting the side effects of a drug is vital in drug development, discovery, and design. Supervised machine learning-based side effect prediction has recently received much attention since it reduces time, chemical waste, design complexity, risk of failure, and cost. Supervised learning approaches for predicting side effects have emerged as essential computational tools. Supervised machine learning techniques provide early information on drug side effects, which helps in developing an effective drug based on drug properties. Still, there are several challenges to predicting drug side effects. Thus, a near-exhaustive survey is carried out in this paper on the use of supervised machine learning approaches employed in drug side effect prediction tasks in the past two decades. In addition, this paper summarizes the drug descriptors required for the side effect prediction task, commonly utilized sources of drug properties, computational models, and their performances. Finally, the research gaps, open problems, and challenges for further supervised learning-based side effect prediction are discussed.
Affiliation(s)
- Pranab Das
  - Department of Computer Science and Engineering, National Institute of Technology Nagaland, Chumukedima, Dimapur, Nagaland 797103 India
- Dilwar Hussain Mazumder
  - Department of Computer Science and Engineering, National Institute of Technology Nagaland, Chumukedima, Dimapur, Nagaland 797103 India
29
Vashishat A, Gupta GD, Kurmi BD. Revolutionizing Drug Discovery: The Role of AI and Machine Learning. Curr Pharm Des 2023; 29:3087-3088. [PMID: 38083886 DOI: 10.2174/0113816128287941231206050340] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2023] [Accepted: 11/16/2023] [Indexed: 01/05/2024]
Affiliation(s)
- Abhinav Vashishat
  - Department of Pharmaceutics, ISF College of Pharmacy, GT Road, Moga, Punjab 142001, India
- Ghanshyam Das Gupta
  - Department of Pharmaceutics, ISF College of Pharmacy, GT Road, Moga, Punjab 142001, India
- Balak Das Kurmi
  - Department of Pharmaceutics, ISF College of Pharmacy, GT Road, Moga, Punjab 142001, India