1
Shi W, Yang H, Xie L, Yin XX, Zhang Y. A review of machine learning-based methods for predicting drug-target interactions. Health Inf Sci Syst 2024; 12:30. [PMID: 38617016 PMCID: PMC11014838 DOI: 10.1007/s13755-024-00287-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/11/2023] [Accepted: 03/04/2024] [Indexed: 04/16/2024] Open
Abstract
The prediction of drug-target interactions (DTI) is a crucial preliminary stage in drug discovery and development, given the substantial risk of failure and the prolonged validation period associated with in vitro and in vivo experiments. In the contemporary landscape, various machine learning-based methods have emerged as indispensable tools for DTI prediction. This paper begins by placing emphasis on the data representation employed by these methods, delineating five representations for drugs and four for proteins. The methods are then categorized into traditional machine learning-based approaches and deep learning-based ones, with a discussion of representative approaches in each category and the introduction of a novel taxonomy for deep neural network models in DTI prediction. Additionally, we present a synthesis of commonly used datasets and evaluation metrics to facilitate practical implementation. In conclusion, we address current challenges and outline potential future directions in this research field.
Affiliation(s)
- Wen Shi
  - Cyberspace Institute of Advanced Technology, Guangzhou University, Guangzhou, 510006 China
  - School of Computer Science and Technology, Zhejiang Normal University, Jinhua, 321004 China
- Hong Yang
  - Cyberspace Institute of Advanced Technology, Guangzhou University, Guangzhou, 510006 China
- Linhai Xie
  - State Key Laboratory of Proteomics, National Center for Protein Sciences (Beijing), Beijing, 102206 China
- Xiao-Xia Yin
  - Cyberspace Institute of Advanced Technology, Guangzhou University, Guangzhou, 510006 China
- Yanchun Zhang
  - School of Computer Science and Technology, Zhejiang Normal University, Jinhua, 321004 China
  - Department of New Networks, Peng Cheng Laboratory, Shenzhen, 518000 China
2
Liu S, Chen H, Tang L, Liu M, Chen J, Wang D. WGCNA and machine learning analysis identified SAMD9 and IFIT3 as primary Sjögren's Syndrome key genes. Heliyon 2024; 10:e29652. [PMID: 38707449 PMCID: PMC11068537 DOI: 10.1016/j.heliyon.2024.e29652] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/25/2023] [Revised: 04/10/2024] [Accepted: 04/11/2024] [Indexed: 05/07/2024] Open
Abstract
Background Current treatments for primary Sjögren's Syndrome (pSS) are with limited effect, partially due to the heterogeneity and uncleared mechanism. Methods We got GSE40568 (Japan) and GSE40611 (USA), and analyzed them with WGCNA to find key Differentially expressed genes (DEGs) between pSS and healthy salivary glands (SG). Key pSS genes (KPGs) were further selected through 3 machine-learning methods. The expression of KPGs was validated via two other GEO datasets (GSE127952 and GSE154926). Infiltrated immune cells, ceRNA network, and potential compounds were explored. Results Our study identified 376 DEGs from the pSS patients, with 186 genes located in the "plum2" module, showing the strongest correlation with clinical characteristics. SAMD9 and IFIT3 emerged as KPGs with excellent diagnostic potential. SAMD9 demonstrated close association with immune cell infiltration. We constructed a lncRNA-miRNA-mRNA network comprising 2 KPGs, 12 miRNAs, 124 lncRNAs, and potential therapeutic targets. Conclusion In the investigation of pSS public datasets, our study revealed two potential critical mediators in the pathological process of pSS salivary glands, namely SAMD9 and IFIT3. Furthermore, we put forth a hypothesis regarding the ceRNA network and made predictions regarding potential therapeutic drugs targeting these two genes.
Affiliation(s)
- Shu Liu
  - Department of Rheumatology and Immunology, Nanjing Drum Tower Hospital, Clinical College of Nanjing University of Chinese Medicine, China
  - Department of Rheumatology and Immunology, Nanjing Drum Tower Hospital, Affiliated Hospital of Medical School, Nanjing University, China
- Hongzhen Chen
  - Department of Rheumatology and Immunology, Nanjing Drum Tower Hospital, Clinical College of Nanjing University of Chinese Medicine, China
- Lin Tang
  - Department of Rheumatology and Immunology, Nanjing Drum Tower Hospital, Affiliated Hospital of Medical School, Nanjing University, China
- Mian Liu
  - Department of Rheumatology and Immunology, Nanjing Drum Tower Hospital, Clinical College of Nanjing University of Chinese Medicine, China
- Jinfeng Chen
  - Department of Rheumatology and Immunology, Nanjing Drum Tower Hospital, Affiliated Hospital of Medical School, Nanjing University, China
- Dandan Wang
  - Department of Rheumatology and Immunology, Nanjing Drum Tower Hospital, Clinical College of Nanjing University of Chinese Medicine, China
  - Department of Rheumatology and Immunology, Nanjing Drum Tower Hospital, Affiliated Hospital of Medical School, Nanjing University, China
3
Kuznetsov M, Ryabov F, Schutski R, Shayakhmetov R, Lin YC, Aliper A, Polykovskiy D. COSMIC: Molecular Conformation Space Modeling in Internal Coordinates with an Adversarial Framework. J Chem Inf Model 2024; 64:3610-3620. [PMID: 38668753 PMCID: PMC11094738 DOI: 10.1021/acs.jcim.3c00989] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/27/2023] [Revised: 03/29/2024] [Accepted: 04/02/2024] [Indexed: 05/14/2024]
Abstract
The fast and accurate conformation space modeling is an essential part of computational approaches for solving ligand and structure-based drug discovery problems. Recent state-of-the-art diffusion models for molecular conformation generation show promising distribution coverage and physical plausibility metrics but suffer from a slow sampling procedure. We propose a novel adversarial generative framework, COSMIC, that shows comparable generative performance but provides a time-efficient sampling and training procedure. Given a molecular graph and random noise, the generator produces a conformation in two stages. First, it constructs a conformation in a rotation and translation invariant representation: internal coordinates. In the second step, the model predicts the distances between neighboring atoms and performs a few fast optimization steps to refine the initial conformation. The proposed model considers conformation energy, achieving comparable space coverage, and diversity metrics results.
Affiliation(s)
- Maksim Kuznetsov
  - Insilico Medicine Canada Inc., 1250 René-Lévesque Ouest, Suite 3710, Montréal, Québec H3B 4W8, Canada
- Fedor Ryabov
  - Insilico Medicine Hong Kong Ltd., Unit 310, 3/F, Building 8W, Phase 2, Hong Kong Science Park, Pak Shek Kok, New Territories, Hong Kong 999077, China
- Roman Schutski
  - Insilico Medicine Hong Kong Ltd., Unit 310, 3/F, Building 8W, Phase 2, Hong Kong Science Park, Pak Shek Kok, New Territories, Hong Kong 999077, China
- Rim Shayakhmetov
  - Insilico Medicine Canada Inc., 1250 René-Lévesque Ouest, Suite 3710, Montréal, Québec H3B 4W8, Canada
- Yen-Chu Lin
  - Insilico Medicine Taiwan Ltd., Taipei City 110208, Taiwan
- Alex Aliper
  - Insilico Medicine Hong Kong Ltd., Unit 310, 3/F, Building 8W, Phase 2, Hong Kong Science Park, Pak Shek Kok, New Territories, Hong Kong 999077, China
- Daniil Polykovskiy
  - Insilico Medicine Canada Inc., 1250 René-Lévesque Ouest, Suite 3710, Montréal, Québec H3B 4W8, Canada
4
Abbas MKG, Rassam A, Karamshahi F, Abunora R, Abouseada M. The Role of AI in Drug Discovery. Chembiochem 2024:e202300816. [PMID: 38735845 DOI: 10.1002/cbic.202300816] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/03/2023] [Revised: 05/09/2024] [Accepted: 05/10/2024] [Indexed: 05/14/2024]
Abstract
The emergence of Artificial Intelligence (AI) in drug discovery marks a pivotal shift in pharmaceutical research, blending sophisticated computational techniques with conventional scientific exploration to break through enduring obstacles. This review paper elucidates the multifaceted applications of AI across various stages of drug development, highlighting significant advancements and methodologies. It delves into AI's instrumental role in drug design, polypharmacology, chemical synthesis, drug repurposing, and the prediction of drug properties such as toxicity, bioactivity, and physicochemical characteristics. Despite AI's promising advancements, the paper also addresses the challenges and limitations encountered in the field, including data quality, generalizability, computational demands, and ethical considerations. By offering a comprehensive overview of AI's role in drug discovery, this paper underscores the technology's potential to significantly enhance drug development, while also acknowledging the hurdles that must be overcome to fully realize its benefits.
Affiliation(s)
- M K G Abbas
  - Center for Advanced Materials, Qatar University, P.O. Box, 2713, Doha, Qatar
- Abrar Rassam
  - Secondary Education, Educational Sciences, Qatar University, P.O. Box, 2713, Doha, Qatar
- Fatima Karamshahi
  - Department of Chemistry and Earth Sciences, Qatar University, P.O. Box, 2713, Doha, Qatar
- Rehab Abunora
  - Faculty of Medicine, General Medicine and Surgery, Helwan University, Cairo, Egypt
- Maha Abouseada
  - Department of Chemistry and Earth Sciences, Qatar University, P.O. Box, 2713, Doha, Qatar
5
Zabihian A, Asghari J, Hooshmand M, Gharaghani S. A comparative analysis of computational drug repurposing approaches: proposing a novel tensor-matrix-tensor factorization method. Mol Divers 2024:10.1007/s11030-024-10851-7. [PMID: 38683487 DOI: 10.1007/s11030-024-10851-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/28/2023] [Accepted: 03/18/2024] [Indexed: 05/01/2024]
Abstract
Efficient drug discovery relies on drug repurposing, an important and open research field. This work presents a novel factorization method and a practical comparison of different approaches for drug repurposing. First, we propose a novel tensor-matrix-tensor (TMT) formulation as a new data array method with a gradient-based factorization procedure. Additionally, this paper examines and contrasts four computational drug repurposing approaches (factorization-based methods, machine learning methods, deep learning methods, and graph neural networks) to fulfill the second purpose. We test the strategies on two datasets and assess each approach's performance, drawbacks, problems, and benefits based on results. The results demonstrate that deep learning techniques work better than other strategies and that their results might be more reliable. Ultimately, graph neural methods need to be in an inductive manner to have a reliable prediction.
Affiliation(s)
- Arash Zabihian
  - Department of Bioinformatics, Kish International Campus, University of Tehran, Kish, Iran
- Javad Asghari
  - Department of Computer Science and Information Technology, Institute of Advanced Studies in Basic Sciences, Zanjan, Iran
- Mohsen Hooshmand
  - Department of Computer Science and Information Technology, Institute of Advanced Studies in Basic Sciences, Zanjan, Iran
- Sajjad Gharaghani
  - Laboratory of Bioinformatics and Drug Design, University of Tehran, Tehran, Iran
6
Zeng X, Chen W, Lei B. CAT-DTI: cross-attention and Transformer network with domain adaptation for drug-target interaction prediction. BMC Bioinformatics 2024; 25:141. [PMID: 38566002 DOI: 10.1186/s12859-024-05753-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/09/2024] [Accepted: 03/19/2024] [Indexed: 04/04/2024] Open
Abstract
Accurate and efficient prediction of drug-target interaction (DTI) is critical to advance drug development and reduce the cost of drug discovery. Recently, the employment of deep learning methods has enhanced DTI prediction precision and efficacy, but it still encounters several challenges. The first challenge lies in the efficient learning of drug and protein feature representations alongside their interaction features to enhance DTI prediction. Another important challenge is to improve the generalization capability of the DTI model within real-world scenarios. To address these challenges, we propose CAT-DTI, a model based on cross-attention and Transformer, possessing domain adaptation capability. CAT-DTI effectively captures the drug-target interactions while adapting to out-of-distribution data. Specifically, we use a convolution neural network combined with a Transformer to encode the distance relationship between amino acids within protein sequences and employ a cross-attention module to capture the drug-target interaction features. Generalization to new DTI prediction scenarios is achieved by leveraging a conditional domain adversarial network, aligning DTI representations under diverse distributions. Experimental results within in-domain and cross-domain scenarios demonstrate that CAT-DTI model overall improves DTI prediction performance compared with previous methods.
Affiliation(s)
- Xiaoting Zeng
  - School of Computer and Software, Shenzhen University, Shenzhen, 518060, China
- Weilin Chen
  - Marshall Laboratory of Biomedical Engineering, Shenzhen University Medical School, Shenzhen University, Shenzhen, 518055, China
- Baiying Lei
  - School of Biomedical Engineering, Shenzhen University, Shenzhen, 518055, China
7
Wang K, Kim N, Bagherian M, Li K, Chou E, Colacino JA, Dolinoy DC, Sartor MA. Gene Target Prediction of Environmental Chemicals Using Coupled Matrix-Matrix Completion. ENVIRONMENTAL SCIENCE & TECHNOLOGY 2024; 58:5889-5898. [PMID: 38501580 PMCID: PMC11131040 DOI: 10.1021/acs.est.4c00458] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/20/2024]
Abstract
Human exposure to toxic chemicals presents a huge health burden. Key to understanding chemical toxicity is knowledge of the molecular target(s) of the chemicals. Because a comprehensive safety assessment for all chemicals is infeasible due to limited resources, a robust computational method for discovering targets of environmental exposures is a promising direction for public health research. In this study, we implemented a novel matrix completion algorithm named coupled matrix-matrix completion (CMMC) for predicting direct and indirect exposome-target interactions, which exploits the vast amount of accumulated data regarding chemical exposures and their molecular targets. Our approach achieved an AUC of 0.89 on a benchmark data set generated using data from the Comparative Toxicogenomics Database. Our case studies with bisphenol A and its analogues, PFAS, dioxins, PCBs, and VOCs show that CMMC can be used to accurately predict molecular targets of novel chemicals without any prior bioactivity knowledge. Our results demonstrate the feasibility and promise of computationally predicting environmental chemical-target interactions to efficiently prioritize chemicals in hazard identification and risk assessment.
Affiliation(s)
- Kai Wang
  - Department of Computational Medicine and Bioinformatics, School of Medicine, University of Michigan, Ann Arbor, MI 48109, USA
- Nicole Kim
  - Department of Computational Medicine and Bioinformatics, School of Medicine, University of Michigan, Ann Arbor, MI 48109, USA
- Maryam Bagherian
  - Department of Computational Medicine and Bioinformatics, School of Medicine, University of Michigan, Ann Arbor, MI 48109, USA
  - Michigan Institute for Data Science (MIDAS), University of Michigan, Ann Arbor, MI 48109, USA
- Kai Li
  - Department of Computational Medicine and Bioinformatics, School of Medicine, University of Michigan, Ann Arbor, MI 48109, USA
- Elysia Chou
  - Department of Computational Medicine and Bioinformatics, School of Medicine, University of Michigan, Ann Arbor, MI 48109, USA
- Justin A. Colacino
  - Department of Environmental Health Sciences, School of Public Health, University of Michigan, Ann Arbor, MI 48109, USA
  - Department of Nutritional Sciences, School of Public Health, University of Michigan, Ann Arbor, MI 48109, USA
- Dana C. Dolinoy
  - Department of Environmental Health Sciences, School of Public Health, University of Michigan, Ann Arbor, MI 48109, USA
  - Department of Nutritional Sciences, School of Public Health, University of Michigan, Ann Arbor, MI 48109, USA
- Maureen A. Sartor
  - Department of Computational Medicine and Bioinformatics, School of Medicine, University of Michigan, Ann Arbor, MI 48109, USA
  - Department of Biostatistics, School of Public Health, University of Michigan, Ann Arbor, MI 48109, USA
8
Agu PC, Obulose CN. Piquing artificial intelligence towards drug discovery: Tools, techniques, and applications. Drug Dev Res 2024; 85:e22159. [PMID: 38375772 DOI: 10.1002/ddr.22159] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2023] [Revised: 01/12/2024] [Accepted: 01/29/2024] [Indexed: 02/21/2024]
Abstract
The purpose of this study was to discuss how artificial intelligence (AI) methods have affected the field of drug development. It looks at how AI models and data resources are reshaping the drug development process by offering more affordable and expedient options to conventional approaches. The paper opens with an overview of well-known information sources for drug development. The discussion then moves on to molecular representation techniques that make it possible to convert data into representations that computers can understand. The paper also gives a general overview of the algorithms used in the creation of drug discovery models based on AI. In particular, the paper looks at how AI algorithms might be used to forecast drug toxicity, drug bioactivity, and drug physicochemical properties. De novo drug design, binding affinity prediction, and other AI-based models for drug-target interaction were covered in deeper detail. Modern applications of AI in nanomedicine design and pharmacological synergism/antagonism prediction were also covered. The potential advantages of AI in drug development are highlighted as the evaluation comes to a close. It underlines how AI may greatly speed up and improve the efficiency of drug discovery, resulting in the creation of new and better medicines. To fully realize the promise of AI in drug discovery, the review acknowledges the difficulties that come with its uses in this field and advocates for more study and development.
Affiliation(s)
- Peter Chinedu Agu
  - Department of Biochemistry, College of Science, Evangel University, Akaeze, Ebonyi State, Nigeria
- Chidiebere Nwiboko Obulose
  - Department of Computer Sciences, Our Savior Institute of Science, Agriculture, and Technology (OSISATECH Polytechnic), Enugu, Nigeria
9
Huang D, Ye X, Sakurai T. Multi-party collaborative drug discovery via federated learning. Comput Biol Med 2024; 171:108181. [PMID: 38428094 DOI: 10.1016/j.compbiomed.2024.108181] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/12/2023] [Revised: 01/28/2024] [Accepted: 02/18/2024] [Indexed: 03/03/2024]
Abstract
In the field of drug discovery and pharmacology research, precise and rapid prediction of drug-target binding affinity (DTA) and drug-drug interaction (DDI) are essential for drug efficacy and safety. However, pharmacological data are often distributed across different institutions. Moreover, due to concerns regarding data privacy and intellectual property, the sharing of pharmacological data is often restricted. It is difficult for institutions to achieve the desired performance by solely utilizing their data. This urgent challenge calls for a solution that not only enhances collaboration between multiple institutions to improve prediction accuracy but also safeguards data privacy. In this study, we propose a novel federated learning (FL) framework to advance the prediction of DTA and DDI, namely FL-DTA and FL-DDI. The proposed framework enables multiple institutions to collaboratively train a predictive model without the need to share their local data. Moreover, to ensure data privacy, we employ secure multi-party computation (MPC) during the federated learning model aggregation phase. We evaluated the proposed method on two DTA and one DDI benchmark datasets and compared them with centralized learning and local learning. The experimental results indicate that the proposed method performs closely to centralized learning, and significantly outperforms local learning. Moreover, the proposed framework ensures data security while promoting collaboration among institutions, thereby accelerating the drug discovery process.
Affiliation(s)
- Dong Huang
  - Department of Computer Science, University of Tsukuba, Tsukuba, 3058577, Japan
- Xiucai Ye
  - Department of Computer Science, University of Tsukuba, Tsukuba, 3058577, Japan
- Tetsuya Sakurai
  - Department of Computer Science, University of Tsukuba, Tsukuba, 3058577, Japan
10
Singh S, Pandey AK, Prajapati VK. From genome to clinic: The power of translational bioinformatics in improving human health. ADVANCES IN PROTEIN CHEMISTRY AND STRUCTURAL BIOLOGY 2024; 139:1-25. [PMID: 38448133 DOI: 10.1016/bs.apcsb.2023.11.010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/08/2024]
Abstract
Translational bioinformatics (TBI) has transformed healthcare by providing personalized medicine and tailored treatment options by integrating genomic data and clinical information. In recent years, TBI has bridged the gap between genome and clinical data because of significant advances in informatics like quantum computing and utilizing state-of-the-art technologies. This chapter discusses the power of translational bioinformatics in improving human health, from uncovering disease-causing genes and variations to establishing new therapeutic techniques. We discuss key application areas of bioinformatics in clinical genomics, such as data sources and methods used in translational bioinformatics, the impact of translational bioinformatics on human health, and how machine learning and artificial intelligence are being used to mine vast amounts of data for drug development and precision medicine. We also look at the problems, constraints, and ethical concerns connected with exploiting genomic data and the future of translational bioinformatics and its potential impact on medicine and human health. Ultimately, this chapter emphasizes the great potential of translational bioinformatics to alter healthcare and enhance patient outcomes.
Affiliation(s)
- Satyendra Singh
  - Department of Biochemistry, School of Life Sciences, Central University of Rajasthan, Bandarsindri, Kishangarh, Ajmer, Rajasthan, India
- Anurag Kumar Pandey
  - College of Biotechnology, Sardar Vallabhbhai Patel University of Agriculture and Technology, Meerut, Uttar Pradesh, India
- Vijay Kumar Prajapati
  - Department of Biochemistry, University of Delhi South Campus, Dhaula Kuan, New Delhi, India
11
Xu W, Yang X, Guan Y, Cheng X, Wang Y. Integrative approach for predicting drug-target interactions via matrix factorization and broad learning systems. MATHEMATICAL BIOSCIENCES AND ENGINEERING : MBE 2024; 21:2608-2625. [PMID: 38454698 DOI: 10.3934/mbe.2024115] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/09/2024]
Abstract
In the drug discovery process, time and costs are the most typical problems resulting from the experimental screening of drug-target interactions (DTIs). To address these limitations, many computational methods have been developed to achieve more accurate predictions. However, identifying DTIs mostly rely on separate learning tasks with drug and target features that neglect interaction representation between drugs and target. In addition, the lack of these relationships may lead to a greatly impaired performance on the prediction of DTIs. Aiming at capturing comprehensive drug-target representations and simplifying the network structure, we propose an integrative approach with a convolution broad learning system for the DTI prediction (ConvBLS-DTI) to reduce the impact of the data sparsity and incompleteness. First, given the lack of known interactions for the drug and target, the weighted K-nearest known neighbors (WKNKN) method was used as a preprocessing strategy for unknown drug-target pairs. Second, a neighborhood regularized logistic matrix factorization (NRLMF) was applied to extract features of updated drug-target interaction information, which focused more on the known interaction pair parties. Then, a broad learning network incorporating a convolutional neural network was established to predict DTIs, which can make classification more effective using a different perspective. Finally, based on the four benchmark datasets in three scenarios, the ConvBLS-DTI's overall performance out-performed some mainstream methods. The test results demonstrate that our model achieves improved prediction effect on the area under the receiver operating characteristic curve and the precision-recall curve.
Affiliation(s)
- Wanying Xu
  - College of Computer Science & Technology, Qingdao University, Qingdao 266071, China
- Xixin Yang
  - College of Computer Science & Technology, Qingdao University, Qingdao 266071, China
  - School of Automation, Qingdao University, Qingdao 266071, China
- Yuanlin Guan
  - Key Lab of Industrial Fluid Energy Conservation and Pollution Control, Ministry of Education, Qingdao University of Technology, Qingdao 266520, China
  - School of Mechanical & Automotive Engineering, Qingdao University of Technology, Qingdao 266520, China
- Xiaoqing Cheng
  - College of Computer Science & Technology, Qingdao University, Qingdao 266071, China
- Yu Wang
  - College of Computer Science & Technology, Qingdao University, Qingdao 266071, China
12
Yang X, Huang K, Yang D, Zhao W, Zhou X. Biomedical Big Data Technologies, Applications, and Challenges for Precision Medicine: A Review. GLOBAL CHALLENGES (HOBOKEN, NJ) 2024; 8:2300163. [PMID: 38223896 PMCID: PMC10784210 DOI: 10.1002/gch2.202300163] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/02/2023] [Revised: 09/20/2023] [Indexed: 01/16/2024]
Abstract
The explosive growth of biomedical Big Data presents both significant opportunities and challenges in the realm of knowledge discovery and translational applications within precision medicine. Efficient management, analysis, and interpretation of big data can pave the way for groundbreaking advancements in precision medicine. However, the unprecedented strides in the automated collection of large-scale molecular and clinical data have also introduced formidable challenges in terms of data analysis and interpretation, necessitating the development of novel computational approaches. Some potential challenges include the curse of dimensionality, data heterogeneity, missing data, class imbalance, and scalability issues. This overview article focuses on the recent progress and breakthroughs in the application of big data within precision medicine. Key aspects are summarized, including content, data sources, technologies, tools, challenges, and existing gaps. Nine fields (Datawarehouse and data management, electronic medical record, biomedical imaging informatics, Artificial intelligence-aided surgical design and surgery optimization, omics data, health monitoring data, knowledge graph, public health informatics, and security and privacy) are discussed.
Affiliation(s)
- Xue Yang
  - Department of Pancreatic Surgery and West China Biomedical Big Data Center, West China Hospital, Sichuan University, Chengdu 610041, China
- Kexin Huang
  - Department of Pancreatic Surgery and West China Biomedical Big Data Center, West China Hospital, Sichuan University, Chengdu 610041, China
- Dewei Yang
  - College of Advanced Manufacturing Engineering, Chongqing University of Posts and Telecommunications, Chongqing, Chongqing 400000, China
- Weiling Zhao
  - Center for Systems Medicine, School of Biomedical Informatics, UTHealth at Houston, Houston, TX 77030, USA
- Xiaobo Zhou
  - Center for Systems Medicine, School of Biomedical Informatics, UTHealth at Houston, Houston, TX 77030, USA
13
Abdul Raheem AK, Dhannoon BN. Comprehensive Review on Drug-target Interaction Prediction - Latest Developments and Overview. Curr Drug Discov Technol 2024; 21:e010923220652. [PMID: 37680152 DOI: 10.2174/1570163820666230901160043] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/23/2023] [Revised: 05/29/2023] [Accepted: 07/18/2023] [Indexed: 09/09/2023]
Abstract
Drug-target interactions (DTIs) are an important part of the drug development process. When the drug (a chemical molecule) binds to a target (proteins or nucleic acids), it modulates the biological behavior/function of the target, returning it to its normal state. Predicting DTIs plays a vital role in the drug discovery (DD) process as it has the potential to enhance efficiency and reduce costs. However, DTI prediction poses significant challenges and expenses due to the time-consuming and costly nature of experimental assays. As a result, researchers have increased their efforts to identify the association between medications and targets in the hopes of speeding up drug development and shortening the time to market. This paper provides a detailed discussion of the initial stage in drug discovery, namely drug-target interactions. It focuses on exploring the application of machine learning methods within this step. Additionally, we aim to conduct a comprehensive review of relevant papers and databases utilized in this field. Drug-target interaction prediction covers a wide range of applications: drug discovery, prediction of adverse effects and drug repositioning. The prediction of drug-target interactions can be categorized into three main computational methods: docking simulation approaches, ligand-based methods, and machine-learning techniques.
Collapse
Affiliation(s)
- Ali K Abdul Raheem
- Software Department, College of Information Technology, University of Babylon, Hillah, Babil, Iraq
- University of Warith Al-Anbiyaa, Kerbala, Iraq
| | - Ban N Dhannoon
- Department of Computer Science, College of Science, Al-Nahrain University, Baghdad, Iraq
| |
|
14
|
Tiwari PC, Pal R, Chaudhary MJ, Nath R. Artificial intelligence revolutionizing drug development: Exploring opportunities and challenges. Drug Dev Res 2023; 84:1652-1663. [PMID: 37712494 DOI: 10.1002/ddr.22115] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/16/2023] [Revised: 08/14/2023] [Accepted: 09/04/2023] [Indexed: 09/16/2023]
Abstract
By harnessing artificial intelligence (AI) algorithms and machine learning techniques, the entire drug discovery process stands to undergo a profound transformation, offering a myriad of advantages. Foremost among these is the ability of AI to conduct swift and efficient screenings of expansive compound libraries, significantly augmenting the identification of potential drug candidates. Moreover, AI algorithms can prove instrumental in predicting the efficacy and safety profiles of candidate compounds, thus endowing invaluable insights and reducing reliance on extensive preclinical and clinical testing. This predictive capacity of AI has the potential to streamline the drug development pipeline and enhance the success rate of clinical trials, ultimately resulting in the emergence of more efficacious and safer therapeutic agents. However, the deployment of AI in drug discovery introduces certain challenges that warrant attention. A primary hurdle entails the imperative acquisition of high-quality and diverse data. Furthermore, ensuring the interpretability of AI models assumes critical importance in securing regulatory endorsement and cultivating trust within scientific and medical communities. Addressing ethical considerations, including data privacy and mitigating bias, represents an additional momentous challenge, requiring assiduous navigation. In this review, we provide an intricate and comprehensive overview of the multifaceted challenges intrinsic to conventional drug development paradigms, while simultaneously interrogating the efficacy of AI in effectively surmounting these formidable obstacles.
Affiliation(s)
- Prafulla C Tiwari
- Department of Pharmacology and Therapeutics, King George's Medical University, Lucknow, Uttar Pradesh, India
| | - Rishi Pal
- Department of Pharmacology and Therapeutics, King George's Medical University, Lucknow, Uttar Pradesh, India
| | - Manju J Chaudhary
- Department of Physiology, Government Medical College, Kannauj, Uttar Pradesh, India
| | - Rajendra Nath
- Department of Pharmacology and Therapeutics, King George's Medical University, Lucknow, Uttar Pradesh, India
| |
|
15
|
Hua Y, Luo L, Qiu H, Huang D, Zhao Y, Liu H, Lu T, Chen Y, Zhang Y, Jiang Y. Multimodal multi-task deep neural network framework for kinase-target prediction. Mol Divers 2023; 27:2491-2503. [PMID: 36369613 DOI: 10.1007/s11030-022-10565-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/18/2022] [Accepted: 11/01/2022] [Indexed: 11/13/2022]
Abstract
Kinases play a significant role in various disease signaling pathways. Because the sequences of kinase family members are highly conserved, understanding the selectivity profile of kinase inhibitors remains a priority for drug discovery. Previous methods for kinase selectivity identification use biochemical assays, which are very useful but limited by the proteins available. A lack of kinase selectivity can be beneficial but can also cause adverse effects. With the explosion of kinase activity data, current computational methods can achieve accurate large-scale selectivity predictions. Here, we present a multimodal multi-task deep neural network model for kinase selectivity prediction based on calculated fingerprints and physicochemical descriptors. With multimodal inputs of structure and physicochemical property information, the multi-task framework could accurately predict the kinome map for selectivity analysis. The proposed model displays better performance for kinase-target prediction based on system evaluations.
Affiliation(s)
- Yi Hua
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing, 211198, China
| | - Lin Luo
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing, 211198, China
| | - Haodi Qiu
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing, 211198, China
| | - Dingfang Huang
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing, 211198, China
| | - Yang Zhao
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing, 211198, China
| | - Haichun Liu
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing, 211198, China
| | - Tao Lu
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing, 211198, China
- State Key Laboratory of Natural Medicines, China Pharmaceutical University, 24 Tongjiaxiang, Nanjing, 210009, China
| | - Yadong Chen
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing, 211198, China.
| | - Yanmin Zhang
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing, 211198, China.
| | - Yulei Jiang
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing, 211198, China.
| |
|
16
|
Wang Y, Zhang Z, Piao C, Huang Y, Zhang Y, Zhang C, Lu YJ, Liu D. LDS-CNN: a deep learning framework for drug-target interactions prediction based on large-scale drug screening. Health Inf Sci Syst 2023; 11:42. [PMID: 37667773 PMCID: PMC10475000 DOI: 10.1007/s13755-023-00243-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/26/2023] [Accepted: 08/14/2023] [Indexed: 09/06/2023] Open
Abstract
Background: Drug-target interaction (DTI) is a vital drug design strategy that plays a significant role in many processes of complex diseases and cellular events. In the face of challenges such as extensive protein data and experimental costs, bioinformatics approaches can be applied to exploit potential interactions and design new targeted medications. Different data and interaction types complicate such studies because their formats are incompatible and heterogeneous. The analysis of drug-target interactions in a comprehensive and unified model is a significant challenge. Method: Here, we propose a general method for predicting interactions between small-molecule drugs and protein targets, the Large-scale Drug target Screening Convolutional Neural Network (LDS-CNN), which uses unified encoding to process the different data formats in an integrated model for feature abstraction and potential object prediction. Result: On 898,412 interaction data involving 1683 small-molecule compounds and 14,350 human proteins from 8.8 billion records, the proposed method achieved an area under the curve (AUC) of 0.96, an area under the precision-recall curve (AUPRC) of 0.95, and an accuracy of 90.13%. The experimental results illustrated that the proposed method attained high accuracy on the test set, indicating its high predictive ability in drug-target interaction prediction. LDS-CNN is effective for the prediction of large-scale datasets and datasets composed of data with different formats. Conclusion: In this study, we propose a DTI prediction method to solve the problem of unified encoding of large-scale data in multiple formats. It provides a feasible way to efficiently abstract the features among different types of drug-related data, thus reducing experimental costs and time consumption. The proposed method can be used to identify potential drug targets and candidates for the treatment of complex diseases. This work provides a reference for processing large-scale DTI data in different formats with deep learning methods and provides certain suggestions for future research.
Affiliation(s)
- Yang Wang
- School of Computer Science and Technology, Guangdong University of Technology, Guangzhou, 510006 China
| | - Zuxian Zhang
- School of Biomedical and Pharmaceutical Sciences, Guangdong University of Technology, Guangzhou, 510006 China
| | - Chenghong Piao
- The First Affiliated Hospital of Ningbo University, Ningbo, 315010 China
| | - Ying Huang
- School of Biomedical and Pharmaceutical Sciences, Guangdong University of Technology, Guangzhou, 510006 China
| | - Yihan Zhang
- School of Biomedical and Pharmaceutical Sciences, Guangdong University of Technology, Guangzhou, 510006 China
| | - Chi Zhang
- Shanghai Institute of Biological Products, Shanghai, 201403 China
| | - Yu-Jing Lu
- School of Biomedical and Pharmaceutical Sciences, Guangdong University of Technology, Guangzhou, 510006 China
- Smart Medical Innovation Technology Center, Guangdong University of Technology, Guangzhou, 510006 China
| | - Dongning Liu
- School of Computer Science and Technology, Guangdong University of Technology, Guangzhou, 510006 China
| |
|
17
|
Chen J, Gu Z, Lai L, Pei J. In silico protein function prediction: the rise of machine learning-based approaches. MEDICAL REVIEW (2021) 2023; 3:487-510. [PMID: 38282798 PMCID: PMC10808870 DOI: 10.1515/mr-2023-0038] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2023] [Accepted: 10/11/2023] [Indexed: 01/30/2024]
Abstract
Proteins function as integral actors in essential life processes, rendering the realm of protein research a fundamental domain that possesses the potential to propel advancements in pharmaceuticals and disease investigation. Within the context of protein research, an imperious demand arises to uncover protein functionalities and untangle intricate mechanistic underpinnings. Due to the exorbitant costs and limited throughput inherent in experimental investigations, computational models offer a promising alternative to accelerate protein function annotation. In recent years, protein pre-training models have exhibited noteworthy advancement across multiple prediction tasks. This advancement highlights a notable prospect for effectively tackling the intricate downstream task associated with protein function prediction. In this review, we elucidate the historical evolution and research paradigms of computational methods for predicting protein function. Subsequently, we summarize the progress in protein and molecule representation as well as feature extraction techniques. Furthermore, we assess the performance of machine learning-based algorithms across various objectives in protein function prediction, thereby offering a comprehensive perspective on the progress within this field.
Affiliation(s)
- Jiaxiao Chen
- Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China
| | - Zhonghui Gu
- Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China
| | - Luhua Lai
- Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China
- Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China
- BNLMS, College of Chemistry and Molecular Engineering, Peking University, Beijing, China
- Research Unit of Drug Design Method, Chinese Academy of Medical Sciences (2021RU014), Beijing, China
| | - Jianfeng Pei
- Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China
- Research Unit of Drug Design Method, Chinese Academy of Medical Sciences (2021RU014), Beijing, China
| |
|
18
|
Ru X, Zou Q, Lin C. Optimization of drug-target affinity prediction methods through feature processing schemes. Bioinformatics 2023; 39:btad615. [PMID: 37812388 PMCID: PMC10636279 DOI: 10.1093/bioinformatics/btad615] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/15/2023] [Revised: 09/19/2023] [Accepted: 10/07/2023] [Indexed: 10/10/2023] Open
Abstract
MOTIVATION: Numerous high-accuracy drug-target affinity (DTA) prediction models, whose performance is heavily reliant on the drug and target feature information, are developed at the expense of complexity and interpretability. Feature extraction and optimization constitute a critical step that significantly influences model performance, robustness, and interpretability. Many existing studies aim to comprehensively characterize drugs and targets by extracting features from multiple perspectives; however, this approach has drawbacks: (i) an abundance of redundant or noisy features; and (ii) feature sets that often suffer from high dimensionality. RESULTS: In this study, to obtain a model with high accuracy and strong interpretability, we utilize various traditional and cutting-edge feature selection and dimensionality reduction techniques to process self-associated features and adjacent associated features. These optimized features are then fed into a learning-to-rank model to achieve efficient DTA prediction. Extensive experimental results on two commonly used datasets indicate that, among various feature optimization methods, the regression tree-based feature selection method is most beneficial for constructing models with good performance and strong robustness. Then, by utilizing Shapley Additive Explanations values and the incremental feature selection approach, we find that the high-quality feature subset consists of the top 150D features and that the top 20D features have a breakthrough impact on DTA prediction. In conclusion, our study thoroughly validates the importance of feature optimization in DTA prediction and serves as inspiration for constructing high-performance and highly interpretable models. AVAILABILITY AND IMPLEMENTATION: https://github.com/RUXIAOQING964914140/FS_DTA.
Affiliation(s)
- Xiaoqing Ru
- Department of Computer Science, University of Tsukuba, Tsukuba, Japan
| | - Quan Zou
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
- Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, Zhejiang, China
| | - Chen Lin
- Department of Computer Science and Technology, School of Informatics, Xiamen University, Xiamen, Fujian, 361005, China
| |
|
19
|
Zhang J, Xie M. Graph regularized non-negative matrix factorization with [Formula: see text] norm regularization terms for drug-target interactions prediction. BMC Bioinformatics 2023; 24:375. [PMID: 37789278 PMCID: PMC10548602 DOI: 10.1186/s12859-023-05496-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 04/08/2023] [Accepted: 09/22/2023] [Indexed: 10/05/2023] Open
Abstract
BACKGROUND: Identifying drug-target interactions (DTIs) plays a key role in drug development. Traditional wet experiments to identify DTIs are costly and time consuming. Effective computational methods to predict DTIs are useful to speed up the process of drug discovery. A variety of non-negative matrix factorization-based methods have been proposed to predict DTIs, but most of them overlook the sparsity of feature matrices and the convergence of the adopted matrix factorization algorithms, so their performance can be further improved. RESULTS: In order to predict DTIs more accurately, we propose a novel method, iPALM-DLMF. iPALM-DLMF models DTI prediction as a problem of non-negative matrix factorization with graph dual regularization terms and [Formula: see text] norm regularization terms. The graph dual regularization terms are used to integrate the information from the drug similarity matrix and the target similarity matrix, and the [Formula: see text] norm regularization terms are used to ensure the sparsity of the feature matrices obtained by non-negative matrix factorization. To solve the model, iPALM-DLMF adopts non-negative double singular value decomposition to initialize the non-negative matrix factorization, and an inertial Proximal Alternating Linearized Minimization iterating process, which has been proved to converge to a KKT point, to obtain the final result of the matrix factorization. Extensive experimental results show that iPALM-DLMF has better performance than other state-of-the-art methods. In case studies, of the 50 highest-scoring proteins predicted by iPALM-DLMF to be targeted by the drug gabapentin, 46 have been validated, and of the 50 highest-scoring drugs predicted by iPALM-DLMF to target prostaglandin-endoperoxide synthase 2, 47 have been validated.
Affiliation(s)
- Junjun Zhang
- Key Laboratory of Computing and Stochastic Mathematics(LCSM) (Ministry of Education), School of Mathematics and Statistics, Hunan Normal University, Changsha, 410081 China
| | - Minzhu Xie
- Key Laboratory of Computing and Stochastic Mathematics(LCSM) (Ministry of Education), School of Mathematics and Statistics, Hunan Normal University, Changsha, 410081 China
- College of Information Science and Engineering, Hunan Normal University, Changsha, 410081 China
| |
|
20
|
Viljanen M, Minnema J, Wassenaar PNH, Rorije E, Peijnenburg W. What is the ecotoxicity of a given chemical for a given aquatic species? Predicting interactions between species and chemicals using recommender system techniques. SAR AND QSAR IN ENVIRONMENTAL RESEARCH 2023; 34:765-788. [PMID: 37670728 DOI: 10.1080/1062936x.2023.2254225] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/21/2023] [Accepted: 08/27/2023] [Indexed: 09/07/2023]
Abstract
Ecotoxicological safety assessment of chemicals requires toxicity data on multiple species, despite the general desire to minimize animal testing. Predictive models, specifically machine learning (ML) methods, are one of the tools capable of resolving this apparent contradiction, as they allow toxicity patterns to be generalized across chemicals and species. However, despite the availability of large public toxicity datasets, the data are highly sparse, complicating model development. The aim of this study is to provide insights into how ML can predict toxicity using a large but sparse dataset. We developed models to predict LC50 values, based on experimental LC50 data covering 2431 organic chemicals and 1506 aquatic species from the ECOTOX database. Several well-known ML techniques were evaluated and a new ML model was developed, inspired by recommender systems. This new model involves a simple linear model that learns low-rank interactions between species and chemicals using factorization machines. We evaluated the predictive performance of the developed models in two validation settings: 1) predicting unseen chemical-species pairs, and 2) predicting unseen chemicals. The results of this study show that ML models can accurately predict LC50 values in both validation settings. Moreover, we show that the novel factorization machine approach can match well-tuned, complex ML approaches.
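The low-rank species-chemical interaction model described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the only inputs are one-hot species and chemical identities (so the factorization machine reduces to a global bias, one weight per species, one weight per chemical, and a dot product of two latent vectors), it uses plain stochastic gradient descent as a stand-in optimizer, and the toy values and all function names (`fm_predict`, `fit_fm`, `mse`) are hypothetical.

```python
import random


def fm_predict(bias, w_species, w_chemical, v_species, v_chemical, s, c):
    """Global bias + species weight + chemical weight + low-rank interaction."""
    interaction = sum(a * b for a, b in zip(v_species[s], v_chemical[c]))
    return bias + w_species[s] + w_chemical[c] + interaction


def mse(params, observations):
    """Mean squared error over (species index, chemical index, value) triples."""
    total = sum((fm_predict(*params, s, c) - y) ** 2 for s, c, y in observations)
    return total / len(observations)


def fit_fm(observations, n_species, n_chemicals,
           rank=2, lr=0.05, reg=0.01, epochs=200, seed=0):
    """Fit the model with SGD (an assumed optimizer, chosen for brevity)."""
    rng = random.Random(seed)
    bias = 0.0
    w_s = [0.0] * n_species
    w_c = [0.0] * n_chemicals
    # Small random latent vectors; `rank` is the interaction dimensionality.
    v_s = [[rng.gauss(0.0, 0.1) for _ in range(rank)] for _ in range(n_species)]
    v_c = [[rng.gauss(0.0, 0.1) for _ in range(rank)] for _ in range(n_chemicals)]
    for _ in range(epochs):
        for s, c, y in observations:
            err = fm_predict(bias, w_s, w_c, v_s, v_c, s, c) - y
            bias -= lr * err
            w_s[s] -= lr * (err + reg * w_s[s])
            w_c[c] -= lr * (err + reg * w_c[c])
            for k in range(rank):
                vs, vc = v_s[s][k], v_c[c][k]
                v_s[s][k] -= lr * (err * vc + reg * vs)
                v_c[c][k] -= lr * (err * vs + reg * vc)
    return bias, w_s, w_c, v_s, v_c


# Made-up toy data: (species index, chemical index, log-scale toxicity value).
toy = [(0, 0, 1.0), (0, 1, 2.0), (1, 0, 1.5), (1, 2, 3.0), (2, 1, 2.5), (2, 2, 3.5)]
params = fit_fm(toy, n_species=3, n_chemicals=3)
# Species 0 and chemical 2 were never observed together (an unseen pair).
print(fm_predict(*params, 0, 2))
```

Predicting an unseen pair works here because each species and each chemical appears in at least one training triple. An entirely unseen chemical has no learned weight or latent vector in this identity-only sketch, so that setting would need additional chemical descriptors as input features.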
Affiliation(s)
- M Viljanen
- Department of Statistics, Data Science and Modelling, National Institute of Public Health and the Environment, Bilthoven, The Netherlands
| | - J Minnema
- Center for Safety of Substances and Products, National Institute of Public Health and the Environment, Bilthoven, The Netherlands
| | - P N H Wassenaar
- Center for Safety of Substances and Products, National Institute of Public Health and the Environment, Bilthoven, The Netherlands
| | - E Rorije
- Center for Safety of Substances and Products, National Institute of Public Health and the Environment, Bilthoven, The Netherlands
| | - W Peijnenburg
- Center for Safety of Substances and Products, National Institute of Public Health and the Environment, Bilthoven, The Netherlands
- Institute of Environmental Sciences (CML), Leiden University, Leiden, The Netherlands
| |
|
21
|
Lee M, Min K. AmorProt: Amino Acid Molecular Fingerprints Repurposing-Based Protein Fingerprint. Biochemistry 2023; 62:2700-2709. [PMID: 37622182 DOI: 10.1021/acs.biochem.3c00253] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/26/2023]
Abstract
As protein therapeutics play an important role in almost all medical fields, numerous studies have been conducted on proteins using artificial intelligence. Artificial intelligence has enabled data-driven predictions without the need for expensive experiments. Nevertheless, unlike the various molecular fingerprint algorithms that have been developed, protein fingerprint algorithms have rarely been studied. In this study, we proposed the amino acid molecular fingerprints repurposing-based protein (AmorProt) fingerprint, a protein sequence representation method that effectively uses the molecular fingerprints corresponding to 20 amino acids. Subsequently, the performances of the tree-based machine learning and artificial neural network models were compared using (1) amyloid classification and (2) isoelectric point regression. Finally, the applicability and advantages of the developed platform were demonstrated through a case study and the following experiments: (3) comparison of dataset dependence with feature-based methods, (4) feature importance analysis, and (5) protein space analysis. Consequently, the significantly improved model performance and data-set-independent versatility of the AmorProt fingerprint were verified. The results revealed that the current protein representation method can be applied to various fields related to proteins, such as predicting their fundamental properties or interaction with ligands.
Affiliation(s)
- Myeonghun Lee
- School of Systems Biomedical Science, Soongsil University, 369 Sangdo-ro, Dongjak-gu, Seoul 06978, Republic of Korea
| | - Kyoungmin Min
- School of Mechanical Engineering, Soongsil University, 369 Sangdo-ro, Dongjak-gu, Seoul 06978, Republic of Korea
| |
|
22
|
Wang L, Zhou Y, Chen Q. AMMVF-DTI: A Novel Model Predicting Drug-Target Interactions Based on Attention Mechanism and Multi-View Fusion. Int J Mol Sci 2023; 24:14142. [PMID: 37762445 PMCID: PMC10531525 DOI: 10.3390/ijms241814142] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 08/09/2023] [Revised: 09/09/2023] [Accepted: 09/12/2023] [Indexed: 09/29/2023] Open
Abstract
Accurate identification of potential drug-target interactions (DTIs) is a crucial task in drug development and repositioning. Despite the remarkable progress achieved in recent years, improving the performance of DTI prediction still presents significant challenges. In this study, we propose a novel end-to-end deep learning model called AMMVF-DTI (attention mechanism and multi-view fusion), which leverages a multi-head self-attention mechanism to explore varying degrees of interaction between drugs and target proteins. More importantly, AMMVF-DTI extracts interactive features between drugs and proteins from both node-level and graph-level embeddings, enabling more effective modeling of DTIs. This advantage is generally lacking in existing DTI prediction models. Consequently, when compared to many state-of-the-art methods, AMMVF-DTI demonstrated excellent performance on the human, C. elegans, and DrugBank baseline datasets, which can be attributed to its ability to incorporate interactive information and mine features from both local and global structures. The results from additional ablation experiments also confirmed the importance of each module in our AMMVF-DTI model. Finally, a case study is presented utilizing our model for COVID-19-related DTI prediction. We believe the AMMVF-DTI model can not only achieve reasonable accuracy in DTI prediction, but also provide insights into potential interactions between drugs and targets.
|
23
|
Huang Y, Huang HY, Chen Y, Lin YCD, Yao L, Lin T, Leng J, Chang Y, Zhang Y, Zhu Z, Ma K, Cheng YN, Lee TY, Huang HD. A Robust Drug-Target Interaction Prediction Framework with Capsule Network and Transfer Learning. Int J Mol Sci 2023; 24:14061. [PMID: 37762364 PMCID: PMC10531393 DOI: 10.3390/ijms241814061] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/27/2023] [Revised: 08/27/2023] [Accepted: 08/28/2023] [Indexed: 09/29/2023] Open
Abstract
Drug-target interactions (DTIs) are considered a crucial component of drug design and drug discovery. To date, many computational methods have been developed for drug-target interactions, but they are insufficiently informative for accurately predicting DTIs due to the lack of experimentally verified negative datasets, inaccurate molecular feature representation, and ineffective DTI classifiers. Therefore, we address the limitations of randomly selecting negative DTI data from unknown drug-target pairs by establishing two experimentally validated datasets, and we propose a capsule network-based framework called CapBM-DTI to capture hierarchical relationships of drugs and targets. To accurately and robustly identify drug-target interactions, CapBM-DTI adopts pre-trained bidirectional encoder representations from transformers (BERT) for contextual sequence feature extraction from target proteins through transfer learning, and the message-passing neural network (MPNN) for 2-D graph feature extraction of compounds. We compared the performance of CapBM-DTI with state-of-the-art methods using four experimentally validated DTI datasets of different sizes, including human (Homo sapiens) and worm (Caenorhabditis elegans) species datasets, as well as three subsets (new compounds, new proteins, and new pairs). Our results demonstrate that the proposed model achieved robust performance and powerful generalization ability in all experiments. The case study on treating COVID-19 demonstrates the applicability of the model in virtual screening.
Affiliation(s)
- Yixian Huang
- School of Medicine, The Chinese University of Hong Kong, Shenzhen, Longgang District, Shenzhen 518172, China
- Warshel Institute for Computational Biology, The Chinese University of Hong Kong, Shenzhen, Longgang District, Shenzhen 518172, China
| | - Hsi-Yuan Huang
- School of Medicine, The Chinese University of Hong Kong, Shenzhen, Longgang District, Shenzhen 518172, China
- Warshel Institute for Computational Biology, The Chinese University of Hong Kong, Shenzhen, Longgang District, Shenzhen 518172, China
| | - Yigang Chen
- School of Medicine, The Chinese University of Hong Kong, Shenzhen, Longgang District, Shenzhen 518172, China
- Warshel Institute for Computational Biology, The Chinese University of Hong Kong, Shenzhen, Longgang District, Shenzhen 518172, China
| | - Yang-Chi-Dung Lin
- School of Medicine, The Chinese University of Hong Kong, Shenzhen, Longgang District, Shenzhen 518172, China
- Warshel Institute for Computational Biology, The Chinese University of Hong Kong, Shenzhen, Longgang District, Shenzhen 518172, China
| | - Lantian Yao
- Warshel Institute for Computational Biology, The Chinese University of Hong Kong, Shenzhen, Longgang District, Shenzhen 518172, China
| | - Tianxiu Lin
- School of Medicine, The Chinese University of Hong Kong, Shenzhen, Longgang District, Shenzhen 518172, China
- Warshel Institute for Computational Biology, The Chinese University of Hong Kong, Shenzhen, Longgang District, Shenzhen 518172, China
| | - Junlin Leng
- School of Medicine, The Chinese University of Hong Kong, Shenzhen, Longgang District, Shenzhen 518172, China
- Warshel Institute for Computational Biology, The Chinese University of Hong Kong, Shenzhen, Longgang District, Shenzhen 518172, China
| | - Yuan Chang
- Warshel Institute for Computational Biology, The Chinese University of Hong Kong, Shenzhen, Longgang District, Shenzhen 518172, China
| | - Yuntian Zhang
- Warshel Institute for Computational Biology, The Chinese University of Hong Kong, Shenzhen, Longgang District, Shenzhen 518172, China
| | - Zihao Zhu
- School of Medicine, The Chinese University of Hong Kong, Shenzhen, Longgang District, Shenzhen 518172, China
- Warshel Institute for Computational Biology, The Chinese University of Hong Kong, Shenzhen, Longgang District, Shenzhen 518172, China
| | - Kun Ma
- School of Medicine, The Chinese University of Hong Kong, Shenzhen, Longgang District, Shenzhen 518172, China
- Warshel Institute for Computational Biology, The Chinese University of Hong Kong, Shenzhen, Longgang District, Shenzhen 518172, China
| | - Yeong-Nan Cheng
- Institute of Bioinformatics and Systems Biology, Department of Biological Science and Technology, National Yang Ming Chiao Tung University, Hsinchu 300, Taiwan
| | - Tzong-Yi Lee
- Institute of Bioinformatics and Systems Biology, Department of Biological Science and Technology, National Yang Ming Chiao Tung University, Hsinchu 300, Taiwan
| | - Hsien-Da Huang
- School of Medicine, The Chinese University of Hong Kong, Shenzhen, Longgang District, Shenzhen 518172, China
- Warshel Institute for Computational Biology, The Chinese University of Hong Kong, Shenzhen, Longgang District, Shenzhen 518172, China
| |
|
24
|
Pun FW, Ozerov IV, Zhavoronkov A. AI-powered therapeutic target discovery. Trends Pharmacol Sci 2023; 44:561-572. [PMID: 37479540 DOI: 10.1016/j.tips.2023.06.010] [Citation(s) in RCA: 26] [Impact Index Per Article: 26.0] [Received: 05/25/2023] [Revised: 06/20/2023] [Accepted: 06/23/2023] [Indexed: 07/23/2023]
Abstract
Disease modeling and target identification are the most crucial initial steps in drug discovery, and influence the probability of success at every step of drug development. Traditional target identification is a time-consuming process that takes years to decades and usually starts in an academic setting. Given its advantages of analyzing large datasets and intricate biological networks, artificial intelligence (AI) is playing a growing role in modern drug target identification. We review recent advances in target discovery, focusing on breakthroughs in AI-driven therapeutic target exploration. We also discuss the importance of striking a balance between novelty and confidence in target selection. An increasing number of AI-identified targets are being validated through experiments and several AI-derived drugs are entering clinical trials; we highlight current limitations and potential pathways for moving forward.
Affiliation(s)
- Frank W Pun
- Insilico Medicine Hong Kong Ltd., Hong Kong Science and Technology Park, New Territories, Hong Kong
| | - Ivan V Ozerov
- Insilico Medicine Hong Kong Ltd., Hong Kong Science and Technology Park, New Territories, Hong Kong
| | - Alex Zhavoronkov
- Insilico Medicine Hong Kong Ltd., Hong Kong Science and Technology Park, New Territories, Hong Kong; Insilico Medicine MENA, 6F IRENA Building, Abu Dhabi, United Arab Emirates; Buck Institute for Research on Aging, Novato, CA, USA.
|
25
|
Chen J, Zhang L, Cheng K, Jin B, Lu X, Che C. Predicting Drug-Target Interaction Via Self-Supervised Learning. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2023; 20:2781-2789. [PMID: 35230952 DOI: 10.1109/tcbb.2022.3153963] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 06/14/2023]
Abstract
Recent advances in graph representation learning provide new opportunities for computational drug-target interaction (DTI) prediction. However, such prediction still suffers from dependence on manual labels and vulnerability to attacks. Inspired by the success of self-supervised learning (SSL) algorithms, which can leverage the input data itself as supervision, we propose SupDTI, an SSL-enhanced drug-target interaction prediction framework based on a heterogeneous network (i.e., drug-protein, drug-drug, and protein-protein interaction networks; drug-disease, drug-side-effect, and protein-disease association networks; drug-structure and protein-sequence similarity networks). Specifically, SupDTI is an end-to-end learning framework consisting of five components. First, localized and globalized graph convolutions are designed to capture the nodes' information from local and global perspectives, respectively. Then, we develop a variational autoencoder to constrain the nodes' representation to have desired statistical characteristics. Finally, a unified self-supervised learning strategy is leveraged to enhance the nodes' representation: a contrastive learning module enables the nodes' representation to fit the graph-level representation, followed by a generative learning module that further maximizes the node-level agreement across the global and local views by learning the probabilistic connectivity distribution of the original heterogeneous network. Experimental results show that our model can achieve better prediction performance than state-of-the-art methods.
|
26
|
Qian Y, Li X, Wu J, Zhang Q. MCL-DTI: using drug multimodal information and bi-directional cross-attention learning method for predicting drug-target interaction. BMC Bioinformatics 2023; 24:323. [PMID: 37633938 PMCID: PMC10463755 DOI: 10.1186/s12859-023-05447-1] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 01/02/2023] [Accepted: 08/15/2023] [Indexed: 08/28/2023] Open
Abstract
BACKGROUND Prediction of drug-target interaction (DTI) is an essential step for drug discovery and drug repositioning. Traditional methods are mostly time-consuming and labor-intensive; deep learning-based methods address these limitations and are applied in practice. Most current deep learning methods employ representation learning of unimodal information such as SMILES sequences, molecular graphs, or molecular images of drugs. In addition, most methods focus on feature extraction from the drug and the target alone, without fusion learning from the drug-target interacting parties, which may lead to insufficient feature representation. MOTIVATION In order to capture more comprehensive drug features, we utilize both the molecular image and the chemical features of drugs. The image of the drug mainly carries the structural information and spatial features of the drug, while the chemical information includes its functions and properties; the two complement each other, making the drug representation more effective and complete. Meanwhile, to enhance the interactive feature learning of drug and target, we introduce a bidirectional multi-head attention mechanism to improve the performance of DTI prediction. RESULTS To enhance feature learning between drugs and targets, we propose a novel deep learning model for the DTI task called MCL-DTI, which uses multimodal information of the drug and learns the representation of the drug-target interaction for drug-target prediction. In order to further explore a more comprehensive representation of drug features, this paper first exploits two modalities of drug information, molecular image and chemical text, to represent the drug. We also introduce a bidirectional multi-head cross-attention (MCA) method to learn the interrelationships between drugs and targets. Thus, we build two decoders, each of which includes a multi-head self-attention (MSA) block and an MCA block, for cross-information learning. We use a decoder for the drug and the target separately to obtain the interaction feature maps. Finally, we feed the feature maps generated by the decoders into a fusion block for feature extraction and output the prediction results. CONCLUSIONS MCL-DTI achieves the best results on all three datasets: Human, C. elegans and Davis, including the balanced datasets and an unbalanced dataset. The results on the drug-drug interaction (DDI) task show that MCL-DTI has a strong generalization capability and can be easily applied to other tasks.
Affiliation(s)
- Ying Qian
- Shanghai Frontiers Science Center of Molecule Intelligent Syntheses, School of Computer Science and Technology, East China Normal University, North Zhongshan Road, Shanghai, 200062 China
| | - Xinyi Li
- Shanghai Frontiers Science Center of Molecule Intelligent Syntheses, School of Computer Science and Technology, East China Normal University, North Zhongshan Road, Shanghai, 200062 China
| | - Jian Wu
- Shanghai Frontiers Science Center of Molecule Intelligent Syntheses, School of Computer Science and Technology, East China Normal University, North Zhongshan Road, Shanghai, 200062 China
| | - Qian Zhang
- Shanghai Frontiers Science Center of Molecule Intelligent Syntheses, School of Computer Science and Technology, East China Normal University, North Zhongshan Road, Shanghai, 200062 China
|
27
|
Singh S, Kumar R, Payra S, Singh SK. Artificial Intelligence and Machine Learning in Pharmacological Research: Bridging the Gap Between Data and Drug Discovery. Cureus 2023; 15:e44359. [PMID: 37779744 PMCID: PMC10539991 DOI: 10.7759/cureus.44359] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 07/31/2023] [Indexed: 10/03/2023] Open
Abstract
Artificial intelligence (AI) has transformed pharmacological research through machine learning, deep learning, and natural language processing. These advancements have greatly influenced drug discovery, development, and precision medicine. AI algorithms analyze vast biomedical data identifying potential drug targets, predicting efficacy, and optimizing lead compounds. AI has diverse applications in pharmacological research, including target identification, drug repurposing, virtual screening, de novo drug design, toxicity prediction, and personalized medicine. AI improves patient selection, trial design, and real-time data analysis in clinical trials, leading to enhanced safety and efficacy outcomes. Post-marketing surveillance utilizes AI-based systems to monitor adverse events, detect drug interactions, and support pharmacovigilance efforts. Machine learning models extract patterns from complex datasets, enabling accurate predictions and informed decision-making, thus accelerating drug discovery. Deep learning, specifically convolutional neural networks (CNN), excels in image analysis, aiding biomarker identification and optimizing drug formulation. Natural language processing facilitates the mining and analysis of scientific literature, unlocking valuable insights and information. However, the adoption of AI in pharmacological research raises ethical considerations. Ensuring data privacy and security, addressing algorithm bias and transparency, obtaining informed consent, and maintaining human oversight in decision-making are crucial ethical concerns. The responsible deployment of AI necessitates robust frameworks and regulations. The future of AI in pharmacological research is promising, with integration with emerging technologies like genomics, proteomics, and metabolomics offering the potential for personalized medicine and targeted therapies. 
Collaboration among academia, industry, and regulatory bodies is essential for the ethical implementation of AI in drug discovery and development. Continuous research and development in AI techniques and comprehensive training programs will empower scientists and healthcare professionals to fully exploit AI's potential, leading to improved patient outcomes and innovative pharmacological interventions.
Affiliation(s)
- Shruti Singh
- Department of Pharmacology, All India Institute of Medical Sciences, Patna, IND
| | - Rajesh Kumar
- Department of Pharmacology, All India Institute of Medical Sciences, Patna, IND
| | - Shuvasree Payra
- Department of Pharmacology, All India Institute of Medical Sciences, Patna, IND
| | - Sunil K Singh
- Department of Pharmacology, All India Institute of Medical Sciences, Patna, IND
|
28
|
Blanco-González A, Cabezón A, Seco-González A, Conde-Torres D, Antelo-Riveiro P, Piñeiro Á, Garcia-Fandino R. The Role of AI in Drug Discovery: Challenges, Opportunities, and Strategies. Pharmaceuticals (Basel) 2023; 16:891. [PMID: 37375838 DOI: 10.3390/ph16060891] [Citation(s) in RCA: 37] [Impact Index Per Article: 37.0] [Received: 05/22/2023] [Revised: 06/14/2023] [Accepted: 06/15/2023] [Indexed: 06/29/2023] Open
Abstract
Artificial intelligence (AI) has the potential to revolutionize the drug discovery process, offering improved efficiency, accuracy, and speed. However, the successful application of AI is dependent on the availability of high-quality data, the addressing of ethical concerns, and the recognition of the limitations of AI-based approaches. In this article, the benefits, challenges, and drawbacks of AI in this field are reviewed, and possible strategies and approaches for overcoming the present obstacles are proposed. The use of data augmentation, explainable AI, and the integration of AI with traditional experimental methods, as well as the potential advantages of AI in pharmaceutical research, are also discussed. Overall, this review highlights the potential of AI in drug discovery and provides insights into the challenges and opportunities for realizing its potential in this field. Note from the human authors: This article was created to test the ability of ChatGPT, a chatbot based on the GPT-3.5 language model, in terms of assisting human authors in writing review articles. The text generated by the AI following our instructions (see Supporting Information) was used as a starting point, and its ability to automatically generate content was evaluated. After conducting a thorough review, the human authors practically rewrote the manuscript, striving to maintain a balance between the original proposal and the scientific criteria. The advantages and limitations of using AI for this purpose are discussed in the last section.
Affiliation(s)
- Alexandre Blanco-González
- Department of Organic Chemistry, Center for Research in Biological Chemistry and Molecular Materials, University of Santiago de Compostela, CIQUS, 15705 Santiago de Compostela, Spain
- Soft Matter & Molecular Biophysics Group, Department of Applied Physics, Faculty of Physics, University of Santiago de Compostela, 15705 Santiago de Compostela, Spain
- MD.USE Innovations S.L., Edificio Emprendia, 15782 Santiago de Compostela, Spain
| | - Alfonso Cabezón
- Department of Organic Chemistry, Center for Research in Biological Chemistry and Molecular Materials, University of Santiago de Compostela, CIQUS, 15705 Santiago de Compostela, Spain
- Soft Matter & Molecular Biophysics Group, Department of Applied Physics, Faculty of Physics, University of Santiago de Compostela, 15705 Santiago de Compostela, Spain
| | - Alejandro Seco-González
- Department of Organic Chemistry, Center for Research in Biological Chemistry and Molecular Materials, University of Santiago de Compostela, CIQUS, 15705 Santiago de Compostela, Spain
- Soft Matter & Molecular Biophysics Group, Department of Applied Physics, Faculty of Physics, University of Santiago de Compostela, 15705 Santiago de Compostela, Spain
| | - Daniel Conde-Torres
- Department of Organic Chemistry, Center for Research in Biological Chemistry and Molecular Materials, University of Santiago de Compostela, CIQUS, 15705 Santiago de Compostela, Spain
- Soft Matter & Molecular Biophysics Group, Department of Applied Physics, Faculty of Physics, University of Santiago de Compostela, 15705 Santiago de Compostela, Spain
| | - Paula Antelo-Riveiro
- Department of Organic Chemistry, Center for Research in Biological Chemistry and Molecular Materials, University of Santiago de Compostela, CIQUS, 15705 Santiago de Compostela, Spain
- Soft Matter & Molecular Biophysics Group, Department of Applied Physics, Faculty of Physics, University of Santiago de Compostela, 15705 Santiago de Compostela, Spain
| | - Ángel Piñeiro
- Soft Matter & Molecular Biophysics Group, Department of Applied Physics, Faculty of Physics, University of Santiago de Compostela, 15705 Santiago de Compostela, Spain
| | - Rebeca Garcia-Fandino
- Department of Organic Chemistry, Center for Research in Biological Chemistry and Molecular Materials, University of Santiago de Compostela, CIQUS, 15705 Santiago de Compostela, Spain
|
29
|
Zong N, Chowdhury S, Zhou S, Rajaganapathy S, Yu Y, Wang L, Dai Q, Bielinski SJ, Chen Y, Cerhan JR. Artificial Intelligence-based Efficacy Prediction of Phase 3 Clinical Trial for Repurposing Heart Failure Therapies. medRxiv: the preprint server for health sciences 2023:2023.05.25.23290531. [PMID: 37398384 PMCID: PMC10312819 DOI: 10.1101/2023.05.25.23290531] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/04/2023]
Abstract
Introduction Drug repurposing involves finding new therapeutic uses for already approved drugs, which can save costs as their pharmacokinetics and pharmacodynamics are already known. Predicting efficacy based on clinical endpoints is valuable for designing phase 3 trials and making Go/No-Go decisions, given the potential for confounding effects in phase 2. Objectives This study aims to predict the efficacy of the repurposed Heart Failure (HF) drugs for the Phase 3 Clinical Trial. Methods Our study presents a comprehensive framework for predicting drug efficacy in phase 3 trials, which combines drug-target prediction using biomedical knowledgebases with statistical analysis of real-world data. We developed a novel drug-target prediction model that uses low-dimensional representations of drug chemical structures and gene sequences, and biomedical knowledgebase. Furthermore, we conducted statistical analyses of electronic health records to assess the effectiveness of repurposed drugs in relation to clinical measurements (e.g., NT-proBNP). Results We identified 24 repurposed drugs (9 with a positive effect and 15 with a non-positive) for heart failure from 266 phase 3 clinical trials. We used 25 genes related to heart failure for drug-target prediction, as well as electronic health records (EHR) from the Mayo Clinic for screening, which contained over 58,000 heart failure patients treated with various drugs and categorized by heart failure subtypes. Our proposed drug-target predictive model performed exceptionally well in all seven tests in the BETA benchmark compared to the six cutting-edge baseline methods (i.e., best performed in 266 out of 404 tasks). For the overall prediction of the 24 drugs, our model achieved an AUCROC of 82.59% and PRAUC (average precision) of 73.39%. 
Conclusion The study demonstrated exceptional results in predicting the efficacy of repurposed drugs for phase 3 clinical trials, highlighting the potential of this method to facilitate computational drug repurposing.
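The two figures quoted in the abstract above (AUCROC and PRAUC, the latter glossed as average precision) can be illustrated with a small sketch. This is not the authors' evaluation code; the function names and the toy labels and scores are assumptions made for illustration, and the average-precision routine assumes there are no tied scores.

```python
def auroc(labels, scores):
    """Area under the ROC curve as the probability that a random positive
    outranks a random negative (ties count as half a win)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of the precision values measured at the rank of each positive,
    with items ranked by descending score (assumes no tied scores)."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / rank
    if hits == 0:
        raise ValueError("need at least one positive label")
    return precision_sum / hits


# Hypothetical toy data: 1 = positive effect, 0 = non-positive effect.
toy_labels = [1, 0, 1, 0, 1]
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.3]
toy_auroc = auroc(toy_labels, toy_scores)
toy_ap = average_precision(toy_labels, toy_scores)
```

On a small, imbalanced set such as 9 positives among 24 drugs, the two metrics can diverge noticeably, which is one reason to report both.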
Affiliation(s)
- Nansu Zong
- Department of Artificial Intelligence and Informatics Research, Mayo Clinic, Rochester, MN, USA
| | - Shaika Chowdhury
- Department of Artificial Intelligence and Informatics Research, Mayo Clinic, Rochester, MN, USA
| | - Shibo Zhou
- Department of Artificial Intelligence and Informatics Research, Mayo Clinic, Rochester, MN, USA
| | - Sivaraman Rajaganapathy
- Department of Artificial Intelligence and Informatics Research, Mayo Clinic, Rochester, MN, USA
| | - Yue Yu
- Department of Health Sciences Research, Mayo Clinic, Rochester, MN, USA
| | - Liewei Wang
- Department of Molecular Pharmacology and Experimental Therapeutics, Mayo Clinic, Rochester, MN, USA
| | - Qiying Dai
- Department of Cardiovascular Medicine, Mayo Clinic, Rochester, MN, USA
| | | | - Yongbin Chen
- Department of Biochemistry and Molecular Biology, Mayo Clinic, Rochester, MN, USA
| | - James R. Cerhan
- Department of Health Sciences Research, Mayo Clinic, Rochester, MN, USA
|
30
|
Ahlquist KD, Sugden LA, Ramachandran S. Enabling interpretable machine learning for biological data with reliability scores. PLoS Comput Biol 2023; 19:e1011175. [PMID: 37235578 PMCID: PMC10249903 DOI: 10.1371/journal.pcbi.1011175] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/24/2022] [Revised: 06/08/2023] [Accepted: 05/10/2023] [Indexed: 05/28/2023] Open
Abstract
Machine learning tools have proven useful across biological disciplines, allowing researchers to draw conclusions from large datasets, and opening up new opportunities for interpreting complex and heterogeneous biological data. Alongside the rapid growth of machine learning, there have also been growing pains: some models that appear to perform well have later been revealed to rely on features of the data that are artifactual or biased; this feeds into the general criticism that machine learning models are designed to optimize model performance over the creation of new biological insights. A natural question arises: how do we develop machine learning models that are inherently interpretable or explainable? In this manuscript, we describe the SWIF(r) reliability score (SRS), a method building on the SWIF(r) generative framework that reflects the trustworthiness of the classification of a specific instance. The concept of the reliability score has the potential to generalize to other machine learning methods. We demonstrate the utility of the SRS when faced with common challenges in machine learning including: 1) an unknown class present in testing data that was not present in training data, 2) systemic mismatch between training and testing data, and 3) instances of testing data that have missing values for some attributes. We explore these applications of the SRS using a range of biological datasets, from agricultural data on seed morphology, to 22 quantitative traits in the UK Biobank, and population genetic simulations and 1000 Genomes Project data. With each of these examples, we demonstrate how the SRS can allow researchers to interrogate their data and training approach thoroughly, and to pair their domain-specific knowledge with powerful machine-learning frameworks. We also compare the SRS to related tools for outlier and novelty detection, and find that it has comparable performance, with the advantage of being able to operate when some data are missing. 
The SRS, and the broader discussion of interpretable scientific machine learning, will aid researchers in the biological machine learning space as they seek to harness the power of machine learning without sacrificing rigor and biological insight.
Affiliation(s)
- K. D. Ahlquist
- Center for Computational Molecular Biology, Brown University, Providence, Rhode Island, United States of America
- Department of Molecular Biology, Cell Biology, and Biochemistry, Brown University, Providence, Rhode Island, United States of America
| | - Lauren A. Sugden
- Department of Mathematics and Computer Science, Duquesne University, Pittsburgh, Pennsylvania, United States of America
| | - Sohini Ramachandran
- Center for Computational Molecular Biology, Brown University, Providence, Rhode Island, United States of America
- Department of Ecology, Evolution and Organismal Biology, Brown University, Providence, Rhode Island, United States of America
- Data Science Initiative, Brown University, Providence, Rhode Island, United States of America
|
31
|
Chen P, Zheng H. Drug-target interaction prediction based on spatial consistency constraint and graph convolutional autoencoder. BMC Bioinformatics 2023; 24:151. [PMID: 37069493 PMCID: PMC10109239 DOI: 10.1186/s12859-023-05275-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/20/2023] [Accepted: 04/05/2023] [Indexed: 04/19/2023] Open
Abstract
BACKGROUND Drug-target interaction (DTI) prediction plays an important role in drug discovery and repositioning. However, most of the computational methods used for identifying relevant DTIs do not consider the invariance of the nearest neighbour relationships between drugs or targets. In other words, they do not take into account the invariance of the topological relationships between nodes during representation learning. It may limit the performance of the DTI prediction methods. RESULTS Here, we propose a novel graph convolutional autoencoder-based model, named SDGAE, to predict DTIs. As the graph convolutional network cannot handle isolated nodes in a network, a pre-processing step was applied to reduce the number of isolated nodes in the heterogeneous network and facilitate effective exploitation of the graph convolutional network. By maintaining the graph structure during representation learning, the nearest neighbour relationships between nodes in the embedding space remained as close as possible to the original space. CONCLUSIONS Overall, we demonstrated that SDGAE can automatically learn more informative and robust feature vectors of drugs and targets, thus exhibiting significantly improved predictive accuracy for DTIs.
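The idea of keeping nearest-neighbour relationships stable between the original space and the embedding space can be checked with a simple diagnostic. The sketch below is not SDGAE's training objective, which the abstract does not spell out; it only measures, for hypothetical toy points, what fraction of each node's k nearest neighbours survive the mapping. All names and data are illustrative assumptions.

```python
import math


def k_nearest(points, k):
    """Return, for each point, the index set of its k nearest other points
    (Euclidean distance; ties broken by lower index)."""
    neighbours = []
    for i, p in enumerate(points):
        ranked = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        neighbours.append({j for _, j in ranked[:k]})
    return neighbours


def neighbour_preservation(original, embedded, k):
    """Fraction of k-nearest-neighbour relations in the original space that
    are also k-nearest-neighbour relations in the embedding space."""
    if len(original) != len(embedded):
        raise ValueError("both spaces must describe the same nodes")
    before = k_nearest(original, k)
    after = k_nearest(embedded, k)
    shared = sum(len(a & b) for a, b in zip(before, after))
    return shared / (k * len(original))


# Hypothetical toy data: four nodes in 2-D and two candidate 1-D embeddings.
original_space = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (6.0, 0.0)]
faithful_embedding = [(0.0,), (2.0,), (10.0,), (12.0,)]
scrambled_embedding = [(0.0,), (10.0,), (2.0,), (12.0,)]

faithful_score = neighbour_preservation(original_space, faithful_embedding, k=1)
scrambled_score = neighbour_preservation(original_space, scrambled_embedding, k=1)
```

A score near 1 means the embedding keeps local neighbourhoods intact; a score near 0 means they were rearranged.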
Affiliation(s)
- Peng Chen
- School of Computer Science and Technology, University of Science and Technology of China, Jinzhai Road 96, Hefei, 230027, People's Republic of China
- Anhui Key Laboratory of Software Engineering in Computing and Communication, University of Science and Technology of China, Jinzhai Road 96, Hefei, 230027, People's Republic of China
| | - Haoran Zheng
- School of Computer Science and Technology, University of Science and Technology of China, Jinzhai Road 96, Hefei, 230027, People's Republic of China.
- Anhui Key Laboratory of Software Engineering in Computing and Communication, University of Science and Technology of China, Jinzhai Road 96, Hefei, 230027, People's Republic of China.
|
32
|
Lu J, Jia X, Wang C, He H. Screening potential anaphylactoid components in vinpocetine injection using a high expression Mas-related G-protein-coupled receptor X2 cell membrane chromatography. J Appl Toxicol 2023; 43:508-516. [PMID: 36199206 DOI: 10.1002/jat.4402] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/09/2022] [Revised: 09/18/2022] [Accepted: 10/02/2022] [Indexed: 11/08/2022]
Abstract
Vinpocetine injection is often used in the clinical treatment of acute cardiovascular and cerebrovascular diseases. However, vinpocetine injection has been reported to cause allergic reactions in clinical use; therefore, its safety needs urgent attention. Until now, research on its sensitization has rarely been reported. Here, the components contained in three vinpocetine injections were examined. It was found that besides vinpocetine, the synthetic raw material vincamine, the excipients benzyl alcohol and ethyl p-toluenesulfonate, and the impurities A, B, C, and D, which are excipients specified in the European Pharmacopoeia, were also present in them. Then Mas-related G-protein-coupled receptor X2 (MRGPRX2)-HEK293 cell membrane chromatography was used to investigate their affinity for MRGPRX2, and it was found that vinpocetine, vincamine, and impurities A, B, C, and D bind to MRGPRX2. Afterwards, these compounds were further used to investigate local sensitization ability in vivo. The results showed that vinpocetine, vincamine, and impurity C could induce swelling of the paw and decrease body temperature in mice, but only impurity C could cause local skin mast cell degranulation and an increase in serum histamine release. In vitro, the results also indicated that impurity C could increase intracellular [Ca2+] in MRGPRX2-HEK293 cells, whereas vinpocetine and vincamine did not. Therefore, impurity C was the potential anaphylactoid component in vinpocetine injection, which may be one of the reasons for the occurrence of allergic reactions in the clinical use of vinpocetine injection. This work provides evidence on the sensitization of impurity C and also contributes to promoting the clinical safety of vinpocetine injection.
Affiliation(s)
- Jiayu Lu
- School of Pharmacy, Health Science Center, Xi'an Jiaotong University, Xi'an, Shaanxi, China
| | - Xin Jia
- School of Pharmacy, Health Science Center, Xi'an Jiaotong University, Xi'an, Shaanxi, China
| | - Changhe Wang
- Shaanxi Institute for Food and Drug Control, Xi'an, Shaanxi, China
| | - Huaizhen He
- School of Pharmacy, Health Science Center, Xi'an Jiaotong University, Xi'an, Shaanxi, China
|
33
|
Ji KY, Liu C, Liu ZQ, Deng YF, Hou TJ, Cao DS. Comprehensive assessment of nine target prediction web services: which should we choose for target fishing? Brief Bioinform 2023; 24:6995377. [PMID: 36681902 DOI: 10.1093/bib/bbad014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2022] [Revised: 12/29/2022] [Accepted: 01/03/2023] [Indexed: 01/23/2023] Open
Abstract
Identification of potential targets for known bioactive compounds and novel synthetic analogs is of considerable significance. In silico target fishing (TF) has become an alternative strategy because of the expensive and laborious wet-lab experiments, explosive growth of bioactivity data and rapid development of high-throughput technologies. However, these TF methods are based on different algorithms, molecular representations and training datasets, which may lead to different results when predicting the same query molecules. This can be confusing for practitioners in practical applications. Therefore, this study systematically evaluated nine popular ligand-based TF methods based on target and ligand-target pair statistical strategies, which will help practitioners make choices among multiple TF methods. The evaluation results showed that SwissTargetPrediction was the best method to produce the most reliable predictions while enriching more targets. High-recall similarity ensemble approach (SEA) was able to find real targets for more compounds compared with other TF methods. Therefore, SwissTargetPrediction and SEA can be considered as primary selection methods in future studies. In addition, the results showed that k = 5 was the optimal number of experimental candidate targets. Finally, a novel ensemble TF method based on consensus voting is proposed to improve the prediction performance. The precision of the ensemble TF method outperforms the individual TF method, indicating that the ensemble TF method can more effectively identify real targets within a given top-k threshold. The results of this study can be used as a reference to guide practitioners in selecting the most effective methods in computational drug discovery.
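The consensus-voting idea mentioned in the abstract above can be sketched as follows. The abstract does not describe the exact voting rule, so this is only one plausible reading: each method votes for the targets in its own top-k list, and targets are ranked by vote count with mean rank as a tie-breaker. The method names, target names, and tie-breaking rule are hypothetical.

```python
from collections import Counter


def consensus_top_k(predictions, k=5):
    """Combine ranked target lists from several methods by majority vote.

    predictions maps a method name to its ranked target list (best first).
    Each method votes for its own top-k targets; the result is ordered by
    vote count, then by mean rank, then alphabetically.
    """
    votes = Counter()
    rank_sum = Counter()
    for ranked_targets in predictions.values():
        for rank, target in enumerate(ranked_targets[:k]):
            votes[target] += 1
            rank_sum[target] += rank
    ordered = sorted(
        votes,
        key=lambda t: (-votes[t], rank_sum[t] / votes[t], t),
    )
    return ordered[:k]


# Hypothetical outputs of three target-fishing tools for one query molecule.
toy_predictions = {
    "method_a": ["EGFR", "ABL1", "SRC", "KIT"],
    "method_b": ["ABL1", "EGFR", "KDR"],
    "method_c": ["ABL1", "SRC", "EGFR"],
}
top_two = consensus_top_k(toy_predictions, k=2)
top_three = consensus_top_k(toy_predictions, k=3)
```

The default of k = 5 mirrors the candidate-list size the abstract reports as optimal; any other cut-off can be passed explicitly.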
Affiliation(s)
- Kai-Yue Ji
- Xiangya School of Pharmaceutical Sciences, Central South University, Changsha 410013, Hunan, P. R. China
| | - Chong Liu
- Xiangya School of Pharmaceutical Sciences, Central South University, Changsha 410013, Hunan, P. R. China
| | - Zhao-Qian Liu
- Xiangya School of Pharmaceutical Sciences, Central South University, Changsha 410013, Hunan, P. R. China
| | - Ya-Feng Deng
- CarbonSilicon AI Technology Co., Ltd, Hangzhou, Zhejiang 310018, China
| | - Ting-Jun Hou
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China
| | - Dong-Sheng Cao
- Xiangya School of Pharmaceutical Sciences, Central South University, Changsha 410013, Hunan, P. R. China
|
34
|
Yang X, Niu Z, Liu Y, Song B, Lu W, Zeng L, Zeng X. Modality-DTA: Multimodality Fusion Strategy for Drug-Target Affinity Prediction. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2023; 20:1200-1210. [PMID: 36083952 DOI: 10.1109/tcbb.2022.3205282] [Citation(s) in RCA: 13] [Impact Index Per Article: 13.0] [Indexed: 05/04/2023]
Abstract
Prediction of the drug-target affinity (DTA) plays an important role in drug discovery. Existing deep learning methods for DTA prediction typically leverage a single modality, namely simplified molecular input line entry specification (SMILES) or amino acid sequence, to learn representations. SMILES or amino acid sequences can be encoded into different modalities. Multimodality data provide different kinds of information, with complementary roles for DTA prediction. We propose Modality-DTA, a novel deep learning method for DTA prediction that leverages the multimodality of drugs and targets. A group of backward propagation neural networks is applied to ensure the completeness of the reconstruction process from the latent feature representation to original multimodality data. The tag between the drug and target is used to reduce the noise information in the latent representation from multimodality data. Experiments on three benchmark datasets show that our Modality-DTA outperforms existing methods in all metrics. Modality-DTA reduces the mean square error by 15.7% and improves the area under the precision-recall curve by 12.74% in the Davis dataset. We further find that the drug modality Morgan fingerprint and the target modality generated by one-hot-encoding play the most significant roles. To the best of our knowledge, Modality-DTA is the first method to explore multimodality for DTA prediction.
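One-hot encoding of an amino acid sequence, the target modality the abstract singles out, can be sketched in a few lines. The alphabet ordering, the fixed length, and the handling of unknown residues below are assumptions for illustration, not details taken from Modality-DTA.

```python
# The 20 standard amino acids in alphabetical one-letter order (an assumed,
# conventional ordering; any fixed ordering works as long as it is consistent).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
RESIDUE_INDEX = {residue: i for i, residue in enumerate(AMINO_ACIDS)}


def one_hot_encode(sequence, max_len):
    """Encode a protein sequence as a max_len x 20 matrix of 0/1 values.

    Sequences longer than max_len are truncated; shorter ones are padded
    with all-zero rows. Residues outside the 20-letter alphabet (for
    example 'X') also become all-zero rows.
    """
    matrix = [[0] * len(AMINO_ACIDS) for _ in range(max_len)]
    for position, residue in enumerate(sequence[:max_len].upper()):
        column = RESIDUE_INDEX.get(residue)
        if column is not None:
            matrix[position][column] = 1
    return matrix


# Hypothetical three-residue fragment padded to length 4.
encoded = one_hot_encode("ACX", max_len=4)
```

Each row has at most a single 1, so the matrix preserves residue identity and position without implying any similarity between residues.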
|
35
|
De Vita S, Chini MG, Bifulco G, Lauro G. Target identification by structure-based computational approaches: Recent advances and perspectives. Bioorg Med Chem Lett 2023; 83:129171. [PMID: 36739998 DOI: 10.1016/j.bmcl.2023.129171] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/05/2022] [Revised: 12/15/2022] [Accepted: 02/01/2023] [Indexed: 02/05/2023]
Abstract
The use of computational techniques in the early stages of drug discovery has recently experienced a boost, especially in the target identification step. Finding the biological partner(s) for new or existing synthetic and/or natural compounds by "wet" approaches may be challenging; therefore, preliminary in silico screening is even more recommended. After a brief overview of some of the most known target identification techniques, recent advances in structure-based computational approaches for target identification are reported in this digest, focusing on Inverse Virtual Screening and its recent applications. Moreover, future perspectives concerning the use of such methodologies, coupled or not with other approaches, are analyzed.
Affiliation(s)
- Simona De Vita
- Department of Pharmacy, University of Salerno, Via Giovanni Paolo II 132, 84084 Fisciano (SA), Italy
| | - Maria Giovanna Chini
- Department of Biosciences and Territory, University of Molise, Contrada Fonte Lappone, 86090 Pesche (IS), Italy
| | - Giuseppe Bifulco
- Department of Pharmacy, University of Salerno, Via Giovanni Paolo II 132, 84084 Fisciano (SA), Italy
| | - Gianluigi Lauro
- Department of Pharmacy, University of Salerno, Via Giovanni Paolo II 132, 84084 Fisciano (SA), Italy
| |
|
36
|
Liu ZY, Liu F, Cao Y, Peng SL, Pan HW, Hong XQ, Zheng PF. ACSL1, CH25H, GPCPD1, and PLA2G12A as the potential lipid-related diagnostic biomarkers of acute myocardial infarction. Aging (Albany NY) 2023; 15:1394-1411. [PMID: 36863716 PMCID: PMC10042701 DOI: 10.18632/aging.204542] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/24/2022] [Accepted: 02/13/2023] [Indexed: 03/04/2023]
Abstract
Lipid metabolism plays an essential role in the genesis and progress of acute myocardial infarction (AMI). Herein, we identified and verified latent lipid-related genes involved in AMI by bioinformatic analysis. Lipid-related differentially expressed genes (DEGs) involved in AMI were identified using the GSE66360 dataset from the Gene Expression Omnibus (GEO) database and R software packages. Gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were conducted to analyze lipid-related DEGs. Lipid-related genes were identified by two machine learning techniques: least absolute shrinkage and selection operator (LASSO) regression and support vector machine recursive feature elimination (SVM-RFE). The receiver operating characteristic (ROC) curves were used to describe diagnostic accuracy. Furthermore, blood samples were collected from AMI patients and healthy individuals, and real-time quantitative polymerase chain reaction (RT-qPCR) was used to determine the RNA levels of four lipid-related DEGs. Fifty lipid-related DEGs were identified, 28 upregulated and 22 downregulated. Several enrichment terms related to lipid metabolism were found by GO and KEGG enrichment analyses. After LASSO and SVM-RFE screening, four genes (ACSL1, CH25H, GPCPD1, and PLA2G12A) were identified as potential diagnostic biomarkers for AMI. Moreover, the RT-qPCR analysis indicated that the expression levels of the four DEGs in AMI patients and healthy individuals were consistent with the bioinformatics analysis results. The validation of clinical samples suggested that the four lipid-related DEGs are expected to be diagnostic markers for AMI and provide new targets for lipid therapy of AMI.
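The abstract uses ROC curves to describe diagnostic accuracy. The area under such a curve (AUC) equals the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one, with ties counted as one half. The sketch below computes it that way in plain Python. The labels and scores are made-up illustration values, not data from the study, and the function name `roc_auc` is an assumption of this example.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: iterable of 0/1 class labels (1 = case, 0 = control).
    scores: iterable of classifier scores, higher meaning more likely a case.
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical expression-based scores for three cases and three controls.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]
auc = roc_auc(labels, scores)
```

The pairwise form is quadratic in sample count, which is acceptable for small clinical cohorts; a rank-based formulation would be preferable for large datasets.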
Affiliation(s)
- Zheng-Yu Liu
- Department of Cardiology, Hunan Provincial People's Hospital, Changsha 410000, China
- Department of Epidemiology, Hunan Provincial People's Hospital (The First Affiliated Hospital of Hunan Normal University), Changsha 410000, China
- Clinical Medicine Research Center of Heart Failure of Hunan Province, Changsha 410000, China
| | - Fen Liu
- Department of Epidemiology, Hunan Provincial People's Hospital (The First Affiliated Hospital of Hunan Normal University), Changsha 410000, China
- Clinical Medicine Research Center of Heart Failure of Hunan Province, Changsha 410000, China
- The First Affiliated Hospital of Hunan Normal University (Hunan Provincial People's Hospital), Changsha 410000, China
| | - Yan Cao
- Department of Epidemiology, Hunan Provincial People's Hospital (The First Affiliated Hospital of Hunan Normal University), Changsha 410000, China
- Clinical Medicine Research Center of Heart Failure of Hunan Province, Changsha 410000, China
- Department of Emergency, Hunan Provincial People's Hospital, Changsha 410000, China
| | - Shao-Liang Peng
- Department of Epidemiology, Hunan Provincial People's Hospital (The First Affiliated Hospital of Hunan Normal University), Changsha 410000, China
- Clinical Data Center, Hunan Provincial People's Hospital, Changsha 410000, China
| | - Hong-Wei Pan
- Department of Cardiology, Hunan Provincial People's Hospital, Changsha 410000, China
- Department of Epidemiology, Hunan Provincial People's Hospital (The First Affiliated Hospital of Hunan Normal University), Changsha 410000, China
- Clinical Medicine Research Center of Heart Failure of Hunan Province, Changsha 410000, China
| | - Xiu-Qin Hong
- Department of Epidemiology, Hunan Provincial People's Hospital (The First Affiliated Hospital of Hunan Normal University), Changsha 410000, China
- Clinical Medicine Research Center of Heart Failure of Hunan Province, Changsha 410000, China
- The First Affiliated Hospital of Hunan Normal University (Hunan Provincial People's Hospital), Changsha 410000, China
| | - Peng-Fei Zheng
- Department of Cardiology, Hunan Provincial People's Hospital, Changsha 410000, China
- Department of Epidemiology, Hunan Provincial People's Hospital (The First Affiliated Hospital of Hunan Normal University), Changsha 410000, China
- Clinical Medicine Research Center of Heart Failure of Hunan Province, Changsha 410000, China
| |
|
37
|
Zheng PF, Zhou SY, Zhong CQ, Zheng ZF, Liu ZY, Pan HW, Peng JQ. Identification of m6A regulator-mediated RNA methylation modification patterns and key immune-related genes involved in atrial fibrillation. Aging (Albany NY) 2023; 15:1371-1393. [PMID: 36863715 PMCID: PMC10042702 DOI: 10.18632/aging.204537] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2022] [Accepted: 02/11/2023] [Indexed: 03/04/2023]
Abstract
The role of m6A in the regulation of the immune microenvironment in atrial fibrillation (AF) remains unclear. This study systematically evaluated the RNA modification patterns mediated by differential m6A regulators in 62 AF samples, identified the pattern of immune cell infiltration in AF and identified several immune-related genes associated with AF. A total of six key differential m6A regulators between healthy subjects and AF patients were identified by the random forest classifier. Three distinct RNA modification patterns (m6A cluster-A, -B and -C) among AF samples were identified based on the expression of 6 key m6A regulators. Differential infiltrating immune cells and HALLMARKS signaling pathways between normal and AF samples as well as among samples with three distinct m6A modification patterns were identified. A total of 16 overlapping key genes were identified by weighted gene coexpression network analysis (WGCNA) combined with two machine learning methods. The expression levels of the NCF2 and HCST genes were different between controls and AF patient samples as well as among samples with the distinct m6A modification patterns. RT-qPCR also proved that the expression of NCF2 and HCST was significantly increased in AF patients compared with control participants. These results suggested that m6A modification plays a key role in the complexity and diversity of the immune microenvironment of AF. Immunotyping of patients with AF will help to develop more accurate immunotherapy strategies for those with a significant immune response. The NCF2 and HCST genes may be novel biomarkers for the accurate diagnosis and immunotherapy of AF.
Affiliation(s)
- Peng-Fei Zheng
- Cardiology Department, Hunan Provincial People’s Hospital, Furong, Changsha 410000, Hunan, China
- Clinical Research Center for Heart Failure in Hunan Province, Furong, Changsha 410000, Hunan, China
- Institute of Cardiovascular Epidemiology, Hunan Provincial People’s Hospital, Furong, Changsha 410000, Hunan, China
| | - Sen-Yu Zhou
- Institute of Cardiovascular Epidemiology, Hunan Provincial People’s Hospital, Furong, Changsha 410000, Hunan, China
- The First Affiliated Hospital of Hunan Normal University (Hunan Provincial People’s Hospital), Furong, Changsha 410000, Hunan, China
| | - Chang-Qing Zhong
- Cardiology Department, Hunan Provincial People’s Hospital, Furong, Changsha 410000, Hunan, China
- Clinical Research Center for Heart Failure in Hunan Province, Furong, Changsha 410000, Hunan, China
- Institute of Cardiovascular Epidemiology, Hunan Provincial People’s Hospital, Furong, Changsha 410000, Hunan, China
| | - Zhao-Fen Zheng
- Cardiology Department, Hunan Provincial People’s Hospital, Furong, Changsha 410000, Hunan, China
- Clinical Research Center for Heart Failure in Hunan Province, Furong, Changsha 410000, Hunan, China
- Institute of Cardiovascular Epidemiology, Hunan Provincial People’s Hospital, Furong, Changsha 410000, Hunan, China
| | - Zheng-Yu Liu
- Cardiology Department, Hunan Provincial People’s Hospital, Furong, Changsha 410000, Hunan, China
- Clinical Research Center for Heart Failure in Hunan Province, Furong, Changsha 410000, Hunan, China
- Institute of Cardiovascular Epidemiology, Hunan Provincial People’s Hospital, Furong, Changsha 410000, Hunan, China
| | - Hong-Wei Pan
- Cardiology Department, Hunan Provincial People’s Hospital, Furong, Changsha 410000, Hunan, China
- Clinical Research Center for Heart Failure in Hunan Province, Furong, Changsha 410000, Hunan, China
- Institute of Cardiovascular Epidemiology, Hunan Provincial People’s Hospital, Furong, Changsha 410000, Hunan, China
| | - Jian-Qiang Peng
- Cardiology Department, Hunan Provincial People’s Hospital, Furong, Changsha 410000, Hunan, China
- Clinical Research Center for Heart Failure in Hunan Province, Furong, Changsha 410000, Hunan, China
- Institute of Cardiovascular Epidemiology, Hunan Provincial People’s Hospital, Furong, Changsha 410000, Hunan, China
| |
|
38
|
DRaW: prediction of COVID-19 antivirals by deep learning-an objection on using matrix factorization. BMC Bioinformatics 2023; 24:52. [PMID: 36793010 PMCID: PMC9931173 DOI: 10.1186/s12859-023-05181-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/04/2022] [Accepted: 02/09/2023] [Indexed: 02/17/2023] Open
Abstract
BACKGROUND Due to the high resource consumption of introducing a new drug, drug repurposing plays an essential role in drug discovery. To do this, researchers examine the current drug-target interactions (DTIs) to predict new interactions for the approved drugs. Matrix factorization methods have received much attention and are widely used for DTI prediction. However, they suffer from some drawbacks. METHODS We explain why matrix factorization is not the best choice for DTI prediction. Then, we propose a deep learning model (DRaW) to predict DTIs without input data leakage. We compare our model with several matrix factorization methods and a deep model on three COVID-19 datasets. In addition, to validate DRaW, we evaluate it on benchmark datasets. Furthermore, as an external validation, we conduct a docking study on the COVID-19 recommended drugs. RESULTS In all cases, the results confirm that DRaW outperforms matrix factorization and deep models. The docking results support the top-ranked recommended drugs for COVID-19. CONCLUSIONS In this paper, we show that matrix factorization may not be the best choice for DTI prediction. Matrix factorization methods suffer from some intrinsic issues, e.g., sparsity in the domain of bioinformatics applications and the fixed, unchanging size of the matrix-related paradigm. Therefore, we propose an alternative method (DRaW) that uses feature vectors rather than matrix factorization and demonstrates better performance than other well-known methods on three COVID-19 and four benchmark datasets.
|
39
|
Hua Y, Song X, Feng Z, Wu X. MFR-DTA: a multi-functional and robust model for predicting drug-target binding affinity and region. Bioinformatics 2023; 39:7008321. [PMID: 36708000 PMCID: PMC9900210 DOI: 10.1093/bioinformatics/btad056] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Received: 10/09/2022] [Revised: 12/31/2022] [Accepted: 01/25/2023] [Indexed: 01/29/2023] Open
Abstract
MOTIVATION Recently, deep learning has become the mainstream methodology for drug-target binding affinity prediction. However, two deficiencies of the existing methods restrict their practical applications. On the one hand, most existing methods ignore the individual information of sequence elements, resulting in poor sequence feature representations. On the other hand, without prior biological knowledge, the prediction of drug-target binding regions based on attention weights of a deep neural network could be difficult to verify, which may bring adverse interference to biological researchers. RESULTS We propose a novel Multi-Functional and Robust Drug-Target binding Affinity prediction (MFR-DTA) method to address the above issues. Specifically, we design a new biological sequence feature extraction block, namely BioMLP, that assists the model in extracting individual features of sequence elements. Then, we propose a new Elem-feature fusion block to refine the extracted features. After that, we construct a Mix-Decoder block that extracts drug-target interaction information and predicts their binding regions simultaneously. Last, we evaluate MFR-DTA on two benchmarks consistently with the existing methods and propose a new dataset, sc-PDB, to better measure the accuracy of binding region prediction. We also visualize some samples to demonstrate the locations of their binding sites and the predicted multi-scale interaction regions. The proposed method achieves excellent performance on these datasets, demonstrating its merits and superiority over the state-of-the-art methods. AVAILABILITY AND IMPLEMENTATION https://github.com/JU-HuaY/MFR.
Affiliation(s)
- Yang Hua
- School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi 214122, China
| | - Xiaoning Song
- School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi 214122, China
| | - Zhenhua Feng
- School of Computer Science and Electronic Engineering, University of Surrey, Guildford GU2 7XH, UK
| | - Xiaojun Wu
- School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi 214122, China
| |
|
40
|
A Novel Autoencoder-Based Feature Selection Method for Drug-Target Interaction Prediction with Human-Interpretable Feature Weights. Symmetry (Basel) 2023. [DOI: 10.3390/sym15010192] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/11/2023] Open
Abstract
Drug-target interaction prediction provides important information that could be exploited for drug discovery, drug design, and drug repurposing. Chemogenomic approaches for predicting drug-target interaction assume that similar receptors bind to similar ligands. Capturing this similarity in so-called “fingerprints” and combining the target and ligand fingerprints provide an efficient way to search for protein-ligand pairs that are more likely to interact. In this study, we constructed drug and target fingerprints by employing features extracted from the DrugBank. However, the number of extracted features is quite large, necessitating an effective feature selection mechanism since some features can be redundant or irrelevant to drug-target interaction prediction problems. Although such feature selection methods are readily available in the literature, usually they act as black boxes and do not provide any quantitative information about why a specific feature is preferred over another. To alleviate this lack of human interpretability, we proposed a novel feature selection method in which we used an autoencoder as a symmetric learning method and compared the proposed method to some popular feature selection algorithms, such as Kbest, Variance Threshold, and Decision Tree. The results of a detailed performance study, in which we trained six Multi-Layer Perceptron (MLP) Networks of different sizes and configurations for prediction, demonstrate that the proposed method yields superior results compared to the aforementioned methods.
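The abstract describes combining drug and target fingerprints and then pruning redundant features, with Variance Threshold listed among the baseline selectors. The sketch below illustrates that baseline idea only, not the paper's autoencoder method: a drug-target pair is represented by concatenating the two fingerprints, and any feature whose population variance across pairs does not exceed a threshold is dropped. The fingerprint values and the helper names `pair_fingerprint` and `variance_threshold` are hypothetical.

```python
from statistics import pvariance


def pair_fingerprint(drug_fp, target_fp):
    """Represent a drug-target pair by concatenating the two fingerprints."""
    return list(drug_fp) + list(target_fp)


def variance_threshold(rows, threshold=0.0):
    """Keep only feature columns whose population variance exceeds threshold.

    Returns the kept column indices and the reduced feature rows.
    """
    n_features = len(rows[0])
    keep = [
        j for j in range(n_features)
        if pvariance([row[j] for row in rows]) > threshold
    ]
    reduced = [[row[j] for j in keep] for row in rows]
    return keep, reduced


# Hypothetical binary fingerprints: 3 drug bits followed by 2 target bits.
pairs = [
    pair_fingerprint([1, 0, 1], [0, 1]),
    pair_fingerprint([1, 1, 1], [0, 0]),
    pair_fingerprint([1, 0, 1], [0, 1]),
]
keep, reduced = variance_threshold(pairs, threshold=0.0)
```

A selector of this kind discards constant features cheaply but gives no account of why the surviving features matter, which is the interpretability gap the abstract says the autoencoder-based weights are meant to address.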
|
41
|
Hua Y, Song X, Feng Z, Wu XJ, Kittler J, Yu DJ. CPInformer for Efficient and Robust Compound-Protein Interaction Prediction. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2023; 20:285-296. [PMID: 35044921 DOI: 10.1109/tcbb.2022.3144008] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 05/04/2023]
Abstract
Recently, deep learning has become the mainstream methodology for Compound-Protein Interaction (CPI) prediction. However, the existing compound-protein feature extraction methods have some issues that limit their performance. First, graph networks are widely used for structural compound feature extraction, but the chemical properties of a compound depend on functional groups rather than graphic structure. Besides, the existing methods lack capabilities in extracting rich and discriminative protein features. Last, the compound-protein features are usually simply combined for CPI prediction, without considering information redundancy and effective feature mining. To address the above issues, we propose a novel CPInformer method. Specifically, we extract heterogeneous compound features, including structural graph features and functional class fingerprints, to reduce prediction errors caused by similar structural compounds. Then, we combine local and global features using dense connections to obtain multi-scale protein features. Last, we apply ProbSparse self-attention to protein features, under the guidance of compound features, to eliminate information redundancy, and to improve the accuracy of CPInformer. More importantly, the proposed method identifies the activated local regions that link a CPI, providing a good visualisation for the CPI state. The results obtained on five benchmarks demonstrate the merits and superiority of CPInformer over the state-of-the-art approaches.
|
42
|
Bae H, Nam H. GraphATT-DTA: Attention-Based Novel Representation of Interaction to Predict Drug-Target Binding Affinity. Biomedicines 2022; 11:biomedicines11010067. [PMID: 36672575 PMCID: PMC9855982 DOI: 10.3390/biomedicines11010067] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/30/2022] [Revised: 12/06/2022] [Accepted: 12/20/2022] [Indexed: 12/29/2022] Open
Abstract
Drug-target binding affinity (DTA) prediction is an essential step in drug discovery. Drug-target protein binding occurs at specific regions between the protein and drug, rather than the entire protein and drug. However, existing deep-learning DTA prediction methods do not consider the interactions between drug substructures and protein sub-sequences. This work proposes GraphATT-DTA, a DTA prediction model that constructs the essential regions for determining interaction affinity between compounds and proteins, modeled with an attention mechanism for interpretability. We make the model consider the local-to-global interactions with the attention mechanism between compound and protein. As a result, GraphATT-DTA shows improved DTA prediction performance and interpretability compared with state-of-the-art models. The model is trained and evaluated on the Davis dataset, a human kinase dataset; external evaluation uses an independently proposed human kinase dataset from BindingDB.
Affiliation(s)
- Haelee Bae
- AI Graduate School, Gwangju Institute of Science and Technology, 123 Cheomdangwagi-ro, Buk-gu, Gwangju 61005, Republic of Korea
| | - Hojung Nam
- AI Graduate School, Gwangju Institute of Science and Technology, 123 Cheomdangwagi-ro, Buk-gu, Gwangju 61005, Republic of Korea
- School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology, 123 Cheomdangwagi-ro, Buk-gu, Gwangju 61005, Republic of Korea
- Center for AI-Applied High Efficiency Drug Discovery (AHEDD), Gwangju Institute of Science and Technology, 123 Cheomdangwagi-ro, Buk-gu, Gwangju 61005, Republic of Korea
| |
|
43
|
Sinha K, Ghosh J, Sil PC. Machine Learning in Drug Metabolism Study. Curr Drug Metab 2022; 23:CDM-EPUB-128463. [PMID: 36578255 DOI: 10.2174/1389200224666221227094144] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/28/2022] [Revised: 10/27/2022] [Accepted: 11/01/2022] [Indexed: 12/30/2022]
Abstract
Metabolic reactions in the body transform the administered drug into metabolites. These metabolites exhibit diverse biological activities. Drug metabolism is the major underlying cause of drug overdose-related toxicity, adverse drug effects and the drug's reduced efficacy. Though metabolic reactions deactivate a drug, drug metabolites are often considered pivotal agents for off-target effects or toxicity. On the other hand, in combination drug therapy, one drug may influence another drug's metabolism and clearance and is thus considered one of the primary causes of drug-drug interactions. Today, with the advancement of machine learning, the metabolic fate of a drug candidate can be comprehensively studied throughout the drug development procedure. Naïve Bayes, Logistic Regression, k-Nearest Neighbours, Decision Trees, different Boosting and Ensemble methods, Support Vector Machines and Artificial Neural Network boosted Deep Learning are some machine learning algorithms which are being extensively used in such studies. Such tools cover several attributes of drug metabolism, with an emphasis on the prediction of drug-drug interactions, drug-target interactions, clinical drug responses, metabolite predictions, sites of metabolism, etc. These reports are crucial for evaluating metabolic stability and predicting prospective drug-drug interactions, and can help pharmaceutical companies accelerate the drug development process in a less resource-demanding manner than what in vitro studies offer. They could also help medical practitioners to use combinatorial drug therapy in a more resourceful manner. Also, with the help of the enormous growth of deep learning, traditional fields of computational drug development like molecular interaction fields, molecular docking, quantitative structure-to-activity relationship (QSAR) studies and quantum mechanical simulations are producing results which were unimaginable a couple of years ago.
This review provides a glimpse of a few contextually relevant machine learning algorithms and then focuses on their outcomes in different studies.
Affiliation(s)
| | | | - Parames C Sil
- Division of Molecular Medicine, Bose Institute, Kolkata-700054
| |
|
44
|
Bonner S, Barrett IP, Ye C, Swiers R, Engkvist O, Bender A, Hoyt CT, Hamilton WL. A review of biomedical datasets relating to drug discovery: a knowledge graph perspective. Brief Bioinform 2022; 23:6712301. [PMID: 36151740 DOI: 10.1093/bib/bbac404] [Citation(s) in RCA: 20] [Impact Index Per Article: 10.0] [Received: 05/12/2022] [Revised: 07/14/2022] [Accepted: 08/20/2022] [Indexed: 12/14/2022] Open
Abstract
Drug discovery and development is a complex and costly process. Machine learning approaches are being investigated to help improve the effectiveness and speed of multiple stages of the drug discovery pipeline. Of these, those that use Knowledge Graphs (KG) have promise in many tasks, including drug repurposing, drug toxicity prediction and target gene-disease prioritization. In a drug discovery KG, crucial elements including genes, diseases and drugs are represented as entities, while relationships between them indicate an interaction. However, to construct high-quality KGs, suitable data are required. In this review, we detail publicly available sources suitable for use in constructing drug discovery focused KGs. We aim to help guide machine learning and KG practitioners who are interested in applying new techniques to the drug discovery field, but who may be unfamiliar with the relevant data sources. The datasets are selected via strict criteria, categorized according to the primary type of information contained within and are considered based upon what information could be extracted to build a KG. We then present a comparative analysis of existing public drug discovery KGs and an evaluation of selected motivating case studies from the literature. Additionally, we raise numerous and unique challenges and issues associated with the domain and its datasets, while also highlighting key future research directions. We hope this review will motivate the use of KGs in solving key and emerging questions in the drug discovery domain.
Affiliation(s)
- Stephen Bonner
- Data Sciences and Quantitative Biology, Discovery Sciences, R&D, AstraZeneca, Cambridge, UK
| | - Ian P Barrett
- Data Sciences and Quantitative Biology, Discovery Sciences, R&D, AstraZeneca, Cambridge, UK
| | - Cheng Ye
- Data Sciences and Quantitative Biology, Discovery Sciences, R&D, AstraZeneca, Cambridge, UK
| | - Rowan Swiers
- Data Sciences and Quantitative Biology, Discovery Sciences, R&D, AstraZeneca, Cambridge, UK
| | - Ola Engkvist
- Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg, Sweden
| | - Andreas Bender
- Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, UK
| | | | - William L Hamilton
- School of Computer Science, McGill University, Canada; Mila-Quebec AI Institute, Montreal, Canada
| |
|
45
|
Mongia A, Chouzenoux E, Majumdar A. Computational Prediction of Drug-Disease Association Based on Graph-Regularized One Bit Matrix Completion. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:3332-3339. [PMID: 35816539 DOI: 10.1109/tcbb.2022.3189879] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/15/2023]
Abstract
Investigation of existing drugs is an effective alternative to the discovery of new drugs for treating diseases. This task of drug re-positioning can be assisted by various kinds of computational methods to predict the best indication for a drug given the open-source biological datasets. Owing to the fact that similar drugs tend to have common pathways and disease indications, the association matrix is assumed to be of low-rank structure. Hence, the problem of drug-disease association prediction can be modeled as a low-rank matrix completion problem. In this work, we propose a novel matrix completion framework that makes use of the side-information associated with drugs/diseases, modeled as a neighborhood graph, for the prediction of drug-disease indications: Graph regularized 1-bit matrix completion (GR1BMC). The algorithm is specially designed for binary data and uses a parallel proximal algorithm to solve the aforesaid minimization problem, taking into account all the constraints, including the neighborhood graph incorporation and restricting predicted scores within the specified range. The results have been validated on two standard databases by evaluating the AUC across the 10-fold cross-validation splits. The usage of the method is also evaluated through a case study where the top 5 indications are predicted for novel drugs, which are then verified with the CTD database.
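The low-rank assumption in this abstract can be illustrated with a much simpler stand-in than GR1BMC itself. The sketch below fits a logistic (1-bit) low-rank factorization to the observed entries of a small binary association matrix using plain stochastic gradient descent with L2 regularization. It deliberately omits the graph regularizer and the parallel proximal solver described in the abstract; the toy matrix, the hyperparameters, and the function names are all assumptions made for illustration.

```python
import math
import random


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def logistic_loss(U, V, observed):
    """Mean negative log-likelihood over the observed binary entries."""
    total = 0.0
    for (i, j), y in observed.items():
        p = min(max(sigmoid(dot(U[i], V[j])), 1e-12), 1.0 - 1e-12)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(observed)


def fit(observed, n_rows, n_cols, rank=2, lr=0.1, reg=0.01, epochs=300, seed=0):
    """Fit row factors U and column factors V by SGD on observed entries."""
    rng = random.Random(seed)
    U = [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in range(n_rows)]
    V = [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in range(n_cols)]
    initial = logistic_loss(U, V, observed)
    for _ in range(epochs):
        for (i, j), y in observed.items():
            err = sigmoid(dot(U[i], V[j])) - y  # gradient of the loss w.r.t. the logit
            for k in range(rank):
                u, v = U[i][k], V[j][k]
                U[i][k] -= lr * (err * v + reg * u)
                V[j][k] -= lr * (err * u + reg * v)
    return U, V, initial, logistic_loss(U, V, observed)


# Toy 4x4 drug-disease matrix with two blocks; entry (0, 1) is held out.
Y = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1]]
observed = {(i, j): Y[i][j] for i in range(4) for j in range(4) if (i, j) != (0, 1)}
U, V, initial_loss, final_loss = fit(observed, n_rows=4, n_cols=4)
held_out_score = sigmoid(dot(U[0], V[1]))  # predicted association probability
```

The held-out score is the model's estimate for the unobserved pair. Under the abstract's approach, similarity graphs over drugs and diseases would additionally pull the factors of neighboring entities toward each other, which this sketch does not attempt.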
|
46
|
Wang YX, Yang Z, Wang WX, Huang YX, Zhang Q, Li JJ, Tang YP, Yue SJ. Methodology of network pharmacology for research on Chinese herbal medicine against COVID-19: A review. Journal of Integrative Medicine 2022; 20:477-487. [PMID: 36182651 PMCID: PMC9508683 DOI: 10.1016/j.joim.2022.09.004] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 01/06/2022] [Accepted: 08/15/2022] [Indexed: 12/09/2022]
Abstract
Traditional Chinese medicine, as a complementary and alternative medicine, has been practiced for thousands of years in China and possesses remarkable clinical efficacy. Thus, systematic analysis and examination of the mechanistic links between Chinese herbal medicine (CHM) and the complex human body can benefit contemporary understandings by carrying out qualitative and quantitative analysis. With increasing attention, the approach of network pharmacology has begun to unveil the mystery of CHM by constructing the heterogeneous network relationship of "herb-compound-target-pathway," which corresponds to the holistic mechanisms of CHM. By integrating computational techniques into network pharmacology, the efficiency and accuracy of active compound screening and target fishing have been improved at an unprecedented pace. This review dissects the core innovations to the network pharmacology approach that were developed in the years since 2015 and highlights how this tool has been applied to understanding the coronavirus disease 2019 and refining the clinical use of CHM to combat it.
Affiliation(s)
- Yi-xuan Wang
- Key Laboratory of Shaanxi Administration of Traditional Chinese Medicine for TCM Compatibility, State Key Laboratory of Research & Development of Characteristic Qin Medicine Resources (Cultivation), and Shaanxi Collaborative Innovation Center of Chinese Medicinal Resources Industrialization, Shaanxi University of Chinese Medicine, Xi’an 712046, Shaanxi Province, China; Department of Scientific Research, Shaanxi Provincial People’s Hospital, Xi’an 710068, Shaanxi Province, China
| | - Zhen Yang
- Key Laboratory of Shaanxi Administration of Traditional Chinese Medicine for TCM Compatibility, State Key Laboratory of Research & Development of Characteristic Qin Medicine Resources (Cultivation), and Shaanxi Collaborative Innovation Center of Chinese Medicinal Resources Industrialization, Shaanxi University of Chinese Medicine, Xi’an 712046, Shaanxi Province, China
| | - Wen-xiao Wang
- Key Laboratory of Shaanxi Administration of Traditional Chinese Medicine for TCM Compatibility, State Key Laboratory of Research & Development of Characteristic Qin Medicine Resources (Cultivation), and Shaanxi Collaborative Innovation Center of Chinese Medicinal Resources Industrialization, Shaanxi University of Chinese Medicine, Xi’an 712046, Shaanxi Province, China
| | - Yu-xi Huang
- Key Laboratory of Shaanxi Administration of Traditional Chinese Medicine for TCM Compatibility, State Key Laboratory of Research & Development of Characteristic Qin Medicine Resources (Cultivation), and Shaanxi Collaborative Innovation Center of Chinese Medicinal Resources Industrialization, Shaanxi University of Chinese Medicine, Xi’an 712046, Shaanxi Province, China
- Qiao Zhang
- Key Laboratory of Shaanxi Administration of Traditional Chinese Medicine for TCM Compatibility, State Key Laboratory of Research & Development of Characteristic Qin Medicine Resources (Cultivation), and Shaanxi Collaborative Innovation Center of Chinese Medicinal Resources Industrialization, Shaanxi University of Chinese Medicine, Xi’an 712046, Shaanxi Province, China
- Jia-jia Li
- Key Laboratory of Shaanxi Administration of Traditional Chinese Medicine for TCM Compatibility, State Key Laboratory of Research & Development of Characteristic Qin Medicine Resources (Cultivation), and Shaanxi Collaborative Innovation Center of Chinese Medicinal Resources Industrialization, Shaanxi University of Chinese Medicine, Xi’an 712046, Shaanxi Province, China
- Yu-ping Tang
- Key Laboratory of Shaanxi Administration of Traditional Chinese Medicine for TCM Compatibility, State Key Laboratory of Research & Development of Characteristic Qin Medicine Resources (Cultivation), and Shaanxi Collaborative Innovation Center of Chinese Medicinal Resources Industrialization, Shaanxi University of Chinese Medicine, Xi’an 712046, Shaanxi Province, China
- Shi-jun Yue
- Key Laboratory of Shaanxi Administration of Traditional Chinese Medicine for TCM Compatibility, State Key Laboratory of Research & Development of Characteristic Qin Medicine Resources (Cultivation), and Shaanxi Collaborative Innovation Center of Chinese Medicinal Resources Industrialization, Shaanxi University of Chinese Medicine, Xi’an 712046, Shaanxi Province, China (corresponding author)
|
47
|
Drug-Disease Association Prediction Using Heterogeneous Networks for Computational Drug Repositioning. Biomolecules 2022; 12:biom12101497. [PMID: 36291706 PMCID: PMC9599692 DOI: 10.3390/biom12101497] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/02/2022] [Revised: 10/10/2022] [Accepted: 10/13/2022] [Indexed: 11/18/2022]
Abstract
Drug repositioning, which involves the identification of new therapeutic indications for approved drugs, considerably reduces the time and cost of developing new drugs. Recent computational drug repositioning methods use heterogeneous networks to identify drug–disease associations. This review surveys existing network-based approaches for predicting drug–disease associations, grouped into three major categories: graph mining, matrix factorization or completion, and deep learning. We selected eleven methods from the three categories to compare their predictive performance. The experiments were conducted separately on the drug side and the disease side, using two uniform datasets. We constructed heterogeneous networks using drug–drug similarities based on chemical structures and ATC codes, ontology-based disease–disease similarities, and drug–disease associations. An improved evaluation metric was used to account for data imbalance, as positive associations are typically sparse. The prediction results demonstrated that methods in the graph mining and matrix factorization or completion categories performed well in the overall assessment. Furthermore, prediction on the drug side had higher accuracy than on the disease side. Selecting and integrating informative drug features in drug–drug similarity measurement is crucial for improving disease-side prediction.
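The heterogeneous network described in this abstract combines three ingredients: a drug–drug similarity matrix, a disease–disease similarity matrix, and a drug–disease association matrix. One common way to hold such a network is a single block adjacency matrix; the sketch below illustrates that layout only. It is not taken from the paper: the function name `build_heterogeneous_adjacency` and the toy values are hypothetical, and the reviewed methods may each represent the network differently.

```python
def build_heterogeneous_adjacency(drug_sim, disease_sim, associations):
    """Stack the three inputs into one symmetric block matrix.

    Layout (drugs first, then diseases):
        [ drug_sim        associations ]
        [ associations^T  disease_sim  ]
    """
    n_drugs, n_diseases = len(drug_sim), len(disease_sim)
    if any(len(row) != n_drugs for row in drug_sim):
        raise ValueError("drug_sim must be square")
    if any(len(row) != n_diseases for row in disease_sim):
        raise ValueError("disease_sim must be square")
    if len(associations) != n_drugs or any(
        len(row) != n_diseases for row in associations
    ):
        raise ValueError("associations must be n_drugs x n_diseases")

    # Top block rows: one row per drug.
    top = [list(drug_sim[i]) + list(associations[i]) for i in range(n_drugs)]
    # Bottom block rows: one row per disease (transposed associations first).
    bottom = [
        [associations[i][j] for i in range(n_drugs)] + list(disease_sim[j])
        for j in range(n_diseases)
    ]
    return top + bottom


# Toy example (hypothetical values): 2 drugs, 3 diseases.
drug_sim = [[1.0, 0.2], [0.2, 1.0]]
disease_sim = [[1.0, 0.5, 0.0], [0.5, 1.0, 0.3], [0.0, 0.3, 1.0]]
associations = [[1, 0, 0], [0, 1, 1]]  # rows: drugs, columns: diseases
network = build_heterogeneous_adjacency(drug_sim, disease_sim, associations)
```

With symmetric similarity matrices the resulting matrix is symmetric, so graph-mining or matrix-completion routines that expect an undirected network can consume it directly.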
|
48
|
Identification of sitagliptin binding proteins by affinity purification mass spectrometry. Acta Biochim Biophys Sin (Shanghai) 2022; 54:1453-1463. [PMID: 36239351 PMCID: PMC9827809 DOI: 10.3724/abbs.2022142] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 01/07/2023]
Abstract
Type 2 diabetes mellitus (T2DM) is recognized as a serious public health concern with increasing incidence. The dipeptidyl peptidase-4 (DPP-4) inhibitor sitagliptin has been used for the treatment of T2DM worldwide. Although sitagliptin has excellent therapeutic outcomes, adverse effects are observed. In addition, previous studies have suggested that sitagliptin may have pleiotropic effects other than treating T2DM. These pieces of evidence point to the importance of further investigation of the molecular mechanisms of sitagliptin, starting from the identification of sitagliptin-binding proteins. In this study, by combining affinity purification mass spectrometry (AP-MS) and stable isotope labeling by amino acids in cell culture (SILAC), we discover seven high-confidence targets that can interact with sitagliptin. Surface plasmon resonance (SPR) assay confirms the binding of sitagliptin to three proteins, i.e., LYPLAL1, TCP1, and CCAR2, with binding affinities (K_D) ranging from 50.1 μM to 1490 μM. Molecular docking followed by molecular dynamics (MD) simulation reveals hydrogen bonding between sitagliptin and the catalytic triad of LYPLAL1, and also between sitagliptin and the P-loop of the ATP-binding pocket of TCP1. Molecular mechanics Poisson-Boltzmann Surface Area (MMPBSA) analysis indicates that sitagliptin can stably bind to LYPLAL1 and TCP1 in active sites, which may have an impact on the functions of these proteins. SPR analysis shows that the binding affinity of sitagliptin for the TCP1 mutant D88A is ~10 times lower than that for wild-type TCP1. Our findings provide insights into the sitagliptin-target interplay and demonstrate the potential of sitagliptin in regulating gluconeogenesis and in anti-tumor drug development.
|
49
|
Yaseen A, Amin I, Akhter N, Ben-Hur A, Minhas F. Insights into performance evaluation of compound-protein interaction prediction methods. Bioinformatics 2022; 38:ii75-ii81. [PMID: 36124806 DOI: 10.1093/bioinformatics/btac496] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 12/25/2022]
Abstract
MOTIVATION: Machine-learning-based prediction of compound-protein interactions (CPIs) is important for drug design, screening and repurposing. Despite numerous recent publications with increasing methodological sophistication claiming consistent improvements in predictive accuracy, we have observed a number of fundamental issues in experiment design that produce overoptimistic estimates of model performance. RESULTS: We systematically analyze the impact of several factors affecting generalization performance of CPI predictors that are overlooked in existing work: (i) similarity between training and test examples in cross-validation; (ii) synthesizing negative examples in the absence of experimentally verified negative examples and (iii) alignment of evaluation protocol and performance metrics with real-world use of CPI predictors in screening large compound libraries. Using both state-of-the-art approaches by other researchers as well as a simple kernel-based baseline, we have found that effective assessment of generalization performance of CPI predictors requires careful control over similarity between training and test examples. We show that, under stringent performance assessment protocols, a simple kernel-based approach can exceed the predictive performance of existing state-of-the-art methods. We also show that random pairing for generating synthetic negative examples for training and performance evaluation results in models with better generalization in comparison to more sophisticated strategies used in existing studies. Our analyses indicate that using the proposed experiment design strategies can offer significant improvements for CPI prediction, leading to effective target compound screening for drug repurposing and discovery of putative chemical ligands of SARS-CoV-2-Spike and Human-ACE2 proteins. AVAILABILITY AND IMPLEMENTATION: Code and supplementary material available at https://github.com/adibayaseen/HKRCPI.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
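The abstract above reports that random pairing is an effective way to synthesize negative examples when no experimentally verified negatives exist. As a rough illustration of that idea only, the sketch below pairs compounds and proteins uniformly at random and rejects any pair already known to interact. It is not the authors' implementation (their code is in the repository linked above); the function name `sample_random_negatives`, the identifiers, and the toy data are hypothetical.

```python
import random


def sample_random_negatives(positives, n_negatives, seed=0):
    """Draw compound-protein pairs uniformly at random, skipping known positives."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    known = set(positives)
    compounds = sorted({compound for compound, _ in known})
    proteins = sorted({protein for _, protein in known})

    # Guard against asking for more negatives than unlabeled pairs exist.
    max_negatives = len(compounds) * len(proteins) - len(known)
    if n_negatives > max_negatives:
        raise ValueError("not enough unlabeled pairs to sample from")

    negatives = set()
    while len(negatives) < n_negatives:
        pair = (rng.choice(compounds), rng.choice(proteins))
        if pair not in known:  # rejection step: never relabel a positive
            negatives.add(pair)
    return sorted(negatives)


# Toy example (hypothetical identifiers): four known interactions.
positives = [("c1", "p1"), ("c2", "p2"), ("c3", "p1"), ("c3", "p3")]
negatives = sample_random_negatives(positives, n_negatives=4)
```

Note that such sampled pairs are only presumed negatives: an unlabeled pair may still be a true interaction that has not been measured, which is one reason the choice of negative-sampling strategy affects the reported performance.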
Affiliation(s)
- Adiba Yaseen
- Department of Computer and Information Sciences (DCIS), Pakistan Institute of Engineering and Applied Sciences (PIEAS), Islamabad 45650, Pakistan
- Imran Amin
- National Institute for Biotechnology and Genetic Engineering, Faisalabad 38000, Pakistan
- Naeem Akhter
- Department of Computer and Information Sciences (DCIS), Pakistan Institute of Engineering and Applied Sciences (PIEAS), Islamabad 45650, Pakistan
- Asa Ben-Hur
- Department of Computer Science, Colorado State University, Fort Collins, CO 80523, USA
- Fayyaz Minhas
- Department of Computer Science, University of Warwick, Coventry CV4 7AL, UK
|
50
|
Huang L, Zhang L, Chen X. Updated review of advances in microRNAs and complex diseases: taxonomy, trends and challenges of computational models. Brief Bioinform 2022; 23:6686738. [PMID: 36056743 DOI: 10.1093/bib/bbac358] [Citation(s) in RCA: 34] [Impact Index Per Article: 17.0] [Received: 06/20/2022] [Revised: 07/24/2022] [Accepted: 07/30/2022] [Indexed: 12/12/2022]
Abstract
Since the problem was proposed in the late 2000s, microRNA-disease association (MDA) predictions have been implemented based on the data fusion paradigm. Integrating diverse data sources yields a more comprehensive research perspective, but also poses a challenge for algorithm design: generating accurate, concise and consistent representations of the fused data. After more than a decade of research progress, a relatively simple algorithm like the score function or a single computation layer may no longer be sufficient for further improving predictive performance. Advanced model design has become more frequent in recent years, particularly in the form of reasonably combining multiple algorithms, a process known as model fusion. In the current review, we present 29 state-of-the-art models and introduce a taxonomy of computational models for MDA prediction based on model fusion and non-fusion. The new taxonomy exhibits notable changes in the algorithmic architecture of models, compared with that of earlier ones in the 2017 review by Chen et al. Moreover, we discuss the progress that has been made towards overcoming the obstacles to effective MDA prediction since 2017 and elaborate on how future models can be designed according to a set of new schemas. Lastly, we analyse the strengths and weaknesses of each model category in the proposed taxonomy and propose future research directions from diverse perspectives for enhancing model performance.
Affiliation(s)
- Li Huang
- Academy of Arts and Design, Tsinghua University, Beijing, 10084, China; The Future Laboratory, Tsinghua University, Beijing, 10084, China
- Li Zhang
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, 221116, China
- Xing Chen
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, 221116, China; Artificial Intelligence Research Institute, China University of Mining and Technology, Xuzhou, 221116, China
|