1
Zhang B, Quan L, Zhang Z, Cao L, Chen Q, Peng L, Wang J, Jiang Y, Nie L, Li G, Wu T, Lyu Q. MVCL-DTI: Predicting Drug-Target Interactions Using a Multiview Contrastive Learning Model on a Heterogeneous Graph. J Chem Inf Model 2025; 65:1009-1026. [PMID: 39812134 DOI: 10.1021/acs.jcim.4c02073] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/16/2025]
Abstract
Accurate prediction of drug-target interactions (DTIs) is pivotal for accelerating drug discovery and drug repurposing. MVCL-DTI, a novel model that leverages heterogeneous graphs to predict DTIs, tackles the challenge of synthesizing information from varied biological subnetworks. It integrates a neighbor view, a meta-path view, and a diffusion view to capture semantic features, and it employs an attention-based contrastive learning approach together with a multiview attention-weighted fusion module to integrate and adaptively weight the information from the different views. Tested on benchmark data sets under various conditions, including varying positive-to-negative sample ratios, hard negative sampling, masking known DTIs at different ratios, and removing redundant DTIs under various similarity metrics, MVCL-DTI exhibits robust generalization. The model is then used to predict novel DTIs, with a particular focus on COVID-19-related drugs, highlighting its practical applicability. Finally, through feature visualization and analysis of computational properties, we identify critical elements, including Gene Ontology and substituent nodes, along with a proper initialization strategy, underscoring their vital role in DTI prediction tasks.
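The attention-weighted fusion of several view embeddings mentioned in this abstract can be illustrated with a minimal sketch. This is not the MVCL-DTI implementation: the dot-product scoring against a single query vector, the function names, and the toy two-dimensional embeddings are all assumptions made for illustration.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fuse_views(views, query):
    """Weight each view embedding by its (assumed) dot-product score
    against a query vector, then return the weighted sum."""
    scores = [sum(v_d * q_d for v_d, q_d in zip(view, query)) for view in views]
    weights = softmax(scores)
    dim = len(views[0])
    fused = [sum(w * view[d] for w, view in zip(weights, views)) for d in range(dim)]
    return fused, weights

# Hypothetical embeddings of one node under the three views.
neighbor_view = [1.0, 0.0]
meta_path_view = [0.0, 1.0]
diffusion_view = [1.0, 1.0]

fused, weights = fuse_views(
    [neighbor_view, meta_path_view, diffusion_view], query=[1.0, 1.0]
)
```

In this toy case the diffusion view scores highest against the query and therefore receives the largest weight; in a trained model the query (or an equivalent scoring network) would be learned.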
Affiliation(s)
- Bei Zhang
  - School of Computer Science and Technology, Soochow University, Jiangsu 215006, China
  - China Mobile (Suzhou) Software Technology Company Limited, Suzhou 215163, China
- Lijun Quan
  - School of Computer Science and Technology, Soochow University, Jiangsu 215006, China
  - Collaborative Innovation Center of Novel Software Technology and Industrialization, Jiangsu 210000, China
- Zhijun Zhang
  - School of Computer Science and Technology, Soochow University, Jiangsu 215006, China
- Lexin Cao
  - School of Computer Science and Technology, Soochow University, Jiangsu 215006, China
- Qiufeng Chen
  - School of Computer Science and Technology, Soochow University, Jiangsu 215006, China
- Liangchen Peng
  - School of Computer Science and Technology, Soochow University, Jiangsu 215006, China
- Junkai Wang
  - School of Computer Science and Technology, Soochow University, Jiangsu 215006, China
- Yelu Jiang
  - School of Computer Science and Technology, Soochow University, Jiangsu 215006, China
- Liangpeng Nie
  - School of Computer Science and Technology, Soochow University, Jiangsu 215006, China
- Geng Li
  - School of Computer Science and Technology, Soochow University, Jiangsu 215006, China
- Tingfang Wu
  - School of Computer Science and Technology, Soochow University, Jiangsu 215006, China
  - Collaborative Innovation Center of Novel Software Technology and Industrialization, Jiangsu 210000, China
- Qiang Lyu
  - School of Computer Science and Technology, Soochow University, Jiangsu 215006, China
  - Collaborative Innovation Center of Novel Software Technology and Industrialization, Jiangsu 210000, China
2
Mishra VP, Singh YN, Khan F, Dutta MK. SeqDPI: A 1D-CNN approach for predicting binding affinity of kinase inhibitors. J Comput Chem 2025; 46:e27518. [PMID: 39644133 DOI: 10.1002/jcc.27518] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/12/2024] [Revised: 08/26/2024] [Accepted: 10/13/2024] [Indexed: 12/09/2024]
Abstract
Predicting drug-target binding affinity is highly relevant to modern drug discovery and drug repositioning, where it helps doctors arrive at new drugs or reuse existing drugs against new target proteins. In silico models based on advanced deep learning techniques can further assist these prediction tasks by highlighting the most promising drug-target pairs. With this in mind, this study develops a deep learning framework to support drug-target interaction prediction. The proposed SeqDPI model extracts drug and protein features from one-dimensional sequence representations of the dataset using optimized CNNs that apply convolutions over amino acid subsequences of varying length to capture hidden patterns. The convolved drug-protein features are then fed to an L2-penalized feed-forward neural network, which matches local residue patterns in protein classes with molecular fingerprints of drugs to predict the binding strength of every drug-target pair. The proposed model reduces the convolution overhead typically encountered in existing in silico models that rely on complex 3D structures of drugs and proteins. SeqDPI achieves a mean squared error (MSE) of 0.167 across cross-validation folds, outperforming baseline models such as KronRLS (0.406), SimBoost (0.226), and DeepPS (0.214). SeqDPI also attains a concordance index (CI) of 0.9114 on the benchmark KIBA dataset, which the authors present as evidence of its statistical significance and computational efficiency compared with existing methods. These results indicate that SeqDPI predicts binding affinities accurately while working with simpler one-dimensional data, making it a robust and computationally cost-effective solution for drug-target interaction prediction.
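The two metrics reported in this abstract, MSE and CI (the concordance index commonly used for binding-affinity benchmarks), can be computed as in the sketch below. The tie handling (tied predictions count as 0.5, pairs with equal true affinity are skipped) follows the usual convention and is an assumption here, as are the function names and the toy values.

```python
def mean_squared_error(y_true, y_pred):
    """Average squared difference between measured and predicted affinities."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def concordance_index(y_true, y_pred):
    """Fraction of comparable pairs whose predicted order matches the true order.

    Pairs with equal true affinity are not comparable and are skipped;
    pairs with tied predictions contribute 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(y_true)
    for i in range(n):
        for j in range(i + 1, n):
            if y_true[i] == y_true[j]:
                continue
            comparable += 1
            diff_true = y_true[i] - y_true[j]
            diff_pred = y_pred[i] - y_pred[j]
            if diff_pred == 0:
                concordant += 0.5
            elif (diff_true > 0) == (diff_pred > 0):
                concordant += 1.0
    return concordant / comparable if comparable else float("nan")

# Toy affinities, purely illustrative (not data from the paper).
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
mse = mean_squared_error(measured, predicted)
ci = concordance_index(measured, predicted)
```

A CI of 1.0 means every pair is ranked correctly, 0.5 corresponds to random ranking, and MSE penalizes the magnitude of the errors rather than their order, which is why the two are usually reported together.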
Affiliation(s)
- Vinay Priy Mishra
  - Centre for Advanced Studies, Dr. A.P.J. Abdul Kalam Technical University, Lucknow, India
- Yogendra Narain Singh
  - Department of Computer Science & Engineering, Institute of Engineering and Technology, Lucknow, India
- Feroz Khan
  - Technology Dissemination & Computational Biology Division, CSIR-Central Institute of Medicinal and Aromatic Plants, Lucknow, India
3
Özçelik R, Grisoni F. A hitchhiker's guide to deep chemical language processing for bioactivity prediction. Digital Discovery 2024:d4dd00311j. [PMID: 39726698 PMCID: PMC11667676 DOI: 10.1039/d4dd00311j] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/26/2024] [Accepted: 12/13/2024] [Indexed: 12/28/2024]
Abstract
Deep learning has significantly accelerated drug discovery, with 'chemical language' processing (CLP) emerging as a prominent approach. CLP approaches learn from molecular string representations (e.g., Simplified Molecular Input Line Entry Systems [SMILES] and Self-Referencing Embedded Strings [SELFIES]) with methods akin to natural language processing. Despite their growing importance, training predictive CLP models is far from trivial, as it involves many 'bells and whistles'. Here, we analyze the key elements of CLP and provide guidelines for newcomers and experts. Our study spans three neural network architectures, two string representations, three embedding strategies, across ten bioactivity datasets, for both classification and regression purposes. This 'hitchhiker's guide' not only underscores the importance of certain methodological decisions, but it also equips researchers with practical recommendations on ideal choices, e.g., in terms of neural network architectures, molecular representations, and hyperparameter optimization.
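As an illustration of how a molecular string becomes model input in chemical language processing, the sketch below tokenizes SMILES strings and encodes them as padded integer sequences. The regular expression is a simplified pattern chosen for this example, not the tokenizer used in the paper; it does not cover every SMILES feature, and the helper names and the padding index 0 are assumptions.

```python
import re

# Simplified SMILES token pattern (illustrative assumption): bracket atoms,
# two-letter halogens, two-digit ring closures, single-letter atoms,
# ring-closure digits, and bond/branch symbols.
SMILES_TOKEN = re.compile(r"\[[^\]]+\]|Br|Cl|%\d{2}|[A-Za-z]|\d|[=#\-+()/\\.@:~*$]")

def tokenize(smiles):
    """Split a SMILES string into tokens; fail loudly on unknown characters."""
    tokens = SMILES_TOKEN.findall(smiles)
    if "".join(tokens) != smiles:
        raise ValueError(f"unrecognized characters in SMILES: {smiles!r}")
    return tokens

def build_vocab(token_lists):
    """Map each distinct token to an integer id, reserving 0 for padding."""
    symbols = sorted({tok for toks in token_lists for tok in toks})
    return {tok: idx for idx, tok in enumerate(symbols, start=1)}

def encode(tokens, vocab, max_len):
    """Convert tokens to ids, truncating or right-padding with 0 to max_len."""
    ids = [vocab[tok] for tok in tokens][:max_len]
    return ids + [0] * (max_len - len(ids))

corpus = [tokenize("CC(=O)Oc1ccccc1C(=O)O"), tokenize("ClCBr")]
vocab = build_vocab(corpus)
encoded = encode(corpus[1], vocab, max_len=5)
```

The resulting integer sequences are what an embedding layer would consume; whether to learn embeddings, use one-hot vectors, or switch to SELFIES are exactly the kinds of choices the guide compares.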
Affiliation(s)
- Rıza Özçelik
  - Eindhoven University of Technology, Institute for Complex Molecular Systems, Eindhoven AI Systems Institute, Dept. Biomedical Engineering, Eindhoven, Netherlands
  - Centre for Living Technologies, Alliance TU/e, WUR, UU, UMC Utrecht, Netherlands
- Francesca Grisoni
  - Eindhoven University of Technology, Institute for Complex Molecular Systems, Eindhoven AI Systems Institute, Dept. Biomedical Engineering, Eindhoven, Netherlands
  - Centre for Living Technologies, Alliance TU/e, WUR, UU, UMC Utrecht, Netherlands
4
De Waele G, Menschaert G, Waegeman W. An antimicrobial drug recommender system using MALDI-TOF MS and dual-branch neural networks. eLife 2024; 13:RP93242. [PMID: 39540875 PMCID: PMC11563574 DOI: 10.7554/elife.93242] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2024]
Abstract
Timely and effective use of antimicrobial drugs can improve patient outcomes, as well as help safeguard against resistance development. Matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) is currently routinely used in clinical diagnostics for rapid species identification. Mining additional data from said spectra in the form of antimicrobial resistance (AMR) profiles is, therefore, highly promising. Such AMR profiles could serve as a drop-in solution for drastically improving treatment efficiency, effectiveness, and costs. This study endeavors to develop the first machine learning models capable of predicting AMR profiles for the whole repertoire of species and drugs encountered in clinical microbiology. The resulting models can be interpreted as drug recommender systems for infectious diseases. We find that our dual-branch method delivers considerably higher performance compared to previous approaches. In addition, experiments show that the models can be efficiently fine-tuned to data from other clinical laboratories. MALDI-TOF-based AMR recommender systems can, hence, greatly extend the value of MALDI-TOF MS for clinical diagnostics. All code supporting this study is distributed on PyPI and is packaged at https://github.com/gdewael/maldi-nn.
Affiliation(s)
- Gaetan De Waele
  - Department of Data Analysis and Mathematical Modelling, Ghent University, Ghent, Belgium
- Gerben Menschaert
  - Department of Data Analysis and Mathematical Modelling, Ghent University, Ghent, Belgium
- Willem Waegeman
  - Department of Data Analysis and Mathematical Modelling, Ghent University, Ghent, Belgium
5
Aruna AS, Babu KRR, Deepthi K. A deep drug prediction framework for viral infectious diseases using an optimizer-based ensemble of convolutional neural network: COVID-19 as a case study. Mol Divers 2024:10.1007/s11030-024-11003-7. [PMID: 39379663 DOI: 10.1007/s11030-024-11003-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/28/2024] [Accepted: 09/26/2024] [Indexed: 10/10/2024]
Abstract
The SARS-CoV-2 outbreak highlights the persistent vulnerability of humanity to epidemics and emerging microbial threats, emphasizing the lack of time to develop disease-specific treatments. Therefore, it appears beneficial to utilize existing resources and therapies. Computational drug repositioning is an effective strategy that redirects authorized drugs to new therapeutic purposes. This strategy holds significant promise for newly emerging diseases, as drug discovery is a lengthy and expensive process. In this study, we present an ensemble method based on a convolutional neural network integrated with a genetic algorithm and a deep forest classifier for virus-drug association prediction (CGDVDA). We generated feature vectors by combining drug chemical structure and virus genomic sequence-based similarities, and extracted prominent deep features by applying the convolutional neural network. The convolved features are optimized using the genetic algorithm and classified using the ensemble deep forest classifier to predict novel virus-drug associations. The proposed method predicts drugs for COVID-19 and other viral diseases in the dataset. The model achieves a ROC-AUC score of 0.9159 on fivefold cross-validation. We compared the performance of the model with state-of-the-art approaches and classifiers. The experimental results and case studies illustrate the efficacy of CGDVDA in predicting drugs against viral infectious diseases.
Affiliation(s)
- A S Aruna
  - Dept. of Information Technology, Government Engineering College Palakkad, APJ Abdul Kalam Technological University, Palakkad, Kerala, 678633, India
  - Department of Computer Science, College of Engineering Vadakara, Kozhikode, Kerala, 673105, India
- K R Remesh Babu
  - Dept. of Information Technology, Government Engineering College Palakkad, APJ Abdul Kalam Technological University, Palakkad, Kerala, 678633, India
- K Deepthi
  - Department of Computer Science, Central University of Kerala (Govt. of India), Kasaragod, Kerala, 671320, India
6
Wang Y, Su Y, Zhao K, Huo D, Du Z, Wang Z, Xie H, Liu L, Jin Q, Ren X, Chen X, Zhang D. A deep learning drug screening framework for integrating local-global characteristics: A novel attempt for limited data. Heliyon 2024; 10:e34244. [PMID: 39130417 PMCID: PMC11315141 DOI: 10.1016/j.heliyon.2024.e34244] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/20/2024] [Revised: 05/31/2024] [Accepted: 07/05/2024] [Indexed: 08/13/2024]
Abstract
At the beginning of the "Disease X" outbreak, drug discovery and development are often challenged by insufficient and unbalanced data. To address this problem and maximize the information value of limited data, we propose a drug screening model, LGCNN, based on convolutional neural network (CNN), which enables rapid drug screening by integrating features of drug molecular structures and drug-target interactions at both local and global (LG) levels. Experimental results show that LGCNN exhibits better performance compared to other state-of-the-art classification methods under limited data. In addition, LGCNN was applied to anti-SARS-CoV-2 drug screening to realize therapeutic drug mining against COVID-19. LGCNN transcends the limitations of traditional models for predicting interactions between single drug targets and shows new advantages in predicting multi-target drug-target interactions. Notably, the cross-coronavirus generalizability of the model is also implied by the analysis of targets, drugs, and mechanisms in the prediction results. In conclusion, LGCNN provides new ideas and methods for rapid drug screening in emergency situations where data are scarce.
Affiliation(s)
- Ying Wang
  - Department of Pharmacogenomics, College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, 150081, China
- Yangguang Su
  - Department of Pharmacogenomics, College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, 150081, China
- Kairui Zhao
  - Department of Pharmacogenomics, College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, 150081, China
- Diwei Huo
  - The Fourth Hospital of Harbin Medical University, No.37 Yiyuan Street, Harbin, Heilongjiang, 150001, China
- Zhenshun Du
  - Department of Pharmacogenomics, College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, 150081, China
- Zhiju Wang
  - Department of Pharmacogenomics, College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, 150081, China
- Hongbo Xie
  - Department of Pharmacogenomics, College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, 150081, China
- Lei Liu
  - Department of Pharmacogenomics, College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, 150081, China
- Qing Jin
  - Department of Pharmacogenomics, College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, 150081, China
- Xuekun Ren
  - College of Mathematics of Harbin Institute of Technology, No.92 Xidazhi Street, Harbin, Heilongjiang, 150001, China
- Xiujie Chen
  - Department of Pharmacogenomics, College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, 150081, China
- Denan Zhang
  - Department of Pharmacogenomics, College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, 150081, China
7
Wan Z, Jiang N, Su M, Zhang X, Cao Y, Wu A, Zhang P, Jiang T. Multiscale fusion network drives the repurposing of anticancer drugs. Clin Transl Med 2024; 14:e1745. [PMID: 38924682 PMCID: PMC11199060 DOI: 10.1002/ctm2.1745] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/05/2024] [Revised: 06/06/2024] [Accepted: 06/09/2024] [Indexed: 06/28/2024]
Affiliation(s)
- Zhaoman Wan
  - State Key Laboratory of Common Mechanism Research for Major Diseases, Suzhou Institute of Systems Medicine, Chinese Academy of Medical Sciences & Peking Union Medical College, Suzhou, China
- Nan Jiang
  - 4+4 Medical Doctor Program, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, China
- Mingming Su
  - Beijing Cloudna Technology Co., Ltd., Beijing, China
- Xinlei Zhang
  - Beijing Cloudna Technology Co., Ltd., Beijing, China
- Yang Cao
  - School of Biological Sciences, Sichuan University, Chengdu, China
- Aiping Wu
  - State Key Laboratory of Common Mechanism Research for Major Diseases, Suzhou Institute of Systems Medicine, Chinese Academy of Medical Sciences & Peking Union Medical College, Suzhou, China
- Peng Zhang
  - Beijing Key Laboratory for Genetics of Birth Defects, Beijing Pediatric Research Institute, MOE Key Laboratory of Major Diseases in Children; Rare Disease Center, Beijing Children’s Hospital, Capital Medical University, National Center for Children's Health, Beijing, China
8
Akgüller Ö, Balcı MA, Cioca G. Network Models of BACE-1 Inhibitors: Exploring Structural and Biochemical Relationships. Int J Mol Sci 2024; 25:6890. [PMID: 38999999 PMCID: PMC11240958 DOI: 10.3390/ijms25136890] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/18/2024] [Revised: 06/14/2024] [Accepted: 06/21/2024] [Indexed: 07/14/2024]
Abstract
This study investigates the clustering patterns of human β-secretase 1 (BACE-1) inhibitors using complex network methodologies based on various distance functions, including Euclidean, Tanimoto, Hamming, and Levenshtein distances. Molecular descriptor vectors such as molecular mass, Merck Molecular Force Field (MMFF) energy, Crippen partition coefficient (ClogP), Crippen molar refractivity (MR), eccentricity, Kappa indices, Synthetic Accessibility Score, Topological Polar Surface Area (TPSA), and 2D/3D autocorrelation entropies are employed to capture the diverse properties of these inhibitors. The Euclidean distance network demonstrates the most reliable clustering results, with strong agreement metrics and minimal information loss, indicating its robustness in capturing essential structural and physicochemical properties. Tanimoto and Hamming distance networks yield valuable clustering outcomes, albeit with moderate performance, while the Levenshtein distance network shows significant discrepancies. The analysis of eigenvector centrality across different networks identifies key inhibitors acting as hubs, which are likely critical in biochemical pathways. Community detection results highlight distinct clustering patterns, with well-defined communities providing insights into the functional and structural groupings of BACE-1 inhibitors. The study also conducts non-parametric tests, revealing significant differences in molecular descriptors, validating the clustering methodology. Despite its limitations, including reliance on specific descriptors and computational complexity, this study offers a comprehensive framework for understanding molecular interactions and guiding therapeutic interventions. Future research could integrate additional descriptors, advanced machine learning techniques, and dynamic network analysis to enhance clustering accuracy and applicability.
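The four distance functions named in this abstract operate on different representations, which the following sketch makes concrete: Euclidean distance on numeric descriptor vectors, Tanimoto distance on sets of fingerprint bits, Hamming distance on equal-length bit strings, and Levenshtein distance on strings such as SMILES. The function names and the example inputs are illustrative assumptions, not the study's code or data.

```python
import math

def euclidean(a, b):
    """Straight-line distance between two numeric descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def tanimoto_distance(bits_a, bits_b):
    """1 minus the Tanimoto (Jaccard) similarity of two sets of on-bits."""
    a, b = set(bits_a), set(bits_b)
    union = len(a | b)
    return 0.0 if union == 0 else 1.0 - len(a & b) / union

def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs equal-length inputs")
    return sum(x != y for x, y in zip(a, b))

def levenshtein(s, t):
    """Minimum number of single-character edits turning s into t."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, start=1):
        curr = [i]
        for j, ct in enumerate(t, start=1):
            cost = 0 if cs == ct else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]

d_euclid = euclidean([0.0, 0.0], [3.0, 4.0])
d_tanimoto = tanimoto_distance({1, 2, 3}, {2, 3, 4})
d_hamming = hamming("1011", "1001")
d_edit = levenshtein("kitten", "sitting")
```

Thresholding any of these pairwise distances yields a network over the molecules, which is the kind of object that community detection and centrality measures are then applied to.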
Affiliation(s)
- Ömer Akgüller
  - Department of Mathematics, Faculty of Science, Mugla Sitki Kocman University, 48000 Mugla, Turkey
- Mehmet Ali Balcı
  - Department of Mathematics, Faculty of Science, Mugla Sitki Kocman University, 48000 Mugla, Turkey
- Gabriela Cioca
  - Preclinical Department, Faculty of Medicine, Lucian Blaga University of Sibiu, 550024 Sibiu, Romania
9
Croy A. From Local Atomic Environments to Molecular Information Entropy. ACS Omega 2024; 9:20616-20622. [PMID: 38737089 PMCID: PMC11080039 DOI: 10.1021/acsomega.4c02770] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/22/2024] [Revised: 04/01/2024] [Accepted: 04/05/2024] [Indexed: 05/14/2024]
Abstract
The similarity of local atomic environments is an important concept in many machine learning techniques, which find applications in computational chemistry and materials science. Here, we present and discuss a connection between the information entropy and the similarity matrix of a molecule. The resulting entropy can be used as a measure of the complexity of a molecule. As examples, we introduce and evaluate two specific choices for defining the similarity: one is based on a SMILES representation of local substructures, and the other is based on the SOAP kernel. By tuning the sensitivity of the latter, we can achieve good agreement between the respective entropies. Finally, we consider the entropy of two molecules in a mixture. The gain of entropy due to the mixing can be used as a similarity measure of the molecules. We compare this measure to the average and best-match kernel. The results indicate a connection between the different approaches and demonstrate the usefulness and broad applicability of the similarity-based entropy approach.
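The abstract does not state the formula linking the similarity matrix to the entropy. As a hedged illustration only, the sketch below uses one well-known similarity-sensitive entropy, H = -Σ_i p_i log(Σ_j S_ij p_j), with uniform weights over the atoms; whether the paper uses this exact form is an assumption, and the function name and toy matrices are hypothetical.

```python
import math

def similarity_entropy(sim, probs=None):
    """Similarity-sensitive entropy of a set of local environments.

    sim is an n x n similarity matrix with entries in [0, 1] and ones on
    the diagonal; probs defaults to a uniform distribution over the n atoms.
    """
    n = len(sim)
    if probs is None:
        probs = [1.0 / n] * n
    entropy = 0.0
    for i in range(n):
        if probs[i] == 0.0:
            continue
        # Expected similarity of environment i to a randomly drawn environment.
        expected_sim = sum(sim[i][j] * probs[j] for j in range(n))
        entropy -= probs[i] * math.log(expected_sim)
    return entropy

# Three completely distinct environments: entropy equals log(3).
all_distinct = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
# Three identical environments: entropy is zero.
all_same = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]

h_distinct = similarity_entropy(all_distinct)
h_same = similarity_entropy(all_same)
```

Under this form, a molecule whose atoms all share one environment has zero entropy, and the value grows as the environments become more dissimilar, which matches the abstract's use of entropy as a complexity measure.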
Affiliation(s)
- Alexander Croy
  - Institute of Physical Chemistry, Friedrich Schiller University Jena, 07737 Jena, Germany
10
Zhang B, Xi Y, Huang Y, Zhang Y, Guo F, Yang H. Integration of single-nucleus RNA sequencing and network disturbance to elucidate crosstalk between multicomponent drugs and trigeminal ganglia cells in migraine. Journal of Ethnopharmacology 2024; 319:117286. [PMID: 37838292 DOI: 10.1016/j.jep.2023.117286] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/02/2023] [Revised: 10/02/2023] [Accepted: 10/04/2023] [Indexed: 10/16/2023]
Abstract
ETHNOPHARMACOLOGICAL RELEVANCE: Migraine is caused by hyperactivity of the trigeminovascular system, in which the trigeminal ganglia (TG) play an important role. The TG are composed of multiple neuronal and non-neuronal cell types, which are related to the "neuro-inflammation-vascular" disorder in migraine. Tou Tong Ning capsule (TTNC), a CFDA-approved traditional Chinese medicine for treating migraine, is characterized by "multicomponents, multitargets, multipathways".
AIM OF THE STUDY: To clarify the mechanism of TTNC and elucidate the crosstalk between multicomponent drugs and neuronal and non-neuronal functions and cells in migraine.
MATERIALS AND METHODS: We integrated single-nucleus RNA sequencing with a quantitative algorithm for evaluating the disturbance of multitarget drugs on the disease network, and explored the specific pathology of migraine and the corresponding compounds. A cerebrovascular smooth muscle spasmolytic activity experiment was carried out to verify the results of the bioinformatics analysis.
RESULTS: TTNC exhibited regulatory activity in both neuronal and non-neuronal aspects, based on drug attack on four subnetworks and cell-specific networks, which characterized the mechanism of action (MoA) of TTNC from comprehensive and refined perspectives. Compared with neuronal regulation, TTNC showed a more significant attack score on non-neuronal biological functions (smooth muscle and vessel). TTNC compound clusters C1, C6 and C7, targeting non-neuronal functions and cells, had a larger group area than C10, C4 and C6 for neuronal functions and cells, which implies that TTNC may mainly regulate non-neuronal functions, e.g., vascular smooth muscle contraction. Ex vivo contraction of mouse cerebrovascular smooth muscle confirmed the vasodilation activity of TTNC and of active compounds from C1, C6 and C9 (emodin, luteolin and levistilide A). Literature mining confirmed the vasospasmodolytic activity and neuroprotective effect of TTNC.
CONCLUSIONS: The study found that TTNC may alleviate migraine primarily by relaxing cerebral vascular smooth muscle cell spasm, thereby relieving non-neuronal functional disorders. Integrating single-nucleus RNA sequencing data and network disturbance tools provides a new strategy for studying the pharmacological mechanism of multicomponent drugs through cell subtyping.
Affiliation(s)
- Bo Zhang
  - Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing, China
- Yujie Xi
  - Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing, China
  - Beijing Key Laboratory of Traditional Chinese Medicine Basic Research on Prevention and Treatment for Major Diseases, Experimental Research Center, China Academy of Chinese Medical Sciences, Beijing, China
- Ying Huang
  - Beijing Key Laboratory of Traditional Chinese Medicine Basic Research on Prevention and Treatment for Major Diseases, Experimental Research Center, China Academy of Chinese Medical Sciences, Beijing, China
- Yi Zhang
  - Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing, China
- Feifei Guo
  - Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing, China
- Hongjun Yang
  - Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing, China
  - Beijing Key Laboratory of Traditional Chinese Medicine Basic Research on Prevention and Treatment for Major Diseases, Experimental Research Center, China Academy of Chinese Medical Sciences, Beijing, China
  - China Academy of Chinese Medical Sciences, Beijing, China
11
Wang Y, Xia Y, Yan J, Yuan Y, Shen HB, Pan X. ZeroBind: a protein-specific zero-shot predictor with subgraph matching for drug-target interactions. Nat Commun 2023; 14:7861. [PMID: 38030641 PMCID: PMC10687269 DOI: 10.1038/s41467-023-43597-1] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 04/15/2023] [Accepted: 11/13/2023] [Indexed: 12/01/2023]
Abstract
Existing drug-target interaction (DTI) prediction methods generally fail to generalize well to novel (unseen) proteins and drugs. In this study, we propose a protein-specific meta-learning framework ZeroBind with subgraph matching for predicting protein-drug interactions from their structures. During meta-training, ZeroBind treats the training of each protein-specific model as a separate learning task, and each task uses graph neural networks (GNNs) to learn the protein graph embedding and the molecular graph embedding. Inspired by the fact that molecules bind to a binding pocket in proteins instead of the whole protein, ZeroBind introduces a weakly supervised subgraph information bottleneck (SIB) module to recognize the maximally informative and compressive subgraphs in protein graphs as potential binding pockets. In addition, ZeroBind trains the models of individual proteins as multiple tasks, whose importance is automatically learned with a task-adaptive self-attention module to make final predictions. The results show that ZeroBind achieves superior performance on DTI prediction over existing methods, especially for unseen proteins and drugs, and performs well after fine-tuning for proteins or drugs with only a few known binding partners.
Affiliation(s)
- Yuxuan Wang
  - Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai, 200240, China
- Ying Xia
  - Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai, 200240, China
- Junchi Yan
  - Department of Computer Science and Engineering, and MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University, Shanghai, 200240, China
- Ye Yuan
  - Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai, 200240, China
- Hong-Bin Shen
  - Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai, 200240, China
- Xiaoyong Pan
  - Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai, 200240, China
12
Wang R, Feng H, Wei GW. ChatGPT in Drug Discovery: A Case Study on Anticocaine Addiction Drug Development with Chatbots. J Chem Inf Model 2023; 63:7189-7209. [PMID: 37956228 PMCID: PMC11021135 DOI: 10.1021/acs.jcim.3c01429] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 11/15/2023]
Abstract
The birth of ChatGPT, a cutting-edge language model-based chatbot developed by OpenAI, ushered in a new era in AI. However, due to potential pitfalls, its role in rigorous scientific research is not clear yet. This paper vividly showcases its innovative application within the field of drug discovery. Focused specifically on developing anticocaine addiction drugs, the study employs GPT-4 as a virtual guide, offering strategic and methodological insights to researchers working on generative models for drug candidates. The primary objective is to generate optimal drug-like molecules with desired properties. By leveraging the capabilities of ChatGPT, the study introduces a novel approach to the drug discovery process. This symbiotic partnership between AI and researchers transforms how drug development is approached. Chatbots become facilitators, steering researchers toward innovative methodologies and productive paths for creating effective drug candidates. This research sheds light on the collaborative synergy between human expertise and AI assistance, wherein ChatGPT's cognitive abilities enhance the design and development of pharmaceutical solutions. This paper not only explores the integration of advanced AI in drug discovery but also reimagines the landscape by advocating for AI-powered chatbots as trailblazers in revolutionizing therapeutic innovation.
Affiliation(s)
- Rui Wang
- Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
- Hongsong Feng
- Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
- Guo-Wei Wei
- Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan 48824, United States
- Department of Electrical and Computer Engineering, Michigan State University, East Lansing, Michigan 48824, United States
13
Shtar G, Solomon A, Mazuz E, Rokach L, Shapira B. A simplified similarity-based approach for drug-drug interaction prediction. PLoS One 2023; 18:e0293629. [PMID: 37943768 PMCID: PMC10635435 DOI: 10.1371/journal.pone.0293629] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/04/2023] [Accepted: 10/17/2023] [Indexed: 11/12/2023] Open
Abstract
Drug-drug interactions (DDIs) are a critical component of drug safety surveillance. Laboratory studies aimed at detecting DDIs are typically difficult, expensive, and time-consuming; therefore, developing in-silico methods is critical. Machine learning-based approaches for DDI prediction have been developed; however, in many cases, their ability to achieve high accuracy relies on data only available towards the end of the molecule lifecycle. Here, we propose a simple yet effective similarity-based method for preclinical DDI prediction where only the chemical structure is available. We test the model on new, unseen drugs. To focus on the preclinical problem setting, we conducted a retrospective analysis and tested the models on drugs that were added to a later version of the DrugBank database. We extend an existing method, adjacency matrix factorization with propagation (AMFP), to support unseen molecules by applying a new lookup mechanism to the drugs' chemical structure, lookup adjacency matrix factorization with propagation (LAMFP). We show that using an ensemble of different similarity measures improves the results. We also demonstrate that Chemprop, a message-passing neural network, can be used for DDI prediction. In computational experiments, LAMFP results in high accuracy, with an area under the receiver operating characteristic curve of 0.82 for interactions involving a new drug and an existing drug and for interactions involving only existing drugs. Moreover, LAMFP outperforms state-of-the-art, complex graph neural network DDI prediction methods.
Affiliation(s)
- Guy Shtar
- Department of Software and Information Systems Engineering, Ben-Gurion University of the Negev, Beer-Sheva, Israel
- Department of Information Systems, University of Haifa, Haifa, Israel
- Adir Solomon
- Department of Information Systems, University of Haifa, Haifa, Israel
- Eyal Mazuz
- Department of Software and Information Systems Engineering, Ben-Gurion University of the Negev, Beer-Sheva, Israel
- Lior Rokach
- Department of Software and Information Systems Engineering, Ben-Gurion University of the Negev, Beer-Sheva, Israel
- Bracha Shapira
- Department of Software and Information Systems Engineering, Ben-Gurion University of the Negev, Beer-Sheva, Israel
14
Muniyappan S, Rayan AXA, Varrieth GT. EGeRepDR: An enhanced genetic-based representation learning for drug repurposing using multiple biomedical sources. J Biomed Inform 2023; 147:104528. [PMID: 37858852 DOI: 10.1016/j.jbi.2023.104528] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/13/2023] [Revised: 09/11/2023] [Accepted: 10/16/2023] [Indexed: 10/21/2023]
Abstract
MOTIVATION Drug repurposing (DR) is an imminent approach for identifying novel therapeutic indications for the available drugs and discovering novel drugs for previously untreatable diseases. Nowadays, DR has attracted major attention in the pharmaceutical industry due to the high cost and time of launching new drugs to the market through traditional drug development. The DR task depends largely on genetic information since the drugs revert the modified Gene Expression (GE) of diseases to normal. Many of the existing studies have not considered the genetic importance of predicting the potential candidates. METHOD We proposed a novel multimodal framework that utilizes genetic aspects of drugs and diseases such as genes, pathways, gene signatures, or expression to enhance the performance of DR using various data sources. Firstly, the heterogeneous biological network (HBN) is constructed with three types of nodes (drug, disease, and gene) and four types of edges: similarities (drug, gene, and disease), drug-gene, gene-disease, and drug-disease. Next, a modified graph auto-encoder (GAE*) model is applied to learn the representation of drug and disease nodes using the topological structure and edge information. Secondly, the HBN is enhanced with the information extracted from biomedical literature and ontology using a novel semi-supervised pattern embedding-based bootstrapping model and novel DR perspective representation learning respectively to improve the prediction performance. Finally, our proposed system uses a neural network model to generate the probability score of drug-disease pairs. RESULTS We demonstrate the efficiency of the proposed model on various datasets and achieved outstanding performance in 5-fold cross-validation (AUC = 0.99, AUPR = 0.98). Further, we validated the top-ranked potential candidates using pathway analysis and proved that the known and predicted candidates share common genes in the pathways.
Affiliation(s)
- Saranya Muniyappan
- Computer Science and Engineering, CEG Campus, Anna University, Chennai, Tamil Nadu, India.
15
Ong WJG, Kirubakaran P, Karanicolas J. Poor Generalization by Current Deep Learning Models for Predicting Binding Affinities of Kinase Inhibitors. bioRxiv 2023:2023.09.04.556234. [PMID: 37732243 PMCID: PMC10508770 DOI: 10.1101/2023.09.04.556234] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/22/2023]
Abstract
The extreme surge of interest over the past decade surrounding the use of neural networks has inspired many groups to deploy them for predicting binding affinities of drug-like molecules to their receptors. A model that can accurately make such predictions has the potential to screen large chemical libraries and help streamline the drug discovery process. However, despite reports of models that accurately predict quantitative inhibition using protein kinase sequences and inhibitors' SMILES strings, it is still unclear whether these models can generalize to previously unseen data. Here, we build a Convolutional Neural Network (CNN) analogous to those previously reported and evaluate the model over four datasets commonly used for inhibitor/kinase predictions. We find that the model performs comparably to those previously reported, provided that the individual data points are randomly split between the training set and the test set. However, model performance is dramatically deteriorated when all data for a given inhibitor is placed together in the same training/testing fold, implying that information leakage underlies the models' performance. Through comparison to simple models in which the SMILES strings are tokenized, or in which test set predictions are simply copied from the closest training set data points, we demonstrate that there is essentially no generalization whatsoever in this model. In other words, the model has not learned anything about molecular interactions, and does not provide any benefit over much simpler and more transparent models. These observations strongly point to the need for richer structure-based encodings, to obtain useful prospective predictions of not-yet-synthesized candidate inhibitors.
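The difference between the two splitting schemes described in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the record layout `(inhibitor, kinase, affinity)`, the toy data, and the function names `random_split`, `grouped_split`, and `shared_inhibitors` are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def random_split(records, test_fraction=0.2, seed=0):
    """Split individual records at random (the leakage-prone scheme)."""
    rng = random.Random(seed)
    shuffled = records[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def grouped_split(records, test_fraction=0.2, seed=0):
    """Split so that all records of a given inhibitor land in the same fold."""
    by_inhibitor = defaultdict(list)
    for rec in records:
        by_inhibitor[rec[0]].append(rec)
    inhibitors = sorted(by_inhibitor)
    rng = random.Random(seed)
    rng.shuffle(inhibitors)
    n_test = max(1, int(len(inhibitors) * test_fraction))
    test_ids = set(inhibitors[:n_test])
    train = [r for r in records if r[0] not in test_ids]
    test = [r for r in records if r[0] in test_ids]
    return train, test


def shared_inhibitors(train, test):
    """Inhibitors that appear on both sides of a split."""
    return {r[0] for r in train} & {r[0] for r in test}


# Hypothetical toy data: 10 inhibitors, each measured against 5 kinases.
records = [
    (f"CPD{i}", f"KIN{j}", 5.0 + 0.1 * i + 0.01 * j)
    for i in range(10)
    for j in range(5)
]

train_r, test_r = random_split(records)
train_g, test_g = grouped_split(records)

print("shared inhibitors, random split: ", len(shared_inhibitors(train_r, test_r)))
print("shared inhibitors, grouped split:", len(shared_inhibitors(train_g, test_g)))
```

Under the grouped scheme no inhibitor in the test fold has been seen during training, which is the condition under which the abstract reports the sharp drop in performance.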
Affiliation(s)
- Wern Juin Gabriel Ong
- Cancer Signaling & Microenvironment Program, Fox Chase Cancer Center, Philadelphia, PA 19111
- Bowdoin College, Brunswick, ME 04011
- Palani Kirubakaran
- Cancer Signaling & Microenvironment Program, Fox Chase Cancer Center, Philadelphia, PA 19111
- John Karanicolas
- Cancer Signaling & Microenvironment Program, Fox Chase Cancer Center, Philadelphia, PA 19111
16
Yao K, Wang X, Li W, Zhu H, Jiang Y, Li Y, Tian T, Yang Z, Liu Q, Liu Q. Semi-supervised heterogeneous graph contrastive learning for drug-target interaction prediction. Comput Biol Med 2023; 163:107199. [PMID: 37421738 DOI: 10.1016/j.compbiomed.2023.107199] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/02/2023] [Revised: 04/15/2023] [Accepted: 06/19/2023] [Indexed: 07/10/2023]
Abstract
Identification of drug-target interactions (DTIs) is an important step in drug discovery and drug repositioning. In recent years, graph-based methods have attracted great attention and show advantages on predicting potential DTIs. However, these methods face the problem that the known DTIs are very limited and expensive to obtain, which decreases the generalization ability of the methods. Self-supervised contrastive learning is independent of labeled DTIs, which can mitigate the impact of the problem. Therefore, we propose a framework SHGCL-DTI for predicting DTIs, which supplements the classical semi-supervised DTI prediction task with an auxiliary graph contrastive learning module. Specifically, we generate representations for the nodes through the neighbor view and meta-path view, and define positive and negative pairs to maximize the similarity between positive pairs from different views. Subsequently, SHGCL-DTI reconstructs the original heterogeneous network to predict the potential DTIs. The experiments on the public dataset show that SHGCL-DTI has significant improvement in different scenarios, compared with existing state-of-the-art methods. We also demonstrate that the contrastive learning module improves the prediction performance and generalization ability of SHGCL-DTI through ablation study. In addition, we have found several novel predicted DTIs supported by the biological literature. The data and source code are available at: https://github.com/TOJSSE-iData/SHGCL-DTI.
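The idea of maximizing similarity between positive pairs from two views can be sketched with a generic InfoNCE-style loss. This is an illustrative assumption, not the SHGCL-DTI implementation (see the repository linked above for that): here the positive of node i in one view is simply the same node i in the other view, all other nodes are negatives, and the function names, toy embeddings, and temperature value are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cross_view_contrastive_loss(view_a, view_b, temperature=0.5):
    """InfoNCE-style loss averaged over nodes.

    Assumption: node i in view_a and node i in view_b form the positive
    pair; every other node in view_b acts as a negative.
    """
    n = len(view_a)
    total = 0.0
    for i in range(n):
        logits = [cosine(view_a[i], view_b[j]) / temperature for j in range(n)]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
        total += log_denominator - logits[i]
    return total / n


# Hypothetical 2-D embeddings of three nodes under two views.
neighbor_view = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
metapath_view = [[0.9, 0.1], [0.1, 0.9], [-0.9, -0.1]]
# Same embeddings with node identities rotated, so positives no longer match.
misaligned_view = [metapath_view[1], metapath_view[2], metapath_view[0]]

loss_aligned = cross_view_contrastive_loss(neighbor_view, metapath_view)
loss_misaligned = cross_view_contrastive_loss(neighbor_view, misaligned_view)
print(loss_aligned, loss_misaligned)
```

The loss is lower when the two views agree on each node's embedding, which is the signal a contrastive module of this kind adds on top of the supervised DTI objective.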
Affiliation(s)
- Kainan Yao
- School of Software Engineering, Tongji University, 4800 Caoan Road, Jiading District, Shanghai, 201804, China
- Xiaowen Wang
- School of Software Engineering, Tongji University, 4800 Caoan Road, Jiading District, Shanghai, 201804, China
- Wannian Li
- Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department of Tongji Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, 1239 Siping Road, Yangpu District, Shanghai, 200092, China.
- Hongming Zhu
- School of Software Engineering, Tongji University, 4800 Caoan Road, Jiading District, Shanghai, 201804, China
- Yizhi Jiang
- School of Software Engineering, Tongji University, 4800 Caoan Road, Jiading District, Shanghai, 201804, China
- Yulong Li
- School of Software Engineering, Tongji University, 4800 Caoan Road, Jiading District, Shanghai, 201804, China
- Tongxuan Tian
- School of Software Engineering, Tongji University, 4800 Caoan Road, Jiading District, Shanghai, 201804, China
- Zhaoyi Yang
- The First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, No. 96, JinZhai Road Baohe District, Hefei, 230001, Anhui, China.
- Qi Liu
- Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department of Tongji Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, 1239 Siping Road, Yangpu District, Shanghai, 200092, China.
- Qin Liu
- School of Software Engineering, Tongji University, 4800 Caoan Road, Jiading District, Shanghai, 201804, China.
17
Suviriyapaisal N, Wichadakul D. iEdgeDTA: integrated edge information and 1D graph convolutional neural networks for binding affinity prediction. RSC Adv 2023; 13:25218-25228. [PMID: 37636509 PMCID: PMC10448119 DOI: 10.1039/d3ra03796g] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/06/2023] [Accepted: 08/14/2023] [Indexed: 08/29/2023] Open
Abstract
Artificial intelligence has become more prevalent in broad fields, including drug discovery, in which the process is costly and time-consuming when conducted through wet experiments. As a result, drug repurposing, which tries to utilize approved and low-risk drugs for a new purpose, becomes more attractive. However, screening candidates from many drugs for specific protein targets is still expensive and tedious. This study aims to leverage computational resources to aid drug discovery by utilizing drug-protein interaction data and estimating their interaction strength, so-called binding affinity. Our estimation approach addresses multiple challenges encountered in the field. First, we employed a graph-based deep learning technique to overcome the limitations of drug compounds represented in string format by incorporating background knowledge of node and edge information as separate multi-dimensional features. Second, we tackled the complexities associated with extracting the representation and structure of proteins by utilizing a pre-trained model for feature extraction. Also, we employed graph operations over the 1D representation of a protein sequence to overcome the fixed-length problem typically encountered in language model tasks. In addition, we conducted a comparative analysis with a baseline model that creates a protein graph from a contact map prediction model, giving valuable insights into the performance and effectiveness of our proposed method. We evaluated the performance of our model using the same benchmark datasets and a variety of metrics as in previous work, and the results show that our model achieved the best prediction results while requiring no contact map information compared to other graph-based methods.
Affiliation(s)
- Natchanon Suviriyapaisal
- Department of Computer Engineering, Faculty of Engineering, Chulalongkorn University Bangkok 10330 Thailand
- Duangdao Wichadakul
- Department of Computer Engineering, Faculty of Engineering, Chulalongkorn University Bangkok 10330 Thailand
- Center of Excellence in Systems Biology, Faculty of Medicine, Chulalongkorn University Bangkok 10330 Thailand
18
Xu H, Zhang B, Liu Q. Deep learning-based classification model for GPR151 activator activity prediction. BMC Bioinformatics 2023; 24:245. [PMID: 37296398 DOI: 10.1186/s12859-023-05369-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/17/2023] [Accepted: 05/29/2023] [Indexed: 06/12/2023] Open
Abstract
BACKGROUND GPR151 is a protein belonging to the G protein-coupled receptor family that is closely associated with a variety of physiological and pathological processes. The potential use of GPR151 as a therapeutic target for the management of metabolic disorders has been demonstrated in several studies, highlighting the demand to explore its activators further. Activity prediction serves as a vital preliminary step in drug discovery, which is both costly and time-consuming. Thus, the development of a reliable activity classification model has become essential in the process of drug discovery, aiming to enhance the efficiency of virtual screening. RESULTS We propose a learning-based method based on a feature extractor and a deep neural network to predict the activity of GPR151 activators. We first introduce a new molecular feature extraction algorithm which utilizes the idea of the bag-of-words model in natural language processing to densify the sparse fingerprint vector. The Mol2vec method is also used to extract diverse features. Then, we construct three classical feature selection algorithms and three types of deep learning models to enhance the representational capacity of molecules and predict the activity label with five different classifiers. We conduct experiments using our own dataset of GPR151 activators. The results demonstrate high classification accuracy and stability, with the optimal model Mol2vec-CNN significantly improving performance across multiple classifiers. The SVM classifier achieves the best accuracy of 0.92 and F1 score of 0.76, which indicates promising applications for our method in the field of activity prediction. CONCLUSION The results suggest that the experimental design of this study is appropriate and well-conceived. The deep learning-based feature extraction algorithm established in this study outperforms traditional feature selection algorithms for activity prediction. The model developed can be effectively utilized in the pre-screening stage of drug virtual screening.
Affiliation(s)
- Huangchao Xu
- Computer Network Information Center, Chinese Academy of Sciences, Dongsheng South Street No.2, Haidian District, Beijing, 100190, China
- University of Chinese Academy of Sciences, No.1 Yanqihu East Rd, Huairou District, Beijing, 101408, China
- Baohua Zhang
- Computer Network Information Center, Chinese Academy of Sciences, Dongsheng South Street No.2, Haidian District, Beijing, 100190, China
- Qian Liu
- Computer Network Information Center, Chinese Academy of Sciences, Dongsheng South Street No.2, Haidian District, Beijing, 100190, China.
19
Chatterjee A, Walters R, Shafi Z, Ahmed OS, Sebek M, Gysi D, Yu R, Eliassi-Rad T, Barabási AL, Menichetti G. Improving the generalizability of protein-ligand binding predictions with AI-Bind. Nat Commun 2023; 14:1989. [PMID: 37031187 PMCID: PMC10082765 DOI: 10.1038/s41467-023-37572-z] [Citation(s) in RCA: 39] [Impact Index Per Article: 19.5] [Received: 05/16/2022] [Accepted: 03/23/2023] [Indexed: 04/10/2023] Open
Abstract
Identifying novel drug-target interactions is a critical and rate-limiting step in drug discovery. While deep learning models have been proposed to accelerate the identification process, here we show that state-of-the-art models fail to generalize to novel (i.e., never-before-seen) structures. We unveil the mechanisms responsible for this shortcoming, demonstrating how models rely on shortcuts that leverage the topology of the protein-ligand bipartite network, rather than learning the node features. Here we introduce AI-Bind, a pipeline that combines network-based sampling strategies with unsupervised pre-training to improve binding predictions for novel proteins and ligands. We validate AI-Bind predictions via docking simulations and comparison with recent experimental evidence, and step up the process of interpreting machine learning prediction of protein-ligand binding by identifying potential active binding sites on the amino acid sequence. AI-Bind is a high-throughput approach to identify drug-target combinations with the potential of becoming a powerful tool in drug discovery.
Affiliation(s)
- Ayan Chatterjee
- Network Science Institute, Northeastern University, Boston, MA, USA
- Robin Walters
- Khoury College of Computer Sciences, Northeastern University, Boston, MA, USA
- Zohair Shafi
- Khoury College of Computer Sciences, Northeastern University, Boston, MA, USA
- Omair Shafi Ahmed
- Khoury College of Computer Sciences, Northeastern University, Boston, MA, USA
- Michael Sebek
- Network Science Institute, Northeastern University, Boston, MA, USA
- Department of Physics, Northeastern University, Boston, MA, USA
- Deisy Gysi
- Network Science Institute, Northeastern University, Boston, MA, USA
- Department of Physics, Northeastern University, Boston, MA, USA
- Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Rose Yu
- Department of Computer Science and Engineering, University of California, San Diego, CA, USA
- Tina Eliassi-Rad
- Network Science Institute, Northeastern University, Boston, MA, USA
- Khoury College of Computer Sciences, Northeastern University, Boston, MA, USA
- Santa Fe Institute, Santa Fe, NM, USA
- The Institute for Experiential AI, Northeastern University, Boston, MA, USA
- Albert-László Barabási
- Network Science Institute, Northeastern University, Boston, MA, USA
- Department of Physics, Northeastern University, Boston, MA, USA
- Department of Network and Data Science, Central European University, Budapest, Hungary
- Giulia Menichetti
- Network Science Institute, Northeastern University, Boston, MA, USA.
- Department of Physics, Northeastern University, Boston, MA, USA.
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA.
20
Guo B, Zheng H, Jiang H, Li X, Guan N, Zuo Y, Zhang Y, Yang H, Wang X. Enhanced compound-protein binding affinity prediction by representing protein multimodal information via a coevolutionary strategy. Brief Bioinform 2023; 24:6995409. [PMID: 36682005 DOI: 10.1093/bib/bbac628] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 09/15/2022] [Revised: 12/12/2022] [Accepted: 12/25/2022] [Indexed: 01/23/2023] Open
Abstract
Due to the lack of a method to efficiently represent the multimodal information of a protein, including its structure and sequence information, predicting compound-protein binding affinity (CPA) still suffers from low accuracy when applying machine-learning methods. To overcome this limitation, in a novel end-to-end architecture (named FeatNN), we develop a coevolutionary strategy to jointly represent the structure and sequence features of proteins and ultimately optimize the mathematical models for predicting CPA. Furthermore, from the perspective of data-driven approach, we proposed a rational method that can utilize both high- and low-quality databases to optimize the accuracy and generalization ability of FeatNN in CPA prediction tasks. Notably, we visually interpret the feature interaction process between sequence and structure in the rationally designed architecture. As a result, FeatNN considerably outperforms the state-of-the-art (SOTA) baseline in virtual drug evaluation tasks, indicating the feasibility of this approach for practical use. FeatNN provides an outstanding method for higher CPA prediction accuracy and better generalization ability by efficiently representing multimodal information of proteins via a coevolutionary strategy.
Affiliation(s)
- Binjie Guo
- Department of Neurobiology and Department of Rehabilitation Medicine, First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang Province 310058, China
- Liangzhu Laboratory, MOE Frontier Science Center for Brain Science and Brain-machine Integration, State Key Laboratory of Brain-machine Intelligence, Zhejiang University, 1369 West Wenyi Road, Hangzhou 311121, China
- NHC and CAMS Key Laboratory of Medical Neurobiology, Zhejiang University, Hangzhou 310058, China
- Hanyu Zheng
- Department of Neurobiology and Department of Rehabilitation Medicine, First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang Province 310058, China
- Liangzhu Laboratory, MOE Frontier Science Center for Brain Science and Brain-machine Integration, State Key Laboratory of Brain-machine Intelligence, Zhejiang University, 1369 West Wenyi Road, Hangzhou 311121, China
- NHC and CAMS Key Laboratory of Medical Neurobiology, Zhejiang University, Hangzhou 310058, China
- Haohan Jiang
- Department of Neurobiology and Department of Rehabilitation Medicine, First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang Province 310058, China
- Liangzhu Laboratory, MOE Frontier Science Center for Brain Science and Brain-machine Integration, State Key Laboratory of Brain-machine Intelligence, Zhejiang University, 1369 West Wenyi Road, Hangzhou 311121, China
- NHC and CAMS Key Laboratory of Medical Neurobiology, Zhejiang University, Hangzhou 310058, China
- Xiaodan Li
- Department of Neurobiology and Department of Rehabilitation Medicine, First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang Province 310058, China
- Liangzhu Laboratory, MOE Frontier Science Center for Brain Science and Brain-machine Integration, State Key Laboratory of Brain-machine Intelligence, Zhejiang University, 1369 West Wenyi Road, Hangzhou 311121, China
- NHC and CAMS Key Laboratory of Medical Neurobiology, Zhejiang University, Hangzhou 310058, China
- Naiyu Guan
- Department of Neurobiology and Department of Rehabilitation Medicine, First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang Province 310058, China
- Liangzhu Laboratory, MOE Frontier Science Center for Brain Science and Brain-machine Integration, State Key Laboratory of Brain-machine Intelligence, Zhejiang University, 1369 West Wenyi Road, Hangzhou 311121, China
- NHC and CAMS Key Laboratory of Medical Neurobiology, Zhejiang University, Hangzhou 310058, China
- Yanming Zuo
- Liangzhu Laboratory, MOE Frontier Science Center for Brain Science and Brain-machine Integration, State Key Laboratory of Brain-machine Intelligence, Zhejiang University, 1369 West Wenyi Road, Hangzhou 311121, China
- NHC and CAMS Key Laboratory of Medical Neurobiology, Zhejiang University, Hangzhou 310058, China
- Yicheng Zhang
- Department of Neurobiology and Department of Rehabilitation Medicine, First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang Province 310058, China
- Liangzhu Laboratory, MOE Frontier Science Center for Brain Science and Brain-machine Integration, State Key Laboratory of Brain-machine Intelligence, Zhejiang University, 1369 West Wenyi Road, Hangzhou 311121, China
- NHC and CAMS Key Laboratory of Medical Neurobiology, Zhejiang University, Hangzhou 310058, China
- Hengfu Yang
- School of Computer Science, Hunan First Normal University, Changsha, 410205 Hunan, China
- Xuhua Wang
- Department of Neurobiology and Department of Rehabilitation Medicine, First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang Province 310058, China
- Liangzhu Laboratory, MOE Frontier Science Center for Brain Science and Brain-machine Integration, State Key Laboratory of Brain-machine Intelligence, Zhejiang University, 1369 West Wenyi Road, Hangzhou 311121, China
- NHC and CAMS Key Laboratory of Medical Neurobiology, Zhejiang University, Hangzhou 310058, China
- Co-innovation Center of Neuroregeneration, Nantong University, Nantong, 226001 Jiangsu, China
21
Dou L, Zhang Z, Liu D, Qian Y, Zhang Q. BCM-DTI: A fragment-oriented method for drug-target interaction prediction using deep learning. Comput Biol Chem 2023; 104:107844. [PMID: 36924586 DOI: 10.1016/j.compbiolchem.2023.107844] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 09/29/2022] [Revised: 01/30/2023] [Accepted: 02/23/2023] [Indexed: 03/08/2023]
Abstract
The identification of drug-target interactions (DTIs) is significant in drug discovery and development, but it is usually costly in time and money because of the large molecule and protein space. The application of deep learning in predicting DTI pairs can overcome these limitations through feature engineering. However, most works extract features from the whole drug and target and ignore the pharmacological principle that an interaction is closely related to particular substructures of the molecule and protein, which leads to poor performance. On the other hand, some substructure-oriented studies only consider a single type of fragment, e.g., functional group. To address these issues, we propose an end-to-end prediction framework for drug-target interaction named BCM-DTI that takes diverse fragment types into account, including branch chain, common substructure and motif/fragments, and applies a feature learning module based on CNN to learn the synergistic effect between these fragments. We implement BCM-DTI on four public datasets, and the results show that BCM-DTI outperforms state-of-the-art approaches and requires lower training cost.
Affiliation(s)
- Liang Dou
- Shanghai Frontiers Science Center of Molecule Intelligent Syntheses, School of Computer Science and Technology, East China Normal University, North Zhongshan Road, Shanghai, 200062, China.
- Zhen Zhang
- Shanghai Frontiers Science Center of Molecule Intelligent Syntheses, School of Computer Science and Technology, East China Normal University, North Zhongshan Road, Shanghai, 200062, China.
- Dan Liu
- Shanghai Frontiers Science Center of Molecule Intelligent Syntheses, School of Computer Science and Technology, East China Normal University, North Zhongshan Road, Shanghai, 200062, China.
- Ying Qian
- Shanghai Frontiers Science Center of Molecule Intelligent Syntheses, School of Computer Science and Technology, East China Normal University, North Zhongshan Road, Shanghai, 200062, China.
- Qian Zhang
- Shanghai Frontiers Science Center of Molecule Intelligent Syntheses, School of Computer Science and Technology, East China Normal University, North Zhongshan Road, Shanghai, 200062, China.
22
Choi IH, Oh IS. Weighted edit distance optimized using genetic algorithm for SMILES-based compound similarity. Pattern Anal Appl 2023. [DOI: 10.1007/s10044-023-01141-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/21/2023]
23
Pan X, Yun J, Coban Akdemir ZH, Jiang X, Wu E, Huang JH, Sahni N, Yi SS. AI-DrugNet: A network-based deep learning model for drug repurposing and combination therapy in neurological disorders. Comput Struct Biotechnol J 2023; 21:1533-1542. [PMID: 36879885 PMCID: PMC9984442 DOI: 10.1016/j.csbj.2023.02.004] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 07/04/2022] [Revised: 02/03/2023] [Accepted: 02/03/2023] [Indexed: 02/10/2023] Open
Abstract
Discovering effective therapies is difficult for neurological and developmental disorders in that disease progression is often associated with a complex and interactive mechanism. Over the past few decades, few drugs have been identified for treating Alzheimer's disease (AD), especially for impacting the causes of cell death in AD. Although drug repurposing is gaining more success in developing therapeutic efficacy for complex diseases such as common cancer, the complications behind AD require further study. Here, we developed a novel prediction framework based on deep learning to identify potential repurposed drug therapies for AD, and more importantly, our framework is broadly applicable and may generalize to identifying potential drug combinations in other diseases. Our prediction framework is as follows: we first built a drug-target pair (DTP) network based on multiple drug features and target features, as well as the associations between DTP nodes where drug-target pairs are the DTP nodes and the associations between DTP nodes are represented as the edges in the AD disease network; furthermore, we incorporated the drug-target feature from the DTP network and the relationship information between drug-drug, target-target, drug-target within and outside of drug-target pairs, representing each drug-combination as a quartet to generate corresponding integrated features; finally, we developed an AI-based Drug discovery Network (AI-DrugNet), which exhibits robust predictive performance. The implementation of our network model helps identify potential repurposed and combination drug options that may serve to treat AD and other diseases.
Affiliation(s)
- Xingxin Pan
- Livestrong Cancer Institutes, Department of Oncology, Dell Medical School, The University of Texas at Austin, Austin, TX 78712, USA
| | - Jun Yun
- Oden Institute for Computational Engineering and Sciences (ICES), The University of Texas at Austin, Austin, TX 78712, USA
| | - Zeynep H. Coban Akdemir
- Human Genetics Center, Department of Epidemiology, Human Genetics, and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
| | - Xiaoqian Jiang
- School of Biomedical Informatics, University of Texas Health Science Center, Houston, TX 77030, USA
| | - Erxi Wu
- Livestrong Cancer Institutes, Department of Oncology, Dell Medical School, The University of Texas at Austin, Austin, TX 78712, USA
- Neuroscience Institute and Department of Neurosurgery, Baylor Scott & White Health, Temple, TX 76502, USA
- Department of Surgery, Texas A & M University Health Science Center, College of Medicine, Temple, TX 76508, USA
- Department of Pharmaceutical Sciences, Texas A & M University Health Science Center, College of Pharmacy, College Station, TX 77843, USA
| | - Jason H. Huang
- Neuroscience Institute and Department of Neurosurgery, Baylor Scott & White Health, Temple, TX 76502, USA
- Department of Surgery, Texas A & M University Health Science Center, College of Medicine, Temple, TX 76508, USA
| | - Nidhi Sahni
- Department of Epigenetics and Molecular Carcinogenesis, The University of Texas MD Anderson Cancer Center, Smithville, TX 78957, USA
- Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Quantitative and Computational Biosciences Program, Baylor College of Medicine, Houston, TX 77030, USA
| | - S. Stephen Yi
- Livestrong Cancer Institutes, Department of Oncology, Dell Medical School, The University of Texas at Austin, Austin, TX 78712, USA
- Oden Institute for Computational Engineering and Sciences (ICES), The University of Texas at Austin, Austin, TX 78712, USA
- Interdisciplinary Life Sciences Graduate Programs (ILSGP), College of Natural Sciences, The University of Texas at Austin, Austin, TX 78712, USA
- Department of Biomedical Engineering, Cockrell School of Engineering, The University of Texas at Austin, Austin, TX 78712, USA
| |
|
24
|
Feng J, Wu S, Yang H, Ai C, Qiao J, Xu J, Guo F. Microbe-bridged disease-metabolite associations identification by heterogeneous graph fusion. Brief Bioinform 2022; 23:6720417. [PMID: 36168719 DOI: 10.1093/bib/bbac423] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/28/2022] [Revised: 08/22/2022] [Accepted: 08/31/2022] [Indexed: 12/14/2022] Open
Abstract
MOTIVATION Metabolomics has developed rapidly in recent years, and metabolism-related databases are gradually being constructed. More and more studies are being carried out on diverse microbes, metabolites and diseases. However, the logic of the various associations among microbes, metabolites and diseases in the gut microbial system remains poorly understood. The collection and analysis of relevant microbial bioinformation play an important role in revealing microbe-metabolite-disease associations. Therefore, a dataset that integrates multiple relationships and a method based on complex heterogeneous graphs need to be developed. RESULTS In this study, we integrated several databases and extracted a variety of association data among microbes, metabolites and diseases. After obtaining the three interconnected bilateral association datasets (microbe-metabolite, metabolite-disease and disease-microbe), we built a heterogeneous graph to describe them. In our model, microbes were used as a bridge between diseases and metabolites. To fuse the information in the disease-microbe-metabolite graph, we applied a bipartite graph attention network to the disease-microbe and metabolite-microbe bipartite graphs. The experimental results show that our model performs well in predicting various disease-metabolite associations. Case studies of type 2 diabetes mellitus, Parkinson's disease, inflammatory bowel disease and liver cirrhosis indicate that the proposed methodology is valuable for mining other associations and predicting biomarkers for different human diseases. Availability and implementation: https://github.com/Selenefreeze/DiMiMe.git.
Affiliation(s)
- Jitong Feng
- College of Intelligence and Computing, Tianjin University, Tianjin, China
| | - Shengbo Wu
- School of Chemical Engineering and Technology, Tianjin University, Tianjin, China
- Zhejiang Shaoxing Research Institute of Tianjin University, Shaoxing, China
| | - Hongpeng Yang
- School of Computational Science and Engineering, University of South Carolina, Columbia, U.S
| | - Chengwei Ai
- College of Intelligence and Computing, Tianjin University, Tianjin, China
| | - Jianjun Qiao
- School of Chemical Engineering and Technology, Tianjin University, Tianjin, China
- Zhejiang Shaoxing Research Institute of Tianjin University, Shaoxing, China
| | - Junhai Xu
- College of Intelligence and Computing, Tianjin University, Tianjin, China
| | - Fei Guo
- School of Computer Science and Engineering, Central South University, Changsha, China
| |
|
25
|
Chen L, Lin D, Xu H, Li J, Lin L. WLLP: A weighted reconstruction-based linear label propagation algorithm for predicting potential therapeutic agents for COVID-19. Front Microbiol 2022; 13:1040252. [PMID: 36466666 PMCID: PMC9713947 DOI: 10.3389/fmicb.2022.1040252] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/09/2022] [Accepted: 10/06/2022] [Indexed: 11/18/2022] Open
Abstract
The global coronavirus disease 2019 (COVID-19) pandemic caused by the severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) has led to huge health and economic crises. However, the research required to develop new drugs and vaccines is very expensive in terms of labor, money, and time. Owing to recent advances in data science, drug-repositioning technologies have become one of the most promising strategies available for developing effective treatment options. Using the previously reported human drug virus database (HDVD), we proposed a model to predict possible drug regimens based on a weighted reconstruction-based linear label propagation algorithm (WLLP). We preprocessed the drug–virus association matrix with the weighted K-nearest known neighbors method and then performed label propagation on the network, based on the linear neighborhood similarity of drugs and viruses, to obtain the final prediction results. Under 10 repetitions of 10-fold cross-validation, WLLP achieved an area under the receiver operating characteristic (ROC) curve (AUC) of 0.8828 ± 0.0037 and an area under the precision-recall curve of 0.5277 ± 0.0053, outperforming the four other models used for comparison. We also predicted effective drug regimens against SARS-CoV-2, and this case study showed that WLLP can be used to suggest potential drugs for the treatment of COVID-19.
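The label propagation step described in this abstract can be illustrated with a minimal sketch. It is not the WLLP implementation: it assumes a ready-made drug similarity matrix instead of the linear neighborhood similarity used in the paper, omits the weighted K-nearest known neighbors preprocessing, and uses a hypothetical toy dataset and a hypothetical `alpha` value.

```python
def row_normalize(sim):
    """Scale each row of a similarity matrix so that it sums to 1."""
    out = []
    for row in sim:
        total = sum(row)
        out.append([v / total if total else 0.0 for v in row])
    return out


def label_propagation(sim, labels, alpha=0.5, iters=100):
    """Iterate F <- alpha * W F + (1 - alpha) * Y for a fixed number of steps.

    W is the row-normalized similarity matrix and Y holds the known labels.
    """
    w = row_normalize(sim)
    n, m = len(labels), len(labels[0])
    f = [row[:] for row in labels]
    for _ in range(iters):
        f = [
            [
                alpha * sum(w[i][k] * f[k][j] for k in range(n))
                + (1 - alpha) * labels[i][j]
                for j in range(m)
            ]
            for i in range(n)
        ]
    return f


# Toy data (hypothetical): 3 drugs, 2 viruses.
drug_sim = [
    [0.0, 0.9, 0.1],
    [0.9, 0.0, 0.1],
    [0.1, 0.1, 0.0],
]
known = [
    [1.0, 0.0],  # drug 0 is associated with virus 0
    [0.0, 0.0],  # drug 1 has no known association
    [0.0, 1.0],  # drug 2 is associated with virus 1
]
scores = label_propagation(drug_sim, known)
```

In this toy case drug 1 has no known association, but because it is most similar to drug 0 it receives a higher propagated score for virus 0 than for virus 1.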
Affiliation(s)
- Langcheng Chen
- Center of Campus Network and Modern Educational Technology, Guangdong University of Technology, Guangzhou, China
| | - Dongying Lin
- School of Computer Science, Guangdong University of Technology, Guangzhou, China
| | - Haojie Xu
- School of Computer Science, Guangdong University of Technology, Guangzhou, China
| | - Jianming Li
- School of Computer Science, Guangdong University of Technology, Guangzhou, China
| | - Lieqing Lin
- Center of Campus Network and Modern Educational Technology, Guangdong University of Technology, Guangzhou, China
- *Correspondence: Lieqing Lin
| |
|
26
|
Arulanandam CD, Hwang JS, Rathinam AJ, Dahms HU. Evaluating different web applications to assess the toxicity of plasticizers. Sci Rep 2022; 12:19684. [PMID: 36385271 PMCID: PMC9668977 DOI: 10.1038/s41598-022-18327-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/04/2021] [Accepted: 08/09/2022] [Indexed: 11/18/2022] Open
Abstract
Plasticizers increase the flexibility of plastics. As environmental leachates they lead to increased water and soil pollution, as well as to serious harm to human health. This study set out to explore various web applications for predicting the toxicological properties of plasticizers. Web-based tools (e.g., BOILED-Egg, LAZAR, PROTOX-II, CarcinoPred-EL) and VEGA were accessed via a 5th-10th generation computer in order to obtain toxicological predictions. In the LAZAR mutagenicity assessment, only bisphenol F was predicted as mutagenic. In the CarcinoPred-EL web application, BBP and DBP were predicted as carcinogenic by the RF model, and DEHP and DNOP by both the RF and XGBoost models. The bee toxicity model (KNN/IRFMN) predicted BPF, di-n-propyl phthalate, diallyl phthalate, dibutyl phthalate, and diisohexyl phthalate as strong bee toxicants. The fish acute toxicity model (Sarpy/IRFMN) predicted 19 plasticizers as strong toxicants with LC50 values of less than 1 mg/L. This study also considered plasticizer effects on gastrointestinal absorption and other toxicological endpoints.
Affiliation(s)
- Charli Deepak Arulanandam
- Department of Biomedical Science and Environmental Biology, Kaohsiung Medical University, Kaohsiung, 80708 Taiwan, ROC (grid.412019.f, 0000 0000 9476 5696)
- Department of Medicinal and Applied Chemistry, Kaohsiung Medical University, Kaohsiung, 80708 Taiwan, ROC (grid.412019.f, 0000 0000 9476 5696)
| | - Jiang-Shiou Hwang
- Institute of Marine Biology, National Taiwan Ocean University, Keelung, 20224 Taiwan, ROC (grid.260664.0, 0000 0001 0313 3026)
- Center of Excellence for Ocean Engineering, National Taiwan Ocean University, Keelung, 20224 Taiwan, ROC (grid.260664.0, 0000 0001 0313 3026)
- Center of Excellence for the Oceans, National Taiwan Ocean University, Keelung, 20224 Taiwan, ROC (grid.260664.0, 0000 0001 0313 3026)
| | - Arthur James Rathinam
- Department of Marine Science, Bharathidasan University, Tiruchirappalli, 620 024, India (grid.411678.d, 0000 0001 0941 7660)
| | - Hans-Uwe Dahms
- Department of Biomedical Science and Environmental Biology, Kaohsiung Medical University, Kaohsiung, 80708 Taiwan, ROC (grid.412019.f, 0000 0000 9476 5696)
- Research Center of Precision Environmental Medicine, Kaohsiung Medical University, Kaohsiung, 807 Taiwan (grid.412019.f, 0000 0000 9476 5696)
- Department of Marine Biotechnology and Resources, National Sun Yat-Sen University, No. 70, Lienhai Road, Kaohsiung, 80424 Taiwan, ROC (grid.412036.2, 0000 0004 0531 9758)
| |
|
27
|
Mongia A, Chouzenoux E, Majumdar A. Computational Prediction of Drug-Disease Association Based on Graph-Regularized One Bit Matrix Completion. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2022; 19:3332-3339. [PMID: 35816539 DOI: 10.1109/tcbb.2022.3189879] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/15/2023]
Abstract
Investigation of existing drugs is an effective alternative to the discovery of new drugs for treating diseases. This task of drug re-positioning can be assisted by various kinds of computational methods that predict the best indication for a drug given the open-source biological datasets. Because similar drugs tend to have common pathways and disease indications, the association matrix is assumed to have a low-rank structure. Hence, the problem of drug-disease association prediction can be modeled as a low-rank matrix completion problem. In this work, we propose a novel matrix completion framework that makes use of the side-information associated with drugs/diseases, modeled as a neighborhood graph, for the prediction of drug-disease indications: graph-regularized 1-bit matrix completion (GR1BMC). The algorithm is specially designed for binary data and uses a parallel proximal algorithm to solve the resulting minimization problem, taking into account all the constraints, including the incorporation of the neighborhood graph and the restriction of predicted scores to the specified range. The results have been validated on two standard databases by evaluating the AUC across the 10-fold cross-validation splits. The usefulness of the method is also evaluated through a case study in which the top 5 indications are predicted for novel drugs and then verified against the CTD database.
|
28
|
Tangmanussukum P, Kawichai T, Suratanee A, Plaimas K. Heterogeneous network propagation with forward similarity integration to enhance drug-target association prediction. PeerJ Comput Sci 2022; 8:e1124. [PMID: 36262151 PMCID: PMC9575853 DOI: 10.7717/peerj-cs.1124] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/17/2022] [Accepted: 09/14/2022] [Indexed: 06/16/2023]
Abstract
Identification of drug-target interaction (DTI) is a crucial step to reduce time and cost in the drug discovery and development process. Since various biological data are publicly available, DTIs have been identified computationally. To predict DTIs, most existing methods focus on a single similarity measure of drugs and target proteins, whereas some recent methods integrate a particular set of drug and target similarity measures by a single integration function. Therefore, many DTIs are still missing. In this study, we propose heterogeneous network propagation with the forward similarity integration (FSI) algorithm, which systematically selects the optimal integration of multiple similarity measures of drugs and target proteins. Seven drug-drug and nine target-target similarity measures are applied with four distinct integration methods to finally create an optimal heterogeneous network model. Consequently, the optimal model uses the target similarity based on protein sequences and the fused drug similarity, which combines the similarity measures based on chemical structures, the Jaccard scores of drug-disease associations, and the cosine scores of drug-drug interactions. With an accuracy of 99.8%, this model significantly outperforms others that utilize different similarity measures of drugs and target proteins. In addition, the validation of the DTI predictions of this model demonstrates the ability of our method to discover missing potential DTIs.
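The idea of systematically choosing which similarity measures to integrate can be sketched as a greedy forward selection. This is only an illustration under stated assumptions, not the FSI algorithm from the paper: the fusion rule (element-wise averaging), the scoring function, the measure names, and the toy matrices are all hypothetical stand-ins.

```python
def average_fusion(matrices):
    """Fuse similarity matrices by element-wise averaging (one possible integration function)."""
    n = len(matrices[0])
    k = len(matrices)
    return [[sum(m[i][j] for m in matrices) / k for j in range(n)] for i in range(n)]


def forward_select(candidates, score):
    """Greedily add the similarity measure that most improves score(fused matrix)."""
    chosen, best = [], float("-inf")
    remaining = dict(candidates)
    while remaining:
        trial = {
            name: score(average_fusion([candidates[c] for c in chosen] + [mat]))
            for name, mat in remaining.items()
        }
        name = max(trial, key=trial.get)
        if trial[name] <= best:
            break  # no remaining measure improves the score
        chosen.append(name)
        best = trial[name]
        del remaining[name]
    return chosen, best


# Hypothetical stand-in for a validation metric: closeness to a reference matrix.
reference = [[1.0, 0.5], [0.5, 1.0]]


def toy_score(fused):
    return -sum((fused[i][j] - reference[i][j]) ** 2 for i in range(2) for j in range(2))


# Hypothetical drug-drug similarity measures for two drugs.
measures = {
    "chem": [[1.0, 0.6], [0.6, 1.0]],
    "disease": [[1.0, 0.3], [0.3, 1.0]],
    "interaction": [[1.0, 0.0], [0.0, 1.0]],
}
chosen, best = forward_select(measures, toy_score)
```

In a real setting the score would come from cross-validated prediction performance of the network model built on the fused similarity, which is far more expensive than this toy function.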
Affiliation(s)
- Piyanut Tangmanussukum
- Advanced Virtual and Intelligent Computing (AVIC) Center, Department of Mathematics and Computer Science, Faculty of Science, Chulalongkorn University, Bangkok, Thailand
| | - Thitipong Kawichai
- Department of Mathematics and Computer Science, Academic Division, Chulachomklao Royal Military Academy, Nakhon Nayok, Thailand
| | - Apichat Suratanee
- Department of Mathematics, Faculty of Applied Science, King Mongkut’s University of Technology North Bangkok, Bangkok, Thailand
- Intelligent and Nonlinear Dynamics Innovations Research Center, Science and Technology Research Institute, King Mongkut’s University of Technology North Bangkok, Bangkok, Thailand
| | - Kitiporn Plaimas
- Advanced Virtual and Intelligent Computing (AVIC) Center, Department of Mathematics and Computer Science, Faculty of Science, Chulalongkorn University, Bangkok, Thailand
- Omics Science and Bioinformatics Center, Faculty of Science, Chulalongkorn University, Bangkok, Thailand
| |
|
29
|
Díaz Rodríguez CA, Díaz-García L, Bunk B, Spröer C, Herrera K, Tarazona NA, Rodriguez-R LM, Overmann J, Jiménez DJ. Novel bacterial taxa in a minimal lignocellulolytic consortium and their potential for lignin and plastics transformation. ISME COMMUNICATIONS 2022; 2:89. [PMID: 37938754 PMCID: PMC9723784 DOI: 10.1038/s43705-022-00176-7] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 07/08/2022] [Revised: 09/09/2022] [Accepted: 09/13/2022] [Indexed: 11/09/2023]
Abstract
The understanding and manipulation of microbial communities toward the conversion of lignocellulose and plastics are topics of interest in microbial ecology and biotechnology. In this study, the polymer-degrading capability of a minimal lignocellulolytic microbial consortium (MELMC) was explored by genome-resolved metagenomics. The MELMC was mostly composed (>90%) of three bacterial members (Pseudomonas protegens; Pristimantibacillus lignocellulolyticus gen. nov., sp. nov; and Ochrobactrum gambitense sp. nov) recognized by their high-quality metagenome-assembled genomes (MAGs). Functional annotation of these MAGs revealed that Pr. lignocellulolyticus could be involved in cellulose and xylan deconstruction, whereas Ps. protegens could catabolize lignin-derived chemical compounds. The capacity of the MELMC to transform synthetic plastics was assessed by two strategies: (i) annotation of MAGs against databases containing plastic-transforming enzymes; and (ii) predicting enzymatic activity based on chemical structural similarities between lignin- and plastics-derived chemical compounds, using Simplified Molecular-Input Line-Entry System and Tanimoto coefficients. Enzymes involved in the depolymerization of polyurethane and polybutylene adipate terephthalate were found to be encoded by Ps. protegens, which could catabolize phthalates and terephthalic acid. The axenic culture of Ps. protegens grew on polyhydroxyalkanoate (PHA) nanoparticles and might be a suitable species for the industrial production of PHAs in the context of lignin and plastic upcycling.
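The Tanimoto coefficient mentioned in this abstract compares two binary fingerprints as the ratio of shared on-bits to total on-bits. A minimal sketch follows; the fingerprints here are hypothetical sets of bit indices, whereas in practice they would be derived from SMILES strings with a cheminformatics toolkit, which is not shown.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient between two fingerprints given as collections of on-bit indices."""
    a, b = set(bits_a), set(bits_b)
    union = len(a | b)
    # Two empty fingerprints share nothing to compare; return 0.0 by convention here.
    return len(a & b) / union if union else 0.0


# Hypothetical fingerprints for a lignin-derived and a plastics-derived compound.
lignin_like = {1, 2, 3, 4}
plastic_like = {3, 4, 5, 6}
similarity = tanimoto(lignin_like, plastic_like)
```

Here two of the six distinct on-bits are shared, so the coefficient is 2/6.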
Affiliation(s)
- Carlos Andrés Díaz Rodríguez
- Microbiomes and Bioenergy Research Group, Department of Biological Sciences, Universidad de los Andes, Bogotá, Colombia
| | - Laura Díaz-García
- Microbiomes and Bioenergy Research Group, Department of Biological Sciences, Universidad de los Andes, Bogotá, Colombia
- Department of Chemical and Biological Engineering, Advanced Biomanufacturing Centre, University of Sheffield, Sheffield, UK
| | - Boyke Bunk
- Leibniz Institute DSMZ-German Collection of Microorganisms and Cell Cultures, Braunschweig, Germany
| | - Cathrin Spröer
- Leibniz Institute DSMZ-German Collection of Microorganisms and Cell Cultures, Braunschweig, Germany
| | - Katherine Herrera
- Department of Civil and Environmental Engineering, Universidad de los Andes, Bogotá, Colombia
| | | | - Luis M Rodriguez-R
- Department of Microbiology and Digital Science Center (DiSC), University of Innsbruck, Innsbruck, Austria
| | - Jörg Overmann
- Leibniz Institute DSMZ-German Collection of Microorganisms and Cell Cultures, Braunschweig, Germany
- Braunschweig University of Technology, Braunschweig, Germany
| | - Diego Javier Jiménez
- Microbiomes and Bioenergy Research Group, Department of Biological Sciences, Universidad de los Andes, Bogotá, Colombia.
| |
|
30
|
Pu Y, Li J, Tang J, Guo F. DeepFusionDTA: Drug-Target Binding Affinity Prediction With Information Fusion and Hybrid Deep-Learning Ensemble Model. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2022; 19:2760-2769. [PMID: 34379594 DOI: 10.1109/tcbb.2021.3103966] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/13/2023]
Abstract
Identification of drug-target interaction (DTI) is one of the most important issues in the broad field of drug discovery. Using purely biological experiments to verify drug-target binding profiles takes a great deal of time and effort, so computational technologies for this task are highly beneficial in reducing the drug search space. Most computational methods for predicting DTI treat it as a binary classification problem, which ignores the influence of binding strength. Therefore, drug-target binding affinity prediction is still a challenging issue. Currently, many studies extract only sequence information, which lacks a feature-rich representation, whereas we consider more spatial features in order to merge various data in the drug and target spaces. In this study, we propose a two-stage deep neural network ensemble model for predicting drug-target binding affinity, called DeepFusionDTA, built from various information analysis modules. The first stage uses sequence and structure information to generate a fusion feature map for each candidate protein-drug pair through several deep learning-based analysis modules. The second stage applies a bagging-based ensemble learning strategy for regression prediction, and we obtain outstanding results by combining the advantages of various algorithms in efficient feature abstraction and regression calculation. Importantly, compared with the existing prediction tool DeepDTA, DeepFusionDTA delivers a 1.5 percent increase in concordance index (CI) on the KIBA dataset and a 1.0 percent increase on the Davis dataset. Furthermore, the ideas we have offered can be applied to in-silico screening of the interaction space to provide novel DTIs that can be pursued experimentally. The codes and data are available from https://github.com/guofei-tju/DeepFusionDTA.
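The CI figures quoted above refer to the concordance index, which measures how often a model ranks a pair of affinities in the correct order. The sketch below is a plain quadratic-time version with hypothetical affinity values; it is not taken from the DeepFusionDTA code, and tie handling (half credit for tied predictions) follows a common convention that the paper may or may not use.

```python
def concordance_index(y_true, y_pred):
    """Fraction of comparable pairs whose predicted order matches the true order."""
    concordant, comparable = 0.0, 0
    n = len(y_true)
    for i in range(n):
        for j in range(i + 1, n):
            if y_true[i] == y_true[j]:
                continue  # pairs with equal true affinity are not comparable
            comparable += 1
            d_true = y_true[i] - y_true[j]
            d_pred = y_pred[i] - y_pred[j]
            if d_pred == 0:
                concordant += 0.5  # tied prediction counts as half
            elif (d_true > 0) == (d_pred > 0):
                concordant += 1.0
    return concordant / comparable if comparable else 0.0


# Hypothetical measured and predicted binding affinities for four drug-target pairs.
measured = [5.0, 6.2, 7.1, 8.4]
predicted = [5.1, 6.0, 7.5, 7.2]
ci = concordance_index(measured, predicted)
```

Of the six comparable pairs, only the last two items are ranked in the wrong order, so the index is 5/6.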
|
31
|
Zhang Y, Guo K, Zhang P, Zhang M, Li X, Zhou S, Sun H, Wang W, Wang H, Hu Y. Exploring the mechanism of YangXue QingNao Wan based on network pharmacology in the treatment of Alzheimer’s disease. Front Genet 2022; 13:942203. [PMID: 36105078 PMCID: PMC9465410 DOI: 10.3389/fgene.2022.942203] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/12/2022] [Accepted: 08/01/2022] [Indexed: 11/13/2022] Open
Abstract
It has been clinically reported that YangXue QingNao Wan (YXQNW) combined with donepezil can significantly improve the cognitive function of AD patients. However, the mechanism is not clear. A network pharmacology approach was employed to predict the protein targets and affected pathways of YXQNW in the treatment of AD. The correlation between YXQNW and AD was calculated using a random walk evaluation and compared with that of a variety of clinically approved Western drugs for AD. The targets of YXQNW were enriched and analyzed using the TSEA platform and MetaCore. We proved that the overall correlation between YXQNW and AD is equivalent to that of clinical Western drugs, but the mechanism of action is very different. Firstly, YXQNW may promote cerebral blood flow velocity by regulating platelet aggregation and the vasoconstriction/relaxation signal pathway, which has been verified by clinical meta-analysis. Secondly, YXQNW may promote Aβ degradation in the liver by modulating abnormal glucose and lipid metabolism via the adiponectin-dependent pathway, the RXR/PPAR-dependent lipid metabolism signal pathway, and the fatty acid synthase activity signal pathway. We also verified whether YXQNW indeed promoted Aβ degradation in hepatic stellate cells. This work provides a novel scientific basis for the mechanism of YXQNW in the treatment of AD.
Affiliation(s)
- Yuying Zhang
- Cloudphar Pharmaceuticals Co. Ltd., Shenzhen, China
| | - Kaimin Guo
- Cloudphar Pharmaceuticals Co. Ltd., Shenzhen, China
| | - Pengfei Zhang
- Tianjin Pharmaceutical and Cosmetic Evaluation and Inspection Center, Tianjin, China
| | | | - Xiaoqiang Li
- Cloudphar Pharmaceuticals Co. Ltd., Shenzhen, China
| | - Shuiping Zhou
- The State Key Laboratory of Core Technology in Innovative Chinese Medicine, Tasly Academy, Tasly Holding Group Co. Ltd., Tianjin, China
- Tasly Pharmaceutical Group Co. Ltd., Tianjin, China
| | - He Sun
- The State Key Laboratory of Core Technology in Innovative Chinese Medicine, Tasly Academy, Tasly Holding Group Co. Ltd., Tianjin, China
- Tasly Pharmaceutical Group Co. Ltd., Tianjin, China
| | - Wenjia Wang
- Cloudphar Pharmaceuticals Co. Ltd., Shenzhen, China
| | - Hui Wang
- Key Laboratory of Molecular Biophysics, Hebei Province, Institute of Biophysics, School of Health Sciences and Biomedical Engineering, Hebei University of Technology, Tianjin, China
- Key Laboratory of Bioactive Materials Ministry of Education, School of Life Sciences, Nankai University, Tianjin, China
- *Correspondence: Hui Wang; Yunhui Hu
| | - Yunhui Hu
- Cloudphar Pharmaceuticals Co. Ltd., Shenzhen, China
- *Correspondence: Hui Wang; Yunhui Hu
| |
|
32
|
Reciprocal perspective as a super learner improves drug-target interaction prediction (MUSDTI). Sci Rep 2022; 12:13237. [PMID: 35918366 PMCID: PMC9344797 DOI: 10.1038/s41598-022-16493-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/21/2022] [Accepted: 07/11/2022] [Indexed: 11/08/2022] Open
Abstract
The identification of novel drug-target interactions (DTI) is critical to drug discovery and drug repurposing to address contemporary medical and public health challenges presented by emergent diseases. Historically, computational methods have framed DTI prediction as a binary classification problem (indicating whether or not a drug physically interacts with a given protein target); however, framing the problem instead as a regression-based prediction of the physiochemical binding affinity is more meaningful. With growing databases of experimentally derived drug-target interactions (e.g. Davis, Binding-DB, and Kiba), deep learning-based DTI predictors can be effectively leveraged to achieve state-of-the-art (SOTA) performance. In this work, we formulated a DTI competition as part of the coursework for a senior undergraduate machine learning course and challenged students to generate component DTI models that might surpass SOTA models and effectively combine these component models as part of a meta-model using the Reciprocal Perspective (RP) multi-view learning framework. Following 6 weeks of concerted effort, 28 student-produced component deep-learning DTI models were leveraged in this work to produce a new SOTA RP-DTI model, denoted the Meta Undergraduate Student DTI (MUSDTI) model. Through a series of experiments we demonstrate that (1) RP can considerably improve SOTA DTI prediction, (2) our new double-cold experimental design is more appropriate for emergent DTI challenges, (3) that our novel MUSDTI meta-model outperforms SOTA models, (4) that RP can improve upon individual models as an ensembling method, and finally, (5) RP can be utilized for low computation transfer learning. This work introduces a number of important revelations for the field of DTI prediction and sequence-based, pairwise prediction in general.
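A "double-cold" evaluation is commonly understood as testing on pairs whose drug and target are both absent from training. The sketch below shows one way to build such a split; it is an assumption about the general idea, not the experimental design from this paper, and the function name, the `test_frac` parameter, and the toy identifiers are hypothetical.

```python
import random


def double_cold_split(pairs, test_frac=0.3, seed=0):
    """Split (drug, target) pairs so that test drugs and test targets never appear in training.

    Pairs that mix a held-out entity with a training entity are discarded.
    """
    rng = random.Random(seed)
    drugs = sorted({d for d, _ in pairs})
    targets = sorted({t for _, t in pairs})
    test_drugs = set(rng.sample(drugs, max(1, int(len(drugs) * test_frac))))
    test_targets = set(rng.sample(targets, max(1, int(len(targets) * test_frac))))
    train = [(d, t) for d, t in pairs if d not in test_drugs and t not in test_targets]
    test = [(d, t) for d, t in pairs if d in test_drugs and t in test_targets]
    return train, test


# Hypothetical toy data: every combination of 4 drugs and 4 targets.
all_pairs = [(f"D{i}", f"T{j}") for i in range(4) for j in range(4)]
train_pairs, test_pairs = double_cold_split(all_pairs)
```

The cost of this design is data loss: with one held-out drug and one held-out target out of four each, only 9 training pairs and 1 test pair survive from the original 16.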
|
33
|
Zheng J, Xiao X, Qiu WR. DTI-BERT: Identifying Drug-Target Interactions in Cellular Networking Based on BERT and Deep Learning Method. Front Genet 2022; 13:859188. [PMID: 35754843 PMCID: PMC9213727 DOI: 10.3389/fgene.2022.859188] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/21/2022] [Accepted: 04/25/2022] [Indexed: 11/20/2022] Open
Abstract
Drug–target interactions (DTIs) are regarded as an essential part of genomic drug discovery, and computational prediction of DTIs can accelerate the search for a lead drug for a given target, compensating for the time and expense of wet-lab techniques. Currently, many computational methods predict DTIs based on the sequential composition or physicochemical properties of the drug and target, but further efforts are needed to improve them. In this article, we propose a new sequence-based method for accurately identifying DTIs. For target proteins, we explore using pre-trained Bidirectional Encoder Representations from Transformers (BERT) to extract sequence features, which can provide unique and valuable pattern information. For drug molecules, the Discrete Wavelet Transform (DWT) is employed to generate information from drug molecular fingerprints. We then concatenate the feature vectors of the DTIs and input them into a feature extraction module, consisting of a batch-norm layer, a rectified linear activation layer and a linear layer and called a BRL block, followed by a Convolutional Neural Network module that extracts DTI features further. Subsequently, a BRL block is used as the prediction engine. After optimizing the model based on contrastive loss and cross-entropy loss, it gave prediction accuracies of up to 90.1, 94.7, 94.9, and 89% for the target families of G protein-coupled receptors, ion channels, enzymes, and nuclear receptors, respectively, which indicates that the proposed method can outperform the existing predictors. To make it as convenient as possible for researchers, the web server for the new predictor is freely accessible at: https://bioinfo.jcu.edu.cn/dtibert or http://121.36.221.79/dtibert/. The proposed method may also be a potential option for other DTIs.
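To illustrate how a discrete wavelet transform turns a molecular fingerprint into features, the sketch below applies one level of the Haar transform to a bit vector. The choice of the Haar wavelet, the single decomposition level, and the 8-bit fingerprint are assumptions made for illustration; the abstract does not state which wavelet or depth DTI-BERT uses.

```python
import math


def haar_dwt(signal):
    """One level of the orthonormal Haar DWT: returns (approximation, detail) coefficients."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    # Each output coefficient summarizes one non-overlapping pair of inputs.
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


# Hypothetical 8-bit molecular fingerprint.
fingerprint = [1, 0, 1, 1, 0, 0, 1, 0]
approx, detail = haar_dwt(fingerprint)
```

The approximation coefficients capture local bit density and the detail coefficients capture local changes; because the transform is orthonormal, the total energy of the coefficients equals that of the input.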
Affiliation(s)
- Jie Zheng
- Computer Department, Jing-De-Zhen Ceramic Institute, Jing-De-Zhen, China
| | - Xuan Xiao
- Computer Department, Jing-De-Zhen Ceramic Institute, Jing-De-Zhen, China
| | - Wang-Ren Qiu
- Computer Department, Jing-De-Zhen Ceramic Institute, Jing-De-Zhen, China
| |
|
34
|
Singh AK, Bilal M, Barceló D, Iqbal HMN. A predictive toolset for the identification of degradation pattern and toxic hazard estimation of multimeric hazardous compounds persists in water bodies. THE SCIENCE OF THE TOTAL ENVIRONMENT 2022; 824:153979. [PMID: 35181354 DOI: 10.1016/j.scitotenv.2022.153979] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/21/2021] [Revised: 02/10/2022] [Accepted: 02/14/2022] [Indexed: 02/08/2023]
Abstract
An array of industrial processing units generates many multimeric hazardous compounds, such as complex technical lignin and its toxic derivatives, which then persist in discharged water bodies. The inclusion of certain groups of motifs in the complex technical lignin structure helps it resist biological degradation, often to the point of recalcitrance. Relatively small concentrations of lignin are harmful to aquatic organisms and can trigger environmental hazards. Sadly, the entire biotransformation pathway and insightful information about these toxic derivatives are incomplete or missing in the literature. This is mainly because current conventional treatments often fail to identify all transformed compounds and their environmental fate. Thus, a robust toolset is much needed to cover this literature gap. The inadequate performance of conventional remediation processes and biological degradation patterns can be optimized with the aid of predictive toolset methods that could offer better degradability and complete information on transformed compounds. Predictive toolset-assisted determination of biodegradation patterns is a multifaceted and reliable analytical technique that can help to overcome existing shortcomings by providing an entire transformation pathway. Considering the above critiques, this work reports on the degradation patterns and toxicological endpoints of five hazardous compounds that persist in water matrices, i.e., 2-chlorosyringaldehyde, 5-chlorovanillin, catechol, guaiacyl 4-O-5 guaiacyl, and syringyl β-O-4 syringyl β-O-4 sinapyl alcohol. The predicted transformation pattern revealed notably less complex end-products of catechol, namely succinate and 2-oxo-4-pentenoate. The gastrointestinal (GI) absorption rate was found to be high for all tested compounds except the trimer compound, i.e., syringyl β-O-4 syringyl β-O-4 sinapyl alcohol.
The toxicity and persistence profile tested via Toxtree was positive under the Cramer Rules, the Verhaar Scheme, and the Structural Alerts for Reactivity (START) biodegradation ability scheme, and all five target compounds were found to be class-II persistent compounds. Furthermore, the Ecological Structure-Activity Relationships (ECOSAR)-assisted testing indicates that all tested derivatives have multiple aquatic toxicity levels. In summary, the current findings endorse the hazardous compounds and undertake prescreening of the deprivation policy to protect the environment.
Affiliation(s)
- Anil Kumar Singh
- Environmental Microbiology Laboratory, Environmental Toxicology Group, CSIR-Indian Institute of Toxicology Research (CSIR-IITR), Vishvigyan Bhawan, 31, Mahatma Gandhi Marg, Lucknow 226001, Uttar Pradesh, India; Academy of Scientific and Innovative Research (AcSIR), Ghaziabad 201002, India
| | - Muhammad Bilal
- School of Life Science and Food Engineering, Huaiyin Institute of Technology, Huaian 223003, China
| | - Damià Barceló
- Department of Environmental Chemistry, Institute of Environmental Assessment and Water Research (IDAEA-CSIC), Jordi Girona, 18-26, 08034 Barcelona, Spain; Catalan Institute of Water Research (ICRA-CERCA), Parc Científic i Tecnològic de la Universitat de Girona, c/Emili Grahit, 101, Edifici H2O, 17003 Girona, Spain; Sustainability Cluster, School of Engineering, UPES, Dehradun, India
| | - Hafiz M N Iqbal
- Tecnologico de Monterrey, School of Engineering and Sciences, Monterrey 64849, Mexico.
|
35
|
Li X, Guo K, Zhang R, Wang W, Sun H, Yagüe E, Hu Y. Exploration of the Mechanism of Salvianolic Acid for Injection Against Ischemic Stroke: A Research Based on Computational Prediction and Experimental Validation. Front Pharmacol 2022; 13:894427. [PMID: 35694259 PMCID: PMC9175744 DOI: 10.3389/fphar.2022.894427] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/11/2022] [Accepted: 04/21/2022] [Indexed: 11/13/2022]
Abstract
Ischemic stroke (IS), a leading cause of death and disability, is an acute neurological injury that occurs when a vessel supplying blood to the brain is obstructed. Salvia miltiorrhiza has been used in the treatment of cardiovascular and cerebrovascular diseases for thousands of years because it activates blood circulation and dissipates blood stasis. However, the herbal preparation is chemically complex, and the diversity of potential targets makes it difficult to determine its mechanism of action. To gain insight into its mechanism of action, we analyzed “Salvianolic acid for injection” (SAFI), a traditional Chinese herbal medicine with anti-IS effects, using computational systems pharmacology. The potential targets of SAFI, obtained from literature mining and database searches, were compared with IS-associated genes, giving 38 common genes that were related to pathways involved in the inflammatory response. This suggests that SAFI might function as an anti-inflammatory agent. Two genes associated with inflammation (PTGS1 and PTGS2), which were inhibited by SAFI, were preliminarily validated in vitro. The results showed that SAFI inhibited PTGS1 and PTGS2 activity in a dose-dependent manner and inhibited the production of prostaglandin E2 induced by lipopolysaccharide in RAW264.7 macrophages and BV-2 microglia. This approach reveals the possible pharmacological mechanism of SAFI acting on IS, and also provides a feasible way to elucidate the mechanism of traditional Chinese medicine (TCM).
Affiliation(s)
- Xiaoqiang Li
- Cloudphar Pharmaceuticals Co., Ltd., Shenzhen, China
| | - Kaimin Guo
- Cloudphar Pharmaceuticals Co., Ltd., Shenzhen, China
| | - Ruili Zhang
- College of Pharmacy, Haihe Education Park, Nankai University, Tianjin, China
| | - Wenjia Wang
- Cloudphar Pharmaceuticals Co., Ltd., Shenzhen, China
| | - He Sun
- Tasly Pharmaceuticals Co., Ltd., Tianjin, China
| | - Ernesto Yagüe
- Division of Cancer, Imperial College Faculty of Medicine, Hammersmith Hospital Campus, London, United Kingdom
| | - Yunhui Hu
- Cloudphar Pharmaceuticals Co., Ltd., Shenzhen, China
- *Correspondence: Yunhui Hu
|
36
|
Wang Y, Wang L, Wong L, Zhao B, Su X, Li Y, You Z. RoFDT: Identification of Drug–Target Interactions from Protein Sequence and Drug Molecular Structure Using Rotation Forest. BIOLOGY 2022; 11:741. [PMID: 35625469 PMCID: PMC9138819 DOI: 10.3390/biology11050741] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/15/2022] [Revised: 05/02/2022] [Accepted: 05/06/2022] [Indexed: 11/16/2022]
Abstract
As the basis for screening drug candidates, the identification of drug–target interactions (DTIs) plays a crucial role in innovative drug research. However, due to the inherent constraints of small-scale and time-consuming wet experiments, DTI recognition is usually difficult to carry out. In the present study, we developed a computational approach called RoFDT to predict DTIs by combining feature-weighted Rotation Forest (FwRF) with protein sequence information. In particular, we first encode protein sequences as numerical matrices using the Position-Specific Score Matrix (PSSM), then extract their features using the Pseudo Position-Specific Score Matrix (PsePSSM), combine them with drug structure information in the form of molecular fingerprints, and finally feed them into the FwRF classifier. We validated the performance of RoFDT on the Enzyme, GPCR, Ion Channel and Nuclear Receptor datasets, on which it achieved 91.68%, 84.72%, 88.11% and 78.33% accuracy, respectively. RoFDT shows excellent performance in comparison with support vector machine models and previous superior approaches. Furthermore, 7 of the top 10 DTIs ranked by RoFDT estimate scores were confirmed by the relevant database. These results demonstrate that RoFDT can be employed as a powerful predictive approach for DTIs, providing theoretical support for innovative drug discovery.
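The PsePSSM step mentioned in this abstract can be illustrated with a small sketch. It assumes one commonly used formulation (per-column means followed by lag-dependent mean squared differences); the abstract does not confirm that RoFDT uses exactly this variant, and the function name `pse_pssm`, the `max_lag` parameter, and the toy matrix are hypothetical. Normalization of the raw PSSM, which is usually applied first, is omitted.

```python
def pse_pssm(pssm, max_lag=2):
    """Turn an L x C score matrix into a fixed-length vector of C * (1 + max_lag) features."""
    length = len(pssm)
    n_cols = len(pssm[0])
    if length <= max_lag:
        raise ValueError("sequence must be longer than max_lag")
    # Column means: average score of each column over the whole sequence.
    features = [sum(row[j] for row in pssm) / length for j in range(n_cols)]
    # Lag terms: mean squared difference between positions i and i + lag, per column.
    for lag in range(1, max_lag + 1):
        for j in range(n_cols):
            total = sum((pssm[i][j] - pssm[i + lag][j]) ** 2 for i in range(length - lag))
            features.append(total / (length - lag))
    return features


# Toy 4-residue, 2-column matrix (a real PSSM has 20 columns, one per amino acid).
toy = [[1.0, 0.0], [3.0, 0.0], [1.0, 2.0], [3.0, 2.0]]
vec = pse_pssm(toy, max_lag=2)
```

Because the output length depends only on the number of columns and `max_lag`, proteins of different lengths map to vectors of the same size, which is what allows them to be concatenated with a drug fingerprint and passed to a classifier.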
Affiliation(s)
- Ying Wang
- College of Information Science and Engineering, Zaozhuang University, Zaozhuang 277160, China;
| | - Lei Wang
- College of Information Science and Engineering, Zaozhuang University, Zaozhuang 277160, China;
- Big Data and Intelligent Computing Research Center, Guangxi Academy of Sciences, Nanning 530007, China;
- Correspondence: (L.W.); (Z.Y.); Tel.: +86-151-0632-2257 (L.W.); +86-173-9276-3836 (Z.Y.)
| | - Leon Wong
- Big Data and Intelligent Computing Research Center, Guangxi Academy of Sciences, Nanning 530007, China;
| | - Bowei Zhao
- Xinjiang Technical Institutes of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China; (B.Z.); (X.S.)
| | - Xiaorui Su
- Xinjiang Technical Institutes of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China; (B.Z.); (X.S.)
| | - Yang Li
- School of Computer Science and Information Engineering, Hefei University of Technology, Hefei 230601, China;
| | - Zhuhong You
- Big Data and Intelligent Computing Research Center, Guangxi Academy of Sciences, Nanning 530007, China;
- School of Computer Science, Northwestern Polytechnical University, Xi’an 710129, China
- Correspondence: (L.W.); (Z.Y.); Tel.: +86-151-0632-2257 (L.W.); +86-173-9276-3836 (Z.Y.)
|
37
|
Wang L, Wong L, Chen ZH, Hu J, Sun XF, Li Y, You ZH. MSPEDTI: Prediction of Drug-Target Interactions via Molecular Structure with Protein Evolutionary Information. BIOLOGY 2022; 11:740. [PMID: 35625468 PMCID: PMC9138588 DOI: 10.3390/biology11050740] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/22/2022] [Revised: 05/03/2022] [Accepted: 05/04/2022] [Indexed: 11/25/2022]
Abstract
The key to new drug discovery and development is first and foremost the search for molecular targets of drugs, which advances both drug discovery and drug repositioning. However, traditional identification of drug-target interactions (DTIs) is a costly, lengthy, high-risk, and low-success-rate systems project. Therefore, more and more pharmaceutical companies are trying to use computational technologies to screen existing drug molecules and mine new drugs, thereby accelerating new drug development. In the current study, we designed a deep learning computational model, MSPEDTI, based on Molecular Structure and Protein Evolutionary information to predict potential DTIs. The model first fuses protein evolutionary information and drug structure information, then uses a deep learning convolutional neural network (CNN) to mine their hidden features, and finally predicts the associated DTIs with an extreme learning machine (ELM). In cross-validation experiments, MSPEDTI achieved 94.19%, 90.95%, 87.95%, and 86.11% prediction accuracy on the gold-standard datasets of enzymes, ion channels, G-protein-coupled receptors (GPCRs), and nuclear receptors, respectively. MSPEDTI was competitive in ablation experiments and in comparison with previous excellent methods. Additionally, 7 of 10 potential DTIs predicted by MSPEDTI were substantiated by the classical database. These outcomes demonstrate the ability of MSPEDTI to provide reliable drug candidate targets and to strongly facilitate drug repositioning and drug development.
Affiliation(s)
- Lei Wang
- Big Data and Intelligent Computing Research Center, Guangxi Academy of Sciences, Nanning 530007, China;
- College of Information Science and Engineering, Zaozhuang University, Zaozhuang 277160, China; (J.H.); (X.-F.S.)
| | - Leon Wong
- Big Data and Intelligent Computing Research Center, Guangxi Academy of Sciences, Nanning 530007, China;
| | - Zhan-Heng Chen
- Computer Science and Technology, Tongji University, Shanghai 200092, China;
| | - Jing Hu
- College of Information Science and Engineering, Zaozhuang University, Zaozhuang 277160, China; (J.H.); (X.-F.S.)
| | - Xiao-Fei Sun
- College of Information Science and Engineering, Zaozhuang University, Zaozhuang 277160, China; (J.H.); (X.-F.S.)
| | - Yang Li
- School of Computer Science and Information Engineering, Hefei University of Technology, Hefei 230601, China;
| | - Zhu-Hong You
- Big Data and Intelligent Computing Research Center, Guangxi Academy of Sciences, Nanning 530007, China;
- School of Computer Science, Northwestern Polytechnical University, Xi’an 710129, China
|
38
|
Jafari M, Mirzaie M, Bao J, Barneh F, Zheng S, Eriksson J, Heckman CA, Tang J. Bipartite network models to design combination therapies in acute myeloid leukaemia. Nat Commun 2022; 13:2128. [PMID: 35440130 PMCID: PMC9018865 DOI: 10.1038/s41467-022-29793-5] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 05/31/2021] [Accepted: 03/30/2022] [Indexed: 12/20/2022]
Abstract
Combination therapy is preferred over single-targeted monotherapies for cancer treatment due to its efficiency and safety. However, identifying effective drug combinations costs time and resources. We propose a method for identifying potential drug combinations by bipartite network modelling of patient-related drug response data, specifically the Beat AML dataset. The median of cell viability is used as a drug potency measurement to reconstruct a weighted bipartite network, model drug-biological sample interactions, and find the clusters of nodes inside two projected networks. Then, the clustering results are leveraged to discover effective multi-targeted drug combinations, which are also supported by more evidence using GDSC and ALMANAC databases. The potency and synergy levels of selective drug combinations are corroborated against monotherapy in three cell lines for acute myeloid leukaemia in vitro. In this study, we introduce a nominal data mining approach to improving acute myeloid leukaemia treatment through combinatorial therapy. Identifying effective drug combinations to treat cancer is a challenging task, either experimentally or computationally. Here, the authors develop a bipartite network modelling approach to propose drug combination strategies in acute myeloid leukaemia using patient and cell line drug screening data.
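The network construction described in this abstract (median cell viability as edge weight, followed by projection onto one node set) can be sketched as below. This is only an illustration under stated assumptions: the viability numbers are made up, the weight `1 - median` and the min-sum projection rule are choices made here for brevity, and the names `bipartite_weights` and `project_onto_drugs` are hypothetical rather than taken from the paper.

```python
from itertools import combinations
from statistics import median

# Hypothetical viability readouts: (drug, sample) -> replicate viabilities in [0, 1].
viability = {
    ("drugA", "s1"): [0.2, 0.3, 0.25],
    ("drugA", "s2"): [0.9, 0.8, 0.85],
    ("drugB", "s1"): [0.3, 0.2, 0.4],
    ("drugB", "s2"): [0.7, 0.9, 0.8],
    ("drugC", "s1"): [0.9, 1.0, 0.95],
    ("drugC", "s2"): [0.1, 0.2, 0.15],
}


def bipartite_weights(readouts):
    """Edge weight = 1 - median viability, so a stronger response gives a heavier edge."""
    return {edge: 1.0 - median(vals) for edge, vals in readouts.items()}


def project_onto_drugs(weights):
    """Drug-drug weight = sum over shared samples of the smaller of the two edge weights."""
    by_drug = {}
    for (drug, sample), w in weights.items():
        by_drug.setdefault(drug, {})[sample] = w
    projected = {}
    for d1, d2 in combinations(sorted(by_drug), 2):
        shared = by_drug[d1].keys() & by_drug[d2].keys()
        projected[(d1, d2)] = sum(min(by_drug[d1][s], by_drug[d2][s]) for s in shared)
    return projected


weights = bipartite_weights(viability)
drug_network = project_onto_drugs(weights)
```

In this toy data, `drugA` and `drugB` respond on the same sample and end up strongly linked, while `drugC` has a complementary profile. Clustering the projected network would therefore separate it from the other two, which is the kind of grouping a combination-selection step could then draw on.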
Affiliation(s)
- Mohieddin Jafari
- Research Program in Systems Oncology, Faculty of Medicine, University of Helsinki, Helsinki, Finland.
| | - Mehdi Mirzaie
- Research Program in Systems Oncology, Faculty of Medicine, University of Helsinki, Helsinki, Finland
| | - Jie Bao
- Research Program in Systems Oncology, Faculty of Medicine, University of Helsinki, Helsinki, Finland
| | - Farnaz Barneh
- Prinses Maxima Center for Pediatric Oncology, 3584 CS Utrecht, Utrecht, the Netherlands
| | - Shuyu Zheng
- Research Program in Systems Oncology, Faculty of Medicine, University of Helsinki, Helsinki, Finland
| | - Johanna Eriksson
- Research Program in Systems Oncology, Faculty of Medicine, University of Helsinki, Helsinki, Finland
| | - Caroline A Heckman
- Institute for Molecular Medicine Finland - FIMM, HiLIFE - Helsinki Institute of Life Science, iCAN Digital Precision Cancer Medicine Flagship, University of Helsinki, Helsinki, Finland
| | - Jing Tang
- Research Program in Systems Oncology, Faculty of Medicine, University of Helsinki, Helsinki, Finland.
|
39
|
Ru X, Ye X, Sakurai T, Zou Q. NerLTR-DTA: drug-target binding affinity prediction based on neighbor relationship and learning to rank. Bioinformatics 2022; 38:1964-1971. [PMID: 35134828 DOI: 10.1093/bioinformatics/btac048] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Received: 10/29/2021] [Revised: 12/20/2021] [Accepted: 01/28/2022] [Indexed: 02/03/2023]
Abstract
MOTIVATION Drug-target interaction prediction plays an important role in new drug discovery and drug repurposing. Binding affinity indicates the strength of drug-target interactions. Predicting drug-target binding affinity is expected to provide promising candidates for biologists, which can effectively reduce the workload of wet laboratory experiments and speed up the entire process of drug research. Given that numerous new proteins are being sequenced and compounds synthesized, several improved computational methods have been proposed for such predictions, but some challenges remain. (i) Many methods only discuss and implement one application scenario: they focus on drug repurposing and ignore the discovery of new drugs and targets. (ii) Many methods do not consider the priority order of proteins (or drugs) related to each target drug (or protein). Therefore, it is necessary to develop a comprehensive method that can be used in multiple scenarios and focuses on candidate order. RESULTS In this study, we propose a method called NerLTR-DTA that uses the neighbor relationship of similarity and sharing to extract features, and applies a ranking framework with regression attributes to predict affinity values and the priority order of a query drug (or query target) and its related proteins (or compounds). It is worth noting that using the characteristics of learning to rank to set different queries enables multi-scenario application of the method, including the discovery of new drugs and new targets. Experimental results on two commonly used datasets show that NerLTR-DTA outperforms some state-of-the-art competing methods. NerLTR-DTA achieves excellent performance in all application scenarios mentioned in this study, and the r_m^2 (test) values indicate that this performance is not obtained by chance.
Moreover, it can be concluded that NerLTR-DTA can provide accurate ranking lists for the relevant results of most queries through the statistics of the association relationship of each query drug (or query protein). In general, NerLTR-DTA is a powerful tool for predicting drug-target associations and can contribute to new drug discovery and drug repurposing. AVAILABILITY AND IMPLEMENTATION The proposed method is implemented in Python and Java. Source codes and datasets are available at https://github.com/RUXIAOQING964914140/NerLTR-DTA.
Affiliation(s)
- Xiaoqing Ru
- Department of Computer Science, University of Tsukuba, Tsukuba 3058577, Japan.,Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, Zhejiang 324000, China
| | - Xiucai Ye
- Department of Computer Science, University of Tsukuba, Tsukuba 3058577, Japan
| | - Tetsuya Sakurai
- Department of Computer Science, University of Tsukuba, Tsukuba 3058577, Japan
| | - Quan Zou
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu 610054, China.,Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, Zhejiang 324000, China
|
40
|
Xi Y, Miao Y, Zhou R, Wang M, Zhang F, Li Y, Zhang Y, Yang H, Guo F. Exploration of the Specific Pathology of HXMM Tablet Against Retinal Injury Based on Drug Attack Model to Network Robustness. Front Pharmacol 2022; 13:826535. [PMID: 35401181 PMCID: PMC8990835 DOI: 10.3389/fphar.2022.826535] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 11/30/2021] [Accepted: 02/23/2022] [Indexed: 11/13/2022]
Abstract
Retinal degenerative diseases are related to retinal injury because of the activation of the complement cascade, oxidative stress-induced cell death mechanisms, dysfunctional mitochondria, chronic neuroinflammation, and production of the vascular endothelial growth factor. Anti-VEGF therapy demonstrates remarkable clinical effects and benefits in retinal degenerative disease patients. Hence, new drug development is necessary to treat patients with severe visual loss. He xue ming mu (HXMM) tablet is a CFDA-approved traditional Chinese medicine (TCM) for retinal degenerative diseases, which can alleviate the symptoms of age-related macular degeneration (AMD) and diabetic retinopathy (DR) alone or in combination with anti-VEGF agents. To elucidate the mechanisms of HXMM, a quantitative evaluation algorithm for the prediction of the effect of multi-target drugs on the disturbance of the disease network has been used for exploring the specific pathology of HXMM and TCM precision positioning. Compared with anti-VEGF agents, the drug disturbance of HXMM on the functional subnetwork shows that HXMM reduces the network robustness on the oxidative stress subnetwork and inflammatory subnetwork to exhibit the anti-oxidation and anti-inflammation activity. HXMM provides better protection to ARPE-19 cells against retinal injury after H2O2 treatment. HXMM can elevate GSH and reduce LDH levels to exhibit antioxidant activity and suppress the expression of IL-6 and TNF-α for anti-inflammatory activity, which is different from the anti-VEGF agent with strong anti-VEGF activity. The experimental result confirmed the accuracy of the computational prediction. The combination of bioinformatics prediction based on the drug attack on network robustness and experimental validation provides a new strategy for precision application of TCM.
Affiliation(s)
- Yujie Xi
- Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing, China
- Chinese Medicine Research Institute, Tianjin University of Traditional Chinese Medicine, Tianjin, China
| | - Yan Miao
- Department of Pharmacology, School of Basic Medical Sciences, Xi’an Jiaotong University Health Science Center, Xi’an, China
| | - Rui Zhou
- Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing, China
- College of Traditional Chinese Medicine, Guangzhou University of Chinese Medicine, Guangzhou, China
| | - Maolin Wang
- Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing, China
| | - Fangbo Zhang
- Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing, China
| | - Yu Li
- Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing, China
- Chinese Medicine Research Institute, Tianjin University of Traditional Chinese Medicine, Tianjin, China
| | - Yi Zhang
- Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing, China
| | - Hongjun Yang
- Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing, China
- Chinese Medicine Research Institute, Tianjin University of Traditional Chinese Medicine, Tianjin, China
- Beijing Key Laboratory of Traditional Chinese Medicine Basic Research on Prevention and Treatment for Major Diseases, Experimental Research Center, China Academy of Chinese Medical Sciences, Beijing, China
- *Correspondence: Feifei Guo; Hongjun Yang
| | - Feifei Guo
- Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing, China
- *Correspondence: Feifei Guo; Hongjun Yang
|
41
|
Nakamura T, Sakaue S, Fujii K, Harabuchi Y, Maeda S, Iwata S. Selecting molecules with diverse structures and properties by maximizing submodular functions of descriptors learned with graph neural networks. Sci Rep 2022; 12:1124. [PMID: 35064170 PMCID: PMC8782878 DOI: 10.1038/s41598-022-04967-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/27/2021] [Accepted: 01/04/2022] [Indexed: 12/25/2022]
Abstract
Selecting diverse molecules from unexplored areas of chemical space is one of the most important tasks for discovering novel molecules and reactions. This paper proposes a new approach for selecting a subset of diverse molecules from a given molecular list by using two existing techniques studied in machine learning and mathematical optimization: graph neural networks (GNNs) for learning vector representation of molecules and a diverse-selection framework called submodular function maximization. Our method, called SubMo-GNN, first trains a GNN with property prediction tasks, and then the trained GNN transforms molecular graphs into molecular vectors, which capture both properties and structures of molecules. Finally, to obtain a subset of diverse molecules, we define a submodular function, which quantifies the diversity of molecular vectors, and find a subset of molecular vectors with a large submodular function value. This can be done efficiently by using the greedy algorithm, and the diversity of selected molecules measured by the submodular function value is mathematically guaranteed to be at least 63% of that of an optimal selection. We also introduce a new evaluation criterion to measure the diversity of selected molecules based on molecular properties. Computational experiments confirm that our SubMo-GNN successfully selects diverse molecules from the QM9 dataset regarding the property-based criterion, while performing comparably to existing methods regarding standard structure-based criteria. We also demonstrate that SubMo-GNN with a GNN trained on the QM9 dataset can select diverse molecules even from other MoleculeNet datasets whose domains are different from the QM9 dataset. The proposed method enables researchers to obtain diverse sets of molecules for discovering new molecules and novel chemical reactions, and the proposed diversity criterion is useful for discussing the diversity of molecular libraries from a new property-based perspective.
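The greedy selection step described in this abstract can be sketched as follows. This is a minimal illustration, not the SubMo-GNN implementation: the facility-location objective, the exponential similarity, and the names `facility_location` and `greedy_select` are assumptions chosen for brevity, since the abstract does not say which submodular function is used. For any monotone submodular objective under a cardinality constraint, the greedy rule is known to reach at least 1 - 1/e (about 63%) of the optimal value, which is the guarantee the abstract refers to.

```python
import math


def similarity(u, v):
    """Similarity in (0, 1]: exponential of the negative squared Euclidean distance."""
    return math.exp(-sum((a - b) ** 2 for a, b in zip(u, v)))


def facility_location(vectors, selected):
    """Monotone submodular objective: how well `selected` covers every vector."""
    if not selected:
        return 0.0
    return sum(max(similarity(v, vectors[s]) for s in selected) for v in vectors)


def greedy_select(vectors, k):
    """Pick up to k indices, each time adding the item with the largest marginal gain."""
    selected = []
    for _ in range(min(k, len(vectors))):
        current = facility_location(vectors, selected)
        best_idx, best_gain = None, -1.0
        for i in range(len(vectors)):
            if i in selected:
                continue
            gain = facility_location(vectors, selected + [i]) - current
            if gain > best_gain:
                best_idx, best_gain = i, gain
        selected.append(best_idx)
    return selected


# Hypothetical 2-D "molecular vectors": two tight clusters and one outlier.
vectors = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (10.0, 0.0)]
chosen = greedy_select(vectors, 3)
```

With three picks, the greedy rule takes one representative from each cluster plus the outlier instead of two near-duplicates, which is the diversity behaviour the objective is meant to reward. In the setting of the abstract, the vectors would come from a trained GNN rather than being written by hand.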
Affiliation(s)
- Tomohiro Nakamura
- Department of Mathematical Informatics, The University of Tokyo, Hongo 7-3-1, Bunkyo-ku, Tokyo, 113-8656, Japan.,JST, ERATO Maeda Artificial Intelligence for Chemical Reaction Design and Discovery Project, Kita 10 Nishi 8, Kita-ku, Sapporo, Hokkaido, 060-0810, Japan
| | - Shinsaku Sakaue
- Department of Mathematical Informatics, The University of Tokyo, Hongo 7-3-1, Bunkyo-ku, Tokyo, 113-8656, Japan.,JST, ERATO Maeda Artificial Intelligence for Chemical Reaction Design and Discovery Project, Kita 10 Nishi 8, Kita-ku, Sapporo, Hokkaido, 060-0810, Japan
| | - Kaito Fujii
- National Institute of Informatics, Hitotsubashi 2-1-2, Chiyoda-ku, Tokyo, 101-8430, Japan.,JST, ERATO Maeda Artificial Intelligence for Chemical Reaction Design and Discovery Project, Kita 10 Nishi 8, Kita-ku, Sapporo, Hokkaido, 060-0810, Japan
| | - Yu Harabuchi
- Department of Chemistry, Faculty of Science, Hokkaido University, Kita 10 Nishi 8, Kita-ku, Sapporo, Hokkaido, 060-0810, Japan.,Institute for Chemical Reaction Design and Discovery (WPI-ICReDD), Hokkaido University, Kita 21 Nishi 10, Kita-ku, Sapporo, Hokkaido, 001-0021, Japan.,JST, ERATO Maeda Artificial Intelligence for Chemical Reaction Design and Discovery Project, Kita 10 Nishi 8, Kita-ku, Sapporo, Hokkaido, 060-0810, Japan
| | - Satoshi Maeda
- Department of Chemistry, Faculty of Science, Hokkaido University, Kita 10 Nishi 8, Kita-ku, Sapporo, Hokkaido, 060-0810, Japan.,Institute for Chemical Reaction Design and Discovery (WPI-ICReDD), Hokkaido University, Kita 21 Nishi 10, Kita-ku, Sapporo, Hokkaido, 001-0021, Japan.,National Institute for Materials Science (NIMS), Research and Services Division of Materials Data and Integrated System (MaDIS), Tsukuba, Ibaraki, 305-0044, Japan.,JST, ERATO Maeda Artificial Intelligence for Chemical Reaction Design and Discovery Project, Kita 10 Nishi 8, Kita-ku, Sapporo, Hokkaido, 060-0810, Japan
| | - Satoru Iwata
- Department of Mathematical Informatics, The University of Tokyo, Hongo 7-3-1, Bunkyo-ku, Tokyo, 113-8656, Japan.,Institute for Chemical Reaction Design and Discovery (WPI-ICReDD), Hokkaido University, Kita 21 Nishi 10, Kita-ku, Sapporo, Hokkaido, 001-0021, Japan.,JST, ERATO Maeda Artificial Intelligence for Chemical Reaction Design and Discovery Project, Kita 10 Nishi 8, Kita-ku, Sapporo, Hokkaido, 060-0810, Japan
|
42
|
Çakı O, Karaçalı B. Quasi-Supervised Strategies for Compound-Protein Interaction Prediction. Mol Inform 2021; 41:e2100118. [PMID: 34837345 DOI: 10.1002/minf.202100118] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2021] [Accepted: 11/01/2021] [Indexed: 11/08/2022]
Abstract
In-silico compound-protein interaction prediction addresses prioritization of drug candidates for experimental biochemical validation because the wet-lab experiments are time-consuming, laborious and costly. Most machine learning methods proposed to that end approach this problem with supervised learning strategies in which known interactions are labeled as positive and the rest are labeled as negative. However, treating all unknown interactions as negative instances may lead to inaccuracies in real practice since some of the unknown interactions are bound to be positive interactions waiting to be identified as such. In this study, we propose to address this problem using the Quasi-Supervised Learning (QSL) algorithm. In this framework, potential interactions are predicted by estimating the overlap between a true positive dataset of compound-protein pairs with known interactions and an unknown dataset of all the remaining compound-protein pairs. The potential interactions are then identified as those in the unknown dataset that overlap with the interacting pairs in the true positive dataset in terms of the associated similarity structure. We also address the class-imbalance problem by modifying the conventional cost function of the QSL algorithm. Experimental results on GPCR and Nuclear Receptor datasets show that the proposed method can identify actual interactions from all possible combinations.
Affiliation(s)
- Onur Çakı
- Electrical and Electronics Engineering Department, Izmir Institute of Technology, Urla, Izmir, 35430, Turkey
| | - Bilge Karaçalı
- Electrical and Electronics Engineering Department, Izmir Institute of Technology, Urla, Izmir, 35430, Turkey
|
43
|
Bolt MJ, Singh P, Obkirchner CE, Powell RT, Mancini MG, Szafran AT, Stossi F, Mancini MA. Endocrine disrupting chemicals differentially alter intranuclear dynamics and transcriptional activation of estrogen receptor-α. iScience 2021; 24:103227. [PMID: 34712924 PMCID: PMC8529556 DOI: 10.1016/j.isci.2021.103227] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 05/20/2021] [Revised: 08/30/2021] [Accepted: 09/30/2021] [Indexed: 11/21/2022]
Abstract
Transcription is a highly regulated sequence of stochastic processes utilizing many regulators, including nuclear receptors (NR) that respond to stimuli. Endocrine disrupting chemicals (EDCs) in the environment can compete with natural ligands for nuclear receptors to alter transcription. As nuclear dynamics can be tightly linked to transcription, it is important to determine how EDCs affect NR mobility. We use an EPA-assembled set of 45 estrogen receptor-α (ERα) ligands and EDCs in our engineered PRL-Array model to characterize their effect upon transcription using fluorescence in situ hybridization and fluorescence recovery after photobleaching (FRAP). We identified 36 compounds that target ERα-GFP to a transcriptionally active, visible locus. Using a novel method for multi-region FRAP analysis we find a strong negative correlation between ERα mobility and inverse agonists. Our findings indicate that ERα mobility is not solely tied to transcription but affected highly by the chemical class binding the receptor.
Affiliation(s)
- Michael J. Bolt
- Center for Advanced Microscopy and Image Informatics, Institute of Biosciences & Technology, Texas A&M University, Houston, TX 77030, USA
- Center for Translational Cancer Research, Institute of Biosciences & Technology, Texas A&M University, Houston, TX 77030, USA
- Pankaj Singh
- Center for Advanced Microscopy and Image Informatics, Institute of Biosciences & Technology, Texas A&M University, Houston, TX 77030, USA
- Center for Translational Cancer Research, Institute of Biosciences & Technology, Texas A&M University, Houston, TX 77030, USA
- Caroline E. Obkirchner
- Center for Advanced Microscopy and Image Informatics, Institute of Biosciences & Technology, Texas A&M University, Houston, TX 77030, USA
- Center for Translational Cancer Research, Institute of Biosciences & Technology, Texas A&M University, Houston, TX 77030, USA
- Reid T. Powell
- Center for Translational Cancer Research, Institute of Biosciences & Technology, Texas A&M University, Houston, TX 77030, USA
- Maureen G. Mancini
- Department of Molecular and Cellular Biology, Baylor College of Medicine, Houston, TX 77030, USA
- Adam T. Szafran
- Department of Molecular and Cellular Biology, Baylor College of Medicine, Houston, TX 77030, USA
- Fabio Stossi
- Department of Molecular and Cellular Biology, Baylor College of Medicine, Houston, TX 77030, USA
- Center for Advanced Microscopy and Image Informatics, Institute of Biosciences & Technology, Texas A&M University, Houston, TX 77030, USA
- Michael A. Mancini
- Department of Molecular and Cellular Biology, Baylor College of Medicine, Houston, TX 77030, USA
- Department of Pharmacology and Chemical Biology, Baylor College of Medicine, Houston, TX 77030, USA
- Center for Advanced Microscopy and Image Informatics, Institute of Biosciences & Technology, Texas A&M University, Houston, TX 77030, USA
- Center for Translational Cancer Research, Institute of Biosciences & Technology, Texas A&M University, Houston, TX 77030, USA
|
44
|
Drug–disease associations prediction via Multiple Kernel-based Dual Graph Regularized Least Squares. Appl Soft Comput 2021. [DOI: 10.1016/j.asoc.2021.107811] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 01/26/2023]
|
45
|
Lennox M, Robertson N, Devereux B. Modelling Drug-Target Binding Affinity using a BERT based Graph Neural network. Annu Int Conf IEEE Eng Med Biol Soc 2021; 2021:4348-4353. [PMID: 34892183 DOI: 10.1109/embc46164.2021.9629695] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 06/14/2023]
Abstract
Understanding the interactions between novel drugs and target proteins is fundamentally important in disease research, as discovering drug-protein interactions can be an exceptionally time-consuming and expensive process. Alternatively, this process can be simulated using modern deep learning methods that have the potential of utilising vast quantities of data to reduce the cost and time required to provide accurate predictions. We seek to leverage a set of BERT-style models that have been pre-trained on vast quantities of both protein and drug data. The encodings produced by each model are then utilised as node representations for a graph convolutional neural network, which in turn are used to model the interactions without the need to simultaneously fine-tune both protein and drug BERT models to the task. We evaluate the performance of our approach on two drug-target interaction datasets that were previously used as benchmarks in recent work. Our results significantly improve upon a vanilla BERT baseline approach as well as the former state-of-the-art methods for each task dataset. Our approach builds upon past work in two key areas. Firstly, we take full advantage of two large pre-trained BERT models that provide improved representations of task-relevant properties of both drugs and proteins. Secondly, inspired by work in natural language processing that investigates how linguistic structure is represented in such models, we perform interpretability analyses that allow us to locate functionally-relevant areas of interest within each drug and protein. By modelling the drug-target interactions as a graph as opposed to a set of isolated interactions, we demonstrate the benefits of combining large pre-trained models and a graph neural network to make state-of-the-art predictions on drug-target binding affinity.
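As a rough illustration of feeding fixed, pre-computed encodings into a graph convolution, the minimal sketch below applies one normalised graph-convolution step to a toy drug-target graph. The function name `gcn_layer`, the two-dimensional toy embeddings, and the identity weight matrix are illustrative assumptions, not the authors' implementation; real BERT encodings would have hundreds of dimensions and the weights would be learned.

```python
from math import sqrt


def gcn_layer(adj, feats, weight):
    """One graph-convolution step: ReLU(D^-1/2 (A + I) D^-1/2 X W).

    adj    : n x n adjacency matrix (list of lists, 0/1 entries)
    feats  : n x d_in node features (stand-ins for frozen encoder outputs)
    weight : d_in x d_out weight matrix (hypothetical; would normally be trained)
    """
    n = len(adj)
    d_in = len(feats[0])
    d_out = len(weight[0])
    # Add self-loops so every node keeps part of its own embedding.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    # Symmetric degree normalisation.
    norm = [[a_hat[i][j] / sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]
    # Aggregate neighbour features, then apply the linear map and ReLU.
    agg = [[sum(norm[i][k] * feats[k][c] for k in range(n)) for c in range(d_in)]
           for i in range(n)]
    return [[max(0.0, sum(agg[i][c] * weight[c][o] for c in range(d_in)))
             for o in range(d_out)] for i in range(n)]


# Toy graph: node 0 is a drug, node 1 is a protein, joined by one edge.
adjacency = [[0, 1], [1, 0]]
embeddings = [[1.0, 0.0], [0.0, 1.0]]   # placeholder encodings (assumption)
identity = [[1.0, 0.0], [0.0, 1.0]]     # placeholder weights (assumption)
updated = gcn_layer(adjacency, embeddings, identity)
```

Because the encoder outputs enter only as `feats`, the graph layer can be trained without touching the encoders, which is the decoupling the abstract describes.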
|
46
|
K D, A S J, Liu Y. A deep learning ensemble approach to prioritize antiviral drugs against novel coronavirus SARS-CoV-2 for COVID-19 drug repurposing. Appl Soft Comput 2021; 113:107945. [PMID: 34630000 PMCID: PMC8492370 DOI: 10.1016/j.asoc.2021.107945] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 07/10/2021] [Revised: 08/22/2021] [Accepted: 09/23/2021] [Indexed: 12/13/2022]
Abstract
The alarming pandemic situation of Coronavirus infectious disease COVID-19, caused by the severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2), has become a critical threat to public health. The unexpected outbreak and unrealistic progression of COVID-19 have generated an utmost need to realize promising therapeutic strategies to fight the pandemic. Drug repurposing, an efficient technique for discovering new uses for approved drugs, is an emerging tactic to face the immediate global challenge. It offers a time-efficient and cost-effective way to find potential therapeutic agents for the disease. Artificial Intelligence-empowered deep learning models enable the rapid identification of potentially repurposable drug candidates against diseases. This study presents a deep learning ensemble model to prioritize clinically validated anti-viral drugs for their potential efficacy against SARS-CoV-2. The method integrates the similarities of drug chemical structures and virus genome sequences to generate feature vectors. The best combination of features is retrieved by the convolutional neural network in a deep learning manner. The extracted deep features are classified by the extreme gradient boosting classifier to infer potential virus–drug associations. The method achieved an AUC of 0.8897 with 0.8571 prediction accuracy and 0.8394 sensitivity under fivefold cross-validation. The experimental results and case studies demonstrate that the suggested deep learning ensemble system yields competitive results compared with state-of-the-art approaches. The top-ranked drugs are released for further wet-lab research.
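For readers unfamiliar with the reported metrics, the following minimal sketch shows how AUC, accuracy, and sensitivity can be computed from binary labels and classifier scores. The function names, the 0.5 decision threshold, and the toy data are assumptions made for illustration; the abstract does not state how the authors computed these values.

```python
def auc(labels, scores):
    """Area under the ROC curve via the pairwise-ranking formulation.

    Counts the fraction of (positive, negative) pairs in which the positive
    sample receives the higher score; ties count as half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accuracy_and_sensitivity(labels, scores, threshold=0.5):
    """Accuracy = (TP + TN) / N; sensitivity = TP / (TP + FN)."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 0)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    return (tp + tn) / len(labels), tp / (tp + fn)


# Toy virus-drug association scores (made-up numbers, not from the paper).
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.6, 0.1]
toy_auc = auc(toy_labels, toy_scores)
toy_acc, toy_sens = accuracy_and_sensitivity(toy_labels, toy_scores)
```

Under cross-validation these quantities would typically be computed on each held-out fold and then averaged.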
Affiliation(s)
- Deepthi K
- Department of Computer Science, College of Engineering, Vadakara (CAPE, Govt. of Kerala), Kozhikkode 673104, Kerala, India
- Bioinformatics Lab, Department of Computer Science, Cochin University of Science and Technology, Kochi 682022, Kerala, India
- Jereesh A S
- Bioinformatics Lab, Department of Computer Science, Cochin University of Science and Technology, Kochi 682022, Kerala, India
- Yuansheng Liu
- College of Information Science and Engineering, Hunan University, 2 Lushan S Rd, Yuelu District, 410086, Changsha, China
|
47
|
Prediction of Drug-Target Interactions by Combining Dual-Tree Complex Wavelet Transform with Ensemble Learning Method. Molecules 2021; 26:5359. [PMID: 34500792 PMCID: PMC8433937 DOI: 10.3390/molecules26175359] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/17/2021] [Revised: 08/27/2021] [Accepted: 08/30/2021] [Indexed: 11/17/2022] Open
Abstract
Identification of drug–target interactions (DTIs) is vital for drug discovery. However, traditional biological approaches have some unavoidable shortcomings, such as being time consuming and expensive. Therefore, there is an urgent need to develop novel and effective computational methods to predict DTIs in order to shorten the development cycles of new drugs. In this study, we present a novel computational approach to identify DTIs, which uses protein sequence information and the dual-tree complex wavelet transform (DTCWT). More specifically, a position-specific scoring matrix (PSSM) was computed for the target protein sequence to obtain its evolutionary information. Then, DTCWT was used to extract representative features from the PSSM, which were then combined with the drug fingerprint features to form the feature descriptors. Finally, these descriptors were sent to the Rotation Forest (RoF) model for classification. A 5-fold cross validation (CV) was adopted on four datasets (Enzyme, Ion Channel, GPCRs (G-protein-coupled receptors), and NRs (Nuclear Receptors)) to validate the proposed model; our method yielded high average accuracies of 89.21%, 85.49%, 81.02%, and 74.44%, respectively. To further verify the performance of our model, we compared the RoF classifier with two state-of-the-art algorithms: the support vector machine (SVM) and the k-nearest neighbor (KNN) classifier. We also compared it with some other published methods. Moreover, the prediction results for the independent dataset further indicated that our method is effective for predicting potential DTIs. Thus, we believe that our method is suitable for facilitating drug discovery and development.
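The descriptor assembly and 5-fold cross-validation mentioned above can be sketched as follows. This is a simplified illustration under stated assumptions: `build_descriptor` and `k_fold_indices` are hypothetical helper names, the protein feature values are placeholders for what DTCWT would extract from a PSSM (the transform itself is not implemented here), and folds are assigned by interleaving indices rather than by the shuffling a real experiment would normally use.

```python
def build_descriptor(protein_features, drug_fingerprint):
    """Concatenate protein-side features with a drug fingerprint bit vector."""
    return list(protein_features) + list(drug_fingerprint)


def k_fold_indices(n_samples, k=5):
    """Yield (train_indices, test_indices) for each of k folds.

    Samples are assigned to folds by interleaving (0, k, 2k, ... go to fold 0),
    which is deterministic; shuffling first is the usual practice.
    """
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, folds[i]


# One drug-target pair: placeholder protein features plus a toy fingerprint.
pair_descriptor = build_descriptor([0.12, 0.40], [1, 0, 1])

# Each sample is held out exactly once across the five folds.
for train_idx, test_idx in k_fold_indices(10, k=5):
    # A classifier would be fitted on train_idx and scored on test_idx here.
    assert not set(train_idx) & set(test_idx)
```

The average accuracy reported for each dataset would then be the mean of the five per-fold accuracies.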
|
48
|
Chiu PH, Yang YL, Tsao HK, Sheng YJ. Deep learning for predictions of hydrolysis rates and conditional molecular design of esters. J Taiwan Inst Chem Eng 2021. [DOI: 10.1016/j.jtice.2021.06.045] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/25/2022]
|
49
|
Wang F, Feng X, Guo X, Xu L, Xie L, Chang S. Improving de novo Molecule Generation by Embedding LSTM and Attention Mechanism in CycleGAN. Front Genet 2021; 12:709500. [PMID: 34422013 PMCID: PMC8376287 DOI: 10.3389/fgene.2021.709500] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 05/14/2021] [Accepted: 07/19/2021] [Indexed: 11/13/2022] Open
Abstract
The application of deep learning in the field of drug discovery brings the development and expansion of molecular generative models along with new challenges in this field. One of the challenges in de novo molecular generation is how to produce new reasonable molecules with desired pharmacological, physical, and chemical properties. To improve the similarity between the generated molecule and the starting molecule, we propose a new molecule generation model, LA-CycleGAN, which embeds Long Short-Term Memory (LSTM) and an Attention mechanism in the CycleGAN architecture. The network layer of the generator in CycleGAN is fused head and tail to improve the similarity of the generated structure. The embedded LSTM and Attention mechanism can overcome long-term dependency problems in treating the normally used SMILES input. Our quantitative evaluation shows that LA-CycleGAN expands the chemical space of the molecules and improves the ability of structure conversion. The generated molecules are highly similar to the starting compound structures while obtaining expected molecular properties during cycle generative adversarial network learning, which comprehensively improves the performance of the generative model.
Affiliation(s)
- Feng Wang
- Changzhou University Huaide College, Taizhou, China
- School of Computer Science and Artificial Intelligence, Aliyun School of Big Data, Changzhou University, Changzhou, China
- Xiao Guo
- Changzhou University Huaide College, Taizhou, China
- Lei Xu
- Institute of Bioinformatics and Medical Engineering, Jiangsu University of Technology, Changzhou, China
- Liangxu Xie
- Institute of Bioinformatics and Medical Engineering, Jiangsu University of Technology, Changzhou, China
- Shan Chang
- Institute of Bioinformatics and Medical Engineering, Jiangsu University of Technology, Changzhou, China
|
50
|
Predicting Drug-Target Interactions Based on the Ensemble Models of Multiple Feature Pairs. Int J Mol Sci 2021; 22:6598. [PMID: 34202954 PMCID: PMC8234024 DOI: 10.3390/ijms22126598] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/15/2021] [Revised: 06/09/2021] [Accepted: 06/16/2021] [Indexed: 11/30/2022] Open
Abstract
Background: The prediction of drug–target interactions (DTIs) is of great significance in drug development. Traditional experimental methods are time-consuming and expensive. Machine learning can reduce the cost of prediction but is limited by the characteristics of imbalanced datasets and problems of essential feature selection. Methods: The prediction method based on the Ensemble model of Multiple Feature Pairs (Ensemble-MFP) is introduced. Firstly, three negative sets are generated according to the Euclidean distance of three feature pairs. Then, the negative samples of the validation set/test set are randomly selected from the union set of the three negative sets in the validation set/test set. At the same time, the ensemble model with weight is optimized and applied to the test set. Results: The area under the receiver operating characteristic curve (area under ROC, AUC) in three out of four sub-datasets in gold standard datasets was more than 94.0% in the prediction of new drugs. The effectiveness of the proposed method is also shown with the comparison of state-of-the-art methods and demonstration of predicted drug–target pairs. Conclusion: The Ensemble-MFP can weigh the existing feature pairs and has a good prediction effect for general prediction on new drugs.
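A minimal sketch of the two ideas named in the Methods, distance-based negative selection and a weighted ensemble, is given below. The abstract does not specify the selection rule or the weighting scheme, so everything here is an assumption for illustration: `pick_negatives` treats the unlabeled pairs farthest (in Euclidean distance) from every known positive as likely negatives, and `weighted_ensemble` takes a normalised weighted average of per-model scores.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def pick_negatives(positives, unlabeled, n_neg):
    """Return the n_neg unlabeled vectors farthest from any known positive.

    Assumed heuristic: a pair whose features lie far from all known
    interactions is less likely to be an undiscovered positive.
    """
    def distance_to_nearest_positive(vec):
        return min(dist(vec, pos) for pos in positives)

    ranked = sorted(unlabeled, key=distance_to_nearest_positive, reverse=True)
    return ranked[:n_neg]


def weighted_ensemble(scores_per_model, weights):
    """Combine per-model score lists with a normalised weighted average."""
    total = sum(weights)
    n_samples = len(scores_per_model[0])
    return [sum(w * scores[i] for w, scores in zip(weights, scores_per_model)) / total
            for i in range(n_samples)]


# Toy feature vectors for drug-target pairs (made-up numbers).
known_positives = [(0.0, 0.0)]
candidates = [(1.0, 0.0), (5.0, 5.0), (0.1, 0.0)]
negatives = pick_negatives(known_positives, candidates, n_neg=1)

# Two hypothetical models, the first trusted three times as much as the second.
combined = weighted_ensemble([[1.0, 0.0], [0.0, 1.0]], weights=[3, 1])
```

In the paper's setting one such negative set would be built per feature pair and the ensemble weights tuned on a validation set; both steps are omitted from this sketch.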
|