1
Abbas A, Ye F. Computational methods and key considerations for in silico design of proteolysis targeting chimera (PROTACs). Int J Biol Macromol 2024; 277:134293. [PMID: 39084437 DOI: 10.1016/j.ijbiomac.2024.134293] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/29/2024] [Revised: 07/19/2024] [Accepted: 07/28/2024] [Indexed: 08/02/2024]
Abstract
Proteolysis-targeting chimeras (PROTACs), as heterobifunctional molecules, have garnered significant attention for their ability to target previously undruggable proteins. Due to the challenges in obtaining crystal structures of PROTAC molecules in the ternary complex, a plethora of computational tools have been developed to aid in PROTAC design. These computational tools can be broadly classified into artificial intelligence (AI)-based or non-AI-based methods. This review aims to provide a comprehensive overview of the latest computational methods for the PROTAC design process, covering both AI and non-AI approaches, from protein selection to ternary complex modeling and prediction. Key considerations for in silico PROTAC design are discussed, along with additional considerations for deploying AI-based models. These considerations are intended to guide subsequent model development in the PROTAC design process. Finally, future directions and recommendations are provided.
Affiliation(s)
- Amr Abbas
  - College of Life Sciences and Medicine, Zhejiang Sci-Tech University, Hangzhou 310018, China; Pharmaceutical Chemistry Department, Faculty of Pharmacy, Cairo University, Cairo 11562, Egypt
- Fei Ye
  - College of Life Sciences and Medicine, Zhejiang Sci-Tech University, Hangzhou 310018, China
2
Wang Y, Zhang Z, Piao C, Huang Y, Zhang Y, Zhang C, Lu YJ, Liu D. LDS-CNN: a deep learning framework for drug-target interactions prediction based on large-scale drug screening. Health Inf Sci Syst 2023; 11:42. [PMID: 37667773 PMCID: PMC10475000 DOI: 10.1007/s13755-023-00243-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/26/2023] [Accepted: 08/14/2023] [Indexed: 09/06/2023] [Open Access]
Abstract
Background: Drug-target interaction (DTI) is a vital drug design strategy that plays a significant role in many processes of complex diseases and cellular events. In the face of challenges such as extensive protein data and experimental costs, it is suggested to apply bioinformatics approaches to exploit potential interactions to design new targeted medications. Different data and interaction types, with incompatible and heterogeneous formats, make such studies difficult. The analysis of drug-target interactions in a comprehensive and unified model is a significant challenge. Method: Here, we propose a general method for predicting interactions between small-molecule drugs and protein targets, Large-scale Drug target Screening Convolutional Neural Network (LDS-CNN), which uses unified encoding to handle the different data formats in an integrated model for feature abstraction and potential object prediction. Result: On 898,412 interaction data involving 1683 small-molecule compounds and 14,350 human proteins from 8.8 billion records, the proposed method achieved an area under the curve (AUC) of 0.96, an area under the precision-recall curve (AUPRC) of 0.95, and an accuracy of 90.13%. The experimental results illustrated that the proposed method attained high accuracy on the test set, indicating its high predictive ability in drug-target interaction prediction. LDS-CNN is effective for the prediction of large-scale datasets and datasets composed of data with different formats. Conclusion: In this study, we propose a DTI prediction method to solve the problems of unified encoding of large-scale data in multiple formats. It provides a feasible way to efficiently abstract the features among different types of drug-related data, thus reducing experimental costs and time consumption. The proposed method can be used to identify potential drug targets and candidates for the treatment of complex diseases. This work provides a reference for DTI to process large-scale data and different formats with deep learning methods and provides certain suggestions for future research.
Affiliation(s)
- Yang Wang
  - School of Computer Science and Technology, Guangdong University of Technology, Guangzhou, 510006 China
- Zuxian Zhang
  - School of Biomedical and Pharmaceutical Sciences, Guangdong University of Technology, Guangzhou, 510006 China
- Chenghong Piao
  - The First Affiliated Hospital of Ningbo University, Ningbo, 315010 China
- Ying Huang
  - School of Biomedical and Pharmaceutical Sciences, Guangdong University of Technology, Guangzhou, 510006 China
- Yihan Zhang
  - School of Biomedical and Pharmaceutical Sciences, Guangdong University of Technology, Guangzhou, 510006 China
- Chi Zhang
  - Shanghai Institute of Biological Products, Shanghai, 201403 China
- Yu-Jing Lu
  - School of Biomedical and Pharmaceutical Sciences, Guangdong University of Technology, Guangzhou, 510006 China
  - Smart Medical Innovation Technology Center, Guangdong University of Technology, Guangzhou, 510006 China
- Dongning Liu
  - School of Computer Science and Technology, Guangdong University of Technology, Guangzhou, 510006 China
3
Qian Y, Li X, Wu J, Zhang Q. MCL-DTI: using drug multimodal information and bi-directional cross-attention learning method for predicting drug-target interaction. BMC Bioinformatics 2023; 24:323. [PMID: 37633938 PMCID: PMC10463755 DOI: 10.1186/s12859-023-05447-1] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 01/02/2023] [Accepted: 08/15/2023] [Indexed: 08/28/2023] [Open Access]
Abstract
BACKGROUND: Prediction of drug-target interaction (DTI) is an essential step for drug discovery and drug repositioning. Traditional methods are mostly time-consuming and labor-intensive, and deep learning-based methods address these limitations and are applied to engineering. Most of the current deep learning methods employ representation learning of unimodal information such as SMILES sequences, molecular graphs, or molecular images of drugs. In addition, most methods focus on feature extraction from drug and target alone without fusion learning from drug-target interacting parties, which may lead to insufficient feature representation. MOTIVATION: In order to capture more comprehensive drug features, we utilize both molecular image and chemical features of drugs. The image of the drug mainly carries the structural information and spatial features of the drug, while the chemical information includes its functions and properties. The two complement each other, making the drug representation more effective and complete. Meanwhile, to enhance the interactive feature learning of drug and target, we introduce a bidirectional multi-head attention mechanism to improve the performance of DTI. RESULTS: To enhance feature learning between drugs and targets, we propose a novel deep learning model for the DTI task called MCL-DTI, which uses multimodal information of the drug and learns the representation of drug-target interaction for drug-target prediction. In order to further explore a more comprehensive representation of drug features, this paper first exploits two modalities of drug information, molecular image and chemical text, to represent the drug. We also introduce a bidirectional multi-head cross-attention (MCA) method to learn the interrelationships between drugs and targets. Thus, we build two decoders, each of which includes a multi-head self-attention (MSA) block and an MCA block, for cross-information learning. We use a decoder for the drug and target separately to obtain the interaction feature maps. Finally, we feed these feature maps generated by the decoders into a fusion block for feature extraction and output the prediction results. CONCLUSIONS: MCL-DTI achieves the best results on all three datasets: Human, C. elegans and Davis, including the balanced datasets and an unbalanced dataset. The results on the drug-drug interaction (DDI) task show that MCL-DTI has a strong generalization capability and can be easily applied to other tasks.
Affiliation(s)
- Ying Qian
  - Shanghai Frontiers Science Center of Molecule Intelligent Syntheses, School of Computer Science and Technology, East China Normal University, North Zhongshan Road, Shanghai, 200062 China
- Xinyi Li
  - Shanghai Frontiers Science Center of Molecule Intelligent Syntheses, School of Computer Science and Technology, East China Normal University, North Zhongshan Road, Shanghai, 200062 China
- Jian Wu
  - Shanghai Frontiers Science Center of Molecule Intelligent Syntheses, School of Computer Science and Technology, East China Normal University, North Zhongshan Road, Shanghai, 200062 China
- Qian Zhang
  - Shanghai Frontiers Science Center of Molecule Intelligent Syntheses, School of Computer Science and Technology, East China Normal University, North Zhongshan Road, Shanghai, 200062 China
4
Kalia A, Krishnan D, Hassoun S. CSI: Contrastive data Stratification for Interaction prediction and its application to compound-protein interaction prediction. Bioinformatics 2023; 39:btad456. [PMID: 37490457 PMCID: PMC10423023 DOI: 10.1093/bioinformatics/btad456] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 12/15/2022] [Revised: 05/10/2023] [Accepted: 07/24/2023] [Indexed: 07/27/2023] [Open Access]
Abstract
MOTIVATION: Accurately predicting the likelihood of interaction between two objects (compound-protein sequence, user-item, author-paper, etc.) is a fundamental problem in Computer Science. Current deep-learning models rely on learning accurate representations of the interacting objects. Importantly, relationships between the interacting objects, or features of the interaction, offer an opportunity to partition the data to create multi-views of the interacting objects. The resulting congruent and non-congruent views can then be exploited via contrastive learning techniques to learn enhanced representations of the objects. RESULTS: We present a novel method, Contrastive Stratification for Interaction Prediction (CSI), to stratify (partition) a dataset in a manner that can be exploited via Contrastive Multiview Coding to learn embeddings that maximize the mutual information across congruent data views. CSI assigns a key and multiple views to each data point, where data partitions under a particular key form congruent views of the data. We showcase the effectiveness of CSI by applying it to the compound-protein sequence interaction prediction problem, a pressing problem whose solution promises to expedite drug delivery (drug-protein interaction prediction), metabolic engineering, and synthetic biology (compound-enzyme interaction prediction) applications. We compare CSI with a baseline model that does not utilize data stratification and contrastive learning and show gains in average precision ranging from 13.7% to 39% using compounds and sequences as keys across multiple drug-target and enzymatic datasets, and gains ranging from 16.9% to 63% using reaction features as keys across enzymatic datasets. AVAILABILITY AND IMPLEMENTATION: Code and dataset available at https://github.com/HassounLab/CSI.
Affiliation(s)
- Apurva Kalia
  - Department of Computer Science, Tufts University, Medford, MA 02155, United States
- Soha Hassoun
  - Department of Computer Science, Tufts University, Medford, MA 02155, United States
  - Department of Chemical and Biological Engineering, Tufts University, Medford, MA 02155, United States
5
Abbasi Mesrabadi H, Faez K, Pirgazi J. Drug-target interaction prediction based on protein features, using wrapper feature selection. Sci Rep 2023; 13:3594. [PMID: 36869062 PMCID: PMC9984486 DOI: 10.1038/s41598-023-30026-y] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 11/19/2022] [Accepted: 02/14/2023] [Indexed: 03/05/2023] [Open Access]
Abstract
Drug-target interaction prediction is a vital stage in drug development and can be approached with many methods. Experimental methods that identify these relationships on the basis of clinical remedies are time-consuming, costly, laborious, and complex, which introduces many challenges. Computational methods form a newer group of approaches. New computational methods that are more accurate can be preferable to experimental methods in terms of total cost and time. In this paper, a new computational model to predict drug-target interaction (DTI) is proposed, consisting of three phases: feature extraction, feature selection, and classification. In the feature extraction phase, different features such as EAAC and PSSM are extracted from protein sequences, and fingerprint features are extracted from drugs. These extracted features are then combined. In the next step, because of the large amount of extracted data, a wrapper feature selection method named IWSSR is applied. The selected features are then given to a rotation forest classifier for more efficient prediction. The innovation of our work is that we extract different features and then select features using IWSSR. The accuracy of the rotation forest classifier based on tenfold cross-validation on the gold standard datasets (enzyme, ion channels, G-protein-coupled receptors, nuclear receptors) is 98.12, 98.07, 96.82, and 95.64, respectively. The results of experiments indicate that the proposed model has an acceptable rate in DTI prediction and is compatible with the methods proposed in other papers.
Affiliation(s)
- Hengame Abbasi Mesrabadi
  - Faculty of Computer and Information Technology Engineering, Qazvin Branch, Islamic Azad University, Qazvin, Iran
- Karim Faez
  - Department of Electrical Engineering, Amirkabir University of Technology (Tehran Polytechnic), Tehran, Iran
- Jamshid Pirgazi
  - Department of Computer Engineering, University of Science and Technology of Mazandaran, Behshahr, Iran
6
Peng Y, Zhao S, Zeng Z, Hu X, Yin Z. LGBMDF: A cascade forest framework with LightGBM for predicting drug-target interactions. Front Microbiol 2023; 13:1092467. [PMID: 36687573 PMCID: PMC9849804 DOI: 10.3389/fmicb.2022.1092467] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2022] [Accepted: 12/07/2022] [Indexed: 01/07/2023] [Open Access]
Abstract
Prediction of drug-target interactions (DTIs) plays an important role in drug development. However, traditional laboratory methods to determine DTIs require a lot of time and capital costs. In recent years, many studies have shown that using machine learning methods to predict DTIs can speed up the drug development process and reduce capital costs. An excellent DTI prediction method should have both high prediction accuracy and low computational cost. In this study, we noticed that previous research based on deep forests used XGBoost as the estimator in the cascade. We applied LightGBM instead of XGBoost as the estimator in the cascade forest, and the estimator group was determined experimentally as three LightGBMs and three ExtraTrees. This new model is called LGBMDF. We conducted 5-fold cross-validation on LGBMDF and other state-of-the-art methods using the same dataset, and compared their Sn, Sp, MCC, AUC and AUPR. Finally, we found that our method has better performance and faster calculation speed.
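Three of the metrics named in this abstract (Sn, Sp and MCC) follow directly from the counts of a binary confusion matrix. The sketch below is only an illustration of those standard definitions; the function name and the zero-division fallbacks are assumptions and are not taken from the LGBMDF paper.

```python
import math


def confusion_metrics(tp, fp, tn, fn):
    """Return (Sn, Sp, MCC) from binary confusion-matrix counts.

    Sn  = sensitivity = TP / (TP + FN)
    Sp  = specificity = TN / (TN + FP)
    MCC = Matthews correlation coefficient
    Returning 0.0 for an undefined ratio is an assumption of this sketch.
    """
    sn = tp / (tp + fn) if (tp + fn) else 0.0
    sp = tn / (tn + fp) if (tn + fp) else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sn, sp, mcc


if __name__ == "__main__":
    # Hypothetical counts for one cross-validation fold.
    print(confusion_metrics(tp=40, fp=10, tn=45, fn=5))
```

In a 5-fold setting, these values would typically be computed per fold and then averaged.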
7
Sun G, Dong D, Dong Z, Zhang Q, Fang H, Wang C, Zhang S, Wu S, Dong Y, Wan Y. Drug repositioning: A bibliometric analysis. Front Pharmacol 2022; 13:974849. [PMID: 36225586 PMCID: PMC9549161 DOI: 10.3389/fphar.2022.974849] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 06/21/2022] [Accepted: 08/12/2022] [Indexed: 11/14/2022] [Open Access]
Abstract
Drug repurposing has become an effective approach to drug discovery, as it offers a new way to explore drugs. Based on the Science Citation Index Expanded (SCI-E) and Social Sciences Citation Index (SSCI) databases of the Web of Science core collection, this study presents a bibliometric analysis of drug repurposing publications from 2010 to 2020. Data were cleaned, mined, and visualized using Derwent Data Analyzer (DDA) software. An overview of the history and development trend of the number of publications, major journals, major countries, major institutions, author keywords, major contributors, and major research fields is provided. There were 2,978 publications included in the study. The findings show that the United States leads in this area of research, followed by China, the United Kingdom, and India. The Chinese Academy of Science published the most research studies, and NIH ranked first on the h-index. The Icahn School of Medicine at Mt Sinai leads in the average number of citations per study. Sci Rep, Drug Discov. Today, and Brief. Bioinform. are the three most productive journals evaluated from three separate perspectives, and pharmacology and pharmacy are unquestionably the most commonly used subject categories. Cheng, FX; Mucke, HAM; and Butte, AJ are the top 20 most prolific and influential authors. Keyword analysis shows that in recent years, most research has focused on drug discovery/drug development, COVID-19/SARS-CoV-2/coronavirus, molecular docking, virtual screening, cancer, and other research areas. The hotspots have changed in recent years, with COVID-19/SARS-CoV-2/coronavirus being the most popular topic for current drug repurposing research.
Affiliation(s)
- Guojun Sun
  - Institute of Pharmaceutical Preparations, Department of Pharmacy, Zhejiang University of Technology, Hangzhou, China
- Dashun Dong
  - Institute of Pharmaceutical Preparations, Department of Pharmacy, Zhejiang University of Technology, Hangzhou, China
- Zuojun Dong
  - Institute of Pharmaceutical Preparations, Department of Pharmacy, Zhejiang University of Technology, Hangzhou, China
- Qian Zhang
  - Institute of Pharmaceutical Preparations, Department of Pharmacy, Zhejiang University of Technology, Hangzhou, China
- Hui Fang
  - Institute of Information Resource, Zhejiang University of Technology, Hangzhou, China
- Chaojun Wang
  - Hangzhou Aeronautical Sanatorium for Special Service of Chinese Air Force, Hangzhou, China
- Shaoya Zhang
  - Institute of Pharmaceutical Preparations, Department of Pharmacy, Zhejiang University of Technology, Hangzhou, China
- Shuaijun Wu
  - Institute of Pharmaceutical Preparations, Department of Pharmacy, Zhejiang University of Technology, Hangzhou, China
- Yichen Dong
  - Faculty of Chinese Medicine, Macau University of Science and Technology, Macau, China
- Yuehua Wan
  - Institute of Information Resource, Zhejiang University of Technology, Hangzhou, China
  - *Correspondence: Yuehua Wan
8
Li S, Wang B, Chang M, Hou R, Tian G, Tong L. A Novel Algorithm for Detecting Microsatellite Instability Based on Next-Generation Sequencing Data. Front Oncol 2022; 12:916379. [PMID: 35847873 PMCID: PMC9280483 DOI: 10.3389/fonc.2022.916379] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/09/2022] [Accepted: 05/27/2022] [Indexed: 11/25/2022] [Open Access]
Abstract
Objectives: Microsatellite instability (MSI) is the condition of genetic hypermutability caused by spontaneous acquisition or loss of nucleotides during DNA replication. MSI has been discovered to be a useful immunotherapy biomarker clinically. The main DNA-based method for MSI detection is polymerase chain reaction (PCR) amplification and fragment length analysis, which are costly and laborious. Thus, we developed a novel method to detect MSI based on next-generation sequencing (NGS) data. Methods: We chose six markers of MSI. After alignment and read counting, a histogram was plotted showing the counts of different lengths for each marker. We then designed an algorithm to discover peaks in the generated histograms so that the numbers of peaks discovered in NGS data resembled those of the PCR-based method. Results: We selected nine samples as the training dataset, 101 samples for validation, and 68 samples as the test dataset from Chifeng Municipal Hospital, Inner Mongolia, China. The NGS-based method achieved 100% accuracy for the validation dataset and 98.53% accuracy for the test dataset, in which only one false positive was detected. Conclusions: Accurate MSI judgments were achieved using NGS data, which could provide MSI detection comparable with the gold standard, PCR-based methods.
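The abstract does not describe how peaks are discovered in the per-marker length histograms. As a rough illustration of the general idea only, the sketch below counts local maxima in a read-length histogram. The function name, the `min_fraction` noise threshold and the local-maximum rule are all assumptions of this sketch, not the authors' algorithm.

```python
from collections import Counter


def count_peaks(lengths, min_fraction=0.1):
    """Count local maxima in the read-length histogram of one marker.

    lengths      -- iterable of repeat lengths, one per aligned read
    min_fraction -- hypothetical noise filter: bins below this fraction
                    of the tallest bin are ignored
    """
    hist = Counter(lengths)
    if not hist:
        return 0
    tallest = max(hist.values())
    peaks = 0
    for length in range(min(hist), max(hist) + 1):
        count = hist.get(length, 0)
        if count < min_fraction * tallest:
            continue  # treat low-count bins as noise
        left = hist.get(length - 1, 0)
        right = hist.get(length + 1, 0)
        # Strictly higher than the left bin and at least as high as the
        # right bin, so a flat plateau is counted only once.
        if count > left and count >= right:
            peaks += 1
    return peaks


if __name__ == "__main__":
    # Hypothetical reads for one marker with a single dominant length.
    print(count_peaks([20] * 5 + [21] * 30 + [22] * 6))
```

A real caller would also have to decide how the peak counts of several markers are combined into an MSI call, which this sketch does not attempt.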
Affiliation(s)
- Shijun Li
  - Pathology Department, Chifeng Municipal Hospital, Chifeng, China
- Bo Wang
  - Science Department, Geneis Beijing Co., Ltd., Beijing, China
- Miaomiao Chang
  - Pathology Department, Chifeng Municipal Hospital, Chifeng, China
- Rui Hou
  - Science Department, Geneis Beijing Co., Ltd., Beijing, China
- Geng Tian
  - Science Department, Geneis Beijing Co., Ltd., Beijing, China
  - *Correspondence: Geng Tian; Ling Tong
- Ling Tong
  - Pathology Department, Chifeng Municipal Hospital, Chifeng, China
  - *Correspondence: Geng Tian; Ling Tong
9
Lang J, Zhu R, Sun X, Zhu S, Li T, Shi X, Sun Y, Yang Z, Wang W, Bing P, He B, Tian G. Evaluation of the MGISEQ-2000 Sequencing Platform for Illumina Target Capture Sequencing Libraries. Front Genet 2021; 12:730519. [PMID: 34777467 PMCID: PMC8578046 DOI: 10.3389/fgene.2021.730519] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 06/25/2021] [Accepted: 09/24/2021] [Indexed: 01/19/2023] [Open Access]
Abstract
Illumina is the leading sequencing platform in the next-generation sequencing (NGS) market globally. In recent years, MGI Tech has presented a series of new sequencers, including DNBSEQ-T7, MGISEQ-2000 and MGISEQ-200. As a complex application of NGS, cancer-detecting panels pose increasing demands for the high accuracy and sensitivity of sequencing and data analysis. In this study, we used the same capture DNA libraries constructed based on the Illumina protocol to evaluate the performance of the Illumina Nextseq500 and MGISEQ-2000 sequencing platforms. We found that the two platforms had high consistency in the results of hotspot mutation analysis; more importantly, we found that there was a significant loss of fragments in the 101-133 bp size range on the MGISEQ-2000 sequencing platform for Illumina libraries, but not for the capture DNA libraries prepared based on the MGISEQ protocol. This phenomenon may indicate fragment selection or low fragment ligation efficiency during the DNA circularization step, which is a unique step of the MGISEQ-2000 sequence platform. In conclusion, these different sequencing libraries and corresponding sequencing platforms are compatible with each other, but protocol and platform selection need to be carefully evaluated in combination with research purpose.
Affiliation(s)
- Jidong Lang
  - Bioinformatics and R and D Department, Geneis (Beijing) Co. Ltd., Beijing, China; Qingdao Geneis Institute of Big Data Mining and Precision Medicine, Qingdao, China; Academician Workstation, Changsha Medical University, Changsha, China
- Rongrong Zhu
  - Vascular Surgery Department, Tsinghua University Affiliated Beijing Tsinghua Changgung Hospital, Beijing, China
- Xue Sun
  - Bioinformatics and R and D Department, Geneis (Beijing) Co. Ltd., Beijing, China
- Siyu Zhu
  - Department of Medicine, School of Medicine, University of California at San Diego, La Jolla, CA, United States
- Tianbao Li
  - Bioinformatics and R and D Department, Geneis (Beijing) Co. Ltd., Beijing, China; Qingdao Geneis Institute of Big Data Mining and Precision Medicine, Qingdao, China
- Xiaoli Shi
  - Bioinformatics and R and D Department, Geneis (Beijing) Co. Ltd., Beijing, China
- Yanqi Sun
  - Bioinformatics and R and D Department, Geneis (Beijing) Co. Ltd., Beijing, China
- Zhou Yang
  - Bioinformatics and R and D Department, Geneis (Beijing) Co. Ltd., Beijing, China
- Weiwei Wang
  - Bioinformatics and R and D Department, Geneis (Beijing) Co. Ltd., Beijing, China; Qingdao Geneis Institute of Big Data Mining and Precision Medicine, Qingdao, China
- Pingping Bing
  - Academician Workstation, Changsha Medical University, Changsha, China
- Binsheng He
  - Academician Workstation, Changsha Medical University, Changsha, China
- Geng Tian
  - Bioinformatics and R and D Department, Geneis (Beijing) Co. Ltd., Beijing, China; Qingdao Geneis Institute of Big Data Mining and Precision Medicine, Qingdao, China
10
Zhu X, He X, Kuang L, Chen Z, Lancine C. A Novel Collaborative Filtering Model-Based Method for Identifying Essential Proteins. Front Genet 2021; 12:763153. [PMID: 34745230 PMCID: PMC8566338 DOI: 10.3389/fgene.2021.763153] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/23/2021] [Accepted: 09/13/2021] [Indexed: 11/19/2022] [Open Access]
Abstract
Considering that traditional biological experiments are expensive and time consuming, it is important to develop effective computational models to infer potential essential proteins. In this manuscript, a novel collaborative filtering model-based method called CFMM was proposed, in which, an updated protein–domain interaction (PDI) network was constructed first by applying collaborative filtering algorithm on the original PDI network, and then, through integrating topological features of PDI networks with biological features of proteins, a calculative method was designed to infer potential essential proteins based on an improved PageRank algorithm. The novelties of CFMM lie in construction of an updated PDI network, application of the commodity-customer-based collaborative filtering algorithm, and introduction of the calculation method based on an improved PageRank algorithm, which ensured that CFMM can be applied to predict essential proteins without relying entirely on known protein–domain associations. Simulation results showed that CFMM can achieve reliable prediction accuracies of 92.16, 83.14, 71.37, 63.87, 55.84, and 52.43% in the top 1, 5, 10, 15, 20, and 25% predicted candidate key proteins based on the DIP database, which are remarkably higher than 14 competitive state-of-the-art predictive models as a whole, and in addition, CFMM can achieve satisfactory predictive performances based on different databases with various evaluation measurements, which further indicated that CFMM may be a useful tool for the identification of essential proteins in the future.
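CFMM scores proteins with an improved PageRank algorithm whose details are not given in the abstract. For orientation only, the sketch below shows plain PageRank by power iteration on a small directed graph. It is the textbook algorithm, not the CFMM variant, and the function name, the adjacency-dictionary input format and the parameter defaults are assumptions.

```python
def pagerank(adj, damping=0.85, iters=100):
    """Plain PageRank by power iteration.

    adj maps each node to a list of nodes it links to; every linked
    node must also appear as a key (an assumption of this sketch).
    """
    nodes = list(adj)
    n = len(nodes)
    if n == 0:
        return {}
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iters):
        # Every node receives the teleportation share first.
        new_rank = {v: (1.0 - damping) / n for v in nodes}
        for v in nodes:
            targets = adj[v]
            if targets:
                share = damping * rank[v] / len(targets)
                for w in targets:
                    new_rank[w] += share
            else:
                # Dangling node: spread its rank evenly over all nodes.
                share = damping * rank[v] / n
                for w in nodes:
                    new_rank[w] += share
        rank = new_rank
    return rank


if __name__ == "__main__":
    # Hypothetical toy network: two nodes that each link only to a hub.
    toy = {"hub": ["a", "b"], "a": ["hub"], "b": ["hub"]}
    scores = pagerank(toy)
    print(sorted(scores, key=scores.get, reverse=True))
```

Ranking nodes by score and keeping the top fraction mirrors the way the abstract reports accuracy over the top 1% to 25% of candidates, although the scores in CFMM additionally combine biological features.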
Affiliation(s)
- Xianyou Zhu
  - College of Computer Science and Technology, Hengyang Normal University, Hengyang, China; Hunan Provincial Key Laboratory of Intelligent Information Processing and Application, Hengyang, China
- Xin He
  - College of Computer, Xiangtan University, Xiangtan, China
- Linai Kuang
  - College of Computer, Xiangtan University, Xiangtan, China
- Zhiping Chen
  - College of Computer Engineering and Applied Mathematics, Changsha University, Changsha, China
- Camara Lancine
  - The Social Sciences and Management University of Bamako, Bamako, Mali
11
Drewe J, Küsters E, Hammann F, Kreuter M, Boss P, Schöning V. Modeling Structure-Activity Relationship of AMPK Activation. Molecules 2021; 26:6508. [PMID: 34770917 PMCID: PMC8587902 DOI: 10.3390/molecules26216508] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/27/2021] [Revised: 10/25/2021] [Accepted: 10/26/2021] [Indexed: 12/23/2022] [Open Access]
Abstract
The adenosine monophosphate activated protein kinase (AMPK) is critical in the regulation of important cellular functions such as lipid, glucose, and protein metabolism; mitochondrial biogenesis and autophagy; and cellular growth. In many diseases, such as metabolic syndrome, obesity, diabetes, and also cancer, activation of AMPK is beneficial. Therefore, there is growing interest in AMPK activators that act either by direct action on the enzyme itself or by indirect activation of upstream regulators. Many natural compounds have been described that activate AMPK indirectly. These compounds are usually contained in mixtures with a variety of structurally different other compounds, which in turn can also alter the activity of AMPK via one or more pathways. For these compounds, experiments are complicated, since the required pure substances are often not yet isolated and/or therefore not sufficiently available. Therefore, our goal was to develop a screening tool that could handle the profound heterogeneity in activation pathways of the AMPK. Since machine learning algorithms can model complex (unknown) relationships and patterns, some of these methods (random forest, support vector machines, stochastic gradient boosting, logistic regression, and deep neural network) were applied and validated using a database comprising 904 activating and 799 neutral or inhibiting compounds identified by extensive PubMed literature search and PubChem Bioassay database. All models showed unexpectedly high classification accuracy in training, but more importantly in predicting the unseen test data. These models are therefore suitable tools for rapid in silico screening of established substances or multicomponent mixtures and can be used to identify compounds of interest for further testing.
Affiliation(s)
- Jürgen Drewe
  - Medical Department, Max Zeller Söhne AG, CH-8590 Romanshorn, Switzerland
  - Corresponding author
- Felix Hammann
  - Clinical Pharmacology and Toxicology, Department of General Internal Medicine, Inselspital University Hospital, CH-3012 Bern, Switzerland
- Matthias Kreuter
  - Medical Department, Max Zeller Söhne AG, CH-8590 Romanshorn, Switzerland
- Philipp Boss
  - Max Delbrück Center for Molecular Medicine in the Helmholtz Association, D-13125 Berlin, Germany
- Verena Schöning
  - Clinical Pharmacology and Toxicology, Department of General Internal Medicine, Inselspital University Hospital, CH-3012 Bern, Switzerland
12
Yu L, Shi X, Liu X, Jin W, Jia X, Xi S, Wang A, Li T, Zhang X, Tian G, Sun D. Artificial Intelligence Systems for Diagnosis and Clinical Classification of COVID-19. Front Microbiol 2021; 12:729455. [PMID: 34650534 PMCID: PMC8507494 DOI: 10.3389/fmicb.2021.729455] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/23/2021] [Accepted: 08/17/2021] [Indexed: 01/14/2023] [Open Access]
Abstract
Objectives: COVID-19 is highly infectious and has spread widely worldwide, with more than 159 million confirmed cases and more than 3 million deaths as of May 11, 2021. It has become a serious public health event threatening people's lives and safety. Because of its rapid transmission and long incubation period, medical resources can easily run short soon after cases are discovered. Therefore, we aimed to construct an artificial intelligence framework to rapidly distinguish patients with COVID-19 from common pneumonia and non-pneumonia populations based on computed tomography (CT) images. Furthermore, we explored artificial intelligence (AI) algorithms that integrate CT features and laboratory findings on admission to predict the clinical classification of COVID-19. This will ease the burden on doctors in this emergency period and help them provide timely and appropriate treatment. Methods: We collected all CT images and clinical data of novel coronavirus pneumonia cases in Inner Mongolia, including domestic cases and those imported from abroad; then, three models based on transfer learning were developed to distinguish COVID-19 from other pneumonia and non-pneumonia populations. In addition, CT features and laboratory findings on admission were combined to predict clinical types of COVID-19 using AI algorithms. Lastly, Spearman's correlation test was applied to study correlations between CT characteristics and laboratory findings. Results: Among the three models for distinguishing COVID-19 based on CT, VGG19 showed excellent diagnostic performance, with an area under the curve (AUC) of the receiver operating characteristic (ROC) curve of 95%. Together with laboratory findings, we were able to predict clinical types of COVID-19 with an AUC of the ROC curve of 90%. Furthermore, biochemical markers such as C-reactive protein (CRP), LYM, and lactic dehydrogenase (LDH) were identified and correlated with CT features.
Conclusion: We developed an AI model that identifies patients who are positive for COVID-19 from the results of the first CT examination after admission and, combined with laboratory findings, predicts disease progression. In addition, we obtained important clinical characteristics that correlated with the CT image features. Together, our AI system could rapidly diagnose COVID-19 and predict clinical types to help clinicians perform appropriate clinical management.
Affiliation(s)
- Lan Yu
- Clinical Medical Research Center/Inner Mongolia Key Laboratory of Gene Regulation of the Metabolic Diseases, Inner Mongolia People's Hospital, Hohhot, China.,Department of Endocrinology, Inner Mongolia People's Hospital, Hohhot, China
| | - Xiaoli Shi
- Geneis (Beijing) Co., Ltd., Beijing, China.,Qingdao Geneis Institute of Big Data Mining and Precision Medicine, Qingdao, China
| | - Xiaoling Liu
- Department of Otolaryngology, Inner Mongolia People's Hospital, Hohhot, China
| | - Wen Jin
- Clinical Medical Research Center/Inner Mongolia Key Laboratory of Gene Regulation of the Metabolic Diseases, Inner Mongolia People's Hospital, Hohhot, China
| | - Xiaoqing Jia
- Baotou City Hospital for Infectious Diseases, Baotou, China
| | - Shuxue Xi
- Geneis (Beijing) Co., Ltd., Beijing, China.,Qingdao Geneis Institute of Big Data Mining and Precision Medicine, Qingdao, China
| | - Ailan Wang
- Geneis (Beijing) Co., Ltd., Beijing, China.,Qingdao Geneis Institute of Big Data Mining and Precision Medicine, Qingdao, China
| | - Tianbao Li
- Geneis (Beijing) Co., Ltd., Beijing, China.,Qingdao Geneis Institute of Big Data Mining and Precision Medicine, Qingdao, China
| | - Xiao Zhang
- Clinical Medical Research Center/Inner Mongolia Key Laboratory of Gene Regulation of the Metabolic Diseases, Inner Mongolia People's Hospital, Hohhot, China
| | - Geng Tian
- Geneis (Beijing) Co., Ltd., Beijing, China.,Qingdao Geneis Institute of Big Data Mining and Precision Medicine, Qingdao, China
| | - Dejun Sun
- Department of Pulmonary and Critical Care Medicine/Key Laboratory of National Health Commission for the Diagnosis & Treatment of COPD, Inner Mongolia People's Hospital, Hohhot, China
| |
|
13
|
Identification of drug-target interactions via multi-view graph regularized link propagation model. Neurocomputing 2021. [DOI: 10.1016/j.neucom.2021.05.100] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 02/06/2023]
|
14
|
Zhao L, Li Y, Wang Y, Gao Q, Ge Z, Sun X, Li Y. Development and Validation of a Nomogram for the Prediction of Hospital Mortality of Patients With Encephalopathy Caused by Microbial Infection: A Retrospective Cohort Study. Front Microbiol 2021; 12:737066. [PMID: 34489922 PMCID: PMC8417384 DOI: 10.3389/fmicb.2021.737066] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 07/06/2021] [Accepted: 08/02/2021] [Indexed: 12/12/2022] Open
Abstract
Background Hospital mortality is high for patients with encephalopathy caused by microbial infection. Microbial infections often induce sepsis, and the resulting damage to the central nervous system (CNS) is defined as sepsis-associated encephalopathy (SAE). However, the relationship between pathogenic microorganisms, especially gut microbiota, and the prognosis of SAE patients is still unclear, and there is no clinical tool to predict hospital mortality for SAE patients. The study aimed to explore the relationship between pathogenic microorganisms and the hospital mortality of SAE patients and to develop a nomogram for the prediction of hospital mortality in SAE patients. Methods The study is a retrospective cohort study. The lasso regression model was used for data dimension reduction and feature selection. A model of hospital mortality in SAE patients was developed by multivariable Cox regression analysis. Calibration and discrimination were used to assess the performance of the nomogram. Decision curve analysis (DCA) was used to evaluate the clinical utility of the model. Results Our study did not find that intestinal infection or microorganisms of the gastrointestinal tract (such as Escherichia coli) were related to the prognosis of SAE. Lasso regression and multivariate Cox regression indicated that factors including respiratory failure, lactate, international normalized ratio (INR), albumin, SpO2, temperature, and renal replacement therapy were significantly correlated with hospital mortality. The AUC of the nomogram (0.812) was higher than that of the Simplified Acute Physiology Score II (SAPS II; 0.745), indicating excellent discrimination. DCA demonstrated that using the nomogram or including the prognostic signature score status was better than using no nomogram or using SAPS II for predicting hospital mortality. Conclusion The prognosis of SAE patients was not associated with intestinal or microbial infections.
We developed a nomogram that predicts hospital mortality in patients with SAE according to clinical data. The nomogram exhibited excellent discrimination and calibration capacity, favoring its clinical utility.
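The discrimination comparison in this abstract (AUC of 0.812 versus 0.745) rests on the area under the ROC curve. As a minimal sketch, assuming binary outcome labels and one risk score per patient, the AUC can be computed as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The function name `roc_auc` and the toy inputs are illustrative assumptions, not the authors' code or data.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form).

    labels: sequence of 0/1 outcomes (1 = event, e.g. in-hospital death).
    scores: sequence of predicted risks, higher meaning more likely to be 1.
    """
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0      # positive case correctly ranked higher
            elif p == n:
                wins += 0.5      # ties count as half a win
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Toy, made-up example: four patients, two events.
    print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

The pairwise form is quadratic in the number of cases, which is acceptable for a sketch; a rank-based implementation would be preferable for large cohorts.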
Affiliation(s)
- Lina Zhao
- Emergency Department, State Key Laboratory of Complex Severe and Rare Diseases, Peking Union Medical College Hospital, Peking Union Medical College, Chinese Academy of Medical Sciences, Beijing, China.,Department of Critical Care Medicine, Chifeng Municipal Hospital, Chifeng Clinical Medical College of Inner Mongolia Medical University, Chifeng, China
| | - Yun Li
- Department of Anesthesiology, Chifeng Municipal Hospital, Chifeng Clinical Medical College of Inner Mongolia Medical University, Chifeng, China
| | - Yunying Wang
- Department of Critical Care Medicine, Chifeng Municipal Hospital, Chifeng Clinical Medical College of Inner Mongolia Medical University, Chifeng, China
| | - Qian Gao
- Department of Neurology, Yidu Central Hospital Affiliated to Weifang Medical University, Weifang, China
| | - Zengzheng Ge
- Emergency Department, State Key Laboratory of Complex Severe and Rare Diseases, Peking Union Medical College Hospital, Peking Union Medical College, Chinese Academy of Medical Sciences, Beijing, China
| | - Xibo Sun
- Department of Neurology, Yidu Central Hospital Affiliated to Weifang Medical University, Weifang, China
| | - Yi Li
- Emergency Department, State Key Laboratory of Complex Severe and Rare Diseases, Peking Union Medical College Hospital, Peking Union Medical College, Chinese Academy of Medical Sciences, Beijing, China
| |
|
15
|
Prediction of Drug-Target Interactions by Combining Dual-Tree Complex Wavelet Transform with Ensemble Learning Method. Molecules 2021; 26:5359. [PMID: 34500792 PMCID: PMC8433937 DOI: 10.3390/molecules26175359] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/17/2021] [Revised: 08/27/2021] [Accepted: 08/30/2021] [Indexed: 11/17/2022] Open
Abstract
Identification of drug–target interactions (DTIs) is vital for drug discovery. However, traditional biological approaches have some unavoidable shortcomings, such as being time-consuming and expensive. Therefore, there is an urgent need to develop novel and effective computational methods to predict DTIs in order to shorten the development cycles of new drugs. In this study, we present a novel computational approach to identify DTIs, which uses protein sequence information and the dual-tree complex wavelet transform (DTCWT). More specifically, a position-specific scoring matrix (PSSM) was computed from the target protein sequence to obtain its evolutionary information. Then, DTCWT was used to extract representative features from the PSSM, which were combined with the drug fingerprint features to form the feature descriptors. Finally, these descriptors were passed to the Rotation Forest (RoF) model for classification. A 5-fold cross validation (CV) was adopted on four datasets (Enzyme, Ion Channel, GPCRs (G-protein-coupled receptors), and NRs (Nuclear Receptors)) to validate the proposed model; our method yielded high average accuracies of 89.21%, 85.49%, 81.02%, and 74.44%, respectively. To further verify the performance of our model, we compared the RoF classifier with two state-of-the-art algorithms: the support vector machine (SVM) and the k-nearest neighbor (KNN) classifier. We also compared it with some other published methods. Moreover, the prediction results for the independent dataset further indicated that our method is effective for predicting potential DTIs. Thus, we believe that our method is suitable for facilitating drug discovery and development.
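The average accuracies reported in this abstract come from 5-fold cross validation. The following is a minimal sketch of that evaluation loop, assuming a generic classifier supplied as two callables; `fit`, `predict`, and the helper names are hypothetical placeholders and do not reproduce the Rotation Forest model or the DTCWT features used in the paper.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross validation."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and the number of samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)          # reproducible shuffle
    folds = [indices[i::k] for i in range(k)]     # k nearly equal folds
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def cross_validated_accuracy(features, labels, fit, predict, k=5, seed=0):
    """Average test accuracy over k folds.

    fit(train_features, train_labels) -> model           (assumed interface)
    predict(model, test_features)     -> predicted labels (assumed interface)
    """
    accuracies = []
    for train, test in k_fold_indices(len(labels), k, seed):
        model = fit([features[i] for i in train], [labels[i] for i in train])
        predicted = predict(model, [features[i] for i in test])
        correct = sum(p == labels[i] for p, i in zip(predicted, test))
        accuracies.append(correct / len(test))
    return sum(accuracies) / k
```

Each sample appears in exactly one test fold, so every drug–target pair is scored once by a model that did not see it during training.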
|
16
|
Sajadi SZ, Zare Chahooki MA, Gharaghani S, Abbasi K. AutoDTI++: deep unsupervised learning for DTI prediction by autoencoders. BMC Bioinformatics 2021; 22:204. [PMID: 33879050 PMCID: PMC8056558 DOI: 10.1186/s12859-021-04127-2] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 02/17/2021] [Accepted: 04/09/2021] [Indexed: 12/12/2022] Open
Abstract
BACKGROUND Drug-target interaction (DTI) plays a vital role in drug discovery. Identifying drug-target interactions through wet-lab experiments is costly, laborious, and time-consuming. Therefore, computational prediction of drug-target interactions is an essential task in the drug discovery process. Computational methods can also reduce the search space by proposing potential drugs already validated in wet-lab experiments. Recently, deep learning-based methods for drug-target interaction prediction have received more attention. Traditionally, the performance of DTI prediction methods depends heavily on additional information, such as the protein sequence and the molecular structure of the drug, as well as on deep supervised learning. RESULTS This paper proposes a method based on deep unsupervised learning for drug-target interaction prediction called AutoDTI++. The proposed method includes three steps. The first step is to pre-process the interaction matrix. Since the interaction matrix is sparse, we addressed its sparsity using drug fingerprints. Then, in the second step, the AutoDTI approach is introduced. In the third step, we post-process the output of the AutoDTI model. CONCLUSIONS Experimental results have shown that we were able to improve the prediction performance. To this end, the proposed method has been compared to other algorithms using the same reference datasets. Five repetitions of tenfold cross-validation on gold standard datasets (Nuclear Receptors, GPCRs, Ion channels, and Enzymes) show that the proposed method achieves good performance with high accuracy.
Affiliation(s)
| | | | - Sajjad Gharaghani
- Laboratory of Bioinformatics and Drug Design (LBD), Institute of Biochemistry and Biophysics, University of Tehran, Tehran, Iran
| | - Karim Abbasi
- Laboratory of Bioinformatics and Drug Design (LBD), Institute of Biochemistry and Biophysics, University of Tehran, Tehran, Iran
| |
|
17
|
Application of network link prediction in drug discovery. BMC Bioinformatics 2021; 22:187. [PMID: 33845763 PMCID: PMC8042985 DOI: 10.1186/s12859-021-04082-y] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 08/15/2020] [Accepted: 03/16/2021] [Indexed: 11/10/2022] Open
Abstract
Background Technological and research advances have produced large volumes of biomedical data. When represented as a network (graph), these data become useful for modeling entities and interactions in biological and similar complex systems. In the field of network biology and network medicine, there is a particular interest in predicting results from drug–drug, drug–disease, and protein–protein interactions to advance the speed of drug discovery. Existing data and modern computational methods make it possible to identify potentially beneficial and harmful interactions and therefore to narrow drug trials ahead of actual clinical trials. Such automated data-driven investigation relies on machine learning techniques. However, traditional machine learning approaches require extensive preprocessing of the data, which makes them impractical for large datasets. This study presents a wide range of machine learning methods for predicting outcomes from biomedical interactions and compares the performance of traditional methods with that of more recent network-based approaches. Results We applied 32 different network-based machine learning models to five commonly available biomedical datasets and evaluated their performance using three evaluation metrics: AUROC, AUPR, and F1-score. To do so, we cast link prediction as a binary classification problem, treating existing links as positive examples and randomly sampling negative examples from the set of non-existent links. After experimental evaluation, we found that Prone, ACT, and $LRW_5$ are the top three performers on all five datasets. Conclusions This work presents a comparative evaluation of well-known network-based machine learning algorithms for predicting network links, with applications in the prediction of drug-target and drug–drug interactions. Our work is helpful in guiding researchers in the appropriate selection of machine learning methods for pharmaceutical tasks. Supplementary Information The online version contains supplementary material available at 10.1186/s12859-021-04082-y.
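The abstract describes recasting link prediction as binary classification by labeling observed links as positives and sampling unobserved node pairs as negatives. A minimal sketch of that dataset construction follows, assuming an undirected graph and an equal number of negatives and positives; both assumptions, and the function names, are illustrative rather than taken from the paper.

```python
import random


def sample_negative_links(nodes, positive_links, n_samples, seed=0):
    """Sample node pairs that have no observed link (undirected graph assumed)."""
    positives = {frozenset(link) for link in positive_links}
    ordered = sorted(nodes)
    # Enumerate every unordered pair that is not an observed link.
    candidates = [
        (u, v)
        for i, u in enumerate(ordered)
        for v in ordered[i + 1:]
        if frozenset((u, v)) not in positives
    ]
    if n_samples > len(candidates):
        raise ValueError("not enough unlinked pairs to sample from")
    return random.Random(seed).sample(candidates, n_samples)


def build_link_dataset(nodes, positive_links, seed=0):
    """Return (pairs, labels) with label 1 for observed links and 0 for sampled non-links."""
    positive_links = [tuple(link) for link in positive_links]
    negatives = sample_negative_links(nodes, positive_links, len(positive_links), seed)
    pairs = positive_links + negatives
    labels = [1] * len(positive_links) + [0] * len(negatives)
    return pairs, labels
```

Enumerating all candidate pairs is quadratic in the number of nodes, so for large networks rejection sampling of random pairs would be the more practical choice. Note also that a sampled "negative" may be a true but unobserved interaction, which is an inherent limitation of this setup.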
|
18
|
Quancard J, Bach A, Cox B, Craft R, Finsinger D, Guéret SM, Hartung IV, Laufer S, Messinger J, Sbardella G, Koolman HF. The European Federation for Medicinal Chemistry and Chemical Biology (EFMC) Best Practice Initiative: Phenotypic Drug Discovery. ChemMedChem 2021; 16:1736-1739. [PMID: 33825353 DOI: 10.1002/cmdc.202100041] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 01/18/2021] [Indexed: 12/16/2022]
Abstract
Phenotypic drug discovery has a long track record of delivering innovative drugs and has received renewed attention in the last few years. The promise of this approach, however, comes with several challenges that should be addressed to avoid wasting time and resources on drugs with undesired modes of action or, worse, false-positive hits. In this set of best practices, we go over the essential steps of phenotypic drug discovery and provide guidance on how to increase the chance of success in identifying validated and relevant chemical starting points for optimization: selecting the right assay, selecting the right compound screening library and developing appropriate hit validation assays. Then, we highlight the importance of initiating studies to determine the mode of action of the identified hits early and present the current state of the art.
Affiliation(s)
- Jean Quancard
- Global Discovery Chemistry, Novartis Institute For Biomedical Research, Novartis Pharma AG, Novartis Campus, 4056, Basel, Switzerland
| | - Anders Bach
- Department of Drug Design and Pharmacology, Faculty of Health and Medical Sciences, University of Copenhagen, Universitetsparken 2, 2100, Copenhagen, Denmark
| | - Brian Cox
- Pharmaceutical Chemistry, School of Life Sciences, University of Sussex, Falmer, East Sussex, BN1 9RH, UK
| | - Russell Craft
- Medicinal Chemistry, Symeres, Kadijk 3, 9747AT, Groningen, The Netherlands
| | - Dirk Finsinger
- Medicinal Chemistry, Global R&D, Merck Healthcare KGaA, Frankfurter Straße 250, 64293, Darmstadt, Germany
| | - Stéphanie M Guéret
- Medicinal Chemistry, Research and Early Development Cardiovascular, Renal and Metabolism BioPharmaceuticals R&D, AstraZeneca, Gothenburg, Sweden
| | - Ingo V Hartung
- Medicinal Chemistry, Global R&D, Merck Healthcare KGaA, Frankfurter Straße 250, 64293, Darmstadt, Germany
| | - Stefan Laufer
- Pharmaceutical&Medicinal Chemistry, Institute of Pharmacy & Biochemistry, Tübingen Center for Academic Drug Discovery, Auf der Morgenstelle 8, 72070 Tuebingen, Germany
| | - Josef Messinger
- Medicine Design, Orionpharma, Orionintie 1, 02101, Espoo, Finland
| | - Gianluca Sbardella
- Department of Pharmacy, Epigenetic Med Chem Lab., University of Salerno, Via Giovanni Paolo II 132, 84084, Fisciano-SA, Italy
| | - Hannes F Koolman
- Medicinal Chemistry, Boehringer Ingelheim Pharma GmbH & Co. KG, Birkendorfer Straße 65, 88397, Biberach an der Riss, Germany
| |
|
19
|
Ning Z, Yu S, Zhao Y, Sun X, Wu H, Yu X. Identification of miRNA-Mediated Subpathways as Prostate Cancer Biomarkers Based on Topological Inference in a Machine Learning Process Using Integrated Gene and miRNA Expression Data. Front Genet 2021; 12:656526. [PMID: 33841512 PMCID: PMC8024646 DOI: 10.3389/fgene.2021.656526] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/21/2021] [Accepted: 03/02/2021] [Indexed: 11/25/2022] Open
Abstract
Accurately identifying classification biomarkers for distinguishing between normal and cancer samples is challenging. Additionally, the reproducibility of single-molecule biomarkers is limited by the existence of heterogeneous patient subgroups and differences in the sequencing techniques used to collect patient data. In this study, we developed a method to identify robust biomarkers (i.e., miRNA-mediated subpathways) associated with prostate cancer based on normal prostate samples and cancer samples from a dataset from The Cancer Genome Atlas (TCGA; n = 546) and datasets from the Gene Expression Omnibus (GEO) database (n = 139 and n = 90, with the latter being a cell line dataset). We also obtained 10 other cancer datasets to evaluate the performance of the method. We propose a multi-omics data integration strategy for identifying classification biomarkers using a machine learning method that involves reassigning topological weights to the genes using a directed random walk (DRW)-based method. A global directed pathway network (GDPN) was constructed based on the significantly differentially expressed target genes of the significantly differentially expressed miRNAs, which allowed us to identify the robust biomarkers in the form of miRNA-mediated subpathways (miRNAs). The activity value of each miRNA-mediated subpathway was calculated by integrating multiple types of data, which included the expression of the miRNA and the miRNAs’ target genes and GDPN topological information. Finally, we identified the high-frequency miRNA-mediated subpathways involved in prostate cancer using a support vector machine (SVM) model. The results demonstrated that we obtained robust biomarkers of prostate cancer, which could classify prostate cancer and normal samples. Our method outperformed seven other methods, and many of the identified biomarkers were associated with known clinical treatments.
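The abstract mentions reassigning topological weights to genes with a directed random walk (DRW) but does not give the update rule. As a hedged sketch, the code below uses a common random-walk-with-restart form, w(t+1) = (1 - r) * M^T w(t) + r * w(0), where M is the row-normalized adjacency matrix of the directed network. The restart probability of 0.7, the handling of nodes without outgoing edges, and all names are assumptions for illustration, not the authors' implementation.

```python
def directed_random_walk(edges, initial_weights, restart=0.7, tol=1e-10, max_iter=1000):
    """Random walk with restart on a directed graph.

    edges: list of (source, target) pairs; every node must be a key of initial_weights.
    initial_weights: dict node -> non-negative starting weight (e.g. a differential
        expression statistic); normalized internally to sum to 1.
    Returns a dict node -> stationary weight (sums to 1).
    """
    nodes = sorted(initial_weights)
    total = sum(initial_weights.values())
    if total <= 0:
        raise ValueError("initial weights must have a positive sum")
    w0 = {n: initial_weights[n] / total for n in nodes}

    out_degree = {n: 0 for n in nodes}
    for src, _dst in edges:
        out_degree[src] += 1

    w = dict(w0)
    for _ in range(max_iter):
        nxt = {n: restart * w0[n] for n in nodes}
        # Each node passes its weight equally along its outgoing edges.
        for src, dst in edges:
            nxt[dst] += (1 - restart) * w[src] / out_degree[src]
        # Assumption: weight sitting on nodes with no outgoing edge is
        # redistributed according to the initial weights, so total mass stays 1.
        dangling = sum(w[n] for n in nodes if out_degree[n] == 0)
        for n in nodes:
            nxt[n] += (1 - restart) * dangling * w0[n]
        delta = sum(abs(nxt[n] - w[n]) for n in nodes)
        w = nxt
        if delta < tol:
            break
    return w
```

With uniform starting weights on a chain a → b → c, downstream nodes accumulate more weight than upstream ones, which is how a walk of this kind lets network topology influence gene weights.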
Affiliation(s)
- Ziyu Ning
- The Higher Educational Key Laboratory for Measuring and Control Technology and Instrumentations of Heilongjiang Province, Harbin University of Science and Technology, Harbin, China.,School of Medical Informatics, Daqing Campus, Harbin Medical University, Daqing, China
| | - Shuang Yu
- The Higher Educational Key Laboratory for Measuring and Control Technology and Instrumentations of Heilongjiang Province, Harbin University of Science and Technology, Harbin, China
| | - Yanqiao Zhao
- The Higher Educational Key Laboratory for Measuring and Control Technology and Instrumentations of Heilongjiang Province, Harbin University of Science and Technology, Harbin, China
| | - Xiaoming Sun
- The Higher Educational Key Laboratory for Measuring and Control Technology and Instrumentations of Heilongjiang Province, Harbin University of Science and Technology, Harbin, China
| | - Haibin Wu
- The Higher Educational Key Laboratory for Measuring and Control Technology and Instrumentations of Heilongjiang Province, Harbin University of Science and Technology, Harbin, China
| | - Xiaoyang Yu
- The Higher Educational Key Laboratory for Measuring and Control Technology and Instrumentations of Heilongjiang Province, Harbin University of Science and Technology, Harbin, China
| |
|
20
|
Li Z, Yao Y, Cheng X, Chen Q, Zhao W, Ma S, Li Z, Zhou H, Li W, Fei T. A computational framework of host-based drug repositioning for broad-spectrum antivirals against RNA viruses. iScience 2021; 24:102148. [PMID: 33665567 PMCID: PMC7900436 DOI: 10.1016/j.isci.2021.102148] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 09/15/2020] [Revised: 01/11/2021] [Accepted: 02/01/2021] [Indexed: 12/18/2022] Open
Abstract
RNA viruses are responsible for many zoonotic diseases that pose great challenges for public health. Effective therapeutics against these viral infections remain limited. Here, we deployed a computational framework for host-based drug repositioning to predict potential antiviral drugs from 2,352 approved drugs and 1,062 natural compounds found in herbs of traditional Chinese medicine. By systematically interrogating public genetic screening data, we comprehensively cataloged host dependency genes (HDGs) that are indispensable for successful viral infection, corresponding to 10 families and 29 species of RNA viruses. We then utilized these HDGs as potential drug targets and interrogated extensive drug-target interactions through database retrieval, literature mining, and de novo prediction using artificial intelligence-based algorithms. Repurposed drugs or natural compounds were proposed against many viral pathogens, such as coronaviruses including severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), flaviviruses, and influenza viruses. This study helps to prioritize promising drug candidates for in-depth evaluation against these virus-related diseases.
Affiliation(s)
- Zexu Li
- College of Life and Health Sciences, Northeastern University, Shenyang 110819, People's Republic of China
- Key Laboratory of Data Analytics and Optimization for Smart Industry (Northeastern University), Ministry of Education, Shenyang 110819, People's Republic of China
| | - Yingjia Yao
- College of Life and Health Sciences, Northeastern University, Shenyang 110819, People's Republic of China
- Key Laboratory of Data Analytics and Optimization for Smart Industry (Northeastern University), Ministry of Education, Shenyang 110819, People's Republic of China
| | - Xiaolong Cheng
- Center for Genetic Medicine Research, Children's National Hospital, 111 Michigan Avenue NW, Washington, DC 20010, USA
- Department of Genomics and Precision Medicine, George Washington University, 111 Michigan Avenue NW, Washington, DC 20010, USA
| | - Qing Chen
- Center for Genetic Medicine Research, Children's National Hospital, 111 Michigan Avenue NW, Washington, DC 20010, USA
- Department of Genomics and Precision Medicine, George Washington University, 111 Michigan Avenue NW, Washington, DC 20010, USA
| | - Wenchang Zhao
- College of Life and Health Sciences, Northeastern University, Shenyang 110819, People's Republic of China
- Key Laboratory of Data Analytics and Optimization for Smart Industry (Northeastern University), Ministry of Education, Shenyang 110819, People's Republic of China
| | - Shixin Ma
- College of Life and Health Sciences, Northeastern University, Shenyang 110819, People's Republic of China
- Key Laboratory of Data Analytics and Optimization for Smart Industry (Northeastern University), Ministry of Education, Shenyang 110819, People's Republic of China
| | - Zihan Li
- College of Life and Health Sciences, Northeastern University, Shenyang 110819, People's Republic of China
- Key Laboratory of Data Analytics and Optimization for Smart Industry (Northeastern University), Ministry of Education, Shenyang 110819, People's Republic of China
| | - Hu Zhou
- School of Pharmaceutical Sciences, Fujian Provincial Key Laboratory of Innovative Drug Target Research, Xiamen University, Xiamen, Fujian 361102, China
- High Throughput Drug Screening Platform, Xiamen University, Xiamen, Fujian 361102, China
| | - Wei Li
- Center for Genetic Medicine Research, Children's National Hospital, 111 Michigan Avenue NW, Washington, DC 20010, USA
- Department of Genomics and Precision Medicine, George Washington University, 111 Michigan Avenue NW, Washington, DC 20010, USA
| | - Teng Fei
- College of Life and Health Sciences, Northeastern University, Shenyang 110819, People's Republic of China
- Key Laboratory of Data Analytics and Optimization for Smart Industry (Northeastern University), Ministry of Education, Shenyang 110819, People's Republic of China
| |
|
21
|
Wang J, Yang Z, Chen C, Xu Y, Wang H, Liu B, Zhang W, Jiang Y. Comprehensive circRNA Expression Profile and Construction of circRNAs-Related ceRNA Network in a Mouse Model of Autism. Front Genet 2021; 11:623584. [PMID: 33679870 PMCID: PMC7928284 DOI: 10.3389/fgene.2020.623584] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 10/30/2020] [Accepted: 12/23/2020] [Indexed: 12/27/2022] Open
Abstract
Autism is a common disease that seriously affects quality of life. The role of circular RNAs (circRNAs) in autism remains largely unexplored. We aimed to profile circRNA expression and construct a circRNA-based competing endogenous RNA (ceRNA) network in autism. Valproic acid was used to establish an in vivo mouse model of autism. A total of 1,059 differentially expressed circRNAs (477 upregulated and 582 downregulated) were identified in the autism group by RNA sequencing. The expression of novel_circ_015779 and novel_circ_035247 was detected by real-time PCR. A ceRNA network based on the altered circRNAs was established, with 9,715 nodes and 150,408 edges. Module analysis was conducted, followed by GO and KEGG pathway enrichment analysis. The top three modules were all correlated with autism-related pathways, including the “TGF-beta signaling pathway,” “Notch signaling pathway,” “MAPK signaling pathway,” “long term depression,” and “thyroid hormone signaling pathway.” The present study reveals novel circRNA-involved mechanisms in the pathogenesis of autism.
Affiliation(s)
- Ji Wang
- Yangzhou Maternal and Child Health Hospital, Yangzhou, China.,Harbin Children's Hospital, Harbin, China
| | - Zhongxiu Yang
- Xuzhou Children's Hospital, Xuzhou Medical University, Xuzhou, China
| | - Canming Chen
- Yangzhou Maternal and Child Health Hospital, Yangzhou, China
| | - Yang Xu
- Yangzhou Maternal and Child Health Hospital, Yangzhou, China
| | - Hongguang Wang
- School of Civil Engineering, Northeast Forestry University, Harbin, China
| | - Bing Liu
- Translational Medicine Research and Cooperation Center of Northern China, Heilongjiang Academy of Medical Sciences, Harbin, China
| | - Wei Zhang
- Translational Medicine Research and Cooperation Center of Northern China, Heilongjiang Academy of Medical Sciences, Harbin, China
| | - Yanan Jiang
- Translational Medicine Research and Cooperation Center of Northern China, Heilongjiang Academy of Medical Sciences, Harbin, China.,Department of Pharmacology (State-Province Key Laboratories of Biomedicine- Pharmaceutics of China, Key Laboratory of Cardiovascular Research, Ministry of Education), College of Pharmacy, Harbin Medical University, Harbin, China
| |
|
22
|
Ma S, Guo Z, Wang B, Yang M, Yuan X, Ji B, Wu Y, Chen S. A Computational Framework to Identify Biomarkers for Glioma Recurrence and Potential Drugs Targeting Them. Front Genet 2021; 12:832627. [PMID: 35116059 PMCID: PMC8804649 DOI: 10.3389/fgene.2021.832627] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/10/2021] [Accepted: 12/29/2021] [Indexed: 11/13/2022] Open
Abstract
Background: Recurrence is still a major obstacle to the successful treatment of gliomas. Understanding the underlying mechanisms of recurrence may help in developing new drugs to combat glioma recurrence. This study provides a strategy to discover new drugs for recurrent gliomas based on drug perturbation-induced gene expression changes. Methods: The RNA-seq data of 511 low grade glioma primary tumor samples (LGG-P), 18 low grade glioma recurrent tumor samples (LGG-R), 155 glioblastoma multiforme primary tumor samples (GBM-P), and 13 glioblastoma multiforme recurrent tumor samples (GBM-R) were downloaded from the TCGA database. DESeq2, key driver analysis, and weighted gene correlation network analysis (WGCNA) were conducted to identify differentially expressed genes (DEGs), key driver genes, and coexpression networks for the LGG-P vs LGG-R and GBM-P vs GBM-R pairs. Then, the CREEDS database was used to find potential drugs that could reverse the DEGs and key drivers. Results: We identified 75 upregulated and 130 downregulated genes between LGG-P and LGG-R samples, which were mainly enriched in human papillomavirus (HPV) infection, the PI3K-Akt signaling pathway, the Wnt signaling pathway, and ECM-receptor interaction. A total of 262 key driver genes were obtained, with frizzled class receptor 8 (FZD8), guanine nucleotide-binding protein subunit gamma-12 (GNG12), and G protein subunit β2 (GNB2) as the top hub genes. By screening the CREEDS database, we found 4 drugs (Paclitaxel, 6-benzyladenine, Erlotinib, Cidofovir) that could downregulate the expression of upregulated genes and 5 drugs (Fenofibrate, Oxaliplatin, Bilirubin, Nutlins, Valproic acid) that could upregulate the expression of downregulated genes. These drugs may have potential in combating recurrence of gliomas. Conclusion: We proposed a time-saving strategy based on drug perturbation-induced gene expression changes to find new drugs that may have the potential to treat recurrent gliomas.
Affiliation(s)
- Shuzhi Ma
- Department of Oncology, The First Affiliated Hospital of Guangdong Pharmaceutical University, Guangzhou, China
- Department of Pathology, Zhujiang Hospital, Southern Medical University, Guangzhou, China
| | - Zhen Guo
- Academician Workstation, Changsha Medical University, Changsha, China
- Hunan Key Laboratory of the Research and Development of Novel Pharmaceutical Preparations, Changsha Medical University, Changsha, China
| | - Bo Wang
- Geneis (Beijing) Co., Ltd., Beijing, China
| | - Min Yang
- Geneis (Beijing) Co., Ltd., Beijing, China
| | - Binbin Ji
- Geneis (Beijing) Co., Ltd., Beijing, China
| | - Yan Wu
- Geneis (Beijing) Co., Ltd., Beijing, China
| | - Size Chen
- Department of Oncology, The First Affiliated Hospital of Guangdong Pharmaceutical University, Guangzhou, China
- Guangdong Provincial Engineering Research Center for Esophageal Cancer Precise Therapy, The First Affiliated Hospital of Guangdong Pharmaceutical University, Guangzhou, China
- Central Laboratory, The First Affiliated Hospital of Guangdong Pharmaceutical University, Guangzhou, China
- *Correspondence: Size Chen
| |
|
23
|
Yu D, Liu G, Zhao N, Liu X, Guo M. FPSC-DTI: drug-target interaction prediction based on feature projection fuzzy classification and super cluster fusion. Mol Omics 2020; 16:583-591. [PMID: 33084702 DOI: 10.1039/d0mo00062k] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 11/21/2022]
Abstract
Identifying drug-target interactions (DTIs) is an important part of drug discovery and development. However, identifying DTIs is a complex process that is time-consuming, costly, and often inefficient, with a low success rate, especially with wet-lab experimental methods. Computational methods based on drug repositioning and network pharmacology can effectively overcome these shortcomings. In this paper, we develop a new fusion method, called FPSC-DTI, that fuses feature projection fuzzy classification (FP) and super cluster classification (SC) to predict DTIs. In experiments, FPSC-DTI achieved a mean percentile ranking (MPR) of 0.043, 0.084, 0.072, and 0.146 on the enzyme, ion channel (IC), G-protein-coupled receptor (GPCR), and nuclear receptor (NR) datasets, respectively, and its AUC values exceeded 0.969 on all four datasets. Compared with other methods, FPSC-DTI obtained better predictive performance and was more robust.
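For readers unfamiliar with the MPR metric, the sketch below shows one common way to compute a mean percentile ranking, where lower values are better. It assumes that, for each drug, candidate targets are ranked by predicted score and the percentile position of each known target is averaged; the abstract does not give the exact definition used by FPSC-DTI, so the function name and the tie handling here are illustrative assumptions.

```python
def mean_percentile_ranking(scores, truth):
    """Average percentile rank of known targets (0 = ranked first, 1 = ranked last).

    scores: list of rows, one per drug, with a predicted score per target.
    truth:  same shape, 1 where the drug-target pair is a known interaction.
    Assumes at least two targets per drug and at least one known interaction.
    """
    percentiles = []
    for row_scores, row_truth in zip(scores, truth):
        n = len(row_scores)
        # Target indices sorted from highest to lowest score (ties keep input order)
        order = sorted(range(n), key=lambda j: -row_scores[j])
        rank_of = {j: r for r, j in enumerate(order)}
        for j, is_true in enumerate(row_truth):
            if is_true:
                percentiles.append(rank_of[j] / (n - 1))
    return sum(percentiles) / len(percentiles)


# Toy example with two drugs and three targets (made-up scores)
scores = [[0.9, 0.1, 0.5],
          [0.2, 0.8, 0.4]]
truth = [[1, 0, 0],
         [0, 0, 1]]
print(mean_percentile_ranking(scores, truth))
```

Under this definition, a random ranking gives an MPR of about 0.5, which puts the reported values of 0.043 to 0.146 in context.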
Affiliation(s)
- Donghua Yu
- School of Computer Science and Technology, Harbin Institute of Technology, Harbin, 150001, China.
| |
|
24
|
Identification of Drug–Target Interactions via Dual Laplacian Regularized Least Squares with Multiple Kernel Fusion. Knowl Based Syst 2020. [DOI: 10.1016/j.knosys.2020.106254] [Citation(s) in RCA: 71] [Impact Index Per Article: 17.8] [Indexed: 12/29/2022]
|
25
|
Abstract
Network theory provides one of the most potent analysis tools for the study of complex systems. In this paper, we illustrate the network-based perspective in drug research and how it is coherent with the new paradigm of drug discovery. We first present data sources from which networks are built, then show some examples of how the networks can be used to investigate drug-related systems. A section is devoted to network-based inference applications, i.e., prediction methods based on interactomes, that can be used to identify putative drug-target interactions without resorting to 3D modeling. Finally, we present some aspects of Boolean network dynamics, anticipating that it might become a very potent modeling framework to develop in silico screening protocols able to simulate phenotypic screening experiments. We conclude that network applications integrated with machine learning and 3D modeling methods will become an indispensable tool for computational drug discovery in the coming years.
Affiliation(s)
- Maurizio Recanatini
- Department of Pharmacy and Biotechnology, Alma Mater Studiorum—University of Bologna, Via Belmeloro 6, I-40126 Bologna, Italy
| | - Chiara Cabrelle
- Department of Pharmacy and Biotechnology, Alma Mater Studiorum—University of Bologna, Via Belmeloro 6, I-40126 Bologna, Italy
| |
|
26
|
Abstract
The current global COVID-19 pandemic, caused by the SARS-CoV-2 virus, has already inflicted insurmountable damage on both human lives and the global economy. There is an immediate need to identify effective drugs to contain the disastrous virus outbreak. Global efforts are already underway on a war footing to identify the best drug combination to address the disease. In this review, an attempt has been made to understand the SARS-CoV-2 life cycle, and based on this information, potential druggable targets against SARS-CoV-2 are summarized. Also, the strategies for ongoing and future drug discovery against the SARS-CoV-2 virus are outlined. Given the urgency of finding a definitive cure, ongoing drug repurposing efforts being carried out by various organizations are also described. The unprecedented crisis requires extraordinary efforts from the scientific community to effectively address the issue and prevent further loss of human lives and health.
Affiliation(s)
- Ambrish Saxena
- Indian Institute of Technology Tirupati, Tirupati, India
| |
|
27
|
Thafar MA, Olayan RS, Ashoor H, Albaradei S, Bajic VB, Gao X, Gojobori T, Essack M. DTiGEMS+: drug-target interaction prediction using graph embedding, graph mining, and similarity-based techniques. J Cheminform 2020; 12:44. [PMID: 33431036 PMCID: PMC7325230 DOI: 10.1186/s13321-020-00447-2] [Citation(s) in RCA: 51] [Impact Index Per Article: 12.8] [Received: 12/10/2019] [Accepted: 06/16/2020] [Indexed: 12/14/2022] Open
Abstract
In silico prediction of drug–target interactions is a critical phase in the sustainable drug development process, especially when the research focus is to capitalize on the repositioning of existing drugs. Developing such computational methods is not an easy task but is much needed, as current methods that predict potential drug–target interactions suffer from high false-positive rates. Here we introduce DTiGEMS+, a computational method that predicts Drug–Target interactions using Graph Embedding, graph Mining, and Similarity-based techniques. DTiGEMS+ combines similarity-based as well as feature-based approaches, and models the identification of novel drug–target interactions as a link prediction problem in a heterogeneous network. DTiGEMS+ constructs the heterogeneous network by augmenting the known drug–target interaction graph with two other complementary graphs, namely drug–drug similarity and target–target similarity. DTiGEMS+ combines different computational techniques to provide the final drug–target prediction; these techniques include graph embeddings, graph mining, and machine learning. DTiGEMS+ integrates multiple drug–drug similarities and target–target similarities into the final heterogeneous graph construction after applying a similarity selection procedure as well as a similarity fusion algorithm. Using four benchmark datasets, we show that DTiGEMS+ substantially improves prediction performance compared to other state-of-the-art in silico methods developed to predict drug–target interactions by achieving the highest average AUPR across all datasets (0.92), which reduces the error rate by 33.3% relative to the second-best performing model in the state-of-the-art methods comparison.
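The relationship between the reported AUPR and the 33.3% error reduction can be checked with a short calculation. This sketch assumes the error rate is defined as 1 − AUPR, which the abstract does not state explicitly; the function names are hypothetical, and the baseline value printed below is back-calculated from the two reported figures rather than a number given in the abstract.

```python
def relative_error_reduction(metric_best, metric_baseline):
    """Relative drop in error, assuming error = 1 - metric (an assumption)."""
    err_best = 1.0 - metric_best
    err_base = 1.0 - metric_baseline
    return (err_base - err_best) / err_base


def implied_baseline(metric_best, reduction):
    """Baseline metric implied by a given relative error reduction.

    Solves (err_base - err_best) / err_base = reduction for err_base.
    """
    err_best = 1.0 - metric_best
    err_base = err_best / (1.0 - reduction)
    return 1.0 - err_base


# Reported figures: average AUPR 0.92 and a 33.3% error reduction
baseline = implied_baseline(0.92, 1.0 / 3.0)
print(round(baseline, 2))
print(round(relative_error_reduction(0.92, baseline), 3))
```

Under this assumed definition, the two reported numbers are mutually consistent with a second-best average AUPR of roughly 0.88.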
Affiliation(s)
- Maha A Thafar
- Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Kingdom of Saudi Arabia; College of Computers and Information Technology, Taif University, Taif, Kingdom of Saudi Arabia
| | - Rawan S Olayan
- Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Kingdom of Saudi Arabia; The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
| | - Haitham Ashoor
- Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Kingdom of Saudi Arabia; The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
| | - Somayah Albaradei
- Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Kingdom of Saudi Arabia; Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah, Kingdom of Saudi Arabia
| | - Vladimir B Bajic
- Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Kingdom of Saudi Arabia
| | - Xin Gao
- Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Kingdom of Saudi Arabia
| | - Takashi Gojobori
- Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Kingdom of Saudi Arabia; Biological and Environmental Sciences and Engineering Division (BESE), King Abdullah University of Science and Technology (KAUST), Thuwal, Kingdom of Saudi Arabia
| | - Magbubah Essack
- Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Kingdom of Saudi Arabia.
| |
|
28
|
Luo H, Li M, Yang M, Wu FX, Li Y, Wang J. Biomedical data and computational models for drug repositioning: a comprehensive review. Brief Bioinform 2020; 22:1604-1619. [PMID: 32043521 DOI: 10.1093/bib/bbz176] [Citation(s) in RCA: 83] [Impact Index Per Article: 20.8] [Received: 10/23/2019] [Revised: 12/07/2019] [Accepted: 12/26/2019] [Indexed: 12/16/2022] Open
Abstract
Drug repositioning can drastically decrease the cost and duration of traditional drug research and development while avoiding the occurrence of unforeseen adverse events. With the rapid advancement of high-throughput technologies and the explosion of various biological and medical data, computational drug repositioning methods have become appealing and powerful techniques to systematically identify potential drug-target interactions and drug-disease interactions. In this review, we first summarize the available biomedical data and public databases related to drugs, diseases and targets. Then, we discuss existing drug repositioning approaches and group them based on their underlying computational models, consisting of classical machine learning, network propagation, matrix factorization and completion, and deep learning based models. We also comprehensively analyze common standard data sets and evaluation metrics used in drug repositioning, and give a brief comparison of various prediction methods on the gold standard data sets. Finally, we conclude our review with a brief discussion of challenges in computational drug repositioning, which include reducing the noise and incompleteness of biomedical data, ensembling various computational drug repositioning methods, designing reliable negative-sample selection methods, developing new techniques to deal with the data sparseness problem, constructing large-scale and comprehensive benchmark data sets, and analyzing and explaining the underlying mechanisms of predicted interactions.
Affiliation(s)
- Huimin Luo
- School of Computer Science and Engineering at Central South University
| | - Min Li
- School of Computer Science and Engineering at Central South University
| | - Mengyun Yang
- School of Computer Science and Engineering at Central South University
| | - Fang-Xiang Wu
- College of Engineering and the Department of Computer Science at University of Saskatchewan, Saskatoon, Canada
| | - Yaohang Li
- Department of Computer Science at Old Dominion University, Norfolk, USA
| | - Jianxin Wang
- School of Computer Science and Engineering at Central South University
| |
|
29
|
Ji Y, Mishra RK, Davuluri RV. In silico analysis of alternative splicing on drug-target gene interactions. Sci Rep 2020; 10:134. [PMID: 31924844 PMCID: PMC6954184 DOI: 10.1038/s41598-019-56894-x] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 07/23/2019] [Accepted: 12/18/2019] [Indexed: 12/24/2022] Open
Abstract
Identifying and evaluating the right target are the most important factors in the early drug discovery phase. Most studies focus on one protein, ignoring the multiple splice variants or protein isoforms, which might contribute to unexpected therapeutic activity or adverse side effects. Here, we present a computational analysis of cancer drug-target interactions affected by alternative splicing. By integrating information from publicly available databases, we curated 883 FDA-approved or investigational-stage small-molecule cancer drugs that target 1,434 different genes, with an average of 5.22 protein isoforms per gene. Of these, 618 genes have ≥5 annotated protein isoforms. By analyzing the interactions with binding pocket information, we found that 76% of drugs either miss a potential target isoform or target other isoforms with varied expression in multiple normal tissues. We present sequence- and structure-level alignments at the isoform level and make this information publicly available for all the curated drugs. Structure-level analysis showed differences between isoforms in the size, shape and electrostatic parameters of ligand binding pocket architectures. Our results emphasize how potentially important isoform-level interactions could be missed by solely focusing on the canonical isoform, and suggest that on- and off-target effects at the isoform level should be investigated to enhance the productivity of drug-discovery research.
Affiliation(s)
- Yanrong Ji
- Division of Health and Biomedical Informatics, Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL, USA
| | - Rama K Mishra
- The Center for Molecular Innovation and Drug Discovery, Northwestern University, Evanston, IL, USA; Department of Biochemistry and Molecular Genetics, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA; Department of Pharmacology, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
| | - Ramana V Davuluri
- Division of Health and Biomedical Informatics, Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL, USA.
| |
|
30
|
Karasev D, Sobolev B, Lagunin A, Filimonov D, Poroikov V. Prediction of Protein-Ligand Interaction Based on the Positional Similarity Scores Derived from Amino Acid Sequences. Int J Mol Sci 2019; 21:24. [PMID: 31861473 PMCID: PMC6981593 DOI: 10.3390/ijms21010024] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 10/31/2019] [Revised: 12/13/2019] [Accepted: 12/16/2019] [Indexed: 12/14/2022] Open
Abstract
The affinity of different drug-like ligands to multiple protein targets reflects general chemical–biological interactions. Computational methods estimating such interactions analyze the available information about the structure of the targets, ligands, or both. Prediction of protein–ligand interactions based on pairwise sequence alignment provides reasonable accuracy if the ligands’ specificity coincides well with the phylogenetic taxonomy of the proteins. Methods using multiple alignment require an accurate match of functionally significant residues. Such conditions may not be met in the case of diverged protein families. To overcome these limitations, we propose an approach based on the analysis of local sequence similarity within the set of analyzed proteins. The positional scores, calculated by sequence fragment comparisons, are used as input data for a Bayesian classifier. Our approach provides a prediction accuracy comparable to or exceeding that of other methods. It was demonstrated on the popular Gold Standard test sets, which present different degrees of sequence heterogeneity and range from a group including different protein families to more specific groups. A reasonable prediction accuracy was also found for protein kinases, which display weak relationships between sequence phylogeny and inhibitor specificity. Thus, our method can be applied to the broad area of protein–ligand interactions.
Affiliation(s)
- Dmitry Karasev
- Department of Bioinformatics, Institute of Biomedical Chemistry, Moscow 119121, Russia
- Correspondence:
| | - Boris Sobolev
- Department of Bioinformatics, Institute of Biomedical Chemistry, Moscow 119121, Russia
| | - Alexey Lagunin
- Department of Bioinformatics, Institute of Biomedical Chemistry, Moscow 119121, Russia
- Department of Bioinformatics, Russian National Research Medical University, Moscow 117997, Russia
| | - Dmitry Filimonov
- Department of Bioinformatics, Institute of Biomedical Chemistry, Moscow 119121, Russia
| | - Vladimir Poroikov
- Department of Bioinformatics, Institute of Biomedical Chemistry, Moscow 119121, Russia
| |
|
31
|
Ding Y, Tang J, Guo F. Identification of drug–target interactions via fuzzy bipartite local model. Neural Comput Appl 2019. [DOI: 10.1007/s00521-019-04569-z] [Citation(s) in RCA: 64] [Impact Index Per Article: 12.8] [Indexed: 12/31/2022]
|