1
Ivanov SM, Rudik AV, Lagunin AA, Filimonov DA, Poroikov VV. DIGEP-Pred 2.0: A web application for predicting drug-induced cell signaling and gene expression changes. Mol Inform 2024:e202400032. [PMID: 38979651 DOI: 10.1002/minf.202400032] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/27/2024] [Revised: 05/16/2024] [Accepted: 06/14/2024] [Indexed: 07/10/2024]
Abstract
The analysis of drug-induced gene expression profiles (DIGEP) is widely used to estimate the potential therapeutic and adverse drug effects as well as the molecular mechanisms of drug action. However, the corresponding experimental data are absent for many existing drugs and drug-like compounds. To solve this problem, we created the DIGEP-Pred 2.0 web application, which allows predicting DIGEP and potential drug targets from the structural formula of drug-like compounds. It is based on the combined use of structure-activity relationships (SARs) and network analysis. SAR models were created using PASS (Prediction of Activity Spectra for Substances) technology for data from the Comparative Toxicogenomics Database (CTD), the Connectivity Map (CMap) for the prediction of DIGEP, and PubChem and ChEMBL for the prediction of molecular mechanisms of action (MoA). Using only the structural formula of a compound, the user can obtain information on potential gene expression changes in several cell lines and drug targets, which are potential master regulators responsible for the observed DIGEP. The mean accuracy of prediction calculated by leave-one-out cross validation was 86.5% for 13377 genes and 94.8% for 2932 proteins (CTD data), and it was 97.9% for 2170 MoAs. SAR models (mean accuracy 87.5%) were also created for CMap data on the MCF7, PC3, and HL60 cell lines with different threshold values for the logarithm of fold changes: 0.5, 0.7, 1, 1.5, and 2. Additionally, data on pathways (KEGG, Reactome), Gene Ontology biological processes, and diseases (DisGeNet) enriched by the predicted genes, together with the estimation of target-master regulators based on OmniPath data, are also provided. The DIGEP-Pred 2.0 web application is freely available at https://www.way2drug.com/digep-pred.
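To make the role of the fold-change thresholds in this abstract concrete, the sketch below labels genes as up- or down-regulated at each of the listed cut-offs. It is only an illustration under stated assumptions: the function name `label_genes`, the symmetric use of the threshold for both directions, and the example values are hypothetical and are not taken from DIGEP-Pred or CMap.

```python
# Thresholds on the logarithm of fold change, as listed in the abstract.
THRESHOLDS = (0.5, 0.7, 1.0, 1.5, 2.0)


def label_genes(log_fc, threshold):
    """Label genes whose log fold change passes the threshold.

    Assumption: the same cut-off is applied symmetrically, so a gene is
    "up" if log_fc >= threshold and "down" if log_fc <= -threshold.
    Genes between the two cut-offs are left unlabeled.
    """
    labels = {}
    for gene, value in log_fc.items():
        if value >= threshold:
            labels[gene] = "up"
        elif value <= -threshold:
            labels[gene] = "down"
    return labels


# Hypothetical log fold changes for three genes in one cell line.
example = {"GENE_A": 1.2, "GENE_B": -0.6, "GENE_C": 0.1}
for t in THRESHOLDS:
    print(t, label_genes(example, t))
```

A stricter threshold yields fewer labeled genes, which is presumably why separate models are reported for each cut-off.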
Affiliation(s)
- Sergey M Ivanov
  - Department of Bioinformatics, Institute of Biomedical Chemistry, Pogodinskaya Street, 10 bldg. 8, Moscow, 119121, Russia
  - Department of Bioinformatics, Pirogov Russian National Research Medical University, Ostrovityanova Street, 1, Moscow, 117997, Russia
- Anastasia V Rudik
  - Department of Bioinformatics, Institute of Biomedical Chemistry, Pogodinskaya Street, 10 bldg. 8, Moscow, 119121, Russia
- Alexey A Lagunin
  - Department of Bioinformatics, Institute of Biomedical Chemistry, Pogodinskaya Street, 10 bldg. 8, Moscow, 119121, Russia
  - Department of Bioinformatics, Pirogov Russian National Research Medical University, Ostrovityanova Street, 1, Moscow, 117997, Russia
- Dmitry A Filimonov
  - Department of Bioinformatics, Institute of Biomedical Chemistry, Pogodinskaya Street, 10 bldg. 8, Moscow, 119121, Russia
- Vladimir V Poroikov
  - Department of Bioinformatics, Institute of Biomedical Chemistry, Pogodinskaya Street, 10 bldg. 8, Moscow, 119121, Russia
2
Liu J, Gui Y, Rao J, Sun J, Wang G, Ren Q, Qu N, Niu B, Chen Z, Sheng X, Wang Y, Zheng M, Li X. In silico off-target profiling for enhanced drug safety assessment. Acta Pharm Sin B 2024; 14:2927-2941. [PMID: 39027254 PMCID: PMC11252485 DOI: 10.1016/j.apsb.2024.03.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2023] [Revised: 02/21/2024] [Accepted: 02/29/2024] [Indexed: 07/20/2024] Open
Abstract
Ensuring drug safety in the early stages of drug development is crucial to avoid costly failures in subsequent phases. However, the economic burden associated with detecting drug off-targets and potential side effects through in vitro safety screening and animal testing is substantial. Drug off-target interactions, along with the adverse drug reactions they induce, are significant factors affecting drug safety. To assess the liability of candidate drugs, we developed an artificial intelligence model for the precise prediction of compound off-target interactions, leveraging multi-task graph neural networks. The outcomes of off-target predictions can serve as representations for compounds, enabling the differentiation of drugs under various ATC codes and the classification of compound toxicity. Furthermore, the predicted off-target profiles are employed in adverse drug reaction (ADR) enrichment analysis, facilitating the inference of potential ADRs for a drug. Using the withdrawn drug Pergolide as an example, we elucidate the mechanisms underlying ADRs at the target level, contributing to the exploration of the potential clinical relevance of newly predicted off-target interactions. Overall, our work facilitates the early assessment of compound safety/toxicity based on off-target identification, deduces potential ADRs of drugs, and ultimately promotes the secure development of drugs.
Affiliation(s)
- Jin Liu
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China
- Yike Gui
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China
  - Nanjing University of Chinese Medicine, Nanjing 210023, China
- Jingxin Rao
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Jingjing Sun
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Gang Wang
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Qun Ren
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China
  - Nanjing University of Chinese Medicine, Nanjing 210023, China
- Ning Qu
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Buying Niu
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Zhiyi Chen
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
  - School of Pharmaceutical Science and Technology, Hangzhou Institute for Advanced Study, Hangzhou 330106, China
- Xia Sheng
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Yitian Wang
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Mingyue Zheng
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
  - Nanjing University of Chinese Medicine, Nanjing 210023, China
  - School of Pharmaceutical Science and Technology, Hangzhou Institute for Advanced Study, Hangzhou 330106, China
- Xutong Li
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
3
Liu G, Seal S, Arevalo J, Liang Z, Carpenter AE, Jiang M, Singh S. Learning Molecular Representation in a Cell. arXiv 2024:arXiv:2406.12056v2. [PMID: 38947938 PMCID: PMC11213146] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/02/2024]
Abstract
Predicting drug efficacy and safety in vivo requires information on biological responses (e.g., cell morphology and gene expression) to small molecule perturbations. However, current molecular representation learning methods do not provide a comprehensive view of cell states under these perturbations and struggle to remove noise, hindering model generalization. We introduce the Information Alignment (InfoAlign) approach to learn molecular representations through the information bottleneck method in cells. We integrate molecules and cellular response data as nodes into a context graph, connecting them with weighted edges based on chemical, biological, and computational criteria. For each molecule in a training batch, InfoAlign optimizes the encoder's latent representation with a minimality objective to discard redundant structural information. A sufficiency objective decodes the representation to align with different feature spaces from the molecule's neighborhood in the context graph. We demonstrate that the proposed sufficiency objective for alignment is tighter than existing encoder-based contrastive methods. Empirically, we validate representations from InfoAlign in two downstream tasks: molecular property prediction against up to 19 baseline methods across four datasets, plus zero-shot molecule-morphology matching.
4
Das P, Mazumder DH. K1K2NN: A novel multi-label classification approach based on neighbors for predicting COVID-19 drug side effects. Comput Biol Chem 2024; 110:108066. [PMID: 38579549 DOI: 10.1016/j.compbiolchem.2024.108066] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/05/2023] [Revised: 03/12/2024] [Accepted: 04/01/2024] [Indexed: 04/07/2024]
Abstract
COVID-19, a novel ailment, has received comparatively fewer drugs for its treatment. Side Effects (SE) of a COVID-19 drug could cause long-term health issues. Hence, SE prediction is essential in COVID-19 drug development. Efficient models are also needed to predict COVID-19 drug SE since most existing research has proposed many classifiers to predict SE for diseases other than COVID-19. This work proposes a novel classifier based on neighbors named K1 K2 Nearest Neighbors (K1K2NN) to predict the SE of the COVID-19 drug from 17 molecules' descriptors and the chemical 1D structure of the drugs. The model is implemented based on the proposition that chemically similar drugs may be assigned similar drug SE, and co-occurring SE may be assigned to chemically similar drugs. The K1K2NN model chooses the first K1 neighbors to the test drug sample by calculating its similarity with the train drug samples. It then assigns the test sample with the SE label having the majority count on the SE labels of these K1 neighbor drugs obtained through a voting mechanism. The model then calculates the SE-SE similarity using the Jaccard similarity measure from the SE co-occurrence values. Finally, the model chooses the most similar K2 SE neighbors for those SE determined by the K1 neighbor drugs and assigns these SE to that test drug sample. The proposed K1K2NN model has showcased promising performance with the highest accuracy of 97.53% on chemical 1D drug structure and outperforms the state-of-the-art multi-label classifiers. In addition, we demonstrate the successful application of the proposed model on gene expression signature datasets, which aided in evaluating its performance and confirming its accuracy and robustness.
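The two-stage neighbor procedure described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say which drug-drug similarity is used, so Tanimoto similarity on feature sets is an assumption, as are the strict-majority voting rule, the function names, and the toy data.

```python
from collections import Counter


def jaccard(a, b):
    """Jaccard (Tanimoto) similarity between two sets; 0.0 if both are empty."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def k1k2nn_predict(test_features, train_features, train_labels, k1=3, k2=1):
    """Predict a set of side effects (SE) for one test drug.

    train_features: list of feature sets, one per training drug (assumed encoding).
    train_labels:   list of SE sets, aligned with train_features.
    """
    # Stage 1: keep the K1 training drugs most similar to the test drug.
    ranked = sorted(
        range(len(train_features)),
        key=lambda i: (-jaccard(test_features, train_features[i]), i),
    )
    neighbours = ranked[:k1]

    # Vote: keep SE labels carried by a strict majority of the K1 neighbours
    # (the exact voting rule is an assumption).
    votes = Counter(se for i in neighbours for se in train_labels[i])
    voted = {se for se, count in votes.items() if count > k1 / 2}

    # SE-SE similarity: Jaccard over the sets of training drugs sharing each SE.
    drugs_with = {}
    for i, labels in enumerate(train_labels):
        for se in labels:
            drugs_with.setdefault(se, set()).add(i)

    # Stage 2: for each voted SE, also assign its K2 most similar SEs.
    predicted = set(voted)
    for se in voted:
        others = sorted(
            (o for o in drugs_with if o != se),
            key=lambda o: (-jaccard(drugs_with[se], drugs_with[o]), o),
        )
        predicted.update(others[:k2])
    return predicted


# Toy data (hypothetical): feature sets stand in for structural descriptors.
train_features = [{1, 2, 3}, {1, 2, 4}, {7, 8, 9}, {1, 3, 5}, {20, 21}]
train_labels = [{"nausea", "headache"}, {"nausea"}, {"rash"}, {"nausea"},
                {"nausea", "headache"}]
print(k1k2nn_predict({1, 2, 3}, train_features, train_labels, k1=3, k2=1))
```

In the toy data, the first stage votes only for "nausea"; the second stage adds "headache" because the two side effects co-occur in the training drugs.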
Affiliation(s)
- Pranab Das
  - Department of Computer Science & Engineering, National Institute of Technology Nagaland, Chumukedima, Dimapur, Nagaland 797103, India
- Dilwar Hussain Mazumder
  - Department of Computer Science & Engineering, National Institute of Technology Nagaland, Chumukedima, Dimapur, Nagaland 797103, India.
5
Seal S, Carreras-Puigvert J, Singh S, Carpenter AE, Spjuth O, Bender A. From pixels to phenotypes: Integrating image-based profiling with cell health data as BioMorph features improves interpretability. Mol Biol Cell 2024; 35:mr2. [PMID: 38170589 PMCID: PMC10916876 DOI: 10.1091/mbc.e23-08-0298] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/01/2023] [Revised: 12/07/2023] [Accepted: 12/22/2023] [Indexed: 01/05/2024] Open
Abstract
Cell Painting assays generate morphological profiles that are versatile descriptors of biological systems and have been used to predict in vitro and in vivo drug effects. However, Cell Painting features extracted from classical software such as CellProfiler are based on statistical calculations and often not readily biologically interpretable. In this study, we propose a new feature space, which we call BioMorph, that maps these Cell Painting features with readouts from comprehensive Cell Health assays. We validated that the resulting BioMorph space effectively connected compounds not only with the morphological features associated with their bioactivity but with deeper insights into phenotypic characteristics and cellular processes associated with the given bioactivity. The BioMorph space revealed the mechanism of action for individual compounds, including dual-acting compounds such as emetine, an inhibitor of both protein synthesis and DNA replication. Overall, BioMorph space offers a biologically relevant way to interpret the cell morphological features derived using software such as CellProfiler and to generate hypotheses for experimental validation.
Affiliation(s)
- Srijit Seal
  - Imaging Platform, Broad Institute of MIT and Harvard, Cambridge MA 02142
  - Yusuf Hamied Department of Chemistry, University of Cambridge, Cambridge CB2 1EW, United Kingdom
- Jordi Carreras-Puigvert
  - Department of Pharmaceutical Biosciences and Science for Life Laboratory, Uppsala University, 752 37 Uppsala, Sweden
- Shantanu Singh
  - Imaging Platform, Broad Institute of MIT and Harvard, Cambridge MA 02142
- Anne E Carpenter
  - Imaging Platform, Broad Institute of MIT and Harvard, Cambridge MA 02142
- Ola Spjuth
  - Department of Pharmaceutical Biosciences and Science for Life Laboratory, Uppsala University, 752 37 Uppsala, Sweden
- Andreas Bender
  - Yusuf Hamied Department of Chemistry, University of Cambridge, Cambridge CB2 1EW, United Kingdom
6
Seal S, Spjuth O, Hosseini-Gerami L, García-Ortegón M, Singh S, Bender A, Carpenter AE. Insights into Drug Cardiotoxicity from Biological and Chemical Data: The First Public Classifiers for FDA Drug-Induced Cardiotoxicity Rank. J Chem Inf Model 2024; 64:1172-1186. [PMID: 38300851 PMCID: PMC10900289 DOI: 10.1021/acs.jcim.3c01834] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/14/2023] [Revised: 01/11/2024] [Accepted: 01/16/2024] [Indexed: 02/03/2024]
Abstract
Drug-induced cardiotoxicity (DICT) is a major concern in drug development, accounting for 10-14% of postmarket withdrawals. In this study, we explored the capabilities of chemical and biological data to predict cardiotoxicity, using the recently released DICTrank data set from the United States FDA. We found that such data, including protein targets, especially those related to ion channels (e.g., hERG), physicochemical properties (e.g., electrotopological state), and peak concentration in plasma offer strong predictive ability for DICT. Compounds annotated with mechanisms of action such as cyclooxygenase inhibition could distinguish between most-concern and no-concern DICT. Cell Painting features for ER stress discerned most-concern cardiotoxic from nontoxic compounds. Models based on physicochemical properties provided substantial predictive accuracy (AUCPR = 0.93). With the availability of omics data in the future, using biological data promises enhanced predictability and deeper mechanistic insights, paving the way for safer drug development. All models from this study are available at https://broad.io/DICTrank_Predictor.
Affiliation(s)
- Srijit Seal
  - Imaging Platform, Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, United States
  - Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, U.K.
- Ola Spjuth
  - Department of Pharmaceutical Biosciences and Science for Life Laboratory, Uppsala University, Box 591, SE-75124 Uppsala, Sweden
- Layla Hosseini-Gerami
  - Ignota Labs, The Bradfield Centre, Cambridge Science Park, County Hall, Westminster Bridge Road, Cambridge CB4 0GA, U.K.
- Miguel García-Ortegón
  - Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, U.K.
- Shantanu Singh
  - Imaging Platform, Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, United States
- Andreas Bender
  - Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, U.K.
- Anne E. Carpenter
  - Imaging Platform, Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, United States
7
Yu L, Xu Z, Qiu W, Xiao X. MSDSE: Predicting drug-side effects based on multi-scale features and deep multi-structure neural network. Comput Biol Med 2024; 169:107812. [PMID: 38091725 DOI: 10.1016/j.compbiomed.2023.107812] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/07/2023] [Revised: 11/10/2023] [Accepted: 12/03/2023] [Indexed: 02/08/2024]
Abstract
Unexpected side effects may arise during both the research stage and the post-marketing stage of drugs. These events lead to drug development failure and can even endanger patients' health. Thus, it is essential to recognize unknown drug-side effects. Most existing in silico methods find the answer from the association network or similarity network of drugs while ignoring the drug-intrinsic attributes. The limitation is that they can only handle drugs in the maturation stage. To be suitable for early drug-side effect screening, we conceive a multi-structural deep learning framework, MSDSE, which jointly considers the multi-scale features derived from the drug. MSDSE can jointly learn SMILES sequence-based word embedding, substructure-based molecular fingerprint, and chemical structure-based graph embedding. In the preprocessing stage of MSDSE, we project all features to an abstract space with the same dimension. MSDSE builds a bi-level channel strategy, including a convolutional neural network module with an Inception structure and a multi-head Self-Attention module, to learn and integrate multi-modal features from local to global perspectives. Finally, MSDSE regards the prediction of drug-side effects as pair-wise learning and outputs the pair-wise probability of drug-side effects through the inner product operation. MSDSE is evaluated and analyzed on benchmark datasets and outperforms the baseline models. We also set up an ablation study to explain the rationality of the feature approach and model structure. Moreover, we select part of the model's prediction results for a case study to reveal its actual capability. The original data are available at http://github.com/yuliyi/MSDSE.
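The final pair-wise scoring step mentioned in this abstract can be illustrated with a short sketch. The abstract states only that an inner product is used; mapping the score to a probability with a logistic sigmoid is an assumption here, and the function name `pair_probability` and the example vectors are hypothetical.

```python
import math


def pair_probability(drug_vec, effect_vec):
    """Score one drug/side-effect pair from two embeddings of equal length.

    The inner product follows the abstract; squashing it with a sigmoid
    to obtain a value in (0, 1) is an assumption of this sketch.
    """
    if len(drug_vec) != len(effect_vec):
        raise ValueError("embeddings must share the same dimension")
    score = sum(d * e for d, e in zip(drug_vec, effect_vec))
    return 1.0 / (1.0 + math.exp(-score))


# Hypothetical embeddings already projected to a common dimension.
drug = [1.0, 2.0]
effect = [0.5, 0.25]
print(pair_probability(drug, effect))  # sigmoid of an inner product of 1.0
```

Because both embeddings live in the same space, scoring every drug against every side effect reduces to a matrix product followed by the same element-wise squashing.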
Affiliation(s)
- Liyi Yu
  - School of Information Engineering, Jingdezhen Ceramic University, Jingdezhen, 333403, China
- Zhaochun Xu
  - School of Information Engineering, Jingdezhen Ceramic University, Jingdezhen, 333403, China
- Wangren Qiu
  - School of Information Engineering, Jingdezhen Ceramic University, Jingdezhen, 333403, China
- Xuan Xiao
  - School of Information Engineering, Jingdezhen Ceramic University, Jingdezhen, 333403, China.
8
Zaib S, Rana N, Ali HS, Hussain N, Areeba, Ogaly HA, Al-Zahrani FAM, Khan I. Discovery of druggable potent inhibitors of serine proteases and farnesoid X receptor by ligand-based virtual screening to obstruct SARS-CoV-2. Int J Biol Macromol 2023; 253:127379. [PMID: 37838109 DOI: 10.1016/j.ijbiomac.2023.127379] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 05/08/2023] [Revised: 09/12/2023] [Accepted: 10/09/2023] [Indexed: 10/16/2023]
Abstract
The coronavirus, a member of the Coronavirinae subfamily, is an RNA virus with over 40 variations that can infect humans, non-human mammals and birds. There are seven types of human coronaviruses, including SARS-CoV-2, which is responsible for the recent COVID-19 pandemic. The current study is focused on the identification of drug molecules for the treatment of COVID-19 by targeting human proteases like transmembrane serine protease 2 (TMPRSS2), furin, cathepsin B, and a nuclear receptor named farnesoid X receptor (FXR). TMPRSS2 and furin help in cleaving the spike protein of the SARS-CoV-2 virus, while cathepsin B plays a critical role in viral entry and pathogenesis. FXR, on the other hand, regulates the expression of ACE2, and its inhibition can reduce SARS-CoV-2 infection. By inhibiting these four protein targets with non-toxic inhibitors, the entry of the infectious agent into host cells and its pathogenesis can be obstructed. We have used the BioSolveIT suite for pharmacophore-based computational drug design. A total of 1611 ligands from the ligand library were docked with the target proteins to obtain potent inhibitors on the basis of the pharmacophore. Following the ADMET analysis and protein-ligand interactions, potent and druggable inhibitors of the target proteins were obtained. Additionally, toxic substructures and the less toxic route of administration of the most potent inhibitors in rodents were also determined computationally.
Compounds namely N-(diaminomethylene)-2-((3-((1R,3R)-3-(2-(methoxy(methyl)amino)-2-oxoethyl)cyclopentyl)propyl)amino)-2-oxoethan-1-aminium (26), (1R,3R)-3-(((2-ammonioethyl)ammonio)methyl)-1-((4-propyl-1H-imidazol-2-yl)methyl)piperidin-1-ium (29) and (1R,3R)-3-(((2-ammonioethyl)ammonio)methyl)-1-((1-propyl-1H-pyrazol-4-yl)methyl)piperidin-1-ium (30) were found as the potent inhibitors of TMPRSS2, whereas, 1-(1-(1-(1H-tetrazol-1-yl)cyclopropane-1‑carbonyl)piperidin-4-yl)azepan-2-one (6), (2R)-4-methyl-1-oxo-1-((7R,11S)-4-oxo-6,7,8,9,10,11-hexahydro-4H-7,11-methanopyrido[1,2-a]azocin-9-yl)pentan-2-aminium (12), 4-((1-(3-(3,5-dimethylisoxazol-4-yl)propanoyl)piperidin-4-yl)methyl)morpholin-4-ium (13), 1-(4,6-dimethylpyrimidin-2-yl)-N-(3-oxocyclohex-1-en-1-yl)piperidine-4-carboxamide (14), 1-(4-(1,5-dimethyl-1H-1,2,4-triazol-3-yl)piperidin-1-yl)-3-(3,5-dimethylisoxazol-4-yl)propan-1-one (25) and N,N-dimethyl-4-oxo-4-((1S,5R)-8-oxo-5,6-dihydro-1H-1,5-methanopyrido[1,2-a][1,5]diazocin-3(2H,4H,8H)-yl)butanamide (31) inhibited the FXR preferentially. In case of cathepsin B, N-((5-benzoylthiophen-2-yl)methyl)-2-hydrazineyl-2-oxoacetamide (2) and N-([2,2'-bifuran]-5-ylmethyl)-2-hydrazineyl-2-oxoacetamide (7) were identified as the most druggable inhibitors whereas 1-amino-2,7-diethyl-3,8-dioxo-6-(p-tolyl)-2,3,7,8-tetrahydro-2,7-naphthyridine-4‑carbonitrile (5) and (R)-6-amino-2-(2,3-dihydroxypropyl)-1H-benzo[de]isoquinoline-1,3(2H)-dione (20) were active against furin.
Affiliation(s)
- Sumera Zaib
  - Department of Basic and Applied Chemistry, Faculty of Science and Technology, University of Central Punjab, Lahore 54590, Pakistan.
- Nehal Rana
  - Department of Basic and Applied Chemistry, Faculty of Science and Technology, University of Central Punjab, Lahore 54590, Pakistan
- Hafiz Saqib Ali
  - INEOS Oxford Institute for Antimicrobial Research and Chemistry Research Laboratory, Department of Chemistry, University of Oxford, 12 Mansfield Road, Oxford OX1 3TA, United Kingdom
- Nadia Hussain
  - Department of Pharmaceutical Sciences, College of Pharmacy, Al Ain University, Al Ain, P.O. Box 64141, United Arab Emirates; AAU Health and Biomedical Research Center, Al Ain University, Abu Dhabi, P.O. Box 144534, United Arab Emirates
- Areeba
  - Department of Basic and Applied Chemistry, Faculty of Science and Technology, University of Central Punjab, Lahore 54590, Pakistan
- Hanan A Ogaly
  - Chemistry Department, College of Science, King Khalid University, Abha 61421, Saudi Arabia
- Fatimah A M Al-Zahrani
  - Chemistry Department, College of Science, King Khalid University, Abha 61421, Saudi Arabia
- Imtiaz Khan
  - Department of Chemistry and Manchester Institute of Biotechnology, The University of Manchester, 131 Princess Street, Manchester M1 7DN, United Kingdom.
9
Wang J, Novick S. DOSE-L1000: unveiling the intricate landscape of compound-induced transcriptional changes. Bioinformatics 2023; 39:btad683. [PMID: 37952162 PMCID: PMC10663987 DOI: 10.1093/bioinformatics/btad683] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/26/2023] [Revised: 10/31/2023] [Accepted: 11/10/2023] [Indexed: 11/14/2023] Open
Abstract
MOTIVATION The LINCS L1000 project has collected gene expression profiles for thousands of compounds across a wide array of concentrations, cell lines, and time points. However, conventional analysis methods often fall short in capturing the rich information encapsulated within the L1000 transcriptional dose-response data. RESULTS We present DOSE-L1000, a database that unravels the potency and efficacy of compound-gene pairs and the intricate landscape of compound-induced transcriptional changes. Our study uses the fitting of over 140 million generalized additive models and robust linear models, spanning the complete spectrum of compounds and landmark genes within the LINCS L1000 database. This systematic approach provides quantitative insights into differential gene expression and the potency and efficacy of compound-gene pairs across diverse cellular contexts. Through examples, we showcase the application of DOSE-L1000 in tasks such as cell line and compound comparisons, along with clustering analyses and predictions of drug-target interactions. DOSE-L1000 fosters applications in drug discovery, accelerating the transition to omics-driven drug development. AVAILABILITY AND IMPLEMENTATION DOSE-L1000 is publicly available at https://doi.org/10.5281/zenodo.8286375.
Affiliation(s)
- Junmin Wang
  - Data Sciences and Quantitative Biology, Discovery Sciences, Biopharmaceuticals R&D, AstraZeneca, Gaithersburg, MD 20878, United States
- Steven Novick
  - Global Statistical Sciences, Eli Lilly, Indianapolis, IN 46285, United States
10
Seal S, Spjuth O, Hosseini-Gerami L, García-Ortegón M, Singh S, Bender A, Carpenter AE. Insights into Drug Cardiotoxicity from Biological and Chemical Data: The First Public Classifiers for FDA DICTrank. bioRxiv 2023:2023.10.15.562398. [PMID: 37905146 PMCID: PMC10614794 DOI: 10.1101/2023.10.15.562398] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2023]
Abstract
Drug-induced cardiotoxicity (DICT) is a major concern in drug development, accounting for 10-14% of postmarket withdrawals. In this study, we explored the capabilities of various chemical and biological data to predict cardiotoxicity, using the recently released Drug-Induced Cardiotoxicity Rank (DICTrank) dataset from the United States FDA. We analyzed a diverse set of data sources, including physicochemical properties, annotated mechanisms of action (MOA), Cell Painting, Gene Expression, and more, to identify indications of cardiotoxicity. We found that such data, including protein targets, especially those related to ion channels (such as hERG), physicochemical properties (such as electrotopological state), and peak concentration in plasma, offer strong predictive ability as well as valuable insights into DICT. We also found that compounds annotated with particular mechanisms of action, such as cyclooxygenase inhibition, could distinguish between most-concern and no-concern DICT compounds. Cell Painting features related to ER stress discern the most-concern cardiotoxic compounds from non-toxic compounds. While models based on physicochemical properties currently provide substantial predictive accuracy (AUCPR = 0.93), this study also underscores the potential benefits of incorporating more comprehensive biological data in future DICT predictive models. With the availability of -omics data in the future, using biological data promises enhanced predictability and deeper mechanistic insights, paving the way for safer therapeutic drug development. All models and data used in this study are publicly released at https://broad.io/DICTrank_Predictor.
Affiliation(s)
- Srijit Seal
  - Imaging Platform, Broad Institute of MIT and Harvard, US
- Ola Spjuth
  - Department of Pharmaceutical Biosciences, Uppsala University, Sweden
- Shantanu Singh
  - Imaging Platform, Broad Institute of MIT and Harvard, US
11
Wang L, Sun C, Xu X, Li J, Zhang W. A neighborhood-regularization method leveraging multiview data for predicting the frequency of drug-side effects. Bioinformatics 2023; 39:btad532. [PMID: 37647657 PMCID: PMC10491955 DOI: 10.1093/bioinformatics/btad532] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/13/2023] [Revised: 08/24/2023] [Accepted: 08/28/2023] [Indexed: 09/01/2023] Open
Abstract
MOTIVATION A critical issue in drug benefit-risk assessment is to determine the frequency of side effects, which is done through randomized controlled trials. Computationally predicted frequencies of drug side effects can be used to effectively guide the randomized controlled trials. However, predicting drug side effect frequencies is challenging, and thus only a few studies address this problem. RESULTS In this work, we propose a neighborhood-regularization method (NRFSE) that leverages multiview data on drugs and side effects to predict the frequency of side effects. First, we adopt a class-weighted non-negative matrix factorization to decompose the drug-side effect frequency matrix, in which Gaussian likelihood is used to model unknown drug-side effect pairs. Second, we design a multiview neighborhood regularization to integrate three drug attributes and two side effect attributes, respectively, which makes the most similar drugs and the most similar side effects have similar latent signatures. The regularization can adaptively determine the weights of different attributes. We conduct extensive experiments on one benchmark dataset, and NRFSE improves the prediction performance compared with five state-of-the-art approaches. An independent test set of post-marketing side effects further validates the effectiveness of NRFSE. AVAILABILITY AND IMPLEMENTATION Source code and datasets are available at https://github.com/linwang1982/NRFSE or https://codeocean.com/capsule/4741497/tree/v1.
Affiliation(s)
- Lin Wang
- College of Artificial Intelligence, Tianjin University of Science and Technology, No. 9, 13th Street, Tianjin Economic-Technological Development Area, Tianjin 300457, China
| | - Chenhao Sun
- College of Artificial Intelligence, Tianjin University of Science and Technology, No. 9, 13th Street, Tianjin Economic-Technological Development Area, Tianjin 300457, China
| | - Xianyu Xu
- College of Artificial Intelligence, Tianjin University of Science and Technology, No. 9, 13th Street, Tianjin Economic-Technological Development Area, Tianjin 300457, China
| | - Jia Li
- College of Artificial Intelligence, Tianjin University of Science and Technology, No. 9, 13th Street, Tianjin Economic-Technological Development Area, Tianjin 300457, China
| | - Wenjuan Zhang
- College of General Education, Tianjin Foreign Studies University, No. 117, Machang Road, Hexi District, Tianjin 300204, China
| |
|
12
|
Williams AH, Zhan CG. Staying Ahead of the Game: How SARS-CoV-2 has Accelerated the Application of Machine Learning in Pandemic Management. BioDrugs 2023; 37:649-674. [PMID: 37464099 DOI: 10.1007/s40259-023-00611-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 05/28/2023] [Indexed: 07/20/2023]
Abstract
In recent years, machine learning (ML) techniques have garnered considerable interest for their potential use in accelerating the rate of drug discovery. With the emergence of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) pandemic, the utilization of ML has become even more crucial in the search for effective antiviral medications. The pandemic has presented the scientific community with a unique challenge, and the rapid identification of potential treatments has become an urgent priority. Researchers have been able to accelerate the process of identifying drug candidates, repurposing existing drugs, and designing new compounds with desirable properties using machine learning in drug discovery. To train predictive models, ML techniques in drug discovery rely on the analysis of large datasets, including both experimental and clinical data. These models can be used to predict the biological activities, potential side effects, and interactions with specific target proteins of drug candidates. This strategy has proven to be an effective method for identifying potential coronavirus disease 2019 (COVID-19) and other disease treatments. This paper offers a thorough analysis of the various ML techniques implemented to combat COVID-19, including supervised and unsupervised learning, deep learning, and natural language processing. The paper discusses the impact of these techniques on pandemic drug development, including the identification of potential treatments, the understanding of the disease mechanism, and the creation of effective and safe therapeutics. The lessons learned can be applied to future outbreaks and drug discovery initiatives.
Affiliation(s)
- Alexander H Williams
- Molecular Modeling and Biopharmaceutical Center, University of Kentucky, 789 South Limestone Street, Lexington, KY, 40536, USA
- Department of Pharmaceutical Sciences, College of Pharmacy, University of Kentucky, 789 South Limestone Street, Lexington, KY, 40536, USA
- GSK Upper Providence, 1250 S. Collegeville Road, Collegeville, PA, 19426, USA
| | - Chang-Guo Zhan
- Molecular Modeling and Biopharmaceutical Center, University of Kentucky, 789 South Limestone Street, Lexington, KY, 40536, USA.
- Department of Pharmaceutical Sciences, College of Pharmacy, University of Kentucky, 789 South Limestone Street, Lexington, KY, 40536, USA.
| |
|
13
|
Krix S, DeLong LN, Madan S, Domingo-Fernández D, Ahmad A, Gul S, Zaliani A, Fröhlich H. MultiGML: Multimodal graph machine learning for prediction of adverse drug events. Heliyon 2023; 9:e19441. [PMID: 37681175 PMCID: PMC10481305 DOI: 10.1016/j.heliyon.2023.e19441] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 12/15/2022] [Revised: 08/22/2023] [Accepted: 08/23/2023] [Indexed: 09/09/2023]
Abstract
Adverse drug events constitute a major challenge for the success of clinical trials. Several computational strategies have been suggested to estimate the risk of adverse drug events in preclinical drug development. While these approaches have demonstrated high utility in practice, they are at the same time limited to specific information sources. Thus, many current computational approaches neglect a wealth of information which results from the integration of different data sources, such as biological protein function, gene expression, chemical compound structure, cell-based imaging and others. In this work we propose an integrative and explainable multi-modal Graph Machine Learning approach (MultiGML), which fuses knowledge graphs with multiple further data modalities to predict drug related adverse events and general drug target-phenotype associations. MultiGML demonstrates excellent prediction performance compared to alternative algorithms, including various traditional knowledge graph embedding techniques. MultiGML distinguishes itself from alternative techniques by providing in-depth explanations of model predictions, which point towards biological mechanisms associated with predictions of an adverse drug event. Hence, MultiGML could be a versatile tool to support decision making in preclinical drug development.
Affiliation(s)
- Sophia Krix
- Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Schloss Birlinghoven, 53757, Sankt Augustin, Germany
- Bonn-Aachen International Center for Information Technology (B-IT), University of Bonn, 53115, Bonn, Germany
- Fraunhofer Center for Machine Learning, Germany
| | - Lauren Nicole DeLong
- Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Schloss Birlinghoven, 53757, Sankt Augustin, Germany
- Artificial Intelligence and its Applications Institute, School of Informatics, University of Edinburgh, 10 Crichton Street, EH8 9AB, UK
| | - Sumit Madan
- Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Schloss Birlinghoven, 53757, Sankt Augustin, Germany
- Department of Computer Science, University of Bonn, 53115, Bonn, Germany
| | - Daniel Domingo-Fernández
- Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Schloss Birlinghoven, 53757, Sankt Augustin, Germany
- Fraunhofer Center for Machine Learning, Germany
- Enveda Biosciences, Boulder, CO, 80301, USA
| | - Ashar Ahmad
- Bonn-Aachen International Center for Information Technology (B-IT), University of Bonn, 53115, Bonn, Germany
- Grunenthal GmbH, 52099, Aachen, Germany
| | - Sheraz Gul
- Fraunhofer Institute for Translational Medicine and Pharmacology ITMP, Schnackenburgallee 114, 22525, Hamburg, Germany
- Fraunhofer Cluster of Excellence for Immune-Mediated Diseases CIMD, Schnackenburgallee 114, 22525, Hamburg, Germany
| | - Andrea Zaliani
- Fraunhofer Institute for Translational Medicine and Pharmacology ITMP, Schnackenburgallee 114, 22525, Hamburg, Germany
- Fraunhofer Cluster of Excellence for Immune-Mediated Diseases CIMD, Schnackenburgallee 114, 22525, Hamburg, Germany
| | - Holger Fröhlich
- Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Schloss Birlinghoven, 53757, Sankt Augustin, Germany
- Bonn-Aachen International Center for Information Technology (B-IT), University of Bonn, 53115, Bonn, Germany
| |
|
14
|
Sutherland JJ, Yonchev D, Fekete A, Urban L. A preclinical secondary pharmacology resource illuminates target-adverse drug reaction associations of marketed drugs. Nat Commun 2023; 14:4323. [PMID: 37468498 DOI: 10.1038/s41467-023-40064-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 01/23/2023] [Accepted: 07/11/2023] [Indexed: 07/21/2023]
Abstract
In vitro secondary pharmacology assays are an important tool for predicting clinical adverse drug reactions (ADRs) of investigational drugs. We created the Secondary Pharmacology Database (SPD) by testing 1958 drugs using 200 assays to validate target-ADR associations. Compared to public and subscription resources, 95% of all and 36% of active (AC50 < 1 µM) results are unique to SPD, with bias towards higher activity in public resources. Annotating drugs with free maximal plasma concentrations, we find 684 physiologically relevant unpublished off-target activities. Furthermore, 64% of putative ADRs linked to target activity in key literature reviews are not statistically significant in SPD. Systematic analysis of all target-ADR pairs identifies several putative associations supported by publications. Finally, candidate mechanisms for known ADRs are proposed based on SPD off-target activities. Here we present a freely-available resource for benchmarking ADR predictions, explaining phenotypic activity and investigating clinical properties of marketed drugs.
Affiliation(s)
| | - Dimitar Yonchev
- Novartis Institutes for Biomedical Research, Basel, Switzerland
| | - Laszlo Urban
- Novartis Institutes for Biomedical Research, Cambridge, MA, USA.
| |
|
15
|
Wu Y, Liu Q, Xie L. Hierarchical multi-omics data integration and modeling predict cell-specific chemical proteomics and drug responses. Cell Reports Methods 2023; 3:100452. [PMID: 37159671 PMCID: PMC10163019 DOI: 10.1016/j.crmeth.2023.100452] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/22/2022] [Revised: 12/28/2022] [Accepted: 03/22/2023] [Indexed: 05/11/2023]
Abstract
Drug-induced phenotypes result from biomolecular interactions across various levels of a biological system. Characterization of pharmacological actions therefore requires integration of multi-omics data. Proteomics profiles, which may more directly reflect disease mechanisms and biomarkers than transcriptomics, have not been widely exploited due to data scarcity and frequent missing values. A computational method for inferring drug-induced proteome patterns would therefore enable progress in systems pharmacology. To predict the proteome profiles and corresponding phenotypes of an uncharacterized cell or tissue type that has been disturbed by an uncharacterized chemical, we developed an end-to-end deep learning framework: TransPro. TransPro hierarchically integrated multi-omics data, in line with the central dogma of molecular biology. Our in-depth assessments of TransPro's predictions of anti-cancer drug sensitivity and drug adverse reactions reveal that TransPro's accuracy is on par with that of experimental data. Hence, TransPro may facilitate the imputation of proteomics data and compound screening in systems pharmacology.
Affiliation(s)
- You Wu
- The Graduate Center, City University of New York, New York, NY 10016, USA
| | - Qiao Liu
- The Graduate Center, City University of New York, New York, NY 10016, USA
| | - Lei Xie
- The Graduate Center, City University of New York, New York, NY 10016, USA
- Hunter College, City University of New York, New York, NY 10065, USA
- Weill Cornell Medicine, Cornell University, New York, NY 10021, USA
| |
|
16
|
Nemoto S, Mizuno T, Kusuhara H. Investigation of chemical structure recognition by encoder-decoder models in learning progress. J Cheminform 2023; 15:45. [PMID: 37046349 PMCID: PMC10100163 DOI: 10.1186/s13321-023-00713-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/22/2022] [Accepted: 03/18/2023] [Indexed: 04/14/2023]
Abstract
Descriptor generation methods using latent representations of encoder-decoder (ED) models with SMILES as input are useful because of the continuity of descriptor and restorability to the structure. However, it is not clear how the structure is recognized in the learning progress of ED models. In this work, we created ED models of various learning progress and investigated the relationship between structural information and learning progress. We showed that compound substructures were learned early in ED models by monitoring the accuracy of downstream tasks and input-output substructure similarity using substructure-based descriptors, which suggests that existing evaluation methods based on the accuracy of downstream tasks may not be sensitive enough to evaluate the performance of ED models with SMILES as descriptor generation methods. On the other hand, we showed that structure restoration was time-consuming, and in particular, insufficient learning led to the estimation of a larger structure than the actual one. It can be inferred that determining the endpoint of the structure is a difficult task for the model. To our knowledge, this is the first study to link the learning progress of SMILES by ED model to chemical structures for a wide range of chemicals.
Affiliation(s)
- Shumpei Nemoto
- Department of Pharmaceutical Sciences, The University of Tokyo, Bunkyo, Tokyo, Japan
| | - Tadahaya Mizuno
- Department of Pharmaceutical Sciences, The University of Tokyo, Bunkyo, Tokyo, Japan.
| | - Hiroyuki Kusuhara
- Department of Pharmaceutical Sciences, The University of Tokyo, Bunkyo, Tokyo, Japan
| |
|
17
|
Das D, Chakrabarty B, Srinivasan R, Roy A. Gex2SGen: Designing Drug-like Molecules from Desired Gene Expression Signatures. J Chem Inf Model 2023; 63:1882-1893. [PMID: 36971750 DOI: 10.1021/acs.jcim.2c01301] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/29/2023]
Abstract
Drug-induced gene expression profiling provides a lot of useful information covering various aspects of drug discovery and development. Most importantly, this knowledge can be used to discover drugs' mechanisms of action. Recently, deep learning-based drug design methods have been in the spotlight due to their ability to explore huge chemical space and design property-optimized target-specific drug molecules. Recent advances in accessibility of open-source drug-induced transcriptomic data along with the ability of deep learning algorithms to understand hidden patterns have opened opportunities for designing drug molecules based on desired gene expression signatures. In this study, we propose a deep learning model, Gex2SGen (Gene Expression 2 SMILES Generation), to generate novel drug-like molecules based on desired gene expression profiles. The model accepts desired gene expression profiles in a cell-specific manner as input and designs drug-like molecules which can elicit the required transcriptomic profile. The model was first tested against individual gene-knocked-out transcriptomic profiles, where the newly designed molecules showed high similarity with known inhibitors of the knocked-out target genes. The model was next applied to a triple negative breast cancer signature profile, where it could generate novel molecules highly similar to known anti-breast cancer drugs. Overall, this work provides a generalized method in which the model first learns the molecular signature of a given cell under a specific condition and then designs new small molecules with drug-like properties.
|
18
|
He H, Duo H, Hao Y, Zhang X, Zhou X, Zeng Y, Li Y, Li B. Computational drug repurposing by exploiting large-scale gene expression data: Strategy, methods and applications. Comput Biol Med 2023; 155:106671. [PMID: 36805225 DOI: 10.1016/j.compbiomed.2023.106671] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 11/25/2022] [Revised: 02/05/2023] [Accepted: 02/10/2023] [Indexed: 02/18/2023]
Abstract
De novo drug development is an extremely complex, time-consuming and costly task. Urgent needs for therapies of various diseases have greatly accelerated searches for more effective drug development methods. Drug repurposing provides a new and effective perspective on disease treatment. Rapidly growing large-scale transcriptome data paint a detailed picture of gene expression during disease onset and thus have received wide attention in the field of computational drug repurposing. However, how to efficiently mine transcriptome data and identify new indications for old drugs remains a critical challenge. This review discusses the irreplaceable role of transcriptome data in computational drug repurposing and summarizes some representative databases, tools and strategies. More importantly, it proposes a practical guideline by establishing the correspondence between three gene expression data types and five strategies, which would help researchers adopt appropriate strategies to deeply mine large-scale transcriptome data and discover more effective therapies.
Affiliation(s)
- Hao He
- College of Life Sciences, Chongqing Normal University, Chongqing, 400044, PR China; State Key Laboratory of Medical Neurobiology and MOE Frontiers Center for Brain Science, Institutes of Brain Science, Fudan University, Shanghai, 200032, PR China
| | - Hongrui Duo
- College of Life Sciences, Chongqing Normal University, Chongqing, 400044, PR China
| | - Youjin Hao
- College of Life Sciences, Chongqing Normal University, Chongqing, 400044, PR China
| | - Xiaoxi Zhang
- College of Life Sciences, Chongqing Normal University, Chongqing, 400044, PR China
| | - Xinyi Zhou
- College of Life Sciences, Chongqing Normal University, Chongqing, 400044, PR China
| | - Yujie Zeng
- College of Life Sciences, Chongqing Normal University, Chongqing, 400044, PR China
| | - Yinghong Li
- The Key Laboratory on Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, Chongqing, 400065, PR China
| | - Bo Li
- College of Life Sciences, Chongqing Normal University, Chongqing, 400044, PR China.
| |
|
19
|
Das P, Mazumder DH. An extensive survey on the use of supervised machine learning techniques in the past two decades for prediction of drug side effects. Artif Intell Rev 2023; 56:1-28. [PMID: 36819660 PMCID: PMC9930028 DOI: 10.1007/s10462-023-10413-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 02/01/2023] [Indexed: 02/19/2023]
Abstract
Approved drugs for sale must be effective and safe, implying that the drug's advantages outweigh its known harmful side effects. Side effects (SE) of drugs are one of the common reasons for drug failure that may halt the whole drug discovery pipeline. The side effects might vary from minor concerns like a runny nose to potentially life-threatening issues like liver damage, heart attack, and death. Therefore, predicting the side effects of the drug is vital in drug development, discovery, and design. The supervised machine learning-based side effect prediction task has recently received much attention since it reduces time, chemical waste, design complexity, risk of failure, and cost. Supervised learning approaches for predicting side effects have emerged as essential computational tools. Supervised machine learning techniques provide early information on drug side effects to develop an effective drug based on drug properties. Still, there are several challenges to predicting drug side effects. Thus, a near-exhaustive survey is carried out in this paper on the use of supervised machine learning approaches employed in drug side effect prediction tasks in the past two decades. In addition, this paper also summarizes the drug descriptors required for the side effect prediction task, commonly utilized sources of drug properties, computational models, and their performances. Finally, the research gaps, open problems, and challenges for further supervised learning-based side effect prediction are discussed.
Affiliation(s)
- Pranab Das
- Department of Computer Science and Engineering, National Institute of Technology Nagaland, Chumukedima, Dimapur, Nagaland 797103 India
| | - Dilwar Hussain Mazumder
- Department of Computer Science and Engineering, National Institute of Technology Nagaland, Chumukedima, Dimapur, Nagaland 797103 India
| |
|
20
|
Wang L, Song Y, Wang H, Zhang X, Wang M, He J, Li S, Zhang L, Li K, Cao L. Advances of Artificial Intelligence in Anti-Cancer Drug Design: A Review of the Past Decade. Pharmaceuticals (Basel) 2023; 16:253. [PMID: 37259400 PMCID: PMC9963982 DOI: 10.3390/ph16020253] [Citation(s) in RCA: 12] [Impact Index Per Article: 12.0] [Received: 12/29/2022] [Revised: 01/25/2023] [Accepted: 02/06/2023] [Indexed: 10/13/2023]
Abstract
Anti-cancer drug design has been acknowledged as a complicated, expensive, time-consuming, and challenging task. How to reduce the research costs and speed up the development process of anti-cancer drug designs has become a challenging and urgent question for the pharmaceutical industry. Computer-aided drug design methods have played a major role in the development of cancer treatments for over three decades. Recently, artificial intelligence has emerged as a powerful and promising technology for faster, cheaper, and more effective anti-cancer drug designs. This narrative review covers a wide range of applications of artificial intelligence-based methods in anti-cancer drug design. We further clarify the fundamental principles of these methods, along with their advantages and disadvantages. Furthermore, we collate a large number of databases, including the omics database, the epigenomics database, the chemical compound database, and drug databases. Other researchers can consider them and adapt them to their own requirements.
Affiliation(s)
| | - Kang Li
- Department of Biostatistics, School of Public Health, Harbin Medical University, Harbin 150081, China
| | - Lei Cao
- Department of Biostatistics, School of Public Health, Harbin Medical University, Harbin 150081, China
| |
|
21
|
Using chemical and biological data to predict drug toxicity. SLAS Discovery: Advancing Life Sciences R&D 2023; 28:53-64. [PMID: 36639032 DOI: 10.1016/j.slasd.2022.12.003] [Citation(s) in RCA: 12] [Impact Index Per Article: 12.0] [Received: 10/30/2022] [Revised: 12/19/2022] [Accepted: 12/31/2022] [Indexed: 01/12/2023]
Abstract
Various sources of information can be used to better understand and predict compound activity and safety-related endpoints, including biological data such as gene expression and cell morphology. In this review, we first introduce types of chemical, in vitro and in vivo information that can be used to describe compounds and adverse effects. We then explore how compound descriptors based on chemical structure or biological perturbation response can be used to predict safety-related endpoints, and how especially biological data can help us to better understand adverse effects mechanistically. Overall, the described applications demonstrate how large-scale biological information presents new opportunities to anticipate and understand the biological effects of compounds, and how this can support predictive toxicology and drug discovery projects.
|
22
|
Uner OC, Kuru HI, Cinbis RG, Tastan O, Cicek AE. DeepSide: A Deep Learning Approach for Drug Side Effect Prediction. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2023; 20:330-339. [PMID: 34995191 DOI: 10.1109/tcbb.2022.3141103] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/04/2023]
Abstract
Drug failures due to unforeseen adverse effects at clinical trials pose health risks for the participants and lead to substantial financial losses. Side effect prediction algorithms have the potential to guide the drug design process. The LINCS L1000 dataset provides a vast resource of cell line gene expression data perturbed by different drugs and creates a knowledge base for context specific features. The state-of-the-art approach that aims at using context specific information relies on only the high-quality experiments in LINCS L1000 and discards a large portion of the experiments. In this study, our goal is to boost the prediction performance by utilizing this data to its full extent. We experiment with 5 deep learning architectures. We find that a multi-modal architecture produces the best predictive performance among multi-layer perceptron-based architectures when drug chemical structure (CS) and the full set of drug perturbed gene expression profiles (GEX) are used as modalities. Overall, we observe that the CS is more informative than the GEX. A convolutional neural network-based model that uses only the SMILES string representation of the drugs achieves the best results and provides 13.0% macro-AUC and 3.1% micro-AUC improvements over the state-of-the-art. We also show that the model is able to predict side effect-drug pairs that are reported in the literature but were missing from the ground truth side effect dataset. DeepSide is available at http://github.com/OnurUner/DeepSide.
|
23
|
DasGupta R, Yap A, Yaqing EY, Chia S. Evolution of precision oncology-guided treatment paradigms. WIREs Mech Dis 2023; 15:e1585. [PMID: 36168283 DOI: 10.1002/wsbm.1585] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 03/07/2022] [Revised: 06/30/2022] [Accepted: 07/11/2022] [Indexed: 01/31/2023]
Abstract
Cancer treatment is gradually evolving from the classical use of nonspecific cytotoxic drugs targeting generic mechanisms of cell growth and proliferation. Instead, new "patient-specific treatment paradigms" that are based on an individual patient's tumor-specific molecular features are emerging, and these include "druggable" genomic alterations such as oncogenic driver mutations, downstream activities of cancer-signaling pathways, and the expression of specific genes involved in tumorigenesis and cancer progression. This evolving landscape of making evidence-based treatment decisions forms the foundation of precision oncology, which aims to deliver "the right drug, to the right patient and at the right time". The long-term vision for this approach is to maximize the treatment efficacy while minimizing exposure to ineffective therapy and reducing co-morbidity-related side effects. Successful clinical translation and implementation of this vision have the potential to revolutionize treatment paradigms from predominantly reactive, to more evidence-based, proactive and predictive care. In this article, we review the past and current approaches in precision oncology, and describe their remarkable power and limitations. We also speculate on the evolution of newly emerging methodologies of the future that can be used to address some of the key challenges associated with the existing paradigms. This article is categorized under: Cancer > Genetics/Genomics/Epigenetics Cancer > Molecular and Cellular Physiology Cancer > Computational Models.
Affiliation(s)
- Ramanuj DasGupta
- Laboratory of Precision Oncology and Cancer Evolution, Genome Institute of Singapore, A*STAR, Singapore, Singapore
- Cancer Science Institute, National University of Singapore, Singapore, Singapore
| | - Aixin Yap
- Laboratory of Precision Oncology and Cancer Evolution, Genome Institute of Singapore, A*STAR, Singapore, Singapore
| | - Elena Yong Yaqing
- Laboratory of Precision Oncology and Cancer Evolution, Genome Institute of Singapore, A*STAR, Singapore, Singapore
| | - Shumei Chia
- Laboratory of Precision Oncology and Cancer Evolution, Genome Institute of Singapore, A*STAR, Singapore, Singapore
| |
|
24
|
Lim S, Kim Y, Gu J, Lee S, Shin W, Kim S. Supervised chemical graph mining improves drug-induced liver injury prediction. iScience 2022; 26:105677. [PMID: 36654861 PMCID: PMC9840932 DOI: 10.1016/j.isci.2022.105677] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 09/08/2022] [Revised: 11/11/2022] [Accepted: 11/23/2022] [Indexed: 12/27/2022]
Abstract
Drug-induced liver injury (DILI) is the main cause of drug failure in clinical trials. The characterization of toxic compounds in terms of chemical structure is important because compounds can be metabolized to toxic substances in the liver. Traditional machine learning approaches have had limited success in predicting DILI, and emerging deep graph neural network (GNN) models are not yet powerful enough to predict DILI. In this study, we developed a completely different approach, supervised subgraph mining (SSM), a strategy to mine explicit subgraph features by iteratively updating individual graph transitions to maximize DILI fidelity. Our method outperformed previous methods, including state-of-the-art GNN tools, in classifying DILI on two different datasets: DILIst and TDC-benchmark. We also combined the subgraph features by using SMARTS-based frequent structural pattern matching and associated them with drugs' ATC codes.
Affiliation(s)
- Sangsoo Lim
- Bioinformatics Institute, Seoul National University, Gwanak-ro 1, Seoul 08826, South Korea
| | - Youngkuk Kim
- Department of Computer Science and Engineering, Seoul National University, Gwanak-ro 1, Seoul 08826, South Korea
| | - Jeonghyeon Gu
- Interdisciplinary Program in Artificial Intelligence, Seoul National University, Gwanak-ro 1, Seoul 08826, South Korea
| | - Sunho Lee
- AIGENDRUG Co., Ltd., Gwanak-ro 1, Seoul 08826, South Korea
| | - Wonseok Shin
- Department of Computer Science and Engineering, Seoul National University, Gwanak-ro 1, Seoul 08826, South Korea
| | - Sun Kim
- Department of Computer Science and Engineering, Seoul National University, Gwanak-ro 1, Seoul 08826, South Korea
- Interdisciplinary Program in Artificial Intelligence, Seoul National University, Gwanak-ro 1, Seoul 08826, South Korea
- AIGENDRUG Co., Ltd., Gwanak-ro 1, Seoul 08826, South Korea
- Corresponding author
| |
|
25
|
Abstract
Liver cancer, mainly hepatocellular carcinoma (HCC), remains a major cause of cancer-related death worldwide. With the global epidemic of obesity, the major HCC etiologies have been dynamically shifting from viral to metabolic liver diseases. This change has made HCC prevention difficult, with increasingly elusive at-risk populations as a rational target for preventive interventions. Besides ongoing efforts to reduce obesity and metabolic disorders, chemoprevention in patients who already have metabolic liver diseases may have a significant impact on the poor HCC prognosis. Hepatitis B- and hepatitis C-related HCC incidences have been substantially reduced by the new antivirals, but HCC risk can persist over a decade even after successful viral treatment, highlighting the need for HCC-preventive measures also in these patients. Experimental and retrospective studies have suggested potential utility of generic agents such as lipophilic statins and aspirin for HCC chemoprevention given their well-characterized safety profile, although anticipated efficacy may be modest. In this review, we provide an overview of recent clinical and translational studies of generic agents in the context of HCC chemoprevention under the contemporary HCC etiologies. We also discuss newly emerging approaches to overcome the challenges in clinical testing of the agents to facilitate their clinical translation.
Affiliation(s)
- Fahmida Rasha
- Liver Tumor Translational Research Program; Simmons Comprehensive Cancer Center; Division of Digestive and Liver Diseases; Department of Internal Medicine; University of Texas Southwestern Medical Center, Dallas, TX, USA
| | - Subhojit Paul
- Liver Tumor Translational Research Program; Simmons Comprehensive Cancer Center; Division of Digestive and Liver Diseases; Department of Internal Medicine; University of Texas Southwestern Medical Center, Dallas, TX, USA
| | - Tracey G Simon
- Liver Center, Division of Gastroenterology, Clinical and Translational Epidemiology Unit, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
| | - Yujin Hoshida
- Liver Tumor Translational Research Program; Simmons Comprehensive Cancer Center; Division of Digestive and Liver Diseases; Department of Internal Medicine; University of Texas Southwestern Medical Center, Dallas, TX, USA
|
26
|
Zheng M, Okawa S, Bravo M, Chen F, Martínez-Chantar ML, del Sol A. ChemPert: mapping between chemical perturbation and transcriptional response for non-cancer cells. Nucleic Acids Res 2022; 51:D877-D889. [PMID: 36200827 PMCID: PMC9825489 DOI: 10.1093/nar/gkac862] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/01/2022] [Revised: 09/08/2022] [Accepted: 09/25/2022] [Indexed: 01/30/2023]
Abstract
Prior knowledge of perturbation data can significantly assist in inferring the relationship between chemical perturbations and their specific transcriptional response. However, current databases mostly contain cancer cell lines, which are unsuitable for the aforementioned inference in non-cancer cells, such as cells related to non-cancer disease, immunology and aging. Here, we present ChemPert (https://chempert.uni.lu/), a database consisting of 82 270 transcriptional signatures in response to 2566 unique perturbagens (drugs, small molecules and protein ligands) across 167 non-cancer cell types, as well as the protein targets of 57 818 perturbagens. In addition, we develop a computational tool that leverages the non-cancer cell datasets, which enables more accurate predictions of perturbation responses and drugs in non-cancer cells compared to those based on cancer databases. In particular, ChemPert correctly predicted drug effects for treating hepatitis and novel drugs for osteoarthritis. The ChemPert web interface is user-friendly and allows easy access to the entire datasets and the computational tool, providing valuable resources for both experimental researchers who wish to find datasets relevant to their research and computational researchers who need comprehensive non-cancer perturbation transcriptomics datasets for developing novel algorithms. Overall, ChemPert will facilitate future in silico compound screening for non-cancer cells.
Affiliation(s)
| | | | - Miren Bravo
- Liver Disease Laboratory, Center for Cooperative Research in Biosciences (CIC bioGUNE), Basque Research and Technology Alliance (BRTA), Bizkaia Technology Park, Derio, Spain,Centro de Investigación Biomédica en Red de Enfermedades Hepáticas y Digestivas (CIBERehd), 48160 Bizkaia, Spain
| | - Fei Chen
- German Research Center for Artificial Intelligence (DFKI), 66123 Saarbrücken, Germany
| | - María-Luz Martínez-Chantar
- Liver Disease Laboratory, Center for Cooperative Research in Biosciences (CIC bioGUNE), Basque Research and Technology Alliance (BRTA), Bizkaia Technology Park, Derio, Spain,Centro de Investigación Biomédica en Red de Enfermedades Hepáticas y Digestivas (CIBERehd), 48160 Bizkaia, Spain
| | - Antonio del Sol
- To whom correspondence should be addressed. Tel: +352 46 66 44 6982;
|
27
|
Hsieh CY, Tu CC, Hung JH. Estimating intraclonal heterogeneity and subpopulation changes from bulk expression profiles in CMap. Life Sci Alliance 2022; 5:e202101299. [PMID: 35688486 PMCID: PMC9187873 DOI: 10.26508/lsa.202101299] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/16/2021] [Revised: 05/25/2022] [Accepted: 05/25/2022] [Indexed: 11/24/2022]
Abstract
Premnas is a computational framework that provides a new perspective to interpret perturbational data in LINCS L1000 CMap by learning an ad hoc subpopulation representation from scRNA-seq and performing digital cytometry to estimate the abundance of undetermined subpopulations. The connectivity among signatures upon perturbations curated in the CMap library provides a valuable resource for understanding therapeutic pathways and biological processes associated with the drugs and diseases. However, because of the nature of bulk-level expression profiling by the L1000 assay, intraclonal heterogeneity and subpopulation compositional change that could contribute to the responses to perturbations are largely neglected, hampering the interpretability and reproducibility of the connections. In this work, we proposed a computational framework, Premnas, to estimate the abundance of undetermined subpopulations from L1000 profiles in CMap directly according to an ad hoc subpopulation representation learned from a well-normalized batch of single-cell RNA-seq datasets by the archetypal analysis. By recovering the information of subpopulation changes upon perturbation, the potentials of drug-resistant/susceptible subpopulations with CMap L1000 were further explored and examined. The proposed framework enables a new perspective to understand the connectivity among cellular signatures and expands the scope of the CMap and other similar perturbation datasets limited by the bulk profiling technology.
Affiliation(s)
- Chiao-Yu Hsieh
- Department of Computer Science, College of Computer Science, National Yang Ming Chiao Tung University, Hsinchu, Taiwan
| | - Ching-Chih Tu
- Department of Computer Science, College of Computer Science, National Yang Ming Chiao Tung University, Hsinchu, Taiwan
| | - Jui-Hung Hung
- Department of Computer Science, College of Computer Science, National Yang Ming Chiao Tung University, Hsinchu, Taiwan
|
28
|
Integrating cell morphology with gene expression and chemical structure to aid mitochondrial toxicity detection. Commun Biol 2022; 5:858. [PMID: 35999457 PMCID: PMC9399120 DOI: 10.1038/s42003-022-03763-5] [Citation(s) in RCA: 24] [Impact Index Per Article: 12.0] [Received: 01/24/2022] [Accepted: 07/25/2022] [Indexed: 12/05/2022]
Abstract
Mitochondrial toxicity is an important safety endpoint in drug discovery. Models based solely on chemical structure for predicting mitochondrial toxicity are currently limited in accuracy and applicability domain to the chemical space of the training compounds. In this work, we aimed to utilize both -omics and chemical data to push beyond the state-of-the-art. We combined Cell Painting and Gene Expression data with chemical structural information from Morgan fingerprints for 382 chemical perturbants tested in the Tox21 mitochondrial membrane depolarization assay. We observed that mitochondrial toxicants differ from non-toxic compounds in morphological space and identified compound clusters having similar mechanisms of mitochondrial toxicity, thereby indicating that morphological space provides biological insights related to mechanisms of action of this endpoint. We further showed that models combining Cell Painting, Gene Expression features and Morgan fingerprints improved model performance on an external test set of 244 compounds by 60% (in terms of F1 score) and improved extrapolation to new chemical space. The performance of our combined models was comparable with dedicated in vitro assays for mitochondrial toxicity. Our results suggest that combining chemical descriptors with biological readouts enhances the detection of mitochondrial toxicants, with practical implications in drug discovery. Cell Painting, gene expression, and chemical structural data are used to examine the differences between mitochondrial toxicants and non-toxicants and enhance the detection of mitotoxic compounds for future drug discovery.
|
29
|
Morita K, Mizuno T, Kusuhara H. Investigation of a Data Split Strategy Involving the Time Axis in Adverse Event Prediction Using Machine Learning. J Chem Inf Model 2022; 62:3982-3992. [PMID: 35971760 DOI: 10.1021/acs.jcim.2c00765] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 01/18/2023]
Abstract
Adverse events are a serious issue in drug development, and many prediction methods using machine learning have been developed. The random split cross-validation is the de facto standard for model building and evaluation in machine learning, but care should be taken in adverse event prediction because this approach does not strictly match the real-world situation. The time split, which uses the time axis, is considered suitable for real-world prediction. However, the differences in model performance obtained using the time and random splits are not clear due to the lack of comparable studies. To understand the differences, we compared the model performance between the time and random splits using nine types of compound information as input, eight adverse events as targets, and six machine learning algorithms. The random split showed higher area under the curve values than did the time split for six of eight targets. The chemical spaces of the training and test datasets of the time split were similar, suggesting that the concept of applicability domain is insufficient to explain the differences derived from the splitting. The area under the curve differences were smaller for the protein interaction than for the other datasets. Subsequent detailed analyses suggested the danger of confounding in the use of knowledge-based information in the time split. These findings indicate the importance of understanding the differences between the time and random splits in adverse event prediction and suggest that appropriate use of the splitting strategies and interpretation of results are necessary for the real-world prediction of adverse events. We provide the analysis code and datasets used in the present study at https://github.com/mizuno-group/AE_prediction.
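The contrast between the two splitting strategies described in this abstract can be illustrated with a minimal sketch. The record layout, the `approved` field, and the cutoff date below are hypothetical and are not taken from the authors' repository; the sketch only shows that a random split mixes old and new compounds, whereas a time split trains strictly on the past and tests on the future.

```python
import random
from datetime import date

# Hypothetical records: a compound ID, an approval date, and a binary adverse-event label.
RECORDS = [
    {"compound": "C1", "approved": date(1995, 3, 1), "label": 0},
    {"compound": "C2", "approved": date(1999, 7, 15), "label": 1},
    {"compound": "C3", "approved": date(2003, 1, 20), "label": 0},
    {"compound": "C4", "approved": date(2006, 11, 5), "label": 1},
    {"compound": "C5", "approved": date(2009, 4, 30), "label": 0},
    {"compound": "C6", "approved": date(2012, 8, 12), "label": 1},
    {"compound": "C7", "approved": date(2015, 2, 9), "label": 0},
    {"compound": "C8", "approved": date(2018, 6, 25), "label": 1},
]


def random_split(records, test_fraction=0.25, seed=0):
    """Shuffle the records and hold out a fixed fraction, ignoring dates."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def time_split(records, cutoff):
    """Train on everything before the cutoff date, test on everything from it onward."""
    train = [r for r in records if r["approved"] < cutoff]
    test = [r for r in records if r["approved"] >= cutoff]
    return train, test


random_train, random_test = random_split(RECORDS)
time_train, time_test = time_split(RECORDS, cutoff=date(2010, 1, 1))
```

With the time split, every test compound is newer than every training compound, which mirrors prospective use; the random split gives no such guarantee.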
Affiliation(s)
- Katsuhisa Morita
- Graduate School of Pharmaceutical Sciences, The University of Tokyo, Bunkyo-ku, Tokyo 113-0033, Japan
| | - Tadahaya Mizuno
- Graduate School of Pharmaceutical Sciences, The University of Tokyo, Bunkyo-ku, Tokyo 113-0033, Japan
| | - Hiroyuki Kusuhara
- Graduate School of Pharmaceutical Sciences, The University of Tokyo, Bunkyo-ku, Tokyo 113-0033, Japan
|
30
|
Amano Y, Yamane M, Honda H. RAID: Regression Analysis–Based Inductive DNA Microarray for Precise Read-Across. Front Pharmacol 2022; 13:879907. [PMID: 35935858 PMCID: PMC9354856 DOI: 10.3389/fphar.2022.879907] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/20/2022] [Accepted: 05/30/2022] [Indexed: 12/02/2022]
Abstract
Chemical structure-based read-across represents a promising method for chemical toxicity evaluation without the need for animal testing; however, a chemical structure is not necessarily related to toxicity. Therefore, in vitro studies were often used for read-across reliability refinement; however, their external validity has been hindered by the gap between in vitro and in vivo conditions. Thus, we developed a virtual DNA microarray, regression analysis–based inductive DNA microarray (RAID), which quantitatively predicts in vivo gene expression profiles based on the chemical structure and/or in vitro transcriptome data. For each gene, elastic-net models were constructed using chemical descriptors and in vitro transcriptome data to predict in vivo data from in vitro data (in vitro to in vivo extrapolation; IVIVE). In feature selection, useful genes for assessing the quantitative structure–activity relationship (QSAR) and IVIVE were identified. Predicted transcriptome data derived from the RAID system reflected the in vivo gene expression profiles of characteristic hepatotoxic substances. Moreover, gene ontology and pathway analysis indicated that nuclear receptor-mediated xenobiotic response and metabolic activation are related to these gene expressions. The identified IVIVE-related genes were associated with fatty acid, xenobiotic, and drug metabolisms, indicating that in vitro studies were effective in evaluating these key events. Furthermore, validation studies revealed that chemical substances associated with these key events could be detected as hepatotoxic biosimilar substances. These results indicated that the RAID system could represent an alternative screening test for a repeated-dose toxicity test and toxicogenomics analyses. Our technology provides a critical solution for IVIVE-based read-across by considering the mode of action and chemical structures.
|
31
|
Liang X, Li J, Fu Y, Qu L, Tan Y, Zhang P. A novel machine learning model based on sparse structure learning with adaptive graph regularization for predicting drug side effects. J Biomed Inform 2022; 132:104131. [PMID: 35840061 DOI: 10.1016/j.jbi.2022.104131] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 02/11/2022] [Revised: 06/08/2022] [Accepted: 06/29/2022] [Indexed: 10/17/2022]
Abstract
Drug side effects are closely related to the success and failure of drug development. Here we present a novel machine learning method for side effect prediction. The proposed method treats side effect prediction as a multi-label learning problem and uses sparse structure learning to model the relationships between side effects. Additionally, the proposed method adopts the adaptive graph regularization strategy to explore the local structure in drug data and fuse multiple types of drug features. An alternating optimization algorithm is proposed to solve the optimization problem. We collected chemical structures and biological pathway features of drugs as the inputs of our method to predict drug side effects. The results of the cross-validation experiment showed that our method could significantly improve the prediction performance compared to the other state-of-the-art methods. Besides, our model is highly interpretable. It could learn the drug neighbourhood relationships, side effect relationships, and drug features related to side effects. We systematically validated the information extracted by the model with independent data. Some prediction results could also be supported by literature reports. The proposed method could be applied to integrate both chemical and biological data to predict side effects and helps improve drug safety.
Affiliation(s)
- Xujun Liang
- NHC Key Laboratory of Cancer Proteomics, Department of Oncology, PR China; National Clinical Research Center for Gerontology, Xiangya Hospital, Central South University, PR China.
| | - Jun Li
- NHC Key Laboratory of Cancer Proteomics, Department of Oncology, PR China
| | - Ying Fu
- NHC Key Laboratory of Cancer Proteomics, Department of Oncology, PR China
| | - Lingzhi Qu
- NHC Key Laboratory of Cancer Proteomics, Department of Oncology, PR China
| | - Yuying Tan
- NHC Key Laboratory of Cancer Proteomics, Department of Oncology, PR China
| | - Pengfei Zhang
- NHC Key Laboratory of Cancer Proteomics, Department of Oncology, PR China; National Clinical Research Center for Gerontology, Xiangya Hospital, Central South University, PR China
|
32
|
Kuru HI, Tastan O, Cicek AE. MatchMaker: A Deep Learning Framework for Drug Synergy Prediction. IEEE/ACM Trans Comput Biol Bioinform 2022; 19:2334-2344. [PMID: 34086576 DOI: 10.1109/tcbb.2021.3086702] [Citation(s) in RCA: 35] [Impact Index Per Article: 17.5] [Indexed: 06/12/2023]
Abstract
Drug combination therapies have been a viable strategy for the treatment of complex diseases such as cancer due to increased efficacy and reduced side effects. However, experimentally validating all possible combinations for synergistic interaction, even with high-throughput screens, is intractable due to the vast combinatorial search space. Computational techniques can reduce the number of combinations to be evaluated experimentally by prioritizing promising candidates. We present MatchMaker, which predicts drug synergy scores using drug chemical structure information and gene expression profiles of cell lines in a deep learning framework. For the first time, our model utilizes the largest known drug combination dataset to date, DrugComb. We compare the performance of MatchMaker with the state-of-the-art models and observe up to ∼15% correlation and ∼33% mean squared error (MSE) improvements over the next best method. We investigate the cell types and drug pairs that are relatively harder to predict and present novel candidate pairs. MatchMaker is built and available at https://github.com/tastanlab/matchmaker.
|
33
|
Evangelista JE, Clarke DJB, Xie Z, Lachmann A, Jeon M, Chen K, Jagodnik KM, Jenkins SL, Kuleshov MV, Wojciechowicz ML, Schürer SC, Medvedovic M, Ma'ayan A. SigCom LINCS: data and metadata search engine for a million gene expression signatures. Nucleic Acids Res 2022; 50:W697-W709. [PMID: 35524556 PMCID: PMC9252724 DOI: 10.1093/nar/gkac328] [Citation(s) in RCA: 20] [Impact Index Per Article: 10.0] [Received: 02/22/2022] [Revised: 04/04/2022] [Accepted: 04/20/2022] [Indexed: 12/13/2022]
Abstract
Millions of transcriptome samples were generated by the Library of Integrated Network-based Cellular Signatures (LINCS) program. When these data are processed into searchable signatures along with signatures extracted from Genotype-Tissue Expression (GTEx) and Gene Expression Omnibus (GEO), connections between drugs, genes, pathways and diseases can be illuminated. SigCom LINCS is a webserver that serves over a million gene expression signatures processed, analyzed, and visualized from LINCS, GTEx, and GEO. SigCom LINCS is built with Signature Commons, a cloud-agnostic skeleton Data Commons with a focus on serving searchable signatures. SigCom LINCS provides a rapid signature similarity search for mimickers and reversers given sets of up and down genes, a gene set, a single gene, or any search term. Additionally, users of SigCom LINCS can perform a metadata search to find and analyze subsets of signatures and find information about genes and drugs. SigCom LINCS is findable, accessible, interoperable, and reusable (FAIR) with metadata linked to standard ontologies and vocabularies. In addition, all the data and signatures within SigCom LINCS are available via a well-documented API. In summary, SigCom LINCS, available at https://maayanlab.cloud/sigcom-lincs, is a rich webserver resource for accelerating drug and target discovery in systems pharmacology.
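The idea of searching for mimickers and reversers from sets of up and down genes can be sketched with a simple overlap score. This is an illustrative assumption only, not the scoring method used by SigCom LINCS; the function names, gene symbols, and signature names are hypothetical. A positive score means the signature moves genes in the same direction as the query (mimicker), and a negative score means the opposite direction (reverser).

```python
def direction_score(query_up, query_down, sig_up, sig_down):
    """Concordant minus discordant overlaps, normalised by the query size."""
    concordant = len(query_up & sig_up) + len(query_down & sig_down)
    discordant = len(query_up & sig_down) + len(query_down & sig_up)
    total = len(query_up) + len(query_down)
    return (concordant - discordant) / total if total else 0.0


def rank_signatures(query_up, query_down, signatures):
    """Return (name, score) pairs sorted from strongest mimicker to strongest reverser."""
    scored = [
        (name, direction_score(query_up, query_down, up, down))
        for name, (up, down) in signatures.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical query and signature library: name -> (up genes, down genes).
QUERY_UP = {"G1", "G2"}
QUERY_DOWN = {"G3", "G4"}
SIGNATURES = {
    "drug_x": ({"G1", "G2"}, {"G3"}),
    "drug_y": ({"G3", "G4"}, {"G1"}),
    "drug_z": ({"G9"}, {"G8"}),
}

RANKED = rank_signatures(QUERY_UP, QUERY_DOWN, SIGNATURES)
```

In this toy library, `drug_x` ranks first as a mimicker, `drug_y` ranks last as a reverser, and `drug_z` shares no genes with the query and scores zero.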
Affiliation(s)
- John Erol Evangelista
- Department of Pharmacological Sciences, Department of Artificial Intelligence and Human Health, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, Box 1603, New York, NY 10029, USA
| | - Daniel J B Clarke
- Department of Pharmacological Sciences, Department of Artificial Intelligence and Human Health, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, Box 1603, New York, NY 10029, USA
| | - Zhuorui Xie
- Department of Pharmacological Sciences, Department of Artificial Intelligence and Human Health, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, Box 1603, New York, NY 10029, USA
| | - Alexander Lachmann
- Department of Pharmacological Sciences, Department of Artificial Intelligence and Human Health, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, Box 1603, New York, NY 10029, USA
| | - Minji Jeon
- Department of Pharmacological Sciences, Department of Artificial Intelligence and Human Health, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, Box 1603, New York, NY 10029, USA
| | - Kerwin Chen
- Department of Pharmacological Sciences, Department of Artificial Intelligence and Human Health, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, Box 1603, New York, NY 10029, USA
| | - Kathleen M Jagodnik
- Department of Pharmacological Sciences, Department of Artificial Intelligence and Human Health, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, Box 1603, New York, NY 10029, USA
| | - Sherry L Jenkins
- Department of Pharmacological Sciences, Department of Artificial Intelligence and Human Health, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, Box 1603, New York, NY 10029, USA
| | - Maxim V Kuleshov
- Department of Pharmacological Sciences, Department of Artificial Intelligence and Human Health, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, Box 1603, New York, NY 10029, USA
| | - Megan L Wojciechowicz
- Department of Pharmacological Sciences, Department of Artificial Intelligence and Human Health, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, Box 1603, New York, NY 10029, USA
| | - Stephan C Schürer
- Department of Pharmacology, Miller School of Medicine, University of Miami, Miami, FL 33136, USA
| | - Mario Medvedovic
- Department of Biomedical Informatics, University of Cincinnati College of Medicine, Cincinnati, OH 45267, USA
| | - Avi Ma'ayan
- Department of Pharmacological Sciences, Department of Artificial Intelligence and Human Health, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, Box 1603, New York, NY 10029, USA
|
34
|
Gao S, Han L, Luo D, Xiao Z, Liu G, Zhang Y, Zhou W. Deep Learning Applications for the accurate identification of low-transcriptional activity drugs and their mechanism of actions. Pharmacol Res 2022; 180:106225. [PMID: 35452801 DOI: 10.1016/j.phrs.2022.106225] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/02/2022] [Revised: 04/07/2022] [Accepted: 04/15/2022] [Indexed: 10/18/2022]
Abstract
Analysis of drug-induced expression profiles facilitates a comprehensive understanding of drug properties. However, many compounds exhibit weak transcriptional responses even though most of them possess definite pharmacological effects. For example, over 66.4% of 312,438 molecular signatures in the Library of Integrated Network-based Cellular Signatures (LINCS) database exhibit low transcriptional activity (i.e. TAS-low signatures). When computing the association between TAS-low signatures with shared mechanisms of action (MOAs), commonly used algorithms showed inadequate performance with an average area under the receiver operating characteristic curve (AUROC) of 0.55, but the accuracy on the same task can be improved by our tool, Genetic profile activity relationship (GPAR), with an average AUROC of 0.68. Up to 36 out of 74 TAS-low MOAs were well trained with AUROC≥0.7 by GPAR, more than with other approaches. Further studies showed that GPAR benefited from the size of training samples more significantly than other approaches. Lastly, in biological validation of the MOA prediction for a TAS-low drug, Tropisetron, we found an unreported mechanism: Tropisetron can bind to the glucocorticoid receptor. This study indicated that GPAR can serve as an effective approach for the accurate identification of low-transcriptional activity drugs and their MOAs, thus providing a good tool for drug repurposing with both TAS-low and TAS-high signatures.
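For readers unfamiliar with the AUROC values quoted in this abstract, the following minimal sketch computes AUROC as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as one half. The scores and labels are made-up illustrative values and have no connection to GPAR or the LINCS data.

```python
def auroc(scores, labels):
    """AUROC via pairwise comparison of positive and negative scores (ties count 0.5)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical classifier scores and true labels (1 = shares the MOA, 0 = does not).
EXAMPLE_SCORES = [0.9, 0.8, 0.4, 0.3]
EXAMPLE_LABELS = [1, 0, 1, 0]
EXAMPLE_AUROC = auroc(EXAMPLE_SCORES, EXAMPLE_LABELS)
```

Here three of the four positive-negative pairs are ordered correctly, so the AUROC is 0.75; a value of 0.5 corresponds to random ranking, which puts the reported 0.55 versus 0.68 in context.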
Affiliation(s)
- Shengqiao Gao
- Beijing Institute of Pharmacology and Toxicology, Beijing 100850, China; State Key Laboratory of Toxicology and Medical Countermeasures, Beijing 100850, China
| | - Lu Han
- Beijing Institute of Pharmacology and Toxicology, Beijing 100850, China; State Key Laboratory of Toxicology and Medical Countermeasures, Beijing 100850, China
| | - Dan Luo
- Beijing Institute of Pharmacology and Toxicology, Beijing 100850, China; State Key Laboratory of Toxicology and Medical Countermeasures, Beijing 100850, China
| | - Zhiyong Xiao
- Beijing Institute of Pharmacology and Toxicology, Beijing 100850, China; State Key Laboratory of Toxicology and Medical Countermeasures, Beijing 100850, China
| | - Gang Liu
- Beijing Institute of Pharmacology and Toxicology, Beijing 100850, China; State Key Laboratory of Toxicology and Medical Countermeasures, Beijing 100850, China
| | - Yongxiang Zhang
- Beijing Institute of Pharmacology and Toxicology, Beijing 100850, China; State Key Laboratory of Toxicology and Medical Countermeasures, Beijing 100850, China.
| | - Wenxia Zhou
- Beijing Institute of Pharmacology and Toxicology, Beijing 100850, China; State Key Laboratory of Toxicology and Medical Countermeasures, Beijing 100850, China.
|
35
|
Chen YW, Diamante G, Ding J, Nghiem TX, Yang J, Ha SM, Cohn P, Arneson D, Blencowe M, Garcia J, Zaghari N, Patel P, Yang X. PharmOmics: A species- and tissue-specific drug signature database and gene-network-based drug repositioning tool. iScience 2022; 25:104052. [PMID: 35345455 PMCID: PMC8957031 DOI: 10.1016/j.isci.2022.104052] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 11/18/2021] [Revised: 01/29/2022] [Accepted: 03/08/2022] [Indexed: 12/29/2022]
Abstract
Drug development has been hampered by a high failure rate in clinical trials due to our incomplete understanding of drug functions across organs and species. Therefore, elucidating species- and tissue-specific drug functions can provide insights into therapeutic efficacy, potential adverse effects, and interspecies differences necessary for effective translational medicine. Here, we present PharmOmics, a drug knowledgebase and analytical tool that is hosted on an interactive web server. Using tissue- and species-specific transcriptome data from human, mouse, and rat curated from different databases, we implemented a gene-network-based approach for drug repositioning. We demonstrate the potential of PharmOmics to retrieve known therapeutic drugs and identify drugs with tissue toxicity using in silico performance assessment. We further validated predicted drugs for nonalcoholic fatty liver disease in mice. By combining tissue- and species-specific in vivo drug signatures with gene networks, PharmOmics serves as a complementary tool to support drug characterization and network-based medicine.
- Development of PharmOmics, a platform for drug repositioning and toxicity prediction
- Contains >18000 species/tissue-specific gene signatures for 941 drugs and chemicals
- Benchmarked and validated network-based drug repositioning and toxicity prediction
- PharmOmics is freely accessible via an online web server to facilitate user access
Affiliation(s)
- Yen-Wei Chen
- Department of Integrative Biology and Physiology, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Interdepartmental Program of Molecular Toxicology, University of California, Los Angeles, Los Angeles, CA 90095, USA
| | - Graciel Diamante
- Department of Integrative Biology and Physiology, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Interdepartmental Program of Molecular Toxicology, University of California, Los Angeles, Los Angeles, CA 90095, USA
| | - Jessica Ding
- Department of Integrative Biology and Physiology, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Interdepartmental Program of Molecular, Cellular, & Integrative Physiology, Los Angeles, Los Angeles, CA 90095, USA
| | - Thien Xuan Nghiem
- Department of Integrative Biology and Physiology, University of California, Los Angeles, Los Angeles, CA 90095, USA
| | - Jessica Yang
- Department of Integrative Biology and Physiology, University of California, Los Angeles, Los Angeles, CA 90095, USA
| | - Sung-Min Ha
- Department of Integrative Biology and Physiology, University of California, Los Angeles, Los Angeles, CA 90095, USA
| | - Peter Cohn
- Department of Integrative Biology and Physiology, University of California, Los Angeles, Los Angeles, CA 90095, USA
| | - Douglas Arneson
- Department of Integrative Biology and Physiology, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Interdepartmental Program of Bioinformatics, University of California, Los Angeles, Los Angeles, CA 90095, USA
| | - Montgomery Blencowe
- Department of Integrative Biology and Physiology, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Interdepartmental Program of Molecular, Cellular, & Integrative Physiology, Los Angeles, Los Angeles, CA 90095, USA
| | - Jennifer Garcia
- Department of Integrative Biology and Physiology, University of California, Los Angeles, Los Angeles, CA 90095, USA
| | - Nima Zaghari
- Department of Integrative Biology and Physiology, University of California, Los Angeles, Los Angeles, CA 90095, USA
| | - Paul Patel
- Department of Integrative Biology and Physiology, University of California, Los Angeles, Los Angeles, CA 90095, USA
| | - Xia Yang
- Department of Integrative Biology and Physiology, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Interdepartmental Program of Molecular Toxicology, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Interdepartmental Program of Molecular, Cellular, & Integrative Physiology, Los Angeles, Los Angeles, CA 90095, USA
- Interdepartmental Program of Bioinformatics, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Institute for Quantitative and Computational Biosciences, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Corresponding author
|
36
|
Kim E, Nam H. DeSIDE-DDI: interpretable prediction of drug-drug interactions using drug-induced gene expressions. J Cheminform 2022; 14:9. [PMID: 35246258 PMCID: PMC8895921 DOI: 10.1186/s13321-022-00589-5] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 09/25/2021] [Accepted: 02/09/2022] [Indexed: 11/10/2022]
Abstract
Adverse drug-drug interaction (DDI) is a major concern in polypharmacy because of its unexpected adverse side effects and must be identified at an early stage of drug discovery and development. Many computational methods have been proposed for this purpose, but most require specific types of information, or pay little attention to interpretation in terms of the underlying genes. We propose a deep learning-based framework for DDI prediction with drug-induced gene expression signatures so that the model can provide expression-level interpretability for DDIs. The model engineers dynamic drug features using a gating mechanism that mimics the co-administration effects by imposing attention on genes. Also, each side effect is projected into a latent space through translating embedding. As a result, the model achieved an AUC of 0.889 and an AUPR of 0.915 in unseen interaction prediction, which is highly competitive and outperforms other state-of-the-art methods. Furthermore, it can predict potential DDIs with new compounds not used in training. In conclusion, using drug-induced gene expression signatures followed by gating and translating embedding can increase DDI prediction accuracy while providing model interpretability. The source code is available on GitHub (https://github.com/GIST-CSBL/DeSIDE-DDI).
Affiliation(s)
- Eunyoung Kim
- School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Buk-gu, Gwangju, 61005, Republic of Korea
- Hojung Nam
- School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Buk-gu, Gwangju, 61005, Republic of Korea.
|
37
|
Bundy JL, Judson R, Williams AJ, Grulke C, Shah I, Everett LJ. Predicting molecular initiating events using chemical target annotations and gene expression. BioData Min 2022; 15:7. [PMID: 35246223 PMCID: PMC8895536 DOI: 10.1186/s13040-022-00292-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/28/2021] [Accepted: 02/10/2022] [Indexed: 02/06/2023] Open
Abstract
BACKGROUND: The advent of high-throughput transcriptomic screening technologies has resulted in a wealth of publicly available gene expression data associated with chemical treatments. From a regulatory perspective, data sets that cover a large chemical space and contain reference chemicals offer utility for the prediction of molecular initiating events associated with chemical exposure. Here, we integrate data from a large compendium of transcriptomic responses to chemical exposure with a comprehensive database of chemical-protein associations to train binary classifiers that predict mechanism(s) of action from transcriptomic responses. First, we linked reference chemicals present in the LINCS L1000 gene expression data collection to chemical identifiers in RefChemDB, a database of chemical-protein interactions. Next, we trained binary classifiers on MCF7 human breast cancer cell line derived gene expression profiles and chemical-protein labels using six classification algorithms to identify optimal analysis parameters. To validate classifier accuracy, we used holdout data sets, training-excluded reference chemicals, and empirical significance testing of null models derived from permuted chemical-protein associations. To identify classifiers whose predictive performance varies across training data derived from different cellular contexts, we trained a separate set of binary classifiers on the PC3 human prostate cancer cell line. RESULTS: We trained classifiers using expression data associated with chemical treatments linked to 51 molecular initiating events. This analysis identified and validated 9 high-performing classifiers with empirical p-values lower than 0.05, internal accuracies ranging from 0.73 to 0.94, and holdout accuracies of 0.68 to 0.92. High-ranking predictions for training-excluded reference chemicals demonstrated that predictive accuracy extends beyond the set of chemicals used in classifier training. To explore differences in classifier performance as a function of training data cellular context, MCF7-trained classifier accuracies were compared to classifiers trained on the PC3 gene expression data for the same molecular initiating events. CONCLUSIONS: This methodology can offer insight in prioritizing candidate perturbagens of interest for targeted screens. This approach can also help guide the selection of relevant cellular contexts for screening classes of candidate perturbagens using cell line specific model performance.
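The empirical significance testing mentioned in this abstract can be illustrated with a minimal sketch. It assumes a simplified setting in which a fixed set of predictions is scored against shuffled labels; the authors instead derive null models from permuted chemical-protein associations, and the function names, the add-one correction, and the example data below are illustrative assumptions rather than details taken from the paper.

```python
import random


def accuracy(labels, predictions):
    """Fraction of positions where the label equals the prediction."""
    hits = sum(1 for y, p in zip(labels, predictions) if y == p)
    return hits / len(labels)


def permutation_null(labels, predictions, n_permutations=1000, seed=0):
    """Accuracies obtained after shuffling the labels (hypothetical null model)."""
    rng = random.Random(seed)
    shuffled = list(labels)
    null_scores = []
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        null_scores.append(accuracy(shuffled, predictions))
    return null_scores


def empirical_p_value(observed, null_scores):
    """Share of null scores at least as large as the observed score.

    The +1 terms are the usual add-one correction, so p is never exactly 0.
    """
    exceed = sum(1 for s in null_scores if s >= observed)
    return (exceed + 1) / (len(null_scores) + 1)


if __name__ == "__main__":
    labels = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]       # made-up example labels
    predictions = [1, 0, 1, 1, 0, 0, 1, 0, 0, 0]  # made-up classifier output
    observed = accuracy(labels, predictions)
    null_scores = permutation_null(labels, predictions)
    print(observed, empirical_p_value(observed, null_scores))
```

A classifier would pass the p < 0.05 criterion quoted above only if fewer than about 5% of the permuted scores reach the observed accuracy.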
Affiliation(s)
- Joseph L Bundy
- Biomolecular and Computational Toxicology Division, Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency, 109 T.W. Alexander Drive, Durham, NC, 27709, USA
- Richard Judson
- Biomolecular and Computational Toxicology Division, Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency, 109 T.W. Alexander Drive, Durham, NC, 27709, USA
- Antony J Williams
- Chemical Characterization and Exposure Division, Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency, 109 T.W. Alexander Drive, Durham, NC, 27709, USA
- Chris Grulke
- Chemical Characterization and Exposure Division, Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency, 109 T.W. Alexander Drive, Durham, NC, 27709, USA
- Imran Shah
- Biomolecular and Computational Toxicology Division, Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency, 109 T.W. Alexander Drive, Durham, NC, 27709, USA
- Logan J Everett
- Biomolecular and Computational Toxicology Division, Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency, 109 T.W. Alexander Drive, Durham, NC, 27709, USA.
|
38
|
Clarke DJB, Kuleshov MV, Xie Z, Evangelista JE, Meyers MR, Kropiwnicki E, Jenkins SL, Ma’ayan A. Gene and drug landing page aggregator. BIOINFORMATICS ADVANCES 2022; 2:vbac013. [PMID: 35368424 PMCID: PMC8969666 DOI: 10.1093/bioadv/vbac013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 12/28/2021] [Revised: 02/05/2022] [Accepted: 02/26/2022] [Indexed: 01/27/2023]
Abstract
Motivation: Many biological and biomedical researchers commonly search for information about genes and drugs to gather knowledge from these resources. For the most part, such information is served as landing pages in disparate data repositories and web portals. Results: The Gene and Drug Landing Page Aggregator (GDLPA) provides users with access to 50 gene-centric and 19 drug-centric repositories, enabling them to retrieve landing pages corresponding to their gene and drug queries. Bringing these resources together into one dashboard that directs users to the landing pages across many resources can help centralize gene- and drug-centric knowledge, as well as raise awareness of available resources that may be missed when using standard search engines. To demonstrate the utility of GDLPA, case studies for the gene klotho and the drug remdesivir were developed. The first case study highlights the potential role of klotho as a drug target for aging and kidney disease, while the second study gathers knowledge regarding approval, usage, and safety for remdesivir, the first approved coronavirus disease 2019 therapeutic. Finally, based on our experience, we provide guidelines for developing effective landing pages for genes and drugs. Availability and implementation: GDLPA is open source and is available from: https://cfde-gene-pages.cloud/. Supplementary information: Supplementary data are available at Bioinformatics Advances online.
Affiliation(s)
- Daniel J B Clarke
- Department of Pharmacological Sciences, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Maxim V Kuleshov
- Department of Pharmacological Sciences, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Zhuorui Xie
- Department of Pharmacological Sciences, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- John E Evangelista
- Department of Pharmacological Sciences, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Marilyn R Meyers
- Department of Pharmacological Sciences, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Eryk Kropiwnicki
- Department of Pharmacological Sciences, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Sherry L Jenkins
- Department of Pharmacological Sciences, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Avi Ma’ayan
- Department of Pharmacological Sciences, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA. To whom correspondence should be addressed.
|
39
|
Kropiwnicki E, Lachmann A, Clarke DJB, Xie Z, Jagodnik KM, Ma’ayan A. DrugShot: querying biomedical search terms to retrieve prioritized lists of small molecules. BMC Bioinformatics 2022; 23:76. [PMID: 35183110 PMCID: PMC8858480 DOI: 10.1186/s12859-022-04590-5] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/04/2021] [Accepted: 01/28/2022] [Indexed: 11/29/2022] Open
Abstract
Background: PubMed contains millions of abstracts that co-mention terms that describe drugs with other biomedical terms such as genes or diseases. Unique opportunities exist for leveraging these co-mentions by integrating them with other drug-drug similarity resources such as the Library of Integrated Network-based Cellular Signatures (LINCS) L1000 signatures to develop novel hypotheses. Results: DrugShot is a web-based server application and an Appyter that enables users to enter any biomedical search term into a simple input form to receive ranked lists of drugs and other small molecules based on their relevance to the search term. To produce ranked lists of small molecules, DrugShot cross-references returned PubMed identifiers (PMIDs) with DrugRIF or AutoRIF, which are curated resources of drug-PMID associations, to produce an associated small molecule list where each small molecule is ranked according to total co-mentions with the search term from shared PubMed IDs. Additionally, using two types of drug-drug similarity matrices, lists of small molecules are predicted to be associated with the search term. Such predictions are based on literature co-mentions and signature similarity from LINCS L1000 drug-induced gene expression profiles. Conclusions: DrugShot prioritizes drugs and small molecules associated with biomedical search terms. In addition to listing known associations, DrugShot predicts additional drugs and small molecules related to any search term. Hence, DrugShot can be used to prioritize drugs and preclinical compounds for drug repurposing and suggest indications and adverse events for preclinical compounds. DrugShot is freely and openly available at: https://maayanlab.cloud/drugshot and https://appyters.maayanlab.cloud/#/DrugShot. Supplementary Information: The online version contains supplementary material available at 10.1186/s12859-022-04590-5.
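The co-mention ranking described in this abstract can be sketched as a simple counting step. This is a minimal illustration, not DrugShot's implementation: the function name, the input shapes, the tie-breaking rule, and the example PMIDs are all assumptions, and the drug-PMID pairs stand in for a resource such as DrugRIF.

```python
from collections import Counter


def rank_by_comentions(term_pmids, drug_pmid_pairs, top_n=None):
    """Rank drugs by how many of their PMIDs also match the search term.

    term_pmids: PMIDs returned for the search term.
    drug_pmid_pairs: (drug, pmid) associations (hypothetical stand-in for a
        drug-PMID resource); duplicate pairs are counted once.
    """
    term_set = set(term_pmids)
    counts = Counter(
        drug for drug, pmid in set(drug_pmid_pairs) if pmid in term_set
    )
    # Sort by descending co-mention count, then alphabetically for stable ties.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]


if __name__ == "__main__":
    term_pmids = [101, 102, 103]  # made-up identifiers
    pairs = [
        ("aspirin", 101), ("aspirin", 102), ("aspirin", 101),
        ("ibuprofen", 103), ("ibuprofen", 999),
        ("metformin", 555),
    ]
    print(rank_by_comentions(term_pmids, pairs))
```

Drugs with no shared PMIDs simply do not appear in the ranked list; the similarity-based predictions mentioned in the abstract are a separate step that this sketch does not cover.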
|
40
|
Xu X, Yue L, Li B, Liu Y, Wang Y, Zhang W, Wang L. DSGAT: predicting frequencies of drug side effects by graph attention networks. Brief Bioinform 2022; 23:6511198. [DOI: 10.1093/bib/bbab586] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/06/2021] [Revised: 12/01/2021] [Accepted: 12/20/2021] [Indexed: 12/22/2022] Open
Abstract
A critical issue in drug risk–benefit evaluation is determining the frequencies of drug side effects. The randomized controlled trial is the conventional method for obtaining the frequencies of side effects, but it is laborious and slow. Therefore, it is necessary to guide the trial with computational methods. Existing methods for predicting the frequencies of drug side effects focus on modeling the drug–side effect interaction graph. The inherent disadvantage of these approaches is that their performance is closely linked to the density of interactions, which is highly sparse. More importantly, for a cold start drug that does not appear in the training data, such methods cannot learn the preference embedding of the drug because there is no link to the drug in the interaction graph. In this work, we propose a new method for predicting the frequencies of drug side effects, DSGAT, which uses the drug molecular graph instead of the commonly used interaction graph. This makes it possible to learn embeddings for cold start drugs with graph attention networks. The proposed novel loss function, i.e. a weighted $\varepsilon$-insensitive loss function, could alleviate the sparsity problem. Experimental results on one benchmark dataset demonstrate that DSGAT yields significant improvement for cold start drugs and outperforms the state-of-the-art performance in the warm start scenario. Source code and datasets are available at https://github.com/xxy45/DSGAT.
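For readers unfamiliar with the $\varepsilon$-insensitive loss named in this abstract, the sketch below shows the standard form, max(0, |y - y_hat| - eps), with a per-sample weight. The abstract does not state how DSGAT assigns weights or chooses $\varepsilon$, so the weighting scheme, the default eps, and the example frequency values here are assumptions for illustration only.

```python
def weighted_eps_insensitive_loss(y_true, y_pred, weights, eps=0.5):
    """Mean of w * max(0, |y - y_hat| - eps) over all samples.

    Errors smaller than eps contribute nothing; larger errors grow linearly.
    The weights are a hypothetical way to down-weight abundant unobserved
    (zero) entries in a sparse frequency matrix.
    """
    if not (len(y_true) == len(y_pred) == len(weights)):
        raise ValueError("inputs must have the same length")
    total = 0.0
    for y, y_hat, w in zip(y_true, y_pred, weights):
        total += w * max(0.0, abs(y - y_hat) - eps)
    return total / len(y_true)


if __name__ == "__main__":
    y_true = [4.0, 0.0, 2.0]   # made-up frequency classes, 0 = unobserved
    y_pred = [3.0, 0.2, 2.4]
    weights = [1.0, 0.1, 1.0]  # assumed lower weight for the unobserved entry
    print(weighted_eps_insensitive_loss(y_true, y_pred, weights))
```

Only the first sample exceeds the tolerance in this example, so the other two predictions incur no penalty.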
|
41
|
Yu Z, Wu Z, Li W, Liu G, Tang Y. ADENet: a novel network-based inference method for prediction of drug adverse events. Brief Bioinform 2022; 23:6510157. [PMID: 35039845 DOI: 10.1093/bib/bbab580] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/12/2021] [Revised: 12/02/2021] [Accepted: 12/19/2021] [Indexed: 11/13/2022] Open
Abstract
Identification of adverse drug events (ADEs) is crucial to reduce human health risks and improve drug safety assessment. With the growing volume of biological and medical data, computational methods such as network-based methods have been proposed for ADE prediction with high efficiency and low cost. However, previous network-based methods rely on the topological information of known drug-ADE networks, and hence cannot make predictions for novel compounds without any known ADE. In this study, we introduced chemical substructures to bridge the gap between the drug-ADE network and novel compounds, and developed a novel network-based method named ADENet, which can predict potential ADEs not only for drugs within the drug-ADE network, but also for novel compounds outside the network. To show the performance of ADENet, we collected drug-ADE associations from a comprehensive database named MetaADEDB and constructed a series of network-based prediction models. These models obtained high area under the receiver operating characteristic curve values ranging from 0.871 to 0.947 in 10-fold cross-validation. The best model further showed high performance in external validation, where it outperformed a previous network-based method and a recent deep learning-based method. Using several approved drugs as case studies, we found that 32-54% of the predicted ADEs can be validated by the literature, indicating the practical value of ADENet. Moreover, ADENet is freely available at our web server named NetInfer (http://lmmd.ecust.edu.cn/netinfer). In summary, our method would provide a promising tool for ADE prediction and drug safety assessment in drug discovery and development.
Affiliation(s)
- Zhuohang Yu
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
- Zengrui Wu
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
- Weihua Li
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
- Guixia Liu
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
- Yun Tang
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
|
42
|
Kropiwnicki E, Binder J, Yang J, Holmes J, Lachmann A, Clarke DJB, Sheils T, Kelleher K, Metzger V, Bologa CG, Oprea TI, Ma’ayan A. Getting Started with the IDG KMC Datasets and Tools. Curr Protoc 2022; 2:e355. [PMID: 35085427 PMCID: PMC10789444 DOI: 10.1002/cpz1.355] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/18/2022]
Abstract
The Illuminating the Druggable Genome (IDG) consortium is a National Institutes of Health (NIH) Common Fund program designed to enhance our knowledge of under-studied proteins, more specifically, proteins unannotated within the three most commonly drug-targeted protein families: G-protein coupled receptors, ion channels, and protein kinases. Since 2014, the IDG Knowledge Management Center (IDG-KMC) has generated several open-access datasets and resources that jointly serve as a highly translational machine-learning-ready knowledgebase focused on human protein-coding genes and their products. The goal of the IDG-KMC is to develop comprehensive integrated knowledge for the druggable genome to illuminate the uncharacterized or poorly annotated portion of the druggable genome. The tools derived from the IDG-KMC provide either user-friendly visualizations or ways to impute the knowledge about potential targets using machine learning strategies. In the following protocols, we describe how to use each web-based tool to accelerate illumination in under-studied proteins. © 2022 The Authors. Current Protocols published by Wiley Periodicals LLC. 
Basic Protocol 1: Interacting with the Pharos user interface
Basic Protocol 2: Accessing the data in Harmonizome
Basic Protocol 3: The ARCHS4 resource
Basic Protocol 4: Making predictions about gene function with PrismExp
Basic Protocol 5: Using Geneshot to illuminate knowledge about under-studied targets
Basic Protocol 6: Exploring under-studied targets with TIN-X
Basic Protocol 7: Interacting with the DrugCentral user interface
Basic Protocol 8: Estimating Anti-SARS-CoV-2 activities with DrugCentral REDIAL-2020
Basic Protocol 9: Drug Set Enrichment Analysis using Drugmonizome
Basic Protocol 10: The Drugmonizome-ML Appyter
Basic Protocol 11: The Harmonizome-ML Appyter
Basic Protocol 12: GWAS target illumination with TIGA
Basic Protocol 13: Prioritizing kinases for lists of proteins and phosphoproteins with KEA3
Basic Protocol 14: Converting PubMed searches to drug sets with the DrugShot Appyter
Affiliation(s)
- Eryk Kropiwnicki
- Department of Pharmacological Sciences, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, Box 1603, New York, NY 10029, USA
- Jessica Binder
- Translational Informatics Division, Department of Internal Medicine, University of New Mexico, Albuquerque, NM, USA
- Jeremy Yang
- Translational Informatics Division, Department of Internal Medicine, University of New Mexico, Albuquerque, NM, USA
- Jayme Holmes
- Translational Informatics Division, Department of Internal Medicine, University of New Mexico, Albuquerque, NM, USA
- Alexander Lachmann
- Department of Pharmacological Sciences, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, Box 1603, New York, NY 10029, USA
- Daniel J. B. Clarke
- Department of Pharmacological Sciences, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, Box 1603, New York, NY 10029, USA
- Timothy Sheils
- National Center for Advancing Translational Science, 9800 Medical Center Drive, Rockville, MD 20850, USA
- Keith Kelleher
- National Center for Advancing Translational Science, 9800 Medical Center Drive, Rockville, MD 20850, USA
- Vincent Metzger
- Translational Informatics Division, Department of Internal Medicine, University of New Mexico, Albuquerque, NM, USA
- Cristian G. Bologa
- Translational Informatics Division, Department of Internal Medicine, University of New Mexico, Albuquerque, NM, USA
- Tudor I. Oprea
- Translational Informatics Division, Department of Internal Medicine, University of New Mexico, Albuquerque, NM, USA
- Avi Ma’ayan
- Department of Pharmacological Sciences, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, Box 1603, New York, NY 10029, USA
|
43
|
Zaheer J, Yu AR, Kim H, Kang HJ, Kang MK, Lee JJ, Kim JS. Diacerein, an inhibitor of IL-1β downstream mediated apoptosis, improves radioimmunotherapy in a mouse model of Burkitt's lymphoma. Am J Cancer Res 2021; 11:6147-6159. [PMID: 35018248 PMCID: PMC8727812] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/23/2021] [Accepted: 10/31/2021] [Indexed: 06/14/2023] Open
Abstract
Lymphoma has the characteristics of a solid tumor. Penetration of monoclonal antibodies is limited in solid tumors during radioimmunotherapy (RIT). Here, we first investigated the use of diacerein (DIA) as a combination drug to improve the penetration and therapeutic efficacy of 131I-rituximab (RTX) using the Burkitt's lymphoma mouse model. We selected DIA through computational drug repurposing and focused on rheumatoid arthritis (RA) drug interaction genes to minimize side effects. Then, the cytotoxicity of DIA was assessed in vitro using three different lymphoma cell lines. DIA-induced apoptosis was confirmed by Western blotting. After confirming apoptosis, we confirmed the enhanced uptake of 131I-RTX in Burkitt's lymphoma mouse model using SPECT/CT. Autoradiography of 131I-RTX confirmed the therapeutic effect of DIA. Finally, the tumor size and survival rate were assessed to measure the enhanced therapeutic efficacy when DIA was used. In addition, we assessed the dose-dependency of DIA in terms of the accumulation of 131I-RTX in tumor tissue, the tumor size, and the survival rate. The in vitro cytotoxicity was 10.9%. We showed that DIA induced apoptosis which was related to downstream IL-1β signaling by Western blotting. We found increased Annexin V positive apoptosis after DIA administration. Immuno SPECT/CT images demonstrated a higher uptake of 131I-RTX in tumors in the DIA-administered group than that in the PBS-alone group. However, there were no statistical differences of dose-dependency between 20 mg/kg and 40 mg/kg of DIA. Tumor growth was significantly inhibited in the group treated with the combination of DIA plus 131I-RTX at 7 days after injection. Our suggested combination of DIA and 131I-RTX strategies could enhance the efficacy of 131I-RTX treatment.
Affiliation(s)
- Javeria Zaheer
- Division of RI Application, Korea Institute of Radiological and Medical Sciences, Seoul 01812, Republic of Korea
- Radiological and Medico-Oncological Sciences, University of Science and Technology (UST), Seoul 01812, Republic of Korea
- A Ram Yu
- Laboratory Animal Center, Osong Medical Innovation Foundation, Osong, Chungbuk 28160, Republic of Korea
- Hyeongi Kim
- Division of RI Application, Korea Institute of Radiological and Medical Sciences, Seoul 01812, Republic of Korea
- Hyun Ji Kang
- Division of RI Application, Korea Institute of Radiological and Medical Sciences, Seoul 01812, Republic of Korea
- Radiological and Medico-Oncological Sciences, University of Science and Technology (UST), Seoul 01812, Republic of Korea
- Min Kyoung Kang
- Laboratory Animal Center, Osong Medical Innovation Foundation, Osong, Chungbuk 28160, Republic of Korea
- Jae Jun Lee
- Laboratory Animal Center, Osong Medical Innovation Foundation, Osong, Chungbuk 28160, Republic of Korea
- Jin Su Kim
- Division of RI Application, Korea Institute of Radiological and Medical Sciences, Seoul 01812, Republic of Korea
- Radiological and Medico-Oncological Sciences, University of Science and Technology (UST), Seoul 01812, Republic of Korea
|
44
|
Wang Z, Guo K, Gao P, Pu Q, Li C, Hur J, Wu M. Repurposable drugs for SARS-CoV-2 and influenza sepsis with scRNA-seq data targeting post-transcription modifications. PRECISION CLINICAL MEDICINE 2021; 4:215-230. [PMID: 34993416 PMCID: PMC8694063 DOI: 10.1093/pcmedi/pbab022] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/14/2021] [Revised: 08/04/2021] [Accepted: 08/22/2021] [Indexed: 02/06/2023] Open
Abstract
Coronavirus disease 2019 (COVID-19) has impacted almost every part of human life worldwide, posing a massive threat to human health. The lack of time for new drug discovery and the urgent need for rapid disease control to reduce mortality have led to a search for quick and effective alternatives to novel therapeutics, for example drug repurposing. To identify potentially repurposable drugs, we employed a systematic approach to mine candidates from U.S. FDA-approved drugs and preclinical small-molecule compounds by integrating gene expression perturbation data for chemicals from the Library of Integrated Network-Based Cellular Signatures project with a publicly available single-cell RNA sequencing dataset from patients with mild and severe COVID-19 (GEO: GSE145926, public data available and accessed on 22 April 2020). We identified 281 FDA-approved drugs that have the potential to be effective against severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection, 16 of which are currently undergoing clinical trials to evaluate their efficacy against COVID-19. We experimentally tested and demonstrated the inhibitory effects of tyrphostin-AG-1478 and brefeldin-a, two chemical inhibitors of glycosylation (a post-translational modification) on the replication of the single-stranded ribonucleic acid (ssRNA) virus influenza A virus as well as on the transcription and translation of host cell cytokines and their regulators (IFNs and ISGs). In conclusion, we have identified and experimentally validated repurposable anti-SARS-CoV-2 and IAV drugs using a systems biology approach, which may have the potential for treating these viral infections and their complications (sepsis).
Affiliation(s)
- Zhihan Wang
- Department of Biomedical Sciences, University of North Dakota School of Medicine and Health Sciences, Grand Forks, ND 58202, USA
- West China School of Basic Medical Sciences & Forensic Medicine, Sichuan University, Chengdu 610041, China
- Kai Guo
- Department of Neurology, University of Michigan, Ann Arbor, MI 48109, USA
- Pan Gao
- Department of Biomedical Sciences, University of North Dakota School of Medicine and Health Sciences, Grand Forks, ND 58202, USA
- Medical Research Institute, Wuhan University, Wuhan 430071, China
- Qinqin Pu
- Department of Biomedical Sciences, University of North Dakota School of Medicine and Health Sciences, Grand Forks, ND 58202, USA
- Changlong Li
- West China School of Basic Medical Sciences & Forensic Medicine, Sichuan University, Chengdu 610041, China
- Junguk Hur
- Department of Biomedical Sciences, University of North Dakota School of Medicine and Health Sciences, Grand Forks, ND 58202, USA
- Min Wu
- Department of Biomedical Sciences, University of North Dakota School of Medicine and Health Sciences, Grand Forks, ND 58202, USA
|
45
|
Cakir A, Tuncer M, Taymaz-Nikerel H, Ulucan O. Side effect prediction based on drug-induced gene expression profiles and random forest with iterative feature selection. THE PHARMACOGENOMICS JOURNAL 2021; 21:673-681. [PMID: 34155353 DOI: 10.1038/s41397-021-00246-4] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/26/2021] [Revised: 05/28/2021] [Accepted: 06/10/2021] [Indexed: 02/06/2023]
Abstract
One in every ten drug candidates fails in clinical trials, mainly due to efficacy and safety related issues, despite in-depth preclinical testing. Even some approved drugs, such as chemotherapeutics, are notorious for side effects that are burdensome for patients. In order to pave the way for new therapeutics with more tolerable side effects, the mechanisms underlying side effects need to be fully elucidated. In this work, we addressed the common side effects of chemotherapeutics, namely alopecia, diarrhea and edema. A strategy based on the Random Forest algorithm unveiled an expression signature involving 40 genes that predicted these side effects with an accuracy of 89%. We further characterized the resulting signature and its association with the side effects using functional enrichment analysis and protein-protein interaction networks. This work contributes to the ongoing efforts in drug development toward early identification of side effects so that resources can be used more effectively.
Affiliation(s)
- Arzu Cakir
- Department of Genetics and Bioengineering, Istanbul Bilgi University, Istanbul, Eyupsultan, Turkey
- Melisa Tuncer
- Department of Genetics and Bioengineering, Istanbul Bilgi University, Istanbul, Eyupsultan, Turkey
- Hilal Taymaz-Nikerel
- Department of Genetics and Bioengineering, Istanbul Bilgi University, Istanbul, Eyupsultan, Turkey
- Ozlem Ulucan
- Department of Genetics and Bioengineering, Istanbul Bilgi University, Istanbul, Eyupsultan, Turkey.
|
46
|
Hao Y, Moore JH. TargetTox: A Feature Selection Pipeline for Identifying Predictive Targets Associated with Drug Toxicity. J Chem Inf Model 2021; 61:5386-5394. [PMID: 34757743 DOI: 10.1021/acs.jcim.1c00733] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]
Abstract
In silico assessment of drug toxicity is becoming a critical step in drug development. Conventional ligand-based models are limited by low accuracy and lack of interpretability. Further, they often fail to explain cellular mechanisms underlying structure-toxicity associations. We addressed these limitations by incorporating target profile as an intermediate connecting structure to toxicity. To accommodate the high-dimensional feature space, we developed a pipeline named TargetTox that can identify a subset of predictive features. We implemented TargetTox to study 569 targets and 815 adverse events. The features identified by TargetTox comprise less than 10% of the original feature space; nevertheless, they accurately predicted binding outcomes for 377 targets and toxicity outcomes for 36 adverse events. We demonstrated that predictive targets tend to be differentially expressed in the tissue of toxicity. We also rediscovered key cellular functions associated with cardiotoxicity from the predictive targets, as well as markers of skin and liver diseases. Furthermore, we found evidence supporting diagnostic and therapeutic applications of some predictive targets in hepatotoxicity and nephrotoxicity. Our findings highlighted the critical role of predictive targets in cellular mechanisms leading to toxicity. In general, our study improved the interpretability of toxicity prediction without sacrificing accuracy. Our novel pipeline may benefit future studies of high-dimensional data sets.
Affiliation(s)
- Yun Hao
- Genomics and Computational Biology (GCB) Graduate Program, University of Pennsylvania, Philadelphia, Pennsylvania 19104, United States
- Jason H Moore
- Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania, Philadelphia, Pennsylvania 19104, United States
|
47
|
From serendipity to rational drug design in brain disorders: in silico, in vitro, and in vivo approaches. Curr Opin Pharmacol 2021; 60:177-182. [PMID: 34461562 DOI: 10.1016/j.coph.2021.07.012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/28/2021] [Revised: 07/16/2021] [Accepted: 07/19/2021] [Indexed: 11/23/2022]
Abstract
Prolonged life expectancy and stressful lifestyles have increased the risk of developing neurological disorders, including neurodegenerative and psychiatric illnesses. Despite obvious and immediate needs for effective treatment, drug discovery for neurological disorders has been largely serendipitous, whereas hypothesis-driven drug development programs have been remarkably poor. This may be partly due to insufficient knowledge of molecular mechanisms underlying disease pathophysiology, complex genetic and environmental risk factors, and oversimplified diagnostic criteria. Here, we review recent progress in cell type-specific investigations, bioinformatics analyses, and large reference databases, the integration of which, when combined with effective use of animal models, provides novel insights into disease mechanisms, suggests innovative drug development, and ultimately promises superior treatments for patients suffering from neurological disorders.
|
48
|
Interpreting machine learning models to investigate circadian regulation and facilitate exploration of clock function. Proc Natl Acad Sci U S A 2021; 118:2103070118. [PMID: 34353905 PMCID: PMC8364196 DOI: 10.1073/pnas.2103070118] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/03/2023] Open
Abstract
The circadian clock is an internal molecular 24-h timer that is critical to life on Earth. We describe a series of artificial intelligence (AI)– and machine learning (ML)–based approaches that enable more cost-effective analysis and insight into circadian regulation and function. Throughout the manuscript, we illuminate what is inside the ML “black box” via explanation or interpretation of predictive ML models. Using this interpretation of our models, we derive biological insights into why a prediction was made, alongside accurate predictions. Most innovatively, we use only DNA sequence features for accurate circadian gene expression prediction. Using explainable AI, we define possible, responsible regulatory elements as we make these predictions; this critically requires no prior knowledge of regulatory elements.
The circadian clock is an important adaptation to life on Earth. Here, we use machine learning to predict complex, temporal, and circadian gene expression patterns in Arabidopsis. Most significantly, we classify circadian genes using DNA sequence features generated de novo from public, genomic resources, facilitating downstream application of our methods with no experimental work or prior knowledge needed. We use local model explanation that is transcript specific to rank DNA sequence features, providing a detailed profile of the potential circadian regulatory mechanisms for each transcript. Furthermore, we can discriminate the temporal phase of transcript expression using the local, explanation-derived, and ranked DNA sequence features, revealing hidden subclasses within the circadian class. Model interpretation/explanation provides the backbone of our methodological advances, giving insight into biological processes and experimental design. Next, we use model interpretation to optimize sampling strategies when we predict circadian transcripts using reduced numbers of transcriptomic timepoints. Finally, we predict the circadian time from a single, transcriptomic timepoint, deriving marker transcripts that are most impactful for accurate prediction; this could facilitate the identification of altered clock function from existing datasets.
|
49
|
Lim JJ, Li X, Lehmler HJ, Wang D, Gu H, Cui JY. Gut Microbiome Critically Impacts PCB-induced Changes in Metabolic Fingerprints and the Hepatic Transcriptome in Mice. Toxicol Sci 2021; 177:168-187. [PMID: 32544245 DOI: 10.1093/toxsci/kfaa090] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/17/2022] Open
Abstract
Polychlorinated biphenyls (PCBs) are ubiquitously detected and have been linked to metabolic diseases. The gut microbiome is recognized as a critical regulator of disease susceptibility; however, little is known about how PCBs and the gut microbiome interact to modulate hepatic xenobiotic and intermediary metabolism. We hypothesized that the gut microbiome regulates PCB-mediated changes in the metabolic fingerprints and hepatic transcriptome. Ninety-day-old female conventional and germ-free mice were orally exposed to the Fox River Mixture (synthetic PCB mixture, 6 or 30 mg/kg) or corn oil (vehicle control, 10 ml/kg), once daily for 3 consecutive days. RNA-seq was conducted in liver, and endogenous metabolites were measured in liver and serum by LC-MS. Prototypical target genes of aryl hydrocarbon receptor, pregnane X receptor, and constitutive androstane receptor were more readily upregulated by PCBs in conventional conditions, indicating that the effects of PCBs on the hepatic transcriptome are mediated partly through the gut microbiome. In a gut microbiome-dependent manner, xenobiotic and steroid metabolism pathways were upregulated, whereas pathways related to the response to misfolded proteins were downregulated by PCBs. At the high PCB dose, NADP and arginine appear to interact with drug-metabolizing enzymes (ie, Cyp1-3 family), which are highly correlated with Ruminiclostridium and Roseburia, providing a novel explanation of gut-liver interaction from PCB exposure. Utilizing the Library of Integrated Network-based Cellular Signatures L1000 database, therapeutics targeting anti-inflammatory and endoplasmic reticulum stress pathways are predicted to be remedies that can mitigate PCB toxicity. Our findings demonstrate that habitation of the gut microbiota drives PCB-mediated hepatic responses. Our study adds knowledge of physiological response differences from PCB exposure and considerations for further investigations of gut microbiome-dependent therapeutics.
Affiliation(s)
- Joe Jongpyo Lim
- Department of Environmental and Occupational Health Sciences, University of Washington, Seattle, Washington 98195
- Xueshu Li
- Department of Occupational and Environmental Health, University of Iowa, Iowa City, Iowa 52242
- Hans-Joachim Lehmler
- Department of Occupational and Environmental Health, University of Iowa, Iowa City, Iowa 52242
- Dongfang Wang
- Arizona Metabolomics Laboratory, School of Nutrition and Health Promotion, College of Health Solutions, Arizona State University, Scottsdale, Arizona 85259
- Haiwei Gu
- Arizona Metabolomics Laboratory, School of Nutrition and Health Promotion, College of Health Solutions, Arizona State University, Scottsdale, Arizona 85259
- Julia Yue Cui
- Department of Environmental and Occupational Health Sciences, University of Washington, Seattle, Washington 98195
|
50
|
Joshi P, Vedhanayagam M, Ramesh R. An Ensembled SVM Based Approach for Predicting Adverse Drug Reactions. Curr Bioinform 2021. [DOI: 10.2174/1574893615999200707141420] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022]
Abstract
Background:
Preventing adverse drug reactions (ADRs) is imperative for the safety of the people. The problem of under-reporting the ADRs has been prevalent across the world, making it difficult to develop the prediction models, which are unbiased. As a result, most of the models are skewed to the negative samples leading to high accuracy but poor performance in other metrics such as precision, recall, F1 score, and AUROC score.
Objective:
In this work, we have proposed a novel way of predicting the ADRs by balancing the dataset.
Method:
The whole data set has been partitioned into balanced smaller data sets. SVMs with optimal kernel have been learned using each of the balanced data sets and the prediction of given ADR for the given drug has been obtained by voting from the ensembled optimal SVMs learned.
Results:
We have found that results are encouraging and comparable with the competing methods in the literature and obtained the average sensitivity of 0.97 for all the ADRs. The model has been interpreted and explained with SHAP values by various plots.
Conclusion:
A novel way of predicting ADRs by balancing the dataset has been proposed thereby reducing the effect of unbalanced datasets.
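The Method described in this abstract (partition the data into balanced subsets, train one classifier per subset, combine by voting) can be illustrated with a minimal sketch. Everything here is an assumption for illustration rather than the authors' implementation: the function names are hypothetical, the partitioning rule (each chunk of the majority class is paired with all minority samples) is one plausible reading of "balanced smaller data sets", and a nearest-centroid classifier stands in for the kernel-tuned SVMs so that the example runs with the standard library only.

```python
import random
from collections import Counter


def balanced_partitions(X, y, seed=0):
    """Pair every chunk of the majority class with all minority samples.

    Assumes binary labels 0/1. The last chunk may be smaller than the
    minority class when the sizes do not divide evenly.
    """
    pos = [i for i, label in enumerate(y) if label == 1]
    neg = [i for i, label in enumerate(y) if label == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    if not minority:
        raise ValueError("both classes must be present")
    majority = majority[:]
    random.Random(seed).shuffle(majority)
    size = len(minority)
    parts = []
    for start in range(0, len(majority), size):
        idx = minority + majority[start:start + size]
        parts.append(([X[i] for i in idx], [y[i] for i in idx]))
    return parts


def fit_centroid(X, y):
    """Stand-in base learner (NOT an SVM): nearest class centroid."""
    centroids = {}
    for label in set(y):
        rows = [x for x, l in zip(X, y) if l == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]

    def predict(x):
        # Return the label whose centroid is closest in squared distance.
        return min(
            centroids,
            key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])),
        )

    return predict


def fit_ensemble(X, y, fit=fit_centroid, seed=0):
    """Train one model per balanced partition; `fit` returns a predict function."""
    return [fit(Xp, yp) for Xp, yp in balanced_partitions(X, y, seed)]


def predict_vote(models, x):
    """Majority vote; ties go to the label that was voted for first."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]
```

To move closer to the approach the abstract describes, the `fit` argument could be replaced by a wrapper around an SVM from a machine learning library, with the kernel chosen per partition by model selection.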
Affiliation(s)
- Pratik Joshi
- Department of Computer Science and Engineering, IIITDM Kancheepuram, Chennai, India
|