1
A novel graph mining approach to predict and evaluate food-drug interactions. Sci Rep 2022; 12:1061. [PMID: 35058561 PMCID: PMC8776972 DOI: 10.1038/s41598-022-05132-y] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 09/09/2021] [Accepted: 01/05/2022] [Indexed: 12/26/2022]
Abstract
Food-drug interactions (FDIs) arise when nutritional dietary consumption regulates biochemical mechanisms involved in drug metabolism. This study proposes FDMine, a novel systematic framework that models the FDI problem as a homogeneous graph. Our dataset consists of 788 unique approved small molecule drugs with metabolism-related drug-drug interactions and 320 unique food items, composed of 563 unique compounds. The potential number of interactions is 87,192 and 92,143 for the disjoint and joint versions of the graph, respectively. We defined several similarity subnetworks comprising food-drug similarity, drug-drug similarity, and food-food similarity networks. A unique part of the graph involves encoding the food composition as a set of nodes and calculating a content contribution score. To predict new FDIs, we considered several link prediction algorithms and various performance metrics, including the precision@top (top 1%, 2%, and 5%) of the newly predicted links. The shortest path-based method achieved a precision of 84%, 60% and 40% for the top 1%, 2% and 5% of FDIs identified, respectively. We validated the top FDIs predicted using FDMine to demonstrate its applicability, and we relate therapeutic anti-inflammatory effects of food items informed by FDIs. FDMine is publicly available to support clinicians and researchers.
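The precision@top metric mentioned in this abstract can be illustrated with a minimal Python sketch. It assumes the metric is the fraction of the top-scoring predicted links that are confirmed; the function name `precision_at_top`, the food and drug labels, and the scores are all hypothetical and are not taken from FDMine.

```python
import math


def precision_at_top(scored_links, known_positives, percent):
    """Fraction of the top `percent`% highest-scoring links that are confirmed.

    Assumption: a prediction counts as correct when it appears in
    `known_positives`; the source does not spell out its validation rule.
    """
    ranked = sorted(scored_links, key=lambda item: item[1], reverse=True)
    # Keep at least one link so that tiny toy lists still give a value.
    k = max(1, math.ceil(len(ranked) * percent / 100))
    hits = sum(1 for link, _score in ranked[:k] if link in known_positives)
    return hits / k


# Hypothetical toy predictions: ((food, drug), link score).
predictions = [
    (("garlic", "drug_A"), 0.95),
    (("kale", "drug_B"), 0.90),
    (("milk", "drug_C"), 0.70),
    (("apple", "drug_D"), 0.40),
    (("rice", "drug_E"), 0.10),
]
confirmed = {("garlic", "drug_A"), ("milk", "drug_C")}

p20 = precision_at_top(predictions, confirmed, 20)  # top 1 of 5 links
p40 = precision_at_top(predictions, confirmed, 40)  # top 2 of 5 links
p60 = precision_at_top(predictions, confirmed, 60)  # top 3 of 5 links
print(p20, p40, p60)
```

As in the figures reported above, precision typically falls as the cutoff widens, because lower-ranked predictions are less likely to be confirmed.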
2
Zhao H, Li Y, Wang J. A convolutional neural network and graph convolutional network-based method for predicting the classification of anatomical therapeutic chemicals. Bioinformatics 2021; 37:2841-2847. [PMID: 33769479 DOI: 10.1093/bioinformatics/btab204] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 12/27/2020] [Revised: 03/03/2021] [Accepted: 03/24/2021] [Indexed: 02/02/2023]
Abstract
MOTIVATION: The Anatomical Therapeutic Chemical (ATC) system is an official classification system for medicines established by the World Health Organization. Correctly assigning ATC classes to given compounds is an important research problem in drug discovery, which can not only reveal the possible active ingredients of the compounds but also infer their therapeutic, pharmacological and chemical properties. RESULTS: In this article, we develop an end-to-end multi-label classifier called CGATCPred to predict 14 main ATC classes for given compounds. To extract rich features of each compound, we use a deep convolutional neural network with shortcut connections to represent and learn the seven association scores between the given compound and others. Moreover, we construct the correlation graph of ATC classes and then apply a graph convolutional network to the graph for label embedding abstraction. We use all label embeddings to guide the learning process of compound representation. As a result, in the Jackknife test, CGATCPred obtains an Aiming of 81.94%, Coverage of 82.88%, Accuracy of 80.81%, Absolute True of 76.58% and Absolute False of 2.75%, yielding significant improvements compared to existing multi-label classifiers. AVAILABILITY AND IMPLEMENTATION: The code of CGATCPred is available at https://github.com/zhc940702/CGATCPred and https://zenodo.org/record/4552917.
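The five multi-label scores named in this abstract can be sketched as follows. The formulas are the definitions commonly used in the ATC multi-label literature; that CGATCPred uses exactly these definitions is an assumption, and the function name `multilabel_metrics` and the toy label sets are hypothetical.

```python
def multilabel_metrics(true_sets, pred_sets, num_labels):
    """Average five multi-label scores over all samples.

    Assumed definitions, per sample with true set Y and predicted set P:
      aiming         = |Y & P| / |P|
      coverage       = |Y & P| / |Y|
      accuracy       = |Y & P| / |Y | P|
      absolute_true  = 1 if Y == P else 0
      absolute_false = (|Y | P| - |Y & P|) / num_labels
    """
    totals = dict.fromkeys(
        ["aiming", "coverage", "accuracy", "absolute_true", "absolute_false"], 0.0
    )
    for y, p in zip(true_sets, pred_sets):
        inter = len(y & p)
        union = len(y | p)
        totals["aiming"] += inter / len(p) if p else 0.0
        totals["coverage"] += inter / len(y) if y else 0.0
        totals["accuracy"] += inter / union if union else 1.0
        totals["absolute_true"] += 1.0 if y == p else 0.0
        totals["absolute_false"] += (union - inter) / num_labels
    n = len(true_sets)
    return {name: value / n for name, value in totals.items()}


# Hypothetical toy example with 14 possible classes (indices 0..13).
true_labels = [{0, 1}, {2}, {3, 4}]
pred_labels = [{0, 1}, {2, 5}, {3}]
scores = multilabel_metrics(true_labels, pred_labels, num_labels=14)
print(scores)
```

Higher is better for the first four scores, while Absolute False is an error rate, which is why the reported 2.75% is a low value.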
Affiliation(s)
- Haochen Zhao: Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha 410083, China
- Yaohang Li: Department of Computer Science, Old Dominion University, Norfolk, VA 23529-0001, USA
- Jianxin Wang: Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha 410083, China
3
Wang L, Upadhyay V, Maranas CD. dGPredictor: Automated fragmentation method for metabolic reaction free energy prediction and de novo pathway design. PLoS Comput Biol 2021; 17:e1009448. [PMID: 34570771 PMCID: PMC8496854 DOI: 10.1371/journal.pcbi.1009448] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/17/2021] [Revised: 10/07/2021] [Accepted: 09/13/2021] [Indexed: 11/19/2022]
Abstract
Group contribution (GC) methods are conventionally used in thermodynamics analysis of metabolic pathways to estimate the standard Gibbs energy change (ΔrG′°) of enzymatic reactions from limited experimental measurements. However, these methods are limited by their dependence on manually curated groups and inability to capture stereochemical information, leading to low reaction coverage. Herein, we introduce an automated molecular fingerprint-based thermodynamic analysis tool called dGPredictor that enables the consideration of stereochemistry within metabolite structures and thus increases reaction coverage. dGPredictor has prediction accuracy comparable to existing GC methods and can capture Gibbs energy changes for isomerase and transferase reactions, which exhibit no overall group changes. We also demonstrate dGPredictor's ability to predict the Gibbs energy change for novel reactions and its seamless integration within de novo metabolic pathway design tools such as novoStoic for safeguarding against the inclusion of reaction steps with infeasible directionalities. To facilitate easy access to dGPredictor, we developed a graphical user interface to predict the standard Gibbs energy change for reactions at various pH and ionic strengths. The tool allows customized user input of known metabolites as KEGG IDs and novel metabolites as InChI strings (https://github.com/maranasgroup/dGPredictor).
Author summary
The standard Gibbs energy change is commonly used to check the feasibility of enzyme-catalyzed reactions, as thermodynamics plays a crucial role in pathway design for biochemical synthesis. Group contribution methods using expert-defined functional groups have been extensively used for estimating the standard Gibbs energy change. Here, we introduce a molecular fingerprint-based thermodynamic tool, dGPredictor, that enables distinguishing between (stereo)isomers in metabolic reactions, leading to improved reaction coverage and prediction accuracy comparable to GC methods. dGPredictor can also be used alongside de novo pathway design tools to ensure the correct directionality of chosen reaction steps. We applied and tested dGPredictor on reactions from the KEGG database and applied it to screen an isobutanol synthesis pathway design. An open-source, user-friendly web interface is provided to facilitate easy access to the standard Gibbs energy change of reactions at different pH values (https://github.com/maranasgroup/dGPredictor).
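How a standard Gibbs energy change feeds a directionality check can be sketched with the textbook relation ΔrG′ = ΔrG′° + RT ln Q, where Q is the reaction quotient. This is a minimal illustration, not dGPredictor's API: the function names, the reaction A -> B, its ΔrG′° value and the concentrations are all hypothetical, and pH and ionic-strength corrections are ignored.

```python
import math

R_KJ_PER_MOL_K = 8.314462618e-3  # gas constant in kJ/(mol*K)


def reaction_gibbs_energy(dg0_prime, stoichiometry, concentrations, temperature=298.15):
    """Return dG' = dG'0 + R*T*ln(Q) in kJ/mol.

    `stoichiometry` maps metabolite -> coefficient (negative for substrates,
    positive for products); `concentrations` are in mol/L (assumed units).
    """
    ln_q = sum(
        coefficient * math.log(concentrations[metabolite])
        for metabolite, coefficient in stoichiometry.items()
    )
    return dg0_prime + R_KJ_PER_MOL_K * temperature * ln_q


def is_forward_feasible(dg_prime):
    """A reaction step can proceed in the forward direction only if dG' < 0."""
    return dg_prime < 0


# Hypothetical reaction A -> B with dG'0 = +5 kJ/mol.
stoich = {"A": -1, "B": 1}
dg_standard = reaction_gibbs_energy(5.0, stoich, {"A": 1.0, "B": 1.0})
dg_cellular = reaction_gibbs_energy(5.0, stoich, {"A": 1e-2, "B": 1e-4})
print(dg_standard, dg_cellular, is_forward_feasible(dg_cellular))
```

The toy numbers show why concentrations matter: a step that is unfavorable at 1 M standard concentrations can still run forward when the product is kept a hundredfold more dilute than the substrate.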
Affiliation(s)
- Lin Wang: Department of Chemical Engineering, Pennsylvania State University, University Park, Pennsylvania, United States of America
- Vikas Upadhyay: Department of Chemical Engineering, Pennsylvania State University, University Park, Pennsylvania, United States of America
- Costas D. Maranas: Department of Chemical Engineering, Pennsylvania State University, University Park, Pennsylvania, United States of America
4
Kuwahara H, Gao X. Analysis of the effects of related fingerprints on molecular similarity using an eigenvalue entropy approach. J Cheminform 2021; 13:27. [PMID: 33757582 PMCID: PMC7989080 DOI: 10.1186/s13321-021-00506-2] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 05/08/2020] [Accepted: 03/13/2021] [Indexed: 11/18/2022]
Abstract
Two-dimensional (2D) chemical fingerprints are widely used as binary features for the quantification of structural similarity of chemical compounds, which is an important step in similarity-based virtual screening (VS). Here, using an eigenvalue-based entropy approach, we identified 2D fingerprints with little to no contribution to shaping the eigenvalue distribution of the feature matrix as related ones and examined the degree to which these related 2D fingerprints influenced molecular similarity scores calculated with the Tanimoto coefficient. Our analysis identified many related fingerprints in publicly available fingerprint schemes and showed that their presence in the feature set could have substantial effects on the similarity scores and bias the outcome of molecular similarity analysis. Our results have implications for the optimal selection of 2D fingerprints for compound similarity analysis and the identification of potential hits for compounds with target biological activity in VS.
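The effect described in this abstract can be made concrete with a small Python sketch of the Tanimoto coefficient on binary fingerprints. Exact duplicate bits are used here as a simple stand-in for "related" fingerprints; the paper identifies related ones with an eigenvalue entropy approach, which this sketch does not implement. The helper names and the toy fingerprints are hypothetical.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two equal-length binary fingerprints:
    bits set in both divided by bits set in at least one."""
    both = sum(1 for a, b in zip(fp_a, fp_b) if a and b)
    either = sum(1 for a, b in zip(fp_a, fp_b) if a or b)
    # Convention assumed here: two all-zero fingerprints count as identical.
    return both / either if either else 1.0


def with_copies(fingerprint, index, copies):
    """Append `copies` redundant duplicates of the bit at `index`."""
    return fingerprint + [fingerprint[index]] * copies


# Hypothetical 4-bit fingerprints for two molecules.
mol_a = [1, 1, 0, 1]
mol_b = [1, 0, 1, 1]

base = tanimoto(mol_a, mol_b)
# Duplicating a bit that both molecules share inflates the similarity.
dup_shared = tanimoto(with_copies(mol_a, 0, 3), with_copies(mol_b, 0, 3))
# Duplicating a bit on which they differ deflates it.
dup_differing = tanimoto(with_copies(mol_a, 1, 3), with_copies(mol_b, 1, 3))
print(base, dup_shared, dup_differing)
```

In this toy case the same pair of molecules scores 0.5, 5/7 or 2/7 depending only on which redundant bits the fingerprint scheme happens to include, which is the kind of bias the abstract warns about.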
Affiliation(s)
- Hiroyuki Kuwahara: Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, 23955, Saudi Arabia
- Xin Gao: Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, 23955, Saudi Arabia
5
Suthers PF, Foster CJ, Sarkar D, Wang L, Maranas CD. Recent advances in constraint and machine learning-based metabolic modeling by leveraging stoichiometric balances, thermodynamic feasibility and kinetic law formalisms. Metab Eng 2020; 63:13-33. [PMID: 33310118 DOI: 10.1016/j.ymben.2020.11.013] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 08/21/2020] [Revised: 11/13/2020] [Accepted: 11/27/2020] [Indexed: 12/16/2022]
Abstract
Understanding the governing principles behind organisms' metabolism and growth underpins their effective deployment as bioproduction chassis. A central objective of metabolic modeling is predicting how metabolism and growth are affected by both external environmental factors and internal genotypic perturbations. The fundamental concepts of reaction stoichiometry, thermodynamics, and mass action kinetics have emerged as the foundational principles of many modeling frameworks designed to describe how and why organisms allocate resources towards both growth and bioproduction. This review focuses on the latest algorithmic advancements that have integrated these foundational principles into increasingly sophisticated quantitative frameworks.
Affiliation(s)
- Patrick F Suthers: Department of Chemical Engineering, The Pennsylvania State University, University Park, PA, USA; DOE Center for Advanced Bioenergy and Bioproducts Innovation, The Pennsylvania State University, University Park, PA, USA
- Charles J Foster: Department of Chemical Engineering, The Pennsylvania State University, University Park, PA, USA
- Debolina Sarkar: Department of Chemical Engineering, The Pennsylvania State University, University Park, PA, USA
- Lin Wang: Department of Chemical Engineering, The Pennsylvania State University, University Park, PA, USA
- Costas D Maranas: Department of Chemical Engineering, The Pennsylvania State University, University Park, PA, USA; DOE Center for Advanced Bioenergy and Bioproducts Innovation, The Pennsylvania State University, University Park, PA, USA
6
Baranwal M, Magner A, Elvati P, Saldinger J, Violi A, Hero AO. A deep learning architecture for metabolic pathway prediction. Bioinformatics 2020; 36:2547-2553. [PMID: 31879763 DOI: 10.1093/bioinformatics/btz954] [Citation(s) in RCA: 45] [Impact Index Per Article: 11.3] [Received: 10/09/2019] [Revised: 12/02/2019] [Accepted: 12/22/2019] [Indexed: 01/14/2023]
Abstract
MOTIVATION: Understanding the mechanisms and structural mappings between molecules and pathway classes is critical for the design of reaction predictors for synthesizing new molecules. This article studies the problem of predicting the classes of metabolic pathways (series of chemical reactions occurring within a cell) in which a given biochemical compound participates. We apply a hybrid machine learning approach consisting of graph convolutional networks used to extract molecular shape features as input to a random forest classifier. In contrast to previously applied machine learning methods for this problem, our framework automatically extracts relevant shape features directly from input SMILES representations, which are atom-bond specifications of the chemical structures composing the molecules. RESULTS: Our method is capable of correctly predicting the respective metabolic pathway class of 95.16% of tested compounds, whereas competing methods only achieve an accuracy of 84.92% or less. Furthermore, our framework extends to the task of classifying compounds having mixed membership in multiple pathway classes. Our prediction accuracy for this multi-label task is 97.61%. We analyze the relative importance of various global physicochemical features to the pathway class prediction problem and show that simple linear/logistic regression models can predict the values of these global features from the shape features extracted using our framework. AVAILABILITY AND IMPLEMENTATION: https://github.com/baranwa2/MetabolicPathwayPrediction. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Mayank Baranwal: Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI 48109, USA
- Abram Magner: Department of Computer Science, University at Albany, SUNY, Albany, NY 12222, USA
- Angela Violi: Department of Mechanical Engineering; Department of Chemical Engineering and Biophysics, University of Michigan, Ann Arbor, MI 48109, USA
- Alfred O Hero: Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI 48109, USA
7
Chen Z, Pang M, Zhao Z, Li S, Miao R, Zhang Y, Feng X, Feng X, Zhang Y, Duan M, Huang L, Zhou F. Feature selection may improve deep neural networks for the bioinformatics problems. Bioinformatics 2019; 36:1542-1552. [DOI: 10.1093/bioinformatics/btz763] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Received: 05/24/2019] [Revised: 09/03/2019] [Accepted: 10/02/2019] [Indexed: 12/22/2022]
Abstract
Motivation
Deep neural network (DNN) algorithms have recently been used to predict various biomedical phenotypes and have demonstrated very good prediction performance without feature selection. This study proposes the hypothesis that DNN models may be further improved by feature selection algorithms.
Results
A comprehensive comparative study was carried out by evaluating 11 feature selection algorithms on three conventional DNN algorithms, i.e. convolutional neural network (CNN), deep belief network (DBN) and recurrent neural network (RNN), and three recent DNNs, i.e. MobileNetV2, ShuffleNetV2 and SqueezeNet. Five binary classification methylomic datasets were chosen to calculate the prediction performance of CNN/DBN/RNN models using features selected by the 11 feature selection algorithms. Seventeen binary classification transcriptome datasets and two multi-class transcriptome datasets were also used to evaluate how the hypothesis may generalize to different data types. The experimental data supported our hypothesis that feature selection algorithms may improve DNN models, and the DBN models using features selected by SVM-RFE usually achieved the best prediction accuracies on the five methylomic datasets.
Availability and implementation
All the algorithms were implemented and tested under the programming environment Python version 3.6.6.
Supplementary information
Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Zheng Chen: BioKnow Health Informatics Lab, College of Computer Science and Technology; Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, Jilin, China
- Meng Pang: BioKnow Health Informatics Lab, College of Computer Science and Technology; Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, Jilin, China
- Zixin Zhao: BioKnow Health Informatics Lab, College of Computer Science and Technology; Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, Jilin, China
- Shuainan Li: BioKnow Health Informatics Lab, College of Computer Science and Technology; Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, Jilin, China
- Rui Miao: BioKnow Health Informatics Lab, College of Computer Science and Technology; Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, Jilin, China
- Yifan Zhang: BioKnow Health Informatics Lab, College of Computer Science and Technology; Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, Jilin, China
- Xiaoyue Feng: BioKnow Health Informatics Lab, College of Computer Science and Technology; Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, Jilin, China
- Xin Feng: BioKnow Health Informatics Lab, College of Computer Science and Technology; Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, Jilin, China
- Yexian Zhang: BioKnow Health Informatics Lab, College of Computer Science and Technology; Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, Jilin, China
- Meiyu Duan: BioKnow Health Informatics Lab, College of Computer Science and Technology; Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, Jilin, China
- Lan Huang: BioKnow Health Informatics Lab, College of Computer Science and Technology; Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, Jilin, China
- Fengfeng Zhou: BioKnow Health Informatics Lab, College of Computer Science and Technology; Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, Jilin, China