1
Chen X, Luo L, Shen C, Ding P, Luo J. An In Silico Method for Predicting Drug Synergy Based on Multitask Learning. Interdiscip Sci 2021; 13:299-311. [PMID: 33611781 DOI: 10.1007/s12539-021-00422-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/06/2020] [Revised: 01/29/2021] [Accepted: 02/07/2021] [Indexed: 12/20/2022]
Abstract
To make better use of diverse knowledge sources for predicting drug synergy, it is crucial to establish a drug synergy prediction model that also leverages the reconstruction of sparse known drug targets. We therefore present DSML, an in silico method based on multitask learning that predicts the synergy scores of drug pairs by fusing drug targets, protein-protein interactions, anatomical therapeutic chemical codes, and a priori knowledge of drug combinations. By simultaneously reconstructing drug-target protein interactions and synergistic drug combinations, DSML benefits indirectly from associations mediated through proteins. In cross-validation experiments, DSML improved the ability to predict drug synergy. Moreover, the reconstruction of drug-target interactions and the incorporation of multisource knowledge substantially improved drug combination predictions. The potential drug combinations predicted by DSML demonstrate its ability to predict drug synergy.
Affiliation(s)
- Xin Chen
  - School of Computer Science, University of South China, Hengyang, 421001, Hunan, China
- Lingyun Luo
  - School of Computer Science, University of South China, Hengyang, 421001, Hunan, China
  - Hunan Medical Big Data International Sci.&Tech. Innovation Cooperation Base, Hengyang, 421000, Hunan, China
- Cong Shen
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410082, Hunan, China
- Pingjian Ding
  - School of Computer Science, University of South China, Hengyang, 421001, Hunan, China
  - Hunan Medical Big Data International Sci.&Tech. Innovation Cooperation Base, Hengyang, 421000, Hunan, China
- Jiawei Luo
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410082, Hunan, China
2
Hudson IL. Data Integration Using Advances in Machine Learning in Drug Discovery and Molecular Biology. Methods Mol Biol 2021; 2190:167-184. [PMID: 32804365 DOI: 10.1007/978-1-0716-0826-5_7] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 12/24/2022]
Abstract
While the term artificial intelligence and the concept of deep learning are not new, recent advances in high-performance computing, the availability of large annotated data sets required for training, and novel frameworks for implementing deep neural networks have led to an unprecedented acceleration of the field of molecular (network) biology and pharmacogenomics. The need to align biological data to innovative machine learning has stimulated developments in both data integration (fusion) and knowledge representation, in the form of heterogeneous, multiplex, and biological networks or graphs. In this chapter we briefly introduce several popular neural network architectures used in deep learning, namely, the fully connected deep neural network, recurrent neural network, convolutional neural network, and the autoencoder. Deep learning predictors, classifiers, and generators utilized in modern feature extraction may well assist interpretability and thus imbue AI tools with increased explication, potentially adding insights and advancements in novel chemistry and biology discovery. The capability of learning representations from structures directly without using any predefined structure descriptor is an important feature distinguishing deep learning from other machine learning methods and makes the traditional feature selection and reduction procedures unnecessary. In this chapter we briefly show how these technologies are applied for data integration (fusion) and analysis in drug discovery research covering these areas: (1) application of convolutional neural networks to predict ligand-protein interactions; (2) application of deep learning in compound property and activity prediction; (3) de novo design through deep learning.
We also: (1) discuss some aspects of future development of deep learning in drug discovery/chemistry; (2) provide references to published information; (3) provide recently advocated recommendations on using artificial intelligence and deep learning in -omics research and drug discovery.
Affiliation(s)
- Irene Lena Hudson
  - Mathematical Sciences, School of Science, RMIT University, Melbourne, VIC, Australia
3
System-Based Differential Gene Network Analysis for Characterizing a Sample-Specific Subnetwork. Biomolecules 2020; 10:biom10020306. [PMID: 32075209 PMCID: PMC7072632 DOI: 10.3390/biom10020306] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 01/17/2020] [Revised: 02/03/2020] [Accepted: 02/08/2020] [Indexed: 12/18/2022] Open
Abstract
Gene network estimation is a method key to understanding a fundamental cellular system from high throughput omics data. However, the existing gene network analysis relies on having a sufficient number of samples and is required to handle a huge number of nodes and estimated edges, which remain difficult to interpret, especially in discovering the clinically relevant portions of the network. Here, we propose a novel method to extract a biomedically significant subnetwork using a Bayesian network, a type of unsupervised machine learning method that can be used as an explainable and interpretable artificial intelligence algorithm. Our method quantifies sample specific networks using our proposed Edge Contribution value (ECv) based on the estimated system, which realizes condition-specific subnetwork extraction using a limited number of samples. We applied this method to the Epithelial-Mesenchymal Transition (EMT) data set that is related to the process of metastasis and thus prognosis in cancer biology. We established our method-driven EMT network representing putative gene interactions. Furthermore, we found that the sample-specific ECv patterns of this EMT network can characterize the survival of lung cancer patients. These results show that our method unveils the explainable network differences in biological and clinical features through artificial intelligence technology.
4
Sheng Z, Sun Y, Yin Z, Tang K, Cao Z. Advances in computational approaches in identifying synergistic drug combinations. Brief Bioinform 2019; 19:1172-1182. [PMID: 28475767 DOI: 10.1093/bib/bbx047] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 12/13/2016] [Indexed: 12/21/2022] Open
Abstract
Accumulated empirical clinical experience, supported by animal or cell line models, has initiated efforts to predict synergistic drug combinations whose effect is more than additive compared with the sum of the individual agents. Aiming to construct better computational models, this review starts from the latest data resources on combinatorial drugs, then summarizes the reported mechanisms of known synergistic combinations in terms of drug molecular and pharmacological patterns, target network properties and compound functional annotation. Based on the above, we focus on the main in silico strategies recently published, covering molecular modeling, mathematical simulation, optimization of combinatorial targets and pattern-based statistical/learning models. Future directions are also discussed, relating to the role of natural compounds, drug combination with immunotherapy and management of adverse effects. Overall, with particular emphasis on the mechanism of action of drug synergy, this review may serve as a rapid reference for designing improved models for combinational drugs.
Affiliation(s)
- Zhen Sheng
  - School of Life Sciences and Technology, Tongji University
- Yi Sun
  - School of Life Sciences and Technology, Tongji University
- Zuojing Yin
  - School of Life Sciences and Technology, Tongji University
- Kailin Tang
  - Advanced Institute of Translational Medicine, Tongji University
- Zhiwei Cao
  - School of Life Sciences and Technology, Tongji University
5
Ding P, Yin R, Luo J, Kwoh CK. Ensemble Prediction of Synergistic Drug Combinations Incorporating Biological, Chemical, Pharmacological, and Network Knowledge. IEEE J Biomed Health Inform 2018; 23:1336-1345. [PMID: 29994408 DOI: 10.1109/jbhi.2018.2852274] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.7] [Indexed: 11/09/2022]
Abstract
Combinatorial therapy may reduce drug side effects and improve drug efficacy, making combination therapy a promising strategy to treat complex diseases. However, in the existing computational methods, the natural properties and network knowledge of drugs have not been adequately and simultaneously considered, making it difficult to identify effective drug combinations. Computational methods that incorporate multiple sources of information (biological, chemical, pharmacological, and network knowledge) offer more opportunities to screen synergistic drug combinations. Therefore, we developed a novel Ensemble Prediction framework of Synergistic Drug Combinations (EPSDC) to accurately and efficiently predict drug combinations by integrating information from multiple sources. EPSDC constructs a feature vector for each drug pair by concatenating different types of drug similarities, and then uses these feature groups in a feature-based base predictor. Next, transductive learning is applied on heterogeneous drug-target networks to obtain a network-based score for the drug pair. Finally, two types of ensemble rules are introduced to combine the feature-based score and the network-based score, and then potential drug combinations are prioritized. To demonstrate the effect of the ensemble rule, comprehensive experiments were conducted to compare single models and ensemble models. The experimental results indicated that our method outperformed the state-of-the-art method in five-fold cross validation and de novo prediction tests on the two benchmark datasets. We further analyzed the effect of the maximum length of the meta-path and the impact of different types of features. Moreover, the practical usefulness of our method was confirmed by the predicted novel drug combinations. The source code of EPSDC is available at https://github.com/KDDing/EPSDC.
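The abstract above says that a feature-based score and a network-based score are combined by ensemble rules but does not name the rules. The following is a minimal sketch of what such a combination step could look like, assuming two generic rules (a weighted average and a maximum). The function names, the `weight` parameter, and the example scores are hypothetical and are not taken from EPSDC.

```python
# Illustrative only: the two rules below are assumed examples,
# not the ensemble rules defined in the EPSDC paper.

def weighted_average_rule(feature_score, network_score, weight=0.5):
    """Blend the two scores; `weight` is the share given to the feature-based score."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * feature_score + (1.0 - weight) * network_score


def max_rule(feature_score, network_score):
    """Keep whichever of the two scores is larger."""
    return max(feature_score, network_score)


def prioritize(pairs, rule):
    """Rank drug pairs by their combined score, highest first.

    pairs: dict mapping (drug_a, drug_b) -> (feature_score, network_score).
    rule:  function taking the two scores and returning one combined score.
    """
    combined = {pair: rule(f, n) for pair, (f, n) in pairs.items()}
    return sorted(combined, key=combined.get, reverse=True)


# Made-up scores for three hypothetical drug pairs.
example_pairs = {
    ("d1", "d2"): (0.9, 0.2),
    ("d1", "d3"): (0.5, 0.7),
    ("d2", "d3"): (0.1, 0.3),
}

ranking_avg = prioritize(example_pairs, weighted_average_rule)
ranking_max = prioritize(example_pairs, max_rule)
```

With these made-up scores the two rules rank the top two pairs differently, which is why a comparison of single and ensemble models, as the abstract describes, is informative.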
6
Griffin PJ, Zhang Y, Johnson WE, Kolaczyk ED. Detection of multiple perturbations in multi-omics biological networks. Biometrics 2018; 74:1351-1361. [PMID: 29772079 DOI: 10.1111/biom.12893] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 10/01/2016] [Revised: 04/01/2018] [Accepted: 04/01/2018] [Indexed: 01/24/2023]
Abstract
Cellular mechanism-of-action is of fundamental concern in many biological studies. It is of particular interest for identifying the cause of disease and learning the way in which treatments act against disease. However, pinpointing such mechanisms is difficult, due to the fact that small perturbations to the cell can have wide-ranging downstream effects. Given a snapshot of cellular activity, it can be challenging to tell where a disturbance originated. The presence of an ever-greater variety of high-throughput biological data offers an opportunity to examine cellular behavior from multiple angles, but also presents the statistical challenge of how to effectively analyze data from multiple sources. In this setting, we propose a method for mechanism-of-action inference by extending network filtering to multi-attribute data. We first estimate a joint Gaussian graphical model across multiple data types using penalized regression and filter for network effects. We then apply a set of likelihood ratio tests to identify the most likely site of the original perturbation. In addition, we propose a conditional testing procedure to allow for detection of multiple perturbations. We demonstrate this methodology on paired gene expression and methylation data from The Cancer Genome Atlas (TCGA).
Affiliation(s)
- Paula J Griffin
  - Department of Biostatistics, Boston University School of Public Health, Boston, U.S.A
- Yuqing Zhang
  - Division of Computational Biomedicine, Boston University School of Medicine, Boston, U.S.A
  - Graduate Program in Bioinformatics, Boston University, Boston, U.S.A
- William Evan Johnson
  - Department of Biostatistics, Boston University School of Public Health, Boston, U.S.A
  - Division of Computational Biomedicine, Boston University School of Medicine, Boston, U.S.A
  - Graduate Program in Bioinformatics, Boston University, Boston, U.S.A
- Eric D Kolaczyk
  - Graduate Program in Bioinformatics, Boston University, Boston, U.S.A
  - Department of Mathematics and Statistics, Boston University, Boston, U.S.A
7
Sam E, Athri P. Web-based drug repurposing tools: a survey. Brief Bioinform 2017; 20:299-316. [DOI: 10.1093/bib/bbx125] [Citation(s) in RCA: 30] [Impact Index Per Article: 4.3] [Received: 07/03/2017] [Indexed: 12/15/2022] Open
Affiliation(s)
- Elizabeth Sam
  - Department of Computer Science & Engineering, Amrita University, Bengaluru, India
- Prashanth Athri
  - Department of Computer Science & Engineering, Amrita University, Bengaluru, India
8
Iorio F, Bernardo-Faura M, Gobbi A, Cokelaer T, Jurman G, Saez-Rodriguez J. Efficient randomization of biological networks while preserving functional characterization of individual nodes. BMC Bioinformatics 2016; 17:542. [PMID: 27998275 PMCID: PMC5168876 DOI: 10.1186/s12859-016-1402-1] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.9] [Received: 08/14/2016] [Accepted: 12/01/2016] [Indexed: 12/28/2022] Open
Abstract
BACKGROUND Networks are popular and powerful tools to describe and model biological processes. Many computational methods have been developed to infer biological networks from literature, high-throughput experiments, and combinations of both. Additionally, a wide range of tools has been developed to map experimental data onto reference biological networks, in order to extract meaningful modules. Many of these methods assess results' significance against null distributions of randomized networks. However, these standard unconstrained randomizations do not preserve the functional characterization of the nodes in the reference networks (i.e. their degrees and connection signs), hence including potential biases in the assessment. RESULTS Building on our previous work about rewiring bipartite networks, we propose a method for rewiring any type of unweighted network. In particular we formally demonstrate that the problem of rewiring a signed and directed network preserving its functional connectivity (F-rewiring) reduces to the problem of rewiring two induced bipartite networks. Additionally, we reformulate the lower bound to the iterations' number of the switching-algorithm to make it suitable for the F-rewiring of networks of any size. Finally, we present BiRewire3, an open-source Bioconductor package enabling the F-rewiring of any type of unweighted network. We illustrate its application to a case study about the identification of modules from gene expression data mapped on protein interaction networks, and a second one focused on building logic models from more complex signed-directed reference signaling networks and phosphoproteomic data. CONCLUSIONS BiRewire3 is freely available at https://www.bioconductor.org/packages/BiRewire/, and it should have a broad application as it allows an efficient and analytically derived statistical assessment of results from any network biology tool.
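The switching algorithm mentioned in the abstract above can be illustrated with a minimal sketch, assuming the simplest setting: an undirected, unsigned, unweighted graph without self-loops. This is not the BiRewire3 implementation and does not cover the signed, directed F-rewiring the paper describes. The function names `switch_rewire` and `degrees` and the `n_switches` parameter are hypothetical.

```python
import random


def switch_rewire(edges, n_switches, seed=0):
    """Randomize an undirected simple graph while keeping every node's degree.

    Repeatedly picks two edges (a, b) and (c, d) and swaps endpoints to get
    (a, d) and (c, b), rejecting swaps that create self-loops or duplicates.
    """
    rng = random.Random(seed)
    # Store each edge as a sorted tuple so (u, v) and (v, u) compare equal.
    edge_list = [tuple(sorted(e)) for e in edges]
    edge_set = set(edge_list)
    for _ in range(n_switches):
        i, j = rng.sample(range(len(edge_list)), 2)
        a, b = edge_list[i]
        c, d = edge_list[j]
        # Pick one of the two possible endpoint pairings at random.
        if rng.random() < 0.5:
            c, d = d, c
        new1 = tuple(sorted((a, d)))
        new2 = tuple(sorted((c, b)))
        if a == d or c == b or new1 in edge_set or new2 in edge_set:
            continue  # would create a self-loop or a duplicate edge
        edge_set -= {edge_list[i], edge_list[j]}
        edge_set |= {new1, new2}
        edge_list[i], edge_list[j] = new1, new2
    return edge_list


def degrees(edges):
    """Count how many edges touch each node."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return deg


# Toy graph: a ring of 8 nodes plus two chords.
ring = [(k, (k + 1) % 8) for k in range(8)] + [(0, 4), (2, 6)]
rewired = switch_rewire(ring, n_switches=200)
```

Each accepted switch removes one edge end from each of the four nodes involved and gives one back, so the degree sequence is unchanged by construction. The number of switches needed for a well-mixed null model is the subject of the lower bound discussed in the abstract and is not addressed in this sketch.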
Affiliation(s)
- Francesco Iorio
  - European Molecular Biology Laboratory - European Bioinformatics Institute, Wellcome Genome Campus, Cambridge, CB10 1SD, UK
- Marti Bernardo-Faura
  - European Molecular Biology Laboratory - European Bioinformatics Institute, Wellcome Genome Campus, Cambridge, CB10 1SD, UK
  - Centre for Research in Agricultural Genomics (CRAG), CSIC-IRTA-UAB-UB, Campus UAB, Bellaterra, Barcelona, 08193, Spain
- Andrea Gobbi
  - Fondazione Bruno Kessler, Povo, Trento, I-38122, Italy
- Thomas Cokelaer
  - European Molecular Biology Laboratory - European Bioinformatics Institute, Wellcome Genome Campus, Cambridge, CB10 1SD, UK
  - Institut Pasteur - Bioinformatics and Biostatistics Hub - C3BI, USR 3756 IP CNRS, Paris, France
- Julio Saez-Rodriguez
  - European Molecular Biology Laboratory - European Bioinformatics Institute, Wellcome Genome Campus, Cambridge, CB10 1SD, UK
  - RWTH Aachen University, Faculty of Medicine, Joint Research Centre for Computational Biomedicine (JRC-COMBINE), MTI2 Wendlingweg 2, Aachen, 52074, Germany
9
Mayer G, Marcus K, Eisenacher M, Kohl M. Boolean modeling techniques for protein co-expression networks in systems medicine. Expert Rev Proteomics 2016; 13:555-69. [PMID: 27105325 DOI: 10.1080/14789450.2016.1181546] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 10/21/2022]
Abstract
INTRODUCTION Application of systems biology/systems medicine approaches is promising for proteomics/biomedical research, but requires selection of an adequate modeling type. AREAS COVERED This article reviews the existing Boolean network modeling approaches, which provide in comparison with alternative modeling techniques several advantages for the processing of proteomics data. Application of methods for inference, reduction and validation of protein co-expression networks that are derived from quantitative high-throughput proteomics measurements is presented. It's also shown how Boolean models can be used to derive system-theoretic characteristics that describe both the dynamical behavior of such networks as a whole and the properties of different cell states (e.g. healthy or diseased cell states). Furthermore, application of methods derived from control theory is proposed in order to simulate the effects of therapeutic interventions on such networks, which is a promising approach for the computer-assisted discovery of biomarkers and drug targets. Finally, the clinical application of Boolean modeling analyses is discussed. Expert commentary: Boolean modeling of proteomics data is still in its infancy. Progress in this field strongly depends on provision of a repository with public access to relevant reference models. Also required are community supported standards that facilitate input of both proteomics and patient related data (e.g. age, gender, laboratory results, etc.).
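As a rough illustration of the Boolean modeling idea reviewed above, here is a minimal sketch of a synchronous Boolean network and a search for its attractors, which in such models are commonly read as stable cell states. The three nodes and their update rules are hypothetical toy choices and do not come from the review.

```python
from itertools import product

# Hypothetical three-protein toy network; the rules are illustrative only.
RULES = {
    "A": lambda s: not s["C"],         # C inhibits A
    "B": lambda s: s["A"],             # A activates B
    "C": lambda s: s["A"] and s["B"],  # A and B jointly activate C
}
NODES = sorted(RULES)


def step(state):
    """Synchronous update: every node reads the same previous state."""
    return {node: bool(rule(state)) for node, rule in RULES.items()}


def find_attractor(state):
    """Iterate from `state` until a state repeats; return the repeating cycle."""
    seen = []
    key = tuple(state[n] for n in NODES)
    while key not in seen:
        seen.append(key)
        state = step(state)
        key = tuple(state[n] for n in NODES)
    return seen[seen.index(key):]


def all_attractors():
    """Collect the distinct attractors reached from every possible initial state."""
    found = set()
    for values in product([False, True], repeat=len(NODES)):
        cycle = find_attractor(dict(zip(NODES, values)))
        found.add(frozenset(cycle))
    return found


attractors = all_attractors()
```

Exhaustive enumeration over all initial states is only feasible for small networks, since the state space doubles with every added node. This is one reason the network reduction methods mentioned in the abstract matter in practice.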
Affiliation(s)
- Gerhard Mayer
  - Medizinisches Proteom Center (MPC), Ruhr-Universität Bochum, Bochum, Germany
- Katrin Marcus
  - Medizinisches Proteom Center (MPC), Ruhr-Universität Bochum, Bochum, Germany
- Martin Eisenacher
  - Medizinisches Proteom Center (MPC), Ruhr-Universität Bochum, Bochum, Germany
- Michael Kohl
  - Medizinisches Proteom Center (MPC), Ruhr-Universität Bochum, Bochum, Germany
10
Dräger A, Zielinski DC, Keller R, Rall M, Eichner J, Palsson BO, Zell A. SBMLsqueezer 2: context-sensitive creation of kinetic equations in biochemical networks. BMC Systems Biology 2015; 9:68. [PMID: 26452770 PMCID: PMC4600286 DOI: 10.1186/s12918-015-0212-9] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.1] [Received: 03/27/2015] [Accepted: 09/15/2015] [Indexed: 12/25/2022]
Abstract
BACKGROUND The size and complexity of published biochemical network reconstructions are steadily increasing, expanding the potential scale of derived computational models. However, the construction of large biochemical network models is a laborious and error-prone task. Automated methods have simplified the network reconstruction process, but building kinetic models for these systems is still a manually intensive task. Appropriate kinetic equations, based upon reaction rate laws, must be constructed and parameterized for each reaction. The complex test-and-evaluation cycles that can be involved during kinetic model construction would thus benefit from automated methods for rate law assignment. RESULTS We present a high-throughput algorithm to automatically suggest and create suitable rate laws based upon reaction type according to several criteria. The criteria for choices made by the algorithm can be influenced in order to assign the desired type of rate law to each reaction. This algorithm is implemented in the software package SBMLsqueezer 2. In addition, this program contains an integrated connection to the kinetics database SABIO-RK to obtain experimentally-derived rate laws when desired. CONCLUSIONS The described approach fills a heretofore absent niche in workflows for large-scale biochemical kinetic model construction. In several applications the algorithm has already been demonstrated to be useful and scalable. SBMLsqueezer is platform independent and can be used as a stand-alone package, as an integrated plugin, or through a web interface, enabling flexible solutions and use-case scenarios.
Affiliation(s)
- Andreas Dräger
  - Systems Biology Research Group, University of California, San Diego, 9500 Gilman Drive, La Jolla, 92093-0412, CA, USA
  - Center for Bioinformatics Tuebingen (ZBIT), University of Tuebingen, Sand 1, Tübingen, 72076, Germany
- Daniel C Zielinski
  - Systems Biology Research Group, University of California, San Diego, 9500 Gilman Drive, La Jolla, 92093-0412, CA, USA
- Roland Keller
  - Center for Bioinformatics Tuebingen (ZBIT), University of Tuebingen, Sand 1, Tübingen, 72076, Germany
- Matthias Rall
  - Center for Bioinformatics Tuebingen (ZBIT), University of Tuebingen, Sand 1, Tübingen, 72076, Germany
- Johannes Eichner
  - Center for Bioinformatics Tuebingen (ZBIT), University of Tuebingen, Sand 1, Tübingen, 72076, Germany
- Bernhard O Palsson
  - Systems Biology Research Group, University of California, San Diego, 9500 Gilman Drive, La Jolla, 92093-0412, CA, USA
  - Novo Nordisk Foundation Center for Biosustainability, Kogle Allé 6, Hørsholm, 2970, Denmark
- Andreas Zell
  - Center for Bioinformatics Tuebingen (ZBIT), University of Tuebingen, Sand 1, Tübingen, 72076, Germany
11
Folch-Fortuny A, Villaverde AF, Ferrer A, Banga JR. Enabling network inference methods to handle missing data and outliers. BMC Bioinformatics 2015; 16:283. [PMID: 26335628 PMCID: PMC4559359 DOI: 10.1186/s12859-015-0717-7] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Received: 02/03/2015] [Accepted: 08/24/2015] [Indexed: 12/20/2022] Open
Abstract
Background The inference of complex networks from data is a challenging problem in biological sciences, as well as in a wide range of disciplines such as chemistry, technology, economics, or sociology. The quantity and quality of the data greatly affect the results. While many methodologies have been developed for this task, they seldom take into account issues such as missing data or outlier detection and correction, which need to be properly addressed before network inference. Results Here we present an approach to (i) handle missing data and (ii) detect and correct outliers based on multivariate projection to latent structures. The method, called trimmed scores regression (TSR), enables network inference methods to analyse incomplete datasets by imputing the missing values coherently with the latent data structure. Furthermore, it substitutes the faulty values in a dataset by proper estimations. We provide an implementation of this approach, and show how it can be integrated with any network inference method as a preliminary data curation step. This functionality is demonstrated with a state of the art network inference method based on mutual information distance and entropy reduction, MIDER. Conclusion The methodology presented here enables network inference methods to analyse a large number of incomplete and faulty datasets that could not be reliably analysed so far. Our comparative studies show the superiority of TSR over other missing data approaches used by practitioners. Furthermore, the method allows for outlier detection and correction. Electronic supplementary material The online version of this article (doi:10.1186/s12859-015-0717-7) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Abel Folch-Fortuny
  - Departamento de Estadística e Investigación Operativa Aplicadas y Calidad, Universitat Politècnica de València, Camino de Vera s/n, Valencia, 46022, Spain
- Alejandro F Villaverde
  - BioProcess Engineering Group, IIM-CSIC, Eduardo Cabello 6, Vigo, 36208, Spain
  - Centre of Biological Engineering, Universidade do Minho, Campus de Gualtar, Braga, 4710-057, Portugal
  - Department of Systems and Control Engineering, Universidade de Vigo, Rua Maxwell, Vigo, 36310, Spain
- Alberto Ferrer
  - Departamento de Estadística e Investigación Operativa Aplicadas y Calidad, Universitat Politècnica de València, Camino de Vera s/n, Valencia, 46022, Spain
- Julio R Banga
  - BioProcess Engineering Group, IIM-CSIC, Eduardo Cabello 6, Vigo, 36208, Spain
12
Martínez-Jiménez F, Marti-Renom MA. Ligand-target prediction by structural network biology using nAnnoLyze. PLoS Comput Biol 2015; 11:e1004157. [PMID: 25816344 PMCID: PMC4376866 DOI: 10.1371/journal.pcbi.1004157] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 08/26/2014] [Accepted: 01/27/2015] [Indexed: 11/24/2022] Open
Abstract
Target identification is essential for drug design, drug-drug interaction prediction, dosage adjustment and side effect anticipation. Specifically, the knowledge of structural details is essential for understanding the mode of action of a compound on a target protein. Here, we present nAnnoLyze, a method for target identification that relies on the hypothesis that structurally similar binding sites bind similar ligands. nAnnoLyze integrates structural information into a bipartite network of interactions and similarities to predict structurally detailed compound-protein interactions at proteome scale. The method was benchmarked on a dataset of 6,282 known interacting ligand-target pairs, reaching an area under the Receiver Operating Characteristic curve (AUC) of 0.96 when using the drug names as an input feature for the classifier, and an AUC of 0.70 for “anonymous” compounds or compounds not present in the training set. nAnnoLyze resulted in higher accuracies than its predecessor, AnnoLyze. We applied the method to predict interactions for all the compounds in the DrugBank database with each human protein structure and provide examples of target identification for known drugs against human diseases. The accuracy and applicability of our method to any compound indicate that a comparative docking approach such as nAnnoLyze enables large-scale annotation and analysis of compound–protein interactions and thus may benefit drug development.
Description of the “mode-of-action” of a small chemical compound against a protein target is essential for the drug discovery process. Such description relies on three main steps: i) the identification of the target protein within the thousands of proteins in an organism, ii) the localization of the binding interaction site in the identified target protein, and iii) the molecular characterization of the compound’s binding mode in the binding site of the target protein. Here, we introduce a new computational method, called nAnnoLyze, which uses graph theory principles to relate compounds and target proteins based on comparative principles. nAnnoLyze aims at addressing two of the three previous steps, that is, target identification and binding site localization. Our results suggest that the accuracy and proteome-wide applicability of nAnnoLyze enable the large-scale annotation and analysis of compound–protein interactions and thus may benefit drug development.
Affiliation(s)
- Francisco Martínez-Jiménez
  - Genome Biology Group, Centre Nacional d’Anàlisi Genòmica (CNAG), Barcelona, Spain
  - Gene Regulation, Stem Cells and Cancer Program, Centre for Genomic Regulation (CRG), Barcelona, Spain
- Marc A. Marti-Renom
  - Genome Biology Group, Centre Nacional d’Anàlisi Genòmica (CNAG), Barcelona, Spain
  - Gene Regulation, Stem Cells and Cancer Program, Centre for Genomic Regulation (CRG), Barcelona, Spain
  - Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
13
Cakır T, Khatibipour MJ. Metabolic network discovery by top-down and bottom-up approaches and paths for reconciliation. Front Bioeng Biotechnol 2014; 2:62. [PMID: 25520953 PMCID: PMC4253960 DOI: 10.3389/fbioe.2014.00062] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Received: 09/24/2014] [Accepted: 11/14/2014] [Indexed: 11/13/2022] Open
Abstract
The primary focus in the network-centric analysis of cellular metabolism by systems biology approaches is to identify the active metabolic network for the condition of interest. Two major approaches are available for the discovery of the condition-specific metabolic networks. One approach starts from genome-scale metabolic networks, which cover all possible reactions known to occur in the related organism in a condition-independent manner, and applies methods such as the optimization-based Flux-Balance Analysis to elucidate the active network. The other approach starts from the condition-specific metabolome data, and processes the data with statistical or optimization-based methods to extract information content of the data such that the active network is inferred. These approaches, termed bottom-up and top-down, respectively, are currently employed independently. However, considering that both approaches have the same goal, they can both benefit from each other paving the way for the novel integrative analysis methods of metabolome data- and flux-analysis approaches in the post-genomic era. This study reviews the strengths of constraint-based analysis and network inference methods reported in the metabolic systems biology field; then elaborates on the potential paths to reconcile the two approaches to shed better light on how the metabolism functions.
Affiliation(s)
- Tunahan Cakır
- Computational Systems Biology Group, Department of Bioengineering, Gebze Technical University (formerly known as Gebze Institute of Technology), Gebze, Turkey
- Mohammad Jafar Khatibipour
- Computational Systems Biology Group, Department of Bioengineering, Gebze Technical University (formerly known as Gebze Institute of Technology), Gebze, Turkey; Department of Chemical Engineering, Gebze Technical University (formerly known as Gebze Institute of Technology), Gebze, Turkey
|
14
|
Studham ME, Tjärnberg A, Nordling TEM, Nelander S, Sonnhammer ELL. Functional association networks as priors for gene regulatory network inference. Bioinformatics 2014; 30:i130-8. [PMID: 24931976 PMCID: PMC4058914 DOI: 10.1093/bioinformatics/btu285] [Citation(s) in RCA: 55] [Impact Index Per Article: 5.5] [Indexed: 12/11/2022]
Abstract
Motivation: Gene regulatory network (GRN) inference reveals the influences genes have on one another in cellular regulatory systems. If the experimental data are inadequate for reliable inference of the network, informative priors have been shown to improve the accuracy of inferences. Results: This study explores the potential of undirected, confidence-weighted networks, such as those in functional association databases, as a prior source for GRN inference. Such networks often erroneously indicate symmetric interaction between genes and may contain mostly correlation-based interaction information. Despite these drawbacks, our testing on synthetic datasets indicates that even noisy priors reflect some causal information that can improve GRN inference accuracy. Our analysis on yeast data indicates that using the functional association databases FunCoup and STRING as priors can give a small improvement in GRN inference accuracy with biological data. Contact: matthew.studham@scilifelab.se Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Matthew E Studham
- Stockholm Bioinformatics Centre, Science for Life Laboratory, SE-171 65 Solna, Sweden, Department of Biochemistry and Biophysics, Stockholm University, SE-106 91 Stockholm, Sweden, Department of Immunology, Genetics and Pathology, Uppsala University, Rudbeck Laboratory, SE-751 05 Uppsala, Sweden and Swedish eScience Research Center, SE-100 44 Stockholm, Sweden
- Andreas Tjärnberg
- Stockholm Bioinformatics Centre, Science for Life Laboratory, SE-171 65 Solna, Sweden, Department of Biochemistry and Biophysics, Stockholm University, SE-106 91 Stockholm, Sweden, Department of Immunology, Genetics and Pathology, Uppsala University, Rudbeck Laboratory, SE-751 05 Uppsala, Sweden and Swedish eScience Research Center, SE-100 44 Stockholm, Sweden
- Torbjörn E M Nordling
- Stockholm Bioinformatics Centre, Science for Life Laboratory, SE-171 65 Solna, Sweden, Department of Biochemistry and Biophysics, Stockholm University, SE-106 91 Stockholm, Sweden, Department of Immunology, Genetics and Pathology, Uppsala University, Rudbeck Laboratory, SE-751 05 Uppsala, Sweden and Swedish eScience Research Center, SE-100 44 Stockholm, Sweden
- Sven Nelander
- Stockholm Bioinformatics Centre, Science for Life Laboratory, SE-171 65 Solna, Sweden, Department of Biochemistry and Biophysics, Stockholm University, SE-106 91 Stockholm, Sweden, Department of Immunology, Genetics and Pathology, Uppsala University, Rudbeck Laboratory, SE-751 05 Uppsala, Sweden and Swedish eScience Research Center, SE-100 44 Stockholm, Sweden
- Erik L L Sonnhammer
- Stockholm Bioinformatics Centre, Science for Life Laboratory, SE-171 65 Solna, Sweden, Department of Biochemistry and Biophysics, Stockholm University, SE-106 91 Stockholm, Sweden, Department of Immunology, Genetics and Pathology, Uppsala University, Rudbeck Laboratory, SE-751 05 Uppsala, Sweden and Swedish eScience Research Center, SE-100 44 Stockholm, Sweden
|
15
|
Villaverde AF, Ross J, Morán F, Banga JR. MIDER: network inference with mutual information distance and entropy reduction. PLoS One 2014; 9:e96732. [PMID: 24806471 PMCID: PMC4013075 DOI: 10.1371/journal.pone.0096732] [Citation(s) in RCA: 91] [Impact Index Per Article: 9.1] [Received: 11/05/2013] [Accepted: 04/09/2014] [Indexed: 01/14/2023] Open
Abstract
The prediction of links among variables from a given dataset is a task referred to as network inference or reverse engineering. It is an open problem in bioinformatics and systems biology, as well as in other areas of science. Information theory, which uses concepts such as mutual information, provides a rigorous framework for addressing it. While a number of information-theoretic methods are already available, most of them focus on a particular type of problem, introducing assumptions that limit their generality. Furthermore, many of these methods lack a publicly available implementation. Here we present MIDER, a method for inferring network structures with information-theoretic concepts. It consists of two steps: first, it provides a representation of the network in which the distance among nodes indicates their statistical closeness. Second, it refines the prediction of the existing links to distinguish between direct and indirect interactions and to assign directionality. The method accepts as input time-series data related to some quantitative features of the network nodes (e.g. concentrations, if the nodes are chemical species). It takes into account time delays between variables, and allows choosing among several definitions and normalizations of mutual information. It is general purpose: it may be applied to any type of network, cellular or otherwise. A Matlab implementation including source code and data is freely available (http://www.iim.csic.es/~gingproc/mider.html). The performance of MIDER has been evaluated on seven different benchmark problems that cover the main types of cellular networks, including metabolic, gene regulatory, and signaling. Comparisons with state-of-the-art information-theoretic methods have demonstrated the competitive performance of MIDER, as well as its versatility. Its use does not demand any a priori knowledge from the user; the default settings and the adaptive nature of the method provide good results for a wide range of problems without requiring tuning.
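As a rough illustration of the quantity this abstract builds on, the sketch below estimates mutual information between two time series and scans a few time delays. It is not the MIDER implementation (which is distributed as Matlab code at the URL above): the equal-width binning, the plug-in estimator, and the function names discretize, mutual_information, lagged_mi and best_lag are all assumptions chosen for brevity.

```python
import math
from collections import Counter


def discretize(series, bins):
    """Map each value to an equal-width bin index in [0, bins - 1] (assumed scheme)."""
    lo, hi = min(series), max(series)
    if hi == lo:
        return [0] * len(series)
    width = (hi - lo) / bins
    return [min(int((v - lo) / width), bins - 1) for v in series]


def mutual_information(x, y, bins=4):
    """Plug-in estimate of I(X;Y) in bits from paired samples x[i], y[i]."""
    xs, ys = discretize(x, bins), discretize(y, bins)
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    # sum over observed cells of p(a,b) * log2( p(a,b) / (p(a) p(b)) )
    return sum(
        (c / n) * math.log2(c * n / (px[a] * py[b]))
        for (a, b), c in pxy.items()
    )


def lagged_mi(x, y, lag, bins=4):
    """Estimate I(X(t); Y(t + lag)) by pairing x[t] with y[t + lag]."""
    if lag == 0:
        return mutual_information(x, y, bins)
    return mutual_information(x[:-lag], y[lag:], bins)


def best_lag(x, y, max_lag, bins=4):
    """Return the delay in 0..max_lag with the highest estimated mutual information."""
    return max(range(max_lag + 1), key=lambda k: lagged_mi(x, y, k, bins))
```

Calling best_lag(x, y, max_lag) on two series of equal length gives the delay at which y is most informative about x under this estimator. Short series and coarse bins bias the plug-in estimate, so the values are best read as relative scores rather than exact information content.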
Affiliation(s)
- John Ross
- Department of Chemistry, Stanford University, Stanford, California, United States of America
- Federico Morán
- Department of Biochemistry and Molecular Biology, Complutense University, Madrid, Spain
|
16
|
Kell DB, Goodacre R. Metabolomics and systems pharmacology: why and how to model the human metabolic network for drug discovery. Drug Discov Today 2014; 19:171-82. [PMID: 23892182 PMCID: PMC3989035 DOI: 10.1016/j.drudis.2013.07.014] [Citation(s) in RCA: 118] [Impact Index Per Article: 11.8] [Received: 03/25/2013] [Revised: 07/03/2013] [Accepted: 07/16/2013] [Indexed: 02/06/2023]
Abstract
Metabolism represents the 'sharp end' of systems biology, because changes in metabolite concentrations are necessarily amplified relative to changes in the transcriptome, proteome and enzyme activities, which can be modulated by drugs. To understand such behaviour, we therefore need (and increasingly have) reliable consensus (community) models of the human metabolic network that include the important transporters. Small molecule 'drug' transporters are in fact metabolite transporters, because drugs bear structural similarities to metabolites known from the network reconstructions and from measurements of the metabolome. Recon2 represents the present state-of-the-art human metabolic network reconstruction; it can predict inter alia: (i) the effects of inborn errors of metabolism; (ii) which metabolites are exometabolites, and (iii) how metabolism varies between tissues and cellular compartments. However, even these qualitative network models are not yet complete. As our understanding improves so do we recognise more clearly the need for a systems (poly)pharmacology.
Affiliation(s)
- Douglas B Kell
- School of Chemistry and Manchester Institute of Biotechnology, The University of Manchester, 131 Princess Street, Manchester M1 7DN, UK
- Royston Goodacre
- School of Chemistry and Manchester Institute of Biotechnology, The University of Manchester, 131 Princess Street, Manchester M1 7DN, UK
|
17
|
Sturla SJ, Boobis AR, FitzGerald RE, Hoeng J, Kavlock RJ, Schirmer K, Whelan M, Wilks MF, Peitsch MC. Systems toxicology: from basic research to risk assessment. Chem Res Toxicol 2014; 27:314-29. [PMID: 24446777 PMCID: PMC3964730 DOI: 10.1021/tx400410s] [Citation(s) in RCA: 211] [Impact Index Per Article: 21.1] [Indexed: 12/22/2022]
Abstract
Systems Toxicology is the integration of classical toxicology with quantitative analysis of large networks of molecular and functional changes occurring across multiple levels of biological organization. Society demands increasingly close scrutiny of the potential health risks associated with exposure to chemicals present in our everyday life, leading to an increasing need for more predictive and accurate risk-assessment approaches. Developing such approaches requires a detailed mechanistic understanding of the ways in which xenobiotic substances perturb biological systems and lead to adverse outcomes. Thus, Systems Toxicology approaches offer modern strategies for gaining such mechanistic knowledge by combining advanced analytical and computational tools. Furthermore, Systems Toxicology is a means for the identification and application of biomarkers for improved safety assessments. In Systems Toxicology, quantitative systems-wide molecular changes in the context of an exposure are measured, and a causal chain of molecular events linking exposures with adverse outcomes (i.e., functional and apical end points) is deciphered. Mathematical models are then built to describe these processes in a quantitative manner. The integrated data analysis leads to the identification of how biological networks are perturbed by the exposure and enables the development of predictive mathematical models of toxicological processes. This perspective integrates current knowledge regarding bioanalytical approaches, computational analysis, and the potential for improved risk assessment.
Affiliation(s)
- Shana J Sturla
- Department of Health Sciences and Technology, Institute of Food, Nutrition and Health, ETH Zürich, Schmelzbergstrasse 9, 8092 Zürich, Switzerland
|
18
|
Hoeng J, Talikka M, Martin F, Sewer A, Yang X, Iskandar A, Schlage WK, Peitsch MC. Case study: the role of mechanistic network models in systems toxicology. Drug Discov Today 2013; 19:183-92. [PMID: 23933191 DOI: 10.1016/j.drudis.2013.07.023] [Citation(s) in RCA: 49] [Impact Index Per Article: 4.5] [Received: 04/26/2013] [Revised: 07/14/2013] [Accepted: 07/25/2013] [Indexed: 10/26/2022]
Abstract
Twenty-first-century systems toxicology approaches enable the discovery of biological pathways affected in response to active substances. Here, we briefly summarize current network approaches that facilitate the detailed mechanistic understanding of the impact of a given stimulus on a biological system. We also introduce our network-based method with two use cases and show how causal biological network models combined with computational methods provide quantitative mechanistic insights. Our approach provides a robust comparison of the transcriptional responses in different experimental systems and enables the identification of network-based biomarkers modulated in response to exposure. These advances can also be applied to pharmacology, where the understanding of disease mechanisms and adverse drug effects is imperative for the development of efficient and safe treatment options.
Affiliation(s)
- Julia Hoeng
- Philip Morris International R&D, Philip Morris Products S.A., Quai Jeanrenaud 5, 2000 Neuchâtel, Switzerland
- Marja Talikka
- Philip Morris International R&D, Philip Morris Products S.A., Quai Jeanrenaud 5, 2000 Neuchâtel, Switzerland
- Florian Martin
- Philip Morris International R&D, Philip Morris Products S.A., Quai Jeanrenaud 5, 2000 Neuchâtel, Switzerland
- Alain Sewer
- Philip Morris International R&D, Philip Morris Products S.A., Quai Jeanrenaud 5, 2000 Neuchâtel, Switzerland
- Xiang Yang
- Philip Morris International R&D, Philip Morris Products S.A., Quai Jeanrenaud 5, 2000 Neuchâtel, Switzerland
- Anita Iskandar
- Philip Morris International R&D, Philip Morris Products S.A., Quai Jeanrenaud 5, 2000 Neuchâtel, Switzerland
- Walter K Schlage
- Philip Morris International R&D, Philip Morris Products S.A., Quai Jeanrenaud 5, 2000 Neuchâtel, Switzerland
- Manuel C Peitsch
- Philip Morris International R&D, Philip Morris Products S.A., Quai Jeanrenaud 5, 2000 Neuchâtel, Switzerland
|
19
|
Wang Z, Deisboeck TS. Mathematical modeling in cancer drug discovery. Drug Discov Today 2013; 19:145-50. [PMID: 23831857 DOI: 10.1016/j.drudis.2013.06.015] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.4] [Received: 05/01/2013] [Revised: 06/25/2013] [Accepted: 06/27/2013] [Indexed: 12/20/2022]
Abstract
Mathematical models have the potential to help discover new therapeutic targets and treatment strategies. In this review, we discuss how the latest developments in mathematical modeling can provide useful context for the rational design, validation and prioritization of novel cancer drug targets and their combinations. We give special attention to two modeling approaches: network-based modeling and multiscale modeling, because they have begun to show promise in facilitating the process of effective cancer drug discovery. Both modeling approaches are integrated with a variety of experimental methods to ensure proper parameterization and to maximize their predictive value. We also discuss several challenges faced in modeling-based drug discovery.
Affiliation(s)
- Zhihui Wang
- Department of Pathology, University of New Mexico, Albuquerque, NM 87131, USA
|