1
Wang H, Wang J, Dong C, Lian Y, Liu D, Yan Z. A Novel Approach for Drug-Target Interactions Prediction Based on Multimodal Deep Autoencoder. Front Pharmacol 2020; 10:1592. [PMID: 32047432 PMCID: PMC6997437 DOI: 10.3389/fphar.2019.01592] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 06/16/2019] [Accepted: 12/09/2019] [Indexed: 01/09/2023]
Abstract
Drug targets are biomacromolecules or biomolecular structures that bind to specific drugs and produce therapeutic effects. Therefore, the prediction of drug-target interactions (DTIs) is important for disease therapy. Incorporating multiple similarity measures for drugs and targets is essential for improving the accuracy of DTI prediction. However, existing studies that use multiple similarity measures ignore the global structure information of those measures and require manual feature extraction for drug-target pairs, which overlooks the non-linear relationships among features. In this paper, we propose MDADTI, a novel approach for DTI prediction based on a multimodal deep autoencoder (MDA). MDADTI applies random walk with restart and positive pointwise mutual information to calculate topological similarity matrices of drugs and targets, capturing the global structure information of the similarity measures. MDADTI then uses the multimodal deep autoencoder to fuse the multiple topological similarity matrices, automatically learns low-dimensional features of drugs and targets, and applies a deep neural network to predict DTIs. The results of five repeats of 10-fold cross-validation under three different cross-validation settings indicate that MDADTI is superior to four baseline methods. In addition, we validated the predictions of MDADTI against six drug-target interaction reference databases, and the results show that MDADTI can effectively identify unknown DTIs.
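The positive pointwise mutual information (PPMI) step named in the abstract can be illustrated with a small sketch. This is not the authors' code: the function name `ppmi` and the toy score matrix are assumptions, and in MDADTI the input would be the output of random walk with restart rather than made-up numbers.

```python
import math

def ppmi(matrix):
    """Return the positive pointwise mutual information of a non-negative matrix."""
    total = sum(sum(row) for row in matrix)
    row_sums = [sum(row) for row in matrix]
    col_sums = [sum(col) for col in zip(*matrix)]
    result = []
    for i, row in enumerate(matrix):
        out_row = []
        for j, value in enumerate(row):
            if value > 0 and row_sums[i] > 0 and col_sums[j] > 0:
                # PMI = log( p(i, j) / (p(i) * p(j)) ), written with raw sums.
                pmi = math.log(value * total / (row_sums[i] * col_sums[j]))
                out_row.append(max(0.0, pmi))  # keep only positive associations
            else:
                out_row.append(0.0)
        result.append(out_row)
    return result

# Hypothetical 3x3 matrix of diffusion scores (illustrative values only).
scores = [
    [0.6, 0.3, 0.1],
    [0.3, 0.5, 0.2],
    [0.1, 0.2, 0.7],
]
topological = ppmi(scores)
```

Entries that co-occur more often than their row and column totals would predict keep a positive weight; all other entries are clipped to zero, which is what makes the resulting matrix sparse.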
Affiliation(s)
- Huiqing Wang
- College of Information and Computer, Taiyuan University of Technology, Taiyuan, China
- Jingjing Wang
- College of Information and Computer, Taiyuan University of Technology, Taiyuan, China
- Chunlin Dong
- Dryland Agriculture Research Center, Shanxi Academy of Agricultural Sciences, Taiyuan, China
- Yuanyuan Lian
- College of Information and Computer, Taiyuan University of Technology, Taiyuan, China
- Dan Liu
- College of Information and Computer, Taiyuan University of Technology, Taiyuan, China
- Zhiliang Yan
- College of Information and Computer, Taiyuan University of Technology, Taiyuan, China
2
Cui X, Shen K, Xie Z, Liu T, Zhang H. Identification of key genes in colorectal cancer using random walk with restart. Mol Med Rep 2016; 15:867-872. [PMID: 28000901 DOI: 10.3892/mmr.2016.6058] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 11/18/2015] [Accepted: 11/15/2016] [Indexed: 11/05/2022]
Abstract
As the most common type of cancer and the second leading cause of cancer-associated mortality, colorectal cancer (CRC) has received increasing attention. The aim of the present study was to investigate the mechanisms of CRC by analyzing the microarray dataset GSE32323. The GSE32323 dataset was downloaded from the Gene Expression Omnibus and included 17 pairs of matched cancer and normal colorectal tissue samples. The differentially expressed genes (DEGs) were screened using the Linear Models for Microarray Data package, and a search for CRC genes, also denoted as seed genes, was performed using the Online Mendelian Inheritance in Man database. Subsequently, the protein-protein interaction (PPI) network was downloaded from the Search Tool for the Retrieval of Interacting Genes database, and the sub-network (CRC.PPI) of the DEGs and seed genes was obtained. In addition, the top 50 nodes with the highest affinity scores in the CRC.PPI were identified using random walk with restart analysis. The potential functions of the DEGs included in the top 50 nodes were analyzed using the Database for Annotation, Visualization and Integrated Discovery online tool. Using the Drug Gene Interaction database, drug-gene interaction analysis was performed to identify antineoplastic drugs that interact with the genes. A total of 1,640 DEGs between the CRC and normal samples were screened. The obtained seed genes included cyclin D1 (CCND1) and aurora kinase A (AURKA). The enriched functions for the 31 DEGs in the PPI network of the top 50 nodes were predominantly associated with the cell cycle. The DEGs may function in CRC by interacting with other genes in the PPI network of the top 50 nodes, for example, DEP domain-containing MTOR-interacting protein (DEPTOR)-CCND1, AURKA-breast carcinoma amplified sequence-1 (BCAS1), CCND1-BCAS1, CCND1-neural precursor cell expressed developmentally downregulated 9 (NEDD9) and CCND1-mitogen-activated protein kinase kinase 2 (MAP2K2). Only three DEGs (CCND1, AURKA and DEPTOR) had interactions with their corresponding antineoplastic drugs. Taken together, DEPTOR, AURKA, CCND1, BCAS1, NEDD9 and MAP2K2 may act in CRC.
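As a rough illustration of how random walk with restart assigns affinity scores to nodes, the sketch below runs the iteration on a toy graph built only from the five interactions named in the abstract, seeded with CCND1 and AURKA. The function name, the restart probability of 0.7 and the convergence tolerance are assumptions, not values from the study.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.7, tol=1e-10, max_iter=1000):
    """Iterate p = (1 - r) * W * p + r * p0 on an undirected graph.

    `adjacency` maps each node to a list of neighbours. The walker's
    probability at a node is split evenly among its neighbours.
    Assumes every node has at least one neighbour.
    """
    nodes = sorted(adjacency)
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            share = (1.0 - restart) * p[u] / len(adjacency[u])
            for v in adjacency[u]:
                nxt[v] += share
        change = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if change < tol:
            break
    return p

# Toy network: only the interactions listed in the abstract.
edges = [
    ("DEPTOR", "CCND1"), ("AURKA", "BCAS1"), ("CCND1", "BCAS1"),
    ("CCND1", "NEDD9"), ("CCND1", "MAP2K2"),
]
adjacency = {}
for a, b in edges:
    adjacency.setdefault(a, []).append(b)
    adjacency.setdefault(b, []).append(a)

scores = random_walk_with_restart(adjacency, seeds={"CCND1", "AURKA"})
ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

Taking the first 50 entries of `ranked` on a full network would correspond to the "top 50 nodes" selection described above.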
Affiliation(s)
- Xiaofeng Cui
- Department of Gastrointestinal Colorectal and Anal Surgery, China‑Japan Union Hospital, Jilin University, Changchun, Jilin 130033, P.R. China
- Kexin Shen
- Department of Gastrointestinal Colorectal and Anal Surgery, China‑Japan Union Hospital, Jilin University, Changchun, Jilin 130033, P.R. China
- Zhongshi Xie
- Department of Gastrointestinal Colorectal and Anal Surgery, China‑Japan Union Hospital, Jilin University, Changchun, Jilin 130033, P.R. China
- Tongjun Liu
- Department of General Surgery, Jilin University, Changchun, Jilin 130022, P.R. China
- Haishan Zhang
- Department of Gastrointestinal Colorectal and Anal Surgery, China‑Japan Union Hospital, Jilin University, Changchun, Jilin 130033, P.R. China
3
Li J, Lin X, Teng Y, Qi S, Xiao D, Zhang J, Kang Y. A Comprehensive Evaluation of Disease Phenotype Networks for Gene Prioritization. PLoS One 2016; 11:e0159457. [PMID: 27415759 PMCID: PMC4944959 DOI: 10.1371/journal.pone.0159457] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Received: 02/03/2016] [Accepted: 07/01/2016] [Indexed: 12/31/2022]
Abstract
Identification of disease-causing genes is a fundamental challenge for human health studies. The phenotypic similarity among diseases may reflect the interactions at the molecular level, and phenotype comparison can be used to predict disease candidate genes. Online Mendelian Inheritance in Man (OMIM) is a database of human genetic diseases and related genes that has become an authoritative source of disease phenotypes. However, disease phenotypes have been described by free text; thus, standardization of phenotypic descriptions is needed before diseases can be compared. Several disease phenotype networks have been established in OMIM using different standardization methods. Two of these networks are important for phenotypic similarity analysis: the first and most commonly used network (mimMiner) is standardized by medical subject heading, and the other network (resnikHPO) is the first to be standardized by human phenotype ontology. This paper comprehensively evaluates for the first time the accuracy of these two networks in gene prioritization based on protein–protein interactions using large-scale, leave-one-out cross-validation experiments. The results show that both networks can effectively prioritize disease-causing genes, and the approach that relates two diseases using a logistic function improves prioritization performance. Tanimoto, one of four methods for normalizing resnikHPO, generates a symmetric network and it performs similarly to mimMiner. Furthermore, an integration of these two networks outperforms either network alone in gene prioritization, indicating that these two disease networks are complementary.
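To make the point about Tanimoto normalization producing a symmetric network concrete, here is a minimal sketch. It assumes the common Tanimoto-style form sim(a, b) / (sim(a, a) + sim(b, b) - sim(a, b)); the abstract does not give the exact formula used in the paper, and the raw similarity values are invented.

```python
def tanimoto_normalize(sim):
    """Normalize a symmetric raw similarity matrix to the range [0, 1].

    Assumed form: sim[i][j] / (sim[i][i] + sim[j][j] - sim[i][j]).
    """
    n = len(sim)
    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            denom = sim[i][i] + sim[j][j] - sim[i][j]
            out[i][j] = sim[i][j] / denom if denom > 0 else 0.0
    return out

# Hypothetical raw phenotype similarities for three diseases.
raw = [
    [4.0, 1.0, 0.0],
    [1.0, 2.0, 0.5],
    [0.0, 0.5, 3.0],
]
normalized = tanimoto_normalize(raw)
```

Because the denominator is unchanged when the two diseases are swapped, the normalized matrix stays symmetric whenever the raw matrix is, and each disease has similarity 1 with itself.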
Affiliation(s)
- Jianhua Li
- Department of Biomedical Informatics, Sino-Dutch Biomedical and Information Engineering School, Northeastern University, Shenyang, Liaoning, China
- Key Laboratory of Medical Image Computing of Northeastern University, Ministry of Education, Shenyang, Liaoning, China
- Xiaoyan Lin
- Department of Biomedical Informatics, Sino-Dutch Biomedical and Information Engineering School, Northeastern University, Shenyang, Liaoning, China
- Yueyang Teng
- Department of Biomedical Imaging, Sino-Dutch Biomedical and Information Engineering School, Northeastern University, Shenyang, Liaoning, China
- Shouliang Qi
- Key Laboratory of Medical Image Computing of Northeastern University, Ministry of Education, Shenyang, Liaoning, China
- Department of Biomedical Imaging, Sino-Dutch Biomedical and Information Engineering School, Northeastern University, Shenyang, Liaoning, China
- Dayu Xiao
- Department of Biomedical Imaging, Sino-Dutch Biomedical and Information Engineering School, Northeastern University, Shenyang, Liaoning, China
- Jianying Zhang
- Department of Biomedical Informatics, Sino-Dutch Biomedical and Information Engineering School, Northeastern University, Shenyang, Liaoning, China
- Border Biomedical Research Center, Department of Biological Sciences, The University of Texas at El Paso, El Paso, Texas, United States of America
- Yan Kang
- Key Laboratory of Medical Image Computing of Northeastern University, Ministry of Education, Shenyang, Liaoning, China
- Department of Biomedical Imaging, Sino-Dutch Biomedical and Information Engineering School, Northeastern University, Shenyang, Liaoning, China
4
Linkeviciute V, Rackham OJL, Gough J, Oates ME, Fang H. Function-selective domain architecture plasticity potentials in eukaryotic genome evolution. Biochimie 2015; 119:269-77. [PMID: 25980317 PMCID: PMC4679076 DOI: 10.1016/j.biochi.2015.05.003] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 10/31/2014] [Accepted: 05/06/2015] [Indexed: 12/20/2022]
Abstract
To help evaluate how protein function impacts on genome evolution, we introduce a new concept of ‘architecture plasticity potential’ – the capacity to form distinct domain architectures – either for an individual domain or, more generally, for a set of domains grouped by shared function. We devise a scoring metric to measure the plasticity potential for these domain sets, and evaluate how function has changed over time for different species. Applying this metric to a phylogenetic tree of eukaryotic genomes, we find that the involvement of each function is not random but highly selective. For certain lineages there is a strong bias for evolution to involve domains related to certain functions. In general, eukaryotic genomes, particularly those of animals, expand complex functional activities such as signalling and regulation, but at the cost of reducing metabolic processes. We also observe differential evolution of transcriptional regulation and a unique evolutionary role of channel regulators; crucially, this is only observable in terms of the architecture plasticity potential. Our findings provide a new layer of information to understand the significance of function in eukaryotic genome evolution. A web search tool, available at http://supfam.org/Pevo, offers a wide spectrum of options for exploring functional importance in eukaryotic genome evolution.
Highlights
- A new concept to measure domain architecture plasticity potential in a genome.
- We reveal the function-selective role in eukaryotic genome evolution.
- Eukaryotic genomes expand signalling and regulations but reduce metabolism.
- We observe differential evolution between trans- and cis-acting regulations.
- We observe a unique role of channel regulators in separating eukaryotic kingdoms.
Affiliation(s)
- Viktorija Linkeviciute
- Computational Genomics Group, Department of Computer Science, University of Bristol, The Merchant Venturers Building, Bristol BS8 1UB, UK; School of Biological Sciences, University of Edinburgh, Darwin Building, The King's Buildings, Edinburgh EH9 3BF, UK
- Owen J L Rackham
- Computational Genomics Group, Department of Computer Science, University of Bristol, The Merchant Venturers Building, Bristol BS8 1UB, UK; Centre for Computational Biology, Duke-NUS Graduate Medical School, Singapore 169857, Singapore
- Julian Gough
- Computational Genomics Group, Department of Computer Science, University of Bristol, The Merchant Venturers Building, Bristol BS8 1UB, UK
- Matt E Oates
- Computational Genomics Group, Department of Computer Science, University of Bristol, The Merchant Venturers Building, Bristol BS8 1UB, UK
- Hai Fang
- Computational Genomics Group, Department of Computer Science, University of Bristol, The Merchant Venturers Building, Bristol BS8 1UB, UK; Wellcome Trust Centre for Human Genetics, University of Oxford, Roosevelt Drive, Oxford OX3 7BN, UK
5
Emerson AI, Andrews S, Ahmed I, Azis TK, Malek JA. K-core decomposition of a protein domain co-occurrence network reveals lower cancer mutation rates for interior cores. J Clin Bioinforma 2015; 5:1. [PMID: 25767694 PMCID: PMC4357223 DOI: 10.1186/s13336-015-0016-6] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Received: 11/18/2014] [Accepted: 02/18/2015] [Indexed: 11/10/2022]
Abstract
Background: Network biology currently focuses primarily on metabolic pathways, gene regulatory networks, and protein-protein interaction networks. While these approaches have yielded critical information, alternative approaches to network analysis will offer new perspectives on biological information. A little-explored area is the interactions between domains, which can be captured using domain co-occurrence networks (DCN). A DCN can be used to study the function and interaction of proteins by representing protein domains and their co-existence in genes, and by mapping cancer mutations to the individual protein domains to identify signals.
Results: The domain co-occurrence network was constructed for the human proteome based on PFAM domains in proteins. Highly connected domains in the central cores were identified using the k-core decomposition technique. Here we show that these domains are more evolutionarily conserved than the peripheral domains. The somatic mutations for ovarian, breast and prostate cancer were obtained from the TCGA database. We mapped the somatic mutations to the individual protein domains, and the local false discovery rate was used to identify significantly mutated domains in each cancer type. Significantly mutated domains were found to be enriched in cancer disease pathways. However, we found that the inner cores of the DCN did not contain any of the significantly mutated domains. We observed that the inner core protein domains are highly conserved and that these domains co-exist in large numbers with other protein domains.
Conclusion: Mutations and domain co-occurrence networks provide a framework for understanding hierarchical designs in protein function from a network perspective. This study provides evidence that a majority of protein domains in the inner core of the DCN have a lower mutation frequency and that protein domains present in the peripheral regions of the k-core contribute more heavily to the disease. These findings may contribute further to drug development.
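The k-core decomposition used to separate inner cores from peripheral domains can be sketched as a simple peeling procedure. The implementation below is a deliberately plain version (quadratic in the number of nodes), and the six-node graph with single-letter labels is a hypothetical stand-in for a domain co-occurrence network.

```python
def core_numbers(adjacency):
    """Return the core number of every node by repeatedly peeling the
    node with the smallest remaining degree."""
    degree = {node: len(neighbours) for node, neighbours in adjacency.items()}
    remaining = set(adjacency)
    core = {}
    k = 0
    while remaining:
        node = min(remaining, key=lambda n: degree[n])
        k = max(k, degree[node])  # core number never decreases while peeling
        core[node] = k
        remaining.remove(node)
        for neighbour in adjacency[node]:
            if neighbour in remaining:
                degree[neighbour] -= 1
    return core

# Hypothetical network: a 4-clique (A-D) with a two-node tail (E, F).
edges = [
    ("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("B", "D"), ("C", "D"),
    ("A", "E"), ("E", "F"),
]
adjacency = {}
for a, b in edges:
    adjacency.setdefault(a, set()).add(b)
    adjacency.setdefault(b, set()).add(a)

core = core_numbers(adjacency)
```

In this toy case the clique nodes form the innermost core and the tail nodes sit in the periphery, mirroring the inner-versus-peripheral distinction drawn in the abstract.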
Affiliation(s)
- Arnold I Emerson
- Department of Genetic Medicine, Weill Cornell Medical College, New York, NY USA; Genomic Core, Weill Cornell Medical College in Qatar, Qatar Foundation, Doha, 24144 Qatar
- Simeon Andrews
- Department of Genetic Medicine, Weill Cornell Medical College, New York, NY USA; Genomic Core, Weill Cornell Medical College in Qatar, Qatar Foundation, Doha, 24144 Qatar
- Ikhlak Ahmed
- Department of Genetic Medicine, Weill Cornell Medical College, New York, NY USA; Genomic Core, Weill Cornell Medical College in Qatar, Qatar Foundation, Doha, 24144 Qatar
- Thasni Ka Azis
- Department of Genetic Medicine, Weill Cornell Medical College, New York, NY USA; Genomic Core, Weill Cornell Medical College in Qatar, Qatar Foundation, Doha, 24144 Qatar
- Joel A Malek
- Department of Genetic Medicine, Weill Cornell Medical College, New York, NY USA; Genomic Core, Weill Cornell Medical College in Qatar, Qatar Foundation, Doha, 24144 Qatar
6
Fang H. dcGOR: an R package for analysing ontologies and protein domain annotations. PLoS Comput Biol 2014; 10:e1003929. [PMID: 25356683 PMCID: PMC4214615 DOI: 10.1371/journal.pcbi.1003929] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.6] [Received: 08/15/2014] [Accepted: 09/21/2014] [Indexed: 01/08/2023]
Abstract
I introduce an open-source R package, 'dcGOR', that makes it easy for the bioinformatics community to analyse ontologies and protein domain annotations, particularly those in the dcGO database. The dcGO database is a comprehensive resource for protein domain annotations using a panel of ontologies including Gene Ontology. Although increasing in popularity, this database needs statistical and graphical support to meet its full potential. Moreover, there are no bioinformatics tools specifically designed for domain ontology analysis. As an add-on package built in the R software environment, dcGOR offers a basic infrastructure with great flexibility and functionality. It implements new data structures to represent domains, ontologies, annotations, and all analytical outputs. For each ontology, it provides various mining facilities, including: (i) domain-based enrichment analysis and visualisation; (ii) construction of a domain (semantic similarity) network according to ontology annotations; and (iii) significance analysis for estimating a contact (statistical significance) network. To reduce runtime, most analyses support high-performance parallel computing. Taking as input a list of protein domains of interest, the package can easily carry out in-depth analyses in terms of functional, phenotypic and disease relevance, and network-level understanding. More importantly, dcGOR is designed to allow users to import and analyse their own ontologies and annotations on domains (taken from SCOP, Pfam and InterPro) and on RNAs (from Rfam). The package is freely available at CRAN for easy installation, and also at GitHub for version control. The dedicated website with reproducible demos can be found at http://supfam.org/dcGOR.
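The idea behind domain-based enrichment analysis can be shown with a one-sided hypergeometric test. This is a Python sketch of the general statistic only; it does not call or reproduce dcGOR (an R package), and the function name and the counts in the example are hypothetical.

```python
from math import comb

def hypergeom_enrichment_p(population, annotated, sample, overlap):
    """P(X >= overlap) when `sample` domains are drawn without replacement
    from `population` domains, of which `annotated` carry the ontology term."""
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total

# Hypothetical counts: 20 domains in the background, 5 annotated to a term,
# 4 domains of interest, 3 of which carry the annotation.
p_value = hypergeom_enrichment_p(population=20, annotated=5, sample=4, overlap=3)
```

A small p-value suggests that the term is over-represented among the domains of interest relative to the background; in practice such p-values would also be corrected for multiple testing across terms.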
Affiliation(s)
- Hai Fang
- Computational Genomics Group, Department of Computer Science, University of Bristol, Bristol, United Kingdom
7
Chen Y, Zhang X, Zhang GQ, Xu R. Comparative analysis of a novel disease phenotype network based on clinical manifestations. J Biomed Inform 2014; 53:113-20. [PMID: 25277758 DOI: 10.1016/j.jbi.2014.09.007] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.7] [Received: 06/16/2014] [Revised: 08/18/2014] [Accepted: 09/21/2014] [Indexed: 12/21/2022]
Abstract
Systems approaches to analyzing disease phenotype networks in combination with protein functional interaction networks have great potential in illuminating disease pathophysiological mechanisms. While many genetic networks are readily available, disease phenotype networks remain largely incomplete. In this study, we built a large-scale Disease Manifestation Network (DMN) from 50,543 highly accurate disease-manifestation semantic relationships in the Unified Medical Language System (UMLS). Our new phenotype network contains 2305 nodes and 373,527 weighted edges representing disease phenotypic similarities. We first compared DMN with networks representing genetic relationships among diseases, and demonstrated that the phenotype clustering in DMN reflects common disease genetics. We then compared DMN with mimMiner, a disease phenotype network widely used in previous gene discovery studies, which was extracted from the textual descriptions in Online Mendelian Inheritance in Man (OMIM). We demonstrated that DMN contains different knowledge from the existing phenotype data source. Finally, a case study on Marfan syndrome further showed that DMN contains useful information and can provide leads for discovering unknown disease causes. Integrating DMN in systems approaches with mimMiner and other data offers opportunities to predict novel disease genetics. We made DMN publicly available at nlp/case.edu/public/data/DMN.
Affiliation(s)
- Yang Chen
- Department of Electrical Engineering and Computer Science, Case Western Reserve University, Cleveland, OH 44106, United States; Division of Medical Informatics, School of Medicine, Case Western Reserve University, Cleveland, OH 44106, United States
- Xiang Zhang
- Department of Electrical Engineering and Computer Science, Case Western Reserve University, Cleveland, OH 44106, United States
- Guo-Qiang Zhang
- Department of Electrical Engineering and Computer Science, Case Western Reserve University, Cleveland, OH 44106, United States; Division of Medical Informatics, School of Medicine, Case Western Reserve University, Cleveland, OH 44106, United States
- Rong Xu
- Division of Medical Informatics, School of Medicine, Case Western Reserve University, Cleveland, OH 44106, United States
8
Fang H, Gough J. The 'dnet' approach promotes emerging research on cancer patient survival. Genome Med 2014; 6:64. [PMID: 25246945 PMCID: PMC4160547 DOI: 10.1186/s13073-014-0064-8] [Citation(s) in RCA: 47] [Impact Index Per Article: 4.7] [Received: 05/06/2014] [Accepted: 08/15/2014] [Indexed: 12/20/2022]
Abstract
We present the 'dnet' package and apply it to the 'TCGA' mutation and clinical data of >3,000 patients. We uncover the existence of an underlying gene network that at least partially controls cancer 'survivalness', with mutations that are significantly correlated with patient survival, yet independent of tumour origin and type. The survivalness network has natural community structure corresponding to tumour hallmarks, and contains genes that are potentially druggable in the clinic. This network has evolutionary roots in Deuterostomia identifying PTK2 and VAV1 as under-valued relative to more studied genes from that era. The 'dnet' R package is available at http://cran.r-project.org/package=dnet.
Affiliation(s)
- Hai Fang
- Computational Genomics Group, Department of Computer Science, University of Bristol, The Merchant Venturers Building, Bristol, BS8 1UB UK
- Julian Gough
- Computational Genomics Group, Department of Computer Science, University of Bristol, The Merchant Venturers Building, Bristol, BS8 1UB UK