1
|
Iuchi H, Kawasaki J, Kubo K, Fukunaga T, Hokao K, Yokoyama G, Ichinose A, Suga K, Hamada M. Bioinformatics approaches for unveiling virus-host interactions. Comput Struct Biotechnol J 2023; 21:1774-1784. [PMID: 36874163 PMCID: PMC9969756 DOI: 10.1016/j.csbj.2023.02.044] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 11/17/2022] [Revised: 02/22/2023] [Accepted: 02/22/2023] [Indexed: 03/03/2023] Open
Abstract
The coronavirus disease-2019 (COVID-19) pandemic has elucidated major limitations in the capacity of medical and research institutions to appropriately manage emerging infectious diseases. We can improve our understanding of infectious diseases by unveiling virus-host interactions through host range prediction and protein-protein interaction prediction. Although many algorithms have been developed to predict virus-host interactions, numerous issues remain to be solved, and the entire network remains veiled. In this review, we comprehensively surveyed algorithms used to predict virus-host interactions. We also discuss the current challenges, such as dataset biases toward highly pathogenic viruses, and the potential solutions. The complete prediction of virus-host interactions remains difficult; however, bioinformatics can contribute to progress in research on infectious diseases and human health.
Affiliation(s)
- Hitoshi Iuchi
  - Waseda Research Institute for Science and Engineering, Waseda University, Tokyo 169-8555, Japan
  - Computational Bio Big-Data Open Innovation Laboratory (CBBD-OIL), National Institute of Advanced Industrial Science and Technology (AIST), Tokyo 169-8555, Japan
- Junna Kawasaki
  - Faculty of Science and Engineering, Waseda University, Okubo Shinjuku-ku, Tokyo 169-8555, Japan
- Kento Kubo
  - Computational Bio Big-Data Open Innovation Laboratory (CBBD-OIL), National Institute of Advanced Industrial Science and Technology (AIST), Tokyo 169-8555, Japan
  - School of Advanced Science and Engineering, Waseda University, Okubo Shinjuku-ku, Tokyo 169-8555, Japan
- Tsukasa Fukunaga
  - Waseda Institute for Advanced Study, Waseda University, Nishi Waseda, Shinjuku-ku, Tokyo 169-0051, Japan
- Koki Hokao
  - School of Advanced Science and Engineering, Waseda University, Okubo Shinjuku-ku, Tokyo 169-8555, Japan
- Gentaro Yokoyama
  - Computational Bio Big-Data Open Innovation Laboratory (CBBD-OIL), National Institute of Advanced Industrial Science and Technology (AIST), Tokyo 169-8555, Japan
  - School of Advanced Science and Engineering, Waseda University, Okubo Shinjuku-ku, Tokyo 169-8555, Japan
- Akiko Ichinose
  - Waseda Research Institute for Science and Engineering, Waseda University, Tokyo 169-8555, Japan
- Kanta Suga
  - School of Advanced Science and Engineering, Waseda University, Okubo Shinjuku-ku, Tokyo 169-8555, Japan
- Michiaki Hamada
  - Waseda Research Institute for Science and Engineering, Waseda University, Tokyo 169-8555, Japan
  - Computational Bio Big-Data Open Innovation Laboratory (CBBD-OIL), National Institute of Advanced Industrial Science and Technology (AIST), Tokyo 169-8555, Japan
  - School of Advanced Science and Engineering, Waseda University, Okubo Shinjuku-ku, Tokyo 169-8555, Japan
  - Graduate School of Medicine, Nippon Medical School, Tokyo 113-8602, Japan
|
2
|
Ray S, Lall S, Mukhopadhyay A, Bandyopadhyay S, Schönhuth A. Deep variational graph autoencoders for novel host-directed therapy options against COVID-19. Artif Intell Med 2022; 134:102418. [PMID: 36462892 PMCID: PMC9556806 DOI: 10.1016/j.artmed.2022.102418] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/21/2021] [Revised: 03/22/2022] [Accepted: 10/02/2022] [Indexed: 12/14/2022]
Abstract
The COVID-19 pandemic continues to raise urgent questions about therapeutic options. Existing drugs that can be repurposed promise rapid implementation in practice because of their prior approval. Conceivably, there is still room for substantial improvement, because the most advanced artificial intelligence techniques for screening drug repositories have not been exploited so far. We construct a comprehensive network by combining year-long curated drug-protein and protein-protein interaction data with the most recent SARS-CoV-2 protein interaction data. We learn the structure of the resulting molecular interaction network and predict missing links using variational graph autoencoders (VGAEs), an advanced deep learning technique that has not been explored for this purpose so far. We focus on hitherto unknown links between drugs and human proteins that play key roles in the replication cycle of SARS-CoV-2. Thereby, we establish novel host-directed therapy (HDT) options whose plausibility is confirmed by realistic simulations. As a consequence, many of the predicted links are likely to be crucial for the virus to thrive and, at the same time, can be targeted with existing drugs.
Affiliation(s)
- Sumanta Ray
  - Department of Computer Science and Engineering, Aliah University, New Town, Kolkata, India
  - Health Analytics Network, PA, USA
- Snehalika Lall
  - Machine Intelligence Unit, Indian Statistical Institute, Kolkata, India
- Anirban Mukhopadhyay
  - Department of Computer Science and Engineering, University of Kalyani, Kalyani, India
|
3
|
Wang S, Wu R, Lu J, Jiang Y, Huang T, Cai YD. Protein-protein interaction networks as miners of biological discovery. Proteomics 2022; 22:e2100190. [PMID: 35567424 DOI: 10.1002/pmic.202100190] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 11/28/2021] [Revised: 03/28/2022] [Accepted: 04/29/2022] [Indexed: 11/12/2022]
Abstract
Protein-protein interactions (PPIs) form the basis of a myriad of biological pathways and mechanisms, such as the formation of protein complexes or the components of signaling cascades. Here, we reviewed experimental methods for identifying PPI pairs, including yeast two-hybrid, mass spectrometry, co-localization, and co-immunoprecipitation. Furthermore, a range of computational methods leveraging biochemical properties, evolutionary history, protein structures, and more have enabled the identification of additional PPIs. Given the wealth of known PPIs, we reviewed important network methods for constructing and analyzing networks of PPIs. These methods aid biological discovery by identifying hub genes and dynamic changes in the network, and have been widely applied in various fields of biological research. Lastly, we discussed the challenges and future directions of research utilizing the power of PPI networks.
Affiliation(s)
- Steven Wang
  - Department of Biological Sciences, Columbia University, New York, NY, USA
- Runxin Wu
  - Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
- Jiaqi Lu
  - Department of Chemistry and Biochemistry, University of Notre Dame, Notre Dame, IN, USA
- Yijia Jiang
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Tao Huang
  - Bio-Med Big Data Center, Shanghai Institute of Nutrition and Health, Chinese Academy of Sciences, Shanghai, China
- Yu-Dong Cai
  - School of Life Sciences, Shanghai University, Shanghai, China
|
4
|
Xu X, Zhou Y, Feng X, Li X, Asad M, Li D, Liao B, Li J, Cui Q, Wang E. Germline genomic patterns are associated with cancer risk, oncogenic pathways, and clinical outcomes. SCIENCE ADVANCES 2020; 6:6/48/eaba4905. [PMID: 33246949 PMCID: PMC7695479 DOI: 10.1126/sciadv.aba4905] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 12/09/2019] [Accepted: 10/15/2020] [Indexed: 06/12/2023]
Abstract
There is an ongoing debate on the importance of genetic factors in cancer development, where gene-centered cancer predisposition seems to show that only 5 to 10% of cancer cases are inheritable. By conducting a systematic analysis of the germline genomes of 9712 cancer patients representing 22 common cancer types, along with 16,670 noncancer individuals, we identified seven cancer-associated germline genomic patterns (CGGPs), which summarize the trinucleotide mutational spectra of germline genomes. A few CGGPs were consistently enriched in the germline genomes of patients whose tumors had smoking signatures or correlated with oncogenesis- and genome instability-related mutations. Furthermore, subgroups defined by the CGGPs were significantly associated with distinct oncogenic pathways, tumor histological subtypes, and prognosis in 13 common cancer types, suggesting that germline genomic patterns can inform treatment and clinical outcomes. These results provide evidence that cancer risk and clinical outcomes could be encoded in germline genomes.
Affiliation(s)
- Xue Xu
  - College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, China
  - Department of Biochemistry and Molecular Biology, Medical Genetics, and Oncology, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
  - Alberta Children's Hospital Research Institute, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
  - Arnie Charbonneau Cancer Research Institute, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
- Yuan Zhou
  - Department of Biomedical Informatics, School of Basic Medical Science, Peking University Health Science Center, Beijing, China
- Xiaowen Feng
  - Department of Biochemistry and Molecular Biology, Medical Genetics, and Oncology, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
  - Alberta Children's Hospital Research Institute, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
  - Arnie Charbonneau Cancer Research Institute, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
  - Department of Biomedical Informatics, School of Basic Medical Science, Peking University Health Science Center, Beijing, China
- Xiong Li
  - Department of Biochemistry and Molecular Biology, Medical Genetics, and Oncology, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
  - Alberta Children's Hospital Research Institute, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
  - Arnie Charbonneau Cancer Research Institute, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
  - School of Software, East China Jiaotong University, Nanchang, China
- Mohammad Asad
  - Department of Biochemistry and Molecular Biology, Medical Genetics, and Oncology, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
  - Alberta Children's Hospital Research Institute, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
  - Arnie Charbonneau Cancer Research Institute, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
- Derek Li
  - Department of Biochemistry and Molecular Biology, Medical Genetics, and Oncology, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
  - Alberta Children's Hospital Research Institute, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
  - Arnie Charbonneau Cancer Research Institute, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
- Bo Liao
  - School of Mathematics and Statistics, Hainan Normal University, Haikou, China
- Jianqiang Li
  - College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, China
- Qinghua Cui
  - Department of Biomedical Informatics, School of Basic Medical Science, Peking University Health Science Center, Beijing, China
- Edwin Wang
  - Department of Biochemistry and Molecular Biology, Medical Genetics, and Oncology, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
  - Alberta Children's Hospital Research Institute, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
  - Arnie Charbonneau Cancer Research Institute, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
|
5
|
Hou MX, Liu JX, Gao YL, Shang J, Wu SS, Yuan SS. A New Model of Identifying Differentially Expressed Genes via Weighted Network Analysis Based on Dimensionality Reduction Method. Curr Bioinform 2019. [DOI: 10.2174/1574893614666181220094235] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 12/20/2022]
Abstract
Background:
As a method for identifying Differentially Expressed Genes (DEGs), Non-Negative Matrix Factorization (NMF) has been widely praised in bioinformatics. Although NMF makes DEGs easy to identify, it provides no further information about how these DEGs are associated.
Objective:
Network analysis methods can be used to analyze correlations among genes, but they introduce data redundancy and considerable complexity in high-dimensional gene association analysis. Dimensionality reduction is worth considering in this setting.
Methods:
In this paper, we provide a new framework that combines the merits of the two: NMF is applied to select DEGs for dimensionality reduction, and then Weighted Gene Co-Expression Network Analysis (WGCNA) is introduced to cluster the DEGs into modules of similar function. The combination of NMF and WGCNA as a novel model accomplishes the analysis of DEGs for cholangiocarcinoma (CHOL).
Results:
Some hub genes among the DEGs are highlighted in the co-expression network. Candidate pathways and genes are also discovered in the module most relevant to CHOL.
Conclusion:
The experiments indicate that our framework is effective, and the work also provides some useful clues for research on CHOL.
Affiliation(s)
- Mi-Xiao Hou
  - School of Information Science and Engineering, Qufu Normal University, Rizhao, 276826, China
- Jin-Xing Liu
  - School of Information Science and Engineering, Qufu Normal University, Rizhao, 276826, China
- Ying-Lian Gao
  - Qufu Normal University Library, Qufu Normal University, Rizhao, 276826, China
- Junliang Shang
  - School of Information Science and Engineering, Qufu Normal University, Rizhao, 276826, China
- Sha-Sha Wu
  - School of Information Science and Engineering, Qufu Normal University, Rizhao, 276826, China
- Sha-Sha Yuan
  - School of Information Science and Engineering, Qufu Normal University, Rizhao, 276826, China
|
6
|
Chen J, Zhang S. Matrix Integrative Analysis (MIA) of Multiple Genomic Data for Modular Patterns. Front Genet 2018; 9:194. [PMID: 29910825 PMCID: PMC5992392 DOI: 10.3389/fgene.2018.00194] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 02/12/2018] [Accepted: 05/11/2018] [Indexed: 11/13/2022] Open
Abstract
The increasing availability of high-throughput biological data, especially multi-dimensional genomic data across the same samples, has created an urgent need for modular and integrative analysis tools that can reveal the relationships among different layers of cellular activities. To this end, we present a MATLAB package, Matrix Integration Analysis (MIA), implementing and extending four published methods, designed based on two classical techniques, non-negative matrix factorization (NMF), and partial least squares (PLS). This package can integrate diverse types of genomic data (e.g., copy number variation, DNA methylation, gene expression, microRNA expression profiles, and/or gene network data) to identify the underlying modular patterns by each method. Particularly, we demonstrate the differences between these two classes of methods, which give users some suggestions about how to select a suitable method in the MIA package. MIA is a flexible tool which could handle a wide range of biological problems and data types. Besides, we also provide an executable version for users without a MATLAB license.
Affiliation(s)
- Jinyu Chen
  - NCMIS, CEMS, RCSDS, Academy of Mathematics and Systems Science, CAS, Beijing, China
  - School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing, China
- Shihua Zhang
  - NCMIS, CEMS, RCSDS, Academy of Mathematics and Systems Science, CAS, Beijing, China
  - School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing, China
  - Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming, China
|
7
|
Predicting Interactions between Virus and Host Proteins Using Repeat Patterns and Composition of Amino Acids. JOURNAL OF HEALTHCARE ENGINEERING 2018; 2018:1391265. [PMID: 29854357 PMCID: PMC5966669 DOI: 10.1155/2018/1391265] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.4] [Received: 11/14/2017] [Revised: 03/27/2018] [Accepted: 04/17/2018] [Indexed: 11/29/2022]
Abstract
Previous methods for predicting protein-protein interactions (PPIs) were mainly focused on PPIs within a single species, but PPIs across different species have recently emerged as an important issue in some areas such as viral infection. The primary focus of this study is to predict PPIs between virus and its targeted host, which are involved in viral infection. We developed a general method that predicts interactions between virus and host proteins using the repeat patterns and composition of amino acids. In independent testing of the method with PPIs of new viruses and hosts, it showed a high performance comparable to the best performance of other methods for single virus-host PPIs. In comparison of our method with others using same datasets, our method outperformed the others. The repeat patterns and composition of amino acids are simple, yet powerful features for predicting virus-host PPIs. The method developed in this study will help in finding new virus-host PPIs for which little information is available.
|
8
|
Ray S, Maulik U. Discovering Perturbation of Modular Structure in HIV Progression by Integrating Multiple Data Sources Through Non-Negative Matrix Factorization. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2018; 15:869-877. [PMID: 28029629 DOI: 10.1109/tcbb.2016.2642184] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/06/2023]
Abstract
Detecting perturbation in modular structure during HIV-1 disease progression is an important step toward understanding the stage-specific infection pattern of HIV-1 in human cells. In this article, we propose a novel methodology that integrates multiple sources of biological information to identify such disruption in human gene modules during different stages of HIV-1 infection. We integrate three types of biological information (gene expression, protein-protein interaction, and gene ontology) into a single gene meta-module through non-negative matrix factorization (NMF). Because the identified meta-modules inherit this information, detecting their perturbation reflects changes in expression pattern, in PPI structure, and in the functional similarity of genes as the infection progresses. To integrate modules from different data sources into strong meta-modules, NMF-based clustering is used. Perturbation in meta-modular structure is identified by investigating topological and intramodular properties and by ranking the meta-modules with a rank aggregation algorithm. We have also analyzed the preservation structure of significant GO terms in which the human proteins of the meta-modules participate. Moreover, we have performed an analysis to show the change in the coregulation pattern of identified transcription factors (TFs) over the HIV progression stages.
|
9
|
Jalili M, Gebhardt T, Wolkenhauer O, Salehzadeh-Yazdi A. Unveiling network-based functional features through integration of gene expression into protein networks. Biochim Biophys Acta Mol Basis Dis 2018; 1864:2349-2359. [PMID: 29466699 DOI: 10.1016/j.bbadis.2018.02.010] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Received: 11/01/2017] [Revised: 01/31/2018] [Accepted: 02/13/2018] [Indexed: 02/02/2023]
Abstract
Decoding health and disease phenotypes is one of the fundamental objectives in biomedicine. Whereas high-throughput omics approaches are available, it is evident that any single omics approach might not be adequate to capture the complexity of phenotypes. Therefore, integrated multi-omics approaches have been used to unravel genotype-phenotype relationships such as global regulatory mechanisms and complex metabolic networks in different eukaryotic organisms. Some of the progress and challenges associated with integrated omics studies have been reviewed previously in comprehensive studies. In this work, we highlight and review the progress, challenges, and advantages associated with emerging approaches that integrate gene expression and protein-protein interaction networks to unravel network-based functional features. This includes identifying disease-related genes, prioritizing genes, clustering protein interactions, developing modules, and extracting active subnetworks and static or dynamic/temporal protein complexes. We also discuss how these approaches contribute to our understanding of the biology of complex traits and diseases. This article is part of a Special Issue entitled: Cardiac adaptations to obesity, diabetes and insulin resistance, edited by Professors Jan F.C. Glatz, Jason R.B. Dyck and Christine Des Rosiers.
Affiliation(s)
- Mahdi Jalili
  - Hematology, Oncology and SCT Research Center, Tehran University of Medical Sciences, Tehran, Iran
  - Hematologic Malignancies Research Center, Tehran University of Medical Sciences, Tehran, Iran
- Tom Gebhardt
  - Department of Systems Biology and Bioinformatics, University of Rostock, 18051 Rostock, Germany
- Olaf Wolkenhauer
  - Department of Systems Biology and Bioinformatics, University of Rostock, 18051 Rostock, Germany
- Ali Salehzadeh-Yazdi
  - Department of Systems Biology and Bioinformatics, University of Rostock, 18051 Rostock, Germany
|
10
|
Durmuş S, Ülgen KÖ. Comparative interactomics for virus-human protein-protein interactions: DNA viruses versus RNA viruses. FEBS Open Bio 2017; 7:96-107. [PMID: 28097092 PMCID: PMC5221455 DOI: 10.1002/2211-5463.12167] [Citation(s) in RCA: 36] [Impact Index Per Article: 4.5] [Received: 08/25/2016] [Revised: 11/06/2016] [Accepted: 11/16/2016] [Indexed: 01/01/2023] Open
Abstract
Viruses are obligatory intracellular pathogens and completely depend on their hosts for survival and reproduction. The strategies adopted by viruses to exploit host cell processes and to evade host immune systems during infections may differ largely with the type of the viral genetic material. An improved understanding of these viral infection mechanisms is only possible through a better understanding of the pathogen-host interactions (PHIs) that enable viruses to enter into the host cells and manipulate the cellular mechanisms to their own advantage. Experimentally-verified protein-protein interaction (PPI) data of pathogen-host systems only became available at large scale within the last decade. In this study, we comparatively analyzed the current PHI networks belonging to DNA and RNA viruses and their human host, to get insights into the infection strategies used by these viral groups. We investigated the functional properties of human proteins in the PHI networks, to observe and compare the attack strategies of DNA and RNA viruses. We observed that DNA viruses are able to attack both human cellular and metabolic processes simultaneously during infections. On the other hand, RNA viruses preferentially interact with human proteins functioning in specific cellular processes as well as in intracellular transport and localization within the cell. Observing virus-targeted human proteins, we propose heterogeneous nuclear ribonucleoproteins and transporter proteins as potential antiviral therapeutic targets. The observed common and specific infection mechanisms in terms of viral strategies to attack human proteins may provide crucial information for further design of broad and specific next-generation antiviral therapeutics.
Affiliation(s)
- Saliha Durmuş
  - Computational Systems Biology Group, Department of Bioengineering, Gebze Technical University, Kocaeli, Turkey
- Kutlu Ö. Ülgen
  - Department of Chemical Engineering, Boğaziçi University, İstanbul, Turkey
|