1
Li J, Yin K, Hou L, Zhang Y, Lu H, Ma C, Xing M. Polystyrene microplastics mediate inflammatory responses in the chicken thymus by Nrf2/NF-κB pathway and trigger autophagy and apoptosis. Environ Toxicol Pharmacol 2023; 100:104136. [PMID: 37127111 DOI: 10.1016/j.etap.2023.104136] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 02/12/2023] [Revised: 03/15/2023] [Accepted: 04/28/2023] [Indexed: 05/03/2023]
Abstract
Microplastics (MPs) are now a prominent environmental contaminant. However, researchers have paid little attention to their effects on immune organs such as the thymus. Here, we exposed chickens to a concentration gradient of polystyrene microplastics (PS-MPs) and observed a decrease in the thymus index. HE staining showed cellular infiltration in the thymus. Assay kits corroborated that PS-MPs induced oxidative stress in the thymus: MDA levels increased, antioxidants such as SOD, CAT, and GSH were downregulated, and total antioxidant capacity was significantly reduced. Western blotting and qRT-PCR results showed that the Nrf2/NF-κB, Bcl-2/Bax, and AKT signaling pathways were activated in the thymus after exposure to PS-MPs. This increased the expression of downstream factors such as IL-1β, caspase-3, and Beclin1, triggering thymus inflammation, apoptosis, and autophagy. This study provides new insights into the field of microplastic immunotoxicity and highlights potential environmental hazards in poultry farming.
Affiliation(s)
- Junbo Li
  - College of Wildlife and Protected Area, Northeast Forestry University, Harbin 150040, Heilongjiang, PR China
- Kai Yin
  - College of Wildlife and Protected Area, Northeast Forestry University, Harbin 150040, Heilongjiang, PR China
- Lulu Hou
  - College of Wildlife and Protected Area, Northeast Forestry University, Harbin 150040, Heilongjiang, PR China
- Yue Zhang
  - College of Wildlife and Protected Area, Northeast Forestry University, Harbin 150040, Heilongjiang, PR China
- Hongmin Lu
  - College of Wildlife and Protected Area, Northeast Forestry University, Harbin 150040, Heilongjiang, PR China
- Chengxue Ma
  - College of Wildlife and Protected Area, Northeast Forestry University, Harbin 150040, Heilongjiang, PR China
- Mingwei Xing
  - College of Wildlife and Protected Area, Northeast Forestry University, Harbin 150040, Heilongjiang, PR China
2
Vora DS, Kalakoti Y, Sundar D. Computational Methods and Deep Learning for Elucidating Protein Interaction Networks. Methods Mol Biol 2023; 2553:285-323. [PMID: 36227550 DOI: 10.1007/978-1-0716-2617-7_15] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/16/2023]
Abstract
Protein interactions play a critical role in all biological processes, but experimental identification of protein interactions is a time- and resource-intensive process. The advances in next-generation sequencing and multi-omics technologies have greatly benefited large-scale predictions of protein interactions using machine learning methods. A wide range of tools have been developed to predict protein-protein, protein-nucleic acid, and protein-drug interactions. Here, we discuss the applications, methods, and challenges faced when employing the various prediction methods. We also briefly describe ways to overcome the challenges and prospective future developments in the field of protein interaction biology.
Affiliation(s)
- Dhvani Sandip Vora
  - Department of Biochemical Engineering and Biotechnology, Indian Institute of Technology Delhi, Hauz Khas, New Delhi, India
- Yogesh Kalakoti
  - Department of Biochemical Engineering and Biotechnology, Indian Institute of Technology Delhi, Hauz Khas, New Delhi, India
- Durai Sundar
  - Department of Biochemical Engineering and Biotechnology, Indian Institute of Technology Delhi, Hauz Khas, New Delhi, India
  - School of Artificial Intelligence, Indian Institute of Technology Delhi, Hauz Khas, New Delhi, India
3
Iuchi H, Kawasaki J, Kubo K, Fukunaga T, Hokao K, Yokoyama G, Ichinose A, Suga K, Hamada M. Bioinformatics approaches for unveiling virus-host interactions. Comput Struct Biotechnol J 2023; 21:1774-1784. [PMID: 36874163 PMCID: PMC9969756 DOI: 10.1016/j.csbj.2023.02.044] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 11/17/2022] [Revised: 02/22/2023] [Accepted: 02/22/2023] [Indexed: 03/03/2023] Open
Abstract
The coronavirus disease-2019 (COVID-19) pandemic has elucidated major limitations in the capacity of medical and research institutions to appropriately manage emerging infectious diseases. We can improve our understanding of infectious diseases by unveiling virus-host interactions through host range prediction and protein-protein interaction prediction. Although many algorithms have been developed to predict virus-host interactions, numerous issues remain to be solved, and the entire network remains veiled. In this review, we comprehensively surveyed algorithms used to predict virus-host interactions. We also discuss the current challenges, such as dataset biases toward highly pathogenic viruses, and the potential solutions. The complete prediction of virus-host interactions remains difficult; however, bioinformatics can contribute to progress in research on infectious diseases and human health.
Affiliation(s)
- Hitoshi Iuchi
  - Waseda Research Institute for Science and Engineering, Waseda University, Tokyo 169-8555, Japan
  - Computational Bio Big-Data Open Innovation Laboratory (CBBD-OIL), National Institute of Advanced Industrial Science and Technology (AIST), Tokyo 169-8555, Japan
- Junna Kawasaki
  - Faculty of Science and Engineering, Waseda University, Okubo Shinjuku-ku, Tokyo 169-8555, Japan
- Kento Kubo
  - Computational Bio Big-Data Open Innovation Laboratory (CBBD-OIL), National Institute of Advanced Industrial Science and Technology (AIST), Tokyo 169-8555, Japan
  - School of Advanced Science and Engineering, Waseda University, Okubo Shinjuku-ku, Tokyo 169-8555, Japan
- Tsukasa Fukunaga
  - Waseda Institute for Advanced Study, Waseda University, Nishi Waseda, Shinjuku-ku, Tokyo 169-0051, Japan
- Koki Hokao
  - School of Advanced Science and Engineering, Waseda University, Okubo Shinjuku-ku, Tokyo 169-8555, Japan
- Gentaro Yokoyama
  - Computational Bio Big-Data Open Innovation Laboratory (CBBD-OIL), National Institute of Advanced Industrial Science and Technology (AIST), Tokyo 169-8555, Japan
  - School of Advanced Science and Engineering, Waseda University, Okubo Shinjuku-ku, Tokyo 169-8555, Japan
- Akiko Ichinose
  - Waseda Research Institute for Science and Engineering, Waseda University, Tokyo 169-8555, Japan
- Kanta Suga
  - School of Advanced Science and Engineering, Waseda University, Okubo Shinjuku-ku, Tokyo 169-8555, Japan
- Michiaki Hamada
  - Waseda Research Institute for Science and Engineering, Waseda University, Tokyo 169-8555, Japan
  - Computational Bio Big-Data Open Innovation Laboratory (CBBD-OIL), National Institute of Advanced Industrial Science and Technology (AIST), Tokyo 169-8555, Japan
  - School of Advanced Science and Engineering, Waseda University, Okubo Shinjuku-ku, Tokyo 169-8555, Japan
  - Graduate School of Medicine, Nippon Medical School, Tokyo 113-8602, Japan
4
Khan T, Raza S. Exploration of Computational Aids for Effective Drug Designing and Management of Viral Diseases: A Comprehensive Review. Curr Top Med Chem 2023; 23:1640-1663. [PMID: 36725827 DOI: 10.2174/1568026623666230201144522] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/21/2022] [Revised: 11/14/2022] [Accepted: 12/19/2022] [Indexed: 02/03/2023]
Abstract
Background
Microbial diseases, specifically those originating from viruses, are the major cause of human mortality all over the world. The current COVID-19 pandemic is a case in point: the dynamics of viral-human interactions are still not completely understood, making treatment a matter of trial and error. Scientists have been struggling for over a year to devise a strategy to contain the pandemic, which brings to light the lack of understanding of how the virus grows and multiplies in the human body.
Methods
This paper presents the authors' perspective on the applicability of computational tools for deep learning and understanding of host-microbe interaction, disease progression and management, drug resistance, and immune modulation through in silico methodologies that can aid effective and selective drug development. The paper summarizes advances from the last five years. Studies published and indexed in leading databases have been included in the review.
Results
Computational systems biology works at the interface of biology and mathematics and aims to unravel the complex mechanisms between biological systems and inter- and intra-species dynamics, using computational tools and high-throughput technologies built on algorithms, networks, and complex connections to simulate cellular biological processes.
Conclusion
Computational strategies and modelling integrate and prioritize microbial-host interactions and may predict the conditions in which the fine-tuning attenuates. These microbial-host interactions and working mechanisms are important for effective drug design and for fine-tuning therapeutic interventions.
Affiliation(s)
- Tahmeena Khan
  - Department of Chemistry, Integral University, Lucknow, 226026, U.P., India
- Saman Raza
  - Department of Chemistry, Isabella Thoburn College, Lucknow, 226007, U.P., India
5
Wang L, Tan H, Medina-Puche L, Wu M, Garnelo Gomez B, Gao M, Shi C, Jimenez-Gongora T, Fan P, Ding X, Zhang D, Ding Y, Rosas-Díaz T, Liu Y, Aguilar E, Fu X, Lozano-Durán R. Combinatorial interactions between viral proteins expand the potential functional landscape of the tomato yellow leaf curl virus proteome. PLoS Pathog 2022; 18:e1010909. [PMID: 36256684 PMCID: PMC9633003 DOI: 10.1371/journal.ppat.1010909] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 04/18/2022] [Revised: 11/03/2022] [Accepted: 09/30/2022] [Indexed: 11/06/2022] Open
Abstract
Viruses manipulate the cells they infect in order to replicate and spread. Due to strict size restrictions, viral genomes have reduced genetic space; how the action of the limited number of viral proteins results in the cell reprogramming observed during the infection is a long-standing question. Here, we explore the hypothesis that combinatorial interactions may expand the functional landscape of the viral proteome. We show that the proteins encoded by a plant-infecting DNA virus, the geminivirus tomato yellow leaf curl virus (TYLCV), physically associate with one another in an intricate network, as detected by a number of protein-protein interaction techniques. Importantly, our results indicate that intra-viral protein-protein interactions can modify the subcellular localization of the proteins involved. Using one particular pairwise interaction, that between the virus-encoded C2 and CP proteins, as proof-of-concept, we demonstrate that the combination of viral proteins leads to novel transcriptional effects on the host cell. Taken together, our results underscore the importance of studying viral protein function in the context of the infection. We propose a model in which viral proteins might have evolved to extensively interact with other elements within the viral proteome, enlarging the potential functional landscape available to the pathogen.
Viruses are obligate intracellular parasites that depend on the molecular machinery of their host cell to complete their life cycle. For this purpose, viruses co-opt host processes, modulating or redirecting them. Most viruses have small genomes, and hence limited coding capacity. During the viral invasion, virus-encoded proteins will be produced in large amounts and coexist in the infected cell, which enables physical or functional interactions among viral proteins, potentially expanding the virus-host functional interface by increasing the number of potential targets in the host cell and/or synergistically modulating the cellular environment. Examples of interactions between viral proteins have been recently documented for both animal and plant viruses; however, the hypothesis that viral proteins might have a combinatorial effect, which would lead to the acquisition of novel functions, lacks systematic experimental validation. Here, we use the geminivirus tomato yellow leaf curl virus (TYLCV), a plant-infecting virus with reduced proteome and causing devastating diseases in crops, to test the idea that combinatorial interactions between viral proteins exist and might underlie an expansion of the functional landscape of the viral proteome. Our results indicate that viral proteins prevalently interact with one another in the context of the infection, which can result in the acquisition of novel functions.
Affiliation(s)
- Liping Wang
  - Shanghai Center for Plant Stress Biology, Center for Excellence in Molecular Plant Sciences, Chinese Academy of Sciences, Shanghai, China
  - University of the Chinese Academy of Sciences, Beijing, China
- Huang Tan
  - Shanghai Center for Plant Stress Biology, Center for Excellence in Molecular Plant Sciences, Chinese Academy of Sciences, Shanghai, China
  - University of the Chinese Academy of Sciences, Beijing, China
  - Department of Plant Biochemistry, Center for Plant Molecular Biology (ZMBP), Eberhard Karls University, Tübingen, Germany
- Laura Medina-Puche
  - Shanghai Center for Plant Stress Biology, Center for Excellence in Molecular Plant Sciences, Chinese Academy of Sciences, Shanghai, China
  - Department of Plant Biochemistry, Center for Plant Molecular Biology (ZMBP), Eberhard Karls University, Tübingen, Germany
- Mengshi Wu
  - Shanghai Center for Plant Stress Biology, Center for Excellence in Molecular Plant Sciences, Chinese Academy of Sciences, Shanghai, China
  - University of the Chinese Academy of Sciences, Beijing, China
- Borja Garnelo Gomez
  - Shanghai Center for Plant Stress Biology, Center for Excellence in Molecular Plant Sciences, Chinese Academy of Sciences, Shanghai, China
- Man Gao
  - Shanghai Center for Plant Stress Biology, Center for Excellence in Molecular Plant Sciences, Chinese Academy of Sciences, Shanghai, China
  - University of the Chinese Academy of Sciences, Beijing, China
- Chaonan Shi
  - Shanghai Center for Plant Stress Biology, Center for Excellence in Molecular Plant Sciences, Chinese Academy of Sciences, Shanghai, China
  - Department of Plant Biochemistry, Center for Plant Molecular Biology (ZMBP), Eberhard Karls University, Tübingen, Germany
- Tamara Jimenez-Gongora
  - Shanghai Center for Plant Stress Biology, Center for Excellence in Molecular Plant Sciences, Chinese Academy of Sciences, Shanghai, China
  - University of the Chinese Academy of Sciences, Beijing, China
- Pengfei Fan
  - Shanghai Center for Plant Stress Biology, Center for Excellence in Molecular Plant Sciences, Chinese Academy of Sciences, Shanghai, China
  - University of the Chinese Academy of Sciences, Beijing, China
- Xue Ding
  - Shanghai Center for Plant Stress Biology, Center for Excellence in Molecular Plant Sciences, Chinese Academy of Sciences, Shanghai, China
  - University of the Chinese Academy of Sciences, Beijing, China
- Dan Zhang
  - Shanghai Center for Plant Stress Biology, Center for Excellence in Molecular Plant Sciences, Chinese Academy of Sciences, Shanghai, China
  - University of the Chinese Academy of Sciences, Beijing, China
- Yi Ding
  - Shanghai Center for Plant Stress Biology, Center for Excellence in Molecular Plant Sciences, Chinese Academy of Sciences, Shanghai, China
  - University of the Chinese Academy of Sciences, Beijing, China
- Tábata Rosas-Díaz
  - Shanghai Center for Plant Stress Biology, Center for Excellence in Molecular Plant Sciences, Chinese Academy of Sciences, Shanghai, China
- Yujing Liu
  - Shanghai Center for Plant Stress Biology, Center for Excellence in Molecular Plant Sciences, Chinese Academy of Sciences, Shanghai, China
- Emmanuel Aguilar
  - Shanghai Center for Plant Stress Biology, Center for Excellence in Molecular Plant Sciences, Chinese Academy of Sciences, Shanghai, China
  - Instituto de Hortofruticultura Subtropical y Mediterránea “La Mayora” (IHSM-UMA-CSIC), Area de Genética, Facultad de Ciencias, Universidad de Málaga, Campus de Teatinos s/n, Málaga, Spain
- Xing Fu
  - Shanghai Center for Plant Stress Biology, Center for Excellence in Molecular Plant Sciences, Chinese Academy of Sciences, Shanghai, China
- Rosa Lozano-Durán
  - Shanghai Center for Plant Stress Biology, Center for Excellence in Molecular Plant Sciences, Chinese Academy of Sciences, Shanghai, China
  - Department of Plant Biochemistry, Center for Plant Molecular Biology (ZMBP), Eberhard Karls University, Tübingen, Germany
6
Fang Y, Yang Y, Liu C. New feature extraction from phylogenetic profiles improved the performance of pathogen-host interactions. Front Cell Infect Microbiol 2022; 12:931072. [PMID: 35982784 PMCID: PMC9378789 DOI: 10.3389/fcimb.2022.931072] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/28/2022] [Accepted: 07/11/2022] [Indexed: 11/13/2022] Open
Abstract
Motivation
Understanding pathogen-host interactions (PHIs) is essential but challenging, because it can reveal the mechanisms of molecular interaction between different organisms. Experimental exploration of PHIs is time-consuming and labor-intensive, and computational approaches play a crucial role in discovering unknown PHIs between different organisms. Although many machine learning (ML)–based methods have been proposed to predict PHIs, these methods all rely on structure-based information extracted from the sequence. The selection of feature values is critical to improving the performance of ML-based PHI prediction.
Results
This work proposes a new method to extract features from phylogenetic profiles as evolutionary information for predicting PHIs. Our approach performs better than structure-based and ML-based PHI prediction methods. The five extraction models proposed by our approach, combined with structure-based information, significantly improved PHI prediction performance, suggesting that combining phylogenetic profile features with structure-based methods could be applied to exploring PHIs and discovering unknown biological relationships.
Availability and implementation
The KPP method is implemented in the Java language and is available at https://github.com/yangfangs/KPP.
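For readers unfamiliar with phylogenetic profiles, the following is a minimal sketch of the general idea only; it is not the KPP method, whose details the abstract does not give. It assumes a profile is a binary vector recording whether a homolog of a protein is present in each reference genome. The genome names, protein names, the Jaccard similarity, and the way the pair feature vector is assembled are all hypothetical choices made for illustration.

```python
# Illustrative sketch only: a generic phylogenetic-profile comparison,
# NOT the KPP implementation referred to in the abstract.

# Hypothetical reference genomes (one column per genome in every profile).
GENOMES = ["genome_A", "genome_B", "genome_C", "genome_D", "genome_E"]

# Hypothetical profiles: 1 = a homolog is present in that genome, 0 = absent.
PROFILES = {
    "host_protein_1": [1, 1, 0, 1, 0],
    "pathogen_protein_1": [1, 1, 0, 1, 1],
    "pathogen_protein_2": [0, 0, 1, 0, 0],
}


def jaccard(p, q):
    """Jaccard similarity of two binary profiles of equal length."""
    if len(p) != len(q):
        raise ValueError("profiles must cover the same genomes")
    both = sum(1 for a, b in zip(p, q) if a and b)
    either = sum(1 for a, b in zip(p, q) if a or b)
    return both / either if either else 0.0


def pair_features(host, pathogen):
    """Build one feature vector for a host-pathogen pair:
    both profiles concatenated, followed by their Jaccard similarity."""
    p, q = PROFILES[host], PROFILES[pathogen]
    return p + q + [jaccard(p, q)]


if __name__ == "__main__":
    for pathogen in ("pathogen_protein_1", "pathogen_protein_2"):
        sim = jaccard(PROFILES["host_protein_1"], PROFILES[pathogen])
        print(pathogen, round(sim, 2))
```

A vector like the one returned by `pair_features` could then be passed to any classifier alongside sequence-derived features; which features and classifier the authors actually use is documented in their repository rather than here.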
Affiliation(s)
- Yang Fang
  - Key Laboratory of Bio-Resources and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu, China
  - Department of Laboratory Medicine, Third Affiliated Hospital of Zhengzhou University, Zhengzhou, China
- Yi Yang
  - Key Laboratory of Bio-Resources and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu, China
  - *Correspondence: Chengcheng Liu; Yi Yang
- Chengcheng Liu
  - State Key Laboratory of Oral Diseases, Department of Periodontics, National Clinical Research Center for Oral Diseases, West China School & Hospital of Stomatology, Sichuan University, Chengdu, China
  - *Correspondence: Chengcheng Liu; Yi Yang
7
Fang Y, Yang Y, Liu C. Evolutionary Relationships Between Dysregulated Genes in Oral Squamous Cell Carcinoma and Oral Microbiota. Front Cell Infect Microbiol 2022; 12:931011. [PMID: 35909962 PMCID: PMC9328420 DOI: 10.3389/fcimb.2022.931011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/28/2022] [Accepted: 06/20/2022] [Indexed: 11/30/2022] Open
Abstract
Oral squamous cell carcinoma (OSCC) is one of the most prevalent cancers in the world. Changes in the composition and abundance of oral microbiota are associated with the development and metastasis of OSCC. To elucidate the exact roles of the oral microbiota in OSCC, it is essential to reveal the evolutionary relationships between the dysregulated genes in OSCC progression and the oral microbiota. Thus, we interrogated the microarray and high-throughput sequencing datasets to obtain the transcriptional landscape of OSCC. After identifying differentially expressed genes (DEGs) with three different methods, pathway and functional analyses were also performed. A total of 127 genes were identified as common DEGs, which were enriched in extracellular matrix organization and cytokine-related pathways. Furthermore, we established a predictive pipeline for detecting the coevolution of dysregulated host genes and microbial proteomes based on the homology method, and this pipeline was employed to analyze the evolutionary relations between the seven most dysregulated genes (MMP13, MMP7, MMP1, CXCL13, CRISPO3, CYP3A4, and CRNN) and microbiota obtained from the eHOMD database. We found that cytochrome P450 3A4 (CYP3A4), a member of the cytochrome P450 family of oxidizing enzymes, was associated with 45 microbes from the eHOMD database and involved in the oral habitat of Comamonas testosteroni and Arachnia rubra. The peptidase M10 family of matrix metalloproteinases (MMP13, MMP7, and MMP1) was associated with Lacticaseibacillus paracasei, Lacticaseibacillus rhamnosus, Streptococcus salivarius, Tannerella sp._HMT_286, and Streptococcus infantis in the oral cavity. Overall, this study revealed the dysregulated genes in OSCC and explored their evolutionary relationship with oral microbiota, which provides new insight for exploring the microbiota–host interactions in diseases.
Affiliation(s)
- Yang Fang
  - Department of Laboratory Medicine, Third Affiliated Hospital of Zhengzhou University, Zhengzhou, China
- Yi Yang
  - State Key Laboratory of Oral Diseases, National Clinical Research Center for Oral Diseases, Department of Periodontics, West China School and Hospital of Stomatology, Sichuan University, Chengdu, China
- Chengcheng Liu
  - State Key Laboratory of Oral Diseases, National Clinical Research Center for Oral Diseases, Department of Periodontics, West China School and Hospital of Stomatology, Sichuan University, Chengdu, China
  - *Correspondence: Chengcheng Liu
8
Saha S, Halder AK, Bandyopadhyay SS, Chatterjee P, Nasipuri M, Basu S. Computational modeling of human-nCoV protein-protein interaction network. Methods 2022; 203:488-497. [PMID: 34902553 PMCID: PMC8662836 DOI: 10.1016/j.ymeth.2021.12.003] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/27/2021] [Revised: 11/30/2021] [Accepted: 12/06/2021] [Indexed: 01/25/2023] Open
Abstract
Novel coronavirus (SARS-CoV-2) replicates its genome in the host cell by interacting with host proteins. The identification of virus-host protein-protein interactions could therefore be beneficial in understanding the disease transmission behavior of the virus as well as in identifying potential COVID-19 drugs. The International Committee on Taxonomy of Viruses (ICTV) has declared that nCoV is highly genetically similar to the SARS-CoV of the 2003 epidemic (∼89% similarity). With this hypothesis, the present work focuses on developing a computational model for the nCoV-Human protein interaction network, using the experimentally validated SARS-CoV-Human protein interactions. Initially, level-1 and level-2 human spreader proteins are identified in the SARS-CoV-Human interaction network using the Susceptible-Infected-Susceptible (SIS) model. These proteins are considered potential human targets for nCoV bait proteins. A gene-ontology-based fuzzy affinity function has been used to construct the nCoV-Human protein interaction network at a ∼99.98% specificity threshold. This also identifies 37 level-1 human spreaders for COVID-19 in the human protein-interaction network. Subsequently, 2474 level-2 human spreaders are identified using the SIS model. The derived host-pathogen interaction network is finally validated using six potential FDA-listed drugs for COVID-19, with significant overlap between the known drug target proteins and the identified spreader proteins.
Affiliation(s)
- Sovan Saha
  - Department of Computer Science & Engineering, Institute of Engineering & Management, Salt Lake Electronics Complex, Kolkata 700091, West Bengal, India
- Anup Kumar Halder
  - Department of Computer Science & Engineering, University of Engineering & Management, Kolkata 700156, West Bengal, India
- Soumyendu Sekhar Bandyopadhyay
  - Department of Computer Science & Engineering, School of Engineering and Technology, Adamas University, Kolkata 700126, West Bengal, India
  - Department of Computer Science & Engineering, Jadavpur University, Jadavpur, Kolkata, West Bengal 700032, India
- Piyali Chatterjee
  - Department of Computer Science & Engineering, Netaji Subhash Engineering College, Garia, Kolkata, West Bengal 700152, India
- Mita Nasipuri
  - Department of Computer Science & Engineering, Jadavpur University, Jadavpur, Kolkata, West Bengal 700032, India
- Subhadip Basu
  - Department of Computer Science & Engineering, Jadavpur University, Jadavpur, Kolkata, West Bengal 700032, India
9
Xia S, Xia Y, Xiang C, Wang H, Wang C, He J, Shi G, Gu L. A virus–target host proteins recognition method based on integrated complexes data and seed extension. BMC Bioinformatics 2022; 23:256. [PMID: 35764916 PMCID: PMC9238269 DOI: 10.1186/s12859-022-04792-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/18/2021] [Accepted: 06/14/2022] [Indexed: 11/10/2022] Open
Abstract
Background
Target drugs play an important role in the clinical treatment of virus diseases. Virus-encoded proteins are widely used as targets for target drugs. However, they cannot cope with the drug resistance caused by a mutated virus and ignore the importance of host proteins for virus replication. Some methods use interactions between viruses and their host proteins to predict potential virus–target host proteins, which are less susceptible to mutated viruses. However, these methods only consider the network topology between the virus and the host proteins, ignoring the influences of protein complexes. Therefore, we introduce protein complexes that are less susceptible to drug resistance of mutated viruses, which helps recognize the unknown virus–target host proteins and reduce the cost of disease treatment.
Results
Since protein complexes contain virus–target host proteins, it is reasonable to predict virus–target human proteins from the perspective of the protein complexes. We propose a coverage clustering-core-subsidiary protein complex recognition method named CCA-SE that integrates the known virus–target host proteins, the human protein–protein interaction network, and the known human protein complexes. The proposed method aims to obtain the potential unknown virus–target human host proteins. We list part of the targets after proving our results effectively in enrichment experiments.
Conclusions
Our proposed CCA-SE method consists of two parts: one is CCA, which is to recognize protein complexes, and the other is SE, which is to select seed nodes as the core of protein complexes by using seed expansion. The experimental results validate that CCA-SE achieves efficient recognition of the virus–target host proteins.
10
Yang X, Yang S, Ren P, Wuchty S, Zhang Z. Deep Learning-Powered Prediction of Human-Virus Protein-Protein Interactions. Front Microbiol 2022; 13:842976. [PMID: 35495666 PMCID: PMC9051481 DOI: 10.3389/fmicb.2022.842976] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/24/2021] [Accepted: 03/25/2022] [Indexed: 11/13/2022] Open
Abstract
Identifying human-virus protein-protein interactions (PPIs) is an essential step for understanding viral infection mechanisms and the antiviral response of the human host. Recent advances in high-throughput experimental techniques have enabled a significant accumulation of human-virus PPI data, which has further fueled the development of machine learning-based human-virus PPI prediction methods. Emerging as a very promising method to predict human-virus PPIs, deep learning shows a powerful ability to integrate large-scale datasets, learn complex sequence-structure relationships of proteins, and convert the learned patterns into final prediction models with high accuracy. Focusing on recent progress in deep learning-powered human-virus PPI prediction, we review technical details of these newly developed methods, including dataset preparation, deep learning architectures, feature engineering, and performance assessment. Moreover, we discuss the current challenges and potential solutions and provide future perspectives on human-virus PPI prediction in the coming post-AlphaFold2 era.
Affiliation(s)
- Xiaodi Yang
  - State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing, China
- Shiping Yang
  - State Key Laboratory of Plant Physiology and Biochemistry, College of Biological Sciences, China Agricultural University, Beijing, China
- Panyu Ren
  - State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing, China
- Stefan Wuchty
  - Department of Computer Science, University of Miami, Miami, FL, United States
  - Department of Biology, University of Miami, Miami, FL, United States
  - Sylvester Comprehensive Cancer Center, University of Miami, Miami, FL, United States
- Ziding Zhang
  - State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing, China
  - *Correspondence: Ziding Zhang
11
Chai H, Gu Q, Hughes J, Robertson DL. In silico prediction of HIV-1-host molecular interactions and their directionality. PLoS Comput Biol 2022; 18:e1009720. [PMID: 35134057 PMCID: PMC8856524 DOI: 10.1371/journal.pcbi.1009720] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/31/2021] [Revised: 02/18/2022] [Accepted: 12/03/2021] [Indexed: 11/18/2022] Open
Abstract
Human immunodeficiency virus type 1 (HIV-1) continues to be a major cause of disease and premature death. As with all viruses, HIV-1 exploits a host cell to replicate. Improving our understanding of the molecular interactions between virus and human host proteins is crucial for a mechanistic understanding of virus biology, infection and host antiviral activities. This knowledge will potentially permit the identification of host molecules for targeting by drugs with antiviral properties. Here, we propose a data-driven approach for the analysis and prediction of the HIV-1 interacting proteins (VIPs) with a focus on the directionality of the interaction: host-dependency versus antiviral factors. Using support vector machine learning models and features encompassing genetic, proteomic and network properties, our results reveal some significant differences between the VIPs and non-HIV-1 interacting human proteins (non-VIPs). As assessed by comparison with the HIV-1 infection pathway data in the Reactome database (sensitivity > 90%, threshold = 0.5), we demonstrate these models have good generalization properties. We find that the ‘direction’ of the HIV-1-host molecular interactions is also predictable due to different characteristics of ‘forward’/pro-viral versus ‘backward’/pro-host proteins. Additionally, we infer the previously unknown direction of the interactions between HIV-1 and 1351 human host proteins. A web server for performing predictions is available at http://hivpre.cvr.gla.ac.uk/.
Affiliation(s)
- Haiting Chai
- MRC-University of Glasgow Centre for Virus Research, Glasgow, United Kingdom
- Quan Gu
- MRC-University of Glasgow Centre for Virus Research, Glasgow, United Kingdom
- Joseph Hughes
- MRC-University of Glasgow Centre for Virus Research, Glasgow, United Kingdom
- David L. Robertson
- MRC-University of Glasgow Centre for Virus Research, Glasgow, United Kingdom
12
Khan MS, Yousafi Q, Bibi S, Azhar M, Ihsan A. Bioinformatics-Based Approaches to Study Virus-Host Interactions During SARS-CoV-2 Infection. Methods Mol Biol 2022; 2452:197-212. [PMID: 35554909 DOI: 10.1007/978-1-0716-2111-0_13] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/15/2023]
Abstract
Growing knowledge of biomolecules over recent decades is helping researchers address unresolved questions in virology. Recent high-throughput sequencing technologies enable the swift generation of SARS-CoV-2 genomic data and provide basic insight into viral infection. However, because of the variety of virus-host protein interactions, high-throughput technologies alone cannot provide complete details of viral pathogenesis. Identifying the virus-host protein interactions using bioinformatics approaches can assist in understanding the mechanism of SARS-CoV-2 infection and pathogenesis. In this chapter, recent integrative bioinformatics approaches are discussed to help virologists and computational biologists identify structurally similar proteins of human and SARS-CoV-2, and predict potential virus-host interactions. Considering the experimental and time limitations on effective antiviral drug development, computer-aided drug design (CADD) can reduce the gap between drug prediction and development. Further research on evolutionary approaches could help build a new pipeline for virus-host protein-protein interactions and improve understanding of host-switching events, as well as of the expansion of pathogen virulence and host range in emerging viral infections.
Affiliation(s)
- Muhammad Saad Khan
- Department of Biosciences, COMSATS University Islamabad, Sahiwal, Pakistan
- Qudsia Yousafi
- Department of Biosciences, COMSATS University Islamabad, Sahiwal, Pakistan
- Shabana Bibi
- Yunnan Herbal Laboratory, School of Ecology and Environmental Sciences, Yunnan University, Kunming, Yunnan, China
- Muhammad Azhar
- Department of Biosciences, COMSATS University Islamabad, Sahiwal, Pakistan
- Awais Ihsan
- Department of Biosciences, COMSATS University Islamabad, Sahiwal, Pakistan
13
Pitta JLDLP, Vasconcelos CRDS, Wallau GDL, Campos TDL, Rezende AM. In silico predictions of protein interactions between Zika virus and human host. PeerJ 2021; 9:e11770. [PMID: 34513323 PMCID: PMC8395582 DOI: 10.7717/peerj.11770] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/25/2020] [Accepted: 06/23/2021] [Indexed: 11/20/2022] Open
Abstract
Background: The Zika virus (ZIKV) belongs to the Flaviviridae family, was first isolated in the 1940s, and remained underreported until it emerged as a global threat in 2016, when severe consequences such as Guillain-Barré syndrome and microcephaly in newborns were reported. Understanding the molecular interactions of ZIKV proteins during host infection is important for developing treatments and prophylactic measures; however, the large-scale experimental approaches normally used to detect protein-protein interactions (PPIs) are onerous and labor-intensive. Computational methods, on the other hand, may overcome these challenges and guide traditional approaches toward one or a few protein molecules. The prediction of PPIs can be used to study host-parasite interactions at the protein level and reveal key pathways that allow viral infection. Results: Applying Random Forest and Support Vector Machine (SVM) algorithms, we performed predictions of PPIs between two ZIKV strains and human proteomes. The consensus number of predictions of both algorithms was 17,223 pairs of proteins. Functional enrichment analyses were executed with the predicted networks to assess the biological meaning of the protein interactions. Some pathways related to viral infection and neurological development were found for both ZIKV strains in the enrichment analysis, but the JAK-STAT pathway was observed only for strain PE243 when compared with the FSS13025 strain. Conclusions: The consensus network of PPI predictions made by Random Forest and SVM algorithms allowed an enrichment analysis that corroborates many aspects of ZIKV infection. The enrichment results are mainly related to viral infection, neuronal development, and immune response, and differed between the two compared ZIKV strains. Strain PE243 presented more predicted interactions between proteins from the JAK-STAT signaling pathway, which could lead to a more inflammatory immune response when compared with the FSS13025 strain. These results show that the methodology employed in this study can potentially reveal new interactions between ZIKV and human cells.
Affiliation(s)
- Túlio de Lima Campos
- Bioinformatics Platform, Aggeu Magalhães Institute-FIOCRUZ/PE, Recife, PE, Brasil
14
Yang X, Yang S, Lian X, Wuchty S, Zhang Z. Transfer learning via multi-scale convolutional neural layers for human-virus protein-protein interaction prediction. Bioinformatics 2021; 37:4771-4778. [PMID: 34273146 PMCID: PMC8406877 DOI: 10.1093/bioinformatics/btab533] [Citation(s) in RCA: 27] [Impact Index Per Article: 9.0] [Received: 02/23/2021] [Revised: 06/03/2021] [Accepted: 07/16/2021] [Indexed: 11/20/2022] Open
Abstract
Motivation: To complement experimental efforts, machine learning-based computational methods are playing an increasingly important role in predicting human–virus protein–protein interactions (PPIs). Furthermore, transfer learning can effectively apply prior knowledge obtained from a large source dataset/task to a small target dataset/task, improving prediction performance. Results: To predict interactions between human and viral proteins, we combine evolutionary sequence profile features with a Siamese convolutional neural network (CNN) architecture and a multi-layer perceptron. Our architecture outperforms various feature encodings-based machine learning and state-of-the-art prediction methods. As our main contribution, we introduce two transfer learning methods (i.e. ‘frozen’ type and ‘fine-tuning’ type) that reliably predict interactions in a target human–virus domain based on training in a source human–virus domain, by retraining CNN layers. Finally, we utilize the ‘frozen’ type transfer learning approach to predict human–SARS-CoV-2 PPIs, indicating that our predictions are topologically and functionally similar to experimentally known interactions. Availability and implementation: The source codes and datasets are available at https://github.com/XiaodiYangCAU/TransPPI/. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Xiaodi Yang
- State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing 100193, China
- Shiping Yang
- State Key Laboratory of Plant Physiology and Biochemistry, College of Biological Sciences, China Agricultural University, Beijing 100193, China
- Xianyi Lian
- State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing 100193, China
- Stefan Wuchty
- Dept. of Computer Science, University of Miami, Miami, FL 33146, USA
- Dept. of Biology, University of Miami, Miami, FL 33146, USA
- Sylvester Comprehensive Cancer Center, University of Miami, Miami, FL 33136, USA
- Ziding Zhang
- State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing 100193, China
15
Sudhakar P, Machiels K, Verstockt B, Korcsmaros T, Vermeire S. Computational Biology and Machine Learning Approaches to Understand Mechanistic Microbiome-Host Interactions. Front Microbiol 2021; 12:618856. [PMID: 34046017 PMCID: PMC8148342 DOI: 10.3389/fmicb.2021.618856] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 10/18/2020] [Accepted: 03/19/2021] [Indexed: 12/11/2022] Open
Abstract
The microbiome, by virtue of its interactions with the host, is implicated in various host functions including its influence on nutrition and homeostasis. Many chronic diseases such as diabetes, cancer, inflammatory bowel diseases are characterized by a disruption of microbial communities in at least one biological niche/organ system. Various molecular mechanisms between microbial and host components such as proteins, RNAs, metabolites have recently been identified, thus filling many gaps in our understanding of how the microbiome modulates host processes. Concurrently, high-throughput technologies have enabled the profiling of heterogeneous datasets capturing community level changes in the microbiome as well as the host responses. However, due to limitations in parallel sampling and analytical procedures, big gaps still exist in terms of how the microbiome mechanistically influences host functions at a system and community level. In the past decade, computational biology and machine learning methodologies have been developed with the aim of filling the existing gaps. Due to the agnostic nature of the tools, they have been applied in diverse disease contexts to analyze and infer the interactions between the microbiome and host molecular components. Some of these approaches allow the identification and analysis of affected downstream host processes. Most of the tools statistically or mechanistically integrate different types of -omic and meta -omic datasets followed by functional/biological interpretation. In this review, we provide an overview of the landscape of computational approaches for investigating mechanistic interactions between individual microbes/microbiome and the host and the opportunities for basic and clinical research. These could include but are not limited to the development of activity- and mechanism-based biomarkers, uncovering mechanisms for therapeutic interventions and generating integrated signatures to stratify patients.
Affiliation(s)
- Padhmanand Sudhakar
- Department of Chronic Diseases, Metabolism and Ageing, Translational Research Center for Gastrointestinal Disorders (TARGID), KU Leuven, Leuven, Belgium
- Earlham Institute, Norwich, United Kingdom
- Quadram Institute Bioscience, Norwich, United Kingdom
- Kathleen Machiels
- Department of Chronic Diseases, Metabolism and Ageing, Translational Research Center for Gastrointestinal Disorders (TARGID), KU Leuven, Leuven, Belgium
- Bram Verstockt
- Department of Chronic Diseases, Metabolism and Ageing, Translational Research Center for Gastrointestinal Disorders (TARGID), KU Leuven, Leuven, Belgium
- Department of Gastroenterology and Hepatology, University Hospitals Leuven, KU Leuven, Leuven, Belgium
- Tamas Korcsmaros
- Earlham Institute, Norwich, United Kingdom
- Quadram Institute Bioscience, Norwich, United Kingdom
- Séverine Vermeire
- Department of Chronic Diseases, Metabolism and Ageing, Translational Research Center for Gastrointestinal Disorders (TARGID), KU Leuven, Leuven, Belgium
- Department of Gastroenterology and Hepatology, University Hospitals Leuven, KU Leuven, Leuven, Belgium
16
Prasasty VD, Hutagalung RA, Gunadi R, Sofia DY, Rosmalena R, Yazid F, Sinaga E. Prediction of human-Streptococcus pneumoniae protein-protein interactions using logistic regression. Comput Biol Chem 2021; 92:107492. [PMID: 33964803 DOI: 10.1016/j.compbiolchem.2021.107492] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 03/02/2021] [Accepted: 04/21/2021] [Indexed: 02/07/2023]
Abstract
Streptococcus pneumoniae is a major cause of mortality in children under five years old. In recent years, the emergence of antibiotic-resistant strains of S. pneumoniae has increased the threat level of this pathogen. For that reason, the exploration of S. pneumoniae protein virulence factors should be considered in developing new drugs or vaccines, for instance through the analysis of host-pathogen protein-protein interactions (HP-PPIs). In this research, prediction of protein-protein interactions was performed with a logistic regression model using the number of protein domain occurrences as features. By utilizing HP-PPIs of three different pathogens as training data, the model achieved 57-77% precision, 64-75% recall, and 96-98% specificity. Prediction of human-S. pneumoniae protein-protein interactions using the model yielded 5823 interactions involving thirty S. pneumoniae proteins and 324 human proteins. Pathway enrichment analysis showed that most of the pathways involved in the predicted interactions are immune system pathways. Network topology analysis revealed β-galactosidase (BgaA) as the most central among the S. pneumoniae proteins in the predicted HP-PPI networks, with a degree centrality of 1.0 and a betweenness centrality of 0.451853. Further experimental studies are required to validate the predicted interactions and examine their roles in S. pneumoniae infection.
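The combination of modest precision and high specificity reported in this abstract is what one would expect when non-interacting pairs greatly outnumber interacting ones. The following is only a minimal sketch of how the three metrics relate to confusion-matrix counts; the function name and the counts are hypothetical and are not taken from the study.

```python
def precision_recall_specificity(tp, fp, tn, fn):
    """Return (precision, recall, specificity) from confusion-matrix counts.

    A ratio with a zero denominator is reported as 0.0 to avoid division errors.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return precision, recall, specificity


# Hypothetical imbalanced test set: 100 interacting pairs, 1000 non-interacting pairs.
p, r, s = precision_recall_specificity(tp=70, fp=30, tn=970, fn=30)
print(f"precision={p:.2f} recall={r:.2f} specificity={s:.2f}")
```

With these made-up counts, 30 false positives are a small fraction of the 1000 negatives, so specificity stays high, yet they make up a large share of the 100 predicted positives, so precision is much lower.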
Affiliation(s)
- Vivitri Dewi Prasasty
- Faculty of Biotechnology, Atma Jaya Catholic University of Indonesia, Jakarta, 12930, Indonesia
- Rory Anthony Hutagalung
- Faculty of Biotechnology, Atma Jaya Catholic University of Indonesia, Jakarta, 12930, Indonesia
- Reinhart Gunadi
- Department of Biology, Faculty of Life Sciences, Universitas Surya, Tangerang, Banten, 15143, Indonesia
- Dewi Yustika Sofia
- Department of Biology, Faculty of Life Sciences, Universitas Surya, Tangerang, Banten, 15143, Indonesia
- Rosmalena Rosmalena
- Department of Medical Chemistry, Faculty of Medicine, Universitas Indonesia, Jakarta, 10430, Indonesia
- Fatmawaty Yazid
- Department of Medical Chemistry, Faculty of Medicine, Universitas Indonesia, Jakarta, 10430, Indonesia
- Ernawati Sinaga
- Faculty of Biology, Universitas Nasional, Jakarta, 12520, Indonesia
17
KÖSESOY İ, GÖK M, KAHVECİ T. Prediction of host-pathogen protein interactions by extended network model. Turk J Biol 2021; 45:138-148. [PMID: 33907496 PMCID: PMC8068772 DOI: 10.3906/biy-2009-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 09/01/2020] [Accepted: 01/04/2021] [Indexed: 11/26/2022] Open
Abstract
Knowledge of the pathogen-host interactions between species is essential in order to develop a solution strategy against infectious diseases. In vitro methods take extended periods of time to detect interactions and provide very few of the possible interaction pairs. Hence, modelling interactions between proteins has necessitated the development of computational methods. The main scope of this paper is integrating the known protein interactions between the host and pathogen organisms to improve the prediction success rate of unknown pathogen-host interactions. Thus, the true positive rate of the predictions was expected to increase. In order to perform this study extensively, several protein-encoding methods and learning algorithms were tested. Along with human as the host organism, two different pathogen organisms were used in the experiments. For each combination of protein-encoding and prediction method, the original prediction algorithm was first tested using only pathogen-host interactions, and the same method was tested again after integrating the known protein interactions within each organism. The effect of merging the networks of pathogen-host interactions of different species on the prediction performance of state-of-the-art methods was also observed. Success was measured in terms of Matthews correlation coefficient, precision, recall, F1 score, and accuracy metrics. Empirical results showed that integrating the host and pathogen interactions yields better performance consistently in almost all experiments.
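The five evaluation metrics named in this abstract can all be derived from the four confusion-matrix counts. The sketch below is illustrative only: the helper name `classification_metrics` and the counts are hypothetical, and nothing here reproduces the paper's own evaluation code or results.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy, precision, recall, F1 and MCC from confusion counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall, written in count form.
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0
    # Matthews correlation coefficient; defined as 0.0 when any marginal is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "mcc": mcc,
    }


# Hypothetical counts for one encoding/classifier combination.
m = classification_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

Unlike accuracy, MCC uses all four counts symmetrically, which is one reason it is often preferred when interacting and non-interacting pairs are unevenly represented.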
Affiliation(s)
- İrfan KÖSESOY
- Department of Computer Engineering, Faculty of Engineering, Yalova University, Yalova, Turkey
- Murat GÖK
- Department of Computer Engineering, Faculty of Engineering, Yalova University, Yalova, Turkey
- Tamer KAHVECİ
- Department of Computer and Information Science and Engineering, University of Florida, Gainesville, FL, USA
18
Wang Y, Zhou M, Zou Q, Xu L. Machine learning for phytopathology: from the molecular scale towards the network scale. Brief Bioinform 2021; 22:6204793. [PMID: 33787847 DOI: 10.1093/bib/bbab037] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 12/15/2020] [Revised: 01/09/2021] [Accepted: 01/26/2021] [Indexed: 01/16/2023] Open
Abstract
With the increasing volume of high-throughput sequencing data from a variety of omics techniques in the field of plant-pathogen interactions, sorting, retrieving, processing and visualizing biological information have become a great challenge. Within the explosion of data, machine learning offers powerful tools to process these complex omics data by various algorithms, such as Bayesian reasoning, support vector machine and random forest. Here, we introduce the basic frameworks of machine learning in dissecting plant-pathogen interactions and discuss the applications and advances of machine learning in plant-pathogen interactions from molecular to network biology, including the prediction of pathogen effectors, plant disease resistance protein monitoring and the discovery of protein-protein networks. The aim of this review is to provide a summary of advances in plant defense and pathogen infection and to indicate the important developments of machine learning in phytopathology.
Affiliation(s)
- Yansu Wang
- Postdoctoral Innovation Practice Base, Shenzhen Polytechnic, China
- Quan Zou
- University of Electronic Science and Technology of China
- Lei Xu
- Shenzhen Polytechnic, China
19
Lian X, Yang X, Yang S, Zhang Z. Current status and future perspectives of computational studies on human-virus protein-protein interactions. Brief Bioinform 2021; 22:6161422. [PMID: 33693490 DOI: 10.1093/bib/bbab029] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 11/10/2020] [Revised: 01/14/2021] [Accepted: 01/20/2021] [Indexed: 12/19/2022] Open
Abstract
The protein-protein interactions (PPIs) between humans and viruses mediate viral infection and host immunity processes. Therefore, the study of human-virus PPIs can help us understand the principles of human-virus relationships and can thus guide the development of highly effective drugs to break the transmission of viral infectious diseases. Recent years have witnessed the rapid accumulation of experimentally identified human-virus PPI data, which provides an unprecedented opportunity for bioinformatics studies revolving around human-virus PPIs. In this article, we provide a comprehensive overview of computational studies on human-virus PPIs, focusing especially on method development for human-virus PPI prediction. We briefly introduce the experimental detection methods and existing database resources of human-virus PPIs, and then discuss the research progress in the development of computational prediction methods. In particular, we elaborate on the machine learning-based prediction methods and highlight the need to embrace state-of-the-art deep-learning algorithms and new feature engineering techniques (e.g. the protein embedding technique derived from natural language processing). To further advance the understanding of this research topic, we also outline the practical applications of the human-virus interactome in fundamental biological discovery and new antiviral therapy development.
Affiliation(s)
- Xianyi Lian
- State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing 100193, China
- Xiaodi Yang
- State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing 100193, China
- Shiping Yang
- State Key Laboratory of Plant Physiology and Biochemistry, College of Biological Sciences, China Agricultural University, Beijing 100193, China
- Ziding Zhang
- State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing 100193, China
20
Khorsand B, Savadi A, Naghibzadeh M. SARS-CoV-2-human protein-protein interaction network. INFORMATICS IN MEDICINE UNLOCKED 2020; 20:100413. [PMID: 32838020 PMCID: PMC7425553 DOI: 10.1016/j.imu.2020.100413] [Citation(s) in RCA: 32] [Impact Index Per Article: 8.0] [Received: 06/20/2020] [Revised: 07/11/2020] [Accepted: 08/10/2020] [Indexed: 12/13/2022] Open
Abstract
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is the novel coronavirus that caused the coronavirus disease 2019 pandemic, infecting more than 12 million people and resulting in over 560,000 deaths in 213 countries around the world. The absence of symptoms in the first week of infection increases the rate at which the virus spreads. The rising number of infected individuals and the high mortality necessitate the immediate development of proper diagnostic methods and effective treatments. SARS-CoV-2, similar to other viruses, needs to interact with host proteins to reach the host cells and replicate its genome. Consequently, virus-host protein-protein interaction (PPI) identification could be useful in predicting the behavior of the virus and in the design of antiviral drugs. Identification of virus-host PPIs using experimental approaches is very time-consuming and expensive. Computational approaches could be acceptable alternatives for many preliminary investigations. In this study, we developed a new method to predict SARS-CoV-2-human PPIs. Our model is a three-layer network in which the first layer contains the Alphainfluenzavirus proteins most similar to SARS-CoV-2 proteins. The second layer contains protein-protein interactions between Alphainfluenzavirus proteins and human proteins. The last layer reveals protein-protein interactions between SARS-CoV-2 proteins and human proteins by using the clustering coefficient network property on the first two layers. To further analyze the results of our prediction network, we investigated human proteins targeted by SARS-CoV-2 proteins and reported the most central human proteins in the human PPI network. Moreover, differentially expressed genes from previous studies were investigated, and the PPIs of the SARS-CoV-2-human network whose human proteins were related to upregulated genes were reported.
Affiliation(s)
- Babak Khorsand
- Department of Computer Engineering, Faculty of Engineering, Ferdowsi University of Mashhad, Mashhad, Iran
- Abdorreza Savadi
- Department of Computer Engineering, Faculty of Engineering, Ferdowsi University of Mashhad, Mashhad, Iran
- Mahmoud Naghibzadeh
- Department of Computer Engineering, Faculty of Engineering, Ferdowsi University of Mashhad, Mashhad, Iran
21
Chen H, Li F, Wang L, Jin Y, Chi CH, Kurgan L, Song J, Shen J. Systematic evaluation of machine learning methods for identifying human-pathogen protein-protein interactions. Brief Bioinform 2020; 22:5847611. [PMID: 32459334 DOI: 10.1093/bib/bbaa068] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 01/31/2020] [Revised: 03/31/2020] [Accepted: 04/01/2020] [Indexed: 12/11/2022] Open
Abstract
In recent years, high-throughput experimental techniques have significantly enhanced the accuracy and coverage of protein-protein interaction identification, including human-pathogen protein-protein interactions (HP-PPIs). Despite this progress, experimental methods are, in general, expensive in terms of both time and labour costs, especially considering the enormous number of potential protein-interacting partners. Developing computational methods to predict interactions between human and bacterial pathogens has thus become critical and meaningful, both in facilitating the detection of interactions and in mining incomplete interaction maps. In this paper, we present a systematic evaluation of machine learning-based computational methods for human-bacterium protein-protein interactions (HB-PPIs). We first reviewed a vast number of publicly available databases of HP-PPIs and then critically evaluated the availability of these databases. Benefitting from the well-structured nature of these databases, we subsequently preprocessed the data and identified six bacterial pathogens that could be used to study bacteria for which a human was the host. Additionally, we thoroughly reviewed the literature on 'host-pathogen interactions' and summarized the existing models, which we used to jointly study the impact of different feature representation algorithms and evaluate the performance of existing machine learning computational models. Owing to the abundance of sequence information and the limited scale of other protein-related information, we adopted the primary protocol from the literature and dedicated our analysis to a comprehensive assessment of sequence information and machine learning models. A systematic evaluation of machine learning models and a wide range of feature representation algorithms based on sequence information is presented as a comparison survey towards the prediction performance evaluation of HB-PPIs.
22
Yang X, Yang S, Li Q, Wuchty S, Zhang Z. Prediction of human-virus protein-protein interactions through a sequence embedding-based machine learning method. Comput Struct Biotechnol J 2019; 18:153-161. [PMID: 31969974 PMCID: PMC6961065 DOI: 10.1016/j.csbj.2019.12.005] [Citation(s) in RCA: 69] [Impact Index Per Article: 13.8] [Received: 10/10/2019] [Revised: 11/29/2019] [Accepted: 12/10/2019] [Indexed: 12/11/2022] Open
Abstract
The identification of human-virus protein-protein interactions (PPIs) is an essential and challenging research topic, potentially providing a mechanistic understanding of viral infection. Given that the experimental determination of human-virus PPIs is time-consuming and labor-intensive, computational methods are playing an important role in providing testable hypotheses, complementing the determination of large-scale interactomes between species. In this work, we applied an unsupervised sequence embedding technique (doc2vec) to represent protein sequences as rich feature vectors of low dimensionality. Training a Random Forest (RF) classifier on a training dataset that covers known PPIs between human and all viruses, we obtained excellent predictive accuracy, outperforming various combinations of machine learning algorithms and commonly used sequence encoding schemes. In a rigorous comparison with three existing human-virus PPI prediction methods, our proposed computational framework further provided very competitive and promising performance, suggesting that the doc2vec encoding scheme effectively captures context information of protein sequences pertaining to the corresponding protein-protein interactions. Our approach is freely accessible through our web server as part of our host-pathogen PPI prediction platform (http://zzdlab.com/InterSPPI/). Taken together, we hope the current work not only contributes a useful predictor to accelerate the exploration of human-virus PPIs, but also provides some meaningful insights into human-virus relationships.
Key Words
- AC, Auto Covariance
- ACC, Accuracy
- AUC, area under the ROC curve
- AUPRC, area under the PR curve
- Adaboost, Adaptive Boosting
- CT, Conjoint Triad
- Doc2vec
- Embedding
- Human-virus interaction
- LD, Local Descriptor
- MCC, Matthews correlation coefficient
- ML, machine learning
- MLP, Multiple Layer Perceptron
- MS, mass spectroscopy
- Machine learning
- PPIs, protein-protein interactions
- PR, Precision-Recall
- Prediction
- Protein-protein interaction
- RBF, radial basis function
- RF, Random Forest
- ROC, Receiver Operating Characteristic
- SGD, stochastic gradient descent
- SVM, Support Vector Machine
- Y2H, yeast two-hybrid
Affiliation(s)
- Xiaodi Yang
- State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing 100193, China
- Shiping Yang
- State Key Laboratory of Plant Physiology and Biochemistry, College of Biological Sciences, China Agricultural University, Beijing 100193, China
- Qinmengge Li
- National Demonstration Center for Experimental Biological Sciences Education, College of Biological Sciences, China Agricultural University, Beijing 100193, China
- Stefan Wuchty
- Dept. of Computer Science, University of Miami, Miami, FL 33146, USA
- Dept. of Biology, University of Miami, Miami, FL 33146, USA
- Center of Computational Science, University of Miami, Miami, FL 33146, USA
- Sylvester Comprehensive Cancer Center, University of Miami, Miami, FL 33136, USA
- Ziding Zhang
- State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing 100193, China
23
Zheng N, Wang K, Zhan W, Deng L. Targeting Virus-host Protein Interactions: Feature Extraction and Machine Learning Approaches. Curr Drug Metab 2019; 20:177-184. [PMID: 30156155 DOI: 10.2174/1389200219666180829121038] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 01/19/2018] [Revised: 05/21/2018] [Accepted: 08/02/2018] [Indexed: 01/15/2023]
Abstract
BACKGROUND: Targeting critical virus-host Protein-Protein Interactions (PPIs) has enormous application prospects for therapeutics. Using experimental methods to evaluate all possible virus-host PPIs is labor-intensive and time-consuming. Recent growth in the computational identification of virus-host PPIs provides new opportunities for gaining biological insights, including applications in disease control. We provide an overview of recent computational approaches for studying virus-host PPIs. METHODS: In this review, a variety of computational methods for virus-host PPI prediction are surveyed. These methods are categorized based on the features they utilize and the machine learning algorithms they apply, including classical and novel methods. RESULTS: We describe the pivotal and representative features extracted from relevant sources of biological data, which mainly include sequence signatures, known domain interactions, protein motifs and protein structure information. We focus on state-of-the-art machine learning algorithms that are used to build binary prediction models for the classification of virus-host protein pairs and discuss their abilities, weaknesses and future directions. CONCLUSION: The findings of this review confirm the importance of computational methods for finding potential protein-protein interactions between virus and host. Although there has been significant progress in the prediction of virus-host PPIs in recent years, considerable room for improvement remains.
Affiliation(s)
- Nantao Zheng
  - School of Software, Central South University, Changsha, 410075, China
- Kairou Wang
  - School of Software, Central South University, Changsha, 410075, China
- Weihua Zhan
  - School of Electronics and Computer Science, Zhejiang Wanli University, Ningbo 315100, China
- Lei Deng
  - School of Software, Central South University, Changsha, 410075, China
  - Shanghai Key Lab of Intelligent Information Processing, Shanghai 200433, China
|
24
|
Lian X, Yang S, Li H, Fu C, Zhang Z. Machine-Learning-Based Predictor of Human–Bacteria Protein–Protein Interactions by Incorporating Comprehensive Host-Network Properties. J Proteome Res 2019; 18:2195-2205. [DOI: 10.1021/acs.jproteome.9b00074] [Citation(s) in RCA: 29] [Impact Index Per Article: 5.8] [Indexed: 02/08/2023]
Affiliation(s)
- Xianyi Lian
  - State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing 100193, China
- Shiping Yang
  - State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing 100193, China
- Hong Li
  - Key Laboratory of Tropical Biological Resources of Ministry of Education, Hainan University, Haikou, 570228, China
- Chen Fu
  - State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing 100193, China
- Ziding Zhang
  - State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing 100193, China
|
25
|
Ray S, Alberuni S, Maulik U. Computational Prediction of HCV-Human Protein-Protein Interaction via Topological Analysis of HCV Infected PPI Modules. IEEE Trans Nanobioscience 2019; 17:55-61. [PMID: 29570075 DOI: 10.1109/tnb.2018.2797696] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 11/08/2022]
Abstract
In this paper, we have developed a framework for detecting protein-protein interactions (PPIs) between Hepatitis C virus (HCV) and human proteins, based on PPI and Gene Ontology information about the HCV-infected proteins. First, a bipartite interaction network is formed between HCV proteins and human host proteins. Next, we analyze different topological properties of the interaction network and observe that the degree of HCV-interacting proteins is significantly higher than that of non-interacting host proteins. We also observe that HCV-interacting protein pairs are more functionally similar to each other than non-interacting pairs. Following these observations, we apply an inference mechanism to predict novel interactions between HCV and human proteins. The inference mechanism partitions the network formed by HCV-interacting human proteins and their first neighbors into dense and functionally similar groups using a PPI network clustering algorithm. The groups are then analyzed to predict PPIs. The predicted interaction pairs are validated by a literature search in PubMed, which yields experimental evidence for over 50% of the predicted pairs. A Gene Ontology and pathway-based analysis is also carried out to validate the identified modules biologically.
|
26
|
Application of Support Vector Machines in Viral Biology. GLOBAL VIROLOGY III: VIROLOGY IN THE 21ST CENTURY 2019. [PMCID: PMC7114997 DOI: 10.1007/978-3-030-29022-1_12] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/13/2022]
Abstract
Novel experimental and sequencing techniques have led to an exponential explosion and spiraling of data in viral genomics. To analyse such data, rapidly gain information, and transform this information to knowledge, interdisciplinary approaches involving several different types of expertise are necessary. Machine learning has been in the forefront of providing models with increasing accuracy due to development of newer paradigms with strong fundamental bases. Support Vector Machines (SVM) is one such robust tool, based rigorously on statistical learning theory. SVM provides very high quality and robust solutions to classification and regression problems. Several studies in virology employ high performance tools including SVM for identification of potentially important gene and protein functions. This is mainly due to the highly beneficial aspects of SVM. In this chapter we briefly provide lucid and easy to understand details of SVM algorithms along with applications in virology.
|
27
|
A new sequence based encoding for prediction of host-pathogen protein interactions. Comput Biol Chem 2018; 78:170-177. [PMID: 30553999 DOI: 10.1016/j.compbiolchem.2018.12.001] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 12/23/2017] [Revised: 08/23/2018] [Accepted: 12/01/2018] [Indexed: 12/22/2022]
Abstract
Pathogen-host interactions are key to understanding the infection process at the molecular level, where pathogen proteins physically bind to human proteins to manipulate critical biological processes in the host cell. Data scarcity and data unavailability are two major problems for computational approaches to the prediction of pathogen-host interactions. Developing a computational method that predicts pathogen-host interactions with high accuracy from protein sequences alone is of great importance because it can eliminate these problems. In this study, we propose a novel and robust sequence-based feature extraction method, named Location Based Encoding, to predict pathogen-host interactions with machine learning algorithms. In this context, we use Bacillus anthracis and Yersinia pestis data sets as the pathogen organisms and human proteins as the host model to compare our method with sequence-based protein encoding methods that are widely used in the literature, namely amino acid composition, amino acid pair, and conjoint triad. We use these encoding methods with decision tree (Random Forest, J48), statistical (Bayesian Networks, Naive Bayes), and instance-based (kNN) classifiers to predict pathogen-host interactions. We conduct different experiments to evaluate the effectiveness of our method. Across all experiments, we obtain the best results with the RF classifier in terms of F1, accuracy, MCC, and AUC.
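The baseline encodings named in this abstract can be illustrated with a minimal Python sketch. It does not reproduce the authors' Location Based Encoding, which the abstract does not specify. It shows only amino acid composition and a frequency-normalized conjoint triad, assuming the commonly used seven-class residue grouping. The function names and the choice to concatenate pathogen and host vectors are illustrative assumptions.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Assumption: the seven residue classes commonly used for conjoint triads.
CLASSES = ["AGV", "ILFP", "YMTS", "HNQW", "RK", "DE", "C"]
CLASS_OF = {aa: str(i) for i, group in enumerate(CLASSES) for aa in group}
TRIADS = ["".join(t) for t in product("0123456", repeat=3)]  # 7^3 = 343 triads


def amino_acid_composition(seq):
    """Return the 20-dimensional vector of residue frequencies."""
    residues = [aa for aa in seq.upper() if aa in CLASS_OF]
    n = len(residues) or 1  # avoid division by zero on empty input
    return [residues.count(aa) / n for aa in AMINO_ACIDS]


def conjoint_triad(seq):
    """Return the 343-dimensional vector of class-triad frequencies."""
    classes = "".join(CLASS_OF[aa] for aa in seq.upper() if aa in CLASS_OF)
    counts = dict.fromkeys(TRIADS, 0)
    for i in range(len(classes) - 2):
        counts[classes[i:i + 3]] += 1
    total = max(len(classes) - 2, 1)
    return [counts[t] / total for t in TRIADS]


def encode_pair(pathogen_seq, host_seq):
    """Concatenate both proteins' encodings into one feature vector."""
    return (amino_acid_composition(pathogen_seq) + conjoint_triad(pathogen_seq)
            + amino_acid_composition(host_seq) + conjoint_triad(host_seq))
```

Each protein pair then becomes a fixed-length vector of 2 × (20 + 343) values that any of the classifiers listed in the abstract could consume. Non-standard residue letters are simply skipped in this sketch.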
|
28
|
Halder AK, Dutta P, Kundu M, Basu S, Nasipuri M. Review of computational methods for virus-host protein interaction prediction: a case study on novel Ebola-human interactions. Brief Funct Genomics 2018; 17:381-391. [PMID: 29028879 PMCID: PMC7109800 DOI: 10.1093/bfgp/elx026] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Indexed: 12/15/2022]
Abstract
Identification of potential virus-host interactions is vital for controlling highly infectious virus-caused diseases and may contribute toward the development of new drugs to treat viral infections. Recently, database records of clinically and experimentally validated interactions between a small set of human proteins and Ebola virus (EBOV) have been published. Using the information on the known human interaction partners of EBOV, our main objective is to identify a set of proteins that may interact with EBOV proteins. Here, we first review the state-of-the-art computational methods used for prediction of novel virus-host interactions for infectious diseases, followed by a case study on EBOV-human interactions. The assessment result shows that the predicted human host proteins are highly similar to known human interaction partners of EBOV in terms of structure and semantics and are responsible for similar biochemical activities, pathways and host-pathogen relationships.
Affiliation(s)
- Anup Kumar Halder
  - Department of Computer Science and Engineering, Jadavpur University, India
- Pritha Dutta
  - Department of Computer Science and Engineering, Jadavpur University, India
- Mahantapas Kundu
  - Department of Computer Science and Engineering, Jadavpur University, India
- Subhadip Basu
  - Department of Computer Science and Engineering, Jadavpur University, India
- Mita Nasipuri
  - Department of Computer Science and Engineering, Jadavpur University, India
|
29
|
Sun J, Yang LL, Chen X, Kong DX, Liu R. Integrating Multifaceted Information to Predict Mycobacterium tuberculosis-Human Protein-Protein Interactions. J Proteome Res 2018; 17:3810-3823. [PMID: 30269499 DOI: 10.1021/acs.jproteome.8b00497] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Indexed: 12/11/2022]
Abstract
Tuberculosis (TB) is one of the biggest infectious disease killers caused by Mycobacterium tuberculosis (MTB). Studying the protein-protein interactions (PPIs) between MTB and human can deepen our understanding of the pathogenesis of TB and offer new clues to the treatment against MTB infection, but the experimentally validated interactions are especially scarce in this regard. Herein we proposed an integrated framework that combined template-, domain-domain interaction-, and machine learning-based methods to predict MTB-human PPIs. As a result, we established a network composed of 13 758 PPIs including 451 MTB proteins and 3167 human proteins ( http://liulab.hzau.edu.cn/MTB/ ). Compared to known human targets of various pathogens, our predicted human targets show a similar tendency in terms of the network topological properties and enrichment in important functional genes. Additionally, these human targets largely have longer sequence lengths, more protein domains, more disordered residues, lower evolutionary rates, and older protein ages. Functional analysis demonstrates that these proteins show strong preferences toward the phosphorylation, kinase activity, and signaling transduction processes and the disease and immune related pathways. Dissecting the cross-talk among top-ranked pathways suggests that the cancer pathway may serve as a bridge in MTB infection. Triplet analysis illustrates that the paired targets interacting with the same partner are adjacent to each other in the intraspecies network and tend to share similar expression patterns. Finally, we identified 36 potential anti-MTB human targets by integrating known drug target information and molecular properties of proteins.
|
30
|
Goodacre N, Devkota P, Bae E, Wuchty S, Uetz P. Protein-protein interactions of human viruses. Semin Cell Dev Biol 2018; 99:31-39. [PMID: 30031213 PMCID: PMC7102568 DOI: 10.1016/j.semcdb.2018.07.018] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Received: 09/08/2017] [Revised: 04/02/2018] [Accepted: 07/17/2018] [Indexed: 12/16/2022]
Abstract
Viruses infect their human hosts by a series of interactions between viral and host proteins, indicating that detailed knowledge of such virus-host interaction interfaces is critical for our understanding of viral infection mechanisms, disease etiology and the development of new drugs. In this review, we primarily survey human host-virus interaction data that are available from public databases following the standardized PSI-MS format. Notably, available host-virus protein interaction information is strongly biased toward a small number of virus families including herpesviridae, papillomaviridae, orthomyxoviridae and retroviridae. While we explore the reliability and relevance of these protein interactions, we also survey the current knowledge about viruses' functional and topological targets. Furthermore, we assess emerging frontiers of host-virus protein interaction research, focusing on protein interaction interfaces of hosts that are infected by different viruses and viruses that infect multiple hosts. Finally, we cover the current status of research that investigates the relationships of virus-targeted host proteins to other comorbidities as well as the influence of host-virus protein interactions on human metabolism.
Affiliation(s)
- Norman Goodacre
  - Division of Viral Products, Office of Vaccines Research and Review, Center for Biologics Evaluation and Research, U.S. Food and Drug Administration, Silver Spring, MD, USA
- Prajwal Devkota
  - Dept. of Computer Science, Univ. of Miami, Coral Gables, FL, 33146, USA
- Eunhae Bae
  - Division of Viral Products, Office of Vaccines Research and Review, Center for Biologics Evaluation and Research, U.S. Food and Drug Administration, Silver Spring, MD, USA
- Stefan Wuchty
  - Dept. of Computer Science, Univ. of Miami, Coral Gables, FL, 33146, USA
  - Center for Computational Science, Univ. of Miami, Coral Gables, FL, 33146, USA
  - Dept. of Biology, Univ. of Miami, Coral Gables, FL, 33146, USA
  - Sylvester Comprehensive Cancer Center, Miller School of Medicine, University of Miami, Miami, FL, 33136, USA
- Peter Uetz
  - Center for the Study of Biological Complexity, Virginia Commonwealth University, Richmond, VA, 23284, USA
|
31
|
Basit AH, Abbasi WA, Asif A, Gull S, Minhas FUAA. Training host-pathogen protein-protein interaction predictors. J Bioinform Comput Biol 2018; 16:1850014. [PMID: 30060698 DOI: 10.1142/s0219720018500142] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Indexed: 12/19/2022]
Abstract
Detection of protein-protein interactions (PPIs) plays a vital role in molecular biology. Particularly, pathogenic infections are caused by interactions of host and pathogen proteins. It is important to identify host-pathogen interactions (HPIs) to discover new drugs to counter infectious diseases. Conventional wet lab PPI detection techniques have limitations in terms of cost and large-scale application. Hence, computational approaches are developed to predict PPIs. This study aims to develop machine learning models to predict inter-species PPIs with a special interest in HPIs. Specifically, we focus on seeking answers to three questions that arise while developing an HPI predictor: (1) How should negative training examples be selected? (2) Does assigning sample weights to individual negative examples based on their similarity to positive examples improve generalization performance? and, (3) What should be the size of negative samples as compared to the positive samples during training and evaluation? We compare two available methods for negative sampling: random versus DeNovo sampling and our experiments show that DeNovo sampling offers better accuracy. However, our experiments also show that generalization performance can be improved further by using a soft DeNovo approach that assigns sample weights to negative examples inversely proportional to their similarity to known positive examples during training. Based on our findings, we have also developed an HPI predictor called HOPITOR (Host-Pathogen Interaction Predictor) that can predict interactions between human and viral proteins. The HOPITOR web server can be accessed at the URL: http://faculty.pieas.edu.pk/fayyaz/software.html#HoPItor .
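The idea of weighting negative examples inversely to their similarity to known positives can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the HOPITOR implementation: the similarity scores are hypothetical inputs in [0, 1], and the smoothing constant `eps`, the rescaling to a maximum weight of 1, and both function names are choices made here for the example.

```python
def soft_negative_weights(similarities, eps=0.1):
    """Weight each negative inversely to its similarity to known positives.

    similarities: one hypothetical score in [0, 1] per negative example.
    eps: assumed smoothing constant that keeps weights finite at similarity 0.
    """
    if not similarities:
        return []
    raw = [1.0 / (s + eps) for s in similarities]
    top = max(raw)
    return [w / top for w in raw]  # rescale so the largest weight is 1


def build_sample_weights(labels, similarities):
    """Positives (label 1) keep weight 1; negatives (label 0) get soft weights."""
    neg_idx = [i for i, y in enumerate(labels) if y == 0]
    neg_w = soft_negative_weights([similarities[i] for i in neg_idx])
    weights = [1.0] * len(labels)
    for i, w in zip(neg_idx, neg_w):
        weights[i] = w
    return weights


# Two positives and two negatives; the second negative resembles known
# positives more closely, so it contributes less to training.
print(build_sample_weights([1, 0, 0, 1], [0.0, 0.2, 0.8, 0.0]))
```

A weight vector of this kind could be passed to any learner that accepts per-sample weights during fitting, so that negatives that look like known interactions are penalized less when misclassified.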
Affiliation(s)
- Abdul Hannan Basit
  - Department of Computer and Information Sciences, Biomedical Informatics Research Laboratory, Pakistan Institute of Engineering and Applied Sciences (PIEAS), Nilore, Islamabad 44000, Pakistan
  - Department of Electrical Engineering, Pakistan Institute of Engineering and Applied Sciences (PIEAS), Nilore, Islamabad 44000, Pakistan
- Wajid Arshad Abbasi
  - Department of Computer and Information Sciences, Biomedical Informatics Research Laboratory, Pakistan Institute of Engineering and Applied Sciences (PIEAS), Nilore, Islamabad 44000, Pakistan
- Amina Asif
  - Department of Computer and Information Sciences, Biomedical Informatics Research Laboratory, Pakistan Institute of Engineering and Applied Sciences (PIEAS), Nilore, Islamabad 44000, Pakistan
- Sadaf Gull
  - Department of Computer and Information Sciences, Biomedical Informatics Research Laboratory, Pakistan Institute of Engineering and Applied Sciences (PIEAS), Nilore, Islamabad 44000, Pakistan
- Fayyaz Ul Amir Afsar Minhas
  - Department of Computer and Information Sciences, Biomedical Informatics Research Laboratory, Pakistan Institute of Engineering and Applied Sciences (PIEAS), Nilore, Islamabad 44000, Pakistan
|
32
|
Devkota P, Danzi MC, Wuchty S. Beyond degree and betweenness centrality: Alternative topological measures to predict viral targets. PLoS One 2018; 13:e0197595. [PMID: 29795705 PMCID: PMC5967884 DOI: 10.1371/journal.pone.0197595] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 11/28/2017] [Accepted: 05/04/2018] [Indexed: 11/18/2022]
Abstract
The availability of large-scale screens of host-virus interaction interfaces enabled the topological analysis of viral protein targets of the host. In particular, host proteins that bind viral proteins are generally hubs and proteins with high betweenness centrality. Recently, other topological measures were introduced that a virus may tap to infect a host cell. Utilizing experimentally determined sets of human protein targets from Herpes, Hepatitis, HIV and Influenza, we pooled molecular interactions between proteins from different pathway databases. Apart from a protein's degree and betweenness centrality, we considered a protein's pathway participation, ability to topologically control a network and protein PageRank index. In particular, we found that proteins with increasing values of such measures tend to accumulate viral targets and distinguish viral targets from non-targets. Furthermore, all such topological measures strongly correlate with the occurrence of a given protein in different pathways. Building a random forest classifier that is based on such topological measures, we found that protein PageRank index had the highest impact on the classification of viral (non-)targets while proteins' ability to topologically control an interaction network played the least important role.
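Two of the topological measures mentioned in this abstract, degree and PageRank, can be illustrated on a toy network. The sketch below is a plain power-iteration PageRank on an undirected graph. The protein names, the damping factor of 0.85 and the fixed iteration count are assumptions made for the example, and nothing here reproduces the authors' pipeline or their random forest classifier.

```python
def build_neighbors(edges):
    """Build an undirected adjacency map from (protein, protein) pairs."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    return neighbors


def pagerank(edges, damping=0.85, iterations=100):
    """Power-iteration PageRank; every node has degree >= 1 by construction."""
    neighbors = build_neighbors(edges)
    n = len(neighbors)
    rank = {v: 1.0 / n for v in neighbors}
    for _ in range(iterations):
        rank = {
            v: (1 - damping) / n
            + damping * sum(rank[u] / len(neighbors[u]) for u in neighbors[v])
            for v in neighbors
        }
    return rank


# Hypothetical toy interaction network: HUB binds three partners.
toy_edges = [("HUB", "A"), ("HUB", "B"), ("HUB", "C"), ("C", "D")]
degree = {v: len(nb) for v, nb in build_neighbors(toy_edges).items()}
scores = pagerank(toy_edges)
for protein in sorted(scores, key=scores.get, reverse=True):
    print(protein, degree[protein], round(scores[protein], 3))
```

Per-protein values such as these could then serve as input features for a classifier that separates viral targets from non-targets, which is the role the abstract assigns to its topological measures.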
Affiliation(s)
- Prajwal Devkota
  - Dept. of Computer Science, Univ. of Miami, Coral Gables, FL, United States of America
- Matt C. Danzi
  - The Miami Project to Cure Paralysis, Miller School of Medicine, University of Miami, Miami, FL, United States of America
  - Center for Computational Science, Univ. of Miami, Coral Gables, FL, United States of America
- Stefan Wuchty
  - Dept. of Computer Science, Univ. of Miami, Coral Gables, FL, United States of America
  - Center for Computational Science, Univ. of Miami, Coral Gables, FL, United States of America
  - Dept. of Biology, Univ. of Miami, Coral Gables, FL, United States of America
  - Sylvester Comprehensive Cancer Center, Miller School of Medicine, University of Miami, Miami, FL, United States of America
|
33
|
Computational and Experimental Approaches to Predict Host-Parasite Protein-Protein Interactions. Methods Mol Biol 2018; 1819:153-173. [PMID: 30421403 DOI: 10.1007/978-1-4939-8618-7_7] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 12/20/2022]
Abstract
In host-parasite systems, protein-protein interactions are key to allowing the pathogen to enter the host and persist within it. The study of host-parasite molecular communication improves the understanding of the mechanisms of infection, evasion of the host immune system and tropism across different tissues. Current trends in parasitology focus on unraveling host-parasite protein-protein interactions to aid the development of new strategies to combat pathogenic parasites with better treatments and prevention mechanisms. Due to the complexity of capturing these interactions experimentally, computational approaches integrating data from different sources (mainly "omics" data) become key to complementing or supporting experimental approaches. Here, we focus on the application of experimental and computational methods in the prediction of host-parasite interactions and highlight the potential of each of these methods in specific contexts.
|
34
|
Nourani E, Khunjush F, Sevilgen FE. Virus–human protein–protein interaction prediction using Bayesian matrix factorization and projection techniques. Biocybern Biomed Eng 2018. [DOI: 10.1016/j.bbe.2018.04.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/17/2022]
|
35
|
Brito AF, Pinney JW. Protein-Protein Interactions in Virus-Host Systems. Front Microbiol 2017; 8:1557. [PMID: 28861068 PMCID: PMC5562681 DOI: 10.3389/fmicb.2017.01557] [Citation(s) in RCA: 82] [Impact Index Per Article: 11.7] [Received: 04/06/2017] [Accepted: 08/02/2017] [Indexed: 01/10/2023]
Abstract
To study virus–host protein interactions, knowledge about viral and host protein architectures and repertoires, their particular evolutionary mechanisms, and information on relevant sources of biological data is essential. The purpose of this review article is to provide a thorough overview about these aspects. Protein domains are basic units defining protein interactions, and the uniqueness of viral domain repertoires, their mode of evolution, and their roles during viral infection make viruses interesting models of study. Mutations at protein interfaces can reduce or increase their binding affinities by changing protein electrostatics and structural properties. During the course of a viral infection, both pathogen and cellular proteins are constantly competing for binding partners. Endogenous interfaces mediating intraspecific interactions—viral–viral or host–host interactions—are constantly targeted and inhibited by exogenous interfaces mediating viral–host interactions. From a biomedical perspective, blocking such interactions is the main mechanism underlying antiviral therapies. Some proteins are able to bind multiple partners, and their modes of interaction define how fast these “hub proteins” evolve. “Party hubs” have multiple interfaces; they establish simultaneous/stable (domain–domain) interactions, and tend to evolve slowly. On the other hand, “date hubs” have few interfaces; they establish transient/weak (domain–motif) interactions by means of short linear peptides (15 or fewer residues), and can evolve faster. Viral infections are mediated by several protein–protein interactions (PPIs), which can be represented as networks (protein interaction networks, PINs), with proteins being depicted as nodes, and their interactions as edges. It has been suggested that viral proteins tend to establish interactions with more central and highly connected host proteins. 
In an evolutionary arms race, viral and host proteins are constantly changing their interface residues, either to evade or to optimize their binding capabilities. Apart from gaining and losing interactions via rewiring mechanisms, virus–host PINs also evolve via gene duplication (paralogy); conservation (orthology); horizontal gene transfer (HGT) (xenology); and molecular mimicry (convergence). The last sections of this review focus on PPI experimental approaches and their limitations, and provide an overview of sources of biomolecular data for studying virus–host protein interactions.
Affiliation(s)
- Anderson F Brito
  - Department of Life Sciences, Centre for Integrative Systems Biology and Bioinformatics, Imperial College London, London, United Kingdom
- John W Pinney
  - Department of Life Sciences, Centre for Integrative Systems Biology and Bioinformatics, Imperial College London, London, United Kingdom
|
36
|
Mariano R, Wuchty S. Structure-based prediction of host–pathogen protein interactions. Curr Opin Struct Biol 2017; 44:119-124. [DOI: 10.1016/j.sbi.2017.02.007] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Received: 08/18/2016] [Accepted: 02/28/2017] [Indexed: 11/25/2022]
|
37
|
Mosaddek Hossain SM, Ray S, Mukhopadhyay A. Preservation affinity in consensus modules among stages of HIV-1 progression. BMC Bioinformatics 2017; 18:181. [PMID: 28320358 PMCID: PMC5359929 DOI: 10.1186/s12859-017-1590-3] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Received: 10/01/2016] [Accepted: 03/09/2017] [Indexed: 11/16/2022]
Abstract
Background: Analysis of gene expression data provides valuable insights into disease mechanism. Investigating relationship among co-expression modules of different stages is a meaningful tool to understand the way in which a disease progresses. Identifying topological preservation of modular structure also contributes to that understanding. Methods: HIV-1 disease provides a well-documented progression pattern through three stages of infection: acute, chronic and non-progressor. In this article, we have developed a novel framework to describe the relationship among the consensus (or shared) co-expression modules for each pair of HIV-1 infection stages. The consensus modules are identified to assess the preservation of network properties. We have investigated the preservation patterns of co-expression networks during HIV-1 disease progression through an eigengene-based approach. Results: We discovered that the expression patterns of consensus modules have a strong preservation during the transitions of three infection stages. In particular, it is noticed that between acute and non-progressor stages the preservation is slightly more than the other pair of stages. Moreover, we have constructed eigengene networks for the identified consensus modules and observed the preservation structure among them. Some consensus modules are marked as preserved in two pairs of stages and are analyzed further to form a higher order meta-network consisting of a group of preserved modules. Additionally, we observed that module membership (MM) values of genes within a module are consistent with the preservation characteristics. The MM values of genes within a pair of preserved modules show strong correlation patterns across two infection stages. Conclusions: We have performed an extensive analysis to discover preservation pattern of co-expression network constructed from microarray gene expression data of three different HIV-1 progression stages. The preservation pattern is investigated through identification of consensus modules in each pair of infection stages. It is observed that the preservation of the expression pattern of consensus modules remains more prominent during the transition of infection from acute stage to non-progressor stage. Additionally, we observed that the module membership values of genes are coherent with preserved modules across the HIV-1 progression stages. Electronic supplementary material: The online version of this article (doi:10.1186/s12859-017-1590-3) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Sk Md Mosaddek Hossain
  - Department of Computer Science and Engineering, Aliah University, Kolkata, West Bengal, 700156, India
- Sumanta Ray
  - Department of Computer Science and Engineering, Aliah University, Kolkata, West Bengal, 700156, India
- Anirban Mukhopadhyay
  - Department of Computer Science and Engineering, University of Kalyani, Kalyani, West Bengal, 741235, India
|
38
|
Technologies for Proteome-Wide Discovery of Extracellular Host-Pathogen Interactions. J Immunol Res 2017; 2017:2197615. [PMID: 28321417 PMCID: PMC5340944 DOI: 10.1155/2017/2197615] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Received: 11/11/2016] [Accepted: 01/19/2017] [Indexed: 12/26/2022]
Abstract
Pathogens have evolved unique mechanisms to breach the cell surface barrier and manipulate the host immune response to establish a productive infection. Proteins exposed to the extracellular environment, both cell surface-expressed receptors and secreted proteins, are essential targets for initial invasion and play key roles in pathogen recognition and subsequent immunoregulatory processes. The identification of the host and pathogen extracellular molecules and their interaction networks is fundamental to understanding tissue tropism and pathogenesis and to inform the development of therapeutic strategies. Nevertheless, the characterization of the proteins that function in the host-pathogen interface has been challenging, largely due to the technical challenges associated with detection of extracellular protein interactions. This review discusses available technologies for the high throughput study of extracellular protein interactions between pathogens and their hosts, with a focus on mammalian viruses and bacteria. Emerging work illustrates a rich landscape for extracellular host-pathogen interaction and points towards the evolution of multifunctional pathogen-encoded proteins. Further development and application of technologies for genome-wide identification of extracellular protein interactions will be important in deciphering functional host-pathogen interaction networks, laying the foundation for development of novel therapeutics.
|
39
|
Kshirsagar M, Murugesan K, Carbonell JG, Klein-Seetharaman J. Multitask Matrix Completion for Learning Protein Interactions Across Diseases. J Comput Biol 2017; 24:501-514. [PMID: 28128642 DOI: 10.1089/cmb.2016.0201] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Indexed: 12/29/2022]
Abstract
Disease-causing pathogens such as viruses introduce their proteins into the host cells in which they interact with the host's proteins, enabling the virus to replicate inside the host. These interactions between pathogen and host proteins are key to understanding infectious diseases. Often multiple diseases involve phylogenetically related or biologically similar pathogens. Here we present a multitask learning method to jointly model interactions between human proteins and three different but related viruses: Hepatitis C, Ebola virus, and Influenza A. Our multitask matrix completion-based model uses a shared low-rank structure in addition to a task-specific sparse structure to incorporate the various interactions. We obtain between 7 and 39 percentage points improvement in predictive performance over prior state-of-the-art models. We show how our model's parameters can be interpreted to reveal both general and specific interaction-relevant characteristics of the viruses. Our code is available online.
Affiliation(s)
- Keerthiram Murugesan
- 2 Language Technologies Institute, Carnegie Mellon University, Pittsburgh, Pennsylvania
- Jaime G Carbonell
- 2 Language Technologies Institute, Carnegie Mellon University, Pittsburgh, Pennsylvania
- Judith Klein-Seetharaman
- 3 Metabolic & Vascular Health, Warwick Medical School, University of Warwick, Coventry, United Kingdom
|
40
|
Mei S, Zhang K. Computational discovery of Epstein-Barr virus targeted human genes and signalling pathways. Sci Rep 2016; 6:30612. [PMID: 27470517 PMCID: PMC4965740 DOI: 10.1038/srep30612] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Received: 04/07/2016] [Accepted: 07/05/2016] [Indexed: 12/22/2022]
Abstract
Epstein-Barr virus (EBV) plays important roles in the origin and the progression of human carcinomas, e.g. diffuse large B cell tumors, T cell lymphomas, etc. Discovering EBV targeted human genes and signaling pathways is vital to understand EBV tumorigenesis. In this study we propose a noise-tolerant homolog knowledge transfer method to reconstruct functional protein-protein interactions (PPI) networks between Epstein-Barr virus and Homo sapiens. The training set is augmented via homolog instances and the homolog noise is counteracted by support vector machine (SVM). Additionally we propose two methods to define subcellular co-localization (i.e. stringent and relaxed), based on which to further derive physical PPI networks. Computational results show that the proposed method achieves sound performance of cross validation and independent test. In the space of 648,672 EBV-human protein pairs, we obtain 51,485 functional interactions (7.94%), 869 stringent physical PPIs and 46,050 relaxed physical PPIs. Fifty-eight evidences are found from the latest database and recent literature to validate the model. This study reveals that Epstein-Barr virus interferes with normal human cell life, such as cholesterol homeostasis, blood coagulation, EGFR binding, p53 binding, Notch signaling, Hedgehog signaling, etc. The proteome-wide predictions are provided in the supplementary file for further biomedical research.
Affiliation(s)
- Suyu Mei
- Software College, Shenyang Normal University, Shenyang, 110034, China
- Kun Zhang
- Department of Computer Science, Xavier University of Louisiana, New Orleans, LA 70125, USA
|
41
|
Ray S, Bandyopadhyay S. A NMF based approach for integrating multiple data sources to predict HIV-1-human PPIs. BMC Bioinformatics 2016; 17:121. [PMID: 26956556 PMCID: PMC4784399 DOI: 10.1186/s12859-016-0952-6] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Received: 06/30/2015] [Accepted: 02/05/2016] [Indexed: 01/06/2023]
Abstract
BACKGROUND: Predicting novel interactions between HIV-1 and human proteins is one of the most promising areas in HIV research. Prediction is generally guided by classification- and inference-based methods that use a single biological source of information. RESULTS: In this article we propose a novel framework to predict protein-protein interactions (PPIs) between HIV-1 and human proteins by integrating multiple biological sources of information through non-negative matrix factorization (NMF). For this purpose, the multiple data sets are converted to biological networks, which are then used to predict modules. These modules are subsequently combined into meta-modules using an NMF-based clustering method. The integrated meta-modules are used to predict novel interactions between HIV-1 and human proteins. We have analyzed the significant GO terms and KEGG pathways in which the human proteins of the meta-modules participate. Moreover, the topological properties of the human proteins involved in the meta-modules are investigated. We have also performed a statistical significance test to evaluate the predictions. CONCLUSIONS: Here, we propose a novel approach, based on the integration of different biological data sources, for predicting PPIs between HIV-1 and human proteins. The integration is achieved through the non-negative matrix factorization (NMF) technique. Most of the predicted interactions are found to be well supported by the existing literature in PubMed. Moreover, human proteins in the predicted set emerge as 'hubs' and 'bottlenecks' in the analysis. The low p-value in the significance test also suggests that the predictions are statistically significant.
Affiliation(s)
- Sumanta Ray
- Department of Computer Science and Engineering, Aliah University, Kolkata-700156, West Bengal, India.
|
42
|
Abbasi WA, Minhas FUAA. Issues in performance evaluation for host-pathogen protein interaction prediction. J Bioinform Comput Biol 2016; 14:1650011. [PMID: 26932275 DOI: 10.1142/s0219720016500116] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Indexed: 12/18/2022]
Abstract
The study of interactions between host and pathogen proteins is important for understanding the underlying mechanisms of infectious diseases and for developing novel therapeutic solutions. Wet-lab techniques for detecting protein-protein interactions (PPIs) can benefit from computational predictions. Machine learning is one of the computational approaches that can assist biologists by predicting promising PPIs. A number of machine learning based methods for predicting host-pathogen interactions (HPI) have been proposed in the literature. The techniques used for assessing the accuracy of such predictors are of critical importance in this domain. In this paper, we question the effectiveness of K-fold cross-validation for estimating the generalization ability of HPI prediction for proteins with no known interactions. K-fold cross-validation does not model this scenario, and we demonstrate a sizable difference between its performance and the performance of an alternative evaluation scheme called leave one pathogen protein out (LOPO) cross-validation. LOPO is more effective in modeling the real world use of HPI predictors, specifically for cases in which no information about the interacting partners of a pathogen protein is available during training. We also point out that currently used metrics such as areas under the precision-recall or receiver operating characteristic curves are not intuitive to biologists and propose simpler and more directly interpretable metrics for this purpose.
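The difference between K-fold and leave one pathogen protein out (LOPO) cross-validation described in this abstract can be illustrated with a minimal sketch. The function name `lopo_splits`, the tuple layout, and the toy protein identifiers are assumptions made for illustration; they are not taken from the paper. The point shown is only the grouping rule: every pair involving the held-out pathogen protein goes to the test fold, so the training fold contains no interaction partners for it.

```python
from collections import defaultdict


def lopo_splits(pairs):
    """Yield (held_out_pathogen, train_idx, test_idx) for LOPO cross-validation.

    `pairs` is a list of (pathogen_protein, host_protein, label) tuples.
    This layout is a hypothetical choice for the sketch.
    """
    # Group example indices by the pathogen protein they involve.
    by_pathogen = defaultdict(list)
    for i, (pathogen, _host, _label) in enumerate(pairs):
        by_pathogen[pathogen].append(i)

    # One fold per pathogen protein: all of its pairs form the test set.
    for pathogen, test_idx in by_pathogen.items():
        held_out = set(test_idx)
        train_idx = [i for i in range(len(pairs)) if i not in held_out]
        yield pathogen, train_idx, test_idx


# Hypothetical toy data: three pathogen proteins, three host proteins.
pairs = [
    ("vp1", "hA", 1), ("vp1", "hB", 0),
    ("vp2", "hA", 1), ("vp2", "hC", 0),
    ("vp3", "hB", 1),
]

splits = list(lopo_splits(pairs))
for pathogen, train_idx, test_idx in splits:
    # The held-out pathogen protein never appears in the training fold.
    assert pathogen not in {pairs[i][0] for i in train_idx}
    print(pathogen, "train:", train_idx, "test:", test_idx)
```

A random K-fold split over the same five pairs could place ("vp1", "hA") in training and ("vp1", "hB") in testing, which is the leakage scenario the abstract argues does not reflect prediction for proteins with no known interactions.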
Affiliation(s)
- Wajid Arshad Abbasi
- 1 Department of Computer and Information Sciences, Pakistan Institute of Engineering and Applied Sciences (PIEAS), Nilore, Islamabad, Pakistan
- Fayyaz Ul Amir Afsar Minhas
- 1 Department of Computer and Information Sciences, Pakistan Institute of Engineering and Applied Sciences (PIEAS), Nilore, Islamabad, Pakistan
|
43
|
Eid FE, ElHefnawi M, Heath LS. DeNovo: virus-host sequence-based protein–protein interaction prediction. Bioinformatics 2015; 32:1144-50. [DOI: 10.1093/bioinformatics/btv737] [Citation(s) in RCA: 60] [Impact Index Per Article: 6.7] [Received: 07/29/2015] [Accepted: 12/12/2015] [Indexed: 01/02/2023]
|
44
|
Nourani E, Khunjush F, Durmuş S. Computational approaches for prediction of pathogen-host protein-protein interactions. Front Microbiol 2015; 6:94. [PMID: 25759684 PMCID: PMC4338785 DOI: 10.3389/fmicb.2015.00094] [Citation(s) in RCA: 70] [Impact Index Per Article: 7.8] [Received: 12/01/2014] [Accepted: 01/26/2015] [Indexed: 12/25/2022]
Abstract
Infectious diseases are still among the major and prevalent health problems, mostly because of the drug resistance of novel variants of pathogens. Molecular interactions between pathogens and their hosts are the key parts of the infection mechanisms. Novel antimicrobial therapeutics to fight drug resistance is only possible in case of a thorough understanding of pathogen-host interaction (PHI) systems. Existing databases, which contain experimentally verified PHI data, suffer from scarcity of reported interactions due to the technically challenging and time consuming process of experiments. These have motivated many researchers to address the problem by proposing computational approaches for analysis and prediction of PHIs. The computational methods primarily utilize sequence information, protein structure and known interactions. Classic machine learning techniques are used when there are sufficient known interactions to be used as training data. On the opposite case, transfer and multitask learning methods are preferred. Here, we present an overview of these computational approaches for predicting PHI systems, discussing their weakness and abilities, with future directions.
Affiliation(s)
- Esmaeil Nourani
- Department of Computer Science and Engineering, School of Electrical and Computer Engineering, Shiraz University, Shiraz, Iran
- Farshad Khunjush
- Department of Computer Science and Engineering, School of Electrical and Computer Engineering, Shiraz University, Shiraz, Iran; School of Computer Science, Institute for Research in Fundamental Sciences (IPM), Tehran, Iran
- Saliha Durmuş
- Computational Systems Biology Group, Department of Bioengineering, Gebze Technical University, Kocaeli, Turkey
|
45
|
Kshirsagar M, Schleker S, Carbonell J, Klein-Seetharaman J. Techniques for transferring host-pathogen protein interactions knowledge to new tasks. Front Microbiol 2015; 6:36. [PMID: 25699028 PMCID: PMC4313693 DOI: 10.3389/fmicb.2015.00036] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.1] [Received: 09/09/2014] [Accepted: 01/12/2015] [Indexed: 11/17/2022]
Abstract
We consider the problem of building a model to predict protein-protein interactions (PPIs) between the bacterial species Salmonella Typhimurium and the plant host Arabidopsis thaliana which is a host-pathogen pair for which no known PPIs are available. To achieve this, we present approaches, which use homology and statistical learning methods called “transfer learning.” In the transfer learning setting, the task of predicting PPIs between Arabidopsis and its pathogen S. Typhimurium is called the “target task.” The presented approaches utilize labeled data i.e., known PPIs of other host-pathogen pairs (we call these PPIs the “source tasks”). The homology based approaches use heuristics based on biological intuition to predict PPIs. The transfer learning methods use the similarity of the PPIs from the source tasks to the target task to build a model. For a quantitative evaluation we consider Salmonella-mouse PPI prediction and some other host-pathogen tasks where known PPIs exist. We use metrics such as precision and recall and our results show that our methods perform well on the target task in various transfer settings. We present a brief qualitative analysis of the Arabidopsis-Salmonella predicted interactions. We filter the predictions from all approaches using Gene Ontology term enrichment and only those interactions involving Salmonella effectors. Thereby we observe that Arabidopsis proteins involved e.g., in transcriptional regulation, hormone mediated signaling and defense response may be affected by Salmonella.
Affiliation(s)
- Meghana Kshirsagar
- School of Computer Science, Language Technologies Institute, Carnegie Mellon University, Pittsburgh, PA, USA
- Sylvia Schleker
- Metabolic and Vascular Health, Warwick Medical School, University of Warwick, Coventry, UK; Molecular Phytomedicine, Institute of Crop Science and Resource Conservation, University of Bonn, Bonn, Germany
- Jaime Carbonell
- School of Computer Science, Language Technologies Institute, Carnegie Mellon University, Pittsburgh, PA, USA
|
46
|
Schleker S, Kshirsagar M, Klein-Seetharaman J. Comparing human-Salmonella with plant-Salmonella protein-protein interaction predictions. Front Microbiol 2015; 6:45. [PMID: 25674082 PMCID: PMC4309195 DOI: 10.3389/fmicb.2015.00045] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 05/05/2014] [Accepted: 01/13/2015] [Indexed: 11/13/2022]
Abstract
Salmonellosis is the most frequent foodborne disease worldwide and can be transmitted to humans by a variety of routes, especially via animal and plant products. Salmonella bacteria are believed to use not only animal and human but also plant hosts despite their evolutionary distance. This raises the question if Salmonella employs similar mechanisms in infection of these diverse hosts. Given that most of our understanding comes from its interaction with human hosts, we investigate here to what degree knowledge of Salmonella-human interactions can be transferred to the Salmonella-plant system. Reviewed are recent publications on analysis and prediction of Salmonella-host interactomes. Putative protein-protein interactions (PPIs) between Salmonella and its human and Arabidopsis hosts were retrieved utilizing purely interolog-based approaches in which predictions were inferred based on available sequence and domain information of known PPIs, and machine learning approaches that integrate a larger set of useful information from different sources. Transfer learning is an especially suitable machine learning technique to predict plant host targets from the knowledge of human host targets. A comparison of the prediction results with transcriptomic data shows a clear overlap between the host proteins predicted to be targeted by PPIs and their gene ontology enrichment in both host species and regulation of gene expression. In particular, the cellular processes Salmonella interferes with in plants and humans are catabolic processes. The details of how these processes are targeted, however, are quite different between the two organisms, as expected based on their evolutionary and habitat differences. Possible implications of this observation on evolution of host-pathogen communication are discussed.
Affiliation(s)
- Sylvia Schleker
- Klein-Seetharaman Laboratory, Division of Metabolic and Vascular Health, Warwick Medical School, University of Warwick, Coventry, UK; Department of Molecular Phytomedicine, Institute of Crop Science and Resource Conservation, University of Bonn, Bonn, Germany
- Meghana Kshirsagar
- Language Technologies Institute, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA
- Judith Klein-Seetharaman
- Klein-Seetharaman Laboratory, Division of Metabolic and Vascular Health, Warwick Medical School, University of Warwick, Coventry, UK
|
47
|
Subramanian N, Torabi-Parizi P, Gottschalk RA, Germain RN, Dutta B. Network representations of immune system complexity. WILEY INTERDISCIPLINARY REVIEWS-SYSTEMS BIOLOGY AND MEDICINE 2015; 7:13-38. [PMID: 25625853 PMCID: PMC4339634 DOI: 10.1002/wsbm.1288] [Citation(s) in RCA: 50] [Impact Index Per Article: 5.6] [Received: 08/25/2014] [Revised: 12/09/2014] [Accepted: 12/11/2014] [Indexed: 12/25/2022]
Abstract
The mammalian immune system is a dynamic multiscale system composed of a hierarchically organized set of molecular, cellular, and organismal networks that act in concert to promote effective host defense. These networks range from those involving gene regulatory and protein–protein interactions underlying intracellular signaling pathways and single‐cell responses to increasingly complex networks of in vivo cellular interaction, positioning, and migration that determine the overall immune response of an organism. Immunity is thus not the product of simple signaling events but rather nonlinear behaviors arising from dynamic, feedback‐regulated interactions among many components. One of the major goals of systems immunology is to quantitatively measure these complex multiscale spatial and temporal interactions, permitting development of computational models that can be used to predict responses to perturbation. Recent technological advances permit collection of comprehensive datasets at multiple molecular and cellular levels, while advances in network biology support representation of the relationships of components at each level as physical or functional interaction networks. The latter facilitate effective visualization of patterns and recognition of emergent properties arising from the many interactions of genes, molecules, and cells of the immune system. We illustrate the power of integrating ‘omics’ and network modeling approaches for unbiased reconstruction of signaling and transcriptional networks with a focus on applications involving the innate immune system. We further discuss future possibilities for reconstruction of increasingly complex cellular‐ and organism‐level networks and development of sophisticated computational tools for prediction of emergent immune behavior arising from the concerted action of these networks. WIREs Syst Biol Med 2015, 7:13–38. doi: 10.1002/wsbm.1288 This article is categorized under:
Analytical and Computational Methods > Computational Methods Laboratory Methods and Technologies > Macromolecular Interactions, Methods
Affiliation(s)
- Naeha Subramanian
- Institute for Systems Biology, Seattle, WA, USA; Laboratory of Systems Biology, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD, USA
|
48
|
Mei S, Zhu H. A novel one-class SVM based negative data sampling method for reconstructing proteome-wide HTLV-human protein interaction networks. Sci Rep 2015; 5:8034. [PMID: 25620466 PMCID: PMC5379509 DOI: 10.1038/srep08034] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.6] [Received: 09/02/2014] [Accepted: 12/22/2014] [Indexed: 11/09/2022]
Abstract
Protein-protein interaction (PPI) prediction is generally treated as a problem of binary classification wherein negative data sampling is still an open problem to be addressed. The commonly used random sampling is prone to yield less representative negative data with considerable false negatives. Meanwhile rational constraints are seldom exerted on model selection to reduce the risk of false positive predictions for most of the existing computational methods. In this work, we propose a novel negative data sampling method based on one-class SVM (support vector machine, SVM) to predict proteome-wide protein interactions between HTLV retrovirus and Homo sapiens, wherein one-class SVM is used to choose reliable and representative negative data, and two-class SVM is used to yield proteome-wide outcomes as predictive feedback for rational model selection. Computational results suggest that one-class SVM is more suited to be used as negative data sampling method than two-class PPI predictor, and the predictive feedback constrained model selection helps to yield a rational predictive model that reduces the risk of false positive predictions. Some predictions have been validated by the recent literature. Lastly, gene ontology based clustering of the predicted PPI networks is conducted to provide valuable cues for the pathogenesis of HTLV retrovirus.
Affiliation(s)
- Suyu Mei
- [1] Software College, Shenyang Normal University, Shenyang, 110034, China [2] Bioinformatics Section, School of Biomedical Sciences, Southern Medical University, Guangzhou, 510515, China
- Hao Zhu
- Bioinformatics Section, School of Biomedical Sciences, Southern Medical University, Guangzhou, 510515, China
|
49
|
Bandyopadhyay S, Ray S, Mukhopadhyay A, Maulik U. A review of in silico approaches for analysis and prediction of HIV-1-human protein-protein interactions. Brief Bioinform 2014; 16:830-51. [PMID: 25479794 DOI: 10.1093/bib/bbu041] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.7] [Received: 06/22/2014] [Indexed: 12/19/2022]
Abstract
The computational or in silico approaches for analysing the HIV-1-human protein-protein interaction (PPI) network, predicting different host cellular factors and PPIs and discovering several pathways are gaining popularity in the field of HIV research. Although there exist quite a few studies in this regard, no previous effort has been made to review these works in a comprehensive manner. Here we review the computational approaches that are devoted to the analysis and prediction of HIV-1-human PPIs. We have broadly categorized these studies into two fields: computational analysis of HIV-1-human PPI network and prediction of novel PPIs. We have also presented a comparative assessment of these studies and proposed some methodologies for discussing the implication of their results. We have also reviewed different computational techniques for predicting HIV-1-human PPIs and provided a comparative study of their applicability. We believe that our effort will provide helpful insights to the HIV research community.
|
50
|
Barman RK, Saha S, Das S. Prediction of interactions between viral and host proteins using supervised machine learning methods. PLoS One 2014; 9:e112034. [PMID: 25375323 PMCID: PMC4223108 DOI: 10.1371/journal.pone.0112034] [Citation(s) in RCA: 56] [Impact Index Per Article: 5.6] [Received: 04/14/2014] [Accepted: 10/11/2014] [Indexed: 12/15/2022]
Abstract
BACKGROUND Viral-host protein-protein interaction plays a vital role in pathogenesis, since it defines viral infection of the host and regulation of the host proteins. Identification of key viral-host protein-protein interactions (PPIs) has great implication for therapeutics. METHODS In this study, a systematic attempt has been made to predict viral-host PPIs by integrating different features, including domain-domain association, network topology and sequence information using viral-host PPIs from VirusMINT. The three well-known supervised machine learning methods, such as SVM, Naïve Bayes and Random Forest, which are commonly used in the prediction of PPIs, were employed to evaluate the performance measure based on five-fold cross validation techniques. RESULTS Out of 44 descriptors, best features were found to be domain-domain association and methionine, serine and valine amino acid composition of viral proteins. In this study, SVM-based method achieved better sensitivity of 67% over Naïve Bayes (37.49%) and Random Forest (55.66%). However the specificity of Naïve Bayes was the highest (99.52%) as compared with SVM (74%) and Random Forest (89.08%). Overall, the SVM and Random Forest achieved accuracy of 71% and 72.41%, respectively. The proposed SVM-based method was evaluated on blind dataset and attained a sensitivity of 64%, specificity of 83%, and accuracy of 74%. In addition, unknown potential targets of hepatitis B virus-human and hepatitis E virus-human PPIs have been predicted through proposed SVM model and validated by gene ontology enrichment analysis. Our proposed model shows that, hepatitis B virus "C protein" binds to membrane docking protein, while "X protein" and "P protein" interacts with cell-killing and metabolic process proteins, respectively. CONCLUSION The proposed method can predict large scale interspecies viral-human PPIs. 
The nature and function of unknown viral proteins (HBV and HEV), interacting partners of host protein were identified using optimised SVM model.
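The sensitivity, specificity, and accuracy figures quoted in this abstract follow from confusion-matrix counts, and a short sketch makes the relationship concrete. The counts below are hypothetical: the abstract does not give the size or class balance of the blind dataset, so a balanced set of 100 interacting and 100 non-interacting pairs is assumed purely for illustration, with counts chosen to reproduce the reported 64% sensitivity and 83% specificity.

```python
def binary_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)                # true positive rate
    specificity = tn / (tn + fp)                # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)  # fraction of all pairs classified correctly
    return sensitivity, specificity, accuracy


# Hypothetical balanced blind set (assumption): 100 positives, 100 negatives.
sens, spec, acc = binary_metrics(tp=64, fn=36, tn=83, fp=17)
print(f"sensitivity={sens:.2%} specificity={spec:.2%} accuracy={acc:.2%}")
```

Under this balanced-class assumption, accuracy is the mean of sensitivity and specificity (73.5% here), which is in line with the 74% reported for the blind dataset. With unbalanced classes the three numbers would no longer be related this simply.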
Affiliation(s)
- Ranjan Kumar Barman
- Biomedical Informatics Centre, National Institute of Cholera and Enteric Diseases, Kolkata, West Bengal, India
- Sudipto Saha
- Bioinformatics Centre, Bose Institute, Kolkata, West Bengal, India
- Santasabuj Das
- Biomedical Informatics Centre, National Institute of Cholera and Enteric Diseases, Kolkata, West Bengal, India
- Division of Clinical Medicine, National Institute of Cholera and Enteric Diseases, Kolkata, West Bengal, India
|