1
|
Lakhanpal G, Tiwari H, Shukla MK, Kumar D. In silico exploration of hypothetical proteins in Neisseria gonorrhoeae for identification of therapeutic targets. In Silico Pharmacol 2024; 12:10. [PMID: 38327876 PMCID: PMC10844189 DOI: 10.1007/s40203-023-00186-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/19/2023] [Accepted: 12/22/2023] [Indexed: 02/09/2024]
Abstract
Neisseria gonorrhoeae, a World Health Organization (WHO) declared superbug and the second-most frequent cause of bacterial sexually transmitted infections worldwide, is responsible for gonorrhea. Hypothetical proteins are gene products that are predicted to be encoded by a particular gene based on the DNA sequence, but their specific functions and characteristics have not been experimentally determined or verified. In the context of this research, annotating hypothetical proteins is crucial for identifying their potential as therapeutic targets. Without proper annotation, these proteins would remain poorly characterized, hindering efforts to understand their roles in disease. The methodology used aims to bridge this gap by employing algorithm-based tools and software to annotate hypothetical proteins and assess their suitability as therapeutic targets based on factors such as essentiality, virulence, subcellular localization, and druggability. Out of 716 N. gonorrhoeae hypothetical proteins reported in UniProt, assessment of these factors filtered and prioritized the hypothetical proteins for further therapeutic exploration and led to 5 proteins being chosen as targets. The molecular docking studies identified 10 hits against the five targets. In conclusion, this study aided in the identification of targets and hit compounds for therapeutic targeting of gonorrhea. Supplementary Information: The online version contains supplementary material available at 10.1007/s40203-023-00186-w.
Affiliation(s)
- Harshita Tiwari
  - Drug Chemistry Research Centre, Kanadia Road, Indore, Madhya Pradesh 452003, India
- Monu Kumar Shukla
  - Department of Pharmaceutical Chemistry, School of Pharmaceutical Sciences, Shoolini University, Solan, Himachal Pradesh 173212, India
- Deepak Kumar
  - Department of Pharmaceutical Chemistry, School of Pharmaceutical Sciences, Shoolini University, Solan, Himachal Pradesh 173212, India
|
2
|
Chandrasekharan G, Unnikrishnan M. High throughput methods to study protein-protein interactions during host-pathogen interactions. Eur J Cell Biol 2024; 103:151393. [PMID: 38306772 DOI: 10.1016/j.ejcb.2024.151393] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/29/2023] [Revised: 01/18/2024] [Accepted: 01/21/2024] [Indexed: 02/04/2024]
Abstract
The ability of a pathogen to survive and cause an infection is often determined by specific interactions between the host and pathogen proteins. Such interactions can be both intra- and extracellular and may define the outcome of an infection. There are a range of innovative biochemical, biophysical and bioinformatic techniques currently available to identify protein-protein interactions (PPI) between the host and the pathogen. However, the complexity and the diversity of host-pathogen PPIs has led to the development of several high throughput (HT) techniques that enable the study of multiple interactions at once and/or screen multiple samples at the same time, in an unbiased manner. We review here the major HT laboratory-based technologies employed for host-bacterial interaction studies.
Affiliation(s)
- Meera Unnikrishnan
  - Division of Biomedical Sciences, University of Warwick, Coventry CV4 7AL, United Kingdom
|
3
|
Zhang Z, Lu C, Mo B, Bai K, Ge XY, Deng L, Peng Y. Prediction of mammalian virus cross-species transmission based on host proteins. Microbiol Spectr 2023; 11:e0536822. [PMID: 37754753 PMCID: PMC10581197 DOI: 10.1128/spectrum.05368-22] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/06/2023] [Accepted: 08/04/2023] [Indexed: 09/28/2023]
Abstract
Most emerging viruses are spilled over from mammals. Understanding the mechanism of virus cross-species transmission and identifying zoonotic viruses before their emergence are critical for the prevention and control of newly emerging viruses. This study systematically investigated the host proteins associated with the cross-species transmission of mammalian viruses based on 1,271 pairs of virus-mammal interactions including 382 viruses from 33 viral families and 73 mammal species from 11 orders. Numerous host proteins were found to contribute to the cross-species transmission of mammalian viruses. Host proteins potentially contributing to virus cross-species transmission are specific to viral families, and few overlaps of such host proteins are observed in different viral families. Based on these host proteins, the random-forest (RF) models were built to predict the cross-species transmission potential of mammalian viruses. Moderate performance was obtained when using all viruses together. However, when modeling by viral family, the performance of the RF models varied much among viral families. In 13 viral families such as Flaviviridae, Retroviridae, and Poxviridae, the AUC of the RF model was greater than 0.8. Finally, the contribution of virus receptors to cross-species transmission was evaluated, and the virus receptor was found to have a minor effect in predicting the cross-species transmission of mammalian viruses. The study deepens our understanding of the mechanism of virus cross-species transmission and provides a framework for predicting the cross-species transmission of mammalian viruses. IMPORTANCE Emerging viruses pose serious threats to humans. Understanding the mechanism of virus cross-species transmission and identifying zoonotic viruses before their emergence are critical for the prevention and control of emerging viruses. 
This study systematically identified host factors associated with cross-species transmission of mammalian viruses and further built machine-learning models for predicting cross-species transmission of the viruses based on host factors including virus receptors. The study not only deepens our understanding of the mechanism of virus cross-species transmission but also provides a framework for predicting the cross-species transmission of mammalian viruses based on host factors.
Affiliation(s)
- Zheng Zhang
  - Bioinformatics Center, College of Biology, Hunan Provincial Key Laboratory of Medical Virology, Hunan University, Changsha, Hunan, China
  - Hunan Engineering and Technology Research Center for Agricultural Big Data Analysis & Decision-making, College of Plant Protection, Hunan Agricultural University, Changsha, Hunan, China
- Congyu Lu
  - Bioinformatics Center, College of Biology, Hunan Provincial Key Laboratory of Medical Virology, Hunan University, Changsha, Hunan, China
- Bocheng Mo
  - Hunan Engineering and Technology Research Center for Agricultural Big Data Analysis & Decision-making, College of Plant Protection, Hunan Agricultural University, Changsha, Hunan, China
- Kehan Bai
  - Hunan Juyoubiotech Co., Ltd, Changsha, Hunan, China
- Xing-Yi Ge
  - Bioinformatics Center, College of Biology, Hunan Provincial Key Laboratory of Medical Virology, Hunan University, Changsha, Hunan, China
- Li Deng
  - Department of Internal Medicine-Neurology, The Third Hospital of Changsha, Changsha, Hunan, China
- Yousong Peng
  - Bioinformatics Center, College of Biology, Hunan Provincial Key Laboratory of Medical Virology, Hunan University, Changsha, Hunan, China
|
4
|
Martins YC, Ziviani A, Cerqueira e Costa MDO, Cavalcanti MCR, Nicolás MF, de Vasconcelos ATR. PPIntegrator: semantic integrative system for protein-protein interaction and application for host-pathogen datasets. Bioinformatics Advances 2023; 3:vbad067. [PMID: 37359724 PMCID: PMC10290227 DOI: 10.1093/bioadv/vbad067] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/08/2022] [Revised: 04/28/2023] [Accepted: 05/30/2023] [Indexed: 06/28/2023]
Abstract
Summary Semantic web standards have shown importance in the last 20 years in promoting data formalization and interlinking between the existing knowledge graphs. In this context, several ontologies and data integration initiatives have emerged in recent years for the biological area, such as the broadly used Gene Ontology that contains metadata to annotate gene function and subcellular location. Another important subject in the biological area is protein-protein interactions (PPIs) which have applications like protein function inference. Current PPI databases have heterogeneous exportation methods that challenge their integration and analysis. Presently, several initiatives of ontologies covering some concepts of the PPI domain are available to promote interoperability across datasets. However, the efforts to stimulate guidelines for automatic semantic data integration and analysis for PPIs in these datasets are limited. Here, we present PPIntegrator, a system that semantically describes data related to protein interactions. We also introduce an enrichment pipeline to generate, predict and validate new potential host-pathogen datasets by transitivity analysis. PPIntegrator contains a data preparation module to organize data from three reference databases and a triplification and data fusion module to describe the provenance information and results. This work provides an overview of the PPIntegrator system applied to integrate and compare host-pathogen PPI datasets from four bacterial species using our proposed transitivity analysis pipeline. We also demonstrated some critical queries to analyze this kind of data and highlight the importance and usage of the semantic data generated by our system. Availability and implementation https://github.com/YasCoMa/ppintegrator, https://github.com/YasCoMa/ppi_validation_process and https://github.com/YasCoMa/predprin.
Affiliation(s)
- Yasmmin Côrtes Martins
  - Bioinformatics Laboratory, National Laboratory for Scientific Computing, Petrópolis 25651-076, Brazil
- Artur Ziviani
  - Data Extreme Laboratory (DEXL), National Laboratory for Scientific Computing, Petrópolis 25651-076, Brazil
- Marisa Fabiana Nicolás
  - Bioinformatics Laboratory, National Laboratory for Scientific Computing, Petrópolis 25651-076, Brazil
|
5
|
Karan B, Mahapatra S, Sahu SS, Pandey DM, Chakravarty S. Computational models for prediction of protein-protein interaction in rice and Magnaporthe grisea. Frontiers in Plant Science 2023; 13:1046209. [PMID: 36816487 PMCID: PMC9929577 DOI: 10.3389/fpls.2022.1046209] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/16/2022] [Accepted: 12/28/2022] [Indexed: 06/18/2023]
Abstract
INTRODUCTION Plant-microbe interactions play a vital role in the development of strategies to manage pathogen-induced destructive diseases that cause enormous crop losses every year. Rice blast is one of the severe diseases to rice Oryza sativa (O. sativa) due to Magnaporthe grisea (M. grisea) fungus. Protein-protein interaction (PPI) between rice and fungus plays a key role in causing rice blast disease. METHODS In this paper, four genomic information-based models such as (i) the interolog, (ii) the domain, (iii) the gene ontology, and (iv) the phylogenetic-based model are developed for predicting the interaction between O. sativa and M. grisea in a whole-genome scale. RESULTS AND DISCUSSION A total of 59,430 interacting pairs between 1,801 rice proteins and 135 blast fungus proteins are obtained from the four models. Furthermore, a machine learning model is developed to assess the predicted interactions. Using composition-based amino acid composition (AAC) and conjoint triad (CT) features, an accuracy of 88% and 89% is achieved, respectively. When tested on the experimental dataset, the CT feature provides the highest accuracy of 95%. Furthermore, the specificity of the model is verified with other pathogen-host datasets where less accuracy is obtained, which confirmed that the model is specific to O. sativa and M. grisea. Understanding the molecular processes behind rice resistance to blast fungus begins with the identification of PPIs, and these predicted PPIs will be useful for drug design in the plant science community.
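The two sequence features named in this abstract, amino acid composition (AAC) and conjoint triad (CT), can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the seven-class residue grouping, and the simple frequency normalization are assumptions chosen for illustration, and the abstract does not say how host and pathogen vectors were combined (concatenation is assumed here).

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Seven residue classes often used for conjoint triad features.
# Assumption: this grouping is illustrative; verify it against the
# feature definition you actually follow.
CT_CLASSES = ["AGV", "ILFP", "YMTS", "HNQW", "RK", "DE", "C"]
RESIDUE_TO_CLASS = {aa: i for i, group in enumerate(CT_CLASSES) for aa in group}


def aac(seq):
    """Return the 20-dimensional amino acid composition of a sequence."""
    residues = [aa for aa in seq.upper() if aa in RESIDUE_TO_CLASS]
    counts = Counter(residues)
    n = len(residues)
    return [counts[aa] / n if n else 0.0 for aa in AMINO_ACIDS]


def conjoint_triad(seq):
    """Return the 343-dimensional frequency of residue-class triads."""
    classes = [RESIDUE_TO_CLASS[aa] for aa in seq.upper() if aa in RESIDUE_TO_CLASS]
    # Count every window of three consecutive residue classes.
    counts = Counter(zip(classes, classes[1:], classes[2:]))
    total = sum(counts.values())
    return [counts[t] / total if total else 0.0
            for t in product(range(7), repeat=3)]


def pair_features(host_seq, pathogen_seq):
    """Concatenate host and pathogen features into one vector (assumed scheme)."""
    return (aac(host_seq) + conjoint_triad(host_seq)
            + aac(pathogen_seq) + conjoint_triad(pathogen_seq))
```

A vector built this way could be passed to any standard classifier; non-standard residue letters are silently skipped in this sketch.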
Affiliation(s)
- Biswajit Karan
  - Department of Electronics and Communication Engineering, Birla Institute of Technology, Ranchi, India
- Satyajit Mahapatra
  - Department of Electronics and Communication Engineering, Birla Institute of Technology, Ranchi, India
- Sitanshu Sekhar Sahu
  - Department of Electronics and Communication Engineering, Birla Institute of Technology, Ranchi, India
- Dev Mani Pandey
  - Department of Bioengineering and Biotechnology, Birla Institute of Technology, Ranchi, India
- Sumit Chakravarty
  - Department of Electrical and Computer Engineering, Kennesaw State University, Kennesaw, GA, United States
|
6
|
Jain A, Mittal S, Tripathi LP, Nussinov R, Ahmad S. Host-pathogen protein-nucleic acid interactions: A comprehensive review. Comput Struct Biotechnol J 2022; 20:4415-4436. [PMID: 36051878 PMCID: PMC9420432 DOI: 10.1016/j.csbj.2022.08.001] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 03/30/2022] [Revised: 08/01/2022] [Accepted: 08/01/2022] [Indexed: 12/02/2022]
Abstract
Recognition of pathogen-derived nucleic acids by host cells is an effective host strategy to detect pathogenic invasion and trigger immune responses. In the context of pathogen-specific pharmacology, there is a growing interest in mapping the interactions between pathogen-derived nucleic acids and host proteins. Insight into the principles of the structural and immunological mechanisms underlying such interactions and their roles in host defense is necessary to guide therapeutic intervention. Here, we discuss the newest advances in studies of molecular interactions involving pathogen nucleic acids and host factors, including their drug design, molecular structure and specific patterns. We observed that two groups of nucleic acid recognizing molecules, Toll-like receptors (TLRs) and the cytoplasmic retinoic acid-inducible gene (RIG)-I-like receptors (RLRs) form the backbone of host responses to pathogen nucleic acids, with additional support provided by absent in melanoma 2 (AIM2) and DNA-dependent activator of Interferons (IFNs)-regulatory factors (DAI) like cytosolic activity. We review the structural, immunological, and other biological aspects of these representative groups of molecules, especially in terms of their target specificity and affinity and challenges in leveraging host-pathogen protein-nucleic acid interactions (HP-PNI) in drug discovery.
Affiliation(s)
- Anuja Jain
  - School of Computational and Integrative Sciences, Jawaharlal Nehru University, New Delhi 110067, India
- Shikha Mittal
  - School of Computational and Integrative Sciences, Jawaharlal Nehru University, New Delhi 110067, India
  - Department of Biotechnology and Bioinformatics, Jaypee University of Information Technology, Waknaghat, Solan, Himachal Pradesh, 173234, India
- Lokesh P. Tripathi
  - National Institutes of Biomedical Innovation, Health and Nutrition, Ibaraki, Osaka, Japan
  - Riken Center for Integrative Medical Sciences, Tsurumi, Yokohama, Kanagawa, Japan
- Ruth Nussinov
  - Computational Structural Biology Section, Basic Science Program, Frederick National Laboratory for Cancer Research, Frederick, MD 21702, USA
  - Department of Human Molecular Genetics and Biochemistry, Sackler School of Medicine, Tel Aviv University, Israel
- Shandar Ahmad
  - School of Computational and Integrative Sciences, Jawaharlal Nehru University, New Delhi 110067, India
|
7
|
Priyamvada P, Debroy R, Anbarasu A, Ramaiah S. A comprehensive review on genomics, systems biology and structural biology approaches for combating antimicrobial resistance in ESKAPE pathogens: computational tools and recent advancements. World J Microbiol Biotechnol 2022; 38:153. [PMID: 35788443 DOI: 10.1007/s11274-022-03343-z] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 05/11/2022] [Accepted: 06/21/2022] [Indexed: 12/11/2022]
Abstract
In recent decades, antimicrobial resistance has been augmented as a global concern to public health owing to the global spread of multidrug-resistant strains from different ESKAPE pathogens. This alarming trend and the lack of new antibiotics with novel modes of action in the pipeline necessitate the development of non-antibiotic ways to treat illnesses caused by these isolates. In molecular biology, computational approaches have become crucial tools, particularly in one of the most challenging areas of multidrug resistance. The rapid advancements in bioinformatics have led to a plethora of computational approaches involving genomics, systems biology, and structural biology currently gaining momentum among molecular biologists since they can be useful and provide valuable information on the complex mechanisms of AMR research in ESKAPE pathogens. These computational approaches would be helpful in elucidating the AMR mechanisms, identifying important hub genes/proteins, and their promising targets together with their interactions with important drug targets, which is a crucial step in drug discovery. Therefore, the present review aims to provide holistic information on currently employed bioinformatic tools and their application in the discovery of multifunctional novel therapeutic drugs to combat the current problem of AMR in ESKAPE pathogens. The review also summarizes the recent advancement in the AMR research in ESKAPE pathogens utilizing the in silico approaches.
Affiliation(s)
- P Priyamvada
  - Medical and Biological Computing Laboratory, School of Biosciences and Technology (SBST), Vellore Institute of Technology (VIT), 632014, Vellore, India
  - Department of Bio-Sciences, SBST, VIT, 632014, Vellore, India
- Reetika Debroy
  - Medical and Biological Computing Laboratory, School of Biosciences and Technology (SBST), Vellore Institute of Technology (VIT), 632014, Vellore, India
  - Department of Bio-Medical Sciences, SBST, VIT, 632014, Vellore, India
- Anand Anbarasu
  - Medical and Biological Computing Laboratory, School of Biosciences and Technology (SBST), Vellore Institute of Technology (VIT), 632014, Vellore, India
  - Department of Biotechnology, SBST, VIT, 632014, Vellore, India
- Sudha Ramaiah
  - Medical and Biological Computing Laboratory, School of Biosciences and Technology (SBST), Vellore Institute of Technology (VIT), 632014, Vellore, India
  - Department of Bio-Sciences, SBST, VIT, 632014, Vellore, India
  - School of Biosciences and Technology VIT, 632014, Vellore, Tamil Nadu, India
|
8
|
Goel P, Panchal T, Kaushik N, Chauhan R, Saini S, Ahuja V, Thakur CJ. In silico functional and structural characterization revealed virulent proteins of Francisella tularensis strain SCHU4. Molecular Biology Research Communications 2022; 11:73-84. [PMID: 36059929 PMCID: PMC9336787 DOI: 10.22099/mbrc.2022.43128.1719] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2022]
Abstract
Francisella tularensis is a pathogenic, aerobic gram-negative coccobacillus bacterium. It is the causative agent of tularemia, a rare infectious disease that can attack skin, lungs, eyes, and lymph nodes. The genome of F. tularensis has been sequenced, and ~16% of the proteome is still uncharacterized. Characterization of these proteins is essential to find new drug targets for better therapeutics. In silico characterization of proteins has become an extremely important approach to determine the functionality of proteins, as experimental functional elucidation is unable to keep pace with the current growth of the sequence database. Initially, we annotated 577 Hypothetical Proteins (HPs) of F. tularensis strain SCHU4 with seven bioinformatics tools, which characterized them based on family, domain and motif. Out of 577 HPs, 119 HPs were annotated by five or more tools and were further screened to predict their virulence properties, subcellular localization, transmembrane helices and physicochemical parameters. VirulentPred predicted 66 HPs out of 119 as virulent. These virulent proteins were analyzed with STRING to find their interacting partners, and proteins with high-confidence interaction scores were used to predict their 3D structures using Phyre2. The three virulent proteins Q5NH99 (phosphoserine phosphatase), Q5NG42 (cystathionine beta-synthase) and Q5NG83 (Rrf2-type helix-turn-helix domain) were predicted to be involved in modulation of cytoskeletal and innate immunity of the host, H2S (hydrogen sulfide) based antibiotic tolerance, and nitrite and iron metabolism of bacteria. The above predicted virulent proteins can serve as novel drug targets in the era of antibiotic resistance.
Affiliation(s)
- Prerna Goel
  - Department of Bioinformatics, Goswami Ganesh Dutta Sanatan Dharma College, Sector 32 C, Chandigarh, India
- Tanya Panchal
  - Department of Bioinformatics, Goswami Ganesh Dutta Sanatan Dharma College, Sector 32 C, Chandigarh, India
- Nandini Kaushik
  - Department of Bioinformatics, Goswami Ganesh Dutta Sanatan Dharma College, Sector 32 C, Chandigarh, India
- Ritika Chauhan
  - Department of Bioinformatics, Goswami Ganesh Dutta Sanatan Dharma College, Sector 32 C, Chandigarh, India
- Sandeep Saini
  - Department of Bioinformatics, Goswami Ganesh Dutta Sanatan Dharma College, Sector 32 C, Chandigarh, India
  - Department of Biophysics, Panjab University, Sector 25, 160014, Chandigarh, India
- Vartika Ahuja
  - Department of Bioinformatics, Goswami Ganesh Dutta Sanatan Dharma College, Sector 32 C, Chandigarh, India
- Chander Jyoti Thakur
  - Department of Bioinformatics, Goswami Ganesh Dutta Sanatan Dharma College, Sector 32 C, Chandigarh, India
  - Corresponding Author: Department of Bioinformatics, Goswami Ganesh Dutta Sanatan Dharma College Sector 32 C, Chandigarh, India, 160030. Tel: +91 8699776533; Fax: +91 1722661077, E. mail:
|
9
|
Ma Y, He T, Tan Y, Jiang X. Seq-BEL: Sequence-Based Ensemble Learning for Predicting Virus-Human Protein-Protein Interaction. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:1322-1333. [PMID: 32750886 DOI: 10.1109/tcbb.2020.3008157] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 06/11/2023]
Abstract
Infectious diseases are currently the most important and widespread health problem, and identifying viral infection mechanisms is critical for controlling diseases caused by highly infectious viruses. Because of the lack of non-interactive protein pairs and serious imbalance between positive and negative sample ratios, the supervised learning algorithm is not suitable for prediction. At the same time, due to the lack of information on viral proteins and significant dissimilarity in sequence, some ensemble learning models have poor generalization ability. In this paper, we propose a Sequence-Based Ensemble Learning (Seq-BEL) method to predict the potential virus-human PPIs. Specifically, based on the amino acid sequence of proteins and the currently known virus-human PPI network, Seq-BEL calculates various features and similarities of human proteins and viral proteins, and then combines these similarities and features to score the potential of virus-human PPIs. The computational results show that Seq-BEL achieves success in predicting potential virus-human PPIs and outperforms other state-of-the-art methods. More importantly, Seq-BEL also has good predictive performance for new human proteins and new viral proteins. In addition, the model has the advantages of strong robustness and good generalization ability, and can be used as an effective tool for virus-human PPI prediction.
|
10
|
Integrated analysis of microbe-host interactions in Crohn’s disease reveals potential mechanisms of microbial proteins on host gene expression. iScience 2022; 25:103963. [PMID: 35479407 PMCID: PMC9035720 DOI: 10.1016/j.isci.2022.103963] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/03/2021] [Revised: 12/11/2021] [Accepted: 02/18/2022] [Indexed: 12/15/2022]
|
11
|
Kaundal R, Loaiza CD, Duhan N, Flann N. deepHPI: a comprehensive deep learning platform for accurate prediction and visualization of host-pathogen protein-protein interactions. Brief Bioinform 2022; 23:6576450. [PMID: 35511057 DOI: 10.1093/bib/bbac125] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 09/12/2021] [Revised: 02/07/2022] [Accepted: 03/15/2022] [Indexed: 01/06/2023]
Abstract
Host-pathogen protein interactions (HPPIs) play vital roles in many biological processes and are directly involved in infectious diseases. With the outbreak of more frequent pandemics in the last couple of decades, such as the recent outburst of Covid-19 causing millions of deaths, it has become more critical to develop advanced methods to accurately predict pathogen interactions with their respective hosts. During the last decade, experimental methods to identify HPIs have been used to decipher host-pathogen systems with the caveat that those techniques are labor-intensive, expensive and time-consuming. Alternatively, accurate prediction of HPIs can be performed by the use of data-driven machine learning. To provide a more robust and accurate solution for the HPI prediction problem, we have developed a deepHPI tool based on deep learning. The web server delivers four host-pathogen model types: plant-pathogen, human-bacteria, human-virus and animal-pathogen, leveraging its operability to a wide range of analyses and cases of use. The deepHPI web tool is the first to use convolutional neural network models for HPI prediction. These models have been selected based on a comprehensive evaluation of protein features and neural network architectures. The best prediction models have been tested on independent validation datasets, which achieved an overall Matthews correlation coefficient value of 0.87 for animal-pathogen using the combined pseudo-amino acid composition and conjoint triad (PAAC_CT) features, 0.75 for human-bacteria using the combined pseudo-amino acid composition, conjoint triad and normalized Moreau-Broto feature (PAAC_CT_NMBroto), 0.96 for human-virus using PAAC_CT_NMBroto and 0.94 values for plant-pathogen interactions using the combined pseudo-amino acid composition, composition and transition feature (PAAC_CTDC_CTDT). 
Our server running deepHPI is deployed on a high-performance computing cluster that enables large and multiple user requests, and it provides more information about interactions discovered. It presents an enriched visualization of the resulting host-pathogen networks that is augmented with external links to various protein annotation resources. We believe that the deepHPI web server will be very useful to researchers, particularly those working on infectious diseases. Additionally, many novel and known host-pathogen systems can be further investigated to significantly advance our understanding of complex disease-causing agents. The developed models are established on a web server, which is freely accessible at http://bioinfo.usu.edu/deepHPI/.
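The Matthews correlation coefficient (MCC) values quoted in this abstract summarize a binary confusion matrix in a single number between -1 and 1. As a hedged illustration only (the function name and the zero-denominator convention are assumptions, not taken from deepHPI), the standard formula can be computed as follows:

```python
import math


def matthews_corrcoef_from_counts(tp, tn, fp, fn):
    """Compute MCC from true/false positive and negative counts.

    MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention assumed here: return 0.0 when any marginal total is zero,
    # since the coefficient is otherwise undefined.
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator
```

Unlike plain accuracy, this score stays near zero for a classifier that only predicts the majority class, which is why it is often reported for imbalanced interaction datasets.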
Affiliation(s)
- Rakesh Kaundal
  - Bioinformatics Facility, Center for Integrated BioSystems, College of Agriculture and Applied Sciences
  - Department of Plants, Soils, and Climate, College of Agriculture and Applied Sciences
  - Department of Computer Science, College of Science; Utah State University, Logan, 84322 USA
- Cristian D Loaiza
  - Bioinformatics Facility, Center for Integrated BioSystems, College of Agriculture and Applied Sciences
  - Department of Plants, Soils, and Climate, College of Agriculture and Applied Sciences
- Naveen Duhan
  - Bioinformatics Facility, Center for Integrated BioSystems, College of Agriculture and Applied Sciences
  - Department of Plants, Soils, and Climate, College of Agriculture and Applied Sciences
- Nicholas Flann
  - Department of Computer Science, College of Science; Utah State University, Logan, 84322 USA
|
12
|
Hu RS, Hesham AEL, Zou Q. Machine Learning and Its Applications for Protozoal Pathogens and Protozoal Infectious Diseases. Front Cell Infect Microbiol 2022; 12:882995. [PMID: 35573796 PMCID: PMC9097758 DOI: 10.3389/fcimb.2022.882995] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/24/2022] [Accepted: 03/28/2022] [Indexed: 12/24/2022]
Abstract
In recent years, massive attention has been attracted to the development and application of machine learning (ML) in the field of infectious diseases, not only serving as a catalyst for academic studies but also as a key means of detecting pathogenic microorganisms, implementing public health surveillance, exploring host-pathogen interactions, discovering drug and vaccine candidates, and so forth. These applications also include the management of infectious diseases caused by protozoal pathogens, such as Plasmodium, Trypanosoma, Toxoplasma, Cryptosporidium, and Giardia, a class of fatal or life-threatening causative agents capable of infecting humans and a wide range of animals. With the reduction of computational cost, availability of effective ML algorithms, popularization of ML tools, and accumulation of high-throughput data, it is possible to implement the integration of ML applications into increasing scientific research related to protozoal infection. Here, we will present a brief overview of important concepts in ML serving as background knowledge, with a focus on basic workflows, popular algorithms (e.g., support vector machine, random forest, and neural networks), feature extraction and selection, and model evaluation metrics. We will then review current ML applications and major advances concerning protozoal pathogens and protozoal infectious diseases through combination with correlative biology expertise and provide forward-looking insights for perspectives and opportunities in future advances in ML techniques in this field.
Affiliation(s)
- Rui-Si Hu
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
- Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, China
| | - Abd El-Latif Hesham
- Genetics Department, Faculty of Agriculture, Beni-Suef University, Beni-Suef, Egypt
| | - Quan Zou
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
- Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, China
- *Correspondence: Quan Zou,
|
13
|
Lim H, Cankara F, Tsai CJ, Keskin O, Nussinov R, Gursoy A. Artificial intelligence approaches to human-microbiome protein–protein interactions. Curr Opin Struct Biol 2022; 73:102328. [DOI: 10.1016/j.sbi.2022.102328] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 08/18/2021] [Revised: 12/01/2021] [Accepted: 12/31/2021] [Indexed: 02/08/2023]
|
14
|
Deciphering the Host-Pathogen Interactome of the Wheat-Common Bunt System: A Step towards Enhanced Resilience in Next Generation Wheat. Int J Mol Sci 2022; 23:ijms23052589. [PMID: 35269732 PMCID: PMC8910311 DOI: 10.3390/ijms23052589] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 01/31/2022] [Accepted: 02/09/2022] [Indexed: 02/05/2023] Open
Abstract
Common bunt, caused by two fungal species, Tilletia caries and Tilletia laevis, is one of the most potentially destructive diseases of wheat. Despite the availability of synthetic chemicals against the disease, organic agriculture relies greatly on resistant cultivars. Using two computational approaches—interolog and domain-based methods—a total of approximately 58 M and 56 M probable PPIs were predicted in T. aestivum–T. caries and T. aestivum–T. laevis interactomes, respectively. We also identified 648 and 575 effectors in the interactions from T. caries and T. laevis, respectively. The major host hubs belonged to the serine/threonine protein kinase, hsp70, and mitogen-activated protein kinase families, which are actively involved in plant immune signaling during stress conditions. The Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis of the host proteins revealed significant GO terms (O-methyltransferase activity, regulation of response to stimulus, and plastid envelope) and pathways (NF-kappa B signaling and the MAPK signaling pathway) related to plant defense against pathogens. Subcellular localization suggested that most of the pathogen proteins target the host in the plastid. Furthermore, a comparison between unique T. caries and T. laevis proteins was carried out. We also identified novel host candidates that are resistant to disease. Additionally, the host proteins that serve as transcription factors were also predicted.
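The interolog approach named in this abstract transfers a known interaction between two template proteins onto a host-pathogen pair whose members are homologous to those templates. The following is a minimal sketch of that idea only, not the authors' pipeline: the function name, the input layout, and all protein identifiers are hypothetical, and the homolog maps are assumed to come from a separate sequence-similarity search.

```python
from collections import defaultdict


def predict_interologs(template_ppis, host_homologs, pathogen_homologs):
    """Transfer template interactions onto host-pathogen protein pairs.

    template_ppis: iterable of (protein_a, protein_b) pairs from a reference interactome.
    host_homologs: dict mapping each host protein to the set of template proteins
        it is homologous to (hypothetical input, e.g. from a similarity search).
    pathogen_homologs: the same mapping for pathogen proteins.
    Returns a set of predicted (host_protein, pathogen_protein) pairs.
    """
    # Invert both maps so each template protein points at its query homologs.
    host_by_template = defaultdict(set)
    for host, templates in host_homologs.items():
        for template in templates:
            host_by_template[template].add(host)

    pathogen_by_template = defaultdict(set)
    for pathogen, templates in pathogen_homologs.items():
        for template in templates:
            pathogen_by_template[template].add(pathogen)

    predicted = set()
    for a, b in template_ppis:
        # Template interactions are treated as undirected, so try both orientations.
        for host_side, pathogen_side in ((a, b), (b, a)):
            for host in host_by_template.get(host_side, ()):
                for pathogen in pathogen_by_template.get(pathogen_side, ()):
                    predicted.add((host, pathogen))
    return predicted


# Toy identifiers, for illustration only.
templates = [("T1", "T2"), ("T3", "T4")]
host_map = {"H1": {"T1"}, "H2": {"T4"}}
pathogen_map = {"P1": {"T2", "T3"}}
pairs = predict_interologs(templates, host_map, pathogen_map)
```

Because every homolog of one template partner is paired with every homolog of the other, the number of predicted pairs can grow multiplicatively with the size of the homolog sets.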
|
15
|
Munjal NS, Sapra D, Parthasarathi KTS, Goyal A, Pandey A, Banerjee M, Sharma J. Deciphering the Interactions of SARS-CoV-2 Proteins with Human Ion Channels Using Machine-Learning-Based Methods. Pathogens 2022; 11:pathogens11020259. [PMID: 35215201 PMCID: PMC8874499 DOI: 10.3390/pathogens11020259] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/30/2021] [Revised: 01/31/2022] [Accepted: 02/08/2022] [Indexed: 01/04/2023] Open
Abstract
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is accountable for the protracted COVID-19 pandemic. Its high transmission rate and pathogenicity led to health emergencies and economic crisis. Recent studies pertaining to the understanding of the molecular pathogenesis of SARS-CoV-2 infection exhibited the indispensable role of ion channels in viral infection inside the host. Moreover, machine learning (ML)-based algorithms are providing a higher accuracy for host-SARS-CoV-2 protein–protein interactions (PPIs). In this study, PPIs of SARS-CoV-2 proteins with human ion channels (HICs) were trained on the PPI-MetaGO algorithm. PPI networks (PPINs) and a signaling pathway map of HICs with SARS-CoV-2 proteins were generated. Additionally, various U.S. food and drug administration (FDA)-approved drugs interacting with the potential HICs were identified. The PPIs were predicted with 82.71% accuracy, 84.09% precision, 84.09% sensitivity, 0.89 AUC-ROC, 65.17% Matthews correlation coefficient score (MCC) and 84.09% F1 score. Several host pathways were found to be altered, including calcium signaling and taste transduction pathway. Potential HICs could serve as an initial set to the experimentalists for further validation. The study also reinforces the drug repurposing approach for the development of host directed antiviral drugs that may provide a better therapeutic management strategy for infection caused by SARS-CoV-2.
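The evaluation metrics listed in this abstract (accuracy, precision, sensitivity, F1 score, and MCC) all derive from the four counts of a binary confusion matrix. The sketch below uses the standard textbook definitions; the function name and the example counts are hypothetical and are not taken from the study. Note that F1 is the harmonic mean of precision and sensitivity, so it equals both whenever the two coincide, which is consistent with the identical 84.09% figures reported above.

```python
import math


def _ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(tp, fp, tn, fn):
    """Compute common binary-classification metrics from confusion-matrix counts."""
    accuracy = _ratio(tp + tn, tp + fp + tn + fn)
    precision = _ratio(tp, tp + fp)
    sensitivity = _ratio(tp, tp + fn)  # also called recall
    f1 = _ratio(2 * precision * sensitivity, precision + sensitivity)
    # Matthews correlation coefficient, in the range [-1, 1].
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = _ratio(tp * tn - fp * fn, mcc_denominator)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "f1": f1,
        "mcc": mcc,
    }


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
metrics = classification_metrics(tp=40, fp=10, tn=40, fn=10)
```

With these counts, precision and sensitivity are both 40/50 = 0.8, so F1 is also 0.8, while MCC is (1600 - 100) / 2500 = 0.6.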
Affiliation(s)
- Nupur S. Munjal
- Institute of Bioinformatics, International Technology Park, Bangalore 560066, India; (N.S.M.); (D.S.); (K.T.S.P.); (A.G.)
| | - Dikscha Sapra
- Institute of Bioinformatics, International Technology Park, Bangalore 560066, India; (N.S.M.); (D.S.); (K.T.S.P.); (A.G.)
| | - K. T. Shreya Parthasarathi
- Institute of Bioinformatics, International Technology Park, Bangalore 560066, India; (N.S.M.); (D.S.); (K.T.S.P.); (A.G.)
| | - Abhishek Goyal
- Institute of Bioinformatics, International Technology Park, Bangalore 560066, India; (N.S.M.); (D.S.); (K.T.S.P.); (A.G.)
| | - Akhilesh Pandey
- Center for Molecular Medicine, National Institute of Mental Health and Neurosciences (NIMHANS), Hosur Road, Bangalore 560029, India;
- Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, MN 55905, USA
- Center for Individualized Medicine, Mayo Clinic, Rochester, MN 55905, USA
| | - Manidipa Banerjee
- Kusuma School of Biological Sciences, Indian Institute of Technology Delhi, Hauz Khas, New Delhi 110016, India;
| | - Jyoti Sharma
- Institute of Bioinformatics, International Technology Park, Bangalore 560066, India; (N.S.M.); (D.S.); (K.T.S.P.); (A.G.)
- Manipal Academy of Higher Education (MAHE), Udupi 576104, India
- Correspondence:
|
16
|
Chai H, Gu Q, Hughes J, Robertson DL. In silico prediction of HIV-1-host molecular interactions and their directionality. PLoS Comput Biol 2022; 18:e1009720. [PMID: 35134057 PMCID: PMC8856524 DOI: 10.1371/journal.pcbi.1009720] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/31/2021] [Revised: 02/18/2022] [Accepted: 12/03/2021] [Indexed: 11/18/2022] Open
Abstract
Human immunodeficiency virus type 1 (HIV-1) continues to be a major cause of disease and premature death. As with all viruses, HIV-1 exploits a host cell to replicate. Improving our understanding of the molecular interactions between virus and human host proteins is crucial for a mechanistic understanding of virus biology, infection and host antiviral activities. This knowledge will potentially permit the identification of host molecules for targeting by drugs with antiviral properties. Here, we propose a data-driven approach for the analysis and prediction of the HIV-1 interacting proteins (VIPs) with a focus on the directionality of the interaction: host-dependency versus antiviral factors. Using support vector machine learning models and features encompassing genetic, proteomic and network properties, our results reveal some significant differences between the VIPs and non-HIV-1 interacting human proteins (non-VIPs). As assessed by comparison with the HIV-1 infection pathway data in the Reactome database (sensitivity > 90%, threshold = 0.5), we demonstrate these models have good generalization properties. We find that the ‘direction’ of the HIV-1-host molecular interactions is also predictable due to different characteristics of ‘forward’/pro-viral versus ‘backward’/pro-host proteins. Additionally, we infer the previously unknown direction of the interactions between HIV-1 and 1351 human host proteins. A web server for performing predictions is available at http://hivpre.cvr.gla.ac.uk/.
Affiliation(s)
- Haiting Chai
- MRC-University of Glasgow Centre for Virus Research, Glasgow, United Kingdom
| | - Quan Gu
- MRC-University of Glasgow Centre for Virus Research, Glasgow, United Kingdom
| | - Joseph Hughes
- MRC-University of Glasgow Centre for Virus Research, Glasgow, United Kingdom
| | - David L. Robertson
- MRC-University of Glasgow Centre for Virus Research, Glasgow, United Kingdom
- * E-mail:
|
17
|
Kataria R, Kaundal R. Deciphering the Crosstalk Mechanisms of Wheat-Stem Rust Pathosystem: Genome-Scale Prediction Unravels Novel Host Targets. Front Plant Sci 2022; 13:895480. [PMID: 35800602 PMCID: PMC9253690 DOI: 10.3389/fpls.2022.895480] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 03/13/2022] [Accepted: 05/31/2022] [Indexed: 05/04/2023]
Abstract
Triticum aestivum (wheat), a major staple food grain, is affected by various biotic stresses. Among these, fungal diseases cause about 15-20% of yield loss, worldwide. In this study, we performed a comparative analysis of protein-protein interactions between two Puccinia graminis races (Pgt 21-0 and Pgt Ug99) that cause stem (black) rust in wheat. The available molecular techniques to study the host-pathogen interaction mechanisms are expensive and labor-intensive. We implemented two computational approaches (interolog and domain-based) for the prediction of PPIs and performed various functional analysis to determine the significant differences between the two pathogen races. The analysis revealed that T. aestivum-Pgt 21-0 and T. aestivum-Pgt Ug99 interactomes consisted of ∼90M and ∼56M putative PPIs, respectively. In the predicted PPIs, we identified 115 Pgt 21-0 and 34 Pgt Ug99 potential effectors that were highly involved in pathogen virulence and development. Functional enrichment analysis of the host proteins revealed significant GO terms and KEGG pathways such as O-methyltransferase activity (GO:0008171), regulation of signal transduction (GO:0009966), lignin metabolic process (GO:0009808), plastid envelope (GO:0009526), plant-pathogen interaction pathway (ko04626), and MAPK pathway (ko04016) that are actively involved in plant defense and immune signaling against the biotic stresses. Subcellular localization analysis anticipated the host plastid as a primary target for pathogen attack. The highly connected host hubs in the protein interaction network belonged to protein kinase domain including Ser/Thr protein kinase, MAPK, and cyclin-dependent kinase. We also identified 5,577 transcription factors in the interactions, associated with plant defense during biotic stress conditions. Additionally, novel host targets that are resistant to stem rust disease were also identified. 
The present study elucidates the functional differences between Pgt 21-0 and Pgt Ug99, thus providing the researchers with strain-specific information for further experimental validation of the interactions, and the development of durable, disease-resistant crop lines.
Affiliation(s)
- Raghav Kataria
- Department of Plants, Soils, and Climate, College of Agriculture and Applied Sciences, Utah State University, Logan, UT, United States
| | - Rakesh Kaundal
- Department of Plants, Soils, and Climate, College of Agriculture and Applied Sciences, Utah State University, Logan, UT, United States
- Bioinformatics Facility, Center for Integrated BioSystems, Utah State University, Logan, UT, United States
- Department of Computer Science, College of Science, Utah State University, Logan, UT, United States
- *Correspondence: Rakesh Kaundal,
|
18
|
Onisiforou A, Spyrou GM. Identification of viral-mediated pathogenic mechanisms in neurodegenerative diseases using network-based approaches. Brief Bioinform 2021; 22:bbab141. [PMID: 34237135 PMCID: PMC8574625 DOI: 10.1093/bib/bbab141] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 12/23/2020] [Revised: 03/01/2021] [Accepted: 03/23/2021] [Indexed: 12/18/2022] Open
Abstract
During the course of a viral infection, virus-host protein-protein interactions (PPIs) play a critical role in allowing viruses to replicate and survive within the host. These interspecies molecular interactions can lead to viral-mediated perturbations of the human interactome causing the generation of various complex diseases. Evidences suggest that viral-mediated perturbations are a possible pathogenic etiology in several neurodegenerative diseases (NDs). These diseases are characterized by chronic progressive degeneration of neurons, and current therapeutic approaches provide only mild symptomatic relief; therefore, there is unmet need for the discovery of novel therapeutic interventions. In this paper, we initially review databases and tools that can be utilized to investigate viral-mediated perturbations in complex NDs using network-based analysis by examining the interaction between the ND-related PPI disease networks and the virus-host PPI network. Afterwards, we present our theoretical-driven integrative network-based bioinformatics approach that accounts for pathogen-genes-disease-related PPIs with the aim to identify viral-mediated pathogenic mechanisms focusing in multiple sclerosis (MS) disease. We identified seven high centrality nodes that can act as disease communicator nodes and exert systemic effects in the MS-enriched Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways network. In addition, we identified 12 KEGG pathways, 5 Reactome pathways and 52 Gene Ontology Immune System Processes by which 80 viral proteins from eight viral species might exert viral-mediated pathogenic mechanisms in MS. Finally, our analysis highlighted the Th17 differentiation pathway, a disease communicator node and part of the 12 underlined KEGG pathways, as a key viral-mediated pathogenic mechanism and a possible therapeutic target for MS disease.
Affiliation(s)
- Anna Onisiforou
- Department of Bioinformatics, Cyprus Institute of Neurology & Genetics, and the Cyprus School of Molecular Medicine, Cyprus
| | - George M Spyrou
- Department of Bioinformatics, Cyprus Institute of Neurology & Genetics, and professor at the Cyprus School of Molecular Medicine, Cyprus
|
19
|
Ceulemans E, Ibrahim HMM, De Coninck B, Goossens A. Pathogen Effectors: Exploiting the Promiscuity of Plant Signaling Hubs. Trends Plant Sci 2021; 26:780-795. [PMID: 33674173 DOI: 10.1016/j.tplants.2021.01.005] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 10/06/2020] [Revised: 01/21/2021] [Accepted: 01/29/2021] [Indexed: 05/27/2023]
Abstract
Pathogens produce effectors to overcome plant immunity, thereby threatening crop yields and global food security. Large-scale interactomic studies have revealed that pathogens from different kingdoms of life target common plant proteins during infection, the so-called effector hubs. These hubs often play central roles in numerous plant processes through their ability to interact with multiple plant proteins. This ability arises partly from the presence of intrinsically disordered domains (IDDs) in their structure. Here, we highlight the role of the TEOSINTE BRANCHED1/CYCLOIDEA/PROLIFERATING CELL FACTOR (TCP) and JASMONATE-ZIM DOMAIN (JAZ) transcription regulator families as plant signaling and effector hubs. We consider different evolutionary hypotheses to rationalize the existence of diverse effectors sharing common targets and the possible role of IDDs in this interaction.
Affiliation(s)
- Evi Ceulemans
- Ghent University, Department of Plant Biotechnology and Bioinformatics, 9052 Ghent, Belgium; VIB, Center for Plant Systems Biology, 9052 Ghent, Belgium
| | - Heba M M Ibrahim
- Division of Crop Biotechnics, Department of Biosystems, Katholieke Universiteit (KU) Leuven, 3001 Leuven, Belgium
| | - Barbara De Coninck
- Division of Crop Biotechnics, Department of Biosystems, Katholieke Universiteit (KU) Leuven, 3001 Leuven, Belgium.
| | - Alain Goossens
- Ghent University, Department of Plant Biotechnology and Bioinformatics, 9052 Ghent, Belgium; VIB, Center for Plant Systems Biology, 9052 Ghent, Belgium.
|
20
|
Zhu J, Eid FE, Tong L, Zhao W, Wang W, Heath LS, Kang L, Cui F. Characterization of protein-protein interactions between rice viruses and vector insects. Insect Sci 2021; 28:976-986. [PMID: 32537916 DOI: 10.1111/1744-7917.12840] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 04/17/2020] [Revised: 06/09/2020] [Accepted: 06/10/2020] [Indexed: 06/11/2023]
Abstract
Planthoppers are the most notorious rice pests, because they transmit various rice viruses in a persistent-propagative manner. Protein-protein interactions (PPIs) between virus and vector are crucial for virus transmission by vector insects. However, the number of known PPIs for pairs of rice viruses and planthoppers is restricted by low throughput research methods. In this study, we applied DeNovo, a virus-host sequence-based PPI predictor, to predict potential PPIs at a genome-wide scale between three planthoppers and five rice viruses. PPIs were identified at two different confidence thresholds, referred to as low and high modes. The number of PPIs for the five planthopper-virus pairs ranged from 506 to 1985 in the low mode and from 1254 to 4286 in the high mode. After eliminating the "one-too-many" redundant interacting information, the PPIs with unique planthopper proteins were reduced to 343-724 in the low mode and 758-1671 in the high mode. Homologous analysis showed that 11 sets and 31 sets of homologous planthopper proteins were shared by all planthopper-virus interactions in the two modes, indicating that they are potential conserved vector factors essential for transmission of rice viruses. Ten PPIs between small brown planthopper and rice stripe virus (RSV) were verified using glutathione-S-transferase (GST)/His-pull down or co-immunoprecipitation assay. Five of the ten PPIs were proven positive, and three of the five SBPH proteins were confirmed to interact with RSV. The predicted PPIs provide new clues for further studies of the complicated relationship between rice viruses and their vector insects.
Affiliation(s)
- Junjie Zhu
- State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
- CAS Center for Excellence in Biotic Interactions, University of Chinese Academy of Sciences, Beijing, China
| | | | - Lu Tong
- State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
- CAS Center for Excellence in Biotic Interactions, University of Chinese Academy of Sciences, Beijing, China
| | - Wan Zhao
- State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
- CAS Center for Excellence in Biotic Interactions, University of Chinese Academy of Sciences, Beijing, China
| | - Wei Wang
- State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
- CAS Center for Excellence in Biotic Interactions, University of Chinese Academy of Sciences, Beijing, China
| | - Lenwood S Heath
- Department of Computer Science, Virginia Tech, Blacksburg, VA, United States
| | - Le Kang
- State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
- CAS Center for Excellence in Biotic Interactions, University of Chinese Academy of Sciences, Beijing, China
| | - Feng Cui
- State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
- CAS Center for Excellence in Biotic Interactions, University of Chinese Academy of Sciences, Beijing, China
|
21
|
Kösesoy İ, Gök M, Kahveci T. Prediction of host-pathogen protein interactions by extended network model. Turk J Biol 2021; 45:138-148. [PMID: 33907496 PMCID: PMC8068772 DOI: 10.3906/biy-2009-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 09/01/2020] [Accepted: 01/04/2021] [Indexed: 11/26/2022] Open
Abstract
Knowledge of the pathogen-host interactions between the species is essential in order to develop a solution strategy against infectious diseases. In vitro methods take extended periods of time to detect interactions and provide very few of the possible interaction pairs. Hence, modelling interactions between proteins has necessitated the development of computational methods. The main scope of this paper is integrating the known protein interactions between the host and pathogen organisms to improve the prediction success rate of unknown pathogen-host interactions. Thus, the true positive rate of the predictions was expected to increase. In order to perform this study extensively, encoding methods and learning algorithms of several proteins were tested. Along with human as the host organism, two different pathogen organisms were used in the experiments. For each combination of protein-encoding and prediction method, both the original prediction algorithms were tested using only pathogen-host interactions and the same method was tested again after integrating the known protein interactions within each organism. The effect of merging the networks of pathogen-host interactions of different species on the prediction performance of state-of-the-art methods was also observed. Success was measured in terms of Matthews correlation coefficient, precision, recall, F1 score, and accuracy metrics. Empirical results showed that integrating the host and pathogen interactions yields better performance consistently in almost all experiments.
Affiliation(s)
- İrfan KÖSESOY
- Department of Computer Engineering, Faculty of Engineering, Yalova University, Yalova, Turkey
| | - Murat GÖK
- Department of Computer Engineering, Faculty of Engineering, Yalova University, Yalova, Turkey
| | - Tamer KAHVECİ
- Department of Computer and Information Science and Engineering, University of Florida, Gainesville, FL, USA
|
22
|
Lian X, Yang X, Yang S, Zhang Z. Current status and future perspectives of computational studies on human-virus protein-protein interactions. Brief Bioinform 2021; 22:6161422. [PMID: 33693490 DOI: 10.1093/bib/bbab029] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 11/10/2020] [Revised: 01/14/2021] [Accepted: 01/20/2021] [Indexed: 12/19/2022] Open
Abstract
The protein-protein interactions (PPIs) between human and viruses mediate viral infection and host immunity processes. Therefore, the study of human-virus PPIs can help us understand the principles of human-virus relationships and can thus guide the development of highly effective drugs to break the transmission of viral infectious diseases. Recent years have witnessed the rapid accumulation of experimentally identified human-virus PPI data, which provides an unprecedented opportunity for bioinformatics studies revolving around human-virus PPIs. In this article, we provide a comprehensive overview of computational studies on human-virus PPIs, especially focusing on the method development for human-virus PPI predictions. We briefly introduce the experimental detection methods and existing database resources of human-virus PPIs, and then discuss the research progress in the development of computational prediction methods. In particular, we elaborate the machine learning-based prediction methods and highlight the need to embrace state-of-the-art deep-learning algorithms and new feature engineering techniques (e.g. the protein embedding technique derived from natural language processing). To further advance the understanding in this research topic, we also outline the practical applications of the human-virus interactome in fundamental biological discovery and new antiviral therapy development.
Affiliation(s)
- Xianyi Lian
- State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing 100193, China
| | - Xiaodi Yang
- State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing 100193, China
| | - Shiping Yang
- State Key Laboratory of Plant Physiology and Biochemistry, College of Biological Sciences, China Agricultural University, Beijing 100193, China
| | - Ziding Zhang
- State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing 100193, China
|
23
|
Armingol E, Officer A, Harismendy O, Lewis NE. Deciphering cell-cell interactions and communication from gene expression. Nat Rev Genet 2021; 22:71-88. [PMID: 33168968 PMCID: PMC7649713 DOI: 10.1038/s41576-020-00292-x] [Citation(s) in RCA: 445] [Impact Index Per Article: 148.3] [Accepted: 09/25/2020] [Indexed: 12/13/2022]
Abstract
Cell-cell interactions orchestrate organismal development, homeostasis and single-cell functions. When cells do not properly interact or improperly decode molecular messages, disease ensues. Thus, the identification and quantification of intercellular signalling pathways has become a common analysis performed across diverse disciplines. The expansion of protein-protein interaction databases and recent advances in RNA sequencing technologies have enabled routine analyses of intercellular signalling from gene expression measurements of bulk and single-cell data sets. In particular, ligand-receptor pairs can be used to infer intercellular communication from the coordinated expression of their cognate genes. In this Review, we highlight discoveries enabled by analyses of cell-cell interactions from transcriptomic data and review the methods and tools used in this context.
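Inferring intercellular communication from the coordinated expression of ligand-receptor pairs, as this abstract describes, can be illustrated with a simple expression-product score. The sketch below is one possible scoring choice shown under stated assumptions, not the method of any specific tool covered in the review: the function name, the input layout (mean expression per cell type), and the gene and cell-type labels are all hypothetical.

```python
from itertools import product


def communication_scores(mean_expr, lr_pairs):
    """Score every ordered (sender, receiver) cell-type pair for each ligand-receptor pair.

    mean_expr: dict mapping cell type -> dict of gene -> mean expression value.
    lr_pairs: iterable of (ligand_gene, receptor_gene) tuples.
    The score is ligand expression in the sender times receptor expression in the
    receiver; genes missing from a cell type are treated as unexpressed (0.0).
    """
    pairs = list(lr_pairs)
    scores = {}
    for sender, receiver in product(mean_expr, repeat=2):
        for ligand, receptor in pairs:
            ligand_level = mean_expr[sender].get(ligand, 0.0)
            receptor_level = mean_expr[receiver].get(receptor, 0.0)
            scores[(sender, receiver, ligand, receptor)] = ligand_level * receptor_level
    return scores


# Hypothetical cell types and genes, for illustration only.
expression = {
    "A": {"LIG": 2.0, "REC": 0.0},
    "B": {"LIG": 0.5, "REC": 4.0},
}
scores = communication_scores(expression, [("LIG", "REC")])
```

A product score is zero whenever either partner is unexpressed, which matches the requirement that both the ligand and its cognate receptor be present. In practice, significance is usually assessed separately, for example by permuting cell-type labels.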
Affiliation(s)
- Erick Armingol
- Department of Pediatrics, University of California, San Diego, La Jolla, CA, USA
- Novo Nordisk Foundation Center for Biosustainability at the University of California, San Diego, La Jolla, CA, USA
- Bioinformatics and Systems Biology Graduate Program, University of California, San Diego, La Jolla, CA, USA
| | - Adam Officer
- Bioinformatics and Systems Biology Graduate Program, University of California, San Diego, La Jolla, CA, USA
- Division of Biomedical Informatics, University of California, San Diego, La Jolla, CA, USA
| | - Olivier Harismendy
- Division of Biomedical Informatics, University of California, San Diego, La Jolla, CA, USA.
- Moores Cancer Center, University of California, San Diego, La Jolla, CA, USA.
| | - Nathan E Lewis
- Department of Pediatrics, University of California, San Diego, La Jolla, CA, USA.
- Novo Nordisk Foundation Center for Biosustainability at the University of California, San Diego, La Jolla, CA, USA.
- Department of Bioengineering, University of California, San Diego, La Jolla, CA, USA.
|
24
|
An Integrative Computational Approach for the Prediction of Human-Plasmodium Protein-Protein Interactions. Biomed Res Int 2021; 2020:2082540. [PMID: 33426052 PMCID: PMC7771252 DOI: 10.1155/2020/2082540] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/20/2020] [Revised: 11/08/2020] [Accepted: 12/04/2020] [Indexed: 12/27/2022]
Abstract
Host-pathogen molecular cross-talks are critical in determining the pathophysiology of a specific infection. Most of these cross-talks are mediated via protein-protein interactions between the host and the pathogen (HP-PPI). Thus, it is essential to know how some pathogens interact with their hosts to understand the mechanism of infections. Malaria is a life-threatening disease caused by an obligate intracellular parasite belonging to the Plasmodium genus, of which P. falciparum is the most prevalent. Several previous studies that predicted human-Plasmodium protein-protein interactions using computational methods have demonstrated the utility, accuracy, and efficiency of these methods in identifying interacting partners, thereby complementing experimental efforts to characterize host-pathogen interaction networks. To predict potential putative HP-PPIs, we use an integrative computational approach based on the combination of multiple OMICS-based methods including human red blood cells (RBC) and Plasmodium falciparum 3D7 strain expressed proteins, domain-domain based PPI, similarity of gene ontology terms, structure similarity method homology identification, and machine learning prediction. Our results reported a set of 716 protein interactions involving 302 human proteins and 130 Plasmodium proteins. This work provides a list of potential human-Plasmodium interacting proteins. These findings will contribute to a better understanding of the mechanisms underlying the molecular determinism of malaria disease and potentially to the identification of candidate pharmacological targets.
|
25
|
Han Y, Cheng L, Sun W. Analysis of Protein-Protein Interaction Networks through Computational Approaches. Protein Pept Lett 2020; 27:265-278. [PMID: 31692419 DOI: 10.2174/0929866526666191105142034] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 04/29/2019] [Revised: 05/08/2019] [Accepted: 09/26/2019] [Indexed: 01/02/2023]
Abstract
The interactions among proteins and genes are extremely important for cellular functions. Molecular interactions at protein or gene levels can be used to construct interaction networks in which the interacting species are categorized based on direct interactions or functional similarities. Compared with the limited experimental techniques, various computational tools make it possible to analyze, filter, and combine the interaction data to get comprehensive information about the biological pathways. By the efficient way of integrating experimental findings in discovering PPIs and computational techniques for prediction, the researchers have been able to gain many valuable data on PPIs, including some advanced databases. Moreover, many useful tools and visualization programs enable the researchers to establish, annotate, and analyze biological networks. We here review and list the computational methods, databases, and tools for protein-protein interaction prediction.
Affiliation(s)
- Ying Han
  - Cardiovascular Department, The Fourth Affiliated Hospital of Harbin Medical University, Harbin, China
- Liang Cheng
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Weiju Sun
  - Cardiovascular Department, The First Affiliated Hospital of Harbin Medical University, Harbin, China
|
26
|
Araújo CL, Blanco I, Souza L, Tiwari S, Pereira LC, Ghosh P, Azevedo V, Silva A, Folador A. In silico functional prediction of hypothetical proteins from the core genome of Corynebacterium pseudotuberculosis biovar ovis. PeerJ 2020; 8:e9643. [PMID: 32913672 PMCID: PMC7456259 DOI: 10.7717/peerj.9643] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 01/29/2020] [Accepted: 07/10/2020] [Indexed: 12/30/2022] Open
Abstract
Corynebacterium pseudotuberculosis is a pathogen of veterinary relevance that is divided into two biovars, equi and ovis, which cause ulcerative lymphangitis and caseous lymphadenitis, respectively. The isolation and sequencing of C. pseudotuberculosis biovar ovis strains in the Northern and Northeastern regions of Brazil revealed the emergence of this pathogen, which causes economic losses to small ruminant producers and the condemnation of animal carcasses and skins. Through the pan-genomic approach, it is possible to determine and analyze the genes that are shared by all strains of a species (the core genome). However, many of these genes have no predicted function and are characterized as hypothetical proteins (HP). In this study, we considered 32 C. pseudotuberculosis biovar ovis genomes for the pan-genomic analysis, in which 172 HP were identified in a core genome composed of 1255 genes. We were able to functionally annotate 80 sequences previously characterized as HP through the identification of structural features such as conserved domains and families. Furthermore, we analyzed the physicochemical properties, subcellular localization and molecular function. Additionally, using RNA-seq data, we investigated the differential gene expression of the annotated HP. Genes inserted in pathogenicity islands had their virulence potential evaluated. We also analyzed the existence of functional associations for their products based on protein–protein interaction networks, and performed the structural prediction of three targets. Owing to the integration of different strategies, this study can underlie deeper in vitro research into the characterization of these HP and the search for new solutions to combat this pathogen.
Affiliation(s)
- Carlos Leonardo Araújo
  - Laboratory of Genomics and Bioinformatics, Center of Genomics and Systems Biology, Institute of Biological Sciences, Federal University of Pará, Belém, Pará, Brazil
- Iago Blanco
  - Laboratory of Genomics and Bioinformatics, Center of Genomics and Systems Biology, Institute of Biological Sciences, Federal University of Pará, Belém, Pará, Brazil
- Luciana Souza
  - Laboratory of Genomics and Bioinformatics, Center of Genomics and Systems Biology, Institute of Biological Sciences, Federal University of Pará, Belém, Pará, Brazil
- Sandeep Tiwari
  - Institute of Biological Sciences, Federal University of Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
- Lino César Pereira
  - Laboratory of Genomics and Bioinformatics, Center of Genomics and Systems Biology, Institute of Biological Sciences, Federal University of Pará, Belém, Pará, Brazil
- Preetam Ghosh
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA
- Vasco Azevedo
  - Institute of Biological Sciences, Federal University of Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
- Artur Silva
  - Laboratory of Genomics and Bioinformatics, Center of Genomics and Systems Biology, Institute of Biological Sciences, Federal University of Pará, Belém, Pará, Brazil
- Adriana Folador
  - Laboratory of Genomics and Bioinformatics, Center of Genomics and Systems Biology, Institute of Biological Sciences, Federal University of Pará, Belém, Pará, Brazil
|
27
|
Chen H, Li F, Wang L, Jin Y, Chi CH, Kurgan L, Song J, Shen J. Systematic evaluation of machine learning methods for identifying human-pathogen protein-protein interactions. Brief Bioinform 2020; 22:5847611. [PMID: 32459334 DOI: 10.1093/bib/bbaa068] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 01/31/2020] [Revised: 03/31/2020] [Accepted: 04/01/2020] [Indexed: 12/11/2022] Open
Abstract
In recent years, high-throughput experimental techniques have significantly enhanced the accuracy and coverage of protein-protein interaction identification, including human-pathogen protein-protein interactions (HP-PPIs). Despite this progress, experimental methods are, in general, expensive in terms of both time and labour costs, especially considering the enormous number of potential protein-interacting partners. Developing computational methods to predict interactions between humans and bacterial pathogens has thus become critical and meaningful, both in facilitating the detection of interactions and in mining incomplete interaction maps. In this paper, we present a systematic evaluation of machine learning-based computational methods for human-bacterium protein-protein interactions (HB-PPIs). We first reviewed a vast number of publicly available databases of HP-PPIs and then critically evaluated the availability of these databases. Benefitting from their well-structured nature, we subsequently preprocessed the data and identified six bacterial pathogens that could be used to study bacterium subjects in which a human was the host. Additionally, we thoroughly reviewed the literature on 'host-pathogen interactions' and summarized the existing models, which we used to jointly study the impact of different feature representation algorithms and evaluate the performance of existing machine learning computational models. Owing to the abundance of sequence information and the limited scale of other protein-related information, we adopted the primary protocol from the literature and dedicated our analysis to a comprehensive assessment of sequence information and machine learning models. A systematic evaluation of machine learning models and a wide range of feature representation algorithms based on sequence information is presented as a comparative survey of prediction performance for HB-PPIs.
|
28
|
Andrighetti T, Bohar B, Lemke N, Sudhakar P, Korcsmaros T. MicrobioLink: An Integrated Computational Pipeline to Infer Functional Effects of Microbiome-Host Interactions. Cells 2020; 9:cells9051278. [PMID: 32455748 PMCID: PMC7291277 DOI: 10.3390/cells9051278] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 04/23/2020] [Revised: 05/15/2020] [Accepted: 05/19/2020] [Indexed: 02/07/2023] Open
Abstract
Microbiome–host interactions play significant roles in health and in various diseases including autoimmune disorders. Uncovering these inter-kingdom cross-talks propels our understanding of disease pathogenesis and provides useful leads on potential therapeutic targets. Despite the biological significance of microbe–host interactions, there is a big gap in understanding the downstream effects of these interactions on host processes. Computational methods are expected to fill this gap by generating, integrating, and prioritizing predictions—as experimental detection remains challenging due to feasibility issues. Here, we present MicrobioLink, a computational pipeline to integrate predicted interactions between microbial and host proteins together with host molecular networks. Using the concept of network diffusion, MicrobioLink can analyse how microbial proteins in a certain context are influencing cellular processes by modulating gene or protein expression. We demonstrated the applicability of the pipeline using a case study. We used gut metaproteomic data from Crohn’s disease patients and healthy controls to uncover the mechanisms by which the microbial proteins can modulate host genes which belong to biological processes implicated in disease pathogenesis. MicrobioLink, which is agnostic of the microbial protein sources (bacterial, viral, etc.), is freely available on GitHub.
Affiliation(s)
- Tahila Andrighetti
  - Earlham Institute, Norwich Research Park, Norwich NR4 7UZ, UK
  - Institute of Biosciences, São Paulo University (UNESP), Botucatu 18618-689, SP, Brazil
- Balazs Bohar
  - Earlham Institute, Norwich Research Park, Norwich NR4 7UZ, UK
  - Department of Genetics, Eötvös Loránd University, Budapest 1117, Hungary
- Ney Lemke
  - Institute of Biosciences, São Paulo University (UNESP), Botucatu 18618-689, SP, Brazil
- Padhmanand Sudhakar
  - Earlham Institute, Norwich Research Park, Norwich NR4 7UZ, UK
  - Quadram Institute Bioscience, Norwich Research Park, Norwich NR4 7UQ, UK
  - Department of Chronic Diseases, Metabolism and Ageing, KU Leuven BE-3000, Leuven, Belgium
  - Correspondence: (T.K.); (P.S.)
- Tamas Korcsmaros
  - Earlham Institute, Norwich Research Park, Norwich NR4 7UZ, UK
  - Quadram Institute Bioscience, Norwich Research Park, Norwich NR4 7UQ, UK
  - Correspondence: (T.K.); (P.S.)
|
29
|
Guven-Maiorov E, Hakouz A, Valjevac S, Keskin O, Tsai CJ, Gursoy A, Nussinov R. HMI-PRED: A Web Server for Structural Prediction of Host-Microbe Interactions Based on Interface Mimicry. J Mol Biol 2020; 432:3395-3403. [PMID: 32061934 PMCID: PMC7261632 DOI: 10.1016/j.jmb.2020.01.025] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 07/31/2019] [Revised: 11/28/2019] [Accepted: 01/14/2020] [Indexed: 02/07/2023]
Abstract
Microbes, both commensals and pathogens, control numerous functions in host cells. They can alter host signaling and modulate immune surveillance by interacting with host proteins. To shed light on the contribution of microbes to health and disease, it is vital to discern how microbial proteins rewire host signaling and through which host proteins they do this. Host-Microbe Interaction PREDictor (HMI-PRED) is a user-friendly web server for structural prediction of protein-protein interactions (PPIs) between the host and a microbial species, including bacteria, viruses, fungi, and protozoa. HMI-PRED relies on "interface mimicry", through which microbial proteins hijack host binding surfaces. Given the structure of a microbial protein of interest, HMI-PRED will return structural models of potential host-microbe interaction (HMI) complexes, the list of host endogenous and exogenous PPIs that can be disrupted, and tissue expression of the microbe-targeted host proteins. The server also allows users to upload homology models of microbial proteins. Broadly, it aims at large-scale, efficient identification of HMIs. The prediction results are stored in a repository for community access. HMI-PRED is free and available at https://interactome.ku.edu.tr/hmi.
Affiliation(s)
- Emine Guven-Maiorov
  - Computational Structural Biology Section, Basic Science Program, Frederick National Laboratory for Cancer Research, Frederick, MD, 21702, USA.
- Asma Hakouz
  - Department of Computer Engineering, Koc University, Istanbul, 34450, Turkey.
- Sukejna Valjevac
  - Department of Computer Engineering, Koc University, Istanbul, 34450, Turkey.
- Ozlem Keskin
  - Department of Chemical and Biological Engineering, Koc University, Istanbul, 34450, Turkey.
- Chung-Jung Tsai
  - Computational Structural Biology Section, Basic Science Program, Frederick National Laboratory for Cancer Research, Frederick, MD, 21702, USA.
- Attila Gursoy
  - Department of Computer Engineering, Koc University, Istanbul, 34450, Turkey.
- Ruth Nussinov
  - Computational Structural Biology Section, Basic Science Program, Frederick National Laboratory for Cancer Research, Frederick, MD, 21702, USA; Sackler Inst. of Molecular Medicine, Department of Human Genetics and Molecular Medicine, Sackler School of Medicine, Tel Aviv University, Tel Aviv, 69978, Israel.
|
30
|
James K, Olson PD. The tapeworm interactome: inferring confidence scored protein-protein interactions from the proteome of Hymenolepis microstoma. BMC Genomics 2020; 21:346. [PMID: 32380953 PMCID: PMC7204028 DOI: 10.1186/s12864-020-6710-1] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 06/17/2019] [Accepted: 03/30/2020] [Indexed: 12/14/2022] Open
Abstract
Background: Reference genome and transcriptome assemblies of helminths have reached a level of completion whereby secondary analyses that rely on accurate gene estimation or syntenic relationships can be now conducted with a high level of confidence. Recent public release of the v.3 assembly of the mouse bile-duct tapeworm, Hymenolepis microstoma, provides chromosome-level characterisation of the genome and a stabilised set of protein coding gene models underpinned by bioinformatic and empirical data. However, interactome data have not been produced. Conserved protein-protein interactions in other organisms, termed interologs, can be used to transfer interactions between species, allowing systems-level analysis in non-model organisms. Results: Here, we describe a probabilistic, integrated network of interologs for the H. microstoma proteome, based on conserved protein interactions found in eukaryote model species. Almost a third of the 10,139 gene models in the v.3 assembly could be assigned interaction data and assessment of the resulting network indicates that topologically-important proteins are related to essential cellular pathways, and that the network clusters into biologically meaningful components. Moreover, network parameters are similar to those of single-species interaction networks that we constructed in the same way for S. cerevisiae, C. elegans and H. sapiens, demonstrating that information-rich, system-level analyses can be conducted even on species separated by a large phylogenetic distance from the major model organisms from which most protein interaction evidence is based. Using the interolog network, we then focused on sub-networks of interactions assigned to discrete suites of genes of interest, including signalling components and transcription factors, germline multipotency genes, and genes differentially-expressed between larval and adult worms. Results show not only an expected bias toward highly-conserved proteins, such as components of intracellular signal transduction, but in some cases predicted interactions with transcription factors that aid in identifying their target genes. Conclusions: With key helminth genomes now complete, systems-level analyses can provide an important predictive framework to guide basic and applied research on helminths and will become increasingly informative as new protein-protein interaction data accumulate.
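The interolog idea mentioned in this abstract (transferring a known interaction between two model-organism proteins to their orthologs in another species) can be illustrated with a minimal sketch. This is not the authors' probabilistic, confidence-scored pipeline: the function name `infer_interologs`, the input layout, and the protein identifiers in the usage note are hypothetical, and no scoring is attempted.

```python
def infer_interologs(model_ppis, orthologs):
    """Transfer model-organism interactions to a target species.

    model_ppis: iterable of (a, b) protein pairs known to interact in
        the model organism (hypothetical input layout).
    orthologs: dict mapping a model-organism protein to the set of its
        orthologs in the target species.
    Returns a set of frozensets, each holding one predicted
    target-species pair (unordered, so duplicates collapse).
    """
    predicted = set()
    for a, b in model_ppis:
        # Proteins without a mapped ortholog contribute nothing.
        for ta in orthologs.get(a, ()):
            for tb in orthologs.get(b, ()):
                if ta != tb:  # skip self-pairs in this simple sketch
                    predicted.add(frozenset((ta, tb)))
    return predicted
```

For example, with model interactions `[("A", "B"), ("B", "C")]` and orthologs `{"A": {"t1"}, "B": {"t2", "t3"}}`, the sketch predicts the pairs t1/t2 and t1/t3; the second model interaction is dropped because "C" has no mapped ortholog.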
Affiliation(s)
- Katherine James
  - Department of Applied Sciences, Northumbria University, Newcastle Upon Tyne, UK
  - Department of Life Sciences, The Natural History Museum, Cromwell Road, London, UK
- Peter D Olson
  - Department of Life Sciences, The Natural History Museum, Cromwell Road, London, UK
|
31
|
Sen R, Tagore S, De RK. ASAPP: Architectural Similarity-Based Automated Pathway Prediction System and Its Application in Host-Pathogen Interactions. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2020; 17:506-515. [PMID: 30281472 DOI: 10.1109/tcbb.2018.2872527] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/08/2023]
Abstract
The significance of metabolic pathway prediction is to envision the viable unknown transformations that can occur provided the appropriate enzymes are present. It can facilitate the prediction of the consequences of host-pathogen interactions. In this article, we have proposed a new algorithm Architectural Similarity-based Automated Pathway Prediction (ASAPP) to predict metabolic pathways based on the structural similarity among the metabolites. ASAPP takes two-dimensional structure and molecular weight of metabolites as input, and generates a list of probable transformations without the knowledge of any externally established reactions, with an accuracy of 85.09 percent. ASAPP has also been applied to predict the outcome of pathogen liberated toxins on the carbohydrate and lipid pathways of the hosts. We have analyzed the disruption of host pathways in the presence of toxins, and have found that some metabolites in Glycolysis and the TCA cycle have a high chance of being the breakpoints in the pathway. The tool is available at http://asapp.droppages.com/.
|
32
|
Oli AN, Obialor WO, Ifeanyichukwu MO, Odimegwu DC, Okoyeh JN, Emechebe GO, Adejumo SA, Ibeanu GC. Immunoinformatics and Vaccine Development: An Overview. Immunotargets Ther 2020; 9:13-30. [PMID: 32161726 PMCID: PMC7049754 DOI: 10.2147/itt.s241064] [Citation(s) in RCA: 106] [Impact Index Per Article: 26.5] [Received: 12/04/2019] [Accepted: 01/25/2020] [Indexed: 12/11/2022] Open
Abstract
The use of vaccines has resulted in a remarkable improvement in global health. It has saved many lives, reduced treatment costs and raised the quality of animal and human lives. Current traditional vaccines were developed empirically, with either vague or no knowledge of how they modulate our immune system. Even in the face of potential advances in vaccine design, immune-related concerns are being raised (as seen with specific vulnerable populations, cases of emerging/re-emerging infectious disease, pathogens with complex lifecycles and antigenic variability, the need for personalized vaccinations, and concerns for vaccines' immunological safety, specifically the likelihood that a vaccine triggers non-antigen-specific responses that may cause autoimmunity and vaccine allergy). These concerns have driven immunologists toward research for a better approach to vaccine design that will consider these challenges. Currently, immunoinformatics has paved the way for a better understanding of the pathogenesis and diagnosis of some infectious diseases, immune system response and computational vaccinology. The importance of immunoinformatics in the study of infectious diseases is diverse in terms of the computational approaches used, but is united by common qualities related to the host–pathogen relationship. Bioinformatics methods are also used to assign functions to uncharacterized genes, which can be targeted as candidates in vaccine design, and can be a better approach toward the inclusion of pregnant women in vaccine trials and programs. The essence of this review is to give insight into the need to focus on novel computational, experimental and computation-driven experimental approaches for studying host–pathogen interactions, thus making a case for their use in vaccine development.
Affiliation(s)
- Angus Nnamdi Oli
  - Department of Pharmaceutical Microbiology and Biotechnology, Faculty of Pharmaceutical Sciences, Nnamdi Azikiwe University, Awka, Nigeria
- Wilson Okechukwu Obialor
  - Department of Pharmaceutical Microbiology and Biotechnology, Faculty of Pharmaceutical Sciences, Nnamdi Azikiwe University, Awka, Nigeria
- Martins Ositadimma Ifeanyichukwu
  - Department of Immunology, College of Health Sciences, Faculty of Medicine, Nnamdi Azikiwe University, Anambra, Nigeria
  - Department of Medical Laboratory Science, Faculty of Health Science and Technology, College of Health Sciences, Nnamdi Azikiwe University, Nnewi Campus, Nnewi, Nigeria
- Damian Chukwu Odimegwu
  - Department of Pharmaceutical Microbiology and Biotechnology, Faculty of Pharmaceutical Sciences, University of Nigeria Nsukka, Enugu, Nigeria
- Jude Nnaemeka Okoyeh
  - Department of Biology and Clinical Laboratory Science, Division of Arts and Sciences, Neumann University, Aston, PA 19014-1298, USA
- George Ogonna Emechebe
  - Department of Pediatrics, Faculty of Clinical Medicine, Chukwuemeka Odumegwu Ojukwu University, Awka, Nigeria
- Samson Adedeji Adejumo
  - Department of Pharmaceutical Microbiology and Biotechnology, Faculty of Pharmaceutical Sciences, Nnamdi Azikiwe University, Awka, Nigeria
- Gordon C Ibeanu
  - Department of Pharmaceutical Science, North Carolina Central University, Durham, NC 27707, USA
|
33
|
Li J, Wang S, Chen Z, Wang Y. A Bipartite Network Module-Based Project to Predict Pathogen-Host Association. Front Genet 2020; 10:1357. [PMID: 32038713 PMCID: PMC6992693 DOI: 10.3389/fgene.2019.01357] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/27/2019] [Accepted: 12/11/2019] [Indexed: 12/23/2022] Open
Abstract
Pathogen-host interactions play an important role in understanding the mechanism by which a pathogen can infect its host. Some approaches for predicting pathogen-host association have been developed, but prediction accuracy is still low. In this paper, we propose a bipartite network module-based approach to improve prediction accuracy. First, a bipartite network with pathogens and hosts is constructed. Next, pathogens and hosts are each divided into different modules. Then, modular information on the pathogens and hosts is added into a bipartite network projection model and the association scores between pathogens and hosts are calculated. Finally, leave-one-out cross-validation is used to estimate the performance of the proposed method. Experimental results show that the proposed method performs better in predicting pathogen-host association than other methods, and some potential pathogen-host associations with higher prediction scores are also confirmed by the results of biological experiments in the publicly available literature.
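As a rough illustration of the scoring step described in this abstract, the sketch below implements a plain two-step resource-allocation projection on a pathogen-host bipartite network. It is an assumption-laden stand-in: the abstract does not give the authors' formula, the modular information they add is omitted, and the function name `projection_scores` and the adjacency layout are hypothetical.

```python
import numpy as np


def projection_scores(adjacency):
    """Score pathogen-host pairs by bipartite network projection.

    adjacency: (pathogens x hosts) 0/1 matrix of known associations
        (hypothetical layout).
    Returns a matrix of the same shape; a positive score on a pair
    with no known association is a candidate prediction.
    """
    A = np.asarray(adjacency, dtype=float)
    k_pathogen = A.sum(axis=1)  # degree of each pathogen
    k_host = A.sum(axis=0)      # degree of each host
    # Reciprocal degrees, leaving isolated nodes at zero.
    inv_kp = np.divide(1.0, k_pathogen,
                       out=np.zeros_like(k_pathogen), where=k_pathogen > 0)
    inv_kh = np.divide(1.0, k_host,
                       out=np.zeros_like(k_host), where=k_host > 0)
    # W[i, j]: fraction of host j's resource that reaches host i
    # after passing through the pathogens the two hosts share.
    W = (A.T * inv_kp) @ A * inv_kh
    # Each pathogen starts with one unit on every known host.
    return A @ W.T
```

With `adjacency = [[1, 1, 0], [0, 1, 1]]`, the first pathogen receives a score of 0.25 for the third host, which it is not yet linked to. In a leave-one-out evaluation, one known association would be set to zero and its recovered score ranked against the unknown pairs.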
Affiliation(s)
- Jie Li
  - School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
|
34
|
Mei S, Zhang K. In silico unravelling pathogen-host signaling cross-talks via pathogen mimicry and human protein-protein interaction networks. Comput Struct Biotechnol J 2019; 18:100-113. [PMID: 31956393 PMCID: PMC6956678 DOI: 10.1016/j.csbj.2019.12.008] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 08/26/2019] [Revised: 12/07/2019] [Accepted: 12/14/2019] [Indexed: 01/08/2023] Open
Abstract
Pathogen-host protein interactions are fundamental for pathogens to manipulate host signaling pathways and subvert host immune defense. For most pathogens, very few or no experimental studies have been conducted to investigate their signaling cross-talks with the host. In this study, we propose a computational framework to validate the biological assumption that human protein-protein interaction (PPI) networks alone are sufficient to infer pathogen-host PPIs via pathogen functional mimicry. Pathogen functional mimicry assumes that a pathogen functionally mimics and substitutes host counterpart proteins in order to get involved in or hijack host cellular processes. Through pathogen functional mimicry defined via gene ontology (GO) semantic similarity, we first use the known human PPIs as templates to infer pathogen-host PPIs, and these PPIs are then used as training data to build an l2-regularized logistic regression model for novel pathogen-host PPI prediction. Independent tests on experimental data from human immunodeficiency virus and Francisella tularensis validate the effectiveness of the proposed pathogen functional mimicry technique. Performance comparisons also show that the proposed technique outperforms the existing pathogen sequence mimicry approaches and transfer learning methods. The proposed framework provides a new avenue to study experimentally less-studied pathogens in the worst-case scenario where very few or no experimental pathogen-host PPIs are available. As two case studies, we apply the proposed framework to Salmonella typhimurium and Human respiratory syncytial virus to reconstruct the pathogen-host PPI networks and further investigate the interference of these two pathogens with human immune signaling and the transcription regulatory system.
Affiliation(s)
- Suyu Mei
  - Software College, Shenyang Normal University, Shenyang 110034, China
- Kun Zhang
  - Bioinformatics Core of Xavier RCMI Center for Cancer Research, Department of Computer Science, Xavier University of Louisiana, New Orleans, LA 70125, USA
|
35
|
Guven-Maiorov E, Tsai CJ, Nussinov R. Oncoviruses Can Drive Cancer by Rewiring Signaling Pathways Through Interface Mimicry. Front Oncol 2019; 9:1236. [PMID: 31803618 PMCID: PMC6872517 DOI: 10.3389/fonc.2019.01236] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Received: 07/26/2019] [Accepted: 10/28/2019] [Indexed: 01/17/2023] Open
Abstract
Oncoviruses rewire host pathways to subvert host immunity and promote their survival and proliferation. However, exactly how they do so is challenging to understand. Here, by employing the first and to date only interface-based host-microbe interaction (HMI) prediction method, we explore a pivotal strategy oncoviruses use to drive cancer: mimicking binding surfaces (interfaces) of human proteins. We show that oncoviruses can target key human network proteins and transform cells by acquisition of cancer hallmarks. Experimental large-scale mapping of HMIs is difficult, and individual HMIs do not permit an in-depth grasp of tumorigenic virulence mechanisms. Our computational approach is tractable, and 3D structural HMI models can help elucidate pathogenesis mechanisms and facilitate drug design. We observe that many host proteins are unique targets for certain oncoviruses, whereas others are common to several, suggesting similar infectious strategies. A rough estimate of our false discovery rate, based on the tissue expression of oncovirus-targeted human proteins, is 25%.
Affiliation(s)
- Emine Guven-Maiorov
  - Computational Structural Biology Section, Basic Science Program, Frederick National Laboratory for Cancer Research, Frederick, MD, United States
- Chung-Jung Tsai
  - Computational Structural Biology Section, Basic Science Program, Frederick National Laboratory for Cancer Research, Frederick, MD, United States
- Ruth Nussinov
  - Computational Structural Biology Section, Basic Science Program, Frederick National Laboratory for Cancer Research, Frederick, MD, United States
  - Department of Human Genetics and Molecular Medicine, Sackler Institute of Molecular Medicine, Sackler School of Medicine, Tel Aviv University, Tel Aviv, Israel
|
36
|
Peiffer-Smadja N, Rawson TM, Ahmad R, Buchard A, Georgiou P, Lescure FX, Birgand G, Holmes AH. Machine learning for clinical decision support in infectious diseases: a narrative review of current applications. Clin Microbiol Infect 2019; 26:584-595. [PMID: 31539636 DOI: 10.1016/j.cmi.2019.09.009] [Citation(s) in RCA: 170] [Impact Index Per Article: 34.0] [Received: 06/27/2019] [Revised: 08/29/2019] [Accepted: 09/09/2019] [Indexed: 12/22/2022]
Abstract
BACKGROUND: Machine learning (ML) is a growing field in medicine. This narrative review describes the current body of literature on ML for clinical decision support in infectious diseases (ID). OBJECTIVES: We aim to inform clinicians about the use of ML for diagnosis, classification, outcome prediction and antimicrobial management in ID. SOURCES: References for this review were identified through searches of MEDLINE/PubMed, EMBASE, Google Scholar, bioRxiv, ACM Digital Library, arXiv and IEEE Xplore Digital Library up to July 2019. CONTENT: We found 60 unique ML-clinical decision support systems (ML-CDSS) aiming to assist ID clinicians. Overall, 37 (62%) focused on bacterial infections, 10 (17%) on viral infections, nine (15%) on tuberculosis and four (7%) on any kind of infection. Among them, 20 (33%) addressed the diagnosis of infection, 18 (30%) the prediction, early detection or stratification of sepsis, 13 (22%) the prediction of treatment response, four (7%) the prediction of antibiotic resistance, three (5%) the choice of antibiotic regimen and two (3%) the choice of a combination antiretroviral therapy. The ML-CDSS were developed for intensive care units (n = 24, 40%), ID consultation (n = 15, 25%), medical or surgical wards (n = 13, 20%), emergency department (n = 4, 7%), primary care (n = 3, 5%) and antimicrobial stewardship (n = 1, 2%). Fifty-three ML-CDSS (88%) were developed using data from high-income countries and seven (12%) with data from low- and middle-income countries (LMIC). The evaluation of ML-CDSS was limited to measures of performance (e.g. sensitivity, specificity) for 57 ML-CDSS (95%) and included data in clinical practice for three (5%). IMPLICATIONS: Considering comprehensive patient data from socioeconomically diverse healthcare settings, including primary care and LMICs, may improve the ability of ML-CDSS to suggest decisions adapted to various clinical contexts. Current gaps identified in the evaluation of ML-CDSS must also be addressed in order to know the potential impact of such tools for clinicians and patients.
Affiliation(s)
- N Peiffer-Smadja
  - National Institute for Health Research Health Protection Research Unit in Healthcare Associated Infections and Antimicrobial Resistance, Imperial College London, London, UK; French Institute for Medical Research (Inserm), Infection Antimicrobials Modelling Evolution (IAME), UMR 1137, University Paris Diderot, Paris, France.
- T M Rawson
  - National Institute for Health Research Health Protection Research Unit in Healthcare Associated Infections and Antimicrobial Resistance, Imperial College London, London, UK
- R Ahmad
  - National Institute for Health Research Health Protection Research Unit in Healthcare Associated Infections and Antimicrobial Resistance, Imperial College London, London, UK
- P Georgiou
  - Department of Electrical and Electronic Engineering, Imperial College, London, UK
- F-X Lescure
  - French Institute for Medical Research (Inserm), Infection Antimicrobials Modelling Evolution (IAME), UMR 1137, University Paris Diderot, Paris, France; Infectious Diseases Department, Bichat-Claude Bernard Hospital, Assistance-Publique Hôpitaux de Paris, Paris, France
- G Birgand
  - National Institute for Health Research Health Protection Research Unit in Healthcare Associated Infections and Antimicrobial Resistance, Imperial College London, London, UK
- A H Holmes
  - National Institute for Health Research Health Protection Research Unit in Healthcare Associated Infections and Antimicrobial Resistance, Imperial College London, London, UK
|
37
|
Arenas AF, Arango-Plaza N, Arenas JC, Salcedo GE. Time-Frequency Approach Applied to Finding Interaction Regions in Pathogenic Proteins. Bioinform Biol Insights 2019; 13:1177932219850172. [PMID: 31210729 PMCID: PMC6552352 DOI: 10.1177/1177932219850172] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 04/16/2019] [Accepted: 04/18/2019] [Indexed: 11/15/2022] Open
Abstract
Protein-protein interactions govern all molecular processes in living organisms, including those involved in pathogen infection. Pathogens such as viruses, bacteria, and parasites contain proteins that help the pathogen to attach to, penetrate, and settle inside the target cell. Thus, it is necessary to know the regions in pathogenic proteins that interact with host cell receptors. Currently, powerful pathogen databases are available and many pathogenic proteins have been recognized, but many others have not been characterized. This work developed a program in the MATLAB environment based on time-frequency analysis to recognize important sites in proteins. Our program highlights the highest-energy patches in proteins from their time-frequency distribution and matches the corresponding frequency. We sought to determine whether this approach can recognize stretches of residues related to interaction. Our approach was applied to five case studies from well-characterized co-crystallized pathogenic structures. We searched for the frequencies that characterize interaction regions in pathogenic proteins and, with this information, tried to identify new interaction patches in either paralogs or orthologs. We found that our program generates a readily interpretable graphic under several descriptors that can show important regions in proteins, including those related to interaction. We propose that this MATLAB program could be used as a tool to explore outstanding regions in uncharacterized proteins.
Affiliation(s)
- Ailan F Arenas
  - Grupo de Estudio en Parasitología Molecular (Gepamol), Universidad del Quindío, Armenia, Colombia; Grupo de Investigación y Asesoría en Estadística, Universidad del Quindío, Armenia, Colombia
- Nicolás Arango-Plaza
  - Grupo de Investigación y Asesoría en Estadística, Universidad del Quindío, Armenia, Colombia
- Juan Camilo Arenas
  - Grupo de Estudio en Parasitología Molecular (Gepamol), Universidad del Quindío, Armenia, Colombia; Grupo de Investigación y Asesoría en Estadística, Universidad del Quindío, Armenia, Colombia
- Gladys E Salcedo
  - Grupo de Investigación y Asesoría en Estadística, Universidad del Quindío, Armenia, Colombia
|
38
|
Halder AK, Dutta P, Kundu M, Basu S, Nasipuri M. Review of computational methods for virus-host protein interaction prediction: a case study on novel Ebola-human interactions. Brief Funct Genomics 2019; 17:381-391. [PMID: 29028879 PMCID: PMC7109800 DOI: 10.1093/bfgp/elx026] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 12/15/2022] Open
Abstract
Identification of potential virus–host interactions is vital for controlling highly infectious viral diseases and may contribute to the development of new drugs to treat viral infections. Recently, database records of clinically and experimentally validated interactions between a small set of human proteins and Ebola virus (EBOV) have been published. Using the information on the known human interaction partners of EBOV, our main objective is to identify a set of proteins that may interact with EBOV proteins. Here, we first review the state-of-the-art computational methods used for prediction of novel virus–host interactions for infectious diseases, followed by a case study on EBOV–human interactions. The assessment shows that the predicted human host proteins are highly similar to known human interaction partners of EBOV in terms of structure and semantics and are responsible for similar biochemical activities, pathways and host–pathogen relationships.
Affiliation(s)
- Anup Kumar Halder
  - Department of Computer Science and Engineering, Jadavpur University, India
- Pritha Dutta
  - Department of Computer Science and Engineering, Jadavpur University, India
- Mahantapas Kundu
  - Department of Computer Science and Engineering, Jadavpur University, India
- Subhadip Basu
  - Department of Computer Science and Engineering, Jadavpur University, India
- Mita Nasipuri
  - Department of Computer Science and Engineering, Jadavpur University, India
|
39
|
Lian X, Yang S, Li H, Fu C, Zhang Z. Machine-Learning-Based Predictor of Human–Bacteria Protein–Protein Interactions by Incorporating Comprehensive Host-Network Properties. J Proteome Res 2019; 18:2195-2205. [DOI: 10.1021/acs.jproteome.9b00074] [Citation(s) in RCA: 29] [Impact Index Per Article: 5.8] [Indexed: 02/08/2023]
Affiliation(s)
- Xianyi Lian
  - State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing 100193, China
- Shiping Yang
  - State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing 100193, China
- Hong Li
  - Key Laboratory of Tropical Biological Resources of Ministry of Education, Hainan University, Haikou, 570228, China
- Chen Fu
  - State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing 100193, China
- Ziding Zhang
  - State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing 100193, China
|
40
|
Guven-Maiorov E, Tsai CJ, Ma B, Nussinov R. Interface-Based Structural Prediction of Novel Host-Pathogen Interactions. Methods Mol Biol 2019; 1851:317-335. [PMID: 30298406 PMCID: PMC8192064 DOI: 10.1007/978-1-4939-8736-8_18] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Indexed: 12/26/2022]
Abstract
About 20% of cancer cases worldwide have been estimated to be associated with infections. However, the molecular mechanisms by which infections contribute to host tumorigenesis are still unknown. To evade host defense, pathogens hijack host proteins at different levels: sequence, structure, motif, and binding surface, i.e., interface. Interface similarity allows pathogen proteins to compete with host counterparts to bind to a target protein, rewire physiological signaling, and cause persistent infections as well as cancer. Identification of host-pathogen interactions (HPIs), along with their structural details at atomic resolution, may provide mechanistic insight into pathogen-driven cancers and open new routes to therapeutic intervention. HPI data including structural details are scarce, and large-scale experimental detection is challenging. Therefore, there is an urgent and mounting need for efficient and robust computational approaches to predict HPIs and their complex (bound) structures. In this chapter, we review the first and currently only interface-based computational approach to identify novel HPIs. The concept of interface mimicry promises to identify more HPIs than complete sequence or structural similarity. We illustrate this concept with a case study on Kaposi's sarcoma herpesvirus (KSHV) to elucidate how it subverts host immunity and contributes to malignant transformation of the host cells.
Affiliation(s)
- Emine Guven-Maiorov
  - Cancer and Inflammation Program, Leidos Biomedical Research, Inc. Frederick National Laboratory for Cancer Research, National Cancer Institute, Frederick, MD, USA
- Chung-Jung Tsai
  - Cancer and Inflammation Program, Leidos Biomedical Research, Inc. Frederick National Laboratory for Cancer Research, National Cancer Institute, Frederick, MD, USA
- Buyong Ma
  - Cancer and Inflammation Program, Leidos Biomedical Research, Inc. Frederick National Laboratory for Cancer Research, National Cancer Institute, Frederick, MD, USA
- Ruth Nussinov
  - Cancer and Inflammation Program, Leidos Biomedical Research, Inc. Frederick National Laboratory for Cancer Research, National Cancer Institute, Frederick, MD, USA
  - Department of Human Genetics and Molecular Medicine, Sackler Inst. of Molecular Medicine, Sackler School of Medicine, Tel Aviv University, Tel Aviv, Israel
|
41
|
Abstract
Biological activities are mainly executed by proteins, and in most cases these activities are accomplished by protein complexes or through protein-protein interactions (PPIs). It is therefore critical to reveal how protein complexes are organized and to identify the PPIs involved in biological processes. In addition to the traditional biochemical approaches, proximity-dependent labeling (PDL) has recently been proposed to identify the interacting partners of a given protein. PDL requires the target protein to be expressed as a fusion with an enzyme that catalyzes the attachment of a reactive molecule to the interacting partners in a distance-dependent manner. Further analysis of all the proteins modified by the reactive molecule discloses their identity; these proteins are presumed to be interacting partners of the target protein. BioID is a representative PDL method and the most widely applied. The enzyme used in BioID is the biotin ligase BirA, which catalyzes the biotinylation of target proteins in the presence of biotin. Through streptavidin-mediated pull-down and mass spectrometry analysis, the candidate interacting proteins of a given protein can be obtained.
Affiliation(s)
- Peipei Li
  - Faculty of Health Sciences, Cancer Center, University of Macau, Macau, China
- Yuan Meng
  - Faculty of Health Sciences, Cancer Center, University of Macau, Macau, China
- Li Wang
  - Faculty of Health Sciences, Cancer Center, University of Macau, Macau, China
- Li-Jun Di
  - Faculty of Health Sciences, Cancer Center, University of Macau, Macau, China
|
42
|
A new sequence based encoding for prediction of host-pathogen protein interactions. Comput Biol Chem 2018; 78:170-177. [PMID: 30553999 DOI: 10.1016/j.compbiolchem.2018.12.001] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 12/23/2017] [Revised: 08/23/2018] [Accepted: 12/01/2018] [Indexed: 12/22/2022]
Abstract
Pathogen-host interactions are key to understanding the infection process at the molecular level, where pathogen proteins physically bind to human proteins to manipulate critical biological processes in the host cell. Data scarcity and data unavailability are two major problems for computational approaches to the prediction of pathogen-host interactions. Developing a computational method that predicts pathogen-host interactions with high accuracy from protein sequences alone is of great importance because it can eliminate these problems. In this study, we propose a novel and robust sequence-based feature extraction method, named Location Based Encoding, to predict pathogen-host interactions with machine-learning algorithms. We use Bacillus anthracis and Yersinia pestis data sets as the pathogen organisms and human proteins as the host model to compare our method with sequence-based protein encoding methods that are widely used in the literature, namely amino acid composition, amino acid pair, and conjoint triad. We use these encoding methods with decision-tree (Random Forest, J48), statistical (Bayesian Networks, Naive Bayes), and instance-based (kNN) classifiers to predict pathogen-host interactions. We conduct different experiments to evaluate the effectiveness of our method. We obtain the best results among all the experiments with the RF classifier in terms of F1, accuracy, MCC, and AUC.
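Of the baseline encodings named in the abstract above, amino acid composition is the simplest. The following is a minimal sketch of that baseline only, not of the authors' Location Based Encoding, whose definition is not given here. The function names and toy sequences are assumptions made for illustration.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(sequence):
    """Return the fraction of each standard residue in `sequence` (20 values)."""
    counts = Counter(r for r in sequence.upper() if r in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]


def encode_pair(pathogen_seq, host_seq):
    """Concatenate both composition vectors into one 40-value feature vector."""
    return amino_acid_composition(pathogen_seq) + amino_acid_composition(host_seq)


# Toy sequences, not real proteins.
features = encode_pair("MKKLLA", "MAAG")
```

A vector like `features` could then be passed to any of the classifiers the abstract lists. Composition discards residue order entirely, which is one reason order-aware encodings are proposed as alternatives.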
|
43
|
Sun J, Yang LL, Chen X, Kong DX, Liu R. Integrating Multifaceted Information to Predict Mycobacterium tuberculosis-Human Protein-Protein Interactions. J Proteome Res 2018; 17:3810-3823. [PMID: 30269499 DOI: 10.1021/acs.jproteome.8b00497] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Indexed: 12/11/2022]
Abstract
Tuberculosis (TB) is one of the biggest infectious disease killers caused by Mycobacterium tuberculosis (MTB). Studying the protein-protein interactions (PPIs) between MTB and human can deepen our understanding of the pathogenesis of TB and offer new clues to the treatment against MTB infection, but the experimentally validated interactions are especially scarce in this regard. Herein we proposed an integrated framework that combined template-, domain-domain interaction-, and machine learning-based methods to predict MTB-human PPIs. As a result, we established a network composed of 13 758 PPIs including 451 MTB proteins and 3167 human proteins ( http://liulab.hzau.edu.cn/MTB/ ). Compared to known human targets of various pathogens, our predicted human targets show a similar tendency in terms of the network topological properties and enrichment in important functional genes. Additionally, these human targets largely have longer sequence lengths, more protein domains, more disordered residues, lower evolutionary rates, and older protein ages. Functional analysis demonstrates that these proteins show strong preferences toward the phosphorylation, kinase activity, and signaling transduction processes and the disease and immune related pathways. Dissecting the cross-talk among top-ranked pathways suggests that the cancer pathway may serve as a bridge in MTB infection. Triplet analysis illustrates that the paired targets interacting with the same partner are adjacent to each other in the intraspecies network and tend to share similar expression patterns. Finally, we identified 36 potential anti-MTB human targets by integrating known drug target information and molecular properties of proteins.
|
44
|
Zhang L, Liu JY, Gu H, Du Y, Zuo JF, Zhang Z, Zhang M, Li P, Dunwell JM, Cao Y, Zhang Z, Zhang YM. Bradyrhizobium diazoefficiens USDA 110-Glycine max Interactome Provides Candidate Proteins Associated with Symbiosis. J Proteome Res 2018; 17:3061-3074. [PMID: 30091610 DOI: 10.1021/acs.jproteome.8b00209] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Indexed: 11/28/2022]
Abstract
Although the legume-rhizobium symbiosis is a highly important biological process, there is limited knowledge about the protein interaction network between host and symbiont. Using interolog- and domain-based approaches, we constructed an interspecies protein interactome containing 5115 protein-protein interactions between 2291 Glycine max and 290 Bradyrhizobium diazoefficiens USDA 110 proteins. The interactome was further validated by expression pattern analysis in nodules, gene ontology term semantic similarity, co-expression analysis, and luciferase complementation image assay. In the G. max-B. diazoefficiens interactome, bacterial proteins are mainly ion channels and transporters of carbohydrates and cations, while G. max proteins are mainly involved in the processes of metabolism, signal transduction, and transport. We also identified the top 10 highly interacting proteins (hubs) for each species. Kyoto Encyclopedia of Genes and Genomes pathway analysis for each hub showed that a pair of 14-3-3 proteins (SGF14g and SGF14k) and 5 heat shock proteins in G. max are possibly involved in symbiosis, and 10 hubs in B. diazoefficiens may be important symbiotic effectors. Subnetwork analysis showed that 18 symbiosis-related soluble N-ethylmaleimide sensitive factor attachment protein receptor proteins may play roles in regulating bacterial ion channels, and SGF14g and SGF14k possibly regulate the rhizobium dicarboxylate transport protein DctA. The predicted interactome provides a valuable basis for understanding the molecular mechanism of nodulation in soybean.
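The interolog idea mentioned in the abstract above can be stated briefly: if two proteins interact in a template data set, their orthologs in the two species of interest are predicted to interact. The sketch below shows that mapping under simplifying assumptions. All identifiers, the dictionary layout, and the function name are hypothetical, and real pipelines add ortholog-quality and domain-based filters that are not shown.

```python
def predict_interologs(template_ppis, host_orthologs, symbiont_orthologs):
    """Map template interactions onto (host, symbiont) protein pairs.

    template_ppis: iterable of (protein_a, protein_b) known interactions.
    host_orthologs / symbiont_orthologs: dict from a template protein to the
    list of its orthologs in the host / symbiont proteome.
    """
    predicted = set()
    for a, b in template_ppis:
        # An interaction is undirected, so try both orientations.
        for left, right in ((a, b), (b, a)):
            for host in host_orthologs.get(left, ()):
                for symbiont in symbiont_orthologs.get(right, ()):
                    predicted.add((host, symbiont))
    return predicted


# Hypothetical identifiers, for illustration only.
template = [("T1", "T2"), ("T3", "T4")]
host_map = {"T1": ["Gm_A"], "T4": ["Gm_B", "Gm_C"]}
symbiont_map = {"T2": ["Bd_X"], "T3": ["Bd_Y"]}
pairs = predict_interologs(template, host_map, symbiont_map)
```

Because every ortholog combination is emitted, proteins with many orthologs inflate the predicted network, which is one motivation for the validation steps the abstract lists.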
Affiliation(s)
- Li Zhang
  - Crop Information Center, College of Plant Science and Technology, Huazhong Agricultural University, Wuhan 430070, China
  - School of Public Health, Xinxiang Medical University, Xinxiang 453003, China
- Jin-Yang Liu
  - College of Agriculture, Nanjing Agricultural University, Nanjing 210095, China
- Huan Gu
  - College of Agriculture, Nanjing Agricultural University, Nanjing 210095, China
- Yanfang Du
  - Crop Information Center, College of Plant Science and Technology, Huazhong Agricultural University, Wuhan 430070, China
- Jian-Fang Zuo
  - Crop Information Center, College of Plant Science and Technology, Huazhong Agricultural University, Wuhan 430070, China
- Zhibin Zhang
  - Crop Information Center, College of Plant Science and Technology, Huazhong Agricultural University, Wuhan 430070, China
- Menglin Zhang
  - Crop Information Center, College of Plant Science and Technology, Huazhong Agricultural University, Wuhan 430070, China
- Pan Li
  - School of Public Health, Xinxiang Medical University, Xinxiang 453003, China
- Jim M Dunwell
  - School of Agriculture, Policy and Development, University of Reading, Reading RG6 6AR, United Kingdom
- Yangrong Cao
  - College of Life Science and Technology, Huazhong Agricultural University, Wuhan 430070, China
- Zuxin Zhang
  - Crop Information Center, College of Plant Science and Technology, Huazhong Agricultural University, Wuhan 430070, China
- Yuan-Ming Zhang
  - Crop Information Center, College of Plant Science and Technology, Huazhong Agricultural University, Wuhan 430070, China
|
45
|
Goodacre N, Devkota P, Bae E, Wuchty S, Uetz P. Protein-protein interactions of human viruses. Semin Cell Dev Biol 2018; 99:31-39. [PMID: 30031213 PMCID: PMC7102568 DOI: 10.1016/j.semcdb.2018.07.018] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Received: 09/08/2017] [Revised: 04/02/2018] [Accepted: 07/17/2018] [Indexed: 12/16/2022]
Abstract
Viruses infect their human hosts by a series of interactions between viral and host proteins, indicating that detailed knowledge of such virus-host interaction interfaces is critical for our understanding of viral infection mechanisms, disease etiology and the development of new drugs. In this review, we primarily survey human host-virus interaction data that are available from public databases following the standardized PSI-MS format. Notably, available host-virus protein interaction information is strongly biased toward a small number of virus families including Herpesviridae, Papillomaviridae, Orthomyxoviridae and Retroviridae. While we explore the reliability and relevance of these protein interactions, we also survey the current knowledge about viruses' functional and topological targets. Furthermore, we assess emerging frontiers of host-virus protein interaction research, focusing on protein interaction interfaces of hosts that are infected by different viruses and viruses that infect multiple hosts. Finally, we cover the current status of research that investigates the relationships of virus-targeted host proteins to other comorbidities as well as the influence of host-virus protein interactions on human metabolism.
Affiliation(s)
- Norman Goodacre
  - Division of Viral Products, Office of Vaccines Research and Review, Center for Biologics Evaluation and Research, U.S. Food and Drug Administration, Silver Spring, MD, USA
- Prajwal Devkota
  - Dept. of Computer Science, Univ. of Miami, Coral Gables, FL, 33146, USA
- Eunhae Bae
  - Division of Viral Products, Office of Vaccines Research and Review, Center for Biologics Evaluation and Research, U.S. Food and Drug Administration, Silver Spring, MD, USA
- Stefan Wuchty
  - Dept. of Computer Science, Univ. of Miami, Coral Gables, FL, 33146, USA; Center for Computational Science, Univ. of Miami, Coral Gables, FL, 33146, USA; Dept. of Biology, Univ. of Miami, Coral Gables, FL, 33146, USA; Sylvester Comprehensive Cancer Center, Miller School of Medicine, University of Miami, Miami, FL, 33136, USA
- Peter Uetz
  - Center for the Study of Biological Complexity, Virginia Commonwealth University, Richmond, VA, 23284, USA
|
46
|
da Costa WLO, Araújo CLDA, Dias LM, Pereira LCDS, Alves JTC, Araújo FA, Folador EL, Henriques I, Silva A, Folador ARC. Functional annotation of hypothetical proteins from the Exiguobacterium antarcticum strain B7 reveals proteins involved in adaptation to extreme environments, including high arsenic resistance. PLoS One 2018; 13:e0198965. [PMID: 29940001 PMCID: PMC6016940 DOI: 10.1371/journal.pone.0198965] [Citation(s) in RCA: 31] [Impact Index Per Article: 5.2] [Received: 01/12/2018] [Accepted: 05/28/2018] [Indexed: 02/07/2023] Open
Abstract
Exiguobacterium antarcticum strain B7 is a psychrophilic Gram-positive bacterium that possesses enzymes that can be used for several biotechnological applications. However, many proteins from its genome are considered hypothetical proteins (HPs). These functionally unknown proteins may indicate important functions regarding the biological role of this bacterium, and the use of bioinformatics tools can assist in the biological understanding of this organism through functional annotation analysis. Thus, our study aimed to assign functions to proteins previously described as HPs, present in the genome of E. antarcticum B7. We used an extensive in silico workflow combining several bioinformatics tools for function annotation, sub-cellular localization and physicochemical characterization, three-dimensional structure determination, and protein-protein interactions. This genome contains 2772 genes, of which 765 CDS were annotated as HPs. The amino acid sequences of all HPs were submitted to our workflow and we successfully attributed function to 132 HPs. We identified 11 proteins that play important roles in the mechanisms of adaptation to adverse environments, such as flagellar biosynthesis, biofilm formation, carotenoids biosynthesis, and others. In addition, three predicted HPs are possibly related to arsenic tolerance. Through an in vitro assay, we verified that E. antarcticum B7 can grow at high concentrations of this metal. The approach used was important to precisely assign function to proteins from diverse classes and to infer relationships with proteins with functions already described in the literature. This approach aims to produce a better understanding of the mechanism by which this bacterium adapts to extreme environments and to the finding of targets with biotechnological interest.
Affiliation(s)
- Wana Lailan Oliveira da Costa
  - Laboratory of Genomic and Bioinformatics, Center of Genomics and System Biology, Institute of Biological Science, Federal University of Para, Belém, Pará, Brazil
- Carlos Leonardo de Aragão Araújo
  - Laboratory of Genomic and Bioinformatics, Center of Genomics and System Biology, Institute of Biological Science, Federal University of Para, Belém, Pará, Brazil
- Larissa Maranhão Dias
  - Laboratory of Genomic and Bioinformatics, Center of Genomics and System Biology, Institute of Biological Science, Federal University of Para, Belém, Pará, Brazil
- Lino César de Sousa Pereira
  - Laboratory of Genomic and Bioinformatics, Center of Genomics and System Biology, Institute of Biological Science, Federal University of Para, Belém, Pará, Brazil
- Jorianne Thyeska Castro Alves
  - Laboratory of Genomic and Bioinformatics, Center of Genomics and System Biology, Institute of Biological Science, Federal University of Para, Belém, Pará, Brazil
- Fabrício Almeida Araújo
  - Laboratory of Genomic and Bioinformatics, Center of Genomics and System Biology, Institute of Biological Science, Federal University of Para, Belém, Pará, Brazil
- Edson Luiz Folador
  - Biotechnology Center, Federal University of Paraiba, João Pessoa, Paraíba, Brazil
- Isabel Henriques
  - Biology Department & CESAM, University of Aveiro, Aveiro, Portugal
- Artur Silva
  - Laboratory of Genomic and Bioinformatics, Center of Genomics and System Biology, Institute of Biological Science, Federal University of Para, Belém, Pará, Brazil
- Adriana Ribeiro Carneiro Folador
  - Laboratory of Genomic and Bioinformatics, Center of Genomics and System Biology, Institute of Biological Science, Federal University of Para, Belém, Pará, Brazil
|
47
|
Basit AH, Abbasi WA, Asif A, Gull S, Minhas FUAA. Training host-pathogen protein-protein interaction predictors. J Bioinform Comput Biol 2018; 16:1850014. [PMID: 30060698 DOI: 10.1142/s0219720018500142] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Indexed: 12/19/2022]
Abstract
Detection of protein-protein interactions (PPIs) plays a vital role in molecular biology. Particularly, pathogenic infections are caused by interactions of host and pathogen proteins. It is important to identify host-pathogen interactions (HPIs) to discover new drugs to counter infectious diseases. Conventional wet lab PPI detection techniques have limitations in terms of cost and large-scale application. Hence, computational approaches are developed to predict PPIs. This study aims to develop machine learning models to predict inter-species PPIs with a special interest in HPIs. Specifically, we focus on seeking answers to three questions that arise while developing an HPI predictor: (1) How should negative training examples be selected? (2) Does assigning sample weights to individual negative examples based on their similarity to positive examples improve generalization performance? and, (3) What should be the size of negative samples as compared to the positive samples during training and evaluation? We compare two available methods for negative sampling: random versus DeNovo sampling and our experiments show that DeNovo sampling offers better accuracy. However, our experiments also show that generalization performance can be improved further by using a soft DeNovo approach that assigns sample weights to negative examples inversely proportional to their similarity to known positive examples during training. Based on our findings, we have also developed an HPI predictor called HOPITOR (Host-Pathogen Interaction Predictor) that can predict interactions between human and viral proteins. The HOPITOR web server can be accessed at the URL: http://faculty.pieas.edu.pk/fayyaz/software.html#HoPItor .
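The soft weighting idea described in the abstract above, where negative examples that resemble known positives receive smaller training weights, can be sketched as follows. This is not the authors' implementation. The k-mer Jaccard similarity, the smoothing constant `eps`, and the normalisation to a maximum of 1 are assumptions chosen only to keep the example self-contained.

```python
def kmer_set(sequence, k=2):
    """Return the set of overlapping k-mers in a sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def jaccard_similarity(seq_a, seq_b, k=2):
    """Jaccard overlap of k-mer sets; a stand-in for a real similarity measure."""
    a, b = kmer_set(seq_a, k), kmer_set(seq_b, k)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def negative_sample_weights(negatives, positives, eps=0.1):
    """Weight each negative inversely to its closest similarity to any positive."""
    raw = []
    for negative in negatives:
        closest = max((jaccard_similarity(negative, p) for p in positives), default=0.0)
        raw.append(1.0 / (eps + closest))  # eps avoids division by zero
    top = max(raw, default=1.0)
    return [w / top for w in raw]  # scale so the largest weight is 1


# Toy sequences, invented for illustration.
weights = negative_sample_weights(["MKLV", "GGGG"], ["MKLV"])
```

Weights of this kind can be supplied to classifiers that accept per-sample weights during fitting, so that a negative identical to a known positive contributes far less to the loss than a clearly dissimilar one.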
Affiliation(s)
- Abdul Hannan Basit
  - Department of Computer and Information Sciences, Biomedical Informatics Research Laboratory, Pakistan Institute of Engineering and Applied Sciences (PIEAS), Nilore, Islamabad 44000, Pakistan; Department of Electrical Engineering, Pakistan Institute of Engineering and Applied Sciences (PIEAS), Nilore, Islamabad 44000, Pakistan
- Wajid Arshad Abbasi
  - Department of Computer and Information Sciences, Biomedical Informatics Research Laboratory, Pakistan Institute of Engineering and Applied Sciences (PIEAS), Nilore, Islamabad 44000, Pakistan
- Amina Asif
  - Department of Computer and Information Sciences, Biomedical Informatics Research Laboratory, Pakistan Institute of Engineering and Applied Sciences (PIEAS), Nilore, Islamabad 44000, Pakistan
- Sadaf Gull
  - Department of Computer and Information Sciences, Biomedical Informatics Research Laboratory, Pakistan Institute of Engineering and Applied Sciences (PIEAS), Nilore, Islamabad 44000, Pakistan
- Fayyaz Ul Amir Afsar Minhas
  - Department of Computer and Information Sciences, Biomedical Informatics Research Laboratory, Pakistan Institute of Engineering and Applied Sciences (PIEAS), Nilore, Islamabad 44000, Pakistan
|
48
|
Devkota P, Danzi MC, Wuchty S. Beyond degree and betweenness centrality: Alternative topological measures to predict viral targets. PLoS One 2018; 13:e0197595. [PMID: 29795705 PMCID: PMC5967884 DOI: 10.1371/journal.pone.0197595] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 11/28/2017] [Accepted: 05/04/2018] [Indexed: 11/18/2022] Open
Abstract
The availability of large-scale screens of host-virus interaction interfaces enabled the topological analysis of viral protein targets of the host. In particular, host proteins that bind viral proteins are generally hubs and proteins with high betweenness centrality. Recently, other topological measures were introduced that a virus may tap to infect a host cell. Utilizing experimentally determined sets of human protein targets from Herpes, Hepatitis, HIV and Influenza, we pooled molecular interactions between proteins from different pathway databases. Apart from a protein's degree and betweenness centrality, we considered a protein's pathway participation, ability to topologically control a network and protein PageRank index. In particular, we found that proteins with increasing values of such measures tend to accumulate viral targets and distinguish viral targets from non-targets. Furthermore, all such topological measures strongly correlate with the occurrence of a given protein in different pathways. Building a random forest classifier that is based on such topological measures, we found that protein PageRank index had the highest impact on the classification of viral (non-)targets while proteins' ability to topologically control an interaction network played the least important role.
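Two of the topological measures named in the abstract above, degree and PageRank, are easy to compute on a small network. The sketch below uses a plain power iteration on an undirected toy graph. The graph, the node names, and the damping value of 0.85 are illustrative assumptions, and the code assumes every node has at least one neighbour.

```python
def degree(adjacency):
    """Number of interaction partners of each protein."""
    return {node: len(neighbours) for node, neighbours in adjacency.items()}


def pagerank(adjacency, damping=0.85, iterations=100):
    """PageRank by power iteration on an undirected graph given as neighbour sets."""
    nodes = list(adjacency)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        updated = {}
        for node in nodes:
            # Each neighbour shares its current rank equally among its own neighbours.
            received = sum(rank[m] / len(adjacency[m]) for m in adjacency[node])
            updated[node] = (1.0 - damping) / n + damping * received
        rank = updated
    return rank


# Toy star-shaped network: one hub protein with three partners.
network = {
    "HUB": {"A", "B", "C"},
    "A": {"HUB"},
    "B": {"HUB"},
    "C": {"HUB"},
}
degrees = degree(network)
ranks = pagerank(network)
```

Per-protein values such as these could serve as input features for a classifier like the random forest described in the abstract. Betweenness centrality and network controllability need more involved algorithms and are left out of this sketch.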
Affiliation(s)
- Prajwal Devkota
  - Dept. of Computer Science, Univ. of Miami, Coral Gables, FL, United States of America
- Matt C. Danzi
  - The Miami Project to Cure Paralysis, Miller School of Medicine, University of Miami, Miami, FL, United States of America
  - Center for Computational Science, Univ. of Miami, Coral Gables, FL, United States of America
- Stefan Wuchty
  - Dept. of Computer Science, Univ. of Miami, Coral Gables, FL, United States of America
  - Center for Computational Science, Univ. of Miami, Coral Gables, FL, United States of America
  - Dept. of Biology, Univ. of Miami, Coral Gables, FL, United States of America
  - Sylvester Comprehensive Cancer Center, Miller School of Medicine, University of Miami, Miami, FL, United States of America
|
49
|
Predicting Interactions between Virus and Host Proteins Using Repeat Patterns and Composition of Amino Acids. Journal of Healthcare Engineering 2018; 2018:1391265. [PMID: 29854357 PMCID: PMC5966669 DOI: 10.1155/2018/1391265] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.0] [Received: 11/14/2017] [Revised: 03/27/2018] [Accepted: 04/17/2018] [Indexed: 11/29/2022]
Abstract
Previous methods for predicting protein-protein interactions (PPIs) focused mainly on PPIs within a single species, but PPIs across different species have recently emerged as an important issue in areas such as viral infection. The primary focus of this study is to predict PPIs between a virus and its targeted host, which are involved in viral infection. We developed a general method that predicts interactions between virus and host proteins using the repeat patterns and composition of amino acids. In independent testing with PPIs of new viruses and hosts, the method showed high performance, comparable to the best performance of other methods for single virus-host PPIs. When compared with other methods on the same datasets, our method outperformed them. The repeat patterns and composition of amino acids are simple yet powerful features for predicting virus-host PPIs. The method developed in this study will help in finding new virus-host PPIs for which little information is available.
|
50
|
Computational and Experimental Approaches to Predict Host-Parasite Protein-Protein Interactions. Methods Mol Biol 2018; 1819:153-173. [PMID: 30421403 DOI: 10.1007/978-1-4939-8618-7_7] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 12/20/2022]
Abstract
In host-parasite systems, protein-protein interactions are key to allowing the pathogen to enter the host and persist within it. The study of host-parasite molecular communication improves the understanding of the mechanisms of infection, evasion of the host immune system, and tropism across different tissues. Current trends in parasitology focus on unraveling host-parasite protein-protein interactions to aid the development of new strategies to combat pathogenic parasites with better treatments and prevention mechanisms. Due to the complexity of capturing these interactions experimentally, computational approaches that integrate data from different sources (mainly "omics" data) become key to complementing or supporting experimental approaches. Here, we focus on the application of experimental and computational methods in the prediction of host-parasite interactions and highlight the potential of each of these methods in specific contexts.
|