1
Ayoola MB, Pillai N, Nanduri B, Rothrock MJ, Ramkumar M. Predicting foodborne pathogens and probiotics taxa within poultry-related microbiomes using a machine learning approach. Anim Microbiome 2023; 5:57. [PMID: 37968727 PMCID: PMC10648331 DOI: 10.1186/s42523-023-00260-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/18/2023] [Accepted: 08/23/2023] [Indexed: 11/17/2023] Open
Abstract
BACKGROUND: Microbiomes that can serve as an indicator of gut, intestinal, and general health of humans and animals are largely influenced by food consumed and contaminant bioagents. Microbiome studies usually focus on estimating the alpha (within sample) and beta (similarity/dissimilarity among samples) diversities. This study took a combinatorial approach and applied machine learning to microbiome data to predict the presence of disease-causing pathogens and their association with known/potential probiotic taxa. Probiotics are beneficial living microorganisms capable of improving the host organism's digestive system, immune function and ultimately overall health. Here, 16S rRNA gene high-throughput Illumina sequencing of temporal pre-harvest (feces, soil) samples of 42 pastured poultry flocks (poultry in this entire work solely refers to chickens) from southeastern U.S. farms was used to generate the relative abundance of operational taxonomic units (OTUs) as machine learning input. Unique genera from the OTUs were used as predictors of the prevalence of foodborne pathogens (Salmonella, Campylobacter and Listeria) at different stages of poultry growth (START (2-4 weeks old), MID (5-7 weeks old), END (8-11 weeks old)), association with farm management practices and physicochemical properties.
RESULT: While we did not see any significant associations between known probiotics and Salmonella or Listeria, we observed significant negative correlations between known probiotics (Bacillus and Clostridium) and Campylobacter at the mid-time point of sample collection. Our data indicate a negative correlation between potential probiotics and Campylobacter at both early and end-time points of sample collection. Furthermore, our model prediction shows that changes in farm operations such as how often the houses are moved on the pasture, age at which chickens are introduced to the pasture, diet composition and presence of other animals on the farm could favorably increase the abundance and activity of probiotics that could reduce Campylobacter prevalence.
CONCLUSION: Integration of microbiome data with farm management practices using machine learning provided insights on how to reduce Campylobacter prevalence and transmission along the farm-to-fork continuum. Altering management practices to support proliferation of beneficial probiotics to reduce pathogen prevalence identified here could constitute a complementary method to the existing but ineffective interventions such as vaccination and bacteriophage cocktails usage. Study findings also corroborate the presence of bacterial genera such as Caloramator, DA101, Parabacteroides and Faecalibacterium as potential probiotics.
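The abstract above mentions alpha (within-sample) and beta (between-sample) diversity without naming the indices used. As a rough illustration only, the sketch below assumes two common choices, the Shannon index for alpha diversity and Bray-Curtis dissimilarity for beta diversity; the function names and the toy OTU counts are hypothetical and are not taken from the study.

```python
import math

def shannon_index(counts):
    """Shannon alpha diversity (natural log) for one sample's OTU counts."""
    total = sum(counts)
    # Zero-count OTUs contribute nothing to the index, so skip them.
    props = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in props)

def bray_curtis(sample_a, sample_b):
    """Bray-Curtis dissimilarity between two samples over the same OTU order."""
    shared_diff = sum(abs(a - b) for a, b in zip(sample_a, sample_b))
    return shared_diff / (sum(sample_a) + sum(sample_b))

# Toy OTU count vectors (made up for illustration).
feces = [40, 30, 20, 10]
soil = [10, 10, 40, 40]
print(shannon_index(feces), shannon_index(soil), bray_curtis(feces, soil))
```

Both functions take raw counts; relative abundances such as those described in the abstract would work for the Shannon index as well, since the counts are normalized internally.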
Affiliation(s)
- Moses B Ayoola
  - Geosystems Research Institute, Mississippi State University, Starkville, MS, 39762, USA
  - Department of Comparative Biomedical Sciences, College of Veterinary Medicine, Mississippi State University, Starkville, MS, 39762, USA
- Nisha Pillai
  - Department of Computer Science and Engineering, Mississippi State University, Starkville, MS, 39762, USA
- Bindu Nanduri
  - Department of Comparative Biomedical Sciences, College of Veterinary Medicine, Mississippi State University, Starkville, MS, 39762, USA
- Michael J Rothrock
  - Egg Safety and Quality Research Unit, USDA-ARS U.S. National Poultry Research Center, Athens, GA 30605, USA
- Mahalingam Ramkumar
  - Department of Computer Science and Engineering, Mississippi State University, Starkville, MS, 39762, USA.
2
Ariute JC, Coelho-Rocha ND, Dantas CWD, de Vasconcelos LAT, Profeta R, de Jesus Sousa T, de Souza Novaes A, Galotti B, Gomes LG, Gimenez EGT, Diniz C, Dias MV, de Jesus LCL, Jaiswal AK, Tiwari S, Carvalho R, Benko-Iseppon AM, Brenig B, Azevedo V, Barh D, Martins FS, Aburjaile F. Probiogenomics of Leuconostoc Mesenteroides Strains F-21 and F-22 Isolated from Human Breast Milk Reveal Beneficial Properties. Probiotics Antimicrob Proteins 2023:10.1007/s12602-023-10170-7. [PMID: 37804433 DOI: 10.1007/s12602-023-10170-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 09/26/2023] [Indexed: 10/09/2023]
Abstract
Bacteria of the Leuconostoc genus are Gram-positive bacteria that are commonly found in raw milk and persist in fermented dairy products and plant food. Studies have already explored the probiotic potential of L. mesenteroides, but not from a probiogenomic perspective, which aims to explore the molecular features responsible for their phenotypes. In the present work, probiogenomic approaches were applied in strains F-21 and F-22 of L. mesenteroides isolated from human milk to assess their biosafety at the molecular level and to correlate molecular features with their potential probiotic characteristics. The complete genome of strain F-22 is 1.99 Mb and presents one plasmid, while the draft genome of strain F-21 is 1.89 Mb and presents four plasmids. A high percentage of average nucleotide identity among other genomes of L. mesenteroides (≥ 96%) corroborated the previous taxonomic classification of these isolates. Genomic regions that influence the probiotic properties were identified and annotated. Both strains exhibited wide genome plasticity, cell adhesion ability, proteolytic activity, proinflammatory and immunomodulation capacity through interaction with TLR-NF-κB and TLR-MAPK pathway components, and no antimicrobial resistance, denoting their potential to be candidate probiotics. Further, the strains showed bacteriocin production potential and the presence of acid, thermal, osmotic, and bile salt resistance genes, indicating their ability to survive under gastrointestinal stress. Taken together, our results suggest that L. mesenteroides F-21 and F-22 are promising candidates for probiotics in the food and pharmaceutical industries.
Affiliation(s)
- Juan Carlos Ariute
  - Laboratory of Integrative Bioinformatics, Preventive Veterinary Medicine Department, Veterinary School, Federal University of Minas Gerais, Belo Horizonte, Minas Gerais, 31270-901, Brazil
  - Graduate Program in Bioinformatics, Federal University of Minas Gerais, Belo Horizonte, Minas Gerais, 31270-901, Brazil
- Nina Dias Coelho-Rocha
  - Laboratory of Cellular and Molecular Genetics, Institute of Biological Sciences, Federal University of Minas Gerais, Belo Horizonte, Minas Gerais, 31270-901, Brazil
- Carlos Willian Dias Dantas
  - Graduate Program in Bioinformatics, Federal University of Minas Gerais, Belo Horizonte, Minas Gerais, 31270-901, Brazil
- Larissa Amorim Tourinho de Vasconcelos
  - Laboratory of Cellular and Molecular Genetics, Institute of Biological Sciences, Federal University of Minas Gerais, Belo Horizonte, Minas Gerais, 31270-901, Brazil
- Rodrigo Profeta
  - Laboratory of Cellular and Molecular Genetics, Institute of Biological Sciences, Federal University of Minas Gerais, Belo Horizonte, Minas Gerais, 31270-901, Brazil
  - Graduate Program in Bioinformatics, Federal University of Minas Gerais, Belo Horizonte, Minas Gerais, 31270-901, Brazil
- Thiago de Jesus Sousa
  - Laboratory of Cellular and Molecular Genetics, Institute of Biological Sciences, Federal University of Minas Gerais, Belo Horizonte, Minas Gerais, 31270-901, Brazil
- Ane de Souza Novaes
  - Laboratory of Cellular and Molecular Genetics, Institute of Biological Sciences, Federal University of Minas Gerais, Belo Horizonte, Minas Gerais, 31270-901, Brazil
- Bruno Galotti
  - Laboratory of Biotherapeutic Agents, Department of Microbiology, Institute of Biological Sciences, Federal University of Minas Gerais, Belo Horizonte, Minas Gerais, 31270-901, Brazil
- Lucas Gabriel Gomes
  - Laboratory of Cellular and Molecular Genetics, Institute of Biological Sciences, Federal University of Minas Gerais, Belo Horizonte, Minas Gerais, 31270-901, Brazil
  - Graduate Program in Bioinformatics, Federal University of Minas Gerais, Belo Horizonte, Minas Gerais, 31270-901, Brazil
- Enrico Giovanelli Toccani Gimenez
  - Laboratory of Cellular and Molecular Genetics, Institute of Biological Sciences, Federal University of Minas Gerais, Belo Horizonte, Minas Gerais, 31270-901, Brazil
  - Graduate Program in Bioinformatics, Federal University of Minas Gerais, Belo Horizonte, Minas Gerais, 31270-901, Brazil
- Carlos Diniz
  - Laboratory of Integrative Bioinformatics, Preventive Veterinary Medicine Department, Veterinary School, Federal University of Minas Gerais, Belo Horizonte, Minas Gerais, 31270-901, Brazil
- Mariana Vieira Dias
  - Laboratory of Integrative Bioinformatics, Preventive Veterinary Medicine Department, Veterinary School, Federal University of Minas Gerais, Belo Horizonte, Minas Gerais, 31270-901, Brazil
- Luís Cláudio Lima de Jesus
  - Laboratory of Cellular and Molecular Genetics, Institute of Biological Sciences, Federal University of Minas Gerais, Belo Horizonte, Minas Gerais, 31270-901, Brazil
- Arun Kumar Jaiswal
  - Laboratory of Cellular and Molecular Genetics, Institute of Biological Sciences, Federal University of Minas Gerais, Belo Horizonte, Minas Gerais, 31270-901, Brazil
- Sandeep Tiwari
  - Department of Biochemistry and Biophysics, Institute of Health Sciences, Federal University of Bahia, Salvador, Bahia, 40231-300, Brazil
- Rodrigo Carvalho
  - Department of Biochemistry and Biophysics, Institute of Health Sciences, Federal University of Bahia, Salvador, Bahia, 40231-300, Brazil
- Ana Maria Benko-Iseppon
  - Laboratory of Plants Genetics and Biotechnology, Genetics Department, Biosciences Center, Federal University of Pernambuco, Recife, Pernambuco, 50740-600, Brazil
- Bertram Brenig
  - Institute of Veterinary Medicine, University of Göttingen, Burckhardtweg 2, 37077, Göttingen, Germany
- Vasco Azevedo
  - Laboratory of Cellular and Molecular Genetics, Institute of Biological Sciences, Federal University of Minas Gerais, Belo Horizonte, Minas Gerais, 31270-901, Brazil
- Debmalya Barh
  - Institute of Integrative Omics and Applied Biotechnology (IIOAB), Nonakuri, Purba Medinipur, 721172, India
- Flaviano S Martins
  - Laboratory of Biotherapeutic Agents, Department of Microbiology, Institute of Biological Sciences, Federal University of Minas Gerais, Belo Horizonte, Minas Gerais, 31270-901, Brazil
- Flavia Aburjaile
  - Laboratory of Integrative Bioinformatics, Preventive Veterinary Medicine Department, Veterinary School, Federal University of Minas Gerais, Belo Horizonte, Minas Gerais, 31270-901, Brazil.
3
Huang Y, Wuchty S, Zhou Y, Zhang Z. SGPPI: structure-aware prediction of protein-protein interactions in rigorous conditions with graph convolutional network. Brief Bioinform 2023; 24:6995378. [PMID: 36682013 DOI: 10.1093/bib/bbad020] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 09/09/2022] [Revised: 11/17/2022] [Accepted: 01/05/2023] [Indexed: 01/23/2023] Open
Abstract
While deep learning (DL)-based models have emerged as powerful approaches to predict protein-protein interactions (PPIs), the reliance on explicit similarity measures (e.g. sequence similarity and network neighborhood) to known interacting proteins makes these methods ineffective in dealing with novel proteins. The advent of AlphaFold2 presents a significant opportunity and also a challenge to predict PPIs in a straightforward way based on monomer structures while controlling bias from protein sequences. In this work, we established Structure and Graph-based Predictions of Protein Interactions (SGPPI), a structure-based DL framework for predicting PPIs, using the graph convolutional network. In particular, SGPPI focused on protein patches on the protein-protein binding interfaces and extracted the structural, geometric and evolutionary features from the residue contact map to predict PPIs. We demonstrated that our model outperforms traditional machine learning methods and state-of-the-art DL-based methods using non-representation-bias benchmark datasets. Moreover, our model trained on human dataset can be reliably transferred to predict yeast PPIs, indicating that SGPPI can capture converging structural features of protein interactions across various species. The implementation of SGPPI is available at https://github.com/emerson106/SGPPI.
Affiliation(s)
- Yan Huang
  - State Key Laboratory of Livestock and Poultry Biotechnology Breeding, College of Biological Sciences, China Agricultural University, Beijing 100193, China
  - Department of Biomedical Informatics, Ministry of Education Key Laboratory of Molecular Cardiovascular Sciences, Center for Non-Coding RNA Medicine, School of Basic Medical Sciences, Peking University, Beijing 100191, China
- Stefan Wuchty
  - Department of Computer Science, University of Miami, Coral Gables, FL 33146, USA
  - Department of Biology, University of Miami, Coral Gables, FL 33146, USA
  - Sylvester Comprehensive Cancer Center, University of Miami, Miami, FL 33136, USA
  - Institute of Data Science and Computing, University of Miami, Coral Gables, FL 33146, USA
- Yuan Zhou
  - Department of Biomedical Informatics, Ministry of Education Key Laboratory of Molecular Cardiovascular Sciences, Center for Non-Coding RNA Medicine, School of Basic Medical Sciences, Peking University, Beijing 100191, China
- Ziding Zhang
  - State Key Laboratory of Livestock and Poultry Biotechnology Breeding, College of Biological Sciences, China Agricultural University, Beijing 100193, China
4
Gao H, Chen C, Li S, Wang C, Zhou W, Yu B. Prediction of protein-protein interactions based on ensemble residual convolutional neural network. Comput Biol Med 2022. [DOI: 10.1016/j.compbiomed.2022.106471] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/24/2022]
5
Fang Y, Yang Y, Liu C. New feature extraction from phylogenetic profiles improved the performance of pathogen-host interactions. Front Cell Infect Microbiol 2022; 12:931072. [PMID: 35982784 PMCID: PMC9378789 DOI: 10.3389/fcimb.2022.931072] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/28/2022] [Accepted: 07/11/2022] [Indexed: 11/13/2022] Open
Abstract
Motivation: Understanding pathogen-host interactions (PHIs) is an essential and challenging area of research because it can reveal the mechanism of molecular interactions between different organisms. The experimental exploration of PHI is time-consuming and labor-intensive, and computational approaches are playing a crucial role in discovering new unknown PHIs between different organisms. Although many machine learning (ML)-based methods have been proposed to predict PHI, these methods are all based on the structure-based information extracted from the sequence for prediction. The selection of feature values is critical to improving the performance of predicting PHI using ML.
Results: This work proposed a new method to extract features from phylogenetic profiles as evolutionary information for predicting PHI. The performance of our approach is better than that of structure-based and ML-based PHI prediction methods. The five different extract models proposed by our approach combined with structure-based information significantly improved the performance of PHI, suggesting that combining phylogenetic profile features and structure-based methods could be applied to the exploration of PHI and discover new unknown biological relativity.
Availability and implementation: The KPP method is implemented in the Java language and is available at https://github.com/yangfangs/KPP.
Affiliation(s)
- Yang Fang
  - Key Laboratory of Bio-Resources and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu, China
  - Department of Laboratory Medicine, Third Affiliated Hospital of Zhengzhou University, Zhengzhou, China
- Yi Yang
  - Key Laboratory of Bio-Resources and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu, China
  - *Correspondence: Chengcheng Liu; Yi Yang
- Chengcheng Liu
  - State Key Laboratory of Oral Diseases, Department of Periodontics, National Clinical Research Center for Oral Diseases, West China School & Hospital of Stomatology, Sichuan University, Chengdu, China
  - *Correspondence: Chengcheng Liu; Yi Yang
6
De Jesus LCL, Aburjaile FF, Sousa TDJ, Felice AG, Soares SDC, Alcantara LCJ, Azevedo VADC. Genomic Characterization of Lactobacillus delbrueckii Strains with Probiotics Properties. Frontiers in Bioinformatics 2022; 2:912795. [PMID: 36304288 PMCID: PMC9580953 DOI: 10.3389/fbinf.2022.912795] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 04/04/2022] [Accepted: 05/16/2022] [Indexed: 01/22/2023] Open
Abstract
Probiotics are health-beneficial microorganisms with mainly immunomodulatory and anti-inflammatory properties. Lactobacillus delbrueckii is a bacterial species commonly used in the dairy industry, and its benefits to host health have been reported. This study analyzed the core genome of nine strains of L. delbrueckii with documented probiotic properties, focusing on genes related to their host health benefits. For this, a combined methodology including several software tools and databases (BPGA, SPAAN, BAGEL4, BioCyc, KEGG, and InterSPPI) was used to predict the most important characteristics related to probiosis in L. delbrueckii strains. Comparative genomics analyses revealed that L. delbrueckii probiotic strains shared essential genes related to acid and bile stress response and antimicrobial activity. Other standard features shared by these strains are genes encoding surface layer proteins and extracellular proteins, with high adhesion profiles, that interacted with human proteins of the inflammatory signaling pathways (TLR2/4-MAPK, TLR2/4-NF-κB, and NOD-like receptors). Among these, the PrtB serine protease appears to be a strong candidate responsible for the anti-inflammatory properties reported for these strains. Furthermore, genes with high proteolytic and metabolic activity able to produce beneficial metabolites, such as acetate, bioactive peptides, and B-complex vitamins, were also identified. These findings suggest that these proteins can be essential in biological mechanisms related to the beneficial probiotic effects of these strains in the host.
Affiliation(s)
- Luís Cláudio Lima De Jesus
  - Department of Genetics, Ecology and Evolution, Federal University of Minas Gerais, Belo Horizonte, Brazil
- Flávia Figueira Aburjaile
  - Department of Preventive Veterinary Medicine, Federal University of Minas Gerais, Belo Horizonte, Brazil
- Thiago De Jesus Sousa
  - Department of Genetics, Ecology and Evolution, Federal University of Minas Gerais, Belo Horizonte, Brazil
- Andrei Giacchetto Felice
  - Department of Immunology, Microbiology and Parasitology, Federal University of Triângulo Mineiro, Uberaba, Brazil
- Siomar De Castro Soares
  - Department of Immunology, Microbiology and Parasitology, Federal University of Triângulo Mineiro, Uberaba, Brazil
- Luiz Carlos Junior Alcantara
  - Department of Genetics, Ecology and Evolution, Federal University of Minas Gerais, Belo Horizonte, Brazil
  - Flavivirus Laboratory, Instituto Oswaldo Cruz, Fundação Oswaldo Cruz, Rio de Janeiro, Brazil
  - *Correspondence: Luiz Carlos Junior Alcantara; Vasco Ariston De Carvalho Azevedo
- Vasco Ariston De Carvalho Azevedo
  - Department of Genetics, Ecology and Evolution, Federal University of Minas Gerais, Belo Horizonte, Brazil
  - *Correspondence: Luiz Carlos Junior Alcantara; Vasco Ariston De Carvalho Azevedo
7
Kaundal R, Loaiza CD, Duhan N, Flann N. deepHPI: a comprehensive deep learning platform for accurate prediction and visualization of host-pathogen protein-protein interactions. Brief Bioinform 2022; 23:6576450. [PMID: 35511057 DOI: 10.1093/bib/bbac125] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 09/12/2021] [Revised: 02/07/2022] [Accepted: 03/15/2022] [Indexed: 01/06/2023] Open
Abstract
Host-pathogen protein interactions (HPIs) play vital roles in many biological processes and are directly involved in infectious diseases. With pandemics becoming more frequent in the last couple of decades, such as the recent outbreak of Covid-19 causing millions of deaths, it has become more critical to develop advanced methods to accurately predict pathogen interactions with their respective hosts. During the last decade, experimental methods to identify HPIs have been used to decipher host-pathogen systems with the caveat that those techniques are labor-intensive, expensive and time-consuming. Alternatively, accurate prediction of HPIs can be performed by the use of data-driven machine learning. To provide a more robust and accurate solution for the HPI prediction problem, we have developed a deepHPI tool based on deep learning. The web server delivers four host-pathogen model types: plant-pathogen, human-bacteria, human-virus and animal-pathogen, extending its applicability to a wide range of analyses and use cases. The deepHPI web tool is the first to use convolutional neural network models for HPI prediction. These models have been selected based on a comprehensive evaluation of protein features and neural network architectures. The best prediction models were tested on independent validation datasets, where they achieved an overall Matthews correlation coefficient value of 0.87 for animal-pathogen using the combined pseudo-amino acid composition and conjoint triad (PAAC_CT) features, 0.75 for human-bacteria using the combined pseudo-amino acid composition, conjoint triad and normalized Moreau-Broto feature (PAAC_CT_NMBroto), 0.96 for human-virus using PAAC_CT_NMBroto and 0.94 for plant-pathogen interactions using the combined pseudo-amino acid composition, composition and transition feature (PAAC_CTDC_CTDT).
Our server running deepHPI is deployed on a high-performance computing cluster that enables large and multiple user requests, and it provides more information about interactions discovered. It presents an enriched visualization of the resulting host-pathogen networks that is augmented with external links to various protein annotation resources. We believe that the deepHPI web server will be very useful to researchers, particularly those working on infectious diseases. Additionally, many novel and known host-pathogen systems can be further investigated to significantly advance our understanding of complex disease-causing agents. The developed models are established on a web server, which is freely accessible at http://bioinfo.usu.edu/deepHPI/.
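The deepHPI abstract reports its results as Matthews correlation coefficient (MCC) values. For readers less familiar with the metric, the sketch below shows the standard MCC formula computed from a binary confusion matrix. The function name and the example counts are hypothetical and are not deepHPI outputs.

```python
import math

def matthews_corrcoef(tp, tn, fp, fn):
    """Standard MCC from the four cells of a binary confusion matrix."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Convention: MCC is reported as 0 when any marginal total is zero.
        return 0.0
    return (tp * tn - fp * fn) / denom

# Made-up counts for an interacting / non-interacting pair classifier.
print(matthews_corrcoef(tp=90, tn=85, fp=15, fn=10))
```

MCC ranges from -1 (total disagreement) through 0 (no better than chance) to 1 (perfect prediction), which is why it is often preferred over accuracy on the imbalanced datasets typical of interaction prediction.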
Affiliation(s)
- Rakesh Kaundal
  - Bioinformatics Facility, Center for Integrated BioSystems, College of Agriculture and Applied Sciences; Department of Plants, Soils, and Climate, College of Agriculture and Applied Sciences; Department of Computer Science, College of Science; Utah State University, Logan, 84322 USA
- Cristian D Loaiza
  - Bioinformatics Facility, Center for Integrated BioSystems, College of Agriculture and Applied Sciences; Department of Plants, Soils, and Climate, College of Agriculture and Applied Sciences
- Naveen Duhan
  - Bioinformatics Facility, Center for Integrated BioSystems, College of Agriculture and Applied Sciences; Department of Plants, Soils, and Climate, College of Agriculture and Applied Sciences
- Nicholas Flann
  - Department of Computer Science, College of Science; Utah State University, Logan, 84322 USA
8
Lim H, Cankara F, Tsai CJ, Keskin O, Nussinov R, Gursoy A. Artificial intelligence approaches to human-microbiome protein–protein interactions. Curr Opin Struct Biol 2022; 73:102328. [DOI: 10.1016/j.sbi.2022.102328] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 08/18/2021] [Revised: 12/01/2021] [Accepted: 12/31/2021] [Indexed: 02/08/2023]
9
Delaunay M, Ha-Duong T. Computational Tools and Strategies to Develop Peptide-Based Inhibitors of Protein-Protein Interactions. Methods in Molecular Biology (Clifton, N.J.) 2022; 2405:205-230. [PMID: 35298816 DOI: 10.1007/978-1-0716-1855-4_11] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Indexed: 11/30/2022]
Abstract
Protein-protein interactions play crucial and subtle roles in many biological processes and modifications of their fine mechanisms generally result in severe diseases. Peptide derivatives are very promising therapeutic agents for modulating protein-protein associations with sizes and specificities between those of small compounds and antibodies. For the same reasons, rational design of peptide-based inhibitors naturally borrows and combines computational methods from both protein-ligand and protein-protein research fields. In this chapter, we aim to provide an overview of computational tools and approaches used for identifying and optimizing peptides that target protein-protein interfaces with high affinity and specificity. We hope that this review will help to implement appropriate in silico strategies for peptide-based drug design that builds on available information for the systems of interest.
Affiliation(s)
- Tâp Ha-Duong
  - Université Paris-Saclay, CNRS, BioCIS, Châtenay-Malabry, France.
10
Machine Learning Approaches for Discriminating Bacterial and Viral Targeted Human Proteins. Processes (Basel) 2022. [DOI: 10.3390/pr10020291] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/01/2023] Open
Abstract
Infectious diseases are one of the core biological complications for public health. It is important to recognize the pathogen-specific mechanisms to improve our understanding of infectious diseases. Differentiations between bacterial- and viral-targeted human proteins are important for improving both prognosis and treatment for the patient. Here, we introduce machine learning-based classifiers to discriminate between the two groups of human proteins. We used the sequence, network, and gene ontology features of human proteins. Among different classifiers and features, the deep neural network (DNN) classifier with amino acid composition (AAC), dipeptide composition (DC), and pseudo-amino acid composition (PAAC) (445 features) achieved the best area under the curve (AUC) value (0.939), F1-score (94.9%), and Matthews correlation coefficient (MCC) value (0.81). We found that each of the selected top 100 of the bacteria- and virus-targeted human proteins from a candidate pool of 1618 and 3916 proteins, respectively, were part of distinct enriched biological processes and pathways. Our proposed method will help to differentiate between the bacterial and viral infections based on the targeted human proteins on a global scale. Furthermore, identification of the crucial pathogen targets in the human proteome would help us to better understand the pathogen-specific infection strategies and develop novel therapeutics.
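The sequence features named in this abstract are simple to compute. The sketch below illustrates amino acid composition (AAC, 20 values) and dipeptide composition (DC, 400 values) under their usual definitions; together they account for 420 of the 445 features mentioned, which would leave 25 for PAAC if those standard dimensions apply. The function names and the toy sequence are hypothetical, and PAAC is omitted because its parameters are not given in the abstract.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 400 pairs

def amino_acid_composition(seq):
    """Fraction of each standard residue in the sequence (20 features)."""
    return [seq.count(aa) / len(seq) for aa in AMINO_ACIDS]

def dipeptide_composition(seq):
    """Fraction of each adjacent residue pair in the sequence (400 features)."""
    counts = dict.fromkeys(DIPEPTIDES, 0)
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:  # ignore pairs containing non-standard letters
            counts[pair] += 1
    n_pairs = max(len(seq) - 1, 1)
    return [counts[dp] / n_pairs for dp in DIPEPTIDES]

# Toy sequence (made up); a real input would be a human protein sequence.
features = amino_acid_composition("MKTAYIAK") + dipeptide_composition("MKTAYIAK")
print(len(features))
```

Concatenating the two vectors gives a fixed-length input regardless of protein length, which is what lets classifiers such as the DNN in the abstract operate on sequences of different sizes.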
11
Wan XH. Artificial intelligence reveals roles of gut microbiota in driving human colorectal cancer evolution. Artif Intell Cancer 2021; 2:69-78. [DOI: 10.35713/aic.v2.i5.69] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/19/2021] [Revised: 10/24/2021] [Accepted: 10/27/2021] [Indexed: 02/06/2023] Open
Abstract
With the rapid development of high-throughput sequencing and artificial intelligence (AI) techniques, the gut mucosal microbiota is beginning to be recognized as a critical driver of human colorectal cancer (CRC). Various AI approaches have been designed to obtain effective information from the enormous numbers of microbial cells residing in the gut mucosa as well as from cancer cells. These mainly include detection of microbial markers for early clinical diagnosis of stage-specific CRC, characterization of pathogenic bacterial activities via genomic and transcriptomic analyses, and prediction of interplay between bacterial drivers and host immune systems. Here I review the current progress of AI applications in profiling gut microbiomes linked to CRC initiation and development. I further look forward to future AI research for improving our understanding of the roles of gut microbiota in CRC evolution.
Affiliation(s)
- Xue-Hua Wan
  - TEDA Institute of Biological Sciences and Biotechnology, Nankai University, Tianjin 300457, China
12
Pitta JLDLP, Vasconcelos CRDS, Wallau GDL, Campos TDL, Rezende AM. In silico predictions of protein interactions between Zika virus and human host. PeerJ 2021; 9:e11770. [PMID: 34513323 PMCID: PMC8395582 DOI: 10.7717/peerj.11770] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/25/2020] [Accepted: 06/23/2021] [Indexed: 11/20/2022] Open
Abstract
Background Zika virus (ZIKV) belongs to the Flaviviridae family, was first isolated in the 1940s, and remained underreported until it emerged as a global threat in 2016, when severe consequences such as Guillain-Barré syndrome and microcephaly in newborns were reported. Understanding the molecular interactions of ZIKV proteins during host infection is important for developing treatments and prophylactic measures; however, the large-scale experimental approaches normally used to detect protein-protein interactions (PPIs) are onerous and labor-intensive. Computational methods, on the other hand, may overcome these challenges and guide traditional approaches toward one or a few protein molecules. The prediction of PPIs can be used to study host-parasite interactions at the protein level and reveal key pathways that allow viral infection. Results Applying Random Forest and Support Vector Machine (SVM) algorithms, we predicted PPIs between the proteomes of two ZIKV strains and the human proteome. The consensus of the predictions of both algorithms comprised 17,223 pairs of proteins. Functional enrichment analyses were performed on the predicted networks to assess the biological meaning of the protein interactions. Some pathways related to viral infection and neurological development were found for both ZIKV strains in the enrichment analysis, but the JAK-STAT pathway was observed only for strain PE243 and not for strain FSS13025. Conclusions The consensus network of PPI predictions made by the Random Forest and SVM algorithms allowed an enrichment analysis that corroborates many aspects of ZIKV infection. The enrichment results are mainly related to viral infection, neuronal development, and immune response, and they differed between the two ZIKV strains compared. Strain PE243 presented more predicted interactions with proteins from the JAK-STAT signaling pathway, which could lead to a more inflammatory immune response than that of strain FSS13025. These results show that the methodology employed in this study can potentially reveal new interactions between ZIKV and human cells.
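The consensus step described in this abstract can be pictured as a set intersection of the positive predictions of the two classifiers. The following is only a minimal sketch under stated assumptions: the pair identifiers, scores, and the 0.5 threshold are hypothetical placeholders, and the abstract does not say how the authors actually combined their Random Forest and SVM outputs.

```python
from typing import Dict, Set, Tuple

# A candidate interaction: (viral protein id, human protein id).
Pair = Tuple[str, str]


def positive_pairs(scores: Dict[Pair, float], threshold: float = 0.5) -> Set[Pair]:
    """Return the pairs whose predicted score reaches the threshold."""
    return {pair for pair, score in scores.items() if score >= threshold}


def consensus(rf_scores: Dict[Pair, float],
              svm_scores: Dict[Pair, float],
              threshold: float = 0.5) -> Set[Pair]:
    """Keep only the pairs that both classifiers call positive."""
    return positive_pairs(rf_scores, threshold) & positive_pairs(svm_scores, threshold)


# Hypothetical scores from two already-trained classifiers (placeholder ids).
rf_scores = {("zikv_A", "human_1"): 0.91,
             ("zikv_B", "human_2"): 0.62,
             ("zikv_C", "human_3"): 0.40}
svm_scores = {("zikv_A", "human_1"): 0.88,
              ("zikv_B", "human_2"): 0.35,
              ("zikv_C", "human_3"): 0.77}

consensus_pairs = consensus(rf_scores, svm_scores)
print(sorted(consensus_pairs))
```

Only the pair that both hypothetical classifiers score above the threshold survives. Requiring agreement in this way trades recall for precision, which is one plausible reason to report a consensus network.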
Affiliation(s)
- Túlio de Lima Campos
- Bioinformatics Platform, Aggeu Magalhães Institute-FIOCRUZ/PE, Recife, PE, Brasil
|
13
|
Hui X, Xu Z, Cao L, Liu L, Lin X, Yang Y, Sun X, Zhang Q, Jin M. HP0487 contributes to the virulence of Streptococcus suis serotype 2 by mediating bacterial adhesion and anti-phagocytosis to neutrophils. Vet Microbiol 2021; 260:109164. [PMID: 34247113 DOI: 10.1016/j.vetmic.2021.109164] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 09/08/2020] [Accepted: 06/17/2021] [Indexed: 01/15/2023]
Abstract
Streptococcus suis serotype 2 (SS2) is an important zoonotic pathogen that poses a serious threat to human health and the swine industry. The survival and travel of SS2 in the bloodstream are important contributors to bacteremia, septicemia, and even septic shock. However, the underlying mechanism remains largely unknown. A preliminary experiment demonstrated that SS2 attaches extensively to the surface of neutrophils, implying that this phenomenon may contribute to the travel of SS2 in the bloodstream and thereby influence its pathogenicity. To test this hypothesis, a previously established screening method that combines affinity chromatography (based on liquid chromatography-tandem mass spectrometry) with shotgun proteomics was used to identify three candidate SS2 proteins (HP0487, HP1765, and HP1111) that could interact with neutrophils. Next, by constructing deletion mutants, we demonstrated that, of the three proteins, HP0487 significantly influenced the adhesion of SS2 to neutrophils. Furthermore, HP0487 was shown to contribute to the anti-phagocytosis of SS2 against neutrophils and RAW264.7 cells. More importantly, deletion of HP0487 significantly reduced the lethality and bacterial loads of SS2 in vivo. Thus, our findings demonstrate that HP0487 contributes to SS2 virulence by mediating the adhesion of SS2 to neutrophils and its anti-phagocytosis, promoting a better understanding of the pathogenesis of SS2.
Affiliation(s)
- Xianfeng Hui
- State Key Laboratory of Agricultural Microbiology, College of Veterinary Medicine, Huazhong Agricultural University, Wuhan, 430070, China
- Zhongmin Xu
- State Key Laboratory of Agricultural Microbiology, College of Veterinary Medicine, Huazhong Agricultural University, Wuhan, 430070, China
- Lei Cao
- State Key Laboratory of Agricultural Microbiology, College of Veterinary Medicine, Huazhong Agricultural University, Wuhan, 430070, China
- Liang Liu
- State Key Laboratory of Agricultural Microbiology, College of Veterinary Medicine, Huazhong Agricultural University, Wuhan, 430070, China
- Xian Lin
- State Key Laboratory of Agricultural Microbiology, College of Veterinary Medicine, Huazhong Agricultural University, Wuhan, 430070, China
- Yong Yang
- State Key Laboratory of Agricultural Microbiology, College of Veterinary Medicine, Huazhong Agricultural University, Wuhan, 430070, China
- Xiaomei Sun
- State Key Laboratory of Agricultural Microbiology, College of Veterinary Medicine, Huazhong Agricultural University, Wuhan, 430070, China
- Qiang Zhang
- State Key Laboratory of Agricultural Microbiology, College of Veterinary Medicine, Huazhong Agricultural University, Wuhan, 430070, China; College of Biomedicine and Health, Huazhong Agricultural University, Wuhan, 430070, China.
- Meilin Jin
- State Key Laboratory of Agricultural Microbiology, College of Veterinary Medicine, Huazhong Agricultural University, Wuhan, 430070, China; Key Laboratory of Preventive Veterinary Medicine in Hubei Province, The Cooperative Innovation Center for Sustainable Pig Production, Wuhan, 430070, China; Key Laboratory of Development of Veterinary Diagnostic Products, Ministry of Agriculture, Wuhan, 430070, China.
|
14
|
Sudhakar P, Machiels K, Verstockt B, Korcsmaros T, Vermeire S. Computational Biology and Machine Learning Approaches to Understand Mechanistic Microbiome-Host Interactions. Front Microbiol 2021; 12:618856. [PMID: 34046017 PMCID: PMC8148342 DOI: 10.3389/fmicb.2021.618856] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 10/18/2020] [Accepted: 03/19/2021] [Indexed: 12/11/2022] Open
Abstract
The microbiome, by virtue of its interactions with the host, is implicated in various host functions, including nutrition and homeostasis. Many chronic diseases, such as diabetes, cancer, and inflammatory bowel diseases, are characterized by a disruption of microbial communities in at least one biological niche or organ system. Various molecular mechanisms between microbial and host components such as proteins, RNAs, and metabolites have recently been identified, filling many gaps in our understanding of how the microbiome modulates host processes. Concurrently, high-throughput technologies have enabled the profiling of heterogeneous datasets capturing community-level changes in the microbiome as well as the host responses. However, due to limitations in parallel sampling and analytical procedures, large gaps still exist in our knowledge of how the microbiome mechanistically influences host functions at a system and community level. In the past decade, computational biology and machine learning methodologies have been developed with the aim of filling these gaps. Because the tools are agnostic in nature, they have been applied in diverse disease contexts to analyze and infer the interactions between the microbiome and host molecular components. Some of these approaches allow the identification and analysis of affected downstream host processes. Most of the tools statistically or mechanistically integrate different types of -omic and meta-omic datasets, followed by functional/biological interpretation. In this review, we provide an overview of the landscape of computational approaches for investigating mechanistic interactions between individual microbes/microbiome and the host, and of the opportunities for basic and clinical research. These could include, but are not limited to, the development of activity- and mechanism-based biomarkers, uncovering mechanisms for therapeutic interventions, and generating integrated signatures to stratify patients.
Affiliation(s)
- Padhmanand Sudhakar
- Department of Chronic Diseases, Metabolism and Ageing, Translational Research Center for Gastrointestinal Disorders (TARGID), KU Leuven, Leuven, Belgium
- Earlham Institute, Norwich, United Kingdom
- Quadram Institute Bioscience, Norwich, United Kingdom
- Kathleen Machiels
- Department of Chronic Diseases, Metabolism and Ageing, Translational Research Center for Gastrointestinal Disorders (TARGID), KU Leuven, Leuven, Belgium
- Bram Verstockt
- Department of Chronic Diseases, Metabolism and Ageing, Translational Research Center for Gastrointestinal Disorders (TARGID), KU Leuven, Leuven, Belgium
- Department of Gastroenterology and Hepatology, University Hospitals Leuven, KU Leuven, Leuven, Belgium
- Tamas Korcsmaros
- Earlham Institute, Norwich, United Kingdom
- Quadram Institute Bioscience, Norwich, United Kingdom
- Séverine Vermeire
- Department of Chronic Diseases, Metabolism and Ageing, Translational Research Center for Gastrointestinal Disorders (TARGID), KU Leuven, Leuven, Belgium
- Department of Gastroenterology and Hepatology, University Hospitals Leuven, KU Leuven, Leuven, Belgium
|
15
|
de Jesus LCL, Drumond MM, Aburjaile FF, Sousa TDJ, Coelho-Rocha ND, Profeta R, Brenig B, Mancha-Agresti P, Azevedo V. Probiogenomics of Lactobacillus delbrueckii subsp. lactis CIDCA 133: In Silico, In Vitro, and In Vivo Approaches. Microorganisms 2021; 9:microorganisms9040829. [PMID: 33919849 PMCID: PMC8070793 DOI: 10.3390/microorganisms9040829] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 03/10/2021] [Revised: 03/24/2021] [Accepted: 03/30/2021] [Indexed: 11/16/2022] Open
Abstract
Lactobacillus delbrueckii subsp. lactis CIDCA 133 (CIDCA 133) has been reported as a potential probiotic strain, presenting immunomodulatory properties. This study investigated the possible genes and molecular mechanism involved with a probiotic profile of CIDCA 133 through a genomic approach associated with in vitro and in vivo analysis. Genomic analysis corroborates the species identification carried out by the classical microbiological method. Phenotypic assays demonstrated that the CIDCA 133 strain could survive acidic, osmotic, and thermic stresses. In addition, this strain shows antibacterial activity against Salmonella Typhimurium and presents immunostimulatory properties capable of upregulating anti-inflammatory cytokines Il10 and Tgfb1 gene expression through inhibition of Nfkb1 gene expression. These reported effects can be associated with secreted, membrane/exposed to the surface and cytoplasmic proteins, and bacteriocins-encoding genes predicted in silico. Furthermore, our results showed the genes and the possible mechanisms used by CIDCA 133 to produce their beneficial host effects and highlight its use as a probiotic microorganism.
Affiliation(s)
- Luís Cláudio Lima de Jesus
- Laboratório de Genética Celular e Molecular (LGCM), Departamento de Genética, Ecologia e Evolução, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais (UFMG), Belo Horizonte 31270-901, Brazil; (L.C.L.d.J.); (F.F.A.); (T.d.J.S.); (N.D.C.-R.); (R.P.)
- Mariana Martins Drumond
- Centro Federal de Educação Tecnológica de Minas Gerais (CEFET/MG), Departamento de Ciências Biológicas, Belo Horizonte 31421-169, Brazil;
- Flávia Figueira Aburjaile
- Laboratório de Genética Celular e Molecular (LGCM), Departamento de Genética, Ecologia e Evolução, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais (UFMG), Belo Horizonte 31270-901, Brazil; (L.C.L.d.J.); (F.F.A.); (T.d.J.S.); (N.D.C.-R.); (R.P.)
- Laboratório de Flavivírus, Instituto Oswaldo Cruz, Fundação Oswaldo Cruz, Rio de Janeiro 21040-360, Brazil
- Thiago de Jesus Sousa
- Laboratório de Genética Celular e Molecular (LGCM), Departamento de Genética, Ecologia e Evolução, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais (UFMG), Belo Horizonte 31270-901, Brazil; (L.C.L.d.J.); (F.F.A.); (T.d.J.S.); (N.D.C.-R.); (R.P.)
- Nina Dias Coelho-Rocha
- Laboratório de Genética Celular e Molecular (LGCM), Departamento de Genética, Ecologia e Evolução, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais (UFMG), Belo Horizonte 31270-901, Brazil; (L.C.L.d.J.); (F.F.A.); (T.d.J.S.); (N.D.C.-R.); (R.P.)
- Rodrigo Profeta
- Laboratório de Genética Celular e Molecular (LGCM), Departamento de Genética, Ecologia e Evolução, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais (UFMG), Belo Horizonte 31270-901, Brazil; (L.C.L.d.J.); (F.F.A.); (T.d.J.S.); (N.D.C.-R.); (R.P.)
- Bertram Brenig
- Institute of Veterinary Medicine, University of Göttingen, D-37077 Göttingen, Germany;
- Vasco Azevedo
- Laboratório de Genética Celular e Molecular (LGCM), Departamento de Genética, Ecologia e Evolução, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais (UFMG), Belo Horizonte 31270-901, Brazil; (L.C.L.d.J.); (F.F.A.); (T.d.J.S.); (N.D.C.-R.); (R.P.)
|
16
|
Wang Y, Zhou M, Zou Q, Xu L. Machine learning for phytopathology: from the molecular scale towards the network scale. Brief Bioinform 2021; 22:6204793. [PMID: 33787847 DOI: 10.1093/bib/bbab037] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 12/15/2020] [Revised: 01/09/2021] [Accepted: 01/26/2021] [Indexed: 01/16/2023] Open
Abstract
With the increasing volume of high-throughput sequencing data from a variety of omics techniques in the field of plant-pathogen interactions, sorting, retrieving, processing and visualizing biological information have become a great challenge. Within the explosion of data, machine learning offers powerful tools to process these complex omics data by various algorithms, such as Bayesian reasoning, support vector machine and random forest. Here, we introduce the basic frameworks of machine learning in dissecting plant-pathogen interactions and discuss the applications and advances of machine learning in plant-pathogen interactions from molecular to network biology, including the prediction of pathogen effectors, plant disease resistance protein monitoring and the discovery of protein-protein networks. The aim of this review is to provide a summary of advances in plant defense and pathogen infection and to indicate the important developments of machine learning in phytopathology.
Affiliation(s)
- Yansu Wang
- Postdoctoral Innovation Practice Base, Shenzhen Polytechnic, China
- Quan Zou
- University of Electronic Science and Technology of China
- Lei Xu
- Shenzhen Polytechnic, China
|
17
|
Lian X, Yang X, Yang S, Zhang Z. Current status and future perspectives of computational studies on human-virus protein-protein interactions. Brief Bioinform 2021; 22:6161422. [PMID: 33693490 DOI: 10.1093/bib/bbab029] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 11/10/2020] [Revised: 01/14/2021] [Accepted: 01/20/2021] [Indexed: 12/19/2022] Open
Abstract
The protein-protein interactions (PPIs) between human and viruses mediate viral infection and host immunity processes. Therefore, the study of human-virus PPIs can help us understand the principles of human-virus relationships and can thus guide the development of highly effective drugs to break the transmission of viral infectious diseases. Recent years have witnessed the rapid accumulation of experimentally identified human-virus PPI data, which provides an unprecedented opportunity for bioinformatics studies revolving around human-virus PPIs. In this article, we provide a comprehensive overview of computational studies on human-virus PPIs, especially focusing on the method development for human-virus PPI predictions. We briefly introduce the experimental detection methods and existing database resources of human-virus PPIs, and then discuss the research progress in the development of computational prediction methods. In particular, we elaborate the machine learning-based prediction methods and highlight the need to embrace state-of-the-art deep-learning algorithms and new feature engineering techniques (e.g. the protein embedding technique derived from natural language processing). To further advance the understanding in this research topic, we also outline the practical applications of the human-virus interactome in fundamental biological discovery and new antiviral therapy development.
Affiliation(s)
- Xianyi Lian
- State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing 100193, China
- Xiaodi Yang
- State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing 100193, China
- Shiping Yang
- State Key Laboratory of Plant Physiology and Biochemistry, College of Biological Sciences, China Agricultural University, Beijing 100193, China
- Ziding Zhang
- State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing 100193, China
|
18
|
Yang X, Lian X, Fu C, Wuchty S, Yang S, Zhang Z. HVIDB: a comprehensive database for human-virus protein-protein interactions. Brief Bioinform 2021; 22:832-844. [PMID: 33515030 DOI: 10.1093/bib/bbaa425] [Citation(s) in RCA: 40] [Impact Index Per Article: 13.3] [Received: 09/23/2020] [Revised: 11/12/2020] [Accepted: 12/19/2020] [Indexed: 12/22/2022] Open
Abstract
Viral infectious diseases cause millions of deaths every year, and their treatment remains a major public health challenge. Therefore, an in-depth understanding of human-virus protein-protein interactions (PPIs), the molecular interface between a virus and its host cell, is of paramount importance for gaining new insights into the pathogenesis of viral infections and for developing antiviral treatments. However, current human-virus PPI database resources are incomplete, lack annotation and usually do not provide the opportunity to computationally predict human-virus PPIs. Here, we present the Human-Virus Interaction DataBase (HVIDB, http://zzdlab.com/hvidb/), which provides comprehensively annotated human-virus PPI data and seamlessly integrates online PPI prediction tools. Currently, HVIDB highlights 48 643 experimentally verified human-virus PPIs covering 35 virus families, 6633 virally targeted host complexes, 3572 host dependency/restriction factors as well as 911 experimentally verified/predicted 3D complex structures of human-virus PPIs. Furthermore, our database resource provides tissue-specific expression profiles of 6790 human genes that are targeted by viruses and 129 Gene Expression Omnibus series of differentially expressed genes post-viral infections. Based on these multifaceted and annotated data, our database allows users to easily obtain reliable information about PPIs of various human viruses and conduct an in-depth analysis of their inherent biological significance. In particular, HVIDB also integrates well-performing machine learning models to predict interactions between the human host and viral proteins that are based on (i) sequence embedding techniques, (ii) interolog mapping and (iii) domain-domain interaction inference. We anticipate that HVIDB will serve as a one-stop knowledge base to further guide hypothesis-driven experimental efforts to investigate human-virus relationships.
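Of the three prediction strategies listed in this abstract, interolog mapping is the simplest to illustrate: a known interaction between two template proteins is transferred to their homologs in the species of interest. The sketch below is only an illustration of that general idea; the function name, the protein identifiers, and the homolog tables are hypothetical, and nothing here reflects how HVIDB actually implements its predictor.

```python
from itertools import product
from typing import Dict, Iterable, Set, Tuple


def interolog_map(template_ppis: Iterable[Tuple[str, str]],
                  human_homologs: Dict[str, Set[str]],
                  viral_homologs: Dict[str, Set[str]]) -> Set[Tuple[str, str]]:
    """Transfer each template interaction (a, b) to every (human, viral) pair
    where the human protein is a homolog of a and the viral protein of b."""
    predicted: Set[Tuple[str, str]] = set()
    for a, b in template_ppis:
        humans = human_homologs.get(a, set())
        viruses = viral_homologs.get(b, set())
        # No homolog on either side means nothing can be transferred.
        predicted.update(product(humans, viruses))
    return predicted


# Hypothetical template interactions and homolog tables (placeholder ids).
template_ppis = [("T1", "T2"), ("T3", "T4")]
human_homologs = {"T1": {"H1", "H2"}, "T3": {"H3"}}
viral_homologs = {"T2": {"V1"}}  # T4 has no viral homolog in this toy example

predicted = interolog_map(template_ppis, human_homologs, viral_homologs)
print(sorted(predicted))
```

In practice the homolog tables would come from a sequence similarity search with some cutoff, and that cutoff largely determines how many false transfers are made.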
Affiliation(s)
- Xiaodi Yang
- State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing 100193, China
- Xianyi Lian
- State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing 100193, China
- Chen Fu
- State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing 100193, China
- Stefan Wuchty
- Institute of Data Science and Sylvester Comprehensive Cancer Center at the University of Miami, Miami, FL, USA
- Shiping Yang
- State Key Laboratory of Plant Physiology and Biochemistry, College of Biological Sciences, China Agricultural University, Beijing 100193, China
- Ziding Zhang
- State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing 100193, China
|
19
|
Environmental conditions modulate the protein content and immunomodulatory activity of extracellular vesicles produced by the probiotic Propionibacterium freudenreichii. Appl Environ Microbiol 2021; 87:AEM.02263-20. [PMID: 33310709 PMCID: PMC7851693 DOI: 10.1128/aem.02263-20] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 12/14/2022] Open
Abstract
Propionibacterium freudenreichii is a probiotic Gram-positive bacterium with promising immunomodulatory properties. It modulates regulatory cytokines and mitigates the inflammatory response in vitro and in vivo. These properties were initially attributed to specific bacterial surface proteins. Recently, we showed that extracellular vesicles (EVs) produced by P. freudenreichii CIRM-BIA129 mimic the immunomodulatory features of parent cells in vitro (i.e. modulating NF-κB transcription factor activity and IL-8 release), which underlines the role of EVs as mediators of the probiotic effects of the bacterium. Modulating EV properties, particularly those with potential therapeutic applications such as the EVs produced by the probiotic P. freudenreichii, is one of the challenges in the field if efficient yields with the desired functionality are to be achieved. Here we evaluated whether the culture medium in which the bacteria are grown could be used as a lever to modulate the protein content and hence the properties of P. freudenreichii CIRM-BIA129 EVs. The physical, biochemical and functional properties of EVs produced from cells cultivated on laboratory Yeast Extract Lactate (YEL) medium and cow milk ultrafiltrate (UF) medium were compared. UF-derived EVs were more abundant, smaller in diameter and displayed more intense anti-inflammatory activity than YEL-derived EVs. Furthermore, the growth media modulated EV content in terms of both the identities and abundances of their protein cargos, suggesting different patterns of interaction with the host. Proteins involved in amino acid metabolism and central carbon metabolism were modulated, as were the key surface proteins mediating host-propionibacteria interactions. Importance Extracellular vesicles (EVs) are cellular membrane-derived nanosized particles that are produced by most cells in all three kingdoms of life. They play a pivotal role in cell-cell communication through their ability to transport bioactive molecules from donor to recipient cells. Bacterial EVs are important factors in host-microbe interactions. Recently we have shown that EVs produced by the probiotic P. freudenreichii exhibited immunomodulatory properties. We evaluate here the impact of environmental conditions, notably culture media, on P. freudenreichii EV production and function. We show that EVs display considerable differences in protein cargo and immunomodulation depending on the culture medium used. This work offers new perspectives for the development of probiotic EV-based molecular delivery systems, and reinforces the optimization of growth conditions as a tool to modulate the potential therapeutic applications of EVs.
|
20
|
Mathews N, Tran T, Rekabdar B, Ekenna C. Predicting human–pathogen protein–protein interactions using Natural Language Processing methods. Informatics in Medicine Unlocked 2021. [DOI: 10.1016/j.imu.2021.100738] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022] Open
|
21
|
McCoubrey LE, Elbadawi M, Orlu M, Gaisford S, Basit AW. Harnessing machine learning for development of microbiome therapeutics. Gut Microbes 2021; 13:1-20. [PMID: 33522391 PMCID: PMC7872042 DOI: 10.1080/19490976.2021.1872323] [Citation(s) in RCA: 37] [Impact Index Per Article: 12.3] [Received: 11/27/2020] [Accepted: 12/20/2020] [Indexed: 02/06/2023] Open
Abstract
The last twenty years of seminal microbiome research has uncovered microbiota's intrinsic relationship with human health. Studies elucidating the relationship between an unbalanced microbiome and disease are currently published daily. As such, microbiome big data have become a reality that provide a mine of information for the development of new therapeutics. Machine learning (ML), a branch of artificial intelligence, offers powerful techniques for big data analysis and prediction-making, that are out of reach of human intellect alone. This review will explore how ML can be applied for the development of microbiome-targeted therapeutics. A background on ML will be given, followed by a guide on where to find reliable microbiome big data. Existing applications and opportunities will be discussed, including the use of ML to discover, design, and characterize microbiome therapeutics. The use of ML to optimize advanced processes, such as 3D printing and in silico prediction of drug-microbiome interactions, will also be highlighted. Finally, barriers to adoption of ML in academic and industrial settings will be examined, concluded by a future outlook for the field.
Affiliation(s)
- Moe Elbadawi
- UCL School of Pharmacy, University College London, London, UK
- Mine Orlu
- UCL School of Pharmacy, University College London, London, UK
- Simon Gaisford
- UCL School of Pharmacy, University College London, London, UK
- FabRx Ltd., Ashford, Kent, UK
- Abdul W. Basit
- UCL School of Pharmacy, University College London, London, UK
|
22
|
Lian X, Yang X, Shao J, Hou F, Yang S, Pan D, Zhang Z. Prediction and analysis of human-herpes simplex virus type 1 protein-protein interactions by integrating multiple methods. Quantitative Biology 2020. [DOI: 10.1007/s40484-020-0222-5] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 10/22/2022]
|
23
|
Abstract
Human immunodeficiency virus type 1 (HIV-1) depends on a class of host proteins called host dependency factors (HDFs) to facilitate its infection. So far, experimental efforts have detected a certain number of HDFs, but the gene inventory of HIV-1 HDFs remains incomplete. Here, we implemented an existing network-based gene discovery strategy to predict HIV-1 HDFs. First, an encoding scheme based on a publicly available human tissue-specific gene functional network (GIANT; http://giant.princeton.edu/) was designed to convert each human gene into a 25,825-dimensional feature vector. Then, a random forest-based predictive model was trained on a data set containing 868 known HDFs and 1,736 non-HDFs. In 5-fold cross-validation, an independent test, and a comparison with one existing method, the proposed prediction method consistently showed accurate and competitive performance. The strength of our method can be ascribed to the GIANT encoding scheme, which contains rich information about gene interactions. By merging known HDFs and genome-wide HDF prediction results, network analysis was conducted to capture the common patterns of HDFs in the context of the GIANT network. Interestingly, HDFs show significantly lower betweenness than HIV-1-interacting human proteins (i.e., HIV targets). The functional roles of HDFs were also examined by mapping all the HDF candidates onto human protein complexes. In particular, we observed the frequent co-occurrence of HDFs and HIV targets at the protein complex level. Collectively, we hope the proposed prediction method can not only accelerate HDF identification and antiviral drug target discovery but also provide some mechanistic insights into human-virus relationships. IMPORTANCE Identification of HIV-1 HDFs remains a crucial step to understand the complicated relationships between human and HIV-1. To complement the experimental identification of HDFs, we have implemented an existing network-based gene discovery strategy to predict HDFs from the human genome. The core idea of the proposed method is that the rich information deposited in host gene functional networks can be effectively utilized to infer the potential HDFs. We hope the proposed prediction method could further guide hypothesis-driven experimental efforts to interrogate human–HIV-1 relationships and provide new hints for the development of antiviral drugs to combat HIV-1 infection.
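One way to read the network-based encoding described above is that each gene is represented by its vector of edge weights to every gene in the functional network, so that the vector length equals the number of network genes. The sketch below assumes that reading, which the abstract does not spell out; the three-gene network, its weights, and the function name are hypothetical, and the real GIANT network is far larger.

```python
from typing import Dict, List, Tuple


def encode_genes(edges: Dict[Tuple[str, str], float],
                 genes: List[str]) -> Dict[str, List[float]]:
    """Encode every gene as its row of the weighted adjacency matrix.

    Assumption: one feature per network gene, holding the functional
    edge weight to that gene (0.0 when no edge is present).
    """
    index = {gene: i for i, gene in enumerate(genes)}
    vectors = {gene: [0.0] * len(genes) for gene in genes}
    for (a, b), weight in edges.items():
        # The network is treated as undirected, so fill both rows.
        vectors[a][index[b]] = weight
        vectors[b][index[a]] = weight
    return vectors


# Hypothetical toy network with three genes and two weighted edges.
genes = ["G1", "G2", "G3"]
edges = {("G1", "G2"): 0.8, ("G2", "G3"): 0.3}

vectors = encode_genes(edges, genes)
print(vectors["G2"])
```

Vectors of this kind could then be passed as feature rows to any standard classifier, such as the random forest mentioned in the abstract, with known HDFs as positives and non-HDFs as negatives.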
|
24
|
Wang W, Guan X, Khan MT, Xiong Y, Wei DQ. LMI-DForest: A deep forest model towards the prediction of lncRNA-miRNA interactions. Comput Biol Chem 2020; 89:107406. [PMID: 33120126 DOI: 10.1016/j.compbiolchem.2020.107406] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 07/16/2020] [Revised: 10/12/2020] [Accepted: 10/15/2020] [Indexed: 02/07/2023]
Abstract
The interactions between miRNAs and long non-coding RNAs (lncRNAs) have been studied intensively in recent years because of their critical role in gene regulation. Computational prediction of lncRNA-miRNA interactions has become a popular alternative to experimental methods for identifying such interactions. It is desirable to develop machine learning-based models that predict lncRNA-miRNA interactions from experimentally validated interactions between lncRNAs and miRNAs. The accuracy and robustness of existing machine learning models leave room for improvement. Considering that the attributes of lncRNAs and miRNAs are of key importance to the interaction between these two RNAs, a deep learning model named LMI-DForest is proposed here, combining the deep forest and autoencoder strategies. Systematic comparison on experimentally validated lncRNA-miRNA interaction datasets demonstrates that the proposed method consistently outperforms other machine learning models in lncRNA-miRNA interaction prediction.
Affiliation(s)
- Wei Wang
- School of Mathematical Sciences, Shanghai Jiao Tong University, Shanghai, China
- Xiaoqing Guan
- Institute of Interdisciplinary Integrative Medicine Research, Shanghai University of Traditional Chinese Medicine, Shanghai, China
- Muhammad Tahir Khan
- Institute of Molecular Biology and Biotechnology, The University of Lahore Pakistan, Pakistan
- Yi Xiong
- State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic and Developmental Sciences, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China.
- Dong-Qing Wei
- State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic and Developmental Sciences, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China; Peng Cheng Laboratory, Shenzhen, Guangdong, China.
|
25
|
Khatun MS, Shoombuatong W, Hasan MM, Kurata H. Evolution of Sequence-based Bioinformatics Tools for Protein-protein Interaction Prediction. Curr Genomics 2020; 21:454-463. [PMID: 33093807 PMCID: PMC7536797 DOI: 10.2174/1389202921999200625103936] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 01/17/2020] [Revised: 03/19/2020] [Accepted: 05/27/2020] [Indexed: 12/22/2022] Open
Abstract
Protein-protein interactions (PPIs) are the physical connections between two or more proteins via electrostatic forces or hydrophobic effects. Identifying PPIs is pivotal because they contribute to many biological processes, including protein function, disease incidence, and therapy design. The experimental identification of PPIs via high-throughput technology is time-consuming and expensive, and bioinformatics approaches are expected to overcome these limitations. In this review, our main goal is to provide an inclusive view of existing sequence-based computational methods for PPI prediction. We first briefly introduce the currently available PPI databases and then review the state-of-the-art bioinformatics approaches, their working principles, and their performance. Finally, we discuss the caveats and future perspectives of next-generation algorithms for the prediction of PPIs.
Affiliation(s)
- Md. Mehedi Hasan
  - Address correspondence to these authors at the Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan; Japan Society for the Promotion of Science, 5-3-1 Kojimachi, Chiyoda-ku, Tokyo 102-0083, Japan; Tel: +81-948-297-828; E-mail: and Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan; Biomedical Informatics R&D Center, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan; Tel: +81-948-297-828; E-mail:
- Hiroyuki Kurata
  - Address correspondence to these authors at the Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan; Japan Society for the Promotion of Science, 5-3-1 Kojimachi, Chiyoda-ku, Tokyo 102-0083, Japan; Tel: +81-948-297-828; E-mail: and Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan; Biomedical Informatics R&D Center, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan; Tel: +81-948-297-828; E-mail:
|
26
|
Chen C, Zhang Q, Yu B, Yu Z, Lawrence PJ, Ma Q, Zhang Y. Improving protein-protein interactions prediction accuracy using XGBoost feature selection and stacked ensemble classifier. Comput Biol Med 2020; 123:103899. [DOI: 10.1016/j.compbiomed.2020.103899] [Citation(s) in RCA: 52] [Impact Index Per Article: 13.0] [Received: 01/23/2020] [Revised: 06/28/2020] [Accepted: 06/28/2020] [Indexed: 10/23/2022]
|
27
|
Rodovalho VDR, da Luz BSR, Rabah H, do Carmo FLR, Folador EL, Nicolas A, Jardin J, Briard-Bion V, Blottière H, Lapaque N, Jan G, Le Loir Y, de Carvalho Azevedo VA, Guédon E. Extracellular Vesicles Produced by the Probiotic Propionibacterium freudenreichii CIRM-BIA 129 Mitigate Inflammation by Modulating the NF-κB Pathway. Front Microbiol 2020; 11:1544. [PMID: 32733422 PMCID: PMC7359729 DOI: 10.3389/fmicb.2020.01544] [Citation(s) in RCA: 30] [Impact Index Per Article: 7.5] [Received: 03/09/2020] [Accepted: 06/15/2020] [Indexed: 12/20/2022]
Abstract
Extracellular vesicles (EVs) are nanometric spherical structures involved in intercellular communication, whose production is considered to be a widespread phenomenon in living organisms. Bacterial EVs are associated with several processes that include survival, competition, pathogenesis, and immunomodulation. Among probiotic Gram-positive bacteria, some Propionibacterium freudenreichii strains exhibit anti-inflammatory activity, notably via surface proteins such as the surface-layer protein B (SlpB). We have hypothesized that, in addition to surface exposure and secretion of proteins, P. freudenreichii may produce EVs and thus export immunomodulatory proteins to interact with the host. In order to demonstrate their production in this species, EVs were purified from cell-free culture supernatants of the probiotic strain P. freudenreichii CIRM-BIA 129, and their physicochemical characterization, using transmission electron microscopy and nanoparticle tracking analysis (NTA), revealed shapes and sizes typical of EVs. Proteomic characterization showed that EVs contain a broad range of proteins, including immunomodulatory proteins such as SlpB. In silico protein-protein interaction predictions indicated that EV proteins could interact with host proteins, including the immunomodulatory transcription factor NF-κB. This potential interaction has a functional significance because EVs modulate inflammatory responses, as shown by IL-8 release and NF-κB activity, in HT-29 human intestinal epithelial cells. Indeed, EVs displayed an anti-inflammatory effect by modulating the NF-κB pathway; this was dependent on their concentration and on the proinflammatory inducer (LPS-specific). Moreover, while this anti-inflammatory effect partly depended on SlpB, it was not abolished by EV surface proteolysis, suggesting possible intracellular sites of action for EVs. This is the first report on identification of P. freudenreichii-derived EVs, alongside their physicochemical, biochemical and functional characterization. This study has enhanced our understanding of the mechanisms associated with the probiotic activity of P. freudenreichii and identified opportunities to employ bacterial-derived EVs for the development of bioactive products with therapeutic effects.
Affiliation(s)
- Vinícius de Rezende Rodovalho
  - INRAE, Institut Agro, STLO, Rennes, France; Laboratory of Cellular and Molecular Genetics, Institute of Biological Sciences, Federal University of Minas Gerais, Belo Horizonte, Brazil
- Brenda Silva Rosa da Luz
  - INRAE, Institut Agro, STLO, Rennes, France; Laboratory of Cellular and Molecular Genetics, Institute of Biological Sciences, Federal University of Minas Gerais, Belo Horizonte, Brazil
- Fillipe Luiz Rosa do Carmo
  - Laboratory of Cellular and Molecular Genetics, Institute of Biological Sciences, Federal University of Minas Gerais, Belo Horizonte, Brazil
- Edson Luiz Folador
  - Biotechnology Center, Federal University of Paraíba, João Pessoa, Brazil
- Hervé Blottière
  - INRAE, AgroParisTech, Paris-Saclay University, Micalis Institute, Jouy-en-Josas, France
- Nicolas Lapaque
  - INRAE, AgroParisTech, Paris-Saclay University, Micalis Institute, Jouy-en-Josas, France
- Vasco Ariston de Carvalho Azevedo
  - Laboratory of Cellular and Molecular Genetics, Institute of Biological Sciences, Federal University of Minas Gerais, Belo Horizonte, Brazil
|
28
|
Chen H, Li F, Wang L, Jin Y, Chi CH, Kurgan L, Song J, Shen J. Systematic evaluation of machine learning methods for identifying human-pathogen protein-protein interactions. Brief Bioinform 2020; 22:5847611. [PMID: 32459334 DOI: 10.1093/bib/bbaa068] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 01/31/2020] [Revised: 03/31/2020] [Accepted: 04/01/2020] [Indexed: 12/11/2022]
Abstract
In recent years, high-throughput experimental techniques have significantly enhanced the accuracy and coverage of protein-protein interaction identification, including human-pathogen protein-protein interactions (HP-PPIs). Despite this progress, experimental methods are, in general, expensive in terms of both time and labour costs, especially considering the enormous number of potential protein-interacting partners. Developing computational methods to predict interactions between humans and bacterial pathogens has thus become critical and meaningful, both for facilitating the detection of interactions and for mining incomplete interaction maps. In this paper, we present a systematic evaluation of machine learning-based computational methods for human-bacterium protein-protein interactions (HB-PPIs). We first reviewed a vast number of publicly available databases of HP-PPIs and then critically evaluated the availability of these databases. Benefitting from their well-structured nature, we subsequently preprocessed the data and identified six bacterial pathogens that could be used to study bacterium subjects in which a human was the host. Additionally, we thoroughly reviewed the literature on 'host-pathogen interactions' and summarized the existing models, which we used to jointly study the impact of different feature representation algorithms and evaluate the performance of existing machine learning computational models. Owing to the abundance of sequence information and the limited scale of other protein-related information, we adopted the primary protocol from the literature and dedicated our analysis to a comprehensive assessment of sequence information and machine learning models. A systematic evaluation of machine learning models and a wide range of feature representation algorithms based on sequence information is presented as a comparison survey of prediction performance for HB-PPIs.
|
29
|
Li F, Chen J, Ge Z, Wen Y, Yue Y, Hayashida M, Baggag A, Bensmail H, Song J. Computational prediction and interpretation of both general and specific types of promoters in Escherichia coli by exploiting a stacked ensemble-learning framework. Brief Bioinform 2020; 22:2126-2140. [PMID: 32363397 DOI: 10.1093/bib/bbaa049] [Citation(s) in RCA: 48] [Impact Index Per Article: 12.0] [Received: 12/11/2019] [Revised: 02/25/2020] [Accepted: 03/11/2020] [Indexed: 12/12/2022]
Abstract
Promoters are short consensus sequences of DNA, which are responsible for transcription activation or the repression of all genes. There are many types of promoters in bacteria with important roles in initiating gene transcription. Therefore, solving promoter-identification problems has important implications for improving the understanding of their functions. To this end, computational methods targeting promoter classification have been established; however, their performance remains unsatisfactory. In this study, we present a novel stacked-ensemble approach (termed SELECTOR) for identifying both promoters and their respective classification. SELECTOR combines four feature types (the composition of k-spaced nucleic acid pairs, parallel correlation pseudo-dinucleotide composition, position-specific trinucleotide propensity based on single-strand, and DNA strand features) and uses five popular tree-based ensemble learning algorithms to build a stacked model. Both 5-fold cross-validation tests using benchmark datasets and independent tests using the newly collected independent test dataset showed that SELECTOR outperformed state-of-the-art methods in both general and specific types of promoter prediction in Escherichia coli. Furthermore, this novel framework provides essential interpretations that aid understanding of model success by leveraging the powerful Shapley Additive exPlanation algorithm, thereby highlighting the most important features relevant for predicting both general and specific types of promoters and overcoming the limitations of existing 'Black-box' approaches that are unable to reveal causal relationships from large amounts of initially encoded features.
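As an illustration of one feature type named in this abstract, the composition of k-spaced nucleic acid pairs can be sketched as below. This is a minimal sketch under stated assumptions, not SELECTOR's implementation: the function name `k_spaced_pair_composition` is hypothetical, and normalizing by the number of valid pairs is one common convention that the abstract itself does not specify.

```python
from itertools import product

NUCLEOTIDES = "ACGT"


def k_spaced_pair_composition(seq, k):
    """Return the frequency of each ordered nucleotide pair whose two
    members are separated by exactly k intervening positions.

    Hypothetical helper for illustration only; the normalization (divide
    by the number of valid pairs) is an assumption.
    """
    seq = seq.upper()
    pairs = ["".join(p) for p in product(NUCLEOTIDES, repeat=2)]
    counts = dict.fromkeys(pairs, 0)
    total = 0
    # Position i is paired with position i + k + 1 (k nucleotides in between).
    for i in range(len(seq) - k - 1):
        pair = seq[i] + seq[i + k + 1]
        if pair in counts:  # skip pairs containing ambiguous bases such as N
            counts[pair] += 1
            total += 1
    if total == 0:
        return {p: 0.0 for p in pairs}
    return {p: c / total for p, c in counts.items()}


if __name__ == "__main__":
    features = k_spaced_pair_composition("ACGTACGT", k=1)
    print(features["AG"])
```

Each value of k yields a 16-dimensional vector; concatenating the vectors for several values of k gives a fixed-length encoding that a tree-based learner can consume regardless of sequence length.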
Affiliation(s)
- Fuyi Li
  - Northwest A&F University, China; Department of Biochemistry and Molecular Biology and the Infection and Immunity Program, Biomedicine Discovery Institute, Monash University, Australia
- Jinxiang Chen
  - Biomedicine Discovery Institute and the Department of Biochemistry and Molecular Biology, Monash University from the College of Information Engineering, Northwest A&F University, China
- Zongyuan Ge
  - Monash University and also serves as a Deep Learning Specialist at NVIDIA AI Technology Centre. Before joining Monash, he was a research scientist at IBM Research Australia doing research in medical AI during 2016-2018. His research interests are AI, computer vision, medical image, robotics and deep learning
- Ya Wen
  - computer technology from Ningxia University, China
- Yanwei Yue
  - medical science from Southern Medical University, China
- Morihiro Hayashida
  - informatics from Kyoto University, Japan, in 2005. He is an Assistant Professor in the Department of Electrical Engineering and Computer Science, National Institute of Technology, Matsue College, Japan
- Abdelkader Baggag
  - computer science from the University of Minnesota. He is a Senior Scientist at the Qatar Computing Research Institute (QCRI) and has a joint appointment as an Associate Professor at Hamad Bin Khalifa University (HBKU) in the Division of Information and Computing Technology. His research interests include data mining, linear algebra and machine learning
- Halima Bensmail
  - University of Pierre & Marie Curie (Paris 6) in France. She is currently a Principal Scientist at QCRI-HBKU and a joint Associate Professor at the College of Computer and Science Engineering, HBKU
- Jiangning Song
  - Monash Biomedicine Discovery Institute, Monash University, Australia. He is also affiliated with the Monash Centre for Data Science, Faculty of Information Technology, Monash University. His research interests include bioinformatics, computational biology, machine learning, data mining, and pattern recognition
|
30
|
Yang X, Yang S, Li Q, Wuchty S, Zhang Z. Prediction of human-virus protein-protein interactions through a sequence embedding-based machine learning method. Comput Struct Biotechnol J 2019; 18:153-161. [PMID: 31969974 PMCID: PMC6961065 DOI: 10.1016/j.csbj.2019.12.005] [Citation(s) in RCA: 68] [Impact Index Per Article: 13.6] [Received: 10/10/2019] [Revised: 11/29/2019] [Accepted: 12/10/2019] [Indexed: 12/11/2022]
Abstract
The identification of human-virus protein-protein interactions (PPIs) is an essential and challenging research topic, potentially providing a mechanistic understanding of viral infection. Given that the experimental determination of human-virus PPIs is time-consuming and labor-intensive, computational methods are playing an important role in providing testable hypotheses, complementing the determination of large-scale interactomes between species. In this work, we applied an unsupervised sequence embedding technique (doc2vec) to represent protein sequences as rich feature vectors of low dimensionality. By training a Random Forest (RF) classifier on a dataset that covers known PPIs between human and all viruses, we obtained excellent predictive accuracy, outperforming various combinations of machine learning algorithms and commonly used sequence encoding schemes. In a rigorous comparison with three existing human-virus PPI prediction methods, our proposed computational framework also provided very competitive and promising performance, suggesting that the doc2vec encoding scheme effectively captures context information of protein sequences pertaining to the corresponding protein-protein interactions. Our approach is freely accessible through our web server as part of our host-pathogen PPI prediction platform (http://zzdlab.com/InterSPPI/). Taken together, we hope the current work not only contributes a useful predictor to accelerate the exploration of human-virus PPIs, but also provides some meaningful insights into human-virus relationships.
Key Words
- AC, Auto Covariance
- ACC, Accuracy
- AUC, area under the ROC curve
- AUPRC, area under the PR curve
- Adaboost, Adaptive Boosting
- CT, Conjoint Triad
- Doc2vec
- Embedding
- Human-virus interaction
- LD, Local Descriptor
- MCC, Matthews correlation coefficient
- ML, machine learning
- MLP, Multiple Layer Perceptron
- MS, mass spectroscopy
- Machine learning
- PPIs, protein-protein interactions
- PR, Precision-Recall
- Prediction
- Protein-protein interaction
- RBF, radial basis function
- RF, Random Forest
- ROC, Receiver Operating Characteristic
- SGD, stochastic gradient descent
- SVM, Support Vector Machine
- Y2H, yeast two-hybrid
Affiliation(s)
- Xiaodi Yang
  - State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing 100193, China
- Shiping Yang
  - State Key Laboratory of Plant Physiology and Biochemistry, College of Biological Sciences, China Agricultural University, Beijing 100193, China
- Qinmengge Li
  - National Demonstration Center for Experimental Biological Sciences Education, College of Biological Sciences, China Agricultural University, Beijing 100193, China
- Stefan Wuchty
  - Dept. of Computer Science, University of Miami, Miami, FL 33146, USA
  - Dept. of Biology, University of Miami, Miami, FL 33146, USA
  - Center of Computational Science, University of Miami, Miami, FL 33146, USA
  - Sylvester Comprehensive Cancer Center, University of Miami, Miami, FL 33136, USA
- Ziding Zhang
  - State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing 100193, China
|