Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Tan J, Fang Z, Wu S, Guo Q, Jiang X, Zhu H. HoPhage: an ab initio tool for identifying hosts of phage fragments from metaviromes. Bioinformatics 2021;38:543-545. [PMID: 34383025 PMCID: PMC8723153 DOI: 10.1093/bioinformatics/btab585] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 02/10/2021] [Revised: 07/27/2021] [Accepted: 08/10/2021] [Indexed: 02/03/2023]

Cited by Other Article(s)

Chen J, Sun C, Dong Y, Jin M, Lai S, Jia L, Zhao X, Wang H, Gao NL, Bork P, Liu Z, Chen W, Zhao X. Efficient Recovery of Complete Gut Viral Genomes by Combined Short- and Long-Read Sequencing. Adv Sci (Weinh) 2024;11:e2305818. [PMID: 38240578 PMCID: PMC10987132 DOI: 10.1002/advs.202305818] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/18/2023] [Revised: 12/01/2023] [Indexed: 04/04/2024]

Affiliation(s)

Jingchao Chen: Key Laboratory of Molecular Biophysics of the Ministry of Education, Hubei Key Laboratory of Bioinformatics and Molecular Imaging, Center for Artificial Intelligence Biology, Department of Bioinformatics and Systems Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei 430074, China
Chuqing Sun: Key Laboratory of Molecular Biophysics of the Ministry of Education, Hubei Key Laboratory of Bioinformatics and Molecular Imaging, Center for Artificial Intelligence Biology, Department of Bioinformatics and Systems Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei 430074, China
Yanqi Dong: Department of Neurology, Zhongshan Hospital and Institute of Science and Technology for Brain‐Inspired Intelligence, Fudan University, Shanghai 200433, China
Menglu Jin: Key Laboratory of Molecular Biophysics of the Ministry of Education, Hubei Key Laboratory of Bioinformatics and Molecular Imaging, Center for Artificial Intelligence Biology, Department of Bioinformatics and Systems Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei 430074, China; College of Life Science, Henan Normal University, Xinxiang, Henan 453007, China
Senying Lai: Department of Neurology, Zhongshan Hospital and Institute of Science and Technology for Brain‐Inspired Intelligence, Fudan University, Shanghai 200433, China
Longhao Jia: Department of Neurology, Zhongshan Hospital and Institute of Science and Technology for Brain‐Inspired Intelligence, Fudan University, Shanghai 200433, China
Xueyang Zhao: College of Life Science, Henan Normal University, Xinxiang, Henan 453007, China
Huarui Wang: Key Laboratory of Molecular Biophysics of the Ministry of Education, Hubei Key Laboratory of Bioinformatics and Molecular Imaging, Center for Artificial Intelligence Biology, Department of Bioinformatics and Systems Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei 430074, China
Na L. Gao: Key Laboratory of Molecular Biophysics of the Ministry of Education, Hubei Key Laboratory of Bioinformatics and Molecular Imaging, Center for Artificial Intelligence Biology, Department of Bioinformatics and Systems Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei 430074, China; Department of Laboratory Medicine, Zhongnan Hospital of Wuhan University, Wuhan University, Wuhan 430071, China
Peer Bork: European Molecular Biology Laboratory, Structural and Computational Biology Unit, 69117 Heidelberg, Germany; Max Delbrück Centre for Molecular Medicine, 13125 Berlin, Germany; Yonsei Frontier Lab (YFL), Yonsei University, 03722 Seoul, South Korea; Department of Bioinformatics, Biocenter, University of Würzburg, 97070 Würzburg, Germany
Zhi Liu: Department of Biotechnology, College of Life Science and Technology, Huazhong University of Science and Technology, 430074 Wuhan, China
Wei‐Hua Chen: Key Laboratory of Molecular Biophysics of the Ministry of Education, Hubei Key Laboratory of Bioinformatics and Molecular Imaging, Center for Artificial Intelligence Biology, Department of Bioinformatics and Systems Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei 430074, China; College of Life Science, Henan Normal University, Xinxiang, Henan 453007, China; Institution of Medical Artificial Intelligence, Binzhou Medical University, Yantai 264003, China
Xing‐Ming Zhao: Department of Neurology, Zhongshan Hospital and Institute of Science and Technology for Brain‐Inspired Intelligence, Fudan University, Shanghai 200433, China; MOE Key Laboratory of Computational Neuroscience and Brain‐Inspired Intelligence and MOE Frontiers Center for Brain Science, Fudan University, Shanghai 200433, China; State Key Laboratory of Medical Neurobiology, Institute of Brain Science, Fudan University, Shanghai 200433, China; International Human Phenome Institutes (Shanghai), Shanghai 200433, China

Liu X, Liu Y, Liu J, Zhang H, Shan C, Guo Y, Gong X, Cui M, Li X, Tang M. Correlation between the gut microbiome and neurodegenerative diseases: a review of metagenomics evidence. Neural Regen Res 2024;19:833-845. [PMID: 37843219 PMCID: PMC10664138 DOI: 10.4103/1673-5374.382223] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 02/23/2023] [Revised: 04/19/2023] [Accepted: 06/17/2023] [Indexed: 10/17/2023]

Mahony J. Biological and bioinformatic tools for the discovery of unknown phage-host combinations. Curr Opin Microbiol 2024;77:102426. [PMID: 38246125 DOI: 10.1016/j.mib.2024.102426] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/21/2023] [Revised: 12/21/2023] [Accepted: 01/02/2024] [Indexed: 01/23/2024]

Yin H, Wu S, Tan J, Guo Q, Li M, Guo J, Wang Y, Jiang X, Zhu H. IPEV: identification of prokaryotic and eukaryotic virus-derived sequences in virome using deep learning. Gigascience 2024;13:giae018. [PMID: 38649300 PMCID: PMC11034026 DOI: 10.1093/gigascience/giae018] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/23/2023] [Revised: 03/14/2024] [Accepted: 03/25/2024] [Indexed: 04/25/2024]

Affiliation(s)

Hengchuang Yin: Department of Biomedical Engineering, College of Future Technology, and Center for Quantitative Biology, Peking University, Beijing 100871, China
Shufang Wu: Department of Biomedical Engineering, College of Future Technology, and Center for Quantitative Biology, Peking University, Beijing 100871, China
Jie Tan: Department of Biomedical Engineering, College of Future Technology, and Center for Quantitative Biology, Peking University, Beijing 100871, China
Qian Guo: Department of Biomedical Engineering, College of Future Technology, and Center for Quantitative Biology, Peking University, Beijing 100871, China
Mo Li: Department of Biomedical Engineering, College of Future Technology, and Center for Quantitative Biology, Peking University, Beijing 100871, China; School of Life Sciences, Peking University, Beijing 100871, China
Jinyuan Guo: Department of Biomedical Engineering, College of Future Technology, and Center for Quantitative Biology, Peking University, Beijing 100871, China; Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, GA 30332, USA
Yaqi Wang: Department of Biomedical Engineering, College of Future Technology, and Center for Quantitative Biology, Peking University, Beijing 100871, China
Xiaoqing Jiang: Department of Biomedical Engineering, College of Future Technology, and Center for Quantitative Biology, Peking University, Beijing 100871, China; Beijing Institute of Genomics, Chinese Academy of Sciences, and China National Center for Bioinformation, Beijing 100101, China
Huaiqiu Zhu: Department of Biomedical Engineering, College of Future Technology, and Center for Quantitative Biology, Peking University, Beijing 100871, China; School of Life Sciences, Peking University, Beijing 100871, China; Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, GA 30332, USA

Zhang YZ, Liu Y, Bai Z, Fujimoto K, Uematsu S, Imoto S. Zero-shot-capable identification of phage-host relationships with whole-genome sequence representation by contrastive learning. Brief Bioinform 2023;24:bbad239. [PMID: 37466138 PMCID: PMC10516345 DOI: 10.1093/bib/bbad239] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/15/2023] [Revised: 05/17/2023] [Accepted: 06/08/2023] [Indexed: 07/20/2023]

Abstract

Accurately identifying phage-host relationships from their genome sequences is still challenging, especially for those phages and hosts with less homologous sequences. In this work, focusing on identifying the phage-host relationships at the species and genus level, we propose a contrastive learning based approach to learn whole-genome sequence embeddings that can take account of phage-host interactions (PHIs). Contrastive learning is used to make phages infecting the same hosts close to each other in the new representation space. Specifically, we rephrase whole-genome sequences with frequency chaos game representation (FCGR) and learn latent embeddings that 'encapsulate' phages and host relationships through contrastive learning. The contrastive learning method works well on the imbalanced dataset. Based on the learned embeddings, a proposed pipeline named CL4PHI can predict known hosts and unseen hosts in training. We compare our method with two recently proposed state-of-the-art learning-based methods on their benchmark datasets. The experiment results demonstrate that the proposed method using contrastive learning improves the prediction accuracy on known hosts and demonstrates a zero-shot prediction capability on unseen hosts. In terms of potential applications, the rapid pace of genome sequencing across different species has resulted in a vast amount of whole-genome sequencing data that require efficient computational methods for identifying phage-host interactions. The proposed approach is expected to address this need by efficiently processing whole-genome sequences of phages and prokaryotic hosts and capturing features related to phage-host relationships for genome sequence representation. This approach can be used to accelerate the discovery of phage-host interactions and aid in the development of phage-based therapies for infectious diseases.
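The frequency chaos game representation (FCGR) named in this abstract can be illustrated with a short sketch. The code below is not taken from CL4PHI; it is a minimal standard-library version that assumes one common corner layout (A, C, G, T at the four corners of the unit square) and an illustrative function name `fcgr`. It maps every k-mer of a sequence to one cell of a 2^k by 2^k grid and returns the normalized counts, which is the kind of image-like input an embedding model could then be trained on.

```python
# Corner assignment is an assumption; published FCGR variants differ in layout.
CORNER = {"A": (0, 0), "C": (0, 1), "G": (1, 1), "T": (1, 0)}


def fcgr(seq, k=3):
    """Return a 2^k x 2^k grid of normalized k-mer frequencies (list of lists)."""
    size = 2 ** k
    grid = [[0.0] * size for _ in range(size)]
    seq = seq.upper()
    total = 0
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        # Skip k-mers containing ambiguous bases such as N.
        if any(base not in CORNER for base in kmer):
            continue
        x = y = 0
        # In the chaos game the last base selects the coarsest quadrant,
        # so it is processed first and ends up as the most significant bit.
        for base in reversed(kmer):
            bx, by = CORNER[base]
            x = (x << 1) | bx
            y = (y << 1) | by
        grid[y][x] += 1.0
        total += 1
    if total:
        grid = [[count / total for count in row] for row in grid]
    return grid


if __name__ == "__main__":
    for row in fcgr("ACGTACGTTTGA", k=2):
        print(row)
```

A larger k gives a finer grid at the cost of sparser counts; the choice of k and of the corner layout are assumptions of this sketch rather than details stated in the abstract.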

Gonzales MEM, Ureta JC, Shrestha AMS. Protein embeddings improve phage-host interaction prediction. PLoS One 2023;18:e0289030. [PMID: 37486915 PMCID: PMC10365317 DOI: 10.1371/journal.pone.0289030] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 03/21/2023] [Accepted: 07/07/2023] [Indexed: 07/26/2023]

Roux S, Camargo AP, Coutinho FH, Dabdoub SM, Dutilh BE, Nayfach S, Tritt A. iPHoP: An integrated machine learning framework to maximize host prediction for metagenome-derived viruses of archaea and bacteria. PLoS Biol 2023;21:e3002083. [PMID: 37083735 PMCID: PMC10155999 DOI: 10.1371/journal.pbio.3002083] [Citation(s) in RCA: 47] [Impact Index Per Article: 47.0] [Received: 08/30/2022] [Revised: 05/03/2023] [Accepted: 03/15/2023] [Indexed: 04/22/2023]

Viral Metagenomic Analysis of the Fecal Samples in Domestic Dogs (Canis lupus familiaris). Viruses 2023;15:685. [PMID: 36992396 PMCID: PMC10058366 DOI: 10.3390/v15030685] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/17/2023] [Revised: 02/24/2023] [Accepted: 03/02/2023] [Indexed: 03/08/2023]

Abstract

Canine diarrhea is a common intestinal illness that is usually caused by viruses, bacteria, and parasites, and canine diarrhea may induce morbidity and mortality of domestic dogs if treated improperly. Recently, viral metagenomics was applied to investigate the signatures of the enteric virome in mammals. In this research, the characteristics of the gut virome in healthy dogs and dogs with diarrhea were analyzed and compared using viral metagenomics. The alpha diversity analysis indicated that the richness and diversity of the gut virome in the dogs with diarrhea were much higher than the healthy dogs, while the beta diversity analysis revealed that the gut virome of the two groups was quite different. At the family level, the predominant viruses in the canine gut virome were certified to be Microviridae, Parvoviridae, Siphoviridae, Inoviridae, Podoviridae, Myoviridae, and others. At the genus level, the predominant viruses in the canine gut virome were certified to be Protoparvovirus, Inovirus, Chlamydiamicrovirus, Lambdavirus, Dependoparvovirus, Lightbulbvirus, Kostyavirus, Punavirus, Lederbergvirus, Fibrovirus, Peduovirus, and others. However, the viral communities between the two groups differed significantly. The unique viral taxa identified in the healthy dogs group were Chlamydiamicrovirus and Lightbulbvirus, while the unique viral taxa identified in the dogs with diarrhea group were Inovirus, Protoparvovirus, Lambdavirus, Dependoparvovirus, Kostyavirus, Punavirus, and other viruses. Phylogenetic analysis based on the near-complete genome sequences showed that the CPV strains collected in this study together with other CPV Chinese isolates clustered into a separate branch, while the identified CAV-2 strain D5-8081 and AAV-5 strain AAV-D5 were both the first near-complete genome sequences in China. Moreover, the predicted bacterial hosts of phages were certified to be Campylobacter, Escherichia, Salmonella, Pseudomonas, Acinetobacter, Moraxella, Mediterraneibacter, and other commensal microbiota. In conclusion, the enteric virome of the healthy dogs group and the dogs with diarrhea group was investigated and compared using viral metagenomics, and the viral communities might influence canine health and disease by interacting with the commensal gut microbiome.

Bajiya N, Dhall A, Aggarwal S, Raghava GPS. Advances in the field of phage-based therapy with special emphasis on computational resources. Brief Bioinform 2023;24:bbac574. [PMID: 36575815 DOI: 10.1093/bib/bbac574] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 07/28/2022] [Revised: 11/07/2022] [Accepted: 11/25/2022] [Indexed: 12/29/2022]

Andrianjakarivony HF, Bettarel Y, Armougom F, Desnues C. Phage-Host Prediction Using a Computational Tool Coupled with 16S rRNA Gene Amplicon Sequencing. Viruses 2022;15:76. [PMID: 36680116 PMCID: PMC9862649 DOI: 10.3390/v15010076] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/15/2022] [Revised: 12/13/2022] [Accepted: 12/20/2022] [Indexed: 12/29/2022]

Abstract

Metagenomics studies have revealed tremendous viral diversity in aquatic environments. Yet, while the genomic data they have provided is extensive, it is unannotated. For example, most phage sequences lack accurate information about their bacterial host, which prevents reliable phage identification and the investigation of phage-host interactions. This study aimed to take this knowledge further, using a viral metagenomic framework to decipher the composition and diversity of phage communities and to predict their bacterial hosts. To this end, we used water and sediment samples collected from seven sites with varying contamination levels in the Ebrié Lagoon in Abidjan, Ivory Coast. The bacterial communities were characterized using the 16S rRNA metabarcoding approach, and a framework was developed to investigate the virome datasets that: (1) identified phage contigs with VirSorter and VIBRANT; (2) classified these contigs with MetaPhinder using the phage database (taxonomic annotation); and (3) predicted the phages' bacterial hosts with a machine learning-based tool: the Prokaryotic Virus-Host Predictor. The findings showed that the taxonomic profiles of phages and bacteria were specific to sediment or water samples. Phage sequences assigned to the Microviridae family were widespread in sediment samples, whereas phage sequences assigned to the Siphoviridae, Myoviridae and Podoviridae families were predominant in water samples. In terms of bacterial communities, the phyla Latescibacteria, Zixibacteria, Bacteroidetes, Acidobacteria, Calditrichaeota, Gemmatimonadetes, Cyanobacteria and Patescibacteria were most widespread in sediment samples, while the phyla Epsilonbacteraeota, Tenericutes, Margulisbacteria, Proteobacteria, Actinobacteria, Planctomycetes and Marinimicrobia were most prevalent in water samples. Significantly, the relative abundance of bacterial communities (at major phylum level) estimated by 16S rRNA metabarcoding and phage-host prediction were significantly similar. These results demonstrate the reliability of this novel approach for predicting the bacterial hosts of phages from shotgun metagenomic sequencing data.

Tang T, Hou S, Fuhrman JA, Sun F. Phage-bacterial contig association prediction with a convolutional neural network. Bioinformatics 2022;38:i45-i52. [PMID: 35758806 PMCID: PMC9235506 DOI: 10.1093/bioinformatics/btac239] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 12/02/2022]

Abstract

Motivation

Phage–host associations play important roles in microbial communities. But in natural communities, as opposed to culture-based lab studies, where phages are discovered and characterized metagenomically, their hosts are generally not known. Several programs have been developed for predicting which phage infects which host based on various sequence similarity measures or machine learning approaches. These are often based on whole viral and host genomes, but in metagenomics-based studies, we rarely have whole genomes but rather must rely on contigs that are sometimes as short as hundreds of bp long. Therefore, we need programs that predict hosts of phage contigs on the basis of these short contigs. Although most existing programs can be applied to metagenomic datasets for these predictions, their accuracies are generally low. Here, we develop ContigNet, a convolutional neural network-based model capable of predicting phage–host matches based on relatively short contigs, and compare it to previously published VirHostMatcher (VHM) and WIsH.

Results

On the validation set, ContigNet achieves 72–85% area under the receiver operating characteristic curve (AUROC) scores, compared to the maximum of 68% by VHM or WIsH for contigs of lengths between 200 bps to 50 kbps. We also apply the model to the Metagenomic Gut Virus (MGV) catalogue, a dataset containing a wide range of draft genomes from metagenomic samples and achieve 60–70% AUROC scores compared to that of VHM and WIsH of 52%. Surprisingly, ContigNet can also be used to predict plasmid-host contig associations with high accuracy, indicating a similar genetic exchange between mobile genetic elements and their hosts.

Availability and implementation

The source code of ContigNet and related datasets can be downloaded from https://github.com/tianqitang1/ContigNet.
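The AUROC scores reported in the Results above can be read as the probability that a true phage-host pair receives a higher score than a false pair. The sketch below is not ContigNet's evaluation code; it is a minimal pairwise implementation under that interpretation, with the function name `auroc` and the toy scores chosen purely for illustration.

```python
def auroc(scores, labels):
    """Area under the ROC curve as the fraction of (positive, negative)
    pairs in which the positive example is ranked higher; ties count half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical match scores for four phage-host contig pairs.
    print(auroc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))
```

The pairwise loop is quadratic in the number of examples; for large benchmarks a rank-based formulation or an established library routine would be the more practical choice.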

Shang J, Sun Y. Predicting the hosts of prokaryotic viruses using GCN-based semi-supervised learning. BMC Biol 2021;19:250. [PMID: 34819064 PMCID: PMC8611875 DOI: 10.1186/s12915-021-01180-4] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 06/07/2021] [Accepted: 10/29/2021] [Indexed: 11/23/2022]

Abstract

Background

Prokaryotic viruses, which infect bacteria and archaea, are the most abundant and diverse biological entities in the biosphere. To understand their regulatory roles in various ecosystems and to harness the potential of bacteriophages for use in therapy, more knowledge of viral-host relationships is required. High-throughput sequencing and its application to the microbiome have offered new opportunities for computational approaches for predicting which hosts particular viruses can infect. However, there are two main challenges for computational host prediction. First, the empirically known virus-host relationships are very limited. Second, although sequence similarity between viruses and their prokaryote hosts have been used as a major feature for host prediction, the alignment is either missing or ambiguous in many cases. Thus, there is still a need to improve the accuracy of host prediction.

Results

In this work, we present a semi-supervised learning model, named HostG, to conduct host prediction for novel viruses. We construct a knowledge graph by utilizing both virus-virus protein similarity and virus-host DNA sequence similarity. Then graph convolutional network (GCN) is adopted to exploit viruses with or without known hosts in training to enhance the learning ability. During the GCN training, we minimize the expected calibrated error (ECE) to ensure the confidence of the predictions. We tested HostG on both simulated and real sequencing data and compared its performance with other state-of-the-art methods specifically designed for virus host classification (VHM-net, WIsH, PHP, HoPhage, RaFAH, vHULK, and VPF-Class).

Conclusion

HostG outperforms other popular methods, demonstrating the efficacy of using a GCN-based semi-supervised learning approach. A particular advantage of HostG is its ability to predict hosts from new taxa.

Supplementary Information

The online version contains supplementary material available at (10.1186/s12915-021-01180-4).
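The calibration error (ECE) mentioned in the Results of this abstract is commonly computed by binning predictions by confidence and averaging the gap between accuracy and mean confidence in each bin. The sketch below shows that usual binned estimate only; it is not HostG's training objective, and the function name, the equal-width binning, and the default of 10 bins are assumptions made for illustration.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned calibration error: weighted mean of |accuracy - confidence|
    over equal-width confidence bins on [0, 1]."""
    n = len(confidences)
    if n == 0:
        raise ValueError("no predictions given")
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # A confidence of exactly 1.0 falls into the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, bool(ok)))
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / n) * abs(accuracy - avg_conf)
    return ece


if __name__ == "__main__":
    # Hypothetical host predictions: confidence and whether each was correct.
    print(expected_calibration_error([0.9, 0.9, 0.6, 0.6],
                                     [True, True, True, False]))
```

A value near zero means that predictions made with, say, 90% confidence are correct about 90% of the time. Using this quantity as something to minimize during training, as the abstract describes, would require a formulation suited to the optimizer, which this sketch does not attempt.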
