1
Wu Y, Peng Y. Ten computational challenges in human virome studies. Virol Sin 2024:S1995-820X(24)00068-3. [PMID: 38697263 DOI: 10.1016/j.virs.2024.04.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/10/2024] [Accepted: 04/25/2024] [Indexed: 05/04/2024]
Abstract
In recent years, substantial advancements have been achieved in understanding the diversity of the human virome and its intricate roles in human health and diseases. Despite this progress, the field of human virome research remains nascent, primarily hindered by the absence of effective methods, particularly in the domain of computational tools. This perspective systematically outlines ten computational challenges spanning various types of virome studies. These challenges arise due to the vast diversity of viromes, the absence of a universal marker gene in viral genomes, the low abundance of virus populations, the remote or minimal homology of viral proteins to known proteins, and the highly dynamic and heterogeneous nature of viromes. For each computational challenge, we discuss the underlying reasons, current research progress, and potential solutions. The resolution of these challenges necessitates ongoing collaboration among computational scientists, virologists, and multidisciplinary experts. In essence, this perspective serves as a comprehensive guide for directing computational efforts in human virome studies.
Affiliation(s)
- Yifan Wu
- Bioinformatics Center, College of Biology, Hunan Provincial Key Laboratory of Medical Virology, Hunan University, Changsha 410082, China
- Yousong Peng
- Bioinformatics Center, College of Biology, Hunan Provincial Key Laboratory of Medical Virology, Hunan University, Changsha 410082, China
2
Shang J, Peng C, Tang X, Sun Y. PhaVIP: Phage VIrion Protein classification based on chaos game representation and Vision Transformer. Bioinformatics 2023; 39:i30-i39. [PMID: 37387136 DOI: 10.1093/bioinformatics/btad229] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/01/2023]
Abstract
MOTIVATION: As viruses that mainly infect bacteria, phages are key players across a wide range of ecosystems. Analyzing phage proteins is indispensable for understanding phages' functions and roles in microbiomes. High-throughput sequencing enables us to obtain phages from different microbiomes at low cost. However, compared with the fast accumulation of newly identified phages, phage protein classification remains difficult. In particular, a fundamental need is to annotate virion proteins, that is, structural proteins such as major tail and baseplate proteins. Although there are experimental methods for virion protein identification, they are too expensive or time-consuming, leaving a large number of proteins unclassified. Thus, there is a great demand for a computational method for fast and accurate phage virion protein (PVP) classification.
RESULTS: In this work, we adapted the state-of-the-art image classification model, Vision Transformer, to conduct virion protein classification. By encoding protein sequences into unique images using chaos game representation, we can leverage Vision Transformer to learn both local and global features from sequence "images". Our method, PhaVIP, has two main functions: classifying PVP and non-PVP sequences and annotating the types of PVP, such as capsid and tail. We tested PhaVIP on several datasets with increasing difficulty and benchmarked it against alternative tools. The experimental results show that PhaVIP has superior performance. After validating the performance of PhaVIP, we investigated two applications that can use the output of PhaVIP: phage taxonomy classification and phage host prediction. The results showed the benefit of using classified proteins over all proteins.
AVAILABILITY AND IMPLEMENTATION: The web server of PhaVIP is available via: https://phage.ee.cityu.edu.hk/phavip. The source code of PhaVIP is available via: https://github.com/KennthShang/PhaVIP.
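The idea of turning a protein sequence into an image with chaos game representation can be illustrated with a minimal sketch. This is not PhaVIP's published encoding: the 20-gon vertex layout, the contraction ratio of 0.5, the 32 x 32 grid, and the function name `cgr_image` are all illustrative assumptions, and this simple variant does not guarantee that distinct sequences map to distinct images.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

# Assumption: each residue sits on a vertex of a regular 20-gon
# inscribed in the unit square (hypothetical layout, not PhaVIP's).
VERTICES = {
    aa: (0.5 + 0.5 * math.cos(2 * math.pi * i / 20),
         0.5 + 0.5 * math.sin(2 * math.pi * i / 20))
    for i, aa in enumerate(AMINO_ACIDS)
}


def cgr_image(seq, size=32, ratio=0.5):
    """Return a size x size grid of visit counts for a protein sequence."""
    grid = [[0] * size for _ in range(size)]
    x, y = 0.5, 0.5  # start at the centre of the unit square
    for aa in seq.upper():
        if aa not in VERTICES:
            continue  # skip non-standard or ambiguous residues such as X
        vx, vy = VERTICES[aa]
        # Chaos game step: move a fixed fraction of the way to the vertex.
        x += ratio * (vx - x)
        y += ratio * (vy - y)
        # Bin the current point into a pixel and count the visit.
        col = min(int(x * size), size - 1)
        row = min(int(y * size), size - 1)
        grid[row][col] += 1
    return grid


image = cgr_image("MKTAYIAKQR")
```

The resulting grid of counts is the kind of fixed-size "image" that an image classifier could take as input; the actual preprocessing used by PhaVIP should be taken from its source code.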
Affiliation(s)
- Jiayu Shang
- Department of Electrical Engineering, City University of Hong Kong, Hong Kong (SAR), China
- Cheng Peng
- Department of Electrical Engineering, City University of Hong Kong, Hong Kong (SAR), China
- Xubo Tang
- Department of Electrical Engineering, City University of Hong Kong, Hong Kong (SAR), China
- Yanni Sun
- Department of Electrical Engineering, City University of Hong Kong, Hong Kong (SAR), China
4
Linard B, Ebersberger I, McGlynn SE, Glover N, Mochizuki T, Patricio M, Lecompte O, Nevers Y, Thomas PD, Gabaldón T, Sonnhammer E, Dessimoz C, Uchiyama I. Ten Years of Collaborative Progress in the Quest for Orthologs. Mol Biol Evol 2021; 38:3033-3045. [PMID: 33822172 PMCID: PMC8321534 DOI: 10.1093/molbev/msab098] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 10/27/2020] [Revised: 02/07/2021] [Accepted: 04/01/2021] [Indexed: 12/19/2022]
Abstract
Accurate determination of the evolutionary relationships between genes is a foundational challenge in biology. Homology (evolutionary relatedness) is in many cases readily determined based on sequence similarity analysis. By contrast, determining whether two genes descended directly from a common ancestor by a speciation event (orthologs) or a duplication event (paralogs) is more challenging, yet provides critical information on the history of a gene. Since 2009, this task has been the focus of the Quest for Orthologs (QFO) Consortium. The sixth QFO meeting took place in Okazaki, Japan in conjunction with the 67th National Institute for Basic Biology conference. Here, we report recent advances, applications, and oncoming challenges that were discussed during the conference. Steady progress has been made toward standardization and scalability of new and existing tools. A feature of the conference was the presentation of a panel of accessible tools for phylogenetic profiling and several developments to bring orthology beyond the gene unit, from domains to networks. This meeting brought to light several challenges to come: leveraging orthology computations to get the most out of the incoming avalanche of genomic data, integrating orthology from domain to biological network levels, building better gene models, and adapting orthology approaches to the broad evolutionary and genomic diversity recognized in different forms of life and viruses.
Affiliation(s)
- Benjamin Linard
- LIRMM, University of Montpellier, CNRS, Montpellier, France; SPYGEN, Le Bourget-du-Lac, France
- Ingo Ebersberger
- Institute of Cell Biology and Neuroscience, Goethe University Frankfurt, Frankfurt, Germany; Senckenberg Biodiversity and Climate Research Centre (S-BIKF), Frankfurt, Germany; LOEWE Center for Translational Biodiversity Genomics (TBG), Frankfurt, Germany
- Shawn E McGlynn
- Earth-Life Science Institute, Tokyo Institute of Technology, Meguro, Tokyo, Japan; Blue Marble Space Institute of Science, Seattle, WA, USA
- Natasha Glover
- Swiss Institute of Bioinformatics, Lausanne, Switzerland; Center for Integrative Genomics, University of Lausanne, Lausanne, Switzerland; Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
- Tomohiro Mochizuki
- Earth-Life Science Institute, Tokyo Institute of Technology, Meguro, Tokyo, Japan
- Mateus Patricio
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, United Kingdom
- Odile Lecompte
- Department of Computer Science, ICube, UMR 7357, University of Strasbourg, CNRS, Fédération de Médecine Translationnelle de Strasbourg, Strasbourg, France
- Yannis Nevers
- Swiss Institute of Bioinformatics, Lausanne, Switzerland; Center for Integrative Genomics, University of Lausanne, Lausanne, Switzerland; Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
- Paul D Thomas
- Division of Bioinformatics, Department of Preventive Medicine, University of Southern California, Los Angeles, CA, USA
- Toni Gabaldón
- Barcelona Supercomputing Centre (BSC-CNS), Jordi Girona, Barcelona, Spain; Institute for Research in Biomedicine (IRB), The Barcelona Institute of Science and Technology (BIST), Barcelona, Spain; Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
- Erik Sonnhammer
- Science for Life Laboratory, Department of Biochemistry and Biophysics, Stockholm University, Solna, Sweden
- Christophe Dessimoz
- Swiss Institute of Bioinformatics, Lausanne, Switzerland; Center for Integrative Genomics, University of Lausanne, Lausanne, Switzerland; Department of Computational Biology, University of Lausanne, Lausanne, Switzerland; Department of Computer Science, University College London, London, United Kingdom; Department of Genetics, Evolution and Environment, University College London, London, United Kingdom
- Ikuo Uchiyama
- Department of Theoretical Biology, National Institute for Basic Biology, National Institutes of Natural Sciences, Okazaki, Aichi, Japan
5
Guerin E, Hill C. Shining Light on Human Gut Bacteriophages. Front Cell Infect Microbiol 2020; 10:481. [PMID: 33014897 PMCID: PMC7511551 DOI: 10.3389/fcimb.2020.00481] [Citation(s) in RCA: 46] [Impact Index Per Article: 11.5] [Received: 06/17/2020] [Accepted: 08/04/2020] [Indexed: 12/15/2022]
Abstract
The human gut is a complex environment that contains a multitude of microorganisms that are collectively termed the microbiome. Multiple factors have a role to play in driving the composition of human gut bacterial communities either toward homeostasis or the instability that is associated with many disease states. One of the most important forces is likely to be bacteriophages, bacteria-infecting viruses that constitute by far the largest portion of the human gut virome. Despite this, bacteriophages (phages) are one of the least studied residents of the gut. This is largely due to the challenges associated with studying these difficult-to-culture entities. Modern high-throughput sequencing technologies have played an important role in improving our understanding of the human gut phageome, but much of the generated sequencing data remains uncharacterised. Overcoming this requires database-independent bioinformatic pipelines, and even those phages that are successfully characterized provide only limited insight into their associated biological properties; thus most viral sequences have been characterized as "viral dark matter." Fundamental to understanding the role of phages in shaping the human gut microbiome, and in turn perhaps influencing human health, is how they interact with their bacterial hosts. An essential aspect is the isolation of novel phage-bacteria host pairs through various screening methods, which can transform in silico phages into a biological reality. However, this is also beset with multiple challenges, including culturing difficulties and the use of traditional methods, such as plaquing, which may bias which phage-host pairs can be successfully isolated. Phage-bacteria interactions may be influenced by many aspects of complex human gut biology which can be difficult to reproduce under laboratory conditions.
Here we discuss some of the main findings associated with the human gut phageome to date including composition, our understanding of phage-host interactions, particularly the observed persistence of virulent phages and their hosts, as well as factors that may influence these highly intricate relationships. We also discuss current methodologies and bottlenecks hindering progression in this field and identify potential steps that may be useful in overcoming these hurdles.
Affiliation(s)
- Emma Guerin
- APC Microbiome Ireland, University College Cork, Cork, Ireland; School of Microbiology, University College Cork, Cork, Ireland
- Colin Hill
- APC Microbiome Ireland, University College Cork, Cork, Ireland; School of Microbiology, University College Cork, Cork, Ireland
6
Sutton TDS, Hill C. Gut Bacteriophage: Current Understanding and Challenges. Front Endocrinol (Lausanne) 2019; 10:784. [PMID: 31849833 PMCID: PMC6895007 DOI: 10.3389/fendo.2019.00784] [Citation(s) in RCA: 96] [Impact Index Per Article: 19.2] [Received: 08/09/2019] [Accepted: 10/28/2019] [Indexed: 12/13/2022]
Abstract
The gut microbiome is widely accepted to have a significant impact on human health, yet despite years of research on this complex ecosystem, the contributions of different forces driving microbial population structure remain to be fully elucidated. The viral component of the human gut microbiome is dominated by bacteriophage, which are known to play crucial roles in shaping microbial composition, driving bacterial diversity, and facilitating horizontal gene transfer. Bacteriophage are also one of the most poorly understood components of the human gut microbiome, with the vast majority of viral sequences sharing little to no homology with sequences in reference databases. If we are to understand the dynamics of bacteriophage populations, their interaction with the human microbiome and ultimately their influence on human health, we will depend heavily on sequence-based approaches and in silico tools. This is complicated by the fact that, as with any research field in its infancy, methods of analysis vary, and this can impede our ability to compare the outputs of different studies. Here, we discuss the major findings to date regarding the human virome and reflect on our current understanding of how gut bacteriophage shape the microbiome. We consider whether or not the virome field is built on unstable foundations and, if so, how we can provide a solid basis for future experimentation. The virome is a challenging yet crucial piece of the human microbiome puzzle. In order to develop our understanding, we will discuss the need to underpin future studies with robust research methods and suggest some solutions to existing challenges.
Affiliation(s)
- Colin Hill
- APC Microbiome Ireland and School of Microbiology, University College Cork, Cork, Ireland