1
Nie W, Qiu T, Wei Y, Ding H, Guo Z, Qiu J. Advances in phage-host interaction prediction: in silico method enhances the development of phage therapies. Brief Bioinform 2024; 25:bbae117. [PMID: 38555471 PMCID: PMC10981677 DOI: 10.1093/bib/bbae117] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/10/2023] [Revised: 01/15/2024] [Accepted: 03/02/2024] [Indexed: 04/02/2024] Open
Abstract
Phages can specifically recognize and kill bacteria, which gives bacteriophages important application value in bacterial identification and typing, livestock aquaculture and the treatment of human bacterial infections. Considering the variety of bacteria that infect humans and the continuous discovery of new pathogenic bacteria, screening massive phage databases for therapeutic phages capable of infecting a given pathogen has become a principal step in phage therapy design. Experimental methods to identify phage-host interactions (PHIs) are time-consuming and expensive; high-throughput computational methods to predict PHIs are therefore a potential substitute. Here, we systematically review bioinformatic methods for predicting PHIs, introduce the reference databases and in silico models used in these methods and highlight the strengths and challenges of current tools. Finally, we discuss the application scope and future research directions of computational prediction methods, which can contribute to improving the performance of prediction models and to the development of personalized phage therapy.
Affiliation(s)
- Wanchun Nie
  - School of Health Science and Engineering, University of Shanghai for Science and Technology, Shanghai, 200093, China
- Tianyi Qiu
  - Institute of Clinical Science, Zhongshan Hospital; Intelligent Medicine Institute, Fudan University, Shanghai, 200032, China
  - Shanghai Institute of Infectious Disease and Biosecurity, Fudan University, Shanghai, 200032, China
- Yiwen Wei
  - School of Health Science and Engineering, University of Shanghai for Science and Technology, Shanghai, 200093, China
- Hao Ding
  - Institute of Clinical Science, Zhongshan Hospital; Intelligent Medicine Institute, Fudan University, Shanghai, 200032, China
- Zhixiang Guo
  - School of Health Science and Engineering, University of Shanghai for Science and Technology, Shanghai, 200093, China
- Jingxuan Qiu
  - School of Health Science and Engineering, University of Shanghai for Science and Technology, Shanghai, 200093, China
2
Mollentze N, Streicker DG. Predicting zoonotic potential of viruses: where are we? Curr Opin Virol 2023; 61:101346. [PMID: 37515983 DOI: 10.1016/j.coviro.2023.101346] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/30/2023] [Revised: 06/28/2023] [Accepted: 06/30/2023] [Indexed: 07/31/2023]
Abstract
The prospect of identifying high-risk viruses and designing interventions to pre-empt their emergence into human populations is enticing, but controversial, particularly when used to justify large-scale virus discovery initiatives. We review the current state of these efforts, identifying three broad classes of predictive models that have differences in data inputs that define their potential utility for triaging newly discovered viruses for further investigation. Prospects for model predictions of public health risk to guide preparedness depend not only on computational improvements to algorithms, but also on more efficient data generation in laboratory, field and clinical settings. Beyond public health applications, efforts to predict zoonoses provide unique research value by creating generalisable understanding of the ecological and evolutionary factors that promote viral emergence.
Affiliation(s)
- Nardus Mollentze
  - School of Biodiversity, One Health and Veterinary Medicine, University of Glasgow, Glasgow G12 8QQ, United Kingdom; MRC-University of Glasgow Centre for Virus Research, G61 1QH, United Kingdom
- Daniel G Streicker
  - School of Biodiversity, One Health and Veterinary Medicine, University of Glasgow, Glasgow G12 8QQ, United Kingdom; MRC-University of Glasgow Centre for Virus Research, G61 1QH, United Kingdom
3
Pan J, You W, Lu X, Wang S, You Z, Sun Y. GSPHI: A novel deep learning model for predicting phage-host interactions via multiple biological information. Comput Struct Biotechnol J 2023; 21:3404-3413. [PMID: 37397626 PMCID: PMC10314231 DOI: 10.1016/j.csbj.2023.06.014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/23/2023] [Revised: 06/14/2023] [Accepted: 06/15/2023] [Indexed: 07/04/2023] Open
Abstract
Owing to the misuse of antibiotics, bacteriophage (phage) therapy has been recognized as one of the most promising strategies for treating human infections caused by antibiotic-resistant bacteria. Identification of phage-host interactions (PHIs) can help to explore the mechanisms of bacterial response to phages and provide new insights into effective therapeutic approaches. Compared to conventional wet-lab experiments, computational models for predicting PHIs save both time and cost. In this study, we developed a deep learning predictive framework called GSPHI to identify potential phage and target bacterium pairs through DNA and protein sequence information. More specifically, GSPHI first initialized the node representations of phages and target bacterial hosts via a natural language processing algorithm. Then a graph embedding algorithm, structural deep network embedding (SDNE), was used to extract local and global information from the interaction network, and finally, a deep neural network (DNN) was applied to detect the interactions between phages and their bacterial hosts. In the drug-resistant bacteria dataset ESKAPE, GSPHI achieved a prediction accuracy of 86.65% and an AUC of 0.9208 under 5-fold cross-validation, significantly better than other methods. In addition, case studies in Gram-positive and Gram-negative bacterial species demonstrated that GSPHI is competent in detecting potential phage-host interactions. Taken together, these results indicate that GSPHI can suggest reasonable candidate phage-sensitive bacteria for biological experiments. The webserver of the GSPHI predictor is freely available at http://120.77.11.78/GSPHI/.
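As a rough illustration of two ingredients named in this abstract (treating sequences as text for a language-style representation, and evaluating with 5-fold cross-validation), the sketch below tokenizes a DNA sequence into overlapping k-mer "words", builds a simple frequency vector, and produces 5-fold train/test index splits. This is not GSPHI's code: the abstract does not say which natural language processing algorithm was used, the k-mer frequency vector is a simplified stand-in for a learned embedding, and `kmer_tokens`, `kmer_vector` and `k_fold_indices` are hypothetical helper names. The SDNE and DNN stages are omitted.

```python
from collections import Counter
from itertools import product


def kmer_tokens(seq: str, k: int = 3) -> list[str]:
    """Split a sequence into overlapping k-mer 'words'."""
    seq = seq.upper()
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def kmer_vector(seq: str, k: int = 3) -> list[float]:
    """Relative frequency of every possible ACGT k-mer, in a fixed order."""
    vocab = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(kmer_tokens(seq, k))
    total = sum(counts[w] for w in vocab)
    # Sequences shorter than k, or made only of ambiguous bases, give all zeros.
    return [counts[w] / total if total else 0.0 for w in vocab]


def k_fold_indices(n: int, k: int = 5):
    """Yield (train, test) index lists for k-fold cross-validation."""
    folds = [list(range(i, n, k)) for i in range(k)]
    for i, test in enumerate(folds):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test


if __name__ == "__main__":
    # Toy sequence, invented for illustration only.
    print(kmer_tokens("ACGTAC"))
    for train, test in k_fold_indices(10, k=5):
        print(len(train), len(test))
```

In a real pipeline the index splits would be applied to labelled phage-host pairs, with accuracy and AUC averaged over the five held-out folds.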
Affiliation(s)
- Jie Pan
  - Key Laboratory of Resources Biology and Biotechnology in Western China, Ministry of Education, Provincial Key Laboratory of Biotechnology of Shaanxi Province, The College of Life Sciences, Northwest University, Xi’an 710069, China
- Wencai You
  - Key Laboratory of Resources Biology and Biotechnology in Western China, Ministry of Education, Provincial Key Laboratory of Biotechnology of Shaanxi Province, The College of Life Sciences, Northwest University, Xi’an 710069, China
- Xiaoliang Lu
  - Key Laboratory of Resources Biology and Biotechnology in Western China, Ministry of Education, Provincial Key Laboratory of Biotechnology of Shaanxi Province, The College of Life Sciences, Northwest University, Xi’an 710069, China
- Shiwei Wang
  - Key Laboratory of Resources Biology and Biotechnology in Western China, Ministry of Education, Provincial Key Laboratory of Biotechnology of Shaanxi Province, The College of Life Sciences, Northwest University, Xi’an 710069, China
- Zhuhong You
  - School of Computer Science, Northwestern Polytechnical University, Xi’an 710129, China
- Yanmei Sun
  - Key Laboratory of Resources Biology and Biotechnology in Western China, Ministry of Education, Provincial Key Laboratory of Biotechnology of Shaanxi Province, The College of Life Sciences, Northwest University, Xi’an 710069, China
4
Fontdevila Pareta N, Khalili M, Maachi A, Rivarez MPS, Rollin J, Salavert F, Temple C, Aranda MA, Boonham N, Botermans M, Candresse T, Fox A, Hernando Y, Kutnjak D, Marais A, Petter F, Ravnikar M, Selmi I, Tahzima R, Trontin C, Wetzel T, Massart S. Managing the deluge of newly discovered plant viruses and viroids: an optimized scientific and regulatory framework for their characterization and risk analysis. Front Microbiol 2023; 14:1181562. [PMID: 37323908 PMCID: PMC10265641 DOI: 10.3389/fmicb.2023.1181562] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/07/2023] [Accepted: 04/25/2023] [Indexed: 06/17/2023] Open
Abstract
The advances in high-throughput sequencing (HTS) technologies and bioinformatic tools have provided new opportunities for virus and viroid discovery and diagnostics. Hence, new sequences of viral origin are being discovered and published at a previously unseen rate. Therefore, a collective effort was undertaken to write and propose a framework for prioritizing the biological characterization steps needed after discovering a new plant virus to evaluate its impact at different levels. Even though the proposed approach was widely used, a revision of these guidelines was prepared to consider virus discovery and characterization trends and integrate novel approaches and tools recently published or under development. This updated framework is more adapted to the current rate of virus discovery and provides an improved prioritization for filling knowledge and data gaps. It consists of four distinct steps adapted to include a multi-stakeholder feedback loop. Key improvements include better prioritization and organization of the various steps, earlier data sharing among researchers and involved stakeholders, public database screening, and exploitation of genomic information to predict biological properties.
Affiliation(s)
- Maryam Khalili
  - Univ. Bordeaux, INRAE, UMR BFP, Villenave d'Ornon, France
  - EGFV, Univ. Bordeaux, INRAE, ISVV, Villenave d'Ornon, France
- Mark Paul S. Rivarez
  - Department of Biotechnology and Systems Biology, National Institute of Biology, Ljubljana, Slovenia
  - College of Agriculture and Agri-Industries, Caraga State University, Butuan, Philippines
- Johan Rollin
  - Plant Pathology Laboratory, Gembloux Agro-Bio Tech, University of Liège, Gembloux, Belgium
  - DNAVision (Belgium), Charleroi, Belgium
- Ferran Salavert
  - School of Natural and Environmental Sciences, Faculty of Science, Agriculture and Engineering, Newcastle University, Newcastle upon Tyne, United Kingdom
- Coline Temple
  - Plant Pathology Laboratory, Gembloux Agro-Bio Tech, University of Liège, Gembloux, Belgium
- Miguel A. Aranda
  - Department of Stress Biology and Plant Pathology, Center for Edaphology and Applied Biology of Segura, Spanish National Research Council (CSIC), Murcia, Spain
- Neil Boonham
  - School of Natural and Environmental Sciences, Faculty of Science, Agriculture and Engineering, Newcastle University, Newcastle upon Tyne, United Kingdom
- Marleen Botermans
  - Netherlands Institute for Vectors, Invasive Plants and Plant Health (NIVIP), Wageningen, Netherlands
- Adrian Fox
  - School of Natural and Environmental Sciences, Faculty of Science, Agriculture and Engineering, Newcastle University, Newcastle upon Tyne, United Kingdom
  - Fera Science Ltd, York Biotech Campus, York, United Kingdom
- Denis Kutnjak
  - Department of Biotechnology and Systems Biology, National Institute of Biology, Ljubljana, Slovenia
- Armelle Marais
  - Univ. Bordeaux, INRAE, UMR BFP, Villenave d'Ornon, France
- Maja Ravnikar
  - Department of Biotechnology and Systems Biology, National Institute of Biology, Ljubljana, Slovenia
- Ilhem Selmi
  - Plant Pathology Laboratory, Gembloux Agro-Bio Tech, University of Liège, Gembloux, Belgium
- Rachid Tahzima
  - Plant Pathology Laboratory, Gembloux Agro-Bio Tech, University of Liège, Gembloux, Belgium
  - Plant Sciences Unit, Institute for Agricultural, Fisheries and Food Research (ILVO), Merelbeke, Belgium
- Charlotte Trontin
  - European and Mediterranean Plant Protection Organization, Paris, France
- Thierry Wetzel
  - DLR Rheinpfalz, Institute of Plant Protection, Neustadt an der Weinstrasse, Germany
- Sebastien Massart
  - Plant Pathology Laboratory, Gembloux Agro-Bio Tech, University of Liège, Gembloux, Belgium
  - Bioversity International, Montpellier, France
5
Beamud B, García-González N, Gómez-Ortega M, González-Candelas F, Domingo-Calap P, Sanjuan R. Genetic determinants of host tropism in Klebsiella phages. Cell Rep 2023; 42:112048. [PMID: 36753420 PMCID: PMC9989827 DOI: 10.1016/j.celrep.2023.112048] [Citation(s) in RCA: 15] [Impact Index Per Article: 15.0] [Received: 06/02/2022] [Revised: 11/25/2022] [Accepted: 01/13/2023] [Indexed: 02/08/2023] Open
Abstract
Bacteriophages play key roles in bacterial ecology and evolution and are potential antimicrobials. However, the determinants of phage-host specificity remain elusive. Here, we isolate 46 phages to challenge 138 representative clinical isolates of Klebsiella pneumoniae, a widespread opportunistic pathogen. Spot tests show a narrow host range for most phages, with <2% of 6,319 phage-host combinations tested yielding detectable interactions. Bacterial capsule diversity is the main factor restricting phage host range. Consequently, phage-encoded depolymerases are key determinants of host tropism, and depolymerase sequence types are associated with the ability to infect specific capsular types across phage families. However, all phages with a broader host range found do not encode canonical depolymerases, suggesting alternative modes of entry. These findings expand our knowledge of the complex interactions between bacteria and their viruses and point out the feasibility of predicting the first steps of phage infection using bacterial and phage genome sequences.
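To make the host-range figures in this abstract concrete, the sketch below summarizes a binary spot-test matrix: the fraction of tested phage-host combinations with a detectable interaction, and each phage's host-range breadth. It is only an illustration under stated assumptions: the function name `host_range_summary` is hypothetical, and the toy matrix (phage and capsular-type labels included) is invented, not data from the study.

```python
def host_range_summary(spot_tests: dict[str, dict[str, bool]]):
    """Return (fraction of positive combinations, hosts infected per phage).

    spot_tests maps phage -> host -> True if a detectable interaction was seen.
    """
    tested = sum(len(hosts) for hosts in spot_tests.values())
    positive = sum(sum(hosts.values()) for hosts in spot_tests.values())
    breadth = {phage: sum(hosts.values()) for phage, hosts in spot_tests.items()}
    fraction = positive / tested if tested else 0.0
    return fraction, breadth


# Toy matrix, invented for illustration only.
toy = {
    "phage_A": {"K1": True, "K2": False, "K3": False},   # narrow host range
    "phage_B": {"K1": False, "K2": False, "K3": False},  # no detectable host
    "phage_C": {"K1": True, "K2": True, "K3": True},     # broad host range
}

if __name__ == "__main__":
    fraction, breadth = host_range_summary(toy)
    print(f"{fraction:.1%} of combinations positive; breadth = {breadth}")
```

Applied to a real matrix, a positive fraction below 2% would correspond to the narrow host range the abstract reports.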
Affiliation(s)
- Beatriz Beamud
  - Joint Research Unit Infection and Public Health, FISABIO-Universitat de València, 46020 València, Spain; Institute for Integrative Systems Biology (I²SysBio), Universitat de València-CSIC, 46980 Paterna, Spain
- Neris García-González
  - Joint Research Unit Infection and Public Health, FISABIO-Universitat de València, 46020 València, Spain; Institute for Integrative Systems Biology (I²SysBio), Universitat de València-CSIC, 46980 Paterna, Spain
- Mar Gómez-Ortega
  - Joint Research Unit Infection and Public Health, FISABIO-Universitat de València, 46020 València, Spain
- Fernando González-Candelas
  - Joint Research Unit Infection and Public Health, FISABIO-Universitat de València, 46020 València, Spain; Institute for Integrative Systems Biology (I²SysBio), Universitat de València-CSIC, 46980 Paterna, Spain
- Pilar Domingo-Calap
  - Institute for Integrative Systems Biology (I²SysBio), Universitat de València-CSIC, 46980 Paterna, Spain
- Rafael Sanjuan
  - Institute for Integrative Systems Biology (I²SysBio), Universitat de València-CSIC, 46980 Paterna, Spain
6
Improved Assembly of Metagenome-Assembled Genomes and Viruses in Tibetan Saline Lake Sediment by HiFi Metagenomic Sequencing. Microbiol Spectr 2023; 11:e0332822. [PMID: 36475839 PMCID: PMC9927493 DOI: 10.1128/spectrum.03328-22] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/14/2022] Open
Abstract
With the development and reduced costs of high-throughput sequencing technology, environmental dark matter, such as novel metagenome-assembled genomes (MAGs) and viruses, is now being discovered easily. However, due to read length limitations, MAGs and viromes often suffer from genome discontinuity and deficiencies in key functional elements. Here, by applying long-read sequencing technology to sediment samples from a Tibetan saline lake, we comprehensively analyzed the performance of high-fidelity (HiFi) reads and the possibility of integration with short-read next-generation sequencing (NGS) data. In total, 207 full-length nonredundant 16S rRNA gene sequences and 19 full-length nonredundant 18S rRNA genes were directly obtained from HiFi reads, which greatly surpassed the retrieval performance of NGS technology. We carried out a cross-sectional comparison among multiple assembly strategies, referred to as 'NGS', 'Hybrid (NGS+HiFi)', and 'HiFi'. Two MAGs and 29 viruses with circular genomes were reconstructed using HiFi reads alone, indicating the great power of the 'HiFi' approach to assemble high-quality microbial genomes. Among the 3 strategies, the 'Hybrid' approach produced the highest number of medium/high-quality MAGs and viral genomes, while the ratio of MAGs containing 16S rRNA genes was significantly improved in the 'HiFi' assembly results. Overall, our study provides a practical metagenomic resolution for analyzing complex environmental samples by taking advantage of both the short-read and HiFi long-read sequencing methods to extract the maximum amount of information, including data on prokaryotes, eukaryotes, and viruses, via the 'Hybrid' approach.
IMPORTANCE To expand the understanding of microbial dark matter in the environment, we did the first comparative evaluation of multiple assembly strategies based on high-throughput short-read and HiFi data from lake sediments metagenomic sequencing. The results demonstrated great improvement of the 'Hybrid' assembly method (short-read next-generation sequencing data plus HiFi data) in the recovery of medium/high-quality MAGs and viral genomes. Further analysis showed that HiFi data is important to retrieve the complete circular prokaryotic and viral genomes. Meanwhile, hundreds of full-length 16S/18S rRNA genes were assembled directly from HiFi data, which facilitated the species composition studies of complex environmental samples, especially for understanding micro-eukaryotes. Therefore, the application of the latest HiFi long-read sequencing could greatly improve the metagenomic assembly integrity and promote environmental microbiome research.
7
Bajiya N, Dhall A, Aggarwal S, Raghava GPS. Advances in the field of phage-based therapy with special emphasis on computational resources. Brief Bioinform 2023; 24:6961791. [PMID: 36575815 DOI: 10.1093/bib/bbac574] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 07/28/2022] [Revised: 11/07/2022] [Accepted: 11/25/2022] [Indexed: 12/29/2022] Open
Abstract
In the current era, one of the major challenges is to manage the treatment of drug/antibiotic-resistant strains of bacteria. Phage therapy, a century-old technique, may serve as an alternative to antibiotics in treating bacterial infections caused by drug-resistant strains of bacteria. In this review, a systematic attempt has been made to summarize phage-based therapy in depth. This review has been divided into the following two sections: general information and computer-aided phage therapy (CAPT). In the case of general information, we cover the history of phage therapy, the mechanism of action, the status of phage-based products (approved and clinical trials) and the challenges. This review emphasizes CAPT, where we have covered primary phage-associated resources, phage prediction methods and pipelines. This review covers a wide range of databases and resources, including viral genomes and proteins, phage receptors, host genomes of phages, phage-host interactions and lytic proteins. In the post-genomic era, identifying the most suitable phage for lysing a drug-resistant strain of bacterium is crucial for developing alternate treatments for drug-resistant bacteria and this remains a challenging problem. Thus, we compile all phage-associated prediction methods that include the prediction of phages for a bacterial strain, the host for a phage and the identification of interacting phage-host pairs. Most of these methods have been developed using machine learning and deep learning techniques. This review also discussed recent advances in the field of CAPT, where we briefly describe computational tools available for predicting phage virions, the life cycle of phages and prophage identification. Finally, we describe phage-based therapy's advantages, challenges and opportunities.
Affiliation(s)
- Nisha Bajiya
  - Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi, 110020, India
- Anjali Dhall
  - Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi, 110020, India
- Suchet Aggarwal
  - Department of Computer Science and Engineering, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi, 110020, India
- Gajendra P S Raghava
  - Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi, 110020, India
8
Kawasaki J, Tomonaga K, Horie M. Large-scale investigation of zoonotic viruses in the era of high-throughput sequencing. Microbiol Immunol 2023; 67:1-13. [PMID: 36259224 DOI: 10.1111/1348-0421.13033] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/18/2022] [Revised: 09/28/2022] [Accepted: 10/16/2022] [Indexed: 01/10/2023]
Abstract
Zoonotic diseases considerably impact public health and socioeconomics. RNA viruses reportedly caused approximately 94% of zoonotic diseases documented from 1990 to 2010, emphasizing the importance of investigating RNA viruses in animals. Furthermore, it has been estimated that hundreds of thousands of animal viruses capable of infecting humans are yet to be discovered, warning against the inadequacy of our understanding of viral diversity. High-throughput sequencing (HTS) has enabled the identification of viral infections with relatively little bias. Viral searches using both symptomatic and asymptomatic animal samples by HTS have revealed hidden viral infections. This review introduces the history of viral searches using HTS, current analytical limitations, and future potentials. We primarily summarize recent research on large-scale investigations on viral infections reusing HTS data from public databases. Furthermore, considering the accumulation of uncultivated viruses, we discuss current studies and challenges for connecting viral sequences to their phenotypes using various approaches: performing data analysis, developing predictive modeling, or implementing high-throughput platforms of virological experiments. We believe that this article provides a future direction in large-scale investigations of potential zoonotic viruses using the HTS technology.
Affiliation(s)
- Junna Kawasaki
  - Laboratory of RNA Viruses, Department of Virus Research, Institute for Frontier Life and Medical Sciences, Kyoto University, Kyoto, Japan
  - Laboratory of RNA Viruses, Department of Mammalian Regulatory Network, Graduate School of Biostudies, Kyoto University, Kyoto, Japan
  - Faculty of Science and Engineering, Waseda University, Tokyo, Japan
- Keizo Tomonaga
  - Laboratory of RNA Viruses, Department of Virus Research, Institute for Frontier Life and Medical Sciences, Kyoto University, Kyoto, Japan
  - Laboratory of RNA Viruses, Department of Mammalian Regulatory Network, Graduate School of Biostudies, Kyoto University, Kyoto, Japan
  - Department of Molecular Virology, Graduate School of Medicine, Kyoto University, Kyoto, Japan
- Masayuki Horie
  - Division of Veterinary Sciences, Graduate School of Life and Environmental Sciences, Osaka Prefecture University, Osaka, Japan
  - Osaka International Research Center for Infectious Diseases, Osaka Prefecture University, Osaka, Japan
9
Iuchi H, Kawasaki J, Kubo K, Fukunaga T, Hokao K, Yokoyama G, Ichinose A, Suga K, Hamada M. Bioinformatics approaches for unveiling virus-host interactions. Comput Struct Biotechnol J 2023; 21:1774-1784. [PMID: 36874163 PMCID: PMC9969756 DOI: 10.1016/j.csbj.2023.02.044] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 11/17/2022] [Revised: 02/22/2023] [Accepted: 02/22/2023] [Indexed: 03/03/2023] Open
Abstract
The coronavirus disease-2019 (COVID-19) pandemic has elucidated major limitations in the capacity of medical and research institutions to appropriately manage emerging infectious diseases. We can improve our understanding of infectious diseases by unveiling virus-host interactions through host range prediction and protein-protein interaction prediction. Although many algorithms have been developed to predict virus-host interactions, numerous issues remain to be solved, and the entire network remains veiled. In this review, we comprehensively surveyed algorithms used to predict virus-host interactions. We also discuss the current challenges, such as dataset biases toward highly pathogenic viruses, and the potential solutions. The complete prediction of virus-host interactions remains difficult; however, bioinformatics can contribute to progress in research on infectious diseases and human health.
Affiliation(s)
- Hitoshi Iuchi
  - Waseda Research Institute for Science and Engineering, Waseda University, Tokyo 169-8555, Japan
  - Computational Bio Big-Data Open Innovation Laboratory (CBBD-OIL), National Institute of Advanced Industrial Science and Technology (AIST), Tokyo 169-8555, Japan
- Junna Kawasaki
  - Faculty of Science and Engineering, Waseda University, Okubo Shinjuku-ku, Tokyo 169-8555, Japan
- Kento Kubo
  - Computational Bio Big-Data Open Innovation Laboratory (CBBD-OIL), National Institute of Advanced Industrial Science and Technology (AIST), Tokyo 169-8555, Japan
  - School of Advanced Science and Engineering, Waseda University, Okubo Shinjuku-ku, Tokyo 169-8555, Japan
- Tsukasa Fukunaga
  - Waseda Institute for Advanced Study, Waseda University, Nishi Waseda, Shinjuku-ku, Tokyo 169-0051, Japan
- Koki Hokao
  - School of Advanced Science and Engineering, Waseda University, Okubo Shinjuku-ku, Tokyo 169-8555, Japan
- Gentaro Yokoyama
  - Computational Bio Big-Data Open Innovation Laboratory (CBBD-OIL), National Institute of Advanced Industrial Science and Technology (AIST), Tokyo 169-8555, Japan
  - School of Advanced Science and Engineering, Waseda University, Okubo Shinjuku-ku, Tokyo 169-8555, Japan
- Akiko Ichinose
  - Waseda Research Institute for Science and Engineering, Waseda University, Tokyo 169-8555, Japan
- Kanta Suga
  - School of Advanced Science and Engineering, Waseda University, Okubo Shinjuku-ku, Tokyo 169-8555, Japan
- Michiaki Hamada
  - Waseda Research Institute for Science and Engineering, Waseda University, Tokyo 169-8555, Japan
  - Computational Bio Big-Data Open Innovation Laboratory (CBBD-OIL), National Institute of Advanced Industrial Science and Technology (AIST), Tokyo 169-8555, Japan
  - School of Advanced Science and Engineering, Waseda University, Okubo Shinjuku-ku, Tokyo 169-8555, Japan
  - Graduate School of Medicine, Nippon Medical School, Tokyo 113-8602, Japan
10
Khan T, Raza S. Exploration of Computational Aids for Effective Drug Designing and Management of Viral Diseases: A Comprehensive Review. Curr Top Med Chem 2023; 23:1640-1663. [PMID: 36725827 DOI: 10.2174/1568026623666230201144522] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/21/2022] [Revised: 11/14/2022] [Accepted: 12/19/2022] [Indexed: 02/03/2023]
Abstract
BACKGROUND Microbial diseases, specifically those originating from viruses, are the major cause of human mortality all over the world. The current COVID-19 pandemic is a case in point, where the dynamics of the viral-human interactions are still not completely understood, making its treatment a case of trial and error. Scientists have been struggling for over a year to devise a strategy to contain the pandemic, and this brings to light the lack of understanding of how the virus grows and multiplies in the human body.
METHODS This paper presents the perspective of the authors on the applicability of computational tools for deep learning and understanding of host-microbe interaction, disease progression and management, drug resistance and immune modulation through in silico methodologies which can aid in effective and selective drug development. The paper summarizes advances in the last five years. Studies published and indexed in leading databases have been included in the review.
RESULTS Computational systems biology works at the interface of biology and mathematics and intends to unravel the complex mechanisms between biological systems and the inter- and intra-species dynamics using computational tools and high-throughput technologies developed on algorithms, networks and complex connections to simulate cellular biological processes.
CONCLUSION Computational strategies and modelling integrate and prioritize microbial-host interactions and may predict the conditions in which the fine-tuning attenuates. These microbial-host interactions and working mechanisms are important from the aspect of effective drug designing and fine-tuning the therapeutic interventions.
Affiliation(s)
- Tahmeena Khan
  - Department of Chemistry, Integral University, Lucknow, 226026, U.P., India
- Saman Raza
  - Department of Chemistry, Isabella Thoburn College, Lucknow, 226007, U.P., India
11
Andrianjakarivony HF, Bettarel Y, Armougom F, Desnues C. Phage-Host Prediction Using a Computational Tool Coupled with 16S rRNA Gene Amplicon Sequencing. Viruses 2022; 15:76. [PMID: 36680116 PMCID: PMC9862649 DOI: 10.3390/v15010076] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/15/2022] [Revised: 12/13/2022] [Accepted: 12/20/2022] [Indexed: 12/29/2022] Open
Abstract
Metagenomics studies have revealed tremendous viral diversity in aquatic environments. Yet, while the genomic data they have provided is extensive, it is unannotated. For example, most phage sequences lack accurate information about their bacterial host, which prevents reliable phage identification and the investigation of phage-host interactions. This study aimed to take this knowledge further, using a viral metagenomic framework to decipher the composition and diversity of phage communities and to predict their bacterial hosts. To this end, we used water and sediment samples collected from seven sites with varying contamination levels in the Ebrié Lagoon in Abidjan, Ivory Coast. The bacterial communities were characterized using the 16S rRNA metabarcoding approach, and a framework was developed to investigate the virome datasets that: (1) identified phage contigs with VirSorter and VIBRANT; (2) classified these contigs with MetaPhinder using the phage database (taxonomic annotation); and (3) predicted the phages' bacterial hosts with a machine learning-based tool: the Prokaryotic Virus-Host Predictor. The findings showed that the taxonomic profiles of phages and bacteria were specific to sediment or water samples. Phage sequences assigned to the Microviridae family were widespread in sediment samples, whereas phage sequences assigned to the Siphoviridae, Myoviridae and Podoviridae families were predominant in water samples. In terms of bacterial communities, the phyla Latescibacteria, Zixibacteria, Bacteroidetes, Acidobacteria, Calditrichaeota, Gemmatimonadetes, Cyanobacteria and Patescibacteria were most widespread in sediment samples, while the phyla Epsilonbacteraeota, Tenericutes, Margulisbacteria, Proteobacteria, Actinobacteria, Planctomycetes and Marinimicrobia were most prevalent in water samples. 
Notably, the relative abundances of bacterial communities (at the major phylum level) estimated by 16S rRNA metabarcoding and by phage-host prediction were significantly similar. These results demonstrate the reliability of this novel approach for predicting the bacterial hosts of phages from shotgun metagenomic sequencing data.
Affiliation(s)
- Harilanto Felana Andrianjakarivony
- Microbes, Evolution, Phylogeny, and Infection (MEΦI), IHU—Méditerranée Infection, 19-21 Boulevard Jean Moulin, 13005 Marseille, France
- Microbiologie Environnementale Biotechnologie (MEB), Mediterranean Institute of Oceanography (MIO), 163 Avenue de Luminy, 13009 Marseille, France
| | - Yvan Bettarel
- MARBEC, Marine Biodiversity, Exploitation & Conservation, Université de Montpellier, CNRS, Ifremer, IRD, 093 Place Eugène Bataillon, 34090 Montpellier, France
| | - Fabrice Armougom
- Microbiologie Environnementale Biotechnologie (MEB), Mediterranean Institute of Oceanography (MIO), 163 Avenue de Luminy, 13009 Marseille, France
| | - Christelle Desnues
- Microbes, Evolution, Phylogeny, and Infection (MEΦI), IHU—Méditerranée Infection, 19-21 Boulevard Jean Moulin, 13005 Marseille, France
- Microbiologie Environnementale Biotechnologie (MEB), Mediterranean Institute of Oceanography (MIO), 163 Avenue de Luminy, 13009 Marseille, France
|
12
|
Abstract
Microfluidics has enabled a new era of cellular and molecular assays due to the small length scales, parallelization, and the modularity of various analysis and actuation functions. Droplet microfluidics, in particular, has been instrumental in providing new tools for biology with its ability to quickly and reproducibly generate drops that act as individual reactors. A notable beneficiary of this technology has been single-cell RNA sequencing, which has revealed new heterogeneities and interactions for the fundamental unit of life. However, viruses far surpass the diversity of cellular life, affect the dynamics of all ecosystems, and are a chronic source of global health crises. Despite their impact on the world, high-throughput and high-resolution viral profiling has been difficult, with conventional methods being limited to population-level averaging, large sample volumes, and few cultivable hosts. Consequently, most viruses have not been identified and studied. Droplet microfluidics holds the potential to address many of these limitations and offers new levels of sensitivity and throughput for virology. This Feature highlights recent efforts that have applied droplet microfluidics to the detection and study of viruses, including for diagnostics, virus-host interactions, and cell-independent virus assays. In combination with traditional virology methods, droplet microfluidics should prove a potent tool toward achieving a better understanding of the most abundant biological species on Earth.
Affiliation(s)
- Wenyang Jing
- Center for Biophysics and Quantitative Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States
| | - Hee-Sun Han
- Center for Biophysics and Quantitative Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States; Department of Chemistry, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States; Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, 1206 West Gregory Drive, Urbana, Illinois 61801, United States
|
13
|
Bernstein AS, Ando AW, Loch-Temzelides T, Vale MM, Li BV, Li H, Busch J, Chapman CA, Kinnaird M, Nowak K, Castro MC, Zambrana-Torrelio C, Ahumada JA, Xiao L, Roehrdanz P, Kaufman L, Hannah L, Daszak P, Pimm SL, Dobson AP. The costs and benefits of primary prevention of zoonotic pandemics. Science Advances 2022; 8:eabl4183. [PMID: 35119921 PMCID: PMC8816336 DOI: 10.1126/sciadv.abl4183] [Citation(s) in RCA: 77] [Impact Index Per Article: 38.5] [Indexed: 05/15/2023]
Abstract
The lives lost and economic costs of viral zoonotic pandemics have steadily increased over the past century. Prominent policymakers have promoted plans that argue the best ways to address future pandemic catastrophes should entail, "detecting and containing emerging zoonotic threats." In other words, we should take actions only after humans get sick. We sharply disagree. Humans have extensive contact with wildlife known to harbor vast numbers of viruses, many of which have not yet spilled into humans. We compute the annualized damages from emerging viral zoonoses. We explore three practical actions to minimize the impact of future pandemics: better surveillance of pathogen spillover and development of global databases of virus genomics and serology, better management of wildlife trade, and substantial reduction of deforestation. We find that these primary pandemic prevention actions cost less than 1/20th the value of lives lost each year to emerging viral zoonoses and have substantial cobenefits.
Affiliation(s)
- Aaron S. Bernstein
- Boston Children’s Hospital and the Center for Climate, Health and the Global Environment, Boston, MA 02115, USA
- Corresponding author. (A.S.B.); (S.L.P.); (A.P.D.)
| | - Amy W. Ando
- Department of Agricultural and Consumer Economics, University of Illinois Urbana-Champaign, Champaign, IL 61801, USA
- Resources for the Future, 1616 P Street NW, Washington, DC 20036, USA
| | - Ted Loch-Temzelides
- Department of Economics and Baker Institute for Public Policy, Rice University, Houston, TX 77005, USA
| | - Mariana M. Vale
- Ecology Department, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil
- National Institute of Science and Technology in Ecology, Evolution and Biodiversity Conservation, Goiania, Brazil
| | - Binbin V. Li
- Environment Research Center, Duke Kunshan University, Kunshan, Jiangsu Province 215317, China
- Nicholas School of the Environment, Duke University, Durham, NC 27708, USA
| | - Hongying Li
- EcoHealth Alliance, 520 Eighth Avenue, New York, NY 10018, USA
| | - Jonah Busch
- Moore Center for Science, Conservation International, Arlington, VA 22202, USA
| | - Colin A. Chapman
- Wilson Center, 1300 Pennsylvania Avenue NW, Washington, DC 20004, USA
- Center for the Advanced Study of Human Paleobiology, George Washington University, Washington, DC 20004, USA
- School of Life Sciences, University of KwaZulu-Natal, Pietermaritzburg, South Africa
- Shaanxi Key Laboratory for Animal Conservation, Northwest University, Xi’an, China
| | - Margaret Kinnaird
- Practice Leader, Wildlife, WWF International, The Mvuli, Mvuli Road, Westlands, Kenya
| | - Katarzyna Nowak
- The Safina Center, 80 North Country Road, Setauket, NY 11733, USA
| | - Marcia C. Castro
- Harvard T.H. Chan School of Public Health, Boston, MA 02215, USA
| | | | - Jorge A. Ahumada
- Moore Center for Science, Conservation International, Arlington, VA 22202, USA
| | - Lingyun Xiao
- Department of Health and Environmental Sciences, Xi’an Jiaotong-Liverpool University, Suzhou, Jiangsu Province 215123, China
| | - Patrick Roehrdanz
- Moore Center for Science, Conservation International, Arlington, VA 22202, USA
| | - Les Kaufman
- Department of Biology and Pardee Center for the Study of the Longer-Range Future, Boston University, Boston, MA 02215, USA
| | - Lee Hannah
- Moore Center for Science, Conservation International, Arlington, VA 22202, USA
| | - Peter Daszak
- EcoHealth Alliance, 520 Eighth Avenue, New York, NY 10018, USA
| | - Stuart L. Pimm
- Nicholas School of the Environment, Duke University, Durham, NC 27708, USA
- Corresponding author. (A.S.B.); (S.L.P.); (A.P.D.)
| | - Andrew P. Dobson
- Department of Ecology and Evolutionary Biology, Princeton University, Princeton, NJ 08544, USA
- Santa Fe Institute, Hyde Park Road, Santa Fe, NM 87501, USA
- Corresponding author. (A.S.B.); (S.L.P.); (A.P.D.)
|
14
|
Versoza CJ, Pfeifer SP. Computational Prediction of Bacteriophage Host Ranges. Microorganisms 2022; 10:149. [PMID: 35056598 PMCID: PMC8778386 DOI: 10.3390/microorganisms10010149] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Received: 12/17/2021] [Revised: 01/06/2022] [Accepted: 01/11/2022] [Indexed: 12/27/2022]
Abstract
Increased antibiotic resistance has prompted the development of bacteriophage agents for a multitude of applications in agriculture, biotechnology, and medicine. A key factor in the choice of agents for these applications is the host range of a bacteriophage, i.e., the bacterial genera, species, and strains a bacteriophage is able to infect. Although experimental explorations of host ranges remain the gold standard, such investigations are inherently limited to a small number of viruses and bacteria amenable to cultivation. Here, we review recently developed bioinformatic tools that offer a promising and high-throughput alternative by computationally predicting the putative host ranges of bacteriophages, including those challenging to grow in laboratory environments.
Affiliation(s)
- Cyril J. Versoza
- Center for Evolution and Medicine, School of Life Sciences, Arizona State University, Tempe, AZ 85281, USA
| | - Susanne P. Pfeifer
- Center for Mechanisms of Evolution, School of Life Sciences, Arizona State University, Tempe, AZ 85281, USA
|
15
|
Lood C, Boeckaerts D, Stock M, De Baets B, Lavigne R, van Noort V, Briers Y. Digital phagograms: predicting phage infectivity through a multilayer machine learning approach. Curr Opin Virol 2021; 52:174-181. [PMID: 34952265 DOI: 10.1016/j.coviro.2021.12.004] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 09/14/2021] [Revised: 11/26/2021] [Accepted: 12/04/2021] [Indexed: 12/19/2022]
Abstract
Machine learning has been broadly implemented to investigate biological systems. In this regard, the field of phage biology has embraced machine learning to elucidate and predict phage-host interactions, based on receptor-binding proteins, (anti-)defense systems, prophage detection, and life cycle recognition. Here, we highlight the enormous potential of integrating information from omics data with insights from systems biology to better understand phage-host interactions. We conceptualize and discuss the potential of a multilayer model that mirrors the phage infection process, integrating adsorption, bacterial pan-immune components and hijacking of the bacterial metabolism to predict phage infectivity. In the future, this model can offer insights into the underlying mechanisms of the infection process, and digital phagograms can support phage cocktail design and phage engineering.
Affiliation(s)
- Cédric Lood
- Laboratory of Gene Technology, Department of Biosystems, KU Leuven, Leuven, Belgium; Centre of Microbial and Plant Genetics, Department of Microbial and Molecular Systems, KU Leuven, Leuven, Belgium
| | - Dimitri Boeckaerts
- Laboratory of Applied Biotechnology, Department of Biotechnology, Ghent University, Ghent, Belgium; KERMIT, Department of Data Analysis and Mathematical Modelling, Ghent University, Ghent, Belgium
| | - Michiel Stock
- KERMIT, Department of Data Analysis and Mathematical Modelling, Ghent University, Ghent, Belgium; BIOBIX, Department of Data Analysis and Mathematical Modelling, Ghent University, Ghent, Belgium
| | - Bernard De Baets
- KERMIT, Department of Data Analysis and Mathematical Modelling, Ghent University, Ghent, Belgium
| | - Rob Lavigne
- Laboratory of Gene Technology, Department of Biosystems, KU Leuven, Leuven, Belgium.
| | - Vera van Noort
- Centre of Microbial and Plant Genetics, Department of Microbial and Molecular Systems, KU Leuven, Leuven, Belgium; Institute of Biology, Leiden University, Leiden, The Netherlands.
| | - Yves Briers
- Laboratory of Applied Biotechnology, Department of Biotechnology, Ghent University, Ghent, Belgium.
|
16
|
Albery GF, Becker DJ, Brierley L, Brook CE, Christofferson RC, Cohen LE, Dallas TA, Eskew EA, Fagre A, Farrell MJ, Glennon E, Guth S, Joseph MB, Mollentze N, Neely BA, Poisot T, Rasmussen AL, Ryan SJ, Seifert S, Sjodin AR, Sorrell EM, Carlson CJ. The science of the host-virus network. Nat Microbiol 2021; 6:1483-1492. [PMID: 34819645 DOI: 10.1038/s41564-021-00999-5] [Citation(s) in RCA: 42] [Impact Index Per Article: 14.0] [Received: 09/30/2020] [Accepted: 10/18/2021] [Indexed: 01/21/2023]
Abstract
Better methods to predict and prevent the emergence of zoonotic viruses could support future efforts to reduce the risk of epidemics. We propose a network science framework for understanding and predicting human and animal susceptibility to viral infections. Related approaches have so far helped to identify basic biological rules that govern cross-species transmission and structure the global virome. We highlight ways to make modelling both accurate and actionable, and discuss the barriers that prevent researchers from translating viral ecology into public health policies that could prevent future pandemics.
Affiliation(s)
- Gregory F Albery
- Department of Biology, Georgetown University, Washington DC, USA.
| | - Daniel J Becker
- Department of Biology, University of Oklahoma, Norman, OK, USA
| | - Liam Brierley
- Institute of Translational Medicine, University of Liverpool, Liverpool, UK
| | - Cara E Brook
- Department of Integrative Biology, University of California, Berkeley, Berkeley, CA, USA
| | | | - Lily E Cohen
- Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Tad A Dallas
- Department of Biological Sciences, University of South Carolina, Columbia, SC, USA
| | - Evan A Eskew
- Department of Biology, Pacific Lutheran University, Tacoma, WA, USA
| | - Anna Fagre
- Department of Microbiology, Immunology and Pathology, Colorado State University, Fort Collins, CO, USA
| | - Maxwell J Farrell
- Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, Ontario, Canada
| | - Emma Glennon
- Disease Dynamics Unit, Department of Veterinary Medicine, University of Cambridge, Cambridge, UK
| | - Sarah Guth
- Department of Integrative Biology, University of California, Berkeley, Berkeley, CA, USA
| | - Maxwell B Joseph
- Earth Lab, Cooperative Institute for Research in Environmental Science, University of Colorado Boulder, Boulder, CO, USA
| | - Nardus Mollentze
- Institute of Biodiversity, Animal Health and Comparative Medicine, University of Glasgow, Glasgow, UK; MRC - University of Glasgow Centre for Virus Research, Glasgow, UK
| | - Benjamin A Neely
- National Institute of Standards and Technology, Charleston, SC, USA
| | - Timothée Poisot
- Québec Centre for Biodiversity Sciences, Montréal, Québec, Canada; Département de Sciences Biologiques, Université de Montréal, Montréal, Québec, Canada
| | - Angela L Rasmussen
- Vaccine and Infectious Disease Organization, University of Saskatchewan, Saskatoon, Saskatchewan, Canada; Department of Biochemistry, Microbiology, and Immunology, University of Saskatchewan, Saskatoon, Saskatchewan, Canada
| | - Sadie J Ryan
- Department of Geography, University of Florida, Gainesville, FL, USA; Emerging Pathogens Institute, University of Florida, Gainesville, FL, USA; School of Life Sciences, University of KwaZulu-Natal, Durban, South Africa
| | - Stephanie Seifert
- Paul G. Allen School for Global Health, Washington State University, Pullman, WA, USA
| | - Anna R Sjodin
- Department of Biological Sciences, University of Idaho, Moscow, ID, USA
| | - Erin M Sorrell
- Center for Global Health Science and Security, Georgetown University Medical Center, Washington, DC, USA; Department of Microbiology and Immunology, Georgetown University Medical Center, Washington, DC, USA
| | - Colin J Carlson
- Center for Global Health Science and Security, Georgetown University Medical Center, Washington, DC, USA; Department of Microbiology and Immunology, Georgetown University Medical Center, Washington, DC, USA
|
17
|
Li M, Zhang W. PHIAF: prediction of phage-host interactions with GAN-based data augmentation and sequence-based feature fusion. Brief Bioinform 2021; 23:6362109. [PMID: 34472593 DOI: 10.1093/bib/bbab348] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 01/02/2021] [Revised: 07/05/2021] [Accepted: 07/18/2021] [Indexed: 01/01/2023]
Abstract
Phage therapy has become one of the most promising alternatives to antibiotics in the treatment of bacterial diseases, and identifying phage-host interactions (PHIs) helps to understand the possible mechanism through which a phage infects bacteria to guide the development of phage therapy. Compared with wet experiments, computational methods of identifying PHIs can reduce costs and save time and are more effective and economic. In this paper, we propose a PHI prediction method with a generative adversarial network (GAN)-based data augmentation and sequence-based feature fusion (PHIAF). First, PHIAF applies a GAN-based data augmentation module, which generates pseudo PHIs to alleviate the data scarcity. Second, PHIAF fuses the features originated from DNA and protein sequences for better performance. Third, PHIAF utilizes an attention mechanism to consider different contributions of DNA/protein sequence-derived features, which also provides interpretability of the prediction model. In computational experiments, PHIAF outperforms other state-of-the-art PHI prediction methods when evaluated via 5-fold cross-validation (AUC and AUPR are 0.88 and 0.86, respectively). An ablation study shows that data augmentation, feature fusion and an attention mechanism are all beneficial to improve the prediction performance of PHIAF. Additionally, four new PHIs with the highest PHIAF score in the case study were verified by recent literature. In conclusion, PHIAF is a promising tool to accelerate the exploration of phage therapy.
Affiliation(s)
- Menglu Li
- College of Informatics, Huazhong Agricultural University, Wuhan, 430070, China
| | - Wen Zhang
- College of Informatics, Huazhong Agricultural University, Wuhan, 430070, China
|
18
|
Nami Y, Imeni N, Panahi B. Application of machine learning in bacteriophage research. BMC Microbiol 2021; 21:193. [PMID: 34174831 PMCID: PMC8235560 DOI: 10.1186/s12866-021-02256-5] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 03/13/2021] [Accepted: 06/08/2021] [Indexed: 12/20/2022]
Abstract
Phages are one of the key components in the structure, dynamics, and interactions of microbial communities in different bins. They have a clear impact on human health and the food industry. Bacteriophage characterization using in vitro approaches is time-consuming, costly, and laborious. On the other hand, with the advent of new high-throughput sequencing technology, the development of a powerful computational framework to characterize the newly identified bacteriophages is inevitable for future research. Machine learning includes powerful techniques that enable the analysis of complex datasets for knowledge discovery and pattern recognition. In this study, we have conducted a comprehensive review of machine learning methods, using different types of features, that were applied in various aspects of bacteriophage research, including automated curation, identification, classification, host species recognition, virion protein identification, and life cycle prediction. Moreover, potential limitations and advantages of the developed frameworks were discussed.
Affiliation(s)
- Yousef Nami
- Department of Food Biotechnology, Branch for Northwest & West Region, Agricultural Biotechnology Research Institute of Iran, Agricultural Research, Education and Extension Organization (AREEO), Tabriz, Iran
| | - Nazila Imeni
- Young Researchers and Elite Club, Marand Branch, Islamic Azad University, Marand, Iran
| | - Bahman Panahi
- Department of Genomics, Branch for Northwest & West Region, Agricultural Biotechnology Research Institute of Iran, Agricultural Research, Education and Extension Organization (AREEO), Tabriz, Iran.
|
19
|
Coutinho FH, Zaragoza-Solas A, López-Pérez M, Barylski J, Zielezinski A, Dutilh BE, Edwards R, Rodriguez-Valera F. RaFAH: Host prediction for viruses of Bacteria and Archaea based on protein content. Patterns 2021; 2:100274. [PMID: 34286299 PMCID: PMC8276007 DOI: 10.1016/j.patter.2021.100274] [Citation(s) in RCA: 35] [Impact Index Per Article: 11.7] [Received: 10/28/2020] [Revised: 11/23/2020] [Accepted: 05/07/2021] [Indexed: 02/06/2023]
Abstract
Culture-independent approaches have recently shed light on the genomic diversity of viruses of prokaryotes. One fundamental question when trying to understand their ecological roles is: which host do they infect? To tackle this issue we developed a machine-learning approach named Random Forest Assignment of Hosts (RaFAH), that uses scores to 43,644 protein clusters to assign hosts to complete or fragmented genomes of viruses of Archaea and Bacteria. RaFAH displayed performance comparable with that of other methods for virus-host prediction in three different benchmarks encompassing viruses from RefSeq, single amplified genomes, and metagenomes. RaFAH was applied to assembled metagenomic datasets of uncultured viruses from eight different biomes of medical, biotechnological, and environmental relevance. Our analyses led to the identification of 537 sequences of archaeal viruses representing unknown lineages, whose genomes encode novel auxiliary metabolic genes, shedding light on how these viruses interfere with the host molecular machinery. RaFAH is available at https://sourceforge.net/projects/rafah/. RaFAH was developed to predict the hosts of viruses of Bacteria and Archaea. RaFAH displayed comparable or superior performance to other host-prediction tools. RaFAH performed well across viromes from eight different ecosystems. RaFAH identified hundreds of genomic sequences as derived from viruses of Archaea.
Viruses that infect Bacteria and Archaea are ubiquitous and extremely abundant. Recent advances have led to the discovery of many thousands of complete and partial genomes of these biological entities. Understanding the biology of these viruses and how they influence their ecosystems depends on knowing which hosts they infect. We developed a tool that uses data from complete or fragmented genomes to predict the hosts of viruses using a machine-learning approach. Our tool, RaFAH, displayed performance comparable with or superior to that of other host-prediction tools. In addition, it identified hundreds of sequences as derived from the genomes of viruses of Archaea, which are one of the least characterized fractions of the global virosphere.
Affiliation(s)
- Felipe Hernandes Coutinho
- Evolutionary Genomics Group, Departamento de Producción Vegetal y Microbiología, Universidad Miguel Hernández, Aptdo. 18., Ctra. Alicante-Valencia N-332, s/n, San Juan de Alicante, 03550 Alicante, Spain
| | - Asier Zaragoza-Solas
- Evolutionary Genomics Group, Departamento de Producción Vegetal y Microbiología, Universidad Miguel Hernández, Aptdo. 18., Ctra. Alicante-Valencia N-332, s/n, San Juan de Alicante, 03550 Alicante, Spain
| | - Mario López-Pérez
- Evolutionary Genomics Group, Departamento de Producción Vegetal y Microbiología, Universidad Miguel Hernández, Aptdo. 18., Ctra. Alicante-Valencia N-332, s/n, San Juan de Alicante, 03550 Alicante, Spain
| | - Jakub Barylski
- Molecular Virology Research Unit, Faculty of Biology, Adam Mickiewicz University Poznan, 61-614 Poznan, Poland
| | - Andrzej Zielezinski
- Department of Computational Biology, Faculty of Biology, Adam Mickiewicz University Poznan, 61-614 Poznan, Poland
| | - Bas E Dutilh
- Centre for Molecular and Biomolecular Informatics (CMBI), Radboud University Medical Centre/Radboud Institute for Molecular Life Sciences, 6525 GA Nijmegen, the Netherlands; Theoretical Biology and Bioinformatics, Science for Life, Utrecht University (UU), 3584 CH Utrecht, the Netherlands
| | - Robert Edwards
- College of Science and Engineering, Flinders University, Bedford Park, SA 5042, Australia
| | - Francisco Rodriguez-Valera
- Evolutionary Genomics Group, Departamento de Producción Vegetal y Microbiología, Universidad Miguel Hernández, Aptdo. 18., Ctra. Alicante-Valencia N-332, s/n, San Juan de Alicante, 03550 Alicante, Spain; Moscow Institute of Physics and Technology, Dolgoprudny 141701, Russia
|
20
|
Brierley L, Fowler A. Predicting the animal hosts of coronaviruses from compositional biases of spike protein and whole genome sequences through machine learning. PLoS Pathog 2021; 17:e1009149. [PMID: 33878118 PMCID: PMC8087038 DOI: 10.1371/journal.ppat.1009149] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 12/01/2020] [Revised: 04/30/2021] [Accepted: 04/09/2021] [Indexed: 12/21/2022]
Abstract
The COVID-19 pandemic has demonstrated the serious potential for novel zoonotic coronaviruses to emerge and cause major outbreaks. The immediate animal origin of the causative virus, SARS-CoV-2, remains unknown, a notoriously challenging task for emerging disease investigations. Coevolution with hosts leads to specific evolutionary signatures within viral genomes that can inform likely animal origins. We obtained a set of 650 spike protein and 511 whole genome nucleotide sequences from 222 and 185 viruses belonging to the family Coronaviridae, respectively. We then trained random forest models independently on genome composition biases of spike protein and whole genome sequences, including dinucleotide and codon usage biases in order to predict animal host (of nine possible categories, including human). In hold-one-out cross-validation, predictive accuracy on unseen coronaviruses consistently reached ~73%, indicating evolutionary signal in spike proteins to be just as informative as whole genome sequences. However, different composition biases were informative in each case. Applying optimised random forest models to classify human sequences of MERS-CoV and SARS-CoV revealed evolutionary signatures consistent with their recognised intermediate hosts (camelids, carnivores), while human sequences of SARS-CoV-2 were predicted as having bat hosts (suborder Yinpterochiroptera), supporting bats as the suspected origins of the current pandemic. In addition to phylogeny, variation in genome composition can act as an informative approach to predict emerging virus traits as soon as sequences are available. More widely, this work demonstrates the potential in combining genetic resources with machine learning algorithms to address long-standing challenges in emerging infectious diseases.
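As a rough illustration of the kind of genome composition feature this abstract refers to, the sketch below computes a dinucleotide bias as the ratio of observed to expected dinucleotide frequency. The abstract does not give the exact formula the authors used, so the odds-ratio definition here is an assumption, and the function name `dinucleotide_bias` is hypothetical. A 16-value vector like this could in principle serve as input features for a classifier such as a random forest.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"


def dinucleotide_bias(seq: str) -> dict[str, float]:
    """Observed/expected frequency ratio for each of the 16 dinucleotides.

    Assumed definition (not taken from the cited paper):
    ratio(XY) = f(XY) / (f(X) * f(Y)), using overlapping pairs.
    """
    seq = seq.upper()
    # Single-base counts, skipping ambiguous symbols such as N.
    mono = Counter(b for b in seq if b in BASES)
    # Overlapping dinucleotide counts where both positions are unambiguous.
    di = Counter(
        seq[i:i + 2]
        for i in range(len(seq) - 1)
        if seq[i] in BASES and seq[i + 1] in BASES
    )
    n_mono = sum(mono.values())
    n_di = sum(di.values())

    ratios = {}
    for x, y in product(BASES, repeat=2):
        expected = (mono[x] / n_mono) * (mono[y] / n_mono) if n_mono else 0.0
        observed = di[x + y] / n_di if n_di else 0.0
        # Report 0.0 when the expectation is undefined (a base is absent).
        ratios[x + y] = observed / expected if expected > 0 else 0.0
    return ratios


if __name__ == "__main__":
    for pair, ratio in sorted(dinucleotide_bias("ACGTACGTTTGACA").items()):
        print(pair, round(ratio, 3))
```

A ratio above 1 means the dinucleotide occurs more often than its base composition alone would predict, and a ratio below 1 means it is under-represented.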
Affiliation(s)
- Liam Brierley
- Department of Health Data Science, University of Liverpool, Brownlow Street, Liverpool, United Kingdom
| | - Anna Fowler
- Department of Health Data Science, University of Liverpool, Brownlow Street, Liverpool, United Kingdom
|
21
|
Lamy-Besnier Q, Brancotte B, Ménager H, Debarbieux L. Viral Host Range database, an online tool for recording, analyzing and disseminating virus-host interactions. Bioinformatics 2021; 37:2798-2801. [PMID: 33594411 PMCID: PMC8428608 DOI: 10.1093/bioinformatics/btab070] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 11/02/2020] [Revised: 01/11/2021] [Accepted: 02/15/2021] [Indexed: 11/13/2022]
Abstract
Motivation: Viruses are ubiquitous in the living world, and their ability to infect more than one host defines their host range. However, information about which virus infects which host, and about which host is infected by which virus, is not readily available. Results: We developed a web-based tool called the Viral Host Range database to record, analyze and disseminate experimental host range data for viruses infecting archaea, bacteria and eukaryotes. Availability and implementation: The ViralHostRangeDB application is available from https://viralhostrangedb.pasteur.cloud. Its source code is freely available from the Gitlab instance of Institut Pasteur (https://gitlab.pasteur.fr/hub/viralhostrangedb).
Affiliation(s)
- Quentin Lamy-Besnier
- Bacteriophage, Bacterium, Host Laboratory, Department of Microbiology, Institut Pasteur, Paris, F-75015, France; Université de Paris, Paris, France
| | - Bryan Brancotte
- Bioinformatics and Biostatistics, Institut Pasteur, Paris, F-75015, France
| | - Hervé Ménager
- Bioinformatics and Biostatistics, Institut Pasteur, Paris, F-75015, France
| | - Laurent Debarbieux
- Bacteriophage, Bacterium, Host Laboratory, Department of Microbiology, Institut Pasteur, Paris, F-75015, France
|
22
|
Hufsky F, Beerenwinkel N, Meyer IM, Roux S, Cook GM, Kinsella CM, Lamkiewicz K, Marquet M, Nieuwenhuijse DF, Olendraite I, Paraskevopoulou S, Young F, Dijkman R, Ibrahim B, Kelly J, Le Mercier P, Marz M, Ramette A, Thiel V. The International Virus Bioinformatics Meeting 2020. Viruses 2020; 12:E1398. [PMID: 33291220 PMCID: PMC7762161 DOI: 10.3390/v12121398] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 11/30/2020] [Accepted: 12/01/2020] [Indexed: 12/16/2022]
Abstract
The International Virus Bioinformatics Meeting 2020 was originally planned to take place in Bern, Switzerland, in March 2020. However, the COVID-19 pandemic put a spoke in the wheel of almost all conferences to be held in 2020. After moving the conference to 8-9 October 2020, we got hit by the second wave and finally decided at short notice to go fully online. On the other hand, the pandemic has made us even more aware of the importance of accelerating research in viral bioinformatics. Advances in bioinformatics have led to improved approaches to investigate viral infections and outbreaks. The International Virus Bioinformatics Meeting 2020 has attracted approximately 120 experts in virology and bioinformatics from all over the world to join the two-day virtual meeting. Despite concerns being raised that virtual meetings lack possibilities for face-to-face discussion, the participants from this small community created a highly interactive scientific environment, engaging in lively and inspiring discussions and suggesting new research directions and questions. The meeting featured five invited and twelve contributed talks, on the four main topics: (1) proteome and RNAome of RNA viruses, (2) viral metagenomics and ecology, (3) virus evolution and classification and (4) viral infections and immunology. Further, the meeting featured 20 oral poster presentations, all of which focused on specific areas of virus bioinformatics. This report summarizes the main research findings and highlights presented at the meeting.
Affiliation(s)
- Franziska Hufsky
- European Virus Bioinformatics Center, 07743 Jena, Germany
- RNA Bioinformatics and High-Throughput Analysis, Friedrich Schiller University Jena, 07743 Jena, Germany
| | - Niko Beerenwinkel
- European Virus Bioinformatics Center, 07743 Jena, Germany
- Department of Biosystems Science and Engineering, ETH Zurich, 4058 Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, 4058 Basel, Switzerland
| | - Irmtraud M. Meyer
- European Virus Bioinformatics Center, 07743 Jena, Germany
- Max Delbrück Center for Molecular Medicine in the Helmholtz Association, Berlin Institute for Medical Systems Biology, 10115 Berlin, Germany
- Department of Biology, Chemistry and Pharmacy, Institute of Chemistry and Biochemistry, Freie Universität Berlin, 14195 Berlin, Germany
| | - Simon Roux
- Lawrence Berkeley National Laboratory, DOE Joint Genome Institute, Berkeley, CA 94720, USA
| | - Georgia May Cook
- European Virus Bioinformatics Center, 07743 Jena, Germany
- Department of Pathology, Division of Virology, University of Cambridge, Cambridge CB2 1TN, UK
| | - Cormac M. Kinsella
- European Virus Bioinformatics Center, 07743 Jena, Germany
- Laboratory of Experimental Virology, Department of Medical Microbiology and Infection Prevention, Amsterdam UMC, University of Amsterdam, 1105 AZ Amsterdam, The Netherlands
| | - Kevin Lamkiewicz
- European Virus Bioinformatics Center, 07743 Jena, Germany
- RNA Bioinformatics and High-Throughput Analysis, Friedrich Schiller University Jena, 07743 Jena, Germany
| | - Mike Marquet
- European Virus Bioinformatics Center, 07743 Jena, Germany
- CaSe Group, Institut für Infektionsmedizin und Krankenhaushygiene, Universitätsklinikum Jena, 07743 Jena, Germany
| | - David F. Nieuwenhuijse
- European Virus Bioinformatics Center, 07743 Jena, Germany
- Viroscience Department, Erasmus MC, 3015 GD Rotterdam, The Netherlands
| | - Ingrida Olendraite
- European Virus Bioinformatics Center, 07743 Jena, Germany
- Department of Pathology, Division of Virology, University of Cambridge, Cambridge CB2 1TN, UK
| | - Sofia Paraskevopoulou
- European Virus Bioinformatics Center, 07743 Jena, Germany
- Institute of Virology, Charité-Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health, 10117 Berlin, Germany
| | - Francesca Young
- MRC-University of Glasgow Centre for Virus Research, Glasgow G61 1QH, UK
| | - Ronald Dijkman
- European Virus Bioinformatics Center, 07743 Jena, Germany
- Institute of Virology and Immunology, University of Bern, 3012 Bern, Switzerland
- Department of Infectious Diseases and Pathobiology, Vetsuisse Faculty, University of Bern, 3012 Bern, Switzerland
- Institute for Infectious Diseases, University of Bern, 3012 Bern, Switzerland
| | - Bashar Ibrahim
- European Virus Bioinformatics Center, 07743 Jena, Germany
- Centre for Applied Mathematics and Bioinformatics, Hawally 32093, Kuwait
- Department of Mathematics and Natural Sciences, Gulf University for Science and Technology, Hawally 32093, Kuwait
| | - Jenna Kelly
- European Virus Bioinformatics Center, 07743 Jena, Germany
- Institute of Virology and Immunology, University of Bern, 3012 Bern, Switzerland
| | - Philippe Le Mercier
- European Virus Bioinformatics Center, 07743 Jena, Germany
- Swiss-Prot Group, SIB Swiss Institute of Bioinformatics, 1205 Geneva, Switzerland
| | - Manja Marz
- European Virus Bioinformatics Center, 07743 Jena, Germany
- RNA Bioinformatics and High-Throughput Analysis, Friedrich Schiller University Jena, 07743 Jena, Germany
| | - Alban Ramette
- European Virus Bioinformatics Center, 07743 Jena, Germany
- Institute for Infectious Diseases, University of Bern, 3012 Bern, Switzerland
| | - Volker Thiel
- European Virus Bioinformatics Center, 07743 Jena, Germany
- Institute of Virology and Immunology, University of Bern, 3012 Bern, Switzerland
| |
|