1
Hou X, He Y, Fang P, Mei SQ, Xu Z, Wu WC, Tian JH, Zhang S, Zeng ZY, Gou QY, Xin GY, Le SJ, Xia YY, Zhou YL, Hui FM, Pan YF, Eden JS, Yang ZH, Han C, Shu YL, Guo D, Li J, Holmes EC, Li ZR, Shi M. Using artificial intelligence to document the hidden RNA virosphere. Cell 2024:S0092-8674(24)01085-7. [PMID: 39389057 DOI: 10.1016/j.cell.2024.09.027] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/11/2024] [Revised: 08/01/2024] [Accepted: 09/16/2024] [Indexed: 10/12/2024]
Abstract
Current metagenomic tools can fail to identify highly divergent RNA viruses. We developed a deep learning algorithm, termed LucaProt, to discover highly divergent RNA-dependent RNA polymerase (RdRP) sequences in 10,487 metatranscriptomes generated from diverse global ecosystems. LucaProt integrates both sequence and predicted structural information, enabling the accurate detection of RdRP sequences. Using this approach, we identified 161,979 potential RNA virus species and 180 RNA virus supergroups, including many previously poorly studied groups, as well as RNA virus genomes of exceptional length (up to 47,250 nucleotides) and genomic complexity. A subset of these novel RNA viruses was confirmed by RT-PCR and RNA/DNA sequencing. Newly discovered RNA viruses were present in diverse environments, including air, hot springs, and hydrothermal vents, with virus diversity and abundance varying substantially among ecosystems. This study advances virus discovery, highlights the scale of the virosphere, and provides computational tools to better document the global RNA virome.
Affiliation(s)
- Xin Hou
- National Key Laboratory of Intelligent Tracking and Forecasting for Infectious Diseases, State Key Laboratory for Biocontrol, School of Medicine, Shenzhen Campus of Sun Yat-sen University, Sun Yat-sen University, Shenzhen, China
- Yong He
- Apsara Lab, Alibaba Cloud Intelligence, Alibaba Group, Hangzhou, China
- Pan Fang
- Apsara Lab, Alibaba Cloud Intelligence, Alibaba Group, Hangzhou, China
- Shi-Qiang Mei
- National Key Laboratory of Intelligent Tracking and Forecasting for Infectious Diseases, State Key Laboratory for Biocontrol, School of Medicine, Shenzhen Campus of Sun Yat-sen University, Sun Yat-sen University, Shenzhen, China
- Zan Xu
- Apsara Lab, Alibaba Cloud Intelligence, Alibaba Group, Hangzhou, China
- Wei-Chen Wu
- National Key Laboratory of Intelligent Tracking and Forecasting for Infectious Diseases, State Key Laboratory for Biocontrol, School of Medicine, Shenzhen Campus of Sun Yat-sen University, Sun Yat-sen University, Shenzhen, China
- Jun-Hua Tian
- Wuhan Centers for Disease Control and Prevention, Wuhan, China
- Shun Zhang
- Apsara Lab, Alibaba Cloud Intelligence, Alibaba Group, Hangzhou, China
- Zhen-Yu Zeng
- Apsara Lab, Alibaba Cloud Intelligence, Alibaba Group, Hangzhou, China
- Qin-Yu Gou
- National Key Laboratory of Intelligent Tracking and Forecasting for Infectious Diseases, State Key Laboratory for Biocontrol, School of Medicine, Shenzhen Campus of Sun Yat-sen University, Sun Yat-sen University, Shenzhen, China
- Gen-Yang Xin
- National Key Laboratory of Intelligent Tracking and Forecasting for Infectious Diseases, State Key Laboratory for Biocontrol, School of Medicine, Shenzhen Campus of Sun Yat-sen University, Sun Yat-sen University, Shenzhen, China
- Shi-Jia Le
- National Key Laboratory of Intelligent Tracking and Forecasting for Infectious Diseases, State Key Laboratory for Biocontrol, School of Medicine, Shenzhen Campus of Sun Yat-sen University, Sun Yat-sen University, Shenzhen, China
- Yin-Yue Xia
- Polar Research Institute of China, Shanghai, China
- Yu-Lan Zhou
- Department of Nursing, The Fifth Affiliated Hospital, Sun Yat-sen University, Zhuhai, China
- Feng-Ming Hui
- School of Geospatial Engineering and Science, Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai), Sun Yat-sen University, Zhuhai, China; Key Laboratory of Comprehensive Observation of Polar Environment, Ministry of Education, Sun Yat-sen University, Zhuhai, China
- Yuan-Fei Pan
- Ministry of Education Key Laboratory of Biodiversity Science and Ecological Engineering, National Observations and Research Station for Wetland Ecosystems of the Yangtze Estuary, Institute of Biodiversity Science and Institute of Eco-Chongming, School of Life Sciences, Fudan University Shanghai, Shanghai, China
- John-Sebastian Eden
- Centre for Virus Research, Westmead Institute for Medical Research, Westmead, NSW, Australia; School of Medical Sciences, The University of Sydney, Sydney, NSW 2006, Australia
- Zhao-Hui Yang
- College of Life Sciences, Zhejiang University, Hangzhou, China
- Chong Han
- School of Life Science, Guangzhou University, Guangzhou, China
- Yue-Long Shu
- Key Laboratory of Pathogen Infection Prevention and Control (MOE), State Key Laboratory of Respiratory Health and Multimorbidity, National Institute of Pathogen Biology, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, China; School of Public Health (Shenzhen), Shenzhen Campus of Sun Yat-sen University, Sun Yat-sen University, Shenzhen, China
- Deyin Guo
- Guangzhou National Laboratory, Guangzhou International Bio-Island, Guangzhou, China
- Jun Li
- Department of Infectious Diseases and Public Health, Jockey Club College of Veterinary Medicine and Life Sciences, City University of Hong Kong, Hong Kong SAR, China
- Edward C Holmes
- School of Medical Sciences, The University of Sydney, Sydney, NSW 2006, Australia; Laboratory of Data Discovery for Health Limited, Hong Kong SAR, China
- Zhao-Rong Li
- Apsara Lab, Alibaba Cloud Intelligence, Alibaba Group, Hangzhou, China
- Mang Shi
- National Key Laboratory of Intelligent Tracking and Forecasting for Infectious Diseases, State Key Laboratory for Biocontrol, School of Medicine, Shenzhen Campus of Sun Yat-sen University, Sun Yat-sen University, Shenzhen, China; Shenzhen Key Laboratory for Systems Medicine in Inflammatory Diseases, Shenzhen Campus of Sun Yat-sen University, Sun Yat-sen University, Shenzhen, China; Guangdong Provincial Center for Disease Control and Prevention, Guangzhou, China
2
Maestri R, Perez-Lamarque B, Zhukova A, Morlon H. Recent evolutionary origin and localized diversity hotspots of mammalian coronaviruses. eLife 2024; 13:RP91745. [PMID: 39196812 PMCID: PMC11357359 DOI: 10.7554/elife.91745] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/30/2024] Open
Abstract
Several coronaviruses infect humans, with three, including SARS-CoV-2, causing diseases. While coronaviruses are especially prone to induce pandemics, we know little about their evolutionary history, host-to-host transmissions, and biogeography. One of the difficulties lies in dating the origination of the family, a particularly challenging task for RNA viruses in general. Previous cophylogenetic tests of virus-host associations, including in the Coronaviridae family, have suggested a virus-host codiversification history stretching many millions of years. Here, we establish a framework for robustly testing scenarios of ancient origination and codiversification versus recent origination and diversification by host switches. Applied to coronaviruses and their mammalian hosts, our results support a scenario of recent origination of coronaviruses in bats and diversification by host switches, with preferential host switches within mammalian orders. Hotspots of coronavirus diversity, concentrated in East Asia and Europe, are consistent with this scenario of relatively recent origination and localized host switches. Spillovers from bats to other species are rare, but are more likely to be towards humans than towards any other mammal species, implicating humans as the evolutionary intermediate host. The high host-switching rates within orders, as well as between humans, domesticated mammals, and non-flying wild mammals, indicate the potential for rapid additional spreading of coronaviruses across the world. Our results suggest that the evolutionary history of extant mammalian coronaviruses is recent, and that cases of long-term virus-host codiversification have been largely over-estimated.
Affiliation(s)
- Renan Maestri
- Institut de Biologie de l'École Normale Supérieure (IBENS), École Normale Supérieure, CNRS, INSERM, Université PSL, Paris, France
- Departamento de Ecologia, Instituto de Biociências, Universidade Federal do Rio Grande do Sul, Porto Alegre, Brazil
- Benoît Perez-Lamarque
- Institut de Biologie de l'École Normale Supérieure (IBENS), École Normale Supérieure, CNRS, INSERM, Université PSL, Paris, France
- Institut de Systématique, Évolution, Biodiversité (ISYEB), Muséum national d’histoire naturelle, CNRS, Sorbonne Université, EPHE, UA, Paris, France
- Anna Zhukova
- Institut Pasteur, Université Paris Cité, Bioinformatics and Biostatistics Hub, Paris, France
- Hélène Morlon
- Institut de Biologie de l'École Normale Supérieure (IBENS), École Normale Supérieure, CNRS, INSERM, Université PSL, Paris, France
3
Tian Z, Hu T, Holmes EC, Ji J, Shi W. Analysis of the genetic diversity in RNA-directed RNA polymerase sequences: implications for an automated RNA virus classification system. Virus Evol 2024; 10:veae059. [PMID: 39119135 PMCID: PMC11306317 DOI: 10.1093/ve/veae059] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/05/2024] [Revised: 06/21/2024] [Accepted: 07/24/2024] [Indexed: 08/10/2024] Open
Abstract
RNA viruses are characterized by a broad host range and high levels of genetic diversity. Despite a recent expansion in the known virosphere following metagenomic sequencing, our knowledge of the species rank genetic diversity of RNA viruses, and how often they are misassigned and misclassified, is limited. We performed a clustering analysis of 7801 RNA-directed RNA polymerase (RdRp) sequences representing 1897 established RNA virus species. From this, we identified substantial genetic divergence within some virus species and inconsistency in RNA virus assignment between the GenBank database and The International Committee on Taxonomy of Viruses (ICTV). In particular, 27.57% of virus species comprised multiple virus operational taxonomic units (vOTUs), including Alphainfluenzavirus influenzae, Mammarenavirus lassaense, Apple stem pitting virus, and Rotavirus A, with each having over 100 vOTUs. In addition, the distribution of average amino acid identity between vOTUs within single assigned species showed a relatively low threshold: <90% and sometimes <50%. However, when only exemplar sequences from virus species were analyzed, 1889 of the ICTV-designated RNA virus species (99.58%) were clustered into a single vOTU. Clustering of the RdRp sequences from different virus species also revealed that 17 vOTUs contained two distinct virus species. These potential misassignments were confirmed by phylogenetic analysis. A further analysis of average nucleotide identity (ANI) values ranging from 70% to 97.5% revealed that at an ANI of 82.5%, 1559 (82.18%) of the 1897 virus species could be correctly clustered into one single vOTU. However, at ANI values >82.5%, an increasing number of species were clustered into two or more vOTUs. In sum, we have identified some inconsistency and misassignment of the RNA virus species based on the analysis of RdRp sequences alone, which has important implications for the development of an automated RNA virus classification system.
Affiliation(s)
- Zhongshuai Tian
- Key Laboratory of Emerging Infectious Diseases in Universities of Shandong, Shandong First Medical University & Shandong Academy of Medical Sciences, No. 6699 Qingdao Road, Ji’nan 250117, China
- Shanghai Institute of Virology, Shanghai Jiao Tong University School of Medicine, No. 227 Chongqingnanlu, Shanghai 200025, China
- Tao Hu
- Key Laboratory of Emerging Infectious Diseases in Universities of Shandong, Shandong First Medical University & Shandong Academy of Medical Sciences, No. 6699 Qingdao Road, Ji’nan 250117, China
- Edward C Holmes
- Sydney Institute for Infectious Diseases, School of Medical Sciences, The University of Sydney, Sydney, New South Wales 2006, Australia
- Laboratory of Data Discovery for Health Limited, 19 Science Park West Avenue, Hong Kong 999077, China
- Jingkai Ji
- School of Life Sciences, Shandong First Medical University & Shandong Academy of Medical Sciences, No. 619 Changcheng Road, Taian 271000, China
- Weifeng Shi
- Shanghai Institute of Virology, Shanghai Jiao Tong University School of Medicine, No. 227 Chongqingnanlu, Shanghai 200025, China
- Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, No. 197 Ruijinerlu, Shanghai 200025, China
4
Buivydaitė Ž, Winding A, Sapkota R. Transmission of mycoviruses: new possibilities. Front Microbiol 2024; 15:1432840. [PMID: 38993496 PMCID: PMC11236713 DOI: 10.3389/fmicb.2024.1432840] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/14/2024] [Accepted: 06/12/2024] [Indexed: 07/13/2024] Open
Abstract
Mycoviruses are viruses that infect fungi. In recent years, an increasing number of mycoviruses have been reported in a wide array of fungi. With the growing interest of scientists and society in reducing the use of agrochemicals, the debate about mycoviruses as an effective next-generation biocontrol has regained momentum. Mycoviruses can have profound effects on the host phenotype, although most viruses have neutral or no effect. We speculate that understanding multiple transmission modes of mycoviruses is central to unraveling the viral ecology and their function in regulating fungal populations. Unlike plant virus transmission via vegetative plant parts, seeds, pollen, or vectors, a widely held view is that mycoviruses are transmitted via vertical routes and only under special circumstances horizontally via hyphal contact depending on the vegetative compatibility groups (i.e., the ability of different fungal strains to undergo hyphal fusion). However, this view has been challenged over the past decades, as new possible transmission routes of mycoviruses are beginning to unravel. In this perspective, we discuss emerging studies with evidence suggesting that such novel routes of mycovirus transmission exist and are pertinent to understanding the full picture of mycovirus ecology and evolution.
Affiliation(s)
- Rumakanta Sapkota
- Department of Environmental Science, Aarhus University, Roskilde, Denmark
5
Gupta P, Hiller A, Chowdhury J, Lim D, Lim DY, Saeij JPJ, Babaian A, Rodriguez F, Pereira L, Morales-Tapia A. A parasite odyssey: An RNA virus concealed in Toxoplasma gondii. Virus Evol 2024; 10:veae040. [PMID: 38817668 PMCID: PMC11137675 DOI: 10.1093/ve/veae040] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2023] [Revised: 03/05/2024] [Accepted: 05/10/2024] [Indexed: 06/01/2024] Open
Abstract
We are entering a 'Platinum Age of Virus Discovery', an era marked by exponential growth in the discovery of virus biodiversity, and driven by advances in metagenomics and computational analysis. In the ecosystem of a human (or any animal) there are more species of viruses than simply those directly infecting the animal cells. Viruses can infect all organisms constituting the microbiome, including bacteria, fungi, and unicellular parasites. Thus the complexity of possible interactions between host, microbe, and viruses is unfathomable. To understand this interaction network we must employ computationally assisted virology as a means of analyzing and interpreting the millions of available samples to make inferences about the ways in which viruses may intersect human health. From a computational viral screen of human neuronal datasets, we identified a novel narnavirus Apocryptovirus odysseus (Ao) which likely infects the neurotropic parasite Toxoplasma gondii. Previously, several parasitic protozoan viruses (PPVs) have been mechanistically established as triggers of host innate responses, and here we present in silico evidence that Ao is a plausible pro-inflammatory factor in human and mouse cells infected by T. gondii. T. gondii infects billions of people worldwide, yet the prognosis of toxoplasmosis disease is highly variable, and PPVs like Ao could function as a hitherto undescribed hypervirulence factor. In a broader screen of over 7.6 million samples, we explored phylogenetically proximal viruses to Ao and discovered nineteen Apocryptovirus species, all found in libraries annotated as vertebrate transcriptome or metatranscriptomes. While samples containing this genus of narnaviruses are derived from sheep, goat, bat, rabbit, chicken, and pigeon samples, the presence of virus is strongly predictive of parasitic Apicomplexa nucleic acid co-occurrence, supporting the fact that Apocryptovirus is a genus of parasite-infecting viruses. 
This is a computational proof-of-concept study in which we rapidly analyze millions of datasets from which we distilled a mechanistically, ecologically, and phylogenetically refined hypothesis. We predict that this highly diverged Ao RNA virus is biologically a T. gondii infection, and that Ao, and other viruses like it, will modulate this disease which afflicts billions worldwide.
Affiliation(s)
- Purav Gupta
- The Woodlands Secondary School, 3225 Erindale Station Rd, Mississauga, ON L5C 1Y5, Canada
- Department of Molecular Genetics, University of Toronto, 1 King’s College Circle, Toronto, ON M5S 1A8, Canada
- The Donnelly Centre for Cellular + Biomolecular Research, University of Toronto, 160 College St, Toronto, ON M5S 3E1, Canada
- Aiden Hiller
- Department of Molecular Genetics, University of Toronto, 1 King’s College Circle, Toronto, ON M5S 1A8, Canada
- The Donnelly Centre for Cellular + Biomolecular Research, University of Toronto, 160 College St, Toronto, ON M5S 3E1, Canada
- The Woodlands Secondary School, 3225 Erindale Station Rd, Mississauga, ON L5C 1Y5, Canada
- Jawad Chowdhury
- Department of Molecular Genetics, University of Toronto, 1 King’s College Circle, Toronto, ON M5S 1A8, Canada
- The Donnelly Centre for Cellular + Biomolecular Research, University of Toronto, 160 College St, Toronto, ON M5S 3E1, Canada
- The Woodlands Secondary School, 3225 Erindale Station Rd, Mississauga, ON L5C 1Y5, Canada
- Declan Lim
- Department of Molecular Genetics, University of Toronto, 1 King’s College Circle, Toronto, ON M5S 1A8, Canada
- The Donnelly Centre for Cellular + Biomolecular Research, University of Toronto, 160 College St, Toronto, ON M5S 3E1, Canada
- The Woodlands Secondary School, 3225 Erindale Station Rd, Mississauga, ON L5C 1Y5, Canada
- Dillon Yee Lim
- The Woodlands Secondary School, 3225 Erindale Station Rd, Mississauga, ON L5C 1Y5, Canada
- Department of Physiology, Anatomy and Genetics, University of Oxford, Sherrington Building, Sherrington Road, Oxford, Oxfordshire, OX1 3PT, UK
- Jeroen P J Saeij
- The Woodlands Secondary School, 3225 Erindale Station Rd, Mississauga, ON L5C 1Y5, Canada
- Department of Pathology, Microbiology and Immunology, School of Veterinary Medicine, University of California, 1 Shields Ave, Davis, CA 95616, USA
- Artem Babaian
- Department of Molecular Genetics, University of Toronto, 1 King’s College Circle, Toronto, ON M5S 1A8, Canada
- The Donnelly Centre for Cellular + Biomolecular Research, University of Toronto, 160 College St, Toronto, ON M5S 3E1, Canada
- The Woodlands Secondary School, 3225 Erindale Station Rd, Mississauga, ON L5C 1Y5, Canada
- Felipe Rodriguez
- The Woodlands Secondary School, 3225 Erindale Station Rd, Mississauga, ON L5C 1Y5, Canada
- Department of Pathology, Microbiology and Immunology, School of Veterinary Medicine, University of California, 1 Shields Ave, Davis, CA 95616, USA
- Luke Pereira
- Department of Molecular Genetics, University of Toronto, 1 King’s College Circle, Toronto, ON M5S 1A8, Canada
- The Donnelly Centre for Cellular + Biomolecular Research, University of Toronto, 160 College St, Toronto, ON M5S 3E1, Canada
- The Woodlands Secondary School, 3225 Erindale Station Rd, Mississauga, ON L5C 1Y5, Canada
- Alejandro Morales-Tapia
- Department of Molecular Genetics, University of Toronto, 1 King’s College Circle, Toronto, ON M5S 1A8, Canada
- The Donnelly Centre for Cellular + Biomolecular Research, University of Toronto, 160 College St, Toronto, ON M5S 3E1, Canada
- The Woodlands Secondary School, 3225 Erindale Station Rd, Mississauga, ON L5C 1Y5, Canada
6
Luebbert L, Sullivan DK, Carilli M, Hjörleifsson KE, Winnett AV, Chari T, Pachter L. Efficient and accurate detection of viral sequences at single-cell resolution reveals putative novel viruses perturbing host gene expression. bioRxiv 2024:2023.12.11.571168. [PMID: 38168363 PMCID: PMC10760059 DOI: 10.1101/2023.12.11.571168] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/05/2024]
Abstract
There are an estimated 300,000 mammalian viruses from which infectious diseases in humans may arise. They inhabit human tissues such as the lungs, blood, and brain and often remain undetected. Efficient and accurate detection of viral infection is vital to understanding its impact on human health and to make accurate predictions to limit adverse effects, such as future epidemics. The increasing use of high-throughput sequencing methods in research, agriculture, and healthcare provides an opportunity for the cost-effective surveillance of viral diversity and investigation of virus-disease correlation. However, existing methods for identifying viruses in sequencing data rely on and are limited to reference genomes or cannot retain single-cell resolution through cell barcode tracking. We introduce a method that accurately and rapidly detects viral sequences in bulk and single-cell transcriptomics data based on highly conserved amino acid domains, which enables the detection of RNA viruses covering up to 10^12 virus species. The analysis of viral presence and host gene expression in parallel at single-cell resolution allows for the characterization of host viromes and the identification of viral tropism and host responses. We applied our method to identify putative novel viruses in rhesus macaque PBMC data that display cell type specificity and whose presence correlates with altered host gene expression.
Affiliation(s)
- Laura Luebbert
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California
- Delaney K. Sullivan
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California
- UCLA-Caltech Medical Scientist Training Program, David Geffen School of Medicine, University of California, Los Angeles, California
- Maria Carilli
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California
- Alexander Viloria Winnett
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California
- UCLA-Caltech Medical Scientist Training Program, David Geffen School of Medicine, University of California, Los Angeles, California
- Tara Chari
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California
- Lior Pachter
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California
- Department of Computing and Mathematical Sciences, California Institute of Technology, Pasadena, California
7
Robinson CRP, Dolezal AG, Newton ILG. Host species and geography impact bee-associated RNA virus communities with evidence for isolation by distance in viral populations. ISME Communications 2024; 4:ycad003. [PMID: 38304079 PMCID: PMC10833078 DOI: 10.1093/ismeco/ycad003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/03/2023] [Revised: 11/09/2023] [Accepted: 11/15/2023] [Indexed: 02/03/2024]
Abstract
Virus symbionts are important mediators of ecosystem function, yet we know little of their diversity and ecology in natural populations. The alarming decline of pollinating insects in many regions of the globe, especially the European honey bee, Apis mellifera, has been driven in part by worldwide transmission of virus pathogens. Previous work has examined the transmission of known honey bee virus pathogens to wild bee populations, but only a handful of studies have investigated the native viromes associated with wild bees, limiting epidemiological predictors associated with viral pathogenesis. Further, variation among different bee species might have important consequences in the acquisition and maintenance of bee-associated virome diversity. We utilized comparative metatranscriptomics to develop a baseline description of the RNA viromes associated with wild bee pollinators and to document viral diversity, community composition, and structure. Our sampling includes five wild-caught, native bee species that vary in social behavior as well as managed honey bees. We describe 26 putatively new RNA virus species based on RNA-dependent RNA polymerase phylogeny and show that each sampled bee species was associated with a specific virus community composition, even among sympatric populations of distinct host species. From 17 samples of a single host species, we recovered a single virus species despite over 600 km of distance between host populations and found strong evidence for isolation by distance in associated viral populations. Our work adds to the small number of studies examining viral prevalence and community composition in wild bees.
Affiliation(s)
- Chris R P Robinson
- Department of Biology, Indiana University, Bloomington, IN 47405, United States
- Adam G Dolezal
- Department of Entomology, University of Illinois Urbana-Champaign, Urbana, IL 61801, United States
- Irene L G Newton
- Department of Biology, Indiana University, Bloomington, IN 47405, United States
8
Edgar R. Known phyla dominate the Tara Oceans RNA virome. Virus Evol 2023; 9:vead063. [PMID: 38028147 PMCID: PMC10649353 DOI: 10.1093/ve/vead063] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/23/2023] [Revised: 10/03/2023] [Accepted: 10/20/2023] [Indexed: 12/01/2023] Open
Abstract
A recent study proposed five new RNA virus phyla, two of which, 'Taraviricota' and 'Arctiviricota', were stated to be 'dominant in the oceans'. However, the study's assignments classify 28,353 putative RdRp-containing contigs to known phyla but only 886 (2.8%) to the five proposed new phyla combined. I re-mapped the reads to the contigs, finding that known phyla also account for a large majority (93.8%) of reads according to the study's classifications, and that contigs originally assigned to 'Arctiviricota' accounted for only a tiny fraction (0.01%) of reads from Arctic Ocean samples. Performing my own virus identification and classifications, I found that 99.95 per cent of reads could be assigned to known phyla. The most abundant species was Beihai picorna-like virus 34 (15% of reads), and the most abundant order-like cluster was classified as Picornavirales (45% of reads). Sequences in the claimed new phylum 'Pomiviricota' were placed inside a phylogenetic tree for established order Durnavirales with 100 per cent confidence. Moreover, two contigs assigned to the proposed phylum 'Taraviricota' were found to have high-identity alignments to dinoflagellate proteins, tentatively identifying this group of RdRp-like sequences as deriving from non-viral transcripts. Together, these results comprehensively contradict the claim that new phyla dominate the data.
9
Petrone ME, Parry R, Mifsud JCO, Van Brussel K, Vorhees I, Richards ZT, Holmes EC. Evidence for an ancient aquatic origin of the RNA viral order Articulavirales. Proc Natl Acad Sci U S A 2023; 120:e2310529120. [PMID: 37906647 PMCID: PMC10636315 DOI: 10.1073/pnas.2310529120] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Received: 06/22/2023] [Accepted: 10/03/2023] [Indexed: 11/02/2023] Open
Abstract
The emergence of previously unknown disease-causing viruses in mammals is in part the result of a long-term evolutionary process. Reconstructing the deep phylogenetic histories of viruses helps identify major evolutionary transitions and contextualizes the emergence of viruses in new hosts. We used a combination of total RNA sequencing and transcriptome data mining to extend the diversity and evolutionary history of the RNA virus order Articulavirales, which includes the influenza viruses. We identified instances of Articulavirales in the invertebrate phylum Cnidaria (including corals), constituting a novel and divergent family that we provisionally named the "Cnidenomoviridae." We further extended the evolutionary history of the influenza virus lineage by identifying four divergent, fish-associated influenza-like viruses, thereby supporting the hypothesis that fish were among the first hosts of influenza viruses. In addition, we substantially expanded the phylogenetic diversity of quaranjaviruses and proposed that this genus be reclassified as a family, the "Quaranjaviridae." Within this putative family, we identified a novel arachnid-infecting genus, provisionally named "Cheliceravirus." Notably, we observed a close phylogenetic relationship between the Crustacea- and Chelicerata-infecting "Quaranjaviridae" that is inconsistent with virus-host codivergence. Together, these data suggest that the Articulavirales has evolved over at least 600 million years, first emerging in aquatic animals. Importantly, the evolution of the Articulavirales was likely shaped by multiple aquatic-terrestrial transitions and substantial host jumps, some of which are still observable today.
Affiliation(s)
- Mary E. Petrone
- Sydney Institute for Infectious Diseases, School of Medical Sciences, The University of Sydney, Sydney, NSW 2006, Australia
- Laboratory of Data Discovery for Health Limited, Hong Kong Special Administrative Region, China
| | - Rhys Parry
- School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane, QLD 4067, Australia
| | - Jonathon C. O. Mifsud
- Sydney Institute for Infectious Diseases, School of Medical Sciences, The University of Sydney, Sydney, NSW 2006, Australia
| | - Kate Van Brussel
- Sydney Institute for Infectious Diseases, School of Medical Sciences, The University of Sydney, Sydney, NSW 2006, Australia
| | - Ian Vorhees
- James A. Baker Institute for Animal Health, Department of Microbiology and Immunology, College of Veterinary Medicine, Cornell University, Ithaca, NY 14850
| | - Zoe T. Richards
- Coral Conservation and Research Group, Trace and Environmental DNA Laboratory, School of Molecular and Life Sciences, Curtin University, Perth, WA 6102, Australia
- Collections and Research, Western Australian Museum, Welshpool, WA 6106, Australia
| | - Edward C. Holmes
- Sydney Institute for Infectious Diseases, School of Medical Sciences, The University of Sydney, Sydney, NSW 2006, Australia
- Laboratory of Data Discovery for Health Limited, Hong Kong Special Administrative Region, China
| |
|
10
|
Le Lay C, Hamm JN, Williams TJ, Shi M, Cavicchioli R, Holmes EC. Viral community composition of hypersaline lakes. Virus Evol 2023; 9:vead057. [PMID: 37692898 PMCID: PMC10492444 DOI: 10.1093/ve/vead057] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 04/25/2023] [Revised: 08/03/2023] [Accepted: 08/29/2023] [Indexed: 09/12/2023] Open
Abstract
Despite their widespread distribution and remarkable antiquity, no RNA viruses definitively associated with the domain Archaea have been identified. In contrast, 17 families of DNA viruses are known to infect archaea. In an attempt to uncover more of the elusive archaeal virosphere, we investigated the metatranscriptomes of hypersaline lakes that are a rich source of archaea. We sequenced RNA extracted from water filter samples of Lake Tyrrell (Victoria, Australia) and cultures seeded from four lakes in Antarctica. To identify highly divergent viruses in these data, we employed a variety of search tools, including Hidden Markov models (HMMs) and position-specific scoring matrices (PSSMs). From this, we identified 12 highly divergent, RNA virus-like candidate sequences from the virus phyla Artverviricota, Duplornaviricota, Kitrinoviricota, Negarnaviricota, and Pisuviricota, including those with similarity to the RNA-dependent RNA polymerase (RdRp). An additional analysis with an artificial intelligence (AI)-based approach that utilises both sequence and structural information identified seven putative and highly divergent RdRp sequences of uncertain phylogenetic position. A sequence matching the Pisuviricota from Deep Lake in Antarctica had the strongest RNA virus signal. Analyses of the dinucleotide representation of the virus-like candidates in comparison to that of potential host species were in some cases compatible with an association to archaeal or bacterial hosts. Notably, however, the use of archaeal CRISPR spacers as a BLAST database failed to detect any RNA viruses. We also described DNA viruses from the families Pleolipoviridae, Sphaerolipoviridae, Halspiviridae, and the class Caudoviricetes. Although we were unable to provide definitive evidence for the existence of an RNA virus of archaea in these hypersaline lakes, this study lays the foundations for further investigations of highly divergent RNA viruses in natural environments.
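Illustrative note: the abstract does not say how dinucleotide representation was computed. One common formulation, assumed here, is the odds ratio rho_xy = f_xy / (f_x * f_y), where f_xy is the observed dinucleotide frequency and f_x, f_y are the mononucleotide frequencies. The sketch below computes that profile for a sequence and a simple mean absolute difference between two profiles. The function names `dinucleotide_odds_ratios` and `profile_distance` are hypothetical and are not taken from the cited study.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"


def dinucleotide_odds_ratios(seq):
    """Return {dinucleotide: f_xy / (f_x * f_y)} for one sequence.

    Assumed formulation (not confirmed by the abstract). RNA 'U' is
    treated as 'T'; positions with ambiguous bases are ignored.
    """
    seq = seq.upper().replace("U", "T")
    mono = Counter(b for b in seq if b in BASES)
    pairs = (seq[i:i + 2] for i in range(len(seq) - 1))
    di = Counter(p for p in pairs if set(p) <= set(BASES))
    n_mono, n_di = sum(mono.values()), sum(di.values())
    if n_di == 0:
        raise ValueError("sequence has no unambiguous dinucleotides")
    ratios = {}
    for x, y in product(BASES, repeat=2):
        f_x, f_y = mono[x] / n_mono, mono[y] / n_mono
        f_xy = di[x + y] / n_di
        # Undefined when a base never occurs in the sequence.
        ratios[x + y] = f_xy / (f_x * f_y) if f_x and f_y else float("nan")
    return ratios


def profile_distance(a, b):
    """Mean absolute difference over dinucleotides defined in both profiles."""
    keys = [k for k in a if a[k] == a[k] and b[k] == b[k]]  # NaN != NaN
    if not keys:
        raise ValueError("profiles share no defined dinucleotides")
    return sum(abs(a[k] - b[k]) for k in keys) / len(keys)


# Toy sequences only; real comparisons would use full contigs and host genomes.
candidate = dinucleotide_odds_ratios("ACGUACGUACGU")
host = dinucleotide_odds_ratios("AACCGGTTAACCGGTT")
distance = profile_distance(candidate, host)
```

A smaller distance would indicate more similar dinucleotide usage, which is at best suggestive of a host association, in line with the cautious wording of the abstract.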
Affiliation(s)
- Callum Le Lay
- Sydney Institute for Infectious Diseases, School of Medical Sciences, The University of Sydney, Sydney, NSW 2006, Australia
| | | | - Timothy J Williams
- School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, NSW 2052, Australia
| | - Mang Shi
- State Key Laboratory for Biocontrol, School of Medicine, Shenzhen Campus of Sun Yat-sen University, Sun Yat-sen University, Shenzhen, China
| | - Ricardo Cavicchioli
- School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, NSW 2052, Australia
| | - Edward C Holmes
- Sydney Institute for Infectious Diseases, School of Medical Sciences, The University of Sydney, Sydney, NSW 2006, Australia
- Department of Marine Microbiology and Biogeochemistry, Royal Netherlands Institute for Sea Research, P.O. Box 59, Den Burg NL-1790 AB, The Netherlands
| |
|
11
|
Urayama SI, Fukudome A, Hirai M, Okumura T, Nishimura Y, Takaki Y, Kurosawa N, Koonin EV, Krupovic M, Nunoura T. Distinct groups of RNA viruses associated with thermoacidophilic bacteria. bioRxiv 2023:2023.07.02.547447. [PMID: 37790367 PMCID: PMC10542131 DOI: 10.1101/2023.07.02.547447] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/05/2023]
Abstract
Recent massive metatranscriptome mining substantially expanded the diversity of the bacterial RNA virome, suggesting that additional groups of riboviruses infecting bacterial hosts remain to be discovered. We employed full-length double-stranded (ds) RNA sequencing for identification of riboviruses associated with microbial consortia dominated by bacteria and archaea in acidic hot springs in Japan. Whole sequences of two groups of multisegmented ribovirus genomes were obtained. One group, which we denoted hot spring riboviruses (HsRV), consists of unusual viruses with distinct RNA-dependent RNA polymerases (RdRPs) that seem to be intermediates between typical ribovirus RdRPs and viral reverse transcriptases. We also identified viruses encoding HsRV-like RdRPs in moderate aquatic environments, including marine water, river sediments and salt marsh, indicating that this previously overlooked ribovirus group is not restricted to the extreme ecosystem. The HsRV-like viruses are candidates for a distinct phylum or even kingdom within the viral realm Riboviria. The second group, denoted hot spring partiti-like viruses (HsPV), is a distinct branch within the family Partitiviridae. All genome segments in both these groups of viruses display the organization typical of bacterial riboviruses, where multiple open reading frames encoding individual proteins are preceded by ribosome-binding sites. Together with the identification in bacteria-dominated habitats, this genome architecture indicates that riboviruses of these distinct groups infect thermoacidophilic bacterial hosts.
Affiliation(s)
- Syun-ichi Urayama
- Department of Life and Environmental Sciences, Laboratory of Fungal Interaction and Molecular Biology, University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki 305-8577, Japan
- Microbiology Research Center for Sustainability (MiCS), University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki 305-8577, Japan
| | - Akihito Fukudome
- Howard Hughes Medical Institute, Department of Biology and Department of Molecular and Cellular Biochemistry, Indiana University, Bloomington, IN, USA
| | - Miho Hirai
- Super-cutting-edge Grand and Advanced Research (SUGAR) Program, Japan Agency for Marine Science and Technology (JAMSTEC), 2-15 Natsushima-cho, Yokosuka, Kanagawa 237-0061, Japan
| | - Tomoyo Okumura
- Marine Core Research Institute, Kochi University, 200 Otsu, Monobe, Nankoku City, Kochi, 783-8502, Japan
| | - Yosuke Nishimura
- Research Center for Bioscience and Nanoscience (CeBN), JAMSTEC, 2-15 Natsushima-cho, Yokosuka, Kanagawa 237-0061, Japan
| | - Yoshihiro Takaki
- Super-cutting-edge Grand and Advanced Research (SUGAR) Program, Japan Agency for Marine Science and Technology (JAMSTEC), 2-15 Natsushima-cho, Yokosuka, Kanagawa 237-0061, Japan
| | - Norio Kurosawa
- Department of Science and Engineering for Sustainable Innovation, Faculty of Science and Engineering, Soka University, Hachioji 192-8577, Japan
| | - Eugene V. Koonin
- National Center for Biotechnology Information, National Library of Medicine, Bethesda, MD, USA
| | - Mart Krupovic
- Institut Pasteur, Université Paris Cité, CNRS UMR6047, Archaeal Virology Unit, Paris, France
| | - Takuro Nunoura
- Research Center for Bioscience and Nanoscience (CeBN), JAMSTEC, 2-15 Natsushima-cho, Yokosuka, Kanagawa 237-0061, Japan
| |
|
12
|
Murad T, Ali S, Patterson M. Exploring the Potential of GANs in Biological Sequence Analysis. Biology 2023; 12:854. [PMID: 37372139 DOI: 10.3390/biology12060854] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/29/2023] [Revised: 06/03/2023] [Accepted: 06/12/2023] [Indexed: 06/29/2023]
Abstract
Biological sequence analysis is an essential step toward building a deeper understanding of the underlying functions, structures, and behaviors of the sequences. It can help in identifying the characteristics of the associated organisms, such as viruses, and in building prevention mechanisms to limit their spread and impact, as viruses are known to cause epidemics that can become global pandemics. Machine learning (ML) technologies provide new tools for biological sequence analysis that effectively analyze the functions and structures of the sequences. However, these ML-based methods face challenges with data imbalance, which is generally associated with biological sequence datasets and hinders their performance. Various strategies exist to address this issue, such as the SMOTE algorithm, which creates synthetic data, but they focus on local information rather than the overall class distribution. In this work, we explore a novel approach to handle the data imbalance issue based on generative adversarial networks (GANs), which use the overall data distribution. GANs are utilized to generate synthetic data that closely resemble real data; these generated data can then be employed to enhance the ML models' performance by eliminating the class imbalance problem for biological sequence analysis. We perform four distinct classification tasks by using four different sequence datasets (Influenza A Virus, PALMdb, VDjDB, Host) and our results illustrate that GANs can improve the overall classification performance.
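Illustrative note: the remark that SMOTE relies on local information can be made concrete with a minimal SMOTE-style sketch. Each synthetic point is an interpolation between one minority sample and one of its k nearest minority neighbours, so it never leaves the neighbourhood of existing samples. This is a simplified illustration under stated assumptions, not the authors' code and not a library implementation; the function name `smote_like` and its parameters are hypothetical.

```python
import random
from typing import List, Sequence

Vector = Sequence[float]


def smote_like(minority: List[Vector], n_new: int, k: int = 2,
               seed: int = 0) -> List[List[float]]:
    """Generate n_new synthetic minority points by local interpolation.

    Simplified sketch: brute-force neighbour search, Euclidean distance,
    and a seeded random generator so runs are repeatable.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)

    def sqdist(a: Vector, b: Vector) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # The k nearest minority neighbours of the chosen sample.
        others = (j for j in range(len(minority)) if j != i)
        nearest = sorted(others, key=lambda j: sqdist(base, minority[j]))[:k]
        neighbour = minority[rng.choice(nearest)]
        t = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + t * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy feature vectors; real inputs would be numeric encodings of sequences.
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
new_points = smote_like(minority, n_new=6, k=2)
```

Because every new point lies on a segment between two nearby samples, the generator only sees a small neighbourhood at a time. A GAN, by contrast, is trained on the whole class, which is the distinction the abstract draws.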
Affiliation(s)
- Taslim Murad
- Department of Computer Science, Georgia State University, Atlanta, GA 30302, USA
| | - Sarwan Ali
- Department of Computer Science, Georgia State University, Atlanta, GA 30302, USA
| | - Murray Patterson
- Department of Computer Science, Georgia State University, Atlanta, GA 30302, USA
| |
|
13
|
Forgia M, Navarro B, Daghino S, Cervera A, Gisel A, Perotto S, Aghayeva DN, Akinyuwa MF, Gobbi E, Zheludev IN, Edgar RC, Chikhi R, Turina M, Babaian A, Di Serio F, de la Peña M. Hybrids of RNA viruses and viroid-like elements replicate in fungi. Nat Commun 2023; 14:2591. [PMID: 37147358 PMCID: PMC10162972 DOI: 10.1038/s41467-023-38301-2] [Citation(s) in RCA: 25] [Impact Index Per Article: 25.0] [Received: 03/24/2023] [Accepted: 04/25/2023] [Indexed: 05/07/2023] Open
Abstract
Earth's life may have originated as self-replicating RNA, and it has been argued that RNA viruses and viroid-like elements are remnants of such a pre-cellular RNA world. RNA viruses are defined by linear RNA genomes encoding an RNA-dependent RNA polymerase (RdRp), whereas viroid-like elements consist of small, single-stranded, circular RNA genomes that, in some cases, encode paired self-cleaving ribozymes. Here we show that the number of candidate viroid-like elements occurring in geographically and ecologically diverse niches is much higher than previously thought. We report that, amongst these circular genomes, fungal ambiviruses are viroid-like elements that undergo rolling circle replication and encode their own viral RdRp. Thus, ambiviruses are distinct infectious RNAs showing hybrid features of viroid-like RNAs and viruses. We also detected similar circular RNAs, containing active ribozymes and encoding RdRps, related to mitochondrial-like fungal viruses, highlighting fungi as an evolutionary hub for RNA viruses and viroid-like elements. Our findings point to a deep co-evolutionary history between RNA viruses and subviral elements and offer new perspectives on the origin and evolution of primordial infectious agents, and RNA life.
Affiliation(s)
- Marco Forgia
- Institute for Sustainable Plant Protection, National Research Council of Italy, Torino, Italy
| | - Beatriz Navarro
- Institute for Sustainable Plant Protection, National Research Council of Italy, Bari, Italy
| | - Stefania Daghino
- Institute for Sustainable Plant Protection, National Research Council of Italy, Torino, Italy
| | - Amelia Cervera
- Instituto de Biología Molecular y Celular de Plantas, Universidad Politécnica de Valencia-CSIC, Valencia, Spain
| | - Andreas Gisel
- Institute of Biomedical Technologies, National Research Council of Italy, Bari, Italy
- International Institute of Tropical Agriculture, Ibadan, Nigeria
| | - Silvia Perotto
- Department of Life Science and Systems Biology, University of Torino, Torino, Italy
| | - Dilzara N Aghayeva
- Institute of Botany, Ministry of Science and Education of the Republic of Azerbaijan, Baku, Azerbaijan
| | - Mary F Akinyuwa
- Department of Agroforestry Ecosystems, Universidad Politécnica de Valencia, Valencia, Spain
- Department of Land, Environment Agriculture and Forestry, Università Degli Studi di Padova, Padova, Italy
- Department of Entomology and Plant Pathology, Auburn University, Auburn, AL, USA
| | - Emanuela Gobbi
- Department of Molecular and Translational Medicine, University of Brescia, Brescia, Italy
| | - Ivan N Zheludev
- Department of Biochemistry, Stanford University, Stanford, CA, USA
| | | | - Rayan Chikhi
- G5 Sequence Bioinformatics, Department of Computational Biology, Institut Pasteur, Paris, France
| | - Massimo Turina
- Institute for Sustainable Plant Protection, National Research Council of Italy, Brescia, Italy
| | - Artem Babaian
- Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
- Terrence Donnelly Centre for Cellular & Biomolecular Research, University of Toronto, Toronto, ON, Canada
| | - Francesco Di Serio
- Institute for Sustainable Plant Protection, National Research Council of Italy, Bari, Italy
| | - Marcos de la Peña
- Instituto de Biología Molecular y Celular de Plantas, Universidad Politécnica de Valencia-CSIC, Valencia, Spain
| |
|
14
|
Olendraite I, Brown K, Firth AE. Identification of RNA Virus-Derived RdRp Sequences in Publicly Available Transcriptomic Data Sets. Mol Biol Evol 2023; 40:msad060. [PMID: 37014783 PMCID: PMC10101049 DOI: 10.1093/molbev/msad060] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Received: 11/04/2022] [Revised: 01/15/2023] [Accepted: 03/08/2023] [Indexed: 04/05/2023] Open
Abstract
RNA viruses are abundant and highly diverse and infect all or most eukaryotic organisms. However, only a tiny fraction of the number and diversity of RNA virus species have been catalogued. To cost-effectively expand the diversity of known RNA virus sequences, we mined publicly available transcriptomic data sets. We developed 77 family-level Hidden Markov Model profiles for the viral RNA-dependent RNA polymerase (RdRp), the only universal "hallmark" gene of RNA viruses. By using these to search the National Center for Biotechnology Information Transcriptome Shotgun Assembly database, we identified 5,867 contigs encoding RNA virus RdRps or fragments thereof and analyzed their diversity, taxonomic classification, phylogeny, and host associations. Our study expands the known diversity of RNA viruses, and the 77 curated RdRp Profile Hidden Markov Models provide a useful resource for the virus discovery community.
Affiliation(s)
- Ingrida Olendraite
- Division of Virology, Department of Pathology, Addenbrookes Hospital, University of Cambridge, Cambridge, United Kingdom
| | - Katherine Brown
- Division of Virology, Department of Pathology, Addenbrookes Hospital, University of Cambridge, Cambridge, United Kingdom
| | - Andrew E Firth
- Division of Virology, Department of Pathology, Addenbrookes Hospital, University of Cambridge, Cambridge, United Kingdom
| |
|
15
|
Mifsud JCO, Costa VA, Petrone ME, Marzinelli EM, Holmes EC, Harvey E. Transcriptome mining extends the host range of the Flaviviridae to non-bilaterians. Virus Evol 2022; 9:veac124. [PMID: 36694816 PMCID: PMC9854234 DOI: 10.1093/ve/veac124] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 11/25/2022] [Revised: 12/20/2022] [Accepted: 12/26/2022] [Indexed: 12/27/2022] Open
Abstract
The flavivirids (family Flaviviridae) are a group of positive-sense RNA viruses that include well-documented agents of human disease. Despite their importance and ubiquity, the timescale of flavivirid evolution is uncertain. An ancient origin, spanning millions of years, is supported by their presence in both vertebrates and invertebrates and by the identification of a flavivirus-derived endogenous viral element in the peach blossom jellyfish genome (Craspedacusta sowerbii, phylum Cnidaria), implying that the flaviviruses arose early in the evolution of the Metazoa. To date, however, no exogenous flavivirid sequences have been identified in these hosts. To help resolve the antiquity of the Flaviviridae, we mined publicly available transcriptome data across the Metazoa. From this, we expanded the diversity within the family through the identification of 32 novel viral sequences and extended the host range of the pestiviruses to include amphibians, reptiles, and ray-finned fish. Through co-phylogenetic analysis we found cross-species transmission to be the predominate macroevolutionary event across the non-vectored flavivirid genera (median, 68 per cent), including a cross-species transmission event between bats and rodents, although long-term virus-host co-divergence was still a regular occurrence (median, 23 per cent). Notably, we discovered flavivirus-like sequences in basal metazoan species, including the first associated with Cnidaria. This sequence formed a basal lineage to the genus Flavivirus and was closer to arthropod and crustacean flaviviruses than those in the tamanavirus group, which includes a variety of invertebrate and vertebrate viruses. Combined, these data attest to an ancient origin of the flaviviruses, likely close to the emergence of the metazoans 750-800 million years ago.
Affiliation(s)
- Jonathon C O Mifsud
- Sydney Institute for Infectious Diseases, School of Medical Sciences, The University of Sydney, Sydney NSW 2006, Australia
| | - Vincenzo A Costa
- Sydney Institute for Infectious Diseases, School of Medical Sciences, The University of Sydney, Sydney NSW 2006, Australia
| | - Mary E Petrone
- Sydney Institute for Infectious Diseases, School of Medical Sciences, The University of Sydney, Sydney NSW 2006, Australia
| | - Ezequiel M Marzinelli
- School of Life and Environmental Sciences, The University of Sydney, Sydney NSW 2006, Australia
- Sydney Institute of Marine Science, 19 Chowder Bay Rd, Mosman, NSW 2088, Australia
- Singapore Centre for Environmental Life Sciences Engineering, Nanyang Technological University, Singapore 637551 Singapore
| | - Edward C Holmes
- Sydney Institute for Infectious Diseases, School of Medical Sciences, The University of Sydney, Sydney NSW 2006, Australia
| | - Erin Harvey
- Sydney Institute for Infectious Diseases, School of Medical Sciences, The University of Sydney, Sydney NSW 2006, Australia
| |
|
16
|
Edgar RC. Muscle5: High-accuracy alignment ensembles enable unbiased assessments of sequence homology and phylogeny. Nat Commun 2022; 13:6968. [PMID: 36379955 PMCID: PMC9664440 DOI: 10.1038/s41467-022-34630-w] [Citation(s) in RCA: 148] [Impact Index Per Article: 74.0] [Received: 02/10/2022] [Accepted: 11/01/2022] [Indexed: 11/16/2022] Open
Abstract
Multiple sequence alignments are widely used to infer evolutionary relationships, enabling inferences of structure, function, and phylogeny. Standard practice is to construct one alignment by some preferred method and use it in further analysis; however, undetected alignment bias can be problematic. I describe Muscle5, a novel algorithm which constructs an ensemble of high-accuracy alignments with diverse biases by perturbing a hidden Markov model and permuting its guide tree. Confidence in an inference is assessed as the fraction of the ensemble which supports it. Applied to phylogenetic tree estimation, I show that ensembles can confidently resolve topologies with low bootstrap according to standard methods, and conversely that some topologies with high bootstraps are incorrect. Applied to the phylogeny of RNA viruses, ensemble analysis shows that recently adopted taxonomic phyla are probably polyphyletic. Ensemble analysis can improve confidence assessment in any inference from an alignment.
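Illustrative note: the confidence measure described in the abstract, the fraction of the ensemble that supports an inference, can be sketched in a few lines. The example assumes each ensemble member has already been turned into a tree and that each tree is represented as a set of clades (frozensets of taxon names). That representation and the function name `ensemble_support` are assumptions made for illustration; they are not part of Muscle5 or its interface.

```python
from typing import FrozenSet, Iterable, List

Clade = FrozenSet[str]


def ensemble_support(clade: Clade, ensemble: List[Iterable[Clade]]) -> float:
    """Fraction of trees in the ensemble that contain the given clade."""
    if not ensemble:
        raise ValueError("ensemble must contain at least one tree")
    hits = sum(1 for tree in ensemble if clade in set(tree))
    return hits / len(ensemble)


# Toy ensemble: four trees, one per perturbed alignment (hypothetical data).
trees = [
    {frozenset({"A", "B"}), frozenset({"A", "B", "C"})},
    {frozenset({"A", "B"}), frozenset({"C", "D"})},
    {frozenset({"A", "C"}), frozenset({"A", "B", "C"})},
    {frozenset({"A", "B"}), frozenset({"C", "D"})},
]
support_ab = ensemble_support(frozenset({"A", "B"}), trees)
```

In this toy ensemble the clade {A, B} appears in three of the four trees, so its support is 0.75. Unlike a bootstrap, which resamples columns of a single alignment, this value reflects variation across alternative alignments.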
|
17
|
Cabrera Mederos D, Debat H, Torres C, Portal O, Jaramillo Zapata M, Trucco V, Flores C, Ortiz C, Badaracco A, Acuña L, Nome C, Quito-Avila D, Bejerman N, Castellanos Collazo O, Sánchez-Rodríguez A, Giolitti F. An Unwanted Association: The Threat to Papaya Crops by a Novel Potexvirus in Northwest Argentina. Viruses 2022; 14:2297. [PMID: 36298852 PMCID: PMC9610017 DOI: 10.3390/v14102297] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 09/15/2022] [Revised: 10/10/2022] [Accepted: 10/15/2022] [Indexed: 10/12/2024] Open
Abstract
An emerging virus isolated from papaya (Carica papaya) crops in northwestern (NW) Argentina was sequenced and characterized using next-generation sequencing. The resulting genome is 6667-nt long and encodes five open reading frames in an arrangement typical of other potexviruses. This virus appears to be a novel member within the genus Potexvirus. Blast analysis of RNA-dependent RNA polymerase (RdRp) and coat protein (CP) genes showed the highest amino acid sequence identity (67% and 71%, respectively) with pitaya virus X. Based on nucleotide sequence similarity and phylogenetic analysis, the name papaya virus X is proposed for this newly characterized potexvirus that was mechanically transmitted to papaya plants causing chlorotic patches and severe mosaic symptoms. Papaya virus X (PapVX) was found only in the NW region of Argentina. This prevalence could be associated with a recent emergence or adaptation of this virus to papaya in NW Argentina.
Affiliation(s)
- Dariel Cabrera Mederos
- Unidad de Fitopatología y Modelización Agrícola, Consejo Nacional de Investigaciones Científicas y Técnicas, Córdoba X5020ICA, Argentina
- Instituto de Patología Vegetal “Ing. Agr. Sergio Fernando Nome”, Instituto Nacional de Tecnología Agropecuaria, Córdoba X5020ICA, Argentina
| | - Humberto Debat
- Unidad de Fitopatología y Modelización Agrícola, Consejo Nacional de Investigaciones Científicas y Técnicas, Córdoba X5020ICA, Argentina
- Instituto de Patología Vegetal “Ing. Agr. Sergio Fernando Nome”, Instituto Nacional de Tecnología Agropecuaria, Córdoba X5020ICA, Argentina
| | - Carolina Torres
- Facultad de Farmacia y Bioquímica, Instituto de Investigaciones en Bacteriología y Virología Molecular, Universidad de Buenos Aires, Buenos Aires C1425FBQ, Argentina
- Consejo Nacional de Investigaciones Científicas y Técnicas, Buenos Aires C1425FBQ, Argentina
| | - Orelvis Portal
- Departamento de Biología, Facultad de Ciencias Agropecuarias, Universidad Central “Marta Abreu” de Las Villas, Santa Clara 54830, Cuba
- Centro de Investigaciones Agropecuarias, Facultad de Ciencias Agropecuarias, Universidad Central “Marta Abreu” de Las Villas, Santa Clara 54830, Cuba
| | | | - Verónica Trucco
- Unidad de Fitopatología y Modelización Agrícola, Consejo Nacional de Investigaciones Científicas y Técnicas, Córdoba X5020ICA, Argentina
- Instituto de Patología Vegetal “Ing. Agr. Sergio Fernando Nome”, Instituto Nacional de Tecnología Agropecuaria, Córdoba X5020ICA, Argentina
| | - Ceferino Flores
- Estación Experimental Agropecuaria Yuto, Instituto Nacional de Tecnología Agropecuaria, Jujuy Y4518, Argentina
| | - Claudio Ortiz
- Estación Experimental Agropecuaria Yuto, Instituto Nacional de Tecnología Agropecuaria, Jujuy Y4518, Argentina
| | - Alejandra Badaracco
- Estación Experimental Agropecuaria Montecarlo, Instituto Nacional de Tecnología Agropecuaria, Misiones N3384, Argentina
| | - Luis Acuña
- Estación Experimental Agropecuaria Montecarlo, Instituto Nacional de Tecnología Agropecuaria, Misiones N3384, Argentina
| | - Claudia Nome
- Unidad de Fitopatología y Modelización Agrícola, Consejo Nacional de Investigaciones Científicas y Técnicas, Córdoba X5020ICA, Argentina
- Instituto de Patología Vegetal “Ing. Agr. Sergio Fernando Nome”, Instituto Nacional de Tecnología Agropecuaria, Córdoba X5020ICA, Argentina
| | - Diego Quito-Avila
- Centro de Investigaciones Biotecnológicas del Ecuador, Escuela Superior Politécnica del Litoral, Guayaquil 090112, Ecuador
| | - Nicolas Bejerman
- Unidad de Fitopatología y Modelización Agrícola, Consejo Nacional de Investigaciones Científicas y Técnicas, Córdoba X5020ICA, Argentina
- Instituto de Patología Vegetal “Ing. Agr. Sergio Fernando Nome”, Instituto Nacional de Tecnología Agropecuaria, Córdoba X5020ICA, Argentina
| | - Onias Castellanos Collazo
- Instituto de Patología Vegetal “Ing. Agr. Sergio Fernando Nome”, Instituto Nacional de Tecnología Agropecuaria, Córdoba X5020ICA, Argentina
| | | | - Fabián Giolitti
- Unidad de Fitopatología y Modelización Agrícola, Consejo Nacional de Investigaciones Científicas y Técnicas, Córdoba X5020ICA, Argentina
- Instituto de Patología Vegetal “Ing. Agr. Sergio Fernando Nome”, Instituto Nacional de Tecnología Agropecuaria, Córdoba X5020ICA, Argentina
| |
|