1
McBroome J, de Bernardi Schneider A, Roemer C, Wolfinger MT, Hinrichs AS, O'Toole AN, Ruis C, Turakhia Y, Rambaut A, Corbett-Detig R. A framework for automated scalable designation of viral pathogen lineages from genomic data. Nat Microbiol 2024; 9:550-560. [PMID: 38316930 PMCID: PMC10847047 DOI: 10.1038/s41564-023-01587-5] [Received: 02/06/2023] [Accepted: 12/13/2023] [Indexed: 02/07/2024]
Abstract
Pathogen lineage nomenclature systems are a key component of effective communication and collaboration for researchers and public health workers. Since February 2021, the Pango dynamic lineage nomenclature for SARS-CoV-2 has been sustained by crowdsourced lineage proposals as new isolates were sequenced. This approach is vulnerable to time-critical delays as well as regional and personal bias. Here we developed a simple heuristic approach for dividing phylogenetic trees into lineages, including the prioritization of key mutations or genes. Our implementation is efficient on extremely large phylogenetic trees consisting of millions of sequences and produces similar results to existing manually curated lineage designations when applied to SARS-CoV-2 and other viruses including chikungunya virus, Venezuelan equine encephalitis virus complex and Zika virus. This method offers a simple, automated and consistent approach to pathogen nomenclature that can assist researchers in developing and maintaining phylogeny-based classifications in the face of ever-increasing genomic datasets.
Affiliation(s)
- Jakob McBroome
  - Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA, USA
  - Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
- Adriano de Bernardi Schneider
  - Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA, USA
  - Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
- Cornelius Roemer
  - Biozentrum, University of Basel, Basel, Switzerland
  - Swiss Institute of Bioinformatics, Basel, Switzerland
- Michael T Wolfinger
  - Department of Theoretical Chemistry, University of Vienna, Vienna, Austria
  - Research Group Bioinformatics and Computational Biology, Faculty of Computer Science, University of Vienna, Vienna, Austria
  - RNA Forecast e.U., Vienna, Austria
  - Bioinformatics Group, Department of Computer Science, University of Freiburg, Freiburg, Germany
- Angie S Hinrichs
  - Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
- Aine Niamh O'Toole
  - Institute of Ecology and Evolution, University of Edinburgh, Edinburgh, UK
- Christopher Ruis
  - Molecular Immunity Unit, MRC Laboratory of Molecular Biology, Department of Medicine, University of Cambridge, Cambridge, UK
  - Department of Veterinary Medicine, University of Cambridge, Cambridge, UK
  - Cambridge Centre for AI in Medicine, University of Cambridge, Cambridge, UK
- Yatish Turakhia
  - Department of Electrical and Computer Engineering, University of California San Diego, San Diego, CA, USA
- Andrew Rambaut
  - Institute of Ecology and Evolution, University of Edinburgh, Edinburgh, UK
- Russell Corbett-Detig
  - Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA, USA
  - Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
2
Phumee A, Chitcharoen S, Sutthanont N, Intayot P, Wacharapluesadee S, Siriyasatien P. Genetic diversity and phylogenetic analyses of Asian lineage Zika virus whole genome sequences derived from Culex quinquefasciatus mosquitoes and urine of patients during the 2020 epidemic in Thailand. Sci Rep 2023; 13:18470. [PMID: 37891235 PMCID: PMC10611781 DOI: 10.1038/s41598-023-45814-9] [Received: 09/14/2023] [Accepted: 10/24/2023] [Indexed: 10/29/2023]
Abstract
Zika virus (ZIKV), a mosquito-borne flavivirus, has been continually emerging and re-emerging since 2010, with sporadic cases reported annually in Thailand, peaking at over 1000 confirmed positive cases in 2016. High-throughput sequencing technologies, specifically whole genome sequencing (WGS), have facilitated rapid pathogen genome sequencing. In this study, we used multiplex amplicon sequencing on the Illumina MiSeq instrument to describe ZIKV whole genome sequences. Six ZIKV WGS were derived from three samples of field-caught Culex quinquefasciatus mosquitoes (two males and one female) and three urine samples collected from patients in three different provinces of Thailand. Additionally, a ZIKV isolate was successfully obtained from a female Cx. quinquefasciatus. The WGS analysis revealed a correlation between the 2020 outbreak and the acquisition of five amino acid changes in the Asian lineage ZIKV strains from Thailand (2006), Cambodia (2010 and 2019), and the Philippines (2012). These changes, including C-T106A, prM-V1A, E-V473M, NS1-A188V, and NS5-M872V, have previously been linked to significantly higher mortality rates and were identified in all seven WGS. Furthermore, phylogenetic analysis indicated that the seven ZIKV sequences belonged to the Asian lineage. Notably, the genomic region of the E gene showed the highest nucleotide diversity (0.7-1.3%). These data can inform the development of molecular tools that enhance our understanding of virus patterns and evolution. Moreover, they may identify targets for improved methods to prevent and control future ZIKV outbreaks.
Affiliation(s)
- Atchara Phumee
  - Department of Medical Technology, School of Allied Health Sciences, Walailak University, Nakhon Si Thammarat, Thailand
  - Excellent Center for Dengue and Community Public Health (EC for DACH), Walailak University, Nakhon Si Thammarat, Thailand
- Suwalak Chitcharoen
  - Department of Microbiology, Faculty of Medicine, Khon Kaen University, Khon Kaen, Thailand
- Nataya Sutthanont
  - Department of Medical Entomology, Faculty of Tropical Medicine, Mahidol University, Bangkok, Thailand
- Proawpilart Intayot
  - Pharmaceutical Ingredient and Medical Device Research Division, Research Development and Innovation Department, The Government Pharmaceutical Organization, Bangkok, Thailand
- Supaporn Wacharapluesadee
  - Thai Red Cross Emerging Infectious Diseases Clinical Center, King Chulalongkorn Memorial Hospital, Bangkok, Thailand
- Padet Siriyasatien
  - Center of Excellence in Vector Biology and Vector Borne Diseases, Department of Parasitology, Faculty of Medicine, Chulalongkorn University, Bangkok, Thailand
3
Leiva S, Bugnon Valdano M, Gardiol D. Unravelling the epidemiological diversity of Zika virus by analyzing key protein variations. Arch Virol 2023; 168:115. [PMID: 36943525 DOI: 10.1007/s00705-023-05726-5] [Received: 09/09/2022] [Accepted: 01/19/2023] [Indexed: 03/23/2023]
Abstract
The consequences of Zika virus (ZIKV) infections were limited to sporadic mild diseases until almost a decade ago, when epidemic outbreaks took place, with quick spread into the Americas. Simultaneously, novel severe neurological manifestations of ZIKV infections were identified, including congenital microcephaly. However, why the epidemic strains behave differently is not yet completely understood, and many questions remain about the actual significance of genetic variations in the epidemiology and biology of ZIKV. In this study, we analysed a large number of viral sequences to identify genes with different levels of variability and patterns of genomic variations that could be associated with ZIKV diversity. We compared numerous epidemic strains with pre-epidemic strains, using the BWA-MEM algorithm, and we also examined specific variations among the epidemic ZIKV strains derived from microcephaly cases. We identified several viral genes with dissimilar mutation rates among the ZIKV strain groups and novel protein variation profiles that might be associated with epidemiological particularities. Finally, we assessed the impact of the detected changes on the structure and stability of the NS1, NS5, and E proteins using the I-TASSER, trRosetta, and RaptorX modelling algorithms, and we found some interesting variations that might help to explain the heterogeneous features of the diverse ZIKV strains. This work contributes to the identification of genetic differences in the ZIKV genome that might have a phenotypic impact, providing a basis for future experimental analysis to elucidate the genetic causes of the recent ZIKV emergency.
Affiliation(s)
- Santiago Leiva
  - Facultad de Ciencias Bioquímicas y Farmacéuticas, Instituto de Biología Molecular y Celular de Rosario-CONICET, Universidad Nacional de Rosario, Suipacha 531, 2000, Rosario, Argentina
- Marina Bugnon Valdano
  - Facultad de Ciencias Bioquímicas y Farmacéuticas, Instituto de Biología Molecular y Celular de Rosario-CONICET, Universidad Nacional de Rosario, Suipacha 531, 2000, Rosario, Argentina
- Daniela Gardiol
  - Facultad de Ciencias Bioquímicas y Farmacéuticas, Instituto de Biología Molecular y Celular de Rosario-CONICET, Universidad Nacional de Rosario, Suipacha 531, 2000, Rosario, Argentina